Age | Commit message | Author | |
---|---|---|---|
2024-03-04 | llama : fix embeddings (#5796) | Georgi Gerganov | |
* llama : fix embeddings ggml-ci
* llama : do not use KV cache for non-causal models ggml-ci
* embeddings : fix llama_batch_init arg
* llama : add pooling switch
* llama : distinguish token vs sequence embeddings ggml-ci
* llama : assert pooling tensor
* llama : simplify causal mask condition ggml-ci
* llama : assert input batch with pooling enabled
* readme : update API changes list