Date | Commit message | Author |
---|---|---|
2024-02-18 | ggml, common, examples, tests : fixed type arguments in printf (#5528) | Herman Semenov |
2024-02-16 | ggml : add numa options (#5377) | bmwl |
2024-01-08 | examples : add passkey test (#3856) | Georgi Gerganov |
2023-10-24 | cuda : add batched cuBLAS GEMM for faster attention (#3749) | Georgi Gerganov |
2023-10-23 | llama : remove token functions with `context` args in favor of `model` (#3720) | Marcus Dunn |
2023-10-22 | batched : add len CLI argument | Georgi Gerganov |
2023-10-18 | speculative : add tree-based sampling example (#3624) | Georgi Gerganov |
2023-10-11 | batched : add bench tool (#3545) | Georgi Gerganov |
2023-09-28 | llama.cpp : split llama_context_params into model and context params (#3301) | slaren |
2023-09-28 | llama : custom attention mask + parallel decoding + no context swaps (#3228) | Georgi Gerganov |