| Age | Commit message | Author |
|---|---|---|
| 2025-03-13 | FlashMLA-2 (CPU): faster and smaller compute buffer size (#253) | Kawrakow |
| 2025-02-25 | Give the user the option to override where model weights are stored (#232) | Kawrakow |
| 2024-10-25 | Bitnet changes (#106) | Kawrakow |
| 2024-10-20 | Avoid rebuild of GGML graph for each token (#98) | agray3 |
| 2024-08-12 | Merge mainline - Aug 12 2024 (#17) | Kawrakow |
| 2024-07-27 | Merge mainline llama.cpp (#3) | Kawrakow |
