| Age | Commit message | Author |
| 2024-05-14 | Add left recursion check: quit early instead of going into an infinite loop (... | Haggai Nuchi |
| 2024-05-12 | CUDA: add FP32 FlashAttention vector kernel (#7188) | Johannes Gäßler |
| 2024-05-11 | llama : lookup word in vocab before doing BPE merges (#7193) | Haoxiang Fei |
| 2024-05-11 | ggml : full ALiBi support (#7192) | Georgi Gerganov |
| 2024-05-09 | llama3 custom regex split (#6965) | jaime-m-p |
| 2024-05-09 | CUDA: generalize FP16 fattn vec kernel (#7061) | Johannes Gäßler |
| 2024-05-08 | JSON: [key] -> .at(key), assert() -> GGML_ASSERT (#7143) | Johannes Gäßler |
| 2024-05-08 | llama : add BPE pre-tokenization for Qwen2 (#7114) | Ren Xuancheng |
| 2024-05-08 | ggml : introduce bfloat16 support (#6412) | Justine Tunney |
| 2024-05-05 | command-r : add BPE pre-tokenization (#7063) | DAN™ |
| 2024-05-05 | py : logging and flake8 suppression refactoring (#7081) | Brian |
| 2024-05-04 | tests : add test-tokenizer-0.sh + fix some tokenizers (#7036) | Georgi Gerganov |
| 2024-05-03 | convert.py : add python logging instead of print() (#6511) | Brian |
| 2024-04-30 | ggml : add Flash Attention (#5021) | Georgi Gerganov |
| 2024-04-29 | Extending grammar integration tests (#6644) | Clint Herron |
| 2024-04-29 | llama : fix BPE pre-tokenization (#6920) | Georgi Gerganov |
| 2024-04-24 | llama : add phi 3 chat template (#6857) | Tristan Druyen |
| 2024-04-21 | llama : add llama-3 chat template (#6751) | Wouter |
| 2024-04-18 | ggml : group all experts in a single ggml_mul_mat_id (#6505) | slaren |
| 2024-04-16 | llama : add qwen2moe (#6074) | Shijie |
| 2024-04-15 | `main`: add --json-schema / -j flag (#6659) | Olivier Chafik |
| 2024-04-14 | Add Command R chat template (#6650) | Chao Jiang |
| 2024-04-12 | JSON schema conversion: ⚡️ faster repetitions, min/maxLength for strings,... | Olivier Chafik |
| 2024-04-12 | metal : unify mul_mv_id kernels (#6556) | slaren |
| 2024-04-11 | grammars: 1.5x faster inference w/ complex grammars (vector reserves / reuses... | Olivier Chafik |
| 2024-04-06 | Tests: Added integration tests for GBNF parser (#6472) | Clint Herron |
| 2024-04-03 | Add OpenChat, Alpaca, Vicuna chat templates (#6397) | kaizau |
| 2024-04-03 | ggml : mul_mat_id use the same tensor for all the experts (#6387) | slaren |
| 2024-03-26 | IQ1_M: 1.75 bpw quantization (#6302) | Kawrakow |
| 2024-03-25 | tests : include IQ2_XXS and IQ2_XS in test-quantize-fns (#6303) | Kawrakow |
| 2024-03-22 | tests : conditional python & node json schema tests (#6207) | Olivier Chafik |
| 2024-03-22 | json-schema-to-grammar : fix order of props + non-str const/enum (#6232) | Olivier Chafik |
| 2024-03-22 | metal : pad n_ctx by 32 (#6177) | Georgi Gerganov |
| 2024-03-21 | tests : disable system() calls (#6198) | Georgi Gerganov |
| 2024-03-21 | json-schema-to-grammar improvements (+ added to server) (#5978) | Olivier Chafik |
| 2024-03-15 | llama : add Orion chat template (#6066) | Xuan Son Nguyen |
| 2024-03-13 | test-backend-ops : skip CPU backend by default (#6028) | slaren |
| 2024-03-11 | llama : refactor unicode stuff (#5992) | Georgi Gerganov |
| 2024-03-09 | ggml : remove old quantization functions (#5942) | Georgi Gerganov |
| 2024-03-09 | tests : gitignore ggml-common.h | Georgi Gerganov |
| 2024-03-04 | add some new ops, fix some operators and add batch operations to certain oper... | leejet |
| 2024-02-27 | IQ4_XS: a 4.25 bpw quantization (#5747) | Kawrakow |
| 2024-02-26 | Adding IQ2_S and IQ2_M to complete coverage of the 2-3 bit quantization range... | Kawrakow |
| 2024-02-25 | code : normalize enum names (#5697) | Georgi Gerganov |
| 2024-02-24 | IQ3_S: a much better alternative to Q3_K (#5676) | Kawrakow |
| 2024-02-22 | Add Gemma chat template (#5665) | Xuan Son Nguyen |
| 2024-02-22 | server : fallback to chatml, add AlphaMonarch chat template (#5628) | Xuan Son Nguyen |
| 2024-02-21 | IQ4_NL: 4-bit non-linear quants with blocks of 32 (#5590) | Kawrakow |
| 2024-02-19 | llama : add llama_chat_apply_template() (#5538) | Xuan Son Nguyen |
| 2024-02-18 | ggml, common, examples, tests : fixed type arguments in printf (#5528) | Herman Semenov |