| Date | Commit message | Author |
|---|---|---|
| 2024-03-13 | test-backend-ops : skip CPU backend by default (#6028) | slaren |
| 2024-03-11 | llama : refactor unicode stuff (#5992) | Georgi Gerganov |
| 2024-03-09 | ggml : remove old quantization functions (#5942) | Georgi Gerganov |
| 2024-03-09 | tests : gitignore ggml-common.h | Georgi Gerganov |
| 2024-03-04 | add some new ops, fix some operators and add batch operations to certain oper... | leejet |
| 2024-02-27 | IQ4_XS: a 4.25 bpw quantization (#5747) | Kawrakow |
| 2024-02-26 | Adding IQ2_S and IQ2_M to complete coverage of the 2-3 bit quantization range... | Kawrakow |
| 2024-02-25 | code : normalize enum names (#5697) | Georgi Gerganov |
| 2024-02-24 | IQ3_S: a much better alternative to Q3_K (#5676) | Kawrakow |
| 2024-02-22 | Add Gemma chat template (#5665) | Xuan Son Nguyen |
| 2024-02-22 | server : fallback to chatml, add AlphaMonarch chat template (#5628) | Xuan Son Nguyen |
| 2024-02-21 | IQ4_NL: 4-bit non-linear quants with blocks of 32 (#5590) | Kawrakow |
| 2024-02-19 | llama : add llama_chat_apply_template() (#5538) | Xuan Son Nguyen |
| 2024-02-18 | ggml, common, examples, tests : fixed type arguments in printf (#5528) | Herman Semenov |
| 2024-02-18 | 1.5 bit quantization (#5453) | Kawrakow |
| 2024-02-17 | ggml : add ALiBi support for ggml_soft_max_ext (#5488) | Georgi Gerganov |
| 2024-02-16 | ggml : add numa options (#5377) | bmwl |
| 2024-02-13 | tests : multi-thread the tokenizer tests (#5474) | Georgi Gerganov |
| 2024-02-13 | tests : disable moe test (#5473) | Georgi Gerganov |
| 2024-02-11 | ggml : add mmla kernels for quantized GEMM (#4966) | snadampal |
| 2024-02-08 | sampling: fix top_k <= 0 (#5388) | Johannes Gäßler |
| 2024-02-08 | tests : .gitignore obj files | Georgi Gerganov |
| 2024-02-03 | refactor : switch to emplace_back to avoid extra object (#5291) | Michael Klimenko |
| 2024-01-31 | llava : add MobileVLM support (#5132) | JidongZhang-THU |
| 2024-01-30 | `ggml_cuda_cpy` support for 4d tensors and float16->float32 upcasting (ggml/686) | John Balis |
| 2024-01-30 | SOTA 3-bit quants (#5196) | Kawrakow |
| 2024-01-29 | Nomic Vulkan backend (#4456) | Jared Van Bortel |
| 2024-01-28 | ggml : add unified SYCL backend for Intel GPUs (#2690) | Abhilash Majumder |
| 2024-01-28 | Tests for min_p, sampling queue (#5147) | Johannes Gäßler |
| 2024-01-27 | Remove unused data and add fixes (#5154) | Michael Klimenko |
| 2024-01-26 | tests : gitignore test-c.o | Georgi Gerganov |
| 2024-01-26 | ci : add model tests + script wrapper (#4586) | crasm |
| 2024-01-17 | ggml : add IQ2 to test-backend-ops + refactoring (#4990) | Georgi Gerganov |
| 2024-01-17 | metal : create autorelease pool during library build (#4970) | Georgi Gerganov |
| 2024-01-14 | 2-bit quantizations (#4897) | Kawrakow |
| 2024-01-12 | llama : ggml-backend integration (#4766) | slaren |
| 2024-01-11 | ggml : SOTA 2-bit quants (add IQ2_XS) (#4856) | Kawrakow |
| 2024-01-09 | CUDA: faster softmax via shared memory + fp16 math (#4742) | Johannes Gäßler |
| 2024-01-08 | SOTA 2-bit quants (#4773) | Kawrakow |
| 2024-01-04 | Print backend name on test-backend-ops failure (#4751) | Johannes Gäßler |
| 2024-01-03 | ggml : extend ggml_get_rows, ggml_repeat, ggml_concat (ggml/639) | Guillaume Wenzek |
| 2024-01-02 | metal : enable shader debugging (cmake option) (#4705) | Georgi Gerganov |
| 2023-12-29 | cmake : fix ld warning duplicate libraries libllama.a (#4671) | Cuong Trinh Manh |
| 2023-12-29 | ggml : fix some mul mat cases + add tests for src1 F16 (ggml/669) | bssrdf |
| 2023-12-28 | gpt2 : Add gpt2 architecture integration (#4555) | manikbhandari |
| 2023-12-24 | cuda : improve cuda pool efficiency using virtual memory (#4606) | slaren |
| 2023-12-21 | ggml : change ggml_scale to take a float instead of tensor (#4573) | Georgi Gerganov |
| 2023-12-18 | llama : add phi-2 + fix NeoX rope + ggml_mul_mat_set_prec (#4490) | Ebey Abraham |
| 2023-12-14 | ggml : use ggml_row_size where possible (#4472) | slaren |
| 2023-12-13 | sync : ggml (SD ops, tests, kernels) (#4444) | Georgi Gerganov |