| Age | Commit message | Author |
|------------|-----------------|--------|
| 2024-06-08 | url: save -mu downloads to new cache location (#7826) | Olivier Chafik |
| 2024-06-08 | server : smart slot selection using Longest Common Prefix (#7728) | sasha0552 |
| 2024-06-07 | cmake : fix BUILD_SHARED_LIBS=ON build (#7784) | intelmatt |
| 2024-06-06 | server : fix --threads-http arg (#7801) | Georgi Gerganov |
| 2024-06-06 | imatrix : migrate to gpt_params (#7771) | Georgi Gerganov |
| 2024-06-06 | Added support for . (any character) token in grammar engine. (#6467) | Clint Herron |
| 2024-06-06 | grammars: x{min,max} repetition operator (#6640) | Olivier Chafik |
| 2024-06-04 | common : refactor cli arg parsing (#7675) | Georgi Gerganov |
| 2024-06-04 | ggml : remove OpenCL (#7735) | Georgi Gerganov |
| 2024-06-03 | Vulkan Mixture of Experts (MoE) support (#7628) | 0cc4m |
| 2024-05-27 | main: replace --no-special with --special (#7534) | Brian |
| 2024-05-25 | train : change default FA argument (#7528) | Georgi Gerganov |
| 2024-05-25 | main : don't print special tokens with --grammar (#6923) | Justine Tunney |
| 2024-05-25 | ggml: aarch64: SVE kernels for q8_0_q8_0, q4_0_q8_0 vector dot (#7433) | Masaya, Kato |
| 2024-05-25 | fix missing slash in `fs_get_cache_directory()` (#7503) | Xuan Son Nguyen |
| 2024-05-22 | common : normalize naming style (#7462) | Georgi Gerganov |
| 2024-05-21 | `grammars`: fix resampling logic regression (#7424) | Olivier Chafik |
| 2024-05-21 | examples: cache hf model when --model not provided (#7353) | Amir |
| 2024-05-17 | ggml-quants, llama : removed excess checks (#7274) | Herman Semenov |
| 2024-05-16 | grammar, json, llama: replace push on emplace if it possible (#7273) | Herman Semenov |
| 2024-05-16 | Add support for properly optimized Windows ARM64 builds with LLVM and MSVC (#... | Max Krasnyansky |
| 2024-05-14 | ggml : add RPC backend (#6829) | Radoslav Gerganov |
| 2024-05-11 | server: fix reported top tokens for temperature 0 (#7203) | Johannes Gäßler |
| 2024-05-10 | Fix memory bug in grammar parser (#7194) | Justine Tunney |
| 2024-05-10 | Main+: optionally allow special tokens from user in interactive mode (#7097) | HanishKVC |
| 2024-05-08 | JSON: [key] -> .at(key), assert() -> GGML_ASSERT (#7143) | Johannes Gäßler |
| 2024-05-08 | main : add --conversation / -cnv flag (#7108) | Dawid Potocki |
| 2024-05-07 | server: fix incorrectly reported token probabilities (#7125) | Johannes Gäßler |
| 2024-05-04 | Fix Linux /sys cpu path to guess number of cores (#7064) | viric |
| 2024-05-01 | Update LOG_IMPL and LOG_TEE_IMPL (#7029) | Andrew Downing |
| 2024-04-30 | perplexity: more statistics, added documentation (#6936) | Johannes Gäßler |
| 2024-04-30 | ggml : add Flash Attention (#5021) | Georgi Gerganov |
| 2024-04-30 | Improve usability of --model-url & related flags (#6930) | Olivier Chafik |
| 2024-04-29 | llava-cli : multiple images (#6969) | cpumaxx |
| 2024-04-29 | llama : fix BPE pre-tokenization (#6920) | Georgi Gerganov |
| 2024-04-29 | sampling : use std::random_device{}() for default random seed (#6962) | David Renshaw |
| 2024-04-27 | Replace "alternative" boolean operator in conditional compilation directive (... | mgroeber9110 |
| 2024-04-26 | quantize: add imatrix and dataset metadata in GGUF (#6658) | Pierrick Hymbert |
| 2024-04-26 | add basic tensor data validation function (#6884) | slaren |
| 2024-04-24 | llama : add llama_get_pooling_type function (#6862) | Douglas Hanley |
| 2024-04-24 | common : revert showing control tokens by default for server (#6860) | Kyle Mistele |
| 2024-04-24 | Server: fix seed for multiple slots (#6835) | Johannes Gäßler |
| 2024-04-21 | llama : add option to render special/control tokens (#6807) | Georgi Gerganov |
| 2024-04-20 | common : try to fix Android CI (#6780) | Georgi Gerganov |
| 2024-04-16 | ggml : add llamafile sgemm (#6414) | Justine Tunney |
| 2024-04-15 | `main`: add --json-schema / -j flag (#6659) | Olivier Chafik |
| 2024-04-12 | JSON schema conversion: ⚡️ faster repetitions, min/maxLength for strings,... | Olivier Chafik |
| 2024-04-11 | eval-callback: Example how to use eval callback for debugging (#6576) | Pierrick Hymbert |
| 2024-04-09 | BERT tokenizer fixes (#6498) | Jared Van Bortel |
| 2024-04-08 | llama : support negative ith in llama_get_ API (#6519) | Rick G |