ik_llama.cpp.git (branch: main)
path: root/src/llama.cpp
Age        | Commit message                                                                   | Author
2025-07-17 | Bump Windows max open files from 512 to 2048 (#620)                              | Thireus ☠
2025-07-15 | kimi-k2 convert script and chat template (#612)                                  | ubergarm
2025-07-15 | Vulkan: a fresh start (#608)                                                     | Kawrakow
2025-07-14 | Adding IQ2_KL (#602)                                                             | Kawrakow
2025-07-14 | Ported kimi-k2 support from llama.cpp (#609)                                     | Aleksey Nikiforov
2025-07-13 | Fix attn_v conditionality (#604)                                                 | Nexes the Elder
2025-07-10 | Support for dots.llm1 models (#573)                                              | saood06
2025-07-09 | add hunyuan moe support for 561 (#565)                                           | ubergarm
2025-07-06 | Special handling of Seed Coder FIM tokens (#585)                                 | Fizz~
2025-07-06 | Fix server crash when there is no DRY sampler (#588)                             | firecoperana
2025-07-04 | Vulkan: adding GGML_OP_MULTI_ADD implementation (#582)                           | Kawrakow
2025-07-03 | Vulkan: Disable multi-add for now (#581)                                         | Kawrakow
2025-07-03 | Vulkan: add GGML_OP_FUSED_MUL_UNARY (#580)                                       | Kawrakow
2025-07-03 | Vulkan: fused rms norm (#577)                                                    | Kawrakow
2025-07-03 | Do not crash when there is no DRY sampler (#578)                                 | Kawrakow
2025-07-02 | Adding IQ3_KS quants (#566)                                                      | Kawrakow
2025-07-02 | Conditionally disable fused ops when building with Vulkan enabled (#569)         | Kawrakow
2025-07-02 | Merge vulkan code from mainline up to commit of 6/28/2025 (#563)                 | firecoperana
2025-06-26 | Add Falcon-Edge support (#555)                                                   | Kawrakow
2025-06-21 | Faster ARM_NEON GEMM implementation for legacy quants (#546)                     | Kawrakow
2025-06-19 | add dry sampler (#513)                                                           | firecoperana
2025-06-18 | New IQ2_KT, IQ3_KT and IQ4_KT, V2 (#529)                                         | Kawrakow
2025-06-06 | Make prompt cache saving and restoring MLA aware (#497)                          | saood06
2025-06-03 | Adding top-n-sigma sampler (#489)                                                | Kawrakow
2025-06-03 | Adding the XTC sampler (#486)                                                    | Kawrakow
2025-05-31 | forgotten refs and typo (#478)                                                   | Nexes the Elder
2025-05-30 | Replace MLA-specific KV cache with the standard KV cache (#469)                  | Kawrakow
2025-05-24 | Legacy quants conversion schemes in convert_hf_to_gguf.py (#449)                 | Nexes the Elder
2025-05-23 | Trellis quants with CPU inference (#441)                                         | Andrew Chan
2025-05-22 | Streamline a bit the quant strategies (#443)                                     | Nexes the Elder
2025-05-17 | IQ5_KS_R4: row-interleaved IQ5_KS (#426)                                         | Kawrakow
2025-05-15 | Adding IQ5_KS - 5.25 bpw quants (#422)                                           | Kawrakow
2025-05-12 | Enable faster prompt processing with mainline llama.cpp GGUFs (#409)             | Kawrakow
2025-05-12 | Faster DeepSeek FA on CUDA (#408)                                                | Kawrakow
2025-05-12 | GPU offload policy (#405)                                                        | Kawrakow
2025-05-09 | Handle incompatible DeepSeek GGUFs (#394)                                        | Kawrakow
2025-05-09 | Support for Llama-3-Nemotron models (#377)                                       | saood06
2025-05-02 | Fix model architecture name (#366)                                               | saood06
2025-04-29 | Apply Qwen3 PR from llama.cpp (#355)                                             | Ben Harris
2025-04-26 | Add GLM-4-0414 Model Support (#344)                                              | ubergarm
2025-04-26 | Add support for Cohere2 (#341)                                                   | Kawrakow
2025-04-25 | Fix LLaMA-4 attention (#342)                                                     | Kawrakow
2025-04-22 | BitNet adjustments (#338)                                                        | Kawrakow
2025-04-22 | Add support for bitnet2b_2501 model (#337)                                       | saood06
2025-04-11 | Correct L4 rms_norm (#324)                                                       | Kawrakow
2025-04-10 | LlaMA-4 support (text only) (#321)                                               | Kawrakow
2025-04-08 | Guard against attempts to use MLA for non-MLA models (#320)                      | Kawrakow
2025-04-07 | Add copyright notices (#317)                                                     | Kawrakow
2025-04-01 | Additional guards for interleaved quants (#299)                                  | Kawrakow
2025-03-27 | Make sure tensor row size is multiple of block size also when quantizing with... | Kawrakow