path: root/ggml/src/ggml.c
Age | Commit message | Author
2025-02-15 | Bug fix in activation quantization | Iwan Kawrakow
2025-02-15 | Moving 4D gemm logic from ggml.c to iqk_mul_mat.cpp (#207) | Kawrakow
2025-02-11 | DeepSeek FA support (CPU only) (#200) | Kawrakow
2025-02-09 | Add optional MLA (#188) | Kawrakow
2025-02-09 | Use Q8_K_128 for IQ1_S_R4 and IQ1_M_R4 matrix multiplications (#194) | Kawrakow
2025-02-08 | Revert #79 (#192) | Kawrakow
2025-02-06 | Rename q4_0_r4, q8_0_r4 and iq4_xs_r4 to _r8 (#189) | Kawrakow
2025-02-06 | IQ1_M_R4: better 1.75 bpw quants (#187) | Kawrakow
2025-02-05 | IQ1_S_R4: better 1.5 bpw quants (#185) | Kawrakow
2025-01-20 | More Flash Attention improvements (#173) | Kawrakow
2025-01-15 | CPU Flash Attention improvements (#172) | Kawrakow
2025-01-12 | Fix the strange FA behavior with odd/even batch sizes (#171) | Kawrakow
2025-01-10 | Be able to re-quantize MS BitNet I2_S models (#169) | Kawrakow
2024-12-23 | IQ3_S_R4 (#162) | Kawrakow
2024-12-21 | IQ2_S_R4 (#156) | Kawrakow
2024-12-21 | IQ2_XS_R4 (#155) | Kawrakow
2024-12-20 | IQ2_XXS_R4 (#154) | Kawrakow
2024-12-20 | fix typo (#151) | Nexes the Elder
2024-12-20 | IQ3_XXS_R4 (#153) | Kawrakow
2024-12-18 | IQ4_KS_R4 (#150) | Kawrakow
2024-12-18 | IQ5_K_R4 (#149) | Kawrakow
2024-12-17 | IQ2_K_R4 (#146) | Kawrakow
2024-12-17 | IQ3_K_R4 (#145) | Kawrakow
2024-12-15 | BF16_R16 - 16 interleaved bf16 rows (#142) | Kawrakow
2024-12-14 | Q8_K_R8: Fastest quantized matrix multiplications (#141) | Kawrakow
2024-12-12 | IQ4_K_R4 (#138) | Kawrakow
2024-12-11 | Q2_K_R4 (#136) | Kawrakow
2024-12-11 | Q3_K_R4 (#134) | Kawrakow
2024-12-10 | Q5_K_R4 (#132) | Kawrakow
2024-12-10 | Q6_K_R4 (#130) | Kawrakow
2024-12-09 | Q4_K_R4 (#129) | Kawrakow
2024-12-08 | Faster IQ4_XS_R4 on Zen4 (#128) | Kawrakow
2024-12-08 | Rename iq4_nl_x4 to iq4_nl_r4 (#126) | Kawrakow
2024-12-06 | iq2_bn_r4: fastest Bitnet CPU implementation on the planet (#124) | Kawrakow
2024-12-04 | IQ4_XS_R4 (#123) | Kawrakow
2024-12-03 | Q6_0_R4 (#122) | Kawrakow
2024-12-03 | Q5_0_R4 (#121) | Kawrakow
2024-12-03 | Q8_0_R4 (#120) | Kawrakow
2024-12-02 | Q4_0_R4 (#119) | Kawrakow
2024-12-02 | IQ4_NL_X4 (#118) | Kawrakow
2024-10-31 | Faster MoE inference (#112) | Kawrakow
2024-10-25 | Bitnet changes (#106) | Kawrakow
2024-10-16 | Adding IQ4_KSS: 4.0 bpw quants (#89) | Kawrakow
2024-10-13 | IQ2_KS: 2.1875 bpw non-linear quantization (#85) | Kawrakow
2024-10-09 | New SOTA quantization: 4.25 bpw IQ4_KS (#83) | Kawrakow
2024-10-04 | Fix compiler warnings | Iwan Kawrakow
2024-10-04 | Move to c++17 projectwide (#80) | Kawrakow
2024-10-04 | Do not quantize activations if not necessary (#79) | Kawrakow
2024-10-02 | Fused unary(x)*y (#70) | Kawrakow
2024-10-02 | Adding Q6_0 (#77) | Kawrakow