author | Kawrakow <iwankawrakow@gmail.com> | 2025-05-14 14:04:11 +0300 |
---|---|---|
committer | GitHub <noreply@github.com> | 2025-05-14 14:04:11 +0300 |
commit | 0435b68e6d34b4987fee9d94a7221a146532ced1 (patch) | |
tree | 359de0ae12ddc26b8eaf14bbdc0e4b5b40a19f5f /src/llama.cpp | |
parent | b90d6ede2eca3fc48d716868269be5e0e15d00f9 (diff) |
CUDA: quantized GEMM for IQ4_K, IQ5_K, IQ6_K (#417)
* MMQ for iq4_k: WIP (not working)
* MMQ for iq4_k: working now
* MMQ for iq5_k
* Cleanup
* MMQ for iq5_k: slightly faster
* MMQ for iq6_k
---------
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
Diffstat (limited to 'src/llama.cpp')
0 files changed, 0 insertions, 0 deletions