author | Iwan Kawrakow <iwan.kawrakow@gmail.com> | 2024-08-07 17:25:21 +0300 |
---|---|---|
committer | Kawrakow <48489457+ikawrakow@users.noreply.github.com> | 2024-08-09 16:00:31 +0200 |
commit | c3f5e4d9a7ddad8e7af6dd43807815496acddab3 (patch) | |
tree | 753d98457de5ba555c7c00b3c680349fa531ab66 /ggml/src/ggml-common.h | |
parent | a9b3f4a54b544a6e9adde65673533e0154d7767a (diff) |
iq6_k: CUDA dequantize
We get a slightly better PPL for LLaMA-3.1-8B than with q6_K
(0.14% vs 0.26% quantization error, respectively).
Diffstat (limited to 'ggml/src/ggml-common.h')
0 files changed, 0 insertions, 0 deletions