author | Iwan Kawrakow <iwan.kawrakow@gmail.com> | 2024-08-08 14:01:08 +0300 |
---|---|---|
committer | Kawrakow <48489457+ikawrakow@users.noreply.github.com> | 2024-08-09 16:00:31 +0200 |
commit | 595d2ae32da40c1223c45c9fc06e24d3cefe095d (patch) | |
tree | 70ce71e3cbc7a15279b0813d36b98a682c7577df /ggml/src/ggml-quants.h | |
parent | 849476acc79af52998316e421baa9befad3b8eb3 (diff) |
iq6_k: slightly better Zen4 iqk_mul_mat
We now arrive at PP-512 = 147 t/s for LLaMA-3.1-8B.
TG-128 is 9.5 t/s. This is better than the last commit,
but still kind of slow compared to Q6_K.
My last commit message was wrong: iq3_k also needs a fix
for overflow.
Diffstat (limited to 'ggml/src/ggml-quants.h')
0 files changed, 0 insertions, 0 deletions