author | Iwan Kawrakow <iwan.kawrakow@gmail.com> | 2024-05-29 10:38:58 +0300 |
---|---|---|
committer | Iwan Kawrakow <iwan.kawrakow@gmail.com> | 2024-06-22 12:02:49 +0300 |
commit | 2c8c0d0a68d78f0aaf7c756849f97d0a5e655afe (patch) | |
tree | 725d1eee1babec05bafa0fd792ba0138b648d117 /examples | |
parent | 34befcaf6731a9a29bb5d7f3f2472e53c4151898 (diff) |
iqk_mul_mat: AVX2 implementation for iq3_xxs
This gives a 2.3X speedup for PP-512 (87 t/s). For TG, however, we still
need to use the original implementation in llama.cpp, because the template
cannot match the performance of the special-purpose implementation.
Also, 87 t/s is significantly lower than the 111 t/s I get in iquants.
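
The split described above (templated AVX2 kernel for prompt processing, original kernel for token generation) could be sketched roughly as follows. This is only an illustration: `MatMulPath`, `choose_iq3_xxs_path`, and the `n_tokens > 1` criterion are hypothetical names and assumptions, not the actual dispatch code in this commit. It assumes PP multiplies many activation columns at once while TG processes a single token per step.

```cpp
#include <cassert>
#include <string>

// Hypothetical names throughout; not the real llama.cpp / iqk_mul_mat API.
enum class MatMulPath { TemplatedAvx2, OriginalKernel };

// n_tokens: number of activation columns in this matrix multiplication.
// Assumption: prompt processing (e.g. PP-512) batches many tokens, while
// token generation (TG) handles one token per step.
inline MatMulPath choose_iq3_xxs_path(int n_tokens) {
    assert(n_tokens > 0);
    // Batched case: the templated AVX2 kernel is the faster one (2.3X for PP-512).
    // Single-token case: fall back to the original special-purpose kernel.
    return n_tokens > 1 ? MatMulPath::TemplatedAvx2 : MatMulPath::OriginalKernel;
}

// Human-readable label, useful when logging which path was taken.
inline std::string path_name(MatMulPath p) {
    return p == MatMulPath::TemplatedAvx2 ? "templated-avx2" : "original";
}
```

With this sketch, `choose_iq3_xxs_path(512)` selects the templated path and `choose_iq3_xxs_path(1)` falls back to the original kernel.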
Diffstat (limited to 'examples')
0 files changed, 0 insertions, 0 deletions