author | Iwan Kawrakow <iwan.kawrakow@gmail.com> | 2024-07-18 13:55:51 +0200 |
---|---|---|
committer | Iwan Kawrakow <iwan.kawrakow@gmail.com> | 2024-07-18 13:55:51 +0200 |
commit | 30b8bcf1a3bf232aabcbb826c7a2769dda6eafa0 (patch) | |
tree | 40d4e4eb9274afcef5a751999e82cd7011b1ffe4 /llama.cpp | |
parent | 8db01c0804b603cb76bbee82ebb1a144c8d3592e (diff) |
iqk_mul_mat(f16): make it work for row sizes that are a multiple of 4 on NEON
The performance gain here is more modest than on AVX2: PP-512 rises
from 190 t/s to 200 t/s (about 5%) for iq1_bn-quantized Bitnet-3B
running on an M2 Max.
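
For illustration only, here is a rough sketch of how a dot-product kernel might accept row sizes that are a multiple of 4 rather than only a multiple of 8. It is not the code from this commit: the function name dot_multiple_of_4 is hypothetical, it uses plain float instead of f16, and scalar loops stand in for the 4-lane NEON vector operations so that it builds on any platform.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not taken from the commit: dot product of two rows
// whose length n is a multiple of 4 (but not necessarily of 8).
float dot_multiple_of_4(const float* x, const float* y, std::size_t n) {
    assert(n % 4 == 0);  // assumption of this sketch
    float acc[8] = {0.0f};
    std::size_t i = 0;
    // Main loop: 8 elements per step, standing in for two 4-lane vector FMAs.
    for (; i + 8 <= n; i += 8) {
        for (int k = 0; k < 8; ++k) acc[k] += x[i + k] * y[i + k];
    }
    // Tail: exactly one 4-element block remains when n % 8 == 4.
    if (i + 4 <= n) {
        for (int k = 0; k < 4; ++k) acc[k] += x[i + k] * y[i + k];
    }
    // Horizontal reduction of the partial sums.
    float sum = 0.0f;
    for (int k = 0; k < 8; ++k) sum += acc[k];
    return sum;
}
```

In this sketch a row of 12 elements takes one pass through the 8-wide main loop and then the 4-wide tail, so no padding or scalar per-element cleanup is needed.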
Diffstat (limited to 'llama.cpp')
0 files changed, 0 insertions, 0 deletions