ik_llama.cpp.git

author	Iwan Kawrakow <iwan.kawrakow@gmail.com>	2024-06-08 13:47:02 +0300
committer	Iwan Kawrakow <iwan.kawrakow@gmail.com>	2024-06-22 12:02:50 +0300
commit	8a80a31ddd5f3239ab1da6deff1efcdf4f43d1d9 (patch)
tree	78ccaf60d6f17dbead0658ac1057006920d5c324 /ggml-quants.c
parent	81409a02f3c10a74ea23167f1782a951d026ab49 (diff)

iqk_mul_mat: fix q8_0

I was happily using _mm256_packs_epi32() to pack the q8_0 x q8_0 dot products back to int16_t, and getting useful results. But theoretically this can overflow, so it is better to use _mm256_unpacklo_ and _mm256_unpackhi_ to combine the 4 dot products using int32_t additions. This is (almost) as fast, unlike _mm256_hadd_epi32(), which seems excessively slow on the Ryzen-7950X.
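The overflow concern can be illustrated with a small scalar model. This is a sketch, not the code from the commit: it assumes that each 32-bit lane holds the sum of four adjacent int8 products before packing, and the helper names (`saturate_to_i16`, `lane_dot4`, `block_dot_i32`, `block_dot_packed`) are made up for the illustration. `saturate_to_i16` mimics the signed saturation that `_mm256_packs_epi32()` applies when narrowing int32_t to int16_t.

```c
#include <stdint.h>

#define QK8_0 32  /* number of int8 quants in one q8_0 block */

/* Scalar model of the signed saturation done by _mm256_packs_epi32(). */
static int16_t saturate_to_i16(int32_t v) {
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* Sum of 4 adjacent int8 products. ASSUMPTION: this is what one
 * 32-bit lane holds before the partial sums are combined. */
static int32_t lane_dot4(const int8_t *x, const int8_t *y) {
    int32_t s = 0;
    for (int i = 0; i < 4; ++i) s += (int32_t)x[i] * (int32_t)y[i];
    return s;
}

/* Block dot product with every partial sum kept in int32_t. */
static int32_t block_dot_i32(const int8_t *x, const int8_t *y) {
    int32_t s = 0;
    for (int i = 0; i < QK8_0; i += 4) s += lane_dot4(x + i, y + i);
    return s;
}

/* Same dot product, but each partial sum is first narrowed to
 * int16_t with saturation, as a pack-based reduction would do. */
static int32_t block_dot_packed(const int8_t *x, const int8_t *y) {
    int32_t s = 0;
    for (int i = 0; i < QK8_0; i += 4) s += saturate_to_i16(lane_dot4(x + i, y + i));
    return s;
}
```

With every quant in both blocks set to 127, one lane sums to 4 * 127 * 127 = 64516, which exceeds INT16_MAX (32767). `block_dot_i32` then returns 516128, while `block_dot_packed` returns 8 * 32767 = 262136. Typical data stays far from this extreme, which is consistent with the packed version giving useful results in practice.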

Diffstat (limited to 'ggml-quants.c')

0 files changed, 0 insertions, 0 deletions

