author: Iwan Kawrakow <iwan.kawrakow@gmail.com>, 2024-06-07 17:43:29 +0300
committer: Iwan Kawrakow <iwan.kawrakow@gmail.com>, 2024-06-22 12:02:50 +0300
commit: 74b711c8fd80308a5e13620d91e4e55a17ae55ad
tree: 4d4b156d63f9e761c8c5d9dac7492c60a93ee22f /examples/retrieval
parent: 29164263f48790cb280948e34963a5e5a0e1da6a
iqk_mul_mat: add q8_0
It was actually ready but not turned on. Having forgotten, I made a new implementation along the lines of the fp16 implementation (i.e., using tiling). That matched tinyBLAS performance. But the existing implementation that I now turned on is faster:
PP-512 = 134 t/s vs 128.3 t/s for tinyBLAS
TG-128 = 8.7 t/s vs 8.3 t/s for tinyBLAS
(@ 4 threads)
Diffstat (limited to 'examples/retrieval')
0 files changed, 0 insertions, 0 deletions