author | Iwan Kawrakow <iwan.kawrakow@gmail.com> | 2024-06-07 17:43:29 +0300 |
---|---|---|
committer | Iwan Kawrakow <iwan.kawrakow@gmail.com> | 2024-06-22 12:02:50 +0300 |
commit | 74b711c8fd80308a5e13620d91e4e55a17ae55ad (patch) | |
tree | 4d4b156d63f9e761c8c5d9dac7492c60a93ee22f /examples/retrieval | |
parent | 29164263f48790cb280948e34963a5e5a0e1da6a (diff) |
iqk_mul_mat: add q8_0
It was actually ready but not turned on.
Having forgotten, I made a new implementation along the
lines of the fp16 implementation (i.e., using tiling).
That matched tinyBLAS performance. But the existing
implementation that I now turned on is faster:
PP-512 = 134 t/s vs 128.3 t/s for tinyBLAS
TG-128 = 8.7 t/s vs 8.3 t/s for tinyBLAS (@ 4 threads)
Diffstat (limited to 'examples/retrieval')
0 files changed, 0 insertions, 0 deletions