author | Iwan Kawrakow <iwan.kawrakow@gmail.com> | 2024-06-05 19:43:08 +0300 |
---|---|---|
committer | Iwan Kawrakow <iwan.kawrakow@gmail.com> | 2024-06-22 12:02:50 +0300 |
commit | 2ee56b4f0d079b4a1bd58347b13cb85ac5bd1445 (patch) | |
tree | 2cefcd4c9124beccff23ef5707074ed508d5a025 /examples/json_schema_to_grammar.py | |
parent | 0ad646b9f0b96c449a76d41e4d5ebd4ba16ae690 (diff) |
iqk_mul_mat: minor improvements
Current performance:
| model | size | threads | test | t/s |
| ----------------- | ---------: | -------: | ------: | ---------------: |
| llama 7B IQ3_S | 2.75 GiB | 16 | pp512 | 100.21 ± 0.32 |
| llama 7B IQ3_XXS | 2.41 GiB | 16 | pp512 | 105.25 ± 0.75 |
| llama 7B IQ2_M | 2.20 GiB | 16 | pp512 | 117.88 ± 0.15 |
| llama 7B IQ2_XS | 1.89 GiB | 16 | pp512 | 136.38 ± 0.24 |
| llama 7B IQ2_XXS | 1.73 GiB | 16 | pp512 | 128.47 ± 0.39 |
mean: 117.64
| model | size | threads | test | t/s |
| ----------------- | ---------: | -------: | ------: | ---------------: |
| llama 7B IQ2_XXS | 1.73 GiB | 8 | tg128 | 23.94 ± 0.04 |
| llama 7B IQ2_XS | 1.89 GiB | 8 | tg128 | 23.27 ± 0.03 |
| llama 7B IQ2_M | 2.20 GiB | 8 | tg128 | 18.88 ± 0.03 |
| llama 7B IQ3_XXS | 2.41 GiB | 8 | tg128 | 19.07 ± 0.04 |
| llama 7B IQ3_S | 2.75 GiB | 8 | tg128 | 15.44 ± 0.05 |
mean: 20.12
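The two `mean:` figures can be checked against the table rows. The following is a minimal sketch, not part of the commit: the dictionary names and the `mean_tps` helper are hypothetical, and only the central t/s values are used (the ± spreads are ignored).

```python
from statistics import mean

# Central t/s values copied from the tables above (the ± spread is ignored).
# pp512 runs used 16 threads; tg128 runs used 8 threads.
pp512 = {
    "IQ3_S": 100.21,
    "IQ3_XXS": 105.25,
    "IQ2_M": 117.88,
    "IQ2_XS": 136.38,
    "IQ2_XXS": 128.47,
}
tg128 = {
    "IQ2_XXS": 23.94,
    "IQ2_XS": 23.27,
    "IQ2_M": 18.88,
    "IQ3_XXS": 19.07,
    "IQ3_S": 15.44,
}


def mean_tps(results):
    """Arithmetic mean of the t/s values, rounded to two decimals (hypothetical helper)."""
    return round(mean(results.values()), 2)


if __name__ == "__main__":
    print(f"pp512 mean: {mean_tps(pp512)}")  # prints "pp512 mean: 117.64"
    print(f"tg128 mean: {mean_tps(tg128)}")  # prints "tg128 mean: 20.12"
```

This is a plain unweighted average over the five quantization types, so it says nothing about the differing model sizes or thread counts.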
Diffstat (limited to 'examples/json_schema_to_grammar.py')
0 files changed, 0 insertions, 0 deletions