| field | value | date |
|---|---|---|
| author | Iwan Kawrakow <iwan.kawrakow@gmail.com> | 2024-06-05 19:43:08 +0300 |
| committer | Iwan Kawrakow <iwan.kawrakow@gmail.com> | 2024-06-22 12:02:50 +0300 |
| commit | 2ee56b4f0d079b4a1bd58347b13cb85ac5bd1445 | |
| tree | 2cefcd4c9124beccff23ef5707074ed508d5a025 /examples/llama.android/llama/src/main/java/android | |
| parent | 0ad646b9f0b96c449a76d41e4d5ebd4ba16ae690 | |
iqk_mul_mat: minor improvements
Current performance:
| model             |       size |  threads |    test |              t/s |
| ----------------- | ---------: | -------: | ------: | ---------------: |
| llama 7B IQ3_S    |   2.75 GiB |       16 |   pp512 |    100.21 ± 0.32 |
| llama 7B IQ3_XXS  |   2.41 GiB |       16 |   pp512 |    105.25 ± 0.75 |
| llama 7B IQ2_M    |   2.20 GiB |       16 |   pp512 |    117.88 ± 0.15 |
| llama 7B IQ2_XS   |   1.89 GiB |       16 |   pp512 |    136.38 ± 0.24 |
| llama 7B IQ2_XXS  |   1.73 GiB |       16 |   pp512 |    128.47 ± 0.39 |
                                                     mean: 117.64

| model             |       size |  threads |    test |              t/s |
| ----------------- | ---------: | -------: | ------: | ---------------: |
| llama 7B IQ2_XXS  |   1.73 GiB |        8 |   tg128 |     23.94 ± 0.04 |
| llama 7B IQ2_XS   |   1.89 GiB |        8 |   tg128 |     23.27 ± 0.03 |
| llama 7B IQ2_M    |   2.20 GiB |        8 |   tg128 |     18.88 ± 0.03 |
| llama 7B IQ3_XXS  |   2.41 GiB |        8 |   tg128 |     19.07 ± 0.04 |
| llama 7B IQ3_S    |   2.75 GiB |        8 |   tg128 |     15.44 ± 0.05 |
                                                     mean:  20.12
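
The two `mean:` lines are plain arithmetic means of the t/s column over the five quantization types in each test. A minimal sketch for rechecking them, assuming only the central values from the tables (the ± spreads are ignored); the dictionary names and the `mean_tps` helper are hypothetical and not part of the commit:

```python
from statistics import mean

# Central t/s values copied from the tables above (uncertainties dropped).
pp512 = {  # prompt processing, 16 threads
    "IQ3_S": 100.21,
    "IQ3_XXS": 105.25,
    "IQ2_M": 117.88,
    "IQ2_XS": 136.38,
    "IQ2_XXS": 128.47,
}
tg128 = {  # token generation, 8 threads
    "IQ2_XXS": 23.94,
    "IQ2_XS": 23.27,
    "IQ2_M": 18.88,
    "IQ3_XXS": 19.07,
    "IQ3_S": 15.44,
}


def mean_tps(results):
    """Hypothetical helper: arithmetic mean of t/s, rounded to two decimals."""
    return round(mean(results.values()), 2)


print(f"pp512 mean: {mean_tps(pp512):.2f}")  # prints "pp512 mean: 117.64"
print(f"tg128 mean: {mean_tps(tg128):.2f}")  # prints "tg128 mean: 20.12"
```

Both results agree with the means quoted in the commit message.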
Diffstat (limited to 'examples/llama.android/llama/src/main/java/android')
0 files changed, 0 insertions, 0 deletions
