author: Iwan Kawrakow <iwan.kawrakow@gmail.com>, 2024-07-15 13:46:07 +0300
committer: Iwan Kawrakow <iwan.kawrakow@gmail.com>, 2024-07-15 13:46:07 +0300
commit: e4dc3babb59da45a60dd3fdf1a9a45e1e9390f37
tree: 34c88b410dcfb041977e6ee827a4ca194c1abe23 /ggml-cuda
parent: a4bbd36905b9ac2c2a5f1ded6d1e29fbd5cf4020
iq1bn(no lookup): somewhat better
We now have for Bitnet-3B:

| threads |          test |              t/s |
| ------: | ------------: | ---------------: |
|      16 |         pp512 |    308.97 ± 1.89 |
|      16 |         tg128 |     58.80 ± 0.07 |
|       8 |         tg128 |     49.79 ± 1.23 |
|       4 |         tg128 |     28.85 ± 0.02 |
|       2 |         tg128 |     15.39 ± 0.01 |
Diffstat (limited to 'ggml-cuda')
0 files changed, 0 insertions, 0 deletions