path: root/convert-hf-to-gguf.py
author Iwan Kawrakow <iwan.kawrakow@gmail.com> 2024-06-18 13:32:51 +0200
committer Iwan Kawrakow <iwan.kawrakow@gmail.com> 2024-06-22 12:02:52 +0300
commit 4f51348d3d5c0f0bfee42d0a7efc81030f046d13
tree c6fc7e2acb5ee8f912d4d615d62bdf2dccdff170 /convert-hf-to-gguf.py
parent 01ea9a862d4afb73f936de8f4ef46401ce11b596
Bitnet(2.25 bpw): Metal
We get PP-512 = 702 t/s and TG-128 = 84 t/s. This is almost on par with q4_0, which is rare on Metal (not to say nonexistent). For reference, q4_0 gives 726 t/s (PP-512) / 86 t/s (TG-128) for Bitnet. The TG result is kind of funny, given that we hit 72 t/s on the CPU.
Diffstat (limited to 'convert-hf-to-gguf.py')
0 files changed, 0 insertions, 0 deletions