author Kawrakow <iwankawrakow@gmail.com> 2025-06-01 15:23:44 +0300
committer GitHub <noreply@github.com> 2025-06-01 15:23:44 +0300
commit 35374bc7e8de2b221ed4eabe426e05d8b9a7f99b
tree f6e8438421d7c0a5971be7259a1581d116e47e79 /ggml/src/ggml-cann.cpp
parent 7239ce6b35f0a1812bb54393f6a237c4f7cfe713
Metal implementation for the trellis quants. (#475)
* iq2_kt: Metal dequantize

* iq2_kt: Metal GEMV

Performance is actually quite decent: 52 t/s on my M2-Max for LlaMA-3.1-8B

* iq3_kt: Metal dequantize

* iq3_kt: Metal GEMV

Performance is not as good as iq2_kt: 40 t/s on my M2-Max for LlaMA-3.1-8B.
Flipping signs is a costly affair.

* iq4_kt: Metal dequantize - getting NaNs

* iq4_kt: Metal GEMV - also not working

* iq4_kt: Metal still not working

* Disable iq4_kt on Metal for now

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
Diffstat (limited to 'ggml/src/ggml-cann.cpp')
0 files changed, 0 insertions, 0 deletions