path: root/ggml/src/iqk/iqk_quantize.cpp
author Kawrakow <iwankawrakow@gmail.com> 2025-06-26 08:50:49 +0200
committer GitHub <noreply@github.com> 2025-06-26 08:50:49 +0200
commit 5236c98b41ea564e2211a47c5a1fffcc02e24feb
tree aeab88437f924a6ec7814da812332aa8ae1e3d41 /ggml/src/iqk/iqk_quantize.cpp
parent 8e5106b20f694c84811b073b3a4f86ca9d871441
CUDA: MMQ for iqX_r4 quants (#557)
* cuda: MMQ for iq2_k_r4
* cuda: MMQ for iq3_k_r4
* cuda: MMQ for iq4_k_r4
* cuda: MMQ for iq5_k_r4
* iqk_r4 quants: use MMQ only for batches < 1024 tokens

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
Diffstat (limited to 'ggml/src/iqk/iqk_quantize.cpp')
0 files changed, 0 insertions, 0 deletions