author | Kawrakow <iwankawrakow@gmail.com> | 2025-06-26 08:50:49 +0200 |
---|---|---|
committer | GitHub <noreply@github.com> | 2025-06-26 08:50:49 +0200 |
commit | 5236c98b41ea564e2211a47c5a1fffcc02e24feb (patch) | |
tree | aeab88437f924a6ec7814da812332aa8ae1e3d41 /examples/export-lora | |
parent | 8e5106b20f694c84811b073b3a4f86ca9d871441 (diff) |
CUDA: MMQ for iqX_r4 quants (#557)
* cuda: MMQ for iq2_k_r4
* cuda: MMQ for iq3_k_r4
* cuda: MMQ for iq4_k_r4
* cuda: MMQ for iq5_k_r4
* iqk_r4 quants: use MMQ only for batches < 1024 tokens
---------
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
Diffstat (limited to 'examples/export-lora')
0 files changed, 0 insertions, 0 deletions