Field | Value | Date |
---|---|---|
author | Kawrakow <iwankawrakow@gmail.com> | 2025-07-10 09:27:28 +0200 |
committer | GitHub <noreply@github.com> | 2025-07-10 09:27:28 +0200 |
commit | 283753cabcabd30eb2cfb93739d9c1679200bf1f (patch) | |
tree | 86de891461ece7de11c98f7fb3eb494203b36cbb /gguf-py/gguf/constants.py | |
parent | 5446ccc8ac87037484ba63f91941de35e0bd58ca (diff) | |
CUDA: Faster prompt processing for several quantization types (#595)
* cuda: slightly faster MMQ for iq3_k, iq3_k_r4
* cuda: slightly faster MMQ for iq4_k, iq4_k_r4
* cuda: slightly faster MMQ for iq4_ks_r4
* cuda: slightly faster MMQ for iq4_ks
* cuda: slightly faster MMQ for iq4_xs
---------
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
Diffstat (limited to 'gguf-py/gguf/constants.py')
0 files changed, 0 insertions, 0 deletions