path: root/convert_hf_to_gguf.py
author Kawrakow <iwankawrakow@gmail.com> 2025-07-10 09:27:28 +0200
committer GitHub <noreply@github.com> 2025-07-10 09:27:28 +0200
commit 283753cabcabd30eb2cfb93739d9c1679200bf1f
tree 86de891461ece7de11c98f7fb3eb494203b36cbb /convert_hf_to_gguf.py
parent 5446ccc8ac87037484ba63f91941de35e0bd58ca
CUDA: Faster prompt processing for several quantization types (#595)
* cuda: slightly faster MMQ for iq3_k, iq3_k_r4
* cuda: slightly faster MMQ for iq4_k, iq4_k_r4
* cuda: slightly faster MMQ for iq4_ks_r4
* cuda: slightly faster MMQ for iq4_ks
* cuda: slightly faster MMQ for iq4_xs

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
Diffstat (limited to 'convert_hf_to_gguf.py')
0 files changed, 0 insertions, 0 deletions