ik_llama.cpp.git

author	Kawrakow <iwankawrakow@gmail.com>	2025-05-04 12:45:00 +0300
committer	GitHub <noreply@github.com>	2025-05-04 12:45:00 +0300
commit	f7c9a0f036951fecab32e056df954ebc54f8688f (patch)
tree	277a7c5ee63fda3841488e38a1dda9d2a43e0094 /examples/llava/convert_image_encoder_to_gguf.py
parent	13281282986fb6783d0d7d64b3610bfb7085e749 (diff)

CUDA: MMQ for IQ4_KS (#374)

* WIP
* WIP: still getting illegal memory access
* CUDA: MMQ for iq4_ks now works ~25% faster than dequantize+cuBLAS, ~10% slower than Q4_0 MMQ.

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>

Diffstat (limited to 'examples/llava/convert_image_encoder_to_gguf.py')

0 files changed, 0 insertions, 0 deletions

