author | Kawrakow <iwankawrakow@gmail.com> | 2025-05-04 12:45:00 +0300 |
---|---|---|
committer | GitHub <noreply@github.com> | 2025-05-04 12:45:00 +0300 |
commit | f7c9a0f036951fecab32e056df954ebc54f8688f (patch) | |
tree | 277a7c5ee63fda3841488e38a1dda9d2a43e0094 /ggml/src/ggml-vulkan.cpp | |
parent | 13281282986fb6783d0d7d64b3610bfb7085e749 (diff) |
CUDA: MMQ for IQ4_KS (#374)
* WIP
* WIP: still getting illegal memory access
* CUDA: MMQ for iq4_ks now works
~25% faster than dequantize+cuBLAS, ~10% slower than Q4_0 MMQ.
---------
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
Diffstat (limited to 'ggml/src/ggml-vulkan.cpp')
0 files changed, 0 insertions, 0 deletions