path: root/ggml/src/ggml-impl.h
author: Kawrakow <iwankawrakow@gmail.com> 2025-05-04 12:45:00 +0300
committer: GitHub <noreply@github.com> 2025-05-04 12:45:00 +0300
commit: f7c9a0f036951fecab32e056df954ebc54f8688f (patch)
tree: 277a7c5ee63fda3841488e38a1dda9d2a43e0094 /ggml/src/ggml-impl.h
parent: 13281282986fb6783d0d7d64b3610bfb7085e749 (diff)
CUDA: MMQ for IQ4_KS (#374)
* WIP
* WIP: still getting illegal memory access
* CUDA: MMQ for iq4_ks now works ~25% faster than dequantize+cuBLAS, ~10% slower than Q4_0 MMQ.

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
Diffstat (limited to 'ggml/src/ggml-impl.h')
0 files changed, 0 insertions, 0 deletions