summaryrefslogtreecommitdiff
path: root/examples/server/server.cpp
diff options
context:
space:
mode:
authorKawrakow <iwankawrakow@gmail.com>2025-05-26 19:34:54 +0300
committerGitHub <noreply@github.com>2025-05-26 19:34:54 +0300
commit14292913260af89f37a6b856ef73bf88bda25129 (patch)
tree67aed02032ea97819a279e28957821f337221c9e /examples/server/server.cpp
parent24c010b3916b5f1bb9d712d610d1fe9308ef7df4 (diff)
CUDA implementation for IQ2_K_R4, IQ3_K_R4, IQ4_K_R4, IQ5_K_R4 (#461)
* CUDA: iq4_k_r4 dequantize * CUDA: iq4_k_r4 GEMV ~10% slower than iq4_k. * CUDA: slightly faster iq4_k_r4 GEMV * CUDA: slightly faster iq4_k_r4 GEMV We are now within 3% of iq4_k * CUDA: iq5_k_r4 dequantize * CUDA: iq5_k_r4 GEMV ~3% slower than iq5_k. * CUDA: iq3_k_r4 dequantize * CUDA: iq3_k_r4 GEMV * CUDA: slightly faster iq3_k_r4 GEMV * CUDA: iq2_k_r4 GEMV * CUDA: faster iq2_k_r4 GEMV --------- Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
Diffstat (limited to 'examples/server/server.cpp')
0 files changed, 0 insertions, 0 deletions