path: root/ggml/src/ggml-rpc.cpp
author: Kawrakow <iwankawrakow@gmail.com> 2025-01-12 13:19:14 +0200
committer: GitHub <noreply@github.com> 2025-01-12 13:19:14 +0200
commit: c19404bcdaa2a1f8900801d4865673e5f7a03f63
tree: d558fa4a0ace9333730cb7afa98644660091fc13 /ggml/src/ggml-rpc.cpp
parent: 7553989dd88749de028853f9c0ea39651aad92a3
MoE fix for R4 quants (#170)
* Fix bug in iqk_mul_mat

  I recently added the possibility to have a matrix multiplication kernel that processes 16 columns of the right matrix per iteration. This introduced a bug that shows up when the batch size is greater than 16, is not a multiple of 16, and the remainder is not a multiple of the maximum number of columns processed by the regular kernels (and so it never showed up in my testing using TG-128 and PP-512). This commit fixes the issue.

* Make sure rows per thread is a multiple of 4 also for MoE when using _r4 quants

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
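
For illustration only, the two conditions described in the message can be sketched as follows. This is not the code from iqk_mul_mat. The helper names `plan_column_tiles` and `rows_per_thread` are hypothetical, and the regular-kernel maximum of 8 columns used in the usage note is an assumed value, since the message does not state it. The sketch shows how a batch can be split into 16-column tiles, regular tiles, and a final partial tile, and how a per-thread row count can be rounded up to a multiple of 4.

```cpp
#include <vector>

// Hypothetical helper: split `ny` right-matrix columns into tile widths.
// `wide` is the 16-column kernel width named in the commit message;
// `max_regular` is an assumed parameter for the widest regular kernel.
std::vector<int> plan_column_tiles(int ny, int wide, int max_regular) {
    std::vector<int> tiles;
    int done = 0;
    // Use the wide kernel while a full wide tile still fits.
    while (ny - done >= wide) {
        tiles.push_back(wide);
        done += wide;
    }
    // Use the widest regular kernel for what remains.
    while (ny - done >= max_regular) {
        tiles.push_back(max_regular);
        done += max_regular;
    }
    // The leftover is the case the message describes: a remainder that is
    // not a multiple of max_regular still needs its own, narrower tile.
    if (ny - done > 0) {
        tiles.push_back(ny - done);
    }
    return tiles;
}

// Hypothetical helper: rows assigned to each thread, rounded up to a
// multiple of `group` (4 for _r4 quants, per the commit message).
int rows_per_thread(int nrows, int nthreads, int group = 4) {
    int per_thread = (nrows + nthreads - 1) / nthreads;  // ceiling division
    return (per_thread + group - 1) / group * group;     // round up to group
}
```

With the assumed values, `plan_column_tiles(27, 16, 8)` yields tiles of 16, 8, and 3 columns, which is the kind of batch the message describes: greater than 16, not a multiple of 16, with a remainder (11) that is not a multiple of 8. A batch of 512 splits into 32 full 16-column tiles and has no remainder, consistent with the note that PP-512 testing did not expose the problem.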
Diffstat (limited to 'ggml/src/ggml-rpc.cpp')
0 files changed, 0 insertions, 0 deletions