author | Kawrakow <iwankawrakow@gmail.com> | 2025-01-12 13:19:14 +0200 |
---|---|---|
committer | GitHub <noreply@github.com> | 2025-01-12 13:19:14 +0200 |
commit | c19404bcdaa2a1f8900801d4865673e5f7a03f63 (patch) | |
tree | d558fa4a0ace9333730cb7afa98644660091fc13 /ggml/src/ggml-rpc.cpp | |
parent | 7553989dd88749de028853f9c0ea39651aad92a3 (diff) |
MoE fix for R4 quants (#170)
* Fix bug in iqk_mul_mat
I recently added the possibility to have a matrix multiplication
kernel that processes 16 columns in the right matrix per iteration.
This introduced a bug that shows up when batch size is greater
than 16, is not a multiple of 16, and the remainder is not a multiple
of the maximum columns being processed by the regular kernels
(and so, never showed up in my testing using TG-128 and PP-512).
This commit fixes the issue.
* Make sure rows per thread is a multiple of 4 also for MoE when using _r4 quants
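To make the failing case from the first item concrete, here is a minimal sketch of how a batch of right-matrix columns might be split between a 16-column kernel and narrower regular kernels. It is not the code in iqk_mul_mat: `split_columns`, `hits_reported_case`, and the regular-kernel maximum of 8 columns are assumptions made for illustration only.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not from iqk_mul_mat): split n_cols right-matrix columns
// into tile widths. Full tiles of `wide` columns come first, then full tiles of
// `max_regular` columns, then whatever tail is left.
// max_regular = 8 is an assumed value, not taken from the commit.
std::vector<int> split_columns(int n_cols, int wide = 16, int max_regular = 8) {
    assert(n_cols >= 0 && wide > 0 && max_regular > 0);
    std::vector<int> tiles;
    int done = 0;
    while (n_cols - done >= wide) {          // wide-kernel iterations
        tiles.push_back(wide);
        done += wide;
    }
    while (n_cols - done >= max_regular) {   // full regular-kernel iterations
        tiles.push_back(max_regular);
        done += max_regular;
    }
    if (n_cols - done > 0) {                 // leftover tail columns
        tiles.push_back(n_cols - done);
    }
    return tiles;
}

// True when a batch size matches the three conditions in the commit message:
// greater than `wide`, not a multiple of `wide`, and a remainder that is not a
// multiple of `max_regular`.
bool hits_reported_case(int n_cols, int wide = 16, int max_regular = 8) {
    const int rem = n_cols % wide;
    return n_cols > wide && rem != 0 && rem % max_regular != 0;
}
```

Under these assumptions a batch of 21 columns splits into tiles of 16 and 5 and matches the reported conditions, while a batch of 24 (16 + 8) does not. Assuming TG-128 runs with a batch size of 1 and PP-512 with a batch size of 512, neither matches, which is consistent with the bug not showing up in those tests.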
---------
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
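The second item of the commit message, keeping rows per thread a multiple of 4 for _r4 quants, can be sketched as follows. This is an illustration under assumptions, not the code changed by the commit: `thread_row_range` is a hypothetical helper, and the total row count is assumed to already be a multiple of 4.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>

// Hypothetical helper (not from the commit): assign rows to threads in whole
// groups of 4 so that no group of 4 rows is split across two threads.
// Returns the half-open row range [first, last) for thread `ith`.
std::pair<int, int> thread_row_range(int n_rows, int n_threads, int ith) {
    constexpr int kGroup = 4;  // rows per thread must be a multiple of this
    assert(n_rows >= 0 && n_rows % kGroup == 0);
    assert(n_threads > 0 && ith >= 0 && ith < n_threads);

    const int n_groups   = n_rows / kGroup;
    // Round up so that all groups are covered; trailing threads may get fewer
    // groups, or none at all.
    const int per_thread = (n_groups + n_threads - 1) / n_threads;
    const int first      = std::min(ith * per_thread, n_groups);
    const int last       = std::min(first + per_thread, n_groups);
    return {kGroup * first, kGroup * last};
}
```

With 20 rows and 3 threads, the ranges are [0, 8), [8, 16), and [16, 20), so every boundary falls on a multiple of 4. A thread with no work gets an empty range such as [8, 8).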
Diffstat (limited to 'ggml/src/ggml-rpc.cpp')
0 files changed, 0 insertions, 0 deletions