path: root/ggml/src/iqk/iqk_quantize.cpp
author Kawrakow <iwankawrakow@gmail.com> 2025-03-27 10:48:52 +0100
committer GitHub <noreply@github.com> 2025-03-27 10:48:52 +0100
commit 23b0addb34d8942baedc6f968460560392feadd3
tree 9738a4d2a96860231afbe86f8e278632a31c4faf /ggml/src/iqk/iqk_quantize.cpp
parent d0b52076da0261f291b01f1ffa44884c8b2cdb1c
Make sure tensor row size is multiple of block size also when quantizing with --pure (#294)
* WIP - not working
* q8_0 without bells and whistles works
* It works for q8_0
* Use bf16 instead of f16,int16
* q4_0_r8
* q5_0_r4
* q6_0_r4
* Also q4_1 and q5_1
* Add check if selected type is possible with --pure

I often want to quantize with --pure to see quantization performance without quantization mixes. But for models that have tensors with row sizes that are not a multiple of 256, this results in a crash for k- and i-quants. Hence, let's add a check whether the quant selected via --pure is applicable, and change it if not.

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
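
The applicability check described in the message can be illustrated with a minimal sketch. This is not the code from the commit: `QuantType`, `block_size`, and `pick_pure_type` are hypothetical names invented for illustration, and the fallback policy (try one caller-supplied alternative type) is an assumption. The only facts taken as given are that k- and i-quants use blocks of 256 values and q8_0 uses blocks of 32.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical, simplified stand-in for the real quantization type enum.
enum class QuantType { Q8_0, Q4_K, IQ4_XS };

// Number of values per quantization block for each illustrative type.
// k- and i-quants use 256-value blocks; q8_0 uses 32-value blocks.
inline int64_t block_size(QuantType t) {
    switch (t) {
        case QuantType::Q8_0:   return 32;
        case QuantType::Q4_K:   return 256;
        case QuantType::IQ4_XS: return 256;
    }
    return 1; // unreachable for the values listed above
}

// Hypothetical helper: return the type to use for a tensor row of
// `row_size` values when a single type was requested for every tensor.
// If the requested type does not divide the row evenly, try `fallback`;
// if that does not fit either, return std::nullopt so the caller can
// decide what to do (for example, leave the tensor unquantized).
inline std::optional<QuantType> pick_pure_type(QuantType requested,
                                               int64_t row_size,
                                               QuantType fallback) {
    assert(row_size > 0);
    if (row_size % block_size(requested) == 0) {
        return requested;
    }
    if (row_size % block_size(fallback) == 0) {
        return fallback;
    }
    return std::nullopt;
}
```

Under these assumptions, a row of 4096 values keeps a requested 256-block type, while a row of 4160 values (a multiple of 32 but not of 256) would be switched to the 32-block fallback instead of crashing.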
Diffstat (limited to 'ggml/src/iqk/iqk_quantize.cpp')
0 files changed, 0 insertions, 0 deletions