author | Kawrakow <iwankawrakow@gmail.com> | 2025-03-27 10:48:52 +0100 |
---|---|---|
committer | GitHub <noreply@github.com> | 2025-03-27 10:48:52 +0100 |
commit | 23b0addb34d8942baedc6f968460560392feadd3 (patch) | |
tree | 9738a4d2a96860231afbe86f8e278632a31c4faf /ggml/src/iqk/iqk_quantize.cpp | |
parent | d0b52076da0261f291b01f1ffa44884c8b2cdb1c (diff) |
Make sure tensor row size is multiple of block size also when quantizing with --pure (#294)
* WIP - not working
* q8_0 without bells and whistles works
* It works for q8_0
* Use bf16 instead of f16,int16
* q4_0_r8
* q5_0_r4
* q6_0_r4
* Also q4_1 and q5_1
* Add check if selected type is possible with --pure
I often want to quantize with --pure to see quantization performance
without quantization mixes. But for models where there are tensors
with row sizes that are not a multiple of 256, this results in a crash
for k- and i-quants. Hence, let's add a check if the quant selected
via --pure is applicable, and change it if not.
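
The check described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from this commit: `QuantType`, `block_size`, `is_applicable` and `choose_pure_type` are hypothetical names, and the fallback to q8_0 is an assumed policy. The only facts taken as given are that k- and i-quants use blocks of 256 elements and that q4_0 and q8_0 use blocks of 32.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the real ggml type enum (illustration only).
enum class QuantType { Q4_0, Q8_0, Q4_K, IQ4_XS };

// Elements per quantization block: 256 for k- and i-quants,
// 32 for legacy quants such as q4_0 and q8_0.
inline int64_t block_size(QuantType t) {
    switch (t) {
        case QuantType::Q4_K:
        case QuantType::IQ4_XS: return 256;
        case QuantType::Q4_0:
        case QuantType::Q8_0:   return 32;
    }
    return 1;
}

// A type is applicable to a tensor if each row splits into whole blocks.
inline bool is_applicable(QuantType t, int64_t n_per_row) {
    assert(n_per_row > 0);
    return n_per_row % block_size(t) == 0;
}

// Keep the type requested via --pure when it fits the row size,
// otherwise switch to a fallback (q8_0 here is an assumption).
inline QuantType choose_pure_type(QuantType requested, int64_t n_per_row,
                                  QuantType fallback = QuantType::Q8_0) {
    return is_applicable(requested, n_per_row) ? requested : fallback;
}
```

A real implementation would presumably also verify that the fallback type itself fits the row size; this sketch omits that step.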
---------
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
Diffstat (limited to 'ggml/src/iqk/iqk_quantize.cpp')
0 files changed, 0 insertions, 0 deletions