author | Kawrakow <iwankawrakow@gmail.com> | 2025-03-27 05:49:16 +0100 |
---|---|---|
committer | GitHub <noreply@github.com> | 2025-03-27 05:49:16 +0100 |
commit | d0b52076da0261f291b01f1ffa44884c8b2cdb1c | |
tree | 93abea8ae30140fbd6733af91eede57c2243e91d /ggml/include/ggml.h | |
parent | a22250df93fd833a6cb7f310b159ad1b54e4d582 |
Use bf16 instead of fp16 block scales for q8_1 (#292)
* WIP - not working
* q8_0 without bells and whistles works
* It works for q8_0
* Use bf16 instead of f16,int16
* q4_0_r8
* q5_0_r4
* q6_0_r4
* Also q4_1 and q5_1
* q8_0_r8 on avx2
---------
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
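For readers unfamiliar with the two 16-bit formats named in the commit title: bf16 is the upper 16 bits of an IEEE-754 float32 (8 exponent bits, 7 mantissa bits), while fp16 has 5 exponent bits and 10 mantissa bits, with a largest finite value of 65504. The commit message does not state its motivation, so the following is only a minimal sketch of what storing a scale as bf16 means. The helper names `f32_to_bf16` and `bf16_to_f32` are hypothetical and are not taken from ggml.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not the ggml API): convert float32 to bf16
 * by keeping the upper 16 bits, with round-to-nearest-even. */
static uint16_t f32_to_bf16(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);          /* well-defined type pun */
    if ((bits & 0x7fffffffu) > 0x7f800000u) {
        /* NaN: truncate and force a quiet-NaN mantissa bit */
        return (uint16_t)((bits >> 16) | 0x0040u);
    }
    /* Add 0x7fff plus the lowest kept bit so that ties round to even */
    bits += 0x7fffu + ((bits >> 16) & 1u);
    return (uint16_t)(bits >> 16);
}

/* Hypothetical helper: widen bf16 back to float32 (exact, no rounding). */
static float bf16_to_f32(uint16_t h) {
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

With these helpers, a value such as 70000.0f, which is above the fp16 maximum, survives the round trip as a finite number (70144.0f), at the cost of coarser precision than fp16 would give for in-range values.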
Diffstat (limited to 'ggml/include/ggml.h')
-rw-r--r-- | ggml/include/ggml.h | 5 |
1 files changed, 3 insertions, 2 deletions
diff --git a/ggml/include/ggml.h b/ggml/include/ggml.h
index 91219d4a..7cc9100d 100644
--- a/ggml/include/ggml.h
+++ b/ggml/include/ggml.h
@@ -396,8 +396,9 @@ extern "C" {
         // GGML_TYPE_I2_S = 36,
         //
-        GGML_TYPE_Q8_0_X4 = 98,
-        GGML_TYPE_Q8_1_X4 = 99,
+        GGML_TYPE_Q8_0_X4 = 97,
+        GGML_TYPE_Q8_1_X4 = 98,
+        GGML_TYPE_Q8_2_X4 = 99,
         GGML_TYPE_Q6_0 = 133,
         GGML_TYPE_IQ1_BN = 134,
         GGML_TYPE_IQ2_BN = 135,
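The hunk above shifts two existing enumerator values down by one and gives the new `GGML_TYPE_Q8_2_X4` the value 99. As a sketch of the consequence, the snippet below mirrors the old and new numbering in two hypothetical lookup functions (`name_before` and `name_after` are illustrative names, not part of ggml). Whether these values are ever stored outside a running process is not something the diff shows; the sketch only illustrates that the same integer names a different type on each side of the commit.

```c
#include <string.h>

/* Hypothetical lookup using the values from the '-' lines of the hunk. */
static const char *name_before(int id) {
    switch (id) {
        case 98: return "Q8_0_X4";
        case 99: return "Q8_1_X4";
        default: return "unknown";
    }
}

/* Hypothetical lookup using the values from the '+' lines of the hunk. */
static const char *name_after(int id) {
    switch (id) {
        case 97: return "Q8_0_X4";
        case 98: return "Q8_1_X4";
        case 99: return "Q8_2_X4";
        default: return "unknown";
    }
}
```

Under this numbering, the raw value 98 resolves to `Q8_0_X4` before the commit and to `Q8_1_X4` after it, so code that compares against the enumerator names stays correct while code that hard-codes the integers would not.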