IQ3_S_R4 (#162)

* iq3_s_r4: WIP
* iq3_s_r4: Zen4
* iq3_s_r4: slightly better Zen4
* iq3_s_r4: AVX2
* iq3_s_r4: NEON
* iq3_s_r4: rearrange quants
* iq3_s_r4: rearranged quants - AVX2
* iq3_s_r4: rearranged quants - NEON

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
author: Kawrakow <iwankawrakow@gmail.com> 2024-12-23 14:34:23 +0100
committer: GitHub <noreply@github.com> 2024-12-23 14:34:23 +0100
commit: 167479e0272dcb5f9babc7668664fa2a75c4f2dd (patch)
tree: c5347677e97acaa0da4ff619c01231afeda40488 /examples/quantize/quantize.cpp
parent: 1a0a35dcd175a2b37fb6a347f69f31cb37eaf035 (diff)
1 files changed, 1 insertions, 0 deletions
diff --git a/examples/quantize/quantize.cpp b/examples/quantize/quantize.cpp
index 1599405b..5ffdbc84 100644
--- a/examples/quantize/quantize.cpp
+++ b/examples/quantize/quantize.cpp
@@ -39,6 +39,7 @@ static const std::vector<struct quant_option> QUANT_OPTIONS = {
     { "IQ3_XXS",  LLAMA_FTYPE_MOSTLY_IQ3_XXS,  " 3.06 bpw quantization",            },
     { "IQ3_XXS_R4",LLAMA_FTYPE_MOSTLY_IQ3_XXS_R4,"IQ3_XXS repacked",            },
     { "IQ3_S",    LLAMA_FTYPE_MOSTLY_IQ3_S,    " 3.44 bpw quantization",            },
+    { "IQ3_S_R4", LLAMA_FTYPE_MOSTLY_IQ3_S_R4, "IQ3_S repacked",            },
     { "IQ3_M",    LLAMA_FTYPE_MOSTLY_IQ3_M,    " 3.66 bpw quantization mix",        },
     { "Q3_K",     LLAMA_FTYPE_MOSTLY_Q3_K_M,   "alias for Q3_K_M" },
     { "Q3_K_R4",  LLAMA_FTYPE_MOSTLY_Q3_K_R4,  "Q3_K_S repacked" },