author | Nexes the Elder <124105151+Nexesenex@users.noreply.github.com> | 2025-05-24 10:49:10 +0200 |
---|---|---|
committer | GitHub <noreply@github.com> | 2025-05-24 11:49:10 +0300 |
commit | c7ecd4e23acb42f1150abf0b118e0a2c7b8dc959 (patch) | |
tree | 6c619eb2d01abd3435f53bb092209935b252c8bb /gguf-py | |
parent | a2c42f9985a96abc8b1b4104b0524ea4b2da9363 (diff) |
Legacy quants conversion schemes in convert_hf_to_gguf.py (#449)
* Legacy quants conversion schemes in convert_hf_to_gguf.py
The main goal is to allow smaller conversions that can be used to generate an iMatrix file.
`Q4_0` and `Q4_1` here use q5_0 for the embeddings, output, attn_k and attn_v tensors.
`Q5_0` and `Q5_1` here use q8_0 for the embeddings, output, attn_k and attn_v tensors.
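The per-tensor selection described above can be sketched as follows. This is a minimal illustration, not the code added to convert_hf_to_gguf.py: the function name `pick_quant_type`, the `BUMPED_TYPE` table, and the GGUF-style tensor names used for matching (`token_embd.weight`, `output.weight`, `blk.N.attn_k.weight`, `blk.N.attn_v.weight`) are assumptions made for the example.

```python
# Hypothetical sketch of the mixing scheme; names are illustrative and
# not taken from convert_hf_to_gguf.py.

# Requested legacy type -> higher-precision type used for the selected tensors.
BUMPED_TYPE = {
    "Q4_0": "Q5_0",
    "Q4_1": "Q5_0",
    "Q5_0": "Q8_0",
    "Q5_1": "Q8_0",
}

# Assumed GGUF-style names for the embeddings and output tensors.
EXACT_NAMES = ("token_embd.weight", "output.weight")


def pick_quant_type(tensor_name: str, base_type: str) -> str:
    """Return the quant type to use for one tensor under the scheme above."""
    bumped = BUMPED_TYPE.get(base_type)
    if bumped is None:
        # Not one of the four legacy types covered here: leave unchanged.
        return base_type
    # Exact match avoids catching e.g. "blk.0.attn_output.weight".
    if tensor_name in EXACT_NAMES:
        return bumped
    if ".attn_k." in tensor_name or ".attn_v." in tensor_name:
        return bumped
    return base_type


if __name__ == "__main__":
    for name in ("token_embd.weight", "blk.0.attn_k.weight", "blk.0.ffn_up.weight"):
        print(name, "->", pick_quant_type(name, "Q4_0"))
```

Embeddings and output are matched by exact name in this sketch so that other tensors whose names merely end in `output.weight` keep the base type.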
Adapted from the following llama.cpp mainline PR: https://github.com/ggml-org/llama.cpp/pull/9022
Original author @chentyjpm
Also adds 2 forgotten mentions of FTYPE IQ3_KL in the llama.cpp file.
* Add forgotten IQ5_KS case mention
Diffstat (limited to 'gguf-py')
0 files changed, 0 insertions, 0 deletions