author | compilade <git@compilade.net> | 2024-05-11 11:06:26 -0400 |
---|---|---|
committer | GitHub <noreply@github.com> | 2024-05-11 11:06:26 -0400 |
commit | 5a419926b0c4efab0531401aea91522aaea9fd07 (patch) | |
tree | fc04fa59a6588650a6fed70fedd8c1d4b39ec1d1 /examples/server | |
parent | fae9d234b6606693704eca62fe4aefbb6c6abb45 (diff) |
convert-hf : support bfloat16 conversion (#7158)
* convert-hf : support bfloat16 conversion
* gguf-py : flake8 fixes
* convert-hf : add missing space after comma
* convert-hf : get bit-exact same output as ./quantize
The quantization version was missing.
* convert-hf : don't round bf16 NANs
* convert-hf : save some memory with np.int16 intermediate bf16 weights
* convert-hf : more closely match llama.cpp with which weights to keep in f32
* convert-hf : add --outtype auto-f16
A reason for this to exist is for model quantizers who want an initial
GGUF with the most fidelity to the original model while still using
a 16-bit float type instead of 32-bit floats.
* convert-hf : remove a semicolon because flake8 doesn't like it
It's a reflex from when programming in C/C++, I guess.
* convert-hf : support outtype templating in outfile name
* convert-hf : rename --outtype auto-f16 to --outtype auto
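For illustration, the bf16 items above (rounding, leaving NaNs unrounded, and keeping the intermediate weights as np.int16) can be sketched in NumPy roughly as follows. This is a minimal sketch, not the code from this commit: the function names fp32_to_bf16_bits and bf16_bits_to_fp32 are hypothetical, and it assumes round-to-nearest-even on the upper 16 bits of each float32, with NaNs truncated and forced quiet instead of rounded. Any extra handling the real converter may do (subnormals, for example) is left out.

```python
import numpy as np


def fp32_to_bf16_bits(x: np.ndarray) -> np.ndarray:
    """Return bf16 bit patterns (stored as np.int16) for float32 input.

    Hypothetical helper for illustration only.
    """
    bits = np.ascontiguousarray(x, dtype=np.float32).view(np.uint32)

    # NaN: exponent all ones and a non-zero mantissa.
    is_nan = (bits & np.uint32(0x7FFFFFFF)) > np.uint32(0x7F800000)

    # Round to nearest even: add 0x7FFF plus the lowest bit that will be kept.
    lsb = (bits >> np.uint32(16)) & np.uint32(1)
    rounded = bits + (np.uint32(0x7FFF) + lsb)

    # Rounding a NaN can carry into the exponent or sign and turn it into
    # an infinity or a zero, so NaNs are truncated and forced quiet instead.
    quiet = (bits & np.uint32(0xFFFF0000)) | np.uint32(0x00400000)

    out = np.where(is_nan, quiet, rounded) >> np.uint32(16)
    # Only 16 bits per weight are kept, half the size of the float32 input.
    return out.astype(np.uint16).view(np.int16)


def bf16_bits_to_fp32(b: np.ndarray) -> np.ndarray:
    """Expand bf16 bit patterns back to float32 by zero-filling the low bits."""
    wide = np.ascontiguousarray(b).view(np.uint16).astype(np.uint32)
    return (wide << np.uint32(16)).view(np.float32)


if __name__ == "__main__":
    w = np.array([1.0, -2.5, 3.14159, np.nan], dtype=np.float32)
    packed = fp32_to_bf16_bits(w)
    print(packed.dtype, bf16_bits_to_fp32(packed))
```

A signaling NaN such as the bit pattern 0x7F800001 shows why the NaN case is handled separately: plain rounding or truncation would leave 0x7F80, which is +infinity in bf16, whereas forcing the quiet bit keeps it a NaN.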
Diffstat (limited to 'examples/server')
0 files changed, 0 insertions, 0 deletions