path: root/gguf-py/gguf/gguf_writer.py
Age | Commit message | Author
2025-01-23 | Deepseek V3 support added (#176) | saood06
2024-08-12 | Merge mainline - Aug 12 2024 (#17) | Kawrakow
2024-07-27 | Merge mainline llama.cpp (#3) | Kawrakow
2024-06-17 | update: support Qwen2-57B-A14B (#7835) | Ștefan-Gabriel Muscalu
2024-06-09 | gguf-py : decouple adding metadata from writing in GGUFWriter (#7827) | compilade
2024-05-28 | Add support for DeepseekV2ForCausalLM (#7519) | fairydreaming
2024-05-25 | gguf-py : fix and simplify quantized shape round-trip (#7483) | compilade
2024-05-21 | llama : add phi3 128K model support (#7225) | liuwei-git
2024-05-13 | convert-hf : support direct Q8_0 conversion (#7234) | compilade
2024-05-11 | convert-hf : support bfloat16 conversion (#7158) | compilade
2024-05-08 | convert-hf : save memory with lazy evaluation (#7075) | compilade
2024-05-03 | convert.py : add python logging instead of print() (#6511) | Brian
2024-04-29 | llama : fix BPE pre-tokenization (#6920) | Georgi Gerganov
2024-04-28 | gguf : enforce that tensor names are unique (#6905) | Xuan Son Nguyen
2024-04-18 | convert : support models with multiple chat templates (#6588) | Sigbjørn Skjæret
2024-04-16 | gguf : add special tokens metadata for FIM/Infill (#6689) | Daniel Bevenius
2024-04-05 | gguf.py : add licence and version to gguf writer (#6504) | Brian
2024-03-15 | llama : add Command-R support (#6033) | Andrew Canis
2024-03-15 | gguf : add support for I64 and F64 arrays (#6062) | Ondřej Čertík
2024-03-14 | llama : support models without vocabulary (#5798) | Michael Podvitskiy
2024-03-14 | gguf-py : fix dtype check (#6045) | Georgi Gerganov
2024-03-14 | gguf-py : add support for I8, I16 and I32 (#6045) | Ondřej Čertík
2024-03-08 | llama : support Mamba Selective State Space Models (#5328) | compilade
2024-03-02 | convert-hf : make model class definitions self-contained (#5825) | Jared Van Bortel
2024-02-15 | Use correct type of pooling for embedding models (#5500) | Douglas Hanley
2024-02-15 | fix(gguf-py): special tokens are no longer skipped when add_<token>_token is ... | Michaël de Vries
2024-02-13 | llama : support batched embeddings (#5466) | Douglas Hanley
2024-02-11 | Add support for BERT embedding models (#5423) | Douglas Hanley
2024-02-01 | llama : support InternLM2 (#5184) | Guoteng
2024-01-02 | llama : differentiate the KV dims in the attention (#4657) | postmasters
2023-12-13 | llama : add Mixtral support (#4406) | slaren
2023-11-20 | ci : add flake8 to github actions (python linting) (#4129) | Galunid
2023-11-19 | gguf-py : export chat templates (#4125) | slaren
2023-11-12 | gguf-py: gguf_writer: Use bytearray to build metadata (#4051) | Kerfuffle
2023-11-11 | gguf-py: Refactor and allow reading/modifying existing GGUF files (#3981) | Kerfuffle