path: root/gguf-py
| Age | Commit message | Author |
|---|---|---|
| 2024-05-13 | convert-hf : support direct Q8_0 conversion (#7234) | compilade |
| 2024-05-11 | convert-hf : support bfloat16 conversion (#7158) | compilade |
| 2024-05-11 | llama : add Jina Embeddings architecture (#6826) | Joan Fontanals |
| 2024-05-11 | ggml : full ALiBi support (#7192) | Georgi Gerganov |
| 2024-05-09 | gguf-py : add special token modification capability (#7166) | Sigbjørn Skjæret |
| 2024-05-08 | convert-hf : save memory with lazy evaluation (#7075) | compilade |
| 2024-05-08 | ggml : introduce bfloat16 support (#6412) | Justine Tunney |
| 2024-05-03 | convert.py : add python logging instead of print() (#6511) | Brian |
| 2024-04-29 | llama : fix BPE pre-tokenization (#6920) | Georgi Gerganov |
| 2024-04-28 | gguf : enforce that tensor names are unique (#6905) | Xuan Son Nguyen |
| 2024-04-24 | llama : add phi3 support (#6852) | liuwei-git |
| 2024-04-21 | gguf-py : add IQ1_M to GGML_QUANT_SIZES (#6761) | pmysl |
| 2024-04-19 | Implement the OLMo architecture (#6741) | nopperl |
| 2024-04-18 | convert : support models with multiple chat templates (#6588) | Sigbjørn Skjæret |
| 2024-04-16 | llama : add StableLM2 12B (#6635) | Ashish |
| 2024-04-16 | llama : add qwen2moe (#6074) | Shijie |
| 2024-04-16 | gguf : add special tokens metadata for FIM/Infill (#6689) | Daniel Bevenius |
| 2024-04-13 | model: support arch `DbrxForCausalLM` (#6515) | Pierrick Hymbert |
| 2024-04-09 | llama : add Command R Plus support (#6491) | Carolinabanana |
| 2024-04-05 | gguf.py : add licence and version to gguf writer (#6504) | Brian |
| 2024-04-03 | llama : add SEA-LION support (#6448) | bryanSwk |
| 2024-04-03 | ggml : mul_mat_id use the same tensor for all the experts (#6387) | slaren |
| 2024-03-29 | [Model] Add support for xverse (#6301) | hxer7963 |
| 2024-03-26 | IQ1_M: 1.75 bpw quantization (#6302) | Kawrakow |
| 2024-03-23 | llama : add grok-1 support (#6204) | Julius Arkenberg |
| 2024-03-15 | llama : add Command-R support (#6033) | Andrew Canis |
| 2024-03-15 | gguf : add support for I64 and F64 arrays (#6062) | Ondřej Čertík |
| 2024-03-14 | gguf-py : bump version to 0.8.0 (#6060) | Ondřej Čertík |
| 2024-03-14 | llama : support models without vocabulary (#5798) | Michael Podvitskiy |
| 2024-03-14 | gguf-py : fix dtype check (#6045) | Georgi Gerganov |
| 2024-03-14 | gguf-py : add support for I8, I16 and I32 (#6045) | Ondřej Čertík |
| 2024-03-08 | llama : support Mamba Selective State Space Models (#5328) | compilade |
| 2024-03-03 | gguf-dump : support i-quants (#5841) | Nindaleth |
| 2024-03-02 | convert-hf : make model class definitions self-contained (#5825) | Jared Van Bortel |
| 2024-03-01 | llama : add StarCoder2 support (#5795) | Sourab Mangrulkar |
| 2024-02-21 | llama : add `gemma` model (#5631) | postmasters |
| 2024-02-15 | Use correct type of pooling for embedding models (#5500) | Douglas Hanley |
| 2024-02-15 | fix(gguf-py): special tokens are no longer skipped when add_<token>_token is ... | Michaël de Vries |
| 2024-02-13 | gguf : add python reader example (#5216) | John |
| 2024-02-13 | llama : add support for Nomic Embed (#5468) | Jared Van Bortel |
| 2024-02-13 | llama : support batched embeddings (#5466) | Douglas Hanley |
| 2024-02-11 | Add support for BERT embedding models (#5423) | Douglas Hanley |
| 2024-02-07 | llama : add MiniCPM support (#5346) | runfuture |
| 2024-02-01 | llama : support InternLM2 (#5184) | Guoteng |
| 2024-01-28 | llama : add support for Orion-14B (#5118) | sharpHL |
| 2024-01-26 | gguf : fix "general.alignment" type in gguf_reader.py (#5136) | Riceball LEE |
| 2024-01-19 | llama : support upcoming Qwen2 (#5037) | Shijie |
| 2024-01-19 | llama : add CodeShell support (#5016) | chiranko |
| 2024-01-13 | convert : update phi-2 to latest HF repo (#4903) | Georgi Gerganov |
| 2024-01-12 | llama : fix llm_build_k_shift to use correct n_rot (#4889) | Georgi Gerganov |