ik_llama.cpp.git (branch main): commit log for path gguf-py

Age | Commit message | Author
2025-02-09 | Add optional MLA (#188) | Kawrakow
2025-01-23 | Deepseek V3 support added (#176) | saood06
2024-08-12 | Merge mainline - Aug 12 2024 (#17) | Kawrakow
2024-07-27 | Merge mainline llama.cpp (#3) | Kawrakow
2024-06-22 | bitnet: python + llama | Iwan Kawrakow
2024-06-17 | update: support Qwen2-57B-A14B (#7835) | Ștefan-Gabriel Muscalu
2024-06-17 | gguf-dump.py: add --markdown dump output (#7853) | Brian
2024-06-09 | gguf-py : decouple adding metadata from writing in GGUFWriter (#7827) | compilade
2024-06-06 | llama : add jina v2 base code (#7596) | Joan Fontanals
2024-06-03 | llama : MiniCPM support tied embeddings (#7664) | zhangkaihuo
2024-05-30 | Move convert.py to examples/convert-legacy-llama.py (#7430) | Galunid
2024-05-30 | gguf-py : Add tokenizer.ggml.pre to gguf-new-metadata.py (#7627) | Galunid
2024-05-28 | Add support for DeepseekV2ForCausalLM (#7519) | fairydreaming
2024-05-25 | gguf-py : fix and simplify quantized shape round-trip (#7483) | compilade
2024-05-24 | Add support for ArcticForCausalLM (#7020) | fairydreaming
2024-05-23 | ggml : drop support for QK_K=64 (#7473) | Georgi Gerganov
2024-05-21 | llama : add phi3 128K model support (#7225) | liuwei-git
2024-05-21 | llama : remove Persimmon (#7408) | Georgi Gerganov
2024-05-13 | convert-hf : support direct Q8_0 conversion (#7234) | compilade
2024-05-11 | convert-hf : support bfloat16 conversion (#7158) | compilade
2024-05-11 | llama : add Jina Embeddings architecture (#6826) | Joan Fontanals
2024-05-11 | ggml : full ALiBi support (#7192) | Georgi Gerganov
2024-05-09 | gguf-py : add special token modification capability (#7166) | Sigbjørn Skjæret
2024-05-08 | convert-hf : save memory with lazy evaluation (#7075) | compilade
2024-05-08 | ggml : introduce bfloat16 support (#6412) | Justine Tunney
2024-05-03 | convert.py : add python logging instead of print() (#6511) | Brian
2024-04-29 | llama : fix BPE pre-tokenization (#6920) | Georgi Gerganov
2024-04-28 | gguf : enforce that tensor names are unique (#6905) | Xuan Son Nguyen
2024-04-24 | llama : add phi3 support (#6852) | liuwei-git
2024-04-21 | gguf-py : add IQ1_M to GGML_QUANT_SIZES (#6761) | pmysl
2024-04-19 | Implement the OLMo architecture (#6741) | nopperl
2024-04-18 | convert : support models with multiple chat templates (#6588) | Sigbjørn Skjæret
2024-04-16 | llama : add StableLM2 12B (#6635) | Ashish
2024-04-16 | llama : add qwen2moe (#6074) | Shijie
2024-04-16 | gguf : add special tokens metadata for FIM/Infill (#6689) | Daniel Bevenius
2024-04-13 | model: support arch `DbrxForCausalLM` (#6515) | Pierrick Hymbert
2024-04-09 | llama : add Command R Plus support (#6491) | Carolinabanana
2024-04-05 | gguf.py : add licence and version to gguf writer (#6504) | Brian
2024-04-03 | llama : add SEA-LION support (#6448) | bryanSwk
2024-04-03 | ggml : mul_mat_id use the same tensor for all the experts (#6387) | slaren
2024-03-29 | [Model] Add support for xverse (#6301) | hxer7963
2024-03-26 | IQ1_M: 1.75 bpw quantization (#6302) | Kawrakow
2024-03-23 | llama : add grok-1 support (#6204) | Julius Arkenberg
2024-03-15 | llama : add Command-R support (#6033) | Andrew Canis
2024-03-15 | gguf : add support for I64 and F64 arrays (#6062) | Ondřej Čertík
2024-03-14 | gguf-py : bump version to 0.8.0 (#6060) | Ondřej Čertík
2024-03-14 | llama : support models without vocabulary (#5798) | Michael Podvitskiy
2024-03-14 | gguf-py : fix dtype check (#6045) | Georgi Gerganov
2024-03-14 | gguf-py : add support for I8, I16 and I32 (#6045) | Ondřej Čertík
2024-03-08 | llama : support Mamba Selective State Space Models (#5328) | compilade