path: root/llama.h
| Age        | Commit message                                                  | Author            |
|------------|-----------------------------------------------------------------|-------------------|
| 2024-06-21 | llama : allow pooled embeddings on any model (#7477)            | Douglas Hanley    |
| 2024-06-14 | convert : add Poro-34B-chat tokenizer support (#7713)           | Elaine            |
| 2024-06-06 | Added support for . (any character) token in grammar engine. (#6467) | Clint Herron |
| 2024-06-05 | Fix per token attributes bits (#7749)                           | jaime-m-p         |
| 2024-06-04 | llama : remove beam search (#7736)                              | Georgi Gerganov   |
| 2024-06-04 | Per token attributes (#7685)                                    | jaime-m-p         |
| 2024-05-31 | llama : cache llama_token_to_piece (#7587)                      | Georgi Gerganov   |
| 2024-05-27 | llama : add comments about experimental flags (#7544)           | Georgi Gerganov   |
| 2024-05-26 | llama : add Smaug 70B support (#7402)                           | Bartowski         |
| 2024-05-25 | main : don't print special tokens with --grammar (#6923)        | Justine Tunney    |
| 2024-05-23 | llama : add getters for n_threads/n_threads_batch (#7464)       | Daniel Bevenius   |
| 2024-05-19 | Add StableLM2 pre-tokenizer (#7349)                             | Anas Ahouzi       |
| 2024-05-14 | ggml : add RPC backend (#6829)                                  | Radoslav Gerganov |
| 2024-05-08 | llama : add BPE pre-tokenization for Qwen2 (#7114)              | Ren Xuancheng     |
| 2024-05-08 | convert : add BPE pre-tokenization for DBRX (#7132)             | DAN™              |
| 2024-05-08 | ggml : introduce bfloat16 support (#6412)                       | Justine Tunney    |
| 2024-05-07 | Fix OLMo HF to GGUF conversion (#6910)                          | nopperl           |
| 2024-05-05 | command-r : add BPE pre-tokenization (#7063)                    | DAN™              |
| 2024-05-04 | tests : add test-tokenizer-0.sh + fix some tokenizers (#7036)   | Georgi Gerganov   |
| 2024-05-03 | llama : rename ctx to user_data in progress_callback (#7045)    | Daniel Bevenius   |
| 2024-04-30 | ggml : add Flash Attention (#5021)                              | Georgi Gerganov   |
| 2024-04-29 | llama : fix BPE pre-tokenization (#6920)                        | Georgi Gerganov   |
| 2024-04-26 | quantize: add imatrix and dataset metadata in GGUF (#6658)      | Pierrick Hymbert  |
| 2024-04-26 | add basic tensor data validation function (#6884)               | slaren            |
| 2024-04-25 | quantize : add '--keep-split' to quantize model into shards (#6688) | jiez          |
| 2024-04-24 | llama : add llama_get_pooling_type function (#6862)             | Douglas Hanley    |
| 2024-04-24 | Server: fix seed for multiple slots (#6835)                     | Johannes Gäßler   |
| 2024-04-21 | llama : add option to render special/control tokens (#6807)     | Georgi Gerganov   |
| 2024-04-21 | llama : support Llama 3 HF conversion (#6745)                   | Pedro Cuenca      |
| 2024-04-11 | grammars: 1.5x faster inference w/ complex grammars (vector reserves / reuses... | Olivier Chafik |
| 2024-04-09 | BERT tokenizer fixes (#6498)                                    | Jared Van Bortel  |
| 2024-04-08 | llama : support negative ith in llama_get_ API (#6519)          | Rick G            |
| 2024-04-08 | llama : save and restore kv cache for single seq id (#6341)     | Jan Boon          |
| 2024-04-04 | examples : add GBNF validator program (#5948)                   | Clint Herron      |
| 2024-03-28 | convert : refactor vocab selection logic (#6355)                | Jared Van Bortel  |
| 2024-03-26 | llama : greatly reduce output buffer memory usage (#6122)       | compilade         |
| 2024-03-26 | IQ1_M: 1.75 bpw quantization (#6302)                            | Kawrakow          |
| 2024-03-26 | quantize : be able to override metadata by key (#6321)          | Kawrakow          |
| 2024-03-22 | quantize: options for output and token embedding tensors qtype (#6239) | Kawrakow   |
| 2024-03-22 | llama_model_loader: support multiple split/shard GGUFs (#6187)  | Pierrick Hymbert  |
| 2024-03-15 | llama : add support for control vectors (#5970)                 | Theia Vogel       |
| 2024-03-14 | llama : support models without vocabulary (#5798)               | Michael Podvitskiy |
| 2024-03-13 | llama : add pipeline parallelism support (#6017)                | slaren            |
| 2024-03-11 | llama : more consistent names of count variables (#5994)        | Georgi Gerganov   |
| 2024-03-11 | llama : fix F16/F32 downcast + improve names (#5980)            | Georgi Gerganov   |
| 2024-03-10 | llama : add support for GritLM (#5959)                          | DAN™              |
| 2024-03-08 | llama : support Mamba Selective State Space Models (#5328)      | compilade         |
| 2024-03-04 | llama : fix embeddings (#5796)                                  | Georgi Gerganov   |
| 2024-03-03 | llama : allow for user specified embedding pooling type (#5849) | Douglas Hanley    |
| 2024-03-02 | llama : add abort_callback to interrupt computation (#5409)     | Michael Podvitskiy |