path: root/examples
Age        | Commit message | Author
2024-01-16 | finetune : add training data file to log message (#4979) | Daniel Bevenius
2024-01-16 | examples : add complete parallel function calling example (#4974) | Maximilian Winter
2024-01-16 | perplexity : fix kv cache handling for hellaswag (#4981) | Georgi Gerganov
2024-01-16 | android : introduce starter project example (#4926) | Neuman Vong
2024-01-16 | examples : fix and improv docs for the grammar generator (#4909) | Maximilian Winter
2024-01-16 | finetune : use LLAMA_FILE_MAGIC_GGLA (#4961) | Daniel Bevenius
2024-01-16 | speculative : threading options (#4959) | stduhpf
2024-01-14 | Add ability to use importance matrix for all k-quants (#4930) | Kawrakow
2024-01-14 | 2-bit quantizations (#4897) | Kawrakow
2024-01-13 | metal : remove old API (#4919) | Georgi Gerganov
2024-01-13 | server : fix prompt caching with system prompt (#4914) | Georgi Gerganov
2024-01-13 | llama : minimize size used for state save/load (#4820) | David Friehs
2024-01-13 | main : add parameter --no-display-prompt (#4541) | Yann Follet
2024-01-13 | server : fix deadlock that occurs in multi-prompt scenarios (#4905) | Ziad Ben Hadj-Alouane
2024-01-13 | server : fix crash with multimodal models without BOS token (#4904) | makomk
2024-01-12 | examples : add pydantic models to GBNF grammar generator (#4883) | Maximilian Winter
2024-01-12 | llama : ggml-backend integration (#4766) | slaren
2024-01-12 | export-lora : use LLAMA_FILE_MAGIC_GGLA (#4894) | Daniel Bevenius
2024-01-12 | llama.swiftui : update models layout (#4826) | Zay
2024-01-12 | Importance Matrix calculation (#4861) | Kawrakow
2024-01-11 | server : fix infill when prompt is empty (#4833) | Georgi Gerganov
2024-01-11 | main : better name for variable n_print (#4874) | Georgi Gerganov
2024-01-11 | main : disable token count by default (#4874) | Georgi Gerganov
2024-01-11 | llama : restore intended k-quants mixes for MoE models (#4872) | Kawrakow
2024-01-11 | server : implement credentialed CORS (#4514) | Laura
2024-01-11 | server : support for multiple api keys (#4864) | Michael Coppola
2024-01-11 | server : add `LOG_INFO` when model is successfully loaded (#4881) | Behnam M
2024-01-11 | main : print total token count and tokens consumed so far (#4874) | pudepiedj
2024-01-11 | server : fix typo in model name (#4876) | Isaac McFadyen
2024-01-11 | server : update readme to document the new `/health` endpoint (#4866) | Behnam M
2024-01-11 | server : fix build + rename enums (#4870) | Georgi Gerganov
2024-01-10 | server : add a `/health` endpoint (#4860) | Behnam M
2024-01-10 | clip : support more quantization types (#4846) | John
2024-01-09 | llava-cli : don't crash if --image flag is invalid (#4835) | Justine Tunney
2024-01-09 | server : update readme about token probs (#4777) | Behnam M
2024-01-09 | server : add api-key flag to documentation (#4832) | Zsapi
2024-01-08 | llama.swiftui : update readme | Georgi Gerganov
2024-01-08 | main : add self-extend support (#4815) | Georgi Gerganov
2024-01-08 | examples : add passkey test (#3856) | Georgi Gerganov
2024-01-07 | llama-bench : add no-kv-offload parameter (#4812) | slaren
2024-01-07 | llama.swiftui : use llama.cpp as SPM package (#4804) | Alex Azarov
2024-01-07 | llama.swiftui : add visionOS target (#4805) | Alex Azarov
2024-01-07 | server : fix n_predict check (#4798) | Georgi Gerganov
2024-01-06 | llama.swiftui : use correct pointer for llama_token_eos (#4797) | Daniel Illescas Romero
2024-01-06 | examples : improve base-translate.sh script (#4783) | Georgi Gerganov
2024-01-05 | metal : switch back to default.metallib (ggml/681) | Georgi Gerganov
2024-01-05 | examples : add few-shot translation example (#4783) | Georgi Gerganov
2024-01-04 | finetune : remove unused includes (#4756) | Daniel Bevenius
2024-01-04 | server : send token probs for "stream == false" (#4714) | Georgi Gerganov
2024-01-04 | llama.swiftui : support loading custom model from file picker (#4767) | singularity