path: root/examples
Age        | Commit message | Author
2024-01-16 | finetune : add training data file to log message (#4979) | Daniel Bevenius
2024-01-16 | examples : add complete parallel function calling example (#4974) | Maximilian Winter
2024-01-16 | perplexity : fix kv cache handling for hellaswag (#4981) | Georgi Gerganov
2024-01-16 | android : introduce starter project example (#4926) | Neuman Vong
2024-01-16 | examples : fix and improv docs for the grammar generator (#4909) | Maximilian Winter
2024-01-16 | finetune : use LLAMA_FILE_MAGIC_GGLA (#4961) | Daniel Bevenius
2024-01-16 | speculative : threading options (#4959) | stduhpf
2024-01-14 | Add ability to use importance matrix for all k-quants (#4930) | Kawrakow
2024-01-14 | 2-bit quantizations (#4897) | Kawrakow
2024-01-13 | metal : remove old API (#4919) | Georgi Gerganov
2024-01-13 | server : fix prompt caching with system prompt (#4914) | Georgi Gerganov
2024-01-13 | llama : minimize size used for state save/load (#4820) | David Friehs
2024-01-13 | main : add parameter --no-display-prompt (#4541) | Yann Follet
2024-01-13 | server : fix deadlock that occurs in multi-prompt scenarios (#4905) | Ziad Ben Hadj-Alouane
2024-01-13 | server : fix crash with multimodal models without BOS token (#4904) | makomk
2024-01-12 | examples : add pydantic models to GBNF grammar generator (#4883) | Maximilian Winter
2024-01-12 | llama : ggml-backend integration (#4766) | slaren
2024-01-12 | export-lora : use LLAMA_FILE_MAGIC_GGLA (#4894) | Daniel Bevenius
2024-01-12 | llama.swiftui : update models layout (#4826) | Zay
2024-01-12 | Importance Matrix calculation (#4861) | Kawrakow
2024-01-11 | server : fix infill when prompt is empty (#4833) | Georgi Gerganov
2024-01-11 | main : better name for variable n_print (#4874) | Georgi Gerganov
2024-01-11 | main : disable token count by default (#4874) | Georgi Gerganov
2024-01-11 | llama : restore intended k-quants mixes for MoE models (#4872) | Kawrakow
2024-01-11 | server : implement credentialed CORS (#4514) | Laura
2024-01-11 | server : support for multiple api keys (#4864) | Michael Coppola
2024-01-11 | server : add `LOG_INFO` when model is successfully loaded (#4881) | Behnam M
2024-01-11 | main : print total token count and tokens consumed so far (#4874) | pudepiedj
2024-01-11 | server : fix typo in model name (#4876) | Isaac McFadyen
2024-01-11 | server : update readme to document the new `/health` endpoint (#4866) | Behnam M
2024-01-11 | server : fix build + rename enums (#4870) | Georgi Gerganov
2024-01-10 | server : add a `/health` endpoint (#4860) | Behnam M
2024-01-10 | clip : support more quantization types (#4846) | John
2024-01-09 | llava-cli : don't crash if --image flag is invalid (#4835) | Justine Tunney
2024-01-09 | server : update readme about token probs (#4777) | Behnam M
2024-01-09 | server : add api-key flag to documentation (#4832) | Zsapi
2024-01-08 | llama.swiftui : update readme | Georgi Gerganov
2024-01-08 | main : add self-extend support (#4815) | Georgi Gerganov
2024-01-08 | examples : add passkey test (#3856) | Georgi Gerganov
2024-01-07 | llama-bench : add no-kv-offload parameter (#4812) | slaren
2024-01-07 | llama.swiftui : use llama.cpp as SPM package (#4804) | Alex Azarov
2024-01-07 | llama.swiftui : add visionOS target (#4805) | Alex Azarov
2024-01-07 | server : fix n_predict check (#4798) | Georgi Gerganov
2024-01-06 | llama.swiftui : use correct pointer for llama_token_eos (#4797) | Daniel Illescas Romero
2024-01-06 | examples : improve base-translate.sh script (#4783) | Georgi Gerganov
2024-01-05 | metal : switch back to default.metallib (ggml/681) | Georgi Gerganov
2024-01-05 | examples : add few-shot translation example (#4783) | Georgi Gerganov
2024-01-04 | finetune : remove unused includes (#4756) | Daniel Bevenius
2024-01-04 | server : send token probs for "stream == false" (#4714) | Georgi Gerganov
2024-01-04 | llama.swiftui : support loading custom model from file picker (#4767) | singularity