path: root/examples
Age  Commit message  Author
2024-01-25  examples : make pydantic scripts pass mypy and support py3.8 (#5099)  Jared Van Bortel
2024-01-25  android : use release cmake build type by default (#5123)  Valentin Konovalov
2024-01-23  Additional KL-divergence statistics (#5081)  Kawrakow
2024-01-23  minor : clean-up some warnings and style (#5094)  Georgi Gerganov
2024-01-23  llama.vim : added api key support (#5090)  Michael Coppola
2024-01-22  KL-divergence (#5076)  Kawrakow
2024-01-22  llava : MobileVLM support (#4954)  XiaotaoChen
2024-01-22  imatrix : keep intermediate imatrix results (#5077)  Kawrakow
2024-01-22  finetune : print sample-start/include-sample-start (#5072)  Daniel Bevenius
2024-01-22  llama : add Q3_K_XS (#5060)  Kawrakow
2024-01-21  Add ability to evauate multiple choice tasks (#5047)  Kawrakow
2024-01-21  Slightly faster imatrix (#5050)  Kawrakow
2024-01-20  perplexity : fix MSVC build after #5020 (#5043)  Jared Van Bortel
2024-01-19  finetune : fix ggml_allocr lifetimes (tmp workaround) (#5033)  Uzo Nweke
2024-01-19  imatrix : add README.md  Georgi Gerganov
2024-01-19  winogrande: evaluate log-probs in parallel (#5036)  Kawrakow
2024-01-19  perplexity: avoid unnecessary alloocations and logit copies (#5035)  Kawrakow
2024-01-19  perplexity : faster Winogrande via batching (#5024)  Georgi Gerganov
2024-01-18  server : defer tasks when "slot unavailable" (#5018)  Xuan Son Nguyen
2024-01-18  imatrix : fix assert for src0 non-cont check  Georgi Gerganov
2024-01-18  perplexity : fix winogrande N tasks option  Georgi Gerganov
2024-01-18  HellaSwag: speed up by parallelizing log-prob evaluation (#5020)  Kawrakow
2024-01-18  perplexity : faster HellaSwag via batching (#5017)  Georgi Gerganov
2024-01-18  Add Winogrande evaluation (#5015)  Kawrakow
2024-01-17  imatrix : offload to GPU support (#4957)  Georgi Gerganov
2024-01-16  finetune : add training data file to log message (#4979)  Daniel Bevenius
2024-01-16  examples : add complete parallel function calling example (#4974)  Maximilian Winter
2024-01-16  perplexity : fix kv cache handling for hellaswag (#4981)  Georgi Gerganov
2024-01-16  android : introduce starter project example (#4926)  Neuman Vong
2024-01-16  examples : fix and improv docs for the grammar generator (#4909)  Maximilian Winter
2024-01-16  finetune : use LLAMA_FILE_MAGIC_GGLA (#4961)  Daniel Bevenius
2024-01-16  speculative : threading options (#4959)  stduhpf
2024-01-14  Add ability to use importance matrix for all k-quants (#4930)  Kawrakow
2024-01-14  2-bit quantizations (#4897)  Kawrakow
2024-01-13  metal : remove old API (#4919)  Georgi Gerganov
2024-01-13  server : fix prompt caching with system prompt (#4914)  Georgi Gerganov
2024-01-13  llama : minimize size used for state save/load (#4820)  David Friehs
2024-01-13  main : add parameter --no-display-prompt (#4541)  Yann Follet
2024-01-13  server : fix deadlock that occurs in multi-prompt scenarios (#4905)  Ziad Ben Hadj-Alouane
2024-01-13  server : fix crash with multimodal models without BOS token (#4904)  makomk
2024-01-12  examples : add pydantic models to GBNF grammar generator (#4883)  Maximilian Winter
2024-01-12  llama : ggml-backend integration (#4766)  slaren
2024-01-12  export-lora : use LLAMA_FILE_MAGIC_GGLA (#4894)  Daniel Bevenius
2024-01-12  llama.swiftui : update models layout (#4826)  Zay
2024-01-12  Importance Matrix calculation (#4861)  Kawrakow
2024-01-11  server : fix infill when prompt is empty (#4833)  Georgi Gerganov
2024-01-11  main : better name for variable n_print (#4874)  Georgi Gerganov
2024-01-11  main : disable token count by default (#4874)  Georgi Gerganov
2024-01-11  llama : restore intended k-quants mixes for MoE models (#4872)  Kawrakow
2024-01-11  server : implement credentialed CORS (#4514)  Laura