path: root/examples
Age | Commit message | Author
2024-02-09 | llava : add requirements.txt and update README.md (#5428) | Daniel Bevenius
2024-02-09 | server : fix prompt caching for repeated prompts (#5420) | Riley Stewart
2024-02-08 | llava : add missing .py, and fix paths in README.md (#5414) | Daniel Bevenius
2024-02-08 | llava: fix typo/formatting in README.md (#5405) | Daniel Bevenius
2024-02-07 | llava-cli : always tokenize special tokens (#5382) | Xiao-Yong Jin
2024-02-07 | server : update `/props` with "total_slots" value (#5373) | Justin Parker
2024-02-06 | server : remove model.json endpoint (#5371) | Alexey Parfenov
2024-02-06 | server : include total "num_slots" in props endpoint (#5349) | Justin Parker
2024-02-06 | server : add `dynatemp_range` and `dynatemp_exponent` (#5352) | Michael Coppola
2024-02-06 | server : various fixes for the prompt field in /completion (#5300) | Niall Coates
2024-02-05 | server : allow to get default generation settings for completion (#5307) | Alexey Parfenov
2024-02-04 | Adding some imatrix tools (#5302) | Kawrakow
2024-02-03 | refactor : switch to emplace_back to avoid extra object (#5291) | Michael Klimenko
2024-02-02 | perplexity : fix KL divergence calculations on Windows (#5273) | kalomaze
2024-02-02 | [SYCL] update guide of SYCL backend (#5254) | Neo Zhang Jianyu
2024-02-01 | add --no-mmap in llama-bench (#5257) | Neo Zhang Jianyu
2024-01-31 | llama : remove LLAMA_MAX_DEVICES and LLAMA_SUPPORTS_GPU_OFFLOAD (#5240) | Georgi Gerganov
2024-01-31 | llava : add MobileVLM support (#5132) | JidongZhang-THU
2024-01-31 | format license text, restore apache license by legal suggestion (#5233) | Neo Zhang Jianyu
2024-01-31 | support SYCL backend windows build (#5208) | Neo Zhang Jianyu
2024-01-30 | kompute : llama-bench support and ggml_cpu_has_kompute() (#5226) | Jared Van Bortel
2024-01-30 | Revert "server : change deps.sh xxd files to string literals (#5221)" | Georgi Gerganov
2024-01-30 | server : fix context shift (#5195) | Georgi Gerganov
2024-01-30 | server : change deps.sh xxd files to string literals (#5221) | JohnnyB
2024-01-30 | SOTA 3-bit quants (#5196) | Kawrakow
2024-01-30 | quantize : fix typo (#5211) | Vladimir Malyutin
2024-01-30 | main : allow empty --prompt-cache file (#5176) | divinity76
2024-01-30 | server : improve README (#5209) | Wu Jian Ping
2024-01-29 | server : embeddings compatibility for OpenAI (#5190) | Wu Jian Ping
2024-01-28 | ggml : add Vulkan backend (#2059) | 0cc4m
2024-01-28 | ggml : add unified SYCL backend for Intel GPUs (#2690) | Abhilash Majumder
2024-01-28 | docker : add server-first container images (#5157) | Kyle Mistele
2024-01-27 | llava : support for Yi-VL and fix for mobileVLM (#5093) | John
2024-01-27 | sync : ggml | Georgi Gerganov
2024-01-27 | Remove unused data and add fixes (#5154) | Michael Klimenko
2024-01-27 | server : add self-extend support (#5104) | Maximilian Winter
2024-01-26 | server : refactored the task processing logic (#5065) | Xuan Son Nguyen
2024-01-25 | examples : make pydantic scripts pass mypy and support py3.8 (#5099) | Jared Van Bortel
2024-01-25 | android : use release cmake build type by default (#5123) | Valentin Konovalov
2024-01-23 | Additional KL-divergence statistics (#5081) | Kawrakow
2024-01-23 | minor : clean-up some warnings and style (#5094) | Georgi Gerganov
2024-01-23 | llama.vim : added api key support (#5090) | Michael Coppola
2024-01-22 | KL-divergence (#5076) | Kawrakow
2024-01-22 | llava : MobileVLM support (#4954) | XiaotaoChen
2024-01-22 | imatrix : keep intermediate imatrix results (#5077) | Kawrakow
2024-01-22 | finetune : print sample-start/include-sample-start (#5072) | Daniel Bevenius
2024-01-22 | llama : add Q3_K_XS (#5060) | Kawrakow
2024-01-21 | Add ability to evauate multiple choice tasks (#5047) | Kawrakow
2024-01-21 | Slightly faster imatrix (#5050) | Kawrakow
2024-01-20 | perplexity : fix MSVC build after #5020 (#5043) | Jared Van Bortel