path: root/examples
Age  Commit message  Author
2024-05-21  examples: cache hf model when --model not provided (#7353)  Amir
2024-05-21  Tokenizer SPM fixes for phi-3 and llama-spm (bugfix) (#7425)  jaime-m-p
2024-05-20  Tokenizer SPM fixes for phi-3 and llama-spm (#7375)  jaime-m-p
2024-05-20  perplexity: update README FP16 results [no ci] (#7413)  Johannes Gäßler
2024-05-20  server : fix temperature + disable some tests (#7409)  Georgi Gerganov
2024-05-20  server : tuning tests (#7388)  Georgi Gerganov
2024-05-20  server : return error on too large embedding input (#7389)  Georgi Gerganov
2024-05-20  tests : fix --keep_split -> --keep-split (#7374)  Georgi Gerganov
2024-05-19  quantize : fix --keep-split check (#7374)  Fred Douglas
2024-05-19  server: add test for token probs (#7347)  Johannes Gäßler
2024-05-19  server: fix seed being reported back (#7382)  Johannes Gäßler
2024-05-19  cmake : update android comments (#7341)  Georgi Gerganov
2024-05-18  android : use "ci-android" branch for CI (#7341)  Georgi Gerganov
2024-05-18  server: correct --threads documentation [no ci] (#7362)  Johannes Gäßler
2024-05-18  perplexity : ndot progress and show stats with < 100 tasks (#7348)  strawberrymelonpanda
2024-05-17  rpc : set SO_REUSEADDR for the server socket (#7320)  Radoslav Gerganov
2024-05-17  server : add support for the RPC backend (#7305)  Radoslav Gerganov
2024-05-17  [Server] Added --verbose option to README [no ci] (#7335)  Leon Knauer
2024-05-16  Revert "server bench: fix bench not waiting for model load (#7284)" (#7334)  Pierrick Hymbert
2024-05-16  rpc : get available mem for the CPU backend  Radoslav Gerganov
2024-05-16  rpc : add command line arg for specifying backend memory  Radoslav Gerganov
2024-05-16  doc: add references to hugging face GGUF-my-repo quantisation web tool. (#7288)  Vaibhav Srivastav
2024-05-15  ggml : tag ggml_tensor::backend as deprecated (#7290)  slaren
2024-05-15  embedding : free the batch after execution (#7297)  dm4
2024-05-15  server bench: fix bench not waiting for model load (#7284)  Johannes Gäßler
2024-05-14  server: free sampling contexts on exit (#7264)  Steve Grubb
2024-05-14  Revert "move ndk code to a new library (#6951)" (#7282)  Brian
2024-05-14  ggml : add RPC backend (#6829)  Radoslav Gerganov
2024-05-14  move ndk code to a new library (#6951)  Elton Kola
2024-05-14  docs: Fix typo and update description for --embeddings flag (#7026)  Ryuei
2024-05-14  llava-cli: fix base64 prompt (#7248)  k.h.lai
2024-05-13  perplexity: add BF16 vs. FP16 results (#7150)  Johannes Gäßler
2024-05-13  change default temperature of OAI compat API from 0 to 1 (#7226)  Benjamin Findley
2024-05-11  fix system prompt handling (#7153)  Xuan Son Nguyen
2024-05-11  server : free llama_batch on exit (#7212)  Steve Grubb
2024-05-11  server: fix reported top tokens for temperature 0 (#7203)  Johannes Gäßler
2024-05-11  llama : add Jina Embeddings architecture (#6826)  Joan Fontanals
2024-05-10  llama-bench : add pp+tg test type (#7199)  slaren
2024-05-10  Fix memory bug in grammar parser (#7194)  Justine Tunney
2024-05-10  Main+: optionally allow special tokens from user in interactive mode (#7097)  HanishKVC
2024-05-10  llava : fix moondream support (#7163)  Andrei
2024-05-10  eval-callback : fix conversion to float (#7184)  slaren
2024-05-09  TypoFix (#7162)  Ahmet Zeer
2024-05-08  convert-hf : save memory with lazy evaluation (#7075)  compilade
2024-05-08  JSON: [key] -> .at(key), assert() -> GGML_ASSERT (#7143)  Johannes Gäßler
2024-05-08  Revert "llava : add support for moondream vision language model (#6899)"  Georgi Gerganov
2024-05-08  server : add themes + favicon (#6848)  JohnnyB
2024-05-08  main : add --conversation / -cnv flag (#7108)  Dawid Potocki
2024-05-08  server : add_special option for tokenize endpoint (#7059)  Johan
2024-05-08  clean up json_value & server_log (#7142)  Xuan Son Nguyen