path: root/examples
Age | Commit message | Author
2024-05-15 | embedding : free the batch after execution (#7297) | dm4
2024-05-15 | server bench: fix bench not waiting for model load (#7284) | Johannes Gäßler
2024-05-14 | server: free sampling contexts on exit (#7264) | Steve Grubb
2024-05-14 | Revert "move ndk code to a new library (#6951)" (#7282) | Brian
2024-05-14 | ggml : add RPC backend (#6829) | Radoslav Gerganov
2024-05-14 | move ndk code to a new library (#6951) | Elton Kola
2024-05-14 | docs: Fix typo and update description for --embeddings flag (#7026) | Ryuei
2024-05-14 | llava-cli: fix base64 prompt (#7248) | k.h.lai
2024-05-13 | perplexity: add BF16 vs. FP16 results (#7150) | Johannes Gäßler
2024-05-13 | change default temperature of OAI compat API from 0 to 1 (#7226) | Benjamin Findley
2024-05-11 | fix system prompt handling (#7153) | Xuan Son Nguyen
2024-05-11 | server : free llama_batch on exit (#7212) | Steve Grubb
2024-05-11 | server: fix reported top tokens for temperature 0 (#7203) | Johannes Gäßler
2024-05-11 | llama : add Jina Embeddings architecture (#6826) | Joan Fontanals
2024-05-10 | llama-bench : add pp+tg test type (#7199) | slaren
2024-05-10 | Fix memory bug in grammar parser (#7194) | Justine Tunney
2024-05-10 | Main+: optionally allow special tokens from user in interactive mode (#7097) | HanishKVC
2024-05-10 | llava : fix moondream support (#7163) | Andrei
2024-05-10 | eval-callback : fix conversion to float (#7184) | slaren
2024-05-09 | TypoFix (#7162) | Ahmet Zeer
2024-05-08 | convert-hf : save memory with lazy evaluation (#7075) | compilade
2024-05-08 | JSON: [key] -> .at(key), assert() -> GGML_ASSERT (#7143) | Johannes Gäßler
2024-05-08 | Revert "llava : add support for moondream vision language model (#6899)" | Georgi Gerganov
2024-05-08 | server : add themes + favicon (#6848) | JohnnyB
2024-05-08 | main : add --conversation / -cnv flag (#7108) | Dawid Potocki
2024-05-08 | server : add_special option for tokenize endpoint (#7059) | Johan
2024-05-08 | clean up json_value & server_log (#7142) | Xuan Son Nguyen
2024-05-08 | ggml : introduce bfloat16 support (#6412) | Justine Tunney
2024-05-08 | Fixed save_imatrix to match old behaviour for MoE (#7099) | jukofyork
2024-05-07 | server: fix incorrectly reported token probabilities (#7125) | Johannes Gäßler
2024-05-07 | server : update readme with undocumented options (#7013) | Kyle Mistele
2024-05-07 | main : update log text (EOS to EOG) (#7104) | RhinoDevel
2024-05-07 | docs: fix typos (#7124) | omahs
2024-05-05 | Adding support for the --numa argument for llama-bench. (#7080) | kunnis
2024-05-04 | gguf-split: add --no-tensor-first-split (#7072) | Xuan Son Nguyen
2024-05-04 | If first token generated from the server is the stop word the server will cra... | maor-ps
2024-05-01 | main : fix off by one error for context shift (#6921) | l3utterfly
2024-05-01 | Server: add tests for batch size, different seeds (#6950) | Johannes Gäßler
2024-04-30 | perplexity: more statistics, added documentation (#6936) | Johannes Gäßler
2024-04-30 | ggml : add Flash Attention (#5021) | Georgi Gerganov
2024-04-30 | Improve usability of --model-url & related flags (#6930) | Olivier Chafik
2024-04-29 | main : fix typo in comment in main.cpp (#6985) | Daniel Bevenius
2024-04-29 | build(cmake): simplify instructions (`cmake -B build && cmake --build build .... | Olivier Chafik
2024-04-29 | llava-cli : multiple images (#6969) | cpumaxx
2024-04-27 | ci: server: tests python env on github container ubuntu latest / fix n_predic... | Pierrick Hymbert
2024-04-26 | quantize: add imatrix and dataset metadata in GGUF (#6658) | Pierrick Hymbert
2024-04-26 | server: stop generation at `n_ctx_train` if `n_predict` is not set (#6638) | Pierrick Hymbert
2024-04-26 | bench: server add stop word for PHI-2 (#6916) | Pierrick Hymbert
2024-04-25 | llava : add support for moondream vision language model (#6899) | vik
2024-04-25 | clip : rename lerp function to avoid conflict (#6894) | Daniel Bevenius