path: root/examples/main
Age        | Commit message                                                               | Author
2025-06-09 | Docs update (#509)                                                           | saood06
2025-06-03 | Adding top-n-sigma sampler (#489)                                            | Kawrakow
2024-08-12 | Merge mainline - Aug 12 2024 (#17)                                           | Kawrakow
2024-07-27 | Merge mainline llama.cpp (#3)                                                | Kawrakow
2024-06-13 | `build`: rename main → llama-cli, server → llama-server, llava-cli → ll...   | Olivier Chafik
2024-06-05 | readme : remove -ins (#7759)                                                 | arch-btw
2024-06-04 | common : refactor cli arg parsing (#7675)                                    | Georgi Gerganov
2024-05-27 | main: replace --no-special with --special (#7534)                            | Brian
2024-05-25 | main : don't print special tokens with --grammar (#6923)                     | Justine Tunney
2024-05-23 | main : minor (#7462)                                                         | Georgi Gerganov
2024-05-22 | common : normalize naming style (#7462)                                      | Georgi Gerganov
2024-05-21 | `grammars`: fix resampling logic regression (#7424)                          | Olivier Chafik
2024-05-21 | examples: cache hf model when --model not provided (#7353)                   | Amir
2024-05-10 | Fix memory bug in grammar parser (#7194)                                     | Justine Tunney
2024-05-10 | Main+: optionally allow special tokens from user in interactive mode (#7097) | HanishKVC
2024-05-08 | main : add --conversation / -cnv flag (#7108)                                | Dawid Potocki
2024-05-07 | main : update log text (EOS to EOG) (#7104)                                  | RhinoDevel
2024-05-07 | docs: fix typos (#7124)                                                      | omahs
2024-05-01 | main : fix off by one error for context shift (#6921)                        | l3utterfly
2024-04-30 | Improve usability of --model-url & related flags (#6930)                     | Olivier Chafik
2024-04-29 | main : fix typo in comment in main.cpp (#6985)                               | Daniel Bevenius
2024-04-24 | Server: fix seed for multiple slots (#6835)                                  | Johannes Gäßler
2024-04-21 | llama : support Llama 3 HF conversion (#6745)                                | Pedro Cuenca
2024-04-15 | `main`: add --json-schema / -j flag (#6659)                                  | Olivier Chafik
2024-04-12 | chore: Fix markdown warnings (#6625)                                         | Rene Leonhardt
2024-04-09 | BERT tokenizer fixes (#6498)                                                 | Jared Van Bortel
2024-04-08 | llama : save and restore kv cache for single seq id (#6341)                  | Jan Boon
2024-03-28 | doc: fix outdated default value of batch size (#6336)                        | Ting Sun
2024-03-26 | cuda : rename build flag to LLAMA_CUDA (#6299)                               | slaren
2024-03-17 | common: llama_load_model_from_url using --model-url (#6098)                  | Pierrick Hymbert
2024-03-11 | llama : more consistent names of count variables (#5994)                     | Georgi Gerganov
2024-03-04 | main : support special tokens as reverse/anti prompt (#5847)                 | DAN™
2024-02-25 | llama : refactor k-shift implementation + KV defragmentation (#5691)         | Georgi Gerganov
2024-02-21 | examples : do not assume BOS when shifting context (#5622)                   | Jared Van Bortel
2024-02-16 | ggml : add numa options (#5377)                                              | bmwl
2024-02-11 | main : ctrl+C print timing in non-interactive mode (#3873)                   | Georgi Gerganov
2024-02-03 | refactor : switch to emplace_back to avoid extra object (#5291)              | Michael Klimenko
2024-01-30 | main : allow empty --prompt-cache file (#5176)                               | divinity76
2024-01-13 | main : add parameter --no-display-prompt (#4541)                             | Yann Follet
2024-01-11 | main : better name for variable n_print (#4874)                              | Georgi Gerganov
2024-01-11 | main : disable token count by default (#4874)                                | Georgi Gerganov
2024-01-11 | main : print total token count and tokens consumed so far (#4874)            | pudepiedj
2024-01-08 | main : add self-extend support (#4815)                                       | Georgi Gerganov
2023-12-05 | sampling : custom samplers order (#4285)                                     | MaggotHATE
2023-11-30 | main : pass LOG_TEE callback to llama.cpp log (#4033)                        | Andrew Godfrey
2023-11-20 | main : Add ChatML functionality to main example (#4046)                      | Seb C
2023-11-16 | Respect tokenizer.ggml.add_bos_token value when tokenizing (#4040)           | Kerfuffle
2023-11-11 | Fix some documentation typos/grammar mistakes (#4032)                        | Richard Kiss
2023-11-02 | build : link against build info instead of compiling against it (#3879)      | cebtenzzre
2023-10-31 | samplers : Min-P sampler implementation [alternative to Top P/Top K] (#3841) | kalomaze