ik_llama.cpp.git (main)
path: root/examples/server/README.md
| Age | Commit message | Author |
|---|---|---|
| 2024-08-12 | Merge mainline - Aug 12 2024 (#17) | Kawrakow |
| 2024-07-27 | Merge mainline llama.cpp (#3) | Kawrakow |
| 2024-06-13 | `build`: rename main → llama-cli, server → llama-server, llava-cli → ll... | Olivier Chafik |
| 2024-06-07 | server: update cache_prompt documentation [no ci] (#7745) | Johannes Gäßler |
| 2024-05-19 | server: add test for token probs (#7347) | Johannes Gäßler |
| 2024-05-18 | server: correct --threads documentation [no ci] (#7362) | Johannes Gäßler |
| 2024-05-17 | [Server] Added --verbose option to README [no ci] (#7335) | Leon Knauer |
| 2024-05-14 | docs: Fix typo and update description for --embeddings flag (#7026) | Ryuei |
| 2024-05-08 | server : add_special option for tokenize endpoint (#7059) | Johan |
| 2024-05-07 | server: fix incorrectly reported token probabilities (#7125) | Johannes Gäßler |
| 2024-05-07 | server : update readme with undocumented options (#7013) | Kyle Mistele |
| 2024-04-29 | build(cmake): simplify instructions (`cmake -B build && cmake --build build .... | Olivier Chafik |
| 2024-04-12 | JSON schema conversion: ⚡️ faster repetitions, min/maxLength for strings,... | Olivier Chafik |
| 2024-04-08 | llama : save and restore kv cache for single seq id (#6341) | Jan Boon |
| 2024-04-04 | server : remove obsolete --memory-f32 option | Georgi Gerganov |
| 2024-04-03 | A few small fixes to server's README docs (#6428) | Fattire |
| 2024-03-26 | cuda : rename build flag to LLAMA_CUDA (#6299) | slaren |
| 2024-03-25 | Server: clean up OAI params parsing function (#6284) | Xuan Son Nguyen |
| 2024-03-23 | common: llama_load_model_from_url split support (#6192) | Pierrick Hymbert |
| 2024-03-23 | server: docs: `--threads` and `--threads`, `--ubatch-size`, `--log-disable` (... | Pierrick Hymbert |
| 2024-03-21 | server : update readme doc from `slot_id` to `id_slot` (#6213) | Jan Boon |
| 2024-03-17 | common: llama_load_model_from_url using --model-url (#6098) | Pierrick Hymbert |
| 2024-03-11 | Update server docker image URLs (#5997) | Jakub N |
| 2024-03-11 | Server: format error to json (#5961) | Xuan Son Nguyen |
| 2024-03-09 | server : clarify some items in the readme (#5957) | Georgi Gerganov |
| 2024-03-09 | Server: reorganize some http logic (#5939) | Xuan Son Nguyen |
| 2024-03-09 | server : add SSL support (#5926) | Gabe Goodhart |
| 2024-03-07 | server : refactor (#5882) | Georgi Gerganov |
| 2024-03-03 | server : init http requests thread pool with --parallel if set (#5836) | Pierrick Hymbert |
| 2024-03-01 | server : remove api_like_OAI.py proxy script (#5808) | Georgi Gerganov |
| 2024-03-01 | server: allow to override threads server pool with --threads-http (#5794) | Pierrick Hymbert |
| 2024-02-25 | server: docs - refresh and tease a little bit more the http server (#5718) | Pierrick Hymbert |
| 2024-02-25 | server: logs - unified format and --log-format option (#5700) | Pierrick Hymbert |
| 2024-02-25 | server: concurrency fix + monitoring - add /metrics prometheus compatible end... | Pierrick Hymbert |
| 2024-02-24 | server: init functional tests (#5566) | Pierrick Hymbert |
| 2024-02-22 | server : clarify some params in the docs (#5640) | Alexey Parfenov |
| 2024-02-22 | Add docs for llama_chat_apply_template (#5645) | Xuan Son Nguyen |
| 2024-02-21 | server: health: fix race condition on slots data using tasks queue (#5634) | Pierrick Hymbert |
| 2024-02-20 | server : health endpoint configurable failure on no slot (#5594) | Pierrick Hymbert |
| 2024-02-18 | common, server : surface min_keep as its own parameter (#5567) | Robey Holderith |
| 2024-02-18 | server : slots monitoring endpoint (#5550) | Pierrick Hymbert |
| 2024-02-18 | server : enhanced health endpoint (#5548) | Pierrick Hymbert |
| 2024-02-18 | server : --n-predict option document and cap to max value (#5549) | Pierrick Hymbert |
| 2024-02-16 | server : add "samplers" param to control the samplers order (#5494) | Alexey Parfenov |
| 2024-02-16 | ggml : add numa options (#5377) | bmwl |
| 2024-02-11 | server : allow to specify tokens as strings in logit_bias (#5003) | Alexey Parfenov |
| 2024-02-07 | server : update `/props` with "total_slots" value (#5373) | Justin Parker |
| 2024-02-06 | server : add `dynatemp_range` and `dynatemp_exponent` (#5352) | Michael Coppola |
| 2024-02-05 | server : allow to get default generation settings for completion (#5307) | Alexey Parfenov |
| 2024-01-30 | server : improve README (#5209) | Wu Jian Ping |