index : ik_llama.cpp.git main
path: root/examples/server/tests/features/steps/steps.py
Age | Commit message | Author
2024-08-12 | Merge mainline - Aug 12 2024 (#17) | Kawrakow
2024-07-27 | Merge mainline llama.cpp (#3) | Kawrakow
2024-06-13 | `build`: rename main → llama-cli, server → llama-server, llava-cli → ll... | Olivier Chafik
2024-05-20 | server : tuning tests (#7388) | Georgi Gerganov
2024-05-19 | server: add test for token probs (#7347) | Johannes Gäßler
2024-05-13 | change default temperature of OAI compat API from 0 to 1 (#7226) | Benjamin Findley
2024-05-08 | convert-hf : save memory with lazy evaluation (#7075) | compilade
2024-05-08 | server : add_special option for tokenize endpoint (#7059) | Johan
2024-05-01 | Server: add tests for batch size, different seeds (#6950) | Johannes Gäßler
2024-04-24 | Server: fix seed for multiple slots (#6835) | Johannes Gäßler
2024-04-08 | llama : save and restore kv cache for single seq id (#6341) | Jan Boon
2024-03-27 | server: continuous performance monitoring and PR comment (#6283) | Pierrick Hymbert
2024-03-23 | common: llama_load_model_from_url split support (#6192) | Pierrick Hymbert
2024-03-21 | json-schema-to-grammar improvements (+ added to server) (#5978) | Olivier Chafik
2024-03-20 | server : allow to override -ngl in tests (#6170) | Georgi Gerganov
2024-03-20 | server tests : more pythonic process management; fix bare `except:` (#6146) | Jared Van Bortel
2024-03-17 | common: llama_load_model_from_url using --model-url (#6098) | Pierrick Hymbert
2024-03-14 | server: disable debug release type sanitizer, simplify trigger (#6047) | Pierrick Hymbert
2024-03-13 | llama : add pipeline parallelism support (#6017) | slaren
2024-03-11 | Server: format error to json (#5961) | Xuan Son Nguyen
2024-03-10 | server: ci: windows build and tests (#5968) | Pierrick Hymbert
2024-03-09 | Server: reorganize some http logic (#5939) | Xuan Son Nguyen
2024-03-09 | server: tests: add truncated prompt tests, better kv cache size (#5933) | Pierrick Hymbert
2024-03-08 | server: metrics: add llamacpp:prompt_seconds_total and llamacpp:tokens_predic... | Pierrick Hymbert
2024-03-07 | server : refactor (#5882) | Georgi Gerganov
2024-03-02 | server: tests: passkey challenge / self-extend with context shift demo (#5832) | Pierrick Hymbert
2024-02-28 | server : add "/chat/completions" alias for "/v1/...` (#5722) | Jorge A
2024-02-25 | server: tests - slow inference causes timeout on the CI (#5715) | Pierrick Hymbert
2024-02-25 | server: logs - unified format and --log-format option (#5700) | Pierrick Hymbert
2024-02-25 | server: concurrency fix + monitoring - add /metrics prometheus compatible end... | Pierrick Hymbert
2024-02-24 | server: continue to update other slots on embedding concurrent request (#5699) | Pierrick Hymbert
2024-02-24 | server: init functional tests (#5566) | Pierrick Hymbert