path: root/examples/server/utils.hpp
Age | Commit message | Author
2024-04-21 | llama : support Llama 3 HF conversion (#6745) | Pedro Cuenca
2024-04-06 | ci: bench: support sse and fix prompt processing time / server: add tokens us... | Pierrick Hymbert
2024-04-03 | server : handle exception on wrong type in request (#6452) | JH23X
2024-03-25 | Server: clean up OAI params parsing function (#6284) | Xuan Son Nguyen
2024-03-23 | server: flush stdout after logging in both text and json layout (#6253) | Pierrick Hymbert
2024-03-22 | json-schema-to-grammar : fix order of props + non-str const/enum (#6232) | Olivier Chafik
2024-03-21 | json-schema-to-grammar improvements (+ added to server) (#5978) | Olivier Chafik
2024-03-20 | Server: Handle n_keep parameter in the request (#6174) | Karthick
2024-03-13 | Server: Use multi-task for embeddings endpoint (#6001) | Xuan Son Nguyen
2024-03-11 | Server: format error to json (#5961) | Xuan Son Nguyen
2024-03-11 | server : maintain chat completion id for streaming responses (#5988) | Minsoo Cheong
2024-03-07 | server : refactor (#5882) | Georgi Gerganov
2024-03-02 | server: tests: passkey challenge / self-extend with context shift demo (#5832) | Pierrick Hymbert
2024-02-29 | Server: normalize naming (#5779) | Xuan Son Nguyen
2024-02-25 | server: logs - unified format and --log-format option (#5700) | Pierrick Hymbert
2024-02-25 | server: concurrency fix + monitoring - add /metrics prometheus compatible end... | Pierrick Hymbert
2024-02-21 | server: health: fix race condition on slots data using tasks queue (#5634) | Pierrick Hymbert
2024-02-20 | Server: use llama_chat_apply_template (#5593) | Xuan Son Nguyen
2024-02-18 | server : graceful server shutdown (#5244) | Daniel Hiltgen
2024-02-11 | server : add llama2 chat template (#5425) | Xuan Son Nguyen
2024-01-27 | sync : ggml | Georgi Gerganov
2024-01-26 | server : refactored the task processing logic (#5065) | Xuan Son Nguyen