author | Pierrick Hymbert <pierrick.hymbert@gmail.com> | 2024-02-25 13:49:43 +0100 |
---|---|---|
committer | GitHub <noreply@github.com> | 2024-02-25 13:49:43 +0100 |
commit | d52d7819b8ced70c642a88a59da8c78208dc58ec (patch) | |
tree | 07841f1c5b7ab748bac463e62f3fb7ce0b7f96e9 /examples/server/tests/features/server.feature | |
parent | 12894088170f62e4cad4f8d6a3043c185b414bab (diff) |
server: concurrency fix + monitoring - add /metrics prometheus compatible endpoint (#5708)
* server: monitoring - add /metrics prometheus compatible endpoint
* server: concurrency issue, when 2 tasks are waiting for results, only one calling thread is notified
* server: metrics - move to a dedicated struct
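The concurrency item above describes several callers waiting for results while only one of them is woken. The sketch below illustrates that general pattern in Python; it is not the server's C++ code. `ResultQueue`, `send`, `recv` and `run_demo` are hypothetical names, and the assumption is that the problem has the usual "notify one vs. notify all" shape on a shared condition variable.

```python
import threading


class ResultQueue:
    """Hypothetical result queue shared by several waiting request threads."""

    def __init__(self):
        self._cond = threading.Condition()
        self._results = {}  # task_id -> result

    def send(self, task_id, result):
        with self._cond:
            self._results[task_id] = result
            # Wake every waiter so each one re-checks whether its own task
            # is done. With notify() only one waiter wakes, and it may be
            # a waiter whose result is not ready yet.
            self._cond.notify_all()

    def recv(self, task_id, timeout=5.0):
        with self._cond:
            # wait_for re-evaluates the predicate after every wake-up.
            ready = self._cond.wait_for(lambda: task_id in self._results, timeout)
            if not ready:
                raise TimeoutError(f"no result for task {task_id}")
            return self._results.pop(task_id)


def run_demo():
    """Start two waiters, then publish their results in reverse order."""
    queue = ResultQueue()
    received = {}

    def waiter(task_id):
        received[task_id] = queue.recv(task_id)

    threads = [threading.Thread(target=waiter, args=(i,)) for i in (1, 2)]
    for t in threads:
        t.start()
    queue.send(2, "result for task 2")
    queue.send(1, "result for task 1")
    for t in threads:
        t.join()
    return received
```

Waking all waiters costs a few spurious wake-ups, but each waiter only proceeds once its own result is present, so no caller is left blocked behind another.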
Diffstat (limited to 'examples/server/tests/features/server.feature')
-rw-r--r-- | examples/server/tests/features/server.feature | 2 |
1 files changed, 2 insertions, 0 deletions
diff --git a/examples/server/tests/features/server.feature b/examples/server/tests/features/server.feature
index 5f81d256..0139f89d 100644
--- a/examples/server/tests/features/server.feature
+++ b/examples/server/tests/features/server.feature
@@ -13,6 +13,7 @@ Feature: llama.cpp server
     And 1 slots
     And embeddings extraction
     And 32 server max tokens to predict
+    And prometheus compatible metrics exposed
     Then the server is starting
     Then the server is healthy
 
@@ -25,6 +26,7 @@ Feature: llama.cpp server
     And <n_predict> max tokens to predict
     And a completion request with no api error
     Then <n_predicted> tokens are predicted matching <re_content>
+    And prometheus metrics are exposed
 
     Examples: Prompts
       | prompt | n_predict | re_content | n_predicted |
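
The new `prometheus metrics are exposed` step implies a check on the body returned by `/metrics`. A minimal sketch of such a check in Python follows; it is not the step implementation from this commit. `parse_prometheus_text`, `metrics_are_exposed` and the metric names in `SAMPLE` are hypothetical, and the parser assumes plain `name value` sample lines without timestamps.

```python
def parse_prometheus_text(body):
    """Parse a minimal subset of the Prometheus text format into {name: value}.

    Assumption: each sample line is "<name> <value>" with no trailing
    timestamp; comment lines (# HELP, # TYPE) and blank lines are skipped.
    """
    metrics = {}
    for line in body.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the last space so a name carrying labels stays in one piece.
        name, _, value = line.rpartition(" ")
        metrics[name] = float(value)
    return metrics


def metrics_are_exposed(body):
    """Return True if the body contains at least one metric sample."""
    return len(parse_prometheus_text(body)) > 0


# Hypothetical response body; the metric names are illustrative only.
SAMPLE = """\
# HELP example_tokens_predicted_total Number of generated tokens.
# TYPE example_tokens_predicted_total counter
example_tokens_predicted_total 32
# HELP example_requests_processing Requests currently being processed.
# TYPE example_requests_processing gauge
example_requests_processing 0
"""
```

In a real test the body would come from an HTTP GET on the server's `/metrics` endpoint; the sample string keeps the sketch runnable without a server.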