author | Pierrick Hymbert <pierrick.hymbert@gmail.com> | 2024-02-24 19:16:04 +0100 |
---|---|---|
committer | GitHub <noreply@github.com> | 2024-02-24 19:16:04 +0100 |
commit | 9e359a4f47c1b2dceb99e29706c9f7403d32ab5e | |
tree | aa491d0744940ccce9ff69fe1bcc9e1f16b7a1ff /examples/server/tests/features/parallel.feature | |
parent | 4c4cb30736582cacb1a164a9d4bc8e17b1014be7 |
server: continue to update other slots on embedding concurrent request (#5699)
* server: #5655 - continue to update other slots on embedding concurrent request.
* server: tests: add multi users embeddings as fixed
* server: tests: adding OAI compatible embedding concurrent endpoint
* server: tests: adding OAI compatible embedding with multiple inputs
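The commit title suggests that, before this change, finishing one slot's embedding request ended the slot-update pass early, so other slots with concurrent requests were not updated in that pass. The following is a toy Python model of that difference, not the server's actual code; `Slot`, `update_slots_early_exit`, `update_slots_continue`, and `finished` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    """Hypothetical stand-in for a server slot holding one pending request."""
    slot_id: int
    prompt: str
    done: bool = False


def update_slots_early_exit(slots):
    """Assumed pre-fix behaviour: leave the pass after the first embedding."""
    for slot in slots:
        if not slot.done:
            slot.done = True  # pretend the embedding result was sent
            return            # remaining slots are not touched in this pass


def update_slots_continue(slots):
    """Assumed post-fix behaviour: keep iterating so every slot is updated."""
    for slot in slots:
        if not slot.done:
            slot.done = True  # pretend the embedding result was sent
            continue          # move on to the next slot


def finished(slots):
    """Return the ids of slots whose request has been completed."""
    return [s.slot_id for s in slots if s.done]
```

With two busy slots, one pass of the early-exit variant completes only the first slot, while the `continue` variant completes both, which is the situation the new concurrent scenarios exercise with 2 slots and 4 prompts.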
Diffstat (limited to 'examples/server/tests/features/parallel.feature')
-rw-r--r-- | examples/server/tests/features/parallel.feature | 46 |
1 files changed, 46 insertions, 0 deletions
diff --git a/examples/server/tests/features/parallel.feature b/examples/server/tests/features/parallel.feature
index 802d624f..c85f9de1 100644
--- a/examples/server/tests/features/parallel.feature
+++ b/examples/server/tests/features/parallel.feature
@@ -8,6 +8,7 @@ Feature: Parallel
     And 42 as server seed
     And 64 KV cache size
     And 2 slots
+    And embeddings extraction
     And continuous batching
     Then the server is starting
     Then the server is healthy
@@ -75,3 +76,48 @@ Feature: Parallel
     Then the server is busy
     Then the server is idle
     Then all prompts are predicted
+
+  Scenario: Multi users embeddings
+    Given a prompt:
+      """
+      Write a very long story about AI.
+      """
+    And a prompt:
+      """
+      Write another very long music lyrics.
+      """
+    And a prompt:
+      """
+      Write a very long poem.
+      """
+    And a prompt:
+      """
+      Write a very long joke.
+      """
+    Given concurrent embedding requests
+    Then the server is busy
+    Then the server is idle
+    Then all embeddings are generated
+
+  Scenario: Multi users OAI compatibility embeddings
+    Given a prompt:
+      """
+      In which country Paris is located ?
+      """
+    And a prompt:
+      """
+      Is Madrid the capital of Spain ?
+      """
+    And a prompt:
+      """
+      What is the biggest US city ?
+      """
+    And a prompt:
+      """
+      What is the capital of Bulgaria ?
+      """
+    And a model tinyllama-2
+    Given concurrent OAI embedding requests
+    Then the server is busy
+    Then the server is idle
+    Then all embeddings are generated
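The new scenarios queue four prompts, issue the embedding requests concurrently, and then check that every request produced an embedding. A minimal sketch of that fan-out pattern using `asyncio.gather` is shown below. It is an illustration under stated assumptions, not the test suite's step implementation: `fake_post_embedding` is a hypothetical stub standing in for the HTTP call (the endpoint URL and payload are not part of this diff), and `all_embeddings_generated` is an assumed reading of the final `Then` step.

```python
import asyncio


async def fake_post_embedding(prompt):
    """Hypothetical stand-in for an HTTP request; returns a dummy vector."""
    await asyncio.sleep(0)  # yield control so the requests can interleave
    return [float(len(prompt)), 1.0]


async def concurrent_embeddings(prompts, post_embedding=fake_post_embedding):
    """Start one embedding request per prompt and wait for all of them."""
    tasks = [post_embedding(p) for p in prompts]
    # gather returns results in the same order as the input awaitables
    return await asyncio.gather(*tasks)


def all_embeddings_generated(embeddings, expected_count):
    """Assumed check: one non-empty vector came back for every prompt."""
    return len(embeddings) == expected_count and all(len(e) > 0 for e in embeddings)


# Prompts taken from the "Multi users embeddings" scenario
prompts = [
    "Write a very long story about AI.",
    "Write another very long music lyrics.",
    "Write a very long poem.",
    "Write a very long joke.",
]
embeddings = asyncio.run(concurrent_embeddings(prompts))
```

Passing the transport as a parameter keeps the sketch runnable without a server; a real client would swap in a coroutine that posts each prompt to the running server.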