Diffstat (limited to 'examples/server/tests/features/embeddings.feature')
-rw-r--r-- | examples/server/tests/features/embeddings.feature | 94 |
1 files changed, 94 insertions, 0 deletions
diff --git a/examples/server/tests/features/embeddings.feature b/examples/server/tests/features/embeddings.feature
new file mode 100644
index 00000000..b47661e9
--- /dev/null
+++ b/examples/server/tests/features/embeddings.feature
@@ -0,0 +1,94 @@
+@llama.cpp
+@embeddings
+Feature: llama.cpp server
+
+  Background: Server startup
+    Given a server listening on localhost:8080
+    And a model file bert-bge-small/ggml-model-f16.gguf from HF repo ggml-org/models
+    And a model alias bert-bge-small
+    And 42 as server seed
+    And 2 slots
+    And 1024 as batch size
+    And 2048 KV cache size
+    And embeddings extraction
+    Then the server is starting
+    Then the server is healthy
+
+  Scenario: Embedding
+    When embeddings are computed for:
+    """
+    What is the capital of Bulgaria ?
+    """
+    Then embeddings are generated
+
+  Scenario: OAI Embeddings compatibility
+    Given a model bert-bge-small
+    When an OAI compatible embeddings computation request for:
+    """
+    What is the capital of Spain ?
+    """
+    Then embeddings are generated
+
+  Scenario: OAI Embeddings compatibility with multiple inputs
+    Given a model bert-bge-small
+    Given a prompt:
+    """
+    In which country Paris is located ?
+    """
+    And a prompt:
+    """
+    Is Madrid the capital of Spain ?
+    """
+    When an OAI compatible embeddings computation request for multiple inputs
+    Then embeddings are generated
+
+  Scenario: Multi users embeddings
+    Given a prompt:
+    """
+    Write a very long story about AI.
+    """
+    And a prompt:
+    """
+    Write another very long music lyrics.
+    """
+    And a prompt:
+    """
+    Write a very long poem.
+    """
+    And a prompt:
+    """
+    Write a very long joke.
+    """
+    Given concurrent embedding requests
+    Then the server is busy
+    Then the server is idle
+    Then all embeddings are generated
+
+  Scenario: Multi users OAI compatibility embeddings
+    Given a prompt:
+    """
+    In which country Paris is located ?
+    """
+    And a prompt:
+    """
+    Is Madrid the capital of Spain ?
+    """
+    And a prompt:
+    """
+    What is the biggest US city ?
+    """
+    And a prompt:
+    """
+    What is the capital of Bulgaria ?
+    """
+    And a model bert-bge-small
+    Given concurrent OAI embedding requests
+    Then the server is busy
+    Then the server is idle
+    Then all embeddings are generated
+
+  Scenario: All embeddings should be the same
+    Given 10 fixed prompts
+    And a model bert-bge-small
+    Given concurrent OAI embedding requests
+    Then all embeddings are the same
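
The last scenario ends with the step "all embeddings are the same", but the step definition behind it is not part of this diff. As a rough illustration only, such a check could compare every returned vector against the first one. The helper name `all_embeddings_equal`, the plain list-of-floats input, and the tolerance value are assumptions made for this sketch, not the project's actual implementation.

```python
import math
from typing import Sequence


def all_embeddings_equal(
    embeddings: Sequence[Sequence[float]],
    tol: float = 1e-6,  # assumed absolute tolerance; not taken from the diff
) -> bool:
    """Return True if every vector matches the first one element-wise within tol."""
    if not embeddings:
        # Nothing to compare; treat an empty result set as trivially equal.
        return True
    first = embeddings[0]
    for vec in embeddings[1:]:
        # Vectors of different dimension can never be equal.
        if len(vec) != len(first):
            return False
        for a, b in zip(first, vec):
            if not math.isclose(a, b, rel_tol=0.0, abs_tol=tol):
                return False
    return True
```

An exact comparison (`tol=0.0`) would be stricter; a small tolerance is used here on the assumption that concurrent requests may be batched differently and produce tiny floating-point differences.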