author Ryuei <louixs@users.noreply.github.com> 2024-05-14 14:20:47 +0900
committer GitHub <noreply@github.com> 2024-05-14 15:20:47 +1000
commit 27f65d6267cf22a44c5ccefa7765d53a05bd1259
tree 77ccab6005d786dca3036842f135f0e5c1c6d484
parent ee52225067622babc277371511b8124884e1c797
docs: Fix typo and update description for --embeddings flag (#7026)

- Change '--embedding' to '--embeddings' in the README
- Update the description to match the latest --help output
- Added a caution about defining physical batch size

-rw-r--r-- examples/server/README.md | 2
1 files changed, 1 insertions, 1 deletions
diff --git a/examples/server/README.md b/examples/server/README.md
index 65031799..f6eb6942 100644
--- a/examples/server/README.md
+++ b/examples/server/README.md
@@ -48,7 +48,7 @@ page cache before using this. See https://github.com/ggerganov/llama.cpp/issues/
 - `--path`: Path from which to serve static files. Default: disabled
 - `--api-key`: Set an api key for request authorization. By default, the server responds to every request. With an api key set, the requests must have the Authorization header set with the api key as Bearer token. May be used multiple times to enable multiple valid keys.
 - `--api-key-file`: Path to file containing api keys delimited by new lines. If set, requests must include one of the keys for access. May be used in conjunction with `--api-key`s.
-- `--embedding`: Enable embedding extraction. Default: disabled
+- `--embeddings`: Enable embedding vector output and the OAI compatible endpoint /v1/embeddings. Physical batch size (`--ubatch-size`) must be carefully defined. Default: disabled
 - `-np N`, `--parallel N`: Set the number of slots for process requests. Default: `1`
 - `-cb`, `--cont-batching`: Enable continuous batching (a.k.a dynamic batching). Default: disabled
 - `-spf FNAME`, `--system-prompt-file FNAME` Set a file to load a system prompt (initial prompt of all slots). This is useful for chat applications. [See more](#change-system-prompt-on-runtime)
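
The updated `--embeddings` description mentions the OAI compatible endpoint `/v1/embeddings`, and the `--api-key` entry says the key travels as a Bearer token in the Authorization header. A minimal sketch of how a client might assemble such a request follows. The helper name `build_embeddings_request`, the base URL `http://localhost:8080`, and the `input` field of the JSON body are assumptions for illustration (the body shape follows the OpenAI embeddings convention); none of them come from this commit. The sketch only builds the request and does not send it.

```python
import json
import urllib.request


def build_embeddings_request(text, base_url="http://localhost:8080", api_key=None):
    """Build (but do not send) a POST request for the /v1/embeddings endpoint.

    base_url and the {"input": ...} body shape are assumptions, not taken
    from the README diff above.
    """
    headers = {"Content-Type": "application/json"}
    if api_key is not None:
        # With --api-key set on the server, the key is sent as a Bearer token.
        headers["Authorization"] = f"Bearer {api_key}"
    body = json.dumps({"input": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/embeddings", data=body, headers=headers, method="POST"
    )


if __name__ == "__main__":
    req = build_embeddings_request("hello world", api_key="my-secret-key")
    # Sending needs a running server started with --embeddings, for example:
    # with urllib.request.urlopen(req) as resp:
    #     print(json.load(resp))
    print(req.get_method(), req.full_url)
```

Keeping construction separate from sending makes it easy to inspect the URL and headers before pointing the client at a real server.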