author Maximilian Winter <maximilian.winter.91@gmail.com> 2024-01-27 14:38:05 +0100
committer GitHub <noreply@github.com> 2024-01-27 15:38:05 +0200
commit ec903c034131848da9222536ff18da07ec0882a0
tree cfd7b1e8c3002e676c7b8309124c7c32bc506f60 /examples/server/README.md
parent a1d6df129bcd3d42cda38c09217d8d4ec4ea3bdd
server : add self-extend support (#5104)
* Ported self extension to server example
* Update server.cpp
* Fixed prompt caching without self extend
* Update server.cpp
* Added description to server readme.
* Update server.cpp
* Update server.cpp
* Update server.cpp
* Update server.cpp
* Update README.md
* Changed descriptions
* server : formatting
* Update examples/server/server.cpp
  Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* Update examples/server/server.cpp
  Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* Update server.cpp
* Update server.cpp

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Diffstat (limited to 'examples/server/README.md')
-rw-r--r-- examples/server/README.md | 3
1 files changed, 2 insertions, 1 deletions
diff --git a/examples/server/README.md b/examples/server/README.md
index fd3034b9..1c92a204 100644
--- a/examples/server/README.md
+++ b/examples/server/README.md
@@ -30,7 +30,8 @@ Command line options:
- `-cb`, `--cont-batching`: enable continuous batching (a.k.a dynamic batching) (default: disabled)
- `-spf FNAME`, `--system-prompt-file FNAME` Set a file to load "a system prompt (initial prompt of all slots), this is useful for chat applications. [See more](#change-system-prompt-on-runtime)
- `--mmproj MMPROJ_FILE`: Path to a multimodal projector file for LLaVA.
-
+- `--grp-attn-n`: Set the group attention factor to extend context size through self-extend(default: 1=disabled), used together with group attention width `--grp-attn-w`
+- `--grp-attn-w`: Set the group attention width to extend context size through self-extend(default: 512), used together with group attention factor `--grp-attn-n`
## Build
server is build alongside everything else from the root of the project
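
As an illustration of the two `--grp-attn-*` options added in this diff, the sketch below assembles a server invocation that sets both together, as their descriptions suggest. The model path, the context size, and the factor and width values are assumptions chosen for the example, not values taken from the commit. The `-m` and `-c` flags come from the server's existing option list, which lies outside this hunk.

```shell
#!/bin/sh
# Illustrative values only: the model path and all numbers are assumptions,
# not recommendations from the commit.
MODEL="models/7B/ggml-model.gguf"
CTX_SIZE=8192
GRP_ATTN_N=4      # group attention factor (documented default: 1 = disabled)
GRP_ATTN_W=2048   # group attention width (documented default: 512)

# Assemble the invocation; the two --grp-attn-* flags are the ones this diff documents.
CMD="./server -m $MODEL -c $CTX_SIZE --grp-attn-n $GRP_ATTN_N --grp-attn-w $GRP_ATTN_W"

# Print the command for inspection instead of launching the server.
echo "$CMD"
```

To launch the server, run the printed command from the directory that contains the built `server` binary.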