author | Maximilian Winter <maximilian.winter.91@gmail.com> | 2024-01-27 14:38:05 +0100 |
---|---|---|
committer | GitHub <noreply@github.com> | 2024-01-27 15:38:05 +0200 |
commit | ec903c034131848da9222536ff18da07ec0882a0 (patch) | |
tree | cfd7b1e8c3002e676c7b8309124c7c32bc506f60 /examples/server/README.md | |
parent | a1d6df129bcd3d42cda38c09217d8d4ec4ea3bdd (diff) |
server : add self-extend support (#5104)
* Ported self extension to server example
* Update server.cpp
* Fixed prompt caching without self extend
* Update server.cpp
* Added description to server readme.
* Update server.cpp
* Update server.cpp
* Update server.cpp
* Update server.cpp
* Update README.md
* Changed descriptions
* server : formatting
* Update examples/server/server.cpp
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* Update examples/server/server.cpp
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* Update server.cpp
* Update server.cpp
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Diffstat (limited to 'examples/server/README.md')
-rw-r--r-- | examples/server/README.md | 3 |
1 files changed, 2 insertions, 1 deletions
diff --git a/examples/server/README.md b/examples/server/README.md
index fd3034b9..1c92a204 100644
--- a/examples/server/README.md
+++ b/examples/server/README.md
@@ -30,7 +30,8 @@ Command line options:
 - `-cb`, `--cont-batching`: enable continuous batching (a.k.a dynamic batching) (default: disabled)
 - `-spf FNAME`, `--system-prompt-file FNAME` Set a file to load "a system prompt (initial prompt of all slots), this is useful for chat applications. [See more](#change-system-prompt-on-runtime)
 - `--mmproj MMPROJ_FILE`: Path to a multimodal projector file for LLaVA.
-
+- `--grp-attn-n`: Set the group attention factor to extend context size through self-extend(default: 1=disabled), used together with group attention width `--grp-attn-w`
+- `--grp-attn-w`: Set the group attention width to extend context size through self-extend(default: 512), used together with group attention factor `--grp-attn-n`
 ## Build
 
 server is build alongside everything else from the root of the project
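The two options added in this diff are meant to be passed together. As a rough illustration only, the sketch below assembles a command line for the server example. The helper `build_server_args`, the `./server` binary path, the model path, and the `-m` / `-c` flags are assumptions that do not appear in this diff, and the numeric values are placeholders rather than tuned recommendations.

```python
from typing import List


def build_server_args(model_path: str, ctx_size: int,
                      grp_attn_n: int = 1, grp_attn_w: int = 512) -> List[str]:
    """Assemble an argv list for the server example (hypothetical helper)."""
    if grp_attn_n < 1 or grp_attn_w < 1:
        raise ValueError("grp_attn_n and grp_attn_w must be positive")
    # Assumed base invocation: binary path, -m and -c are not part of this diff.
    args = ["./server", "-m", model_path, "-c", str(ctx_size)]
    # Per the README text, a factor of 1 leaves self-extend disabled,
    # so the two flags are only emitted when the factor is greater than 1.
    if grp_attn_n > 1:
        args += ["--grp-attn-n", str(grp_attn_n),
                 "--grp-attn-w", str(grp_attn_w)]
    return args


if __name__ == "__main__":
    # Placeholder values, chosen only to show the shape of the command.
    print(" ".join(build_server_args("models/model.gguf", 8192, 4, 2048)))
```

With the defaults (`grp_attn_n=1`), the helper emits neither flag, which mirrors the "1=disabled" default described in the README change.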