author | saood06 <saood05@gmail.com> | 2025-05-28 00:18:25 -0500 |
---|---|---|
committer | GitHub <noreply@github.com> | 2025-05-28 08:18:25 +0300 |
commit | ccd6d9cdf6851f7042c48d682daf47bc0e2eca27 (patch) | |
tree | ac8324411fd50d18ef9eef08f75e18dd69d6299a /examples/server/README.md | |
parent | 09764678456f8991f6095118f3727d9d0b17b8c8 (diff) |
set cache_prompt default to true (#465)
Diffstat (limited to 'examples/server/README.md')
-rw-r--r-- | examples/server/README.md | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/examples/server/README.md b/examples/server/README.md
index e17595fe..cb1eb7c9 100644
--- a/examples/server/README.md
+++ b/examples/server/README.md
@@ -450,7 +450,7 @@ node index.js
 `id_slot`: Assign the completion task to an specific slot. If is -1 the task will be assigned to a Idle slot. Default: `-1`
-`cache_prompt`: Re-use KV cache from a previous request if possible. This way the common prefix does not have to be re-processed, only the suffix that differs between the requests. Because (depending on the backend) the logits are **not** guaranteed to be bit-for-bit identical for different batch sizes (prompt processing vs. token generation) enabling this option can cause nondeterministic results. Default: `false`
+`cache_prompt`: Re-use KV cache from a previous request if possible. This way the common prefix does not have to be re-processed, only the suffix that differs between the requests. Because (depending on the backend) the logits are **not** guaranteed to be bit-for-bit identical for different batch sizes (prompt processing vs. token generation) enabling this option can cause nondeterministic results. Default: `true`
 `system_prompt`: Change the system prompt (initial prompt of all slots), this is useful for chat applications. [See more](#change-system-prompt-on-runtime)
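
Because the default flips to `true`, a client that needs reproducible output has to opt out explicitly. The sketch below is one way to do that, not part of this commit. It assumes the server listens on `http://localhost:8080` and accepts a JSON body on `/completion` with `prompt`, `n_predict`, and `cache_prompt` fields. The helper names `build_completion_request` and `send` are hypothetical.

```python
import json
import urllib.request

# Assumed local server address; adjust to your own deployment.
SERVER_URL = "http://localhost:8080/completion"


def build_completion_request(prompt, n_predict=64, cache_prompt=None, url=SERVER_URL):
    """Build (but do not send) a POST request for the /completion endpoint.

    cache_prompt=None leaves the field out of the body, so the server
    default applies (`true` after this commit). Pass False to opt out when
    reproducible output matters more than prompt-processing speed.
    """
    payload = {"prompt": prompt, "n_predict": n_predict}
    if cache_prompt is not None:
        payload["cache_prompt"] = cache_prompt
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send a prepared request and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())
```

Calling `build_completion_request("Hello", cache_prompt=False)` produces a request that asks the server to re-process the whole prompt. Omitting the argument defers to the server default.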