author     Xuan Son Nguyen <thichthat@gmail.com>    2024-02-22 00:31:00 +0100
committer  GitHub <noreply@github.com>              2024-02-22 00:31:00 +0100
commit     7c8bcc11dc61cf5930b70cd0168b84afcebe12a9
tree       f5b04881466f01302d9626433f650763785f8818 /examples/server/README.md
parent     7fe4678b0244ba7b03eae66ebeaa947e2770bb1a
Add docs for llama_chat_apply_template (#5645)
* add docs for llama_chat_apply_template
* fix typo
Diffstat (limited to 'examples/server/README.md')
-rw-r--r--  examples/server/README.md  1
1 file changed, 1 insertion(+), 0 deletions(-)
diff --git a/examples/server/README.md b/examples/server/README.md
index 6d9f96cd..4b24ee5d 100644
--- a/examples/server/README.md
+++ b/examples/server/README.md
@@ -41,6 +41,7 @@ see https://github.com/ggerganov/llama.cpp/issues/1437
 - `--grp-attn-w`: Set the group attention width to extend context size through self-extend(default: 512), used together with group attention factor `--grp-attn-n`
 - `-n, --n-predict`: Set the maximum tokens to predict (default: -1)
 - `--slots-endpoint-disable`: To disable slots state monitoring endpoint. Slots state may contain user data, prompts included.
+- `--chat-template JINJA_TEMPLATE`: Set custom jinja chat template. This parameter accepts a string, not a file name (default: template taken from model's metadata). We only support [some pre-defined templates](https://github.com/ggerganov/llama.cpp/wiki/Templates-supported-by-llama_chat_apply_template)

 ## Build
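As the added line notes, `--chat-template` takes the template as a literal string on the command line, not a path to a file, and only templates recognized by `llama_chat_apply_template` are supported. A minimal invocation might look like the following sketch; the model path is a placeholder, and the ChatML-style template string is one illustrative example of a template the matcher can recognize:

```shell
# Illustrative only: start the llama.cpp server with an explicit chat
# template. The value after --chat-template is the template string itself
# (here, a ChatML-style Jinja template), NOT a file name. The model path
# is a placeholder for whichever GGUF model you are serving.
./server -m models/my-model.gguf \
  --chat-template "{% for message in messages %}<|im_start|>{{ message['role'] }}\n{{ message['content'] }}<|im_end|>\n{% endfor %}"
```

If the flag is omitted, the template embedded in the model's metadata is used instead, per the default described above.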