path: root/convert-llama-ggmlv3-to-gguf.py
author    Xiao-Yong Jin <jinxiaoyong@gmail.com>  2023-08-23 02:12:12 -0500
committer GitHub <noreply@github.com>            2023-08-23 15:12:12 +0800
commit    b8ad1b66b23f9b2e6e4531e9a62753323036a556 (patch)
tree      72799c23c8335ee997ab579c41313449ca2e4e91 /convert-llama-ggmlv3-to-gguf.py
parent    f5fe98d11bdf9e7797bcfb05c0c3601ffc4b9d26 (diff)
server : allow json array in prompt or content for direct token input (#2306)
* server: allow json array in prompt or content

  In addition to the existing string-valued prompt or content, we accept an array of strings and numbers representing tokens. This allows direct token input: any special tokens can be processed and inserted at the frontend while the json data is constructed, before it is sent to the server, so the server does not need to know about or parse special tokens in textual input. With this, we can use the EOS and BOS tokens required by llama-2-chat models.

* server: use tokenizePrompt(json) and default "" if empty prompt

* server: fix prompt check

* server: tokenize endpoint no longer adds BOS
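As a rough illustration of the feature this commit describes, the sketch below builds a request body whose `prompt` mixes a raw token id with plain text, so a special token reaches the server without being parsed out of the text. The helper name `build_prompt`, the chat template, and the token id (1 for BOS, the usual llama-2 value) are assumptions for this example, not part of the patch.

```python
import json

BOS = 1  # assumed llama-2 BOS token id; not taken from the patch

def build_prompt(user_text: str) -> list:
    """Mix a raw token id with text so the special token bypasses
    the server's text tokenizer (hypothetical helper)."""
    return [BOS, "[INST] " + user_text + " [/INST]"]

# JSON body a frontend could POST to the server's completion endpoint
payload = json.dumps({"prompt": build_prompt("Hello"), "n_predict": 64})
print(payload)
```

The server tokenizes only the string elements; numeric elements are taken as token ids verbatim, which is what lets the frontend control special tokens.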
Diffstat (limited to 'convert-llama-ggmlv3-to-gguf.py')
0 files changed, 0 insertions, 0 deletions