| author | Justine Tunney <jtunney@mozilla.com> | 2024-05-25 05:04:03 -0400 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2024-05-25 19:04:03 +1000 |
| commit | 00c63907931bb08a0ed2b7e38cf44dd290143cb9 (patch) | |
| tree | c2248d26ae5d25160594523b562bb2481fb1c87a /llama.h | |
| parent | faa0e6979a11dcb731e9d778ad42ceaa0302015e (diff) | |
main : don't print special tokens with --grammar (#6923)
* main : don't print special tokens with --grammar
The CLI interface was recently changed to print special control tokens,
such as the </s> end-of-sequence token. This token shouldn't be printed
when the grammar flag is passed, unless the grammar itself specifies it,
because printing it breaks shell-scriptability.
* main: use separate stream for control characters
* main: use dprintf and add --ctrl-token-no-out and --ctrl-token-fd-out
* main: dprintf isn't portable everywhere. Just use write().
* main: remove --ctrl-token-fd-out in favor of fcntl() based detection
* common.cpp: restore accidentally removed --interactive-first
* main: only merge stdout and control token if not in conversation or grammar mode
* main: rejig control token descriptor handling
* main: must check pipe status on very top of program
* main: rename --ctrl-token-no-out to --no-special and other refactoring
* main: refactor ctrl_token_no_out --> no_special
* llama: rename llama_token_is_control_token() to llama_token_is_control()
* main: remove special token file descriptor feature (#5)
---------
Co-authored-by: Brian <mofosyne@gmail.com>
Diffstat (limited to 'llama.h')
-rw-r--r-- | llama.h | 3 |
1 file changed, 3 insertions, 0 deletions
@@ -823,6 +823,9 @@ extern "C" {
     // Check if the token is supposed to end generation (end-of-generation, eg. EOS, EOT, etc.)
     LLAMA_API bool llama_token_is_eog(const struct llama_model * model, llama_token token);

+    // Identify if Token Id is a control token or a render-able token
+    LLAMA_API bool llama_token_is_control(const struct llama_model * model, llama_token token);
+
     // Special tokens
     LLAMA_API llama_token llama_token_bos(const struct llama_model * model); // beginning-of-sentence
     LLAMA_API llama_token llama_token_eos(const struct llama_model * model); // end-of-sentence