author | Kerfuffle <44031344+KerfuffleV2@users.noreply.github.com> | 2023-11-16 19:14:37 -0700 |
---|---|---|
committer | GitHub <noreply@github.com> | 2023-11-16 19:14:37 -0700 |
commit | 91f6499393d2d999331fbfdba47a7f8b9f913f0d (patch) | |
tree | 27caf3ad0b9cec979bb5ed3317b5334bdcd9470c /common/common.h | |
parent | 8da46278e1a57107591653275f8e03a281de94f0 (diff) |
Respect tokenizer.ggml.add_bos_token value when tokenizing (#4040)
* gguf-py: gguf-dump: Respect --no-tensor flag in JSON mode.
* Respect add_bos_token GGUF metadata value
* gguf-py: Try to fix SpecialVocab giving up too easily for the Nth time
Diffstat (limited to 'common/common.h')
-rw-r--r-- | common/common.h | 4 |
1 files changed, 4 insertions, 0 deletions
diff --git a/common/common.h b/common/common.h
index dd6b002e..cc048daa 100644
--- a/common/common.h
+++ b/common/common.h
@@ -200,6 +200,10 @@ std::string llama_detokenize_bpe(
                          llama_context * ctx,
         const std::vector<llama_token> & tokens);
 
+// Uses the value from the model metadata if possible, otherwise
+// defaults to true when model type is SPM, otherwise false.
+bool llama_should_add_bos_token(const llama_model * model);
+
 //
 // YAML utils
 //
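The comment on the new declaration describes a two-step fallback: use the `tokenizer.ggml.add_bos_token` metadata value when the model provides one, otherwise add a BOS token only for SPM vocabularies. The standalone sketch below illustrates that decision rule and nothing more. `ModelInfo`, `VocabType`, and `should_add_bos_token` are hypothetical stand-ins invented for this example; they are not llama.cpp types or functions, and the real implementation is not part of this header-only diff. It assumes C++17 for `std::optional`.

```cpp
#include <optional>

// Hypothetical stand-in for a vocabulary type; not the llama.cpp enum.
enum class VocabType { SPM, BPE };

// Hypothetical stand-in for the parts of a model this decision needs.
struct ModelInfo {
    // Value of tokenizer.ggml.add_bos_token, or empty if the key is absent.
    std::optional<bool> add_bos_token;
    VocabType vocab_type;
};

// Sketch of the documented rule: prefer the metadata value when present,
// otherwise default to true for SPM vocabularies and false for the rest.
bool should_add_bos_token(const ModelInfo & model) {
    if (model.add_bos_token.has_value()) {
        return *model.add_bos_token;
    }
    return model.vocab_type == VocabType::SPM;
}
```

Under this rule an explicit metadata value always wins, so an SPM model whose metadata sets the key to false gets no BOS token, while a model without the key falls back to the vocabulary-type default.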