From 91f6499393d2d999331fbfdba47a7f8b9f913f0d Mon Sep 17 00:00:00 2001 From: Kerfuffle <44031344+KerfuffleV2@users.noreply.github.com> Date: Thu, 16 Nov 2023 19:14:37 -0700 Subject: Respect tokenizer.ggml.add_bos_token value when tokenizing (#4040) * gguf-py: gguf-dump: Respect --no-tensor flag in JSON mode. * Respect add_bos_token GGUF metadata value * gguf-py: Try to fix SpecialVocab giving up too easily for the Nth time --- common/common.cpp | 6 ++++++ 1 file changed, 6 insertions(+) (limited to 'common/common.cpp') diff --git a/common/common.cpp b/common/common.cpp index 6a711420..e119317d 100644 --- a/common/common.cpp +++ b/common/common.cpp @@ -1072,6 +1072,12 @@ std::string llama_detokenize_bpe(llama_context * ctx, const std::vector