author    Kawrakow <iwankawrakow@gmail.com>  2025-06-03 11:32:03 +0300
committer GitHub <noreply@github.com>        2025-06-03 11:32:03 +0300
commit    ccb265c01676aad9ae5860ba50e74e61dfcd1cf8 (patch)
tree      8e2d9303bd091c4d0015fce8402162346d998cca /include/llama.h
parent    4f8b05a0d76e6c5e47fe1f6c7bd079e0fe95dbba (diff)
Adding the XTC sampler (#486)
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
Diffstat (limited to 'include/llama.h')
-rw-r--r--  include/llama.h | 8 ++++++++
1 file changed, 8 insertions(+), 0 deletions(-)
diff --git a/include/llama.h b/include/llama.h
index 607a590d..89526276 100644
--- a/include/llama.h
+++ b/include/llama.h
@@ -1208,6 +1208,14 @@ extern "C" {
             llama_token_data_array * candidates,
             float temp);
 
+    /// @details XTC sampler as described in https://github.com/oobabooga/text-generation-webui/pull/6335
+    LLAMA_API void llama_sample_xtc(
+            struct llama_context * ctx,
+            llama_token_data_array * candidates_p,
+            float probability,
+            float threshold,
+            size_t min_keep);
+
     /// @details Mirostat 1.0 algorithm described in the paper https://arxiv.org/abs/2007.14966. Uses tokens instead of words.
     /// @param candidates A vector of `llama_token_data` containing the candidate tokens, their probabilities (p), and log-odds (logit) for the current position in the generated text.
     /// @param tau The target cross-entropy (or surprise) value you want to achieve for the generated text. A higher value corresponds to more surprising or less predictable text, while a lower value corresponds to less surprising or more predictable text.