author Minsoo Cheong <54794500+mscheong01@users.noreply.github.com> 2024-03-05 03:24:00 +0900
committer GitHub <noreply@github.com> 2024-03-04 20:24:00 +0200
commit 6d341ab6c53cd51f2921d986d0090cc8b049b39a
tree f212b497e210c8c73fe52369f6bc81297c7b1dab /examples/speculative/README.md
parent 4ffcdce2ff877ebb683cd217ea38faf20faa5ffe
speculative : implement stochastic speculative sampling (#5625)
* (WIP) Implement stochastic speculative decoding
* sample from residual distribution on draft accept failure
* fix #5657: force greedy sampling with probs when temp is 0
* remove p_accept parameter
* fix style
* remove unused variables
* add srand() in speculative.cpp
* replace use of rand() with mt19937 sampling
* fixes based on review (@JohannesGaessler)
* fix r random generation
* randomly select next sequence to verify + fix bug in memory freeing
* fix bug in active_seqs sync
* fix uniform int distribution initialization
* remove warnings from comparison between int and size_t
* check grammar in `llama_sample_probability_distribution_impl`
* remove malloc code by utilizing vectors
* add PR link to README
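
For readers unfamiliar with the technique named in the subject line, the following is a minimal sketch of the standard stochastic acceptance rule for one drafted token: accept with probability min(1, p/q), and on failure sample from the residual distribution proportional to max(0, p - q). It is not code from this commit. The function name `verify_draft_token`, its signature, and the fallback to `p` when the residual is empty are assumptions made for illustration; only the use of `std::mt19937` mirrors the commit message.

```cpp
#include <algorithm>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper for illustration only (not part of llama.cpp).
// p: target-model probabilities, q: draft-model probabilities, both over
// the same vocabulary. drafted: index of the token the draft model proposed.
// Returns the token to emit and sets `accepted` accordingly.
int verify_draft_token(const std::vector<float> & p,
                       const std::vector<float> & q,
                       int drafted,
                       std::mt19937 & rng,
                       bool & accepted) {
    std::uniform_real_distribution<double> unif(0.0, 1.0);
    const double r = unif(rng);

    // Accept with probability min(1, p/q), written as r * q < p so that
    // no division by zero can occur.
    if (r * q[drafted] < p[drafted]) {
        accepted = true;
        return drafted;
    }
    accepted = false;

    // Draft rejected: build the residual weights max(0, p - q).
    std::vector<double> residual(p.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < p.size(); ++i) {
        residual[i] = std::max(0.0, double(p[i]) - double(q[i]));
        sum += residual[i];
    }

    // Assumption: if the residual is empty, fall back to sampling from p.
    if (sum <= 0.0) {
        std::discrete_distribution<int> from_target(p.begin(), p.end());
        return from_target(rng);
    }

    // discrete_distribution normalizes the weights internally.
    std::discrete_distribution<int> from_residual(residual.begin(), residual.end());
    return from_residual(rng);
}
```

Under this rule the emitted token is distributed according to `p`, which is why the draft model can speed up decoding without changing the target model's output distribution.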
Diffstat (limited to 'examples/speculative/README.md')
-rw-r--r-- examples/speculative/README.md 1
1 files changed, 1 insertions, 0 deletions
diff --git a/examples/speculative/README.md b/examples/speculative/README.md
index 814efa59..a6608c5f 100644
--- a/examples/speculative/README.md
+++ b/examples/speculative/README.md
@@ -6,3 +6,4 @@ More info:
- https://github.com/ggerganov/llama.cpp/pull/2926
- https://github.com/ggerganov/llama.cpp/pull/3624
+- https://github.com/ggerganov/llama.cpp/pull/5625