diff options
author | Minsoo Cheong <54794500+mscheong01@users.noreply.github.com> | 2024-03-05 03:24:00 +0900 |
---|---|---|
committer | GitHub <noreply@github.com> | 2024-03-04 20:24:00 +0200 |
commit | 6d341ab6c53cd51f2921d986d0090cc8b049b39a (patch) | |
tree | f212b497e210c8c73fe52369f6bc81297c7b1dab /examples/speculative/README.md | |
parent | 4ffcdce2ff877ebb683cd217ea38faf20faa5ffe (diff) |
speculative : implement stochastic speculative sampling (#5625)
* (WIP) Implement stochastic speculative decoding
* sample from residual distribution on draft accept failure
* fix #5657: force greedy sampling with probs when temp is 0
* remove p_accept parameter
* fix style
* remove unused variables
* add srand() in speculative.cpp
* replace use of rand() with mt19937 sampling
* fixes based on review (@JohannesGaessler)
* fix r random generation
* randomly select next sequence to verify + fix bug in memory freeing
* fix bug in active_seqs sync
* fix uniform int distribution initialization
* remove warnings from comparison between int and size_t
* check grammar in `llama_sample_probability_distribution_impl`
* remove malloc code by utilizing vectors
* add PR link to README
Diffstat (limited to 'examples/speculative/README.md')
-rw-r--r-- | examples/speculative/README.md | 1 |
1 files changed, 1 insertions, 0 deletions
diff --git a/examples/speculative/README.md b/examples/speculative/README.md index 814efa59..a6608c5f 100644 --- a/examples/speculative/README.md +++ b/examples/speculative/README.md @@ -6,3 +6,4 @@ More info: - https://github.com/ggerganov/llama.cpp/pull/2926 - https://github.com/ggerganov/llama.cpp/pull/3624 +- https://github.com/ggerganov/llama.cpp/pull/5625 |