| author | LeonEricsson <70749762+LeonEricsson@users.noreply.github.com> | 2023-12-22 17:05:56 +0100 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2023-12-22 18:05:56 +0200 |
| commit | 7082d24cec35e9ce9147535a2224dfc67ee0a78c (patch) | |
| tree | b87d0e65d71c8e2a5bdb889483c75d4429d2d566 /examples/lookup/README.md | |
| parent | ba661751322a7c201fd3bef71af077c5aebfaa2a (diff) | |
lookup : add prompt lookup decoding example (#4484)
* initial commit, going through initializations
* main loop finished, starting to debug
* BUG: generates gibberish/repeating tokens after a while
* kv_cache management
* Added colors to distinguish drafted tokens (--color). Updated README
* lookup : fix token positions in the draft batch
* lookup : use n_draft from CLI params
* lookup : final touches
---------
Co-authored-by: Leon Ericsson <leon.ericsson@icloud.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Diffstat (limited to 'examples/lookup/README.md')
| -rw-r--r-- | examples/lookup/README.md | 13 |
|---|---|---|

1 file changed, 13 insertions, 0 deletions
diff --git a/examples/lookup/README.md b/examples/lookup/README.md
new file mode 100644
index 00000000..5bfb0de9
--- /dev/null
+++ b/examples/lookup/README.md
@@ -0,0 +1,13 @@
+# llama.cpp/examples/lookup
+
+Demonstration of Prompt Lookup Decoding
+
+https://github.com/apoorvumang/prompt-lookup-decoding
+
+The key parameters for lookup decoding are `ngram_min`, `ngram_max` and `n_draft`. The first two determine the size of the ngrams to search for in the prompt for a match. The latter specifies how many subsequent tokens to draft if a match is found.
+
+More info:
+
+https://github.com/ggerganov/llama.cpp/pull/4484
+https://github.com/ggerganov/llama.cpp/issues/4226
+
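The draft-search the README describes can be sketched roughly as follows. This is a minimal illustration of the prompt lookup idea, not the actual llama.cpp implementation (which works on a `llama_batch` and the KV cache); the function name `find_draft` and the token-list representation are assumptions for the example. It tries the longest ngram first (`ngram_max` down to `ngram_min`), matches the suffix of the tokens generated so far against an earlier occurrence in the sequence, and drafts up to `n_draft` tokens that followed that occurrence.

```python
def find_draft(tokens, ngram_min=1, ngram_max=3, n_draft=8):
    """Hypothetical sketch of prompt lookup decoding's draft search.

    Look for the longest suffix ngram of `tokens` that also appears
    earlier in the sequence; if found, return the (up to) n_draft
    tokens that immediately followed that earlier occurrence.
    """
    for n in range(ngram_max, ngram_min - 1, -1):
        if len(tokens) < n:
            continue
        suffix = tokens[-n:]
        # Scan earlier windows, most recent match first.
        for i in range(len(tokens) - n - 1, -1, -1):
            if tokens[i:i + n] == suffix:
                return tokens[i + n : i + n + n_draft]
    return []  # no match: fall back to normal decoding
```

For example, with `tokens = [1, 2, 3, 4, 5, 2, 3]` the suffix bigram `[2, 3]` matches the earlier occurrence at position 1, so the tokens after it (`[4, 5, 2, 3]`) are proposed as the draft and then verified by the model in a single batch, which is where the speedup comes from.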