author | Olivier Chafik <ochafik@users.noreply.github.com> | 2024-04-11 19:47:34 +0100 |
---|---|---|
committer | GitHub <noreply@github.com> | 2024-04-11 19:47:34 +0100 |
commit | cbaadc92942c50aab599a9e4c163afc1f44f7c26 (patch) | |
tree | 0a4b962430740a81a6b1789f1edd9ee50074dde3 /llama.h | |
parent | 1bbdaf6ecda6f0a360dfb307b256fcb6838c560b (diff) |
grammars: 1.5x faster inference w/ complex grammars (vector reserves / reuses) (#6609)
* grammars: reserve rejects & next candidates
* grammars: reuse new_stacks
* grammars: fix missing sig change in llama.h
* grammars: fix test (api changed)
* grammars: update gbnf-validator.cpp
* grammars: simpler syntax (no swap)
Diffstat (limited to 'llama.h')
-rw-r--r-- | llama.h | 5 |
1 files changed, 3 insertions, 2 deletions
@@ -1097,10 +1097,11 @@ const std::vector<std::pair<std::string, struct ggml_tensor *>> & llama_internal
     struct llama_context * ctx
 );
 
-std::vector<std::vector<const llama_grammar_element *>> llama_grammar_accept(
+void llama_grammar_accept(
         const std::vector<std::vector<llama_grammar_element>> & rules,
         const std::vector<std::vector<const llama_grammar_element *>> & stacks,
-        const uint32_t chr);
+        const uint32_t chr,
+        std::vector<std::vector<const llama_grammar_element *>> & new_stacks);
 
 std::pair<std::vector<uint32_t>, llama_partial_utf8> decode_utf8(
         const std::string & src,
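
The signature change above moves the result from a return value to a caller-owned `new_stacks` out-parameter, which is what lets a caller reuse one buffer across many characters. The following is a minimal sketch of that general pattern, not the llama.cpp implementation: `toy_stack`, `toy_accept`, and `toy_match` are hypothetical names, a "stack" is simplified to a list of expected code points with the top at the back, and the use of `std::swap` between steps is an illustrative choice.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for a grammar stack: the code points still expected,
// with the next expected one at the back. Not the real llama.cpp type.
using toy_stack = std::vector<uint32_t>;

// Out-parameter style: results are written into a caller-owned vector.
// `stacks` and `new_stacks` must be distinct objects.
void toy_accept(
        const std::vector<toy_stack> & stacks,
        const uint32_t                 chr,
        std::vector<toy_stack>       & new_stacks) {
    new_stacks.clear();                 // removes elements, keeps the outer capacity
    new_stacks.reserve(stacks.size());  // at most one surviving stack per input stack
    for (const toy_stack & s : stacks) {
        if (!s.empty() && s.back() == chr) {
            // Keep everything except the element that was just matched.
            new_stacks.emplace_back(s.begin(), s.end() - 1);
        }
    }
}

// Feeds `input` through toy_accept one code point at a time, reusing a single
// scratch vector instead of constructing a fresh result on every step.
// Returns true if some stack is fully consumed at the end.
bool toy_match(std::vector<toy_stack> stacks, const std::vector<uint32_t> & input) {
    std::vector<toy_stack> scratch;
    for (const uint32_t chr : input) {
        toy_accept(stacks, chr, scratch);
        std::swap(stacks, scratch);     // old stacks become the next scratch buffer
    }
    for (const toy_stack & s : stacks) {
        if (s.empty()) {
            return true;
        }
    }
    return false;
}
```

In this sketch only the outer vector's capacity is reused; the inner vectors are still rebuilt on each call. Any speedup depends on how often the function is called and how large the stack sets grow, so the 1.5x figure in the commit title should not be read as a property of this toy version.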