author     compilade <113953597+compilade@users.noreply.github.com>  2024-03-03 03:41:55 -0500
committer  GitHub <noreply@github.com>  2024-03-03 10:41:55 +0200
commit     de9692a7d2db66e29e5cb373c6551acc49145ccd
tree       3680f1b63254f37a704ac96d131669e580bf5865 /gguf-py
parent     e6029348e86c3810d4435faee54ba822cb43e2ef
llama : fix llama_copy_state_data with fragmented KV cache (#5840)
The row size of the saved states was based on kv_self.head while it should be based on llama_kv_cache_cell_max.

Existing session files should still work.

* llama : fix llama_kv_cache_cell_max inability to return 1

I've also changed its return type to uint32_t, because this function is always used to set the value of uint32_t variables, and because the index already has this type.

* llama : fix state size calculation

Some bytes in the state were unaccounted for in llama_get_state_size. Since the logits reserve so much space, it did not cause problems.
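To illustrate the "inability to return 1" fix, here is a minimal sketch of a cell-max scan. It is not the code from this commit: `Cell`, `cell_max`, `cell_max_skips_first`, and `fragmented_example` are hypothetical names, and the sketch assumes a cell counts as used when its `pos` is non-negative. It shows how a downward loop that stops before index 0 can never report a single used cell, and how a fragmented cache can have used cells beyond the first ones.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a KV cache cell: pos < 0 means the cell is unused.
struct Cell {
    int32_t pos = -1;
};

// Returns one past the index of the last used cell, or 0 if no cell is used.
// Counting i down from size to 1 and reading cells[i - 1] lets the loop
// inspect index 0, so the function is able to return 1.
uint32_t cell_max(const std::vector<Cell> & cells) {
    for (uint32_t i = static_cast<uint32_t>(cells.size()); i > 0; --i) {
        if (cells[i - 1].pos >= 0) {
            return i;
        }
    }
    return 0;
}

// Loop shape that can never return 1: it stops before looking at index 0.
uint32_t cell_max_skips_first(const std::vector<Cell> & cells) {
    if (cells.empty()) {
        return 0;
    }
    for (uint32_t i = static_cast<uint32_t>(cells.size()) - 1; i > 0; --i) {
        if (cells[i].pos >= 0) {
            return i + 1;
        }
    }
    return 0;
}

// Fragmented cache: cells 0 and 5 are used, with unused cells in between.
std::vector<Cell> fragmented_example() {
    std::vector<Cell> cells(8);
    cells[0].pos = 0;
    cells[5].pos = 1;
    return cells;
}
```

For the fragmented example, `cell_max` reports 6, so a saved state sized from it would cover every used cell. A write position lower than that, which is presumably what kv_self.head can be in a fragmented cache, would leave the later cells out.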
Diffstat (limited to 'gguf-py')
0 files changed, 0 insertions, 0 deletions