path: root/examples/server/tests/features/steps/steps.py
author: Georgi Gerganov <ggerganov@gmail.com>, 2024-02-25 22:12:24 +0200
committer: GitHub <noreply@github.com>, 2024-02-25 22:12:24 +0200
commit: bf08e00643fd529f748f0a858fd79f3061e3fa18
tree: 0043ee582e83a19c8f1ca6d75d1519038f866e1c /examples/server/tests/features/steps/steps.py
parent: f7625019c51ca437a5840576d92362cfa710e4a2
llama : refactor k-shift implementation + KV defragmentation (#5691)
* llama : refactor k-shift implementation
  ggml-ci
* llama : rename llama_kv_cache_seq_shift to llama_kv_cache_seq_add
* llama : cont k-shift refactoring + normalize type names
  ggml-ci
* minor : fix MPI builds
* llama : reuse n_rot from the build context
  ggml-ci
* llama : revert enum name changes from this PR
  ggml-ci
* llama : update llama_rope_type
* llama : add comment about rope values
* llama : fix build
* passkey : apply kv cache updates explicitly
  ggml-ci
* llama : change name to llama_kv_cache_update()
* llama : add llama_kv_cache_seq_pos_max()
* passkey : fix llama_kv_cache_seq_pos_max() usage
* llama : some llama_kv_cell simplifications
* llama : add llama_kv_cache_compress (EXPERIMENTAL)
* llama : add alternative KV cache merging (EXPERIMENTAL)
* llama : add llama_kv_cache_defrag
* llama : comments
* llama : remove llama_kv_cache_compress
  will add in a separate PR
  ggml-ci
* llama : defragment via non-overlapping moves
* llama : ggml_graph based defrag implementation
  ggml-ci
* llama : switch the loop order in build_defrag
* llama : add comments
Diffstat (limited to 'examples/server/tests/features/steps/steps.py')
0 files changed, 0 insertions, 0 deletions