ik_llama.cpp.git

author	Johannes Gäßler <johannesg@5d6.de>	2023-12-20 15:41:22 +0100
committer	GitHub <noreply@github.com>	2023-12-20 15:41:22 +0100
commit	799fc2268989482054944c902874cca76337580f (patch)
tree	f535df08f2059a709f8f5b8014d532f1aa086a2d /llama.cpp
parent	328b83de23b33240e28f4e74900d1d06726f5eb1 (diff)

CUDA: Faster Mixtral prompt processing (#4538)

* CUDA: make MoE tensors contiguous for batch size>1

* Update ggml-cuda.cu

Co-authored-by: slaren <slarengh@gmail.com>

---------

Co-authored-by: slaren <slarengh@gmail.com>

Diffstat (limited to 'llama.cpp')

0 files changed, 0 insertions, 0 deletions

