ik_llama.cpp.git

author	slaren <slarengh@gmail.com>	2023-12-26 21:23:59 +0100
committer	GitHub <noreply@github.com>	2023-12-26 21:23:59 +0100
commit	dc68f0054cd279cddddb0cae0c9ef4f9cbaa512a (patch)
tree	1c437ea7e78a09d3a1fc7786f42fd3ea8615b292 /gguf-py
parent	de8e496437c59e7d1cc84109e3e49a3478aee25a (diff)

cuda : fix vmm pool with multi GPU (#4620)

* cuda : fix vmm pool with multi GPU
* hip
* use recommended granularity instead of minimum
* better error checking
* fix mixtral
* use cudaMemcpy3DPeerAsync
* use cuda_pool_alloc in ggml_cuda_op_mul_mat
* consolidate error checking in ggml_cuda_set_device
* remove unnecessary inlines

ggml-ci

* style fixes
* only use vmm for the main device
* fix scratch buffer size, re-enable vmm pool for all devices
* remove unnecessary check id != g_main_device

Diffstat (limited to 'gguf-py')

0 files changed, 0 insertions, 0 deletions

