author | slaren <slarengh@gmail.com> | 2023-12-26 21:23:59 +0100 |
---|---|---|
committer | GitHub <noreply@github.com> | 2023-12-26 21:23:59 +0100 |
commit | dc68f0054cd279cddddb0cae0c9ef4f9cbaa512a (patch) | |
tree | 1c437ea7e78a09d3a1fc7786f42fd3ea8615b292 /gguf-py | |
parent | de8e496437c59e7d1cc84109e3e49a3478aee25a (diff) |

cuda : fix vmm pool with multi GPU (#4620)

* cuda : fix vmm pool with multi GPU
* hip
* use recommended granularity instead of minimum
* better error checking
* fix mixtral
* use cudaMemcpy3DPeerAsync
* use cuda_pool_alloc in ggml_cuda_op_mul_mat
* consolidate error checking in ggml_cuda_set_device
* remove unnecessary inlines
ggml-ci
* style fixes
* only use vmm for the main device
* fix scratch buffer size, re-enable vmm pool for all devices
* remove unnecessary check id != g_main_device
Diffstat (limited to 'gguf-py')
0 files changed, 0 insertions, 0 deletions