field | value | date |
---|---|---|
author | 0cc4m <picard12@live.de> | 2024-06-03 10:59:14 +0200 |
committer | GitHub <noreply@github.com> | 2024-06-03 10:59:14 +0200 |
commit | 3d7ebf63123b8652fb7bbecef7ba731202309901 | |
tree | 8adfcc3dd20946ece9c0b8d15b131823b24455ae /llama.cpp | |
parent | a10cda58d3199cd85305e0f03a8c6056714ae2e8 | |
Vulkan Mixture of Experts (MoE) support (#7628)
* Finish Vulkan mul_mat_id implementation
* Add Vulkan sum_rows and div ops
* Fix MUL_MAT_ID matrix matrix shader
* Fix MUL_MAT_ID matrix vector shader dispatch size
* Fix MUL_MAT_ID matrix vector shader and dispatch code
* Update Vulkan CPU offload for MUL_MAT_ID
* Fix crash when using split mode none and setting a main GPU
Diffstat (limited to 'llama.cpp')
-rw-r--r-- | llama.cpp | 2 |
1 files changed, 1 insertions, 1 deletions
@@ -16372,7 +16372,7 @@ struct llama_context * llama_new_context_with_model(
             return nullptr;
         }
         if (model->split_mode == LLAMA_SPLIT_MODE_NONE) {
-            ggml_backend_t backend = ggml_backend_vk_init(0);
+            ggml_backend_t backend = ggml_backend_vk_init(model->main_gpu);
             if (backend == nullptr) {
                 LLAMA_LOG_ERROR("%s: failed to initialize Vulkan backend\n", __func__);
                 llama_free(ctx);
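
The one-line change above can be illustrated in isolation. The following is a minimal sketch, not llama.cpp code: `SplitMode`, `ModelConfig`, `device_before_fix`, and `device_after_fix` are hypothetical stand-ins invented for this example. It only shows the difference the hunk makes in split mode none: the Vulkan backend was previously created on device 0 regardless of the requested main GPU, and is now created on the requested device.

```cpp
// Hypothetical stand-ins for illustration only; these are NOT the llama.cpp types.
enum class SplitMode { None, Layer, Row };

struct ModelConfig {
    SplitMode split_mode;  // how the model is distributed across devices
    int       main_gpu;    // device index requested by the user
};

// Behavior before the change: in single-device mode the backend was always
// initialized on device 0, so the requested index was ignored.
constexpr int device_before_fix(const ModelConfig & /*cfg*/) {
    return 0;
}

// Behavior after the change: the backend is initialized on the requested device.
constexpr int device_after_fix(const ModelConfig & cfg) {
    return cfg.main_gpu;
}

// With main_gpu = 1 the two versions disagree; with main_gpu = 0 they agree,
// which is consistent with the problem only appearing when a main GPU is set.
static_assert(device_before_fix(ModelConfig{SplitMode::None, 1}) == 0, "old: always device 0");
static_assert(device_after_fix(ModelConfig{SplitMode::None, 1}) == 1, "new: requested device");
```

Under these assumptions, a configuration with `main_gpu = 0` behaves identically before and after the change, so only setups that select a non-default device in split mode none are affected.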