author | Georgi Gerganov <ggerganov@gmail.com> | 2023-12-07 22:26:54 +0200 |
---|---|---|
committer | GitHub <noreply@github.com> | 2023-12-07 22:26:54 +0200 |
commit | fe680e3d1080a765e5d3150ffd7bab189742898d (patch) | |
tree | cd8be8bf5722d10596923aef7fb44bf8a58378d7 /ggml-cuda.h | |
parent | bcc0eb4591bec5ec02fad3f2bdcb1b265052ea56 (diff) |
sync : ggml (new ops, tests, backend, etc.) (#4359)
* sync : ggml (part 1)
* sync : ggml (part 2, CUDA)
* sync : ggml (part 3, Metal)
* ggml : build fixes
ggml-ci
* cuda : restore lost changes
* cuda : restore lost changes (StableLM rope)
* cmake : enable separable compilation for CUDA
ggml-ci
* ggml-cuda : remove device side dequantize
* Revert "cmake : enable separable compilation for CUDA"
This reverts commit 09e35d04b1c4ca67f9685690160b35bc885a89ac.
* cuda : remove assert for rope
* tests : add test-backend-ops
* ggml : fix bug in ggml_concat
* ggml : restore `ggml_get_n_tasks()` logic in `ggml_graph_plan()`
* ci : try to fix macOS
* ggml-backend : remove backend self-registration
* ci : disable Metal for macOS cmake build
ggml-ci
* metal : fix "supports family" call
* metal : fix assert
* metal : print resource path
ggml-ci
---------
Co-authored-by: slaren <slarengh@gmail.com>
Diffstat (limited to 'ggml-cuda.h')
-rw-r--r-- | ggml-cuda.h | 10 |
1 files changed, 9 insertions, 1 deletions
diff --git a/ggml-cuda.h b/ggml-cuda.h
index 528e66c3..cdb0c0c4 100644
--- a/ggml-cuda.h
+++ b/ggml-cuda.h
@@ -49,7 +49,15 @@ GGML_API int ggml_cuda_get_device_count(void);
 GGML_API void ggml_cuda_get_device_description(int device, char * description, size_t description_size);
 
 // backend API
-GGML_API ggml_backend_t ggml_backend_cuda_init(void); // TODO: take a list of devices to use
+GGML_API ggml_backend_t ggml_backend_cuda_init(int device);
+
+GGML_API bool ggml_backend_is_cuda(ggml_backend_t backend);
+GGML_API int ggml_backend_cuda_get_device(ggml_backend_t backend);
+
+GGML_API ggml_backend_buffer_type_t ggml_backend_cuda_buffer_type(int device);
+
+// pinned host buffer for use with CPU backend for faster copies between CPU and GPU
+GGML_API ggml_backend_buffer_type_t ggml_backend_cuda_host_buffer_type(void);
 
 #ifdef __cplusplus
 }
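
Note for callers: `ggml_backend_cuda_init` now takes a device index, so call sites that used the old `(void)` form have to choose one. The following is only a minimal sketch of how a caller might validate that index. `pick_cuda_device` is a hypothetical helper, not part of ggml, and the fallback-to-device-0 policy is an assumption made for illustration. The ggml calls appear only in the comment, because they need the real headers and a CUDA build.

```c
/* Hypothetical helper (NOT part of ggml): validate a requested device index
 * against the number of available devices.
 * Returns the index to use, or -1 when there is no device at all. */
int pick_cuda_device(int requested, int device_count) {
    if (device_count <= 0) {
        return -1;   /* no CUDA devices: the caller should use another backend */
    }
    if (requested < 0 || requested >= device_count) {
        return 0;    /* out of range: fall back to the first device (assumed policy) */
    }
    return requested;
}

/* Intended call site, assuming ggml-cuda.h and ggml-backend.h from this
 * commit are included and the build has CUDA enabled:
 *
 *   int device = pick_cuda_device(requested, ggml_cuda_get_device_count());
 *   if (device >= 0) {
 *       ggml_backend_t backend = ggml_backend_cuda_init(device);
 *       if (backend != NULL && ggml_backend_is_cuda(backend)) {
 *           int used = ggml_backend_cuda_get_device(backend);
 *           // ... build and compute graphs on device `used` ...
 *           ggml_backend_free(backend);
 *       }
 *   }
 */
```

Checking the index before calling `ggml_backend_cuda_init` keeps the fallback policy in the caller's hands and does not depend on how the backend reports an invalid device.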