sync : ggml (new ops, tests, backend, etc.) (#4359)

* sync : ggml (part 1)

* sync : ggml (part 2, CUDA)

* sync : ggml (part 3, Metal)

* ggml : build fixes

ggml-ci

* cuda : restore lost changes

* cuda : restore lost changes (StableLM rope)

* cmake : enable separable compilation for CUDA

ggml-ci

* ggml-cuda : remove device side dequantize

* Revert "cmake : enable separable compilation for CUDA"

This reverts commit 09e35d04b1c4ca67f9685690160b35bc885a89ac.

* cuda : remove assert for rope

* tests : add test-backend-ops

* ggml : fix bug in ggml_concat

* ggml : restore `ggml_get_n_tasks()` logic in `ggml_graph_plan()`

* ci : try to fix macOS

* ggml-backend : remove backend self-registration

* ci : disable Metal for macOS cmake build

ggml-ci

* metal : fix "supports family" call

* metal : fix assert

* metal : print resource path

ggml-ci

---------

Co-authored-by: slaren <slarengh@gmail.com>
author: Georgi Gerganov <ggerganov@gmail.com> 2023-12-07 22:26:54 +0200
committer: GitHub <noreply@github.com> 2023-12-07 22:26:54 +0200
commit: fe680e3d1080a765e5d3150ffd7bab189742898d (patch)
tree: cd8be8bf5722d10596923aef7fb44bf8a58378d7 /ggml-cuda.h
parent: bcc0eb4591bec5ec02fad3f2bdcb1b265052ea56 (diff)
1 files changed, 9 insertions, 1 deletions
diff --git a/ggml-cuda.h b/ggml-cuda.h
index 528e66c3..cdb0c0c4 100644
--- a/ggml-cuda.h
+++ b/ggml-cuda.h
@@ -49,7 +49,15 @@ GGML_API int    ggml_cuda_get_device_count(void);
 GGML_API void   ggml_cuda_get_device_description(int device, char * description, size_t description_size);
 
 // backend API
-GGML_API ggml_backend_t ggml_backend_cuda_init(void); // TODO: take a list of devices to use
+GGML_API ggml_backend_t ggml_backend_cuda_init(int device);
+
+GGML_API bool ggml_backend_is_cuda(ggml_backend_t backend);
+GGML_API int  ggml_backend_cuda_get_device(ggml_backend_t backend);
+
+GGML_API ggml_backend_buffer_type_t ggml_backend_cuda_buffer_type(int device);
+
+// pinned host buffer for use with CPU backend for faster copies between CPU and GPU
+GGML_API ggml_backend_buffer_type_t ggml_backend_cuda_host_buffer_type(void);
 
 #ifdef  __cplusplus
 }
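
The hunk above changes `ggml_backend_cuda_init` to take a device index and adds `ggml_backend_is_cuda` and `ggml_backend_cuda_get_device` for querying the result. The following is a minimal sketch of how a call site might adapt, not code from this commit. So that it builds without ggml or CUDA, the backend functions are replaced by stand-in stubs; the stub bodies, `STUB_DEVICE_COUNT`, the NULL-on-invalid-device behaviour, and the helper `init_and_query_device` are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in stubs (NOT the real ggml implementation) so this sketch builds
 * without ggml or CUDA. In real code, include "ggml-cuda.h" instead and
 * delete everything down to the "call-site sketch" comment. */
struct ggml_backend { bool is_cuda; int device; };
typedef struct ggml_backend * ggml_backend_t;

#define STUB_DEVICE_COUNT 2 /* hypothetical: pretend two GPUs are present */

static struct ggml_backend stub_backends[STUB_DEVICE_COUNT];

static int ggml_cuda_get_device_count(void) {
    return STUB_DEVICE_COUNT;
}

static ggml_backend_t ggml_backend_cuda_init(int device) {
    /* Assumption: an out-of-range device index yields NULL. */
    if (device < 0 || device >= STUB_DEVICE_COUNT) {
        return NULL;
    }
    stub_backends[device].is_cuda = true;
    stub_backends[device].device  = device;
    return &stub_backends[device];
}

static bool ggml_backend_is_cuda(ggml_backend_t backend) {
    return backend != NULL && backend->is_cuda;
}

static int ggml_backend_cuda_get_device(ggml_backend_t backend) {
    assert(backend != NULL);
    return backend->device;
}

/* Call-site sketch (hypothetical helper): initialise a backend on `device`
 * and return the device index the backend reports, or -1 on failure. */
static int init_and_query_device(int device) {
    if (device < 0 || device >= ggml_cuda_get_device_count()) {
        return -1; /* reject indices the runtime does not know about */
    }
    ggml_backend_t backend = ggml_backend_cuda_init(device);
    if (backend == NULL || !ggml_backend_is_cuda(backend)) {
        return -1;
    }
    return ggml_backend_cuda_get_device(backend);
}
```

Callers that previously wrote `ggml_backend_cuda_init()` would now pass an explicit index, for example `0` for the first device. Checking the index against `ggml_cuda_get_device_count()` first keeps the failure path in the caller rather than relying on how the backend handles a bad index.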