author | Radoslav Gerganov <rgerganov@gmail.com> | 2024-05-14 14:27:19 +0300 |
---|---|---|
committer | GitHub <noreply@github.com> | 2024-05-14 14:27:19 +0300 |
commit | 5e31828d3e35c76ecfee665bc23771a4bec1d130 (patch) | |
tree | 7f5f2edc7c3fc3e7655904316897e32202edd5d6 /ggml-rpc.h | |
parent | 541600201e6480f54ae09e58d16b154d4b4b331d (diff) |
ggml : add RPC backend (#6829)
* ggml : add RPC backend
The RPC backend proxies all operations to a remote server which runs a
regular backend (CPU, CUDA, Metal, etc).
* set TCP_NODELAY
* add CI workflows
* Address review comments
* fix warning
* implement llama_max_devices() for RPC
* Address review comments
* Address review comments
* wrap sockfd into a struct
* implement get_alignment and get_max_size
* add get_device_memory
* fix warning
* win32 support
* add README
* readme : trim trailing whitespace
* Address review comments
* win32 fix
* Address review comments
* fix compile warnings on macos
Diffstat (limited to 'ggml-rpc.h')
-rw-r--r-- | ggml-rpc.h | 24 |
1 files changed, 24 insertions, 0 deletions
diff --git a/ggml-rpc.h b/ggml-rpc.h
new file mode 100644
index 00000000..aa144832
--- /dev/null
+++ b/ggml-rpc.h
@@ -0,0 +1,24 @@
+#pragma once
+
+#include "ggml.h"
+#include "ggml-backend.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define GGML_RPC_MAX_SERVERS 16
+
+// backend API
+GGML_API GGML_CALL ggml_backend_t ggml_backend_rpc_init(const char * endpoint);
+GGML_API GGML_CALL bool ggml_backend_is_rpc(ggml_backend_t backend);
+
+GGML_API GGML_CALL ggml_backend_buffer_type_t ggml_backend_rpc_buffer_type(const char * endpoint);
+
+GGML_API GGML_CALL void ggml_backend_rpc_get_device_memory(const char * endpoint, size_t * free, size_t * total);
+
+GGML_API GGML_CALL void start_rpc_server(ggml_backend_t backend, const char * endpoint, size_t free_mem, size_t total_mem);
+
+#ifdef __cplusplus
+}
+#endif
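The header reports device memory through two out-parameters (`free`, `total`) and defines a `GGML_RPC_MAX_SERVERS` limit. The sketch below shows one way a caller might total memory across several endpoints. It is an illustration only, not code from this commit. `stub_rpc_get_device_memory` is a hypothetical stand-in that copies the signature of `ggml_backend_rpc_get_device_memory` and returns fixed numbers so the example compiles without ggml or a running server. `sum_device_memory` is likewise a hypothetical helper, and treating `GGML_RPC_MAX_SERVERS` as an upper bound on the endpoint count is an assumption.

```c
#include <assert.h>
#include <stddef.h>

#define GGML_RPC_MAX_SERVERS 16 /* same value as in ggml-rpc.h */

/* Hypothetical stand-in with the same signature as
 * ggml_backend_rpc_get_device_memory(). It ignores the endpoint and
 * reports fixed values (512 MiB free, 1024 MiB total). */
static void stub_rpc_get_device_memory(const char * endpoint, size_t * free_mem, size_t * total_mem) {
    (void) endpoint;
    *free_mem  = (size_t) 512  << 20;
    *total_mem = (size_t) 1024 << 20;
}

/* Hypothetical helper: sums free and total memory over the given endpoints.
 * Assumption: at most GGML_RPC_MAX_SERVERS endpoints are queried; extra
 * entries are ignored. Returns the number of endpoints actually queried. */
static size_t sum_device_memory(const char * const * endpoints, size_t n_endpoints,
                                size_t * free_sum, size_t * total_sum) {
    assert(endpoints != NULL && free_sum != NULL && total_sum != NULL);

    if (n_endpoints > GGML_RPC_MAX_SERVERS) {
        n_endpoints = GGML_RPC_MAX_SERVERS;
    }

    *free_sum  = 0;
    *total_sum = 0;
    for (size_t i = 0; i < n_endpoints; i++) {
        size_t free_mem  = 0;
        size_t total_mem = 0;
        /* With ggml linked in, this call would be
         * ggml_backend_rpc_get_device_memory(endpoints[i], &free_mem, &total_mem). */
        stub_rpc_get_device_memory(endpoints[i], &free_mem, &total_mem);
        *free_sum  += free_mem;
        *total_sum += total_mem;
    }
    return n_endpoints;
}
```

A caller would pass an array of endpoint strings (for example a placeholder such as `"localhost:50052"`; the exact endpoint format is not specified in this header) and read the totals back through `free_sum` and `total_sum`.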