author     slaren <slarengh@gmail.com>   2024-03-18 11:03:04 +0100
committer  GitHub <noreply@github.com>   2024-03-18 11:03:04 +0100
commit     2bf8d0f7c4cc1235755ad06961ca761e458c5e55 (patch)
tree       d2a462deb3c0e34cfb26eab6881a65bfb9fc3b28 /examples/llama-bench/llama-bench.cpp
parent     496bc79bc2b79bfd6124b8687a8dbd6a646e9b06 (diff)
backend : offload large batches to GPU (#6083)
* backend : offload large batches to GPU
* fix hip
* code cleanup
* fix CUDA split buffers
* Update ggml-backend-impl.h
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
* cuda : fix memset without set_device
* imatrix : remove sched affix from weight names
* sched : add a new split if the current one has too many inputs
reduce max inputs per split
more cleanup
* update backends
ggml-ci
---------
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
Diffstat (limited to 'examples/llama-bench/llama-bench.cpp')
-rw-r--r-- | examples/llama-bench/llama-bench.cpp | 4 |
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/examples/llama-bench/llama-bench.cpp b/examples/llama-bench/llama-bench.cpp
index 32eea786..4cb23080 100644
--- a/examples/llama-bench/llama-bench.cpp
+++ b/examples/llama-bench/llama-bench.cpp
@@ -114,10 +114,10 @@ static std::string get_cpu_info() {
 static std::string get_gpu_info() {
     std::string id;
 #ifdef GGML_USE_CUBLAS
-    int count = ggml_cuda_get_device_count();
+    int count = ggml_backend_cuda_get_device_count();
     for (int i = 0; i < count; i++) {
         char buf[128];
-        ggml_cuda_get_device_description(i, buf, sizeof(buf));
+        ggml_backend_cuda_get_device_description(i, buf, sizeof(buf));
         id += buf;
         if (i < count - 1) {
             id += "/";