ROCm Port (#1087)

* use hipblas based on cublas
* Update Makefile for the Cuda kernels
* Expand arch list and make it overrideable
* Fix multi GPU on multiple amd architectures with rocblas_initialize() (#5)
* add hipBLAS to README
* new build arg LLAMA_CUDA_MMQ_Y
* fix half2 decomposition
* Add intrinsics polyfills for AMD
* AMD assembly optimized __dp4a
* Allow overriding CC_TURING
* use "ROCm" instead of "CUDA"
* ignore all build dirs
* Add Dockerfiles
* fix llama-bench
* fix -nommq help for non CUDA/HIP

---------

Co-authored-by: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Co-authored-by: ardfork <134447697+ardfork@users.noreply.github.com>
Co-authored-by: funnbot <22226942+funnbot@users.noreply.github.com>
Co-authored-by: Engininja2 <139037756+Engininja2@users.noreply.github.com>
Co-authored-by: Kerfuffle <44031344+KerfuffleV2@users.noreply.github.com>
Co-authored-by: jammm <2500920+jammm@users.noreply.github.com>
Co-authored-by: jdecourval <7315817+jdecourval@users.noreply.github.com>
author: Henri Vasserman <henv@hot.ee> 2023-08-25 12:09:42 +0300
committer: GitHub <noreply@github.com> 2023-08-25 12:09:42 +0300
commit: 6bbc598a632560cb45dd2c51ad403bda8723b629 (patch)
tree: 53be13238531021865642158403fbf92c5a9ff58 /llama.cpp
parent: 3f460a2b723c8b936ac29ecfd02f244b3adeba55 (diff)
1 files changed, 1 insertions, 1 deletions
diff --git a/llama.cpp b/llama.cpp
index 52ba31d7..d12b6d1c 100644
--- a/llama.cpp
+++ b/llama.cpp
@@ -1836,7 +1836,7 @@ static void llm_load_tensors(
     (void) main_gpu;
     (void) mul_mat_q;
 #if defined(GGML_USE_CUBLAS)
-    LLAMA_LOG_INFO("%s: using CUDA for GPU acceleration\n", __func__);
+    LLAMA_LOG_INFO("%s: using " GGML_CUDA_NAME " for GPU acceleration\n", __func__);
     ggml_cuda_set_main_device(main_gpu);
     ggml_cuda_set_mul_mat_q(mul_mat_q);
 #define LLAMA_BACKEND_OFFLOAD       GGML_BACKEND_GPU
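
The one-line change above works because C joins adjacent string literals at compile time, so a macro that expands to a string literal can be placed in the middle of a format string. The sketch below illustrates that mechanism in isolation. The definition of GGML_CUDA_NAME is not part of this diff; the conditional shown here (keyed on GGML_USE_HIPBLAS, with the values "ROCm" and "CUDA") is an assumption based on the commit message, and format_backend_message is a hypothetical helper used only for illustration.

```c
#include <stdio.h>

/* Assumption: the real definition of GGML_CUDA_NAME lives outside this diff.
 * This stand-in picks the backend name at compile time. */
#if defined(GGML_USE_HIPBLAS)
#define GGML_CUDA_NAME "ROCm"
#else
#define GGML_CUDA_NAME "CUDA"
#endif

/* Hypothetical helper: formats the same message as the patched log call.
 * The three adjacent literals are merged into one format string by the
 * compiler, so only a single %s (the function name) remains. */
static int format_backend_message(char *buf, size_t size, const char *func) {
    return snprintf(buf, size,
                    "%s: using " GGML_CUDA_NAME " for GPU acceleration\n",
                    func);
}
```

Because the name is spliced in by the preprocessor rather than passed as an extra %s argument, the call site keeps the same argument list as before the patch.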