author | Georgi Gerganov <ggerganov@gmail.com> | 2023-12-01 10:51:24 +0200 |
---|---|---|
committer | GitHub <noreply@github.com> | 2023-12-01 10:51:24 +0200 |
commit | ef47ec18da469423c276b683dd9b5741cee7023e (patch) | |
tree | ec3b4780dbe8f629425de499b298e8eadfd1aa4d /ggml-alloc.c | |
parent | 1d144112c0fbbb4ecc07dbcf4f05a380148bd6de (diff) |
ggml : add ggml_soft_max_ext (#4256)
* metal : implement soft_max_ext
* cuda : implement soft_max_ext
* ggml : implement soft_max_ext (CPU)
* batched-bench : print threads
ggml-ci
* metal : simplify soft_max encoding
ggml-ci
* cuda : use 512 threads for soft_max instead of 32
* ggml : update soft max cpu
* cuda : do warp-based block reduce
* cuda : increase max block size to 1024
* cuda : fix warp reduction initialization of shared mem
* metal : warp-based reduction for soft max kernel
* metal : warp-based reduce for rms_norm
* metal : simplify soft max kernel
ggml-ci
* alloc : fix build with debug
Diffstat (limited to 'ggml-alloc.c')
-rw-r--r-- | ggml-alloc.c | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/ggml-alloc.c b/ggml-alloc.c
index cdfe4caf..0d4e12ae 100644
--- a/ggml-alloc.c
+++ b/ggml-alloc.c
@@ -137,7 +137,7 @@ void ggml_tallocr_alloc(ggml_tallocr_t alloc, struct ggml_tensor * tensor) {
 
 #ifdef GGML_ALLOCATOR_DEBUG
     add_allocated_tensor(alloc, tensor);
-    size_t cur_max = (char*)addr - (char*)alloc->data + size;
+    size_t cur_max = (char*)addr - (char*)alloc->base + size;
     if (cur_max > alloc->max_size) {
         printf("max_size = %.2f MB: tensors: ", cur_max / 1024.0 / 1024.0);
         for (int i = 0; i < 1024; i++) {
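
The changed line computes a debug-only high-water mark: the end offset of the new tensor relative to the start of the allocator's region. Judging by the "fix build with debug" note, the `data` field presumably no longer exists on the allocator and `base` is the field that now holds the region start. The sketch below shows the same arithmetic in isolation; `struct toy_alloc` and `toy_track_max` are hypothetical stand-ins, not the real `ggml_tallocr` definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the allocator state; only the fields used here. */
struct toy_alloc {
    void  *base;      /* start of the region the allocator manages */
    size_t max_size;  /* largest end offset seen so far, in bytes */
};

/* Record an allocation of `size` bytes at `addr` and return its end offset
 * relative to alloc->base, updating the high-water mark if it grew. */
static size_t toy_track_max(struct toy_alloc *alloc, const void *addr, size_t size) {
    assert((const char *)addr >= (const char *)alloc->base);

    size_t cur_max = (size_t)((const char *)addr - (const char *)alloc->base) + size;
    if (cur_max > alloc->max_size) {
        alloc->max_size = cur_max;
    }
    return cur_max;
}
```

Casting both pointers to `char *` before subtracting gives the distance in bytes, which is why the original line does the same.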