ggml : add ggml_row_size() (fixes llama out of space) (#4461)

* Fixes "Not enough space in the context's memory pool" encountered on certain models, which seems to be caused by some imprecision related to the automatic casting of floating point values

* do not cast to size_t, instead just use doubles

* ggml : add ggml_row_size(), deprecate ggml_type_sizef()

* ggml : fix row size compute to avoid overflows

* tests : fix sizey -> sizez

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
author: LostRuins <39025047+LostRuins@users.noreply.github.com> 2023-12-14 20:13:33 +0800
committer: GitHub <noreply@github.com> 2023-12-14 14:13:33 +0200
commit: 20a68a7030ee06e8eb7eb8e24ae4ac52dc17803f (patch)
tree: 3c84f1f362b064cdbbc2ec3044e47a38c9e44225 /ggml.h
parent: 55e87c3749cb4985c3b316984d40e00e4df4a5d0 (diff)
1 files changed, 7 insertions, 3 deletions
diff --git a/ggml.h b/ggml.h
index 1447646b..ae8101fa 100644
--- a/ggml.h
+++ b/ggml.h
@@ -641,9 +641,13 @@ extern "C" {
     GGML_API size_t  ggml_nbytes_pad  (const struct ggml_tensor * tensor); // same as ggml_nbytes() but padded to GGML_MEM_ALIGN
     GGML_API size_t  ggml_nbytes_split(const struct ggml_tensor * tensor, int nrows_split);
 
-    GGML_API int     ggml_blck_size (enum ggml_type type);
-    GGML_API size_t  ggml_type_size (enum ggml_type type); // size in bytes for all elements in a block
-    GGML_API float   ggml_type_sizef(enum ggml_type type); // ggml_type_size()/ggml_blck_size() as float
+    GGML_API int    ggml_blck_size(enum ggml_type type);
+    GGML_API size_t ggml_type_size(enum ggml_type type);             // size in bytes for all elements in a block
+    GGML_API size_t ggml_row_size (enum ggml_type type, int64_t ne); // size in bytes for all elements in a row
+
+    GGML_DEPRECATED(
+    GGML_API double ggml_type_sizef(enum ggml_type type), // ggml_type_size()/ggml_blck_size() as float
+    "use ggml_row_size() instead");
 
     GGML_API const char * ggml_type_name(enum ggml_type type);
     GGML_API const char * ggml_op_name  (enum ggml_op   op);
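
The new `ggml_row_size()` declaration above returns the byte size of a row as a `size_t`, so callers no longer multiply a fractional bytes-per-element value by an element count. The following is a minimal sketch of that idea in integer arithmetic only. It is not the library's implementation: `type_traits_sketch` and `row_size_sketch` are hypothetical names, and it assumes that `ne` is a whole number of blocks and that dividing before multiplying is an acceptable way to keep the intermediate value small.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for what ggml_type_size() and ggml_blck_size()
 * report for one type; not a ggml structure. */
struct type_traits_sketch {
    size_t  type_size; /* bytes occupied by one block       */
    int64_t blck_size; /* number of elements in one block   */
};

/* Size in bytes of a row holding `ne` elements, computed without any
 * floating point. Assumes `ne` is a multiple of the block size, so the
 * division is exact and can be done before the multiplication. */
size_t row_size_sketch(struct type_traits_sketch t, int64_t ne) {
    assert(t.blck_size > 0);
    assert(ne >= 0 && ne % t.blck_size == 0);
    return t.type_size * (size_t)(ne / t.blck_size);
}
```

For example, with an assumed layout of 18 bytes per 32-element block, `row_size_sketch((struct type_traits_sketch){18, 32}, 4096)` yields 2304 bytes. Because the result never passes through a `float` or `double`, it cannot be rounded below the number of bytes actually needed.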