author | Kawrakow <48489457+ikawrakow@users.noreply.github.com> | 2024-08-14 10:40:09 +0200 |
---|---|---|
committer | GitHub <noreply@github.com> | 2024-08-14 10:40:09 +0200 |
commit | 6c5384f20e8657a23aa9d4e0e9856d3d7563a12a (patch) | |
tree | ec7633442345b8e69eee23180d8cf56fe0f59811 /ggml/include/ggml.h | |
parent | bb5ff6fadec40c2e3aa3033dc68bec9367a0c9cc (diff) |
Skip barriers of noops (#19)
GGML_OP_RESHAPE, GGML_OP_VIEW, GGML_OP_PERMUTE, GGML_OP_TRANSPOSE,
along with GGML_OP_NONE, are all noops. I.e., nothing happens.
But ggml still has a barrier after them, which wastes time.
The waste is not too bad for large models where computations are
long compared to the time taken for thread synchronization.
But for small models skipping those unnecessary waits makes
a significant difference. E.g., for the 99M TriLM model,
TG-500 goes up to 1426 t/s from 1240 t/s (about 15% faster).
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
Diffstat (limited to 'ggml/include/ggml.h')
-rw-r--r-- | ggml/include/ggml.h | 2 |
1 files changed, 2 insertions, 0 deletions
diff --git a/ggml/include/ggml.h b/ggml/include/ggml.h
index b9b0284b..026993db 100644
--- a/ggml/include/ggml.h
+++ b/ggml/include/ggml.h
@@ -749,6 +749,8 @@ extern "C" {
     GGML_API GGML_CALL const char * ggml_op_name (enum ggml_op op);
     GGML_API const char * ggml_op_symbol(enum ggml_op op);
 
+    GGML_API GGML_CALL bool ggml_is_noop(const struct ggml_tensor * tensor);
+
     GGML_API const char * ggml_unary_op_name(enum ggml_unary_op op);
     GGML_API GGML_CALL const char * ggml_op_desc(const struct ggml_tensor * t); // unary or op name
 
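This view of the diff is limited to the header, so it shows only the declaration of `ggml_is_noop`, not its definition or the place where the barrier is skipped. The following is a minimal sketch of what such a predicate and its use might look like, based solely on the commit message. The `sketch_*` and `SKETCH_OP_*` names are hypothetical stand-ins, not the real ggml types, and `sketch_count_barriers` is an illustrative helper rather than anything in ggml.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for enum ggml_op: the five ops the commit message
 * calls noops, plus one compute op for contrast. */
enum sketch_op {
    SKETCH_OP_NONE,
    SKETCH_OP_RESHAPE,
    SKETCH_OP_VIEW,
    SKETCH_OP_PERMUTE,
    SKETCH_OP_TRANSPOSE,
    SKETCH_OP_MUL_MAT
};

/* Hypothetical stand-in for struct ggml_tensor; only the op field matters here. */
struct sketch_tensor {
    enum sketch_op op;
};

/* Assumed shape of the predicate: true for ops that do no computation. */
static bool sketch_is_noop(const struct sketch_tensor * tensor) {
    assert(tensor != NULL);
    switch (tensor->op) {
        case SKETCH_OP_NONE:
        case SKETCH_OP_RESHAPE:
        case SKETCH_OP_VIEW:
        case SKETCH_OP_PERMUTE:
        case SKETCH_OP_TRANSPOSE:
            return true;
        default:
            return false;
    }
}

/* Illustrative helper: counts how many thread barriers a walk over the
 * graph nodes would hit if noop nodes skip synchronization entirely. */
static int sketch_count_barriers(const struct sketch_tensor * nodes, int n_nodes) {
    int n_barriers = 0;
    for (int i = 0; i < n_nodes; ++i) {
        if (sketch_is_noop(&nodes[i])) {
            continue; /* nothing was computed, so there is nothing to wait for */
        }
        ++n_barriers; /* a real implementation would synchronize threads here */
    }
    return n_barriers;
}
```

In a graph with many reshape, view, permute, and transpose nodes between compute nodes, a check of this kind removes one synchronization per noop node, which is where the gain for small models described in the commit message would come from.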