author | Kawrakow <iwankawrakow@gmail.com> | 2024-09-28 13:37:25 +0300 |
---|---|---|
committer | GitHub <noreply@github.com> | 2024-09-28 13:37:25 +0300 |
commit | 737514fd814d944f8ce965620293a16e5e8a285d (patch) | |
tree | 4b4b79eec0d1cbcc413dd3c6991b6d57439edd86 /ggml/include/ggml.h | |
parent | 1f61e91862dd0b077ccb60459f3cc03f364ee279 (diff) |
Adding SWIGLU unary op (#65)
* Adding GGML_UNARY_OP_SWIGLU
This commit implements the ggml op and its CPU forward compute.
I see a ~3-4% speedup in PP-512 for Phi-3.5-mini.
* GGML_UNARY_OP_SWIGLU: CUDA implementation
I observe a ~12% speedup in PP-512 for Phi-3.5-mini.
* GGML_UNARY_OP_SWIGLU: Metal implementation
We get a ~2% speedup in PP-512 for Phi-3.5-mini.
* GGML_UNARY_OP_SWIGLU: minor improvement on Metal
* GGML_UNARY_OP_SWIGLU: cleanup
---------
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
Diffstat (limited to 'ggml/include/ggml.h')
-rw-r--r-- | ggml/include/ggml.h | 5 |
1 files changed, 5 insertions, 0 deletions
diff --git a/ggml/include/ggml.h b/ggml/include/ggml.h
index 6ac30b0f..36cc531f 100644
--- a/ggml/include/ggml.h
+++ b/ggml/include/ggml.h
@@ -564,6 +564,7 @@ extern "C" {
         GGML_UNARY_OP_SILU,
         GGML_UNARY_OP_HARDSWISH,
         GGML_UNARY_OP_HARDSIGMOID,
+        GGML_UNARY_OP_SWIGLU,
 
         GGML_UNARY_OP_COUNT,
     };
@@ -1127,6 +1128,10 @@ extern "C" {
             struct ggml_context * ctx,
             struct ggml_tensor * a);
 
+    GGML_API struct ggml_tensor * ggml_swiglu(
+            struct ggml_context * ctx,
+            struct ggml_tensor * a);
+
     // a - x
     // b - dy
     GGML_API struct ggml_tensor * ggml_silu_back(