author | Kawrakow <iwankawrakow@gmail.com> | 2025-04-03 07:15:49 +0200 |
---|---|---|
committer | GitHub <noreply@github.com> | 2025-04-03 07:15:49 +0200 |
commit | 07dbc1aa06d761634419759431ebb215baf698bb (patch) | |
tree | 6da38e23c3a954ddfa5ea26a9babb0b3ec334541 /examples/quantize-stats/quantize-stats.cpp | |
parent | 6d405d1fd1bddfe31fcbf00c9c8652a0a9166887 (diff) |
Metal: much faster MoE prompt processing (#307)
* MoE improvements on Metal
This version beats mainline, but there are things I don't understand:
* Mainline has effectively gone to GEMV for MUL_MAT_ID. We can do the
  same, but we are 30% slower. Why?
* Using actual GEMM, we beat mainline with a ubatch size of 128. But then
  performance degrades. Why?
* Some cleanup
* Much better
---------
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
Diffstat (limited to 'examples/quantize-stats/quantize-stats.cpp')
0 files changed, 0 insertions, 0 deletions