author | Kawrakow <iwankawrakow@gmail.com> | 2025-02-22 09:41:40 +0200 |
---|---|---|
committer | GitHub <noreply@github.com> | 2025-02-22 09:41:40 +0200 |
commit | 33646fc40949e0fdcf16d96f5b40d12bf93244a9 (patch) | |
tree | 51cc4f613469f8f761a27ef4910be81b4daf667a /examples/batched.swift | |
parent | c4a5103299e44adc8692e3e373c1974fa9fee270 (diff) |
Fuse MoE up and gate matrix multiplications (#219)
* This seems to be a better way
to do the attention matrix multiplications in the TG case.
* Cleanup
* Fuse up and gate gemms in MoE models
Small (~1-2%) but measurable performance gain
---------
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
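
For readers unfamiliar with the term, here is a minimal sketch of what fusing the up and gate matrix multiplications can look like for a single expert. It is not the code from this commit. The function names, the SiLU activation, and the row-major weight layout are assumptions chosen for illustration. The unfused version computes two separate products and then combines them; the fused version computes both dot products in one pass over the input and applies the activation immediately, so no intermediate vectors are stored.

```python
import math


def silu(v):
    # SiLU activation (assumed here; the commit does not name the activation).
    return v / (1.0 + math.exp(-v))


def up_gate_unfused(x, w_up, w_gate):
    # Two separate matrix-vector products, each producing a full intermediate vector.
    up = [sum(w * xi for w, xi in zip(row, x)) for row in w_up]
    gate = [sum(w * xi for w, xi in zip(row, x)) for row in w_gate]
    # A third pass combines the two intermediates.
    return [silu(g) * u for g, u in zip(gate, up)]


def up_gate_fused(x, w_up, w_gate):
    # Single pass: for each output row, accumulate both dot products while
    # reading x once, then apply the activation and multiply right away.
    out = []
    for row_up, row_gate in zip(w_up, w_gate):
        u = 0.0
        g = 0.0
        for xi, wu, wg in zip(x, row_up, row_gate):
            u += xi * wu
            g += xi * wg
        out.append(silu(g) * u)
    return out


# Hypothetical toy data: 2 input features, 2 output rows.
x = [1.0, 2.0]
w_up = [[1.0, 0.0], [0.0, 1.0]]
w_gate = [[0.0, 0.0], [1.0, 1.0]]
fused = up_gate_fused(x, w_up, w_gate)
unfused = up_gate_unfused(x, w_up, w_gate)
```

Both functions return the same values up to floating-point rounding. In a real kernel the benefit comes from reading the activations once and skipping the intermediate buffers, which is consistent with a small gain like the one reported above.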
Diffstat (limited to 'examples/batched.swift')
0 files changed, 0 insertions, 0 deletions