author: Kawrakow <iwankawrakow@gmail.com> 2025-02-22 09:41:40 +0200
committer: GitHub <noreply@github.com> 2025-02-22 09:41:40 +0200
commit: 33646fc40949e0fdcf16d96f5b40d12bf93244a9
tree: 51cc4f613469f8f761a27ef4910be81b4daf667a /src
parent: c4a5103299e44adc8692e3e373c1974fa9fee270
Fuse MoE up and gate matrix multiplications (#219)
* This seems to be a better way to do the attention matrix multiplications in the TG case.
* Cleanup
* Fuse up and gate gemms in MoE models

Small (~1-2%) but measurable performance gain

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
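As an illustration of what fusing the up and gate products can mean, here is a minimal sketch and not the code from this commit. It assumes a gated feed-forward block of the form `silu(gate) * up`, a single expert (no routing), and row-major weight matrices; the names `ffn_unfused`, `ffn_fused`, `w_up`, and `w_gate` are hypothetical. The fused variant reads each input element once for both dot products and applies the activation immediately, so no intermediate `up` and `gate` vectors are written out.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// SiLU activation. Assumed for illustration; the commit message does not
// name the activation function.
static float silu(float x) { return x / (1.0f + std::exp(-x)); }

// Unfused reference: two separate matrix-vector products, then the activation.
// w_up and w_gate are row-major [n_out x n_in] (hypothetical layout).
std::vector<float> ffn_unfused(const std::vector<float>& w_up,
                               const std::vector<float>& w_gate,
                               const std::vector<float>& x,
                               std::size_t n_out) {
    const std::size_t n_in = x.size();
    std::vector<float> up(n_out, 0.0f), gate(n_out, 0.0f), out(n_out);
    for (std::size_t i = 0; i < n_out; ++i)
        for (std::size_t j = 0; j < n_in; ++j)
            up[i] += w_up[i * n_in + j] * x[j];
    for (std::size_t i = 0; i < n_out; ++i)
        for (std::size_t j = 0; j < n_in; ++j)
            gate[i] += w_gate[i * n_in + j] * x[j];
    for (std::size_t i = 0; i < n_out; ++i)
        out[i] = silu(gate[i]) * up[i];
    return out;
}

// Fused variant: one pass over x per output row computes both dot products,
// and the activation is applied before anything is stored.
std::vector<float> ffn_fused(const std::vector<float>& w_up,
                             const std::vector<float>& w_gate,
                             const std::vector<float>& x,
                             std::size_t n_out) {
    const std::size_t n_in = x.size();
    std::vector<float> out(n_out);
    for (std::size_t i = 0; i < n_out; ++i) {
        float up = 0.0f, gate = 0.0f;
        for (std::size_t j = 0; j < n_in; ++j) {
            const float xj = x[j];  // loaded once, used for both products
            up   += w_up[i * n_in + j] * xj;
            gate += w_gate[i * n_in + j] * xj;
        }
        out[i] = silu(gate) * up;
    }
    return out;
}
```

Both functions should return the same values for the same inputs; the fused one only changes how the work is scheduled, which is consistent with a small gain rather than a large one.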
Diffstat (limited to 'src')
0 files changed, 0 insertions, 0 deletions