author | Kawrakow <iwankawrakow@gmail.com> | 2025-05-04 09:17:44 +0300 |
---|---|---|
committer | GitHub <noreply@github.com> | 2025-05-04 09:17:44 +0300 |
commit | ce2b0292e18cd8dd87776797fa455e7fc4cfeed9 | |
tree | a0c3b0eaad8e4a111faf05c7d10f7c2493f2846d /ggml/src/ggml-alloc.c | |
parent | b890e01238dd73e38c8cac99e34eed2071e1c4c1 |
CUDA: faster FA TG for GQA models (#370)
* cuda: WIP MMA FA
* Use MMA for TG also when quantized
---------
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
Diffstat (limited to 'ggml/src/ggml-alloc.c')
0 files changed, 0 insertions, 0 deletions