| author | Kawrakow <iwankawrakow@gmail.com> | 2025-05-04 09:17:44 +0300 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2025-05-04 09:17:44 +0300 |
| commit | ce2b0292e18cd8dd87776797fa455e7fc4cfeed9 | |
| tree | a0c3b0eaad8e4a111faf05c7d10f7c2493f2846d /examples/llava/android | |
| parent | b890e01238dd73e38c8cac99e34eed2071e1c4c1 | |
CUDA: faster FA TG for GQA models (#370)
* cuda: WIP MMA FA
* Use MMA for TG also when quantized
---------
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
Diffstat (limited to 'examples/llava/android')
0 files changed, 0 insertions, 0 deletions
