path: root/ggml/src/ggml-kompute.cpp
author Kawrakow <iwankawrakow@gmail.com> 2025-05-04 09:17:44 +0300
committer GitHub <noreply@github.com> 2025-05-04 09:17:44 +0300
commit ce2b0292e18cd8dd87776797fa455e7fc4cfeed9
tree a0c3b0eaad8e4a111faf05c7d10f7c2493f2846d /ggml/src/ggml-kompute.cpp
parent b890e01238dd73e38c8cac99e34eed2071e1c4c1
CUDA: faster FA TG for GQA models (#370)
* cuda: WIP MMA FA
* Use MMA for TG also when quantized

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
Diffstat (limited to 'ggml/src/ggml-kompute.cpp')
0 files changed, 0 insertions, 0 deletions