author Kawrakow <iwankawrakow@gmail.com> 2025-02-15 08:45:45 +0200
committer GitHub <noreply@github.com> 2025-02-15 08:45:45 +0200
commit 0551e7630b2e74dfd140aadbf6a136c58199f474
tree a96f395a2daac48d31cb1eedd1621b4943f61cac /examples/passkey/passkey.cpp
parent 8e94b29e35d69a6721c35767222d16dc65df2db6
Moving 4D gemm logic from ggml.c to iqk_mul_mat.cpp (#207)
This allows us to optimize TG performance for GQA models. E.g., for IQ4_XS L3-8B with 8k, TG-64 goes from 8.6 to 10.26 t/s.

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
Diffstat (limited to 'examples/passkey/passkey.cpp')
0 files changed, 0 insertions, 0 deletions