path: root/ggml/src/ggml-rpc.cpp
author Kawrakow <iwankawrakow@gmail.com> 2025-06-23 15:50:24 +0200
committer GitHub <noreply@github.com> 2025-06-23 15:50:24 +0200
commit ddda4d9e64fa889389b784f28da6453f14137452 (patch)
tree a5e58bf26d55e181c9ee10b74f328281dbe5df37 /ggml/src/ggml-rpc.cpp
parent 4776dd280976784eb0abd743186cc30370104b78 (diff)
Much faster prompt processing for I-quants (ARM_NEON) (#550)
* iq2_xxs 55.8 -> 167.5 t/s. iq2_xxs is at 93.7 t/s
* iq2_xs 46.4 -> 166.6 t/s. iq2_xs_r4 is at 72.3 t/s.
* iq2_s 42.8 t/s -> 166.8 t/s. iq2_s_r4 is at 71.1 t/s.
* iq3_xxs 51.8 t/s -> 165.6 t/s. iq3_xxs_r4 is at 84.6 t/s.
* iq3_s 46.0 t/s -> 162.0 t/s. iq3_s_r4 is at 79.4 t/s

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
Diffstat (limited to 'ggml/src/ggml-rpc.cpp')
0 files changed, 0 insertions, 0 deletions