author: Kawrakow <48489457+ikawrakow@users.noreply.github.com> 2024-09-04 07:24:04 +0300
committer: GitHub <noreply@github.com> 2024-09-04 07:24:04 +0300
commit: f17d0d72f565bf24d6eb8aa67d6618cdc143961d
tree: 69374b9512d52a47ba410a6982db0e2db1f1f236 /examples
parent: 8c94dcd43350b6bde8f5618f7e0e9f0b400a2ac6
Performance improvements for legacy quants on ARM_NEON (#37)

* WIP: trying to improve legacy quants

* WIP: trying to improve legacy quants

  With this commit PP-512 for LlaMA-3.1-8B goes from 72 t/s to 87.2 t/s
  for q4_0, and from 61.5 t/s to 73.9 t/s for q4_1, so 20+% improvement
  for both.

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>

Diffstat (limited to 'examples')
0 files changed, 0 insertions, 0 deletions