ik_llama.cpp.git

author	Kawrakow <iwankawrakow@gmail.com>	2024-09-16 16:47:36 +0300
committer	GitHub <noreply@github.com>	2024-09-16 16:47:36 +0300
commit	2874b984006c6c8d0691ce000dcd9ca2cf9ff6fd (patch)
tree	4244cf6b022a6eb728f5d0eb3ba94a739681e345 /examples/server/tests/tests.sh
parent	20f3e6fd2de6378d2a598b48edce369642bf2ee8 (diff)

iqk_mul_mat(ARM_NEON): adding bf16 support (#41)

It looks like ArmV8 ISA has support for bf16, but my M2 Max does not have it, so resorting to bf16 -> f32 conversion and computations in f32. This is 2x slower than f16, but 8x better compared to what I get if I try to run a bf16 model on the M2 (NEON and Metal).

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
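
For readers unfamiliar with the bf16 -> f32 fallback mentioned in the commit message, here is a minimal scalar sketch of the idea. It is not the code from this commit: the names `bf16_to_f32` and `dot_bf16_f32` are hypothetical, and the actual kernel presumably works on NEON vectors rather than one element at a time. It relies only on the fact that a bf16 value is the upper 16 bits of an IEEE-754 f32, so widening is a 16-bit left shift.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not from the commit): widen one bf16 value,
// stored as its raw 16 bits, to f32. bf16 is the top half of an f32,
// so the conversion is exact and needs only a shift.
static inline float bf16_to_f32(uint16_t bits) {
    uint32_t wide = static_cast<uint32_t>(bits) << 16;
    float out;
    std::memcpy(&out, &wide, sizeof(out));  // bit copy without aliasing issues
    return out;
}

// Hypothetical scalar dot product: bf16 weights times f32 activations,
// with every multiply and add carried out in f32.
static float dot_bf16_f32(const uint16_t* x, const float* y, size_t n) {
    assert(n == 0 || (x != nullptr && y != nullptr));
    float sum = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        sum += bf16_to_f32(x[i]) * y[i];
    }
    return sum;
}
```

Because the conversion is a plain shift, it is cheap per element, but the arithmetic then runs on f32 lanes, half as many per vector register as f16. That is one plausible reading of the 2x gap to f16 reported above.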

Diffstat (limited to 'examples/server/tests/tests.sh')

0 files changed, 0 insertions, 0 deletions
