ik_llama.cpp.git

author	Kawrakow <iwankawrakow@gmail.com>	2025-02-09 18:59:33 +0200
committer	GitHub <noreply@github.com>	2025-02-09 18:59:33 +0200
commit	cae2b81155fdad75b7beab3a835c438120412969 (patch)
tree	e5b84d2744af15e1218db1ac935b4bfc1c499cb0 /gguf-py/gguf/tensor_mapping.py
parent	33390c4b74fa52875d6028c5c9aaf84f17288c25 (diff)

FA: Add option to build all FA kernels (#197)

Similar to the CUDA situation, this option is OFF by default. When OFF, only the F16, Q8_0, and Q6_0 FA kernels are included, plus the BF16 kernels if the CPU provides native BF16 support.

To enable all kernels:

    cmake -DGGML_IQK_FA_ALL_QUANTS=1 ...

Leaving the option OFF cuts compilation time for iqk_mul_mat.cpp by almost half (45 seconds vs 81 seconds on my Ryzen-7950X).

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
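As a rough illustration of the gating described in this message, the sketch below lists which FA kernel types would end up in a build. It is not code from this commit. It assumes that the CMake option reaches the compiler as a preprocessor macro of the same name, and it uses the compiler-defined `__AVX512BF16__` macro as a stand-in for "native BF16 support". The function name `compiled_fa_types` and the placeholder entry for the remaining quant types are hypothetical.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: reports which FA kernel types this translation unit
// would include, following the rules stated in the commit message.
std::vector<std::string> compiled_fa_types() {
    // Always built, whether the option is ON or OFF.
    std::vector<std::string> types = {"F16", "Q8_0", "Q6_0"};

#if defined(__AVX512BF16__)
    // Assumption: this compiler macro approximates "native BF16 support".
    types.push_back("BF16");
#endif

#if defined(GGML_IQK_FA_ALL_QUANTS)
    // Assumption: the CMake option is forwarded as a macro of the same name.
    // The actual list of extra types is not given in the commit message,
    // so a placeholder entry is used here.
    types.push_back("<remaining quant types>");
#endif

    return types;
}
```

Compiled without any extra definitions on a CPU target lacking BF16 instructions, the function returns only the three default types. Adding `-DGGML_IQK_FA_ALL_QUANTS` on the compiler command line appends the placeholder entry, which mirrors the trade-off between kernel coverage and compile time that the message describes.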

Diffstat (limited to 'gguf-py/gguf/tensor_mapping.py')

0 files changed, 0 insertions, 0 deletions

