ik_llama.cpp.git

author	Iwan Kawrakow <iwan.kawrakow@gmail.com>	2024-06-22 10:18:41 +0300
committer	Iwan Kawrakow <iwan.kawrakow@gmail.com>	2024-06-22 12:02:52 +0300
commit	8c936e3d6593bec82975ba93bec05f9f03bb21f3 (patch)
tree	905607768a802ee341ab95a682a40529db913d92 /iqk-quantize.cpp
parent	fc04994ebf8bfcb988a913cdd331bb120389bc44 (diff)

bitnet: replace ggml_mul with ggml_scale to apply the scales

Also save one scale operation in the FFN by adjusting rms_eps. This gains up to 3% in performance, but it is a bit of a hack: the tensor scales are stored in op_params while the model is loaded.
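
The message does not spell out how adjusting rms_eps removes a scale operation. One plausible reading, stated here as an assumption and not taken from the commit's code, is the identity rms_norm(s*x, eps) == rms_norm(x, eps / s^2) for a positive scalar s: a scale applied just before an RMS norm can be folded into the epsilon. The sketch below checks that identity on plain vectors; the function names are hypothetical and it does not use the ggml API.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Plain RMS norm over a non-empty vector: y_i = x_i / sqrt(mean(x^2) + eps).
std::vector<double> rms_norm(const std::vector<double>& x, double eps) {
    double sum_sq = 0.0;
    for (double v : x) sum_sq += v * v;
    const double inv = 1.0 / std::sqrt(sum_sq / static_cast<double>(x.size()) + eps);
    std::vector<double> y(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) y[i] = x[i] * inv;
    return y;
}

// Reference path (hypothetical name): explicit scale by s, then normalize.
std::vector<double> scale_then_norm(const std::vector<double>& x, double s, double eps) {
    std::vector<double> scaled(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) scaled[i] = s * x[i];
    return rms_norm(scaled, eps);
}

// Folded path (hypothetical name): no scale pass, epsilon adjusted instead.
// Assumes s > 0; for negative s the result would differ by a sign.
std::vector<double> norm_with_folded_scale(const std::vector<double>& x, double s, double eps) {
    return rms_norm(x, eps / (s * s));
}

// Largest element-wise absolute difference between two equally sized vectors.
double max_abs_diff(const std::vector<double>& a, const std::vector<double>& b) {
    double d = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) d = std::fmax(d, std::fabs(a[i] - b[i]));
    return d;
}
```

Under this assumption, the two paths agree up to rounding, so the explicit scale pass over the activations can be dropped at the cost of a per-tensor epsilon.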

Diffstat (limited to 'iqk-quantize.cpp')

0 files changed, 0 insertions, 0 deletions

