| author | Iwan Kawrakow <iwan.kawrakow@gmail.com> | 2024-06-19 17:14:42 +0200 |
|---|---|---|
| committer | Iwan Kawrakow <iwan.kawrakow@gmail.com> | 2024-06-22 12:02:52 +0300 |
| commit | ad60fb35677c6fffdc0b17ac61f1796f416a8e8f (patch) | |
| tree | 98e79b1fd8ffd082c1f054e69206ae92e457d0c2 /ggml-cuda/template-instances/mmq-instance-q5_1.cu | |
| parent | 257fa740148293d69aaaeeca5b22450221e34ea4 (diff) | |
bitnet(scale in a separate tensor): replace ggml_mul with ggml_scale
This recovers part of the performance loss. On Metal, TG-128 is now
92 t/s, still short of the ~100 t/s with scales applied on the fly.
Diffstat (limited to 'ggml-cuda/template-instances/mmq-instance-q5_1.cu')
0 files changed, 0 insertions, 0 deletions
