author | Kawrakow <iwankawrakow@gmail.com> | 2025-07-08 19:44:48 +0200 |
---|---|---|
committer | GitHub <noreply@github.com> | 2025-07-08 19:44:48 +0200 |
commit | 97c34f4056067e167ed4508366f74b49e60202f7 (patch) | |
tree | 4ccdcc9ab35a3544c53aae6de72355d3c4603c08 /include/llama.h | |
parent | 4c0b66026619cf51f45249181bf2cc1de8cd6884 (diff) |
Faster prompt processing for IQ2_KS, IQ2_K, IQ2_K_R4 (#593)
* cuda: faster MMQ for iq2_ks, iq2_k, iq2_k_r4
* Lookup is still better for MMQ if we get 4 values at once
* Minor
---------
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
Diffstat (limited to 'include/llama.h')
0 files changed, 0 insertions, 0 deletions