| author | Iwan Kawrakow <iwan.kawrakow@gmail.com> | 2024-07-30 17:18:31 +0300 |
|---|---|---|
| committer | Kawrakow <48489457+ikawrakow@users.noreply.github.com> | 2024-08-01 09:38:06 +0200 |
| commit | fd1ae85a329e8148d1de20dc6ef5302110d53b73 (patch) | |
| tree | 8bc375e60041fde1e5d96954a170de86ebfaea8d /ggml/src/ggml-cuda/template-instances/mmq-instance-iq3_s.cu | |
| parent | 0d19d19af88a508ee8987abe5fc4f8fcaaa1dc2d (diff) | |

iq3_k: faster CUDA dot product

138 t/s for LLaMA-3.1-8B, which is almost on par with iq3_s.

Diffstat (limited to 'ggml/src/ggml-cuda/template-instances/mmq-instance-iq3_s.cu')
0 files changed, 0 insertions, 0 deletions
