field | value | date |
---|---|---|
author | Kawrakow <48489457+ikawrakow@users.noreply.github.com> | 2024-09-11 19:55:42 +0300 |
committer | GitHub <noreply@github.com> | 2024-09-11 19:55:42 +0300 |
commit | c920195edd80ab24beb9a0fd3e2f4df582e735d0 | |
tree | 7874c33edd5a407f72d1d0733093e97283377659 /ggml/src/ggml.c | |
parent | d98a6753a63d970ebdc01c2b7b4f198644eef81c | |
AVX2 Flash Attention 2 (#50)
* AVX2 Flash Attention: add ability to use Q8_0 for kv-cache
* AVX2 Flash Attention: add ability to use Q4_0 for kv-cache
* AVX2 Flash Attention: add ability to use Q4_1 for kv-cache
* Fix Zen4
---------
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
Diffstat (limited to 'ggml/src/ggml.c')
0 files changed, 0 insertions, 0 deletions