path: root/ggml/src/ggml-cuda/fattn-common.cuh
Age | Commit message | Author
2024-10-22 | Enable q6_0 for flash attention (#101) | Kawrakow
2024-10-21 | Enable IQ4_NL for KV-cache in token generation using Flash Attention (#99) | Kawrakow
2024-09-09 | Add CUDA support for IQ1_TN (#45) | Kawrakow
2024-08-27 | Faster Gemma2 (#27) | Kawrakow
2024-08-12 | Merge mainline - Aug 12 2024 (#17) | Kawrakow
2024-07-27 | Merge mainline llama.cpp (#3) | Kawrakow