ik_llama.cpp.git

Age	Commit message	Author
2024-06-10	CUDA: use tensor cores for MMQ (#7676)	Johannes Gäßler
2024-06-01	CUDA: fix Pascal FA, deq. KV to FP16 for batch > 8 (#7681)	Johannes Gäßler
2024-06-01	CUDA: quantized KV support for FA vec (#7527)	Johannes Gäßler
2024-05-18	CUDA: deduplicate FlashAttention code (#7352)	Johannes Gäßler
2024-05-12	CUDA: add FP32 FlashAttention vector kernel (#7188)	Johannes Gäßler