ik_llama.cpp.git

Age	Commit message	Author
2024-09-04	Performance improvements for legacy quants on ARM_NEON (#37)	Kawrakow
2024-09-04	Zen4 Flash Attention 2 (#36)	Kawrakow
2024-09-02	Fix Zen4 Flash Attention (#35)	Kawrakow
2024-09-02	Do not process prompts containing binary data for escapes (#33)	Kawrakow
2024-09-01	Zen4 Flash Attention (#32)	Kawrakow
2024-08-31	Fix build when iqk_mul_mat is disabled (#31)	Kawrakow
2024-08-27	Faster Gemma2 (#27)	Kawrakow
2024-08-21	softcap: minor improvement (#24)	Kawrakow
2024-08-20	Fused soft cap and SIMD-ified GeLU (#9)	Kawrakow
2024-08-20	iq4_k: use iq5_k also when n_gqa = 2 (#23)	Kawrakow
2024-08-19	AVX2 quantization for Q8_K (#22)	Kawrakow
2024-08-19	quantize_stats: print rmse and max error as fraction of <x> (#21)	Kawrakow
2024-08-19	iq2_k: slightly better bpw - accuracy compromise (#20)	Kawrakow
2024-08-14	Skip barriers of noops (#19)	Kawrakow
2024-08-12	Update README.md	Kawrakow
2024-08-12	Merge mainline - Aug 12 2024 (#17)	Kawrakow
2024-08-09	Fix Makefile	Iwan Kawrakow
2024-08-09	Fix Zen4 implementation of iq3_k, iq4_k, iq5_k	Iwan Kawrakow
2024-08-09	iq6_k: AVX2	Iwan Kawrakow
2024-08-09	iq6_k: Metal	Iwan Kawrakow
2024-08-09	iq6_k: NEON	Iwan Kawrakow
2024-08-09	iq6_k: slightly better Zen4 iqk_mul_mat	Iwan Kawrakow
2024-08-09	iq6_k: Zen4 iqk_mul_mat	Iwan Kawrakow
2024-08-09	iq6_k: CUDA dot product	Iwan Kawrakow
2024-08-09	iq6_k: CUDA dequantize	Iwan Kawrakow
2024-08-09	iq6_k: WIP (quantize/dequantize)	Iwan Kawrakow
2024-08-09	iq6_k: WIP (nothing works)	Iwan Kawrakow
2024-08-07	Adding IQ2_TN for use with ternary models (#13)	Kawrakow
2024-08-05	q2_K: allow it to detect ternary nets and quantize accordingly	Iwan Kawrakow
2024-08-05	Update README.md	Kawrakow
2024-08-05	iq3_k, iq5_k: faster quantization	Iwan Kawrakow
2024-08-03	iq4_k: speedup quantization by a factor of ~2	Iwan Kawrakow
2024-08-01	Add copyright notice	Iwan Kawrakow
2024-08-01	iq2/3_k: tiny bit faster Metal dot products	Iwan Kawrakow
2024-08-01	iq3_k: slightly faster Metal dequantize kernel	Iwan Kawrakow
2024-08-01	iq3_k: Metal dot product	Iwan Kawrakow
2024-08-01	iq2_k: Metal dot product finally works	Iwan Kawrakow
2024-08-01	iq3_k: Metal dequantize	Iwan Kawrakow
2024-08-01	iq3_k: NEON	Iwan Kawrakow
2024-08-01	iq3_k: AVX2 iqk_mul_mat	Iwan Kawrakow
2024-08-01	iq3_k: AVX512 iqk_mul_mat	Iwan Kawrakow
2024-08-01	iq3_k: faster CUDA dot product	Iwan Kawrakow
2024-08-01	iq3_k: CUDA dot product	Iwan Kawrakow
2024-08-01	iq3_k: Basics	Iwan Kawrakow
2024-08-01	iq2_k: very slightly better CUDA dot product	Iwan Kawrakow
2024-08-01	iq2_k: better CUDA dot product	Iwan Kawrakow
2024-08-01	iq2_k: CUDA dot product finally works	Iwan Kawrakow
2024-08-01	iq5_k: CUDA dot product finally works	Iwan Kawrakow
2024-08-01	Factor out iqk CUDA dot products	Iwan Kawrakow
2024-08-01	iq5_k: CUDA dot product still not working	Iwan Kawrakow