path: root/examples/quantize/quantize.cpp
Age        | Commit message | Author
2024-12-17 | IQ2_K_R4 (#146) | Kawrakow
2024-12-17 | IQ3_K_R4 (#145) | Kawrakow
2024-12-15 | BF16_R16 - 16 interleaved bf16 rows (#142) | Kawrakow
2024-12-14 | Q8_K_R8: Fastest quantized matrix multiplications (#141) | Kawrakow
2024-12-12 | IQ4_K_R4 (#138) | Kawrakow
2024-12-11 | Q2_K_R4 (#136) | Kawrakow
2024-12-11 | Q3_K_R4 (#134) | Kawrakow
2024-12-10 | Q5_K_R4 (#132) | Kawrakow
2024-12-10 | Q6_K_R4 (#130) | Kawrakow
2024-12-09 | Q4_K_R4 (#129) | Kawrakow
2024-12-08 | Rename iq4_nl_x4 to iq4_nl_r4 (#126) | Kawrakow
2024-12-06 | iq2_bn_r4: fastest Bitnet CPU implementation on the planet (#124) | Kawrakow
2024-12-04 | IQ4_XS_R4 (#123) | Kawrakow
2024-12-03 | Q6_0_R4 (#122) | Kawrakow
2024-12-03 | Q5_0_R4 (#121) | Kawrakow
2024-12-03 | Q8_0_R4 (#120) | Kawrakow
2024-12-02 | Q4_0_R4 (#119) | Kawrakow
2024-12-02 | IQ4_NL_X4 (#118) | Kawrakow
2024-10-25 | Bitnet changes (#106) | Kawrakow
2024-10-18 | CLI - Specify GGML_TYPE to quantize for the main tensors. (#91) | Nexes the Elder
2024-10-16 | Adding IQ4_KSS: 4.0 bpw quants (#89) | Kawrakow
2024-10-13 | IQ2_KS: 2.1875 bpw non-linear quantization (#85) | Kawrakow
2024-10-09 | New SOTA quantization: 4.25 bpw IQ4_KS (#83) | Kawrakow
2024-10-02 | Adding Q6_0 (#77) | Kawrakow
2024-09-27 | Adding ability to have meta data per tensor row (#61) | Kawrakow
2024-09-09 | Adding IQ1_TN - 1.6875 bpw for TriLM ternary models (#44) | Kawrakow
2024-08-12 | Merge mainline - Aug 12 2024 (#17) | Kawrakow
2024-08-09 | iq6_k: WIP (quantize/dequantize) | Iwan Kawrakow
2024-08-07 | Adding IQ2_TN for use with ternary models (#13) | Kawrakow
2024-08-05 | q2_K: allow it to detect ternary nets and quantize accordingly | Iwan Kawrakow
2024-08-01 | iq3_k: Basics | Iwan Kawrakow
2024-08-01 | iq5_k: Basics | Iwan Kawrakow
2024-08-01 | iq2_k: Basics | Iwan Kawrakow
2024-07-28 | IQ4_K: SOTA 4-bit quantization (#6) | Kawrakow
2024-07-27 | Merge mainline llama.cpp (#3) | Kawrakow
2024-06-24 | Bitnet: tiny bity faster 1.625 bpw variant on Metal | Iwan Kawrakow
2024-06-22 | bitnet: add 2 bpw quantization | Iwan Kawrakow
2024-06-22 | bitnet: CUDA, scalar, AVX2 | Iwan Kawrakow
2024-05-22 | common : normalize naming style (#7462) | Georgi Gerganov
2024-05-19 | quantize : fix --keep-split check (#7374) | Fred Douglas
2024-05-08 | ggml : introduce bfloat16 support (#6412) | Justine Tunney
2024-04-26 | quantize: add imatrix and dataset metadata in GGUF (#6658) | Pierrick Hymbert
2024-04-25 | quantize : add '--keep-split' to quantize model into shards (#6688) | jiez
2024-04-03 | ggml : mul_mat_id use the same tensor for all the experts (#6387) | slaren
2024-03-26 | IQ1_M: 1.75 bpw quantization (#6302) | Kawrakow
2024-03-26 | quantize : be able to override metadata by key (#6321) | Kawrakow
2024-03-22 | quantize: options for output and token embedding tensors qtype (#6239) | Kawrakow
2024-02-27 | IQ4_XS: a 4.25 bpw quantization (#5747) | Kawrakow
2024-02-26 | Adding IQ2_S and IQ2_M to complete coverage of the 2-3 bit quantization range... | Kawrakow
2024-02-24 | IQ3_S: a much better alternative to Q3_K (#5676) | Kawrakow