ik_llama.cpp.git

Age	Commit message	Author
2024-10-16	Adding IQ4_KSS: 4.0 bpw quants (#89)	Kawrakow
2024-10-13	IQ2_KS: 2.1875 bpw non-linear quantization (#85)	Kawrakow
2024-10-09	New SOTA quantization: 4.25 bpw IQ4_KS (#83)	Kawrakow
2024-10-02	Adding Q6_0 (#77)	Kawrakow
2024-09-27	Adding ability to have meta data per tensor row (#61)	Kawrakow
2024-09-09	Adding IQ1_TN - 1.6875 bpw for TriLM ternary models (#44)	Kawrakow
2024-08-12	Merge mainline - Aug 12 2024 (#17)	Kawrakow
2024-08-09	iq6_k: WIP (quantize/dequantize)	Iwan Kawrakow
2024-08-07	Adding IQ2_TN for use with ternary models (#13)	Kawrakow
2024-08-05	q2_K: allow it to detect ternary nets and quantize accordingly	Iwan Kawrakow
2024-08-01	iq3_k: Basics	Iwan Kawrakow
2024-08-01	iq5_k: Basics	Iwan Kawrakow
2024-08-01	iq2_k: Basics	Iwan Kawrakow
2024-07-28	IQ4_K: SOTA 4-bit quantization (#6)	Kawrakow
2024-07-27	Merge mainline llama.cpp (#3)	Kawrakow
2024-06-24	Bitnet: tiny bity faster 1.625 bpw variant on Metal	Iwan Kawrakow
2024-06-22	bitnet: add 2 bpw quantization	Iwan Kawrakow
2024-06-22	bitnet: CUDA, scalar, AVX2	Iwan Kawrakow
2024-05-22	common : normalize naming style (#7462)	Georgi Gerganov
2024-05-19	quantize : fix --keep-split check (#7374)	Fred Douglas
2024-05-08	ggml : introduce bfloat16 support (#6412)	Justine Tunney
2024-04-26	quantize: add imatrix and dataset metadata in GGUF (#6658)	Pierrick Hymbert
2024-04-25	quantize : add '--keep-split' to quantize model into shards (#6688)	jiez
2024-04-03	ggml : mul_mat_id use the same tensor for all the experts (#6387)	slaren
2024-03-26	IQ1_M: 1.75 bpw quantization (#6302)	Kawrakow
2024-03-26	quantize : be able to override metadata by key (#6321)	Kawrakow
2024-03-22	quantize: options for output and token embedding tensors qtype (#6239)	Kawrakow
2024-02-27	IQ4_XS: a 4.25 bpw quantization (#5747)	Kawrakow
2024-02-26	Adding IQ2_S and IQ2_M to complete coverage of the 2-3 bit quantization range...	Kawrakow
2024-02-24	IQ3_S: a much better alternative to Q3_K (#5676)	Kawrakow
2024-02-21	IQ4_NL: 4-bit non-linear quants with blocks of 32 (#5590)	Kawrakow
2024-02-18	1.5 bit quantization (#5453)	Kawrakow
2024-02-16	ggml : add numa options (#5377)	bmwl
2024-02-03	refactor : switch to emplace_back to avoid extra object (#5291)	Michael Klimenko
2024-01-30	SOTA 3-bit quants (#5196)	Kawrakow
2024-01-30	quantize : fix typo (#5211)	Vladimir Malyutin
2024-01-22	llama : add Q3_K_XS (#5060)	Kawrakow
2024-01-14	Add ability to use importance matrix for all k-quants (#4930)	Kawrakow
2024-01-14	2-bit quantizations (#4897)	Kawrakow
2024-01-11	llama : restore intended k-quants mixes for MoE models (#4872)	Kawrakow
2023-11-02	build : link against build info instead of compiling against it (#3879)	cebtenzzre
2023-10-29	ggml : quantization refactoring (#3833)	Georgi Gerganov
2023-09-28	build : enable more non-default compiler warnings (#3200)	Cebtenzzre
2023-09-18	make : restore build-info.h dependency for several targets (#3205)	Cebtenzzre
2023-09-15	examples : add compiler version and target to build info (#2998)	Cebtenzzre
2023-09-15	check C++ code with -Wmissing-declarations (#3184)	Cebtenzzre
2023-09-07	fix some warnings from gcc and clang-tidy (#3038)	Cebtenzzre
2023-09-01	Allow quantize to only copy tensors, some other improvements (#2931)	Kerfuffle
2023-08-28	quantize : make output filename optional again (#2823)	Cebtenzzre
2023-08-23	Fix values shown in the quantize tool help (#2735)	Kawrakow