path: root/src
Age        | Commit message | Author
2024-10-20 | Avoid rebuild of GGML graph for each token (#98) | agray3
2024-10-19 | Bitnet: make the scale tensors optional (#97) | Kawrakow
2024-10-19 | Quant strategies: attn_q Q4 & attn_v Q6 for Llama 3.1 Q5_K_S (#96) | Nexes the Elder
2024-10-18 | CLI - Specify GGML_TYPE to quantize for the main tensors. (#91) | Nexes the Elder
2024-10-16 | Adding IQ4_KSS: 4.0 bpw quants (#89) | Kawrakow
2024-10-13 | IQ2_KS: 2.1875 bpw non-linear quantization (#85) | Kawrakow
2024-10-11 | Minor: printf -> LLAMA_LOG_INFO | Iwan Kawrakow
2024-10-10 | Better model info (#84) | Kawrakow
2024-10-09 | New SOTA quantization: 4.25 bpw IQ4_KS (#83) | Kawrakow
2024-10-02 | Fused unary(x)*y (#70) | Kawrakow
2024-10-02 | Adding Q6_0 (#77) | Kawrakow
2024-09-29 | Allow bf16 kv-cache (#69) | Kawrakow
2024-09-28 | Time to fix replace_all (#68) | Kawrakow
2024-09-28 | CUDA non-contiguous RoPE (#66) | Kawrakow
2024-09-28 | Adding SWIGLU unary op (#65) | Kawrakow
2024-09-28 | Better sub-3-bit quantization mixes with a qkv tensor (#64) | Kawrakow
2024-09-27 | Adding ability to have meta data per tensor row (#61) | Kawrakow
2024-09-19 | Minor | Iwan Kawrakow
2024-09-14 | Quantization mixes tweaks (#53) | Kawrakow
2024-09-09 | Adding IQ1_TN - 1.6875 bpw for TriLM ternary models (#44) | Kawrakow
2024-09-08 | Adding fused rms_norm (#42) | Kawrakow
2024-08-27 | Faster Gemma2 (#27) | Kawrakow
2024-08-21 | softcap: minor improvement (#24) | Kawrakow
2024-08-20 | Fused soft cap and SIMD-ified GeLU (#9) | Kawrakow
2024-08-20 | iq4_k: use iq5_k also when n_gqa = 2 (#23) | Kawrakow
2024-08-19 | iq2_k: slightly better bpw - accuracy compromise (#20) | Kawrakow
2024-08-12 | Merge mainline - Aug 12 2024 (#17) | Kawrakow
2024-08-09 | iq6_k: WIP (quantize/dequantize) | Iwan Kawrakow
2024-08-07 | Adding IQ2_TN for use with ternary models (#13) | Kawrakow
2024-08-05 | q2_K: allow it to detect ternary nets and quantize accordingly | Iwan Kawrakow
2024-08-01 | iq3_k: Basics | Iwan Kawrakow
2024-08-01 | iq5_k: Basics | Iwan Kawrakow
2024-08-01 | iq2_k: Basics | Iwan Kawrakow
2024-07-28 | IQ4_K: SOTA 4-bit quantization (#6) | Kawrakow
2024-07-27 | Merge mainline llama.cpp (#3) | Kawrakow
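
A note on the fractional bpw (bits per weight) figures quoted in several entries above (4.25 for IQ4_KS, 2.1875 for IQ2_KS, 1.6875 for IQ1_TN): they arise from per-block scales and metadata being amortized over the weights of each block. The sketch below is illustrative arithmetic only, assuming the 256-weight super-blocks (QK_K = 256) used throughout ggml-style quants; the overhead splits are assumptions chosen to reproduce the quoted numbers, not the actual field layouts from the linked PRs.

```cpp
// Illustrative only: how fractional bpw figures arise in block quantization.
// ggml-style quants group weights into 256-element super-blocks (QK_K = 256)
// and store per-block scales/metadata alongside the packed codes.
// The overhead values below are assumptions, not the real IQ2_KS/IQ4_KS layouts.
#include <cstdio>

// bits per weight = (packed code bits + per-block overhead bits) / block size
constexpr double bpw(int block_size, int code_bits, int overhead_bits) {
    return (block_size * code_bits + overhead_bits) / double(block_size);
}

int main() {
    std::printf("2-bit codes + 48 overhead bits: %.4f bpw\n", bpw(256, 2, 48)); // 2.1875 (cf. IQ2_KS)
    std::printf("4-bit codes + 64 overhead bits: %.4f bpw\n", bpw(256, 4, 64)); // 4.2500 (cf. IQ4_KS)
    return 0;
}
```

Under this accounting, the flat 4.0 bpw of IQ4_KSS implies no separate per-super-block overhead on top of the 4-bit codes; how that is achieved is described in PR #89.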