ik_llama.cpp.git

Age	Commit message	Author
2024-06-22	iqk_mul_mat: delete unused stuff	Iwan Kawrakow
2024-06-22	iqk_mul_mat: add q8_0	Iwan Kawrakow
2024-06-22	iqk_mul_mat: fp16 tweaks	Iwan Kawrakow
2024-06-22	iqk_mul_mat: fp16 implementation cleanup	Iwan Kawrakow
2024-06-22	iqk_mul_mat: fp16 implementation for AVX2	Iwan Kawrakow
2024-06-22	iqk_mul_mat: multi-thread quantization also for MoE models	Iwan Kawrakow
2024-06-22	iqk_mul_mat: make it independent of sgemm	Iwan Kawrakow
2024-06-22	iqk_mul_mat: minor improvements	Iwan Kawrakow
2024-06-22	iqk_mul_mat: no more templates in the IQ dequantizers	Iwan Kawrakow
2024-06-22	iqk_mul_mat: remove template on one of the prepare() functions	Iwan Kawrakow
2024-06-22	iqk_mul_mat: experimenting with zen4	Iwan Kawrakow
2024-06-22	iqk_mul_mat: experimenting with zen4 (iq2_xxs)	Iwan Kawrakow
2024-06-22	iqk_mul_mat: experimenting with zen4 (iq2_xs)	Iwan Kawrakow
2024-06-22	iqk_mul_mat: experimenting with zen4 (iq3_s and iq2_m)	Iwan Kawrakow
2024-06-22	iqk_mul_mat: small improvement for iq3_s	Iwan Kawrakow
2024-06-22	iqk_mul_mat: better AVX2 implementation for iq2_xxs	Iwan Kawrakow
2024-06-22	iqk_mul_mat: better AVX2 implementation for iq2_xxs	Iwan Kawrakow
2024-06-22	iqk_mul_mat: AVX2 implementation for iq2_xxs	Iwan Kawrakow
2024-06-22	iqk_mul_mat: AVX2 implementation for iq2_xs	Iwan Kawrakow
2024-06-22	iqk_mul_mat: AVX2 implementation for iq2_s	Iwan Kawrakow
2024-06-22	Separate templates for TG and PP for i-quants on AVX2	Iwan Kawrakow
2024-06-22	iqk_mul_mat: AVX2 implementation for iq3_xxs	Iwan Kawrakow
2024-06-22	iqk_mul_mat: AVX2 implementation for iq3_s	Iwan Kawrakow
2024-06-22	Cleanup - Arm i-quants should be good now	Iwan Kawrakow
2024-06-22	iqk_mul_mat: Arm implementation for iq3_s (llama.cpp version)	Iwan Kawrakow
2024-06-22	Simplify	Iwan Kawrakow
2024-06-22	iqk_mul_mat: Arm implementation for iq3_xxs (llama.cpp version)	Iwan Kawrakow
2024-06-22	iqk_mul_mat: Arm implementation for iq2_xs (llama.cpp version)	Iwan Kawrakow
2024-06-22	iqk_mul_mat: Arm implementation for iq2_s (llama.cpp version)	Iwan Kawrakow
2024-06-22	Add Q8_0	Iwan Kawrakow
2024-06-22	Cosmetics	Iwan Kawrakow
2024-06-22	iqk_mul_mat: Arm implementation for iq2_xxs (llama.cpp version)	Iwan Kawrakow
2024-06-22	iqk_mul_mat: faster q3_K TG	Iwan Kawrakow
2024-06-22	iqk_mul_mat for llama.cpp	Iwan Kawrakow
2024-06-21	JSON Schema to GBNF integration tests (#7790)	Clint Herron
2024-06-21	vulkan: detect multiple devices by deviceUUID instead of deviceID (#8022)	k.h.lai
2024-06-21	ggml : AVX IQ quants (#7845)	Eve
2024-06-21	llama : optimize long word tokenization with WPM (#8034)	Georgi Gerganov
2024-06-21	llama : allow pooled embeddings on any model (#7477)	Douglas Hanley
2024-06-21	swiftui : enable stream updating (#7754)	Shuichi Tsutsumi
2024-06-20	requirements : Bump torch and numpy for python3.12 (#8041)	Hamdoud Hakem
2024-06-20	convert-hf : Fix the encoding in the convert-hf-to-gguf-update.py (#8040)	Hamdoud Hakem
2024-06-20	common: fix warning (#8036)	Johannes Gäßler
2024-06-20	[SYCL] Fix windows build and inference (#8003)	luoyu-intel
2024-06-20	CUDA: stream-k decomposition for MMQ (#8018)	Johannes Gäßler
2024-06-20	metal : fix `ggml_metal_supports_op` for BF16 (#8021)	Michael de Gans
2024-06-20	server : fix smart slot selection (#8020)	sasha0552
2024-06-19	un-ignore `build-info.cmake` and `build-info.sh` (#7996)	Michael de Gans
2024-06-19	ggml : synchronize threads using barriers (#7993)	slaren
2024-06-19	codecov : remove (#8004)	Georgi Gerganov