ik_llama.cpp.git

Age	Commit message	Author
2024-06-09	convert-hf : set the model name based on cli arg, if present (#7693)	sasha0552
2024-06-09	convert-hf : match model part name prefix and suffix (#7687)	compilade
2024-06-09	gguf-py : decouple adding metadata from writing in GGUFWriter (#7827)	compilade
2024-06-09	Revert "[SYCL] Update rpc-server.cpp to include SYCL backend (#7682)" (#7808)	slaren
2024-06-08	url: save -mu downloads to new cache location (#7826)	Olivier Chafik
2024-06-08	server : smart slot selection using Longest Common Prefix (#7728)	sasha0552
2024-06-07	vulkan : reuse parent extra for views (#7806)	slaren
2024-06-07	gguf-split : change binary multi-byte units to decimal (#7803)	Christian Zhou-Zheng
2024-06-07	cmake : fix BUILD_SHARED_LIBS=ON build (#7784)	intelmatt
2024-06-07	server: update cache_prompt documentation [no ci] (#7745)	Johannes Gäßler
2024-06-07	server : do not get prompt in infill mode (#7286)	woodx
2024-06-07	[SYCL] fix softmax r2r result wrong issue (#7811)	pengxin99
2024-06-07	check for nans in imatrix and quantize (#7807)	slaren
2024-06-06	server : fix --threads-http arg (#7801)	Georgi Gerganov
2024-06-06	imatrix : migrate to gpt_params (#7771)	Georgi Gerganov
2024-06-06	Added support for . (any character) token in grammar engine. (#6467)	Clint Herron
2024-06-06	README minor fixes (#7798) [no ci]	Mattheus Chediak
2024-06-06	grammars: x{min,max} repetition operator (#6640)	Olivier Chafik
2024-06-06	llama : add jina v2 base code (#7596)	Joan Fontanals
2024-06-06	docker : build only main and server in their images (#7782)	slaren
2024-06-06	docker : add openmp lib (#7780)	slaren
2024-06-06	Fix encoding in python scripts (#7733)	Galunid
2024-06-05	CUDA: refactor mmq, dmmv, mmvq (#7716)	Johannes Gäßler
2024-06-05	ggml : refactor rope norm/neox (#7634)	Georgi Gerganov
2024-06-05	readme : remove -ins (#7759)	arch-btw
2024-06-05	Fix per token atrributes bits (#7749)	jaime-m-p
2024-06-04	Allow number of nodes in CUDA graph to change (#7738)	agray3
2024-06-04	common : refactor cli arg parsing (#7675)	Georgi Gerganov
2024-06-04	ggml : remove OpenCL (#7735)	Georgi Gerganov
2024-06-04	llama : remove beam search (#7736)	Georgi Gerganov
2024-06-04	readme : remove obsolete Zig instructions (#7471)	Georgi Gerganov
2024-06-04	llama-bench : allow using a different printer for stderr with -oe (#7722)	slaren
2024-06-04	Improve hipBLAS support in CMake (#7696)	Daniele
2024-06-04	refine .gitignore (#7688)	zhouwg
2024-06-04	Per token attributes (#7685)	jaime-m-p
2024-06-04	ggml : prevent builds with -ffinite-math-only (#7726)	Georgi Gerganov
2024-06-03	llama : offload to RPC in addition to other backends (#7640)	Radoslav Gerganov
2024-06-03	ggml : use OpenMP as a thread pool (#7606)	Masaya, Kato
2024-06-03	make: fix debug options not being applied to NVCC (#7714)	Johannes Gäßler
2024-06-03	Vulkan Mixture of Experts (MoE) support (#7628)	0cc4m
2024-06-03	cmake : add pkg-config spec file for llama.cpp (#7702)	Andy Tai
2024-06-03	llama : MiniCPM support tied embeddings (#7664)	zhangkaihuo
2024-06-03	llama : avoid double token-to-piece cache (#7654)	Georgi Gerganov
2024-06-03	kompute : implement op_getrows_f32 (#6403)	woachk
2024-06-02	fix bug introduced in using calloc (#7701)	Dave Airlie
2024-06-02	flake.lock: Update (#7686)	Georgi Gerganov
2024-06-02	chore : add ignore rule for generated server themes (#7689)	Austin
2024-06-02	[SYCL] Update rpc-server.cpp to include SYCL backend (#7682)	nickp27
2024-06-01	Fix FlashAttention debug test, FP32 assert (#7684)	Johannes Gäßler
2024-06-01	server : new UI (#7633)	Yazan Agha-Schrader