ik_llama.cpp.git

Age	Commit message	Author
2024-06-17	update: support Qwen2-57B-A14B (#7835)	Ștefan-Gabriel Muscalu
2024-06-17	Make updates to type cast based on compiler instead of OS (#7851)	Srihari-mcw
2024-06-17	llama : disable FA if KV head size do not match (#7982)	Georgi Gerganov
2024-06-17	Add Nix and Flox install instructions (#7899)	Bryan Honof
2024-06-17	sched : offload_op also requires supports_op (#7977)	slaren
2024-06-17	fix: divide 0 exception in mamba (#7932)	Frank Mai
2024-06-17	Implement non-mapped async IO for CUDA on Windows. (#7896)	Markus Tavenrath
2024-06-17	rpc : fix load/store misaligned addresses (#7948)	Georgi Gerganov
2024-06-17	gguf-dump.py: add --markdown dump output (#7853)	Brian
2024-06-17	[SYCL] Update README-sycl.md for Chapter "Recommended release" and "News" (#7...	Neo Zhang
2024-06-17	Add support for sqrt on CUDA (#7953)	Calvin Laurenson
2024-06-16	cuda : fix bounds check for src0 rows in MMVQ kernel (whisper/2231)	Georgi Gerganov
2024-06-16	ggml : fix and optimize ppc64le (ggml/849)	Hong Bo PENG
2024-06-16	ggml : remove duplicate include of ggml-common.h (ggml/853)	Daniel Bevenius
2024-06-16	flake.lock: Update (#7951)	Georgi Gerganov
2024-06-16	unicode : avoid char32_t (#7957)	Georgi Gerganov
2024-06-16	readme : update UI list [no ci] (#7958)	hopkins385
2024-06-16	ggml : fix handling of zero blocks in IQ quants (#7955)	Georgi Gerganov
2024-06-16	github : update pr template	Georgi Gerganov
2024-06-16	Vulkan Shader Refactor, Memory Debugging Option (#7947)	0cc4m
2024-06-15	Add `cvector-generator` example (#7514)	Xuan Son Nguyen
2024-06-15	[SYCL] remove global variables (#7710)	Meng, Hengyu
2024-06-14	ci : fix macos x86 build (#7940)	olexiyb
2024-06-14	CUDA: faster q2_K, q3_K MMQ + int8 tensor cores (#7921)	Johannes Gäßler
2024-06-14	metal : utilize max shared memory for mul_mat_id (#7935)	Georgi Gerganov
2024-06-14	llama-bench : fix RPC indication (#7936)	Radoslav Gerganov
2024-06-14	llama : more checks before assuming FIM tokens (#7644)	Sigbjørn Skjæret
2024-06-14	convert : add Poro-34B-chat tokenizer support (#7713)	Elaine
2024-06-13	rpc : fix ggml_backend_rpc_supports_buft() (#7918)	Radoslav Gerganov
2024-06-13	readme : Remove outdated instructions from README.md (#7914) [no ci]	Galunid
2024-06-13	move BLAS to a separate backend (#6210)	slaren
2024-06-13	`build`: rename main → llama-cli, server → llama-server, llava-cli → ll...	Olivier Chafik
2024-06-12	CUDA: fix broken oob check for FA vec f32 kernel (#7904)	Johannes Gäßler
2024-06-12	tests : add non-cont unary tests (#7857)	Georgi Gerganov
2024-06-12	ggml : improve ggml_is_contiguous logic (#7856)	Georgi Gerganov
2024-06-12	server : restore numeric prompts (#7883)	Georgi Gerganov
2024-06-12	update intel docker oneapi-basekit to 2024.1.1-devel-ubuntu22.04 (#7894)	Meng, Hengyu
2024-06-12	Fix a typo and add Fedora 40 pacakge to install for Vulkan (#7794) [no ci]	Patrice Ferlet
2024-06-11	vulkan: select only one device for single gpu with multiple drivers (#7582)	k.h.lai
2024-06-11	Update Vulkan RoPE implementation (#7818)	0cc4m
2024-06-12	fix broken link in pr template (#7880) [no ci]	Deven Mistry
2024-06-11	github: move PR template to .github/ root (#7868)	Brian
2024-06-11	llama-bench: more compact markdown tables (#7879)	Johannes Gäßler
2024-06-11	tests : check the Python version (#7872)	Georgi Gerganov
2024-06-11	CUDA: int8 tensor cores for MMQ (q4_K, q5_K, q6_K) (#7860)	Johannes Gäßler
2024-06-11	fix CUDA CI by using a windows-2019 image (#7861)	slaren
2024-06-11	json: refine constraint for whitespace to avoid runaways yet allow pretty pri...	Olivier Chafik
2024-06-11	`json`: document schema conversion in GBNF readme, align manual grammar examp...	Olivier Chafik
2024-06-10	cmake : fix CMake requirement for CUDA (#7821)	Jared Van Bortel
2024-06-10	ci : try win-2019 on server windows test (#7854)	slaren