ik_llama.cpp.git (branch: main) - commit log
Age         Commit message  [Author]
2024-03-15  llava : change API to pure C style for Rust FFI bindgen (#6079)  [Ting Lou]
2024-03-15  cuda : disable unused cudaLaunchHostFunc code (#6078)  [slaren]
2024-03-15  fix set main gpu error (#6073)  [Neo Zhang Jianyu]
2024-03-15  make : ggml-metal.o depends on ggml.h  [Georgi Gerganov]
2024-03-15  [SYCL] Fix non-intel device selection (#6042)  [AidanBeltonS]
2024-03-15  gguf : add support for I64 and F64 arrays (#6062)  [Ondřej Čertík]
2024-03-15  llama : add Orion chat template (#6066)  [Xuan Son Nguyen]
2024-03-15  llama-bench : use random tokens to improve accuracy with mixtral (#6069)  [slaren]
2024-03-14  llama : fix integer overflow during quantization (#6063)  [Georgi Gerganov]
2024-03-14  gguf : fix resource leaks (#6061)  [Steve Grubb]
2024-03-14  gguf-py : bump version to 0.8.0 (#6060)  [Ondřej Čertík]
2024-03-14  llama : support models without vocabulary (#5798)  [Michael Podvitskiy]
2024-03-14  embedding : add EOS token if not present (#899)  [Georgi Gerganov]
2024-03-14  gguf-py : fix dtype check (#6045)  [Georgi Gerganov]
2024-03-14  readme : improve readme for Llava-1.6 example (#6044)  [Jian Liao]
2024-03-14  server: disable debug release type sanitizer, simplify trigger (#6047)  [Pierrick Hymbert]
2024-03-14  llama : fix typo  [Georgi Gerganov]
2024-03-14  llama : optimize defrag moves + fix fragmentation calculation (#6037)  [Michael Podvitskiy]
2024-03-14  gguf-py : add support for I8, I16 and I32 (#6045)  [Ondřej Čertík]
2024-03-14  ggml : designate enum vals for integer types (#6050)  [Georgi Gerganov]
2024-03-14  embedding : print all resulting embeddings (#899)  [Georgi Gerganov]
2024-03-14  metal : build metallib + fix embed path (#6015)  [Georgi Gerganov]
2024-03-14  embedding : print cosine similarity (#899)  [Georgi Gerganov]
2024-03-13  readme : update details about running llama in Termux on Android (#6039)  [Linwei Wang]
2024-03-13  readme : update API changes and hot topics  [Georgi Gerganov]
2024-03-13  grammar : handle missing "root" node (#6004)  [Clint Herron]
2024-03-13  llama : add pipeline parallelism support (#6017)  [slaren]
2024-03-13  test-backend-ops : skip CPU backend by default (#6028)  [slaren]
2024-03-13  Update get version (#6025)  [AidanBeltonS]
2024-03-13  Server: Use multi-task for embeddings endpoint (#6001)  [Xuan Son Nguyen]
2024-03-12  ci : remove tidy-review (#6021)  [slaren]
2024-03-12  ggml : reuse quantum structs across backends (#5943)  [Georgi Gerganov]
2024-03-12  ggml : fix UB in IQ2_S and IQ3_S (#6012)  [Georgi Gerganov]
2024-03-12  sycl : update IQ1_S kernels (WIP - not working!) (#5995)  [Georgi Gerganov]
2024-03-11  grammar : fix unnecessarily retained pointer to rules (#6003)  [gliptic]
2024-03-11  1.5 bit: we can do even better (#5999)  [Kawrakow]
2024-03-11  llama : more consistent names of count variables (#5994)  [Georgi Gerganov]
2024-03-11  llama : refactor unicode stuff (#5992)  [Georgi Gerganov]
2024-03-11  Update server docker image URLs (#5997)  [Jakub N]
2024-03-11  Server: format error to json (#5961)  [Xuan Son Nguyen]
2024-03-11  ggml, ci : Windows ARM runner and build fixes (#5979)  [Michael Podvitskiy]
2024-03-11  server : maintain chat completion id for streaming responses (#5988)  [Minsoo Cheong]
2024-03-11  cmake : fix subdir for `LLAMA_METAL_EMBED_LIBRARY` (#5985)  [Gilad S]
2024-03-11  llama : fix F16/F32 downcast + improve names (#5980)  [Georgi Gerganov]
2024-03-11  Better 1.5 bit quantization (#5971)  [Kawrakow]
2024-03-11  [SYCL] Add q3_s and q1_s (#5886)  [Abhilash Majumder]
2024-03-11  [SYCL] Add support for SYCL Nvidia target (#5738)  [AidanBeltonS]
2024-03-10  metal : move mm_id indices to shared mem (#5982)  [Georgi Gerganov]
2024-03-10  android : fix utf8 decoding error (#5935)  [Dean]
2024-03-10  readme : update hot topics  [Georgi Gerganov]