ik_llama.cpp.git

Age	Commit message	Author
2024-03-14	ggml : designate enum vals for integer types (#6050)	Georgi Gerganov
2024-03-14	embedding : print all resulting embeddings (#899)	Georgi Gerganov
2024-03-14	metal : build metallib + fix embed path (#6015)	Georgi Gerganov
2024-03-14	embedding : print cosine similarity (#899)	Georgi Gerganov
2024-03-13	readme : update details about running llama in Termux on Android (#6039)	Linwei Wang
2024-03-13	readme : update API changes and hot topics	Georgi Gerganov
2024-03-13	grammar : handle missing "root" node (#6004)	Clint Herron
2024-03-13	llama : add pipeline parallelism support (#6017)	slaren
2024-03-13	test-backend-ops : skip CPU backend by default (#6028)	slaren
2024-03-13	Update get version (#6025)	AidanBeltonS
2024-03-13	Server: Use multi-task for embeddings endpoint (#6001)	Xuan Son Nguyen
2024-03-12	ci : remove tidy-review (#6021)	slaren
2024-03-12	ggml : reuse quantum structs across backends (#5943)	Georgi Gerganov
2024-03-12	ggml : fix UB in IQ2_S and IQ3_S (#6012)	Georgi Gerganov
2024-03-12	sycl : update IQ1_S kernels (WIP - not working!) (#5995)	Georgi Gerganov
2024-03-11	grammar : fix unnecessarily retained pointer to rules (#6003)	gliptic
2024-03-11	1.5 bit: we can do even better (#5999)	Kawrakow
2024-03-11	llama : more consistent names of count variables (#5994)	Georgi Gerganov
2024-03-11	llama : refactor unicode stuff (#5992)	Georgi Gerganov
2024-03-11	Update server docker image URLs (#5997)	Jakub N
2024-03-11	Server: format error to json (#5961)	Xuan Son Nguyen
2024-03-11	ggml, ci : Windows ARM runner and build fixes (#5979)	Michael Podvitskiy
2024-03-11	server : maintain chat completion id for streaming responses (#5988)	Minsoo Cheong
2024-03-11	cmake : fix subdir for `LLAMA_METAL_EMBED_LIBRARY` (#5985)	Gilad S
2024-03-11	llama : fix F16/F32 downcast + improve names (#5980)	Georgi Gerganov
2024-03-11	Better 1.5 bit quantization (#5971)	Kawrakow
2024-03-11	[SYCL] Add q3_s and q1_s (#5886)	Abhilash Majumder
2024-03-11	[SYCL] Add support for SYCL Nvidia target (#5738)	AidanBeltonS
2024-03-10	metal : move mm_id indices to shared mem (#5982)	Georgi Gerganov
2024-03-10	android : fix utf8 decoding error (#5935)	Dean
2024-03-10	readme : update hot topics	Georgi Gerganov
2024-03-10	sync : ggml	Georgi Gerganov
2024-03-10	ggml : try fix 32-bit arm compat (whisper/1938)	Georgi Gerganov
2024-03-10	ggml : remove __constant__ specifier for CUDA tables (#5940)	Georgi Gerganov
2024-03-10	server: ci: windows build and tests (#5968)	Pierrick Hymbert
2024-03-10	llama : add support for GritLM (#5959)	DAN™
2024-03-10	grammar : verify parsed state (#5950)	Clint Herron
2024-03-10	nix: update flake.lock (#5969)	Georgi Gerganov
2024-03-09	server: benchmark: chat/completions scenario and other llm servers comparison...	Pierrick Hymbert
2024-03-09	server : print chat template info	Georgi Gerganov
2024-03-09	perplexity : support using multiple sequences to allow larger batch sizes (#5...	slaren
2024-03-09	readme : update hot topics	Georgi Gerganov
2024-03-09	ggml : fix unnecessary f32 -> f16 -> f32 casts (mmla) (#5951)	Georgi Gerganov
2024-03-09	server : fix metrics init (#5964)	Georgi Gerganov
2024-03-09	ggml : remove old quantization functions (#5942)	Georgi Gerganov
2024-03-09	server : clarify some items in the readme (#5957)	Georgi Gerganov
2024-03-09	server : normalize embeddings (#5956)	SeungWon Jeong
2024-03-09	tests : gitignore ggml-common.h	Georgi Gerganov
2024-03-09	server : fix passing prompt as tokens (#5955)	Alexey Parfenov
2024-03-09	ggml : add ggml-common.h to deduplicate shared code (#5940)	Georgi Gerganov