ik_llama.cpp.git

Age	Commit message	Author
2024-02-03	refactor : switch to emplace_back to avoid extra object (#5291)	Michael Klimenko
2024-02-03	YaRN : store rope scaling type as int32_t in memory (#5285)	Jared Van Bortel
2024-02-03	readme : add tenere in the ui tools list (#5284)	BADR
2024-02-03	Fix im2col with 32fp (#5286)	AidanBeltonS
2024-02-02	perplexity : fix KL divergence calculations on Windows (#5273)	kalomaze
2024-02-02	scripts : parse wtype in server-llm.sh (#5167)	Georgi Gerganov
2024-02-02	py : add check for '.attn.masked_bias' layers to GPT2model (#5281)	Mirror Azure
2024-02-02	Tidy ggml-sycl (#5261)	AidanBeltonS
2024-02-02	docker : add build for SYCL, Vulkan + update readme (#5228)	Xuan Son Nguyen
2024-02-02	[SYCL] get MAX_MEM_ALLOC from device property (#5270)	Meng, Hengyu
2024-02-02	[SYCL] update guide of SYCL backend (#5254)	Neo Zhang Jianyu
2024-02-02	llama : fix memory leak in llama_batch_free (#5252)	Ian Bull
2024-02-01	add --no-mmap in llama-bench (#5257)	Neo Zhang Jianyu
2024-02-01	Vulkan Phi Fix for AMD Proprietary Drivers (#5260)	0cc4m
2024-02-01	cuda : fix LLAMA_CUDA_F16 (#5262)	slaren
2024-02-01	make : generate .a library for static linking (#5205)	Ali Nehzat
2024-02-01	llama : support InternLM2 (#5184)	Guoteng
2024-01-31	Fix broken Vulkan Cmake (properly) (#5230)	Eve
2024-01-31	llama : reorder build_orion() at correct place (#5118)	Georgi Gerganov
2024-01-31	llama : remove LLAMA_MAX_DEVICES and LLAMA_SUPPORTS_GPU_OFFLOAD (#5240)	Georgi Gerganov
2024-01-31	metal : add im2col F32 dst support (#5132)	Georgi Gerganov
2024-01-31	llava : add MobileVLM support (#4954)	JidongZhang-THU
2024-01-31	format license text, restore apache license by legal suggestion (#5233)	Neo Zhang Jianyu
2024-01-31	ggml : limit n_threads to the max n_tasks (#5238)	slaren
2024-01-31	Vulkan Fixes (#5223)	0cc4m
2024-01-30	Fix typos of IQ2_XXS and IQ3_XXS in llama.cpp (#5231)	Yiming Cui
2024-01-31	support SYCL backend windows build (#5208)	Neo Zhang Jianyu
2024-01-30	kompute : llama-bench support and ggml_cpu_has_kompute() (#5226)	Jared Van Bortel
2024-01-30	Revert "server : change deps.sh xxd files to string literals (#5221)"	Georgi Gerganov
2024-01-30	server : fix context shift (#5195)	Georgi Gerganov
2024-01-30	server : change deps.sh xxd files to string literals (#5221)	JohnnyB
2024-01-30	ggml : fix IQ3_XXS on Metal (#5219)	Kawrakow
2024-01-30	sync : ggml (#0)	Georgi Gerganov
2024-01-30	gguf : fix comparison (ggml/715)	Georgi Gerganov
2024-01-30	`ggml_cuda_cpy` support for 4d tensors and float16->float32 upcasting (ggml/686)	John Balis
2024-01-30	gguf : add input validation, prevent integer overflows (ggml/709)	Georgi Gerganov
2024-01-30	ci : fix yolo URLs + fix metal capture (ggml/712)	Georgi Gerganov
2024-01-30	metal : add debug capture backend function (ggml/694)	Jack Mousseau
2024-01-30	Faster AVX2 dot product for IQ2_XS (#5187)	Kawrakow
2024-01-30	SOTA 3-bit quants (#5196)	Kawrakow
2024-01-30	Vulkan Windows APU Memory Handling (#5199)	0cc4m
2024-01-30	quantize : fix typo (#5211)	Vladimir Malyutin
2024-01-30	main : allow empty --prompt-cache file (#5176)	divinity76
2024-01-30	readme : minor (#5204)	Romain Neutron
2024-01-30	readme : update hot topics	Georgi Gerganov
2024-01-30	server : improve README (#5209)	Wu Jian Ping
2024-01-29	ggml alloc: Fix for null dereference on alloc failure (#5200)	Paul Tsochantaris
2024-01-29	kompute : fix fallback to CPU (#5201)	Jared Van Bortel
2024-01-29	Nomic Vulkan backend (#4456)	Jared Van Bortel
2024-01-29	fix typo "RLIMIT_MLOCK" (#5175)	divinity76