ik_llama.cpp.git

Age	Commit message	Author
2024-03-09	server : clarify some items in the readme (#5957)	Georgi Gerganov
2024-03-09	server : normalize embeddings (#5956)	SeungWon Jeong
2024-03-09	tests : gitignore ggml-common.h	Georgi Gerganov
2024-03-09	server : fix passing prompt as tokens (#5955)	Alexey Parfenov
2024-03-09	ggml : add ggml-common.h to deduplicate shared code (#5940)	Georgi Gerganov
2024-03-09	server : simplify logic for empty prompts (#5953)	Georgi Gerganov
2024-03-09	Server: reorganize some http logic (#5939)	Xuan Son Nguyen
2024-03-09	server : add SSL support (#5926)	Gabe Goodhart
2024-03-09	server: tests: add truncated prompt tests, better kv cache size (#5933)	Pierrick Hymbert
2024-03-08	llama : support Mamba Selective State Space Models (#5328)	compilade
2024-03-08	llama : fix quantization of shared token_embd (#5944)	compilade
2024-03-08	server: metrics: add llamacpp:prompt_seconds_total and llamacpp:tokens_predic...	Pierrick Hymbert
2024-03-08	llama : assume tied weights if lm_head/output weights is missing (#5824)	Don Mahurin
2024-03-08	server : fix EOS token detection with disabled cache (#5938)	Georgi Gerganov
2024-03-08	log : fix MSVC compile errors (#5643)	UEXTM.com
2024-03-07	llama-bench : add embeddings option (#5924)	Georgi Gerganov
2024-03-07	Revert "[SYCL] fix error when set main gpu to non-zero (#5901)" (#5918)	Neo Zhang Jianyu
2024-03-07	server : add `/v1/completions` endpoint (#5914)	Minsoo Cheong
2024-03-07	server : refactor (#5882)	Georgi Gerganov
2024-03-07	[SYCL] fix error when set main gpu to non-zero (#5901)	Neo Zhang Jianyu
2024-03-06	ggml : use SYS_get_cpu if SYS_getcpu is not defined (#5906)	Jared Van Bortel
2024-03-06	ggml : use `uint8x16_t` return type for `ggml_vqtbl1q_u8` (#5894)	bobqianic
2024-03-06	convert : remove AWQ remnants (#5768)	Georgi Gerganov
2024-03-06	add wait() to make code stable (#5895)	Neo Zhang Jianyu
2024-03-05	compare-llama-bench.py : remove mul_mat_q (#5892)	slaren
2024-03-05	quants : use MM256_SET_M128I consistently to fix gcc 7 build (#5889)	Jared Van Bortel
2024-03-05	grammars : blacklists character control set (#5888)	ExtReMLapin
2024-03-05	Revert "grammars : don't allow to output unescaped new line in string (#5885)"	Georgi Gerganov
2024-03-05	grammars : don't allow to output unescaped new line in string (#5885)	ExtReMLapin
2024-03-05	Vulkan Improvements (#5835)	0cc4m
2024-03-05	[SYCL] fix mul_mat fault in CI/unit-test (#5862)	Neo Zhang Jianyu
2024-03-05	fix editorconfig check break (#5879)	Minsoo Cheong
2024-03-04	fix speculative decoding build on windows (#5874)	Jeffrey Quesnelle
2024-03-04	nix: static build (#5814)	hutli
2024-03-04	llama : fix embeddings (#5796)	Georgi Gerganov
2024-03-04	flake : fix	Georgi Gerganov
2024-03-04	ggml : fix unknown status (#0)	Georgi Gerganov
2024-03-04	sync : ggml	Georgi Gerganov
2024-03-04	ggml : introduce ggml_status (ggml/750)	Michael Podvitskiy
2024-03-04	cmake : handle cases where git index is not found in .git (#5844)	Dane Madsen
2024-03-04	speculative : implement stochastic speculative sampling (#5625)	Minsoo Cheong
2024-03-04	add alias for chat template (#5858)	Xuan Son Nguyen
2024-03-04	sync : ggml	Georgi Gerganov
2024-03-04	add some new ops, fix some operators and add batch operations to certain oper...	leejet
2024-03-04	common : use LLAMA_DEFAULT_SEED (#5855)	DAN™
2024-03-04	main : support special tokens as reverse/anti prompt (#5847)	DAN™
2024-03-03	cuda : fix data race in soft max (#5853)	slaren
2024-03-03	readme : add API changes section	Georgi Gerganov
2024-03-03	llama : allow for user specified embedding pooling type (#5849)	Douglas Hanley
2024-03-03	gguf-dump : support i-quants (#5841)	Nindaleth