ik_llama.cpp.git (branch: main)
Unnamed repository; edit this file 'description' to name the repository.

Age        | Commit message                                                           | Author
-----------|--------------------------------------------------------------------------|-------
2024-03-09 | server : clarify some items in the readme (#5957)                        | Georgi Gerganov
2024-03-09 | server : normalize embeddings (#5956)                                    | SeungWon Jeong
2024-03-09 | tests : gitignore ggml-common.h                                          | Georgi Gerganov
2024-03-09 | server : fix passing prompt as tokens (#5955)                            | Alexey Parfenov
2024-03-09 | ggml : add ggml-common.h to deduplicate shared code (#5940)              | Georgi Gerganov
2024-03-09 | server : simplify logic for empty prompts (#5953)                        | Georgi Gerganov
2024-03-09 | Server: reorganize some http logic (#5939)                               | Xuan Son Nguyen
2024-03-09 | server : add SSL support (#5926)                                         | Gabe Goodhart
2024-03-09 | server: tests: add truncated prompt tests, better kv cache size (#5933)  | Pierrick Hymbert
2024-03-08 | llama : support Mamba Selective State Space Models (#5328)               | compilade
2024-03-08 | llama : fix quantization of shared token_embd (#5944)                    | compilade
2024-03-08 | server: metrics: add llamacpp:prompt_seconds_total and llamacpp:tokens_predic... | Pierrick Hymbert
2024-03-08 | llama : assume tied weights if lm_head/output weights is missing (#5824) | Don Mahurin
2024-03-08 | server : fix EOS token detection with disabled cache (#5938)             | Georgi Gerganov
2024-03-08 | log : fix MSVC compile errors (#5643)                                    | UEXTM.com
2024-03-07 | llama-bench : add embeddings option (#5924)                              | Georgi Gerganov
2024-03-07 | Revert "[SYCL] fix error when set main gpu to non-zero (#5901)" (#5918)  | Neo Zhang Jianyu
2024-03-07 | server : add `/v1/completions` endpoint (#5914)                          | Minsoo Cheong
2024-03-07 | server : refactor (#5882)                                                | Georgi Gerganov
2024-03-07 | [SYCL] fix error when set main gpu to non-zero (#5901)                   | Neo Zhang Jianyu
2024-03-06 | ggml : use SYS_get_cpu if SYS_getcpu is not defined (#5906)              | Jared Van Bortel
2024-03-06 | ggml : use `uint8x16_t` return type for `ggml_vqtbl1q_u8` (#5894)        | bobqianic
2024-03-06 | convert : remove AWQ remnants (#5768)                                    | Georgi Gerganov
2024-03-06 | add wait() to make code stable (#5895)                                   | Neo Zhang Jianyu
2024-03-05 | compare-llama-bench.py : remove mul_mat_q (#5892)                        | slaren
2024-03-05 | quants : use MM256_SET_M128I consistently to fix gcc 7 build (#5889)     | Jared Van Bortel
2024-03-05 | grammars : blacklists character control set (#5888)                      | ExtReMLapin
2024-03-05 | Revert "grammars : don't allow to output unescaped new line in string (#5885)" | Georgi Gerganov
2024-03-05 | grammars : don't allow to output unescaped new line in string (#5885)    | ExtReMLapin
2024-03-05 | Vulkan Improvements (#5835)                                              | 0cc4m
2024-03-05 | [SYCL] fix mul_mat fault in CI/unit-test (#5862)                         | Neo Zhang Jianyu
2024-03-05 | fix editorconfig check break (#5879)                                     | Minsoo Cheong
2024-03-04 | fix speculative decoding build on windows (#5874)                        | Jeffrey Quesnelle
2024-03-04 | nix: static build (#5814)                                                | hutli
2024-03-04 | llama : fix embeddings (#5796)                                           | Georgi Gerganov
2024-03-04 | flake : fix                                                              | Georgi Gerganov
2024-03-04 | ggml : fix unknown status (#0)                                           | Georgi Gerganov
2024-03-04 | sync : ggml                                                              | Georgi Gerganov
2024-03-04 | ggml : introduce ggml_status (ggml/750)                                  | Michael Podvitskiy
2024-03-04 | cmake : handle cases where git index is not found in .git (#5844)        | Dane Madsen
2024-03-04 | speculative : implement stochastic speculative sampling (#5625)          | Minsoo Cheong
2024-03-04 | add alias for chat template (#5858)                                      | Xuan Son Nguyen
2024-03-04 | sync : ggml                                                              | Georgi Gerganov
2024-03-04 | add some new ops, fix some operators and add batch operations to certain oper... | leejet
2024-03-04 | common : use LLAMA_DEFAULT_SEED (#5855)                                  | DAN™
2024-03-04 | main : support special tokens as reverse/anti prompt (#5847)             | DAN™
2024-03-03 | cuda : fix data race in soft max (#5853)                                 | slaren
2024-03-03 | readme : add API changes section                                         | Georgi Gerganov
2024-03-03 | llama : allow for user specified embedding pooling type (#5849)          | Douglas Hanley
2024-03-03 | gguf-dump : support i-quants (#5841)                                     | Nindaleth