ik_llama.cpp.git (branch: main)
Commit log for path: root/examples
Date        Commit message  (Author)
2024-02-18  ci : fix wikitext url + compile warnings (#5569)  (Georgi Gerganov)
2024-02-18  common, server : surface min_keep as its own parameter (#5567)  (Robey Holderith)
2024-02-18  server : slots monitoring endpoint (#5550)  (Pierrick Hymbert)
2024-02-18  server : enhanced health endpoint (#5548)  (Pierrick Hymbert)
2024-02-18  server : --n-predict option document and cap to max value (#5549)  (Pierrick Hymbert)
2024-02-18  server : graceful server shutdown (#5244)  (Daniel Hiltgen)
2024-02-18  ggml, common, examples, tests : fixed type arguments in printf (#5528)  (Herman Semenov)
2024-02-18  llava : update surgery script to not remove tensors (#5536)  (Daniel Bevenius)
2024-02-18  1.5 bit quantization (#5453)  (Kawrakow)
2024-02-17  ci : add an option to fail on compile warning (#3952)  (Ananta Bastola)
2024-02-16  llava : removed excess free(NULL) operation (#5531)  (Herman Semenov)
2024-02-16  server : add "samplers" param to control the samplers order (#5494)  (Alexey Parfenov)
2024-02-16  server : fix system prompt cli (#5516)  (Rőczey Barnabás)
2024-02-16  ggml : add numa options (#5377)  (bmwl)
2024-02-16  llava : fix clip-model-is-vision flag in README.md (#5509)  (Daniel Bevenius)
2024-02-15  clip : fix wrong loop condition  (Georgi Gerganov)
2024-02-15  llava : fix memory management bug (#5491)  (Elbios)
2024-02-15  llaba : hotfix for llava-1.6 image number (#5495)  (John)
2024-02-14  llava : update README.md (#5489)  (John)
2024-02-14  llava : support v1.6 (#5267)  (John)
2024-02-13  gguf : add python reader example (#5216)  (John)
2024-02-13  finetune : rename feed-forward tensors (w1/w2/w3) (#4839)  (Daniel Bevenius)
2024-02-13  llama : support batched embeddings (#5466)  (Douglas Hanley)
2024-02-12  llava : remove prog parameter from ArgumentParser (#5457)  (Daniel Bevenius)
2024-02-12  sync : ggml (#5452)  (Georgi Gerganov)
2024-02-11  Add support for BERT embedding models (#5423)  (Douglas Hanley)
2024-02-11  server : allow to specify tokens as strings in logit_bias (#5003)  (Alexey Parfenov)
2024-02-11  main : ctrl+C print timing in non-interactive mode (#3873)  (Georgi Gerganov)
2024-02-11  lookup: add print for drafting performance (#5450)  (Johannes Gäßler)
2024-02-11  server : add llama2 chat template (#5425)  (Xuan Son Nguyen)
2024-02-09  llava : add requirements.txt and update README.md (#5428)  (Daniel Bevenius)
2024-02-09  server : fix prompt caching for repeated prompts (#5420)  (Riley Stewart)
2024-02-08  llava : add missing .py, and fix paths in README.md (#5414)  (Daniel Bevenius)
2024-02-08  llava: fix typo/formatting in README.md (#5405)  (Daniel Bevenius)
2024-02-07  llava-cli : always tokenize special tokens (#5382)  (Xiao-Yong Jin)
2024-02-07  server : update `/props` with "total_slots" value (#5373)  (Justin Parker)
2024-02-06  server : remove model.json endpoint (#5371)  (Alexey Parfenov)
2024-02-06  server : include total "num_slots" in props endpoint (#5349)  (Justin Parker)
2024-02-06  server : add `dynatemp_range` and `dynatemp_exponent` (#5352)  (Michael Coppola)
2024-02-06  server : various fixes for the prompt field in /completion (#5300)  (Niall Coates)
2024-02-05  server : allow to get default generation settings for completion (#5307)  (Alexey Parfenov)
2024-02-04  Adding some imatrix tools (#5302)  (Kawrakow)
2024-02-03  refactor : switch to emplace_back to avoid extra object (#5291)  (Michael Klimenko)
2024-02-02  perplexity : fix KL divergence calculations on Windows (#5273)  (kalomaze)
2024-02-02  [SYCL] update guide of SYCL backend (#5254)  (Neo Zhang Jianyu)
2024-02-01  add --no-mmap in llama-bench (#5257)  (Neo Zhang Jianyu)
2024-01-31  llama : remove LLAMA_MAX_DEVICES and LLAMA_SUPPORTS_GPU_OFFLOAD (#5240)  (Georgi Gerganov)
2024-01-31  llava : add MobileVLM support (#5132)  (JidongZhang-THU)
2024-01-31  format license text, restore apache license by legal suggestion (#5233)  (Neo Zhang Jianyu)
2024-01-31  support SYCL backend windows build (#5208)  (Neo Zhang Jianyu)