ik_llama.cpp.git
main
path: root/examples
Age
Commit message
Author
2024-06-07
gguf-split : change binary multi-byte units to decimal (#7803)
Christian Zhou-Zheng
2024-06-07
server: update cache_prompt documentation [no ci] (#7745)
Johannes Gäßler
2024-06-07
server : do not get prompt in infill mode (#7286)
woodx
2024-06-07
check for nans in imatrix and quantize (#7807)
slaren
2024-06-06
imatrix : migrate to gpt_params (#7771)
Georgi Gerganov
2024-06-06
grammars: x{min,max} repetition operator (#6640)
Olivier Chafik
2024-06-05
ggml : refactor rope norm/neox (#7634)
Georgi Gerganov
2024-06-05
readme : remove -ins (#7759)
arch-btw
2024-06-04
common : refactor cli arg parsing (#7675)
Georgi Gerganov
2024-06-04
ggml : remove OpenCL (#7735)
Georgi Gerganov
2024-06-04
llama : remove beam search (#7736)
Georgi Gerganov
2024-06-04
llama-bench : allow using a different printer for stderr with -oe (#7722)
slaren
2024-06-02
[SYCL] Update rpc-server.cpp to include SYCL backend (#7682)
nickp27
2024-06-01
server : new UI (#7633)
Yazan Agha-Schrader
2024-06-02
SimpleChat: Simple histogram/repeatMatching driven garbageTrimming, Settings ...
HanishKVC
2024-05-31
server : update js (#7670)
Georgi Gerganov
2024-05-30
Move convert.py to examples/convert-legacy-llama.py (#7430)
Galunid
2024-05-29
llama-bench : add support for the RPC backend (#7435)
Radoslav Gerganov
2024-05-28
server: do not remove whitespace at the start of a completion chunk (#7524)
mgroeber9110
2024-05-28
Markdownish code block fix (#7571)
Nathan Epstein
2024-05-28
llava : update clip.h (#7580)
Ikko Eltociear Ashimine
2024-05-27
main: replace --no-special with --special (#7534)
Brian
2024-05-26
SimpleChat Completion Mode flexibility and cleanup, Settings gMe, Optional sl...
HanishKVC
2024-05-25
train : change default FA argument (#7528)
Georgi Gerganov
2024-05-25
main : don't print special tokens with --grammar (#6923)
Justine Tunney
2024-05-25
android : module (#7502)
Elton Kola
2024-05-25
Make tokenize CLI tool have nicer command line arguments. (#6188)
Mikko Juola
2024-05-24
add build shared lib in win release package (#7438)
Neo Zhang
2024-05-23
ggml : remove ggml_flash_attn and ggml_flash_ff (#7463)
Georgi Gerganov
2024-05-23
main : minor (#7462)
Georgi Gerganov
2024-05-23
SimpleChat: a simple and dumb web front end for testing /chat/completions and...
HanishKVC
2024-05-22
common : normalize naming style (#7462)
Georgi Gerganov
2024-05-22
phi3 : duplicate rope factors in each layer (#7447)
slaren
2024-05-21
llama : add phi3 128K model support (#7225)
liuwei-git
2024-05-21
`grammars`: fix resampling logic regression (#7424)
Olivier Chafik
2024-05-21
examples: cache hf model when --model not provided (#7353)
Amir
2024-05-21
Tokenizer SPM fixes for phi-3 and llama-spm (bugfix) (#7425)
jaime-m-p
2024-05-20
Tokenizer SPM fixes for phi-3 and llama-spm (#7375)
jaime-m-p
2024-05-20
perplexity: update README FP16 results [no ci] (#7413)
Johannes Gäßler
2024-05-20
server : fix temperature + disable some tests (#7409)
Georgi Gerganov
2024-05-20
server : tuning tests (#7388)
Georgi Gerganov
2024-05-20
server : return error on too large embedding input (#7389)
Georgi Gerganov
2024-05-20
tests : fix --keep_split -> --keep-split (#7374)
Georgi Gerganov
2024-05-19
quantize : fix --keep-split check (#7374)
Fred Douglas
2024-05-19
server: add test for token probs (#7347)
Johannes Gäßler
2024-05-19
server: fix seed being reported back (#7382)
Johannes Gäßler
2024-05-19
cmake : update android comments (#7341)
Georgi Gerganov
2024-05-18
android : use "ci-android" branch for CI (#7341)
Georgi Gerganov
2024-05-18
server: correct --threads documentation [no ci] (#7362)
Johannes Gäßler
2024-05-18
perplexity : ndot progress and show stats with < 100 tasks (#7348)
strawberrymelonpanda