ik_llama.cpp.git: commit log (branch: main)
Age | Commit message | Author
2024-04-28 | flake.lock: Update | github-actions[bot]
2024-04-27 | Replace "alternative" boolean operator in conditional compilation directive (... | mgroeber9110
2024-04-27 | ci: server: tests python env on github container ubuntu latest / fix n_predic... | Pierrick Hymbert
2024-04-26 | Reset schedule earlier to allow overlap with ggml graph computation on device... | agray3
2024-04-26 | quantize: add imatrix and dataset metadata in GGUF (#6658) | Pierrick Hymbert
2024-04-26 | add basic tensor data validation function (#6884) | slaren
2024-04-26 | gguf : fix mismatch between alloc and free functions (#6929) | slaren
2024-04-26 | llamafile : use 64-bit integers in sgemm (#6928) | Justine Tunney
2024-04-26 | ci: server: fix python installation (#6925) | Pierrick Hymbert
2024-04-26 | server: stop generation at `n_ctx_train` if `n_predict` is not set (#6638) | Pierrick Hymbert
2024-04-26 | ci: server: fix python installation (#6922) | Pierrick Hymbert
2024-04-26 | Merge pull request from GHSA-p5mv-gjc5-mwqv | Georgi Gerganov
2024-04-26 | ci: server: fix python installation (#6918) | Pierrick Hymbert
2024-04-26 | ci: fix concurrency for pull_request_target (#6917) | Pierrick Hymbert
2024-04-26 | bench: server add stop word for PHI-2 (#6916) | Pierrick Hymbert
2024-04-25 | llava : add support for moondream vision language model (#6899) | vik
2024-04-25 | cmake : restore LLAMA_LLAMAFILE_DEFAULT | Georgi Gerganov
2024-04-25 | cmake : remove obsolete ANDROID check | Georgi Gerganov
2024-04-25 | llama : synchronize before get/set session data (#6911) | slaren
2024-04-25 | ci : tmp disable slow tests | Georgi Gerganov
2024-04-25 | readme : update model list (#6908) | BarfingLemurs
2024-04-25 | llama : check that all the tensor data is in the model file (#6885) | slaren
2024-04-25 | ggml : fix redefinition of vaddvq_f32 for 32-bit ARM (#6906) | Georgi Gerganov
2024-04-25 | clip : rename lerp function to avoid conflict (#6894) | Daniel Bevenius
2024-04-25 | ggml : fix MIN / MAX macros (#6904) | Georgi Gerganov
2024-04-25 | tests : minor bash stuff (#6902) | Georgi Gerganov
2024-04-25 | quantize : add '--keep-split' to quantize model into shards (#6688) | jiez
2024-04-24 | README: add graphic for matrix multiplication (#6881) | Johannes Gäßler
2024-04-24 | llama : add llama_get_pooling_type function (#6862) | Douglas Hanley
2024-04-24 | server : do not apply Markdown formatting in code sections (#6850) | mgroeber9110
2024-04-24 | common : revert showing control tokens by default for server (#6860) | Kyle Mistele
2024-04-24 | Server: fix seed for multiple slots (#6835) | Johannes Gäßler
2024-04-24 | ggml : move 32-bit arm compat in ggml-impl.h (#6865) | Georgi Gerganov
2024-04-24 | llama : add phi 3 chat template (#6857) | Tristan Druyen
2024-04-24 | convert : add support of codeqwen due to tokenizer (#6707) | Junyang Lin
2024-04-24 | llama : add phi3 support (#6852) | liuwei-git
2024-04-23 | [SYCL] Windows default build instructions without -DLLAMA_SYCL_F16 flag activ... | Anas Ahouzi
2024-04-22 | llamafile : improve sgemm.cpp (#6796) | Justine Tunney
2024-04-22 | ggml : fix calloc argument ordering. (#6820) | Dave Airlie
2024-04-22 | llama : fix typo in <|im_end|> token text (#6745) | Georgi Gerganov
2024-04-22 | ci: fix job are cancelling each other (#6781) | Pierrick Hymbert
2024-04-22 | flake.lock: Update | github-actions[bot]
2024-04-21 | `build`: generate hex dump of server assets during build (#6661) | Olivier Chafik
2024-04-21 | llama : add option to render special/control tokens (#6807) | Georgi Gerganov
2024-04-21 | ggml : fix ggml_backend_cpu_supports_op() for CPY (#0) | Georgi Gerganov
2024-04-21 | llama : add llama-3 chat template (#6751) | Wouter
2024-04-21 | gguf-py : add IQ1_M to GGML_QUANT_SIZES (#6761) | pmysl
2024-04-21 | doc : add link to falcon (#6789) | Jan Boon
2024-04-21 | readme : add Fedora instructions (#6783) | Mohammadreza Hendiani
2024-04-21 | llava : use logger in llava-cli (#6797) | Justine Tunney