ik_llama.cpp.git
main
Age
Commit message
Author
2024-06-14
llama : more checks before assuming FIM tokens (#7644)
Sigbjørn Skjæret
2024-06-14
convert : add Poro-34B-chat tokenizer support (#7713)
Elaine
2024-06-13
rpc : fix ggml_backend_rpc_supports_buft() (#7918)
Radoslav Gerganov
2024-06-13
readme : Remove outdated instructions from README.md (#7914) [no ci]
Galunid
2024-06-13
move BLAS to a separate backend (#6210)
slaren
2024-06-13
`build`: rename main → llama-cli, server → llama-server, llava-cli → ll...
Olivier Chafik
2024-06-12
CUDA: fix broken oob check for FA vec f32 kernel (#7904)
Johannes Gäßler
2024-06-12
tests : add non-cont unary tests (#7857)
Georgi Gerganov
2024-06-12
ggml : improve ggml_is_contiguous logic (#7856)
Georgi Gerganov
2024-06-12
server : restore numeric prompts (#7883)
Georgi Gerganov
2024-06-12
update intel docker oneapi-basekit to 2024.1.1-devel-ubuntu22.04 (#7894)
Meng, Hengyu
2024-06-12
Fix a typo and add Fedora 40 pacakge to install for Vulkan (#7794) [no ci]
Patrice Ferlet
2024-06-11
vulkan: select only one device for single gpu with multiple drivers (#7582)
k.h.lai
2024-06-11
Update Vulkan RoPE implementation (#7818)
0cc4m
2024-06-12
fix broken link in pr template (#7880) [no ci]
Deven Mistry
2024-06-11
github: move PR template to .github/ root (#7868)
Brian
2024-06-11
llama-bench: more compact markdown tables (#7879)
Johannes Gäßler
2024-06-11
tests : check the Python version (#7872)
Georgi Gerganov
2024-06-11
CUDA: int8 tensor cores for MMQ (q4_K, q5_K, q6_K) (#7860)
Johannes Gäßler
2024-06-11
fix CUDA CI by using a windows-2019 image (#7861)
slaren
2024-06-11
json: refine constraint for whitespace to avoid runaways yet allow pretty pri...
Olivier Chafik
2024-06-11
`json`: document schema conversion in GBNF readme, align manual grammar examp...
Olivier Chafik
2024-06-10
cmake : fix CMake requirement for CUDA (#7821)
Jared Van Bortel
2024-06-10
ci : try win-2019 on server windows test (#7854)
slaren
2024-06-10
examples : remove --instruct remnants (#7846)
Georgi Gerganov
2024-06-10
server : improve "prompt" handling (#7847)
Georgi Gerganov
2024-06-10
CUDA: use tensor cores for MMQ (#7676)
Johannes Gäßler
2024-06-10
use the correct SYCL context for host USM allocations (#7777)
Ben Ashbaugh
2024-06-09
flake.lock: Update (#7838)
Georgi Gerganov
2024-06-09
imatrix : handle partial entries (#7833)
Georgi Gerganov
2024-06-10
docs: Added initial PR template with directions for doc only changes and squa...
Nicolás Pérez
2024-06-09
server: do not remove whitespace at the start of a completion chunk (#7830)
mgroeber9110
2024-06-09
CUDA: revise q8_1 data layout for mul_mat_q (#7824)
Johannes Gäßler
2024-06-09
convert-hf : set the model name based on cli arg, if present (#7693)
sasha0552
2024-06-09
convert-hf : match model part name prefix and suffix (#7687)
compilade
2024-06-09
gguf-py : decouple adding metadata from writing in GGUFWriter (#7827)
compilade
2024-06-09
Revert "[SYCL] Update rpc-server.cpp to include SYCL backend (#7682)" (#7808)
slaren
2024-06-08
url: save -mu downloads to new cache location (#7826)
Olivier Chafik
2024-06-08
server : smart slot selection using Longest Common Prefix (#7728)
sasha0552
2024-06-07
vulkan : reuse parent extra for views (#7806)
slaren
2024-06-07
gguf-split : change binary multi-byte units to decimal (#7803)
Christian Zhou-Zheng
2024-06-07
cmake : fix BUILD_SHARED_LIBS=ON build (#7784)
intelmatt
2024-06-07
server: update cache_prompt documentation [no ci] (#7745)
Johannes Gäßler
2024-06-07
server : do not get prompt in infill mode (#7286)
woodx
2024-06-07
[SYCL] fix softmax r2r result wrong issue (#7811)
pengxin99
2024-06-07
check for nans in imatrix and quantize (#7807)
slaren
2024-06-06
server : fix --threads-http arg (#7801)
Georgi Gerganov
2024-06-06
imatrix : migrate to gpt_params (#7771)
Georgi Gerganov
2024-06-06
Added support for . (any character) token in grammar engine. (#6467)
Clint Herron
2024-06-06
README minor fixes (#7798) [no ci]
Mattheus Chediak