Commit log for ik_llama.cpp.git, branch main
Age        | Commit message                                                                  | Author
2024-05-15 | embedding : free the batch after execution (#7297)                              | dm4
2024-05-15 | sync : ggml                                                                     | Georgi Gerganov
2024-05-15 | ggml : add `ggml_upscale_ext` (ggml/814)                                        | John Balis
2024-05-15 | server bench: fix bench not waiting for model load (#7284)                      | Johannes Gäßler
2024-05-14 | script : sync ggml-rpc                                                          | Georgi Gerganov
2024-05-14 | metal : support FA without mask + add asserts (#7278)                           | Georgi Gerganov
2024-05-14 | sync : ggml                                                                     | Georgi Gerganov
2024-05-14 | metal : tune soft_max number of threads (whisper/0)                             | Georgi Gerganov
2024-05-14 | ggml : try fix ppc64 (whisper/0)                                                | Georgi Gerganov
2024-05-14 | ggml : expose SSE3 and SSSE3 for MSVC when AVX is available (whisper/2128)      | Przemysław Pawełczyk
2024-05-14 | ggml : optimize for ppc64le using VSX intrinsics (ggml/784)                     | Hong Bo PENG
2024-05-14 | server: free sampling contexts on exit (#7264)                                  | Steve Grubb
2024-05-14 | Revert "move ndk code to a new library (#6951)" (#7282)                         | Brian
2024-05-14 | ggml : add RPC backend (#6829)                                                  | Radoslav Gerganov
2024-05-14 | llama : disable pipeline parallelism with nkvo (#7265)                          | slaren
2024-05-14 | move ndk code to a new library (#6951)                                          | Elton Kola
2024-05-14 | Add left recursion check: quit early instead of going into an infinite loop (...| Haggai Nuchi
2024-05-14 | docs: Fix typo and update description for --embeddings flag (#7026)             | Ryuei
2024-05-13 | convert-hf : support direct Q8_0 conversion (#7234)                             | compilade
2024-05-13 | llama : less KV padding when FA is off (#7257)                                  | Georgi Gerganov
2024-05-14 | llava-cli: fix base64 prompt (#7248)                                            | k.h.lai
2024-05-13 | perplexity: add BF16 vs. FP16 results (#7150)                                   | Johannes Gäßler
2024-05-13 | [SYCL] rm wait() (#7233)                                                        | Neo Zhang
2024-05-13 | llama : rename jina tokenizers to v2 (#7249)                                    | Joan Fontanals
2024-05-13 | convert.py: Outfile default name change and additional metadata support (#4858) | Brian
2024-05-13 | change default temperature of OAI compat API from 0 to 1 (#7226)                | Benjamin Findley
2024-05-13 | [SYCL] Add oneapi runtime dll files to win release package (#7241)              | Neo Zhang
2024-05-13 | [SYCL] update CI with oneapi 2024.1 (#7235)                                     | Neo Zhang
2024-05-12 | CUDA: add FP32 FlashAttention vector kernel (#7188)                             | Johannes Gäßler
2024-05-12 | cmake : fix version cmp (#7227)                                                 | Georgi Gerganov
2024-05-12 | remove convert-lora-to-ggml.py (#7204)                                          | slaren
2024-05-11 | metal : fix warnings (skipme) (#0)                                              | Georgi Gerganov
2024-05-11 | sync : ggml                                                                     | Georgi Gerganov
2024-05-11 | metal : fix indent (ggml/0)                                                     | Georgi Gerganov
2024-05-11 | ggml : resolve merge (ggml/0)                                                   | Georgi Gerganov
2024-05-12 | Scripting & documenting debugging one test without anything else in the loop....| Josh Ramer
2024-05-11 | fix system prompt handling (#7153)                                              | Xuan Son Nguyen
2024-05-11 | convert-hf : support bfloat16 conversion (#7158)                                | compilade
2024-05-11 | sync : ggml                                                                     | Georgi Gerganov
2024-05-11 | feat: implemented sigmoid function (ggml/806)                                   | Justina Cho
2024-05-11 | build: fix and ignore msvc warnings (ggml/805)                                  | Borislav Stanimirov
2024-05-11 | convert : skip unaccessible HF repos (#7210)                                    | CrispStrobe
2024-05-11 | server : free llama_batch on exit (#7212)                                       | Steve Grubb
2024-05-11 | llama : lookup word in vocab before doing BPE merges (#7193)                    | Haoxiang Fei
2024-05-11 | server: fix reported top tokens for temperature 0 (#7203)                       | Johannes Gäßler
2024-05-11 | llama : add Jina Embeddings architecture (#6826)                                | Joan Fontanals
2024-05-11 | ggml : full ALiBi support (#7192)                                               | Georgi Gerganov
2024-05-10 | llama-bench : add pp+tg test type (#7199)                                       | slaren
2024-05-10 | metal : fix flash attention kernel requirements (#7169)                         | Georgi Gerganov
2024-05-10 | convert : print "ignore_merges" field                                           | Georgi Gerganov