Repository log: ik_llama.cpp.git (branch: main)

Age        | Commit message                                                                | Author
2023-08-23 | minor : fix trailing whitespace                                               | Georgi Gerganov
2023-08-23 | examples : restore the functionality to import llama2.c models (#2685)        | Olivier Chafik
2023-08-23 | fix convert-lora-to-ggml.py (#2738)                                           | slaren
2023-08-23 | main : insert bos if no tokens (#2727)                                        | klosax
2023-08-23 | gitignore : fix for windows (#2729)                                           | akawrykow
2023-08-23 | chmod : make scripts executable (#2675)                                       | Cebtenzzre
2023-08-23 | devops : RPM Specs (#2723)                                                    | JohnnyB
2023-08-23 | Fix values shown in the quantize tool help (#2735)                            | Kawrakow
2023-08-23 | Strided perplexity (#2714)                                                    | Kawrakow
2023-08-23 | Fix ggml to gguf conversion on Windows (#2733)                                | IgnacioFDM
2023-08-23 | server : allow json array in prompt or content for direct token input (#2306) | Xiao-Yong Jin
2023-08-22 | docs : add grammar docs (#2701)                                               | Evan Jones
2023-08-22 | Improve handling of special tokens in GGML to GGUF converter (#2725)          | Kerfuffle
2023-08-23 | llama : fix whitespace escaping in tokenizer (#2724)                          | goerch
2023-08-22 | CUDA: use mul_mat_q kernels by default (#2683)                                | Johannes Gäßler
2023-08-22 | convert.py : clarifying error message (#2718)                                 | Alex Petenchea
2023-08-22 | Fix CUDA softmax by subtracting max value before exp (#2665)                  | Jiahao Li
2023-08-22 | gguf : add ftype meta info to the model (#2710)                               | Georgi Gerganov
2023-08-22 | Quantization imrovements for k_quants (#2707)                                 | Kawrakow
2023-08-22 | embedding : evaluate prompt in batches (#2713)                                | slaren
2023-08-22 | ggml-cuda : use graph allocator (#2684)                                       | slaren
2023-08-22 | ggml : sync latest (SAM + SD operators, CUDA alibi) (#2709)                   | Georgi Gerganov
2023-08-22 | llama-bench : minor fixes (#2695)                                             | slaren
2023-08-22 | ggml : support CUDA's half type for aarch64(#1455) (#2670)                    | Kylin
2023-08-22 | metal : add missing barriers for mul-mat (#2699)                              | Shouzheng Liu
2023-08-22 | server : fallback to default if client param is null (#2688)                  | Jhen-Jie Hong
2023-08-21 | Fix convert-llama-ggmlv3-to-gguf.py vocab conversion (#2698)                  | Kerfuffle
2023-08-21 | py : remove obsolete script                                                   | Georgi Gerganov
2023-08-21 | gguf : new file format with flexible meta data (beta) (#2398)                 | Georgi Gerganov
2023-08-21 | metal : fix synchronization in new matrix multiplication kernel (#2686)       | Shouzheng Liu
2023-08-21 | HellaSwag: split token evaluation into batches if needed (#2681)              | Kawrakow
2023-08-20 | ggml : move all type info to ggml_type_traits (#2663)                         | slaren
2023-08-20 | More efficient Hellaswag implementation (#2677)                               | Kawrakow
2023-08-19 | server : better default prompt (#2646)                                        | Georgi Gerganov
2023-08-19 | server : update xxd usage for older versions compatibility (#2649)            | Jhen-Jie Hong
2023-08-18 | Add link to clojure bindings to Readme. (#2659)                               | Adrian
2023-08-18 | readme : incoming BREAKING CHANGE                                             | Georgi Gerganov
2023-08-18 | llama : add benchmark example (#2626)                                         | slaren
2023-08-18 | readme : add link to Rust bindings (#2656)                                    | mdrokz
2023-08-18 | perplexity : more meaningful ETA number - 2 decimal points                    | Georgi Gerganov
2023-08-17 | Fix unicode in grammars (fixes #2501) (#2553)                                 | Evan Jones
2023-08-18 | server : support for saving templates in browser LocalStorage (#2486)         | staviq
2023-08-17 | README: fix LLAMA_CUDA_MMV_Y documentation (#2647)                            | Johannes Gäßler
2023-08-17 | [Zig] Fixing Zig build and improvements (#2554)                               | Henri Vasserman
2023-08-17 | Add --cfg-negative-prompt-file option for examples (#2591)                    | Kerfuffle
2023-08-17 | llama : replace (permute + reshape + view_1d) with (view_3d) (#2538)          | Georgi Gerganov
2023-08-17 | tests : adds simple llama grammar tests (#2618)                               | drbh
2023-08-17 | ggml-alloc : fix discrepency between measure&eval (#2639)                     | Shouzheng Liu
2023-08-16 | cmake : install ggml-meta.metal if LLAMA_METAL (#2449)                        | Kolen Cheung
2023-08-16 | metal : print error of load pipeline state (#2564)                            | Jhen-Jie Hong