ik_llama.cpp.git (branch: main)
Age  Commit message  Author
2024-02-03  refactor : switch to emplace_back to avoid extra object (#5291)  Michael Klimenko
2024-02-03  YaRN : store rope scaling type as int32_t in memory (#5285)  Jared Van Bortel
2024-02-03  readme : add tenere in the ui tools list (#5284)  BADR
2024-02-03  Fix im2col with 32fp (#5286)  AidanBeltonS
2024-02-02  perplexity : fix KL divergence calculations on Windows (#5273)  kalomaze
2024-02-02  scripts : parse wtype in server-llm.sh (#5167)  Georgi Gerganov
2024-02-02  py : add check for '.attn.masked_bias' layers to GPT2model (#5281)  Mirror Azure
2024-02-02  Tidy ggml-sycl (#5261)  AidanBeltonS
2024-02-02  docker : add build for SYCL, Vulkan + update readme (#5228)  Xuan Son Nguyen
2024-02-02  [SYCL] get MAX_MEM_ALLOC from device property (#5270)  Meng, Hengyu
2024-02-02  [SYCL] update guide of SYCL backend (#5254)  Neo Zhang Jianyu
2024-02-02  llama : fix memory leak in llama_batch_free (#5252)  Ian Bull
2024-02-01  add --no-mmap in llama-bench (#5257)  Neo Zhang Jianyu
2024-02-01  Vulkan Phi Fix for AMD Proprietary Drivers (#5260)  0cc4m
2024-02-01  cuda : fix LLAMA_CUDA_F16 (#5262)  slaren
2024-02-01  make : generate .a library for static linking (#5205)  Ali Nehzat
2024-02-01  llama : support InternLM2 (#5184)  Guoteng
2024-01-31  Fix broken Vulkan Cmake (properly) (#5230)  Eve
2024-01-31  llama : reorder build_orion() at correct place (#5118)  Georgi Gerganov
2024-01-31  llama : remove LLAMA_MAX_DEVICES and LLAMA_SUPPORTS_GPU_OFFLOAD (#5240)  Georgi Gerganov
2024-01-31  metal : add im2col F32 dst support (#5132)  Georgi Gerganov
2024-01-31  llava : add MobileVLM support (#5132)  JidongZhang-THU
2024-01-31  format license text, restore apache license by legal suggestion (#5233)  Neo Zhang Jianyu
2024-01-31  ggml : limit n_threads to the max n_tasks (#5238)  slaren
2024-01-31  Vulkan Fixes (#5223)  0cc4m
2024-01-30  Fix typos of IQ2_XXS and IQ3_XXS in llama.cpp (#5231)  Yiming Cui
2024-01-31  support SYCL backend windows build (#5208)  Neo Zhang Jianyu
2024-01-30  kompute : llama-bench support and ggml_cpu_has_kompute() (#5226)  Jared Van Bortel
2024-01-30  Revert "server : change deps.sh xxd files to string literals (#5221)"  Georgi Gerganov
2024-01-30  server : fix context shift (#5195)  Georgi Gerganov
2024-01-30  server : change deps.sh xxd files to string literals (#5221)  JohnnyB
2024-01-30  ggml : fix IQ3_XXS on Metal (#5219)  Kawrakow
2024-01-30  sync : ggml (#0)  Georgi Gerganov
2024-01-30  gguf : fix comparison (ggml/715)  Georgi Gerganov
2024-01-30  `ggml_cuda_cpy` support for 4d tensors and float16->float32 upcasting (ggml/686)  John Balis
2024-01-30  gguf : add input validation, prevent integer overflows (ggml/709)  Georgi Gerganov
2024-01-30  ci : fix yolo URLs + fix metal capture (ggml/712)  Georgi Gerganov
2024-01-30  metal : add debug capture backend function (ggml/694)  Jack Mousseau
2024-01-30  Faster AVX2 dot product for IQ2_XS (#5187)  Kawrakow
2024-01-30  SOTA 3-bit quants (#5196)  Kawrakow
2024-01-30  Vulkan Windows APU Memory Handling (#5199)  0cc4m
2024-01-30  quantize : fix typo (#5211)  Vladimir Malyutin
2024-01-30  main : allow empty --prompt-cache file (#5176)  divinity76
2024-01-30  readme : minor (#5204)  Romain Neutron
2024-01-30  readme : update hot topics  Georgi Gerganov
2024-01-30  server : improve README (#5209)  Wu Jian Ping
2024-01-29  ggml alloc: Fix for null dereference on alloc failure (#5200)  Paul Tsochantaris
2024-01-29  kompute : fix fallback to CPU (#5201)  Jared Van Bortel
2024-01-29  Nomic Vulkan backend (#4456)  Jared Van Bortel
2024-01-29  fix typo "RLIMIT_MLOCK" (#5175)  divinity76