ik_llama.cpp.git (branch: main)
Age
Commit message
Author
2024-04-16
convert : fix autoawq gemma (#6704)
Zheng.Deng
2024-04-16
llama : make general.name optional (#6709)
Georgi Gerganov
2024-04-16
ggml : fix llamafile sgemm wdata offsets (#6710)
Georgi Gerganov
2024-04-16
ggml : add llamafile sgemm (#6414)
Justine Tunney
2024-04-16
llama : add StableLM2 12B (#6635)
Ashish
2024-04-16
llama : add qwen2moe (#6074)
Shijie
2024-04-16
gritlm : add --outdir option to hf.sh script (#6699)
Daniel Bevenius
2024-04-16
perplexity : require positive --ctx-size arg (#6695)
Georgi Gerganov
2024-04-16
gguf : add special tokens metadata for FIM/Infill (#6689)
Daniel Bevenius
2024-04-15
`main`: add --json-schema / -j flag (#6659)
Olivier Chafik
2024-04-15
llama : fix restoring the number of outputs from state files (#6687)
compilade
2024-04-15
server : revert "minor layout improvements" (#6684)
Pierrick Hymbert
2024-04-15
swift : linux support (#6590)
Steven Prichard
2024-04-15
fix mul_mat_id() for new input, make the ut pass (#6682)
Neo Zhang Jianyu
2024-04-14
llama : add missing kv clear in llama_beam_search (#6664)
David Renshaw
2024-04-14
Add Command R chat template (#6650)
Chao Jiang
2024-04-14
flake.lock: Update (#6669)
Georgi Gerganov
2024-04-14
Added support for GGML_OP_CLAMP in Metal (#6662)
Dave
2024-04-14
Fix --split-max-size (#6655)
Sigbjørn Skjæret
2024-04-14
[bug fix] convert github repository_owner to lowercase (#6673)
Jaemin Son
2024-04-14
convert : enable the `--use-temp-file` cli flag (#6645)
James A Capozzoli
2024-04-14
fix memcpy() crash, add missed cmd in guide, fix softmax (#6622)
Neo Zhang Jianyu
2024-04-14
CUDA: fix matrix multiplication logic for tests (#6667)
Johannes Gäßler
2024-04-13
model: support arch `DbrxForCausalLM` (#6515)
Pierrick Hymbert
2024-04-12
JSON schema conversion: ⚡️ faster repetitions, min/maxLength for strings,...
Olivier Chafik
2024-04-12
metal : unify mul_mv_id kernels (#6556)
slaren
2024-04-12
infill : add download instructions for model (#6626)
Daniel Bevenius
2024-04-12
server : coherent log output for KV cache full (#6637)
Pierrick Hymbert
2024-04-12
llama : add gguf_remove_key + remove split meta during quantize (#6591)
jiez
2024-04-12
chore: Fix markdown warnings (#6625)
Rene Leonhardt
2024-04-12
imatrix : remove invalid assert (#6632)
Georgi Gerganov
2024-04-12
Correct free memory and total memory. (#6630)
MasterYi1024
2024-04-12
eval-callback: use ggml_op_desc to pretty print unary operator name (#6631)
Pierrick Hymbert
2024-04-12
ci : disable Metal for macOS-latest-cmake-x64 (#6628)
Georgi Gerganov
2024-04-11
Optimization: eliminate addition of redundant stacks when advancing grammar. ...
Clint Herron
2024-04-11
As suggested by @slaren, disabling Metal for test to fix CI build on OSX from...
Clint Herron
2024-04-11
Refactor Error Handling for CUDA (#6575)
Nikolas
2024-04-11
grammars: 1.5x faster inference w/ complex grammars (vector reserves / reuses...
Olivier Chafik
2024-04-11
ci: download artifacts to release directory (#6612)
Hugo Roussel
2024-04-11
scripts : add --outdir option to hf.sh (#6600)
Daniel Bevenius
2024-04-11
eval-callback: Example how to use eval callback for debugging (#6576)
Pierrick Hymbert
2024-04-10
gguf : add option to not check tensor data (#6582)
Daniel Bevenius
2024-04-10
minor layout improvements (#6572)
Ralph Soika
2024-04-10
llama : add model types for mixtral (#6589)
slaren
2024-04-10
convert.py : add consolidated.safetensors for mixtral 8x22b (#6587)
slaren
2024-04-10
docs : how to add a model (#6565)
Pierrick Hymbert
2024-04-10
readme : fix ROCm link (#6579)
Artem Zinnatullin
2024-04-10
readme : update UI list (#6560)
sjxx
2024-04-10
readme: fix typo in amdgpu target name (#6573)
Jiří Sejkora
2024-04-09
BERT tokenizer fixes (#6498)
Jared Van Bortel