index : ik_llama.cpp.git (branch: main)
path: root/tests
Age
Commit message
Author
2024-05-29
ggml : fix YARN + add tests + add asserts (#7617)
Georgi Gerganov
2024-05-29
cuda : non-cont concat support (#7610)
Georgi Gerganov
2024-05-28
Tokenizer WPM fixes (#7500)
jaime-m-p
2024-05-28
tests : fix test-tokenizer-0.sh
Georgi Gerganov
2024-05-28
ggml : generalize GGML_OP_CONCAT (#7563)
Georgi Gerganov
2024-05-23
Fix phi3 chat template confusion with zephyr (#7449)
Tristan Druyen
2024-05-23
ggml : remove ggml_flash_attn and ggml_flash_ff (#7463)
Georgi Gerganov
2024-05-22
cuda : fix rope + add tests (#7452)
Georgi Gerganov
2024-05-21
llama : add phi3 128K model support (#7225)
liuwei-git
2024-05-21
tests : test-tokenizer-0.sh print more info (#7402)
Georgi Gerganov
2024-05-21
Tokenizer SPM fixes for phi-3 and llama-spm (bugfix) (#7425)
jaime-m-p
2024-05-20
Tokenizer SPM fixes for phi-3 and llama-spm (#7375)
jaime-m-p
2024-05-18
ggml : fix quants nans when all the group weights are very close to zero (#7313)
slaren
2024-05-18
Unicode codepoint flags for custom regexs (#7245)
jaime-m-p
2024-05-15
ggml : add `ggml_upscale_ext` (ggml/814)
John Balis
2024-05-14
metal : support FA without mask + add asserts (#7278)
Georgi Gerganov
2024-05-14
Add left recursion check: quit early instead of going into an infinite loop (...
Haggai Nuchi
2024-05-12
CUDA: add FP32 FlashAttention vector kernel (#7188)
Johannes Gäßler
2024-05-11
llama : lookup word in vocab before doing BPE merges (#7193)
Haoxiang Fei
2024-05-11
ggml : full ALiBi support (#7192)
Georgi Gerganov
2024-05-09
llama3 custom regex split (#6965)
jaime-m-p
2024-05-09
CUDA: generalize FP16 fattn vec kernel (#7061)
Johannes Gäßler
2024-05-08
JSON: [key] -> .at(key), assert() -> GGML_ASSERT (#7143)
Johannes Gäßler
2024-05-08
llama : add BPE pre-tokenization for Qwen2 (#7114)
Ren Xuancheng
2024-05-08
ggml : introduce bfloat16 support (#6412)
Justine Tunney
2024-05-05
command-r : add BPE pre-tokenization (#7063)
DAN™
2024-05-05
py : logging and flake8 suppression refactoring (#7081)
Brian
2024-05-04
tests : add test-tokenizer-0.sh + fix some tokenizers (#7036)
Georgi Gerganov
2024-05-03
convert.py : add python logging instead of print() (#6511)
Brian
2024-04-30
ggml : add Flash Attention (#5021)
Georgi Gerganov
2024-04-29
Extending grammar integration tests (#6644)
Clint Herron
2024-04-29
llama : fix BPE pre-tokenization (#6920)
Georgi Gerganov
2024-04-24
llama : add phi 3 chat template (#6857)
Tristan Druyen
2024-04-21
llama : add llama-3 chat template (#6751)
Wouter
2024-04-18
ggml : group all experts in a single ggml_mul_mat_id (#6505)
slaren
2024-04-16
llama : add qwen2moe (#6074)
Shijie
2024-04-15
`main`: add --json-schema / -j flag (#6659)
Olivier Chafik
2024-04-14
Add Command R chat template (#6650)
Chao Jiang
2024-04-12
JSON schema conversion: ⚡️ faster repetitions, min/maxLength for strings,...
Olivier Chafik
2024-04-12
metal : unify mul_mv_id kernels (#6556)
slaren
2024-04-11
grammars: 1.5x faster inference w/ complex grammars (vector reserves / reuses...
Olivier Chafik
2024-04-06
Tests: Added integration tests for GBNF parser (#6472)
Clint Herron
2024-04-03
Add OpenChat, Alpaca, Vicuna chat templates (#6397)
kaizau
2024-04-03
ggml : mul_mat_id use the same tensor for all the experts (#6387)
slaren
2024-03-26
IQ1_M: 1.75 bpw quantization (#6302)
Kawrakow
2024-03-25
tests : include IQ2_XXS and IQ2_XS in test-quantize-fns (#6303)
Kawrakow
2024-03-22
tests : conditional python & node json schema tests (#6207)
Olivier Chafik
2024-03-22
json-schema-to-grammar : fix order of props + non-str const/enum (#6232)
Olivier Chafik
2024-03-22
metal : pad n_ctx by 32 (#6177)
Georgi Gerganov
2024-03-21
tests : disable system() calls (#6198)
Georgi Gerganov