| Age        | Commit message                                                               | Author              |
|------------|------------------------------------------------------------------------------|---------------------|
| 2024-05-11 | metal : fix indent (ggml/0)                                                  | Georgi Gerganov     |
| 2024-05-11 | ggml : resolve merge (ggml/0)                                                | Georgi Gerganov     |
| 2024-05-12 | Scripting & documenting debugging one test without anything else in the loop.... | Josh Ramer      |
| 2024-05-11 | fix system prompt handling (#7153)                                           | Xuan Son Nguyen     |
| 2024-05-11 | convert-hf : support bfloat16 conversion (#7158)                             | compilade           |
| 2024-05-11 | sync : ggml                                                                  | Georgi Gerganov     |
| 2024-05-11 | feat: implemented sigmoid function (ggml/806)                                | Justina Cho         |
| 2024-05-11 | build: fix and ignore msvc warnings (ggml/805)                               | Borislav Stanimirov |
| 2024-05-11 | convert : skip unaccessible HF repos (#7210)                                 | CrispStrobe         |
| 2024-05-11 | server : free llama_batch on exit (#7212)                                    | Steve Grubb         |
| 2024-05-11 | llama : lookup word in vocab before doing BPE merges (#7193)                 | Haoxiang Fei        |
| 2024-05-11 | server: fix reported top tokens for temperature 0 (#7203)                    | Johannes Gäßler     |
| 2024-05-11 | llama : add Jina Embeddings architecture (#6826)                             | Joan Fontanals      |
| 2024-05-11 | ggml : full ALiBi support (#7192)                                            | Georgi Gerganov     |
| 2024-05-10 | llama-bench : add pp+tg test type (#7199)                                    | slaren              |
| 2024-05-10 | metal : fix flash attention kernel requirements (#7169)                      | Georgi Gerganov     |
| 2024-05-10 | convert : print "ignore_merges" field                                        | Georgi Gerganov     |
| 2024-05-10 | llama : use n_vocab to differentiate between mistral 7B and llama3 8B (#7200) | slaren             |
| 2024-05-10 | Fix memory bug in grammar parser (#7194)                                     | Justine Tunney      |
| 2024-05-10 | Main+: optionally allow special tokens from user in interactive mode (#7097) | HanishKVC           |
| 2024-05-10 | llava : fix moondream support (#7163)                                        | Andrei              |
| 2024-05-10 | Minor arithmetic improvement to mmvq wrapper kernel (#7172)                  | Ouadie EL FAROUKI   |
| 2024-05-10 | eval-callback : fix conversion to float (#7184)                              | slaren              |
| 2024-05-09 | Vulkan Bugfixes and Improvements (#7084)                                     | 0cc4m               |
| 2024-05-09 | readme : add scheduled server workflow status badge                          | Georgi Gerganov     |
| 2024-05-09 | readme : add app (#6371)                                                     | l3utterfly          |
| 2024-05-09 | llama3 custom regex split (#6965)                                            | jaime-m-p           |
| 2024-05-09 | CUDA: generalize FP16 fattn vec kernel (#7061)                               | Johannes Gäßler     |
| 2024-05-09 | Add warning if token is invalid (#7173)                                      | Galunid             |
| 2024-05-09 | llama : update llama_timings.n_p_eval setting (#7160)                        | Daniel Bevenius     |
| 2024-05-09 | gguf-py : add special token modification capability (#7166)                  | Sigbjørn Skjæret    |
| 2024-05-09 | opencl : alignment size converted from bits to bytes (#7090)                 | Albert Jin          |
| 2024-05-09 | TypoFix (#7162)                                                              | Ahmet Zeer          |
| 2024-05-08 | cmake : fix typo (#7151)                                                     | Jared Van Bortel    |
| 2024-05-08 | convert-hf : save memory with lazy evaluation (#7075)                        | compilade           |
| 2024-05-08 | Introduction of CUDA Graphs to LLama.cpp (#6766)                             | agray3              |
| 2024-05-08 | JSON: [key] -> .at(key), assert() -> GGML_ASSERT (#7143)                     | Johannes Gäßler     |
| 2024-05-08 | Revert "llava : add support for moondream vision language model (#6899)"     | Georgi Gerganov     |
| 2024-05-08 | server : add themes + favicon (#6848)                                        | JohnnyB             |
| 2024-05-08 | metal : use `vm_allocate` instead of `posix_memalign` on macOS (#7078)       | Gilad S             |
| 2024-05-08 | main : add --conversation / -cnv flag (#7108)                                | Dawid Potocki       |
| 2024-05-08 | sgemm : AVX Q4_0 and Q8_0 (#6891)                                            | Eve                 |
| 2024-05-08 | server : add_special option for tokenize endpoint (#7059)                    | Johan               |
| 2024-05-08 | convert.py : --vocab-only generates false but valid params (#7027)           | 20kdc               |
| 2024-05-08 | llama : add BPE pre-tokenization for Qwen2 (#7114)                           | Ren Xuancheng       |
| 2024-05-08 | clean up json_value & server_log (#7142)                                     | Xuan Son Nguyen     |
| 2024-05-08 | convert : add BPE pre-tokenization for DBRX (#7132)                          | DAN™                |
| 2024-05-08 | py : also print the normalizers                                              | Georgi Gerganov     |
| 2024-05-08 | compare-llama-bench.py: add missing basicConfig (#7138)                      | Brian               |
| 2024-05-08 | ggml : introduce bfloat16 support (#6412)                                    | Justine Tunney      |