path: root/examples
Age | Commit message | Author
2023-11-13 | sync : ggml (backend v2) (#3912) | Georgi Gerganov
2023-11-11 | Fix some documentation typos/grammar mistakes (#4032) | Richard Kiss
2023-11-10 | server : fix crash when prompt exceeds context size (#3996) | Alexey Parfenov
2023-11-11 | gguf-py: Refactor and allow reading/modifying existing GGUF files (#3981) | Kerfuffle
2023-11-10 | server : allow continue edit on completion mode (#3950) | Jhen-Jie Hong
2023-11-08 | server : add min_p param (#3877) | Mihai
2023-11-07 | ggml : fix backward rope after YaRN (#3974) | xaedes
2023-11-07 | Use params when loading models in llava-cli (#3976) | Matthew Tejo
2023-11-07 | llava : expose as a shared library for downstream projects (#3613) | Damian Stewart
2023-11-05 | server : fix typo for --alias shortcut from -m to -a (#3958) | Thái Hoàng Tâm
2023-11-03 | speculative : change default p_accept to 0.5 + CLI args (#3919) | Georgi Gerganov
2023-11-02 | build : link against build info instead of compiling against it (#3879) | cebtenzzre
2023-11-01 | llama : implement YaRN RoPE scaling (#2268) | cebtenzzre
2023-11-01 | finetune : add -ngl parameter (#3762) | Andrew Godfrey
2023-11-01 | server : re-enable completion and embedded at the same time (#3876) | Adrian Hesketh
2023-10-31 | samplers : Min-P sampler implementation [alternative to Top P/Top K] (#3841) | kalomaze
2023-10-29 | Extend llama_kv_cache_seq_rm to allow matching any sequence (#3843) | Kerfuffle
2023-10-29 | ggml : quantization refactoring (#3833) | Georgi Gerganov
2023-10-28 | llama : add option for greedy sampling with probs (#3813) | Georgi Gerganov
2023-10-28 | speculative : ensure draft and target model vocab matches (#3812) | Kerfuffle
2023-10-27 | simple : fix batch handling (#3803) | Thibault Terrasson
2023-10-26 | server : do not release slot on image input (#3798) | Georgi Gerganov
2023-10-25 | batched-bench : print params at start | Georgi Gerganov
2023-10-24 | server : add parameter -tb N, --threads-batch N (#3584) (#3768) | cebtenzzre
2023-10-24 | server : do not block system prompt update (#3767) | Georgi Gerganov
2023-10-24 | cmake : add missed dependencies (#3763) | John Smith
2023-10-24 | cuda : add batched cuBLAS GEMM for faster attention (#3749) | Georgi Gerganov
2023-10-23 | llama : remove token functions with `context` args in favor of `model` (#3720) | Marcus Dunn
2023-10-22 | server : parallel decoding and multimodal (#3677) | Georgi Gerganov
2023-10-22 | main : escape prompt for cfg_negative_prompt and consecutive inputs in main w... | vvhg1
2023-10-22 | batched : add len CLI argument | Georgi Gerganov
2023-10-20 | sampling : refactor init to use llama_sampling_params (#3696) | Georgi Gerganov
2023-10-20 | gguf : support big endian platform (#3552) | Qin Yue Chen
2023-10-20 | server : fix uninitialized sampling context (close #3685) | Georgi Gerganov
2023-10-19 | multimodal : add BakLLaVA conversion support (#3682) | M. Yusuf Sarıgöz
2023-10-19 | llava : avoid segfault in case of non-existent mmproj file (#3674) | M. Yusuf Sarıgöz
2023-10-18 | speculative : bug fixes | Georgi Gerganov
2023-10-18 | speculative : add tree-based sampling example (#3624) | Georgi Gerganov
2023-10-17 | llama : avoid fprintf in favor of LLAMA_LOG (#3538) | Georgi Gerganov
2023-10-17 | train-text-from-scratch : fix assert failure in ggml-alloc (#3618) | slaren
2023-10-17 | editorconfig : remove trailing spaces | Georgi Gerganov
2023-10-17 | server : documentation of JSON return value of /completion endpoint (#3632) | coezbek
2023-10-17 | save-load-state : fix example + add ci test (#3655) | Georgi Gerganov
2023-10-17 | tokenizer : special token handling (#3538) | staviq
2023-10-16 | llava : fix tokenization to not add bos between image embeddings and user pro... | Georgi Gerganov
2023-10-14 | Honor -ngl option for Cuda offloading in llava (#3621) | M. Yusuf Sarıgöz
2023-10-13 | ggml : add context enumeration functions (#3605) | slaren
2023-10-12 | examples: support LLaVA v1.5 (multimodal model) (#3436) | M. Yusuf Sarıgöz
2023-10-12 | server : add completion mode (no chat) (#3582) | Aarni Koskela
2023-10-12 | server : fix kv cache management (#3588) | Georgi Gerganov