path: root/examples
Age  Commit message  Author
2023-10-28  llama : add option for greedy sampling with probs (#3813)  Georgi Gerganov
2023-10-28  speculative : ensure draft and target model vocab matches (#3812)  Kerfuffle
2023-10-27  simple : fix batch handling (#3803)  Thibault Terrasson
2023-10-26  server : do not release slot on image input (#3798)  Georgi Gerganov
2023-10-25  batched-bench : print params at start  Georgi Gerganov
2023-10-24  server : add parameter -tb N, --threads-batch N (#3584) (#3768)  cebtenzzre
2023-10-24  server : do not block system prompt update (#3767)  Georgi Gerganov
2023-10-24  cmake : add missed dependencies (#3763)  John Smith
2023-10-24  cuda : add batched cuBLAS GEMM for faster attention (#3749)  Georgi Gerganov
2023-10-23  llama : remove token functions with `context` args in favor of `model` (#3720)  Marcus Dunn
2023-10-22  server : parallel decoding and multimodal (#3677)  Georgi Gerganov
2023-10-22  main : escape prompt for cfg_negative_prompt and consecutive inputs in main w...  vvhg1
2023-10-22  batched : add len CLI argument  Georgi Gerganov
2023-10-20  sampling : refactor init to use llama_sampling_params (#3696)  Georgi Gerganov
2023-10-20  gguf : support big endian platform (#3552)  Qin Yue Chen
2023-10-20  server : fix uninitialized sampling context (close #3685)  Georgi Gerganov
2023-10-19  multimodal : add BakLLaVA conversion support (#3682)  M. Yusuf Sarıgöz
2023-10-19  llava : avoid segfault in case of non-existent mmproj file (#3674)  M. Yusuf Sarıgöz
2023-10-18  speculative : bug fixes  Georgi Gerganov
2023-10-18  speculative : add tree-based sampling example (#3624)  Georgi Gerganov
2023-10-17  llama : avoid fprintf in favor of LLAMA_LOG (#3538)  Georgi Gerganov
2023-10-17  train-text-from-scratch : fix assert failure in ggml-alloc (#3618)  slaren
2023-10-17  editorconfig : remove trailing spaces  Georgi Gerganov
2023-10-17  server : documentation of JSON return value of /completion endpoint (#3632)  coezbek
2023-10-17  save-load-state : fix example + add ci test (#3655)  Georgi Gerganov
2023-10-17  tokenizer : special token handling (#3538)  staviq
2023-10-16  llava : fix tokenization to not add bos between image embeddings and user pro...  Georgi Gerganov
2023-10-14  Honor -ngl option for Cuda offloading in llava (#3621)  M. Yusuf Sarıgöz
2023-10-13  ggml : add context enumeration functions (#3605)  slaren
2023-10-12  examples: support LLaVA v1.5 (multimodal model) (#3436)  M. Yusuf Sarıgöz
2023-10-12  server : add completion mode (no chat) (#3582)  Aarni Koskela
2023-10-12  server : fix kv cache management (#3588)  Georgi Gerganov
2023-10-11  main : fix session loading bug (#3400)  Georgi Gerganov
2023-10-11  server : add parameter -tb N, --threads-batch N (#3584)  Michael Coppola
2023-10-11  common : fix mirostat state when using multiple sequences (#3543)  Kerfuffle
2023-10-11  batched : add bench tool (#3545)  Georgi Gerganov
2023-10-11  examples : add batched.swift + improve CI for swift (#3562)  Zane Shannon
2023-10-10  infill. : fix tokenization (#3508)  vvhg1
2023-10-09  refact : fix convert script + zero out KV cache to avoid nans (#3523)  Georgi Gerganov
2023-10-08  api_like_OAI.py : compat with Microsoft Guidance (#2746)  Ryder Wishart
2023-10-08  api_like_OAI.py : simplify function (#2796)  arcrank
2023-10-06  server : docs fix default values and add n_probs (#3506)  Mihai
2023-10-06  parallel : add option to load external prompt file (#3416)  pudepiedj
2023-10-06  server : reuse llama_sample_token common util (#3494)  Jhen-Jie Hong
2023-10-05  build : use std::make_tuple() for compatibility with older GCC versions (#3488)  Kenvix ⭐
2023-10-05  server : fix incorrect num_tokens_predicted (#3480)  Jhen-Jie Hong
2023-10-04  finetune : readme fix typo (#3465)  Merrick Christensen
2023-10-03  main : consistent prefix/suffix coloring (#3425)  h-h-h-h
2023-10-03  llama : fix session saving/loading (#3400)  Georgi Gerganov
2023-10-02  gguf : general usability improvements (#3409)  cebtenzzre