path: root/examples
Age | Commit message | Author
2024-06-17 | Add support for sqrt on CUDA (#7953) | Calvin Laurenson
2024-06-15 | Add `cvector-generator` example (#7514) | Xuan Son Nguyen
2024-06-14 | llama-bench : fix RPC indication (#7936) | Radoslav Gerganov
2024-06-13 | move BLAS to a separate backend (#6210) | slaren
2024-06-13 | `build`: rename main → llama-cli, server → llama-server, llava-cli → ll... | Olivier Chafik
2024-06-12 | server : restore numeric prompts (#7883) | Georgi Gerganov
2024-06-11 | llama-bench: more compact markdown tables (#7879) | Johannes Gäßler
2024-06-11 | json: refine constraint for whitespace to avoid runaways yet allow pretty pri... | Olivier Chafik
2024-06-11 | `json`: document schema conversion in GBNF readme, align manual grammar examp... | Olivier Chafik
2024-06-10 | examples : remove --instruct remnants (#7846) | Georgi Gerganov
2024-06-10 | server : improve "prompt" handling (#7847) | Georgi Gerganov
2024-06-09 | imatrix : handle partial entries (#7833) | Georgi Gerganov
2024-06-09 | server: do not remove whitespace at the start of a completion chunk (#7830) | mgroeber9110
2024-06-09 | Revert "[SYCL] Update rpc-server.cpp to include SYCL backend (#7682)" (#7808) | slaren
2024-06-08 | server : smart slot selection using Longest Common Prefix (#7728) | sasha0552
2024-06-07 | gguf-split : change binary multi-byte units to decimal (#7803) | Christian Zhou-Zheng
2024-06-07 | server: update cache_prompt documentation [no ci] (#7745) | Johannes Gäßler
2024-06-07 | server : do not get prompt in infill mode (#7286) | woodx
2024-06-07 | check for nans in imatrix and quantize (#7807) | slaren
2024-06-06 | imatrix : migrate to gpt_params (#7771) | Georgi Gerganov
2024-06-06 | grammars: x{min,max} repetition operator (#6640) | Olivier Chafik
2024-06-05 | ggml : refactor rope norm/neox (#7634) | Georgi Gerganov
2024-06-05 | readme : remove -ins (#7759) | arch-btw
2024-06-04 | common : refactor cli arg parsing (#7675) | Georgi Gerganov
2024-06-04 | ggml : remove OpenCL (#7735) | Georgi Gerganov
2024-06-04 | llama : remove beam search (#7736) | Georgi Gerganov
2024-06-04 | llama-bench : allow using a different printer for stderr with -oe (#7722) | slaren
2024-06-02 | [SYCL] Update rpc-server.cpp to include SYCL backend (#7682) | nickp27
2024-06-01 | server : new UI (#7633) | Yazan Agha-Schrader
2024-06-02 | SimpleChat: Simple histogram/repeatMatching driven garbageTrimming, Settings ... | HanishKVC
2024-05-31 | server : update js (#7670) | Georgi Gerganov
2024-05-30 | Move convert.py to examples/convert-legacy-llama.py (#7430) | Galunid
2024-05-29 | llama-bench : add support for the RPC backend (#7435) | Radoslav Gerganov
2024-05-28 | server: do not remove whitespace at the start of a completion chunk (#7524) | mgroeber9110
2024-05-28 | Markdownish code block fix (#7571) | Nathan Epstein
2024-05-28 | llava : update clip.h (#7580) | Ikko Eltociear Ashimine
2024-05-27 | main: replace --no-special with --special (#7534) | Brian
2024-05-26 | SimpleChat Completion Mode flexibility and cleanup, Settings gMe, Optional sl... | HanishKVC
2024-05-25 | train : change default FA argument (#7528) | Georgi Gerganov
2024-05-25 | main : don't print special tokens with --grammar (#6923) | Justine Tunney
2024-05-25 | android : module (#7502) | Elton Kola
2024-05-25 | Make tokenize CLI tool have nicer command line arguments. (#6188) | Mikko Juola
2024-05-24 | add build shared lib in win release package (#7438) | Neo Zhang
2024-05-23 | ggml : remove ggml_flash_attn and ggml_flash_ff (#7463) | Georgi Gerganov
2024-05-23 | main : minor (#7462) | Georgi Gerganov
2024-05-23 | SimpleChat: a simple and dumb web front end for testing /chat/completions and... | HanishKVC
2024-05-22 | common : normalize naming style (#7462) | Georgi Gerganov
2024-05-22 | phi3 : duplicate rope factors in each layer (#7447) | slaren
2024-05-21 | llama : add phi3 128K model support (#7225) | liuwei-git
2024-05-21 | `grammars`: fix resampling logic regression (#7424) | Olivier Chafik