path: root/examples
2023-08-18  llama : add benchmark example (#2626)  [slaren]
    * llama : add benchmark example
    * add to examples CMakeLists.txt
    * fix msvc build
    * add missing include
    * add Bessel's correction to stdev calculation
    * improve markdown formatting
    * add missing include
    * print warning if NDEBUG is not defined
    * remove n_prompt and n_gen from the matrix, use each value separately instead
    * better checks for non-optimized builds
    * llama.cpp : fix MEM_REQ_SCRATCH0 reusing the value of n_ctx of the first call
    * fix json formatting
    * add sql output
    * add basic cpu and gpu info (linux/cuda only)
    * markdown: also show values that differ from the default
    * markdown: add build id
    * cleanup
    * improve formatting
    Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
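The Bessel's-correction change above amounts to dividing the sum of squared deviations by n - 1 instead of n when estimating the spread of repeated benchmark runs. A minimal sketch of the idea (the function name and interface are illustrative, not llama-bench's actual code):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Sample standard deviation with Bessel's correction: dividing by (n - 1)
// rather than n removes the bias that comes from estimating the mean from
// the same sample of benchmark runs.
static double sample_stdev(const std::vector<double> & v) {
    if (v.size() < 2) {
        return 0.0; // stdev is not meaningful for fewer than two samples
    }
    double mean = 0.0;
    for (double x : v) mean += x;
    mean /= v.size();
    double sq_sum = 0.0;
    for (double x : v) sq_sum += (x - mean) * (x - mean);
    return std::sqrt(sq_sum / (v.size() - 1)); // n - 1: Bessel's correction
}
```

With only a handful of runs per configuration, the uncorrected (divide-by-n) estimator noticeably understates the variance, which is why the correction matters for a benchmark tool.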
2023-08-18  perplexity : more meaningful ETA number - 2 decimal points  [Georgi Gerganov]
2023-08-18  server : support for saving templates in browser LocalStorage (#2486)  [staviq]
    * support for templates in browser LocalStorage
    * sync accepted #2409 fix from upstream
    * convert autosave invocation to useEffect
    * Apply suggestions from code review
    * Regen index.html.cpp, suggested from code review
    Co-authored-by: Jhen-Jie Hong <iainst0409@gmail.com>
2023-08-17  Add --cfg-negative-prompt-file option for examples (#2591)  [Kerfuffle]
2023-08-15  server : add missing /json-schema-to-grammar.mjs (#2616)  [Jhen-Jie Hong]
    fixes #2611
2023-08-14  server : add --numa support (#2524)  [Cheng Shao]
2023-08-14  server : fix default grammar by using an empty string in the UI (#2604)  [Jhen-Jie Hong]
2023-08-14  server : implement json-schema-to-grammar.mjs & add grammar param in the UI (#2588)  [Jhen-Jie Hong]
    * server : implement json-schema-to-grammar.mjs following the python impl
    * server : add grammar support in chat.mjs
    * server : implement grammar param in the UI
    * server : generate .hpp
    * server : remove trailing whitespaces
    * server : generate .hpp
    * server : fix sort of prop pairs
    * server : optimize regex & iteration
2023-08-12  Adding support for llama2.c models (#2559)  [byte-6174]
2023-08-12  server: fixed wrong variable name in timing json (#2579)  [Equim]
    * server: fixed wrong variable name in timing json
    * remove redundant entry
2023-08-10  Handle `ENABLE_VIRTUAL_TERMINAL_PROCESSING` more gracefully on earlier versions of Windows  [DannyDaemonic]
2023-08-10  Add --n-predict -2 for stopping generation on full context (#2565)  [Christian Demsar]
2023-08-10  Fix grammar-based sampling issue in server (#2566)  [Martin Krasser]
2023-08-08  Allow passing grammar to completion endpoint (#2532)  [Martin Krasser]
2023-08-08  llm.vim : multiline autocompletion, get rid of "^@" (#2543)  [chaihahaha]
2023-08-08  vim : bring back simple llm.vim example  [Georgi Gerganov]
2023-08-08  vim : streaming and more (#2495)  [AustinMroz]
    * Update Vim plugin
    * Remove getbufoneline usage, add input bind example
      getbufoneline() appears to be a recently added function and has been replaced with getbufline for compatibility. An additional example that explains how to add a keybind that works in insert mode was added.
2023-08-07  Add --rope-scale parameter (#2544)  [klosax]
    * common.cpp : Add --rope-scale parameter
    * README.md : Add info about using linear rope scaling
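Linear rope scaling, which `--rope-scale` exposes, compresses token positions by the scale factor before the rotary angles are computed, so a model trained on a shorter context can attend over proportionally more positions. A sketch of the idea (the function name and signature are assumptions for illustration, not the actual llama.cpp kernel):

```cpp
#include <cassert>
#include <cmath>

// Rotary angle for one dimension pair under linear RoPE scaling: the position
// is divided by rope_scale, mapping an extended context back onto the position
// range the model saw in training.
static double rope_angle(int pos, int dim_pair, int n_dims,
                         double freq_base, double rope_scale) {
    // per-dimension frequency: freq_base^(-2*i/n_dims)
    const double freq = std::pow(freq_base, -2.0 * dim_pair / n_dims);
    return ((double) pos / rope_scale) * freq; // linear scaling of position
}
```

With `rope_scale = 2`, position 100 produces the same angle the unscaled model computes at position 50, which is exactly the "stretched context" effect the option is after.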
2023-08-06  console : fix issue related to Windows 11 PowerShell console mode persistence (#2521)  [DannyDaemonic]
2023-08-04  fix firefox autoscroll (#2519)  [Jonas Wunderlich]
2023-08-04  server: regenerate completion.js.hpp (#2515)  [Cebtenzzre]
2023-08-04  Add --simple-io option for subprocesses and break out console.h and cpp (#1558)  [DannyDaemonic]
2023-08-04  Fixing race condition in server and partial stream handling in frontend (#2391)  [Stephen Nichols]
    * Fixing race condition in server.cpp and partial stream handling in completion.js
    * Reverting assert edits
    * Adding newline to eof
2023-08-04  build : fix several cast and printf warnings (#2499)  [Borislav Stanimirov]
2023-08-02  examples : generate JSON according to schema (#1887)  [Evan Jones]
    * examples : add JSON schema grammars
    * complete JSON grammar
    * ensure primitive types can be used as root of schema
    * support integer type and adjust usage text
2023-08-02  tests : Fix compilation warnings (Linux/GCC) (#2451)  [Eve]
    * fix hellaswag print format, cast away warning in test-double-float
    * c++11 cannot use designated initializers
    * add static to test-grad0.c internal functions
    * use memcpy in test-double-float.c
    * port c tests to c++
    * use initializer list for ggml_init_params
2023-08-01  fix a typo in examples/server/README.md (#2478)  [Bono Lv]
2023-08-01  server : Support dark mode (#2414)  [ebraminio]
    * server : Support dark mode, so it respects the user's system light/dark settings
    * Update index.html.hpp by running ./deps.sh
2023-07-31  CUDA: mmq CLI option, fixed mmq build issues (#2453)  [Johannes Gäßler]
2023-07-28  perplexity : add HellaSwag calculation (#2389)  [klosax]
    * common.h : add hellaswag / remove perplexity-lines
    * common.cpp : add hellaswag / remove perplexity-lines
    * perplexity.cpp : add hellaswag scores / remove perplexity-lines
    * perplexity.cpp : clean up
    * common.h : change default param value
    * common.cpp : change default param
    * perplexity.cpp : alter wording
    * common.h : alter wording
    * common.cpp : alter wording
2023-07-28  examples : fix whitespace  [Georgi Gerganov]
2023-07-28  examples : server chat mode with llama2 (#2400)  [nhamanasu]
    * add: server chat mode with llama2
    * fix: remove the unnecessary last \n
2023-07-28  readme : fix the description of the Tail free sampling (TFS) method (#2431)  [Weird Constructor]
2023-07-28  llama : use n_embd_gqa instead of n_embd to handle llama-2 70B (#2433)  [Rand Xie]
2023-07-25  Add LLAMA_DEFAULT_RMS_EPS so we can change the default (#2384)  [Kawrakow]
    Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
2023-07-25  main : add `--in-prefix-bos` to prefix BOS to user inputs; keep EOS (#2304)  [Xiao-Yong Jin]
    * add `--in-prefix-bos` to prefix BOS to user inputs; keep EOS
      The BOS precedes the string specified by `--in-prefix`. Model-generated EOS is now kept in the context. This provides a way to strictly follow the prompt format used in Llama-2-chat. The EOS handling also benefits some existing finetunes that use EOS to mark the end of a turn.
    * examples/common: move input_prefix_bos to other bools
2023-07-25  server: add rms_norm_eps parameter (#2380)  [slaren]
2023-07-25  [Server] Escape HTML in webchat (#2368)  [Henri Vasserman]
    * escape HTML in webchat
    * add amp
2023-07-24  make rms_norm_eps a parameter (#2374)  [slaren]
    * make rms_norm_eps a parameter
    * add rms_norm_eps to command line
    * fix baby llama, test-grad0
    * use scientific notation for eps param in the help
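The eps made configurable here is the small constant inside RMSNorm that keeps the denominator away from zero; different checkpoints were trained with different values, so hard-coding one hurts models trained with another. A simplified sketch of the operation (not ggml's actual implementation, which works on tensors in place):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// RMSNorm as used in LLaMA-style models: x * 1/sqrt(mean(x^2) + eps).
// Accumulating in double mirrors the usual practice of doing the reduction
// at higher precision than the stored activations.
static std::vector<float> rms_norm(const std::vector<float> & x, float eps) {
    double sum_sq = 0.0;
    for (float v : x) sum_sq += (double) v * v;
    const double scale = 1.0 / std::sqrt(sum_sq / x.size() + eps);
    std::vector<float> out(x.size());
    for (size_t i = 0; i < x.size(); ++i) out[i] = (float) (x[i] * scale);
    return out;
}
```

Because eps sits inside the square root next to mean(x^2), its effect is largest on near-zero activations, which is where a mismatched value shows up as degraded perplexity.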
2023-07-24  Chat UI extras (#2366)  [Aarni Koskela]
    * makefile: correct deps for server
    * server: tighten settings layout a little
    * server: expose all currently configured generation params in UI
    * server: expose remaining generation params, for the adventurous
    * server: embetter mirostat fields
2023-07-23  llama : add grammar-based sampling (#1773)  [Evan Jones]
    * llama, main : constrain sampling to grammar
    * allow loading grammar from file
    * fix whitespace errors
    * handle & print parser errors
    * add comments to grammar syntax and allow newlines where unambiguous
    * add missing include
    * support alternates in root rule
    * fix bugs with empty token and EOS
    * adjust JSON grammar
    * remove swp file
    * rewrite ternary expressions
    * use struct for grammar elements and add Unicode support
    * add unicode escapes
    * add inverse char ranges
    * only sample full tokens (no peeking or truncation)
    * llama : minor style changes blindly applied in online editor - hopefully I didn't break something
    * update help text
    * add warning message if EOS is disabled
    Co-authored-by: Henri Vasserman <henv@hot.ee>
    Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
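At its core, grammar-based sampling masks out every token the grammar's current parse state cannot accept before a token is picked. llama.cpp tracks acceptability with grammar stacks over candidate arrays; the boolean mask below is a toy simplification of that masking step:

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// Set the logit of every disallowed token to -infinity so that softmax
// assigns it zero probability; sampling then cannot leave the grammar.
// Takes logits by value and returns the masked copy for clarity.
static std::vector<float> apply_grammar_mask(std::vector<float> logits,
                                             const std::vector<bool> & allowed) {
    for (size_t i = 0; i < logits.size(); ++i) {
        if (!allowed[i]) {
            logits[i] = -std::numeric_limits<float>::infinity();
        }
    }
    return logits;
}
```

After sampling, the real implementation also advances the parse state with the chosen token, so the allowed set is recomputed at every step.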
2023-07-23  Add gqa parameter support to the server (#2351)  [IgnacioFDM]
    * Add gqa parameter support to the server
    * Change help from stderr to stdout
2023-07-23  common : n_threads == -1 uses std::thread::hardware_concurrency() (#2347)  [wzy]
    * Fix #2345, fix incorrect n_threads
    * Update examples/common.cpp
    Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
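The fix above resolves a thread count of -1 via std::thread::hardware_concurrency(). One wrinkle worth sketching: that function is allowed to return 0 when the count is not computable, so a fallback is needed (the function name and the fallback value of 4 here are assumptions for illustration, not the committed code):

```cpp
#include <cassert>
#include <thread>

// Resolve a user-supplied thread count: positive values are taken as-is,
// non-positive values mean "use all hardware threads".
static int resolve_n_threads(int n_threads) {
    if (n_threads > 0) {
        return n_threads;
    }
    const unsigned int hw = std::thread::hardware_concurrency();
    return hw > 0 ? (int) hw : 4; // hardware_concurrency() may return 0
}
```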
2023-07-23  llama : grouped-query attention + LLaMAv2 70B support (#2276)  [Georgi Gerganov]
    * CUDA: GQA implementation
    * llama : support for GQA and LLaMAv2 70B
    * py : fix hparams parsing (if-else blocks)
    * py : oh boy ..
    * help : fix gqa value for 70B
    Co-authored-by: JohannesGaessler <johannesg@5d6.de>
2023-07-23  llama : print help to stdout (#2338)  [maddes8cht]
2023-07-23  examples : simplify vim plugin (#2327)  [AustinMroz]
    Uses builtin json_encode and json_decode functions to simplify escaping. Removes the need for temp files.
2023-07-22  llama : optimize memory buffers (#2325)  [Georgi Gerganov]
2023-07-22  Perplexity: Compute scores correlated to HellaSwag (#2312)  [klosax]
    * Add parameter --perplexity-lines to perplexity.cpp
2023-07-22  examples : basic VIM plugin  [whoreson]
    VIM plugin for the server executable
2023-07-21  examples : add easy python script to create quantized (k-bit support) GGML models from local HF Transformer models (#2311)  [Richard Roberson]
    * Resync my fork with new llama.cpp commits
    * examples : rename to use dash instead of underscore
    Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>