path: root/examples
Date       | Commit message | Author
2023-09-03 | examples : fix gpt-neox (#2943) | momonga
2023-09-02 | server : avoid antiprompt in probabilities of final response (#2849) | Jhen-Jie Hong
2023-09-01 | readme : quick start command fix (#2908) | ZHAOKAI WANG
2023-09-01 | Allow quantize to only copy tensors, some other improvements (#2931) | Kerfuffle
2023-09-01 | llama2c : rename function | Georgi Gerganov
2023-09-01 | minor : add const qualifiers (#2853) | m3ndax
2023-09-01 | build : fix most gcc and clang warnings (#2861) | Cebtenzzre
2023-09-01 | llama2c : fix segfault and alloc-dealloc-mismatch (#2913) | Cebtenzzre
2023-08-31 | scripts: Use local gguf package when running from repo (#2927) | Kerfuffle
2023-08-30 | examples : fix underscore in beam-search + .gitignore (close #2900) | Georgi Gerganov
2023-08-30 | llm.vim : stop generation at multiple linebreaks, bind to <F2> (#2879) | chaihahaha
2023-08-30 | main : log file (#2748) | staviq
2023-08-29 | Tell users attempting to run perplexity with too few tokens to use more (#2882) | Kawrakow
2023-08-28 | train : mem usage and other improvements (#2439) | xaedes
2023-08-28 | llama-bench : set locale to utf8 (#2832) | slaren
2023-08-28 | YAML result logging + preset script (#2657) | Johannes Gäßler
2023-08-28 | quantize : make output filename optional again (#2823) | Cebtenzzre
2023-08-27 | examples : update llama2.c converter to read vocab and write models in GGUF f... | Olivier Chafik
2023-08-27 | llama : speedup tokenization (#2831) | Kawrakow
2023-08-27 | gguf : add 64-bit support (GGUF v2) (#2821) | Georgi Gerganov
2023-08-27 | llama : more tokenizer fixes (#2810) | Georgi Gerganov
2023-08-27 | server : add `/detokenize` endpoint (#2802; see the example below the table) | Bruce MacDonald
2023-08-26 | main : fix bug (penalize_nl=false doesn't work) + suppress warning on mingw (... | Dr. Tom Murphy VII Ph.D
2023-08-26 | Fix HellaSwag (#2805) | Kawrakow
2023-08-26 | Fix spm whitespaces (#2806) | klosax
2023-08-26 | examples : skip unnecessary external lib in server README.md how-to (#2804) | lon
2023-08-25 | Faster perplexity computation (#2786) | Kawrakow
2023-08-25 | llama : add llama_beam_search() (#2267) | Matt Pulver
2023-08-25 | llama-bench : add model sizes (#2771) | slaren
2023-08-25 | server : display token probabilities in the UI (#2489) | Jhen-Jie Hong
2023-08-25 | ROCm Port (#1087) | Henri Vasserman
2023-08-24 | Fix for main example getting stuck when -n -2 and --interactive (#2767) | Kerfuffle
2023-08-23 | llm : add Falcon support (#2717) | Georgi Gerganov
2023-08-23 | minor : fix trailing whitespace | Georgi Gerganov
2023-08-23 | examples : restore the functionality to import llama2.c models (#2685) | Olivier Chafik
2023-08-23 | main : insert bos if no tokens (#2727) | klosax
2023-08-23 | chmod : make scripts executable (#2675) | Cebtenzzre
2023-08-23 | Fix values shown in the quantize tool help (#2735) | Kawrakow
2023-08-23 | Strided perplexity (#2714) | Kawrakow
2023-08-23 | server : allow json array in prompt or content for direct token input (#2306; see the example below the table) | Xiao-Yong Jin
2023-08-22 | docs : add grammar docs (#2701) | Evan Jones
2023-08-22 | CUDA: use mul_mat_q kernels by default (#2683) | Johannes Gäßler
2023-08-22 | embedding : evaluate prompt in batches (#2713) | slaren
2023-08-22 | ggml : sync latest (SAM + SD operators, CUDA alibi) (#2709) | Georgi Gerganov
2023-08-22 | llama-bench : minor fixes (#2695) | slaren
2023-08-22 | server : fallback to default if client param is null (#2688) | Jhen-Jie Hong
2023-08-21 | gguf : new file format with flexible meta data (beta) (#2398) | Georgi Gerganov
2023-08-21 | HellaSwag: split token evaluation into batches if needed (#2681) | Kawrakow
2023-08-20 | More efficient Hellaswag implementation (#2677) | Kawrakow
2023-08-19 | server : better default prompt (#2646) | Georgi Gerganov
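
The `/detokenize` endpoint added in #2802 is the inverse of the server's existing `/tokenize` endpoint. Below is a minimal Python sketch, not a definitive reference: it assumes a llama.cpp server running on the default http://localhost:8080 and the third-party `requests` package, and the payload field names (`content`, `tokens`) follow the server README of this period.

    # Round-trip text through /tokenize and /detokenize (the latter added in #2802).
    # Assumptions: server at http://localhost:8080 (default), `requests` installed.
    import requests

    BASE = "http://localhost:8080"

    # Text -> token ids
    tokens = requests.post(f"{BASE}/tokenize",
                           json={"content": "Hello, llama.cpp!"}).json()["tokens"]

    # Token ids -> text
    text = requests.post(f"{BASE}/detokenize",
                         json={"tokens": tokens}).json()["content"]
    print(text)  # should reproduce the input, modulo tokenizer whitespace handling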
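
Likewise, #2306 allows `/completion` to take a JSON array of token ids in `prompt` instead of a string. A hedged sketch under the same assumptions (default host/port, `requests`; the `n_predict` and `content` field names are taken from the same README):

    # Feed token ids directly to /completion, as permitted since #2306.
    import requests

    BASE = "http://localhost:8080"

    # Obtain ids for a prompt, then pass them as a raw token array.
    toks = requests.post(f"{BASE}/tokenize",
                         json={"content": "Once upon a time"}).json()["tokens"]
    resp = requests.post(f"{BASE}/completion",
                         json={"prompt": toks, "n_predict": 32})
    print(resp.json()["content"])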