| Age | Commit message | Author |
|---|---|---|
| 2023-06-24 | llama : make model stateless and context stateful (llama_state) (#1797) | Didzis Gosko | 
| 2023-06-17 | Only one CUDA stream per device for async compute (#1898) | Johannes Gäßler | 
| 2023-06-16 | build : fix and ignore MSVC warnings (#1889) | Borislav Stanimirov | 
| 2023-06-15 | Better error when using both LoRA + GPU layers (#1861) | Johannes Gäßler | 
| 2023-06-14 | CUDA full GPU acceleration, KV cache in VRAM (#1827) | Johannes Gäßler | 
| 2023-06-11 | Fix issue where interactive mode crashes when input exceeds ctx size (#1789) | Kerfuffle | 
| 2023-06-06 | main: add the possibility to open the prompt cache read-only (#1640) | Willy Tarreau | 
| 2023-06-06 | Multi GPU support, CUDA refactor, CUDA scratch buffer (#1703) | Johannes Gäßler | 
| 2023-06-04 | llama : Metal inference (#1642) | Georgi Gerganov | 
| 2023-05-28 | Only show -ngl option when relevant + other doc/arg handling updates (#1625) | Kerfuffle | 
| 2023-05-28 | examples : add --alias option to gpt_params to set a user-friendly model name (#... | Vladimir Zorin |
| 2023-05-20 | Fix for mingw (#1462) | DannyDaemonic | 
| 2023-05-19 | main : make reverse prompt option act as a stop token in non-interactive mode... | Jason McCartney | 
| 2023-05-19 | minor : fix compile warnings | Georgi Gerganov | 
| 2023-05-17 | Remove unused n_parts parameter (#1509) | Stephan Walter | 
| 2023-05-15 | fix get_num_physical_cores() (#1436) | zrm | 
| 2023-05-13 | ggml : GPU-accelerated token generation (#1412) | Johannes Gäßler | 
| 2023-05-12 | CLI args use - instead of _, backwards compatible (#1416) | Johannes Gäßler | 
| 2023-05-10 | main : add option to save full output to session (#1338) | Evan Jones | 
| 2023-05-09 | Locale fix for Windows (#1379) | DannyDaemonic | 
| 2023-05-08 | Interface improvements and `--multiline-input` (previously `--author-mode`) (... | DannyDaemonic | 
| 2023-05-08 | llama : require first token to be BOS (#1303) | Georgi Gerganov | 
| 2023-05-08 | Documented CUDA reproducibility, added warning (#1346) | Johannes Gäßler | 
| 2023-05-04 | main : add --in-suffix option (#1318) | 44670 | 
| 2023-05-04 | Only escape prompts when used with `-e` (#1311) | DannyDaemonic | 
| 2023-05-03 | fix missing parameters in `llama_init_from_gpt_params` (#1293) | slaren |
| 2023-05-02 | Process escape sequences given in prompts (#1173) | DannyDaemonic |
| 2023-05-02 | examples : add llama_init_from_gpt_params() common function (#1290) | Ron Evans | 
| 2023-05-02 | llama : allow 0 as a seed number. (#1275) | Robert Brisita | 
| 2023-04-30 | common : better default number of threads (#934) | jon-chuang | 
| 2023-04-29 | llama : new sampling algorithms (#1126) | Ivan Stepanov | 
| 2023-04-28 | llama : add session file format and saved sessions in main (#1169) | Evan Jones | 
| 2023-04-24 | examples/main README improvements and some light refactoring (#1131) | mgroeber9110 | 
| 2023-04-17 | Add LoRA support (#820) | slaren | 
| 2023-04-14 | Revert "main : alternative instruct mode (Vicuna support, etc.) (#863)" (#982) | Pavol Rusnak | 
| 2023-04-14 | main : alternative instruct mode (Vicuna support, etc.) (#863) | Tomáš Pazdiora | 
| 2023-04-13 | common : remove unnecessary includes (#947) | CRD716 | 
| 2023-04-11 | Fix whitespace, add .editorconfig, add GitHub workflow (#883) | Pavol Rusnak | 
| 2023-04-10 | Rewrite loading code to try to satisfy everyone: | comex | 
| 2023-04-08 | fix for windows utf-8 input (#840) | Tomáš Pazdiora | 
| 2023-04-02 | fix default params for examples/main (#697) | Murilo Santana | 
| 2023-04-01 | Show error message when -f fails | slaren |
| 2023-03-28 | all : be more strict about converting float to double (#458) | Stephan Walter | 
| 2023-03-28 | main.cpp fixes, refactoring (#571) | anzz1 | 
| 2023-03-25 | If n_predict == -1, generate forever | Georgi Gerganov | 
| 2023-03-25 | Infinite generation via context swapping (#71) | Georgi Gerganov |
| 2023-03-25 | Overhaul the examples structure | Georgi Gerganov |