path: root/examples/batched/batched.cpp
Age | Commit message | Author
2024-07-27 | Merge mainline llama.cpp (#3) | Kawrakow
2024-06-04 | common : refactor cli arg parsing (#7675) | Georgi Gerganov
2024-05-22 | common : normalize naming style (#7462) | Georgi Gerganov
2024-04-21 | llama : support Llama 3 HF conversion (#6745) | Pedro Cuenca
2024-03-22 | metal : pad n_ctx by 32 (#6177) | Georgi Gerganov
2024-03-11 | llama : more consistent names of count variables (#5994) | Georgi Gerganov
2024-03-08 | llama : support Mamba Selective State Space Models (#5328) | compilade
2024-02-18 | ggml, common, examples, tests : fixed type arguments in printf (#5528) | Herman Semenov
2024-02-16 | ggml : add numa options (#5377) | bmwl
2024-01-08 | examples : add passkey test (#3856) | Georgi Gerganov
2023-10-24 | cuda : add batched cuBLAS GEMM for faster attention (#3749) | Georgi Gerganov
2023-10-23 | llama : remove token functions with `context` args in favor of `model` (#3720) | Marcus Dunn
2023-10-22 | batched : add len CLI argument | Georgi Gerganov
2023-10-18 | speculative : add tree-based sampling example (#3624) | Georgi Gerganov
2023-10-11 | batched : add bench tool (#3545) | Georgi Gerganov
2023-09-28 | llama.cpp : split llama_context_params into model and context params (#3301) | slaren
2023-09-28 | llama : custom attention mask + parallel decoding + no context swaps (#3228) | Georgi Gerganov