path: root/examples/batched/batched.cpp
Age         Commit message                                                                  Author
2023-10-24  cuda : add batched cuBLAS GEMM for faster attention (#3749)                     Georgi Gerganov
2023-10-23  llama : remove token functions with `context` args in favor of `model` (#3720)  Marcus Dunn
2023-10-22  batched : add len CLI argument                                                  Georgi Gerganov
2023-10-18  speculative : add tree-based sampling example (#3624)                           Georgi Gerganov
2023-10-11  batched : add bench tool (#3545)                                                Georgi Gerganov
2023-09-28  llama.cpp : split llama_context_params into model and context params (#3301)    slaren
2023-09-28  llama : custom attention mask + parallel decoding + no context swaps (#3228)    Georgi Gerganov