path: root/examples/speculative/speculative.cpp
Age | Commit message | Author
2023-10-23 | llama : remove token functions with `context` args in favor of `model` (#3720) | Marcus Dunn
2023-10-20 | sampling : refactor init to use llama_sampling_params (#3696) | Georgi Gerganov
2023-10-18 | speculative : bug fixes | Georgi Gerganov
2023-10-18 | speculative : add tree-based sampling example (#3624) | Georgi Gerganov
2023-10-11 | common : fix mirostat state when using multiple sequences (#3543) | Kerfuffle
2023-10-03 | llama : fix session saving/loading (#3400) | Georgi Gerganov
2023-09-28 | llama.cpp : split llama_context_params into model and context params (#3301) | slaren
2023-09-28 | llama : custom attention mask + parallel decoding + no context swaps (#3228) | Georgi Gerganov
2023-09-14 | speculative : add heuristic algorithm (#3006) | Leng Yue
2023-09-13 | speculative: add --n-gpu-layers-draft option (#3063) | FK
2023-09-08 | build : do not use _GNU_SOURCE gratuitously (#2035) | Przemysław Pawełczyk
2023-09-05 | speculative : add grammar support (#2991) | Georgi Gerganov
2023-09-03 | speculative : PoC for speeding-up inference via speculative sampling (#2926) | Georgi Gerganov