Date | Commit message | Author |
---|---|---|
2024-05-11 | ggml : full ALiBi support (#7192) | Georgi Gerganov |
2024-04-30 | ggml : add Flash Attention (#5021) | Georgi Gerganov |
2024-03-26 | llama : greatly reduce output buffer memory usage (#6122) | compilade |
2024-03-18 | backend : offload large batches to GPU (#6083) | slaren |
2024-03-13 | llama : add pipeline parallelism support (#6017) | slaren |
2024-03-04 | ggml : introduce ggml_status (ggml/750) | Michael Podvitskiy |
2024-02-28 | Introduce backend GUIDs (ggml/743) | UEXTM.com |
2024-01-29 | Nomic Vulkan backend (#4456) | Jared Van Bortel |