Age | Commit message | Author |
---|---|---|
2024-01-20 | llama : run all KQV ops on the CPU with no KV offload (#5049) | slaren |
2024-01-17 | ggml : add IQ2 to test-backend-ops + refactoring (#4990) | Georgi Gerganov |
2024-01-17 | backend : add eval callback (#4935) | Georgi Gerganov |
2024-01-16 | ggml : introduce GGML_CALL function annotation (#4850) | Justine Tunney |
2024-01-12 | backend_sched : fix assignments | slaren |
2024-01-12 | llama : ggml-backend integration (#4766) | slaren |
2024-01-05 | ggml : add error handling to graph_compute (whisper/1714) | Finn Voorhees |
2023-12-29 | ggml : fix some mul mat cases + add tests for src1 F16 (ggml/669) | bssrdf |
2023-12-24 | cuda : improve cuda pool efficiency using virtual memory (#4606) | slaren |
2023-12-21 | llama : initial ggml-backend integration (#4520) | slaren |
2023-12-07 | sync : ggml (new ops, tests, backend, etc.) (#4359) | Georgi Gerganov |
2023-11-13 | sync : ggml (backend v2) (#3912) | Georgi Gerganov |
2023-10-08 | sync : ggml (ggml-backend) (#3548) | Georgi Gerganov |