path: root/ggml-backend.c
| Age | Commit message | Author |
|---|---|---|
| 2024-06-13 | move BLAS to a separate backend (#6210) | slaren |
| 2024-06-03 | llama : offload to RPC in addition to other backends (#7640) | Radoslav Gerganov |
| 2024-05-15 | ggml : tag ggml_tensor::backend as deprecated (#7290) | slaren |
| 2024-05-11 | build: fix and ignore msvc warnings (ggml/805) | Borislav Stanimirov |
| 2024-04-26 | Reset schedule earlier to allow overlap with ggml graph computation on device... | agray3 |
| 2024-04-22 | ggml : fix calloc argument ordering. (#6820) | Dave Airlie |
| 2024-04-21 | ggml : fix ggml_backend_cpu_supports_op() for CPY (#0) | Georgi Gerganov |
| 2024-03-26 | cuda : rename build flag to LLAMA_CUDA (#6299) | slaren |
| 2024-03-18 | backend : set max split inputs to GGML_MAX_SRC (#6137) | slaren |
| 2024-03-18 | backend : offload large batches to GPU (#6083) | slaren |
| 2024-03-13 | llama : add pipeline parallelism support (#6017) | slaren |
| 2024-03-04 | ggml : introduce ggml_status (ggml/750) | Michael Podvitskiy |
| 2024-02-28 | Introduce backend GUIDs (ggml/743) | UEXTM.com |
| 2024-02-18 | 1.5 bit quantization (#5453) | Kawrakow |
| 2024-02-17 | ggml : add ALiBi support for ggml_soft_max_ext (#5488) | Georgi Gerganov |
| 2024-02-17 | ci : add an option to fail on compile warning (#3952) | Ananta Bastola |
| 2024-02-13 | Early return for zero size calls to get_tensor. (#5482) | AT |
| 2024-02-12 | sync : ggml (#5452) | Georgi Gerganov |
| 2024-02-10 | ggml : add abort_callback for cpu backend (ggml/725) | Michael Podvitskiy |
| 2024-01-29 | Nomic Vulkan backend (#4456) | Jared Van Bortel |
| 2024-01-28 | ggml : add Vulkan backend (#2059) | 0cc4m |
| 2024-01-28 | ggml : add unified SYCL backend for Intel GPUs (#2690) | Abhilash Majumder |
| 2024-01-26 | cuda : fix tensor size calculation for non-split buffer (#5145) | slaren |
| 2024-01-20 | llama : run all KQV ops on the CPU with no KV offload (#5049) | slaren |
| 2024-01-17 | ggml : add IQ2 to test-backend-ops + refactoring (#4990) | Georgi Gerganov |
| 2024-01-17 | backend : add eval callback (#4935) | Georgi Gerganov |
| 2024-01-16 | ggml : introduce GGML_CALL function annotation (#4850) | Justine Tunney |
| 2024-01-12 | backend_sched : fix assignments | slaren |
| 2024-01-12 | llama : ggml-backend integration (#4766) | slaren |
| 2024-01-05 | ggml : add error handling to graph_compute (whisper/1714) | Finn Voorhees |
| 2023-12-29 | ggml : fix some mul mat cases + add tests for src1 F16 (ggml/669) | bssrdf |
| 2023-12-24 | cuda : improve cuda pool efficiency using virtual memory (#4606) | slaren |
| 2023-12-21 | llama : initial ggml-backend integration (#4520) | slaren |
| 2023-12-07 | sync : ggml (new ops, tests, backend, etc.) (#4359) | Georgi Gerganov |
| 2023-11-13 | sync : ggml (backend v2) (#3912) | Georgi Gerganov |
| 2023-10-08 | sync : ggml (ggml-backend) (#3548) | Georgi Gerganov |