Age | Commit message | Author |
---|---|---|
2024-03-26 | llama : greatly reduce output buffer memory usage (#6122) | compilade |
2024-03-24 | imatrix : fix wname for mul_mat_id ops (#6271) | Georgi Gerganov |
2024-03-18 | backend : offload large batches to GPU (#6083) | slaren |
2024-02-16 | ggml : add numa options (#5377) | bmwl |
2024-02-04 | Adding some imatrix tools (#5302) | Kawrakow |
2024-01-22 | imatrix : keep intermediate imatrix results (#5077) | Kawrakow |
2024-01-21 | Slightly faster imatrix (#5050) | Kawrakow |
2024-01-18 | imatrix : fix assert for src0 non-cont check | Georgi Gerganov |
2024-01-17 | imatrix : offload to GPU support (#4957) | Georgi Gerganov |
2024-01-12 | Importance Matrix calculation (#4861) | Kawrakow |