Age | Commit message | Author |
---|---|---|
2024-06-05 | CUDA: refactor mmq, dmmv, mmvq (#7716) | Johannes Gäßler |
2024-05-23 | ggml : drop support for QK_K=64 (#7473) | Georgi Gerganov |
2024-04-03 | [SYCL] Disable iqx on windows as WA (#6435) | Meng, Hengyu |
2024-03-27 | Make IQ1_M work for QK_K = 64 (#6327) | Kawrakow |
2024-03-26 | IQ1_M: 1.75 bpw quantization (#6302) | Kawrakow |
2024-03-12 | ggml : reuse quantum structs across backends (#5943) | Georgi Gerganov |
2024-03-11 | 1.5 bit: we can do even better (#5999) | Kawrakow |
2024-03-11 | Better 1.5 bit quantization (#5971) | Kawrakow |
2024-03-10 | ggml : remove __constant__ specifier for CUDA tables (#5940) | Georgi Gerganov |
2024-03-09 | ggml : add ggml-common.h to deduplicate shared code (#5940) | Georgi Gerganov |