Commit log for ggml-common.h in the ik_llama.cpp.git repository (branch: main)

Date        Commit message                                                 Author
2024-06-22  bitnet(scale in a separate tensor): CPU improvements           Iwan Kawrakow
2024-06-22  bitnet: put the scale in a separate tensor                     Iwan Kawrakow
2024-06-22  Bitnet: 2.25 bpw version                                       Iwan Kawrakow
2024-06-22  bitnet: add 2 bpw quantization                                 Iwan Kawrakow
2024-06-22  bitnet: CUDA, scalar, AVX2                                     Iwan Kawrakow
2024-06-22  iqk_mul_mat for llama.cpp                                      Iwan Kawrakow
2024-06-05  CUDA: refactor mmq, dmmv, mmvq (#7716)                         Johannes Gäßler
2024-05-23  ggml : drop support for QK_K=64 (#7473)                        Georgi Gerganov
2024-04-03  [SYCL] Disable iqx on windows as WA (#6435)                    Meng, Hengyu
2024-03-27  Make IQ1_M work for QK_K = 64 (#6327)                          Kawrakow
2024-03-26  IQ1_M: 1.75 bpw quantization (#6302)                           Kawrakow
2024-03-12  ggml : reuse quantum structs across backends (#5943)           Georgi Gerganov
2024-03-11  1.5 bit: we can do even better (#5999)                         Kawrakow
2024-03-11  Better 1.5 bit quantization (#5971)                            Kawrakow
2024-03-10  ggml : remove __constant__ specifier for CUDA tables (#5940)   Georgi Gerganov
2024-03-09  ggml : add ggml-common.h to deduplicate shared code (#5940)    Georgi Gerganov