path: root/examples/server/server.cpp
author: Kawrakow <48489457+ikawrakow@users.noreply.github.com> 2024-02-28 10:37:02 +0200
committer: GitHub <noreply@github.com> 2024-02-28 10:37:02 +0200
commit: 7c4263d4261d6ee6f0539d53eb9e1b4d120ba8af
tree: 1afa821474af2b1579227870ed19e99206a546a6 /examples/server/server.cpp
parent: cb49e0f8c906e5da49e9f6d64a57742a9a241c6a
ggml : make i-quants work with super-blocks of 64 (CPU,Metal) (#5760)
* WIP: make i-quants work for QK_K = 64

* iq2_xs: attempt to fix AVX dot product for QK_K = 64

  Tests pass, but I get gibberish.

* QK_K = 64 tests pass on ARM_NEON and Metal

  Sadly, that does not mean it actually works.

* Make CUDA compile with QK_K = 64

  Tests don't pass, plus we get misaligned access

* Q2_K: fixed bug in imatrix quantization for QK_K = 64

* iq1_s: turn off SIMD implementation for QK_K = 64 (it does not work)

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
Diffstat (limited to 'examples/server/server.cpp')
0 files changed, 0 insertions, 0 deletions