author    Kawrakow <48489457+ikawrakow@users.noreply.github.com>  2024-01-14 09:45:56 +0200
committer GitHub <noreply@github.com>  2024-01-14 09:45:56 +0200
commit    147b17ac94a24d524e367cda26a9ff6245689f34 (patch)
tree      6bae34826f82aa28a60ccb26de8eda0464774110 /tests
parent    807179ec583dcb882f97d9704577c06beb2c5ec9 (diff)
2-bit quantizations (#4897)
* imatrix: load
* imatrix: WIP
* imatrix: Add Q2_K quantization
* imatrix: also guard against Q2_K_S quantization without importance matrix
* imatrix: guard even more against low-bit quantization misuse
---------
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
Diffstat (limited to 'tests')
-rw-r--r--  tests/test-backend-ops.cpp | 2
1 file changed, 1 insertion, 1 deletion
diff --git a/tests/test-backend-ops.cpp b/tests/test-backend-ops.cpp
index d9b8b106..22a7856d 100644
--- a/tests/test-backend-ops.cpp
+++ b/tests/test-backend-ops.cpp
@@ -56,7 +56,7 @@ static void init_tensor_uniform(ggml_tensor * tensor, float min = -1.0f, float m
         GGML_ASSERT(size % ggml_blck_size(tensor->type) == 0);
         std::vector<uint8_t> dataq(ggml_row_size(tensor->type, size));
         int64_t hist[16];
-        ggml_quantize_chunk(tensor->type, data.data(), dataq.data(), 0, size, hist);
+        ggml_quantize_chunk(tensor->type, data.data(), dataq.data(), 0, size/tensor->ne[0], tensor->ne[0], hist, nullptr);
         ggml_backend_tensor_set(tensor, dataq.data(), 0, dataq.size());
     } else if (tensor->type == GGML_TYPE_I8 || tensor->type == GGML_TYPE_I16 || tensor->type == GGML_TYPE_I32) {
         // This is going to create some weird integers though.