Be able to repack tensors at run time (#147)

* Be able to repack tensors at run time
* Repack: also add bf16 as repackable type
* Repack: make sure number of rows is a multiple of the packing

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
author: Kawrakow <iwankawrakow@gmail.com> 2024-12-17 14:16:34 +0100
committer: GitHub <noreply@github.com> 2024-12-17 14:16:34 +0100
commit: 514ae086200a8cfd78af6a71b6c6ee14931ddc0e (patch)
tree: 0fa47186d7c82afbf078d530f5436c7eb1ae4d79 /ggml/src/iqk/iqk_quantize.h
parent: 4ade4c568c331acad22537f7b9519c740c7a06d0 (diff)
1 files changed, 2 insertions, 0 deletions
diff --git a/ggml/src/iqk/iqk_quantize.h b/ggml/src/iqk/iqk_quantize.h
index 8640b59b..7c568ded 100644
--- a/ggml/src/iqk/iqk_quantize.h
+++ b/ggml/src/iqk/iqk_quantize.h
@@ -173,6 +173,8 @@ void quantize_row_q8_KR8(const float * GGML_RESTRICT x, void * GGML_RESTRICT y,
 void repack_f32_bf16_r16 (const void * GGML_RESTRICT src, void * GGML_RESTRICT dst, int64_t nrows, int64_t n_per_row);
 void repack_bf16_bf16_r16(const void * GGML_RESTRICT src, void * GGML_RESTRICT dst, int64_t nrows, int64_t n_per_row);
 
+void iqk_repack_tensor(struct ggml_tensor * tensor);
+
 #ifdef __cplusplus
 }
 #endif
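
The commit message requires the number of rows to be a multiple of the packing. The sketch below illustrates why such a check is needed when rows are interleaved in groups. It is an assumption-laden illustration only: `sketch_interleave_rows_f32` and its `pack` parameter are hypothetical names, and the element-by-element layout is chosen for clarity. It is not the layout or implementation used by `iqk_repack_tensor` or the `repack_*_r16` functions, which this diff does not show.

```c
#include <stdbool.h>
#include <stdint.h>

// Hypothetical helper, NOT part of iqk_quantize.h.
// Interleaves each group of `pack` consecutive rows element by element,
// so that column j of all rows in a group becomes contiguous in dst.
// Returns false (and writes nothing) when nrows is not a multiple of pack,
// because a trailing partial group cannot be interleaved.
static bool sketch_interleave_rows_f32(const float *src, float *dst,
                                       int64_t nrows, int64_t n_per_row,
                                       int64_t pack) {
    if (pack <= 0 || nrows % pack != 0) return false;
    for (int64_t g = 0; g < nrows / pack; ++g) {
        // Each group occupies pack * n_per_row floats in both layouts.
        const float *group_src = src + g * pack * n_per_row;
        float *group_dst = dst + g * pack * n_per_row;
        for (int64_t j = 0; j < n_per_row; ++j) {
            for (int64_t r = 0; r < pack; ++r) {
                group_dst[j * pack + r] = group_src[r * n_per_row + j];
            }
        }
    }
    return true;
}
```

With two rows of three elements and `pack = 2`, the helper places the two rows' values for each column next to each other. With three rows and `pack = 2` it refuses to repack, which mirrors the kind of guard the third bullet of the commit message describes.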