author | Kawrakow <iwankawrakow@gmail.com> | 2024-12-17 14:16:34 +0100 |
---|---|---|
committer | GitHub <noreply@github.com> | 2024-12-17 14:16:34 +0100 |
commit | 514ae086200a8cfd78af6a71b6c6ee14931ddc0e (patch) | |
tree | 0fa47186d7c82afbf078d530f5436c7eb1ae4d79 /ggml/src/iqk/iqk_quantize.h | |
parent | 4ade4c568c331acad22537f7b9519c740c7a06d0 (diff) |
Be able to repack tensors at run time (#147)
* Be able to repack tensors at run time
* Repack: also add bf16 as repackable type
* Repack: make sure number of rows is a multiple of the packing
---------
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
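
To illustrate the third bullet of the commit message, here is a minimal sketch of a row-interleaved repack that refuses to run unless the row count is a multiple of the packing factor. It is not the code from this commit: the function name `sketch_repack_rows_interleaved`, the `pack` parameter, the float element type, and the interleaved layout are all assumptions made for illustration.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical helper, NOT part of iqk_quantize.h.
 * Interleaves groups of `pack` consecutive rows so that, within a group,
 * the values of one column from all `pack` rows sit next to each other.
 * Returns 1 on success, 0 if the tensor cannot be repacked. */
int sketch_repack_rows_interleaved(const float * src, float * dst,
                                   int64_t nrows, int64_t n_per_row, int pack) {
    /* Assumed guard mirroring the commit message: the number of rows
     * must be a multiple of the packing factor. */
    if (pack <= 0 || nrows <= 0 || n_per_row <= 0 || nrows % pack != 0) {
        return 0;
    }
    const int64_t ngroups = nrows / pack;
    for (int64_t g = 0; g < ngroups; ++g) {
        for (int64_t j = 0; j < n_per_row; ++j) {
            for (int r = 0; r < pack; ++r) {
                /* source: row-major, row (g*pack + r), column j */
                const float v = src[(g * pack + r) * n_per_row + j];
                /* destination: group g, column j, then row-in-group r */
                dst[(g * n_per_row + j) * pack + r] = v;
            }
        }
    }
    return 1;
}
```

With a guard like this, a caller can leave the tensor in its original layout when the function returns 0 instead of writing a partially packed buffer.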
Diffstat (limited to 'ggml/src/iqk/iqk_quantize.h')
-rw-r--r-- | ggml/src/iqk/iqk_quantize.h | 2 |
1 files changed, 2 insertions, 0 deletions
diff --git a/ggml/src/iqk/iqk_quantize.h b/ggml/src/iqk/iqk_quantize.h
index 8640b59b..7c568ded 100644
--- a/ggml/src/iqk/iqk_quantize.h
+++ b/ggml/src/iqk/iqk_quantize.h
@@ -173,6 +173,8 @@ void quantize_row_q8_KR8(const float * GGML_RESTRICT x, void * GGML_RESTRICT y,
 void repack_f32_bf16_r16 (const void * GGML_RESTRICT src, void * GGML_RESTRICT dst, int64_t nrows, int64_t n_per_row);
 void repack_bf16_bf16_r16(const void * GGML_RESTRICT src, void * GGML_RESTRICT dst, int64_t nrows, int64_t n_per_row);
 
+void iqk_repack_tensor(struct ggml_tensor * tensor);
+
 #ifdef __cplusplus
 }
 #endif