Be able to repack tensors at run time (#147)

* Be able to repack tensors at run time
* Repack: also add bf16 as repackable type
* Repack: make sure number of rows is a multiple of the packing

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
author: Kawrakow <iwankawrakow@gmail.com> 2024-12-17 14:16:34 +0100
committer: GitHub <noreply@github.com> 2024-12-17 14:16:34 +0100
commit: 514ae086200a8cfd78af6a71b6c6ee14931ddc0e (patch)
tree: 0fa47186d7c82afbf078d530f5436c7eb1ae4d79 /include
parent: 4ade4c568c331acad22537f7b9519c740c7a06d0 (diff)
1 files changed, 1 insertions, 0 deletions
diff --git a/include/llama.h b/include/llama.h
index 1627a752..e63d76fe 100644
--- a/include/llama.h
+++ b/include/llama.h
@@ -325,6 +325,7 @@ extern "C" {
         bool use_mmap;      // use mmap if possible
         bool use_mlock;     // force system to keep model in RAM
         bool check_tensors; // validate model tensor data
+        bool repack_tensors;// repack if available
     };
 
     // NOTE: changing the default values of parameters marked as [EXPERIMENTAL] may cause crashes or incorrect results in certain configurations
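
A minimal sketch of how a caller might use the new flag. Everything here is an assumption made for illustration: `model_params_sketch` is a hypothetical stand-in that mirrors only the four fields visible in the hunk above, the default values are invented, and `sketch_should_repack` is a guess at the row-count rule described in the commit message, not the code from this commit.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the tail of the params struct in llama.h.
 * Only the four fields visible in the hunk are reproduced. */
struct model_params_sketch {
    bool use_mmap;       /* use mmap if possible */
    bool use_mlock;      /* force system to keep model in RAM */
    bool check_tensors;  /* validate model tensor data */
    bool repack_tensors; /* repack if available (added by this commit) */
};

/* Assumed defaults, NOT taken from the commit: repacking is off
 * unless the caller opts in. */
static struct model_params_sketch sketch_default_params(void) {
    struct model_params_sketch p = { true, false, false, false };
    return p;
}

/* Assumed eligibility rule, modeled on the commit message
 * ("make sure number of rows is a multiple of the packing"):
 * repack only when requested and when n_rows divides evenly
 * by the packing factor n_pack. */
static bool sketch_should_repack(const struct model_params_sketch *p,
                                 size_t n_rows, size_t n_pack) {
    if (!p->repack_tensors || n_pack == 0) {
        return false;
    }
    return n_rows % n_pack == 0;
}
```

With these assumptions, a caller would take the defaults, set `repack_tensors = true`, and pass the struct to the loader; a tensor whose row count is not a multiple of the packing factor would be left in its original layout.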