py : re-enable mmap in convert hf (#4732)

* update: awq support llama-7b model
* update: change order
* update: benchmark results for llama2-7b
* update: mistral 7b v1 benchmark
* update: support 4 models
* fix: Readme
* update: ready for PR
* update: readme
* fix: readme
* update: change order import
* black
* format code
* update: work for bot mpt and awqmpt
* update: readme
* Rename to llm_build_ffn_mpt_awq
* Formatted other files
* Fixed params count
* fix: remove code
* update: more detail for mpt
* fix: readme
* fix: readme
* update: change folder architecture
* fix: common.cpp
* fix: readme
* fix: remove ggml_repeat
* update: cicd
* update: cicd
* uppdate: remove use_awq arg
* update: readme
* llama : adapt plamo to new ffn ggml-ci
* fix: update torch version

---------

Co-authored-by: Trần Đức Nam <v.namtd12@vinai.io>
Co-authored-by: Le Hoang Anh <v.anhlh33@vinai.io>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
author: Nam D. Tran <42194884+namtranase@users.noreply.github.com> 2024-01-02 16:23:38 +0700
committer: GitHub <noreply@github.com> 2024-01-02 11:23:38 +0200
commit: 26f3071d714f0b27ad7f021a46a66a1085480258 (patch)
tree: bbbc9da48238a470bde8854e94ee6cdc2b27b19c
parent: 775ac8712a7b42cfead2585f42cec0dfd56644ab (diff)
2 files changed, 2 insertions, 2 deletions
diff --git a/awq-py/requirements.txt b/awq-py/requirements.txt
index 5fe60432..99189611 100644
--- a/awq-py/requirements.txt
+++ b/awq-py/requirements.txt
@@ -1,2 +1,2 @@
-torch>=2.0.0
+torch>=2.1.1
 transformers>=4.32.0
diff --git a/convert-hf-to-gguf.py b/convert-hf-to-gguf.py
index 51724c0d..203eaf64 100755
--- a/convert-hf-to-gguf.py
+++ b/convert-hf-to-gguf.py
@@ -59,7 +59,7 @@ class Model:
                 from safetensors import safe_open
                 ctx = cast(ContextManager[Any], safe_open(self.dir_model / part_name, framework="pt", device="cpu"))
             else:
-                ctx = contextlib.nullcontext(torch.load(str(self.dir_model / part_name), map_location="cpu", weights_only=True))
+                ctx = contextlib.nullcontext(torch.load(str(self.dir_model / part_name), map_location="cpu", mmap=True, weights_only=True))
 
             with ctx as model_part:
                 for name in model_part.keys():
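
Note on the two hunks above: the `mmap` argument of `torch.load` was introduced in PyTorch 2.1, which is presumably why the `torch` requirement is raised alongside the loader change. The sketch below is not part of this commit. It shows one way a script could choose the keyword arguments from a version string; the helper name `torch_load_kwargs` is hypothetical, and the version parsing is deliberately simplistic.

```python
import re


def torch_load_kwargs(torch_version: str) -> dict:
    """Hypothetical helper: pick torch.load keyword arguments for a given version.

    Only the leading "major.minor" part of the version string is inspected,
    so local suffixes such as "+cu118" are ignored.
    """
    match = re.match(r"(\d+)\.(\d+)", torch_version)
    if match is None:
        raise ValueError(f"unrecognized version string: {torch_version!r}")
    major, minor = int(match.group(1)), int(match.group(2))

    # Arguments already used by the converter before this change.
    kwargs = {"map_location": "cpu", "weights_only": True}

    # The mmap argument exists from PyTorch 2.1 onward; passing it to an
    # older release would raise a TypeError for an unexpected keyword.
    if (major, minor) >= (2, 1):
        kwargs["mmap"] = True
    return kwargs
```

A caller would then write something like `torch.load(path, **torch_load_kwargs(torch.__version__))`. The commit itself takes the simpler route of requiring a new enough `torch` and passing `mmap=True` unconditionally.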