author | Nam D. Tran <42194884+namtranase@users.noreply.github.com> | 2024-01-02 16:23:38 +0700 |
---|---|---|
committer | GitHub <noreply@github.com> | 2024-01-02 11:23:38 +0200 |
commit | 26f3071d714f0b27ad7f021a46a66a1085480258 (patch) | |
tree | bbbc9da48238a470bde8854e94ee6cdc2b27b19c | |
parent | 775ac8712a7b42cfead2585f42cec0dfd56644ab (diff) |
py : re-enable mmap in convert hf (#4732)
* update: awq support llama-7b model
* update: change order
* update: benchmark results for llama2-7b
* update: mistral 7b v1 benchmark
* update: support 4 models
* fix: Readme
* update: ready for PR
* update: readme
* fix: readme
* update: change order import
* black
* format code
* update: work for both mpt and awqmpt
* update: readme
* Rename to llm_build_ffn_mpt_awq
* Formatted other files
* Fixed params count
* fix: remove code
* update: more detail for mpt
* fix: readme
* fix: readme
* update: change folder architecture
* fix: common.cpp
* fix: readme
* fix: remove ggml_repeat
* update: cicd
* update: cicd
* update: remove use_awq arg
* update: readme
* llama : adapt plamo to new ffn
ggml-ci
* fix: update torch version
---------
Co-authored-by: Trần Đức Nam <v.namtd12@vinai.io>
Co-authored-by: Le Hoang Anh <v.anhlh33@vinai.io>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
-rw-r--r-- | awq-py/requirements.txt | 2 |
-rwxr-xr-x | convert-hf-to-gguf.py | 2 |
2 files changed, 2 insertions, 2 deletions
diff --git a/awq-py/requirements.txt b/awq-py/requirements.txt
index 5fe60432..99189611 100644
--- a/awq-py/requirements.txt
+++ b/awq-py/requirements.txt
@@ -1,2 +1,2 @@
-torch>=2.0.0
+torch>=2.1.1
 transformers>=4.32.0
diff --git a/convert-hf-to-gguf.py b/convert-hf-to-gguf.py
index 51724c0d..203eaf64 100755
--- a/convert-hf-to-gguf.py
+++ b/convert-hf-to-gguf.py
@@ -59,7 +59,7 @@ class Model:
                 from safetensors import safe_open
                 ctx = cast(ContextManager[Any], safe_open(self.dir_model / part_name, framework="pt", device="cpu"))
             else:
-                ctx = contextlib.nullcontext(torch.load(str(self.dir_model / part_name), map_location="cpu", weights_only=True))
+                ctx = contextlib.nullcontext(torch.load(str(self.dir_model / part_name), map_location="cpu", mmap=True, weights_only=True))
 
             with ctx as model_part:
                 for name in model_part.keys():
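
The changed line in convert-hf-to-gguf.py wraps the result of `torch.load` in `contextlib.nullcontext` so that it can go through the same `with ctx as model_part:` statement as the `safe_open` branch. The following is a minimal, stdlib-only sketch of that pattern, not the converter's actual code: `FakeSafeOpen`, `fake_load`, `open_part`, and `tensor_names` are hypothetical stand-ins introduced for illustration, and no PyTorch or safetensors call is made.

```python
from __future__ import annotations

import contextlib
from typing import Any, ContextManager


class FakeSafeOpen:
    """Hypothetical stand-in for a loader that is itself a context manager."""

    def __init__(self, tensors: dict[str, Any]) -> None:
        self._tensors = tensors
        self.closed = False

    def __enter__(self) -> dict[str, Any]:
        return self._tensors

    def __exit__(self, *exc_info: object) -> None:
        self.closed = True


def fake_load(tensors: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical stand-in for a loader that returns a plain mapping."""
    return dict(tensors)


def open_part(tensors: dict[str, Any], is_context_manager: bool) -> ContextManager[Any]:
    """Return something usable in a `with` statement for either loader."""
    if is_context_manager:
        return FakeSafeOpen(tensors)
    # nullcontext(x) yields x from __enter__ and does nothing on exit,
    # so a plain object can share the code path of a real context manager.
    return contextlib.nullcontext(fake_load(tensors))


def tensor_names(tensors: dict[str, Any], is_context_manager: bool) -> list[str]:
    """Iterate over the keys the same way regardless of which loader was used."""
    with open_part(tensors, is_context_manager) as model_part:
        return sorted(model_part.keys())
```

Because `nullcontext` performs no cleanup on exit, the wrapped object stays valid after the `with` block ends; only the branch backed by a real context manager releases anything there.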