path: root/gguf-py/gguf/tensor_mapping.py
author    Nam D. Tran <42194884+namtranase@users.noreply.github.com>  2023-12-27 22:39:45 +0700
committer GitHub <noreply@github.com>  2023-12-27 17:39:45 +0200
commit    f6793491b5af6da75edad34d6f503ef86d31b09f
tree      ba50b7ae1aba91cb465a06970a11137baab7afcf /gguf-py/gguf/tensor_mapping.py
parent    879b690a9e1eb1ab0a29b58236fc76978fb4d902
llama : add AWQ for llama, llama2, mpt, and mistral models (#4593)

* update: awq support llama-7b model
* update: change order
* update: benchmark results for llama2-7b
* update: mistral 7b v1 benchmark
* update: support 4 models
* fix: Readme
* update: ready for PR
* update: readme
* fix: readme
* update: change order import
* black
* format code
* update: work for bot mpt and awqmpt
* update: readme
* Rename to llm_build_ffn_mpt_awq
* Formatted other files
* Fixed params count
* fix: remove code
* update: more detail for mpt
* fix: readme
* fix: readme
* update: change folder architecture
* fix: common.cpp
* fix: readme
* fix: remove ggml_repeat
* update: cicd
* update: cicd
* uppdate: remove use_awq arg
* update: readme
* llama : adapt plamo to new ffn

ggml-ci

---------

Co-authored-by: Trần Đức Nam <v.namtd12@vinai.io>
Co-authored-by: Le Hoang Anh <v.anhlh33@vinai.io>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

Diffstat (limited to 'gguf-py/gguf/tensor_mapping.py')
-rw-r--r-- gguf-py/gguf/tensor_mapping.py | 5
1 files changed, 5 insertions, 0 deletions
diff --git a/gguf-py/gguf/tensor_mapping.py b/gguf-py/gguf/tensor_mapping.py
index 446c6b68..0b8f7041 100644
--- a/gguf-py/gguf/tensor_mapping.py
+++ b/gguf-py/gguf/tensor_mapping.py
@@ -188,6 +188,11 @@ class TensorNameMap:
             "model.layers.{bid}.block_sparse_moe.experts.{xid}.w3", # mixtral
         ),

+        # AWQ-activation gate
+        MODEL_TENSOR.FFN_ACT: (
+            "transformer.blocks.{bid}.ffn.act", # mpt
+        ),
+
         # Feed-forward gate
         MODEL_TENSOR.FFN_GATE: (
             "model.layers.{bid}.mlp.gate_proj", # llama-hf refact
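
The new entry is a name template: `{bid}` stands for the block index and is filled in once per block when the map is built. The sketch below illustrates that expansion only. It does not import `gguf`. `build_lookup` and `FFN_ACT_TEMPLATES` are hypothetical names, and the target name `blk.{bid}.ffn_act` is an assumption about the GGUF-side tensor name, which this diff does not show.

```python
# Source-side template taken from the diff above (mpt).
FFN_ACT_TEMPLATES = ("transformer.blocks.{bid}.ffn.act",)

# Assumed GGUF-side name for MODEL_TENSOR.FFN_ACT; not shown in this diff.
ASSUMED_TARGET = "blk.{bid}.ffn_act"


def build_lookup(templates, n_blocks, target=ASSUMED_TARGET):
    """Expand every template for each block index into a flat dict.

    Hypothetical helper: it mirrors the idea of a per-block name map,
    not the actual TensorNameMap implementation.
    """
    lookup = {}
    for bid in range(n_blocks):
        for template in templates:
            lookup[template.format(bid=bid)] = target.format(bid=bid)
    return lookup


lookup = build_lookup(FFN_ACT_TEMPLATES, n_blocks=2)
for source_name, gguf_name in lookup.items():
    print(source_name, "->", gguf_name)
```

A flat dictionary keeps each lookup at constant cost during conversion, at the price of one entry per template per block.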