Diffstat (limited to 'examples/llava/MobileVLM-README.md')
-rw-r--r--  examples/llava/MobileVLM-README.md | 14
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/examples/llava/MobileVLM-README.md b/examples/llava/MobileVLM-README.md
index 9eba791d..c1f361d1 100644
--- a/examples/llava/MobileVLM-README.md
+++ b/examples/llava/MobileVLM-README.md
@@ -1,11 +1,13 @@
# MobileVLM
-Currently this implementation supports [MobileVLM-v1.7](https://huggingface.co/mtgv/MobileVLM-1.7B) variants.
+Currently this implementation supports [MobileVLM-1.7B](https://huggingface.co/mtgv/MobileVLM-1.7B) / [MobileVLM_V2-1.7B](https://huggingface.co/mtgv/MobileVLM_V2-1.7B) variants.
For more information, please go to [Meituan-AutoML/MobileVLM](https://github.com/Meituan-AutoML/MobileVLM)
The implementation is based on llava, and is compatible with both llava and MobileVLM. The usage is basically the same as llava.
+Notice: The overall process of model inference for both **MobileVLM** and **MobileVLM_V2** models is the same, but the process of model conversion is a little different. Therefore, using **MobileVLM** as an example, the differing conversion step will be shown.
+
## Usage
Build with cmake or run `make llava-cli` to build it.
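A minimal sketch of the two build routes mentioned above, assuming a llama.cpp checkout at the repository root (the cmake target name and the output location are assumptions and may vary between versions):

```sh
# option 1: plain make from the repository root
make llava-cli

# option 2: cmake; the llava-cli target name is an assumption, check examples/llava/CMakeLists.txt
cmake -B build
cmake --build build --config Release --target llava-cli
# the resulting binary is typically placed under build/bin/
```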
@@ -34,7 +36,7 @@ git clone https://huggingface.co/openai/clip-vit-large-patch14-336
python ./examples/llava/llava-surgery.py -m path/to/MobileVLM-1.7B
```
-3. Use `convert-image-encoder-to-gguf.py` with `--projector-type ldp` to convert the LLaVA image encoder to GGUF:
+3. Use `convert-image-encoder-to-gguf.py` with `--projector-type ldp` (for **V2** the arg is `--projector-type ldpv2`) to convert the LLaVA image encoder to GGUF:
```sh
python ./examples/llava/convert-image-encoder-to-gguf.py \
@@ -44,6 +46,14 @@ python ./examples/llava/convert-image-encoder-to-gguf \
--projector-type ldp
```
+For **MobileVLM_V2** the conversion step is:
+
+```sh
+python ./examples/llava/convert-image-encoder-to-gguf.py \
+ -m path/to/clip-vit-large-patch14-336 \
+ --llava-projector path/to/MobileVLM-1.7B_V2/llava.projector \
+ --output-dir path/to/MobileVLM-1.7B_V2 \
+ --projector-type ldpv2
+```
+
4. Use `convert.py` to convert the LLaMA part of LLaVA to GGUF:
```sh