Diffstat (limited to 'examples')
 examples/llava/MobileVLM-README.md |  2 +-
 examples/llava/README.md           |  2 +-
 examples/main/README.md            |  2 +-
 examples/perplexity/README.md      | 31 +++++++++++++++----------------
 examples/quantize/README.md        | 22 +++++++++++-----------
 5 files changed, 29 insertions(+), 30 deletions(-)
diff --git a/examples/llava/MobileVLM-README.md b/examples/llava/MobileVLM-README.md
index 96b04852..413e433d 100644
--- a/examples/llava/MobileVLM-README.md
+++ b/examples/llava/MobileVLM-README.md
@@ -22,7 +22,7 @@ After building, run: `./llava-cli` to see the usage. For example:
## Model conversion
-- Clone `mobileVLM-1.7B` and `clip-vit-large-patch14-336` locally:
+1. Clone `mobileVLM-1.7B` and `clip-vit-large-patch14-336` locally:
```sh
git clone https://huggingface.co/mtgv/MobileVLM-1.7B
diff --git a/examples/llava/README.md b/examples/llava/README.md
index 67cb0f22..d4810d42 100644
--- a/examples/llava/README.md
+++ b/examples/llava/README.md
@@ -24,7 +24,7 @@ After building, run: `./llava-cli` to see the usage. For example:
## LLaVA 1.5
-- Clone a LLaVA and a CLIP model ([available options](https://github.com/haotian-liu/LLaVA/blob/main/docs/MODEL_ZOO.md)). For example:
+1. Clone a LLaVA and a CLIP model ([available options](https://github.com/haotian-liu/LLaVA/blob/main/docs/MODEL_ZOO.md)). For example:
```sh
git clone https://huggingface.co/liuhaotian/llava-v1.5-7b
diff --git a/examples/main/README.md b/examples/main/README.md
index bb696b56..10a589ce 100644
--- a/examples/main/README.md
+++ b/examples/main/README.md
@@ -310,7 +310,7 @@ These options help improve the performance and memory usage of the LLaMA models.
### Quantization
-For information about 4-bit quantization, which can significantly improve performance and reduce memory usage, please refer to llama.cpp's primary [README](../../README.md#prepare-data--run).
+For information about 4-bit quantization, which can significantly improve performance and reduce memory usage, please refer to llama.cpp's primary [README](../../README.md#prepare-and-quantize).
## Additional Options
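The quantization section above links out to the primary README; for orientation, this is roughly how a 4-bit model is produced with the repository's `quantize` tool. A minimal sketch, assuming an f16 GGUF model already converted under `./models/llama-2-7b/` (the paths are illustrative, not from this diff):

```sh
# Produce a 4-bit Q4_0 model from an f16 GGUF file; the final argument
# selects the quantization type (see the tables in the READMEs below).
./quantize ./models/llama-2-7b/ggml-model-f16.gguf ./models/llama-2-7b/ggml-model-Q4_0.gguf Q4_0
```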
diff --git a/examples/perplexity/README.md b/examples/perplexity/README.md
index 50e1af01..1a8c0dd6 100644
--- a/examples/perplexity/README.md
+++ b/examples/perplexity/README.md
@@ -3,19 +3,18 @@
TODO
## Llama 2 70B Scorechart
-Quantization | Model size (GiB) | Perplexity | Delta to fp16
--- | -- | -- | --
-Q4_0 | 36.20 | 3.5550 | 3.61%
-Q4_1 | 40.20 | 3.5125 | 2.37%
-Q5_0 | 44.20 | 3.4744 | 1.26%
-Q2_K | 27.27 | 3.7339 | 8.82%
-Q3_K_S | 27.86 | 3.7019 | 7.89%
-Q3_K_M | 30.83 | 3.5932 | 4.72%
-Q3_K_L | 33.67 | 3.5617 | 3.80%
-Q4_K_S | 36.39 | 3.4852 | 1.57%
-Q4_K_M | 38.54 | 3.4725 | 1.20%
-Q5_K_S | 44.20 | 3.4483 | 0.50%
-Q5_K_M | 45.41 | 3.4451 | 0.40%
-Q6_K | 52.70 | 3.4367 | 0.16%
-fp16 | 128.5 | 3.4313 | -
-
+| Quantization | Model size (GiB) | Perplexity | Delta to fp16 |
+|--------------|------------------|------------|---------------|
+| Q4_0 | 36.20 | 3.5550 | 3.61% |
+| Q4_1 | 40.20 | 3.5125 | 2.37% |
+| Q5_0 | 44.20 | 3.4744 | 1.26% |
+| Q2_K | 27.27 | 3.7339 | 8.82% |
+| Q3_K_S | 27.86 | 3.7019 | 7.89% |
+| Q3_K_M | 30.83 | 3.5932 | 4.72% |
+| Q3_K_L | 33.67 | 3.5617 | 3.80% |
+| Q4_K_S | 36.39 | 3.4852 | 1.57% |
+| Q4_K_M | 38.54 | 3.4725 | 1.20% |
+| Q5_K_S | 44.20 | 3.4483 | 0.50% |
+| Q5_K_M | 45.41 | 3.4451 | 0.40% |
+| Q6_K | 52.70 | 3.4367 | 0.16% |
+| fp16 | 128.5 | 3.4313 | - |
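For context, entries in a scorechart like this are produced with the `perplexity` example itself. A minimal sketch, assuming a quantized 70B GGUF model and the wikitext-2 raw test set on disk (both paths are illustrative assumptions):

```sh
# Measure perplexity of a quantized model over a raw text file; a smaller
# delta to the fp16 baseline indicates less quality loss from quantization.
./perplexity -m ./models/llama-2-70b/ggml-model-Q4_0.gguf -f ./wikitext-2-raw/wiki.test.raw
```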
diff --git a/examples/quantize/README.md b/examples/quantize/README.md
index c8b9a27a..8a10365c 100644
--- a/examples/quantize/README.md
+++ b/examples/quantize/README.md
@@ -4,17 +4,17 @@ TODO
## Llama 2 7B
-Quantization | Bits per Weight (BPW)
--- | --
-Q2_K | 3.35
-Q3_K_S | 3.50
-Q3_K_M | 3.91
-Q3_K_L | 4.27
-Q4_K_S | 4.58
-Q4_K_M | 4.84
-Q5_K_S | 5.52
-Q5_K_M | 5.68
-Q6_K | 6.56
+| Quantization | Bits per Weight (BPW) |
+|--------------|-----------------------|
+| Q2_K | 3.35 |
+| Q3_K_S | 3.50 |
+| Q3_K_M | 3.91 |
+| Q3_K_L | 4.27 |
+| Q4_K_S | 4.58 |
+| Q4_K_M | 4.84 |
+| Q5_K_S | 5.52 |
+| Q5_K_M | 5.68 |
+| Q6_K | 6.56 |
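As a rough sanity check of what these BPW figures imply for file size, a back-of-the-envelope sketch (assumes ~6.74e9 parameters for Llama 2 7B and ignores per-file metadata overhead):

```sh
# Approximate model size: parameters * bits-per-weight / 8, in GiB.
# Llama 2 7B (~6.74e9 params) at Q4_K_M (4.84 BPW) comes out near 3.8 GiB.
awk 'BEGIN { printf "%.2f GiB\n", 6.74e9 * 4.84 / 8 / 2^30 }'
```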
## Llama 2 13B
Quantization | Bits per Weight (BPW)