# perplexity

TODO

## Llama 2 70B Scorechart
| Quantization | Model size (GiB) | Perplexity | Delta to fp16 |
|--------------|------------------|------------|---------------|
| Q4_0         | 36.20            | 3.5550     | 3.61%         |
| Q4_1         | 40.20            | 3.5125     | 2.37%         |
| Q5_0         | 44.20            | 3.4744     | 1.26%         |
| Q2_K         | 27.27            | 3.7339     | 8.82%         |
| Q3_K_S       | 27.86            | 3.7019     | 7.89%         |
| Q3_K_M       | 30.83            | 3.5932     | 4.72%         |
| Q3_K_L       | 33.67            | 3.5617     | 3.80%         |
| Q4_K_S       | 36.39            | 3.4852     | 1.57%         |
| Q4_K_M       | 38.54            | 3.4725     | 1.20%         |
| Q5_K_S       | 44.20            | 3.4483     | 0.50%         |
| Q5_K_M       | 45.41            | 3.4451     | 0.40%         |
| Q6_K         | 52.70            | 3.4367     | 0.16%         |
| fp16         | 128.5            | 3.4313     | -             |
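
The "Delta to fp16" column appears to be the relative increase in perplexity over the fp16 baseline, expressed as a percentage. The following is a minimal sketch that recomputes it from the perplexity values in the table; the function name `delta_to_fp16` and the dictionary layout are illustrative assumptions, not part of this repository.

```python
# Perplexity values copied from the scorechart above.
FP16_PPL = 3.4313

PERPLEXITY = {
    "Q4_0": 3.5550,
    "Q4_1": 3.5125,
    "Q5_0": 3.4744,
    "Q2_K": 3.7339,
    "Q3_K_S": 3.7019,
    "Q3_K_M": 3.5932,
    "Q3_K_L": 3.5617,
    "Q4_K_S": 3.4852,
    "Q4_K_M": 3.4725,
    "Q5_K_S": 3.4483,
    "Q5_K_M": 3.4451,
    "Q6_K": 3.4367,
}


def delta_to_fp16(ppl, baseline=FP16_PPL):
    """Return the relative perplexity increase over the baseline, in percent.

    Hypothetical helper: assumes the table's delta is (ppl / baseline - 1) * 100.
    """
    return (ppl / baseline - 1.0) * 100.0


if __name__ == "__main__":
    # Print each quantization with its recomputed delta, two decimals as in the table.
    for name, ppl in PERPLEXITY.items():
        print(f"{name:8s} {delta_to_fp16(ppl):.2f}%")
```

Under that assumption, the recomputed values agree with the table to two decimal places, which makes it easy to add a row for a new quantization type once its perplexity has been measured.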