author Mirko185 <mirkosig@gmail.com> 2024-02-19 08:39:31 +0100
committer GitHub <noreply@github.com> 2024-02-19 09:39:31 +0200
commit 769a716e30ba1da46f709df1c00727d6869d30e7
tree 8599c9dcd46c1c8fb4c4fbb1ed39a1877b57163f
parent f0d1fafc029a056cd765bdae58dcaa12312e9879
readme : update (#5572)
Added 1.5-bit on README.md
-rw-r--r-- README.md 2
1 files changed, 1 insertions, 1 deletions
diff --git a/README.md b/README.md
index 8c7bc268..70866e24 100644
--- a/README.md
+++ b/README.md
@@ -61,7 +61,7 @@ variety of hardware - locally and in the cloud.
 - Plain C/C++ implementation without any dependencies
 - Apple silicon is a first-class citizen - optimized via ARM NEON, Accelerate and Metal frameworks
 - AVX, AVX2 and AVX512 support for x86 architectures
-- 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, and 8-bit integer quantization for faster inference and reduced memory use
+- 1.5-bit, 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, and 8-bit integer quantization for faster inference and reduced memory use
 - Custom CUDA kernels for running LLMs on NVIDIA GPUs (support for AMD GPUs via HIP)
 - Vulkan, SYCL, and (partial) OpenCL backend support
 - CPU+GPU hybrid inference to partially accelerate models larger than the total VRAM capacity