author     Galunid <karolek1231456@gmail.com>    2024-05-31 10:09:20 +0200
committer  GitHub <noreply@github.com>           2024-05-31 10:09:20 +0200
commit     1af511fc22cba4959dd8bced5501df9e8af6ddf9 (patch)
tree       8b52af552920e6a0079d41d8de2b2b8860891623
parent     0541f06296753dbc59a57379eb54cec865a4c9f9 (diff)
Add convert.py removal to hot topics (#7662)
-rw-r--r--    README.md    3
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/README.md b/README.md
index 60e7aaf2..eeeb6491 100644
--- a/README.md
+++ b/README.md
@@ -22,7 +22,8 @@ Inference of Meta's [LLaMA](https://arxiv.org/abs/2302.13971) model (and others)
### Hot topics
-- **Initial Flash-Attention support: https://github.com/ggerganov/llama.cpp/pull/5021**
+- **`convert.py` has been deprecated and moved to `examples/convert-legacy-llama.py`, please use `convert-hf-to-gguf.py`** https://github.com/ggerganov/llama.cpp/pull/7430
+- Initial Flash-Attention support: https://github.com/ggerganov/llama.cpp/pull/5021
- BPE pre-tokenization support has been added: https://github.com/ggerganov/llama.cpp/pull/6920
- MoE memory layout has been updated - reconvert models for `mmap` support and regenerate `imatrix` https://github.com/ggerganov/llama.cpp/pull/6387
- Model sharding instructions using `gguf-split` https://github.com/ggerganov/llama.cpp/discussions/6404
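
For readers landing on this commit from the hot-topics note above, the change it announces amounts to swapping the entry-point script: workflows that previously called `convert.py` on a Hugging Face checkout should now call `convert-hf-to-gguf.py`, with `examples/convert-legacy-llama.py` kept only for legacy conversions. The sketch below is not part of the commit; it is a minimal illustration of that swap, assuming a llama.cpp checkout around this date where `convert-hf-to-gguf.py` sits at the repo root. The `--outfile`/`--outtype` flags reflect the script's options at the time and should be checked against its current `--help` output; the model directory is a placeholder.

```python
# Migration sketch (illustrative, not from this commit): run the new
# convert-hf-to-gguf.py where a pipeline previously invoked the removed
# convert.py. Assumes execution from inside a llama.cpp checkout.
import subprocess
import sys
from pathlib import Path

REPO_ROOT = Path(__file__).resolve().parent           # adjust to your llama.cpp checkout
CONVERTER = REPO_ROOT / "convert-hf-to-gguf.py"       # replacement for the deprecated convert.py
MODEL_DIR = Path(sys.argv[1]) if len(sys.argv) > 1 else Path("models/my-hf-model")  # placeholder path
OUTFILE = MODEL_DIR / "ggml-model-f16.gguf"

cmd = [
    sys.executable,
    str(CONVERTER),
    str(MODEL_DIR),             # directory containing the Hugging Face model files
    "--outfile", str(OUTFILE),  # where to write the resulting GGUF file
    "--outtype", "f16",         # output tensor type; quantize afterwards with the repo's quantize tool if needed
]
print("running:", " ".join(cmd))
subprocess.run(cmd, check=True)
```

Using a wrapper like this keeps the old and new converters interchangeable in automation: only the script path and flags change, while the downstream GGUF output location stays the same.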