author Kawrakow <iwankawrakow@gmail.com> 2025-05-09 11:16:36 +0300
committer GitHub <noreply@github.com> 2025-05-09 11:16:36 +0300
commit e5a4a3ce78ce96b6822dcd6138a98c4d237ecc9b (patch)
tree 57c1d245bc24036c1a139b1a2c6d6da1bbcdb8a3
parent 8777fc4855dd1551c20a84cb266f75fa49e9b0e8 (diff)
Update README.md
@saood06 Thanks!
README.md | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/README.md b/README.md
index ea4144bb..c1381cad 100644
--- a/README.md
+++ b/README.md
@@ -14,7 +14,7 @@ This repository is a fork of [llama.cpp](https://github.com/ggerganov/llama.cpp)
 ## Latest News
-* May 9 2025: Support for LlaMA-3-Nmotron models added, see [PR 377](https://github.com/ikawrakow/ik_llama.cpp/pull/377)
+* May 9 2025: Support for LlaMA-3-Nemotron models added, see [PR 377](https://github.com/ikawrakow/ik_llama.cpp/pull/377)
 * May 7 2025: 🚀 Faster TG for DeepSeek models with GPU or hybrid GPU/CPU inference. See [PR 386](https://github.com/ikawrakow/ik_llama.cpp/pull/386) for details. Caveat: Ampere or newer Nvidia GPU required
 * May 4 2025: 🚀 Significant token generation performance improvement on CUDA with Flash Attention for GQA models. For details and benchmarks see [PR #370](https://github.com/ikawrakow/ik_llama.cpp/pull/370)
 * April 29 2025: Qwen3 support added