batched : add bench tool (#3545)

* batched : add bench tool
* batched : minor fix table
* batched-bench : add readme + n_kv_max is now configurable
* batched-bench : init warm-up batch
* batched-bench : pass custom set of PP, TG and PL
* batched-bench : add mmq CLI arg
author: Georgi Gerganov <ggerganov@gmail.com> 2023-10-11 21:25:33 +0300
committer: GitHub <noreply@github.com> 2023-10-11 21:25:33 +0300
commit: 8c70a5ff25964f0a81e20d142a2f5ac5baff22fc (patch)
tree: 50946ed36e647f5619b1866d77e3042d4d9743c5 /examples/batched
parent: 24ba3d829e31a6eda3fa1723f692608c2fa3adda (diff)
1 files changed, 1 insertions, 1 deletions
diff --git a/examples/batched/batched.cpp b/examples/batched/batched.cpp
index 688ef221..a88e022d 100644
--- a/examples/batched/batched.cpp
+++ b/examples/batched/batched.cpp
@@ -66,7 +66,7 @@ int main(int argc, char ** argv) {
     ctx_params.seed  = 1234;
     ctx_params.n_ctx = n_kv_req;
     ctx_params.n_batch = std::max(n_len, n_parallel);
-    ctx_params.n_threads = params.n_threads;
+    ctx_params.n_threads       = params.n_threads;
     ctx_params.n_threads_batch = params.n_threads_batch == -1 ? params.n_threads : params.n_threads_batch;
 
     llama_context * ctx = llama_new_context_with_model(model, ctx_params);
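
The hunk above only realigns the `n_threads` assignment, but the surrounding context shows how the batch thread count falls back to the generation thread count when it is left at -1. The sketch below isolates that logic so it can be compiled without llama.cpp. The structs `cli_params` and `context_params` and the helper `make_context_params` are hypothetical stand-ins, not the library's types; they model only the fields visible in the hunk.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-in for the parsed CLI parameters (not the llama.cpp type).
struct cli_params {
    int n_threads       = 4;
    int n_threads_batch = -1; // -1 means "not set, reuse n_threads"
};

// Hypothetical stand-in for the context parameters (only fields from the hunk).
struct context_params {
    uint32_t seed            = 0;
    int      n_ctx           = 0;
    int      n_batch         = 0;
    int      n_threads       = 0;
    int      n_threads_batch = 0;
};

// Mirrors the assignments shown in the diff context.
context_params make_context_params(const cli_params & params, int n_len, int n_parallel, int n_kv_req) {
    context_params ctx_params;

    ctx_params.seed    = 1234;
    ctx_params.n_ctx   = n_kv_req;
    ctx_params.n_batch = std::max(n_len, n_parallel);

    ctx_params.n_threads       = params.n_threads;
    // Fall back to n_threads when no batch thread count was given.
    ctx_params.n_threads_batch = params.n_threads_batch == -1 ? params.n_threads : params.n_threads_batch;

    return ctx_params;
}
```

With `n_threads = 8` and `n_threads_batch` left at -1, the returned struct carries 8 for both thread counts; setting `n_threads_batch = 2` overrides only the batch value.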