From 3015851c5ac7334fb544a23a70a284c117b87044 Mon Sep 17 00:00:00 2001
From: Daniel Bevenius
Date: Thu, 23 May 2024 14:29:26 +0200
Subject: llama : add getters for n_threads/n_threads_batch (#7464)

* llama : add getters for n_threads/n_threads_batch

This commit adds two new functions to the llama API. The functions
can be used to get the number of threads used for generating a single
token and the number of threads used for prompt and batch processing
(multiple tokens).

The motivation for this is that we want to be able to get the number of
threads that a context is using. The main use case is testing and
verifying that the number of threads is set correctly.

Signed-off-by: Daniel Bevenius

* squash! llama : add getters for n_threads/n_threads_batch

Rename the getters to llama_n_threads and llama_n_threads_batch.

Signed-off-by: Daniel Bevenius

---------

Signed-off-by: Daniel Bevenius
---
 llama.cpp | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/llama.cpp b/llama.cpp
index 1f9e10ee..e540c1b3 100644
--- a/llama.cpp
+++ b/llama.cpp
@@ -17410,6 +17410,14 @@ void llama_set_n_threads(struct llama_context * ctx, uint32_t n_threads, uint32_
     ctx->cparams.n_threads_batch = n_threads_batch;
 }
 
+uint32_t llama_n_threads(struct llama_context * ctx) {
+    return ctx->cparams.n_threads;
+}
+
+uint32_t llama_n_threads_batch(struct llama_context * ctx) {
+    return ctx->cparams.n_threads_batch;
+}
+
 void llama_set_abort_callback(struct llama_context * ctx, bool (*abort_callback)(void * data), void * abort_callback_data) {
     ctx->abort_callback = abort_callback;
     ctx->abort_callback_data = abort_callback_data;
-- 
cgit v1.2.3
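
The testing/verification use case from the commit message could look roughly like the sketch below. It is only an illustration under stated assumptions: the `llama_cparams` and `llama_context` structs here are hypothetical stand-ins that carry just the two fields touched by the diff (the real context type has many more members), and `thread_counts_match` is a made-up helper, not part of the llama API. The setter and the two getters mirror the bodies shown in the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins: only the fields the diff reads or writes.
struct llama_cparams {
    uint32_t n_threads       = 0; // threads used to generate a single token
    uint32_t n_threads_batch = 0; // threads used for prompt/batch processing
};

struct llama_context {
    llama_cparams cparams;
};

// Setter as it appears in the diff context.
void llama_set_n_threads(struct llama_context * ctx, uint32_t n_threads, uint32_t n_threads_batch) {
    ctx->cparams.n_threads       = n_threads;
    ctx->cparams.n_threads_batch = n_threads_batch;
}

// Getters added by this commit.
uint32_t llama_n_threads(struct llama_context * ctx) {
    return ctx->cparams.n_threads;
}

uint32_t llama_n_threads_batch(struct llama_context * ctx) {
    return ctx->cparams.n_threads_batch;
}

// Hypothetical test helper (not in llama.cpp): true when the context
// reports exactly the expected thread counts.
bool thread_counts_match(struct llama_context * ctx, uint32_t expected, uint32_t expected_batch) {
    return llama_n_threads(ctx) == expected &&
           llama_n_threads_batch(ctx) == expected_batch;
}
```

A test would call `llama_set_n_threads(&ctx, 4, 8)` and then check `thread_counts_match(&ctx, 4, 8)`, reading the values back through the public getters rather than reaching into `ctx->cparams` directly.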