summaryrefslogtreecommitdiff
path: root/src/llama.cpp
diff options
context:
space:
mode:
authorKawrakow <48489457+ikawrakow@users.noreply.github.com>2024-08-14 10:40:09 +0200
committerGitHub <noreply@github.com>2024-08-14 10:40:09 +0200
commit6c5384f20e8657a23aa9d4e0e9856d3d7563a12a (patch)
treeec7633442345b8e69eee23180d8cf56fe0f59811 /src/llama.cpp
parentbb5ff6fadec40c2e3aa3033dc68bec9367a0c9cc (diff)
Skip barriers of noops (#19)
GGML_OP_RESHAPE, GGML_OP_VIEW, GGML_OP_PERMUTE, GGML_OP_TRANSPOSE, along with GGML_OP_NONE, are all noops. I.e., nothinh happens. But ggml still has a barrier after them, which wastes time. The waste is not too bad for large models where computations are long compared to the time taken for thread synchronization. But for small models skipping those unnecessary waits makes a significant difference. E.g., for the 99M TriLMamodel, TG-500 goes up to 1426 t/s from 1240 t/s. Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
Diffstat (limited to 'src/llama.cpp')
0 files changed, 0 insertions, 0 deletions