path: root/examples/tokenize
author	CausalLM <148736309+CausalLM@users.noreply.github.com>	2023-12-02 02:17:06 +0800
committer	GitHub <noreply@github.com>	2023-12-01 20:17:06 +0200
commit	03562f3a86d6706eea9f4fc09b532946c191b34e (patch)
tree	709378616d9e23c4fb098dc61c7659b32e8740a4 /examples/tokenize
parent	37c746d687d877bc11803e96b4dc5f378b83c0a0 (diff)
llama : support attention bias on LLaMA architecture (#4283)
* Support attention_bias on the LLaMA architecture (QKVO bias); should fix InternLM (https://github.com/ggerganov/llama.cpp/issues/3133) and works for LLaMAfied Qwen models (https://github.com/ggerganov/llama.cpp/pull/3743#issuecomment-1825923608).
* Check existence of QKVO bias while loading llama models. Tested on LLaMA2, CUDA and CPU.
* Update llama.cpp
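The idea behind the change can be sketched in a few lines: attention projections carry optional Q/K/V/O bias tensors, applied only when the loaded model actually provides them. This is an illustrative numpy sketch, not the llama.cpp implementation (which is C++ over ggml tensors); all names below are hypothetical.

```python
import numpy as np

def linear(x, w, b=None):
    # Optional-bias linear projection: some models (e.g. InternLM,
    # LLaMAfied Qwen) ship QKVO biases, vanilla LLaMA does not.
    y = x @ w
    if b is not None:
        y = y + b
    return y

def attention(x, wq, wk, wv, wo, bq=None, bk=None, bv=None, bo=None):
    # Single-head scaled dot-product attention with optional QKVO biases.
    q = linear(x, wq, bq)
    k = linear(x, wk, bk)
    v = linear(x, wv, bv)
    scores = (q @ k.T) / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerically stable softmax
    probs = np.exp(scores)
    probs /= probs.sum(axis=-1, keepdims=True)
    return linear(probs @ v, wo, bo)
```

Passing `None` for every bias reproduces the original bias-free LLaMA path, which mirrors the loader-side check the commit adds: biases are read only if the corresponding tensors exist in the model file.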
Diffstat (limited to 'examples/tokenize')
0 files changed, 0 insertions, 0 deletions