path: root/examples/tokenize
author	CausalLM <148736309+CausalLM@users.noreply.github.com>	2023-12-02 02:17:06 +0800
committer	GitHub <noreply@github.com>	2023-12-01 20:17:06 +0200
commit	03562f3a86d6706eea9f4fc09b532946c191b34e (patch)
tree	709378616d9e23c4fb098dc61c7659b32e8740a4 /examples/tokenize
parent	37c746d687d877bc11803e96b4dc5f378b83c0a0 (diff)
llama : support attention bias on LLaMA architecture (#4283)
* Support attention_bias on the LLaMA architecture (QKVO bias); should fix InternLM (https://github.com/ggerganov/llama.cpp/issues/3133) and works for LLaMAfied Qwen models (https://github.com/ggerganov/llama.cpp/pull/3743#issuecomment-1825923608).
* Check existence of QKVO bias while loading llama models. Tested on LLaMA2, CUDA and CPU.
* Update llama.cpp
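The idea behind the change can be sketched in a few lines: attention projections carry optional Q/K/V/O bias tensors, applied only when the loaded model actually provides them. This is an illustrative numpy sketch, not the llama.cpp implementation (which is C++ over ggml tensors); all names below are hypothetical.

```python
import numpy as np

def linear(x, w, b=None):
    # Optional-bias linear projection: some models (e.g. InternLM,
    # LLaMAfied Qwen) ship QKVO biases, vanilla LLaMA does not.
    y = x @ w
    if b is not None:
        y = y + b
    return y

def attention(x, wq, wk, wv, wo, bq=None, bk=None, bv=None, bo=None):
    # Single-head scaled dot-product attention with optional QKVO biases.
    q = linear(x, wq, bq)
    k = linear(x, wk, bk)
    v = linear(x, wv, bv)
    scores = (q @ k.T) / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerically stable softmax
    probs = np.exp(scores)
    probs /= probs.sum(axis=-1, keepdims=True)
    return linear(probs @ v, wo, bo)
```

Passing `None` for every bias reproduces the original bias-free LLaMA path, which mirrors the loader-side check the commit adds: biases are read only if the corresponding tensors exist in the model file.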
Diffstat (limited to 'examples/tokenize')
0 files changed, 0 insertions, 0 deletions