author | Giuseppe Scrivano <giuseppe@scrivano.org> | 2024-05-28 20:49:49 +0200
committer | GitHub <noreply@github.com> | 2024-05-28 21:49:49 +0300
commit | 5442939fcc5e6ae41abf40612a95fd71377e487e (patch)
tree | 1402af1bf61b8a110252b748b9d453e09946d5cf /ggml-sycl.cpp
parent | 56411a950f255b523a9edd684fd1632752474399 (diff)
llama : support small Granite models (#7481)
* Add optional MLP bias for Granite models
Add an optional MLP bias to ARCH_LLAMA to support Granite models.
Partially addresses ggerganov/llama.cpp/issues/7116; further changes are
still needed to fully support Granite.
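As a rough illustration of the change described above, here is a minimal NumPy sketch of a LLaMA-style gated MLP in which the bias terms are optional; the function and parameter names are invented for this example and are not the llama.cpp implementation.

```python
import numpy as np

def silu(x):
    return x / (1.0 + np.exp(-x))

def gated_mlp(x, w_gate, w_up, w_down, b_gate=None, b_up=None, b_down=None):
    # LLaMA-style feed-forward block: gate and up projections, SiLU gating,
    # then a down projection. The biases are optional so that models without
    # them (plain LLaMA) and models with them (Granite) share one code path.
    gate = x @ w_gate
    up = x @ w_up
    if b_gate is not None:
        gate = gate + b_gate
    if b_up is not None:
        up = up + b_up
    out = (silu(gate) * up) @ w_down
    if b_down is not None:
        out = out + b_down
    return out
```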
* llama: honor add_space_prefix from the model configuration
Propagate the add_space_prefix setting from the HF model configuration
to the GGUF file and honor it in the gpt2 tokenizer.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
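For illustration only, a hedged sketch of what honoring such a flag can look like on the tokenization side; the helper name, the default value, and the stand-in tokenizer are assumptions for this example, not the llama.cpp code.

```python
def encode_with_space_prefix(text, encode, add_space_prefix=True):
    # When the flag read from the model metadata is true, prepend a space
    # before tokenizing, mirroring the SentencePiece-style behaviour of the
    # original LLaMA tokenizer; when false, tokenize the raw text as-is.
    if add_space_prefix and text and not text.startswith(" "):
        text = " " + text
    return encode(text)

# stand-in "tokenizer" (character split) just to show the effect of the flag
print(encode_with_space_prefix("hi", list))                          # [' ', 'h', 'i']
print(encode_with_space_prefix("hi", list, add_space_prefix=False))  # ['h', 'i']
```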
* llama: add support for small Granite models
It works only for the small 3B and 8B models.
The convert-hf-to-gguf.py script uses the vocabulary size of the
Granite models to detect them and set the correct configuration.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
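Below is a hedged sketch of detecting a model family from its vocabulary size during conversion; the constant and helper name are placeholders for illustration, not the exact check used in convert-hf-to-gguf.py.

```python
# Assumed placeholder value; the real converter compares against the actual
# vocabulary size of the Granite checkpoints read from the HF config.
GRANITE_SMALL_VOCAB_SIZE = 49152

def looks_like_small_granite(hparams: dict) -> bool:
    # The small Granite models reuse the LLaMA architecture, so the converter
    # cannot rely on the architecture string alone and falls back to the
    # vocabulary size to decide whether Granite-specific settings apply.
    return hparams.get("vocab_size") == GRANITE_SMALL_VOCAB_SIZE

if __name__ == "__main__":
    print(looks_like_small_granite({"vocab_size": 49152}))  # True
    print(looks_like_small_granite({"vocab_size": 32000}))  # False
```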
---------
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Co-authored-by: Steffen Roecker <sroecker@redhat.com>
Diffstat (limited to 'ggml-sycl.cpp')
0 files changed, 0 insertions, 0 deletions