| author | Olivier Chafik <ochafik@users.noreply.github.com> | 2023-08-27 15:13:31 +0100 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2023-08-27 17:13:31 +0300 |
| commit | 230d46c723edf5999752e4cb67fd94edb19ef9c7 (patch) | |
| tree | 09a7ca40641b05eb2ece183c9a3ead9d9e92189c /examples/convert-llama2c-to-ggml/README.md | |
| parent | 463173a6c0ff353055eb90665794884c888c790f (diff) | |
examples : update llama2.c converter to read vocab and write models in GGUF format (#2751)
* llama2.c: direct gguf output (WIP)
* Simplify vector building logic
* llama2.c gguf conversion: fix token types in converter
* llama2.c: support copying vocab from a llama gguf model file
* llama2.c: update default path for vocab model + readme
* llama2.c: use defines for gguf keys
* llama2.c: escape whitespace w/ U+2581 in vocab converter the llama.cpp way (see the sketch after this list)
* llama2.c converter: cleanups + take n_ff from config
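The U+2581 escaping mentioned in the list above is the SentencePiece convention that llama.cpp's vocabulary format uses: spaces inside tokens are stored as U+2581 (LOWER ONE EIGHTH BLOCK, "▁"). A minimal sketch of that transform, written from the convention itself rather than quoting the converter's exact code:

```cpp
// Illustrative sketch of SentencePiece-style whitespace escaping: every
// space in a vocab token is replaced with U+2581 ("▁"), whose UTF-8
// encoding is the 3-byte sequence 0xE2 0x96 0x81.
#include <string>

static std::string escape_whitespace(const std::string & text) {
    std::string result;
    for (char c : text) {
        if (c == ' ') {
            result += "\xE2\x96\x81"; // U+2581 LOWER ONE EIGHTH BLOCK
        } else {
            result += c;
        }
    }
    return result;
}
```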
Diffstat (limited to 'examples/convert-llama2c-to-ggml/README.md')
-rw-r--r-- | examples/convert-llama2c-to-ggml/README.md | 8 |
1 file changed, 2 insertions, 6 deletions
````diff
diff --git a/examples/convert-llama2c-to-ggml/README.md b/examples/convert-llama2c-to-ggml/README.md
index fd561fcb..0f37d295 100644
--- a/examples/convert-llama2c-to-ggml/README.md
+++ b/examples/convert-llama2c-to-ggml/README.md
@@ -12,18 +12,14 @@ usage: ./convert-llama2c-to-ggml [options]
 
 options:
   -h, --help                       show this help message and exit
-  --copy-vocab-from-model FNAME    model path from which to copy vocab (default 'tokenizer.bin')
+  --copy-vocab-from-model FNAME    path of gguf llama model or llama2.c vocabulary from which to copy vocab (default 'models/7B/ggml-model-f16.gguf')
   --llama2c-model FNAME            [REQUIRED] model path from which to load Karpathy's llama2.c model
   --llama2c-output-model FNAME     model path to save the converted llama2.c model (default ak_llama_model.bin')
 ```
 
 An example command using a model from [karpathy/tinyllamas](https://huggingface.co/karpathy/tinyllamas) is as follows:
 
-`$ ./convert-llama2c-to-ggml --copy-vocab-from-model ../llama2.c/tokenizer.bin --llama2c-model stories42M.bin --llama2c-output-model stories42M.ggmlv3.bin`
-
-For now the generated model is in the legacy GGJTv3 format, so you need to convert it to gguf manually:
-
-`$ python ./convert-llama-ggmlv3-to-gguf.py --eps 1e-5 --input stories42M.ggmlv3.bin --output stories42M.gguf.bin`
+`$ ./convert-llama2c-to-ggml --copy-vocab-from-model llama-2-7b-chat.gguf.q2_K.bin --llama2c-model stories42M.bin --llama2c-output-model stories42M.gguf.bin`
 
 Now you can use the model with a command like:
````
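The README change above collapses the old two-step flow (convert to legacy GGJTv3, then run `convert-llama-ggmlv3-to-gguf.py`) into a single direct conversion, since the converter can now both read vocab from a gguf model and write gguf output itself. A rough sketch of the vocab-copy half against ggml's gguf C API; the key name follows the GGUF spec (`tokenizer.ggml.tokens`), scores and token types are omitted, so treat this as an illustration rather than the converter's actual code:

```cpp
// Sketch: read the token list from a gguf model's metadata so it can be
// copied into the converted llama2.c model. Uses ggml's gguf C API with
// no_alloc=true and no ggml context, i.e. metadata only, no tensor data.
#include "ggml.h"

#include <cstdio>
#include <string>
#include <vector>

static bool copy_vocab_from_gguf(const char * fname, std::vector<std::string> & tokens) {
    struct gguf_init_params params = { /*.no_alloc =*/ true, /*.ctx =*/ NULL };
    struct gguf_context * ctx = gguf_init_from_file(fname, params);
    if (!ctx) {
        fprintf(stderr, "failed to open gguf file: %s\n", fname);
        return false;
    }

    const int tok_idx = gguf_find_key(ctx, "tokenizer.ggml.tokens");
    if (tok_idx < 0) {
        fprintf(stderr, "no tokenizer vocabulary in %s\n", fname);
        gguf_free(ctx);
        return false;
    }

    const int n_vocab = gguf_get_arr_n(ctx, tok_idx);
    tokens.reserve(n_vocab);
    for (int i = 0; i < n_vocab; i++) {
        tokens.push_back(gguf_get_arr_str(ctx, tok_idx, i));
    }

    gguf_free(ctx);
    return true;
}
```

On the command-line side, only the single invocation shown in the diff is needed: per the updated help text, `--copy-vocab-from-model` now accepts either a gguf llama model or a llama2.c vocabulary file.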