author | Mikko Juola <mikjuo@gmail.com> | 2024-05-24 18:14:42 -0700 |
---|---|---|
committer | GitHub <noreply@github.com> | 2024-05-25 11:14:42 +1000 |
commit | 57684331fc2d685f7d1f5775af0b9e47d1829833 (patch) | |
tree | 46cd7ea076f08aaf17c95aac0f57ac0cf8af3ad0 /examples/llama.android/app/src/main/cpp/llama-android.cpp | |
parent | b83bab15a5d2a1e7807d09613a9b34309d86cfaa (diff) |
Make tokenize CLI tool have nicer command line arguments. (#6188)
* Make the tokenize.cpp CLI tool nicer.
Before this commit, tokenize was a simple CLI tool like this:
tokenize MODEL_FILENAME PROMPT [--ids]
This simple tool loads the model, takes the prompt, and shows the tokens
llama.cpp produces for it.
This changeset makes tokenize more sophisticated and more useful
for debugging and troubleshooting:
tokenize [-m, --model MODEL_FILENAME]
[--ids]
[--stdin]
[--prompt]
[-f, --file]
[--no-bos]
[--log-disable]
It also behaves better on Windows now: it interprets and renders Unicode
from command line arguments and pipes correctly, regardless of the code
page set on the user's terminal.
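Windows hands programs their command line natively as UTF-16, so one common way to become independent of the code page is to fetch the wide-character arguments and convert them to UTF-8 before use. The sketch below illustrates only that conversion step, written by hand so it is portable. The helper name utf16_to_utf8 is hypothetical, and the actual tokenize.cpp may rely on Win32 conversion APIs instead.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not taken from tokenize.cpp): converts UTF-16 code
// units to a UTF-8 byte string. Unpaired surrogates become U+FFFD.
std::string utf16_to_utf8(const std::u16string & in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        const bool is_high = cp >= 0xD800 && cp <= 0xDBFF;
        const bool is_low  = cp >= 0xDC00 && cp <= 0xDFFF;
        if (is_high && i + 1 < in.size() &&
            in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            // Combine a valid surrogate pair into one code point.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;
        } else if (is_high || is_low) {
            cp = 0xFFFD; // replacement character for an unpaired surrogate
        }
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

Once every argument is UTF-8, the rest of the tool can treat prompts as plain byte strings, whatever code page the terminal uses.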
* style fix: strlen(str) == 0 --> *str == 0
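The two forms of the emptiness check give the same answer for any valid NUL-terminated string; the difference is that strlen walks the whole string while the dereference looks only at the first byte. A minimal sketch with hypothetical function names:

```cpp
#include <cassert>
#include <cstring>

// Hypothetical names, for illustration only. Both functions require a
// non-null pointer to a NUL-terminated string.

// Computes the full length first, then compares it with zero.
inline bool is_empty_strlen(const char * str) {
    return std::strlen(str) == 0;
}

// Inspects only the first byte: an empty string starts with the terminator.
inline bool is_empty_first_byte(const char * str) {
    return *str == 0;
}
```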
* Simplify tokenize.cpp by removing support for positional-style arguments.
It must now be invoked with long --model, --prompt, etc. arguments only.
This shortens the code.
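A rough sketch of what option-only parsing can look like follows. It is not the code from tokenize.cpp: the struct, its field names, and the function name are hypothetical. It also assumes that --prompt and --file each take a following value, that a model is required, and that exactly one prompt source (--prompt, --file, or --stdin) must be given; the usage text above does not spell out any of these.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical container for parsed options; names are illustrative.
struct tokenize_args {
    std::string model_path;
    std::string prompt;
    std::string prompt_file;
    bool print_ids   = false;
    bool from_stdin  = false;
    bool no_bos      = false;
    bool log_disable = false;
};

// Returns false on a malformed command line. Anything that is not a known
// option, including a positional argument, is rejected. On failure `out`
// may be partially filled and should be ignored.
bool parse_tokenize_args(const std::vector<std::string> & argv, tokenize_args & out) {
    int prompt_sources = 0; // assumption: exactly one source is allowed
    for (std::size_t i = 1; i < argv.size(); ++i) { // argv[0] is the program name
        const std::string & arg = argv[i];
        const bool is_model  = arg == "-m" || arg == "--model";
        const bool is_prompt = arg == "--prompt";
        const bool is_file   = arg == "-f" || arg == "--file";
        if (is_model || is_prompt || is_file) {
            // These options consume the next argument as their value.
            if (i + 1 >= argv.size()) {
                return false;
            }
            const std::string & value = argv[++i];
            if (is_model) {
                out.model_path = value;
            } else if (is_prompt) {
                out.prompt = value;
                ++prompt_sources;
            } else {
                out.prompt_file = value;
                ++prompt_sources;
            }
        } else if (arg == "--ids") {
            out.print_ids = true;
        } else if (arg == "--stdin") {
            out.from_stdin = true;
            ++prompt_sources;
        } else if (arg == "--no-bos") {
            out.no_bos = true;
        } else if (arg == "--log-disable") {
            out.log_disable = true;
        } else {
            return false; // unknown option or positional argument
        }
    }
    return !out.model_path.empty() && prompt_sources == 1;
}
```

With this shape, the old positional form (tokenize MODEL_FILENAME PROMPT) fails to parse, which matches the behavior change described above.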
* tokenize.cpp: iostream header no longer required
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: brian khuu <mofosyne@gmail.com>
Diffstat (limited to 'examples/llama.android/app/src/main/cpp/llama-android.cpp')
0 files changed, 0 insertions, 0 deletions