author | Pierrick Hymbert <pierrick.hymbert@gmail.com> | 2024-03-23 18:07:00 +0100 |
---|---|---|
committer | GitHub <noreply@github.com> | 2024-03-23 18:07:00 +0100 |
commit | f482bb2e4920e544651fb832f2e0bcb4d2ff69ab (patch) | |
tree | 9fabefd6f3b34aef6bf13a8469c7cdf363cc88cb /examples/server/server.cpp | |
parent | 1997577d5e121568ae39f538021733ccd4278c23 (diff) |
common: llama_load_model_from_url split support (#6192)
* llama: llama_split_prefix fix strncpy does not include string termination
common: llama_load_model_from_url:
- fix header name case sensitive
- support downloading additional split in parallel
- hide password in url
* common: EOL EOF
* common: remove redundant LLAMA_CURL_MAX_PATH_LENGTH definition
* common: change max url max length
* common: minor comment
* server: support HF URL options
* llama: llama_model_loader fix log
* common: use a constant for max url length
* common: clean up curl if file cannot be loaded in gguf
* server: tests: add split tests, and HF options params
* common: move llama_download_hide_password_in_url inside llama_download_file as a lambda
* server: tests: enable back Release test on PR
* spacing
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* spacing
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* spacing
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
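The "hide password in url" item above can be illustrated with a minimal sketch. The function name `hide_password_in_url` and the `********` mask are assumptions made for illustration, not the code from this commit. The idea is simply to replace whatever sits between `://` and `@` before a download URL is written to a log.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not the commit's code): mask the credentials part of a
// URL of the form scheme://user:password@host/path so it can be logged safely.
std::string hide_password_in_url(const std::string & url) {
    const std::size_t scheme_end = url.find("://");
    if (scheme_end == std::string::npos) {
        return url; // no scheme separator: leave the string untouched
    }
    const std::size_t authority_start = scheme_end + 3;

    // Credentials, if present, end at the first '@' that precedes the path.
    const std::size_t path_start = url.find('/', authority_start);
    const std::size_t at_pos     = url.find('@', authority_start);
    if (at_pos == std::string::npos ||
        (path_start != std::string::npos && at_pos > path_start)) {
        return url; // no credentials in the authority part
    }

    // Keep "scheme://", drop "user:password", keep "@host/path".
    return url.substr(0, authority_start) + "********" + url.substr(at_pos);
}
```

An `@` that appears only in the path is left alone, so file names containing `@` are not mangled.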
Diffstat (limited to 'examples/server/server.cpp')
-rw-r--r-- | examples/server/server.cpp | 18 |
1 files changed, 17 insertions, 1 deletions
diff --git a/examples/server/server.cpp b/examples/server/server.cpp
index 27bd2dd7..b02c2546 100644
--- a/examples/server/server.cpp
+++ b/examples/server/server.cpp
@@ -2208,7 +2208,11 @@ static void server_print_usage(const char * argv0, const gpt_params & params, co
     printf(" -m FNAME, --model FNAME\n");
     printf(" model path (default: %s)\n", params.model.c_str());
     printf(" -mu MODEL_URL, --model-url MODEL_URL\n");
-    printf(" model download url (default: %s)\n", params.model_url.c_str());
+    printf(" model download url (default: unused)\n");
+    printf(" -hfr REPO, --hf-repo REPO\n");
+    printf(" Hugging Face model repository (default: unused)\n");
+    printf(" -hff FILE, --hf-file FILE\n");
+    printf(" Hugging Face model file (default: unused)\n");
     printf(" -a ALIAS, --alias ALIAS\n");
     printf(" set an alias for the model, will be added as `model` field in completion response\n");
     printf(" --lora FNAME apply LoRA adapter (implies --no-mmap)\n");
@@ -2337,6 +2341,18 @@ static void server_params_parse(int argc, char ** argv, server_params & sparams,
                 break;
             }
             params.model_url = argv[i];
+        } else if (arg == "-hfr" || arg == "--hf-repo") {
+            if (++i >= argc) {
+                invalid_param = true;
+                break;
+            }
+            params.hf_repo = argv[i];
+        } else if (arg == "-hff" || arg == "--hf-file") {
+            if (++i >= argc) {
+                invalid_param = true;
+                break;
+            }
+            params.hf_file = argv[i];
         } else if (arg == "-a" || arg == "--alias") {
             if (++i >= argc) {
                 invalid_param = true;
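The second hunk follows the parser's usual "flag followed by a value" pattern: advance the index, fail if no value is left, otherwise store it. A standalone sketch of that pattern is shown below. The `hf_params` struct and the `parse_hf_args` function are hypothetical stand-ins for `gpt_params` and `server_params_parse`, and the sketch returns `false` where the real code sets `invalid_param`.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the two gpt_params fields touched by this diff.
struct hf_params {
    std::string hf_repo;
    std::string hf_file;
};

// Parse "-hfr/--hf-repo" and "-hff/--hf-file". Returns false when a flag is
// missing its value or when an argument is not one of the two flags.
bool parse_hf_args(const std::vector<std::string> & args, hf_params & params) {
    for (std::size_t i = 0; i < args.size(); ++i) {
        const std::string & arg = args[i];
        if (arg == "-hfr" || arg == "--hf-repo") {
            if (++i >= args.size()) {
                return false; // flag given without a value
            }
            params.hf_repo = args[i];
        } else if (arg == "-hff" || arg == "--hf-file") {
            if (++i >= args.size()) {
                return false; // flag given without a value
            }
            params.hf_file = args[i];
        } else {
            return false; // this sketch rejects every other argument
        }
    }
    return true;
}
```

Incrementing `i` before the bounds check is what lets a trailing flag with no value be reported as invalid instead of reading past the end of the argument list.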