common: llama_load_model_from_url using --model-url (#6098)

* common: llama_load_model_from_url with libcurl dependency

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
author: Pierrick Hymbert <pierrick.hymbert@gmail.com> 2024-03-17 19:12:37 +0100
committer: GitHub <noreply@github.com> 2024-03-17 19:12:37 +0100
commit: d01b3c4c32357567f3531d4e6ceffc5d23e87583 (patch)
tree: 80e0a075a8b120d6b5b095a73cc36cb2a4535aed /examples/main
parent: cd776c37c945bf58efc8fe44b370456680cb1b59 (diff)
1 files changed, 1 insertions, 0 deletions
diff --git a/examples/main/README.md b/examples/main/README.md
index 7f84e426..6a8d1e1c 100644
--- a/examples/main/README.md
+++ b/examples/main/README.md
@@ -67,6 +67,7 @@ main.exe -m models\7B\ggml-model.bin --ignore-eos -n -1 --random-prompt
 In this section, we cover the most commonly used options for running the `main` program with the LLaMA models:
 
 -   `-m FNAME, --model FNAME`: Specify the path to the LLaMA model file (e.g., `models/7B/ggml-model.bin`).
+-   `-mu MODEL_URL, --model-url MODEL_URL`: Specify a remote HTTP URL from which to download the model file (e.g., https://huggingface.co/ggml-org/models/resolve/main/phi-2/ggml-model-q4_0.gguf).
 -   `-i, --interactive`: Run the program in interactive mode, allowing you to provide input directly and receive real-time responses.
 -   `-ins, --instruct`: Run the program in instruction mode, which is particularly useful when working with Alpaca models.
 -   `-n N, --n-predict N`: Set the number of tokens to predict when generating text. Adjusting this value can influence the length of the generated text.
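
The new `--model-url` option amounts to "download the file, then load it as a model". The Python sketch below illustrates that idea only and is not the code added by this commit, which is C++ built on libcurl. The helper name `fetch_model` is hypothetical, and skipping the download when the file already exists is an assumption, not something the commit message confirms.

```python
import os
import urllib.request


def fetch_model(model_url: str, model_path: str) -> str:
    """Download model_url to model_path unless the file already exists.

    Hypothetical helper for illustration; not part of llama.cpp.
    """
    if not os.path.exists(model_path):
        # Download to a temporary name first so an interrupted transfer
        # never leaves a truncated file at model_path.
        tmp_path = model_path + ".part"
        urllib.request.urlretrieve(model_url, tmp_path)
        os.replace(tmp_path, model_path)
    return model_path
```

The returned path could then be passed to `main` through the existing `-m FNAME` option, which is roughly what `-mu MODEL_URL` is meant to save the user from doing by hand.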