author | Xuan Son Nguyen <thichthat@gmail.com> | 2024-06-15 18:53:40 +0200 |
---|---|---|
committer | GitHub <noreply@github.com> | 2024-06-15 18:53:40 +0200 |
commit | 0c7b3595b9e5ad2355818e259f06b0dc3f0065b3 (patch) | |
tree | 1146ce43d46ad84568728a0a78ee5aa79c0e9e20 /examples/cvector-generator/README.md | |
parent | 7b2f4a7d193ef2475259bbe7656fcccfab4b1217 (diff) |
Add `cvector-generator` example (#7514)
* add control-vector-generator
* calc diff
* add comments
* proof-of-concept stdlib implementation
Implements PCA and file writing using mostly standard libraries. The output is recognized as a functional control vector, but outputs gibberish.
* param parsing, refactor, comments
Added basic command-line parameters for outfile and one each positive/negative prompt.
Refactored some messy code in PCA computation and GGUF exporting.
Left a bunch of comments regarding further work needed.
* example template completions
Implements an example template set built from the positive/negative prompts like the control vector Python implementation.
* add multi prompts, multi-thread for PCA
* fix mem error
* add debugs
* fix matrix transpose multiplication
you have got to be kidding me
* preliminary template/multiprompt support
model is running out of context and that ought to be fixed (segfaulting) but other than that it looks goodish
* fix zero output & param parsing, functional templating
fixed a bug where the output file had no tensor data/was all zero
fixed a bug where single hyphen flags were not being correctly parsed
implements creation of templated prompts from input (still need to adapt based on model)
* fix square_diff matmul index range and CRLF->LF line endings
fixed a logic error where square_diff would not multiply all rows
fixed a formatting error where the provided completions.txt had CRLF line endings
* add command-line args for num threads, num completions file lines, always reload model
refactored a few things and did what the commit message says on the tin
* code aestheticization
* fix compiler warnings
* in-series multithreading for prompt embedding?
added commented-out code to attempt to start implementing multithreading for embedding in main
* remove unnecessary multithreading
* interim fix memory leak
* translated everything but PCA (I think)
* tentatively translate the rest
* fix ggml errors and make new ones
at least it compiles and runs
* fix cb_eval
* temporary commit while I move dev environments
it finally outputs a functioning control vector - "functioning" in the sense that it can be loaded and it clearly has the right idea, but makes the model incoherent
* update debug statements
* pre-tokenize so we can allocate correct memory to ctx_diffs_wrapped
* update comments
* (wip) refactor
* clean up PCA ggml implementation
* fix shape of v_diff_original
* add n_batch for pca
* working version
* remember to copy back the last_eigenvector
* fix n_completions
* bring back n_completions
* default n_pca_batch to 20
* fix macos build
* add to makefile all targets
* use ggml_format_name
* add readme
* fix .editorconfig
* use ggml_backend_tensor_copy
* attempt to fix compile problem on mac
* fix compile warn
* reuse allocr
* move param parser to common
* better error handling
* clean up a bit
* add print_usage
* shorten help msg
* beautify help msg
* escape prompt by default
* change compile target to llama-cvector-generator
* typo
* disable GPU for PCA
* code style
---------
Co-authored-by: Christian Zhou-Zheng <christianzhouzheng@gmail.com>
Diffstat (limited to 'examples/cvector-generator/README.md')
-rw-r--r-- | examples/cvector-generator/README.md | 34 |
1 files changed, 34 insertions, 0 deletions
diff --git a/examples/cvector-generator/README.md b/examples/cvector-generator/README.md
new file mode 100644
index 00000000..7b0e79c1
--- /dev/null
+++ b/examples/cvector-generator/README.md
@@ -0,0 +1,34 @@
+# cvector-generator
+
+This example demonstrates how to generate a control vector using gguf models.
+
+Related PRs:
+- [Add support for control vectors](https://github.com/ggerganov/llama.cpp/pull/5970)
+- (Issue) [Generate control vector using llama.cpp](https://github.com/ggerganov/llama.cpp/issues/6880)
+- [Add cvector-generator example](https://github.com/ggerganov/llama.cpp/pull/7514)
+
+## Examples
+
+```sh
+# CPU only
+./cvector-generator -m ./dolphin-2.0-mistral-7b.Q4_K_M.gguf
+
+# With GPU
+./cvector-generator -m ./dolphin-2.0-mistral-7b.Q4_K_M.gguf -ngl 99
+
+# With advanced options
+./cvector-generator -m ./dolphin-2.0-mistral-7b.Q4_K_M.gguf -ngl 99 --completions 128 --pca-iter 2000 --batch-pca 100
+
+# To see help message
+./cvector-generator -h
+# Then, have a look at "cvector" section
+```
+
+## Tips and tricks
+
+If you have multiple lines per prompt, you can escape the newline character (change it to `\n`). For example:
+
+```
+<|im_start|>system\nAct like a person who is extremely happy.<|im_end|>
+<|im_start|>system\nYou are in a very good mood today<|im_end|>
+```
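
The "Tips and tricks" note in the README above says a multi-line prompt can be kept on one line by writing each newline as `\n`. The following is a minimal sketch of that conversion, not code from this commit: the helper names `escape_prompt` and `unescape_prompt` are hypothetical, and it assumes the prompt file holds one prompt per line and that prompts do not already contain a literal backslash followed by `n`.

```python
def escape_prompt(prompt: str) -> str:
    # Hypothetical helper: normalise CRLF to LF, then write every real
    # newline as the two characters backslash + 'n' so the prompt fits
    # on a single line of a prompt file.
    return prompt.replace("\r\n", "\n").replace("\n", "\\n")


def unescape_prompt(line: str) -> str:
    # Hypothetical inverse: turn each literal backslash + 'n' back into
    # a real newline. This is only a faithful round trip when the original
    # prompt had no literal backslash + 'n' of its own.
    return line.replace("\\n", "\n")


if __name__ == "__main__":
    multi_line = "<|im_start|>system\nAct like a person who is extremely happy.<|im_end|>"
    print(escape_prompt(multi_line))
```

Normalising CRLF first mirrors the line-ending fix mentioned in the commit message, so a prompt pasted from a Windows editor does not leave stray carriage returns in the file.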