Age | Commit message | Author |
|
* Merging mainline - WIP
* Merging mainline - WIP
AVX2 and CUDA appear to work.
CUDA performance seems slightly (~1-2%) lower, as is so often
the case with llama.cpp/ggml after some "improvements" have been made.
* Merging mainline - fix Metal
* Remove check
---------
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
|
|
* Fix phi3 template matching vs zephyr
* Add regression test for new phi3 chat template
* Implement review suggestions
* Fix phi3 jinja test templates & match by <|end|>
* Apply suggestion
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
* Add all phi3 template variants in tests
* Remove unneeded message trimming
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
* Fix tests to not expect trimmed messages
---------
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
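The gist of the fix, sketched as substring checks on the template source (the helper names below are illustrative, not the actual code): zephyr-style templates also contain <|user|>/<|assistant|> tags, so phi-3 has to be recognised first, keyed off its distinctive <|end|> terminator.

    #include <string>

    // Illustrative: match phi-3 before zephyr, using <|end|> (absent from zephyr) as the tiebreaker.
    static bool template_is_phi3(const std::string & tmpl) {
        return tmpl.find("<|assistant|>") != std::string::npos &&
               tmpl.find("<|end|>")       != std::string::npos;
    }

    static bool template_is_zephyr(const std::string & tmpl) {
        return !template_is_phi3(tmpl) &&
               tmpl.find("<|user|>") != std::string::npos;
    }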
|
|
* Add phi 3 chat template & tests
* test : fix chat template result
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
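For reference, the turn layout the phi-3 template produces looks roughly like this (a hand-written expectation in the spirit of the added test, not copied from it; message contents are made up):

    // Each turn is "<|role|>\n<content><|end|>\n"; requesting a generation prompt
    // appends a bare "<|assistant|>\n" header at the end.
    const char * expected_phi3 =
        "<|system|>\nYou are a helpful assistant<|end|>\n"
        "<|user|>\nHello<|end|>\n"
        "<|assistant|>\nHi there<|end|>\n"
        "<|user|>\nWho are you<|end|>\n"
        "<|assistant|>\n";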
|
|
* Added llama-3 chat template
* Update llama.cpp
Co-authored-by: Samuel Tallet <36248671+SamuelTallet@users.noreply.github.com>
* Update llama.cpp
Co-authored-by: Samuel Tallet <36248671+SamuelTallet@users.noreply.github.com>
* Update tests/test-chat-template.cpp
Co-authored-by: Samuel Tallet <36248671+SamuelTallet@users.noreply.github.com>
* Added EOS stop sequence according to https://github.com/ggerganov/llama.cpp/pull/6751#issuecomment-2065602862
* Removed the addition of the BOS token before the first message
* Removed the BOS token from the expected llama-3 output
* Update tests/test-chat-template.cpp
Co-authored-by: Rene Leonhardt <65483435+reneleonhardt@users.noreply.github.com>
* Update tests/test-chat-template.cpp
Co-authored-by: Rene Leonhardt <65483435+reneleonhardt@users.noreply.github.com>
* Added <|end_of_text|> as another stop token
* Reverted the previous change that added the end_of_text stop word for llama 3
---------
Co-authored-by: Wouter Tichelaar <tichelaarw@spar.net>
Co-authored-by: Samuel Tallet <36248671+SamuelTallet@users.noreply.github.com>
Co-authored-by: Rene Leonhardt <65483435+reneleonhardt@users.noreply.github.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
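The net effect of the BOS-related changes above is that the template itself emits no BOS (the tokenizer adds it) and each turn ends with <|eot_id|>; roughly (contents made up):

    // Llama 3 layout: header tokens around each role, content terminated by <|eot_id|>,
    // and an empty assistant header appended when a generation prompt is requested.
    const char * expected_llama3 =
        "<|start_header_id|>system<|end_header_id|>\n\nYou are a helpful assistant<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\nHello<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\nHi there<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n";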
|
|
* Add chat template for command-r model series
* Fix indentation
* Add chat template test for command-r models and update the implementation to trim whitespace
* Remove debug print
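A sketch of the command-r layout this adds, with the whitespace trimming noted above applied to each message (role token names from the Cohere models; contents made up):

    // Command-R style: every message sits inside <|START_OF_TURN_TOKEN|>...<|END_OF_TURN_TOKEN|>
    // with a role token, and the generation prompt opens a new CHATBOT turn.
    const char * expected_command_r =
        "<|START_OF_TURN_TOKEN|><|SYSTEM_TOKEN|>You are a helpful assistant<|END_OF_TURN_TOKEN|>"
        "<|START_OF_TURN_TOKEN|><|USER_TOKEN|>Hello<|END_OF_TURN_TOKEN|>"
        "<|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>Hi there<|END_OF_TURN_TOKEN|>"
        "<|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>";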
|
|
* Add openchat chat template
* Add chat template test for openchat
* Add chat template for vicuna
* Add chat template for orca-vicuna
* Add EOS for vicuna templates
* Combine vicuna chat templates
* Add tests for openchat and vicuna chat templates
* Add chat template for alpaca
* Add separate template name for vicuna-orca
* Remove alpaca, match deepseek with jinja output
* Regenerate chat template test with add_generation_prompt
* Separate deepseek bos from system message
* Match openchat template with jinja output
* Remove BOS token from templates, unprefix openchat
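Roughly what the final openchat and vicuna layouts look like after the BOS removal and the unprefixed system handling above (contents made up; the exact strings in the tests may differ):

    // OpenChat: system text emitted unprefixed, user/assistant turns prefixed with
    // "GPT4 Correct <Role>: " and terminated by <|end_of_turn|>; no BOS in the template.
    const char * expected_openchat =
        "You are a helpful assistant<|end_of_turn|>"
        "GPT4 Correct User: Hello<|end_of_turn|>"
        "GPT4 Correct Assistant: Hi there<|end_of_turn|>"
        "GPT4 Correct Assistant:";

    // Vicuna: plain USER:/ASSISTANT: turns with assistant replies closed by </s>;
    // the vicuna-orca variant additionally prefixes the system line with "SYSTEM: ".
    const char * expected_vicuna =
        "You are a helpful assistant\n\n"
        "USER: Hello\n"
        "ASSISTANT: Hi there</s>\n"
        "ASSISTANT:";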
|
|
* add gemma chat template
* gemma: only apply system_prompt to non-model messages
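A sketch of the gemma layout and of the rule in the second commit: gemma has no system role, so the system prompt is folded into the first non-model (user) turn rather than getting a turn of its own, and "assistant" is rendered as "model" (contents and the exact separator are illustrative):

    // Gemma uses <start_of_turn>/<end_of_turn> with only "user" and "model" roles; the
    // system prompt is prepended to the first user turn instead of becoming its own turn.
    const char * expected_gemma =
        "<start_of_turn>user\n"
        "You are a helpful assistant\n\nHello<end_of_turn>\n"
        "<start_of_turn>model\nHi there<end_of_turn>\n"
        "<start_of_turn>model\n";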
|
|
* server: fallback to chatml
* add new chat template
* server: add AlphaMonarch to test chat template
* server: only check model template if there is no custom tmpl
* remove TODO
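A sketch of the fallback behaviour described above (the helper name and surrounding wiring are made up; only the idea of probing the model's embedded template with llama_chat_apply_template is taken from the change):

    #include <cstdio>
    #include <string>
    #include "llama.h"

    // Illustrative: a user-supplied --chat-template wins unconditionally; the model's embedded
    // template is only probed, and chatml substituted, when no custom template was given.
    static std::string resolve_chat_template(const llama_model * model, const std::string & cli_template) {
        if (!cli_template.empty()) {
            return cli_template;   // custom template given: use it, skip checking the model's
        }
        llama_chat_message probe[] = {{"user", "test"}};
        char buf[256];
        int32_t res = llama_chat_apply_template(model, nullptr, probe, 1, true, buf, sizeof(buf));
        if (res < 0) {
            fprintf(stderr, "model's built-in chat template not supported, falling back to chatml\n");
            return "chatml";
        }
        return "";                 // empty: keep using the model's embedded template
    }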
|
|
* llama: add llama_chat_apply_template
* test-chat-template: remove redundant vector
* chat_template: do not use std::string for buffer
* add clarification for llama_chat_apply_template
* llama_chat_apply_template: add zephyr template
* llama_chat_apply_template: correct docs
* llama_chat_apply_template: use term "chat" everywhere
* llama_chat_apply_template: change variable name to "tmpl"
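A minimal, self-contained sketch of calling the new API, assuming the signature introduced here (model pointer, optional template override, message array, add-assistant flag, output buffer) and assuming "chatml" is accepted as a template name; passing an explicit template means the model pointer is not consulted, which is also how test-chat-template.cpp drives the function.

    #include <cstdio>
    #include <vector>
    #include "llama.h"

    int main() {
        // Conversation to format; roles follow the usual system/user/assistant naming.
        std::vector<llama_chat_message> chat = {
            {"system",    "You are a helpful assistant."},
            {"user",      "Hello"},
            {"assistant", "Hi there"},
            {"user",      "How are you?"},
        };

        // With an explicit template ("chatml") the model pointer is not used, so nullptr is fine;
        // with tmpl == nullptr the template embedded in the model's metadata is used instead.
        std::vector<char> buf(1024);
        int32_t res = llama_chat_apply_template(nullptr, "chatml", chat.data(), chat.size(),
                                                /*add_ass=*/true, buf.data(), (int32_t) buf.size());
        if (res > (int32_t) buf.size()) {
            // The return value is the required size: grow the buffer and apply again.
            buf.resize(res);
            res = llama_chat_apply_template(nullptr, "chatml", chat.data(), chat.size(),
                                            true, buf.data(), (int32_t) buf.size());
        }
        if (res < 0) {
            fprintf(stderr, "chat template not supported\n");
            return 1;
        }
        printf("%.*s\n", res, buf.data());
        return 0;
    }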
|