ik_llama.cpp.git - Unnamed repository; edit this file 'description' to name the repository.

Age	Commit message (Collapse)	Author
2024-01-27	Remove unused data and add fixes (#5154)	Michael Klimenko
	* Remove unused data and add fixes * Add missing file * Address review comments * Replace the scope of vq allocation
2024-01-25	llama : dynamic temperature sampling (#4972)	l3utterfly
	* implemented dynamic temperature sampling from koboldcpp * removed trailing whitespace * removed unused temp parameter in llama_sample_entropy * exposed exponent_val in dynamic temp sampler * added debug check for printf statements * use nullptr in llama_sample_softmax call during llama_sample_entropy this avoids counting the time taken stats twice Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * return earlier if there is only 1 candiate (i.e. max_entropy == 0) * reformat 't' case in llama_sample_queue Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com> * check for one or zero candidates case in llama_sample_entropy --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>
2024-01-15	llama : apply classifier-free guidance to logits directly (#4951)	David Friehs

2023-12-23	server : allow to specify custom prompt for penalty calculation (#3727)	Alexey Parfenov

2023-12-23	grammar : check the full vocab only if necessary (opt) (#4306)	kalomaze
	* Check the full vocab for grammar only if necessary * Fix missing logit restoration step (?) Does this matter, actually? * Fix whitespace / formatting * Adjust comment * Didn't mean to push test gbnf * Split sampling into the helper function (?) And also revert the changes made to the header * common : fix final newline --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-12-06	common : fix compile warning	Georgi Gerganov

2023-12-05	sampling : custom samplers order (#4285)	MaggotHATE
	* Samplers sequence order w parameter * Cleaned commented code * Fixed formatting * Rewrote with unordered_map * Revert and rewrite, too many problems and safeguards would be needed * Fixed code style * Code style fixes according to review * More readable samplers input string, fixed help * Style fix in sampler_queue * Formatting fixes * Fixing whitespaces
2023-11-01	sampling : null grammar field after reset (#3885)	l3utterfly

2023-10-31	samplers : Min-P sampler implementation [alternative to Top P/Top K] (#3841)	kalomaze
	* Introduce the new Min-P sampler by @kalomaze The Min-P sampling method was designed as an alternative to Top-P, and aims to ensure a balance of quality and variety. The parameter p represents the minimum probability for a token to be considered, relative to the probability of the most likely token. * Min-P enabled and set to 0.05 default --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> Co-authored-by: cebtenzzre <cebtenzzre@gmail.com>
2023-10-28	llama : add option for greedy sampling with probs (#3813)	Georgi Gerganov
	* llama : add option for greedy sampling with probs * llama : add comment about llama_sample_token_greedy() missing probs * sampling : temp == 0.0 -> no probs, temp < 0.0 -> probs
2023-10-23	llama : remove token functions with `context` args in favor of `model` (#3720)	Marcus Dunn
	* added `llama_model_token_` variants to all the `llama_token_` functions. * added `LLAMA_API` * formatting Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * removed old `llama_token` functions * changed 3 more functions to take in model - `llama_token_get_text` - `llama_token_get_score` - `llama_token_get_type` * added back docs * fixed main.cpp * changed token functions to use new model variants * changed token functions to use new model variants --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-10-20	sampling : refactor init to use llama_sampling_params (#3696)	Georgi Gerganov
	* sampling : refactor init to use llama_sampling_params * llama : combine repetition, frequency and presence penalties in 1 call * examples : remove embd-input and gptneox-wip * sampling : rename penalty params + reduce size of "prev" vector * sampling : add llama_sampling_print helper * sampling : hide prev behind API and apply #3661 ggml-ci
2023-10-18	speculative : add tree-based sampling example (#3624)	Georgi Gerganov
	* sampling : one sequence per sampling context ggml-ci * speculative : add tree-based sampling support ggml-ci * speculative : reuse the n_parallel CLI param * speculative : refactor sampling * examples : fix build after sampling refactoring ggml-ci * batched : fix n_seq_id * sampling : fix malloc ggml-ci * swift : fix build ggml-ci * swift : try to fix build ggml-ci * prompts : add assistant.txt * common : add llama_batch_add() and llama_batch_clear() helpers * speculative : minor refactor ggml-ci * minor : comments + rename ggml-ci * speculative : fix off-by-one for n_drafted * speculative : fix the n_drafted fix + p constants
2023-10-11	common : fix mirostat state when using multiple sequences (#3543)	Kerfuffle
	* Fix mirostat state when using multiple sequences * Fix mirostat by completely refactoring sampling! * Try to fix zig build. * Export function to fetch/create default sampler states Code formatting cleanups and add some comments Silence a warning about id not being used when logging is disabled * Apply some renaming suggestions. Fix comments that were out of sync with the pull. * Use more consistant naming convention for sampling contexts