ik_llama.cpp.git

Age	Commit message	Author
2023-11-14	Fix MacOS Sonoma model quantization (#4052)	Michael Potter
	Co-authored-by: Jared Van Bortel <jared@nomic.ai>
	Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-11-13	ggml : sync (im2col, GPU conv, 32-bit arm compat) (#4060)	Georgi Gerganov
	ggml-ci
2023-11-01	ggml : fix UNUSED macro (#3762)	Georgi Gerganov

2023-11-01	finetune : add -ngl parameter (#3762)	Andrew Godfrey
	* Add '-ngl' support to finetune.cpp
	* Add fprintf in ggml_cuda_op_add
	  When I tried CUDA offloading during finetuning following the readme, I got an assert here. This probably isn't an important case because inference later gives a warning saying you should use f16 or f32 instead when using lora
	* Add 'finetune.sh', which currently fails when using GPU
	  "error: operator (): Finetuning on tensors with type 'f16' is not yet supported"
	* tweak finetune.sh
	* Suppress some warnings in ggml.c
	* Add f16 implementation to ggml_compute_forward_add_f16_f32
	* Add an f16 case to ggml_add_cast_impl and llama_build_lora_finetune_graphs
	* finetune.sh: Edit comments
	* Add "add_f16_f32_f32_cuda"
	* Tweak an error message
	* finetune.sh: Add an optional LLAMA_MODEL_DIR variable
	* finetune.sh: Add an optional LLAMA_TRAINING_DIR variable
	* train : minor
	* tabs to spaces
	---------
	Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
	Co-authored-by: cebtenzzre <cebtenzzre@gmail.com>
2023-10-30	ggml : move FP16 <-> FP32 code to ggml-impl.h (#3861)	Georgi Gerganov
	* ggml : move FP16 <-> FP32 stuff to ggml-impl.h
	  ggml-ci
	* tests : fix ARM build
	* ggml : explicitly initialize deprecated type traits
	* ggml : add math.h to ggml-impl.h
	* ggml : remove duplicate static assert macros
	* ggml : prefix lookup tables with ggml_
	  ggml-ci
	* ggml-impl : move extern "C" to start of file
2023-10-29	ggml : quantization refactoring (#3833)	Georgi Gerganov
	* ggml : factor all quantization code in ggml-quants
	  ggml-ci
	* ggml-quants : fix Zig and Swift builds + quantize tool
	  ggml-ci
	* quantize : --pure option for disabling k-quant mixtures
	---------
	Co-authored-by: cebtenzzre <cebtenzzre@gmail.com>