ik_llama.cpp.git

Age	Commit message	Author
2024-01-07	ggml : use __builtin_amdgcn_sudot4 in __dp4a for gfx11 (#4787)	Konstantin Zhuravlyov

2024-01-07	server : fix n_predict check (#4798)	Georgi Gerganov

2024-01-06	llama.swiftui : use correct pointer for llama_token_eos (#4797)	Daniel Illescas Romero

2024-01-06	examples : improve base-translate.sh script (#4783)	Georgi Gerganov

2024-01-05	cmake : check for openblas64 (#4134)	a-n-n-a-l-e-e
	openblas v0.3.22 64-bit pkg-config file is named openblas64.pc https://github.com/OpenMathLib/OpenBLAS/issues/3790
2024-01-05	flake.nix : fix typo (#4700)	Ikko Eltociear Ashimine
	betwen -> between
2024-01-05	metal : switch back to default.metallib (ggml/681)	Georgi Gerganov
	ggml-ci
2024-01-05	ggml : fix q2_k bpw in comments (ggml/680)	Georgi Gerganov

2024-01-05	ggml : add error handling to graph_compute (whisper/1714)	Finn Voorhees

2024-01-05	ggml : do not sched_yield when calling BLAS (#4761)	Georgi Gerganov
	* ggml : do not sched_yield when calling BLAS ggml-ci
	* ggml : fix do_yield logic ggml-ci
	* ggml : simplify do_yield logic ggml-ci
2024-01-05	examples : add few-shot translation example (#4783)	Georgi Gerganov

2024-01-04	finetune : remove unused includes (#4756)	Daniel Bevenius
	This commit removes unused includes from finetune.cpp. Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>
2024-01-04	server : send token probs for "stream == false" (#4714)	Georgi Gerganov

2024-01-04	Print backend name on test-backend-ops failure (#4751)	Johannes Gäßler

2024-01-04	llama.swiftui : support loading custom model from file picker (#4767)	singularity
	* swiftui: support load model from file picker
	* swiftui: remove trailing whitespace
2024-01-04	server : fix options in README.md (#4765)	Michael Coppola
	* fix examples/server/README.md
	* minor : fix whitespace
	---------
	Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-01-04	ggml : include stdlib.h before intrin.h (#4736)	Georgi Gerganov

2024-01-04	llama.swiftui : fix build of ggml.metallib (#4754)	singularity
	* metal: fix metal backend init failure in swiftui
	* metal: build ggml.metallib instead of copy src
	* llama.swift : remove debug flags from metallib build
	---------
	Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-01-03	train : fix typo in overlapping-samples help msg (#4758)	Daniel Bevenius
	This commit fixes a typo in the help message for the --overlapping-samples option. Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>
2024-01-03	swift : update Package.swift to use ggml as dependency (#4691)	Ashraful Islam
	* updates the package.swift to use ggml as dependency
	* changes the ggml package url src to ggerganov
2024-01-03	cuda : simplify expression	Georgi Gerganov
	Co-authored-by: slaren <slarengh@gmail.com>
2024-01-03	cuda : mark I16 and I32 ops as unsupported	Georgi Gerganov
	ggml-ci
2024-01-03	sync : ggml	Georgi Gerganov
	ggml-ci
2024-01-03	metal : add kernel_get_rows_i32	Georgi Gerganov
	ggml-ci
2024-01-03	scripts : fix sync order + metal sed	Georgi Gerganov

2024-01-03	ggml : extend ggml_get_rows, ggml_repeat, ggml_concat (ggml/639)	Guillaume Wenzek
	* add more int ops
	* ggml_compute_forward_dup_bytes
	* add tests
	* PR comments
	* tests : minor indentations
	---------
	Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-01-03	server : throw an error when `slot unavailable` (#4741)	Justin Parker

2024-01-02	metal : optimize ggml_mul_mat_id (faster Mixtral PP) (#4725)	Georgi Gerganov
	* ggml : disable fast-math for Metal (cmake build only) ggml-ci
	* metal : fix Metal API debug warnings
	* cmake : add -fno-inline for Metal build (#4545)
	* metal : fix API debug warnings
	* metal : fix compile warnings
	* metal : use uint64_t for strides
	* cmake : rename option to LLAMA_METAL_SHADER_DEBUG
	* metal : fix mat-vec Q8_0 kernel for BS > 1
	* metal : normalize mat-vec kernel signatures
	* cmake : respect LLAMA_QKK_64 option
	* metal : fix mat-vec Q4_K kernel for QK_K == 64
	* metal : optimizing ggml_mul_mat_id (wip)
	* metal : minor fix
	* metal : opt mul_mm_id
2024-01-02	server : add token counts to html footer (#4738)	Phil H
	* server: add token counts to stats
	* server: generate hpp
	---------
	Co-authored-by: phiharri <ph@got-root.co.uk>
2024-01-02	llama : llama_model_desc print number of experts	Georgi Gerganov

2024-01-02	llama : replace all API facing `int`'s with `int32_t` (#4577)	Marcus Dunn
	* replaced all API facing `int`'s with `int32_t`
	* formatting and missed `int` in `llama_token_to_piece`
2024-01-02	llama : differentiate the KV dims in the attention (#4657)	postmasters
	* Add n_key_dim and n_value_dim
	  Some models use values that are not derived from `n_embd`. Also remove `n_embd_head` and `n_embd_gqa` because it is not clear which "head" is referred to (key or value). Fix issue #4648.
	* Fix `llm_build_kqv` to use `n_value_gqa`
	* Rebase
	* Rename variables
	* Fix llm_build_kqv to be more generic wrt n_embd_head_k
	* Update default values for n_embd_head_k and n_embd_head_v
	  Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
	* Fix llm_load_tensors: the asserts were not backcompat
	---------
	Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-01-02	editorconfig : fix whitespace and indentation #4710	Georgi Gerganov

2024-01-02	server : add --override-kv parameter (#4710)	minarchist
	* Changes to server to allow metadata override
	* documentation
	* flake.nix: expose full scope in legacyPackages
	* flake.nix: rocm not yet supported on aarch64, so hide the output
	* flake.nix: expose checks
	* workflows: nix-ci: init; build flake outputs
	* workflows: nix-ci: add a job for eval
	* workflows: weekly `nix flake update`
	* workflows: nix-flakestry: drop tag filters ...and add a job for flakehub.com
	* workflows: nix-ci: add a qemu job for jetsons
	* flake.nix: suggest the binary caches
	* flake.lock: update to a commit recently cached by nixpkgs-cuda-ci
	---------
	Co-authored-by: John <john@jLap.lan>
	Co-authored-by: Someone Serge <sergei.kozlukov@aalto.fi>
2024-01-02	py : re-enable mmap in convert hf (#4732)	Nam D. Tran
	* update: awq support llama-7b model
	* update: change order
	* update: benchmark results for llama2-7b
	* update: mistral 7b v1 benchmark
	* update: support 4 models
	* fix: Readme
	* update: ready for PR
	* update: readme
	* fix: readme
	* update: change order import
	* black
	* format code
	* update: work for bot mpt and awqmpt
	* update: readme
	* Rename to llm_build_ffn_mpt_awq
	* Formatted other files
	* Fixed params count
	* fix: remove code
	* update: more detail for mpt
	* fix: readme
	* fix: readme
	* update: change folder architecture
	* fix: common.cpp
	* fix: readme
	* fix: remove ggml_repeat
	* update: cicd
	* update: cicd
	* uppdate: remove use_awq arg
	* update: readme
	* llama : adapt plamo to new ffn ggml-ci
	* fix: update torch version
	---------
	Co-authored-by: Trần Đức Nam <v.namtd12@vinai.io>
	Co-authored-by: Le Hoang Anh <v.anhlh33@vinai.io>
	Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-01-02	finetune: fix typo in README.md (#4733)	Daniel Bevenius
	Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>
2024-01-02	metal : enable shader debugging (cmake option) (#4705)	Georgi Gerganov
	* ggml : disable fast-math for Metal (cmake build only) ggml-ci
	* metal : fix Metal API debug warnings
	* cmake : add -fno-inline for Metal build (#4545)
	* metal : fix API debug warnings
	* metal : fix compile warnings
	* metal : use uint64_t for strides
	* cmake : rename option to LLAMA_METAL_SHADER_DEBUG
	* metal : fix mat-vec Q8_0 kernel for BS > 1
	* metal : normalize mat-vec kernel signatures
	* cmake : respect LLAMA_QKK_64 option
	* metal : fix mat-vec Q4_K kernel for QK_K == 64 ggml-ci
2023-12-31	flake.lock: update	Someone Serge
	to a commit recently cached by nixpkgs-cuda-ci
2023-12-31	flake.nix: suggest the binary caches	Someone Serge

2023-12-31	workflows: nix-ci: add a qemu job for jetsons	Someone Serge

2023-12-31	workflows: nix-flakestry: drop tag filters	Someone Serge
	...and add a job for flakehub.com
2023-12-31	workflows: weekly `nix flake update`	Someone Serge

2023-12-31	workflows: nix-ci: add a job for eval	Someone Serge

2023-12-31	workflows: nix-ci: init; build flake outputs	Someone Serge

2023-12-31	flake.nix: expose checks	Someone Serge

2023-12-31	flake.nix: rocm not yet supported on aarch64, so hide the output	Someone Serge

2023-12-31	flake.nix: expose full scope in legacyPackages	Someone Serge

2023-12-31	ggml : add ggml_vdotq_s32 alias (#4715)	Georgi Gerganov
	ggml-ci
2023-12-30	clip : refactor + bug fixes (#4696)	Georgi Gerganov
	* clip : refactor + bug fixes ggml-ci
	* server : add log message
2023-12-30	CUDA: fixed tensor cores not being used on RDNA3 (#4697)	Johannes Gäßler