path: root/ggml-cuda.cu
Age        | Commit message                                                                | Author
2024-01-03 | cuda : simplify expression                                                    | Georgi Gerganov
2024-01-03 | cuda : mark I16 and I32 ops as unsupported                                    | Georgi Gerganov
2023-12-30 | CUDA: fixed tensor cores not being used on RDNA3 (#4697)                      | Johannes Gäßler
2023-12-29 | CUDA: fix tensor core logic for Pascal and HIP (#4682)                        | Johannes Gäßler
2023-12-29 | cuda: fix vmm oom issue on NVIDIA AGX Orin (#4687)                            | hydai
2023-12-29 | ggml : fix some mul mat cases + add tests for src1 F16 (ggml/669)             | bssrdf
2023-12-26 | cuda : fix vmm pool with multi GPU (#4620)                                    | slaren
2023-12-26 | Fix new CUDA10 compilation errors (#4635)                                     | FantasyGmm
2023-12-24 | cuda : improve cuda pool efficiency using virtual memory (#4606)              | slaren
2023-12-23 | fallback to CPU buffer if host buffer alloc fails (#4610)                     | slaren
2023-12-23 | CUDA: fixed row rounding for 0 tensor splits (#4594)                          | Johannes Gäßler
2023-12-22 | sync : ggml (fix im2col) (#4591)                                              | Georgi Gerganov
2023-12-22 | cuda : fix jetson compile error (#4560)                                       | FantasyGmm
2023-12-22 | Fix CudaMemcpy direction (#4599)                                              | Henrik Forstén
2023-12-22 | llama : fix platforms without mmap (#4578)                                    | slaren
2023-12-21 | ggml : change ggml_scale to take a float instead of tensor (#4573)            | Georgi Gerganov
2023-12-21 | llama : initial ggml-backend integration (#4520)                              | slaren
2023-12-21 | cuda : ROCm AMD Unified Memory Architecture (UMA) handling (#4449)            | Erik Garrison
2023-12-21 | ggml-cuda: Fix HIP build by adding define for __trap (#4569)                  | arlo-phoenix
2023-12-21 | CUDA: mul_mat_id always on GPU for batches >= 32 (#4553)                      | Johannes Gäßler
2023-12-21 | cuda : better error message for ggml_get_rows (#4561)                         | bobqianic
2023-12-21 | cuda : replace asserts in wrong architecture checks with __trap (#4556)       | slaren
2023-12-21 | Fix access violation in ggml_cuda_free_data if tensor->extra is NULL (#4554)  | LoganDark
2023-12-20 | CUDA: Faster Mixtral prompt processing (#4538)                                | Johannes Gäßler
2023-12-18 | ggml-cuda: Fix HIP build (#4528)                                              | arlo-phoenix
2023-12-18 | llama : add phi-2 + fix NeoX rope + ggml_mul_mat_set_prec (#4490)             | Ebey Abraham
2023-12-14 | ggml : use ggml_row_size where possible (#4472)                               | slaren
2023-12-13 | sync : ggml (SD ops, tests, kernels) (#4444)                                  | Georgi Gerganov
2023-12-13 | llama : add Mixtral support (#4406)                                           | slaren
2023-12-07 | sync : ggml (new ops, tests, backend, etc.) (#4359)                           | Georgi Gerganov
2023-12-07 | llama : per-layer KV cache + quantum K cache (#4309)                          | Georgi Gerganov
2023-12-01 | ggml : add ggml_soft_max_ext (#4256)                                          | Georgi Gerganov
2023-11-24 | ggml-cuda : support stablelm rope (#4156)                                     | slaren
2023-11-23 | Fix incorrect format strings and uninitialized variables. (#4133)             | Haohui Mai
2023-11-18 | Clean up ggml-cuda.cu warnings when compiling with clang (for ROCM) (#4124)   | Kerfuffle
2023-11-17 | cuda : get_row_rounding F32 (#4095)                                           | Andrew Godfrey
2023-11-17 | llama : fix data units (#4101)                                                | Georgi Gerganov
2023-11-15 | ggml-cuda : increase max graph size (#4084)                                   | slaren
2023-11-13 | ggml : sync (im2col, GPU conv, 32-bit arm compat) (#4060)                     | Georgi Gerganov
2023-11-13 | sync : ggml (backend v2) (#3912)                                              | Georgi Gerganov
2023-11-13 | Add ReLU and SQR CUDA ops to (partially) fix Persimmon offloading (#4041)     | Kerfuffle
2023-11-07 | cuda : supports running on CPU for GGML_USE_CUBLAS=ON build (#3946)           | Meng Zhang
2023-11-05 | ggml-cuda : fix f16 mul mat (#3961)                                           | slaren
2023-11-05 | cuda : fix disabling device with --tensor-split 1,0 (#3951)                   | Jared Van Bortel
2023-11-05 | cuda : revert CUDA pool stuff (#3944)                                         | slaren
2023-11-03 | ggml-cuda : move row numbers to x grid dim in mmv kernels (#3921)             | slaren
2023-11-02 | cuda : add ROCM aliases for CUDA pool stuff (#3918)                           | Kerfuffle
2023-11-02 | cuda : fix const ptrs warning causing ROCm build issues (#3913)               | Georgi Gerganov
2023-11-02 | cuda : use CUDA memory pool with async memory allocation/deallocation when av... | Oleksii Maryshchenko
2023-11-02 | cuda : check if this fixes Pascal card regression (#3882)                     | Georgi Gerganov