| Age | Commit message | Author |
|---|---|---|
|  | * Merging mainline - WIP
AVX2 and CUDA appear to work.
CUDA performance seems slightly (~1-2%) lower, as is so often
the case with llama.cpp/ggml after some "improvements" have been made.
* Merging mainline - fix Metal
* Remove check
---------
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com> | 
|  | compare-commits.sh : hide stdout, use -oe to print markdown | 
|  | * ggml : group all experts in a single ggml_mul_mat_id
cuda : improve mmid row copy
* cuda : fix bin bcast with non-cont src0
* test-backend-ops : only run all mul mat tests for base types
* llama : disable moe offloading with SYCL
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> | 
|  | * scripts : add helpers script for bench comparing commits
* scripts : detect CUDA
* set flags after checking the command line
* fix make flags
---------
Co-authored-by: slaren <slarengh@gmail.com> |