Age | Commit message (Collapse) | Author |
|
* Merge mainline
* Fix after merge
* Remove CI check
---------
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
|
|
* Merging mainline - WIP
* Merging mainline - WIP
AVX2 and CUDA appear to work.
CUDA performance seems slightly (~1-2%) lower as it is so often
the case with llama.cpp/ggml after some "improvements" have been made.
* Merging mainline - fix Metal
* Remove check
---------
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
|
|
* add sycl preset
* fix debug link error. fix windows crash
* update README
|
|
llama-llava-cli, etc... (#7809)
* `main`/`server`: rename to `llama` / `llama-server` for consistency w/ homebrew
* server: update refs -> llama-server
gitignore llama-server
* server: simplify nix package
* main: update refs -> llama
fix examples/main ref
* main/server: fix targets
* update more names
* Update build.yml
* rm accidentally checked in bins
* update straggling refs
* Update .gitignore
* Update server-llm.sh
* main: target name -> llama-cli
* Prefix all example bins w/ llama-
* fix main refs
* rename {main->llama}-cmake-pkg binary
* prefix more cmake targets w/ llama-
* add/fix gbnf-validator subfolder to cmake
* sort cmake example subdirs
* rm bin files
* fix llama-lookup-* Makefile rules
* gitignore /llama-*
* rename Dockerfiles
* rename llama|main -> llama-cli; consistent RPM bin prefixes
* fix some missing -cli suffixes
* rename dockerfile w/ llama-cli
* rename(make): llama-baby-llama
* update dockerfile refs
* more llama-cli(.exe)
* fix test-eval-callback
* rename: llama-cli-cmake-pkg(.exe)
* address gbnf-validator unused fread warning (switched to C++ / ifstream)
* add two missing llama- prefixes
* Updating docs for eval-callback binary to use new `llama-` prefix.
* Updating a few lingering doc references for rename of main to llama-cli
* Updating `run-with-preset.py` to use new binary names.
Updating docs around `perplexity` binary rename.
* Updating documentation references for lookup-merge and export-lora
* Updating two small `main` references missed earlier in the finetune docs.
* Update apps.nix
* update grammar/README.md w/ new llama-* names
* update llama-rpc-server bin name + doc
* Revert "update llama-rpc-server bin name + doc"
This reverts commit e474ef1df481fd8936cd7d098e3065d7de378930.
* add hot topic notice to README.md
* Update README.md
* Update README.md
* rename gguf-split & quantize bins refs in **/tests.sh
---------
Co-authored-by: HanClinto <hanclinto@gmail.com>
|
|
|
|
* fix typo
* fix typos
* fix typo
* fix typos
* fix typo
* fix typos
|
|
* disable mmap to fix memcpy crash, add missed cmd in guide, fix softmax
* refactor to disable mmap for SYCL backend
* fix compile error in other os
* refactor the solution, use host buf to fix it, instead of disable mmap
* keep to support mmap()
* use host buff to reduce malloc times
* revert to malloc/free solution, for threaad safe
|
|
* fix LOG() error for SYCL, enhance erro check by CI
* rollback to bash
* add newline at end of file
|
|
* update readme sycl for new update
* Update README-sycl.md
Co-authored-by: Abhilash Majumder <30946547+abhilash1910@users.noreply.github.com>
* Update README-sycl.md
Co-authored-by: Abhilash Majumder <30946547+abhilash1910@users.noreply.github.com>
* Update README-sycl.md
Co-authored-by: Abhilash Majumder <30946547+abhilash1910@users.noreply.github.com>
* Update README-sycl.md
Co-authored-by: Abhilash Majumder <30946547+abhilash1910@users.noreply.github.com>
* Update README-sycl.md
Co-authored-by: AidanBeltonS <87009434+AidanBeltonS@users.noreply.github.com>
* Update README-sycl.md
Co-authored-by: AidanBeltonS <87009434+AidanBeltonS@users.noreply.github.com>
* update by review comments
* update w64devkit link
* update for verify device id part
* Update README-sycl.md
Co-authored-by: Meng, Hengyu <airdldl@163.com>
---------
Co-authored-by: Abhilash Majumder <30946547+abhilash1910@users.noreply.github.com>
Co-authored-by: AidanBeltonS <87009434+AidanBeltonS@users.noreply.github.com>
Co-authored-by: Meng, Hengyu <airdldl@163.com>
|
|
|
|
* suport multiple cards: split-mode - layer|row
* rm warning
* rebase with master, support tow new OPs, close feature for -sm=row, fix for unit test
* update news
* fix merge error
* update according to review comments
|
|
* update guide for make installation, memory, gguf model link, rm todo for windows build
* add vs install requirement
* update for gpu device check
* update help of llama-bench
* fix grammer issues
|
|
|
|
* support SYCL backend windows build
* add windows build in CI
* add for win build CI
* correct install oneMKL
* fix install issue
* fix ci
* fix install cmd
* fix install cmd
* fix install cmd
* fix install cmd
* fix install cmd
* fix win build
* fix win build
* fix win build
* restore other CI part
* restore as base
* rm no new line
* fix no new line issue, add -j
* fix grammer issue
* allow to trigger manually, fix format issue
* fix format
* add newline
* fix format
* fix format
* fix format issuse
---------
Co-authored-by: Abhilash Majumder <30946547+abhilash1910@users.noreply.github.com>
|
|
* first update for migration
* update init_cublas
* add debug functio, commit all help code
* step 1
* step 2
* step3 add fp16, slower 31->28
* add GGML_LIST_DEVICE function
* step 5 format device and print
* step6, enhance error check, remove CUDA macro, enhance device id to fix none-zero id issue
* support main device is non-zero
* step7 add debug for code path, rm log
* step 8, rename all macro & func from cuda by sycl
* fix error of select non-zero device, format device list
* ren ggml-sycl.hpp -> ggml-sycl.h
* clear CMAKE to rm unused lib and options
* correct queue: rm dtct:get_queue
* add print tensor function to debug
* fix error: wrong result in 658746bb26702e50f2c59c0e4ada8e9da6010481
* summary dpct definition in one header file to replace folder:dpct
* refactor device log
* mv dpct definition from folder dpct to ggml-sycl.h
* update readme, refactor build script
* fix build with sycl
* set nthread=1 when sycl, increase performance
* add run script, comment debug code
* add ls-sycl-device tool
* add ls-sycl-device, rm unused files
* rm rear space
* dos2unix
* Update README_sycl.md
* fix return type
* remove sycl version from include path
* restore rm code to fix hang issue
* add syc and link for sycl readme
* rm original sycl code before refactor
* fix code err
* add know issue for pvc hang issue
* enable SYCL_F16 support
* align pr4766
* check for sycl blas, better performance
* cleanup 1
* remove extra endif
* add build&run script, clean CMakefile, update guide by review comments
* rename macro to intel hardware
* editor config format
* format fixes
* format fixes
* editor format fix
* Remove unused headers
* skip build sycl tool for other code path
* replace tab by space
* fix blas matmul function
* fix mac build
* restore hip dependency
* fix conflict
* ren as review comments
* mv internal function to .cpp file
* export funciton print_sycl_devices(), mv class dpct definition to source file
* update CI/action for sycl code, fix CI error of repeat/dup
* fix action ID format issue
* rm unused strategy
* enable llama_f16 in ci
* fix conflict
* fix build break on MacOS, due to CI of MacOS depend on external ggml, instead of internal ggml
* fix ci cases for unsupported data type
* revert unrelated changed in cuda cmake
remove useless nommq
fix typo of GGML_USE_CLBLAS_SYCL
* revert hip cmake changes
* fix indent
* add prefix in func name
* revert no mmq
* rm cpu blas duplicate
* fix no_new_line
* fix src1->type==F16 bug.
* pass batch offset for F16 src1
* fix batch error
* fix wrong code
* revert sycl checking in test-sampling
* pass void as arguments of ggml_backend_sycl_print_sycl_devices
* remove extra blank line in test-sampling
* revert setting n_threads in sycl
* implement std::isinf for icpx with fast math.
* Update ci/run.sh
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* Update examples/sycl/run-llama2.sh
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* Update examples/sycl/run-llama2.sh
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* Update CMakeLists.txt
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* Update CMakeLists.txt
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* Update CMakeLists.txt
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* Update CMakeLists.txt
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* add copyright and MIT license declare
* update the cmd example
---------
Co-authored-by: jianyuzh <jianyu.zhang@intel.com>
Co-authored-by: luoyu-intel <yu.luo@intel.com>
Co-authored-by: Meng, Hengyu <hengyu.meng@intel.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
|