ik_llama.cpp.git - Unnamed repository; edit this file 'description' to name the repository.

Age	Commit message (Collapse)	Author
2024-05-20	llama : remove MPI backend (#7395)	slaren

2024-05-17	ROCm: use native CMake HIP support (#5966)	Gavin Zhao
	Supercedes #4024 and #4813. CMake's native HIP support has become the recommended way to add HIP code into a project (see [here](https://rocm.docs.amd.com/en/docs-6.0.0/conceptual/cmake-packages.html#using-hip-in-cmake)). This PR makes the following changes: 1. The environment variable `HIPCXX` or CMake option `CMAKE_HIP_COMPILER` should be used to specify the HIP compiler. Notably this shouldn't be `hipcc`, but ROCm's clang, which usually resides in `$ROCM_PATH/llvm/bin/clang`. Previously this was control by `CMAKE_C_COMPILER` and `CMAKE_CXX_COMPILER`. Note that since native CMake HIP support is not yet available on Windows, on Windows we fall back to the old behavior. 2. CMake option `CMAKE_HIP_ARCHITECTURES` is used to control the GPU architectures to build for. Previously this was controled by `GPU_TARGETS`. 3. Updated the Nix recipe to account for these new changes. 4. The GPU targets to build against in the Nix recipe is now consistent with the supported GPU targets in nixpkgs. 5. Added CI checks for HIP on both Linux and Windows. On Linux, we test both the new and old behavior. The most important part about this PR is the separation of the HIP compiler and the C/C++ compiler. This allows users to choose a different C/C++ compiler if desired, compared to the current situation where when building for ROCm support, everything must be compiled with ROCm's clang. ~~Makefile is unchanged. Please let me know if we want to be consistent on variables' naming because Makefile still uses `GPU_TARGETS` to control architectures to build for, but I feel like setting `CMAKE_HIP_ARCHITECTURES` is a bit awkward when you're calling `make`.~~ Makefile used `GPU_TARGETS` but the README says to use `AMDGPU_TARGETS`. For consistency with CMake, all usage of `GPU_TARGETS` in Makefile has been updated to `AMDGPU_TARGETS`. Thanks to the suggestion of @jin-eld, to maintain backwards compatibility (and not break too many downstream users' builds), if `CMAKE_CXX_COMPILER` ends with `hipcc`, then we still compile using the original behavior and emit a warning that recommends switching to the new HIP support. Similarly, if `AMDGPU_TARGETS` is set but `CMAKE_HIP_ARCHITECTURES` is not, then we forward `AMDGPU_TARGETS` to `CMAKE_HIP_ARCHITECTURES` to ease the transition to the new HIP support. Signed-off-by: Gavin Zhao <git@gzgz.dev>
2024-04-29	build(cmake): simplify instructions (`cmake -B build && cmake --build build ↵	Olivier Chafik
	...`) (#6964) * readme: cmake . -B build && cmake --build build * build: fix typo Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com> * build: drop implicit . from cmake config command * build: remove another superfluous . * build: update MinGW cmake commands * Update README-sycl.md Co-authored-by: Neo Zhang Jianyu <jianyu.zhang@intel.com> * build: reinstate --config Release as not the default w/ some generators + document how to build Debug * build: revert more --config Release * build: nit / remove -H from cmake example * build: reword debug instructions around single/multi config split --------- Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com> Co-authored-by: Neo Zhang Jianyu <jianyu.zhang@intel.com>
2024-04-04	server: add cURL support to server Dockerfiles (#6474)	Ed Lepedus
	* server: add cURL support to `full.Dockerfile` * server: add cURL support to `full-cuda.Dockerfile` and `server-cuda.Dockerfile` * server: add cURL support to `full-rocm.Dockerfile` and `server-rocm.Dockerfile` * server: add cURL support to `server-intel.Dockerfile` * server: add cURL support to `server-vulkan.Dockerfile` * fix typo in `server-vulkan.Dockerfile` Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-04-03	server: add cURL support to `server.Dockerfile` (#6461)	Ed Lepedus

2024-03-29	Fedora build update (#6388)	Mohammadreza Hendiani
	* fixed deprecated address * fixed deprecated address * fixed deprecated address * Added 'Apache-2.0' SPDX license identifier due to 'kompute.cc' submodule licensing. Explanation of licensing method: https://docs.fedoraproject.org/en-US/legal/spdx/#_and_expressions * Added 'Apache-2.0' SPDX license identifier due to 'kompute.cc' submodule licensing. Explanation of licensing method: https://docs.fedoraproject.org/en-US/legal/spdx/#_and_expressions * Added 'Apache-2.0' SPDX license identifier due to 'kompute.cc' submodule licensing. Explanation of licensing method: https://docs.fedoraproject.org/en-US/legal/spdx/#_and_expressions * reverted back to only the MIT license
2024-03-28	nix: removed unnessesary indentation	hutli

2024-03-28	nix: moved blas availability check to package inputs so it is still overridable	hutli

2024-03-28	using blas.meta.available to check host platform	hutli

2024-03-28	only using explicit blas if hostPlatform is allowed	hutli

2024-03-28	nix: package: don't introduce the dependency on python	Someone Serge
	- The generic /usr/bin/env shebangs are good enough - Python deps are provisioned in the devShells - We need to be able to leave python out at least on windows (currently breaks eval)
2024-03-28	nix: .#widnows: init	hutli
	initial nix build for windows using zig mingwW64 build removes nix zig windows build removes nix zig windows build removed unnessesary glibc.static removed unnessesary import of pkgs in nix fixed missing trailing newline on non-windows nix builds overriding stdenv when building for crosscompiling to windows in nix better variables when crosscompiling windows in nix cross compile windows on macos removed trailing whitespace remove unnessesary overwrite of "CMAKE_SYSTEM_NAME" in nix windows build nix: keep file extension when copying result files during cross compile for windows nix: better checking for file extensions when using MinGW nix: using hostPlatform instead of targetPlatform when cross compiling for Windows using hostPlatform.extensions.executable to extract executable format
2024-03-25	nix: make `xcrun` visible in Nix sandbox for precompiling Metal shaders (#6118)	Joseph Stahl
	* Symlink to /usr/bin/xcrun so that `xcrun` binary is usable during build (used for compiling Metal shaders) Fixes https://github.com/ggerganov/llama.cpp/issues/6117 * cmake - copy default.metallib to install directory When metal files are compiled to default.metallib, Cmake needs to add this to the install directory so that it's visible to llama-cpp Also, update package.nix to use absolute path for default.metallib (it's not finding the bundle) * add `precompileMetalShaders` flag (defaults to false) to disable precompilation of metal shader Precompilation requires Xcode to be installed and requires disable sandbox on nix-darwin
2024-03-26	cuda : rename build flag to LLAMA_CUDA (#6299)	slaren

2024-03-25	nix: fix blas support (#6281)	Christian Kögler
	Since no blas was provided to buildInputs, the executable is built without blas support. This is a backport of NixOS/nixpkgs#298567
2024-03-05	fix editorconfig check break (#5879)	Minsoo Cheong

2024-03-04	nix: static build (#5814)	hutli

2024-03-01	build(nix): Introduce flake.formatter for `nix fmt` (#5687)	Tushar
	* build(nix): Introduce flake.formatter for `nix fmt` * chore: Switch to pkgs.nixfmt-rfc-style
2024-02-22	nix: init singularity and docker images (#5056)	Someone
	Exposes a few attributes demonstrating how to build [singularity](https://docs.sylabs.io/guides/latest/user-guide/)/[apptainer](https://apptainer.org/) and Docker images re-using llama.cpp's Nix expression. Built locally on `x86_64-linux` with `nix build github:someoneserge/llama.cpp/feat/nix/images#llamaPackages.{docker,docker-min,sif,llama-cpp}` and it's fast and effective.
2024-02-19	Enable Vulkan MacOS CI	0cc4m

2024-02-03	add Vulkan support to Nix flake	Martin Schwaighofer

2024-02-02	docker : add build for SYCL, Vulkan + update readme (#5228)	Xuan Son Nguyen
	* add vulkan dockerfile * intel dockerfile: compile sycl by default * fix vulkan dockerfile * add docs for vulkan * docs: sycl build in docker * docs: remove trailing spaces * docs: sycl: add docker section * docs: clarify install vulkan SDK outside docker * sycl: use intel/oneapi-basekit docker image * docs: correct TOC * docs: correct docker image for Intel oneMKL
2024-01-28	docker : add server-first container images (#5157)	Kyle Mistele
	* feat: add Dockerfiles for each platform that user ./server instead of ./main * feat: update .github/workflows/docker.yml to build server-first docker containers * doc: add information about running the server with Docker to README.md * doc: add information about running with docker to the server README * doc: update n-gpu-layers to show correct GPU usage * fix(doc): update container tag from `server` to `server-cuda` for README example on running server container with CUDA
2024-01-24	nix-shell: use addToSearchPath	Michael Hueschen
	thx to @SomeoneSerge for the suggestion!
2024-01-24	nix: add cc to devShell LD_LIBRARY_PATH	Michael Hueschen
	this fixes the error I encountered when trying to run the convert.py script in a venv: ``` $ nix develop [...]$ source .venv/bin/activate (.venv) [...]$ pip3 install -r requirements.txt <... clipped ...> [...]$ python3 ./convert.py Traceback (most recent call last): File "/home/mhueschen/projects-reference/llama.cpp/./convert.py", line 40, in <module> from sentencepiece import SentencePieceProcessor File "/home/mhueschen/projects-reference/llama.cpp/.venv/lib/python3.11/site-packages/sentencepiece/__init__.py", line 13, in <module> from . import _sentencepiece ImportError: libstdc++.so.6: cannot open shared object file: No such file or directory ``` however, I am not sure this is the cleanest way to address this linker issue...
2024-01-23	devops : add intel oneapi dockerfile (#5068)	Xuan Son Nguyen
	Co-authored-by: Xuan Son Nguyen <xuanson.nguyen@snowpack.eu>
2024-01-22	nix: add a comment on the many nixpkgs-with-cuda instances	Someone Serge

2024-01-22	nix: add a comment about makeScope	Someone Serge

2024-01-22	nix: refactor the cleanSource rules	Someone Serge

2024-01-22	llama : support StableLM 2 1.6B (#5052)	compilade
	* llama : support StableLM 2 1.6B * convert : fix Qwen's set_vocab wrongly naming all special tokens [PAD{id}] * convert : refactor Qwen's set_vocab to use it for StableLM 2 too * nix : add tiktoken to llama-python-extra * convert : use presence of tokenizer.json to determine StableLM tokenizer loader It's a less arbitrary heuristic than the vocab size.
2024-01-21	Revert LLAMA_NATIVE to OFF in flake.nix (#5066)	iSma

2024-01-05	flake.nix : fix typo (#4700)	Ikko Eltociear Ashimine
	betwen -> between
2023-12-31	flake.nix: expose full scope in legacyPackages	Someone Serge

2023-12-29	python : add check-requirements.sh and GitHub workflow (#4585)	crasm
	* python: add check-requirements.sh and GitHub workflow This script and workflow forces package versions to remain compatible across all convert.py scripts, while allowing secondary convert scripts to import dependencies not wanted in convert.py. Move requirements into ./requirements * Fail on "==" being used for package requirements (but can be suppressed) * Enforce "compatible release" syntax instead of == * Update workflow * Add upper version bound for transformers and protobuf * improve check-requirements.sh * small syntax change * don't remove venvs if nocleanup is passed * See if this fixes docker workflow * Move check-requirements.sh into ./scripts/ --------- Co-authored-by: Jared Van Bortel <jared@nomic.ai>
2023-12-29	flake.nix : rewrite (#4605)	Philip Taron
	* flake.lock: update to hotfix CUDA::cuda_driver Required to support https://github.com/ggerganov/llama.cpp/pull/4606 * flake.nix: rewrite 1. Split into separate files per output. 2. Added overlays, so that this flake can be integrated into others. The names in the overlay are `llama-cpp`, `llama-cpp-opencl`, `llama-cpp-cuda`, and `llama-cpp-rocm` so that they fit into the broader set of Nix packages from [nixpkgs](https://github.com/nixos/nixpkgs). 3. Use [callPackage](https://summer.nixos.org/blog/callpackage-a-tool-for-the-lazy/) rather than `with pkgs;` so that there's dependency injection rather than dependency lookup. 4. Add a description and meta information for each package. The description includes a bit about what's trying to accelerate each one. 5. Use specific CUDA packages instead of cudatoolkit on the advice of SomeoneSerge. 6. Format with `serokell/nixfmt` for a consistent style. 7. Update `flake.lock` with the latest goods. * flake.nix: use finalPackage instead of passing it manually * nix: unclutter darwin support * nix: pass most darwin frameworks unconditionally ...for simplicity * .nix: nixfmt nix shell github:piegamesde/nixfmt/rfc101-style --command \ nixfmt flake.nix .devops/nix/.nix * flake.nix: add maintainers * nix: move meta down to follow Nixpkgs style more closely * nix: add missing meta attributes nix: clarify the interpretation of meta.maintainers nix: clarify the meaning of "broken" and "badPlatforms" nix: passthru: expose the use* flags for inspection E.g.: ``` ❯ nix eval .#cuda.useCuda true ``` * flake.nix: avoid re-evaluating nixpkgs too many times * flake.nix: use flake-parts * nix: migrate to pname+version * flake.nix: overlay: expose both the namespace and the default attribute * ci: add the (Nix) flakestry workflow * nix: cmakeFlags: explicit OFF bools * nix: cuda: reduce runtime closure * nix: fewer rebuilds * nix: respect config.cudaCapabilities * nix: add the impure driver's location to the DT_RUNPATHs * nix: clean sources more thoroughly ...this way outPaths change less frequently, and so there are fewer rebuilds * nix: explicit mpi support * nix: explicit jetson support * flake.nix: darwin: only expose the default --------- Co-authored-by: Someone Serge <sergei.kozlukov@aalto.fi>
2023-11-30	docker : add finetune option (#4211)	Juraj Bednar

2023-09-15	ci : Cloud-V for RISC-V builds (#3160)	Ali Tariq
	* Added Cloud-V File * Replaced Makefile with original one --------- Co-authored-by: moiz.hussain <moiz.hussain@10xengineers.ai>
2023-09-08	docker : add git to full-cuda.Dockerfile main-cuda.Dockerfile (#3044)	hongbo.mo

2023-08-30	[Docker] fix tools.sh argument passing. (#2884)	Henri Vasserman
	* [Docker] fix tools.sh argument passing. This should allow passing multiple arguments to containers with the full image that are using the tools.sh frontend. Fix from https://github.com/ggerganov/llama.cpp/issues/2535#issuecomment-1697091734
2023-08-28	devops : added systemd units and set versioning to use date. (#2835)	JohnnyB
	* Corrections and systemd units * Missing dependency clblast
2023-08-25	ROCm Port (#1087)	Henri Vasserman
	* use hipblas based on cublas * Update Makefile for the Cuda kernels * Expand arch list and make it overrideable * Fix multi GPU on multiple amd architectures with rocblas_initialize() (#5) * add hipBLAS to README * new build arg LLAMA_CUDA_MMQ_Y * fix half2 decomposition * Add intrinsics polyfills for AMD * AMD assembly optimized __dp4a * Allow overriding CC_TURING * use "ROCm" instead of "CUDA" * ignore all build dirs * Add Dockerfiles * fix llama-bench * fix -nommq help for non CUDA/HIP --------- Co-authored-by: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com> Co-authored-by: ardfork <134447697+ardfork@users.noreply.github.com> Co-authored-by: funnbot <22226942+funnbot@users.noreply.github.com> Co-authored-by: Engininja2 <139037756+Engininja2@users.noreply.github.com> Co-authored-by: Kerfuffle <44031344+KerfuffleV2@users.noreply.github.com> Co-authored-by: jammm <2500920+jammm@users.noreply.github.com> Co-authored-by: jdecourval <7315817+jdecourval@users.noreply.github.com>
2023-08-23	devops : RPM Specs (#2723)	JohnnyB
	* Create llama-cpp.srpm * Rename llama-cpp.srpm to llama-cpp.srpm.spec Correcting extension. * Tested spec success. * Update llama-cpp.srpm.spec * Create lamma-cpp-cublas.srpm.spec * Create lamma-cpp-clblast.srpm.spec * Update lamma-cpp-cublas.srpm.spec Added BuildRequires * Moved to devops dir
2023-07-13	devops : add missing quotes to bash script (#2193)	Bodo Graumann
	This prevents accidentally expanding arguments that contain spaces.
2023-07-11	docker : add '--server' option (#2174)	Jinwoo Jeong

2023-07-07	docker : add support for CUDA in docker (#1461)	dylan
	Co-authored-by: canardleteer <eris.has.a.dad+github@gmail.com> Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-06-08	Add llama.cpp docker support for non-latin languages (#1673)	qingfengfenga
	* Modify Dockerfile default character set to improve compatibility (#1673)
2023-06-03	Docker: change to calling convert.py (#1641)	Jiří Podivín
	Deprecation disclaimer was added to convert-pth-to-ggml.py
2023-05-28	Adding git in container package dependencies (#1621)	Jiří Podivín
	Git added to build packages for version information in docker image Signed-off-by: Jiri Podivin <jpodivin@gmail.com>
2023-04-26	quantize : use `map` to assign quantization type from `string` (#1191)	Pavol Rusnak
	instead of `int` (while `int` option still being supported) This allows the following usage: `./quantize ggml-model-f16.bin ggml-model-q4_0.bin q4_0` instead of: `./quantize ggml-model-f16.bin ggml-model-q4_0.bin 2`
2023-04-14	py : cleanup dependencies (#962)	Pavol Rusnak
	after #545 we do not need torch, tqdm and requests in the dependencies