path: root/.devops
Date        Commit message        Author
2024-01-24  nix-shell: use addToSearchPath  (Michael Hueschen)
Thanks to @SomeoneSerge for the suggestion!
2024-01-24  nix: add cc to devShell LD_LIBRARY_PATH  (Michael Hueschen)
This fixes the error I encountered when trying to run the convert.py script in a venv:

```
$ nix develop
[...]$ source .venv/bin/activate
(.venv) [...]$ pip3 install -r requirements.txt
<... clipped ...>
[...]$ python3 ./convert.py
Traceback (most recent call last):
  File "/home/mhueschen/projects-reference/llama.cpp/./convert.py", line 40, in <module>
    from sentencepiece import SentencePieceProcessor
  File "/home/mhueschen/projects-reference/llama.cpp/.venv/lib/python3.11/site-packages/sentencepiece/__init__.py", line 13, in <module>
    from . import _sentencepiece
ImportError: libstdc++.so.6: cannot open shared object file: No such file or directory
```

However, I am not sure this is the cleanest way to address this linker issue...
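A quick way to check whether the devShell change takes effect (this snippet is an illustration, not part of the commit): `ctypes.CDLL` goes through `dlopen()`, which respects `LD_LIBRARY_PATH`, the same lookup the failing `sentencepiece` import relies on.

```python
# Hypothetical check, not from the commit: can the dynamic loader find
# libstdc++ from the current environment? This is what
# `from sentencepiece import SentencePieceProcessor` needs at import time.
import ctypes

try:
    ctypes.CDLL("libstdc++.so.6")
    print("libstdc++.so.6 is visible to the dynamic loader")
except OSError as err:
    print(f"still missing: {err}")
```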
2024-01-23  devops : add intel oneapi dockerfile (#5068)  (Xuan Son Nguyen)
Co-authored-by: Xuan Son Nguyen <xuanson.nguyen@snowpack.eu>
2024-01-22  nix: add a comment on the many nixpkgs-with-cuda instances  (Someone Serge)
2024-01-22  nix: add a comment about makeScope  (Someone Serge)
2024-01-22  nix: refactor the cleanSource rules  (Someone Serge)
2024-01-22  llama : support StableLM 2 1.6B (#5052)  (compilade)
* llama : support StableLM 2 1.6B
* convert : fix Qwen's set_vocab wrongly naming all special tokens [PAD{id}]
* convert : refactor Qwen's set_vocab to use it for StableLM 2 too
* nix : add tiktoken to llama-python-extra
* convert : use presence of tokenizer.json to determine StableLM tokenizer loader

  It's a less arbitrary heuristic than the vocab size (see the sketch below).
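A minimal Python sketch of the tokenizer.json heuristic from the last bullet; the function name and return labels are illustrative, not taken from convert.py.

```python
# Illustrative sketch only (not the actual convert.py code): prefer the
# tokenizer.json-based loader when the file exists, otherwise fall back
# to the classic SentencePiece tokenizer.model.
from pathlib import Path

def pick_vocab_loader(model_dir: Path) -> str:
    if (model_dir / "tokenizer.json").is_file():
        return "bpe-tokenizer-json"   # e.g. StableLM 2
    return "sentencepiece"            # e.g. LLaMA-style tokenizer.model
```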
2024-01-21  Revert LLAMA_NATIVE to OFF in flake.nix (#5066)  (iSma)
2024-01-05  flake.nix : fix typo (#4700)  (Ikko Eltociear Ashimine)
betwen -> between
2023-12-31  flake.nix: expose full scope in legacyPackages  (Someone Serge)
2023-12-29  python : add check-requirements.sh and GitHub workflow (#4585)  (crasm)
* python: add check-requirements.sh and GitHub workflow

  This script and workflow force package versions to remain compatible across all convert*.py scripts, while allowing secondary convert scripts to import dependencies not wanted in convert.py.

* Move requirements into ./requirements
* Fail on "==" being used for package requirements (but can be suppressed; see the sketch below)
* Enforce "compatible release" syntax instead of ==
* Update workflow
* Add upper version bound for transformers and protobuf
* improve check-requirements.sh
* small syntax change
* don't remove venvs if nocleanup is passed
* See if this fixes docker workflow
* Move check-requirements.sh into ./scripts/

---------

Co-authored-by: Jared Van Bortel <jared@nomic.ai>
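A rough Python sketch of the "==" check described in the bullets above; the real implementation is check-requirements.sh (a bash script), and the `./requirements` glob and the suppression comment are assumptions made for illustration.

```python
# Sketch only: flag exact "==" pins in the requirements files and ask for
# "compatible release" (~=) syntax instead. The "# allow-pin" suppression
# marker is a made-up convention for this example.
import re
import sys
from pathlib import Path

bad = []
for req in sorted(Path("requirements").glob("*.txt")):
    for lineno, line in enumerate(req.read_text().splitlines(), start=1):
        if line.lstrip().startswith("#") or "# allow-pin" in line:
            continue
        if re.search(r"[^=!<>~]==", line):
            bad.append(f"{req}:{lineno}: {line.strip()}")

if bad:
    print("exact '==' pins found; use '~=' (compatible release) instead:")
    print("\n".join(bad))
    sys.exit(1)
```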
2023-12-29  flake.nix : rewrite (#4605)  (Philip Taron)
* flake.lock: update to hotfix CUDA::cuda_driver

  Required to support https://github.com/ggerganov/llama.cpp/pull/4606

* flake.nix: rewrite

  1. Split into separate files per output.
  2. Added overlays, so that this flake can be integrated into others. The names in the overlay are `llama-cpp`, `llama-cpp-opencl`, `llama-cpp-cuda`, and `llama-cpp-rocm` so that they fit into the broader set of Nix packages from [nixpkgs](https://github.com/nixos/nixpkgs).
  3. Use [callPackage](https://summer.nixos.org/blog/callpackage-a-tool-for-the-lazy/) rather than `with pkgs;` so that there's dependency injection rather than dependency lookup.
  4. Add a description and meta information for each package. The description includes a bit about what's trying to accelerate each one.
  5. Use specific CUDA packages instead of cudatoolkit on the advice of SomeoneSerge.
  6. Format with `serokell/nixfmt` for a consistent style.
  7. Update `flake.lock` with the latest goods.

* flake.nix: use finalPackage instead of passing it manually
* nix: unclutter darwin support
* nix: pass most darwin frameworks unconditionally

  ...for simplicity

* *.nix: nixfmt

  nix shell github:piegamesde/nixfmt/rfc101-style --command \
      nixfmt flake.nix .devops/nix/*.nix

* flake.nix: add maintainers
* nix: move meta down to follow Nixpkgs style more closely
* nix: add missing meta attributes

  nix: clarify the interpretation of meta.maintainers
  nix: clarify the meaning of "broken" and "badPlatforms"
  nix: passthru: expose the use* flags for inspection

  E.g.:

  ```
  ❯ nix eval .#cuda.useCuda
  true
  ```

* flake.nix: avoid re-evaluating nixpkgs too many times
* flake.nix: use flake-parts
* nix: migrate to pname+version
* flake.nix: overlay: expose both the namespace and the default attribute
* ci: add the (Nix) flakestry workflow
* nix: cmakeFlags: explicit OFF bools
* nix: cuda: reduce runtime closure
* nix: fewer rebuilds
* nix: respect config.cudaCapabilities
* nix: add the impure driver's location to the DT_RUNPATHs
* nix: clean sources more thoroughly

  ...this way outPaths change less frequently, and so there are fewer rebuilds

* nix: explicit mpi support
* nix: explicit jetson support
* flake.nix: darwin: only expose the default

---------

Co-authored-by: Someone Serge <sergei.kozlukov@aalto.fi>
2023-11-30  docker : add finetune option (#4211)  (Juraj Bednar)
2023-09-15  ci : Cloud-V for RISC-V builds (#3160)  (Ali Tariq)
* Added Cloud-V File
* Replaced Makefile with original one

---------

Co-authored-by: moiz.hussain <moiz.hussain@10xengineers.ai>
2023-09-08  docker : add git to full-cuda.Dockerfile main-cuda.Dockerfile (#3044)  (hongbo.mo)
2023-08-30  [Docker] fix tools.sh argument passing. (#2884)  (Henri Vasserman)
* [Docker] fix tools.sh argument passing.

  This should allow passing multiple arguments to containers with the full image that are using the tools.sh frontend.

  Fix from https://github.com/ggerganov/llama.cpp/issues/2535#issuecomment-1697091734
2023-08-28  devops : added systemd units and set versioning to use date. (#2835)  (JohnnyB)
* Corrections and systemd units
* Missing dependency clblast
2023-08-25  ROCm Port (#1087)  (Henri Vasserman)
* use hipblas based on cublas
* Update Makefile for the Cuda kernels
* Expand arch list and make it overrideable
* Fix multi GPU on multiple amd architectures with rocblas_initialize() (#5)
* add hipBLAS to README
* new build arg LLAMA_CUDA_MMQ_Y
* fix half2 decomposition
* Add intrinsics polyfills for AMD
* AMD assembly optimized __dp4a
* Allow overriding CC_TURING
* use "ROCm" instead of "CUDA"
* ignore all build dirs
* Add Dockerfiles
* fix llama-bench
* fix -nommq help for non CUDA/HIP

---------

Co-authored-by: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Co-authored-by: ardfork <134447697+ardfork@users.noreply.github.com>
Co-authored-by: funnbot <22226942+funnbot@users.noreply.github.com>
Co-authored-by: Engininja2 <139037756+Engininja2@users.noreply.github.com>
Co-authored-by: Kerfuffle <44031344+KerfuffleV2@users.noreply.github.com>
Co-authored-by: jammm <2500920+jammm@users.noreply.github.com>
Co-authored-by: jdecourval <7315817+jdecourval@users.noreply.github.com>
2023-08-23  devops : RPM Specs (#2723)  (JohnnyB)
* Create llama-cpp.srpm
* Rename llama-cpp.srpm to llama-cpp.srpm.spec

  Correcting extension.

* Tested spec success.
* Update llama-cpp.srpm.spec
* Create lamma-cpp-cublas.srpm.spec
* Create lamma-cpp-clblast.srpm.spec
* Update lamma-cpp-cublas.srpm.spec

  Added BuildRequires

* Moved to devops dir
2023-07-13  devops : add missing quotes to bash script (#2193)  (Bodo Graumann)
This prevents accidentally expanding arguments that contain spaces.
2023-07-11  docker : add '--server' option (#2174)  (Jinwoo Jeong)
2023-07-07  docker : add support for CUDA in docker (#1461)  (dylan)
Co-authored-by: canardleteer <eris.has.a.dad+github@gmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-06-08  Add llama.cpp docker support for non-latin languages (#1673)  (qingfengfenga)
* Modify Dockerfile default character set to improve compatibility (#1673)
2023-06-03  Docker: change to calling convert.py (#1641)  (Jiří Podivín)
Deprecation disclaimer was added to convert-pth-to-ggml.py
2023-05-28  Adding git in container package dependencies (#1621)  (Jiří Podivín)
Git added to build packages for version information in docker image

Signed-off-by: Jiri Podivin <jpodivin@gmail.com>
2023-04-26  quantize : use `map` to assign quantization type from `string` (#1191)  (Pavol Rusnak)
instead of `int` (while the `int` option is still supported)

This allows the following usage:

`./quantize ggml-model-f16.bin ggml-model-q4_0.bin q4_0`

instead of:

`./quantize ggml-model-f16.bin ggml-model-q4_0.bin 2`
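The same idea sketched in Python for illustration (the quantize tool itself is C++, and the helper name here is made up); only the q4_0/q4_1 ids are listed.

```python
# Illustrative sketch, not the actual C++ implementation: map a type name
# such as "q4_0" to its numeric id, while still accepting the legacy
# integer form. Only two entries are shown here.
QUANT_TYPE_IDS = {"q4_0": 2, "q4_1": 3}

def parse_quant_type(arg: str) -> int:
    if arg.isdigit():                    # legacy: ./quantize f16.bin out.bin 2
        return int(arg)
    return QUANT_TYPE_IDS[arg.lower()]   # new:    ./quantize f16.bin out.bin q4_0
```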
2023-04-14  py : cleanup dependencies (#962)  (Pavol Rusnak)
after #545 we do not need torch, tqdm and requests in the dependencies
2023-04-11  Fix whitespace, add .editorconfig, add GitHub workflow (#883)  (Pavol Rusnak)
2023-04-03  Remove torch GPU dependencies from the Docker.full image (#665)  (bsilvereagle)
By using `pip install torch --index-url https://download.pytorch.org/whl/cpu` instead of `pip install torch`, we can specify that we want to install a CPU-only version of PyTorch without any GPU dependencies. This reduces the size of the Docker image from 7.32 GB to 1.62 GB.
2023-03-23  Remove obsolete command from Docker script  (Georgi Gerganov)
2023-03-20  Add tqdm to Python requirements (#293)  (Stephan Walter)
* Add tqdm to Python requirements
* Remove torchvision torchaudio, add requests
2023-03-17  Don't tell users to use a bad number of threads (#243)  (Stephan Walter)
The readme tells people to use the command line option "-t 8", causing 8 threads to be started. On systems with fewer than 8 cores, this causes a significant slowdown. Remove the option from the example command lines and use /proc/cpuinfo on Linux to determine a sensible default.
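A small Python sketch of the default described above (the real change is in the C code; the os.cpu_count() fallback for non-Linux systems is an added assumption).

```python
# Sketch of the idea: derive a sensible default thread count from
# /proc/cpuinfo on Linux rather than hard-coding "-t 8" in the examples.
# The os.cpu_count() fallback is an assumption, not part of the original change.
import os

def default_thread_count() -> int:
    try:
        with open("/proc/cpuinfo") as f:
            n = sum(1 for line in f if line.startswith("processor"))
        if n > 0:
            return n
    except OSError:
        pass
    return os.cpu_count() or 1
```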
2023-03-17  🚀 Dockerize llamacpp (#132)  (Bernat Vadell)
* feat: dockerize llamacpp
* feat: split build & runtime stages
* split dockerfile into main & tools
* add quantize into tool docker image
* Update .devops/tools.sh

  Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* add docker action pipeline
* change CI to publish at github docker registry
* fix name runs-on macOS-latest is macos-latest (lowercase)
* include docker versioned images
* fix github action docker
* fix docker.yml
* feat: include all-in-one command tool & update readme.md

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>