Age | Commit message | Author |
|
* convert.py: BPE fixes?
* Remove unnecessary conditional in addl token error handling
|
|
Co-authored-by: mmnga <mmnga1mmnga@gmail.com>
|
|
* make : remove unused -DGGML_BIG_ENDIAN
* make : put preprocessor stuff in CPPFLAGS
* make : pass Raspberry Pi arch flags to g++ as well
* make : support overriding CFLAGS/CXXFLAGS/CPPFLAGS/LDFLAGS
* make : fix inverted conditional
|
|
* logging: Fix creating empty file even when disabled
* Minor formatting fix
Co-authored-by: staviq <staviq@gmail.com>
---------
Co-authored-by: staviq <staviq@gmail.com>
|
|
* Update Windows CLBlast instructions
* Update Windows CLBlast instructions
* Remove trailing whitespace
|
|
* ggml_metal_init: Show all Metal device instances in the system
Also show the default Metal device that was picked.
* Update ggml-metal.m
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
|
|
* k-quants : fix build on armv7
* ggml : cleanup unused arm32 specific impl
* k-quants : avoid some unused vzero / mzero define
* ggml-alloc : use 4g for MEASURE_MAX_SIZE in 32-bit arm
|
|
* quick start command fix
* quick start win command fix
|
|
* Allow quantize tool to only copy tensors to allow repackaging models.
* Slightly better logic when requantizing.
* Change help message to go to `stdout`.
|
|
Fixes #2922
|
|
* made the methods const
# Conflicts:
# examples/convert-llama2c-to-ggml/convert-llama2c-to-ggml.cpp
* made method const
* Update convert-llama2c-to-ggml.cpp
removed write_raw and write_u32
* llama2c : remove misleading const
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
|
|
* fix most gcc and clang warnings
* baby-llama : remove commented opt_params_adam
* fix some MinGW warnings
* fix more MinGW warnings
|
|
* added support for RISCV CFLAGS & native compile + cross compile options
* Add RISC-V Vector Intrinsics Support
Added RVV intrinsics for following
ggml_vec_dot_q4_0_q8_0
ggml_vec_dot_q4_1_q8_1
ggml_vec_dot_q5_0_q8_0
ggml_vec_dot_q5_1_q8_1
ggml_vec_dot_q8_0_q8_0
Co-authored-by: Sharafat <sharafat.hussain@10xengineers.ai>
Signed-off-by: Ahmad Tameem <ahmad.tameem@10xengineers.ai>
---------
Signed-off-by: Ahmad Tameem <ahmad.tameem@10xengineers.ai>
Co-authored-by: moiz.hussain <moiz.hussain@10xengineers.ai>
Co-authored-by: Sharafat <sharafat.hussain@10xengineers.ai>
|
|
* fix mingw-like builds
* formatting
* make LOG_COMPAT easier to override and extend
* simplify win detection
* fix for #2940
|
|
* llama2c : fix segfault if vocab is not found
* llama2c : fix mismatch between new[] and delete
* llama2c : fix basename on Windows
* llama2c : use a destructor to prevent memory leaks
|
|
* Somewhat faster f16 x f32 matrix multiply kernel
* Better use 32 thread groups for f16 x f32
---------
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
|
|
* scripts: Use local gguf when running from repo
|
|
Reimplement fix for `PrefetchVirtualMemory`.
Co-authored-by: vxiiduu <73044267+vxiiduu@users.noreply.github.com>
|
|
* convert : fix python 3.8 support
* convert : sort imports
* convert : fix required parameters in convert-llama-ggmlv3-to-gguf
* convert : fix mypy errors in convert-llama-ggmlv3-to-gguf
* convert : use PEP 585 generics and PEP 604 unions
Now that we have `from __future__ import annotations`, we can use this
modern syntax in Python 3.7 instead of restricting support to Python 3.9
or 3.10 respectively.
* gguf.py : a tuple is already a tuple
* add mypy.ini
* convert : add necessary `type: ignore` comments
* gguf-py: bump version
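
The note above about `from __future__ import annotations` can be illustrated with a minimal sketch. The function name `first_or_none` is hypothetical and is not taken from the conversion scripts; it only shows why the newer annotation syntax is accepted on older interpreters once annotation evaluation is postponed.

```python
from __future__ import annotations


# Hypothetical helper, used only to illustrate the annotation syntax.
# With postponed evaluation (PEP 563), `list[int]` (PEP 585) and
# `int | None` (PEP 604) are stored as strings and never evaluated at
# definition time, so this module also imports on Python 3.7 and 3.8.
def first_or_none(items: list[int]) -> int | None:
    """Return the first element, or None for an empty list."""
    return items[0] if items else None


# The annotations are plain strings rather than evaluated type objects.
print(first_or_none.__annotations__)
print(first_or_none([3, 1, 2]))
print(first_or_none([]))
```

This only covers annotations. Expressions evaluated at runtime, such as a module-level alias like `IntList = list[int]`, would still require a Python version that supports the syntax natively.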
|
|
* [Docker] fix tools.sh argument passing.
This should allow passing multiple arguments to containers with
the full image that are using the tools.sh frontend.
Fix from https://github.com/ggerganov/llama.cpp/issues/2535#issuecomment-1697091734
|
|
* gguf : add workflow for Pypi publishing
* gguf : add workflow for Pypi publishing
* fix trailing whitespace
|
|
* build ci: run make test
* makefile:
- add all
- add test
* enable tests/test-tokenizer-0-llama
* fix path to model
* remove gcc-8 from macos build test
* Update Makefile
* Update Makefile
|
|
(#2842)
* convert: Fix permute calls and method/func definitions
* Cleanups for gguf-py
* Minor types cleanups.
* Initial implementation of handling merges and special tokens
* convert: Handle special tokens and merges in vocab only mode
convert: Vocab only mode no longer requires loading model tensors
* gguf: Refactor tensor name mapping
* convert: Fix type hint for special_token_types in SpecialVocab
* Use common special vocab handling in various conversion scripts
* First pass at implementing suggested changes
* Second pass
* gguf: SpecialVocab: Fix issue with special token content not in a dict
gguf: SpecialVocab: Allow skipping handling of merges
* convert-falcon-hf-to-gguf: Support --vocab-only option, bail out if no tokenizer.json
* convert-gptneox-hf-to-gguf and convert: Only handle merges for BPE tokenizer
* gguf: SpecialVocab: Actually set load_merges in object
* Uniform args parsing and vocab only mode for convert examples
* convert.py: Set gpt2 as tokenizer model when using BPE
* Squish last type warning in gguf.py - yay!
|
|
* initial, base LOG macro
* add *.log to .gitignore
* added basic log file handler
* reverted log auto endline to better mimic printf
* remove atomics and add dynamic log target
* log_enable/disable, LOG_TEE, basic usage doc
* update .gitignore
* mv include to common, params, help msg
* log tostring helpers, token vectors pretty prints
* main: replaced fprintf/LOG_TEE, some trace logging
* LOG_DISABLE_LOGS compile flag, wrapped f in macros
* fix LOG_TEELN and configchecker
* stub LOG_DUMP_CMDLINE for WIN32 for now
* fix msvc
* cleanup main.cpp:273
* fix stray whitespace after master sync
* log : fix compile warnings
- do not use C++20 stuff
- use PRIu64 to print uint64_t
- avoid string copies by using const ref
- fix ", ##__VA_ARGS__" warnings
- compare strings with == and !=
* log : do not append to existing log + disable file line func by default
* log : try to fix Windows build
* main : wip logs
* main : add trace log
* review: macro f lowercase, str append to sstream
* review: simplify ifs and str comparisons
* fix MSVC, formatting, FMT/VAL placeholders
* review: if/else cleanup
* review: if/else cleanup (2)
* replace _ prefix with _impl suffix
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
|
|
* tests : add a C compliance test
* make : build C compliance test by default
* make : fix clean and make sure C test fails on clang
* make : move -Werror=implicit-int to CFLAGS
|
|
* ggml : add view_src and view_offs
* update ggml-alloc to use view_src
* update ggml_diag_mask to work correctly with automatic inplace
* exclude other ops that set an inplace flag from automatic inplace
|
|
Closes #2858
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
|
|
* 10X faster BPE tokenizer
* Remove comment that no longer applies
---------
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
|
|
convert-to-gguf python scripts
|
|
* [Fix]: convert.py support baichuan7B
* convert.py : fix trailing whitespaces
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
|
|
* make : do not pass headers to the compiler
This fixes building tests with clang.
* make : add missing examples
* make : fix build-info.h dependencies
|