Age | Commit message | Author |
|
* ggml : remove old quantization functions
ggml-ci
* ggml : simplify ggml_quantize_chunk
ggml-ci
* ggml : restrict correctness
ggml-ci
* ggml : remove hist data from the quantization API
ggml-ci
* tests : remove hist usage in test-backend-ops
ggml-ci
* vulkan : remove hist and fix typo
|
|
* imatrix: load
* imatrix: WIP
* imatrix: Add Q2_K quantization
* imatrix: also guard against Q2_K_S quantization without importance matrix
* imatrix: guard even more against low-bit quantization misuse
---------
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
|
|
* Fixes "Not enough space in the context's memory pool" encountered on certain models, which seems to be caused by some imprecision related to the automatic casting of floating point values
* do not cast to size_t, instead just use doubles
* ggml : add ggml_row_size(), deprecate ggml_type_sizef()
* ggml : fix row size compute to avoid overflows
* tests : fix sizey -> sizez
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
|
|
* sync : ggml (backend v2) (wip)
* sync : migrate examples and llama.cpp to dynamic graphs (wip)
* sync : update tests + fix max op params to 64
ggml-ci
* sync : ggml-cuda
ggml-ci
* llama : fix save/load state context size
ggml-ci
* sync : try to fix build on tvOS
* sync : pass custom graph sizes in training examples
* sync : update graph copies to new ggml API
* sync : update sync-ggml.sh with new files
* scripts : fix header in sync script
* train : fix context size calculations
* llama : increase inference graph size up to 4096 nodes
* train : allocate grads for backward graphs
* train : allocate grads for gb_tmp
|
|
* cmake : fix build when .git does not exist
* cmake : simplify BUILD_INFO target
* cmake : add missing dependencies on BUILD_INFO
* build : link against build info instead of compiling against it
* zig : make build info a .cpp source instead of a header
Co-authored-by: Matheus C. França <matheus-catarino@hotmail.com>
* cmake : revert change to CMP0115
---------
Co-authored-by: Matheus C. França <matheus-catarino@hotmail.com>
|
|
The precision for Q4_0 has degraded since #1508
|
|
* ggml_graph_compute: deprecate using ggml_context, try resolve issue #287
* rewrite: no longer consider backward compatibility; plan and make_plan
* minor: rename ctx as plan; const
* remove ggml_graph_compute from tests/test-grad0.c, but current change breaks backward
* add static ggml_graph_compute_sugar()
* minor: update comments
* reusable buffers
* ggml : more consistent naming + metal fixes
* ggml : fix docs
* tests : disable grad / opt + minor naming changes
* ggml : add ggml_graph_compute_with_ctx()
- backwards compatible API
- deduplicates a lot of copy-paste
* ci : enable test-grad0
* examples : factor out plan allocation into a helper function
* llama : factor out plan stuff into a helper function
* ci : fix env
* llama : fix duplicate symbols + refactor example benchmark
* ggml : remove obsolete assert + refactor n_tasks section
* ggml : fix indentation in switch
* llama : avoid unnecessary bool
* ggml : remove comments from source file and match order in header
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
|
|
* benchmark-matmul: fix command line parsing, replace macros with functions, report results in GFLOPS
|
|
* Add git-based build information for better issue tracking
* macOS fix
* "build (hash)" and "CMAKE_SOURCE_DIR" changes
* Redo "CMAKE_CURRENT_SOURCE_DIR" and clearer build messages
* Fix conditional dependency on missing target
* Broke out build-info.cmake, added find_package fallback, added build info to all examples, and added dependencies to Makefile
* 4 space indenting for cmake, attempt to clean up my mess in Makefile
* Short hash, less fancy Makefile, and don't modify build-info.h if it wouldn't change it
|
|