Age | Commit message | Author
2024-01-26 | metal : remove unused `n_buffers` and `buffers` (#5129) | Paul Tsochantaris
2024-01-26 | gguf : fix "general.alignment" type in gguf_reader.py (#5136) | Riceball LEE
2024-01-26 | readme : update hot topics | Georgi Gerganov
2024-01-26 | Another bucket sort (#5109) | Kawrakow
2024-01-25 | readme : add MobileVLM 1.7B/3B to the supported models list (#5107) | XiaotaoChen
2024-01-25 | llama : dynamic temperature sampling (#4972) | l3utterfly
2024-01-25 | examples : make pydantic scripts pass mypy and support py3.8 (#5099) | Jared Van Bortel
2024-01-25 | android : use release cmake build type by default (#5123) | Valentin Konovalov
2024-01-25 | Fix Q3_K_XS for MoE models (#5113) | Kawrakow
2024-01-25 | metal : show compile log messages | Georgi Gerganov
2024-01-24 | cuda : fix 2-bit quants on amd hip (#5105) | Engininja2
2024-01-24 | nix-shell: use addToSearchPath | Michael Hueschen
2024-01-24 | nix: add cc to devShell LD_LIBRARY_PATH | Michael Hueschen
2024-01-24 | llama : pre-allocate input tensors in a separate buffer (#5100) | slaren
2024-01-23 | metal : disable support for MUL_MAT F32 x F16 | Georgi Gerganov
2024-01-23 | Additional KL-divergence statistics (#5081) | Kawrakow
2024-01-23 | CUDA: more info when no device code (#5088) | Johannes Gäßler
2024-01-23 | minor : clean-up some warnings and style (#5094) | Georgi Gerganov
2024-01-23 | devops : add intel oneapi dockerfile (#5068) | Xuan Son Nguyen
2024-01-23 | llama.vim : added api key support (#5090) | Michael Coppola
2024-01-22 | llama : fix not enough space in buffer with Qwen (#5086) | slaren
2024-01-22 | KL-divergence (#5076) | Kawrakow
2024-01-22 | ggml : parallelize FP32 conversion when using BLAS (#5045) | Reinforce-II
2024-01-22 | llava : MobileVLM support (#4954) | XiaotaoChen
2024-01-22 | flake.nix: add a comment about flakes vs nix | Someone Serge
2024-01-22 | nix: add a comment on the many nixpkgs-with-cuda instances | Someone Serge
2024-01-22 | nix: add a comment about makeScope | Someone Serge
2024-01-22 | nix: refactor the cleanSource rules | Someone Serge
2024-01-22 | workflows: nix-ci: drop the redundant "paths" filter | Someone Serge
2024-01-22 | workflows: nix-build-aarch64: rate limit | Someone Serge
2024-01-22 | workflows: nix-ci: rebuild on flake.lock updates | Someone Serge
2024-01-22 | imatrix : keep intermediate imatrix results (#5077) | Kawrakow
2024-01-22 | llama : support StableLM 2 1.6B (#5052) | compilade
2024-01-22 | finetune : print sample-start/include-sample-start (#5072) | Daniel Bevenius
2024-01-22 | llama : add Q3_K_XS (#5060) | Kawrakow
2024-01-22 | ci : fix Windows CI by updating Intel SDE version (#5053) | bobqianic
2024-01-22 | llama : add more qwen2 models (#5071) | Shijie
2024-01-21 | Revert LLAMA_NATIVE to OFF in flake.nix (#5066) | iSma
2024-01-21 | add safetensors support to convert-lora-to-ggml.py (#5062) | kuronekosaiko
2024-01-21 | add `#include <string>` to unicode.h (#5051) | bobqianic
2024-01-21 | Add ability to evauate multiple choice tasks (#5047) | Kawrakow
2024-01-21 | Slightly faster imatrix (#5050) | Kawrakow
2024-01-21 | flake.lock: Update (#5054) | Georgi Gerganov
2024-01-20 | convert : partially revert PR #4818 (#5041) | Jared Van Bortel
2024-01-20 | perplexity : fix MSVC build after #5020 (#5043) | Jared Van Bortel
2024-01-20 | llama : run all KQV ops on the CPU with no KV offload (#5049) | slaren
2024-01-20 | cmake : add support for ccache (#5002) | Herman Semenov
2024-01-20 | Add a dart/flutter binding to README.md (#4882) | adel boussaken
2024-01-20 | cuda : fix compile error in jetson platform (#4975) | Kylin
2024-01-19 | finetune : fix ggml_allocr lifetimes (tmp workaround) (#5033) | Uzo Nweke