Age | Commit message | Author
2024-05-30 | Fixed painfully slow single process builds. (#7326) | JohnnyB
2024-05-31 | llama : cache llama_token_to_piece (#7587) | Georgi Gerganov
2024-05-31 | Fix conan badge display [no ci] (#7645) | Martin Delille
2024-05-31 | Add brew installation instruction to README [no ci] (#7616) | Manuel
2024-05-30 | readme : add Conan badge (#7638) | Martin Delille
2024-05-30 | github: add contact links to issues and convert question into research [no ci... | Brian
2024-05-30 | Move convert.py to examples/convert-legacy-llama.py (#7430) | Galunid
2024-05-30 | faster avx512 exp implementation (#7551) | Chris Elrod
2024-05-30 | ggml : fix loongarch build (O2 issue) (#7636) | junchao-loongson
2024-05-30 | README: explain parallel build [no ci] (#7618) | Johannes Gäßler
2024-05-30 | [SYCL] fix intel docker (#7630) | Meng, Hengyu
2024-05-30 | gguf-py : Add tokenizer.ggml.pre to gguf-new-metadata.py (#7627) | Galunid
2024-05-29 | metal : remove invalid asserts (#7617) | Georgi Gerganov
2024-05-29 | metal : add missing asserts (#7617) | Georgi Gerganov
2024-05-29 | ggml : fix YARN + add tests + add asserts (#7617) | Georgi Gerganov
2024-05-29 | cuda : non-cont concat support (#7610) | Georgi Gerganov
2024-05-29 | llama-bench : add support for the RPC backend (#7435) | Radoslav Gerganov
2024-05-29 | ggml : use atomic_flag for critical section (#7598) | slaren
2024-05-29 | scripts : remove mpi remnants | Georgi Gerganov
2024-05-29 | sync : ggml | Georgi Gerganov
2024-05-29 | ggml : restore ggml_rope_xpos_inplace (ggml/0) | Georgi Gerganov
2024-05-29 | Add Arc A750 and Arch linux to readme-sycl.md as verified GPU model and Linux... | Akarshan Biswas
2024-05-29 | ggml : fix typo in ggml.c (#7603) | zhouwg
2024-05-29 | [SYCL] Align GEMM dispatch (#7566) | Meng, Hengyu
2024-05-28 | Tokenizer WPM fixes (#7500) | jaime-m-p
2024-05-28 | sycl : fix assert (#7563) | Georgi Gerganov
2024-05-28 | llama : support small Granite models (#7481) | Giuseppe Scrivano
2024-05-28 | vulkan: properly initialize vulkan devices for LLAMA_SPLIT_MODE_NONE (#7552) | k.h.lai
2024-05-28 | rpc : resource management rework (#7562) | Radoslav Gerganov
2024-05-28 | Add support for DeepseekV2ForCausalLM (#7519) | fairydreaming
2024-05-28 | tests : fix test-tokenizer-0.sh | Georgi Gerganov
2024-05-28 | llama : handle unknown utf8 bytes (#7588) | Georgi Gerganov
2024-05-28 | github: add refactor to issue template (#7561) | Brian
2024-05-28 | [SYCL]fix ggml_sycl_mul_mat_id() to match the change of api (#7436) | Neo Zhang
2024-05-28 | ggml : generalize GGML_OP_CONCAT (#7563) | Georgi Gerganov
2024-05-28 | server: do not remove whitespace at the start of a completion chunk (#7524) | mgroeber9110
2024-05-28 | Markdownish code block fix (#7571) | Nathan Epstein
2024-05-28 | llava : update clip.h (#7580) | Ikko Eltociear Ashimine
2024-05-28 | update HIP_UMA #7399 (#7414) | Djip007
2024-05-28 | adding in x64 targets to cmake presets (#7574) | kunnis
2024-05-27 | make: add --device-debug to NVCC debug flags (#7542) | Johannes Gäßler
2024-05-27 | Allow multiple copy function pointers for CUDA graph kernel param updates (#7... | agray3
2024-05-27 | Fix q_xxs using mul_mat_q (#7459) | AidanBeltonS
2024-05-27 | Add freq factors (#7495) | AidanBeltonS
2024-05-27 | metal : add GGML_OP_REPEAT kernels (#7557) | Georgi Gerganov
2024-05-27 | metal : disable FA kernel for HS=256 (#7556) | Georgi Gerganov
2024-05-27 | llama : add comments about experimental flags (#7544) | Georgi Gerganov
2024-05-27 | github: add self sorted issue ticket forms (#7543) | Brian
2024-05-26 | flake.lock: Update (#7540) | Georgi Gerganov
2024-05-27 | main: replace --no-special with --special (#7534) | Brian