Age        | Commit message                                                                    | Author
2023-12-15 | ggml : group mul_mat_id rows by matrix (cpu only) (#4480)                         | slaren
2023-12-14 | ggml : use ggml_row_size where possible (#4472)                                   | slaren
2023-12-14 | ggml : remove n_dims from ggml_tensor (#4469)                                     | slaren
2023-12-14 | py : add protobuf dependency (#4466)                                              | wonjun Jang
2023-12-14 | ggml : add ggml_row_size() (fixes llama out of space) (#4461)                     | LostRuins
2023-12-14 | ggml : fix OpenCL broadcast requirement for ggml_mul (close #4453)                | Georgi Gerganov
2023-12-14 | convert : support loading vocab from fast tokenizer config (#3633)                | wonjun Jang
2023-12-14 | readme : update supported model list (#4457)                                      | BarfingLemurs
2023-12-13 | server : fix handling of characters that span multiple tokens when streaming ...  | shibe2
2023-12-13 | sync : ggml (SD ops, tests, kernels) (#4444)                                      | Georgi Gerganov
2023-12-13 | build : detect host compiler and cuda compiler separately (#4414)                 | Jared Van Bortel
2023-12-13 | common : add `--version` option to show build info in CLI (#4433)                 | Siwen Yu
2023-12-13 | readme : update hot topics                                                        | Georgi Gerganov
2023-12-13 | llama : add Mixtral support (#4406)                                               | slaren
2023-12-12 | server : tweak default sampling parameters (#4367)                                | kalomaze
2023-12-12 | english : use `typos` to fix comments and logs (#4354)                            | Richard Kiss
2023-12-12 | build : target Windows 8 for standard mingw-w64 (#4405)                           | Jared Van Bortel
2023-12-12 | llama : document logits_all deprecation (#4418)                                   | crasm
2023-12-12 | server : fix local model name in server (#4420)                                   | Vladimir Zorin
2023-12-12 | ggml : increased GGML_MAX_PARAMS to allow finetuning of 70b models (#4424)        | Taikono-Himazin
2023-12-10 | Update README.md (#4388)                                                          | Yueh-Po Peng
2023-12-09 | grammar : revert the replacement of llama_token_to_piece with id_to_token (#4...  | Xiang (Kevin) Li
2023-12-07 | sync : ggml (new ops, tests, backend, etc.) (#4359)                               | Georgi Gerganov
2023-12-07 | llama : per-layer KV cache + quantum K cache (#4309)                              | Georgi Gerganov
2023-12-07 | train : fix #4227 (double free in examples/train-text-from-scratch/train-text...  | Hongyu Ouyang
2023-12-06 | server : recognize cache_prompt parameter in OAI API (#4347)                      | Georgi Gerganov
2023-12-06 | common : fix compile warning                                                      | Georgi Gerganov
2023-12-06 | speculative : support `--color` (#4343)                                           | stduhpf
2023-12-05 | grammar : pre-computed pieces + reserve mem + less string copies (#4330)          | Marcus Dunn
2023-12-05 | llama : allow overriding GGUF metadata when loading model (#4092)                 | Kerfuffle
2023-12-05 | sampling : custom samplers order (#4285)                                          | MaggotHATE
2023-12-05 | swift : revert compiler checks for swift package (#4332)                          | kchro3
2023-12-04 | simple : update error message for KV cache check (#4324)                          | Daniel Bevenius
2023-12-04 | swift : fix concatenation method to avoid invalid UTF8 stringfication (#4325)     | Miwa / Ensan
2023-12-04 | swift : fix prompt tokenization logic (#4321)                                     | Miwa / Ensan
2023-12-04 | grammar-parser : fix typo (#4318)                                                 | Ikko Eltociear Ashimine
2023-12-03 | ggml : reuse ggml_get_n_tasks() in ggml_graph_plan() (#4308)                      | Georgi Gerganov
2023-12-03 | ggml : fix soft max out-of-bounds access (#4307)                                  | Georgi Gerganov
2023-12-03 | server : fix OpenAI API `stop` field to be optional (#4299)                       | Ed Lee
2023-12-03 | py : add grammar to oai like api (#4294)                                          | Rickard Edén
2023-12-03 | llama : pad KV cache size (#4280)                                                 | Georgi Gerganov
2023-12-01 | llama : avoid using "optional" keyword (#4283)                                    | Georgi Gerganov
2023-12-01 | llama : support optional tensors (#4283)                                          | Georgi Gerganov
2023-12-01 | swift : fix token_to_piece implementation (#4278)                                 | Miwa / Ensan
2023-12-01 | build : enable libstdc++ assertions for debug builds (#4275)                      | Jared Van Bortel
2023-12-01 | llama : support attention bias on LLaMA architecture (#4283)                      | CausalLM
2023-12-01 | llama : add Qwen support (#4281)                                                  | Shijie
2023-12-01 | llama : fix integer overflow during quantization (#4284)                          | Georgi Gerganov
2023-12-01 | py : add requirements file for convert-hf-to-gguf.py (#4277)                      | Daniel Bevenius
2023-12-01 | ggml : add ggml_soft_max_ext (#4256)                                              | Georgi Gerganov