path: root/examples/imatrix/imatrix.cpp
Age | Commit message | Author
2025-05-13 | Fix imatrix calculation for MLA models (#411) | Kawrakow
2025-04-14 | imatrix: collect layer influence statistics (#328) | Kawrakow
2025-03-10 | DeepSeek imatrix stuff (#250) | Kawrakow
2025-02-12 | Fix imatrix overprotectiveness (#202) | Kawrakow
2024-08-12 | Merge mainline - Aug 12 2024 (#17) | Kawrakow
2024-07-24 | Add copyright notices | Iwan Kawrakow
2024-06-26 | imatrix: be able to specify the name of the output tensor | Iwan Kawrakow
2024-06-09 | imatrix : handle partial entries (#7833) | Georgi Gerganov
2024-06-07 | check for nans in imatrix and quantize (#7807) | slaren
2024-06-06 | imatrix : migrate to gpt_params (#7771) | Georgi Gerganov
2024-06-04 | common : refactor cli arg parsing (#7675) | Georgi Gerganov
2024-05-22 | common : normalize naming style (#7462) | Georgi Gerganov
2024-05-08 | Fixed save_imatrix to match old behaviour for MoE (#7099) | jukofyork
2024-04-26 | quantize: add imatrix and dataset metadata in GGUF (#6658) | Pierrick Hymbert
2024-04-18 | ggml : group all experts in a single ggml_mul_mat_id (#6505) | slaren
2024-04-12 | imatrix : remove invalid assert (#6632) | Georgi Gerganov
2024-04-11 | eval-callback: Example how to use eval callback for debugging (#6576) | Pierrick Hymbert
2024-04-09 | BERT tokenizer fixes (#6498) | Jared Van Bortel
2024-04-03 | ggml : mul_mat_id use the same tensor for all the experts (#6387) | slaren
2024-03-26 | llama : greatly reduce output buffer memory usage (#6122) | compilade
2024-03-24 | imatrix : fix wname for mul_mat_id ops (#6271) | Georgi Gerganov
2024-03-18 | backend : offload large batches to GPU (#6083) | slaren
2024-02-16 | ggml : add numa options (#5377) | bmwl
2024-02-04 | Adding some imatrix tools (#5302) | Kawrakow
2024-01-22 | imatrix : keep intermediate imatrix results (#5077) | Kawrakow
2024-01-21 | Slightly faster imatrix (#5050) | Kawrakow
2024-01-18 | imatrix : fix assert for src0 non-cont check | Georgi Gerganov
2024-01-17 | imatrix : offload to GPU support (#4957) | Georgi Gerganov
2024-01-12 | Importance Matrix calculation (#4861) | Kawrakow