author | Kawrakow <iwankawrakow@gmail.com> | 2024-10-22 17:28:14 +0200 |
---|---|---|
committer | GitHub <noreply@github.com> | 2024-10-22 17:28:14 +0200 |
commit | b61cf7d0d7e7c5d971087d2f919818fbf684809e (patch) | |
tree | 13b094803488737aed3dc9817e96c196f7bcbf9a /examples | |
parent | 462c6cd7b1b03843ab782e36c75da9bfea657c14 (diff) |
Add support for Granite and GraniteMoE models (#102)
* Add Granite and GraniteMoE models
* Granite: avoid NaNs on CUDA by scaling Q before K*Q multiplication
---------
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
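
The second bullet of the commit message can be illustrated with a small sketch. The commit does not state why the NaNs appear; the sketch assumes, as one plausible reading, that the K*Q product overflows the half-precision range before the scale is applied. All names (`to_f16`, `dot_f16`, `softmax`), the head size, the input values, and the conventional `1/sqrt(head_dim)` scale are hypothetical choices for illustration and are not taken from the patch. It uses only the Python standard library and emulates fp16 rounding with `struct`.

```python
import math
import struct


def to_f16(x: float) -> float:
    """Round x to IEEE half precision; values beyond the fp16 range become +/-inf."""
    if math.isnan(x) or math.isinf(x):
        return x
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses to pack values too large for fp16; hardware would give inf.
        return math.copysign(math.inf, x)


def dot_f16(a, b):
    """Dot product where every product and partial sum is rounded to fp16."""
    acc = 0.0
    for x, y in zip(a, b):
        acc = to_f16(acc + to_f16(x * y))
    return acc


def softmax(logits):
    """Max-subtracted softmax; inf - inf produces NaN, as it would on a GPU."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical inputs, chosen only so that the unscaled product leaves the fp16 range.
head_dim = 128
scale = 1.0 / math.sqrt(head_dim)  # assumed scale, not necessarily Granite's
q = [30.0] * head_dim
k = [30.0] * head_dim

# Scale applied after the product: the accumulator overflows to inf first.
late = to_f16(dot_f16(q, k) * scale)

# Scale applied to Q before the product: every partial sum stays in range.
q_scaled = [to_f16(x * scale) for x in q]
early = dot_f16(q_scaled, k)

probs_late = softmax([late, late])     # inf - inf inside softmax gives NaN
probs_early = softmax([early, early])  # finite logits give valid probabilities

print("scale after  K*Q:", late, probs_late)
print("scale before K*Q:", early, probs_early)
```

Both orderings are mathematically equal, since (s·Q)·K = s·(Q·K); under the assumption above, only the intermediate magnitudes differ, which is why moving the scale can remove the NaNs without changing the intended result.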
Diffstat (limited to 'examples')
0 files changed, 0 insertions, 0 deletions