| author | Kawrakow <iwankawrakow@gmail.com> | 2024-10-26 16:26:04 +0200 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2024-10-26 16:26:04 +0200 |
| commit | bd309cb782ae8a5205dd741ccb97f6103f74888a | |
| tree | ddbcade915d8158e893a5eebee40e8fd196353ab /ggml/src/vulkan-shaders/dequant_head.comp | |
| parent | 3805c84686f40fc4423d45308cab6adac2eafdd4 | |
Bitnet CUDA improvements (#109)
* iq1_bn: improve CUDA TG
On an RTX-3080, TG-128 (Bitnet-1.58b-3B) goes from 318 t/s to 340 t/s.
The front page still lists 301 t/s, so this is a pretty nice improvement
since then.
* iq2_bn(CUDA): quants are not 4-byte aligned
---------
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
Diffstat (limited to 'ggml/src/vulkan-shaders/dequant_head.comp')
0 files changed, 0 insertions, 0 deletions
