author | Iwan Kawrakow <iwan.kawrakow@gmail.com> | 2024-06-25 18:19:11 +0300 |
---|---|---|
committer | Iwan Kawrakow <iwan.kawrakow@gmail.com> | 2024-06-25 18:19:11 +0300 |
commit | 753dbaeeb0be5fb3d0d4337d7854dcf4f3a30fe1 | |
tree | afedc73d7d8b8032f5c2057aec8bdff95e6601df /common/common.h | |
parent | 8b436a84c53de4c5a8eaf9be72cdd82324da2eeb |
bitnet: remove iq1_bn lookup table storing +/- signs
The AVX2 implementation was the only one left using it, so
I decided to see if we can get a performant implementation
using the 0,1,2 lookup table. Turns out we can, and it is
even slightly faster than the sign-based table. We now
get PP-512 = 275 t/s and TG-128 = 57.7 t/s with 16 threads
on the Ryzen-7950X.
With only one lookup table left for iq1_bn, I renamed it to
iq1bn_grid_u16.
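
The reason a 0,1,2 table can stand in for a +/- sign table may be easier to
see in scalar form. The sketch below is not the AVX2 kernel from this commit
and does not reproduce the real layout of iq1bn_grid_u16; it assumes a
hypothetical packing of 8 ternary digits, 2 bits each, into one uint16_t.
The helper names (pack8, dot_u16, dot_signed, self_check) are made up for
illustration. With digit d = w + 1, the signed dot product can be recovered
as sum(d*y) - sum(y), so only unsigned digits need to be stored.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical packing (an assumption, not the actual iq1_bn layout):
// 8 ternary weights w[i] in {-1, 0, +1} are stored as digits d = w + 1
// in {0, 1, 2}, two bits per digit, inside one uint16_t.
static uint16_t pack8(const int8_t w[8]) {
    uint16_t g = 0;
    for (int i = 0; i < 8; ++i) {
        g = static_cast<uint16_t>(g | ((w[i] + 1) << (2 * i)));
    }
    return g;
}

// Dot product using only the unsigned digits:
//   sum(w * y) = sum((d - 1) * y) = sum(d * y) - sum(y)
static int dot_u16(uint16_t g, const int8_t y[8]) {
    int sum_dy = 0;
    int sum_y = 0;
    for (int i = 0; i < 8; ++i) {
        const int d = (g >> (2 * i)) & 3;  // digit in {0, 1, 2}
        sum_dy += d * y[i];
        sum_y += y[i];
    }
    return sum_dy - sum_y;
}

// Reference: the same dot product computed directly from signed weights.
static int dot_signed(const int8_t w[8], const int8_t y[8]) {
    int sum = 0;
    for (int i = 0; i < 8; ++i) sum += w[i] * y[i];
    return sum;
}

// Checks that both formulations agree on one small example.
static void self_check() {
    const int8_t w[8] = {-1, 0, 1, 1, -1, 0, 0, 1};
    const int8_t y[8] = {3, -2, 5, 7, -1, 4, 0, -6};
    assert(dot_u16(pack8(w), y) == dot_signed(w, y));
}
```

In a vectorized kernel the sum(y) correction would typically be computed once
per block of activations, which is one plausible reason the unsigned table
does not cost extra; the commit message itself only reports the measured
speeds.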
Diffstat (limited to 'common/common.h')
0 files changed, 0 insertions, 0 deletions