ik_llama.cpp.git: commit log for iqk_mul_mat.cpp (branch: main)

Date        Author         Commit message
2024-06-22  Iwan Kawrakow  Bitnet(2.25 bpw): NEON
2024-06-22  Iwan Kawrakow  Bitnet: 2.25 bpw version
2024-06-22  Iwan Kawrakow  bitnet 2 bpw: NEON implementation
2024-06-22  Iwan Kawrakow  Removed extra column
2024-06-22  Iwan Kawrakow  bitnet 2 bpw: AVX2 implementation
2024-06-22  Iwan Kawrakow  iqk_mul_mat(bitnet): fix typo
2024-06-22  Iwan Kawrakow  iqk_mul_mat(bitnet): slightly faster AVX2
2024-06-22  Iwan Kawrakow  iq1_bn: better NEON implementation
2024-06-22  Iwan Kawrakow  iq1_bn(NEON): works now, but very slow
2024-06-22  Iwan Kawrakow  iqk_mul_mat(iq1_bn): WIP NEON - don't see why it is not working
2024-06-22  Iwan Kawrakow  iqk_mul_mat(iq1_bn): WIP NEON (not working)
2024-06-22  Iwan Kawrakow  iqk_mul_mat: improve iq1_bn (bitnet) on vanilla AVX2
2024-06-22  Iwan Kawrakow  iqk_mul_mat: improve iq1_bn (bitnet) on AVX2
2024-06-22  Iwan Kawrakow  bitnet: scale is per row, not per tensor
2024-06-22  Iwan Kawrakow  iqk_mul_mat: add iq1_bn (bitnet)
2024-06-22  Iwan Kawrakow  iqk_mul_mat: cleanup
2024-06-22  Iwan Kawrakow  iqk_mul_mat: be independent of llamafile_sgemm
2024-06-22  Iwan Kawrakow  iqk_mul_mat: be independent of llamafile_sgemm (WIP)
2024-06-22  Iwan Kawrakow  iqk_mul_mat: be able to handle any f16/f32 combination on AVX2
2024-06-22  Iwan Kawrakow  iqk_mul_mat: turn on AVX512
2024-06-22  Iwan Kawrakow  iqk_mul_mat: slightly better fp16 with 16 vector registers
2024-06-22  Iwan Kawrakow  iqk_mul_mat: better fp16 for AVX2
2024-06-22  Iwan Kawrakow  iqk_mul_mat: fp16 for Arm
2024-06-22  Iwan Kawrakow  iqk_mul_mat: slightly faster FANCY_SIMD dot product
2024-06-22  Iwan Kawrakow  iqk_mul_mat: fix q8_0
2024-06-22  Iwan Kawrakow  iqk_mul_mat: use block_q8_1_x4 also for AVX2
2024-06-22  Iwan Kawrakow  iqk_mul_mat: use block_q8_0_x4 also for AVX2
2024-06-22  Iwan Kawrakow  iqk_mul_mat: delete unused stuff
2024-06-22  Iwan Kawrakow  iqk_mul_mat: add q8_0
2024-06-22  Iwan Kawrakow  iqk_mul_mat: fp16 tweaks
2024-06-22  Iwan Kawrakow  iqk_mul_mat: fp16 implementation cleanup
2024-06-22  Iwan Kawrakow  iqk_mul_mat: fp16 implementation for AVX2
2024-06-22  Iwan Kawrakow  iqk_mul_mat: make it independent of sgemm
2024-06-22  Iwan Kawrakow  iqk_mul_mat: minor improvements
2024-06-22  Iwan Kawrakow  iqk_mul_mat: no more templates in the IQ dequantizers
2024-06-22  Iwan Kawrakow  iqk_mul_mat: remove template on one of the prepare() functions
2024-06-22  Iwan Kawrakow  iqk_mul_mat: experimenting with zen4
2024-06-22  Iwan Kawrakow  iqk_mul_mat: experimenting with zen4 (iq2_xxs)
2024-06-22  Iwan Kawrakow  iqk_mul_mat: experimenting with zen4 (iq2_xs)
2024-06-22  Iwan Kawrakow  iqk_mul_mat: experimenting with zen4 (iq3_s and iq2_m)
2024-06-22  Iwan Kawrakow  iqk_mul_mat: small improvement for iq3_s
2024-06-22  Iwan Kawrakow  iqk_mul_mat: better AVX2 implementation for iq2_xxs
2024-06-22  Iwan Kawrakow  iqk_mul_mat: better AVX2 implementation for iq2_xxs
2024-06-22  Iwan Kawrakow  iqk_mul_mat: AVX2 implementation for iq2_xxs
2024-06-22  Iwan Kawrakow  iqk_mul_mat: AVX2 implementation for iq2_xs
2024-06-22  Iwan Kawrakow  iqk_mul_mat: AVX2 implementation for iq2_s
2024-06-22  Iwan Kawrakow  Separate templates for TG and PP for i-quants on AVX2
2024-06-22  Iwan Kawrakow  iqk_mul_mat: AVX2 implementation for iq3_xxs
2024-06-22  Iwan Kawrakow  iqk_mul_mat: AVX2 implementation for iq3_s
2024-06-22  Iwan Kawrakow  Cleanup - Arm i-quants should be good now