author | slaren <slarengh@gmail.com> | 2023-09-30 18:12:57 +0200 |
---|---|---|
committer | GitHub <noreply@github.com> | 2023-09-30 18:12:57 +0200 |
commit | f5ef5cfb18148131fcf45bdd2331f0db5ab7c3d0 | |
tree | 97465215d07603cfca34daf8adf8280078e0bf5e /examples/server | |
parent | 40e07a60f9ce06e79f3ccd4c903eba300fb31b5e | |

ggml-cuda : perform cublas mat mul of quantized types as f16 (#3412)

* ggml-cuda : perform cublas matrix multiplication of quantized types as fp16
* rename CC_TURING to CC_VOLTA
* disable fp16 mat mul completely with multi GPU

Diffstat (limited to 'examples/server')
0 files changed, 0 insertions, 0 deletions