author Justine Tunney <jtunney@mozilla.com> 2024-05-17 02:58:52 -0400
committer GitHub <noreply@github.com> 2024-05-17 09:58:52 +0300
commit 934266c0e0b2aa9781fdba2deb112c161ff038a9
tree 7fd8b97aa1277c1e34130ec8ef0ea4c7c04cd1a1 /examples/server
parent 9c4fdcbec8c7fcc428e723b0d8a1cf1f351ba642
ggml : rewrite silu and softmax for cpu (#7154)
This change upstreams llamafile's vectorized expf() functions. This lets us compute softmax and silu more accurately than the short[65536] lookup table that GGML previously used to make these operations go faster. We can support aarch64 and sse2+ with a worst-case rounding error of 2 ulp. The benchmark

    make -j8 tests && ./tests/test-backend-ops -o SOFT_MAX -b CPU perf

runs 1.5x faster with SSE2+FMA, 1.9x faster with AVX2+FMA, and 2.1x faster with AVX512.
Diffstat (limited to 'examples/server')
0 files changed, 0 insertions, 0 deletions