git.ipfire.org Git - thirdparty/vectorscan.git/commit
Optimize vectorscan for aarch64 by using shrn instruction
author Danila Kutenin <danilak@google.com>
Sun, 26 Jun 2022 22:50:05 +0000 (22:50 +0000)
committer Danila Kutenin <danilak@google.com>
Sun, 26 Jun 2022 22:55:45 +0000 (22:55 +0000)
commit 49eb18ee4f21b5bd389e0e9d5644be1ec1dc85c6
tree 2313edfb1e5c7bffcc47e27592cf7c14cdbaf944
parent bd9113463d0a91605acf571d996a6995ab4d4cc3

This optimization is based on the thread
https://twitter.com/Danlark1/status/1539344279268691970 and uses the
"shift right and narrow by 4" (SHRN) instruction:
https://developer.arm.com/documentation/ddi0596/2020-12/SIMD-FP-Instructions/SHRN--SHRN2--Shift-Right-Narrow--immediate--

To achieve that, I had to redesign movemask slightly into comparemask
and add an extra step to mask iteration. Our benchmarks showed a 10-15%
improvement on average for long matches.
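
The idea can be sketched as follows. This is a minimal illustration, not
the code from this commit: the helper names comparemask16 and
match_positions are hypothetical, the scalar fallback exists only so the
sketch builds off aarch64, and __builtin_ctzll assumes GCC or Clang. On
aarch64, SHRN by 4 turns a 16-byte compare result into a 64-bit mask with
4 bits per byte lane, so iterating matches needs one extra step: dividing
the bit index by 4.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

// Hypothetical helper: compare 16 bytes against `needle` and return a
// 64-bit mask with one nibble per byte lane (0xF on match, 0x0 otherwise).
static inline uint64_t comparemask16(const uint8_t *block, uint8_t needle) {
#if defined(__aarch64__)
    // Each byte lane becomes 0xFF on match, 0x00 otherwise.
    uint8x16_t eq = vceqq_u8(vld1q_u8(block), vdupq_n_u8(needle));
    // SHRN: shift every 16-bit lane right by 4 and narrow it to 8 bits,
    // leaving one nibble per original byte lane.
    uint8x8_t nibbles = vshrn_n_u16(vreinterpretq_u16_u8(eq), 4);
    return vget_lane_u64(vreinterpret_u64_u8(nibbles), 0);
#else
    // Scalar fallback that produces the same mask layout.
    uint64_t mask = 0;
    for (size_t i = 0; i < 16; ++i) {
        if (block[i] == needle) {
            mask |= 0xFULL << (4 * i);
        }
    }
    return mask;
#endif
}

// Hypothetical helper: list the byte offsets in the block that match.
static inline std::vector<size_t> match_positions(const uint8_t *block,
                                                  uint8_t needle) {
    std::vector<size_t> out;
    // Keep one bit per lane so clearing the lowest set bit moves on to
    // the next matching lane.
    uint64_t mask = comparemask16(block, needle) & 0x1111111111111111ULL;
    while (mask) {
        // Extra iteration step: bit index / 4 gives the byte offset.
        out.push_back(static_cast<size_t>(__builtin_ctzll(mask)) >> 2);
        mask &= mask - 1;
    }
    return out;
}
```

With a classic movemask (one bit per lane) the bit index is the byte
offset directly. With the nibble layout above, the shift by 2 is the
additional step in mask iteration.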
src/hwlm/noodle_engine_simd.hpp
src/nfa/limex_shuffle.hpp
src/util/arch/arm/match.hpp
src/util/arch/arm/simd_utils.h
src/util/arch/ppc64el/match.hpp
src/util/arch/x86/match.hpp
src/util/supervector/arch/arm/impl.cpp
src/util/supervector/arch/ppc64el/impl.cpp
src/util/supervector/arch/x86/impl.cpp
src/util/supervector/supervector.hpp
unit/internal/supervector.cpp