git.ipfire.org Git - thirdparty/valgrind.git/commit
Reimplement h_generic_calc_GetMSBs8x16 to be more efficient.
author Julian Seward <jseward@acm.org>
Tue, 13 Jul 2021 07:07:45 +0000 (09:07 +0200)
committer Julian Seward <jseward@acm.org>
Tue, 13 Jul 2021 07:07:45 +0000 (09:07 +0200)
commit e5f66a2aa00fa88ba3e0fb004510f0a630881ef1
tree 5d2c555ad20b983feb82bcd3ec6631857dd77acf
parent 43543527a293e626e601202ca4eeb2216f40815d

h_generic_calc_GetMSBs8x16 concatenates the top bit of each 8-bit lane in a
128-bit value, producing a 16-bit scalar value.  (It is effectively PMOVMSKB.)
The existing implementation is very inefficient and sometimes shows up
in 'perf' profiles of Valgrind.  This commit replaces it with a logarithmic
(4-stage) algorithm which is hopefully much faster.
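
For illustration only, here is a minimal sketch of how a logarithmic
reduction of this kind can work.  It is not the code from this commit: the
function names, the use of <stdint.h> types instead of VEX's own types, the
split of the 128-bit value into two 64-bit halves, and the exact breakdown
into stages are all assumptions.  Each half first isolates the top bit of
every byte, then folds neighbouring groups together with three
shift-and-mask steps, doubling the group width each time.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: gather the top bit of each of the 8 bytes in x
   into bits 0..7 of the result (byte 0 -> bit 0, byte 7 -> bit 7). */
static uint32_t gather_msbs_8x8(uint64_t x)
{
    /* Move each byte's top bit down to bit 0 of that byte. */
    x = (x >> 7) & 0x0101010101010101ULL;
    /* Fold byte pairs: each 16-bit lane now holds 2 result bits. */
    x = (x | (x >> 7)) & 0x0003000300030003ULL;
    /* Fold 16-bit pairs: each 32-bit lane now holds 4 result bits. */
    x = (x | (x >> 14)) & 0x0000000F0000000FULL;
    /* Fold the two 32-bit lanes: 8 result bits in the low byte. */
    x = (x | (x >> 28)) & 0x00000000000000FFULL;
    return (uint32_t)x;
}

/* Hypothetical name: sketch of the log-style version.  hi and lo are the
   upper and lower 64 bits of the 128-bit value. */
uint32_t get_msbs_8x16_sketch(uint64_t hi, uint64_t lo)
{
    return (gather_msbs_8x8(hi) << 8) | gather_msbs_8x8(lo);
}

/* Hypothetical name: straightforward bit-at-a-time reference version,
   used only to cross-check the sketch above. */
uint32_t get_msbs_8x16_naive(uint64_t hi, uint64_t lo)
{
    uint32_t r = 0;
    for (int i = 0; i < 8; i++) {
        r |= (uint32_t)((lo >> (8 * i + 7)) & 1) << i;
        r |= (uint32_t)((hi >> (8 * i + 7)) & 1) << (i + 8);
    }
    return r;
}

/* Small self-check comparing the two versions on a few inputs. */
void check_msbs_sketch(void)
{
    const uint64_t v[] = {
        0ULL, 0x80ULL, 0x8000000000000000ULL,
        0x8080808080808080ULL, 0x0123456789ABCDEFULL,
        0xFEDCBA9876543210ULL
    };
    const int n = (int)(sizeof v / sizeof v[0]);
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            assert(get_msbs_8x16_sketch(v[i], v[j]) ==
                   get_msbs_8x16_naive(v[i], v[j]));
}
```

The reference version performs 16 shift/mask/or steps, one per lane, while
the sketch uses a fixed handful of operations per 64-bit half, which is the
kind of saving the commit message is describing.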
VEX/priv/host_generic_simd128.c