git.ipfire.org Git - thirdparty/glibc.git/commit
x86: Optimize memmove-vec-unaligned-erms.S
author Noah Goldstein <goldstein.w.n@gmail.com>
Mon, 1 Nov 2021 05:49:51 +0000 (00:49 -0500)
committer Noah Goldstein <goldstein.w.n@gmail.com>
Sat, 6 Nov 2021 21:18:03 +0000 (16:18 -0500)
commit a6b7502ec0c2da89a7437f43171f160d713e39c6
tree d2ff01bb7c3ea8b1e1415542a50913a27bbb3707
parent ac759b1fbf28a82d99afde9046f8b72c7cba5dae

No bug.

The optimizations are as follows:

1) Always align entry to 64 bytes. This makes behavior more
   predictable and makes other frontend optimizations easier.

2) Make the L(more_8x_vec) cases 4k aliasing aware. This can have
   significant benefits in the case that:
        0 < (dst - src) < [256, 512]

3) Align before `rep movsb`. For ERMS this is roughly a [0, 30%]
   improvement and for FSRM [-10%, 25%].
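
To illustrate item 2: 4k aliasing arises when a load and an earlier store
share the same low 12 address bits, so the load can be treated as if it
depended on that store. The C sketch below shows one way such a condition
could be tested. It is not the code from this commit: may_4k_alias, the
PAGE_SIZE constant, and the window parameter are hypothetical, and taking
(dst - src) modulo the page size is an assumption about how the range
above is meant to be read.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed 4 KiB pages: the low 12 address bits are the page offset.  */
#define PAGE_SIZE 4096u

/* Hypothetical helper, not taken from glibc.  Returns true when dst
   sits a small positive distance ahead of src modulo the page size,
   i.e. 0 < (dst - src) mod 4096 < window.  */
static bool
may_4k_alias (uintptr_t dst, uintptr_t src, uintptr_t window)
{
  /* Unsigned subtraction wraps, so the mask also handles dst < src.  */
  uintptr_t offset = (dst - src) & (PAGE_SIZE - 1);
  return offset != 0 && offset < window;
}
```

A copy routine could call this with a window of 256 or 512 (the bounds
quoted above) and choose a different loop when it returns true.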

In addition to these primary changes there is general cleanup
throughout to optimize the aligning routines and control flow logic.
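
The idea behind item 3 can be sketched in C as follows. This is only an
illustration under stated assumptions: bytes_to_align and
copy_aligned_bulk are hypothetical names, the sketch aligns the
destination (the message does not say which pointer is aligned), it
handles non-overlapping buffers only, and plain memcpy stands in for the
`rep movsb` bulk copy.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed alignment target, matching the 64 bytes used for the entry.  */
#define ALIGNMENT 64u

/* Hypothetical helper: number of bytes from p up to the next 64-byte
   boundary (0 if p is already aligned).  */
static size_t
bytes_to_align (const void *p)
{
  return (size_t) ((0 - (uintptr_t) p) & (ALIGNMENT - 1));
}

/* Sketch for non-overlapping buffers only.  Copies an unaligned head
   first so that the bulk copy starts at a 64-byte aligned destination.  */
static void *
copy_aligned_bulk (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;
  size_t head = bytes_to_align (d);

  if (head > n)
    head = n;

  /* Head: at most 63 bytes, brings d + head to a 64-byte boundary.  */
  memcpy (d, s, head);

  /* Bulk: memcpy stands in for the `rep movsb` step.  */
  memcpy (d + head, s + head, n - head);
  return dst;
}
```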

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
sysdeps/x86_64/memmove.S
sysdeps/x86_64/multiarch/memmove-avx-unaligned-erms-rtm.S
sysdeps/x86_64/multiarch/memmove-avx-unaligned-erms.S
sysdeps/x86_64/multiarch/memmove-avx512-unaligned-erms.S
sysdeps/x86_64/multiarch/memmove-evex-unaligned-erms.S
sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S