Improve code generation of strided SLP loads
author    Richard Biener <rguenther@suse.de>
          Mon, 10 Jun 2024 13:31:35 +0000 (15:31 +0200)
committer Richard Biener <rguenther@suse.de>
          Thu, 13 Jun 2024 06:20:58 +0000 (08:20 +0200)
commit    e8f4d525cb320ff11dd95b985d8043fef0510878
tree      113cae310acc34983bf2f98c8bea1a3c47a1acc0
parent    6669dc51515313dd1e60c493596dbc90429fc362

This avoids falling back to elementwise accesses for strided SLP
loads when the group size is not a multiple of the vector element
size.  Instead we can use a smaller vector or integer type for the load.
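As a hedged illustration (the kernel below is hypothetical, not one of the new testcases), a strided load of a 6-element group is exactly this situation on x86: six ints do not divide evenly into 4-lane V4SI vectors, so the load previously fell back to elementwise accesses, whereas it can now use a V4SI access plus a smaller V2SI (or 64-bit integer) access per group:

```c
#include <assert.h>

/* Hypothetical example, not taken from the new testcases: each
   iteration reads a contiguous 6-element group at a runtime stride.
   Group size 6 is not a multiple of the 4 lanes of V4SI, so this
   strided SLP load used to become six scalar loads; after the change
   it can be one V4SI load plus a smaller V2SI/64-bit integer load.  */
void
load6 (int *__restrict dst, const int *src, int stride, int n)
{
  for (int i = 0; i < n; i++)
    for (int j = 0; j < 6; j++)
      dst[6 * i + j] = src[stride * i + j];
}
```

The group here is contiguous within each iteration; only the start of each group is strided, which is what VMAT_STRIDED_SLP models.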

For stores we can do the same, though the restrictions on which stores we
handle, and the fact that store-merging papers over the difference, make
this mostly a cost-modeling improvement.  It shows for
gcc.target/i386/vect-strided-3.c, which we now vectorize with V4SI
vectors rather than just V2SI ones.
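For the store side, a kernel of the following shape illustrates the choice the cost model now has (this is a sketch in the spirit of the strided-store case; the actual gcc.target/i386/vect-strided-3.c may differ):

```c
#include <assert.h>

/* Hedged sketch, not the real vect-strided-3.c: each iteration writes
   a contiguous 4-element group at a runtime stride.  With this change
   the cost model can account for the group as one V4SI store rather
   than only a pair of V2SI stores.  */
void
store4 (int *__restrict dst, const int *src, int stride, int n)
{
  for (int i = 0; i < n; i++)
    for (int j = 0; j < 4; j++)
      dst[stride * i + j] = src[4 * i + j];
}
```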

For all of this there is still the opportunity to use non-uniform
accesses: say, for a 6-element group with a VF of two, doing
V4SI, { V2SI, V2SI }, V4SI.  But that is left for a possible followup.

gcc/ChangeLog:

	* tree-vect-stmts.cc (get_group_load_store_type): Consistently
	use VMAT_STRIDED_SLP for strided SLP accesses and not
	VMAT_ELEMENTWISE.
	(vectorizable_store): Adjust VMAT_STRIDED_SLP handling to
	allow not only half-size but also smaller accesses.
	(vectorizable_load): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/vect-strided-1.c: New testcase.
	* gcc.target/i386/vect-strided-2.c: Likewise.
	* gcc.target/i386/vect-strided-3.c: Likewise.
	* gcc.target/i386/vect-strided-4.c: Likewise.
gcc/testsuite/gcc.target/i386/vect-strided-1.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/vect-strided-2.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/vect-strided-3.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/vect-strided-4.c [new file with mode: 0644]
gcc/tree-vect-stmts.cc