aarch64: Extend VECT_COMPARE_COSTS to !SVE [PR113104]
author    Richard Sandiford <richard.sandiford@arm.com>
          Fri, 5 Jan 2024 16:25:16 +0000 (16:25 +0000)
committer Richard Sandiford <richard.sandiford@arm.com>
          Fri, 5 Jan 2024 16:25:16 +0000 (16:25 +0000)
commit    7328faf89e9b4953baaff10e18262c70fbd3e578
tree      b62615ed724ebbc27e479c2953408d922a0cfe2a
parent    d4cd871d15b813caa4b9016f34ebbda3277da4f8
aarch64: Extend VECT_COMPARE_COSTS to !SVE [PR113104]

When SVE is enabled, we try vectorising with multiple different SVE and
Advanced SIMD approaches and use the cost model to pick the best one.
Until now, we've not done that for Advanced SIMD, since "the first mode
that works should always be the best".

The testcase is a counterexample.  Each iteration of the scalar loop
vectorises naturally with 64-bit input vectors and 128-bit output
vectors.  We do try that for SVE, and choose it as the best approach.
But the first approach we try is instead to use:

- a vectorisation factor of 2
- 1 128-bit vector for the inputs
- 2 128-bit vectors for the outputs

But since the stride is variable, the cost of marshalling the input
vector from two iterations outweighs the benefit of doing two iterations
at once.
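
To make the scenario concrete, a loop of the shape described might look like the sketch below. This is a hypothetical illustration, not the actual pr113104.c testcase: each iteration consumes 64 bits of narrow input and produces 128 bits of widened output, and the stride between iterations is only known at run time, so packing two iterations' inputs into one 128-bit vector requires extra marshalling code.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical sketch of the kind of loop described above (not the
   pr113104.c testcase).  Each outer iteration loads 4 x int16_t
   (64 bits) and stores 4 x int32_t (128 bits).  Because STRIDE is a
   runtime value, combining two iterations (VF=2) means gathering two
   64-bit chunks from unrelated addresses into one 128-bit input
   vector, which is what makes the VF=2 approach more expensive.  */
void
widen_strided (int32_t *restrict dst, const int16_t *restrict src,
               ptrdiff_t stride, size_t n)
{
  for (size_t i = 0; i < n; i++)
    for (int j = 0; j < 4; j++)
      dst[i * 4 + j] = src[i * stride + j] + 1;
}
```

Vectorised one iteration at a time, this needs only a 64-bit contiguous load, a widening add, and a 128-bit store per iteration, which is the approach the SVE path already tries and that the cost model prefers.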

This patch therefore generalises aarch64-sve-compare-costs to
aarch64-vect-compare-costs and applies it to non-SVE compilations.
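
After this change, the parameter can be controlled the same way the old SVE-only one was; the invocations below are illustrative (the cross-compiler name is an assumption, not part of the patch):

```
# Compare the costs of all candidate vector modes before picking one
# (the behaviour this patch extends to Advanced SIMD-only compilation):
aarch64-linux-gnu-gcc -O3 --param=aarch64-vect-compare-costs=1 test.c

# Take the first vector mode that works, without cost comparison:
aarch64-linux-gnu-gcc -O3 --param=aarch64-vect-compare-costs=0 test.c
```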

gcc/
PR target/113104
* doc/invoke.texi (aarch64-sve-compare-costs): Replace with...
(aarch64-vect-compare-costs): ...this.
* config/aarch64/aarch64.opt (-param=aarch64-sve-compare-costs=):
Replace with...
(-param=aarch64-vect-compare-costs=): ...this new param.
* config/aarch64/aarch64.cc (aarch64_override_options_internal):
Don't disable it when vectorizing for Advanced SIMD only.
(aarch64_autovectorize_vector_modes): Apply VECT_COMPARE_COSTS
whenever aarch64_vect_compare_costs is true.

gcc/testsuite/
PR target/113104
* gcc.target/aarch64/pr113104.c: New test.
* gcc.target/aarch64/sve/cond_arith_1.c: Update for new parameter
names.
* gcc.target/aarch64/sve/cond_arith_1_run.c: Likewise.
* gcc.target/aarch64/sve/cond_arith_3.c: Likewise.
* gcc.target/aarch64/sve/cond_arith_3_run.c: Likewise.
* gcc.target/aarch64/sve/gather_load_6.c: Likewise.
* gcc.target/aarch64/sve/gather_load_7.c: Likewise.
* gcc.target/aarch64/sve/load_const_offset_2.c: Likewise.
* gcc.target/aarch64/sve/load_const_offset_3.c: Likewise.
* gcc.target/aarch64/sve/mask_gather_load_6.c: Likewise.
* gcc.target/aarch64/sve/mask_gather_load_7.c: Likewise.
* gcc.target/aarch64/sve/mask_load_slp_1.c: Likewise.
* gcc.target/aarch64/sve/mask_struct_load_1.c: Likewise.
* gcc.target/aarch64/sve/mask_struct_load_2.c: Likewise.
* gcc.target/aarch64/sve/mask_struct_load_3.c: Likewise.
* gcc.target/aarch64/sve/mask_struct_load_4.c: Likewise.
* gcc.target/aarch64/sve/mask_struct_store_1.c: Likewise.
* gcc.target/aarch64/sve/mask_struct_store_1_run.c: Likewise.
* gcc.target/aarch64/sve/mask_struct_store_2.c: Likewise.
* gcc.target/aarch64/sve/mask_struct_store_2_run.c: Likewise.
* gcc.target/aarch64/sve/pack_1.c: Likewise.
* gcc.target/aarch64/sve/reduc_4.c: Likewise.
* gcc.target/aarch64/sve/scatter_store_6.c: Likewise.
* gcc.target/aarch64/sve/scatter_store_7.c: Likewise.
* gcc.target/aarch64/sve/strided_load_3.c: Likewise.
* gcc.target/aarch64/sve/strided_store_3.c: Likewise.
* gcc.target/aarch64/sve/unpack_fcvt_signed_1.c: Likewise.
* gcc.target/aarch64/sve/unpack_fcvt_unsigned_1.c: Likewise.
* gcc.target/aarch64/sve/unpack_signed_1.c: Likewise.
* gcc.target/aarch64/sve/unpack_unsigned_1.c: Likewise.
* gcc.target/aarch64/sve/unpack_unsigned_1_run.c: Likewise.
* gcc.target/aarch64/sve/vcond_11.c: Likewise.
* gcc.target/aarch64/sve/vcond_11_run.c: Likewise.