gcc: Add vec_select -> subreg RTL simplification
author Jonathan Wright <jonathan.wright@arm.com>
Wed, 2 Jun 2021 15:55:00 +0000 (16:55 +0100)
committer Jonathan Wright <jonathan.wright@arm.com>
Tue, 13 Jul 2021 20:02:58 +0000 (21:02 +0100)
commit 8695bf78dad1a42636775843ca832a2f4dba4da3
tree ef451d228433838626da5180c03eee39f86b1bfc
parent 60aee15bb7ed57d70face854834468b8b9a3ec39

Add a new RTL simplification for the case of a VEC_SELECT selecting
the low part of a vector. The simplification returns a SUBREG.
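
As an illustration of why the two forms are interchangeable, here is a
small standalone C model. It is not GCC code: the names select_lanes,
low_part and select_matches_low_part are hypothetical, the vector is a
plain array of four 16-bit lanes, and lane 0 is assumed to sit at the
lowest address (a little-endian lane layout).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

enum { NUNITS = 4 };   /* lanes in the toy source vector */

/* Toy vec_select: copy the lanes named by SEL out of VEC.  */
static void
select_lanes (const uint16_t *vec, const int *sel, int len, uint16_t *out)
{
  for (int i = 0; i < len; i++)
    {
      assert (sel[i] >= 0 && sel[i] < NUNITS);
      out[i] = vec[sel[i]];
    }
}

/* Toy low-part subreg: take the first LEN lanes exactly as they are
   laid out, assuming lane 0 sits at the lowest address.  */
static void
low_part (const uint16_t *vec, int len, uint16_t *out)
{
  memcpy (out, vec, (size_t) len * sizeof *vec);
}

/* True if selecting SEL yields the same bits as the low part.  */
bool
select_matches_low_part (const int *sel, int len)
{
  static const uint16_t vec[NUNITS] = { 0x1111, 0x2222, 0x3333, 0x4444 };
  uint16_t selected[NUNITS], low[NUNITS];

  assert (len > 0 && len <= NUNITS);
  select_lanes (vec, sel, len, selected);
  low_part (vec, len, low);
  return memcmp (selected, low, (size_t) len * sizeof *vec) == 0;
}
```

In this model only a selection of the leading lanes in ascending order,
such as { 0, 1 }, matches the low part. Selections such as { 2, 3 } or
{ 1, 0 } do not, so they would not qualify for the rewrite.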

The primary goal of this patch is to enable better combinations of
Neon RTL patterns - specifically allowing generation of 'write-to-
high-half' narrowing instructions.

Adding this RTL simplification means that the expected results for a
number of tests need to be updated:
* aarch64 Neon: Update the scan-assembler regex for intrinsics tests
  to expect a scalar register instead of lane 0 of a vector.
* aarch64 SVE: Likewise.
* arm MVE: Use lane 1 instead of lane 0 for lane-extraction
  intrinsics tests (as the move instructions get optimized away for
  lane 0).

This patch also adds new code generation tests to
narrow_high_combine.c to verify the benefit of this RTL
simplification.

gcc/ChangeLog:

2021-06-08  Jonathan Wright  <jonathan.wright@arm.com>

* combine.c (combine_simplify_rtx): Add vec_select -> subreg
simplification.
* config/aarch64/aarch64.md (*zero_extend<SHORT:mode><GPI:mode>2_aarch64):
Add Neon to general purpose register case for zero-extend
pattern.
* config/arm/vfp.md (*arm_movsi_vfp): Remove "*" from *t -> r
case to prevent some cases opting to go through memory.
* cse.c (fold_rtx): Add vec_select -> subreg simplification.
* rtl.c (rtvec_series_p): Define predicate to determine
whether a vector contains a linear series of integers.
* rtl.h (rtvec_series_p): Define.
* rtlanal.c (vec_series_lowpart_p): Define predicate to
determine if a vector selection is equivalent to the low part
of the vector.
* rtlanal.h (vec_series_lowpart_p): Define.
* simplify-rtx.c (simplify_context::simplify_binary_operation_1):
Add vec_select -> subreg simplification.
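
The two predicates named above (rtvec_series_p and
vec_series_lowpart_p) can be pictured with the following minimal
sketch. It is a simplified model on plain integer arrays, not the code
added by this patch: the names series_p and lowpart_selection_p and
their parameters are hypothetical, and the big-endian offset is an
assumption that the low part corresponds to the highest-numbered lanes
on such targets. The real predicates operate on GCC's rtvec and
machine_mode types and may apply further checks that are omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical series check: true if SEL[0..LEN) is exactly
   START, START + 1, ..., START + LEN - 1.  */
static bool
series_p (const int *sel, size_t len, int start)
{
  assert (sel != NULL);
  for (size_t i = 0; i < len; i++)
    if (sel[i] != start + (int) i)
      return false;
  return true;
}

/* Hypothetical low-part check: true if selecting the LEN lanes in SEL
   from a vector of NUNITS lanes picks out its low part.  The series
   starts at lane 0 on little-endian; on big-endian it is assumed to
   start at NUNITS - LEN.  */
bool
lowpart_selection_p (const int *sel, size_t len, size_t nunits,
                     bool big_endian)
{
  if (len == 0 || len > nunits)
    return false;

  int offset = big_endian ? (int) (nunits - len) : 0;
  return series_p (sel, len, offset);
}
```

For a four-lane vector, the selection { 0, 1 } passes the little-endian
check, { 2, 3 } passes only the assumed big-endian one, and a
non-contiguous selection such as { 0, 2 } passes neither.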

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/extract_zero_extend.c: Remove dump scan
for RTL pattern match.
* gcc.target/aarch64/narrow_high_combine.c: Add new tests.
* gcc.target/aarch64/simd/vmulx_laneq_f64_1.c: Update
scan-assembler regex to look for a scalar register instead of
lane 0 of a vector.
* gcc.target/aarch64/simd/vmulxd_laneq_f64_1.c: Likewise.
* gcc.target/aarch64/simd/vmulxs_lane_f32_1.c: Likewise.
* gcc.target/aarch64/simd/vmulxs_laneq_f32_1.c: Likewise.
* gcc.target/aarch64/simd/vqdmlalh_lane_s16.c: Likewise.
* gcc.target/aarch64/simd/vqdmlals_lane_s32.c: Likewise.
* gcc.target/aarch64/simd/vqdmlslh_lane_s16.c: Likewise.
* gcc.target/aarch64/simd/vqdmlsls_lane_s32.c: Likewise.
* gcc.target/aarch64/simd/vqdmullh_lane_s16.c: Likewise.
* gcc.target/aarch64/simd/vqdmullh_laneq_s16.c: Likewise.
* gcc.target/aarch64/simd/vqdmulls_lane_s32.c: Likewise.
* gcc.target/aarch64/simd/vqdmulls_laneq_s32.c: Likewise.
* gcc.target/aarch64/sve/dup_lane_1.c: Likewise.
* gcc.target/aarch64/sve/extract_1.c: Likewise.
* gcc.target/aarch64/sve/extract_2.c: Likewise.
* gcc.target/aarch64/sve/extract_3.c: Likewise.
* gcc.target/aarch64/sve/extract_4.c: Likewise.
* gcc.target/aarch64/sve/live_1.c: Update scan-assembler regex
cases to look for 'b' and 'h' registers instead of 'w'.
* gcc.target/arm/crypto-vsha1cq_u32.c: Update scan-assembler
regex to reflect lane 0 vector extractions being simplified
to scalar register moves.
* gcc.target/arm/crypto-vsha1h_u32.c: Likewise.
* gcc.target/arm/crypto-vsha1mq_u32.c: Likewise.
* gcc.target/arm/crypto-vsha1pq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vgetq_lane_f16.c: Extract
lane 1 as the moves for lane 0 now get optimized away.
* gcc.target/arm/mve/intrinsics/vgetq_lane_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vgetq_lane_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vgetq_lane_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vgetq_lane_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vgetq_lane_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vgetq_lane_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vgetq_lane_u8.c: Likewise.
41 files changed:
gcc/combine.c
gcc/config/aarch64/aarch64.md
gcc/config/arm/vfp.md
gcc/cse.c
gcc/rtl.c
gcc/rtl.h
gcc/rtlanal.c
gcc/rtlanal.h
gcc/simplify-rtx.c
gcc/testsuite/gcc.target/aarch64/extract_zero_extend.c
gcc/testsuite/gcc.target/aarch64/narrow_high_combine.c
gcc/testsuite/gcc.target/aarch64/simd/vmulx_laneq_f64_1.c
gcc/testsuite/gcc.target/aarch64/simd/vmulxd_laneq_f64_1.c
gcc/testsuite/gcc.target/aarch64/simd/vmulxs_lane_f32_1.c
gcc/testsuite/gcc.target/aarch64/simd/vmulxs_laneq_f32_1.c
gcc/testsuite/gcc.target/aarch64/simd/vqdmlalh_lane_s16.c
gcc/testsuite/gcc.target/aarch64/simd/vqdmlals_lane_s32.c
gcc/testsuite/gcc.target/aarch64/simd/vqdmlslh_lane_s16.c
gcc/testsuite/gcc.target/aarch64/simd/vqdmlsls_lane_s32.c
gcc/testsuite/gcc.target/aarch64/simd/vqdmullh_lane_s16.c
gcc/testsuite/gcc.target/aarch64/simd/vqdmullh_laneq_s16.c
gcc/testsuite/gcc.target/aarch64/simd/vqdmulls_lane_s32.c
gcc/testsuite/gcc.target/aarch64/simd/vqdmulls_laneq_s32.c
gcc/testsuite/gcc.target/aarch64/sve/dup_lane_1.c
gcc/testsuite/gcc.target/aarch64/sve/extract_1.c
gcc/testsuite/gcc.target/aarch64/sve/extract_2.c
gcc/testsuite/gcc.target/aarch64/sve/extract_3.c
gcc/testsuite/gcc.target/aarch64/sve/extract_4.c
gcc/testsuite/gcc.target/aarch64/sve/live_1.c
gcc/testsuite/gcc.target/arm/crypto-vsha1cq_u32.c
gcc/testsuite/gcc.target/arm/crypto-vsha1h_u32.c
gcc/testsuite/gcc.target/arm/crypto-vsha1mq_u32.c
gcc/testsuite/gcc.target/arm/crypto-vsha1pq_u32.c
gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_f16.c
gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_f32.c
gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_s16.c
gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_s32.c
gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_s8.c
gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_u16.c
gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_u32.c
gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_u8.c