From: Richard Sandiford
Date: Tue, 9 May 2023 06:40:41 +0000 (+0100)
Subject: ira: Don't create copies for earlyclobbered pairs
X-Git-Tag: basepoints/gcc-15~9547
X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=ba72a8d85180d0f4dbcea6eb3458ce175ce190b4;p=thirdparty%2Fgcc.git

ira: Don't create copies for earlyclobbered pairs

This patch follows on from g:9f635bd13fe9e85872e441b6f3618947f989909a
("the previous patch").  To start by quoting that:

  If an insn requires two operands to be tied, and the input operand dies
  in the insn, IRA acts as though there were a copy from the input to the
  output with the same execution frequency as the insn.  Allocating the
  same register to the input and the output then saves the cost of a move.

  If there is no such tie, but an input operand nevertheless dies in the
  insn, IRA creates a similar move, but with an eighth of the frequency.

  This helps to ensure that chains of instructions reuse registers in a
  natural way, rather than using arbitrarily different registers for no
  reason.

  This heuristic seems to work well in the vast majority of cases.

However, the problem fixed in the previous patch was that we could
create a copy for an operand pair even if, for all relevant
alternatives, the output and input register classes did not have any
registers in common.  It is then impossible for the output operand to
reuse the dying input register.

This left unfixed a further case where copies don't make sense: there
is no point trying to reuse the dying input register if, for all
relevant alternatives, the output is earlyclobbered and the input
doesn't match the output.  (Matched earlyclobbers are fine.)

Handling that case fixes several existing XFAILs and helps with a
follow-on aarch64 patch.

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  A SPEC2017 run on
aarch64 showed no differences outside the noise.
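The decision the patch changes can be sketched as a small standalone model.  The struct and function names below are hypothetical simplifications; the real logic in can_use_same_reg_p in gcc/ira-conflicts.cc walks the insn's operand_alternative entries and consults ira_reg_class_intersect:

```cpp
#include <vector>

// Hypothetical, simplified view of one constraint alternative for a
// given (output, input) operand pair.
struct alt_info
{
  int input_matches;	    // operand the input must match, or -1 if none
  bool output_earlyclobber; // output is written before inputs are read
  bool classes_intersect;   // input/output register classes overlap
};

// Model of the check: is it worth recording a copy, i.e. can some
// alternative let the output reuse the dying input's register?
static bool
can_use_same_reg (const std::vector<alt_info> &alts, int output)
{
  for (const alt_info &alt : alts)
    {
      // A matched earlyclobber ties the input to the output, so
      // register reuse is required rather than just possible.
      if (alt.input_matches == output)
	return true;

      // The new check from this patch: a non-matching earlyclobbered
      // output can never share the input's register.
      if (alt.output_earlyclobber)
	continue;

      // The previous patch's check: the register classes must have at
      // least one register in common.
      if (alt.classes_intersect)
	return true;
    }
  return false;
}
```

For example, an insn whose only alternative has a non-matching earlyclobbered output would now get no copy, even if the classes intersect, while a matched earlyclobber still does.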
Also, I tried compiling gcc.c-torture, gcc.dg, and g++.dg for at least
one target per cpu directory, using the options -Os
-fno-schedule-insns{,2}.  The results below summarise the tests that
showed a difference in LOC:

  Target               Tests   Good    Bad   Delta    Best   Worst  Median
  ======               =====   ====    ===   =====    ====   =====  ======
  amdgcn-amdhsa           14      7      7       3     -18      10      -1
  arm-linux-gnueabihf     16     15      1     -22      -4       2      -1
  csky-elf                 6      6      0     -21      -6      -2      -4
  hppa64-hp-hpux11.23      5      5      0      -7      -2      -1      -1
  ia64-linux-gnu          16     16      0     -70     -15      -1      -3
  m32r-elf                53      1     52      64      -2       8       1
  mcore-elf                2      2      0      -8      -6      -2      -6
  microblaze-elf         285    283      2    -909     -68       4      -1
  mmix                     7      7      0   -2101   -2091      -1      -1
  msp430-elf               1      1      0      -4      -4      -4      -4
  pru-elf                  8      6      2     -12      -6       2      -2
  rx-elf                  22     18      4     -40      -5       6      -2
  sparc-linux-gnu         15     14      1     -40      -8       1      -2
  sparc-wrs-vxworks       15     14      1     -40      -8       1      -2
  visium-elf               2      1      1       0      -2       2      -2
  xstormy16-elf            1      1      0      -2      -2      -2      -2

with other targets showing no sensitivity to the patch.  The only
target that seems to be negatively affected is m32r-elf; otherwise the
patch seems like an extremely minor but still clear improvement.

gcc/
	* ira-conflicts.cc (can_use_same_reg_p): Skip over non-matching
	earlyclobbers.

gcc/testsuite/
	* gcc.target/aarch64/sve/acle/asm/asr_wide_s16.c: Remove XFAILs.
	* gcc.target/aarch64/sve/acle/asm/asr_wide_s32.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/asr_wide_s8.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/bic_s32.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/bic_s64.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/bic_u32.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/bic_u64.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/lsl_wide_s16.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/lsl_wide_s32.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/lsl_wide_s8.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/lsl_wide_u16.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/lsl_wide_u32.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/lsl_wide_u8.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/lsr_wide_u16.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/lsr_wide_u32.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/lsr_wide_u8.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/scale_f32.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/scale_f64.c: Likewise.
---

diff --git a/gcc/ira-conflicts.cc b/gcc/ira-conflicts.cc
index 5aa080af4213..a4d93c8d7342 100644
--- a/gcc/ira-conflicts.cc
+++ b/gcc/ira-conflicts.cc
@@ -398,6 +398,9 @@ can_use_same_reg_p (rtx_insn *insn, int output, int input)
       if (op_alt[input].matches == output)
 	return true;
 
+      if (op_alt[output].earlyclobber)
+	continue;
+
       if (ira_reg_class_intersect[op_alt[input].cl][op_alt[output].cl]
 	  != NO_REGS)
 	return true;
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/asr_wide_s16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/asr_wide_s16.c
index b74ae33e100f..e40865fcbc4f 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/asr_wide_s16.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/asr_wide_s16.c
@@ -153,7 +153,7 @@ TEST_UNIFORM_ZX (asr_wide_x0_s16_z_tied1, svint16_t, uint64_t,
 		 z0 = svasr_wide_z (p0, z0, x0))
 
 /*
-** asr_wide_x0_s16_z_untied: { xfail *-*-* }
+** asr_wide_x0_s16_z_untied:
 ** mov (z[0-9]+\.d), x0
 ** movprfx z0\.h, p0/z, z1\.h
 ** asr z0\.h, p0/m, z0\.h, \1
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/asr_wide_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/asr_wide_s32.c
index 8698aef26c64..06e4ca2a030e 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/asr_wide_s32.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/asr_wide_s32.c
@@ -153,7 +153,7 @@ TEST_UNIFORM_ZX (asr_wide_x0_s32_z_tied1, svint32_t, uint64_t,
 		 z0 = svasr_wide_z (p0, z0, x0))
 
 /*
-** asr_wide_x0_s32_z_untied: { xfail *-*-* }
+** asr_wide_x0_s32_z_untied:
 ** mov (z[0-9]+\.d), x0
 ** movprfx z0\.s, p0/z, z1\.s
 ** asr z0\.s, p0/m, z0\.s, \1
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/asr_wide_s8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/asr_wide_s8.c
index 77b1669392da..1f840ca8e57e 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/asr_wide_s8.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/asr_wide_s8.c
@@ -153,7 +153,7 @@ TEST_UNIFORM_ZX (asr_wide_x0_s8_z_tied1, svint8_t, uint64_t,
 		 z0 = svasr_wide_z (p0, z0, x0))
 
 /*
-** asr_wide_x0_s8_z_untied: { xfail *-*-* }
+** asr_wide_x0_s8_z_untied:
 ** mov (z[0-9]+\.d), x0
 ** movprfx z0\.b, p0/z, z1\.b
 ** asr z0\.b, p0/m, z0\.b, \1
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/bic_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/bic_s32.c
index 9e388e499b84..e02c66947d6c 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/bic_s32.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/bic_s32.c
@@ -127,7 +127,7 @@ TEST_UNIFORM_ZX (bic_w0_s32_z_tied1, svint32_t, int32_t,
 		 z0 = svbic_z (p0, z0, x0))
 
 /*
-** bic_w0_s32_z_untied: { xfail *-*-* }
+** bic_w0_s32_z_untied:
 ** mov (z[0-9]+\.s), w0
 ** movprfx z0\.s, p0/z, z1\.s
 ** bic z0\.s, p0/m, z0\.s, \1
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/bic_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/bic_s64.c
index bf9536815472..57c1e535fea3 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/bic_s64.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/bic_s64.c
@@ -127,7 +127,7 @@ TEST_UNIFORM_ZX (bic_x0_s64_z_tied1, svint64_t, int64_t,
 		 z0 = svbic_z (p0, z0, x0))
 
 /*
-** bic_x0_s64_z_untied: { xfail *-*-* }
+** bic_x0_s64_z_untied:
 ** mov (z[0-9]+\.d), x0
 ** movprfx z0\.d, p0/z, z1\.d
 ** bic z0\.d, p0/m, z0\.d, \1
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/bic_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/bic_u32.c
index b308b599b434..9f08ab40a8c5 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/bic_u32.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/bic_u32.c
@@ -127,7 +127,7 @@ TEST_UNIFORM_ZX (bic_w0_u32_z_tied1, svuint32_t, uint32_t,
 		 z0 = svbic_z (p0, z0, x0))
 
 /*
-** bic_w0_u32_z_untied: { xfail *-*-* }
+** bic_w0_u32_z_untied:
 ** mov (z[0-9]+\.s), w0
 ** movprfx z0\.s, p0/z, z1\.s
 ** bic z0\.s, p0/m, z0\.s, \1
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/bic_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/bic_u64.c
index e82db1e94fd6..de84f3af6ff4 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/bic_u64.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/bic_u64.c
@@ -127,7 +127,7 @@ TEST_UNIFORM_ZX (bic_x0_u64_z_tied1, svuint64_t, uint64_t,
 		 z0 = svbic_z (p0, z0, x0))
 
 /*
-** bic_x0_u64_z_untied: { xfail *-*-* }
+** bic_x0_u64_z_untied:
 ** mov (z[0-9]+\.d), x0
 ** movprfx z0\.d, p0/z, z1\.d
 ** bic z0\.d, p0/m, z0\.d, \1
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_s16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_s16.c
index 8d63d3909848..a0207726144b 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_s16.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_s16.c
@@ -155,7 +155,7 @@ TEST_UNIFORM_ZX (lsl_wide_x0_s16_z_tied1, svint16_t, uint64_t,
 		 z0 = svlsl_wide_z (p0, z0, x0))
 
 /*
-** lsl_wide_x0_s16_z_untied: { xfail *-*-* }
+** lsl_wide_x0_s16_z_untied:
 ** mov (z[0-9]+\.d), x0
 ** movprfx z0\.h, p0/z, z1\.h
 ** lsl z0\.h, p0/m, z0\.h, \1
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_s32.c
index acd813df34f4..bd67b7006b5c 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_s32.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_s32.c
@@ -155,7 +155,7 @@ TEST_UNIFORM_ZX (lsl_wide_x0_s32_z_tied1, svint32_t, uint64_t,
 		 z0 = svlsl_wide_z (p0, z0, x0))
 
 /*
-** lsl_wide_x0_s32_z_untied: { xfail *-*-* }
+** lsl_wide_x0_s32_z_untied:
 ** mov (z[0-9]+\.d), x0
 ** movprfx z0\.s, p0/z, z1\.s
 ** lsl z0\.s, p0/m, z0\.s, \1
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_s8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_s8.c
index 17e8e8685e3f..7eb8627041d9 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_s8.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_s8.c
@@ -155,7 +155,7 @@ TEST_UNIFORM_ZX (lsl_wide_x0_s8_z_tied1, svint8_t, uint64_t,
 		 z0 = svlsl_wide_z (p0, z0, x0))
 
 /*
-** lsl_wide_x0_s8_z_untied: { xfail *-*-* }
+** lsl_wide_x0_s8_z_untied:
 ** mov (z[0-9]+\.d), x0
 ** movprfx z0\.b, p0/z, z1\.b
 ** lsl z0\.b, p0/m, z0\.b, \1
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_u16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_u16.c
index cff24a85090b..482f8d0557ba 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_u16.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_u16.c
@@ -155,7 +155,7 @@ TEST_UNIFORM_ZX (lsl_wide_x0_u16_z_tied1, svuint16_t, uint64_t,
 		 z0 = svlsl_wide_z (p0, z0, x0))
 
 /*
-** lsl_wide_x0_u16_z_untied: { xfail *-*-* }
+** lsl_wide_x0_u16_z_untied:
 ** mov (z[0-9]+\.d), x0
 ** movprfx z0\.h, p0/z, z1\.h
 ** lsl z0\.h, p0/m, z0\.h, \1
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_u32.c
index 7b1afab4918b..612897d24dfd 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_u32.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_u32.c
@@ -155,7 +155,7 @@ TEST_UNIFORM_ZX (lsl_wide_x0_u32_z_tied1, svuint32_t, uint64_t,
 		 z0 = svlsl_wide_z (p0, z0, x0))
 
 /*
-** lsl_wide_x0_u32_z_untied: { xfail *-*-* }
+** lsl_wide_x0_u32_z_untied:
 ** mov (z[0-9]+\.d), x0
 ** movprfx z0\.s, p0/z, z1\.s
 ** lsl z0\.s, p0/m, z0\.s, \1
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_u8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_u8.c
index df8b1ec86b49..6ca2f9e7da22 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_u8.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsl_wide_u8.c
@@ -155,7 +155,7 @@ TEST_UNIFORM_ZX (lsl_wide_x0_u8_z_tied1, svuint8_t, uint64_t,
 		 z0 = svlsl_wide_z (p0, z0, x0))
 
 /*
-** lsl_wide_x0_u8_z_untied: { xfail *-*-* }
+** lsl_wide_x0_u8_z_untied:
 ** mov (z[0-9]+\.d), x0
 ** movprfx z0\.b, p0/z, z1\.b
 ** lsl z0\.b, p0/m, z0\.b, \1
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsr_wide_u16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsr_wide_u16.c
index 863b51a2fc52..9110c5aad446 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsr_wide_u16.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsr_wide_u16.c
@@ -153,7 +153,7 @@ TEST_UNIFORM_ZX (lsr_wide_x0_u16_z_tied1, svuint16_t, uint64_t,
 		 z0 = svlsr_wide_z (p0, z0, x0))
 
 /*
-** lsr_wide_x0_u16_z_untied: { xfail *-*-* }
+** lsr_wide_x0_u16_z_untied:
 ** mov (z[0-9]+\.d), x0
 ** movprfx z0\.h, p0/z, z1\.h
 ** lsr z0\.h, p0/m, z0\.h, \1
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsr_wide_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsr_wide_u32.c
index 73c2cf86e330..93af4fa49256 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsr_wide_u32.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsr_wide_u32.c
@@ -153,7 +153,7 @@ TEST_UNIFORM_ZX (lsr_wide_x0_u32_z_tied1, svuint32_t, uint64_t,
 		 z0 = svlsr_wide_z (p0, z0, x0))
 
 /*
-** lsr_wide_x0_u32_z_untied: { xfail *-*-* }
+** lsr_wide_x0_u32_z_untied:
 ** mov (z[0-9]+\.d), x0
 ** movprfx z0\.s, p0/z, z1\.s
 ** lsr z0\.s, p0/m, z0\.s, \1
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsr_wide_u8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsr_wide_u8.c
index fe44eabda11d..2f38139d40be 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsr_wide_u8.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lsr_wide_u8.c
@@ -153,7 +153,7 @@ TEST_UNIFORM_ZX (lsr_wide_x0_u8_z_tied1, svuint8_t, uint64_t,
 		 z0 = svlsr_wide_z (p0, z0, x0))
 
 /*
-** lsr_wide_x0_u8_z_untied: { xfail *-*-* }
+** lsr_wide_x0_u8_z_untied:
 ** mov (z[0-9]+\.d), x0
 ** movprfx z0\.b, p0/z, z1\.b
 ** lsr z0\.b, p0/m, z0\.b, \1
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/scale_f32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/scale_f32.c
index 747f8a6397bc..12a1b1d8686b 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/scale_f32.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/scale_f32.c
@@ -127,7 +127,7 @@ TEST_UNIFORM_ZX (scale_w0_f32_z_tied1, svfloat32_t, int32_t,
 		 z0 = svscale_z (p0, z0, x0))
 
 /*
-** scale_w0_f32_z_untied: { xfail *-*-* }
+** scale_w0_f32_z_untied:
 ** mov (z[0-9]+\.s), w0
 ** movprfx z0\.s, p0/z, z1\.s
 ** fscale z0\.s, p0/m, z0\.s, \1
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/scale_f64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/scale_f64.c
index 004cbfa3eff3..f6b117185848 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/scale_f64.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/scale_f64.c
@@ -127,7 +127,7 @@ TEST_UNIFORM_ZX (scale_x0_f64_z_tied1, svfloat64_t, int64_t,
 		 z0 = svscale_z (p0, z0, x0))
 
 /*
-** scale_x0_f64_z_untied: { xfail *-*-* }
+** scale_x0_f64_z_untied:
 ** mov (z[0-9]+\.d), x0
 ** movprfx z0\.d, p0/z, z1\.d
 ** fscale z0\.d, p0/m, z0\.d, \1