From: Jakub Jelinek
Date: Fri, 14 Apr 2023 07:20:49 +0000 (+0200)
Subject: combine: Fix AND handling for WORD_REGISTER_OPERATIONS targets [PR109040]
X-Git-Tag: basepoints/gcc-14~29
X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=9d1a6119590ef828f9782a7083d03e535bc2f2cf;p=thirdparty%2Fgcc.git

combine: Fix AND handling for WORD_REGISTER_OPERATIONS targets [PR109040]

The following testcase is miscompiled on riscv since the addition
of *mvconst_internal define_insn_and_split.
We have:
(insn 36 35 39 2 (set (mem/c:SI (plus:SI (reg/f:SI 65 frame)
                (const_int -64 [0xffffffffffffffc0])) [2 S4 A128])
        (reg:SI 166)) "pr109040.c":9:11 178 {*movsi_internal}
     (expr_list:REG_DEAD (reg:SI 166)
        (nil)))
(insn 39 36 40 2 (set (reg:SI 171)
        (zero_extend:SI (mem/c:HI (plus:SI (reg/f:SI 65 frame)
                    (const_int -64 [0xffffffffffffffc0])) [0 S2 A128]))) "pr109040.c":9:11 111 {*zero_extendhisi2}
     (nil))
and RTL DSE's replace_read since r0-86337-g18b526e806ab6455 handles
even different modes like in the above case, and so it optimizes it into:
(insn 47 35 39 2 (set (reg:HI 175)
        (subreg:HI (reg:SI 166) 0)) "pr109040.c":9:11 179 {*movhi_internal}
     (expr_list:REG_DEAD (reg:SI 166)
        (nil)))
(insn 39 47 40 2 (set (reg:SI 171)
        (zero_extend:SI (reg:HI 175))) "pr109040.c":9:11 111 {*zero_extendhisi2}
     (expr_list:REG_DEAD (reg:HI 175)
        (nil)))
Pseudo 166 is result of AND with 0x8084c constant (forced into a register).

Combine attempts to combine the AND with the insn 47 above created by DSE,
and turns it because of WORD_REGISTER_OPERATIONS and its assumption that all
the subword operations are actually done on word mode into:
(set (subreg:SI (reg:HI 175) 0)
    (and:SI (reg:SI 167 [ m ])
        (reg:SI 168)))
and later on the ZERO_EXTEND is thrown away.

We then see
(and:SI (subreg:SI (reg:HI 175) 0) (const_int 0x84c))
and optimize that into
(subreg:SI (and:HI (reg:HI 175) (const_int 0x84c)) 0)
which is still fine, in WORD_REGISTER_OPERATIONS the AND in HImode will set
all upper bits up to BITS_PER_WORD to zeros.

But later on simplify_binary_operation_1 or simplify_and_const_int_1
sees that because nonzero_bits ((reg:HI 175), HImode) == 0x84c, we can
optimize the AND into (reg:HI 175). That isn't correct, because while
the low 16 bits of that REG are known to have all bits but 0x84c cleared,
we don't know that all the upper 16 bits are all clear as well.
So, for WORD_REGISTER_OPERATIONS for integral modes smaller than word mode,
we need to check all bits from word_mode in nonzero_bits for the optimizations.

2023-04-14  Jeff Law
	    Jakub Jelinek

	PR target/108947
	PR target/109040
	* combine.cc (simplify_and_const_int_1): Compute nonzero_bits in
	word_mode rather than mode if WORD_REGISTER_OPERATIONS and mode is
	smaller than word_mode.
	* simplify-rtx.cc (simplify_context::simplify_binary_operation_1)
	: Likewise.

	* gcc.dg/pr108947.c: New test.
	* gcc.c-torture/execute/pr109040.c: New test.
---

diff --git a/gcc/combine.cc b/gcc/combine.cc
index 22bf8e1ec898..0106092e4568 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -10055,9 +10055,12 @@ simplify_and_const_int_1 (scalar_int_mode mode, rtx varop,
 
   /* See what bits may be nonzero in VAROP.  Unlike the general case of
      a call to nonzero_bits, here we don't care about bits outside
-     MODE.  */
+     MODE unless WORD_REGISTER_OPERATIONS is true.  */
 
-  nonzero = nonzero_bits (varop, mode) & GET_MODE_MASK (mode);
+  scalar_int_mode tmode = mode;
+  if (WORD_REGISTER_OPERATIONS && GET_MODE_BITSIZE (mode) < BITS_PER_WORD)
+    tmode = word_mode;
+  nonzero = nonzero_bits (varop, tmode) & GET_MODE_MASK (tmode);
 
   /* Turn off all bits in the constant that are known to already be zero.
      Thus, if the AND isn't needed at all, we will have CONSTOP == NONZERO_BITS
@@ -10071,7 +10074,7 @@ simplify_and_const_int_1 (scalar_int_mode mode, rtx varop,
 
   /* If VAROP is a NEG of something known to be zero or 1 and CONSTOP is
      a power of two, we can replace this with an ASHIFT.  */
-  if (GET_CODE (varop) == NEG && nonzero_bits (XEXP (varop, 0), mode) == 1
+  if (GET_CODE (varop) == NEG && nonzero_bits (XEXP (varop, 0), tmode) == 1
       && (i = exact_log2 (constop)) >= 0)
     return simplify_shift_const (NULL_RTX, ASHIFT, mode, XEXP (varop, 0), i);
 
diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc
index 3b33afa24617..ee75079917f8 100644
--- a/gcc/simplify-rtx.cc
+++ b/gcc/simplify-rtx.cc
@@ -3752,7 +3752,13 @@ simplify_context::simplify_binary_operation_1 (rtx_code code,
         return op0;
       if (HWI_COMPUTABLE_MODE_P (mode))
         {
-          HOST_WIDE_INT nzop0 = nonzero_bits (trueop0, mode);
+          /* When WORD_REGISTER_OPERATIONS is true, we need to know the
+             nonzero bits in WORD_MODE rather than MODE.  */
+          scalar_int_mode tmode = as_a <scalar_int_mode> (mode);
+          if (WORD_REGISTER_OPERATIONS
+              && GET_MODE_BITSIZE (tmode) < BITS_PER_WORD)
+            tmode = word_mode;
+          HOST_WIDE_INT nzop0 = nonzero_bits (trueop0, tmode);
           HOST_WIDE_INT nzop1;
           if (CONST_INT_P (trueop1))
             {
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr109040.c b/gcc/testsuite/gcc.c-torture/execute/pr109040.c
new file mode 100644
index 000000000000..b0dedd50e790
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr109040.c
@@ -0,0 +1,23 @@
+/* PR target/109040 */
+
+typedef unsigned short __attribute__((__vector_size__ (32))) V;
+
+unsigned short a, b, c, d;
+
+void
+foo (V m, unsigned short *ret)
+{
+  V v = 6 > ((V) { 2124, 8 } & m);
+  unsigned short uc = v[0] + a + b + c + d;
+  *ret = uc;
+}
+
+int
+main ()
+{
+  unsigned short x;
+  foo ((V) { 0, 15 }, &x);
+  if (x != (unsigned short) ~0)
+    __builtin_abort ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/pr108947.c b/gcc/testsuite/gcc.dg/pr108947.c
new file mode 100644
index 000000000000..2fe2f5c6e576
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr108947.c
@@ -0,0 +1,21 @@
+/* PR target/108947 */
+/* { dg-do run } */
+/* { dg-options "-O2 -fno-forward-propagate -Wno-psabi" } */
+
+typedef unsigned short __attribute__((__vector_size__ (2 * sizeof (short)))) V;
+
+__attribute__((__noipa__)) V
+foo (V v)
+{
+  V w = 3 > (v & 3992);
+  return w;
+}
+
+int
+main ()
+{
+  V w = foo ((V) { 0, 9 });
+  if (w[0] != 0xffff || w[1] != 0)
+    __builtin_abort ();
+  return 0;
+}