Here's Shreya's next patch.
In pr58727 we have a case where the tree/gimple optimizers have decided to
"simplify" constants involved in logical ops by turning off as many bits as
they can in the hope that the simplified constant will be easier/smaller to
encode. That "simplified" constant gets passed down into the RTL optimizers
where it can ultimately cause a missed optimization.
Concretely, let's assume we have insns 6, 7, 9 as shown in the combine dump
below:
> Trying 6, 7 -> 9:
> 6: r139:SI=r141:SI&0xfffffffffffffffd
> REG_DEAD r141:SI
> 7: r140:SI=r139:SI&0xffffffffffbfffff
> REG_DEAD r139:SI
> 9: r137:SI=r140:SI|0x2
> REG_DEAD r140:SI
We can obviously see that insn 6 is redundant as the bit we turn off would be
turned on by insn 9. But combine ultimately tries to generate:
> (set (reg:SI 137 [ _3 ])
> (ior:SI (and:SI (reg:SI 141 [ a ])
> (const_int -4194307 [0xffffffffffbffffd]))
> (const_int 2 [0x2])))
That does actually match a pattern on RISC-V, but it's a pattern that generates
two bit-clear insns (or a bit-clear followed by andi and a pattern we'll be
removing someday). But if instead we IOR 0x2 back into the simplified constant
we get:
> (set (reg:SI 137 [ _3 ])
> (ior:SI (and:SI (reg:SI 141 [ a ])
> (const_int -4194305 [0xffffffffffbfffff]))
> (const_int 2 [0x2])))
That doesn't match, but when split by generic code in the combiner we get:
> Successfully matched this instruction:
> (set (reg:SI 140)
> (and:SI (reg:SI 141 [ a ])
> (const_int -4194305 [0xffffffffffbfffff])))
> Successfully matched this instruction:
> (set (reg:SI 137 [ _3 ])
> (ior:SI (reg:SI 140)
> (const_int 2 [0x2])))
Which is bclr+bset/ori, i.e. we dropped one of the logical AND operations.
Bootstrapped and regression tested on x86 and riscv. Regression tested on the
30 or so embedded targets as well without new failures.
I'll give this a couple days for folks to chime in before pushing on Shreya's
behalf. This doesn't fix pr58727 for the other targets as they would need
target dependent hackery.
Jeff
PR tree-optimization/58727
gcc/
* simplify-rtx.cc (simplify_context::simplify_binary_operation_1):
In (A & C1) | C2, if (C1|C2) results in a constant with a single bit
clear, then adjust C1 appropriately.
gcc/testsuite/
* gcc.target/riscv/pr58727.c: New test.
/* If (C1|C2) == ~0 then (X&C1)|C2 becomes X|C2. */
if (((c1|c2) & mask) == mask)
return simplify_gen_binary (IOR, mode, XEXP (op0, 0), op1);
+
+ /* If (C1|C2) has a single bit clear, then adjust C1 so that
+ when split it'll match a single bit clear style insn.
+
+ This could have been done with a target dependent splitter, but
+ then every target with single bit manipulation insns would need
+ to implement such splitters. */
+ if (exact_log2 (~(c1 | c2)) >= 0)
+ {
+ rtx temp = gen_rtx_AND (mode, XEXP (op0, 0), GEN_INT (c1 | c2));
+ temp = gen_rtx_IOR (mode, temp, trueop1);
+ return temp;
+ }
}
/* Convert (A & B) | A to A. */
--- /dev/null
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcb -mabi=lp64d" { target { rv64} } } */
+/* { dg-options "-march=rv32gcb -mabi=ilp32" { target { rv32} } } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */
+
+enum masks { CLEAR = 0x400000, SET = 0x02 };
+unsigned clear_set(unsigned a) { return (a & ~CLEAR) | SET; }
+unsigned set_clear(unsigned a) { return (a | SET) & ~CLEAR; }
+unsigned clear(unsigned a) { return a & ~CLEAR; }
+unsigned set(unsigned a) { return a | SET; }
+__attribute__((flatten)) unsigned clear_set_inline(unsigned a) { return set(clear(a)); }
+__attribute__((flatten)) unsigned set_clear_inline(unsigned a) { return clear(set(a)); }
+
+/* { dg-final { scan-assembler-not "\\sand\\s" } } */
+/* { dg-final { scan-assembler-not "\\sandi\\s" } } */
+