PR rtl-optimization/109476: Use ZERO_EXTEND instead of zeroing a SUBREG.
This patch fixes PR rtl-optimization/109476, which is a code quality
regression affecting AVR. The cause is that the lower-subreg pass is
sometimes overly aggressive, lowering the LSHIFTRT below:
but this idiom, SETs of SUBREGs, interferes with combine's ability
to associate/fuse instructions. The solution, on targets that
have a suitable ZERO_EXTEND (i.e. where the lower-subreg pass
wouldn't itself split a ZERO_EXTEND, so "splitting_zext" is false),
is to split/lower LSHIFTRT to a ZERO_EXTEND.
To answer Richard's question in comment #10 of the bugzilla PR,
the function resolve_shift_zext is called with one of four RTX
codes, ASHIFTRT, LSHIFTRT, ZERO_EXTEND and ASHIFT, but only with
LSHIFTRT can the setting of low_part and high_part SUBREGs be
replaced by a ZERO_EXTEND. For ASHIFTRT, we require a sign
extension, so don't set the high_part to zero; if we're splitting
a ZERO_EXTEND then it doesn't make sense to replace it with a
ZERO_EXTEND, and for ASHIFT we've played games to swap the
high_part and low_part SUBREGs, so that we assign the low_part
to zero (for double word shifts by greater than word size bits).
2023-04-28 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
PR rtl-optimization/109476
* lower-subreg.cc: Include explow.h for force_reg.
(find_decomposable_shift_zext): Pass an additional SPEED_P argument.
If decomposing a suitable LSHIFTRT and we're not splitting
ZERO_EXTEND (based on the current SPEED_P), then use a ZERO_EXTEND
instead of setting a high part SUBREG to zero, which helps combine.
(decompose_multiword_subregs): Update call to resolve_shift_zext.
gcc/testsuite/ChangeLog
PR rtl-optimization/109476
* gcc.target/avr/mmcu/pr109476.c: New test case.