PR target/110792: Early clobber issues with rot32di2_doubleword on i386.
This patch is a conservative fix for PR target/110792, a wrong-code
regression affecting doubleword rotations by BITS_PER_WORD, which
effectively swaps the highpart and lowpart words, when the source to be
rotated resides in memory. The issue is that if the register used to
hold the lowpart of the destination is mentioned in the address of
the memory operand, the current define_insn_and_split unintentionally
clobbers it before reading the highpart.
Hence, for the testcase, the incorrectly generated code looks like:
but unfortunately this currently generates significantly worse code,
due to a strange choice of reloads (effectively memcpy), which ends up
looking like:
salq $4, %rdi // calculate address
movdqa WHIRL_S(%rdi), %xmm0 // load the double word in SSE reg.
movaps %xmm0, -16(%rsp) // store the SSE reg back to the stack
movq -8(%rsp), %rdi // load highpart
movq -16(%rsp), %rbp // load lowpart
Note that reload's "&" doesn't distinguish between the memory being
early clobbered, vs the registers used in an addressing mode being
early clobbered.
The fix proposed in this patch is to remove the third alternative, that
allowed offsetable memory as an operand, forcing reload to place the
operand into a register before the rotation. This results in:
I believe there's a more advanced solution, by swapping the order of
the loads (if first destination register is mentioned in the address),
or inserting a lea insn (if both destination registers are mentioned
in the address), but this fix is a minimal "safe" solution, that
should hopefully be suitable for backporting.
2023-08-03 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
PR target/110792
* config/i386/i386.md (<any_rotate>ti3): For rotations by 64 bits
place operand in a register before gen_<insn>64ti2_doubleword.
(<any_rotate>di3): Likewise, for rotations by 32 bits, place
operand in a register before gen_<insn>32di2_doubleword.
(<any_rotate>32di2_doubleword): Constrain operand to be in register.
(<any_rotate>64ti2_doubleword): Likewise.
gcc/testsuite/ChangeLog
PR target/110792
* g++.target/i386/pr110792.C: New 32-bit C++ test case.
* gcc.target/i386/pr110792.c: New 64-bit C test case.