We produce inefficient code for some synthesized SImode conditional set
operations (i.e. ones that are not directly implemented in hardware) on
RV64.  For example, a piece of C code like this:

int
sleu (unsigned int x, unsigned int y)
{
  return x <= y;
}

gets compiled (at `-O2') to this:

sleu:
        sgtu    a0,a0,a1        # 9     [c=4 l=4]  *sgtu_disi
        xori    a0,a0,1         # 10    [c=4 l=4]  *xorsi3_internal/1
        andi    a0,a0,1         # 16    [c=4 l=4]  anddi3/1
        ret                     # 25    [c=0 l=4]  simple_return

or (at `-O1') to this:

sleu:
        sgtu    a0,a0,a1        # 9     [c=4 l=4]  *sgtu_disi
        xori    a0,a0,1         # 10    [c=4 l=4]  *xorsi3_internal/1
        sext.w  a0,a0           # 16    [c=4 l=4]  extendsidi2/0
        ret                     # 24    [c=0 l=4]  simple_return

This is because the middle end expands the SLEU operation, which is
missing from RISC-V hardware, into a sequence of a SImode SGTU
operation followed by an explicit SImode XORI operation with immediate
1.  And while the SGTU machine instruction (an alias of SLTU with the
input operands swapped) gives a properly sign-extended 32-bit result,
valid both as a SImode and as a DImode operand, the middle end does
not see that through a SImode XORI operation, because we tell the
middle end that the RISC-V target (unlike MIPS) may hold values in
DImode integer registers that are valid for SImode operations even if
not properly sign-extended.

However, the RISC-V psABI requires that 32-bit function arguments and
results passed in 64-bit integer registers be properly sign-extended,
so this extension is done explicitly at the conclusion of the
function.

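At the source level the difference between the two expansions can be
sketched as follows (an illustration only: the function names are made
up for the sketch, the actual transformation happens on RTL, and a
compiler will normally optimize both functions below to the same
code):

#include <stdint.h>

/* Old expansion: the 0/1 comparison result is inverted with a SImode
   XOR, after which the middle end no longer knows that the upper 32
   bits of the 64-bit register hold a sign-extension of bit 31, so a
   trailing ANDI or SEXT.W survives to satisfy the psABI.  */
int
sleu_old (unsigned int x, unsigned int y)
{
  int32_t gtu = x > y;  /* SImode SGTU: 0 or 1.  */
  return gtu ^ 1;       /* SImode XORI: extension status lost.  */
}

/* New expansion: the comparison is made in DImode and the inversion
   expressed as an EQ against 0, whose result the middle end does know
   to be properly sign-extended.  */
int
sleu_new (unsigned int x, unsigned int y)
{
  int64_t gtu = x > y;  /* DImode SGTU.  */
  return gtu == 0;      /* SImode SEQZ, later folded into XORI.  */
}
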
Fix this by making the backend use a sequence of a DImode SGTU
operation followed by a SImode SEQZ operation instead.  The latter
operation is known by the middle end to produce a properly
sign-extended 32-bit result, so combine gets rid of the
sign-extension operation that follows and actually folds it into the
very same XORI machine instruction, resulting in:

sleu:
        sgtu    a0,a0,a1        # 9     [c=4 l=4]  *sgtu_didi
        xori    a0,a0,1         # 16    [c=4 l=4]  xordi3/1
        ret                     # 25    [c=0 l=4]  simple_return

instead (although the SEQZ machine instruction, an alias of SLTIU
against immediate 1, would do equally well and is actually retained
at `-O0').  The remaining synthesized operations of this kind,
i.e. `SLE', `SGEU', and `SGE', are handled analogously.

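For reference, the signed variants expand the same way; the RV64
output expected at `-O2' is sketched in the comments below (an
assumption as to register allocation, matching what the new tests
check for):

int
sle (int x, int y)
{
  return x <= y;        /* slt a0,a1,a0; xori a0,a0,1; ret */
}

int
sge (int x, int y)
{
  return x >= y;        /* slt a0,a0,a1; xori a0,a0,1; ret */
}
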
gcc/
	* config/riscv/riscv.cc (riscv_emit_int_order_test): Use EQ 0
	rather than XOR 1 for LE and LEU operations.

gcc/testsuite/
	* gcc.target/riscv/sge.c: New test.
	* gcc.target/riscv/sgeu.c: New test.
	* gcc.target/riscv/sle.c: New test.
	* gcc.target/riscv/sleu.c: New test.

--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
 	}
       else if (invert_ptr == 0)
 	{
-	  rtx inv_target = riscv_force_binary (GET_MODE (target),
+	  rtx inv_target = riscv_force_binary (word_mode,
 					       inv_code, cmp0, cmp1);
-	  riscv_emit_binary (XOR, target, inv_target, const1_rtx);
+	  riscv_emit_binary (EQ, target, inv_target, const0_rtx);
 	}
       else
 	{
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/sge.c
+/* { dg-do compile } */
+/* { dg-require-effective-target rv64 } */
+/* { dg-skip-if "" { *-*-* } { "-O0" } } */
+
+int
+sge (int x, int y)
+{
+  return x >= y;
+}
+
+/* { dg-final { scan-assembler "\\sxori\\sa0,a0,1\n\\sret\n" } } */
+/* { dg-final { scan-assembler-not "andi|sext\\.w" } } */
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/sgeu.c
+/* { dg-do compile } */
+/* { dg-require-effective-target rv64 } */
+/* { dg-skip-if "" { *-*-* } { "-O0" } } */
+
+int
+sgeu (unsigned int x, unsigned int y)
+{
+  return x >= y;
+}
+
+/* { dg-final { scan-assembler "\\sxori\\sa0,a0,1\n\\sret\n" } } */
+/* { dg-final { scan-assembler-not "andi|sext\\.w" } } */
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/sle.c
+/* { dg-do compile } */
+/* { dg-require-effective-target rv64 } */
+/* { dg-skip-if "" { *-*-* } { "-O0" } } */
+
+int
+sle (int x, int y)
+{
+  return x <= y;
+}
+
+/* { dg-final { scan-assembler "\\sxori\\sa0,a0,1\n\\sret\n" } } */
+/* { dg-final { scan-assembler-not "andi|sext\\.w" } } */
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/sleu.c
+/* { dg-do compile } */
+/* { dg-require-effective-target rv64 } */
+/* { dg-skip-if "" { *-*-* } { "-O0" } } */
+
+int
+sleu (unsigned int x, unsigned int y)
+{
+  return x <= y;
+}
+
+/* { dg-final { scan-assembler "\\sxori\\sa0,a0,1\n\\sret\n" } } */
+/* { dg-final { scan-assembler-not "andi|sext\\.w" } } */