gcc.target/aarch64/sve/var_stride_2.c started failing after
r15-268-g9dbff9c05520, but the change was an improvement:
@@ -36,13 +36,11 @@
b.any .L9
ret
.L17:
- ubfiz x5, x3, 10, 16
- ubfiz x4, x2, 10, 16
- add x5, x1, x5
- add x4, x0, x4
- cmp x0, x5
- ccmp x1, x4, 2, ls
uxtw x4, w2
+ add x6, x1, x3, lsl 10
+ cmp x0, x6
+ add x5, x0, x4, lsl 10
+ ccmp x1, x5, 2, ls
ccmp w2, 0, 4, hi
beq .L3
cntb x7
This patch therefore changes the test to expect the new output
for var_stride_2.c.
The changes for var_stride_4.c were a wash, with both versions
having 18(!) arithmetic instructions before the alias check branch.
Both versions sign-extend the n and m arguments as part of this
sequence; the question is whether they do it first or later.
This patch therefore changes the test to accept either the old
or the new code for var_stride_4.c.
gcc/testsuite/
* gcc.target/aarch64/sve/var_stride_2.c: Expect ADD+LSL.
* gcc.target/aarch64/sve/var_stride_4.c: Accept LSL or SBFIZ.
/* { dg-final { scan-assembler {\tldr\tw[0-9]+} } } */
/* { dg-final { scan-assembler {\tstr\tw[0-9]+} } } */
/* Should multiply by (257-1)*4 rather than (VF-1)*4 or (VF-2)*4. */
-/* { dg-final { scan-assembler-times {\tubfiz\tx[0-9]+, x2, 10, 16\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tubfiz\tx[0-9]+, x3, 10, 16\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tadd\tx[0-9]+, x[0-9]+, x[0-9]+, lsl 10\n} 2 } } */
/* { dg-final { scan-assembler-not {\tcmp\tx[0-9]+, 0} } } */
/* { dg-final { scan-assembler-not {\tcmp\tw[0-9]+, 0} } } */
/* { dg-final { scan-assembler-not {\tcsel\tx[0-9]+} } } */
/* { dg-final { scan-assembler {\tldr\tw[0-9]+} } } */
/* { dg-final { scan-assembler {\tstr\tw[0-9]+} } } */
/* Should multiply by (257-1)*4 rather than (VF-1)*4. */
-/* { dg-final { scan-assembler-times {\tsbfiz\tx[0-9]+, x2, 10, 32\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tsbfiz\tx[0-9]+, x3, 10, 32\n} 1 } } */
+/* { dg-final { scan-assembler-times {\t(?:lsl\tx[0-9]+, x[0-9]+, 10|sbfiz\tx[0-9]+, x[0-9]+, 10, 32)\n} 2 } } */
/* { dg-final { scan-assembler {\tcmp\tw2, 0} } } */
/* { dg-final { scan-assembler {\tcmp\tw3, 0} } } */
/* { dg-final { scan-assembler-times {\tcsel\tx[0-9]+} 4 } } */