AArch64: Add pattern for unsigned widenings (uxtl) to zip{1,2}
This changes unpack instructions to use zip{1,2} when doing a zero-extending
widening operation. Permutes generally have a higher throughput than the
widening operations. Zeros are shuffled into the top half of the registers.
The testcase
void d2 (unsigned * restrict a, unsigned short *b, int n)
{
for (int i = 0; i < (n & -8); i++)
a[i] = b[i];
}
* gcc.target/aarch64/simd/vmovl_high_1.c: Update codegen.
* gcc.target/aarch64/uxtl-combine-1.c: New test.
* gcc.target/aarch64/uxtl-combine-2.c: New test.
* gcc.target/aarch64/uxtl-combine-3.c: New test.
* gcc.target/aarch64/uxtl-combine-4.c: New test.
* gcc.target/aarch64/uxtl-combine-5.c: New test.
* gcc.target/aarch64/uxtl-combine-6.c: New test.