middle-end: Use addhn for compression instead of inclusive OR when reducing comparison values
Given a sequence such as

int foo ()
{
#pragma GCC unroll 4
  for (int i = 0; i < N; i++)
    if (a[i] == 124)
      return 1;

  return 0;
}

where a[i] is long long, we unroll the loop and use an OR reduction to combine
the comparison results for the early break on Adv. SIMD.  The reduction is then
followed by a compression sequence that narrows the 128-bit vector to 64 bits
for use by the branch.
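
However, if the target supports add returning high narrow (ADDHN), we can
replace the final OR with an ADDHN, which does the combining and the narrowing
in one instruction.  This is safe because every lane of a comparison result is
either all zeros or all ones, so the high half of the sum is nonzero exactly
when at least one of the two input lanes is set.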
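
As a rough illustration of why the substitution holds, the following is a
scalar model of a single 64-bit lane.  The function names are hypothetical and
this is not the code GCC emits; it only assumes that ADDHN adds modulo 2^64 and
keeps the high 32 bits of each sum, and that the inputs are comparison masks.

```c
#include <stdint.h>

/* Scalar model of one 64-bit lane of ADDHN: add modulo 2^64 and keep
   the high 32 bits of the sum.  Hypothetical helper, for illustration.  */
uint32_t
addhn_lane (uint64_t x, uint64_t y)
{
  return (uint32_t) ((x + y) >> 32);
}

/* Return 1 if, for every pair of comparison masks (a lane is either all
   zeros or all ones), the ADDHN result is nonzero exactly when the OR of
   the two masks is nonzero; return 0 otherwise.  */
int
addhn_agrees_with_or_on_masks (void)
{
  const uint64_t masks[2] = { 0, UINT64_MAX };

  for (int i = 0; i < 2; i++)
    for (int j = 0; j < 2; j++)
      {
        int via_addhn = addhn_lane (masks[i], masks[j]) != 0;
        int via_or = (masks[i] | masks[j]) != 0;
        if (via_addhn != via_or)
          return 0;
      }

  return 1;
}
```

The equivalence relies on the inputs being masks: for arbitrary values such as
1 and 0 the OR is nonzero while the high half of the sum is zero, so the
replacement would not be valid for general vectors.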

Note that for now I only replace the last OR; with more than one level of
unrolling we could technically chain them.  I will revisit this in an upcoming
early break series, but an unroll factor of 2 is fairly common.

gcc/ChangeLog:

	* internal-fn.def (VEC_TRUNC_ADD_HIGH): New.
	* doc/generic.texi: Document it.
	* optabs.def (vec_trunc_add_high): New.
	* doc/md.texi: Document it.
	* tree-vect-stmts.cc (vectorizable_early_exit): Use addhn if supported.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/vect-early-break-addhn_1.c: New test.
	* gcc.target/aarch64/vect-early-break-addhn_2.c: New test.
	* gcc.target/aarch64/vect-early-break-addhn_3.c: New test.
	* gcc.target/aarch64/vect-early-break-addhn_4.c: New test.