Handle SLP permutations for variable-length vectors
The SLP code currently punts for all variable-length permutes.
This patch makes it handle the easy case of N->N permutes in which
the number of vector lanes is a multiple of N. Every permute then
uses the same mask, and that mask repeats (with a stride) every
N elements.
The patch uses the same path for constant-length vectors,
since it should be slightly cheaper in terms of compile time.
2018-08-24 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-slp.c (vect_transform_slp_perm_load): Separate out
the case in which the permute needs only a single element and
repeats for every vector of the result. Extend that case to
handle variable-length vectors.
* tree-vect-stmts.c (vectorizable_load): Update accordingly.