vect: Enhance cost evaluation in vect_transform_slp_perm_load_1
Following Richi's suggestion in [1], I'm working on deferring
cost evaluation until next to the transformation.  This patch
enhances function vect_transform_slp_perm_load_1, which could
under-cost vector permutations: the costing doesn't consider
nvectors_per_build, so it is inconsistent with the
transformation part.
Basically, it changes the code from:
  if (index == count)
    {
      if (!noop_p)
        {
          // A ...
          // ++*n_perms;
          if (!analyze_only)
            {
              // B1 ...
              // B2 ...
              for ...
                // B3 building VEC_PERM_EXPR
            }
        }
      else if (!analyze_only)
        {
          // no B2 since there are no further uses here.
          for ...
            // B4 building nothing
        }
      // B5 ...
    }
to:
  if (index == count)
    {
      if (!noop_p)
        {
          // A ...
          if (!analyze_only)
            // B1 ...
          // B2 ... (trivial computations, done whether analyze_only or not)
          for ...
            {
              // now n_perms is consistent with building VEC_PERM_EXPR
              // ++*n_perms;
              if (analyze_only)
                continue;
              // B3 building VEC_PERM_EXPR
            }
        }
      else if (!analyze_only)
        {
          // no B2 since there are no further uses here.
          for ...
            // B4 building nothing
        }
      // B5 ...
    }
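
The effect on the permutation count can be illustrated with a
small standalone model.  This is only a sketch, not GCC code: it
assumes that the elided `for ...` loop runs nvectors_per_build
times per pass through the `index == count` block, and the names
perm_stats, model_old, model_new and npasses are hypothetical.

```cpp
#include <cassert>

// Hypothetical model, not GCC code.  One "pass" stands for one trip
// through the `index == count` block with noop_p being false.
struct perm_stats
{
  unsigned n_perms; // permutations counted for costing
  unsigned n_built; // VEC_PERM_EXPRs that would actually be built
};

// Old shape: n_perms is bumped once per pass, before the build loop.
perm_stats
model_old (unsigned npasses, unsigned nvectors_per_build, bool analyze_only)
{
  assert (nvectors_per_build > 0);
  perm_stats s = { 0, 0 };
  for (unsigned i = 0; i < npasses; ++i)
    {
      ++s.n_perms;
      if (!analyze_only)
        for (unsigned j = 0; j < nvectors_per_build; ++j)
          ++s.n_built; // stands for B3
    }
  return s;
}

// New shape: n_perms is bumped inside the build loop (assumed here to
// iterate nvectors_per_build times), so it matches the number of
// VEC_PERM_EXPRs whether or not analyze_only is set.
perm_stats
model_new (unsigned npasses, unsigned nvectors_per_build, bool analyze_only)
{
  assert (nvectors_per_build > 0);
  perm_stats s = { 0, 0 };
  for (unsigned i = 0; i < npasses; ++i)
    for (unsigned j = 0; j < nvectors_per_build; ++j)
      {
        ++s.n_perms;
        if (analyze_only)
          continue;
        ++s.n_built; // stands for B3
      }
  return s;
}
```

In this model the two shapes agree when nvectors_per_build is 1.
For larger values the old shape counts fewer permutations than it
builds, while the new shape reports the same n_perms during
analysis as the number of permutations built at transform time.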