The following hopefully addresses an observed bootstrap issue on aarch64
where maybe-uninit diagnostics occur. It also fixes bogus napkin math
from myself when I was confusing rounded up size of a single access
with rounded up size of the group accessed in a single scalar iteration.
So the following puts in a correctness check, leaving a set of peeling
for gaps as insufficient. This could be rectified by splitting the
last load into multiple ones but I'm leaving this for a followup, better
quickly fix the reported wrong-code.
* tree-vect-stmts.cc (get_group_load_store_type): Do not
re-use poly-int remain but re-compute with non-poly values.
Verify the shortened load is good enough to be covered with
a single scalar gap iteration before accepting it.
* gcc.dg/vect/pr115385.c: Enable AVX2 if available.