PR112694 shows that we try to create sub-vectors of single-element
vectors because can_duplicate_and_interleave_p returns true.
The problem resurfaced in PR116611.
This patch makes can_duplicate_and_interleave_p return false
if count / nvectors > 0 and removes the corresponding check in the riscv
backend.
This partially gets rid of the FAIL in slp-19a.c. At least when built
with cost model we don't have LOAD_LANES anymore. Without cost model,
as in the test suite, we choose a different path and still end up with
LOAD_LANES.
Bootstrapped and regtested on x86 and power10, regtested on
rv64gcv_zvfh_zvbb. Still waiting for the aarch64 results.
Regards
Robin
gcc/ChangeLog:
PR target/112694
PR target/116611.
* config/riscv/riscv-v.cc (expand_vec_perm_const): Remove early
return.
* tree-vect-slp.cc (can_duplicate_and_interleave_p): Return
false when we cannot create sub-elements.
mask to do the iteration loop control. Just disable it directly. */
if (GET_MODE_CLASS (vmode) == MODE_VECTOR_BOOL)
return false;
- /* FIXME: Explicitly disable VLA interleave SLP vectorization when we
- may encounter ICE for poly size (1, 1) vectors in loop vectorizer.
- Ideally, middle-end loop vectorizer should be able to disable it
- itself, We can remove the codes here when middle-end code is able
- to disable VLA SLP vectorization for poly size (1, 1) VF. */
- if (!BYTES_PER_RISCV_VECTOR.is_constant ()
- && maybe_lt (BYTES_PER_RISCV_VECTOR * TARGET_MAX_LMUL,
- poly_int64 (16, 16)))
- return false;
struct expand_vec_perm_d d;
if (!multiple_p (elt_bytes, 2, &elt_bytes))
return false;
nvectors *= 2;
+ /* We need to be able to fuse COUNT / NVECTORS elements together. */
+ if (!multiple_p (count, nvectors))
+ return false;
}
}