Improve load permutation lowering
author Richard Biener <rguenther@suse.de>
Fri, 4 Oct 2024 09:13:58 +0000 (11:13 +0200)
committer Richard Biener <rguenth@gcc.gnu.org>
Sat, 5 Oct 2024 11:59:32 +0000 (13:59 +0200)
commit 515f015f3cc4978b8b02bb61ba50ba67d2a24065
tree d1c15f375c5c65e6cc29f2fb39e98d24cfadaaca
parent 7d736ecbc05a35f73fbd8e3b010d6e9821c34404

The following makes sure the emitted even/odd extraction scheme is
one that ends up with actual trivial even/odd extract permutes.
When we choose a level 2 extract we generate { 0, 1, 4, 5, ... },
which for example the x86 backend doesn't recognize with just SSE
and QImode elements.  So this now follows what the non-SLP
interleaving code would do, which is element-granular even/odd
extracts.
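
To make the difference between the two selectors concrete, here is a
minimal standalone sketch.  It is not the code in tree-vect-slp.cc;
extract_selector and its parameters are hypothetical names used only
for illustration, and it assumes that a level N extract treats runs of
2^(N-1) adjacent elements as one unit and keeps every other unit.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not a GCC function): build the selector for an
// even (odd == false) or odd (odd == true) extract at the given level.
// Assumption: level N groups 2^(N-1) adjacent elements into one unit
// and keeps every other unit of the concatenated input vectors.
std::vector<unsigned>
extract_selector (unsigned nelts, unsigned level, bool odd)
{
  assert (level >= 1 && level < 32);
  const unsigned unit = 1u << (level - 1);
  std::vector<unsigned> sel;
  sel.reserve (nelts);
  for (unsigned i = 0; i < nelts; ++i)
    // Skip to every other unit, then pick the element inside the unit.
    sel.push_back ((i / unit) * 2 * unit + (odd ? unit : 0) + i % unit);
  return sel;
}
```

Under that assumption, extract_selector (4, 1, false) yields the
element-granular { 0, 2, 4, 6 } selector, while
extract_selector (4, 2, false) yields the { 0, 1, 4, 5 } form mentioned
above.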

This resolves gcc.dg/vect/vect-strided[-a]-u8-i8-gap*.c FAILs with
--param vect-force-slp=1 on x86_64.

	* tree-vect-slp.cc (vect_lower_load_permutations): Prefer
	level 1 even/odd extracts.
gcc/tree-vect-slp.cc