git.ipfire.org Git - thirdparty/gcc.git/commit

author	Richard Biener <rguenther@suse.de>
	Wed, 19 Jun 2024 10:57:27 +0000 (12:57 +0200)
committer	Richard Biener <rguenther@suse.de>
	Thu, 20 Jun 2024 06:47:44 +0000 (08:47 +0200)
commit	46bb4ce4d30ab749d40f6f4cef6f1fb7c7813452
tree	a45311f21186aa365e1ae840f429abec2a3d16b1
parent	bea447a2982f3094aa3423b5045cea929f4f4700

tree-optimization/114413 - SLP CSE after permute optimization

We currently fail to re-CSE SLP nodes after optimizing permutes,
which results in inaccurate cost estimates.  For gcc.dg/vect/bb-slp-32.c
this shows up as not re-using the SLP node with the load and arithmetic
for both the store and the reduction.  The following implements
CSE by re-bst-mapping nodes as the finalization step of vect_optimize_slp.
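
As a rough illustration of the re-bst-mapping idea described above (a
standalone sketch, not the code in tree-vect-slp.cc): nodes are keyed by the
scalar statements they cover, the first node seen for a key becomes the
leader, and every later edge to a node with the same key is redirected to
that leader.  Node, LeaderMap and cse_nodes are hypothetical names invented
for this example, scalar statements are modelled as plain integers, and the
graph is assumed to be acyclic.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <vector>

// Hypothetical stand-in for an SLP node: the scalar statement ids it
// covers plus its operand nodes.  This is NOT GCC's slp_tree.
struct Node {
  std::vector<int> scalar_stmts;  // empty: the node cannot be keyed
  std::vector<std::shared_ptr<Node>> children;
};

using NodeRef = std::shared_ptr<Node>;
// Hypothetical analogue of a scalar-stmts-to-node map.
using LeaderMap = std::map<std::vector<int>, NodeRef>;

// Redirect `node` to a previously seen node covering the same scalar
// statements; otherwise record it as the leader and walk its children.
// Returns the number of edges that were redirected to an existing leader.
int cse_nodes(LeaderMap &leaders, NodeRef &node) {
  assert(node && "caller must pass a non-null node");
  if (!node->scalar_stmts.empty()) {
    auto it = leaders.find(node->scalar_stmts);
    if (it != leaders.end()) {
      if (it->second == node)
        return 0;         // already visited through another edge
      node = it->second;  // share the leader instead of the duplicate
      return 1;
    }
    leaders.emplace(node->scalar_stmts, node);
  }
  // Nodes without scalar statements are left alone, but their children
  // are still candidates for sharing.
  int replaced = 0;
  for (NodeRef &child : node->children)
    if (child)
      replaced += cse_nodes(leaders, child);
  return replaced;
}
```

Running cse_nodes with one shared LeaderMap over each root (for instance a
store root and a reduction root that each own a separate copy of the same
load-and-arithmetic node) leaves both roots pointing at a single child, so a
cost model walking the graph would count that node once.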

I've tried to make the CSE part of permute materialization but it
isn't a very good fit there.  I've not bothered to implement something
more complete that also handles external defs or defs without
SLP_TREE_SCALAR_STMTS.

I realize this might result in more BB SLP, which in turn might slow
down code given that costing for BB SLP is difficult (even the fact that
we now vectorize gcc.dg/vect/bb-slp-32.c on x86_64 might not be a good
idea).  This nevertheless feeds more accurate info to costing, which is
good.

PR tree-optimization/114413
* tree-vect-slp.cc (release_scalar_stmts_to_slp_tree_map):
New function, split out from ...
(vect_analyze_slp): ... here.  Call it.
(vect_cse_slp_nodes): New function.
(vect_optimize_slp): Call it.

* gcc.dg/vect/bb-slp-32.c: Expect CSE and vectorization on x86.

gcc/testsuite/gcc.dg/vect/bb-slp-32.c
gcc/tree-vect-slp.cc