From: Kyrylo Tkachov
Date: Tue, 16 Jun 2026 10:16:42 +0000 (+0200)
Subject: vect: consult the alias oracle for unanalyzable BB SLP dependences
X-Git-Url: http://git.ipfire.org/gitweb.cgi?a=commitdiff_plain;h=efaac8c970280825f413aa4501bc7a0d5328afbd;p=thirdparty%2Fgcc.git

vect: consult the alias oracle for unanalyzable BB SLP dependences

vect_slp_analyze_data_ref_dependence conservatively reported a dependence
whenever the classical (affine) data-dependence test returned
chrec_dont_know, e.g. when one of the accesses has a non-affine or runtime
array subscript.  In the BB SLP region check this is overly pessimistic:
the unanalyzable subscript says nothing about whether the two references
can actually alias, and the alias oracle can frequently still prove they
cannot (distinct restrict parameters, distinct non-escaping objects, and so
on).  When that happens a perfectly good SLP group is torn down.  The
motivating case is the deal.II VectorizedArray reciprocal in SPEC CPU 2026
766.femflow_r.

The store-sink and load-hoist walkers already fall back to the alias oracle
(stmt_may_clobber_ref_p_1 / ref_maybe_used_by_stmt_p) for statements that
have no single recorded data reference.  Extend that fallback to the
chrec_dont_know case: vect_slp_analyze_data_ref_dependence now returns a
three-way result (chrec_known when the references are provably independent,
chrec_dont_know when the affine test cannot analyze them, and the
dependence otherwise) so each caller can tell "unknown" apart from
"dependent", and on "unknown" runs the same oracle query it already uses
for the no-data-reference case, with the TBAA setting appropriate to what
is being moved: no TBAA when sinking a store (a moving store may change the
dynamic type), TBAA when hoisting a load.

On the new gcc.dg/vect/bb-slp-dep-oracle.c the 8-lane reciprocal group is
torn down and emitted scalar without the patch and vectorizes to four
vector divides with it.

Bootstrapped and tested on aarch64-none-linux-gnu.

Signed-off-by: Kyrylo Tkachov

gcc/ChangeLog:

	* tree-vect-data-refs.cc (vect_slp_analyze_data_ref_dependence):
	Return a three-way tree result (chrec_known when independent,
	chrec_dont_know when the affine test cannot analyze the pair, the
	dependence otherwise) instead of a bool.
	(vect_slp_analyze_store_dependences): Resort to the alias oracle on
	an unknown dependence as well as on a missing data reference; a
	store is being moved so do not use TBAA.
	(vect_slp_analyze_load_dependences): Likewise on the load-hoist
	paths, using TBAA as a load is being hoisted; also record that the
	ao_ref has been initialized in check_hoist.

gcc/testsuite/ChangeLog:

	* gcc.dg/vect/bb-slp-dep-oracle.c: New test.
---

diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-dep-oracle.c b/gcc/testsuite/gcc.dg/vect/bb-slp-dep-oracle.c
new file mode 100644
index 00000000000..b2551ef644c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-dep-oracle.c
@@ -0,0 +1,40 @@
+/* BB SLP must not abandon a vectorizable group when the classical (affine)
+   data-dependence test cannot analyze a runtime array subscript but the alias
+   oracle can still prove the two references do not alias.
+
+   The per-lane reciprocals are discovered as an SLP group, then the group is
+   torn down by vect_slp_analyze_data_ref_dependence reporting "can't determine
+   dependence" between the restrict output store and a runtime-indexed input
+   load, even though the distinct restrict objects provably do not alias.  */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_double } */
+/* { dg-additional-options "-O3 -ffast-math -fno-trapping-math" } */
+
+struct VA { double data[8]; };
+struct Tensor { struct VA comp[4]; };
+
+/* Opaque: the returned index is a runtime value the affine subscript test
+   cannot analyze.  */
+unsigned __attribute__((noipa)) pick (unsigned k) { return k & 3; }
+
+void f (const struct Tensor *in, double *__restrict out,
+        unsigned nq, unsigned base)
+{
+  for (unsigned q = 0; q < nq; q++)
+    {
+      const struct VA *rho = &in[q].comp[0];  /* divisor: contiguous .data[i] */
+      double inv[8];
+      for (unsigned i = 0; i < 8; i++)
+        inv[i] = 1.0 / rho->data[i];  /* reciprocal group, reused below */
+      for (unsigned d = 0; d < 3; d++)
+        {
+          const struct VA *mom = &in[q].comp[pick (base + d)];  /* runtime-indexed numerator */
+          for (unsigned i = 0; i < 8; i++)
+            out[(q * 3 + d) * 8 + i] = mom->data[i] * inv[i];
+        }
+    }
+}
+
+/* The reciprocal group must survive the dependence check and vectorize.  */
+/* { dg-final { scan-tree-dump "basic block part vectorized" "slp1" } } */
diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
index 0e57e1068d6..f4662779976 100644
--- a/gcc/tree-vect-data-refs.cc
+++ b/gcc/tree-vect-data-refs.cc
@@ -973,12 +973,13 @@ vect_analyze_data_ref_dependences (loop_vec_info loop_vinfo,
 
 /* Function vect_slp_analyze_data_ref_dependence.
 
-   Return TRUE if there (might) exist a dependence between a memory-reference
-   DRA and a memory-reference DRB for VINFO.  When versioning for alias
-   may check a dependence at run-time, return FALSE.  Adjust *MAX_VF
-   according to the data dependence.  */
+   Classify the dependence between the memory-references DRA and DRB of DDR
+   for VINFO using the classical (affine) data-dependence test.  Return
+   chrec_known if they are provably independent, chrec_dont_know if the test
+   cannot analyze them (in which case the caller can still try to disambiguate
+   them with the alias oracle), and the dependence (NULL_TREE) otherwise.  */
 
-static bool
+static tree
 vect_slp_analyze_data_ref_dependence (vec_info *vinfo,
                                       struct data_dependence_relation *ddr)
 {
@@ -992,21 +993,21 @@ vect_slp_analyze_data_ref_dependence (vec_info *vinfo,
 
   /* Independent data accesses.  */
   if (DDR_ARE_DEPENDENT (ddr) == chrec_known)
-    return false;
+    return chrec_known;
 
   if (dra == drb)
-    return false;
+    return chrec_known;
 
   /* Read-read is OK.  */
   if (DR_IS_READ (dra) && DR_IS_READ (drb))
-    return false;
+    return chrec_known;
 
   /* If dra and drb are part of the same interleaving chain consider
      them independent.  */
   if (STMT_VINFO_GROUPED_ACCESS (dr_info_a->stmt)
       && (DR_GROUP_FIRST_ELEMENT (dr_info_a->stmt)
           == DR_GROUP_FIRST_ELEMENT (dr_info_b->stmt)))
-    return false;
+    return chrec_known;
 
   /* Unknown data dependence.  */
   if (DDR_ARE_DEPENDENT (ddr) == chrec_dont_know)
@@ -1021,7 +1022,7 @@ vect_slp_analyze_data_ref_dependence (vec_info *vinfo,
                      "determined dependence between %T and %T\n",
                      DR_REF (dra), DR_REF (drb));
 
-  return true;
+  return DDR_ARE_DEPENDENT (ddr);
 }
 
 
@@ -1052,29 +1053,35 @@ vect_slp_analyze_store_dependences (vec_info *vinfo, slp_tree node)
           if (! gimple_vuse (stmt))
             continue;
 
-          /* If we couldn't record a (single) data reference for this
-             stmt we have to resort to the alias oracle.  */
+          /* If we couldn't record a (single) data reference for this stmt,
+             or the classical dependence test cannot analyze it, we have to
+             resort to the alias oracle.  */
           stmt_vec_info stmt_info = vinfo->lookup_stmt (stmt);
           data_reference *dr_b = STMT_VINFO_DATA_REF (stmt_info);
-          if (!dr_b)
+          if (dr_b)
             {
-              /* We are moving a store - this means
-                 we cannot use TBAA for disambiguation.  */
-              if (!ref_initialized_p)
-                ao_ref_init (&ref, DR_REF (dr_a));
-              if (stmt_may_clobber_ref_p_1 (stmt, &ref, false)
-                  || ref_maybe_used_by_stmt_p (stmt, &ref, false))
+              gcc_assert (!gimple_visited_p (stmt));
+
+              ddr_p ddr = initialize_data_dependence_relation (dr_a,
+                                                               dr_b, vNULL);
+              tree dep = vect_slp_analyze_data_ref_dependence (vinfo, ddr);
+              free_dependence_relation (ddr);
+              if (dep == chrec_known)
+                continue;
+              if (dep != chrec_dont_know)
                 return false;
-              continue;
+              /* Unknown dependence - fall through to the alias oracle.  */
             }
 
-          gcc_assert (!gimple_visited_p (stmt));
-
-          ddr_p ddr = initialize_data_dependence_relation (dr_a,
-                                                           dr_b, vNULL);
-          bool dependent = vect_slp_analyze_data_ref_dependence (vinfo, ddr);
-          free_dependence_relation (ddr);
-          if (dependent)
+          /* We are moving a store - this means we cannot use TBAA for
+             disambiguation.  */
+          if (!ref_initialized_p)
+            {
+              ao_ref_init (&ref, DR_REF (dr_a));
+              ref_initialized_p = true;
+            }
+          if (stmt_may_clobber_ref_p_1 (stmt, &ref, false)
+              || ref_maybe_used_by_stmt_p (stmt, &ref, false))
             return false;
         }
     }
@@ -1131,10 +1138,22 @@ vect_slp_analyze_load_dependences (vec_info *vinfo, slp_tree node,
                   data_reference *store_dr = STMT_VINFO_DATA_REF (store_info);
                   ddr_p ddr = initialize_data_dependence_relation
                                 (dr_a, store_dr, vNULL);
-                  bool dependent
+                  tree dep
                     = vect_slp_analyze_data_ref_dependence (vinfo, ddr);
                   free_dependence_relation (ddr);
-                  if (dependent)
+                  if (dep == chrec_known)
+                    continue;
+                  if (dep != chrec_dont_know)
+                    return false;
+                  /* The classical dependence test cannot analyze this;
+                     resort to the alias oracle.  We are hoisting a load
+                     so TBAA may be used for disambiguation.  */
+                  if (!ref_initialized_p)
+                    {
+                      ao_ref_init (&ref, DR_REF (dr_a));
+                      ref_initialized_p = true;
+                    }
+                  if (stmt_may_clobber_ref_p_1 (store_info->stmt, &ref, true))
                     return false;
                 }
               continue;
@@ -1145,7 +1164,10 @@ vect_slp_analyze_load_dependences (vec_info *vinfo, slp_tree node,
               /* We are hoisting a load - this means we can use
                  TBAA for disambiguation.  */
               if (!ref_initialized_p)
-                ao_ref_init (&ref, DR_REF (dr_a));
+                {
+                  ao_ref_init (&ref, DR_REF (dr_a));
+                  ref_initialized_p = true;
+                }
               if (stmt_may_clobber_ref_p_1 (stmt_info->stmt, &ref, true))
                 {
                   /* If we couldn't record a (single) data reference for this
@@ -1155,10 +1177,13 @@ vect_slp_analyze_load_dependences (vec_info *vinfo, slp_tree node,
                     return false;
                   ddr_p ddr = initialize_data_dependence_relation (dr_a,
                                                                    dr_b, vNULL);
-                  bool dependent
+                  tree dep
                     = vect_slp_analyze_data_ref_dependence (vinfo, ddr);
                   free_dependence_relation (ddr);
-                  if (dependent)
+                  /* The alias oracle above could not rule out a conflict;
+                     only a proven-independent (chrec_known) result lets us
+                     hoist the load past this store.  */
+                  if (dep != chrec_known)
                     return false;
                 }
               /* No dependence.  */