For the loop in the testcase we currently fail to hoist the guard
check of the inner loop (m > 0) out of the outer loop because
find_loop_guard checks all blocks of the outer loop for side-effects,
including those that are skipped by the guard. This is usually
harmless, as the guard normally does not skip any blocks in the outer
loop, but in this case store motion was applied to the inner loop, so
there is now a skipped store in the outer loop.
The following makes find_loop_guard ignore blocks that are dominated
by the entry to the skipped region when checking for side-effects.
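
As a rough source-level illustration of the situation described above
(a hand-written sketch, not compiler output; the function names
f_plain, f_after_sm and f_guard_hoisted are hypothetical and the real
transformation happens on GIMPLE), the outer loop looks roughly like
f_after_sm once store motion has run on the inner loop, and hoisting
the guard aims for something like f_guard_hoisted:

```c
/* The testcase loop nest as written.  */
void f_plain (int n, int m, double *a)
{
  for (int i = 0; i < n; i++)
    for (int j = 0; j < m; j++)
      a[i] += 2*a[i] + j;
}

/* Approximation of the outer loop after store motion on the inner
   loop: the value lives in a temporary and is stored back after the
   inner loop.  That store sits in the outer loop but is only reached
   when the guard m > 0 is true, i.e. it is a skipped store.  */
void f_after_sm (int n, int m, double *a)
{
  for (int i = 0; i < n; i++)
    if (m > 0)                  /* guard of the inner loop */
      {
        double tmp = a[i];
        for (int j = 0; j < m; j++)
          tmp += 2*tmp + j;
        a[i] = tmp;             /* skipped when the guard is false */
      }
}

/* Approximation of the desired result: the loop-invariant guard is
   checked once, outside the outer loop.  */
void f_guard_hoisted (int n, int m, double *a)
{
  if (m > 0)
    for (int i = 0; i < n; i++)
      {
        double tmp = a[i];
        for (int j = 0; j < m; j++)
          tmp += 2*tmp + j;
        a[i] = tmp;
      }
}
```

All three variants are meant to compute the same values; the skipped
store in f_after_sm has no effect when the guard is false, which is
why it need not block hoisting.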
PR tree-optimization/117510
* tree-ssa-loop-unswitch.cc (find_loop_guard): Only check
not skipped blocks for side-effects.
* gcc.dg/vect/vect-outer-pr117510.c: New testcase.
--- /dev/null
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_double } */
+/* { dg-additional-options "-O3" } */
+
+void f(int n, int m, double *a)
+{
+ a = __builtin_assume_aligned (a, __BIGGEST_ALIGNMENT__);
+ for (int i = 0; i < n; i++)
+ for (int j = 0; j < m; j++)
+ a[i] += 2*a[i] + j;
+}
+
+/* { dg-final { scan-tree-dump "OUTER LOOP VECTORIZED" "vect" } } */
guard_edge = NULL;
goto end;
}
- if (!empty_bb_without_guard_p (loop, bb, dbg_to_reset))
+ /* If any of the not skipped blocks has side-effects or defs with
+ uses outside of the loop we cannot hoist the guard. */
+ if (!dominated_by_p (CDI_DOMINATORS,
+ bb, guard_edge == te ? fe->dest : te->dest)
+ && !empty_bb_without_guard_p (loop, bb, dbg_to_reset))
{
if (dump_enabled_p ())
dump_printf_loc (MSG_MISSED_OPTIMIZATION, loc,