[PATCH 2/2] tree-optimization/vect: Allow single-lane SLP fallback when limit is...
author Lili Cui <lili.cui@intel.com>
Tue, 12 May 2026 17:01:00 +0000 (10:01 -0700)
committer Cui, Lili <lili.cui@intel.com>
Tue, 12 May 2026 10:50:09 +0000 (18:50 +0800)
In vect_analyze_slp_reduction, the early bail-out "if (*limit == 0) return
false" blocked all SLP discovery, including the single-lane fallback path.
However, single-lane SLP trees (group_size == 1) do not consume the
discovery limit, as they cannot cause exponential tree growth.

The early bail-out causes vectorization failures in loops with many
independent conditional reductions: multi-lane grouping attempts exhaust
the limit, and the single-lane fallback that would have succeeded is then
incorrectly rejected.
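
For illustration only, a loop of the general shape described above might
look like the following sketch. The function and type names are
hypothetical; this is neither the astcenc kernel nor a GCC testcase, and
whether this exact loop exhausts the discovery limit is not claimed.

```cpp
#include <cstddef>

// Hypothetical accumulator set; the names are illustrative only.
struct Sums { float a, b, c, d; };

// Four independent reductions, each updated under its own condition.
// Every accumulator is a separate reduction the vectorizer has to
// analyze on its own.
Sums conditional_sums (const float *x, const float *w, std::size_t n,
                       float threshold)
{
  Sums s{0.0f, 0.0f, 0.0f, 0.0f};
  for (std::size_t i = 0; i < n; ++i)
    {
      if (x[i] > threshold)
        s.a += x[i];
      if (w[i] > threshold)
        s.b += w[i];
      if (x[i] < w[i])
        s.c += x[i] * w[i];
      if (x[i] != 0.0f)
        s.d += w[i];
    }
  return s;
}
```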

The fix moves the limit check so that it guards only the chain analysis
(which builds multi-lane trees and does consume the limit), allowing the
single-lane fallback to always proceed.
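
The change in control flow can be sketched with a small standalone model.
This is a simplified illustration, not the GCC implementation: Outcome,
try_chain, try_single_lane, analyze_before and analyze_after are
hypothetical names, the stand-in chain analysis always fails, and the
stand-in single-lane fallback always succeeds.

```cpp
// Records which paths the model attempted and the final result.
struct Outcome
{
  bool chain_tried;
  bool single_lane_tried;
  bool ok;
};

// Stand-in for chain analysis: consumes budget and, in this model, fails.
static bool try_chain (unsigned *limit)
{
  if (*limit > 0)
    --*limit;
  return false;
}

// Stand-in for the single-lane fallback: never touches the budget.
static bool try_single_lane ()
{
  return true;
}

// Old shape: an exhausted budget rejects everything up front.
Outcome analyze_before (bool force_single_lane, unsigned *limit)
{
  Outcome o{false, false, false};
  if (*limit == 0)
    return o;
  if (!force_single_lane)
    {
      o.chain_tried = true;
      if (try_chain (limit))
        {
          o.ok = true;
          return o;
        }
    }
  o.single_lane_tried = true;
  o.ok = try_single_lane ();
  return o;
}

// New shape: the budget check guards only the chain attempt.
Outcome analyze_after (bool force_single_lane, unsigned *limit)
{
  Outcome o{false, false, false};
  if (!force_single_lane && *limit != 0)
    {
      o.chain_tried = true;
      if (try_chain (limit))
        {
          o.ok = true;
          return o;
        }
    }
  o.single_lane_tried = true;
  o.ok = try_single_lane ();
  return o;
}
```

With the budget at zero, analyze_before returns without trying anything,
while analyze_after skips the chain attempt and still reaches the
single-lane fallback. With budget left, both variants behave the same.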

This improves 731.astcenc_r (-Ofast) by 3.8% on EMR and 1.4% on Znver5
in single-copy runs.

gcc/ChangeLog:

	* tree-vect-slp.cc (vect_analyze_slp_reduction): Don't bail out
	early when SLP discovery limit is exhausted; only guard the chain
	analysis which may build multi-lane trees.  Single-lane fallback
	does not consume limit and should always be attempted.

Co-authored-by: Hongtao Liu <hongtao.liu@intel.com>
---
diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index cbcb08f6694762e6e9b94352cdbd22da044a182d..8a052c9baf147409b8d922e5cc395e5c2dd8c05a 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -4728,12 +4728,10 @@ vect_analyze_slp_reduction (loop_vec_info vinfo,
 {
   slp_instance_kind kind = slp_inst_kind_reduc_group;
 
-  /* If there's no budget left bail out early.  */
-  if (*limit == 0)
-    return false;
-
-  /* Try to gather a reduction chain.  */
+  /* Try to gather a reduction chain.  Only attempt if there's budget left
+     since chain analysis may build multi-lane trees that consume limit.  */
   if (! force_single_lane
+      && *limit != 0
       && STMT_VINFO_DEF_TYPE (scalar_stmt) == vect_reduction_def
       && vect_analyze_slp_reduc_chain (vinfo, bst_map, scalar_stmt,
                                       max_tree_size, limit))