From: Lili Cui
Date: Tue, 12 May 2026 17:01:00 +0000 (-0700)
Subject: [PATCH 2/2] tree-optimization/vect: Allow single-lane SLP fallback when limit is...
X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=cc5179f3a36a087891beecd1b3cb23d172dc76c7;p=thirdparty%2Fgcc.git

[PATCH 2/2] tree-optimization/vect: Allow single-lane SLP fallback when limit is exhausted

In vect_analyze_slp_reduction, the early bail "if (*limit == 0) return
false" blocked all SLP discovery including the single-lane fallback path.
However, single-lane SLP trees (group_size == 1) do not consume the
discovery limit as they cannot cause exponential tree growth.

This causes vectorization failures in loops with many independent
conditional reductions: multi-lane grouping attempts exhaust the limit,
then the single-lane fallback that would have succeeded is incorrectly
rejected.

The fix moves the limit check to only guard chain analysis (which builds
multi-lane trees and does consume limit), allowing the single-lane
fallback to always proceed.

This improves 731.astcenc_r (-Ofast) by 3.8% on EMR and 1.4% on Znver5
with single-copy.

gcc/ChangeLog:

	* tree-vect-slp.cc (vect_analyze_slp_reduction): Don't bail out
	early when SLP discovery limit is exhausted; only guard the chain
	analysis which may build multi-lane trees. Single-lane fallback
	does not consume limit and should always be attempted.

Co-authored-by: Hongtao Liu
---

diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index cbcb08f6694..8a052c9baf1 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -4728,12 +4728,10 @@ vect_analyze_slp_reduction (loop_vec_info vinfo,
 {
   slp_instance_kind kind = slp_inst_kind_reduc_group;
 
-  /* If there's no budget left bail out early. */
-  if (*limit == 0)
-    return false;
-
-  /* Try to gather a reduction chain. */
+  /* Try to gather a reduction chain. Only attempt if there's budget left
+     since chain analysis may build multi-lane trees that consume limit. */
   if (! force_single_lane
+      && *limit != 0
       && STMT_VINFO_DEF_TYPE (scalar_stmt) == vect_reduction_def
       && vect_analyze_slp_reduc_chain (vinfo, bst_map, scalar_stmt,
 				       max_tree_size, limit))
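
For readers who want to see the control-flow change in isolation, here is a toy model of the behaviour described in the commit message. It is only a sketch, not GCC code: `DiscoveryResult`, `try_chain`, `try_single_lane`, `analyze_old` and `analyze_new` are hypothetical names, and the assumption that each chain attempt costs exactly one unit of budget is made up for illustration.

```cpp
#include <cassert>

// Toy model only; none of these names exist in GCC.
struct DiscoveryResult
{
  bool vectorized;  // some SLP tree was built
  bool multi_lane;  // it came from the chain (multi-lane) path
};

// Stand-in for chain analysis: spends one unit of budget per attempt
// (an assumed cost) and succeeds only if chain_ok is true.
bool try_chain (unsigned *limit, bool chain_ok)
{
  if (*limit == 0)
    return false;
  --*limit;
  return chain_ok;
}

// Stand-in for the single-lane fallback: never touches the budget.
bool try_single_lane ()
{
  return true;
}

// Before the patch: an exhausted budget rejects both paths.
DiscoveryResult analyze_old (unsigned *limit, bool force_single_lane,
                             bool chain_ok)
{
  if (*limit == 0)
    return {false, false};
  if (!force_single_lane && try_chain (limit, chain_ok))
    return {true, true};
  return {try_single_lane (), false};
}

// After the patch: the budget check guards only the chain path.
DiscoveryResult analyze_new (unsigned *limit, bool force_single_lane,
                             bool chain_ok)
{
  if (!force_single_lane && *limit != 0 && try_chain (limit, chain_ok))
    return {true, true};
  return {try_single_lane (), false};
}

// Small self-check of the difference between the two variants.
void self_check ()
{
  unsigned exhausted = 0;
  assert (!analyze_old (&exhausted, false, true).vectorized);
  assert (analyze_new (&exhausted, false, true).vectorized);
}
```

In this model, a call with an exhausted budget returns "not vectorized" from `analyze_old` but falls through to the single-lane path in `analyze_new`, while a call with budget remaining behaves the same in both.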