fortran: Allow vector math functions only with fast-math [PR 118955]
Vector math functions are currently always enabled in Fortran. This is
incorrect since vector math functions are designed to be Ofast only.
Add a new 'fastmath' option which only accepts vector functions if fast-math
is enabled:
Wilco Dijkstra [Mon, 26 Jan 2026 14:11:17 +0000 (14:11 +0000)]
AArch64: Block symbols in literal pools [PR 123791]
Symbols with a large offset may be emitted into a literal pool.
aarch64_select_rtx_section() may then select .text or .rodata even if
the symbol has a dynamic relocation. Checking for CONST_INT or
CONST_DOUBLE avoids this. Ensure aarch64_cannot_force_const_mem()
returns true for symbols and labels so that they cannot be placed in
literal pools. Only the large code model emits symbols in the literal
pool (no support for PIC/PIE).
gcc:
PR target/123791
* config/aarch64/aarch64.cc (aarch64_cannot_force_const_mem):
Always return true for symbols and labels except for large model.
(aarch64_select_rtx_section): Force .rodata for constants only.
gcc/testsuite:
PR target/123791
* gcc.target/aarch64/pr123791.c: New test.
Tobias Burnus [Tue, 27 Jan 2026 10:50:26 +0000 (11:50 +0100)]
install.texi: Update GCN's newlib requirements
GCN requires Newlib 4.3.0 or newer, but improvements and bug fixes were added
in later versions. Previously, 4.3.0 was listed as required and several versions
up to post-4.5.0 commits were mentioned. Now that 4.6.0 has been released,
just referring to 4.6.0 is sufficient. As the fixes are important and there is no
real reason to use an older Newlib, only list 4.6.0+ as the version to use,
instead of also mentioning 4.3.0+ (and other older releases).
gcc/ChangeLog:
* doc/install.texi (gcn): Require Newlib 4.6.0+, replacing 4.3.0+
requirement with long list of recommended fixes up to post-4.5.0.
Nathaniel Shead [Fri, 9 Jan 2026 06:21:08 +0000 (17:21 +1100)]
c++/modules: Include instantiation origination for all name lookup [PR122609]
Many expressions rely on standard library names being visible when used,
such as for structured bindings, typeid, coroutine traits, and so forth.
When these expressions occur in templates, and the instantiations occur
in a TU where those names are not visible, we currently error, even if
the name was visible in the TU where the template was defined.
This is a poor user experience (made worse by fixit suggestions to add
an include in the module where the template was defined, which wouldn't fix
the error anyway). It seems reasonable to also include declarations that
were visible at the point the instantiation originated.
When using 'import std' this should fix most such errors. If using
traditional #includes to provide the standard library this may or may
not fix the error; in many cases we may still incorrectly discard the
relevant names (e.g. typeid in a template does not currently cause us to
consider std::type_info to be decl-reachable).
This also fixes the XFAIL in adl-12_b.C, as add_candidates now
properly considers names visible in the instantiation context of the
comparison.
PR c++/122609
PR c++/101140
gcc/cp/ChangeLog:
* cp-tree.h (visible_from_instantiation_origination): Declare.
* module.cc (orig_decl_for_instantiation): New function.
(path_of_instantiation): Use it.
(visible_from_instantiation_origination): New function.
* name-lookup.cc (name_lookup::search_namespace_only): Also find
names visible at the point the instantiation originated.
gcc/testsuite/ChangeLog:
* g++.dg/modules/adl-12_b.C: Remove XFAIL.
* g++.dg/modules/inst-8_a.C: New test.
* g++.dg/modules/inst-8_b.C: New test.
* g++.dg/modules/inst-8_c.C: New test.
* g++.dg/modules/inst-9_a.C: New test.
* g++.dg/modules/inst-9_b.C: New test.
* g++.dg/modules/inst-10_a.C: New test.
* g++.dg/modules/inst-10_b.C: New test.
* g++.dg/modules/inst-10_c.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
Jakub Jelinek [Tue, 27 Jan 2026 09:23:43 +0000 (10:23 +0100)]
c++: Don't error on non-consteval defaulted special members in consteval-only classes [PR123404]
As discussed earlier, the following testcase is incorrectly rejected.
While check_consteval_only_fn -> immediate_escalating_function_p
knows that defaulted special members are immediate-escalating:
  /* -- a defaulted special member function that is not declared with the
     consteval specifier */
  special_function_kind sfk = special_memfn_p (fn);
  if (sfk != sfk_none && DECL_DEFAULTED_FN (fn))
    return true;
it returns false anyway, because the call is too early and DECL_DEFAULTED_FN
is not set yet (unlike DECL_DELETED_FN).
For DECL_DEFAULTED_FN there is quite a bit more code, involving diagnostics for
invalid uses of = delete etc., later in grokfield:
  else if (init == ridpointers[(int)RID_DEFAULT])
    {
      if (defaultable_fn_check (value))
        {
          DECL_DEFAULTED_FN (value) = 1;
          DECL_INITIALIZED_IN_CLASS_P (value) = 1;
          DECL_DECLARED_INLINE_P (value) = 1;
          /* grokfndecl set this to error_mark_node, but we want to
             leave it unset until synthesize_method. */
          DECL_INITIAL (value) = NULL_TREE;
        }
    }
but that is after the
  else if (init == ridpointers[(int)RID_DEFAULT])
    initialized = SD_DEFAULTED;
  ...
  value = grokdeclarator (declarator, declspecs, FIELD, initialized, &attrlist);
call in the same function where grokdeclarator calls grokfndecl.
For defaulted special member functions there is nothing to diagnose: they
are always either immediate-escalating or explicitly consteval, and neither
case is diagnosed. The following patch therefore passes the whole initialized
value rather than just whether a fn is deleted, so grokfndecl knows whether it
is deleted or defaulted, and doesn't call check_consteval_only_fn in the
defaulted case. During the pt.cc check_consteval_only_fn call, DECL_DEFAULTED_FN
is already set before we test it.
2026-01-27 Jakub Jelinek <jakub@redhat.com>
PR c++/123404
* decl.cc (grokfndecl): Replace bool deletedp argument with
int initialized. Test initialized == SD_DELETED instead of deletedp.
Don't call check_consteval_only_fn for defaulted special member fns.
(grokdeclarator): Pass initialized rather than
initialized == SD_DELETED to grokfndecl.
Jakub Jelinek [Tue, 27 Jan 2026 09:19:39 +0000 (10:19 +0100)]
c++: Fix ICE in cxx_printable_name_internal [PR123578]
On the following testcase, we end up with cxx_printable_name_internal
recursion, in particular
+../../gcc/cp/pt.cc:20736
+../../gcc/cp/error.cc:625
+template_args=<tree_vec 0x7fffe94b59b0>, flags=4) at ../../gcc/cp/error.cc:1876
The ICE is due to a double free. That function caches up to 4
printed names, but if such a recursion happens, the inner call can change
ring_counter etc., and the caller will then store the result in a different
ring element from the one that was freed, so the freed one can be left
unmodified.
The patch fixes it by moving the lang_decl_name call earlier, after the:
  /* See if this print name is lying around. */
  for (i = 0; i < PRINT_RING_SIZE; i++)
    if (uid_ring[i] == DECL_UID (decl) && translate == trans_ring[i])
      /* yes, so return it. */
      return print_ring[i];
loop and repeating the loop again just for the theoretical case
that some recursion would add the same entry.
The ring_counter adjustment and decision which cache entry to reuse
for the cache is then done without the possibility of ring_counter
or the cache being changed in the middle.
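The ordering described above can be illustrated with a small model. This is only a sketch under assumptions, not the code in tree.cc: `NameCache`, `lookup` and `render` are hypothetical names, and `render` stands in for a name-printing call that may re-enter the cache. The name is computed first, the cache is searched again, and only then is a slot chosen and overwritten.

```cpp
#include <array>
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Hypothetical model of a small ring cache of printed names.
// None of these names come from tree.cc.
class NameCache {
public:
  static constexpr int kRingSize = 4;

  // `render` may call lookup() again (the recursion from the commit message).
  const std::string &lookup(int uid,
                            const std::function<std::string(int)> &render) {
    if (const std::string *hit = find(uid))
      return *hit;

    // Do the potentially recursive work before any slot is chosen.
    std::string name = render(uid);

    // A nested call may have cached the same uid in the meantime.
    if (const std::string *hit = find(uid))
      return *hit;

    // Pick and overwrite a slot only now; nothing can re-enter from here on.
    int slot = next_;
    next_ = (next_ + 1) % kRingSize;
    uids_[slot] = uid;
    names_[slot] = std::move(name);
    return names_[slot];
  }

private:
  const std::string *find(int uid) const {
    for (int i = 0; i < kRingSize; ++i)
      if (uids_[i] == uid)
        return &names_[i];
    return nullptr;
  }

  std::array<int, kRingSize> uids_{{-1, -1, -1, -1}};  // -1 marks an empty slot
  std::array<std::string, kRingSize> names_;
  int next_ = 0;
};
```

Because the slot index is read and advanced in one uninterrupted step, a re-entrant `render` cannot leave the caller holding a stale slot.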
2026-01-27 Jakub Jelinek <jakub@redhat.com>
PR c++/123578
* tree.cc (cxx_printable_name_internal): Call lang_decl_name before
finding the slot to cache it in and repeat search in the cache
after the call.
Frank Scheiner [Mon, 12 Jan 2026 09:48:58 +0000 (10:48 +0100)]
testsuite: only test with LTO if LTO support is actually configured
Bootstrapping GCC (c, c++) on ia64 w/o support for LTO ([1]) showed that
the testsuite (specifically c-c++-common/guality) executes tests with
`-flto` although there was no support for LTO configured.
This is because [...]/guality.exp adds test permutations w/`-flto`
unconditionally. Fix that by checking for LTO support and drop
permutations w/`-flto` if unsupported.
mul z2.s, z2.s, z1.s
mul z4.s, z4.s, z26.s
mul z1.s, z24.s, z1.s
mul z3.s, z23.s, z3.s
add z29.s, z2.s, z29.s
add z30.s, z1.s, z30.s
add z28.s, z3.s, z28.s
add z0.s, z4.s, z0.s
This is because, since the fix in r16-3328-g3182e95eda4, we insert casts around the
reduction addend. This causes convert_mult_to_fma to miss the mul + add
sequence.
This patch teaches it to look through the casts for the operands and to accept
the conversions only if they are essentially sign-changing operations.
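As an illustration of why a sign-only conversion around the addend is harmless, the sketch below routes a multiply-accumulate through same-width unsigned arithmetic and back. The function names are hypothetical and the code is not taken from the testcases; it assumes the modular unsigned-to-signed conversion that GCC implements (and that C++20 guarantees).

```cpp
#include <cassert>
#include <cstdint>

// Multiply-accumulate with casts around the addend, mimicking a reduction
// whose addition is carried out in the unsigned type of the same width.
inline std::int32_t mac_with_casts(std::int32_t acc, std::int32_t a,
                                   std::int32_t b) {
  std::uint32_t prod =
      static_cast<std::uint32_t>(a) * static_cast<std::uint32_t>(b);
  std::uint32_t sum = static_cast<std::uint32_t>(acc) + prod;
  return static_cast<std::int32_t>(sum);  // same width: only the sign changes
}

// The same computation kept entirely in unsigned arithmetic.
inline std::uint32_t mac_unsigned(std::uint32_t acc, std::uint32_t a,
                                  std::uint32_t b) {
  return acc + a * b;
}
```

Both functions produce the same bit pattern for the same inputs, so the mul + add pair is still a multiply-add candidate once the casts are looked through.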
PR tree-optimization/122749
* tree-ssa-math-opts.cc (convert_mult_to_fma_1, convert_mult_to_fma):
Unwrap converts around addend.
gcc/testsuite/ChangeLog:
PR tree-optimization/122749
* gcc.target/aarch64/pr122749_1.c: New test.
* gcc.target/aarch64/pr122749_2.c: New test.
* gcc.target/aarch64/pr122749_3.c: New test.
* gcc.target/aarch64/pr122749_4.c: New test.
* gcc.target/aarch64/pr122749_5.c: New test.
* gcc.target/aarch64/pr122749_6.c: New test.
* gcc.target/aarch64/pr122749_8.c: New test.
* gcc.target/aarch64/pr122749_9.c: New test.
* gcc.target/aarch64/sve/pr122749_1.c: New test.
* gcc.target/aarch64/sve/pr122749_11.c: New test.
* gcc.target/aarch64/sve/pr122749_12.c: New test.
* gcc.target/aarch64/sve/pr122749_13.c: New test.
* gcc.target/aarch64/sve/pr122749_14.c: New test.
* gcc.target/aarch64/sve/pr122749_2.c: New test.
* gcc.target/aarch64/sve/pr122749_3.c: New test.
* gcc.target/aarch64/sve/pr122749_4.c: New test.
* gcc.target/aarch64/sve/pr122749_5.c: New test.
* gcc.target/aarch64/sve/pr122749_6.c: New test.
* gcc.target/aarch64/sve/pr122749_8.c: New test.
* gcc.target/aarch64/sve/pr122749_9.c: New test.
Patrick Palka [Tue, 27 Jan 2026 03:00:16 +0000 (22:00 -0500)]
c++: non-dep decltype folding of concept-id C<Ts...> [PR123676]
Here, since the expression within the decltype, C<Ts...>, is not instantiation
dependent (we know its type is bool, and don't care about its value),
finish_decltype_type instantiates it immediately via the usual tsubst_expr
with NULL_TREE args. During this, however, tsubst_pack_expansion isn't
prepared to handle such a substitution due to an overly strict assert.
This patch relaxes the assert accordingly.
PR c++/123676
gcc/cp/ChangeLog:
* pt.cc (tsubst_pack_expansion): Relax unsubstituted_packs
assert to allow !processing_template_decl when args is
NULL_TREE.
In the PR122494 testcase we constant evaluate 'B<int>::v == 0' first
during warning-dependent folding, which is restricted to avoid unnecessary
template instantiation. During this restricted evaluation we do
decl_constant_value on v which in turn manifestly constant evaluates v's
initializer. But this nested evaluation is incorrectly still restricted
since the restriction mechanism is controlled by a global flag. This
causes constraint checking for A<int> to spuriously fail.
We could narrowly fix this by guarding the decl_constant_value code path
with uid_sensitive_constexpr_evaluation_p but that would overly
pessimize warning-dependent folding of constexpr variables with simple
initializers. The problem is ultimately that the restriction mechanism
is misdesigned, and it shouldn't be a global toggle, instead it should
be local to the constexpr evaluation context and propagated accordingly.
The PR123814 testcase is similar except that the nested manifestly
constant evaluation happens through __fold_builtin_source_location
(which performs arbitrary tsubst).
Until we remove or reimplement the mechanism, this patch disables the
mechanism during nested manifestly constant evaluation. We don't ever
want such evaluation to be restricted since it has semantic consequences.
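The hazard of a global toggle, and the stop-gap of clearing it for nested evaluation, can be sketched as follows. This is a hypothetical model, not GCC code: `g_restricted`, `ScopedFlag` and `evaluate` are made-up names standing in for the restriction flag and the evaluator entry point.

```cpp
#include <cassert>

// Hypothetical stand-in for a global "restricted evaluation" toggle.
static bool g_restricted = false;

// RAII helper: force a value for the flag in a scope, restore it on exit.
class ScopedFlag {
public:
  ScopedFlag(bool &flag, bool value) : flag_(flag), saved_(flag) {
    flag_ = value;
  }
  ~ScopedFlag() { flag_ = saved_; }
  ScopedFlag(const ScopedFlag &) = delete;
  ScopedFlag &operator=(const ScopedFlag &) = delete;

private:
  bool &flag_;
  bool saved_;
};

// Hypothetical evaluator entry point. Returns whether the evaluation ran
// in restricted mode. A manifestly constant evaluation clears the flag for
// its own duration, even when an outer evaluation had set it.
inline bool evaluate(bool manifestly_constant) {
  ScopedFlag guard(g_restricted, manifestly_constant ? false : g_restricted);
  return g_restricted;
}
```

The flag is restored when the nested evaluation finishes, so the outer, warning-driven evaluation continues in restricted mode.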
PR c++/122494
PR c++/123814
gcc/cp/ChangeLog:
* constexpr.cc (cxx_eval_outermost_constant_expr): Clear
uid_sensitive_constexpr_evaluation_value when mce_true.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/concepts-pr122494.C: New test.
* g++.dg/cpp2a/concepts-pr123814.C: New test.
PR analyzer/123145 tracks a large slowdown seen in -fanalyzer on
a particular test case in qemu.
Profiling showed a large amount of time being spent iterating through
binding maps with binding_map::const_iterator::operator*, due to the
work spent converting from bit_range to concrete_key via
store_manager::get_concrete_binding.
Many of the iterations where this is done are merely looking at the bound
svalues, not the keys, so this work is wasted.
This patch updates these iterations to avoid needing to do work on the
keys.
Crude benchmarking (on a debug, not release build) showed a speedup on
the test case, from 3 hours to 2.2 hours.
No functional change intended.
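The idea of a value-only walk can be sketched with a toy container. Everything here is hypothetical (`BindingMap`, `materialize_key`, `for_each_binding`, `for_each_value`); it only models the cost difference between an iteration that converts each key and one that never touches the keys.

```cpp
#include <cassert>
#include <map>
#include <utility>

using BitRange = std::pair<unsigned, unsigned>;  // (start bit, size in bits)

// Toy model of a binding map; not the analyzer's data structure.
class BindingMap {
public:
  void bind(BitRange range, int value) { bindings_[range] = value; }

  // Stand-in for an expensive conversion from the compact key to a richer
  // key object. It counts calls so the cost is observable.
  BitRange materialize_key(const BitRange &r) const {
    ++key_conversions_;
    return r;
  }

  // Key-and-value walk: pays for one key conversion per element.
  template <typename F> void for_each_binding(F f) const {
    for (const auto &kv : bindings_)
      f(materialize_key(kv.first), kv.second);
  }

  // Value-only walk: never converts a key.
  template <typename F> void for_each_value(F f) const {
    for (const auto &kv : bindings_)
      f(kv.second);
  }

  unsigned key_conversions() const { return key_conversions_; }

private:
  std::map<BitRange, int> bindings_;
  mutable unsigned key_conversions_ = 0;
};
```

Callers that only inspect bound values can use the value-only walk and skip the per-element conversion entirely.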
gcc/analyzer/ChangeLog:
PR analyzer/123145
* program-state.cc (sm_state_map::impl_set_state): Update
iteration to avoid looking up binding_key values.
* region-model-reachability.cc (reachable_regions::handle_sval):
Use iter.get_svalue.
(reachable_regions::handle_parm): Likewise.
* region-model.cc (iterable_cluster::iterable_cluster): Update
iteration to avoid looking up binding_key values.
(iterable_cluster::dump_to_pp): Likewise.
(exposure_through_uninit_copy::calc_num_uninit_bits): Likewise.
(exposure_through_uninit_copy::complain_about_uninit_ranges):
Likewise.
(contains_uninit_p): Likewise.
* store.cc (binding_map::hash): Likewise.
* store.h (bit_range::hash): New, based on...
(concrete_binding::hash): ...this. Reimplement using the above.
(binding_map::const_iterator::get_svalue): New decl.
(binding_map::get_symbolic_bindings): New accessor.
(binding_map::for_each_value): Update iteration to avoid looking
up binding_key values.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Andrew Pinski [Mon, 26 Jan 2026 06:23:51 +0000 (22:23 -0800)]
slsr: Use the correct type to try to convert to for inserting on edge case [PR123820]
I had noticed there was code that will convert the stride
to the correct type.
What I didn't realize was that the type it was trying to
use was stride_type, but for this case it should have been using
the type of the lhs. This fixes that oversight. Note for pointers
we still want to use stride_type like what is done right above.
I don't have a testcase that does not use LTO, though; I didn't figure
out why this testcase needed LTO.
Bootstrapped and tested on x86_64-linux-gnu.
PR tree-optimization/123820
gcc/ChangeLog:
* gimple-ssa-strength-reduction.cc (create_add_on_incoming_edge): Use
the correct type for the stride (lhs if non-pointer).
gcc/testsuite/ChangeLog:
* g++.dg/torture/pr123820-1.C: New test.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Andrew Pinski [Sun, 25 Jan 2026 23:08:31 +0000 (15:08 -0800)]
final: Fix out of bounds access for invalid asm operands [PR123709]
output_asm_insn has an out of bounds array access if the supplied
operand number in the inline-asm is "big" (>=MAX_RECOG_OPERANDS).
This makes it so that there is no longer an out of bounds access
by increasing the two arrays by one and using the last element as
the fake location for all out of range operands.
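The "one extra element" technique can be sketched in isolation. This is a hypothetical model, not final.cc: `kMaxOperands` stands in for MAX_RECOG_OPERANDS with an arbitrary value, and `OperandLog` is a made-up type.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical limit; the real MAX_RECOG_OPERANDS is target-dependent.
constexpr std::size_t kMaxOperands = 30;

class OperandLog {
public:
  // Map every out-of-range operand number onto the extra last slot.
  static std::size_t clamp(std::size_t opnum) {
    return opnum < kMaxOperands ? opnum : kMaxOperands;
  }

  void record(std::size_t opnum) { seen_[clamp(opnum)] = true; }

  bool seen_valid(std::size_t opnum) const {
    return opnum < kMaxOperands && seen_[opnum];
  }

  // True if any invalid operand number was recorded.
  bool saw_invalid() const { return seen_[kMaxOperands]; }

private:
  // One element more than the limit; the last one is the shared
  // "fake location" for all invalid operand numbers.
  std::array<bool, kMaxOperands + 1> seen_{};
};
```

Consumers that walk the array then skip index `kMaxOperands`, which matches the role the ChangeLog assigns to output_asm_operand_names.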
This could be a regression because r0-38026-g4f9b4029463bc0 seems to have
introduced the out of bounds access but
Bootstrapped and tested on x86_64-linux-gnu.
PR middle-end/123709
gcc/ChangeLog:
* final.cc (output_asm_operand_names): Skip over
opnum which is MAX_RECOG_OPERANDS (invalid).
(output_asm_insn): Increase opoutput and oporder size
by 1. For out of range operands, set the opnum to
MAX_RECOG_OPERANDS.
gcc/testsuite/ChangeLog:
* c-c++-common/asm-invalid-operand-1.c: New test.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Joseph Myers [Mon, 26 Jan 2026 16:49:29 +0000 (16:49 +0000)]
testsuite: Make profopt-execute also copy profile data for dg-additional-sources
Most gcc.dg/tree-prof tests work correctly in environments where .gcda
files from the first run need to be copied from the target, because
there is existing code in profopt-execute to do so. A few tests using
dg-additional-sources fail because that code only copies the .gcda
file for the main test source file. Add similar code to copy it for
any sources listed in dg-additional-sources as well.
The use of additional_sources_used is consistent with what
profopt-target-cleanup does. It turns out to require the added call
to cleanup-after-saved-dg-test to avoid additional_sources_used
leaking from one test into the next.
Tested for x86_64-pc-linux-gnu to make sure native testing isn't
broken, and with cross to aarch64-linux.
* lib/profopt.exp (profopt-execute): Also copy profile data from
target for additional sources. Call cleanup-after-saved-dg-test
before normal return.
Karl Meakin [Thu, 11 Dec 2025 16:49:14 +0000 (16:49 +0000)]
aarch64: fix `asm.c` tests
The assembly functions declared in `asm_1.c` and `asm_3.c` were not marked
global, so they could not be found by the linker, and would cause the
`asm_2.c` and `asm_4.c` tests to fail. Fix by marking the functions with
`.globl`.
Karl Meakin [Thu, 11 Dec 2025 16:24:51 +0000 (16:24 +0000)]
aarch64: fix `mangle_5.C` test
The `volatile` type qualifier is deprecated on function parameters and
return types since C++20. This caused an extra warning to be emitted, so
the test failed. Fix it by suppressing the warning.
Marek Polacek [Thu, 22 Jan 2026 20:52:45 +0000 (15:52 -0500)]
c++/reflection: detect more invalid splices in lambda
The comment for check_splice_expr in
<https://gcc.gnu.org/pipermail/gcc-patches/2025-December/704168.html>
points out that I wasn't giving an error for this testcase. (clang++
rejects the testcase.)
Fixed by checking if process_outer_var_ref returned a capture.
gcc/cp/ChangeLog:
* reflect.cc (check_splice_expr): Check if process_outer_var_ref
returned a capture, and give an error if so.
Marek Polacek [Thu, 22 Jan 2026 19:49:08 +0000 (14:49 -0500)]
c++: tweak for cp_parser_type_id_1
This addresses the cp_parser_type_id_1 comment in
<https://gcc.gnu.org/pipermail/gcc-patches/2025-December/704168.html>
asking for simplifying the type_alias_p handling.
gcc/cp/ChangeLog:
* parser.cc (cp_parser_type_specifier): Adjust comment.
(cp_parser_type_id_1): Simplify setting of type_alias_p.
Use nullptr instead of NULL.
Marek Polacek [Thu, 22 Jan 2026 16:31:35 +0000 (11:31 -0500)]
c++/reflection: fix fnptr extraction [PR123620]
When extracting a function pointer, removing noexcept should be
allowed (but not the other way round):
int fn (int) noexcept;
constexpr auto a = extract<int (*)(int)>(^^fn);
but currently we reject this code -- I didn't realize that fnptr_conv_p
allows things that same_type_p doesn't allow, and in can_extract_* we
should check both. And then we need to perform the actual conversion.
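The underlying language rule can be checked without reflection. The sketch below uses only standard C++17 facilities; `twice`, `plain_fn`, `noexcept_fn` and `call_through_plain_pointer` are illustrative names, and the code does not use `extract` or any reflection API.

```cpp
#include <cassert>
#include <type_traits>

int twice(int x) noexcept { return 2 * x; }

using plain_fn = int (*)(int);
using noexcept_fn = int (*)(int) noexcept;

// Dropping noexcept is an implicit function pointer conversion ...
static_assert(std::is_convertible_v<noexcept_fn, plain_fn>);
// ... but adding it is not.
static_assert(!std::is_convertible_v<plain_fn, noexcept_fn>);
// The two pointer types are nevertheless distinct types, which is why a
// pure same-type check rejects the conversion.
static_assert(!std::is_same_v<noexcept_fn, plain_fn>);

inline int call_through_plain_pointer(int x) {
  plain_fn p = twice;  // converts away noexcept
  return p(x);
}
```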
PR c++/123620
gcc/cp/ChangeLog:
* reflect.cc (can_extract_member_or_function_p): Also check
fnptr_conv_p.
(extract_member_or_function): Call perform_implicit_conversion.
gcc/testsuite/ChangeLog:
* g++.dg/reflect/extract1.C: Test removing noexcept.
* g++.dg/reflect/extract2.C: Adjust static_assert.
Richard Biener [Fri, 23 Jan 2026 14:24:53 +0000 (15:24 +0100)]
tree-optimization/122474 - adjust check for .VEC_SHL_INSERT
The r16-4558-g1b387bd8978577 change added a check that does not
match what the commit message says. The following fixes this,
trying to more closely match up what we later do during transform
and what we check here. As cleanup this also makes sure that
we compute the neutral value for the scalar type and consistently
when it depends on the initial value, recording it in a new
VECT_REDUC_INFO_NEUTRAL_OP. This avoids issues with comparing
the neutral and initial value when it is bool. This also
refactors vect_transform_cycle_phi a bit to remove dead code
and make the flow easier to follow.
PR tree-optimization/122474
* tree-vectorizer.h (vect_reduc_info_s::neutral_op): New.
(VECT_REDUC_INFO_NEUTRAL_OP): New.
* tree-vect-loop.cc (vectorizable_reduction): Adjust condition
guarding the check for .VEC_SHL_INSERT.
* gcc.target/aarch64/sve2/pr123053.c: New testcase.
* gcc.target/riscv/rvv/pr122474.c: Likewise.
Richard Biener [Mon, 26 Jan 2026 07:57:45 +0000 (08:57 +0100)]
tree-optimization/123755 - fix LEN-masking of trapping calls
There are multiple issues with properly handling len-masking of calls
that might trap. Similar to get_conditional_internal_fn,
get_len_internal_fn expects a COND_* argument only. When the
original call is not already masked, computation and code-gen fail
to add mask and else arguments.
This fixes gcc.target/riscv/rvv/autovec/reduc/reduc_call-5.c
PR tree-optimization/123755
* tree-vect-stmts.cc (vectorizable_call): Fixup LEN masking
of unconditional but possibly trapping calls.
vect: Fix outer loop vectorization for nested uncounted loops [PR123657]
Given the inability of `expr_invariant_in_loop_p (loop, expr)' to
handle the `scev_not_known' node as the expression, an unknown loop
bound in the inner loop in a nested set of loops led
`vect_analyze_loop_form' to erroneously consider the outer loop as
suitable for vectorization. This introduces the necessary unknown
loop iteration count check to ensure correct handling of counted loops
with an embedded uncounted loop.
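For readers unfamiliar with the terminology, the loop shape in question looks roughly like the sketch below: a counted outer loop containing an inner loop whose iteration count is not known before it runs. This is an illustrative example only, with a hypothetical function name; it is not the testcase from the PR and is not claimed to reproduce the bug.

```cpp
#include <cassert>
#include <cstddef>

// Outer loop: counted, it runs exactly `count` times.
// Inner loop: uncounted, it stops at a sentinel, so no iteration count
// can be computed on entry.
inline void string_lengths(const char *const *strs, std::size_t count,
                           std::size_t *out) {
  for (std::size_t i = 0; i < count; ++i) {
    const char *p = strs[i];
    while (*p != '\0')
      ++p;
    out[i] = static_cast<std::size_t>(p - strs[i]);
  }
}
```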
Tamar Christina [Mon, 26 Jan 2026 08:06:34 +0000 (08:06 +0000)]
vect: distinguish between MASK_CALL and non-trapping calls [PR123628]
In the fix for PR122103 an ambiguity was introduced when it comes to Fortran
due to an inconsistency in libmvec headers.
A reproducer is
!GCC$ builtin (expf) attributes simd (notinbranch)
  SUBROUTINE a(b)
    REAL, DIMENSION(:) :: b
    c: DO i = 1, d
      IF (e <= f) THEN
        g = EXP(h)
        r = g
        IF (r > s) THEN
          b(i) = t
        END IF
      END IF
    END DO c
  END
compiled with -O2 -march=armv8-a+sve. Note that Fortran, unlike C, provides the
libmvec header math-vector-fortran.h unconditionally, which is a separate bug,
PR118955, that causes the functions to become available outside of -Ofast.
This means the cases for MASK_CALL and trapping math overlap for Fortran at -O2.
The new masking code shouldn't handle SIMD clones.
gcc/ChangeLog:
PR tree-optimization/122103
PR tree-optimization/123628
* tree-if-conv.cc (if_convertible_simdclone_stmt_p): New.
(if_convertible_stmt_p, predicate_statements): Use it.
gcc/testsuite/ChangeLog:
PR tree-optimization/122103
PR tree-optimization/123628
* gfortran.target/aarch64/pr123628.f90: New test.
Nathaniel Shead [Fri, 23 Jan 2026 23:11:35 +0000 (10:11 +1100)]
c++: Fix behaviour of nested maybe_push_to_top_level [PR123663]
What happens in the linked PR is that when evaluating the concept we
call 'push_to_top_level', we see cfun is non-null, and so call
'push_function_context' which sets cfun to NULL and sets a flag so that
we remember to pop it later.
Then, when instantiating Foo's default constructor as part of the
concept, we call 'maybe_push_to_top_level'. Here we see that 'Foo' is
function-local, so '!push_to_top', and we call 'push_function_context'.
This allocates a new cfun for some reason, and pushes that empty cfun.
Eventually we 'maybe_pop_from_top_level', and restore that newly
allocated cfun (instead of a NULL cfun), which means that when we start
trying to build the new-expression (which requires building a statement
list) we try accessing the (uninitialized) cfun's x_stmt_tree rather
than the scope_chain's x_stmt_tree, and so crash.
This fixes the issue by also remembering whether we had a cfun when
doing maybe_push_to_top_level so that we only do push_function_context
if needed.
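The fix can be modelled with a few lines of state handling. This is a simplified sketch under assumptions, not name-lookup.cc: `current_fn`, `fn_stack`, `push_context`, `pop_context` and `LocalState` are hypothetical stand-ins for cfun, the function context stack and local_state_t.

```cpp
#include <cassert>
#include <vector>

struct Function { int id; };

// Hypothetical stand-ins for cfun and the function context stack.
static Function *current_fn = nullptr;
static std::vector<Function *> fn_stack;

static void push_context() {
  fn_stack.push_back(current_fn);
  current_fn = nullptr;
}

static void pop_context() {
  current_fn = fn_stack.back();
  fn_stack.pop_back();
}

// Remembers whether there was a current function to save, so that the
// matching restore only pops when a push actually happened.
struct LocalState {
  bool has_fn = false;

  void save_and_clear() {
    has_fn = current_fn != nullptr;
    if (has_fn)
      push_context();
  }

  void restore() {
    if (has_fn)
      pop_context();
  }
};
```

With the flag in place, a nested save that finds no current function neither pushes nor pops, so the outer level restores exactly what it saved.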
This also seems to fix PR123354.
PR c++/123663
PR c++/123354
gcc/cp/ChangeLog:
* name-lookup.cc (struct local_state_t): New flag has_cfun.
(local_state_t::save_and_clear): Set has_cfun, call
push_function_context iff there's a cfun to save.
(local_state_t::restore): Call pop_function_context if
has_cfun is set.
(maybe_push_to_top_level): Delegate push_function_context to
local_state_t::save_and_clear.
(maybe_pop_from_top_level): Delegate pop_function_context to
local_state_t::restore.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/concepts-pr123663.C: New test.
* g++.dg/template/pr123354.C: New test.
Reviewed-by: Jason Merrill <jason@redhat.com>
Reviewed-by: Patrick Palka <ppalka@redhat.com>
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Sandra Loosemore [Sun, 25 Jan 2026 22:43:58 +0000 (22:43 +0000)]
Fix gcc-urlifier selftest failure
My recent commits for PR122243 added index entries for -fno-* options
as well as their normal positive forms. Apparently the "urlifier"
used to insert option URLs into diagnostic messages can find the
anchor for either form, but its self-tests are hard-wired to match
only the positive form for the two specific options it's looking up.
This patch robustifies it to allow it to match the anchor for either
the positive or negative forms.
gcc/ChangeLog
* gcc-urlifier.cc (test_gcc_urlifier): Match either positive
or negative option URLs.
Iain Sandoe [Sat, 24 Jan 2026 14:22:43 +0000 (14:22 +0000)]
c++: Do not mark STATEMENT_LISTs as visited in genericization.
This addresses a latent issue in C++ genericization (only seen
in development code, so far).
In the following code snippet using facilities from the proposed
C++26 contracts implementation:
  while (!SWAPPER::isFinished()) {
    uc = SWAPPER::swapBytes();
    if (0 != uc) {
    }
  }
  contract_assert( translator.d_capacity >= 1 );
During genericization, a statement list from the while loop is freed.
The expansion of the contract_assert then requires a 'new' statement list.
Since the statement list in the while has been visited, it was marked as
such.
A specific property of statement lists is that they are cached using a
LIFO stack. So that the statement list picked for the contract_assert
is the one freed from the while loop. However since that list entry
has already been marked as visited, the newly created contract expansion
is not visited (leading to an ICE).
The solution here is to forgo marking STATEMENT_LISTs as visited in this
code (which provides for potential future cases, as well as resolving
the specific instance seen).
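The interaction between a LIFO cache and an address-keyed visited set can be reproduced in miniature. The sketch below is a hypothetical model (`StmtList`, `StmtListPool` and `Walker` are made-up names), not the genericizer: it shows a recycled node being skipped when lists are tracked as visited, and processed when they are not.

```cpp
#include <cassert>
#include <memory>
#include <unordered_set>
#include <vector>

struct StmtList { int payload = 0; };

// LIFO cache of freed lists: the most recently released node is reused first.
class StmtListPool {
public:
  StmtList *alloc() {
    if (!free_.empty()) {
      StmtList *reused = free_.back();
      free_.pop_back();
      return reused;
    }
    owned_.push_back(std::make_unique<StmtList>());
    return owned_.back().get();
  }

  void release(StmtList *list) { free_.push_back(list); }

private:
  std::vector<std::unique_ptr<StmtList>> owned_;
  std::vector<StmtList *> free_;
};

// Walker that skips nodes whose address is already in the visited set.
class Walker {
public:
  explicit Walker(bool track_lists) : track_lists_(track_lists) {}

  // Returns true if the list was actually processed.
  bool visit(StmtList *list) {
    if (visited_.count(list) != 0)
      return false;
    if (track_lists_)
      visited_.insert(list);
    return true;
  }

private:
  bool track_lists_;
  std::unordered_set<StmtList *> visited_;
};
```

The visited set keys on addresses, so it cannot tell a recycled node from the original one; not recording such nodes avoids the stale entry.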
gcc/cp/ChangeLog:
* cp-gimplify.cc (cp_genericize_r): Do not mark STATEMENT_LISTs
as visited.
Roger Sayle [Sun, 25 Jan 2026 21:06:39 +0000 (21:06 +0000)]
PR middle-end/122348: ICE in store_constructor from flexible array member
This patch resolves PR middle-end/122348, an ICE caused by passing an
initialized structure containing a flexible array member by value.
The semantics in C99 (and since gcc 4.4) are that the zero sized array
at the end of the structure is ignored when passing by value. Hence
for the structure in the PR:
struct S {
  int a;
  int b[];
} s = { 0, { 42 } };
when passed by value, sizeof(s) is considered to be 4 bytes, and on
x86_64 passed in the 32-bit %edi register. Unfortunately, the code
in store_constructor isn't expecting initialized fields where the
type's DECL_SIZE is NULL, which leads to the ICE. Fixed by explicitly
ignoring fields where DECL_SIZE is NULL_TREE. On x86_64, passing "s"
now compiles to just:
f: xorl %edi, %edi
jmp foo
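The size claim above can be checked directly. The sketch below assumes a compiler that accepts flexible array members in C++ as an extension (GCC and Clang do by default, with a warning under -pedantic); `fixed_part_size` is a hypothetical helper name, and the initializer from the PR is deliberately left out.

```cpp
#include <cassert>
#include <cstddef>

struct S {
  int a;
  int b[];  // flexible array member: standard in C99, an extension in C++
};

// The flexible array member contributes nothing to sizeof, so only the
// fixed part of the structure is counted.
static_assert(sizeof(S) == sizeof(int),
              "flexible array member is not counted in sizeof");

constexpr std::size_t fixed_part_size() { return sizeof(S); }
```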
2026-01-25 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
PR middle-end/122348
* expr.cc (store_constructor): Ignore fields where DECL_SIZE
is NULL_TREE, i.e. flexible array members.
gcc/testsuite/ChangeLog
PR middle-end/122348
* g++.dg/pr122348.C: New C++ testcase.
* gcc.dg/pr122348.c: New C testcase.
Sandra Loosemore [Sat, 24 Jan 2026 21:37:40 +0000 (21:37 +0000)]
doc: whitespace fixes in option summary
We should be consistently using two spaces between options on the same
line in @gccoptlist environments, in order to visually distinguish
options that take a separate argument (with only one space).
e.g.
-foo @var{arg} -bar
Most places in the Option Summary section follow this convention already
but people have not always been consistent about it when adding
new options.
gcc/ChangeLog
* doc/invoke.texi (Option Summary): Fix whitespace in @gccoptlist
tables.
Sandra Loosemore [Sat, 10 Jan 2026 01:04:56 +0000 (01:04 +0000)]
doc: Fix various option documentation problems [PR122243]
This patch fixes a number of minor problems I found after my initial
pass through the options documentation; a few options still missing
documentation, options documented but missing entries in the index or
option summary, options whose names were misspelled in either the main
entry, option summary, or index, etc.
gcc/ChangeLog
PR other/122243
* doc/cppdiropts.texi: Document -imultiarch.
* doc/invoke.texi (Option Summary) <Optimization Options>: Add
-flto-toplevel-asm-heuristics.
<Program Instrumentation Options>: Remove -fbounds-check.
<Directory Options>: Add -imultiarch.
<ARC Options>: Add -mbitops, -mcmem, -munaligned-access.
<ARM Options>: Add -mvectorize-with-neon-quad and
-mvectorize-with-neon-double.
<AVR Options>: Add -mrmw and -mstrict-X.
<CRIS Options>: Fix typo in -mmax-stackframe.
<Cygwin and MinGW Options>: Add -muse-libstdc-wrappers.
<M680x0 Options>: Add several missing CPU options, plus -mxtls.
<MIPS Options>: Add -mno-data-in-code and -mcode-xonly.
<MMIX Options>: Add -mset-data-start, -mset-program-start, and
-mno-set-program-start.
<Nvidia PTX Options>: Add -msoft-stack-reserve-local.
<RS/6000 and PowerPC Options>: Add -mprofile-kernel, -mbit-word,
-mno-splat-word-constant, -mno-splat-float-constant,
-mno-ieee128-constant, and -mno-warn-altivec-long.
(Optimization Options): Document -flto-toplevel-asm-heuristics.
(ARC Options): Document -mbitops and -mcmem.
(ARM Options): Add index entries for mbe32,
m[no-]fix-cortex-a57-aes-1742098, m[no-]fix-cortex-a72-aes-1655431.
Document -mvectorize-with-neon-quad and -mvectorize-with-neon-double.
(AVR Options): Document -mpmem-wrap-around.
(CRIS Options): Fix typo in -mmax-stackframe.
(Cygwin and MinGW Options): Document -muse-libstdc-wrappers.
(DEC Alpha Options): Fix typo in -mfp-regs.
(eBPF Options): Add @opindex for -mframe-limit.
(HPPA Options): Fix typos in -mno-disable-fpregs and -mno-gas
index entries.
(m680x0 Options): Document -m68302, -m68332, -m68851, and -mfidoa.
Document -mnoshort and -mnortd aliases. Document -mxtls.
(MCore Options): Fix typos in -m[no-]relax-immediates.
(MIPS Options): Document -mno-data-in-code and -mcode-xonly.
(MMIX Options): Document -mset-data-start, -mset-program-start, and
-mno-set-program-start.
(Nvidia PTX Options): Document -msoft-stack-reserve-local.
(RS/6000 and PowerPC Options): Document -mprofile-kernel,
-mbit-word, -msplat-word-constant, -msplat-float-constant,
-mieee128-constant, and -mwarn-altivec-long.
(SH Options): Add index entry for -m2e. Document -m4-400.
Sandra Loosemore [Sun, 25 Jan 2026 00:40:51 +0000 (00:40 +0000)]
doc: Mark more options as Undocumented/RejectNegative [PR122243]
In reviewing the autogenerated list of remaining undocumented options
after my first pass through the whole chapter, I found several more
options I'd previously overlooked that should either be marked
Undocumented, or that were missing a RejectNegative attribute where
one was plainly appropriate.
gcc/c-family/ChangeLog
PR other/122243
* c.opt (-output-pch): Mark as Undocumented, as it seems to be
an internal option that has never been documented anyway.
(Werror-implicit-function-declaration): Mark deprecated option
that is not currently documented as Undocumented.
(fconstant-string-class=): Add RejectNegative property.
gcc/ChangeLog
PR other/122243
* common.opt (fbounds-check): Mark as Undocumented, expand comments
to explain why.
* config/frv/frv.opt (msched-lookahead=): Mark unused option as
Undocumented.
* config/m68k/m68k.opt (m68851): Add RejectNegative.
* config/nvptx/nvptx.opt (minit-regs=): Mark as Undocumented. It's
not currently documented and seems to have been introduced as a
stopgap to experiment with different implementation strategies.
* config/rs6000/476.opt (mpreserve-link-stack): Mark as Undocumented.
It seems to be an internal option that is enabled by default on the
cpu that can benefit from it.
This patch corrects the option summary for debug options, adds missing
@opindex entries, and copy-edits the option descriptions in this section
of the manual.
I also marked -gtoggle as RejectNegative; given its documented special
handling, it does not seem possible to use the negative form to override
an earlier positive option.
gcc/ChangeLog
PR other/122243
* common.opt (gtoggle): Mark RejectNegative.
* doc/invoke.texi (Option Summary) <Debugging Options>: Remove
redundant -gno- forms from the list.
(Debugging Options): Add @opindex for -gno- option forms.
Copy-edit option descriptions to avoid future tense and use of
implementor jargon.
Xin Wang [Sat, 24 Jan 2026 23:54:46 +0000 (23:54 +0000)]
libgcc: Use UTItype with mode(TI) for 16-byte atomics
The use of UOItype with mode(OI) for 16-byte atomic operations is
non-standard. The OI mode is not defined in machmode.def and exists
only as an ad-hoc construct in libgcc/sync.c.
This patch replaces it with UTItype using mode(TI), which is the
standard GCC machine mode for 16-byte integers (Tetra Integer).
The size argument is also corrected from 8 to 16 to match the actual
operand width.
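A minimal sketch of the typedef style involved, not the actual libgcc/sync.c source. It assumes a GCC-compatible compiler and a target that provides TImode (for example x86_64 or aarch64). UTItype is the name from the commit; atomic_operand_size is a hypothetical helper added only for illustration.

```cpp
#include <cassert>
#include <cstddef>

// GCC extension: select the integer type by machine mode.  TI is the
// 16-byte integer mode; this assumes the target supports it.
typedef unsigned int UTItype __attribute__((mode(TI)));

static_assert(sizeof(UTItype) == 16, "TImode integers are 16 bytes wide");

// Hypothetical helper: the size that should accompany a UTItype operand
// (16, not 8).
std::size_t atomic_operand_size() {
    std::size_t size = sizeof(UTItype);
    assert(size == 16);
    return size;
}
```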
libgcc/ChangeLog:
* sync.c: Replace UOItype with UTItype and use mode(TI); pass 16, not
8, to DEFINE macro.
Andrew Pinski [Sat, 24 Jan 2026 07:13:25 +0000 (23:13 -0800)]
SLSR: Wrong type sometimes for rhs2 with pointers [PR123803]
I messed up one of the new gimple_convert when there is
still a pointer type (with -fwrapv-pointer or -fno-strict-overflow).
The newrhs2 should be converted into sizetype instead of the type
of the lhs. Other places were already done correctly, it was
just in replace_rhs_if_not_dup which was broken this way.
Pushed as obvious after bootstrapped and tested on x86_64-linux-gnu.
PR tree-optimization/123803
gcc/ChangeLog:
* gimple-ssa-strength-reduction.cc (replace_rhs_if_not_dup): For
pointer lhs use sizetype.
gcc/testsuite/ChangeLog:
* gcc.dg/pr123803-1.c: New test.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Jerry DeLisle [Sat, 24 Jan 2026 02:52:34 +0000 (18:52 -0800)]
Fortran: Fix missed finalization
PR fortran/123772
gcc/fortran/ChangeLog:
* trans.cc: Add global variable is_assign_call.
(gfc_finalize_tree_expr): Derived type function results
with components that have defined assignments are
handled in resolve.cc(generate_component_assignments), unless
the assignment was replaced by a subroutine call to the
subroutine associated with the assignment operator.
(trans_code): In the case of EXEC_ASSIGN_CALL, set the
is_assign_call before calling gfc_trans_call, then clear it
after.
gcc/testsuite/ChangeLog:
* gfortran.dg/pr123772.f03: New test.
Signed-off-by: Andrew Benson <abensonca@gcc.gnu.org>
Jose E. Marchesi [Sat, 24 Jan 2026 15:58:51 +0000 (16:58 +0100)]
a68: build a68 type nodes before targetm.init_builtins [PR algol68/123785]
The alpha target calls type_for_mode in init_builtins. The algol68
implementation of type_for_mode uses modes created by
a68_build_a68_type_nodes. This patch makes sure that the latter is
called before the init_builtins target hook.
Signed-off-by: Jose E. Marchesi <jemarch@gnu.org>
gcc/algol68/ChangeLog
Jakub Jelinek [Sat, 24 Jan 2026 09:38:17 +0000 (10:38 +0100)]
c++: Fix wrong-code with overloaded comma and CPP_EMBED [PR123737]
In cp_parser_expression for comma operator I've used a short path
where instead of calling build_x_compound_expr #embed number times
it is called just 3 times, for the CPP_NUMBER added by the preprocessor
at the start, last byte from CPP_EMBED and then CPP_NUMBER added by
libcpp at the end, enough to make sure -Wunused-value reports something,
but not bothering users with millions of -Wunused-value warnings
and spending too much compile time on it when they use a very large #embed.
As the following testcases show, that is ok for C or for C++ if the
expression before it is known not to have OVERLOAD_TYPE_P (common case
is INTEGER_TYPE I guess), but doesn't work well in case one uses overloaded
comma operator. In that case we just have to call build_x_compound_expr
the right number of times, even if it is a lot.
I think I don't need to test for !expression, because the preprocessor
should guarantee that CPP_EMBED is preceded by CPP_NUMBER CPP_COMMA
tokens.
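To illustrate why the shortcut is not valid for overloaded comma, here is a small sketch. It uses a plain comma-separated list in place of #embed (which expands to such a list), so it does not depend on #embed support. Collector and collect_five_bytes are hypothetical names, not taken from the new testcases.

```cpp
#include <cassert>

// Hypothetical type whose overloaded comma operator observes every operand.
struct Collector {
    int calls = 0;
    int sum = 0;
    Collector &operator,(int byte) {
        ++calls;
        sum += byte;
        return *this;
    }
};

// Stands in for `(c, <five embedded bytes>)`.  Each comma invokes the
// overloaded operator, so dropping the middle operands would change both
// the call count and the sum.
Collector collect_five_bytes() {
    Collector c;
    (c, 10, 20, 30, 40, 50);
    assert(c.calls == 5);
    return c;
}
```

With the built-in comma operator only the last operand matters, which is why the three-call shortcut remains fine for non-overloaded types.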
2026-01-24 Jakub Jelinek <jakub@redhat.com>
PR c++/123737
* parser.cc (cp_parser_expression): Don't handle CPP_EMBED just
as the last byte in it if expression has or might have overloaded
type. In that case call build_x_compound_expr for each byte
in CPP_EMBED separately.
* g++.dg/cpp/embed-28.C: New test.
* g++.dg/parse/comma3.C: New test.
Jakub Jelinek [Sat, 24 Jan 2026 09:36:45 +0000 (10:36 +0100)]
c++: Fix ICE on decltype on non-local structured binding inside of a template [PR123667]
The following testcases ICE when decltype (x) appears in a template
where x is a tuple based structured binding from outside of that template
(one testcase shows the sb in a function and template is a lambda within
that function, the other shows namespace scope sb referenced from a
template).
What I wrote in the comment there is true only for structured bindings
within the current template function, in that case that structured binding
indeed has to have DECL_VALUE_EXPR and lookup_decomp_type might return
NULL or might not and depending on that we should choose if it is
a tuple based structured binding and return its type or if we should
return unlowered type of expr.
But if decltype in a template refers to a structured binding elsewhere,
it could have been finalized already and determined to be tuple based
structured binding, so DECL_HAVE_VALUE_EXPR_P can be false in that case.
In that case, if ptds.saved would be false, we'd just always
return lookup_decomp_type. So, for this case the patch allows
that case in the assert and asserts lookup_decomp_type returned non-NULL.
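A rough sketch of the namespace-scope variant described above, assuming C++17. It is not one of the new testcases; the names demo, count, total and binding_has_type are hypothetical.

```cpp
#include <cassert>
#include <tuple>
#include <type_traits>

namespace demo {

// Tuple-based structured binding at namespace scope.  It is already
// finalized by the time the template below is parsed.
auto [count, total] = std::tuple<int, long>{1, 2L};

// decltype on the binding from inside a template: for a tuple-based
// binding the result is the tuple element type (long), not a reference.
template <typename T>
bool binding_has_type() {
    return std::is_same_v<decltype(total), T>;
}

}  // namespace demo
```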
2026-01-24 Jakub Jelinek <jakub@redhat.com>
PR c++/123667
* semantics.cc (finish_decltype_type): Allow a structured binding
for ptds.saved to have DECL_HAS_VALUE_EXPR_P cleared, if it is
not within current_function_decl, but in that case assert that
lookup_decomp_type succeeded.
* g++.dg/cpp1z/decomp66.C: New test.
* g++.dg/cpp1z/decomp67.C: New test.
Jakub Jelinek [Sat, 24 Jan 2026 08:56:23 +0000 (09:56 +0100)]
c++: Fix ICE on namespace attributes [PR123684]
The following testcase ICEs since my r14-650 PR109756 change to
set TREE_VALUE (attr) to error_mark_node for unsupported attributes
for which we've just skipped over balanced tokens for arguments in order
to differentiate that from no arguments.
The problem is that handle_namespace_attrs doesn't go through the normal
attribute handling and just checked attributes by name.
So, if user uses [[visibility (whatever)]] or
[[whatever::visibility (whatever)]] or similarly abi_tag or for deprecated
(which is valid as both [[deprecated ("foo")]] and
[[gnu::deprecated ("foo")]] ) [[whatever::deprecated (whatever)]] on
a namespace, we handle it as if it were the corresponding gnu (or, for
deprecated, gnu or standard) attribute
even when it is not and can ICE on error_mark_node attributes.
The following patch makes sure we handle it only in the right namespaces
and emit a warning on anything else.
Not sure about backports, the patch changes behavior for say
inline namespace [[foo::abi_tag]] N or
inline namespace [[abi_tag]] N where previously it would use the name
of the namespace as abi tag and now it will ignore it and emit a warning.
One possibility is to just deal with args == error_mark_node and warn
on that attribute only in that case (that is where it would previously ICE).
But perhaps I'm worrying too much and no code in the wild is relying on
non-gnu attributes on namespaces with the same names as the gnu ones
behaving like the gnu ones.
2026-01-24 Jakub Jelinek <jakub@redhat.com>
PR c++/123684
* name-lookup.cc (handle_namespace_attrs): Only handle visibility and
abi_tag attributes in the gnu namespace and deprecated attribute in
the standard or gnu namespaces.
Joseph Myers [Fri, 23 Jan 2026 21:14:31 +0000 (21:14 +0000)]
testsuite: Enable cross testing for gcov tests
Tests of gcov are generally restricted to { target native }. The
issue for these tests is the need to transfer .gcda files from the
target to the host before running gcov. Implement that support and
remove the { target native } restrictions for these tests.
Note that by default code built to generate coverage data expects to
be able to write .gcda files to the same directory name in which the
compiler generated its output, so if that path cannot be created by
the tests on the target then they may still not work in a cross setup.
Other options involving -fprofile-dir are possible but involve their
own complications such as mangling of the .gcda file name (the
mangling logic would then need replicating in gcov.exp). Copying
files from the target using such absolute directory paths is what
already happens with gcc.dg/tree-prof tests using profopt.exp (and
those already work in a cross configuration except for a few using
dg-additional-sources), so this change is effectively making the gcov
tests work more like the tree-prof ones.
Note also that [remote_file host absolute ...] may require appropriate
support in your host board file for the case of a remote host (this
isn't an operation DejaGnu knows about doing remotely by default).
The logic for determining .gcda paths does mean it's the absolute path
on host, not on build, that is relevant.
Tested for x86_64-pc-linux-gnu to make sure native testing isn't
broken, and with cross to aarch64-linux.
Andrew Pinski [Fri, 23 Jan 2026 20:29:17 +0000 (12:29 -0800)]
aarch64/testsuite: Fix test_frame_*.c
The problem here is the test function is now being
inlined into main but that was not expected.
Marking the test functions with noinline and noclone
fixes the issue.
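For reference, a minimal sketch of the attribute combination, assuming a GCC-compatible compiler (noclone is a GCC extension). The function test_frame and its body are hypothetical and are not the contents of test_frame_common.h.

```cpp
#include <cassert>

// noinline keeps the function out of line; noclone additionally stops the
// compiler from creating specialized copies (for example constant-propagated
// clones), so the function keeps its own stack frame.
__attribute__((noinline, noclone)) int test_frame(int bytes) {
    volatile char buffer[64];
    assert(bytes >= 0 && bytes <= 64);
    for (int i = 0; i < bytes; ++i)
        buffer[i] = static_cast<char>(i);
    return bytes == 0 ? 0 : buffer[bytes - 1];
}
```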
Pushed as obvious after testing to make sure the
test_frame_*.c testcases now work.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/test_frame_common.h (t_frame_pattern):
Add noclone and noinline to the defining test function.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Robin Dapp [Fri, 23 Jan 2026 10:51:57 +0000 (11:51 +0100)]
vect: Only scale vec_offset once [PR123767].
Since allowing "unsupported" scales by just multiplying there was an
issue with how the vec_offset was adjusted:
For "real" gathers/scatters we have a separate vec_offset per stmt copy.
For strided gather/scatter, however, there is just one vec_offset common
to all copies.
In case of an unsupported scale we need to multiply vec_offset with the
required scale which is currently done like this:
for (i = 0; i < num_vec; i++)
vec_offset = vec_offset * scale_constant;
where vec_offset is only different for real gathers/scatter.
Thus, for more than one copy of a strided gather/scatter, we will
erroneously multiply an already scaled vec_offset.
This patch only performs the vec_offset scaling
- for each copy in real gathers/scatters or
- once for the first copy for strided gathers/scatters.
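The difference can be modelled with plain integers. This is only a sketch of the arithmetic, with hypothetical function names; the real code operates on vector offsets in tree-vect-stmts.cc.

```cpp
#include <cassert>
#include <vector>

// Buggy shape: a single offset shared by all copies (the strided case)
// is multiplied once per copy.
long scale_shared_offset_per_copy(long vec_offset, long scale, int num_copies) {
    assert(num_copies > 0);
    for (int i = 0; i < num_copies; ++i)
        vec_offset *= scale;
    return vec_offset;
}

// Fixed shape for the strided case: scale only while handling the first copy.
long scale_shared_offset_once(long vec_offset, long scale, int num_copies) {
    assert(num_copies > 0);
    for (int i = 0; i < num_copies; ++i)
        if (i == 0)
            vec_offset *= scale;
    return vec_offset;
}

// Real gathers/scatters: each copy has its own offset, so each is scaled.
std::vector<long> scale_per_copy_offsets(std::vector<long> offsets, long scale) {
    for (long &offset : offsets)
        offset *= scale;
    return offsets;
}
```

With an offset of 3, a scale of 4 and three copies, the per-copy loop yields 192 where 12 is wanted.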
PR tree-optimization/123767
gcc/ChangeLog:
* tree-vect-stmts.cc (vectorizable_store): Only scale offset
once.
(vectorizable_load): Ditto.
What all of these patterns have in common is that they try to move
the op inside the cnd if it simplifies. The problem is that the
type of the op does not need to be the type of the first operand
and instead is the type of the type variable. The fix is to
supply the type to the op like in `(op:type @1 @3)`.
But since they were originally in the `(op! @1 @3)`, it should be:
`(op!:type @1 @3)`. Though this would be rejected as we don't pick
up the next token after parsing the `!` (note `^` has the same issue
which is also fixed here too).
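The parsing issue can be pictured with a toy character-based parser. This is only an illustration under simplifying assumptions: genmatch works on preprocessor tokens, and OpHead and parse_op_head are hypothetical names.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical result of parsing an operator head such as "plus!:type".
struct OpHead {
    std::string name;    // operator name, e.g. "plus"
    bool bang = false;   // a '!' marker was present
    bool caret = false;  // a '^' marker was present
    std::string type;    // text after ':', empty if no type was given
};

OpHead parse_op_head(const std::string &text) {
    OpHead head;
    std::size_t pos = 0;
    while (pos < text.size() &&
           (std::isalnum(static_cast<unsigned char>(text[pos])) || text[pos] == '_'))
        head.name += text[pos++];
    assert(!head.name.empty());
    // Consume an optional '!' or '^' marker ...
    if (pos < text.size() && text[pos] == '!') {
        head.bang = true;
        ++pos;
    } else if (pos < text.size() && text[pos] == '^') {
        head.caret = true;
        ++pos;
    }
    // ... and only then look at the next character.  Testing for ':' on
    // the position before the marker would never see the type.
    if (pos < text.size() && text[pos] == ':')
        head.type = text.substr(pos + 1);
    return head;
}
```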
Bootstrapped and tested on x86_64-linux-gnu.
PR tree-optimization/123778
gcc/ChangeLog:
* genmatch.cc (parser::parse_expr): Peek on the next
token after eating the `!` or `^`.
* match.pd (`(op (cnd @0 @1 @2) @3)`, `(op @3 (cnd @0 @1 @2))`):
Add the type to resulting op.
gcc/testsuite/ChangeLog:
* gcc.dg/torture/pr123778-1.c: New test.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Andrew Pinski [Sat, 17 Jan 2026 04:50:53 +0000 (20:50 -0800)]
slsr: Fix replace_rhs_if_not_dup and others for overflow [PR106883]
Like the previous patch, this fixes up slsr in the other locations for the
added overflow issues that were being introduced. This was slightly
harder than insert_initializers as we need to allow for type
mismatches in more cases.
But with this all fixed, the testcases in PR 120258 (and its duplicates)
are all working correctly.
* gimple-ssa-strength-reduction.cc (replace_mult_candidate): Allow for
basis_name and bump_tree not to be the same type as the lhs.
Rewrite added multiply for undefined overflow.
(create_add_on_incoming_edge): Allow for init
to be a different type from the wanted type.
Rewrite added add for undefined overflow.
(replace_rhs_if_not_dup): Rewrite replaced stmts
for undefined overflow.
(replace_one_candidate): Allow for basis_name and
rhs2 to be a different type from lhs.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/slsr-8.c: Update the number of `*`.
* gcc.dg/torture/pr120258-1.c: New test.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Andrew Pinski [Sat, 17 Jan 2026 04:25:56 +0000 (20:25 -0800)]
slsr: fix overflow from create_add_on_incoming_edge [PR106883]
This fixes the overflow that might be introduced from
create_add_on_incoming_edge. I have not found a testcase where this
shows up, but there possibly could be one.
PR tree-optimization/106883
gcc/ChangeLog:
* gimple-ssa-strength-reduction.cc (create_add_on_incoming_edge): Rewrite
the new addition on the edge too.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Andrew Pinski [Fri, 16 Jan 2026 22:48:48 +0000 (14:48 -0800)]
slsr: Move introduce_cast_before_cand to use gimple_convert
This moves introduce_cast_before_cand to use gimple_convert
instead of manually creating the gimple assign. In theory
there could be a removal of one statement being created
but I have not looked to check that.
gcc/ChangeLog:
* gimple-ssa-strength-reduction.cc (introduce_cast_before_cand): Use
gimple_convert instead of manually creating the gimple_assign.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Andrew Pinski [Thu, 15 Jan 2026 00:44:17 +0000 (16:44 -0800)]
slsr: Fix some undefined/overflow introducing bugs in SLSR [PR106883]
This fixes the first part of SLSR incorrectly inserting undefined code (overflow)
into the IR. The easiest way is to rewrite the statement after creating it
using rewrite_to_defined_unconditional.
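As a source-level analogy (not GCC's implementation), a signed operation can be given defined overflow behavior by doing the arithmetic in the corresponding unsigned type and converting back. The helper names below are hypothetical.

```cpp
#include <cassert>
#include <climits>

// Signed addition with wrapping semantics: unsigned arithmetic is defined
// modulo 2^N.  The conversion back to int is modular since C++20 and is
// implementation-defined (modular with GCC) before that.
int add_defined(int a, int b) {
    unsigned int wide = static_cast<unsigned int>(a) + static_cast<unsigned int>(b);
    return static_cast<int>(wide);
}

// The same idea for multiplication.
int mul_defined(int a, int b) {
    unsigned int wide = static_cast<unsigned int>(a) * static_cast<unsigned int>(b);
    return static_cast<int>(wide);
}
```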
This fixes the testcases from PR 121347 (and a few others) which all cause
infinite loops to appear.
I will be posting the fix for replace_rhs_if_not_dup later and at that point I
will add a few testcases.
Bootstrapped and tested on x86_64-linux-gnu with no regressions.
Andrew Pinski [Fri, 23 Jan 2026 06:04:59 +0000 (22:04 -0800)]
gimple-fold: Fix handling of vdefs for MASK_LOAD_LANES replacement [PR123776]
This was found when I was running the gcc testsuite with some SVE options to
enable SVE only vectorization and enable it always.
After r16-5984-gcee0a9dd2700b9 and r16-6918-g46a3355c7f1656, we would fold:
# .MEM_696 = VDEF <.MEM_695>
vect_array.781 = .MASK_LOAD_LANES (vectp_this.772_515, 64B, loop_mask_511, { 0, 0 });
into:
vect_array.781 = {};
But since this was originally a "load" we don't copy the vdef. Some passes
like fre will not cause a TODO_update_ssa to happen so we hit an assert
which basically says that we should have done an update_ssa.
While we could do an update_ssa, the better fix is to copy the vdef from
the old statement to the new one before doing the gsi_replace, when we
know this will be a store (in the !is_gimple_reg case). Then we have
kept the vop up to date and don't need to do an update_ssa.
Pushed as obvious after a build and test on aarch64-linux-gnu.
PR tree-optimization/123776
gcc/ChangeLog:
* gimple-fold.cc (gimple_fold_partial_load_store): Copy
the vdef from the old statement to the new statement of a
load that is also a store to non gimple_reg.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
sched: Make model scheduler more robust against stale live-in sets
As the comment in the patch says, previous inter-block insn movement
can mean that the current block's live-in set becomes stale. This is
somewhat undesirable, since it'll make estimates less conservative
than intended. However, a fully accurate update would be too expensive
for something that is only supposed to be a heuristic.
gcc/
PR rtl-optimization/80357
PR rtl-optimization/94014
PR rtl-optimization/123144
* haifa-sched.cc (model_recompute): Ignore dying uses of registers
that are not assumed to be live.
gcc/testsuite/
PR rtl-optimization/123144
* gcc.dg/torture/pr123144.c: New file.
Richard Biener [Fri, 23 Jan 2026 08:58:50 +0000 (09:58 +0100)]
middle-end/123775 - add missing expand_vec_cond_expr_p to patterns
This adds a missing check on supportability of a VEC_COND_EXPR
to a match.pd pattern. The existing conditions, in particular
known_eq of TYPE_VECTOR_SUBPARTS, is not enough to distinguish
VNx4SImode from V4SImode with -msve-vector-bits=128.
PR middle-end/123775
* match.pd ((view_convert (vec_cond ...))): Make sure the
resulting vec_cond can be expanded.
* gcc.target/aarch64/sve2/pr123775.c: New testcase.
Jakub Jelinek [Fri, 23 Jan 2026 09:37:46 +0000 (10:37 +0100)]
builtins: Only fold abs/absu if it is sane [PR123703]
To my surprise the C FE marks as builtin even a declaration which has incorrect
return type. Normally gimple_builtin_call_types_compatible_p etc. will
just punt in cases where the return type is wrong, but builtins.cc doesn't
use that. For e.g. the mathfn builtins like sqrt and many others, it will
punt on weird return types, but for fold_builtin_abs it doesn't and happily
tests TYPE_UNSIGNED on it and fold_convert the integral operand to it etc.,
which ICEs if the return type is aggregate.
The following patch fixes it by punting if type is not integral.
2026-01-23 Jakub Jelinek <jakub@redhat.com>
PR middle-end/123703
* builtins.cc (fold_builtin_abs): Return NULL_TREE if type is not
integral.
Jakub Jelinek [Fri, 23 Jan 2026 07:37:36 +0000 (08:37 +0100)]
openmp: Fix up OpenMP loop parsing in templates [PR123597]
The following testcase has been miscompiled since r14-3490,
because in this case the sum variable is moved out of the loop's
body into an outer BIND_EXPR and becomes shared, so what
was previously private can result in data races.
In the C++ FE, BIND_EXPRs are mostly created for 2 reasons.
One is when something calls c_build_bind_expr and that function
creates a BIND_EXPR because there are decls to attach to it,
this happens for !processing_template_decl e.g. from do_poplevel
or poplevel (the latter for sk_cleanup which I think don't appear
if processing_template_decl). And the other case are BIND_EXPRs
created by begin_compound_stmt if processing_template_decl, those
don't stand for the need to collect decls inside of it, but to
say the source had {} at this point, take it into account when
instantiating the template.
Now, on the testcase when parsing the body of the inner collapsed
loop we call cp_parser_statement -> cp_parser_compound_statement
-> begin_compound_stmt because the body is surrounded by {}s
and that returns a BIND_EXPR with the processing_template_decl
meaning, the source had {} here.
But the r14-3490 code then calls substitute_in_tree in a loop, trying
to replace placeholders it created with the parsed bodies. And while
doing that, considers all BIND_EXPRs with !BIND_EXPR_VARS redundant
and just throws them away.
They are redundant when !processing_template_decl, but when
processing_template_decl they I think always have !BIND_EXPR_VARS,
the vars inside of such bodies aren't pushed into any BIND_EXPR yet,
they just have a DECL_EXPR somewhere, and the pushing of the instantiated
copies of those will be done only during instantiation.
The following patch fixes it by not treating BIND_EXPRs with !BIND_EXPR_VARS
as redundant if processing_template_decl; it is fine to merge two
BIND_EXPRs with nothing in between them.
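The general shape of code this concerns looks roughly like the sketch below: a template containing a collapsed loop nest whose innermost body is a braced block declaring a local variable. This is not the PR's testcase and may not reproduce the miscompilation; sum_of_products is a hypothetical name. Without -fopenmp the pragma is ignored and the result is unchanged.

```cpp
#include <cassert>

template <typename T>
T sum_of_products(int rows, int cols) {
    assert(rows >= 0 && cols >= 0);
    T total = 0;
#pragma omp parallel for collapse(2) reduction(+ : total)
    for (int i = 0; i < rows; ++i)
        for (int j = 0; j < cols; ++j) {
            // Declared inside the braces of the innermost body, so every
            // iteration must get its own private copy.
            T sum = static_cast<T>(i) * static_cast<T>(j);
            total += sum;
        }
    return total;
}
```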
2026-01-23 Jakub Jelinek <jakub@redhat.com>
PR c++/123597
* parser.cc (substitute_in_tree_walker, substitute_in_tree): Don't
consider BIND_EXPRs with !BIND_EXPR_VARS redundant if
processing_template_decl.
Richard Biener [Thu, 22 Jan 2026 13:06:50 +0000 (14:06 +0100)]
Avoid selecting masked epilogs for in-order reduction vectorization
When masking an in-order reduction we are applying the mask with a
COND_EXPR followed by an in-order accumulation of all elements,
including the masked ones. That makes loop masking not profitable.
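A scalar model of why masking does not pay off here. It is only a sketch: the names are hypothetical, and the use of -0.0 as the neutral value for masked-off lanes is an assumption about floating-point addition, not something stated in the commit.

```cpp
#include <cassert>
#include <cstddef>

struct ReductionResult {
    double value;
    std::size_t additions;  // number of sequential accumulation steps
};

// Masked in-order sum: the select (the COND_EXPR in the text) replaces
// masked-off lanes with a neutral value, but the accumulation still visits
// every lane in order.
ReductionResult masked_in_order_sum(const double *lanes, const bool *mask,
                                    std::size_t num_lanes, double acc) {
    assert(lanes != nullptr && mask != nullptr);
    std::size_t additions = 0;
    for (std::size_t i = 0; i < num_lanes; ++i) {
        const double element = mask[i] ? lanes[i] : -0.0;
        acc += element;
        ++additions;
    }
    return {acc, additions};
}
```

Even with half the lanes masked off, the number of sequential additions equals the full vector length.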
Ideally we'd apply this logic to all loops, even when masking is
selected via --param vect-partial-vector-usage=2 but the current
way we iterate over modes (and opt-out of cost compares) does not
allow us to iterate over masked vs. non-masked, so that does not work.
I plan to fix that for GCC 17; for now this fixes a regression
for targets opting in to avx512_masked_epilogues.
* config/i386/i386.cc (ix86_vector_costs::finish_cost):
Avoid selecting masked epilogs for in-order reductions.
* gcc.dg/vect/costmodel/x86_64/costmodel-vect-epil-1.c: New
testcase.
Hongyu Wang [Fri, 9 Jan 2026 08:34:55 +0000 (16:34 +0800)]
[APX] i386: Fix illegal broadcast instruction generated by intrinsic
For _mm256_broadcastsi128_si256 call with -mapxf enabled it may produce
illegal vbroadcasti128 with egpr under high register pressure. Restrict
the pattern to use "jm" and gpr16 for avx2 alternative.
gcc/ChangeLog:
* config/i386/sse.md (avx2_vbroadcasti128_<mode>): Constrain
alternative 0 with jm and add gpr16 attr to avoid egpr usage.
testsuite: Fix issues with simulator testing in guality, simulate-thread tests
The guality and simulate-thread tests expect a native gdb to
run. That's native as in "<simulator-command-name> gdb
<prog-to-test>", not as in "<cross-name-gdb> <prog-to-test>"
(or even a cross gdb connecting to a native gdb-stub). Such
a beast does not currently exist.
Before r16-6780-g620c85fb709d27, there was an early exit for
"remote targets" such as simulator targets in gdb-test. No
test attempting to run a "native" gdb was applied to a
simulator target.
There's a wart in dejagnu sim_exec (up to and including
1.6.3 and unreleased sources as of 2026-01-20) that, instead
of returning [list -1 "error message"], it (also) calls
perror and thus a log for a test-run gets spurious lines
saying "ERROR: Remote execution for simulators not
implemented." That effectively breaks the useful quality for
such logs, that lines matching "^ERROR:" are only caused by
testsuite framework errors, like syntax errors in dg-clauses
in the test-cases.
Further, trying like gdb-test does, to execute remote_expect
for a remote_spawn:ed (sim_spawn:ed) "<simulator-command>
gdb <prog-to-test>", will for unknown reasons, hang each
test-case until it times out, despite the simulator, as
expected, immediately exiting with
e.g. '<simulator-command-name>: can't open "gdb": No such
file or directory'.
Better exit early for simulators for these parts of the
testsuite, like before r16-6780-g620c85fb709d27, but with
the early exit moved nearby those for other early exits for
specific targets, instead of e.g. inside gdb-test.
* g++.dg/guality/guality.exp, gcc.dg/guality/guality.exp,
gcc.dg/simulate-thread/simulate-thread.exp,
g++.dg/simulate-thread/simulate-thread.exp,
gfortran.dg/guality/guality.exp: Exit early for simulators.
Richard Earnshaw [Thu, 22 Jan 2026 14:16:28 +0000 (14:16 +0000)]
arm: fix unrecognized HFmode min/max insns on neon [PR123742]
When expansion support for smin/smax was enabled (presumably for MVE)
the corresponding Neon instructions were not updated to recognize the
generated RTL. This patch makes the necessary changes to recognize
these variants.
Marek Polacek [Wed, 21 Jan 2026 19:04:34 +0000 (14:04 -0500)]
c++: fix user_provided_p
A user-provided function is a user-declared function that is
not explicitly defaulted or deleted on its first declaration
([dcl.fct.def.default]). So,
void bar (int, long) = delete;
in namespace scope should not be user-provided. But user_provided_p
was mistakenly returning true for this case, so eval_is_user_provided
had to work around that.
This patch corrects user_provided_p. It makes use of the fact that
a function deleted after its first declaration is ill-formed
rather than user-provided:
void f();
void f() = delete; // error, not first declaration
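The same first-declaration rule is observable for defaulted functions through standard type traits. The sketch below is an illustration with hypothetical type names; it does not exercise user_provided_p itself.

```cpp
#include <cassert>
#include <type_traits>

// Defaulted on its first declaration: not user-provided, so the default
// constructor is trivial.
struct DefaultedOnFirstDeclaration {
    DefaultedOnFirstDeclaration() = default;
    int value;
};

// Declared first, defaulted later: user-provided, hence non-trivial.
struct DefaultedLater {
    DefaultedLater();
    int value;
};
DefaultedLater::DefaultedLater() = default;

// Deleted on its first declaration, as in the text: not user-provided.
void bar(int, long) = delete;

void check_user_provided_examples() {
    assert(std::is_trivially_default_constructible<DefaultedOnFirstDeclaration>::value);
    assert(!std::is_trivially_default_constructible<DefaultedLater>::value);
}
```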
gcc/cp/ChangeLog:
* class.cc (user_provided_p): Return false for a deleted
namespace-scope function.
* reflect.cc (eval_is_user_provided): Don't check
DECL_NAMESPACE_SCOPE_P or DECL_DELETED_FN.
Richard Biener [Thu, 22 Jan 2026 12:12:43 +0000 (13:12 +0100)]
tree-optimization/123741 - fix segfault with BB vect and masked stores
When vectorizing the body of an if-converted loop with BB vectorization
we can end up vectorizing masked stores. But the code tracking whether
a loop used masked stores failed to check we're doing loop
vectorization.
Frank Scheiner [Wed, 7 Jan 2026 20:55:30 +0000 (21:55 +0100)]
libstdc++-v3: Update baseline symbols for ia64-linux
The Linux/ia64 libstdc++ baselines haven't been updated in years. This
patch fixes that, makes the "libstdc++-abi/abi_check" test succeed with
master revision 8e23a9982fa4b885a27608825cbe326d61f20498.
This still excludes the GCC 16 symbols. Also the TLS symbols for
__once_call and __once_callable are excluded as per convention, because
those are not present for all configurations.
Notice that `make new-abi-baseline` will include them, so they need to
be removed from the regenerated "baseline_symbols.txt" prior to any
future update.
Robin Dapp [Wed, 21 Jan 2026 07:20:48 +0000 (08:20 +0100)]
forwprop: More nop-conversion handling [PR123731].
Since relaxing the constraints for permutes in r16-6671 for simplifying
vector constructors there is an additional case to be handled as it
interacts with r16-5561 (that allows nop conversions).
In
vector(8) short unsigned int _4;
short int _5;
vector(4) unsigned int _17;
Robin Dapp [Fri, 2 Jan 2026 15:57:21 +0000 (16:57 +0100)]
RISC-V: Fix intrinsic FoF load at -O0 [PR122869].
In the PR we try to compile a loop at -O0 with fault-only-first loads.
We use the VL adjusted by the FoF loads to count the number of
processed elements. Currently, this is implemented as "folding" the FoF
load into a FoF load and a riscv_read_vl directly after.
We cannot guarantee the value of VL between two calls, though. It is
possible that we need a vector store in between which would clobber VL.
This patch makes the VL -> pseudo semantics of the FoF insn explicit and
adjusts the intrinsics expander accordingly.
There is a problem with this approach, though: Technically, the VL
adjustment of the FoF loads is modelled as a store and the VL variable
is made TREE_ADDRESSABLE. At the gimple level we managed to elide the
store very early but at RTL level we don't. Also, we don't manage to
re-use the same register for VL at -O2 and -O3 while it still works for
-O1.
What might help with the second issue above is to add value tracking
to the vsetvl pass. I suppose the first issue would require a larger
intervention.
PR target/122869
gcc/ChangeLog:
* config/riscv/riscv-vector-builtins-bases.cc (fold_fault_load):
Remove.
* config/riscv/riscv-vector-builtins.cc (function_expander::use_contiguous_load_insn):
Use new helper.
(function_expander::prepare_contiguous_load_insn): New helper.
(function_expander::use_fof_load_insn): New function to emit FoF
loads.
* config/riscv/riscv-vector-builtins.h: Declare new functions.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/pr122656-1.c: Remove dg-error.
* gcc.target/riscv/rvv/vsetvl/ffload-3.c: XFAIL for -O2 and -O3.
* gcc.target/riscv/rvv/base/pr122869.c: New test.
Robin Dapp [Fri, 16 Jan 2026 11:54:47 +0000 (12:54 +0100)]
RISC-V: Correct builtin registration order [PR123279].
When compiling ncnn with LTO we encountered an ICE in the final link
step. The reason for this is inconsistent builtin registration
order, and, as a consequence, inconsistent functions codes being
streamed.
The underlying reason is that ncnn compiles files with different -march
options, one of them being -march=..._xtheadvector. XTheadVector does
not support fractional LMULs and several insns while also adding new
insns. As we register builtins sequentially, not registering some
builtins in one TU but registering them in another will naturally lead
to different orders and incompatible function codes.
I'm not really sure how such an executable is going to work eventually
but we should not ICE at least.
At first I tried to re-use the existing function_instance hash but that
would quickly lead to collisions due to the high number of total
builtins. Linear probing for the next bucket would have caused the same
problems we wanted to avoid in the first place.
The problem with XTheadVector in particular is that it both takes away
builtins (the ones with fractional LMUL) as well as adds its own.
Therefore just partitioning the function-code space into extensions is
not sufficient. It would be sufficient if an extension only added
builtins but the order will still be different if we just skip builtins
with fractional LMUL.
There are at least two options now:
- Create placeholders for all skipped builtins.
- Enable the unsupported builtins for XTheadVector and bail at expand
time.
In order to create placeholders we first need to get to the place where
to create them. As we have some guards for XTheadVector before that,
verifying that types are available etc., the necessary changes would
have touched several layers.
Therefore I went with the second option above, combining it with
partitioning the function space into extensions for a bit of future
proofing. Not creating placeholders is also in line with "polluting"
the march flags, i.e. enable everything reasonably possible.
To that end, the patch removes the TARGET_XTHEADVECTOR-related checks
in riscv-vector-switch.def and introduces a new builtin requirement
VECTOR_EXT_NO_XTHEAD that contains the "V"-only but not XTheadVector
builtins.
The function code now looks like this:
Bit 0: RISCV_BUILTIN_VECTOR (class bit)
Bits 1-8: Partition (rvv_builtin_partition enum)
Bits 9+: Index within partition.
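The layout above can be sketched as a pair of pack/unpack helpers. The bit positions follow the description; the 32-bit width, the struct and the function names are assumptions made for illustration and are not the identifiers used in the RISC-V backend.

```cpp
#include <cassert>
#include <cstdint>

constexpr unsigned kClassBits = 1;      // bit 0: class bit
constexpr unsigned kPartitionBits = 8;  // bits 1-8: partition
constexpr unsigned kIndexShift = kClassBits + kPartitionBits;  // bit 9 upwards

// Hypothetical decoded form of a function code.
struct FunctionCode {
    bool is_vector;      // class bit
    unsigned partition;  // partition number
    unsigned index;      // index within the partition
};

std::uint32_t encode_function_code(const FunctionCode &fc) {
    assert(fc.partition < (1u << kPartitionBits));
    assert(fc.index < (1u << (32 - kIndexShift)));
    return (static_cast<std::uint32_t>(fc.index) << kIndexShift)
         | (static_cast<std::uint32_t>(fc.partition) << kClassBits)
         | (fc.is_vector ? 1u : 0u);
}

FunctionCode decode_function_code(std::uint32_t code) {
    FunctionCode fc;
    fc.is_vector = (code & 1u) != 0;
    fc.partition = (code >> kClassBits) & ((1u << kPartitionBits) - 1u);
    fc.index = code >> kIndexShift;
    return fc;
}
```

Because the index is local to a partition, leaving out builtins in one partition does not shift the codes of builtins in another.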
I tried to come up with a test case for quite a while but didn't manage.
Reducing the ncnn LTO build failure also proved very difficult so in
order to move forward I'm posting the patch without a dedicated test
case.
Richard Biener [Thu, 22 Jan 2026 10:11:18 +0000 (11:11 +0100)]
tree-optimization/123756 - remove now bogus assert in reduction vect
With r16-5372-gfacb92812a4ec5 I have generalized reduction operator
support to allow (masked) internal functions in more cases. The
following removes a now bogus assert given IFN_FMAX is now allowed
given IFN_COND_FMAX is available.
Andrew Pinski [Thu, 22 Jan 2026 07:53:32 +0000 (23:53 -0800)]
testsuite: don't test for shrink wrapping for arm thumb1 on pr46555.c [PR123751]
Thumb1 does not support shrink wrapping so the check for shrink
wrapping in pr46555.c needs to be disabled for that. It does work
with both thumb2 and arm modes.
Pushed after testing for arm with `-mthumb -march=armv8-m.base`,
`-marm -mcpu=cortex-a72` and `-mthumb -mcpu=cortex-a72` to make
sure the correct tests are happening and still pass.
PR testsuite/123751
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/pr46555.c: Disable for arm thumb1.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>