Jakub Jelinek [Fri, 28 Nov 2025 09:04:53 +0000 (10:04 +0100)]
match.pd: Re-add (y << x) {<,<=,>,>=} x simplifications [PR122733]
Here is my attempt to implement what has been reverted in r16-5648 using ranger.
Note also the changes to the equality pattern: first of all, there
could be e.g. vector << scalar shifts, although they'll likely fail
on the nop_convert vs. nop_convert; but the old pattern would also
never match for, say, unsigned long long @0 and unsigned int @1 etc.,
which are pretty common cases.
The new simplifier asks the ranger about ranges and bitmasks, verifies
that @0 is non-zero and that the clz of the @0 nonzero bits bitmask
(i.e. the minimum clz over all possible values of @0) is greater than
(or greater than or equal to) the maximum shift count; which of the two
depends on whether the actual non-equality comparison is signed or
unsigned.
And gimple_match_range_of_expr now performs the undefined_p check
itself and returns false in that case, so many of the callers no
longer need to check it.
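For illustration, a minimal sketch of the kind of code the new simplification
can fold (a hypothetical test; the ranges are fed to the ranger via
__builtin_unreachable):

    unsigned
    f (unsigned y, unsigned x)
    {
      if (y == 0 || y > 0xff || x > 16)
        __builtin_unreachable ();
      /* The minimum clz over all possible values of y is clz (0xff) == 24,
         which exceeds the maximum shift count 16, so y << x cannot
         overflow; and since y >= 1, y << x >= 1 << x > x always holds.  */
      return (y << x) > x;  /* can fold to 1 */
    }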
2025-11-28 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/122733
* gimple-match-head.cc (gimple_match_range_of_expr): Return false
even when range_of_expr returns true, but the range is undefined_p.
* match.pd ((mult (plus:s@5 (mult:s@4 @0 @1) @2) @3)): Remove
vr0.undefined_p () check.
((plus (mult:s@5 (plus:s@4 @0 @1) @2) @3)): Likewise.
((X + M*N) / N -> X / N + M): Remove vr4.undefined_p () check.
((X - M*N) / N -> X / N - M): Likewise.
((y << x) == x, (y << x) != x): Use convert2? instead of
nop_convert2? and test INTEGRAL_TYPE_P on TREE_TYPE (@0) rather than
TREE_TYPE (@1).
((y << x) {<,<=,>,>=} x): New simplification.
(((T)(A)) + CST -> (T)(A + CST)): Remove vr.undefined_p () check.
(x_5 == cstN ? cst4 : cst3): Remove r.undefined_p () check.
* gcc.dg/match-shift-cmp-4.c: New test.
* gcc.dg/match-shift-cmp-5.c: New test.
Jakub Jelinek [Thu, 27 Nov 2025 19:18:57 +0000 (20:18 +0100)]
c: Fix ICE in c_type_tag on va_list [PR121506]
The C and C++ FEs disagree on what TYPE_NAME on a RECORD_TYPE is for
a structure/class definition (as opposed to a typedef/using, for
which both have a TYPE_NAME of TYPE_DECL with DECL_ORIGINAL_TYPE):
the C FE just uses an IDENTIFIER_NODE as TYPE_NAME on the RECORD_TYPE,
while the C++ FE uses a TYPE_DECL as TYPE_NAME on the RECORD_TYPE and
only DECL_NAME on that TYPE_DECL provides the IDENTIFIER_NODE.
The reason for the C++ FE way is that there can be type definitions
at class scope (rather than just typedefs) and those need to be
among TYPE_FIELDS (so the corresponding TYPE_DECL is in that chain)
etc.
The middle-end can cope with it, e.g.
  if (TREE_CODE (TYPE_NAME (node)) == IDENTIFIER_NODE)
    pp_tree_identifier (pp, TYPE_NAME (node));
  else if (TREE_CODE (TYPE_NAME (node)) == TYPE_DECL
           && DECL_NAME (TYPE_NAME (node)))
    dump_decl_name (pp, TYPE_NAME (node), flags);
and many other places.
Now, the backends on various targets create artificial structure
definitions for va_list, e.g. x86 creates struct __va_list_tag
and they do it the C++ FE way so that the C++ FE can cope with those.
Except the new c_type_tag can't deal with that and ICEs.
The following patch fixes it so that it can handle it too.
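A minimal sketch of the shape of the fix (not the exact hunk):

    tree name = TYPE_NAME (type);
    if (name && TREE_CODE (name) == TYPE_DECL && DECL_NAME (name))
      /* C++ FE convention and backend-created types like __va_list_tag.  */
      return DECL_NAME (name);
    /* C FE convention: an IDENTIFIER_NODE (or NULL_TREE).  */
    return name;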
2025-11-27 Jakub Jelinek <jakub@redhat.com>
PR c/121506
* c-typeck.cc (c_type_tag): If TYPE_NAME is TYPE_DECL
with non-NULL DECL_NAME, return that.
Jakub Jelinek [Thu, 27 Nov 2025 18:04:58 +0000 (19:04 +0100)]
gccrs: Partially unbreak rust build with C++20
I've committed earlier today https://gcc.gnu.org/r16-5628 to switch C++ to
-std=gnu++20 by default. That apparently broke the rust build (I don't
have cargo installed, so I am not testing rust at all).
Here is a completely untested attempt to fix that.
Note, in C++20 u8"abc" literal has const char8_t[4] type rather than
const char[4] which was the case in C++17, and there is std::u8string
etc.
The casts to (const char *) below are what I've used in libcody as well
to make it compilable with all of C++11 through C++26.
Another thing is that the source for some reason expects -fexec-charset=
to be ASCII compatible and -fwide-exec-charset= to be UTF-16 or UTF-32
or something similar. That is certainly not guaranteed.
Now, if rust-lex.cc can be only compiled with C++17 or later, we
could just use u8'_' etc., but as GCC still only requires C++14, I'd
go with u'_' etc.
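For illustration, the kind of adjustment involved (a sketch, not the actual
gccrs hunks):

    const char *a = u8"abc";                 // OK in C++17, ill-formed in C++20
    const char *b = (const char *) u8"abc";  // OK from C++11 through C++26
    char16_t c = u'_';                       // UTF-16 literal, C++11 and later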
2025-11-27 Jakub Jelinek <jakub@redhat.com>
* lex/rust-lex.cc (rust_input_source_test): Cast char8_t string
literals to (const char *) to make it compilable with C++20. Use
char16_t or char32_t character literals instead of ordinary
character literals or wide character literals in expected
initializers.
Matthieu Longo [Mon, 24 Nov 2025 14:56:19 +0000 (14:56 +0000)]
aarch64: Define __ARM_BUILDATTR64_FV
Support for Build Attributes (BA) was originally added in [1]. To facilitate their
use in customers' codebases and avoid requiring a new Autotools test for BA support,
the specification was later amended. Toolchains that generate BA sections and
support the assembler directives should define the following preprocessor macro:
__ARM_BUILDATTR64_FV <format-version>
Where <format-version> is the same value as in [2]. Currently, only version 'A'
(0x41) is defined.
This patch also introduces two tests: one that verifies the macro definition for
positive detection of BA support; and another that ensures that no such macro is
defined when BA support is absent.
[1]: 98f5547dce2503d9d0f0cd454184d6870a315538
[2]: [Formal syntax of an ELF Attributes Section](https://github.com/smithp35/abi-aa/blob/build-attributes/buildattr64/buildattr64.rst#formal-syntax-of-an-elf-attributes-section)
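A sketch of how code might key off the new macro (the 0x41 value for format
version 'A' is from the text above):

    #if defined (__ARM_BUILDATTR64_FV) && __ARM_BUILDATTR64_FV == 0x41
    /* The toolchain emits BA sections and the assembler accepts the
       directives, using format version 'A'.  */
    #endif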
gcc/ChangeLog:
* config/aarch64/aarch64-c.cc (aarch64_define_unconditional_macros): Define
__ARM_BUILDATTR64_FV when BA support is detected in GAS.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/build-attributes/build-attribute-define-nok.c: New test.
* gcc.target/aarch64/build-attributes/build-attribute-define-ok.c: New test.
Wilco Dijkstra [Thu, 6 Nov 2025 20:49:22 +0000 (20:49 +0000)]
AArch64: Improve ctz and ffs
Use the ctz insn in the ffs expansion so it uses ctz if CSSC
is available. Rather than splitting, keep ctz as a single
insn for simplicity and possible fusion opportunities.
Move clz, ctz, clrsb, rbit and ffs instructions together.
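The identity the expansion exploits, as a sketch (not the .md pattern):

    int
    my_ffs (unsigned x)
    {
      /* ffs is 1-based; ctz is undefined for 0, hence the guard.  */
      return x ? __builtin_ctz (x) + 1 : 0;
    }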
gcc:
* config/aarch64/aarch64.md (ffs<mode>2): Use gen_ctz.
(ctz<mode>2): Model ctz as a single target instruction.
Andrew Pinski [Wed, 26 Nov 2025 21:55:41 +0000 (13:55 -0800)]
reassociation: Fix canonical ordering in some cases
This was noticed in PR122843, where reassociation would sometimes
create a non-canonical order of operands. This fixes the problem by
swapping the order as the rewrite happens.
Wstringop-overflow.c needed to be xfailed since it stopped warning;
the warning is too dependent on the order of the operands to MIN_EXPR.
This testcase also failed if -fno-tree-reassoc was supplied before,
even though nothing in the IR changes except the order of two operands
of the MIN_EXPR. I filed PR 122881 for this xfail.
Bootstrapped and tested on x86_64-linux-gnu.
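The fix has roughly this shape (a sketch based on the ChangeLog below;
tree_swap_operands_p is GCC's canonical-order predicate):

    /* Restore canonical operand order for commutative codes.  */
    if (commutative_tree_code (opcode)
        && tree_swap_operands_p (oe1->op, oe2->op))
      std::swap (oe1, oe2);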
gcc/ChangeLog:
* tree-ssa-reassoc.cc (rewrite_expr_tree): Swap
oe1 and oe2 if commutative code and not in
canonical order.
Jonathan Wakely [Thu, 27 Nov 2025 16:13:44 +0000 (16:13 +0000)]
analyzer: Add missing 'const' to equiv_class::operator==
This produces a warning in C++20:
/home/test/src/gcc/gcc/analyzer/constraint-manager.cc: In member function ‘bool ana::constraint_manager::operator==(const ana::constraint_manager&) const’:
/home/test/src/gcc/gcc/analyzer/constraint-manager.cc:1610:42: warning: C++20 says that these are ambiguous, even though the second is reversed:
1610 | if (!(*ec == *other.m_equiv_classes[i]))
| ^
/home/test/src/gcc/gcc/analyzer/constraint-manager.cc:1178:1: note: candidate 1: ‘bool ana::equiv_class::operator==(const ana::equiv_class&)’
1178 | equiv_class::operator== (const equiv_class &other)
| ^~~~~~~~~~~
/home/test/src/gcc/gcc/analyzer/constraint-manager.cc:1178:1: note: candidate 2: ‘bool ana::equiv_class::operator==(const ana::equiv_class&)’ (reversed)
/home/test/src/gcc/gcc/analyzer/constraint-manager.cc:1178:1: note: try making the operator a ‘const’ member function
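The fix is the one-line signature change the note suggests:

    bool equiv_class::operator== (const equiv_class &other) const;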
Robin Dapp [Wed, 26 Nov 2025 09:27:24 +0000 (10:27 +0100)]
forwprop: Nop-convert operands if necessary [PR122855].
This fixes up r16-5561-g283eb27d5f674b where I allowed nop conversions
for the input operands. There are several paths through the function
that still require an explicit nop conversion for them. This patch adds
them.
Andrew Stubbs [Tue, 11 Nov 2025 15:41:04 +0000 (15:41 +0000)]
amdgcn: Auto-detect USM mode and set HSA_XNACK
The AMD GCN runtime must be set to the correct "XNACK" mode for Unified Shared
Memory and/or self-mapping to work, but this is not always clear at compile and
link time due to the split nature of the offload compilation pipeline.
When XNACK mode is enabled, the runtime will restart GPU load/store
instructions that fail due to memory exceptions caused by page-misses. While
this is important for shared-memory systems that might experience swapping, we
are mostly interested in it because it is also used to implement page migration
between host and GPU memory, which is the basis of USM.
This patch checks that the XNACK mode is configured appropriately in the
compiler, and mkoffload then adds a runtime check into the final program to
ensure that the HSA_XNACK environment variable passes the correct mode to the
GPU.
The HSA_XNACK variable must be set before the HSA runtime is even loaded, so
it makes more sense to have this set within the constructor than at some point
later within libgomp or the GCN plugin.
Other toolchains require the end-user to set HSA_XNACK manually (or else wonder
why it's not working), so the constructor also checks that any existing manual
setting is compatible with the binary's requirements.
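A sketch of the logic such a constructor could contain (the real code is
emitted by mkoffload into mkoffload_setup; names and messages here are
illustrative):

    #include <stdio.h>
    #include <stdlib.h>

    static void __attribute__((constructor))
    mkoffload_setup (void)
    {
      const char *val = getenv ("HSA_XNACK");
      if (val == NULL)
        /* Must happen before the HSA runtime is loaded.  */
        setenv ("HSA_XNACK", "1", 1);  /* binary compiled for XNACK on */
      else if (val[0] != '1')
        fprintf (stderr,
                 "HSA_XNACK=%s is incompatible with this binary\n", val);
    }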
gcc/ChangeLog:
* config/gcn/gcn.cc (gcn_init_cumulative_args): Emit a warning if the
-mxnack setting looks wrong.
* config/gcn/mkoffload.cc: Include tree.h and omp-general.h.
(process_asm): Add omp_requires parameter.
Emit HSA_XNACK code into mkoffload_setup, as required.
(main): Modify HSACO_ATTR_OFF to preserve user-set -mxnack.
Pass omp_requires to process_asm.
Tomasz Kamiński [Thu, 27 Nov 2025 13:31:51 +0000 (14:31 +0100)]
libstdc++: Fix exposure of TU-local lambda in __detail::__func_handle_t.
The lambda is considered to be a TU-local entity; use a named function
instead.
As a drive-by, the functor stored inside __func_handle::_Inplace is renamed
to _M_fn, as we no longer limit the functor type to function pointers.
libstdc++-v3/ChangeLog:
* include/std/ranges (__func_handle::__select): Named function
extracted from local lambda.
(__detail::__func_handle_t): Define using __func_handle::__select.
(__func_handle::_Inplace): Rename _M_ptr to _M_fn.
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
Eric Botcazou [Wed, 19 Nov 2025 08:45:05 +0000 (09:45 +0100)]
ada: Couple of minor fixes for build-in-place calls in anonymous contexts
The current code does not deal with all the anonymous contexts uniformly,
since it potentially creates an activation chain and a master only in the
case of an actual in a call; moreover, the master is created in the scope
of the actual's type, instead of in the context of the call like the chain.
The change also aligns Make_Build_In_Place_Call_In_Anonymous_Context with
sibling routines by calling Make_Build_In_Place_Call_In_Object_Declaration
directly instead of letting the expander recursively do it. It also adds
a missing rewriting in Make_Build_In_Place_Iface_Call_In_Anonymous_Context.
gcc/ada/ChangeLog:
* exp_ch6.adb (Expand_Actuals): Do not create activation chain and
master for build-in-place calls here but...
(Make_Build_In_Place_Call_In_Allocator): Use Unqual_Conv.
(Make_Build_In_Place_Call_In_Anonymous_Context): ...here instead.
Call Make_Build_In_Place_Call_In_Object_Declaration directly.
(Make_Build_In_Place_Iface_Call_In_Anonymous_Context): ...and here
instead. Add missing rewriting of the call.
Eric Botcazou [Wed, 19 Nov 2025 07:39:20 +0000 (08:39 +0100)]
ada: Fix undefined reference with inline subprogram containing generic instance
The problem is that, for an inline subprogram declared in an instance, the
cross-unit inlining machinery does not have the body by the time it decides
to inline calls to the subprogram, because the instantiation of bodies is
deferred until the end of the compilation. So it cannot see whether this
body contains excluded declarations or statements by that time, typically
nested packages or instances thereof.
The fix is to check that Is_Inlined is still set on the subprogram before
passing it on to the back-end for cross-unit inlining. It also removes an
obsolete check that was done precisely there.
This also adjusts the description of the -gnatwp switch, which can be used
to get the reason why cross-inlining has failed, for example here:
g.ads:4:01: warning: in instantiation at generic_si.adb:60 [-gnatwp]
g.ads:4:01: warning: cannot inline "*" (nested package instantiation)
gcc/ada/ChangeLog:
PR ada/122574
* doc/gnat_ugn/building_executable_programs_with_gnat.rst (-gnatwp):
Replace reference to -gnatN with -gnatn and adjust accordingly.
* inline.adb: Remove clauses for Exp_Tss.
(Has_Initialized_Type): Delete.
(Add_Inlined_Subprogram): Test that the Is_Inlined flag is still set
on the subprogram.
* usage.adb (Usage): Adjust description of -gnatwp.
* gnat_ugn.texi: Regenerate.
Denis Mazzucato [Mon, 17 Nov 2025 10:54:10 +0000 (11:54 +0100)]
ada: Fix spurious error during record initialization of limited types
This patch fixes the spurious error regarding assignment to limited types.
Inside record initialization, the assignment calling a constructor is actually
its initialization, and is considered legal.
gcc/ada/ChangeLog:
* sem_ch5.adb: Skip check for assignment that doesn't come from source.
ada: Fix spurious exceptions with iterated aggregates
When an array aggregate has an iterated component association over a
range that we know is empty, we don't create a loop during expansion but
we still analyze the expression of the component association in an
unusual context.
Before this patch, this analysis could incorrectly insert actions in an
enclosing scope. This patch fixes it by only doing preanalysis of the
expression in that case.
gcc/ada/ChangeLog:
* exp_aggr.adb (Gen_Loop): Only preanalyze expressions we know won't
be evaluated.
Tom Tromey [Tue, 23 Sep 2025 15:36:47 +0000 (09:36 -0600)]
ada: Add Visitor generic to Repinfo
For a gnat-llvm debuginfo patch, it was convenient to be able to
inspect the expressions created during back-annotation. This patch
adds a new generic Visit procedure that can be implemented to allow
such inspection. List_GCC_Expression is reimplemented in terms of
this procedure as a proof of concept.
gcc/ada/ChangeLog:
* repinfo.adb (Visit): New procedure.
(List_GCC_Expression): Rewrite.
* repinfo.ads (Visit): New generic procedure.
VADS inline assembly works by using a qualified expression for one of
the types defined in the Machine_Code package, e.g.
  procedure P is
  begin
     code_2'(INSTR, OPERAND1, OPERAND2);
  end P;
This is different from GNAT's own inline assembly machinery, which
instead expects a call to Machine_Code.ASM with a set of
differently-typed arguments.
This incompatibility is preventing GNATSAS' GNAT-Warnings engine from
analyzing VADS code, hence we adapt sem_ch13.adb to not fail on such
constructs when GNAT is running under both Check_Semantics_Only_Mode and
Relaxed_RM_Semantics mode.
gcc/ada/ChangeLog:
* sem_ch13.adb (Analyze_Code_Statement): Do not emit error
message when only checking relaxed semantics.
Eric Botcazou [Fri, 14 Nov 2025 21:39:51 +0000 (22:39 +0100)]
ada: Streamline processing for shared passive and protected objects
The Add_Shared_Var_Lock_Procs procedure in Exp_Smem contains a very ad-hoc
management of transient scopes, which is probably unavoidable but can be
streamlined by changing the insertion point of the finalizer to be the one
used in the presence of controlled objects.
However, the latter change badly interacts with the special processing of
protected subprogram bodies implemented in Build_Finalizer_Call. Now this
processing is obsolete after the recent overhaul of the expansion of these
protected subprogram bodies and can be entirely removed.
No functional changes.
gcc/ada/ChangeLog:
* exp_ch7.adb (Build_Finalizer_Call): Delete.
(Build_Finalizer): Always insert the finalizer at the end of the
statement list in the non-package case.
(Expand_Cleanup_Actions): Attach the finalizer manually, if any.
* exp_smem.adb (Add_Shared_Var_Lock_Procs): Insert all the actions
directly in the transient scope.
Eric Botcazou [Sun, 16 Nov 2025 13:29:39 +0000 (14:29 +0100)]
ada: Couple of small and unrelated cleanups
No functional changes.
gcc/ada/ChangeLog:
* exp_ch11.adb (Expand_N_Handled_Sequence_Of_Statements): Merge the
elsif condition with the if condition for cleanup actions.
* sem_ch6.adb (Analyze_Procedure_Call.Analyze_Call_And_Resolve): Get
rid of if statement whose condition is always true.
* sinfo.ads (Finally_Statements): Document their purpose.
Eric Botcazou [Mon, 17 Nov 2025 07:45:21 +0000 (08:45 +0100)]
ada: Streamline implementation of masters in Exp_Ch9
The incidental discovery of an old issue and its resolution has exposed the
convoluted handling of masters in Exp_Ch9, which uses two totally different
approaches to achieve the same goal, respectively in Build_Master_Entity and
Build_Class_Wide_Master, the latter being quite hard to follow. The handling
of activation chains for extended return statements is also a bit complex.
This gets rid of the second approach entirely for masters, as well as makes
the handling of activation chains uniform for all nodes.
No functional changes.
gcc/ada/ChangeLog:
* gen_il-gen-gen_nodes.adb (N_Extended_Return_Statement): Add
Activation_Chain_Entity semantic field.
* exp_ch3.adb (Build_Master): Use Build_Master_{Entity,Renaming} in
all cases.
(Expand_N_Object_Declaration): Small tweak.
* exp_ch6.adb (Make_Build_In_Place_Iface_Call_In_Allocator): Use
Build_Master_{Entity,Renaming} to build the master.
* exp_ch7.adb (Expand_N_Package_Declaration): Do not guard the call
to Build_Task_Activation_Call for the sake of consistency.
* exp_ch9.ads (Build_Class_Wide_Master): Delete.
(Find_Master_Scope): Likewise.
(Build_Protected_Subprogram_Call_Cleanup): Move to...
(First_Protected_Operation): Move to...
(Mark_Construct_As_Task_Master): New procedure.
* exp_ch9.adb (Build_Protected_Subprogram_Call_Cleanup): ...here.
(First_Protected_Operation): ...here.
(Build_Activation_Chain_Entity): Streamline handling of extended
return statements.
(Build_Class_Wide_Master): Delete.
(Build_Master_Entity): Streamline handling of extended return
statements and call Mark_Construct_As_Task_Master on the context.
(Build_Task_Activation_Call): Assert that the owner is not an
extended return statement.
(Find_Master_Scope): Delete.
(Mark_Construct_As_Task_Master): New procedure.
* sem_ch3.adb (Access_Definition): Use Build_Master_{Entity,Renaming}
in all cases to build a master.
* sem_ch6.adb (Check_Anonymous_Return): Rename to...
(Check_Anonymous_Access_Return_With_Tasks): ...this. At the end,
call Mark_Construct_As_Task_Master on the parent node.
(Analyze_Subprogram_Body_Helper): Adjust to above renaming.
(Create_Extra_Formals): Do not set Has_Master_Entity here.
* sinfo.ads (Activation_Chain_Entity): Adjust description.
Bob Duff [Fri, 14 Nov 2025 21:29:45 +0000 (16:29 -0500)]
ada: VAST found bug: Missing Parent in annotate aspect
In case of an Annotate aspect of the form "Annotate => Expr",
where Expr is an identifier (as opposed to an aggregate),
the Parent field of the N_Identifier node for Expr was
destroyed. This patch changes the code that turns the aspect
into a pragma, so that it no longer has that bug.
The problem was in "New_List (Expr)", which sets the Parent of
Expr to Empty. But Expr is still part of the tree of the aspect,
so it should have a proper Parent; we can't just stick it in a
temporary list.
The new algorithm constructs the pragma arguments without disturbing
the tree of the aspect.
This is the last known case of missing Parent fields, so we can
now enable the VAST check that detected this bug.
gcc/ada/ChangeLog:
* sem_ch13.adb (Aspect_Annotate): Avoid disturbing the tree of the
aspect.
* vast.adb: Enable Check_Parent_Present.
* exp_ch6.adb (Validate_Subprogram_Calls): Minor reformatting.
Eric Botcazou [Thu, 13 Nov 2025 20:12:54 +0000 (21:12 +0100)]
ada: Fix fallout of recent finalization fix for limited types
The recent finalization fix made for limited types has uncovered cases where
the object returned by calls to build-in-place functions was not finalized
in selected anonymous contexts, most notably the dependent expressions of
conditional expressions. The specific finalization machinery that handles
conditional expressions requires the temporaries built for their dependent
expressions to be visible as early as possible, and this was not the case.
gcc/ada/ChangeLog:
* exp_ch4.adb (Expand_N_Case_Expression): When not optimizing for a
specific context, call Make_Build_In_Place_Call_In_Anonymous_Context
on expressions of alternatives when they are calls to BIP functions.
(Expand_N_If_Expression): Likewise for the Then & Else expressions.
Bob Duff [Thu, 13 Nov 2025 16:40:55 +0000 (11:40 -0500)]
ada: VAST: Check basic tree properties
Miscellaneous improvements to VAST. Mostly debugging improvements.
Move the call to VAST from Frontend to Gnat1drv, because
there is code AFTER the call to Frontend that notices
certain errors, and disables the back end. We want VAST
to be enabled only when the back end will be called.
This is needed to enable Check_Error_Nodes, among other
things.
gcc/ada/ChangeLog:
* frontend.adb: Move call to VAST from here...
* gnat1drv.adb: ...to here.
* vast.ads (VAST_If_Enabled): Rename main entry point of VAST from
VAST to VAST_If_Enabled.
* vast.adb: Miscellaneous improvements. Mostly debugging
improvements. Also enable Check_Error_Nodes. Also add checks:
Check_FE_Only, Check_Scope_Present, Check_Scope_Correct.
* debug.ads: Minor comment tweaks. The comment, "In the checks off
version of debug, the call to Set_Debug_Flag is always a null
operation." appears to be false, so is removed.
* debug.adb: Minor: Remove some code duplication.
* sinfo-utils.adb (nnd): Add comment warning about C vs. Ada
confusion.
Eric Botcazou [Thu, 13 Nov 2025 08:16:52 +0000 (09:16 +0100)]
ada: Fix missing activation of task returned through class-wide type
This fixes an old issue whereby a task returned through the class-wide type
of a limited record type is not activated by the caller, because it is not
moved onto the activation chain that the caller passes to the function.
gcc/ada/ChangeLog:
* exp_ch6.ads (Needs_BIP_Task_Actuals): Adjust description.
* exp_ch6.adb (Expand_N_Extended_Return_Statement): Move activation
chain for every build-in-place function with task formal parameters
when the type of the return object might have tasks.
A recent patch made Multi_Module_Symbolic_Traceback have two consecutive
formal parameters of type Boolean, which opens the door for mixing up
actual parameters in calls. And that mistake was actually made in a call
introduced by the same patch.
This commit fixes the call and also introduces a new enumerated type to
make this kind of mistake less likely in the future.
gcc/ada/ChangeLog:
* libgnat/s-dwalin.ads (Display_Mode_Type): New enumerated type.
(Symbolic_Traceback): Use new type in profile.
* libgnat/s-dwalin.adb (Symbolic_Traceback): Use new type in profile
and adapt body.
* libgnat/s-trasym__dwarf.adb (Multi_Module_Symbolic_Traceback): Fix
wrong call in body of one overload. Use new type in profile. Adapt
body.
(Symbolic_Traceback, Symbolic_Traceback_No_Lock,
Module_Symbolic_Traceback): Use new type in profile and adapt body.
(Calling_Entity): Adapt body.
Jakub Jelinek [Thu, 27 Nov 2025 12:55:17 +0000 (13:55 +0100)]
bitint: Fix up big-endian handling in limb_access [PR122714]
The bitint_extended changes in limb_access broke bitint_big_endian.
As we sometimes (for bitint_extended) access the MEM_REFs using
atype rather than m_limb_type, for big-endian we need to adjust
the MEM_REF's offset if atype has a smaller TYPE_SIZE_UNIT than m_limb_size.
2025-11-27 Jakub Jelinek <jakub@redhat.com>
PR target/122714
* gimple-lower-bitint.cc (bitint_large_huge::limb_access): Adjust
MEM_REFs offset for bitint_big_endian if ltype doesn't have the
same byte size as m_limb_type.
Richard Biener [Thu, 27 Nov 2025 09:56:43 +0000 (10:56 +0100)]
Fix OMP SIMD clone mask record/get again
Post-checkin CI detected aarch64 fallout from the last change. AArch64
has ABI twists that run into a case where an unmasked call when loop
masked allows for a mask that has different shape than that of the
return value which in turn has different type than that of an actual
argument.
While we do not support a mismatch of the call mask shape with the
OMP SIMD ABI mask shape, when there's no call mask we have no such
restriction.
So the following fixes the record/get of a loop mask in the unmasked
call case, also fixing a latent issue present before. In particular
do not record a random scalar operand as representing the mask.
A testcase is in gcc.target/aarch64/vect-simd-clone-4.c.
* tree-vect-stmts.cc (vectorizable_simd_clone_call): Fix
recording of the mask type again. Adjust placing of
mask arguments for non-masked calls.
Dhruv Chawla [Thu, 27 Nov 2025 11:12:33 +0000 (12:12 +0100)]
remove patterns for (y << x) {<,<=,>,>=} x [PR122733]
These patterns should not be in match.pd as they require range
information checks that ideally belong in VRP. They were also causing
breakages as the checks weren't tight enough.
PR tree-optimization/122733
* match.pd ((y << x) {<,<=,>,>=} x): Remove.
((y << x) {==,!=} x): Call constant_boolean_node instead of
build_one_cst/build_zero_cst and combine into one pattern.
* gcc.dg/match-shift-cmp-1.c: Update test to only check
equality.
* gcc.dg/match-shift-cmp-2.c: Likewise.
* gcc.dg/match-shift-cmp-3.c: Likewise.
* gcc.dg/match-shift-cmp-4.c: Removed.
Jakub Jelinek [Thu, 27 Nov 2025 10:57:02 +0000 (11:57 +0100)]
fold-const, match.pd: Pass stmt to expr_not_equal if possible
The following patch is a small extension of the previous patch to pass stmt
context to the ranger queries from match.pd where possible, so that we can
use local ranges on a particular statement rather than global ones.
expr_not_equal_to also uses the ranger, so when possible this passes it
the statement context.
2025-11-27 Jakub Jelinek <jakub@redhat.com>
* fold-const.h (expr_not_equal_to): Add gimple * argument defaulted
to NULL.
* fold-const.cc (expr_not_equal_to): Likewise, pass it through to
range_of_expr.
* generic-match-head.cc (gimple_match_ctx): New static inline.
* match.pd (X % -Y -> X % Y): Capture NEGATE and pass
gimple_match_ctx (@2) as new 3rd argument to expr_not_equal_to.
((A * C) +- (B * C) -> (A+-B) * C): Pass gimple_match_ctx (@3)
as new 3rd argument to expr_not_equal_to.
(a rrotate (bitsize-b) -> a lrotate b): Likewise.
On Wed, Nov 26, 2025 at 09:52:50AM +0100, Richard Biener wrote:
> I wonder if it makes sense to wrap
> get_range_query (cfun)->range_of_expr (r, @0, gimple_match_ctx (@4))
> into sth like gimple_match_range_of_expr (r, @0, @4)?
It does make sense, so the following patch implements that.
Note, gimple-match.h is a bad location for that helper, because
lots of users use it without having value-range.h included, and
it is for APIs that use the gimple folders, not for match.pd helpers
themselves, so I've moved gimple_match_ctx there as well.
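The wrapper then has roughly this shape (a sketch; the follow-up change
above additionally makes it return false when the range is undefined_p):

    static inline bool
    gimple_match_range_of_expr (vrange &r, tree expr, tree ctx)
    {
      return get_range_query (cfun)->range_of_expr (r, expr,
                                                    gimple_match_ctx (ctx));
    }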
2025-11-27 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/119683
* gimple-match.h (gimple_match_ctx): Move to ...
* gimple-match-head.cc (gimple_match_ctx): ... here. Make static.
(gimple_match_range_of_expr): New static inline.
* match.pd ((mult (plus:s (mult:s @0 @1) @2) @3)): Use
gimple_match_range_of_expr.
((plus (mult:s (plus:s @0 @1) @2) @3)): Likewise.
((t * u) / u -> t): Likewise.
((t * u) / v -> t * (u / v)): Likewise.
((X + M*N) / N -> X / N + M): Likewise.
((X - M*N) / N -> X / N - M): Likewise.
((X + C) / N -> X / N + C / N): Likewise.
(((T)(A)) + CST -> (T)(A + CST)): Likewise
(x_5 == cstN ? cst4 : cst3): Likewise. Do r.set_varying
even when gimple_match_range_of_expr failed.
Richard Biener [Thu, 27 Nov 2025 09:04:19 +0000 (10:04 +0100)]
tree-optimization/122885 - avoid re-using accumulator for some bool vectors
When boolean vectors do not use vector integer modes we are not
set up to produce the partial epilog in a correctly typed way,
so avoid this situation. For the integer mode case we are able
to pun things correctly, so keep that working.
PR tree-optimization/122885
* tree-vect-loop.cc (vect_find_reusable_accumulator): Reject
mask vectors which do not use integer vector modes.
(vect_create_partial_epilog): Assert the same.
Jonathan Wakely [Sat, 15 Nov 2025 18:19:28 +0000 (18:19 +0000)]
libstdc++: Future-proof C++20 atomic wait/notify
This will allow us to extend atomic waiting functions to support a
possible future 64-bit version of futex, as well as supporting
futex-like wait/wake primitives on other targets (e.g. macOS has
os_sync_wait_on_address and FreeBSD has _umtx_op).
Before this change, the decision of whether to do a proxy wait or to
wait on the atomic variable itself was made in the header at
compile-time, which makes it an ABI property that would not have been
possible to change later. That would have meant that
std::atomic<uint64_t> would always have to do a proxy wait even if Linux
gains support for 64-bit futex2(2) calls at some point in the future.
The disadvantage of proxy waits is that several distinct atomic objects
can share the same proxy state, leading to contention between threads
even when they are not waiting on the same atomic object, similar to
false sharing. It also results in spurious wake-ups, because doing a
notify on an atomic object that uses a proxy wait will wake all waiters
sharing the proxy.
For types that are known to definitely not need a proxy wait (e.g. int
on Linux) the header can still choose a more efficient path at
compile-time. But for other types, the decision of whether to do a proxy
wait is deferred to runtime, inside the library internals. This will
make it possible for future versions of libstdc++.so to extend the set
of types which don't need to use proxy waits, without ABI changes.
The way the change works is to stop using the __proxy_wait flag that was
set by the inline code in the headers. Instead the __wait_args struct
has an extra pointer member which the library internals populate with
either the address of the atomic object or the _M_ver counter in the
proxy state. There is also a new _M_obj_size member which stores the
size of the atomic object, so that the library can decide whether a
proxy is needed. So for example if Linux gains 64-bit futex support then
the library can decide not to use a proxy when _M_obj_size == 8.
Finally, the _M_old member of the __wait_args struct is changed to
uint64_t so that it has room to store 64-bit values, not just whatever
size the __platform_wait_t type is (which is a 32-bit int on Linux).
Similarly, the _M_val member of __wait_result_type changes to uint64_t
too.
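A sketch of the runtime decision this design permits (use_proxy_wait is
named in the ChangeLog below, but the body here is an assumption):

    #include <cstddef>

    // Linux futex(2) currently operates on 32-bit words only; if a 64-bit
    // futex arrives, a future libstdc++.so can simply return false for
    // size == 8 as well, with no header or ABI change.
    static bool
    use_proxy_wait (std::size_t size) noexcept
    { return size != sizeof(int); }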
libstdc++-v3/ChangeLog:
* config/abi/pre/gnu.ver: Adjust exports.
* include/bits/atomic_timed_wait.h (_GLIBCXX_HAVE_PLATFORM_TIMED_WAIT):
Do not define this macro.
(__atomic_wait_address_until_v, __atomic_wait_address_for_v):
Adjust assertions to check that __platform_wait_uses_type is
true.
* include/bits/atomic_wait.h (__waitable): New concept.
(__platform_wait_uses_type): Define separately for platforms
with and without platform wait.
(_GLIBCXX_HAVE_PLATFORM_WAIT): Do not define this macro.
(__wait_value_type): New typedef.
(__wait_result_type): Change _M_val to __wait_value_type.
(__wait_flags): Remove __proxy_wait enumerator. Reduce range
reserved for ABI version by the commented-out value.
(__wait_args_base::_M_old): Change type to uint64_t.
(__wait_args_base::_M_obj, __wait_args_base::_M_obj_size): New
data members.
(__wait_args::__wait_args): Set _M_obj and _M_obj_size on
construction.
(__wait_args::_M_setup_wait): Change void* parameter to deduced
type. Adjust bit_cast to work for types of different sizes.
(__wait_args::_M_load_proxy_wait_val): Remove function, replace
with ...
(__wait_args::_M_setup_proxy_wait): New function.
(__wait_args::_S_flags_for): Do not set __proxy_wait flag.
(__atomic_wait_address_v): Adjust assertion to check that
__platform_wait_uses_type is true.
* src/c++20/atomic.cc (_GLIBCXX_HAVE_PLATFORM_WAIT): Define here
instead of in header. Check _GLIBCXX_HAVE_PLATFORM_WAIT instead
of _GLIBCXX_HAVE_PLATFORM_TIMED_WAIT.
(__platform_wait, __platform_notify, __platform_wait_until): Add
unused parameter for _M_obj_size.
(__spin_impl): Adjust for 64-bit __wait_args_base::_M_old.
(use_proxy_wait): New function.
(__wait_args::_M_load_proxy_wait_val): Replace with ...
(__wait_args::_M_setup_proxy_wait): New function. Call
use_proxy_wait to decide at runtime whether to wait on the
pointer directly instead of using a proxy. If a proxy is needed,
set _M_obj and _M_obj_size to refer to its _M_ver member. Adjust
for change to type of _M_old.
(__wait_impl): Wait on _M_obj unconditionally. Pass _M_obj_size
to __platform_wait.
(__notify_impl): Call use_proxy_wait to decide whether to notify
on the address parameter or a proxy.
(__spin_until_impl): Adjust for change to type of _M_val.
(__wait_until_impl): Wait on _M_obj unconditionally. Pass
_M_obj_size to __platform_wait_until.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Jonathan Wakely [Wed, 26 Nov 2025 14:44:03 +0000 (14:44 +0000)]
libstdc++: Fix std::counting_semaphore<> default max value
My recent (uncommitted) changes to support a 64-bit __platform_wait_t
for FreeBSD and Darwin revealed a problem in std::counting_semaphore.
When the default template argument is used and __platform_wait_t is a
64-bit type, the numeric_limits<__platform_wait_t>::max() value doesn't
fit in ptrdiff_t and so we get ptrdiff_t(-1), which fails a
static_assert in the class body.
The solution is to cap the value to PTRDIFF_MAX instead of allowing it
to go negative.
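The cap has this shape (a sketch; using uint64_t for __platform_wait_t,
the assumption that triggers the bug):

    #include <cstddef>
    #include <cstdint>
    #include <limits>

    using __platform_wait_t = std::uint64_t;  // assumption for illustration
    constexpr std::ptrdiff_t _S_max
      = std::numeric_limits<__platform_wait_t>::max() <= PTRDIFF_MAX
        ? std::ptrdiff_t(std::numeric_limits<__platform_wait_t>::max())
        : PTRDIFF_MAX;  // cap instead of wrapping to -1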
libstdc++-v3/ChangeLog:
* include/bits/semaphore_base.h (__platform_semaphore::_S_max):
Limit to PTRDIFF_MAX to avoid negative values.
* testsuite/30_threads/semaphore/least_max_value.cc: New test.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
liuhongt [Tue, 25 Nov 2025 05:33:46 +0000 (21:33 -0800)]
Refactor mgather/mscatter implementation.
The current implementation is an alias to -mtune-ctrl= (Alias(mtune-ctrl=,
use_gather, ^use_gather)) and may be overridden by another -mtune-ctrl=,
i.e. -mgather -mscatter will only enable mscatter.
The patch fixes the issue.
gcc/ChangeLog:
* config/i386/i386-options.cc (set_ix86_tune_features): Set
gather/scatter tune if OPTION_SET_P.
* config/i386/i386.opt: Refactor mgather/mscatter.
Lulu Cheng [Mon, 24 Nov 2025 09:03:49 +0000 (17:03 +0800)]
LoongArch: fmv: Fix compilation errors when using glibc versions earlier than 2.38.
The macros HWCAP_LOONGARCH_LSX and HWCAP_LOONGARCH_LASX were defined
in glibc 2.38. However, r16-5155 uses these two macros directly
without checking whether they are defined. This causes errors when
compiling libgcc with glibc versions earlier than 2.38.
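The guard has this shape (bit values as in the Linux UAPI hwcap header):

    #ifndef HWCAP_LOONGARCH_LSX
    #define HWCAP_LOONGARCH_LSX  (1 << 4)
    #endif
    #ifndef HWCAP_LOONGARCH_LASX
    #define HWCAP_LOONGARCH_LASX (1 << 5)
    #endif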
gcc/ChangeLog:
* doc/extend.texi: Remove the incorrect prompt message.
libgcc/ChangeLog:
* config/loongarch/cpuinfo.c (HWCAP_LOONGARCH_LSX): Define
it if it is not defined.
(HWCAP_LOONGARCH_LASX): Likewise.
Sandra Loosemore [Thu, 27 Nov 2025 01:25:33 +0000 (01:25 +0000)]
doc: Add --compile-std-module to option summary
Commit 3ad2e2d707c3d6b0c6bd8c3ef0df4f7aaee1c3c added documentation
for this new C++ option, but missed also adding it to the corresponding
Option Summary list.
* c-parser.cc (c_parser_maxof_or_minof_expression): New func.
(c_parser_unary_expression): Add RID_MAXOF & RID_MINOF cases.
* c-tree.h (c_expr_maxof_type): New prototype.
(c_expr_minof_type): New prototype.
* c-typeck.cc (c_expr_maxof_type): New function.
(c_expr_minof_type): New function.
gcc/testsuite/ChangeLog:
* gcc.dg/maxof-bitint.c: New test.
* gcc.dg/maxof-bitint575.c: New test.
* gcc.dg/maxof-compile.c: New test.
* gcc.dg/maxof-pedantic-errors.c: New test.
* gcc.dg/maxof-pedantic.c: New test.
Tamar Christina [Wed, 26 Nov 2025 22:00:07 +0000 (22:00 +0000)]
middle-end: guard against non-single use compares in emit_cmp_and_jump_insns
When I wrote this optimization my patch stack included a change in
tree-out-of-ssa that would duplicate the compares such that the
use is always single use and get_gimple_for_ssa_name would always
succeed.
However I have dropped that for GCC 16 since I didn't expect the
vectorizer to be able to produce duplicate uses of the same
compare results.
But I neglected that you can get it by other means. So this simply
checks that get_gimple_for_ssa_name succeeds for the LEN cases.
The non-LEN cases already check it earlier on.
To still get the optimization in this case the tree-out-of-ssa
change is needed, which is staged for next stage-1.
gcc/ChangeLog:
* optabs.cc (emit_cmp_and_jump_insns): Check for non-single use.
Jeff Law [Wed, 26 Nov 2025 21:52:11 +0000 (14:52 -0700)]
[RISC-V][PR rtl-optimization/122735] Avoid bogus calls to simplify_subreg
Recent changes to simplify_binary_operation_1 reassociate a SUBREG expression
in useful ways. But they fail to account for the asserts at the beginning of
simplify_subreg.
In particular simplify_subreg asserts that the mode can not be VOID or BLK --
the former being the problem here as it's used on CONST_INT nodes which may
appear in an unsimplified REG_EQUAL note like:
That triggers the new code in simplify-rtx to push the subreg into an inner
object. In particular it'll try to push the subreg to the first operand of the
LSHIFTRT. We pass that to simplify_subreg via simplify_gen_subreg and boom!
You could legitimately ask why the original note wasn't simplified further or
removed. That approach could certainly be used to fix this specific problem.
But we've never had that kind of requirement on REG_EQUAL notes and I think it
opens up a huge can of worms if we impose it now. So I chose to make the
newer simplify-rtx code more robust.
Bootstrapped and regression tested on x86_64 and riscv and tested on the
various embedded targets without regressions. I'll wait for the pre-commit CI
tester before committing.
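The added robustness check has roughly this shape (a sketch, not the
committed hunk; inner_op is a stand-in name):

    /* A CONST_INT has VOIDmode, and simplify_subreg asserts that the
       inner mode is neither VOIDmode nor BLKmode, so punt up front
       rather than calling simplify_gen_subreg on such an operand.  */
    if (GET_MODE (inner_op) == VOIDmode || GET_MODE (inner_op) == BLKmode)
      return NULL_RTX;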
PR rtl-optimization/122735
gcc/
* simplify-rtx.cc (simplify_binary_operation_1): When moving a SUBREG
from an outer expression to an inner operand, make sure to avoid
trying to create invalid SUBREGs.
Declare target's 'link' clause disallows 'nohost'; check for it.
Additionally, some other cleanups have been done.
The 'local' clause to 'declare target' is now supported in the FE,
but a 'sorry, unimplemented' is printed at TREE generation time.
This commit also adds the 'groupprivate' directive, which implies
'declare target' with the 'local' clause. And for completeness also
the 'dyn_groupprivate' clause to 'target'. However, all those new
features will eventually print 'sorry, unimplemented' for now.
gcc/fortran/ChangeLog:
* dump-parse-tree.cc (show_attr): Handle OpenMP's 'local' clause
and the 'groupprivate' directive.
(show_omp_clauses): Handle dyn_groupprivate.
* frontend-passes.cc (gfc_code_walker): Walk dyn_groupprivate.
* gfortran.h (enum gfc_statement): Add ST_OMP_GROUPPRIVATE.
(enum gfc_omp_fallback, gfc_add_omp_groupprivate,
gfc_add_omp_declare_target_local): New.
* match.h (gfc_match_omp_groupprivate): New.
* module.cc (enum ab_attribute, mio_symbol_attribute, load_commons,
write_common_0): Handle 'groupprivate' + declare target's 'local'.
* openmp.cc (gfc_omp_directives): Add 'groupprivate'.
(gfc_free_omp_clauses): Free dyn_groupprivate.
(enum omp_mask2): Add OMP_CLAUSE_LOCAL and OMP_CLAUSE_DYN_GROUPPRIVATE.
(gfc_match_omp_clauses): Match them.
(OMP_TARGET_CLAUSES): Add OMP_CLAUSE_DYN_GROUPPRIVATE.
(OMP_DECLARE_TARGET_CLAUSES): Add OMP_CLAUSE_LOCAL.
(gfc_match_omp_declare_target): Handle groupprivate + fixes.
(gfc_match_omp_threadprivate): Move code to, and now call, ...
(gfc_match_omp_thread_group_private): ... this new function.
Also handle groupprivate.
(gfc_match_omp_groupprivate): New.
(resolve_omp_clauses): Resolve dyn_groupprivate.
* parse.cc (decode_omp_directive): Match groupprivate.
(case_omp_decl, parse_spec, gfc_ascii_statement): Handle it.
* resolve.cc (resolve_symbol): Handle groupprivate.
* symbol.cc (gfc_check_conflict, gfc_copy_attr): Handle 'local'
and 'groupprivate'.
(gfc_add_omp_groupprivate, gfc_add_omp_declare_target_local): New.
* trans-common.cc (build_common_decl,
accumulate_equivalence_attributes): Print 'sorry' for
groupprivate and declare target's local.
* trans-decl.cc (add_attributes_to_decl): Likewise.
* trans-openmp.cc (gfc_trans_omp_clauses): Print 'sorry' for
dyn_groupprivate.
(fallback): Process declare target with link/local as
done for 'enter'.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/crayptr2.f90: Move dg-error line.
* gfortran.dg/gomp/declare-target-2.f90: Extend.
* gfortran.dg/gomp/declare-target-4.f90: Update comment,
enable one test.
* gfortran.dg/gomp/declare-target-5.f90: Update dg- wording,
add new test.
* gfortran.dg/gomp/declare-target-indirect-2.f90: Expect
'device_type(any)' in scan-tree-dump.
* gfortran.dg/gomp/declare-target-6.f90: New test.
* gfortran.dg/gomp/dyn_groupprivate-1.f90: New test.
* gfortran.dg/gomp/dyn_groupprivate-2.f90: New test.
* gfortran.dg/gomp/groupprivate-1.f90: New test.
* gfortran.dg/gomp/groupprivate-2.f90: New test.
* gfortran.dg/gomp/groupprivate-3.f90: New test.
* gfortran.dg/gomp/groupprivate-4.f90: New test.
* gfortran.dg/gomp/groupprivate-5.f90: New test.
* gfortran.dg/gomp/groupprivate-6.f90: New test.
Marek Polacek [Mon, 24 Nov 2025 22:31:22 +0000 (17:31 -0500)]
c++: fix crash with pack indexing in noexcept [PR121325]
In my r15-6792 patch I added a call to tsubst in tsubst_pack_index
to fully instantiate args#N in the pack.
Here we are in an unevaluated context, but since the pack is
a TREE_VEC, we call tsubst_template_args which has cp_evaluated
at the beginning. That causes a crash because we trip on the
assert in tsubst_expr/PARM_DECL:
  gcc_assert (cp_unevaluated_operand);
because retrieve_local_specialization didn't find anything (because
there are no local_specializations yet).
We can avoid the cp_evaluated by calling the new tsubst_tree_vec,
which creates a new TREE_VEC and substitutes each element.
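A sketch of tsubst_tree_vec's shape under that description (an assumption,
not the committed code):

    static tree
    tsubst_tree_vec (tree vec, tree args, tsubst_flags_t complain,
                     tree in_decl)
    {
      int len = TREE_VEC_LENGTH (vec);
      tree r = make_tree_vec (len);
      /* Substitute each element directly, without the cp_evaluated that
         tsubst_template_args performs.  */
      for (int i = 0; i < len; ++i)
        TREE_VEC_ELT (r, i)
          = tsubst_expr (TREE_VEC_ELT (vec, i), args, complain, in_decl);
      return r;
    }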
PR c++/121325
gcc/cp/ChangeLog:
* pt.cc (tsubst_tree_vec): New.
(tsubst_pack_index): Call it.
The CBN?Z instructions have a very small range (just 128 bytes
forwards). The compiler knows how to handle cases where we
exceed that, but only if the range remains within that which
a conditional branch can support. When compiling some machine-generated
code it is not too difficult to exceed this limit,
so arrange to fall back to a conditional branch over an
unconditional one in this extreme case.
gcc/ChangeLog:
PR target/122867
* config/arm/arm.cc (arm_print_operand): Use %- to
emit LOCAL_LABEL_PREFIX.
(arm_print_operand_punct_valid_p): Allow %- for punct
and make %_ valid for all compilation variants.
* config/arm/thumb2.md (*thumb2_cbz): Handle very
large branch ranges that exceed the limit of b<cond>.
(*thumb2_cbnz): Likewise.
gcc/testsuite/ChangeLog:
PR target/122867
* gcc.target/arm/cbz-range.c: New test.
The following avoids re-calling vect_need_peeling_or_partial_vectors_p
after peeling. This was necessary because the function does not
properly handle being called for epilogues, since it looks for the
applied prologue peeling not in the main vector loop but in the current
one being operated on.
PR tree-optimization/110571
* tree-vectorizer.h (vect_need_peeling_or_partial_vectors_p): Remove.
* tree-vect-loop.cc (vect_need_peeling_or_partial_vectors_p):
Fix when called on epilog loops. Make static.
* tree-vect-loop-manip.cc (vect_do_peeling): Do not
re-compute LOOP_VINFO_PEELING_FOR_NITER.
In emit_cmp_and_jump_insns I tried to detect if the operation is signed or
unsigned in order to convert the condition code into an unsigned code.
However I did this based on the incoming tree compare, which is done on the
boolean result. Since booleans are always signed in the tree IL, the result
was that we never used an unsigned compare when needed.
This checks one of the arguments of the compare instead.
Bootstrapped and regtested on aarch64-none-linux-gnu,
arm-none-linux-gnueabihf, and x86_64-pc-linux-gnu (-m32 and -m64)
with no issues.
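The shape of the fix (a sketch; cond is a stand-in for the incoming tree
compare):

    /* The boolean result is always signed, so derive the signedness from
       one of the compare's operands instead.  */
    tree op_type = TREE_TYPE (TREE_OPERAND (cond, 0));
    int unsignedp = TYPE_UNSIGNED (op_type);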
gcc/ChangeLog:
PR tree-optimization/122861
* optabs.cc (emit_cmp_and_jump_insns): Check argument instead of result.
gcc/testsuite/ChangeLog:
PR tree-optimization/122861
* gcc.target/aarch64/sve/vect-early-break-cbranch_10.c: New test.
* gcc.target/aarch64/sve/vect-early-break-cbranch_11.c: New test.
* gcc.target/aarch64/sve/vect-early-break-cbranch_12.c: New test.
* gcc.target/aarch64/sve/vect-early-break-cbranch_13.c: New test.
* gcc.target/aarch64/sve/vect-early-break-cbranch_14.c: New test.
* gcc.target/aarch64/sve/vect-early-break-cbranch_15.c: New test.
* gcc.target/aarch64/sve/vect-early-break-cbranch_9.c: New test.
* gcc.target/aarch64/vect-early-break-cbranch_4.c: New test.
* gcc.target/aarch64/vect-early-break-cbranch_5.c: New test.
Jakub Jelinek [Wed, 26 Nov 2025 14:01:11 +0000 (15:01 +0100)]
Change the default C++ dialect to gnu++20
On Mon, Nov 03, 2025 at 01:34:28PM -0500, Marek Polacek via Gcc wrote:
> I would like us to declare that C++20 is no longer experimental and
> change the default dialect to gnu++20. Last time we changed the default
> was over 5 years ago in GCC 11:
> <https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=0801f419440c14f6772b28f763ad7d40f7f7a580>
> and before that in 2015 in GCC 6.1, so this happens roughly every 5 years.
>
> I had been hoping to move to C++20 in GCC 15 (see bug 113920), but at that time
> libstdc++ still had incomplete C++20 support and the compiler had issues to iron
> out (mangling of concepts, modules work, etc.). Are we ready now? Is anyone
> aware of any blockers? Presumably we still wouldn't enable Modules by default.
>
> I'm willing to do the work if we decide that it's time to switch the default
> C++ dialect (that includes updating cxx-status.html and adding a new caveat to
> changes.html).
I haven't seen a patch posted for this, so just that something is posted
during stage1 if we decide to do it, here is a patch.
The patch makes -std=gnu++20 the default C++ dialect and documents that
-fmodules is still not implied by that or -std=c++20 and modules support
is still experimental.
2025-11-26 Jakub Jelinek <jakub@redhat.com>
gcc/
* doc/invoke.texi (gnu++17): Remove comment about the default.
(c++20): Remove note about experimental support, except add a note
that modules are still experimental and need to be enabled separately.
(gnu++20): Likewise. Move here comment about the default.
(fcoroutines): Mention it is enabled by default for C++20 and later.
* doc/standards.texi: Document that the default for C++ is
-std=gnu++20.
gcc/c-family/
* c-opts.cc (c_common_init_options): Call set_std_cxx20 rather than
set_std_cxx17.
* c.opt (std=c++2a): Change description to deprecated option wording.
(std=c++20): Remove experimental support part.
(std=c++2b): Change description to deprecated option wording.
(std=gnu++2a): Likewise.
(std=gnu++20): Remove experimental support part.
(std=gnu++2b): Change description to deprecated option wording.
gcc/testsuite/
* lib/target-supports.exp: Set cxx_default to c++20 rather than
c++17.
* lib/g++-dg.exp (g++-std-flags): Reorder list to put 20 first
and 17 after 26.
* g++.dg/debug/pr80461.C (bar): Use v = v + 1; instead of ++v;.
* g++.dg/debug/pr94459.C: Add -std=gnu++17 to dg-options.
* g++.dg/diagnostic/virtual-constexpr.C: Remove dg-skip-if,
instead use { c++11 && c++17_down } effective target instead of
c++11.
* g++.dg/guality/pr67192.C: Add -std=gnu++17.
* g++.dg/torture/pr84961-1.C: Likewise.
* g++.dg/torture/pr84961-2.C: Likewise.
* g++.dg/torture/pr51482.C (anim_track_bez_wvect::tangent): Cast
key_class to int before multiplying it by float.
* g++.dg/torture/stackalign/unwind-4.C (foo): Use g_a = g_a + 1;
instead of g_a++;.
* g++.dg/tree-prof/partition1.C (bar): Use l = l + 1; return l;
instead of return ++l;.
* obj-c++.dg/exceptions-3.mm: Add -std=gnu++17.
* obj-c++.dg/exceptions-5.mm: Likewise.
libgomp/
* testsuite/libgomp.c++/atomic-12.C (main): Add ()s around array
reference index.
* testsuite/libgomp.c++/atomic-13.C: Likewise.
* testsuite/libgomp.c++/atomic-8.C: Likewise.
* testsuite/libgomp.c++/atomic-9.C: Likewise.
* testsuite/libgomp.c++/loop-6.C: Use count = count + 1;
return count > 0; instead of return ++count > 0;.
* testsuite/libgomp.c++/pr38650.C: Add -std=gnu++17.
* testsuite/libgomp.c++/target-lambda-1.C (merge_data_func):
Use [=,this] instead of just [=] in lambda captures.
* testsuite/libgomp.c-c++-common/target-40.c (f1): Use v += 1;
instead of v++;.
* testsuite/libgomp.c-c++-common/depend-iterator-2.c: Use v = v + 1;
instead of v++.
Tomasz Kamiński [Thu, 13 Nov 2025 13:54:11 +0000 (14:54 +0100)]
libstdc++: Optimize functor storage for transform views iterators.
The iterators for transform views (views::transform, views::zip_transform,
and views::adjacent_transform) now store a function handle (from the
__detail::__func_handle namespace) instead of a pointer to the view object
(_M_parent).
The following handle templates are defined in __func_handle namespace:
* _Inplace: Used if the functor is a function pointer or standard operator
wrapper (std::less<>, etc). The functor is stored directly in __func_handle
and the iterator. This avoid double indirection through a pointer to the
function pointer, and reduce the size of iterator for std wrappers.
* _InplaceMemPtr: Used for data or function member pointers. This behaves
similarly to _Inplace, but uses __invoke for invocations.
* _StaticCall: Used if the operator() selected by overload resolution
for the iterator reference is static. In this case, __func_handle is empty,
reducing the iterator size.
* _ViaPointer: Used for all remaining cases. __func_handle stores a pointer
to the functor object stored within the view. Only for this template is the
cv-qualification of the functor template parameter (_Fn) relevant, and
specializations for both const and mutable types are generated.
As a consequence of these changes, the iterators of transform views no longer
depend on the view object when a handle other than __func_handle::_ViaPointer
is used. The corresponding views are not marked as borrowed_range, as they
are not marked as such in the standard.
The use of _Inplace is limited to the set of pre-C++20 standard functors,
as the operator() of the ones introduced later was retroactively made static.
We do not extend it to arbitrary empty functors, as their operator() may
still depend on the value of the this pointer, as illustrated by test12 in
the std/ranges/adaptors/transform.cc test file.
Storing function member pointers directly increases the iterator size in that
specific case, but this is deemed beneficial for consistent treatment of
function and data member pointers.
To avoid materializing temporaries when the underlying iterator(s) return a
prvalue, the _M_call_deref and _M_call_subscript methods of handles are
defined to accept the iterator(s), which are then dereferenced as arguments
of the functor.
Using _Fd::operator()(*__iters...) inside a requires-expression is only
supported by clang since clang-20; however, by the time of the GCC 16
release, clang-22 should already be available.
libstdc++-v3/ChangeLog:
* include/std/ranges (__detail::__is_std_op_template)
(__detail::__is_std_op_wrapper, __func_handle::_Inplace)
(__func_handle::_InplaceMemPtr, __func_handle::_ViaPointer)
(__func_handle::_StaticCall, __detail::__func_handle_t): Define.
(transform_view::_Iterator, zip_transform_view::_Iterator)
(adjacent_transform_view::_Iterator): Replace pointer to view
(_M_parent) with pointer to functor (_M_fun). Update constructors
to construct _M_fun from *__parent->_M_fun. Define operator* and
operator[] in terms of _M_call_deref and _M_call_subscript.
* testsuite/std/ranges/adaptors/adjacent_transform/1.cc: New tests.
* testsuite/std/ranges/adaptors/transform.cc: New tests.
* testsuite/std/ranges/zip_transform/1.cc: New tests.
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Reviewed-by: Patrick Palka <ppalka@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
Tomasz Kamiński [Thu, 20 Nov 2025 10:23:30 +0000 (11:23 +0100)]
libstdc++: Make C++20s operator wrappers operator() static.
The operator() for function objects introduced in C++20 (e.g., std::identity,
std::compare_three_way, std::ranges::equal_to) is now defined as static.
Although static operator() is a C++23 feature, it is supported in C++20 by
both GCC and clang (in clang since clang-16).
This change is not user-observable, as all affected operators are template
functions. Taking the address of such an operator requires casting to a pointer
to member function with a specific signature. The exact signature is unspecified
per C++20 [member.functions] p2 (e.g. due to potential parameters with default
arguments).
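For illustration, the shape of the change on one wrapper (constraints and
details elided):

    #include <utility>

    struct identity
    {
      template<typename _Tp>
        static constexpr _Tp&&            // static since this change
        operator()(_Tp&& __t) noexcept
        { return std::forward<_Tp>(__t); }

      using is_transparent = void;
    };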
libstdc++-v3/ChangeLog:
* include/bits/ranges_cmp.h (std::identity::operator()):
(ranges::equal_to::operator(), ranges::not_equal_to::operator())
(ranges::greater::operator(), ranges::greater_equal::operator())
(ranges::less::operator(), ranges::less_equal::operator()):
Declare as static.
* libsupc++/compare (std::compare_three_way::operator()):
Declare as static.
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Reviewed-by: Patrick Palka <ppalka@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
Jakub Jelinek [Wed, 26 Nov 2025 10:05:42 +0000 (11:05 +0100)]
eh: Invoke cleanups/destructors in asm goto jumps [PR122835]
The eh pass lowers try { } finally { } stmts and handles
in there e.g. GIMPLE_GOTOs or GIMPLE_CONDs which jump from
within the try block out of that by redirecting the jumps
to an artificial label with code to perform the cleanups/destructors
and then continuing the goto, ultimately to the original label.
Now, for computed gotos and non-local gotos, we document we don't
invoke destructors (and cleanups as well), that is something we really
can't handle, similarly longjmp.
This PR is about asm goto though, and in that case I don't see why
we shouldn't be performing the cleanups. While the user doesn't
specify which particular label will be jumped to, asm goto is more
like GIMPLE_COND (i.e. a conditional goto) than an unconditional
GIMPLE_GOTO: even with potentially more distinct maybe-taken gotos,
there is still a list of the potential labels, and we can adjust some
or all of them to artificial labels that perform the cleanups and
continue the jump towards the user label; we know from where the
jumps go (the asm goto) and to where (the different LABEL_DECLs).
So, the following patch handles asm goto in the eh pass similarly to
GIMPLE_COND and GIMPLE_GOTO.
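An example of the behavior this changes (GNU C/C++; with the patch, the
cleanup runs on the asm goto edge just as it would for a plain goto):

    static void do_cleanup (int *p) { __builtin_printf ("%d\n", *p); }

    void
    f (void)
    {
      {
        int x __attribute__((cleanup (do_cleanup))) = 42;
        asm goto ("" : : : : out);  /* may jump out of x's scope */
      }
    out:;
    }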
Jakub Jelinek [Wed, 26 Nov 2025 09:57:37 +0000 (10:57 +0100)]
match.pd: Use get_range_query (cfun) in more simplifications and pass current stmt to range_of_expr [PR119683]
The following testcase regressed with r13-3596 which switched over
vrp1 to ranger vrp. Before that, I believe vrp1 was registering
SSA_NAMEs with ASSERT_EXPRs at the start of bbs and so even when
querying the global ranges from match.pd patterns during the vrp1
pass, they saw the local ranges for a particular bb rather than global
ranges. In ranger vrp that doesn't happen anymore, so we need to
pass a stmt to range_of_expr if we want the local ranges rather
than global ones, and should be using get_range_query (cfun)
instead of get_global_range_query () (most patterns actually use
the former already). Now, for stmt the following patch attempts
to pass the innermost stmt on which that particular capture appears
as operand, but because some passes use match.pd folding on expressions
not yet in the IL, I've added a helper function which tries to find out
from a capture of the LHS operation whether it is a SSA_NAME with
SSA_NAME_DEF_STMT which is in the IL right now and only query
the ranger with that in that case, otherwise NULL (i.e. what it has
been using before).
2025-11-26 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/119683
* gimple-match.h (gimple_match_ctx): New inline function.
* match.pd ((mult (plus:s (mult:s @0 @1) @2) @3)): Capture
PLUS, use get_range_query (cfun) instead of
get_global_range_query () and pass gimple_match_ctx (@5)
as 3rd argument to range_of_expr.
((plus (mult:s (plus:s @0 @1) @2) @3)): Similarly for MULT,
with @4 instead of @5.
((t * u) / u -> t): Similarly with @2 instead of @4.
((t * u) / v -> t * (u / v)): Capture MULT, pass gimple_match_ctx (@3)
as 3rd argument to range_of_expr.
((X + M*N) / N -> X / N + M): Pass gimple_match_ctx (@3) or
gimple_match_ctx (@4) as 3rd arg to some range_of_expr calls.
((X - M*N) / N -> X / N - M): Likewise.
((X + C) / N -> X / N + C / N): Similarly.
(((T)(A)) + CST -> (T)(A + CST)): Capture CONVERT, use
get_range_query (cfun) instead of get_global_range_query ()
and pass gimple_match_ctx (@2) as 3rd argument to range_of_expr.
(x_5 == cstN ? cst4 : cst3): Capture EQNE and pass
gimple_match_ctx (@4) as 3rd argument to range_of_expr.
Soumya AR [Wed, 16 Jul 2025 13:32:08 +0000 (06:32 -0700)]
aarch64: Script to auto generate JSON tuning routines
This commit introduces a Python maintenance script that generates C++ code
for parsing and serializing AArch64 JSON tuning parameters based on the
schema defined in aarch64-json-schema.h.
The script generates two include files:
- aarch64-json-tunings-parser-generated.inc
- aarch64-json-tunings-printer-generated.inc
These generated files are committed as regular source files and included by
aarch64-json-tunings-parser.cc and aarch64-json-tunings-printer.cc respectively.
----
Additionally, this commit centralizes tuning enum definitions into a new
aarch64-tuning-enums.def file. The enums (autoprefetch_model and ldp_stp_policy)
are now defined once using macros and consumed by both the core definitions
(aarch64-opts.h, aarch64-protos.h) and the generated parser/printer code.
Doing this ensures that if someone wants to add a new enum value, they only
need to make modifications in the .def file, and the codegen from the script
will automatically refer to the same enums.
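A hypothetical sketch of the .def idiom (the enumerator arguments are
assumptions; the file and macro names come from the ChangeLog below):
/* aarch64-tuning-enums.def would contain lines such as:
     AARCH64_LDP_STP_POLICY (DEFAULT)
     AARCH64_LDP_STP_POLICY (ALWAYS)
     AARCH64_LDP_STP_POLICY (NEVER)  */
/* A consumer such as aarch64-opts.h then expands the list into an enum: */
enum aarch64_ldp_stp_policy
{
#define AARCH64_LDP_STP_POLICY(NAME) AARCH64_LDP_STP_POLICY_##NAME,
#include "aarch64-tuning-enums.def"
#undef AARCH64_LDP_STP_POLICY
};
The generated parser and printer include the same .def file with
different macro bodies, so a new enum value shows up in all consumers
at once.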
----
The script is run automatically whenever the JSON schema is modified.
----
Signed-off-by: Soumya AR <soumyaa@nvidia.com>
gcc/ChangeLog:
* config/aarch64/aarch64-json-tunings-parser.cc: Include
aarch64-json-tunings-parser-generated.inc.
* config/aarch64/aarch64-json-tunings-printer.cc: Include
aarch64-json-tunings-printer-generated.inc.
* config/aarch64/aarch64-opts.h (AARCH64_LDP_STP_POLICY): Use
aarch64-tuning-enums.def.
* config/aarch64/aarch64-protos.h (AARCH64_AUTOPREFETCH_MODE): Use
aarch64-tuning-enums.def.
* config/aarch64/t-aarch64: Invoke
aarch64-generate-json-tuning-routines.py if the schema is modified.
* config/aarch64/aarch64-generate-json-tuning-routines.py: New
maintenance script to generate JSON parser/printer routines.
* config/aarch64/aarch64-json-tunings-parser-generated.inc: New file.
* config/aarch64/aarch64-json-tunings-printer-generated.inc: New file.
* config/aarch64/aarch64-tuning-enums.def: New file.
Soumya AR [Wed, 16 Jul 2025 13:31:33 +0000 (06:31 -0700)]
aarch64: Regression tests for parsing of user-provided AArch64 CPU tuning parameters
Signed-off-by: Soumya AR <soumyaa@nvidia.com>
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/aarch64-json-tunings/aarch64-json-tunings.exp: New test.
* gcc.target/aarch64/aarch64-json-tunings/boolean-1.c: New test.
* gcc.target/aarch64/aarch64-json-tunings/boolean-1.json: New test.
* gcc.target/aarch64/aarch64-json-tunings/boolean-2.c: New test.
* gcc.target/aarch64/aarch64-json-tunings/boolean-2.json: New test.
* gcc.target/aarch64/aarch64-json-tunings/empty-brackets.c: New test.
* gcc.target/aarch64/aarch64-json-tunings/empty-brackets.json: New test.
* gcc.target/aarch64/aarch64-json-tunings/empty.c: New test.
* gcc.target/aarch64/aarch64-json-tunings/empty.json: New test.
* gcc.target/aarch64/aarch64-json-tunings/enum-1.c: New test.
* gcc.target/aarch64/aarch64-json-tunings/enum-1.json: New test.
* gcc.target/aarch64/aarch64-json-tunings/enum-2.c: New test.
* gcc.target/aarch64/aarch64-json-tunings/enum-2.json: New test.
* gcc.target/aarch64/aarch64-json-tunings/integer-1.c: New test.
* gcc.target/aarch64/aarch64-json-tunings/integer-1.json: New test.
* gcc.target/aarch64/aarch64-json-tunings/integer-2.c: New test.
* gcc.target/aarch64/aarch64-json-tunings/integer-2.json: New test.
* gcc.target/aarch64/aarch64-json-tunings/integer-3.c: New test.
* gcc.target/aarch64/aarch64-json-tunings/integer-3.json: New test.
* gcc.target/aarch64/aarch64-json-tunings/string-1.c: New test.
* gcc.target/aarch64/aarch64-json-tunings/string-1.json: New test.
* gcc.target/aarch64/aarch64-json-tunings/string-2.c: New test.
* gcc.target/aarch64/aarch64-json-tunings/string-2.json: New test.
* gcc.target/aarch64/aarch64-json-tunings/test-all.c: New test.
* gcc.target/aarch64/aarch64-json-tunings/test-all.json: New test.
* gcc.target/aarch64/aarch64-json-tunings/unidentified-key.c: New test.
* gcc.target/aarch64/aarch64-json-tunings/unidentified-key.json: New test.
* gcc.target/aarch64/aarch64-json-tunings/unsigned-1.c: New test.
* gcc.target/aarch64/aarch64-json-tunings/unsigned-1.json: New test.
* gcc.target/aarch64/aarch64-json-tunings/unsigned-2.c: New test.
* gcc.target/aarch64/aarch64-json-tunings/unsigned-2.json: New test.
* gcc.target/aarch64/aarch64-json-tunings/unsigned-3.c: New test.
* gcc.target/aarch64/aarch64-json-tunings/unsigned-3.json: New test.
Soumya AR [Wed, 16 Jul 2025 13:29:57 +0000 (06:29 -0700)]
aarch64: Enable parsing of user-provided AArch64 CPU tuning parameters
This patch adds support for loading custom CPU tuning parameters from a JSON
file for AArch64 targets. The '-muser-provided-CPU=' flag accepts a
user-provided JSON file and overrides the internal tuning parameters at
GCC runtime.
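A hypothetical invocation (CPU and file names illustrative; the JSON
contents must follow the schema in aarch64-json-schema.h):
gcc -O2 -mcpu=neoverse-v2 -muser-provided-CPU=my-tunings.json test.c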
This patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
Signed-off-by: Soumya AR <soumyaa@nvidia.com>
gcc/ChangeLog:
* config.gcc: Add aarch64-json-tunings-parser.o.
* config/aarch64/aarch64.cc (aarch64_override_options_internal): Invoke
aarch64_load_tuning_params_from_json if -muser-provided-CPU= is
specified.
(aarch64_json_tunings_tests): Extern aarch64_json_tunings_tests().
(aarch64_run_selftests): Add aarch64_json_tunings_tests().
* config/aarch64/aarch64.opt: New option.
* config/aarch64/t-aarch64 (aarch64-json-tunings-parser.o): New define.
* config/aarch64/aarch64-json-schema.h: New file.
* config/aarch64/aarch64-json-tunings-parser.cc: New file.
* config/aarch64/aarch64-json-tunings-parser.h: New file.
Soumya AR [Fri, 11 Jul 2025 12:54:33 +0000 (05:54 -0700)]
json: Add get_map() method to JSON object class
This patch adds a get_map () method to the JSON object class to provide access
to the underlying hash map that stores the JSON key-value pairs.
To do this, we expose the map_t typedef, the return type of get_map().
This change is needed to allow traversal of key-value pairs when parsing
user-provided JSON tuning data.
Additionally, is_a_helper template specializations for json::literal * and
const json::literal * were added to make dynamic casting in the next patch
easier.
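A hedged sketch of traversing a parsed object with the new accessor
(whether get_map () returns a reference, and the exact key type, are
assumptions here):
static void
walk_json_object (json::object *obj)
{
  json::object::map_t &map = obj->get_map ();
  for (json::object::map_t::iterator it = map.begin ();
       it != map.end (); ++it)
    {
      const char *key = (*it).first;
      json::value *val = (*it).second;
      /* ... match KEY against the tuning schema and consume VAL ...  */
      (void) key;
      (void) val;
    }
}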
This patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
Signed-off-by: Soumya AR <soumyaa@nvidia.com>
gcc/ChangeLog:
Soumya AR [Fri, 11 Jul 2025 12:28:17 +0000 (05:28 -0700)]
aarch64: Enable dumping of AArch64 CPU tuning parameters to JSON
This patch adds functionality to dump AArch64 CPU tuning parameters to a JSON
file. The new '-fdump-tuning-model=' flag allows users to export the current
tuning model configuration to a JSON file.
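For instance (hypothetical invocation; CPU and file names illustrative):
gcc -O2 -mcpu=neoverse-v2 -fdump-tuning-model=base.json test.c
The dumped file can then be edited and fed back through the
-muser-provided-CPU= option described above.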
This patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
Signed-off-by: Soumya AR <soumyaa@nvidia.com>
gcc/ChangeLog:
* config.gcc: Add aarch64-json-tunings-printer.o.
* config/aarch64/aarch64.cc (aarch64_override_options_internal): Invoke
aarch64_print_tune_params if -fdump-tuning-model= is specified.
* config/aarch64/aarch64.opt: New option.
* config/aarch64/t-aarch64 (aarch64-json-tunings-printer.o): New define.
* config/aarch64/aarch64-json-tunings-printer.cc: New file.
* config/aarch64/aarch64-json-tunings-printer.h: New file.
Soumya AR [Thu, 10 Jul 2025 10:52:00 +0000 (03:52 -0700)]
aarch64 + arm: Remove const keyword from tune_params members and nested members
To allow runtime updates to tuning parameters, the const keyword is removed from
the members of the tune_params structure and the members of its nested
structures.
Since this patch also touches tuning structures in the arm backend, it was
bootstrapped on aarch64-linux-gnu as well as arm-linux-gnueabihf.
Signed-off-by: Soumya AR <soumyaa@nvidia.com>
gcc/ChangeLog:
Tomasz Kamiński [Wed, 19 Nov 2025 09:29:18 +0000 (10:29 +0100)]
libstdc++: Hashing support for chrono value classes [PR110357]
This patch implements P2592R3, hashing support for std::chrono value classes.
To avoid the known issues with the current hashing of integer types (see
PR104945), we use a chrono::__int_hash function that hashes the bytes of the
representation instead of hash<T>, as the latter simply casts to the value.
Currently _Hash_impl is used, but we should consider replacing it (see
PR55815) before the C++26 ABI is made stable. The function is declared
inside the <chrono> header and the chrono namespace, to make sure that only
chrono components would be affected by such a change. Finally,
chrono::__int_hash is made variadic, to support combining hashes of
multiple integers.
To reduce the number of calls to the hasher (defined out of line), the
calendar types are packed into a single unsigned integer value. This is done
by the chrono::__hash helper, which calls:
* chrono::__as_int to cast the value of a single component to an unsigned
integer with a size matching the one used by the internal representation:
unsigned short for year/weekday_indexed, and unsigned char in all other cases,
* chrono::__pack_ints to pack the integers (if more than one) into a single
integer by performing bit shift operations,
* chrono::__int_hash to hash the value produced by the above.
Hashing of duration, time_point, and zoned_time only hashes the value and
ignores any difference in the period, i.e. the hashes of nanoseconds(2) and
seconds(2) are the same. This does not affect the usage inside unordered
containers, as the arguments are converted to the key type first. To address
it, period::num and period::den could be included in the hash; however, such
an approach would not make the hashes of equal durations (2000ms, 2s) equal,
so they would remain unusable for precomputed hashes. In consequence,
including the period in the hash would only increase the runtime cost,
without any clear benefit.
Furthermore, chrono::__int_hash is used when the duration representation
is an integral type; for other types (floating point, due to the special
handling of +/-0.0, and user-defined types) we delegate to the hash
specialization. This is automatically picked up by time_point, which
delegates to the hasher of the duration. Similarly, for leap_second, which
is specified to use integer durations, we simply hash the representations
of date() and value(). Finally, for zoned_time, in addition to handling
integer durations as described above, we also use __int_hash for
const time_zone* (if used), as hash<T*> has problems similar to the hash
specializations for integers. This is limited to _TimeZonePtr being
const time_zone* (the default), as users can define hash specializations
for raw pointers to their own zones.
As accessing the representation of a duration requires calling the count()
method, which returns a copy of the representation by value, the noexcept
specification of the hasher needs to take the copy constructor of the
duration into consideration. Similar reasoning applies to time_since_epoch
for time_point, and get_sys_time and get_time_zone for zoned_time.
For all these cases we use the internal __is_nothrow_copy_hashable concept.
Finally, support for zoned_time is provided only for the CXX11 string ABI;
the __cpp_lib_chrono feature test macro cannot be bumped if COW strings
are used. To indicate the presence of a hasher for the remaining types,
this patch also bumps the internal __glibcxx_chrono_cxx20 macro and uses
it as a guard for the new features.
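A hedged usage sketch (assumes a libstdc++ with this patch; P2592R3 lets
calendar types be used directly as unordered container keys):
#include <chrono>
#include <unordered_set>
int
main ()
{
  using namespace std::chrono;
  std::unordered_set<year_month_day> holidays;
  holidays.insert (year {2025} / December / 25);
  return holidays.contains (year {2025} / December / 25) ? 0 : 1;
}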
PR libstdc++/110357
libstdc++-v3/ChangeLog:
* include/bits/version.def (chrono, chrono_cxx20): Bump values.
* include/bits/version.h: Regenerate.
* include/std/chrono (__is_nothrow_copy_hashable)
(chrono::__pack_ints, chrono::__as_int, chrono::__int_hash)
(chrono::__hash): Define.
(std::hash): Define partial specialization for duration, time_point,
and zoned_time, and full specializations for calendar types and
leap_second.
(std::__is_fast_hash): Define partial specializations for duration,
time_point, zoned_time.
* testsuite/std/time/hash.cc: New test.
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Co-authored-by: Giuseppe D'Angelo <giuseppe.dangelo@kdab.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
Signed-off-by: Giuseppe D'Angelo <giuseppe.dangelo@kdab.com>
Paul Thomas [Wed, 26 Nov 2025 06:59:20 +0000 (06:59 +0000)]
Fortran: Implement finalization PDTs [PR104650]
2025-11-26 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/104650
* decl.cc (gfc_get_pdt_instance): If the PDT template has
finalizers, make a new f2k_derived namespace for this instance
and copy the template namespace into it. Set the instance
template_sym field to point to the template.
* expr.cc (gfc_check_pointer_assign): Allow array value pointer
lvalues to point to scalar null expressions in initialization.
* gfortran.h: Add the template_sym field to gfc_symbol.
* resolve.cc (gfc_resolve_finalizers): For a pdt_type, copy the
final subroutines with the same type argument into the pdt_type
finalizer list. Prevent final subroutine type checking and
creation of the vtab for pdt_templates.
* symbol.cc (gfc_free_symbol): Do not call gfc_free_namespace
for pdt_type with finalizers. Instead, free the finalizers and
the namespace.
gcc/testsuite
PR fortran/104650
* gfortran.dg/pdt_70.f03: New test.
Make better use of overflowing operations in max/min(a, add/sub(a, b)) [PR116815]
This patch folds the following patterns:
- For add:
- umax (a, add (a, b)) -> [sum, ovf] = adds (a, b); !ovf ? sum : a
- umin (a, add (a, b)) -> [sum, ovf] = adds (a, b); !ovf ? a : sum
... along with the commutated versions:
- umax (a, add (b, a)) -> [sum, ovf] = adds (b, a); !ovf ? sum : a
- umin (a, add (b, a)) -> [sum, ovf] = adds (b, a); !ovf ? a : sum
- For sub:
- umax (a, sub (a, b)) -> [diff, udf] = subs (a, b); udf ? diff : a
- umin (a, sub (a, b)) -> [diff, udf] = subs (a, b); udf ? a : diff
Here ovf is the overflow flag and udf is the underflow flag. The adds
and subs are emitted as parallel compare+plus/minus patterns which map
to add<mode>3_compareC and sub<mode>3_compare1.
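A C-level illustration of the first fold's source idiom (function name
invented); with this patch, aarch64 can emit the body as an adds followed
by a conditional select instead of a separate add, cmp and csel:
unsigned long
max_of_sum (unsigned long a, unsigned long b)
{
  unsigned long sum = a + b;   /* may wrap */
  /* For unsigned types, a + b < a exactly when the addition wrapped,
     so this computes umax (a, a + b).  */
  return sum > a ? sum : a;
}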
This patch is a respin of the patch posted at
https://gcc.gnu.org/pipermail/gcc-patches/2025-May/685021.html as per
the suggestion to turn it into a target-specific transform by Richard
Biener.
FIXME: This pattern cannot currently factor multiple occurrences of the
add expression into a single adds, e.g. max (a, a + b) + min (a + b, b)
ends up generating two adds instructions. This is something that
was lost when going from GIMPLE to target-specific transforms.
Bootstrapped and regtested on aarch64-unknown-linux-gnu.
gcc/ChangeLog:
* config/aarch64/aarch64.md
(*aarch64_plus_within_<optab><mode>3_<ovf_commutate>): New pattern.
(*aarch64_minus_within_<optab><mode>3): Likewise.
* config/aarch64/iterators.md (ovf_add_cmp): New code attribute.
(udf_sub_cmp): Likewise.
(UMAXMIN): New code iterator.
(ovf_commutate): New iterator.
(ovf_comm_opp): New int attribute.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/pr116815-1.c: New test.
* gcc.target/aarch64/pr116815-2.c: Likewise.
* gcc.target/aarch64/pr116815-3.c: Likewise.
Pan Li [Mon, 20 Oct 2025 13:08:46 +0000 (21:08 +0800)]
RISC-V: Add testcase for unsigned scalar SAT_MUL form 7
Form 7 of the unsigned scalar SAT_MUL has been supported since the
previous change. Thus, add test cases to make sure it
works well.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/sat/sat_arith.h: Add test helper macros.
* gcc.target/riscv/sat/sat_u_mul-8-u16-from-u128.c: New test.
* gcc.target/riscv/sat/sat_u_mul-8-u16-from-u32.c: New test.
* gcc.target/riscv/sat/sat_u_mul-8-u16-from-u64.rv32.c: New test.
* gcc.target/riscv/sat/sat_u_mul-8-u16-from-u64.rv64.c: New test.
* gcc.target/riscv/sat/sat_u_mul-8-u32-from-u128.c: New test.
* gcc.target/riscv/sat/sat_u_mul-8-u32-from-u64.rv32.c: New test.
* gcc.target/riscv/sat/sat_u_mul-8-u32-from-u64.rv64.c: New test.
* gcc.target/riscv/sat/sat_u_mul-8-u64-from-u128.c: New test.
* gcc.target/riscv/sat/sat_u_mul-8-u8-from-u128.c: New test.
* gcc.target/riscv/sat/sat_u_mul-8-u8-from-u16.c: New test.
* gcc.target/riscv/sat/sat_u_mul-8-u8-from-u32.c: New test.
* gcc.target/riscv/sat/sat_u_mul-8-u8-from-u64.rv32.c: New test.
* gcc.target/riscv/sat/sat_u_mul-8-u8-from-u64.rv64.c: New test.
* gcc.target/riscv/sat/sat_u_mul-run-8-u16-from-u128.c: New test.
* gcc.target/riscv/sat/sat_u_mul-run-8-u16-from-u32.c: New test.
* gcc.target/riscv/sat/sat_u_mul-run-8-u16-from-u64.c: New test.
* gcc.target/riscv/sat/sat_u_mul-run-8-u32-from-u128.c: New test.
* gcc.target/riscv/sat/sat_u_mul-run-8-u32-from-u64.c: New test.
* gcc.target/riscv/sat/sat_u_mul-run-8-u64-from-u128.c: New test.
* gcc.target/riscv/sat/sat_u_mul-run-8-u8-from-u128.c: New test.
* gcc.target/riscv/sat/sat_u_mul-run-8-u8-from-u16.c: New test.
* gcc.target/riscv/sat/sat_u_mul-run-8-u8-from-u32.c: New test.
* gcc.target/riscv/sat/sat_u_mul-run-8-u8-from-u64.c: New test.
Pan Li [Tue, 25 Nov 2025 07:18:38 +0000 (15:18 +0800)]
Match: Add unsigned SAT_MUL for form 7
This patch tries to match the unsigned
SAT_MUL form 7, aka below:
#define DEF_SAT_U_MUL_FMT_7(NT, WT) \
NT __attribute__((noinline)) \
sat_u_mul_##NT##_from_##WT##_fmt_7 (NT a, NT b) \
{ \
WT x = (WT)a * (WT)b; \
NT max = -1; \
bool overflow_p = x > (WT)(max); \
return -(NT)(overflow_p) | (NT)x; \
}
where WT is uint128_t, uint64_t, uint32_t or uint16_t, and
NT is uint64_t, uint32_t, uint16_t or uint8_t.
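Instantiated with, say, NT = uint8_t and WT = uint16_t, the macro yields
a function that multiplies in the wider type; -(NT)(overflow_p) is then
either 0 or an all-ones 0xff mask, so ORing it with the truncated product
saturates the result to the type maximum on overflow.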
gcc/ChangeLog:
* match.pd: Add pattern for SAT_MUL form 7, including
mul and widen_mul.
Andrew Pinski [Tue, 25 Nov 2025 07:34:45 +0000 (23:34 -0800)]
phiprop: Small compile time improvement for phiprop
Now that post dom information is only needed when the new store
can trap (since r16-5555-g952e145796d), only calculate it when
that is the case. On-demand calculation was added by
r14-2051-g3124bfb14c0bdc; this just changes when we need to
calculate it.
Pushed as obvious.
gcc/ChangeLog:
* tree-ssa-phiprop.cc (propagate_with_phi): Only
calculate on demand post dom info when the new store
might trap.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Andrew Pinski [Tue, 25 Nov 2025 22:19:18 +0000 (14:19 -0800)]
phiprop: Make sure types of the load match the inserted phi [PR122847]
This was introduced with r16-5556-ge94e91d6f3775, but the type
check for delayed loads did not happen because the type was still
NULL at the point of delaying; the type is only set when a
non-delayed load creates the phi.
This adds the type check to the replacement for the delayed statements.
Pushed as obvious.
PR tree-optimization/122847
gcc/ChangeLog:
* tree-ssa-phiprop.cc (propagate_with_phi): Add type
check for reuse of the phi for the delayed statements.
gcc/testsuite/ChangeLog:
* gcc.dg/torture/pr122847-1.c: New test.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Frank Scheiner [Sat, 22 Nov 2025 13:58:10 +0000 (14:58 +0100)]
libgomp: Fix GCC build after glibc@cd748a6
This is not ia64-specific: with the changes in the recent glibc
commit "Implement C23 const-preserving standard library macros" (i.e.
[2]), "char *q" needs to be a pointer to a const char to compile
without error, because of the return value of strchr().
Philip Herron [Mon, 17 Nov 2025 21:14:44 +0000 (21:14 +0000)]
gccrs: Add support for initial generic associated types
This patch is the initial part of supporting generic associated types. In Rust we have
trait item types that get implemented, for example:
trait Foo<T> {
    type Bar;
}
impl<T> Foo<T> for T {
    type Bar = T;
}
The trait position uses a Ty::Placeholder, which is just a thing that gets set for
lazy evaluation to the impl type alias, which is actually a Ty::Projection.
The projection type needs to hold onto generics in order to properly
support generic types; this GATs support extends this all the way to the
placeholder, which still needs to be done.
Fixes Rust-GCC#4276
gcc/rust/ChangeLog:
* ast/rust-ast.cc (TraitItemType::as_string): Add generic params.
* ast/rust-ast.h: Remove old comment.
* ast/rust-item.h: Add generic params to associated type.
* ast/rust-type.h: Remove old comment.
* hir/rust-ast-lower-implitem.cc (ASTLowerTraitItem::visit): HIR lowering for GATs.
* hir/tree/rust-hir-item.cc (TraitItemType::TraitItemType): GATs on TraitItemType.
(TraitItemType::operator=): Preserve generic params.
* hir/tree/rust-hir-item.h: Likewise.
* hir/tree/rust-hir.cc (TraitItemType::as_string): Likewise.
* parse/rust-parse-impl.h (Parser::parse_trait_type): Parse generic params after '<'.
* typecheck/rust-hir-type-check-implitem.cc (TypeCheckImplItemWithTrait::visit): Typecheck GATs.
* typecheck/rust-tyty.cc (BaseType::has_substitutions_defined): Don't destructure.
gcc/testsuite/ChangeLog:
* rust/compile/gat1.rs: New test.
* rust/execute/torture/gat1.rs: New test.
Signed-off-by: Philip Herron <herron.philip@googlemail.com>
Owen Avery [Sun, 17 Aug 2025 18:15:35 +0000 (14:15 -0400)]
gccrs: Improve feature handling
This includes a program, written using flex and bison, to extract
information on unstable features from rustc source code and save it to a
header file.
The script does fetch files from https://github.com/rust-lang/rust (the
official rustc git repository), which should be alright, as it's only
intended to be run by maintainers.
See https://doc.rust-lang.org/unstable-book/ for information on unstable
features.
gcc/rust/ChangeLog:
* checks/errors/feature/rust-feature-gate.cc
(FeatureGate::gate): Handle removal of Feature::create.
(FeatureGate::visit): Refer to AUTO_TRAITS as
OPTIN_BUILTIN_TRAITS.
* checks/errors/feature/rust-feature.cc (Feature::create):
Remove.
(Feature::feature_list): New static member variable.
(Feature::name_hash_map): Use "rust-feature-defs.h" to define.
(Feature::lookup): New member function definition.
* checks/errors/feature/rust-feature.h (Feature::State): Add
comments.
(Feature::Name): Use "rust-feature-defs.h" to define.
(Feature::as_string): Make const.
(Feature::name): Likewise.
(Feature::state): Likewise.
(Feature::issue): Likewise.
(Feature::description): Remove member function declaration.
(Feature::create): Remove static member function declaration.
(Feature::lookup): New member function declarations.
(Feature::Feature): Adjust arguments.
(Feature::m_rustc_since): Rename to...
(Feature::m_rust_since): ...here.
(Feature::m_description): Remove.
(Feature::m_reason): New member variable.
(Feature::feature_list): New static member variable.
* checks/errors/feature/rust-feature-defs.h: New file.
* checks/errors/feature/contrib/parse.y: New file.
* checks/errors/feature/contrib/scan.l: New file.
* checks/errors/feature/contrib/.gitignore: New file.
* checks/errors/feature/contrib/Makefile: New file.
* checks/errors/feature/contrib/fetch: New file.
* checks/errors/feature/contrib/regen: New file.
* checks/errors/feature/contrib/copyright-stub.h: New file.
* checks/errors/feature/contrib/README: New file.
Owen Avery [Sat, 28 Jun 2025 01:44:01 +0000 (21:44 -0400)]
gccrs: Create LocalVariable
This should make it easier for us to move away from leaking pointers to
Bvariable everywhere. Since LocalVariable has a single field of type
tree, it should be the same size as a pointer to Bvariable, making the
switch to LocalVariable wherever possible strictly an improvement.
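A minimal sketch of the shape described above (member and accessor names
are assumptions):
class LocalVariable
{
  tree m_inner;
public:
  explicit LocalVariable (tree inner) : m_inner (inner) {}
  tree get_tree () const { return m_inner; }
};
Since it wraps a single tree, LocalVariable is trivially copyable and no
larger than the Bvariable * it replaces.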
Rainer Orth [Tue, 25 Nov 2025 21:25:48 +0000 (22:25 +0100)]
build: Save/restore CXXFLAGS for zstd tests
I recently encountered a bootstrap failure on trunk caused by the fact
that an older out-of-tree version of ansidecl.h was found before the
in-tree one in $top_srcdir/include, so some macros from that header
that are used in gcc weren't defined.
The out-of-tree version was located in $ZSTD_INC (-I/vol/gcc/include)
which caused that directory to be included in gcc's CXXFLAGS like
CXXFLAGS='-g -O2 -fchecking=1 -I/vol/gcc/include'
causing it to be searched before $srcdir/../include.
I could trace this to the zstd.h test in gcc/configure.ac which sets
CXXFLAGS and LDFLAGS before the actual test, but doesn't reset them
afterwards.
So this patch does just that.
Bootstrapped without regressions on i386-pc-solaris2.11 and
x86_64-pc-linux-gnu.
David Malcolm [Tue, 25 Nov 2025 17:44:10 +0000 (12:44 -0500)]
testsuite: fix issues in gcc.dg/analyzer/strchr-1.c seen with C23 libc
Simplify this test case in the hope of avoiding an error seen
with glibc-2.42.9000-537-gcd748a63ab1 in CI with
"Implement C23 const-preserving standard library macros".
gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/strchr-1.c: Drop include of <string.h>, and use
__builtin_strchr throughout rather than strchr to avoid const
correctness issues when the header implements strchr with a C23
const-preserving macro. Drop "const" from two vars.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
David Malcolm [Tue, 25 Nov 2025 17:43:52 +0000 (12:43 -0500)]
analyzer: add logging to deref_before_check::emit
gcc/analyzer/ChangeLog:
* sm-malloc.cc (deref_before_check::emit): Add logging of the
various conditions for late-rejection of a
-Wanalyzer-deref-before-check warning.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Where we currently use an Adv. SIMD compare and a reduction sequence to
implement an early break, this patch implements the new optabs
vec_cbranch_any and vec_cbranch_all in order to replace the Adv. SIMD
compare and reduction with an SVE flag-setting compare.
These optabs could also be used to optimize the Adv. SIMD sequence when SVE
is not available. I have a separate patch for that and will send it
depending on whether this approach is accepted or not.
Note that for floating-point we still need the ptest, as floating-point SVE
compares don't set flags. In addition, because SVE doesn't have a
CMTST-equivalent instruction, we have to do an explicit AND before the
compares. These two cases don't have a speed advantage, but do have a
codesize one, so I've left them enabled.
This patch also eliminates the PTEST on a normal SVE compare and branch
through the introduction of the new optabs cond_vec_cbranch_any and
cond_vec_cbranch_all.
In the example
void f1 ()
{
  for (int i = 0; i < N; i++)
    {
      b[i] += a[i];
      if (a[i] > 0)
        break;
    }
}
Here the ptest isn't needed, since the branch only cares about the Z and NZ
flags.
GCC today supports eliding this through the pattern *cmp<cmp_op><mode>_ptest;
however, this pattern only supports the removal when the outermost context is
a CMP where the predicate is inside the condition itself.
This typically only happens for an unpredicated CMP, as a ptrue will be
generated during expand. In a vectorized loop, by contrast, the loop mask is
applied to the compare as an AND.
The loop mask is moved into the compare by the pattern *cmp<cmp_op><mode>_and,
which moves the mask inside if the current mask is a ptrue, since
p && true -> p.
However, this happens after combine, and so we can't both move the predicate
inside and eliminate the ptest.
To fix this, the middle-end will now rewrite the mask into the compare optab
and indicate that only the CC flags are required. This allows us to simply
not generate the ptest at all, rather than trying to eliminate it later on.