Andrew Pinski [Tue, 11 Nov 2025 20:07:11 +0000 (12:07 -0800)]
Merge remove_forwarder_block_with_phi into remove_forwarder_block
This is the last cleanup in this area. Merges the splitting functionality
of remove_forwarder_block_with_phi into remove_forwarder_block.
Now mergephi still has the ability to split the edges when merging the forwarder
block with a phi. But this reduces the non-shared code a lot.
gcc/ChangeLog:
* tree-cfgcleanup.cc (tree_forwarder_block_p): Remove must argument.
(remove_forwarder_block): Add can_split
argument. Handle the splitting case (iff phis in bb).
(cleanup_tree_cfg_bb): Update argument to tree_forwarder_block_p.
(remove_forwarder_block_with_phi): Remove.
(pass_merge_phi::execute): Update argument to tree_forwarder_block_p
and call remove_forwarder_block instead of remove_forwarder_block_with_phi.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Andrew Pinski [Tue, 11 Nov 2025 19:29:38 +0000 (11:29 -0800)]
cfgcleanup: Support merging forwarder blocks with phis [PR122493]
This adds support for merging forwarder blocks with phis in cleanupcfg.
This patch might seem small but that is because the previous patches were
done to build up to make it easier to add this support.
There is still one more patch to merge remove_forwarder_block
and remove_forwarder_block_with_phi since remove_forwarder_block_with_phi
supports splitting an edge which is not supported as an option in remove_forwarder_block.
The splitting edge option should not be enabled for cfgcleanup but only for mergephi.
Note r8-338-ge7d70c6c3bccb2 added always creating a preheader for loops, so we should
protect preheaders if we have a phi node; otherwise it goes back and forth here. Both the gimple
and RTL loop code like to have this preheader when the same constant
value is the starting value of the loop.
Explanation of testcase changes:
gcc.target/i386/pr121062-1.c needed a small change because there is a basic block
which is not duplicated so only one `movq reg, -1` is there instead of 2.
uninit-pred-7_a.c is xfailed and filed as PR122660, some analysis in the PR already of
the difference now.
uninit-pred-5.C was actually a false positive because when
m_best_candidate is non-NULL, m_best_candidate_len is always initialized.
The log message on the testcase is wrong; if you manually follow the path
you can notice that. With an extra jump threading after the merging of
some bbs, the false positive is now no longer happening. So change the
dg-warning to dg-bogus.
ssa-dom-thread-7.c now jump threads 12 times in thread2 instead of 8.
Bootstrapped and tested on x86_64-linux-gnu.
PR tree-optimization/122493
gcc/ChangeLog:
* tree-cfgcleanup.cc (tree_forwarder_block_p): Change bool argument
to a must have phi and allow phis if it is false.
(remove_forwarder_block): Add support for merging of forwarder blocks
with phis.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr121062-1.c: Update count.
* gcc.dg/uninit-pred-7_a.c: xfail line 23.
* g++.dg/uninit-pred-5.C: Change dg-warning to dg-bogus.
* gcc.dg/tree-ssa/ssa-dom-thread-7.c: Update count of jump thread.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Andrew Pinski [Wed, 12 Nov 2025 09:30:30 +0000 (01:30 -0800)]
fix handling of mapped and their location
When using the newly mapped location, we should check whether it is
an unknown location and, if so, just use the original location instead.
Note this is a latent bug in remove_forwarder_block_with_phi code too.
This fixes gcc.dg/uninit-pr40635.c when doing more mergephi.
gcc/ChangeLog:
* tree-cfg.cc (copy_phi_arg_into_existing_phi): Use the original location
if the mapped location is unknown.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Andrew Pinski [Tue, 11 Nov 2025 08:38:25 +0000 (00:38 -0800)]
mergephi: extend copy_phi_arg_into_existing_phi and use it for remove_forwarder_block_with_phi
copy_phi_arg_into_existing_phi was added in r14-477-g78b0eea7802698
and used in remove_forwarder_block but since
remove_forwarder_block_with_phi needed to use the redirect edge var
map, it was not moved over. This extends copy_phi_arg_into_existing_phi
to have the ability to optionally use the mapper.
This also makes remove_forwarder_block_with_phi and remove_forwarder_block closer to
one another. There are a few other changes needed to be able to do both
from the same function.
Bootstrapped and tested on x86_64-linux-gnu.
gcc/ChangeLog:
* tree-cfg.cc (copy_phi_arg_into_existing_phi): New use_map argument.
* tree-cfg.h (copy_phi_arg_into_existing_phi): Update declaration.
* tree-cfgcleanup.cc (remove_forwarder_block_with_phi): Use
copy_phi_arg_into_existing_phi instead of inlining it.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Andrew Pinski [Mon, 10 Nov 2025 01:17:49 +0000 (17:17 -0800)]
mergephi: use edge iterator in remove_forwarder_block_with_phi
It was always kind of odd that while remove_forwarder_block used
an edge iterator, remove_forwarder_block_with_phi used a while loop.
remove_forwarder_block_with_phi was added after remove_forwarder_block too.
Anyway, this changes remove_forwarder_block_with_phi to use the same
form of loop so it is easier to merge the 2.
gcc/ChangeLog:
* tree-cfgcleanup.cc (remove_forwarder_block_with_phi): Use
edge iterator instead of while loop.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Andrew Pinski [Mon, 10 Nov 2025 00:13:05 +0000 (16:13 -0800)]
cfgcleanup: Remove check on available dominator information in remove_forwarder_block
Since at least r9-1005-gb401e50fed4def, dominator information is
available in remove_forwarder_block so there is no reason to have a
check on if we should update the dominator information, always do it.
This is one more step into commoning remove_forwarder_block and remove_forwarder_block_with_phi.
Bootstrapped and tested on x86_64-linux-gnu.
gcc/ChangeLog:
* tree-cfgcleanup.cc (remove_forwarder_block): Remove check
on the available dominator information.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Andrew Pinski [Wed, 12 Nov 2025 00:47:04 +0000 (16:47 -0800)]
cfgcleanup: forwarder block, ignore bbs which merge with the predecessor
While moving mergephi's forwarder block removal over to cfgcleanup,
I noticed a few regressions where a forwarder block was (correctly) removed
but the counts were not updated. Instead, let these blocks be handled by the merge_blocks
cleanup code.
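A minimal sketch of the case being excluded, using a toy CFG held in Python dictionaries. The representation and the helper name `left_to_merge_blocks` are assumptions for illustration and do not mirror the real tree_forwarder_block_p code; only the condition (a single predecessor which itself has a single successor) comes from this commit.

```python
def left_to_merge_blocks(bb, preds, succs):
    """Return True if bb has exactly one predecessor and that predecessor
    has exactly one successor (which is then necessarily bb)."""
    if len(preds[bb]) != 1:
        return False
    (pred,) = preds[bb]
    return len(succs[pred]) == 1


# Toy CFG (hypothetical): A -> B -> D and C -> D
succs = {"A": ["B"], "B": ["D"], "C": ["D"], "D": []}
preds = {"A": [], "B": ["A"], "C": [], "D": ["B", "C"]}
```

In this toy graph, B would be left for the merge_blocks cleanup, while D (two predecessors) would still be a candidate for the forwarder-block check.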
gcc/ChangeLog:
* tree-cfgcleanup.cc (tree_forwarder_block_p): Reject bb which has a single
predecessor which has a single successor.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Andrew Pinski [Sun, 9 Nov 2025 23:54:43 +0000 (15:54 -0800)]
mergephi: Move checks from pass_merge_phi::execute to remove_forwarder_block_with_phi
This moves the checks that were in pass_merge_phi::execute into remove_forwarder_block_with_phi
or tree_forwarder_block_p to make it easier to merge remove_forwarder_block_with_phi with remove_forwarder_block.
This also simplifies the code slightly because we can do `return false` rather than break
in one location.
gcc/ChangeLog:
* tree-cfgcleanup.cc (pass_merge_phi::execute): Move
check for abnormal or no phis to remove_forwarder_block_with_phi
and the check on dominated to tree_forwarder_block_p.
(remove_forwarder_block_with_phi): here.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Andrew Pinski [Sun, 9 Nov 2025 22:07:15 +0000 (14:07 -0800)]
cfgcleanup: Move check for dest containing non-local label/eh landing pad to tree_forwarder_block_p
I noticed this check was in both remove_forwarder_block and remove_forwarder_block_with_phi but
the two copies were slightly different in that the eh landing pad was not being checked for in remove_forwarder_block_with_phi
when it definitely should be.
This folds the check into tree_forwarder_block_p instead as it is called right beforehand anyway.
The eh landing pad check was added to the non-phi one by r0-98233-g28e5ca15b76773 but missed the phi variant;
I am not sure if it could show up there but it is better to have one common code than having two copies of
slightly different checks.
Bootstrapped and tested on x86_64-linux-gnu.
gcc/ChangeLog:
* tree-cfgcleanup.cc (remove_forwarder_block_with_phi): Remove check on non-local label.
(remove_forwarder_block): Remove check on non-local label/eh landing pad.
(tree_forwarder_block_p): Add check on label for an eh landing pad.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Andrew Pinski [Sun, 9 Nov 2025 21:56:12 +0000 (13:56 -0800)]
cfgcleanup: Remove check for infinite loop in remove_forwarder_block/remove_forwarder_block_with_phi
Since removing the worklist for both mergephi and cfgcleanup (r0-80545-g672987e82f472b), these
two functions are now called right after tree_forwarder_block_p so there is no reason for the
extra check for an infinite loop nor for the check on loop headers, as both are already
handled in tree_forwarder_block_p.
Bootstrapped and tested on x86_64-linux-gnu.
gcc/ChangeLog:
* tree-cfgcleanup.cc (remove_forwarder_block): Remove check for infinite loop.
(remove_forwarder_block_with_phi): Likewise. Also remove check for loop header.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Andrew Pinski [Sun, 9 Nov 2025 06:40:08 +0000 (22:40 -0800)]
mergephi: Remove worklist
Since the worklist was never added to and the analysis part can benefit
from the work part, we can combine the analysis part with the work part.
This should give a small speedup for this pass.
Looking into the history here, remove_forwarder_block used to add to the worklist
but remove_forwarder_block_with_phi never did.
This is the first step in moving part of the functionality of mergephi into
cfgcleanup.
Jeff Law [Thu, 13 Nov 2025 20:10:12 +0000 (13:10 -0700)]
Handle shift-pairs in ext-dce for targets without zero/sign extension insns
This is more prep work for revamping the zero/sign extension patterns on RISC-V
to avoid the need for define_insn_and_splits.
The core issue at hand is for the base ISA we don't have the full set of
sign/zero extensions. So what's been done so far is to pretend we do via a
define_insn_and_split, then split the extensions into shift pairs post-reload
(for the base ISA).
That has multiple undesirable properties, including inhibiting optimization in
some cases and making it harder to add new optimizations in the most natural
way in the future.
The basic approach we've been taking to these problems has been to generate the
desired code at expansion time. When we do that for RISC-V, ext-dce will no
longer see the zero/sign extension nodes when compiling for the base ISA --
instead it'll see shift pairs. And that in turn causes ext-dce to miss
elimination opportunities which is a regression relative to the trunk right
now.
This patch improves ext-dce to recognize the second shift (right) in such a
sequence, then try to match it up with a prior left shift (which has to be the
immediately prior real instruction). When it can pair them up it'll treat the
pair like an extension. The right shift turns into a simple copy of the source
of the left shift.
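As a rough illustration of why such a shift pair can be treated like an extension, the sketch below models a fixed-width register in Python. The 64-bit width and all helper names are assumptions made for the example; nothing here is taken from ext-dce.cc.

```python
WIDTH = 64                      # assumed register width for this sketch
MASK = (1 << WIDTH) - 1


def shl(x, n):
    """Left shift, truncated to WIDTH bits."""
    return (x << n) & MASK


def lshr(x, n):
    """Logical right shift on a WIDTH-bit value."""
    return (x & MASK) >> n


def ashr(x, n):
    """Arithmetic right shift on a WIDTH-bit value."""
    x &= MASK
    if x >> (WIDTH - 1):        # reinterpret as negative if the top bit is set
        x -= 1 << WIDTH
    return (x >> n) & MASK


def zero_extend(x, bits):
    """Zero-extend the low `bits` bits of x to WIDTH bits."""
    return x & ((1 << bits) - 1)


def sign_extend(x, bits):
    """Sign-extend the low `bits` bits of x to WIDTH bits."""
    x &= (1 << bits) - 1
    if x >> (bits - 1):
        x -= 1 << bits
    return x & MASK
```

A left shift by WIDTH - bits followed by a logical (respectively arithmetic) right shift by the same amount matches zero_extend (respectively sign_extend). If only the low `bits` bits of the result are live, the pair computes the same thing as a plain copy of the left shift's source.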
That prevents optimization regressions with the in flight code to revamp the
zero extension (and then sign extension) code. No new tests since it's
preventing existing tests from failing to optimize after some in flight stuff
lands.
Bootstrapped and regression tested on x86_64 and tested on all the crosses in
my tester. The Pioneer and BPI will pick it up tonight for bootstrap testing
on RISC-V.
* ext-dce.cc (ext_dce_try_optimize_rshift): New function to optimize a
shift pair implementing a zero/sign extension.
(ext_dce_try_optimize_extension): Renamed from
ext_dce_try_optimize_insn.
(ext_dce_process_uses): Handle shift pairs implementing extensions.
Andrew Pinski [Thu, 13 Nov 2025 05:06:02 +0000 (21:06 -0800)]
sccp: Fix order of gimplification, removal of the phi and constant prop in sccp (3rd time) [PR122637]
This is the 3rd (and hopefully last) time to fix the order here.
The previous times were r16-5093-g77e10b47f25d05 and r16-4905-g7b9d32aa2ffcb5.
The order before these patches was:
* removal of phi
* propagate constants
* gimplification of expr
* create assignment
* rewrite to undefined
* add stmts to bb
The current order before this patch (and after the other 2):
* gimplification of expr
* removal of phi
* create assignment
* propagate constants
* rewrite to undefined
* add stmts to bb
The correct, new order with this patch:
* gimplification of expr
* propagate constants
* removal of phi
* create the assignment
* rewrite to undefined
* add stmts to bb
This is because the propagation of the constant will cause a fold_stmt, which requires
the statement to still be in the IR. The gimplification of expr also calls fold_stmt.
Now with the new order the phi is not removed until right before the creation of the
new assignment, so the IR in the basic block is well defined while calling fold_stmt.
Pushed as obvious after bootstrap/test on x86_64-linux-gnu.
PR tree-optimization/122637
gcc/ChangeLog:
* tree-scalar-evolution.cc (final_value_replacement_loop): Fix order
of gimplification and constant prop.
gcc/testsuite/ChangeLog:
* gcc.dg/torture/pr122637-1.c: New test.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Artemiy Volkov [Thu, 13 Nov 2025 11:15:19 +0000 (11:15 +0000)]
gcc/testsuite: adjust tree-ssa/forwprop-43.c
Introduced in r16-5042-g470411f44f51d9, this testcase fails on
AdvSIMD-less AArch32 configurations, likely as well as on other targets
without vector support; thus, require it via dg-require-effective-target.
Since this testcase includes stdint.h, require that as well.
Regtested on arm-gnueabihf with
RUNTESTFLAGS=--target_board=unix/-mfpu=vfpv3-d16/-march=armv7-a.
Jeff Law [Thu, 13 Nov 2025 15:51:40 +0000 (08:51 -0700)]
[RISC-V][PR rtl-optimization/122627] Yet another fix in IRA equivalence array handling
Yup, yet another out of bounds access into the equivalence array.
In this case we had an out of bounds write, which corrupted the heap leading to
the fault.
Given this is the 3rd such issue in this space in recent history and the second
in this loop within LRA within a week or so, I looked for a solution that would
cover the whole loop rather than another spot fix.
The good news is this loop runs after elimination, so we can just expand the
equivalence array after elimination and all the right things should happen.
This also allows removal of the spot fix I did last week (which I did
backtest). I didn't have a testcase for the bug in this space I fixed a couple
months ago (and the artifacts from that build are certainly gone from my tester
by now).
Bootstrapped and regression tested on x86. Also verified the RISC-V failures
in this bz and bz122321 are fixed.
Given this is a refinement & simplification of a prior fix, I'm going to take
some slight leeway to push the fix forward now.
PR rtl-optimization/122627
gcc/
* lra-constraints.cc (update_equiv): Remove patch from last week
related to pr122321.
(lra_constraints): Expand the equivalence array after eliminations
are complete.
gcc/testsuite/
* gcc.target/riscv/rvv/autovec/pr122627.c: New test.
Eric Botcazou [Sun, 2 Nov 2025 16:11:19 +0000 (17:11 +0100)]
ada: Fix internal error on protected entry and private record
This is a freezing issue introduced by the new support for deferred extra
formals. The freezing of local types created during the expansion of the
entry construct happens in the wrong scope when the expansion is deferred,
causing reference-before-freezing in the expanded code.
gcc/ada/ChangeLog:
* exp_ch9.adb (Expand_N_Entry_Declaration): In the deferred case,
freeze immediately all the newly created entities.
Douglas B Rupp [Fri, 3 Oct 2025 16:54:47 +0000 (09:54 -0700)]
ada: Corrupted unwind info in aarch64-vx7r2 llvm kernel tests
Adjust the register restoration on aarch64 to not use register 96
on llvm. Avoids the "reg too big" warning on aarch64 when sigtramp
is called. For llvm and aarch64, the correct choice seems to be 32.
Remove parens on REGNO_PC_OFFSET when compiling,
it causes a silent failure due to alphanumeric register names.
Define a macro for __attribute ((optimize (2))) which is
empty if not available. (Despite being documented, it generates an
"unknown attribute" warning with clang.)
Define ATTRIBUTE_PRINTF_2 if not defined.
gcc/ada/ChangeLog:
* sigtramp-vxworks-target.h (REGNO_PC_OFFSET): Use 32 vice
96 with llvm/clang. (REGNO_G_REG_OFFSET): Remove parens on
operand. (REGNO_GR): Likewise.
* sigtramp-vxworks.c (__gnat_sigtramp): Define a macro for
__attribute__ optimize, which is empty if not available.
* raise-gcc.c (db): Define ATTRIBUTE_PRINTF_2 if not defined.
Steve Baird [Wed, 8 Oct 2025 22:50:58 +0000 (15:50 -0700)]
ada: Avoid duplicate streaming and Put_Image subprograms.
Duplicate streaming and Put_Image subprograms were being generated in some
cases where this was not intended. In most cases this only resulted in unwanted
code duplication (which, of course, is not good), but in some cases it resulted
in compilation failures with spurious "duplicate body" error messages.
gcc/ada/ChangeLog:
* exp_attr.adb: Rewrite the spec and implementation of package
Cached_Attribute_Ops so that the saved value associated with a
type in a given map is not a single subprogram but instead a
set of subprograms. Thus, the correct generation of a second subprogram
for a given type for use in some other context no longer causes the
first subprogram to be forgotten. This allows more reuse and,
in particular, allows reuse in the case where generating another
copy of the subprogram would result in a compilation failure.
Update Cached_Attribute_Ops clients correspondingly.
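The data-structure change described above can be pictured with a small Python sketch: a cache that keeps several subprograms per type instead of one. The class name, method names, and the `usable_here` predicate are hypothetical and only illustrate the idea; they are not the Ada interface of Cached_Attribute_Ops.

```python
from collections import defaultdict


class CachedAttributeOps:
    """Toy cache: each type maps to a list of subprograms, not a single one."""

    def __init__(self):
        self._by_type = defaultdict(list)

    def add(self, type_name, subprogram):
        """Remember a subprogram for a type without forgetting earlier ones."""
        if subprogram not in self._by_type[type_name]:
            self._by_type[type_name].append(subprogram)

    def find(self, type_name, usable_here):
        """Return the first cached subprogram accepted by usable_here, else None."""
        for subprogram in self._by_type.get(type_name, []):
            if usable_here(subprogram):
                return subprogram
        return None
```

With a single saved value per type, the second `add` would overwrite the first; with a set of candidates, a later lookup from the first context can still find and reuse the original subprogram.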
Gary Dismukes [Sat, 11 Oct 2025 00:15:57 +0000 (00:15 +0000)]
ada: Type-resolution error on target name in assignment to indexed container
The compiler fails to resolve expressions involving a target name (@ symbol)
in assignment statements where the target object is an indexed container
object, complaining that the target name is of the reference type associated
with the container type. The target object is initially viewed as having
the reference type, which is what the compiler was also setting as the
type of the N_Target_Name node in the assignment's expression tree (leading
to type errors), and it's only later expansion that changes the target object
to a dereference whose type is the reference type's designated type, which
is too late.
This is addressed by implementing AI22-0082 and AI22-0112. The first AI is
about changing the reference types declared in the predefined containers
generics to be limited types. The second AI revises the resolution rules for
assignment statements to exclude interpretations that are of limited types.
Combining the two AIs, the case described above will resolve to the dereference
of an indexed container component rather than the interpretation of the indexing
as returning an object of a reference type. The AI22-0112 changes also avoid
ambiguities for assignments involving indexed names (such as "C1(I) := C2(J);"),
at least for cases involving the predefined containers (user-defined containers
that declare nonlimited reference types can still run into such ambiguities).
But apart from those AIs, GNAT was already doing things wrong in
the case of overloaded variable names in assignment statements with
container indexing, in determining the type of target names (@ symbols)
as being of the reference type, which could result in wrong-type errors.
GNAT wasn't following the requirement that the variable name in an
assignment statement must be resolved as a "complete context". This is
now corrected by separate resolution code that's done in the case where
the expression of the assignment contains target names.
Also, the existing code in Analyze_Assignment that's used in the
non-target-name case is revised by removing incorrect code for ignoring
the reference interpretations of generalized indexing and replacing it
with code to remove interpretations of limited types (which, per AI22-0112,
needs to be done whether or not there are target names involved).
It should be noted that the changes to make reference types limited in the
predefined container packages can affect existing code that happens to depend
on the reference types being nonlimited, and code changes may be required to
remove or work around such dependence.
gcc/ada/ChangeLog:
* libgnat/a-cbdlli.ads: Add "limited" to partial view of reference types.
* libgnat/a-cbhama.ads: Likewise.
* libgnat/a-cbhase.ads: Likewise.
* libgnat/a-cbmutr.ads: Likewise.
* libgnat/a-cborma.ads: Likewise.
* libgnat/a-cborse.ads: Likewise.
* libgnat/a-cdlili.ads: Likewise.
* libgnat/a-cidlli.ads: Likewise.
* libgnat/a-cihama.ads: Likewise.
* libgnat/a-cihase.ads: Likewise.
* libgnat/a-cimutr.ads: Likewise.
* libgnat/a-ciorma.ads: Likewise.
* libgnat/a-ciormu.ads: Likewise.
* libgnat/a-ciorse.ads: Likewise.
* libgnat/a-cobove.ads: Likewise.
* libgnat/a-cohama.ads: Likewise.
* libgnat/a-cohase.ads: Likewise.
* libgnat/a-coinho.ads: Likewise.
* libgnat/a-coinho__shared.ads: Likewise.
* libgnat/a-coinve.ads: Likewise.
* libgnat/a-comutr.ads: Likewise.
* libgnat/a-convec.ads: Likewise.
* libgnat/a-coorma.ads: Likewise.
* libgnat/a-coormu.ads: Likewise.
* libgnat/a-coorse.ads: Likewise.
* sem_ch5.adb (Analyze_Assignment): Added code to resolve the target
object (LHS) as a complete context when there are target names ("@")
present in the expression of the assignment. Loop over interpretations,
removing any that have a limited type, and set the type (T1) to be the
type of the first nonlimited interpretation. Test for ambiguity by
calling Is_Ambiguous_Operand. Delay analysis of Rhs in the target-name
case. Replace existing test for generalized indexing with implicit
dereference in existing analysis code with test of Is_Limited_Type
along with calling Remove_Interp in the limited case.
* sem_res.adb (Is_Ambiguous_Operand): Condition the calls to
Report_Interpretation on Report_Errors being True.
Eric Botcazou [Mon, 27 Oct 2025 08:18:53 +0000 (09:18 +0100)]
ada: Detect illegal value of static expression of decimal fixed point type
The RM 4.9(36/2) subclause says that, if a static expression is of type
universal_real and its expected type is a decimal fixed point type, then
its value shall be a multiple of the small of the decimal type. This was
enforced for real literals, but not for real named numbers.
Fixing the problem involves tweaking Fold_Ureal and the same tweak is also
applied to Fold_Uint for the sake of consistency in the implementation.
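The RM 4.9(36/2) rule quoted above amounts to an exact divisibility test. A minimal Python sketch using exact rationals follows; the helper name and the sample values are assumptions chosen for illustration and have nothing to do with the actual sem_eval.adb or sem_res.adb code.

```python
from fractions import Fraction


def is_multiple_of_small(value, small):
    """True if value is an exact integer multiple of small.

    Both arguments are decimal strings so that no binary floating-point
    rounding is involved.
    """
    return (Fraction(value) / Fraction(small)).denominator == 1
```

For a decimal type with a small of 0.01, a static value of 1.25 would be accepted, whereas 1.255 would have to be rejected, whether it is written as a real literal or reached through a named number.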
gcc/ada/ChangeLog:
PR ada/29463
* sem_eval.adb (Fold_Uint): Use Universal_Integer as actual type
for a named number.
(Fold_Ureal): Likewise with Universal_Real.
* sem_res.adb (Resolve_Real_Literal): Test whether the literal is
a static expression instead of coming from source to give the error
prescribed by the RM 4.9(36/2) subclause.
ada: Extend internal documentation of suspension objects
This patch adds documentation that stresses some of the consequences of
RM D.10 (10.2/5) that enable a lightweight implementation of suspension
objects.
gcc/ada/ChangeLog:
* libgnarl/s-taspri__posix.ads (Suspension_Object): Add some
documentation.
Eric Botcazou [Thu, 23 Oct 2025 17:20:49 +0000 (19:20 +0200)]
ada: Fix ancient bug in pragma Suppress (Alignment_Check)
The recent change that streamlined the implementation of alignment checks
has uncovered an ancient bug in the implementation of pragma Suppress on
a specific object:
pragma Suppress (Alignment_Check, A);
The pragma would work only if placed before the address clause:
A : Integer;
pragma Suppress (Alignment_Check, A);
for A'Address use ...
but not if placed (just) after it:
A : Integer;
for A'Address use ...
pragma Suppress (Alignment_Check, A);
which seems unfriendly at best.
gcc/ada/ChangeLog:
* sem_prag.adb (Analyze_Pragma) <Process_Suppress_Unsuppress>: For
Alignment_Check on a specific object with an address clause and no
alignment clause, toggle the Check_Address_Alignment flag present
on the address clause.
Eric Botcazou [Thu, 23 Oct 2025 11:10:59 +0000 (13:10 +0200)]
ada: Further update GNAT RM after recent change to alignment checks
Alignment checks are now fully decoupled from range checks.
gcc/ada/ChangeLog:
* doc/gnat_rm/implementation_defined_pragmas.rst (Pragma Suppress):
Remove mention of range checks in the entry for alignment checks.
* gnat_rm.texi: Regenerate.
Xi Ruoyao [Thu, 6 Nov 2025 13:32:54 +0000 (21:32 +0800)]
LoongArch: Don't mix lock-free and locking 16B atomics
As [1] says, we cannot mix up lock-free and locking atomics for one
object. For example assume atom = (0, 0) initially, if we have a
locking "atomic" xor running on T0 and a lock-free store running on T1
concurrently:
we get atom = (0, 1), but the atomicity of xor and store should
guarantee that atom is either (0, 0) or (1, 1).
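One interleaving consistent with the outcome described above can be replayed with a small Python model. The operand values are assumptions: the xor operand and the stored value are both taken to be (1, 1), and the locking xor is assumed to update the two 8-byte halves one at a time while holding a lock that the lock-free store never looks at. The step names are hypothetical.

```python
def run(schedule):
    """Replay a sequence of steps on atom, starting from (0, 0)."""
    atom = [0, 0]
    for step in schedule:
        if step == "T0: xor word 0":      # locking xor, first half
            atom[0] ^= 1
        elif step == "T0: xor word 1":    # locking xor, second half
            atom[1] ^= 1
        elif step == "T1: store":         # lock-free store of (1, 1), indivisible
            atom[0], atom[1] = 1, 1
        else:
            raise ValueError(step)
    return tuple(atom)


# The two outcomes permitted if both operations were truly atomic.
xor_then_store = ["T0: xor word 0", "T0: xor word 1", "T1: store"]
store_then_xor = ["T1: store", "T0: xor word 0", "T0: xor word 1"]

# The store lands in the middle of the locking xor.
mixed = ["T0: xor word 1", "T1: store", "T0: xor word 0"]
```

Under these assumptions, the first two schedules give (1, 1) and (0, 0), while the mixed schedule gives (0, 1), a value that neither serial order can produce.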
So, if we want to use a lock-free 16B atomic operation, we need both LSX
and SCQ even if that specific operation only needs one of them. To make
things worse, one may link a TU compiled with -mlsx -mscq and another
without them together, then if we want to use the lock-free 16B atomic
operations in the former, we must make libatomic also use the lock-free
16B atomic operation for the latter so we need to add ifuncs for
libatomic, similar to the discussion about i386 vs. i486 in [1].
Implementing and building the ifuncs currently requires:
- Glibc, because the ifunc resolver interface is libc-specific
- Linux, because the HWCAP bit for LSX is kernel-specific
- A recent enough assembler at build time to recognize sc.q
So the approach here is: only allow 16B lock-free atomic operations in
the compiler if the criteria above are satisfied, and ensure libatomic
uses those lock-free operations on capable hardware (via ifunc unless
both LSX and SCQ are already enabled by the builder) if the compiler
allows 16B lock-free atomics.
gcc/
* configure.ac (HAVE_AS_16B_ATOMIC): Define if the assembler
supports LSX and sc.q.
* configure: Regenerate.
* config.in: Regenerate.
* config/loongarch/loongarch-opts.h (HAVE_AS_16B_ATOMIC):
Defined to 0 if undefined yet.
* config/loongarch/linux.h (HAVE_IFUNC_FOR_LIBATOMIC_16B):
Define as HAVE_AS_16B_ATOMIC && OPTION_GLIBC.
* config/loongarch/loongarch-protos.h
(loongarch_16b_atomic_lock_free_p): New prototype.
* config/loongarch/loongarch.cc
(loongarch_16b_atomic_lock_free_p): Implement.
* config/loongarch/sync.md (atomic_storeti_lsx): Require
loongarch_16b_atomic_lock_free_p.
(atomic_storeti): Likewise.
(atomic_exchangeti_scq): Likewise.
(atomic_exchangeti): Likewise.
(atomic_compare_and_swapti): Likewise.
(atomic_fetch_<amop_ti_fetch>ti_scq): Likewise.
(atomic_fetch_<amop_ti_fetch>ti): Likewise.
(ALL_SC): Likewise for TImode.
(atomic_storeti_scq): Remove.
libatomic/
* configure.ac (ARCH_LOONGARCH): New AM_CONDITIONAL.
* Makefile.am (IFUNC_OPT): Separate the item from IFUNC_OPTIONS
to allow using multiple options for an ISA variant.
(libatomic_la_LIBADD): Add *_16_1_.lo for LoongArch.
(IFUNC_OPTIONS): Build *_16_1_.lo for LoongArch with -mlsx and
-mscq.
* configure: Regenerate.
* Makefile.in: Regenerate.
* configure.tgt (try_ifunc): Set to yes for LoongArch if the
compiler can produce lock-free 16B atomic with -mlsx -mscq.
* config/loongarch/host-config.h: Implement ifunc selector.
Andrew Stubbs [Fri, 28 Jun 2024 10:24:43 +0000 (10:24 +0000)]
openmp, nvptx: ompx_gnu_managed_mem_alloc
This adds support for using Cuda Managed Memory with omp_alloc. AMD support
will be added in a future patch.
There is one new predefined allocator, "ompx_gnu_managed_mem_alloc", plus a
corresponding memory space, which can be used to allocate memory in the
"managed" space.
The nvptx plugin is modified to make the necessary Cuda calls, via two new
(optional) plugin interfaces.
gcc/fortran/ChangeLog:
* openmp.cc (is_predefined_allocator): Use GOMP_OMP_PREDEF_ALLOC_MAX
and GOMP_OMPX_PREDEF_ALLOC_MIN/MAX instead of hardcoded values in the
comment.
include/ChangeLog:
* cuda/cuda.h (cuMemAllocManaged): Add declaration and related
CU_MEM_ATTACH_GLOBAL flag.
* gomp-constants.h (GOMP_OMPX_PREDEF_ALLOC_MAX): Update to 201.
(GOMP_OMP_PREDEF_MEMSPACE_MAX): New constant.
(GOMP_OMPX_PREDEF_MEMSPACE_MIN): New constant.
(GOMP_OMPX_PREDEF_MEMSPACE_MAX): New constant.
libgomp/ChangeLog:
* allocator.c (ompx_gnu_max_predefined_alloc): Update to
ompx_gnu_managed_mem_alloc.
(_Static_assert): Fix assertion messages for allocators and add
new assertions for memspace constants.
(omp_max_predefined_mem_space): New define.
(ompx_gnu_min_predefined_mem_space): New define.
(ompx_gnu_max_predefined_mem_space): New define.
(MEMSPACE_ALLOC): Add check for non-standard memspaces.
(MEMSPACE_CALLOC): Likewise.
(MEMSPACE_REALLOC): Likewise.
(MEMSPACE_VALIDATE): Likewise.
(predefined_ompx_gnu_alloc_mapping): Add ompx_gnu_managed_mem_space.
(omp_init_allocator): Add ompx_gnu_managed_mem_space validation.
* config/gcn/allocator.c (gcn_memspace_alloc): Add check for
non-standard memspaces.
(gcn_memspace_calloc): Likewise.
(gcn_memspace_realloc): Likewise.
(gcn_memspace_validate): Update to validate standard vs non-standard
memspaces.
* config/linux/allocator.c (linux_memspace_alloc): Add managed
memory space handling.
(linux_memspace_calloc): Likewise.
(linux_memspace_free): Likewise.
(linux_memspace_realloc): Likewise (returns NULL for fallback).
* config/nvptx/allocator.c (nvptx_memspace_alloc): Add check for
non-standard memspaces.
(nvptx_memspace_calloc): Likewise.
(nvptx_memspace_realloc): Likewise.
(nvptx_memspace_validate): Update to validate standard vs non-standard
memspaces.
* env.c (parse_allocator): Add ompx_gnu_managed_mem_alloc,
ompx_gnu_managed_mem_space, and some static asserts so I don't forget
them again.
* libgomp-plugin.h (GOMP_OFFLOAD_managed_alloc): New declaration.
(GOMP_OFFLOAD_managed_free): New declaration.
* libgomp.h (gomp_managed_alloc): New declaration.
(gomp_managed_free): New declaration.
(struct gomp_device_descr): Add managed_alloc_func and
managed_free_func fields.
* libgomp.texi: Document ompx_gnu_managed_mem_alloc and
ompx_gnu_managed_mem_space, add C++ template documentation, and
describe NVPTX and AMD support.
* omp.h.in: Add ompx_gnu_managed_mem_space and
ompx_gnu_managed_mem_alloc enumerators, and gnu_managed_mem C++
allocator template.
* omp_lib.f90.in: Add Fortran bindings for new allocator and
memory space.
* omp_lib.h.in: Likewise.
* plugin/cuda-lib.def: Add cuMemAllocManaged.
* plugin/plugin-nvptx.c (nvptx_alloc): Add managed parameter to
support cuMemAllocManaged.
(GOMP_OFFLOAD_alloc): Move contents to ...
(cleanup_and_alloc): ... this new function, and add managed support.
(GOMP_OFFLOAD_managed_alloc): New function.
(GOMP_OFFLOAD_managed_free): New function.
* target.c (gomp_managed_alloc): New function.
(gomp_managed_free): New function.
(gomp_load_plugin_for_device): Load optional managed_alloc
and managed_free plugin APIs.
* testsuite/lib/libgomp.exp: Add check_effective_target_omp_managedmem.
* testsuite/libgomp.c++/alloc-managed-1.C: New test.
* testsuite/libgomp.c/alloc-managed-1.c: New test.
* testsuite/libgomp.c/alloc-managed-2.c: New test.
* testsuite/libgomp.c/alloc-managed-3.c: New test.
* testsuite/libgomp.c/alloc-managed-4.c: New test.
* testsuite/libgomp.fortran/alloc-managed-1.f90: New test.
Co-authored-by: Kwok Cheung Yeung <kcyeung@baylibre.com>
Co-authored-by: Thomas Schwinge <tschwinge@baylibre.com>
gcc/fortran/gfortran.texi:1842: First argument to cross-reference may not be empty.
gcc/fortran/gfortran.texi:1903: First argument to cross-reference may not be empty.
gcc/fortran/intrinsic.texi:15549: Unknown command `cindex,'.
However, install.texi states that makeinfo >= 4.7 is required, so this
should work.
This patch fixes those errors.
Tested on x86_64-apple-darwin17.7.0 (makeinfo 4.8), i386-pc-solaris2.11
(makeinfo 7.2), and x86_64-pc-linux-gnu (makeinfo 7.1).
Filip Kastl [Thu, 13 Nov 2025 13:43:07 +0000 (14:43 +0100)]
contrib/check-params-in-docs.py: Compensate for r16-5132
r16-5132-g6786a073fcead3 added mention of the '=' variant of the
'--param' command line option to gcc/doc/invoke.texi. This confused
contrib/check-params-in-docs.py. Fix that.
Committing as obvious.
contrib/ChangeLog:
* check-params-in-docs.py: Start parsing from
@itemx --param=@var{name}=@var{value} instead of
@item --param @var{name}=@var{value}.
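The change above amounts to moving the line from which the script starts scanning invoke.texi. A minimal sketch of that idea follows; it is not the actual script. The helper name `params_after_marker` and the simplified texinfo layout are assumptions made for illustration, and only the marker string is taken from the ChangeLog entry.

```python
import re

# Marker taken from the ChangeLog entry; everything else here is a
# simplified assumption, not the real contrib/check-params-in-docs.py.
START_MARKER = "@itemx --param=@var{name}=@var{value}"


def params_after_marker(texi_text, marker=START_MARKER):
    """Return the names of `@item <name>` entries found after `marker`.

    Hypothetical helper: lines before the marker are ignored entirely.
    """
    lines = texi_text.splitlines()
    try:
        start = lines.index(marker)
    except ValueError:
        return []  # marker absent: nothing to parse
    names = []
    for line in lines[start + 1:]:
        match = re.fullmatch(r"@item ([\w-]+)", line.strip())
        if match:
            names.append(match.group(1))
    return names
```

Anchoring on the `@itemx` line means that the preceding `@item --param @var{name}=@var{value}` line is never mistaken for a parameter entry.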
libstdc++: Optimize handling of optional for views: take, drop, reverse and as_const.
This implements P3913R1: Optimize for std::optional in range adaptors.
Specifically, for an opt of type optional<T> that is a view:
* views::reverse(opt), views::take(opt, n), and views::drop(opt, n) return
optional<T>.
* For views::as_const(opt), optional<T&> is converted into optional<const T&>.
optional<T const> is not used in the non-reference case because such a
type is not move assignable, and thus not a view.
libstdc++-v3/ChangeLog:
* include/std/optional (__is_optional_ref): Define.
* include/std/ranges (_Take::operator(), _Drop::operator())
(_Reverse::operator()): Handle optional<T> that are views.
(_AsConst::operator()): Handle optional<T&>.
* testsuite/20_util/optional/range.cc: New tests.
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
Alice Carlotti [Mon, 10 Nov 2025 16:15:34 +0000 (16:15 +0000)]
aarch64: Extend syntax for cpuinfo feature string checks
Some SVE features in the toolchain need to be enabled when either of two
different kernel HWCAPS (and corresponding cpuinfo strings) are enabled
(one for non-streaming mode and one for streaming mode).
Add support for using "|" to separate alternative lists of required
features.
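The matching rule described above can be sketched as follows. This is an illustration of the "|" semantics only, not the code in the AArch64 driver. The function name and the placeholder feature names used in the usage note are hypothetical, and treating an empty alternative as never matching is an assumption.

```python
def feature_string_matches(spec, cpu_features):
    """Return True if any '|'-separated alternative in `spec` is satisfied.

    Each alternative is a whitespace-separated list of required features;
    `cpu_features` is the set of strings read from the cpuinfo feature line.
    """
    available = set(cpu_features)
    for alternative in spec.split("|"):
        required = alternative.split()
        # Assumption: an empty alternative never matches.
        if required and all(name in available for name in required):
            return True
    return False
```

With a specification such as `"feat_a feat_b | sfeat_a"`, a CPU reporting only `sfeat_a` matches through the second alternative, while a CPU reporting only `feat_a` matches neither.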
Tomasz Kamiński [Wed, 12 Nov 2025 10:16:58 +0000 (11:16 +0100)]
libstdc++: Test atomic_ref<volatile T> only if operations are lock-free [PR122584]
For non-templated tests, a volatile_<T> alias is used. This alias expands to
volatile T if std::atomic_ref<T>::is_always_lock_free is true, and to T
otherwise. For templated functions, testing is controlled using if constexpr.
PR libstdc++/115402
PR libstdc++/122584
libstdc++-v3/ChangeLog:
* testsuite/29_atomics/atomic_ref/address.cc: Guard test for
volatile with if constexpr.
* testsuite/29_atomics/atomic_ref/deduction.cc: Likewise.
* testsuite/29_atomics/atomic_ref/op_support.cc: Likewise.
* testsuite/29_atomics/atomic_ref/requirements.cc: Likewise.
* testsuite/29_atomics/atomic_ref/bool.cc: Use volatile_t alias.
* testsuite/29_atomics/atomic_ref/generic.cc: Likewise.
* testsuite/29_atomics/atomic_ref/integral.cc: Likewise.
* testsuite/29_atomics/atomic_ref/pointer.cc: Likewise.
* testsuite/29_atomics/atomic_ref/float.cc: Likewise, and remove
not discarding if constexpr.
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
Nathaniel Shead [Tue, 11 Nov 2025 06:13:46 +0000 (17:13 +1100)]
c++/modules: Maintain attachment of temploid friends after duplicate_decls [PR122551]
The ICE in the PR is because we're attempting to create a binding for an
imported declaration. This is problematic because if there are
duplicates we'll stream via a tt_entity, but won't enable deduplication
on the relevant binding vectors which can cause issues.
The root cause seems to stem from us forgetting that we've produced a
declaration for this entity within our own module, and so the active
declaration is not purely from an imported entity. We also didn't
properly track that this entity has unusual module attachment and
despite being declared here without being an instantiation actually is
attached to a different module than the current one (which may have
caused other problems down the line). This patch fixes both of these
issues.
PR c++/122551
gcc/cp/ChangeLog:
* cp-tree.h (transfer_defining_module): Declare.
* decl.cc (duplicate_decls): Call it for all decls.
Remove now unnecessary equivalent logic for templates.
* module.cc (mangle_module): Add assertion.
(transfer_defining_module): New function.
gcc/testsuite/ChangeLog:
* g++.dg/modules/tpl-friend-20_a.C: New test.
* g++.dg/modules/tpl-friend-20_b.C: New test.
* g++.dg/modules/tpl-friend-20_c.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
Andre Vieira [Thu, 13 Nov 2025 10:46:56 +0000 (10:46 +0000)]
aarch64: Use eor3 for more double xor cases
Expands the use of eor3 where we'd otherwise use two vector eor's.
gcc/ChangeLog:
* config/aarch64/aarch64-simd.md (*eor3q<mode>4): New insn to be used by
combine after reload to optimize any grouping of eor's that are using FP
registers for scalar modes.
James K. Lowden [Wed, 12 Nov 2025 22:48:34 +0000 (17:48 -0500)]
cobol: Introduce vendor-compatibility layer as user-defined functions.
Install COBOL UDFs in a target directory that includes the GCC version
in its path, to permit side-by-side installation. Support compat
library with COBOL POSIX bindings; support those bindings with C
functions in libgcobol as needed.
Changes to the compiler to support POSIX binding and testing.
Include developer conveniences -- Makefiles, bin/ and t/ directories --
to ensure UDFs compile and return reasonable results. These are
not installed and do not affect how libgcobol is built.
gcc/cobol/ChangeLog:
* cdf.y: Install literals in symbol table.
* genapi.cc (parser_alphabet): Use std::string for currency.
(initialize_the_data): Rely on constructor.
(parser_file_add): Better #pragma message.
(parser_exception_file): Return early if not generating code.
* parse.y: Allow library programs to act as functions.
* parse_ante.h (dialect_proscribed): Standardize message.
(intrinsic_call_2): Correct s/fund/func/ misspelling.
* scan.l: Comment.
* symbols.cc (symbols_update): Add unreachable assertion.
(symbol_field_parent_set): Reduce error to debug message.
(cdf_literalize): Declare.
(symbol_table_init): Insert CDF constants as literals.
* symbols.h (cbl_dialect_str): Provide string values for enum.
(is_working_storage): Remove function.
(struct cbl_field_data_t): Add manhandle_initial for Numeric Edited.
(struct cbl_field_t): Initialize name to zeros.
(struct cbl_section_t): Delete unused attr() function.
(symbol_unique_index): Declare.
* token_names.h: Regenerate.
* util.cc (cdf_literalize): Construct a cbl_field_t from a CDF literal.
(symbol_unique_index): Supply "globally" unique number for a program.
libgcobol/ChangeLog:
* Makefile.am: Move UDF-support to posix/shim, add install targets
* Makefile.in: Regenerate
* charmaps.cc (__gg__currency_signs): Use std::string.
* charmaps.h: Include string and vector headers.
(class charmap_t): Use std::string and vector for currency.
* config.h.in: Regenerate.
* configure: Regenerate.
* configure.ac: Check for libxml2.
* intrinsic.cc (numval_c): Constify.
* libgcobol.cc (struct program_state): Use std::string and vector.
(__gg__inspect_format_2): Add debug messages.
* libgcobol.h (__gg__get_default_currency_string): Constify.
* valconv.cc (expand_picture): Use std::string and vector.
(__gg__string_to_numeric_edited): Use std::string and vector.
(__gg__currency_sign_init): Use std::string and vector.
(__gg__currency_sign): Use std::string and vector.
* xmlparse.cc (xml_push_parse): Reformat.
* posix/stat.cc: Removed.
* posix/stat.h: Removed.
* .gitignore: New file.
* compat/README.md: New file.
* compat/lib/gnu/CBL_ALLOC_MEM.cbl: New file.
* compat/lib/gnu/CBL_CHECK_FILE_EXIST.cbl: New file.
* compat/lib/gnu/CBL_DELETE_FILE.cbl: New file.
* compat/lib/gnu/CBL_FREE_MEM.cbl: New file.
* compat/t/Makefile: New file.
* compat/t/smoke.cbl: New file.
* posix/README.md: New file.
* posix/bin/Makefile: New file for UDF-developer.
* posix/bin/headers: New file.
* posix/bin/scrape.awk: New file.
* posix/bin/sizeofs.c: New file.
* posix/bin/udf-gen: New file.
* posix/cpy/posix-errno.cbl: New file.
* posix/cpy/statbuf.cpy: New file.
* posix/cpy/tm.cpy: New file.
* posix/errno.cc: Removed.
* posix/localtime.cc: Removed.
* posix/shim/stat.cc: New file.
* posix/shim/stat.h: New file.
* posix/t/Makefile: New file.
* posix/t/errno.cbl: New file.
* posix/t/exit.cbl: New file.
* posix/t/localtime.cbl: New file.
* posix/t/stat.cbl: New file.
* posix/tm.h: Removed.
* posix/udf/posix-exit.cbl: New file.
* posix/udf/posix-localtime.cbl: New file.
* posix/udf/posix-mkdir.cbl: New file.
* posix/udf/posix-stat.cbl: New file.
* posix/udf/posix-unlink.cbl: New file.
David Malcolm [Wed, 12 Nov 2025 21:51:16 +0000 (16:51 -0500)]
EXPERIMENTAL_SARIF_SOCKET: decode errno when reporting errors [PR115970]
gcc/ChangeLog:
PR diagnostics/115970
* diagnostics/sarif-sink.cc (maybe_open_sarif_sink_for_socket):
Add "%m" to error messages, so that we print the string form of
errno.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
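For readers unfamiliar with "%m": it expands to the text form of the current errno value. A rough Python analogue is sketched below, assuming a hypothetical message wording; the real messages live in diagnostics/sarif-sink.cc and are not reproduced here.

```python
import os


def format_socket_error(action, path, err):
    """Build an error message that ends with the decoded errno text.

    Hypothetical wording; `err` is an errno value such as errno.ENOENT.
    """
    return f"unable to {action} {path!r}: {os.strerror(err)}"
```

The point of the change is the trailing part: the reader sees a description of the failure rather than only the fact that the call failed.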
David Malcolm [Wed, 12 Nov 2025 21:51:16 +0000 (16:51 -0500)]
diagnostics: add class unique_fd
No functional change intended.
gcc/ChangeLog:
* diagnostics/sarif-sink.cc (class unique_fd): New.
(sarif_socket_sink::sarif_socket_sink): Convert "fd" arg and m_fd
from int to unique_fd.
(~sarif_socket_sink): Drop.
(sarif_socket_sink::dump_kind): Update for m_fd becoming a
unique_fd.
(sarif_socket_sink::m_fd): Convert from "int" to "unique_fd".
(maybe_open_sarif_sink_for_socket): Likewise for "sfd".
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Philipp Tomsich [Thu, 30 Oct 2025 00:03:03 +0000 (01:03 +0100)]
aarch64: Add support for -mcpu=ampere1c
This adds support for the Ampere1c core to the AArch64 backend. The
initial patch only adds the core feature set (ARMv9.2 + extensions)
and does not add any special tuning decisions; those may come
later.
Bootstrapped and tested on aarch64-none-linux-gnu.
Arthur Cohen [Wed, 12 Nov 2025 16:18:14 +0000 (17:18 +0100)]
gccrs: Fmt: Simplify pragma diagnostic setup
There was a typo in the original commit where the diagnostic context was not being popped,
however popping the context still causes issues while bootstrapping.
For now, just mark -Warray-bounds as a warning to fix bootstrap on trunk and think about
a proper fix (probably adding a push and pop on every file including rust-fmt.h...) later if
the C++ warning issue is still not fixed.
Add documentation for options -mstack-protector-guard= and
-mstack-protector-guard-record which were added in commit r16-5192-g0cd1f03939d and regenerate .opt.urls.
gcc/ChangeLog:
* config/i386/i386.opt.urls: Regenerate.
* config/s390/s390.opt.urls: Ditto.
* doc/invoke.texi: Add documentation for
-mstack-protector-guard= and -mstack-protector-guard-record.
Antoni Boucher [Wed, 12 Feb 2025 22:32:41 +0000 (17:32 -0500)]
libgccjit: Add the function attributes for setting the ABI
gcc/jit/ChangeLog:
* jit-playback.cc: Support new function attributes.
* jit-recording.cc: Support new function attributes.
* libgccjit.h: Support new function attributes.
gcc/testsuite/ChangeLog:
* jit.dg/all-non-failing-tests.h: Mention new test.
* jit.dg/test-abi.c: New test.
A 16-bit constant with its top bit set, when passed to vbicq_n_u16, will
generate invalid assembly. Avoid this by masking the constant during assembly
generation.
The same applies to vorrq_n and vmvnq_n.
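The masking step can be illustrated with a small sketch. It assumes the failure comes from the constant being held as a sign-extended value wider than the element; the commit message does not spell this out. The helper name is hypothetical.

```python
def mask_to_element_width(value, bits):
    """Reduce `value` to its low `bits` bits, as an unsigned integer."""
    return value & ((1 << bits) - 1)
```

Under that assumption, a 16-bit immediate stored as -32768 is emitted as 0x8000 after masking, while values that already fit are unchanged.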
gcc/ChangeLog:
PR target/122175
* config/arm/iterators.md (asm_const_size): New mode attr.
* config/arm/mve.md (@mve_<mve_insn>q_n_<supf><mode>): Use it.
gcc/testsuite/ChangeLog:
PR target/122175
* gcc.target/arm/mve/intrinsics/pr122175.c: New test.
Co-authored-by: Richard Earnshaw <rearnsha@arm.com>
Tobias Burnus [Wed, 12 Nov 2025 13:15:43 +0000 (14:15 +0100)]
libgomp.{c-c++-common,fortran}/target-is-accessible-1.c: Fix testcases for omp_default_device [PR119677]
Commit r16-5188-g5da963d988e8ea added omp_default_device such that -5
became a conforming device number, but the tests used it as a
non-conforming number; now -6 is used.
libgomp/ChangeLog:
PR libgomp/119677
* testsuite/libgomp.c-c++-common/target-is-accessible-1.c: Modify
test as -5 is now a conforming device number.
* testsuite/libgomp.fortran/target-is-accessible-1.f90: Likewise.
Andre Vieira [Wed, 12 Nov 2025 10:33:09 +0000 (10:33 +0000)]
arm: Fix out of bounds when using cmse with FP types in aggregates [PR122539]
Skip partial register clearing logic when dealing with FP_REGS in aggregates as
these are always fully cleared and the logic assumes a mask for each of the 4
argument GPR_REGS.
Andre Vieira [Wed, 12 Nov 2025 10:25:14 +0000 (10:25 +0000)]
arm: Fix CMSE clearing of union members with no padding [PR122539]
This patch fixes the CMSE register clearing to make sure we don't clear
registers used by a function call. Before this patch the algorithm would only
correctly handle registers with padding bits to clear; any registers that were
fully utilised would be wrongfully cleared.
gcc/ChangeLog:
PR target/122539
* config/arm/arm.cc (comp_not_to_clear_mask_str_un): Update
not_to_clear_reg_mask for union members.
gcc/testsuite/ChangeLog:
* gcc.target/arm/cmse/union-3.x: New test.
* gcc.target/arm/cmse/baseline/union-3.c: New test.
* gcc.target/arm/cmse/mainline/8m/union-3.c: New test.
* gcc.target/arm/cmse/mainline/8_1m/union-3.c: New test.
So far only a per thread canary in the TLS block is supported. This
patch adds support for a global canary, too. For this the new option
-mstack-protector-guard={global,tls} is added which defaults to tls.
The global canary is expected at symbol __stack_chk_guard which means
for a function prologue instructions larl/l(g)fr + mvc are emitted and
for an epilogue larl/l(g)fr + clc.
Furthermore, option -mstack-protector-guard-record is added which is
inspired by -mrecord-mcount and generates section __stack_protector_loc
containing pointers to all instructions which load the address of the
global guard. Thus, this option has only an effect in conjunction with
-mstack-protector-guard=global. The intended use is for the Linux
kernel in order to support run-time patching. In each task_struct of
the kernel a canary is held which will be copied into the lowcore.
Since the kernel supports migration of the lowcore at boot time,
addresses are not necessarily constant. Therefore, the kernel expects
all instructions loading the address of the canary to be of format
RIL, or more precisely to be either larl or lgrl, and expects the
instructions' addresses to be recorded in section __stack_protector_loc.
The kernel is then required to patch those instructions e.g. to llilf
during boot.
In total this means -mstack-protector-guard=global emits code suitable
for user and kernel space.
gcc/ChangeLog:
* config/s390/s390-opts.h (enum stack_protector_guard): Define
SP_TLS and SP_GLOBAL.
* config/s390/s390.h (TARGET_SP_GLOBAL_GUARD): Define predicate.
(TARGET_SP_TLS_GUARD): Define predicate.
* config/s390/s390.md (stack_protect_global_guard_addr<mode>):
New insn.
(stack_protect_set): Also deal with a global guard.
(stack_protect_test): Also deal with a global guard.
* config/s390/s390.opt (-mstack-protector-guard={global,tls}):
New option.
(-mstack-protector-guard-record): New option.
gcc/testsuite/ChangeLog:
* gcc.target/s390/stack-protector-guard-global-1.c: New test.
* gcc.target/s390/stack-protector-guard-global-2.c: New test.
* gcc.target/s390/stack-protector-guard-global-3.c: New test.
* gcc.target/s390/stack-protector-guard-global-4.c: New test.
Richard Biener [Wed, 12 Nov 2025 09:03:23 +0000 (10:03 +0100)]
tree-optimization/122647 - missing bool pattern for bool -> float conversion
The following adds missing support for bool to float conversion which
can be exposed via C++ __builtin_bit_cast. It also corrects a too
loose check in vectorizable_conversion which would have resolved the
ICE but left us with the missed optimization.
PR tree-optimization/122647
* tree-vect-stmts.cc (vectorizable_conversion): Fix guard on
bool to non-bool conversion.
* tree-vect-patterns.cc (vect_recog_bool_pattern): Also handle
FLOAT_EXPR from bool.
libstdc++: optional<T&> for function and unbounded array should not be range [PR122396]
This implements proposed resolution for LWG4308 [1].
For T denoting either a function type or an unbounded array, optional<T&> no
longer exposes iterator or viable begin/end members. The conditionally
provided iterator type is now defined in the __optional_ref_base
base class.
Furthermore, range support for optional<T&> is now also guarded by
__cpp_lib_optional_range_support.
[1] https://cplusplus.github.io/LWG/issue4308
PR libstdc++/122396
libstdc++-v3/ChangeLog:
* include/std/optional (__optional_ref_base): Define.
(std::optional<_Tp&>): Inherit from __optional_ref_base<_Tp>.
(optional<_Tp&>::iterator): Move to base class.
(optional<_Tp&>::begin, optional<_Tp&>::end): Use deduced return
type and constrain accordingly.
* testsuite/20_util/optional/range.cc: Add test for optional<T&>.
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
Eric Botcazou [Wed, 12 Nov 2025 08:03:18 +0000 (09:03 +0100)]
Ada: Fix variable initialized with if-expression not flagged as constant
This is a regression present on the mainline and 15 branch: the -gnatwk
switch no longer flags a string variable initialized with an if-expression
as constant when it is not modified in the program. The fix is to set the
Has_Initial_Value and Never_Set_In_Source flags earlier during analysis in
the Analyze_Object_Declaration procedure.
gcc/ada/
PR ada/122640
* sem_ch3.adb (Analyze_Object_Declaration): Set Is_True_Constant
on entry for constants and Never_Set_In_Source in all cases.
If an initialization expression is present, set Has_Initial_Value
and Is_True_Constant on variables.
Jason Merrill [Fri, 7 Nov 2025 13:16:30 +0000 (18:46 +0530)]
libstdc++: use -Wno-deprecated-declarations
-Wno-deprecated doesn't work with header units, since the testcase can't
change the header unit's version of the __DEPRECATED macro. But
-Wno-deprecated-declarations works just fine to avoid warning about
deprecated things.
Jason Merrill [Tue, 11 Nov 2025 13:15:31 +0000 (18:45 +0530)]
libstdc++: sync prune.exp with GCC
I needed to add module context to dg-prune for libstdc++, and figured it
made sense to sync it with the GCC version rather than maintain slightly
different approaches to stripping the same messages.
libstdc++-v3/ChangeLog:
* testsuite/lib/prune.exp: Sync with gcc prune.exp.
fortran: Fix ICE and self-assignment bugs with recursive allocatable finalizers [PR90519]
Derived types with recursive allocatable components and FINAL procedures
trigger an ICE in gimplify_call_expr because the finalizer wrapper's result
symbol references itself (final->result = final), creating a cycle. This
patch creates a separate __result_<typename> symbol to break the cycle.
Self-assignment (a = a) with such types causes use-after-free because the
left-hand side is finalized before copying, destroying the source. This
patch adds detection using gfc_dep_compare_expr at compile time and pointer
comparison at runtime to skip finalization when lhs == rhs.
Parenthesized self-assignment (a = (a)) creates a temporary, defeating the
simple self-assignment detection. This patch adds strip_parentheses() to
look through INTRINSIC_PARENTHESES operators and ensure deep_copy is enabled
for such cases.
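The idea of looking through parentheses before comparing the two sides can be sketched on a toy expression tree. The classes and function bodies below are illustrative assumptions, not gfortran's data structures; only the name strip_parentheses comes from the patch description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Paren:
    inner: object  # a Var or another Paren


def strip_parentheses(expr):
    """Peel off any number of enclosing Paren nodes."""
    while isinstance(expr, Paren):
        expr = expr.inner
    return expr


def is_self_assignment(lhs, rhs):
    """True when both sides name the same variable, ignoring parentheses."""
    return strip_parentheses(rhs) == strip_parentheses(lhs)
```

In this model `a = a`, `a = (a)` and `a = (((a)))` are all recognised as self-assignment, which mirrors the cases listed for the new test.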
Test pr112459.f90 now expects 6 _final calls instead of 12 because separate
result symbols eliminate double-counting in tree dumps.
PR fortran/90519
gcc/fortran/ChangeLog:
* trans-expr.cc (strip_parentheses): New helper function to strip
INTRINSIC_PARENTHESES operators from expressions.
(is_runtime_conformable): Use strip_parentheses to handle cases
like a = (a) when checking for self-assignment.
(gfc_trans_assignment_1): Strip parentheses before checking if
expr2 is a variable, ensuring deep_copy is enabled for cases like
a = (a). Also strip parentheses when checking for self-assignment
to avoid use-after-free in finalization.
(gfc_trans_scalar_assign): Add comment about parentheses handling.
* class.cc (generate_finalization_wrapper): Create separate result
symbol for finalizer wrapper functions instead of self-referencing
the procedure symbol, avoiding ICE in gimplify_call_expr.
gcc/testsuite/ChangeLog:
* gfortran.dg/finalizer_recursive_alloc_1.f90: New test for ICE fix.
* gfortran.dg/finalizer_recursive_alloc_2.f90: New execution test.
* gfortran.dg/finalizer_self_assign.f90: New test for self-assignment
including a = a, a = (a), and a = (((a))) cases using if/stop pattern.
* gfortran.dg/pr112459.f90: Update to expect 6 _final calls instead
of 12, reflecting corrected self-assignment behavior.
Signed-off-by: Christopher Albert <albert@tugraz.at>
Jerry DeLisle [Tue, 11 Nov 2025 18:47:31 +0000 (10:47 -0800)]
fortran: Implement optional type spec for DO CONCURRENT [PR96255]
This patch adds support for the F2008 optional integer type specification
in DO CONCURRENT and FORALL headers, allowing constructs like:
do concurrent (integer :: i=1:10)
The implementation handles type spec matching, creates shadow variables
when the type spec differs from any outer scope variable, and converts
iterator expressions to match the specified type.
Shadow variable implementation:
When a type-spec is provided and differs from an outer scope variable,
a shadow variable with the specified type is created (with _ prefix).
A recursive expression walker substitutes all references to the outer
variable with the shadow variable throughout the DO CONCURRENT body,
including in array subscripts, substrings, and nested operations.
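A recursive walker of this kind can be sketched as follows. The tuple-based expression model and the function name are assumptions for illustration; the actual patch works on gfortran's expression and code structures.

```python
def replace_symbol(expr, outer, shadow):
    """Return a copy of `expr` with references to `outer` renamed to `shadow`.

    Expressions are modelled as strings (symbol references or literals)
    or as non-empty tuples of the form (operator, operand, ...).
    """
    if isinstance(expr, tuple):
        op, *operands = expr
        # The operator is kept as-is; only operands are rewritten.
        return (op, *(replace_symbol(arg, outer, shadow) for arg in operands))
    return shadow if expr == outer else expr
```

Applied to a body equivalent to `a(i) = b(i + 1) + i` with `outer="i"` and `shadow="_i"`, every occurrence of `i` is renamed, including those nested inside subscripts.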
Constraint enforcement:
Sets gfc_do_concurrent_flag properly (1 for block context, 2 for mask
context) to enable F2008 C1139 enforcement, ensuring only PURE procedures
are allowed in DO CONCURRENT constructs.
Additional fixes:
- Extract apply_typespec_to_iterator() helper to eliminate duplicated
shadow variable creation code (~70 lines)
- Add NULL pointer checks for shadow variables
- Fix iterator counting to handle both EXEC_FORALL and EXEC_DO_CONCURRENT
- Skip FORALL obsolescence warning for DO CONCURRENT (F2018)
- Suppress many-to-one assignment warning for DO CONCURRENT (reductions
are valid, formalized with REDUCE locality-spec in F2023)
PR fortran/96255
gcc/fortran/ChangeLog:
* gfortran.h (gfc_forall_iterator): Add bool shadow field.
* match.cc (apply_typespec_to_iterator): New helper function to
consolidate shadow variable creation logic.
(match_forall_header): Add type-spec parsing for DO CONCURRENT
and FORALL. Create shadow variables when type-spec differs from
outer scope. Replace duplicated code with apply_typespec_to_iterator.
* resolve.cc (replace_in_expr_recursive): New function to recursively
walk expressions and replace symbol references.
(replace_in_code_recursive): New function to recursively walk code
blocks and replace symbol references.
(gfc_replace_forall_variable): New entry point for shadow variable
substitution.
(gfc_resolve_assign_in_forall): Skip many-to-one assignment warning
for DO CONCURRENT.
(gfc_count_forall_iterators): Handle both EXEC_FORALL and
EXEC_DO_CONCURRENT with assertion.
(gfc_resolve_forall): Skip F2018 obsolescence warning for DO
CONCURRENT. Fix memory allocation check. Add NULL checks for shadow
variables. Implement shadow variable walker.
(gfc_resolve_code): Set gfc_do_concurrent_flag for DO CONCURRENT
constructs to enable constraint checking.
gcc/testsuite/ChangeLog:
* gfortran.dg/do_concurrent_typespec_1.f90: New test covering all
shadowing scenarios: undeclared variable, same kind shadowing, and
different kind shadowing.
Co-authored-by: Steve Kargl <kargl@gcc.gnu.org>
Co-authored-by: Jerry DeLisle <jvdelisle@gcc.gnu.org>
Signed-off-by: Christopher Albert <albert@tugraz.at>
* c-warn.cc (warn_parms_array_mismatch): Split out body of
per-pair in parameter lists iteration into...
(warn_parm_array_mismatch): ...this new function.
Jason Merrill [Mon, 10 Nov 2025 13:02:53 +0000 (18:32 +0530)]
c++/modules: avoid too many hidden friends in ADL
Most of the add_fns calls in adl_namespace_fns also call ovl_skip_hidden,
but we were forgetting that in the case of imports, which meant that for
24_iterators/const_iterator/112490.cc we were considering the
unreachable_sentinel_t hidden friend operator== and therefore failing.
gcc/cp/ChangeLog:
* name-lookup.cc (name_lookup::adl_namespace_fns): Also skip hidden
in the module case.
Jason Merrill [Sat, 8 Nov 2025 23:45:00 +0000 (05:15 +0530)]
c++/modules: use set_cfun
Assigning directly to cfun doesn't properly update the target and
optimization options for the new function, which causes trouble if we load a
function from a module that has different options than the one we were in
the middle of when the load happened. This broke the use of #pragma
optimize in 23_containers/array/iterators/begin_end.cc.
Nathan's comment in module.cc complained about the API doing too much, but
set_cfun seems to me to be exactly what we want here.
gcc/cp/ChangeLog:
* module.cc (module_state::read_cluster): Use set_cfun.
(post_load_processing): Likewise.
Andrew Stubbs [Tue, 11 Nov 2025 15:04:09 +0000 (15:04 +0000)]
amdgcn: Consolidate mkoffload setup constructors
We don't need every mkoffload runtime setting to use its own constructor.
There were only two committed, but I have more uses for this soon. In theory,
we could also use this setup to choose not to register the kernel with libgomp.
The behaviour is not changed, just the generated code structure.
gcc/ChangeLog:
* config/gcn/mkoffload.cc (process_asm): Replace "configure_stack_size"
constructor with a new regular function, "mkoffload_setup".
(process_obj): Call mkoffload_setup from the "init" constructor.
David Malcolm [Tue, 11 Nov 2025 15:20:47 +0000 (10:20 -0500)]
diagnostics: add experimental SARIF JSON-RPC notifications for IDEs [PR115970]
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2024/p3358r0.html#msvc
describes a feature of Visual Studio 2022 version 17.8, which can send its
diagnostics in SARIF form to a pipe when setting the environment variable
SARIF_OUTPUT_PIPE:
https://learn.microsoft.com/en-us/cpp/build/reference/sarif-output?view=msvc-170#retrieving-sarif-through-a-pipe
The precise mechanism above involves Windows-specific details (Windows pipes
and HANDLEs).
The following patch implements an analogous feature for GCC, using Unix
domain sockets rather than the Windows-specific details.
With this patch, GCC's cc1, cc1plus, etc will check if
EXPERIMENTAL_SARIF_SOCKET is set in the environment, and if so,
will attempt to connect to that socket. It will send a JSON-RPC
notification to the socket for every diagnostic emitted. Like the
MSVC feature, the diagnostics are sent one-at-a-time as SARIF
"result" objects, rather than sending a full SARIF "log" object.
The patch includes a python test script which runs a server.
Tested by running the script in one terminal:
$ ../../src/contrib/sarif-listener.py
listening on socket: /tmp/tmpjgts0u0i/socket
and then invoking a build in another terminal with the envvar
set to the pertinent socket:
$ EXPERIMENTAL_SARIF_SOCKET=/tmp/tmpjgts0u0i/socket \
make check-gcc RUNTESTFLAGS="analyzer.exp=*"
and watching as all the diagnostics generated during the build
get sent to the listener.
The idea is that an IDE ought to be able to create a socket and
set the environment variable when invoking a build, and then listen
for all the diagnostics, without needing to manually set build flags
to inject SARIF output.
This feature is experimental and subject to change or removal
without notice; I'm adding it to make it easier for IDE developers to
try it out and give feedback.
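The IDE side of this workflow might look roughly like the sketch below. It is not contrib/sarif-listener.py. The function names are hypothetical, the use of a stream socket is an assumption, and the sketch only collects raw bytes because the message framing is not described here.

```python
import os
import socket


def open_listener(socket_path):
    """Create a Unix domain socket bound to `socket_path` and listen on it.

    Assumption: a stream socket is used.
    """
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(socket_path)
    server.listen(1)
    return server


def build_environment(socket_path, base_env=None):
    """Return a copy of the environment with EXPERIMENTAL_SARIF_SOCKET set."""
    env = dict(os.environ if base_env is None else base_env)
    env["EXPERIMENTAL_SARIF_SOCKET"] = socket_path
    return env


def receive_all(connection, chunk_size=4096):
    """Read raw bytes from `connection` until the peer closes it."""
    chunks = []
    while True:
        chunk = connection.recv(chunk_size)
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)
```

An IDE would call `open_listener`, pass the result of `build_environment` as the environment of the build process, and then accept connections and decode what arrives.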
contrib/ChangeLog:
PR diagnostics/115970
* sarif-listener.py: New file.
gcc/ChangeLog:
PR diagnostics/115970
* diagnostics/sarif-sink.cc: Include <sys/un.h> and <sys/socket.h>.
(sarif_builder::end_group): Update comment.
(sarif_sink::on_end_group): Drop "final".
(class sarif_socket_sink): New subclass.
(maybe_open_sarif_sink_for_socket): New function.
* diagnostics/sarif-sink.h (maybe_open_sarif_sink_for_socket):
New decl.
* doc/invoke.texi (EXPERIMENTAL_SARIF_SOCKET): Document new
environment variable.
* toplev.cc: Define INCLUDE_VECTOR. Add include of
"diagnostics/sarif-sink.h".
(toplev::main): Call
diagnostics::maybe_open_sarif_sink_for_socket.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Richard Biener [Fri, 7 Nov 2025 12:52:09 +0000 (13:52 +0100)]
Use ranger when simplifying conditions during niter analysis
The following uses ranger to try to simplify boolean expressions
in simplify_using_initial_conditions as used by niter analysis.
We also try to simplify niter expressions themselves, but we cannot
use ranger directly for this.
* tree-ssa-loop-niter.cc (simplify_using_initial_conditions):
Use the active ranger to simplify boolean expressions.
Jeff Law [Tue, 11 Nov 2025 14:19:03 +0000 (07:19 -0700)]
[RISC-V] Improve detection of packw
More infrastructure on the way to eliminating the define_insn_and_split
for zero-extensions.
Exposing the shift-pair approach in the expander may change the order in which
operands appear in later RTL. In the case of packw detection order matters.
It shouldn't, it's an IOR after all, but it does. So we should fix that.
In addition to the ordering issue it slightly changes the form of one operand.
So we want to handle that too. So there's a total of 3 new patterns.
There isn't commonly available hardware with zbkb and it's only lightly tested
in the testsuite. So I wouldn't be terribly surprised to find out there are
other ways we want to represent those operands to ultimately generate a pack
instruction.
Built and tested on riscv32-elf and riscv64-elf in my tester. I'll wait for
pre-commit CI to render a verdict before moving forward.
gcc/
* config/riscv/crypto.md (packf splitters): Variant with
operands reversed. Add variants with the ashift/sign extend
exchanged as well.
Jeff Law [Tue, 11 Nov 2025 14:17:12 +0000 (07:17 -0700)]
[RISC-V] Simplify riscv_extend_to_xmode_reg
So I was trying to untangle our define_insn_and_split situation for
zero-extensions and stumbled over some code we need to adjust & simplify in the
RISC-V backend. I probably should have caught this earlier.
riscv_extend_to_xmode_reg is just a poor implementation of convert_modes; we
can replace the whole thing with a call to convert_modes + force_reg.
Why is this important beyond code hygiene?
convert_modes works with the expansion code whereas extend_to_xmode_reg makes
assumptions about the kinds of insns the target directly supports.
This shows up if you try to untangle the zero-extension support where the base ISA doesn't support zero extensions and should be going through an expander rather than using a define_insn_and_split.
The define_insn_and_split for the reg->reg case isn't split until after reload.
Naturally this inhibits some optimizations and forces further work in this
space that should be simple define_splits into also needing to be
define_insn_and_splits.
Anyway, without going further into the zero-extend rathole, this removes the
assumption that the target is providing a single insn zero/sign extension thus
allowing me to continue to untangle that mess.
Bootstrapped and regression tested on the Pioneer (which thoroughly exercises
this code as it does not have the B extension). I don't think the BPI has
picked up this one yet. Also built and regression tested riscv32-elf and
riscv64-elf.
Waiting on pre-commit CI before moving forward.
* config/riscv/riscv.cc (riscv_extend_to_xmode_reg): Simplify
by using convert_modes + force_reg.
Richard Biener [Fri, 7 Nov 2025 12:50:02 +0000 (13:50 +0100)]
Improve range_on_edge for GENERIC expressions
When feeding non-SSA names to range_on_edge we degrade to a
non-contextual query. The following uses the argument added in
the previous patch to indicate the edge as the location of the
range query.
* gimple-range.cc (gimple_ranger::range_on_edge): Pass
the edge as 'edge' to get_tree_range.
(dom_ranger::range_on_edge): Likewise.
Andrew MacLeod [Tue, 11 Nov 2025 07:27:43 +0000 (08:27 +0100)]
Support edge query for range_query::get_tree_range
The following adds an edge argument to get_tree_range and invoke_range_of_expr
to support range_on_edge queries for GENERIC expressions.
* value-query.cc (range_query::invoke_range_of_expr): New
edge argument. If set invoke range_on_edge.
(range_query::get_tree_range): Likewise and adjust.
* value-query.h (range_query::invoke_range_of_expr): New
edge argument.
(range_query::get_tree_range): Likewise.
Lulu Cheng [Thu, 16 Oct 2025 03:26:45 +0000 (11:26 +0800)]
LoongArch: doc: Add description of function attribute.
Added implementation description of function attributes
target_clones and target_version under LoongArch.
Include the list of supported options and their corresponding
priorities, as well as the rules for setting priorities.
gcc/ChangeLog:
* doc/extend.texi: Add description for LoongArch function
attributes.