Nathaniel Shead [Sat, 22 Nov 2025 11:30:32 +0000 (22:30 +1100)]
c++/modules: Walk indirectly exposed namespaces [PR122699]
In some situations, such as friend injection, we may add an entity to a
namespace without ever explicitly opening that namespace in this TU.
We currently have an additional loop to make sure the namespace is
considered purview, but this isn't sufficient to make
walk_module_binding find it, since the namespace itself is not in the
current TU's symbol table. This patch ensures we still process the
(hidden) binding for the injected friend in this TU.
PR c++/122699
gcc/cp/ChangeLog:
* name-lookup.h (expose_existing_namespace): Declare.
* name-lookup.cc (expose_existing_namespace): New function.
(push_namespace): Call it.
* pt.cc (tsubst_friend_function): Likewise.
gcc/testsuite/ChangeLog:
* g++.dg/modules/tpl-friend-21_a.C: New test.
* g++.dg/modules/tpl-friend-21_b.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
Andre Vieira [Tue, 25 Nov 2025 11:39:35 +0000 (11:39 +0000)]
arm: add extra sizes to Wstringop-overflow-47.c warning tests
A thumb2 target without a FPU like -mcpu=cortex-m3 will generate
'vector(4) char' stores which lead to warnings with sizes that weren't being
allowed before by the test. This patch adds these sizes.
gcc/testsuite/ChangeLog:
* gcc.dg/Wstringop-overflow-47.c: Adjust warnings to allow for 32-bit
stores.
Nathaniel Shead [Sun, 23 Nov 2025 12:24:39 +0000 (23:24 +1100)]
c++/modules: Stream all REQUIRES_EXPR_PARMS [PR122789]
We don't generally stream the TREE_CHAIN of a DECL, as this would cause
us to unnecessarily walk into the next member in its scope chain any
time it was referenced by an expression.
Unfortunately, REQUIRES_EXPR_PARMS is a tree chain of PARM_DECLs, so we
were only ever streaming the first parameter. This meant that when a
parameter's type could not be tsubst'd we would ICE instead of returning
false.
This patch special-cases REQUIRES_EXPR to always stream the chain of
decls in its first operand. As a drive-by improvement we also remove a
fixme about checking uncontexted PARM_DECLs.
PR c++/122789
gcc/cp/ChangeLog:
* module.cc (trees_out::core_vals): Treat REQUIRES_EXPR
specially and stream the chained decls of its first operand.
(trees_in::core_vals): Likewise.
(trees_out::tree_node): Check the PARM_DECLs we see are what we
expect.
gcc/testsuite/ChangeLog:
* g++.dg/modules/concept-12_a.C: New test.
* g++.dg/modules/concept-12_b.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
Jason Merrill [Tue, 11 Nov 2025 10:28:01 +0000 (15:58 +0530)]
driver/c++: add --compile-std-module
For simple testcases that want to use the std module, it would be useful to
have a reasonably short way to request building the binary module form
before the testcase. So with this patch users can write
I expect this to be particularly useful on godbolt.org, where currently
building a modules testcase requires messing with cmake. One complication
there is that just adding std.cc to the command line arguments hits the
"cannot specify -o with -S with multiple files" error, so I avoid counting
these added inputs as "multiple files"; in that situation each compile will
output to the same target file, with the user-specified input last so it's
the one that actually ends up in the target after the command completes.
The following testcase ICEs, because
1) -fsanitize=bounds uses TYPE_MAX_VALUE (TYPE_DOMAIN (type)) with
1 or 2 added as last argument of .UBSAN_BOUNDS call and that
expression at that point is some NOP_EXPR around SAVE_EXPR with
testing for negative sizes
2) during gimplification, gimplify_type_sizes is invoked on the DECL_EXPR
outside of an OpenMP region, and forces TYPE_MAX_VALUE into
a pseudo instead, with the SAVE_EXPR obviously being evaluated
before that
3) the OpenMP gimplification sees an implicit or explicit data sharing
of a VLA and among other things arranges to firstprivatize TYPE_MAX_VALUE
4) when gimplifying the .UBSAN_BOUNDS call inside of the OpenMP region,
it sees a SAVE_EXPR and just gimplifies it to the already initialized
s.1 temporary; but nothing marks s.1 for firstprivatization on the
containing construct(s). The problem is that with SAVE_EXPR we never
know if the first use is within the same statement (typical use) or
somewhere earlier in the same OpenMP construct or in an outer OpenMP
construct or its parent etc., the SAVE_EXPR temporary is a function
local var, not something that is added to the innermost scope where
it is used (and it can't because it perhaps could be used outside of
it); so for OpenMP purposes, SAVE_EXPRs better should be used only
within the same OpenMP region and not across the whole function
The following patch fixes it by deferring the addition of
TYPE_MAX_VALUE in the last argument of .UBSAN_BOUNDS until gimplification
for VLAs, if it sees a VLA, instead of making the first argument
0 with pointer to the corresponding array type, it sets the first
argument to 1 with the same type and only sets the last argument to the
addend (1 or 2). And then gimplification can detect this case and
add the TYPE_MAX_VALUE (which in the meantime should have gone through
gimplify_type_sizes).
2025-11-25 Jakub Jelinek <jakub@redhat.com>
PR middle-end/120052
gcc/
* gimplify.cc (gimplify_call_expr): For IFN_UBSAN_BOUNDS
call with integer_onep first argument, change that argument
to 0 and add TYPE_MAX_VALUE (TYPE_DOMAIN (arr_type)) to
3rd argument before gimplification.
gcc/c-family/
* c-ubsan.cc (ubsan_instrument_bounds): For VLAs use
1 instead of 0 as first IFN_UBSAN_BOUNDS argument and only
use the addend without TYPE_MAX_VALUE (TYPE_DOMAIN (type))
in the 3rd argument.
gcc/testsuite/
* c-c++-common/gomp/pr120052.c: New test.
Jakub Jelinek [Tue, 25 Nov 2025 10:18:07 +0000 (11:18 +0100)]
testsuite: Fix up vla-1.c test [PR119931]
From what I can see, the vla-1.c test has been added to test the handling
of debug info for optimized out parameters. But after recent changes the
argument is not simply optimized away; it is optimized away and replaced by
the constant 5 (even without IPA-VRP). The function is noinline, but can't be noclone
nor noipa exactly because we want to test how it behaves when it is cloned
and the unused argument is dropped.
So, the following patch arranges to hide from the IPA optimizations the
value of x in the caller (and even make sure it is preserved in a register
or stack slot in the caller across the call).
2025-11-25 Jakub Jelinek <jakub@redhat.com>
PR testsuite/119931
* gcc.dg/vla-1.c (main): Hide x value from optimizers and use it after
the call as well.
Rainer Orth [Tue, 25 Nov 2025 09:51:38 +0000 (10:51 +0100)]
testsuite: Fix g++.dg/DRs/dr2581-1.C etc. on non-Linux
The g++.dg/DRs/dr2581-?.C tests FAIL on several targets in two ways:
* Both tests FAIL on a couple of targets in the same way:
FAIL: g++.dg/DRs/dr2581-1.C -std=c++11 (test for warnings, line 25)
FAIL: g++.dg/DRs/dr2581-1.C -std=c++17 (test for warnings, line 25)
FAIL: g++.dg/DRs/dr2581-1.C -std=c++20 (test for warnings, line 25)
FAIL: g++.dg/DRs/dr2581-1.C -std=c++23 (test for warnings, line 25)
FAIL: g++.dg/DRs/dr2581-1.C -std=c++26 (test for warnings, line 25)
FAIL: g++.dg/DRs/dr2581-2.C -std=c++11 (test for errors, line 25)
FAIL: g++.dg/DRs/dr2581-2.C -std=c++14 (test for errors, line 25)
FAIL: g++.dg/DRs/dr2581-2.C -std=c++17 (test for errors, line 25)
FAIL: g++.dg/DRs/dr2581-2.C -std=c++20 (test for errors, line 25)
FAIL: g++.dg/DRs/dr2581-2.C -std=c++23 (test for errors, line 25)
FAIL: g++.dg/DRs/dr2581-2.C -std=c++26 (test for errors, line 25)
i.e. the __STDC_ISO_10646__ warning is missing. This happens on
Solaris, FreeBSD, Darwin, and several embedded targets. While
__STDC_ISO_10646__ already exists in C99, it's optional there and
seems to be only defined on Linux/glibc. Thus this patch xfail's this
check on non-Linux.
* Besides, on Solaris only there are more failures for -std=c++11 to
c++26, like
FAIL: g++.dg/DRs/dr2581-2.C -std=c++11 (test for warnings, line 24)
FAIL: g++.dg/DRs/dr2581-2.C -std=c++11 (test for excess errors)
Jakub Jelinek [Tue, 25 Nov 2025 09:30:51 +0000 (10:30 +0100)]
openmp: Fix up OpenMP expansion of collapsed loops [PR120564]
Most of gimple_build_cond_empty callers call just build2 to prepare
condition which is put into GIMPLE_COND, but this one spot has been
incorrectly using fold_build2. Now the arguments of the *build2 were
already gimplified, but the folding of some conditions can turn say
unsigned_var > INT_MAX into (int) unsigned_var < 0 etc. and thus
turn the condition into something invalid in gimple, because we only
try to regimplify the operands if they refer to some decl which needs
to be regimplified (has DECL_VALUE_EXPR on it).
Fixed by also using build2 instead of fold_build2.
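The kind of rewrite described above can be illustrated with a small standalone sketch. The helper names are hypothetical and are not GCC internals; the sketch assumes a 32-bit value and modular conversion to the signed type (which is what GCC does). It only shows that the two forms give the same answer while the folded one now has a conversion as its operand, which is the shape that is not valid as a GIMPLE_COND operand.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Unfolded form: a plain unsigned comparison against the signed maximum.
// (Hypothetical helper for illustration only.)
bool unfolded(std::uint32_t u) {
    return u > static_cast<std::uint32_t>(std::numeric_limits<std::int32_t>::max());
}

// Folded form: the comparison operand is now a conversion of the variable,
// not the variable itself.
bool folded(std::uint32_t u) {
    return static_cast<std::int32_t>(u) < 0;
}
```

Both functions agree on every input; the difference is purely in the shape of the expression tree that would have to be regimplified.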
2025-11-25 Jakub Jelinek <jakub@redhat.com>
PR middle-end/120564
* omp-expand.cc (extract_omp_for_update_vars): Use build2 instead of
fold_build2 to build argument for gimple_build_cond_empty.
Jakub Jelinek [Tue, 25 Nov 2025 09:06:46 +0000 (10:06 +0100)]
alias: Fix up BITINT_TYPE and non-standard INTEGER_TYPE alias handling [PR122624]
The testcase in the PR is miscompiled on aarch64 with
--param=ggc-min-expand=0 --param=ggc-min-heapsize=0 -O2
(not including it in the testsuite because it is too much of
a lottery).
Anyway, the problem is that the testcase only uses unsigned _BitInt(66)
and never uses _BitInt(66), get_alias_set remembers alias set for
ARRAY_TYPE (of its element type in ARRAY_TYPE's TYPE_ALIAS_SET),
c_common_get_alias_set does not remember in TYPE_ALIAS_SET alias of
unsigned types and instead asks for get_alias_set of corresponding
signed type and that creates a new alias set for each new canonical
type.
So, in this case, when being asked about get_alias_set on ARRAY_TYPE
unsigned _BitInt(66) [N], it recurses down to c_common_get_alias_set
which asks for alias set of at that point newly created signed type
_BitInt(66), new alias set is created for it, remembered in that
signed _BitInt(66) TYPE_ALIAS_SET, not remembered in unsigned _BitInt(66)
and remembered in ARRAY_TYPE's TYPE_ALIAS_SET.
Next a GC collection comes, signed _BitInt(66) is not used anywhere
reachable from GC roots, so it is removed.
Later on we ask alias oracle whether the above mentioned ARRAY_TYPE
can for TBAA alias pointer dereference with the same unsigned _BitInt(66)
type. For the ARRAY_TYPE, we have the above created alias set remembered
in TYPE_ALIAS_SET, so that is what we use, but for the unsigned _BitInt(66)
we don't, so create a new signed _BitInt(66), create a new alias set for it
and that is what is returned, so we have two distinct alias sets and return
that they can't alias.
Now, for standard INTEGER_TYPEs this isn't a problem, because both the
signed and unsigned variants of those types are always reachable from GTY
roots. For BITINT_TYPE (or build_nonstandard_integer_type built types)
that isn't the case. I'm not convinced we need to fix it for
build_nonstandard_integer_type built INTEGER_TYPEs though, for bit-fields
their address can't be taken in C/C++, but for BITINT_TYPE this clearly
is a problem.
So, the following patch solves it by
1) remembering the alias set we got from get_alias_set on the signed
_BitInt(N) type in the unsigned _BitInt(N) type
2) returning -1 for unsigned _BitInt(1), because there is no corresponding
signed _BitInt type and so we can handle it normally
3) so that the signed _BitInt(N) type isn't later GC collected and later
readded with a new alias set incompatible with the still reachable
unsigned _BitInt(N) type, the patch for signed _BitInt(N) types checks
if corresponding unsigned _BitInt(N) type doesn't already have
TYPE_ALIAS_SET_KNOWN_P, in that case it remembers and returns that;
in order to avoid infinite recursion, it doesn't call get_alias_set
on the unsigned _BitInt(N) type though
4) while type_hash_canon remembers in the type_hash_table both the hash
and the type, so what exactly we use as the hash isn't that important,
I think using type_hash_canon_hash for BITINT_TYPEs is actually better over
hashing TYPE_MAX_VALUE, because especially for larger BITINT_TYPEs
TYPE_MAX_VALUE can have lots of HWIs in the number, for
type_hash_canon_hash hashes for BITINT_TYPEs only
i) TREE_CODE (i.e. BITINT_TYPE)
ii) TYPE_STRUCTURAL_EQUALITY_P flag (likely false)
iii) TYPE_PRECISION
iv) TYPE_UNSIGNED
so 3 ints and one flag, while the old way can hash one HWI up to
1024 HWIs; note it is also more consistent with most other
type_hash_canon calls, except for build_nonstandard_integer_type; for
some reason changing that one to use also type_hash_canon_hash doesn't
work, causes tons of ICEs
This version of the patch handles INTEGER_TYPEs the same as BITINT_TYPE.
2025-11-25 Jakub Jelinek <jakub@redhat.com>
PR middle-end/122624
* tree.cc (build_bitint_type): Use type_hash_canon_hash.
* c-common.cc (c_common_get_alias_set): Fix up handling of BITINT_TYPE
and non-standard INTEGER_TYPEs. For unsigned _BitInt(1) always return
-1. For other unsigned types set TYPE_ALIAS_SET to get_alias_set of
corresponding signed type and return that. For signed types check if
corresponding unsigned type has TYPE_ALIAS_SET_KNOWN_P and if so copy
its TYPE_ALIAS_SET and return that.
In r11-3059-g8183ebcdc1c843, Julian fixed a few issues with
atomic_capture-2.c relying on iteration order guarantees that do not
exist under OpenACC parallelized loops and, notably, do not happen even
by accident on AMDGCN.
The atomic_capture-3.c testcase was made by copying it from
atomic_capture-2.c and adding additional options in commit r12-310-g4cf3b10f27b199, but from an older version of
atomic_capture-2.c, which lacked these ordering fixes, so they
resurfaced in this test.
This patch ports those fixes from atomic_capture-2.c into
atomic_capture-3.c.
libgomp/ChangeLog:
* testsuite/libgomp.oacc-c-c++-common/atomic_capture-3.c: Copy
changes in r11-3059-g8183ebcdc1c843 from atomic_capture-2.c.
Rainer Orth [Tue, 25 Nov 2025 08:23:08 +0000 (09:23 +0100)]
testsuite: i386: Restrict pr120936-1.c etc. to Linux
After switching the i386 check-function-bodies tests to use the new
dg-add-options check_function_bodies feature, several tests still FAIL
in the same way on Solaris/x86. E.g.
i.e. the test expects a call to mcount, while on Solaris _mcount is
called instead. MCOUNT_NAME is only defined as mcount in gnu-user.h and
x86-64.h, so the patch restricts the tests to Linux.
Tested on i386-pc-solaris2.11 and x86_64-pc-linux-gnu.
Rainer Orth [Tue, 25 Nov 2025 08:20:54 +0000 (09:20 +0100)]
testsuite: i386: Guard NO_PROFILE_COUNTERS tests
After switching the i386 check-function-bodies tests to use the new
dg-add-options check_function_bodies feature, several tests still FAIL
in similar ways on Solaris/x86:
* Some FAIL like this:
FAIL: gcc.target/i386/pr120936-6.c (test for excess errors)
Excess errors:
cc1: error: '-mnop-mcount' is not compatible with this target
This happens because -mnop-mcount is only supported in
i386/i386-options.cc (ix86_option_override_internal) if
NO_PROFILE_COUNTERS.
* Others FAIL like
FAIL: gcc.target/i386/pr120936-10.c (test for excess errors)
Excess errors:
gcc/testsuite/gcc.target/i386/pr120936-10.c:23:1: sorry, unimplemented: profiling '-mcmodel=large' with PIC is not supported
This error is generated in i386/i386.cc (x86_function_profiler) if
!NO_PROFILE_COUNTERS.
NO_PROFILE_COUNTERS is only defined in dragonfly.h, x86_64.sh,
gnu-user.h, freebsd.h, cygming.h, and netbsd-elf.h. However, a couple
of similar tests are restricted to Linux only, so this patch follows
suit. One could introduce a new effective-target keyword to fully
handle this, though.
Tested on i386-pc-solaris2.11 and x86_64-pc-linux-gnu.
Rainer Orth [Tue, 25 Nov 2025 08:18:13 +0000 (09:18 +0100)]
testsuite: i386: Handle check-function-bodies options using dg-add-options
The {gcc,g++}.target/i386 tests that use dg-final { check-function-bodies }
need additional options on Solaris/x86. So far, those tests have been
updated manually to add the required -fdwarf2-cfi-asm
-fasynchronous-unwind-tables. However, this has two issues:
* Those Solaris-only options make dg-options harder to read, although
they do no harm on other targets.
* Besides, the need for those options was repeatedly forgotten for each
new bunch of such tests.
To avoid that problem in the future, this patch introduces a new
dg-add-options feature, check_function_bodies, that adds those options
exactly on the targets that need it. It both improves readability and
will hopefully not be forgotten again for future tests.
Tested on i386-pc-solaris2.11 with as/ld and gas/ld, and
x86_64-pc-linux-gnu.
All of them FAIL in the same way: when gas is used, the tests contain
something like
.uaword .LLASF4 ! DW_AT_const_value: "my_foo"
while for /bin/as
.ascii "my_foo\0" ! DW_AT_const_value
is emitted. While other dwarf2 tests support both forms, the tests
above don't. This patch fixes this. To make the regex more readable,
they are switched to using braces instead of double quotes, thus
avoiding excessive escaping. At the same time, they now use
newline-sensitive matching to avoid .* matching across lines.
Tested on sparc-sun-solaris2.11 with as and gas, and
x86_64-pc-linux-gnu.
Kito Cheng [Mon, 24 Nov 2025 07:00:07 +0000 (15:00 +0800)]
c-family: Don't register include paths with -fpreprocessed
This fixes a permission error that occurs when cross-compiling with
-save-temps and a relocated toolchain, where the original build path
exists but is inaccessible.
The issue occurs when:
- Building the toolchain at /home/scratch/build/
- Installing it to another location like /home/user/rv64-toolchain/
- The /home/scratch directory exists but has insufficient permissions
(e.g. drwx------ for /home/scratch/)
Without this fix, cc1 would report errors like:
cc1: error: /home/scratch/build/install/riscv64-unknown-elf/usr/local/include: Permission denied
This occurred because the GCC driver did not pass GCC_EXEC_PREFIX and
isysroot to cc1/cc1plus in the compile stage when using -save-temps.
This caused cc1/cc1plus to search for headers from the wrong (original
build) path instead of the relocated installation path.
The fix ensures cc1/cc1plus won't try to collect include paths when
-fpreprocessed is specified. This prevents the permission error during
cross-compilation with -save-temps, as the preprocessed file already
contains all necessary headers.
gcc/c-family/ChangeLog:
* c-opts.cc (c_common_post_options): Skip register_include_chains
when cpp_opts->preprocessed is set.
Alexandre Oliva [Mon, 24 Nov 2025 23:07:20 +0000 (20:07 -0300)]
ira: sort allocno_hard_regs by regset
Using hashes of data structures for tie breaking makes sorting
dependent on type sizes, padding, and endianness, i.e., unstable
across different hosts.
ira-color.cc's allocno_hard_regs_compare does that, and it causes
different register allocations to be chosen for the same target
depending on the host. That's undesirable.
Compare the HARD_REG_SETs directly instead, looking for the
lowest-numbered difference register to use as the tie breaker for the
cost compare.
With a hardware implementation of ctz, this is likely faster than the
hash used to break ties before.
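A minimal sketch of this kind of host-independent tie breaker follows. It uses a toy two-word register set rather than the real HARD_REG_SET, and the names (RegSet, first_diff, compare_sets) as well as the choice of which set sorts first are assumptions made for illustration, not the code in hard-reg-set.h or ira-color.cc.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Toy stand-in for a hard register set: 128 registers in two 64-bit words.
using RegSet = std::array<std::uint64_t, 2>;

// Index of the lowest set bit of a nonzero word (a portable ctz).
int lowest_set_bit(std::uint64_t w) {
    int n = 0;
    while ((w & 1) == 0) {
        w >>= 1;
        ++n;
    }
    return n;
}

// Lowest register number at which a and b differ, or -1 if they are equal.
int first_diff(const RegSet &a, const RegSet &b) {
    for (std::size_t i = 0; i < a.size(); ++i) {
        std::uint64_t x = a[i] ^ b[i];
        if (x != 0)
            return static_cast<int>(i * 64) + lowest_set_bit(x);
    }
    return -1;
}

// Tie breaker returning -1, 0 or 1. Assumption: the set that contains the
// lowest differing register sorts first.
int compare_sets(const RegSet &a, const RegSet &b) {
    int d = first_diff(a, b);
    if (d < 0)
        return 0;
    bool in_a = ((a[d / 64] >> (d % 64)) & 1) != 0;
    return in_a ? -1 : 1;
}
```

Because the result depends only on which registers are in each set, it does not vary with the host's type sizes, padding, or endianness.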
for gcc/ChangeLog
PR rtl-optimization/122767
* ira-color.cc (allocno_hard_regs_compare): Break ties
using...
* hard-reg-set.h (hard_reg_set_first_diff): ... this. New
HARD_REG_SET API entry point.
Robin Dapp [Fri, 14 Nov 2025 14:01:29 +0000 (15:01 +0100)]
vect: Make SELECT_VL a convert optab.
Currently select_vl is a direct optab with its mode always Xmode/Pmode.
This does not give us sufficient freedom to enable/disable vsetvl
(=SELECT_VL) depending on the vector mode.
This patch makes select_vl a convert optab and adjusts the associated IFN
functions as well as the query/emit code in the vectorizer.
With this patch nothing new is actually exercised yet. This is going to
happen in a separate riscv patch that enables "VLS" select_vl.
Robin Dapp [Wed, 22 Oct 2025 19:14:36 +0000 (21:14 +0200)]
RISC-V: Add BF VLS modes and document iterators.
We're missing some VLS BF modes, e.g. for gathers. This patch adds them.
While at it, it adds some documentation to the iterators and corrects
the vec_set iterator (for the time being).
Regtested on rv64gcv_zvl512b but curious what the CI says.
Robin Dapp [Sat, 22 Nov 2025 19:53:25 +0000 (20:53 +0100)]
vect: Use start value in vect_load_perm_consecutive_p [PR122797].
vect_load_perm_consecutive_p is used in a spot where we want to check
that a permutation is consecutive and starting with 0. Originally I
wanted to add this way of checking to the function but what I ended
up with is checking whether the given permutation is consecutive
starting from a certain index. Thus, we will return true for
e.g. {1, 2, 3} which doesn't make sense in the context of the PR.
This patch corrects it.
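The intended check can be sketched as follows. This is a simplified model written from the description in this message, not the code in tree-vect-slp.cc; the function name, the use of std::vector, and the treatment of UINT_MAX as "any start value" are assumptions for illustration.

```cpp
#include <cassert>
#include <climits>
#include <vector>

// Returns true if perm is consecutive from its first element. If start_val
// is not UINT_MAX, the first element must additionally equal start_val.
// (Hypothetical simplified model.)
bool perm_consecutive_p(const std::vector<unsigned> &perm,
                        unsigned start_val = UINT_MAX) {
    if (perm.empty())
        return true;
    unsigned first = perm[0];
    if (start_val != UINT_MAX && first != start_val)
        return false;
    for (unsigned i = 1; i < perm.size(); ++i)
        if (perm[i] != first + i)
            return false;
    return true;
}
```

With a start value of 0, {1, 2, 3} is rejected while {0, 1, 2} is accepted; without a start value both count as consecutive.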
PR tree-optimization/122797
gcc/ChangeLog:
* tree-vect-slp.cc (vect_load_perm_consecutive_p): Check
permutation start at element 0 with value instead of starting
at a given element.
(vect_optimize_slp_pass::remove_redundant_permutations):
Use start value of 0.
* tree-vectorizer.h (vect_load_perm_consecutive_p): Set default
value to UINT_MAX.
Robin Dapp [Sun, 16 Nov 2025 17:42:04 +0000 (18:42 +0100)]
forwprop: Allow nop conversions for vector constructor.
I observed a vect-construct forwprop opportunity in x264 that we
could handle when checking for a nop conversion instead of a useless
conversion. IMHO a nop-conversion check is sufficient as we're only
dealing with permutations in simplify_vector_constructor.
This patch replaces uses of useless_type_conversion_p with
tree_nop_conversion_p in simplify_vector_constructor.
It was bootstrapped and regtested on x86 and power10, regtested on
aarch64 and riscv64.
There is a single scan-test failure on power
(gcc.target/powerpc/builtins-1.c). The code actually looks better so
I took the liberty of adjusting the test expectation.
hppa: Update peephole2 patterns for scaled/unscaled indexed loads and stores
The peephole2 patterns to optimize scaled/unscaled indexed loads and
stores are updated to ensure the REG_POINTER flag is set/unset in
the base/index regs on targets with non-equivalent space registers.
Previously, unscaled indexed loads and stores were only optimized on
targets with equivalent space registers. We can now optimize these
instructions on targets with non-equivalent space registers.
2025-11-24 John David Anglin <danglin@gcc.gnu.org>
gcc/ChangeLog:
* config/pa/pa.h (REGS_OK_FOR_BASE_INDEX): New define.
* config/pa/pa.md: Update peephole2 patterns for scaled/unscaled
indexed loads and stores.
Richard Biener [Mon, 24 Nov 2025 14:10:22 +0000 (15:10 +0100)]
Avoid incomplete SLP handling for OMP SIMD calls with linear/invariant clause
The following restricts these cases to single-lane SLP as they look
at only the representative scalar argument.
PR tree-optimization/122826
* tree-vect-stmts.cc (vectorizable_simd_clone_call): Only
use single-lane SLP for SIMD_CLONE_ARG_TYPE_UNIFORM
and SIMD_CLONE_ARG_TYPE_LINEAR_[REF_]CONSTANT_STEP.
Eric Botcazou [Mon, 24 Nov 2025 18:00:43 +0000 (19:00 +0100)]
Ada: Fix incorrect handling of BOM by -r switch of gnatchop
As reported and analyzed in the PR, gnatchop does not correctly propagate
a BOM present in the source file to the first compilation unit it outputs,
in the case where the -r switch is specified, because it copies the BOM
for the first compilation unit as part of the chopping process instead of
copying it specifically at the start of the unit.
gcc/ada/
PR ada/81106
* gnatchop.adb (Gnatchop): If present in the source file, output
the BOM at the start of every compilation unit.
Andrew Pinski [Fri, 21 Nov 2025 04:43:16 +0000 (20:43 -0800)]
phiprop: Avoid proping loads into loops [PR116835]
This is v2 which uses Richi's code.
This amends "the post-dominance check to deal with
SSA cycles [...]. We need to constrain the load to
be in the same or a subloop (use
flow_loop_nested_p, not loop depth) or in the same
BB when either the load or the PHI is in an irreducible region."
(as described by Richard).
Bootstrapped and tested on x86_64-linux-gnu.
PR tree-optimization/116835
gcc/ChangeLog:
* tree-ssa-phiprop.cc (propagate_with_phi): Amend the
post-dom check to deal with ssa cycles.
gcc/testsuite/ChangeLog:
* gcc.dg/torture/pr116835.c: New test.
* gcc.dg/tree-ssa/phiprop-6.c: New test.
* gcc.dg/tree-ssa/phiprop-7.c: New test.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Co-authored-by: Richard Biener <rguenther@suse.de>
Andrew Pinski [Fri, 21 Nov 2025 07:14:24 +0000 (23:14 -0800)]
phiprop: allowing prop into loop if there is a phi already
This is a small improvement over the original change
for PR60183: if we have already created a phi, we can
always reuse it. But since such a use might come before
the use which was valid to transform, we
need a vector to save the delayed ones.
Bootstrapped and tested on x86_64-linux-gnu.
PR tree-optimization/60183
gcc/ChangeLog:
* tree-ssa-phiprop.cc (propagate_with_phi): Delay the decision
of always rejecting proping into the loop until all are done.
If there were some delayed stmts and a phi was created, fill them in.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/phiprop-5.c: New test.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Andrew Pinski [Fri, 21 Nov 2025 06:39:48 +0000 (22:39 -0800)]
phiprop: Allow non-trapping loads to be proped back into the loop
First the testcase for PR60183 is no longer testing that we don't
prop into the loops for possible trapping cases. This adds phiprop-4.c
that tests that.
Second we can prop back into loops if we know the load will not trap.
This adds that check. phiprop-3.c tests that.
Bootstrapped and tested on x86_64-linux-gnu.
PR tree-optimization/60183
gcc/ChangeLog:
* tree-ssa-phiprop.cc (propagate_with_phi): Allow
known non-trapping loads to happen back into the
loop.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/phiprop-3.c: New test.
* gcc.dg/tree-ssa/phiprop-4.c: New test.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Andrew Pinski [Mon, 24 Nov 2025 08:52:35 +0000 (00:52 -0800)]
forwprop: Allow mismatch clobbers in simple dse
As mentioned in the patch that adds DSEing lhs of calls,
some testcases were xfailed due to exceptions and mismatch
of clobbers in some cases.
This allows them and un-xfails the testcase where they show
up.
Bootstrapped and tested on x86_64-linux-gnu.
gcc/ChangeLog:
* tree-ssa-forwprop.cc (do_simple_agr_dse): Allow
for mismatched clobbers.
gcc/testsuite/ChangeLog:
* g++.dg/tree-ssa/simple-dse-3.C: un-xfail.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Andrew Pinski [Mon, 24 Nov 2025 00:42:57 +0000 (16:42 -0800)]
forwprop: Add call stmt support to simple dse [PR122633]
This adds the ability to the simple dse to remove the lhs of a
call. It can also remove a call if it was pure/const in some cases.
On trampv3, I found this happened a few times during forwprop2, 3
and 4. The one in 4 was a surprise; even more, it caused the removal
of a call which gcc was not able to remove before. This is because
DSE and DCE need to be done iteratively together, but
we currently don't do that. So it just happens that the late forwprop4's
simple dse is able to remove this call.
I will fix the xfail testcases in a followup; basically, with exceptions there
can be a mismatch in the CLOBBER which I didn't expect before, and I was
being super cautious when it comes to having them match, but in reality
the difference in CLOBBERs doesn't matter.
Bootstrapped and tested on x86_64-linux-gnu
PR tree-optimization/122633
gcc/ChangeLog:
* tree-ssa-forwprop.cc (do_simple_agr_dse): Remove
lhs of dead store for a call (or the whole call stmt).
gcc/testsuite/ChangeLog:
* g++.dg/tree-ssa/simple-dse-1.C: New test.
* g++.dg/tree-ssa/simple-dse-2.C: New test.
* g++.dg/tree-ssa/simple-dse-3.C: New test.
* g++.dg/tree-ssa/simple-dse-4.C: New test.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
This moves the `(pointer_diff (pointer_plus @0 @2) (pointer_plus @1 @2))` pattern
to right below the `(pointer_diff (pointer_plus @0 @1) (pointer_plus @0 @2))` pattern
to make it easier to see both versions are supported.
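For readers less familiar with these patterns, the two identities they correspond to can be written out at the source level. This is only a sketch: the function names are made up, char pointers are used so that offsets are in bytes, and the exact form of the simplified result in match.pd is an assumption here.

```cpp
#include <cassert>
#include <cstddef>

// Same base, different offsets: (p + a) - (p + b) is just a - b.
std::ptrdiff_t same_base_diff(const char *p, std::ptrdiff_t a, std::ptrdiff_t b) {
    return (p + a) - (p + b);
}

// Different bases, same offset: (p + n) - (q + n) is just p - q.
std::ptrdiff_t same_offset_diff(const char *p, const char *q, std::ptrdiff_t n) {
    return (p + n) - (q + n);
}
```

In both cases the common operand cancels, so the pointer additions do not need to be computed at all.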
Bootstrapped and tested on x86_64-linux-gnu.
gcc/ChangeLog:
* match.pd (`(ptr_diff (ptr_plus @0 @2) (ptr_plus @1 @2))`): Move pattern
earlier to the other `(ptr_diff (ptr_plus) (ptr_plus))` pattern.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Jonathan Wakely [Thu, 20 Nov 2025 12:19:54 +0000 (12:19 +0000)]
libstdc++: Implement LWG 4370 for std::optional comparisons
This modifies the relational comparisons for std::optional so that they
do not use logical expressions with && or || that involve the
comparisons on the contained values, because x && (*y == *z) might do
the wrong thing if *y == *z does not return bool.
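A contrived sketch of the problem follows. The types Bool and Odd and the overloaded operator&& are hypothetical, invented for illustration; this is neither the example from the LWG issue nor the libstdc++ implementation.

```cpp
#include <cassert>

// A comparison result type that is convertible to bool but is not bool.
struct Bool {
    bool value;
    operator bool() const { return value; }
};

// Pathological overload: for (bool && Bool) this is an exact match, so it is
// chosen instead of the built-in &&, and it ignores both operands.
inline bool operator&&(bool, Bool) { return true; }

struct Odd {
    int v;
};
inline Bool operator==(Odd a, Odd b) { return Bool{a.v == b.v}; }

// Naive form: the logical expression silently calls the overload above.
bool naive_eq(bool engaged, Odd a, Odd b) {
    return engaged && (a == b);
}

// Careful form: branch first, then convert the comparison result to bool
// implicitly in the return statement.
bool careful_eq(bool engaged, Odd a, Odd b) {
    if (!engaged)
        return false;
    return a == b;
}
```

Here naive_eq(false, Odd{1}, Odd{2}) yields true, while careful_eq gives the expected false, which is why avoiding && and || with operands of unknown type matters.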
libstdc++-v3/ChangeLog:
* include/std/optional (operator==, operator!=, operator<)
(operator>, operator<=, operator>=): Do not use logical
&& and || with operands of unknown types.
* testsuite/20_util/optional/relops/lwg4370.cc: New test.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Jonathan Wakely [Thu, 20 Nov 2025 12:19:54 +0000 (12:19 +0000)]
libstdc++: Implement LWG 4366 for std::expected comparisons
This modifies the equality comparisons for std::expected so that they do
not use explicit conversions to bool, to match the constraints which are
specified in terms of "convertible to" which implies implicitly
convertible. As a result of those changes, we cannot use logical
expressions with && or || that involve comparisons of the contained
values, because x && (*y == *z) might do the wrong thing if *y == *z
does not return bool.
Also add [[nodiscard]] attributes which were missing.
The new lwg4366.cc testcase is a dg-do run test not dg-do compile,
because the original example won't compile with libstdc++ even after
these fixes. We constrain the std::expected comparison operators with
std::convertible_to<bool> and the pathological Bool type in the issue
doesn't satisfy that concept. So the new test replaces the deleted
explicit conversion operator in the issue with one that isn't deleted
but terminates if called. This ensures we don't call it, thus ensuring
that std::expected's comparisons do implicit conversions only.
It's unclear to me whether using the convertible_to concept in
std::expected comparisons is conforming, or if we should switch to an
__implicitly_convertible_to<bool> concept which only uses
std::is_convertible_v<T, bool> and doesn't check for explicit
conversions. That can be addressed separately from this change.
libstdc++-v3/ChangeLog:
* include/std/expected (operator==): Use implicit conversion to
bool and do not use logical && and || with operands of unknown
types. Add nodiscard attributes.
* testsuite/20_util/expected/equality.cc: Test some missing
cases which were not covered previously.
* testsuite/20_util/expected/lwg4366.cc: New test.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
1) [dcl.fct.def.default]/2.6 says "if F1 is explicitly defaulted on its
first declaration, it is defined as deleted;", but I wasn't heeding the
"on its first declaration" part, so fix that;
2) when the decl is actually ill-formed, don't talk about it being
implicitly deleted, because it's not;
3) there is no need to export maybe_delete_defaulted_fn.
PR c++/119964
gcc/cp/ChangeLog:
* cp-tree.h (maybe_delete_defaulted_fn): Remove.
* method.cc (maybe_delete_defaulted_fn): Make static. Refactor. If FN
is not explicitly defaulted on its first declaration, emit an error.
gcc/testsuite/ChangeLog:
* g++.dg/cpp1y/defaulted1.C: New test.
* g++.dg/cpp1y/defaulted2.C: New test.
Richard Biener [Sun, 23 Nov 2025 12:57:47 +0000 (13:57 +0100)]
Move SIMD clone rejections to SIMD clone selection
The following moves checks we used to reject SIMD clone vectorization
to selection of the SIMD clone. It also removes unnecessary
restrictions on constant/external defs: vector types are already
determined and constraints should not be special here.
* tree-vect-stmts.cc (vectorizable_simd_clone_call): Move
all SIMD clone validity checks to SIMD clone selection.
Remove late constant/external def vector type setting and
verification.
Jeff Law [Mon, 24 Nov 2025 13:05:27 +0000 (06:05 -0700)]
[PR rtl-optimization/122782] Fix out of range shift causing bootstrap failure with ubsan
As noted in the PR, we're doing a bogus shift in the new code triggering a
ubsan failure. This code works up through DImode and needs to reject attempts
at handling wider modes. Yes, it could be extended to those wider modes and I
expect we will as int128 becomes more common, but it doesn't seem worth the
effort right now.
This patch adds the same kind of test we're using elsewhere to guard against
the bogus shift. While I haven't been able to reproduce the ubsan bootstrap
failure, I can see the bogus shifts under the debugger and I can see they no
longer occur after this patch.
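The kind of guard described above can be sketched in isolation. This is a
minimal illustration and not the ext-dce.cc code: low_bits_mask is a
hypothetical helper, and the 64-bit limit stands in for "up through DImode".

```cpp
#include <cstdint>

// Hypothetical helper, not GCC code: build a mask of the low `bits` bits.
// Shifting a 64-bit value by 64 or more is undefined behaviour, so widths
// outside the supported range are rejected before any shift happens.
bool low_bits_mask(unsigned bits, std::uint64_t *mask)
{
  if (bits == 0 || bits > 64)
    return false;               // wider modes: caller skips the optimization
  if (bits == 64)
    *mask = ~std::uint64_t{0};  // avoid the undefined `1 << 64`
  else
    *mask = (std::uint64_t{1} << bits) - 1;
  return true;
}
```

A caller would test the return value first and simply give up on the
transformation for wider modes, which is the behaviour the patch opts for.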
This has been bootstrapped and regression tested on x86 and riscv. It's also
been through all the crosses. Pushing to the trunk momentarily.
jeff
PR rtl-optimization/122782
gcc/
* ext-dce.cc (ext_dce_process_uses): Guard against undefined shifts
by properly checking modes on the input object.
Jonathan Wakely [Mon, 24 Nov 2025 12:48:42 +0000 (12:48 +0000)]
libstdc++: Fix pretty printers for std::list
The logs for xmethods.exp show that the std::list tests have never
worked:
gdb.error: No type named std::__cxx11::list<int, std::allocator<int> >::_Node.^M
skipping: File "/home/jwakely/src/gcc/gcc/libstdc++-v3/testsuite/../python/libstdcxx/v6/xmethods.py", line 445, in match\r\nskipping: node_type = gdb.lookup_type(str(class_type) + '::_Node').pointer()\r\nskipping: ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\r\nskipping: gdb.error: No type named std::__cxx11::list<int, std::allocator<int> >::_Node.\r\nlist.gdb:11: Error in sourced command file:^M
Error while looking for matching xmethod workers defined in Python.^M
skipping: list.gdb:11: Error in sourced command file:\r\nskipping: Error while looking for matching xmethod workers defined in Python.\r\nUNSUPPORTED: libstdc++-xmethods/list.cc
Because of the way the GDB tests treat errors as UNSUPPORTED (so that
the tests don't fail if the version of GDB is too old to support
xmethods) we were not getting any FAIL, even though the tests were
broken.
The std::list type does not have a nested _Node type, it only has
_Node_ptr. Instead of looking up _Node and then getting a pointer to
that type, just look up _Node_ptr instead.
libstdc++-v3/ChangeLog:
* python/libstdcxx/v6/xmethods.py (ListMethodsMatcher.match):
Fix lookup for node type.
Jonathan Wakely [Mon, 24 Nov 2025 12:36:32 +0000 (12:36 +0000)]
libstdc++: Fix XMethods for debug mode [PR122821]
The Python GDB XMethods were not matching the debug mode containers,
because the is_specialization_of helper function matches std::(__\d::)?
and so fails to match std::__debug::deque etc.
This makes it match std::__debug:: as well as std:: and std::__8::.
Since the regex already handles the versioned namespace with (__\d::)?
we don't need to also include the _versioned_namespace variable
explicitly. This means we now match std::(__\d::|__debug::)?name<.*>
instead of matching std::(__\d::)?(__8::)?name<.*> which redundantly
included two ways to match the __8 versioned namespace.
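The difference between the two patterns can be checked outside GDB. The
following is only a C++ rendering of the regexes quoted above (the real
helper is Python); both function names here are illustrative, and the
sketch assumes the container name contains no regex metacharacters.

```cpp
#include <regex>
#include <string>

// New pattern: optional versioned namespace or __debug namespace.
bool is_specialization_of(const std::string &type_name,
                          const std::string &name)
{
  const std::regex re("^std::(__\\d::|__debug::)?" + name + "<.*>$");
  return std::regex_match(type_name, re);
}

// Old pattern, kept for comparison: only the versioned namespace is optional.
bool old_is_specialization_of(const std::string &type_name,
                              const std::string &name)
{
  const std::regex re("^std::(__\\d::)?" + name + "<.*>$");
  return std::regex_match(type_name, re);
}
```

With these, "std::__debug::deque<int>" is accepted only by the new pattern,
while "std::deque<int>" and "std::__8::deque<int>" are accepted by both.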
libstdc++-v3/ChangeLog:
PR libstdc++/122821
* python/libstdcxx/v6/xmethods.py (_versioned_namespace): Remove
global variable.
(is_specialization_of): Do not use _versioned_namespace. Add
__debug:: to regex.
dwarf: Save bit stride information for array type entry [PR121964]
Lack of DW_AT_bit_stride in a DW_TAG_array_type entry causes GDB to infer
incorrect element size for vector types. This causes incorrect display of
SVE predicate variables as well as out of bounds memory access when reading
contents of SVE predicates from memory in GDB.
We also locate DIE referenced by DW_AT_type and set DW_AT_bit_size 1 in it.
PR debug/121964
gcc/
* dwarf2out.cc (gen_array_type_die): Add DW_AT_bit_stride attribute
for array types based on element type bit precision for integer and
boolean element types.
gcc/testsuite/
* g++.target/aarch64/dwarf-bit-stride-func.C: New test.
* g++.target/aarch64/dwarf-bit-stride-pragma.C: New test.
* g++.target/aarch64/dwarf-bit-stride-pragma-sme.C: New test.
* g++.target/aarch64/sve/dwarf-bit-stride.C: New test.
* gcc.target/aarch64/dwarf-bit-stride-func.c: New test.
* gcc.target/aarch64/dwarf-bit-stride-pragma.c: New test.
* gcc.target/aarch64/dwarf-bit-stride-pragma-sme.c: New test.
* gcc.target/aarch64/sve/dwarf-bit-stride.c: New test.
Paul Thomas [Mon, 24 Nov 2025 11:30:19 +0000 (11:30 +0000)]
Fortran: Failure with 1st PDT example in F2018 standard [PR122766]
2025-11-24 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/122766
* decl.cc (gfc_match_decl_type_spec): A pdt_type found while
parsing a contains section can only arise from the typespec of
a function declaration. This can be retained in the typespec.
Once we are parsing the function, the first reference to this
derived type will find that it has no symtree. Provide it with
one so that gfc_use_derived does not complain and, again, retain
it in the typespec.
gcc/testsuite
PR fortran/122766
* gfortran.dg/pdt_69.f03: New test.
Richard Biener [Mon, 24 Nov 2025 10:18:42 +0000 (11:18 +0100)]
Adjust gcc.dg/vect/bb-slp-41.c
We now perform SLP vectorization for the scalar epilog of a vector
loop where we now unroll said scalar epilog. The following adjusts
the testcase to look for SLP trying to vectorize a CTOR instead,
which I guessed from the topic of r10-4336-g818b3293f4545d which
added this testcase.
Eric Botcazou [Mon, 24 Nov 2025 09:36:35 +0000 (10:36 +0100)]
Fix wrong code for indexed component with very large index type
This fixes an old issue whereby we generate wrong code in Ada for an indexed
component in an array with a ludicrously large index type instead of raising
Storage_Error. We would need the counterpart of int_const_binop for unop in
the general case, but that's not worth the hassle and int_const_convert is
good enough.
gcc/
PR ada/33994
* fold-const.h (int_const_convert): New prototype.
* fold-const.cc (fold_convert_const_int_from_int): Rename to...
(int_const_convert): ...this, remove static keyword and add third
parameter OVERFLOWABLE.
(fold_convert_const): Call int_const_convert if ARG1 is an integer
constant.
gcc/ada/
PR ada/33994
* gcc-interface/utils.cc (convert) <INTEGER_TYPE>: Call
int_const_convert if the expression is an integer constant.
gcc/testsuite/
* gnat.dg/object_overflow6.adb: New test.
gcc: Set native_system_header_dir on aarch64-mingw
Provide a sensible default value for native_system_header_dir, namely
/mingw/include, on aarch64-mingw. This is in line with the expectations
for mingw file locations, and is already set on both x86- and
x86_64-mingw.
gcc/ChangeLog:
* config.gcc (aarch64-*-mingw*): Set native_system_header_dir.
hppa: Fix scaled and unscaled index support on targets with non-equivalent space registers
HP-UX targets have non-equivalent space registers. The base register
in most loads and stores selects the space register used to calculate
the global virtual address for the instruction.
Previously, the PA-RISC backend attempted to canonicalize the
register order in INDEX + BASE register addresses. This has always
been problematic as reload would sometimes lose the REG_POINTER
flag used to mark a base register. As a result, we allowed any
register order after reload and prayed the registers would be
in canonical order.
This broke with the new late_combine2 pass. It sometimes creates
new indexed instructions after reload. pa_legitimate_address_p
needs updating to ensure the base register is marked with the
REG_POINTER flag and the index register is not marked.
If scaled index instructions are created before reload, the LRA
pass will sometimes convert it to an unscaled index instruction
plus reloads and drop the REG_POINTER flag that was in the base
register. Thus, we can't allow scaled and unscaled index loads
and stores until reload is completed.
2025-11-23 John David Anglin <danglin@gcc.gnu.org>
gcc/ChangeLog:
* config/pa/pa.cc (pa_print_operand): Use REG_POINTER
flag to select base and index registers on targets with
non-equivalent space registers.
(pa_legitimate_address_p): Don't allow scaled and unscaled
indexed addresses until reload is complete. Allow any
register order in unscaled addresses as long as the
REG_POINTER flag is correctly set/unset in the base/index
registers.
* config/pa/predicates.md (mem_operand): Remove code to
delay creating move insns with unscaled indexed addresses
until CSE is not expected.
(move_src_operand): Likewise.
Pan Li [Sat, 15 Nov 2025 03:22:23 +0000 (11:22 +0800)]
Match: Remove unnecessary convert for unsigned SAT_MUL
Now that the outer convert of a bit_op is folded into its captures,
the outer convert in unsigned SAT_MUL form 6 is no longer
necessary. Thus, remove it. Meanwhile, add c after the outer bit_ior
to make the test happy.
gcc/ChangeLog:
* match.pd: Remove unnecessary outer convert and add
c for the outer bit_ior.
Pan Li [Sat, 15 Nov 2025 03:21:37 +0000 (11:21 +0800)]
Test: Add test case for bit_op convert folding
Add test cases for all possible types of bit_op convert folding.
They check that the tree dump contains nothing like the following:
_5 = (uint8_t) _2;
return _5;
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/bit_op_cvt.1.c: New test.
* gcc.dg/tree-ssa/bit_op_cvt.2.c: New test.
* gcc.dg/tree-ssa/bit_op_cvt.3.c: New test.
* gcc.dg/tree-ssa/bit_op_cvt.4.c: New test.
* gcc.dg/tree-ssa/bit_op_cvt.5.c: New test.
* gcc.dg/tree-ssa/bit_op_cvt.6.c: New test.
* gcc.dg/tree-ssa/bit_op_cvt.h: New test.
Pan Li [Sat, 15 Nov 2025 03:20:37 +0000 (11:20 +0800)]
Match: Simplify (T1)(a bit_op (T2)b) to (T1)a bit_op (T1)b
During the match pattern of SAT_U_MUL form 7, we found there is
a pattern like below:
(nop_convert)(a bit_op (convert b))
which makes the pattern match of SAT_U_MUL complicated and
unintuitive. Following Richard's suggestion, we would
like to simplify it to the form below:
(convert a) bit_op (convert b)
which is easier to read and friendlier to bit_op matching. Three
bit_ops are covered here, namely bit_ior, bit_and and bit_xor.
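The equivalence behind this simplification can be sketched in plain C++.
This is an illustration only: the functions before and after are
hypothetical names, the types are picked for the example (T1 is uint64_t,
T2 is int64_t, and b is a narrower int32_t), and the type conditions
actually used in match.pd are not reproduced here.

```cpp
#include <cstdint>

// Original shape: (T1)(a bit_op (T2)b), with a same-width outer conversion.
std::uint64_t before(std::int64_t a, std::int32_t b)
{
  return static_cast<std::uint64_t>(a ^ static_cast<std::int64_t>(b));
}

// Simplified shape: (T1)a bit_op (T1)b, both operands converted first.
std::uint64_t after(std::int64_t a, std::int32_t b)
{
  return static_cast<std::uint64_t>(a) ^ static_cast<std::uint64_t>(b);
}
```

Both functions produce the same bits because converting the signed 32-bit
b to either 64-bit type sign-extends it in the same way, and the bitwise
operation itself does not depend on signedness.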
gcc/ChangeLog:
* match.pd: Add simplify to fold outer convert of bit_op
to inner captures.
[tree-optimization] Allow LICM to hoist loads in "self write" patterns
This patch enables Loop Invariant Code Motion (LICM) to hoist loads that
alias with stores when SSA def-use analysis proves the stored value comes
from the loaded value.
The pattern a[i] = a[0] is common in TSVC benchmarks (s293):
for (int i = 0; i < N; i++)
a[i] = a[0];
Previously, GCC conservatively rejected hoisting a[0] due to potential
aliasing when i==0. However, this is a "self write" - even when aliasing
occurs, we're writing back the same value, making hoisting safe.
The optimization checks that:
1. One reference is a load, the other is a store
2. The stored SSA value equals the loaded SSA value
3. Only simple cases with single accesses per reference
This enables vectorization of these patterns by allowing the vectorizer
to see the hoisted loop-invariant value.
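The safety argument can be checked by hand-hoisting the load. The sketch
below is not what GCC emits; fill_reload and fill_hoisted are illustrative
names, and the hoisting is written out manually to show that both loops
leave the array in the same state.

```cpp
#include <cstddef>
#include <vector>

// Loop as written in the source: a[0] is reloaded on every iteration.
void fill_reload(std::vector<int> &a)
{
  for (std::size_t i = 0; i < a.size(); i++)
    a[i] = a[0];
}

// Hand-hoisted form: the load of a[0] is moved out of the loop.  At i == 0
// the store writes back the value just loaded, so the result is unchanged.
void fill_hoisted(std::vector<int> &a)
{
  if (a.empty())
    return;
  const int first = a[0];
  for (std::size_t i = 0; i < a.size(); i++)
    a[i] = first;
}
```

Once the load sits outside the loop, the body is a plain store of an
invariant value, which is the shape the vectorizer can handle.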
With the patch, the loop now vectorizes and generates:
Sandra Loosemore [Sun, 23 Nov 2025 03:29:34 +0000 (03:29 +0000)]
OpenMP: Fix "begin declare variant" test failure with -m32
As reported by Haochen Jiang, this recently-added test case was
failing on x86_64 with -m32; the target hook for matching the "arch"
selector won't match "x86_64" in that case, even if gcc was configured
for that target. It does match plain "x86" for both 64 and 32 bit targets,
so I've switched the testcase to use that instead.
Committed as obvious (at least in retrospect).
gcc/testsuite/ChangeLog
* c-c++-common/gomp/delim-declare-variant-6.c (f3): Use "x86"
instead of "x86_64" in the arch selector, to match both 64- and
32-bit targets.
I had mistakenly been checking the importedness of the originating
module decl, but this is wrong: really we want to check if the specific
decl we're currently instantiating came from another module, so just
check DECL_MODULE_IMPORT_P on this directly.
Also updated slightly since there are cases where we do emit TU-local
function or variable templates, albeit unlikely to come up frequently.
PR c++/122636
gcc/cp/ChangeLog:
* module.cc (instantiating_tu_local_entity): Don't check
importingness of originating module decl; also check templates.
gcc/testsuite/ChangeLog:
* g++.dg/modules/internal-19_a.C: New test.
* g++.dg/modules/internal-19_b.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com> Reviewed-by: Jason Merrill <jason@redhat.com>
Nathaniel Shead [Sat, 22 Nov 2025 11:11:35 +0000 (22:11 +1100)]
c++: Correct behaviour of layout_compatible_type for aligned types
The standard does not require two types to have the same alignment (and
hence size) to be considered layout-compatible. The same applies to
members of unions.
gcc/cp/ChangeLog:
* typeck.cc (layout_compatible_type_p): Do not check TYPE_SIZE.
Jeff Law [Sat, 22 Nov 2025 18:33:57 +0000 (11:33 -0700)]
[PR 122701] Emit fresh reg->reg copy rather than modifying existing insn
I took an ill-advised short-cut with the recent ext-dce improvement to detect
certain shift pairs as sign/zero extensions. Specifically I was adjusting the
SET_SRC of an object.
Often we can get away with that, but as this case shows it's simply not safe
for RTL. The core issue is the right shift we're modifying into a simple
reg->reg move may have things like CLOBBERs outside the set resulting in
Even that is often OK as targets which have these kinds of clobbers often need
them on their basic moves because those moves often set condition codes. But
that's not true for GCN.
On GCN that transformation leads to an unrecognizable insn as seen in the pr.
The fix is pretty simple. Just emit a new move and delete the shift. Of
course we have to be prepared to handle multiple insns once we use
emit_move_insn, but that's not too bad.
PR rtl-optimization/122701
gcc/
* ext-dce.cc (ext_dce_try_optimize_rshift): Emit a fresh reg->reg
copy rather than modifying the existing right shift.
gcc/testsuite/
* gcc.dg/torture/pr122701.c: New test.
Sandra Loosemore [Thu, 20 Nov 2025 21:45:09 +0000 (21:45 +0000)]
OpenMP: C++ front end support for "begin declare variant"
This patch implements C++ support for the "begin declare variant"
construct. The OpenMP specification is hazy on interaction of this
feature with C++ language features. Variant functions in classes are
supported but must be defined as members in the class definition,
using an unqualified name for the base function which also must be
present in that class. Similarly variant functions in a namespace can
only be defined in that namespace using an unqualified name for a base
function already declared in that namespace. Variants for template
functions or inside template classes seem to (mostly) work.
Sandra Loosemore [Thu, 20 Nov 2025 21:45:08 +0000 (21:45 +0000)]
OpenMP: Add flag for code elision to omp_context_selector_matches.
The "begin declare variant" has different rules for determining
whether a context selector cannot match for purposes of code elision
than we normally use; it excludes the case of a constant false
"condition" selector for the "user" set.
gcc/ChangeLog
* omp-general.cc (omp_context_selector_matches): Add an optional
bool argument for the code elision case.
* omp-general.h (omp_context_selector_matches): Likewise.
Sandra Loosemore [Thu, 20 Nov 2025 21:45:08 +0000 (21:45 +0000)]
OpenMP: Support functions for nested "begin declare variant"
This patch adds functions for variant name mangling and context selector
merging that are shared by the C and C++ front ends.
The OpenMP specification says that name mangling is supposed to encode
the context selector for the variant, but also provides no way to
reference these functions directly by name or from a different
compilation unit. It also gives no guidance on how dynamic selectors
might be encoded across compilation units.
The GCC implementation of this feature instead treats variant
functions as if they have no linkage and uses a simple counter to
generate names. The exception is variants declared in a module interface,
which are given module linkage.
Jakub Jelinek [Sat, 22 Nov 2025 11:39:09 +0000 (12:39 +0100)]
c++: Fix up [[maybe_unused]] handling on expansion stmts [PR122788]
This PR complains that [[maybe_unused]] attribute is ignored on
the range-for-declaration of expansion-statement.
We copy DECL_ATTRIBUTES and apply late attributes, but early attributes
don't have their handlers called again, so some extra flags need to be
copied as well.
This copies TREE_USED and DECL_READ_P flags.
2025-11-22 Jakub Jelinek <jakub@redhat.com>
PR c++/122788
* pt.cc (finish_expansion_stmt): Or in TREE_USED and DECL_READ_P
flags from range_decl to decl or from corresponding structured binding
to this_decl.
Jakub Jelinek [Sat, 22 Nov 2025 11:24:35 +0000 (12:24 +0100)]
c++: Readd type checks for cp_fold -ffold-simple-inlines foldings [PR122185]
In GCC15, cp_fold -ffold-simple-inlines code contained
if (INDIRECT_TYPE_P (TREE_TYPE (x))
&& INDIRECT_TYPE_P (TREE_TYPE (r)))
check around the optimization, but as std::to_underlying has been
added to the set, it got removed.
Now, the check isn't needed when using correct libstdc++-v3 headers,
because the function template types ensure the converted types are sane
(so for most of them both are some kind of REFERENCE_TYPEs, for addressof
one REFERENCE_TYPE and one POINTER_TYPE, for to_underlying one ENUMERAL_TYPE
and one INTEGRAL_TYPE_P).
But when some fuzzer or user attempts to implement one or more of those
std:: functions and does it wrong (sure, such code is invalid), we can ICE
because build_nop certainly doesn't handle all possible type conversions.
So, the following patch readds the INDIRECT_TYPE_P && INDIRECT_TYPE_P check
for everything but to_underlying, for which it checks ENUMERAL_TYPE to
INTEGRAL_TYPE_P. That way we don't ICE on bogus code.
Though, I wonder about 2 things, whether the CALL_EXPR_ARG in there
shouldn't be also guarded just in case somebody tries to compile
namespace std { int to_underlying (); }; int a = std::to_underlying ();
and also whether this to_underlying folding doesn't behave differently
from the libstdc++-v3 implementation if the enum is
enum A : bool { B, C };
I think -fno-fold-simple-inlines will compile it as != 0, while
the -ffold-simple-inlines code just as a cast. Sure, enum with underlying
bool can't contain enumerators with values other than 0 and 1, but it is
still 8-bit at least and so what happens with other values?
2025-11-22 Jakub Jelinek <jakub@redhat.com>
PR c++/122185
* cp-gimplify.cc (cp_fold) <case CALL_EXPR>: For -ffold-simple-inlines
restore check that both types are INDIRECT_TYPE_P, except for
"to_underlying" check that r has ENUMERAL_TYPE and x has
INTEGRAL_TYPE_P.
Deng Jianbo [Fri, 14 Nov 2025 02:22:10 +0000 (10:22 +0800)]
LoongArch: Optimize statement to use bstrins.{w|d}
For the statement (a << imm1) | (b & imm2), when imm2 equals
(1 << imm1) - 1, it can be optimized to use the bstrins.{w|d} instruction.
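Why this expression is a bit-field insert can be sketched as follows. The
bit_insert function is a hypothetical software model written for this
example (replace bits [msb:lsb] of dst with the low bits of src); it is an
assumption about the instruction's effect, not a copy of the LoongArch
definition, and it assumes 0 < imm1 < 64 and lsb <= msb <= 63.

```cpp
#include <cstdint>

// The expression shape from the commit message, with imm2 == (1 << imm1) - 1.
std::uint64_t ior_shift_and(std::uint64_t a, std::uint64_t b, unsigned imm1)
{
  const std::uint64_t imm2 = (std::uint64_t{1} << imm1) - 1;
  return (a << imm1) | (b & imm2);
}

// Hypothetical model of a bit-field insert: replace bits [msb:lsb] of dst
// with the low (msb - lsb + 1) bits of src, leaving other bits untouched.
std::uint64_t bit_insert(std::uint64_t dst, std::uint64_t src,
                         unsigned msb, unsigned lsb)
{
  const unsigned width = msb - lsb + 1;
  const std::uint64_t field =
    width == 64 ? ~std::uint64_t{0} : (std::uint64_t{1} << width) - 1;
  return (dst & ~(field << lsb)) | ((src & field) << lsb);
}
```

Under this model, ior_shift_and(a, b, imm1) equals
bit_insert(b, a, 63, imm1): the low imm1 bits of b are kept and the low
bits of a fill everything above them.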
gcc/ChangeLog:
* config/loongarch/loongarch.md
(*bstrins_w_for_ior_ashift_and_extend): New template.
(*bstrins_d_for_ior_ashift_and): New template.
* config/loongarch/predicates.md (const_uimm63_operand): New
predicate.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/bstrins-5.c: New test.
* gcc.target/loongarch/bstrins-6.c: New test.
zhaozhou [Fri, 14 Nov 2025 03:18:46 +0000 (11:18 +0800)]
LoongArch: Optimize V4SImode vec_construct for load index length of two.
In V4SImode, a vec_construct with the load index {0, 1, 0, 1} now
uses vldrepl.d, and a vec_construct with the load index {0, 1, 0, 0} uses
vldrepl.d and vshuf4i, reducing the use of scalar loads and vinsgr2vr.
Kees Cook [Fri, 21 Nov 2025 18:24:34 +0000 (10:24 -0800)]
aarch64: Extract aarch64_indirect_branch_asm for sibcall codegen
Extract indirect branch assembly generation into a new function
aarch64_indirect_branch_asm, paralleling the existing
aarch64_indirect_call_asm function. Replace the open-coded versions in
the sibcall patterns (*sibcall_insn and *sibcall_value_insn) so there
is a common helper for indirect branches where things like SLS mitigation
need to be handled.
gcc/ChangeLog:
* config/aarch64/aarch64-protos.h (aarch64_indirect_branch_asm):
Declare.
* config/aarch64/aarch64.cc (aarch64_indirect_branch_asm): New
function to generate indirect branch with SLS barrier.
* config/aarch64/aarch64.md (*sibcall_insn): Use
aarch64_indirect_branch_asm.
(*sibcall_value_insn): Likewise.
Daniele Sahebi [Wed, 19 Nov 2025 16:03:05 +0000 (17:03 +0100)]
c++: fix ICE with consteval functions in template decls [PR122658]
Currently, build_over_call calls build_cplus_new in template decls, generating
a TARGET_EXPR that it then passes to fold_non_dependent_expr, which ends up
calling tsubst_expr, and since tsubst_expr doesn't handle TARGET_EXPRs, it ICEs.
Since there is no way for this code path to be executed without causing an
ICE, I believe it can be removed.
PR c++/122658
gcc/cp/ChangeLog:
* call.cc (build_over_call): Don't call build_cplus_new in
template declarations.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/consteval42.C: New test.
Co-authored-by: Jakub Jelinek <jakub@redhat.com> Signed-off-by: Daniele Sahebi <daniele@mkryss.me> Reviewed-by: Marek Polacek <polacek@redhat.com> Reviewed-by: Patrick Palka <ppalka@redhat.com>
[PR118358, LRA]: Decrease pressure after issuing input reload insns
LRA can generate a sequence of reload insns for one input operand using
intermediate pseudos. Register pressure is higher when the reload insn
for another input operand is placed before the sequence than when it is
placed after the sequence. The problem report reveals a case where
several such sequences increase the pressure for input reload insns
beyond the available registers, which results in LRA cycling.
gcc/ChangeLog:
PR target/118358
* lra-constraints.cc (curr_insn_transform): Move insn reloading
constant into a register right before insn using it.
Jonathan Wakely [Wed, 19 Nov 2025 19:04:05 +0000 (19:04 +0000)]
libstdc++: Implement LWG 4406 and LWG 3424 for std::optional and std::expected
This adjusts the return statements of optional::value_or and
expected::value_or to not perform explicit conversions, so that the
actual conversion performed matches the requirements expressed in the
Mandates: elements (LWG 4406).
Also adjust the return types to remove cv-qualifiers (LWG 3424).
libstdc++-v3/ChangeLog:
* include/std/expected (expected::value_or): Use remove_cv_t for
the return type. Do not use static_cast for return statement.
Adjust static_assert conditions to match return statements.
* include/std/optional (optional::value_or): Likewise.
(optional<T&>::value_or): Likewise.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Yuao Ma [Sun, 2 Nov 2025 07:39:38 +0000 (15:39 +0800)]
libstdc++: Implement P3223R2 Making std::istream::ignore less surprising
libstdc++-v3/ChangeLog:
* include/std/istream (ignore): Add an overload for char.
* testsuite/27_io/basic_istream/ignore/char/93672.cc: Adjust
expected behaviour for C++26 mode.
* testsuite/27_io/basic_istream/ignore/char/4.cc: New test.
Co-authored-by: Jonathan Wakely <jwakely@redhat.com>
Jonathan Wakely [Fri, 21 Nov 2025 14:33:29 +0000 (14:33 +0000)]
libstdc++: Update some old docs about predefined feature macros
libstdc++-v3/ChangeLog:
* doc/xml/faq.xml: Refresh information on _GNU_SOURCE and
_XOPEN_SOURCE being predefined.
* doc/xml/manual/internals.xml: Remove outdated paragraph about
_POSIX_SOURCE in libstdc++ source files.
* doc/html/*: Regenerate.
Jakub Jelinek [Fri, 21 Nov 2025 15:25:58 +0000 (16:25 +0100)]
libcody: Make it buildable by C++11 to C++26
The following builds with -std=c++11 and c++14 and c++17 and c++20 and c++23
and c++26.
I see the u8 string literals are mixed e.g. with strerror, so in
-fexec-charset=IBM1047 there will still be garbage, so I am not 100% sure if
the u8 literals everywhere are worth it either.
2025-11-21 Jakub Jelinek <jakub@redhat.com>
* cody.hh (S2C): For __cpp_char8_t >= 201811 use char8_t instead of
char in argument type.
(MessageBuffer::Space): Revert 2025-11-15 change.
(MessageBuffer::Append): For __cpp_char8_t >= 201811 add overload
with char8_t const * type of first argument.
(Packet::Packet): Similarly for first argument.
* client.cc (CommunicationError, Client::ProcessResponse,
Client::Connect, ConnectResponse, PathnameResponse, OKResponse,
IncludeTranslateResponse): Cast u8 string literals to (const char *)
where needed.
* server.cc (Server::ProcessRequests, ConnectRequest): Likewise.
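The approach in this ChangeLog, a char8_t overload guarded by
__cpp_char8_t plus casts back to const char *, can be sketched as
follows. The names as_chars and is_ok are hypothetical and do not appear
in libcody; the sketch only shows how one source file can accept u8
literals both before and after C++20.

```cpp
#include <cstring>

// Hypothetical helper, not part of libcody: view a literal as plain chars.
inline const char *as_chars(const char *s) { return s; }

#if __cpp_char8_t >= 201811
// With char8_t enabled, u8"..." has type const char8_t[], so a second
// overload casts it back to const char * for APIs that take plain chars.
inline const char *as_chars(const char8_t *s)
{
  return reinterpret_cast<const char *>(s);
}
#endif

// Compare a received word against a u8 literal in every language mode.
inline bool is_ok(const char *word)
{
  return std::strcmp(word, as_chars(u8"OK")) == 0;
}
```

In modes without char8_t the u8 literal is already an array of char and
the first overload is chosen, so no cast takes place.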
Richard Biener [Fri, 21 Nov 2025 11:14:46 +0000 (12:14 +0100)]
Fix OMP SIMD clone mask register and query
The following removes the confusion around num_mask_args that was
added to properly "guess" the number of mask elements in an AVX512
mask that's just represented as int. The actual mistake lies in
the mixup of 'ncopies' which is used to track the number of
OMP SIMD calls to be emitted rather than the number of input
vectors. So this reverts the earlier r16-5374-g5c2fdfc24e343c,
uses the proper 'ncopies' for loop mask record/query and adjusts
the guessing of the SIMD arg mask elements.
PR tree-optimization/122762
PR tree-optimization/122736
PR tree-optimization/122790
* cgraph.h (cgraph_simd_clone_arg::linear_step): Document
use for SIMD_CLONE_ARG_TYPE_MASK.
* omp-simd-clone.cc (simd_clone_adjust_argument_types):
Record the number of mask arguments in linear_step if
mask_mode is not VOIDmode.
* tree-vect-stmts.cc (vectorizable_simd_clone_call):
Remove num_mask_args computation, use a proper ncopies
to query/register loop masks, use linear_step for the
number of mask arguments when determining the number of
mask elements in a mask argument.
Richard Biener [Fri, 21 Nov 2025 09:32:12 +0000 (10:32 +0100)]
tree-optimization/122778 - missed loop masking in OMP SIMD call handling
For AVX512 style masking we fail to apply loop masking to a conditional
OMP SIMD call.
PR tree-optimization/122778
* tree-vect-stmts.cc (vectorizable_simd_clone_call): Honor
a loop mask when passing the conditional mask with AVX512
style masking.
* gcc.dg/vect/vect-simd-clone-22.c: New testcase.
* gcc.dg/vect/vect-simd-clone-22a.c: Likewise.
Marek Polacek [Thu, 20 Nov 2025 18:57:43 +0000 (13:57 -0500)]
c++: make __reference_*_from_temporary honor access [PR120529]
This PR reports that our __reference_*_from_temporary built-ins ignore access
control. The reason is that we only check if implicit_conversion
works, but not if the conversion can actually be performed, via
convert_like.
Jakub Jelinek [Fri, 21 Nov 2025 13:17:01 +0000 (14:17 +0100)]
c++: Fix up build_data_member_initialization [PR121445]
The following testcase ICEs, because the constexpr ctor in C++14
or later doesn't contain any member initializers and so the
massage_constexpr_body -> build_constexpr_constructor_member_initializers
-> build_data_member_initialization member initialization discovery
looks at the ctor body instead. And while it has various
cases where it punts, including COMPONENT_REF with a VAR_DECL as first
operand on lhs of INIT_EXPR, here there is COMPONENT_REF with
several COMPONENT_REFs and VAR_DECL only inside the innermost.
The following patch makes sure we punt on those as well, instead of
blindly assuming it is anonymous union member initializer or asserting
it is a vtable store.
An alternative to this would be some flag on the INIT_EXPRs created
by perform_member_init and let build_data_member_initialization inspect
only INIT_EXPRs with that flag set.
2025-11-21 Jakub Jelinek <jakub@redhat.com>
PR c++/121445
* constexpr.cc (build_data_member_initialization): Just return
false if member is COMPONENT_REF of COMPONENT_REF with
VAR_P get_base_address.
PR target/122275
* config/i386/32/dfp-machine.h (DFP_GET_ROUNDMODE): Change `_frnd_orig` to
`unsigned short` for x87 control word.
(DFP_SET_ROUNDMODE): Manipulate the x87 control word as `unsigned short`,
and manipulate the MXCSR as `unsigned int`.
As mentioned in the PR, the COND_SH{L,R} internal fns are expanded without
fallback, their expansion must succeed, and furthermore they don't
differentiate between scalar and vector shift counts, so again both have
to be supported. That is the case of the {ashl,lshr,ashr}v*[hsd]i
patterns which use nonimmediate_or_const_vec_dup_operand predicate for
the shift count, so if the argument isn't const vec dup, it can be always
legitimized by loading into a vector register.
This is not the case of the QImode element conditional vector shifts,
there is no fallback for those and we emit individual element shifts
in that case when not conditional and shift count is not a constant.
So, I'm afraid we can't announce such an expander because then the
vectorizer etc. count on it being fully available.
As I've tried to show in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122598#c9
even without this pattern we can sometimes emit
vgf2p8affineqb $0, .LC0(%rip), %ymm0, %ymm0{%k1}
etc. instructions.
In the testcases, the kernels scheduled on queues 11, 12, 13, 14 have
data dependencies on, respectively, 'b', 'c', 'd', and 'e', as they
write to them.
However, they also have a data dependency on 'a' and 'N', as they read
those.
Previously, the testcases exited 'a' on queue 10 and 'N' on queue 15,
meaning that it was possible for the aforementioned kernels to execute
and to have 'a' and 'N' pulled out from under their feet.
This patch adds waits for each of the kernels onto queue 10 before
freeing 'a', guaranteeing that 'a' outlives the kernels, and the same on
'N'.
libgomp/ChangeLog:
* testsuite/libgomp.oacc-c-c++-common/data-2-lib.c (explanatory
header): Fix typo.
(main): Insert waits on kernels reading 'a' into queue 10 before
exiting 'a', and waits on kernels reading 'N' into queue 15
before exiting 'N'.
* testsuite/libgomp.oacc-c-c++-common/data-2.c: Ditto.