Ben Wu [Tue, 21 Apr 2026 00:08:49 +0000 (20:08 -0400)]
c++: revert fix for PR41127 [PR118374]
Previously, we did not parse definitely in cp_parser_enum_specifier
after seeing CPP_COLON, since we allowed for bitfield widths to follow
"enum identifier :" in member-declarations. However, ISO says that in
such a situation, the colon should be parsed as an enum-base
([dcl.enum]/a), which means bitfield widths are not allowed. This
patch reverts the changes which allowed for bitfield widths, since
parsing definitely improves diagnostics for errant underlying types.
This reverts SVN r151246.
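
As a rough illustration of the member-declaration forms involved (a sketch
only; the names E, S and bitfields_round_trip are hypothetical, and the
commented-out member is the "enum identifier :" form that this message
describes as now being parsed as an enum-base):

```cpp
// Hypothetical names throughout; this is not the testcase from the patch.
enum E { e0, e1, e2 };

struct S {
  E named : 4;        // named bit-field of enum type, no enum-key
  enum E tagged : 4;  // enum-key, then a declarator name, then a width
  E : 4;              // unnamed bit-field without the enum-key
  // enum E : 4;      // "enum identifier :" -> the colon starts an enum-base,
  //                  // so a bit-field width here is expected to be an error
};

// Stores a value in each named bit-field and reads it back.
inline bool bitfields_round_trip() {
  S s{};
  s.named = e2;
  s.tagged = e1;
  return s.named == e2 && s.tagged == e1;
}
```

Only the form where the colon directly follows "enum identifier" is affected;
the other three members do not put the colon in that position.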
PR c++/118374
PR c++/41127
gcc/cp/ChangeLog:
* parser.cc (cp_parser_enum_specifier): Parse definitely
before cp_parser_type_specifier_seq.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/enum1.C: Update test.
* g++.dg/parse/enum5.C: Expect error with bitfield width
and enum-key in member.
* g++.dg/cpp0x/enum45.C: New test.
c++: Add support for [[gnu::trivial_abi]] attribute [PR107187]
Implement the trivial_abi attribute for GCC to fix ABI compatibility
issues with Clang. Currently, GCC silently ignores this attribute,
causing call convention mismatches when linking GCC-compiled code with
Clang-compiled object files that use trivial_abi types.
This attribute allows types to be treated as trivial for ABI purposes,
enabling them to be passed in registers instead of by invisible reference. The
attribute is supported with `__attribute__((trivial_abi))` and
`[[clang::trivial_abi]]` spellings.
PR c++/107187
gcc/cp/ChangeLog:
* cp-tree.h (has_trivial_abi_attribute): New function.
(validate_trivial_abi_attribute): Declare.
(classtype_has_non_deleted_copy_or_move_ctor): Declare.
(cxx_clang_attribute_table): Declare.
* tree.cc (handle_trivial_abi_attribute): New function.
(handle_gnu_trivial_abi_attribute): New function.
(classtype_has_trivial_abi): New function.
(validate_trivial_abi_attribute): New function.
(cxx_gnu_attributes): Add trivial_abi entry.
(cxx_clang_attributes): New table for [[clang::trivial_abi]].
* class.cc (finish_struct_bits): Skip BLKmode for types with
trivial_abi attribute.
(classtype_has_non_deleted_copy_or_move_ctor): New function.
(finish_struct_1): Call validate_trivial_abi_attribute before
finish_struct_bits.
* cp-objcp-common.h (cp_objcp_attribute_table): Register
cxx_clang_attribute_table.
* decl.cc (store_parm_decls): Register cleanups for trivial_abi
parameters.
Jason Merrill [Thu, 23 Apr 2026 13:20:57 +0000 (09:20 -0400)]
c++: fix typo in consteval, array, modules [PR124973]
Argh, I must have typoed when I realized that we wanted to check
ff_genericize here rather than !ff_only_non_odr. And didn't notice the
problem because I also forgot the -O in the testcase.
Philipp Tomsich [Thu, 23 Apr 2026 13:31:32 +0000 (07:31 -0600)]
RISC-V: Add SUBREG_PROMOTED annotation to min/max si3 expansion
The <bitmanip_optab>si3 expansion for smin/smax/umin/umax sign-extends
both inputs and then performs the DImode min/max, which returns one of
its inputs unchanged. The result is therefore always sign-extended,
but the missing SUBREG_PROMOTED annotation on the lowpart caused GCC
to emit a redundant sext.w.
Add the SUBREG_PROMOTED_VAR_P / SUBREG_PROMOTED_SET(SRP_SIGNED)
annotation, matching rotrsi3, rotlsi3, and other si3 expansions.
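
The reasoning can be checked with a small model in plain C++ (a sketch, not
the machine description; sext32, smin_si3_model, umin_si3_model and
already_extended are hypothetical helper names). Both inputs are
sign-extended, the 64-bit min returns one of them unchanged, so extending the
result again changes nothing:

```cpp
#include <algorithm>
#include <cstdint>

// Sign-extend the low 32 bits of x to 64 bits (the effect of sext.w).
inline std::int64_t sext32(std::int64_t x) {
  return static_cast<std::int32_t>(static_cast<std::uint32_t>(x));
}

// Model of the signed expansion: extend both inputs, then do a 64-bit min.
inline std::int64_t smin_si3_model(std::int32_t a, std::int32_t b) {
  std::int64_t wa = a, wb = b;  // sign-extended inputs
  return std::min(wa, wb);
}

// Model of the unsigned expansion: the inputs are still sign-extended,
// only the 64-bit comparison is unsigned.
inline std::int64_t umin_si3_model(std::int32_t a, std::int32_t b) {
  std::uint64_t wa = static_cast<std::uint64_t>(static_cast<std::int64_t>(a));
  std::uint64_t wb = static_cast<std::uint64_t>(static_cast<std::int64_t>(b));
  return static_cast<std::int64_t>(std::min(wa, wb));
}

// True if r already equals the sign extension of its low 32 bits,
// i.e. a further sext.w would be a no-op.
inline bool already_extended(std::int64_t r) {
  return r == sext32(r);
}
```

The same argument applies to max, since it also returns one of its inputs
unchanged.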
gcc/ChangeLog:
* config/riscv/bitmanip.md (<bitmanip_optab>si3): Add
SUBREG_PROMOTED annotation to lowpart result.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/zbb-min-max-05.c: New test.
* gcc.target/riscv/zbb-min-max-06.c: New test.
* gcc.target/riscv/zbb-min-max-07-run.c: New test.
Andrew Pinski [Wed, 22 Apr 2026 19:29:26 +0000 (13:29 -0600)]
[PR target/124029][RISC-V] Adjust cost of comparisons
> Given this is a relatively straightforward define_split, it is likely a good
> case for Austin to chase down.
Actually it is easier than that.
The middle-end has a costing mechanism for this already:
```
;; cmp: le, old cst: (const_int 268435455 [0xfffffff]) new cst: (const_int 268435456 [0x10000000])
;; old cst cost: 4, new cst cost: 4
```
You need to implement a COMPARE cost part of the riscv_rtx_costs like it is
done for aarch64_rtx_costs.
It won't be 100% exact because in riscv case there is no COMPARE instruction.
But at least it might be more about the costs of generating which constant and
all.
Marek Polacek [Wed, 22 Apr 2026 15:36:37 +0000 (11:36 -0400)]
c++/reflection: reflect on dependent class template [PR124926]
Here we issue a bogus error for
^^Cls<T>::template Inner
where Inner turns out to be a class type, but we created a SCOPE_REF
because we can't know in advance what it will substitute into, and
^^typename Cls<T>::template Inner
is invalid. The typename can only be used in
^^typename Cls<T>::template Inner<int>
We're taking a reflection so both types and non-types are valid, so
I think we shouldn't give the error for ^^, and take the reflection
of the TEMPLATE_DECL.
PR c++/124926
gcc/cp/ChangeLog:
* pt.cc (tsubst_qualified_id): Rename name_lookup_p parameter to
reflecting_p. Check !reflecting_p instead of name_lookup_p. Do
not give the "instantiation yields a type" error when reflecting_p
is true.
(tsubst_expr) <case REFLECT_EXPR>: Adjust the call to
tsubst_qualified_id.
and all other "(assembler options)" tests in gcc.misc-tests/options.exp.
This happens because my builds use something like
--with-as=/vol/gcc/bin/gas-2.46 instead of relying on a random bundled
version of gas. Therefore the configured assembler name doesn't end in
"as".
The assembler options check in options.exp (check_for_all_options) looks
for
" *as(\\.exe)? .*$as_pattern"
in the gcc -v output, with an empty as_pattern.
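
To see why such a name is missed, the pattern can be tried with std::regex (a
sketch; the two sample command lines in the usage note are made up for
illustration and are not real gcc -v output, and looks_like_assembler_line is
a hypothetical name):

```cpp
#include <regex>
#include <string>

// Mirrors the quoted check with an empty as_pattern: optional spaces,
// "as", an optional ".exe", one space, then anything.
inline bool looks_like_assembler_line(const std::string& line) {
  static const std::regex re(" *as(\\.exe)? .*");
  return std::regex_search(line, re);
}
```

A line such as " /usr/bin/as -v -o t.o t.s" matches, while
" /vol/gcc/bin/gas-2.46 -v -o t.o t.s" does not, because the "as" in
"gas-2.46" is followed by "-" rather than by a space.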
While gcc was configured with --with-gnu-as, the gcc -v output starts
with
* include/bits/indirect.h (indirect::operator==): Adjust
noexcept specification.
* testsuite/std/memory/indirect/relops.cc: New test for noexcept
specification.
Reviewed-by: Jonathan Wakely <jwakely@redhat.com> Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
Tomasz Kamiński [Tue, 21 Apr 2026 12:34:40 +0000 (14:34 +0200)]
libstdc++: Implement __integral_constant_like in terms of __constexpr_wrapper_like.
This implements LWG4486. integral-constant-like and constexpr-wrapper-like
exposition-only concept duplication.
libstdc++-v3/ChangeLog:
* include/bits/simd_details.h (simd::__constexpr_wrapper_like):
Move to...
* include/std/concepts (std::__constexpr_wrapper_like): Moved
from bits/simd_details.h.
* include/std/span (std::__integral_constant_like): Define in
terms of __constexpr_wrapper_like.
* testsuite/std/simd/traits_impl.cc: Added using declaration
for std::__constexpr_wrapper_like.
Reviewed-by: Jonathan Wakely <jwakely@redhat.com> Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
changed the x86_cse pass to also remove redundant TLS calls. Remove the
SSE2 check in x86_cse::gate so that redundant TLS calls are removed when
SSE is disabled.
gcc/
PR target/124994
* config/i386/i386-features.cc (x86_cse::gate): Drop TARGET_SSE2.
gcc/testsuite/
PR target/124994
* gcc.target/i386/pr124994.c: New test.
From: "Bohan Lei"<garthlei@linux.alibaba.com>
Date: Thu, Jan 8, 2026, 10:49
Subject: [PATCH] RISC-V: Remove redundant CALL_P check
To: <gcc-patches@gcc.gnu.org> Cc: <juzhe.zhong@rivai.ai>, <pan2.li@intel.com>, "Bohan Lei"<garthlei@linux.alibaba.com>
Since we are using `reg_set_p` to check VXRM definition, the `CALL_P`
check has become redundant. VXRM is marked as call-used in riscv.h, and
`reg_set_p` in `vxrm_unknown_p` should always return true when a call is
encountered.
Jason Merrill [Wed, 22 Apr 2026 18:53:14 +0000 (14:53 -0400)]
c++: consteval, array, modules [PR124973]
Here the consteval holder constructor calls the defaulted element_array
constructor, which uses a VEC_INIT_EXPR to call the defaulted element
constructor.
When we read in the holder constructor, we need to clone it, so we call
finish_function, which calls cp_fold_function_non_odr_use, which tries to
constant-evaluate the call to the element_array constructor. This
eventually wants to evaluate the VEC_INIT_EXPR, which wants to call the
element constructor (complete object clone). But we haven't cloned the
element constructor yet, so mark_used tries to synthesize it again, which
breaks because the constructor is already defined, just not cloned yet.
We should have cloned the element constructor first, but we didn't know that
the element_array constructor depends on it because VEC_INIT_EXPR doesn't
express that; build_vec_init_expr calls build_vec_init_elt and then throws
it away. Perhaps we want to add the elt_init as an additional operand that
is used to express dependencies, but ignored in expansion?
It would also be nice not to repeat all the finish_function passes when
loading a function from a module; we already did
cp_fold_function_non_odr_use and such for this function before writing out
the module, doing it again is a waste of time.
But also, trying to constant-evaluate the element_array constructor is wrong
for _non_odr_use, it shouldn't be doing any optimization folding.
Furthermore, since the TARGET_EXPR is wrapped in an INIT_EXPR, we should
never have tried to fold it by itself, before cp_genericize_init_expr has a
chance to elide it. So let's only do that folding when ff_genericize, like
the other TARGET_EXPR transformations. This is a much simpler fix for this
testcase.
While we're at it, let's also suppress the other flag_no_inline-conditional
folding when ff_only_non_odr.
PR c++/124973
PR c++/120502
PR c++/120005
gcc/cp/ChangeLog:
* cp-gimplify.cc (cp_fold_r) <case TARGET_EXPR>: Only
do optimization folding when ff_genericize.
(cp_fold) <case CALL_EXPR>: Don't do
optimization folding when ff_only_non_odr.
gcc/testsuite/ChangeLog:
* g++.dg/modules/consteval-1_a.C: New test.
* g++.dg/modules/consteval-1_b.C: New test.
Currently, trying to do "make install-html" after a build results in an
error:
$ make install-html
⋮
Doing install-html in gcc
make[2]: Entering directory '/tmp/build/gcc'
make[2]: *** No rule to make target '/tmp/build/gcc/HTML/gcc-16.0.1/ga68-coding-guidelines.info', needed by 'algol68.install-html'. Stop.
make[2]: Leaving directory '/tmp/build/gcc'
make[1]: *** [Makefile:5054: install-html-gcc] Error 1
make[1]: Leaving directory '/tmp/build'
make: *** [Makefile:1929: do-install-html] Error 2
The problem is a typo in a dependency of the algol68.install-html rule.
Fix it by removing the ".info" suffix.
With this change, "make install-html" succeeds but ga68-internals and
ga68-coding-guidelines don't get installed. Assuming this is
unintentional, extend the for loop to also install them.
gcc/algol68/ChangeLog:
* Make-lang.in (algol68.install-html): Fix
ga68-coding-guidelines dependency. Install all dependencies.
Andrew Pinski [Wed, 15 Apr 2026 20:39:51 +0000 (13:39 -0700)]
cfghooks: Pass data to callback function of make_forwarder_block
This makes a cleanup that is way overdue and should have been done
years ago. Instead of setting some global/static variables for the
callback function to check here, we pass down the data to the callback
function. This reduces the number of global variables (which should help
with the Parallel GCC project). Plus, since mfb_keep_just was exported outside
of cfgloopmanip.cc (it was used in tree-ssa-threadupdate.cc), it reduces
what is shared between files.
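
The general shape of the change, sketched outside of GCC (Edge, EdgePredicate,
count_selected and redirect_all_but are hypothetical stand-ins, not the real
cfghooks interfaces): the predicate receives a data pointer from its caller
instead of reading a file-scope variable.

```cpp
#include <vector>

// Hypothetical stand-in for an edge; not GCC's edge type.
struct Edge { int id; };

// Predicate type: receives the edge plus caller-supplied data.
using EdgePredicate = bool (*)(const Edge&, void* data);

// Counts the edges the predicate selects, forwarding `data` unchanged.
inline int count_selected(const std::vector<Edge>& edges,
                          EdgePredicate pred, void* data) {
  int n = 0;
  for (const Edge& e : edges)
    if (pred(e, data))
      ++n;
  return n;
}

// Selects every edge except one; the edge to skip arrives through `data`
// rather than through a global set up before the call.
inline bool redirect_all_but(const Edge& e, void* data) {
  const Edge* keep = static_cast<const Edge*>(data);
  return e.id != keep->id;
}
```

A caller keeps the edge in a local variable and passes its address, for
example count_selected (edges, redirect_all_but, &keep), so nothing has to be
exported between files.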
I found this useful when working on PR 123113 as I needed a new callback
function.
Bootstrapped and tested on x86_64-linux-gnu.
gcc/ChangeLog:
* cfghooks.cc (make_forwarder_block): New data argument,
pass it down to redirect_edge_p.
* cfghooks.h (make_forwarder_block): Add void* argument.
* cfgloop.cc (mfb_reis_set): Remove.
(mfb_redirect_edges_in_set): Add new data argument.
Use it instead of mfb_reis_set.
(form_subloop): Create a local variable instead of
mfb_reis_set. Update call to make_forwarder_block.
(merge_latch_edges): Likewise.
* cfgloopmanip.cc (mfb_kj_edge): Remove.
(mfb_keep_just): Add new data argument.
Use it instead of mfb_kj_edge.
(create_preheader): Use local variable instead of
mfb_kj_edge. Update call to make_forwarder_block.
* cfgloopmanip.h (mfb_keep_just): Add void* argument.
* tree-cfgcleanup.cc (mfb_keep_latches): Add unused void* argument.
(cleanup_tree_cfg_noloop): Update call to make_forwarder_block.
* tree-ssa-threadupdate.cc
(fwd_jt_path_registry::thread_through_loop_header): Use local
variable instead of mfb_kj_edge. Update call to make_forwarder_block.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Andrew Pinski [Wed, 15 Apr 2026 00:57:49 +0000 (17:57 -0700)]
cfghooks: Remove new_bb_cbk callback from make_forwarder_block
This callback seems to be unused since it was allowed to be NULL
in r0-78960-g89f8f30f356532 (19 years ago), so let's just remove it.
This is also the first step in changing the callback to make_forwarder_block.
Alice Carlotti [Tue, 21 Apr 2026 18:53:57 +0000 (19:53 +0100)]
aarch64 testsuite: Merge exts_sve2 into exts
Now that we support enabling +sme without +sve2, we no longer need to
include armv9-a when checking assembler support for SME extensions.
Merge exts_sve2 back into exts, and remove the separate handling for
exts_sve2. This is a partial revert of r16-2660-g9793ffce933234.
gcc/testsuite/ChangeLog:
* lib/target-supports.exp: Merge exts_sve2 handling into exts.
Alice Carlotti [Tue, 21 Apr 2026 18:31:22 +0000 (19:31 +0100)]
aarch64 testsuite: Fix gating of sme-lutv2 asm tests
These tests were configured to try assembling whenever the assembler
supports sme2. Add dg-do directives to restrict this to assemblers that
support sme-lutv2 (and otherwise just compile the test).
aarch64: PR124908 Fix ICE in svld1rq fold with -msve-vector-bits=128
svld1rq is a replicated-quadword load: it loads 16 bytes and
replicates them to fill the SVE register. When -msve-vector-bits=128
the instruction can be folded to a normal load.
The GIMPLE fold for svld1rq transforms the intrinsic into a 128-bit
memory load followed by a VEC_PERM_EXPR that replicates the loaded
value. When VL == 128, the VEC_PERM_EXPR becomes an identity
permutation. The checking assertion that validates the permutation
(can_vec_perm_const_p) fails for this degenerate case because the
vec_perm_const hook does not recognise the cross-mode identity
permutation (e.g. V16QI -> VNx16QI).
Fix by detecting when the SVE vector has the same number of elements as
the 128-bit quadword (known_eq (lhs_len, source_nelts)) and emitting a
VIEW_CONVERT_EXPR instead of a VEC_PERM_EXPR.
Bootstrapped and tested on aarch64-none-linux-gnu.
PR target/124908
* config/aarch64/aarch64-sve-builtins-base.cc
(svld1rq_impl::fold): When the SVE vector length equals the
quadword width, emit VIEW_CONVERT_EXPR instead of VEC_PERM_EXPR.
gcc/testsuite/ChangeLog:
PR target/124908
* gcc.target/aarch64/sve/acle/general/ld1rq_2.c: New test.
Jakub Jelinek [Wed, 22 Apr 2026 13:44:28 +0000 (15:44 +0200)]
Update crontab and git_update_version.py
2026-04-22 Jakub Jelinek <jakub@redhat.com>
maintainer-scripts/
* crontab: Snapshots from trunk are now GCC 17 related.
Add GCC 16 snapshots from the respective branch.
contrib/
* gcc-changelog/git_update_version.py (active_refs): Add
releases/gcc-16.
Jakub Jelinek [Wed, 22 Apr 2026 13:03:48 +0000 (15:03 +0200)]
c++, libstdc++: Bump __cpp_impl_reflection and __cpp_lib_reflection
Both __cpp_impl_reflection and __cpp_lib_reflection were increased from
202506L to 202603L post Croydon, I assume to show that P3795R2 (maybe some
issues too) have been implemented.
Now, we do implement P3795R2 except for the is_applicable_type,
is_nothrow_applicable_type and apply_result metafunctions, but Jonathan says
there is agreement in LWG that to test for availability of those one should
test __cpp_lib_reflection >= 202603L && __cpp_lib_apply >= 202603L.
So, this patch bumps both FTMs.
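
For user code, the combined check described above could look roughly like this
(a sketch; the two constant names are hypothetical, only the macro names and
values come from the text):

```cpp
// Pull in the library feature-test macros when <version> is available.
#if defined(__has_include)
#  if __has_include(<version>)
#    include <version>
#  endif
#endif

// True when the library reports the bumped reflection value.
#if defined(__cpp_lib_reflection) && __cpp_lib_reflection >= 202603L
constexpr bool has_bumped_reflection = true;
#else
constexpr bool has_bumped_reflection = false;
#endif

// True only when both macros are at the bumped value, which is the test
// suggested for is_applicable_type, is_nothrow_applicable_type and
// apply_result.
#if defined(__cpp_lib_reflection) && __cpp_lib_reflection >= 202603L \
    && defined(__cpp_lib_apply) && __cpp_lib_apply >= 202603L
constexpr bool has_applicable_metafunctions = true;
#else
constexpr bool has_applicable_metafunctions = false;
#endif
```

On a toolchain that defines neither macro, both constants are simply false.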
2026-04-22 Jakub Jelinek <jakub@redhat.com>
gcc/c-family/
* c-cppbuiltin.cc (c_cpp_builtins): Bump __cpp_impl_reflection value
from 202506L to 202603L.
gcc/testsuite/
* g++.dg/DRs/dr2581-2.C: Adjust for __cpp_impl_reflection bump from
202506L to 202603L.
* g++.dg/reflect/feat1.C: Likewise. Also adjust for
__cpp_lib_reflection bump from 202506L to 202603L.
* g++.dg/reflect/feat2.C: Likewise.
* g++.dg/reflect/feat3.C: Likewise.
libstdc++-v3/
* include/bits/version.def (reflection): Bump 202506L to 202603L
for both v and in extra_cond.
* include/bits/version.h: Regenerate.
* include/std/meta: Compare __glibcxx_reflection against
202603L rather than 202506L.
* src/c++23/std.cc.in: Likewise.
Reviewed-by: Jason Merrill <jason@redhat.com> Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Richard Earnshaw [Mon, 20 Apr 2026 15:57:22 +0000 (16:57 +0100)]
arm: fix mov<mode>_vfp_<mode>16 for fp16-only devs [PR124933]
We should only be using VLD1/VST1 instructions when we have an
auto-inc address of some suitable form and only when we have
NEON_FP16INST. At other times we should just use VLDR/VSTR. This
ensures that the code will assemble on targets that lack Advanced
SIMD. Also correct the constraint so that we can use offset
addressing on platforms that also have Neon.
gcc/ChangeLog:
PR target/124933
* config/arm/constraints.md (Uj): Allow offset addressing for
all targets, only allow Neon addressing when we have both Neon
and FP16INST.
* config/arm/vfp.md (mov<mode>_vfp_<mode>16): Only use vld1/vst1
when the pattern needs address write-back.
gcc/testsuite/ChangeLog:
PR target/124933
* lib/target-supports.exp (v8_1m_main_fp_hard): New arm
architecture variant.
* gcc.target/arm/pr124933.c: New test.
* gcc.target/arm/armv8_2-fp16-move-1.c: Update expected output.
Tomasz Kamiński [Wed, 22 Apr 2026 07:23:37 +0000 (09:23 +0200)]
libstdc++: Accept data_handle_type by value in mdspan deduction guide.
This makes the deduction guide accepting data_handle_type, mapping
and accessor consistent with corresponding constructor.
Resolves LWG4511.
libstdc++-v3/ChangeLog:
* include/std/mdspan (mdspan): Remove reference from
_AccessorType::data_handle_type parameter of deduction
guide.
* testsuite/23_containers/mdspan/mdspan.cc: New test.
Reviewed-by: Jonathan Wakely <jwakely@redhat.com> Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
Jakub Jelinek [Wed, 22 Apr 2026 05:56:38 +0000 (07:56 +0200)]
gensupport: Fix // comment handling [PR124971]
The while (*templ != '\n' || *templ != '\0') ++templ; loop
is an endless loop due to a wrong logical operator, so clearly
we don't have any // comments in the new syntax *.md files that
are using it in the relevant parts of the patterns.
Fixed by using && instead.
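
A standalone sketch of the corrected loop (skip_line_comment is a hypothetical
wrapper name; only the loop condition comes from the text above):

```cpp
// Advances to the end of a "//" comment: stops at a newline or at the
// terminating NUL. With "||" the condition is always true, because a
// character cannot equal both '\n' and '\0' at once, so the loop would
// run off the end of the string; "&&" stops at whichever comes first.
inline const char* skip_line_comment(const char* templ) {
  while (*templ != '\n' && *templ != '\0')
    ++templ;
  return templ;
}
```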
2026-04-22 Jakub Jelinek <jakub@redhat.com>
PR middle-end/124971
* gensupport.cc (convert_syntax): Fix up // comment handling.
Reviewed-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
The last patch made me try also pointer arithmetics on incomplete
type. And here we ICE before actually diagnosing the error
because we try to dereference element_size.
Furthermore, TYPE_SIZE_UNIT (element_type) is in sizetype, so
we shouldn't use build_one_cst (size_type_node) for the void *
case, but size_one_node (aka size_int (1)).
2026-04-22 Jakub Jelinek <jakub@redhat.com>
PR sanitizer/124969
* c-typeck.cc (build_access_with_size_for_counted_by): Use
size_one_node instead of build_one_cst (size_type_node). Punt
if element_size is NULL_TREE.
Jakub Jelinek [Wed, 22 Apr 2026 05:52:25 +0000 (07:52 +0200)]
c-family: Fix ICE with counted_by attribute [PR124969]
The following valid testcase ICEs, because we try to use
TREE_CODE (NULL_TREE).
We document that counted_by is supported on pointers to void
and that it behaves like the GNU pointer to void arithmetics
extension in that case. build_access_with_size_for_counted_by
already uses 1 in that case as element_size.
The following patch fixes it, plus for error recovery punts
if it is a pointer to incomplete type other than void (pointer
arithmetics on such type will be diagnosed as error later on).
2026-04-22 Jakub Jelinek <jakub@redhat.com>
PR sanitizer/124969
* c-ubsan.cc (get_index_from_pointer_addr_expr): For
VOID_TYPE_P (pointee_type) use size_one_node instead of
TYPE_SIZE_UNIT (pointee_type) as pointee_size. Punt for
!COMPLETE_OR_VOID_TYPE_P (pointee_type). Formatting fix. Use
CONVERT_EXPR_P (x) instead of CONVERT_EXPR_CODE_P (TREE_CODE (x)).
Jakub Jelinek [Wed, 22 Apr 2026 05:48:13 +0000 (07:48 +0200)]
testsuite: Xfails for aarch64/sme/streaming_mode_1.c and aarch64/sme/za_state_[12].c [PR122483]
This patch xfails some dg-errors in the tests and adds two xfailed
dg-bogus, so that we can downgrade this PR to P2 and defer resolution
hopefully for GCC 17.
2026-04-21 Jakub Jelinek <jakub@redhat.com>
PR target/122483
* gcc.target/aarch64/sme/streaming_mode_1.c: Xfail errors for
sc_a, sc_c, sc_e, s_a, s_c, s_e and keyword_contradiction_1.
* gcc.target/aarch64/sme/za_state_1.c: Xfail errors for shared_a,
shared_c, shared_e, preserved_a, preserved_c, preserved_e and
keyword_conflict_1.
* gcc.target/aarch64/sme/za_state_2.c: Xfail errors for shared_b,
shared_d, shared_f and shared_h. Add xfailed dg-bogus for
extra diagnostics on shared_f and shared_h.
These test-cases got typo-corrected in r16-8316-g630b53cd4ff1c3, which exposed them failing for
cris-elf. Though, at one time they were passing for the
right reason, and bisection shows they'd start failing when
the CRIS target was CC0-converted, and for the same reason
the S/390 32-bit fails. See the pre-existing comment in the
test and also see the PR for details, including a suggested
plan how to fix the optimization pass, if someone is
interested in an "easy hack". Though admittedly, the missed
optimization doesn't affect many targets.
Regarding the target pattern selector expression, I went for
the simplest possible, even though I'm including CRIS in the
64-bit exception.
Jason Merrill [Tue, 21 Apr 2026 20:23:37 +0000 (16:23 -0400)]
c++: #include rewrite and installed compiler [PR123879]
In an installed compiler, the pathname in the expanded_location might be
canonicalized while the pathname in the cpp_dir is not, e.g. eloc.file is
".../include/c++/16.0.1/stdbit.h" while dir->name is
".../lib/gcc/x86_64-pc-linux-gnu/16.0.1/../../../../include/c++/16.0.1". So
let's use lrealpath to compare canonical paths for both. And filename_ncmp,
while we're at it.
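
The mismatch can be reproduced outside the compiler (a sketch; same_directory
is a hypothetical helper, the "/usr" prefix in the usage note is made up, and
std::filesystem's lexically_normal is used here only as a stand-in for
lrealpath, which additionally resolves symlinks on the real filesystem):

```cpp
#include <filesystem>
#include <string>

// Compares two directory names after lexical normalization, so that
// "a/b/../c" and "a/c" are treated as the same directory.
// Assumption: purely lexical; no filesystem access, no symlink resolution.
inline bool same_directory(const std::string& a, const std::string& b) {
  namespace fs = std::filesystem;
  return fs::path(a).lexically_normal() == fs::path(b).lexically_normal();
}
```

With a hypothetical install prefix,
"/usr/lib/gcc/x86_64-pc-linux-gnu/16.0.1/../../../../include/c++/16.0.1" and
"/usr/include/c++/16.0.1" compare equal this way, whereas a plain string
comparison of the two would fail.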
PR c++/123879
gcc/cp/ChangeLog:
* module.cc (maybe_translate_include): Use lrealpath in check
whether we're including something in the same directory.
allocator_traits<>::allocate_at_least has taken its allocator
by value, incorrectly. This patch makes it take its allocator
by reference, as specified.
libstdc++-v3/ChangeLog:
* include/bits/alloc_traits.h (allocate_at_least):
Take allocator argument by reference, per Standard.
`struct sock_fprog` is not provided by glibc, but rather by the linux
headers (`#include <linux/filter.h>`). It seems that glibc, due to an
implementation detail, internally includes `<linux/filter.h>` somewhere, while
other C libs (e.g. musl) do not, which previously caused build failures
and led to disabling `struct sock_fprog` on non-glibc Linux systems.
This adds the missing include and provides it again for all Linux
systems regardless of C lib.
libsanitizer/ChangeLog:
PR sanitizer/124248
* sanitizer_common/sanitizer_platform_limits_posix.cpp (struct_sock_fprog_sz):
Make available on non-glibc Linux systems.
Jonathan Wakely [Tue, 21 Apr 2026 20:04:23 +0000 (21:04 +0100)]
libstdc++: Fix comment on std::print helper __open_terminal
The comment describes an earlier version of the function which I
experimented with when implementing the std::print feature. Update it to
describe the current semantics.
Jakub Jelinek [Tue, 21 Apr 2026 17:32:22 +0000 (19:32 +0200)]
c++: Parse splice-type-specifier in cp_parser_mem_initializer_id [PR124944]
The grammar has:
mem-initializer-id:
class-or-decltype
identifier
class-or-decltype:
nested-name-specifier[opt] type-name
nested-name-specifier template simple-template-id
computed-type-specifier
computed-type-specifier:
decltype-specifier
pack-index-specifier
splice-type-specifier
but we weren't parsing splice-type-specifier in there, just in
cp_parser_base_specifier. So, the following patch defers
similarly to cp_parser_base_specifier the typename diagnostics
because we don't know whether typename [: will be valid or not
- it could be splice-scope-specifier and in that case typename
is not valid, or splice-type-specifier, in which case it is valid
but not required. And calls cp_parser_splice_type_specifier too
when neither nested-name-specifier nor :: appears.
Jakub Jelinek [Tue, 21 Apr 2026 16:31:36 +0000 (18:31 +0200)]
c++: Fix up iterating expansion stmt b and e var handling [PR124927]
The following testcase ICEs, because the TARGET_EXPR it had has VOID_TYPE
TARGET_EXPR_INITIAL and force_target_expr obviously ICEs when trying to
create a VOID_TYPE TARGET_EXPR.
This patch fixes it by taking the cv_unqualified type of the TARGET_EXPRs
before we extract their TARGET_EXPR_INITIAL, so it does the desirable thing
of recreating the TARGET_EXPRs with cv unqualified types.
2026-04-21 Jakub Jelinek <jakub@redhat.com>
PR c++/124927
* pt.cc (finish_expansion_stmt): Compute types for force_target_expr
from b and e before extracting TARGET_EXPR_INITIAL from it.
Marek Polacek [Mon, 20 Apr 2026 19:55:50 +0000 (15:55 -0400)]
c++/reflection: bogus -Wmissing-field-initializers with <meta> [PR124950]
We emit -Wmissing-field-initializers warnings for code like
data_member_spec (^^int, { .name = "dms" })
which seems undesirable. We can initialize the members of
std::meta::data_member_options to suppress that warning (clang's <meta>
has these initializers too).
PR c++/124950
libstdc++-v3/ChangeLog:
* include/std/meta (std::meta::data_member_options): Initialize
alignment, bit_width, and annotations members.
Jonathan Wakely [Tue, 25 Nov 2025 14:29:50 +0000 (14:29 +0000)]
libstdc++: Use 32-bit platform wait type for OpenBSD and DragonFly [PR120527]
This defines __platform_wait_t as unsigned int for OpenBSD and
DragonFly. This means that std::semaphore will use unsigned int by
default, and so will benefit from more efficient wait/notify ops if we
start to use the OpenBSD futex(2) syscall or the DragonFly umtx(2)
syscalls. We don't currently use them, but if we start to in future, it
would be an ABI break to change __platform_wait_t later.
libstdc++-v3/ChangeLog:
PR libstdc++/120527
* include/bits/atomic_wait.h [__OpenBSD__ || __DragonFly]: Use
unsigned int for __platform_wait_t.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Jonathan Wakely [Fri, 5 Dec 2025 15:47:10 +0000 (15:47 +0000)]
libstdc++: Add platform wait functions for Darwin [PR120527]
Darwin has kernel support for this facility from 10.12 (macOS Sierra).
From 10.15 (macOS Catalina) 64bit quantities are supported.
When the library is built for 10.12+ both 32b and 64b quantities will be
supported by the DSO which means it can be installed on 10.12+ with support
for 64bit available when the installation is >= 10.15.
The header will only recognise 64b quantities when the deployment version
is >= 10.15.
If the library is built for <= 10.11, the support will be missing and attempts
to use it will result in link errors.
The platform wait type is unconditionally set to 32bits, since this is compatible
across supported OS editions.
PR libstdc++/120527
libstdc++-v3/ChangeLog:
* include/bits/atomic_wait.h:
* src/c++20/atomic.cc (__ulock_wait): Enable supported Darwin versions.
(__ulock_wake): Likewise.
(UL_COMPARE_AND_WAIT): New.
(UL_COMPARE_AND_WAIT64): New.
(ULF_WAKE_ALL): New.
(_GLIBCXX_HAVE_PLATFORM_WAIT): Enable for supported Darwin versions.
libstdc++: Implement P4012R1 while reverting P3844R2 (consteval simd broadcast)
P3844R2 added consteval conversion for value-preserving conversion from
constants. It had been approved by LEWG in Kona. Therefore, the current
implementation has the consteval broadcast constructor. In Croydon, LEWG
reversed the decision but changed the overload set to keep the design
space open for C++29.
This patch implements the removal of the consteval constructor and
changes the broadcast constructor according to P4012R1, to keep the
design space open.
libstdc++-v3/ChangeLog:
* include/bits/simd_details.h (__value_preserving_cast): Remove.
* include/bits/simd_mask.h (basic_mask): Replace plain 0 and 1
literals with cw<0> and cw<1>. Replace explicit basic_vec
construction from 0 and 1 with default init and broadcast from
_Up(1).
(_M_to_uint): Replace 1 with cw<1>.
* include/bits/simd_vec.h (basic_vec): Remove consteval
broadcast overload. Remove explicit broadcast from
non-value-preserving types.
* testsuite/std/simd/arithmetic.cc: Replace ill-formed integer
literals with explicit cast to T or use cw.
* testsuite/std/simd/mask.cc: Likewise.
* testsuite/std/simd/simd_alg.cc: Likewise.
* testsuite/std/simd/traits_common.cc: Adjust for resulting
traits changes.
* testsuite/std/simd/traits_math.cc: Likewise.
It turns out a union without an active member does not violate C++20 core
constant expression rules and r16-8748 was really just a workaround for a
front end bug. The actual underlying problem -- that the constexpr
evaluator treated an explicitly destroyed union member as still active --
has been fixed by r16-8767 which makes this workaround unnecessary for GCC.
Rather than remove the workaround, restrict it to Clang which seems to have
a similar bug making it still needed for e.g. the r16-8748 testcase.
PR c++/124910
libstdc++-v3/ChangeLog:
* include/std/optional (_Optional_payload_base::_M_destroy):
Restrict r16-8748 workaround to Clang, and adjust comment.
Jakub Jelinek [Tue, 21 Apr 2026 09:53:52 +0000 (11:53 +0200)]
bitintlower: Padding bit fixes, part 4 [PR123635]
As the following testcase shows, not clearing the padding bits after
signed MULT_EXPR (or signed division) is reasonable when overflow actually is
undefined behavior because then anything can happen. But when it is not
undefined behavior due to -fwrapv, we need to clear the padding bits
on targets which chose that behavior. It isn't only signed MULT_EXPR,
but also division because smallest negative / -1 overflows and in that
case the padding bits aren't correct for bitint_extended targets either.
2026-04-21 Jakub Jelinek <jakub@redhat.com>
PR middle-end/123635
* gimple-lower-bitint.cc (bitint_large_huge::lower_muldiv_stmt):
Extend the padding bits not just for unsigned MULT_EXPR but for any
TYPE_OVERFLOW_WRAPS MULT_EXPR and signed TYPE_OVERFLOW_WRAPS division.
Jakub Jelinek [Tue, 21 Apr 2026 09:49:23 +0000 (11:49 +0200)]
sccvn: Use build_bitint_type in another SCCVN spot [PR124941]
The following testcase ICEs on riscv.
tree-ssa-sccvn.cc (vn_walk_cb_data::push_partial_def) already uses
build_bitint_type instead of build_nonstandard_integer_type for larger
BITINT_TYPE types:
/* Make sure to interpret in a type that has a range covering the whole
access size. */
if (INTEGRAL_TYPE_P (vr->type) && maxsizei != TYPE_PRECISION (vr->type))
{
if (TREE_CODE (vr->type) == BITINT_TYPE
&& maxsizei > MAX_FIXED_MODE_SIZE)
type = build_bitint_type (maxsizei, TYPE_UNSIGNED (type));
else
type = build_nonstandard_integer_type (maxsizei, TYPE_UNSIGNED (type));
}
and the same change in vn_reference_lookup_3 fixes the ICE.
2026-04-21 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/124941
* tree-ssa-sccvn.cc (vn_reference_lookup_3): Use build_bitint_type
rather than build_nonstandard_integer_type for
maxsizei larger than MAX_FIXED_MODE_SIZE.
Paul Thomas [Tue, 21 Apr 2026 14:00:14 +0000 (15:00 +0100)]
Fortran: ICE due to allocatable component in hidden type [PR117077]
2026-03-19 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/117077
* trans-expr.cc (gfc_trans_scalar_assign): If the lhs and rhs
TYPE_MAIN_VARIANTs are not the same, convert the rhs to the lhs
type via a VIEW_CONVERT_EXPR.
gcc/testsuite/
PR fortran/117077
* gfortran.dg/pr117077.f90: New test.
commit 14cd2833b27a9700aa8679fd0a090e8ae7a5f44a added xfails, but
2025-12-05's Richard Biener <rguenther@suse.de>'s PR
tree-optimization/120939 patch to skip gcc.dg/torture/pr113026-1.c
when -ftracer had already taken care of them without XPASSes.
for gcc/testsuite/ChangeLog
PR tree-optimization/113524
* gcc.dg/torture/pr113026-1.c: Revert 2026-01-21's
XFAIL of bogus warning on various 32-bit targets.
Changes to arm_v8_3a_complex_neon options and to
vect_complex_add_{float,double} enabled these tests and caused them to
fail with the same failure mode as -half-float.c, namely, reassoc
makes ADD_ROT270 unrecognizable in fms_elemconjsnd after complex
lowering and dce's removal of the original complex assignments.
Gaius Mulley [Tue, 21 Apr 2026 01:56:39 +0000 (02:56 +0100)]
PR modula2/120189 Bugfix to documentation and fix prototypes in m2rts.h
This patch rewrites the Building a shared library section in the
gm2.texi. The new content addresses the default dynamic module
scaffold and also provides an example of C++ calling the m2 shared
library. Bootstrapped using lto on amd64.
gcc/ChangeLog:
PR modula2/120189
* doc/gm2.texi (Building a shared library): Rewrite.
d: Fix ICE in must_pass_in_stack_var_size_or_pad with D enums [PR123411]
An `enum : enum A` type caused the already computed underlying type size
of `enum A` to be overwritten with NULL_TREE. To fix, don't finish the
enum with layout_type unless we're handling the main variant type.
PR d/123411
gcc/d/ChangeLog:
* types.cc (TypeVisitor::visit (TypeEnum *)): Only call layout_type on
the TYPE_MAIN_VARIANT of the enum.
Jason Merrill [Mon, 20 Apr 2026 16:45:57 +0000 (12:45 -0400)]
c++: std::optional reset and constexpr [PR124910]
Constant evaluation didn't recognize that destroying _M_value made it no
longer the active member of the anonymous union, so we were treating the
result as containing an out-of-lifetime value. Instead we should treat the
union as no longer having an active member.
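A minimal sketch of the kind of code this affects, not the testcase from the PR. The helper name reset_then_reuse is hypothetical, and the compile-time check is only enabled when the library advertises constexpr std::optional through __cpp_lib_optional:

```cpp
#include <cassert>
#include <optional>

// Hypothetical helper, not taken from the PR: engage an optional, reset it,
// then engage it again and read the new value back.
#if defined(__cpp_lib_optional) && __cpp_lib_optional >= 202106L
constexpr
#else
inline
#endif
bool reset_then_reuse ()
{
  std::optional<int> o (1);
  o.reset ();            // ends the lifetime of the contained value
  if (o.has_value ())
    return false;
  o.emplace (2);         // starts the lifetime of a new contained value
  return *o == 2;
}

#if defined(__cpp_lib_optional) && __cpp_lib_optional >= 202106L
// Only evaluated at compile time when the library provides constexpr optional.
static_assert (reset_then_reuse ());
#endif
```

After reset () the optional holds no value, which is the state the constant evaluator should model as a union without an active member.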
PR c++/124910
gcc/cp/ChangeLog:
* constexpr.cc (cxx_eval_store_expression): Ending the
lifetime of the active member means no active member.
The switch to front-end lowering of AA literals regressed on 32bit SPARC
targets, as the runtime function `_d_assocarrayliteralTX()` returns a
`void*`, but the expression should be of type `struct { void* ptr; }`.
PR d/124157
gcc/d/ChangeLog:
* expr.cc (ExprVisitor::visit (AssocArrayLiteralExp *)): Return AA
constructor with memory returned by _d_assocarrayliteralTX().
Marek Polacek [Fri, 17 Apr 2026 16:45:53 +0000 (12:45 -0400)]
c++/reflection: support splices with CTAD [PR124706]
This PR points out that we don't handle a splice-type-spec that
designates a deducible template, which then serves as a placeholder
for CTAD. This is allowed by [dcl.type.simple]/3. This patch
fixes that problem by calling make_template_placeholder if we
get a deducible template.
c++, contracts: Account for lambda captures in pre/post [PR124648].
When we have lambda captures, they appear in the vars slot of a bind
expression at the outer operator() body.
We need these to be visible for any pre or post conditions that might
use them, therefore (when a lambda has captures) nest the application
of contract pre and post conditions within the lambda outer bind
expression.
PR c++/124648
gcc/cp/ChangeLog:
* contracts.cc (maybe_apply_function_contracts): Nest pre and
post conditions inside the outer bind expression of a lambda
with captures.
gcc/testsuite/ChangeLog:
* g++.dg/contracts/cpp26/expr.prim.lambda.closure.p10.C: Update
to include tests of conditions seen in PR124648.
Marek Polacek [Fri, 17 Apr 2026 21:09:43 +0000 (17:09 -0400)]
c++/reflection: dependent type considered consteval-only [PR124855]
Here we emit the "function of consteval-only type must be declared
'consteval'" error for f, even though its type will become
char f(int) after substitution, which is not consteval-only. We
probably shouldn't consider dependent type consteval-only.
PR c++/124855
gcc/cp/ChangeLog:
* reflect.cc (consteval_only_p): Return false if the type is
dependent.
Jonathan Wakely [Fri, 17 Apr 2026 20:53:16 +0000 (21:53 +0100)]
libstdc++: Fix accidentally committed change to spelling of macro
This change to the macro was done intentionally to quickly test that the
changes in r16-8720-g209550a04e143e did not break the code in the #else
branch, but it was not supposed to be committed!
Jakub Jelinek [Mon, 20 Apr 2026 07:53:38 +0000 (09:53 +0200)]
testsuite: Remove -m32 from gcc.target/i386 test [PR122021]
I found another test which uses -m32 in gcc.target/i386/ . Similarly
to the previously fixed tests, the test ought to be tested during i686-linux
testing or x86_64-linux test with --target_board=unix\{-m32,-m64\}
There is nothing ia32 specific in the test, so I've just dropped the -m32.
See also r13-143, r13-6846, r15-7748 and r15-7749 for similar changes in the
past.
2026-04-20 Jakub Jelinek <jakub@redhat.com>
PR middle-end/122021
* gcc.target/i386/pr122021-0.c: Remove -m32 from dg-options.
H.J. Lu [Mon, 20 Apr 2026 05:02:10 +0000 (13:02 +0800)]
pr121649.c: Replace long with long long
pr121649.c is a test enabled for int128 targets. It assumes that long
is 64-bit, which isn't true for all int128 targets. Replace long with
long long for 64-bit integer.
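For illustration only (this is not the contents of pr121649.c), the width guarantee being relied on can be written down directly. The helper name top_bit_64 is hypothetical:

```cpp
#include <cassert>
#include <climits>

// long long is guaranteed to have at least 64 bits; long is only guaranteed
// to have at least 32, so it may be narrower on some int128 targets.
static_assert (sizeof (long long) * CHAR_BIT >= 64,
               "long long has at least 64 bits");

// Hypothetical helper: bit 63 is always representable in unsigned long long,
// but not necessarily in unsigned long.
unsigned long long top_bit_64 ()
{
  return 1ULL << 63;
}
```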
PR testsuite/124939
* gcc.dg/torture/pr121649.c: Replace long with long long.
Jakub Jelinek [Mon, 20 Apr 2026 07:11:24 +0000 (09:11 +0200)]
bitintlower: Padding bit fixes, part 3 [PR123635]
I've debugged the rest of the failures on riscv64-linux (in particular
torture/bitint-{87,89}.c FAILs at -O2).
This is on top of https://gcc.gnu.org/pipermail/gcc-patches/2026-April/713242.html
patch.
One problem was with the lower_shift_stmt RSHIFT_EXPR change to iterate
until p2 rather than p (where p2 is p + 1 if bitint_ext_full and there is a
full limb of padding bits): the loop is emitted with a condition before the
header and another condition before the latch edge, and I've mistakenly fixed
just the latter and not the former.
Another problem was that in all the 3 RSHIFT_EXPRs added meant to set
a full limb to 0 or all ones based on most significant bit I've mistakenly used
unsigned type rather than signed, so it was set to 0 or 1 instead (this
was twice in lower_shift_stmt, for the LSHIFT_EXPR case in both cases and
once in lower_float_conv_stmt).
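The difference between the two shifts can be sketched outside the compiler. This assumes a 64-bit limb, and the helper names msb_unsigned and msb_signed are hypothetical; it is not the GIMPLE the pass emits:

```cpp
#include <cassert>
#include <cstdint>

// Logical (unsigned) shift: only the most significant bit survives, so the
// result is 0 or 1.
std::uint64_t msb_unsigned (std::uint64_t limb)
{
  return limb >> 63;
}

// Arithmetic (signed) shift: the most significant bit is replicated, so the
// result is 0 or all ones.  Before C++20 both the conversion to a signed type
// and the right shift of a negative value are implementation-defined; GCC
// implements them as modular conversion and arithmetic shift.
std::uint64_t msb_signed (std::uint64_t limb)
{
  return static_cast<std::uint64_t> (static_cast<std::int64_t> (limb) >> 63);
}
```

Only the signed variant yields a value that can be stored directly as a sign-extended padding limb.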
And finally, because unsigned MULT_EXPR doesn't have overflow undefined, we
actually don't need to just clear the full padding bit limb (if any) but
even the padding bits in the partial limb (if any; and this actually doesn't
affect just arm and riscv, but also affects s390x and loongarch).
2026-04-20 Jakub Jelinek <jakub@redhat.com>
PR middle-end/123635
* gimple-lower-bitint.cc (bitint_large_huge::lower_shift_stmt): In the
RSHIFT_EXPR case, use p2 in two LE_EXPR conditions rather than just
one. In LSHIFT_EXPR case, use signed RSHIFT_EXPR instead of unsigned.
(bitint_large_huge::lower_muldiv_stmt): For unsigned MULT_EXPR with
bitint_extended if prec is not multiple of limb_prec, clear padding
bits after libgcc call.
(bitint_large_huge::lower_float_conv_stmt): Use signed RSHIFT_EXPR
instead of unsigned.
Reviewed-by: Jeffrey Law <jeffrey.law@oss.qualcomm.com>
Jakub Jelinek [Mon, 20 Apr 2026 07:08:49 +0000 (09:08 +0200)]
bitintlower: Padding bit fixes, part 2 [PR123635]
So, I've tried the (extremely slow) cfarm95 RISCV box and found that my
earlier PR123635 patch actually broke stuff.
First thing I found is that __riscv__ macro I was using in bitintext.h
doesn't exist and fixed that.
I wrote two new tests (88/89, one for a problem I'll talk about later and
one to cover shifts) and fixed one nit in 86/87. All the testing
has been done on these modified tests and using
make -j8 -k check-gcc RUNTESTFLAGS="dg.exp='*bitint* pr112673.c builtin-stdc-bit-*.c pr112566-2.c pr112511.c pr116588.c pr116003.c pr113693.c pr113602.c flex-array-counted-by-7.c' dg-torture.exp='*bitint* pr116480-2.c pr114312.c pr114121.c' dfp.exp=*bitint* vect.exp='vect-early-break_99-pr113287.c' tree-ssa.exp=pr113735.c"
On pre-r16-8678 source, the FAILs were
FAIL: gcc.dg/torture/bitint-82.c -O0 execution test
FAIL: gcc.dg/torture/bitint-82.c -O2 execution test
FAIL: gcc.dg/torture/bitint-83.c -O0 execution test
FAIL: gcc.dg/torture/bitint-83.c -O2 execution test
FAIL: gcc.dg/torture/bitint-86.c -O0 execution test
FAIL: gcc.dg/torture/bitint-86.c -O2 execution test
FAIL: gcc.dg/torture/bitint-87.c -O0 execution test
FAIL: gcc.dg/torture/bitint-87.c -O2 execution test
FAIL: gcc.dg/torture/bitint-88.c -O0 execution test
FAIL: gcc.dg/torture/bitint-88.c -O2 execution test
FAIL: gcc.dg/torture/bitint-89.c -O0 execution test
FAIL: gcc.dg/torture/bitint-89.c -O2 execution test
i.e. all the bitintext.h tests for padding bits (except bitint-84.c),
plus gcc.dg/bitint-39.c gcc.dg/torture/bitint-37.c tests timing out
(but those timed out due to extremely slow CPU all the time, and are
really large and not padding related, so let's ignore that).
Now, with r16-8678 (i.e. vanilla trunk), the FAILs are
FAIL: gcc.dg/torture/bitint-42.c -O0 execution test
FAIL: gcc.dg/torture/bitint-42.c -O2 execution test
FAIL: gcc.dg/torture/bitint-62.c -O0 execution test
FAIL: gcc.dg/torture/bitint-62.c -O2 execution test
FAIL: gcc.dg/torture/bitint-66.c -O0 execution test
FAIL: gcc.dg/torture/bitint-68.c -O0 execution test
FAIL: gcc.dg/torture/bitint-68.c -O2 execution test
FAIL: gcc.dg/torture/bitint-79.c -O2 execution test
FAIL: gcc.dg/torture/bitint-80.c -O2 execution test
FAIL: gcc.dg/torture/bitint-81.c -O0 execution test
FAIL: gcc.dg/torture/bitint-81.c -O2 execution test
FAIL: gcc.dg/torture/bitint-83.c -O2 execution test
FAIL: gcc.dg/torture/bitint-87.c -O2 execution test
FAIL: gcc.dg/torture/bitint-88.c -O2 execution test
FAIL: gcc.dg/torture/bitint-89.c -O2 execution test
So, I broke some tests (42, 62, 66, 68, 79, 80, 81) and
fixed a few too (82, 86 and at -O0 only 83, 87, 88, 89).
I've debugged the regressions I've caused and the problem is on large/huge
_BitInt bit-field stores, we can't clear any padding bits in those cases,
bit-fields never have paddings (C FE rejects oversized bit-fields and the
padding is used for further fields or is merely structure padding rather
than padding of the bit-field).
The following patch fixes more than that. There is another problem
(bitint-88.c tries to test that), when we merge some operation (e.g.
addition) of some narrower large/huge _BitInt with sign extension from
it into a wider unsigned _BitInt (e.g. signed _BitInt(513) addition
sign extended into unsigned _BitInt(1025)), the earlier solution for
the extra padding limb doesn't work properly, we do want to sign
extend the bit 512 into bits 513-1024, but the padding bits above
that need to be cleared. For the limb containing bit 1024 we do it
right, it is sign extension but outside of loop, so should cast the
all zeros or all ones value to unsigned long : 1 and back, but
the limb containing bit 1088 needs to be just zeroed.
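The intended result for the _BitInt(513) to _BitInt(1025) example can be modelled as below. This is a sketch under assumptions: 64-bit limbs stored least significant first, with the destination padded to 18 limbs (bits 0 to 1151). The helper name widen_513_to_1025 is hypothetical and this is not the code the pass generates:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed layout (illustrative only): 64-bit limbs, least significant first.
constexpr std::size_t src_limbs = 9;   // signed _BitInt(513): bits 0..512
constexpr std::size_t dst_limbs = 18;  // unsigned _BitInt(1025) plus padding

using src_t = std::array<std::uint64_t, src_limbs>;
using dst_t = std::array<std::uint64_t, dst_limbs>;

// Hypothetical helper modelling the extension described above.
dst_t widen_513_to_1025 (const src_t &src)
{
  dst_t dst {};
  // Bit 512 (bit 0 of limb 8) is the sign bit of the source.
  const std::uint64_t sign = src[8] & 1;
  const std::uint64_t ext = sign ? ~std::uint64_t {0} : 0;
  for (std::size_t i = 0; i < 8; ++i)
    dst[i] = src[i];
  dst[8] = sign | (ext << 1);          // bits 513..575 are sign extension
  for (std::size_t i = 9; i < 16; ++i)
    dst[i] = ext;                      // bits 576..1023 are sign extension
  dst[16] = ext & 1;                   // bit 1024 is sign, bits 1025..1087 cleared
  dst[17] = 0;                         // limb containing bit 1088 is just zeroed
  return dst;
}
```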
And the patch also adds the bitint_ext_full handling to RSHIFT_EXPR
and LSHIFT_EXPR code.
With this, the FAILs on riscv64-linux are
FAIL: gcc.dg/torture/bitint-87.c -O2 execution test
FAIL: gcc.dg/torture/bitint-89.c -O2 execution test
which means I need to debug further the multiplication/division/modulo/
casts from float and there is some remaining problem with the shifts.
Plus something not covered yet, the overflow builtins/ubsan (all of +-*).
In any case, because this patch doesn't regress on riscv64-linux any
actual non-padding tests and even these two aren't regressions, I'd
like to commit this patch separately and fix stuff incrementally,
to unbreak the bit-field stores.
2026-04-20 Jakub Jelinek <jakub@redhat.com>
PR middle-end/123635
* gimple-lower-bitint.cc (bitint_precision_kind): Assert the current
assumptions, that bitint_ext_full for abi_limb_prec > limb_prec is
supported only when abi_limb_prec is limb_prec * 2 and it is not
big endian in that case.
(bitint_large_huge::lower_mergeable_stmt): Don't set separate_ext
for bitint_ext_full for bit-field stores. Guard the condition
on an extra limb of padding bits to be extended rather than including
earlier extensions in that too. If already sign extending before
and type is unsigned, set zero_ms_limb instead and later handle it.
(bitint_large_huge::lower_shift_stmt): Handle bitint_ext_full.
* gcc.dg/bitintext.h: Use __riscv macro instead of __riscv__.
* gcc.dg/torture/bitint-86.c: Remove bogus sync_char_short
effective target.
* gcc.dg/torture/bitint-87.c: Likewise.
* gcc.dg/torture/bitint-88.c: New test.
* gcc.dg/torture/bitint-89.c: New test.
Reviewed-by: Jeffrey Law <jeffrey.law@oss.qualcomm.com>
Soumya AR [Fri, 17 Apr 2026 03:02:11 +0000 (03:02 +0000)]
aarch64: Minor fixes for narrow-gp-writes pass
This patch addresses the following fixes:
- Remove the redundant checks for SUBREG and TRUNCATE.
- Bail out of recursive narrowing in narrow_dimode_src when an operand remains
DImode.
- Use HOST_WIDE_INT_PRINT_HEX instead of %lx for printing the mask.
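The %lx problem in the last item is that long is not 64 bits wide on every host, while the value being printed is. A standalone analogue using the standard PRIx64 macro follows; it does not use GCC's HOST_WIDE_INT_PRINT_HEX, and the helper name format_mask is hypothetical:

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: format a 64-bit mask as hex regardless of how wide
// 'long' is on the host.  With "%lx" the argument type would have to be
// unsigned long, which may be only 32 bits wide.
std::string format_mask (std::int64_t mask)
{
  char buf[32];
  int n = std::snprintf (buf, sizeof buf, "0x%" PRIx64,
                         static_cast<std::uint64_t> (mask));
  assert (n > 0 && static_cast<std::size_t> (n) < sizeof buf);
  (void) n;
  return std::string (buf);
}
```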
Bootstrapped and regtested on aarch64-linux-gnu.
Signed-off-by: Soumya AR <soumyaa@nvidia.com>
gcc/ChangeLog:
* config/aarch64/aarch64-narrow-gp-writes.cc (narrow_dimode_src): Remove
redundant checks. Don't recurse when an operand remains DImode.
(narrow_gp_writes::optimize_compare_arith_insn): Use
HOST_WIDE_INT_PRINT_HEX.
(narrow_gp_writes::optimize_single_set_insn): Likewise.
The PR is about an ICE on sh caused by an "invalid" subreg.
cse replaced a pseudo register with the hard T register within:
(zero_extend:SI (subreg:QI (reg:SI pseudo) 3))
Since this is a register-for-register replacement, cse just relied on
recog to reject anything that wasn't valid.
However, if validate_subreg had been asked, it would have said that:
(subreg:QI (reg:SI T) 3)
is not valid. This means that even simplify_gen_subreg would have
refused to generate it.
In that sense, cse should not even be trying to match this replacement.
It's not recog's job to reject all invalid rtl. recog is just supposed
to say whether the machine supports a given piece of valid rtl.
In this particular case, the sh port does specifically match:
(zero_extend:SI (subreg:QI (reg:SI T) 3))
even though, by forbidding T from having QImode, the port also
effectively forbids the subreg. See the discussion in the PR trail
about that. But I think the point still stands that cse should verify
the subregs that it creates. It should also try to simplify them down
to hard registers where possible.
I suppose a more complete fix would be to rewrite canon_reg to use a
helper that recursively replaces and simplifies, but that seems somewhat
dangerous at this stage. The scope for non-subreg simplification should
also be pretty limited in practice.
gcc/
PR rtl-optimization/124643
* cse.cc (canon_reg): Handle and canonicalize subregs.
gcc/testsuite/
PR rtl-optimization/124643
* gcc.c-torture/compile/pr124643.c: New test.