The current implementation was returning the result of the g_r_o_o_a_f
call independently of the return expressions for 'normal' cases.
This prevents the NRVO that we need to guarantee copy elision for the
ramp return values - when these are initialised from a temporary of the
same type.
The solution here reorders the code so that the regular return expression
appears before the allocation-failed case. It also ensures that the g_r_o
and associated code appear in a distinct scope. These steps are needed to
meet the constraints of NRV.
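As a loose illustration of the copy-elision requirement (this is not the
coroutine ramp code), the sketch below uses a hypothetical type `Tracked`
that counts copies and moves. Returning a temporary of the return type is
the case where C++17 guarantees elision; returning a named local is only
elided when the compiler's NRV constraints are met, which is what the
reordering above is meant to achieve.

```cpp
#include <cassert>

// Hypothetical type that records how often it is copied or moved.
struct Tracked {
  static inline int copies = 0;
  static inline int moves = 0;
  int value;
  explicit Tracked(int v) : value(v) {}
  Tracked(const Tracked& o) : value(o.value) { ++copies; }
  Tracked(Tracked&& o) noexcept : value(o.value) { ++moves; }
};

// The return operand is a temporary (prvalue) of the return type, so the
// result object is initialised directly: no copy or move constructor runs.
inline Tracked make_from_temporary(int v) { return Tracked(v); }
```

Calling `make_from_temporary` and initialising a `Tracked` from the result
leaves both counters at zero under C++17 or later.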
PR c++/121219
gcc/cp/ChangeLog:
* coroutines.cc
(cp_coroutine_transform::build_ramp_function): Reorder the return
expressions for the 'normal' and 'allocation failed' cases so that
NRV constraints are met.
There was once a RISC-V extension draft ("N"), which introduced
user-level interrupts. However, it was never ratified and the
specification draft has been removed from the RISC-V ISA manual
in commit `b6cade07034` with the comment "it'll likely need to
be redesigned".
Support for an N extension never made it to GCC, but we support
function attributes for user-level interrupt handlers that use
the URET instruction.
The "user" interrupt attribute was documented in the RISC-V C API,
but was removed in PR #106 in May 2025 (driven by LLVM devs/
maintainers and ack'ed by at least one GCC maintainer).
Richard Biener [Fri, 25 Jul 2025 07:50:18 +0000 (09:50 +0200)]
Remove STMT_VINFO_VEC_STMTS
The following removes the last uses of STMT_VINFO_VEC_STMTS and
the vector itself. Vector stmts are recorded in SLP nodes now.
The last use is a bit strange - it was introduced by
Richard S. in r8-6064-ga57776a1136962 and affects only
power7 and below (the re-align optimized load path). The
check should have never been true since vect_vfa_access_size
is only ever invoked before stmt transform. I have done
the "conservative" change of making it always true now
(so the code is now entered). I can as well remove it, but
I wonder if you remember anything about this ...
* tree-vectorizer.h (_stmt_vec_info::vec_stmts): Remove.
(STMT_VINFO_VEC_STMTS): Likewise.
* tree-vectorizer.cc (vec_info::new_stmt_vec_info): Do not
initialize it.
(vec_info::free_stmt_vec_info): Nor free it.
* tree-vect-data-refs.cc (vect_vfa_access_size): Remove
check on STMT_VINFO_VEC_STMTS.
Richard Biener [Fri, 25 Jul 2025 07:19:47 +0000 (09:19 +0200)]
Remove vect_get_vec_defs_for_operand
This removes vect_get_vec_defs_for_operand and its remaining uses.
It also removes some remaining non-SLP paths in preparation to
elide STMT_VINFO_VEC_STMTS.
* tree-vectorizer.h (vect_get_vec_defs_for_operand): Remove.
* tree-vect-stmts.cc (vect_get_vec_defs_for_operand): Likewise.
(vect_get_vec_defs): Remove non-SLP path.
(check_load_store_for_partial_vectors): We always have an
SLP node.
(vect_check_store_rhs): Likewise.
(vect_get_gather_scatter_ops): Likewise.
(vect_create_vectorized_demotion_stmts): Likewise.
(vectorizable_store): Adjust.
(vectorizable_load): Likewise.
Richard Biener [Fri, 25 Jul 2025 07:04:49 +0000 (09:04 +0200)]
Remove dead code from vectorizable_store
There's dead code in the else block of an if (!costing_p) block;
after trivial pruning only the setting of 'op' remains, but that has
no further uses downstream. I found this looking for remaining
(must-be-dead) uses of vect_get_vec_defs_for_operand.
* tree-vect-stmts.cc (vectorizable_store): Remove trivially
dead code.
MI300 requires some additional s_nop to be added between some instructions.
* As 'v_readlane' and 'v_writelane' have to be distinguished, the
'laneselect' attribute was changed from no/yes to no/read/write.
* Add some missing 'laneselect' attributes for v_(read,write)lane.
* Replace 'delayeduse' by 'flatmemaccess' which is more explicit,
especially as some uses have to be distinguished in more detail.
(Alongside, one off-by-two delayeduse has been fixed.)
On the other hand, RDNA 2, 3, and 3.5 do not require any added s_nop;
thus, there is no need to walk the instructions for them to insert
pointless S_NOP. (RDNA4 (not yet in GCC) requires it in a few cases.)
gcc/ChangeLog:
* config/gcn/gcn-opts.h (TARGET_NO_MANUAL_NOPS,
TARGET_CDNA3_NOPS): Define.
* config/gcn/gcn.md (define_attr "laneselect"): Change 'yes' to
'read' and 'write'.
(define_attr "flatmemaccess"): Add with values store, storex34,
load, atomic, atomicwait, cmpswapx2, and no. Replacing ...
(define_attr "delayeduse"): Remove.
(define_attr "transop"): Add with values yes and no.
(various insns): Update 'laneselect', add flatmemaccess and transop,
remove delayeduse; fixing an issue for s_load_dwordx4 vs.
flat_store_dwordx4 related to delayeduse (now: flatmemaccess).
* config/gcn/gcn-valu.md: Update laneselect attribute and add
flatmemaccess.
* config/gcn/gcn.cc (gcn_cmpx_insn_p): New.
(gcn_md_reorg): Update for MI300 to add additional s_nop.
Skip s_nop-insertion part for RDNA{2,3}; add "VALU writes EXEC
followed by VALU DPP" unconditionally for CDNA2/CDNA3/GCN5.
Tests gcc.dg/asm-hard-reg-error-{4,5}.c ICE on sparc*-sun-solaris2.11
since in tm-preds.h we end up with
#define CONSTRAINT_LEN(c_,s_) 1
and, therefore, do not parse hard register constraints correctly.
Hard register constraints are non-single character constraints and
require insn_constraint_len() in order to determine the length.
In write_tm_preds_h() from genpreds.cc, previously variable
constraint_max_namelen was used in order to decide whether we have
single or non-single character constraints. The distinction should no
longer be made, and we must always call into insn_constraint_len().
While at it, remove guard constraint_max_namelen>0 since we always
have some constraints coming from common.md. This leaves
constraint_max_namelen without users so remove it.
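To see why a fixed length of 1 breaks parsing, consider the simplified
sketch below. The helper names are hypothetical and this is not the code
that genpreds.cc generates; it only assumes that a hard register constraint
is written in braces, e.g. "{r5}", and spans everything up to the closing
brace.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a generated length function: a constraint that
// starts with '{' is assumed to run up to and including the closing '}';
// anything else is treated as a single character.
inline std::size_t sketch_constraint_len(const char *s) {
  if (s[0] == '{') {
    const char *close = std::strchr(s, '}');
    if (close)
      return static_cast<std::size_t>(close - s) + 1;
  }
  return 1;
}

// Models "#define CONSTRAINT_LEN(c_,s_) 1".
inline std::size_t fixed_len_one(const char *) { return 1; }

// Count the constraints in S by stepping through it with LEN.
template <typename LenFn>
std::size_t count_constraints(const char *s, LenFn len) {
  std::size_t n = 0;
  while (*s) {
    s += len(s);
    ++n;
  }
  return n;
}
```

With the fixed length, "{r5}" is walked as four separate one-character
constraints; with the length function it is seen as a single constraint.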
gcc/ChangeLog:
PR middle-end/121214
* genpreds.cc (constraint_max_namelen): Delete.
(write_tm_preds_h): Always write insn_constraint_len() and
define CONSTRAINT_LEN to it, i.e., remove guard
constraint_max_namelen>1. Remove outer guard
constraint_max_namelen>0 and re-indent.
(write_insn_preds_c): Remove guard
constraint_max_namelen>0 and re-indent.
ada: ppc-vx6: pthread clocks and headers for decls
VxWorks 6 lacks pthread_condattr_setclock, so define CLOCK_RT_Ada to
CLOCK_REALTIME to use the dummy definition of
__gnat_pthread_condattr_setup in libgnarl/thread.c.
socket.c and sysdep.c use FD_ZERO, that relies on bzero on VxWorks 6.
We need to include strings.h to get a declaration for bzero, but don't
require strings.h to exist, since it's nonstandard.
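One way to include a header only when it exists is `__has_include`; the
sketch below assumes that approach, which may differ from what the patch
actually does. The macro `SKETCH_HAVE_STRINGS_H` and the helper function
are hypothetical, and the example assumes a POSIX host that provides
sys/select.h.

```cpp
#include <cassert>
#include <sys/select.h>  // fd_set, FD_ZERO, FD_ISSET (POSIX)

// Pull in strings.h for a bzero declaration only when the header exists,
// since it is nonstandard and must not be required.
#if defined(__has_include)
#  if __has_include(<strings.h>)
#    include <strings.h>
#    define SKETCH_HAVE_STRINGS_H 1
#  endif
#endif

// FD_ZERO may expand to a bzero call on some systems, which is why the
// declaration above matters there.
inline bool fd_set_starts_empty(int fd) {
  fd_set fds;
  FD_ZERO(&fds);
  return !FD_ISSET(fd, &fds);
}
```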
gcc/ada/ChangeLog:
* s-oscons-tmplt.c (CLOCK_RT_Ada) [__vxworks]: Define to
CLOCK_REALTIME on VxWorks6.
* gsocket.h [__vxworks]: Include strings.h if available.
* sysdep.c [__vxworks]: Likewise.
Steve Baird [Wed, 16 Jul 2025 20:37:44 +0000 (13:37 -0700)]
ada: Follow up fixes.
Two follow-up fixes for the previous change for this issue.
gcc/ada/ChangeLog:
* exp_ch6.adb (Apply_Access_Discrims_Accessibility_Check): Do
nothing and simply return if either Ada_Version <= Ada_95 or if
the function being returned from lacks the extra formal parameter
needed to perform the check (typically because the result is
tagged).
Bob Duff [Wed, 16 Jul 2025 14:29:01 +0000 (10:29 -0400)]
ada: Bug in Indefinite_Holders instance passed to formal package
Fix bug when an instance of Indefinite_Holders with a class-wide type is
passed as a generic formal package; Program_Error was raised when
dealing with the implicit "=" function.
The fix is to disable legality checks in formal packages when the
entity is an E_Subprogram_Body, because these are implicitly generated
for class-wide predefined functions when passed to generics.
gcc/ada/ChangeLog:
* sem_ch12.adb (Check_Formal_Package_Instance):
Do nothing in case of E_Subprogram_Body.
A previous patch changed the mechanism of early usage detection for
discriminants but failed to update a couple of surrounding comments
accordingly. This patch fixes this omission.
ada: Fix regression of finalization primitive selection
A recent patch introduced a new flag to mark the types for which looking
up finalization primitives needs special handling. But there was one
place in Build_Derived_Record_Type where the flag was not set when it
should have been, which introduced a regression in some cases.
This patch adds the missing setting of the flag.
gcc/ada/ChangeLog:
* sem_ch3.adb (Build_Derived_Record_Type): Set flag appropriately.
Eric Botcazou [Mon, 14 Jul 2025 22:37:19 +0000 (00:37 +0200)]
ada: Fix inconsistencies in conversion functions from Duration
The 3 units Ada.Calendar, GNAT.Calendar and GNAT.Sockets contain conversion
functions from the Duration fixed-point type that implement the same idiom
but with some inconsistencies:
* GNAT.Sockets only handles Timeval_Duration, i.e. positive Duration, and
is satisfactory, although a simpler implementation can be written,
* GNAT.Calendar mishandles negative Duration values, as well as integral
Duration values,
* Ada.Calendar mishandles negative Duration values, and rounds nanoseconds
instead of truncating them.
gcc/ada/ChangeLog:
* libgnat/a-calend.adb (To_Struct_Timespec_64): Deal with negative
Duration values and truncate the nanoseconds too.
* libgnat/g-calend.adb (timeval_to_duration): Unsuppress overflow
checks.
(duration_to_timeval): Likewise. Deal with negative Duration values
as well as integral Duration values.
* libgnat/g-socket.adb (To_Timeval): Simplify the implementation.
RISC-V: Add support for resumable non-maskable interrupt (RNMI) handlers
The Smrnmi extension introduces the nmret instruction to return from RNMI
handlers. We already have basic Smrnmi support. This patch introduces
support for the nmret instruction and the ability to set the function
attribute `__attribute__ ((interrupt ("rnmi")))` to let the compiler
generate RNMI handlers.
The attribute name is proposed in a PR for the RISC-V C API and approved
by LLVM maintainers:
https://github.com/riscv-non-isa/riscv-c-api-doc/pull/116
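A handler using the attribute could look roughly like the sketch below.
The attribute spelling is the one quoted above; the opt-in macro
`SKETCH_HAVE_RNMI_ATTRIBUTE`, the wrapper macro, and the handler body are
hypothetical, so that the example still compiles as a plain function on
other targets or with compilers that lack the attribute.

```cpp
#include <cassert>

// Only apply the attribute on RISC-V and when explicitly opted in.
#if defined(__riscv) && defined(SKETCH_HAVE_RNMI_ATTRIBUTE)
#  define SKETCH_RNMI_HANDLER __attribute__ ((interrupt ("rnmi")))
#else
#  define SKETCH_RNMI_HANDLER
#endif

// Counter touched by the handler; volatile because a real handler would
// run asynchronously.
volatile int rnmi_count = 0;

// With the attribute in effect, the compiler is expected to generate the
// handler prologue/epilogue and return with nmret.
extern "C" SKETCH_RNMI_HANDLER void rnmi_handler (void) {
  rnmi_count = rnmi_count + 1;
}
```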
Nathaniel Shead [Thu, 29 May 2025 10:08:13 +0000 (20:08 +1000)]
c++: Unwrap type traits defined in terms of builtins within diagnostics [PR117294]
Currently, concept failures of standard type traits just report
'expression X<T> evaluates to false'. However, many type traits are
actually defined in terms of compiler builtins; we can do better here.
For instance, 'is_constructible_v' could go on to explain why the type
is not constructible, or 'is_invocable_v' could list potential
candidates.
Apart from concept diagnostics, this is also useful when using such
traits in a 'static_assert' directly, so this patch also adjusts the
diagnostics in that context.
As a first step to supporting that we need to be able to map the
standard type traits to the builtins that they use. Rather than adding
another list that would need to be kept up-to-date whenever a builtin is
added, this patch instead tries to detect any variable template defined
directly in terms of a TRAIT_EXPR.
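For illustration, a variable template of the kind being detected might
look like the sketch below. The names `sketch_is_constructible_v` and
`NoDefault` are hypothetical; `__is_constructible` is a builtin provided
by GCC and Clang.

```cpp
#include <cassert>
#include <string>

// Hypothetical trait defined directly in terms of a compiler builtin, in
// the same shape a standard library implementation might use.
template <typename T, typename... Args>
inline constexpr bool sketch_is_constructible_v
  = __is_constructible(T, Args...);

struct NoDefault {
  NoDefault() = delete;
  explicit NoDefault(int) {}
};

// Constructible from int, but not default-constructible.
static_assert(sketch_is_constructible_v<NoDefault, int>);
static_assert(!sketch_is_constructible_v<NoDefault>);
```

A failing `static_assert` on such a trait is the situation where a
diagnostic can explain why the underlying builtin evaluated to false.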
This patch also adjusts 'diagnose_trait_expr' to provide more helpful
diagnostics for these cases. Not all type traits have been updated yet;
this patch just updates those that seem particularly valuable or
straight-forward. The function also gets moved to cp/semantics.cc to be
closer to 'trait_expr_value'.
Various other parts of the compiler are also adjusted here to assist in
making clear diagnostics, such as making more use of 'is_stub_object' to
refer to a type directly rather than in terms of 'std::declval<T>()'.
Additionally, since there are now more cases of nesting within a
'static_assert'ion I felt it was helpful for the experimental-nesting
mode to nest here as well.
PR c++/117294
PR c++/113854
gcc/cp/ChangeLog:
* call.cc (implicit_conversion_error): Hide label when printing
a stub object.
(convert_like_internal): Likewise, and nest candidate
diagnostics.
* constexpr.cc (diagnose_failing_condition): Nest diagnostics,
attempt to provide more helpful diagnostics for traits.
* constraint.cc (satisfy_atom): Pass result before constant
evaluation to diagnose_atomic_constraint.
(diagnose_trait_expr): Adjust diagnostics for clarity and
detail.
(maybe_diagnose_standard_trait): New function.
(diagnose_atomic_constraint): Attempt to provide more helpful
diagnostics for more traits.
* cp-tree.h (explain_not_noexcept): Declare new function.
(is_trivially_xible): Add parameter.
(is_nothrow_xible): Likewise.
(is_xible): Likewise.
(is_convertible): Likewise.
(is_nothrow_convertible): Likewise.
(diagnose_trait_expr): Declare new function.
(maybe_diagnose_standard_trait): Declare new function.
* error.cc (dump_type) <case TREE_VEC>: Handle trait types.
* except.cc (explain_not_noexcept): New function.
* method.cc (build_trait_object): Add complain parameter.
(build_invoke): Propagate complain parameter.
(assignable_expr): Add explain parameter to show diagnostics.
(constructible_expr): Likewise.
(destructible_expr): Likewise.
(is_xible_helper): Replace trivial flag with explain flag,
add diagnostics.
(is_trivially_xible): New explain flag.
(is_nothrow_xible): Likewise.
(is_xible): Likewise.
(is_convertible_helper): Add complain flag.
(is_convertible): New explain flag.
(is_nothrow_convertible): Likewise.
* typeck.cc (cp_build_function_call_vec): Add handling for stub
objects.
(convert_arguments): Always return -1 on error.
* typeck2.cc (cxx_readonly_error): Add handling for stub
objects.
* g++.dg/cpp2a/concepts-traits3.C: Adjust diagnostics.
* g++.dg/cpp2a/concepts-traits4.C: New test.
* g++.dg/diagnostic/static_assert5.C: New test.
* g++.dg/ext/has_virtual_destructor2.C: New test.
* g++.dg/ext/is_assignable2.C: New test.
* g++.dg/ext/is_constructible9.C: New test.
* g++.dg/ext/is_convertible7.C: New test.
* g++.dg/ext/is_destructible3.C: New test.
* g++.dg/ext/is_invocable6.C: New test.
* g++.dg/ext/is_virtual_base_of_diagnostic2.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com> Reviewed-by: Patrick Palka <ppalka@redhat.com> Reviewed-by: Jason Merrill <jason@redhat.com>
Jason Merrill [Thu, 24 Jul 2025 18:07:11 +0000 (14:07 -0400)]
c++: lambda convop in C++23 [PR114632]
The lambda conversion was ICEing for two C++23 features, static op() and
explicit object parameters. The issue with the former seems like a more
general issue: tsubst_function_decl recursing to substitute the parameters
was affected by cp_unevaluated_operand from the decltype that refers to the
declaration. Various places already make a point of clearing
cp_unevaluated_operand ahead of PARM_DECL tsubsting; doing it here makes the
PR101233 fix redundant.
For explicit object lambdas, we want to implement CWG2561 and
just not declare the conversion.
PR c++/114632
PR c++/101233
gcc/cp/ChangeLog:
* lambda.cc (maybe_add_lambda_conv_op): Not for xobj lambda.
* pt.cc (tsubst_function_decl): Add cp_evaluated.
(alias_ctad_tweaks): Revert PR101233 fix.
gcc/testsuite/ChangeLog:
* g++.dg/cpp23/explicit-obj-lambda18.C: New test.
* g++.dg/cpp23/static-operator-call7.C: New test.
Richard Biener [Thu, 24 Jul 2025 12:14:24 +0000 (14:14 +0200)]
Remove vec_stmt from vectorizable_* API
The following removes the non-SLP gimple **vec_stmt argument from
the vectorizable_* functions API. Checks on it can be replaced
by an inverted check on the passed cost_vec vector pointer.
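The shape of that replacement can be sketched as follows. The types and
the function are hypothetical stand-ins, not the vectorizer's real
declarations; the only point is that a non-null cost vector now signals
the analysis/costing phase, where a null vec_stmt used to.

```cpp
#include <cassert>

// Hypothetical stand-in for the cost vector type.
struct sketch_cost_vec { int total = 0; };

// Costs the operation when COST_VEC is non-null, otherwise performs the
// (modelled) transform by bumping *TRANSFORMED.
inline bool sketch_vectorizable_op(sketch_cost_vec *cost_vec,
                                   int *transformed) {
  if (cost_vec) {  // formerly: if (!vec_stmt)
    cost_vec->total += 1;
    return true;
  }
  *transformed += 1;
  return true;
}
```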
Andrew Pinski [Thu, 24 Jul 2025 16:26:38 +0000 (09:26 -0700)]
Fix minor typo in #ifdef documentation
As reported in https://gcc.gnu.org/pipermail/gcc/2025-July/246417.html,
this fixes the minor typo in the #ifdef documentation about adding
MACRO after the #endif and MACRO matching that of the #ifdef. It had
`#ifndef` in it, rather than `#ifdef`.
Pushed as obvious.
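The convention being documented is to repeat the macro name in a comment
after the `#endif`, so that the reader can see which conditional it
closes. A minimal sketch, with the hypothetical macro `SKETCH_FEATURE`:

```cpp
#include <cassert>

#define SKETCH_FEATURE 1  // hypothetical macro

#ifdef SKETCH_FEATURE
constexpr bool feature_enabled = true;
#else
constexpr bool feature_enabled = false;
#endif /* SKETCH_FEATURE */
```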
gcc/ChangeLog:
* doc/cpp.texi (#ifdef): Correct typo.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Tomasz Kamiński [Thu, 24 Jul 2025 07:14:38 +0000 (09:14 +0200)]
libstdc++: Cleaned up string_vector_iterators.cc test [PR104874]
Removed the wrong_stuff() function, which was effectively empty for
actual test runs. Replaced the manual failure counter with the VERIFY
macro to simplify identifying failures.
Robin Dapp [Wed, 2 Jul 2025 08:28:57 +0000 (10:28 +0200)]
riscv: testsuite: Fix misalignment check.
This fixes a thinko in the misalignment check. If we want to check for
vector misalignment support we need to load 16-byte elements, not
8-byte elements that will never be misaligned.
Robin Dapp [Thu, 3 Jul 2025 09:04:29 +0000 (11:04 +0200)]
vect: Misalign checks for gather/scatter.
This patch adds simple misalignment checks for gather/scatter
operations. Previously, we assumed that those perform element accesses
internally so alignment does not matter. The riscv vector spec however
explicitly states that vector operations are allowed to fault on
element-misaligned accesses. Reasonable uarchs won't, but...
For gather/scatter we have two paths in the vectorizer:
(1) Regular analysis based on datarefs. Here we can also create
strided loads.
(2) Non-affine access where each gather index is relative to the
initial address.
The assumption this patch works on is that once the alignment for the
first scalar is correct, all others will fall in line, as the index is
always a multiple of the first element's size.
For (1) we have a dataref and can check it for alignment as in other
cases. For (2) this patch checks the object alignment of BASE and
compares it against the natural alignment of the current vectype's unit.
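The assumption can be modelled outside the vectorizer as in the sketch
below. The helper names are hypothetical and the model is simplified: each
element is taken to live at base + index * element size, so an aligned base
implies that every access is aligned.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// True if ADDR is a multiple of ALIGN (ALIGN must be non-zero).
inline bool is_aligned(std::uintptr_t addr, std::size_t align) {
  return addr % align == 0;
}

// Element i is accessed at BASE + INDEX[i] * ELEM_SIZE.  Returns true if
// every such address is aligned to ELEM_SIZE.  Negative indices rely on
// well-defined unsigned wrap-around.
inline bool all_accesses_aligned(std::uintptr_t base,
                                 const std::ptrdiff_t *index,
                                 std::size_t n, std::size_t elem_size) {
  for (std::size_t i = 0; i < n; ++i) {
    std::uintptr_t addr
      = base + static_cast<std::uintptr_t>(index[i]) * elem_size;
    if (!is_aligned(addr, elem_size))
      return false;
  }
  return true;
}
```

With 4-byte elements, a base of 0x1000 keeps every access aligned, while a
base of 0x1002 already misaligns the first one.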
The patch also adds a pointer argument to the gather/scatter IFNs that
contains the necessary alignment. Most of the patch is thus mechanical
in that it merely adjusts indices.
I tested the riscv version with a custom qemu version that faults on
element-misaligned vector accesses. With this patch applied, there is
just a single fault left, which is due to PR120782 and which will be
addressed separately.
Bootstrapped and regtested on x86 and aarch64. Regtested on
rv64gcv_zvl512b with and without unaligned vector support.
gcc/ChangeLog:
* internal-fn.cc (internal_fn_len_index): Adjust indices for new
alias_ptr param.
(internal_fn_else_index): Ditto.
(internal_fn_mask_index): Ditto.
(internal_fn_stored_value_index): Ditto.
(internal_fn_alias_ptr_index): Ditto.
(internal_fn_offset_index): Ditto.
(internal_fn_scale_index): Ditto.
(internal_gather_scatter_fn_supported_p): Ditto.
* internal-fn.h (internal_fn_alias_ptr_index): Ditto.
* optabs-query.cc (supports_vec_gather_load_p): Ditto.
* tree-vect-data-refs.cc (vect_check_gather_scatter): Add alias
pointer.
* tree-vect-patterns.cc (vect_recog_gather_scatter_pattern): Add
alias pointer.
* tree-vect-slp.cc (vect_get_operand_map): Adjust for alias
pointer.
* tree-vect-stmts.cc (vect_truncate_gather_scatter_offset): Add
alias pointer and misalignment handling.
(get_load_store_type): Move from here...
(get_group_load_store_type): ...To here.
(vectorizable_store): Add alias pointer.
(vectorizable_load): Ditto.
* tree-vectorizer.h (struct gather_scatter_info): Ditto.
Robin Dapp [Wed, 2 Jul 2025 08:02:16 +0000 (10:02 +0200)]
vect: Add is_gather_scatter argument to misalignment hook.
This patch adds an is_gather_scatter argument to the
support_vector_misalignment hook. All targets but riscv do not care
about alignment for gather/scatter so return true for is_gather_scatter.
Robin Dapp [Wed, 2 Jul 2025 08:04:58 +0000 (10:04 +0200)]
ifn: Add helper functions for gather/scatter.
This patch adds access helpers for the gather/scatter offset and scale
parameters.
gcc/ChangeLog:
* internal-fn.cc (expand_scatter_store_optab_fn): Use new
function.
(expand_gather_load_optab_fn): Ditto.
(internal_fn_offset_index): Ditto.
(internal_fn_scale_index): Ditto.
* internal-fn.h (internal_fn_offset_index): New function.
(internal_fn_scale_index): Ditto.
* tree-vect-data-refs.cc (vect_describe_gather_scatter_call):
Use new function.
Tomasz Kamiński [Wed, 23 Jul 2025 09:33:22 +0000 (11:33 +0200)]
libstdc++: Expand compile-time ranges tests for vector and basic_string.
This replaces most test_constexpr invocations with direct calls to
test_ranges(), which is also used for runtime tests.
SimpleAllocator was made constexpr to simplify this refactoring. Other
test allocators, like uneq_allocator (used in from_range constructor
tests), were not updated.
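The pattern of running one test body both at compile time and at run time
can be sketched as below. The function `test_ranges_sketch` is a
hypothetical stand-in and does not reproduce the testsuite's test_ranges().

```cpp
#include <cassert>

// Hypothetical test body that is usable in constant expressions.
constexpr bool test_ranges_sketch() {
  int data[4] = {1, 2, 3, 4};
  int sum = 0;
  for (int v : data)
    sum += v;
  return sum == 10;
}

// Compile-time run: any failure becomes a compilation error.
static_assert(test_ranges_sketch());
```

The same function can then be called from the runtime part of the test, so
both modes exercise identical code.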
libstdc++-v3/ChangeLog:
* testsuite/21_strings/basic_string/cons/from_range.cc: Replace
test_constexpr with test_ranges inside static_assert.
* testsuite/21_strings/basic_string/modifiers/append/append_range.cc:
Likewise.
* testsuite/21_strings/basic_string/modifiers/assign/assign_range.cc:
Likewise.
* testsuite/21_strings/basic_string/modifiers/insert/insert_range.cc:
Likewise.
* testsuite/21_strings/basic_string/modifiers/replace/replace_with_range.cc:
Likewise.
* testsuite/23_containers/vector/bool/cons/from_range.cc: Likewise.
* testsuite/23_containers/vector/bool/modifiers/assign/assign_range.cc:
Likewise.
* testsuite/23_containers/vector/bool/modifiers/insert/insert_range.cc:
Likewise.
* testsuite/23_containers/vector/cons/from_range.cc: Likewise.
* testsuite/23_containers/vector/modifiers/assign/assign_range.cc:
Likewise.
* testsuite/23_containers/vector/modifiers/insert/insert_range.cc:
Likewise.
* testsuite/23_containers/vector/bool/modifiers/insert/append_range.cc:
Run full test_ranges instead of span-only in test_constexpr.
* testsuite/23_containers/vector/modifiers/append_range.cc:
Replace test_constexpr with calls to test_ranges and test_overlapping.
* testsuite/util/testsuite_allocator.h (__gnu_test::SimpleAllocator):
Declared member functions as constexpr.
aarch64: Relaxed SEL combiner patterns for unpacked SVE FP binary arithmetic
Extend the binary op/UNSPEC_SEL combiner patterns from SVE_FULL_F/
SVE_FULL_F_B16B16 to SVE_F/SVE_F_B16B16, where the strictness value
is SVE_RELAXED_GP.
gcc/ChangeLog:
* config/aarch64/aarch64-sve.md (*cond_<optab><mode>_2_relaxed):
Extend from SVE_FULL_F_B16B16 to SVE_F_B16B16.
(*cond_<optab><mode>_3_relaxed): Likewise.
(*cond_<optab><mode>_any_relaxed): Likewise.
(*cond_<optab><mode>_any_const_relaxed): Extend from SVE_FULL_F
to SVE_F.
(*cond_add<mode>_2_const_relaxed): Likewise.
(*cond_add<mode>_any_const_relaxed): Likewise.
(*cond_sub<mode>_3_const_relaxed): Likewise.
(*cond_sub<mode>_const_relaxed): Likewise.
This patch extends the unpredicated FP division expander to support
partial FP modes. It extends the existing patterns used to implement
UNSPEC_COND_FDIV and its approximation as needed.
gcc/ChangeLog:
* config/aarch64/aarch64-sve.md (@aarch64_sve_<optab><mode>):
Extend from SVE_FULL_F to SVE_F, use aarch64_predicate_operand.
(@aarch64_frecpe<mode>): Extend from SVE_FULL_F to SVE_F.
(@aarch64_frecps<mode>): Likewise.
(div<mode>3): Likewise, use aarch64_sve_fp_pred.
* config/aarch64/iterators.md: Add warnings above SVE_FP_UNARY
and SVE_FP_BINARY.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/sve/unpacked_fdiv_1.c: New test.
* gcc.target/aarch64/sve/unpacked_fdiv_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_fdiv_3.c: Likewise.
aarch64: Add support for unpacked SVE FP binary arithmetic
This patch extends the expanders for unpredicated smax, smin, add, sub,
mul, min, and max, so that they support partial SVE FP modes.
The relevant insn and splitting patterns are also updated.
gcc/ChangeLog:
* config/aarch64/aarch64-sve.md (<optab><mode>3): Extend from
SVE_FULL_F to SVE_F, use aarch64_sve_fp_pred.
(*post_ra_<sve_fp_op><mode>3): Extend from SVE_FULL_F to SVE_F.
(@aarch64_pred_<optab><mode>): Extend from SVE_FULL_F to SVE_F,
use aarch64_predicate_operand (ADD/SUB/MUL/MAX/MIN).
(split for using unpredicated insns): Move SVE_RELAXED_GP into
the pattern, rather than testing for it in the condition.
* config/aarch64/aarch64-sve2.md (@aarch64_pred_<optab><mode>):
Extend from VNx8BF_ONLY to SVE_BF.
Steve Baird [Fri, 11 Jul 2025 21:40:59 +0000 (14:40 -0700)]
ada: Use-before-definition of a component of discriminated aggregate's itype.
In some cases involving assigning an aggregate to a formal parameter of
an unconstrained discriminated subtype that has a Dynamic_Predicate, and where
the discriminated type also has a component of an unconstrained discriminated
subtype, the front end generates a malformed tree which causes a compilation
failure when the backend fails a consistency check.
gcc/ada/ChangeLog:
* exp_aggr.adb (Convert_To_Assignments): Add calls to Ensure_Defined
before generating assignments to components that could be
associated with a not-yet-defined itype.
Steve Baird [Mon, 24 Mar 2025 22:34:34 +0000 (15:34 -0700)]
ada: Function return accessibility checking for result access discrims.
RM 6.5 defines static and dynamic checks to ensure that a function result
with one or more access discriminants will not outlive the entity
designated by a non-null access discriminant value (see paragraphs
5.9 and 21). Implement these checks. Also fix a bug in passing along
an implicit parameter needed to perform the dynamic checks when a function
that takes such a parameter returns a call to another such function.
gcc/ada/ChangeLog:
* accessibility.adb (Function_Call_Or_Allocator_Level): Handle the
case where a function that has an Extra_Accessibility_Of_Result
parameter returns as its result a call to another such function.
In that case, the extra parameter should be passed along.
(Check_Return_Construct_Accessibility): Replace a warning about an
inevitable failure of a dynamic check with a legality-rule-violation
error message; adjust the text of the message accordingly.
* exp_ch6.ads (Apply_Access_Discrims_Accessibility_Check): New
procedure, following example of the existing
Apply_CW_Accessibility procedure.
* exp_ch6.adb (Apply_Access_Discrims_Accessibility_Check): Body
for new procedure.
(Expand_Simple_Function_Return): Add call to new
Apply_Access_Discrims_Accessibility_Check procedure.
* exp_ch3.adb (Make_Allocator_For_Return): Add call to new
Apply_Access_Discrims_Accessibility_Check procedure.
testsuite: Fix gcc.target/powerpc/vsx-builtin-7.c test [PR119382]
The test vsx-builtin-7.c failed on powerpc64le-linux due to Identical
Code Folding (ICF) merging the functions insert_di_0_v2 and insert_di_0.
This behavior was introduced by commit r15-7961-gdc47161c1f32c3, which
enhanced alias analysis in ao_compare::compare_ao_refs, enabling the
compiler to identify and optimize structurally identical functions. As a
result, the compiler replaced insert_di_0_v2 with a tail call to
insert_di_0, altering the expected test behavior.
This patch adds -fno-ipa-icf to the test's dg-options to disable ICF,
avoiding function merging and ensuring the test executes correctly.
Pan Li [Wed, 23 Jul 2025 05:02:55 +0000 (13:02 +0800)]
RISC-V: Add test case for vx combine polluting VXRM
Add asm check to make sure vx combine of vaaddu.vx will not pollute
the vxrm.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-fixed-vxrm-1-u16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-fixed-vxrm-1-u32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-fixed-vxrm-1-u64.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-fixed-vxrm-1-u8.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-fixed-vxrm.h: New test.
The vaaddu.vx combine mostly comes from avg_floor, which
requires the vxrm to be RDN. But not every vaaddu.vx should
depend on RDN. The vaaddu.vx combine should use the VXRM
value as is instead of forcing them all to RDN.
This patch fixes this and keeps the VXRM value as is.
gcc/ChangeLog:
* config/riscv/autovec-opt.md (*uavg_floor_vx_<mode>): Rename
from...
(*<sat_op_v_vdup>_vx_<mode>): Rename to...
(*<sat_op_vdup_v>_vx_<mode>): Rename to...
* config/riscv/riscv-protos.h (enum insn_flags): Add vxrm
RNE, ROD type.
(enum insn_type): Add RNE_P, ROD_P type.
(expand_vx_binary_vxrm_vec_vec_dup): Add new func decl.
(expand_vx_binary_vxrm_vec_dup_vec): Ditto.
* config/riscv/riscv-v.cc (get_insn_type_by_vxrm_val): Add
helper to get insn type by vxrm value.
(expand_vx_binary_vxrm_vec_vec_dup): Add new func impl
to expand vec + vec_dup pattern.
(expand_vx_binary_vxrm_vec_dup_vec): Ditto but for
vec_dup + vec pattern.
* config/riscv/vector-iterators.md: Add helper iterator
for sat vx combine.
Nathaniel Shead [Sat, 24 May 2025 00:56:22 +0000 (10:56 +1000)]
c++/modules: Support re-streaming TU_LOCAL_ENTITYs [PR120412]
When emitting a primary module interface, we must re-stream any TU-local
entities that we saw in a partition. This patch adds the missing
members from core_vals.
As a drive-by fix, in some cases we might have a typedef referring to a
TU-local entity; we need to handle that case as well.
aarch64: Add support for unpacked SVE FP unary operations
This patch extends the expander for unpredicated round, nearbyint, floor,
ceil, rint, and trunc, so that it can handle partial SVE FP modes.
We move fabs and fneg to a separate expander, since they are not trapping
instructions.
gcc/ChangeLog:
* config/aarch64/aarch64-sve.md (<optab><mode>2): Replace use of
aarch64_ptrue_reg with aarch64_sve_fp_pred.
(@aarch64_pred_<optab><mode>): Extend from SVE_FULL_F to SVE_F,
and use aarch64_predicate_operand.
* config/aarch64/iterators.md: Split FABS/FNEG out of
SVE_COND_FP_UNARY (into new SVE_COND_FP_UNARY_BITWISE).
aarch64: Relaxed SEL combiner patterns for unpacked SVE FP conversions
Add UNSPEC_SEL combiner patterns for unpacked FP conversions, where the
strictness value is SVE_RELAXED_GP.
gcc/ChangeLog:
* config/aarch64/aarch64-sve.md
(*cond_<optab>_nontrunc<SVE_PARTIAL_F:mode><SVE_HSDI:mode>_relaxed):
New FCVT/SEL combiner pattern.
(*cond_<optab>_trunc<VNx2DF_ONLY:mode><VNx2SI_ONLY:mode>_relaxed):
New FCVTZ{S,U}/SEL combiner pattern.
(*cond_<optab>_nonextend<SVE_HSDI:mode><SVE_PARTIAL_F:mode>_relaxed):
New {S,U}CVTF/SEL combiner pattern.
(*cond_<optab>_trunc<SVE_SDF:mode><SVE_PARTIAL_HSF:mode>):
New FCVT/SEL combiner pattern.
(*cond_<optab>_nontrunc<SVE_PARTIAL_HSF:mode><SVE_SDF:mode>_relaxed):
New FCVTZ{S,U}/SEL combiner pattern.
* config/aarch64/iterators.md: New mode iterator for VNx2SI.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/sve/unpacked_cond_cvtf_1.c: New test.
* gcc.target/aarch64/sve/unpacked_cond_fcvt_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_fcvtz_1.c: Likewise.
Harald Anlauf [Tue, 22 Jul 2025 18:16:16 +0000 (20:16 +0200)]
Fortran: fix passing of character length of function to procedure [PR121203]
PR fortran/121203
gcc/fortran/ChangeLog:
* trans-expr.cc (gfc_conv_procedure_call): Obtain the character
length of an assumed character length procedure from the typespec
of the actual argument even if there is no explicit interface.
Robin Dapp [Thu, 17 Jul 2025 09:09:43 +0000 (11:09 +0200)]
RISC-V: Rework broadcast handling [PR121073].
Over the last few weeks it became clear that our current broadcast
handling needs an overhaul in order to improve maintainability.
PR121073 showed that my intermediate fix wasn't enough and caused
regressions.
This patch now takes a first step towards untangling broadcast
(vmv.v.x), "set first" (vmv.s.x), and zero-strided load (vlse).
Also can_be_broadcast_p is rewritten and strided_broadcast_p is
introduced to make the distinction clear directly in the predicates.
Due to the pervasiveness of the patterns I needed to touch a lot
of places and tried to clear up some things while at it. The patch
therefore also introduces new helpers expand_broadcast for vmv.v.x
that dispatches to regular as well as strided broadcast and
expand_set_first that does the same thing for vmv.s.x.
The non-strided fallbacks are now implemented as splitters of the
strided variants. This makes it easier to see where and when things
happen.
The test cases I touched appeared wrong to me so this patch sets a new
baseline for some of the scalar_move tests.
There is still work to be done but IMHO that can be deferred: It would
be clearer if the three broadcast-like variants differed not just in
name but also in RTL pattern so matching is not as confusing. Right now
vmv.v.x and vmv.s.x only differ in the mask and are interchangeable by
just changing it from "all ones" to a "single one".
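The interchangeability can be pictured with the simplified model below. It
is not the RTL pattern; `masked_broadcast` is a hypothetical helper that
writes a scalar into every element whose mask bit is set and leaves the
rest untouched.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Write X into each element of DEST whose MASK entry is true; other
// elements keep their previous value.
template <std::size_t N>
std::array<int, N> masked_broadcast(std::array<int, N> dest, int x,
                                    const std::array<bool, N>& mask) {
  for (std::size_t i = 0; i < N; ++i)
    if (mask[i])
      dest[i] = x;
  return dest;
}
```

In this model an all-ones mask behaves like vmv.v.x, and a mask with only
the first bit set behaves like vmv.s.x, which is why differing only in the
mask makes the two easy to confuse when matching.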
As last time, I regtested on rv64 and rv32 with strided_broadcast turned
on and off. Note there are regressions in cond_fma_fnma-[78].c. Those are
due to the patch exposing more fwprop/late-combine opportunities. For
fma/fnma we don't yet have proper costing for vv/vx in place but I
expect that to be addressed soon and figured we can live with those for
the time being.
PR target/121073
gcc/ChangeLog:
* config/riscv/autovec-opt.md: Use new helpers.
* config/riscv/autovec.md: Ditto.
* config/riscv/predicates.md (strided_broadcast_mask_operand):
New predicate.
(strided_broadcast_operand): Ditto.
(any_broadcast_operand): Ditto.
* config/riscv/riscv-protos.h (expand_broadcast): Declare.
(expand_set_first): Ditto.
(expand_set_first_tu): Ditto.
(strided_broadcast_p): Ditto.
* config/riscv/riscv-string.cc (expand_vec_setmem): Use new
helpers.
* config/riscv/riscv-v.cc (expand_broadcast): New function.
(expand_set_first): Ditto.
(expand_set_first_tu): Ditto.
(expand_const_vec_duplicate): Use new helpers.
(expand_const_vector_duplicate_repeating): Ditto.
(expand_const_vector_duplicate_default): Ditto.
(sew64_scalar_helper): Ditto.
(expand_vector_init_merge_repeating_sequence): Ditto.
(expand_reduction): Ditto.
(strided_broadcast_p): New function.
(whole_reg_to_reg_move_p): Use new helpers.
* config/riscv/riscv-vector-builtins-bases.cc: Use either
broadcast or strided broadcast.
* config/riscv/riscv-vector-builtins.cc (function_expander::use_ternop_insn):
Ditto.
(function_expander::use_widen_ternop_insn): Ditto.
(function_expander::use_scalar_broadcast_insn): Ditto.
* config/riscv/riscv-vector-builtins.h: Declare scalar
broadcast.
* config/riscv/vector.md (*pred_broadcast<mode>): Split into
regular and strided broadcast.
(*pred_broadcast<mode>_zvfh): Split.
(pred_broadcast<mode>_zvfh): Ditto.
(*pred_broadcast<mode>_zvfhmin): Ditto.
(@pred_strided_broadcast<mode>): Ditto.
(*pred_strided_broadcast<mode>): Ditto.
(*pred_strided_broadcast<mode>_zvfhmin): Ditto.
Andrew Pinski [Tue, 22 Jul 2025 17:26:54 +0000 (10:26 -0700)]
aarch64: Fix fma steering when rename fails [PR120119]
Regrename can fail in some cases and `insn_rr[INSN_UID (insn)].op_info`
will be null. The FMA steering code was not expecting the failure to happen.
This started to happen after early RA was added but it has been a latent bug
before that.
Built and tested for aarch64-linux-gnu.
PR target/120119
gcc/ChangeLog:
* config/aarch64/cortex-a57-fma-steering.cc (func_fma_steering::analyze):
Skip if renaming fails.
gcc/testsuite/ChangeLog:
* g++.dg/torture/pr120119-1.C: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Robert Dubner [Wed, 23 Jul 2025 12:44:54 +0000 (08:44 -0400)]
cobol: Tweak adjustments to location_t of GENERIC nodes for PERFORM.
COBOL has a group of PERFORM statements that require careful adjustments to
the location_t elements of the GENERIC nodes so that the COBOL-aware version
of GDB behaves properly. These changes are in service of that goal.
Patrick Palka [Wed, 23 Jul 2025 12:38:12 +0000 (08:38 -0400)]
c++: name lookup for non-dep rewritten != expr [PR121179]
Here we're incorrectly rejecting the modules testcase (reduced from a
std module example):
$ cat 121179_a.C
export module foo;
enum class E { x };
bool operator==(E, int);
export
template<class T>
void f() {
E::x != 0;
}
$ cat 121179_b.C
import foo;
template void f<int>();
$ g++ -fmodules 121179_*.C
In module foo, imported at 121179_b.C:1:
121179_a.C: In instantiation of ‘void f@foo() [with T = int]’:
121179_b.C:3:9: required from here
121179_a.C:9:8: error: no match for ‘operator!=’ (operand types are ‘E@foo’ and ‘int’)
This is ultimately because our non-dependent rewritten operator expression
handling throws away the result of unqualified lookup at template parse time,
and so we have to repeat the lookup at instantiation time which fails because
the operator== isn't exported.
This is a known deficiency, but it's easy enough to narrowly fix this
for simple != to == rewrites by making build_min_non_dep_op_overload
look through logical negation.
PR c++/121179
gcc/cp/ChangeLog:
* call.cc (build_new_op): Don't clear *overload for a simple
!= to == rewrite.
* tree.cc (build_min_non_dep_op_overload): Handle TRUTH_NOT_EXPR
appearing in a rewritten operator expression.
gcc/testsuite/ChangeLog:
* g++.dg/lookup/operator-8.C: Strengthen test and remove one
XFAIL.
Patrick Palka [Wed, 23 Jul 2025 12:31:46 +0000 (08:31 -0400)]
c++: fix __is_invocable for std::reference_wrapper [PR121055]
Our implementation of the INVOKE spec ([func.require]) was incorrectly
treating reference_wrapper<T>::get() as returning T instead of T&, which
notably makes a difference when invoking a ref-qualified memfn pointer.
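A minimal sketch of the distinction, written against the standard library
traits rather than the __is_invocable built-in itself; the struct S and its
member names are made up for illustration, and C++17 is assumed:

```cpp
#include <functional>
#include <type_traits>

struct S
{
  int lvalue_only () & { return 1; }   // callable on lvalues only
  int rvalue_only () && { return 2; }  // callable on rvalues only
};

using LRef = int (S::*) () &;
using RRef = int (S::*) () &&;

// INVOKE (f, w) with w a reference_wrapper<S> means (w.get ().*f) (),
// and w.get () is an lvalue of type S.
static_assert (std::is_invocable_v<LRef, std::reference_wrapper<S>>,
               "&-qualified member function accepts the unwrapped lvalue");
static_assert (!std::is_invocable_v<RRef, std::reference_wrapper<S>>,
               "&&-qualified member function rejects the unwrapped lvalue");

// Runtime counterpart of the first assertion.
inline int
call_through_wrapper (S &s)
{
  return std::invoke (&S::lvalue_only, std::ref (s));
}
```

Treating get() as returning T rather than T& would be expected to flip both
answers, which is why the ref-qualified case is the one that exposes the
difference.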
PR c++/121055
gcc/cp/ChangeLog:
* method.cc (build_invoke): Correct reference_wrapper handling.
aarch64: testsuite: Keep -mtune=generic when specifying -moverride
gcc/testsuite/ChangeLog:
* lib/gcc-defs.exp (aarch64-arg-dg-options): Split add_tune into
add_tune and add_override, so that specifying -moverride does not
change the baseline tuning from the testsuite's default (generic).
libstdc++: Prepare test code for default_accessor for reuse.
All test code of default_accessor can be reused. This commit moves
the reusable code into a file generic.cc and prepares the tests for
reuse with aligned_accessor.
libstdc++-v3/ChangeLog:
* testsuite/23_containers/mdspan/accessors/default.cc: Delete.
* testsuite/23_containers/mdspan/accessors/generic.cc: Slightly
generalize the test code previously in default.cc.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com>
Richard Biener [Wed, 23 Jul 2025 07:40:24 +0000 (09:40 +0200)]
tree-optimization/121220 - improve sinking of stores
We currently do only very restricted store sinking into paths
that have no loads or stores and end in a virtual PHI. The
following extends this to sink towards a single virtual
definition in addition to the case of a PHI, handling skipping
of unrelated virtual uses. We later have to prune cases
that would require virtual PHI insertion and the patch below
basically restricts this to sinking to noreturn paths for now.
PR tree-optimization/121220
* tree-ssa-sink.cc (statement_sink_location): For stores
handle sinking to paths ending in a store. Skip loads
that do not use the store.
Martin Jambor [Wed, 23 Jul 2025 09:22:33 +0000 (11:22 +0200)]
tree-sra: Avoid total SRA if there are incompat. aggregate accesses (PR119085)
We currently use the types encountered in the function body, and not in
the type declaration, to perform total scalarization. PR 119085
uncovered that we lack a check that, when the same data is accessed
through several aggregate types, those types are actually compatible.
Without it, we can base total scalarization on a type that does not
"cover" all live data in a different part of the function. This patch
adds the check.
gcc/ChangeLog:
2025-07-21 Martin Jambor <mjambor@suse.cz>
PR tree-optimization/119085
* tree-sra.cc (sort_and_splice_var_accesses): Prevent total
scalarization if two incompatible aggregates access the same place.
gcc/testsuite/ChangeLog:
2025-07-21 Martin Jambor <mjambor@suse.cz>
PR tree-optimization/119085
* gcc.dg/tree-ssa/pr119085.c: New test.
Tomasz Kamiński [Tue, 22 Jul 2025 11:42:07 +0000 (13:42 +0200)]
libstdc++: Negative tests for constexpr uses inplace_vector [PR119137]
Adds negative tests for preconditions on inserting into a full
inplace_vector and erasing non-existent elements at compile-time.
This ensures coverage for the inplace_vector<T, 0> specialization.
Also extends element access tests to cover front() and back()
methods, and const and mutable overloads for all accesses.
PR libstdc++/119137
libstdc++-v3/ChangeLog:
* testsuite/23_containers/inplace_vector/access/elem.cc: Cover
front and back methods and const calls.
* testsuite/23_containers/inplace_vector/access/elem_neg.cc:
Likewise.
* testsuite/23_containers/inplace_vector/modifiers/erase_neg.cc:
New test.
* testsuite/23_containers/inplace_vector/modifiers/single_insert_neg.cc:
New test.
Prompted by the discussions around a recent clang bug, I realized that
gcc still defaults to -mcpu=v9 on Solaris/SPARC.
This is an oversight since the Oracle Studio 12.6 cc, released in 2017,
already defaults to -xarch=sparcvis2, the equivalent of
-mcpu=ultrasparc3. Besides, both the 32 and 64-bit libc.so.1 require
UltraSPARC III extensions anyway:
SPARC32PLUS Version 1, V8+ Required, UltraSPARC3 Extensions Required [VIS]
SPARCV9 Version 1, UltraSPARC3 Extensions Required [VIS]
So this patch follows suit.
Bootstrapped on sparc-sun-solaris2.11 and sparcv9-sun-solaris2.11 with
as/ld, gas/ld, and gas/gld configurations.
There are currently two regressions exposed by this patch (PRs 121191
and 121192), which are only present in gcc 16 resp. 15/16.
There's one small caveat: while Solaris now marks all objects with
EF_SPARC_32PLUS EF_SPARC_SUN_US1 EF_SPARC_SUN_US3, gas only sets the
EF_SPARC_SUN_US[13] flags in the ELF header if UltraSPARC I/III insns
are actually used. This is in accordance with the SPARC Compliance
Definition 2.4.1, 4P-1. In the end, it doesn't matter since
libc.so.1 already has both flags, so the resulting executables and
shared objects will too.
Richard Biener [Tue, 22 Jul 2025 13:04:16 +0000 (15:04 +0200)]
[aarch64] check for non-NULL vectype in aarch64_vector_costs::add_stmt_cost
With a patch still in development we get NULL STMT_VINFO_VECTYPE.
One side-effect is that during scalar stmt testing we no longer
pass a vectype. The following adjusts aarch64_vector_costs::add_stmt_cost
to check for a non-NULL vectype before accessing it, like all the
code surrounding it. The other possible fix would have been
to re-order the check with the vect_mem_access_type one, but that
one is not going to exist during scalar code costing either in the
future.
* config/aarch64/aarch64.cc (aarch64_vector_costs::add_stmt_cost):
Check vectype is non-NULL before accessing it.
Andrew Pinski [Wed, 23 Jul 2025 05:11:29 +0000 (22:11 -0700)]
testsuite: Mark fn1 in pr81627.c as noinline [PR120101]
Since r16-372-g064cac730f88dc, fn1 is inlined into main, which
made the scan dump fail since it was looking for it only once.
Marking fn1 as noinline gets us back to the old behavior and
makes the test no longer depend on the inliner.
Pushed as obvious after a quick test.
PR testsuite/120101
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/pr81627.c (fn1): Mark as noinline.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Jason Merrill [Wed, 16 Jul 2025 15:52:45 +0000 (11:52 -0400)]
c++: constexpr union placement new [PR121068]
The note and example in [class.union] p6 think that placement new can be
used to change the active member of a union, but we didn't support that for
array members in constant-evaluation even after implementing P1330 and
P2747.
First I tried to address this by introducing a CLOBBER_BEGIN_OBJECT for the
entire array, but that broke the resolution of LWG3436, which invokes 'new
T[1]' for an array T, and trying to clobber a multidimensional array when
the actual object is single-dimensional breaks. So I've raised that issue
with the committee. Until that is resolved, this patch takes a simpler
approach: allow initialization of an element of an array to make the array
the active member of a union.
PR c++/121068
gcc/cp/ChangeLog:
* constexpr.cc (cxx_eval_store_expression): Allow ARRAY_REFs
when activating an array member of a union.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/constexpr-union6.C: Expect x5 to work.
* g++.dg/cpp26/constexpr-new4.C: New test.
Andrew Pinski [Tue, 29 Apr 2025 14:24:08 +0000 (07:24 -0700)]
Change __builtin_unreachable to __builtin_trap (or infinite loop) if only thing in function [PR109267]
When we have an empty function, things can go wrong with
cfi_startproc/cfi_endproc and a few other things like exceptions. So if
the only thing the function does is a call to __builtin_unreachable,
let's replace that with a __builtin_trap instead if the target has a trap
instruction. For targets without a trap instruction defined, replace it
with an infinite loop; this avoids the need for an abort call while
still giving the correct behavior of not having two functions at the same
location.
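For illustration only, the kind of function affected looks like the sketch
below. The function names are hypothetical, and the comments describe the
situation the message above is concerned with rather than verified compiler
output.

```cpp
// The only statement is a call to __builtin_unreachable, so before this
// change the function could end up with no instructions at all.
[[noreturn]] void
only_unreachable ()
{
  __builtin_unreachable ();
}

// With an empty body in front of it, a function emitted next could start
// at the same address as only_unreachable.
int
following_function ()
{
  return 42;
}
```

only_unreachable must never actually be called; the point is just that its
body is now expected to contain a trap (or an infinite loop on targets
without a trap instruction) instead of nothing.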
The QOI idea for basic block reorder is recorded as PR 120004.
Changes since v1:
* v2: Move to final gimple cfg cleanup instead of expand and use
BUILT_IN_UNREACHABLE_TRAP.
* v3: For targets without a trap defined, create an infinite loop.
Bootstrapped and tested on x86_64-linux-gnu.
PR middle-end/109267
gcc/ChangeLog:
* tree-cfgcleanup.cc (execute_cleanup_cfg_post_optimizing): If the first
non debug statement in the first (and only) basic block is a call
to __builtin_unreachable change it to a call to __builtin_trap or an
infinite loop.
gcc/testsuite/ChangeLog:
* lib/target-supports.exp (check_effective_target_trap): New proc.
* g++.dg/missing-return.C: Update testcase for the !trap case.
* gcc.dg/pr109267-1.c: New test.
* gcc.dg/pr109267-2.c: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
This patch fixes the following defects in the function:
- The cost of move instructions larger than the natural word width,
specifically "movd[if]_internal", cannot be estimated correctly
- Floating-point or symbolic constant assignment insns cannot be
identified as L32R instructions
gcc/ChangeLog:
* config/xtensa/xtensa.cc (xtensa_is_insn_L32R_p):
Rewrite to capture insns that could be L32R machine instructions
wherever possible.
(xtensa_rtx_costs): Fix to consider that moves larger than a
natural word can take multiple L32R machine instructions.
(constantpool_address_p): Cosmetics.
* config/xtensa/xtensa.md (movdi_internal, movdf_internal):
Add missing insn attributes.
xtensa: Make relaxed MOVI instructions treated as "load" type
The relaxed MOVI instructions in the Xtensa ISA are assignments that
contain large integer, floating-point or symbolic constants that would not
normally be allowed as immediate values by instructions in assembly code,
and that will instead be translated later by the assembler, rather than the
compiler, into L32R instructions referencing literal pool entries containing
those values (see the '-mauto-litpools' Xtensa-specific option).
This means that even though such instructions look like nothing more than
constant value assignments in their RTL representation, the code may perform
better if they are treated as loads from memory (i.e. the actual behavior)
and if the value is not used immediately after the load, especially from an
instruction scheduling perspective.
gcc/ChangeLog:
* config/xtensa/xtensa.md
(movsi_internal, movhi_internal, movsf_internal):
Change the value of the "type" attribute from "move" to "load"
when the source operand constraint is "Y".
Karl Meakin [Tue, 15 Jul 2025 14:49:58 +0000 (14:49 +0000)]
middle-end: Enable masked load with non-constant offset
The function `vect_check_gather_scatter` requires the `base` of the load
to be loop-invariant and the `off`set to be not loop-invariant. When faced
with a scenario where `base` is not loop-invariant, instead of giving up
immediately we can try swapping the `base` and `off`, if `off` is
actually loop-invariant.
Previously, it would only swap if `off` was the constant zero (and so
trivially loop-invariant). This is too conservative: we can still
perform the swap if `off` is a more complex but still loop-invariant
expression, such as a variable defined outside of the loop.
This allows loops like the function below to be vectorised, if the
target has masked loads and sufficiently large vector registers (eg
`-march=armv8-a+sve -msve-vector-bits=128`):
```c
typedef struct Array {
    int elems[3];
} Array;

int loop(Array **pp, int len, int idx) {
    int nRet = 0;
    for (int i = 0; i < len; i++) {
        Array *p = pp[i];
        if (p) {
            nRet += p->elems[idx];
        }
    }
    return nRet;
}
```
gcc/ChangeLog:
* tree-vect-data-refs.cc (vect_check_gather_scatter): Swap
`base` and `off` in more scenarios. Also assert at the end of
the function that `base` and `off` are loop-invariant and not
loop-invariant respectively.
Tomasz Kamiński [Tue, 22 Jul 2025 07:44:24 +0000 (09:44 +0200)]
libstdc++: Make testsuite_iterators constexpr and expand inplace_vector tests [PR119137]
All functions in testsuite_iterators.h are now marked constexpr,
targeting the earliest possible standard. Most functions use C++14 due
to multi-statement bodies, with exceptions:
* BoundsContainer and some constructors are C++11 compatible.
* OutputContainer is C++20 due to operator new/delete usage.
Before C++23, each constexpr templated function requires a constexpr
-suitable instantiation. Functions delegating to _GLIBCXX14_CONSTEXPR
must also be _GLIBCXX14_CONSTEXPR; e.g., forward_iterator_wrapper's
constructor calling input_iterator_wrapper's constructor, or
operator-> calling operator*.
For classes defined in C++20 or later (e.g., test_range), constexpr is
applied unconditionally.
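The rules above can be sketched as follows. This is not the code from
testsuite_iterators.h: MY_CXX14_CONSTEXPR is a hypothetical stand-in for
_GLIBCXX14_CONSTEXPR, and input_wrapper, step and sum are made-up names.

```cpp
#if __cplusplus >= 201402L
# define MY_CXX14_CONSTEXPR constexpr
#else
# define MY_CXX14_CONSTEXPR
#endif

struct input_wrapper
{
  const int *ptr;

  // Empty body and a single return statement: valid constexpr in C++11.
  constexpr explicit input_wrapper (const int *p) : ptr (p) {}
  constexpr const int &operator* () const { return *ptr; }

  // Modifies *this, so it can only be constexpr from C++14 on.
  MY_CXX14_CONSTEXPR input_wrapper &operator++ ()
  {
    ++ptr;
    return *this;
  }
};

// A single return statement, but it delegates to operator++ and
// therefore carries the same macro.
MY_CXX14_CONSTEXPR input_wrapper
step (input_wrapper w)
{
  return ++w;
}

// Multi-statement body: C++14 constexpr.
MY_CXX14_CONSTEXPR int
sum (input_wrapper first, const int *last)
{
  int total = 0;
  for (; first.ptr != last; ++first)
    total += *first;
  return total;
}

#if __cplusplus >= 201402L
constexpr int data[] = {1, 2, 3};
static_assert (*step (input_wrapper (data)) == 2,
               "step is usable at compile time");
static_assert (sum (input_wrapper (data), data + 3) == 6,
               "sum is usable at compile time");
#endif
```

Marking step as plain constexpr would be ill-formed in C++11, because
operator++ is not constexpr there; using the same macro on both keeps the
header valid for every standard mode.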
PR libstdc++/119137
libstdc++-v3/ChangeLog:
* testsuite/23_containers/inplace_vector/cons/from_range.cc: Run
iterators and range test at compile-time.
* testsuite/23_containers/inplace_vector/modifiers/assign.cc:
Likewise.
* testsuite/23_containers/inplace_vector/modifiers/multi_insert.cc:
Likewise.
* testsuite/util/testsuite_iterators.h (__gnu_test::BoundsContainer)
(__gnu_test::OutputContainer, __gnu_test::WritableObject)
(__gnu_test::output_iterator_wrapper, __gnu_test::input_iterator_wrapper)
(__gnu_test::forward_iterator_wrapper)
(__gnu_test::bidirectional_iterator_wrapper)
(__gnu_test::random_access_iterator_wrapper)
(__gnu_test::test_container): Add appropriate _GLIBCXXNN_CONSTEXPR
macros to member functions.
(__gnu_test::contiguous_iterator_wrapper)
(__gnu_test::input_iterator_wrapper_rval)
(__gnu_test::test_range, __gnu_test::test_range_nocopy)
(__gnu_test::test_sized_range_sized_sent)
(__gnu_test::test_sized_range): Add constexpr specifier to member
functions.
Jeff Law [Tue, 22 Jul 2025 13:26:57 +0000 (07:26 -0600)]
[RISC-V] Restrict generic-vector-ooo DFA
So while debugging Austin's work to support the spacemit x60 in the BPI we
found that even though his pipeline description had mappings for all the vector
instructions, they were still getting matched by the generic-vector-ooo DFA.
The core problem is that DFA never restricted itself to a tune option (oops).
That's easily fixed, at which time everything using generic blows up because we
don't have a generic in-order vector DFA. Everything using generic was
indirectly also using generic-vector-ooo for the vector instructions.
It may be better long term to define a generic-vector DFA, but to preserve
behavior, I'm letting generic-vector-ooo match when the generic DFA is active.
Tested in my tester, waiting on pre-commit CI before moving forward.
gcc/
* config/riscv/generic-vector-ooo.md: Restrict insn reservations to
generic_ooo and generic tuning models.
When we have a vector shift with a scalar the shift operand can be
external - in that case we should not use the shift operand def
as hint where to place the vector shift instruction. The ICE
in the PR is because stmt dominance queries only work inside of
the vector region. But we should also never place stmts outside
of it.
PR tree-optimization/121202
* tree-vect-slp.cc (vect_schedule_slp_node): Do not take
an out-of-region stmt as "last".
Gary Dismukes [Fri, 11 Jul 2025 23:30:18 +0000 (23:30 +0000)]
ada: Nested use_type_clause with "all" cancels use_type_clause with wider scope
The compiler mishandles nested use_type_clauses in the case where the
outer one is a normal use_type_clause and the inner one has "all".
Upon leaving the scope of the inner use_type_clause, the outer one
is effectively disabled, because it's not considered redundant (and
in fact it's only partially redundant). This is fixed by testing for
the presence of a use_type_clause for the same type that has a wider
scope when ending the inner use_type_clause.
gcc/ada/ChangeLog:
* sem_ch8.adb (End_Use_Type): Add a test for there not being an earlier
use_type_clause for the same type as an additional criterion for turning
off In_Use and Current_Use_Clause.
This patch adds a GNAT-specific extension which enables "destructors".
Destructors are an optional replacement for Ada.Finalization where some
aspects of the interaction with type derivation are different.
gcc/ada/ChangeLog:
* doc/gnat_rm/gnat_language_extensions.rst: Document new extension.
* snames.ads-tmpl: Add name for new aspect.
* gen_il-fields.ads (Has_Destructor, Is_Destructor): Add new fields.
* gen_il-gen-gen_entities.adb (E_Procedure, Type_Kind): Add new fields.
* einfo.ads (Has_Destructor, Is_Destructor): Document new fields.
* aspects.ads: Add new aspect.
* sem_ch13.adb (Analyze_Aspect_Specifications,
Check_Aspect_At_Freeze_Point, Check_Aspect_At_End_Of_Declarations):
Add semantic analysis for new aspect.
(Resolve_Finalization_Procedure): New function.
(Resolve_Finalizable_Argument): Use new function above.
* sem_util.adb (Propagate_Controlled_Flags): Extend for new field.
* freeze.adb (Freeze_Entity): Add legality check for new aspect.
* exp_ch3.adb (Expand_Freeze_Record_Type, Predefined_Primitive_Bodies):
Use new field.
* exp_ch7.adb (Build_Finalize_Statements): Add expansion for
destructors.
(Make_Final_Call, Build_Record_Deep_Procs): Adapt to new Has_Destructor
field.
(Build_Adjust_Statements): Tweak to handle cases of empty lists.
* gnat_rm.texi: Regenerate.
ada: Fix generation of Initialize and Adjust calls
Before this patch, Make_Init_Call and Make_Adjust_Call made the
assumption that if the type they were called with was untagged and a
derived type, it was the untagged private view of a tagged type. That
assumption made it possible to inspect the root type's primitives to
handle the case where the underlying type was implicitly generated by
the compiler without all inherited primitives.
The introduction of the Finalizable aspect broke that assumption, so
this patch adds a new field to type entities that makes the generated
full view stand out, and updates Make_Init_Call and Make_Adjust_Call to
only jump to the root type when they're passed one of those generated
types.
Make_Final_Call and Finalize_Address are two other subprograms that
perform the same test on the types they're passed. They did not suffer
from the same bug as Make_Init_Call and Make_Adjust_Call because of an
earlier, more ad hoc fix, but this patch switches them over to the newly
introduced mechanism for the sake of consistency.
gcc/ada/ChangeLog:
* gen_il-fields.ads (Is_Implicit_Full_View): New field.
* gen_il-gen-gen_entities.adb (Type_Kind): Use new field.
* einfo.ads (Is_Implicit_Full_View): Document new field.
* exp_ch7.adb (Make_Adjust_Call, Make_Init_Call, Make_Final_Call): Use
new field.
* exp_util.adb (Finalize_Address): Likewise.
* sem_ch3.adb (Copy_And_Build): Set new field.
Eric Botcazou [Tue, 8 Jul 2025 19:40:44 +0000 (21:40 +0200)]
ada: Remove obsolete code from Safe_Unchecked_Type_Conversion
That's a kludge added to work around the limitations of the stack checking
mechanism used in the early days.
gcc/ada/ChangeLog:
* exp_util.ads (May_Generate_Large_Temp): Delete.
* exp_util.adb (May_Generate_Large_Temp): Likewise.
(Safe_Unchecked_Type_Conversion): Do not take stack checking into
account to compute the result.
Javier Miranda [Mon, 12 May 2025 18:46:11 +0000 (18:46 +0000)]
ada: Wrong dispatch on result in presence of dependent expression
The compiler generates wrong code in a dispatching call on result
when the call is performed under dependent conditional expressions
or case-expressions.
gcc/ada/ChangeLog:
* sinfo.ads (Is_Expanded_Dispatching_Call): New flag.
(Tag_Propagated): New flag.
* exp_ch6.adb (Expand_Call_Helper): Propagate the tag when
the dispatching call is placed in conditional expressions or
case-expressions.
* sem_ch5.adb (Analyze_Assignment): For assignment of tag-
indeterminate expression, do not propagate the tag if
previously done.
* sem_disp.adb (Is_Tag_Indeterminate): Add missing support
for conditional expression and case expression.
* exp_disp.ads (Is_Expanded_Dispatching_Call): Removed. Function
replaced by a new flag in the nodes.
* exp_disp.adb (Expand_Dispatching_Call): Set a flag in the
call node to remember that the call has been expanded.
(Is_Expanded_Dispatching_Call): Function removed.
* gen_il-fields.ads (Tag_Propagated): New flag.
(Is_Expanded_Dispatching_Call): New flag.
* gen_il-gen-gen_nodes.adb (Tag_Propagated): New flag.
(Is_Expanded_Dispatching_Call): New flag.