git.ipfire.org Git - thirdparty/gcc.git/log

c++, coroutines: Improve diagnostics for awaiter/promise.

At present, we can issue diagnostics about missing or malformed
awaiter or promise methods when we encounter their uses in the
body of a user's function. We might then re-issue the same
diagnostics when processing the initial or final await expressions.

This change avoids such duplication, and also attempts to
identify issues with the initial or final expressions specifically
since diagnostics for those do not have any useful line number.

gcc/cp/ChangeLog:

* coroutines.cc (build_co_await): Identify diagnostics
for initial and final await expressions.
(cp_coroutine_transform::wrap_original_function_body): Do
not handle initial and final await expressions here ...
(cp_coroutine_transform::apply_transforms): ... handle them
here and avoid duplicate diagnostics.
* coroutines.h: Declare initial and final await expressions
in the transform class. Save the function closing brace
location.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/coro1-missing-await-method.C: Adjust for
improved diagnostics.
* g++.dg/coroutines/coro-missing-final-suspend.C: Likewise.
* g++.dg/coroutines/pr104051.C: Move to...
* g++.dg/coroutines/pr104051-0.C: ...here.
* g++.dg/coroutines/pr104051-1.C: New test.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
(cherry picked from commit 4c014a9db521b24bd900eb9a6ac70988322a57da)

c++, coroutines: Handle builtin_constant_p [PR116775].

Since the folding of this builtin happens after the main coroutine FE
lowering, we need to account for await expressions in that lowering.

Since these expressions have a property of being not evaluated, but do
not have the full constraints of an unevaluated context, we want to
apply the checks and then remove the await expressions so that they no
longer participate in the analysis and lowering.

When a builtin_constant_p call is encountered, and the operand contains
any await expression, we check to see if the operand can be a constant
and replace the call with its result.
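
The "not evaluated" property of the operand can be seen outside coroutines
too. A minimal sketch, assuming a GCC-compatible compiler that provides
__builtin_constant_p; the names call_count and bump are made up for
illustration and are not taken from the PR116775 test:

```cpp
// Illustrative only: the operand of __builtin_constant_p is tested, not
// evaluated, so side effects inside it are discarded.
static int call_count = 0;

// Hypothetical helper with a visible side effect.
static int bump() { return ++call_count; }

// Reports how often bump() ran after being used only as a
// __builtin_constant_p operand.
int calls_after_query() {
  call_count = 0;
  int is_const = __builtin_constant_p(bump());  // operand is not evaluated
  (void)is_const;  // the value may depend on the optimizer; not relied on here
  return call_count;
}

// A literal is a compile-time constant.
int literal_is_constant() { return __builtin_constant_p(42); }
```

Here calls_after_query() leaves the counter untouched, which is the same
reason an await expression inside the operand never needs to be lowered.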

PR c++/116775

gcc/cp/ChangeLog:

* coroutines.cc (analyze_expression_awaits): When we see
a builtin_constant_p call, and that contains one or more
await expressions, then replace the call with its result
and discard the unevaluated operand.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/pr116775.C: New test.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
(cherry picked from commit ab3f04b73e5a1dd734d3bab64b4878d2d0cc29ad)

c++, coroutines: Ensure that the resumer is marked as can_throw.

We must flag that the resumer might throw (since the wrapping of the
original function body unconditionally adds a try-catch/rethrow). We
also add code that might throw - even when the original function body
would not.

TODO: We could improve code-gen by recognising cases where the combined
body + initial await expressions cannot throw and omitting the unneeded
try/catch/rethrow wrapper.

gcc/cp/ChangeLog:

* coroutines.cc (build_actor_fn): Set can_throw.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
(cherry picked from commit e83c4bfc338fad0c87b2debb37ccfe98d148c7ac)

c++: Fix template class lookup [PR120495, PR115605].

In the reported issues, using lookup_template_class () directly on (for example)
the coroutine_handle identifier fails because a class-local TYPE_DECL is found.
This is because, in the existing code, lookup is called with default parameters
which means that class contexts are examined first.

Fix this, when a context is provided by the caller, by doing lookup in namespace
provided.
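
A user-level analogy for the lookup order may help; this is ordinary C++
name lookup, not the compiler-internal lookup_template_class code, and the
names lib, handle and Task are hypothetical:

```cpp
namespace lib {
// The namespace-scope class template we actually want to find.
template <typename T>
struct handle {
  static constexpr int origin() { return 1; }
};
}  // namespace lib

struct Task {
  // A class-local type that happens to share the template's name.
  struct handle {
    static constexpr int origin() { return 2; }
  };

  // Unqualified lookup starts in the class scope and finds Task::handle.
  static int unqualified() { return handle::origin(); }

  // Naming the namespace explicitly finds the template instead.
  static int qualified() { return lib::handle<int>::origin(); }
};
```

Task::unqualified() picks the class-local type, while Task::qualified()
reaches the template, which mirrors the effect of honouring the namespace
context supplied by the caller.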

PR c++/120495
PR c++/115605

gcc/cp/ChangeLog:

* pt.cc (lookup_template_class): Honour provided namespace contexts
when looking up class templates.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/pr120495.C: New test.
* g++.dg/pr115605.C: New test.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
(cherry picked from commit cf588f1a8e7406ced5b08f32f9d23f015a240a31)

c++, coroutines: Make analyze_fn_parms into a class method.

This continues code cleanups and migration to encapsulation of the
whole coroutine transform.

gcc/cp/ChangeLog:

* coroutines.cc (analyze_fn_parms): Move from free function ...
(cp_coroutine_transform::analyze_fn_parms): ... to method.
(cp_coroutine_transform::apply_transforms): Adjust call to
analyze_fn_parms.
* coroutines.h: Declare analyze_fn_parms.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
(cherry picked from commit 2a8af97e3528f812201687334f64b27b94d01271)

c++, coroutines: Simplify initial_await_resume_called.

We do not need to generate this code early, since it does not affect
any of the analysis. Lowering it later takes less code, and avoids
modifying the initial await expression which will simplify changes
to analysis to deal with open PRs.

gcc/cp/ChangeLog:

* coroutines.cc (expand_one_await_expression): Set the
initial_await_resume_called flag here.
(build_actor_fn): Populate the frame accessor for the
initial_await_resume_called flag.
(cp_coroutine_transform::wrap_original_function_body): Do
not modify the initial_await expression to include the
initial_await_resume_called flag here.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
(cherry picked from commit bfd4aae0a999375cf008b75c14607c7b94daced3)

c++, coroutines: Some cleanups in build_actor.

We were incorrectly guarding all the frame cleanups on the
basis of frame_needs_free (which is always set for the present
code-gen since we have no allocation elision). The net result
being that the (incorrect) code was behaving as expected.

We built, but never used, a label for the frame destruction;
in practice it is never triggered independently of the promise
and argument copy destruction.

Finally there are a few instances where we had been building
expressions manually rather than using higher-level APIs.

gcc/cp/ChangeLog:

* coroutines.cc (build_actor_fn): Remove an unused
label, guard the frame deallocation correctly, use
simpler APIs to build if and return statements.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
(cherry picked from commit 7037c72a8bb41ed07da64b23e14857a066a91baa)

c++: Emit an error for attempted constexpr co_await [PR118903].

We checked that the coroutine expressions were not suitable for
constexpr, but did not emit an error when needed.

PR c++/118903

gcc/cp/ChangeLog:

* constexpr.cc (potential_constant_expression_1): Emit
an error when co_await et al. are used in constexpr
contexts.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/pr118903.C: New test.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
(cherry picked from commit 0ae77a05c416c9f750cb87f1bef0800651168b7e)

c++: Add co_await, co_yield and co_return to dump_expr.

These were omitted there as an oversight; most of the error handling
for the coroutines code is specific rather than using generic %qE etc.

gcc/cp/ChangeLog:

* error.cc (dump_expr): Add co_await, co_yield and co_return.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
(cherry picked from commit 09cac2a833689f2535d6c2c88a67b2169df4e4d7)

c++, coroutines: Make a check more specific [PR109283].

The check was intended to assert that we had visited contained
ternary expressions with embedded co_awaits, but had been made
too general - and therefore was ICEing on code that was actually
OK. Fixed by checking specifically that no co_awaits are embedded.

PR c++/109283

gcc/cp/ChangeLog:

* coroutines.cc (find_any_await): Only save the statement
pointer if the caller passes a place for it.
(flatten_await_stmt): When checking that ternary expressions
have been handled, also check that they contain a co_await.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/pr109283.C: New test.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
(cherry picked from commit 977fadd69776e2a8a6daca43e1c898bc4f87154d)

c++, coroutines: Clean up the ramp cleanups.

This replaces the cleanup try-catch block in the ramp with a series of
eh-only cleanup statements.

Includes
typo fixes from 83aa09e90487b52c1772eeffc4af577ee70536f1
dead code removal from 7bba8d48ea556a03bdc4e9076740b83d3db6599e

gcc/cp/ChangeLog:

* coroutines.cc (analyze_fn_parms): No longer
create a parameter copy guard var.
(cp_coroutine_transform::build_ramp_function): Replace ramp
cleanup try-catch block with eh-only cleanup statements.
* coroutines.h (struct param_info): Remove the
entry for the parameter copy destructor guard.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
(cherry picked from commit 18df4a10bc96946401218019ec566d867238b3e4)

c++, coroutines: Use decltype(auto) for the g_r_o.

The revised wording for coroutines uses decltype(auto) for the
type of the get return object, which preserves references.
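
The difference between auto and decltype(auto) can be sketched in plain
C++14 as follows. This is an illustration only; storage and get_ref are
hypothetical names and the code is unrelated to the ramp function itself:

```cpp
#include <type_traits>

static int storage = 7;

// Hypothetical function that returns a reference, like a
// get_return_object() declared to return T&.
int& get_ref() { return storage; }

// auto drops the reference and makes a copy of the int.
bool auto_is_copy() {
  auto a = get_ref();
  return std::is_same<decltype(a), int>::value;
}

// decltype(auto) keeps the declared return type, here int&.
bool decltype_auto_is_ref() {
  decltype(auto) r = get_ref();
  return std::is_same<decltype(r), int&>::value;
}

// Writing through the decltype(auto) variable changes the original object.
int write_through() {
  decltype(auto) r = get_ref();
  r = 42;
  return storage;
}
```

Because the reference is preserved, the variable stays tied to the lifetime
of the object it refers to.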

It is quite reasonable for a coroutine body implementation to
complete before control is returned to the ramp - and in that
case we would be creating the ramp return object from an already-
deleted promise object.

Jason observes that this is a terrible situation and we should
seek a resolution to it via core.

Since the test added here explicitly performs the unsafe action
described above we expect it to fail (until a resolution is found).

gcc/cp/ChangeLog:

* coroutines.cc
(cp_coroutine_transform::build_ramp_function): Use
decltype(auto) to determine the type of the temporary
get_return_object.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/pr115908.C: Count promise construction
and destruction. Run the test and XFAIL it.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
(cherry picked from commit e71a6e002c6650a7a7be99277120d3e59ecb78a3)

c++, coroutines: Address CWG2563 return value init [PR119916].

This addresses the clarification that, when the get_return_object is of a
different type from the ramp return, any necessary conversions should be
performed on the return expression (so that they typically occur after the
function body has started execution).

PR c++/119916

gcc/cp/ChangeLog:

* coroutines.cc
(cp_coroutine_transform::wrap_original_function_body): Do not
initialise initial_await_resume_called here...
(cp_coroutine_transform::build_ramp_function): ... but here.
When the coroutine is not void, initialize a GRO object from
promise.get_return_object(). Use this as the argument to the
return expression. Use a regular cleanup for the GRO, since
it is ramp-local.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/torture/special-termination-00-sync-completion.C:
Amend for CWG2563 expected behaviour.
* g++.dg/coroutines/torture/special-termination-01-self-destruct.C:
Likewise.
* g++.dg/coroutines/torture/pr119916.C: New test.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
(cherry picked from commit e06555a40c051d5062405b02f93b89b01a397f97)

c++, coroutines: Fix identification of coroutine ramps [PR120453].

The existing implementation, incorrectly, tried to use DECL_RAMP_FN
in check_return_expr to determine if we are handling a ramp func.
However, that query is only set for the resume/destroy functions.

Replace the use of DECL_RAMP_FN with a new query.

PR c++/120453

gcc/cp/ChangeLog:

* cp-tree.h (DECL_RAMP_P): New.
* typeck.cc (check_return_expr): Use DECL_RAMP_P instead
of DECL_RAMP_FN.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/pr120453.C: New test.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
(cherry picked from commit 217b7f655227a52e5fe26729baa09dc6083ed577)

c++, coroutines: Allow NRVO in more cases for ramp functions.

The constraints of the C++ coroutines specification require the ramp
to construct a return object early in the function.  This will be returned
at some later time.  This is implemented as NRVO but requires that copying
be well-formed even though it will be elided.  Special-case ramp functions
to allow this.
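
For readers unfamiliar with the named return value optimization, here is a
minimal sketch in an ordinary (non-coroutine) function. Tracked and
make_tracked are hypothetical names; whether the copy is actually elided is
up to the compiler, so the counter is informational only:

```cpp
struct Tracked {
  static int copies;  // counts copy and move constructions
  int value;
  explicit Tracked(int v) : value(v) {}
  Tracked(const Tracked& o) : value(o.value) { ++copies; }
  Tracked(Tracked&& o) noexcept : value(o.value) { ++copies; }
};
int Tracked::copies = 0;

// The named local is constructed early and returned later. A usable
// copy or move constructor must exist even if the call is elided.
Tracked make_tracked(int v) {
  Tracked result(v);
  result.value += 1;
  return result;
}

// Convenience wrapper that only looks at the returned value.
int made_value(int v) { return make_tracked(v).value; }
```

The ramp function follows the same pattern: the return object is built
first and handed back at the end.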

gcc/cp/ChangeLog:

* typeck.cc (check_return_expr): Suppress conversions for NRVO
in coroutine ramp functions.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
(cherry picked from commit d87caa9d3595ca845c9282cef8b0c9a656d8def0)

c++: Set the outer brace marker for missed cases.

In some cases, a function might be declared as FUNCTION_NEEDS_BODY_BLOCK
but all the content is contained within that block. However, poplevel
is currently assuming that such cases would always contain subblocks.

In the case that we do have a body block, but there are no subblocks
then set the outer brace marker on the body block. This situation occurs
for at least coroutine lambda ramp functions and empty constructors.

gcc/cp/ChangeLog:

* decl.cc (poplevel): Set BLOCK_OUTER_CURLY_BRACE_P on the
body block for functions with no subblocks.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
(cherry picked from commit 689bc394efe9e042acb37799deec6568c0f63a45)

tree-sra: Avoid total SRA if there are incompat. aggregate accesses  (PR119085)

We currently use the types encountered in the function body and not in
type declaration to perform total scalarization.  PR 119085 uncovered
that we lack a check that, when the same data is accessed through
different aggregate types, those types are actually compatible.  Without it,
we can base total scalarization on a type that does not "cover" all
live data in a different part of the function.  This patch adds the
check.

gcc/ChangeLog:

2025-07-21  Martin Jambor  <mjambor@suse.cz>

PR tree-optimization/119085
* tree-sra.cc (sort_and_splice_var_accesses): Prevent total
scalarization if two incompatible aggregates access the same place.

gcc/testsuite/ChangeLog:

2025-07-21  Martin Jambor  <mjambor@suse.cz>

PR tree-optimization/119085
* gcc.dg/tree-ssa/pr119085.c: New test.

(cherry picked from commit 171fcc80ede596442712e559c4fc787aa4636694)

calls: Allow musttail calls to noreturn [PR121159]

In the PR119483 r15-9003 change we've allowed musttail calls to noreturn
functions, after all the decision not to normally tail call noreturn
functions is not because it is not possible to tail call those, but because
it screws up backtraces.  As the following testcase shows, we've done that
only for functions not declared [[noreturn]]/_Noreturn but later on
discovered through IPA as noreturn.  Functions explicitly declared
[[noreturn]] have (for historical reasons) volatile FUNCTION_TYPE and
the FUNCTION_DECLs are volatile as well, so in order to support those
we shouldn't complain on ECF_NORETURN (we've stopped doing so for musttail
in PR119483) but also shouldn't complain about TYPE_VOLATILE on their
FUNCTION_TYPE (something that IPA doesn't change, I think it only sets
TREE_THIS_VOLATILE on the FUNCTION_DECL).  volatile on function type
really means noreturn as well, it has no other meaning.

2025-07-29  Jakub Jelinek  <jakub@redhat.com>

PR middle-end/121159
* calls.cc (can_implement_as_sibling_call_p): Don't reject declared
noreturn functions in musttail calls.

* c-c++-common/pr121159.c: New test.
* gcc.dg/plugin/must-tail-call-2.c (test_5): Don't expect an error.

(cherry picked from commit f4abe216199930adfa110059c3c8e642c585388b)

x86: Enable *mov<mode>_(and|or) only for -Oz

commit ef26c151c14a87177d46fd3d725e7f82e040e89f
Author: Roger Sayle <roger@nextmovesoftware.com>
Date:   Thu Dec 23 12:33:07 2021 +0000

    x86: PR target/103773: Fix wrong-code with -Oz from pop to memory.

added "*mov<mode>_and" and extended "*mov<mode>_or" to transform
"mov $0,mem" to the shorter "and $0,mem" and "mov $-1,mem" to the shorter
"or $-1,mem" for -Oz.  But the new pattern:

(define_insn "*mov<mode>_and"
  [(set (match_operand:SWI248 0 "memory_operand" "=m")
    (match_operand:SWI248 1 "const0_operand"))
   (clobber (reg:CC FLAGS_REG))]
  "reload_completed"
  "and{<imodesuffix>}\t{%1, %0|%0, %1}"
  [(set_attr "type" "alu1")
   (set_attr "mode" "<MODE>")
   (set_attr "length_immediate" "1")])

and the extended pattern:

(define_insn "*mov<mode>_or"
  [(set (match_operand:SWI248 0 "nonimmediate_operand" "=rm")
    (match_operand:SWI248 1 "constm1_operand"))
   (clobber (reg:CC FLAGS_REG))]
  "reload_completed"
  "or{<imodesuffix>}\t{%1, %0|%0, %1}"
  [(set_attr "type" "alu1")
   (set_attr "mode" "<MODE>")
   (set_attr "length_immediate" "1")])

aren't guarded for -Oz.  As a result, "and $0,mem" and "or $-1,mem" are
generated without -Oz.

1. Change "*mov<mode>_and" to define_insn_and_split and split it to
"mov $0,mem" if not -Oz.
2. Change "*mov<mode>_or" to define_insn_and_split and split it to
"mov $-1,mem" if not -Oz.
3. Don't transform "mov $-1,reg" to "push $-1; pop reg" for -Oz since it
should be transformed to "or $-1,reg".
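
The stores that these patterns match can be reproduced with two small
functions. This is a sketch for inspecting the generated assembly (for
example by compiling with and without -Oz and -S on x86); the function names
are hypothetical and the instruction actually chosen depends on the compiler
version and flags:

```cpp
// Store of constant 0 to memory: the operand shape matched by
// "*mov<mode>_and". Expected to stay "mov $0,mem" unless -Oz is used.
void store_zero(int* p) { *p = 0; }

// Store of constant -1 to memory: the operand shape matched by
// "*mov<mode>_or". Expected to stay "mov $-1,mem" unless -Oz is used.
void store_all_ones(int* p) { *p = -1; }
```

Both forms write the same value; the and/or variants are shorter encodings
that also clobber the flags register, as the CC clobber in the patterns
shows.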

gcc/

PR target/120427
* config/i386/i386.md (*mov<mode>_and): Changed to
define_insn_and_split.  Split it to "mov $0,mem" if not -Oz.
(*mov<mode>_or): Changed to define_insn_and_split.  Split it
to "mov $-1,mem" if not -Oz.
(peephole2): Don't transform "mov $-1,reg" to "push $-1; pop reg"
for -Oz since it will be transformed to "or $-1,reg".

gcc/testsuite/

PR target/120427
* gcc.target/i386/cold-attribute-4.c: Compile with -Oz.
* gcc.target/i386/pr120427-1.c: New test.
* gcc.target/i386/pr120427-2.c: Likewise.
* gcc.target/i386/pr120427-3.c: Likewise.
* gcc.target/i386/pr120427-4.c: Likewise.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit 4c80062d7b8c272e2e193b8074a8440dbb4fe588)

Daily bump.

AVR: target/121277 - Don't load 0x800000 with const __flashx *x = NULL.

Converting from generic AS to __flashx used the same rule as
for __memx, which tags RAM (generic AS) locations by setting bit 23.
The justification was that generic isn't a subset of __flashx, though
that led to surprises with code like const __flashx *x = NULL.

The natural thing to do is to just load 0x000000 in that case,
so that the null pointer works in __flashx as expected.
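
The two conversion rules can be modelled in portable code. This is a model
only, not code from avr.cc; it assumes a 16-bit generic address widened into
a 24-bit value, and the names kRamTag, to_flashx_old and to_flashx_new are
hypothetical:

```cpp
#include <cstdint>

// Bit 23 marks a RAM (generic address space) location, i.e. 0x800000.
constexpr std::uint32_t kRamTag = 1u << 23;

// Old rule, shared with __memx: tag the generic address as RAM.
constexpr std::uint32_t to_flashx_old(std::uint16_t generic) {
  return kRamTag | generic;
}

// New rule: do not set bit 23, so NULL stays 0x000000.
constexpr std::uint32_t to_flashx_new(std::uint16_t generic) {
  return generic;
}
```

With the old rule a null pointer becomes 0x800000 and no longer compares
equal to zero; with the new rule it stays zero.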

Apart from that, converting NULL to __flashx (or __flash) no longer
raises a -Waddr-space-convert diagnostic.

gcc/
PR target/121277
* config/avr/avr.cc (avr_addr_space_convert): When converting
from generic AS to __flashx, don't set bit 23.
(avr_convert_to_type): Don't -Waddr-space-convert when NULL
is converted to __flashx or to __flash.

(cherry picked from commit 089faf54fa96565784ebdd8dfcf9c350c4c3bee5)

Fortran: Allow for iterator substitution in array constructors [PR119106]

PR fortran/119106

gcc/fortran/ChangeLog:

* expr.cc (simplify_constructor): Do not simplify constants.
(gfc_simplify_expr): Continue to simplify expression when an
iterator is present.

gcc/testsuite/ChangeLog:

* gfortran.dg/array_constructor_58.f90: New test.

(cherry picked from commit 3a7fcf4f54ecffdbad03787d4f734c1fb2291ef5)

LoongArch: Fix wrong code generated by TARGET_VECTORIZE_VEC_PERM_CONST [PR121064]

When TARGET_VECTORIZE_VEC_PERM_CONST is called, target may be the
same pseudo as op0 and/or op1.  Loading the selector into target
would clobber the input, producing wrong code like

    vld     $vr0, $t0
    vshuf.w $vr0, $vr0, $vr1

So don't load the selector into d->target, use a new pseudo to hold the
selector instead.  The reload pass will load the pseudo for selector and
the pseudo for target into the same hard register (following our
constraint '0' on the shuf instructions) anyway.
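
The clobbering problem is an aliasing bug, which can be sketched with plain
arrays. This is an analogy in portable C++, not the LoongArch expander and
not the exact vshuf.w semantics; Vec, pick and both permute functions are
hypothetical:

```cpp
#include <array>

using Vec = std::array<int, 4>;

// Picks elements from the concatenation of a and b according to sel.
// Simplified stand-in for a two-input shuffle.
static Vec pick(const Vec& a, const Vec& b, const Vec& sel) {
  Vec out{};
  for (int i = 0; i < 4; ++i) {
    int idx = sel[i] & 7;
    out[i] = idx < 4 ? a[idx] : b[idx - 4];
  }
  return out;
}

// Buggy shape: the selector is first loaded into target. If target and
// op0 are the same object, op0 is overwritten before it is read.
void permute_clobbering(Vec& target, const Vec& op0, const Vec& op1,
                        const Vec& sel) {
  target = sel;
  target = pick(op0, op1, target);
}

// Fixed shape: the selector lives in its own temporary.
void permute_with_temp(Vec& target, const Vec& op0, const Vec& op1,
                       const Vec& sel) {
  Vec selector = sel;
  target = pick(op0, op1, selector);
}
```

Calling either function with the same object as target and op0 shows the
difference: only the version with a separate selector reads the original
input.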

gcc/ChangeLog:

PR target/121064
* config/loongarch/lsx.md (lsx_vshuf_<lsxfmt_f>): Add '@' to
generate a mode-aware helper.  Use <VIMODE> as the mode of the
operand 1 (selector).
* config/loongarch/lasx.md (lasx_xvshuf_<lasxfmt_f>): Likewise.
* config/loongarch/loongarch.cc
(loongarch_try_expand_lsx_vshuf_const): Create a new pseudo for
the selector.  Use the mode-aware helper to simplify the code.
(loongarch_expand_vec_perm_const): Likewise.

gcc/testsuite/ChangeLog:

PR target/121064
* gcc.target/loongarch/pr121064.c: New test.

(cherry picked from commit d626debcb3717f18bf2ee88f4281b109b13e1181)

Daily bump.

[RISC-V] Correct CFA notes for stack-clash protection [PR120714]

Fix incorrect SP addresses in the CFA notes for stack probes, which were
not relative to the top of the frame.  This affected RISC-V code
generation when stack-clash protection is enabled.

PR target/120714
gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_allocate_and_probe_stack_space):
Fix SP-addresses in REG_CFA_DEF_CFA notes for stack-clash case.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/pr120714.c: New test.

(cherry picked from commit 45a17e3081120f51f8e8b1d7cda73c7d89453e85)

Daily bump.

gcse: Skip hardreg pre when the hardreg is never live [PR121095]

r15-6789-ge7f98d9603808b added a new RTL pass for hardreg PRE for the hard register
of FPM_REGNUM. This pass could get expensive if you have a large number of basic blocks
and the hard register was never live, in which case it does nothing in the end.
In the aarch64 case, FPM_REGNUM is only used for FP8 related code so it has a high probability
of not being used. So skipping the pass for that register can improve both compile time and memory
usage.

Built and tested for aarch64-linux-gnu.

PR middle-end/121095
gcc/ChangeLog:

* gcse.cc (execute_hardreg_pre): Skip if the hardreg is never live.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
(cherry picked from commit 6916639b48357334579cf94717a3e51dd003e940)

aarch64: Fix fma steering when rename fails [PR120119]

Regrename can fail in some cases and `insn_rr[INSN_UID (insn)].op_info`
will be null. The FMA steering code was not expecting the failure to happen.
This started to happen after early RA was added but it has been a latent bug
before that.

Built and tested for aarch64-linux-gnu.

PR target/120119

gcc/ChangeLog:

* config/aarch64/cortex-a57-fma-steering.cc (func_fma_steering::analyze):
Skip if renaming fails.

gcc/testsuite/ChangeLog:

* g++.dg/torture/pr120119-1.C: New test.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
(cherry picked from commit 66c879571ab1fbdb4b119b8b0a1a30ebc7160057)

Fortran: fix passing of character length of function to procedure [PR121203]

PR fortran/121203

gcc/fortran/ChangeLog:

* trans-expr.cc (gfc_conv_procedure_call): Obtain the character
length of an assumed character length procedure from the typespec
of the actual argument even if there is no explicit interface.

gcc/testsuite/ChangeLog:

* gfortran.dg/function_charlen_4.f90: New test.

(cherry picked from commit 53b64337ef325c4e47ae96ea8dea86031a3a0602)

[PATCH] [modula2] Add return to remove build warning

This patch adds a return statement to M2Exception which removes a
build warning.

gcc/m2/ChangeLog:

* gm2-libs/M2EXCEPTION.mod (M2Exception): Add return
exException in case Raise completes.

(cherry picked from commit db8b92d8d61de408e14a4aebf5a777734936699d)

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

[PATCH] PR modula2/121164: Modula 2 build failure followup

This is a followup patch for PR modula2/121164 to
fix the location for the error message attributed to cc1gm2.
The patch has been cherry picked, but without the forced -Wall option
in libgm2.

gcc/m2/ChangeLog:

PR modula2/121164
* gm2-compiler/P1SymBuild.mod: Remove PutProcTypeParam.
Remove PutProcTypeParam.
(CheckFileName): Remove.
(P1EndBuildDefinitionModule): Correct spelling.
(P1EndBuildImplementationModule): Ditto.
(P1EndBuildProgramModule): Ditto.
(EndBuildInnerModule): Ditto.
* gm2-compiler/P2SymBuild.mod (P2EndBuildDefModule): Correct
spelling.
(P2EndBuildImplementationModule): Ditto.
(P2EndBuildProgramModule): Ditto.
(EndBuildInnerModule): Ditto.
(CheckFormalParameterSection): Ditto.
* gm2-compiler/P3SymBuild.mod (P3EndBuildDefModule): Ditto.
* gm2-compiler/PCSymBuild.mod (PCEndBuildDefModule): Ditto.
(fixupProcedureType): Pass tok to PutProcTypeVarParam.
Pass tok to PutProcTypeParam.
* gm2-compiler/SymbolTable.def (PutProcTypeParam): Add tok
parameter.
(PutProcTypeVarParam): Ditto.
* gm2-compiler/SymbolTable.mod (SymParam): At change type to
CARDINAL.
New field FullTok.
New field Scope.
(SymVarParam): At change type to CARDINAL.
New field FullTok.
New field Scope.
(GetVarDeclTok): Check ShadowVar for NulSym and return At.
(PutParam): Initialize FullTok.
Initialize At.
Initialize Scope.
(PutVarParam): Initialize FullTok.
Assign At.
Initialize Scope.
(AddProcedureProcTypeParam): Add tok parameter.
(GetScope): Add ParamSym and VarParamSym clause.
(PutProcTypeVarParam): Add tok parameter.
Initialize At.
Initialize FullTok.
(GetDeclaredDefinition): Clause ParamSym return At.
Clause VarParamSym return At.
(GetDeclaredModule): Ditto.
(PutDeclaredDefinition): Remove clause ParamSym.
Remove clause VarParamSym.
(PutDeclaredModule): Remove clause ParamSym.
Remove clause VarParamSym.

gcc/testsuite/ChangeLog:

PR modula2/121164
* gm2/switches/pedantic-params/fail/arrayofchar.def: New test.
* gm2/switches/pedantic-params/fail/arrayofchar.mod: New test.

(cherry picked from commit ab5a89c0b4f1ead202dee072e16690607b810111)

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

Daily bump.

c++: constexpr uninitialized union [PR120577]

This was failing for two reasons:

1) We were wrongly treating the basic_string constructor as
zero-initializing the object, which it doesn't.
2) Given that, when we went to look for a value for the anonymous union,
we concluded that it was value-initialized, and trying to evaluate that
broke because we weren't setting ctx->ctor for it.

This patch fixes both issues, #1 by setting CONSTRUCTOR_NO_CLEARING and #2
by inserting a new CONSTRUCTOR for the member rather than evaluate it out of
context, which is consistent with cxx_eval_store_expression.

PR c++/120577

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_call_expression): Set
CONSTRUCTOR_NO_CLEARING on initial value for ctor.
(cxx_eval_component_reference): Make value-initialization
of an aggregate member explicit.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/constexpr-union9.C: New test.

(cherry picked from commit f23b5df56e237df9f66b615ca4babc564d5f75de)

ada: Bug in Indefinite_Holders instance passed to formal package

Fix bug when an instance of Indefinite_Holders with a class-wide type is
passed as a generic formal package; Program_Error was raised when
dealing with the implicit "=" function.

The fix is to disable legality checks in formal packages when the
entity is an E_Subprogram_Body, because these are implicitly generated
for class-wide predefined functions when passed to generics.

gcc/ada/ChangeLog:

* sem_ch12.adb (Check_Formal_Package_Instance):
Do nothing in case of E_Subprogram_Body.

ada: Fix regression of finalization primitive selection

A recent patch introduced a new flag to mark the types for which looking
up finalization primitives needs special handling. But there was one
place in Build_Derived_Record_Type where the flag was not set when it
should, which introduced a regression in some cases.

This patch adds the missing setting of the flag.

gcc/ada/ChangeLog:

* sem_ch3.adb (Build_Derived_Record_Type): Set flag appropriately.

Daily bump.

c++: fix __is_invocable for std::reference_wrapper [PR121055]

Our implementation of the INVOKE spec ([func.require]) was incorrectly
treating reference_wrapper<T>::get() as returning T instead of T&, which
notably makes a difference when invoking a ref-qualified memfn pointer.
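
The INVOKE rule in question can be observed through the library in C++17.
This sketch uses std::invoke and std::is_invocable_v rather than the
__is_invocable built-in directly; Widget and its members are hypothetical:

```cpp
#include <functional>
#include <type_traits>

struct Widget {
  int on_lvalue() & { return 1; }    // callable only on lvalues
  int on_rvalue() && { return 2; }   // callable only on rvalues
};

// INVOKE unwraps a reference_wrapper with get(), which yields Widget&,
// so the &-qualified member is the one that can be called.
int call_through_wrapper() {
  Widget w;
  return std::invoke(&Widget::on_lvalue, std::ref(w));
}

// Trait form of the same question. On a conforming implementation the
// first is true and the second is false.
constexpr bool lvalue_member_ok =
    std::is_invocable_v<int (Widget::*)() &, std::reference_wrapper<Widget>>;
constexpr bool rvalue_member_ok =
    std::is_invocable_v<int (Widget::*)() &&, std::reference_wrapper<Widget>>;
```

Treating get() as returning T instead of T& would flip both trait results,
which is the kind of discrepancy this fix addresses.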

PR c++/121055

gcc/cp/ChangeLog:

* method.cc (build_invoke): Correct reference_wrapper handling.

gcc/testsuite/ChangeLog:

* g++.dg/ext/is_invocable5.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
(cherry picked from commit 04a176a1d84a84c630cfd4d232736c12b105957a)

libstdc++: Add missing initializers for __maybe_present_t members [PR119962]

Data members of type __maybe_present_t where the conditionally present
type might be an aggregate or fundamental type need to be explicitly
value-initialized (rather than implicitly default-initialized), so that
default-initialization of the containing class always results in a
completely initialized object.
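
The effect of an explicit initializer on such a member can be sketched as
follows. Holder and Point are hypothetical stand-ins and not the libstdc++
__maybe_present_t machinery:

```cpp
// Hypothetical stand-in for a class with a conditionally present member.
template <typename T>
struct Holder {
  T member = T();  // explicit value-initialization: zero for int or pointers
};

// Aggregate without default member initializers.
struct Point {
  int x;
  int y;
};

// Default-initializing the containing object still yields a zero member.
int default_initialized_int() {
  Holder<int> h;
  return h.member;
}

// The same holds when the member is an aggregate.
int default_initialized_point_sum() {
  Holder<Point> h;
  return h.member.x + h.member.y;
}
```

Without the "= T()" initializer, "Holder<int> h;" would leave the member
with an indeterminate value.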

PR libstdc++/119962

libstdc++-v3/ChangeLog:

* include/std/ranges (join_view::_Iterator::_M_outer): Initialize.
(lazy_split_view::_OuterIter::_M_current): Initialize.
(join_with_view::_Iterator::_M_outer_it): Initialize.
* testsuite/std/ranges/adaptors/join.cc (test15): New test.
* testsuite/std/ranges/adaptors/join_with/1.cc (test05): New test.
* testsuite/std/ranges/adaptors/lazy_split.cc (test13): New test.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
(cherry picked from commit 0828600f586e75a2056a4fc7eb0a340c363d6c66)

tree-optimization/121202 - fix vector stmt placement

When we have a vector shift with a scalar the shift operand can be
external - in that case we should not use the shift operand def
as hint where to place the vector shift instruction. The ICE
in the PR is because stmt dominance queries only work inside of
the vector region. But we should also never place stmts outside
of it.

PR tree-optimization/121202
* tree-vect-slp.cc (vect_schedule_slp_node): Do not take
an out-of-region stmt as "last".

* gcc.dg/pr121202.c: New testcase.

(cherry picked from commit bdfb5cc5aa6959a6959fc0cf98da08db89c81032)

c++/modules: Support re-streaming TU_LOCAL_ENTITYs [PR120412]

When emitting a primary module interface, we must re-stream any TU-local
entities that we saw in a partition. This patch adds the missing
members from core_vals.

As a drive-by fix, in some cases we might have a typedef referring to a
TU-local entity; we need to handle that case as well.

PR c++/120412

gcc/cp/ChangeLog:

* module.cc (trees_out::core_vals): Write TU_LOCAL_ENTITY bits.
(trees_in::core_vals): Read it.
(trees_in::tree_node): Handle TU_LOCAL_ENTITY typedefs.

gcc/testsuite/ChangeLog:

* g++.dg/modules/internal-14_a.C: New test.
* g++.dg/modules/internal-14_b.C: New test.
* g++.dg/modules/internal-14_c.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
(cherry picked from commit be81c5c01c243013c4bac0718e63e0fdc132d384)

Daily bump.

tree-sra: Fix grp_covered flag computation when totally scalarizing (PR117423)

Testcase of PR 117423 shows a flaw in the fancy way we do "total
scalarization" in SRA now.  We use the types encountered in the
function body and not in type declaration (allowing us to totally
scalarize when only one union field is ever used, since we effectively
"skip" the union then) and can accommodate pre-existing accesses that
happen to fall into padding.

In this case, we skipped the union (bypassing the
totally_scalarizable_type_p check) and the access falling into the
"padding" is an aggregate and so not a candidate for SRA but actually
containing data.  Arguably total scalarization should just bail out
when it encounters this situation (but I decided not to depend on this
mainly because we'd need to detect all cases when we eventually cannot
scalarize, such as when a scalar access has children accesses) but the
actual bug is that the detection if all data in an aggregate is indeed
covered by replacements just assumes that is always the case if total
scalarization triggers which however may not be the case in cases like
this - and perhaps more.

This patch fixes the bug by just assuming that all padding is taken
care of when total scalarization triggered, not that every access was
actually scalarized.

gcc/ChangeLog:

2025-07-17  Martin Jambor  <mjambor@suse.cz>

PR tree-optimization/117423
* tree-sra.cc (analyze_access_subtree): Fix computation of grp_covered
flag.

gcc/testsuite/ChangeLog:

2025-07-17  Martin Jambor  <mjambor@suse.cz>

PR tree-optimization/117423
* gcc.dg/tree-ssa/pr117423.c: New test.

(cherry picked from commit 7375909e9d9e7de23acb4b1e0a965d8faf1943c4)

testsuite: Fix overflow in gcc.dg/vect/pr116125.c

The test ends up writing a byte beyond bounds of the buffer, which gets
trapped on some targets when the test is run with
-fstack-protector-strong.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/pr116125.c (mem_overlap): Expand A to 10 members.

Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>
(cherry picked from commit 96d5aef307025a771ae4ef47a9b382ef20eb06c4)

Daily bump.

ada: Fix generation of Initialize and Adjust calls

Before this patch, Make_Init_Call and Make_Adjust_Call made the
assumption that if the type they were called with was untagged and a
derived type, it was the untagged private view of a tagged type. That
assumption made it possible to inspect the root type's primitives to
handle the case where the underlying type was implicitly generated by
the compiler without all inherited primitives.

The introduction of the Finalizable aspect broke that assumption, so
this patch adds a new field to type entities that makes the generated
full view stand out, and updates Make_Init_Call and Make_Adjust_Call to
only jump to the root type when they're passed one of those generated
types.

Make_Final_Call and Finalize_Address are two other subprograms that
perform the same test on the types they're passed. They did not suffer
from the same bug as Make_Init_Call and Make_Adjust_Call because of an
earlier, more ad hoc fix, but this patch switches them over to the newly
introduced mechanism for the sake of consistency.

gcc/ada/ChangeLog:

* gen_il-fields.ads (Is_Implicit_Full_View): New field.
* gen_il-gen-gen_entities.adb (Type_Kind): Use new field.
* einfo.ads (Is_Implicit_Full_View): Document new field.
* exp_ch7.adb (Make_Adjust_Call, Make_Init_Call, Make_Final_Call): Use
new field.
* exp_util.adb (Finalize_Address): Likewise.
* sem_ch3.adb (Copy_And_Build): Set new field.

ada: Remove obsolete code from Safe_Unchecked_Type_Conversion

That's a kludge added to work around the limitations of the stack checking
mechanism used in the early days.

gcc/ada/ChangeLog:

* exp_util.ads (May_Generate_Large_Temp): Delete.
* exp_util.adb (May_Generate_Large_Temp): Likewise.
(Safe_Unchecked_Type_Conversion): Do not take stack checking into
account to compute the result.

ada: Fix assertion failure on aggregate with controlled component

The assertion is:

      pragma Assert (Side_Effect_Free (L));

in Make_Tag_Ctrl_Assignment and demonstrates that the sequence:

  Remove_Side_Effects (L);
  pragma Assert (Side_Effect_Free (L));

does not hold in this case.

What happens is that Remove_Side_Effects uses a renaming to remove the side
effects of L but, at the end, the renamed object is substituted back for the
renaming in the node by Expand_Renaming, which is invoked because the
Is_Renaming_Of_Object flag is set on the renaming after Evaluate_Name has
been invoked on its Name.

This is a general discrepancy between Evaluate_Name and Side_Effect_Free of
Exp_Util, coming from the call to Safe_Unchecked_Type_Conversion present in
Side_Effect_Free in this case.  The long term goal is probably to remove the
call but, in the meantime, this change is sufficient to fix the failure.

gcc/ada/ChangeLog:

* exp_util.adb (Safe_Unchecked_Type_Conversion): Always return True
if the expression is the prefix of an N_Selected_Component.

ada: Tune recent change for bit-packed arrays to help GNATprove backend

When GNAT is operating in GNATprove_Mode the Expander_Active flag is disabled,
but we still must do things that ordinary backends expect.

gcc/ada/ChangeLog:

* sem_util.adb (Get_Actual_Subtype): Do the same for GCC and GNATprove
backends.

ada: Fix wrong indirect access to bit-packed array in iterated loop

This comes from a missing expansion of the bit-packed array reference in
the loop, because the actual subtype created for the dereference lacks a
Packed_Array_Impl_Type as it is ultimately created by the Preanalyze_Range
call present in Analyze_Loop_Statement.

gcc/ada/ChangeLog:

* sem_util.adb (Get_Actual_Subtype): Only create a new subtype when
the expander is active. Remove a useless test of type inequality,
as well as a useless call to Set_Has_Delayed_Freeze on the subtype.

ada: exp_util.adb: prevent infinite loop in case of broken code

A recent commit modified exp_util.adb in order to fix the selection of
Finalize subprograms in the case of untagged objects.
This introduced regressions for GNATSAS in fixedbugs by causing
GNAT2SCIL to loop over the same type over and over in case of broken
code.
We fix this by simply checking that the loop is making progress and, if
it isn't, assuming that we're done.

gcc/ada/ChangeLog:

* exp_util.adb (Finalize_Address): Prevent infinite loop
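
The progress check described above can be sketched as follows.  This is
an illustration in C++, not the Ada code from exp_util.adb: TypeNode,
its fields, and find_finalizable_ancestor are hypothetical names, and
the walk over parent links only stands in for the real traversal.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a type entity.  'parent' is an index into the
// same table; a root type, or a type from broken code, may name itself.
struct TypeNode {
    std::size_t parent;
    bool has_finalizer;
};

// Walk up the parent links until a type with a finalizer is found.
// If a step does not move to a different type, give up instead of spinning.
std::size_t find_finalizable_ancestor(const std::vector<TypeNode>& types,
                                      std::size_t start) {
    assert(start < types.size());
    std::size_t current = start;
    while (!types[current].has_finalizer) {
        const std::size_t next = types[current].parent;
        if (next == current) {
            break;  // no progress: assume we are done
        }
        current = next;
    }
    return current;
}
```

The check only guards against a step that returns the same type, which
is the situation the message describes; it does not detect longer
cycles.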

OpenMP: Fix implicit 'declare target' for <ostream>

libstdc++-v3/include/std/ostream contains:

  namespace std _GLIBCXX_VISIBILITY(default)
  {
    ...
    template<typename _CharT, typename _Traits>
      inline basic_ostream<_CharT, _Traits>&
      endl(basic_ostream<_CharT, _Traits>& __os)
      { return flush(__os.put(__os.widen('\n'))); }
  ...
  #include <bits/ostream.tcc>

and the latter, libstdc++-v3/include/bits/ostream.tcc, has:
    // Inhibit implicit instantiations for required instantiations,
    // which are defined via explicit instantiations elsewhere.
  #if _GLIBCXX_EXTERN_TEMPLATE
    extern template class basic_ostream<char>;
    extern template ostream& endl(ostream&);

Before this commit, omp_discover_declare_target_tgt_fn_r marked 'endl'
as (implicitly) declare target - but not the calls in it due to the
'extern' (DECL_EXTERNAL).

Thanks to inlining, 'endl' is not used and is hence discarded by the
linker; therefore, it works with -O0 and -O1.  However, as the (unused)
function still exists, IPA CP (enabled by -O2) will try to do
constant-value propagation and fails, as the definition of 'widen' is
not available.

Solution is to still walk 'endl' despite being an 'extern(al)' decl;
this has been restricted for now to DECL_DECLARED_INLINE_P.
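
The declaration pattern involved can be reproduced in miniature.  The
sketch below is an assumption-laden illustration, not libstdc++ code:
widen_like and endl_like are hypothetical names standing in for 'widen'
and 'endl'.

```cpp
// Stands in for 'widen': a callee used by the inline template below.
inline int widen_like(int c) { return c + 1; }

// Stands in for 'endl': an inline template whose body is visible to
// every translation unit that includes the header.
template <typename T>
inline T endl_like(T x) { return widen_like(x) * 2; }

// Explicit instantiation declaration, as in <bits/ostream.tcc>: the
// out-of-line copy is expected to be provided elsewhere.  Because the
// function is inline, the compiler may still instantiate it for inlining.
extern template int endl_like<int>(int);

// In libstdc++ the matching explicit instantiation definition lives in the
// library.  It is placed here only so that this sketch links on its own.
template int endl_like<int>(int);
```

A function such as endl_like<int> is both external and declared inline
with a body available, which is the combination the fix now walks.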

gcc/ChangeLog:

* omp-offload.cc (omp_discover_declare_target_tgt_fn_r): Also
walk external functions that are declared inline (and have a
DECL_SAVED_TREE).

libgomp/ChangeLog:

* testsuite/libgomp.c++/declare_target-2.C: New test.

(cherry picked from commit ea43b99537591b1103da3961c61f1cbfae968859)

Avoid SIGSEGV in nvptx 'mkoffload' for voluminous PTX code

In commit 50be486dff4ea2676ed022e9524ef190b92ae2b1
"nvptx: libgomp+mkoffload.cc: Prepare for reverse offload fn lookup", some
additional tracking of the PTX code was added, and this assumes that
potentially every single character of PTX code needs to be tracked as a new
chunk of PTX code.  That's problematic if we're dealing with voluminous PTX
code (for example, non-trivial C++ code), and the 'file_idx' 'alloca'tion then
causes stack overflow.  For example:

    FAIL: libgomp.c++/target-std__valarray-1.C (test for excess errors)
    UNRESOLVED: libgomp.c++/target-std__valarray-1.C compilation failed to produce executable

    lto-wrapper: fatal error: [...]/build-gcc/gcc//accel/nvptx-none/mkoffload terminated with signal 11 [Segmentation fault], core dumped

gcc/
* config/nvptx/mkoffload.cc (process): Use an 'auto_vec' for
'file_idx'.
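
As a rough illustration of why a heap-backed vector helps here, the
sketch below records one index per chunk of code.  It uses std::vector
as a stand-in for GCC's 'auto_vec', and both the chunk_starts helper and
the NUL-byte chunk delimiter are assumptions made for the example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: return the start offset of every chunk in 'code',
// assuming chunks are separated by NUL bytes.  In the worst case every
// character starts a new chunk, so the result can be as long as 'code'.
std::vector<std::size_t> chunk_starts(const std::string& code) {
    std::vector<std::size_t> starts;  // heap-backed, grows on demand
    bool at_start = true;
    for (std::size_t i = 0; i < code.size(); ++i) {
        if (at_start) {
            starts.push_back(i);
        }
        at_start = (code[i] == '\0');
    }
    return starts;
}
```

Sizing a stack buffer for the worst case would need one entry per input
character; with a growable heap container, the stack usage no longer
depends on the size of the PTX code.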

(cherry picked from commit 01044e0ee27093a3990996578b15f6ab69ed3395)

Add 'libgomp.c++/target-std__valarray-1.C'

libgomp/
* testsuite/libgomp.c++/target-std__valarray-1.C: New.
* testsuite/libgomp.c++/target-std__valarray-1.output: Likewise.

(cherry picked from commit 2ffada0296c95898a68bdb67ced738fe788df93a)

libgomp: Add testcases for concurrent access to standard C++ containers on offload targets, a number of USM variants

libgomp/
* testsuite/libgomp.c++/target-std__array-concurrent-usm.C: New.
* testsuite/libgomp.c++/target-std__array-concurrent.C: Adjust.
* testsuite/libgomp.c++/target-std__bitset-concurrent-usm.C: New.
* testsuite/libgomp.c++/target-std__bitset-concurrent.C: Adjust.
* testsuite/libgomp.c++/target-std__deque-concurrent-usm.C: New.
* testsuite/libgomp.c++/target-std__deque-concurrent.C: Adjust.
* testsuite/libgomp.c++/target-std__forward_list-concurrent-usm.C: New.
* testsuite/libgomp.c++/target-std__forward_list-concurrent.C: Adjust.
* testsuite/libgomp.c++/target-std__list-concurrent-usm.C: New.
* testsuite/libgomp.c++/target-std__list-concurrent.C: Adjust.
* testsuite/libgomp.c++/target-std__map-concurrent-usm.C: New.
* testsuite/libgomp.c++/target-std__map-concurrent.C: Adjust.
* testsuite/libgomp.c++/target-std__multimap-concurrent-usm.C: New.
* testsuite/libgomp.c++/target-std__multimap-concurrent.C: Adjust.
* testsuite/libgomp.c++/target-std__multiset-concurrent-usm.C: New.
* testsuite/libgomp.c++/target-std__multiset-concurrent.C: Adjust.
* testsuite/libgomp.c++/target-std__set-concurrent-usm.C: New.
* testsuite/libgomp.c++/target-std__set-concurrent.C: Adjust.
* testsuite/libgomp.c++/target-std__span-concurrent-usm.C: New.
* testsuite/libgomp.c++/target-std__span-concurrent.C: Adjust.
* testsuite/libgomp.c++/target-std__valarray-concurrent-usm.C: New.
* testsuite/libgomp.c++/target-std__valarray-concurrent.C: Adjust.
* testsuite/libgomp.c++/target-std__vector-concurrent-usm.C: New.
* testsuite/libgomp.c++/target-std__vector-concurrent.C: Adjust.

(cherry picked from commit 83ca283853f195a08d2f758580a369bc6a076122)

libgomp: Add testcases for concurrent access to standard C++ containers on offload targets

libgomp/

* testsuite/libgomp.c++/target-std__array-concurrent.C: New.
* testsuite/libgomp.c++/target-std__bitset-concurrent.C: Likewise.
* testsuite/libgomp.c++/target-std__deque-concurrent.C: Likewise.
* testsuite/libgomp.c++/target-std__flat_map-concurrent.C: Likewise.
* testsuite/libgomp.c++/target-std__flat_multimap-concurrent.C: Likewise.
* testsuite/libgomp.c++/target-std__flat_multiset-concurrent.C: Likewise.
* testsuite/libgomp.c++/target-std__flat_set-concurrent.C: Likewise.
* testsuite/libgomp.c++/target-std__forward_list-concurrent.C: Likewise.
* testsuite/libgomp.c++/target-std__list-concurrent.C: Likewise.
* testsuite/libgomp.c++/target-std__map-concurrent.C: Likewise.
* testsuite/libgomp.c++/target-std__multimap-concurrent.C: Likewise.
* testsuite/libgomp.c++/target-std__multiset-concurrent.C: Likewise.
* testsuite/libgomp.c++/target-std__set-concurrent.C: Likewise.
* testsuite/libgomp.c++/target-std__span-concurrent.C: Likewise.
* testsuite/libgomp.c++/target-std__unordered_map-concurrent.C: Likewise.
* testsuite/libgomp.c++/target-std__unordered_multimap-concurrent.C: Likewise.
* testsuite/libgomp.c++/target-std__unordered_multiset-concurrent.C: Likewise.
* testsuite/libgomp.c++/target-std__unordered_set-concurrent.C: Likewise.
* testsuite/libgomp.c++/target-std__valarray-concurrent.C: Likewise.
* testsuite/libgomp.c++/target-std__vector-concurrent.C: Likewise.

Co-authored-by: Thomas Schwinge <tschwinge@baylibre.com>
(cherry picked from commit a811d1d72261da58196ccec253fd2bdb10e999db)

libgomp: Add testcases for the standard C++ math library on offload targets

libgomp/

* testsuite/libgomp.c++/target-std__cmath.C: New.
* testsuite/libgomp.c++/target-std__complex.C: Likewise.
* testsuite/libgomp.c++/target-std__numbers.C: Likewise.

(cherry picked from commit fbcd0ad41f7cc801664da1e583f6bcad1eb02a08)

Add 'libgomp.c++/target-flex-[...].C' test cases

libgomp/ChangeLog:

* testsuite/libgomp.c++/target-flex-10.C: New test.
* testsuite/libgomp.c++/target-flex-100.C: New test.
* testsuite/libgomp.c++/target-flex-101.C: New test.
* testsuite/libgomp.c++/target-flex-11.C: New test.
* testsuite/libgomp.c++/target-flex-12.C: New test.
* testsuite/libgomp.c++/target-flex-2000.C: New test.
* testsuite/libgomp.c++/target-flex-2001.C: New test.
* testsuite/libgomp.c++/target-flex-2002.C: New test.
* testsuite/libgomp.c++/target-flex-2003.C: New test.
* testsuite/libgomp.c++/target-flex-30.C: New test.
* testsuite/libgomp.c++/target-flex-300.C: New test.
* testsuite/libgomp.c++/target-flex-31.C: New test.
* testsuite/libgomp.c++/target-flex-32.C: New test.
* testsuite/libgomp.c++/target-flex-33.C: New test.
* testsuite/libgomp.c++/target-flex-41.C: New test.
* testsuite/libgomp.c++/target-flex-60.C: New test.
* testsuite/libgomp.c++/target-flex-61.C: New test.
* testsuite/libgomp.c++/target-flex-62.C: New test.
* testsuite/libgomp.c++/target-flex-70.C: New test.
* testsuite/libgomp.c++/target-flex-80.C: New test.
* testsuite/libgomp.c++/target-flex-81.C: New test.
* testsuite/libgomp.c++/target-flex-90.C: New test.
* testsuite/libgomp.c++/target-flex-common.h: New test.

Co-authored-by: Thomas Schwinge <tschwinge@baylibre.com>
(cherry picked from commit 28a5bc2d4f7ae345234a15e22fd65cfad851cf04)

Defuse 'RESULT_DECL' check in 'pass_nrv' (for offloading compilation) [PR119835]

... to avoid running into ICEs per PR119835, until that's resolved properly.

PR middle-end/119835
gcc/
* tree-nrv.cc (pass_nrv::execute): Defuse 'RESULT_DECL' check.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/abi-struct-1.c:
'#pragma GCC optimize "-fno-inline"'.
* testsuite/libgomp.c-c++-common/target-abi-struct-1.c: New.
* testsuite/libgomp.c-c++-common/target-abi-struct-1-O0.c: Adjust.

Co-authored-by: Richard Biener <rguenther@suse.de>
(cherry picked from commit 543f7e1d59f0b6628e0de6610ad5e1cf7150090b)

'TYPE_EMPTY_P' vs. code offloading [PR120308]

We've got 'gcc/stor-layout.cc:finalize_type_size':

    /* Handle empty records as per the x86-64 psABI. */
    TYPE_EMPTY_P (type) = targetm.calls.empty_record_p (type);

(Indeed x86_64 is still the only target to define 'TARGET_EMPTY_RECORD_P',
calling 'gcc/tree.cc:default_is_empty_record'.)

And so it happens that for an empty struct used in code offloaded from x86_64
host (but not powerpc64le host, for example), we get to see 'TYPE_EMPTY_P' in
offloading compilation (where the offload targets (currently?) don't use it
themselves, and therefore aren't prepared to handle it).

For nvptx offloading compilation, this causes wrong code generation:
'ptxas [...] error : Call has wrong number of parameters', as nvptx code
generation for function definition doesn't pay attention to this flag (say, in
'gcc/config/nvptx/nvptx.cc:pass_in_memory', or wherever else would be
appropriate to handle that), but the generic code 'gcc/calls.cc:expand_call'
via 'gcc/function.cc:aggregate_value_p' does pay attention to it, and we thus
get mismatching function definition vs. function call.
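
A minimal sketch of the kind of code affected, assuming an empty struct
passed by value to a function that is called from an OpenMP 'target'
region.  The names are hypothetical and this is not one of the test
cases listed below; without offloading (or without '-fopenmp') it simply
runs on the host.

```cpp
// An empty record: the type for which 'TYPE_EMPTY_P' may be set on x86_64.
struct Empty {};

#pragma omp declare target
// The function definition and its call site must agree on whether the
// empty argument occupies a parameter slot.
int add_with_empty(int a, Empty, int b) { return a + b; }
#pragma omp end declare target

// Hypothetical driver: call the function from a 'target' region.
int run_with_empty_argument() {
    int result = 0;
#pragma omp target map(from: result)
    result = add_with_empty(1, Empty{}, 2);
    return result;
}
```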

This issue apparently isn't a problem for GCN offloading, but I don't know if
that's by design or by accident.

Richard Biener:
> It looks like TYPE_EMPTY_P is only used during RTL expansion for ABI
> purposes, so computing it during layout_type is premature as shown here.
>
> I would suggest to simply re-compute it at offload stream-in time.

(For avoidance of doubt, the additions to 'gcc.target/nvptx/abi-struct-arg.c',
'gcc.target/nvptx/abi-struct-ret.c' are not dependent on the offload streaming
code changes, but are just to mirror the changes to
'libgomp.oacc-c-c++-common/abi-struct-1.c'.)

PR lto/120308
gcc/
* lto-streamer-out.cc (hash_tree): Don't handle 'TYPE_EMPTY_P' for
'lto_stream_offload_p'.
* tree-streamer-in.cc (unpack_ts_type_common_value_fields):
Likewise.
* tree-streamer-out.cc (pack_ts_type_common_value_fields):
Likewise.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/abi-struct-1.c: Add empty
structure testing.
gcc/testsuite/
* gcc.target/nvptx/abi-struct-arg.c: Add empty structure testing.
* gcc.target/nvptx/abi-struct-ret.c: Likewise.

(cherry picked from commit 9063810c86beee6274d745b91d8fb43a81c9683e)

Add 'libgomp.c-c++-common/target-abi-struct-1-O0.c', 'libgomp.oacc-c-c++-common/abi-struct-1.c'

libgomp/
* testsuite/libgomp.c-c++-common/target-abi-struct-1-O0.c: New.
* testsuite/libgomp.oacc-c-c++-common/abi-struct-1.c: Likewise.

(cherry picked from commit 45efda05c47f770a617b44cf85713a696bcf0384)

libgomp.c/target-map-zero-sized-3.c: Fix code for non-USM offload [PR120530]

A mapping clause was missing, causing the code to fail with offloading
when a host pointer was not device accessible.

libgomp/ChangeLog:

PR target/120530
* testsuite/libgomp.c/target-map-zero-sized-3.c (main): Add missing
map clause; remove unused variable.

(cherry picked from commit 16c742e1079e838b920a1b215af17828da7c6365)

GCN, nvptx offloading: Restrain 'WARNING: program timed out.' while in 'dynamic_cast' only for effective-target 'offload_device' [PR119692]

In PR119692 "C++ 'typeinfo', 'vtable' vs. OpenACC, OpenMP 'target' offloading":

> --- Comment #8 from Rainer Orth <ro at gcc dot gnu.org> ---
> The last commit made things worse on sparc-sun-solaris2.11: since that one
> (dg-timeout 10) I regularly get
>
> WARNING: libgomp.c++/target-exceptions-bad_cast-1.C (test for excess errors)
> program timed out.
> FAIL: libgomp.c++/target-exceptions-bad_cast-1.C (test for excess errors)
> UNRESOLVED: libgomp.c++/target-exceptions-bad_cast-1.C compilation failed to produce executable
> UNRESOLVED: libgomp.c++/target-exceptions-bad_cast-1.C scan-tree-dump-times optimized "gimple_call <__cxa_bad_cast, " 1
>
> Before that, the test had no issue. Compiling the test on an unloaded system
> usually takes less than 1 sec, but when fully loaded, times can go up.

To keep things simple, let's restrict this temporary (yeah...) workaround to
apply only for effective-target 'offload_device', just like the
'dg-xfail-run-if' itself.

PR target/119692
libgomp/
* testsuite/libgomp.c++/pr119692-1-4.C: '{ dg-timeout 10 { target offload_device } }'.
* testsuite/libgomp.c++/pr119692-1-5.C: Likewise.
* testsuite/libgomp.c++/target-exceptions-bad_cast-1.C: Likewise.
* testsuite/libgomp.c++/target-exceptions-bad_cast-2.C: Likewise.
* testsuite/libgomp.oacc-c++/exceptions-bad_cast-1.C: Likewise.
* testsuite/libgomp.oacc-c++/exceptions-bad_cast-2.C: Likewise.

(cherry picked from commit aa143261bdf6db4334b3fcad7768b53e231f998e)

GCN, nvptx offloading: Restrain 'WARNING: program timed out.' while in 'dynamic_cast' [PR119692]

PR target/119692
libgomp/
* testsuite/libgomp.c++/pr119692-1-4.C: '{ dg-timeout 10 }'.
* testsuite/libgomp.c++/pr119692-1-5.C: Likewise.
* testsuite/libgomp.c++/target-exceptions-bad_cast-1.C: Likewise.
* testsuite/libgomp.c++/target-exceptions-bad_cast-2.C: Likewise.
* testsuite/libgomp.oacc-c++/exceptions-bad_cast-1.C: Likewise.
* testsuite/libgomp.oacc-c++/exceptions-bad_cast-2.C: Likewise.

(cherry picked from commit b5f48e7872db30b8f174cb2c497868a358bf75d6)

nvptx: Support '-march=sm_61'

gcc/
* config/nvptx/nvptx-sm.def: Add '61'.
* config/nvptx/nvptx-gen.h: Regenerate.
* config/nvptx/nvptx-gen.opt: Likewise.
* config/nvptx/nvptx.cc (first_ptx_version_supporting_sm): Adjust.
* config/nvptx/nvptx.opt (-march-map=sm_61, -march-map=sm_62):
Likewise.
* config.gcc: Likewise.
* doc/invoke.texi (Nvidia PTX Options): Document '-march=sm_61'.
* config/nvptx/gen-multilib-matches-tests: Extend.
gcc/testsuite/
* gcc.target/nvptx/march-map=sm_61.c: Adjust.
* gcc.target/nvptx/march-map=sm_62.c: Likewise.
* gcc.target/nvptx/march=sm_61.c: New.
libgomp/
* testsuite/libgomp.c/declare-variant-3-sm61.c: New.
* testsuite/libgomp.c/declare-variant-3.h: Adjust.

(cherry picked from commit 7b53b88381179c5c8152bcb890460f66d9c88fac)

nvptx: Support '-mptx=5.0'

gcc/
* config/nvptx/nvptx-opts.h (enum ptx_version): Add
'PTX_VERSION_5_0'.
* config/nvptx/nvptx.cc (ptx_version_to_string)
(ptx_version_to_number): Adjust.
* config/nvptx/nvptx.h (TARGET_PTX_5_0): New.
* config/nvptx/nvptx.opt (Enum(ptx_version)): Add 'EnumValue'
'5.0' for 'PTX_VERSION_5_0'.
* doc/invoke.texi (Nvidia PTX Options): Document '-mptx=5.0'.
gcc/testsuite/
* gcc.target/nvptx/mptx=5.0.c: New.

(cherry picked from commit 97616687149f115e0ab946b9a05a9f8c1e47429e)

Adjust 'libgomp.c++/target-cdtor-{1,2}.C' for 'targetm.cxx.use_aeabi_atexit' [PR119853, PR119854]

Fix-up for commit aafe942227baf8c2bcd4cac2cb150e49a4b895a9
"GCN, nvptx offloading: Host/device compatibility: Itanium C++ ABI, DSO Object Destruction API [PR119853, PR119854]":
we need to adjust for 'targetm.cxx.use_aeabi_atexit':

    gcc/config/arm/arm.cc:#define TARGET_CXX_USE_AEABI_ATEXIT arm_cxx_use_aeabi_atexit

    gcc/config/arm/arm.cc:/* The EABI says __aeabi_atexit should be used to register static
    gcc/config/arm/arm.cc-   destructors.  */
    gcc/config/arm/arm.cc-
    gcc/config/arm/arm.cc-static bool
    gcc/config/arm/arm.cc:arm_cxx_use_aeabi_atexit (void)
    gcc/config/arm/arm.cc-{
    gcc/config/arm/arm.cc-  return TARGET_AAPCS_BASED;
    gcc/config/arm/arm.cc-}

..., which 'gcc/cp/decl.cc:get_atexit_node' then acts on: call '__aeabi_atexit'
instead of '__cxa_atexit', and swap two arguments.
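
The argument swap can be seen from the two registration signatures.  The
sketch below uses mock functions (hypothetical names, prefixed 'mock_')
with the same parameter order as '__cxa_atexit' and '__aeabi_atexit'; it
does not call the real runtime entry points, and register_dtor is only a
model of the choice made by the front end.

```cpp
// Records what the most recent mock registration received.
struct Registration {
    void* object;
    void (*dtor)(void*);
    void* dso;
};
Registration last_registration = {nullptr, nullptr, nullptr};

// Same parameter order as '__cxa_atexit': destructor first, then object.
int mock_cxa_atexit(void (*dtor)(void*), void* object, void* dso) {
    last_registration = {object, dtor, dso};
    return 0;
}

// Same parameter order as '__aeabi_atexit': object first, then destructor.
int mock_aeabi_atexit(void* object, void (*dtor)(void*), void* dso) {
    last_registration = {object, dtor, dso};
    return 0;
}

// Model of the selection: same information, different callee and order.
int register_dtor(bool use_aeabi_atexit, void (*dtor)(void*), void* object,
                  void* dso) {
    return use_aeabi_atexit ? mock_aeabi_atexit(object, dtor, dso)
                            : mock_cxa_atexit(dtor, object, dso);
}

// Trivial destructor stand-in for the example.
void noop_dtor(void*) {}
```

A test that scans for the registration call therefore has to accept
either callee name and either argument order.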

PR target/119853
PR target/119854
libgomp/
* testsuite/libgomp.c++/target-cdtor-1.C: Adjust for
'targetm.cxx.use_aeabi_atexit'.
* testsuite/libgomp.c++/target-cdtor-2.C: Likewise.

(cherry picked from commit 04b42c4245d85f77aa54ec002ebd7bbe6fde5f11)

GCN, nvptx offloading: Host/device compatibility: Itanium C++ ABI, DSO Object Destruction API [PR119853, PR119854]

'__dso_handle' for '__cxa_atexit', '__cxa_finalize'. See
<https://itanium-cxx-abi.github.io/cxx-abi/abi.html#dso-dtor>.

PR target/119853
PR target/119854
libgcc/
* config/gcn/crt0.c (_fini_array): Call
'__GCC_offload___cxa_finalize'.
* config/nvptx/gbl-ctors.c (__static_do_global_dtors): Likewise.
libgomp/
* target-cxa-dso-dtor.c: New.
* config/accel/target-cxa-dso-dtor.c: Likewise.
* Makefile.am (libgomp_la_SOURCES): Add it.
* Makefile.in: Regenerate.
* testsuite/libgomp.c++/target-cdtor-1.C: New.
* testsuite/libgomp.c++/target-cdtor-2.C: Likewise.

(cherry picked from commit aafe942227baf8c2bcd4cac2cb150e49a4b895a9)

Add 'libgomp.c-c++-common/target-cdtor-1.c'

libgomp/
* testsuite/libgomp.c-c++-common/target-cdtor-1.c: New.

(cherry picked from commit 40ce48e87c1e7344c622c8eb6bed53f1311f5a0a)

GCN: Properly switch sections in 'gcn_hsa_declare_function_name' [PR119737]

There are GCN/C++ target as well as offloading codes, where the hard-coded
section names in 'gcn_hsa_declare_function_name' do not fit, and assembly thus
fails:

    LLVM ERROR: Size expression must be absolute.

This commit progresses GCN target:

    [-FAIL: g++.dg/init/call1.C  -std=gnu++17 (internal compiler error: Aborted signal terminated program as)-]
    [-FAIL:-]{+PASS:+} g++.dg/init/call1.C  -std=gnu++17 (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} g++.dg/init/call1.C  -std=gnu++17 [-compilation failed to produce executable-]{+execution test+}
    [-FAIL: g++.dg/init/call1.C  -std=gnu++26 (internal compiler error: Aborted signal terminated program as)-]
    [-FAIL:-]{+PASS:+} g++.dg/init/call1.C  -std=gnu++26 (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} g++.dg/init/call1.C  -std=gnu++26 [-compilation failed to produce executable-]{+execution test+}
    UNSUPPORTED: g++.dg/init/call1.C  -std=gnu++98: exception handling not supported

..., and GCN offloading:

    [-XFAIL: libgomp.c++/target-exceptions-throw-1.C (internal compiler error: Aborted signal terminated program as)-]
    [-XFAIL: libgomp.c++/target-exceptions-throw-1.C PR119737 at line 7 (test for bogus messages, line )-]
    [-XFAIL:-]{+PASS:+} libgomp.c++/target-exceptions-throw-1.C (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} libgomp.c++/target-exceptions-throw-1.C [-compilation failed to produce executable-]{+execution test+}
    {+PASS: libgomp.c++/target-exceptions-throw-1.C output pattern test+}

    [-XFAIL: libgomp.c++/target-exceptions-throw-2.C (internal compiler error: Aborted signal terminated program as)-]
    [-XFAIL: libgomp.c++/target-exceptions-throw-2.C PR119737 at line 7 (test for bogus messages, line )-]
    [-XFAIL:-]{+PASS:+} libgomp.c++/target-exceptions-throw-2.C (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} libgomp.c++/target-exceptions-throw-2.C [-compilation failed to produce executable-]{+execution test+}
    {+PASS: libgomp.c++/target-exceptions-throw-2.C output pattern test+}

    [-XFAIL: libgomp.oacc-c++/exceptions-throw-1.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O2  (internal compiler error: Aborted signal terminated program as)-]
    [-XFAIL: libgomp.oacc-c++/exceptions-throw-1.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O2  PR119737 at line 7 (test for bogus messages, line )-]
    [-XFAIL:-]{+PASS:+} libgomp.oacc-c++/exceptions-throw-1.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O2  (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} libgomp.oacc-c++/exceptions-throw-1.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O2  [-compilation failed to produce executable-]{+execution test+}
    {+PASS: libgomp.oacc-c++/exceptions-throw-1.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O2  output pattern test+}

    [-XFAIL: libgomp.oacc-c++/exceptions-throw-2.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O2  (internal compiler error: Aborted signal terminated program as)-]
    [-XFAIL: libgomp.oacc-c++/exceptions-throw-2.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O2  PR119737 at line 9 (test for bogus messages, line )-]
    [-XFAIL:-]{+PASS:+} libgomp.oacc-c++/exceptions-throw-2.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O2  (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} libgomp.oacc-c++/exceptions-throw-2.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O2  [-compilation failed to produce executable-]{+execution test+}
    {+PASS: libgomp.oacc-c++/exceptions-throw-2.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O2  output pattern test+}

PR target/119737
gcc/
* config/gcn/gcn.cc (gcn_hsa_declare_function_name): Properly
switch sections.
libgomp/
* testsuite/libgomp.c++/target-exceptions-throw-1.C: Remove
PR119737 XFAILing.
* testsuite/libgomp.c++/target-exceptions-throw-2.C: Likewise.
* testsuite/libgomp.oacc-c++/exceptions-throw-1.C: Likewise.
* testsuite/libgomp.oacc-c++/exceptions-throw-2.C: Likewise.

Co-authored-by: Thomas Schwinge <tschwinge@baylibre.com>
(cherry picked from commit dfc43afe719898c3eafbed37fac7e6809d8b97ab)

Adjust 'libgomp.c++/target-exceptions-pr118794-1.C' for 'targetm.arm_eabi_unwinder' [PR118794]

Fix-up for commit aa3e72f943032e5f074b2bd2fd06d130dda8760b
"Add test cases for exception handling constructs in dead code for GCN, nvptx target and OpenMP 'target' offloading [PR118794]":
we need to adjust for configurations with 'targetm.arm_eabi_unwinder', as per:

    gcc/config/arm/arm.cc:#define TARGET_ARM_EABI_UNWINDER true
    gcc/config/c6x/c6x.cc:#define TARGET_ARM_EABI_UNWINDER true

..., which for ARM is conditional on '#if ARM_UNWIND_INFO' (defined in
'gcc/config/arm/bpabi.h', used for various GCC configurations), and for
C6x unconditional.

This gets us:

    --- target-exceptions-pr118794-1.C.269t.optimized
    +++ target-exceptions-pr118794-1.C.270t.optimized
    [...]
     __attribute__((omp declare target))
     void f ()
    [...]
       gimple_call <__dt_comp , NULL, &c>
    -  gimple_call <__builtin_eh_pointer, _7, 2>
    -  gimple_call <__builtin_unwind_resume, NULL, _7>
    +  gimple_call <__builtin_cxa_end_cleanup, NULL>

     }
    [...]
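
For orientation, a function of roughly the following shape produces such
a cleanup: a local object with a destructor, followed by a call that may
throw.  This is a hypothetical sketch, not the contents of the test
case; the names and the 'destroyed' counter are made up.

```cpp
#include <stdexcept>

int destroyed = 0;  // counts destructor runs, for illustration only

struct C {
    ~C() { ++destroyed; }
};

void may_throw(bool do_throw) {
    if (do_throw) {
        throw std::runtime_error("boom");
    }
}

// If may_throw throws, the cleanup for 'c' runs its destructor and then
// hands control back to the unwinder.  That hand-over is the call that
// differs between the two dumps above.
void f(bool do_throw) {
    C c;
    may_throw(do_throw);
}
```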

PR target/118794
libgomp/
* testsuite/libgomp.c++/target-exceptions-pr118794-1.C: Adjust for
'targetm.arm_eabi_unwinder'.
* testsuite/libgomp.c++/target-exceptions-pr118794-1-offload-sorry-GCN.C:
Likewise.
* testsuite/libgomp.c++/target-exceptions-pr118794-1-offload-sorry-nvptx.C:
Likewise.

(cherry picked from commit 8a1f5424b04130f88e9dcd5cbecd58300bc5166e)

Daily bump.

Ada: Fix wrong tag in style check warnings

This fixes an old issue whereby violations of the style check -gnatyc are
sometimes reported as violations of -gnatyt instead.

gcc/ada/
PR ada/121184
* styleg.adb (Check_Comment): Use consistent warning message.

aarch64: Tweak handling of general SVE permutes [PR121027]

This PR is partly about a code quality regression that was triggered
by g:caa7a99a052929d5970677c5b639e1fa5166e334.  That patch taught the
gimple optimisers to fold two VEC_PERM_EXPRs into one, conditional
upon either (a) the original permutations not being "native" operations
or (b) the combined permutation being a "native" operation.

Whether something is a "native" operation is tested by calling
can_vec_perm_const_p with allow_variable_p set to false.  This requires
the permutation to be supported directly by TARGET_VECTORIZE_VEC_PERM_CONST,
rather than falling back to the general vec_perm optab.

This exposed a problem with the way that we handled general 2-input
permutations for SVE.  Unlike Advanced SIMD, base SVE does not have
an instruction to do general 2-input permutations.  We do still implement
the vec_perm optab for SVE, but only when the vector length is known at
compile time.  The general expansion is pretty expensive: an AND, a SUB,
two TBLs, and an ORR.  It certainly couldn't be considered a "native"
operation.
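
The five-operation sequence can be modelled on scalars.  The sketch
below is an assumption about how the pieces fit together, not the code
in aarch64.cc: it models TBL as a lookup that yields zero for
out-of-range indices, uses byte elements, and assumes that the element
count n is a power of two with 2n <= 256.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Vec = std::vector<std::uint8_t>;

// Scalar model of TBL: an out-of-range index selects zero.
Vec tbl(const Vec& table, const Vec& idx) {
    Vec out(idx.size());
    for (std::size_t i = 0; i < idx.size(); ++i) {
        out[i] = idx[i] < table.size() ? table[idx[i]] : 0;
    }
    return out;
}

// General two-input permute: selector values in [0, n) pick from 'a',
// values in [n, 2n) pick from 'b'.
Vec permute2(const Vec& a, const Vec& b, const Vec& sel) {
    const std::size_t n = a.size();
    assert(b.size() == n && sel.size() == n);
    assert(n != 0 && (n & (n - 1)) == 0 && 2 * n <= 256);
    Vec idx_a(n), idx_b(n), out(n);
    for (std::size_t i = 0; i < n; ++i) {
        // AND: wrap the selector into [0, 2n).
        idx_a[i] = static_cast<std::uint8_t>(sel[i] & (2 * n - 1));
        // SUB: rebase onto the second input; lanes that select from 'a'
        // wrap around to an out-of-range byte value.
        idx_b[i] = static_cast<std::uint8_t>(idx_a[i] - n);
    }
    const Vec from_a = tbl(a, idx_a);  // TBL on the first input
    const Vec from_b = tbl(b, idx_b);  // TBL on the second input
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = from_a[i] | from_b[i];  // ORR: one side is always zero
    }
    return out;
}
```

For example, with a = {10, 11, 12, 13}, b = {20, 21, 22, 23} and
selector {0, 5, 2, 7}, the model returns {10, 21, 12, 23}.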

However, if a VEC_PERM_EXPR has a constant selector, the indices can
be wider than the elements being permuted.  This is not true for the
vec_perm optab, where the indices and permuted elements must have the
same precision.

This leads to one case where we cannot leave a general 2-input permutation
to be handled by the vec_perm optab: when permuting bytes on a target
with 2048-bit vectors.  In that case, the indices of the elements in
the second vector are in the range [256, 511], which cannot be stored
in a byte index.
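
The arithmetic behind this case can be checked directly.  The helper
names below are hypothetical; the numbers follow from the vector and
element widths alone.

```cpp
// Largest selector index of a two-input permute: two vectors' worth of
// elements, numbered from zero.
constexpr unsigned max_two_input_index(unsigned vector_bits,
                                       unsigned element_bits) {
    return 2 * (vector_bits / element_bits) - 1;
}

// True if every such index fits in an index as wide as one element
// (element_bits is assumed to be smaller than 64).
constexpr bool index_fits_in_element(unsigned vector_bits,
                                     unsigned element_bits) {
    return max_two_input_index(vector_bits, element_bits)
           <= (1ull << element_bits) - 1;
}

// 1024-bit vectors of bytes: 128 elements per vector, indices up to 255.
static_assert(index_fits_in_element(1024, 8), "byte indices suffice");
// 2048-bit vectors of bytes: 256 elements per vector, indices up to 511.
static_assert(!index_fits_in_element(2048, 8), "byte indices overflow");
// 2048-bit vectors of halfwords: 128 elements per vector, indices up to 255.
static_assert(index_fits_in_element(2048, 16), "halfword indices suffice");
```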

TARGET_VECTORIZE_VEC_PERM_CONST therefore has to handle 2-input SVE
permutations for one specific case.  Rather than check for that
specific case, the code went ahead and used the vec_perm expansion
whenever it worked.  But that undermines the !allow_variable_p
handling in can_vec_perm_const_p; it becomes impossible for
target-independent code to distinguish "native" operations from
the worst-case fallback.

This patch instead limits TARGET_VECTORIZE_VEC_PERM_CONST to the
cases that it has to handle.  It fixes the PR for all vector lengths
except 2048 bits.

A better fix would be to introduce some sort of costing mechanism,
which would allow us to reject the new VEC_PERM_EXPR even for
2048-bit targets.  But that would be a significant amount of work
and would not be backportable.

gcc/
PR target/121027
* config/aarch64/aarch64.cc (aarch64_evpc_sve_tbl): Punt on 2-input
operations that can be handled by vec_perm.

gcc/testsuite/
PR target/121027
* gcc.target/aarch64/sve/acle/general/perm_1.c: New test.

(cherry picked from commit 1f52396c6fc940224e9d858d49e41310a6dfa43d)

aarch64: Fix LD1Q and ST1Q failures for big-endian

LD1Q gathers and ST1Q scatters are unusual in that they operate
on 128-bit blocks (effectively VNx1TI).  However, we don't have
modes or ACLE types for 128-bit integers, and 128-bit integers
are not the intended use case.  Instead, the instructions are
intended to be used in "hybrid VLA" operations, where each 128-bit
block is an Advanced SIMD vector.

The normal SVE modes therefore capture the intended use case better
than VNx1TI would.  For example, VNx2DI is effectively N copies
of V2DI, VNx4SI N copies of V4SI, etc.

Since there is only one LD1Q instruction and one ST1Q instruction,
the ACLE support used a single pattern for each, with the loaded or
stored data having mode VNx2DI.  The ST1Q pattern was generated by:

    rtx data = e.args.last ();
    e.args.last () = force_lowpart_subreg (VNx2DImode, data, GET_MODE (data));
    e.prepare_gather_address_operands (1, false);
    return e.use_exact_insn (CODE_FOR_aarch64_scatter_st1q);

where the force_lowpart_subreg bitcast the stored data to VNx2DI.
But such subregs require an element reverse on big-endian targets
(see the comment at the head of aarch64-sve.md), which wasn't the
intention.  The code should have used aarch64_sve_reinterpret instead.

The LD1Q pattern was used as follows:

    e.prepare_gather_address_operands (1, false);
    return e.use_exact_insn (CODE_FOR_aarch64_gather_ld1q);

which always returns a VNx2DI value, leaving the caller to bitcast
that to the correct mode.  That bitcast again uses subregs and has
the same problem as above.

However, for the reasons explained in the comment, using
aarch64_sve_reinterpret does not work well for LD1Q.  The patch
instead parameterises the LD1Q based on the required data mode.

gcc/
* config/aarch64/aarch64-sve2.md (aarch64_gather_ld1q): Replace with...
(@aarch64_gather_ld1q<mode>): ...this, parameterizing based on mode.
* config/aarch64/aarch64-sve-builtins-sve2.cc
(svld1q_gather_impl::expand): Update accordingly.
(svst1q_scatter_impl::expand): Use aarch64_sve_reinterpret
instead of force_lowpart_subreg.

(cherry picked from commit e7f049471c6caf22c65ac48773d864fca7a4cdc4)

testsuite: Add -funwind-tables to sve*/pfalse* tests

The SVE svpfalse folding tests use CFI directives to delimit the
function bodies. That requires -funwind-tables to be enabled,
which is true by default for *-linux-gnu targets, but not for *-elf.

gcc/testsuite/
* gcc.target/aarch64/sve/pfalse-binary.c: Add -funwind-tables.
* gcc.target/aarch64/sve/pfalse-binary_int_opt_n.c: Likewise.
* gcc.target/aarch64/sve/pfalse-binary_opt_n.c: Likewise.
* gcc.target/aarch64/sve/pfalse-binary_opt_single_n.c: Likewise.
* gcc.target/aarch64/sve/pfalse-binary_rotate.c: Likewise.
* gcc.target/aarch64/sve/pfalse-binary_uint64_opt_n.c: Likewise.
* gcc.target/aarch64/sve/pfalse-binary_uint_opt_n.c: Likewise.
* gcc.target/aarch64/sve/pfalse-binaryxn.c: Likewise.
* gcc.target/aarch64/sve/pfalse-clast.c: Likewise.
* gcc.target/aarch64/sve/pfalse-compare_opt_n.c: Likewise.
* gcc.target/aarch64/sve/pfalse-compare_wide_opt_n.c: Likewise.
* gcc.target/aarch64/sve/pfalse-count_pred.c: Likewise.
* gcc.target/aarch64/sve/pfalse-fold_left.c: Likewise.
* gcc.target/aarch64/sve/pfalse-load.c: Likewise.
* gcc.target/aarch64/sve/pfalse-load_ext.c: Likewise.
* gcc.target/aarch64/sve/pfalse-load_ext_gather_index.c: Likewise.
* gcc.target/aarch64/sve/pfalse-load_ext_gather_offset.c: Likewise.
* gcc.target/aarch64/sve/pfalse-load_gather_sv.c: Likewise.
* gcc.target/aarch64/sve/pfalse-load_gather_vs.c: Likewise.
* gcc.target/aarch64/sve/pfalse-load_replicate.c: Likewise.
* gcc.target/aarch64/sve/pfalse-prefetch.c: Likewise.
* gcc.target/aarch64/sve/pfalse-prefetch_gather_index.c: Likewise.
* gcc.target/aarch64/sve/pfalse-prefetch_gather_offset.c: Likewise.
* gcc.target/aarch64/sve/pfalse-ptest.c: Likewise.
* gcc.target/aarch64/sve/pfalse-rdffr.c: Likewise.
* gcc.target/aarch64/sve/pfalse-reduction.c: Likewise.
* gcc.target/aarch64/sve/pfalse-reduction_wide.c: Likewise.
* gcc.target/aarch64/sve/pfalse-shift_right_imm.c: Likewise.
* gcc.target/aarch64/sve/pfalse-store.c: Likewise.
* gcc.target/aarch64/sve/pfalse-store_scatter_index.c: Likewise.
* gcc.target/aarch64/sve/pfalse-store_scatter_offset.c: Likewise.
* gcc.target/aarch64/sve/pfalse-storexn.c: Likewise.
* gcc.target/aarch64/sve/pfalse-ternary_opt_n.c: Likewise.
* gcc.target/aarch64/sve/pfalse-ternary_rotate.c: Likewise.
* gcc.target/aarch64/sve/pfalse-unary.c: Likewise.
* gcc.target/aarch64/sve/pfalse-unary_convert_narrowt.c: Likewise.
* gcc.target/aarch64/sve/pfalse-unary_convertxn.c: Likewise.
* gcc.target/aarch64/sve/pfalse-unary_n.c: Likewise.
* gcc.target/aarch64/sve/pfalse-unary_pred.c: Likewise.
* gcc.target/aarch64/sve/pfalse-unary_to_uint.c: Likewise.
* gcc.target/aarch64/sve/pfalse-unaryxn.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-binary.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-binary_int_opt_n.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-binary_int_opt_single_n.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-binary_opt_n.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-binary_opt_single_n.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-binary_to_uint.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-binary_uint_opt_n.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-binary_wide.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-compare.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-load_ext_gather_index_restricted.c,
* gcc.target/aarch64/sve2/pfalse-load_ext_gather_offset_restricted.c,
* gcc.target/aarch64/sve2/pfalse-load_gather_sv_restricted.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-load_gather_vs.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-shift_left_imm_to_uint.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-shift_right_imm.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-store_scatter_index_restricted.c,
* gcc.target/aarch64/sve2/pfalse-store_scatter_offset_restricted.c,
* gcc.target/aarch64/sve2/pfalse-unary.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-unary_convert.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-unary_convert_narrowt.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-unary_to_int.c: Likewise.

(cherry picked from commit 2ff8da46152cbade579700823cc7b1460ddd91b8)

aarch64: Extend HVLA permutations to big-endian

TARGET_VECTORIZE_VEC_PERM_CONST has code to match the SVE2.1
"hybrid VLA" DUPQ, EXTQ, UZPQ{1,2}, and ZIPQ{1,2} instructions.
This matching was conditional on !BYTES_BIG_ENDIAN.

The ACLE code also lowered the associated SVE2.1 intrinsics into
suitable VEC_PERM_EXPRs.  This lowering was not conditional on
!BYTES_BIG_ENDIAN.

The mismatch led to lots of ICEs in the ACLE tests on big-endian
targets: we lowered to VEC_PERM_EXPRs that are not supported.

I think the !BYTES_BIG_ENDIAN restriction was unnecessary.
SVE maps the first memory element to the least significant end of
the register for both endiannesses, so no endian correction or lane
number adjustment is necessary.

This is in some ways a bit counterintuitive.  ZIPQ1 is conceptually
"apply Advanced SIMD ZIP1 to each 128-bit block" and endianness does
matter when choosing between Advanced SIMD ZIP1 and ZIP2.  For example,
the V4SI permute selector { 0, 4, 1, 5 } corresponds to ZIP1 for little-
endian and ZIP2 for big-endian.  But the difference between the hybrid
VLA and Advanced SIMD permute selectors is a consequence of the
difference between the SVE and Advanced SIMD element orders.
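
The ZIP1/ZIP2 correspondence above can be checked with a small
standalone model.  This is only a sketch, not GCC code: it assumes that
on big-endian targets GCC element i of a V4SI sits in architectural
lane 3 - i, and that the big-endian form also swaps the two inputs.
The names swap_order, select_0415 and expand_0415 are hypothetical.

```cpp
#include <array>
#include <cassert>

using V4 = std::array<int, 4>;

// Architectural ZIP1/ZIP2 on four lanes, where index 0 is the least
// significant lane of the register.
V4 zip1 (const V4 &a, const V4 &b) { return {a[0], b[0], a[1], b[1]}; }
V4 zip2 (const V4 &a, const V4 &b) { return {a[2], b[2], a[3], b[3]}; }

// Assumed mapping between GCC element order and architectural lanes:
// identity on little-endian, full reversal on big-endian.  The mapping
// is its own inverse, so the same helper converts in both directions.
V4 swap_order (const V4 &v, bool big_endian)
{
  if (!big_endian)
    return v;
  return {v[3], v[2], v[1], v[0]};
}

// Reference result of the selector { 0, 4, 1, 5 } in GCC element order.
V4 select_0415 (const V4 &a, const V4 &b) { return {a[0], b[0], a[1], b[1]}; }

// Implement the same selector with an architectural instruction:
// ZIP1 on little-endian, ZIP2 with swapped inputs on big-endian.
V4 expand_0415 (const V4 &a, const V4 &b, bool big_endian)
{
  V4 la = swap_order (a, big_endian);
  V4 lb = swap_order (b, big_endian);
  V4 lanes = big_endian ? zip2 (lb, la) : zip1 (la, lb);
  return swap_order (lanes, big_endian);
}
```

In this model expand_0415 gives the same elements as select_0415 for
both endiannesses, even though a different instruction is used.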

The same thing applies to ACLE intrinsics.  The current lowering of
svzipq1 etc. is correct for both endiannesses.  If ACLE code does:

  2x svld1_s32 + svzipq1_s32 + svst1_s32

then the byte-for-byte result is the same for both endiannesses.
On big-endian targets, this is different from using the Advanced SIMD
sequence below for each 128-bit block:

  2x LDR + ZIP1 + STR

In contrast, the byte-for-byte result of:

  2x svld1q_gather_s32 + svzipq1_s32 + svst1q_scatter_s32

depends on endianness, since the quadword gathers and scatters use
Advanced SIMD byte ordering for each 128-bit block.  This gather/scatter
sequence behaves in the same way as the Advanced SIMD LDR+ZIP1+STR
sequence for both endiannesses.

Programmers writing ACLE code have to be aware of this difference
if they want to support both endiannesses.

The patch includes some new execution tests to verify the expansion
of the VEC_PERM_EXPRs.

gcc/
* doc/sourcebuild.texi (aarch64_sve2_hw, aarch64_sve2p1_hw): Document.
* config/aarch64/aarch64.cc (aarch64_evpc_hvla): Extend to
BYTES_BIG_ENDIAN.

gcc/testsuite/
* lib/target-supports.exp (check_effective_target_aarch64_sve2p1_hw):
New proc.
* gcc.target/aarch64/sve2/dupq_1.c: Extend to big-endian.  Add
noipa attributes.
* gcc.target/aarch64/sve2/extq_1.c: Likewise.
* gcc.target/aarch64/sve2/uzpq_1.c: Likewise.
* gcc.target/aarch64/sve2/zipq_1.c: Likewise.
* gcc.target/aarch64/sve2/dupq_1_run.c: New test.
* gcc.target/aarch64/sve2/extq_1_run.c: Likewise.
* gcc.target/aarch64/sve2/uzpq_1_run.c: Likewise.
* gcc.target/aarch64/sve2/zipq_1_run.c: Likewise.

(cherry picked from commit 3b870131487d786a74f27a89d0415c8207770f14)

aarch64: Fix endianness of DFmode vector constants

aarch64_simd_valid_imm tries to decompose a constant into a repeating
series of 64 bits, since most Advanced SIMD and SVE immediate forms
require that.  (The exceptions are handled first.)  It does this by
building up a byte-level register image, lsb first.  If the image does
turn out to repeat every 64 bits, it loads the first 64 bits into an
integer.

At this point, endianness has mostly been dealt with.  Endianness
applies to transfers between registers and memory, whereas at this
point we're dealing purely with register values.

However, one of the things we try is to bitcast the value to a float
and use FMOV.  This involves splitting the value into 32-bit chunks
(stored as longs) and passing them to real_from_target.  The problem
being fixed by this patch is that, when a value spans multiple 32-bit
chunks, real_from_target expects them to be in memory rather than
register order.  Thus index 0 is the most significant chunk if
FLOAT_WORDS_BIG_ENDIAN and the least significant chunk otherwise.
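
As a rough illustration of the chunk ordering, the following sketch
splits a 64-bit register image into two 32-bit chunks in the order
described above.  It is not the code in aarch64_simd_valid_imm: the
helper name split_for_target is hypothetical and the
FLOAT_WORDS_BIG_ENDIAN setting is passed in as a plain bool.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Split a 64-bit register image into two 32-bit chunks.  Index 0 holds
// the most significant chunk when float_words_big_endian is true and
// the least significant chunk otherwise.
std::array<uint32_t, 2>
split_for_target (uint64_t image, bool float_words_big_endian)
{
  uint32_t lo = static_cast<uint32_t> (image);
  uint32_t hi = static_cast<uint32_t> (image >> 32);
  if (float_words_big_endian)
    return {hi, lo};
  return {lo, hi};
}
```

For the IEEE double 1.0 (image 0x3ff0000000000000), the nonzero chunk
0x3ff00000 ends up at index 1 in the little-endian case and at index 0
in the big-endian case.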

This fixes aarch64/sve/cond_fadd_1.c and various other tests
for aarch64_be-elf.

gcc/
* config/aarch64/aarch64.cc (aarch64_simd_valid_imm): Account
for FLOAT_WORDS_BIG_ENDIAN when building a floating-point value.

(cherry picked from commit 82dd19890b6139c4bac2385068a68613920ae1a2)

aarch64: Some fixes for SVE INDEX constants

When using SVE INDEX to load an Advanced SIMD vector, we need to
take account of the different element ordering for big-endian
targets.  For example, when big-endian targets store the V4SI
constant { 0, 1, 2, 3 } in registers, 0 becomes the most
significant element, whereas INDEX always operates from the
least significant element.  A big-endian target would therefore
load V4SI { 0, 1, 2, 3 } using:

    INDEX Z0.S, #3, #-1

rather than little-endian's:

    INDEX Z0.S, #0, #1
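
The base and step adjustment can be sketched as follows.  This is a
simplified model rather than the code in aarch64_sve_index_series_p;
the struct and function names are hypothetical, and the series is
assumed to be a plain linear one with nelts elements.

```cpp
#include <cassert>

// Operands for an INDEX instruction: the value placed in the least
// significant element and the increment between elements.
struct index_operands
{
  long base;
  long step;
};

// Given an Advanced SIMD constant { first, first + step, ... } with
// nelts elements, return INDEX operands that build it.  On big-endian
// targets the first element is the most significant one, so the series
// has to be generated in reverse.
index_operands
index_for_series (long first, long step, unsigned nelts, bool big_endian)
{
  assert (nelts > 0);
  if (!big_endian)
    return {first, step};
  return {first + step * static_cast<long> (nelts - 1), -step};
}
```

For the V4SI constant { 0, 1, 2, 3 } this gives base 0 and step 1 on
little-endian, and base 3 and step -1 on big-endian, matching the two
INDEX instructions shown above.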

While there, I noticed that we would only check the first vector
in a multi-vector SVE constant, which would trigger an ICE if the
other vectors turned out to be invalid.  This is pretty difficult to
trigger at the moment, since we only allow single-register modes to be
used as frontend & middle-end vector modes, but it can be seen using
the RTL frontend.

gcc/
* config/aarch64/aarch64.cc (aarch64_sve_index_series_p): New
function, split out from...
(aarch64_simd_valid_imm): ...here.  Account for the different
SVE and Advanced SIMD element orders on big-endian targets.
Check each vector in a structure mode.

gcc/testsuite/
* gcc.dg/rtl/aarch64/vec-series-1.c: New test.
* gcc.dg/rtl/aarch64/vec-series-2.c: Likewise.
* gcc.target/aarch64/sve/acle/general/dupq_2.c: Fix expected
output for this big-endian test.
* gcc.target/aarch64/sve/acle/general/dupq_4.c: Likewise.
* gcc.target/aarch64/sve/vec_init_3.c: Restrict to little-endian
targets and add more tests.
* gcc.target/aarch64/sve/vec_init_4.c: New big-endian version
of vec_init_3.c.

(cherry picked from commit 41c446389446a357172883389e36fd10c882ce6d)

Make the RTL frontend set REG_NREGS correctly

While working on a new testcase that uses the RTL frontend,
I hit a bug where a (reg ...) that spans multiple hard registers
had REG_NREGS set to 1. This caused various things to misbehave.
For example, if the (reg ...) in question was used as crtl->return_rtx,
only the first register in the group would be marked as live on exit.

gcc/
* read-rtl-function.cc (function_reader::read_rtx_operand_r): Use
hard_regno_nregs to work out REG_NREGS for hard registers.

(cherry picked from commit 76db38d811a63a603deedfe327d5e201fc820444)

ext-dce: Fix subreg_lsb is_constant assumption (2)

This patch fixes another instance of the problem described in the
cover note for g:bf3037e923e9f91d93ab64bdf73a37f64f659fb9.

gcc/
* ext-dce.cc (ext_dce_process_uses): Apply is_constant directly
to the subreg_lsb.

(cherry picked from commit bddc41e290113dd9160b01f2fdf925a1876c8ee0)

aarch64: Fix ZIP1 order in aarch64_expand_vector_init [PR118891]

aarch64_expand_vector_init contains some divide-and-conquer code
that tries to load the odd and even elements into 64-bit registers
and then ZIP them together. On big-endian targets, the even elements
are more significant than the odd elements and so should come second
in the ZIP.

This fixes many execution failures on aarch64_be-elf, including
gcc.c-torture/execute/pr28982a.c.

gcc/
PR target/118891
* config/aarch64/aarch64.cc (aarch64_expand_vector_init): Fix the
ZIP1 operand order for big-endian targets.

(cherry picked from commit cb2b5471516c3c469f65d927a2a30eb15357e429)

aarch64: Fix neon-sve-bridge.c failures for big-endian

Lowpart subregs are generally disallowed on big-endian SVE vector
registers, since the first memory element is stored at the least
significant end of the register, rather than the most significant end.
(See the comment at the head of aarch64-sve.md for details,
and aarch64_modes_compatible_p for the implementation.)

This means that arm_sve_neon_bridge.h needs to use custom define_insns
for big-endian targets, in lieu of using lowpart subregs.  However,
one of those define_insns relied on the prohibited lowparts internally,
to convert an Advanced SIMD register to an SVE register.  Since the
lowpart is not allowed, the lowpart_subreg would return null, leading
to a later ICE.

The simplest fix seems to be to use %Z instead, to force the Advanced
SIMD register to be written as an SVE register.

gcc/
* config/aarch64/aarch64-sve.md (@aarch64_sve_set_neonq_<mode>):
Use %Z instead of lowpart_subreg.  Tweak formatting.

(cherry picked from commit 69c839c7361430ec27d1f13f909531b872588f27)

ext-dce: Fix subreg_lsb is_constant assumption

ext-dce had:

  if (SUBREG_P (dst) && SUBREG_BYTE (dst).is_constant ())
    {
      bit = subreg_lsb (dst).to_constant ();
      if (bit >= HOST_BITS_PER_WIDE_INT)
        bit = HOST_BITS_PER_WIDE_INT - 1;
      dst = SUBREG_REG (dst);

But a constant SUBREG_BYTE doesn't guarantee a constant subreg_lsb.
If the SUBREG_REG is a pair of N-bit registers on a big-endian target,
the most significant end has a SUBREG_BYTE of 0 but a subreg_lsb of N.
This N would then be non-constant for variable-length registers.
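
A simplified model of the relationship is sketched below.  It is not
GCC's subreg_lsb: it assumes byte and word endianness agree, ignores
paradoxical subregs, and uses fixed sizes where the real code would
use poly_ints.  The function name is hypothetical.

```cpp
#include <cassert>

// Bit position of the least significant bit of a subreg within its
// inner register, given sizes and the SUBREG_BYTE offset in bytes.
unsigned
subreg_lsb_bits (unsigned inner_bytes, unsigned outer_bytes,
                 unsigned subreg_byte, bool big_endian)
{
  assert (subreg_byte + outer_bytes <= inner_bytes);
  // On big-endian targets byte offset 0 is the most significant end,
  // so the distance from the lsb depends on the register sizes.
  unsigned bytes_from_lsb = (big_endian
                             ? inner_bytes - outer_bytes - subreg_byte
                             : subreg_byte);
  return bytes_from_lsb * 8;
}
```

With a SUBREG_BYTE of 0 into a pair of N-bit registers, the model
returns 0 on little-endian but N on big-endian (128 for N = 128, 256
for N = 256), so the result tracks the register length even though the
byte offset is constant.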

The patch fixes gcc.dg/torture/pr120276.c and other failures on
aarch64_be-elf.

gcc/
* ext-dce.cc (ext_dce_process_uses): Apply is_constant directly
to the subreg_lsb.

(cherry picked from commit bf3037e923e9f91d93ab64bdf73a37f64f659fb9)

vect: Fix VEC_WIDEN_PLUS_HI/LO choice for big-endian [PR118891]

In the tree codes and optabs, the "hi" in a vector hi/lo pair means
"most significant" and the "lo" means "least significant", with
significance following GCC's normal endian expectations.  Thus on
big-endian targets, the hi part handles the first half of the elements
in memory order and the lo part handles the second half.

For tree codes, supportable_widening_operation first chooses hi/lo
pairs based on little-endian order and then uses:

  if (BYTES_BIG_ENDIAN && c1 != VEC_WIDEN_MULT_EVEN_EXPR)
    std::swap (c1, c2);

to adjust.  However, the handling for internal functions was missing
an equivalent fixup.  This led to several execution failures in vect.exp
on aarch64_be-elf.

If the hi/lo code fails, the internal function handling goes on to try
even/odd.  But I couldn't see anything obvious that would put the even/
odd results back into the right order later, so there might be a latent
bug there too.

gcc/
PR tree-optimization/118891
* tree-vect-stmts.cc (supportable_widening_operation): Swap the
hi and lo internal functions on big-endian targets.

(cherry picked from commit ec54a14239b12d03c600c14f3ce9710e65cd33f1)

Daily bump.

Fortran: fix bogus runtime error with optional procedure argument [PR121145]

PR fortran/121145

gcc/fortran/ChangeLog:

* trans-expr.cc (gfc_conv_procedure_call): Do not create pointer
check for proc-pointer actual passed to optional dummy.

gcc/testsuite/ChangeLog:

* gfortran.dg/pointer_check_15.f90: New test.

(cherry picked from commit 8f9450505f8244d262f8b4ff274f113f99cdc7e2)

Daily bump.

libstdc++: Update some baseline_symbols.txt (x32)

* config/abi/post/x86_64-linux-gnu/x32/baseline_symbols.txt:
Updated.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit c7baa61a583b49df63d3df8c6336f8405e24f012)

Daily bump.

[PATCH] PR modula2/121164 Modula 2 build failure

This patch fixes the 2nd parameter name mismatch in
ARRAYOFCHAR.mod.

gcc/m2/ChangeLog:

PR modula2/121164
* gm2-libs/ARRAYOFCHAR.mod (Write): Rename 2nd parameter
name a to str.

(cherry picked from commit 22d8b89689769e5efefd2c4e6dda88d9f0b2a945)

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

rust: Silence a clang warning in borrow-checker-diagnostics

When compiling
gcc/rust/checks/errors/borrowck/rust-borrow-checker-diagnostics.cc
with clang, it emits the following warning:

gcc/rust/checks/errors/borrowck/rust-borrow-checker-diagnostics.cc:145:46: warning: non-constant-expression cannot be narrowed from type 'Polonius::Loan' (aka 'unsigned long') to 'uint32_t' (aka 'unsigned int') in initializer list [-Wc++11-narrowing]

I'd hope that for indexing that is never really a problem,
nevertheless if narrowing is taking place, I guess it can be argued it
should be made explicit.
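
A minimal standalone illustration of making such a narrowing explicit
is shown below.  The struct and function are hypothetical stand-ins,
not the code in rust-borrow-checker-diagnostics.cc; unsigned long
plays the role of Polonius::Loan as named in the warning, and the use
of static_cast is an assumption about the form of the cast.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical aggregate with a 32-bit index member.
struct loan_index
{
  uint32_t value;
};

loan_index
make_loan_index (unsigned long loan)
{
  // Writing "loan_index{loan}" here would be a narrowing conversion in
  // list-initialization, which clang reports under -Wc++11-narrowing.
  // The explicit cast states that the truncation is intended.
  return loan_index{static_cast<uint32_t> (loan)};
}
```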

gcc/rust/ChangeLog:

2025-06-23 Martin Jambor <mjambor@suse.cz>

* checks/errors/borrowck/rust-borrow-checker-diagnostics.cc
(BorrowCheckerDiagnostics::get_loan): Type cast loan to uint32_t.

(cherry picked from commit 1e69c5655894ab3cbeb4431a5b3daff211d3c4e1)

gccrs: Fix narrowing conversion warnings

Fixes PR#119641

gcc/rust/ChangeLog:

* checks/errors/borrowck/rust-bir-place.h
(IndexVec::size_type): Add.
(IndexVec::MAX_INDEX): Add.
(IndexVec::size): Change the return type to the type of the
internal value used by the index type.
(PlaceDB::lookup_or_add_variable): Use the return value from the
PlaceDB::add_place call.
* checks/errors/borrowck/rust-bir.h
(struct BasicBlockId): Move this definition before the
definition of the struct Function.

Signed-off-by: Owen Avery <powerboat9.gamer@gmail.com>
(cherry picked from commit beced835afa3908aa94550d2ca5ee3879a620adb)

Disable parallel testing for 'rust/compile/nr2/compile.exp' [PR119508]

..., using the standard idiom. This '*.exp' file doesn't adhere to the
parallel testing protocol as defined in 'gcc/testsuite/lib/gcc-defs.exp'.

This also restores proper behavior for '*.exp' files executing after (!) this
one, which erroneously caused hundreds or even thousands of individual test
cases to get duplicated vs. skipped, randomly, depending on the '-jN' level.

PR testsuite/119508
gcc/testsuite/
* rust/compile/nr2/compile.exp: Disable parallel testing.

(cherry picked from commit 79d2c3089f480738613b7d338d86d8be710f8158)

Fix time zone for 'cobol.dg/group2/FUNCTION_DATE___TIME_OMNIBUS.cob' [PR119818]

This progresses:

    PASS: cobol.dg/group2/FUNCTION_DATE___TIME_OMNIBUS.cob   -O0  (test for excess errors)
    [-FAIL:-]{+PASS:+} cobol.dg/group2/FUNCTION_DATE___TIME_OMNIBUS.cob   -O0  execution test
    [Etc.]

PR cobol/119818
gcc/testsuite/
* cobol.dg/group2/FUNCTION_DATE___TIME_OMNIBUS.cob:
'dg-set-target-env-var TZ UTC0'.

(cherry picked from commit ed8761241ac529ccddb2b76a1895c124c67c132c)

mmix: Define MAX_FIXED_MODE_SIZE

Besides this commit working as a release-branch fix for the
PR, code inspection shows slightly better code for TImode
libgcc functions, and a modified
gcc.c-torture/execute/arith-rand-ll.c (basically s/long
long/__int128 and cutting out the non-128-bit cases) shows a
1.4% improvement. (Coremark code is identical, as
expected.)

PR middle-end/120935
* config/mmix/mmix.h (MAX_FIXED_MODE_SIZE): Define.

Co-authored-by: Pietro Monteiro <pietro@sociotechnical.xyz>
Signed-off-by: Pietro Monteiro <pietro@sociotechnical.xyz>

tree-optimization/120924 - up --param uninit-max-chain-len

The PR shows that the uninit analysis limits are set too low in
cases where we lower switches to ifs, as happens on s390x for a Linux
kernel TU. This causes false positive uninit diagnostics as we
abort the attempt to prove that a value is initialized on all
paths. The new testcase would only require upping to 9.

PR tree-optimization/120924
* params.opt (uninit-max-chain-len): Up from 8 to 12.

* gcc.dg/uninit-pr120924.c: New testcase.

(cherry picked from commit cf9a479e3f909d5217e954788eb3c5b569e4bc52)

[PATCH] PR modula2/120912: Request for a procedure to obtain a file from an IOChan

This patch introduces the procedure GetFile into the supplementary
ISO style library IOChanUtils.

gcc/m2/ChangeLog:

PR modula2/120912
* gm2-libs-iso/IOChanUtils.def (GetFile): New procedure function.
* gm2-libs-iso/IOChanUtils.mod (GetFile): New procedure function.

(cherry picked from commit 15670d4477ce219c017bd52417a6074b981fb197)

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

tree-optimization/121059 - fixup loop mask query

When we opportunistically mask an operand of an AND with an already
available loop mask we need to query that set with the correct number
of masks we expect.

PR tree-optimization/121059
* tree-vect-stmts.cc (vectorizable_operation): Query
scalar_cond_masked_set with the correct number of masks.

* gcc.dg/vect/pr121059.c: New testcase.

Co-Authored-By: Richard Sandiford <richard.sandiford@arm.com>
(cherry picked from commit 71be87055548cf942c7bc56d10ffd479db8569e4)

tree-optimization/121049 - avoid loop masking with even/odd reduction

The following disables loop masking when we are using an even/odd
widening operation in a reduction because the loop mask then aligns
to the wrong elements.
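
One way to picture the misalignment is the model below.  It is only a
sketch under stated assumptions, not the vectorizer's logic: a vector
of n narrow elements is widened into two vectors of n/2 elements, the
loop mask is assumed to be laid out linearly across those two vectors,
and all names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Source element feeding lane `lane` of output vector `vec` (0 or 1)
// when the two outputs hold the first and second halves of the input.
std::size_t
source_linear (std::size_t n, std::size_t vec, std::size_t lane)
{
  return vec * (n / 2) + lane;
}

// Source element when the two outputs hold the even and odd elements.
std::size_t
source_evenodd (std::size_t vec, std::size_t lane)
{
  return 2 * lane + vec;
}

// Loop mask for output vector `vec` when only the first `active`
// source elements are live, assuming a linear layout.
std::vector<bool>
linear_loop_mask (std::size_t n, std::size_t vec, std::size_t active)
{
  std::vector<bool> mask (n / 2);
  for (std::size_t lane = 0; lane < n / 2; ++lane)
    mask[lane] = source_linear (n, vec, lane) < active;
  return mask;
}

// Return true if the linear mask enables exactly those lanes whose
// source element is live.
bool
mask_matches (std::size_t n, std::size_t active, bool even_odd)
{
  for (std::size_t vec = 0; vec < 2; ++vec)
    {
      std::vector<bool> mask = linear_loop_mask (n, vec, active);
      for (std::size_t lane = 0; lane < n / 2; ++lane)
        {
          std::size_t src = (even_odd
                             ? source_evenodd (vec, lane)
                             : source_linear (n, vec, lane));
          if (mask[lane] != (src < active))
            return false;
        }
    }
  return true;
}
```

With n = 8 and 5 live elements, the mask matches for the linear split
but not for the even/odd split: the first output's mask enables all
four lanes, yet its last lane comes from source element 6, which is
not live.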

PR tree-optimization/121049
* internal-fn.h (widening_evenodd_fn_p): Declare.
* internal-fn.cc (widening_evenodd_fn_p): New function.
* tree-vect-stmts.cc (vectorizable_conversion): When using
an even/odd widening function disable loop masking.

* gcc.dg/vect/pr121049.c: New testcase.

(cherry picked from commit bc5570f7ef796fa7f5ab89b34ed9de2be5299f0e)

tree-optimization/121035 - handle stray VN values without expression

When VN iterates we can end up with unreachable inserted expressions
in the expression tables which in turn will not be added to their
value by PRE's compute_avail. This will later ICE when we pick
them up and want to generate them. Deal with this by giving up.

PR tree-optimization/121035
* tree-ssa-pre.cc (find_or_generate_expression): Handle
values without expression.

* gcc.dg/pr121035.c: New testcase.

(cherry picked from commit 9af57c471087a3a1b87621bce1208d6c77ba2a4a)