git.ipfire.org Git - thirdparty/gcc.git/log

libgomp: Add testcases for the standard C++ math library on offload targets

libgomp/

* testsuite/libgomp.c++/target-std__cmath.C: New.
* testsuite/libgomp.c++/target-std__complex.C: Likewise.
* testsuite/libgomp.c++/target-std__numbers.C: Likewise.

(cherry picked from commit fbcd0ad41f7cc801664da1e583f6bcad1eb02a08)

Add 'libgomp.c++/target-flex-[...].C' test cases

libgomp/ChangeLog:

* testsuite/libgomp.c++/target-flex-10.C: New test.
* testsuite/libgomp.c++/target-flex-100.C: New test.
* testsuite/libgomp.c++/target-flex-101.C: New test.
* testsuite/libgomp.c++/target-flex-11.C: New test.
* testsuite/libgomp.c++/target-flex-12.C: New test.
* testsuite/libgomp.c++/target-flex-2000.C: New test.
* testsuite/libgomp.c++/target-flex-2001.C: New test.
* testsuite/libgomp.c++/target-flex-2002.C: New test.
* testsuite/libgomp.c++/target-flex-2003.C: New test.
* testsuite/libgomp.c++/target-flex-30.C: New test.
* testsuite/libgomp.c++/target-flex-300.C: New test.
* testsuite/libgomp.c++/target-flex-31.C: New test.
* testsuite/libgomp.c++/target-flex-32.C: New test.
* testsuite/libgomp.c++/target-flex-33.C: New test.
* testsuite/libgomp.c++/target-flex-41.C: New test.
* testsuite/libgomp.c++/target-flex-60.C: New test.
* testsuite/libgomp.c++/target-flex-61.C: New test.
* testsuite/libgomp.c++/target-flex-62.C: New test.
* testsuite/libgomp.c++/target-flex-70.C: New test.
* testsuite/libgomp.c++/target-flex-80.C: New test.
* testsuite/libgomp.c++/target-flex-81.C: New test.
* testsuite/libgomp.c++/target-flex-90.C: New test.
* testsuite/libgomp.c++/target-flex-common.h: New test.

Co-authored-by: Thomas Schwinge <tschwinge@baylibre.com>
(cherry picked from commit 28a5bc2d4f7ae345234a15e22fd65cfad851cf04)

Defuse 'RESULT_DECL' check in 'pass_nrv' (for offloading compilation) [PR119835]

... to avoid running into ICEs per PR119835, until that's resolved properly.

PR middle-end/119835
gcc/
* tree-nrv.cc (pass_nrv::execute): Defuse 'RESULT_DECL' check.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/abi-struct-1.c:
'#pragma GCC optimize "-fno-inline"'.
* testsuite/libgomp.c-c++-common/target-abi-struct-1.c: New.
* testsuite/libgomp.c-c++-common/target-abi-struct-1-O0.c: Adjust.

Co-authored-by: Richard Biener <rguenther@suse.de>
(cherry picked from commit 543f7e1d59f0b6628e0de6610ad5e1cf7150090b)

'TYPE_EMPTY_P' vs. code offloading [PR120308]

We've got 'gcc/stor-layout.cc:finalize_type_size':

/* Handle empty records as per the x86-64 psABI. */
TYPE_EMPTY_P (type) = targetm.calls.empty_record_p (type);

(Indeed x86_64 is still the only target to define 'TARGET_EMPTY_RECORD_P',
calling 'gcc/tree.cc:default_is_empty_record'.)

And so it happens that for an empty struct used in code offloaded from x86_64
host (but not powerpc64le host, for example), we get to see 'TYPE_EMPTY_P' in
offloading compilation (where the offload targets (currently?) don't use it
themselves, and therefore aren't prepared to handle it).

For nvptx offloading compilation, this causes wrong code generation:
'ptxas [...] error : Call has wrong number of parameters', as nvptx code
generation for function definition doesn't pay attention to this flag (say, in
'gcc/config/nvptx/nvptx.cc:pass_in_memory', or wherever else would be
appropriate to handle that), but the generic code 'gcc/calls.cc:expand_call'
via 'gcc/function.cc:aggregate_value_p' does pay attention to it, and we thus
get mismatching function definition vs. function call.

This issue apparently isn't a problem for GCN offloading, but I don't know if
that's by design or by accident.
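
As an illustration of the kind of code affected, here is a minimal sketch of an
empty struct passed by value to a function called from an OpenMP 'target'
region.  The names 'Empty', 'take_empty', and 'run_offloaded' are made up for
this note and are not taken from the actual test cases.  Without '-fopenmp' the
pragmas are ignored and everything runs on the host.

```cpp
// Hypothetical reproducer shape; not the code from 'abi-struct-1.c'.
struct Empty {};  // no members: expected to be 'TYPE_EMPTY_P' on an x86_64 host

#pragma omp declare target
// On the host, the empty argument takes no ABI slot.  An offload target that
// ignores 'TYPE_EMPTY_P' in its function definition still expects two
// parameters, while the generic call expansion passes only one.
int take_empty (Empty, int x)
{
  return x + 1;
}
#pragma omp end declare target

int run_offloaded ()
{
  int r = 0;
#pragma omp target map(tofrom: r)
  r = take_empty (Empty {}, 41);
  return r;
}
```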

Richard Biener:
> It looks like TYPE_EMPTY_P is only used during RTL expansion for ABI
> purposes, so computing it during layout_type is premature as shown here.
>
> I would suggest to simply re-compute it at offload stream-in time.

(For avoidance of doubt, the additions to 'gcc.target/nvptx/abi-struct-arg.c',
'gcc.target/nvptx/abi-struct-ret.c' are not dependent on the offload streaming
code changes, but are just to mirror the changes to
'libgomp.oacc-c-c++-common/abi-struct-1.c'.)

PR lto/120308
gcc/
* lto-streamer-out.cc (hash_tree): Don't handle 'TYPE_EMPTY_P' for
'lto_stream_offload_p'.
* tree-streamer-in.cc (unpack_ts_type_common_value_fields):
Likewise.
* tree-streamer-out.cc (pack_ts_type_common_value_fields):
Likewise.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/abi-struct-1.c: Add empty
structure testing.
gcc/testsuite/
* gcc.target/nvptx/abi-struct-arg.c: Add empty structure testing.
* gcc.target/nvptx/abi-struct-ret.c: Likewise.

(cherry picked from commit 9063810c86beee6274d745b91d8fb43a81c9683e)

Add 'libgomp.c-c++-common/target-abi-struct-1-O0.c', 'libgomp.oacc-c-c++-common/abi-struct-1.c'

libgomp/
* testsuite/libgomp.c-c++-common/target-abi-struct-1-O0.c: New.
* testsuite/libgomp.oacc-c-c++-common/abi-struct-1.c: Likewise.

(cherry picked from commit 45efda05c47f770a617b44cf85713a696bcf0384)

libgomp.c/target-map-zero-sized-3.c: Fix code for non-USM offload [PR120530]

A mapping clause was missing, causing the code to fail with offloading
when a host pointer was not device accessible.
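
For readers unfamiliar with this failure mode, the following sketch shows the
general pattern.  It is not the code from 'target-map-zero-sized-3.c', and the
name 'sum_on_device' is hypothetical.  The assumption is a device without
unified shared memory, where host data dereferenced inside a 'target' region
needs an explicit 'map' clause.

```cpp
#include <cstddef>

// Sums 'n' ints on the offload device (or on the host if offloading is
// disabled).  The 'map(to: ...)' clause copies the pointed-to data to the
// device; without it, a non-USM device has no valid copy to read from.
int sum_on_device (const int *data, std::size_t n)
{
  int sum = 0;
#pragma omp target map(to: data[0:n]) map(tofrom: sum)
  for (std::size_t i = 0; i < n; ++i)
    sum += data[i];
  return sum;
}
```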

libgomp/ChangeLog:

PR target/120530
* testsuite/libgomp.c/target-map-zero-sized-3.c (main): Add missing
map clause; remove unused variable.

(cherry picked from commit 16c742e1079e838b920a1b215af17828da7c6365)

GCN, nvptx offloading: Restrain 'WARNING: program timed out.' while in 'dynamic_cast' only for effective-target 'offload_device' [PR119692]

In PR119692 "C++ 'typeinfo', 'vtable' vs. OpenACC, OpenMP 'target' offloading":

> --- Comment #8 from Rainer Orth <ro at gcc dot gnu.org> ---
> The last commit made things worse on sparc-sun-solaris2.11: since that one
> (dg-timeout 10) I regularly get
>
> WARNING: libgomp.c++/target-exceptions-bad_cast-1.C (test for excess errors)
> program timed out.
> FAIL: libgomp.c++/target-exceptions-bad_cast-1.C (test for excess errors)
> UNRESOLVED: libgomp.c++/target-exceptions-bad_cast-1.C compilation failed to produce executable
> UNRESOLVED: libgomp.c++/target-exceptions-bad_cast-1.C scan-tree-dump-times optimized "gimple_call <__cxa_bad_cast, " 1
>
> Before that, the test had no issue. Compiling the test on an unloaded system
> usually takes less than 1 sec, but when fully loaded, times can go up.

To keep things simple, let's restrict this temporary (yeah...) workaround to
apply only for effective-target 'offload_device', just like the
'dg-xfail-run-if' itself.

PR target/119692
libgomp/
* testsuite/libgomp.c++/pr119692-1-4.C: '{ dg-timeout 10 { target offload_device } }'.
* testsuite/libgomp.c++/pr119692-1-5.C: Likewise.
* testsuite/libgomp.c++/target-exceptions-bad_cast-1.C: Likewise.
* testsuite/libgomp.c++/target-exceptions-bad_cast-2.C: Likewise.
* testsuite/libgomp.oacc-c++/exceptions-bad_cast-1.C: Likewise.
* testsuite/libgomp.oacc-c++/exceptions-bad_cast-2.C: Likewise.

(cherry picked from commit aa143261bdf6db4334b3fcad7768b53e231f998e)

GCN, nvptx offloading: Restrain 'WARNING: program timed out.' while in 'dynamic_cast' [PR119692]

PR target/119692
libgomp/
* testsuite/libgomp.c++/pr119692-1-4.C: '{ dg-timeout 10 }'.
* testsuite/libgomp.c++/pr119692-1-5.C: Likewise.
* testsuite/libgomp.c++/target-exceptions-bad_cast-1.C: Likewise.
* testsuite/libgomp.c++/target-exceptions-bad_cast-2.C: Likewise.
* testsuite/libgomp.oacc-c++/exceptions-bad_cast-1.C: Likewise.
* testsuite/libgomp.oacc-c++/exceptions-bad_cast-2.C: Likewise.

(cherry picked from commit b5f48e7872db30b8f174cb2c497868a358bf75d6)

nvptx: Support '-march=sm_61'

gcc/
* config/nvptx/nvptx-sm.def: Add '61'.
* config/nvptx/nvptx-gen.h: Regenerate.
* config/nvptx/nvptx-gen.opt: Likewise.
* config/nvptx/nvptx.cc (first_ptx_version_supporting_sm): Adjust.
* config/nvptx/nvptx.opt (-march-map=sm_61, -march-map=sm_62):
Likewise.
* config.gcc: Likewise.
* doc/invoke.texi (Nvidia PTX Options): Document '-march=sm_61'.
* config/nvptx/gen-multilib-matches-tests: Extend.
gcc/testsuite/
* gcc.target/nvptx/march-map=sm_61.c: Adjust.
* gcc.target/nvptx/march-map=sm_62.c: Likewise.
* gcc.target/nvptx/march=sm_61.c: New.
libgomp/
* testsuite/libgomp.c/declare-variant-3-sm61.c: New.
* testsuite/libgomp.c/declare-variant-3.h: Adjust.

(cherry picked from commit 7b53b88381179c5c8152bcb890460f66d9c88fac)

nvptx: Support '-mptx=5.0'

gcc/
* config/nvptx/nvptx-opts.h (enum ptx_version): Add
'PTX_VERSION_5_0'.
* config/nvptx/nvptx.cc (ptx_version_to_string)
(ptx_version_to_number): Adjust.
* config/nvptx/nvptx.h (TARGET_PTX_5_0): New.
* config/nvptx/nvptx.opt (Enum(ptx_version)): Add 'EnumValue'
'5.0' for 'PTX_VERSION_5_0'.
* doc/invoke.texi (Nvidia PTX Options): Document '-mptx=5.0'.
gcc/testsuite/
* gcc.target/nvptx/mptx=5.0.c: New.

(cherry picked from commit 97616687149f115e0ab946b9a05a9f8c1e47429e)

Adjust 'libgomp.c++/target-cdtor-{1,2}.C' for 'targetm.cxx.use_aeabi_atexit' [PR119853, PR119854]

Fix-up for commit aafe942227baf8c2bcd4cac2cb150e49a4b895a9
"GCN, nvptx offloading: Host/device compatibility: Itanium C++ ABI, DSO Object Destruction API [PR119853, PR119854]":
we need to adjust for 'targetm.cxx.use_aeabi_atexit':

    gcc/config/arm/arm.cc:#define TARGET_CXX_USE_AEABI_ATEXIT arm_cxx_use_aeabi_atexit

    gcc/config/arm/arm.cc:/* The EABI says __aeabi_atexit should be used to register static
    gcc/config/arm/arm.cc-   destructors.  */
    gcc/config/arm/arm.cc-
    gcc/config/arm/arm.cc-static bool
    gcc/config/arm/arm.cc:arm_cxx_use_aeabi_atexit (void)
    gcc/config/arm/arm.cc-{
    gcc/config/arm/arm.cc-  return TARGET_AAPCS_BASED;
    gcc/config/arm/arm.cc-}

..., which 'gcc/cp/decl.cc:get_atexit_node' then acts on: call '__aeabi_atexit'
instead of '__cxa_atexit', and swap two arguments.
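
The argument swap can be pictured with two mock registration functions.  The
parameter orders mirror '__cxa_atexit' (destructor, object, DSO handle) and
'__aeabi_atexit' (object, destructor, DSO handle); the 'mock_' names, the
'Registration' record, and 'destroy_int' are inventions for this sketch, not
the real runtime entry points.

```cpp
// Record of the most recent registration, for inspection only.
struct Registration
{
  void (*dtor) (void *);
  void *obj;
  void *dso;
};
static Registration last_registration;

// Same parameter order as '__cxa_atexit'.
int mock_cxa_atexit (void (*dtor) (void *), void *obj, void *dso)
{
  last_registration.dtor = dtor;
  last_registration.obj = obj;
  last_registration.dso = dso;
  return 0;
}

// Same parameter order as '__aeabi_atexit': object first, then destructor.
int mock_aeabi_atexit (void *obj, void (*dtor) (void *), void *dso)
{
  return mock_cxa_atexit (dtor, obj, dso);
}

// Hypothetical "destructor" that just zeroes an int.
static void destroy_int (void *p)
{
  *static_cast<int *> (p) = 0;
}
```

A test that scans for the registration call therefore has to accept either
function name and either order of the first two arguments.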

PR target/119853
PR target/119854
libgomp/
* testsuite/libgomp.c++/target-cdtor-1.C: Adjust for
'targetm.cxx.use_aeabi_atexit'.
* testsuite/libgomp.c++/target-cdtor-2.C: Likewise.

(cherry picked from commit 04b42c4245d85f77aa54ec002ebd7bbe6fde5f11)

GCN, nvptx offloading: Host/device compatibility: Itanium C++ ABI, DSO Object Destruction API [PR119853, PR119854]

'__dso_handle' for '__cxa_atexit', '__cxa_finalize'. See
<https://itanium-cxx-abi.github.io/cxx-abi/abi.html#dso-dtor>.

PR target/119853
PR target/119854
libgcc/
* config/gcn/crt0.c (_fini_array): Call
'__GCC_offload___cxa_finalize'.
* config/nvptx/gbl-ctors.c (__static_do_global_dtors): Likewise.
libgomp/
* target-cxa-dso-dtor.c: New.
* config/accel/target-cxa-dso-dtor.c: Likewise.
* Makefile.am (libgomp_la_SOURCES): Add it.
* Makefile.in: Regenerate.
* testsuite/libgomp.c++/target-cdtor-1.C: New.
* testsuite/libgomp.c++/target-cdtor-2.C: Likewise.

(cherry picked from commit aafe942227baf8c2bcd4cac2cb150e49a4b895a9)

Add 'libgomp.c-c++-common/target-cdtor-1.c'

libgomp/
* testsuite/libgomp.c-c++-common/target-cdtor-1.c: New.

(cherry picked from commit 40ce48e87c1e7344c622c8eb6bed53f1311f5a0a)

GCN: Properly switch sections in 'gcn_hsa_declare_function_name' [PR119737]

There are GCN/C++ target as well as offloading codes, where the hard-coded
section names in 'gcn_hsa_declare_function_name' do not fit, and assembly thus
fails:

    LLVM ERROR: Size expression must be absolute.

This commit progresses GCN target:

    [-FAIL: g++.dg/init/call1.C  -std=gnu++17 (internal compiler error: Aborted signal terminated program as)-]
    [-FAIL:-]{+PASS:+} g++.dg/init/call1.C  -std=gnu++17 (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} g++.dg/init/call1.C  -std=gnu++17 [-compilation failed to produce executable-]{+execution test+}
    [-FAIL: g++.dg/init/call1.C  -std=gnu++26 (internal compiler error: Aborted signal terminated program as)-]
    [-FAIL:-]{+PASS:+} g++.dg/init/call1.C  -std=gnu++26 (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} g++.dg/init/call1.C  -std=gnu++26 [-compilation failed to produce executable-]{+execution test+}
    UNSUPPORTED: g++.dg/init/call1.C  -std=gnu++98: exception handling not supported

..., and GCN offloading:

    [-XFAIL: libgomp.c++/target-exceptions-throw-1.C (internal compiler error: Aborted signal terminated program as)-]
    [-XFAIL: libgomp.c++/target-exceptions-throw-1.C PR119737 at line 7 (test for bogus messages, line )-]
    [-XFAIL:-]{+PASS:+} libgomp.c++/target-exceptions-throw-1.C (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} libgomp.c++/target-exceptions-throw-1.C [-compilation failed to produce executable-]{+execution test+}
    {+PASS: libgomp.c++/target-exceptions-throw-1.C output pattern test+}

    [-XFAIL: libgomp.c++/target-exceptions-throw-2.C (internal compiler error: Aborted signal terminated program as)-]
    [-XFAIL: libgomp.c++/target-exceptions-throw-2.C PR119737 at line 7 (test for bogus messages, line )-]
    [-XFAIL:-]{+PASS:+} libgomp.c++/target-exceptions-throw-2.C (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} libgomp.c++/target-exceptions-throw-2.C [-compilation failed to produce executable-]{+execution test+}
    {+PASS: libgomp.c++/target-exceptions-throw-2.C output pattern test+}

    [-XFAIL: libgomp.oacc-c++/exceptions-throw-1.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O2  (internal compiler error: Aborted signal terminated program as)-]
    [-XFAIL: libgomp.oacc-c++/exceptions-throw-1.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O2  PR119737 at line 7 (test for bogus messages, line )-]
    [-XFAIL:-]{+PASS:+} libgomp.oacc-c++/exceptions-throw-1.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O2  (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} libgomp.oacc-c++/exceptions-throw-1.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O2  [-compilation failed to produce executable-]{+execution test+}
    {+PASS: libgomp.oacc-c++/exceptions-throw-1.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O2  output pattern test+}

    [-XFAIL: libgomp.oacc-c++/exceptions-throw-2.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O2  (internal compiler error: Aborted signal terminated program as)-]
    [-XFAIL: libgomp.oacc-c++/exceptions-throw-2.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O2  PR119737 at line 9 (test for bogus messages, line )-]
    [-XFAIL:-]{+PASS:+} libgomp.oacc-c++/exceptions-throw-2.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O2  (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} libgomp.oacc-c++/exceptions-throw-2.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O2  [-compilation failed to produce executable-]{+execution test+}
    {+PASS: libgomp.oacc-c++/exceptions-throw-2.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O2  output pattern test+}

PR target/119737
gcc/
* config/gcn/gcn.cc (gcn_hsa_declare_function_name): Properly
switch sections.
libgomp/
* testsuite/libgomp.c++/target-exceptions-throw-1.C: Remove
PR119737 XFAILing.
* testsuite/libgomp.c++/target-exceptions-throw-2.C: Likewise.
* testsuite/libgomp.oacc-c++/exceptions-throw-1.C: Likewise.
* testsuite/libgomp.oacc-c++/exceptions-throw-2.C: Likewise.

Co-authored-by: Thomas Schwinge <tschwinge@baylibre.com>
(cherry picked from commit dfc43afe719898c3eafbed37fac7e6809d8b97ab)

Adjust 'libgomp.c++/target-exceptions-pr118794-1.C' for 'targetm.arm_eabi_unwinder' [PR118794]

Fix-up for commit aa3e72f943032e5f074b2bd2fd06d130dda8760b
"Add test cases for exception handling constructs in dead code for GCN, nvptx target and OpenMP 'target' offloading [PR118794]":
we need to adjust for configurations with 'targetm.arm_eabi_unwinder', as per:

    gcc/config/arm/arm.cc:#define TARGET_ARM_EABI_UNWINDER true
    gcc/config/c6x/c6x.cc:#define TARGET_ARM_EABI_UNWINDER true

..., which for ARM is conditional on '#if ARM_UNWIND_INFO' (defined in
'gcc/config/arm/bpabi.h', used for various GCC configurations), and for
C6x unconditional.

This gets us:

    --- target-exceptions-pr118794-1.C.269t.optimized
    +++ target-exceptions-pr118794-1.C.270t.optimized
    [...]
     __attribute__((omp declare target))
     void f ()
    [...]
       gimple_call <__dt_comp , NULL, &c>
    -  gimple_call <__builtin_eh_pointer, _7, 2>
    -  gimple_call <__builtin_unwind_resume, NULL, _7>
    +  gimple_call <__builtin_cxa_end_cleanup, NULL>

     }
    [...]

PR target/118794
libgomp/
* testsuite/libgomp.c++/target-exceptions-pr118794-1.C: Adjust for
'targetm.arm_eabi_unwinder'.
* testsuite/libgomp.c++/target-exceptions-pr118794-1-offload-sorry-GCN.C:
Likewise.
* testsuite/libgomp.c++/target-exceptions-pr118794-1-offload-sorry-nvptx.C:
Likewise.

(cherry picked from commit 8a1f5424b04130f88e9dcd5cbecd58300bc5166e)

Daily bump.

Ada: Fix wrong tag in style check warnings

This fixes an old issue whereby violations of the style check -gnatyc are
sometimes reported as violations of -gnatyt instead.

gcc/ada/
PR ada/121184
* styleg.adb (Check_Comment): Use consistent warning message.

aarch64: Tweak handling of general SVE permutes [PR121027]

This PR is partly about a code quality regression that was triggered
by g:caa7a99a052929d5970677c5b639e1fa5166e334.  That patch taught the
gimple optimisers to fold two VEC_PERM_EXPRs into one, conditional
upon either (a) the original permutations not being "native" operations
or (b) the combined permutation being a "native" operation.

Whether something is a "native" operation is tested by calling
can_vec_perm_const_p with allow_variable_p set to false.  This requires
the permutation to be supported directly by TARGET_VECTORIZE_VEC_PERM_CONST,
rather than falling back to the general vec_perm optab.

This exposed a problem with the way that we handled general 2-input
permutations for SVE.  Unlike Advanced SIMD, base SVE does not have
an instruction to do general 2-input permutations.  We do still implement
the vec_perm optab for SVE, but only when the vector length is known at
compile time.  The general expansion is pretty expensive: an AND, a SUB,
two TBLs, and an ORR.  It certainly couldn't be considered a "native"
operation.
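
To make the cost concrete, here is a scalar model of one plausible reading of
that five-operation sequence.  It relies on TBL yielding zero for an
out-of-range index; the names 'tbl' and 'two_input_permute' are hypothetical
helpers for this note, and the model is not the code GCC emits.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Scalar model of a TBL-style lookup: an out-of-range index selects zero.
Bytes tbl (const Bytes &table, const Bytes &idx)
{
  Bytes out (idx.size ());
  for (std::size_t i = 0; i < idx.size (); ++i)
    out[i] = idx[i] < table.size () ? table[idx[i]] : 0;
  return out;
}

// Two-input byte permute built from AND, TBL, SUB, TBL, ORR.
// Assumes a, b and sel all have n elements, n a power of two, n <= 128.
Bytes two_input_permute (const Bytes &a, const Bytes &b, const Bytes &sel)
{
  const std::size_t n = a.size ();
  Bytes masked (n), shifted (n);
  for (std::size_t i = 0; i < n; ++i)
    masked[i] = static_cast<std::uint8_t> (sel[i] & (2 * n - 1));  // AND
  Bytes result = tbl (a, masked);                                  // TBL
  for (std::size_t i = 0; i < n; ++i)
    shifted[i] = static_cast<std::uint8_t> (masked[i] - n);        // SUB (wraps)
  const Bytes from_b = tbl (b, shifted);                           // TBL
  for (std::size_t i = 0; i < n; ++i)
    result[i] |= from_b[i];                                        // ORR
  return result;
}
```

Each index hits exactly one of the two lookups, so the final OR merely merges
the two partial results.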

However, if a VEC_PERM_EXPR has a constant selector, the indices can
be wider than the elements being permuted.  This is not true for the
vec_perm optab, where the indices and permuted elements must have the
same precision.

This leads to one case where we cannot leave a general 2-input permutation
to be handled by the vec_perm optab: when permuting bytes on a target
with 2048-bit vectors.  In that case, the indices of the elements in
the second vector are in the range [256, 511], which cannot be stored
in a byte index.
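
The arithmetic behind that range can be checked with a few lines of C++.  The
helper names are hypothetical and exist only for this note.

```cpp
#include <cstdint>
#include <limits>

// Number of byte elements in one vector of the given width.
constexpr unsigned bytes_per_vector (unsigned vector_bits)
{
  return vector_bits / 8;
}

// Largest selector index needed to address the second input of a
// two-input byte permute.
constexpr unsigned max_two_input_index (unsigned vector_bits)
{
  return 2 * bytes_per_vector (vector_bits) - 1;
}

// True if every such index fits in an index element that is as wide as
// the data element, i.e. one byte.
constexpr bool fits_in_byte_index (unsigned vector_bits)
{
  return max_two_input_index (vector_bits)
	 <= std::numeric_limits<std::uint8_t>::max ();
}

static_assert (fits_in_byte_index (1024), "1024-bit vectors need indices up to 255");
static_assert (!fits_in_byte_index (2048), "2048-bit vectors need indices up to 511");
```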

TARGET_VECTORIZE_VEC_PERM_CONST therefore has to handle 2-input SVE
permutations for one specific case.  Rather than check for that
specific case, the code went ahead and used the vec_perm expansion
whenever it worked.  But that undermines the !allow_variable_p
handling in can_vec_perm_const_p; it becomes impossible for
target-independent code to distinguish "native" operations from
the worst-case fallback.

This patch instead limits TARGET_VECTORIZE_VEC_PERM_CONST to the
cases that it has to handle.  It fixes the PR for all vector lengths
except 2048 bits.

A better fix would be to introduce some sort of costing mechanism,
which would allow us to reject the new VEC_PERM_EXPR even for
2048-bit targets.  But that would be a significant amount of work
and would not be backportable.

gcc/
PR target/121027
* config/aarch64/aarch64.cc (aarch64_evpc_sve_tbl): Punt on 2-input
operations that can be handled by vec_perm.

gcc/testsuite/
PR target/121027
* gcc.target/aarch64/sve/acle/general/perm_1.c: New test.

(cherry picked from commit 1f52396c6fc940224e9d858d49e41310a6dfa43d)

aarch64: Fix LD1Q and ST1Q failures for big-endian

LD1Q gathers and ST1Q scatters are unusual in that they operate
on 128-bit blocks (effectively VNx1TI).  However, we don't have
modes or ACLE types for 128-bit integers, and 128-bit integers
are not the intended use case.  Instead, the instructions are
intended to be used in "hybrid VLA" operations, where each 128-bit
block is an Advanced SIMD vector.

The normal SVE modes therefore capture the intended use case better
than VNx1TI would.  For example, VNx2DI is effectively N copies
of V2DI, VNx4SI N copies of V4SI, etc.

Since there is only one LD1Q instruction and one ST1Q instruction,
the ACLE support used a single pattern for each, with the loaded or
stored data having mode VNx2DI.  The ST1Q pattern was generated by:

    rtx data = e.args.last ();
    e.args.last () = force_lowpart_subreg (VNx2DImode, data, GET_MODE (data));
    e.prepare_gather_address_operands (1, false);
    return e.use_exact_insn (CODE_FOR_aarch64_scatter_st1q);

where the force_lowpart_subreg bitcast the stored data to VNx2DI.
But such subregs require an element reverse on big-endian targets
(see the comment at the head of aarch64-sve.md), which wasn't the
intention.  The code should have used aarch64_sve_reinterpret instead.

The LD1Q pattern was used as follows:

    e.prepare_gather_address_operands (1, false);
    return e.use_exact_insn (CODE_FOR_aarch64_gather_ld1q);

which always returns a VNx2DI value, leaving the caller to bitcast
that to the correct mode.  That bitcast again uses subregs and has
the same problem as above.

However, for the reasons explained in the comment, using
aarch64_sve_reinterpret does not work well for LD1Q.  The patch
instead parameterises the LD1Q based on the required data mode.

gcc/
* config/aarch64/aarch64-sve2.md (aarch64_gather_ld1q): Replace with...
(@aarch64_gather_ld1q<mode>): ...this, parameterizing based on mode.
* config/aarch64/aarch64-sve-builtins-sve2.cc
(svld1q_gather_impl::expand): Update accordingly.
(svst1q_scatter_impl::expand): Use aarch64_sve_reinterpret
instead of force_lowpart_subreg.

(cherry picked from commit e7f049471c6caf22c65ac48773d864fca7a4cdc4)

testsuite: Add -funwind-tables to sve*/pfalse* tests

The SVE svpfalse folding tests use CFI directives to delimit the
function bodies. That requires -funwind-tables to be enabled,
which is true by default for *-linux-gnu targets, but not for *-elf.

gcc/testsuite/
* gcc.target/aarch64/sve/pfalse-binary.c: Add -funwind-tables.
* gcc.target/aarch64/sve/pfalse-binary_int_opt_n.c: Likewise.
* gcc.target/aarch64/sve/pfalse-binary_opt_n.c: Likewise.
* gcc.target/aarch64/sve/pfalse-binary_opt_single_n.c: Likewise.
* gcc.target/aarch64/sve/pfalse-binary_rotate.c: Likewise.
* gcc.target/aarch64/sve/pfalse-binary_uint64_opt_n.c: Likewise.
* gcc.target/aarch64/sve/pfalse-binary_uint_opt_n.c: Likewise.
* gcc.target/aarch64/sve/pfalse-binaryxn.c: Likewise.
* gcc.target/aarch64/sve/pfalse-clast.c: Likewise.
* gcc.target/aarch64/sve/pfalse-compare_opt_n.c: Likewise.
* gcc.target/aarch64/sve/pfalse-compare_wide_opt_n.c: Likewise.
* gcc.target/aarch64/sve/pfalse-count_pred.c: Likewise.
* gcc.target/aarch64/sve/pfalse-fold_left.c: Likewise.
* gcc.target/aarch64/sve/pfalse-load.c: Likewise.
* gcc.target/aarch64/sve/pfalse-load_ext.c: Likewise.
* gcc.target/aarch64/sve/pfalse-load_ext_gather_index.c: Likewise.
* gcc.target/aarch64/sve/pfalse-load_ext_gather_offset.c: Likewise.
* gcc.target/aarch64/sve/pfalse-load_gather_sv.c: Likewise.
* gcc.target/aarch64/sve/pfalse-load_gather_vs.c: Likewise.
* gcc.target/aarch64/sve/pfalse-load_replicate.c: Likewise.
* gcc.target/aarch64/sve/pfalse-prefetch.c: Likewise.
* gcc.target/aarch64/sve/pfalse-prefetch_gather_index.c: Likewise.
* gcc.target/aarch64/sve/pfalse-prefetch_gather_offset.c: Likewise.
* gcc.target/aarch64/sve/pfalse-ptest.c: Likewise.
* gcc.target/aarch64/sve/pfalse-rdffr.c: Likewise.
* gcc.target/aarch64/sve/pfalse-reduction.c: Likewise.
* gcc.target/aarch64/sve/pfalse-reduction_wide.c: Likewise.
* gcc.target/aarch64/sve/pfalse-shift_right_imm.c: Likewise.
* gcc.target/aarch64/sve/pfalse-store.c: Likewise.
* gcc.target/aarch64/sve/pfalse-store_scatter_index.c: Likewise.
* gcc.target/aarch64/sve/pfalse-store_scatter_offset.c: Likewise.
* gcc.target/aarch64/sve/pfalse-storexn.c: Likewise.
* gcc.target/aarch64/sve/pfalse-ternary_opt_n.c: Likewise.
* gcc.target/aarch64/sve/pfalse-ternary_rotate.c: Likewise.
* gcc.target/aarch64/sve/pfalse-unary.c: Likewise.
* gcc.target/aarch64/sve/pfalse-unary_convert_narrowt.c: Likewise.
* gcc.target/aarch64/sve/pfalse-unary_convertxn.c: Likewise.
* gcc.target/aarch64/sve/pfalse-unary_n.c: Likewise.
* gcc.target/aarch64/sve/pfalse-unary_pred.c: Likewise.
* gcc.target/aarch64/sve/pfalse-unary_to_uint.c: Likewise.
* gcc.target/aarch64/sve/pfalse-unaryxn.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-binary.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-binary_int_opt_n.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-binary_int_opt_single_n.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-binary_opt_n.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-binary_opt_single_n.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-binary_to_uint.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-binary_uint_opt_n.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-binary_wide.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-compare.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-load_ext_gather_index_restricted.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-load_ext_gather_offset_restricted.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-load_gather_sv_restricted.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-load_gather_vs.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-shift_left_imm_to_uint.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-shift_right_imm.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-store_scatter_index_restricted.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-store_scatter_offset_restricted.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-unary.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-unary_convert.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-unary_convert_narrowt.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-unary_to_int.c: Likewise.

(cherry picked from commit 2ff8da46152cbade579700823cc7b1460ddd91b8)

aarch64: Extend HVLA permutations to big-endian

TARGET_VECTORIZE_VEC_PERM_CONST has code to match the SVE2.1
"hybrid VLA" DUPQ, EXTQ, UZPQ{1,2}, and ZIPQ{1,2} instructions.
This matching was conditional on !BYTES_BIG_ENDIAN.

The ACLE code also lowered the associated SVE2.1 intrinsics into
suitable VEC_PERM_EXPRs.  This lowering was not conditional on
!BYTES_BIG_ENDIAN.

The mismatch led to lots of ICEs in the ACLE tests on big-endian
targets: we lowered to VEC_PERM_EXPRs that are not supported.

I think the !BYTES_BIG_ENDIAN restriction was unnecessary.
SVE maps the first memory element to the least significant end of
the register for both endiannesses, so no endian correction or lane
number adjustment is necessary.

This is in some ways a bit counterintuitive.  ZIPQ1 is conceptually
"apply Advanced SIMD ZIP1 to each 128-bit block" and endianness does
matter when choosing between Advanced SIMD ZIP1 and ZIP2.  For example,
the V4SI permute selector { 0, 4, 1, 5 } corresponds to ZIP1 for little-
endian and ZIP2 for big-endian.  But the difference between the hybrid
VLA and Advanced SIMD permute selectors is a consequence of the
difference between the SVE and Advanced SIMD element orders.

The same thing applies to ACLE intrinsics.  The current lowering of
svzipq1 etc. is correct for both endiannesses.  If ACLE code does:

  2x svld1_s32 + svzipq1_s32 + svst1_s32

then the byte-for-byte result is the same for both endiannesses.
On big-endian targets, this is different from using the Advanced SIMD
sequence below for each 128-bit block:

  2x LDR + ZIP1 + STR

In contrast, the byte-for-byte result of:

  2x svld1q_gather_s32 + svzipq1_s32 + svst1q_scatter_s32

depends on endianness, since the quadword gathers and scatters use
Advanced SIMD byte ordering for each 128-bit block.  This gather/scatter
sequence behaves in the same way as the Advanced SIMD LDR+ZIP1+STR
sequence for both endiannesses.

Programmers writing ACLE code have to be aware of this difference
if they want to support both endiannesses.

The patch includes some new execution tests to verify the expansion
of the VEC_PERM_EXPRs.

gcc/
* doc/sourcebuild.texi (aarch64_sve2_hw, aarch64_sve2p1_hw): Document.
* config/aarch64/aarch64.cc (aarch64_evpc_hvla): Extend to
BYTES_BIG_ENDIAN.

gcc/testsuite/
* lib/target-supports.exp (check_effective_target_aarch64_sve2p1_hw):
New proc.
* gcc.target/aarch64/sve2/dupq_1.c: Extend to big-endian.  Add
noipa attributes.
* gcc.target/aarch64/sve2/extq_1.c: Likewise.
* gcc.target/aarch64/sve2/uzpq_1.c: Likewise.
* gcc.target/aarch64/sve2/zipq_1.c: Likewise.
* gcc.target/aarch64/sve2/dupq_1_run.c: New test.
* gcc.target/aarch64/sve2/extq_1_run.c: Likewise.
* gcc.target/aarch64/sve2/uzpq_1_run.c: Likewise.
* gcc.target/aarch64/sve2/zipq_1_run.c: Likewise.

(cherry picked from commit 3b870131487d786a74f27a89d0415c8207770f14)

aarch64: Fix endianness of DFmode vector constants

aarch64_simd_valid_imm tries to decompose a constant into a repeating
series of 64 bits, since most Advanced SIMD and SVE immediate forms
require that.  (The exceptions are handled first.)  It does this by
building up a byte-level register image, lsb first.  If the image does
turn out to repeat every 64 bits, it loads the first 64 bits into an
integer.

At this point, endianness has mostly been dealt with.  Endianness
applies to transfers between registers and memory, whereas at this
point we're dealing purely with register values.

However, one of the things we try is to bitcast the value to a float
and use FMOV.  This involves splitting the value into 32-bit chunks
(stored as longs) and passing them to real_from_target.  The problem
being fixed by this patch is that, when a value spans multiple 32-bit
chunks, real_from_target expects them to be in memory rather than
register order.  Thus index 0 is the most significant chunk if
FLOAT_WORDS_BIG_ENDIAN and the least significant chunk otherwise.
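
A small sketch of the chunk ordering, outside of GCC: the helper below is
hypothetical and only models the ordering rule described above, not the
actual code in 'aarch64_simd_valid_imm'.

```cpp
#include <array>
#include <cstdint>

// Split a 64-bit register image into two 32-bit chunks, ordered the way a
// 'real_from_target'-style consumer expects them: memory order rather than
// register order.
std::array<std::uint32_t, 2>
split_for_real_from_target (std::uint64_t image, bool float_words_big_endian)
{
  const std::uint32_t lo = static_cast<std::uint32_t> (image);
  const std::uint32_t hi = static_cast<std::uint32_t> (image >> 32);
  if (float_words_big_endian)
    return {{hi, lo}};  // index 0 is the most significant chunk
  return {{lo, hi}};    // index 0 is the least significant chunk
}
```

For the IEEE double 1.0 (image 0x3FF0000000000000), index 0 holds 0x3FF00000
when float words are big-endian and 0 otherwise.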

This fixes aarch64/sve/cond_fadd_1.c and various other tests
for aarch64_be-elf.

gcc/
* config/aarch64/aarch64.cc (aarch64_simd_valid_imm): Account
for FLOAT_WORDS_BIG_ENDIAN when building a floating-point value.

(cherry picked from commit 82dd19890b6139c4bac2385068a68613920ae1a2)

aarch64: Some fixes for SVE INDEX constants

When using SVE INDEX to load an Advanced SIMD vector, we need to
take account of the different element ordering for big-endian
targets.  For example, when big-endian targets store the V4SI
constant { 0, 1, 2, 3 } in registers, 0 becomes the most
significant element, whereas INDEX always operates from the
least significant element.  A big-endian target would therefore
load V4SI { 0, 1, 2, 3 } using:

    INDEX Z0.S, #3, #-1

rather than little-endian's:

    INDEX Z0.S, #0, #1
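
The rule behind those two forms can be sketched as follows.  The function
'index_operands' is a hypothetical helper for this note, not the new
'aarch64_sve_index_series_p'.

```cpp
#include <utility>

// Returns the (base, step) operands of an INDEX that materializes the
// Advanced SIMD constant { first, first + step, ... } with 'nelts' elements.
std::pair<long, long>
index_operands (long first, long step, int nelts, bool big_endian)
{
  if (!big_endian)
    return {first, step};
  // Big-endian: element 0 is the most significant lane, but INDEX fills from
  // the least significant lane, so start at the last element and step back.
  return {first + (nelts - 1) * step, -step};
}
```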

While there, I noticed that we would only check the first vector
in a multi-vector SVE constant, which would trigger an ICE if the
other vectors turned out to be invalid.  This is pretty difficult to
trigger at the moment, since we only allow single-register modes to be
used as frontend & middle-end vector modes, but it can be seen using
the RTL frontend.

gcc/
* config/aarch64/aarch64.cc (aarch64_sve_index_series_p): New
function, split out from...
(aarch64_simd_valid_imm): ...here.  Account for the different
SVE and Advanced SIMD element orders on big-endian targets.
Check each vector in a structure mode.

gcc/testsuite/
* gcc.dg/rtl/aarch64/vec-series-1.c: New test.
* gcc.dg/rtl/aarch64/vec-series-2.c: Likewise.
* gcc.target/aarch64/sve/acle/general/dupq_2.c: Fix expected
output for this big-endian test.
* gcc.target/aarch64/sve/acle/general/dupq_4.c: Likewise.
* gcc.target/aarch64/sve/vec_init_3.c: Restrict to little-endian
targets and add more tests.
* gcc.target/aarch64/sve/vec_init_4.c: New big-endian version
of vec_init_3.c.

(cherry picked from commit 41c446389446a357172883389e36fd10c882ce6d)

Make the RTL frontend set REG_NREGS correctly

While working on a new testcase that uses the RTL frontend,
I hit a bug where a (reg ...) that spans multiple hard registers
had REG_NREGS set to 1. This caused various things to misbehave.
For example, if the (reg ...) in question was used as crtl->return_rtx,
only the first register in the group would be marked as live on exit.

gcc/
* read-rtl-function.cc (function_reader::read_rtx_operand_r): Use
hard_regno_nregs to work out REG_NREGS for hard registers.

(cherry picked from commit 76db38d811a63a603deedfe327d5e201fc820444)

ext-dce: Fix subreg_lsb is_constant assumption (2)

This patch fixes another instance of the problem described in the
cover note for g:bf3037e923e9f91d93ab64bdf73a37f64f659fb9.

gcc/
* ext-dce.cc (ext_dce_process_uses): Apply is_constant directly
to the subreg_lsb.

(cherry picked from commit bddc41e290113dd9160b01f2fdf925a1876c8ee0)

aarch64: Fix ZIP1 order in aarch64_expand_vector_init [PR118891]

aarch64_expand_vector_init contains some divide-and-conquer code
that tries to load the odd and even elements into 64-bit registers
and then ZIP them together. On big-endian targets, the even elements
are more significant than the odd elements and so should come second
in the ZIP.
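
The operand order can be checked with a small lane-level model. This is a sketch under stated assumptions rather than the code in aarch64_expand_vector_init: `zip1`, `swap_order` and `build_vector` are hypothetical helpers, lanes are numbered from the least significant end, and on big-endian targets element 0 is assumed to sit in the most significant lane.

```cpp
#include <algorithm>
#include <array>

using Half = std::array<int, 2>;  // 64-bit register, two lanes
using Full = std::array<int, 4>;  // 128-bit register, four lanes

// Model of ZIP1 on architectural lanes (lane 0 = least significant):
// interleave the lanes of A and B, starting with A.
Full zip1 (const Half &a, const Half &b)
{
  return { a[0], b[0], a[1], b[1] };
}

// Convert between element order and lane order.  The two orders are
// reversed on big-endian targets, so the same function works both ways.
template <typename V>
V swap_order (V v, bool big_endian)
{
  if (big_endian)
    std::reverse (v.begin (), v.end ());
  return v;
}

// Build a four-element vector from its even and odd elements.
Full build_vector (const Full &elts, bool big_endian)
{
  Half evens = swap_order (Half { elts[0], elts[2] }, big_endian);
  Half odds = swap_order (Half { elts[1], elts[3] }, big_endian);
  // On big-endian targets the even elements are more significant,
  // so they come second in the ZIP.
  Full lanes = big_endian ? zip1 (odds, evens) : zip1 (evens, odds);
  return swap_order (lanes, big_endian);
}
```

In this model, `build_vector` returns its input unchanged for both endiannesses, which is the property the fix restores.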

This fixes many execution failures on aarch64_be-elf, including
gcc.c-torture/execute/pr28982a.c.

gcc/
PR target/118891
* config/aarch64/aarch64.cc (aarch64_expand_vector_init): Fix the
ZIP1 operand order for big-endian targets.

(cherry picked from commit cb2b5471516c3c469f65d927a2a30eb15357e429)

aarch64: Fix neon-sve-bridge.c failures for big-endian

Lowpart subregs are generally disallowed on big-endian SVE vector
registers, since the first memory element is stored at the least
significant end of the register, rather than the most significant end.
(See the comment at the head of aarch64-sve.md for details,
and aarch64_modes_compatible_p for the implementation.)

This means that arm_sve_neon_bridge.h needs to use custom define_insns
for big-endian targets, in lieu of using lowpart subregs.  However,
one of those define_insns relied on the prohibited lowparts internally,
to convert an Advanced SIMD register to an SVE register.  Since the
lowpart is not allowed, the lowpart_subreg would return null, leading
to a later ICE.

The simplest fix seems to be to use %Z instead, to force the Advanced
SIMD register to be written as an SVE register.

gcc/
* config/aarch64/aarch64-sve.md (@aarch64_sve_set_neonq_<mode>):
Use %Z instead of lowpart_subreg.  Tweak formatting.

(cherry picked from commit 69c839c7361430ec27d1f13f909531b872588f27)

ext-dce: Fix subreg_lsb is_constant assumption

ext-dce had:

  if (SUBREG_P (dst) && SUBREG_BYTE (dst).is_constant ())
    {
      bit = subreg_lsb (dst).to_constant ();
      if (bit >= HOST_BITS_PER_WIDE_INT)
        bit = HOST_BITS_PER_WIDE_INT - 1;
      dst = SUBREG_REG (dst);

But a constant SUBREG_BYTE doesn't guarantee a constant subreg_lsb.
If the SUBREG_REG is a pair of N-bit registers on a big-endian target,
the most significant end has a SUBREG_BYTE of 0 but a subreg_lsb of N.
This N would then be non-constant for variable-length registers.
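
A rough model of the distinction, assuming a simplified stand-in for GCC's poly_int: `Poly` and `model_subreg_lsb` are hypothetical, and the model ignores targets whose word and byte endianness differ.

```cpp
// Hypothetical stand-in for a poly_int: the value is c0 + c1 * X,
// where X is only known at run time (e.g. the vector length).
struct Poly {
  long c0;
  long c1;
  constexpr bool is_constant () const { return c1 == 0; }
};

constexpr Poly operator- (Poly a, Poly b)
{
  return { a.c0 - b.c0, a.c1 - b.c1 };
}

constexpr Poly operator* (Poly a, long n)
{
  return { a.c0 * n, a.c1 * n };
}

// Bit position of a subreg's least significant bit within the inner
// register.  On big-endian targets byte 0 is the most significant byte,
// so the position is measured back from the end of the inner register.
constexpr Poly
model_subreg_lsb (Poly outer_bytes, Poly inner_bytes, Poly byte,
                  bool big_endian)
{
  Poly bytes_from_lsb = big_endian ? inner_bytes - outer_bytes - byte : byte;
  return bytes_from_lsb * 8;
}
```

For a pair of registers of 16 + 16X bytes each, a SUBREG_BYTE of 0 is constant, but on a big-endian target the modelled lsb is 128 + 128X bits, which is not.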

The patch fixes gcc.dg/torture/pr120276.c and other failures on
aarch64_be-elf.

gcc/
* ext-dce.cc (ext_dce_process_uses): Apply is_constant directly
to the subreg_lsb.

(cherry picked from commit bf3037e923e9f91d93ab64bdf73a37f64f659fb9)

vect: Fix VEC_WIDEN_PLUS_HI/LO choice for big-endian [PR118891]

In the tree codes and optabs, the "hi" in a vector hi/lo pair means
"most significant" and the "lo" means "least significant", with
significance following GCC's normal endian expectations.  Thus on
big-endian targets, the hi part handles the first half of the elements
in memory order and the lo part handles the second half.

For tree codes, supportable_widening_operation first chooses hi/lo
pairs based on little-endian order and then uses:

  if (BYTES_BIG_ENDIAN && c1 != VEC_WIDEN_MULT_EVEN_EXPR)
    std::swap (c1, c2);

to adjust.  However, the handling for internal functions was missing
an equivalent fixup.  This led to several execution failures in vect.exp
on aarch64_be-elf.

If the hi/lo code fails, the internal function handling goes on to try
even/odd.  But I couldn't see anything obvious that would put the even/
odd results back into the right order later, so there might be a latent
bug there too.

gcc/
PR tree-optimization/118891
* tree-vect-stmts.cc (supportable_widening_operation): Swap the
hi and lo internal functions on big-endian targets.

(cherry picked from commit ec54a14239b12d03c600c14f3ce9710e65cd33f1)

Daily bump.

Fortran: fix bogus runtime error with optional procedure argument [PR121145]

PR fortran/121145

gcc/fortran/ChangeLog:

* trans-expr.cc (gfc_conv_procedure_call): Do not create pointer
check for proc-pointer actual passed to optional dummy.

gcc/testsuite/ChangeLog:

* gfortran.dg/pointer_check_15.f90: New test.

(cherry picked from commit 8f9450505f8244d262f8b4ff274f113f99cdc7e2)

Daily bump.

libstdc++: Update some baseline_symbols.txt (x32)

* config/abi/post/x86_64-linux-gnu/x32/baseline_symbols.txt:
Updated.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit c7baa61a583b49df63d3df8c6336f8405e24f012)

Daily bump.

[PATCH] PR modula2/121164 Modula 2 build failure

This patch fixes the 2nd parameter name mismatch in
ARRAYOFCHAR.mod.

gcc/m2/ChangeLog:

PR modula2/121164
* gm2-libs/ARRAYOFCHAR.mod (Write): Rename 2nd parameter
name a to str.

(cherry picked from commit 22d8b89689769e5efefd2c4e6dda88d9f0b2a945)

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

rust: Silence a clang warning in borrow-checker-diagnostics

When compiling
gcc/rust/checks/errors/borrowck/rust-borrow-checker-diagnostics.cc
with clang, it emits the following warning:

gcc/rust/checks/errors/borrowck/rust-borrow-checker-diagnostics.cc:145:46: warning: non-constant-expression cannot be narrowed from type 'Polonius::Loan' (aka 'unsigned long') to 'uint32_t' (aka 'unsigned int') in initializer list [-Wc++11-narrowing]

I'd hope that for indexing that is never really a problem,
nevertheless if narrowing is taking place, I guess it can be argued it
should be made explicit.
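
A minimal sketch of the kind of change involved. `Loan`, `LoanId` and `make_loan_id` are hypothetical stand-ins, not the types used in rust-borrow-checker-diagnostics.cc.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for the types named in the warning.
using Loan = std::size_t;  // 'unsigned long' on LP64 targets

struct LoanId {
  std::uint32_t value;
};

LoanId make_loan_id (Loan loan)
{
  // 'LoanId { loan }' would be a narrowing conversion inside a braced
  // initializer, which is what clang diagnoses.  The cast makes the
  // truncation explicit.
  return LoanId { static_cast<std::uint32_t> (loan) };
}
```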

gcc/rust/ChangeLog:

2025-06-23 Martin Jambor <mjambor@suse.cz>

* checks/errors/borrowck/rust-borrow-checker-diagnostics.cc
(BorrowCheckerDiagnostics::get_loan): Type cast loan to uint32_t.

(cherry picked from commit 1e69c5655894ab3cbeb4431a5b3daff211d3c4e1)

gccrs: Fix narrowing conversion warnings

Fixes PR#119641

gcc/rust/ChangeLog:

* checks/errors/borrowck/rust-bir-place.h
(IndexVec::size_type): Add.
(IndexVec::MAX_INDEX): Add.
(IndexVec::size): Change the return type to the type of the
internal value used by the index type.
(PlaceDB::lookup_or_add_variable): Use the return value from the
PlaceDB::add_place call.
* checks/errors/borrowck/rust-bir.h
(struct BasicBlockId): Move this definition before the
definition of the struct Function.

Signed-off-by: Owen Avery <powerboat9.gamer@gmail.com>
(cherry picked from commit beced835afa3908aa94550d2ca5ee3879a620adb)

Disable parallel testing for 'rust/compile/nr2/compile.exp' [PR119508]

..., using the standard idiom. This '*.exp' file doesn't adhere to the
parallel testing protocol as defined in 'gcc/testsuite/lib/gcc-defs.exp'.

This also restores proper behavior for '*.exp' files executing after (!) this
one, which erroneously caused hundreds or even thousands of individual test
cases to get duplicated vs. skipped, randomly, depending on the '-jN' level.

PR testsuite/119508
gcc/testsuite/
* rust/compile/nr2/compile.exp: Disable parallel testing.

(cherry picked from commit 79d2c3089f480738613b7d338d86d8be710f8158)

Fix time zone for 'cobol.dg/group2/FUNCTION_DATE___TIME_OMNIBUS.cob' [PR119818]

This progresses:

    PASS: cobol.dg/group2/FUNCTION_DATE___TIME_OMNIBUS.cob   -O0  (test for excess errors)
    [-FAIL:-]{+PASS:+} cobol.dg/group2/FUNCTION_DATE___TIME_OMNIBUS.cob   -O0  execution test
    [Etc.]

PR cobol/119818
gcc/testsuite/
* cobol.dg/group2/FUNCTION_DATE___TIME_OMNIBUS.cob:
'dg-set-target-env-var TZ UTC0'.

(cherry picked from commit ed8761241ac529ccddb2b76a1895c124c67c132c)

mmix: Define MAX_FIXED_MODE_SIZE

Besides this commit working as a release-branch fix for the
PR, code inspection shows slightly better code for TImode
libgcc functions, and a modified
gcc.c-torture/execute/arith-rand-ll.c (basically s/long
long/__int128 and cutting out the non-128-bit cases) shows a
1.4% improvement. (Coremark code is identical, as
expected.)

PR middle-end/120935
* config/mmix/mmix.h (MAX_FIXED_MODE_SIZE): Define.

Co-authored-by: Pietro Monteiro <pietro@sociotechnical.xyz>
Signed-off-by: Pietro Monteiro <pietro@sociotechnical.xyz>

tree-optimization/120924 - up --param uninit-max-chain-len

The PR shows that the uninit analysis limits are set too low in
cases we lower switches to ifs as happens on s390x for a linux
kernel TU. This causes false positive uninit diagnostics as we
abort the attempt to prove that a value is initialized on all
paths. The new testcase would only require upping to 9.

PR tree-optimization/120924
* params.opt (uninit-max-chain-len): Up from 8 to 12.

* gcc.dg/uninit-pr120924.c: New testcase.

(cherry picked from commit cf9a479e3f909d5217e954788eb3c5b569e4bc52)

[PATCH] PR modula2/120912: Request for a procedure to obtain a file from an IOChan

This patch introduces the procedure GetFile into the supplementary
ISO style library IOChanUtils.

gcc/m2/ChangeLog:

PR modula2/120912
* gm2-libs-iso/IOChanUtils.def (GetFile): New procedure function.
* gm2-libs-iso/IOChanUtils.mod (GetFile): New procedure function.

(cherry picked from commit 15670d4477ce219c017bd52417a6074b981fb197)

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

tree-optimization/121059 - fixup loop mask query

When we opportunistically mask an operand of an AND with an already
available loop mask we need to query that set with the correct number
of masks we expect.

PR tree-optimization/121059
* tree-vect-stmts.cc (vectorizable_operation): Query
scalar_cond_masked_set with the correct number of masks.

* gcc.dg/vect/pr121059.c: New testcase.

Co-Authored-By: Richard Sandiford <richard.sandiford@arm.com>
(cherry picked from commit 71be87055548cf942c7bc56d10ffd479db8569e4)

tree-optimization/121049 - avoid loop masking with even/odd reduction

The following disables loop masking when we are using an even/odd
widening operation in a reduction because the loop mask then aligns
to the wrong elements.
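
The misalignment can be illustrated with a scalar model. This is a sketch only: the helper names are hypothetical, and the eight-element input and linear mask layout are assumptions made for the example, not the vectorizer's data structures.

```cpp
#include <array>
#include <cstdint>

using Narrow = std::array<std::int16_t, 8>;
using Wide = std::array<std::int32_t, 4>;
using Mask = std::array<bool, 4>;

// Widen the even-numbered (PARITY == 0) or odd-numbered (PARITY == 1)
// input elements.
Wide widen_parity (const Narrow &in, unsigned parity)
{
  Wide out {};
  for (unsigned i = 0; i < out.size (); ++i)
    out[i] = in[2 * i + parity];
  return out;
}

// Loop mask for wide vector number VEC when only the first ACTIVE input
// elements are live, assuming the results are in linear order.
Mask linear_mask (unsigned vec, unsigned active)
{
  Mask m {};
  for (unsigned i = 0; i < m.size (); ++i)
    m[i] = vec * m.size () + i < active;
  return m;
}

std::int32_t masked_sum (const Wide &v, const Mask &m)
{
  std::int32_t sum = 0;
  for (unsigned i = 0; i < v.size (); ++i)
    if (m[i])
      sum += v[i];
  return sum;
}

// Reduction that applies linear-order masks to even/odd results.
std::int32_t evenodd_masked_reduction (const Narrow &in, unsigned active)
{
  return (masked_sum (widen_parity (in, 0), linear_mask (0, active))
          + masked_sum (widen_parity (in, 1), linear_mask (1, active)));
}
```

With inputs 1 to 8 and three active elements, the intended sum is 1 + 2 + 3 = 6, but the model sums inputs 1, 3 and 5 and returns 9, because the mask selects the first three even elements instead.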

PR tree-optimization/121049
* internal-fn.h (widening_evenodd_fn_p): Declare.
* internal-fn.cc (widening_evenodd_fn_p): New function.
* tree-vect-stmts.cc (vectorizable_conversion): When using
an even/odd widening function disable loop masking.

* gcc.dg/vect/pr121049.c: New testcase.

(cherry picked from commit bc5570f7ef796fa7f5ab89b34ed9de2be5299f0e)

tree-optimization/121035 - handle stray VN values without expression

When VN iterates we can end up with unreachable inserted expressions
in the expression tables which in turn will not be added to their
value by PRE's compute_avail. This will later ICE when we pick
them up and want to generate them. Deal with this by giving up.

PR tree-optimization/121035
* tree-ssa-pre.cc (find_or_generate_expression): Handle
values without expression.

* gcc.dg/pr121035.c: New testcase.

(cherry picked from commit 9af57c471087a3a1b87621bce1208d6c77ba2a4a)

[PATCH] [PR modula2/117203] Followup add Delete procedure function

This patch provides GetFileName procedure function for
FIO.File, FileSystem.File and IOChan.ChanId. The
return result from these procedures can be passed into
StringFileSysOp.Unlink to complete the required delete.

gcc/m2/ChangeLog:

PR modula2/117203
* gm2-libs-log/FileSystem.def (GetFileName): New
procedure function.
(WriteString): New procedure.
* gm2-libs-log/FileSystem.mod (GetFileName): New
procedure function.
(WriteString): New procedure.
* gm2-libs/SFIO.def (GetFileName): New procedure function.
* gm2-libs/SFIO.mod (GetFileName): New procedure function.
* gm2-libs-iso/IOChanUtils.def: New file.
* gm2-libs-iso/IOChanUtils.mod: New file.

libgm2/ChangeLog:

PR modula2/117203
* libm2iso/Makefile.am (M2DEFS): Add IOChanUtils.def.
(M2MODS): Add IOChanUtils.mod.
* libm2iso/Makefile.in: Regenerate.

gcc/testsuite/ChangeLog:

PR modula2/117203
* gm2/isolib/run/pass/testdelete2.mod: New test.
* gm2/pimlib/logitech/run/pass/testdelete2.mod: New test.
* gm2/pimlib/run/pass/testdelete.mod: New test.

(cherry picked from commit 620a40fa8843dd7f80547bbd63549abc8bbe9521)

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

gimple-fold: Fix up big endian _BitInt adjustment [PR121131]

The following testcase ICEs because SCALAR_INT_TYPE_MODE of course
doesn't work for large BITINT_TYPE types which have BLKmode.
native_encode* as well as e.g. r14-8276 use in cases like these
GET_MODE_SIZE (SCALAR_INT_TYPE_MODE ()) and TREE_INT_CST_LOW (TYPE_SIZE_UNIT
()) for the BLKmode ones.
In this case, it wants bits rather than bytes, so I've used
GET_MODE_BITSIZE like before and TYPE_SIZE otherwise.

Furthermore, the patch only computes encoding_size for big endian
targets, for little endian we don't really adjust anything, so there
is no point computing it.

2025-07-18 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/121131
* gimple-fold.cc (fold_nonarray_ctor_reference): Use
TREE_INT_CST_LOW (TYPE_SIZE ()) instead of
GET_MODE_BITSIZE (SCALAR_INT_TYPE_MODE ()) for BLKmode BITINT_TYPEs.
Don't compute encoding_size at all for little endian targets.

* gcc.dg/bitint-124.c: New test.

(cherry picked from commit 90955b2f61f787ebc446f0a105b5f49672388d89)

[PATCH] [PR modula2/120731] error in Strings.Pos causing sigsegv

This patch corrects the m2log library procedure function
Strings.Pos which incorrectly sliced the wrong component
of the source string. The incorrect slice could cause
a sigsegv if negative slice indices were generated.

gcc/m2/ChangeLog:

PR modula2/120731
* gm2-libs-log/Strings.def (Delete): Rewrite comment.
* gm2-libs-log/Strings.mod (Pos): Rewrite.
(PosLower): New procedure function.

gcc/testsuite/ChangeLog:

PR modula2/120731
* gm2/pimlib/logitech/run/pass/teststrings.mod: New test.

(cherry picked from commit fc276742e0db337c4d13e6c474abafd4796a6b69)

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

[PATCH] [modula2] Comment tidyup in gm2-compiler/M2GCCDeclare.mod

This patch reformats three comments in the GNU GCC style.

gcc/m2/ChangeLog:

* gm2-compiler/M2GCCDeclare.mod (StartDeclareModuleScopeSeparate):
Reformat statement comments.
(StartDeclareModuleScopeWholeProgram): Ditto.

(cherry picked from commit 7a7cc65b8987b9b05fb8fb75824e2000861e6c30)

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

[PATCH] PR modula2/120673: Mutually dependent types crash the compiler

This patch fixes an ICE which will occur if cyclic dependent types
are used when declaring a variable. This patch detects the
cyclic dependency and issues an error message for each outstanding
component.

gcc/m2/ChangeLog:

PR modula2/120673
* gm2-compiler/M2GCCDeclare.mod (ErrorDepList): New
global variable set containing every errant dependency symbol.
(mystop): Remove.
(EmitCircularDependancyError): Replace with ...
(EmitCircularDependencyError): ... this.
(AssertAllTypesDeclared): Rewrite.
(DoVariableDeclaration): Ditto.
(TypeDependentsDeclared): New procedure function.
(PrepareGCCVarDeclaration): Ditto.
(DeclareVariable): Remove assert.
(DeclareLocalVariable): Ditto.
(Constructor): Initialize ErrorDepList.
* gm2-compiler/M2MetaError.mod (doErrorScopeProc): Rewrite
and ensure that a symbol with a module scope does not lookup
from a definition module.
* gm2-compiler/P2SymBuild.mod (BuildType): Rewrite so that
a synonym type is created using the token referring to the name
on the lhs.

gcc/testsuite/ChangeLog:

PR modula2/120673
* gm2/pim/fail/badmodvar.mod: New test.
* gm2/pim/fail/cyclictypes.mod: New test.
* gm2/pim/fail/cyclictypes2.mod: New test.
* gm2/pim/fail/cyclictypes4.mod: New test.

(cherry picked from commit fba2f08152375e2c1c167ec921a0197e4c07efc6)

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

Daily bump.

[PATCH] PR modula2/119650: Regenerate target independent documentation

This patch regenerates the target independent documentation
triggered by the additional library modules.

gcc/m2/ChangeLog:

PR modula2/119650
* gm2-libs/ARRAYOFCHAR.def: Remove comment about non
existent read.
* target-independent/m2/Builtins.texi: Regenerate.
* target-independent/m2/SYSTEM-iso.texi: Ditto.
* target-independent/m2/SYSTEM-pim.texi: Ditto.
* target-independent/m2/gm2-libs.texi: Ditto.

(cherry picked from commit c291bde420556c69423961f59ef6765dc6c4c547)

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

[PATCH] PR modula2/120606: FOR loop ICE if the last expression uses an array

This patch fixes the ICE which occurs if the last expression is an array.
It ensures that the start and end values of the for loop expressions are
dereferenced.

gcc/m2/ChangeLog:

PR modula2/120606
* gm2-compiler/M2Quads.mod (ForLoopLastIterator): Dereference
start and end expressions e1 and e2 respectively.

gcc/testsuite/ChangeLog:

PR modula2/120606
* gm2/pim/pass/forarray.mod: New test.

(cherry picked from commit 639a147414ab2b870f9482123fcaa1821e0d5475)

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

[PATCH] [PR modula2/119650, PR modula2/117203]: WriteString and Delete are missing from base libraries

This patch introduces a Write procedure for an array of char,
the string and char datatype.  It uses the m2r10 style of
naming the module on the datatype.  This uncovered a bug
in the import handling inside Quadident.  It also includes
an Unlink procedure from a new module FileSysOp and a String
interface to this module.

gcc/m2/ChangeLog:

PR modula2/119650
PR modula2/117203
* gm2-compiler/P2Build.bnf (CheckModuleQualident): New
procedure.
(Qualident): Rewrite.
* gm2-compiler/P3Build.bnf (PushTFQualident): New procedure.
(CheckModuleQualident): Ditto.
(Qualident): Rewrite.
* gm2-compiler/PCBuild.bnf (PushTFQualident): New procedure.
(CheckModuleQualident): Ditto.
(Qualident): Rewrite.
* gm2-compiler/PHBuild.bnf (PushTFQualident): New procedure.
(CheckModuleQualident): Ditto.
(Qualident): Rewrite.
* gm2-libs/ARRAYOFCHAR.def: New file.
* gm2-libs/ARRAYOFCHAR.mod: New file.
* gm2-libs/CFileSysOp.def: New file.
* gm2-libs/CHAR.def: New file.
* gm2-libs/CHAR.mod: New file.
* gm2-libs/FileSysOp.def: New file.
* gm2-libs/FileSysOp.mod: New file.
* gm2-libs/String.def: New file.
* gm2-libs/String.mod: New file.
* gm2-libs/StringFileSysOp.def: New file.
* gm2-libs/StringFileSysOp.mod: New file.

libgm2/ChangeLog:

PR modula2/119650
PR modula2/117203
* libm2pim/Makefile.am (M2MODS): Add ARRAYOFCHAR,
CHAR.mod, StringFileSysOp.mod and String.mod.
(M2DEFS): Add ARRAYOFCHAR, CHAR.mod,
StringFileSysOp.mod and String.mod.
(libm2pim_la_SOURCES): Add CFileSysOp.c.
* libm2pim/Makefile.in: Regenerate.
* libm2pim/CFileSysOp.cc: New file.

gcc/testsuite/ChangeLog:

PR modula2/119650
* gm2/iso/fail/CHAR.mod: New test.
* gm2/iso/run/pass/CHAR.mod: New test.
* gm2/iso/run/pass/importself.mod: New test.
* gm2/pimlib/run/pass/testwrite.mod: New test.
* gm2/pimlib/run/pass/testwritechar.mod: New test.

(cherry picked from commit d1c3cfa3296ae5010c514d67f57acf144a299c7a)

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

c++: constexpr array testcase [PR87097]

This seems to have been fixed by r15-7260 for PR118285, but is sufficiently
different to merit its own test.

PR c++/87097

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/constexpr-array29.C: New test.

[PATCH] PR modula2/120542: Return statement in the main procedure crashes the compiler

The patch checks whether a return statement is allowed.  It also checks
to see that a return expression is allowed.

gcc/m2/ChangeLog:

PR modula2/120542
* gm2-compiler/M2Quads.mod (BuildReturnLower): New procedure.
(BuildReturn): Allow return without an expression from
module initialization blocks.  Generate an error if an
expression is provided.  Call BuildReturnLower if no error
was seen.

gcc/testsuite/ChangeLog:

PR modula2/120542
* gm2/iso/fail/badreturn.mod: New test.
* gm2/iso/fail/badreturn2.mod: New test.
* gm2/iso/pass/modulereturn.mod: New test.
* gm2/iso/pass/modulereturn2.mod: New test.

(cherry picked from commit 16ab791531ec16fd4596a25efbe6b42e6c16171f)

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

[PATCH] PR modula2/120474: InOut buffering should flush the WriteLn before the Read

This patch adds a BufferFlush to InOut.mod:LocalWrite.

gcc/m2/ChangeLog:

PR modula2/120474
* gm2-libs-log/InOut.mod (LocalWrite): Call FIO.FlushBuffer.

(cherry picked from commit 13498bf4fcff4c0633678c53a46b6be425d2904c)

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

LoongArch: Prevent subreg of subreg in CRC [PR 120807]

The register_operand predicate can match a subreg, in which case we'd
have a subreg of a subreg, which is invalid. Use lowpart_subreg to avoid
the nested subreg.

gcc/ChangeLog:

PR target/120807
* config/loongarch/loongarch.md (crc_combine): Avoid nested
subreg.

gcc/testsuite/ChangeLog:

PR target/120807
* gcc.c-torture/compile/pr120807.c: New test.

(cherry picked from commit 113ed3adc03f79f09ffe00d429d18f89f335b188)

Daily bump.

[PATCH] PR modula2/120497: error is generated for good code when returning a pointer var variable

The return type checking needs to skip over the Lvalue part of the VAR
parameter or variable.

gcc/m2/ChangeLog:

PR modula2/120497
* gm2-compiler/M2Range.mod (IsAssignmentCompatible): Remove from
import list.
(FoldTypeReturnFunc): Rewrite to skip the Lvalue of a var
variable.
(CodeTypeReturnFunc): Ditto.
(CodeTypeIndrX): Call AssignmentTypeCompatible rather than
IsAssignmentCompatible.
(FoldTypeIndrX): Ditto.

gcc/testsuite/ChangeLog:

PR modula2/120497
* gm2/pim/pass/ReturnType.mod: New test.
* gm2/pim/pass/ReturnType2.mod: New test.

(cherry picked from commit 170717fa243ef466a99498113167627539af4553)

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

[PATCH] PR modula2/120389 Assigning wrong type to an array causes an ICE

Although cherry picked as described, the cherry pick does not include
the command option (-fm2-strict-type-reason) introduced in:
gcc/m2/gm2-lang.cc, gcc/m2/lang.opt and gcc/doc/gm2.texi from the
original patch.

This patch provides follow on fixes for undetected type violations
which can occur when Lvalues are generated during assignment.
For example, array accesses and with statements. The type checker
M2Check.mod has been overhauled and cleaned up.

gcc/m2/ChangeLog:

PR modula2/120389
* gm2-compiler/M2Check.def (AssignmentTypeCompatible): Add new
parameter enableReason.
* gm2-compiler/M2Check.mod (EquivalenceProcedure): New type.
(falseReason2): New procedure function.
(falseReason1): Ditto.
(falseReason0): Ditto.
(checkTypeEquivalence): Rewrite.
(checkUnboundedArray): Ditto.
(checkUnbounded): Ditto.
(checkArrayTypeEquivalence): Ditto.
(checkCharStringTypeEquivalence): Ditto.
(buildError4): Add false reason.
(buildError2): Ditto.
(IsTyped): Use GetDType.
(IsTypeEquivalence): New procedure function.
(checkVarTypeEquivalence): Ditto.
(checkVarEquivalence): Rewrite.
(checkConstMeta): Ditto.
(checkEnumField): New procedure function.
(checkEnumFieldEquivalence): Ditto.
(checkSubrangeTypeEquivalence): Rewrite.
(checkSystemEquivalence): Ditto.
(checkTypeKindViolation): Ditto.
(doCheckPair): Ditto.
(InitEquivalenceArray): New procedure.
(addEquivalence): Ditto.
(checkProcType): Rewrite.
(deconstruct): Deallocate reason string.
(AssignmentTypeCompatible): Initialize reason and reasonEnable
fields.
(ParameterTypeCompatible): Ditto.
(doExpressionTypeCompatible): Ditto.
* gm2-compiler/M2GenGCC.mod (CodeIndrX): Rewrite.
(CheckBinaryExpressionTypes): Rewrite and simplify now that the
type checker is more robust.
(CheckElementSetTypes): Ditto.
(CodeXIndr): Add new range assignment type check.
* gm2-compiler/M2MetaError.def: Correct comments.
* gm2-compiler/M2Options.def (SetStrictTypeAssignment): New procedure.
(SetStrictTypeReason): Ditto.
* gm2-compiler/M2Options.mod (SetStrictTypeAssignment): New procedure.
(SetStrictTypeReason): Ditto.
(StrictTypeReason): Initialize.
(StrictTypeAssignment): Ditto.
* gm2-compiler/M2Quads.mod (CheckBreak): Delete.
(BreakQuad): New global variable.
(BreakAtQuad): Delete.
(gdbhook): New procedure.
(BreakWhenQuadCreated): Ditto.
(CheckBreak): Ditto.
(Init): Call BreakWhenQuadCreated and gdbhook.
(doBuildAssignment): Add type assignment range check.
(CheckProcTypeAndProcedure): Only check if the procedure
types differ.
(doIndrX): Add type IndrX range check.
(CheckReturnType): Add range return type check.
* gm2-compiler/M2Range.def (InitTypesIndrXCheck): New procedure
function.
(InitTypesReturnTypeCheck): Ditto.
* gm2-compiler/M2Range.mod (InitTypesIndrXCheck): New procedure
function.
(InitTypesReturnTypeCheck): Ditto.
(HandlerExists): Add new clauses.
(FoldAssignment): Pass extra FALSE parameter to
AssignmentTypeCompatible.
(FoldTypeReturnFunc): New procedure.
(FoldTypeAssign): Ditto.
(FoldTypeIndrX): Ditto.
(CodeTypeAssign): Rewrite.
(CodeTypeIndrX): New procedure.
(CodeTypeReturnFunc): Ditto.
(FoldTypeCheck): Add new case clauses.
(CodeTypeCheck): Ditto.
(FoldRangeCheckLower): Ditto.
(IssueWarning): Ditto.
* gm2-gcc/m2options.h (M2Options_SetStrictTypeAssignment): New
function prototype.
(M2Options_SetStrictTypeReason): Ditto.

gcc/testsuite/ChangeLog:

PR modula2/120389
* gm2/pim/fail/testcharint.mod: New test.
* gm2/pim/fail/testindrx.mod: New test.
* gm2/pim/pass/testxindr.mod: New test.
* gm2/pim/pass/testxindr2.mod: New test.
* gm2/pim/pass/testxindr3.mod: New test.

(cherry picked from commit e131ba3de5f487f5e957ba1b011c960fce557c7b)

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

Fortran: Fix ICE in ASSOCIATE with user defined operator [PR121060]

2025-07-16 Paul Thomas <pault@gcc.gnu.org>

gcc/fortran
PR fortran/121060
* interface.cc (matching_typebound_op): Defer determination of
specific procedure until resolution by returning NULL.

gcc/testsuite/
PR fortran/121060
* gfortran.dg/associate_75.f90: New test.

(cherry picked from commit 82e912344d28cf1a69e5f8e047203ea7eb625302)

i386: Decouple AMX-AVX512 from AVX10.2 and imply AVX512F

In ISE058, AMX-AVX512 no longer implies AVX10.2. This leads us to
reconsider what AMX-AVX512 should imply.

Since it uses zmm registers, and zmm registers only, we need to
imply at least AVX512F. AVX512VL is not needed.

On the other hand, if we implied AVX10.1 for AMX-AVX512,
-mno-avx10.1 would disable AMX-AVX512. This would be a surprise
for users.

For the two reasons above, the patch decouples AMX-AVX512
from AVX10.2 and makes it imply AVX512F.
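
The resulting dependencies can be sketched with plain bitmasks. This is a simplified model: the bit names, mask names and helper functions are hypothetical, and the real masks in i386-common.cc cover many more features.

```cpp
#include <cstdint>

// Hypothetical feature bits, for illustration only.
enum : std::uint32_t {
  ISA_AVX512F = 1u << 0,
  ISA_AVX10_1 = 1u << 1,
  ISA_AVX10_2 = 1u << 2,
  ISA_AMX_AVX512 = 1u << 3,
};

// Enabling AMX-AVX512 also enables AVX512F, and nothing from AVX10.
constexpr std::uint32_t AMX_AVX512_SET = ISA_AMX_AVX512 | ISA_AVX512F;
// Disabling AVX512F also disables AMX-AVX512, which depends on it.
constexpr std::uint32_t AVX512F_UNSET = ISA_AVX512F | ISA_AMX_AVX512;
// Disabling AVX10.1 or AVX10.2 leaves AMX-AVX512 alone in this model.
constexpr std::uint32_t AVX10_1_UNSET = ISA_AVX10_1;
constexpr std::uint32_t AVX10_2_UNSET = ISA_AVX10_2;

constexpr std::uint32_t enable (std::uint32_t isa, std::uint32_t set_mask)
{
  return isa | set_mask;
}

constexpr std::uint32_t disable (std::uint32_t isa, std::uint32_t unset_mask)
{
  return isa & ~unset_mask;
}
```

In this model, disabling AVX10.1 or AVX10.2 after enabling AMX-AVX512 keeps AMX-AVX512 set, while disabling AVX512F clears it.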

gcc/ChangeLog:

* common/config/i386/i386-common.cc
(OPTION_MASK_ISA2_AMX_AVX512_SET): Do not set AVX10.2.
(OPTION_MASK_ISA2_AVX10_2_UNSET): Remove AMX-AVX512 unset.
(OPTION_MASK_ISA2_AVX512F_UNSET): Unset AMX-AVX512.
(ix86_handle_option): Imply AVX512F for AMX-AVX512.

gcc/testsuite/ChangeLog:

* gcc.target/i386/amxavx512-cvtrowd2ps-2.c: Add -mavx512fp16 to
use FP16 related intrins for convert.
* gcc.target/i386/amxavx512-cvtrowps2bf16-2.c: Ditto.
* gcc.target/i386/amxavx512-cvtrowps2ph-2.c: Ditto.
* gcc.target/i386/amxavx512-movrow-2.c: Ditto.

Daily bump.

[PATCH] PR modula2/120389 ICE if assigning a constant char to an integer array

This patch fixes an ICE which occurs if a constant char is assigned
into an integer array. The fix is to introduce type checking in
M2GenGCC.mod:CodeXIndr.

gcc/m2/ChangeLog:

PR modula2/120389
* gm2-compiler/M2GenGCC.mod (CodeXIndr): Check to see that
the type of left is assignment compatible with the type of
right.

gcc/testsuite/ChangeLog:

PR modula2/120389
* gm2/iso/fail/badarray3.mod: New test.

(cherry picked from commit 895a8abad245365940939911e3d0de850522791e)

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

openmp, fortran: Fix ICE when the procedure name cannot be found in declare variant directives [PR104428]

The result of searching for the procedure name symbol should be checked in
case the symbol cannot be found to avoid a null dereference.

gcc/fortran/

PR fortran/104428
* trans-openmp.cc (gfc_trans_omp_declare_variant): Check that proc_st
is non-NULL before dereferencing. Add line number to error message.

gcc/testsuite/

PR fortran/104428
* gfortran.dg/gomp/pr104428.f90: New.

(cherry picked from commit a05c4f4ee48f76e518dbd2a96e5083f4df833df7)

Fortran: Ensure finalizers are created correctly [PR120637]

Finalize_component freed an expression that it used to remember which
components in which context it had finalized already. While it makes
sense to free the copy of the expression, if it is unused, it causes
issues, when comparing to a non existent expression. This is now
detected by returning true, when the expression has been used.

PR fortran/120637

gcc/fortran/ChangeLog:

* class.cc (finalize_component): Return true, when a finalizable
component was detected and do not free it.

gcc/testsuite/ChangeLog:

* gfortran.dg/asan/finalize_1.f90: New test.

(cherry picked from commit d1f05661fa6c8a6ea6f59ad365a84469100e425e)

crc: Error out on non-constant poly arguments for the crc builtins [PR120709]

These builtins require a constant integer for the third argument, but currently
there is an assert rather than an error. This fixes that and updates the documentation too.
Uses the same terms as was being used for the __builtin_prefetch arguments.

Bootstrapped and tested on x86_64-linux-gnu.

PR middle-end/120709

gcc/ChangeLog:

* builtins.cc (expand_builtin_crc_table_based): Error out
instead of asserting the 3rd argument is an integer constant.
* internal-fn.cc (expand_crc_optab_fn): Likewise.
* doc/extend.texi (crc): Document requirement of the poly argument
being a constant.

gcc/testsuite/ChangeLog:

* gcc.dg/crc-non-cst-poly-1.c: New test.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
(cherry picked from commit be07dd9a96a7a6f8fb59c939eda84d74b54f8182)

Daily bump.

aarch64: PR target/120999: Adjust operands for movprfx alternative of NBSL implementation of NOR

While the SVE2 NBSL instruction accepts MOVPRFX to add more flexibility
due to its tied operands, the destination of the movprfx cannot be also
a source operand.  But the offending pattern in aarch64-sve2.md tries
to do exactly that for the "=?&w,w,w" alternative and gas warns for the
attached testcase.

This patch adjusts that alternative to avoid taking operand 0 as an input
in the NBSL again.

So for the testcase in the patch we now generate:
nor_z:
        movprfx z0, z1
        nbsl    z0.d, z0.d, z2.d, z1.d
        ret

instead of the previous:
nor_z:
        movprfx z0, z1
        nbsl    z0.d, z0.d, z2.d, z0.d
        ret

which generated a gas warning.
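
That the fixed sequence still computes NOR can be checked with a bitwise model of one 64-bit element. This is a sketch: `nbsl` and `nor_via_nbsl` are hypothetical helpers, and the model assumes NBSL selects bits from its first source where the third source is 1, from its second source otherwise, and then inverts the result.

```cpp
#include <cstdint>

// Bitwise model of "NBSL Zdn, Zdn, Zm, Zk" on one 64-bit element.
constexpr std::uint64_t
nbsl (std::uint64_t zdn, std::uint64_t zm, std::uint64_t zk)
{
  return ~((zdn & zk) | (zm & ~zk));
}

// NOR of A (held in z1) and B (held in z2) as in the fixed sequence.
constexpr std::uint64_t
nor_via_nbsl (std::uint64_t a, std::uint64_t b)
{
  std::uint64_t z0 = a;    // movprfx z0, z1
  return nbsl (z0, b, a);  // nbsl z0.d, z0.d, z2.d, z1.d
}
```

Since ~((a & a) | (b & ~a)) equals ~(a | b), reading the selector from z1 rather than from the destination z0 leaves the result unchanged.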

Bootstrapped and tested on aarch64-none-linux-gnu.

Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com>
gcc/

PR target/120999
* config/aarch64/aarch64-sve2.md (*aarch64_sve2_nor<mode>):
Adjust movprfx alternative.

gcc/testsuite/

PR target/120999
* gcc.target/aarch64/sve2/pr120999.c: New test.

(cherry picked from commit b7bd72ce71df5266e7a7039da318e49862389a72)

aarch64: Fix up commutative and early-clobber markers on compact insns

For constraints there are operand modifiers and constraint qualifiers.
Operand modifiers apply to all alternatives and must appear, in
traditional syntax, before the first alternative. Constraint
qualifiers, on the other hand, must appear in each alternative to which
they apply.

There's no easy way to validate the distinction in the traditional md
format, but when using the new compact format we can enforce some
semantic checking of these characters to avoid some potentially
surprising code generation.

Fortunately, all of these errors are benign, but the two misplaced
early-clobber markers were quite suspicious at first sight - it's only
by luck that the second alternative does not need an early-clobber.

The syntax checking will be added in the following patch, but first of
all, fix up the errors in aarch64.md.

gcc/
* config/aarch64/aarch64-sve.md (@aarch64_pred_<optab><mode>): Move
commutative marker to the cons specification.
(add<mode>3): Likewise.
(@aarch64_pred_<su>abd<mode>): Likewise.
(@aarch64_pred_<optab><mode>): Likewise.
(*cond_<optab><mode>_z): Likewise.
(<optab><mode>3): Likewise.
(@aarch64_pred_<optab><mode>): Likewise.
(*aarch64_pred_abd<mode>_relaxed): Likewise.
(*aarch64_pred_abd<mode>_strict): Likewise.
(@aarch64_pred_<optab><mode>): Likewise.
(@aarch64_pred_<optab><mode>): Likewise.
(@aarch64_pred_fma<mode>): Likewise.
(@aarch64_pred_fnma<mode>): Likewise.
(@aarch64_pred_<optab><mode>): Likewise.

* config/aarch64/aarch64-sve2.md (@aarch64_sve_<su>clamp<mode>): Move
commutative marker to the cons specification.
(*aarch64_sve_<su>clamp<mode>_x): Likewise.
(@aarch64_sve_fclamp<mode>): Likewise.
(*aarch64_sve_fclamp<mode>_x): Likewise.
(*aarch64_sve2_nor<mode>): Likewise.
(*aarch64_sve2_nand<mode>): Likewise.
(*aarch64_pred_faminmax_fused): Likewise.

* config/aarch64/aarch64.md (*loadwb_pre_pair_<ldst_sz>): Move the
early-clobber marker to the relevant alternative.
(*storewb_pre_pair_<ldst_sz>): Likewise.
(*add<mode>3_aarch64): Move commutative marker to the cons
specification.
(*addsi3_aarch64_uxtw): Likewise.
(*add<mode>3_poly_1): Likewise.
(add<mode>3_compare0): Likewise.
(*addsi3_compare0_uxtw): Likewise.
(*add<mode>3nr_compare0): Likewise.
(<optab><mode>3): Likewise.
(*<optab>si3_uxtw): Likewise.
(*and<mode>3_compare0): Likewise.
(*andsi3_compare0_uxtw): Likewise.
(@aarch64_and<mode>3nr_compare0): Likewise.

(cherry picked from commit f260146bc05f6fba7b2a67a62063c770588b769d)

[committed] [PR rtl-optimization/120242] Fix SUBREG_PROMOTED_VAR_P after ext-dce's actions

I've gone back and forth on these problems multiple times.  We have two passes,
ext-dce and combine which eliminate extensions using totally different
mechanisms.

ext-dce looks for cases where the state of the upper bits in an object isn't
observable and, when that is the case, eliminates extensions which set
those bits.

combine looks for cases where we know the state of the upper bits and can prove
an extension is just setting those bits to their prior value.  Combine also
looks for cases where the precise extension isn't really important, just the
knowledge that the upper bits are zero or sign extended from a narrower mode
is needed.

Combine relies heavily on the SUBREG_PROMOTED_VAR state to do its job.  If the
actions of ext-dce (or any other pass for that matter) make
SUBREG_PROMOTED_VAR's state inconsistent with combine's expectations, then
combine can end up generating incorrect code.

--

When ext-dce eliminates an extension, it turns it into a subreg copy (without
any known SUBREG_PROMOTED_VAR state).  Since we can no longer guarantee the
destination object has any known extension state, we scurry around and wipe
SUBREG_PROMOTED_VAR state for the destination object.

That's fine and dandy, but ultimately insufficient.  Consider if the
destination of the optimized extension was used as a source in a simple copy
insn.  Furthermore assume that the destination of that copy is used within a
SUBREG expression with SUBREG_PROMOTED_VAR set.  ext-dce's actions have
clobbered the SUBREG_PROMOTED_VAR state on the destination of that copy, albeit
indirectly.

This patch addresses this problem by taking the set of pseudos directly
impacted by ext-dce's actions and expands that set by building a transitive
closure for pseudos connected via copies.  We then scurry around finding
SUBREG_PROMOTED_VAR state to wipe for everything in that expanded set of
pseudos.  Voila, everything just works.
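
A minimal sketch of that closure, not the code in ext-dce.cc: pseudos are modelled as plain integers and a copy insn as a (dest, src) pair.  The function name and both types are hypothetical, and following copies only from source to destination is an assumption of the sketch.

```cpp
#include <cassert>
#include <set>
#include <utility>
#include <vector>

// Hypothetical model: a copy insn "dest = src" is the pair (dest, src).
using Copy = std::pair<int, int>;

// Expand `changed` with every pseudo that receives its value, directly or
// through a chain of copies, from a pseudo already in the set.
inline std::set<int> expand_changed(std::set<int> changed,
                                    const std::vector<Copy>& copies) {
    std::vector<int> worklist(changed.begin(), changed.end());
    while (!worklist.empty()) {
        int reg = worklist.back();
        worklist.pop_back();
        for (const Copy& c : copies) {
            // c.first is the destination, c.second the source.
            if (c.second == reg && changed.insert(c.first).second)
                worklist.push_back(c.first);
        }
    }
    return changed;
}
```

Starting from {134} with the copy (144, 134), the result also contains 144, so any promoted-subreg state on 144 would be wiped as well.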

--

The other approach here would be to further expand the liveness sets inside
ext-dce.  That's a simpler path forward, but ultimately regresses the quality
of code we do care about.

One good piece of news is that with the transitive closure bits in place, we
can eliminate a bit of the live set expansion we had in place for
SUBREG_PROMOTED_VAR objects.

--

So let's take one case of the 5 that have been reported.

In ext-dce we have this insn:

> (insn 29 27 30 3 (set (reg:DI 134 [ al_lsm.9 ])
>         (zero_extend:DI (subreg:HI (reg:DI 162) 0))) "j.c":17:17 552 {*zero_extendhidi2_bitmanip}
>      (expr_list:REG_DEAD (reg:DI 162)
>         (nil)))

There are reachable uses of (reg 134):

> (insn 49 47 52 6 (set (mem/c:HI (lo_sum:DI (reg/f:DI 186)
>                 (symbol_ref:DI ("al") [flags 0x86]  <var_decl 0x7ffff73c2da8 al>)) [2 al+0 S2 A16])
>         (subreg/s/v:HI (reg:DI 134 [ al_lsm.9 ]) 0)) 279 {*movhi_internal}
>      (expr_list:REG_DEAD (reg/f:DI 186)
>         (nil)))

Obviously safe if we were to remove the extension.

> (insn 52 49 53 6 (set (reg:DI 176)
>         (and:DI (reg:DI 134 [ al_lsm.9 ])
>             (const_int 5 [0x5]))) "j.c":21:12 106 {*anddi3}
>      (expr_list:REG_DEAD (reg:DI 134 [ al_lsm.9 ])
>         (nil)))
> (insn 53 52 56 6 (set (reg:SI 177 [ _8 ])
>         (zero_extend:SI (subreg:HI (reg:DI 176) 0))) "j.c":21:12 551 {*zero_extendhisi2_bitmanip}
>      (expr_list:REG_DEAD (reg:DI 176)
>         (nil)))

Safe to remove the extension as we only read the low 16 bits from the
destination register (reg 176) in insn 53.

> (insn 27 26 29 3 (set (reg:DI 162)
>         (sign_extend:DI (plus:SI (subreg/s/v:SI (reg:DI 134 [ al_lsm.9 ]) 0)
>                 (const_int 1 [0x1])))) "j.c":17:17 8 {addsi3_extended}
>      (expr_list:REG_DEAD (reg:DI 134 [ al_lsm.9 ])
>         (nil)))
> (insn 29 27 30 3 (set (reg:DI 134 [ al_lsm.9 ])
>         (zero_extend:DI (subreg:HI (reg:DI 162) 0))) "j.c":17:17 552 {*zero_extendhidi2_bitmanip}
>      (expr_list:REG_DEAD (reg:DI 162)
>         (nil)))

Again, not as obvious as the first case, but we only read the low 16 bits from
(reg 162) in insn 29.  So those upper bits in (reg 134) don't matter.

> (insn 26 92 27 3 (set (reg:DI 144 [ ivtmp.17 ])
>         (reg:DI 134 [ al_lsm.9 ])) 277 {*movdi_64bit}
>      (nil))
> (insn 30 29 31 3 (set (reg:DI 135 [ al.2_3 ])
>         (sign_extend:DI (subreg/s/v:HI (reg:DI 144 [ ivtmp.17 ]) 0))) "j.c":17:9 558 {*extendhidi2_bitmanip}
>      (expr_list:REG_DEAD (reg:DI 144 [ ivtmp.17 ])
>         (nil)))

Also safe in isolation.  But worth noting that if we remove the extension at
insn 29, then the promoted status on (reg:DI 144) in insn 30 is no longer
valid.

Setting aside the promoted state of (reg:DI 144) at insn 30 for a minute, let's
look into combine.

> (insn 26 92 27 3 (set (reg:DI 144 [ ivtmp.17 ])
>         (reg:DI 134 [ al_lsm.9 ])) 277 {*movdi_64bit}
>      (nil))
> [ ... ]
> (insn 30 29 31 3 (set (reg:DI 135 [ al.2_3 ])
>         (sign_extend:DI (subreg/s/v:HI (reg:DI 144 [ ivtmp.17 ]) 0))) "j.c":17:9 558 {*extendhidi2_bitmanip}
>      (expr_list:REG_DEAD (reg:DI 144 [ ivtmp.17 ])
>         (nil)))
> (jump_insn 31 30 32 3 (set (pc)
>         (if_then_else (eq (reg:DI 135 [ al.2_3 ])
>                 (const_int 0 [0]))
>             (label_ref:DI 41)
>             (pc))) "j.c":4:55 371 {*branchdi}
>      (int_list:REG_BR_PROB 536870913 (nil))
>  -> 41)

Combine will do its thing on insns 30/31.  Essentially the sign extension is
not necessary in this context, assuming the promoted subreg status in insn 30
-- the equality test doesn't really care about the kind of extension, just
knowing the value is extended is enough to safely elide the extension.
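
To make the reasoning concrete, here is a small C++ model of a 64-bit register holding a 16-bit value.  It is only an illustration with invented helper names: it checks that a comparison against zero gives the same answer under sign and zero extension, and that the answer can differ once the upper bits are no longer an extension of the low half.

```cpp
#include <cassert>
#include <cstdint>

// Sign- and zero-extend the low 16 bits of a 64-bit register value.
inline std::int64_t sign_extend_hi(std::uint64_t reg) {
    return static_cast<std::int16_t>(static_cast<std::uint16_t>(reg));
}
inline std::uint64_t zero_extend_hi(std::uint64_t reg) {
    return static_cast<std::uint16_t>(reg);
}

// True when every 16-bit pattern agrees on "== 0" under both extensions.
inline bool eq_zero_agrees_for_all_hi_values() {
    for (std::uint32_t v = 0; v <= 0xffff; ++v) {
        bool sext_zero = sign_extend_hi(v) == 0;
        bool zext_zero = zero_extend_hi(v) == 0;
        if (sext_zero != zext_zero)
            return false;
    }
    return true;
}

// With stale upper bits the full-register compare no longer matches the
// compare of the extended value, which is why the promoted state matters.
inline bool stale_upper_bits_break_compare() {
    std::uint64_t reg = 0x10000;  // low 16 bits are zero, bit 16 is set
    return sign_extend_hi(reg) == 0 && reg != 0;
}
```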

And now we've come to the crux of the problem.  That promotion state needs to be
adjusted.  The new ext-dce code will see that copy at insn 26 and add (reg 144)
to the set of registers that need promotion state wiped.  And everything is
happy after that.

The other cases are similar in nature.

--

This has been bootstrapped and regression tested on x86_64 and aarch64.
Variants have bootstrapped & regression tested on several other platforms and
it's survived testing on the crosses as well.

Pushing to the trunk...

PR rtl-optimization/120242
PR rtl-optimization/120627
PR rtl-optimization/120736
PR rtl-optimization/120813
gcc/

* ext-dce.cc (ext_dce_process_uses): Remove some cases where we
unnecessarily expanded live sets for promoted subregs.
(expand_changed_pseudos): New function.
(reset_subreg_promoted_p): Use it.

gcc/testsuite/

* gcc.dg/torture/pr120242.c: New test.
* gcc.dg/torture/pr120627.c: Likewise.
* gcc.dg/torture/pr120736.c: Likewise.
* gcc.dg/torture/pr120813.c: Likewise.

(cherry picked from commit 41155992d572030f7918682b2642365ada1f4fbf)

RISC-V: prefetch: fix LRA failing to allocate reg [PR118241]

prefetch was recently fixed/tightened (with Q reg constraint) to only
support the right address patterns (REG or REG+D with lower 5 bits clear).
However in some cases that's too restrictive for LRA and it fails to
allocate a reg, resulting in the following ICE...

| gcc/testsuite/gcc.target/riscv/pr118241-b.cc:31:19: error: unable to generate reloads for:
|   31 | void m() { a.l(); }
|      |                   ^
|(insn 26 25 27 7 (prefetch (mem/f:DI (plus:DI (reg/f:DI 143 [ _5 ])
|                (const_int 56 [0x38])) [5 _5->batch[6]+0 S8 A64])
|        (const_int 0 [0])
|        (const_int 3 [0x3])) "gcc/testsuite/gcc.target/riscv/pr118241-b.cc":18:29 498 {prefetch}
|     (expr_list:REG_DEAD (reg/f:DI 142 [ _5->batch[6] ])
|        (nil)))
|during RTL pass: reload

Fix that by providing a fallback alternative register constraint to reload the address.

PR target/118241

gcc/ChangeLog:

* config/riscv/riscv.md (prefetch): Add alternative "r".

gcc/testsuite/ChangeLog:

* gcc.target/riscv/pr118241-b.cc: New test.

Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
(cherry picked from commit f2a3ab7ebf3c40da77f54e8329272fe048ec48a6)

RISC-V: prefetch: const offset needs to have 5 bits zero, not 4

Spotted this by chance as I saw a similar fixup in comment.
From comments, I think this is needed, but I've not hit any issues due
to this.

gcc/ChangeLog:

* config/riscv/predicates.md (prefetch_operand): Mask 5 bits.

Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
(cherry picked from commit b960201091fcab631a34a8c8d5b30e9f297dfbe5)

[RISC-V][PR target/118241] Fix data prefetch predicate/constraint for RISC-V

Fix typo in comment spotted by Peter B.

PR target/118241
gcc/
* config/riscv/predicates.md: Fix comment typo in recent change.

(cherry picked from commit bf7162b321128ba93521a824e5a7a00d1cc3d1f8)

[RISC-V][PR target/118241] Fix data prefetch predicate/constraint for RISC-V

The RISC-V prefetch support is broken in a few ways.  This addresses the data
side prefetch problems.  I'd mistakenly thought this BZ was prefetch.i
related (which has deeper problems).

The basic problem is we were accepting any valid address when in fact there are
restrictions.  This patch more precisely defines the predicate such that we
allow

REG
REG+D

Where D must have the low 5 bits clear.  Note that absolute addresses fall into
the REG+D form using the x0 for the register operand since it always has the
value zero.  The test verifies REG, REG+D, ABS addressing modes that are valid
as well as REG+D and ABS which must be reloaded into a REG because the
displacement has low bits set.
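
The displacement rule can be sketched as a plain C++ check.  This is not the predicate from predicates.md; the function name is hypothetical, and the signed 12-bit range test is an assumption added here on top of the low-5-bits rule described above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: can displacement d be used directly in a REG+D
// data prefetch address, or must the address be reloaded into a register?
inline bool prefetch_offset_ok(std::int64_t d) {
    const bool low5_clear = (d & 0x1f) == 0;           // rule from the text
    const bool fits_simm12 = d >= -2048 && d <= 2047;  // assumed range
    return low5_clear && fits_simm12;
}
```

For example, 32 and 64 pass, while 16 and 56 have low bits set and would need the address reloaded into a REG.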

An earlier version of this patch has gone through testing in my tester on rv32
and rv64.  Obviously I'll wait for pre-commit CI to do its thing before moving
forward.

This is a good backport candidate after simmering on the trunk for a bit.

PR target/118241
gcc/
* config/riscv/predicates.md (prefetch_operand): New predicate.
* config/riscv/constraints.md (Q): New constraint.
* config/riscv/riscv.md (prefetch): Use new predicate and constraint.
(riscv_prefetchi_<mode>): Similarly.

gcc/testsuite/
* gcc.target/riscv/pr118241.c: New test.

(cherry picked from commit 49199bb29628365fc6c60bd185808a1bad65086d)

Ada: Add missing guard before accessing the Underlying_Record_View field

It is necessary when GNAT extensions are enabled (-gnatX switch).

gcc/ada/
PR ada/121056
* sem_ch4.adb (Try_Object_Operation.Try_Primitive_Operation): Add
test on Is_Record_Type before accessing Underlying_Record_View.

gcc/testsuite/
* gnat.dg/deref4.adb: New test.
* gnat.dg/deref4_pkg.ads: New helper.

x86-64: Add RDI clobber to 64-bit dynamic TLS patterns

*tls_global_dynamic_64_largepic, *tls_local_dynamic_64_<mode> and
*tls_local_dynamic_base_64_largepic use RDI as the __tls_get_addr
argument. Add RDI clobber to these patterns to show it.

gcc/

PR target/120908
* config/i386/i386.cc (legitimize_tls_address): Pass RDI to
gen_tls_local_dynamic_64.
* config/i386/i386.md (*tls_global_dynamic_64_largepic): Add
RDI clobber and use it to generate LEA.
(*tls_local_dynamic_64_<mode>): Likewise.
(*tls_local_dynamic_base_64_largepic): Likewise.
(@tls_local_dynamic_64_<mode>): Add a clobber.

gcc/testsuite/

PR target/120908
* gcc.target/i386/pr120908.c: New test.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit d8d5e2a8031e74f08f61ccdd727476f97940c5a6)

x86-64: Add RDI clobber to tls_global_dynamic_64 patterns

*tls_global_dynamic_64_<mode> uses RDI as the __tls_get_addr argument.
Add RDI clobber to tls_global_dynamic_64 patterns to show it.

PR target/120908
* config/i386/i386.cc (legitimize_tls_address): Pass RDI to
gen_tls_global_dynamic_64.
* config/i386/i386.md (*tls_global_dynamic_64_<mode>): Add RDI
clobber and use it to generate LEA.
(@tls_global_dynamic_64_<mode>): Add a clobber.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit 7710d513a552f1fa1b7485ec6b318bafaa6d4cd7)

i386: Remove KEYLOCKER related feature since Panther Lake and Clearwater Forest

According to the July 2025 SDM, Key Locker will no longer be supported on
hardware from 2025 onwards. This means for Panther Lake and Clearwater Forest,
the feature will not be enabled. Remove them from those two platforms.

gcc/ChangeLog:

* config/i386/i386.h (PTA_PANTHERLAKE): Remove KL and WIDEKL.
(PTA_CLEARWATERFOREST): Ditto.
* doc/invoke.texi: Revise documentation.

Daily bump.

[PATCH] [RISC-V] Fix shift type for RVV interleaved stepped patterns [PR120356]

This corrects the shift type used for interleaved stepped patterns when
expanding const vectors in LRA. The shift was originally LSHIFTRT, and it
appears it should be the same type for both LRA and the other cases.

PR target/120356

gcc/ChangeLog:

* config/riscv/riscv-v.cc
(expand_const_vector_interleaved_stepped_npatterns):
Fix ASHIFT to LSHIFTRT insn.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/pr120356.c: New test.

(cherry picked from commit 9c1ed63e4c6b0f80dd47ce421dd7d80d52c38fd3)

[PATCH] riscv: allow zero in zacas subword atomic cas

gcc:
PR target/120995
* config/riscv/sync.md (zacas_atomic_cas_value_strong<mode>):
Allow op3 to be zero.

gcc/testsuite:
PR target/120995
* gcc.target/riscv/amo/zabha-zacas-atomic-cas.c: New test.

(cherry picked from commit 3fd638a9e5497dfdf00f1783d6e704af03fb44b0)

Daily bump.

PR modula2/120253: Error message column numbers should start at 1 not 0

This patch ensures that column numbers start at 1 rather than 0.

gcc/m2/ChangeLog:

PR modula2/120253
* m2.flex (FIRST_COLUMN): New define.
(updatepos): Remove commented code.
(consumeLine): Assign column to FIRST_COLUMN.
(initLine): Ditto.
(m2flex_GetColumnNo): Return FIRST_COLUMN if currentLine is NULL.
(m2flex_GetLineNo): Rewrite for positive logic.
(m2flex_GetLocation): Ditto.

(cherry picked from commit 9a485b83e177cb742be17faf20ac5cc7db14fee3)

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

testsuite: Add testcase for already fixed PR [PR120954]

This was a regression introduced by r16-1893 (and its backports) for C++,
though for C it had false positive warning for years. Fixed by r16-2000
(and its backports).

2025-07-11 Jakub Jelinek <jakub@redhat.com>

PR c++/120954
* c-c++-common/Warray-bounds-11.c: New test.

(cherry picked from commit 9eea49825ebb607f4b67de48c9cba1f85e005932)

ipa: Disallow signature changes in fun->has_musttail functions [PR121023]

As the following testcase shows e.g. on ia32, letting IPA opts change
signature of functions which have [[{gnu,clang}::musttail]] calls
can turn programs that would be compiled normally into something
that is rejected because the caller has fewer argument stack slots
than the function being tail called.

The following patch prevents signature changes for such functions.
It is perhaps too big a hammer in some cases, but it might be hard
to try to figure out what signature changes are still acceptable and which
are not at IPA time.

2025-07-11 Jakub Jelinek <jakub@redhat.com>
Martin Jambor <mjambor@suse.cz>

PR ipa/121023
* ipa-fnsummary.cc (compute_fn_summary): Disallow signature changes
on cfun->has_musttail functions.

* c-c++-common/musttail32.c: New test.

(cherry picked from commit 89b9372d61ccd45cb6c71518d62215917e3aaebc)

c++: Fix up final handling in C++98 [PR120628]

The following patch is on top of the
https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686210.html
patch which stopped treating override as conditional keyword in
class properties.
This PR mentions another problem; we emit a bogus warning on code like
struct C {}; struct C final = {};
in C++98.  In this case we parse final as conditional keyword in C++
(including pedwarn) but the caller then immediately aborts the tentative
parse because it isn't followed by { nor (in some cases) : .
I think we certainly shouldn't pedwarn on it, but I think we even shouldn't
warn for it say for -Wc++11-compat, because we don't actually treat the
identifier as conditional keyword even in C++11 and later.
The patch only does this if final is the only class property conditional
keyword, if one uses
struct S __final final __final = {};
one gets the warning and duplicate diagnostics and later parsing errors.

2025-07-10  Jakub Jelinek  <jakub@redhat.com>

PR c++/120628
* parser.cc (cp_parser_elaborated_type_specifier): Use
cp_parser_nth_token_starts_class_definition_p with extra argument 1
instead of cp_parser_next_token_starts_class_definition_p.
(cp_parser_class_property_specifier_seq_opt): For final conditional
keyword in C++98 check if the token after it isn't
cp_parser_nth_token_starts_class_definition_p nor CPP_NAME and in
that case break without consuming it nor warning.
(cp_parser_class_head): Use
cp_parser_nth_token_starts_class_definition_p with extra argument 1
instead of cp_parser_next_token_starts_class_definition_p.
(cp_parser_next_token_starts_class_definition_p): Renamed to ...
(cp_parser_nth_token_starts_class_definition_p): ... this.  Add N
argument.  Use cp_lexer_peek_nth_token instead of cp_lexer_peek_token.

* g++.dg/cpp0x/final1.C: New test.
* g++.dg/cpp0x/final2.C: New test.
* g++.dg/cpp0x/override6.C: New test.

(cherry picked from commit 8f063b40e5b8f23cb89fee21afaa71deedbdf2aa)

c++: Don't incorrectly reject override after class head name [PR120569]

While the
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2025/p2786r13.html#c03-compatibility-changes-for-annex-c-diff.cpp03.dcl.dcl
hunk was dropped because
struct C {}; struct C final {};
is actually not valid C++98 (which didn't have list initialization), we
actually also reject
struct D {}; struct D override {};
and that IMHO is valid all the way from C++11 onwards.
Especially in the light of P2786R13 adding new contextual keywords, I think
it is better to use a separate routine for parsing the
class-virt-specifier-seq (in C++11, there was export next to final),
class-virt-specifier (in C++14 to C++23) and
class-property-specifier-seq (in C++26) instead of using the same function
for virt-specifier-seq and class-property-specifier-seq.

2025-07-10  Jakub Jelinek  <jakub@redhat.com>

PR c++/120569
* parser.cc (cp_parser_class_property_specifier_seq_opt): New
function.
(cp_parser_class_head): Use it instead of
cp_parser_property_specifier_seq_opt.  Don't diagnose
VIRT_SPEC_OVERRIDE here.  Formatting fix.

* g++.dg/cpp0x/override2.C: Expect different diagnostics with
override.
* g++.dg/cpp0x/override5.C: New test.

(cherry picked from commit bcb51fe0e26bed7e2c44c4822ca6dec135ba61f3)

c-family: Tweak ptr +- (expr +- cst) FE optimization [PR120837]

The following testcase is miscompiled with -fsanitize=undefined but we
introduce UB into the IL even without that flag.

The optimization ptr +- (expr +- cst) when expr/cst have undefined
overflow into (ptr +- cst) +- expr is sometimes simply not valid,
without careful analysis on what ptr points to we don't know if it
is valid to do (ptr +- cst) pointer arithmetics.
E.g. on the testcase, ptr points to start of an array (actually
conditionally one or another) and cst is -1, so ptr - 1 is invalid
pointer arithmetics, while ptr + (expr - 1) can be valid if expr
is at runtime always > 1 and smaller than size of the array ptr points
to + 1.

Unfortunately, removing this 1992-ish optimization altogether causes
FAIL: c-c++-common/restrict-2.c  -Wc++-compat   scan-tree-dump-times lim2 "Moving statement" 11
FAIL: gcc.dg/tree-ssa/copy-headers-5.c scan-tree-dump ch2 "is now do-while loop"
FAIL: gcc.dg/tree-ssa/copy-headers-5.c scan-tree-dump-times ch2 "  if " 3
FAIL: gcc.dg/vect/pr57558-2.c scan-tree-dump vect "vectorized 1 loops"
FAIL: gcc.dg/vect/pr57558-2.c -flto -ffat-lto-objects  scan-tree-dump vect "vectorized 1 loops"
regressions (restrict-2.c also for C++ in all std modes).  I've been thinking
about some match.pd optimization for signed integer addition/subtraction of
constant followed by widening integral conversion followed by multiplication
or left shift, but that wouldn't help 32-bit arches.

So, instead at least for now, the following patch keeps doing the
optimization, just doesn't perform it in pointer arithmetics.
pointer_int_sum itself actually adds the multiplication by size_exp,
so ptr + expr is turned into ptr p+ expr * size_exp,
so this patch will try to optimize
ptr + (expr +- cst)
into
ptr p+ ((sizetype)expr * size_exp +- (sizetype)cst * size_exp)
and
ptr - (expr +- cst)
into
ptr p+ -((sizetype)expr * size_exp +- (sizetype)cst * size_exp)
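
A hedged C++ sketch of the first form, using std::size_t as a stand-in for sizetype.  The helper names are invented for illustration.  The point is that the constant is folded into the byte offset with wrapping unsigned arithmetic, so the intermediate pointer ptr +- cst is never formed.

```cpp
#include <cassert>
#include <cstddef>

// Byte offset for ptr + (expr + cst), computed in the rewritten form:
//   (sizetype)expr * size_exp + (sizetype)cst * size_exp
// Unsigned arithmetic wraps, so no intermediate step has undefined overflow.
inline std::size_t split_byte_offset(long expr, long cst,
                                     std::size_t size_exp) {
    return static_cast<std::size_t>(expr) * size_exp
         + static_cast<std::size_t>(cst) * size_exp;
}

// Apply the combined offset to the base pointer in a single step.
// The sketch assumes the final offset stays inside the array.
inline int* element_at(int* ptr, long expr, long cst) {
    char* base = reinterpret_cast<char*>(ptr);
    return reinterpret_cast<int*>(
        base + split_byte_offset(expr, cst, sizeof(int)));
}
```

With expr = 3 and cst = -1 the combined offset is 2 * sizeof(int), and ptr - 1 is never computed on its own.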

2025-07-04  Jakub Jelinek  <jakub@redhat.com>

PR c/120837
* c-common.cc (pointer_int_sum): Rewrite the intop PLUS_EXPR or
MINUS_EXPR optimization into extension of both intop operands,
their separate multiplication and then addition/subtraction followed
by rest of pointer_int_sum handling after the multiplication.

* gcc.dg/ubsan/pr120837.c: New test.

(cherry picked from commit e16820d4f7ab1d8a40f70beef722e6f8a4c2392c)

libstdc++: Fix __uninitialized_default for constexpr case [PR119754]

We should not use the std::fill optimization for trivial types during
constant evaluation, because we need to begin the lifetime of all
objects, even trivially default constructible ones.

This fixes a bug that Clang diagnosed:

include/c++/16.0.0/bits/stl_algobase.h:925:11: note: assignment to object outside its lifetime is not allowed in a constant expression
925 | *__first = __val;
| ~~~~~~~~~^~~~~~~

I initially just added the #ifdef __cpp_lib_is_constant_evaluated check,
but that gave warnings with GCC because the function isn't constexpr
until C++26. So then I tried checking __glibcxx_raw_memory_algorithms
for the value indicating constexpr uninitialized_value_construct, but
that macro depends on __cpp_constexpr >= 202406 and Clang 19 doesn't
support constexpr placement new, so doesn't define it.

So I decided to just change __uninitialized_default to use
_GLIBCXX20_CONSTEXPR which is consistent with __uninitialized_default_n
(which needs to be constexpr because it's used by std::vector). We don't
currently need to use __uninitialized_default in constexpr contexts for
C++20 code, but we might find uses for it, so now it would be possible.

libstdc++-v3/ChangeLog:

PR libstdc++/119754
* include/bits/stl_uninitialized.h (__uninitialized_default):
Do not use optimized implementation for constexpr case. Use
_GLIBCXX20_CONSTEXPR instead of _GLIBCXX26_CONSTEXPR.

(cherry picked from commit 82d2d12da93b5afbc3479e64d0aa0dcec5b42d8d)

libstdc++: Do not use list-initialization in std::span members [PR120997]

As the bug report shows, for span<const bool> the return statements of
the form `return {data(), count};` will use the new C++26 constructor,
span(initializer_list<element_type>).

Although the conversions from data() to bool and count to bool are
narrowing and should be ill-formed, in system headers the narrowing
diagnostics are suppressed. In any case, even if the compiler diagnosed
them as ill-formed, we still don't want the initializer_list constructor
to be used. We want to use the span(element_type*, size_t) constructor
instead.

Replace the braced-init-list uses with S(data(), count) where S is the
correct return type. We need to make similar changes in the C++26
working draft, which will be taken care of via an LWG issue.
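
The overload resolution behaviour can be reproduced outside of std::span.  In the sketch below, View and Flag are hypothetical stand-ins: Flag converts from anything, playing the role of the pointer-to-bool and size_t-to-bool conversions, and View records which constructor was selected.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>

// Hypothetical element type that is implicitly constructible from anything.
struct Flag {
    template <typename U>
    Flag(const U&) {}
};

// Hypothetical stand-in for span with both constructor shapes.
struct View {
    enum class Ctor { PointerAndCount, InitList };
    Ctor used;
    View(const Flag*, std::size_t) : used(Ctor::PointerAndCount) {}
    View(std::initializer_list<Flag>) : used(Ctor::InitList) {}
};

// List-initialization prefers the initializer_list constructor when the
// elements are convertible to the element type.
inline View braced(const Flag* p, std::size_t n) { return {p, n}; }

// Parentheses select the (pointer, count) constructor as intended.
inline View parenthesized(const Flag* p, std::size_t n) { return View(p, n); }
```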

libstdc++-v3/ChangeLog:

PR libstdc++/120997
* include/std/span (span::first, span::last, span::subspan): Do
not use braced-init-list for return statements.
* testsuite/23_containers/span/120997.cc: New test.

(cherry picked from commit a72d0e1a8bf0770ddf1d8d0ebe589f92a4fab4ef)

libstdc++: Ensure pool resources meet alignment requirements [PR118681]

For allocations with size > alignment and size % alignment != 0 we were
sometimes returning pointers that did not meet the requested alignment.
For example, allocate(24, 16) would select the pool for 24-byte objects
and the second allocation from that pool (at offset 24 bytes into the
pool) is only 8-byte aligned not 16-byte aligned.

The pool resources need to round up the requested allocation size to a
multiple of the alignment, so that the selected pool will always return
allocations that meet the alignment requirement.
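
The arithmetic behind the example can be sketched as follows.  These helpers are illustrative only and are not the choose_block_size function added by the patch; they assume power-of-two alignments and a pool whose first block is suitably aligned.

```cpp
#include <cassert>
#include <cstddef>

// Round `bytes` up to a multiple of `alignment` (a power of two).
inline std::size_t round_up_to_alignment(std::size_t bytes,
                                         std::size_t alignment) {
    return (bytes + alignment - 1) & ~(alignment - 1);
}

// Offset of the n-th block (0-based) in a pool of fixed-size blocks.
inline std::size_t block_offset(std::size_t block_size, std::size_t n) {
    return block_size * n;
}
```

For allocate(24, 16), the second block of a 24-byte pool sits at offset 24, which is 8 mod 16.  After rounding the size up to 32, every block offset is a multiple of 16.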

This backport includes the fixes for the bootstrap error and the tests.

libstdc++-v3/ChangeLog:

PR libstdc++/118681
* src/c++17/memory_resource.cc (choose_block_size): New
function.
(synchronized_pool_resource::do_allocate): Use choose_block_size
to determine appropriate block size.
(synchronized_pool_resource::do_deallocate): Likewise.
(unsynchronized_pool_resource::do_allocate): Likewise.
(unsynchronized_pool_resource::do_deallocate): Likewise.
* testsuite/20_util/synchronized_pool_resource/118681.cc: New
test.
* testsuite/20_util/unsynchronized_pool_resource/118681.cc: New
test.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
(cherry picked from commit ac2fb60a67d6d1de6446c25c5623b8a1389f4770)

Daily bump.

Fix 'main' function in 'gcc.dg/builtin-dynamic-object-size-pr120780.c'

Fix-up for commit 72e85d46472716e670cbe6e967109473b8d12d38
"tree-optimization/120780: Support object size for containing objects".
'size_t sz' is unused here, and GCC/nvptx doesn't accept this:

    spawn -ignore SIGHUP [...]/nvptx-none-run ./builtin-dynamic-object-size-pr120780.exe
    error   : Prototype doesn't match for 'main' in 'input file 1 at offset 1924', first defined in 'input file 1 at offset 1924'
    nvptx-run: cuLinkAddData failed: unknown error (CUDA_ERROR_UNKNOWN, 999)
    FAIL: gcc.dg/builtin-dynamic-object-size-pr120780.c execution test

gcc/testsuite/
* gcc.dg/builtin-dynamic-object-size-pr120780.c: Fix 'main' function.

(cherry picked from commit c6ca6e57004653b787d2d6243fe5ee00cda8aad0)

tree-optimization/120780: Support object size for containing objects

MEM_REF cast of a subobject to its containing object has negative
offsets, which objsz sees as an invalid access. Support this use case
by peeking into the structure to validate that the containing object
indeed contains a type of the subobject at that offset and if present,
adjust the wholesize for the object to allow the negative offset.
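
For readers unfamiliar with the pattern, this is the kind of source that leads to a negative offset: recovering the containing object from a pointer to one of its members.  The types and the helper are hypothetical and are not taken from the new test case.

```cpp
#include <cassert>
#include <cstddef>

struct Inner { int value; };
struct Outer {
    long header;
    Inner member;  // subobject at a positive offset inside Outer
};

// Step back from the subobject to its containing object.  Relative to
// `inner` the containing object starts at a negative offset.
inline Outer* containing_object(Inner* inner) {
    char* p = reinterpret_cast<char*>(inner);
    return reinterpret_cast<Outer*>(p - offsetof(Outer, member));
}
```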

gcc/ChangeLog:

PR tree-optimization/120780
* tree-object-size.cc (inner_at_offset,
get_wholesize_for_memref): New functions.
(addr_object_size): Call get_wholesize_for_memref.

gcc/testsuite/ChangeLog:

PR tree-optimization/120780
* gcc.dg/builtin-dynamic-object-size-pr120780.c: New test case.

Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>
(cherry picked from commit 72e85d46472716e670cbe6e967109473b8d12d38)

aarch64: Add support for NVIDIA GB10

This adds support for -mcpu=gb10.  This is a big.LITTLE configuration
involving Cortex-X925 and Cortex-A725 cores.  The appropriate MIDR numbers
are added to detect them in -mcpu=native.  We did not add an
-mcpu=cortex-x925.cortex-a725 option because GB10 does include the crypto
instructions which we want on by default, and the current convention is to not
enable such extensions for Arm Cortex cores in -mcpu where they are optional
in the IP.

Bootstrapped and tested on aarch64-none-linux-gnu.

Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com>
gcc/

* config/aarch64/aarch64-cores.def (gb10): New entry.
* config/aarch64/aarch64-tune.md: Regenerate.
* doc/invoke.texi (AArch64 Options): Document the above.

(cherry picked from commit 9ff6ade24cae5a51d1ee9d9ad4b4a5c682e4a5ed)

Daily bump.

Fortran: Remove corank conformability checks [PR120843]

Remove the checks on corank conformability in expressions,
because there is nothing in the standard about it. When a coarray
has no coindexes it is treated like a non-coarray; when it has
a full-corank coindex its result is a regular array. So there is nothing
to check for corank conformability.

PR fortran/120843

gcc/fortran/ChangeLog:

* resolve.cc (resolve_operator): Remove conformability check,
because it is not in the standard.

gcc/testsuite/ChangeLog:

* gfortran.dg/coarray/coindexed_6.f90: Enhance test to have
coarray components covered.

(cherry picked from commit 15413e05eb9cde976b8890cd9b597d0a41a8eb27)