Adjust reduction with conversion SLP build

The following adjusts how we set SLP_TREE_VECTYPE for the conversion
node we build when fixing up the reduction with conversion SLP instance.
This should probably see more TLC, but the following avoids relying
on STMT_VINFO_VECTYPE for this.

* tree-vect-slp.cc (vect_build_slp_instance): Do not use
SLP_TREE_VECTYPE to determine the conversion back to the
reduction IV.

Avoid vect_is_simple_use call from vectorizable_reduction

When analyzing the reduction cycle we look to determine the
reduction input vector type.  For lane-reducing ops we look
at the input, but instead of using vect_is_simple_use, which
is problematic for SLP, we should simply get at the SLP
operand's vector type.  If that's not set and we make one up,
we should also ensure it stays so.

* tree-vect-loop.cc (vectorizable_reduction): Avoid
vect_is_simple_use and record a vector type if we come
up with one.

Avoid vect_is_simple_use call from get_load_store_type

This isn't the required refactoring of vect_check_gather_scatter
but it avoids a now unnecessary call to vect_is_simple_use which
is problematic because it looks at STMT_VINFO_VECTYPE which we
want to get rid of. SLP build already ensures vect_is_simple_use
on all lane defs, so all we need is to populate the offset_vectype
and offset_dt which is not always set by vect_check_gather_scatter.
Both are easy to get from the SLP child directly.

* tree-vect-stmts.cc (get_load_store_type): Do not use
vect_is_simple_use to fill gather/scatter offset operand
vectype and dt.

Pass SLP node down to cost hook for reduction cost

The following arranges vector reduction costs to hand down the
SLP node (of the reduction stmt) to the cost hooks, not only the
stmt_info. This also avoids accessing STMT_VINFO_VECTYPE of a stmt
unrelated to the node that is subject to code generation.

* tree-vect-loop.cc (vect_model_reduction_cost): Get SLP
node instead of stmt_info and use that when recording costs.

aarch64: PR target/120999: Adjust operands for movprfx alternative of NBSL implementation of NOR

While the SVE2 NBSL instruction accepts MOVPRFX to add more flexibility
due to its tied operands, the destination of the movprfx cannot also be
a source operand.  But the offending pattern in aarch64-sve2.md tries
to do exactly that for the "=?&w,w,w" alternative and gas warns for the
attached testcase.

This patch adjusts that alternative to avoid taking operand 0 as an input
in the NBSL again.

So for the testcase in the patch we now generate:
nor_z:
        movprfx z0, z1
        nbsl    z0.d, z0.d, z2.d, z1.d
        ret

instead of the previous:
nor_z:
        movprfx z0, z1
        nbsl    z0.d, z0.d, z2.d, z0.d
        ret

which generated a gas warning.
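
Why the adjusted operands still compute NOR can be checked with a scalar
model.  This is only a sketch: it assumes NBSL computes the inverted bitwise
select ~((op1 & sel) | (op2 & ~sel)), and the names nbsl_model, nor_via_nbsl
and check_nor_via_nbsl are hypothetical helpers, not anything in GCC.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of NBSL: take bits of op1 where sel is 1,
// bits of op2 where sel is 0, then invert the result.
static std::uint64_t nbsl_model(std::uint64_t op1, std::uint64_t op2,
                                std::uint64_t sel)
{
  return ~((op1 & sel) | (op2 & ~sel));
}

// NOR through the model, mirroring "nbsl z0.d, z0.d, z2.d, z1.d" after
// "movprfx z0, z1": op1 and sel both hold a, op2 holds b.
static std::uint64_t nor_via_nbsl(std::uint64_t a, std::uint64_t b)
{
  return nbsl_model(a, b, a);
}

// Small self-check; call it from main() to exercise the model.
static void check_nor_via_nbsl()
{
  assert(nor_via_nbsl(0xF0F0u, 0x0FF0u) == ~std::uint64_t{0xFFF0u});
}
```

With op1 = sel = a the select reduces to a | (b & ~a), which equals a | b, so
the selector can be read from z1 rather than from the movprfx destination.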

Bootstrapped and tested on aarch64-none-linux-gnu.

Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com>
gcc/

PR target/120999
* config/aarch64/aarch64-sve2.md (*aarch64_sve2_nor<mode>):
Adjust movprfx alternative.

gcc/testsuite/

PR target/120999
* gcc.target/aarch64/sve2/pr120999.c: New test.

aarch64: Extend HVLA permutations to big-endian

TARGET_VECTORIZE_VEC_PERM_CONST has code to match the SVE2.1
"hybrid VLA" DUPQ, EXTQ, UZPQ{1,2}, and ZIPQ{1,2} instructions.
This matching was conditional on !BYTES_BIG_ENDIAN.

The ACLE code also lowered the associated SVE2.1 intrinsics into
suitable VEC_PERM_EXPRs.  This lowering was not conditional on
!BYTES_BIG_ENDIAN.

The mismatch led to lots of ICEs in the ACLE tests on big-endian
targets: we lowered to VEC_PERM_EXPRs that are not supported.

I think the !BYTES_BIG_ENDIAN restriction was unnecessary.
SVE maps the first memory element to the least significant end of
the register for both endiannesses, so no endian correction or lane
number adjustment is necessary.

This is in some ways a bit counterintuitive.  ZIPQ1 is conceptually
"apply Advanced SIMD ZIP1 to each 128-bit block" and endianness does
matter when choosing between Advanced SIMD ZIP1 and ZIP2.  For example,
the V4SI permute selector { 0, 4, 1, 5 } corresponds to ZIP1 for little-
endian and ZIP2 for big-endian.  But the difference between the hybrid
VLA and Advanced SIMD permute selectors is a consequence of the
difference between the SVE and Advanced SIMD element orders.
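
The per-block selector described above can be modelled as follows.  This is a
minimal sketch under assumptions: the helper zipq1_selector_s32 is
hypothetical, it uses the usual two-input convention in which indices below
nelts select from the first input and the rest from the second, and it is not
taken from aarch64_evpc_hvla.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: build a two-input permute selector that applies a
// ZIP1-style interleave to every 128-bit block of 32-bit elements.
static std::vector<std::size_t> zipq1_selector_s32(std::size_t nelts)
{
  const std::size_t per_block = 4;  // four 32-bit elements per 128 bits
  std::vector<std::size_t> sel;
  sel.reserve(nelts);
  for (std::size_t base = 0; base + per_block <= nelts; base += per_block)
    for (std::size_t i = 0; i < per_block / 2; ++i)
      {
        sel.push_back(base + i);          // element from the first input
        sel.push_back(nelts + base + i);  // matching element from the second
      }
  return sel;
}

// Small self-check; call it from main() to exercise the helper.
static void check_zipq1_selector()
{
  const std::vector<std::size_t> expected = {0, 4, 1, 5};
  assert(zipq1_selector_s32(4) == expected);
}
```

For a single 128-bit block this yields { 0, 4, 1, 5 }, the V4SI selector
quoted above; longer vectors repeat the same pattern per block.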

The same thing applies to ACLE intrinsics.  The current lowering of
svzipq1 etc. is correct for both endiannesses.  If ACLE code does:

  2x svld1_s32 + svzipq1_s32 + svst1_s32

then the byte-for-byte result is the same for both endiannesses.
On big-endian targets, this is different from using the Advanced SIMD
sequence below for each 128-bit block:

  2x LDR + ZIP1 + STR

In contrast, the byte-for-byte result of:

  2x svld1q_gather_s32 + svzipq1_s32 + svst1q_scatter_s32

depends on endianness, since the quadword gathers and scatters use
Advanced SIMD byte ordering for each 128-bit block.  This gather/scatter
sequence behaves in the same way as the Advanced SIMD LDR+ZIP1+STR
sequence for both endiannesses.

Programmers writing ACLE code have to be aware of this difference
if they want to support both endiannesses.

The patch includes some new execution tests to verify the expansion
of the VEC_PERM_EXPRs.

gcc/
* doc/sourcebuild.texi (aarch64_sve2_hw, aarch64_sve2p1_hw): Document.
* config/aarch64/aarch64.cc (aarch64_evpc_hvla): Extend to
BYTES_BIG_ENDIAN.

gcc/testsuite/
* lib/target-supports.exp (check_effective_target_aarch64_sve2p1_hw):
New proc.
* gcc.target/aarch64/sve2/dupq_1.c: Extend to big-endian.  Add
noipa attributes.
* gcc.target/aarch64/sve2/extq_1.c: Likewise.
* gcc.target/aarch64/sve2/uzpq_1.c: Likewise.
* gcc.target/aarch64/sve2/zipq_1.c: Likewise.
* gcc.target/aarch64/sve2/dupq_1_run.c: New test.
* gcc.target/aarch64/sve2/extq_1_run.c: Likewise.
* gcc.target/aarch64/sve2/uzpq_1_run.c: Likewise.
* gcc.target/aarch64/sve2/zipq_1_run.c: Likewise.

Remove dead code dealing with non-SLP

After vect_analyze_loop_operations is gone we can clean up
vect_analyze_stmt as it is no longer called out of SLP context.

* tree-vectorizer.h (vect_analyze_stmt): Remove stmt-info
and need_to_vectorize arguments.
* tree-vect-slp.cc (vect_slp_analyze_node_operations_1):
Adjust.
* tree-vect-stmts.cc (can_vectorize_live_stmts): Remove
stmt_info argument and remove non-SLP path.
(vect_analyze_stmt): Remove stmt_info and need_to_vectorize
argument and prune paths no longer reachable.
(vect_transform_stmt): Adjust.

Comment spelling fix: tunning -> tuning

Kyrylo noticed another spelling bug and, as usual, the same mistake
happens in multiple places.

2025-07-10 Jakub Jelinek <jakub@redhat.com>

* config/i386/x86-tune.def: Change "Tunning the" to "tuning" in
comment and use semicolon instead of dot in comment.
* loop-unroll.cc (decide_unroll_stupid): Comment spelling fix,
tunning -> tuning.

Change bellow in comments to below

While I'm not a native English speaker, I believe all the uses
of bellow (roar/bark/...) in comments in gcc are meant to be
below (beneath/under/...).

2025-07-10 Jakub Jelinek <jakub@redhat.com>

gcc/
* tree-vect-loop.cc (scale_profile_for_vect_loop): Comment
spelling fix: bellow -> below.
* ipa-polymorphic-call.cc (record_known_type): Likewise.
* config/i386/x86-tune.def: Likewise.
* config/riscv/vector.md (*vsetvldi_no_side_effects_si_extend):
Likewise.
* tree-scalar-evolution.cc (iv_can_overflow_p): Likewise.
* ipa-devirt.cc (add_type_duplicate): Likewise.
* tree-ssa-loop-niter.cc (maybe_lower_iteration_bound): Likewise.
* gimple-ssa-sccopy.cc: Likewise.
* cgraphunit.cc: Likewise.
* graphite.h (struct poly_dr): Likewise.
* ipa-reference.cc (ignore_edge_p): Likewise.
* tree-ssa-alias.cc (ao_compare::compare_ao_refs): Likewise.
* profile-count.h (profile_probability::probably_reliable_p):
Likewise.
* ipa-inline-transform.cc (inline_call): Likewise.
gcc/ada/
* par-load.adb: Comment spelling fix: bellow -> below.
* libgnarl/s-taskin.ads: Likewise.
gcc/testsuite/
* gfortran.dg/g77/980310-3.f: Comment spelling fix: bellow -> below.
* jit.dg/test-debuginfo.c: Likewise.
libstdc++-v3/
* testsuite/22_locale/codecvt/codecvt_unicode.h
(ucs2_to_utf8_out_error): Comment spelling fix: bellow -> below.
(utf16_to_ucs2_in_error): Likewise.

Remove vect_dissolve_slp_only_groups

This function dissolves DR groups that are not subject to SLP, which
means it is no longer necessary.

* tree-vect-loop.cc (vect_dissolve_slp_only_groups): Remove.
(vect_analyze_loop_2): Do not call it.

Remove vect_analyze_loop_operations

This removes the remains of vect_analyze_loop_operations. All the
checks it still does on LC PHIs of inner loops in outer loop
vectorization should be handled by vectorizable_lc_phi.

* tree-vect-loop.cc (vect_active_double_reduction_p): Remove.
(vect_analyze_loop_operations): Remove.
(vect_analyze_loop_2): Do not call it.

Remove non-SLP vectorization factor determining

The following removes the VF determining step from non-SLP stmts.
For now we keep setting STMT_VINFO_VECTYPE for all stmts; there are
too many places to fix, including some more complicated ones, so
this is deferred to a followup.

Along with this, remove vect_update_vf_for_slp, merging the check for
present hybrid SLP stmts into vect_detect_hybrid_slp and failing analysis
early.  This also removes the need to essentially duplicate this check in
the stmt walk of vect_analyze_loop_operations.  Getting rid of that,
and performing some other checks earlier, is also deferred to a followup.

* tree-vect-loop.cc (vect_determine_vf_for_stmt_1): Rename
to ...
(vect_determine_vectype_for_stmt_1): ... this and only set
STMT_VINFO_VECTYPE.  Fail for single-element vector types.
(vect_determine_vf_for_stmt): Rename to ...
(vect_determine_vectype_for_stmt): ... this and only set
STMT_VINFO_VECTYPE. Fail for single-element vector types.
(vect_determine_vectorization_factor): Rename to ...
(vect_set_stmts_vectype): ... this and only set STMT_VINFO_VECTYPE.
(vect_update_vf_for_slp): Remove.
(vect_analyze_loop_operations): Remove walk over stmts.
(vect_analyze_loop_2): Call vect_set_stmts_vectype instead of
vect_determine_vectorization_factor.  Set vectorization factor
from LOOP_VINFO_SLP_UNROLLING_FACTOR.  Fail if vect_detect_hybrid_slp
detects hybrid stmts or when vect_make_slp_decision finds
nothing to SLP.
* tree-vect-slp.cc (vect_detect_hybrid_slp): Move check
whether we have any hybrid stmts here from vect_update_vf_for_slp
* tree-vect-stmts.cc (vect_analyze_stmt): Remove loop over
stmts.
* tree-vectorizer.h (vect_detect_hybrid_slp): Update.

RISCV: Remove the v extension requirement for sat scalar run test

The sat scalar run test should not require the v extension, thus
take rv32 || rv64 instead of riscv_v for the requirement.

The test suites below passed for this patch series.
* The rv64gcv full regression test.
* The rv32gcv full regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat/sat_s_add-run-1-i16.c: Take rv32 || rv64
instead of riscv_v for scalar run test.
* gcc.target/riscv/sat/sat_s_add-run-1-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-run-1-i64.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-run-1-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-run-2-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-run-2-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-run-2-i64.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-run-2-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-run-3-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-run-3-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-run-3-i64.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-run-3-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-run-4-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-run-4-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-run-4-i64.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-run-4-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-1-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-1-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-1-i64.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-1-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-2-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-2-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-2-i64.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-2-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-3-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-3-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-3-i64.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-3-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-4-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-4-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-4-i64.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-4-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-1-i16-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-1-i32-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-1-i32-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-1-i64-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-1-i64-to-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-1-i64-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-2-i16-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-2-i32-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-2-i32-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-2-i64-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-2-i64-to-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-2-i64-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-3-i16-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-3-i32-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-3-i32-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-3-i64-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-3-i64-to-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-3-i64-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-4-i16-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-4-i32-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-4-i32-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-4-i64-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-4-i64-to-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-4-i64-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-5-i16-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-5-i32-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-5-i32-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-5-i64-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-5-i64-to-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-5-i64-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-6-i16-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-6-i32-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-6-i32-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-6-i64-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-6-i64-to-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-6-i64-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-7-i16-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-7-i32-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-7-i32-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-7-i64-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-7-i64-to-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-7-i64-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-8-i16-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-8-i32-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-8-i32-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-8-i64-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-8-i64-to-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-8-i64-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-1-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-1-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-1-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-1-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-2-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-2-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-2-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-2-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-3-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-3-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-3-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-3-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-4-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-4-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-4-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-4-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-5-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-5-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-5-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-5-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-6-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-6-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-6-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-6-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-7-u16-from-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-7-u16-from-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-7-u32-from-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-7-u8-from-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-7-u8-from-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-7-u8-from-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-run-1-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-run-1-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-run-1-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-run-1-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-run-2-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-run-2-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-run-2-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-run-2-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-run-3-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-run-3-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-run-3-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-run-3-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-run-4-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-run-4-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-run-4-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-run-4-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-1-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-1-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-1-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-1-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-10-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-10-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-10-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-10-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-11-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-11-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-11-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-11-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-12-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-12-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-12-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-12-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-2-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-2-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-2-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-2-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-3-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-3-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-3-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-3-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-4-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-4-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-4-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-4-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-5-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-5-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-5-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-5-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-6-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-6-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-6-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-6-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-7-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-7-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-7-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-7-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-8-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-8-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-8-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-8-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-9-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-9-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-9-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-9-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-run-1-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-run-1-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-run-1-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-run-1-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-run-2-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-run-2-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-run-2-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-run-2-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-run-3-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-run-3-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-run-3-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-run-3-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-run-4-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-run-4-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-run-4-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-run-4-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-1-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-1-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-1-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-1-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-2-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-2-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-2-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-2-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-3-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-3-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-3-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-3-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-4-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-4-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-4-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-4-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-5-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-5-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-5-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-5-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-6-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-6-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-6-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-6-u8.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>

Daily bump.

cobol: Development round-up. [PR120765, PR119337, PR120794]

This collection of changes reflects development by both Jim Lowden and Bob
Dubner. It includes fixes to the cobcd script; refinements to the multiple-
period syntax; changes to the parser; implementation of DISPLAY/ACCEPT to and
from ENVIRONMENT-NAME, ENVIRONMENT-VALUE, ARGUMENT-NUMBER, ARGUMENT-VALUE and
minor changes to genapi.cc to cut down on the number of cppcheck warnings.

Co-authored-by: James K. Lowden <jklowden@cobolworx.com>
Co-authored-by: Robert Dubner <rdubner@symas.com>
gcc/cobol/ChangeLog:

PR cobol/120765
PR cobol/119337
PR cobol/120794
* Make-lang.in: Take control of the .cc.o rule.
* cbldiag.h (error_msg_direct): New declaration.
(gcc_location_dump): Forward declaration.
(location_dump): Use gcc_location_dump.
* cdf.y: Change some tokens.
* gcobc: Change dialect handling.
* genapi.cc (parser_call_targets_dump): Temporarily remove from service.
(parser_compile_dcls): Combine temporary arrays.
(get_binary_value_from_float): Apply const to one parameter.
(depending_on_value): Localize a boolean variable.
(normal_normal_compare): Likewise.
(cobol_compare): Eliminate cppcheck warning.
(combined_name): Apply const to an input parameter.
(parser_perform): Apply const to a variable.
(parser_accept): Improve handling of special_name_t parameter and
the exception conditions.
(parser_display): Improve handling of special_name_t parameter; use the
os_filename[] string when appropriate.
(program_end_stuff): Rename shadowing variable.
(parser_division): Consolidate temporary char[] arrays.
(parser_file_start): Apply const to a parameter.
(inspect_replacing): Likewise.
(parser_program_hierarchy): Rename shadowing variable.
(mh_identical): Apply const to parameters.
(float_type_of): Likewise.
(picky_memcpy): Likewise.
(mh_numeric_display): Likewise.
(mh_little_endian): Likewise.
(mh_source_is_group): Apply static to a variable.
(move_helper): Quiet a cppcheck warning.
* genapi.h (parser_accept): Add exceptions to declaration.
(parser_accept_under_discussion): Add declaration.
(parser_display): Change to std::vector; add exceptions to declaration.
* lexio.cc (cdf_source_format): Improve source code location handling.
(source_format_t::infer): Likewise.
(is_fixed_format): Likewise.
(is_reference_format): Likewise.
(left_margin): Likewise.
(right_margin): Likewise.
(cobol_set_indicator_column): Likewise.
(include_debug): Likewise.
(continues_at): Likewise.
(indicated): Likewise.
(check_source_format_directive): Likewise.
(cdftext::free_form_reference_format): Likewise.
* parse.y: Tokens; program and function names; DISPLAY and ACCEPT
handling.
* parse_ante.h (class tokenset_t): Removed.
(class current_tokens_t): Removed.
(field_of): Removed.
* scan.l: Token handling.
* scan_ante.h (level_found): Comment.
* scan_post.h (start_condition_str): Remove cast author_state:.
* symbols.cc (symbols_update): Change error message.
(symbol_table_init): Correct and reorder entries.
(symbol_unresolved_file_key): New function definition.
(cbl_file_key_t::deforward): Change error message.
* symbols.h (symbol_unresolved_file_key): New declaration.
(keyword_tok): New function.
(redefined_token): New function.
(class current_tokens_t): New class.
* symfind.cc (symbol_match): Revise error message.
* token_names.h: Reorder and change numbers in comments.
* util.cc (class cdf_directives_t): New class.
(cobol_set_indicator_column): New function.
(cdf_source_format): New function.
(gcc_location_set_impl): Improve column handling in token_location.
(gcc_location_dump): New function.
(class temp_loc_t): Modify constructor.
(error_msg_direct): New function.
* util.h (class source_format_t): New class.

libgcobol/ChangeLog:

* libgcobol.cc (__gg__accept_envar): ACCEPT/DISPLAY environment variables.
(accept_envar): Likewise.
(default_exception_handler): Refine system log entries.
(open_syslog): Likewise.
(__gg__set_env_name): ACCEPT/DISPLAY environment variables.
(__gg__get_env_name): ACCEPT/DISPLAY environment variables.
(__gg__get_env_value): ACCEPT/DISPLAY environment variables.
(__gg__set_env_value): ACCEPT/DISPLAY environment variables.
(__gg__fprintf_stderr): Adjust __attribute__ for printf.
(__gg__set_arg_num): ACCEPT/DISPLAY command-line arguments.
(__gg__accept_arg_value): ACCEPT/DISPLAY command-line arguments.
(__gg__get_file_descriptor): DISPLAY on os_filename[] /dev device.

libstdc++: Fix __uninitialized_default for constexpr case

We should not use the std::fill optimization for trivial types during
constant evaluation, because we need to begin the lifetime of all
objects, even trivially default constructible ones.

This fixes a bug that Clang diagnosed:

include/c++/16.0.0/bits/stl_algobase.h:925:11: note: assignment to object outside its lifetime is not allowed in a constant expression
925 | *__first = __val;
| ~~~~~~~~~^~~~~~~

I initially just added the #ifdef __cpp_lib_is_constant_evaluated check,
but that gave warnings with GCC because the function isn't constexpr
until C++26. So then I tried checking __glibcxx_raw_memory_algorithms
for the value indicating constexpr uninitialized_value_construct, but
that macro depends on __cpp_constexpr >= 202406 and Clang 19 doesn't
support constexpr placement new, so doesn't define it.

So I decided to just change __uninitialized_default to use
_GLIBCXX20_CONSTEXPR which is consistent with __uninitialized_default_n
(which needs to be constexpr because it's used by std::vector). We don't
currently need to use __uninitialized_default in constexpr contexts for
C++20 code, but we might find uses for it, so now it would be possible.

libstdc++-v3/ChangeLog:

* include/bits/stl_uninitialized.h (__uninitialized_default):
Do not use optimized implementation for constexpr case. Use
_GLIBCXX20_CONSTEXPR instead of _GLIBCXX26_CONSTEXPR.

libstdc++: Add more template keywords to <mdspan> for Clang

This fixes:

include/c++/16.0.0/mdspan:1182:33: error: use 'template' keyword to treat 'mapping' as a dependent template name
1182 |               const typename _OLayout::mapping<_OExtents>&>
      |                                        ^
include/c++/16.0.0/mdspan:1185:31: error: use 'template' keyword to treat 'mapping' as a dependent template name
1185 |             const typename _OLayout::mapping<_OExtents>&, mapping_type>
      |                                      ^

libstdc++-v3/ChangeLog:

* include/std/mdspan (mdspan): Add template keyword for
dependent name.
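
The rule behind the diagnostic can be shown in isolation.  This is a minimal
sketch; toy_layout and mapping_of are hypothetical names that only loosely
mirror the layout/mapping relationship in <mdspan>.

```cpp
#include <type_traits>

// Hypothetical layout policy with a nested class template.
struct toy_layout
{
  template<typename Extents>
  struct mapping { using extents_type = Extents; };
};

// Layout is a template parameter, so mapping is a dependent name and
// needs "template" before it to be parsed as a template.
template<typename Layout, typename Extents>
using mapping_of = typename Layout::template mapping<Extents>;

static_assert(std::is_same<mapping_of<toy_layout, int>,
                           toy_layout::mapping<int>>::value,
              "the alias names the nested mapping specialization");
```

Writing the keyword is always valid in this position, so adding it keeps the
code portable across compilers that differ in how strictly they require it.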

libstdc++: Do not use list-initialization in std::span members [PR120997]

As the bug report shows, for span<const bool> the return statements of
the form `return {data(), count};` will use the new C++26 constructor,
span(initializer_list<element_type>).

Although the conversions from data() to bool and count to bool are
narrowing and should be ill-formed, in system headers the narrowing
diagnostics are suppressed. In any case, even if the compiler diagnosed
them as ill-formed, we still don't want the initializer_list constructor
to be used. We want to use the span(element_type*, size_t) constructor
instead.

Replace the braced-init-list uses with S(data(), count) where S is the
correct return type. We need to make similar changes in the C++26
working draft, which will be taken care of via an LWG issue.
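
The same overload-resolution rule can be seen with std::vector, which also
has both a (count, value) constructor and an initializer_list constructor.
This is an analogy only, not the std::span code; the function names are
hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Parenthesized initialization selects the (count, value) constructor.
static std::size_t size_with_parens()
{
  std::vector<int> v(3, 1);  // three elements, each equal to 1
  return v.size();
}

// Braced initialization prefers the initializer_list constructor.
static std::size_t size_with_braces()
{
  std::vector<int> v{3, 1};  // two elements: 3 and 1
  return v.size();
}

// Small self-check; call it from main() to compare the two forms.
static void check_brace_vs_paren()
{
  assert(size_with_parens() == 3);
  assert(size_with_braces() == 2);
}
```

Spelling the return type out and using parentheses, as in S(data(), count),
takes the initializer_list constructor out of consideration.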

libstdc++-v3/ChangeLog:

PR libstdc++/120997
* include/std/span (span::first, span::last, span::subspan): Do
not use braced-init-list for return statements.
* testsuite/23_containers/span/120997.cc: New test.

aarch64: Fix endianness of DFmode vector constants

aarch64_simd_valid_imm tries to decompose a constant into a repeating
series of 64 bits, since most Advanced SIMD and SVE immediate forms
require that.  (The exceptions are handled first.)  It does this by
building up a byte-level register image, lsb first.  If the image does
turn out to repeat every 64 bits, it loads the first 64 bits into an
integer.

At this point, endianness has mostly been dealt with.  Endianness
applies to transfers between registers and memory, whereas at this
point we're dealing purely with register values.

However, one of the things we try is to bitcast the value to a float
and use FMOV.  This involves splitting the value into 32-bit chunks
(stored as longs) and passing them to real_from_target.  The problem
being fixed by this patch is that, when a value spans multiple 32-bit
chunks, real_from_target expects them to be in memory rather than
register order.  Thus index 0 is the most significant chunk if
FLOAT_WORDS_BIG_ENDIAN and the least significant chunk otherwise.
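
The chunk ordering described above can be sketched as follows.  This is an
illustration under assumptions: split_for_target is a hypothetical helper and
the float_words_big_endian parameter stands in for FLOAT_WORDS_BIG_ENDIAN; it
is not the code in aarch64_simd_valid_imm.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper: split a 64-bit register image into two 32-bit chunks
// in the order a real_from_target-style consumer expects.
static std::array<long, 2> split_for_target(std::uint64_t bits,
                                            bool float_words_big_endian)
{
  const long lo = static_cast<long>(bits & 0xffffffffu);
  const long hi = static_cast<long>(bits >> 32);
  // Index 0 is the most significant chunk only for big-endian word order.
  return float_words_big_endian ? std::array<long, 2>{{hi, lo}}
                                : std::array<long, 2>{{lo, hi}};
}

// Small self-check using the IEEE double 1.0 (0x3ff0000000000000).
static void check_split_for_target()
{
  const std::uint64_t one = 0x3ff0000000000000ULL;
  assert(split_for_target(one, true)[0] == 0x3ff00000L);
  assert(split_for_target(one, false)[0] == 0L);
}
```

Always placing the least significant chunk at index 0 is the register-order
assumption that goes wrong when the word order is big-endian.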

This fixes aarch64/sve/cond_fadd_1.c and various other tests
for aarch64_be-elf.

gcc/
* config/aarch64/aarch64.cc (aarch64_simd_valid_imm): Account
for FLOAT_WORDS_BIG_ENDIAN when building a floating-point value.

Fix ICE in afdo_adjust_guessed_profile

gcc/ChangeLog:

* auto-profile.cc (afdo_adjust_guessed_profile): Add forgotten
if (dump_file) guard.

c++: add passing testcases [PR120243]

These pass now; the first was fixed by r16-1507.

PR c++/120243

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/torture/pr120243-unhandled-1.C: New test.
* g++.dg/coroutines/torture/pr120243-unhandled-2.C: New test.

c++: generic lambda in template arg [PR121012]

My r16-2065 adding missed errors for auto in a template arg in a lambda
parameter also introduced a bogus error on this testcase, where the auto is
both in a lambda parameter and in a template arg, but in the other order,
which is OK. So we should clear in_template_argument_list_p for lambdas
like we do so many other parser flags.

PR c++/121012
PR c++/120917

gcc/cp/ChangeLog:

* parser.cc (cp_parser_lambda_expression): Clear
parser->in_template_argument_list_p.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/lambda-targ17.C: New test.

c++: 'this' in lambda in noexcept-spec [PR121008]

In r16-970 I changed finish_this_expr to look at current_class_type rather
than current_class_ptr to accommodate explicit object lambdas. But here in
a lambda in the noexcept-spec, the closure type doesn't yet have the
function as its context, so lambda_expr_this_capture can't find the function
and gives up. But in this context current_class_ptr refers to the
function's 'this', so let's go back to using it in that case.

PR c++/121008
PR c++/113563

gcc/cp/ChangeLog:

* semantics.cc (finish_this_expr): Do check current_class_ref for
non-lambda.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/lambda-uneval28.C: New test.

c++: optional template after :: causing error [PR119838]

Found while working on Reflection where we currently reject:

  constexpr auto r = ^^::template C<int>::type;

which should work, because "::template C<int>::" should match the

  nested-name-specifier template(opt) simple-template-id ::

production where the template is optional.  This bug is not limited
to Reflection as demonstrated by the attached test case, so I'm
submitting it separately.

The check_template_keyword_in_nested_name_spec call should ensure that
we're dealing with a template-id if we've seen "template".

PR c++/119838

gcc/cp/ChangeLog:

* parser.cc (cp_parser_nested_name_specifier_opt): New global_p
parameter.  Look for "template" when global_p is true.
(cp_parser_simple_type_specifier): Pass global_p to
cp_parser_nested_name_specifier_opt.

gcc/testsuite/ChangeLog:

* g++.dg/parse/template32.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>

aarch64: Some fixes for SVE INDEX constants

When using SVE INDEX to load an Advanced SIMD vector, we need to
take account of the different element ordering for big-endian
targets.  For example, when big-endian targets store the V4SI
constant { 0, 1, 2, 3 } in registers, 0 becomes the most
significant element, whereas INDEX always operates from the
least significant element.  A big-endian target would therefore
load V4SI { 0, 1, 2, 3 } using:

    INDEX Z0.S, #3, #-1

rather than little-endian's:

    INDEX Z0.S, #0, #1
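
A minimal sketch of the operand adjustment, assuming a linear series described
by its first element, its step and its element count.  The names
index_operands and index_for_series are hypothetical and do not appear in
aarch64.cc.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical pair of INDEX operands: base value and per-element step.
struct index_operands
{
  std::int64_t base;
  std::int64_t step;
};

// Given the series FIRST, FIRST + STEP, ... with NELTS elements in
// constant order, return the operands that produce the same register
// contents when INDEX counts from the least significant element.
constexpr index_operands
index_for_series (std::int64_t first, std::int64_t step, unsigned nelts,
                  bool big_endian)
{
  assert (nelts > 0);
  if (!big_endian)
    return {first, step};
  // On big-endian the last element of the constant is the least
  // significant one, so start there and walk backwards.
  return {first + step * (static_cast<std::int64_t> (nelts) - 1), -step};
}

static_assert (index_for_series (0, 1, 4, true).base == 3, "base is #3");
static_assert (index_for_series (0, 1, 4, true).step == -1, "step is #-1");
```

With V4SI { 0, 1, 2, 3 } this yields #3, #-1 for big-endian and #0, #1 for
little-endian, matching the two instructions above.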

While there, I noticed that we would only check the first vector
in a multi-vector SVE constant, which would trigger an ICE if the
other vectors turned out to be invalid.  This is pretty difficult to
trigger at the moment, since we only allow single-register modes to be
used as frontend & middle-end vector modes, but it can be seen using
the RTL frontend.

gcc/
* config/aarch64/aarch64.cc (aarch64_sve_index_series_p): New
function, split out from...
(aarch64_simd_valid_imm): ...here.  Account for the different
SVE and Advanced SIMD element orders on big-endian targets.
Check each vector in a structure mode.

gcc/testsuite/
* gcc.dg/rtl/aarch64/vec-series-1.c: New test.
* gcc.dg/rtl/aarch64/vec-series-2.c: Likewise.
* gcc.target/aarch64/sve/acle/general/dupq_2.c: Fix expected
output for this big-endian test.
* gcc.target/aarch64/sve/acle/general/dupq_4.c: Likewise.
* gcc.target/aarch64/sve/vec_init_3.c: Restrict to little-endian
targets and add more tests.
* gcc.target/aarch64/sve/vec_init_4.c: New big-endian version
of vec_init_3.c.

Make the RTL frontend set REG_NREGS correctly

While working on a new testcase that uses the RTL frontend,
I hit a bug where a (reg ...) that spans multiple hard registers
had REG_NREGS set to 1. This caused various things to misbehave.
For example, if the (reg ...) in question was used as crtl->return_rtx,
only the first register in the group would be marked as live on exit.

gcc/
* read-rtl-function.cc (function_reader::read_rtx_operand_r): Use
hard_regno_nregs to work out REG_NREGS for hard registers.

libiberty: add routines to handle type-sensitive doubly linked lists

The implementation of these methods relies on duck typing at compile
time.
The structure corresponding to a node of a doubly linked list needs
to define the attributes 'prev' and 'next', which are pointers to the
node type.
The structure wrapping the nodes and other metadata (first, last, size)
needs to define pointers 'first' and 'last' of the node's type, and
an integer type for 'size'.

Mutative methods can be bundled together and declared at once via a
single macro, or can be declared separately. The merge sort is bundled
separately.
There are 3 types of macros:
1. for the declaration of prototypes: to use in a header file for a
   public declaration, or as a forward declaration in the source file
   for private declaration.
2. for the declaration of the implementation: to use always in a
   source file.
3. for the invocation of the functions.

The methods can be declared either public or private via the second
argument of the declaration macros.

List of currently implemented methods:
- LINKED_LIST_*:
    - APPEND: insert a node at the end of the list.
    - PREPEND: insert a node at the beginning of the list.
    - INSERT_BEFORE: insert a node before the given node.
    - POP_FRONT: remove the first node of the list.
    - POP_BACK: remove the last node of the list.
    - REMOVE: remove the given node from the list.
    - SWAP: swap the two given nodes in the list.
- LINKED_LIST_MERGE_SORT: a merge sort implementation.
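
The duck-typing requirement can be illustrated without the macros themselves.
The sketch below is not the API of doubly-linked-list.h: it uses C++ templates
instead of macros, and the names item, item_list, list_append and
list_pop_front are hypothetical.  It only assumes the member names described
above ('prev', 'next', 'first', 'last', 'size').

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical node and wrapper types; only the member names matter.
struct item
{
  int value;
  item *prev;
  item *next;
};

struct item_list
{
  item *first;
  item *last;
  std::size_t size;
};

// Insert NODE at the end of LIST.  Works for any pair of types that
// provide the members used below.
template <typename List, typename Node>
void
list_append (List &list, Node *node)
{
  assert (node != nullptr);
  node->prev = list.last;
  node->next = nullptr;
  if (list.last)
    list.last->next = node;
  else
    list.first = node;
  list.last = node;
  ++list.size;
}

// Remove and return the first node of LIST, or a null pointer if the
// list is empty.
template <typename List>
auto
list_pop_front (List &list) -> decltype (list.first)
{
  auto *node = list.first;
  if (!node)
    return nullptr;
  list.first = node->next;
  if (list.first)
    list.first->prev = nullptr;
  else
    list.last = nullptr;
  node->prev = node->next = nullptr;
  --list.size;
  return node;
}
```

Any other node and wrapper structures with the same member names would work
with these two functions unchanged, which is the property the macros rely on.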

include/ChangeLog:

* doubly-linked-list.h: New file.

libiberty/ChangeLog:

* Makefile.in: Add new header.
* testsuite/Makefile.in: Add new test.
* testsuite/test-doubly-linked-list.c: New test.

RISC-V: Add test for vec_duplicate + vssub.vv combine case 1 with GR2VR cost 0, 1 and 2

Add asm dump check tests for the vec_duplicate + vssub.vv combine to
vssub.vx, with GR2VR costs of 0, 1 and 2.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add asm check.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i8.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Add test for vec_duplicate + vssub.vv combine case 0 with GR2VR cost 0, 2 and 15

Add asm dump check and run tests for the vec_duplicate + vssub.vv
combine to vssub.vx, with GR2VR costs of 0, 2 and 15.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add asm check.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_binary.h: Add test
helper macros.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_binary_data.h: Add test
data for run test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vssub-run-1-i16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vssub-run-1-i32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vssub-run-1-i64.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vssub-run-1-i8.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Combine vec_duplicate + vssub.vv to vssub.vx on GR2VR cost

This patch combines vec_duplicate + vssub.vv into vssub.vx, as in the
example below.  The related pattern depends on the cost of vec_duplicate
from GR2VR: late-combine performs the combination if the GR2VR cost is
zero, and rejects it if the GR2VR cost is greater than zero.

Assume we have example code like the below, with a GR2VR cost of 0.

  #define DEF_SAT_S_ADD(T, UT, MIN, MAX) \
  T                                      \
  test_##T##_sat_add (T x, T y)          \
  {                                      \
    T sum = (UT)x + (UT)y;               \
    return (x ^ y) < 0                   \
      ? sum                              \
      : (sum ^ x) >= 0                   \
        ? sum                            \
        : x < 0 ? MIN : MAX;             \
  }

  DEF_SAT_S_ADD(int32_t, uint32_t, INT32_MIN, INT32_MAX)
  DEF_VX_BINARY_CASE_2_WRAP(T, SAT_S_ADD_FUNC(T), sat_add)

Before this patch:
  test_vx_binary_or_int32_t_case_0:
      beq a3,zero,.L8
      vsetvli a5,zero,e32,m1,ta,ma
      vmv.v.x v2,a2
      slli    a3,a3,32
      srli    a3,a3,32
  .L3:
      vsetvli a5,a3,e32,m1,ta,ma
      vle32.v v1,0(a1)
      slli    a4,a5,2
      sub a3,a3,a5
      add a1,a1,a4
      vssub.vv v1,v1,v2
      vse32.v v1,0(a0)
      add a0,a0,a4
      bne a3,zero,.L3

After this patch:
  test_vx_binary_or_int32_t_case_0:
      beq a3,zero,.L8
      slli    a3,a3,32
      srli    a3,a3,32
  .L3:
      vsetvli a5,a3,e32,m1,ta,ma
      vle32.v v1,0(a1)
      slli    a4,a5,2
      sub a3,a3,a5
      add a1,a1,a4
      vssub.vx v1,v1,a2
      vse32.v v1,0(a0)
      add a0,a0,a4
      bne a3,zero,.L3
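
For reference, a scalar sketch of what the loop computes, assuming the 32-bit
signed saturating subtraction that vssub performs per element.  The function
names sat_sub and vx_sat_sub are hypothetical and are not taken from the
testsuite; the sketch does not claim to reproduce the assembly above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Signed saturating subtraction on one 32-bit element.
constexpr std::int32_t
sat_sub (std::int32_t x, std::int32_t y)
{
  // Compute in 64 bits so the intermediate difference cannot overflow.
  const std::int64_t wide = static_cast<std::int64_t> (x) - y;
  const std::int64_t lo = std::numeric_limits<std::int32_t>::min ();
  const std::int64_t hi = std::numeric_limits<std::int32_t>::max ();
  return static_cast<std::int32_t> (wide < lo ? lo : wide > hi ? hi : wide);
}

// Hypothetical loop: every element of IN is combined with the same
// scalar X, which is the vector-scalar shape a .vx form can express
// without first broadcasting X into a vector register.
void
vx_sat_sub (std::int32_t *out, const std::int32_t *in, std::int32_t x,
            std::size_t n)
{
  assert (n == 0 || (out != nullptr && in != nullptr));
  for (std::size_t i = 0; i < n; ++i)
    out[i] = sat_sub (in[i], x);
}
```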

gcc/ChangeLog:

* config/riscv/riscv-v.cc (expand_vx_binary_vec_vec_dup): Add
new case SS_MINUS.
* config/riscv/riscv.cc (riscv_rtx_costs): Ditto.
* config/riscv/vector-iterators.md: Add new op ss_minus.

Signed-off-by: Pan Li <pan2.li@intel.com>

[PATCH] RISC-V: Enable zvfh for vector-scalar half-float run tests

zvfh is not enabled at the testsuite level. It has to be enabled on a testcase
by testcase basis. This was correctly done for compile tests but not for run
tests. This patch fixes it.
Also, to ensure correct results with half-precision floats, MAX_RELATIVE_DIFF is
set according to the type.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vf_mulop_run.h: Set
MAX_RELATIVE_DIFF depending on type.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfmacc-run-1-f16.c: Enable zvfh.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfmadd-run-1-f16.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfmsac-run-1-f16.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfmsub-run-1-f16.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfnmacc-run-1-f16.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfnmadd-run-1-f16.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfnmsac-run-1-f16.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfnmsub-run-1-f16.c: Likewise.

[PATCH] RISC-V: Adjust testdata for unsigned vector SAT_SUB

This patch moves the test data for unsigned vector SAT_SUB to vec_sat_data.h.

Passed the rv64gcv regression test.

Signed-off-by: Ciyan Pan <panciyan@eswincomputing.com>
gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/sat/vec_sat_arith.h: Add vec_sat_u_sub_fmt wrap define.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_data.h: Add vec_sat_u_sub test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-1-u16.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-1-u32.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-1-u64.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-1-u8.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-10-u16.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-10-u32.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-10-u64.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-10-u8.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-2-u16.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-2-u32.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-2-u64.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-2-u8.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-3-u16.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-3-u32.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-3-u64.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-3-u8.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-4-u16.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-4-u32.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-4-u64.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-4-u8.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-5-u16.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-5-u32.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-5-u64.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-5-u8.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-6-u16.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-6-u32.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-6-u64.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-6-u8.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-7-u16.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-7-u32.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-7-u64.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-7-u8.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-8-u16.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-8-u32.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-8-u64.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-8-u8.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-9-u16.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-9-u32.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-9-u64.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-9-u8.c: Remove test data.

testsuite: Add a couple of fstack_protector guards

These tests required runtime support for -fstack-protector,
but didn't test for it.

gcc/testsuite/
* gcc.target/aarch64/pr118348_1.c: Require fstack_protector.
* gcc.target/aarch64/pr118348_2.c: Likewise.

ext-dce: Fix subreg_lsb is_constant assumption (2)

This patch fixes another instance of the problem described in the
cover note for g:bf3037e923e9f91d93ab64bdf73a37f64f659fb9.

gcc/
* ext-dce.cc (ext_dce_process_uses): Apply is_constant directly
to the subreg_lsb.

[PATCH] [PR target/109286] H8/300: Fix warnings about initfini sections missing attributes

The patch changes the order of inclusions, i.e. elfos.h is included before
the target-specific h8300/h8300.h, in a way similar to a few other targets.
Thanks to this change it is possible to override macros from elfos.h in
h8300/h8300.h, in particular the .init/.fini section definitions.

PR target/109286

gcc/ChangeLog:

* config.gcc: Include elfos.h before h8300/h8300.h.

* config/h8300/h8300.h (INIT_SECTION_ASM_OP): Override
default version from elfos.h.
(FINI_SECTION_ASM_OP): Ditto.
(ASM_DECLARE_FUNCTION_NAME): Ditto.
(ASM_GENERATE_INTERNAL_LABEL): Macro removed because it was
being overridden in elfos.h anyway.
(ASM_OUTPUT_SKIP): Ditto.

gimple-fold: extend vector simplification to match scalar bitwise optimizations [PR119196]

    Generalize existing scalar gimple_fold rules to apply the same
    bitwise comparison simplifications to vector types.  Previously, an
    expression like

        (x < y) && (x > y)

    would fold to `false` if x and y are scalars, but equivalent vector
    comparisons were left untouched.  This patch enables folding of
    patterns of the form

        (cmp x y) bit_and (cmp x y)
        (cmp x y) bit_ior (cmp x y)
        (cmp x y) bit_xor (cmp x y)

    for vector operands as well, ensuring consistent optimization across
    all data types.
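
    The scalar identities behind such folds can be checked exhaustively
    over a small integer range.  The sketch below is only an illustration
    for integers; the three identities are examples chosen here, not the
    list of rules in match.pd, and folds_hold is a hypothetical name.

```cpp
#include <cassert>

// Check, for all integer pairs in [LO, HI], that
//   (x < y) & (x > y)  is always false,
//   (x < y) | (x > y)  equals  x != y,
//   (x < y) ^ (x > y)  equals  x != y.
constexpr bool
folds_hold (int lo, int hi)
{
  assert (lo <= hi);
  for (int x = lo; x <= hi; ++x)
    for (int y = lo; y <= hi; ++y)
      {
        const bool lt = x < y;
        const bool gt = x > y;
        const bool ne = x != y;
        if (lt & gt)
          return false;
        if (bool (lt | gt) != ne)
          return false;
        if (bool (lt ^ gt) != ne)
          return false;
      }
  return true;
}

static_assert (folds_hold (-8, 8), "identities hold for small integers");
```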

gcc/ChangeLog:

PR tree-optimization/119196
* match.pd: Allow scalar optimizations with bitwise AND/OR/XOR to apply to vectors.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/vector-compare-5.c: Add new test for vector compare simplification.

Signed-off-by: Icen Zeyada <Icen.Zeyada2@arm.com>

tree-simplify: unify simple_comparison ops in vec_cond for bit and/or/xor [PR119196]

Merge simple_comparison patterns under a single vec_cond_expr for bit_and,
bit_ior, and bit_xor in the simplify pass.

Ensure that when both operands of a bit_and, bit_ior, or bit_xor are simple_comparison
results, they reside within the same vec_cond_expr rather than separate ones.
This prepares the AST so that subsequent transformations (e.g., folding the
comparisons if possible) can take effect.

gcc/ChangeLog:

PR tree-optimization/119196
* match.pd: Merge multiple vec_cond_expr in a single one for
bit_and, bit_ior and bit_xor.

Signed-off-by: Icen Zeyada <Icen.Zeyada2@arm.com>

[RISC-V][PR target/120642] Avoid propagating constant AVL for theadvector

AVL propagation currently assumes that it can propagate a constant AVL into any
vector insn and trips an assert if the insn fails to recognize after such a
propagation.

However, for xtheadvector that is not a correct assumption; xtheadvector does
not allow the vector length to be a constant integer (other than zero, which
is allowed via x0).

After consulting with Jin Ma (thanks!) we agree the right fix is to avoid
creating the immediate AVL for xtheadvector.

This has been tested in my tester, just waiting for the pre-commit tester to
spin it.

PR target/120642
gcc/
* config/riscv/riscv-avlprop.cc (pass_avlprop::execute): Do not do
constant AVL propagation for xtheadvector.

gcc/testsuite/
* gcc.target/riscv/rvv/xtheadvector/pr120642.c: New test.

libstdc++: Add smart ptr owner_equals and owner_hash [PR117403]

New structs and member functions added to C++26 by P1901R2.

libstdc++-v3/ChangeLog:

PR libstdc++/117403
* include/bits/shared_ptr.h (shared_ptr::owner_equal)
(shared_ptr::owner_hash, weak_ptr::owner_equal)
(weak_ptr::owner_hash): Define new member functions.
* include/bits/shared_ptr_base.h (owner_equal, owner_hash):
Define new structs.
* include/bits/version.def (smart_ptr_owner_equality): Define.
* include/bits/version.h: Regenerate.
* include/std/memory: Added define for
__glibcxx_want_smart_ptr_owner_equality.
* testsuite/20_util/owner_equal/version.cc: New test.
* testsuite/20_util/owner_equal/cmp.cc: New test.
* testsuite/20_util/owner_equal/noexcept.cc: New test.
* testsuite/20_util/owner_hash/cmp.cc: New test.
* testsuite/20_util/owner_hash/noexcept.cc: New test.
* testsuite/20_util/shared_ptr/observers/owner_equal.cc: New
test.
* testsuite/20_util/shared_ptr/observers/owner_hash.cc:
New test.
* testsuite/20_util/weak_ptr/observers/owner_equal.cc: New test.
* testsuite/20_util/weak_ptr/observers/owner_hash.cc: New test.

Signed-off-by: Paul Keir <paul.keir@uws.ac.uk>

libstdc++: Added missing members to numeric_limits specializations for integer-class types

[iterator.concept.winc]/11 says that std::numeric_limits should be
specialized for integer-class types, with each member defined
appropriately.

libstdc++-v3/ChangeLog:

* include/bits/max_size_type.h (numeric_limits<__max_size_type>):
New members.
(numeric_limits<__max_diff_type>): Likewise.
* testsuite/std/ranges/iota/max_size_type.cc: New test cases.

Signed-off-by: Mateusz Zych <mte.zych@gmail.com>

Avoid accessing STMT_VINFO_VECTYPE

The following fixes up two places we access STMT_VINFO_VECTYPE that's
not covered by the fixup in vect_analyze/transform_stmt to set that
from SLP_TREE_VECTYPE.

* tree-vect-loop.cc (vectorizable_reduction): Get the
output vector type from slp_for_stmt_info.
* tree-vect-stmts.cc (vect_analyze_stmt): Bail out earlier
for PURE_SLP_STMT when doing loop stmt analysis.

testsuite/120093 - fix gcc.dg/vect/pr101145.c

The following changes noinline to noipa to avoid having IPA-CP clones
confusing the vectorized loop counting.

PR testsuite/120093
* gcc.dg/vect/pr101145.c: Use noipa instead of noinline
attribute.

s390: Fix vector pattern tests for -m31.

Vectorization of int patterns requires a 64-bit long type (at least the
way the tests are coded). Fix this to only test for successful
vectorization on 64-bit targets.

Signed-off-by: Juergen Christ <jchrist@linux.ibm.com>
gcc/testsuite/ChangeLog:

* gcc.target/s390/vector/pattern-avg-1.c: Fix on -m31.
* gcc.target/s390/vector/pattern-mulh-1.c: Fix on -m31.
* gcc.target/s390/vector/pattern-mulh-2.c: Fix on -m31.

Improve afdo_adjust_guessed_profile

This patch makes afdo_adjust_guessed_profile more robust. Instead of using
the median of scales we compute a robust average where the weights are taken from
the execution count of the edge each scale originates from. I also added a cap,
since in some cases the scaling factor may end up being very large, introducing
artificial hottest regions of the program and confusing ipa-profile's
histogram-based cutoff.
This was the problem with roms.
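
As a rough illustration only, the idea of a weighted average with an upper
bound might look like the sketch below.  It is not the code in
auto-profile.cc: scale_sample, capped_weighted_scale and the way the cap is
passed in are assumptions, and a plain weighted mean stands in for the
"robust average" of the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical record: one observed scale and the weight (for example an
// edge execution count) it should contribute with.
struct scale_sample
{
  double scale;
  double weight;
};

// Weighted mean of the scales, limited to CAP.  Returns 1.0 (no scaling)
// when there is no usable weight; that fallback is an assumption.
double
capped_weighted_scale (const std::vector<scale_sample> &samples, double cap)
{
  assert (cap > 0);
  double num = 0;
  double den = 0;
  for (const scale_sample &s : samples)
    {
      num += s.scale * s.weight;
      den += s.weight;
    }
  if (den <= 0)
    return 1.0;
  return std::min (num / den, cap);
}
```

Unlike a median, this lets heavily executed edges dominate the result, while
the cap keeps a single extreme scale from inflating the whole region.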

Bootstrapped/regtested x86_64-linux, committed.

gcc/ChangeLog:

* auto-profile.cc (struct scale): New structure.
(add_scale): Also record weights.
(afdo_adjust_guessed_profile): Compute robust average
of scales and cap by max count in function.

Fix profile scaling in tree-inline.cc:initialize_cfun

initialize_cfun calls
profile_count::adjust_for_ipa_scaling (&num, &den);
but then the result is never used. This patch fixes it. Overall scaling
of entry/exit block is a bit sloppy in tree-inline. I will see if I can clean it up.

* tree-inline.cc (initialize_cfun): Use num and den for scaling.

Fix auto-profile.cc:get_original_name

There are two bugs in get_original_name.  First, the for loop walking the list of known
suffixes uses sizeof (suffixes).  It eventually walks to an empty suffix.
The second problem is that strcmp may accept suffixes that are longer, i.e.
mix up .isra with .israabc.  This is probably not a big deal, but the first
bug makes get_original_name effectively strip all suffixes, even important
ones, on my setup.
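
Both pitfalls can be shown in a small standalone sketch.  This is not the code
from auto-profile.cc: the suffix table, the function name is_known_suffix and
the rule that a suffix must be followed by '.' or the end of the string are
assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical suffix table; the real list may differ.
static const char *const suffixes[] = { ".isra", ".constprop", ".part" };

// Return true if S starts with one of the known suffixes and the match
// ends at a boundary (end of string or another '.').
bool
is_known_suffix (const char *s)
{
  assert (s != nullptr);
  // Walk the number of elements, not the size of the array in bytes.
  const std::size_t n = sizeof (suffixes) / sizeof (suffixes[0]);
  for (std::size_t i = 0; i < n; ++i)
    {
      const std::size_t len = std::strlen (suffixes[i]);
      // A plain prefix match would also accept ".israabc", so check the
      // character right after the match as well.
      if (std::strncmp (s, suffixes[i], len) == 0
          && (s[len] == '\0' || s[len] == '.'))
        return true;
    }
  return false;
}
```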

gcc/ChangeLog:

* auto-profile.cc (get_original_name): Fix loop walking the
suffixes.

libstdc++: Fix memory_resource.cc bootstrap failure for non-gthreads targets

The new choose_block_size function added in r16-2112-gac2fb60a67d6d1 was
defined inside an #ifdef _GLIBCXX_HAS_GTHREADS group, which means that
it's not available for single-threaded targets, and so can't be used by
unsynchronized_pool_resource. Move it before that preprocessor group so
it's always defined.

libstdc++-v3/ChangeLog:

* src/c++17/memory_resource.cc: Adjust indentation of unnamed
namespaces.
(pool_sizes): Add comment.
(choose_block_size): Move outside preprocessor group for
gthreads targets.
* testsuite/20_util/synchronized_pool_resource/118681.cc:
Require gthreads.

Fix 'main' function in 'gcc.dg/builtin-dynamic-object-size-pr120780.c'

Fix-up for commit 72e85d46472716e670cbe6e967109473b8d12d38
"tree-optimization/120780: Support object size for containing objects".
'size_t sz' is unused here, and GCC/nvptx doesn't accept this:

    spawn -ignore SIGHUP [...]/nvptx-none-run ./builtin-dynamic-object-size-pr120780.exe
    error   : Prototype doesn't match for 'main' in 'input file 1 at offset 1924', first defined in 'input file 1 at offset 1924'
    nvptx-run: cuLinkAddData failed: unknown error (CUDA_ERROR_UNKNOWN, 999)
    FAIL: gcc.dg/builtin-dynamic-object-size-pr120780.c execution test

gcc/testsuite/
* gcc.dg/builtin-dynamic-object-size-pr120780.c: Fix 'main' function.

arm: remove useless push/pop pragmas in arm_neon.h

Remove #pragma GCC target ("arch=armv8.2-a+bf16") since it matches the
preceding pragma GCC target and is thus useless.

gcc/ChangeLog:

* config/arm/arm_neon.h: Remove useless push/pop pragmas.

middle-end: Use rounding division for ranges for partial vectors [PR120922]

This patch adds support for niters ranges for partial
vector loops.

Due to the last iteration being partial the bounds should
be at least 1 but niters // vf as the max.

gcc/ChangeLog:

PR tree-optimization/120922
* tree-vect-loop-manip.cc (vect_gen_vector_loop_niters): Support range
for partial vectors.

middle-end: don't set range on partial vectors [PR120922]

Before the change in g:309dbcea2cabb31bde1a65cdfd30bb7f87b170a2 we would never
set a range for loops with a constant VF that require partial vectors.

I think a range could be set, since I think the number of latch executions is a
ceiling division of TYPE_MAX_VALUE / vf, to account for the partial iteration.
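
A minimal sketch of the ceiling division referred to here, using plain
unsigned integers rather than GCC's wide-int types.  The name ceil_div is
hypothetical, and the sketch shows only the arithmetic, not how the range is
recorded.

```cpp
#include <cassert>
#include <cstdint>

// Ceiling division written so that N + VF - 1 is never formed, which
// could wrap around when N is close to the maximum value of the type.
constexpr std::uint64_t
ceil_div (std::uint64_t n, std::uint64_t vf)
{
  assert (vf != 0);
  return n / vf + (n % vf != 0 ? 1 : 0);
}

// An 8-bit counter (maximum 255) with VF 16 needs 16 vector iterations;
// with a VF of 512, larger than the maximum itself, it still needs 1.
static_assert (ceil_div (255, 16) == 16, "partial last iteration counted");
static_assert (ceil_div (255, 512) == 1, "VF above the type maximum");
```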

This would also then deal with the ICE in the PR, where the chosen VF was
much higher than TYPE_MAX_VALUE and a mask is relied upon to make it safe.

Since the patch was supposed to not change behavior I've added an additional
partial vector check on the const_vf > 0 check to make it explicit that we only
set it on non-partial vectors (the alternative would have been to swap the order
of the vf.constant (&const_vf) check, but that would have hidden the requirement
sneakily).

The second patch adds support for ranges for partial masks.

gcc/ChangeLog:

PR tree-optimization/120922
* tree-vect-loop-manip.cc (vect_gen_vector_loop_niters): Don't set range
for partial vectors.

gcc/testsuite/ChangeLog:

PR tree-optimization/120922
* gcc.dg/vect/pr120922.c: New test.

libstdc++: Update some baseline_symbols.txt (x32)

* config/abi/post/x86_64-linux-gnu/x32/baseline_symbols.txt:
Updated.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>

RISC-V: Disable uint128_t testcase of SAT_MUL when rv32

rv32 doesn't support __uint128, so the tests fail with an
error like the one below.

error: '__int128' is not supported on this target.

Thus, we disable the uint128_t-related tests on rv32.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat/sat_arith.h: Add xlen check for
uint128_t.
* gcc.target/riscv/sat/sat_u_mul-run-1-u16-from-u128.c: Enable
run test for rv64 only.
* gcc.target/riscv/sat/sat_u_mul-run-1-u32-from-u128.c: Ditto.
* gcc.target/riscv/sat/sat_u_mul-run-1-u64-from-u128.c: Ditto.
* gcc.target/riscv/sat/sat_u_mul-run-1-u8-from-u128.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>

Daily bump.

libstdc++: Fix double free in new pool resource test [PR118681]

This was supposed to free p1 and p2, not free p2 twice.

libstdc++-v3/ChangeLog:

PR libstdc++/118681
* testsuite/20_util/unsynchronized_pool_resource/118681.cc: Fix
deallocate argument.

runtime: avoid libc memmove and memclr

The libc memmove and memclr don't reliably operate on full memory words.
We already avoided them on PPC64, but the same problem can occur even
on x86, where some processors use "rep movsb" and "rep stosb".
Always use C code that stores full memory words.

While we're here, clean up the C code. We don't need special handling
if the memmove/memclr pointers are not pointer-aligned.

Unfortunately, this will likely be slower. Perhaps some day we can
have our own assembly code that operates a word at a time,
or we can use different operations when we know there are no pointers.

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/685178

syscall: pass correct pointer to system call in recvmsgRaw

The code in recvmsgRaw, introduced in https://go.dev/cl/384695,
incorrectly passed &rsa to the recvmsg system call.
But in recvmsgRaw rsa is already a pointer passed by the caller.
This change passes the correct pointer.

I'm guessing that this didn't show up in the testsuite because
we run the tests in short mode.

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/685177

libstdc++: Ensure pool resources meet alignment requirements [PR118681]

For allocations with size > alignment and size % alignment != 0 we were
sometimes returning pointers that did not meet the requested alignment.
For example, allocate(24, 16) would select the pool for 24-byte objects
and the second allocation from that pool (at offset 24 bytes into the
pool) is only 8-byte aligned not 16-byte aligned.

The pool resources need to round up the requested allocation size to a
multiple of the alignment, so that the selected pool will always return
allocations that meet the alignment requirement.
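
The rounding step can be sketched as follows.  This is not the actual
choose_block_size from memory_resource.cc: round_up_to_alignment is a
hypothetical name, and the sketch assumes the alignment is a power of two.

```cpp
#include <cassert>
#include <cstddef>

// Round BYTES up to the next multiple of ALIGNMENT, which must be a
// nonzero power of two.
constexpr std::size_t
round_up_to_alignment (std::size_t bytes, std::size_t alignment)
{
  assert (alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (bytes + alignment - 1) & ~(alignment - 1);
}

// allocate(24, 16): a 24-byte block size puts the second block at offset
// 24, which is not a multiple of 16; a 32-byte block size puts it at 32.
static_assert (24 % 16 != 0, "second 24-byte block is misaligned");
static_assert (round_up_to_alignment (24, 16) == 32, "rounded block size");
static_assert (round_up_to_alignment (24, 16) % 16 == 0, "now aligned");
```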

libstdc++-v3/ChangeLog:

PR libstdc++/118681
* src/c++17/memory_resource.cc (choose_block_size): New
function.
(synchronized_pool_resource::do_allocate): Use choose_block_size
to determine appropriate block size.
(synchronized_pool_resource::do_deallocate): Likewise.
(unsynchronized_pool_resource::do_allocate): Likewise.
(unsynchronized_pool_resource::do_deallocate): Likewise.
* testsuite/20_util/synchronized_pool_resource/118681.cc: New
test.
* testsuite/20_util/unsynchronized_pool_resource/118681.cc: New
test.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>

c++: bogus error with union in qualified name [PR83469]

While working on Reflection I noticed that we reject:

  union U { int i; };
  constexpr auto r = ^^typename ::U;

which is due to PR83469.  Andrew P. posted a patch in 2021:
https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586344.html
for which I had some comments but an updated patch never came.

~~
There are a few issues here with typenames and unions (and even struct
keywords with unions). First in cp_parser_check_class_key,
we need to allow typenames to name union types and union key
to be able to use with typenames.

The next issue is we need to record if we had a union key,
right now we just record it was a struct/class/typename one
which is wrong.
~~

This patch is an updated and cleaned up version; I've also addressed
a missing bit in pt.cc.

PR c++/83469
PR c++/93809

gcc/cp/ChangeLog:

* cp-tree.h (UNION_TYPE_P): Define.
(TYPENAME_IS_UNION_P): Define.
* decl.cc (struct typename_info): Add union_p field.
(struct typename_hasher::equal): Compare union_p field.
(build_typename_type): Use ti.union_p for union_type.  Set
TYPENAME_IS_UNION_P.
* error.cc (dump_type) <case TYPENAME_TYPE>: Handle
TYPENAME_IS_UNION_P.
* module.cc (trees_out::type_node): Likewise.
* parser.cc (cp_parser_check_class_key): Allow typename key for union
types and allow union keyword for typename types.
* pt.cc (tsubst) <case TYPENAME_TYPE>: Don't conflate unions with
class_type.  For TYPENAME_IS_CLASS_P, check NON_UNION_CLASS_TYPE_P
rather than CLASS_TYPE_P.  Add TYPENAME_IS_UNION_P handling.

gcc/testsuite/ChangeLog:

* g++.dg/template/error45.C: Adjust dg-error.
* g++.dg/warn/Wredundant-tags-3.C: Remove xfail.
* g++.dg/parse/union1.C: New test.
* g++.dg/parse/union2.C: New test.
* g++.dg/parse/union3.C: New test.
* g++.dg/parse/union4.C: New test.
* g++.dg/parse/union5.C: New test.
* g++.dg/parse/union6.C: New test.

Co-authored-by: Andrew Pinski <quic_apinski@quicinc.com>
Reviewed-by: Jason Merrill <jason@redhat.com>

xtensa: Fix B[GE/LT]UI instructions with immediate values of 32768 or 65536 not being emitted

This is because in canonicalize_comparison() in gcc/expmed.cc, the COMPARE
rtx_cost() for the immediate values in the title does not change between
the old and new versions. This patch fixes that.

(note: Currently, this patch only works if some constant propagation
optimizations are enabled (-O2 or higher) or if bare large constant
assignments are possible (-mconst16 or -mauto-litpools). In the future
I hope to make it work at -O1...)

gcc/ChangeLog:

* config/xtensa/xtensa.cc (xtensa_b4const_or_zero):
Remove.
(xtensa_b4const): Add a case where the value is 0, and rename
to xtensa_b4const_or_zero.
(xtensa_rtx_costs): Fix to also consider the result of
xtensa_b4constu().

gcc/testsuite/ChangeLog:

* gcc.target/xtensa/BGEUI-BLTUI-32k-64k.c: New.

libstdc++: Fix _GLIBCXX_DEBUG std::forward_list build regression

Commit 2fd6f42c17a8040dbd3460ca34d93695dacf8575 broke _GLIBCXX_DEBUG
std::forward_list implementation.

libstdc++-v3/ChangeLog:

* include/debug/forward_list (_Safe_forward_list<>::_M_swap):
Adapt to _M_this() signature change.

c++: Implement part of C++26 P2686R4 - constexpr structured bindings [PR117784]

The following patch implements the constexpr structured bindings part of
the P2686R4 paper, so the [dcl.pre], [dcl.struct.bind], [dcl.constinit]
and first hunk in [dcl.constexpr] changes.
The paper doesn't have a feature test macro and the constexpr structured
binding part of it seems more or less self-contained, so I think it is useful
to get this in independently from the rest.
Of course, automatic constexpr/constinit structured bindings in the
tuple cases or automatic constexpr/constinit structured bindings with auto &
will not really work for now.
Another reason for the split is that for C++ < 26, I think what the patch
implements is basically what the users will see, i.e. we can accept
constexpr or constinit structured binding with pedwarn, but I think we can't
change the constant expression rules in C++ < 26.

I plan to look at the rest of the paper.

2025-07-08 Jakub Jelinek <jakub@redhat.com>

PR c++/117784
* decl.cc: Implement part of C++26 P2686R4 - constexpr structured
bindings.
(cp_finish_decl): Pedwarn for C++23 and older on constinit on
structured bindings except for static/thread_local where it uses
earlier error.
(grokdeclarator): Pedwarn on constexpr structured bindings for
C++23 and older instead of emitting error always, don't clear
constexpr_p in that case.
* parser.cc (cp_parser_decomposition_declaration): Copy over
DECL_DECLARED_CONSTEXPR_P and DECL_DECLARED_CONSTINIT_P flags.

* g++.dg/cpp1z/decomp3.C (test): For constexpr structured binding
initialize from constexpr var instead of non-constexpr and expect
just a pedwarn for C++23 and older instead of error always.
* g++.dg/cpp26/decomp9.C (foo): Likewise.
* g++.dg/cpp26/decomp22.C: New test.
* g++.dg/cpp26/decomp23.C: New test.
* g++.dg/cpp26/decomp24.C: New test.
* g++.dg/cpp26/decomp25.C: New test.

libstdc++: Do not expose set_brackets/set_separator for formatter with format_kind other than sequence [PR119861]

The standard defines separate specializations of range-default-formatter, of
which only the one for range_format::sequence provides the set_brackets and
set_separator methods. We implemented it as one specialization and exposed
these methods for any range_format other than string or debug_string, i.e. when
range_formatter was used as the underlying formatter.

PR libstdc++/119861

libstdc++-v3/ChangeLog:

* include/std/format (formatter<_Rg, _CharT>::set_separator)
(formatter<_Rg, _CharT>::set_brackets): Constrain with
(format_kind<_Rg> == range_format::sequence).
* testsuite/std/format/ranges/pr119861_neg.cc: New test.

libstdc++: Better CTAD for span and mdspan [PR120914].

This implements P3029R1. In P3029R1, the CTAD for span is refined to
permit deducing the extent of the span from an integral constant, e.g.

  span((T*) ptr, integral_constant<size_t, 5>{});

is deduced as span<T, 5>. Similarly, in

  auto exts = extents(integral_constant<int, 2>{});
  auto md = mdspan((T*) ptr, integral_constant<int, 2>{});

exts and md have types extents<size_t, 2> and mdspan<T,
extents<size_t, 2>>, respectively.

PR libstdc++/120914

libstdc++-v3/ChangeLog:

* include/std/span (span): Update CTAD to enable
integral constants [P3029R1].
* include/std/mdspan (extents): ditto.
(mdspan): ditto.
* testsuite/23_containers/span/deduction.cc: Test deduction
guide.
* testsuite/23_containers/mdspan/extents/misc.cc: ditto.
* testsuite/23_containers/mdspan/mdspan.cc: ditto.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com>

s390: Always compute address of stack protector guard

Computing the address of the thread pointer on s390 involves multiple
instructions and therefore bears the risk that the address of the canary
or intermediate values of it are spilled after prologue in order to be
reloaded for the epilogue.  Since there exists no mechanism to ensure
that a value is not coming from stack, as a precaution compute the
address always twice, i.e., one time for the prologue and one time for
the epilogue.  Note, even if there were such a mechanism, emitting
optimal code is non-trivial since there exist cases with opposing
requirements as e.g. if the thread pointer is not only computed for the
TLS guard but also for other TLS objects.  For the latter accesses it is
desired to spill and reload the thread pointer instead of recomputing it
whereas for the former it is not.

gcc/ChangeLog:

* config/s390/s390.md (stack_protect_get_tpsi): New insn.
(stack_protect_get_tpdi): New insn.
(stack_protect_set): Use new insn.
(stack_protect_test): Use new insn.

gcc/testsuite/ChangeLog:

* gcc.target/s390/stack-protector-guard-tls-1.c: New test.

libstdc++: Silence a warning in a test for span.

In a test of span, there's an unused variable myspan. This
commit silences the warning.

libstdc++-v3/ChangeLog:

* testsuite/23_containers/span/contiguous_range_neg.cc: Silence
warning about unused variable myspan.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com>

Avoid IPA opts around guality plumbing

The following avoids inlining the actual main() (renamed to
guality_main) into the guality plumbing. This can cause
jump threading opportunities to appear and generally increase
the chance that what we actually test isn't what we think. Likewise
make guality_check noipa instead of just noinline.

gcc/testsuite/
* gcc.dg/guality/guality.h (guality_main): Declare noipa.
(guality_check): Likewise.

RISC-V: Do not use vsetivli for THeadVector.

In emit_vlmax_insn_lra we use a vsetivli for an immediate AVL.
XTHeadVector does not support this, so guard appropriately.

PR target/120461

gcc/ChangeLog:

* config/riscv/riscv-v.cc (emit_vlmax_insn_lra): Do not emit
vsetivli for XTHeadVector.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/xtheadvector/pr120461.c: New test.

RISC-V: Ignore non-types in builtin function hash.

If a user passes a string that doesn't represent a variable we still try
to compute a hash for its type. Its tree does not represent a type, though,
just an exceptional node. This patch just ignores it, leaving the
error to the checking code later.

PR target/113829

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins.cc (registered_function::overloaded_hash):
Skip non-type arguments.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr113829.c: New test.

libstdc++: Set feature test macro for complete C++23 mdspan [PR107761].

PR libstdc++/107761

libstdc++-v3/ChangeLog:

* include/bits/version.def (mdspan): Set to 202207 and remove
no_stdname.
* include/bits/version.h: Regenerate.
* testsuite/23_containers/mdspan/version.cc: Test presence
of feature test macro.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com>

[PATCH] riscv: allow zero in zacas subword atomic cas

gcc:
PR target/120995
* config/riscv/sync.md (zacas_atomic_cas_value_strong<mode>):
Allow op3 to be zero.

gcc/testsuite:
PR target/120995
* gcc.target/riscv/amo/zabha-zacas-atomic-cas.c: New test.

libstdc++: Implement mdspan and tests [PR107761].

Implements the class mdspan as described in N4950, i.e. without P3029.
It also adds tests for mdspan. This commit completes the implementation
of P0009, i.e. the C++23 part of <mdspan>.

PR libstdc++/107761

libstdc++-v3/ChangeLog:

* include/std/mdspan (mdspan): New class.
* src/c++23/std.cc.in (mdspan): Add.
* testsuite/23_containers/mdspan/class_mandate_neg.cc: New test.
* testsuite/23_containers/mdspan/mdspan.cc: New test.
* testsuite/23_containers/mdspan/layout_like.h: Add class
LayoutLike which models a user-defined layout.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com>

c++: testsuite tweak

My r16-2065 added error lines to this test but left it as dg-do run,
resulting in UNRESOLVED lines from the testsuite.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/lambda-generic-variadic.C: Change to 'compile'.

libstdc++: Implement __mdspan::__size.

The current code uses __mdspan::__fwd_prod(__exts, __rank) to express
computing the size of an extent. This commit adds a function
__mdspan::__size(__exts) to express the idea more directly.
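
As a rough illustration of the relationship, the sketch below uses a plain
std::array in place of the library's extents type. The names fwd_prod_sketch
and size_sketch are hypothetical stand-ins and are not the libstdc++
internals.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for __mdspan::__fwd_prod: product of the first r
// extents (the empty product is 1).
template <std::size_t N>
constexpr std::size_t
fwd_prod_sketch(const std::array<std::size_t, N>& exts, std::size_t r)
{
  std::size_t prod = 1;
  for (std::size_t i = 0; i < r; ++i)
    prod *= exts[i];
  return prod;
}

// Hypothetical stand-in for __mdspan::__size: the product over all ranks,
// which states the intent without the caller passing the rank.
template <std::size_t N>
constexpr std::size_t
size_sketch(const std::array<std::size_t, N>& exts)
{ return fwd_prod_sketch(exts, N); }

static_assert(size_sketch(std::array<std::size_t, 3>{2, 3, 4}) == 24, "");
```

For a 2 x 3 x 4 extent both spellings give 24; the dedicated function only
saves the caller from repeating the rank.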

libstdc++-v3/ChangeLog:

* include/std/mdspan (__mdspan::__size): New function.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com>

libstdc++: Restructure mdspan tests to reuse IntLike.

The class IntLike is used for testing extents with user-defined classes
that convert to int. This commit places the class into a separate header
file. This allows it to be reused across different parts of the mdspan
related testsuite.

libstdc++-v3/ChangeLog:

* testsuite/23_containers/mdspan/extents/custom_integer.cc:
Delete IntLike and include "int_like.h".
* testsuite/23_containers/mdspan/extents/int_like.h: Add
IntLike.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com>

libstdc++: Check prerequisite of extents::extents.

Previously the prerequisite of the extents ctors that

static_extent(i) == dynamic_extent || extent(i) == other.extent(i).

was not checked. This commit adds __glibcxx_assert checks and tests them.
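
The condition can be sketched outside the library as follows. This is only an
illustration under the assumption that extents are modelled as std::array
values; extents_ctor_precondition_sketch and dyn_ext_sketch are hypothetical
names. For a static extent, extent(i) equals static_extent(i), so comparing
the static extent against other.extent(i) is sufficient here.

```cpp
#include <array>
#include <cstddef>
#include <limits>

// Same value as std::dynamic_extent, spelled out so the sketch does not
// need <span>.
constexpr std::size_t dyn_ext_sketch
  = std::numeric_limits<std::size_t>::max();

// Hypothetical helper: static_exts holds the static extents of the object
// being constructed, other_exts the run-time extents of the source.
template <std::size_t N>
constexpr bool
extents_ctor_precondition_sketch(
    const std::array<std::size_t, N>& static_exts,
    const std::array<std::size_t, N>& other_exts)
{
  for (std::size_t i = 0; i < N; ++i)
    if (static_exts[i] != dyn_ext_sketch && static_exts[i] != other_exts[i])
      return false; // a static extent disagrees with the source extent
  return true;
}
```

With static extents {3, dynamic}, a source of {3, 7} satisfies the
prerequisite while a source of {4, 7} does not.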

libstdc++-v3/ChangeLog:

* include/std/mdspan (extents): Check prerequisite of the ctor that
static_extent(i) == dynamic_extent || extent(i) == other.extent(i).
* testsuite/23_containers/mdspan/extents/class_mandates_neg.cc:
Test the implemented prerequisite.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com>

libstdc++: Check prerequisites of layout_*::operator().

Previously, the prerequisite that the arguments passed to operator() are
a multi-dimensional index (of extents()) was not checked.

Both mapping::operator() and mdspan::operator[] have the same
prerequisite. Since mdspan must check the prerequisite for user-defined
layout mappings, the preference is to check in mdspan.

Because out-of-bounds accesses are very common, it's nevertheless useful
to check the prerequisite in mapping::operator(). This is relevant for
cases where the layout mappings are used without mdspan. This commit
checks the prerequisites via _GLIBCXX_DEBUG_ASSERTs and adds the required
tests.
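
To show why the check is worthwhile even at the mapping level, here is a
minimal sketch of a row-major mapping over std::array extents. The helper
names are hypothetical and the code is not the libstdc++ implementation; it
only illustrates what "multi-dimensional index of extents()" means.

```cpp
#include <array>
#include <cstddef>

// True if idx[r] lies in [0, exts[r]) for every rank r, i.e. idx is a
// multi-dimensional index of exts.
template <std::size_t N>
constexpr bool
is_multidim_index_sketch(const std::array<std::size_t, N>& exts,
                         const std::array<std::size_t, N>& idx)
{
  for (std::size_t r = 0; r < N; ++r)
    if (idx[r] >= exts[r])
      return false;
  return true;
}

// Row-major (layout_right-style) offset. The result is only meaningful
// when is_multidim_index_sketch(exts, idx) holds.
template <std::size_t N>
constexpr std::size_t
row_major_offset_sketch(const std::array<std::size_t, N>& exts,
                        const std::array<std::size_t, N>& idx)
{
  std::size_t off = 0;
  for (std::size_t r = 0; r < N; ++r)
    off = off * exts[r] + idx[r];
  return off;
}
```

For extents {2, 3}, the out-of-bounds index {0, 3} yields offset 3, which is
the offset of the valid element {1, 0}. Without an assertion in the mapping,
such an access silently aliases another element.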

More discussion in the email chain starting at:

https://gcc.gnu.org/pipermail/libstdc++/2025-July/062265.html

libstdc++-v3/ChangeLog:

* include/std/mdspan: Check prerequisites of
layout_*::operator() with _GLIBCXX_DEBUG_ASSERTs.
* testsuite/23_containers/mdspan/layouts/debug/out_of_bounds_neg.cc:
Add tests for prerequisites.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com>

Handle non default git prefix configurations

Mklog parses the diff content from the prepare-commit-msg hook but fails
when git has been configured with mnemonicPrefix. Forcing the default
values for the prefixes would set a distinct diff configuration supported
by mklog and prevent most failures.

contrib/ChangeLog:

* prepare-commit-msg: Force default git prefixes.

Signed-off-by: Pierre-Emmanuel Patry <pierre-emmanuel.patry@embecosm.com>

testsuite: i386: Fix gcc.target/i386/memcpy-pr120683-1.c etc. on Solaris/x86

The new tests from

commit 401199377c50045ede560daf3f6e8b51749c2a87
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Tue Jun 17 10:17:17 2025 +0800

    x86: Improve vector_loop/unrolled_loop for memset/memcpy

FAIL on 64-bit Solaris/x86:

FAIL: gcc.target/i386/memcpy-pr120683-1.c check-function-bodies foo
FAIL: gcc.target/i386/memcpy-pr120683-2.c check-function-bodies foo
FAIL: gcc.target/i386/memcpy-pr120683-3.c check-function-bodies foo
FAIL: gcc.target/i386/memcpy-pr120683-4.c check-function-bodies foo
FAIL: gcc.target/i386/memcpy-pr120683-5.c check-function-bodies foo
FAIL: gcc.target/i386/memcpy-pr120683-6.c check-function-bodies foo
FAIL: gcc.target/i386/memcpy-pr120683-7.c check-function-bodies foo
FAIL: gcc.target/i386/memcpy-strategy-12.c check-function-bodies foo
FAIL: gcc.target/i386/memset-pr120683-1.c check-function-bodies foo
FAIL: gcc.target/i386/memset-pr120683-10.c check-function-bodies foo
FAIL: gcc.target/i386/memset-pr120683-11.c check-function-bodies foo
FAIL: gcc.target/i386/memset-pr120683-12.c check-function-bodies foo
FAIL: gcc.target/i386/memset-pr120683-13.c check-function-bodies foo
FAIL: gcc.target/i386/memset-pr120683-14.c check-function-bodies foo
FAIL: gcc.target/i386/memset-pr120683-15.c check-function-bodies foo
FAIL: gcc.target/i386/memset-pr120683-16.c check-function-bodies foo
FAIL: gcc.target/i386/memset-pr120683-17.c check-function-bodies foo
FAIL: gcc.target/i386/memset-pr120683-18.c check-function-bodies foo
FAIL: gcc.target/i386/memset-pr120683-19.c check-function-bodies foo
FAIL: gcc.target/i386/memset-pr120683-2.c check-function-bodies foo
FAIL: gcc.target/i386/memset-pr120683-20.c check-function-bodies foo
FAIL: gcc.target/i386/memset-pr120683-21.c check-function-bodies foo
FAIL: gcc.target/i386/memset-pr120683-22.c check-function-bodies foo
FAIL: gcc.target/i386/memset-pr120683-23.c check-function-bodies foo
FAIL: gcc.target/i386/memset-pr120683-3.c check-function-bodies foo
FAIL: gcc.target/i386/memset-pr120683-4.c check-function-bodies foo
FAIL: gcc.target/i386/memset-pr120683-5.c check-function-bodies foo
FAIL: gcc.target/i386/memset-pr120683-6.c check-function-bodies foo
FAIL: gcc.target/i386/memset-pr120683-7.c check-function-bodies foo
FAIL: gcc.target/i386/memset-pr120683-8.c check-function-bodies foo
FAIL: gcc.target/i386/memset-pr120683-9.c check-function-bodies foo

Like several times before, they need to be compiled with
-fasynchronous-unwind-tables -fdwarf2-cfi-asm.

Tested on i386-pc-solaris2.11 and x86_64-pc-linux-gnu.

2025-07-08  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

gcc/testsuite:
* gcc.target/i386/memcpy-pr120683-1.c (dg-options): Add
-fasynchronous-unwind-tables -fdwarf2-cfi-asm.
* gcc.target/i386/memcpy-pr120683-2.c: Likewise.
* gcc.target/i386/memcpy-pr120683-3.c: Likewise.
* gcc.target/i386/memcpy-pr120683-4.c: Likewise.
* gcc.target/i386/memcpy-pr120683-5.c: Likewise.
* gcc.target/i386/memcpy-pr120683-6.c: Likewise.
* gcc.target/i386/memcpy-pr120683-7.c: Likewise.
* gcc.target/i386/memcpy-strategy-12.c: Likewise.
* gcc.target/i386/memset-pr120683-1.c: Likewise.
* gcc.target/i386/memset-pr120683-10.c: Likewise.
* gcc.target/i386/memset-pr120683-11.c: Likewise.
* gcc.target/i386/memset-pr120683-12.c: Likewise.
* gcc.target/i386/memset-pr120683-13.c: Likewise.
* gcc.target/i386/memset-pr120683-14.c: Likewise.
* gcc.target/i386/memset-pr120683-15.c: Likewise.
* gcc.target/i386/memset-pr120683-16.c: Likewise.
* gcc.target/i386/memset-pr120683-17.c: Likewise.
* gcc.target/i386/memset-pr120683-18.c: Likewise.
* gcc.target/i386/memset-pr120683-19.c: Likewise.
* gcc.target/i386/memset-pr120683-2.c: Likewise.
* gcc.target/i386/memset-pr120683-20.c: Likewise.
* gcc.target/i386/memset-pr120683-21.c: Likewise.
* gcc.target/i386/memset-pr120683-22.c: Likewise.
* gcc.target/i386/memset-pr120683-23.c: Likewise.
* gcc.target/i386/memset-pr120683-3.c: Likewise.
* gcc.target/i386/memset-pr120683-4.c: Likewise.
* gcc.target/i386/memset-pr120683-5.c: Likewise.
* gcc.target/i386/memset-pr120683-6.c: Likewise.
* gcc.target/i386/memset-pr120683-7.c: Likewise.
* gcc.target/i386/memset-pr120683-8.c: Likewise.
* gcc.target/i386/memset-pr120683-9.c: Likewise.

s390: Split tests for 31bit support

The new vector pattern tests used int128 without a guard. This causes
failures on 31bit targets. Split the tests such that the tests
requiring 128 bit support are only executed on targets supporting
them.

Signed-off-by: Juergen Christ <jchrist@linux.ibm.com>
gcc/testsuite/ChangeLog:

* gcc.target/s390/vector/pattern-avg-1.c: Split test.
* gcc.target/s390/vector/pattern-mulh-1.c: Split test.
* gcc.target/s390/vector/pattern-avg-2.c: New test.
* gcc.target/s390/vector/pattern-mulh-2.c: New test.

libstdc++: Document that LWG 3881 is resolved, by using different approach.

libstdc++-v3/ChangeLog:

* include/std/queue (formatter<queue<_Tp, _Container>, _CharT>)
(formatter<priority_queue<_Tp, _Container, _Compare>, _CharT>):
Add _GLIBCXX_RESOLVE_LIB_DEFECTS comments.

add masked-epilogue tuning

The following adds an x86 tuning to enable the use of AVX512 masked
epilogues in cases where we heuristically determine them to be very likely
not detrimental.  Basically problematic cases are when there are
data streams that are both stored and loaded from and an outer loop
could end up executing only the inner loop masked epilogue and with
unlucky data stream advancement from the outer loop end up needing
to forward from masked stores to masked loads.  This isn't very
well handled, esp. for the case where unmasked operations would
not need to forward at all - that is, when forwarding completely
from the masked out portion of the store (like the AVX upper half
to the AVX lower half of a load).  There's also the case where
the number of iterations is known at compile time, only with
cost comparing we'd consider a non-masked epilog - as we are not
doing that we have to add heuristics to avoid masking when a
single vector epilog iteration would cover all scalar iterations
left (this is exercised by gcc.target/i386/pr110310.c).

SPEC CPU 2017 shows 3% text size savings over not using masked
epilogues with performance impact in the noise.  Masking all vector
epilogues gets that to 4% text size savings with some major
runtime regressions in 503.bwaves_r and 527.cam4_r
(measured on a Zen4 system), we're leaving a 5% improvement
for 549.fotonik3d_r unrealized with the implemented heuristic.

With the heuristics we turn 22513 vector epilogues + up to 12305 scalar
epilogues into 12305 masked vector epilogues of which 574 are for
AVX vector sizes, 79 for SSE vector sizes and the rest for AVX512.
When masking all epilogues we get 14567 of them from
29467 vector + up to 14567 scalar epilogues, so the heuristics disable
an additional 20% of masked epilogues.

* config/i386/x86-tune.def (X86_TUNE_AVX512_MASKED_EPILOGUES):
New tunable, default on for m_ZNVER4 and m_ZNVER5.
* config/i386/i386.cc (ix86_vector_costs::finish_cost): With
X86_TUNE_AVX512_MASKED_EPILOGUES and when the main loop
had a vectorization factor > 2 use a masked epilogue when
possible and when not obviously problematic.

* gcc.target/i386/vect-mask-epilogue-1.c: New testcase.
* gcc.target/i386/vect-mask-epilogue-2.c: Likewise.
* gcc.target/i386/vect-epilogues-3.c: Adjust.

Allow the target to request a masked vector epilogue

Targets recently got the ability to request the vector mode to be
used for a vector epilogue (or the epilogue of a vector epilogue). The
following adds the ability for it to indicate the epilogue should use
loop masking, irrespective of the --param vect-partial-vector-usage
default setting.

The patch below uses a separate flag from the epilogue mode, not
addressing the issue that on x86 the vector_modes mode iteration
hook would not allow for both masked and unmasked variants to be
tried and costed given this doesn't naturally map to modes on
that target. That's left for a future exercise - turning on
cost comparison for the x86 backend would be a prerequisite there.

* tree-vectorizer.h (vector_costs::suggested_epilogue_mode):
Add masked output parameter and return m_masked_epilogue.
(vector_costs::m_masked_epilogue): New tristate flag.
(vector_costs::vector_costs): Initialize m_masked_epilogue.
* tree-vect-loop.cc (vect_analyze_loop_1): Pass in masked
flag to optionally initialize can_use_partial_vectors_p.
(vect_analyze_loop): For epilogues also get whether to use
a masked epilogue for this loop from the target and use
that for the first epilogue mode we try.

Fortran: Ensure finalizers are created correctly [PR120637]

Finalize_component freed an expression that it used to remember which
components in which context it had finalized already. While it makes
sense to free the copy of the expression if it is unused, doing so causes
issues when comparing against a nonexistent expression. This is now
detected by returning true when the expression has been used.

PR fortran/120637

gcc/fortran/ChangeLog:

* class.cc (finalize_component): Return true when a finalizable
component was detected and do not free it.

gcc/testsuite/ChangeLog:

* gfortran.dg/asan/finalize_1.f90: New test.

tree-optimization/120358 - bogus PTA with structure access

When we compute the constraint for something like
MEM[(const struct QStringView &)&tok2 + 32] we go and compute
what (const struct QStringView &)&tok2 + 32 points to and then
add subvariables to its dereference that possibly fall in the
range of the access according to the original refs size. In
doing that we disregarded that the subvariable the starting
address points to might not be aligned to it and thus the
access might start at any point within that variable. The following
conservatively adjusts the pruning of adjacent sub-variables to
honor this.

PR tree-optimization/120358
* tree-ssa-structalias.cc (get_constraint_for_1): Adjust
pruning of sub-variables according to the imprecise
known start offset.

libstdc++: Make debug iterator pointer sequence const [PR116369]

In revision a35dd276cbf6236e08bcf6e56e62c2be41cf6e3c the debug sequence
was made mutable to allow attaching iterators to const containers.
This change completes the fix by also declaring the debug unordered container
members mutable.

Additionally, the debug iterator sequence is now a pointer-to-const, so
_Safe_sequence_base _M_attach and all other methods are const qualified.
The exported non-const methods are preserved for ABI backward compatibility.
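
The general idea can be sketched with two small classes. This is a simplified
illustration with hypothetical names (SequenceBaseSketch, IterBaseSketch), not
the actual _Safe_sequence_base/_Safe_iterator_base code: the iterator list is
bookkeeping rather than logical container state, so it is declared mutable and
the attach operation can be const.

```cpp
#include <cassert> // for the usage checks described after the block

struct IterBaseSketch;

struct SequenceBaseSketch
{
  // Bookkeeping only: mutable so that iterators can register themselves
  // with a const container.
  mutable IterBaseSketch* first_iterator = nullptr;
  mutable int attached = 0;

  void attach(IterBaseSketch* it) const;
};

struct IterBaseSketch
{
  const SequenceBaseSketch* sequence = nullptr; // pointer-to-const
  IterBaseSketch* next = nullptr;

  void attach_to(const SequenceBaseSketch* seq)
  {
    sequence = seq;
    seq->attach(this); // allowed: attach is const qualified
  }
};

inline void SequenceBaseSketch::attach(IterBaseSketch* it) const
{
  it->next = first_iterator; // push onto the intrusive list
  first_iterator = it;
  ++attached;
}
```

With this shape, an iterator can be attached to a const SequenceBaseSketch
object without any const_cast at the attach site.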

libstdc++-v3/ChangeLog:

PR c++/116369
* config/abi/pre/gnu-versioned-namespace.ver: Use new const qualified symbols.
* config/abi/pre/gnu.ver: Add new const qualified symbols.
* include/debug/safe_base.h
(_Safe_iterator_base::_M_sequence): Declare as pointer-to-const.
(_Safe_iterator_base::_M_attach, _M_attach_single): New, take pointer-to-const
_Safe_sequence_base.
(_Safe_sequence_base::_M_detach_all, _M_detach_singular, _M_revalidate_singular)
(_M_swap, _M_get_mutex): New, const qualified.
(_Safe_sequence_base::_M_attach, _M_attach_single, _M_detach, _M_detach_single):
const qualify.
* include/debug/safe_container.h (_Safe_container<>::_M_cont): Add const qualifier.
(_Safe_container<>::_M_swap_base): New.
(_Safe_container(_Safe_container&&, const _Alloc&, std::false_type)):
Adapt to use latter.
(_Safe_container<>::operator=(_Safe_container&&)): Likewise.
(_Safe_container<>::_M_swap): Likewise and take parameter as const reference.
* include/debug/safe_unordered_base.h
(_Safe_local_iterator_base::_M_safe_container): New.
(_Safe_local_iterator_base::_Safe_local_iterator_base): Take
_Safe_unordered_container_base as pointer-to-const.
(_Safe_unordered_container_base::_M_attach, _M_attach_single): New, take
container as _Safe_unordered_container_base pointer-to-const.
(_Safe_unordered_container_base::_M_local_iterators, _M_const_local_iterators):
Add mutable.
(_Safe_unordered_container_base::_M_detach_all, _M_swap): New, const qualify.
(_Safe_unordered_container_base::_M_attach_local, _M_attach_local_single)
(_M_detach_local, _M_detach_local_single): Add const qualifier.
* include/debug/safe_unordered_container.h (_Safe_unordered_container::_M_self()): New.
* include/debug/safe_unordered_container.tcc
(_Safe_unordered_container::_M_invalidate_if, _M_invalidated_local_if): Use latter.
* include/debug/safe_iterator.h (_Safe_iterator<>::_M_attach, _M_attach_single):
Take _Safe_sequence_base as pointer-to-const.
(_Safe_iterator<>::_M_get_sequence): Add const_cast and comment about it.
* include/debug/safe_local_iterator.h (_Safe_local_iterator<>): Replace usages
of _M_sequence member by _M_safe_container().
(_Safe_local_iterator<>::_M_attach, _M_attach_single): Take
_Safe_unordered_container_base as pointer-to-const.
(_Safe_local_iterator<>::_M_get_sequence): Rename into...
(_Safe_local_iterator<>::_M_get_ucontainer): ...this. Add necessary const_cast and
comment to explain it.
(_Safe_local_iterator<>::_M_is_begin, _M_is_end): Adapt.
* include/debug/safe_local_iterator.tcc: Adapt.
* include/debug/safe_sequence.h
(_Safe_sequence<>::_M_invalidate_if, _M_transfer_from_if): Add const qualifier.
* include/debug/safe_sequence.tcc: Adapt.
* include/debug/deque (std::__debug::deque::erase): Adapt to use new const
qualified methods.
* include/debug/formatter.h: Adapt.
* include/debug/forward_list (_Safe_forward_list::_M_this): Add const
qualification and return pointer for consistency with 'this' keyword.
(_Safe_forward_list::_M_swap_aux): Rename into...
(_Safe_forward_list::_S_swap_aux): ...this and take sequence as const reference.
(forward_list<>::resize): Adapt to use const methods.
* include/debug/list (list<>::resize): Likewise.
* src/c++11/debug.cc: Adapt to const qualification.
* testsuite/util/testsuite_containers.h
(forward_members_unordered::forward_members_unordered): Add check on local_iterator
conversion to const_local_iterator.
(forward_members::forward_members): Add check on iterator conversion to
const_iterator.
* testsuite/23_containers/unordered_map/const_container.cc: New test case.
* testsuite/23_containers/unordered_multimap/const_container.cc: New test case.
* testsuite/23_containers/unordered_multiset/const_container.cc: New test case.
* testsuite/23_containers/unordered_set/const_container.cc: New test case.
* testsuite/23_containers/vector/debug/mutex_association.cc: Adapt.

[vxworks] [x86] disable vxworks6 PIC on vxworks7

VxWorks6 used symbols __GOTT_BASE__ and __GOTT_INDEX__ to obtain the
address of the global offset table.  Starting with VxWorks7, that is
no longer the case, but we've still issued these symbols in
output_set_got.  Do that only with VxWorks<7.

Switching to the call-based PIC register sequence, we have to set the
flag that prevents the use of the red zone, and AFAICT the reasons
that ruled out GOTOFF and other relative addressing no longer apply to
VxWorks7+.

for  gcc/ChangeLog

* config/vxworks-dummy.h (TARGET_VXWORKS_VAROFF): New.
(TARGET_VXWORKS_GOTTPIC): New.
* config/vxworks.h (TARGET_VXWORKS_VAROFF): Override.
(TARGET_VXWORKS_GOTTPIC): Likewise.
* config/i386/i386.cc (output_set_got): Disable VxWorks6 GOT
sequence on VxWorks7.
(legitimize_pic_address): Accept relative addressing of
labels on VxWorks7.
(ix86_delegitimize_address_1): Likewise.
(ix86_output_addr_diff_elt): Likewise.
* config/i386/i386.md (tablejump): Likewise.
(set_got, set_got_labelled): Set no-red-zone flag on VxWorks7.
* config/i386/predicates.md (gotoff_operand): Test
TARGET_VXWORKS_VAROFF.

[committed] Minor fix to gcc.dg/torture/pr120654.c

I don't recall which port complained, but pr120654.c was failing on one or more
of the embedded targets due to the use of malloc/free. This change just turns
them into the __builtin variants which makes everyone happy again.

gcc/testsuite
* gcc.dg/torture/pr120654.c: Use __builtin variants of malloc and free.

[committed][RISC-V] Fix testsuite fallout from check-function-bodies change

Minor fallout from HJ's recent change to the check-function-bodies code in the testsuite.

The label isn't at all important here, so forcing it to match is just a waste of time. So this patch just skips over the label. It fixes a handful of failures in the testsuite:

> unix//-march=rv32gcv: gcc: gcc.target/riscv/amo/zalrsc-rvwmo-amo-add-int.c check-function-bodies atomic_add_fetch_int_acq_rel
> unix//-march=rv32gcv: gcc: gcc.target/riscv/amo/zalrsc-rvwmo-amo-add-int.c check-function-bodies atomic_add_fetch_int_acquire
> unix//-march=rv32gcv: gcc: gcc.target/riscv/amo/zalrsc-rvwmo-amo-add-int.c check-function-bodies atomic_add_fetch_int_relaxed
> unix//-march=rv32gcv: gcc: gcc.target/riscv/amo/zalrsc-rvwmo-amo-add-int.c check-function-bodies atomic_add_fetch_int_release
> unix//-march=rv32gcv: gcc: gcc.target/riscv/amo/zalrsc-rvwmo-amo-add-int.c check-function-bodies atomic_add_fetch_int_seq_cst
> unix//-march=rv32gcv: gcc: gcc.target/riscv/amo/zalrsc-ztso-amo-add-int.c check-function-bodies atomic_add_fetch_int_acq_rel
> unix//-march=rv32gcv: gcc: gcc.target/riscv/amo/zalrsc-ztso-amo-add-int.c check-function-bodies atomic_add_fetch_int_acquire
> unix//-march=rv32gcv: gcc: gcc.target/riscv/amo/zalrsc-ztso-amo-add-int.c check-function-bodies atomic_add_fetch_int_relaxed
> unix//-march=rv32gcv: gcc: gcc.target/riscv/amo/zalrsc-ztso-amo-add-int.c check-function-bodies atomic_add_fetch_int_release
> unix//-march=rv32gcv: gcc: gcc.target/riscv/amo/zalrsc-ztso-amo-add-int.c check-function-bodies atomic_add_fetch_int_seq_cst

gcc/testsuite
* gcc.target/riscv/amo/zalrsc-rvwmo-amo-add-int.c: Adjust expected
output.
* gcc.target/riscv/amo/zalrsc-ztso-amo-add-int.c: Likewise.

[vxworks] add aarch64 to vxworks-dummy.h set

It's not strictly necessary, because nothing defined therein is
referenced by anything in gcc/config/aarch64, but it was an oversight
to not have it there.

for gcc/ChangeLog

* config.gcc (vxworks-dummy.h): Add to aarch64-*-* as well.

Daily bump.

libstdc++: Fix attribute order on __normal_iterator friends [PR120949]

In r16-1911-g6596f5ab746533 I claimed to have reordered some attributes
for compatibility with Clang, but it looks like I got the Clang
restriction backwards and put them all in the wrong order. Clang trunk
accepts either order (probably since the llvm/llvm-project#133107 fix)
but released versions still require a particular order.

There were also some cases where the attributes were after the friend
keyword, which Clang trunk still rejects.

libstdc++-v3/ChangeLog:

PR libstdc++/120949
* include/bits/stl_iterator.h (__normal_iterator): Fix order of
always_inline and nodiscard attributes for Clang compatibility.

libstdc++: Make VERIFY a variadic macro

This defines the testsuite assertion macro VERIFY so that it allows
un-parenthesized expressions containing commas. This matches how assert
is defined in C++26, following the approval of P2264R7.

The primary motivation is to allow expressions that the preprocessor
splits into multiple arguments, e.g.
VERIFY( vec == std::vector<int>{1,2,3,4} );

To achieve this, VERIFY is redefined as a variadic macro and then the
arguments are grouped together again through the use of __VA_ARGS__.

The implementation is complex due to the following points:

- The arguments __VA_ARGS__ are contextually-converted to bool, so that
  scoped enums and types that are not contextually convertible to bool
  cannot be used with VERIFY.
- bool(__VA_ARGS__) is used so that multiple arguments (i.e. those which
  are separated by top-level commas) are ill-formed. Nested commas are
  allowed, but likely mistakes such as VERIFY( cond, "some string" ) are
  ill-formed.
- The bool(__VA_ARGS__) expression needs to be unevaluated, so that we
  don't evaluate __VA_ARGS__ more than once. The simplest way to do that
  would be just sizeof bool(__VA_ARGS__), without parentheses to avoid a
  vexing parse for VERIFY(bool(i)). However that wouldn't work for e.g.
  VERIFY( []{ return true; }() ), because lambda expressions are not
  allowed in unevaluated contexts until C++20. So we use another
  conditional expression with bool(__VA_ARGS__) as the unevaluated
  operand.
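
The points above can be sketched as follows. This is a minimal
illustration, not the definition in testsuite_hooks.h: the names
SKETCH_VERIFY, sketch_report, sketch_failures and sketch_demo are
hypothetical, and the sketch counts failures instead of aborting so
that the behaviour is easy to observe.

```cpp
#include <cassert>
#include <cstdio>
#include <vector>

// Hypothetical failure counter and reporter (not part of libstdc++).
static int sketch_failures = 0;

static void sketch_report(const char* file, int line, const char* expr)
{
  std::fprintf(stderr, "%s:%d: Assertion '%s' failed.\n", file, line, expr);
  ++sketch_failures;
}

// The outer condition contextually converts __VA_ARGS__ to bool and is the
// only place where the arguments are evaluated.  The inner conditional never
// evaluates bool(__VA_ARGS__); it is only there so that several top-level
// arguments, e.g. SKETCH_VERIFY( cond, "text" ), fail to compile.
#define SKETCH_VERIFY(...)                                      \
  ((__VA_ARGS__)                                                \
     ? void(true ? true : bool(__VA_ARGS__))                    \
     : sketch_report(__FILE__, __LINE__, #__VA_ARGS__))

// Returns how many times the lambda below was actually called.
static int sketch_demo()
{
  std::vector<int> vec{1, 2, 3, 4};
  int calls = 0;
  SKETCH_VERIFY( vec == std::vector<int>{1,2,3,4} );  // commas, no parens
  SKETCH_VERIFY( [&]{ ++calls; return true; }() );    // evaluated once
  SKETCH_VERIFY( vec.size() == 3 );                   // fails, is reported
  return calls;
}
```

Calling sketch_demo() reports one failed assertion and returns the
number of lambda calls, which shows that the second copy of the
arguments is never evaluated.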

libstdc++-v3/ChangeLog:

* testsuite/util/testsuite_hooks.h (VERIFY): Define as variadic
macro.
* testsuite/ext/verify_neg.cc: New test.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>

libstdc++: Use template keyword in __mapping_of alias template

This is needed to fix an error with Clang 19:

include/c++/16.0.0/mdspan:512:30: error: use 'template' keyword to treat 'mapping' as a dependent template name
512 | is_same_v<typename _Layout::mapping<typename _Mapping::extents_type>,
| ^

libstdc++-v3/ChangeLog:

* include/std/mdspan (__mapping_of): Add template keyword.

Revert "Extend "counted_by" attribute to pointer fields of structures. Convert a pointer reference with counted_by attribute to .ACCESS_WITH_SIZE." due to PR120929.

This reverts commit 687727375769dd41971bad369f3553f1163b3e7a.

Revert "Use the counted_by attribute of pointers in builtinin-object-size." due to PR120929

This reverts commit 7165ca43caf47007f5ceaa46c034618d397d42ec.

Revert "Use the counted_by attribute of pointers in array bound checker." due to PR120929

This reverts commit 9d579c522d551eaa807e438206e19a91a3def67f.

check-function-bodies: Support "^[0-9]+:"

While working on

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120936

I tried to use check-function-bodies to verify that the label for mcount
and __fentry__ is only generated by "-pg" if it is used by the
__mcount_loc section:

1: call mcount
.section __mcount_loc, "a",@progbits
.quad 1b
.previous

Add "^[0-9]+:" to check-function-bodies to allow:

1: call mcount
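
For illustration only, here is what the pattern "^[0-9]+:" accepts.
The real change is in the Tcl code of lib/scanasm.exp; the C++ helper
below, toy_is_numeric_label_line, is a hypothetical name used just to
show which lines the regular expression matches.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns true if the line starts with a numeric local label such as "1:".
static bool toy_is_numeric_label_line(const std::string& line)
{
  static const std::regex numeric_label("^[0-9]+:");
  return std::regex_search(line, numeric_label);
}
```

With this helper, "1: call mcount" matches, while ".quad 1b" and a
named label such as "foo:" do not.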

PR testsuite/120881
* lib/scanasm.exp (check-function-bodies): Allow "^[0-9]+:".

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>

Ignore more clang warnings in contrib/filter-clang-warnings.py

In contrib we have a script filter-clang-warnings.py which supposedly
filters out uninteresting warnings emitted by clang when it compiles
GCC.  I'm not sure if anyone else uses it but our internal SUSE
testing infrastructure does.

Since Martin Liška left, I have mostly ignored the warnings and so
they have multiplied.  In an effort to improve the situation, I have
tried to fix those warnings which I think are worth it and would like
to adjust the filtering script so that we get to zero "interesting"
warnings again.

The changes are the following:

1. Ignore -Woverloaded-shift-op-parentheses warnings.  IIUC, those
   make some sense when << and >> are used for I/O but since that is
   not the case in GCC they are not really interesting.

2. Ignore -Wunused-function and -Wunneeded-internal-declaration.  I
   think it is OK to occasionally prepare APIs before they are used
   (and with our LTO we should be able to get rid of them).

3. Ignore -Wvla-cxx-extension and -Wunused-command-line-argument which
   just don't seem to be useful.

4. Ignore the -Wunused-private-field warning in diagnostic-path-output.cc,
   which can only be correct if quite a few functions are removed and
   which does not look like a mere oversight:

     gcc/diagnostic-path-output.cc:271:35: warning: private field 'm_logical_loc_mgr' is not used [-Wunused-private-field]

5. Ignore a case in -Wunused-but-set-variable about named_args which
   is used in a piece of code behind an ifdef in ipa-strub.cc.

6. Adjust the gimple-match and generic-match filters to the fact that
   we now have multiple such files.

7. Ignore warnings about using memcpy to copy around wide_ints, like
   the one below.  I seem to remember wide-int has undergone fairly
   rigorous review and TBH I just hope I know what we are doing.

     gcc/wide-int.h:1198:11: warning: first argument in call to 'memcpy' is a pointer to non-trivially copyable type 'wide_int_storage' [-Wnontrivial-memcall]

8. Ignore -Wc++11-narrowing warning reported in omp-builtins.def when
   it is included from JIT.  The code probably has a bigger issue
   described in PR 120960.

9. Since patch number 14 in the original series did not get
   approved, I assume that private member field m_wanted_type of class
   element_expected_type_with_indirection in c-family/c-format.cc will
   get a use sooner or later, so I ignore a warning about it being
   unused.

10. I have decided to ignore warnings in m2/gm2-compiler-boot about
    unused stuff (all reported unused stuff are variables).  These
    sources are in the build directory so I assume they are somehow
    generated and so warnings about unused things are a bit expected
    and probably not too bad.

11. On the Zulip chat, I have informed Rust folks they have a bunch of
    -Wunused-private-field cases in the FE.  Until they sort it out
    I'm ignoring these.  I might add the missing explicit type-cast
    case here too if it takes time for the patch I'm posting in this
    series to reach master.

12. I ignore the warning about the use of offsetof in libiberty/sha1.c,
    which is apparently only a "C23 extension":

      libiberty/sha1.c:239:11: warning: defining a type within 'offsetof' is a C23 extension [-Wc23-extensions]
      libiberty/sha1.c:460:11: warning: defining a type within 'offsetof' is a C23 extension [-Wc23-extensions]

13. I have enlarged the list of .texi files where warnings somehow got
    reported.  Not sure why that happens.

14. In analyzer/sm.cc there are several "no-op" methods which have
    named but unused parameters.  It seems this is deliberate and so I
    have filtered the -Wunused-parameter warning for this file.
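
The kind of filtering described in the list above can be sketched as a
table of rules. The script itself is Python; the C++ below is only an
illustration, and toy_skip_rule, toy_skip_rules and toy_skip_warning are
hypothetical names, not the contents of filter-clang-warnings.py. Only a
few of the rules from the list are reproduced.

```cpp
#include <cassert>
#include <string_view>
#include <vector>

// One rule: a filename fragment plus the warning fragments ignored there.
// An empty filename fragment applies to every file.
struct toy_skip_rule
{
  std::string_view file_fragment;
  std::vector<std::string_view> message_fragments;
};

static const std::vector<toy_skip_rule>& toy_skip_rules()
{
  static const std::vector<toy_skip_rule> rules = {
    {"", {"-Woverloaded-shift-op-parentheses", "-Wunused-function",
          "-Wunneeded-internal-declaration", "-Wvla-cxx-extension",
          "-Wunused-command-line-argument"}},
    {"wide-int.h", {"-Wnontrivial-memcall"}},
    {"libiberty/sha1.c", {"-Wc23-extensions"}},
    {"analyzer/sm.cc", {"-Wunused-parameter"}},
  };
  return rules;
}

// Returns true if a warning with this message in this file is filtered out.
static bool toy_skip_warning(std::string_view filename,
                             std::string_view message)
{
  for (const toy_skip_rule& rule : toy_skip_rules())
    {
      if (filename.find(rule.file_fragment) == std::string_view::npos)
        continue;
      for (std::string_view fragment : rule.message_fragments)
        if (message.find(fragment) != std::string_view::npos)
          return true;
    }
  return false;
}
```

A per-file rule only fires for matching paths, so a
-Wnontrivial-memcall warning is dropped for gcc/wide-int.h but kept for
any other file.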

I have also re-arranged the entries in a way which hopefully makes
somewhat more sense.

Thanks,

Martin

contrib/ChangeLog:

2025-07-07  Martin Jambor  <mjambor@suse.cz>

* filter-clang-warnings.py (skip_warning): Also ignore
-Woverloaded-shift-op-parentheses, -Wunused-function,
-Wunneeded-internal-declaration, -Wvla-cxx-extension, and
-Wunused-command-line-argument everywhere and a warning about
m_logical_loc_mgr in diagnostic-path-output.cc.  Adjust gimple-match
and generic-match "filenames."  Ignore -Wnontrivial-memcall warnings
in wide-int.h, all warnings about unused stuff in files under
m2/gm2-compiler-boot, all -Wunused-private-field in rust FE, in
analyzer/ana-state-to-diagnostic-state.h and c-family/c-format.cc, all
warnings in avr-mmcu.texi, install.texi and libgccjit.texi and all
-Wc23-extensions warnings in libiberty/sha1.c. Ignore
-Wunused-parameter in analyzer/sm.cc.  Reorder entries.

ranger: Mark three occurrences of verify_range with override

In line with my previous patches introducing override where clang
warnings indicate that it is missing, this patch adds it to three
new member functions that override ancestor virtual functions without
being marked as such.
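
A minimal sketch of the two spellings used in this patch, with
hypothetical toy classes standing in for the real ones in value-range.h
(toy_vrange, toy_irange, toy_prange and toy_verify are illustrative
names, and the int return value exists only to make the dispatch
visible):

```cpp
#include <cassert>

// Hypothetical base class with a virtual checking function.
struct toy_vrange
{
  virtual ~toy_vrange() = default;
  virtual int verify_range() const { return 0; }
};

// 'override' makes the compiler reject the declaration if it does not
// actually override a virtual function of a base class.
struct toy_irange : toy_vrange
{
  int verify_range() const override { return 1; }
};

// 'final override' additionally forbids further overriding in classes
// derived from toy_prange.
struct toy_prange : toy_vrange
{
  int verify_range() const final override { return 2; }
};

// Calls the function through the base class to show virtual dispatch.
static int toy_verify(const toy_vrange& r)
{
  return r.verify_range();
}
```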

Since Andrew has pre-approved such changes for ranger, I am going to
push it to master after bootstrapping it on x86_64-linux.

Thanks,

Martin

gcc/ChangeLog:

2025-07-07 Martin Jambor <mjambor@suse.cz>

* value-range.h (class irange): Mark member function verify_range
with override.
(class prange): Mark member function verify_range with final override.
(class frange): Mark member function verify_range with override.