git.ipfire.org Git - thirdparty/gcc.git/log

vect: Misalign checks for gather/scatter.

This patch adds simple misalignment checks for gather/scatter
operations.  Previously, we assumed that those perform element accesses
internally so alignment does not matter.  The riscv vector spec however
explicitly states that vector operations are allowed to fault on
element-misaligned accesses.  Reasonable uarchs won't, but...

For gather/scatter we have two paths in the vectorizer:

(1) Regular analysis based on datarefs.  Here we can also create
     strided loads.
(2) Non-affine access where each gather index is relative to the
     initial address.

The assumption this patch works on is that once the alignment for the
first scalar is correct, all others will fall in line, as the index is
always a multiple of the first element's size.

For (1) we have a dataref and can check it for alignment as in other
cases.  For (2) this patch checks the object alignment of BASE and
compares it against the natural alignment of the current vectype's unit.

The patch also adds a pointer argument to the gather/scatter IFNs that
contains the necessary alignment.  Most of the patch is thus mechanical
in that it merely adjusts indices.

I tested the riscv version with a custom qemu version that faults on
element-misaligned vector accesses.  With this patch applied, there is
just a single fault left, which is due to PR120782 and which will be
addressed separately.

Bootstrapped and regtested on x86 and aarch64.  Regtested on
rv64gcv_zvl512b with and without unaligned vector support.

gcc/ChangeLog:

* internal-fn.cc (internal_fn_len_index): Adjust indices for new
alias_ptr param.
(internal_fn_else_index): Ditto.
(internal_fn_mask_index): Ditto.
(internal_fn_stored_value_index): Ditto.
(internal_fn_alias_ptr_index): Ditto.
(internal_fn_offset_index): Ditto.
(internal_fn_scale_index): Ditto.
(internal_gather_scatter_fn_supported_p): Ditto.
* internal-fn.h (internal_fn_alias_ptr_index): Ditto.
* optabs-query.cc (supports_vec_gather_load_p): Ditto.
* tree-vect-data-refs.cc (vect_check_gather_scatter): Add alias
pointer.
* tree-vect-patterns.cc (vect_recog_gather_scatter_pattern): Add
alias pointer.
* tree-vect-slp.cc (vect_get_operand_map): Adjust for alias
pointer.
* tree-vect-stmts.cc (vect_truncate_gather_scatter_offset): Add
alias pointer and misalignment handling.
(get_load_store_type): Move from here...
(get_group_load_store_type): ...To here.
(vectorizable_store): Add alias pointer.
(vectorizable_load): Ditto.
* tree-vectorizer.h (struct gather_scatter_info): Ditto.

vect: Add is_gather_scatter argument to misalignment hook.

This patch adds an is_gather_scatter argument to the
support_vector_misalignment hook. All targets but riscv do not care
about alignment for gather/scatter so return true for is_gather_scatter.

gcc/ChangeLog:

* config/aarch64/aarch64.cc (aarch64_builtin_support_vector_misalignment):
Return true for gather/scatter.
* config/arm/arm.cc (arm_builtin_support_vector_misalignment):
Ditto.
* config/epiphany/epiphany.cc (epiphany_support_vector_misalignment):
Ditto.
* config/gcn/gcn.cc (gcn_vectorize_support_vector_misalignment):
Ditto.
* config/loongarch/loongarch.cc (loongarch_builtin_support_vector_misalignment):
Ditto.
* config/riscv/riscv.cc (riscv_support_vector_misalignment):
Add gather/scatter argument.
* config/rs6000/rs6000.cc (rs6000_builtin_support_vector_misalignment):
Return true for gather/scatter.
* config/s390/s390.cc (s390_support_vector_misalignment):
Ditto.
* doc/tm.texi: Add argument.
* target.def: Ditto.
* targhooks.cc (default_builtin_support_vector_misalignment):
Ditto.
* targhooks.h (default_builtin_support_vector_misalignment):
Ditto.
* tree-vect-data-refs.cc (vect_supportable_dr_alignment):
Ditto.

vect: Add helper macros for gather/scatter.

This encapsulates the IFN and the builtin-function way of handling
gather/scatter via three defines:

  GATHER_SCATTER_IFN_P
  GATHER_SCATTER_LEGACY_P
  GATHER_SCATTER_EMULATED_P

and introduces a helper define for SLP operand handling as well.

gcc/ChangeLog:

* tree-vect-slp.cc (GATHER_SCATTER_OFFSET): New define.
(vect_get_and_check_slp_defs): Use.
* tree-vectorizer.h (GATHER_SCATTER_LEGACY_P): New define.
(GATHER_SCATTER_IFN_P): Ditto.
(GATHER_SCATTER_EMULATED_P): Ditto.
* tree-vect-stmts.cc (vectorizable_store): Use.
(vectorizable_load): Use.

ifn: Add helper functions for gather/scatter.

This patch adds access helpers for the gather/scatter offset and scale
parameters.

gcc/ChangeLog:

* internal-fn.cc (expand_scatter_store_optab_fn): Use new
function.
(expand_gather_load_optab_fn): Ditto.
(internal_fn_offset_index): Ditto.
(internal_fn_scale_index): Ditto.
* internal-fn.h (internal_fn_offset_index): New function.
(internal_fn_scale_index): Ditto.
* tree-vect-data-refs.cc (vect_describe_gather_scatter_call):
Use new function.

vect: remove cast to pointer for create_if in vect_create_data_ref_ptr

Removes `fold_convert (aggr_ptr_type, iv_step)` when using create_iv.

This was previously constructing statements like:
```
unsigned int _49;
vector([4,4]) int * _50;
sizetype _51;
vector([4,4]) int * vectp_x.6;
...
_50 = (vector([4,4]) int *) _49;
_51 = (sizetype) _50;
...
vectp_x.6_48 = vectp_x.6_47 + _51;
```

And instead creates:
```
unsigned int _49;
sizetype _50;
vector([4,4]) int * vectp_x.6;
...
_50 = (sizetype) _49;
...
vectp_x.6_48 = vectp_x.6_47 + _50;
```

As create_iv already has the logic to handle a pointer mode base and an integer
mode var this seems a more natural expression of this.

gcc/ChangeLog:

* tree-vect-data-refs.cc (vect_create_data_ref_ptr): Remove unnecessary
casts to aggr_ptr_type.

libstdc++: Expand compile-time ranges tests for vector and basic_string.

This replaces most test_constexpr invocations with direct calls to
test_ranges(), which is also used for runtime tests.

SimpleAllocator was made constexpr to simplify this refactoring. Other
test allocators, like uneq_allocator (used in from_range constructor
tests), were not updated.

libstdc++-v3/ChangeLog:

* testsuite/21_strings/basic_string/cons/from_range.cc: Replace
test_constexpr with test_ranges inside static_assert.
* testsuite/21_strings/basic_string/modifiers/append/append_range.cc:
Likewise.
* testsuite/21_strings/basic_string/modifiers/assign/assign_range.cc:
Likewise.
* testsuite/21_strings/basic_string/modifiers/insert/insert_range.cc:
Likewise.
* testsuite/21_strings/basic_string/modifiers/replace/replace_with_range.cc:
Likewise.
* testsuite/23_containers/vector/bool/cons/from_range.cc: Likewise.
* testsuite/23_containers/vector/bool/modifiers/assign/assign_range.cc:
Likewise.
* testsuite/23_containers/vector/bool/modifiers/insert/insert_range.cc:
Likewise.
* testsuite/23_containers/vector/cons/from_range.cc: Likewise.
* testsuite/23_containers/vector/modifiers/assign/assign_range.cc:
Likewise.
* testsuite/23_containers/vector/modifiers/insert/insert_range.cc:
Likewise.
* testsuite/23_containers/vector/bool/modifiers/insert/append_range.cc:
Run full test_ranges instead of span-only in test_constexpr.
* testsuite/23_containers/vector/modifiers/append_range.cc:
Replace test_constexpr with calls to test_ranges and test_overlapping.
* testsuite/util/testsuite_allocator.h (__gnu_test::SimpleAllocator):
Declared member functions as constexpr.

Remove non-SLP path from vectorizable_comparison

This removes the non-SLP path from vectorizable_comparison.

* tree-vect-stmts.cc (vectorizable_comparison_1): Remove
non-SLP path.
(vectorizable_comparison): Likewise.

Remove non-SLP path from vectorizable_condition

The following removes the non-SLP paths from vectorizable_condition.

* tree-vect-stmts.cc (vectorizable_condition): Remove
non-SLP paths.

Remove non-SLP path from vectorizable_scan_store

This removes the non-SLP paths from vectorizable_scan_store.

* tree-vect-stmts.cc (vectorizable_scan_store): Remove
non-SLP path and unused parameters.
(vectorizable_store): Adjust.

Remove non-SLP path from vectorizable_shift

This removes the non-SLP paths from vectorizable_shift.

* tree-vect-stmts.cc (vectorizable_shift): Remove non-SLP paths.

Remove non-SLP path from vectorizable_assignment

This removes the non-SLP paths from vectorizable_assignment

* tree-vect-stmts.cc (vectorizable_assignment): Remove
non-SLP paths.

Remove non-SLP path from vectorizable_call

This removes the non-SLP paths from vectorizable_call, propagates
out ncopies == 1 and removes empty loops resulting from that.

* tree-vect-stmts.cc (vectorizable_call): Remove non-SLP path.

Remove non-SLP path from vectorizable_bswap

* tree-vect-stmts.cc (vectorizable_bswap): Remove non-SLP path.

Remove non-SLP path from vectorizable_recurr

This removes the non-SLP paths from vectorizable_recurr.

* tree-vect-loop.cc (vectorizable_recurr): Remove non-SLP path.

aarch64: Relaxed SEL combiner patterns for unpacked SVE FP binary arithmetic

Extend the binary op/UNSPEC_SEL combiner patterns from SVE_FULL_F/
SVE_FULL_F_B16B16 to SVE_F/SVE_F_B16B16, where the strictness value
is SVE_RELAXED_GP.

gcc/ChangeLog:

* config/aarch64/aarch64-sve.md (*cond_<optab><mode>_2_relaxed):
Extend from SVE_FULL_F_B16B16 to SVE_F_B16B16.
(*cond_<optab><mode>_3_relaxed): Likewise.
(*cond_<optab><mode>_any_relaxed): Likwise.
(*cond_<optab><mode>_any_const_relaxed): Extend from SVE_FULL_F
to SVE_F.
(*cond_add<mode>_2_const_relaxed): Likewise.
(*cond_add<mode>_any_const_relaxed): Likewise.
(*cond_sub<mode>_3_const_relaxed): Likewise.
(*cond_sub<mode>_const_relaxed): Likewise.

gcc/testsuite/ChangeLog:

* g++.target/aarch64/sve/unpacked_cond_binary_bf16_1.C: New test.
* gcc.target/aarch64/sve/unpacked_cond_builtin_fmax_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_builtin_fmin_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_fadd_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_fdiv_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_fmaxnm_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_fminnm_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_fmul_1.c: Likewise..
* gcc.target/aarch64/sve/unpacked_cond_fsubr_1.c: Likewise.

aarch64: Add support for unpacked SVE FDIV

This patch extends the unpredicated FP division expander to support
partial FP modes. It extends the existing patterns used to implement
UNSPEC_COND_FDIV and it's approximation as needed.

gcc/ChangeLog:

* config/aarch64/aarch64-sve.md: (@aarch64_sve_<optab><mode>):
Extend from SVE_FULL_F to SVE_F, use aarch64_predicate_operand.
(@aarch64_frecpe<mode>): Extend from SVE_FULL_F to SVE_F.
(@aarch64_frecps<mode>): Likewise.
(div<mode>3): Likewise, use aarch64_sve_fp_pred.
* config/aarch64/iterators.md: Add warnings above SVE_FP_UNARY
and SVE_FP_BINARY.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve/unpacked_fdiv_1.c: New test.
* gcc.target/aarch64/sve/unpacked_fdiv_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_fdiv_3.c: Likewise.

aarch64: Add support for unpacked SVE FP binary arithmetic

This patch extends the expanders for unpredicated smax, smin, add, sub,
mul, min, and max, so that they support partial SVE FP modes.

The relevant insn and splitting patterns are also updated.

gcc/ChangeLog:

* config/aarch64/aarch64-sve.md (<optab><mode>3): Extend from
SVE_FULL_F to SVE_F, use aarch64_sve_fp_pred.
(*post_ra_<sve_fp_op><mode>3): Extend from SVE_FULL_F to SVE_F.
(@aarch64_pred_<optab><mode>): Extend from SVE_FULL_F to SVE_F,
use aarch64_predicate_operand (ADD/SUB/MUL/MAX/MIN).
(split for using unpredicated insns): Move SVE_RELAXED_GP into
the pattern, rather than testing for it in the condition.
* config/aarch64/aarch64-sve2.md (@aarch64_pred_<optab><mode>):
Extend from VNx8BF_ONLY to SVE_BF.

gcc/testsuite/ChangeLog:

* g++.target/aarch64/sve/unpacked_binary_bf16_1.C: New test.
* g++.target/aarch64/sve/unpacked_binary_bf16_2.C: Likewise.
* gcc.target/aarch64/sve/unpacked_builtin_fmax_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_builtin_fmax_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_builtin_fmin_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_builtin_fmin_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_fadd_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_fadd_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_fmaxnm_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_fmaxnm_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_fminnm_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_fminnm_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_fmul_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_fmul_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_fsubr_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_fsubr_2.c: Likewise.

ada: Use-before-definition of a component of discriminated aggregate's itype.

In some cases involving assigning an aggregate to a formal parameter of
an unconstrained discriminated subtype that has a Dynamic_Predicate, and where
the discriminated type also has a component of an unconstrained discriminated
subtype, the front end generates a malformed tree which causes a compilation
failure when the backend fails a consistency check.

gcc/ada/ChangeLog:

* exp_aggr.adb (Convert_To_Assignments): Add calls to Ensure_Defined
before generating assignments to components that could be
associated with a not-yet-defined itype.

ada: Function return accessibility checking for result access discrims.

RM 6.5 defines static and dynamic checks to ensure that a function result
with one or more access discriminants will not outlive the entity
designated by a non-null access discriminant value (see paragraphs
5.9 and 21). Implement these checks. Also fix a bug in passing along
an implicit parameter needed to perform the dynamic checks when a function
that takes such a parameter returns a call to another such function.

gcc/ada/ChangeLog:

* accessibility.adb (Function_Call_Or_Allocator_Level): Handle the
case where a function that has an Extra_Accessibility_Of_Result
parameter returns as its result a call to another such function.
In that case, the extra parameter should be passed along.
(Check_Return_Construct_Accessibility): Replace a warning about an
inevitable failure of a dynamic check with a legality-rule-violation
error message; adjust the text of the message accordingly.
* exp_ch6.ads (Apply_Access_Discrims_Accessibility_Check): New
procedure, following example of the existing
Apply_CW_Accessibility procedure.
* exp_ch6.adb (Apply_Access_Discrims_Accessibility_Check): body
for new procedure.
(Expand_Simple_Function_Return): Add call to new
Apply_Access_Discrims_Accessibility_Check procedure.
* exp_ch3.adb (Make_Allocator_For_Return): Add call to new
Apply_Access_Discrims_Accessibility_Check procedure.

ada: Minor adjustment to the doc of Last_Chance_Handler

gcc/ada/ChangeLog:

* doc/gnat_rm/standard_and_implementation_defined_restrictions.rst:
clarify parameter description.
* gnat_rm.texi: Regenerate.

testsuite: Fix gcc.target/powerpc/vsx-builtin-7.c test [PR119382]

The test vsx-builtin-7.c failed on powerpc64le-linux due to Identical
Code Folding (ICF) merging the functions insert_di_0_v2 and insert_di_0.
This behavior was introduced by commit r15-7961-gdc47161c1f32c3, which
enhanced alias analysis in ao_compare::compare_ao_refs, enabling the
compiler to identify and optimize structurally identical functions. As a
result, the compiler replaced insert_di_0_v2 with a tail call to
insert_di_0, altering the expected test behavior.

This patch adds -fno-ipa-icf to the test's dg-options to disable ICF,
avoiding function merging and ensuring the test executes correctly.

2025-06-24 Jeevitha Palanisamy <jeevitha@linux.ibm.com>

gcc/testsuite/
PR testsuite/119382
* gcc.target/powerpc/vsx-builtin-7.c: Add '-fno-ipa-icf' to dg-options.

RISC-V: Add test case for vx combine polluting VXRM

Add asm check to make sure vx combine of vaaddu.vx will not pollute
the vxrm.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-fixed-vxrm-1-u16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-fixed-vxrm-1-u32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-fixed-vxrm-1-u64.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-fixed-vxrm-1-u8.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-fixed-vxrm.h: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Avoid vaaddu.vx combine pattern pollute VXRM csr

The vaaddu.vx combine almost comes from avg_floor, it will
requires the vxrm to be RDN. But not all vaaddu.vx should
depends on the RDN. The vaaddu.vx combine should leverage
the VXRM value as is instead of pollute them all to RDN.

This patch would like to fix this and set it as is.

gcc/ChangeLog:

* config/riscv/autovec-opt.md (*uavg_floor_vx_<mode>): Rename
from...
(*<sat_op_v_vdup>_vx_<mode>): Rename to...
(*<sat_op_vdup_v>_vx_<mode>): Rename to...
* config/riscv/riscv-protos.h (enum insn_flags): Add vxrm
RNE, ROD type.
(enum insn_type): Add RNE_P, ROD_P type.
(expand_vx_binary_vxrm_vec_vec_dup): Add new func decl.
(expand_vx_binary_vxrm_vec_dup_vec): Ditto.
* config/riscv/riscv-v.cc (get_insn_type_by_vxrm_val): Add
helper to get insn type by vxrm value.
(expand_vx_binary_vxrm_vec_vec_dup): Add new func impl
to expand vec + vec_dup pattern.
(expand_vx_binary_vxrm_vec_dup_vec): Ditto but for
vec_dup + vec pattern.
* config/riscv/vector-iterators.md: Add helper iterator
for sat vx combine.

Signed-off-by: Pan Li <pan2.li@intel.com>

c++/modules: Support re-streaming TU_LOCAL_ENTITYs [PR120412]

When emitting a primary module interface, we must re-stream any TU-local
entities that we saw in a partition. This patch adds the missing
members from core_vals.

As a drive-by fix, in some cases we might have a typedef referring to a
TU-local entity; we need to handle that case as well.

PR c++/120412

gcc/cp/ChangeLog:

* module.cc (trees_out::core_vals): Write TU_LOCAL_ENTITY bits.
(trees_in::core_vals): Read it.
(trees_in::tree_node): Handle TU_LOCAL_ENTITY typedefs.

gcc/testsuite/ChangeLog:

* g++.dg/modules/internal-14_a.C: New test.
* g++.dg/modules/internal-14_b.C: New test.
* g++.dg/modules/internal-14_c.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>

Daily bump.

aarch64: Relaxed SEL combiner patterns for unpacked SVE FP unary operations

Extend the unary op/UNSPEC_SEL combiner patterns from SVE_FULL_F to SVE_F,
where the strictness value is SVE_RELAXED_GP.

gcc/ChangeLog:

* config/aarch64/aarch64-sve.md (*cond_<optab><mode>_2_relaxed):
Extend from SVE_FULL_F to SVE_F.
(*cond_<optab><mode>_any_relaxed): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve/unpacked_cond_fabs_1.c: New test.
* gcc.target/aarch64/sve/unpacked_cond_fneg_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_frinta_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_frinta_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_frinti_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_frintm_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_frintp_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_frintx_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_frintz_1.c: Likewise.

aarch64: Add support for unpacked SVE FP unary operations

This patch extends the expander for unpredicated round, nearbyint, floor,
ceil, rint, and trunc, so that it can handle partial SVE FP modes.

We move fabs and fneg to a separate expander, since they are not trapping
instructions.

gcc/ChangeLog:

* config/aarch64/aarch64-sve.md (<optab><mode>2): Replace use of
aarch64_ptrue_reg with aarch64_sve_fp_pred.
(@aarch64_pred_<optab><mode>): Extend from SVE_FULL_F to SVE_F,
and use aarch64_predicate_operand.
* config/aarch64/iterators.md: Split FABS/FNEG out of
SVE_COND_FP_UNARY (into new SVE_COND_FP_UNARY_BITWISE).

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve/unpacked_fabs_1.c: New test.
* gcc.target/aarch64/sve/unpacked_fneg_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_frinta_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_frinta_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_frinti_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_frinti_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_frintm_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_frintm_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_frintp_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_frintp_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_frintx_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_frintx_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_frintz_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_frintz_2.c: Likewise.

aarch64: Relaxed SEL combiner patterns for unpacked SVE FP conversions

Add UNSPEC_SEL combiner patterns for unpacked FP conversions, where the
strictness value is SVE_RELAXED_GP.

gcc/ChangeLog:

* config/aarch64/aarch64-sve.md
(*cond_<optab>_nontrunc<SVE_PARTIAL_F:mode><SVE_HSDI:mode>_relaxed):
New FCVT/SEL combiner pattern.
(*cond_<optab>_trunc<VNx2DF_ONLY:mode><VNx2SI_ONLY:mode>_relaxed):
New FCVTZ{S,U}/SEL combiner pattern.
(*cond_<optab>_nonextend<SVE_HSDI:mode><SVE_PARTIAL_F:mode>_relaxed):
New {S,U}CVTF/SEL combiner pattern.
(*cond_<optab>_trunc<SVE_SDF:mode><SVE_PARTIAL_HSF:mode>):
New FCVT/SEL combiner pattern.
(*cond_<optab>_nontrunc<SVE_PARTIAL_HSF:mode><SVE_SDF:mode>_relaxed):
New FCVTZ{S,U}/SEL combiner pattern.
* config/aarch64/iterators.md: New mode iterator for VNx2SI.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve/unpacked_cond_cvtf_1.c: New test.
* gcc.target/aarch64/sve/unpacked_cond_fcvt_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_fcvtz_1.c: Likewise.

contrib: add 'zlib' to ignored_prefixes

This fixes the same problem for syncing zlib that H.J. hit for libffi
in r12-4561-g25ab851dd333d7.

contrib/ChangeLog:
PR other/105404

* gcc-changelog/git_commit.py (ignored_prefixes): Add zlib/.

Fortran: fix passing of character length of function to procedure [PR121203]

PR fortran/121203

gcc/fortran/ChangeLog:

* trans-expr.cc (gfc_conv_procedure_call): Obtain the character
length of an assumed character length procedure from the typespec
of the actual argument even if there is no explicit interface.

gcc/testsuite/ChangeLog:

* gfortran.dg/function_charlen_4.f90: New test.

RISC-V: Rework broadcast handling [PR121073].

During the last weeks it became clear that our current broadcast
handling needs an overhaul in order to improve maintainability.
PR121073 showed that my intermediate fix wasn't enough and caused
regressions.

This patch now goes a first step towards untangling broadcast
(vmv.v.x), "set first" (vmv.s.x), and zero-strided load (vlse).
Also can_be_broadcast_p is rewritten and strided_broadcast_p is
introduced to make the distinction clear directly in the predicates.

Due to the pervasiveness of the patterns I needed to touch a lot
of places and tried to clear up some things while at it.  The patch
therefore also introduces new helpers expand_broadcast for vmv.v.x
that dispatches to regular as well as strided broadcast and
expand_set_first that does the same thing for vmv.s.x.

The non-strided fallbacks are now implemented as splitters of the
strided variants.  This makes it easier to see where and when things
happen.

The test cases I touched appeared wrong to me so this patch sets a new
baseline for some of the scalar_move tests.

There is still work to be done but IMHO that can be deferred: It would
be clearer if the three broadcast-like variants differed not just in
name but also in RTL pattern so matching is not as confusing.  Right now
vmv.v.x and vmv.s.x only differ in the mask and are interchangeable by
just changing it from "all ones" to a "single one".

As last time, I regtested on rv64 and rv32 with strided_broadcast turned
on and off.  Note there are regressions cond_fma_fnma-[78].c.  Those are
due to the patch exposing more fwprop/late-combine opportunities.  For
fma/fnma we don't yet have proper costing for vv/vx in place but I'll
expect that to be addressed soon and figured we can live with those for
the time being.

PR target/121073

gcc/ChangeLog:

* config/riscv/autovec-opt.md: Use new helpers.
* config/riscv/autovec.md: Ditto.
* config/riscv/predicates.md (strided_broadcast_mask_operand):
New predicate.
(strided_broadcast_operand): Ditto.
(any_broadcast_operand): Ditto.
* config/riscv/riscv-protos.h (expand_broadcast): Declare.
(expand_set_first): Ditto.
(expand_set_first_tu): Ditto.
(strided_broadcast_p): Ditto.
* config/riscv/riscv-string.cc (expand_vec_setmem): Use new
helpers.
* config/riscv/riscv-v.cc (expand_broadcast): New functionk.
(expand_set_first): Ditto.
(expand_set_first_tu): Ditto.
(expand_const_vec_duplicate): Use new helpers.
(expand_const_vector_duplicate_repeating): Ditto.
(expand_const_vector_duplicate_default): Ditto.
(sew64_scalar_helper): Ditto.
(expand_vector_init_merge_repeating_sequence): Ditto.
(expand_reduction): Ditto.
(strided_broadcast_p): New function.
(whole_reg_to_reg_move_p): Use new helpers.
* config/riscv/riscv-vector-builtins-bases.cc: Use either
broadcast or strided broadcast.
* config/riscv/riscv-vector-builtins.cc (function_expander::use_ternop_insn):
Ditto.
(function_expander::use_widen_ternop_insn): Ditto.
(function_expander::use_scalar_broadcast_insn): Ditto.
* config/riscv/riscv-vector-builtins.h: Declare scalar
broadcast.
* config/riscv/vector.md (*pred_broadcast<mode>): Split into
regular and strided broadcast.
(*pred_broadcast<mode>_zvfh): Split.
(pred_broadcast<mode>_zvfh): Ditto.
(*pred_broadcast<mode>_zvfhmin): Ditto.
(@pred_strided_broadcast<mode>): Ditto.
(*pred_strided_broadcast<mode>): Ditto.
(*pred_strided_broadcast<mode>_zvfhmin): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls-vlmax/repeat-6.c: Adjust test
expectation.
* gcc.target/riscv/rvv/base/scalar_move-5.c: Ditto.
* gcc.target/riscv/rvv/base/scalar_move-6.c: Ditto.
* gcc.target/riscv/rvv/base/scalar_move-7.c: Ditto.
* gcc.target/riscv/rvv/base/scalar_move-8.c: Ditto.
* gcc.target/riscv/rvv/base/scalar_move-9.c: Ditto.
* gcc.target/riscv/rvv/pr121073.c: New test.

RISC-V: testsuite: Fix vx_vf_*run-1-f16.c run tests.

This patch fixes the vf_vfmacc-run-1-f16.c test failures on rv32
by adding zvfh requirements as well as options to the test and
the target harness.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfmacc-run-1-f16.c:
Add zvfh requirements and options.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfmadd-run-1-f16.c:
Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfmsac-run-1-f16.c:
Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfmsub-run-1-f16.c:
Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfnmacc-run-1-f16.c:
Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfnmadd-run-1-f16.c:
Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfnmsac-run-1-f16.c:
Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfnmsub-run-1-f16.c:
Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfwmacc-run-1-f16.c:
Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfwmsac-run-1-f16.c:
Ditto.
* lib/target-supports.exp: Add zvfh options.

aarch64: Fix fma steering when rename fails [PR120119]

Regrename can fail in some case and `insn_rr[INSN_UID (insn)].op_info`
will be null. The FMA steering code was not expecting the failure to happen.
This started to happen after early RA was added but it has been a latent bug
before that.

Build and tested for aarch64-linux-gnu.

PR target/120119

gcc/ChangeLog:

* config/aarch64/cortex-a57-fma-steering.cc (func_fma_steering::analyze):
Skip if renaming fails.

gcc/testsuite/ChangeLog:

* g++.dg/torture/pr120119-1.C: New test.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

cobol: Tweak adjustments to location_t of GENERIC nodes for PERFORM.

COBOL has a group of PERFORM statements that require careful adjustments to
the location_t elements of the GENERIC nodes so that the COBOL-aware version
of GDB behaves properly. These changes are in service of that goal.

gcc/cobol/ChangeLog:

* genapi.cc (leave_procedure): Adjust location_t for PERFORM.
(parser_perform_times): Likewise.
(internal_perform_through_times): Likewise.
(perform_outofline_before_until): Likewise.
(perform_outofline_after_until): Likewise.
(perform_outofline_testafter_varying): Likewise.
(perform_outofline_before_varying): Likewise.

c++: name lookup for non-dep rewritten != expr [PR121179]

Here we're incorrectly rejecting the modules testcase (reduced from a
std module example):

$ cat 121179_a.C
export module foo;

enum class E { x };
bool operator==(E, int);

export
template<class T>
void f() {
E::x != 0;
}

$ cat 121179_b.C
import foo;

template void f<int>();

$ g++ -fmodules 121179_*.C
In module foo, imported at 121179_b.C:1:
121179_a.C: In instantiation of ‘void f@foo() [with T = int]’:
121179_b.C:3:9: required from here
121179_a.C:9:8: error: no match for ‘operator!=’ (operand types are ‘E@foo’ and ‘int’)

This is ultimately because our non-dependent rewritten operator expression
handling throws away the result of unqualified lookup at template parse time,
and so we have to repeat the lookup at instantiation time which fails because
the operator== isn't exported.

This is a known deficiency, but it's easy enough to narrowly fix this
for simple != to == rewrites by making build_min_non_dep_op_overload
look through logical negation.

PR c++/121179

gcc/cp/ChangeLog:

* call.cc (build_new_op): Don't clear *overload for a simple
!= to == rewrite.
* tree.cc (build_min_non_dep_op_overload): Handle TRUTH_NOT_EXPR
appearing in a rewritten operator expression.

gcc/testsuite/ChangeLog:

* g++.dg/lookup/operator-8.C: Strengthen test and remove one
XFAIL.

Reviewed-by: Jason Merrill <jason@redhat.com>

c++: fix __is_invocable for std::reference_wrapper [PR121055]

Our implementation of the INVOKE spec ([func.require]) was incorrectly
treating reference_wrapper<T>::get() as returning T instead of T&, which
notably makes a difference when invoking a ref-qualified memfn pointer.

PR c++/121055

gcc/cp/ChangeLog:

* method.cc (build_invoke): Correct reference_wrapper handling.

gcc/testsuite/ChangeLog:

* g++.dg/ext/is_invocable5.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>

aarch64: testsuite: Keep -mtune=generic when specifying -moverride

gcc/testsuite/ChangeLog:

* lib/gcc-defs.exp (aarch64-arg-dg-options): Split add_tune into
add_tune and add_override, so that specifying -moverride does not
change the baseline tuning from the testuite's default (generic).

libstdc++: Prepare test code for default_accessor for reuse.

All test code of default_accessor can be reused. This commit moves
the reuseable code into a file generic.cc and prepares the tests for
reuse with aligned_accessor.

libstdc++-v3/ChangeLog:

* testsuite/23_containers/mdspan/accessors/default.cc: Delete.
* testsuite/23_containers/mdspan/accessors/generic.cc: Slightly
generalize the test code previously in default.cc.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com>

libstdc++: Remove redundant parens in mdspan testsuite.

A recent commit improved the macro VERIFY to eliminate the need for
certain parens. This commit updates the test code in

23_containers/mdspan

libstdc++-v3/ChangeLog:

* testsuite/23_containers/mdspan/extents/ctor_ints.cc: Remove
superfluous parens.
* testsuite/23_containers/mdspan/extents/ctor_shape.cc: Ditto.
* testsuite/23_containers/mdspan/mdspan.cc: Ditto.

Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com>

tree-optimization/121220 - improve sinking of stores

We currently do only very restricted store sinking into paths
that have no loads or stores and end in a virtual PHI.  The
following extends this to sink towards a single virtual
definition in addition to the case of a PHI, handling skipping
of unrelated virtual uses.  We later have to prune cases
that would require virtual PHI insertion and the patch below
basically restricts this to sinking to noreturn paths for now.

PR tree-optimization/121220
* tree-ssa-sink.cc (statement_sink_location): For stores
handle sinking to paths ending in a store.  Skip loads
that do not use the store.

* gcc.dg/tree-ssa/ssa-sink-23.c: New testcase.

[modula2] Add return to remove build warning

This patch adds a return statement to M2Exception which removes a
build warning.

gcc/m2/ChangeLog:

* gm2-libs/M2EXCEPTION.mod (M2Exception): Add return
exException in case Raise completes.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

tree-sra: Avoid total SRA if there are incompat. aggregate accesses  (PR119085)

We currently use the types encountered in the function body and not in
type declaration to perform total scalarization.  Bug PR 119085
uncovered that we miss a check that when the same data is accessed
with aggregate types that those are actually compatible.  Without it,
we can base total scalarization on a type that does not "cover" all
live data in a different part of the function.  This patch adds the
check.

gcc/ChangeLog:

2025-07-21  Martin Jambor  <mjambor@suse.cz>

PR tree-optimization/119085
* tree-sra.cc (sort_and_splice_var_accesses): Prevent total
scalarization if two incompatible aggregates access the same place.

gcc/testsuite/ChangeLog:

2025-07-21  Martin Jambor  <mjambor@suse.cz>

PR tree-optimization/119085
* gcc.dg/tree-ssa/pr119085.c: New test.

PR modula2/121164: Modula 2 build failure followup

This is a followup patch for PR modula2/121164 to
fix the location for the error message attributed to cc1gm2.

gcc/m2/ChangeLog:

PR modula2/121164
* gm2-compiler/P1SymBuild.mod: Remove PutProcTypeParam.
Remove PutProcTypeParam.
(CheckFileName): Remove.
(P1EndBuildDefinitionModule): Correct spelling.
(P1EndBuildImplementationModule): Ditto.
(P1EndBuildProgramModule): Ditto.
(EndBuildInnerModule): Ditto.
* gm2-compiler/P2SymBuild.mod (P2EndBuildDefModule): Correct
spelling.
(P2EndBuildImplementationModule): Ditto.
(P2EndBuildProgramModule): Ditto.
(EndBuildInnerModule): Ditto.
(CheckFormalParameterSection): Ditto.
* gm2-compiler/P3SymBuild.mod (P3EndBuildDefModule): Ditto.
* gm2-compiler/PCSymBuild.mod (PCEndBuildDefModule): Ditto.
(fixupProcedureType): Pass tok to PutProcTypeVarParam.
Pass tok to PutProcTypeParam.
* gm2-compiler/SymbolTable.def (PutProcTypeParam): Add tok
parameter.
(PutProcTypeVarParam): Ditto.
* gm2-compiler/SymbolTable.mod (SymParam): At change type to
CARDINAL.
New field FullTok.
New field Scope.
(SymVarParam): At change type to CARDINAL.
New field FullTok.
New field Scope.
(GetVarDeclTok): Check ShadowVar for NulSym and return At.
(PutParam): Initialize FullTok.
Initialize At.
Initialize Scope.
(PutVarParam): Initialize FullTok.
Assign At.
Initialize Scope.
(AddProcedureProcTypeParam): Add tok parameter.
(GetScope): Add ParamSym and VarParamSym clause.
(PutProcTypeVarParam): Add tok parameter.
Initialize At.
Initialize FullTok.
(GetDeclaredDefinition): Clause ParamSym return At.
Clause VarParamSym return At.
(GetDeclaredModule): Ditto.
(PutDeclaredDefinition): Remove clause ParamSym.
Remove clause VarParamSym.
(PutDeclaredModule): Remove clause ParamSym.
Remove clause VarParamSym.

libgm2/ChangeLog:

PR modula2/121164
* libm2iso/Makefile.am (libm2iso_la_M2FLAGS): Add -Wall.
* libm2iso/Makefile.in: Regenerate.
* libm2log/Makefile.am (libm2log_la_M2FLAGS): Add -Wall.
* libm2log/Makefile.in: Regenerate.
* libm2min/Makefile.am (libm2min_la_M2FLAGS): Add -Wall.
* libm2min/Makefile.in: Regenerate.
* libm2pim/Makefile.am (libm2pim_la_M2FLAGS): Add -Wall.
* libm2pim/Makefile.in: Regenerate.

gcc/testsuite/ChangeLog:

PR modula2/121164
* gm2/switches/pedantic-params/fail/arrayofchar.def: New test.
* gm2/switches/pedantic-params/fail/arrayofchar.mod: New test.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

libstdc++: Negative tests for constexpr uses inplace_vector [PR119137]

Adds negative tests for preconditions on inserting into a full
inplace_vector and erasing non-existent elementsi at compile-time.
This ensures coverage for the inplace_vector<T, 0> specialization.

Also extends element access tests to cover front() and back()
methods, and const and mutable overloads for all accesses.

PR libstdc++/119137

libstdc++-v3/ChangeLog:

* testsuite/23_containers/inplace_vector/access/elem.cc: Cover
front and back methods and const calls.
* testsuite/23_containers/inplace_vector/access/elem_neg.cc:
Likewise.
* testsuite/23_containers/inplace_vector/modifiers/erase_neg.cc:
New test.
* testsuite/23_containers/inplace_vector/modifiers/single_insert_neg.cc:
New test.

Default to -mcpu=ultrasparc3 on Solaris/SPARC

Prompted by the discussions around a recent clang bug, I realized that
gcc still defaults to -mcpu=v9 on Solaris/SPARC.

This is an oversight since the Oracle Studio 12.6 cc, released in 2017,
already defaults to -xarch=sparcvis2, the equivalent of
-mcpu=ultrasparc3.  Besides, both the 32 and 64-bit libc.so.1 require
UltraSPARC III extensions anyway:

      SPARC32PLUS Version 1, V8+ Required, UltraSPARC3 Extensions Required [VIS]
      SPARCV9 Version 1, UltraSPARC3 Extensions Required [VIS]

So this patch follows suite.

Bootstrapped on sparc-sun-solaris2.11 and sparcv9-sun-solaris2.11 with
as/ld, gas/ld, and gas/gld configurations.

There are currently two regressions exposed by this patch (PRs 121191
and 121192), which are only present in gcc 16 resp. 15/16.

There's one small caveat: while Solaris now marks all objects with
EF_SPARC_32PLUS EF_SPARC_SUN_US1 EF_SPARC_SUN_US3, gas only sets the
EF_SPARC_SUN_US[13] flags in the ELF header if UltraSPARC I/III insns
are actually used.  This is in accordance with the SPARC Compliance
Definition 2.4.1, 4P-1.  In the end, it doesn't matter anyway since
libc.so.1 already has both flags, so the resulting executables and
shared objects will too, anyway.

2025-07-20  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

gcc:
* config.gcc <sparc*-*-solaris2*> (with_cpu): Default to ultrasparc3.

[aarch64] check for non-NULL vectype in aarch64_vector_costs::add_stmt_cost

With a patch still in development we get NULL STMT_VINFO_VECTYPE.
One side-effect is that during scalar stmt testing we no longer
pass a vectype. The following adjusts aarch64_vector_costs::add_stmt_cost
to check for a non-NULL vectype before accessing it, like all the
code surrounding it. The other fix possibility would have been
to re-orderr the check with the vect_mem_access_type one, but that
one is not going to exist during scalar code costing either in the
future.

* config/aarch64/aarch64.cc (aarch64_vector_costs::add_stmt_cost):
Check vectype is non-NULL before accessing it.

middle-end/121216 - ICE with VLA const string initializer

constant_byte_string fails to consider the string type might be VLA
when initialized by an empty string CTOR.

PR middle-end/121216
* expr.cc (constant_byte_string): Check the string type
size fits an uhwi before converting to uhwi.

* gcc.dg/pr121216.c: New testcase.

testsuite: Mark fn1 in pr81627.c as noinline [PR120101]

Since r16-372-g064cac730f88dc fn1 is now inlined into main
which meant the scan dump was failing since it was looking
for it only once. Marking fn1 as noinline gets us back to
the old behavior and no longer dependent on the inliner.

Pushed as obvious after a quick test.

PR testsuite/120101
gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/pr81627.c (fn1): Mark as noinline.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

testsuite: Fix overflow in gcc.dg/vect/pr116125.c

The test ends up writing a byte beyond bounds of the buffer, which gets
trapped on some targets when the test is run with
-fstack-protector-strong.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/pr116125.c (mem_overlap): Expand A to 10 members.

Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>

Daily bump.

c++: constexpr union placement new [PR121068]

The note and example in [class.union] p6 think that placement new can be
used to change the active member of a union, but we didn't support that for
array members in constant-evaluation even after implementing P1330 and
P2747.

First I tried to address this by introducing a CLOBBER_BEGIN_OBJECT for the
entire array, but that broke the resolution of LWG3436, which invokes 'new
T[1]' for an array T, and trying to clobber a multidimensional array when
the actual object is single-dimensional breaks. So I've raised that issue
with the committee. Until that is resolved, this patch takes a simpler
approach: allow initialization of an element of an array to make the array
the active member of a union.

PR c++/121068

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_store_expression): Allow ARRAY_REFs
when activating an array member of a union.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/constexpr-union6.C: Expect x5 to work.
* g++.dg/cpp26/constexpr-new4.C: New test.

c++: Wmismatched-new-delete-5.C tweak

A patch I was testing noticed that the allocation is too small for the
placement new here, but that isn't the point of the testcase.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wmismatched-new-delete-5.C: Fix allocation.

MAINTAINERS: Add myself as C++ front end reviewer

ChangeLog:

* MAINTAINERS: Add myself as C++ front end reviewer.

Change __builtin_unreachable to __builtin_trap (or infinite loop) if only thing in function [PR109267]

When we have an empty function, things can go wrong with
cfi_startproc/cfi_endproc and a few other things like exceptions. So if
the only thing the function does is a call to __builtin_unreachable,
let's replace that with a __builtin_trap instead if the target has a trap
instruction. For targets without a trap instruction defined, replace it
with an infinite loop; this allows not to need for the abort call to happen
but still get the correct behavior of not having two functions at the same
location.

The QOI idea for basic block reorder is recorded as PR 120004.

Changes since v1:
* v2: Move to final gimple cfg cleanup instead of expand and use
BUILT_IN_UNREACHABLE_TRAP.
* v3: For targets without a trap defined, create an infinite loop.

Bootstrapped and tested on x86_64-linux-gnu.

PR middle-end/109267

gcc/ChangeLog:

* tree-cfgcleanup.cc (execute_cleanup_cfg_post_optimizing): If the first
non debug statement in the first (and only) basic block is a call
to __builtin_unreachable change it to a call to __builtin_trap or an
infinite loop.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp (check_effective_target_trap): New proc.
* g++.dg/missing-return.C: Update testcase for the !trap case.
* gcc.dg/pr109267-1.c: New test.
* gcc.dg/pr109267-2.c: New test.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

xtensa: Fix inaccuracy in xtensa_rtx_costs()

This patch fixes the following defects in the function:

   - The cost of move instructions larger than the natural word width,
     specifically "movd[if]_internal", cannot be estimated correctly
   - Floating-point or symbolic constant assignment insns cannot be
     identified as L32R instructions

gcc/ChangeLog:

* config/xtensa/xtensa.cc (xtensa_is_insn_L32R_p):
Rewrite to capture insns that could be L32R machine instructions
wherever possible.
(xtensa_rtx_costs): Fix to consider that moves larger than a
natural word can take multiple L32R machine instructions.
(constantpool_address_p): Cosmetics.
* config/xtensa/xtensa.md (movdi_internal, movdf_internal):
Add missing insn attributes.

xtensa: Make relaxed MOVI instructions treated as "load" type

The relaxed MOVI instructions in the Xtensa ISA are assignment ones that
contain large integer, floating-point or symbolic constants that would not
normally be allowed as immediate values by instructions in assembly code,
and will instead be translated by the assembler later rather the compiler,
into the L32R instructions referencing to literal pool entries containing
that values (see '-mauto-litpools' Xtensa-specific option).

This means that even though such instructions look like nothing more than
constant value assignments in their RTL representation, these may perform
better by treating them as loads from memory (i.e. the actual behavior)
and also trying to avoid using the value immediately after the load,
especially from an instruction scheduling perspective.

gcc/ChangeLog:

* config/xtensa/xtensa.md
(movsi_internal, movhi_internal, movsf_internal):
Change the value of the "type" attribute from "move" to "load"
when the source operand constraint is "Y".

middle-end: Enable masked load with non-constant offset

The function `vect_check_gather_scatter` requires the `base` of the load
to be loop-invariant and the `off`set to be not loop-invariant. When faced
with a scenario where `base` is not loop-invariant, instead of giving up
immediately we can try swapping the `base` and `off`, if `off` is
actually loop-invariant.

Previously, it would only swap if `off` was the constant zero (and so
trivially loop-invariant). This is too conservative: we can still
perform the swap if `off` is a more complex but still loop-invariant
expression, such as a variable defined outside of the loop.

This allows loops like the function below to be vectorised, if the
target has masked loads and sufficiently large vector registers (eg
`-march=armv8-a+sve -msve-vector-bits=128`):

```c
typedef struct Array {
    int elems[3];
} Array;

int loop(Array **pp, int len, int idx) {
    int nRet = 0;

    for (int i = 0; i < len; i++) {
        Array *p = pp[i];
        if (p) {
            nRet += p->elems[idx];
        }
    }

    return nRet;
}
```

gcc/ChangeLog:
* tree-vect-data-refs.cc (vect_check_gather_scatter): Swap
`base` and `off` in more scenarios. Also assert at the end of
the function that `base` and `off` are loop-invariant and not
loop-invariant respectively.

gcc/testsuite/ChangeLog:
* gcc.target/aarch64/sve/mask_load_2.c: Update tests.

AArch64: precommit test for masked load vectorisation.

Commit the test file `mask_load_2.c` before the vectorisation analysis
is changed, so that the changes in codegen are more obvious in the next
commit.

gcc/testsuite/ChangeLog:
* gcc.target/aarch64/sve/mask_load_2.c: New test.

libstdc++: Make testsuite_iterators constexpr and expand inplace_vector tests [PR119137]

All functions in testsuite_iterators.h are now marked constexpr,
targeting the earliest possible standard. Most functions use C++14 due
to multi-statement bodies, with exceptions:

* BoundsContainer and some constructors are C++11 compatible.
* OutputContainer is C++20 due to operator new/delete usage.

Before C++23, each constexpr templated function requires a constexpr
-suitable instantiation. Functions delegating to _GLIBCXX14_CONSTEXPR
must also be _GLIBCXX14_CONSTEXPR; e.g., forward_iterator_wrapper's
constructor calling input_iterator_wrapper's constructor, or
operator-> calling operator*.

For classes defined C++20 or later (e.g., test_range), constexpr is
applied unconditionally.

PR libstdc++/119137

libstdc++-v3/ChangeLog:

* testsuite/23_containers/inplace_vector/cons/from_range.cc: Run
iterators and range test at compile-time.
* testsuite/23_containers/inplace_vector/modifiers/assign.cc:
Likewise.
* testsuite/23_containers/inplace_vector/modifiers/multi_insert.cc:
Likewise.
* testsuite/util/testsuite_iterators.h (__gnu_test::BoundsContainer)
(__gnu_test::OutputContainer, __gnu_test::WritableObject)
(__gnu_test::output_iterator_wrapper, __gnu_test::input_iterator_wrapper)
(__gnu_test::forward_iterator_wrapper)
(__gnu_test::bidirectional_iterator_wrapper)
(__gnu_test::random_access_iterator_wrapper)
(__gnu_test::test_container): Add appropriate _GLIBCXXNN_CONSTEXPR
macros to member functions.
(__gnu_test::contiguous_iterator_wrapper)
(__gnu_test::input_iterator_wrapper_rval)
(__gnu_test::test_range, __gnu_test::test_range_nocopy)
(__gnu_test::test_sized_range_sized_sent)
(__gnu_test::test_sized_range): Add constexpr specifier to member
functions.

[RISC-V] Restrict generic-vector-ooo DFA

So while debugging Austin's work to support the spacemit x60 in the BPI we
found that even though his pipeline description had mappings for all the vector
instructions, they were still getting matched by the generic-vector-ooo DFA.

The core problem is that DFA never restricted itself to a tune option (oops).
That's easily fixed, at which time everything using generic blows up because we
don't have a generic in-order vector DFA. Everything using generic was
indirectly also using generic-vector-ooo for the vector instructions.

It may be better long term to define a generic-vector DFA, but to preserve
behavior, I'm letting generic-vector-ooo match when the generic DFA is active.

Tested in my tester, waiting on pre-commit CI before moving forward.

gcc/
* config/riscv/generic-vector-ooo.md: Restrict insn reservations to
generic_ooo and generic tuning models.

tree-optimization/121202 - fix vector stmt placement

When we have a vector shift with a scalar the shift operand can be
external - in that case we should not use the shift operand def
as hint where to place the vector shift instruction. The ICE
in the PR is because stmt dominance queries only work inside of
the vector region. But we should also never place stmts outside
of it.

PR tree-optimization/121202
* tree-vect-slp.cc (vect_schedule_slp_node): Do not take
an out-of-region stmt as "last".

* gcc.dg/pr121202.c: New testcase.

genpreds.cc: Do not use rawmemchr for insn_constraint_len

The GNU extension rawmemchr cannot be used. Therefore, replace it by a
simple loop.

gcc/ChangeLog:

* genpreds.cc (write_insn_constraint_len): Replace rawmemchr by
a loop.

ada: Nested use_type_clause with "all" cancels use_type_clause with wider scope

The compiler mishandles nested use_type_clauses in the case where the
outer one is a normal use_type_clause and the inner one has "all".
Upon leaving the scope of the inner use_type_clause, the outer one
is effectively disabled, because it's not considered redundant (and
in fact it's only partially redundant). This is fixed by testing for
the presence of a use_type_clause for the same type that has a wider
scope when ending the inner use_type_clause.

gcc/ada/ChangeLog:

* sem_ch8.adb (End_Use_Type): Add a test for there not being an earlier
use_type_clause for the same type as an additional criterion for turning
off In_Use and Current_Use_Clause.

ada: Only fold array attributes in SPARK when prefix is safe to evaluate

Fix missing checks for prefixes of array attributes in GNATprove mode.

gcc/ada/ChangeLog:

* sem_attr.adb (Eval_Attribute): Only fold array attributes when prefix
is static or at least safe to evaluate

ada: Fix minor issues in comments

gcc/ada/ChangeLog:

* einfo.ads (Is_Controlled_Active): Fix pasto in comment.
* sem_util.ads (Propagate_Controlled_Flags): Update comment for
Destructor aspect.

ada: Add destructors extension

This patch adds a GNAT-specific extension which enables "destructors".
Destructors are an optional replacement for Ada.Finalization where some
aspects of the interaction with type derivation are different.

gcc/ada/ChangeLog:

* doc/gnat_rm/gnat_language_extensions.rst: Document new extension.
* snames.ads-tmpl: Add name for new aspect.
* gen_il-fields.ads (Has_Destructor, Is_Destructor): Add new fields.
* gen_il-gen-gen_entities.adb (E_Procedure, Type_Kind): Add new fields.
* einfo.ads (Has_Destructor, Is_Destructor): Document new fields.
* aspects.ads: Add new aspect.
* sem_ch13.adb (Analyze_Aspect_Specifications,
Check_Aspect_At_Freeze_Point, Check_Aspect_At_End_Of_Declarations):
Add semantic analysis for new aspect.
(Resolve_Finalization_Procedure): New function.
(Resolve_Finalizable_Argument): Use new function above.
* sem_util.adb (Propagate_Controlled_Flags): Extend for new field.
* freeze.adb (Freeze_Entity): Add legality check for new aspect.
* exp_ch3.adb (Expand_Freeze_Record_Type, Predefined_Primitive_Bodies):
Use new field.
* exp_ch7.adb (Build_Finalize_Statements): Add expansion for
destructors.
(Make_Final_Call, Build_Record_Deep_Procs): Adapt to new Has_Destructor
field.
(Build_Adjust_Statements): Tweak to handle cases of empty lists.
* gnat_rm.texi: Regenerate.

ada: Fix crash when creating extra formals for aliased types

This patch makes sure that we return the same decision for all aliased
types when checking if the BIP task extra actuals are needed.

gcc/ada/ChangeLog:

* sem_ch6.adb (Might_Need_BIP_Task_Actuals): Before retrieving the original corresponding
operation we retrieve first the root of the aliased chain.

ada: Fix generation of Initialize and Adjust calls

Before this patch, Make_Init_Call and Make_Adjust_Call made the
assumption that if the type they were called with was untagged and a
derived type, it was the untagged private view of a tagged type. That
assumption made it possible to inspect the root type's primitives to
handle the case where the underlying type was implicitly generated by
the compiler without all inherited primitives.

The introduction of the Finalizable aspect broke that assumption, so
this patch adds a new field to type entities that make the generated
full view stand out, and updates Make_Init_Call and Make_Adjust_Call to
only jump to the root type when they're passed one of those generated
types.

Make_Final_Call and Finalize_Address are two other subprograms that
perform the same test on the types they're passed. They did not suffer
from the same bug as Make_Init_Call and Make_Adjust_Call because of an
earlier, more ad hoc fix, but this patch switches them over to the newly
introduced mechanism for the sake of consistency.

gcc/ada/ChangeLog:

* gen_il-fields.ads (Is_Implicit_Full_View): New field.
* gen_il-gen-gen_entities.adb (Type_Kind): Use new field.
* einfo.ads (Is_Implicit_Full_View): Document new field.
* exp_ch7.adb (Make_Adjust_Call, Make_Init_Call, Make_Final_Call): Use
new field.
* exp_util.adb (Finalize_Address): Likewise.
* sem_ch3.adb (Copy_And_Build): Set new field.

ada: Remove obsolete code from Safe_Unchecked_Type_Conversion

That's a kludge added to work around the limitations of the stack checking
mechanism used in the early days.

gcc/ada/ChangeLog:

* exp_util.ads (May_Generate_Large_Temp): Delete.
* exp_util.adb (May_Generate_Large_Temp): Likewise.
(Safe_Unchecked_Type_Conversion): Do not take stack checking into
account to compute the result.

ada: Wrong dispatch on result in presence of dependent expression

The compiler generates wrong code in a dispatching call on result
when the call is performed under dependent conditional expressions
or case-expressions.

gcc/ada/ChangeLog:

* sinfo.ads (Is_Expanded_Dispatching_Call): New flag.
(Tag_Propagated): New flag.
* exp_ch6.adb (Expand_Call_Helper): Propagate the tag when
the dispatching call is placed in conditionl expressions or
case-expressions.
* sem_ch5.adb (Analyze_Assignment): For assignment of tag-
indeterminate expression, do not propagate the tag if
previously done.
* sem_disp.adb (Is_Tag_Indeterminate): Add missing support
for conditional expression and case expression.
* exp_disp.ads (Is_Expanded_Dispatching_Call): Removed. Function
replaced by a new flag in the nodes.
* exp_disp.adb (Expand_Dispatching_Call): Set a flag in the
call node to remember that the call has been expanded.
(Is_Expanded_Dispatching_Call): Function removed.
* gen_il-fields.ads (Tag_Propagated): New flag.
(Is_Expanded_Dispatching_Call): New flag.
* gen_il-gen-gen_nodes.adb (Tag_Propagated): New flag.
(Is_Expanded_Dispatching_Call): New flag.

ada: Additional condition for Capacity discriminant on bounded container aggregates

This change test an additional condition as part of the criteria used
for deciding whether to generate a call to a container type's Length
function (for passing to the Empty function) when determining the
size of the object to allocate for a bounded container aggregate
with a "for of" iterator.

An update is also made to function Empty in Ada.Containers.Bounded_Hash_Maps,
adding a default to the formal Capacity, to make it consistent with other
bounded containers (and to make it conformant with the Ada RM).

gcc/ada/ChangeLog:

* libgnat/a-cbhama.ads (Empty): Add missing default to Capacity formal.
* libgnat/a-cbhama.adb (Empty): Add missing default to Capacity formal.
* exp_aggr.adb (Build_Size_Expr): Test for presence of Capacity
discriminant as additional criterion for generating the call to
the Length function. Update comments.

ada: Fix assertion failure on aggregate with controlled component

The assertion is:

      pragma Assert (Side_Effect_Free (L));

in Make_Tag_Ctrl_Assignment and demonstrates that the sequence:

  Remove_Side_Effects (L);
  pragma Assert (Side_Effect_Free (L));

does not hold in this case.

What happens is that Remove_Side_Effects uses a renaming to remove the side
effects of L but, at the end, the renamed object is substituted back for the
renamed object in the node by Expand_Renaming, which is invoked because the
Is_Renaming_Of_Object flag is set on the renaming after Evaluate_Name has
been invoked on its Name.

This is a general discrepancy between Evaluate_Name and Side_Effect_Free of
Exp_Util, coming from the call to Safe_Unchecked_Type_Conversion present in
Side_Effect_Free in this case.  The long term goal is probably to remove the
call but, in the meantime, this change is sufficient to fix the failure.

gcc/ada/ChangeLog:

* exp_util.adb (Safe_Unchecked_Type_Conversion): Always return True
if the expression is the prefix of an N_Selected_Component.

ada: Fix unnecessary extra RE_Activation_Chain_Access with No_Task_Parts

This patch checks the presence of No_Task_Parts on any ancestor or
inherited interface, not only its root type, since No_Task_Parts
prohibits tasking for any of its descendant. In case the current
subprogram is overridden/inherited, we need to return the same value
we would return for the original corresponding operation. The aspect
No_Task_Parts is nonoverridable and applies also when specified in a
partial view.

gcc/ada/ChangeLog:

* sem_ch6.adb (Might_Need_BIP_Task_Actuals): Check whether No_Task_Parts is enabled in any
of the derived types, or interfaces, from the user-defined primitive return type.
* sem_ch13.adb (Analyze_Aspect_Specifications): Add No_Task_Parts and No_Controlled_Parts to
the representation chain to be visible in the full view of private types.
* aspects.ads (Nonoverridable_Aspect_Id): As per GNAT RM, No_Task_Parts is nonoverridable.
* sem_util.adb (Check_Inherited_Nonoverridable_Aspects): Likewise.
* sem_util.ads: Fix typo and style.
* sem_disp.adb: Missing comment.

ada: Adding support to defer the addition of extra formals

Add support to create the extra formals when the underlying type
of some formal type or return type of a subprogram, subprogram type
or entry is not available when the entity is frozen. For example,
when a function that returns a private type is frozen before the
full-view of its private type is analyzed.

gcc/ada/ChangeLog:

* einfo.ads (Extra_Formals): Complete documentation.
(Has_First_Controlling_Parameter_Aspect): Place it in alphabetical order.
(Has_Frozen_Extra_Formals): New attribute.
* gen_il-fields.ads (Has_Frozen_Extra_Formals): New entity field.
* gen_il-gen-gen_entities.adb (Has_Frozen_Extra_Formals): Adding new
entity flag to subprograms, subprogram types, and and entries.
* gen_il-internals.adb (Image): Adding Has_Frozen_Extra_Formals.
* exp_ch3.adb (Build_Array_Init_Proc): Freeze its extra formals.
(Build_Init_Procedure): Freeze its extra formals.
(Expand_Freeze_Record_Type): For tagged types with foreign convention
create the extra formals of primitives with convention Ada.
* exp_ch6.ads (Create_Extra_Actuals): New subprogram.
* exp_ch6.adb (Check_BIP_Actuals): Adding assertions.
(Create_Extra_Actuals): New subprogram that factorizes code from
Expand_Call_Helper.
(Expand_Call_Helper): Adding support to defer the addition of extra
actuals. Move the code that adds the extra actuals to a new subprogram.
(Is_Unchecked_Union_Equality): Renamed as Is_Unchecked_Union_Predefined_
Equality_Call.
* exp_ch7.adb (Create_Finalizer): Freeze its extra formals.
(Wrap_Transient_Expression): Link the temporary with its relocated
expression to facilitate locating the expression in the expanded code.
* exp_ch9.ads (Expand_N_Entry_Declaration): Adding one formal.
* exp_ch9.adb (Expand_N_Entry_Declaration): Defer the expansion of
the entry if the extra formals are not available; analyze the built
declarations for the record type that holds all the parameters if
the expansion of the entry declaration was deferred.
* exp_disp.adb (Expand_Dispatching_Call): Handle deferred extra formals.
(Set_CPP_Constructors): Freeze its extra formals.
* freeze.adb (Freeze_Entity): Create the extra actuals of acccess to
subprograms whose designated type is a subprogram type.
(Freeze_Subprogram): Adjust assertion to support deferred extra formals,
and freeze extra formals of non-dispatching subprograms with foreign
convention. Added assertion to check matching of formals in thunks.
* sem_aux.adb (Get_Called_Entity): Adding documentation.
* sem_ch3.adb (Analyze_Full_Type_Declaration): Create the extra formals
of deferred subprograms, subprogram types and entries; create also the
extra actuals of deferred calls.
* sem_ch6.ads (Freeze_Extra_Formals): New subprogram.
(Deferred_Extra_Formals_Support): New package.
* sem_ch6.adb (Analyze_Subprogram_Body_Helper): Create the extra formals
of subprograms without separate spec.
(Add_Extra_Formal): Add documentation.
(Has_Extra_Formals): Removed.
(Parent_Subprogram): Adding documentation.
(Create_Extra_Formals): Defer adding extra formals if the underlying_type
of some formal type or return type is not available.
(Extra_Formals_Match_OK): Add missing check on the extra formals of
unchecked unions.
(Freeze_Extra_Formals): New subprogram.
(Deferred_Extra_Formals_Support): New package.
* sem_ch9.adb (Analyze_Entry_Declaration): Freeze its extra formals.
* sem_ch13.adb (New_Put_Image_Subprogram): ditto.
* sem_util.ads (Is_Unchecked_Union_Equality): New subprogram.
* sem_util.adb (Is_Unchecked_Union_Equality): ditto.

ada: Tune recent change for bit-packed arrays to help GNATprove backend

When GNAT is operating in GNATprove_Mode the Expander_Active flag is disabled,
but we still must do things that ordinary backends expect.

gcc/ada/ChangeLog:

* sem_util.adb (Get_Actual_Subtype): Do the same for GCC and GNATprove
backends.

ada: Expand continue procedure calls for GNATprove

Continue being a non-reserved keyword, occurrences of continue may
be resolved as procedure calls. Get that special case out of the
way for GNATprove, in anticipation of support for continue keyword.

gcc/ada/ChangeLog:

* exp_spark.adb (Expand_SPARK): Add expansion of continue statements.
(Expand_SPARK_N_Continue_Statement): Expand continue statements resolved
as procedure calls into said procedure calls.

ada: Tune check for restriction No_Relative_Delay and call to Set_Handler

When checking restriction No_Relative_Delay and detecting calls to
Ada.Real_Time.Timing_Events.Set_Handler with a Time_Span parameter,
we looked at the exact type of the actual parameter, while we should
look at its base type.

This patch looks at the type of actual parameter like it is done in
Expand_N_Delay_Until_Statement.

gcc/ada/ChangeLog:

* sem_res.adb (Resolve_Call): Look at the base type of actual parameter
when checking call to Set_Handler.

ada: Fix wrong indirect access to bit-packed array in iterated loop

This comes from a missing expansion of the bit-packed array reference in
the loop, because the actual subtype created for the dereference lacks a
Packed_Array_Impl_Type as it is ultimately created by the Preanalyze_Range
call present in Analyze_Loop_Statement.

gcc/ada/ChangeLog:

* sem_util.adb (Get_Actual_Subtype): Only create a new subtype when
the expander is active. Remove a useless test of type inequality,
as well as a useless call to Set_Has_Delayed_Freeze on the subtype.

ada: Replace "not Present" test with "No" test

Minor change to satisfy GNAT SAS checker.

gcc/ada/ChangeLog:

* exp_aggr.adb (Build_Size_Expr): Change test of "not Present (...)"
to "No (...)".

ada: Capacity determination for container aggregate with container iterator

In the case of a container aggregate that has a container_element_association
given by an iterator_specification that iterates over a container object
(for example, "[for E of V => E]"), the compiler will now determine the
number of elements in the object and can use that in determining the capacity
value to be passed to the container type's Empty function when allocating
space for the aggregate object. This implementation-dependent behavior
is allowed by RM22 4.3.5(40/5).

Prior to this enhancement, the compiler would generally use the Empty
function's default value for the Capacity parameter (a value of just
10 in the current implementation of the predefined containers), which
could easily lead to Capacity_Error being raised for the aggregate.

Note that this is only done for aggregates of container types coming
from instantiations of the predefined container generics, and not for
user-defined container types (due to the special knowledge the compiler
has of the availability of Length functions for the predefined types).
Also, it currently only applies when the object V being iterated over
is a simple object, and is not done for more complex cases, such as
when V is a function call.

gcc/ada/ChangeLog:

* exp_aggr.adb (Build_Size_Expr): Determine the length of a container
aggregate association in the case where it's an iteration over an
object of a container type coming from an instantiation of a predefined
container generic. Minor updates to existing comments.

ada: exp_util.adb: prevent infinite loop in case of broken code

A recent commit modified exp_util.adb in order to fix the selection of
Finalize subprograms in the case of untagged objects.
This introduced regressions for GNATSAS in fixedbugs by causing
GNAT2SCIL to loop over the same type over and over in case of broken
code.
We fix this by simply checking that the loop is making progress, and if
it doesn't, assume that we're done.

gcc/ada/ChangeLog:

* exp_util.adb (Finalize_Address): Prevent infinite loop

ada: Add Unique_Component_Name function for use by CCG.

Define a new function which, initially, is never called.
It is intended to be called from CCG. If an Ada tagged record type
has a component named Foo, then the generated corresponding C struct
might have a component with the same name. This approach almost works,
but breaks down in the (rare) case of an Ada record type where two or more
components have the same name (this is normally illegal, but is possible in
the case of an extension where some component of the parent type is not
visible at the point of the extension). This new function is intended for
use in coping with this case.

gcc/ada/ChangeLog:

* sem_aux.ads: Declare new function Unique_Component_Name.

* sem_aux.adb: Implement new function Unique_Component_Name.

ada: Ensure Expression_Copy has a parent before analysis

Some analysis requires going up the parent chain to get the
relevant context. Ensure that is done for the Expression_Copy
node which is not a syntactic node.

gcc/ada/ChangeLog:

* sem_ch13.adb (Check_Aspect_At_End_Of_Declarations):
Ensure the Expression_Copy always has a parent before
calling any analyze.

ada: Improved support for mutably tagged types

Fix bugs related to mutably tagged types in streaming operations, Put_Image
attributes, aggregates, composite equality comparisons with mutably-tagged
components, and other issues.

gcc/ada/ChangeLog:

* exp_aggr.adb (Build_Record_Aggr_Code.Gen_Assign): In the case of
an aggregate component where the component type is mutably tagged
and the component value is provided by a qualified aggregate (and
qualified with a specific type), avoid incorrectly rejecting the
inner aggregate for violating the rule that the type of an
aggregate shall not be class-wide.
* exp_attr.adb: For a predefined streaming operation (i.e., Read,
Write, Input, or Output) of a class-wide type, the external name
of the tag of the value is normally written out by Output and read
in by Input. In the case of a mutably tagged type, this is instead
done in Write and Read.
* exp_ch4.adb (Expand_Composite_Equality): In the case of an
equality comparison for a type having a mutably tagged component,
we want the component comparison to compare two values of the
mutably tagged type, not two values of the corresponding
array-of-bytes-ish representation type. Even if there are no
user-defined equality functions anywhere in sight, comparing the
array values still doesn't work because undefined bits may end up
participating in the comparison (resulting in an incorrect result
of False).
* exp_put_image.adb: In the case of a class-wide type, the
predefined Image attribute includes the name of the specific type
(and a "'" character, to follow qualified expression syntax) to
indicate the tag of the value. With the introduction of mutably
tagged types, this case can now arise in the case of a component
(of either an enclosing array or an enclosing record), not just
for a top-level object. So we factor the code to do this into a
new procedure, Put_Specific_Type_Name_Qualifier, so that it can be
called from more than one place. This reorganization also involves
replacing the procedure Put_String_Exp with a new procedure,
Put_String_Exp_To_Buffer, declared in a less nested scope. For
mutably tagged components (at the source level) the component type
(at the GNAT tree level) is an array of bytes (actually a two
field record containing an array of bytes, but that's a detail).
Appropriate conversions need to be generated so that we don't end
up generating an image for an array of bytes; this is done at the
same places where Put_Specific_Type_Name_Qualifier is called
(for components) by calling Make_Mutably_Tagged_Conversion.
* exp_strm.adb (Make_Field_Attribute): Add
Make_Mutably_Tagged_Conversion call where we construct a
Selected_Component node and the corresponding component type is
the internal representation type for a mutably tagged type.
(Stream_Base_Type): Return the mutably
tagged type if given the corresponding internal representation type.
* sem_ch3.adb (Array_Type_Declaration): In the case where the
source-level component type of an array type is mutably tagged,
set the Component_Type field of the base type of the declared
array type (as opposed to that of the first subtype of the array
type) to the corresponding internal representation type.
* sem_ch4.adb (Analyze_Selected_Component): In the case of a
selected component name which references a component whose type is
the internal representation type of a mutably tagged type,
generate a conversion to the mutably tagged type.

libstdc++: Fix obvious mistake in inplace_vector::assign_range [PR119137]

In case of input iterators, the loop that assigns to existing elements
should run up to number of elements in vector (_M_size) not capacity (_Nm).

PR libstdc++/119137

libstdc++-v3/ChangeLog:

* include/std/inplace_vector (inplace_vector::assign_range):
Replace _Nm with _M_size in the assigment loop.

Fix gcc.dg/vect/slp-28.c

gcc.dg/vect/slp-28.c is now vectorized as expected even on targets
without vect32.

* gcc.dg/vect/slp-28.c: Adjust.

Daily bump.

[RISC-V] Add missing insn types to xiangshan.md and mips-p8700.md

This is a trivial patch to add a few missing types to pipeline models that are
mostly complete.

In particular this adds the "ghost" to mips-p8700.md and the "sf_vc" and
"sf_vc_se" types to xiangshan.md.

There are definitely some bigger issues to solve in this space. But this is a
trivial fix that stands on its own.

I've tested this in my tester, just waiting for pre-commit CI to do its thing.

gcc/
* config/riscv/mips-p8700.md: Add support for "ghost" insn types.
* config/riscv/xiangshan.md: Add support for "sf_vc" and "sf_vc_se"
insn types.

Ada: Fix wrong tag in style check warnings

This fixes an old issue whereby violations of the style check -gnatyc are
sometimes reported as violations of -gnatyt instead.

gcc/ada/
PR ada/121184
* styleg.adb (Check_Comment): Use consistent warning message.

cobol: Improved linemap and diagnostic handling; PIC validation. [PR120402]

Implementation of PICTURE string validation for PR120402.  Expanded some printf
format attributes.  Improved debugging and diagnostic messages.  Improved
linemap and line location tracking in support of diagnostic messages and
location_t tagging of GENERIC nodes for improved GDB-COBOL performance.
Assorted changes to eliminate cppcheck warnings.

Co-Authored-By: James K. Lowden <jklowden@cobolworx.com>
Co-Authored-By: Robert Dubner <rdubner@symas.com>
gcc/cobol/ChangeLog:

PR cobol/120402
* Make-lang.in: Elminate commented-out scripting.
* cbldiag.h (_CBLDIAG_H): Change #if 0 to #if GCOBOL_GETENV
(warn_msg): Add printf attributes.
(location_dump): Add debugging message.
* cdf.y: Improved linemap tracking.
* genapi.cc (treeplet_fill_source): const attribute for formal parameter.
(insert_nop): Created to consolidate var_decl_nop writes.
(build_main_that_calls_something): Move generation to the end of executable.
(level_88_helper): Formatting.
(parser_call_targets_dump): Formatting.
(function_pointer_from_name): const attribute for formal parameter.
(parser_initialize_programs): const attribute for formal parameter.
(parser_statement_begin): Improved linemap handling.
(section_label):  Improved linemap handling.
(paragraph_label): Improved linemap handling.
(pseudo_return_pop): Improved linemap handling.
(leave_procedure): Formatting.
(parser_enter_section):  Improved linemap handling.
(parser_enter_paragraph): Improved linemap handling.
(parser_perform): Formatting.
(parser_leave_file): Move creation of main() to this routine.
(parser_enter_program): Move creation of main from here to leave_file.
(parser_accept): Formatting. const attribute for formal parameter.
(parser_accept_command_line): const attribute for formal parameter.
(parser_accept_command_line_count): const attribute for formal parameter.
(parser_accept_envar): Likewise.
(parser_set_envar): Likewise.
(parser_display): Likewise.
(get_exhibit_name): Implement EXHIBIT verb.
(parser_exhibit): Likewise.
(parser_sleep): const attribute for formal parameter.
(parser_division): Improved linemap handling.
(parser_classify): const attribute for formal parameter.
(create_iline_address_pairs): Improved linemap handling.
(parser_perform_start): Likewise.
(perform_inline_until): Likewise.
(perform_inline_testbefore_varying): Likewise.
(parser_perform_until): Likewise.
(parser_perform_inline_times): Likewise.
(parser_intrinsic_subst): const attribute for formal parameter.
(parser_file_merge): Formatting.
(create_and_call): Improved linemap handling.
(mh_identical): const attribute for formal parameter.
(mh_numeric_display): const attribute for formal parameter.
(mh_little_endian): Likewise.
(mh_source_is_group): Likewise.
(psa_FldLiteralA): Formatting.
* genapi.h (parser_accept): const attribute for formal parameter.
(parser_accept_envar): Likewise.
(parser_set_envar): Likewise.
(parser_accept_command_line): Likewise.
(parser_accept_command_line_count): Likewise.
(parser_add): Likewise.
(parser_classify): Likewise.
(parser_sleep): Likewise.
(parser_exhibit): Likewise.
(parser_display): Likewise.
(parser_initialize_programs): Likewise.
(parser_intrinsic_subst): Likewise.
* gengen.cc (gg_assign): Improved linemap handling.
(gg_add_field_to_structure): Likewise.
(gg_define_from_declaration): Likewise.
(gg_build_relational_expression): Likewise.
(gg_goto_label_decl): Likewise.
(gg_goto): Likewise.
(gg_printf): Likewise.
(gg_fprintf): Likewise.
(gg_memset): Likewise.
(gg_memchr): Likewise.
(gg_memcpy): Likewise.
(gg_memmove): Likewise.
(gg_strcpy): Likewise.
(gg_strcmp): Likewise.
(gg_strncmp): Likewise.
(gg_return): Likewise.
(chain_parameter_to_function): Likewise.
(gg_define_function): Likewise.
(gg_get_function_decl): Likewise.
(gg_call_expr): Likewise.
(gg_call): Likewise.
(gg_call_expr_list): Likewise.
(gg_exit): Likewise.
(gg_abort): Likewise.
(gg_strlen): Likewise.
(gg_strdup): Likewise.
(gg_malloc): Likewise.
(gg_realloc): Likewise.
(gg_free): Likewise.
(gg_set_current_line_number): Likewise.
(gg_get_current_line_number): Likewise.
(gg_insert_into_assembler): Likewise.
(token_location_override): Likewise.
(gg_token_location): Likewise.
* gengen.h (location_from_lineno): Likewise.
(gg_set_current_line_number): Likewise.
(gg_get_current_line_number): Likewise.
(gg_token_location): Likewise.
(current_token_location): Likewise.
(current_location_minus_one): Likewise.
(current_location_minus_one_clear): Likewise.
(token_location_override): Likewise.
* genmath.cc (fast_divide):  const attribute for formal parameter.
* genutil.cc (get_and_check_refstart_and_reflen): Likewise.
(get_data_offset): Likewise.
(refer_refmod_length): Likewise.
(refer_offset): Likewise.
(refer_size): Likewise.
(refer_size_dest): Likewise.
(refer_size_source): Likewise.
(qualified_data_location): Likewise.
* genutil.h (refer_offset): Likewise.
(refer_size_source): Likewise.
(refer_size_dest): Likewise.
(qualified_data_location): Likewise.
* parse.y: EVALUATE token; Implement EXHIBIT verb;
Improved linemap handling.
* parse_ante.h (input_file_status_notify): Improved linemap handling.
(location_set): Likewise.
* scan.l: PICTURE string validation.
* scan_ante.h (class picture_t): PICTURE string validation.
(validate_picture): Likewise.
* symbols.cc (symbol_currency): Revised default currency handling.
* symbols.h (symbol_currency): Likewise.
* util.cc (location_from_lineno): Improved linemap handling.
(current_token_location): Improved linemap handling.
(current_location_minus_one): Improved linemap handling.
(current_location_minus_one_clear): Improved linemap handling.
(gcc_location_set_impl): Improved linemap handling.
(warn_msg): Improved linemap handling.
* util.h (cobol_lineno): Improved linemap handling.

match: Add `cmp - 1` simplification to `-icmp` [PR110949]

I have seen this a few places though the testcase from PR 95906
is an obvious place where this shows up for sure.
This convert `cmp - 1` into `-icmp` as that form is more useful
in many cases.

Changes since v1:
* v2: Add check for outer type's precision being greater than 1.

Bootstrapped and tested on x86_64-linux-gnu.

PR tree-optimization/110949
PR tree-optimization/95906

gcc/ChangeLog:

* match.pd (cmp - 1): New pattern.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/cmp-2.c: New test.
* gcc.dg/tree-ssa/max-bitcmp-1.c: New test.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

libstdc++: Make the default ctor of mdspan conditionally noexcept.

Previously, the default ctor of mdspan was never noexcept, even if all
members of mdspan were nothrow default constructible.

This commit makes mdspan conditionally nothrow default constructible.
A similar strengthening happens in libc++.

libstdc++-v3/ChangeLog:

* include/std/mdspan (mdspan::mdspan): Make default ctor
conditionally noexcept.
* testsuite/23_containers/mdspan/mdspan.cc: Add tests.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com>

libstdc++: Strengthen exception guarantee for mdspan methods.

The mdspan::is_{,always}_{unique,strided,exhaustive} methods only call
their counterparts in mdspan::mapping_type. The standard specifies that
the methods of mdspan::mapping_type are noexcept, but doesn't specify if
the methods of mdspan are noexcept.

Libc++ strengthened the exception guarantee for these mdspan methods.
This commit conditionally strengthens these methods for libstdc++.

libstdc++-v3/ChangeLog:

* include/std/mdspan (mdspan::is_always_unique): Make
conditionally noexcept.
(mdspan::is_always_exhaustive): Ditto.
(mdspan::is_always_strided): Ditto.
(mdspan::is_unique): Ditto.
(mdspan::is_exhaustive): Ditto.
(mdspan::is_strided): Ditto.
* testsuite/23_containers/mdspan/layout_like.h: Make noexcept
configurable. Add ThrowingLayout.
* testsuite/23_containers/mdspan/mdspan.cc: Add tests for
noexcept.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com>

[RISC-V] Fix wrong CFA during stack probe

temp1 is used by the probe loop for the step size, but we need the final
address of the stack after the loop which resides in temp2.

PR target/121121
* config/riscv/riscv.cc (riscv_allocate_and_probe_stack_space):
Use temp2 instead of temp1 for the CFA note.

RISC-V: Add test for vec_duplicate + vaaddu.vv combine for DImode

Add asm dump check and run test for vec_duplicate + vaaddu.vv
combine to vaaddu.vx, with the GR2VR cost is 0, 1, 2 and 15 for
the case 0 and case 1.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u64.c: Add asm check.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vaadd-run-1-u64.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Allow VLS DImode for sat_op vx DImode pattern

When try to introduce the vaaddu.vx combine for DImode, we will meet
ICE like below:

0x4889763 internal_error(char const*, ...)
        .../riscv-gnu-toolchain/gcc/__build__/../gcc/diagnostic-global-context.cc:517
0x4842f98 fancy_abort(char const*, int, char const*)
        .../riscv-gnu-toolchain/gcc/__build__/../gcc/diagnostic.cc:1818
0x2953461 code_for_pred_scalar(int, machine_mode)
        ./insn-opinit.h:1911
0x295f300
riscv_vector::sat_op<110>::expand(riscv_vector::function_expander&) const
        .../riscv-gnu-toolchain/gcc/__build__/../gcc/config/riscv/riscv-vector-builtins-bases.cc:667
0x294bce1 riscv_vector::function_expander::expand()

We will have code_for_nothing when emit the vaadd.vx insn for V2DI vls
mode.  So allow the VLS mode for the sat_op vx pattern to unblock it.

gcc/ChangeLog:

* config/riscv/vector.md: Allow VLS DImode for sat_op vx pattern.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Add test for vec_duplicate + vaaddu.vv combine case 1 with GR2VR cost 0, 1 and 2 for QI, HI and SI mode

Add asm dump check test for vec_duplicate + vaaddu.vv combine to
vaaddu.vx, with the GR2VR cost is 0, 1 and 2. Please note DImode
is not included.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u16.c: Add asm check.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u8.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Add test for vec_duplicate + vaaddu.vv combine case 0 with GR2VR cost 0, 2 and 15 for QI, HI and SI mode

Add asm dump check and run test for vec_duplicate + vaaddu.vv
combine to vaaddu.vx, with the GR2VR cost is 0, 2 and 15. Please
note DImode is not included here.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u16.c: Add asm check.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_binary.h: Add test
helper macros.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_binary_data.h: Add test
data for run test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vaadd-run-1-u16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vaadd-run-1-u32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vaadd-run-1-u8.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Combine vec_duplicate + vaaddu.vv to vaaddu.vx on GR2VR cost for HI, QI and SI mode

This patch would like to combine the vec_duplicate + vaaddu.vv to the
vaaddu.vx.  From example as below code.  The related pattern will depend
on the cost of vec_duplicate from GR2VR.  Then the late-combine will
take action if the cost of GR2VR is zero, and reject the combination
if the GR2VR cost is greater than zero.

Assume we have example code like below, GR2VR cost is 0.

  #define DEF_AVG_FLOOR(NT, WT)        \
  NT                                   \
  test_##NT##_avg_floor(NT x, NT y)    \
  {                                    \
    return (NT)(((WT)x + (WT)y) >> 1); \
  }

  #define AVG_FLOOR_FUNC(T)      test_##T##_avg_floor

  DEF_AVG_FLOOR(uint32_t, uint64_t)
  DEF_VX_BINARY_CASE_2_WRAP(T, AVG_FLOOR_FUNC(T), sat_add)

Before this patch:
  11   │     beq a3,zero,.L8
  12   │     vsetvli a5,zero,e32,m1,ta,ma
  13   │     vmv.v.x v2,a2
  14   │     slli    a3,a3,32
  15   │     srli    a3,a3,32
  16   │ .L3:
  17   │     vsetvli a5,a3,e32,m1,ta,ma
  18   │     vle32.v v1,0(a1)
  19   │     slli    a4,a5,2
  20   │     sub a3,a3,a5
  21   │     add a1,a1,a4
  22   │     vaaddu.vv v1,v1,v2
  23   │     vse32.v v1,0(a0)
  24   │     add a0,a0,a4
  25   │     bne a3,zero,.L3

After this patch:
  11   │     beq a3,zero,.L8
  12   │     slli    a3,a3,32
  13   │     srli    a3,a3,32
  14   │ .L3:
  15   │     vsetvli a5,a3,e32,m1,ta,ma
  16   │     vle32.v v1,0(a1)
  17   │     slli    a4,a5,2
  18   │     sub a3,a3,a5
  19   │     add a1,a1,a4
  20   │     vaaddu.vx v1,v1,a2
  21   │     vse32.v v1,0(a0)
  22   │     add a0,a0,a4
  23   │     bne a3,zero,.L3

gcc/ChangeLog:

* config/riscv/autovec-opt.md (*uavg_floor_vx_<mode>): Add
pattern for vaaddu.vx combine.
* config/riscv/riscv.cc (get_vector_binary_rtx_cost): Add UNSPEC
handling for UNSPEC_VAADDU.
(riscv_rtx_costs): Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>

aarch64: Avoid INS-(W|X)ZR instructions when optimising for speed

For inserting zero into a vector lane we usually use an instruction like:
ins     v0.h[2], wzr

This, however, has not-so-great performance on some CPUs.
On Grace, for example it has a latency of 5 and throughput 1.
The alternative sequence:
movi    v31.8b, #0
ins     v0.h[2], v31.h[0]
is prefereble bcause the MOVI-0 is often a zero-latency operation that is
eliminated by the CPU frontend and the lane-to-lane INS has a latency of 2 and
throughput of 4.
We can avoid the merging of the two instructions into the aarch64_simd_vec_set_zero<mode>
by disabling that pattern when optimizing for speed.

Thanks to wider benchmarking from Tamar, it makes sense to make this change for
all tunings, so no RTX costs or tuning flags are introduced to control this
in a more fine-grained manner.  They can be easily added in the future if needed
for a particular CPU.

Bootstrapped and tested on aarch64-none-linux-gnu.

Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com>
gcc/

* config/aarch64/aarch64-simd.md (aarch64_simd_vec_set_zero<mode>):
Enable only when optimizing for size.

gcc/testsuite/

* gcc.target/aarch64/simd/mf8_data_1.c (test_set_lane4,
test_setq_lane4): Relax allowed assembly.
* gcc.target/aarch64/vec-set-zero.c: Use -Os in flags.
* gcc.target/aarch64/inszero_split_1.c: New test.