git.ipfire.org Git - thirdparty/gcc.git/log

libstdc++: Cleanup and stabilize format _Spec<_CharT> and _Pres_type.

These patch makes following changes to _Pres_type values:
* _Pres_esc is replaced with separate _M_debug flag.
* _Pres_s, _Pres_p do not overlap with _Pres_none.
* hexadecimal presentation use same values for pointer, integer
   and floating point types.

The members of _Spec<_CharT> are rearranged so the class contains 8 bits
reserved for future use (_M_reserved) and 8 bits of tail padding.
Derived classes (like _ChronoSpec<_CharT>) can reuse the storage for initial
members. We also add _SpecBase as the base class for _Spec<_CharT> to make
it non-C++98 POD, which allows tail padding to be reused on Itanium ABI.

Finally, the format enumerators are defined as enum class with unsigned
char as underlying type, followed by using enum to bring names in scope.
_Term_char names are adjusted for consistency, and enumerator values are
changed so it can fit in smaller bitfields.

The '?' is changed to separate _M_debug flag, to allow debug format to be
independent from the presentation type, and applied to multiple presentation
types. For example it could be used to trigger memberwise or reflection based
formatting.

The _M_format_character and _M_format_character_escaped functions are merged
to single function that handle normal and debug presentation. In particular
this would allow future support for '?c' for printing integer types as escaped
character. _S_character_width is also folded in the merged function.

Decoupling _Pres_s value from _Pres_none, allows it to be used for string
presentation for range formatting, and removes the need for separate _Pres_seq
and _Pres_str. This does not affect formatting of bool as __formatter_int::_M_parse
overrides default value of _M_type. And with separation of the _M_debug flag,
__formatter_str::format behavior is now agnostic to _M_type value.

The values for integer presentation types, are arranged so textual presentations
(_Prec_s, _Pres_c) are grouped together. For consistency floating point
hexadecimal presentation uses the same values as integer ones.

New _Pres_p and setting for _M_alt enables using some spec to configure formatting
of  uintptr_t with __formatter_int, and const void* with __formatter_ptr.
Differentiating it from _Pres_none would allow future of formatter<T*, _CharT>
that would require explicit presentation type to be specified. This would allow
std::vector<T*> to be formatted directly with '{::p}' format spec.

The constructors for __formatter_int and _formatter_ptr from _Spec<_CharT>,
now also set default presentation modes, as format functions expects them.

libstdc++-v3/ChangeLog:

* include/bits/chrono_io.h (_ChronoSpec::_M_locale_specific):
Declare as bit fiekd in tail-padding..
* include/bits/formatfwd.h (__format::_Align): Defined as enum
class and add using enum.
* include/std/format (__format::_Pres_type, __format::_Sign)
(__format::_WidthPrec,  __format::_Arg_t): Defined as enum class
and add using enum.
(_Pres_type::_Pres_esc): Replace with _Pres_max.
(_Pres_type::_Pres_seq, _Pres_type::_Pres_str): Remove.
(__format::_Pres_type): Updated values of enumerators as described
above.
(__format::_Spec): Rearranged members to have 8 bits of tail-padding.
(_Spec::_M_debug): Defined.
(_Spec::_M_reserved): Extended to 8 bits and moved at the end.
(_Spec::_M_reserved2): Removed.
(_Spec::_M_parse_fill_and_align, _Spec::_M_parse_sign)
(__format::__write_padded_as_spec): Adjusted default value checks.
(__format::_Term_char): Add using enum and adjust enumertors.
(__Escapes::_S_term): Adjusted for _Term_char values.
(__format::__should_escape_ascii): Adjusted _Term_char uses.
(__format::__write_escaped): Adjusted for _Term_char.
(__formatter_str::parse): Set _Pres_s if specifed and _M_debug
instead of _Pres_esc.
(__formatter_str::set_debug_format): Set _M_debug instead of
_Pres_esc.
(__formatter_str::format, __formatter_str::_M_format_range):
Check _M_debug instead of _Prec_esc.
(__formatter_str::_M_format_escaped): Adjusted _Term_char uses.
(__formatter_int::__formatter_int(_Spec<_CharT>)): Set _Pres_d if
default presentation type is not set.
(__formatter_int::_M_parse): Adjusted default value checks.
(__formatter_int::_M_do_parse): Set _M_debug instead of _Pres_esc.
(__formatter_int::_M_format_character): Handle escaped presentation.
(__formatter_int::_M_format_character_escaped)
(__formatter_int::_S_character_width): Merged into
_M_format_character.
(__formatter_ptr::__formatter_ptr(_Spec<_CharT>)): Set _Pres_p if
default presentation type is not set.
(__formatter_ptr::parse): Add default __type parameter, store _Pres_p,
and handle _M_alt to be consistent with meaning for integers.
(__foramtter_ptr<_CharT>::_M_set_default): Define.
(__format::__pack_arg_types, std::basic_format_args): Add necessary
casts.
(formatter<_CharT, _CharT>::set_debug_format)
(formatter<char, wchar_t>::set_debug_format): Set _M_debug instead of
_Pres_esc.
(formatter<_CharT, _CharT>::format, formatter<char, wchar_t>::format):
Simplify calls to _M_format_character.
(range_formatter<_Rg, _CharT>::parse): Replace _Pres_str with
_Pres_s and set _M_debug instead of _Pres_esc.
(range_formatter<_Rg, _CharT>::format): Replace _Pres_str with
_Pres_s.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>

libstdc++: Fix incorrect links to archived SGI STL docs

In r8-7777-g25949ee33201f2 I updated some URLs to point to copies of the
SGI STL docs in the Wayback Machine, because the original pags were no
longer hosted on sgi.com. However, I incorrectly assumed that if one
archived page was at https://web.archive.org/web/20171225062613/... then
all the other pages would be too. Apparently that's not how the Wayback
Machine works, and each page is archived on a different date. That meant
that some of our links were redirecting to archived copies of the
announcement that the SGI STL docs have gone away.

This fixes each URL to refer to a correctly archived copy of the
original docs.

libstdc++-v3/ChangeLog:

* doc/xml/faq.xml: Update URL for archived SGI STL docs.
* doc/xml/manual/containers.xml: Likewise.
* doc/xml/manual/extensions.xml: Likewise.
* doc/xml/manual/using.xml: Likewise.
* doc/xml/manual/utilities.xml: Likewise.
* doc/html/*: Regenerate.

libgcc: Move bitint support exports to x86/aarch64 specific map files

When adding _BitInt support I was hoping all or most of arches would
implement it already for GCC 14.  That didn't happen and with
new hosts adding support for _BitInt for GCC 16 (s390x-linux and as was
posted today loongarch-linux too), we need the _BitInt support functions
exported on those arches at GCC_16.0.0 rather than GCC_14.0.0 which
shouldn't be changed anymore.

The following patch does that.  Both arches were already exporting
some of the _BitInt related symbols in their specific map files, this
just moves the remaining ones there as well.

2025-05-20  Jakub Jelinek  <jakub@redhat.com>

* libgcc-std.ver.in (GCC_14.0.0): Remove bitint related exports
from here.
* config/i386/libgcc-glibc.ver (GCC_14.0.0): Add them here.
* config/i386/libgcc-darwin.ver (GCC_14.0.0): Likewise.
* config/i386/libgcc-sol2.ver (GCC_14.0.0): Likewise.
* config/aarch64/libgcc-softfp.ver (GCC_14.0.0): Likewise.

tree-chrec: Use signed_type_for in convert_affine_scev

On s390x-linux I've run into the gcc.dg/torture/bitint-27.c test ICEing in
build_nonstandard_integer_type called from convert_affine_scev (not sure
why it doesn't trigger on x86_64/aarch64).
The problem is clear, when ct is a BITINT_TYPE with some large
TYPE_PRECISION, build_nonstandard_integer_type won't really work on it.

The patch fixes it similarly what has been done for GCC 14 in various
other spots.

2025-05-20 Jakub Jelinek <jakub@redhat.com>

* tree-chrec.cc (convert_affine_scev): Use signed_type_for instead of
build_nonstandard_integer_type.

libgcc: Small bitint_reduce_prec big-endian fixes

The big-endian _BitInt support in libgcc was written without any
testing and so I haven't discovered I've made one mistake in it
(in multiple places).
The bitint_reduce_prec function attempts to optimize inputs
which have some larger precision but at runtime they are found
to need smaller number of limbs.
For little-endian that is handled just by returning smaller
precision (or negative precision for signed), but for
big-endian we need to adjust the passed in limb pointer so that
when it returns smaller precision the argument still contains
the least significant limbs for the returned precision.

2025-05-20 Jakub Jelinek <jakub@redhat.com>

* libgcc2.c (bitint_reduce_prec): For big endian
__LIBGCC_BITINT_ORDER__ use ++*p and --*p instead of
++p and --p.
* soft-fp/bitint.h (bitint_reduce_prec): Likewise.

bitintlower: Big-endian lowering support

The following patch adds big endian support to the bitintlower pass.
While the rest of the _BitInt support has been written with endianity
in mind, in the bitintlower pass I've written it solely little endian
at the start, because the pass is large and complicated and there were
no big-endian backends with _BitInt psABI at the point of writing it,
so the big-endian support would be completely untested.
Now that I got privately a patch to add s390x support, I went through
the whole pass and added the support.
Some months ago I've talked about two possibilities to do the big-endian
support, one perhaps easier would be keep the idx vars (INTEGER_CSTs
for bitint_prec_large and partially SSA_NAMEs, partially INTEGER_CSTs
for bitint_prec_huge) iterating like for little-endian from 0 upwards
and do the big-endian index correction only when accessing the limbs
(but mergeable casts between _BitInts with different number of limbs
would be a nightmare), which would have the disadvantage that we'd need
to wait until propagation and ivopts to fix stuff up (and not sure it
would be able to fix everything), or change stuff so that the idxes
used between the different bitint_large_huge class methods iterate on
big endian from highest down to 0.
The following patch implements the latter.
On s390x with the 3 patches from IBM without this patch I got on
make -j32 -k check-gcc GCC_TEST_RUN_EXPENSIVE=1 RUNTESTFLAGS="GCC_TEST_RUN_EXPENSIVE=1 dg.exp='*bitint* pr112673.c builtin-stdc-bit-*.c pr112566-2.c pr112511.c pr116588.c pr116003.c
+pr113693.c pr113602.c flex-array-counted-by-7.c' dg-torture.exp='*bitint* pr116480-2.c pr114312.c pr114121.c' dfp.exp=*bitint* vect.exp='vect-early-break_99-pr113287.c'
+tree-ssa.exp=pr113735.c"
347 FAILs, 326 out of that execution failures (and that doesn't include
some tests that happened to succeed by pure luck because e.g. comparisons
weren't implemented correctly).
With this patch (and 2 small patches I'm going to post next) I got this
down to
FAIL: gcc.dg/dfp/bitint-1.c (test for excess errors)
FAIL: gcc.dg/dfp/bitint-2.c (test for excess errors)
FAIL: gcc.dg/dfp/bitint-3.c (test for excess errors)
FAIL: gcc.dg/dfp/bitint-4.c (test for excess errors)
FAIL: gcc.dg/dfp/bitint-5.c (test for excess errors)
FAIL: gcc.dg/dfp/bitint-6.c (test for excess errors)
FAIL: gcc.dg/dfp/bitint-8.c (test for excess errors)
FAIL: gcc.dg/torture/bitint-64.c   -O3 -g  execution test
FAIL: gcc.dg/torture/bitint-64.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  execution test
where the dfp stuff is due to missing DPD dfp <-> _BitInt support
I'm working on next, and bitint-64.c is some expansion related
issue with _Atomic _BitInt(5) (will look at it later, but there
bitint lowering isn't involved at all).
Most of the patch just tweaks things so that it iterates in the
right direction, for casts with different number of limbs does the
needed index adjustments and unfortunately due to that (and e.g.
add/sub/mul overflow BE lowering) has some pessimizations on the
SSA conflict side; on little-endian mergeable ops have the
advantage that all the accesses iterate from index 0 up, so even
if there is e.g. overlap between the lhs and some used values, except
for mul/div where libgcc APIs require no overlap we don't need to
avoid it, all the limbs are updated in sync before going on to handle
next limb.  On big-endian, that isn't the case, casts etc. can result
in index adjustments and so we could overwrite a limb that will still
need to be processed as input.  So, there is a special case that looks
for different numbers of limbs among arguments and in that case marks
the lhs as conflicting with the inputs.

On little-endian, this patch shouldn't affect code generation, with
one little exception; in the separate_ext handling in lower_mergeable_stmt
the loop (if bitint_large_huge) was iterating using some idx and then
if bo_idx was non-zero, adding that constant to a new SSA_NAME and using
that to do the limb accesses.  As the limb accesses are the only place
where the idx is used (apart from the loop exit test), I've changed it
to iterate on idxes with bo_idx already added to those.

P.S., would be nice to eventually also enable big-endian aarch64,
but I don't have access to those, so can't test that myself.

P.S., at least in the current s390x patches it wants info->extended_p,
this patch doesn't change anything about that.  I believe most of the
time the _BitInt vars/params/return values are already extended, but
there is no testcase coverage for that, I will work on that incrementally
(and then perhaps arm 32-bit _BitInt support can be enabled too).

2025-05-20  Jakub Jelinek  <jakub@redhat.com>

* gimple-lower-bitint.cc (bitint_big_endian): New variable.
(bitint_precision_kind): Set it.
(struct bitint_large_huge): Add unsigned argument to
finish_arith_overflow.
(bitint_large_huge::limb_access_type): Handle bitint_big_endian.
(bitint_large_huge::handle_operand): Likewise.
(bitint_large_huge::handle_cast): Likewise.
(bitint_large_huge::handle_bit_field_ref): Likewise.
(bitint_large_huge::handle_load): Likewise.
(bitint_large_huge::lower_shift_stmt): Likewise.
(bitint_large_huge::finish_arith_overflow): Likewise.
Add nelts argument.
(bitint_large_huge::lower_addsub_overflow): Handle bitint_big_endian.
Adjust finish_arith_overflow caller.
(bitint_large_huge::lower_mul_overflow): Likewise.
(bitint_large_huge::lower_bit_query): Handle bitint_big_endian.
(bitint_large_huge::lower_stmt): Likewise.
(build_bitint_stmt_ssa_conflicts): Likewise.
(gimple_lower_bitint): Likewise.

* gcc.dg/torture/bitint-78.c: New test.
* gcc.dg/torture/bitint-79.c: New test.
* gcc.dg/torture/bitint-80.c: New test.
* gcc.dg/torture/bitint-81.c: New test.

c++/modules: Ensure vtables are emitted when needed [PR120349]

I missed a testcase in r16-688-gc875748cdc468e for whether a GM vtable
should be emitted in an importer when it has no non-inline key function.
Before that patch the code worked because always we marked all vtables
as DECL_EXTERNAL, which then meant that reading the definition marked
them as DECL_NOT_REALLY_EXTERN.

This patch restores the old behaviour so that vtables are marked
DECL_EXTERNAL (and hence DECL_NOT_REALLY_EXTERN).

PR c++/120349

gcc/cp/ChangeLog:

* module.cc (trees_out::core_bools): Always mark vtables as
DECL_EXTERNAL.

gcc/testsuite/ChangeLog:

* g++.dg/modules/vtt-3_a.C: New test.
* g++.dg/modules/vtt-3_b.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>

[RISC-V] Avoid multiple assignments to output object

This is the next batch of changes to reduce multiple assignments to an output
object.  This time I'm focused on splitters in bitmanip.md.

This doesn't convert every case.  For example there is one case that is very
clearly dependent on eliminating mvconst_internal and adjustment of a splitter
for andn and until those things happen it would clearly be a QOI implementation
regression.

There are cases where we set a scratch register more than once.  It may be
possible to use an additional scratch.  I haven't tried that yet.

I've seen one failure to if-convert a sequence after this patch, but it should
be resolved once the logical AND changes are merged.  Otherwise I'm primarily
seeing slight differences in register allocation and scheduling.  Nothing
concerning to me.

This has run through my tester, but I obviously want to see how it behaves in
the upstream CI system as that tests slightly different multilibs than mine (on
purpose).

gcc/

* config/riscv/bitmanip.md (various splits): Avoid writing the output
more than once when trivially possible.

c++/modules: Fix ICE on merge of instantiation with partial spec [PR120013]

When we import a pending instantiation that matches an existing partial
specialisation, we don't find the slot in the entity map because for
partial specialisations we register the TEMPLATE_DECL but for normal
implicit instantiations we instead register the inner TYPE_DECL.

Because the DECL_MODULE_ENTITY_P flag is set we correctly realise that
it is in the entity map, but ICE when attempting to use that slot in
partition handling.

This patch fixes the issue by detecting this case and instead looking
for the slot for the TEMPLATE_DECL. It doesn't matter that we never add
a slot for the inner decl because we're about to discard it anyway.

PR c++/120013

gcc/cp/ChangeLog:

* module.cc (trees_in::install_entity): Handle re-registering
the inner TYPE_DECL of a partial specialisation.

gcc/testsuite/ChangeLog:

* g++.dg/modules/partial-8.h: New test.
* g++.dg/modules/partial-8_a.C: New test.
* g++.dg/modules/partial-8_b.C: New test.
* g++.dg/modules/partial-8_c.C: New test.
* g++.dg/modules/partial-8_d.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>

c++/modules: Always mark tinfo vars as TREE_ADDRESSABLE [PR120350]

We need to mark type info decls as addressable if we take them by
reference; this is done by walking the declaration during parsing and
marking the decl as needed.

However, with modules we don't stream tinfo decls directly; rather we
stream just their name and type and reconstruct them in the importer
directly. This means that any addressable flags are not propagated, and
we error because TREE_ADDRESSABLE is not set despite taking its address.

But tinfo decls should always have TREE_ADDRESSABLE set, as any attempt
to use the tinfo decl will go through build_address anyway. So this
patch fixes the issue by eagerly marking the constructed decl as
TREE_ADDRESSABLE so that modules gets this flag correctly set as well.

PR c++/120350

gcc/cp/ChangeLog:

* rtti.cc (get_tinfo_decl_direct): Mark TREE_ADDRESSABLE.

gcc/testsuite/ChangeLog:

* g++.dg/modules/tinfo-3_a.H: New test.
* g++.dg/modules/tinfo-3_b.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>

Extend vect_recog_cond_expr_convert_pattern to handle REAL_CST

REAL_CST is handled if it can be represented in different floating
point types without loss of precision or under fast math.

gcc/ChangeLog:

PR tree-optimization/103771
* match.pd (cond_expr_convert_p): Extend the match to handle
REAL_CST.
* tree-vect-patterns.cc
(vect_recog_cond_expr_convert_pattern): Handle REAL_CST.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr103771-5.c: New test.
* gcc.target/i386/pr103771-6.c: New test.

RISC-V: Tweak the asm check test of vx combine on GR2VR cost [NFC]

Tweak the asm check with define T uint8_t for adding more
vx test easily, as well as less possibility to make mistake.

The below test suites are passed for this patch.
* The rv64gcv fully regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Extract
define T as type for testing.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u8.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Add test for vec_duplicate + vrsub.vv combine case 1 with GR2VR cost 2

Add asm dump check test for vec_duplicate + vrsub.vv combine to vrsub.vx.

The below test suites are passed for this patch.
* The rv64gcv fully regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i16.c: Add asm check
for vrsub with GR2VR cost 2.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u8.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Add test for vec_duplicate + vrsub.vv combine case 1 with GR2VR cost 1

Add asm dump check test for vec_duplicate + vrsub.vv combine to vrsub.vx.

The below test suites are passed for this patch.
* The rv64gcv fully regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i16.c: Add asm check
for vrsub with GR2VR cost 1.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u8.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Add test for vec_duplicate + vrsub.vv combine case 1 with GR2VR cost 0

Add asm dump check test for vec_duplicate + vrsub.vv combine to vrsub.vx.

The below test suites are passed for this patch.
* The rv64gcv fully regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add asm check
for vrsub case 1 with GR2VR cost 0.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u8.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Add test for vec_duplicate + vrsub.vv combine case 0 with GR2VR cost 15

Add asm dump check test for vec_duplicate + vrsub.vv combine to vrsub.vx.

The below test suites are passed for this patch.
* The rv64gcv fully regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i16.c: Add asm check
for vrsub with GR2VR cost is 15.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u8.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Add test for vec_duplicate + vrsub.vv combine case 0 with GR2VR cost 1

Add asm dump check test for vec_duplicate + vrsub.vv combine to vrsub.vx

The below test suites are passed for this patch.
* The rv64gcv fully regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i16.c: Add vrsub asm
dump check.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u8.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Add test for vec_duplicate + vrsub.vv combine case 0 with GR2VR cost 0

Add asm dump check and run test for vec_duplicate + vrsub.vv combine to vrsub.vx.

The below test suites are passed for this patch.
* The rv64gcv fully regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add vrsub asm check.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_binary.h: Add test helper
macros for vx binary reversed.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_binary_data.h: Add test
data for vrsub.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vrsub-run-1-i16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vrsub-run-1-i32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vrsub-run-1-i64.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vrsub-run-1-i8.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vrsub-run-1-u16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vrsub-run-1-u32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vrsub-run-1-u64.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vrsub-run-1-u8.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Combine vec_duplicate + vrsub.vv to vrsub.vx on GR2VR cost

This patch would like to combine the vec_duplicate + vrub.vv to the
vrsub.vx.  From example as below code.  The related pattern will depend
on the cost of vec_duplicate from GR2VR.  Then the late-combine will
take action if the cost of GR2VR is zero, and reject the combination
if the GR2VR cost is greater than zero.

Assume we have example code like below, GR2VR cost is 0.

  #define DEF_VX_BINARY_REVERSE_CASE_0(T, OP, NAME)                   \
  void                                                                \
  test_vx_binary_reverse_##NAME##_##T##_case_0 (T * restrict out,     \
                                                T * restrict in, T x, \
                                                unsigned n)           \
  {                                                                   \
    for (unsigned i = 0; i < n; i++)                                  \
      out[i] = x OP in[i];                                            \
  }

  DEF_VX_BINARY_REVERSE_CASE_0(int32_t, -)

Before this patch:
  54   │ test_vx_binary_reverse_rsub_int32_t_case_0:
  55   │     beq a3,zero,.L27
  56   │     vsetvli a5,zero,e32,m1,ta,ma
  57   │     vmv.v.x v2,a2
  58   │     slli    a3,a3,32
  59   │     srli    a3,a3,32
  60   │ .L22:
  61   │     vsetvli a5,a3,e32,m1,ta,ma
  62   │     vle32.v v1,0(a1)
  63   │     slli    a4,a5,2
  64   │     sub a3,a3,a5
  65   │     add a1,a1,a4
  66   │     vsub.vv v1,v2,v1
  67   │     vse32.v v1,0(a0)
  68   │     add a0,a0,a4
  69   │     bne a3,zero,.L22

After this patch:
  50   │ test_vx_binary_reverse_rsub_int32_t_case_0:
  51   │     beq a3,zero,.L27
  52   │     slli    a3,a3,32
  53   │     srli    a3,a3,32
  54   │ .L22:
  55   │     vsetvli a5,a3,e32,m1,ta,ma
  56   │     vle32.v v1,0(a1)
  57   │     slli    a4,a5,2
  58   │     sub a3,a3,a5
  59   │     add a1,a1,a4
  60   │     vrsub.vx    v1,v1,a2
  61   │     vse32.v v1,0(a0)
  62   │     add a0,a0,a4
  63   │     bne a3,zero,.L22

The below test suites are passed for this patch.
* The rv64gcv fully regression test.

gcc/ChangeLog:

* config/riscv/autovec-opt.md: Leverage the new add func to
expand the vx insn.
* config/riscv/riscv-protos.h (expand_vx_binary_vec_dup_vec): Add
new func decl to expand format v = vop(vec_dup(x), v).
(expand_vx_binary_vec_vec_dup): Diito but for format
v = vop(v, vec_dup(x)).
* config/riscv/riscv-v.cc (expand_vx_binary_vec_dup_vec): Add new
func impl to expand vx for v = vop(vec_dup(x), v).
(expand_vx_binary_vec_vec_dup): Diito but for another format
v = vop(v, vec_dup(x)).

Signed-off-by: Pan Li <pan2.li@intel.com>

Daily bump.

[committed][RISC-V][PR target/120333] Remove bogus bext pattern

I goof'd when doing analysis of missed bext cases.  For the shift into the sign
bit, then shift into the low bit case (thankfully the least common), I got it
in my brain that the field is at the left shift count.   It's actually at
word_size - 1 - left shift count.

One the subtraction is included, it's no longer profitable to turn those cases
into bext.  Best case scenario would be sub+bext, but we can just as easily use
sll+srl which fuses in some designs into a single op.

So this patch removes those two patterns, adjusts the existing testcase and
adds the new execution test.

Given it's a partial reversion and has passed in my tester, I'm going to go
ahead and push it to the trunk rather than waiting for upstream CI.

PR target/120333
gcc/
* config/riscv/bitmanip.md: Remove bext formed from left+right
shift patterns.

gcc/testsuite/

* gcc.target/riscv/pr114512.c: Update expected output.
* gcc.target/riscv/pr120333.c: New test.

hpux: Fix detection of atomic support when profiling

The pa target lacks atomic sync compare and swap instructions.
These are implemented as libcalls and in libatomic. As on linux,
we lie about their availability.

This fixes the gcov-30.c test on hppa64-hpux11.

2025-05-19 John David Anglin <danglin@gcc.gnu.org>

gcc/ChangeLog:

* config/pa/pa-hpux.h (TARGET_HAVE_LIBATOMIC): Define.
(HAVE_sync_compare_and_swapqi): Likewise.
(HAVE_sync_compare_and_swaphi): Likewise.
(HAVE_sync_compare_and_swapsi): Likewise.
(HAVE_sync_compare_and_swapdi): Likewise.

'TYPE_EMPTY_P' vs. code offloading [PR120308]

We've got 'gcc/stor-layout.cc:finalize_type_size':

/* Handle empty records as per the x86-64 psABI. */
TYPE_EMPTY_P (type) = targetm.calls.empty_record_p (type);

(Indeed x86_64 is still the only target to define 'TARGET_EMPTY_RECORD_P',
calling 'gcc/tree.cc-default_is_empty_record'.)

And so it happens that for an empty struct used in code offloaded from x86_64
host (but not powerpc64le host, for example), we get to see 'TYPE_EMPTY_P' in
offloading compilation (where the offload targets (currently?) don't use it
themselves, and therefore aren't prepared to handle it).

For nvptx offloading compilation, this causes wrong code generation:
'ptxas [...] error : Call has wrong number of parameters', as nvptx code
generation for function definition doesn't pay attention to this flag (say, in
'gcc/config/nvptx/nvptx.cc:pass_in_memory', or whereever else would be
appropriate to handle that), but the generic code 'gcc/calls.cc:expand_call'
via 'gcc/function.cc:aggregate_value_p' does pay attention to it, and we thus
get mismatching function definition vs. function call.

This issue apparently isn't a problem for GCN offloading, but I don't know if
that's by design or by accident.

Richard Biener:
> It looks like TYPE_EMPTY_P is only used during RTL expansion for ABI
> purposes, so computing it during layout_type is premature as shown here.
>
> I would suggest to simply re-compute it at offload stream-in time.

(For avoidance of doubt, the additions to 'gcc.target/nvptx/abi-struct-arg.c',
'gcc.target/nvptx/abi-struct-ret.c' are not dependent on the offload streaming
code changes, but are just to mirror the changes to
'libgomp.oacc-c-c++-common/abi-struct-1.c'.)

PR lto/120308
gcc/
* lto-streamer-out.cc (hash_tree): Don't handle 'TYPE_EMPTY_P' for
'lto_stream_offload_p'.
* tree-streamer-in.cc (unpack_ts_type_common_value_fields):
Likewise.
* tree-streamer-out.cc (pack_ts_type_common_value_fields):
Likewise.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/abi-struct-1.c: Add empty
structure testing.
gcc/testsuite/
* gcc.target/nvptx/abi-struct-arg.c: Add empty structure testing.
* gcc.target/nvptx/abi-struct-ret.c: Likewise.

Add 'libgomp.c-c++-common/target-abi-struct-1-O0.c', 'libgomp.oacc-c-c++-common/abi-struct-1.c'

libgomp/
* testsuite/libgomp.c-c++-common/target-abi-struct-1-O0.c: New.
* testsuite/libgomp.oacc-c-c++-common/abi-struct-1.c: Likewise.

Fix libgomp.oacc-fortran/lib-13.f90 async bug

libgomp/
* testsuite/libgomp.oacc-fortran/lib-13.f90: End data region after
wait API calls.

[RISC-V] Fix false positive from Wuninitialized

As Mark and I independently tripped, there's a Wuninitialized issue in the
RISC-V backend. While *I* know the value would always be properly initialized,
it'd be somewhat painful to either eliminate the infeasible paths or do deep
enough analysis to suppress the false positive.

So this initializes OUTPUT and verifies it's got a reasonable value before
using it for the final copy into operands[0].

Bootstrapped on the BPI (regression testing still has ~12hrs to go).

gcc/
* config/riscv/riscv.cc (synthesize_ior_xor): Initialize OUTPUT and
verify it's non-null before emitting the final copy insn.

Fortran: fix FAIL of gfortran.dg/specifics_1.f90 after r16-372 [PR120099]

After commit r16-372, testcase gfortran.dg/specifics_1.f90 started to
FAIL at -O2 and higher, as DCE lead to elimination of evaluations of
Fortran specific intrinsics returning complex results and with -ff2c.
As the Fortran runtime library is compiled with -fno-f2c, the frontend
generates calls to wrapper subroutines _gfortran_f2c_specific_* that
return their result by reference via their first argument when this is
needed. This is e.g. the case when specific names of the intrinsics are
used for passing as actual argument to procedures. These wrappers are
not pure in the GCC IR sense, even if the Fortran intrinsics are.
Therefore gfc_return_by_reference must return true for these.

PR fortran/120099

gcc/fortran/ChangeLog:

* trans-types.cc (gfc_return_by_reference): Intrinsic functions
returning complex numbers may return their result by reference
with -ff2c.

arm: fully validate mem_noofs_operand [PR120351]

It's not enough to just check that a memory operand is of the form
mem(reg); after RA we also need to validate the register being used.
The safest way to do this is to call memory_operand.

PR target/120351

gcc/ChangeLog:

* config/arm/predicates.md (mem_noofs_operand): Also check the op
is a valid memory_operand.

gcc/testsuite/ChangeLog:

* gcc.target/arm/pr120351.c: New test.

libstdc++: Fix some Clang -Wsystem-headers warnings in <ranges>

libstdc++-v3/ChangeLog:

* include/std/ranges (_ZipTransform::operator()): Remove name of
unused parameter.
(chunk_view::_Iterator, stride_view::_Iterator): Likewise.
(join_with_view): Declare _Iterator and _Sentinel as class
instead of struct.
(repeat_view): Declare _Iterator as class instead of struct.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>

libstdc++: Fix std::format of chrono::local_days with {} [PR120293]

Formatting of chrono::local_days with an empty chrono-specs should be
equivalent to inserting it into an ostream, which should use the
overload for inserting chrono::sys_days into an ostream. The
implementation of empty chrono-specs in _M_format_to_ostream takes some
short cuts, and that wasn't being done correctly for chrono::local_days.

libstdc++-v3/ChangeLog:

PR libstdc++/120293
* include/bits/chrono_io.h (_M_format_to_ostream): Add special
case for local_time convertible to local_days.
* testsuite/std/time/clock/local/io.cc: Check formatting of
chrono::local_days.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>

RISC-V: Fix the warning of temporary object dangling references.

During the GCC compilation, some warnings about temporary object dangling
references emerged. They appeared in these code lines in riscv-common.cc:
const riscv_ext_info_t &implied_ext_info, const riscv_ext_info_t &ext_info = get_riscv_ext_info (ext) and auto &ext_info = get_riscv_ext_info (search_ext).
The issue arose because the local variable types were not used in a standardized
way, causing their references to dangle once the function ended.
To fix this, the patch changes the argument type of get_riscv_ext_info to
`const char *`, thereby eliminating the warnings.

Changes for v2:
- Change the argument type of get_riscv_ext_info to `const char *` to eliminate the warnings.

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc (get_riscv_ext_info): Fix argument type.
(riscv_subset_list::check_implied_ext): Type conversion.

RISC-V: Rename conflicting variables in gen-riscv-ext-texi.cc

The variables `major` and `minor` in `gen-riscv-ext-texi.cc`
conflict with the macros of the same name defined in `<sys/sysmacros.h>`,
which are exposed when building with newer versions of GCC on older
Linux distributions (e.g., Ubuntu 18.04). To resolve this, we rename them
to `major_version` and `minor_version` respectively. This aligns with the
GCC community's recommended practice [1] and improves code clarity.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2025-May/683881.html

gcc/ChangeLog:

* config/riscv/gen-riscv-ext-texi.cc (struct version_t):rename
major/minor to major_version/minor_version.

Signed-off-by: Songhe Zhu <zhusonghe@eswincomputing.com>

RISC-V: Support Zilsd code gen

This commit adds the code gen support for Zilsd, which is a
newly added extension for RISC-V. The Zilsd extension allows
for loading and storing 64-bit values using even-odd register
pairs.

We only try to do miminal code gen support for that, which means only
use the new instructions when the load store is 64 bits data, we can use
that to optimize the code gen of memcpy/memset/memmove and also the
prologue and epilogue of functions, but I think that probably should be
done in a follow up patch.

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_legitimize_move): Handle
load/store with odd-even reg pair.
(riscv_split_64bit_move_p): Don't split load/store if zilsd enabled.
(riscv_hard_regno_mode_ok): Only allow even reg can be used for
64 bits mode for zilsd.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zilsd-code-gen.c: New test.

regcprop: Return from copy_value for unordered modes

The ICE in PR120276 resulted from a comparison of VNx4QI and V8QI using
partial_subreg_p in the function copy_value during the RTL pass
regcprop, failing the assertion in

inline bool
partial_subreg_p (machine_mode outermode, machine_mode innermode)
{
  /* Modes involved in a subreg must be ordered.  In particular, we must
     always know at compile time whether the subreg is paradoxical.  */
  poly_int64 outer_prec = GET_MODE_PRECISION (outermode);
  poly_int64 inner_prec = GET_MODE_PRECISION (innermode);
  gcc_checking_assert (ordered_p (outer_prec, inner_prec));
  return maybe_lt (outer_prec, inner_prec);
}

Returning from the function if the modes are not ordered before reaching
the call to partial_subreg_p resolves the ICE and passes bootstrap and
testing without regression.
OK for mainline?

Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com>
gcc/
PR middle-end/120276
* regcprop.cc (copy_value): Return in case of unordered modes.

gcc/testsuite/
PR middle-end/120276
* gcc.dg/torture/pr120276.c: New test.

RISC-V: Add new operand constraint: cR

This commit introduces a new operand constraint `cR` for the RISC-V
architecture, which allows the use of an even-odd RVC general purpose register
(x8-x15) in inline asm.

Ref: https://github.com/riscv-non-isa/riscv-c-api-doc/pull/102

gcc/ChangeLog:

* config/riscv/constraints.md (cR): New constraint.
* doc/md.texi (Machine Constraints::RISC-V): Document the new cR
constraint.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/constraint-cR-pair.c: New test case.

i386: Combine AVX10.2 compile time test

Since AVX10.2 enables everything, there is no need to split testcases
for 256 and 512 bit size.

gcc/testsuite/ChangeLog:

* gcc.target/i386/avx10_2-512-bf16-1.c: Removed and combined ...
* gcc.target/i386/avx10_2-bf16-1.c: ... to this.
* gcc.target/i386/avx10_2-512-bf16-vector-cmp-1.c: Removed and
combined ...
* gcc.target/i386/avx10_2-bf16-vector-cmp-1.c:... to this.
* gcc.target/i386/avx10_2-512-bf16-vector-fma-1.c: Removed and
combined ...
* gcc.target/i386/avx10_2-bf16-vector-fma-1.c:... to this.
* gcc.target/i386/avx10_2-512-bf16-vector-operations-1.c: Removed
and combined ...
* gcc.target/i386/avx10_2-bf16-vector-operations-1.c:... to this.
* gcc.target/i386/avx10_2-512-bf16-vector-smaxmin-1.c: Removed
and combined ...
* gcc.target/i386/avx10_2-bf16-vector-smaxmin-1.c:... to this.
* gcc.target/i386/avx10_2-512-convert-1.c: Removed and combined ...
* gcc.target/i386/avx10_2-convert-1.c:... to this.
* gcc.target/i386/avx10_2-512-media-1.c: Removed and combined ...
* gcc.target/i386/avx10_2-media-1.c:... to this.
* gcc.target/i386/avx10_2-512-minmax-1.c: Removed and combined ...
* gcc.target/i386/avx10_2-minmax-1.c:... to this.
* gcc.target/i386/avx10_2-512-movrs-1.c: Removed and combined ...
* gcc.target/i386/avx10_2-movrs-1.c:... to this.
* gcc.target/i386/avx10_2-512-satcvt-1.c: Removed and combined ...
* gcc.target/i386/avx10_2-satcvt-1.c:... to this.
* gcc.target/i386/sm4-avx10_2-512-1.c: Move to...
* gcc.target/i386/sm4-avx10_2-1b.c: ...here.

i386: Refactor AVX10.2 runtime test

Since everything is under avx10.2, we could use a header
file plus a file actually run all the tests for runtime
test.

gcc/testsuite/ChangeLog:

* gcc.target/i386/avx10-check.h: Remove AVX10_512BIT.
* gcc.target/i386/avx10-minmax-helper.h: Ditto.
* gcc.target/i386/avx10_2-vaddbf16-2.c: Add 512 test.
* gcc.target/i386/avx10_2-vcmpbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvt2ph2bf8-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvt2ph2bf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvt2ph2hf8-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvt2ph2hf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvt2ps2phx-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvtbf162ibs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvtbf162iubs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvtbiasph2bf8-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvtbiasph2bf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvtbiasph2hf8-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvtbiasph2hf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvthf82ph-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvtph2bf8-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvtph2bf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvtph2hf8-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvtph2hf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvtph2ibs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvtph2iubs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvtps2ibs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvtps2iubs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvttbf162ibs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvttbf162iubs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvttpd2dqs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvttpd2qqs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvttpd2udqs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvttpd2uqqs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvttph2ibs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvttph2iubs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvttps2dqs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvttps2ibs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvttps2iubs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvttps2qqs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvttps2udqs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvttps2uqqs-2.c: Ditto.
* gcc.target/i386/avx10_2-vdivbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vdpphps-2.c: Ditto.
* gcc.target/i386/avx10_2-vfmaddXXXbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vfmsubXXXbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vfnmaddXXXbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vfnmsubXXXbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vfpclassbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vgetexpbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vgetmantbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vmaxbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vminbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vminmaxbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vminmaxpd-2.c: Ditto.
* gcc.target/i386/avx10_2-vminmaxph-2.c: Ditto.
* gcc.target/i386/avx10_2-vminmaxps-2.c: Ditto.
* gcc.target/i386/avx10_2-vmpsadbw-2.c: Ditto.
* gcc.target/i386/avx10_2-vmulbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vpdpbssd-2.c: Ditto.
* gcc.target/i386/avx10_2-vpdpbssds-2.c: Ditto.
* gcc.target/i386/avx10_2-vpdpbsud-2.c: Ditto.
* gcc.target/i386/avx10_2-vpdpbsuds-2.c: Ditto.
* gcc.target/i386/avx10_2-vpdpbuud-2.c: Ditto.
* gcc.target/i386/avx10_2-vpdpbuuds-2.c: Ditto.
* gcc.target/i386/avx10_2-vpdpwsud-2.c: Ditto.
* gcc.target/i386/avx10_2-vpdpwsuds-2.c: Ditto.
* gcc.target/i386/avx10_2-vpdpwusd-2.c: Ditto.
* gcc.target/i386/avx10_2-vpdpwusds-2.c: Ditto.
* gcc.target/i386/avx10_2-vpdpwuud-2.c: Ditto.
* gcc.target/i386/avx10_2-vpdpwuuds-2.c: Ditto.
* gcc.target/i386/avx10_2-vrcpbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vreducebf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vrndscalebf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vrsqrtbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vscalefbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vsqrtbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vsubbf16-2.c: Ditto.
* gcc.target/i386/avx512f-helper.h: Remove AVX10_512BIT.
* gcc.target/i386/sm4-check.h: Use AVX10_2.
* gcc.target/i386/avx10_2-512-vaddbf16-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vaddbf16-2.h: ...here.
* gcc.target/i386/avx10_2-512-vcmpbf16-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vcmpbf16-2.h: ...here.
* gcc.target/i386/avx10_2-512-vcvt2ph2bf8-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vcvt2ph2bf8-2.h: ...here.
* gcc.target/i386/avx10_2-512-vcvt2ph2bf8s-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vcvt2ph2bf8s-2.h: ...here.
* gcc.target/i386/avx10_2-512-vcvt2ph2hf8-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vcvt2ph2hf8-2.h: ...here.
* gcc.target/i386/avx10_2-512-vcvt2ph2hf8s-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vcvt2ph2hf8s-2.h: ...here.
* gcc.target/i386/avx10_2-512-vcvt2ps2phx-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vcvt2ps2phx-2.h: ...here.
* gcc.target/i386/avx10_2-512-vcvtbf162ibs-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vcvtbf162ibs-2.h: ...here.
* gcc.target/i386/avx10_2-512-vcvtbf162iubs-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vcvtbf162iubs-2.h: ...here.
* gcc.target/i386/avx10_2-512-vcvtbiasph2bf8-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vcvtbiasph2bf8-2.h: ...here.
* gcc.target/i386/avx10_2-512-vcvtbiasph2bf8s-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vcvtbiasph2bf8s-2.h: ...here.
* gcc.target/i386/avx10_2-512-vcvtbiasph2hf8-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vcvtbiasph2hf8-2.h: ...here.
* gcc.target/i386/avx10_2-512-vcvtbiasph2hf8s-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vcvtbiasph2hf8s-2.h: ...here.
* gcc.target/i386/avx10_2-512-vcvthf82ph-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vcvthf82ph-2.h: ...here.
* gcc.target/i386/avx10_2-512-vcvtph2bf8-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vcvtph2bf8-2.h: ...here.
* gcc.target/i386/avx10_2-512-vcvtph2bf8s-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vcvtph2bf8s-2.h: ...here.
* gcc.target/i386/avx10_2-512-vcvtph2hf8-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vcvtph2hf8-2.h: ...here.
* gcc.target/i386/avx10_2-512-vcvtph2hf8s-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vcvtph2hf8s-2.h: ...here.
* gcc.target/i386/avx10_2-512-vcvtph2ibs-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vcvtph2ibs-2.h: ...here.
* gcc.target/i386/avx10_2-512-vcvtph2iubs-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vcvtph2iubs-2.h: ...here.
* gcc.target/i386/avx10_2-512-vcvtps2ibs-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vcvtps2ibs-2.h: ...here.
* gcc.target/i386/avx10_2-512-vcvtps2iubs-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vcvtps2iubs-2.h: ...here.
* gcc.target/i386/avx10_2-512-vcvttbf162ibs-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vcvttbf162ibs-2.h: ...here.
* gcc.target/i386/avx10_2-512-vcvttbf162iubs-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vcvttbf162iubs-2.h: ...here.
* gcc.target/i386/avx10_2-512-vcvttpd2dqs-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vcvttpd2dqs-2.h: ...here.
* gcc.target/i386/avx10_2-512-vcvttpd2qqs-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vcvttpd2qqs-2.h: ...here.
* gcc.target/i386/avx10_2-512-vcvttpd2udqs-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vcvttpd2udqs-2.h: ...here.
* gcc.target/i386/avx10_2-512-vcvttpd2uqqs-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vcvttpd2uqqs-2.h: ...here.
* gcc.target/i386/avx10_2-512-vcvttph2ibs-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vcvttph2ibs-2.h: ...here.
* gcc.target/i386/avx10_2-512-vcvttph2iubs-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vcvttph2iubs-2.h: ...here.
* gcc.target/i386/avx10_2-512-vcvttps2dqs-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vcvttps2dqs-2.h: ...here.
* gcc.target/i386/avx10_2-512-vcvttps2ibs-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vcvttps2ibs-2.h: ...here.
* gcc.target/i386/avx10_2-512-vcvttps2iubs-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vcvttps2iubs-2.h: ...here.
* gcc.target/i386/avx10_2-512-vcvttps2qqs-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vcvttps2qqs-2.h: ...here.
* gcc.target/i386/avx10_2-512-vcvttps2udqs-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vcvttps2udqs-2.h: ...here.
* gcc.target/i386/avx10_2-512-vcvttps2uqqs-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vcvttps2uqqs-2.h: ...here.
* gcc.target/i386/avx10_2-512-vdivbf16-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vdivbf16-2.h: ...here.
* gcc.target/i386/avx10_2-512-vdpphps-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vdpphps-2.h: ...here.
* gcc.target/i386/avx10_2-512-vfmaddXXXbf16-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vfmaddXXXbf16-2.h: ...here.
* gcc.target/i386/avx10_2-512-vfmsubXXXbf16-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vfmsubXXXbf16-2.h: ...here.
* gcc.target/i386/avx10_2-512-vfnmaddXXXbf16-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vfnmaddXXXbf16-2.h: ...here.
* gcc.target/i386/avx10_2-512-vfnmsubXXXbf16-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vfnmsubXXXbf16-2.h: ...here.
* gcc.target/i386/avx10_2-512-vfpclassbf16-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vfpclassbf16-2.h: ...here.
* gcc.target/i386/avx10_2-512-vgetexpbf16-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vgetexpbf16-2.h: ...here.
* gcc.target/i386/avx10_2-512-vgetmantbf16-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vgetmantbf16-2.h: ...here.
* gcc.target/i386/avx10_2-512-vmaxbf16-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vmaxbf16-2.h: ...here.
* gcc.target/i386/avx10_2-512-vminbf16-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vminbf16-2.h: ...here.
* gcc.target/i386/avx10_2-512-vminmaxbf16-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vminmaxbf16-2.h: ...here.
* gcc.target/i386/avx10_2-512-vminmaxpd-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vminmaxpd-2.h: ...here.
* gcc.target/i386/avx10_2-512-vminmaxph-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vminmaxph-2.h: ...here.
* gcc.target/i386/avx10_2-512-vminmaxps-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vminmaxps-2.h: ...here.
* gcc.target/i386/avx10_2-512-vmpsadbw-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vmpsadbw-2.h: ...here.
* gcc.target/i386/avx10_2-512-vmulbf16-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vmulbf16-2.h: ...here.
* gcc.target/i386/avx10_2-512-vpdpbssd-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vpdpbssd-2.h: ...here.
* gcc.target/i386/avx10_2-512-vpdpbssds-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vpdpbssds-2.h: ...here.
* gcc.target/i386/avx10_2-512-vpdpbsud-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vpdpbsud-2.h: ...here.
* gcc.target/i386/avx10_2-512-vpdpbsuds-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vpdpbsuds-2.h: ...here.
* gcc.target/i386/avx10_2-512-vpdpbuud-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vpdpbuud-2.h: ...here.
* gcc.target/i386/avx10_2-512-vpdpbuuds-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vpdpbuuds-2.h: ...here.
* gcc.target/i386/avx10_2-512-vpdpwsud-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vpdpwsud-2.h: ...here.
* gcc.target/i386/avx10_2-512-vpdpwsuds-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vpdpwsuds-2.h: ...here.
* gcc.target/i386/avx10_2-512-vpdpwusd-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vpdpwusd-2.h: ...here.
* gcc.target/i386/avx10_2-512-vpdpwusds-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vpdpwusds-2.h: ...here.
* gcc.target/i386/avx10_2-512-vpdpwuud-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vpdpwuud-2.h: ...here.
* gcc.target/i386/avx10_2-512-vpdpwuuds-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vpdpwuuds-2.h: ...here.
* gcc.target/i386/avx10_2-512-vrcpbf16-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vrcpbf16-2.h: ...here.
* gcc.target/i386/avx10_2-512-vreducebf16-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vreducebf16-2.h: ...here.
* gcc.target/i386/avx10_2-512-vrndscalebf16-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vrndscalebf16-2.h: ...here.
* gcc.target/i386/avx10_2-512-vrsqrtbf16-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vrsqrtbf16-2.h: ...here.
* gcc.target/i386/avx10_2-512-vscalefbf16-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vscalefbf16-2.h: ...here.
* gcc.target/i386/avx10_2-512-vsqrtbf16-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vsqrtbf16-2.h: ...here.
* gcc.target/i386/avx10_2-512-vsubbf16-2.c:
Remove 512 test. Move to...
* gcc.target/i386/avx10_2-vsubbf16-2.h: ...here.
* gcc.target/i386/sm4key4-avx10_2-512-2.c:
Remove 512 test. Move to...
* gcc.target/i386/sm4key4-avx10_2-2.c: ...here.
* gcc.target/i386/sm4rnds4-avx10_2-512-2.c:
Remove 512 test. Move to...
* gcc.target/i386/sm4rnds4-avx10_2-2.c: ...here.
* gcc.target/i386/vnniint16-auto-vectorize-4.c: Use AVX10_SCALAR
for 512 bit test.
* gcc.target/i386/vnniint8-auto-vectorize-4.c: Ditto.

i386: Combine AVX10.2 intrin files

Since we use a single avx10.2 to enable everything, there is
no need to split them into two files.

gcc/ChangeLog:

* config.gcc: Remove 512 intrin file.
* config/i386/avx10_2-512bf16intrin.h:
Removed and combined to ...
* config/i386/avx10_2bf16intrin.h: ... this.
* config/i386/avx10_2-512convertintrin.h:
Removed and combined to ...
* config/i386/avx10_2convertintrin.h: ... this.
* config/i386/avx10_2-512mediaintrin.h:
Removed and combined to ...
* config/i386/avx10_2mediaintrin.h: ... this.
* config/i386/avx10_2-512minmaxintrin.h:
Removed and combined to ...
* config/i386/avx10_2minmaxintrin.h: ... this.
* config/i386/avx10_2-512satcvtintrin.h:
Removed and combined to ...
* config/i386/avx10_2satcvtintrin.h: ... this.
* config/i386/immintrin.h: Remove 512 intrin file.

gcc/testsuite/ChangeLog:

* gcc.target/i386/avx-1.c: Combine tests and change
intrin file name.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-14.c: Ditto.
* gcc.target/i386/sse-22.c: Ditto.
* gcc.target/i386/sse-23.c: Ditto.

i386: Remove duplicate iterators in md

There are several iterators no longer needed in md files since
after refactor in AVX10, they could directly use legacy AVX512
ones. Remove those duplicate iterators.

gcc/ChangeLog:

* config/i386/sse.md (VF1_VF2_AVX10_2): Removed.
(VF2_AVX10_2): Ditto.
(VI1248_AVX10_2): Ditto.
(VFH_AVX10_2): Ditto.
(VF1_AVX10_2): Ditto.
(VHF_AVX10_2): Ditto.
(VBF_AVX10_2): Ditto.
(VI8_AVX10_2): Ditto.
(VI2_AVX10_2): Ditto.
(VBF): New.
(div<mode>3): Use VBF instead of AVX10.2 ones.
(vec_cmp<mode><avx512fmaskmodelower>): Ditto.
(avx10_2_cvt2ps2phx_<mode><mask_name><round_name>):
Use VHF_AVX512VL instead of AVX10.2 ones.
(vcvt<convertfp8_pack><mode><mask_name>): Ditto.
(vcvthf82ph<mode><mask_name>): Ditto.
(VHF_AVX10_2_2): Remove not needed TARGET_AVX10_2.
(usdot_prod<sseunpackmodelower><mode>): Use VI2_AVX512F
instead of AVX10.2 ones.
(vdpphps_<mode>): Use VF1_AVX512VL instead of AVX10.2 ones.
(vdpphps_<mode>_mask): Ditto.
(vdpphps_<mode>_maskz): Ditto.
(vdpphps_<mode>_maskz_1): Ditto.
(avx10_2_scalefbf16_<mode><mask_name>): Use VBF instead of
AVX10.2 ones.
(<code><mode>3): Ditto.
(avx10_2_<code>bf16_<mode><mask_name>): Ditto.
(avx10_2_fmaddbf16_<mode>_maskz); Ditto.
(avx10_2_fmaddbf16_<mode><sd_maskz_name>): Ditto.
(avx10_2_fmaddbf16_<mode>_mask): Ditto.
(avx10_2_fmaddbf16_<mode>_mask3): Ditto.
(avx10_2_fnmaddbf16_<mode>_maskz): Ditto.
(avx10_2_fnmaddbf16_<mode><sd_maskz_name>): Ditto.
(avx10_2_fnmaddbf16_<mode>_mask): Ditto.
(avx10_2_fnmaddbf16_<mode>_mask3): Ditto.
(avx10_2_fmsubbf16_<mode>_maskz); Ditto.
(avx10_2_fmsubbf16_<mode><sd_maskz_name>): Ditto.
(avx10_2_fmsubbf16_<mode>_mask): Ditto.
(avx10_2_fmsubbf16_<mode>_mask3): Ditto.
(avx10_2_fnmsubbf16_<mode>_maskz): Ditto.
(avx10_2_fnmsubbf16_<mode><sd_maskz_name>): Ditto.
(avx10_2_fnmsubbf16_<mode>_mask): Ditto.
(avx10_2_fnmsubbf16_<mode>_mask3): Ditto.
(avx10_2_rsqrtbf16_<mode><mask_name>): Ditto.
(avx10_2_sqrtbf16_<mode><mask_name>): Ditto.
(avx10_2_rcpbf16_<mode><mask_name>): Ditto.
(avx10_2_getexpbf16_<mode><mask_name>): Ditto.
(avx10_2_<bf16immop>bf16_<mode><mask_name>): Ditto.
(avx10_2_fpclassbf16_<mode><mask_scalar_merge_name>): Ditto.
(avx10_2_cmpbf16_<mode><mask_scalar_merge_name>): Ditto.
(avx10_2_cvt<sat_cvt_trunc_prefix>bf162i<sat_cvt_sign_prefix>bs<mode><mask_name>):
Ditto.
(avx10_2_cvtph2i<sat_cvt_sign_prefix>bs<mode><mask_name><round_name>):
Use VHF_AVX512VL instead of AVX10.2 ones.
(avx10_2_cvttph2i<sat_cvt_sign_prefix>bs<mode><mask_name><round_saeonly_name>):
Ditto.
(avx10_2_cvtps2i<sat_cvt_sign_prefix>bs<mode><mask_name><round_name>):
Use VF1_AVX512VL instead of AVX10.2 ones.
(avx10_2_cvttps2i<sat_cvt_sign_prefix>bs<mode><mask_name><round_saeonly_name>):
Ditto.
(avx10_2_vcvtt<castmode>2<sat_cvt_sign_prefix>dqs<mode><mask_name><round_saeonly_name>):
Use VF instead of AVX10.2 ones.
(avx10_2_vcvttpd2<sat_cvt_sign_prefix>qqs<mode><mask_name><round_saeonly_name>):
Use VF2 instead of AVX10.2 ones.
(avx10_2_vcvttps2<sat_cvt_sign_prefix>qqs<mode><mask_name><round_saeonly_name>):
Use VI8 instead of AVX10.2 ones.
(avx10_2_minmaxbf16_<mode><mask_name>): Use VBF instead of
AVX10.2 ones.
(avx10_2_minmaxp<mode><mask_name><round_saeonly_name>):
Use VFH_AVX512VL instead of AVX10.2 ones.
(avx10_2_vmovrs<ssemodesuffix><mode><mask_name>):
Use VI1248_AVX512VLBW instead of AVX10.2 ones.

i386: Remove avx10.1-256/512 and evex512 options

As we mentioned in GCC 15, we will remove avx10.1-256/512 and evex512
in GCC 16. Also, the combination of AVX10 and AVX512 option behavior
will also be simplified in GCC 16 since AVX10.1 now implied AVX512,
making the behavior matching everyone else.

gcc/ChangeLog:

* common/config/i386/cpuinfo.h
(get_available_features): Remove feature set for AVX10_1_256.
* common/config/i386/i386-common.cc
(OPTION_MASK_ISA2_EVEX512_SET): Removed.
(OPTION_MASK_ISA2_AVX10_1_256_SET): Removed.
(OPTION_MASK_ISA_AVX10_1_SET): Imply all AVX512 features.
(OPTION_MASK_ISA2_AVX10_1_SET): Ditto.
(OPTION_MASK_ISA2_AVX2_UNSET): Remove AVX10_1_UNSET.
(OPTION_MASK_ISA2_EVEX512_UNSET): Removed.
(OPTION_MASK_ISA2_AVX10_1_UNSET): Remove AVX10_1_256.
(OPTION_MASK_ISA2_AVX512F_UNSET): Unset AVX10_1.
(ix86_handle_option): Remove special handling for AVX512/AVX10.1
options, evex512 and avx10_1_256. Modify ISA set for AVX10 options.
* common/config/i386/i386-cpuinfo.h
(enum feature_priority): Remove P_AVX10_1_256.
(enum processor_features): Remove FEATURE_AVX10_1_256.
* common/config/i386/i386-isas.h: Remove avx10.1-256/512.
* config/i386/avx512bf16intrin.h: Rollback target push before
evex512 is introduced.
* config/i386/avx512bf16vlintrin.h: Ditto.
* config/i386/avx512bitalgintrin.h: Ditto.
* config/i386/avx512bitalgvlintrin.h: Ditto.
* config/i386/avx512bwintrin.h: Ditto.
* config/i386/avx512cdintrin.h: Ditto.
* config/i386/avx512dqintrin.h: Ditto.
* config/i386/avx512fintrin.h: Ditto.
* config/i386/avx512fp16intrin.h: Ditto.
* config/i386/avx512fp16vlintrin.h: Ditto.
* config/i386/avx512ifmaintrin.h: Ditto.
* config/i386/avx512ifmavlintrin.h: Ditto.
* config/i386/avx512vbmi2intrin.h: Ditto.
* config/i386/avx512vbmi2vlintrin.h: Ditto.
* config/i386/avx512vbmiintrin.h: Ditto.
* config/i386/avx512vbmivlintrin.h: Ditto.
* config/i386/avx512vlbwintrin.h: Ditto.
* config/i386/avx512vldqintrin.h: Ditto.
* config/i386/avx512vlintrin.h: Ditto.
* config/i386/avx512vnniintrin.h: Ditto.
* config/i386/avx512vnnivlintrin.h: Ditto.
* config/i386/avx512vp2intersectintrin.h: Ditto.
* config/i386/avx512vp2intersectvlintrin.h: Ditto.
* config/i386/avx512vpopcntdqintrin.h: Ditto.
* config/i386/avx512vpopcntdqvlintrin.h: Ditto.
* config/i386/gfniintrin.h: Ditto.
* config/i386/vaesintrin.h: Ditto.
* config/i386/vpclmulqdqintrin.h: Ditto.
* config/i386/driver-i386.cc (check_avx512_features): Removed.
(host_detect_local_cpu): Remove -march=native special handling.
* config/i386/i386-builtins.cc
(ix86_vectorize_builtin_gather): Remove TARGET_EVEX512.
* config/i386/i386-c.cc
(ix86_target_macros_internal): Remove EVEX512 and AVX10_1_256.
* config/i386/i386-expand.cc
(ix86_valid_mask_cmp_mode): Remove TARGET_EVEX512.
(ix86_expand_int_sse_cmp): Ditto.
(ix86_vector_duplicate_simode_const): Ditto.
(ix86_expand_vector_init_duplicate): Ditto.
(ix86_expand_vector_init_one_nonzero): Ditto.
(ix86_emit_swsqrtsf): Ditto.
(ix86_vectorize_vec_perm_const): Ditto.
(ix86_expand_vecop_qihi2): Ditto.
(ix86_expand_sse2_mulvxdi3): Ditto.
(ix86_gen_bcst_mem): Ditto.
* config/i386/i386-isa.def (EVEX512): Removed.
(AVX10_1_256): Ditto.
* config/i386/i386-options.cc
(isa2_opts): Remove evex512 and avx10.1-256.
(ix86_function_specific_save): Remove no_avx512_explicit and
no_avx10_1_explicit.
(ix86_function_specific_restore): Ditto.
(ix86_valid_target_attribute_inner_p): Remove evex512 and
avx10.1-256/512.
(ix86_valid_target_attribute_tree): Remove special handling
to rerun ix86_option_override_internal for AVX10.1-256.
(ix86_option_override_internal): Remove warning handling.
(ix86_simd_clone_adjust): Remove evex512.
* config/i386/i386.cc
(type_natural_mode): Remove TARGET_EVEX512.
(ix86_return_in_memory): Ditto.
(standard_sse_constant_p): Ditto.
(standard_sse_constant_opcode): Ditto.
(ix86_get_ssemov): Ditto.
(ix86_legitimate_constant_p): Ditto.
(ix86_vectorize_builtin_scatter): Ditto.
(ix86_hard_regno_mode_ok): Ditto.
(ix86_set_reg_reg_cost): Ditto.
(ix86_rtx_costs): Ditto.
(ix86_vector_mode_supported_p): Ditto.
(ix86_preferred_simd_mode): Ditto.
(ix86_autovectorize_vector_modes): Ditto.
(ix86_get_mask_mode): Ditto.
(ix86_simd_clone_compute_vecsize_and_simdlen): Ditto.
(ix86_simd_clone_usable): Ditto.
* config/i386/i386.h (BIGGEST_ALIGNMENT): Ditto.
(MOVE_MAX): Ditto.
(STORE_MAX_PIECES): Ditto.
(PTA_SKYLAKE_AVX512): Remove PTA_EVEX512.
(PTA_CANNONLAKE): Ditto.
(PTA_ZNVER4): Ditto.
(PTA_GRANITERAPIDS): Use PTA_AVX10_1.
(PTA_DIAMONDRAPIDS): Use PTA_GRANITERAPIDS.
* config/i386/i386.md: Remove TARGET_EVEX512, avx512f_512
and avx512bw_512.
* config/i386/i386.opt: Remove ix86_no_avx512_explicit,
ix86_no_avx10_1_explicit, mevex512, mavx10.1-256/512 and
warning for mavx10.1. Modify option comment.
* config/i386/i386.opt.urls: Remove evex512 and avx10.1-256/512.
* config/i386/predicates.md: Remove TARGET_EVEX512.
* config/i386/sse.md: Ditto.
* doc/extend.texi: Remove avx10.1-256/512. Modify avx10.1 doc.
* doc/invoke.texi: Remove avx10.1-256/512 and evex512.
* doc/sourcebuild.texi: Remove avx10.1-256/512.

gcc/testsuite/ChangeLog:

* gcc.target/i386/avx10_1-1.c: Remove warning.
* gcc.target/i386/avx10_1-2.c: Ditto.
* gcc.target/i386/avx10_1-3.c: Ditto.
* gcc.target/i386/avx10_1-4.c: Ditto.
* gcc.target/i386/pr111068.c: Ditto.
* gcc.target/i386/pr117946.c: Ditto.
* gcc.target/i386/pr117240_avx512f.c: Remove -mevex512 and
warning.
* gcc.target/i386/avx10_1-11.c: Rename to ...
* gcc.target/i386/avx10_1-5.c: ... this. Remove warning.
* gcc.target/i386/avx10_1-12.c: Rename to ...
* gcc.target/i386/avx10_1-6.c: ... this. Remove warning.
* gcc.target/i386/avx10_1-26.c: Rename to ...
* gcc.target/i386/avx10_1-7.c: ... this. Remove warning.
The origin avx10_1-7.c is removed.
* gcc.target/i386/avx10_1-10.c: Removed.
* gcc.target/i386/avx10_1-13.c: Removed.
* gcc.target/i386/avx10_1-14.c: Removed.
* gcc.target/i386/avx10_1-15.c: Removed.
* gcc.target/i386/avx10_1-16.c: Removed.
* gcc.target/i386/avx10_1-17.c: Removed.
* gcc.target/i386/avx10_1-18.c: Removed.
* gcc.target/i386/avx10_1-19.c: Removed.
* gcc.target/i386/avx10_1-20.c: Removed.
* gcc.target/i386/avx10_1-21.c: Removed.
* gcc.target/i386/avx10_1-22.c: Removed.
* gcc.target/i386/avx10_1-23.c: Removed.
* gcc.target/i386/avx10_1-8.c: Removed.
* gcc.target/i386/avx10_1-9.c: Removed.
* gcc.target/i386/noevex512-1.c: Removed.
* gcc.target/i386/noevex512-2.c: Removed.
* gcc.target/i386/noevex512-3.c: Removed.
* gcc.target/i386/pr111889.c: Removed.
* gcc.target/i386/pr111907.c: Removed.

i386: Unpush OPTION_MASK_ISA2_EVEX512 for builtins

As we mentioned in GCC 15, we will remove evex512 in GCC 16 since it
is not useful anymore since we will have 512 bit directly. This patch
will first unpush evex512 in the builtins.

gcc/ChangeLog:

* config/i386/i386-builtin.def
(BDESC): Remove OPTION_MASK_ISA2_EVEX512.
* config/i386/i386-builtins.cc
(ix86_init_mmx_sse_builtins): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr90096.c: Adjust error message.
* gcc.target/i386/pr117304-1.c: Removed.

Daily bump.

emit-rtl: Allow extra checks for paradoxical subregs [PR119966]

When a paradoxical subreg is detected, validate_subreg exits early, thus
skipping the important checks later in the function.

Fix by continuing with the checks instead of declaring early that the
paradoxical subreg is valid.

One of the newly allowed subsequent checks needed to be disabled for
paradoxical subregs.  It turned out that combine attempts to create
a paradoxical subreg of mem even for strict-alignment targets.
That is invalid and should eventually be rejected, but is
temporarily left allowed to prevent regressions for
armv8l-unknown-linux-gnueabihf.  See PR120329 for more details.

Tests I did:
- No regressions were found for C and C++ for the following targets:
   - native x86_64-pc-linux-gnu
   - cross riscv64-unknown-linux-gnu
   - cross riscv32-none-elf
- Sanity checked armv8l-unknown-linux-gnueabihf by cross-building
   up to including libgcc.  Linaro CI bot further confirmed there
   are no regressions.
- Sanity checked powerpc64-unknown-linux-gnu by building native
   toolchain, but I could not setup qemu-user for DejaGnu testing.

PR target/119966

gcc/ChangeLog:

* emit-rtl.cc (validate_subreg): Do not exit immediately for
paradoxical subregs.  Filter subsequent tests which are
not valid for paradoxical subregs.

Co-authored-by: Richard Sandiford <richard.sandiford@arm.com>
Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>

Partially lift restriction from loc_list_from_tree_1

The function accepts all handled_component_p expressions and decodes them by
means of get_inner_reference as expected, but bails out on bitfields:

        /* TODO: We can extract value of the small expression via shifting
   even for nonzero bitpos.  */
        if (list_ret == 0)
          return 0;
        if (!multiple_p (bitpos, BITS_PER_UNIT, &bytepos)
            || !multiple_p (bitsize, BITS_PER_UNIT))
          {
            expansion_failed (loc, NULL_RTX,
                              "bitfield access");
            return 0;
          }

This lifts the second part of the restriction, which helps for obscure cases
of packed discriminated record types in Ada, although this requires the very
latest GDB sources.

gcc/
* dwarf2out.cc (loc_list_from_tree_1) <COMPONENT_REF>: Do not bail
out when the size is not a multiple of a byte.
Deal with bit-fields whose size is not a multiple of a byte when
dereferencing an address.

phiopt: Use mark_lhs_in_seq_for_dce instead of doing it inline

Right now phiopt has the same code as mark_lhs_in_seq_for_dce
inlined into match_simplify_replacement. Instead let's use the
function in gimple-fold that does the same thing.

Bootstrapped and tested on x86_64-linux-gnu.

gcc/ChangeLog:

* gimple-fold.cc (mark_lhs_in_seq_for_dce): Make
non-static.
* gimple-fold.h (mark_lhs_in_seq_for_dce): Declare.
* tree-ssa-phiopt.cc (match_simplify_replacement): Use
mark_lhs_in_seq_for_dce instead of manually looping.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

Regenerate cobol/lang.opt.urls

The Cobol frontend lang.opt got -M added, but lang.opt.urls wasn't
regenerated.

Fixes: 92b6485a75ca ("cobol: Eliminate exception "blob"; streamline some code generation.")
gcc/cobol/ChangeLog:

* lang.opt.urls: Regenerated.

Daily bump.

[PATCH] libgcc SH: fix alignment for relaxation

From 6462f1e6a2565c5d4756036d9bc2f39dce9bd768 Mon Sep 17 00:00:00 2001
From: QBos07 <qubos@outlook.de>
Date: Sat, 10 May 2025 16:56:28 +0000
Subject: [PATCH] libgcc SH: fix alignment for relaxation

when relaxation is enabled we can not infer the alignment
from the position as that may change. This should not change
non-relaxed builds as its allready aligned there. This was
the missing piece to building an entire toolchain with -mrelax

Credit goes to Oleg Endo: https://sourceware.org/bugzilla/show_bug.cgi?id=3298#c4

libgcc/
* config/sh/lib1funcs.S (ashiftrt_r4_32): Increase alignment.
(movemem): Force alignment of the mova intruction.

[RISC-V] Fix ICE due to bogus use of gen_rtvec

Found this while setting up the risc-v coordination branch off of gcc-15.  Not
sure why I didn't use rtvec_alloc directly here since we're going to initialize
the whole vector ourselves.  Using gen_rtvec was just wrong as it's walking
down a non-existent varargs list.  Under the "right" circumstances it can walk
off a page and fault.

This was seen with a test already in the testsuite (I forget which test), so no
new regression test.

Tested in my tester and verified the failure on the coordination branch is
resolved a well.  Waiting on pre-commit CI to render a verdict.

gcc/
* config/riscv/riscv-vect-permconst.cc (vector_permconst:process_bb):
Use rtvec_alloc, not gen_rtvec since we don't want/need to initialize
the vector.

[PATCH] gcc: add trigonometric pi-based functions as gcc builtins

I committed the wrong version on Yuao's behalf. This followup adds the
documentation changes -- Jeff.

This patch adds trigonometric pi-based functions as gcc builtins: acospi, asinpi, atan2pi,
atanpi, cospi, sinpi, and tanpi. Latest glibc already provides support for
these functions, which we plan to leverage in future gfortran implementations.

The patch includes two test cases to verify both correct code generation and
function definition.

If approved, I suggest committing this foundational change first. Constant
folding for these builtins will be addressed in subsequent patches.

Best regards,
Yuao

From 9a9683d250078ce1bc687797c26ca05a9e91b350 Mon Sep 17 00:00:00 2001
From: Yuao Ma <c8ef@outlook.com>
Date: Wed, 14 May 2025 22:14:00 +0800
Subject: [PATCH] gcc: add trigonometric pi-based functions as gcc builtins

Add trigonometric pi-based functions as GCC builtins: acospi, asinpi, atan2pi,
atanpi, cospi, sinpi, and tanpi. Latest glibc already provides support for
these functions, which we plan to leverage in future gfortran implementations.

The patch includes two test cases to verify both correct code generation and
function definition.

If approved, I suggest committing this foundational change first. Constant
folding for these builtins will be addressed in subsequent patches.

gcc/ChangeLog:

* doc/extend.texi: Mention new builtins.

[PATCH] gcc: add trigonometric pi-based functions as gcc builtins

This patch adds trigonometric pi-based functions as gcc builtins: acospi, asinpi, atan2pi,
atanpi, cospi, sinpi, and tanpi. Latest glibc already provides support for
these functions, which we plan to leverage in future gfortran implementations.

The patch includes two test cases to verify both correct code generation and
function definition.

If approved, I suggest committing this foundational change first. Constant
folding for these builtins will be addressed in subsequent patches.

Best regards,
Yuao

From 9a9683d250078ce1bc687797c26ca05a9e91b350 Mon Sep 17 00:00:00 2001
From: Yuao Ma <c8ef@outlook.com>
Date: Wed, 14 May 2025 22:14:00 +0800
Subject: [PATCH] gcc: add trigonometric pi-based functions as gcc builtins

Add trigonometric pi-based functions as GCC builtins: acospi, asinpi, atan2pi,
atanpi, cospi, sinpi, and tanpi. Latest glibc already provides support for
these functions, which we plan to leverage in future gfortran implementations.

The patch includes two test cases to verify both correct code generation and
function definition.

If approved, I suggest committing this foundational change first. Constant
folding for these builtins will be addressed in subsequent patches.

gcc/ChangeLog:

* builtins.def (TRIG_TYPE): New.
(BUILT_IN_ACOSPI): New.
(BUILT_IN_ACOSPIF): New.
(BUILT_IN_ACOSPIL): New.
(BUILT_IN_ASINPI): New.
(BUILT_IN_ASINPIF): New.
(BUILT_IN_ASINPIL): New.
(BUILT_IN_ATANPI): New.
(BUILT_IN_ATANPIF): New.
(BUILT_IN_ATANPIL): New.
(BUILT_IN_COSPI): New.
(BUILT_IN_COSPIF): New.
(BUILT_IN_COSPIL): New.
(BUILT_IN_SINPI): New.
(BUILT_IN_SINPIF): New.
(BUILT_IN_SINPIL): New.
(BUILT_IN_TANPI): New.
(BUILT_IN_TANPIF): New.
(BUILT_IN_TANPIL): New.
(TRIG2_TYPE): New.
(BUILT_IN_ATAN2PI): New.
(BUILT_IN_ATAN2PIF): New.
(BUILT_IN_ATAN2PIL): New.

gcc/testsuite/ChangeLog:

* gcc.dg/builtins-1.c: Builtin codegen test.
* gcc.dg/c23-builtins-1.c: Builtin signature test.

[RISC-V] Avoid setting output object more than once in IOR/XOR synthesis

While evaluating Shreya's logical AND synthesis work on spec2017 I ran into a
code quality regression where combine was failing to eliminate a redundant sign
extension.

I had a hunch the problem would be with the multiple sets of the same pseudo
register in the AND synthesis path.  I was right that the problem was multiple
sets of the same pseudo, but it was actually some of the splitters in the
RISC-V backend that were the culprit.  Those multiple sets caused the sign bit
tracking code to need to make conservative assumptions thus resulting in
failure to eliminate the unnecessary sign extension.

So before we start moving on the logical AND patch we're going to do some
cleanups.

There's multiple moving parts in play.  For example, we have splitters which do
multiple sets of the output register.  Fixing some of those independently would
result in a code quality regression.  Instead they need some adjustments to or
removal of mvconst_internal.  Of course getting rid of mvconst_internal will
trigger all kinds of code quality regressions right now which ultimately lead
back to the need to revamp the logical AND expander.  Point being we've got
some circular dependencies and breaking them may result in short term code
quality regressions.  I'll obviously try to avoid those as much as possible.

So to start the process this patch adjusts the recently added XOR/IOR synthesis
to avoid re-using the destination register.  While the reuse was clearly safe
from a semantic standpoint, various parts of the compiler can do a better job
for pseudos that are only set once.

Given this synthesis path should only be active during initial RTL generation,
we can create new pseudos at will, so we create a new one for each insn.  At
the end of the sequence we copy from the last set into the final destination.

This has various trivial impacts on the code generation, but the resulting code
looks no better or worse to me across spec2017.

This has been tested in my tester and is currently bootstrapping on my BPI.
Waiting on data from the pre-commit tester before moving forward...

gcc/
* config/riscv/riscv.cc (synthesize_ior_xor): Avoid writing
operands[0] more than once, use new pseudos instead.

RISC-V: Since the loop increment i++ is unreachable, the loop body will never execute more than once

Reported-by: huangcunjian <huangcunjian.huang@alibaba-inc.com>
gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_gpr_save_operation_p): Remove
break and fixbug for elt index.

RISC-V: Avoid scalar unsigned SAT_ADD test data duplication

Some of the previous scalar unsigned SAT_ADD test data are
duplicated in different test files. This patch would like to
move them into a shared header file, to avoid the test data
duplication.

The below test suites are passed for this patch series.
* The rv64gcv fully regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat/sat_arith.h: Add more helper macros.
* gcc.target/riscv/sat/sat_arith_data.h: Add the test data
for scalar unsigned SAT_ADD.
* gcc.target/riscv/sat/sat_u_add-run-1-u16.c: Leverage the test
data from the shared header file.
* gcc.target/riscv/sat/sat_u_add-run-1-u32.c: Ditto
* gcc.target/riscv/sat/sat_u_add-run-1-u64.c: Ditto
* gcc.target/riscv/sat/sat_u_add-run-1-u8.c: Ditto
* gcc.target/riscv/sat/sat_u_add-run-2-u16.c: Ditto
* gcc.target/riscv/sat/sat_u_add-run-2-u32.c: Ditto
* gcc.target/riscv/sat/sat_u_add-run-2-u64.c: Ditto
* gcc.target/riscv/sat/sat_u_add-run-2-u8.c: Ditto
* gcc.target/riscv/sat/sat_u_add-run-3-u16.c: Ditto
* gcc.target/riscv/sat/sat_u_add-run-3-u32.c: Ditto
* gcc.target/riscv/sat/sat_u_add-run-3-u64.c: Ditto
* gcc.target/riscv/sat/sat_u_add-run-3-u8.c: Ditto
* gcc.target/riscv/sat/sat_u_add-run-4-u16.c: Ditto
* gcc.target/riscv/sat/sat_u_add-run-4-u32.c: Ditto
* gcc.target/riscv/sat/sat_u_add-run-4-u64.c: Ditto
* gcc.target/riscv/sat/sat_u_add-run-4-u8.c: Ditto
* gcc.target/riscv/sat/sat_u_add-run-5-u16.c: Ditto
* gcc.target/riscv/sat/sat_u_add-run-5-u32.c: Ditto
* gcc.target/riscv/sat/sat_u_add-run-5-u64.c: Ditto
* gcc.target/riscv/sat/sat_u_add-run-5-u8.c: Ditto
* gcc.target/riscv/sat/sat_u_add-run-6-u16.c: Ditto
* gcc.target/riscv/sat/sat_u_add-run-6-u32.c: Ditto
* gcc.target/riscv/sat/sat_u_add-run-6-u64.c: Ditto
* gcc.target/riscv/sat/sat_u_add-run-6-u8.c: Ditto
* gcc.target/riscv/sat/sat_u_add-run-7-u16-from-u32.c: Ditto
* gcc.target/riscv/sat/sat_u_add-run-7-u16-from-u64.c: Ditto
* gcc.target/riscv/sat/sat_u_add-run-7-u32-from-u64.c: Ditto
* gcc.target/riscv/sat/sat_u_add-run-7-u8-from-u16.c: Ditto
* gcc.target/riscv/sat/sat_u_add-run-7-u8-from-u32.c: Ditto
* gcc.target/riscv/sat/sat_u_add-run-7-u8-from-u64.c: Ditto

Signed-off-by: Pan Li <pan2.li@intel.com>

Daily bump.

Update cpplib es.po

* es.po: Update.

aarch64: Add more vector permute tests for the FMOV optimization [PR100165]

This patch adds more tests for vector permutes which can now be optimized as
FMOV with the generic PERM change and the aarch64 AND patch.

Changes since v1:
* v2: Add -mlittle-endian to the little endian tests explicitly and rename the
tests accordingly.

PR target/100165

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/fmov-3-be.c: New test.
* gcc.target/aarch64/fmov-3-le.c: New test.
* gcc.target/aarch64/fmov-4-be.c: New test.
* gcc.target/aarch64/fmov-4-le.c: New test.
* gcc.target/aarch64/fmov-5-be.c: New test.
* gcc.target/aarch64/fmov-5-le.c: New test.

Signed-off-by: Pengxuan Zheng <quic_pzheng@quicinc.com>

aarch64: Optimize AND with certain vector of immediates as FMOV [PR100165]

We can optimize AND with certain vector of immediates as FMOV if the result of
the AND is as if the upper lane of the input vector is set to zero and the lower
lane remains unchanged.

For example, at present:

v4hi
f_v4hi (v4hi x)
{
  return x & (v4hi){ 0xffff, 0xffff, 0, 0 };
}

generates:

f_v4hi:
movi    d31, 0xffffffff
and     v0.8b, v0.8b, v31.8b
ret

With this patch, it generates:

f_v4hi:
fmov    s0, s0
ret

Changes since v1:
* v2: Simplify the mask checking logic by using native_decode_int and address a
few other review comments.

PR target/100165

gcc/ChangeLog:

* config/aarch64/aarch64-protos.h (aarch64_output_fmov): New prototype.
(aarch64_simd_valid_and_imm_fmov): Likewise.
* config/aarch64/aarch64-simd.md (and<mode>3<vczle><vczbe>): Allow FMOV
codegen.
* config/aarch64/aarch64.cc (aarch64_simd_valid_and_imm_fmov): New.
(aarch64_output_fmov): Likewise.
* config/aarch64/constraints.md (Df): New constraint.
* config/aarch64/predicates.md (aarch64_reg_or_and_imm): Update
predicate to support FMOV codegen.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/fmov-1-be.c: New test.
* gcc.target/aarch64/fmov-1-le.c: New test.
* gcc.target/aarch64/fmov-2-be.c: New test.
* gcc.target/aarch64/fmov-2-le.c: New test.

Signed-off-by: Pengxuan Zheng <quic_pzheng@quicinc.com>

aarch64: Recognize vector permute patterns which can be interpreted as AND [PR100165]

Certain permute that blends a vector with zero can be interpreted as an AND of a
mask. This idea was suggested by Richard Sandiford when he was reviewing my
patch which tries to optimizes certain vector permute with the FMOV instruction
for the aarch64 target.

For example, for the aarch64 target, at present:

v4hi
f_v4hi (v4hi x)
{
  return __builtin_shuffle (x, (v4hi){ 0, 0, 0, 0 }, (v4hi){ 4, 1, 6, 3 });
}

generates:

f_v4hi:
uzp1    v0.2d, v0.2d, v0.2d
adrp    x0, .LC0
ldr     d31, [x0, #:lo12:.LC0]
tbl     v0.8b, {v0.16b}, v31.8b
ret
.LC0:
.byte   -1
.byte   -1
.byte   2
.byte   3
.byte   -1
.byte   -1
.byte   6
.byte   7

With this patch, it generates:

f_v4hi:
mvni    v31.2s, 0xff, msl 8
and     v0.8b, v0.8b, v31.8b
ret

This patch also provides a target-independent routine for detecting vector
permute patterns which can be interpreted as AND.

Changes since v1:
* v2: Rework the patch to only perform the optimization for aarch64 by calling
the target independent routine vec_perm_and_mask.

PR target/100165

gcc/ChangeLog:

* config/aarch64/aarch64.cc (aarch64_evpc_and): New.
(aarch64_expand_vec_perm_const_1): Call aarch64_evpc_and.
* optabs.cc (vec_perm_and_mask): New.
* optabs.h (vec_perm_and_mask): New prototype.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/and-be.c: New test.
* gcc.target/aarch64/and-le.c: New test.

Signed-off-by: Pengxuan Zheng <quic_pzheng@quicinc.com>

aarch64: Fix an oversight in aarch64_evpc_reencode

Some fields (e.g., zero_op0_p and zero_op1_p) of the struct "newd" may be left
uninitialized in aarch64_evpc_reencode. This can cause reading of uninitialized
data. I found this oversight when testing my patches on and/fmov
optimizations. This patch fixes the bug by zero initializing the struct.

Pushed as obvious after bootstrap/test on aarch64-linux-gnu.

gcc/ChangeLog:

* config/aarch64/aarch64.cc (aarch64_evpc_reencode): Zero initialize
newd.

libstdc++: Use __is_invocable/nothrow_invocable builtins more

As a follow-up to r15-1253 and r15-1254 which made us use these builtins
in the standard std::is_invocable/nothrow_invocable class templates, let's
also use them directly in the standard variable templates and our internal
C++11 __is_invocable/nothrow_invocable class templates.

libstdc++-v3/ChangeLog:

* include/std/type_traits (__is_invocable): Define in terms of
corresponding builtin if available.
(__is_nothrow_invocable): Likewise.
(is_invocable_v): Likewise.
(is_nothrow_invocable_v): Likewise.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

Forwprop: add a debug dump after propagate into comparison does something

I noticed that fowprop does not dump when forward_propagate_into_comparison
did a change to the assign statement.
I am actually using it to help guide changing/improving/add match patterns
instead of depending on doing a tree "combiner" here.

Bootstrapped and tested on x86_64-linux-gnu.

gcc/ChangeLog:

* tree-ssa-forwprop.cc (forward_propagate_into_comparison): Dump
when replacing statement.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

cobol: Eliminate exception "blob"; streamline some code generation.

This eliminates some of the last vestiges of creating a structure at host-time
that is intended for use at target-time.

It removes some unnecessary processing when exceptions are not enabled.

It improves the creation of code that handles table subscripts and refmod
parameters.

gcc/cobol/ChangeLog:

* cobol1.cc (cobol_langhook_handle_option): Eliminate OPT_M.
* except.cc (cbl_enabled_exception_t::dump): Formatting.
(symbol_declaratives_add): Remove.
(declarative_runtime_match): Change to no-blob processing.
* exceptg.h (declarative_runtime_match): Change declaration.
(symbol_declaratives_add): Remove declaration.
* gcobc: Dialect handling.
* genapi.cc (parser_compile_ecs): Formatting; add SHOW_IF_PARSE.
(parser_compile_dcls): Likewise.
(parser_statement_begin): Avoid unnecessary store_location_stuff() call.
(gg_get_depending_on_value): Streamline get_depending_on_value_from_odo().
(depending_on_value): Likewise.
(parser_display_field): Formatting.
(parser_display): Handle case ENV_NAME_e.
(parser_file_open): Avoid unnecessary store_location_stuff.
(parser_file_close): Likewise.
(parser_file_read): Likewise.
(parser_file_write): Likewise.
(parser_file_delete): Likewise.
(parser_file_rewrite): Likewise.
(parser_file_start): Likewise.
(parser_intrinsic_subst): Streamline get_depending_on_value_from_odo().
(parser_intrinsic_call_1): Likewise.
(parser_lsearch_start): Likewise.
(parser_bsearch_start): Likewise.
(parser_sort): Likewise.
(store_location_stuff): Avoid unnecessary assignments.
(parser_pop_exception): Formatting.
* genmath.cc (parser_add): Avoid var_decl_default_compute_error assignment
when doing fast_add().
(parser_subtract): Likewise.
* genutil.cc (REFER): Macro for analyzing code generation.
(get_integer_value): Use data_decl_node for integer value from FldLiteralN.
(get_data_offset): Streamline exception code processing.
(get_and_check_refstart_and_reflen): Likewise.
(get_depending_on_value_from_odo): Likewise.
(get_depending_on_value): Likewise.
(refer_is_clean): Formatting.
(refer_refmod_length): Streamline exception code processing.
(refer_fill_depends): Likewise.
(refer_offset): Likewise.
(refer_size_dest): Likewise.
(refer_size_source): Likewise.
* genutil.h (get_depending_on_value_from_odo): Likewise.
* lang-specs.h: Options definition.
* lang.opt: -M as in c.opt.
* lexio.h: Formatting.
* parse.y: Expand -dialect suggestions; SECTION SEGMENT messages.
* parse_ante.h (declarative_runtime_match): Dialect handling.
(labels_dump): Likewise.
(class current_tokens_t): Likewise.
(class prog_descr_t): Make program_index size_t to prevent padding bytes.
* scan.l: POP_FILE directive.
* scan_ante.h (class enter_leave_t): Better handle line number when
processing COPY statements.
* symbols.cc (symbol_elem_cmp): Eliminate SymFunction.
(symbols_dump): Likewise.
(symbol_label_section_exists): Likewise.
* symbols.h (NAME_MAX): Eliminate. (Was part of SymFunction).
(dialect_is): Improve dialect handling.
(dialect_gcc): Likewise.
(dialect_ibm): Likewise.
(dialect_gnu): Likewise.
(enum symbol_type_t): Eliminate SymFunction.
* util.cc (symbol_type_str): Likewise.
(class unique_stack): Option -M handling.
(cobol_set_pp_option): Likewise.
(parse_file): Likewise.
* util.h (cobol_set_pp_option): Likewise.

libgcobol/ChangeLog:

* common-defs.h (struct cbl_declarative_t): Eliminate blobl.
* libgcobol.cc (__gg__set_env_name): Code for ENVIRONMENT-NAME/VALUE.
(__gg__set_env_value): Likewise.

gcc/testsuite/ChangeLog:

* cobol.dg/group1/declarative_1.cob: Handle modified exception handling.

MAINTAINERS: add myself to write after approval

ChangeLog:

* MAINTAINERS: Add myself to write after approval.

ipa: Dump cgraph_node UID instead of order into ipa-clones dump file

Since starting from GCC 15 the order is not unique for any
symtab_nodes but m_uid is, I believe we ought to dump the latter in
the ipa-clones dump, if only so that people can reliably match entries
about new clones to those about removed nodes (if any).

This patch also contains a fixes to a few other places where we have
so far dumped order to our ordinary dumps and which have been
identified by Michal Jires.

gcc/ChangeLog:

2025-05-16 Martin Jambor <mjambor@suse.cz>

* cgraph.h (symtab_node): Make member function get_uid const.
* cgraphclones.cc (dump_callgraph_transformation): Dump m_uid of the
call graph nodes instead of order.
* cgraph.cc (cgraph_node::remove): Likewise.
* ipa-cp.cc (ipcp_lattice<valtype>::print): Likewise.
* ipa-sra.cc (ipa_sra_summarize_function): Likewise.
* symtab.cc (symtab_node::dump_base): Likewise.

Co-Authored-By: Michal Jires <mjires@suse.cz>

Further simplify the stdlib inline folding

gcc/cp/ChangeLog:
* cp-gimplify.cc (cp_fold): Do the conversion unconditionally, even for same-type cases.

gcc/ChangeLog:
* doc/invoke.texi: Add to_underlying to -ffold-simple-inlines.

aarch64: Fix narrowing warning in driver-aarch64.cc [PR118603]

Since the AARCH64_CORE defines in aarch64-cores.def all use -1 for
the variant, it is just easier to add the cast to unsigned in the usage
in driver-aarch64.cc.

Build and tested on aarch64-linux-gnu.

gcc/ChangeLog:

PR target/118603
* config/aarch64/driver-aarch64.cc (aarch64_cpu_data): Add cast to unsigned
to VARIANT of the define AARCH64_CORE.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

aarch64: Fix narrowing warning in aarch64_detect_vector_stmt_subtype

There is a narrowing warning in aarch64_detect_vector_stmt_subtype
about gather_load_x32_cost and gather_load_x64_cost converting from int to unsigned.
These fields are always unsigned and even the constructor for sve_vec_cost takes
an unsigned. So let's just move the fields over to unsigned.

Build and tested for aarch64-linux-gnu.

gcc/ChangeLog:

* config/aarch64/aarch64-protos.h (struct sve_vec_cost): Change gather_load_x32_cost
and gather_load_x64_cost fields to unsigned.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

forwprop: Add alias walk limit to optimize_memcpy_to_memset.

As sugguested in https://gcc.gnu.org/pipermail/gcc-patches/2025-April/681507.html,
this adds the aliasing walk limit.

gcc/ChangeLog:

* tree-ssa-forwprop.cc (optimize_memcpy_to_memset): Add a limit on the alias walk.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

forwprop: Move memcpy_to_memset from gimple fold to forwprop

Since this optimization now walks the vops, it is better to only
do it in forwprop rather than in all the time in fold_stmt.

The next patch will add the limit to the alias walk.

gcc/ChangeLog:

* gimple-fold.cc (optimize_memcpy_to_memset): Move to
tree-ssa-forwprop.cc.
(gimple_fold_builtin_memory_op): Remove call to
optimize_memcpy_to_memset.
(fold_stmt_1): Likewise.
* tree-ssa-forwprop.cc (optimize_memcpy_to_memset): Move from
gimple-fold.cc.
(simplify_builtin_call): Try to optimize memcpy/memset.
(pass_forwprop::execute): Try to optimize memcpy like assignment
from a previous memset.

gcc/testsuite/ChangeLog:

* gcc.dg/pr78408-1.c: Update scan to forwprop1 only.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

c++, coroutines: Allow NVRO in more cases for ramp functions.

The constraints of the c++ coroutines specification require the ramp
to construct a return object early in the function.  This will be returned
at some later time.  This is implemented as NVRO but requires that copying
be well-formed even though it will be elided.  Special-case ramp functions
to allow this.

gcc/cp/ChangeLog:

* typeck.cc (check_return_expr): Suppress conversions for NVRO
in coroutine ramp functions.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>

c++: Set the outer brace marker for missed cases.

In some cases, a function might be declared as FUNCTION_NEEDS_BODY_BLOCK
but all the content is contained within that block. However, poplevel
is currently assuming that such cases would always contain subblocks.

In the case that we do have a body block, but there are no subblocks
then st the outer brace marker on the body block. This situation occurs
for at least coroutine lambda ramp functions and empty constructors.

gcc/cp/ChangeLog:

* decl.cc (poplevel): Set BLOCK_OUTER_CURLY_BRACE_P on the
body block for functions with no subblocks.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>

c++/modules: Clean up importer_interface

This patch removes some no longer needed special casing in linkage
determination, and makes the distinction between "always_emit" and
"internal" for better future-proofing.

gcc/cp/ChangeLog:

* module.cc (importer_interface): Adjust flags.
(get_importer_interface): Rename flags.
(trees_out::core_bools): Clean up special casing.
(trees_out::write_function_def): Rename flag.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>

c++: one more coro test tweak

After my r16-670, running the testsuite with explicit --stds didn't run this
one in C++17 mode, but the default did. Let's remove the { target c++17 }
so it doesn't by default, either.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/pr94760-mismatched-traits-and-promise-prev.C:
Remove { target c++17 }.

Manual tweak of some end_sequence callers

This patch mops up obvious redundancies that weren't caught by the
automatic regexp replacements in earlier patches. It doesn't do
anything with genemit.cc, since that will be part of a later series.

gcc/
* config/arm/arm.cc (arm_gen_load_multiple_1): Simplify use of
end_sequence.
(arm_gen_store_multiple_1): Likewise.
* expr.cc (gen_move_insn): Likewise.
* gentarget-def.cc (main): Likewise.

Automatic replacement of end_sequence/return pairs

This is the result of using a regexp to replace:

  rtx( |_insn *)<stuff> = end_sequence ();
  return <stuff>;

with:

  return end_sequence ();

gcc/
* asan.cc (asan_emit_allocas_unpoison): Directly return the
result of end_sequence.
(hwasan_emit_untag_frame): Likewise.
* config/aarch64/aarch64-speculation.cc
(aarch64_speculation_clobber_sp): Likewise.
(aarch64_speculation_establish_tracker): Likewise.
* config/arm/arm.cc (arm_call_tls_get_addr): Likewise.
* config/avr/avr-passes.cc (avr_parallel_insn_from_insns): Likewise.
* config/sh/sh_treg_combine.cc
(sh_treg_combine::make_not_reg_insn): Likewise.
* tree-outof-ssa.cc (emit_partition_copy): Likewise.

Automatic replacement of get_insns/end_sequence pairs

This is the result of using a regexp to replace instances of:

  <stuff> = get_insns ();
  end_sequence ();

with:

  <stuff> = end_sequence ();

where the indentation is the same for both lines, and where there
might be blank lines inbetween.

gcc/
* asan.cc (asan_clear_shadow): Use the return value of end_sequence,
rather than calling get_insns separately.
(asan_emit_stack_protection, asan_emit_allocas_unpoison): Likewise.
(hwasan_frame_base, hwasan_emit_untag_frame): Likewise.
* auto-inc-dec.cc (attempt_change): Likewise.
* avoid-store-forwarding.cc (process_store_forwarding): Likewise.
* bb-reorder.cc (fix_crossing_unconditional_branches): Likewise.
* builtins.cc (expand_builtin_apply_args): Likewise.
(expand_builtin_return, expand_builtin_mathfn_ternary): Likewise.
(expand_builtin_mathfn_3, expand_builtin_int_roundingfn): Likewise.
(expand_builtin_int_roundingfn_2, expand_builtin_saveregs): Likewise.
(inline_string_cmp): Likewise.
* calls.cc (expand_call): Likewise.
* cfgexpand.cc (expand_asm_stmt, pass_expand::execute): Likewise.
* cfgloopanal.cc (init_set_costs): Likewise.
* cfgrtl.cc (insert_insn_on_edge, prepend_insn_to_edge): Likewise.
(rtl_lv_add_condition_to_bb): Likewise.
* config/aarch64/aarch64-speculation.cc
(aarch64_speculation_clobber_sp): Likewise.
(aarch64_speculation_establish_tracker): Likewise.
(aarch64_do_track_speculation): Likewise.
* config/aarch64/aarch64.cc (aarch64_load_symref_appropriately)
(aarch64_expand_vector_init, aarch64_gen_ccmp_first): Likewise.
(aarch64_gen_ccmp_next, aarch64_mode_emit): Likewise.
(aarch64_md_asm_adjust): Likewise.
(aarch64_switch_pstate_sm_for_landing_pad): Likewise.
(aarch64_switch_pstate_sm_for_jump): Likewise.
(aarch64_switch_pstate_sm_for_call): Likewise.
* config/alpha/alpha.cc (alpha_legitimize_address_1): Likewise.
(alpha_emit_xfloating_libcall, alpha_gp_save_rtx): Likewise.
* config/arc/arc.cc (hwloop_optimize): Likewise.
* config/arm/aarch-common.cc (arm_md_asm_adjust): Likewise.
* config/arm/arm-builtins.cc: Likewise.
* config/arm/arm.cc (require_pic_register): Likewise.
(arm_call_tls_get_addr, arm_gen_load_multiple_1): Likewise.
(arm_gen_store_multiple_1, cmse_clear_registers): Likewise.
(cmse_nonsecure_call_inline_register_clear): Likewise.
(arm_attempt_dlstp_transform): Likewise.
* config/avr/avr-passes.cc (bbinfo_t::optimize_one_block): Likewise.
(avr_parallel_insn_from_insns): Likewise.
* config/avr/avr.cc (avr_prologue_setup_frame): Likewise.
(avr_expand_epilogue): Likewise.
* config/bfin/bfin.cc (hwloop_optimize): Likewise.
* config/c6x/c6x.cc (c6x_expand_compare): Likewise.
* config/cris/cris.cc (cris_split_movdx): Likewise.
* config/cris/cris.md: Likewise.
* config/csky/csky.cc (csky_call_tls_get_addr): Likewise.
* config/epiphany/resolve-sw-modes.cc
(pass_resolve_sw_modes::execute): Likewise.
* config/fr30/fr30.cc (fr30_move_double): Likewise.
* config/frv/frv.cc (frv_split_scc, frv_split_cond_move): Likewise.
(frv_split_minmax, frv_split_abs): Likewise.
* config/frv/frv.md: Likewise.
* config/gcn/gcn.cc (move_callee_saved_registers): Likewise.
(gcn_expand_prologue, gcn_restore_exec, gcn_md_reorg): Likewise.
* config/i386/i386-expand.cc
(ix86_expand_carry_flag_compare, ix86_expand_int_movcc): Likewise.
(ix86_vector_duplicate_value, expand_vec_perm_interleave2): Likewise.
(expand_vec_perm_vperm2f128_vblend): Likewise.
(expand_vec_perm_2perm_interleave): Likewise.
(expand_vec_perm_2perm_pblendv): Likewise.
(expand_vec_perm2_vperm2f128_vblend, ix86_gen_ccmp_first): Likewise.
(ix86_gen_ccmp_next): Likewise.
* config/i386/i386-features.cc
(scalar_chain::make_vector_copies): Likewise.
(scalar_chain::convert_reg, scalar_chain::convert_op): Likewise.
(timode_scalar_chain::convert_insn): Likewise.
* config/i386/i386.cc (ix86_init_pic_reg, ix86_va_start): Likewise.
(ix86_get_drap_rtx, legitimize_tls_address): Likewise.
(ix86_md_asm_adjust): Likewise.
* config/ia64/ia64.cc (ia64_expand_tls_address): Likewise.
(ia64_expand_compare, spill_restore_mem): Likewise.
(expand_vec_perm_interleave_2): Likewise.
* config/loongarch/loongarch.cc
(loongarch_call_tls_get_addr): Likewise.
* config/m32r/m32r.cc (gen_split_move_double): Likewise.
* config/m32r/m32r.md: Likewise.
* config/m68k/m68k.cc (m68k_call_tls_get_addr): Likewise.
(m68k_call_m68k_read_tp, m68k_sched_md_init_global): Likewise.
* config/m68k/m68k.md: Likewise.
* config/microblaze/microblaze.cc
(microblaze_call_tls_get_addr): Likewise.
* config/mips/mips.cc (mips_call_tls_get_addr): Likewise.
(mips_ls2_init_dfa_post_cycle_insn): Likewise.
(mips16_split_long_branches): Likewise.
* config/nvptx/nvptx.cc (nvptx_gen_shuffle): Likewise.
(nvptx_gen_shared_bcast, nvptx_propagate): Likewise.
(workaround_uninit_method_1, workaround_uninit_method_2): Likewise.
(workaround_uninit_method_3): Likewise.
* config/or1k/or1k.cc (or1k_init_pic_reg): Likewise.
* config/pa/pa.cc (legitimize_tls_address): Likewise.
* config/pru/pru.cc (pru_expand_fp_compare, pru_reorg_loop): Likewise.
* config/riscv/riscv-shorten-memrefs.cc
(pass_shorten_memrefs::transform): Likewise.
* config/riscv/riscv-vsetvl.cc (pre_vsetvl::emit_vsetvl): Likewise.
* config/riscv/riscv.cc (riscv_call_tls_get_addr): Likewise.
(riscv_frm_emit_after_bb_end): Likewise.
* config/rl78/rl78.cc (rl78_emit_libcall): Likewise.
* config/rs6000/rs6000.cc (rs6000_debug_legitimize_address): Likewise.
* config/s390/s390.cc (legitimize_tls_address): Likewise.
(s390_two_part_insv, s390_load_got, s390_va_start): Likewise.
* config/sh/sh_treg_combine.cc
(sh_treg_combine::make_not_reg_insn): Likewise.
* config/sparc/sparc.cc (sparc_legitimize_tls_address): Likewise.
(sparc_output_mi_thunk, sparc_init_pic_reg): Likewise.
* config/stormy16/stormy16.cc (xstormy16_split_cbranch): Likewise.
* config/xtensa/xtensa.cc (xtensa_copy_incoming_a7): Likewise.
(xtensa_expand_block_set_libcall): Likewise.
(xtensa_expand_block_set_unrolled_loop): Likewise.
(xtensa_expand_block_set_small_loop, xtensa_call_tls_desc): Likewise.
* dse.cc (emit_inc_dec_insn_before, find_shift_sequence): Likewise.
(replace_read): Likewise.
* emit-rtl.cc (reorder_insns, gen_clobber, gen_use): Likewise.
* except.cc (dw2_build_landing_pads, sjlj_mark_call_sites): Likewise.
(sjlj_emit_function_enter, sjlj_emit_function_exit): Likewise.
(sjlj_emit_dispatch_table): Likewise.
* expmed.cc (expmed_mult_highpart_optab, expand_sdiv_pow2): Likewise.
* expr.cc (convert_mode_scalar, emit_move_multi_word): Likewise.
(gen_move_insn, expand_cond_expr_using_cmove): Likewise.
(expand_expr_divmod, expand_expr_real_2): Likewise.
(maybe_optimize_pow2p_mod_cmp, maybe_optimize_mod_cmp): Likewise.
* function.cc (emit_initial_value_sets): Likewise.
(instantiate_virtual_regs_in_insn, expand_function_end): Likewise.
(get_arg_pointer_save_area, make_split_prologue_seq): Likewise.
(make_prologue_seq, gen_call_used_regs_seq): Likewise.
(thread_prologue_and_epilogue_insns): Likewise.
(match_asm_constraints_1): Likewise.
* gcse.cc (prepare_copy_insn): Likewise.
* ifcvt.cc (noce_emit_store_flag, noce_emit_move_insn): Likewise.
(noce_emit_cmove): Likewise.
* init-regs.cc (initialize_uninitialized_regs): Likewise.
* internal-fn.cc (expand_POPCOUNT): Likewise.
* ira-emit.cc (emit_move_list): Likewise.
* ira.cc (ira): Likewise.
* loop-doloop.cc (doloop_modify): Likewise.
* loop-unroll.cc (compare_and_jump_seq): Likewise.
(unroll_loop_runtime_iterations, insert_base_initialization): Likewise.
(split_iv, insert_var_expansion_initialization): Likewise.
(combine_var_copies_in_loop_exit): Likewise.
* lower-subreg.cc (resolve_simple_move,resolve_shift_zext): Likewise.
* lra-constraints.cc (match_reload, check_and_process_move): Likewise.
(process_addr_reg, insert_move_for_subreg): Likewise.
(process_address_1, curr_insn_transform): Likewise.
(inherit_reload_reg, process_invariant_for_inheritance): Likewise.
(inherit_in_ebb, remove_inheritance_pseudos): Likewise.
* lra-remat.cc (do_remat): Likewise.
* mode-switching.cc (commit_mode_sets): Likewise.
(optimize_mode_switching): Likewise.
* optabs.cc (expand_binop, expand_twoval_binop_libfunc): Likewise.
(expand_clrsb_using_clz, expand_doubleword_clz_ctz_ffs): Likewise.
(expand_doubleword_popcount, expand_ctz, expand_ffs): Likewise.
(expand_absneg_bit, expand_unop, expand_copysign_bit): Likewise.
(prepare_float_lib_cmp, expand_float, expand_fix): Likewise.
(expand_fixed_convert, gen_cond_trap): Likewise.
(expand_atomic_fetch_op): Likewise.
* ree.cc (combine_reaching_defs): Likewise.
* reg-stack.cc (compensate_edge): Likewise.
* reload1.cc (emit_input_reload_insns): Likewise.
* sel-sched-ir.cc (setup_nop_and_exit_insns): Likewise.
* shrink-wrap.cc (emit_common_heads_for_components): Likewise.
(emit_common_tails_for_components): Likewise.
(insert_prologue_epilogue_for_components): Likewise.
* tree-outof-ssa.cc (emit_partition_copy): Likewise.
(insert_value_copy_on_edge): Likewise.
* tree-ssa-loop-ivopts.cc (computation_cost): Likewise.

Make end_sequence return the insn sequence

The start_sequence/end_sequence interface was a big improvement over
the previous state, but one slightly awkward thing about it is that
you have to call get_insns before end_sequence in order to get the
insn sequence itself:

   To get the contents of the sequence just made, you must call
   `get_insns' *before* calling here.

We therefore have quite a lot of code like this:

  insns = get_insns ();
  end_sequence ();
  return insns;

It would seem simpler to write:

  return end_sequence ();

instead.

I can see three main potential objections to this:

(1) It isn't obvious whether ending the sequence would return the first
    or the last instruction.  But although some code reads *both* the
    first and the last instruction, I can't think of a specific case
    where code would want *only* the last instruction.  All the emit
    functions take the first instruction rather than the last.

(2) The "end" in end_sequence might imply the C++ meaning of an exclusive
    endpoint iterator.  But for an insn sequence, the exclusive endpoint
    is always the null pointer, so it would never need to be returned.
    That said, we could rename the function to something like
    "finish_sequence" or "complete_sequence" if this is an issue.

(3) There might have been an intention that start_sequence/end_sequence
    could in future reclaim memory for unwanted sequences, and so an
    explicit get_insns was used to indicate that the caller does want
    the sequence.

    But that sort of memory reclaimation has never been added,
    and now that the codebase is C++, it would be easier to handle
    using RAII.  I think reclaiming memory would be difficult to do in
    any case, since some code records the individual instructions that
    they emit, rather than using get_insns.

gcc/
* rtl.h (end_sequence): Return the sequence.
* emit-rtl.cc (end_sequence): Likewise.

libstdc++: Fix proc check_v3_target_namedlocale for "" locale [PR65909]

When the last format argument to a Tcl proc is named 'args' it has
special meaning and is a list that accepts any number of arguments[1].
This means when "" is passed to the proc and then we expand "$args" we
get an empty list formatted as "{}". My r16-537-g3e2b83faeb6b14 change
broke all uses of dg-require-namedlocale with empty locale names, "".

By changing the name of the formal argument to 'locale' we avoid the
special behaviour for 'args' and now it only accepts a single argument
(as was always intended). When expanded as "$locale" we get "" as I
expected.

[1] https://www.tcl-lang.org/man/tcl9.0/TclCmd/proc.html

libstdc++-v3/ChangeLog:

PR libstdc++/65909
* testsuite/lib/libstdc++.exp (check_v3_target_namedlocale):
Change name of formal argument to locale.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>

RISC-V: Reuse test name for vx combine test data [NFC]

For run test, we have a name like add/sub to indicate
the testcase. So we can reuse this to identify the
test data instead of a new one.

The below test suites are passed for this patch.
* The rv64gcv fully regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx_binary_data.h: Take
test name for the vx combine test data.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-run-1-i16.c: Leverage
the test name to identify the test data.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-run-1-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-run-1-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-run-1-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-run-1-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-run-1-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-run-1-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-run-1-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-u8.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Add test for vec_duplicate + vsub.vv combine case 1 with GR2VR cost 2

Add asm dump check test for vec_duplicate + vsub.vv combine to vsub.vx.

The below test suites are passed for this patch.
* The rv64gcv fully regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i16.c: Add test cases
for vsub vx combine case 1 with GR2VR cost 2.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u8.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Add test for vec_duplicate + vsub.vv combine case 1 with GR2VR cost 1

Add asm dump check test for vec_duplicate + vsub.vv combine to vsub.vx.

The below test suites are passed for this patch.
* The rv64gcv fully regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i16.c: Add test cases
for vsub vx combine case 1 with GR2VR cost 1.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u8.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Add test for vec_duplicate + vsub.vv combine case 1 with GR2VR cost 0

Add asm dump check test for vec_duplicate + vsub.vv combine to vsub.vx.

The below test suites are passed for this patch.
* The rv64gcv fully regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add test cases
for vsub vx combine case 1 with GR2VR cost 0.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u8.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Add test for vec_duplicate + vsub.vv combine case 0 with GR2VR cost 15

Add asm dump check test for vec_duplicate + vsub.vv combine to vsub.vx.

The below test suites are passed for this patch.
* The rv64gcv fully regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i16.c: Add test cases
for vsub vx combine with GR2VR cost 15.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u8.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Add test for vec_duplicate + vsub.vv combine case 0 with GR2VR cost 1

Add asm dump check test for vec_duplicate + vsub.vv combine to vsub.vx

The below test suites are passed for this patch.
* The rv64gcv fully regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i16.c: Add test cases
for vsub vx combine with GR2VR cost 1.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i32.c: Diito.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i64.c: Diito.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i8.c: Diito.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u16.c: Diito.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u32.c: Diito.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u64.c: Diito.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u8.c: Diito.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Add test for vec_duplicate + vsub.vv combine case 0 with GR2VR cost 0

Add asm dump check and run test for vec_duplicate + vsub.vv
combine to vsub.vx.

The below test suites are passed for this patch.
* The rv64gcv fully regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add vector sub
vx combine asm check.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_binary_data.h: Add test
data for vector sub vx combine.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-i16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-i32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-i64.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-i8.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-u16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-u32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-u64.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-run-1-u8.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Adjust vx combine test case to avoid name conflict

Given we will put all vx combine for int8 in a single file,
we need to make sure the generate function for different
types and ops has different function name. Thus, refactor
the test helper macros for avoiding possible function name
conflict.

The below test suites are passed for this patch series.
* The rv64gcv fully regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add
type and op name to generate test function name.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i32.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i64.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i8.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u16.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u32.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u64.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u8.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i16.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i32.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i64.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i8.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u16.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u32.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u64.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u8.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i16.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i32.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i64.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i8.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u16.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u32.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u64.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u8.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i32.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i64.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i8.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u16.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u32.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u64.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u8.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i16.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i32.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i64.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i8.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u16.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u32.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u64.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u8.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i16.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i32.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i64.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i8.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-run-1-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-run-1-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-run-1-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-run-1-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-run-1-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-run-1-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-run-1-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-run-1-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_binary.h: Refine the
test helper macros to avoid conflict.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_binary_run.h: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Rename vx_vadd-* testcase to vx-* for all vx combine [NFC]

We would like to arrange all vx combine asm check test into
one file for better management. Thus, rename vx_vadd-* to
vx-*.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-1-i16.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-1-i32.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i32.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-1-i64.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i64.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-1-i8.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i8.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-1-u16.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u16.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-1-u32.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u32.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-1-u64.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u64.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-1-u8.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u8.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-2-i16.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i16.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-2-i32.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i32.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-2-i64.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i64.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-2-i8.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i8.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-2-u16.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u16.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-2-u32.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u32.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-2-u64.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u64.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-2-u8.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u8.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-3-i16.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i16.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-3-i32.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i32.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-3-i64.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i64.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-3-i8.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i8.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-3-u16.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u16.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-3-u32.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u32.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-3-u64.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u64.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-3-u8.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u8.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-4-i16.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-4-i32.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i32.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-4-i64.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i64.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-4-i8.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i8.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-4-u16.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u16.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-4-u32.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u32.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-4-u64.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u64.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-4-u8.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u8.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-5-i16.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i16.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-5-i32.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i32.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-5-i64.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i64.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-5-i8.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i8.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-5-u16.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u16.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-5-u32.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u32.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-5-u64.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u64.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-5-u8.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u8.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-6-i16.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i16.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-6-i32.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i32.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-6-i64.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i64.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-6-i8.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i8.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-6-u16.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u16.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-6-u32.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u32.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-6-u64.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u64.c: ...here.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-6-u8.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u8.c: ...here.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Combine vec_duplicate + vsub.vv to vsub.vx on GR2VR cost

This patch would like to combine the vec_duplicate + vsub.vv to the
vsub.vx.  From example as below code.  The related pattern will depend
on the cost of vec_duplicate from GR2VR.  Then the late-combine will
take action if the cost of GR2VR is zero, and reject the combination
if the GR2VR cost is greater than zero.

Assume we have example code like below, GR2VR cost is 0.

  #define DEF_VX_BINARY(T, OP)                                        \
  void                                                                \
  test_vx_binary (T * restrict out, T * restrict in, T x, unsigned n) \
  {                                                                   \
    for (unsigned i = 0; i < n; i++)                                  \
      out[i] = in[i] OP x;                                            \
  }

  DEF_VX_BINARY(int32_t, -)

Before this patch:
  10   │ test_binary_vx_sub:
  11   │     beq a3,zero,.L8
  12   │     vsetvli a5,zero,e32,m1,ta,ma // Deleted if GR2VR cost zero
  13   │     vmv.v.x v2,a2                // Ditto.
  14   │     slli    a3,a3,32
  15   │     srli    a3,a3,32
  16   │ .L3:
  17   │     vsetvli a5,a3,e32,m1,ta,ma
  18   │     vle32.v v1,0(a1)
  19   │     slli    a4,a5,2
  20   │     sub a3,a3,a5
  21   │     add a1,a1,a4
  22   │     vsub.vv v1,v2,v1
  23   │     vse32.v v1,0(a0)
  24   │     add a0,a0,a4
  25   │     bne a3,zero,.L3

After this patch:
  10   │ test_binary_vx_sub:
  11   │     beq a3,zero,.L8
  12   │     slli    a3,a3,32
  13   │     srli    a3,a3,32
  14   │ .L3:
  15   │     vsetvli a5,a3,e32,m1,ta,ma
  16   │     vle32.v v1,0(a1)
  17   │     slli    a4,a5,2
  18   │     sub a3,a3,a5
  19   │     add a1,a1,a4
  20   │     vsub.vx v1,v1,a2
  21   │     vse32.v v1,0(a0)
  22   │     add a0,a0,a4
  23   │     bne a3,zero,.L3

The below test suites are passed for this patch.
* The rv64gcv fully regression test.

gcc/ChangeLog:

* config/riscv/autovec-opt.md (*<optab>_vx_<mode>): Add new
pattern to convert vec_duplicate + vsub.vv to vsub.vx.
* config/riscv/riscv.cc (riscv_rtx_costs): Add minus as plus op.
* config/riscv/vector-iterators.md: Add minus to iterator
any_int_binop_no_shift_vx.

Signed-off-by: Pan Li <pan2.li@intel.com>

Daily bump.

c++: remove coroutines.exp

coroutines.exp was basically only there to add -std=c++20 to all the tests;
removing it lets us use the general support for running tests under multiple
standards. Doing this revealed that some tests that specifically run in
C++17 mode were relying on -std=c++20 followed by -std=c++17 leaving
flag_coroutines set, which seems unintentional, and different from how we
handle other feature flags. So this changes that, and adds the missing
-fcoroutines to those tests.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/co-await-syntax-09-convert.C: Add -fcoroutines.
* g++.dg/coroutines/co-await-syntax-10.C
* g++.dg/coroutines/co-await-syntax-11.C
* g++.dg/coroutines/co-await-void_type.C
* g++.dg/coroutines/co-return-warning-1.C
* g++.dg/coroutines/ramp-return-a.C
* g++.dg/coroutines/ramp-return-c.C: Likewise.
* g++.dg/coroutines/coroutines.exp: Removed.
* lib/g++-dg.exp: Start at C++20 for coroutines/

gcc/c-family/ChangeLog:

* c-opts.cc (c_common_post_options): Set flag_coroutines.
(set_std_cxx20, set_std_cxx23, set_std_cxx26): Not here.

Fortran: default-initialization and functions returning derived type [PR85750]

Functions with non-pointer, non-allocatable result and of derived type did
not always get initialized although the type had default-initialization,
and a derived type component had the allocatable or pointer attribute.
Rearrange the logic when to apply default-initialization.

PR fortran/85750

gcc/fortran/ChangeLog:

* resolve.cc (resolve_symbol): Reorder conditions when to apply
default-initializers.

gcc/testsuite/ChangeLog:

* gfortran.dg/alloc_comp_auto_array_3.f90: Adjust scan counts.
* gfortran.dg/alloc_comp_class_3.f03: Remove bogus warnings.
* gfortran.dg/alloc_comp_class_4.f03: Likewise.
* gfortran.dg/allocate_with_source_14.f03: Adjust scan count.
* gfortran.dg/derived_constructor_comps_6.f90: Likewise.
* gfortran.dg/derived_result_5.f90: New test.

Update gcc zh_CN.po

* zh_CN.po: Update.

cobol: One additional edit to testsuite/cobol.dg/group1/check_88.cob [PR120251]

Missed one edit. This fixes that.

gcc/testsuite/ChangeLog:

PR cobol/120251
* cobol.dg/group1/check_88.cob: One final regex "." instead of "ß"

Update cpplib zh_CN.po

* zh_CN.po: Update.

Enhance bitwise_and::op1_range

Any known bits from the LHS range can be used to specify known bits in
the non-mask operand.

PR tree-optimization/116546
gcc/
* range-op.cc (operator_bitwise_and::op1_range): Utilize bitmask
from the LHS to improve op1's bitmask.

gcc/testsuite/
* gcc.dg/pr116546.c: New.

Allow bitmask intersection to process unknown masks.

bitmask_intersection should not return immediately if the current mask is
unknown. Unknown may mean its the default for a range, and this may
interact in intersting ways with the other bitmask.

PR tree-optimization/116546
* value-range.cc (irange::intersect_bitmask): Allow unknown
bitmasks to be processed.

Improve constant bitmasks.

bitmasks for constants are created only for trailing zeros. It is no
additional work to also include leading 1's in the value that are also
known.
before : [5, 7] mask 0x7 value 0x0
after : [5, 7] mask 0x3 value 0x4

PR tree-optimization/116546
* value-range.cc (irange_bitmask::irange_bitmask): Include
leading ones in the bitmask.

Turn get_bitmask_from_range into an irange_bitmask constructor.

There are other places where this is interesting, so move the static
function into a constructor for class irange_bitmask.

* value-range.cc (irange_bitmask::irange_bitmask): Rename from
get_bitmask_from_range and tweak.
(prange::set): Use new constructor.
(prange::intersect): Use new constructor.
(irange::get_bitmask): Likewise.
* value-range.h (irange_bitmask): New constructor prototype.

Check for casts becoming UNDEFINED.

In various situations a cast that is ultimately unreahcable may produce
an UNDEFINED result, and we can't check the bounds in this case.

PR tree-optimization/120277
gcc/
* range-op-ptr.cc (operator_cast::fold_range): Check if the cast
if UNDEFINED before setting bounds.

gcc/testsuite/
* gcc.dg/pr120277.c: New.