gimple: Add limit after which faster switchlower algs are used [PR117091] [PR117352]

This patch adds a limit on the number of cases of a switch.  When this
limit is exceeded, switch lowering decides to use faster but less
powerful algorithms.

In particular this means that for finding bit tests switch lowering
decides between the old dynamic programming O(n^2) algorithm and the
new greedy algorithm that Andi Kleen recently added but then reverted
due to PR117352.  It also means that switch lowering may bail out on
finding jump tables if the switch is too large.  (It also may not bail:
the greedy algorithm can find some bit tests which then split the
switch into multiple smaller switches, and those may be small enough to
fit under the limit.)

The limit is implemented as --param switch-lower-slow-alg-max-cases.
Exceeding the limit is reported through -Wdisabled-optimization.

This patch fixes the issue with the greedy algorithm described in
PR117352.  The problem was incorrect usage of the is_beneficial()
heuristic.
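
For readers unfamiliar with the term, here is a minimal sketch of what a
"bit test" lowering means at the source level.  It is an illustration only,
not the code in tree-switch-conversion.cc; the function names
(is_selected_switch, is_selected_bittest, lowering_matches) and the case
values are made up for the example.

```cpp
#include <cassert>
#include <cstdint>

// Source form: several sparse case labels share one target.
bool is_selected_switch(unsigned x) {
    switch (x) {
    case 1: case 3: case 4: case 9: case 17:
        return true;
    default:
        return false;
    }
}

// Hand-written equivalent of a bit-test lowering: one range check and
// one mask test replace the chain of comparisons.
bool is_selected_bittest(unsigned x) {
    constexpr std::uint32_t mask =
        (1u << 1) | (1u << 3) | (1u << 4) | (1u << 9) | (1u << 17);
    return x < 32 && ((mask >> x) & 1u) != 0;
}

// Compare both forms over a small range of inputs.
bool lowering_matches() {
    for (unsigned x = 0; x < 100; ++x) {
        if (is_selected_switch(x) != is_selected_bittest(x))
            return false;
    }
    assert(is_selected_bittest(17));
    return true;
}
```

Finding which groups of cases can share such a mask is the clustering
problem that the slow (dynamic programming) and fast (greedy) algorithms
solve.  The limit would presumably be set with the usual GCC syntax,
--param switch-lower-slow-alg-max-cases=N.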

gcc/ChangeLog:

PR middle-end/117091
PR middle-end/117352
* doc/invoke.texi: Add switch-lower-slow-alg-max-cases.
* params.opt: Add switch-lower-slow-alg-max-cases.
* tree-switch-conversion.cc (jump_table_cluster::find_jump_tables):
Note in a comment that we are looking for jump tables in
case sequences delimited by the already found bit tests.
(bit_test_cluster::find_bit_tests): Decide between
find_bit_tests_fast() and find_bit_tests_slow().
(bit_test_cluster::find_bit_tests_fast): New function.
(bit_test_cluster::find_bit_tests_slow): New function.
(switch_decision_tree::analyze_switch_statement): Report
exceeding the limit.
* tree-switch-conversion.h: Add find_bit_tests_fast() and
find_bit_tests_slow().

Co-Authored-By: Andi Kleen <ak@gcc.gnu.org>
Signed-off-by: Filip Kastl <fkastl@suse.cz>

c++: allow stores to anon union vars to change current union member in constexpr [PR117614]

Since r14-4771 the FE tries to differentiate between cases where the lhs
of a store allows changing the current union member and cases where it
doesn't, and cases where it doesn't includes everything that has gone
through the cxx_eval_constant_expression path on the lhs.
As the testcase shows, DECL_ANON_UNION_VAR_P vars were handled like that
too, even when stores to them are the only way to change the current
union member in the sources.

So, the following patch just handles that case manually without calling
cxx_eval_constant_expression and without setting evaluated to true.

2024-12-11 Jakub Jelinek <jakub@redhat.com>

PR c++/117614
* constexpr.cc (cxx_eval_store_expression): For stores to
DECL_ANON_UNION_VAR_P vars just continue with DECL_VALUE_EXPR
of it, without setting evaluated to true or full
cxx_eval_constant_expression.

* g++.dg/cpp2a/constexpr-union8.C: New test.

c++: tweak colorization of incompatible declspecs

Introduce a helper function for complaining about "signed unsigned"
and "short long". Add colorization there so that e.g. the 'signed'
and 'unsigned' are given consistent contrasting colors in both the
message and the quoted source.

gcc/cp/ChangeLog:
* decl.cc: Add #include "diagnostic-highlight-colors.h"
and #include "pretty-print-markup.h".
(complain_about_incompatible_declspecs): New.
(grokdeclarator): Use it when complaining about both 'signed' and
'unsigned', and both 'long' and 'short'.

gcc/ChangeLog:
* diagnostic-highlight-colors.h: Tweak comment.
* pretty-print-markup.h (class pp_element_quoted_string): New,
based on pretty-print.cc's selftest::test_element, adding an
optional highlight color.
* pretty-print.cc (class test_element): Drop.
(selftest::test_pp_format): Use pp_element_quoted_string.
(selftest::test_urlification): Likewise.

gcc/testsuite/ChangeLog:
* g++.dg/diagnostic/long-short-colorization.C: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

diagnostics: suppress "note: " prefix in nested diagnostics [PR116253]

This patch is a followup to:
"c++: use diagnostic nesting [PR116253]"

This patch tweaks how text output with experimental-nesting=yes
prints nested diagnostics, by omitting the leading "note: " from
nested notes.

This reduces the amount of visual cruft the user has to ignore when
reading C++ template errors; see the examples in the testsuite.

This doesn't affect the output for users who have not opted-in
to nested diagnostic-printing.

gcc/ChangeLog:
PR other/116253
* diagnostic-format-text.cc (build_prefix): Don't add the
"note: " prefix when showing nested diagnostics.

gcc/testsuite/ChangeLog:
PR other/116253
* g++.dg/concepts/nested-diagnostics-1-truncated.C: Update
expected output.
* g++.dg/concepts/nested-diagnostics-1.C: Likewise.
* g++.dg/concepts/nested-diagnostics-2.C: Likewise.
* gcc.dg/plugin/diagnostic-test-nesting-text-indented-show-levels.c:
Likewise.
* gcc.dg/plugin/diagnostic-test-nesting-text-indented-unicode.c:
Likewise.
* gcc.dg/plugin/diagnostic-test-nesting-text-indented.c: Likewise.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

c++: print z candidate count and number them (v2)

Changed in v2: changed wording to "there is"/"there are" rather
than "we found".

This patch is a followup to:
"c++: use diagnostic nesting [PR116253]"

Following Sy Brand's UX suggestions in P2429R0 for example 1, this patch
tweaks print_z_candidates to add a note about the number of candidates,
and adds a candidate number to each one.

Various examples of output can be seen in the testsuite part of the
patch.

gcc/cp/ChangeLog:
* call.cc (print_z_candidates): Count the number of
candidates and issue a note stating the count at an
intermediate nesting level. Number the individual
candidates.

gcc/testsuite/ChangeLog:
* g++.dg/concepts/diagnostic9.C: Update expected
results for candidate count and numbering.
* g++.dg/concepts/nested-diagnostics-1-truncated.C: Likewise.
* g++.dg/concepts/nested-diagnostics-1.C: Likewise.
* g++.dg/concepts/nested-diagnostics-2.C: Likewise.
* g++.dg/cpp23/explicit-obj-lambda11.C: Likewise.
* g++.dg/cpp2a/desig4.C: Likewise.
* g++.dg/cpp2a/desig6.C: Likewise.
* g++.dg/cpp2a/spaceship-eq15.C: Likewise.
* g++.dg/diagnostic/function-color1.C: Likewise.
* g++.dg/diagnostic/param-type-mismatch-2.C: Likewise.
* g++.dg/diagnostic/pr100716-1.C: Likewise.
* g++.dg/diagnostic/pr100716.C: Likewise.
* g++.dg/lookup/operator-2.C: Likewise.
* g++.dg/lookup/pr80891-5.C: Likewise.
* g++.dg/modules/adhoc-1_b.C: Likewise.
* g++.dg/modules/err-1_c.C: Likewise.
* g++.dg/modules/err-1_d.C: Likewise.
* g++.dg/other/return2.C: Likewise.
* g++.dg/overload/error6.C: Likewise.
* g++.dg/template/local6.C: Likewise.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

diagnostics: tweak output for nested messages [PR116253]

When printing nested messages with
-fdiagnostics-set-output=text:experimental-nesting=yes
avoid printing a line such as the "cc1plus:" in the following:
• note: set ‘-fconcepts-diagnostics-depth=’ to at least 2 for more detail
cc1plus:
for "special" locations such as UNKNOWN_LOCATION.

gcc/ChangeLog:
PR other/116253
* diagnostic-format-text.cc (on_report_diagnostic): When showing
locations for nested messages on new lines, don't print
UNKNOWN_LOCATION or BUILTINS_LOCATION.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

input.cc: rename file_cache:in_context

No functional change intended.

gcc/ChangeLog:
* input.cc (file_cache::initialize_input_context): Rename member
"in_context" to "m_input_context".
(file_cache::add_file): Likewise.
* input.h (class file_cache): Likewise.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

Ada: Add GNU/Hurd x86_64 support

This is essentially the same as the i386-pc-gnu section, the differences
are the same as between freebsd i386 and freebsd x86_64.

gcc/ada/ChangeLog:

* Makefile.rtl: Add x86_64-pc-gnu section.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>

Ada: Fix GNU/Hurd priority range

GNU/Mach currently uses a 0..63 range.

gcc/ada/ChangeLog:

* libgnat/system-gnu.ads: New file.
* Makefile.rtl (x86-gnuhurd): Use libgnat/system-gnu.ads instead of
libgnat/system-freebsd.ads.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>

Ada: Factorize bsd signal definitions

They are all the same on all BSD-like systems (including GNU/Hurd).

gcc/ada/ChangeLog:

* libgnarl/a-intnam__freebsd.ads: Rename to...
* libgnarl/a-intnam__bsd.ads: ... new file.
* libgnarl/a-intnam__dragonfly.ads: Remove file.
* Makefile.rtl (x86-kfreebsd, x86-gnuhurd, x86_64-kfreebsd,
aarch64-freebsd, x86-freebsd, x86_64-freebsd): Use
libgnarl/a-intnam__bsd.ads instead of libgnarl/a-intnam__freebsd.ads.
(x86_64-dragonfly): Use libgnarl/a-intnam__bsd.ads instead of
libgnarl/a-intnam__dragonfly.ads.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>

ipa: Update value range jump functions during inlining

When inlining (during the analysis phase) a call graph edge, we update
all pass-through jump functions corresponding to edges going out of
the newly inlined function to be relative to the function into which
we are inlining or to expose the information originally captured for
the edge that is being inlined.

Similarly, we can combine the value range information in pass-through
jump functions corresponding to both edges, which is what this patch
adds - at least for the case when the inlined pass-through is a
simple, non-arithmetic one, which is the case that we also handle for
constant and aggregate jump function parts.
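
As a rough illustration of combining value ranges across a simple
pass-through, the sketch below models a range as a closed interval and
combines the two edges by intersection.  Both the ToyRange type and the
assumption that "combine" means intersection are simplifications made for
this example; GCC's value ranges and the code in ipa-prop.cc are richer.

```cpp
#include <algorithm>
#include <cassert>
#include <optional>

// Toy closed interval [lo, hi]; a stand-in for a real value range.
struct ToyRange {
    long lo;
    long hi;
};

// Intersect two ranges; an empty result is reported as nullopt.
std::optional<ToyRange> intersect(const ToyRange &a, const ToyRange &b) {
    assert(a.lo <= a.hi && b.lo <= b.hi);
    ToyRange r{std::max(a.lo, b.lo), std::min(a.hi, b.hi)};
    if (r.lo > r.hi)
        return std::nullopt;
    return r;
}

// outer_edge: range known for the argument on the edge being inlined.
// inner_edge: range recorded on the edge leaving the inlined function,
//             for a parameter that is passed through unchanged.
// After inlining, the value must satisfy both, so the ranges are intersected.
std::optional<ToyRange> combine_pass_through(const ToyRange &outer_edge,
                                             const ToyRange &inner_edge) {
    return intersect(outer_edge, inner_edge);
}
```

For example, combining [0, 100] with [10, 200] yields [10, 100], which is
tighter than either range alone.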

gcc/ChangeLog:

2024-11-01 Martin Jambor <mjambor@suse.cz>

* ipa-cp.h: Forward declare class ipa_vr.
(ipa_vr_operation_and_type_effects): Declare.
* ipa-cp.cc (ipa_vr_operation_and_type_effects): Make public.
* ipa-prop.cc (update_jump_functions_after_inlining): Also update
value range jump functions.

middle-end: Add initial support for poly_int64 BIT_FIELD_REF in expand pass [PR96342]

While `poly_int64' has been the default representation of bitfield size
and offset for some time, there was a lack of support for the use of
non-constant `poly_int64' values for those values throughout the
compiler, limiting the applicability of the BIT_FIELD_REF rtl expression
for variable length vectors, such as those used by SVE.

This patch starts work on extending the functionality of relevant
functions in the expand pass such as to enable their use by the compiler
for such vectors.

gcc/ChangeLog:

PR target/96342
* expr.cc (store_constructor): Enable poly_{u}int64 type usage.
(get_inner_reference): Ditto.

Co-authored-by: Tamar Christina <tamar.christina@arm.com>

middle-end: add vec_init support for variable length subvector concatenation. [PR96342]

For architectures where the vector length is a compile-time variable
that represents a runtime constant, as is the case with SVE, it is
perfectly reasonable for such a vector to be made up of two (or more)
subvector components of a compatible variable sub-length.

One example of this would be the concatenation of two VNx4QI vectors
into a single VNx8QI vector.

This patch adds initial support for this feature in the middle-end,
removing the `.is_constant()' constraint on the vector's number of
elements.  Instead, the constant count used is the number of subvectors
required to fill the vector (the subvectors must then also be of
variable length, such that the polynomial ratio of the two lengths is a
compile-time constant).
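
The "polynomial ratio" idea can be sketched with a toy model in which an
element count has the form c0 + c1 * x, with x unknown at compile time.
The ToyPoly type and constant_multiple helper below are hypothetical and
only illustrate the arithmetic; they are not GCC's poly_int API.  The
element counts used for VNx8QI (8 + 8x) and VNx4QI (4 + 4x) are stated
here as an assumption about how those modes are modelled.

```cpp
#include <cassert>
#include <optional>

// Toy model of a length c0 + c1 * x, where x is fixed at run time but
// unknown at compile time.
struct ToyPoly {
    long c0;
    long c1;
};

// If a == k * b for every x, with k a positive integer, return k.
// Otherwise the ratio is not a compile-time constant.
std::optional<long> constant_multiple(const ToyPoly &a, const ToyPoly &b) {
    assert(b.c0 > 0 && b.c1 >= 0);  // divisor must be a positive count
    if (a.c0 % b.c0 != 0)
        return std::nullopt;
    long k = a.c0 / b.c0;
    if (k <= 0 || a.c1 != k * b.c1)
        return std::nullopt;
    return k;
}
```

With these assumptions, constant_multiple({8, 8}, {4, 4}) gives 2: two
VNx4QI subvectors fill one VNx8QI vector regardless of x.  A fixed-length
count such as {16, 0} has no constant ratio to {4, 4}.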

gcc/ChangeLog:

PR target/96342
* expr.cc (store_constructor): Add support for variable-length
vectors.

Co-authored-by: Tamar Christina <tamar.christina@arm.com>

middle-end: Fix mask length arg in call to vect_get_loop_mask [PR96342]

When issuing multiple calls to a simdclone in a vectorized loop,
TYPE_VECTOR_SUBPARTS(vectype) gives the incorrect number when compared
to the TYPE_VECTOR_SUBPARTS result we get from the mask type derived
from the relevant `rgroup_controls' entry within `vect_get_loop_mask'.

By passing `masktype' instead, we are able to get the correct number of
vector subparts and thus eliminate the ICE in the call to
`vect_get_loop_mask' when the data type for which we retrieve the mask
is wider than the one used when defining the mask at mask registration
time.

gcc/ChangeLog:

PR target/96342
* tree-vect-stmts.cc (vectorizable_simd_clone_call):
s/vectype/masktype/ in call to vect_get_loop_mask.

middle-end: Pass stmt_vec_info to TARGET_SIMD_CLONE_USABLE [PR96342]

This patch adds stmt_vec_info to TARGET_SIMD_CLONE_USABLE to make sure the
target can reject a simd_clone based on the vector mode it is using.
This is needed because for VLS SVE vectorization the vectorizer accepts
Advanced SIMD simd clones when vectorizing using SVE types because the simdlens
might match. This will cause type errors later on.

Other targets do not currently need to use this argument.

gcc/ChangeLog:

PR target/96342
* target.def (TARGET_SIMD_CLONE_USABLE): Add argument.
* tree-vect-stmts.cc (vectorizable_simd_clone_call): Pass stmt_info to
call TARGET_SIMD_CLONE_USABLE.
* config/aarch64/aarch64.cc (aarch64_simd_clone_usable): Add argument
and use it to reject the use of SVE simd clones with Advanced SIMD
modes.
* config/gcn/gcn.cc (gcn_simd_clone_usable): Add unused argument.
* config/i386/i386.cc (ix86_simd_clone_usable): Likewise.
* doc/tm.texi: Regenerate.

Co-authored-by: Victor Do Nascimento <victor.donascimento@arm.com>
Co-authored-by: Tamar Christina <tamar.christina@arm.com>

middle-end: use two's complement equality when comparing IVs during candidate selection  [PR114932]

IVOPTS normally uses affine trees to perform comparisons between different IVs,
but these were missing in two key spots, where normal tree equivalences were
used instead.

In some cases where we have a two's complement equivalence but not a strict
signedness equivalence, we end up generating both a signed and an unsigned IV
for the same candidate.

This patch implements a new OEP flag called OEP_ASSUME_WRAPV.  This flag will
check if the operands would produce the same bit values after the computations
even if the final sign is different.

This happens quite a lot with Fortran but can also happen in C because the same
code is unable to figure out when one expression is a multiple of another.

As an example in the attached testcase we get:

Initial set of candidates:
  cost: 24 (complexity 3)
  reg_cost: 9
  cand_cost: 15
  cand_group_cost: 0 (complexity 3)
  candidates: 1, 6, 8
   group:0 --> iv_cand:6, cost=(0,1)
   group:1 --> iv_cand:1, cost=(0,0)
   group:2 --> iv_cand:8, cost=(0,1)
   group:3 --> iv_cand:8, cost=(0,1)
  invariant variables: 6
  invariant expressions: 1, 2

<Invariant Expressions>:
inv_expr 1:     stride.3_27 * 4
inv_expr 2:     (unsigned long) stride.3_27 * 4

These end up being used in the same group:

Group 1:
cand  cost    compl.  inv.expr.       inv.vars
1     0       0       NIL;    6
2     0       0       NIL;    6
3     0       0       NIL;    6

which ends up with IVOPTS picking the signed and unsigned IVs:

Improved to:
  cost: 24 (complexity 3)
  reg_cost: 9
  cand_cost: 15
  cand_group_cost: 0 (complexity 3)
  candidates: 1, 6, 8
   group:0 --> iv_cand:6, cost=(0,1)
   group:1 --> iv_cand:1, cost=(0,0)
   group:2 --> iv_cand:8, cost=(0,1)
   group:3 --> iv_cand:8, cost=(0,1)
  invariant variables: 6
  invariant expressions: 1, 2

and so generates the same IV as both signed and unsigned:

;;   basic block 21, loop depth 3, count 214748368 (estimated locally, freq 58.2545), maybe hot
;;    prev block 28, next block 31, flags: (NEW, REACHABLE, VISITED)
;;    pred:       28 [always]  count:23622320 (estimated locally, freq 6.4080) (FALLTHRU,EXECUTABLE)
;;                25 [always]  count:191126046 (estimated locally, freq 51.8465) (FALLTHRU,DFS_BACK,EXECUTABLE)
  # .MEM_66 = PHI <.MEM_34(28), .MEM_22(25)>
  # ivtmp.22_41 = PHI <0(28), ivtmp.22_82(25)>
  # ivtmp.26_51 = PHI <ivtmp.26_55(28), ivtmp.26_72(25)>
  # ivtmp.28_90 = PHI <ivtmp.28_99(28), ivtmp.28_98(25)>

...

;;   basic block 24, loop depth 3, count 214748366 (estimated locally, freq 58.2545), maybe hot
;;    prev block 22, next block 25, flags: (NEW, REACHABLE, VISITED)
;;    pred:       22 [always]  count:95443719 (estimated locally, freq 25.8909) (FALLTHRU)
;;                21 [33.3% (guessed)]  count:71582790 (estimated locally, freq 19.4182) (TRUE_VALUE,EXECUTABLE)
;;                31 [33.3% (guessed)]  count:47721860 (estimated locally, freq 12.9455) (TRUE_VALUE,EXECUTABLE)
# .MEM_22 = PHI <.MEM_44(22), .MEM_31(21), .MEM_79(31)>
ivtmp.22_82 = ivtmp.22_41 + 1;
ivtmp.26_72 = ivtmp.26_51 + _80;
ivtmp.28_98 = ivtmp.28_90 + _39;

These two IVs are always used as unsigned, so IVOPTS generates:

  _73 = stride.3_27 * 4;
  _80 = (unsigned long) _73;
  _54 = (unsigned long) stride.3_27;
  _39 = _54 * 4;

Which means that in e.g. exchange2 we generate a lot of duplicate code.

This is because candidate 6 and 8 are equivalent under two's complement but have
different signs.
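
The equivalence can be checked outside the compiler.  The sketch below
mirrors the two invariant expressions from the dump, multiplying in the
signed type and then converting versus converting first and multiplying in
the unsigned type.  The function names are made up for the example, and the
signed variant is restricted to inputs that do not overflow, since signed
overflow is undefined in C++.

```cpp
#include <cassert>
#include <cstdint>

// Like "_73 = stride * 4; _80 = (unsigned long) _73;"
std::uint64_t scale_signed_then_convert(std::int64_t stride) {
    // Stay within the range where the signed multiply cannot overflow.
    assert(stride >= INT64_MIN / 4 && stride <= INT64_MAX / 4);
    std::int64_t t = stride * 4;
    return static_cast<std::uint64_t>(t);
}

// Like "_54 = (unsigned long) stride; _39 = _54 * 4;"
std::uint64_t convert_then_scale(std::int64_t stride) {
    return static_cast<std::uint64_t>(stride) * 4u;
}
```

Both functions return the same bit pattern for any stride in the checked
range, including negative ones, which is why keeping both IVs only
duplicates work.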

This patch changes it so that if two IVs are affine equivalent, we just pick
one over the other.  IVOPTS already has code for this, so the patch just uses
affine trees instead of trees for the check.

With it we get:

<Invariant Expressions>:
inv_expr 1:     stride.3_27 * 4

<Group-candidate Costs>:
Group 0:
  cand  cost    compl.  inv.expr.       inv.vars
  5     0       2       NIL;    NIL;
  6     0       3       NIL;    NIL;

Group 1:
  cand  cost    compl.  inv.expr.       inv.vars
  1     0       0       NIL;    6
  2     0       0       NIL;    6
  3     0       0       NIL;    6
  4     0       0       NIL;    6

Initial set of candidates:
  cost: 16 (complexity 3)
  reg_cost: 6
  cand_cost: 10
  cand_group_cost: 0 (complexity 3)
  candidates: 1, 6
   group:0 --> iv_cand:6, cost=(0,3)
   group:1 --> iv_cand:1, cost=(0,0)
  invariant variables: 6
  invariant expressions: 1

gcc/ChangeLog:

PR tree-optimization/114932
* fold-const.cc (operand_compare::operand_equal_p): Use it.
(operand_compare::verify_hash_value): Likewise.
(operand_compare::hash_operand): Likewise.
(test_operand_equality::test): New.
(fold_const_cc_tests): Use it.
* tree-core.h (enum operand_equal_flag): Add OEP_ASSUME_WRAPV.
* tree-ssa-loop-ivopts.cc (record_group_use): Check for structural eq.

gcc/testsuite/ChangeLog:

PR tree-optimization/114932
* gfortran.dg/addressing-modes_2.f90: New test.

middle-end: refactor type to be explicit in operand_equal_p [PR114932]

This is a refactoring with no expected behavioral change.
The goal with this is to make the type of the expressions being used explicit.

I did not change all the recursive calls to operand_equal_p () to recurse
directly to the new function but instead this goes through the top level call
which re-extracts the types.

This was done because in most of the cases where we recurse type == arg.
The second patch makes use of this new flexibility to implement an overload
of operand_equal_p which checks for equality under two's complement.

gcc/ChangeLog:

PR tree-optimization/114932
* fold-const.cc (operand_compare::operand_equal_p): Split into one that
takes explicit type parameters and use that in public one.
* fold-const.h (class operand_compare): Add operand_equal_p private
overload.

MAINTAINERS: add myself to write after approval

ChangeLog:

* MAINTAINERS: Add myself to write after approval.

autoupdate: replace obsolete macros in libiberty

Autoreconf-2.72 warns about obsolete macros. This patch aims at removing
the noise from a future upgrade to autoreconf-2.72 or later. It is in
no way a complete patch allowing the upgrade to autoreconf-2.72.

- AC_GNU_SOURCE by AC_USE_SYSTEM_EXTENSIONS
  https://www.gnu.org/savannah-checkouts/gnu/autoconf/manual/autoconf-2.72/autoconf.html#index-AC_005fGNU_005fSOURCE-1
- AC_CONFIG_HEADER by AC_CONFIG_HEADERS
  https://www.gnu.org/software/automake/manual/1.12.2/html_node/Obsolete-Macros.html#index-AM_005fCONFIG_005fHEADER

Those fixes were originally submitted in a patch series in binutils.
https://inbox.sourceware.org/binutils/878qthm6a0.fsf@gentoo.org/

libiberty/ChangeLog:

* configure: Regenerate.
* configure.ac: Fix autoupdate warnings.

libstdc++: Make std::println use locale from ostream (LWG 4088)

This was just approved in Wrocław.

libstdc++-v3/ChangeLog:

* include/std/ostream (println): Pass stream's locale to
std::format, as per LWG 4088.
* testsuite/27_io/basic_ostream/print/1.cc: Check std::println
with custom locale. Remove unused brit_punc class.

aix: Resolve build failure with default C23

libiberty/getopt.c file is defining _NO_PROTO, which causes
conflicting declarations for the functions in AIX header files
like stdio.h & stdlib.h.
It looks like the _NO_PROTO define was added long ago, and the conflicting
declarations were always present until the C23 standard uncovered them.

Remove the block defining _NO_PROTO, as both Tru64 UNIX (ex-OSF/1)
and AIX 3.2 are no longer supported.

libiberty/ChangeLog:

* getopt.c: Remove _NO_PROTO block.

aarch64: Use SVE ASRD instruction with Neon modes.

The ASRD instruction on SVE performs an arithmetic shift right by an immediate
for divide.

This patch enables the use of ASRD with Neon modes.

For example:

int in[N], out[N];

void
foo (void)
{
  for (int i = 0; i < N; i++)
    out[i] = in[i] / 4;
}

compiles to:

ldr q31, [x1, x0]
cmlt v30.16b, v31.16b, #0
and z30.b, z30.b, 3
add v30.16b, v30.16b, v31.16b
sshr v30.16b, v30.16b, 2
str q30, [x0, x2]
add x0, x0, 16
cmp x0, 1024

but can just be:

ldp q30, q31, [x0], 32
asrd z31.b, p7/m, z31.b, #2
asrd z30.b, p7/m, z30.b, #2
stp q30, q31, [x1], 32
cmp x0, x2
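
The longer sequence above reflects the usual way to divide a signed value
by a power of two without a divide instruction.  A scalar sketch of that
arithmetic follows; the function name div_pow2 is made up for the example,
and it relies on right shifts of negative values being arithmetic, which
C++20 guarantees and GCC has always implemented.

```cpp
#include <cassert>
#include <cstdint>

// Signed division by 2^shift, rounding toward zero, without a divide.
std::int32_t div_pow2(std::int32_t x, unsigned shift) {
    assert(shift >= 1 && shift < 31);
    // All-ones for negative x, zero otherwise (the role of cmlt).
    std::int32_t sign_mask = x < 0 ? -1 : 0;
    // Bias of 2^shift - 1 for negative x only (the role of "and ..., 3"
    // when shift == 2).
    std::int32_t bias = sign_mask & ((std::int32_t{1} << shift) - 1);
    // Add the bias, then shift arithmetically (the roles of add and sshr).
    return (x + bias) >> shift;
}
```

Without the bias, a plain arithmetic shift would round toward negative
infinity, so -7 >> 2 would give -2 instead of -7 / 4 == -1.  ASRD folds the
bias and the shift into a single instruction.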

This patch also adds the following overload:
aarch64_ptrue_reg (machine_mode pred_mode, machine_mode data_mode)
Depending on the data mode, the function returns a predicate with the
appropriate bits set.

The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.

gcc/ChangeLog:

* config/aarch64/aarch64.cc (aarch64_ptrue_reg): New overload.
* config/aarch64/aarch64-protos.h (aarch64_ptrue_reg): Likewise.
* config/aarch64/aarch64-sve.md: Extended sdiv_pow2<mode>3
and *sdiv_pow2<mode>3 to support Neon modes.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve/sve-asrd.c: New test.

Co-authored-by: Richard Sandiford <richard.sandiford@arm.com>
Signed-off-by: Soumya AR <soumyaa@nvidia.com>

aarch64: Extend SVE2 bit-select instructions for Neon modes.

NBSL, BSL1N, and BSL2N are bit-select instructions on SVE2 with certain operands
inverted. These can be extended to work with Neon modes.

Since these instructions are unpredicated, duplicate patterns were added with
the predicate removed to generate these instructions for Neon modes.
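
As a reminder of what these instructions compute, here is a scalar sketch
of the bitwise semantics as I understand them from the instruction names
(result inverted, first input inverted, second input inverted).  The
function and parameter names are illustrative, and the mapping of a, b and
sel onto the actual register operands is not spelled out here.

```cpp
#include <cassert>
#include <cstdint>

using u64 = std::uint64_t;

// Plain bit select: take bits of a where sel is 1, bits of b where sel is 0.
constexpr u64 bsl(u64 a, u64 b, u64 sel) { return (a & sel) | (b & ~sel); }

// NBSL: bit select with the result inverted.
constexpr u64 nbsl(u64 a, u64 b, u64 sel) { return ~bsl(a, b, sel); }

// BSL1N: bit select with the first input inverted.
constexpr u64 bsl1n(u64 a, u64 b, u64 sel) { return bsl(~a, b, sel); }

// BSL2N: bit select with the second input inverted.
constexpr u64 bsl2n(u64 a, u64 b, u64 sel) { return bsl(a, ~b, sel); }

static_assert(bsl(0xFF00, 0x00FF, 0xF0F0) == 0xF00F, "plain select");
static_assert(bsl1n(0xFF00, 0x00FF, 0xF0F0) == 0x00FF, "first input inverted");
```

Because each result bit depends only on the corresponding input bits, the
operations need no predicate and are independent of the element size.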

The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.

Signed-off-by: Soumya AR <soumyaa@nvidia.com>
gcc/ChangeLog:

* config/aarch64/aarch64-sve2.md
(*aarch64_sve2_nbsl_unpred<mode>): New pattern to match unpredicated
form.
(*aarch64_sve2_bsl1n_unpred<mode>): Likewise.
(*aarch64_sve2_bsl2n_unpred<mode>): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve/bitsel.c: New test.

Fix inaccuracy in cunroll/cunrolli when considering what's innermost loop.

r15-919-gef27b91b62c3aa removed 1 / 3 size reduction for innermost
loop, but it doesn't accurately remember what's "innermost" for 2
testcases in PR117888.

1) For pass_cunroll, the "innermost" loop could be an originally outer
loop with inner loop completely unrolled by cunrolli. The patch moves
local variable cunrolli to parameter of tree_unroll_loops_completely
and passes it directly from execute of the pass.

2) For pass_cunrolli, cunrolli is set to false when the sibling loop
of an innermost loop is completely unrolled, and it inaccurately
takes the innermost loop as an "outer" loop. The patch adds another
parameter, innermost, to help recognize the "original" innermost loop.

gcc/ChangeLog:

PR tree-optimization/117888
* tree-ssa-loop-ivcanon.cc (try_unroll_loop_completely): Use
cunrolli instead of cunrolli && !loop->inner to check if it's
innermost loop.
(canonicalize_loop_induction_variables): Add new parameter
const_sbitmap innermost, and pass
cunrolli
&& (unsigned) loop->num < SBITMAP_SIZE (innermost)
&& bitmap_bit_p (innermost, loop->num) as "cunrolli" to
try_unroll_loop_completely.
(canonicalize_induction_variables): Pass innermost to
canonicalize_loop_induction_variables.
(tree_unroll_loops_completely_1): Add new parameter
const_sbitmap innermost.
(tree_unroll_loops_completely): Move local variable cunrolli
to parameter to indicate it's from pass cunrolli, also track
all "original" innermost loop at the beginning.

gcc/testsuite/ChangeLog:

* gcc.dg/pr117888-2.c: New test.
* gcc.dg/vect/pr117888-1.c: Ditto.
* gcc.dg/tree-ssa/pr83403-1.c: Add
--param max-completely-peeled-insns=300 for arm*-*-*.
* gcc.dg/tree-ssa/pr83403-2.c: Ditto.

Daily bump.

sarif-replay: fix missing URLs [PR117944]

gcc/ChangeLog:
PR other/117944
* libsarifreplay.cc (sarif_replayer::handle_result_obj): Get any
helpUri from the rule_obj and pass it to add_rule.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

contrib: add 'libgdiagnostics' and 'sarif-replay' to bug_components

contrib/ChangeLog:
* gcc-changelog/git_commit.py (bug_components): Add
'libgdiagnostics' and 'sarif-replay'.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

PR modula2/117120: case ch with a nul char constant causes ICE

This patch fixes the ICE caused when a case clause contains
a character constant ''. The fix was to walk the caselist and
convert any 0 length string into a char constant of value 0.

gcc/m2/ChangeLog:

PR modula2/117120
* gm2-compiler/M2CaseList.mod (CaseBoundsResolved): Rewrite.
(ConvertNulStr2NulChar): New procedure function.
(NulStr2NulChar): Ditto.
(GetCaseExpression): Ditto.
(OverlappingCaseBound): Rewrite.
* gm2-compiler/M2GCCDeclare.mod (CheckResolveSubrange): Allow
'' to be used as the subrange low limit.
* gm2-compiler/M2GenGCC.mod (FoldConvert): Rewrite.
(PopKindTree): Ditto.
(BuildHighFromString): Reformat.
* gm2-compiler/SymbolTable.mod (PushConstString): Add test for
length 0 and PushChar (nul).

gcc/testsuite/ChangeLog:

PR modula2/117120
* gm2/pim/pass/forloopnulchar.mod: New test.
* gm2/pim/pass/nulcharcase.mod: New test.
* gm2/pim/pass/nulcharvar.mod: New test.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

libstdc++: Use feature test macro for pmr::polymorphic_allocator<>

Check the __glibcxx_polymorphic_allocator macro instead of just checking
whether __cplusplus > 201703L.
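
User code can follow the same pattern with the public macro.  The sketch
below assumes the standard feature test macro
__cpp_lib_polymorphic_allocator (the commit itself uses the internal
__glibcxx_polymorphic_allocator) and needs at least C++17 for
<memory_resource>.  The names byte_allocator, has_default_argument and
round_trip are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory_resource>
#include <type_traits>

// Prefer the feature test macro over a bare __cplusplus comparison.
#if defined(__cpp_lib_polymorphic_allocator) \
    && __cpp_lib_polymorphic_allocator >= 201902L
// P0339R6 gives the template a default argument of std::byte.
using byte_allocator = std::pmr::polymorphic_allocator<>;
constexpr bool has_default_argument = true;
#else
using byte_allocator = std::pmr::polymorphic_allocator<std::byte>;
constexpr bool has_default_argument = false;
#endif

static_assert(std::is_same<byte_allocator::value_type, std::byte>::value,
              "both spellings name the byte allocator");

// Allocate and release n bytes through the default memory resource.
bool round_trip(std::size_t n) {
    assert(n > 0);
    byte_allocator alloc;  // uses std::pmr::get_default_resource()
    std::byte *p = alloc.allocate(n);
    bool ok = p != nullptr;
    alloc.deallocate(p, n);
    return ok;
}
```

Either branch yields the same allocator type, so the rest of the code does
not need to know which language mode was used.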

libstdc++-v3/ChangeLog:

* include/bits/memory_resource.h (polymorphic_allocator): Use
feature test macro for P0339R6 features.

[PR117946][LRA]: When assigning hard reg use biggest mode to check ira_prohibited_class_mode_regs

A pseudo in the PR test case gets hard reg 43, which is x86 r15 (the
xmm regs follow r15).  The pseudo is of INT_SSE_CLASS and SImode but is
used in TImode as a paradoxical subreg.  r15 in TImode is wrong and does
not satisfy constraint 'r'.  Therefore LRA creates moves involving the
pseudo in TImode until the limit on reload insns is reached.
Unfortunately x86 hard_regno_mode_ok (like some hooks for other targets)
says that it is ok to use r15 for a TImode pseudo.  Therefore LRA uses
ira_prohibited_class_mode_regs for such cases, but it was checked
against the pseudo's native mode.  The patch fixes this by using the
biggest mode of the pseudo.

gcc/ChangeLog:

PR rtl-optimization/117946
* lra-assigns.cc (find_hard_regno_for_1): Use the biggest mode to
check ira_prohibited_class_mode_regs.

gcc/testsuite/ChangeLog:

PR rtl-optimization/117946
* gcc.target/i386/pr117946.c: New.

Fortran: Fix READ with padding in BLANK ZERO mode.

PR fortran/117819

libgfortran/ChangeLog:

* io/read.c (read_decimal): If the read value is short of the
specified width and pad mode is PAD yes, check for BLANK ZERO
and adjust the value accordingly.
(read_decimal_unsigned): Likewise.
(read_radix): Likewise.

gcc/testsuite/ChangeLog:

* gfortran.dg/pr117819.f90: New test.

c++: ICE with -Wduplicated-branches in template [PR117880]

In a template, for things like void() we'll create a CAST_EXPR with
a null operand.  That causes a crash with -Wduplicated-branches on:

  false ? void() : void();

because we do

  if (warn_duplicated_branches
      && (complain & tf_warning)
      && (arg2 == arg3 || operand_equal_p (arg2, arg3,
                                           OEP_ADDRESS_OF_SAME_FIELD)))

even in a template.  So one way to fix the ICE would be to check
!processing_template_decl.  But we can also do the following and
continue warning even in templates.

This ICE appeared with the removal of NON_DEPENDENT_EXPR; before,
operand_equal_p would bail on this code so there was no problem.

PR c++/117880

gcc/ChangeLog:

* fold-const.cc (operand_compare::operand_equal_p) <case tcc_unary>:
Use OP_SAME_WITH_NULL instead of OP_SAME.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wduplicated-branches8.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>

arm: Fix LDRD register overlap [PR117675]

The register indexed variants of LDRD have complex register overlap constraints
which makes them hard to use without using output_move_double (which can't be
used for atomics as it doesn't guarantee to emit atomic LDRD/STRD when required).
Add a new predicate and constraint for plain LDRD/STRD with base or base+imm.
This blocks register indexing and fixes PR117675.

gcc:
PR target/117675
* config/arm/arm.cc (arm_ldrd_legitimate_address): New function.
* config/arm/arm-protos.h (arm_ldrd_legitimate_address): New prototype.
* config/arm/constraints.md: Add new Uo constraint.
* config/arm/predicates.md (arm_ldrd_memory_operand): Add new predicate.
* config/arm/sync.md (arm_atomic_loaddi2_ldrd): Use
arm_ldrd_memory_operand and Uo.

gcc/testsuite:
PR target/117675
* gcc.target/arm/pr117675.c: Add new test.

AArch64: Add baseline tune

Clean up the extra tune defines by introducing AARCH64_EXTRA_TUNE_BASE as a
common base supported by all modern cores. Initially set it to
AARCH64_EXTRA_TUNE_CHEAP_SHIFT_EXTEND. No change in generated code.

gcc:
* config/aarch64/aarch64-tuning-flags.def (AARCH64_EXTRA_TUNE_BASE): New define.
* config/aarch64/tuning_models/ampere1b.h: Use AARCH64_EXTRA_TUNE_BASE.
* config/aarch64/tuning_models/cortexx925.h: Likewise.
* config/aarch64/tuning_models/fujitsu_monaka.h: Likewise.
* config/aarch64/tuning_models/generic_armv8_a.h: Likewise.
* config/aarch64/tuning_models/generic_armv9_a.h: Likewise.
* config/aarch64/tuning_models/neoversen1.h: Likewise.
* config/aarch64/tuning_models/neoversen2.h: Likewise.
* config/aarch64/tuning_models/neoversen3.h: Likewise.
* config/aarch64/tuning_models/neoversev1.h: Likewise.
* config/aarch64/tuning_models/neoversev2.h: Likewise.
* config/aarch64/tuning_models/neoversev3.h: Likewise.
* config/aarch64/tuning_models/neoversev3ae.h: Likewise.

AArch64: Cleanup alignment macros

Change the AARCH64_EXPAND_ALIGNMENT macro into proper function calls to make
future changes easier. Use the existing alignment settings, but avoid
overaligning small arrays or structs to 64 bits when there is no benefit.
The lower alignment gives a small reduction in data and stack size.
Using 32-bit alignment for small char arrays still improves performance of
string functions since it can be loaded in full by the first 8/16-byte load.

gcc:
* config/aarch64/aarch64.h (AARCH64_EXPAND_ALIGNMENT): Remove.
(DATA_ALIGNMENT): Use aarch64_data_alignment.
(LOCAL_ALIGNMENT): Use aarch64_stack_alignment.
* config/aarch64/aarch64.cc (aarch64_data_alignment): New function.
(aarch64_stack_alignment): Likewise.
* config/aarch64/aarch64-protos.h (aarch64_data_alignment): New prototype.
(aarch64_stack_alignment): Likewise.

AArch64: Use LDP/STP for large struct types

Use LDP/STP for large struct types as they have useful immediate offsets and
are typically faster. This removes differences between little and big endian
and allows use of LDP/STP without UNSPEC.

gcc:
* config/aarch64/aarch64.cc (aarch64_classify_address): Treat SIMD structs
identically in little and big-endian.
* config/aarch64/aarch64-simd.md (aarch64_mov<mode>): Remove VSTRUCT
instructions.
(aarch64_be_mov<mode>): Allow little-endian, rename to aarch64_mov<mode>.
(aarch64_be_movoi): Allow little-endian, rename to aarch64_movoi.
(aarch64_be_movci): Allow little-endian, rename to aarch64_movci.
(aarch64_be_movxi): Allow little-endian, rename to aarch64_movxi.
Remove big-endian special case in define_split variants.

gcc/testsuite:
* gcc.target/aarch64/torture/simd-abi-8.c: Update to check for LDP/STP.

c++: Implement a coroutine language debug dump

This gives people working on coroutines, as well as those writing tests
for coroutines, insight into the inputs and results of the coroutine
transformation passes, which is essential to understanding what happens
in the coroutine transformation.  Currently, the information dumped is
the pre-transform function (which is not otherwise available), the
generated ramp function, the generated frame type, the transformed
actor/resumer, and the destroyer stub.

While debugging this, I've also encountered a minor bug in
c-pretty-print.cc, where it tried to check DECL_REGISTER of DECLs that
did not support it.  I've added a check for that.

Similarly, I found one in pp_cxx_template_parameter, where TREE_TYPE was
called on the list cell the template parameter was in rather than on the
parameter itself.  I've fixed that.

And, lastly, there appeared to be no way to pretty-print a FIELD_DECL,
so I added support to cxx_pretty_printer::declaration for it (by reusing
the VAR_DECL path).

Co-authored-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/c-family/ChangeLog:

* c-pretty-print.cc (c_pretty_printer::storage_class_specifier):
Check that we're looking at a PARM_DECL or VAR_DECL before
looking at DECL_REGISTER.

gcc/cp/ChangeLog:

* coroutines.cc (dump_record_fields): New helper.  Iterates a
RECORD_TYPE's TYPE_FIELDS and pretty-prints them.
(dmp_str): New.  The lang-coro dump stream.
(coro_dump_id): New.  ID of the lang-coro dump.
(coro_dump_flags): New.  Flags passed to the lang-coro dump.
(coro_maybe_dump_initial_function): New helper.  Prints, if
dumping is enabled, the fndecl passed to it as the original
function.
(coro_maybe_dump_ramp): New.  Prints the ramp function passed to
it, if dumping is enabled.
(coro_maybe_dump_transformed_functions): New.
(cp_coroutine_transform::apply_transforms): Initialize the
lang-coro dump.  Call coro_maybe_dump_initial_function on the
original function, as well as coro_maybe_dump_ramp, after the
transformation into the ramp is finished.
(cp_coroutine_transform::finish_transforms): Call
coro_maybe_dump_transformed_functions on the built actor and
destroy.
* cp-objcp-common.cc (cp_register_dumps): Register the coroutine
dump.
* cp-tree.h (coro_dump_id): Declare as extern.
* cxx-pretty-print.cc (pp_cxx_template_parameter): Don't call
TREE_TYPE on a TREE_LIST cell.
(cxx_pretty_printer::declaration): Handle FIELD_DECL similar to
VAR_DECL.

gcc/ChangeLog:

* dumpfile.cc (FIRST_ME_AUTO_NUMBERED_DUMP): Bump to 6 for sake
of the coroutine dump.

c++: P2865R5, Remove Deprecated Array Comparisons from C++26 [PR117788]

This patch implements P2865R5 by promoting the warning to permerror in
C++26 only.

In C++20 we should warn even without -Wall.  Jason fixed this in r15-5713
but let's add a test that doesn't use -Wall.

This caused a FAIL in conditionally_borrowed.cc because we end up
comparing two array types in equality_comparable_with ->
__weakly_eq_cmp_with.  That could be fixed in libstdc++, perhaps by
adding std::decay in the appropriate place.
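
For user code affected by the new permerror, a minimal sketch of the two
usual rewrites follows.  The array and function names are made up for this
example; which rewrite is right depends on whether the code meant to compare
addresses or contents.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>

int lhs[3] = {1, 2, 3};
int rhs[3] = {1, 2, 3};

// The comparison this patch diagnoses; it compares addresses, not
// contents, and is left commented out so the example builds as C++26:
//   bool same = (lhs == rhs);

// If an address comparison was intended, spell it out explicitly.
inline bool same_array ()
{
  return &lhs[0] == &rhs[0];
}

// If a content comparison was intended, compare the elements.
inline bool same_contents ()
{
  return std::equal (std::begin (lhs), std::end (lhs), std::begin (rhs));
}

// Hand-checked expectations: distinct objects with equal elements.
inline void check_array_compare ()
{
  assert (!same_array ());
  assert (same_contents ());
}
```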

PR c++/117788

gcc/c-family/ChangeLog:

* c-warn.cc (do_warn_array_compare): Emit a permerror in C++26.

gcc/cp/ChangeLog:

* typeck.cc (cp_build_binary_op) <case EQ_EXPR>: Don't check
warn_array_compare.  Check tf_warning_or_error instead of just
tf_warning.  Maybe return an error_mark_node in C++26.
<case LE_EXPR>: Likewise.

gcc/testsuite/ChangeLog:

* c-c++-common/Warray-compare-1.c: Expect an error in C++26.
* c-c++-common/Warray-compare-3.c: Likewise.
* c-c++-common/Warray-compare-4.c: New test.
* c-c++-common/Warray-compare-5.c: New test.
* g++.dg/warn/Warray-compare-1.C: New test.

libstdc++-v3/ChangeLog:

* testsuite/std/ranges/adaptors/conditionally_borrowed.cc: Add a
FIXME, adjust.

Reviewed-by: Jason Merrill <jason@redhat.com>

plugin/plugin-gcn.c: Fix error handling of GOMP_OFFLOAD_openacc_async_construct

Follow-up to r15-5392-g884637b6362391. As the name implies,
GOMP_OFFLOAD_openacc_async_construct is also called externally.
Hence, partially revert the previous commit so that it no longer fails
fatally, which permits the unlock handling in oacc-async.c's
lookup_goacc_asyncqueue.

Consequently, the other (indirect) callers had to be updated as well:
GOMP_OFFLOAD_dev2dev now fails by returning 'false' and
GOMP_OFFLOAD_async_run fails fatally.

libgomp/ChangeLog:

* plugin/plugin-gcn.c (GOMP_OFFLOAD_dev2dev, GOMP_OFFLOAD_async_run):
Handle omp_async_queue == NULL after call to maybe_init_omp_async.
(GOMP_OFFLOAD_openacc_async_construct): Use error not fatal error,
partially reverting r15-5392.

testsuite/gcc.dg/tree-ssa/pr117973-1.c: New test

PR117973 covers the aspect of
non-LOGICAL_OP_NON_SHORT_CIRCUIT targets for PR111456, for
which the test-case gcc.dg/tree-ssa/pr111456-1.c started
failing as described in PR117954.

* gcc.dg/tree-ssa/pr117973-1.c: New test.

testsuite: Fix cpp0x/trivial1.C for std::is_trivial deprecation in C++26

std::is_trivial is deprecated in C++26, so this test needs to use
-Wno-deprecated now.
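
The test itself only silences the warning.  For code that wants to avoid
the deprecated trait, a commonly suggested replacement is to combine the
two properties that make a type trivial.  The sketch below assumes that
replacement; the trait name is_trivial_like and both example types are
hypothetical.

```cpp
#include <type_traits>

// Hypothetical stand-in for std::is_trivial, built from the two traits
// that are not deprecated.
template <typename T>
struct is_trivial_like
  : std::integral_constant<bool,
                           std::is_trivially_default_constructible<T>::value
                           && std::is_trivially_copyable<T>::value>
{
};

struct Plain { int x; };                   // trivial
struct UserCtor { UserCtor () {} int x; }; // user-provided default ctor

static_assert (is_trivial_like<Plain>::value, "Plain is trivial");
static_assert (!is_trivial_like<UserCtor>::value, "UserCtor is not");
```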

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/trivial1.C: Add -Wno-deprecated for C++26.

testsuite: Mark gcc.c-torture/execute/memcpy-a?.c tests expensive

These tests can take several seconds per compilation to complete, taking
total elapsed time measured in minutes. Mark them as expensive so as to
let people skip them where they want to save on testing time.

gcc/testsuite/
* gcc.c-torture/execute/memcpy-a1.c: Mark as expensive.
* gcc.c-torture/execute/memcpy-a2.c: Likewise.
* gcc.c-torture/execute/memcpy-a4.c: Likewise.
* gcc.c-torture/execute/memcpy-a8.c: Likewise.

Remove vcond{,u,eq} optabs

This patch removes the remaining traces of the vcond{,u,eq} optabs.
Earlier patches removed the target-independent uses and I couldn't
find any direct references to either the *_optabs or the ifns
in target-specific code.

gcc/
* doc/md.texi (vcond@var{m}@var{n}, vcondu@var{m}@var{n})
(vcondeq@var{m}@var{n}): Delete.
(vcond_mask_@var{m}@var{n}): Redocument in standalone form.
* internal-fn.def (VCOND, VCONDU, VCONDEQ): Delete.
* internal-fn.cc (expand_vec_cond_optab_fn): Delete.
* optabs.def (vcond_optab, vcondu_optab, vcondeq_optab): Delete.

aarch64: Remove vcond{,u} optabs

Prompted by Richard E's arm patch, this one removes the aarch64
support for the vcond{,u} optabs.

gcc/
* config/aarch64/aarch64-protos.h (aarch64_expand_sve_vcond): Delete.
* config/aarch64/aarch64-simd.md (<su><maxmin>v2di3): Expand into
separate vec_cmp and vcond_mask instructions, instead of using vcond.
(vcond<mode><mode>, vcond<v_cmp_mixed><mode>, vcondu<mode><mode>)
(vcondu<mode><v_cmp_mixed>): Delete.
* config/aarch64/aarch64-sve.md (vcond<SVE_ALL:mode><SVE_I:mode>)
(vcondu<SVE_ALL:mode><SVE_I:mode>, vcond<mode><v_fp_equiv>): Likewise.
* config/aarch64/aarch64.cc (aarch64_expand_sve_vcond): Likewise.
* config/aarch64/iterators.md (V_FP_EQUIV, v_fp_equiv, V_cmp_mixed)
(v_cmp_mixed): Likewise.

aarch64: Add support for fp8fma instructions

The AArch64 FEAT_FP8FMA extension introduces instructions for
multiply-add of vectors.

This patch introduces the following intrinsics:
1. {vmlalbq|vmlaltq}_f16_mf8_fpm.
2. {vmlalbq|vmlaltq}_lane{q}_f16_mf8_fpm.
3. {vmlallbbq|vmlallbtq|vmlalltbq|vmlallttq}_f32_mf8_fpm.
4. {vmlallbbq|vmlallbtq|vmlalltbq|vmlallttq}_lane{q}_f32_mf8_fpm.

gcc/ChangeLog:

* config/aarch64/aarch64-builtins.cc
(aarch64_pragma_builtins_checker::require_immediate_lane_index): New
overload.
(aarch64_pragma_builtins_checker::check): Add support for FP8FMA
intrinsics.
(aarch64_expand_pragma_builtins): Likewise.
* config/aarch64/aarch64-c.cc
(aarch64_update_cpp_builtins): Conditionally define TARGET_FP8FMA.
* config/aarch64/aarch64-simd-pragma-builtins.def: Add the FP8FMA
intrinsics.
* config/aarch64/aarch64-simd.md:
(@aarch64_<FMLAL_FP8_HF:insn><mode>): New pattern.
(@aarch64_<FMLAL_FP8_HF:insn>_lane<V8HF_ONLY:mode><VB:mode>):
Likewise.
(@aarch64_<FMLALL_FP8_SF:insn><mode>): Likewise.
(@aarch64_<FMLALL_FP8_SF:insn>_lane<V8HF_ONLY:mode><VB:mode>):
Likewise.
* config/aarch64/iterators.md (V8HF_ONLY): New mode iterator.
(SVE2_FP8_TERNARY_VNX8HF): Rename to...
(FMLAL_FP8_HF): ...this.
(SVE2_FP8_TERNARY_LANE_VNX8HF): Delete in favor of FMLAL_FP8_HF.
(SVE2_FP8_TERNARY_VNX4SF): Rename to...
(FMLALL_FP8_SF): ...this.
(SVE2_FP8_TERNARY_LANE_VNX4SF): Delete in favor of FMLALL_FP8_SF.
(sve2_fp8_fma_op_vnx8hf, sve2_fp8_fma_op_vnx4sf): Fold into...
(insn): ...here.
* config/aarch64/aarch64-sve2.md: Update uses accordingly.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/pragma_cpp_predefs_4.c: Test TARGET_FP8FMA.
* gcc.target/aarch64/simd/vmla_fpm.c: New test.
* gcc.target/aarch64/simd/vmla_lane_indices_1.c: Likewise.

Co-authored-by: Richard Sandiford <richard.sandiford@arm.com>

aarch64: Add support for fp8dot2 and fp8dot4

The AArch64 FEAT_FP8DOT2 and FEAT_FP8DOT4 extensions introduce
instructions for dot product of vectors.

This patch introduces the following intrinsics:
1. vdot{q}_{fp16|fp32}_mf8_fpm.
2. vdot{q}_lane{q}_{fp16|fp32}_mf8_fpm.

We added a new aarch64_builtin_signature variant, ternary_lane, and added
support for it in the functions aarch64_fntype and
aarch64_expand_pragma_builtin.

gcc/ChangeLog:

* config/aarch64/aarch64-builtins.cc
(enum class): Add ternary_lane.
(aarch64_fntype): Handle ternary_lane.
(aarch64_pragma_builtins_checker::require_immediate_lane_index): New
function.
(aarch64_pragma_builtins_checker::check): Handle the new intrinsics.
(aarch64_expand_pragma_builtin): Likewise.
* config/aarch64/aarch64-c.cc
(aarch64_update_cpp_builtins): Define TARGET_FP8DOT2 and
TARGET_FP8DOT4.
* config/aarch64/aarch64-simd-pragma-builtins.def: Define vdot
and vdot_lane intrinsics.
* config/aarch64/aarch64-simd.md
(@aarch64_<fpm_uns_op><mode>): New pattern.
(@aarch64_<fpm_uns_op>_lane<VQ_HSF_VDOT:mode><VB:mode>): Likewise.
* config/aarch64/iterators.md (VQ_HSF_VDOT): New mode iterator.
(UNSPEC_VDOT, UNSPEC_VDOT_LANE): New unspecs.
(fpm_uns_op): Handle them.
(VNARROWB, Vnbtype): New mode attributes.
(FPM_VDOT, FPM_VDOT_LANE): New int iterators.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/pragma_cpp_predefs_4.c: Test fp8dot2 and fp8dot4.
* gcc.target/aarch64/simd/vdot2_fpm.c: New test.
* gcc.target/aarch64/simd/vdot4_fpm.c: New test.
* gcc.target/aarch64/simd/vdot_lane_indices_1.c: New test.

Co-authored-by: Richard Sandiford <richard.sandiford@arm.com>

aarch64: Add support for fp8 convert and scale

The AArch64 FEAT_FP8 extension introduces instructions for conversion
and scaling.

This patch introduces the following intrinsics:
1. vcvt{1|2}_{bf16|high_bf16|low_bf16}_mf8_fpm.
2. vcvt{q}_mf8_f16_fpm.
3. vcvt_{high}_mf8_f32_fpm.
4. vscale{q}_{f16|f32|f64}.

We introduced two aarch64_builtin_signatures enum variants, unary and
ternary, and added support for these variants in the functions
aarch64_fntype and aarch64_expand_pragma_builtin.

We added new simd_types for integers (s32, s32q, and s64q) and for
floating points (f8 and f8q).

Because we added support for fp8 intrinsics here, we modified the check
in acle/fp8.c that verified that the __ARM_FEATURE_FP8 macro is not
defined.

gcc/ChangeLog:

* config/aarch64/aarch64-builtins.cc
(FLAG_USES_FPMR, FLAG_FP8): New flags.
(ENTRY): Modified to support ternary operations.
(enum class): New variants to support new signatures.
(struct aarch64_pragma_builtins_data): Extend types to 4 elements.
(aarch64_fntype): Handle new signatures.
(aarch64_get_low_unspec): New function.
(aarch64_convert_to_v64): New function, split out from...
(aarch64_expand_pragma_builtin): ...here. Handle new signatures.
* config/aarch64/aarch64-c.cc
(aarch64_update_cpp_builtins): New flag for FP8.
* config/aarch64/aarch64-simd-pragma-builtins.def: Define new fp8
intrinsics.
(ENTRY_BINARY, ENTRY_BINARY_LANE): Update for new ENTRY interface.
(ENTRY_UNARY, ENTRY_TERNARY, ENTRY_UNARY_FPM): New macros.
(ENTRY_BINARY_VHSDF_SIGNED): Likewise.
* config/aarch64/aarch64-simd.md
(@aarch64_<fpm_uns_op><mode>): New pattern.
(@aarch64_<fpm_uns_op><mode>_high): Likewise.
(@aarch64_<fpm_uns_op><mode>_high_be): Likewise.
(@aarch64_<fpm_uns_op><mode>_high_le): Likewise.
* config/aarch64/iterators.md (V4SF_ONLY, VQ_BHF): New mode iterators.
(UNSPEC_FCVTN_FP8, UNSPEC_FCVTN2_FP8, UNSPEC_F1CVTL_FP8)
(UNSPEC_F1CVTL2_FP8, UNSPEC_F2CVTL_FP8, UNSPEC_F2CVTL2_FP8)
(UNSPEC_FSCALE): New unspecs.
(VPACKB, VPACKBtype): New mode attributes.
(b): Add support for V[48][BH]F.
(FPM_UNARY_UNS, FPM_BINARY_UNS, SCALE_UNS): New int iterators.
(insn): New int attribute.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/acle/fp8.c: Remove check that fp8 feature
macro doesn't exist and...
* gcc.target/aarch64/pragma_cpp_predefs_4.c: ...test that it does here.
* gcc.target/aarch64/simd/scale_fpm.c: New test.
* gcc.target/aarch64/simd/vcvt_fpm.c: New test.

Co-authored-by: Richard Sandiford <richard.sandiford@arm.com>

libstdc++: Revert change to __bitwise_relocatable

This reverts r15-6060-ge4a0157c2397c9 so that __is_bitwise_relocatable
depends only on is_trivial. To avoid the deprecation warnings for C++26,
use the __is_trivial built-in directly instead of std::is_trivial.

We need to be sure that the type is trivially copyable, not just
trivially constructible and trivially assignable. Otherwise we get
-Wclass-memaccess diagnostics for e.g. std::vector<std::pair<A*, B*>>.
We could add is_trivially_copyable to the conditions, but this isn't
really an appropriate change for stage 3 anyway (it affects all modes
from C++11 upwards). Just revert to using is_trivial, and we can revisit
the condition for GCC 16.
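
To see why "trivially constructible and trivially assignable" is weaker
than "trivially copyable", consider the following sketch.  The type is
made up for illustration, and the exact set of traits tested by the
reverted change is not shown in this log, so the three static assertions
only demonstrate the general gap between the properties.

```cpp
#include <type_traits>

// Hypothetical type: trivial default constructor and trivial copy
// assignment, but a user-provided (hence non-trivial) copy constructor.
struct UserCopy
{
  UserCopy () = default;
  UserCopy (const UserCopy &) {}
  UserCopy &operator= (const UserCopy &) = default;
};

static_assert (std::is_trivially_default_constructible<UserCopy>::value,
               "default construction is trivial");
static_assert (std::is_trivially_copy_assignable<UserCopy>::value,
               "copy assignment is trivial");
// Relocating such a type with memcpy would bypass its copy constructor.
static_assert (!std::is_trivially_copyable<UserCopy>::value,
               "but the type is not trivially copyable");
```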

libstdc++-v3/ChangeLog:

* include/bits/stl_uninitialized.h (__is_bitwise_relocatable):
Revert to depending on is_trivial.

tree-optimization/117912 - bogus address equivalences for __builtin_object_size

VN again is the culprit for exploiting address equivalences before
__builtin_object_size got the chance to do its job. This time
it isn't about union members but adjacent structure fields where
an address to one after the last element of an array field can
spill over to the next field.

The following protects all out-of-bound accesses on the upper bound
side (singling out TYPE_MAX_VALUE + 1 is more expensive). It
ignores other out-of-bound addresses that would invoke UB.

Zero-sized arrays are a bit awkward because the C++ front end represents
them with a -1U upper bound.

There's a similar issue for zero-sized components whose address can
be the same as the adjacent field in C.
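
A minimal sketch of the adjacent-field situation described above.  The
struct and function names are invented for this example and are not taken
from the new testcases.  The two addresses are numerically equal, yet they
belong to different subobjects, so the subobject-mode (type 1) size queries
must not be merged before the object size pass has run.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Two adjacent array fields (hypothetical layout for illustration).
struct Fields
{
  char a[4];
  char b[4];
};

// The address one past the last element of `a` is numerically the
// address of `b[0]`, assuming no padding between the fields.
inline bool one_past_a_is_b (const Fields &f)
{
  return reinterpret_cast<std::uintptr_t> (f.a + 4)
         == reinterpret_cast<std::uintptr_t> (&f.b[0]);
}

// Subobject-mode queries on the two addresses.  No expected values are
// asserted here because they depend on compiler and optimization level.
inline std::size_t bytes_after_a (const Fields &f)
{
  return __builtin_object_size (f.a + 4, 1);
}

inline std::size_t bytes_in_b (const Fields &f)
{
  return __builtin_object_size (&f.b[0], 1);
}

// Layout check with hand-computed values.
inline void check_fields ()
{
  Fields f = {};
  assert (offsetof (Fields, b) == sizeof f.a);
  assert (one_past_a_is_b (f));
}
```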

PR tree-optimization/117912
* tree-ssa-sccvn.cc (copy_reference_ops_from_ref): For addresses
of zero-sized components do not set ->off if the object size pass
didn't run.
For OOB ARRAY_REF accesses in address expressions avoid setting
->off if the object size pass didn't run.
(valueize_refs_1): Likewise.

* c-c++-common/torture/pr117912-1.c: New testcase.
* c-c++-common/torture/pr117912-2.c: Likewise.
* c-c++-common/torture/pr117912-3.c: Likewise.

testsuite/gcc.dg/tree-ssa/pr111456-1.c: Handle fallout

This is expected fallout from r15-5646-gd1cf0d7a0f27fd as
described by that commit. The =0 case is covered by
PR117973.

PR tree-optimization/117954
* gcc.dg/tree-ssa/pr111456-1.c: Pass
--param=logical-op-non-short-circuit=1.

aarch64: Fix ICE happening in SET_TYPE_VECTOR_SUBPARTS with libgccjit

The structure aarch64_simd_type_info was split in 2 because we do not
want to reset the static members of aarch64_simd_type_info to their
default value. We only want the tree types to be GC-ed. This is
necessary for libgccjit which can run multiple times in the same
process. If the static values were GC-ed, the second run would
ICE/segfault because of their invalid value.

The following test suites passed for this patch:

* The aarch64 tests.
* The aarch64 regression tests.

The number of failures of the jit tests on aarch64 dropped from 100+ to
~7.

gcc/ChangeLog:
PR target/117923
* config/aarch64/aarch64-builtins.cc: Remove GTY marker on aarch64_simd_types,
aarch64_simd_types_trees (new variable), rename aarch64_simd_types to
aarch64_simd_types_trees.
* config/aarch64/aarch64-builtins.h: Remove GTY marker on aarch64_simd_types,
aarch64_simd_types_trees (new variable).
* config/aarch64/aarch64-sve-builtins-shapes.cc: Rename aarch64_simd_types to
aarch64_simd_types_trees.
* config/aarch64/aarch64-sve-builtins.cc: Rename aarch64_simd_types to
aarch64_simd_types_trees.

RISC-V: Refine signed vector SAT_SUB testcase dump check to tree optimized

The sat ALU related test cases check the RTL dump for the presence of
standard names such as .SAT_SUB.  But the RTL expand pass dump is
sensitive to middle-end changes and debug information.  For example,
the following appeared recently:

Replacing Expressions
_5 replace with --> _5 = .SAT_SUB (x_3(D), y_4(D)); [tail call]

Each time this happens, the expected match counts in the dump check
have to be adjusted again.  This patch switches the standard name
check to the tree optimized pass dump, which is more stable up to a
point.

The following test suites passed for this patch:
* The rv64gcv full regression test.

This is a test-only patch and fairly obvious; I will commit it
directly if there are no comments in the next 48 hours.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_sub-1-i16.c: Take
tree-optimized pass for standard name check, and adjust the times.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_sub-1-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_sub-1-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_sub-1-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_sub-2-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_sub-2-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_sub-2-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_sub-2-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_sub-3-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_sub-3-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_sub-3-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_sub-3-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_sub-4-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_sub-4-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_sub-4-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_sub-4-i8.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Refine signed vector SAT_TRUNC testcase dump check to tree optimized

The sat ALU related test cases check the RTL dump for the presence of
standard names such as .SAT_TRUNC.  But the RTL expand pass dump is
sensitive to middle-end changes and debug information.  For example,
the following appeared recently:

Replacing Expressions
_5 replace with --> _5 = .SAT_TRUNC (x_3(D), y_4(D)); [tail call]

Each time this happens, the expected match counts in the dump check
have to be adjusted again.  This patch switches the standard name
check to the tree optimized pass dump, which is more stable up to a
point.

The following test suites passed for this patch:
* The rv64gcv full regression test.

This is a test-only patch and fairly obvious; I will commit it
directly if there are no comments in the next 48 hours.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-1-i16-to-i8.c: Take
tree-optimized pass for standard name check, and adjust the times.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-1-i32-to-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-1-i32-to-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-1-i64-to-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-1-i64-to-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-1-i64-to-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-2-i16-to-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-2-i32-to-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-2-i32-to-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-2-i64-to-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-2-i64-to-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-2-i64-to-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-3-i16-to-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-3-i32-to-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-3-i32-to-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-3-i64-to-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-3-i64-to-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-3-i64-to-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-4-i16-to-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-4-i32-to-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-4-i32-to-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-4-i64-to-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-4-i64-to-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-4-i64-to-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-5-i16-to-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-5-i32-to-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-5-i32-to-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-5-i64-to-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-5-i64-to-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-5-i64-to-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-6-i16-to-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-6-i32-to-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-6-i32-to-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-6-i64-to-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-6-i64-to-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-6-i64-to-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-7-i16-to-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-7-i32-to-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-7-i32-to-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-7-i64-to-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-7-i64-to-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-7-i64-to-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-8-i16-to-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-8-i32-to-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-8-i32-to-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-8-i64-to-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-8-i64-to-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_trunc-8-i64-to-i8.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Refine signed vector SAT_ADD testcase dump check to tree optimized

The sat ALU related test cases check the RTL dump for the presence of
standard names such as .SAT_ADD.  But the RTL expand pass dump is
sensitive to middle-end changes and debug information.  For example,
the following appeared recently:

Replacing Expressions
_5 replace with --> _5 = .SAT_ADD (x_3(D), y_4(D)); [tail call]

Each time this happens, the expected match counts in the dump check
have to be adjusted again.  This patch switches the standard name
check to the tree optimized pass dump, which is more stable up to a
point.

The following test suites passed for this patch:
* The rv64gcv full regression test.

This is a test-only patch and fairly obvious; I will commit it
directly if there are no comments in the next 48 hours.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add-1-s16.c: Take
tree-optimized pass for standard name check, and adjust the times.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add-1-s32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add-1-s64.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add-1-s8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add-2-s16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add-2-s32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add-2-s64.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add-2-s8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add-3-s16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add-3-s32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add-3-s64.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add-3-s8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add-4-s16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add-4-s32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add-4-s64.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add-4-s8.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Refine unsigned vector SAT_TRUNC testcase dump check to tree optimized

The sat ALU related test cases check the RTL dump for the presence of
standard names such as .SAT_TRUNC.  But the RTL expand pass dump is
sensitive to middle-end changes and debug information.  For example,
the following appeared recently:

Replacing Expressions
_5 replace with --> _5 = .SAT_TRUNC (x_3(D), y_4(D)); [tail call]

Each time this happens, the expected match counts in the dump check
have to be adjusted again.  This patch switches the standard name
check to the tree optimized pass dump, which is more stable up to a
point.
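
For readers unfamiliar with the operation being scanned for, here is a
sketch of an unsigned saturating truncation loop of the kind the
vectorizer is expected to turn into .SAT_TRUNC.  The function name and the
exact source form are assumptions; the actual test sources are not part of
this log.  The result can be inspected in the dump produced by
-fdump-tree-optimized.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical loop: narrow each element from WT to NT, clamping values
// that do not fit to the maximum of the narrow type.
template <typename NT, typename WT>
void vec_sat_u_trunc (NT *out, const WT *in, std::size_t n)
{
  const WT max = std::numeric_limits<NT>::max ();
  for (std::size_t i = 0; i < n; ++i)
    out[i] = in[i] > max ? static_cast<NT> (max) : static_cast<NT> (in[i]);
}

// Hand-computed check: 70000 does not fit in 16 bits, 1234 does.
inline void check_sat_u_trunc ()
{
  const std::uint32_t in[2] = {70000u, 1234u};
  std::uint16_t out[2] = {0, 0};
  vec_sat_u_trunc (out, in, 2);
  assert (out[0] == 65535 && out[1] == 1234);
}
```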

The following test suites passed for this patch:
* The rv64gcv full regression test.

This is a test-only patch and fairly obvious; I will commit it
directly if there are no comments in the next 48 hours.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_trunc-1-u16.c: Take
tree-optimized pass for standard name check, and adjust the times.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_trunc-1-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_trunc-1-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_trunc-1-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_trunc-2-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_trunc-2-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_trunc-2-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_trunc-2-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_trunc-3-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_trunc-3-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_trunc-3-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_trunc-3-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_trunc-4-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_trunc-4-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_trunc-4-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_trunc-4-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_trunc-5-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_trunc-5-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_trunc-5-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_trunc-5-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_trunc-6-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_trunc-6-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_trunc-6-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_trunc-6-u8.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Refine unsigned vector SAT_SUB testcase dump check to tree optimized

The sat ALU related test cases check the RTL dump for the presence of
standard names such as .SAT_SUB.  But the RTL expand pass dump is
sensitive to middle-end changes and debug information.  For example,
the following appeared recently:

Replacing Expressions
_5 replace with --> _5 = .SAT_SUB (x_3(D), y_4(D)); [tail call]

Each time this happens, the expected match counts in the dump check
have to be adjusted again.  This patch switches the standard name
check to the tree optimized pass dump, which is more stable up to a
point.
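
As an illustration of what the scanned name stands for, here is a sketch of
a branchless unsigned saturating subtraction loop of the kind that is
expected to be recognized as .SAT_SUB.  The function name and this
particular source form are assumptions, not a copy of the test sources.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical loop: out[i] = x[i] - y[i], clamped to 0 on underflow.
template <typename T>
void vec_sat_u_sub (T *out, const T *x, const T *y, std::size_t n)
{
  for (std::size_t i = 0; i < n; ++i)
    {
      // All-ones when no underflow occurs, zero otherwise.
      const T mask = static_cast<T> (-static_cast<T> (x[i] >= y[i]));
      out[i] = static_cast<T> ((x[i] - y[i]) & mask);
    }
}

// Hand-computed check: 10 - 3 = 7, and 5 - 9 saturates to 0.
inline void check_sat_u_sub ()
{
  const std::uint8_t x[2] = {10, 5}, y[2] = {3, 9};
  std::uint8_t out[2] = {1, 1};
  vec_sat_u_sub (out, x, y, 2);
  assert (out[0] == 7 && out[1] == 0);
}
```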

The following test suites passed for this patch:
* The rv64gcv full regression test.

This is a test-only patch and fairly obvious; I will commit it
directly if there are no comments in the next 48 hours.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-1-u16.c: Take
tree-optimized pass for standard name check, and adjust the times.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-1-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-1-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-1-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-10-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-10-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-10-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-10-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-2-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-2-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-2-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-2-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-3-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-3-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-3-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-3-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-4-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-4-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-4-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-4-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-5-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-5-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-5-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-5-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-6-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-6-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-6-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-6-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-7-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-7-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-7-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-7-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-8-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-8-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-8-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-8-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-9-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-9-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-9-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-9-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub_imm-1-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub_imm-1-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub_imm-1-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub_imm-1-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub_trunc-1-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub_trunc-1-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub_trunc-1-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub_zip.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Refine unsigned vector SAT_ADD testcase dump check to tree optimized

The sat alu related testcase check the rtl dump for the standard name
like .SAT_ADD exist or not.  But the rtl pass expand is somehow
impressionable by the middle-end change or debug information.  Like
below new appearance recently.

Replacing Expressions
_5 replace with --> _5 = .SAT_ADD (x_3(D), y_4(D)); [tail call]

As a result the dump checks need adjusting again and again.  This
patch switches the standard name check to the tree optimized pass,
which is more stable up to a point.

The test suites below pass with this patch.
* The rv64gcv full regression test.

It is a test-only patch and obvious up to a point; I will commit it
directly if there are no comments in the next 48H.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add-1-u16.c: Take
tree-optimized pass for standard name check, and adjust the times.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add-1-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add-1-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add-1-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add-2-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add-2-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add-2-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add-2-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add-3-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add-3-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add-3-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add-3-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add-4-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add-4-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add-4-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add-4-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add-5-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add-5-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add-5-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add-5-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add-6-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add-6-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add-6-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add-6-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add-7-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add-7-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add-7-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add-7-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add-8-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add-8-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add-8-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add-8-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add_imm-1-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add_imm-1-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add_imm-1-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add_imm-1-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add_imm-2-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add_imm-2-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add_imm-2-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add_imm-2-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add_imm-3-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add_imm-3-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add_imm-3-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add_imm-3-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add_imm-4-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add_imm-4-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add_imm-4-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add_imm-4-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add_imm_reconcile-1-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add_imm_reconcile-1-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add_imm_reconcile-1-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add_imm_reconcile-1-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add_imm_reconcile-2-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add_imm_reconcile-2-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add_imm_reconcile-2-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add_imm_reconcile-2-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add_imm_reconcile-3-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add_imm_reconcile-3-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add_imm_reconcile-3-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add_imm_reconcile-3-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add_imm_reconcile-4-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add_imm_reconcile-4-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_add_imm_reconcile-4-u8.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>

libstdc++: deprecate is_trivial for C++26 (P3247R2)

This actually implements P3247R2 by deprecating the is_trivial type
trait.

libstdc++-v3/ChangeLog:

* include/std/type_traits: Deprecate is_trivial and
is_trivial_v.
* include/experimental/type_traits: Suppress the deprecation
warning.
* testsuite/20_util/is_trivial/requirements/explicit_instantiation.cc:
Amend the test to suppress the deprecation warning.
* testsuite/20_util/is_trivial/requirements/typedefs.cc:
Likewise.
* testsuite/20_util/is_trivial/value.cc: Likewise.
* testsuite/20_util/variable_templates_for_traits.cc: Likewise.
* testsuite/experimental/type_traits/value.cc: Likewise.
* testsuite/18_support/max_align_t/requirements/2.cc: Update the
test with P3247R2's new wording.

Signed-off-by: Giuseppe D'Angelo <giuseppe.dangelo@kdab.com>

libstdc++: port tests away from is_trivial

In preparation for the deprecation of is_trivial (P3247R2).
Mostly a mechanical exercise, replacing is_trivial with
is_trivially_copyable and/or is_trivially_default_constructible
depending on the cases.

libstdc++-v3/ChangeLog:

* testsuite/20_util/specialized_algorithms/uninitialized_copy/102064.cc:
Port away from is_trivial.
* testsuite/20_util/specialized_algorithms/uninitialized_copy_n/102064.cc:
Likewise.
* testsuite/20_util/specialized_algorithms/uninitialized_default/94540.cc:
Likewise.
* testsuite/20_util/specialized_algorithms/uninitialized_default_n/94540.cc:
Likewise.
* testsuite/20_util/specialized_algorithms/uninitialized_fill/102064.cc:
Likewise.
* testsuite/20_util/specialized_algorithms/uninitialized_fill_n/102064.cc:
Likewise.
* testsuite/20_util/specialized_algorithms/uninitialized_value_construct/94540.cc:
Likewise.
* testsuite/20_util/specialized_algorithms/uninitialized_value_construct_n/94540.cc:
Likewise.
* testsuite/23_containers/vector/cons/94540.cc: Likewise.
* testsuite/25_algorithms/copy/move_iterators/69478.cc:
Likewise.
* testsuite/25_algorithms/copy_backward/move_iterators/69478.cc:
Likewise.
* testsuite/25_algorithms/move/69478.cc: Likewise.
* testsuite/25_algorithms/move_backward/69478.cc: Likewise.
* testsuite/25_algorithms/rotate/constrained.cc: Likewise.
* testsuite/25_algorithms/rotate_copy/constrained.cc: Likewise.

Signed-off-by: Giuseppe D'Angelo <giuseppe.dangelo@kdab.com>

libstdc++: port the ranges::uninitialized_* algorithms away from is_trivial

In preparation for the deprecation of is_trivial (P3247R2).
The rangified uninitialized_* specialized memory algorithms have code
paths where they call the non-uninitialized versions, because the latter
are usually optimized. The detection in these code paths uses is_trivial;
port it away from it towards more specific checks.

The detection for the copy/move algorithms was suspicious: it checked
that the output type was trivial, and that assignment from the input
range reference type was nothrow. If so, the algorithm would copy/move
assign (by calling the ranges::copy/move algorithms) instead of
constructing elements. I think this is off because:

1) the constructor that would be called by the algorithm (which may be
   neither a copy nor a move constructor) wasn't checked. If that
   constructor isn't trivial the caller might detect that we're not
   calling it, and that goes against the algorithms' specifications.
2) a nothrow assignment is necessary but not sufficient, as again we
   need to check for triviality, or the caller can detect we're calling
   an assignment operator we were never meant to be calling from these
   algorithms.

Therefore I've amended the respective detections.

libstdc++-v3/ChangeLog:

* include/bits/ranges_uninitialized.h: port some if
constexpr away from is_trivial, and towards more specific
detections instead.

Signed-off-by: Giuseppe D'Angelo <giuseppe.dangelo@kdab.com>

libstdc++: port bitwise relocatable away from is_trivial

In preparation for the deprecation of is_trivial (P3247R2).
"bitwise relocation" (or "trivial relocation" à la P1144/P2786)
doesn't need the full-fledged notion of triviality, just checking for a
trivial move constructor and a trivial destructor is sufficient.
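
A minimal sketch of the narrower check described above.  The name
can_relocate_bitwise_v and the sample types are hypothetical; the real
__is_bitwise_relocatable trait is internal to libstdc++ and may differ
in detail.

```cpp
#include <type_traits>

// Hypothetical trait: a trivial move constructor plus a trivial
// destructor, with no requirement on default construction.
template <typename T>
inline constexpr bool can_relocate_bitwise_v =
    std::is_trivially_move_constructible_v<T> &&
    std::is_trivially_destructible_v<T>;

// No trivial default constructor, yet moving and destroying it are
// both trivial.
struct NoDefault {
    NoDefault(int v) : value(v) {}
    int value;
};

// A user-provided destructor rules out bitwise relocation.
struct OwnsResource {
    ~OwnsResource() {}
};

static_assert(!std::is_trivially_default_constructible_v<NoDefault>);
static_assert(can_relocate_bitwise_v<NoDefault>);
static_assert(!can_relocate_bitwise_v<OwnsResource>);
```

NoDefault shows why the full notion of triviality is stronger than
needed: it lacks a trivial default constructor, but still satisfies the
narrower check.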

libstdc++-v3/ChangeLog:

* include/bits/stl_uninitialized.h: Amended the
__is_bitwise_relocatable type trait.

Signed-off-by: Giuseppe D'Angelo <giuseppe.dangelo@kdab.com>

libstdc++: pstl: port away from is_trivial

In preparation for the deprecation of is_trivial (P3247R2).
Unfortunately I am unable to fully understand what aspect of triviality
seems to matter for these algorithms, so I just ported is_trivial to its
direct equivalent (trivially copyable + trivially default
constructible.)
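
The "direct equivalent" can be sketched as below.  The helper name
is_trivial_equivalent_v and the sample types are hypothetical and only
illustrate the two checks being combined; this is not the code from the
patch.

```cpp
#include <type_traits>

// Hypothetical helper: the two checks that together stand in for
// std::is_trivial.
template <typename T>
inline constexpr bool is_trivial_equivalent_v =
    std::is_trivially_copyable_v<T> &&
    std::is_trivially_default_constructible_v<T>;

struct Plain { int x; double y; };

// Trivially copyable, but the default member initializer makes the
// default constructor non-trivial.
struct InitializedMember { int x = 42; };

// A user-provided copy constructor makes the type non-trivially
// copyable.
struct CustomCopy {
    CustomCopy() = default;
    CustomCopy(const CustomCopy&) {}
};

static_assert(is_trivial_equivalent_v<Plain>);
static_assert(std::is_trivially_copyable_v<InitializedMember>);
static_assert(!is_trivial_equivalent_v<InitializedMember>);
static_assert(!is_trivial_equivalent_v<CustomCopy>);
```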

libstdc++-v3/ChangeLog:

* include/pstl/algorithm_impl.h (__remove_elements): Port away
from is_trivial.
(__pattern_inplace_merge): Likewise.
* include/pstl/glue_memory_impl.h (uninitialized_copy): Likewise.
(uninitialized_copy_n): Likewise.
(uninitialized_move): Likewise.
(uninitialized_move_n): Likewise.
(uninitialized_default_construct): Likewise.
(uninitialized_default_construct_n): Likewise.
(uninitialized_value_construct): Likewise.
(uninitialized_value_construct_n): Likewise.
* testsuite/20_util/specialized_algorithms/pstl/uninitialized_construct.cc:
Likewise.
* testsuite/20_util/specialized_algorithms/pstl/uninitialized_copy_move.cc:
Likewise.
* testsuite/20_util/specialized_algorithms/pstl/uninitialized_fill_destroy.cc:
Likewise.
* testsuite/25_algorithms/pstl/alg_modifying_operations/partition.cc:
Likewise.

Signed-off-by: Giuseppe D'Angelo <giuseppe.dangelo@kdab.com>

libstdc++: port away from is_trivial in string classes

In preparation for the deprecation of is_trivial (P3247R2), stop using
it from std::string_view. Also, add the same detection to std::string
(described in [strings.general]/2).

libstdc++-v3/ChangeLog:

* include/bits/basic_string.h: Add a static_assert on the
char-like type.
* include/std/string_view: Port away from is_trivial.

Signed-off-by: Giuseppe D'Angelo <giuseppe.dangelo@kdab.com>

Daily bump.

aarch64: Add CRC built-ins test for the target AES.

gcc/testsuite/

* gcc.target/aarch64/crc-builtin-pmul64.c: New test.

Signed-off-by: Mariam Arutunian <mariamarutunian@gmail.com>

aarch64: Implement new expander for efficient CRC computation.

This patch introduces two new expanders for the aarch64 backend,
dedicated to generate optimized code for CRC computations.
The new expanders are designed to leverage specific hardware capabilities
to achieve faster CRC calculations,
particularly using the crc32, crc32c and pmull instructions when supported
by the target architecture.

Expander 1: Bit-Forward CRC (crc<ALLI:mode><ALLX:mode>4)
For targets that support the pmull instruction (TARGET_AES),
the expander will generate code that uses the pmull (crypto_pmulldi)
instruction for CRC computation.

Expander 2: Bit-Reversed CRC (crc_rev<ALLI:mode><ALLX:mode>4)
The expander first checks if the target supports the CRC32* instruction set
(TARGET_CRC32)
and the polynomial in use is 0x1EDC6F41 (iSCSI) or 0x04C11DB7 (HDLC). If
the conditions are met,
it emits calls to the corresponding crc32* instruction (depending on the
data size and the polynomial).
If the target does not support crc32* but supports pmull, it then uses the
pmull (crypto_pmulldi) instruction for bit-reversed CRC computation.
Otherwise table-based CRC is generated.
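
For readers unfamiliar with the term, a bit-reversed CRC can be written
as the generic bit-at-a-time loop below.  This is only a reference
sketch of the computation, not the code the expanders emit, and it
makes no claim about which source loops GCC recognizes.  The function
name and the conventional initial and final XOR values are assumptions
of the example.

```cpp
#include <cstddef>
#include <cstdint>

// Bit-reversed (reflected) CRC-32, processing one bit at a time.
// 0xEDB88320 is the bit-reversed form of the polynomial 0x04C11DB7.
inline std::uint32_t crc32_reflected(const unsigned char* data,
                                     std::size_t len)
{
    std::uint32_t crc = 0xFFFFFFFFu;        // conventional initial value
    for (std::size_t i = 0; i < len; ++i) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; ++bit) {
            // Shift right; fold in the polynomial when the bit shifted
            // out was set.
            if (crc & 1u)
                crc = (crc >> 1) ^ 0xEDB88320u;
            else
                crc >>= 1;
        }
    }
    return crc ^ 0xFFFFFFFFu;               // conventional final XOR
}
```

With these conventions the nine ASCII bytes "123456789" give the
well-known check value 0xCBF43926.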

gcc/

* config/aarch64/aarch64-protos.h (aarch64_expand_crc_using_pmull): New
extern function declaration.
(aarch64_expand_reversed_crc_using_pmull): Likewise.
* config/aarch64/aarch64.cc (aarch64_expand_crc_using_pmull): New
function.
(aarch64_expand_reversed_crc_using_pmull): Likewise.
* config/aarch64/aarch64.md (crc_rev<ALLI:mode><ALLX:mode>4): New
expander for reversed CRC.
(crc<ALLI:mode><ALLX:mode>4): New expander for bit-forward CRC.
* config/aarch64/iterators.md (crc_data_type): New mode attribute.

gcc/testsuite/

* gcc.target/aarch64/crc-1-pmul.c: New test.
* gcc.target/aarch64/crc-10-pmul.c: Likewise.
* gcc.target/aarch64/crc-12-pmul.c: Likewise.
* gcc.target/aarch64/crc-13-pmul.c: Likewise.
* gcc.target/aarch64/crc-14-pmul.c: Likewise.
* gcc.target/aarch64/crc-17-pmul.c: Likewise.
* gcc.target/aarch64/crc-18-pmul.c: Likewise.
* gcc.target/aarch64/crc-21-pmul.c: Likewise.
* gcc.target/aarch64/crc-22-pmul.c: Likewise.
* gcc.target/aarch64/crc-23-pmul.c: Likewise.
* gcc.target/aarch64/crc-4-pmul.c: Likewise.
* gcc.target/aarch64/crc-5-pmul.c: Likewise.
* gcc.target/aarch64/crc-6-pmul.c: Likewise.
* gcc.target/aarch64/crc-7-pmul.c: Likewise.
* gcc.target/aarch64/crc-8-pmul.c: Likewise.
* gcc.target/aarch64/crc-9-pmul.c: Likewise.
* gcc.target/aarch64/crc-CCIT-data16-pmul.c: Likewise.
* gcc.target/aarch64/crc-CCIT-data8-pmul.c: Likewise.
* gcc.target/aarch64/crc-coremark-16bitdata-pmul.c: Likewise.
* gcc.target/aarch64/crc-crc32-data16.c: Likewise.
* gcc.target/aarch64/crc-crc32-data32.c: Likewise.
* gcc.target/aarch64/crc-crc32-data8.c: Likewise.
* gcc.target/aarch64/crc-crc32c-data16.c: Likewise.
* gcc.target/aarch64/crc-crc32c-data32.c: Likewise.
* gcc.target/aarch64/crc-crc32c-data8.c: Likewise.

Signed-off-by: Mariam Arutunian <mariamarutunian@gmail.com>
Co-authored-by: Richard Sandiford <richard.sandiford@arm.com>

driver: fix crash with --diagnostics-plain-output [PR117942]

We are crashing here because decode_cmdline_options_to_array has:

if (!strcmp (opt, "-fdiagnostics-plain-output"))
...

but that doesn't handle the '--FLAG' variant.
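
The shape of the fix can be sketched as a check that accepts both
spellings.  The helper name is hypothetical, and the real change is
made inline in decode_cmdline_options_to_array rather than in a
separate function.

```cpp
#include <cstring>

// Hypothetical helper: true for either spelling of the option.
inline bool is_plain_output_option(const char* opt)
{
    return !std::strcmp(opt, "-fdiagnostics-plain-output")
        || !std::strcmp(opt, "--diagnostics-plain-output");
}
```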

PR driver/117942

gcc/ChangeLog:

* opts-common.cc (decode_cmdline_options_to_array): Also detect
--diagnostics-plain-output.

Reviewed-by: Joseph Myers <josmyers@redhat.com>

Fortran: fix two minor front-end GMP memleaks

gcc/fortran/ChangeLog:

* expr.cc (find_array_section): Do not initialize GMP variables
twice.

c++: compile time evaluation of prvalues [PR116416]

This PR reports a missed optimization.  When we have:

  Str str{"Test"};
  callback(str);

as in the test, we're able to evaluate the Str::Str() call at compile
time.  But when we have:

  callback(Str{"Test"});

we are not.  With this patch (in fact, it's Patrick's patch with a little
tweak), we turn

  callback (TARGET_EXPR <D.2890, <<< Unknown tree: aggr_init_expr
    5
    __ct_comp
    D.2890
    (struct Str *) <<< Unknown tree: void_cst >>>
    (const char *) "Test" >>>>)

into

  callback (TARGET_EXPR <D.2890, {.str=(const char *) "Test", .length=4}>)
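
For reference, a self-contained sketch of the kind of source involved
might look as follows.  The definitions of Str and callback are
assumptions reconstructed from the folded initializer shown above, not
the actual testcase.

```cpp
#include <cstddef>

// Assumed shape of the class: a constexpr constructor that records the
// pointer and computes the length.
struct Str {
    const char* str;
    std::size_t length;

    constexpr Str(const char* s) : str(s), length(0)
    {
        while (s[length] != '\0')
            ++length;
    }
};

// Hypothetical consumer of the temporary.
inline std::size_t callback(const Str& s)
{
    return s.length;
}

// The constructor call is usable in a constant expression, which is
// what allows the initializer of the temporary to be folded.
static_assert(Str{"Test"}.length == 4, "length is computed at compile time");
```

A call such as callback(Str{"Test"}) is the prvalue case that the patch
aims to evaluate at compile time.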

I explored the idea of calling maybe_constant_value for the whole
TARGET_EXPR in cp_fold.  That has three problems:
- we can't always elide a TARGET_EXPR, so we'd have to make sure the
  result is also a TARGET_EXPR;
- the resulting TARGET_EXPR must have the same flags, otherwise Bad
  Things happen;
- getting a new slot is also problematic.  I've seen a test where we
  had "TARGET_EXPR<D.2680, ...>, D.2680", and folding the whole TARGET_EXPR
  would get us "TARGET_EXPR<D.2681, ...>", but since we don't see the outer
  D.2680, we can't replace it with D.2681, and things break.

With this patch, two tree-ssa tests regressed: pr78687.C and pr90883.C.

FAIL: g++.dg/tree-ssa/pr90883.C   scan-tree-dump dse1 "Deleted redundant store: .*.a = {}"
is easy.  Previously, we would call C::C, so .gimple has:

  D.2590 = {};
  C::C (&D.2590);
  D.2597 = D.2590;
  return D.2597;

Then .einline inlines the C::C call:

  D.2590 = {};
  D.2590.a = {}; // #1
  D.2590.b = 0;  // #2
  D.2597 = D.2590;
  D.2590 ={v} {CLOBBER(eos)};
  return D.2597;

then #2 is removed in .fre1, and #1 is removed in .dse1.  So the test
passes.  But with the patch, .gimple won't have that C::C call, so the
IL is of course going to look different.  The .optimized dump looks the
same though so there's no problem.

pr78687.C is XFAILed because the test passes with r15-5746 but not with
r15-5747 as well.  I opened <https://gcc.gnu.org/PR117971>.

PR c++/116416

gcc/cp/ChangeLog:

* cp-gimplify.cc (cp_fold_r) <case TARGET_EXPR>: Try to fold
TARGET_EXPR_INITIAL and replace it with the folded result if
it's TREE_CONSTANT.

gcc/testsuite/ChangeLog:

* g++.dg/analyzer/pr97116.C: Adjust dg-message.
* g++.dg/tree-ssa/pr78687.C: Add XFAIL.
* g++.dg/tree-ssa/pr90883.C: Adjust dg-final.
* g++.dg/cpp0x/constexpr-prvalue1.C: New test.
* g++.dg/cpp1y/constexpr-prvalue1.C: New test.

Co-authored-by: Patrick Palka <ppalka@redhat.com>
Reviewed-by: Jason Merrill <jason@redhat.com>

clang-format AlwaysBreakAfterReturnType to TopLevelDefinitions

The previous value of TopLevel meant that the function name of
declarations would also be on a new line. This does not match the
current formatting of headers.

Manual testing done on c-common.h.

Also set BraceWrapping.BeforeWhile to true to match the formatting
specified for do/while loops in GNU coding standards.
https://www.gnu.org/prep/standards/standards.html#Formatting

Ok for trunk?

contrib/ChangeLog:

* clang-format: AlwaysBreakAfterReturnType set to
TopLevelDefinitions and BraceWrapping.BeforeWhile set to true.

Signed-off-by: Matthew Malcomson <mmalcomson@nvidia.com>

aarch64: Add @ to aarch64_get_lane<mode>

This is a prerequisite for Mariam's CRC support.

gcc/
* config/aarch64/aarch64-simd.md (aarch64_get_lane<mode>): Add
"@" to the name.

libstdc++: Add workaround for read(2) EINVAL on macOS and FreeBSD [PR102259]

On macOS and FreeBSD the read(2) system call can return EINVAL for large
sizes, so limit the maximum that we try to read. The calling code in
basic_filebuf::xsgetn will loop until it gets the size it wants, so we don't
need to loop in basic_file::xsgetn, just limit the maximum size.
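
The clamping step amounts to something like the sketch below.  The
constant max_read_size is a hypothetical stand-in for
_GLIBCXX_MAX_READ_SIZE, and clamp_read_size is not a function in
libstdc++.

```cpp
#include <algorithm>
#include <climits>
#include <cstddef>

// Hypothetical stand-in for _GLIBCXX_MAX_READ_SIZE (INT_MAX-1).
constexpr std::size_t max_read_size = INT_MAX - 1;

// Limit a requested read size to the maximum passed to read(2).
inline std::size_t clamp_read_size(std::size_t n)
{
    return std::min(n, max_read_size);
}
```

Because the caller already loops until it has the size it wants, a
single short read caused by the clamp is harmless.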

libstdc++-v3/ChangeLog:

PR libstdc++/102259
* config/io/basic_file_stdio.cc (basic_file::xsgetn): Limit n to
_GLIBCXX_MAX_READ_SIZE if that macro is defined.
* config/os/bsd/darwin/os_defines.h (_GLIBCXX_MAX_READ_SIZE):
Define to INT_MAX-1.
* config/os/bsd/freebsd/os_defines.h (_GLIBCXX_MAX_READ_SIZE):
Likewise.

libstdc++: Remove std::allocator::is_always_equal typedef for C++26

This was removed by P2868R3, voted into the C++26 draft at the November
2023 meeting in Kona. We've had a deprecated warning in place for three
years.

libstdc++-v3/ChangeLog:

* include/bits/allocator.h (allocator::is_always_equal): Do not
define for C++26.
(allocator<void>::is_always_equal): Likewise.
* testsuite/20_util/allocator/requirements/typedefs.cc: Check
that is_always_equal is not present in C++26.
* testsuite/20_util/allocator/void.cc: Do not require
is_always_equal for C++26.
* testsuite/23_containers/vector/bool/cons/constexpr.cc: Add
missing override of base's is_always_equal.
* testsuite/23_containers/vector/cons/constexpr.cc: Likewise.

libstdc++: Fix debug containers for constant evaluation [PR117962]

Using a stateful allocator with std::vector would fail in Debug Mode,
because the allocator-extended move constructor tries to swap all the
attached safe iterators, but that uses a non-inline function which isn't
constexpr. We don't actually need to swap any iterators in constant
expressions, because we never attach them to the container in the first
place.

This bug went unnoticed because the tests for constexpr std::vector were
using a stateful allocator with a std::allocator base class, but were
failing to override the inherited is_always_equal trait from
std::allocator. That meant that the allocators took the always-equal
code paths, and didn't try to use the buggy constructor. In C++26 the
std::allocator::is_always_equal trait goes away, and so the tests
changed behaviour, revealing the bug.
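
A minimal sketch of a stateful allocator with a std::allocator base
that does override the trait.  TaggedAllocator is a hypothetical
example type, not the allocator used in the testsuite.

```cpp
#include <memory>
#include <type_traits>

// Hypothetical stateful allocator built on a std::allocator base.
template <typename T>
struct TaggedAllocator : std::allocator<T>
{
    int tag = 0;

    TaggedAllocator() = default;
    explicit TaggedAllocator(int t) : tag(t) {}

    template <typename U>
    TaggedAllocator(const TaggedAllocator<U>& other) : tag(other.tag) {}

    template <typename U>
    struct rebind { using other = TaggedAllocator<U>; };

    // Without this override, the is_always_equal inherited from
    // std::allocator (before C++26) would claim all instances are
    // interchangeable.
    using is_always_equal = std::false_type;

    friend bool operator==(const TaggedAllocator& a,
                           const TaggedAllocator& b)
    { return a.tag == b.tag; }

    friend bool operator!=(const TaggedAllocator& a,
                           const TaggedAllocator& b)
    { return a.tag != b.tag; }
};

static_assert(
    !std::allocator_traits<TaggedAllocator<int>>::is_always_equal::value,
    "the override hides the inherited trait");
```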

libstdc++-v3/ChangeLog:

PR libstdc++/117962
* include/debug/safe_container.h: Make allocator-extended move
constructor a no-op during constant evaluation.

[committed] RISC-V testsuite changes to test clmul expansion of CRCs

This testsuite only patch allows us to test code generation for CRC functions
using clmul instructions.

Conceptually it's trivial.  We already have various execution tests in
gcc.dg/torture.  We just define a new set of dg directives and include the
testcase in gcc.dg/torture.

The only gotcha in here is the need to change target-supports.exp.  It was
passing the default set of arguments down to the check_runtime routine, so they
always failed to assemble the testcase and we never claimed the ability to
execute Zbc, Zbkb or Zbkc extension code.

Again, NFC, just testsuite bits.  Pushing to the trunk.

Only aarch64 and x86 bits left ;-)

gcc/testsuite
* gcc.target/riscv/crc-1-zbc.c: New test.
* gcc.target/riscv/crc-1-zbkc.c: Likewise.
* gcc.target/riscv/crc-10-zbc.c: Likewise.
* gcc.target/riscv/crc-10-zbkc.c: Likewise.
* gcc.target/riscv/crc-12-zbc.c: Likewise.
* gcc.target/riscv/crc-12-zbkc.c: Likewise.
* gcc.target/riscv/crc-13-zbc.c: Likewise.
* gcc.target/riscv/crc-13-zbkc.c: Likewise.
* gcc.target/riscv/crc-14-zbc.c: Likewise.
* gcc.target/riscv/crc-14-zbkc.c: Likewise.
* gcc.target/riscv/crc-17-zbc.c: Likewise.
* gcc.target/riscv/crc-17-zbkc.c: Likewise.
* gcc.target/riscv/crc-18-zbc.c: Likewise.
* gcc.target/riscv/crc-18-zbkc.c: Likewise.
* gcc.target/riscv/crc-21-rv64-zbc.c: Likewise.
* gcc.target/riscv/crc-21-rv64-zbkc.c: Likewise.
* gcc.target/riscv/crc-22-zbc.c: Likewise.
* gcc.target/riscv/crc-22-zbkc.c: Likewise.
* gcc.target/riscv/crc-23-zbc.c: Likewise.
* gcc.target/riscv/crc-23-zbkc.c: Likewise.
* gcc.target/riscv/crc-4-zbc.c: Likewise.
* gcc.target/riscv/crc-4-zbkb.c: Likewise.
* gcc.target/riscv/crc-4-zbkc.c: Likewise.
* gcc.target/riscv/crc-5-zbc.c: Likewise.
* gcc.target/riscv/crc-5-zbkb.c: Likewise.
* gcc.target/riscv/crc-5-zbkc.c: Likewise.
* gcc.target/riscv/crc-6-zbc.c: Likewise.
* gcc.target/riscv/crc-6-zbkc.c: Likewise.
* gcc.target/riscv/crc-7-zbc.c: Likewise.
* gcc.target/riscv/crc-7-zbkc.c: Likewise.
* gcc.target/riscv/crc-8-zbc.c: Likewise.
* gcc.target/riscv/crc-8-zbkc.c: Likewise.
* gcc.target/riscv/crc-9-zbc.c: Likewise.
* gcc.target/riscv/crc-9-zbkc.c: Likewise.
* gcc.target/riscv/crc-CCIT-data16-zbc.c: Likewise.
* gcc.target/riscv/crc-CCIT-data16-zbkc.c: Likewise.
* gcc.target/riscv/crc-CCIT-data8-zbc.c: Likewise.
* gcc.target/riscv/crc-CCIT-data8-zbkc.c: Likewise.
* gcc.target/riscv/crc-coremark-16bitdata-zbc.c: Likewise.
* gcc.target/riscv/crc-coremark-16bitdata-zbkc.c: Likewise.
* lib/target-supports.exp (check_effective_target_riscv_zbc_ok): Set
gcc_march before compiling test program.
(check_effective_target_riscv_zbkc_ok): Likewise.
(check_effective_target_riscv_zbkb_ok): Likewise.
Co-authored-by: Jeff Law <jlaw@ventanamicro.com>

Free RTL SSA after late-combine

Late-combine fails to release RTL SSA info, leaking memory
(as -fmem-report shows).

* late-combine.cc (late_combine::execute): Delete RTL SSA.

Assign separate timevar to duplicate computed goto pass

It currently shares the timevar with bb-reorder but can use significant
memory and compile-time on its own.

* timevar.def (TV_DUP_COMPGOTO): Add.
* bb-reorder.cc (pass_data_duplicate_computed_gotos): Use
TV_DUP_COMPGOTO.

s390: Fix UNSPEC_CC_TO_INT canonicalization

Canonicalization of comparisons for UNSPEC_CC_TO_INT missed one case
causing unnecessarily complex code. This especially seems to hit the
Linux kernel.

gcc/ChangeLog:

* config/s390/s390.cc (s390_canonicalize_comparison): Add
missing UNSPEC_CC_TO_INT case.

gcc/testsuite/ChangeLog:

* gcc.target/s390/ccusage.c: New test.

Signed-off-by: Juergen Christ <jchrist@linux.ibm.com>

c++: Allow overloaded builtins to be used in SFINAE context

This commit newly introduces the ability to use overloaded builtins in
C++ SFINAE context.

The goal behind this is to ensure there is a single mechanism
that libstdc++ can use to determine whether a given type can be used in
the atomic fetch_add (and similar) builtins.  I am working on another
patch that hopes to use this mechanism to identify whether fetch_add
(and similar) work on floating point types.

Current state of the world:

    GCC currently exposes resolved versions of these builtins to the
    user, so for GCC it's currently possible to use tests similar to the
    below to check for atomic loads on a 2 byte sized object.
      #if __has_builtin(__atomic_load_2)
    Clang does not expose resolved versions of the atomic builtins.

    clang currently allows SFINAE on builtins, so that C++ code can
    check whether a builtin is available on a given type.
    GCC does not (and that is what this patch aims to change).

    C libraries like libatomic can check whether a given atomic builtin
    can work on a given type by using autoconf to check for a
    miscompilation when attempting such a use.

My goal:
    I would like to enable floating point fetch_add (and similar) in
    GCC, in order to use those overloads in libstdc++ implementation of
    atomic<float>::fetch_add.
    This should allow compilers targeting GPU's which have floating
    point fetch_add instructions to emit optimal code.

    In order to do that I need some consistent mechanism that libstdc++
    can use to identify whether the fetch_add builtins have floating
    point overloads (and for which types these exist).

    I would hence like to enable SFINAE on builtins, so that libstdc++
    can use that mechanism for the floating point fetch_add builtins.
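
    A rough sketch of the kind of detection this is meant to enable.
    The trait name has_atomic_fetch_add is hypothetical.  On a compiler
    without SFINAE on builtins, a type the builtin rejects is expected
    to produce a hard error instead of selecting the primary template,
    so only a supported type is checked here.

```cpp
#include <type_traits>
#include <utility>

// Primary template: assume the builtin is not usable on T.
template <typename T, typename = void>
struct has_atomic_fetch_add : std::false_type {};

// Selected only when the builtin call is well-formed for T.
template <typename T>
struct has_atomic_fetch_add<
    T,
    std::void_t<decltype(__atomic_fetch_add(std::declval<T*>(),
                                            std::declval<T>(),
                                            __ATOMIC_RELAXED))>>
    : std::true_type {};

static_assert(has_atomic_fetch_add<int>::value,
              "fetch_add on int is expected to resolve");
```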

Implementation follows the existing mechanism for handling SFINAE
contexts in c-common.cc.  A boolean is passed into the c-common.cc
function indicating whether these functions should emit errors or not.
This boolean comes from `complain & tf_error` in the C++ frontend.
(Similar to other functions like valid_array_size_p and
c_build_vec_perm_expr).

This is done both for resolve_overloaded_builtin and
check_builtin_function_arguments, both of which can be used in SFINAE
contexts.
    I attempted to trigger something using the `reject_gcc_builtin`
    function in an SFINAE context.  Given the context where this
    function is called from the C++ frontend it looks like it may be
    possible, but I did not manage to trigger this in template context
    by attempting to do something similar to the testcases added around
    those calls.
    - I would appreciate any feedback on whether this is something that
      can happen in a template context, and if so some help writing a
      relevant testcase for it.

Both of these functions have target hooks for target specific builtins
that I have updated to take the extra boolean flag.  I have not adjusted
the functions implementing those target hooks (except to update the
declarations) so target specific builtins will still error in SFINAE
contexts.
- I could imagine not updating the target hook definition since nothing
  would use that change.  However I figure that allowing targets to
  decide this behaviour would be the right thing to do eventually, and
  since this is the target-independent part of the change to do that
  this patch should make that change.
  Could adjust if others disagree.

Other relevant points that I'd appreciate reviewers check:
- I did not pass this new flag through
  atomic_bitint_fetch_using_cas_loop since the _BitInt type is not
  available in the C++ frontend and I didn't want if conditions that can
  not be executed in the source.
- I only test non-compile-time-constant types with SVE types, since I do
  not know of a way to get a VLA into a SFINAE context.
- While writing tests I noticed a few differences with clang in this
  area.  I don't think they are problematic but am mentioning them for
  completeness (and to allow others to judge if these are a problem).
  - atomic_fetch_add on a boolean is allowed by clang.
  - When __atomic_load is passed an invalid memory model (i.e. too
    large), we give an SFINAE failure while clang does not.

Bootstrap and regression tested on AArch64 and x86_64.
Built first stage on targets whose target hook declaration needed
updating (though did not regtest etc).  Target triplets I built in order
to check the backend specific changes I made:
   - arm-none-linux-gnueabihf
   - avr-linux-gnu
   - riscv-linux-gnu
   - powerpc-linux-gnu
   - s390x-linux-gnu

Ok for commit to trunk?

gcc/c-family/ChangeLog:

* c-common.cc (builtin_function_validate_nargs,
check_builtin_function_arguments,
speculation_safe_value_resolve_call,
speculation_safe_value_resolve_params, sync_resolve_size,
sync_resolve_params, get_atomic_generic_size,
resolve_overloaded_atomic_exchange,
resolve_overloaded_atomic_compare_exchange,
resolve_overloaded_atomic_load, resolve_overloaded_atomic_store,
resolve_overloaded_builtin):  Add `complain` boolean parameter
and determine whether to emit errors based on its value.
* c-common.h (check_builtin_function_arguments,
resolve_overloaded_builtin):  Mention `complain` boolean
parameter in declarations.  Give it a default of `true`.

gcc/ChangeLog:

* config/aarch64/aarch64-c.cc
(aarch64_resolve_overloaded_builtin,aarch64_check_builtin_call):
Add new unused boolean parameter to match target hook
definition.
* config/arm/arm-builtins.cc (arm_check_builtin_call): Likewise.
* config/arm/arm-c.cc (arm_resolve_overloaded_builtin):
Likewise.
* config/arm/arm-protos.h (arm_check_builtin_call): Likewise.
* config/avr/avr-c.cc (avr_resolve_overloaded_builtin):
Likewise.
* config/riscv/riscv-c.cc (riscv_check_builtin_call,
riscv_resolve_overloaded_builtin): Likewise.
* config/rs6000/rs6000-c.cc (altivec_resolve_overloaded_builtin):
Likewise.
* config/rs6000/rs6000-protos.h (altivec_resolve_overloaded_builtin):
Likewise.
* config/s390/s390-c.cc (s390_resolve_overloaded_builtin):
Likewise.
* doc/tm.texi: Regenerate.
* target.def (TARGET_RESOLVE_OVERLOADED_BUILTIN,
TARGET_CHECK_BUILTIN_CALL): Update prototype to include a
boolean parameter that indicates whether errors should be
emitted.  Update documentation to mention this fact.

gcc/cp/ChangeLog:

* call.cc (build_cxx_call):  Pass `complain` parameter to
check_builtin_function_arguments.  Take its value from the
`tsubst_flags_t` type `complain & tf_error`.
* semantics.cc (finish_call_expr):  Pass `complain` parameter to
resolve_overloaded_builtin.  Take its value from the
`tsubst_flags_t` type `complain & tf_error`.

gcc/testsuite/ChangeLog:

* g++.dg/template/builtin-atomic-overloads.def: New test.
* g++.dg/template/builtin-atomic-overloads1.C: New test.
* g++.dg/template/builtin-atomic-overloads2.C: New test.
* g++.dg/template/builtin-atomic-overloads3.C: New test.
* g++.dg/template/builtin-atomic-overloads4.C: New test.
* g++.dg/template/builtin-atomic-overloads5.C: New test.
* g++.dg/template/builtin-atomic-overloads6.C: New test.
* g++.dg/template/builtin-atomic-overloads7.C: New test.
* g++.dg/template/builtin-atomic-overloads8.C: New test.
* g++.dg/template/builtin-sfinae-check-function-arguments.C: New test.
* g++.dg/template/builtin-speculation-overloads.def: New test.
* g++.dg/template/builtin-speculation-overloads1.C: New test.
* g++.dg/template/builtin-speculation-overloads2.C: New test.
* g++.dg/template/builtin-speculation-overloads3.C: New test.
* g++.dg/template/builtin-speculation-overloads4.C: New test.
* g++.dg/template/builtin-speculation-overloads5.C: New test.
* g++.dg/template/builtin-validate-nargs.C: New test.

Signed-off-by: Matthew Malcomson <mmalcomson@nvidia.com>

PR modula2/115328: use enable forward bool and set default true

This patch introduces GetEnableForward and SetEnableForward,
against which the forward procedure declaration feature is checked.
Currently the setting defaults to true.

gcc/m2/ChangeLog:

PR modula2/115328
* gm2-compiler/M2Options.def (GetEnableForward): New procedure
function.
(SetEnableForward): New procedure.
* gm2-compiler/M2Options.mod (GetEnableForward): New procedure
function.
(SetEnableForward): New procedure.
(EnableForward): New boolean.
* gm2-compiler/P1SymBuild.mod (EndBuildForward): Check
GetEnableForward and generate an error message if false.

gcc/testsuite/ChangeLog:

PR modula2/115328
* gm2/pim/fail/forward.mod: Move to...
* gm2/pim/pass/forward.mod: ...here.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

docs: Clarify -fsanitize=hwaddress target support [PR117960]

Since GCC 13, -fsanitize=hwaddress is supported not just on AArch64 but also
on x86_64 (there only with -mlam=u48 or -mlam=u57).

2024-12-09 Jakub Jelinek <jakub@redhat.com>

PR sanitizer/117960
* doc/invoke.texi (fsanitize=hwaddress): Clarify on which targets
it is supported.

replace atoi with strtoul in c_parser_gimple_parse_bb_spec [PR114541]

The full treatment of these invalid values was considered out of
scope for this patch.

PR c/114541
* gimple-parser.cc (c_parser_gimple_parse_bb_spec):
Use strtoul with ERANGE check instead of atoi to avoid UB
and detect invalid __BB#.
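
The ChangeLog entry above can be illustrated with a minimal standalone
sketch.  It is not the code from gimple-parser.cc: the helper name
parse_bb_index and the choice to reject values above INT_MAX are
assumptions made for illustration.  It shows why strtoul is preferable:
atoi has undefined behaviour when the value is not representable, while
strtoul reports overflow through errno and exposes where parsing stopped.

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>

// Hypothetical helper, not the GCC function: parse the decimal index that
// would follow a "__BB" prefix.  Returns true and stores the value in
// *index only if the text is a non-empty run of digits that fits in an int.
bool parse_bb_index(const char *text, int *index)
{
  // strtoul accepts leading whitespace and a sign, so insist on a digit.
  if (text[0] < '0' || text[0] > '9')
    return false;

  errno = 0;
  char *end = nullptr;
  unsigned long value = std::strtoul(text, &end, 10);

  if (errno == ERANGE)  // value does not fit in unsigned long
    return false;
  if (*end != '\0')     // trailing garbage such as "12x"
    return false;
  if (value > static_cast<unsigned long>(INT_MAX))  // too large for an int
    return false;

  *index = static_cast<int>(value);
  return true;
}
```

Resetting errno before the call matters, because strtoul only sets it on
failure and never clears a stale value.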

Signed-off-by: Heiko Eißfeldt <heiko@hexco.de>

arm: remove obsolete vcond expanders

The vcond{,u} expander patterns have been declared obsolete.  Remove
them from the Arm backend.

gcc/ChangeLog:

PR target/114189
* config/arm/arm-protos.h (arm_expand_vcond): Delete prototype.
* config/arm/arm.cc (arm_expand_vcond): Delete function.
* config/arm/vec-common.md (vcond<mode><mode>): Delete pattern.
(vcond<V_cvtto><mode>): Likewise.
(vcond<VH_cvtto><mode>): Likewise.
(vcondu<mode><v_cmp_result>): Likewise.

RISC-V: Refine signed SAT_TRUNC testcase dump check to tree optimized

The saturating-ALU testcases check the RTL expand dump for whether a
standard name such as .SAT_TRUNC exists.  But the RTL expand dump is
sensitive to middle-end changes and to debug information.  For example,
the following output appeared recently.

Replacing Expressions
_5 replace with --> _5 = .SAT_TRUNC (x_3(D), y_4(D)); [tail call]

As a result, the dump check counts need adjusting time and again.  This
patch switches the standard name check to the tree optimized dump,
which is somewhat more stable.

The following test suites passed for this patch.
* The rv64gcv full regression test.

This is a test-only patch and fairly obvious; I will commit it
directly if there are no comments in the next 48 hours.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat/sat_s_trunc-1-i16-to-i8.c: Take tree-optimized
pass for standard name check, and adjust the times.
* gcc.target/riscv/sat/sat_s_trunc-1-i32-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-1-i32-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-1-i64-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-1-i64-to-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-1-i64-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-2-i16-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-2-i32-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-2-i32-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-2-i64-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-2-i64-to-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-2-i64-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-3-i16-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-3-i32-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-3-i32-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-3-i64-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-3-i64-to-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-3-i64-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-4-i16-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-4-i32-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-4-i32-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-4-i64-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-4-i64-to-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-4-i64-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-5-i16-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-5-i32-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-5-i32-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-5-i64-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-5-i64-to-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-5-i64-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-6-i16-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-6-i32-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-6-i32-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-6-i64-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-6-i64-to-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-6-i64-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-7-i16-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-7-i32-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-7-i32-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-7-i64-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-7-i64-to-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-7-i64-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-8-i16-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-8-i32-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-8-i32-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-8-i64-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-8-i64-to-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-8-i64-to-i8.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Refine signed SAT_SUB testcase dump check to tree optimized

The saturating-ALU testcases check the RTL expand dump for whether a
standard name such as .SAT_SUB exists.  But the RTL expand dump is
sensitive to middle-end changes and to debug information.  For example,
the following output appeared recently.

Replacing Expressions
_5 replace with --> _5 = .SAT_SUB (x_3(D), y_4(D)); [tail call]

As a result, the dump check counts need adjusting time and again.  This
patch switches the standard name check to the tree optimized dump,
which is somewhat more stable.

The following test suites passed for this patch.
* The rv64gcv full regression test.

This is a test-only patch and fairly obvious; I will commit it
directly if there are no comments in the next 48 hours.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat/sat_s_sub-1-i16.c: Take tree-optimized
pass for standard name check, and adjust the times.
* gcc.target/riscv/sat/sat_s_sub-1-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-1-i64.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-1-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-2-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-2-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-2-i64.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-2-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-3-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-3-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-3-i64.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-3-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-4-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-4-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-4-i64.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-4-i8.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Refine signed SAT_ADD testcase dump check to tree optimized

The saturating-ALU testcases check the RTL expand dump for whether a
standard name such as .SAT_ADD exists.  But the RTL expand dump is
sensitive to middle-end changes and to debug information.  For example,
the following output appeared recently.

Replacing Expressions
_5 replace with --> _5 = .SAT_ADD (x_3(D), y_4(D)); [tail call]

As a result, the dump check counts need adjusting time and again.  This
patch switches the standard name check to the tree optimized dump,
which is somewhat more stable.

The following test suites passed for this patch.
* The rv64gcv full regression test.

This is a test-only patch and fairly obvious; I will commit it
directly if there are no comments in the next 48 hours.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat/sat_s_add-1-i16.c: Take tree-optimized
pass for standard name check, and adjust the times.
* gcc.target/riscv/sat/sat_s_add-1-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-1-i64.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-1-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-2-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-2-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-2-i64.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-2-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-3-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-3-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-3-i64.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-3-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-4-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-4-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-4-i64.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-4-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_add_imm-1-1.c: Ditto.
* gcc.target/riscv/sat/sat_s_add_imm-1.c: Ditto.
* gcc.target/riscv/sat/sat_s_add_imm-2-1.c: Ditto.
* gcc.target/riscv/sat/sat_s_add_imm-2.c: Ditto.
* gcc.target/riscv/sat/sat_s_add_imm-3-1.c: Ditto.
* gcc.target/riscv/sat/sat_s_add_imm-3.c: Ditto.
* gcc.target/riscv/sat/sat_s_add_imm-4.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Refine unsigned SAT_TRUNC testcase dump check to tree optimized

The saturating-ALU testcases check the RTL expand dump for whether a
standard name such as .SAT_TRUNC exists.  But the RTL expand dump is
sensitive to middle-end changes and to debug information.  For example,
the following output appeared recently.

Replacing Expressions
_5 replace with --> _5 = .SAT_TRUNC (x_3(D), y_4(D)); [tail call]

As a result, the dump check counts need adjusting time and again.  This
patch switches the standard name check to the tree optimized dump,
which is somewhat more stable.

The following test suites passed for this patch.
* The rv64gcv full regression test.

This is a test-only patch and fairly obvious; I will commit it
directly if there are no comments in the next 48 hours.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat/sat_u_trunc-1-u16.c: Take tree-optimized
pass for standard name check, and adjust the times.
* gcc.target/riscv/sat/sat_u_trunc-1-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-1-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-1-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-2-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-2-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-2-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-2-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-3-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-3-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-3-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-3-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-4-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-4-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-4-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-4-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-5-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-5-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-5-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-5-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-6-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-6-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-6-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-6-u8.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Refine unsigned SAT_SUB testcase dump check to tree optimized

The saturating-ALU testcases check the RTL expand dump for whether a
standard name such as .SAT_SUB exists.  But the RTL expand dump is
sensitive to middle-end changes and to debug information.  For example,
the following output appeared recently.

Replacing Expressions
_5 replace with --> _5 = .SAT_SUB (x_3(D), y_4(D)); [tail call]

As a result, the dump check counts need adjusting time and again.  This
patch switches the standard name check to the tree optimized dump,
which is somewhat more stable.

The following test suites passed for this patch.
* The rv64gcv full regression test.

This is a test-only patch and fairly obvious; I will commit it
directly if there are no comments in the next 48 hours.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat/sat_u_sub-1-u16.c: Take tree-optimized
pass for standard name check, and adjust the times.
* gcc.target/riscv/sat/sat_u_sub-1-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-1-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-1-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-10-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-10-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-10-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-10-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-11-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-11-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-11-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-11-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-12-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-12-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-12-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-12-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-2-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-2-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-2-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-2-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-3-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-3-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-3-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-3-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-4-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-4-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-4-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-4-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-5-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-5-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-5-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-5-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-6-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-6-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-6-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-6-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-7-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-7-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-7-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-7-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-8-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-8-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-8-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-8-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-9-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-9-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-9-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-9-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-1-u16-1.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-1-u16-2.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-1-u16-3.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-1-u16-4.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-1-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-1-u32-1.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-1-u32-2.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-1-u32-3.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-1-u32-4.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-1-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-1-u64-1.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-1-u64-2.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-1-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-1-u8-1.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-1-u8-2.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-1-u8-3.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-1-u8-4.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-1-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-2-u16-1.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-2-u16-2.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-2-u16-3.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-2-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-2-u32-1.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-2-u32-2.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-2-u32-3.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-2-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-2-u64-1.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-2-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-2-u8-1.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-2-u8-2.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-2-u8-3.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-2-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-3-u16-1.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-3-u16-2.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-3-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-3-u32-1.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-3-u32-2.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-3-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-3-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-3-u8-1.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-3-u8-2.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-3-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-4-u16-1.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-4-u16-2.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-4-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-4-u32-1.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-4-u32-2.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-4-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-4-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-4-u8-1.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-4-u8-2.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-4-u8.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Refine unsigned SAT_ADD testcase dump check to tree optimized

The saturating-ALU testcases check the RTL expand dump for whether a
standard name such as .SAT_ADD exists.  But the RTL expand dump is
sensitive to middle-end changes and to debug information.  For example,
the following output appeared recently.

Replacing Expressions
_5 replace with --> _5 = .SAT_ADD (x_3(D), y_4(D)); [tail call]

As a result, the dump check counts need adjusting time and again.  This
patch switches the standard name check to the tree optimized dump,
which is somewhat more stable.

The following test suites passed for this patch.
* The rv64gcv full regression test.

This is a test-only patch and fairly obvious; I will commit it
directly if there are no comments in the next 48 hours.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat/sat_u_add-1-u16.c: Take tree-optimized
pass for standard name check, and adjust the times.
* gcc.target/riscv/sat/sat_u_add-1-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-1-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-1-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-2-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-2-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-2-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-2-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-3-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-3-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-3-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-3-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-4-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-4-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-4-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-4-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-5-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-5-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-5-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-5-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-6-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-6-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-6-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-6-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-1-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-1-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-1-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-1-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-2-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-2-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-2-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-2-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-3-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-3-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-3-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-3-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-4-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-4-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-4-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-4-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-1.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-10.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-11.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-12.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-13.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-14.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-15.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-16.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-17.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-18.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-19.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-2.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-20.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-21.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-22.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-23.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-24.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-25.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-26.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-27.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-28.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-29.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-3.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-30.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-31.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-32.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-33.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-34.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-35.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-36.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-37.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-38.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-39.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-4.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-40.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-41.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-42.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-43.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-44.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-45.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-46.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-47.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-48.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-49.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-5.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-50.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-51.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-52.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-53.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-54.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-55.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-56.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-57.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-58.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-59.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-6.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-60.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-7.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-8.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm_type_check-9.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>

middle-end/117932 - further speedup DF worklist solver

The triple-indirect memory reference we perform for each incoming
edge, age <= last_change_age[bbindex_to_postorder[e->src->index]],
is pretty bad, and when there are a lot of small BBs, as in the
PR26854 testcase, this shows in the profile.  The following reduces
this by one level by making last_change_age indexed by BB index
rather than postorder number and realizing that for the first
iteration the age check is always true.  We pay for this by
allocating last_change_age for all BBs in the function, but we
do it as for sparsesets and avoid initializing it, given we check
the considered bitmap anyway.  We can also elide initializing
last_visit_age in an obvious way, given we separated the initial
iteration in the previous change.

Together this improves compile-time in the PR117932 setting by
another 2%.
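
The indexing change described above can be sketched in isolation.  The
types and function names below (Block, Edge, source_changed_by_postorder,
source_changed_by_index, rekey_by_block_index) are hypothetical stand-ins,
not the declarations in df-core.cc, and the sketch assumes every block has
a postorder number.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-ins for the solver's data structures.
struct Block { int index; };
struct Edge  { const Block *src; };

// Before: last_change_age is indexed by postorder number, so each edge
// costs three dependent loads: e.src->index, then bbindex_to_postorder[..],
// then last_change_age[..].
bool source_changed_by_postorder(const Edge &e, int age,
                                 const std::vector<int> &bbindex_to_postorder,
                                 const std::vector<int> &last_change_age)
{
  return age <= last_change_age[bbindex_to_postorder[e.src->index]];
}

// After: last_change_age is indexed by block index directly, which removes
// one dependent load at the price of an array sized for every block.
bool source_changed_by_index(const Edge &e, int age,
                             const std::vector<int> &last_change_age_by_bb)
{
  return age <= last_change_age_by_bb[e.src->index];
}

// Re-key an age table from postorder numbering to block-index numbering,
// only to show that both lookups answer the same question.
std::vector<int> rekey_by_block_index(
    const std::vector<int> &bbindex_to_postorder,
    const std::vector<int> &last_change_age)
{
  std::vector<int> by_bb(bbindex_to_postorder.size());
  for (std::size_t bb = 0; bb < by_bb.size(); ++bb)
    by_bb[bb] = last_change_age[bbindex_to_postorder[bb]];
  return by_bb;
}
```

The sketch does not model the uninitialized-allocation trick mentioned in
the message; it only shows the removed level of indirection.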

PR middle-end/117932
* df-core.cc (df_worklist_propagate_forward): Elide
age check for the first iteration, adjust for
last_change_age change.
(df_worklist_propagate_backward): Likewise.
(df_worklist_dataflow_doublequeue): Make last_change_age
indexed by BB index, avoid clearing both age arrays.

middle-end/117932 - speed up DF solver

The following addresses slow bitmap operations for maintaining the
iteration order of df_worklist_dataflow_doublequeue for a large number
of basic blocks.  The main complexity change is switching the
worklist and pending bitmaps to tree view; a secondary change is
avoiding the fully populated initial bitmap for the first iteration
and instead special-casing that, plus avoiding all forward worklist
bitmap sets in that iteration.  Usually second or later iterations
are sparse, so optimizing the first iteration seems worthwhile.

For PR117932 when isolating from ext-dce and fold-mem-offset issues
this results in a 10% compile-time reduction.
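
As a rough illustration of the first-iteration special case, here is a
toy forward solver.  It is a sketch under stated assumptions, not the
df-core.cc algorithm: the problem (propagating a maximum along edges),
the function name solve_forward, and the use of std::set as a stand-in
for a tree-view bitmap are all hypothetical, and the age bookkeeping of
the real solver is left out.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Toy problem: out[b] = max (seed[b], out[p] for every predecessor p).
// succs[b] lists the successor block indices of block b.
std::vector<int> solve_forward(const std::vector<std::vector<int>> &succs,
                               const std::vector<int> &seed)
{
  const std::size_t n = succs.size();
  std::vector<int> out(seed);
  std::set<int> pending;  // ordered set, standing in for a tree-view bitmap

  // Push out[b] to the successors of b and queue those whose value changed.
  auto propagate = [&](int b) {
    for (int s : succs[b])
      if (out[b] > out[s])
        {
          out[s] = out[b];
          pending.insert(s);
        }
  };

  // First iteration: every block is visited exactly once, in order, so no
  // fully populated worklist has to be built or consulted.
  for (std::size_t b = 0; b < n; ++b)
    propagate(static_cast<int>(b));

  // Later iterations are usually sparse: only blocks queued in the
  // previous round are revisited.
  while (!pending.empty())
    {
      std::set<int> worklist;
      worklist.swap(pending);
      for (int b : worklist)
        propagate(b);
    }
  return out;
}
```

For a three-block cycle 0 -> 1 -> 2 -> 0 with seeds {1, 5, 2}, the first
pass alone already carries the 5 around the cycle; the later rounds only
confirm that nothing changes.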

PR middle-end/117932
* df-core.cc (df_worklist_propagate_forward): When WORKLIST
is NULL, do not set bits there.
(df_worklist_propagate_backward): Likewise.
(df_worklist_dataflow_doublequeue): Separate first pass
over all blocks with NULL worklist.
(df_worklist_dataflow): Do not initialize pending and adjust.

nvptx: Switch default from '-march=sm_30' to '-march=sm_52'

In preparation for GCC/nvptx code changes that require sm_52 features, this
commit raises default nvptx code generation from sm_30 "Kepler" to sm_52 "Maxwell".
The latter has been supported as of CUDA 6.5 (2014-08), and is thus supported
by most Nvidia GPUs of the last decade, approximately.  (This commit doesn't
change the use of PTX ISA 6.0, which already requires CUDA 9.0 anyway.)

To continue building sm_30 multilib variants (for use via building/linking with
'-march=sm_30'), specify '--with-multilib-list=default,sm_30', for example.  Or,
to continue defaulting to sm_30 multilib variants, specify '--with-arch=sm_30'
(plus '--without-multilib-list', if applicable).  See the documentation,
<https://gcc.gnu.org/install/specific.html#nvptx-x-none>.

(Note that after a long deprecation time, eventually the
sm_3x "Kepler architecture support is removed from CUDA 12.0", 2022-12.)

gcc/
* config.gcc [nvptx-*]: Switch default from '-march=sm_30' to
'-march=sm_52'.
* doc/install.texi (Nvidia PTX Options): Update.

GCN: Fix 'real_from_integer' usage

The recent commit b3f1b9e2aa079f8ec73e3cb48143a16645c49566
"build: Remove INCLUDE_MEMORY [PR117737]" exposed an issue in code added in
2020 GCN back end commit 95607c12363712c39345e1d97f2c1aee8025e188
"Zero-initialise masked load destinations"; compilation now fails:

    [...]
    In file included from ../../source-gcc/gcc/coretypes.h:507:0,
                     from ../../source-gcc/gcc/config/gcn/gcn.cc:24:
    ../../source-gcc/gcc/real.h: In instantiation of ‘format_helper::format_helper(const T&) [with T = std::nullptr_t]’:
    ../../source-gcc/gcc/config/gcn/gcn.cc:1178:46:   required from here
    ../../source-gcc/gcc/real.h:233:17: error: no match for ‘operator==’ (operand types are ‘std::nullptr_t’ and ‘machine_mode’)
       : m_format (m == VOIDmode ? 0 : REAL_MODE_FORMAT (m))
                     ^
    [...]

That's with 'g++ (GCC) 5.5.0', and seen similarly with
'g++ (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0', for example.

gcc/
* config/gcn/gcn.cc (gcn_vec_constant): Fix 'real_from_integer'
usage.

Rust: libformat_parser: Lower minimum Rust version to 1.49

libgrust/ChangeLog:

* libformat_parser/Cargo.toml: Change Rust edition from 2021 to 2018.
* libformat_parser/generic_format_parser/Cargo.toml: Likewise.
* libformat_parser/generic_format_parser/src/lib.rs: Remove usage of
then-unstable std features and language constructs.
* libformat_parser/src/lib.rs: Likewise, plus provide extension trait
for String::leak.

Rust: Work around 'error[E0599]: no method named `leak` found for struct `std::string::String` in the current scope'

Compiling with Debian GNU/Linux 12 (bookworm) packages:

    $ apt-cache madison cargo rustc
         cargo | 0.66.0+ds1-1 | http://deb.debian.org/debian bookworm/main ppc64el Packages
         cargo | 0.66.0+ds1-1 | http://deb.debian.org/debian bookworm/main Sources
         rustc | 1.63.0+dfsg1-2 | http://deb.debian.org/debian bookworm/main ppc64el Packages
         rustc | 1.63.0+dfsg1-2 | http://deb.debian.org/debian bookworm/main Sources

..., we run into:

       Compiling libformat_parser v0.1.0 ([...]/source-gcc/libgrust/libformat_parser)
    error[E0599]: no method named `leak` found for struct `std::string::String` in the current scope
       --> src/lib.rs:396:18
        |
    396 |         ptr: str.leak().as_ptr(),
        |                  ^^^^ method not found in `std::string::String`

    error[E0599]: no method named `leak` found for struct `std::string::String` in the current scope
       --> src/lib.rs:434:7
        |
    434 |     s.leak();
        |       ^^^^ method not found in `std::string::String`

    error[E0599]: no method named `leak` found for struct `std::string::String` in the current scope
       --> src/lib.rs:439:23
        |
    439 |         ptr: cloned_s.leak().as_ptr(),
        |                       ^^^^ method not found in `std::string::String`

Locally replace 1.72.0+ method 'leak' for struct 'std::string::String'.

libgrust/
* libformat_parser/src/lib.rs: Work around 'error[E0599]:
no method named `leak` found for struct `std::string::String` in the current scope'.

Rust: Work around 'error[E0658]: `let...else` statements are unstable'

Compiling with Debian GNU/Linux 12 (bookworm) packages:

    $ apt-cache madison cargo rustc
         cargo | 0.66.0+ds1-1 | http://deb.debian.org/debian bookworm/main ppc64el Packages
         cargo | 0.66.0+ds1-1 | http://deb.debian.org/debian bookworm/main Sources
         rustc | 1.63.0+dfsg1-2 | http://deb.debian.org/debian bookworm/main ppc64el Packages
         rustc | 1.63.0+dfsg1-2 | http://deb.debian.org/debian bookworm/main Sources

..., we run into:

       Compiling generic_format_parser v0.1.0 ([...]/source-gcc/libgrust/libformat_parser/generic_format_parser)
    error[E0658]: `let...else` statements are unstable
       --> generic_format_parser/src/lib.rs:994:5
        |
    994 | /     let Some(unescaped) = unescape_string(snippet) else {
    995 | |         return InputStringKind::NotALiteral;
    996 | |     };
        | |______^
        |
        = note: see issue #87335 <https://github.com/rust-lang/rust/issues/87335> for more information

Rewrite backwards, per <https://rust-lang.github.io/rfcs/3137-let-else.html>.

libgrust/
* libformat_parser/generic_format_parser/src/lib.rs: Work around
'error[E0658]: `let...else` statements are unstable'.

libstdc++: Add missing equality comparison in new tests [PR117921]

These new tests fail in Debug Mode because the allocator types aren't
equality comparable.

libstdc++-v3/ChangeLog:

PR libstdc++/117921
* testsuite/23_containers/set/modifiers/swap/adl.cc: Add
equality comparison for Allocator.
* testsuite/23_containers/unordered_set/modifiers/swap-2.cc:
Likewise.

aarch64: Update cpuinfo strings for some arch features

The entries for some recently-added arch features were missing the cpuinfo
string used in -march=native detection.  Presumably the Linux kernel had not
specified such a string at the time the GCC support was added.
But I see that current versions of Linux do have strings for these features
in the arch/arm64/kernel/cpuinfo.c file in the kernel tree.

This patch adds them.  It also fixes the strings for the f32mm and f64mm
features, which I think were using the wrong string: the kernel exposes
them with an "sve" prefix.

Bootstrapped and tested on aarch64-none-linux-gnu.

Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com>
gcc/

* config/aarch64/aarch64-option-extensions.def (sve-b16b16,
f32mm, f64mm, sve2p1, sme-f64f64, sme-i16i64, sme-b16b16,
sme-f16f16, mops): Update FEATURE_STRING field.

tree-eh: Don't crash on GIMPLE_TRY_FINALLY with empty cleanup sequence [PR117845]

The following valid code triggers an ICE with -fsanitize=address:

=== cut here ===
void l() {
    auto const ints = {0,1,2,3,4,5};
    for (auto i : { 3 } ) {
        __builtin_printf("%d ", i);
    }
}
=== cut here ===

The problem is that honor_protect_cleanup_actions does not expect the
cleanup sequence of a GIMPLE_TRY_FINALLY to be empty.  It is however the
case here since r14-8681-gceb242f5302027, because lower_stmt removes the
only statement in the sequence: an ASAN_MARK statement for the array that
backs the initializer_list.

This patch simply checks that the finally block is not 0 before
accessing it in honor_protect_cleanup_actions.

PR c++/117845

gcc/ChangeLog:

* tree-eh.cc (honor_protect_cleanup_actions): Support empty
finally sequences.

gcc/testsuite/ChangeLog:

* g++.dg/asan/pr117845-2.C: New test.
* g++.dg/asan/pr117845.C: New test.

Fortran: Fix testsuite regressions after r15-5897 [PR116261/PR117901]

2024-12-09 Paul Thomas <pault@gcc.gnu.org>

gcc/fortran
PR fortran/116261
* trans-array.cc (gfc_array_init_size): New arg 'explicit_ts',
to suppress the use of the expr3 element size in the descriptor
dtype.
(gfc_array_allocate): New arg 'explicit_ts', used in call to
gfc_array_init_size.
* trans-array.h: Modify prototype for gfc_array_allocate for new
bool argument.
* trans-stmt.cc (gfc_trans_allocate): Set new argument if the
typespec is explicit.

gcc/testsuite/
PR fortran/117901
* gfortran.dg/class_transformational_1.f90: Temporary fix for
ICE with some compile options by setting dummy arg of
'unlimited rebar' to be allocatable.