git.ipfire.org Git - thirdparty/gcc.git/log

[alpha] adjust MEM alignment for block move [PR115459]

Before issuing loads or stores for a block move, adjust the MEM
alignments if analysis of the addresses enabled the inference of
stricter alignment. This ensures that the MEMs are sufficiently
aligned for the corresponding insns, which avoids trouble in case of
e.g. substitutions into SUBREGs.

for gcc/ChangeLog

PR target/115459
* config/alpha/alpha.cc (alpha_expand_block_move): Adjust
MEMs to match inferred alignment.

RISC-V: NO_WARNING preferred else value for RVV

PR target/115840.

In riscv_preferred_else_value, we create an uninitialized tmp var
for else value, instead of the 0 (as default_preferred_else_value)
or the pre-exists VAR (as aarch64 does), so that we can use agnostic
policy.

The problem is that `warn_uninit` will emit a warning:
'({anonymous})' may be used uninitialized

Let's mark this tmp var as NO_WARNING.

This problem is found when I try to build glibc with V extension.

gcc

PR target/115840
* config/riscv/riscv.cc(riscv_preferred_else_value): Mark
tmp_var as NO_WARNING.

gcc/testsuite
* gcc.dg/vect/pr115840.c: New testcase.

fortran: Factor the evaluation of MINLOC/MAXLOC's BACK argument

Move the evaluation of the BACK argument out of the loop in the inline code
generated for MINLOC or MAXLOC.  For that, add a new (scalar) element
associated with BACK to the scalarization loop chain, evaluate the argument
with the context of that element, and let the scalarizer do its job.

The problem was not only a missed optimisation, but also a wrong code
one in the cases where the expression associated with BACK is not free of
side-effects, making multiple evaluations observable.

The new tests check the evaluation count of the BACK argument, and try to
cover all the variations (integral or floating-point type, constant or
unknown shape, absent or scalar or array MASK) supported by the inline
implementation of the functions.  Care has been taken to not check the case
of a constant .FALSE. MASK, for which the evaluation of BACK can be elided.

gcc/fortran/ChangeLog:

* trans-intrinsic.cc (gfc_conv_intrinsic_minmaxloc): Create a new
scalar scalarization chain element if BACK is present.  Add it to
the loop.  Set the scalarization chain before evaluating the
argument.

gcc/testsuite/ChangeLog:

* gfortran.dg/maxloc_5.f90: New test.
* gfortran.dg/minloc_5.f90: New test.

RISC-V: Disable misaligned vector access in hook riscv_slow_unaligned_access[PR115862]

The reason is that in the following code, icode = movmisalignv8si has
already been rejected by TARGET_VECTOR_MISALIGN_SUPPORTED, but it is
allowed by targetm.slow_unaligned_access,which is contradictory.

(((icode = optab_handler (movmisalign_optab, mode))
!= CODE_FOR_nothing)
|| targetm.slow_unaligned_access (mode, align))

misaligned vector access should be enabled by -mno-vector-strict-align option.

PR target/115862

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_slow_unaligned_access): Disable vector misalign.

Signed-off-by: Li Xu <xuli1@eswincomputing.com>

RISC-V: Add SiFive extensions, xsfvcp and xsfcease

We have already upstreamed these extensions into binutils, and now we need GCC
to recognize these extensions and pass them to binutils as well. We also plan
to upstream intrinsics in the near future. :)

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc (riscv_implied_info): Add xsfvcp.
(riscv_ext_version_table): Add xsfvcp, xsfcease.
(riscv_ext_flag_table): Ditto.
* config/riscv/riscv.opt (riscv_sifive_subext): New.
(XSFVCP): New.
(XSFCEASE): New.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/predef-sf-1.c: New.
* gcc.target/riscv/predef-sf-2.c: New.

rs6000: Remove vcond{,u} expanders

As PR114189 shows, middle-end will obsolete vcond, vcondu
and vcondeq optabs soon. This patch is to remove all
vcond{,u} expanders in rs6000 port and adjust the function
rs6000_emit_vector_cond_expr which is called by those
expanders as static.

PR target/115659

gcc/ChangeLog:

* config/rs6000/rs6000-protos.h (rs6000_emit_vector_cond_expr): Remove.
* config/rs6000/rs6000.cc (rs6000_emit_vector_cond_expr): Add static
qualifier as it is only called by rs6000_emit_swsqrt now.
* config/rs6000/vector.md (vcond<VEC_F:mode><VEC_F:mode>): Remove.
(vcond<VEC_I:mode><VEC_I:mode>): Remove.
(vcondv4sfv4si): Likewise.
(vcondv4siv4sf): Likewise.
(vcondv2dfv2di): Likewise.
(vcondv2div2df): Likewise.
(vcondu<VEC_I:mode><VEC_I:mode>): Likewise.
(vconduv4sfv4si): Likewise.
(vconduv2dfv2di): Likewise.

tree-optimization/115867 - ICE with simdcall vectorization in masked loop

When only a loop mask is to be supplied for the inbranch arg to a
simd function we fail to handle integer mode masks correctly. We
need to guess the number of elements represented by it. This assumes
that excess arguments are all for masks, I wasn't able to create
a simdclone with more than one integer mode mask argument.

The gcc.dg/vect/vect-simd-clone-20.c exercises this with -mavx512vl

PR tree-optimization/115867
* tree-vect-stmts.cc (vectorizable_simd_clone_call): Properly
guess the number of mask elements for integer mode masks.

[committed] Fix m68k bootstrap segfault with late-combine

So the m68k port has failed to bootstrap since the introduction of
late-combine.  My suspicion has been this is a backend problem.  Sure enough
after bisecting things down (thank goodness for the debug counter!) I'm happy
to report m68k (after this patch) has moved into its stage3 build for the first
time in a month.

Basically late-combine propagated an address calculation to its use points,
generating this insn (dwarf2out.c, I forget what function):

> (insn 653 652 655 (parallel [
>             (set (mem/j:DI (plus:SI (plus:SI (reg/f:SI 9 %a1 [orig:64 _67 ] [64])
>                             (reg:SI 0 %d0 [321]))
>                         (const_int 20 [0x14])) [0 slot_204->dw_attr_val.v.val_unsigned+0 S8 A16])
>                 (sign_extend:DI (mem/c:SI (plus:SI (reg/f:SI 14 %a6)
>                             (const_int -28 [0xffffffffffffffe4])) [870 %sfp+-28 S4 A16])))
>             (clobber (reg:SI 0 %d0))
>         ]) "../../../gcc/gcc/dwarf2out.cc":24961:23 93 {extendsidi2}
>      (expr_list:REG_DEAD (reg/f:SI 9 %a1 [orig:64 _67 ] [64])
>         (expr_list:REG_DEAD (reg:SI 0 %d0 [321])
>             (expr_list:REG_UNUSED (reg:SI 0 %d0)
>                 (nil)))))
Note how the output uses d0 in the address calculation and the clobber uses d0.

It matches this insn in the md file:

> (define_insn "extendsidi2"
>   [(set (match_operand:DI 0 "nonimmediate_operand" "=d,o,o,<")
>         (sign_extend:DI
>          (match_operand:SI 1 "nonimmediate_src_operand" "rm,rm,r<Q>,rm")))
>    (clobber (match_scratch:SI 2 "=X,&d,&d,&d"))]
>   ""
> {
>   if (which_alternative == 0)
>     /* Handle alternative 0.  */
>     {
>       if (TARGET_68020 || TARGET_COLDFIRE)
>         return "move%.l %1,%R0\;smi %0\;extb%.l %0";
>       else
>         return "move%.l %1,%R0\;smi %0\;ext%.w %0\;ext%.l %0";
>     }
>
>   /* Handle alternatives 1, 2 and 3.  We don't need to adjust address by 4
>      in alternative 3 because autodecrement will do that for us.  */
>   operands[3] = adjust_address (operands[0], SImode,
>                                 which_alternative == 3 ? 0 : 4);
>   operands[0] = adjust_address (operands[0], SImode, 0);
>
>   if (TARGET_68020 || TARGET_COLDFIRE)
>     return "move%.l %1,%3\;smi %2\;extb%.l %2\;move%.l %2,%0";
>   else
>     return "move%.l %1,%3\;smi %2\;ext%.w %2\;ext%.l %2\;move%.l %2,%0";
> }
>   [(set_attr "ok_for_coldfire" "yes,no,yes,yes")])
Note the smi/ext instruction pair in the case for alternatives 1..3.  Those
clobber the scratch register before we're done consuming inputs.  The scratch
register really needs to be marked as an earlyclobber.

That fixes the bootstrap problem, but a cursory review of m68k.md is not
encouraging.  I will not be surprised at all if there's more of this kind of
problem lurking.

But happy to at least have m68k bootstrapping again.   It's failing the
comparison test, but definitely progress.

* config/m68k/m68k.md (extendsidi2): Add missing early clobbers.

libbacktrace: avoid infinite recursion

We could get an infinite recursion in an odd case in which a
.gnu_debugdata section was added to a debug file, and mini_debuginfo
was put into the debug file, and the debug file was put into a
/usr/lib/debug directory to be found by build ID. This combination
doesn't really make sense but we shouldn't get an infinite recursion.

* elf.c (elf_add): Don't use .gnu_debugdata if we are already
reading a debuginfo file.
* Makefile.am (m2test_*): New test targets.
(CHECK_PROGRAMS): Add m2test.
(MAKETESTS): Add m2test_minidebug2.
(%_minidebug2): New pattern.
(CLEANFILES): Remove minidebug2 files.
* Makefile.in: Regenerate.

LoongArch: Remove unreachable codes.

gcc/ChangeLog:

* config/loongarch/loongarch.cc
(loongarch_split_move): Delete.
(loongarch_hard_regno_mode_ok_uncached): Likewise.
* config/loongarch/loongarch.md
(move_doubleword_fpr<mode>): Likewise.
(load_low<mode>): Likewise.
(load_high<mode>): Likewise.
(store_word<mode>): Likewise.
(movgr2frh<mode>): Likewise.
(movfrh2gr<mode>): Likewise.

LoongArch: TFmode is not allowed to be stored in the float register.

PR target/115752

gcc/ChangeLog:

* config/loongarch/loongarch.cc
(loongarch_hard_regno_mode_ok_uncached): Replace
UNITS_PER_FPVALUE with UNITS_PER_HWFPVALUE.
* config/loongarch/loongarch.h (UNITS_PER_FPVALUE): Delete.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/pr115752.c: New test.

libbacktrace: don't fail if symbol size is unknown

* btest.c (test5): Don't fail if symbol size is 0.
* mtest.c (test5): Likewise.

libbacktrace: correctly gather Mach-O symbol table

For PR libbacktrace/97082.
* macho.c (MACH_O_N_EXT): Don't define.
(MACH_O_N_UNDF): Define.
(macho_defined_symbol): Don't discard N_EXT symbols. Do
discard N_UNDF symbols.

Daily bump.

libbacktrace: fix testsuite for clang

* btest.c (test1, test3): Add optnone attribute.
* edtest.c (test1): Likewise.
* mtest.c (test1, test3): Likewise.
* configure.ac: Use -Wno-attributes and -Wno-unknown-attributes.
* configure: Regenerate.

libstdc++: Test that std::atomic_ref<bool> uses the primary template

The previous commit changed atomic_ref<bool> to not use the integral
specialization. This adds a test to verify that change. We can't
directly test that the primary template is used, but we can check that
the member functions of the integral specializations are not present.

libstdc++-v3/ChangeLog:

* testsuite/29_atomics/atomic_ref/bool.cc: New test.

libstdc++: the specialization atomic_ref<bool> should use the primary template

Per [atomics.ref.int] `bool` is excluded from the list of integral types
for which there is a specialization of the `atomic_ref` class template
and [Note 1] clearly states that `atomic_ref<bool>` "uses the primary
template" instead.

libstdc++-v3/ChangeLog:

* include/bits/atomic_base.h (__atomic_ref): Do not use integral
specialization for bool.

Signed-off-by: Damien Lebrun-Grandie <dalg24@gmail.com>

libbacktrace: suggest how to fix missing debug info

* elf.c (elf_nodebug): Suggest -g.
* macho.c (macho_nodebug): Suggest -g and dsymutil.
* pecoff.c (coff_nodebug): Suggest -g.

libbacktrace: remove trailing whitespace

* dwarf.c: Remove trailing whitespace.
* macho.c: Likewise.

libstdc++: Switch gcc.gnu.org links to https

libstdc++-v3:
* doc/xml/manual/using.xml: Switch gcc.gnu.org links to https.
* doc/html/manual/using_concurrency.html: Regenerate.
* doc/html/manual/using_dynamic_or_shared.html: Ditto.
* doc/html/manual/using_headers.html: Ditto.
* doc/html/manual/using_namespaces.html: Ditto.

[to-be-committed,RISC-V] Eliminate unnecessary sign extension after inlined str[n]cmp

This patch eliminates an unnecessary sign extension for scalar inlined
string comparisons on rv64.

Conceptually this is pretty simple.  Prove all the paths which "return"
a value from the inlined string comparison already have sign extended
values.

FINAL_LABEL is the point after the calculation of the return value.  So
if we have a jump to FINAL_LABEL, we must have a properly extended
result value at that point.

Second we're going to arrange in the .md part of the expander to use an
X mode temporary for the result.  After computing the result we will (if
necessary) extract the low part of the result using a SUBREG tagged with
the appropriate SUBREG_PROMOTED_* bits.

So with that background.

We find a jump to FINAL_LABEL in emit_strcmp_scalar_compare_byte.  Since
we know the result is X mode, we can just emit the subtraction of the
two chars in X mode and we'll have a properly sign extended result.

There's 4 jumps to final_label in emit_strcmp_scalar.

The first is just returning zero and needs trivial simplification to not
force the result into SImode.

The second is after calling strcmp in the library.  The ABI mandates
that value is sign extended, so there's nothing to do for that case.

The 3rd occurs after a call to
emit_strcmp_scalar_result_calculation_nonul.  If we dive into that
routine it needs simplificationq similar to what we did in
emit_strcmp_scalar_compare_byte

The 4th occurs after a call to emit_strcmp_scalar_result_calculation
which again needs trivial adjustment like we've done in the other routines.

Finally, at the end of expand_strcmp, just store the X mode result
sitting in SUB to RESULT.

The net of all that is we know every path has its result properly
extended to X mode.  Standard redundant extension removal will take care
of the rest.

We've been running this within Ventana for about 6 months, so naturally
it's been through various QA cycles, dhrystone, spec2017, etc.  It's
also been through a build/test cycle in my tester.  Waiting on results
from the pre-commit testing before moving forward.

gcc/
* config/riscv/riscv-string.cc
(emit_strcmp_scalar_compare_byte): Set RESULT directly rather
than using a new temporary.
(emit_strcmp_scalar_result_calculation_nonul): Likewise.
(emit_strcmp_scalar_result_calculation): Likewise.
(riscv_expand_strcmp_scalar): Use CONST0_RTX rather than
generating a new node.
(expand_strcmp): Copy directly from SUB to RESULT.
* config/riscv/riscv.md (cmpstrnsi, cmpstrsi): Pass an X
mode temporary to the expansion routines.  If necessary
extract low part of the word to store in final result location.

Ranger: Mark a few classes as final

I noticed there was a warning from clang about int_range's
dtor being marked as final saying the class cannot be inherited from.
So let's mark the few ranger classes as final for those which we know
will be final.

Bootstrapped and tested on x86_64-linux-gnu.

gcc/ChangeLog:

* value-range.h (class int_range): Mark as final.
(class prange): Likewise.
(class frange): Likewise.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

libstdc++: Disable expensive test for debug mode [PR108636]

This test uses -fkeep-inline-functions and with debug mode enabled that
compiles much slower and times out if the system is under heavy load.

The original problem being tested is independent of debug mode, so just
require normal mode for the test.

libstdc++-v3/ChangeLog:

PR libstdc++/108636
* testsuite/27_io/filesystem/path/108636.cc: Require normal
mode.

mve: Fix vsetq_lane for 64-bit elements with lane 1 [PR 115611]

This patch fixes the backend pattern that was printing the wrong input
scalar register pair when inserting into lane 1.

Added a new test to force float-abi=hard so we can use scan-assembler to check
correct codegen.

gcc/ChangeLog:

PR target/115611
* config/arm/mve.md (mve_vec_setv2di_internal): Fix printing of input
scalar register pair when lane = 1.

gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vsetq_lane_su64.c: New test.

recog: Avoid validate_change shortcut for groups [PR115782]

In this PR, due to the -f flags, we ended up with:

bb1:  r10=r10
...
bb2:  r10=r10
...
bb3:  ...=r10

with bb1->bb2 and bb1->bb3.

late-combine successfully combined the bb1->bb2 def-use and set
the insn code to NOOP_MOVE_INSN_CODE.  The bb1->bb3 combination
then failed for... reasons.  At this point, everything should have
been rewound to its original state.

However, substituting r10=r10 into r10=r10 gives r10=r10, and
validate_change had an early-out for no-op rtl changes.  This meant
that validate_change did not register a change for the bb2 insn and
so did not save its old insn code.  The NOOP_MOVE_INSN_CODE therefore
persisted even after the attempt had been rewound.

IMO it'd be too cumbersome and error-prone to expect all users of
validate_change to be aware of this possibility.  If code is using
validate_change with in_group=1, I think it has a reasonable expectation
that a change will be registered and that the insn code will be saved
(and restored on cancel).  This patch therefore limits the shortcut
to the !in_group case.

gcc/
PR rtl-optimization/115782
* recog.cc (validate_change_1): Suppress early exit for no-op
changes that are part of a group.

gcc/testsuite/
PR rtl-optimization/115782
* gcc.dg/pr115782.c: New test.

Fix bootstrap broken by gcc-15-1965-ge4f2f46e015

gcc/fortran/ChangeLog:

* trans-array.cc (gfc_conv_array_parameter): Init variable to
NULL_TREE to fix bootstrap.

Fix gimplification of ordering comparisons of arrays of bytes

The Ada compiler now defers to the gimplifier for ordering comparisons of
arrays of bytes (Ada parlance for <, >, <= and >=) because the gimplifier
in turn defers to memcmp for them, which implements the required semantics.

However, the gimplifier has a special processing for aggregate types whose
mode is not BLKmode and this processing deviates from the memcmp semantics
when the target is little-endian.

gcc/
* gimplify.cc (gimplify_scalar_mode_aggregate_compare): Add support
for ordering comparisons.
(gimplify_expr) <default>: Call gimplify_scalar_mode_aggregate_compare
only for integral scalar modes.

gcc/testsuite/
* gnat.dg/array42.adb, gnat.dg/array42_pkg.ads: New test.

AVR: Tidy up subtract-and-zero_extend insns.

There are these insns that subtract and zero-extend where
the subtrahend is zero-extended to the mode of the minuend.
This patch uses one insn (and split) with mode iterators
instead of spelling out each variant individually.
This has the additional benefit that u32 - u24 is also supported,
which previously wasn't.

gcc/
* config/avr/avr-protos.h (avr_out_minus): New prototype.
* config/avr/avr.cc (avr_out_minus): New function.
* config/avr/avr.md (*sub<HISI:mode>3.zero_extend.<QIPSI:mode>)
(*sub<HISI:mode>3.zero_extend.<QIPSI:mode>_split): New insns.
(*subpsi3_zero_extend.qi_split): Remove isns_and_split.
(*subpsi3_zero_extend.hi_split): Remove insn_and_split.
(*subhi3_zero_extend1_split): Remove insn_and_split.
(*subsi3_zero_extend_split): Remove insn_and_split.
(*subsi3_zero_extend.hi_split): Remove insn_and_split.
(*subpsi3_zero_extend.qi): Remove insn.
(*subpsi3_zero_extend.hi): Remove insn.
(*subhi3_zero_extend1): Remove insn.
(*subsi3_zero_extend): Remove insn.
(*subsi3_zero_extend.hi): Remove insn.
gcc/testsuite/
* gcc.target/avr/torture/sub-zerox.c: New test.

c++/modules: Keep entity mapping info across duplicate_decls [PR99241]

When duplicate_decls finds a match with an existing imported
declaration, it clears DECL_LANG_SPECIFIC of the olddecl and replaces it
with the contents of newdecl; this clears DECL_MODULE_ENTITY_P causing
an ICE if the same declaration is imported again later.

This fixes the issue by ensuring that the flag is transferred to newdecl
before clearing so that it ends up on olddecl again.

For future-proofing we also do the same with DECL_MODULE_KEYED_DECLS_P,
though because we don't yet support textual redefinition merging we
can't yet test this works as intended. I don't expect it's possible for
a new declaration already to have extra keyed decls mismatching that of
the old declaration though, so I don't do anything with 'keyed_map' at
this time.

PR c++/99241

gcc/cp/ChangeLog:

* decl.cc (duplicate_decls): Merge module entity information.

gcc/testsuite/ChangeLog:

* g++.dg/modules/pr99241_a.H: New test.
* g++.dg/modules/pr99241_b.H: New test.
* g++.dg/modules/pr99241_c.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>

RISC-V: Add testcases for vector .SAT_SUB in zip benchmark

This patch would like to add the test cases for the vector .SAT_SUB in
the zip benchmark.  Aka:

Form in zip benchmark:
  #define DEF_VEC_SAT_U_SUB_ZIP(T1, T2)                             \
  void __attribute__((noinline))                                    \
  vec_sat_u_sub_##T1##_##T2##_fmt_zip (T1 *x, T2 b, unsigned limit) \
  {                                                                 \
    T2 a;                                                           \
    T1 *p = x;                                                      \
    do {                                                            \
      a = *--p;                                                     \
      *p = (T1)(a >= b ? a - b : 0);                                \
    } while (--limit);                                              \
  }

DEF_VEC_SAT_U_SUB_ZIP(uint8_t, uint16_t)

vec_sat_u_sub_uint16_t_uint32_t_fmt_zip:
  ...
  vsetvli       a4,zero,e32,m1,ta,ma
  vmv.v.x       v6,a1
  vsetvli       zero,zero,e16,mf2,ta,ma
  vid.v v2
  li            a4,-1
  vnclipu.wi    v6,v6,0   // .SAT_TRUNC
.L3:
  vle16.v       v3,0(a3)
  vrsub.vx      v5,v2,a6
  mv            a7,a4
  addw          a4,a4,t3
  vrgather.vv   v1,v3,v5
  vssubu.vv     v1,v1,v6  // .SAT_SUB
  vrgather.vv   v3,v1,v5
  vse16.v       v3,0(a3)
  sub           a3,a3,t1
  bgtu          t4,a4,.L3

Passed the rv64gcv tests.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/binop/vec_sat_arith.h: Add test
helper macros.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_data.h: Add test
data for .SAT_SUB in zip benchmark.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_binary_vx.h: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_zip-run.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_zip.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

[MAINTAINERS] Update my email address and move to DCO.

* MAINTAINERS: Update my email address.

Signed-off-by: Kugan Vivekanandarajah <kvivekananda@nvidia.com>

Fortran: Fix rejecting class arrays of different ranks as storage association argument and add un/pack_class. [PR96992]

Removing the assert in trans-expr, lead to initial strides not set
which is now fixed.  When the array needs repacking, this is done for
class arrays now, too.

Packing class arrays was done using the regular internal pack
function in the past.  But that does not use the vptr's copy
function and breaks OOP paradigms (e.g. deep copy).  The new
un-/pack_class functions use the vptr's copy functionality to
implement OOP paradigms correctly.

PR fortran/96992

gcc/fortran/ChangeLog:

* trans-array.cc (gfc_trans_array_bounds): Set a starting
stride, when descriptor expects a variable for the stride.
(gfc_trans_dummy_array_bias): Allow storage association for
dummy class arrays, when they are not elemental.
(gfc_conv_array_parameter): Add more general class support
and packing for classes, too.
* trans-array.h (gfc_conv_array_parameter): Add lbound shift
for class arrays.
* trans-decl.cc (gfc_build_builtin_function_decls): Add decls
for internal_un-/pack_class.
* trans-expr.cc (gfc_reset_vptr): Allow supplying a type-tree
to generate the vtab from.
(gfc_class_set_vptr): Allow supplying a class-tree to take the
vptr from.
(class_array_data_assign): Rename to gfc_class_array_data_assign
and make usable from other compile units.
(gfc_class_array_data_assign): Renamed from class_array_data_
assign.
(gfc_conv_derived_to_class): Remove assert to
allow converting derived to class type arrays with assumed
rank.  Reduce code base and use gfc_conv_array_parameter also
for classes.
(gfc_conv_class_to_class): Use gfc_class_data_assign.
(gfc_conv_procedure_call): Adapt to new signature of
gfc_conv_derived_to_class.
* trans-io.cc (transfer_expr): Same.
* trans-stmt.cc (trans_associate_var): Same.
* trans.h (gfc_conv_derived_to_class): Signature changed.
(gfc_class_array_data_assign): Made public.
(gfor_fndecl_in_pack_class): Added declaration.
(gfor_fndecl_in_unpack_class): Same.

libgfortran/ChangeLog:

* Makefile.am: Add in_un-/pack_class.c to build.
* Makefile.in: Regenerated from Makefile.am.
* gfortran.map: Added new functions and bumped ABI.
* libgfortran.h (GFC_CLASS_T): Added for generating class
representation at runtime.
* runtime/in_pack_class.c: New file.
* runtime/in_unpack_class.c: New file.

gcc/testsuite/ChangeLog:

* gfortran.dg/class_dummy_11.f90: New test.

Revert "fixincludes: skip stdio_stdarg_h on darwin"

This reverts commit 7d454cae9d7df1f2936ad02d0742674a85396736.

The change breaks bootstrap on some x86_64 OS versions.

Add function filtering to gcov

Add the --include and --exclude flags to gcov to control what functions
to report on. This is meant to make gcov more practical as an when
writing test suites or performing other coverage experiments, which
tends to focus on a few functions at the time. This really shines in
combination with the -t/--stdout flag. With support for more expansive
metrics in gcov like modified condition/decision coverage (MC/DC) and
path coverage, output quickly gets overwhelming without filtering.

The approach is quite simple: filters are egrep regexes and are
evaluated left-to-right, and the last filter "wins", that is, if a
function matches an --include and a subsequent --exclude, it should not
be included in the output. All of the output machinery works on the
function table, so by optionally (not) adding function makes the even
the json output work as expected, and only minor changes are needed to
suppress the filtered-out functions.

Demo: math.c

    int mul (int a, int b) {
        return a * b;
    }

    int sub (int a, int b) {
        return a - b;
    }

    int sum (int a, int b) {
        return a + b;
    }

Plain matches:

$ gcov -t math --include=sum
        -:    0:Source:math.c
        -:    0:Graph:math.gcno
        -:    0:Data:-
        -:    0:Runs:0
    #####:    9:int sum (int a, int b) {
    #####:   10:    return a + b;
        -:   11:}

$ gcov -t math --include=mul
        -:    0:Source:math.c
        -:    0:Graph:math.gcno
        -:    0:Data:-
        -:    0:Runs:0
    #####:    1:int mul (int a, int b) {
    #####:    2:    return a * b;
        -:    3:}

Regex match:

$ gcov -t math --include=su
        -:    0:Source:math.c
        -:    0:Graph:math.gcno
        -:    0:Data:-
        -:    0:Runs:0
    #####:    5:int sub (int a, int b) {
    #####:    6:    return a - b;
        -:    7:}
    #####:    9:int sum (int a, int b) {
    #####:   10:    return a + b;
        -:   11:}

And similar for exclude:

$ gcov -t math --exclude=sum
        -:    0:Source:math.c
        -:    0:Graph:math.gcno
        -:    0:Data:-
        -:    0:Runs:0
    #####:    1:int mul (int a, int b) {
    #####:    2:    return a * b;
        -:    3:}
    #####:    5:int sub (int a, int b) {
    #####:    6:    return a - b;
        -:    7:}

And json, for good measure:

$ gcov -t math --include=sum --json | jq ".files[].lines[]"
{
  "line_number": 9,
  "function_name": "sum",
  "count": 0,
  "unexecuted_block": true,
  "block_ids": [],
  "branches": [],
  "calls": []
}
{
  "line_number": 10,
  "function_name": "sum",
  "count": 0,
  "unexecuted_block": true,
  "block_ids": [
    2
  ],
  "branches": [],
  "calls": []
}

Matching generally work well for mangled names, as the mangled names
also have the base symbol name in it. By default, functions are matched
by the mangled name, which means matching on base names always work as
expected. The -M flag makes the matching work on the demangled name
which is quite useful when you only want to report on specific
overloads and can use the full type names.

Why not just use grep? grep is not really sufficient as grep is very
line oriented, and the reports that benefit the most from filtering
often unpredictably span multiple lines based on the state of coverage.
For example, a condition coverage report for 3 terms/6 outcomes only
outputs 1 line when all conditions are covered, and 7 with no lines
covered.

gcc/ChangeLog:

* doc/gcov.texi: Add --include, --exclude, --match-on-demangled
documentation.
* gcov.cc (struct fnfilter): New.
(print_usage): Add --include, --exclude, -M,
--match-on-demangled.
(process_args): Likewise.
(release_structures): Release filters.
(read_graph_file): Only add function_infos matching filters.
(output_lines): Likewise.

gcc/testsuite/ChangeLog:

* lib/gcov.exp: Add filtering test function.
* g++.dg/gcov/gcov-19.C: New test.
* g++.dg/gcov/gcov-20.C: New test.
* g++.dg/gcov/gcov-21.C: New test.
* gcc.misc-tests/gcov-25.c: New test.
* gcc.misc-tests/gcov-26.c: New test.
* gcc.misc-tests/gcov-27.c: New test.
* gcc.misc-tests/gcov-28.c: New test.

Ensure function.end_line in source_info.lines

Ensure that the function.end_line in the lines vector for the source
file, even if it is not explicitly touched by a basic block. This
ensures consistency with what you would expect. For example, this file
has sources[sum.cc].lines.size () == 23 and main.end_line == 2 without
adjusting sources.lines, which in this case is a no-op.

    #####:   17:int main ()
        -:   18:{
    #####:   19:  sum (1, 2);
    #####:   20:  sum (1.1, 2);
    #####:   21:  sum (2.2, 2.3);
    #####:   22:}

This is a useful property when combined with selective reporting.

gcc/ChangeLog:

* gcov.cc (process_all_functions): Ensure fn.end_line is
included source[fn].lines.

RISC-V: c implies zca, and conditionally zcf & zcd

According to Zc-1.0.4-3.pdf from
https://github.com/riscvarchive/riscv-code-size-reduction/releases/tag/v1.0.4-3
The rule is that:
- C always implies Zca
- C+F implies Zcf (RV32 only)
- C+D implies Zcd

Signed-off-by: Fei Gao <gaofei@eswincomputing.com>
gcc/ChangeLog:

* common/config/riscv/riscv-common.cc:
c implies zca, and conditionally zcf & zcd.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/attribute-15.c: adapt TC.
* gcc.target/riscv/attribute-16.c: likewise.
* gcc.target/riscv/attribute-17.c: likewise.
* gcc.target/riscv/attribute-18.c: likewise.
* gcc.target/riscv/pr110696.c: likewise.
* gcc.target/riscv/rvv/base/abi-callee-saved-1-zcmp.c: likewise.
* gcc.target/riscv/rvv/base/abi-callee-saved-2-zcmp.c: likewise.
* gcc.target/riscv/rvv/base/pr114352-1.c: likewise.
* gcc.target/riscv/rvv/base/pr114352-3.c: likewise.
* gcc.target/riscv/arch-39.c: New test.
* gcc.target/riscv/arch-40.c: New test.

Daily bump.

Vect: Optimize truncation for .SAT_SUB operands

To get better vectorized code of .SAT_SUB,  we would like to avoid the
truncated operation for the assignment.  For example, as below.

unsigned int _1;
unsigned int _2;
unsigned short int _4;
_9 = (unsigned short int).SAT_SUB (_1, _2);

If we make sure that the _1 is in the range of unsigned short int.  Such
as a def similar to:

_1 = (unsigned short int)_4;

Then we can do the distribute the truncation operation to:

_3 = (unsigned short int) MIN (65535, _2); // aka _3 = .SAT_TRUNC (_2);
_9 = .SAT_SUB (_4, _3);

Then,  we can better vectorized code and avoid the unnecessary narrowing
stmt during vectorization with below stmt(s).

_3 = .SAT_TRUNC(_2); // SI => HI
_9 = .SAT_SUB (_4, _3);

Let's take RISC-V vector as example to tell the changes.  For below
sample code:

__attribute__((noinline))
void test (uint16_t *x, unsigned b, unsigned n)
{
  unsigned a = 0;
  uint16_t *p = x;

  do {
    a = *--p;
    *p = (uint16_t)(a >= b ? a - b : 0);
  } while (--n);
}

Before this patch:
  ...
  .L3:
  vle16.v v1,0(a3)
  vrsub.vx v5,v2,t1
  mv t3,a4
  addw a4,a4,t5
  vrgather.vv v3,v1,v5
  vsetvli zero,zero,e32,m1,ta,ma
  vzext.vf2 v1,v3
  vssubu.vx v1,v1,a1
  vsetvli zero,zero,e16,mf2,ta,ma
  vncvt.x.x.w v1,v1
  vrgather.vv v3,v1,v5
  vse16.v v3,0(a3)
  sub a3,a3,t4
  bgtu t6,a4,.L3
  ...

After this patch:
test:
  ...
  .L3:
  vle16.v     v3,0(a3)
  vrsub.vx    v5,v2,a6
  mv          a7,a4
  addw        a4,a4,t3
  vrgather.vv v1,v3,v5
  vssubu.vv   v1,v1,v6
  vrgather.vv v3,v1,v5
  vse16.v     v3,0(a3)
  sub     a3,a3,t1
  bgtu    t4,a4,.L3
  ...

The below test suites are passed for this patch:
1. The rv64gcv fully regression tests.
2. The rv64gcv build with glibc.
3. The x86 bootstrap tests.
4. The x86 fully regression tests.

gcc/ChangeLog:

* tree-vect-patterns.cc (vect_recog_sat_sub_pattern_transform):
Add new func impl to perform the truncation distribution.
(vect_recog_sat_sub_pattern): Perform above optimize before
generate .SAT_SUB call.

Signed-off-by: Pan Li <pan2.li@intel.com>

libstdc++: Make std::basic_format_context non-copyable [PR114387]

Users are not supposed to create objects of this type, and there's no
reason it needs to be copyable. LWG 4061 makes it non-copyable and
non-default constructible.

libstdc++-v3/ChangeLog:

PR libstdc++/114387
* include/std/format (basic_format_context): Define copy
operations as deleted, as per LWG 4061.
* testsuite/std/format/context.cc: New test.

libstdc++: Minor optimization for std::locale::encoding()

For the C locale we know the encoding is ASCII, so we can avoid using
newlocale and nl_langinfo_l. Similarly, for an unnamed locale, we aren't
going to get a useful answer, so we can just use a default-constrcuted
std::text_encoding representing an unknown encoding.

libstdc++-v3/ChangeLog:

* src/c++26/text_encoding.cc (__locale_encoding): Add to unnamed
namespace.
(std::locale::encoding): Optimize for "C" and "*" names.

libstdc++: Use direct-initialization for std::vector<bool>'s allocator [PR115854]

The consensus in the standard committee is that this change shouldn't be
necessary, and the Allocator requirements should require conversions
between rebound allocators to be implicit. But we can make it work for
now anyway.

libstdc++-v3/ChangeLog:

PR libstdc++/115854
* include/bits/stl_bvector.h (_Bvector_base): Convert allocator
to rebound type explicitly.
* testsuite/23_containers/vector/allocator/115854.cc: New test.
* testsuite/23_containers/vector/bool/allocator/115854.cc: New test.

libstdc++: ranges::find needs explicit conversion to size_t [PR115799]

For an integer-class type we need to use an explicit conversion to size_t.

libstdc++-v3/ChangeLog:

PR libstdc++/115799
* include/bits/ranges_util.h (__find_fn): Make conversion
from difference type ti size_t explicit.
* testsuite/25_algorithms/find/bytes.cc: Check ranges::find with
__gnu_test::test_contiguous_range.
* testsuite/std/ranges/range.cc: Adjust expected difference_type
for __gnu_test::test_contiguous_range.
* testsuite/util/testsuite_iterators.h (contiguous_iterator_wrapper):
Use __max_diff_type as difference type.
(test_range::sentinel, test_sized_range_sized_sent::sentinel):
Ensure that operator- returns difference_type.

i386: Swap compare operands in ustrunc patterns

A last minute change led to a wrong operand order in the compare insn.

gcc/ChangeLog:

* config/i386/i386.md (ustruncdi<mode>2): Swap compare operands.
(ustruncsi<mode>2): Ditto.
(ustrunchiqi2): Ditto.

c++: remove Concepts TS code

In GCC 14 we deprecated Concepts TS and discussed removing the code
in GCC 15.  This patch removes Concepts TS code from the front end,
including support for template-introductions, as in:

  template<typename T>
  concept C = true;

  C{T} void foo (T); // write template<C T> void foo (T);

The biggest part of this patch is adjusting the testsuite.  We don't
want to lose coverage so I've converted most of -fconcepts-ts tests
to C++20.  That means they no longer have to be c++17_only.  Mostly
this meant turning "concept bool" into "concept" and turning function
concepts into C++20 concepts.  I've added missing "auto"s where
required, but "auto"s in template-argument-lists are not supported
anymore so I've removed some of the tests; some of them are still
present to verify we don't crash on such autos.  I've also added ()
around "requires" expressions.

I plan to add a porting_to.html entry with a few hints.

I've rebased and tested the patch after the recent r15-1103.

gcc/c-family/ChangeLog:

* c-cppbuiltin.cc (c_cpp_builtins): Remove flag_concepts_ts code.
* c-opts.cc (c_common_post_options): Likewise.
* c.opt: Remove -fconcepts-ts.
* c.opt.urls: Regenerate.

gcc/cp/ChangeLog:

* constraint.cc (deduce_concept_introduction, get_deduced_wildcard,
get_introduction_prototype, introduce_type_template_parameter,
introduce_template_template_parameter,
introduce_nontype_template_parameter,
build_introduced_template_parameter, introduce_template_parameter,
introduce_template_parameter_pack, introduce_template_parameter,
introduce_template_parameters, process_introduction_parms,
check_introduction_list, finish_template_introduction): Remove.
(finish_shorthand_constraint): Remove a Concepts TS comment.
* cp-tree.h (check_auto_in_tmpl_args, finish_template_introduction):
Remove.
* decl.cc (function_requirements_equivalent_p): Remove pre-C++20 code.
(grokfndecl): Don't check flag_concepts_ts.
(grokvardecl): Don't check that concept have type bool.
* parser.cc (cp_parser_decl_specifier_seq): Don't check
flag_concepts_ts.
(cp_parser_introduction_list): Remove.
(cp_parser_template_id): Remove dead code.
(cp_parser_simple_type_specifier): Don't check flag_concepts_ts.
(cp_parser_placeholder_type_specifier): Require require auto or
decltype(auto) even pre-C++20.  Don't check flag_concepts_ts.
(cp_parser_type_id_1): Don't check flag_concepts_ts.
(cp_parser_template_type_arg): Likewise.
(cp_parser_requires_clause_opt): Remove flag_concepts_ts code.
(cp_parser_compound_requirement): Don't check flag_concepts_ts.
(cp_parser_template_introduction): Remove.
(cp_parser_template_declaration_after_export): Don't call
cp_parser_template_introduction.
* pt.cc (template_heads_equivalent_p): Remove pre-C++20 code.
(find_parameter_pack_data): Remove type_pack_expansion_p.
(find_parameter_packs_r): Remove flag_concepts_ts code.  Remove
type_pack_expansion_p code.
(uses_parameter_packs): Remove type_pack_expansion_p code.
(make_pack_expansion): Likewise.
(check_for_bare_parameter_packs): Likewise.
(fixed_parameter_pack_p): Likewise.
(tsubst_qualified_id): Remove dead code.
(extract_autos_r): Remove.
(extract_autos): Remove.
(do_auto_deduction): Remove flag_concepts_ts code.
(type_uses_auto): Likewise.
(check_auto_in_tmpl_args): Remove.

gcc/ChangeLog:

* doc/invoke.texi: Mention that -fconcepts-ts was removed.

libstdc++-v3/ChangeLog:

* testsuite/std/ranges/access/101782.cc: Don't compile with
-fconcepts-ts.

gcc/testsuite/ChangeLog:

* g++.dg/concepts/auto3.C: Compile with -fconcepts.  Run in C++17 and
up.  Add dg-error.
* g++.dg/concepts/auto5.C: Likewise.
* g++.dg/concepts/auto7.C: Compile with -fconcepts.  Add dg-error.
* g++.dg/concepts/auto8a.C: Compile with -fconcepts.
* g++.dg/concepts/class-deduction1.C: Compile with -fconcepts.  Run in
C++17 and up.  Convert to C++20.
* g++.dg/concepts/class5.C: Likewise.
* g++.dg/concepts/class6.C: Likewise.
* g++.dg/concepts/debug1.C: Likewise.
* g++.dg/concepts/decl-diagnose.C: Compile with -fconcepts.  Run in
C++17 and up.  Add dg-error.
* g++.dg/concepts/deduction-constraint1.C: Compile with -fconcepts.
Run in C++17 and up.  Convert to C++20.
* g++.dg/concepts/diagnostic1.C: Likewise.
* g++.dg/concepts/dr1430.C: Likewise.
* g++.dg/concepts/equiv.C: Likewise.
* g++.dg/concepts/equiv2.C: Likewise.
* g++.dg/concepts/expression.C: Likewise.
* g++.dg/concepts/expression2.C: Likewise.
* g++.dg/concepts/expression3.C: Likewise.
* g++.dg/concepts/fn-concept2.C: Compile with -fconcepts.  Run in
C++17 and up.  Remove code.  Add dg-prune-output.
* g++.dg/concepts/fn-concept3.C: Compile with -fconcepts.  Run in
C++17 and up.  Convert to C++20.
* g++.dg/concepts/fn1.C: Likewise.
* g++.dg/concepts/fn10.C: Likewise.
* g++.dg/concepts/fn2.C: Likewise.
* g++.dg/concepts/fn3.C: Likewise.
* g++.dg/concepts/fn4.C: Likewise.
* g++.dg/concepts/fn5.C: Likewise.
* g++.dg/concepts/fn6.C: Likewise.
* g++.dg/concepts/fn7.C: Compile with -fconcepts.  Add dg-error.
* g++.dg/concepts/fn8.C: Compile with -fconcepts.  Run in C++17 and up.
Convert to C++20.
* g++.dg/concepts/fn9.C: Likewise.
* g++.dg/concepts/generic-fn-err.C: Likewise.
* g++.dg/concepts/generic-fn.C: Likewise.
* g++.dg/concepts/inherit-ctor1.C: Likewise.
* g++.dg/concepts/inherit-ctor3.C: Likewise.
* g++.dg/concepts/intro1.C: Likewise.
* g++.dg/concepts/locations1.C: Compile with -fconcepts.  Run in C++17
and up.  Add dg-prune-output.
* g++.dg/concepts/partial-concept-id1.C: Compile with -fconcepts.
Run in C++17 and up.  Convert to C++20.
* g++.dg/concepts/partial-concept-id2.C: Likewise.
* g++.dg/concepts/partial-spec5.C: Likewise.
* g++.dg/concepts/placeholder2.C: Likewise.
* g++.dg/concepts/placeholder3.C: Likewise.
* g++.dg/concepts/placeholder4.C: Likewise.
* g++.dg/concepts/placeholder5.C: Likewise.
* g++.dg/concepts/placeholder6.C: Likewise.
* g++.dg/concepts/pr65634.C: Likewise.
* g++.dg/concepts/pr65636.C: Likewise.
* g++.dg/concepts/pr65681.C: Likewise.
* g++.dg/concepts/pr65848.C: Likewise.
* g++.dg/concepts/pr67249.C: Likewise.
* g++.dg/concepts/pr67595.C: Likewise.
* g++.dg/concepts/pr68434.C: Likewise.
* g++.dg/concepts/pr71127.C: Likewise.
* g++.dg/concepts/pr71128.C: Compile with -fconcepts.  Run in C++17
and up.  Add dg-error.
* g++.dg/concepts/pr71131.C: Compile with -fconcepts.  Run in C++17
and up.  Convert to C++20.
* g++.dg/concepts/pr71385.C: Likewise.
* g++.dg/concepts/pr85065.C: Likewise.
* g++.dg/concepts/pr92804-2.C: Compile with -fconcepts.  Convert to
C++20.
* g++.dg/concepts/template-parm11.C: Compile with -fconcepts.  Run in
C++17 and up.  Convert to C++20.
* g++.dg/concepts/template-parm12.C: Likewise.
* g++.dg/concepts/template-parm2.C: Likewise.
* g++.dg/concepts/template-parm3.C: Likewise.
* g++.dg/concepts/template-parm4.C: Likewise.
* g++.dg/concepts/template-template-parm1.C: Likewise.
* g++.dg/concepts/var-concept1.C: Likewise.
* g++.dg/concepts/var-concept2.C: Likewise.
* g++.dg/concepts/var-concept3.C: Likewise.
* g++.dg/concepts/var-concept4.C: Likewise.
* g++.dg/concepts/var-concept5.C: Likewise.
* g++.dg/concepts/var-concept6.C: Likewise.
* g++.dg/concepts/var-concept7.C: Likewise.
* g++.dg/concepts/var-templ1.C: Run in C++17 and up.
* g++.dg/concepts/var-templ2.C: Compile with -fconcepts.  Run in C++17
and up.  Convert to C++20.
* g++.dg/concepts/var-templ3.C: Likewise.
* g++.dg/concepts/variadic1.C: Likewise.
* g++.dg/concepts/variadic2.C: Likewise.
* g++.dg/concepts/variadic3.C: Likewise.
* g++.dg/concepts/variadic4.C: Likewise.
* g++.dg/cpp2a/concepts-pr65575.C: Likewise.
* g++.dg/cpp2a/concepts-pr66091.C: Likewise.
* g++.dg/cpp2a/concepts-pr67148.C: Compile with -fconcepts.  Convert
to C++20.
* g++.dg/cpp2a/concepts-pr67225-1.C: Likewise.
* g++.dg/cpp2a/concepts-pr67225-2.C: Likewise.
* g++.dg/cpp2a/concepts-pr67225-3.C: Likewise.
* g++.dg/cpp2a/concepts-pr67225-4.C: Likewise.
* g++.dg/cpp2a/concepts-pr67225-5.C: Likewise.
* g++.dg/cpp2a/concepts-pr67319.C: Likewise.
* g++.dg/cpp2a/concepts-pr67427.C: Likewise.
* g++.dg/cpp2a/concepts-pr67654.C: Likewise.
* g++.dg/cpp2a/concepts-pr67658.C: Likewise.
* g++.dg/cpp2a/concepts-pr67684.C: Likewise.
* g++.dg/cpp2a/concepts-pr67697.C: Likewise.
* g++.dg/cpp2a/concepts-pr67719.C: Likewise.
* g++.dg/cpp2a/concepts-pr67774.C: Likewise.
* g++.dg/cpp2a/concepts-pr67825.C: Likewise.
* g++.dg/cpp2a/concepts-pr67860.C: Likewise.
* g++.dg/cpp2a/concepts-pr67862.C: Likewise.
* g++.dg/cpp2a/concepts-pr67969.C: Likewise.
* g++.dg/cpp2a/concepts-pr68093-2.C: Likewise.
* g++.dg/cpp2a/concepts-pr68372.C: Likewise.
* g++.dg/cpp2a/concepts-pr68812.C: Likewise.
* g++.dg/cpp2a/concepts-pr69235.C: Likewise.
* g++.dg/cpp2a/concepts-pr78752-2.C: Likewise.
* g++.dg/cpp2a/concepts-pr78752.C: Likewise.
* g++.dg/cpp2a/concepts-pr79759.C: Likewise.
* g++.dg/cpp2a/concepts-pr80746.C: Likewise.
* g++.dg/cpp2a/concepts-pr80773.C: Likewise.
* g++.dg/cpp2a/concepts-pr82507.C: Likewise.
* g++.dg/cpp2a/concepts-pr82740.C: Likewise.
* g++.dg/cpp2a/concepts-pr84980.C: Compile with -fconcepts.  Run in
C++17 and up.  Convert to C++20.
* g++.dg/cpp2a/concepts-pr85265.C: Likewise.
* g++.dg/cpp2a/concepts-pr85808.C: Compile with -fconcepts.  Convert
to C++20.
* g++.dg/cpp2a/concepts-pr86269.C: Likewise.
* g++.dg/cpp2a/concepts-pr87441.C: Likewise.
* g++.dg/cpp2a/concepts-requires5.C: Compile with -fconcepts.
Adjust dg-error.  Add same_as.
* g++.dg/cpp2a/nontype-class50a.C: Compile with -fconcepts.
* g++.dg/concepts/auto1.C: Removed.
* g++.dg/concepts/auto4.C: Removed.
* g++.dg/concepts/auto6.C: Removed.
* g++.dg/concepts/fn-concept1.C: Removed.
* g++.dg/concepts/intro2.C: Removed.
* g++.dg/concepts/intro3.C: Removed.
* g++.dg/concepts/intro4.C: Removed.
* g++.dg/concepts/intro5.C: Removed.
* g++.dg/concepts/intro6.C: Removed.
* g++.dg/concepts/intro7.C: Removed.
* g++.dg/cpp2a/concepts-ts1.C: Removed.
* g++.dg/cpp2a/concepts-ts2.C: Removed.
* g++.dg/cpp2a/concepts-ts3.C: Removed.
* g++.dg/cpp2a/concepts-ts4.C: Removed.
* g++.dg/cpp2a/concepts-ts5.C: Removed.
* g++.dg/cpp2a/concepts-ts6.C: Removed.

c: ICE with invalid sizeof [PR115642]

Here we ICE in c_expr_sizeof_expr on an erroneous expr.value. The
code checks for expr.value == error_mark_node but here the e_m_n is
wrapped in a C_MAYBE_CONST_EXPR. I don't think we should have created
such a tree, so let's return earlier in c_cast_expr.

PR c/115642

gcc/c/ChangeLog:

* c-typeck.cc (c_cast_expr): Return error_mark_node if build_c_cast
failed.

gcc/testsuite/ChangeLog:

* gcc.dg/noncompile/sizeof-1.c: New test.

c: ICE on invalid with attribute optimize [PR115549]

I had this PR in my open tabs so why not go ahead and fix it.

decl_attributes gets last_decl, the last already pushed declaration,
to be used in common_handle_aligned_attribute. In C++, we look up
the decl via find_last_decl, which returns NULL_TREE if it finds
a decl that had not been declared. In C, we look up the decl via
lookup_last_decl which returns error_mark_node rather than NULL_TREE
in that case.

The error_mark_node causes a crash in common_handle_aligned_attribute.
We can fix this on the C FE side like in the patch below.

PR c/115549

gcc/c/ChangeLog:

* c-decl.cc (c_decl_attributes): If lookup_last_decl returns
error_mark_node, use NULL_TREE as last_decl.

gcc/testsuite/ChangeLog:

* c-c++-common/attr-aligned-2.c: New test.

testsuite: Align testcase with implementation [PR105090]

Since r13-1006-g2005b9b888eeac, the test case copysign_softfloat_1.c
no longer contains any lsr istruction, so drop the check as per
comment 9 in PR105090.

gcc/testsuite/ChangeLog:

PR target/105090
* gcc.target/arm/copysign_softfloat_1.c: Drop check for lsr

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>

RISC-V: Update testsuite to use b

Update all instances of zba_zbb_zbs in the testsuite to use b instead

gcc/testsuite/ChangeLog:

* g++.target/riscv/redundant-bitmap-1.C: Use gcb instead of
zba_zbb_zbs
* g++.target/riscv/redundant-bitmap-2.C: Ditto
* g++.target/riscv/redundant-bitmap-3.C: Ditto
* g++.target/riscv/redundant-bitmap-4.C: Ditto
* gcc.target/riscv/shift-add-1.c: Ditto
* gcc.target/riscv/shift-add-2.c: Ditto
* gcc.target/riscv/synthesis-1.c: Ditto
* gcc.target/riscv/synthesis-2.c: Ditto
* gcc.target/riscv/synthesis-3.c: Ditto
* gcc.target/riscv/synthesis-4.c: Ditto
* gcc.target/riscv/synthesis-5.c: Ditto
* gcc.target/riscv/synthesis-6.c: Ditto
* gcc.target/riscv/synthesis-7.c: Ditto
* gcc.target/riscv/synthesis-8.c: Ditto
* gcc.target/riscv/zba_zbs_and-1.c: Ditto
* gcc.target/riscv/zbs-zext-3.c: Ditto
* lib/target-supports.exp: Add b to riscv_get_arch

Signed-off-by: Edwin Lu <ewlu@rivosinc.com>

RISC-V: Add support for B standard extension

This patch adds support for recognizing the B standard extension to be the
collection of Zba, Zbb, Zbs extensions for consistency and conciseness
across toolchains

https://github.com/riscv/riscv-b/tags

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc: Add imply rules for B extension
* config/riscv/arch-canonicalize: Ditto

Signed-off-by: Edwin Lu <ewlu@rivosinc.com>

internal-fn: Reuse SUBREG_PROMOTED_VAR_P handling

expand_fn_using_insn has code to handle SUBREG_PROMOTED_VAR_P
destinations.  Specifically, for:

  (subreg/v:M1 (reg:M2 R) ...)

it creates a new temporary register T, uses it for the output
operand, then sign- or zero-extends the M1 lowpart of T to M2,
storing the result in R.

This patch splits this handling out into helper routines and
uses them for other instances of:

  if (!rtx_equal_p (target, ops[0].value))
    emit_move_insn (target, ops[0].value);

It's quite probable that this doesn't help any of the other cases;
in particular, it shouldn't affect vectors.  But I think it could
be useful for the CRC work.

gcc/
* internal-fn.cc (create_call_lhs_operand, assign_call_lhs): New
functions, split out from...
(expand_fn_using_insn): ...here.
(expand_load_lanes_optab_fn): Use them.
(expand_GOMP_SIMT_ENTER_ALLOC): Likewise.
(expand_GOMP_SIMT_LAST_LANE): Likewise.
(expand_GOMP_SIMT_ORDERED_PRED): Likewise.
(expand_GOMP_SIMT_VOTE_ANY): Likewise.
(expand_GOMP_SIMT_XCHG_BFLY): Likewise.
(expand_GOMP_SIMT_XCHG_IDX): Likewise.
(expand_partial_load_optab_fn): Likewise.
(expand_vec_cond_optab_fn): Likewise.
(expand_vec_cond_mask_optab_fn): Likewise.
(expand_RAWMEMCHR): Likewise.
(expand_gather_load_optab_fn): Likewise.
(expand_while_optab_fn): Likewise.
(expand_SPACESHIP): Likewise.

c++: array new with value-initialization [PR115645]

This extends the r11-5179 fix which doesn't work with multidimensional
arrays.  In particular,

  struct S {
    explicit S() { }
  };
  auto p = new S[1][1]();

should not say "converting to S from initializer list would use
explicit constructor" because there's no {}.  However, since we
went into the block where we create a {}, we got confused.  We
should not have gotten there but we did because array_p was true.

This patch refines the check once more.

PR c++/115645

gcc/cp/ChangeLog:

* init.cc (build_new): Don't do any deduction for arrays with
bounds if it's value-initialized.

gcc/testsuite/ChangeLog:

* g++.dg/expr/anew7.C: New test.

recog: Handle some mode-changing hardreg propagations

insn_propagation would previously only replace (reg:M H) with X
for some hard register H if the uses of H were also in mode M.
This patch extends it to handle simple mode punning too.

The original motivation was to try to get rid of the execution
frequency test in aarch64_split_simd_shift_p, but doing that is
follow-up work.

I tried this on at least one target per CPU directory (as for
the late-combine patches) and it seems to be a small win for
all of them.

The patch includes a couple of updates to the ia32 results.
In pr105033.c, foo3 replaced:

       vmovq       8(%esp), %xmm1
       vpunpcklqdq %xmm1, %xmm0, %xmm0

with:

       vmovhps     8(%esp), %xmm0, %xmm0

In vect-bfloat16-2b.c, 5 of the vec_extract_v32bf_* routines
(specifically the ones with nonzero even indices) replaced
things like:

       movl    28(%esp), %eax
       vmovd   %eax, %xmm0

with:

       vpinsrw $0, 28(%esp), %xmm0, %xmm0

(These functions return a bf16, and so only the low 16 bits matter.)

gcc/
* recog.cc (insn_propagation::apply_to_rvalue_1): Handle simple
cases of hardreg propagation in which the register is set and
used in different modes.

gcc/testsuite/
* gcc.target/i386/pr105033.c: Expect vmovhps for the ia32 version
of foo.
* gcc.target/i386/vect-bfloat16-2b.c: Expect more vpinsrws.

rtl-ssa: Add replace_nondebug_insn [PR115785]

change_insns is used to change multiple instructions at once, so that
the IR on return is valid & self-consistent.  These changes can involve
moving instructions, and the new position for one instruction might
be expressed in terms of the old position of another instruction
that is changing at the same time.

change_insns therefore adds placeholder instructions to mark each
new instruction position, then replaces each placeholder with the
corresponding real instruction.  This replacement was done in two
steps: removing the old placeholder instruction and inserting the new
real instruction.  But it's more convenient for the upcoming fix for
PR115785 if we do the operation as a single step.  That should also
be slightly more efficient, since e.g. no splay tree operations are
needed.

This operation happens purely on the rtl-ssa instruction chain.
The placeholders are never represented in rtl.

gcc/
PR rtl-optimization/115785
* rtl-ssa/functions.h (function_info::replace_nondebug_insn): Declare.
* rtl-ssa/insns.h (insn_info::order_node::set_uid): New function.
(insn_info::remove_note): Declare.
* rtl-ssa/insns.cc (insn_info::remove_note): New function.
(function_info::replace_nondebug_insn): Likewise.
* rtl-ssa/changes.cc (function_info::change_insns): Use
replace_nondebug_insn instead of remove_insn + add_insn.

fixincludes: skip stdio_stdarg_h on darwin

The fix is unnecessary on macOS.

fixincludes/ChangeLog:

* fixincl.x: Regenerate.
* inclhack.def (stdio_stdarg_h): Skip on darwin.

c++, contracts: Fix ICE in create_tmp_var [PR113968]

During contract parsing, in grok_contract(), we proceed even if the
condition contains errors. This results in contracts with embedded errors
which eventually confuse gimplify. Checks for errors have been added in
grok_contract() to exit early if an error is encountered.

PR c++/113968

gcc/cp/ChangeLog:

* contracts.cc (grok_contract): Check for error_mark_node early
exit.

gcc/testsuite/ChangeLog:

* g++.dg/contracts/pr113968.C: New test.

Signed-off-by: Nina Ranns <dinka.ranns@gmail.com>

fixincludes: add bypass to darwin_objc_runtime_1

Headers are now fixed in the macOS 15 SDK, and the fix should be
bypassed there.

fixincludes/ChangeLog:

* fixincl.x: Regenerate.
* inclhack.def (darwin_objc_runtime_1): Add bypass.

PR modula2/115823 Wrong expansion of isnormal optab

The bug fix changes gcc/m2/gm2-gcc/m2builtins.c:m2builtins_BuiltinExists
to recognise both __builtin_<functionname> and functionname as a builtin.

gcc/m2/ChangeLog:

PR modula2/115823
* gm2-gcc/m2builtins.cc (struct builtin_macro_definition): New
field builtinname.
(builtin_function_match): New function.
(builtin_macro_match): Ditto.
(m2builtins_BuiltinExists): Use builtin_function_match and
builtin_macro_match.
(lookup_builtin_macro): Use builtin_macro_match.
(lookup_builtin_function): Use builtin_function_match.
(define_builtin): Assign builtinname field.

gcc/testsuite/ChangeLog:

PR modula2/115823
* gm2/builtins/run/pass/testalloa.mod: New test.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

middle-end: Fix stalled swapped condition code value [PR115836]

emit_store_flag_1 calculates scode (swapped condition code) at the
beginning of the function from the value of code variable. However,
code variable may change before scode usage site, resulting in
invalid stalled scode value.

Move calculation of scode value just before its only usage site to
avoid stalled scode value.

PR middle-end/115836

gcc/ChangeLog:

* expmed.cc (emit_store_flag_1): Move calculation of
scode just before its only usage site.

arm: cleanup legacy ARM_PE code

The arm 'pe' target was removed back in 2012 when the FPA support was
removed, but in a small number of places some conditional code was
accidentally left behind. It's no-longer needed, so remove it.

gcc/ChangeLog:

* config/arm/arm-protos.h (arm_dllexport_name_p): Remove prototype.
(arm_dllimport_name_p): Likewise.
(arm_pe_unique_section): Likewise.
(arm_pe_encode_section_info): Likewise.
(arm_dllexport_p): Likewise.
(arm_dllimport_p): Likewise.
(arm_mark_dllexport): Likewise.
(arm_mark_dllimport): Likewise.
(arm_change_mode_p): Likewise.
* config/arm/arm.cc (arm_gnu_attributes): Remove attributes for ARM_PE.
(TARGET_ENCODE_SECTION_INFO): Remove setting for ARM_PE.
(is_called_in_ARM_mode): Remove ARM_PE conditional code.
(thumb1_output_interwork): Remove obsolete ARM_PE code.
(arm_encode_section_info): Remove surrounding #ifndef.

[PR115394] Remove streamer_debugging and it's uses.

gcc/ChangeLog:
PR lto/115394
* lto-streamer.h: Remove streamer_debugging definition.
* lto-streamer-out.cc (stream_write_tree_ref): Remove use of streamer_debugging.
(lto_output_tree): Likewise.
* tree-streamer-in.cc (streamer_read_tree_bitfields): Likewise.
(streamer_get_pickled_tree): Likewise.
* tree-streamer-out.cc (pack_ts_base_value_fields): Likewise.

Signed-off-by: Prathamesh Kulkarni <prathameshk@nvidia.com>

Match: Support form 2 for the .SAT_TRUNC

This patch would like to add form 2 support for the .SAT_TRUNC.  Aka:

Form 2:
  #define DEF_SAT_U_TRUC_FMT_2(NT, WT)     \
  NT __attribute__((noinline))             \
  sat_u_truc_##WT##_to_##NT##_fmt_2 (WT x) \
  {                                        \
    bool overflow = x > (WT)(NT)(-1);      \
    return overflow ? (NT)-1 : (NT)x;      \
  }

DEF_SAT_U_TRUC_FMT_2(uint32, uint64)

Before this patch:
   3   │
   4   │ __attribute__((noinline))
   5   │ uint32_t sat_u_truc_uint64_t_to_uint32_t_fmt_2 (uint64_t x)
   6   │ {
   7   │   uint32_t _1;
   8   │   long unsigned int _3;
   9   │
  10   │ ;;   basic block 2, loop depth 0
  11   │ ;;    pred:       ENTRY
  12   │   _3 = MIN_EXPR <x_2(D), 4294967295>;
  13   │   _1 = (uint32_t) _3;
  14   │   return _1;
  15   │ ;;    succ:       EXIT
  16   │
  17   │ }

After this patch:
   3   │
   4   │ __attribute__((noinline))
   5   │ uint32_t sat_u_truc_uint64_t_to_uint32_t_fmt_2 (uint64_t x)
   6   │ {
   7   │   uint32_t _1;
   8   │
   9   │ ;;   basic block 2, loop depth 0
  10   │ ;;    pred:       ENTRY
  11   │   _1 = .SAT_TRUNC (x_2(D)); [tail call]
  12   │   return _1;
  13   │ ;;    succ:       EXIT
  14   │
  15   │ }

The below test suites are passed for this patch:
1. The x86 bootstrap test.
2. The x86 fully regression test.
3. The rv64gcv fully regresssion test.

gcc/ChangeLog:

* match.pd: Add form 2 for .SAT_TRUNC.
* tree-ssa-math-opts.cc (math_opts_dom_walker::after_dom_children):
Add new case NOP_EXPR,  and try to match SAT_TRUNC.

Signed-off-by: Pan Li <pan2.li@intel.com>

testsuite: Tests the pattern folding x/sqrt(x) to sqrt(x) for Float16

As a follow-up to adding a pattern that folds x/sqrt(x) to sqrt(x) in match.pd, this patch adds a test case for type Float16 for armv8.2-a+fp16.

The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.

Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com>
gcc/testsuite/

* gcc.target/aarch64/sqrt_div_float16.c: New test.

testsuite: Allow matching `{_1, { 0,0,0,0 }}` for vect/slp-gap-1.c

While working on adding V4QI support to the aarch64 backend,
vect/slp-gap-1.c started to fail but only because the regex
was failing. Before it was loading use SI (int) and afterwards,
we started to use V4QI. The generated code was the same and the
generated gimple was almost the same. The regex was searching
for `zero-padding trick` and it was still doing that but instead
of directly 0, it was V4QI 0 (or rather `{ 0, 0, 0 }`).
This extends regex to support both.

Tested on x86_64-linux-gnu and aarch64-linux-gnu (with the support added).

gcc/testsuite/ChangeLog:

* gcc.dg/vect/slp-gap-1.c: Support matching `{_1, { 0, 0, 0, 0 }}`
in addition to `{_1, 0}`.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

Remove expanding complex EQ/NE inside a GIMPLE_RETURN [PR115721]

This code has been dead at least since the move over to tuples
in 0-88576-g726a989a8b74bf, when gimple returns could only have
a simple expression in it. So let's remove it.

Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

PR tree-optimization/115721
* tree-complex.cc (expand_complex_comparison): Remove
support for GIMPLE_RETURN.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

RISC-V: fix zcmp popretz [PR113715]

No functional changes compared with V1, just spaces to table conversion
in testcases to pass check-function-bodies.

V1 passed regression locally but suprisingly failed in pre-commit CI, after
picking the patch from patchwork, I realize table got coverted to spaces
before sending the patch.

Root cause:
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=b27d323a368033f0b37e93c57a57a35fd9997864
Commit above tries in targetm.gen_epilogue () to detect if
there's li a0,0 insn at the end of insn chain, if so, cm.popret
is replaced by cm.popretz and li a0,0 insn is deleted.

Insertion of the generated epilogue sequence
into the insn chain doesn't happen at this moment.
If later shrink-wrap decides NOT to insert the epilogue sequence at the end
of insn chain, then the li a0,0 insn has already been mistakeny removed.

Fix this issue by removing generation of cm.popretz in epilogue,
leaving the assignment to a0 and use insn with cm.popret.

That's likely going to result in some kind of code size regression,
but not a correctness regression.

Optimization can be done in future.

Signed-off-by: Fei Gao <gaofei@eswincomputing.com>
gcc/ChangeLog:
PR target/113715

* config/riscv/riscv.cc (riscv_zcmp_can_use_popretz): Removed.
(riscv_gen_multi_pop_insn): Remove generation of cm.popretz.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rv32e_zcmp.c: Adapt TC.
* gcc.target/riscv/rv32i_zcmp.c: Likewise.

Daily bump.

Fix test errors after r15-1394 for sizeof(int)==sizeof(long) [PR115545]

Some tests added to test the type of redeclarations of enumerators
in r15-1394 fail on architectures where sizeof(long) == sizeof(int).
Adapt tests to use long long and/or accept that long long is selected
as type for the enumerator.

PR testsuite/115545

gcc/testsuite/

* gcc.dg/pr115109.c: Adapt test.
* gcc.dg/c23-tag-enum-6.c: Adapt test.
* gcc.dg/c23-tag-enum-7.c: Adapt test.

c: Fix ICE for redeclaration of structs with different alignment [PR114727]

For redeclarations of struct in C23, if one has an alignment attribute
that makes the alignment different, we later get an ICE in verify_types.
This patches disallows such redeclarations by declaring such types to
be different.

PR c/114727

gcc/c/
* c-typeck.cc (tagged_types_tu_compatible): Add test.

gcc/testsuite/
* gcc.dg/pr114727.c: New test.

c: Fix ICE for incorrect code in comptypes_verify [PR115696]

The new verification code produces an ICE for incorrect code. Add the
same logic as already used in comptypes to to bail out under certain
conditions.

PR c/115696

gcc/c/
* c-typeck.cc (comptypes_verify): Bail out for
identical, empty, and erroneous input types.

gcc/testsuite/
* gcc.dg/pr115696.c: New test.

rs6000, remove vector set and vector init built-ins.

The vector init built-ins:

  __builtin_vec_init_v16qi, __builtin_vec_init_v8hi,
  __builtin_vec_init_v4si, __builtin_vec_init_v4sf,
  __builtin_vec_init_v2di, __builtin_vec_init_v2df,
  __builtin_vec_init_v1ti

perform the same operation as initializing the vector in C code.  For
example:

  result_v4si = __builtin_vec_init_v4si (1, 2, 3, 4);
  result_v4si = {1, 2, 3, 4};

These two constructs were tested and verified they generate identical
assembly instructions with no optimization and -O3 optimization.

The vector set built-ins:

  __builtin_vec_set_v16qi, __builtin_vec_set_v8hi.
  __builtin_vec_set_v4si, __builtin_vec_set_v4sf,
  __builtin_vec_set_v1ti, __builtin_vec_set_v2di,
  __builtin_vec_set_v2df

perform the same operation as setting a specific element in the vector in
C code.  For example:

  src_v4si = __builtin_vec_set_v4si (src_v4si, int_val, index);
  src_v4si[index] = int_val;

The built-in actually generates more instructions than the inline C code
with no optimization but is identical with -O3 optimizations.

All of the above built-ins that are removed do not have test cases and
are not documented.

Built-ins __builtin_vec_set_v1ti __builtin_vec_set_v2di,
__builtin_vec_set_v2df are not removed as they are used in function
resolve_vec_insert() in file rs6000-c.cc.

The built-ins are removed as they don't provide any benefit over just
using C code.

The code to define the bif_init_bit, bif_is_init, as well as their uses
are removed.  The function altivec_expand_vec_init_builtin is also removed.

gcc/ChangeLog:
* config/rs6000/rs6000-builtin.cc (altivec_expand_vec_init_builtin):
Remove the function.
(rs6000_expand_builtin): Remove the if bif_is_int check to call
the altivec_expand_vec_init_builtin function.
* config/rs6000/rs6000-builtins.def: Remove the attribute string
comment for init.
(__builtin_vec_init_v16qi,
__builtin_vec_init_v4sf, __builtin_vec_init_v4si,
__builtin_vec_init_v8hi, __builtin_vec_init_v1ti,
__builtin_vec_init_v2df, __builtin_vec_init_v2di,
__builtin_vec_set_v16qi, __builtin_vec_set_v4sf,
__builtin_vec_set_v4si, __builtin_vec_set_v8hi): Remove
built-in definitions.
* config/rs6000/rs6000-gen-builtins.cc: Remove comment for init
attribute string.
(struct attrinfo): Remove isinit entry.
(parse_bif_attrs): Remove the if statement to check for attribute
init.
(ifdef DEBUG): Remove print for init attribute string.
(write_decls): Remove print for define bif_init_bit and
define for bif_is_init.
(write_bif_static_init): Remove if bifp->attrs.isinit statement.

rs6000, remove __builtin_vsx_xvcmpeqsp_p built-in

The built-in __builtin_vsx_xvcmpeqsp_p is a duplicate of the overloaded
__builtin_altivec_vcmpeqfp_p built-in. The built-in is undocumented and
there are no test cases for it. The patch removes built-in
__builtin_vsx_xvcmpeqsp_p.

gcc/ChangeLog:
* config/rs6000/rs6000-builtins.def (__builtin_vsx_xvcmpeqsp_p):
Remove built-in definition.

rs6000, extend vec_xxpermdi built-in for __int128 args

Add a new signed and unsigned int128 overloaded vector instances for
vec_xxpermdi:

__int128 vec_xxpermdi (__int128, __int128, const int);
__uint128 vec_xxpermdi (__uint128, __uint128, const int);

Update the documentation to include a reference to the new vector built-in
instances of vec_xxpermdi.

Add test cases for the new overloaded instances.

gcc/ChangeLog:
* config/rs6000/rs6000-overload.def (vec_xxpermdi): Add new
overloaded built-in instances of vector signed and unsigned
int128.
* doc/extend.texi: Add documentation for built-in instances of
vector signed and unsigned int128.

gcc/testsuite/ChangeLog:gcc/testsuite/ChangeLog:
* gcc.target/powerpc/vec_perm-runnable-i128.c: New test file.

rs6000, remove __builtin_vsx_xvnegdp and __builtin_vsx_xvnegsp built-ins

The undocumented __builtin_vsx_xvnegdp and __builtin_vsx_xvnegsp are
redundant. The overloaded vec_neg built-in provides the same
functionality. The two built-ins are not documented nor are there any
test cases for them.

Remove the definitions so users will use the overloaded vec_neg built-in
which is documented in the PVIPR.

gcc/ChangeLog:
* config/rs6000/rs6000-builtins.def (__builtin_vsx_xvnegdp,
__builtin_vsx_xvnegsp): Remove built-in definitions.

rs6000, remove __builtin_vsx_vperm_* built-ins

The undocumented built-ins:
  __builtin_vsx_vperm_16qi_uns,
  __builtin_vsx_vperm_1ti,
  __builtin_vsx_vperm_1ti_uns,
  __builtin_vsx_vperm_2df,
  __builtin_vsx_vperm_2di,
  __builtin_vsx_vperm_2di_uns,
  __builtin_vsx_vperm_4sf,
  __builtin_vsx_vperm_4si,
  __builtin_vsx_vperm_4si_uns

are duplicats of the __builtin_altivec_* built-ins that are used by
the overloaded vec_perm built-in that is documented in the PVIPR.

gcc/ChangeLog:
* config/rs6000/rs6000-builtins.def (__builtin_vsx_vperm_16qi_uns,
__builtin_vsx_vperm_1ti, __builtin_vsx_vperm_1ti_uns,
__builtin_vsx_vperm_2df, __builtin_vsx_vperm_2di,
__builtin_vsx_vperm_2di_uns, __builtin_vsx_vperm_4sf,
__builtin_vsx_vperm_4si, __builtin_vsx_vperm_4si_uns): Remove
built-in definitions and comments.

gcc/testsuite/ChangeLog:
* gcc.target/powerpc/vsx-builtin-3.c (__builtin_vsx_vperm_16qi_uns,
__builtin_vsx_vperm_1ti, __builtin_vsx_vperm_1ti_uns,
__builtin_vsx_vperm_2df, __builtin_vsx_vperm_2di,
__builtin_vsx_vperm_2di_uns, __builtin_vsx_vperm_4sf,
__builtin_vsx_vperm_4si, __builtin_vsx_vperm_4si_uns,
__builtin_vsx_vperm): Change call to built-in to the  overloaded
built-in vec_perm.

rs6000, remove the vec_xxsel built-ins, they are duplicates

The following undocumented built-ins are covered by the existing overloaded
vec_sel built-in definitions.

  const vsc __builtin_vsx_xxsel_16qi (vsc, vsc, vsc);
same as vsc __builtin_vec_sel (vsc, vsc, vuc);  (overloaded vec_sel)

  const vuc __builtin_vsx_xxsel_16qi_uns (vuc, vuc, vuc);
same as vuc __builtin_vec_sel (vuc, vuc, vuc);  (overloaded vec_sel)

  const vd __builtin_vsx_xxsel_2df (vd, vd, vd);
same as  vd __builtin_vec_sel (vd, vd, vull);   (overloaded vec_sel)

  const vsll __builtin_vsx_xxsel_2di (vsll, vsll, vsll);
same as vsll __builtin_vec_sel (vsll, vsll, vsll);  (overloaded vec_sel)

  const vull __builtin_vsx_xxsel_2di_uns (vull, vull, vull);
same as vull __builtin_vec_sel (vull, vull, vsll);  (overloaded vec_sel)

  const vf __builtin_vsx_xxsel_4sf (vf, vf, vf);
same as vf __builtin_vec_sel (vf, vf, vsi)          (overloaded vec_sel)

  const vsi __builtin_vsx_xxsel_4si (vsi, vsi, vsi);
same as vsi __builtin_vec_sel (vsi, vsi, vbi);      (overloaded vec_sel)

  const vui __builtin_vsx_xxsel_4si_uns (vui, vui, vui);
same as vui __builtin_vec_sel (vui, vui, vui);      (overloaded vec_sel)

  const vss __builtin_vsx_xxsel_8hi (vss, vss, vss);
same as vss __builtin_vec_sel (vss, vss, vbs);      (overloaded vec_sel)

  const vus __builtin_vsx_xxsel_8hi_uns (vus, vus, vus);
same as vus __builtin_vec_sel (vus, vus, vus);      (overloaded vec_sel)

This patch removed the duplicate built-in definitions so users will only
use the documented vec_sel built-in.  The __builtin_vsx_xxsel_[4si, 8hi,
16qi, 4sf, 2df] tests are also removed.

gcc/ChangeLog:
* config/rs6000/rs6000-builtins.def (__builtin_vsx_xxsel_16qi,
__builtin_vsx_xxsel_16qi_uns, __builtin_vsx_xxsel_2df,
__builtin_vsx_xxsel_2di, __builtin_vsx_xxsel_2di_uns,
__builtin_vsx_xxsel_4sf, __builtin_vsx_xxsel_4si,
__builtin_vsx_xxsel_4si_uns, __builtin_vsx_xxsel_8hi,
__builtin_vsx_xxsel_8hi_uns): Remove built-in definitions.

gcc/testsuite/ChangeLog:
* gcc.target/powerpc/vsx-builtin-3.c (__builtin_vsx_xxsel_4si,
__builtin_vsx_xxsel_8hi, __builtin_vsx_xxsel_16qi,
__builtin_vsx_xxsel_4sf, __builtin_vsx_xxsel_2df,
__builtin_vsx_xxsel): Change built-in call to overloaded built-in
call vec_sel.

rs6000, add overloaded vec_sel with int128 arguments

Extend the vec_sel built-in to take three signed/unsigned/bool int128
arguments and return a signed/unsigned/bool int128 result.

Extending the vec_sel built-in makes the existing buit-ins
__builtin_vsx_xxsel_1ti and __builtin_vsx_xxsel_1ti_uns obsolete. The
patch removes these built-ins.

The patch adds documentation and test cases for the new overloaded
vec_sel built-ins.

gcc/ChangeLog:
* config/rs6000/rs6000-builtins.def (__builtin_vsx_xxsel_1ti,
__builtin_vsx_xxsel_1ti_uns): Remove built-in definitions.
* config/rs6000/rs6000-overload.def (vec_sel): Add new
overloaded vector signed, unsigned and bool 128-bit definitions.
* doc/extend.texi (vec_sel): Add documentation for new instances
with signed, unsigned and bool 129-bit bool arguments.

gcc/testsuite/ChangeLog:
* gcc.target/powerpc/builtins-10-runnable.c: New runnable test
file.
* gcc.target/powerpc/builtins-10.c: New compile only test file.

rs6000, remove duplicated built-ins of vecmergl and vec_mergeh

The following undocumented built-ins are same as existing documented
overloaded builtins.

  const vf __builtin_vsx_xxmrghw (vf, vf);
same as  vf __builtin_vec_mergeh (vf, vf);     (overloaded vec_mergeh)

  const vsi __builtin_vsx_xxmrghw_4si (vsi, vsi);
same as vsi __builtin_vec_mergeh (vsi, vsi);   (overloaded vec_mergeh)

  const vf __builtin_vsx_xxmrglw (vf, vf);
same as vf __builtin_vec_mergel (vf, vf);      (overloaded vec_mergel)

  const vsi __builtin_vsx_xxmrglw_4si (vsi, vsi);
same as vsi __builtin_vec_mergel (vsi, vsi);   (overloaded vec_mergel)

This patch removes the duplicate built-in definitions so only the
documented built-ins will be available for use.  The case statements in
rs6000_gimple_fold_builtin are removed as they are no longer needed.  The
patch removes the now unused define_expands for vsx_xxmrghw_<mode> and
vsx_xxmrglw_<mode>.

gcc/ChangeLog:
* config/rs6000/rs6000-builtins.def (__builtin_vsx_xxmrghw,
__builtin_vsx_xxmrghw_4si, __builtin_vsx_xxmrglw,
__builtin_vsx_xxmrglw_4si, __builtin_vsx_xxsel_16qi): Remove
built-in definition.
* config/rs6000/rs6000-builtin.cc (rs6000_gimple_fold_builtin):
remove case entries RS6000_BIF_XXMRGLW_4SI,
RS6000_BIF_XXMRGLW_4SF, RS6000_BIF_XXMRGHW_4SI,
RS6000_BIF_XXMRGHW_4SF.
* config/rs6000/vsx.md (vsx_xxmrghw_<mode>, vsx_xxmrglw_<mode>):
Remove unused define_expands.

rs6000, Remove redundant vector float/double type conversions

The following built-ins are redundant as they are covered by another
overloaded built-in.

  __builtin_vsx_xvcvspdp covered by vec_double{e,o}
  __builtin_vsx_xvcvdpsp covered by vec_float{e,o}
  __builtin_vsx_xvcvsxwdp covered by vec_double{e,o}
  __builtin_vsx_xvcvuxddp_uns covered by vec_double

Remove the redundant built-ins. They are not documented nor do they have
test cases.

gcc/ChangeLog:
* config/rs6000/rs6000-builtins.def (__builtin_vsx_xvcvspdp,
__builtin_vsx_xvcvdpsp, __builtin_vsx_xvcvsxwdp,
__builtin_vsx_xvcvuxddp_uns): Remove.

rs6000, extend the current vec_{un,}signed{e,o} built-ins

The built-ins __builtin_vsx_xvcvspsxds and __builtin_vsx_xvcvspuxds
convert a vector of floats to a vector of signed/unsigned long long ints.
Extend the existing vec_{un,}signed{e,o} built-ins to handle the argument
vector of floats to return a vector of even/odd signed/unsigned integers.

The define expands vsignede_v4sf, vsignedo_v4sf, vunsignede_v4sf,
vunsignedo_v4sf are added to support the new vec_{un,}signed{e,o}
built-ins.

The built-ins __builtin_vsx_xvcvspsxds and __builtin_vsx_xvcvspuxds are
now for internal use only. They are not documented and they do not
have test cases.

Add testcases and update documentation.

gcc/ChangeLog:
* config/rs6000/rs6000-builtins.def (__builtin_vsx_xvcvspsxds,
__builtin_vsx_xvcvspuxds): Rename to __builtin_vsignede_v4sf,
__builtin_vunsignede_v4sf respectively.
(XVCVSPSXDS, XVCVSPUXDS): Rename to VEC_VSIGNEDE_V4SF,
VEC_VUNSIGNEDE_V4SF respectively.
(__builtin_vsignedo_v4sf, __builtin_vunsignedo_v4sf): New
built-in definitions.
* config/rs6000/rs6000-overload.def (vec_signede, vec_signedo,
vec_unsignede, vec_unsignedo): Add new overloaded specifications.
* config/rs6000/vsx.md (vsignede_v4sf, vsignedo_v4sf,
vunsignede_v4sf, vunsignedo_v4sf): New define_expands.
* doc/extend.texi (vec_signedo, vec_signede, vec_unsignedo,
vec_unsignede): Add documentation for new overloaded built-ins to
convert vector float to vector {un,}signed long long.

gcc/testsuite/ChangeLog:
* gcc.target/powerpc/builtins-3-runnable.c
(test_unsigned_int_result, test_ll_unsigned_int_result): Add
new argument.
(vec_signede, vec_signedo, vec_unsignede, vec_unsignedo): New
tests for the overloaded built-ins.

rs6000, fix error in unsigned vector float to unsigned int built-in definitions

The built-in __builtin_vsx_vunsigned_v2df is supposed to take a vector of
doubles and return a vector of unsigned long long ints.  Similarly
__builtin_vsx_vunsigned_v4sf takes a vector of floats an is supposed to
return a vector of unsinged ints.  The definitions are using the signed
version of the instructions not the unsigned version of the instruction.
The results should also be unsigned.  The built-ins are used by the
overloaded vec_unsigned built-in which has an unsigned result.

Similarly the built-ins __builtin_vsx_vunsignede_v2df and
__builtin_vsx_vunsignedo_v2df are supposed to return an unsigned result.
If the floating point argument is negative, the unsigned result is zero.
The built-ins are used in the overloaded built-in vec_unsignede and
vec_unsignedo respectively.

Add a test cases for a negative floating point arguments for each of the
above built-ins.

gcc/ChangeLog:
* config/rs6000/rs6000-builtins.def (__builtin_vsx_vunsigned_v2df,
__builtin_vsx_vunsigned_v4sf, __builtin_vsx_vunsignede_v2df,
__builtin_vsx_vunsignedo_v2df): Change the result type to unsigned.

gcc/testsuite/ChangeLog:
* gcc.target/powerpc/builtins-3-runnable.c: Add tests for
vec_unsignede and vec_unsignedo with negative arguments.

rs6000, Remove __builtin_vsx_xvcv{sp{sx,u}ws,dpuxds_uns}

The built-in __builtin_vsx_xvcvspsxws is covered by built-in vec_signed
built-in that is documented in the PVIPR. The __builtin_vsx_xvcvspsxws
built-in is not documented and there are no test cases for it.

The built-in __builtin_vsx_xvcvdpuxds_uns is redundant as it is covered by
vec_unsigned, remove.

The __builtin_vsx_xvcvspuxws is redundant as it is covered by
vec_unsigned, remove.

The built-in __builtin_vsx_xvcvdpsxws is redundant as it is covered by
vec_signed{e,o}, remove.

The built-in __builtin_vsx_xvcvdpuxws is redundant as it is covered by
vec_unsigned{e,o}, remove.

This patch removes the redundant built-ins.

gcc/ChangeLog:
* config/rs6000/rs6000-builtins.def (__builtin_vsx_xvcvspsxws,
__builtin_vsx_xvcvdpuxds_uns, __builtin_vsx_xvcvspuxws,
__builtin_vsx_xvcvdpsxws, __builtin_vsx_xvcvdpuxws): Remove
built-in definitions.

rs6000, Remove __builtin_vsx_cmple* builtins

The built-ins __builtin_vsx_cmple_u16qi, __builtin_vsx_cmple_u2di,
__builtin_vsx_cmple_u4si and __builtin_vsx_cmple_u8hi should take
unsigned arguments and return an unsigned result. The current definitions
take signed arguments and return signed results which is incorrect.

The signed and unsigned versions of __builtin_vsx_cmple* are not
documented in extend.texi. Also there are no test cases for the
built-ins.

Users can use the existing vec_cmple as PVIPR defines instead of
__builtin_vsx_cmple_u16qi, __builtin_vsx_cmple_u2di,
__builtin_vsx_cmple_u4si and __builtin_vsx_cmple_u8hi,
__builtin_vsx_cmple_16qi, __builtin_vsx_cmple_2di,
__builtin_vsx_cmple_4si and __builtin_vsx_cmple_8hi,
__builtin_altivec_cmple_1ti, __builtin_altivec_cmple_u1ti.

Hence these built-ins are redundant and are removed by this patch.

gcc/ChangeLog:
* config/rs6000/rs6000-builtin.cc (RS6000_BIF_CMPLE_16QI,
RS6000_BIF_CMPLE_U16QI, RS6000_BIF_CMPLE_8HI,
RS6000_BIF_CMPLE_U8HI, RS6000_BIF_CMPLE_4SI, RS6000_BIF_CMPLE_U4SI,
RS6000_BIF_CMPLE_2DI, RS6000_BIF_CMPLE_U2DI, RS6000_BIF_CMPLE_1TI,
RS6000_BIF_CMPLE_U1TI): Remove case statements.
* config/rs6000/rs6000-builtins.def (__builtin_vsx_cmple_16qi,
__builtin_vsx_cmple_2di, __builtin_vsx_cmple_4si,
__builtin_vsx_cmple_8hi, __builtin_vsx_cmple_u16qi,
__builtin_vsx_cmple_u2di, __builtin_vsx_cmple_u4si,
__builtin_vsx_cmple_u8hi): Remove buit-in definitions.

i386: Implement .SAT_TRUNC for unsigned integers

The following testcase:

unsigned short foo (unsigned int x)
{
_Bool overflow = x > (unsigned int)(unsigned short)(-1);
return ((unsigned short)x | (unsigned short)-overflow);
}

currently compiles (-O2) to:

foo:
xorl %eax, %eax
cmpl $65535, %edi
seta %al
negl %eax
orl %edi, %eax
ret

We can expand through ustrunc{m}{n}2 optab to use carry flag from the
comparison and generate code using SBB:

foo:
cmpl $65535, %edi
sbbl %eax, %eax
orl %edi, %eax
ret

or CMOV instruction:

foo:
movl $65535, %eax
cmpl %eax, %edi
cmovnc %edi, %eax
ret

gcc/ChangeLog:

* config/i386/i386.md (@cmp<mode>_1): Use SWI mode iterator.
(ustruncdi<mode>2): New expander.
(ustruncsi<mode>2): Ditto.
(ustrunchiqi2): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/sattrunc-1.c: New test.

diagnostics: use refs rather than pointers for diagnostic_{path,context}

Use const & rather than const * in various places where it can't be null
and can't change.

No functional change intended.

gcc/ChangeLog:
* diagnostic-path.cc: Replace "const diagnostic_path *" with
"const diagnostic_path &" throughout, and "diagnostic_context *"
with "diagnostic context &".
* diagnostic.cc (diagnostic_context::show_any_path): Pass
reference in call to print_path.
* diagnostic.h (diagnostic_context::print_path): Convert param
to a reference.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

arm: clean up some legacy FPA related cruft.

Support for the FPA on Arm was removed after gcc-4.7, but this little
bit of crufty code was left behind. In particular the code to support
the 'N' modifier in assembly code was left behind and this lead to a
trail of other code that depended on it, even though most of the
constants that it supported had been removed in the original cleanup.

This patch removes most of the remaining cruft and simplifies the one
bit that remains: to determine whether an RTL construct contains 0.0 we
don't need to convert it to a real value, we can simply compare it to
CONST0_RTX of the appropriate mode.

gcc/

* config/arm/arm.cc (fp_consts_initited): Delete variable.
(value_fp0): Likewise.
(init_fp_table): Delete function.
(fp_const_from_val): Likewise.
(arm_const_double_rtx): Rework to avoid converting to REAL_VALUE_TYPE.
(arm_print_operand, case 'N'): Make use of this case an error.

RISC-V: Fix comment/naming in attribute parsing code

Function target attributes have to be separated by semi-colons.
Let's fix the comment and variable naming to better explain what
the code does.

gcc/ChangeLog:

* config/riscv/riscv-target-attr.cc (riscv_process_target_attr):
Fix comments and variable names.

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>

RISC-V: Deduplicate arch subset list processing

We have a code duplication in riscv_set_arch_by_subset_list() and
riscv_parse_arch_string(), where the latter function parses an ISA string
into a subset_list before doing the same as the former function.

riscv_parse_arch_string() is used to process command line options and
riscv_set_arch_by_subset_list() processes target attributes.
So, it is obvious that both functions should do the same.
Let's deduplicate the code to enforce this.

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc (riscv_set_arch_by_subset_list):
Fix overlong line.
(riscv_parse_arch_string): Replace duplicated code by a call to
riscv_set_arch_by_subset_list.

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>

RISC-V: testsuite: Properly gate LTO tests

There are two test cases with the following skip directive:
  dg-skip-if "" { *-*-* } { "-flto -fno-fat-lto-objects" }
This reads as: skip if both '-flto' and '-fno-fat-lto-objects'
are present.  This is not the case if only '-flto' is present.

Since both tests depend on instruction sequences (one does
check-function-bodies the other tests for an assembler error
message), they won't work reliably with fat LTO objects.

Let's change the skip line to gate the test on '-flto'
to avoid failing tests like this:

FAIL: gcc.target/riscv/interrupt-misaligned.c   -O2 -flto   check-function-bodies interrupt
FAIL: gcc.target/riscv/interrupt-misaligned.c   -O2 -flto -flto-partition=none   check-function-bodies interrupt
FAIL: gcc.target/riscv/pr93202.c   -O2 -flto   (test for errors, line 10)
FAIL: gcc.target/riscv/pr93202.c   -O2 -flto   (test for errors, line 9)
FAIL: gcc.target/riscv/pr93202.c   -O2 -flto -flto-partition=none   (test for errors, line 10)
FAIL: gcc.target/riscv/pr93202.c   -O2 -flto -flto-partition=none   (test for errors, line 9)

gcc/testsuite/ChangeLog:

* gcc.target/riscv/interrupt-misaligned.c: Remove
"-fno-fat-lto-objects" from skip condition.
* gcc.target/riscv/pr93202.c: Likewise.

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>

i386: Correct AVX10 CPUID emulation

AVX10 Documentaion has specified ecx value as 0 for AVX10 version and
vector size under 0x24 subleaf. Although for ecx=1, the bits are all
reserved for now, we still need to specify ecx as 0 to avoid dirty
value in ecx.

gcc/ChangeLog:

* common/config/i386/cpuinfo.h (get_available_features): Correct
AVX10 CPUID emulation to specify ecx value.

c: Rewrite c_parser_omp_tile_sizes to use c_parser_expr_list

The following patch simplifies c_parser_omp_tile_sizes to use
c_parser_expr_list, so that it will get CPP_EMBED parsing naturally,
without having another spot that needs to be adjusted for it.

2024-07-09 Jakub Jelinek <jakub@redhat.com>

* c-parser.cc (c_parser_omp_tile_sizes): Use c_parser_expr_list.

* c-c++-common/gomp/tile-11.c: Adjust expected diagnostics for c.
* c-c++-common/gomp/tile-12.c: Likewise.

c++: Implement C++26 CWG2819 - Allow cv void * null pointer value conversion to object types in constant expressions

The following patch implements CWG2819 (which wasn't a DR because
it changes behavior of C++26 only).

2024-07-09 Jakub Jelinek <jakub@redhat.com>

* constexpr.cc (cxx_eval_constant_expression): CWG2819 - Allow
cv void * null pointer value conversion to object types in constant
expressions.

* g++.dg/cpp26/constexpr-voidptr3.C: New test.
* g++.dg/cpp0x/constexpr-cast2.C: Adjust expected diagnostics for
C++26.
* g++.dg/cpp0x/constexpr-cast4.C: Likewise.

Rename __{float,double}_u to __x86_{float,double}_u to avoid pulluting the namespace.

I have a build failure on NetBSD as the namespace pollution avoidance causes
a direct hit with the system /usr/include/math.h
=======================================================================

In file included from /usr/src/local/gcc/obj/gcc/include/emmintrin.h:31,
                 from /usr/src/local/gcc/obj/x86_64-unknown-netbsd10.99/libstdc++-v3/include/ext/random:45,
                 from /usr/src/local/gcc/libstdc++-v3/include/precompiled/extc++.h:65:
/usr/src/local/gcc/obj/gcc/include/xmmintrin.h:75:15: error: conflicting declaration 'typedef float __float_u'
   75 | typedef float __float_u __attribute__ ((__may_alias__, __aligned__ (1)));
      |               ^~~~~~~~~
In file included from /usr/src/local/gcc/obj/x86_64-unknown-netbsd10.99/libstdc++-v3/include/cmath:47,
                 from /usr/src/local/gcc/obj/x86_64-unknown-netbsd10.99/libstdc++-v3/include/x86_64-unknown-netbsd10.99/bits/stdc++.h:114,
                 from /usr/src/local/gcc/libstdc++-v3/include/precompiled/extc++.h:32:
/usr/src/local/gcc/obj/gcc/include-fixed/math.h:49:7: note: previous declaration as 'union __float_u'
   49 | union __float_u {

gcc/ChangeLog:

PR target/115796
* config/i386/emmintrin.h (__float_u): Rename to ..
(__x86_float_u): .. this.
(_mm_load_sd): Ditto.
(_mm_store_sd): Ditto.
(_mm_loadh_pd): Ditto.
(_mm_loadl_pd): Ditto.
* config/i386/xmmintrin.h (__double_u): Rename to ..
(__x86_double_u): .. this.
(_mm_load_ss): Ditto.
(_mm_store_ss): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr115796.c: New test.

RISC-V: Add testcases for unsigned vector .SAT_ADD IMM form 2

After the middle-end supported the vector mode of .SAT_ADD,  add more
testcases to ensure the correctness of RISC-V backend for form 2.  Aka:

Form 2:
  #define DEF_VEC_SAT_U_ADD_IMM_FMT_2(T, IMM)                          \
  T __attribute__((noinline))                                          \
  vec_sat_u_add_imm##IMM##_##T##_fmt_2 (T *out, T *in, unsigned limit) \
  {                                                                    \
    unsigned i;                                                        \
    for (i = 0; i < limit; i++)                                        \
      out[i] = (T)(in[i] + IMM) < in[i] ? -1 : (in[i] + IMM);          \
  }

DEF_VEC_SAT_U_ADD_IMM_FMT_2 (uint64_t, 9)

Passed the fully rv64gcv regression tests.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/binop/vec_sat_arith.h: Add help
test macro.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-5.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-6.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-7.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-8.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-5.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-6.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-7.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-8.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Add testcases for unsigned vector .SAT_ADD IMM form 1

After the middle-end supported the vector mode of .SAT_ADD,  add more
testcases to ensure the correctness of RISC-V backend for form 1.  Aka:

Form 1:
  #define DEF_VEC_SAT_U_ADD_IMM_FMT_1(T, IMM)                          \
  T __attribute__((noinline))                                          \
  vec_sat_u_add_imm##IMM##_##T##_fmt_1 (T *out, T *in, unsigned limit) \
  {                                                                    \
    unsigned i;                                                        \
    for (i = 0; i < limit; i++)                                        \
      out[i] = (T)(in[i] + IMM) >= in[i] ? (in[i] + IMM) : -1;         \
  }

DEF_VEC_SAT_U_ADD_IMM_FMT_1 (uint64_t, 9)

Passed the fully rv64gcv regression tests.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/binop/vec_sat_arith.h: Add help
test macro.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_data.h: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-1.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-2.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-3.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-4.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-1.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-2.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-3.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-4.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

Daily bump.

[to-be-committed][RISC-V][V3] DCE analysis for extension elimination

The pre-commit testing showed that making ext-dce only active at -O2 and above
would require minor edits to the tests.  In some cases we had specified -O1 in
the test or specified no optimization level at all. Those need to be bumped to
-O2.   In one test we had one set of dg-options overriding another.

The other approach that could have been taken would be to drop the -On
argument, add an explicit -fext-dce and add dg-skip-if options.  I originally
thought that was going to be way to go, but the dg-skip-if aspect was going to
get ugly as things like interaction between unrolling, peeling and -ftracer
would have to be accounted for and would likely need semi-regular adjustment.

Changes since V2:
  Testsuite changes to deal with pass only being enabled at -O2 or
  higher.

--

Changes since V1:

  Check flag_ext_dce before running the new pass.  I'd forgotten that
  I had removed that part of the gate to facilitate more testing.
  Turn flag_ext_dce on at -O2 and above.
  Adjust one of the riscv tests to explicitly avoid vectors
  Adjust a few aarch64 tests
    In tbz_2.c we remove an unnecessary extension which causes us to use
    "x" registers instead of "w" registers.

    In the pred_clobber tests we also remove an extension and that
    ultimately causes a reg->reg copy to change locations.

--

This was actually ack'd late in the gcc-14 cycle, but I chose not to integrate
it given how late we were in the cycle.

The basic idea here is to track liveness of subobjects within a word and if we
find an extension where the bits set aren't actually used, then we convert the
extension into a subreg.  The subreg typically simplifies away.

I've seen this help a few routines in coremark, fix one bug in the testsuite
(pr111384) and fix a couple internally reported bugs in Ventana.

The original idea and code were from Joern; Jivan and I hacked it into usable
shape.  I've had this in my tester for ~8 months, so it's been through more
build/test cycles than I care to contemplate and nearly every architecture we
support.

But just in case, I'm going to wait for it to spin through the pre-commit CI
tester.  I'll find my old ChangeLog before committing.

gcc/
* Makefile.in (OBJS): Add ext-dce.o
* common.opt (ext-dce): Document new option.
* df-scan.cc (df_get_ext_block_use_set): Delete prototype and
make extern.
* df.h (df_get_exit_block_use_set): Prototype.
* ext-dce.cc: New file/pass.
* opts.cc (default_options_table): Handle ext-dce at -O2 or higher.
* passes.def: Add ext-dce before combine.
* tree-pass.h (make_pass_ext_dce): Prototype.

gcc/testsuite
* gcc.target/aarch64/sve/pred_clobber_1.c: Update expected output.
* gcc.target/aarch64/sve/pred_clobber_2.c: Likewise.
* gcc.target/aarch64/sve/pred_clobber_3.c: Likewise.
* gcc.target/aarch64/tbz_2.c: Likewise.
* gcc.target/riscv/core_bench_list.c: New test.
* gcc.target/riscv/core_init_matrix.c: New test.
* gcc.target/riscv/core_list_init.c: New test.
* gcc.target/riscv/matrix_add_const.c: New test.
* gcc.target/riscv/mem-extend.c: New test.
* gcc.target/riscv/pr111384.c: New test.

Co-authored-by: Jivan Hakobyan <jivanhakobyan9@gmail.com>
Co-authored-by: Joern Rennecke <joern.rennecke@embecosm.com>

c-format.cc: add ctors to format_check_results and format_check_context

This is a minor cleanup I spotted whilst working on another patch.
No functional change intended.

gcc/c-family/ChangeLog:
* c-format.cc (format_check_results::format_check_results): New
ctor.
(struct format_check_context): Add ctor; add "m_" prefix to all
fields.
(check_format_info): Use above ctors.
(check_format_arg): Update for "m_" prefix to
format_check_context.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

i386: Promote {QI,HI}mode x86_mov<mode>cc_0_m1_neg to SImode

Promote HImode x86_mov<mode>cc_0_m1_neg insn to SImode to avoid
redundant prefixes. Also promote QImode insn when TARGET_PROMOTE_QImode
is set. This is similar to promotable_binary_operator splitter, where we
promote the result to SImode.

Also correct insn condition for splitters to SImode of NEG and NOT
instructions. The sizes of QImode and SImode instructions are always
the same, so there is no need for optimize_insn_for_size bypass.

gcc/ChangeLog:

* config/i386/i386.md (x86_mov<mode>cc_0_m1_neg splitter to SImode):
New splitter.
(NEG and NOT splitter to SImode): Remove optimize_insn_for_size_p
predicate from insn condition.

libstdc++: Fix _Atomic(T) macro in <stdatomic.h> [PR115807]

The definition of the _Atomic(T) macro needs to refer to ::std::atomic,
not some other std::atomic relative to the current namespace.

libstdc++-v3/ChangeLog:

PR libstdc++/115807
* include/c_compatibility/stdatomic.h (_Atomic): Ensure it
refers to std::atomic in the global namespace.
* testsuite/29_atomics/headers/stdatomic.h/115807.cc: New test.

Remove trailing whitespace from invoke.texi

gcc/ChangeLog:

* doc/invoke.texi: Remove trailing whitespace.

Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>