git.ipfire.org Git - thirdparty/gcc.git/log

aarch64: Add missing simd requirements for INS [PR118531]

In g:b096a6ebe9d9f9fed4c105f6555f724eb32af95c I'd forgotten
to gate some uses of INS on TARGET_SIMD.

gcc/
PR target/118531
* config/aarch64/aarch64.md (*insv_reg<mode>_<SUBDI_BITS>)
(*aarch64_bfi<GPI:mode><ALLX:mode>_<SUBDI_BITS>)
(*aarch64_bfidi<ALLX:mode>_subreg_<SUBDI_BITS>): Add missing
simd requirements.

gcc/testsuite/
* gcc.target/aarch64/ins_bitfield_1a.c: New test.
* gcc.target/aarch64/ins_bitfield_3a.c: Likewise.
* gcc.target/aarch64/ins_bitfield_5a.c: Likewise.

d: Fix failing test with 32-bit compiler [PR114434]

Since the introduction of gdc.test/runnable/test23514.d, it's exposed an
incorrect compilation when adding a 64-bit constant to a link-time
address. The current cast to size_t causes a loss of precision, which
can result in incorrect compilation.

PR d/114434

gcc/d/ChangeLog:

* expr.cc (ExprVisitor::visit (PtrExp *)): Get the offset as a
dinteger_t rather than a size_t.
(ExprVisitor::visit (SymOffExp *)): Likewise.

Fortran: do not copy back for parameter actual arguments [PR81978]

When an array is packed for passing as an actual argument, and the array
has the PARAMETER attribute (i.e., it is a named constant that can reside
in read-only memory), do not copy back (unpack) from the temporary.

PR fortran/81978

gcc/fortran/ChangeLog:

* trans-array.cc (gfc_conv_array_parameter): Do not copy back data
if actual array parameter has the PARAMETER attribute.
* trans-expr.cc (gfc_conv_subref_array_arg): Likewise.

gcc/testsuite/ChangeLog:

* gfortran.dg/pr81978.f90: New test.

c++: Handle RAW_DATA_CST in make_tree_vector_from_ctor [PR118528]

This is the first bug discovered today with the
https://gcc.gnu.org/pipermail/gcc-patches/2025-January/673945.html
hack but then turned into proper testcases where embed-21.C FAILed
since introduction of optimized #embed support and the other when
optimizing large C++ initializers using RAW_DATA_CST.

The problem is that the C++ FE calls make_tree_vector_from_ctor
and uses that as arguments vector for deduction guide handling.
The call.cc code isn't prepared to handle RAW_DATA_CST just about
everywhere, so I think it is safer to make sure RAW_DATA_CST only
appears in CONSTRUCTOR_ELTS and nowhere else.
Thus, the following patch expands the RAW_DATA_CSTs from initializers
into multiple INTEGER_CSTs in the returned vector.

2025-01-20 Jakub Jelinek <jakub@redhat.com>

PR c++/118528
* c-common.cc (make_tree_vector_from_ctor): Expand RAW_DATA_CST
elements from the CONSTRUCTOR to individual INTEGER_CSTs.

* g++.dg/cpp/embed-21.C: New test.
* g++.dg/cpp2a/class-deduction-aggr16.C: New test.

RISC-V: Correct the mode that is causing the program to fail for XTheadCondMov

For XTheadCondMov, the bit width of rs2 should always be XLEN-sized, otherwise
the program logic will be wrong.

Reference form
https://github.com/XUANTIE-RV/thead-extension-spec/releases/download/2.3.0/xthead-2023-11-10-2.3.0.pdf

Synopsis
Move if equal zero.

Mnemonic
th.mveqz rd, rs1, rs2

Description
This instruction moves the content of register rs1 into rd if the content of rs2 is 0x0.
Otherwise, the value of rd does not change.

Operation
if (reg[rs2] == 0x0)
reg[rd] := reg[rs1]

gcc/ChangeLog:

* config/riscv/thead.md (*th_cond_mov<GPR:mode><GPR2:mode>):
Change GPR2 to X.
(*th_cond_mov<GPR:mode>): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/xtheadcondmov-bug.c: New test.

inline: Purge the abnormal edges as needed in fold_marked_statements [PR118077]

While fixing PR target/117665, I had noticed that fold_marked_statements
would not purge the abnormal edges which could not be taken any more due
to folding a call (devirtualization or simplification of a [target] builtin).
Devirutalization could also cause a call that used to be able to have an
abornal edge become one not needing one too so this was needed for GCC 15.

Bootstrapped and tested on x86_64-linux-gnu

PR tree-optimization/118077
PR tree-optimization/117668

gcc/ChangeLog:

* tree-inline.cc (fold_marked_statements): Purge abnormal edges
as needed.

gcc/testsuite/ChangeLog:

* g++.dg/opt/devirt6.C: New test.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

libstdc++: perfectly forward std::ranges::clamp arguments

As reported in PR118185, std::ranges::clamp does not correctly forward
the projected value to the comparator. Add the missing forward.

libstdc++-v3/ChangeLog:

PR libstdc++/118185
PR libstdc++/100249
* include/bits/ranges_algo.h (__clamp_fn): Correctly forward the
projected value to the comparator.
* testsuite/25_algorithms/clamp/118185.cc: New test.

Signed-off-by: Giuseppe D'Angelo <giuseppe.dangelo@kdab.com>
Reviewed-by: Patrick Palka <ppalka@redhat.com>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

arm, testsuite: fix fast-math-bb-slp-complex-mla-float.c dg-add-options

The test uses floats, not fp16 so it should use arm_v8_3a_complex_neon
instead of arm_v8_3a_fp16_complex_neon.

This makes it PASS on arm-linux-gnueabihf instead of being UNRESOLVED.

gcc/testsuite/ChangeLog:
* gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-float.c: Use
arm_v8_3a_complex_neon.

arm, testsuite: remove duplicate dg-add-options arm_v8_3a_complex_neon

These two testcases have twice the same dg-add-options
arm_v8_3a_complex_neon, the patch removes one of them.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/complex/complex-operations-run.c: Remove duplicate
dg-add-options arm_v8_3a_complex_neon.
* gcc.dg/vect/complex/fast-math-bb-slp-complex-add-pattern-double.c:
Likewise.

tree-optimization/117875 - missed SLP vectorization

There's a discrepancy in SLP vs non-SLP vectorization that SLP build
does not handle plain SSA copies (which should have been elimiated
earlier). But this now bites back since non-SLP happily handles them,
causing a regression with --param vect-force-slp=1 which is now default,
resulting in a big performance regression in 456.hmmer.

So the following restores parity between SLP and non-SLP here, defering
the missed copy elimination to later (PR118565).

PR tree-optimization/117875
* tree-vect-slp.cc (vect_build_slp_tree_1): Handle SSA copies.

LoongArch: Improve reassociation for bitwise operation and left shift [PR 115921]

For things like

        (x | 0x101) << 11

It's obvious to write:

        ori     $r4,$r4,257
        slli.d  $r4,$r4,11

But we are actually generating something insane:

        lu12i.w $r12,524288>>12             # 0x80000
        ori     $r12,$r12,2048
        slli.d  $r4,$r4,11
        or      $r4,$r4,$r12
        jr      $r1

It's because the target-independent canonicalization was written before
we have all the RISC targets where loading an immediate may need
multiple instructions.  So for these targets we need to handle this in
the target code.

We do the reassociation on our own (i.e. reverting the
target-independent reassociation) if "(reg [&|^] mask) << shamt" does
not need to load mask into an register, and either:
- (mask << shamt) needs to be loaded into an register, or
- shamt is a const_immalsl_operand, so the outer shift may be further
  combined with an add.

gcc/ChangeLog:

PR target/115921
* config/loongarch/loongarch-protos.h
(loongarch_reassoc_shift_bitwise): New function prototype.
* config/loongarch/loongarch.cc
(loongarch_reassoc_shift_bitwise): Implement.
* config/loongarch/loongarch.md
(*alslsi3_extend_subreg): New define_insn_and_split.
(<any_bitwise:optab>_shift_reverse<X:mode>): New
define_insn_and_split.
(<any_bitwise:optab>_alsl_reversesi_extended): New
define_insn_and_split.
(zero_extend_ashift): Remove as it's just a special case of
and_shift_reversedi, and it does not make too much sense to
write "alsl.d rd,rs,r0,shamt" instead of "slli.d rd,rs,shamt".
(bstrpick_alsl_paired): Remove as it is already done by
splitting and_shift_reversedi into and + ashift first, then
late combining the ashift and a further add.

gcc/testsuite/ChangeLog:

PR target/115921
* gcc.target/loongarch/bstrpick_alsl_paired.c (scan-rtl-dump):
Scan for and_shift_reversedi instead of the removed
bstrpick_alsl_paired.
* gcc.target/loongarch/bitwise-shift-reassoc.c: New test.

LoongArch: Simplify using bstr{ins,pick} instructions for and

For bstrins, we can merge it into and<mode>3 instead of having a
separate define_insn.

For bstrpick, we can use the constraints to ensure the first source
register and the destination register are the same hardware register,
instead of emitting a move manually.

This will simplify the next commit where we'll reassociate bitwise
and left shift for better code generation.

gcc/ChangeLog:

* config/loongarch/constraints.md (Yy): New define_constriant.
* config/loongarch/loongarch.cc (loongarch_print_operand):
For "%M", output the index of bits to be used with
bstrins/bstrpick.
* config/loongarch/predicates.md (ins_zero_bitmask_operand):
Exclude low_bitmask_operand as for low_bitmask_operand it's
always better to use bstrpick instead of bstrins.
(and_operand): New define_predicate.
* config/loongarch/loongarch.md (any_or): New
define_code_iterator.
(bitwise_operand): New define_code_attr.
(*<optab:any_or><mode:GPR>3): New define_insn.
(*and<mode:GPR>3): New define_insn.
(<optab:any_bitwise><mode:X>3): New define_expand.
(and<mode>3_extended): Remove, replaced by the 3rd alternative
of *and<mode:GPR>3.
(bstrins_<mode>_for_mask): Remove, replaced by the 4th
alternative of *and<mode:GPR>3.
(*<optab:any_bitwise>si3_internal): Remove, already covered by
the *<optab:any_or><mode:GPR>3 and *and<mode:GPR>3 templates.

testsuite: Fix name of PR116348 test case

gcc/testsuite/ChangeLog:

* gcc.c-torture/compile/pr116438.c: Rename to ...
* gcc.c-torture/compile/pr116348.c: ... this.

tree-optimization/118552 - failed LC SSA update after unrolling

When unrolling changes nesting relationship of loops we fail to
mark blocks as in need to change for LC SSA update. Specifically
the LC SSA PHI on a former inner loop exit might be misplaced
if that loop becomes a sibling of its outer loop.

PR tree-optimization/118552
* cfgloopmanip.cc (fix_loop_placement): Properly mark
exit source blocks as to be scanned for LC SSA update when
the loops nesting relationship changed.
(fix_loop_placements): Adjust.
(fix_bb_placements): Likewise.

* gcc.dg/torture/pr118552.c: New testcase.

nvptx: Gracefully handle '-mptx=3.1' if neither sm_30 nor sm_35 multilib variant is built

For example, for GCC/nvptx built with '--with-arch=sm_52' (current default)
and '--without-multilib-list', neither a sm_30 nor a sm_35 multilib variant
is built, and thus no '-mptx=3.1' sub-variant either.  Such a configuration
is possible as of commit 86b3a7532d56f74fcd1c362f2da7f95e8cc4e4a6
"nvptx: Support '--with-multilib-list'", but currently results in the
following bogus behavior:

    [...]/xgcc -print-multi-directory -mgomp -march=sm_52
    mgomp
    [...]/xgcc -print-multi-directory -mgomp -march=sm_35
    mgomp
    [...]/xgcc -print-multi-directory -mgomp -march=sm_30
    mgomp
    [...]/xgcc -print-multi-directory -mgomp -march=sm_35 -mptx=3.1
    .
    [...]/xgcc -print-multi-directory -mgomp -march=sm_30 -mptx=3.1
    .

The latter two '.' are unexpected; linking OpenMP/nvptx offloading code
like this fails with: 'unresolved symbol __nvptx_uni', for example.
Instead of '.', the latter two should print 'mgomp', too.  To achieve that,
we must not set up the '-mptx=3.1' multilib axis if no '-mptx=3.1'
sub-variant is built.

gcc/
* config/nvptx/t-nvptx (MULTILIB_OPTIONS): Don't add 'mptx=3.1' if
neither sm_30 nor sm_35 multilib variant is built.

tree, c++: Consider TARGET_EXPR invariant like SAVE_EXPR [PR118509]

My October PR117259 fix to get_member_function_from_ptrfunc to use a
TARGET_EXPR rather than SAVE_EXPR unfortunately caused some regressions as
well as the following testcase shows.
What happens is that
get_member_function_from_ptrfunc -> build_base_path calls save_expr,
so since the PR117259 change in mnay cases it will call save_expr on
a TARGET_EXPR.  And, for some strange reason a TARGET_EXPR is not considered
an invariant, so we get a SAVE_EXPR wrapped around the TARGET_EXPR.
That SAVE_EXPR <TARGET_EXPR <...>> gets initially added only to the second
operand of ?:, so at that point it would still work fine during expansion.
But unfortunately an expression with that subexpression is handed to the
caller also through *instance_ptrptr = instance_ptr; and gets evaluated
once again when computing the first argument to the method.
So, essentially, we end up with
(TARGET_EXPR <D.2907, ...>, (... ? ... SAVE_EXPR <TARGET_EXPR <D.2907, ...>
... : ...)) (... SAVE_EXPR <TARGET_EXPR <D.2907, ...> ..., ...);
and while D.2907 is initialized during gimplification in the code dominating
everything that uses it, the extra temporary created for the SAVE_EXPR
is initialized only conditionally (if the ?: condition is true) but then
used unconditionally, so we get
pmf-4.C: In function ‘void foo(C, B*)’:
pmf-4.C:12:11: warning: ‘<anonymous>’ may be used uninitialized [-Wmaybe-uninitialized]
   12 |   (y->*x) ();
      |   ~~~~~~~~^~
pmf-4.C:12:11: note: ‘<anonymous>’ was declared here
   12 |   (y->*x) ();
      |   ~~~~~~~~^~
diagnostic and wrong-code issue too.

The following patch fixes it by considering a TARGET_EXPR invariant
for SAVE_EXPR purposes the same as SAVE_EXPR is.  Really creating another
temporary for it is just a waste of the IL.

Unfortunately I had to tweak the omp matching code to be able to accept
TARGET_EXPR the same as SAVE_EXPR.

2025-01-20  Jakub Jelinek  <jakub@redhat.com>

PR c++/118509
gcc/
* tree.cc (tree_invariant_p_1): Return true for TARGET_EXPR too.
gcc/c-family/
* c-omp.cc (c_finish_omp_for): Handle TARGET_EXPR in first operand
of COMPOUND_EXPR incr the same as SAVE_EXPR.
gcc/testsuite/
* g++.dg/expr/pmf-4.C: New test.

tree-ssa-dce: Fix calloc handling [PR118224]

As reported by Dimitar, this should have been a multiplication, but wasn't
caught because in the test (~(__SIZE_TYPE__) 0) / 2 is the largest accepted
size and so adding 3 to it also resulted in "overflow".

The following patch adds one subtest to really verify it is a multiplication
and fixes the operation.

2025-01-20 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/118224
* tree-ssa-dce.cc (is_removable_allocation_p): Multiply a1 by a2
instead of adding it.

* gcc.dg/pr118224.c: New test.

s390: Update vec_(load,store)_len(,_r)

Reflect latest updates for vec_(load,store)_len(,_r) which means that
all types except character based types are deprecated.

gcc/ChangeLog:

* config/s390/s390-builtins.def (s390_vec_load_len): Deprecate
some overloads.
(s390_vec_store_len): Deprecate some overloads.
(s390_vec_load_len_r): Add.
(s390_vec_store_len_r): Add.
* config/s390/s390-c.cc (s390_vec_load_len_r): Add.
(s390_vec_store_len_r): Add.
* config/s390/vecintrin.h (vec_load_len_r): Redefine.
(vec_store_len_r): Redefine.

s390: Vector shift: Add 128-bit integer support

Add 128-bit vector shift support. Deprecate vector shift by byte
builtins where the shift amount is not of type unsigned character.

gcc/ChangeLog:

* config/s390/s390-builtins.def: Add 128-bit variants.
* config/s390/s390-builtin-types.def: Update accordingly.
* config/s390/vector.md (<vec_shifts_name><mode>3): Add 128-bit
variants.
* config/s390/vx-builtins.md: Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/s390/vector/vec-shift-10.c: New test.
* gcc.target/s390/vector/vec-shift-11.c: New test.
* gcc.target/s390/vector/vec-shift-12.c: New test.
* gcc.target/s390/vector/vec-shift-3.c: New test.
* gcc.target/s390/vector/vec-shift-4.c: New test.
* gcc.target/s390/vector/vec-shift-5.c: New test.
* gcc.target/s390/vector/vec-shift-6.c: New test.
* gcc.target/s390/vector/vec-shift-7.c: New test.
* gcc.target/s390/vector/vec-shift-8.c: New test.
* gcc.target/s390/vector/vec-shift-9.c: New test.

s390: arch15: Vector maximum/minimum: Add 128-bit integer support

For previous architectures emulate operation max/min.

gcc/ChangeLog:

* config/s390/s390-builtins.def: Add 128-bit variants and remove
bool variants.
* config/s390/s390-builtin-types.def: Update accordinly.
* config/s390/s390.md: Emulate min/max for GPR.
* config/s390/vector.md: Add min/max patterns and emulate in
case of no VXE3.

gcc/testsuite/ChangeLog:

* gcc.target/s390/vector/vec-max-emu.c: New test.
* gcc.target/s390/vector/vec-min-emu.c: New test.

s390: arch15: Vector load positive: Add 128-bit integer support

For previous architectures emulate operation abs.

gcc/ChangeLog:

* config/s390/s390-builtins.def (s390_vec_abs_s128): Add.
(s390_vlpq): Add.
* config/s390/s390-builtin-types.def: Update accordingly.
* config/s390/vector.md (abs<mode>2): Emulate w/o VXE3.
(*abs<mode>2): Add 128-bit variant.
(*vec_sel0<mode>): Make it a ...
(vec_sel0<mode>): named pattern.

gcc/testsuite/ChangeLog:

* gcc.target/s390/vector/vec-abs-emu.c: New test.

s390: arch15: Vector compare: Add 128-bit integer support

gcc/ChangeLog:

* config/s390/s390-builtins.def: Add 128-bit variants.
* config/s390/s390-builtin-types.def: Update accordingly.
* config/s390/s390.cc (s390_expand_vec_compare_cc): Also
consider TI modes for vectors.
* config/s390/vector.md: Enable *vec_cmp et al. for VXE3.
* config/s390/vx-builtins.md: Ditto.

s390: arch15: Vector devide/remainder

gcc/ChangeLog:

* config/s390/vector.md (div<mode>3): Add.
(udiv<mode>3): Add.
(mod<mode>3): Add.
(umod<mode>3): Add.

gcc/testsuite/ChangeLog:

* gcc.target/s390/vxe3/vd-1.c: New test.
* gcc.target/s390/vxe3/vd-2.c: New test.
* gcc.target/s390/vxe3/vdl-1.c: New test.
* gcc.target/s390/vxe3/vdl-2.c: New test.
* gcc.target/s390/vxe3/vr-1.c: New test.
* gcc.target/s390/vxe3/vr-2.c: New test.
* gcc.target/s390/vxe3/vrl-1.c: New test.
* gcc.target/s390/vxe3/vrl-2.c: New test.

s390: arch15: Count leading/trailing zeros

Add vector single element 128-bit integer support utilizing new
instructions vclzq and vctzq. Furthermore, add scalar 64-bit integer
support utilizing new instructions clzg and ctzg. For ctzg, also define
the resulting value if the input operand equals zero.

gcc/ChangeLog:

* config/s390/s390-builtins.def (s390_vec_cntlz): Add 128-bit
integer overloads.
(s390_vclzq): Add.
(s390_vec_cnttz): Add 128-bit integer overloads.
(s390_vctzq): Add.
* config/s390/s390-builtin-types.def: Update accordingly.
* config/s390/s390.h (CTZ_DEFINED_VALUE_AT_ZERO): Define.
* config/s390/s390.md (*clzg): New insn.
(clztidi2): Exploit new insn for target arch15.
(ctzdi2): New insn.
* config/s390/vector.md (clz<mode>2): Extend modes including
128-bit integer.
(ctz<mode>2): Likewise.

s390: arch15: Vector generate element masks

Add instruction vgem and vector builtins
vec_gen_element_masks_{8,16,32,64,128}.

gcc/ChangeLog:

* config/s390/s390-builtins.def (s390_vec_gen_element_masks_128): Add.
(s390_vgemb): Add.
(s390_vgemh): Add.
(s390_vgemf): Add.
(s390_vgemg): Add.
(s390_vgemq): Add.
* config/s390/s390-builtin-types.def: Update accordingly.
* config/s390/s390.md (UNSPEC_VEC_VGEM): Add.
* config/s390/vecintrin.h (vec_gen_element_masks_8): Define.
(vec_gen_element_masks_16): Define.
(vec_gen_element_masks_32): Define.
(vec_gen_element_masks_64): Define.
(vec_gen_element_masks_128): Define.
* config/s390/vx-builtins.md (vgemv16qi): Add.
(vgem<mode>): Add.

s390: arch15: Vector eval

Add instruction veval and builtin vec_evaluate.

gcc/ChangeLog:

* config/s390/s390-builtins.def (s390_vec_evaluate): Add.
(s390_veval): Add.
* config/s390/s390-builtin-types.def: Update accordingly.
* config/s390/s390.md (UNSPEC_VEC_VEVAL): Add.
* config/s390/vecintrin.h (vec_evaluate): Define.
* config/s390/vector.md
(*veval<mode>_<logic_op1:logic_op_stringify><logic_op2:logic_op_stringify>):
Add.
(veval<mode>): Add.

gcc/testsuite/ChangeLog:

* gcc.target/s390/vxe3/veval-1.c: New test.
* gcc.target/s390/vxe3/veval-2.c: New test.
* gcc.target/s390/vxe3/veval-3.c: New test.
* gcc.target/s390/vxe3/veval-4.c: New test.
* gcc.target/s390/vxe3/veval-5.c: New test.
* gcc.target/s390/vxe3/veval-6.c: New test.
* gcc.target/s390/vxe3/veval-7.c: New test.
* gcc.target/s390/vxe3/veval-8.c: New test.
* gcc.target/s390/vxe3/veval-9.c: New test.

s390: arch15: Vector blend

Add instruction vblend and builtin vec_blend.

gcc/ChangeLog:

* config/s390/s390-builtins.def (s390_vec_blend): Add.
(s390_vblendb): Add.
(s390_vblendh): Add.
(s390_vblendf): Add.
(s390_vblendg): Add.
(s390_vblendq): Add.
* config/s390/s390-builtin-types.def: Update accordingly.
* config/s390/s390.md (UNSPEC_VEC_VBLEND): Add.
* config/s390/vecintrin.h (vec_blend): Define.
* config/s390/vx-builtins.md (vblend<mode>): Add.

s390: arch15: Bit deposit and extract

Add instructions bdepg and bextg and corresponding builtins.

gcc/ChangeLog:

* config/s390/s390-builtins.def (s390_bdepg): Add.
(s390_bextg): Add.
* config/s390/s390-builtin-types.def: Update accordingly.
* config/s390/s390.md (UNSPEC_BDEPG): Add.
(UNSPEC_BEXTG): Add.
(bdepg): Add.
(bextg): Add.

s390: arch15: Load indexed address

Add instructions lxa and llxa.

gcc/ChangeLog:

* config/s390/s390.md (*lxa<LXAMODE>_index): Add.
(*lxa<LXAMODE>_displacement_index): Add.
(*lxa<LXAMODE>_index_base): Add.
(*lxa<LXAMODE>_displacement_index_base): Add.
(*lxab_displacement_index_base): Add.
(*llxa<LXAMODE>_displacement_index): Add.
(*llxa<LXAMODE>_index_base): Add.
(*llxa<LXAMODE>_displacement_index_base): Add.
(*llxab_displacement_index_base): Add.

gcc/testsuite/ChangeLog:

* gcc.target/s390/llxa-1.c: New test.
* gcc.target/s390/llxa-2.c: New test.
* gcc.target/s390/llxa-3.c: New test.
* gcc.target/s390/lxa-1.c: New test.
* gcc.target/s390/lxa-2.c: New test.
* gcc.target/s390/lxa-3.c: New test.
* gcc.target/s390/lxa-4.c: New test.

s390: arch15: New instruction variants supporting 128-bit integer

Add new instruction variants and also extend builtins in order to deal
with 128-bit integer.

gcc/ChangeLog:

* config/s390/s390-builtins.def: Add new instruction variants.
* config/s390/s390-builtin-types.def: Update accordingly.
* config/s390/vecintrin.h: Add new defines.
* config/s390/vector.md: Adapt insns for new instruction
variants.
* config/s390/vx-builtins.md: Ditto.

s390: arch15: Prepare for future builtins

gcc/ChangeLog:

* config/s390/s390-builtins.def (B_VXE3): Define.
(B_ARCH15): Define.
* config/s390/s390-c.cc (s390_resolve_overloaded_builtin):
Consistency checks for VXE3.
* config/s390/s390.cc (s390_expand_builtin): Consistency checks
for VXE3.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp: VXE3 effective target check.

s390: Bump __VEC__ and add 128-bit integer zvector types

Bump __VEC__ version to 10305 and add 128-bit integer vector types like
`vector __int128` et al. to the zvector extension.

gcc/ChangeLog:

* config/s390/s390-c.cc (rid_int128): New helper function.
(s390_macro_to_expand): Deal with `vector __int128`.
(s390_cpu_cpp_builtins_internal): Bump __VEC__.
* config/s390/s390.cc (s390_handle_vectorbool_attribute): Add
128-bit bool zvector.

s390: arch15: Prepare for a future architecture

gcc/ChangeLog:

* common/config/s390/s390-common.cc: Add arch15 processor flags.
* config.gcc: Add arch15 for options --with-{arch,mtune}.
* config/s390/driver-native.cc (s390_host_detect_local_cpu):
Default to arch15.
* config/s390/s390-opts.h (enum processor_type): Add
PROCESSOR_ARCH15.
* config/s390/s390.cc (processor_table,s390_issue_rate,
s390_get_sched_attrmask,s390_get_unit_mask): Add arch15.
* config/s390/s390.h (enum processor_flags): Add processor flags
for VXE3 and ARCH15.
(TARGET_CPU_VXE3): Define.
(TARGET_CPU_VXE3_P): Define.
(TARGET_CPU_ARCH15): Define.
(TARGET_CPU_ARCH15_P): Define.
(TARGET_VXE3): Define.
(TARGET_VXE3_P): Define.
(TARGET_ARCH15): Define.
(TARGET_ARCH15_P): Define.
* config/s390/s390.md: Add VXE3 and ARCH15 to cpu_facility, and
let attribute "enabled" deal with them.
* config/s390/s390.opt: Add arch15.

gcc/testsuite/ChangeLog:

* gcc.target/s390/s390.exp: Set compiler flags for the vxe3
subdirectory of the testsuite as done e.g. for vxe2.

s390: Sort definitions in vecintrin.h

gcc/ChangeLog:

* config/s390/vecintrin.h: Sort definitions.

s390: Stay scalar for TOINTVEC/tointvec

Currently TOINTVEC maps scalar mode TI/TF to vector mode V1TI/V1TF,
respectively.  As a consequence we may end up with patterns with a
mixture of scalar and vector modes as e.g. for

(define_insn "vec_sel0<mode>"
  [(set (match_operand:VT 0 "register_operand" "=v")
        (if_then_else:VT
         (eq (match_operand:<TOINTVEC> 3 "register_operand" "v")
             (match_operand:<TOINTVEC> 4 "const0_operand" ""))
         (match_operand:VT 1 "register_operand" "v")
         (match_operand:VT 2 "register_operand" "v")))]

This is cumbersome since gen_vec_sel0ti() and gen_vec_sel0tf() require
that operands 3 and 4 are of vector mode whereas the remainder of
operands must be of scalar mode.  Likewise for tointvec.

Fixed by staying scalar.

gcc/ChangeLog:

* config/s390/vector.md: Stay scalar for TOINTVEC/tointvec.

RISC-V: Add sifive_vector.h

sifive_vector.h is a vendor specfic header, it should include before
using sifive vector intrinsic, it's just include riscv_vector.h for now,
we will separate the implementation by adding new pragma in future.

gcc/ChangeLog:

* config.gcc (riscv*): Install sifive_vector.h.
* config/riscv/sifive_vector.h: New.

i386: Fix wrong insn generated by shld/shrd ndd split [PR118510]

For shld/shrd_ndd_2 insn, the spiltter outputs wrong pattern that
mixed parallel for clobber and set. Use register_operand as dest
and ajdust output template to fix.

gcc/ChangeLog:

PR target/118510
* config/i386/i386.md (*x86_64_shld_ndd_2): Use register_operand
for operand[0] and adjust the output template to directly
generate ndd form shld pattern.
(*x86_shld_ndd_2): Likewise.
(*x86_64_shrd_ndd_2): Likewise.
(*x86_shrd_ndd_2): Likewise.

gcc/testsuite/ChangeLog:

PR target/118510
* gcc.target/i386/pr118510.c: New test.

Daily bump.

i386: Reorder *movdi_internal ISA attribute by ascending alternative index

Reorder ISA attribute by ascending alternative index. No functional change.

gcc/ChangeLog:

* config/i386/i386.md (*movdi_internal): Reorder ISA attribute
by ascending alternative index.

i386/testsuite: Fix gcc.target/i386/pr118067*.c tests

These tests use int128 type, so require target int128 instead of ! ia32.
Also, use -mtune= instead of deprecated -mcpu= to avoid compiler warning.

PR rtl-optimization/118067

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr118067.c (dg-compile): Use target int128.
* gcc.target/i386/pr118067-2.c (dg-compile): Ditto.
(dg-options): Use -mtune= instead of deprecated -mcpu= option.

Regenerate sparc.opt.urls

sparc added a -mvis3b option, but the sparc.opt.url file wasn't
regenerated.

Fixes: d309844d6fe0 ("Fix bootstrap failure on SPARC with -O3 -mcpu=niagara4")
gcc/ChangeLog:

* config/sparc/sparc.opt.urls: Regenerated.

testsuite: Fixes for test case pr117546.c

This test fails on AVR.

Debugging the test on x86 host, I noticed that u in function s sometimes
has value 16128. The "t <= 3 * u" expression in the same function
results in signed integer overflow for targets with sizeof(int)=2.

Fix by requiring int32 effective target.

Also add return statement for the main function.

gcc/testsuite/ChangeLog:

* gcc.dg/torture/pr117546.c: Require effective target int32.
(main): Add return statement.

Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>

doc: Move modula2.org link to https

gcc:
* doc/gm2.texi (Type compatibility): Move modula2.org link
to https.

doc: Adjust link to OpenMP specifications

gcc:
* doc/extend.texi (OpenMP): Adjust link to specifications.

Daily bump.

d: Merge upstream dmd, druntime d115713410, phobos 1b242048c.

D front-end changes:

- Import latest fixes from dmd v2.110.0-rc.1.
- Integers in debug or version statements have been removed from
the language.

D runtime changes:

- Import latest fixes from druntime v2.110.0-rc.1.

Phobos changes:

- Import latest fixes from phobos v2.110.0-rc.1.

gcc/d/ChangeLog:

* dmd/MERGE: Merge upstream dmd d115713410.

libphobos/ChangeLog:

* libdruntime/MERGE: Merge upstream druntime d115713410.
* src/MERGE: Merge upstream phobos 1b242048c.

gcc/testsuite/ChangeLog:

* gdc.dg/asm3.d: Adjust test.

c++: Copy over further 2 flags for !TREE_PUBLIC in copy_linkage [PR118513]

The following testcase ICEs in import_export_decl.
When cp_finish_decomp handles std::tuple* using structural binding,
it calls copy_linkage to copy various VAR_DECL flags from the structured
binding base to the individual sb variables.
In this case the base variable is in anonymous union, so we call
constrain_visibility (..., VISIBILITY_ANON, ...) on it which e.g.
clears TREE_PUBLIC etc. (flags which copy_linkage copies) but doesn't
copy over DECL_INTERFACE_KNOWN/DECL_NOT_REALLY_EXTERN.
When cp_finish_decl calls determine_visibility on the individual sb
variables, those have !TREE_PUBLIC since copy_linkage and so nothing tries
to determine visibility and nothing sets DECL_INTERFACE_KNOWN and
DECL_NOT_REALLY_EXTERN.
Now, this isn't a big deal without modules, the individual variables are
var_finalized_p and so nothing really cares about missing
DECL_INTERFACE_KNOWN.  But in the module case the variables are streamed
out and in and care about those bits.

The following patch is an attempt to copy over also those flags (but I've
limited it to the !TREE_PUBLIC case just in case).  Other option would be
to call it unconditionally, or call constrain_visibility with
VISIBILITY_ANON for !TREE_PUBLIC (but are all !TREE_PUBLIC constrained
visibility) or do it only in the cp_finish_decomp case
after the copy_linkage call there.

2025-01-18  Jakub Jelinek  <jakub@redhat.com>

PR c++/118513
* decl2.cc (copy_linkage): If not TREE_PUBLIC, also set
DECL_INTERFACE_KNOWN, assert it was set on decl and copy
DECL_NOT_REALLY_EXTERN flags.

* g++.dg/modules/decomp-3_a.H: New test.
* g++.dg/modules/decomp-3_b.C: New test.

[RISC-V][PR target/116308] Fix generation of initial RTL for atomics

While this wasn't originally marked as a regression, it almost certainly is
given that older versions of GCC would have used libatomic and would not have
ICE'd on this code.

Basically this is another case where we directly used simplify_gen_subreg when
we should have used gen_lowpart.

When I fixed a similar bug a while back I noted the code in question as needing
another looksie.  I think at that time my brain saw the mixed modes (SI & QI)
and locked up.  But the QI stuff is just the shift count, not some deeper
issue.  So fixing is trivial.

We just replace the simplify_gen_subreg with a gen_lowpart and get on with our
lives.

Tested on rv64 and rv32 in my tester.  Waiting on pre-commit testing for final
verdict.

PR target/116308
gcc/
* config/riscv/riscv.cc (riscv_lshift_subword): Use gen_lowpart
rather than simplify_gen_subreg.

gcc/testsuite/

* gcc.target/riscv/pr116308.c: New test.

Fix uniqueness of symtab_node::get_dump_name.

symtab_node::get_dump_name uses node order to identify nodes.
Order is no longer unique because of Incremental LTO patches.
This patch moves uid from cgraph_node node to symtab_node,
so get_dump_name can use uid instead and get back unique dump names.

In inlining passes, uid is replaced with more appropriate (more compact
for indexing) summary id.

Bootstrapped/regtested on x86_64-linux.
Ok for trunk?

gcc/ChangeLog:

* cgraph.cc (symbol_table::create_empty):
Move uid to symtab_node.
(test_symbol_table_test): Change expected dump id.
* cgraph.h (struct cgraph_node):
Move uid to symtab_node.
(symbol_table::register_symbol): Likewise.
* dumpfile.cc (test_capture_of_dump_calls):
Change expected dump id.
* ipa-inline.cc (update_caller_keys):
Use summary id instead of uid.
(update_callee_keys): Likewise.
* symtab.cc (symtab_node::get_dump_name):
Use uid instead of order.

gcc/testsuite/ChangeLog:

* gcc.dg/live-patching-1.c: Change expected dump id.
* gcc.dg/live-patching-4.c: Likewise.

Fix bootstrap failure on SPARC with -O3 -mcpu=niagara4

This is a regression present on the mainline only, but the underlying issue
has been latent for years: the compiler and the assembler disagree on the
support of the VIS 3B SIMD ISA, the former bundling it with VIS 3 but not
the latter. IMO the documentation is not very clear, so this patch just
aligns the compiler with the assembler.

gcc/
PR target/118512
* config/sparc/sparc-c.cc (sparc_target_macros): Deal with VIS 3B.
* config/sparc/sparc.cc (dump_target_flag_bits): Likewise.
(sparc_option_override): Likewise.
(sparc_vis_init_builtins): Likewise.
* config/sparc/sparc.md (fpcmp_vis): Replace TARGET_VIS3 with
TARGET_VIS3B.
(vec_cmp): Likewise.
(fpcmpu_vis): Likewise.
(vec_cmpu): Likewise.
(vcond_mask_): Likewise.
* config/sparc/sparc.opt (VIS3B): New target mask.
* doc/invoke.texi (SPARC options): Document -mvis3b.

gcc/testsuite/
* gcc.target/sparc/20230328-1.c: Pass -mvis3b instead of -mvis3.
* gcc.target/sparc/20230328-4.c: Likewise.
* gcc.target/sparc/fucmp.c: Likewise.
* gcc.target/sparc/vis3misc.c: Likewise.

RISC-V: Disable RV64-only crc testcases for RV32

These testcases require RV64 targets. They fail when -march=rv32* is
specified while using an riscv64* compiler.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/crc-21-rv64-zbc.c: Disallow rv32 targets.
* gcc.target/riscv/crc-21-rv64-zbkc.c: Ditto.

[PR target/118357] RISC-V: Disable fusing vsetvl instructions by VSETVL_VTYPE_CHANGE_ONLY for XTheadVector.

In RVV 1.0, the instruction "vsetvli zero,zero,*" indicates that the
available vector length (avl) does not change. However, in XTheadVector,
this same instruction signifies that the avl should take the maximum value.
Consequently, when fusing vsetvl instructions, the optimization labeled
"VSETVL_VTYPE_CHANGE_ONLY" is disabled for XTheadVector.

PR target/118357

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc: Function change_vtype_only_p always
returns false for XTheadVector.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/xtheadvector/pr118357.c: New test.

tree-optimization/118529 - ICE with condition vectorization

On sparc we end up choosing vector(8) <signed-boolean:1> for the
condition but vector(2) int for the value of a COND_EXPR but we
fail to verify their shapes match and thus things go downhill.

This is a missed-optimization on the pattern recognition side
as well as unhandled vector decomposition in vectorizable_condition.
The following plugs just the observed ICE for now.

PR tree-optimization/118529
* tree-vect-stmts.cc (vectorizable_condition): Check the
shape of the vector and condition vector type are compatible.

* gcc.target/sparc/pr118529.c: New testcase.

AVR: Fix a plenk in doc/invoke.texi.

gcc/
* doc/invoke.texi (AVR Options): Fix plenk at -msplit-ldst.

AArch64: Use standard names for saturating arithmetic

This renames the existing {s,u}q{add,sub} instructions to use the
standard names {s,u}s{add,sub}3 which are used by IFN_SAT_ADD and
IFN_SAT_SUB.

The NEON intrinsics for saturating arithmetic and their corresponding
builtins are changed to use these standard names too.

Using the standard names for the instructions causes 32 and 64-bit
unsigned scalar saturating arithmetic to use the NEON instructions,
resulting in an additional (and inefficient) FMOV to be generated when
the original operands are in GP registers. This patch therefore also
restores the original behaviour of using the adds/subs instructions
in this circumstance.

Additional tests are written for the scalar and Adv. SIMD cases to
ensure that the correct instructions are used. The NEON intrinsics are
already tested elsewhere.

gcc/ChangeLog:

* config/aarch64/aarch64-builtins.cc: Expand iterators.
* config/aarch64/aarch64-simd-builtins.def: Use standard names
* config/aarch64/aarch64-simd.md: Use standard names, split insn
definitions on signedness of operator and type of operands.
* config/aarch64/arm_neon.h: Use standard builtin names.
* config/aarch64/iterators.md: Add VSDQ_I_QI_HI iterator to
simplify splitting of insn for unsigned scalar arithmetic.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/scalar_intrinsics.c: Update testcases.
* gcc.target/aarch64/advsimd-intrinsics/saturating_arithmetic_autovect.inc:
Template file for unsigned vector saturating arithmetic tests.
* gcc.target/aarch64/advsimd-intrinsics/saturating_arithmetic_autovect_1.c:
8-bit vector type tests.
* gcc.target/aarch64/advsimd-intrinsics/saturating_arithmetic_autovect_2.c:
16-bit vector type tests.
* gcc.target/aarch64/advsimd-intrinsics/saturating_arithmetic_autovect_3.c:
32-bit vector type tests.
* gcc.target/aarch64/advsimd-intrinsics/saturating_arithmetic_autovect_4.c:
64-bit vector type tests.
* gcc.target/aarch64/saturating_arithmetic.inc: Template file
for scalar saturating arithmetic tests.
* gcc.target/aarch64/saturating_arithmetic_1.c: 8-bit tests.
* gcc.target/aarch64/saturating_arithmetic_2.c: 16-bit tests.
* gcc.target/aarch64/saturating_arithmetic_3.c: 32-bit tests.
* gcc.target/aarch64/saturating_arithmetic_4.c: 64-bit tests.

Co-authored-by: Tamar Christina <tamar.christina@arm.com>

AArch64: Use standard names for SVE saturating arithmetic

Rename the existing SVE unpredicated saturating arithmetic instructions
to use standard names which are used by IFN_SAT_ADD and IFN_SAT_SUB.

gcc/ChangeLog:

* config/aarch64/aarch64-sve.md: Rename insns

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve/saturating_arithmetic.inc:
Template file for auto-vectorizer tests.
* gcc.target/aarch64/sve/saturating_arithmetic_1.c:
Instantiate 8-bit vector tests.
* gcc.target/aarch64/sve/saturating_arithmetic_2.c:
Instantiate 16-bit vector tests.
* gcc.target/aarch64/sve/saturating_arithmetic_3.c:
Instantiate 32-bit vector tests.
* gcc.target/aarch64/sve/saturating_arithmetic_4.c:
Instantiate 64-bit vector tests.

Revert "AArch64: Use standard names for saturating arithmetic"

This reverts commit 5f5833a4107ddfbcd87651bf140151de043f4c36.

Revert "AArch64: Use standard names for SVE saturating arithmetic"

This reverts commit 26b2d9f27ca24f0705641a85f29d179fa0600869.

c++: Fix up find_array_ctor_elt RAW_DATA_CST handling [PR118534]

This is the third bug discovered today with the
https://gcc.gnu.org/pipermail/gcc-patches/2025-January/673945.html
hack but then turned into proper testcases where embed-24.C FAILed
since introduction of optimized #embed support and the others when
optimizing large C++ initializers using RAW_DATA_CST.

find_array_ctor_elt already has RAW_DATA_CST support, but on the
following testcases it misses one case I've missed.
The CONSTRUCTORs in question went through the braced_list_to_string
optimization which can turn INTEGER_CST RAW_DATA_CST INTEGER_CST
into just larger RAW_DATA_CST covering even those 2 bytes around it
(if they appear there in the underlying RAW_DATA_OWNER).
With this optimization, RAW_DATA_CST can be the last CONSTRUCTOR_ELTS
elt in a CONSTRUCTOR, either the sole one or say preceeded by some
unrelated other elements.  Now, if RAW_DATA_CST is the only one or
if there are no RAW_DATA_CSTs earlier in CONSTRUCTOR_ELTS, we can
trigger a bug in find_array_ctor_elt.
It has a smart optimization for the very common case where
CONSTRUCTOR_ELTS have indexes and index of the last elt is equal
to CONSTRUCTOR_NELTS (ary) - 1, then obviously we know there are
no RAW_DATA_CSTs before it and the indexes just go from 0 to nelts-1,
so when we care about any of those earlier indexes, we can just return i;
and not worry about anything.
Except it uses if (i < end) return i; rather than if (i < end - 1) return i;
For the latter cases, i.e. anything before the last elt, we know there
are no surprises and return i; is right.  But for the if (i == end - 1)
case, return i; is only correct if the last elt is not RAW_DATA_CST, if it
is RAW_DATA_CST, we still need to split it, which is handled by the code
later in the function.  So, for that we need begin = end - 1, so that the
binary search will just care about that last element.

2025-01-18  Jakub Jelinek  <jakub@redhat.com>

PR c++/118534
* constexpr.cc (find_array_ctor_elt): Don't return i early if
i == end - 1 and the last elt's value is RAW_DATA_CST.

* g++.dg/cpp/embed-24.C: New test.
* g++.dg/cpp1y/pr118534.C: New test.

RISC-V: Remove unused variable in riscv_file_end function.

gcc/ChangeLog:
* config/riscv/riscv.cc: Remove unused variable.

LoongArch: Fix cost model for alsl

Our cost model for alsl was wrong: it matches (a + b * imm) where imm is
1, 2, 3, or 4 (should be 2, 4, 8, or 16), and it does not match
(a + (b << imm)) at all.  For the test case:

    a += c << 3;
    b += c << 3;

it caused the compiler to perform a CSE and make one slli and two add,
but we just want two alsl.

Also add a "code == PLUS" check to prevent matching a - (b << imm) as we
don't have any "slsl" instruction.

gcc/ChangeLog:

* config/loongarch/loongarch.cc (loongarch_rtx_costs): Fix the
cost for (a + b * imm) and (a + (b << imm)) which can be
implemented with a single alsl instruction.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/alsl-cost.c: New test.

LoongArch: Add alsl.wu

On 64-bit capable LoongArch hardware, alsl.wu is similar to alsl.w but
zero-extending the 32-bit result.

gcc/ChangeLog:

* config/loongarch/loongarch.md (alslsi3_extend): Add alsl.wu.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/alsl_wu.c: New test.

Daily bump.

libfortran: G formatting for UNSIGNED [PR118536]

PR libfortran/118536

libgfortran/ChangeLog:

* io/transfer.c (formatted_transfer_scalar_write): Handle UNSIGNED
in G formatting.

gcc/testsuite/ChangeLog:

* gfortran.dg/unsigned_write_2.f90: New test.

[PR118067][LRA]: Check secondary memory mode for the reg class

This is the second patch for the PR for the new test. The patch
solves problem in the case when secondary memory mode (SImode in the
PR test) returned by hook secondary_memory_needed_mode can not be used
for reg class (ALL_MASK_REGS) involved in secondary memory moves. The
patch uses reg mode instead of one returned by
secondary_memory_needed_mode in this case.

gcc/ChangeLog:

PR rtl-optimization/118067
* lra-constraints.cc (invalid_mode_reg_p): New function.
(curr_insn_transform): Use it to check mode returned by target
secondary_memory_needed_mode.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr118067-2.c: New.

testsuite: Make embed-10.c test more robust

With the https://gcc.gnu.org/pipermail/gcc-patches/2025-January/673945.html
hack we get slightly different error wording in one of the errors, given that
the test actually does use #embed, I think both wordings are just fine and
we should accept them.

2025-01-17 Jakub Jelinek <jakub@redhat.com>

* c-c++-common/cpp/embed-10.c: Allow a different error wording for
C++.

d: Add testcase for fixed PR117115

This was fixed in upstream dmd, and merged in r15-6824.

PR d/117115

gcc/testsuite/ChangeLog:

* gdc.dg/pr117115.d: New test.

s390: Replace some checking assertions with output_operand_lossage [PR118511]

r15-2002 s390: Fully exploit vgm, vgbm, vrepi change added
some code to print_operand and added gcc_checking_asserts in there.
But print_operand ideally should have no assertions in it, as most
of the assumptions can be easily violated by people using it in
inline asm.
This issue in particular was seen by failure to compile s390-tools,
which had in its extended inline asm uses of %r1 and %r2.
I really don't know if they meant %%r1 and %%r2 or %1 and %2 and
will leave that decision to the maintainers, but the thing is that
%r1 and %r2 used to expand like %1 and %2 in GCC 14 and earlier,
now in checking build it ICEs and in --enable-checking=release build
fails to assemble (the checking assert is ignored and the compiler just uses
some uninitialized variables to emit something arbitrary).

With the following patch it is diagnosed as error consistently
regardless if it is release checking or no checking or checking compiler.

Note, I see also
      else if (GET_CODE (x) == UNSPEC && XINT (x, 1) == UNSPEC_TLSLDM)
        {
          fprintf (file, "%s", ":tls_ldcall:");
          const char *name = get_some_local_dynamic_name ();
          gcc_assert (name);
          assemble_name (file, name);
        }
in print_operand, maybe that isn't a big deal because it might be
impossible to construct inline asm argument which is UNSPEC_TLSLDM.
And then there is
        case 'e': case 'f':
        case 's': case 't':
          {
            int start, end;
            int len;
            bool ok;

            len = (code == 's' || code == 'e' ? 64 : 32);
            ok = s390_contiguous_bitmask_p (ival, true, len, &start, &end);
            gcc_assert (ok);
            if (code == 's' || code == 't')
              ival = start;
            else
              ival = end;
          }
          break;
which likely should be also output_operand_lossage but I haven't tried
to reproduce that.

2025-01-17  Jakub Jelinek  <jakub@redhat.com>

PR target/118511
* config/s390/s390.cc (print_operand) <case 'p'>: Use
output_operand_lossage instead of gcc_checking_assert.
(print_operand) <case 'q'>: Likewise.
(print_operand) <case 'r'>: Likewise.

* gcc.target/s390/pr118511.c: New test.

AArch64: Use standard names for SVE saturating arithmetic

Rename the existing SVE unpredicated saturating arithmetic instructions
to use standard names which are used by IFN_SAT_ADD and IFN_SAT_SUB.

gcc/ChangeLog:

* config/aarch64/aarch64-sve.md: Rename insns

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve/saturating_arithmetic.inc:
Template file for auto-vectorizer tests.
* gcc.target/aarch64/sve/saturating_arithmetic_1.c:
Instantiate 8-bit vector tests.
* gcc.target/aarch64/sve/saturating_arithmetic_2.c:
Instantiate 16-bit vector tests.
* gcc.target/aarch64/sve/saturating_arithmetic_3.c:
Instantiate 32-bit vector tests.
* gcc.target/aarch64/sve/saturating_arithmetic_4.c:
Instantiate 64-bit vector tests.

AArch64: Use standard names for saturating arithmetic

This renames the existing {s,u}q{add,sub} instructions to use the
standard names {s,u}s{add,sub}3 which are used by IFN_SAT_ADD and
IFN_SAT_SUB.

The NEON intrinsics for saturating arithmetic and their corresponding
builtins are changed to use these standard names too.

Using the standard names for the instructions causes 32 and 64-bit
unsigned scalar saturating arithmetic to use the NEON instructions,
resulting in an additional (and inefficient) FMOV to be generated when
the original operands are in GP registers. This patch therefore also
restores the original behaviour of using the adds/subs instructions
in this circumstance.

Additional tests are written for the scalar and Adv. SIMD cases to
ensure that the correct instructions are used. The NEON intrinsics are
already tested elsewhere.

gcc/ChangeLog:

* config/aarch64/aarch64-builtins.cc: Expand iterators.
* config/aarch64/aarch64-simd-builtins.def: Use standard names
* config/aarch64/aarch64-simd.md: Use standard names, split insn
definitions on signedness of operator and type of operands.
* config/aarch64/arm_neon.h: Use standard builtin names.
* config/aarch64/iterators.md: Add VSDQ_I_QI_HI iterator to
simplify splitting of insn for unsigned scalar arithmetic.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/scalar_intrinsics.c: Update testcases.
* gcc.target/aarch64/advsimd-intrinsics/saturating_arithmetic_autovect.inc:
Template file for unsigned vector saturating arithmetic tests.
* gcc.target/aarch64/advsimd-intrinsics/saturating_arithmetic_autovect_1.c:
8-bit vector type tests.
* gcc.target/aarch64/advsimd-intrinsics/saturating_arithmetic_autovect_2.c:
16-bit vector type tests.
* gcc.target/aarch64/advsimd-intrinsics/saturating_arithmetic_autovect_3.c:
32-bit vector type tests.
* gcc.target/aarch64/advsimd-intrinsics/saturating_arithmetic_autovect_4.c:
64-bit vector type tests.
* gcc.target/aarch64/saturating_arithmetic.inc: Template file
for scalar saturating arithmetic tests.
* gcc.target/aarch64/saturating_arithmetic_1.c: 8-bit tests.
* gcc.target/aarch64/saturating_arithmetic_2.c: 16-bit tests.
* gcc.target/aarch64/saturating_arithmetic_3.c: 32-bit tests.
* gcc.target/aarch64/saturating_arithmetic_4.c: 64-bit tests.

Co-authored-by: Tamar Christina <tamar.christina@arm.com>

rs6000, Remove redundant built-in __builtin_vsx_xvcvuxwdp

The built-in __builtin_vsx_xvcvuxwdp can be covered with PVIPR
function vec_doubleo on LE and vec_doublee on BE. There are no test
cases or documentation for __builtin_vsx_xvcvuxwdp. This patch
removes the redundant built-in.

gcc/ChangeLog:
* config/rs6000/rs6000-builtins.def (__builtin_vsx_xvcvuxwdp):
Remove built-in definition.

rs6000, remove built-ins __builtin_vsx_vperm_8hi and __builtin_vsx_vperm_8hi_uns

The two built-ins __builtin_vsx_vperm_8hi and __builtin_vsx_vperm_8hi_uns
are redundant. The are covered by the overloaded vec_perm built-in. The
built-ins are not documented and do not have test cases.

The removal of these built-ins was missed in commit gcc r15-1923 on
7/9/2024.

This patch removes the redundant built-ins.

gcc/ChangeLog:
* config/rs6000/rs6000-builtins.def (__builtin_vsx_vperm_8hi,
__builtin_vsx_vperm_8hi_uns): Remove built-in definitions.

rs6000, add testcases to the overloaded vec_perm built-in

The overloaded vec_perm built-in supports permuting signed and unsigned
vectors of char, bool char, short int, short bool, int, bool, long long
int, long long bool, int128, float and double.  However, not all of the
supported arguments are included in the test cases.  This patch adds
the missing test cases.

Additionally, in the 128-bit debug print statements the expected result and
the result need to be cast to unsigned long long to print correctly.  The
patch makes this additional change to the print statements.

gcc/ChangeLog:
* doc/extend.texi: Fix spelling mistake in description of the
vec_sel built-in.  Add documentation of the 128-bit vec_perm
instance.

gcc/testsuite/ChangeLog:
* gcc.target/powerpc/vsx-builtin-3.c: Add vec_perm test cases for
arguments of type vector signed long long int, long long bool,
bool, bool short, bool char and pixel, vector unsigned long long
int, unsigned int, unsigned short int, unsigned char.  Cast
arguments for debug prints to unsigned long long.
* gcc.target/powerpc/builtins-4-int128-runnable.c: Add vec_perm
test cases for signed and unsigned int128 arguments.

rs6000, fix test builtins-1-p10-runnable.c

The test has two issues:

1) The test should generate execute abort() if an error is found.
However, the test contains a #define 0 which actually enables the
error prints not exectuting void() because the debug code is protected
by an #ifdef not #if. The #define DEBUG needs to be removed to so the
test will abort on an error.

2) The vec_i_expected output was tweeked to test that it would fail.
The test value was not removed.

By removing the #define DEBUG, the test fails and reports 1 failure.
Removing the intentionally wrong expected value results in the test
passing with no errors as expected.

gcc/testsuite/ChangeLog:
* gcc.target/powerpc/builtins-1-p10-runnable.c: Remove #define
DEBUG. Replace vec_i_expected value with correct value.

AVR: Add "const" attribute to avr built-in functions if possible.

gcc/
* config/avr/avr-c.cc (DEF_BUILTIN): Add ATTRS argument to macro
definition.
* config/avr/avr.cc: Same.
(avr_init_builtins) <attr_const>: New variable that can be used
as ATTRS argument in DEF_BUILTIN.
* config/avr/builtins.def (DEF_BUILTIN): Add ATTRS parameter
to all definitions.

c++/modules: Propagate FNDECL_USED_AUTO when propagating deduced return types [PR118049]

In the linked testcase, we're erroring because the declared return types
of the functions do not appear to match. This is because when merging
the deduced return types for 'foo' in 'auto-5_b.C', we overwrote the
return type for the declaration with the deduced return type from
'auto-5_a.C' but neglected to track that we were originally declared
with 'auto'.

As a drive-by improvement to QOI, also add checks for if the deduced
return types do not match; this is currently useful because we do not
check the equivalence of the bodies of functions yet.

PR c++/118049

gcc/cp/ChangeLog:

* module.cc (trees_in::is_matching_decl): Propagate
FNDECL_USED_AUTO as well.

gcc/testsuite/ChangeLog:

* g++.dg/modules/auto-5_a.C: New test.
* g++.dg/modules/auto-5_b.C: New test.
* g++.dg/modules/auto-5_c.C: New test.
* g++.dg/modules/auto-6_a.H: New test.
* g++.dg/modules/auto-6_b.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>

OpenMP/C++: Fix declare_variant's 'adjust_args' if there is a 'this' pointer [PR118321]

The adjust_args clause is internally store as the i-th argument to the function,
which fails if hidden arguments come before. This commit handles the C++ 'this'
pointer by shifting the internal arg index by one.

PR fortran/118321

gcc/cp/ChangeLog:

* decl.cc (omp_declare_variant_finalize_one): Shift adjust_args index
by one for non-static class function's 'this' pointer.

gcc/testsuite/ChangeLog:

* g++.dg/gomp/adjust-args-4.C: New test.

c++: Allow pragmas in NSDMIs [PR118147]

This patch removes the (unnecessary) CPP_PRAGMA_EOL case from
cp_parser_cache_defarg, which currently has the result that any pragmas
in the NSDMI cause an error.

PR c++/118147

gcc/cp/ChangeLog:

* parser.cc (cp_parser_cache_defarg): Don't error when
CPP_PRAGMA_EOL.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/nsdmi-defer7.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>

testsuite/117958 - ifcombine differences on aarch64 vs rest

ifcombine depends on BRANCH_COST and the testcase relies on ifcombine
to fully optimize the function. But the important parts are optimized
everywhere, so the following delectively XFAILs the less important part.

PR testsuite/117958
* g++.dg/tree-ssa/pr117123.C: XFAIL parts on aarch64-*-*.

AVR: Use INT_N to built-in define __int24.

This patch uses the INT_N interface to define __int24 in avr-modes.def.

Since the testsuite uses -Wpedantic and __int24 is a C/C++ extension,
uses of __int24 and __uint24 is now marked as __extension__.

PR target/118329
gcc/
* config/avr/avr-modes.def: Add INT_N (PSI, 24).
* config/avr/avr.cc (avr_init_builtin_int24)
<__int24>: Remove definition.
<__uint24>: Adjust definition to INT_N interface.
gcc/testsuite/
* gcc.target/avr/pr115830-add.c (__int24, __uint24): Add __extension__
to respective typedefs.
* gcc.target/avr/pr115830-sub-ext.c: Same.
* gcc.target/avr/pr115830-sub.c: Same.
* gcc.target/avr/torture/get-mem.c: Same.
* gcc.target/avr/torture/set-mem.c: Same.
* gcc.target/avr/torture/ifelse-c.h: Same.
* gcc.target/avr/torture/ifelse-d.h: Same.
* gcc.target/avr/torture/ifelse-q.h: Same.
* gcc.target/avr/torture/ifelse-r.h: Same.
* gcc.target/avr/torture/int24-mul.c: Same.
* gcc.target/avr/torture/pr109907-2.c: Same.
* gcc.target/avr/torture/pr61443.c: Same.
* gcc.target/avr/torture/pr63633-ice-mult.c: Same.
* gcc.target/avr/torture/shift-l-u24.c: Same.
* gcc.target/avr/torture/shift-r-i24.c: Same.
* gcc.target/avr/torture/shift-r-u24.c: Same.
* gcc.target/avr/torture/add-extend.c: Same.
* gcc.target/avr/torture/sub-extend.c: Same.
* gcc.target/avr/torture/sub-zerox.c: Same.
* gcc.target/avr/torture/test-gprs.h: Same.

match.pd: Fix (FTYPE) N CMP (FTYPE) M optimization for GENERIC [PR118522]

The last case of this optimization assumes that if 2 integral types
have same precision and TYPE_UNSIGNED, then they are uselessly convertible.
While that is very likely the case for GIMPLE, it is not the case for
GENERIC, so the following patch adds there a convert so that the
optimization produces also valid GENERIC. Without it we got
(int) p == b where b had _BitInt(32) type, so incompatible types.

2025-01-17 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/118522
* match.pd ((FTYPE) N CMP (FTYPE) M): Add convert, as in GENERIC
integral types with the same precision and sign might actually not
be compatible types.

* gcc.dg/bitint-120.c: New test.

c++: Friend classes don't shadow enclosing template class paramater [PR118255]

We currently reject the following code

=== code here ===
template <int non_template> struct S { friend class non_template; };
class non_template {};
S<0> s;
=== code here ===

While EDG agrees with the current behaviour, clang and MSVC don't (see
https://godbolt.org/z/69TGaabhd), and I believe that this code is valid,
since the friend clause does not actually declare a type, so it cannot
shadow anything. The fact that we didn't error out if the non_template
class was declared before S backs this up as well.

This patch fixes this by skipping the call to check_template_shadow for
hidden bindings.

PR c++/118255

gcc/cp/ChangeLog:

* name-lookup.cc (pushdecl): Don't call check_template_shadow
for hidden bindings.

gcc/testsuite/ChangeLog:

* g++.dg/lookup/pr99116-1.C: Adjust test expectation.
* g++.dg/template/friend84.C: New test.

tree-optimization/92539 - missed optimization leads to bogus -Warray-bounds

The following makes niter analysis recognize a loop with an exit
condition scanning over a STRING_CST.  This is done via enhancing
the force evaluation code rather than recognizing for example
strlen (s) as number of iterations because it allows to handle
some more cases.

STRING_CSTs are easy to handle since nothing can write to them, also
processing those should be cheap.  I've refrained from handling
anything besides char8_t.

Note to avoid the -Warray-bound dianostic we have to either early unroll
the loop (there's no final value replacement done, there's a PR
for doing this as part of CD-DCE when possibly eliding a loop),
or create a canonical IV so we can DCE the loads.  The latter is what
the patch does, also avoiding to repeatedly force-evaluate niters.
This also makes final value replacement work again since now ivcanon
is after it.

There are some testsuite adjustments needed, in particular we now
unroll some loops early, causing messages to appear in different
passes but also vectorization to now no longer happening on
outer loops.  The changes mitigate that.

PR tree-optimization/92539
* tree-ssa-loop-ivcanon.cc (tree_unroll_loops_completely_1):
Also try force-evaluation if ivcanon did not yet run.
(canonicalize_loop_induction_variables):
When niter was computed constant by force evaluation add a
canonical IV if we didn't unroll.
* tree-ssa-loop-niter.cc (loop_niter_by_eval): When we
don't find a proper PHI try if the exit condition scans
over a STRING_CST and simulate that.

* g++.dg/warn/Warray-bounds-pr92539.C: New testcase.
* gcc.dg/tree-ssa/sccp-16.c: New testcase.
* g++.dg/vect/pr87621.cc: Use larger power to avoid
inner loop unrolling.
* gcc.dg/vect/pr89440.c: Use larger loop bound to avoid
inner loop unrolling.
* gcc.dg/pr77975.c: Scan cunrolli dump and adjust.

OpenMP: Fix metadirective test failures on x86_64 with -m32

gcc/testsuite/ChangeLog
* c-c++-common/gomp/metadirective-device.c: Don't add extra options
for target ia32.
* c-c++-common/gomp/metadirective-target-device-1.c: Likewise.

RISC-V: Add -fcf-protection=[full|branch|return] to enable zicfiss, zicfilp.

gcc/ChangeLog:
* config/riscv/riscv.cc
(is_zicfilp_p): New function.
(is_zicfiss_p): New function.
* config/riscv/riscv-zicfilp.cc: Update.
* config/riscv/riscv.h: Update.
* config/riscv/riscv.md: Update.
* config/riscv/riscv-c.cc: Add CFI predefine marco.

gcc/testsuite/ChangeLog:
* c-c++-common/fcf-protection-1.c: Update.
* c-c++-common/fcf-protection-2.c: Update.
* c-c++-common/fcf-protection-3.c: Update.
* c-c++-common/fcf-protection-4.c: Update.
* c-c++-common/fcf-protection-5.c: Update.
* c-c++-common/fcf-protection-6.c: Update.
* c-c++-common/fcf-protection-7.c: Update.
* gcc.target/riscv/ssp-1.c: Update.
* gcc.target/riscv/ssp-2.c: Update.
* gcc.target/riscv/zicfilp-call.c: Update.
* gcc.target/riscv/interrupt-no-lpad.c: Update.

RISC-V: Add .note.gnu.property for ZICFILP and ZICFISS ISA extension

gcc/ChangeLog:
* config/riscv/riscv.cc
(riscv_file_end): Add .note.gnu.property.

libgcc/ChangeLog:
* config/riscv/crti.S: Add lpad instructions.
* config/riscv/crtn.S: Likewise.
* config/riscv/save-restore.S: Likewise.
* config/riscv/riscv-asm.h: Add GNU_PROPERTY for ZICFILP,
ZICFISS.

Co-Developed-by: Jesse Huang <jesse.huang@sifive.com>

RISC-V: Add Zicfilp ISA extension.

This patch only support landing pad value is 0.
The next version will implement function signature based labeling
scheme.

RISC-V CFI SPEC: https://github.com/riscv/riscv-cfi

gcc/ChangeLog:
* common/config/riscv/riscv-common.cc: Add ZICFILP ISA
string.
* config.gcc: Add riscv-zicfilp.o
* config/riscv/riscv-passes.def (INSERT_PASS_BEFORE):
Insert landing pad instructions.
* config/riscv/riscv-protos.h (make_pass_insert_landing_pad):
Declare.
* config/riscv/riscv-zicfilp.cc: New file.
* config/riscv/riscv.cc
(riscv_trampoline_init): Add landing pad instructions.
(riscv_legitimize_call_address): Likewise.
(riscv_output_mi_thunk): Likewise.
* config/riscv/riscv.h: Update.
* config/riscv/riscv.md: Add landing pad patterns.
* config/riscv/riscv.opt (TARGET_ZICFILP): Define.
* config/riscv/t-riscv: Add build rule for
riscv-zicfilp.o

gcc/testsuite/ChangeLog:
* gcc.target/riscv/interrupt-no-lpad.c: New test.
* gcc.target/riscv/zicfilp-call.c: New test.

Co-Developed-by: Greg McGary <gkm@rivosinc.com>,
Kito Cheng <kito.cheng@gmail.com>

RISC-V: Add Zicfiss ISA extension.

This patch is implemented according to the RISC-V CFI specification.
It supports the generation of shadow stack instructions in the prologue,
epilogue, non-local gotos, and unwinding.

RISC-V CFI SPEC: https://github.com/riscv/riscv-cfi

gcc/ChangeLog:
* common/config/riscv/riscv-common.cc: Add ZICFISS ISA string.
* config/riscv/predicates.md: New predicate x1x5_operand.
* config/riscv/riscv.cc
(riscv_expand_prologue): Insert shadow stack instructions.
(riscv_expand_epilogue): Likewise.
(riscv_for_each_saved_reg): Assign t0 or ra register for
sspopchk instruction.
(need_shadow_stack_push_pop_p): New function. Omit shadow
stack operation on leaf function.
* config/riscv/riscv.h
(need_shadow_stack_push_pop_p): Define.
* config/riscv/riscv.md: Add shadow stack patterns.
(save_stack_nonlocal): Add shadow stack instructions for setjump.
(restore_stack_nonlocal): Add shadow stack instructions for longjump.
* config/riscv/riscv.opt (TARGET_ZICFISS): Define.

libgcc/ChangeLog:
* config/riscv/linux-unwind.h: Include shadow-stack-unwind.h.
* config/riscv/shadow-stack-unwind.h
(_Unwind_Frames_Extra): Define.
(_Unwind_Frames_Increment): Define.

gcc/testsuite/ChangeLog:
* gcc.target/riscv/ssp-1.c: New test.
* gcc.target/riscv/ssp-2.c: New test.

Co-Developed-by: Greg McGary <gkm@rivosinc.com>,
Kito Cheng <kito.cheng@gmail.com>

Daily bump.

d: Fix record layout of compiler-generated TypeInfo_Class [PR115249]

In r14-8766, the layout of TypeInfo_Class changed in the runtime
library, but didn't get reflected in the compiler-generated data,
causing a corruption of runtime type introspection on BigEndian targets.

This adjusts the size of the `ClassFlags' field from uint to ushort, and
adds a new ushort `depth' field in the space where ClassFlags used to
occupy.

PR d/115249

gcc/d/ChangeLog:

* typeinfo.cc (create_tinfo_types): Update internal Typenfo
representation.
(TypeInfoVisitor::visit (TypeInfoClassDeclaration *)): Likewise.

c++: RESULT_DECL replacement w/ non-reduced ctx->object [PR105440]

After surgically replacing RESULT_DECL within a constexpr call result
(for sake of RVO), we can in some cases simplify the call result
further.

In the below testcase the result of get() during evaluation of a's
initializer is the self-referential CONSTRUCTOR:

  {._M_p=(char *) &<retval>._M_local_buf}

which after replacing RESULT_DECL with ctx->object (aka *D.2603, where
the D.2603 temporary points to the current element of _M_elems under
construction) becomes:

  {._M_p=(char *) &D.2603->_M_local_buf}

but what we really want is:

  {._M_p=(char *) &a._M_elems[0]._M_local_buf}.

so that the value of _M_p is independent of the value of the mutable
D.2603 temporary.

So to that end, it seems we should constexpr evaluate the result again
after RESULT_DECL replacement, which is what this patch implements.

PR c++/105440

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_call_expression): If any RESULT_DECLs get
replaced in the call result, try further evaluating the result.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/constexpr-dtor17.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>

[testsuite] [arm] adjust wmul expectations [PR113560]

Since the machine-independent widening multiply logic was improved
PR113560, ARM's wmul-[567].c fail. AFAICT the logic takes advantage
of the fact that, after zero-extending a narrow integral type to a
wider type, further zero- or sign-extending is equivalent, which
enables different instructions to be used for equivalent effect.

Adjust the tests to accept all the equivalent instructions that can be
used.

for gcc/testsuite/ChangeLog

PR target/113560
* gcc.target/arm/wmul-5.c: Accept other mla instructions.
* gcc.target/arm/wmul-6.c: Likewise.
* gcc.target/arm/wmul-7.c: Likewise.

[testsuite] [arm] multilibs.exp: adjust float abi opt matching

The regexp that matches options that mess with multilibs matches
-mfloat=abi=, but that's probably a typo for -mfloat-abi=. Fix that,
and add -msoft-float and -mhard-float.

for gcc/testsuite/ChangeLog

* gcc.target/arm/multilib.exp: Skip if -mfloat-abi=* or any of
its aliases are used.

[testsuite] skip test on non-hosted libstdc++ [PR113994]

Tests that include <string> need to be skipped when libstdc++ is built
in freestanding mode.

for gcc/testsuite/ChangeLog

PR rtl-optimization/113994
* g++.dg/torture/pr113994.C: Require hosted libstdc++.

[testsuite] drop explicit run overrider in more dfp tests

A few more dfp tests that recently got backported to gcc-14 override
dfp.exp's selection of default action depending on dfprt. Let the
default stand.

for gcc/testsuite/ChangeLog

* gcc.dg/dfp/pr102674.c: Use the default dg-do.
* gcc.dg/dfp/pr43374.c: Likewise.

[testsuite] rearrange requirements for dfp bitint run tests

dfp.exp sets the default to compile when dfprt is not available, but
some dfp bitint tests override the default without that requirement,
and try to run even when dfprt is not available.

Instead of overriding the default, rewrite the requirements so that
they apply even when compiling, since the absence of bitint or of
int128 would presumably cause compile failures.

for gcc/testsuite/ChangeLog

* gcc.dg/dfp/bitint-1.c: Rewrite requirements to retain dfprt.
* gcc.dg/dfp/bitint-2.c: Likewise.
* gcc.dg/dfp/bitint-3.c: Likewise.
* gcc.dg/dfp/bitint-4.c: Likewise.
* gcc.dg/dfp/bitint-5.c: Likewise.
* gcc.dg/dfp/bitint-6.c: Likewise.
* gcc.dg/dfp/bitint-7.c: Likewise.
* gcc.dg/dfp/bitint-8.c: Likewise.
* gcc.dg/dfp/int128-1.c: Likewise.
* gcc.dg/dfp/int128-2.c: Likewise.
* gcc.dg/dfp/int128-3.c: Likewise.
* gcc.dg/dfp/int128-4.c: Likewise.

Fortran/OpenMP: Fix declare_variant's 'adjust_args' mishandling with return by reference [PR118321]

declare_variant's 'adjust_args' clause references the arguments in the
middle end by the argument position; this has to account for hidden
arguments that are inserted before due to return by reference,
as done in this commit.

PR fortran/118321

gcc/fortran/ChangeLog:

* trans-openmp.cc (gfc_trans_omp_declare_variant): Honor hidden
arguments for append_arg's need_device_ptr.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/adjust-args-12.f90: New test.

c++: explicit spec of constrained member tmpl [PR107522]

When defining a explicit specialization of a constrained member template
(of a class template) such as f and g in the below testcase, the
DECL_TEMPLATE_PARMS of the corresponding TEMPLATE_DECL are partially
instantiated, whereas its associated constraints are carried over
from the original template and thus are in terms of the original
DECL_TEMPLATE_PARMS. So during normalization for such an explicit
specialization we need to consider the (parameters of) the most general
template, since that's what the constraints are in terms of and since we
always use the full set of template arguments during satisfaction.

PR c++/107522

gcc/cp/ChangeLog:

* constraint.cc (get_normalized_constraints_from_decl): Use the
most general template for an explicit specialization of a
member template.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-explicit-spec7.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>

c++: pack expansion arg vs non-pack parm checking ICE [PR118454]

During ahead of time template argument coercion, we handle the case of
passing a pack expansion to a non-pack parameter by breaking out early
and using the original unconverted arguments, deferring coercion until
instantiation time where we have concrete arguments.

This PR illustrates we still need to strip typedefs from the original
arguments in this case as in the ordinary case, for sake of our template
argument hashing/equivalence routines which assume template arguments
went through strip_typedefs.

Since we're using the unconverted arguments we need to preserve
injected-class-name typedefs because we use them to distinguish passing
an injected-class-name vs the corresponding specialization as the
argument to a template template parameter (the former is valid, the
latter isn't).

PR c++/118454

gcc/cp/ChangeLog:

* cp-tree.h (STF_KEEP_INJ_CLASS_NAME): Define.
* pt.cc (iterative_hash_template_argument) <case tcc_type>:
Clarify comment for when we'd see an alias template
specialization here.
(coerce_template_parms): Strip typedefs (except for
injected-class-names) in the pack expansion early break cases
that defer coercion.
* tree.cc (strip_typedefs): Don't strip an injected-class-name
if STF_KEEP_INJ_CLASS_NAME is set.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/variadic187.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>

c++: make finish_pseudo_destructor_expr SFINAE-aware [PR116417]

PR c++/116417

gcc/cp/ChangeLog:

* cp-tree.h (finish_pseudo_destructor_expr): Add complain
parameter.
* parser.cc (cp_parser_postfix_dot_deref_expression): Pass
complain=tf_warning_or_error to finish_pseudo_destructor_expr.
* pt.cc (tsubst_expr): Pass complain to
finish_pseudo_destructor_expr.
* semantics.cc (finish_pseudo_destructor_expr): Check complain
before emitting a diagnostic.

gcc/testsuite/ChangeLog:

* g++.dg/template/pseudodtor7.C: New test.

Reviewed-by: Marek Polacek <polacek@redhat.com>
Reviewed-by: Jason Merrill <jason@redhat.com>