git.ipfire.org Git - thirdparty/gcc.git/log

aarch64: force operand to fresh register to avoid subreg issues [PR118892]

When the input is already a subreg and we try to make a paradoxical
subreg out of it for copysign this can fail if it violates the subreg
relationship.

Use force_lowpart_subreg instead of lowpart_subreg to then force the
results to a register instead of ICEing.

gcc/ChangeLog:

PR target/118892
* config/aarch64/aarch64.md (copysign<GPF:mode>3): Use
force_lowpart_subreg instead of lowpart_subreg.

gcc/testsuite/ChangeLog:

PR target/118892
* gcc.target/aarch64/copysign-pr118892.c: New test.

libstdc++: Remove stray comma in testing docs

libstdc++-v3/ChangeLog:

* doc/xml/manual/test.xml: Remove stray comma.
* doc/html/manual/test.html: Regenerate.

Fix folding of BIT_NOT_EXPR for POLY_INT_CST [PR118976]

There was an embarrassing typo in the folding of BIT_NOT_EXPR for
POLY_INT_CSTs: it used - rather than ~ on the poly_int. Not sure
how that happened, but it might have been due to the way that
~x is implemented as -1 - x internally.
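
The difference between the two operators can be seen on a toy model. The sketch below assumes a hypothetical two-coefficient type `Poly2` standing for c0 + c1 * N; it is not GCC's poly_int, only an illustration of why negation and bitwise-not differ by one in the constant term.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a two-coefficient polynomial c0 + c1 * N.
// This is NOT GCC's poly_int, just a model for illustration.
struct Poly2 {
  std::int64_t c0;
  std::int64_t c1;
};

// Negation: -(c0 + c1*N) = (-c0) + (-c1)*N.
inline Poly2 poly_neg(Poly2 p) { return Poly2{-p.c0, -p.c1}; }

// Bitwise not through the identity ~x == -1 - x:
// ~(c0 + c1*N) = (-1 - c0) + (-c1)*N.
inline Poly2 poly_not(Poly2 p) { return Poly2{-1 - p.c0, -p.c1}; }

// Evaluate the polynomial for a concrete runtime value of N.
inline std::int64_t poly_eval(Poly2 p, std::int64_t n) {
  return p.c0 + p.c1 * n;
}
```

For any N, evaluating `poly_not (p)` should match applying `~` to the evaluated value of `p`, while `poly_neg (p)` is off by one.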

gcc/
PR tree-optimization/118976
* fold-const.cc (const_unop): Use ~ rather than - for BIT_NOT_EXPR.
* config/aarch64/aarch64.cc (aarch64_test_sve_folding): New function.
(aarch64_run_selftests): Run it.

simplify-rtx: Fix up simplify_logical_relational_operation [PR119002]

The following testcase is miscompiled on powerpc64le-linux starting with
r15-6777.  During combine we see:

(set (reg:SI 134)
    (ior:SI (ge:SI (reg:CCFP 128)
            (const_int 0 [0]))
        (lt:SI (reg:CCFP 128)
            (const_int 0 [0]))))

The simplify_logical_relational_operation code (in its current form)
was written with arithmetic rather than CC modes in mind.  Since CCFP
is a CC mode, it fails the HONOR_NANS check, and so the function assumes
that ge | lt => true.
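
Why ge | lt is not always true for floating-point operands can be shown at the source level. This is a minimal sketch assuming IEEE 754 semantics (so no -ffast-math); the function name is made up for the illustration.

```cpp
#include <cassert>
#include <cmath>

// (a >= b) || (a < b): true for any ordered pair of operands, but false
// when either operand is a NaN, because both comparisons then fail.
inline bool ge_or_lt(double a, double b) { return (a >= b) || (a < b); }
```

With ordered inputs the function returns true, whereas `ge_or_lt (NaN, 0.0)` returns false, which is the case a fold to "true" would get wrong.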

If one comparison is unsigned then it should be safe to assume that
the other comparison is also unsigned, even for CC modes, since the
optimisation checks that the comparisons are between the same operands.
For the other cases, we can only safely fold comparisons of CC mode
values if the result is always-true (15) or always-false (0).

It turns out that the original testcase for PR117186, which ran at -O,
was relying on the old behaviour for some of the functions.  It needs
4-instruction combinations, and so -fexpensive-optimizations, to pass
in its intended form.

gcc/
PR rtl-optimization/119002
* simplify-rtx.cc
(simplify_context::simplify_logical_relational_operation): Handle
comparisons between CC values.  If there is no evidence that the
CC values are unsigned, restrict the fold to always-true or
always-false results.

gcc/testsuite/
* gcc.c-torture/execute/ieee/pr119002.c: New test.
* gcc.target/aarch64/pr117186.c: Run at -O2 rather than -O.

Co-authored-by: Jakub Jelinek <jakub@redhat.com>

testsuite: Add tests for already fixed PR [PR119071]

Uros' r15-7793 fixed this PR as well, I'm just committing tests
from the PR so that it can be closed.

2025-03-04 Jakub Jelinek <jakub@redhat.com>

PR rtl-optimization/119071
* gcc.dg/pr119071.c: New test.
* gcc.c-torture/execute/pr119071.c: New test.

Fortran: Prevent ICE when getting caf-token from abstract type [PR77872]

PR fortran/77872

gcc/fortran/ChangeLog:

* trans-expr.cc (gfc_get_tree_for_caf_expr): Pick up token from
decl when it is present there for class types.

gcc/testsuite/ChangeLog:

* gfortran.dg/coarray/class_1.f90: New test.

Fortran: Reduce code complexity [PR77872]

PR fortran/77872

gcc/fortran/ChangeLog:

* trans-expr.cc (gfc_conv_procedure_call): Use attr instead of
doing type check and branching for BT_CLASS.

tree-optimization/119096 - bogus conditional reduction vectorization

When we vectorize a .COND_ADD reduction and apply the single-use-def
cycle optimization we can end up choosing the wrong else value for
subsequent .COND_ADD. The following rectifies this.
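
The role of the else value can be illustrated with a scalar model. This sketch does not reproduce the vectorizer's transformation; the helper names are hypothetical and it only assumes that a conditional add yields the else value for inactive elements.

```cpp
#include <cassert>
#include <cstddef>

// Scalar model of a conditional add: a + b when the mask is set,
// otherwise the caller-supplied else value.
inline int cond_add(bool mask, int a, int b, int else_value) {
  return mask ? a + b : else_value;
}

// Masked sum where every step passes the running accumulator as the else
// value, so inactive elements leave the sum unchanged.
inline int masked_sum(const bool *mask, const int *x, std::size_t n) {
  int acc = 0;
  for (std::size_t i = 0; i < n; ++i)
    acc = cond_add(mask[i], acc, x[i], /*else_value=*/acc);
  return acc;
}

// Same loop with a stale else value (the initial accumulator): an
// inactive element throws away everything accumulated so far.
inline int masked_sum_stale_else(const bool *mask, const int *x,
                                 std::size_t n) {
  const int init = 0;
  int acc = init;
  for (std::size_t i = 0; i < n; ++i)
    acc = cond_add(mask[i], acc, x[i], /*else_value=*/init);
  return acc;
}
```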

PR tree-optimization/119096
* tree-vect-loop.cc (vect_transform_reduction): Use the
correct else value for .COND_fn.

* gcc.dg/vect/pr119096.c: New testcase.

RISC-V: Fix the test case bug-3.c failure

The bug-3.c test checks for slli a[0-9]+, a[0-9]+, 33 to cover the
big poly int handling. But the underlying insn may change to slli 1
+ slli 32 depending on the optimizations applied. Thus, update the asm
check to a function body check that matches the slli 1 + slli 32 sequence.
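
That the two instruction sequences compute the same value can be checked with a small sketch; the function names are invented for the illustration and the operand is assumed to be a 64-bit register value.

```cpp
#include <cassert>
#include <cstdint>

// Single shift by 33, as in "slli rd, rs, 33".
inline std::uint64_t shift_once(std::uint64_t x) { return x << 33; }

// Shift by 1 followed by a shift by 32, as in "slli 1" + "slli 32".
inline std::uint64_t shift_twice(std::uint64_t x) { return (x << 1) << 32; }
```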

The following test suite passes with this patch.
* The full rv64gcv regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/bug-3.c: Update asm check to
function body check.

Signed-off-by: Pan Li <pan2.li@intel.com>

Daily bump.

Update .po files

gcc/po/
* be.po, da.po, de.po, el.po, es.po, fi.po, fr.po, hr.po, id.po,
ja.po, ka.po, nl.po, ru.po, sr.po, sv.po, tr.po, uk.po, vi.po,
zh_CN.po, zh_TW.po: Update.

libcpp/po/
* be.po, ca.po, da.po, de.po, el.po, eo.po, es.po, fi.po, fr.po,
id.po, ja.po, ka.po, nl.po, pt_BR.po, ro.po, ru.po, sr.po, sv.po,
tr.po, uk.po, vi.po, zh_CN.po, zh_TW.po: Update.

Fortran: reject empty derived type with bind(C) attribute [PR101577]

PR fortran/101577

gcc/fortran/ChangeLog:

* symbol.cc (verify_bind_c_derived_type): Generate error message
for derived type with no components in standard conformance mode,
indicating that this is a GNU extension.

gcc/testsuite/ChangeLog:

* gfortran.dg/empty_derived_type.f90: Adjust dg-options.
* gfortran.dg/empty_derived_type_2.f90: New test.

aarch64: Ignore target pragmas while defining intrinsics

Refactor the switcher classes into two separate classes:

- sve_alignment_switcher takes the alignment switching functionality,
  and is used only for ABI correctness when defining sve structure
  types.
- aarch64_target_switcher takes the rest of the functionality of
  aarch64_simd_switcher and sve_switcher, and gates simd/sve specific
  parts upon the specified feature flags.

Additionally, aarch64_target_switcher now adds dependencies of the
specified flags (which adds +fcma and +bf16 to some intrinsic
declarations), and unsets current_target_pragma.

This last change fixes an internal bug where we would sometimes add a
user specified target pragma (stored in current_target_pragma) on top of
an internally specified target architecture while initialising
intrinsics with `#pragma GCC aarch64 "arm_*.h"`.  As far as I can tell, this
has no visible impact at the moment.  However, the unintended target
feature combinations lead to unwanted behaviour in an under-development
patch.

This also fixes a missing Makefile dependency, which was due to
aarch64-sve-builtins.o incorrectly depending on the undefined $(REG_H).
The correct $(REGS_H) dependency is added to the switcher's new source
location.

gcc/ChangeLog:

* common/config/aarch64/aarch64-common.cc
(struct aarch64_extension_info): Add field.
(aarch64_get_required_features): New.
* config/aarch64/aarch64-builtins.cc
(aarch64_simd_switcher::aarch64_simd_switcher): Rename to...
(aarch64_target_switcher::aarch64_target_switcher): ...this,
and extend to handle sve, nosimd and target pragmas.
(aarch64_simd_switcher::~aarch64_simd_switcher): Rename to...
(aarch64_target_switcher::~aarch64_target_switcher): ...this,
and extend to handle sve, nosimd and target pragmas.
(handle_arm_acle_h): Use aarch64_target_switcher.
(handle_arm_neon_h): Rename switcher and pass explicit flags.
(aarch64_general_init_builtins): Ditto.
* config/aarch64/aarch64-protos.h
(class aarch64_simd_switcher): Rename to...
(class aarch64_target_switcher): ...this, and add new members.
(aarch64_get_required_features): New prototype.
* config/aarch64/aarch64-sve-builtins.cc
(sve_switcher::sve_switcher): Delete
(sve_switcher::~sve_switcher): Delete
(sve_alignment_switcher::sve_alignment_switcher): New
(sve_alignment_switcher::~sve_alignment_switcher): New
(register_builtin_types): Use alignment switcher
(init_builtins): Rename switcher.
(handle_arm_neon_sve_bridge_h): Ditto.
(handle_arm_sme_h): Ditto.
(handle_arm_sve_h): Ditto, and use alignment switcher.
* config/aarch64/aarch64-sve-builtins.h
(class sve_switcher): Delete.
(class sme_switcher): Delete.
(class sve_alignment_switcher): New.
* config/aarch64/t-aarch64 (aarch64-builtins.o): Add $(REGS_H).
(aarch64-sve-builtins.o): Remove $(REG_H).

arm: remove some redundant zero_extend ops on thumb1

The code in gcc.target/unsigned-extend-1.c really should not need any
unsigned extension operations when the optimizers are used.  For Arm
and thumb2 that is indeed the case, but for thumb1 code it gets more
complicated as there are too many instructions for combine to look at.
For thumb1 we end up with two redundant zero_extend patterns which are
not removed: the first after the subtract instruction and the second of
the final boolean result.

We can partially fix this (for the second case above) by adding a new
split pattern for LEU and GEU patterns, which works because the two
instructions for the [LG]EU pattern plus the redundant extension
instruction are combined into a single insn, which we can then split
using the 3->2 method back into the two insns of the [LG]EU sequence.

Because we're missing the optimization for all thumb1 cases (not just
those architectures with UXTB), I've adjusted the testcase to detect all
the idioms that we might use for zero-extending a value, namely:

       UXTB
       AND ...#255 (in thumb1 this would require a register to hold 255)
       LSL ... #24; LSR ... #24

but I've also marked this test as XFAIL for thumb1 because we can't yet
eliminate the first of the two extend instructions.
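
The three idioms listed above are equivalent ways of zero-extending the low byte of a 32-bit value. A minimal sketch of that equivalence follows; the function names are made up for the illustration.

```cpp
#include <cassert>
#include <cstdint>

// UXTB style: truncate to 8 bits, then zero-extend back to 32 bits.
inline std::uint32_t zext_cast(std::uint32_t x) {
  return static_cast<std::uint8_t>(x);
}

// AND with 255.
inline std::uint32_t zext_and(std::uint32_t x) { return x & 255u; }

// LSL #24 followed by LSR #24.
inline std::uint32_t zext_shifts(std::uint32_t x) { return (x << 24) >> 24; }
```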

gcc/
* config/arm/thumb1.md (split patterns for GEU and LEU): New.

gcc/testsuite:
* gcc.target/arm/unsigned-extend-1.c: Expand check for any
insn suggesting a zero-extend.  XFAIL for thumb1 code.

Revert "combine: Reverse negative logic in ternary operator"

This reverts commit f1c30c6213fb228f1e8b5973d10c868b834a4acd.

combine: Reverse negative logic in ternary operator

Reverse negative logic in !a ? b : c to become a ? c : b.

No functional changes.

gcc/ChangeLog:

* combine.cc (distribute_notes):
Reverse negative logic in ternary operators.

combine: Discard REG_UNUSED note in i2 when register is also referenced in i3 [PR118739]

The combine pass is trying to combine:

Trying 16, 22, 21 -> 23:
   16: r104:QI=flags:CCNO>0
   22: {r120:QI=r104:QI^0x1;clobber flags:CC;}
      REG_UNUSED flags:CC
   21: r119:QI=flags:CCNO<=0
      REG_DEAD flags:CCNO
   23: {r110:QI=r119:QI|r120:QI;clobber flags:CC;}
      REG_DEAD r120:QI
      REG_DEAD r119:QI
      REG_UNUSED flags:CC

and creates the following two insn sequence:

modifying insn i2    22: r104:QI=flags:CCNO>0
      REG_DEAD flags:CC
deferring rescan insn with uid = 22.
modifying insn i3    23: r110:QI=flags:CCNO<=0
      REG_DEAD flags:CC
deferring rescan insn with uid = 23.

where the REG_DEAD note in i2 is not correct, because the flags
register is still referenced in i3.  In try_combine() megafunction,
we have this part:

--cut here--
    /* Distribute all the LOG_LINKS and REG_NOTES from I1, I2, and I3.  */
    if (i3notes)
      distribute_notes (i3notes, i3, i3, newi2pat ? i2 : NULL,
                        elim_i2, elim_i1, elim_i0);
    if (i2notes)
      distribute_notes (i2notes, i2, i3, newi2pat ? i2 : NULL,
                        elim_i2, elim_i1, elim_i0);
    if (i1notes)
      distribute_notes (i1notes, i1, i3, newi2pat ? i2 : NULL,
                        elim_i2, local_elim_i1, local_elim_i0);
    if (i0notes)
      distribute_notes (i0notes, i0, i3, newi2pat ? i2 : NULL,
                        elim_i2, elim_i1, local_elim_i0);
    if (midnotes)
      distribute_notes (midnotes, NULL, i3, newi2pat ? i2 : NULL,
                        elim_i2, elim_i1, elim_i0);
--cut here--

where the compiler distributes REG_UNUSED note from i2:

   22: {r120:QI=r104:QI^0x1;clobber flags:CC;}
      REG_UNUSED flags:CC

via distribute_notes() using the following:

--cut here--
  /* Otherwise, if this register is used by I3, then this register
     now dies here, so we must put a REG_DEAD note here unless there
     is one already.  */
  else if (reg_referenced_p (XEXP (note, 0), PATTERN (i3))
           && ! (REG_P (XEXP (note, 0))
                 ? find_regno_note (i3, REG_DEAD,
                                    REGNO (XEXP (note, 0)))
                 : find_reg_note (i3, REG_DEAD, XEXP (note, 0))))
    {
      PUT_REG_NOTE_KIND (note, REG_DEAD);
      place = i3;
    }
--cut here--

Flags register is used in I3, but there already is a REG_DEAD note in I3.
The above condition doesn't trigger and continues in the "else" part where
REG_DEAD note is put to I2.  The proposed solution corrects the above
logic to trigger every time the register is referenced in I3, avoiding the
"else" part.

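The change in control flow described above can be modelled with two booleans. This is only one reading of the description, not the code in combine.cc; the enum and function names are hypothetical.

```cpp
#include <cassert>

// Hypothetical labels for where a REG_UNUSED note taken from I2 ends up.
enum class NotePlace { I3AsRegDead, Discarded, FallThrough };

// Model of the quoted condition: the branch is taken only when the
// register is referenced in I3 and I3 has no REG_DEAD note for it yet.
// Otherwise control reaches the later "else" part, which may put a
// REG_DEAD note on I2.
inline NotePlace place_before(bool referenced_in_i3, bool i3_has_reg_dead) {
  if (referenced_in_i3 && !i3_has_reg_dead)
    return NotePlace::I3AsRegDead;
  return NotePlace::FallThrough;
}

// Model of the corrected logic: any reference in I3 takes this branch.
// If I3 already carries a REG_DEAD note, the note is dropped instead of
// reaching the "else" part.
inline NotePlace place_after(bool referenced_in_i3, bool i3_has_reg_dead) {
  if (referenced_in_i3)
    return i3_has_reg_dead ? NotePlace::Discarded : NotePlace::I3AsRegDead;
  return NotePlace::FallThrough;
}
```

The two models differ only for the input (true, true), which is the situation in the dump above.
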
PR rtl-optimization/118739

gcc/ChangeLog:

* combine.cc (distribute_notes) <case REG_UNUSED>: Correct the
logic when the register is used by I3.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr118739.c: New test.

ipa-vr: Handle non-conversion unary ops separately from conversions (PR 118785)

Since we construct arithmetic jump functions even when there is a
type conversion in between the operation encoded in the jump function
and when it is passed in a call argument, the IPA propagation phase
must also perform the operation and conversion in two steps.  IPA-VR
had actually been doing it even before for binary operations but, as
PR 118756 exposes, not in the case of unary operations.  This patch
adds the necessary step to rectify that.

Like in the scalar constant case, we depend on
expr_type_first_operand_type_p to determine the type of the result of
the arithmetic operation.  On top of this, the patch special-cases
ABSU_EXPR because it looks useful and so that the PR testcase exercises
the added code-path.  This seems most appropriate for stage 4; long
term we should probably stream the types, probably after also encoding
them with a string of expr_eval_op rather than what we have today.

A check for expr_type_first_operand_type_p was also missing in the
handling of binary ops and the intermediate value_range was
initialized with a wrong type, so I also fixed this.
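
Why the operation and the conversion have to be applied as two ordered steps can be seen on plain integers. The sketch below is a source-level analogy, not IPA-VR code; the function names and the chosen types are assumptions for the illustration.

```cpp
#include <cassert>
#include <cstdint>

// Step 1 in the operand type, step 2 converts the result.
inline std::uint16_t abs_then_convert(std::int8_t v) {
  int r = v < 0 ? -v : v;  // absolute value computed on the signed operand
  return static_cast<std::uint16_t>(r);
}

// Wrong order: converting first loses the sign, so taking the absolute
// value of the already unsigned quantity changes nothing.
inline std::uint16_t convert_then_abs(std::int8_t v) {
  return static_cast<std::uint16_t>(v);
}
```

For an input of -100 the first function yields 100 and the second 65436, so a value range computed in the wrong order would describe the wrong set of values.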

gcc/ChangeLog:

2025-02-24  Martin Jambor  <mjambor@suse.cz>

PR ipa/118785

* ipa-cp.cc (ipa_vr_intersect_with_arith_jfunc): Handle non-conversion
unary operations separately before doing any conversions.  Check
expr_type_first_operand_type_p for non-unary operations too.  Fix type
of op_res.

gcc/testsuite/ChangeLog:

2025-02-24  Martin Jambor  <mjambor@suse.cz>

PR ipa/118785
* g++.dg/lto/pr118785_0.C: New test.

tree-optimization/119057 - bogus double reduction detection

We are detecting a cycle as double reduction where the inner loop
cycle has extra out-of-loop uses.  This clashes at least with
assumptions from the SLP discovery code which says the cycle
isn't reachable from another SLP instance.  It also was not intended
to support this case, in fact with GCC 14 we seem to generate wrong
code here.

PR tree-optimization/119057
* tree-vect-loop.cc (check_reduction_path): Add argument
specifying whether we're analyzing the inner loop of a
double reduction.  Do not allow extra uses outside of the
double reduction cycle in this case.
(vect_is_simple_reduction): Adjust.

* gcc.dg/vect/pr119057.c: New testcase.

ipa/119067 - bogus TYPE_PRECISION check on VECTOR_TYPE

odr_types_equivalent_p can end up using TYPE_PRECISION on vector
types which is a no-go. The following instead uses TYPE_VECTOR_SUBPARTS
for vector types so we also end up comparing the number of vector elements.

PR ipa/119067
* ipa-devirt.cc (odr_types_equivalent_p): Check
TYPE_VECTOR_SUBPARTS for vectors.

* g++.dg/lto/pr119067_0.C: New testcase.
* g++.dg/lto/pr119067_1.C: Likewise.

Fortran: Fix regression on double free on elemental function [PR118747]

Fix a regression where adding a temporary variable inserted a copy of the
argument to the elemental function. That copy was then later used to
free allocated memory, but the freeing was not tracked in the source
array correctly.

PR fortran/118747

gcc/fortran/ChangeLog:

* trans-array.cc (gfc_trans_array_ctor_element): Remove copy to
temporary variable.
* trans-expr.cc (gfc_conv_procedure_call): Use references to
array members instead of copies when freeing after use.
Formatting fix.

gcc/testsuite/ChangeLog:

* gfortran.dg/alloc_comp_auto_array_4.f90: New test.

Daily bump.

[RISC-V][PR target/118934] Fix ICE in RISC-V long branch support

I'm not sure if I goof'd this or if I merely upstreamed someone else's goof.
Either way the long branch code isn't working correctly.

We were using 'n' as the output modifier to negate the condition.  But 'n' has
a special meaning elsewhere, so when presented with a condition rather than
what was expected, boom, the compiler ICE'd.

Thankfully there's only a few places where we were using %n which I turned into
%r.

The BZ entry includes a good testcase, it just takes a long time to compile as
it's trying to create the out-of-range scenario.  I'm not including the
testcase due to how long it takes, but I did test it locally to ensure it's
working properly now.

I'm sure that with a little bit of work I could create a testcase that worked
before and fails with the trunk (by taking advantage of the fuzziness in length
computations).  So I'm going to consider this a regression.

Will push to the trunk after pre-commit testing does its thing.

PR target/118934
gcc/
* config/riscv/corev.md (cv_branch): Adjust output template.
(branch): Likewise.
* config/riscv/riscv.md (branch): Likewise.
* config/riscv/riscv.cc (riscv_asm_output_opcode): Handle 'r' rather
than 'n'.

PR modula2/119088 ICE when for loop accesses an unknown variable as the iterator

This patch fixes an ICE which occurs when a FOR statement attempts to
use an undeclared variable as its iterator.

gcc/m2/ChangeLog:

PR modula2/119088
* gm2-compiler/M2SymInit.mod (ConfigSymInit): Reimplement to
defensively check for NulSym type.

gcc/testsuite/ChangeLog:

PR modula2/119088
* gm2/pim/fail/tinyfor4.mod: New test.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

Fortran: Small fixes in intrinsic.texi.

gcc/fortran/ChangeLog
* intrinsic.texi: Fix inconsistent capitalization of argument
names and other minor copy-editing.

Fortran: Move "Standard" subheading in documentation [PR47928]

As noted in the issue, the version of the standard an intrinsic was
introduced in is usually not the second-most-important thing a user
needs to know. This patch moves it from near the beginning of each
section towards the end, just ahead of "See also".

gcc/fortran/ChangeLog
PR fortran/47928
* intrinsic.texi: Move the "Standard" subheading farther down.

Fortran: Rename/move "Syntax" subheading in documentation [PR47928]

As suggested in the issue, it makes more sense to describe the function
call argument syntax before talking about the arguments in the description.

gcc/fortran/ChangeLog
PR fortran/47928
* gfortran.texi: Move all the "Syntax" subheadings ahead of
"Description", and rename to "Synopsis".
* intrinsic.texi: Likewise.

Fortran: Whitespace cleanup in documentation [PR47928]

This is a preparatory patch for the main changes requested in the issue.

gcc/fortran/ChangeLog
PR fortran/47928
* intrinsic.texi: Put a blank line between "@item @emph{}"
subheadings, but not more than one.

Fortran: Tidy subheadings in Fortran documentation [PR47928]

This is a preparatory patch for the main documentation changes requested
in the issue.

gcc/fortran/ChangeLog
PR fortran/47928
* gfortran.texi: Consistently use "@emph{Notes}:" instead of
other spellings.
* intrinsic.texi: Likewise. Also fix an inconsistent capitalization
and remove a redundant "Standard" entry.

avr: Fix up avr_print_operand diagnostics [PR118991]

As can be seen in gcc/po/gcc.pot:
#: config/avr/avr.cc:2754
#, c-format
msgid "bad I/O address 0x"
msgstr ""

exgettext couldn't retrieve the whole format string in this case,
because it uses a macro in the middle. output_operand_lossage
is c-format function though, so we can't use %wx to print HOST_WIDE_INT,
and HOST_WIDE_INT_PRINT_HEX_PURE is on some hosts %lx, on others %llx
and on others %I64x so isn't really translatable that way.

As Joseph mentioned in the PR, there is no easy way around this
but go through a temporary buffer, which the following patch does.
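
The temporary-buffer approach can be sketched as follows. The helper name and buffer sizes are hypothetical, `PRIx64` stands in for the host-dependent macro, and the result is returned as a string rather than passed to output_operand_lossage.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: print the 64-bit value as hex into a local
// buffer first, so the message itself only needs a plain %s and stays a
// single, host-independent format string.
inline std::string bad_io_address_message(std::int64_t ival) {
  char hex[32];
  int n = std::snprintf(hex, sizeof hex, "%" PRIx64,
                        static_cast<std::uint64_t>(ival));
  assert(n > 0 && static_cast<std::size_t>(n) < sizeof hex);
  (void) n;  // silence unused-variable warnings when NDEBUG is defined

  char msg[64];
  std::snprintf(msg, sizeof msg, "bad I/O address 0x%s", hex);
  return msg;
}
```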

2025-03-02 Jakub Jelinek <jakub@redhat.com>

PR translation/118991
* config/avr/avr.cc (avr_print_operand): Print ival into
a temporary buffer and use %s in output_operand_lossage to make
the diagnostics translatable.

gimple: sccopy: Prune removed statements from SCCs [PR117919]

While writing the sccopy pass I didn't realize that 'replace_uses_by ()' can
remove portions of the CFG.  This happens when replacing arguments of some
statement results in the removal of an EH edge.  Because of this sccopy can
then work with GIMPLE statements that aren't part of the IR anymore.  In
PR117919 this triggered an assertion within the pass which assumes that
statements the pass works with are reachable.

This patch tells the pass to notice when a statement isn't in the IR anymore
and remove it from its worklist.

PR tree-optimization/117919

gcc/ChangeLog:

* gimple-ssa-sccopy.cc (scc_copy_prop::propagate): Prune
statements that 'replace_uses_by ()' removed.

gcc/testsuite/ChangeLog:

* g++.dg/pr117919.C: New test.

Signed-off-by: Filip Kastl <fkastl@suse.cz>

Daily bump.

doc: Simplify description of *-*-freebsd*

gcc:
PR target/69374
* doc/install.texi (Specific, *-*-freebsd*): Simplify description.

ggc: Fix up ggc_internal_cleared_alloc_no_dtor [PR117047]

Apparently I got one of the !HAVE_ATTRIBUTE_ALIAS fallbacks wrong.

It compiled with a warning:
../../gcc/ggc-common.cc: In function 'void* ggc_internal_cleared_alloc_no_dtor(size_t, void (*)(void*), size_t, size_t)':
../../gcc/ggc-common.cc:154:44: warning: unused parameter 'size' [-Wunused-parameter]
  154 | ggc_internal_cleared_alloc_no_dtor (size_t size, void (*f)(void *),
      |                                     ~~~~~~~^~~~
and obviously didn't work right (always allocated 0-sized objects).

Fixed thusly.

2025-03-01  Jakub Jelinek  <jakub@redhat.com>

PR jit/117047
* ggc-common.cc (ggc_internal_cleared_alloc_no_dtor): Pass size
rather than s as the first argument to ggc_internal_cleared_alloc.

Fortran: fix front-end memleak after failure during parsing of NULLIFY

gcc/fortran/ChangeLog:

* match.cc (gfc_match_nullify): Free matched expression when
cleaning up.
* primary.cc (match_variable): Initialize result to NULL.

[PR target/118906] [PATCH v2] RISC-V: Fix a typo in zce to zcf implication

zce must imply zcf but this rule was corrupted after
refactoring in 9e12010b5e724277ea. This may be observed
after generating an .s file from any source code file with
-mriscv-attribute -march=rv32if_zce -mabi=ilp32 -S
options. A full march will be presented in arch attribute:

    rv32i2p1_f2p2_zicsr2p0_zca1p0_zcb1p0_zce1p0_zcmp1p0_zcmt1p0

As you can see, zcf is not present here even though the f_zce pair is
passed in -march. According to The RISC-V Instruction
Set Manual:

    Specifying Zce on RV32 with F includes Zca, Zcb, Zcmp,
    Zcmt and Zcf.
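
The rule quoted above can be written down as a small function. This is a sketch with a hypothetical helper, not the table in riscv-common.cc; it assumes that the list without Zcf applies both to RV32 without F and to RV64.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical helper: extensions implied by Zce for a given XLEN.
// Assumption: Zcf is added only on RV32 when F is present.
inline std::set<std::string> zce_implied(int xlen, bool has_f) {
  assert(xlen == 32 || xlen == 64);
  std::set<std::string> ext = {"zca", "zcb", "zcmp", "zcmt"};
  if (xlen == 32 && has_f)
    ext.insert("zcf");
  return ext;
}
```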

PR target/118906
gcc/ChangeLog:

* common/config/riscv/riscv-common.cc: fix zce to zcf
implication.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/attribute-zce-1.c: New test.
* gcc.target/riscv/attribute-zce-2.c: New test.
* gcc.target/riscv/attribute-zce-3.c: New test.
* gcc.target/riscv/attribute-zce-4.c: New test.

[PATCH] H8/300, libgcc: PR target/114222 For HImode call internal ffs() implementation instead of an external one

When INT_TYPE_SIZE < BITS_PER_WORD gcc emits a call to an external ffs()
implementation instead of a call to "__builtin_ffs()" – see function
init_optabs() in <SRCROOT>/gcc/optabs-libfuncs.cc. External ffs()
(which is usually the one from newlib) in turn calls __builtin_ffs(),
which causes infinite recursion and stack overflow. This patch overrides
the default gcc behaviour for H8/300H (and newer) and provides a generic
ffs() implementation for HImode.
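
A generic find-first-set for a 16-bit value that does not call back into a builtin could look like the sketch below. It is not the libgcc ffshi2.c added by this patch; the name `ffs16` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 16-bit find-first-set: returns the 1-based index of the
// least significant set bit, or 0 if no bit is set.  Uses only shifts
// and masks, so it cannot recurse into an ffs builtin.
inline int ffs16(std::uint16_t x) {
  if (x == 0)
    return 0;
  int pos = 1;
  while ((x & 1u) == 0) {
    x = static_cast<std::uint16_t>(x >> 1);
    ++pos;
  }
  return pos;
}
```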

PR target/114222
gcc/ChangeLog:

* config/h8300/h8300.cc (h8300_init_libfuncs): For HImode override
calls to external ffs() (from newlib) with calls to __ffshi2() from
libgcc. The implementation of ffs() in newlib calls __builtin_ffs()
which causes infinite recursion and finally a stack overflow.

libgcc/ChangeLog:

* config/h8300/t-h8300: Add __ffshi2().
* config/h8300/ffshi2.c: New file.

input: Fix UB during self-tests [PR119052]

As the comment in check_line says:
  /* get_buffer is not null terminated, but the sscanf stops after a number.  */
the buffer is not null terminated, there is line.length () to determine
the size of the line.  But unlike what the comment says, sscanf actually
still requires null terminated string argument, anything else is UB.
E.g. glibc when initializing the temporary FILE stream for the string does
  if (size == 0)
    end = strchr (ptr, '\0');
and this strchr/rawmemchr is what shows up in valgrind report on cc1/cc1plus
doing self-tests.

The function is used only in a test with 1000 lines, each containing its
number, so numbers from 1 to 1000 inclusive (each time with '\n' separator,
but that isn't included in line.length ()).

So the function just uses a temporary buffer which can fit numbers from 1 to
1000 as strings with terminating '\0' and runs sscanf on that (why not
strtoul?).

Furthermore, the caller allocated number of lines * 15 bytes for the
string, but 1000\n is 5 bytes, so I think * 5 is more than enough.
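
The copy-then-terminate approach can be sketched as follows. The helper name, buffer size and error value are assumptions for the illustration, not the code in input.cc.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical helper: parse a decimal number from a buffer that is NOT
// null terminated.  The bytes are copied into a local buffer and
// terminated there, so sscanf only ever sees a valid C string.
// Returns -1 if the text does not fit or does not parse.
inline int parse_line_number(const char *buf, std::size_t len) {
  char local[16];
  if (len >= sizeof local)
    return -1;
  std::memcpy(local, buf, len);
  local[len] = '\0';

  int value = 0;
  if (std::sscanf(local, "%d", &value) != 1)
    return -1;
  return value;
}
```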

2025-03-01  Jakub Jelinek  <jakub@redhat.com>

PR other/119052
* input.cc (check_line): Don't call sscanf on non-null terminated
buffer, instead copy line.length () bytes from line.get_buffer ()
to a local buffer, null terminate it and call sscanf on that.
Formatting fix.
(test_replacement): Just allocate maxline * 5 rather than maxline * 15
bytes for the file.  Formatting fix.

Daily bump.

ggc: Avoid using ATTRIBUTE_MALLOC for allocations that need finalization [PR117047]

As analyzed by Andrew/David/Richi/Sam in the PR, the reason for the
libgccjit ICE is that there are GC allocations with finalizers and we
still mark ggc_internal_{,cleared_}alloc with ATTRIBUTE_MALLOC, which
to the optimizers hints that nothing will actually read the state
of the objects when they get out of lifetime.  The finalizer actually
inspects those though.  What actually happens in the testcases is that on
  tree expr_size = TYPE_SIZE (expr->get_type ()->as_tree ());
we see that expr->get_type () was allocated using something with malloc
attribute but it doesn't escape and only the type size from it is queried,
so there is no need to store other members of it.  Except that it does escape
in the GC internals.  Normal GC allocations are fine, they don't look at the
data in the allocated objects on "free", but the ones with finalizers actually
call a function on that object and expect the data to be in there.
So that we don't lose ATTRIBUTE_MALLOC for the common case when no
finalization is needed, the following patch uses the approach used e.g.
for glibc error function which can sometimes be noreturn but at other
times just return normally.
If possible, it uses __attribute__((alias ("..."))) to add an alias
to the function, where one is without ATTRIBUTE_MALLOC and one
(with _no_dtor suffix) is with ATTRIBUTE_MALLOC (note, as this is
C++ and I didn't want to hardcode particular mangling I used an
extern "C" function with 2 aliases to it), and otherwise adds a wrapper
(for the ggc-page/ggc-common case with noinline attribute if possible,
for ggc-none that doesn't matter because ggc-none doesn't support
finalizers).
The *_no_dtor aliases/wrappers are then used in inline functions which
pass unconditional NULL, 0 as the f/s pair.

2025-03-01  Jakub Jelinek  <jakub@redhat.com>

PR jit/117047
* acinclude.m4 (gcc_CHECK_ATTRIBUTE_ALIAS): New.
* configure.ac: Add gcc_CHECK_ATTRIBUTE_ALIAS.
* ggc.h (ggc_internal_alloc): Remove ATTRIBUTE_MALLOC from
overload with finalizer pointer.  Call ggc_internal_alloc_no_dtor
in inline overload without finalizer pointer.
(ggc_internal_alloc_no_dtor): Declare.
(ggc_internal_cleared_alloc): Remove ATTRIBUTE_MALLOC from
overload with finalizer pointer.  Call
ggc_internal_cleared_alloc_no_dtor in inline overload without
finalizer pointer.
(ggc_internal_cleared_alloc_no_dtor): Declare.
(ggc_alloc): Call ggc_internal_alloc_no_dtor if no finalization
is needed.
(ggc_alloc_no_dtor): Call ggc_internal_alloc_no_dtor.
(ggc_cleared_alloc): Call ggc_internal_cleared_alloc_no_dtor if no
finalization is needed.
(ggc_vec_alloc): Call ggc_internal_alloc_no_dtor if no finalization
is needed.
(ggc_cleared_vec_alloc): Call ggc_internal_cleared_alloc_no_dtor if no
finalization is needed.
* ggc-page.cc (ggc_internal_alloc): If HAVE_ATTRIBUTE_ALIAS, turn
overload with finalizer into alias to ggc_internal_alloc_ and
rename it to ...
(ggc_internal_alloc_): ... this, make it extern "C".
(ggc_internal_alloc_no_dtor): New alias if HAVE_ATTRIBUTE_ALIAS,
otherwise new noinline wrapper.
* ggc-common.cc (ggc_internal_cleared_alloc): If HAVE_ATTRIBUTE_ALIAS,
turn overload with finalizer into alias to ggc_internal_alloc_ and
rename it to ...
(ggc_internal_cleared_alloc_): ... this, make it extern "C".
(ggc_internal_cleared_alloc_no_dtor): New alias if
HAVE_ATTRIBUTE_ALIAS, otherwise new noinline wrapper.
* ggc-none.cc (ggc_internal_alloc): If HAVE_ATTRIBUTE_ALIAS, turn
overload with finalizer into alias to ggc_internal_alloc_ and
rename it to ...
(ggc_internal_alloc_): ... this, make it extern "C".
(ggc_internal_alloc_no_dtor): New alias if HAVE_ATTRIBUTE_ALIAS,
otherwise new wrapper.
(ggc_internal_cleared_alloc): If HAVE_ATTRIBUTE_ALIAS, turn overload
with finalizer into alias to ggc_internal_alloc_ and rename it to ...
(ggc_internal_cleared_alloc_): ... this, make it extern "C".
(ggc_internal_cleared_alloc_no_dtor): New alias if
HAVE_ATTRIBUTE_ALIAS, otherwise new wrapper.
* genmatch.cc (ggc_internal_cleared_alloc, ggc_free): Formatting fix.
(ggc_internal_cleared_alloc_no_dtor): Define.
* config.in: Regenerate.
* configure: Regenerate.

openmp: Fix up simd clone mask argument creation on x86 [PR115871]

The following testcase ICEs since r14-5057.
The Intel vector ABI says that in the ZMM case the mask is passed
in unsigned int or unsigned long long arguments, and both the number of bits
in each and the number of such arguments are determined by the characteristic
data type of the function.  In the testcase simdlen is 32 and characteristic
data type is double, so return as well as first argument is passed in 4
V8DFmode arguments and the mask is supposed to be passed in 4 unsigned int
arguments (8 bits in each).
Before the r14-5057 change there was
      sc->args[i].orig_type = parm_type;
...
        case SIMD_CLONE_ARG_TYPE_LINEAR_VAL_CONSTANT_STEP:
        case SIMD_CLONE_ARG_TYPE_LINEAR_VAL_VARIABLE_STEP:
        case SIMD_CLONE_ARG_TYPE_VECTOR:
          if (INTEGRAL_TYPE_P (parm_type) || POINTER_TYPE_P (parm_type))
            veclen = sc->vecsize_int;
          else
            veclen = sc->vecsize_float;
          if (known_eq (veclen, 0U))
            veclen = sc->simdlen;
          else
            veclen
              = exact_div (veclen,
                           GET_MODE_BITSIZE (SCALAR_TYPE_MODE (parm_type)));
for the argument handling and
  if (sc->inbranch)
    {
      tree base_type = simd_clone_compute_base_data_type (sc->origin, sc);
...
      if (INTEGRAL_TYPE_P (base_type) || POINTER_TYPE_P (base_type))
        veclen = sc->vecsize_int;
      else
        veclen = sc->vecsize_float;
      if (known_eq (veclen, 0U))
        veclen = sc->simdlen;
      else
        veclen = exact_div (veclen,
                            GET_MODE_BITSIZE (SCALAR_TYPE_MODE (base_type)));
for the mask handling.  r14-5057 moved this argument creation later and
unified that:
        case SIMD_CLONE_ARG_TYPE_MASK:
        case SIMD_CLONE_ARG_TYPE_LINEAR_VAL_CONSTANT_STEP:
        case SIMD_CLONE_ARG_TYPE_LINEAR_VAL_VARIABLE_STEP:
        case SIMD_CLONE_ARG_TYPE_VECTOR:
          if (sc->args[i].arg_type == SIMD_CLONE_ARG_TYPE_MASK
              && sc->mask_mode != VOIDmode)
            elem_type = boolean_type_node;
          else
            elem_type = TREE_TYPE (sc->args[i].vector_type);
          if (INTEGRAL_TYPE_P (elem_type) || POINTER_TYPE_P (elem_type))
            veclen = sc->vecsize_int;
          else
            veclen = sc->vecsize_float;
          if (known_eq (veclen, 0U))
            veclen = sc->simdlen;
          else
            veclen
              = exact_div (veclen,
                           GET_MODE_BITSIZE (SCALAR_TYPE_MODE (elem_type)));
This is correct for the argument cases (so linear or vector) (though
POINTER_TYPE_P will never appear as TREE_TYPE of a vector), but the
boolean_type_node in there is completely bogus.  When using AVX512 integer
masks, as I wrote above, we need the characteristic data type, not bool,
and bool is strange in that it has a bitsize of 8 (or 32 on darwin), while
the masks are 1 bit per lane anyway.
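
The arithmetic behind that mismatch can be sketched as follows.  This is
only an illustration, not code from omp-simd-clone.cc: lanes_per_vector and
args_needed are hypothetical helpers that mirror the exact_div step quoted
above, and the sizes (512-bit vectors, 64-bit double, 8-bit bool) are the
ones assumed for the testcase.

```cpp
// Hypothetical helper mirroring the exact_div (veclen, bitsize) step:
// how many lanes of ELEM_BITS fit into one vector of VEC_BITS.
constexpr unsigned lanes_per_vector(unsigned vec_bits, unsigned elem_bits) {
    return vec_bits / elem_bits;
}

// Hypothetical helper: number of arguments needed to cover SIMDLEN lanes
// when each argument carries LANES lanes.
constexpr unsigned args_needed(unsigned simdlen, unsigned lanes) {
    return (simdlen + lanes - 1) / lanes;
}

// Characteristic type double: 8 lanes per argument, 4 arguments for simdlen 32.
static_assert(lanes_per_vector(512, 64) == 8, "double lanes");
static_assert(args_needed(32, lanes_per_vector(512, 64)) == 4, "mask args");

// Using an 8-bit bool as the element type instead yields 64 lanes,
// which is more than simdlen and no longer matches the 4 x 8 layout.
static_assert(lanes_per_vector(512, 8) == 64, "bool lanes");
```

With the characteristic type the mask layout agrees with the vector
arguments (4 arguments of 8 lanes); with bool it does not.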

Fixed thusly.

2025-03-01  Jakub Jelinek  <jakub@redhat.com>

PR middle-end/115871
* omp-simd-clone.cc (simd_clone_adjust): For SIMD_CLONE_ARG_TYPE_MASK
and sc->mask_mode not VOIDmode, set elem_type to the characteristic
type rather than boolean_type_node.

* gcc.dg/gomp/simd-clones-8.c: New test.

[PATCH] H8/300: PR target/109189 Silence -Wformat warnings on Windows

This patch fixes annoying -Wformat warnings when gcc is built
on Windows/MinGW64. Instead of %ld it uses HOST_WIDE_INT_PRINT_DEC
macro, just like many other targets do.
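
For illustration, the portability problem is that long is 32 bits wide on
64-bit Windows while the value being printed is 64 bits wide.  A minimal
sketch of the same idea outside of GCC follows; EXAMPLE_WIDE_INT_PRINT_DEC
and format_offset are hypothetical names, with PRId64 standing in for what
HOST_WIDE_INT_PRINT_DEC provides inside GCC.

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical stand-in for HOST_WIDE_INT_PRINT_DEC: a format string that
// always matches a 64-bit signed value, whatever the width of long is.
#define EXAMPLE_WIDE_INT_PRINT_DEC "%" PRId64

// Formats VALUE as an immediate operand such as "#42".
std::string format_offset(std::int64_t value) {
    char buf[32];  // '#', sign, up to 19 digits and the terminator fit easily
    std::snprintf(buf, sizeof buf, "#" EXAMPLE_WIDE_INT_PRINT_DEC, value);
    return buf;
}
```

Writing "%ld" here instead would trigger -Wformat on hosts where long is
narrower than the argument.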

PR target/109189
gcc/ChangeLog:

* config/h8300/h8300.cc (h8300_print_operand): Replace %ld format
strings with HOST_WIDE_INT_PRINT_DEC macro in order to silence
-Wformat warnings when building on Windows/MinGW64.

testsuite: Fix up toplevel-asm-1.c for LoongArch

Like RISC-V, on LoongArch we don't really support %cN for SYMBOL_REFs
even with -fno-pic.

gcc/testsuite/ChangeLog:

* c-c++-common/toplevel-asm-1.c: Use %cc3 %cc4 instead of %c3
%c4 on LoongArch.

libstdc++: Fix ranges::iter_move handling of rvalues [PR106612]

The specification for std::ranges::iter_move apparently requires us to
handle types which do not satisfy std::indirectly_readable, for example
with overloaded operator* which behaves differently for different value
categories.

libstdc++-v3/ChangeLog:

PR libstdc++/106612
* include/bits/iterator_concepts.h (_IterMove::__iter_ref_t):
New alias template.
(_IterMove::__result): Use __iter_ref_t instead of
std::iter_reference_t.
(_IterMove::__type): Remove incorrect __dereferenceable
constraint.
(_IterMove::operator()): Likewise. Add correct constraints. Use
__iter_ref_t instead of std::iter_reference_t. Forward parameter
as correct value category.
(iter_swap): Add comments.
* testsuite/24_iterators/customization_points/iter_move.cc: Test
that iter_move is found by ADL and that rvalue arguments are
handled correctly.

Reviewed-by: Patrick Palka <ppalka@redhat.com>

libstdc++: Fix ranges::move and ranges::move_backward to use iter_move [PR105609]

The ranges::move and ranges::move_backward algorithms are supposed to
use ranges::iter_move(iter) instead of std::move(*iter), which matters
for an iterator type with an iter_move overload findable by ADL.

Currently those algorithms use std::__assign_one which uses std::move,
so define a new ranges::__detail::__assign_one helper function that uses
ranges::iter_move.

libstdc++-v3/ChangeLog:

PR libstdc++/105609
* include/bits/ranges_algobase.h (__detail::__assign_one): New
helper function.
(__copy_or_move, __copy_or_move_backward): Use new function
instead of std::__assign_one.
* testsuite/25_algorithms/move/constrained.cc: Check that
ADL iter_move is used in preference to std::move.
* testsuite/25_algorithms/move_backward/constrained.cc:
Likewise.

libstdc++: Add static_assertions to ranges::to adaptor factory [PR112803]

The standard requires that we reject attempts to create a ranges::to
adaptor for cv-qualified types and non-class types. Currently we only
diagnose it once the adaptor is used in a pipeline.

This adds static assertions to diagnose it immediately.

libstdc++-v3/ChangeLog:

PR libstdc++/112803
* include/std/ranges (ranges::to): Add static assertions to
enforce Mandates conditions.
* testsuite/std/ranges/conv/112803.cc: New test.

d: Fix comparing uninitialized memory in dstruct.d [PR116961]

Floating-point emulation in the D front-end is done via a type named
`struct longdouble`, which in GDC is a small interface around the
real_value type. Because the D code cannot include gcc/real.h directly,
a big enough buffer is used for the data instead.

On x86_64, this buffer is actually bigger than real_value itself, so
when a new longdouble object is created with

    longdouble r;
    real_from_string3 (&r.rv (), buffer, mode);
    return r;

there is uninitialized padding at the end of `r`.  This was never a
problem when D was implemented in C++ (until GCC 12) as comparing two
longdouble objects with `==' would be forwarded to the relevant
operator== overload that extracted the underlying real_value.

However when the front-end was translated to D, such conditions were
instead rewritten into identity comparisons

    return exp.toReal() is CTFloat.zero

The `is` operator gets lowered as a call to `memcmp() == 0', which is
where the read of uninitialized memory occurs, as seen by valgrind.

==26778== Conditional jump or move depends on uninitialised value(s)
==26778==    at 0x911F41: dmd.dstruct._isZeroInit(dmd.expression.Expression) (dstruct.d:635)
==26778==    by 0x9123BE: StructDeclaration::finalizeSize() (dstruct.d:373)
==26778==    by 0x86747C: dmd.aggregate.AggregateDeclaration.determineSize(ref const(dmd.location.Loc)) (aggregate.d:226)
[...]

To avoid accidentally reading uninitialized data, explicitly initialize
all `longdouble` variables with an empty constructor on the C++ side of the
implementation before initializing the underlying real_value they hold.
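
The pattern can be sketched in isolation like this.  It is a simplified
model, not the GDC sources: Payload plays the role of real_value and
Wrapper the role of `longdouble`, and both names, as well as the 8 spare
bytes, are assumptions made for the example.

```cpp
#include <cstring>

// Hypothetical stand-in for real_value.
struct Payload {
    int value;
};

// Hypothetical stand-in for longdouble: the buffer is deliberately larger
// than the payload it stores, so it has trailing bytes the payload never
// writes.
struct Wrapper {
    alignas(Payload) unsigned char storage[sizeof(Payload) + 8];
};

// Zero every byte first, then store the payload in the leading bytes.
Wrapper make_wrapper(int value) {
    Wrapper w = {};  // value-initialization clears the trailing bytes too
    Payload p = { value };
    std::memcpy(w.storage, &p, sizeof p);
    return w;
}

// Bytewise identity comparison, the analogue of what `is` is lowered to.
bool identical(const Wrapper &a, const Wrapper &b) {
    return std::memcmp(&a, &b, sizeof(Wrapper)) == 0;
}
```

Without the `= {}` the trailing bytes would be indeterminate and the
bytewise comparison would read uninitialized memory.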

PR d/116961

gcc/d/ChangeLog:

* d-codegen.cc (build_float_cst): Change new_value type from real_t to
real_value.
* d-ctfloat.cc (CTFloat::fabs): Default initialize the return value.
(CTFloat::ldexp): Likewise.
(CTFloat::parse): Likewise.
* d-longdouble.cc (longdouble::add): Likewise.
(longdouble::sub): Likewise.
(longdouble::mul): Likewise.
(longdouble::div): Likewise.
(longdouble::mod): Likewise.
(longdouble::neg): Likewise.
* d-port.cc (Port::isFloat32LiteralOutOfRange): Likewise.
(Port::isFloat64LiteralOutOfRange): Likewise.

gcc/testsuite/ChangeLog:

* gdc.dg/pr116961.d: New test.

c++: fix rejects-valid and ICE with constexpr NSDMI [PR110822]

Since r10-7718 the attached tests produce an ICE in verify_address:

  error: constant not recomputed when 'ADDR_EXPR' changed

but before that we wrongly rejected the tests with "is not a constant
expression".  This patch fixes both problems.

Since r10-7718 replace_decl_r can replace

  {._M_dataplus=&<retval>._M_local_buf, ._M_local_buf=0}

with

  {._M_dataplus=&HelloWorld._M_local_buf, ._M_local_buf=0}

The initial &<retval>._M_local_buf was not constant, but since
HelloWorld is a static VAR_DECL, the resulting &HelloWorld._M_local_buf
should have been marked as TREE_CONSTANT.  And since we're taking
its address, the whole thing should be TREE_ADDRESSABLE.

PR c++/114913
PR c++/110822

gcc/cp/ChangeLog:

* constexpr.cc (replace_decl_r): If we've replaced something
inside of an ADDR_EXPR, call cxx_mark_addressable and
recompute_tree_invariant_for_addr_expr on the resulting ADDR_EXPR.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/constexpr-nsdmi4.C: New test.
* g++.dg/cpp0x/constexpr-nsdmi5.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>

c++: ICE in replace_decl [PR118986]

Yet another problem that started with r15-6052, compile time evaluation of
prvalues.

cp_fold_r/TARGET_EXPR sees:

  TARGET_EXPR <D.2701, <<< Unknown tree: expr_stmt
    D.2701.__p = TARGET_EXPR <D.2684, <<< Unknown tree: aggr_init_expr
      3
      f1
      D.2684 >>>> >>>>

so when we call maybe_constant_init, the object we're initializing is D.2701,
and the init is the expr_stmt.  We unwrap the EXPR_STMT/INIT_EXPR/TARGET_EXPR
in maybe_constant_init_1 and so end up evaluating the f1 call.  But f1 returns
c2 whereas the type of D.2701 is ._anon_0 -- the closure.

So then we crash in replace_decl on:

  gcc_checking_assert (same_type_ignoring_top_level_qualifiers_p
       (TREE_TYPE (decl), TREE_TYPE (replacement)));

due to the mismatched types.

cxx_eval_outermost_constant_expr is already ready for the types to be
different, in which case the result isn't constant.  But replace_decl
is called before that check.

I'm leaving the assert in replace_decl on purpose, maybe we'll find
another use for it.

PR c++/118986

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_call_expression): Check that the types match
before calling replace_decl, if not, set *non_constant_p.
(maybe_constant_init_1): Don't strip INIT_EXPR if it would change the
type of the expression.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/constexpr-prvalue1.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>

ipa-sra: Avoid clashes with ipa-cp when pulling accesses across calls (PR 118243)

Among other things, IPA-SRA checks whether splitting out a bit of an
aggregate or something passed by reference would lead to a clash
with an already known IPA-CP constant in a way which would cause problems
later on.  Unfortunately the test is done only in
adjust_parameter_descriptions and is missing when accesses are
propagated from callees to callers, which leads to miscompilation
reported as PR 118243 (where the callee is a function created by
ipa-split).

The matter is then further complicated by the fact that we consider
complex numbers as scalars even though they can be modified piecemeal
(IPA-CP can detect and propagate the pieces separately too) which then
confuses the parameter manipulation machinery further.

This patch simply adds the missing check to avoid the IPA-SRA
transform in these cases too, which should be suitable for backporting
to all affected release branches.  It is a bit of a shame as in the PR
testcase we do propagate both components of the complex number in
question and the transformation phase could recover.  I have some
prototype patches in this direction but that is something for (a)
stage 1.

gcc/ChangeLog:

2025-02-10  Martin Jambor  <mjambor@suse.cz>

PR ipa/118243
* ipa-sra.cc (pull_accesses_from_callee): New parameters
caller_ipcp_ts and param_idx.  Check that scalar pulled accesses would
not clash with a known IPA-CP aggregate constant.
(param_splitting_across_edge): Pass IPA-CP transformation summary and
caller parameter index to pull_accesses_from_callee.

gcc/testsuite/ChangeLog:

2025-02-10  Martin Jambor  <mjambor@suse.cz>

PR ipa/118243
* g++.dg/ipa/pr118243.C: New test.

c++: generic lambda, implicit 'this' capture, xobj memfn [PR119038]

When a generic lambda calls an overload set containing an iobj member
function we speculatively capture 'this'. We need to do the same
for an xobj member function.

PR c++/119038

gcc/cp/ChangeLog:

* lambda.cc (maybe_generic_this_capture): Consider xobj
member functions as well, not just iobj. Update function
comment.

gcc/testsuite/ChangeLog:

* g++.dg/cpp23/explicit-obj-lambda15.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>

libstdc++: Improve optional's <=> constraint recursion workaround [PR104606]

It turns out the reason the behavior of this testcase changed after CWG
2369 is because validity of the substituted return type is now checked
later, after constraints. So a more reliable workaround for this issue
is to add a constraint to check the validity of the return type earlier,
matching the pre-CWG 2369 semantics.

PR libstdc++/104606

libstdc++-v3/ChangeLog:

* include/std/optional (operator<=>): Revert r14-9771 change.
Add constraint checking the validity of the return type
compare_three_way_result_t before the three_way_comparable_with
constraint.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++: Fix constraint recursion in basic_const_iterator relops [PR112490]

Here for

  using RCI = reverse_iterator<basic_const_iterator<vector<int>::iterator>>
  static_assert(std::totally_ordered<RCI>);

we effectively need to check the requirement

  requires (RCI x) { x RELOP x; }  for each RELOP in {<, >, <=, >=}

which we expect to be straightforwardly satisfied by reverse_iterator's
namespace-scope relops.  But due to ADL we find ourselves also
considering the basic_const_iterator relop friends, which before CWG
2369 would be quickly discarded since RCI clearly isn't convertible to
basic_const_iterator.  After CWG 2369 though we must first check these
relops' constraints (with _It = vector<int>::iterator and _It2 = RCI),
which entails checking totally_ordered<RCI> recursively.

This patch fixes this by turning the problematic non-dependent function
parameters of type basic_const_iterator<_It> into dependent ones of
type basic_const_iterator<_It3> where _It3 is constrained to match _It.
Thus the basic_const_iterator relop friends now get quickly discarded
during deduction and before the constraint check if the second operand
isn't a specialization of basic_const_iterator (or derived from one)
like before CWG 2369.

PR libstdc++/112490

libstdc++-v3/ChangeLog:

* include/bits/stl_iterator.h (basic_const_iterator::operator<):
Replace non-dependent basic_const_iterator function parameter with
a dependent one of type basic_const_iterator<_It3> where _It3
matches _It.
(basic_const_iterator::operator>): Likewise.
(basic_const_iterator::operator<=): Likewise.
(basic_const_iterator::operator>=): Likewise.
* testsuite/24_iterators/const_iterator/112490.cc: New test.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

c++: Fix cxx_eval_store_expression {REAL,IMAG}PART_EXPR handling [PR119045]

I've added the asserts that probe == target because {REAL,IMAG}PART_EXPR
always implies a scalar type and so applying ARRAY_REF/COMPONENT_REF
etc. on it further doesn't make sense and the later code relies on it
to be the last one in refs array. But as the following testcase shows,
we can fail those assertions in case there is a reference or pointer
to the __real__ or __imag__ part, in that case we just evaluate the
constant expression and so probe won't be the same as target.
That case doesn't push anything into the refs array though.

The following patch changes those asserts to verify that refs is still
empty, which fixes it.

2025-02-28 Jakub Jelinek <jakub@redhat.com>

PR c++/119045
* constexpr.cc (cxx_eval_store_expression) <case REALPART_EXPR>:
Assert that refs->is_empty () rather than probe == target.
(cxx_eval_store_expression) <case IMAGPART_EXPR>: Likewise.

* g++.dg/cpp1y/constexpr-complex2.C: New test.

c++: Adjust #embed support for P1967R14

Now that the #embed paper has been voted in, the following patch
removes the pedwarn for C++26 on it (and adjusts pedwarn warning for
older C++ versions) and predefines __cpp_pp_embed FTM.

Also, the patch changes cpp_error to cpp_pedwarning, guarded by
-Wc++26-extensions for C++, and for C adds a -Wc11-c23-compat warning
about #embed.

I believe we otherwise implement everything in the paper already,
except I'm really confused by the
[Example:

#embed <data.dat> limit(__has_include("a.h"))

#if __has_embed(<data.dat> limit(__has_include("a.h")))
// ill-formed: __has_include [cpp.cond] cannot appear here
#endif

— end example]
part.  My reading of both C23 and C++ with the P1967R14 paper in
is that the first case (#embed with __has_include or __has_embed in its
clauses) is what is clearly invalid and so the ill-formed note should be
for #embed.  And the __has_include/__has_embed in __has_embed is actually
questionable.
Both C and C++ have something like
"The identifiers __has_include, __has_embed, and __has_c_attribute
shall not appear in any context not mentioned in this subclause."
or
"The identifiers __has_include and __has_cpp_attribute shall not appear
in any context not mentioned in this subclause."
(into which P1967R14 adds __has_embed) in the conditional inclusion
subclause.  #embed is defined in a different one, so using those in there
is invalid (unless "using the rules specified for conditional inclusion"
wording e.g. in limit clause overrides that).
The reason why I think it is fuzzy for __has_embed is that __has_embed
is actually defined in the Conditional inclusion subclause (so that
would mean one can use __has_include, __has_embed and __has_*attribute
in there) but its clauses are described in a different one.

GCC currently accepts
#embed __FILE__ limit (__has_include (<stdarg.h>))
#if __has_embed (__FILE__ limit (__has_include (<stdarg.h>)))
#endif
#embed __FILE__ limit (__has_embed (__FILE__))
#if __has_embed (__FILE__ limit (__has_embed (__FILE__)))
#endif
Note, it isn't just about limit clause, but also about
prefix/suffix/if_empty, except that in those cases the "using the rules
specified for conditional inclusion" doesn't apply.

In any case, I'd hope that can be dealt with incrementally (and should
be handled the same for both C and C++).

2025-02-28  Jakub Jelinek  <jakub@redhat.com>

libcpp/
* include/cpplib.h (enum cpp_warning_reason): Add
CPP_W_CXX26_EXTENSIONS enumerator.
* init.cc (lang_defaults): Set embed for GNUCXX26 and CXX26.
* directives.cc (do_embed): Adjust pedwarn wording for embed in C++,
use cpp_pedwarning instead of cpp_error and add CPP_W_C11_C23_COMPAT
warning if cpp_pedwarning hasn't diagnosed anything.
gcc/c-family/
* c.opt (Wc++26-extensions): Add CppReason(CPP_W_CXX26_EXTENSIONS).
* c-cppbuiltin.cc (c_cpp_builtins): Predefine __cpp_pp_embed=202502
for C++26.
gcc/testsuite/
* g++.dg/cpp/embed-1.C: Adjust for pedwarn wording change and don't
expect any error for C++26.
* g++.dg/cpp/embed-2.C: Adjust for pedwarn wording change and don't
expect any warning for C++26.
* g++.dg/cpp26/feat-cxx26.C: Test __cpp_pp_embed value.
* gcc.dg/cpp/embed-17.c: New test.

lto/91299 - weak definition inlined with LTO

The following fixes a thinko in the handling of interposed weak
definitions which confused the interposition check in
get_availability by setting DECL_EXTERNAL too early.

PR lto/91299
gcc/lto/
* lto-symtab.cc (lto_symtab_merge_symbols): Set DECL_EXTERNAL
only after calling get_availability.

gcc/testsuite/
* gcc.dg/lto/pr91299_0.c: New testcase.
* gcc.dg/lto/pr91299_1.c: Likewise.

ipa/111245 - bogus modref analysis for store in call that might throw

We currently record a kill for

*x_4(D) = always_throws ();

because we consider the store always executing since the appropriate
check for whether the stmt could throw is guarded by
!cfun->can_throw_non_call_exceptions.
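
A minimal sketch of why such a kill is wrong (this is not the testcase from
the PR; callee and caller are names made up for the illustration):

```cpp
// Never returns normally.
int always_throws() {
    throw 1;
}

// The store to *x is never executed, because evaluating the right-hand
// side throws first.
void callee(int *x) {
    *x = always_throws();
}

// The earlier store of 42 must therefore stay visible after the call.
int caller() {
    int value = 42;
    try {
        callee(&value);
    } catch (int) {
        // The exception leaves value untouched.
    }
    return value;
}
```

If the call were treated as always overwriting *x, the initial store of 42
could be considered dead, although caller has to return 42.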

PR ipa/111245
* ipa-modref.cc (modref_access_analysis::analyze_store): Do
not guard the check of whether the stmt could throw by
cfun->can_throw_non_call_exceptions.

* g++.dg/torture/pr111245.C: New testcase.

ifcvt: Fix ICE with (fix:SI (fix:DF (reg:DF))) [PR117712]

As documented in the manual, FIX/UNSIGNED_FIX from floating point
mode to integral mode has unspecified rounding and FIX from floating point
mode to the same floating point mode is expressing rounding toward zero.
So, some targets (arc, arm, csky, m68k, mmix, nds32, pdp11, sparc and
visium) use
(fix:SI (fix:SF (match_operand:SF 1 "..._operand")))
etc. to express the rounding toward zero during conversion to integer.
For some reason other targets don't use that.
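
As a rough source-level analogue of the nested form (an illustration only;
fix_toward_zero is a hypothetical name, and in C++ the cast alone already
truncates, unlike a lone RTL FIX to an integral mode):

```cpp
#include <cmath>

// Two steps, like (fix:SI (fix:SF x)) in RTL.
int fix_toward_zero(float x) {
    float truncated = std::trunc(x);     // inner step: round toward zero, stay in float
    return static_cast<int>(truncated);  // outer step: convert the integral value to int
}
```

For example, 2.75f and -2.75f become 2 and -2 respectively.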

Anyway, the 2 FIXes (or inner FIX with outer UNSIGNED_FIX) cause problems
since r15-2890, which removed some strict checks in ifcvt.cc on what
SET_SRC can be actually conditionalized (I must say I'm still worried
about the change, don't know why one can't get e.g. inline asm or
something with UNSPEC or some complex backend specific RTLs that
force_operand can't handle).  force_operand just ICEs on it; it can only
handle (through expand_fix) conversions from floating point to integral.

The following patch fixes this by detecting this case and just pretending
the inner FIX isn't there, i.e. calling expand_fix with the inner FIX's
operand instead, which works and on targets like arm it will just
create the nested FIXes again.

2025-02-28 Jakub Jelinek <jakub@redhat.com>

PR rtl-optimization/117712
* expr.cc (force_operand): Handle {,UNSIGNED_}FIX with
FIX operand using expand_fix on the inner FIX operand.

* gcc.dg/pr117712.c: New test.

tree-optimization/87984 - hard register assignments not preserved

The following disables redundant store elimination for stores to hard
register variables, for which it isn't valid.

PR tree-optimization/87984
* tree-ssa-dom.cc (dom_opt_dom_walker::optimize_stmt): Do
not perform redundant store elimination to hard register
variables.
* tree-ssa-sccvn.cc (eliminate_dom_walker::eliminate_stmt):
Likewise.

* gcc.target/i386/pr87984.c: New testcase.

middle-end/66279 - gimplification clobbers shared asm constraints

When the C++ frontend clones a CTOR we do not copy ASM_EXPR constraints
fully as walk_tree does not recurse to TREE_PURPOSE of TREE_LIST nodes.
At this point doing that seems too dangerous, so the following instead
keeps gimplification of ASM_EXPRs from clobbering the shared constraints
by unsharing them there, like it also unshares TREE_VALUE when it
re-writes a "+" output constraint to separate "=" output and matching
input constraint.
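
For reference, the "+" rewrite mentioned above corresponds to the following
two spellings at the source level.  This is a sketch that assumes a
compiler with GNU extended asm; the function names are made up, and the
empty asm template only exists to exercise the constraints.

```cpp
// Read-write operand written with a single "+" output constraint.
int passthrough_plus(int x) {
    __asm__ volatile("" : "+r"(x));
    return x;
}

// The same operand split into an "=" output and a matching "0" input,
// which is the form the "+" constraint is rewritten to.
int passthrough_split(int x) {
    __asm__ volatile("" : "=r"(x) : "0"(x));
    return x;
}
```

Both functions return their argument unchanged.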

PR middle-end/66279
* gimplify.cc (gimplify_asm_expr): Copy TREE_PURPOSE before
rewriting it for "+" processing.

* g++.dg/pr66279.C: New testcase.

testsuite: Remove -m32 from another i386/ test

I found another test which uses -m32 in gcc.target/i386/.  Similarly
to the previously posted test, the test ought to be tested during i686-linux
testing or x86_64-linux testing with --target_board=unix\{-m32,-m64\}.
There is nothing ia32 specific in the test, so I've just dropped the -m32.

2025-02-28 Jakub Jelinek <jakub@redhat.com>

* gcc.target/i386/strub-pr118006.c: Remove -m32 from dg-options.

testsuite: Fix up gcc.target/i386/pr118940.c test [PR118940]

The testcase uses -m32 in dg-options, something we try hard not to do.
If something should be tested only for -m32, it is a { target ia32 } test;
if it can be tested for -m64/-mx32 too and just some extra options are
needed for ia32, it should have dg-additional-options with ia32 target.

Also, the test wasn't reduced, so I've reduced it using cvise and manual
tweaks and verified the test still FAILs before r15-7700 and succeeds
with current trunk.

2025-02-28 Jakub Jelinek <jakub@redhat.com>

PR target/118940
* gcc.target/i386/pr118940.c: Drop -w, -g and -m32 from dg-options, move
-march=i386 -mregparm=3 to dg-additional-options for ia32 and -fno-pie
to dg-additional-options for pie. Reduce the test.

Fortran: Ensure finalizer is called for unreferenced variable [PR118730]

PR fortran/118730

gcc/fortran/ChangeLog:

* resolve.cc: Mark unused derived type variable with finalizers
referenced to execute finalizer when leaving scope.

gcc/testsuite/ChangeLog:

* gfortran.dg/class_array_15.f03: Remove unused variable.
* gfortran.dg/coarray_poly_7.f90: Adapt scan-tree-dump expr.
* gfortran.dg/coarray_poly_8.f90: Same.
* gfortran.dg/finalize_60.f90: New test.

MAINTAINERS: add myself to write after approval and DCO

ChangeLog:

* MAINTAINERS: Added myself as write after approval and DCO.

x86: Move TARGET_SMALL_REGISTER_CLASSES_FOR_MODE_P to i386.cc

Move the TARGET_SMALL_REGISTER_CLASSES_FOR_MODE_P target hook from
i386.h to i386.cc.

* config/i386/i386.h (TARGET_SMALL_REGISTER_CLASSES_FOR_MODE_P):
Moved to ...
* config/i386/i386.cc (TARGET_SMALL_REGISTER_CLASSES_FOR_MODE_P):
Here.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>

Daily bump.

RISC-V: Fix bug for expand_const_vector interleave [PR118931]

This patch would like to fix one bug when expanding const vector for the
interleave case.  For example, we have:

base1 = 151
step = 121

For vec_series, we will generate a vector in the format v[i] = base + i * step.
Then the vec_series will have the result below for HImode, and we can see
that the result overflows into the highest 8 bits of HImode.

v1.b = {151, 255, 7,  0, 119,  0, 231,  0, 87,  1, 199,  1, 55,   2, 167,   2}

That is, we expect v1.b to be:

v1.b = {151, 0, 7,  0, 119,  0, 231,  0, 87,  0, 199,  0, 55,   0, 167,   0}

After that it will perform the IOR with v2 for base2 (i.e. the other series).

v2.b =  {0,  17, 0, 33,   0, 49,   0, 65,  0, 81,   0, 97,  0, 113,   0, 129}

Unfortunately, base1 + i * step1 in HImode may overflow into the high
8 bits, and the high 8 bits will pollute v2 and result in an incorrect
value in the const_vector.

This patch performs the overflow to smode check before the
optimized interleave code generation.  On overflow or VLA, it falls
back to the default merge approach.
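
The check and the pollution can be sketched as follows.  This is not the
code in riscv-v.cc: series_fits and interleave_lane are hypothetical
helpers, the series is assumed to be increasing with small values, and the
lane values used in the note below are made up for the illustration.

```cpp
#include <cstdint>

// Hypothetical check: does base + i * step stay within BITS bits for all
// i in [0, count)?  Assumes an increasing series, bits < 64 and values
// small enough that the 64-bit arithmetic itself cannot wrap.
bool series_fits(std::uint64_t base, std::uint64_t step,
                 unsigned count, unsigned bits) {
    if (count == 0)
        return true;
    const std::uint64_t limit = (std::uint64_t{1} << bits) - 1;
    return base + (count - 1) * step <= limit;  // last element is the largest
}

// Hypothetical model of one 16-bit lane: LO is expected to occupy the low
// byte, HI is shifted into the high byte and the two are IORed together.
std::uint16_t interleave_lane(std::uint64_t lo, std::uint64_t hi) {
    return static_cast<std::uint16_t>(lo | (hi << 8));
}
```

With base 151 and step 121 the second element is already 272, which does
not fit into 8 bits, so series_fits (151, 121, 8, 8) is false.  Interleaving
272 with a high byte of 32 gives a high byte of 33, whereas masking the low
part to 8 bits first keeps it at 32.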

The following test suite passes with this patch:
* The rv64gcv full regression test.

PR target/118931

gcc/ChangeLog:

* config/riscv/riscv-v.cc (expand_const_vector): Add overflow to
smode check and clean up highest bits if overflow.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr118931-run-1.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

libphobos: Run unittest tests with dg-runtest.

Use `dg-runtest' test driver rather than `dg-test' to run the libphobos
unittest testsuite, same as all other libphobos tests. This prevents
the tests from being run multiple times when parallelized.

Set `libphobos_test_name' as well so that all tests get a unique name.

libphobos/ChangeLog:

* testsuite/libphobos.unittest/unittest.exp: Use `dg-runtest' rather
than `dg-test'. Set `libphobos_test_name'.

libstdc++: Fix outdated comment in <stacktrace>

My r15-998-g2a83084ce55363 change replaced the use of nothrow
operator new with a call to __get_temporary_buffer, so update the
comment to match.

libstdc++-v3/ChangeLog:

* include/std/stacktrace (_Impl::_M_allocate): Fix outdated
comment.

gimple-fold: Fix a pasto in fold_truth_andor_for_ifcombine [PR119030]

The following testcase is miscompiled since r15-7597.
The left comparison is unsigned ((x & 0x8000U) != 0) while the
right one is signed ((x >> 16) >= 0) and is actually a signbit test,
so rsignbit is 64.
After debugging this and reading the r15-7597 change, I believe there
is just a pasto.  The if (lsignbit) and if (rsignbit) blocks are pretty
much identical with just the first l on all variables starting with l
replaced with r (the only difference is that if (lsignbit) has a comment
explaining the sign <<= 1; stuff, while it isn't repeated in the second one),
except the second one was using ll_unsignedp instead of rl_unsignedp
in one spot.  I think it should use the latter; the signedness of the left
comparison doesn't affect the other one, they are basically independent
with the exception that we check that after transformations they are both
EQ or both NE and later on we try to merge them together.

2025-02-27 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/119030
* gimple-fold.cc (fold_truth_andor_for_ifcombine): Fix a pasto,
ll_unsignedp -> rl_unsignedp.

* gcc.c-torture/execute/pr119030.c: New test.

input: Fix up ICEs with --param=file-cache-files=N for N > 16 [PR118860]

The following testcase ICEs, because we first construct file_cache object
inside of *global_dc, then process options and then call file_cache::tune.
The earlier construction allocates the m_file_slots array (using new)
based on the static data member file_cache::num_file_slots, but then tune
changes it, without actually reallocating all m_file_slots arrays in already
constructed file_cache objects.

I think it is just weird to have the count be a static data member and
the pointer be non-static data member, that is just asking for issues like
this.

So, this patch changes num_file_slots into m_num_file_slots and turns tune
into a non-static member function and changes toplev.cc to call it on the
global_dc->get_file_cache () object.  It also lets tune just delete the
array and allocate it freshly if there is a change in the number of slots
or lines.
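
The shape of the fix can be sketched with a miniature class.  This is not
GCC's file_cache: slot_cache and its members are hypothetical, the slots
are plain ints, and std::unique_ptr is used here in place of the explicit
delete[] and new of the actual patch.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical miniature of the fixed design: the slot count lives in the
// same object as the array it describes, so the two cannot get out of sync.
class slot_cache {
public:
    slot_cache() : m_num_slots(16), m_slots(new int[16]()) {}

    // Non-static: resizes the storage of this object together with its count.
    void tune(std::size_t num_slots) {
        if (num_slots == m_num_slots)
            return;
        m_slots.reset(new int[num_slots]());  // old contents are dropped
        m_num_slots = num_slots;
    }

    std::size_t size() const { return m_num_slots; }
    int &slot(std::size_t i) { return m_slots[i]; }

private:
    std::size_t m_num_slots;
    std::unique_ptr<int[]> m_slots;
};
```

With a static count, tuning to 32 after construction would leave a 16
element array behind a count of 32; here slot (31) is valid after tune (32).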

Note, file_cache_slot has a similar problem, but because there are many, I
haven't moved the count into those objects; I just hope that when tune
is called there is exactly one file_cache constructed and all the
file_cache_slot objects constructed are pointed to by its m_file_slots member,
so also on lines change it just deletes it and allocates again.  I think
it should be unlikely that the cache actually has any used slots by the time
it is called.

2025-02-27  Jakub Jelinek  <jakub@redhat.com>

PR middle-end/118860
* input.h (file_cache::tune): No longer static.  Rename argument
from num_file_slots_ to num_file_slots.  Formatting fix.
(file_cache::num_file_slots): Renamed to ...
(file_cache::m_num_file_slots): ... this.  No longer static.
* input.cc (file_cache_slot::tune): Change return type from void to
size_t, return previous file_cache_slot::line_record_size value.
Formatting fixes.
(file_cache::tune): Rename argument from num_file_slots_ to
num_file_slots.  Set m_num_file_slots rather than num_file_slots.
If m_num_file_slots or file_cache_slot::line_record_size changes,
delete[] m_file_slots and new it again.
(file_cache::num_file_slots): Remove definition.
(file_cache::lookup_file): Use m_num_file_slots rather than
num_file_slots.
(file_cache::evicted_cache_tab_entry): Likewise.
(file_cache::file_cache): Likewise.  Initialize m_num_file_slots
to 16.
(file_cache::dump): Use m_num_file_slots rather than num_file_slots.
(file_cache_slot::get_next_line): Formatting fixes.
(file_cache_slot::read_line_num): Likewise.
(get_source_text_between): Likewise.
* toplev.cc (toplev::main): Call global_dc->get_file_cache ().tune
rather than file_cache::tune.

* gcc.dg/pr118860.c: New test.

nvptx: '#define MAX_FIXED_MODE_SIZE 128'

... instead of 64 via 'gcc/defaults.h':

    MAX_FIXED_MODE_SIZE GET_MODE_BITSIZE (DImode)

This fixes ICEs:

    [-FAIL: c-c++-common/pr111309-1.c  -Wc++-compat  (internal compiler error: in expand_fn_using_insn, at internal-fn.cc:268)-]
    [-FAIL:-]{+PASS:+} c-c++-common/pr111309-1.c  -Wc++-compat  (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} c-c++-common/pr111309-1.c  -Wc++-compat  [-compilation failed to produce executable-]{+execution test+}

    [-FAIL: c-c++-common/pr111309-1.c  -std=gnu++17 (internal compiler error: in expand_fn_using_insn, at internal-fn.cc:268)-]
    [-FAIL:-]{+PASS:+} c-c++-common/pr111309-1.c  -std=gnu++17 (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} c-c++-common/pr111309-1.c  -std=gnu++17 [-compilation failed to produce executable-]{+execution test+}
    [-FAIL: c-c++-common/pr111309-1.c  -std=gnu++26 (internal compiler error: in expand_fn_using_insn, at internal-fn.cc:268)-]
    [-FAIL:-]{+PASS:+} c-c++-common/pr111309-1.c  -std=gnu++26 (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} c-c++-common/pr111309-1.c  -std=gnu++26 [-compilation failed to produce executable-]{+execution test+}
    [-FAIL: c-c++-common/pr111309-1.c  -std=gnu++98 (internal compiler error: in expand_fn_using_insn, at internal-fn.cc:268)-]
    [-FAIL:-]{+PASS:+} c-c++-common/pr111309-1.c  -std=gnu++98 (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} c-c++-common/pr111309-1.c  -std=gnu++98 [-compilation failed to produce executable-]{+execution test+}

    [-FAIL: gcc.dg/torture/pr116480-1.c   -O0  (internal compiler error: in expand_fn_using_insn, at internal-fn.cc:268)-]
    [-FAIL:-]{+PASS:+} gcc.dg/torture/pr116480-1.c   -O0  (test for excess errors)
    [-FAIL: gcc.dg/torture/pr116480-1.c   -O1  (internal compiler error: in expand_fn_using_insn, at internal-fn.cc:268)-]
    [-FAIL:-]{+PASS:+} gcc.dg/torture/pr116480-1.c   -O1  (test for excess errors)
    PASS: gcc.dg/torture/pr116480-1.c   -O2  (test for excess errors)
    PASS: gcc.dg/torture/pr116480-1.c   -O3 -g  (test for excess errors)
    PASS: gcc.dg/torture/pr116480-1.c   -Os  (test for excess errors)

..., where we ran into 'gcc_assert (icode != CODE_FOR_nothing);' in
'gcc/internal-fn.cc:expand_fn_using_insn' for '__int128' '__builtin_clzg' etc.:

    during RTL pass: expand
    [...]/c-c++-common/pr111309-1.c: In function 'clzI':
    [...]/c-c++-common/pr111309-1.c:69:10: internal compiler error: in expand_fn_using_insn, at internal-fn.cc:268
    0x120ec2cf internal_error(char const*, ...)
            [...]/gcc/diagnostic-global-context.cc:517
    0x102c7c5b fancy_abort(char const*, int, char const*)
            [...]/gcc/diagnostic.cc:1722
    0x109708eb expand_fn_using_insn
            [...]/gcc/internal-fn.cc:268
    0x1098114f expand_internal_call(internal_fn, gcall*)
            [...]/gcc/internal-fn.cc:5273
    0x1098114f expand_internal_call(gcall*)
            [...]/gcc/internal-fn.cc:5281
    0x10594fc7 expand_call_stmt
            [...]/gcc/cfgexpand.cc:3049
    [...]

Likewise, as of commit e8ad697a75b0870a833366daf687668a57cabb6e
"libstdc++: Use new type-generic built-ins in <bit> [PR118855]",
the libstdc++ target library build ICEd in the same way.
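
For reference, a 128-bit count-leading-zeros can be composed from two 64-bit halves.  The following is only a sketch under assumptions: 'clz128' is a hypothetical helper name, it relies on the 'unsigned __int128' and '__builtin_clzll' GNU extensions, and it is not meant to describe how GCC actually expands '__builtin_clzg' for '__int128'.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of GCC): count the leading zero bits
   of a nonzero 128-bit value by looking at its two 64-bit halves.  */
static int
clz128 (unsigned __int128 x)
{
  /* __builtin_clzll is undefined for a zero argument, so zero input
     is rejected here.  */
  assert (x != 0);

  uint64_t hi = (uint64_t) (x >> 64);
  uint64_t lo = (uint64_t) x;

  /* If any bit of the upper half is set, the answer lies there.  */
  if (hi != 0)
    return __builtin_clzll (hi);

  /* Otherwise all 64 upper bits are zero; continue in the lower half.  */
  return 64 + __builtin_clzll (lo);
}
```

For example, 'clz128 (1)' yields 127, and 'clz128 ((unsigned __int128) 1 << 64)' yields 63.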

Additionally, this change fixes:

    [-FAIL:-]{+PASS:+} gcc.dg/pr105094.c (test for excess errors)

..., which was:

    [...]/gcc.dg/pr105094.c: In function 'foo':
    [...]/gcc.dg/pr105094.c:11:12: error: size of variable 's' is too large

And, finally, regarding 'gcc.target/nvptx/stack_frame-1.c'.  Before, in
'gcc/cfgexpand.cc': 'expand_used_vars' -> 'expand_used_vars_for_block' ->
'expand_one_var' for 'ww' -> 'gcc/function.cc:use_register_for_decl' due to
'DECL_MODE (decl) == BLKmode' did 'return false;', thus -> 'add_stack_var'
(even if 'ww' wasn't then actually living on the stack).  Now, 'ww' has
'TImode' and 'use_register_for_decl' does 'return true;', thus ->
'expand_one_register_var', and therefore no unused stack frame emitted.

gcc/
* config/nvptx/nvptx.h (MAX_FIXED_MODE_SIZE): '#define'.
gcc/testsuite/
* gcc.target/nvptx/stack_frame-1.c: Adjust.

Add 'gcc.target/nvptx/stack_frame-1.c'

gcc/testsuite/
* gcc.target/nvptx/stack_frame-1.c: New.

nvptx: Build libgfortran with '-mfake-ptx-alloca' [PR107635]

As of recent commit 8bf0ee8d62b8a08e808344d31354ab713157e15d
"Fortran: Add transfer_between_remotes [PR107635]", we've got 'alloca' usage
in 'libgfortran/caf/single.c:_gfortran_caf_transfer_between_remotes', and
the libgfortran target library fails to build for legacy configurations where
PTX 'alloca' is not available:

    ../../../../source-gcc/libgfortran/caf/single.c: In function ‘_gfortran_caf_transfer_between_remotes’:
    ../../../../source-gcc/libgfortran/caf/single.c:675:23: sorry, unimplemented: dynamic stack allocation not supported
      675 |       transfer_desc = __builtin_alloca (desc_size);
          |                       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
    ../../../../source-gcc/libgfortran/caf/single.c:680:20: sorry, unimplemented: dynamic stack allocation not supported
      680 |     transfer_ptr = __builtin_alloca (*opt_dst_charlen * src_size);
          |                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    make[6]: *** [Makefile:4675: caf/single.lo] Error 1

With '-mfake-ptx-alloca', libgfortran builds successfully again, and compared
to before, we've got only a small number of regressions due to nvptx 'ld'
complaining about 'unresolved symbol __GCC_nvptx__PTX_alloca_not_supported':

    [-PASS:-]{+FAIL:+} gfortran.dg/coarray/codimension_2.f90 -fcoarray=lib  -O2  -lcaf_single (test for excess errors)

    [-PASS:-]{+FAIL:+} gfortran.dg/coarray/event_4.f08 -fcoarray=lib  -O2  -lcaf_single (test for excess errors)
    [-PASS:-]{+UNRESOLVED:+} gfortran.dg/coarray/event_4.f08 -fcoarray=lib  -O2  -lcaf_single [-execution test-]{+compilation failed to produce executable+}

    [-PASS:-]{+FAIL:+} gfortran.dg/coarray/fail_image_2.f08 -fcoarray=lib  -O2  -lcaf_single (test for excess errors)
    [-PASS:-]{+UNRESOLVED:+} gfortran.dg/coarray/fail_image_2.f08 -fcoarray=lib  -O2  -lcaf_single [-execution test-]{+compilation failed to produce executable+}

    [-PASS:-]{+FAIL:+} gfortran.dg/coarray/proc_pointer_assign_1.f90 -fcoarray=lib  -O2  -lcaf_single (test for excess errors)
    [-PASS:-]{+UNRESOLVED:+} gfortran.dg/coarray/proc_pointer_assign_1.f90 -fcoarray=lib  -O2  -lcaf_single [-execution test-]{+compilation failed to produce executable+}

    [-PASS:-]{+FAIL:+} gfortran.dg/coarray_43.f90   -O  (test for excess errors)

That's acceptable for such legacy PTX configurations.

PR target/107635
libgfortran/
* config/t-nvptx: New.
* configure.host [nvptx] (tmake_file): Add it.

nvptx: Support '-mfake-ptx-alloca'

With '-mfake-ptx-alloca' enabled, the user-visible behavior changes only
for configurations where PTX 'alloca' is not available. Rather than a
compile-time 'sorry, unimplemented: dynamic stack allocation not supported'
in the presence of dynamic stack allocation, compilation and assembly then
succeed. However, attempting to link in such '*.o' files then fails due
to unresolved symbol '__GCC_nvptx__PTX_alloca_not_supported'.

This is meant to be used in scenarios where large volumes of code are
compiled, a small fraction of which runs into dynamic stack allocation, but
these parts are not important for specific use cases, and we'd thus like the
build to succeed, and error out just upon actual, very rare use of the
offending '*.o' files.
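
To illustrate what "dynamic stack allocation" means here, the following sketch shows a function whose buffer size is only known at run time.  'count_nonzero' is a made-up example, not taken from the test cases below; whether the compiler really emits a dynamic allocation for it depends on the optimization level and on what the back end can clean out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical example: 'buf' is a variable-length array, so its
   storage has to be carved out of the stack at run time.  This is the
   kind of code that is affected on configurations where PTX 'alloca'
   is not available.  */
static size_t
count_nonzero (const unsigned char *src, size_t n)
{
  assert (src);

  /* VLA; use at least one element, as a zero-length VLA is undefined.  */
  unsigned char buf[n ? n : 1];
  memcpy (buf, src, n);

  size_t count = 0;
  for (size_t i = 0; i < n; i++)
    count += buf[i] != 0;
  return count;
}
```

A call to '__builtin_alloca' with a run-time size falls into the same category.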

gcc/
* config/nvptx/nvptx.opt (-mfake-ptx-alloca): New.
* config/nvptx/nvptx-protos.h (nvptx_output_fake_ptx_alloca):
Declare.
* config/nvptx/nvptx.cc (nvptx_output_fake_ptx_alloca): New.
* config/nvptx/nvptx.md (define_insn "@nvptx_alloca_<mode>")
[!(TARGET_PTX_7_3 && TARGET_SM52)]: Use it for
'-mfake-ptx-alloca'.
gcc/testsuite/
* gcc.target/nvptx/alloca-1-O0_-mfake-ptx-alloca.c: New.
* gcc.target/nvptx/alloca-2-O0_-mfake-ptx-alloca.c: Likewise.
* gcc.target/nvptx/alloca-4-O3_-mfake-ptx-alloca.c: Likewise.
* gcc.target/nvptx/vla-1-O0_-mfake-ptx-alloca.c: Likewise.
* gcc.target/nvptx/alloca-4-O3.c:
'dg-additional-options -mfake-ptx-alloca'.

nvptx: Delay 'sorry, unimplemented: dynamic stack allocation not supported' from expansion time to code generation

This gives the back end a chance to clean out a few more unnecessary instances
of dynamic stack allocation.  This progresses:

    PASS: gcc.dg/pr78902.c  (test for warnings, line 7)
    PASS: gcc.dg/pr78902.c  (test for warnings, line 8)
    PASS: gcc.dg/pr78902.c  (test for warnings, line 9)
    PASS: gcc.dg/pr78902.c  (test for warnings, line 10)
    PASS: gcc.dg/pr78902.c  (test for warnings, line 11)
    PASS: gcc.dg/pr78902.c  (test for warnings, line 12)
    PASS: gcc.dg/pr78902.c  (test for warnings, line 13)
    PASS: gcc.dg/pr78902.c strndup excessive bound at line 14 (test for warnings, line 13)
    [-UNSUPPORTED: gcc.dg/pr78902.c: dynamic stack allocation not supported-]
    {+PASS: gcc.dg/pr78902.c (test for excess errors)+}

    UNSUPPORTED: gcc.dg/torture/pr71901.c   -O0 : dynamic stack allocation not supported
    [-UNSUPPORTED:-]{+PASS:+} gcc.dg/torture/pr71901.c   -O1  [-: dynamic stack allocation not supported-]{+(test for excess errors)+}
    UNSUPPORTED: gcc.dg/torture/pr71901.c   -O2 : dynamic stack allocation not supported
    UNSUPPORTED: gcc.dg/torture/pr71901.c   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions : dynamic stack allocation not supported
    UNSUPPORTED: gcc.dg/torture/pr71901.c   -O3 -g : dynamic stack allocation not supported
    [-UNSUPPORTED:-]{+PASS:+} gcc.dg/torture/pr71901.c   -Os  [-: dynamic stack allocation not supported-]{+(test for excess errors)+}

    UNSUPPORTED: gcc.dg/torture/pr78742.c   -O0 : dynamic stack allocation not supported
    [-UNSUPPORTED:-]{+PASS:+} gcc.dg/torture/pr78742.c   -O1  [-: dynamic stack allocation not supported-]{+(test for excess errors)+}
    [-UNSUPPORTED:-]{+PASS:+} gcc.dg/torture/pr78742.c   -O2  [-: dynamic stack allocation not supported-]{+(test for excess errors)+}
    [-UNSUPPORTED:-]{+PASS:+} gcc.dg/torture/pr78742.c   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  [-: dynamic stack allocation not supported-]{+(test for excess errors)+}
    [-UNSUPPORTED:-]{+PASS:+} gcc.dg/torture/pr78742.c   -O3 -g  [-: dynamic stack allocation not supported-]{+(test for excess errors)+}
    UNSUPPORTED: gcc.dg/torture/pr78742.c   -Os : dynamic stack allocation not supported

    [-UNSUPPORTED:-]{+PASS:+} gfortran.dg/pr101267.f90   -O  [-: dynamic stack allocation not supported-]{+(test for excess errors)+}

    [-UNSUPPORTED:-]{+PASS:+} gfortran.dg/pr112404.f90   -O  [-: dynamic stack allocation not supported-]{+(test for excess errors)+}

gcc/
* config/nvptx/nvptx.md (define_expand "allocate_stack")
[!TARGET_SOFT_STACK]: Move
'sorry ("dynamic stack allocation not supported");'...
(define_insn "@nvptx_alloca_<mode>"): ... here.
gcc/testsuite/
* gcc.target/nvptx/alloca-1-unused-O0-sm_30.c: Adjust.

nvptx: Add test cases for dead/unused 'alloca'/VLA

gcc/testsuite/
* gcc.target/nvptx/alloca-1-dead-O0-sm_30.c: New.
* gcc.target/nvptx/alloca-1-dead-O0.c: Likewise.
* gcc.target/nvptx/alloca-1-dead-O1-sm_30.c: Likewise.
* gcc.target/nvptx/alloca-1-dead-O1.c: Likewise.
* gcc.target/nvptx/alloca-1-unused-O0-sm_30.c: Likewise.
* gcc.target/nvptx/alloca-1-unused-O0.c: Likewise.
* gcc.target/nvptx/alloca-1-unused-O1-sm_30.c: Likewise.
* gcc.target/nvptx/alloca-1-unused-O1.c: Likewise.
* gcc.target/nvptx/vla-1-dead-O0-sm_30.c: Likewise.
* gcc.target/nvptx/vla-1-dead-O0.c: Likewise.
* gcc.target/nvptx/vla-1-dead-O1-sm_30.c: Likewise.
* gcc.target/nvptx/vla-1-dead-O1.c: Likewise.
* gcc.target/nvptx/vla-1-unused-O0-sm_30.c: Likewise.
* gcc.target/nvptx/vla-1-unused-O0.c: Likewise.
* gcc.target/nvptx/vla-1-unused-O1-sm_30.c: Likewise.
* gcc.target/nvptx/vla-1-unused-O1.c: Likewise.

GCC: Documentation of -x option

This change updates information about the -x option to clarify
that it does not ensure standards compliance. Sparked by
discussions in the following PR.

PR fortran/108369

gcc/ChangeLog:

* doc/invoke.texi: Add a note to clarify. Adjust some wording.

c++: ICE with GOTO_EXPR [PR118928]

In this PR we crash in cxx_eval_constant_expression/GOTO_EXPR on:

  gcc_assert (cxx_dialect >= cxx23);

The code obviously doesn't expect to see a goto pre-C++23.  But we can
get here with the new prvalue optimization.  In this test we found
ourselves in synthesize_method for X::X().  This function calls:

a) finish_function, which does cp_genericize -> ... -> genericize_c_loops,
    which creates the GOTO_EXPR;
b) expand_or_defer_fn -> maybe_clone_body -> ... -> cp_fold_function
    where we reach the new maybe_constant_init call and crash on the
    goto.

Since we can validly get to that assert, I think we should just remove
it.  I don't see other similar asserts like this one.

PR c++/118928

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_constant_expression) <case GOTO_EXPR>: Remove
an assert.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/constexpr-prvalue5.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>

[PR118940][LRA]: Add a test

The patch for PR115458 also fixes this PR.  So this patch adds only a
test case, which can be used for testing aspects of LRA that differ from
those covered by the PR115458 test case.

gcc/testsuite/ChangeLog:

PR target/118940
* gcc.target/i386/pr118940.c: New test.

[PR116336][LRA]: Add a test

The patch for PR116234 also fixes PR116336.  So this patch adds only the test
case, which is very different from the PR116234 one.

gcc/testsuite/ChangeLog:

PR rtl-optimization/116336
* gcc.dg/pr116336.c: New test.

c++: too many errors with sneaky template [PR118516]

Since C++20 P0846, a name followed by a < can be treated as a template-name
even though name lookup did not find a template-name.  That happens
in this test with "i < foo ()":

  for (int id = 0; i < foo(); ++id);

and results in a raft of errors about non-constant foo().  The problem
is that the require_potential_constant_expression call in
cp_parser_template_argument emits errors even when we're parsing
tentatively.  So we repeat the error when we're trying to parse
as a nested-name-specifier, type-name, etc.

Guarding the call with !cp_parser_uncommitted_to_tentative_parse_p would
mean that require_potential_constant_expression never gets called.  But
we don't need the call at all as far as I can tell.  Stuff like

  template<int N> struct S { };
  int foo () { return 4; }
  void
  g ()
  {
    S<foo()> s;
  }

gets diagnosed in convert_nontype_argument.  In fact, with this patch,
we only emit "call to non-constexpr function" once.  (That is, in C++17
only; C++14 uses a different path.)

PR c++/118516

gcc/cp/ChangeLog:

* parser.cc (cp_parser_template_argument): Don't call
require_potential_constant_expression.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/fn-template11.C:
* g++.dg/template/fn-template1.C: New test.
* g++.dg/template/fn-template2.C: New test.

testsuite: arm: Avoid incremental link warnings in pr61123-enum-size

This test uses incremental linking, but that can generate warnings if
the LTO step contains a mix of LTO and non-LTO object files (this can
happen when there's a testglue file that is normally included during
linking).

We don't care about the testglue, though, so just tell the LTO
optimizer to generate nolto-rel output, which is what it is falling
back to anyway.

gcc/testsuite:
* gcc.target/arm/lto/pr61123-enum-size_0.c: (dg-lto-options) Move
linker related options to ...
(dg-extra-ld-options): ... here. Add -flinker-output=nolto-rel.

Fortran: Fix ICE on associate of pointer [PR118789]

Fix ICE when associating a pointer to void (c_ptr) by looking at the
compatibility of the type hierarchy.

PR fortran/118789

gcc/fortran/ChangeLog:

* trans-stmt.cc (trans_associate_var): Compare pointed to types when
expr to associate is already a pointer.

gcc/testsuite/ChangeLog:

* gfortran.dg/associate_73.f90: New test.

i386: Treat Granite Rapids/Granite Rapids-D/Diamond Rapids similar as Sapphire Rapids in x86-tune.def

Since GNR, GNR-D, and DMR are all P-core based, we should treat them
just like SPR for now.

gcc/ChangeLog:

* config/i386/x86-tune.def
(X86_TUNE_DEST_FALSE_DEP_FOR_GLC): Add GNR, GNR-D, DMR.
(X86_TUNE_AVOID_256FMA_CHAINS): Ditto.
(X86_TUNE_AVX512_MOVE_BY_PIECES): Ditto.
(X86_TUNE_AVX512_STORE_BY_PIECES): Ditto.

gimple-range-phi: Fix comment typo

While reading this file I noticed a typo in the comment, which
this patch fixes.

2025-02-27 Jakub Jelinek <jakub@redhat.com>

* gimple-range-phi.cc (phi_analyzer::process_phi): Fix comment typo,
dpoesn;t -> doesn't.

Makefile: Link in {simple,lazy}-diagnostic-path.o [PR116143]

Some of the plugin.exp tests FAIL in --enable-checking=release builds while
they succeed in --enable-checking=yes builds.
Initially I've changed some small simple out of line methods into inline ones
in the header, but the tests kept failing, just with different symbols.

The _ZN22simple_diagnostic_path9add_eventEmP9tree_nodeiPKcz symbol (like the
others) is normally emitted in simple-diagnostic-path.o; this isn't some
fancy C++ optimization of classes with final methods or an LTO optimization.

The problem is that simple-diagnostic-path.o is, like most objects, added to
libbackend.a, and we then link libbackend.a without -Wl,--whole-archive ...
-Wl,--no-whole-archive around it (and can't easily, as not all system compilers
and linkers support that).
With --enable-checking=yes simple-diagnostic-path.o is pulled in, because
selftest-run-tests.o calls simple_diagnostic_path_cc_tests and so
simple-diagnostic-path.o is linked in.
With --enable-checking=release self-tests aren't done and nothing links in
simple-diagnostic-path.o, because nothing in the compiler proper needs
anything from it, only the plugin tests.

Using -Wl,-M on cc1 linking, I see that in --enable-checking=release
build
analyzer/analyzer-selftests.o
digraph.o
dwarf2codeview.o
fibonacci_heap.o
function-tests.o
hash-map-tests.o
hash-set-tests.o
hw-doloop.o
insn-peep.o
lazy-diagnostic-path.o
options-urls.o
ordered-hash-map-tests.o
pair-fusion.o
print-rtl-function.o
resource.o
rtl-tests.o
selftest-rtl.o
selftest-run-tests.o
simple-diagnostic-path.o
splay-tree-utils.o
typed-splay-tree.o
vmsdbgout.o
aren't linked into cc1 (the *test* for obvious reasons of not doing
selftests, pair-fusion.o because it is aarch64 specific, hw-doloop.o because
x86 doesn't have doloop opts, vmsdbgout.o because not on VMS).

So, the question is which, if any, of digraph.o, fibonacci_heap.o,
hw-doloop.o, insn-peep.o, lazy-diagnostic-path.o, options-urls.o,
pair-fusion.o, print-rtl-function.o, resource.o, simple-diagnostic-path.o,
splay-tree-utils.o, typed-splay-tree.o are supposed to be part of the
plugin API, and how we arrange for those to be linked in when
plugins are enabled.

The following patch just unconditionally adds the
{simple,lazy}-diagnostic-path.o objects to the link lines before libbackend.a
so that their content is available to plugin users.

2025-02-27 Jakub Jelinek <jakub@redhat.com>

PR testsuite/116143
* Makefile.in (EXTRA_BACKEND_OBJS): New variable.
(BACKEND): Use it before libbackend.a.

alias: Perform offset arithmetics in poly_offset_int rather than poly_int64 [PR118819]

This PR is about a ubsan error on the c - cx1 + cy1 evaluation in the first
hunk.

The following patch hopefully fixes that by doing the additions/subtractions
in poly_offset_int rather than poly_int64 and then converting back to poly_int64.
If it doesn't fit, -1 is returned (which means it is unknown if there is a conflict
or not).
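
As a rough illustration of the approach (a standalone sketch, not the alias.cc code): poly_offset_int and poly_int64 are replaced here by the assumed stand-ins '__int128' and 'int64_t', and 'adjust_offset' is a hypothetical helper name.  The intermediate result is computed in the wider type, so it cannot overflow, and is converted back only if it fits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not GCC's memrefs_conflict_p): evaluate
   c - cx1 + cy1 in a 128-bit type and store it in *RESULT if it is
   representable as int64_t.  Return false otherwise, which a caller
   would treat as "unknown whether there is a conflict".  */
static bool
adjust_offset (int64_t c, int64_t cx1, int64_t cy1, int64_t *result)
{
  assert (result);

  /* Three 64-bit operands cannot overflow a 128-bit intermediate.  */
  __int128 wide = (__int128) c - cx1 + cy1;

  if (wide < INT64_MIN || wide > INT64_MAX)
    return false;

  *result = (int64_t) wide;
  return true;
}
```

With plain 'int64_t' arithmetic, an input such as c == INT64_MAX and cx1 == -1 would overflow in the subtraction; here it is simply reported as not representable.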

2025-02-27 Jakub Jelinek <jakub@redhat.com>

PR middle-end/118819
* alias.cc (memrefs_conflict_p): Perform arithmetics on c, xsize and
ysize in poly_offset_int and return -1 if it is not representable in
poly_int64.

Daily bump.

libstdc++: Add code comment documenting LWG 4027 change [PR118083]

PR libstdc++/118083

libstdc++-v3/ChangeLog:

* include/bits/ranges_base.h
(ranges::__access::__possibly_const_range): Mention LWG 4027.

c: Assorted fixes for flexible array members in unions [PR119001]

r15-209 allowed flexible array members inside of unions, but as the
following testcase shows, not everything has been adjusted for that.
Unlike in structures, in unions a flexible array member (as an extension)
can be any of the members, not just the last one, as in a union all
members are effectively last.
The first hunk is about an ICE on the initialization of a FAM
in a union which is not the last FIELD_DECL with a string literal,
the second hunk is just a formatting fix, the third hunk fixes a bug in which
we were just throwing away the initializers (except for string literals)
of FAMs in unions which aren't the last FIELD_DECL, and the last hunk
is to diagnose FAM errors in unions the same as for structures, in
particular trying to initialize a FAM with a non-constant initializer or
in a nested context.

2025-02-26 Jakub Jelinek <jakub@redhat.com>

PR c/119001
gcc/
* varasm.cc (output_constructor_regular_field): Don't fail
assertion if next is non-NULL and FIELD_DECL if
TREE_CODE (local->type) is UNION_TYPE.
gcc/c/
* c-typeck.cc (pop_init_level): Don't clear constructor_type
if DECL_CHAIN of constructor_fields is NULL but p->type is UNION_TYPE.
Formatting fix.
(process_init_element): Diagnose non-static initialization of flexible
array member in union or FAM in union initialization in nested context.
gcc/testsuite/
* gcc.dg/pr119001-1.c: New test.
* gcc.dg/pr119001-2.c: New test.

c: stddef.h C23 fixes [PR114870]

The stddef.h header for C23 defines the __STDC_VERSION_STDDEF_H__ and
unreachable macros multiple times in some cases.
The header doesn't have a normal multiple inclusion guard, because it supports
glibc inclusion with __need_{size_t,wchar_t,ptrdiff_t,wint_t,NULL}.
The definition of __STDC_VERSION_STDDEF_H__ and unreachable is done
solely in the #ifdef _STDDEF_H part, so they are defined only if stddef.h
is included without those __need_* macros defined.  But once
stddef.h is included without the __need_* macros, _STDDEF_H is defined,
and while further stddef.h includes without __need_* macros don't do
anything:
#if (!defined(_STDDEF_H) && !defined(_STDDEF_H_) && !defined(_ANSI_STDDEF_H) \
      && !defined(__STDDEF_H__)) \
     || defined(__need_wchar_t) || defined(__need_size_t) \
     || defined(__need_ptrdiff_t) || defined(__need_NULL) \
     || defined(__need_wint_t)
if one includes the whole stddef.h first and then stddef.h with some of the
__need_* macros defined, the #ifdef _STDDEF_H part is used again.
It isn't that big a deal for most cases, as it uses extra guarding macros
like:
#ifndef _GCC_MAX_ALIGN_T
#define _GCC_MAX_ALIGN_T
...
#endif
etc., but for __STDC_VERSION_STDDEF_H__/unreachable nothing like that is
used.

So, either we do what the following patch does and just don't define
__STDC_VERSION_STDDEF_H__/unreachable a second time, or use #ifndef
unreachable separately for the #define unreachable() case, or use a
new _GCC_STDC_VERSION_STDDEF_H macro to guard this (or two, one for
__STDC_VERSION_STDDEF_H__ and one for unreachable), or rework the initial
condition to be just
#if !defined(_STDDEF_H) && !defined(_STDDEF_H_) && !defined(_ANSI_STDDEF_H) \
     && !defined(__STDDEF_H__)
- I really don't understand why the header should do anything at all after
it has been included once without __need_* macros.  But changing how this
behaves after 35 years might be risky for various OS/libc combinations.

2025-02-26  Jakub Jelinek  <jakub@redhat.com>

PR c/114870
* ginclude/stddef.h (__STDC_VERSION_STDDEF_H__, unreachable): Don't
redefine multiple times if stddef.h is first included without __need_*
defines and later with them.  Move nullptr_t and unreachable and
__STDC_VERSION_STDDEF_H__ definitions into the same
defined (__STDC_VERSION__) && __STDC_VERSION__ > 201710L #if block.

* gcc.dg/c23-stddef-2.c: New test.

arm: Fix up REVERSE_CONDITION macro [PR119002]

The linaro CI found my PR119002 patch broke bootstrap on arm.
The problem seems to be that arm has an incorrect REVERSE_CONDITION macro
definition.
All other targets' REVERSE_CONDITION definitions and the default one
just use the macro's arguments, while the arm.h definition uses the MODE
argument but uses code instead of CODE (the first argument).
This happened to work because before my patch the only use of the
macro was in jump.cc with
  /* First see if machine description supplies us way to reverse the
     comparison.  Give it priority over everything else to allow
     machine description to do tricks.  */
  if (GET_MODE_CLASS (mode) == MODE_CC
      && REVERSIBLE_CC_MODE (mode))
    return REVERSE_CONDITION (code, mode);
but in my patch it is used with GT rather than code.
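
The failure mode can be reproduced in isolation.  The following is a simplified sketch, not the arm.h definition: 'enum cond', 'reverse', 'REVERSE_BUGGY' and 'REVERSE_FIXED' are stand-ins invented for illustration.  The buggy macro ignores its CODE parameter and instead picks up whatever variable named 'code' is in scope at the expansion site.

```c
#include <assert.h>

/* Stand-ins for comparison codes; these are not GCC's rtx codes.  */
enum cond { COND_GT, COND_LE, COND_LT, COND_GE };

static enum cond
reverse (enum cond c)
{
  switch (c)
    {
    case COND_GT: return COND_LE;
    case COND_LE: return COND_GT;
    case COND_LT: return COND_GE;
    default:
      assert (c == COND_GE);
      return COND_LT;
    }
}

/* Buggy: the body names 'code', a variable that must exist where the
   macro is expanded, and never uses the CODE parameter.  */
#define REVERSE_BUGGY(CODE, MODE) ((void) (MODE), reverse (code))

/* Fixed: the body uses the macro parameter.  */
#define REVERSE_FIXED(CODE, MODE) ((void) (MODE), reverse (CODE))

static enum cond
reverse_gt_buggy (enum cond code, int mode)
{
  /* Asks for the reverse of COND_GT, but reverses 'code' instead.  */
  return REVERSE_BUGGY (COND_GT, mode);
}

static enum cond
reverse_gt_fixed (enum cond code, int mode)
{
  (void) code;
  return REVERSE_FIXED (COND_GT, mode);
}
```

With 'code' set to COND_LT, 'reverse_gt_buggy' returns COND_GE while 'reverse_gt_fixed' returns COND_LE.  If no variable named 'code' is in scope at the expansion site, the buggy variant does not compile at all.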

2025-02-26  Jakub Jelinek  <jakub@redhat.com>

PR rtl-optimization/119002
* config/arm/arm.h (REVERSE_CONDITION): Use CODE - the macro
argument - in the macro rather than code.

[PR119021][LRA]: Fix rtl correctness check failure in LRA.

The patch to fix PR115458 contained a code change in the handling of asm
errors to avoid cycling when reporting the error for asm gotos.  This
code was wrong and resulted in an RTL correctness check failure.  This
patch reverts the code change and solves the cycling in asm error
reporting in a different way.

gcc/ChangeLog:

PR middle-end/119021
* lra.cc (lra_asm_insn_error): Use lra_invalidate_insn_data
instead of lra_update_insn_regno_info.
* lra-assigns.cc (lra_split_hard_reg_for): Restore old code.

[testsuite] add x86 effective target

I got tired of repeating the conditional that recognizes ia32 or
x86_64, and introduced 'x86' as a shorthand for that, adjusting all
occurrences in target-supports.exp, to set an example.  I found some
patterns that recognized i?86* and x86_64*, but I took those as likely
cut&pastos instead of trying to preserve those weirdnesses.

for  gcc/ChangeLog

* doc/sourcebuild.texi: Add x86 effective target.

for  gcc/testsuite/ChangeLog

* lib/target-supports.exp (check_effective_target_x86): New.
Replace all uses of i?86-*-* and x86_64-*-* in this file.

[testsuite] adjust expectations of x86 vect-simd-clone tests

Some vect-simd-clone tests fail when targeting ancient x86 variants,
because the expected transformations only take place with -msse4 or
higher.

So arrange for these tests to take an -msse4 option on x86, so that
the expected vectorization takes place, but decay to a compile test if
vect.exp would enable execution but the target doesn't have an sse4
runtime.  This requires the new dg-do-if to override the action on a
target while retaining the default action on others, instead of
disabling the test.

We can count on avx512f compile-time support for these tests, because
vect_simd_clones requires that on x86, and that implies sse4 support,
so we need not complicate the scan conditionals with tests for sse4,
except on the last test.

for  gcc/ChangeLog

* doc/sourcebuild.texi (dg-do-if): Document.

for  gcc/testsuite/ChangeLog

* lib/target-supports-dg.exp (dg-do-if): New.
* gcc.dg/vect/vect-simd-clone-16f.c: Use -msse4 on x86, and
skip in case execution is enabled but the runtime isn't.
* gcc.dg/vect/vect-simd-clone-17f.c: Likewise.
* gcc.dg/vect/vect-simd-clone-18f.c: Likewise.
* gcc.dg/vect/vect-simd-clone-20.c: Likewise, but only skip
the scan test.

simple-diagnostic-path: Inline two trivial methods [PR116143]

Various plugin tests fail with --enable-checking=release, because the
num_events and num_threads methods of simple_diagnostic_path are only used
inside of #if CHECKING_P code inside of GCC proper and then tested inside of
some plugin tests. So, with --enable-checking=yes they are compiled into
cc1/cc1plus etc. binaries and plugins can call those, but with
--enable-checking=release they are optimized away (at least for LTO builds).

As they are trivial, the following patch just defines them inline, so that
the plugin tests get their definitions directly and don't have to rely
on cc1/cc1plus etc. exporting those.

2025-02-26 Jakub Jelinek <jakub@redhat.com>

PR testsuite/116143
* simple-diagnostic-path.h (simple_diagnostic_path::num_events): Define
inline.
(simple_diagnostic_path::num_threads): Likewise.
* simple-diagnostic-path.cc (simple_diagnostic_path::num_events):
Remove out of line definition.
(simple_diagnostic_path::num_threads): Likewise.

Fortran: Remove SAVE_EXPR on lhs in assign [PR108233]

With vector-shaped data types such as complex numbers, fold_convert
inserts a SAVE_EXPR.  Using that on the lhs of an assignment prevented
the update of the variable when in a coarray.

PR fortran/108233

gcc/fortran/ChangeLog:

* trans-expr.cc (gfc_trans_assignment_1): Remove SAVE_EXPR on lhs.

gcc/testsuite/ChangeLog:

* gfortran.dg/coarray/complex_1.f90: New test.

testsuite: Add pragma novector to more tests [PR118464]

The entry finding loops in these tests will now be
vectorized.  As such we get more failures because the
tests were not expecting them to be vectorized.

Fixed by adding #pragma GCC novector.

gcc/testsuite/ChangeLog:

PR tree-optimization/118464
PR tree-optimization/116855
* g++.dg/ext/pragma-unroll-lambda-lto.C: Add pragma novector.
* gcc.dg/tree-ssa/gen-vect-2.c: Likewise.
* gcc.dg/tree-ssa/gen-vect-25.c: Likewise.
* gcc.dg/tree-ssa/gen-vect-32.c: Likewise.
* gcc.dg/tree-ssa/ivopt_mult_2g.c: Likewise.
* gcc.dg/tree-ssa/ivopts-5.c: Likewise.
* gcc.dg/tree-ssa/ivopts-6.c: Likewise.
* gcc.dg/tree-ssa/ivopts-7.c: Likewise.
* gcc.dg/tree-ssa/ivopts-8.c: Likewise.
* gcc.dg/tree-ssa/ivopts-9.c: Likewise.
* gcc.dg/tree-ssa/predcom-dse-1.c: Likewise.
* gcc.dg/tree-ssa/predcom-dse-10.c: Likewise.
* gcc.dg/tree-ssa/predcom-dse-11.c: Likewise.
* gcc.dg/tree-ssa/predcom-dse-12.c: Likewise.
* gcc.dg/tree-ssa/predcom-dse-2.c: Likewise.
* gcc.dg/tree-ssa/predcom-dse-3.c: Likewise.
* gcc.dg/tree-ssa/predcom-dse-4.c: Likewise.
* gcc.dg/tree-ssa/predcom-dse-5.c: Likewise.
* gcc.dg/tree-ssa/predcom-dse-6.c: Likewise.
* gcc.dg/tree-ssa/predcom-dse-7.c: Likewise.
* gcc.dg/tree-ssa/predcom-dse-8.c: Likewise.
* gcc.dg/tree-ssa/predcom-dse-9.c: Likewise.
* gcc.target/i386/pr90178.c: Likewise.

Daily bump.