git.ipfire.org Git - thirdparty/gcc.git/log

Daily bump.

Move gfortran.dg/gomp/allocate-static.f90 to libgomp.fortran/

The testcase was turned into a 'dg-do run' check to check for the alignment,
but this only works in testsuite/gfortran.dg, causing link errors for
out-of-tree testing. The test was added in r15-4104-ga8caeaacf499d5.

gcc/testsuite/:

* gfortran.dg/gomp/allocate-static.f90: Move to libgomp/testsuite/.

libgomp/:

* testsuite/libgomp.fortran/allocate-static.f90: Moved from
gcc/testsuite/ as it is a dg-do run test; use real omp_lib_kinds
instead of local definition

libgomp.texi: Update and cleanup of Impl. Status of OpenMP TR13

libgomp/ChangeLog:

* libgomp.texi (OpenMP Technical Report 13): Wording cleanup;
sort as in Appendix B; add missing items; remove duplicates.

libcpp: Use constexpr for _cpp_trigraph_map initialization for C++14

The _cpp_trigraph_map initialization used to be done for C99+ using
designated initializers, but can't be done that way for C++ because
the designated initializer support in C++ as array designators are just
an extension there and don't allow skipping anything nor going backwards.

But, we can get the same effect using C++14 constexpr constructor.
With the following patch we get rid of the runtime initialization
and the array can be in .rodata.

2024-10-07 Jakub Jelinek <jakub@redhat.com>

* internal.h (_cpp_trigraph_map_s): New type for C++14 or later.
(_cpp_trigraph_map_d): New variable for C++14 or later.
(_cpp_trigraph_map): Define to _cpp_trigraph_map_d.map for C++14 or
later.
* init.cc (init_trigraph_map): Define to nothing for C++14 or later.
(TRIGRAPH_MAP, END, s): Define differently for C++14 or later.

Implement MAXLOC and MINLOC for unsigned.

gcc/fortran/ChangeLog:

* check.cc (gfc_check_minloc_maxloc): Handle BT_UNSIGNED.
* trans-intrinsic.cc (gfc_conv_intrinsic_minmaxloc): Likewise.
* gfortran.texi: Document MAXLOC and MINLOC for UNSIGNED.

libgfortran/ChangeLog:

* Makefile.am: Add files for unsigned MINLOC and MAXLOC.
* Makefile.in: Regenerated.
* gfortran.map: Add files for unsigned MINLOC and MAXLOC.
* generated/maxloc0_16_m1.c: New file.
* generated/maxloc0_16_m16.c: New file.
* generated/maxloc0_16_m2.c: New file.
* generated/maxloc0_16_m4.c: New file.
* generated/maxloc0_16_m8.c: New file.
* generated/maxloc0_4_m1.c: New file.
* generated/maxloc0_4_m16.c: New file.
* generated/maxloc0_4_m2.c: New file.
* generated/maxloc0_4_m4.c: New file.
* generated/maxloc0_4_m8.c: New file.
* generated/maxloc0_8_m1.c: New file.
* generated/maxloc0_8_m16.c: New file.
* generated/maxloc0_8_m2.c: New file.
* generated/maxloc0_8_m4.c: New file.
* generated/maxloc0_8_m8.c: New file.
* generated/maxloc1_16_m1.c: New file.
* generated/maxloc1_16_m2.c: New file.
* generated/maxloc1_16_m4.c: New file.
* generated/maxloc1_16_m8.c: New file.
* generated/maxloc1_4_m1.c: New file.
* generated/maxloc1_4_m16.c: New file.
* generated/maxloc1_4_m2.c: New file.
* generated/maxloc1_4_m4.c: New file.
* generated/maxloc1_4_m8.c: New file.
* generated/maxloc1_8_m1.c: New file.
* generated/maxloc1_8_m16.c: New file.
* generated/maxloc1_8_m2.c: New file.
* generated/maxloc1_8_m4.c: New file.
* generated/maxloc1_8_m8.c: New file.
* generated/minloc0_16_m1.c: New file.
* generated/minloc0_16_m16.c: New file.
* generated/minloc0_16_m2.c: New file.
* generated/minloc0_16_m4.c: New file.
* generated/minloc0_16_m8.c: New file.
* generated/minloc0_4_m1.c: New file.
* generated/minloc0_4_m16.c: New file.
* generated/minloc0_4_m2.c: New file.
* generated/minloc0_4_m4.c: New file.
* generated/minloc0_4_m8.c: New file.
* generated/minloc0_8_m1.c: New file.
* generated/minloc0_8_m16.c: New file.
* generated/minloc0_8_m2.c: New file.
* generated/minloc0_8_m4.c: New file.
* generated/minloc0_8_m8.c: New file.
* generated/minloc1_16_m1.c: New file.
* generated/minloc1_16_m16.c: New file.
* generated/minloc1_16_m2.c: New file.
* generated/minloc1_16_m4.c: New file.
* generated/minloc1_16_m8.c: New file.
* generated/minloc1_4_m1.c: New file.
* generated/minloc1_4_m16.c: New file.
* generated/minloc1_4_m2.c: New file.
* generated/minloc1_4_m4.c: New file.
* generated/minloc1_4_m8.c: New file.
* generated/minloc1_8_m1.c: New file.
* generated/minloc1_8_m16.c: New file.
* generated/minloc1_8_m2.c: New file.
* generated/minloc1_8_m4.c: New file.
* generated/minloc1_8_m8.c: New file.

gcc/testsuite/ChangeLog:

* gfortran.dg/unsigned_35.f90: New test.

[RISC-V] Add splitters to restore condops generation after recent phiopt changes

V2:
  Fix typo in ChangeLog.
  Remove now extraneous comment in cset-sext.c.
  Throttle back branch cost to 1 in various tests

--

Andrew P's recent improvements to phiopt regressed on the riscv testsuite.

Essentially the new code presented to the RTL optimizers is straightline code rather than branchy for the CE pass to analyze and optimize.  In the absence of conditional move support or sfb, the new code would be better.

Unfortunately the presented form isn't a great fit for xventanacondops, zicond or xtheadcondmov.  The net is the resulting code is actually slightly worse than before.  Essentially sne+czero turned into sne+sne+and.

Thankfully, combine is presented with

(and (ne (op1) (const_int 0))
     (ne (op2) (const_int 0)))

As the RHS of a set.  We can use a 3->2 splitter to guide combine on how to profitably rewrite the sequence in a form suitable for condops.  Just splitting that would be enough to fix the regression, but I'm fairly confident that other cases need to be handled and would have regressed had the testsuite been more thorough.

One arm of the AND is going to turn into an sCC instruction.  We have a variety of those that we support.  The codes vary as do the allowed operands of the sCC.  That produces a set of new splitters to handle those cases.

The other arm is going to turn into a czero (or similar) instruction. That one can be generalized to eq/ne.  So another set for that generalization.

We can remove a couple of XFAILs in the rv32 space as it's behaving much more like rv64 at this point.

For SFB targets it's unclear if the new code is better or worse.  In both cases it's a 3 instruction sequence.   So I just adjusted the test.  If the new code is worse for SFB, someone with an understanding of the tradeoffs for an SFB target will need to make adjustments.

Tested in my tester on rv64gcv and rv32gc.  Will wait for the pre-commit testers to render their verdict before moving forward.

gcc/

* config/riscv/iterators.md (scc_0): New code iterator.
* config/riscv/zicond.md: New splitters to improve code generated for
cases like (and (scc) (scc)) for zicond, xventanacondops, xtheadcondmov.

gcc/testsuite/

* gcc.target/riscv/cset-sext-sfb.c: Turn off ssa-phiopt.
* gcc.target/riscv/cset-sext-thead.c: Do not check CE output anymore.
* gcc.target/riscv/cset-sext-ventana.c: Similarly.  Adjust branch cost.
* gcc.target/riscv/cset-sext-zicond.c: Similarly.
* gcc.target/riscv/cset-sext.c: Similarly.  No longer allow
"neg" in asm output.

c: ICE in build_counted_by_ref [PR116735]

When handling the counted_by attribute, if the corresponding field
doesn't exit, in additiion to issue error, we should also remove
the already added non-existing "counted_by" attribute from the
field_decl.

PR c/116735

gcc/c/ChangeLog:

* c-decl.cc (verify_counted_by_attribute): Remove the attribute
when error.

gcc/testsuite/ChangeLog:

* gcc.dg/flex-array-counted-by-9.c: New test.

c++: -Wmismatched-tags and modules

In Wmismatched-tags-6.C, we try to compare two declarations of the Cp alias
template, and ICE trying to check whether they're in module purview. We
need to check DECL_LANG_SPECIFIC like elsewhere in the compiler.

gcc/cp/ChangeLog:

* decl.cc (duplicate_decls): Only check PURVIEW_P if
DECL_LANG_SPECIFIC.

c++: require_deduced_type and modules

With modules more variables have DECL_LANG_SPECIFIC, so we were failing to
call require_deduced_type in constexpr-if30.C.

gcc/cp/ChangeLog:

* decl2.cc (mark_used): Always check require_deduced_type.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/auto43.C: Adjust diagnostic.
* g++.dg/cpp2a/lambda-generic7.C: Likewise.

c++: modules don't require preprocessor output

init_modules has rejected -M -fmodules-ts on the premise that module
dependency analysis requires macro expansion, but this is no longer
accurate; P1857 prohibited module directives produced by macro expansion.
They can still be dependent on #if directives, but those are still handled
with -fdirectives-only.

What wasn't working was -M or -dM, because cpp_scan_nooutput never called
module_token_pre to implement the import. The simplest fix is to use the
-fdirectives-only scan when modules are enabled and teach directives_only_cb
about flag_no_output.

gcc/cp/ChangeLog:

* module.cc (init_modules): Don't warn about -M.

gcc/c-family/ChangeLog:

* c-ppoutput.cc (preprocess_file): For modules,
use directives-only scan even with flag_no_output.
(directives_only_cb): Respect flag_no_output.

gcc/ChangeLog:

* doc/invoke.texi (C++ Module Preprocessing): Allow -M,
refer to -fdeps.

gcc/testsuite/ChangeLog:

* g++.dg/modules/macro-8_a.H: New test.
* g++.dg/modules/macro-8_b.C: New test.
* g++.dg/modules/macro-8_c.C: New test.
* g++.dg/modules/macro-8_d.C: New test.

arm: fix bootstrap issue with arm_noce_conversion_profitable_p patch [NFC]

This obvious patch fixes two warnings introduced with the implementation of
arm_noce_conversion_profitable_p hook.

gcc/ChangeLog:

* config/arm/arm.cc (arm_noce_oncersion_profitable_p): Remove unused
argument name.
(arm_is_v81m_cond_insn): Initialize variable.

gcc: Remove executable permissions of testcases and *.md files

I've noticed some files were marked as executable, as can be
seen with
find . $ -name \*.[chSC] -o -name \*.md -o -name \*.cc $ -a -perm /111 | xargs ls -l

This commit fixes that.

2024-10-07 Jakub Jelinek <jakub@redhat.com>

gcc/
* config/riscv/vector-crypto.md: Remove executable permissions.
gcc/testsuite/
* gcc.target/aarch64/uxtl-combine-1.c: Remove executable permissions.
* gcc.target/aarch64/uxtl-combine-2.c: Likewise.
* gcc.target/aarch64/uxtl-combine-3.c: Likewise.
* gcc.target/aarch64/uxtl-combine-4.c: Likewise.
* gcc.target/aarch64/uxtl-combine-5.c: Likewise.
* gcc.target/aarch64/uxtl-combine-6.c: Likewise.
* gcc.target/gcn/complex.c: Likewise.
* gcc.target/i386/avx2-bf16-vec-absneg.c: Likewise.
* gcc.target/i386/avx512f-bf16-vec-absneg.c: Likewise.
* gcc.target/i386/pr104371-2.c: Likewise.
* gcc.target/i386/pr115146.c: Likewise.
* gcc.target/i386/vpermt2-special-bf16-shufflue.c: Likewise.
* g++.target/i386/pr107563-a.C: Likewise.
* g++.target/i386/pr107563-b.C: Likewise.

middle-end: reorder masking priority of math functions

Given the categorization of math built-in functions as `ECF_CONST',
when if-converting their uses, their calls are not masked and are thus
called with an all-true predicate.

This, however, is not appropriate where built-ins have library
equivalents, wherein they may exhibit highly architecture-specific
behaviors. For example, vectorized implementations may delegate the
computation of values outside a certain acceptable numerical range to
special (non-vectorized) routines which considerably slow down
computation.

As numerical simulation programs often do bounds check on input values
prior to math calls, conditionally assigning default output values for
out-of-bounds input and skipping the math call altogether, these
fallback implementations should seldom be called in the execution of
vectorized code. If, however, we don't apply any masking to these
math functions, we end up effectively executing both if and else
branches for these values, leading to considerable performance
degradation on scientific workloads.

We therefore invert the order of handling of math function calls in
`if_convertible_stmt_p' to prioritize the handling of their
library-provided implementations over the equivalent internal function.

gcc/ChangeLog:

* tree-if-conv.cc (if_convertible_stmt_p): Check for explicit
function declaration before IFN fallback.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/vect-fncall-mask-math.c: New.

vect: Add more dump messages for VLA SLP permutation [PR116583]

Taking the !repeating_p route for VLA vectors causes analysis
to fail, but it wasn't clear from the dump files when this
had happened, and which node caused it.

gcc/
PR tree-optimization/116583
* tree-vect-slp.cc (vectorizable_slp_permutation_1): Add more
dump messages.

vect: Support more VLA SLP permutations [PR116583]

This is the main patch for PR116583.  Previously, we only
supported VLA SLP permutations for which the output and inputs
have the same number of lanes, and for which that number of
lanes divides the number of vector elements.

The patch extends this to handle:

(1) "packs" of a single 2N-vector input into an N-vector output
(2) "unpacks" of N-vector inputs into an XN-vector output

Hopefully the comments in the code explain the approach.

The contents of the:

  for (unsigned i = 0; i < ncopies; ++i)

loop do not change; the patch simply adds an outer loop around it.

The patch removes the XFAIL in slp-13.c and also improves
the SVE vect.exp results with vect-force-slp=1.  I haven't
added new tests specifically for this, since presumably the
existing ones will cover it once the SLP switch is flipped.

gcc/
PR tree-optimization/116583
* tree-vect-slp.cc (vectorizable_slp_permutation_1): Handle
variable-length pack and unpack permutations.

gcc/testsuite/
PR tree-optimization/116583
* gcc.dg/vect/slp-13.c: Remove xfail for vect_variable_length.
* gcc.dg/vect/slp-13-big-array.c: Likewise.

vect: Restructure repeating_p case for SLP permutations [PR116583]

The repeating_p case previously handled the specific situation
in which the inputs have N lanes and the output has N lanes,
where N divides the number of vector elements.  In that case,
every output uses the same permute vector.

The code was therefore structured so that the outer loop only
constructed one permute vector, with an inner loop generating
as many VEC_PERM_EXPRs from it as required.

However, the main patch for PR116583 adds support for cycling
through N permute vectors, rather than just having one.
The current structure doesn't really handle that case well.
(We'd need to interleave the results after generating them,
which sounds a bit fragile.)

This patch instead makes the transform phase calculate each output
vector's permutation explicitly, like for the !repeating_p path.
As a bonus, it gets rid of one use of SLP_TREE_NUMBER_OF_VEC_STMTS.

This arguably undermines one of the justifications for using repeating_p
for constant-length vectors: that the repeating_p path involved less
work than the !repeating_p path.  That justification does still hold for
the analysis phase, though, and that should be the more time-sensitive
part.  And the other justification -- to get more coverage of the code --
still applies.  So I'd prefer that we continue to use repeating_p for
constant-length vectors unless that causes a known missed optimisation.

gcc/
PR tree-optimization/116583
* tree-vect-slp.cc (vectorizable_slp_permutation_1): Remove
the noutputs_per_mask inner loop and instead generate a
separate permute vector for each output.

vect: Variable lane indices in vectorizable_slp_permutation_1 [PR116583]

The main patch for PR116583 needs to create variable indices into
an input vector. This pre-patch changes the types to allow that.

There is no pretty-print format for poly_uint64 because of issues
with passing C++ objects through "...".

gcc/
PR tree-optimization/116583
* tree-vect-slp.cc (vectorizable_slp_permutation_1): Using
poly_uint64 for scalar lane indices.

aarch64: Fix general permutes of svbfloat16_ts

Testing gcc.target/aarch64/sve/permute_2.c without the associated GCC
patch triggered an unrecognisable insn ICE for the svbfloat16_t tests.
This was because the implementation of general two-vector permutes
requires two TBLs and an ORR, with the ORR being represented as an
unspec for floating-point modes. The associated pattern did not
cover VNx8BF.

gcc/
* config/aarch64/iterators.md (SVE_I): Move further up file.
(SVE_F): New mode iterator.
(SVE_ALL): Redefine in terms of SVE_I and SVE_F.
* config/aarch64/aarch64-sve.md (*<LOGICALF:optab><mode>3): Extend
to all SVE_F.

gcc/testsuite/
* gcc.target/aarch64/sve/permute_5.c: New test.

aarch64: Handle SVE modes in aarch64_evpc_reencode [PR116583]

For Advanced SIMD modes, aarch64_evpc_reencode tests whether
a permute in a narrow element mode can be done more cheaply
in a wider mode. For example, { 0, 1, 8, 9, 4, 5, 12, 13 }
on V8HI is a natural TRN1 on V4SI ({ 0, 4, 2, 6 }).

This patch extends the code to handle SVE data and predicate
modes as well. This is a prerequisite to getting good results
for PR116583.

gcc/
PR target/116583
* config/aarch64/aarch64.cc (aarch64_coalesce_units): New function,
extending the Advanced SIMD handling from...
(aarch64_evpc_reencode): ...here to SVE data and predicate modes.

gcc/testsuite/
PR target/116583
* gcc.target/aarch64/sve/permute_1.c: New test.
* gcc.target/aarch64/sve/permute_2.c: Likewise.
* gcc.target/aarch64/sve/permute_3.c: Likewise.
* gcc.target/aarch64/sve/permute_4.c: Likewise.

testsuite: Unset torture_current_flags after use

Before running a test with specific torture options, gcc-dg-runtest
sets the global variable torture_current_flags to the set of torture
options that will be used. However, it never unset the variable
afterwards, which meant that the last options would hang around
and potentially confuse later non-torture tests.

I saw this with a follow-on patch to check-function-bodies, but it's
probably possible to construct aritificial test combinations that
expose it with check-function-bodies's existing flag filtering.

gcc/testsuite/
* lib/gcc-dg.exp (gcc-dg-runtest): Unset torture_current_flags
after each test.

tree-optimization/116990 - missed control flow check in vect_analyze_loop_form

The following fixes checking for unsupported control flow in
vectorization to also cover the outer loop body.

PR tree-optimization/116990
* tree-vect-loop.cc (vect_analyze_loop_form): Check the current
loop body for control flow.

tree-optimization/116982 - analyze scalar loop exit early

The following makes sure to discover the scalar loop IV exit during
analysis as failure to do so (if DCE and friends are disabled this
can happen due to if-conversion doing DCE and FRE on the if-converted
loop) would ICE later.

I refrained from larger refactoring to be able to eventually backport.

PR tree-optimization/116982
* tree-vectorizer.h (vect_analyze_loop): Pass in .LOOP_VECTORIZED
call.
(vect_analyze_loop_form): Likewise.
* tree-vect-loop.cc (vect_analyze_loop_form): Reject loops where we
cannot determine a IV exit for the scalar loop.
(vect_analyze_loop): Adjust.
* tree-vectorizer.cc (try_vectorize_loop_1): Likewise.
* tree-parloops.cc (gather_scalar_reductions): Likewise.

testsuite: Prevent unrolling of main in LTO test [PR116683]

In r15-3585-g9759f6299d9633cabac540e5c893341c708093ac I added a test which
started failing on PowerPC.  The test checks that we unroll exactly one loop
three times with the following:

// { dg-final { scan-ltrans-rtl-dump-times "Unrolled loop 3 times" 1 "loop2_unroll" } }

which passes on most targets.  However, on PowerPC, the loop in main
gets unrolled too, causing the scan-ltrans-rtl-dump-times check to fail
as the statement now appears twice in the dump.  I think the extra
unrolling is due to different unrolling heuristics in the rs6000 port.

This patch therefore explicitly tries to block the unrolling in main with an
appropriate #pragma.

gcc/testsuite/ChangeLog:

PR testsuite/116683
* g++.dg/ext/pragma-unroll-lambda-lto.C (main): Add #pragma to
prevent unrolling of the setup loop.

ssa-math-opts, i386: Improve spaceship expansion [PR116896]

The PR notes that we don't emit optimal code for C++ spaceship
operator if the result is returned as an integer rather than the
result just being compared against different values and different
code executed based on that.
So e.g. for
template <typename T>
auto foo (T x, T y) { return x <=> y; }
for both floating point types, signed integer types and unsigned integer
types.  auto in that case is std::strong_ordering or std::partial_ordering,
which are fancy C++ abstractions around struct with signed char member
which is -1, 0, 1 for the strong ordering and -1, 0, 1, 2 for the partial
ordering (but for -ffast-math 2 is never the case).
I'm afraid functions like that are fairly common and unless they are
inlined, we really need to map the comparison to those -1, 0, 1 or
-1, 0, 1, 2 values.

Now, for floating point spaceship I've in the past already added an
optimization (with tree-ssa-math-opts.cc discovery and named optab, the
optab only defined on x86 though right now), which ensures there is just
a single comparison instruction and then just tests based on flags.
Now, if we have code like:
  auto a = x <=> y;
  if (a == std::partial_ordering::less)
    bar ();
  else if (a == std::partial_ordering::greater)
    baz ();
  else if (a == std::partial_ordering::equivalent)
    qux ();
  else if (a == std::partial_ordering::unordered)
    corge ();
etc., that results in decent code generation, the spaceship named pattern
on x86 optimizes for the jumps, so emits comparisons on the flags, followed
by setting the result to -1, 0, 1, 2 and subsequent jump pass optimizes that
well.  But if the result needs to be stored into an integer and just
returned that way or there are no immediate jumps based on it (or turned
into some non-standard integer values like -42, 0, 36, 75 etc.), then CE
doesn't do a good job for that, we end up with say
        comiss  %xmm1, %xmm0
        jp      .L4
        seta    %al
        movl    $0, %edx
        leal    -1(%rax,%rax), %eax
        cmove   %edx, %eax
        ret
.L4:
        movl    $2, %eax
        ret
The jp is good, that is the unlikely case and can't be easily handled in
straight line code due to the layout of the flags, but the rest uses cmov
which often isn't a win and a weird math.
With the patch below we can get instead
        xorl    %eax, %eax
        comiss  %xmm1, %xmm0
        jp      .L2
        seta    %al
        sbbl    $0, %eax
        ret
.L2:
        movl    $2, %eax
        ret

The patch changes the discovery in the generic code, by detecting if
the future .SPACESHIP result is just used in a PHI with -1, 0, 1 or
-1, 0, 1, 2 values (the latter for HONOR_NANS) and passes that as a flag in
a new argument to .SPACESHIP ifn, so that the named pattern is told whether
it should optimize for branches or for loading the result into a -1, 0, 1
(, 2) integer.  Additionally, it doesn't detect just floating point <=>
anymore, but also integer and unsigned integer, but in those cases only
if an integer -1, 0, 1 is wanted (otherwise == and > or similar comparisons
result in good code).
The backend then can for those integer or unsigned integer <=>s return
effectively (x > y) - (x < y) in a way that is efficient on the target
(so for x86 with ensuring zero initialization first when needed before
setcc; one for floating point and unsigned, where there is just one setcc
and the second one optimized into sbb instruction, two for the signed int
case).  So e.g. for signed int we now emit
        xorl    %edx, %edx
        xorl    %eax, %eax
        cmpl    %esi, %edi
        setl    %dl
        setg    %al
        subl    %edx, %eax
        ret
and for unsigned
        xorl    %eax, %eax
        cmpl    %esi, %edi
        seta    %al
        sbbb    $0, %al
        ret

Note, I wonder if other targets wouldn't benefit from defining the
named optab too...

2024-10-07  Jakub Jelinek  <jakub@redhat.com>

PR middle-end/116896
* optabs.def (spaceship_optab): Use spaceship$a4 rather than
spaceship$a3.
* internal-fn.cc (expand_SPACESHIP): Expect 3 call arguments
rather than 2, expand the last one, expect 4 operands of
spaceship_optab.
* tree-ssa-math-opts.cc: Include cfghooks.h.
(optimize_spaceship): Check if a single PHI is initialized to
-1, 0, 1, 2 or -1, 0, 1 values, in that case pass 1 as last (new)
argument to .SPACESHIP and optimize away the comparisons,
otherwise pass 0.  Also check for integer comparisons rather than
floating point, in that case do it only if there is a single PHI
with -1, 0, 1 values and pass 1 to last argument of .SPACESHIP
if the <=> is signed, 2 if unsigned.
* config/i386/i386-protos.h (ix86_expand_fp_spaceship): Add
another rtx argument.
(ix86_expand_int_spaceship): Declare.
* config/i386/i386-expand.cc (ix86_expand_fp_spaceship): Add
arg3 argument, if it is const0_rtx, expand like before, otherwise
emit optimized sequence for setting the result into a GPR.
(ix86_expand_int_spaceship): New function.
* config/i386/i386.md (UNSPEC_SETCC_SI_SLP): New UNSPEC code.
(setcc_si_slp): New define_expand.
(*setcc_si_slp): New define_insn_and_split.
(setcc + setcc + movzbl): New define_peephole2.
(spaceship<mode>3): Renamed to ...
(spaceship<mode>4): ... this.  Add an extra operand, pass it
to ix86_expand_fp_spaceship.
(spaceshipxf3): Renamed to ...
(spaceshipxf4): ... this.  Add an extra operand, pass it
to ix86_expand_fp_spaceship.
(spaceship<mode>4): New define_expand for SWI modes.
* doc/md.texi (spaceship@var{m}3): Renamed to ...
(spaceship@var{m}4): ... this.  Document the meaning of last
operand.

* g++.target/i386/pr116896-1.C: New test.
* g++.target/i386/pr116896-2.C: New test.

OpenMP: Allocate directive for static vars, clean up

For the 'allocate' directive, remove the sorry for static variables and
just keep using normal memory, but honor the requested alignment and set
a DECL_ATTRIBUTE in case a target may want to make use of this later on.
The documentation is updated accordingly.

The C diagnostic to check for predefined allocators (req. for static vars)
failed to accept GCC's ompx_gnu_... allocator, now fixed. (Fortran was
already okay; but both now use new common #defined value for checking.)
And while Fortran common block variables are still rejected, the check
has been improved as before the sorry diagnostic did not work for
common blocks in modules.

Finally, for 'allocate' clause on the target/task/taskloop directives,
there is now a warning for omp_thread_mem_alloc (i.e. predefined allocator
with access = thread), which is undefined behavior according to the
OpenMP specification.

And, last, testing showed that var decl + static_assert sets TREE_USED
but does not produce a statement list in C, which did run into an assert
in gimplify. This special case is now also handled.

gcc/c/ChangeLog:

* c-parser.cc (c_parser_omp_allocate): Set alignment for alignof;
accept static variables and fix predef allocator check.

gcc/fortran/ChangeLog:

* openmp.cc (is_predefined_allocator): Use gomp-constants.h consts.
* trans-common.cc (translate_common): Reject OpenMP allocate directives.
* trans-decl.cc (gfc_finish_var_decl): Handle allocate directive
for static variables.
(gfc_trans_deferred_vars): Update for the latter.

gcc/ChangeLog:

* gimplify.cc (gimplify_bind_expr): Fix corner case for OpenMP
allocate directive.
(gimplify_scan_omp_clauses): Warn if omp_thread_mem_alloc is used
as allocator with the target/task/taskloop directive.

include/ChangeLog:

* gomp-constants.h (GOMP_OMP_PREDEF_ALLOC_MAX,
GOMP_OMPX_PREDEF_ALLOC_MIN, GOMP_OMPX_PREDEF_ALLOC_MAX,
GOMP_OMP_PREDEF_ALLOC_THREADS): New defines.

libgomp/ChangeLog:

* allocator.c: Add static asserts for news
GOMP_OMP{,X}_PREDEF_ALLOC_{MIN,MAX} range values.
* libgomp.texi (OpenMP Impl. Status): Allocate directive for
static vars is now supported. Refer to PR for allocate clause.
(Memory allocation): Update for static vars; minor word tweaking.

gcc/testsuite/ChangeLog:

* c-c++-common/gomp/allocate-9.c: Update for removed sorry.
* gfortran.dg/gomp/allocate-15.f90: Likewise.
* gfortran.dg/gomp/allocate-pinned-1.f90: Likewise.
* gfortran.dg/gomp/allocate-4.f90: Likewise; add dg-error for
previously missing diagnostic.
* c-c++-common/gomp/allocate-18.c: New test.
* c-c++-common/gomp/allocate-19.c: New test.
* gfortran.dg/gomp/allocate-clause.f90: New test.
* gfortran.dg/gomp/allocate-static-2.f90: New test.
* gfortran.dg/gomp/allocate-static.f90: New test.

Handle non-grouped stores as single-lane SLP: adjust 'gcc.dg/vect/slp-26.c', GCN

As of commit d34cda720988674bcf8a24267c9e1ec61335d6de
"Handle non-grouped stores as single-lane SLP", we see for
'--target=amdgcn-amdhsa' (tested '-march=gfx908', '-march=gfx1100'):

    PASS: gcc.dg/vect/slp-26.c (test for excess errors)
    PASS: gcc.dg/vect/slp-26.c execution test
    PASS: gcc.dg/vect/slp-26.c scan-tree-dump-times vect "vectorized 1 loops" 1
    [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-26.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1

    gcc.dg/vect/slp-26.c: pattern found 2 times

Apply the same change to 'amdgcn-*-*' as done for 'riscv_v'.

gcc/testsuite/
* gcc.dg/vect/slp-26.c: Adjust GCN.

nvptx: Re-enable 'gcc.misc-tests/options.exp'

..., just conditionalize its profiling test (as done elsewhere).  The
re-enabled test cases all PASS.

For the record, for example for GCN target, this causes:

     Running [...]/gcc/testsuite/gcc.misc-tests/options.exp ...
    -PASS: compiler driver --coverage option(s)
     PASS: compiler driver -fdump-ipa-all-address option(s)
     PASS: compiler driver -fdump-ipa-all-alias option(s)
     PASS: compiler driver -fdump-ipa-all-all option(s)

That was:

    Running [...]/gcc/testsuite/gcc.misc-tests/options.exp ...
    Executing on host: [xgcc] [...] --coverage [...]
    [...]
    ld: error: undefined symbol: __gcov_exit
    >>> referenced by /tmp/ccRGdqjA.o:(_sub_D_00100_1)
    >>> referenced by /tmp/ccRGdqjA.o:(_sub_D_00100_1)
    collect2: error: ld returned 1 exit status
    compiler exited with status 1
    output is:
    [...]
    PASS: compiler driver --coverage option(s)

..., so that's nothing to worry about.

gcc/testsuite/
* gcc.misc-tests/options.exp: Re-enable for nvptx.

nvptx: Re-enable all variants of 'c-c++-common/torture/complex-sign-mixed-add.c', 'c-c++-common/torture/complex-sign-mixed-sub.c'

PASS with:

    $ ptxas --version
    ptxas: NVIDIA (R) Ptx optimizing assembler
    Copyright (c) 2005-2018 NVIDIA Corporation
    Built on Sun_Sep__9_21:06:46_CDT_2018
    Cuda compilation tools, release 10.0, V10.0.145

..., and execution with 'Driver Version: 361.93.02'.

gcc/testsuite/
* c-c++-common/torture/complex-sign-mixed-add.c: Re-enable all
variants for nvptx.
* c-c++-common/torture/complex-sign-mixed-sub.c: Likewise.

nvptx: Re-enable 'gcc.dg/special/weak-2.c'

PASSes with:

    $ ptxas --version
    ptxas: NVIDIA (R) Ptx optimizing assembler
    Copyright (c) 2005-2018 NVIDIA Corporation
    Built on Sun_Sep__9_21:06:46_CDT_2018
    Cuda compilation tools, release 10.0, V10.0.145

..., and execution with 'Driver Version: 361.93.02'.

gcc/testsuite/
* gcc.dg/special/weak-2.c: Re-enable for nvptx.

nvptx: Re-enable all variants of 'gcc.c-torture/execute/20020529-1.c'

Generally PASSes with:

    $ ptxas --version
    ptxas: NVIDIA (R) Ptx optimizing assembler
    Copyright (c) 2005-2018 NVIDIA Corporation
    Built on Sun_Sep__9_21:06:46_CDT_2018
    Cuda compilation tools, release 10.0, V10.0.145

..., and execution with 'Driver Version: 361.93.02'.

Only the '-O1' execution test FAILs (pre-existing; to be analyzed later):

    nvptx-run: error getting kernel result: an illegal memory access was encountered (CUDA_ERROR_ILLEGAL_ADDRESS, 700)

gcc/testsuite/
* gcc.c-torture/execute/20020529-1.c: Re-enable all variants for
nvptx.

nvptx: Disable effective-target 'freestanding'

After 2014's commit 157e859ffe3b5d43db1e19475711c1a3d21ab57a "remove picochip",
the effective-target 'freestanding' (later) was only ever used for nvptx.
However, the relevant I/O library functions have long been implemented in nvptx
newlib.

These test cases generally PASS, just a few need to get XFAILed; see
<https://docs.nvidia.com/cuda/ptx-writers-guide-to-interoperability/#system-calls>,
and then supposedly
<https://docs.nvidia.com/cuda/cuda-c-programming-guide/#formatted-output> for
description of the non-standard PTX 'vprintf' return value:

> Unlike the C-standard 'printf()', which returns the number of characters
> printed, CUDA's 'printf()' returns the number of arguments parsed. If no
> arguments follow the format string, 0 is returned. If the format string is
> NULL, -1 is returned. If an internal error occurs, -2 is returned.

(I've tried a few variants to confirm that PTX 'vprintf' -- which supposedly is
underlying the CUDA 'printf' -- is what's implementing this behavior.)
Probably, we ought to fix that up in nvptx newlib.

gcc/testsuite/
* gcc.c-torture/execute/printf-1.c: XFAIL for nvptx.
* gcc.c-torture/execute/printf-chk-1.c: Likewise.
* gcc.c-torture/execute/vprintf-1.c: Likewise.
* gcc.c-torture/execute/vprintf-chk-1.c: Likewise.
* lib/target-supports.exp (check_effective_target_freestanding):
Disable for nvptx.

nvptx: Re-enable "ptxas times out" test cases

These are all quick to compile and generally PASS with:

    $ ptxas --version
    ptxas: NVIDIA (R) Ptx optimizing assembler
    Copyright (c) 2005-2018 NVIDIA Corporation
    Built on Sun_Sep__9_21:06:46_CDT_2018
    Cuda compilation tools, release 10.0, V10.0.145

Only 'gcc.c-torture/compile/limits-fndefn.c' at '-O0' still has an issue, as
indicated.  Working around that with '-Wa,--no-verify', for now.

gcc/testsuite/
* gcc.c-torture/compile/920501-4.c: Re-enable nvptx
"ptxas times out" variants.
* gcc.c-torture/compile/921011-1.c: Likewise.
* gcc.c-torture/compile/pr34334.c: Likewise.
* gcc.c-torture/compile/pr37056.c: Likewise.
* gcc.c-torture/compile/pr39423-1.c: Likewise.
* gcc.c-torture/compile/pr49049.c: Likewise.
* gcc.c-torture/compile/pr59417.c: Likewise.
* gcc.c-torture/compile/limits-fndefn.c: Likewise.
Specify '-Wa,--no-verify' for nvptx '-O0'.

nvptx: Re-enable 'gcc.c-torture/compile/20080721-1.c'

PASSes with:

    $ ptxas --version
    ptxas: NVIDIA (R) Ptx optimizing assembler
    Copyright (c) 2005-2018 NVIDIA Corporation
    Built on Sun_Sep__9_21:06:46_CDT_2018
    Cuda compilation tools, release 10.0, V10.0.145

gcc/testsuite/
* gcc.c-torture/compile/20080721-1.c: Re-enable for nvptx.

Daily bump.

testsuite: Require lto in three tests

2024-10-06 John David Anglin <danglin@gcc.gnu.org>

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/noexcept87.C: Require lto.
* g++.dg/ext/pragma-unroll-lambda-lto.C: Likewise.
* gcc.dg/enum-alias-3.c: Likewise.

hppa: Use stack slot SP-40 to copy between integer and floating-point registers

2024-10-06 John David Anglin <danglin@gcc.gnu.org>

gcc/ChangeLog:

* config/pa/pa-64.h (PA_SECONDARY_MEMORY_NEEDED): Define
to false. Update comment.
* config/pa/pa.md: Modify 64-bit move patterns to support
copying between integer and floating-point registers using
stack slot SP-40.

Add single-lane SLP support to .GOMP_SIMD_LANE vectorization

The following adds basic support for single-lane SLP .GOMP_SIMD_LANE
vectorization, in particular it enables SLP discovery.

* tree-vect-slp.cc (no_arg_map): New.
(vect_get_operand_map): Handle IFN_GOMP_SIMD_LANE.
(vect_build_slp_tree_1): Likewise.
* tree-vect-stmts.cc (vectorizable_call): Handle single-lane SLP
for .GOMP_SIMD_LANE calls.

doc: Focus on DWARF for FreeBSD

gcc:
PR target/69374
* doc/install.texi (Specific) <*-*-freebsd*>: Focus on DWARF
only.

Daily bump.

hppa: Don't clobber frame_pointer_rtx in expanders

Noticed testing LRA. Clobbers cause internal compiler errors.

2024-10-05 John David Anglin <danglin@gcc.gnu.org>

gcc/ChangeLog:

* config/pa/pa.md (nonlocal_goto): Don't clobber
frame_pointer_rtx.
(builtin_longjmp): Likewise.

hppa: Fix indirect_goto constraint

Noticed testing LRA.

2024-10-05 John David Anglin <danglin@gcc.gnu.org>

gcc/ChangeLog:

* config/pa/pa.md: Fix indirect_got constraint.

libstdc++: add std::is_virtual_base_of

Added by P2985R0 for C++26. This simply exposes the compiler
builtin, and adds the feature-testing macro.

libstdc++-v3/ChangeLog:

* include/bits/version.def: Added the feature-testing macro.
* include/bits/version.h: Regenerated.
* include/std/type_traits: Add support for
std::is_virtual_base_of and std::is_virtual_base_of_v,
implemented in terms of the compiler builtin.
* testsuite/20_util/is_virtual_base_of/value.cc: New test.

Signed-off-by: Giuseppe D'Angelo <giuseppe.dangelo@kdab.com>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++: Implement LWG 3664 changes to ranges::distance

libstdc++-v3/ChangeLog:

* include/bits/ranges_base.h (__distance_fn::operator()):
Adjust iterator/sentinel overloads as per LWG 3664.
* testsuite/24_iterators/range_operations/distance.cc:
Test LWG 3664 example.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

Fix various issues of -ftrivial-auto-var-init=zero with Ada

This polishes a few rough edges that prevent -ftrivial-auto-var-init=zero
from working in Ada:

  - build_common_builtin_nodes declares BUILT_IN_CLEAR_PADDING with 3
  instead 2 parameters, now gimple_fold_builtin_clear_padding contains
  the assertion:

    gcc_assert (gimple_call_num_args (stmt) == 2)

  This causes gimple_builtin_call_types_compatible_p to always return false
  in Ada (this works in C/C++ because another declaration is used).

  - gimple_add_init_for_auto_var uses EXPR_LOCATION to fetch the location
  of a DECL node, which always returns UNKNOWN_LOCATION.

  - the machinery attempts to initialize Out parameters.

gcc/
PR middle-end/116933
* gimplify.cc (gimple_add_init_for_auto_var): Use the correct macro
to fetch the source location of the variable.
* tree.cc (common_builtin_nodes): Remove the 3rd parameter in the
type of BUILT_IN_CLEAR_PADDING.

gcc/ada/
PR middle-end/116933
* gcc-interface/decl.cc (gnat_to_gnu_entity) <E_Out_Parameter>: Add
the "uninitialized" attribute on Out parameters.
* gcc-interface/utils.cc (gnat_internal_attributes): Add entry for
the "uninitialized" attribute.
(handle_uninitialized_attribute): New function.

gcc/testsuite/
* gnat.dg/auto_var_init.adb: New test.

Improve load permutation lowering

The following makes sure the emitted even/odd extraction scheme
follows one that ends up with actual trivial even/odd extract permutes.
When we choose a level 2 extract we generate { 0, 1, 4, 5, ... }
which for example the x86 backend doesn't recognize with just SSE
and QImode elements. So this now follows what the non-SLP interleaving
code would do which is element granular even/odd extracts.

This resolves gcc.dg/vect/vect-strided[-a]-u8-i8-gap*.c FAILs with
--param vect-force-slp=1 on x86_64.

* tree-vect-slp.cc (vect_lower_load_permutations): Prefer
level 1 even/odd extracts.

Daily bump.

MAINTAINERS: Add myself to write after approval

ChangeLog:
* MAINTAINERS: Add myself to write after approval.

diagnostics: bulletproof opening of SARIF output [PR116978]

Introduce a new RAII class diagnostic_output_file to track ownership
of the FILE * for SARIF output.

In particular, the .sarif file is now opened immediately, rather
than at the end of the compile, and so will fail earlier if the
file can't be opened.

Doing so fixes a couple of ICEs in -fdiagnostics-format=sarif-file when
invoking, say, cc1 directly, rather than from the driver.

gcc/ChangeLog:
PR other/116978
* diagnostic-format-sarif.cc (sarif_builder::sarif_builder):
Gracefully handle "main_input_filename_" being NULL.
(sarif_output_format::sarif_output_format): Replace param
"base_file_name" with "output_file" and assert that the file
was opened successfully and has a non-NULL filename.
(sarif_output_format::~sarif_file_output_format): Move
responsibility for building the filename and opening the file from
here to the creator of the instance.
(sarif_output_format::m_base_file_name): Replace with...
(sarif_output_format::m_output_file): ...this.
(diagnostic_output_format_init_sarif_file): Make "line_maps" param
non-const. Gracefully handle "base_file_name" being NULL.
Construct the filename and open the file here, rather than in
~sarif_file_output_format, and handle failures immediately here,
rather than at the end of the compile.
* diagnostic-format-sarif.h: Include "diagnostic-output-file.h".
(diagnostic_output_format_init_sarif_file): Make "line_maps" param
non-const.
* diagnostic-output-file.h: New file.
* diagnostic.cc (diagnostic_context::emit_diagnostic): New.
(diagnostic_context::emit_diagnostic_va): New.
* diagnostic.h (diagnostic_context::emit_diagnostic): New decl.
(diagnostic_context::emit_diagnostic_va): New decl.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

x86: Disable stack protector for naked functions

Since naked functions should not enable stack protector, define
TARGET_STACK_PROTECT_RUNTIME_ENABLED_P to disable stack protector
for naked functions.

gcc/

PR target/116962
* config/i386/i386.cc (ix86_stack_protect_runtime_enabled_p): New
function.
(TARGET_STACK_PROTECT_RUNTIME_ENABLED_P): New.

gcc/testsuite/

PR target/116962
* gcc.target/i386/pr116962.c: New file.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>

aarch64: Fix bug with max/min (PR116934)

In ac4cdf5cb43c0b09e81760e2a1902ceebcf1a135, I introduced a bug where
I put the new unspecs, UNSPEC_COND_SMAX and UNSPEC_COND_SMIN, into the
wrong iterator.

I should have put new unspecs in SVE_COND_FP_MAXMIN but I put it in
SVE_COND_FP_BINARY_REG instead. That was incorrect because the
SVE_COND_FP_MAXMIN iterator is being used for predicated floating-point
maximum/minimum, not SVE_COND_FP_BINARY_REG.

Also added a testcase to validate the new change.

Regression tested on aarch64-unknown-linux-gnu and found no regressions.
There are some test cases with "libitm" in their directory names which
appear in compare_tests output as changed tests but it looks like they
are in the output just because of changed build directories, like from
build-patched/aarch64-unknown-linux-gnu/./libitm/* to
build-pristine/aarch64-unknown-linux-gnu/./libitm/*. I didn't think it
was a cause of concern and have pushed this for review.

gcc/ChangeLog:

PR target/116934
* config/aarch64/iterators.md: Move UNSPEC_COND_SMAX and
UNSPEC_COND_SMIN to correct iterators.

gcc/testsuite/ChangeLog:

PR target/116934
* gcc.target/aarch64/sve2/pr116934.c: New test.

AVR: target/116953 - ICE due to operands clobber in avr_out_sbxx_branch.

PR target/116953
gcc/
* config/avr/avr.cc (avr_out_sbxx_branch): Work on a copy of
the operands rather than on operands itself, which is just
recog_data.operand and may be clobbered by jump_over_one_insn_p.
gcc/testsuite/
* gcc.target/avr/torture/pr116953.c: New test.

testsuite - Some float64 and float32x test require double64plus.

Some of the float64 and float32x test cases are using double built-ins
and hence require double64plus resp. that double is at least as good
as float32x (double_float32xplus).

gcc/testsuite/
* lib/target-supports.exp (check_effective_target_double_float32xplus):
New proc.
* gcc.dg/torture/float32x-builtin.c: Add
dg-require-effective-target double_float32xplus.
* gcc.dg/torture/float32x-tg-2.c: Same.
* gcc.dg/torture/float32x-tg.c: Same.
* gcc.dg/torture/float64-builtin.c: Add
dg-require-effective-target double64plus.
* gcc.dg/torture/float64-tg-2.c: Same.
* gcc.dg/torture/float64-tg.c: Same.

cfgexpand: Expand comment on when non-var clobbers can show up

The comment here is not wrong, just it would be better if mentioning
the C++ front-end instead of just the nested function lowering.

gcc/ChangeLog:

* cfgexpand.cc (add_scope_conflicts_1): Expand comment
on when non-var clobbers show up.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

testsuite: Fix fallout of turning warnings into errors on 32-bit Arm

Since commits 2c3db94d9fd ("c: Turn int-conversion warnings into
permerrors") and 55e94561e97e ("c: Turn -Wimplicit-function-declaration
into a permerror") these tests fail with errors such as:

  FAIL: gcc.target/arm/pr59858.c (test for excess errors)
  FAIL: gcc.target/arm/pr65647.c (test for excess errors)
  FAIL: gcc.target/arm/pr65710.c (test for excess errors)
  FAIL: gcc.target/arm/pr97969.c (test for excess errors)

Here's one example of the excess errors:

  FAIL: gcc.target/arm/pr65647.c (test for excess errors)
  Excess errors:
  /path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:6:17: error: initialization of 'int' from 'int *' makes integer from pointer without a cast [-Wint-conversion]
  /path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:6:51: error: initialization of 'int' from 'int *' makes integer from pointer without a cast [-Wint-conversion]
  /path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:6:62: error: initialization of 'int' from 'int *' makes integer from pointer without a cast [-Wint-conversion]
  /path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:7:48: error: initialization of 'int' from 'int *' makes integer from pointer without a cast [-Wint-conversion]
  /path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:8:9: error: initialization of 'int' from 'int *' makes integer from pointer without a cast [-Wint-conversion]
  /path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:24:5: error: initialization of 'int' from 'int *' makes integer from pointer without a cast [-Wint-conversion]
  /path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:25:5: error: initialization of 'int' from 'struct S1 *' makes integer from pointer without a cast [-Wint-conversion]
  /path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:41:3: error: implicit declaration of function 'fn3'; did you mean 'fn2'? [-Wimplicit-function-declaration]
  /path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:46:3: error: implicit declaration of function 'fn5'; did you mean 'fn4'? [-Wimplicit-function-declaration]
  /path/gcc.git/gcc/testsuite/gcc.target/arm/pr65647.c:57:16: error: implicit declaration of function 'fn6'; did you mean 'fn4'? [-Wimplicit-function-declaration]

PR rtl-optimization/59858 and PR target/65710 test the fix of an ICE.
PR target/65647 and PR target/97969 test for a compilation infinite loop.

Therefore, add -fpermissive so that the tests behave as they did previously.
Tested on armv8l-linux-gnueabihf.

gcc/testsuite/ChangeLog:
* gcc.target/arm/pr59858.c: Add -fpermissive.
* gcc.target/arm/pr65647.c: Likewise.
* gcc.target/arm/pr65710.c: Likewise.
* gcc.target/arm/pr97969.c: Likewise.

Revert: AVR: Implement TARGET_FLOATN_MODE.

Revert r15-4073 / 98a1a886e4c0c58ad9f9846caf5697ff00e4f24a
The default TARGET_FLOATN_MODE is just fine.
gcc/
* config/avr/avr.cc (avr_floatn_mode): Remove.
(TARGET_FLOATN_MODE): Remove.

AVR: Implement TARGET_FLOATN_MODE.

gcc/
* config/avr/avr.cc (avr_floatn_mode): New static function.
(TARGET_FLOATN_MODE): New define.

PR modula2/116918 -fswig correct syntax fixup

This patch adds a missing % escape in DoCheckUnbounded.

gcc/m2/ChangeLog:

PR modula2/116918
* gm2-compiler/M2Swig.mod (DoCheckUnbounded): Escape
the % character used in array_functions with %%.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

[PATCH] RISC-V/libgcc: Fix incorrect .cfi_offset for saving ra in __riscv_save_[0-3] on ilp32e.

From 8b3c5ebe8aacbcc4ddf1be8dea9a555e7e1bcc39 Mon Sep 17 00:00:00 2001
From: Jim Lin <jim@andestech.com>
Date: Fri, 4 Oct 2024 14:48:12 +0800
Subject: [PATCH] RISC-V/libgcc: Fix incorrect .cfi_offset for saving ra in
__riscv_save_[0-3] on ilp32e.

libgcc/ChangeLog:

* config/riscv/save-restore.S: Fix .cfi_offset for saving ra in
__riscv_save_[0-3] on ilp32e.

libstdc++/ranges: Implement various small LWG issues

This implements the following small LWG issues:

  3848. adjacent_view, adjacent_transform_view and slide_view missing base accessor
  3851. chunk_view::inner-iterator missing custom iter_move and iter_swap
  3947. Unexpected constraints on adjacent_transform_view::base()
  4001. iota_view should provide empty
  4012. common_view::begin/end are missing the simple-view check
  4013. lazy_split_view::outer-iterator::value_type should not provide default constructor
  4035. single_view should provide empty
  4053. Unary call to std::views::repeat does not decay the argument
  4054. Repeating a repeat_view should repeat the view

libstdc++-v3/ChangeLog:

* include/std/ranges (single_view::empty): Define as per LWG 4035.
(iota_view::empty): Define as per LWG 4001.
(lazy_split_view::_OuterIter::value_type): Remove default
constructor and make other constructor private as per LWG 4013.
(common_view::begin): Disable non-const overload for simple
views as per LWG 4012.
(common_view::end): Likewise.
(adjacent_view::base): Define as per LWG 3848.
(adjacent_transform_view::base): Likewise.
(chunk_view::_InnerIter::iter_move): Define as per LWG 3851.
(chunk_view::_InnerIter::itep_swap): Likewise.
(slide_view::base): Define as per LWG 3848.
(repeat_view): Adjust deduction guide as per LWG 4053.
(_Repeat::operator()): Adjust single-parameter overload as per
LWG 4054.
* testsuite/std/ranges/adaptors/adjacent/1.cc: Verify existence
of base member function.
* testsuite/std/ranges/adaptors/adjacent_transform/1.cc: Likewise.
* testsuite/std/ranges/adaptors/chunk/1.cc: Test LWG 3851 example.
* testsuite/std/ranges/adaptors/slide/1.cc: Verify existence of
base member function.
* testsuite/std/ranges/iota/iota_view.cc: Test LWG 4001 example.
* testsuite/std/ranges/repeat/1.cc: Test LWG 4053/4054 examples.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

testsuite: Fix up unevalstr2.C test

The CWG2521 changes adjusted the unevalstr1.C test but didn't adjust
unevalstr2.C test, which now FAILs in C++23 mode.

The intent in both of those tests was to test the separate (now deprecated)
syntax, so instead of removing the space between closing " and _ I've
adjusted the testcase to expect those 17 extra warnings.  And I've also
adjusted the unevalstr1.C testcase to do the same, when it is removed from
C++29 or whatever, that can be just guarded by #if.

But it is actually useful to also test the UDL variant without space between
closing " and _, so I've added new test coverage for that too to both tests.

2024-10-04  Jakub Jelinek  <jakub@redhat.com>

* g++.dg/cpp26/unevalstr1.C: Revert the 2024-10-03 changes, instead
expect extra warnings.  Add another set of tests without space
between " and _.
* g++.dg/cpp26/unevalstr2.C: Expect extra warnings for C++23.  Add
another set of tests without space between " and _.

aarch64: Set Armv9-A generic L1 cache line size to 64 bytes

I'd like to use a value of 64 bytes for the L1 cache size for Armv9-A
generic tuning.
As described in g:9a99559a478111f7fbeec29bd78344df7651c707 this value is used
to set the std::hardware_destructive_interference_size value which we want to
be not overly large when running concurrent applications on large core-count
systems.

The generic value for Armv8-A systems and the port baseline is 256 bytes
because that's what the A64FX CPU has, as set de-facto in
aarch64_override_options_internal.

But for Armv9-A CPUs as far as I know there isn't anything larger
than 64 bytes, so we should be able to use the smaller value here and reduce
the size of concurrent structs that use
std::hardware_destructive_interference_size to pad their fields.

Bootstrapped and tested on aarch64-none-linux-gnu.

Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com>
* config/aarch64/tuning_models/generic_armv9_a.h
(generic_armv9a_prefetch_tune): Define.
(generic_armv9_a_tunings): Use the above.

libstdc++: Fix some Parallel Mode testsuite failures

Some of these are due to no longer using #pragma GCC system_header in
libstdc++ headers, some have been failing for longer and weren't
noticed.

libstdc++-v3/ChangeLog:

* include/parallel/algobase.h (search): Use sequential algorithm
for constant evaluation.
* include/parallel/algorithmfwd.h (search): Add
_GLIBCXX20_CONSTEXPR.
* include/parallel/multiway_merge.h: Remove stray semi-colon.
* include/parallel/multiseq_selection.h: Add diagnostic pragmas
for -Wlong-long warning.
* include/parallel/quicksort.h: Likewise.
* include/parallel/random_number.h: Likewise.
* include/parallel/settings.h: Likewise.
* include/parallel/workstealing.h: Replace ++ and -- on volatile
variables.
* testsuite/17_intro/names.cc: Skip names defined by
<tr1/random>.
* testsuite/20_util/pair/dangling_ref.cc: Skip test if Parallel
Mode is enabled.
* testsuite/20_util/tuple/dangling_ref.cc: Likewise.

arm: Fix missed CE optimization for armv8.1-m.main [PR 116444]

This patch restores missed optimizations for armv8.1-m.main targets that were
missed when the generation of csinc, csinv and csneg were enabled for the same
with patch series containing:

commit c2bb84be4a6e581bbf45891457ee632a07416982
Author: Sudi Das <sudi.das@arm.com>
Date:   Fri Sep 18 15:47:46 2020 +0100

    [PATCH 2/5][Arm] New pattern for CSINV instructions

The original patch series makes use of the "noce" machinery to transform RTL
into patterns that later match the Armv8.1-M Mainline, by getting the target
hook TARGET_HAVE_CONDITIONAL_EXECUTION, to return FALSE for such targets prior
to reload_completed.  The same machinery however was transforming other RTL
patterns which were later on causing the "ce" pass post reload_completed to no
longer optimize conditional execution opportunities, which was causing the
regression observed in PR target/116444, a regression of 'testsuite/gcc.target/arm/thumb-ifcvt-2.c'
when ran for an Armv8.1-M Mainline target.

This patch implements the target hook TARGET_NOCE_CONVERSION_PROFITABLE_P to
only allow "noce" to generate patterns that match CSINV, CSINC and CSNEG.  Thus
ensuring that the early "ce" passes do not ruin things for later ones.

gcc/ChangeLog:

PR target/116444
* config/arm/arm-protos.h (arm_noce_conversion_profitable_p): New
declaration.
* config/arm/arm.cc (arm_is_v81m_cond_insn): New helper function used
in ...
(arm_noce_conversion_profitable_p): ... here. New function to implement
...
(TARGET_NOCE_PROFITABLE_P): ... this target hook.  New define.

Fixup dumping of re-trying without/with single-lane SLP

The following fixes the order of decrementing the SLP mode and
the dumping.

* tree-vect-loop.cc (vect_analyze_loop_2): Derement 'slp'
before dumping which stage we're starting.

diagnostic, pch: Fix up the new diagnostic PCH methods for ubsan checking [PR116936]

The PR notes that the new pch_save/pch_restore methods I've added
recently invoke UB if either m_classification_history.address ()
or m_push_list.address () is NULL (which can happen if those vectors
are empty (and in the pch_save case nothing has been pushed into them
before either). While the corresponding length is necessarily 0,
fwrite (NULL, something, 0, f) or
fread (NULL, something, 0, f) still invoke UB.

The following patch fixes that by not calling fwrite/fread if the
corresponding length is 0.

2024-10-04 Jakub Jelinek <jakub@redhat.com>

PR pch/116936
* diagnostic.cc (diagnostic_option_classifier::pch_save): Only call
fwrite if corresponding length is non-zero.
(diagnostic_option_classifier::pch_restore): Only call fread if
corresponding length is non-zero.

i386: Fix up ix86_expand_int_compare with TImode comparisons of SUBREGs from V8{H,B}Fmode against zero [PR116921]

The following testcase ICEs, because the ix86_expand_int_compare
optimization to use {,v}ptest assumes there are instructions for all
16-byte vector modes.  That isn't the case, we only have one for
V16QI, V8HI, V4SI, V2DI, V1TI, V4SF and V2DF, not for
V8HF nor V8BF.

The following patch fixes that by using the V8HI instruction instead
for those 2 modes.  tmp can't be a SUBREG, because it is SUBREG_REG
of another SUBREG, so we don't need to worry about gen_lowpart
failing.

2024-10-04  Jakub Jelinek  <jakub@redhat.com>

PR target/116921
* config/i386/i386-expand.cc (ix86_expand_int_compare): Add a SUBREG
to V8HImode from V8HFmode or V8BFmode before generating a ptest.

* gcc.target/i386/pr116921.c: New test.

i386: Fix up *minmax<mode>3_2 splitter [PR116925]

While *minmax<mode>3_1 correctly uses
   if (MEM_P (operands[1]))
     operands[1] = force_reg (<MODE>mode, operands[1]);
to ensure operands[1] is not a MEM, *minmax<mode>3_2 does it wrongly
by calling force_reg but ignoring its return value.

The following borderingly obvious patch fixes that.

Didn't find similar other errors in the backend with force_reg calls.

2024-10-04  Jakub Jelinek  <jakub@redhat.com>

PR target/116925
* config/i386/sse.md (*minmax<mode>3_2): Assign force_reg result
back to operands[2] instead of throwing it away.

* g++.target/i386/avx-pr116925.C: New test.

c++: Allow references to internal-linkage vars in C++11 [PR113266]

[temp.arg.nontype] changed in C++11 to allow naming internal-linkage
variables and functions. We currently already handle internal-linkage
functions, but variables were missed; this patch updates this.

PR c++/113266
PR c++/116911

gcc/cp/ChangeLog:

* parser.cc (cp_parser_template_argument): Allow
internal-linkage variables since C++11.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/nontype6.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>

c++: Return the underlying decl rather than the USING_DECL from update_binding [PR116913]

Users of pushdecl assume that the returned decl will be a possibly
updated decl matching the one that was passed in. My r15-3910 change
broke this since in some cases we would now return USING_DECLs; this
patch fixes the situation.

PR c++/116913

gcc/cp/ChangeLog:

* name-lookup.cc (update_binding): Return the strip_using'd old
decl rather than the binding.

gcc/testsuite/ChangeLog:

* g++.dg/lookup/using70.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>

Relax gcc.dg/vect/pr65947-8.c

When failing using forced SLP we do not print the non-SLP failure
mode which reads slightly different. Massage the expectation a bit.

* gcc.dg/vect/pr65947-8.c: Adjust.

libstdc++: Replace implicit lambda capture of 'this' [PR116964]

C++20 deprecates implicit capture of 'this', so change [=] to [this] for
all lambda expressions in <shared_mutex>. This only shows up on targets
where _GLIBCXX_USE_PTHREAD_RWLOCK_T is not defined, as we have an
alternative implementation of shared mutexes in that case.

libstdc++-v3/ChangeLog:

PR libstdc++/116964
* include/std/shared_mutex (__shared_mutex_cv): Use [this] for
lambda captures.
(shared_timed_mutex) [!_GLIBCXX_USE_PTHREAD_RWLOCK_T]: Likewise.

tree-optimization/99856 - fix testcase

When making the testcase use aligned accesses I botched up the
copy&paste. Fixed.

PR tree-optimization/99856
* gcc.dg/vect/pr99856.c: Fix copy&paste errors.

testsuite: fix two newly-running -Wstringop-overflow test directives

This didn't show up until the previous commit which fixed the directive
syntax. The indexing was off for the notes.

gcc/testsuite/ChangeLog:

* gcc.dg/Wstringop-overflow-79.c: Fix index for notes.
* gcc.dg/Wstringop-overflow-80.c: Ditto.

testsuite: add missing braces around dejagnu directives

gcc/testsuite/ChangeLog:

* c-c++-common/analyzer/flex-without-call-summaries.c: Add missing brace.
* c-c++-common/analyzer/malloc-callbacks.c: Ditto.
* gcc.dg/Wstringop-overflow-79.c: Ditto.
* gcc.dg/Wstringop-overflow-80.c: Ditto.

testsuite - Fix gcc.c-torture/execute/ieee/pr108540-1.c

PR testsuite/108540
gcc/testsuite/
* gcc.c-torture/execute/ieee/pr108540-1.c: Un-preprocess
__SIZE_TYPE__ and __INT64_TYPE__.
* gcc.c-torture/execute/ieee/pr108540-1.x: New file, requires double64.

doc: Drop GCC 2.6 ABI change note for H8/h8300-hms

gcc:
PR target/69374
* doc/install.texi (Specific) <h8300-hms>: Drop GCC 2.6
ABI change note.

gcc: fix typo in gimplify

gcc/ChangeLog:

* gimplify.cc (gimple_add_init_for_auto_var): Fix 'variable' typo.

testsuite: gnat.dg: fix dg-do directive syntax

Fix incorrect use of '[' rather than '{' in 'dg-...' directives.

gcc/testsuite/ChangeLog:

* gnat.dg/pack13.adb: Fix 'dg-...' directive syntax.
* gnat.dg/size_attribute.adb: Ditto.
* gnat.dg/subp_elim_errors.adb: Ditto.

c++: record template specialization hash

A lot of compile time of template-heavy code is spent in re-hashing
hashtable elements upon expansion. The following records the hash in the
hash element. This speeds up C++20 compilation of stdc++.h by about 25% for
about a 0.1% increase in memory usage.

With the hash value in the entry, we don't need to pass it separately to the
find functions.

Adding default arguments to the spec and hash fields simplifies spec_entry
initialization and avoids problems from hash starting with an indeterminate
value.

gcc/cp/ChangeLog:

* cp-tree.h (spec_entry::hash): New member.
* pt.cc (spec_hasher::hash): Set it and return it.
(maybe_process_partial_specialization): Clear it when
changing tmpl/args.
(lookup_template_class): Likewise, don't pass hash to find.
(retrieve_specialization): Set it, don't pass hash to find.
(register_specialization): Don't pass hash to find.
(reregister_specialization): Likewise.
(match_mergeable_specialization): Likewise.
(add_mergeable_specialization): Likewise.

Co-authored-by: Richard Biener <rguenther@suse.de>

Daily bump.

c++: free garbage vec in coerce_template_parms

coerce_template_parms can create two different vecs for the inner template
arguments, new_inner_args and (potentially) the result of
expand_template_argument_pack.  One or the other, or possibly both, end up
being garbage: in the typical case, the expanded vec is garbage because it's
only used as the source for convert_template_argument.  In some dependent
cases, the new vec is garbage because we decide to return the original args
instead.  In these cases, ggc_free the garbage vec to reduce the memory
overhead of overload resolution.

gcc/cp/ChangeLog:

* pt.cc (coerce_template_parms): Free garbage vecs.

Co-authored-by: Richard Biener <rguenther@suse.de>

Aarch64: Define WIDEST_HARDWARE_FP_SIZE

The macro is documented like this in the internal manual:

-- Macro: WIDEST_HARDWARE_FP_SIZE
     A C expression for the size in bits of the widest floating-point
     format supported by the hardware.  If you define this macro, you
     must specify a value less than or equal to mode precision of the
     mode used for C type 'long double' (from hook
     'targetm.c.mode_for_floating_type' with argument
     'TI_LONG_DOUBLE_TYPE').  If you do not define this macro, mode
     precision of the mode used for C type 'long double' is the default.

AArch64 uses 128-bit TFmode for long double but, as far as I know, no FPU
implemented in hardware supports it.

gcc/
* config/aarch64/aarch64.h (WIDEST_HARDWARE_FP_SIZE): Define to 64.

gcc/testsuite/
* gnat.dg/specs/size_clause6.ads: New test.

Revert "c++: free garbage vec in coerce_template_parms"

This broke bootstrap, improving.

This reverts commit 5b08ae503dd4aef2789a667daaf1984e7cc94aaa.

c++: -Wdeprecated enables later standard deprecations

By default -Wdeprecated warns about deprecations in the active standard.
When specified explicitly, let's also warn about deprecations in later
standards.

gcc/c-family/ChangeLog:

* c-opts.cc (c_common_post_options): Explicit -Wdeprecated enables
deprecations from later standards.

gcc/ChangeLog:

* doc/invoke.texi: Explicit -Wdeprecated enables more warnings.

c++: add -Wdeprecated-literal-operator [CWG2521]

C++23 CWG issue 2521 (https://wg21.link/cwg2521) deprecates user-defined
literal operators declared with the optional space between "" and the
suffix.

Many testcases used that syntax; I removed the space from most of them, and
added C++23 warning tests to a few.

CWG 2521

gcc/ChangeLog:

* doc/invoke.texi: Document -Wdeprecated-literal-operator.

gcc/c-family/ChangeLog:

* c.opt: Add -Wdeprecated-literal-operator.
* c-opts.cc (c_common_post_options): Default on in C++23.
* c.opt.urls: Regenerate.

gcc/cp/ChangeLog:

* parser.cc (location_between): New.
(cp_parser_operator): Handle -Wdeprecated-literal-operator.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/udlit-string-literal.h
* g++.dg/cpp0x/Wliteral-suffix2.C
* g++.dg/cpp0x/constexpr-55708.C
* g++.dg/cpp0x/gnu_fext-numeric-literals.C
* g++.dg/cpp0x/gnu_fno-ext-numeric-literals.C
* g++.dg/cpp0x/pr51420.C
* g++.dg/cpp0x/pr60209-neg.C
* g++.dg/cpp0x/pr60209.C
* g++.dg/cpp0x/pr61038.C
* g++.dg/cpp0x/std_fext-numeric-literals.C
* g++.dg/cpp0x/std_fno-ext-numeric-literals.C
* g++.dg/cpp0x/udlit-addr.C
* g++.dg/cpp0x/udlit-args-neg.C
* g++.dg/cpp0x/udlit-args.C
* g++.dg/cpp0x/udlit-args2.C
* g++.dg/cpp0x/udlit-clink-neg.C
* g++.dg/cpp0x/udlit-concat-neg.C
* g++.dg/cpp0x/udlit-concat.C
* g++.dg/cpp0x/udlit-constexpr.C
* g++.dg/cpp0x/udlit-cpp98-neg.C
* g++.dg/cpp0x/udlit-declare-neg.C
* g++.dg/cpp0x/udlit-embed-quote.C
* g++.dg/cpp0x/udlit-extended-id-1.C
* g++.dg/cpp0x/udlit-extended-id-3.C
* g++.dg/cpp0x/udlit-extern-c.C
* g++.dg/cpp0x/udlit-friend.C
* g++.dg/cpp0x/udlit-general.C
* g++.dg/cpp0x/udlit-implicit-conv-neg-char8_t.C
* g++.dg/cpp0x/udlit-implicit-conv-neg.C
* g++.dg/cpp0x/udlit-inline.C
* g++.dg/cpp0x/udlit-mangle.C
* g++.dg/cpp0x/udlit-member-neg.C
* g++.dg/cpp0x/udlit-namespace.C
* g++.dg/cpp0x/udlit-nofunc-neg.C
* g++.dg/cpp0x/udlit-nonempty-str-neg.C
* g++.dg/cpp0x/udlit-nosuffix-neg.C
* g++.dg/cpp0x/udlit-nounder-neg.C
* g++.dg/cpp0x/udlit-operator-neg.C
* g++.dg/cpp0x/udlit-overflow-neg.C
* g++.dg/cpp0x/udlit-overflow.C
* g++.dg/cpp0x/udlit-preproc-neg.C
* g++.dg/cpp0x/udlit-raw-length.C
* g++.dg/cpp0x/udlit-raw-op-string-neg.C
* g++.dg/cpp0x/udlit-raw-op.C
* g++.dg/cpp0x/udlit-raw-str.C
* g++.dg/cpp0x/udlit-resolve-char8_t.C
* g++.dg/cpp0x/udlit-resolve.C
* g++.dg/cpp0x/udlit-shadow-neg.C
* g++.dg/cpp0x/udlit-string-length.C
* g++.dg/cpp0x/udlit-suffix-neg.C
* g++.dg/cpp0x/udlit-template.C
* g++.dg/cpp0x/udlit-tmpl-arg-neg.C
* g++.dg/cpp0x/udlit-tmpl-arg-neg2.C
* g++.dg/cpp0x/udlit-tmpl-arg.C
* g++.dg/cpp0x/udlit-tmpl-parms-neg.C
* g++.dg/cpp0x/udlit-tmpl-parms.C
* g++.dg/cpp1y/pr57640.C
* g++.dg/cpp1y/pr88872.C
* g++.dg/cpp26/unevalstr1.C
* g++.dg/cpp2a/concepts-pr60391.C
* g++.dg/cpp2a/consteval-prop21.C
* g++.dg/cpp2a/nontype-class6.C
* g++.dg/cpp2a/udlit-class-nttp-ctad-neg.C
* g++.dg/cpp2a/udlit-class-nttp-ctad-neg2.C
* g++.dg/cpp2a/udlit-class-nttp-ctad.C
* g++.dg/cpp2a/udlit-class-nttp-neg.C
* g++.dg/cpp2a/udlit-class-nttp-neg2.C
* g++.dg/cpp2a/udlit-class-nttp.C
* g++.dg/ext/is_convertible2.C
* g++.dg/lookup/pr87269.C
* g++.dg/cpp0x/udlit_system_header: Adjust for C++23 deprecated
operator "" _suffix.
* g++.dg/DRs/dr2521.C: New test.

c++: free garbage vec in coerce_template_parms

coerce_template_parms can create two different vecs for the inner template
arguments, new_inner_args and (potentially) the result of
expand_template_argument_pack.  One or the other, or possibly both, end up
being garbage: in the typical case, the expanded vec is garbage because it's
only used as the source for convert_template_argument.  In some dependent
cases, the new vec is garbage because we decide to return the original args
instead.  In these cases, ggc_free the garbage vec to reduce the memory
overhead of overload resolution.

gcc/cp/ChangeLog:

* pt.cc (struct free_if_changed_proxy): New.
(coerce_template_parms): Use it.

Co-authored-by: Richard Biener <rguenther@suse.de>

AVR: Make gcc.dg/c23-stdarg-9.c work.

gcc/testsuite/
* gcc.dg/c23-stdarg-9.c (struct S) [AVR]: Only use int a[500].

libstdc++: Make Unicode utils work with Parallel Mode

libstdc++-v3/ChangeLog:

* include/bits/unicode.h (__unicode::__is_incb_linker): Use
_GLIBCXX_STD_A namespace for std::find.

libstdc++: Fix -Wdeprecated-declarations warning for Parallel Mode [PR116944]

The pragmas to disable warnings need to be moved before the first use of
the deprecated classes.

libstdc++-v3/ChangeLog:

PR libstdc++/116944
* include/parallel/base.h: Move diagnostic pragmas earlier.

libstdc++: Fix some warnings seen during bootstrap

libstdc++-v3/ChangeLog:

* include/bits/locale_facets_nonio.tcc (money_put::__do_get):
Ignore -Wformat warning for __ibm128 arguments.
* include/tr1/tuple (ignore): Ignore -Wunused warning.

Restore aarch64 bootstrap

This zero-initializes vec_init to avoid a bogus maybe-uninitialized
diagnostic.

* tree-vect-loop.cc (vectorizable_induction): Initialize
vec_init.

aarch64: Fix early ra for -fno-delete-dead-exceptions [PR116927]

Early-RA was considering throwing instructions as being dead and removing
them even if -fno-delete-dead-exceptions was in use. This fixes that oversight.

Built and tested for aarch64-linux-gnu.

PR target/116927

gcc/ChangeLog:

* config/aarch64/aarch64-early-ra.cc (early_ra::is_dead_insn): Insns
that throw are not dead with -fno-delete-dead-exceptions.

gcc/testsuite/ChangeLog:

* g++.dg/torture/pr116927-1.C: New test.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

libstdc++: [_Hashtable] Fix some implementation inconsistencies

Get rid of the different usages of the mutable keyword except in
_Prime_rehash_policy where it is preserved for abi compatibility reason.

Fix comment to explain that we need the computation of bucket index noexcept
to be able to rehash the container when needed.

For Standard instantiations through std::unordered_xxx containers we already
force caching of hash code when hash functor is not noexcep so it is guarantied.

The static_assert purpose in _Hashtable on _M_bucket_index is thus limited
to usages of _Hashtable with exotic _Hashtable_traits.

libstdc++-v3/ChangeLog:

* include/bits/hashtable_policy.h (_NodeBuilder<>::_S_build): Remove
const qualification on _NodeGenerator instance.
(_ReuseOrAllocNode<>::operator()(_Args&&...)): Remove const qualification.
(_ReuseOrAllocNode<>::_M_nodes): Remove mutable.
(_Insert_base<>::_M_insert_range): Remove _NodeGetter const qualification.
(_Hash_code_base<>::_M_bucket_index(const _Hash_node_value<>&, size_t)):
Simplify noexcept declaration, we already static_assert that _RangeHash functor
is noexcept.
* include/bits/hashtable.h: Rework comments. Remove const qualifier on
_NodeGenerator& arguments.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

diagnostics: support SARIF 2.2 output, undocumented for now [PR116301]

GCC currently supports outputting SARIF v2.1.0

Version 2.2 of the SARIF spec is not yet official, but the draft has
already gained features we might might want to use.

This patch extends the SARIF output code to accept a enum sarif_version
parameter internally, representing 2.1.0 or a prerelease of 2.2

The patch updates the SARIF output selftests so that they are run for
all such versions.

I hope to expose this "properly" via the mechanism described
in comment #13 of PR116613.  In the meantime, the patch adds a new
  -fdiagnostics-format=sarif-file-2.2-prerelease
for use by the DejaGnu testsuite, deliberately left undocumented for
now.

The copy of the 2.2 draft schema in the testsuite was downloaded from
https://raw.githubusercontent.com/oasis-tcs/sarif-spec/refs/tags/2.2-prerelease-2024-08-08/sarif-2.2/schema/sarif-2-2.schema.json

The patch adds support for capturing related locations within an ICE
notification for SARIF 2.2 onwards, thus capturing "include chain"
information for SARIF-based reports of ICEs that occur within a
header; see https://github.com/oasis-tcs/sarif-spec/issues/540

The patch does *not* add support for the "scannedFile" role, leaving it
to followup work; see https://github.com/oasis-tcs/sarif-spec/issues/459

gcc/ChangeLog:
PR other/116301
* common.opt (sarif-file-2.2-prerelease): New value for
-fdiagnostics-format=.
* diagnostic-format-sarif.cc
(sarif_location_manager::sarif_location_manager): Move
initialization of m_related_locations_arr here from sarif_result's
ctor.
(sarif_location_manager::add_related_location): Implement for
base class, taking sarif_result's implementation.  Add "builder"
param.
(sarif_location_manager::m_related_locations_arr): Move here from
class sarif_result.
(class sarif_result): Move m_related_locations_arr field and
add_related_location vfunc to class sarif_location_manager.
(sarif_builder::get_version): New accessor.
(sarif_builder::m_version): New field.
(sarif_invocation::add_notification_for_ice): Call
process_worklist on the notification for SARIF 2.2 and later.
(sarif_location_manager::process_worklist_item): Pass builder to
calls to add_related_location.
(sarif_result::on_nested_diagnostic): Likewise.
(sarif_result::on_diagram): Likewise.
(sarif_ice_notification::add_related_location): Add builder param.
For SARIF 2.2 and later chain up to base class impl so that
notifications get related locations.
(sarif_builder::sarif_builder): Add "version" param.
(SARIF_SCHEMA): Delete in favor of...
(sarif_version_to_url): New function.
(SARIF_VERSION): Delete in favor of...
(sarif_version_to_property): New function.
(make_top_level_object): Update to use m_version for "$schema" and
"version".
(sarif_output_format::sarif_output_format): Add "version" param.
(sarif_stream_output_format::sarif_stream_output_format):
Likewise.
(sarif_file_output_format::sarif_file_output_format): Likewise.
(diagnostic_output_format_init_sarif_stderr): Likewise.
(diagnostic_output_format_init_sarif_file): Likewise.
(diagnostic_output_format_init_sarif_stream): Likewise.
(selftest::test_sarif_diagnostic_context): Likewise.
(selftest::test_make_location_object): Likewise.
(selftest::test_simple_log): Likewise.  Update schema and version
tests accordingly.
(selftest::test_simple_log_2): Add "version" param.
(selftest::test_message_with_embedded_link): Likewise.
(selftest::run_tests_per_version): New, based on the
for_each_line_table_case calls in...
(selftest::diagnostic_format_sarif_cc_tests): Add loop over sarif
versions.  Replace for_each_line_table_case calls with one
call to run_tests_per_version.
* diagnostic-format-sarif.h: Include "diagnostic-format.h".
(enum class sarif_version): New.
(diagnostic_output_format_init_sarif_stderr): Move to here from
diagnostic-format.h.  Add "version" param.
(diagnostic_output_format_init_sarif_file): Likewise.
(diagnostic_output_format_init_sarif_stream): Likewise.
* diagnostic-format.h: Include "diagnostic.h".
(diagnostic_output_format_init_sarif_stderr): Move from here to
diagnostic-format-sarif.h.
* diagnostic.cc: Define INCLUDE_MEMORY.
Include "diagnostic-format-sarif.h".
(diagnostic_output_format_init): Pass sarif_version::v2_1_0 to
existing SARIF options.
Add case DIAGNOSTICS_OUTPUT_FORMAT_SARIF_FILE_2_2_PRERELEASE.
* diagnostic.h (enum diagnostics_output_format): Add
DIAGNOSTICS_OUTPUT_FORMAT_SARIF_FILE_2_2_PRERELEASE.

gcc/testsuite/ChangeLog:
PR other/116301
* gcc.dg/plugin/crash-test-ice-in-header-sarif-2.1.c: New test.
* gcc.dg/plugin/crash-test-ice-in-header-sarif-2.2.c: New test.
* gcc.dg/plugin/crash-test-ice-in-header-sarif-2_1.py: Support
script for new test.
* gcc.dg/plugin/crash-test-ice-in-header-sarif-2_2.py: Likewise.
* gcc.dg/plugin/crash-test-ice-in-header.h: New header.
* gcc.dg/plugin/plugin.exp: Add the new tests.
* lib/sarif-schema-2.2-prerelease-2024-08-08.json: New schema
file.
* lib/scansarif.exp (verify-sarif-file): Add optional argument for
specifying which version of the schema to validate against,
supporting "2.1" and "2.2", defaulting to the former.
Update the test name to capture the version of the schema tested
against.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

Daily bump.

phiopt: Fix VCE moving by rewriting it into cast [PR116098]

Phiopt match_and_simplify might move a well defined VCE assign statement
from being conditional to being uncondtitional; that VCE might no longer
being defined. It will need a rewrite into a cast instead.

This adds the rewriting code to move_stmt for the VCE case.
This is enough to fix the issue at hand. It should also be using rewrite_to_defined_overflow
but first I need to move the check to see a rewrite is needed into its own function
and that is causing issues (see https://gcc.gnu.org/pipermail/gcc-patches/2024-September/663938.html).
Plus this version is easiest to backport.

Bootstrapped and tested on x86_64-linux-gnu.

PR tree-optimization/116098

gcc/ChangeLog:

* tree-ssa-phiopt.cc (move_stmt): Rewrite VCEs from integer to integer
types to case.

gcc/testsuite/ChangeLog:

* c-c++-common/torture/pr116098-2.c: New test.
* g++.dg/torture/pr116098-1.C: New test.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

testsuite/52641 - Make gcc.dg/strict-flex-array-3.c work on int != 32 bits.

PR testsuite/52641
gcc/testsuite/
* gcc.dg/strict-flex-array-3.c (expect) [AVR]: Use custom
version due to AVR-LibC limitations.
(stuff): Use __SIZEOF_INT__ instead of hard-coded values.

AVR: Make gcc.dg/pr113596.c work.

gcc/testsuite/
* gcc.dg/pr113596.c: Require less memory so it works on AVR.

testsuite/52641 - Require int32 for gcc.dg/pr93820-2.c.

PR testsuite/52641
gcc/testsuite/
* gcc.dg/pr93820-2.c: Add dg-require-effective-target int32.

testsuite/52641 - Fix gcc.dg/signbit-6.c for int != 32-bit targets.

PR testsuite/52641
gcc/testsuite/
* gcc.dg/signbit-6.c (main): Initialize a[0] and b[0]
with INT32_MIN (instead of with INT_MIN).