git.ipfire.org Git - thirdparty/gcc.git/log

c: Adjust LDBL_EPSILON for C2x for IBM long double

C2x changes the <float.h> definition of *_EPSILON to apply only to
normalized numbers.  The effect is that LDBL_EPSILON for IBM long
double becomes 0x1p-105L instead of 0x1p-1074L.

There is a reasonable case for considering this a defect fix - it
originated from the issue reporting process (DR#467), though it ended
up being resolved by a paper (N2326) for C2x rather than through the
issue process, and code using *_EPSILON often needs to override the
pre-C2x value of LDBL_EPSILON and use something on the order of
magnitude of the C2x value instead.  However, I've followed the
conservative approach of only making the change for C2x and not for
previous standard versions (and not for C++, which doesn't have the
C2x changes in this area).
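
As a rough illustration of the normalized-number definition (not part of the patch), the sketch below computes 2^(1 - p) from LDBL_MANT_DIG, where p is the number of significand digits. The helper name normalized_epsilon is hypothetical. For IBM long double, where LDBL_MANT_DIG is 106, this yields 0x1p-105L.

```cpp
#include <cassert>
#include <cfloat>
#include <cmath>

// Hypothetical helper: the gap between 1 and the next larger normalized
// value for a binary significand of LDBL_MANT_DIG digits.
long double normalized_epsilon()
{
  const long double eps = std::ldexp(1.0L, 1 - LDBL_MANT_DIG);
  assert(1.0L + eps != 1.0L);   // 1 + eps must be distinguishable from 1
  return eps;
}
```

On formats other than IBM long double the result is expected to equal LDBL_EPSILON under every language version.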

The testcases added are intended to be valid for all long double
formats.  The C11 one is based on
gcc.target/powerpc/rs6000-ldouble-2.c (and when we move to a C2x
default, gcc.target/powerpc/rs6000-ldouble-2.c will need an
appropriate option added to keep using an older language version).

Tested with no regressions for cross to powerpc-linux-gnu.

gcc/c-family/
* c-cppbuiltin.cc (builtin_define_float_constants): Do not
special-case __*_EPSILON__ setting for IBM long double for C2x.

gcc/testsuite/
* gcc.dg/c11-float-7.c, gcc.dg/c2x-float-12.c: New tests.

libstdc++: Fix tests broken by C++23 P2266R3 "Simpler implicit move"

In C++23 mode these tests started to FAIL because an rvalue reference
parameter can no longer be bound to an lvalue reference return type. As
confirmed by Ville (who added these tests) the problem overloads are not
intended to be called, and only exist to verify that they don't
interfere with the intended behaviour. This changes the function bodies
to just throw, so that the tests will fail if the function is called.
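
A minimal sketch of the pattern, assuming a hypothetical Sink type in place of std::ostream (the real tests are not reproduced here). The decoy overload throws rather than returning its rvalue-reference parameter, so it compiles in C++23 and makes any accidental call visible.

```cpp
#include <cassert>
#include <stdexcept>

struct Sink {};   // hypothetical stand-in for std::ostream
struct Y {};

// Intended overload: takes the stream by lvalue reference.
Sink& operator<<(Sink& s, const Y&) { return s; }

// Decoy overload: must never be chosen for an lvalue stream. Writing
// "return s;" here would try to bind the rvalue-reference parameter to
// the lvalue-reference return type, which C++23 rejects.
Sink& operator<<(Sink&&, const Y&)
{
  throw std::logic_error("decoy overload was called");
}

// Returns true when the lvalue overload was selected.
bool intended_overload_used()
{
  Sink s;
  Sink& r = (s << Y{});
  return &r == &s;
}
```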

libstdc++-v3/ChangeLog:

* testsuite/27_io/basic_ostream/inserters_other/char/6.cc:
Change body of unused operator<< overload to throw if called.
* testsuite/27_io/basic_ostream/inserters_other/wchar_t/6.cc:
Likewise.

Do not pessimize range in set_nonzero_bits.

Currently if we have a range of [0,0] and we set the nonzero bits to
1, the current code pessimizes the range to [0,1] because it assumes
the range is [1,1] plus the possibility of 0. This fixes the
oversight.
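
The idea can be modelled with plain integers. This is a toy sketch, not the irange code: Range and refine_with_single_bit_mask are hypothetical names, and only the single-bit mask case is covered.

```cpp
#include <cassert>
#include <cstdint>

struct Range { std::uint64_t lo, hi; };   // closed interval [lo, hi]

// If MASK has exactly one bit set, a value whose nonzero bits lie within
// MASK is either 0 or MASK. Keep only the candidates that R already
// allows, so that [0, 0] stays [0, 0] instead of widening to [0, MASK].
Range refine_with_single_bit_mask(Range r, std::uint64_t mask)
{
  assert(mask != 0 && (mask & (mask - 1)) == 0);   // single-bit masks only
  const bool has_zero = r.lo == 0;
  const bool has_mask = r.lo <= mask && mask <= r.hi;
  if (has_zero && has_mask)
    return {0, mask};
  if (has_mask)
    return {mask, mask};
  if (has_zero)
    return {0, 0};
  return r;   // neither candidate fits; this sketch leaves R unchanged
}
```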

gcc/ChangeLog:

* value-range.cc (irange::set_nonzero_bits): Do not pessimize range.
(range_tests_nonzero_bits): New test.

Avoid comparing ranges when sub-ranges is 0.

There is nothing else to compare when the number of sub-ranges is 0.

gcc/ChangeLog:

* value-range.cc (irange::operator==): Early bail on m_num_ranges
equal to 0.

Do not compare nonzero masks for varying.

There is no need to compare nonzero masks when comparing two VARYING
ranges, as they are always the same when range types are the same.

gcc/ChangeLog:

* value-range.cc (irange::legacy_equal_p): Remove nonzero mask
check when comparing VR_VARYING ranges.

Do not compare incompatible ranges in ipa-prop.

gcc/ChangeLog:

* ipa-prop.cc (struct ipa_vr_ggc_hash_traits): Do not compare
incompatible ranges in ipa-prop.

Fortran: fix testcases

Remove unreliable test for IEEE_FMA(), which fails on powerpc.
Adjust stop codes for modes_1.f90.

2022-10-03 Francois-Xavier Coudert <fxcoudert@gcc.gnu.org>

gcc/testsuite/

PR fortran/107062
* gfortran.dg/ieee/fma_1.f90: Fix test.
* gfortran.dg/ieee/modes_1.f90: Fix test.

libstdc++: Fix gdb pretty printers when dealing with std::string

Since revision 33b43b0d8cd2de722d177ef823930500948a7487, std::string and other
similar typedefs are ambiguous from gdb's point of view because they match both
std::basic_string<char> and std::__cxx11::basic_string<char> symbols. For those
typedefs, add a workaround that accepts the substitution as long as the types are
the same regardless of the __cxx11 namespace.

Also avoid registering printers for types in the std::__cxx11::__8:: namespace, as
there are no such symbols.

libstdc++-v3/ChangeLog:

* python/libstdcxx/v6/printers.py (Printer.add_version): Do not add version
namespace for __cxx11 symbols.
(add_one_template_type_printer): Likewise.
(add_one_type_printer): Likewise.
(FilteringTypePrinter._recognizer.recognize): Add a workaround for std::string et al.
ambiguous typedefs matching both std:: and std::__cxx11:: symbols.
* testsuite/libstdc++-prettyprinters/cxx17.cc: Remove obsolete
#define _GLIBCXX_USE_CXX11_ABI 0.
* testsuite/libstdc++-prettyprinters/simple.cc: Likewise. Adapt test to accept
std::__cxx11::list.
* testsuite/libstdc++-prettyprinters/simple11.cc: Likewise.
* testsuite/libstdc++-prettyprinters/whatis.cc: Likewise.
* testsuite/libstdc++-prettyprinters/80276.cc: Likewise and remove xfail for c++20
and debug mode.
* testsuite/libstdc++-prettyprinters/libfundts.cc: Likewise.

Daily bump.

tree-cfg: Fix a verification diagnostic typo [PR107121]

Obvious typo in diagnostics.

2022-10-02 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/107121
* tree-cfg.cc (verify_gimple_call): Fix a typo in diagnostics,
DEFFERED_INIT -> DEFERRED_INIT.

Adjust LIBGCC2_INCLUDES for VxWorks and augment comment

Investigating the reasons for libgcc build failures in a canadian
context, orthogonally to the recent update of vxcrtstuff, exposed
interesting differences in the way include search paths are managed
between a regular Linux->VxWorks cross build and a canadian setup
building a Windows->VxWorks toolchain in a Linux environment.

This change augments the comment attached to LIBGCC2_INCLUDE in
libgcc/config/t-vxworks to better describe the parameters at play.

It also adjusts the addition of options for gcc/include and
gcc/include-fixed to minimize the actual differences for libgcc
in the two kinds of configurations.

2022-03-06 Olivier Hainque <hainque@adacore.com>

libgcc/
* config/t-vxworks (LIBGCC2_INCLUDE): Augment comment. Move
-I options for gcc/include and gcc/include-fixed at the end
and make them -isystem.

Define GCC_DRIVER_HOST_INITIALIZATION for VxWorks targets

We need to perform static links by default on VxWorks, where the use
of shared libraries involves unusual steps compared to standard native
systems.

This has to be conveyed before the lang_specific_driver code gets
invoked (in particular for g++), so specs aren't available.

This change defines the GCC_DRIVER_HOST_INITIALIZATION macro for
VxWorks, to insert a -static option in case the user hasn't provided any
explicit indication on the command line of the kind of link desired.
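
The intent can be sketched as follows. This is not the code in vxworks-driver.cc: default_to_static is a hypothetical helper, and treating only -static and -shared as an explicit choice of link kind is an assumption of the sketch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: add -static unless the user already stated the
// kind of link wanted. Only -static and -shared are checked here, which
// is an assumption made for illustration.
std::vector<std::string> default_to_static(std::vector<std::string> args)
{
  for (const std::string& a : args)
    if (a == "-static" || a == "-shared")
      return args;   // explicit choice found, leave the command line alone
  // Insert right after the program name so that user options follow it.
  args.insert(args.empty() ? args.end() : args.begin() + 1,
              std::string("-static"));
  return args;
}
```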

While a HOST macro doesn't seem appropriate to control a target OS
driven behavior, this matches other uses and won't conflict as VxWorks
is not supported on any of the other configurations using this macro.

gcc/
* config/vxworks-driver.cc: New.
* config.gcc (*vxworks*): Add vxworks-driver.o in extra_gcc_objs.
* config/t-vxworks: Add vxworks-driver.o.
* config/vxworks.h (GCC_DRIVER_HOST_INITIALIZATION): New.

Prevent secondary warning from diagnostic tweak in gthr-vxworks.h

Within gthr-vxworks.h, we prevent C++ errors from missing
declarations in some system headers by prepending their inclusion
with a

#pragma GCC diagnostic ignored "-Wstrict-prototypes"

But Wstrict-prototypes is internally registered as valid for
C/ObjC only, not C++, and this trick in turn triggers a Wpragma
warning with -Wsystem-headers.

This change just arranges to ignore the secondary warning locally.
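
A sketch of the arrangement, assuming the secondary warning is controlled by the option GCC documents as -Wpragmas. The push/pop scoping and the stand-in declaration legacy_answer are assumptions for illustration, not the contents of gthr-vxworks.h.

```cpp
#include <cassert>

#pragma GCC diagnostic push
// Silence the secondary warning first: when compiling C++, GCC reports
// that -Wstrict-prototypes is not valid for this language.
#pragma GCC diagnostic ignored "-Wpragmas"
#pragma GCC diagnostic ignored "-Wstrict-prototypes"

// Hypothetical stand-in for the system header declarations that the
// real header includes at this point.
extern "C" int legacy_answer(void);

#pragma GCC diagnostic pop

extern "C" int legacy_answer(void) { return 42; }
```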

2021-02-03 Olivier Hainque <hainque@adacore.com>

* config/gthr-vxworks.h: Prevent Wpragma warning for the
pragma diagnostics on Wstrict-prototypes.

Refine guard for vxworks crtstuff spec

Working on the reintroduction of shared libraries support
(and of modules depending on shared libraries) exposed a few
test failures of simple c++ constructor tests on arm-vxworks7r2.

Investigation revealed that we were not linking the
crtstuff objects as needed from a compiler configured not to
have shared libs support, because of the ENABLE_SHARED_LIBGCC
guard in this piece of vxworks.h:

  /* Setup the crtstuff begin/end we might need for dwarf EH registration
     and/or INITFINI_ARRAY support for shared libs.  */

  #if (HAVE_INITFINI_ARRAY_SUPPORT && defined(ENABLE_SHARED_LIBGCC)) \
      || (DWARF2_UNWIND_INFO && !defined(CONFIG_SJLJ_EXCEPTIONS))
  #define VX_CRTBEGIN_SPEC "%{!shared:vx_crtbegin.o%s;:vx_crtbeginS.o%s}"

crtstuff initfini array support is meant to be leveraged for
constructors regardless of whether the compiler also happens to be
configured with shared library support, so the guard on ENABLE_SHARED_LIBGCC
here is inappropriate.

This change just removes it.

2022-09-30  Olivier Hainque <hainque@adacore.com>

gcc/
* config/vxworks.h (VX_CRTBEGIN_SPEC, VX_CRTEND_SPEC): If
HAVE_INITFINI_ARRAY_SUPPORT, pick crtstuff objects regardless
of ENABLE_SHARED_LIBGCC.

Daily bump.

Fortran: Fix ICE and wrong code for assumed-rank arrays [PR100029, PR100040]

gcc/fortran/ChangeLog:

PR fortran/100040
PR fortran/100029
* trans-expr.cc (gfc_conv_class_to_class): Add code to have
assumed-rank arrays recognized as full arrays and fix the type
of the array assignment.
(gfc_conv_procedure_call): Change order of code blocks such that
the free of ALLOCATABLE dummy arguments with INTENT(OUT) occurs
first.

gcc/testsuite/ChangeLog:

PR fortran/100029
* gfortran.dg/PR100029.f90: New test.

PR fortran/100040
* gfortran.dg/PR100040.f90: New test.

c++: make some cp_trait_kind switch statements exhaustive

This replaces the unreachable default case in some cp_trait_kind
switches with an exhaustive listing of the trait codes that we don't
expect to see, so that when adding a new trait we'll get a helpful
-Wswitch warning if we forget to handle the new trait in a relevant
switch.
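
A minimal sketch of the technique, using a hypothetical three-value enum rather than the real cp_trait_kind. With no default label, adding an enumerator without updating the switch produces a -Wswitch warning.

```cpp
#include <cassert>
#include <cstdlib>

enum class TraitKind { IsClass, IsSame, UnderlyingType };   // hypothetical subset

// Handles expression-yielding traits only.
int expr_trait_arity(TraitKind kind)
{
  switch (kind)
    {
    case TraitKind::IsClass:
      return 1;
    case TraitKind::IsSame:
      return 2;
    // Listed explicitly instead of "default:", so the compiler can flag
    // any enumerator that is added later and not handled here.
    case TraitKind::UnderlyingType:
      break;
    }
  std::abort();   // type-yielding traits are not expected in this function
}
```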

gcc/cp/ChangeLog:

* semantics.cc (trait_expr_value): Make cp_trait_kind switch
statement exhaustive.
(finish_trait_expr): Likewise.
(finish_trait_type): Likewise.

or1k: Only define TARGET_HAVE_TLS when HAVE_AS_TLS

This was found when testing buildroot with linuxthreads enabled. In
this case, the build passes --disable-tls to the toolchain during
configuration. After building the OpenRISC toolchain it was still
generating TLS code sequences and causing linker failures such as:

..../or1k-buildroot-linux-uclibc-gcc -o gpsd-3.24/gpsctl .... -lusb-1.0 -lm -lrt -lnsl
..../ld: ..../sysroot/usr/lib/libusb-1.0.so: undefined reference to `__tls_get_addr'

This patch fixes this by disabling tls for the OpenRISC target when requested
via --disable-tls.

gcc/ChangeLog:

* config/or1k/or1k.cc (TARGET_HAVE_TLS): Only define if
HAVE_AS_TLS is defined.

Tested-by: Yann E. MORIN <yann.morin@orange.com>

OpenACC: Fix struct-component-kind-1.c test

This patch is a minimal fix for the recently-added
struct-component-kind-1.c test (which is currently failing to emit one
of the errors it expects in scan output). This fragment was erroneously
omitted from the second version of the patch posted previously:

https://gcc.gnu.org/pipermail/gcc-patches/2022-September/602504.html

2022-10-01 Julian Brown <julian@codesourcery.com>

gcc/
* gimplify.cc (omp_group_base): Fix IF_PRESENT (no_create)
handling.

Improve Z flag handling on H8

This patch improves handling of the Z bit in the status register in a
variety of ways to improve either the code size or code speed on various
H8 subtargets.

For example, we can test the zero/nonzero status of the upper byte of a
16 bit register using mov.b, we can move the Z or an inverted Z into a
QImode register profitably on some subtargets. We can move Z or an
inverted Z into the sign bit on the H8/SX profitably, etc.

gcc/

* config/h8300/h8300.md (HSI2): New iterator.
(eqne_invert): Similarly.
* config/h8300/testcompare.md (testhi_upper_z): New pattern.
(cmpqi_z, cmphi_z, cmpsi_z): Likewise.
(store_z_qi, store_z_i_qi, store_z_hi, store_z_hi_sb): New
define_insn_and_splits and/or define_insns.
(store_z_hi_neg, store_z_hi_and, store_z_<mode>): Likewise.
(store_z_<mode>_neg, store_z_<mode>_and, store_z): Likewise.

c++: loop through array CONSTRUCTOR

I noticed that we were ignoring all the special rules for when to use a
simple INIT_EXPR for array initialization from a CONSTRUCTOR, because
split_nonconstant_init_1 was also passing 1 to the from_array parameter.
Arguably that's the real bug, but I think we can be flexible.

The test that I noticed this with no longer fails without it.

gcc/cp/ChangeLog:

* init.cc (build_vec_init): Clear from_array for CONSTRUCTOR
initializer.

c++: cast split_nonconstant_init return val to void

We were already converting the result of expand_vec_init_expr to void; we
need to do the same for split_nonconstant_init.

The test that I noticed this with no longer fails without it.

gcc/cp/ChangeLog:

* cp-gimplify.cc (cp_genericize_init): Also convert the result of
split_nonconstant_init to void.

Install correct patch version.

gcc/
* tree-ssa-dom.cc (record_edge_info): Install correct version of
patch.

Emit discriminators for inlined call sites.

This change is based on commit 9fa26998a63d4b22b637ed8702520819e408a694
by Dehao Chen in vendors/google/heads/gcc-4_8.

Tested on x86_64-pc-linux-gnu.

gcc/ChangeLog:

* dwarf2out.cc (add_call_src_coords_attributes): Emit discriminators for inlined call sites.

Daily bump.

More gimple const/copy propagation opportunities

While investigating a benchmark for optimization opportunities I came across a single block loop which either iterates precisely once or forever. This is an interesting scenario as we can ignore the infinite looping path and treat any PHI nodes as degenerates. So more concretely let's consider this trivial testcase:

volatile void abort (void);

void
foo (int a)
{
  int b = 0;

  while (1)
    {
      if (!a)
        break;
      b = 1;
    }

  if (b != 0)
    abort ();
}

Quick analysis shows that b's initial value is 0 and its value only changes if we enter an infinite loop.  So if we get to the test b != 0, the only possible value b could have would be 0 and the test and its true arm can be eliminated.

The DOM3 dump looks something like this:

;;   basic block 2, loop depth 0, count 118111600 (estimated locally), maybe hot
;;    prev block 0, next block 3, flags: (NEW, VISITED)
;;    pred:       ENTRY [always]  count:118111600 (estimated locally) (FALLTHRU,EXECUTABLE)
;;    succ:       3 [always]  count:118111600 (estimated locally) (FALLTHRU,EXECUTABLE)

;;   basic block 3, loop depth 1, count 1073741824 (estimated locally), maybe hot
;;    prev block 2, next block 4, flags: (NEW, VISITED)
;;    pred:       2 [always]  count:118111600 (estimated locally) (FALLTHRU,EXECUTABLE)
;;                3 [89.0% (guessed)]  count:955630224 (estimated locally) (FALSE_VALUE,EXECUTABLE)
  # b_1 = PHI <0(2), 1(3)>
  if (a_3(D) == 0)
    goto <bb 4>; [11.00%]
  else
    goto <bb 3>; [89.00%]
;;    succ:       4 [11.0% (guessed)]  count:118111600 (estimated locally) (TRUE_VALUE,EXECUTABLE)
;;                3 [89.0% (guessed)]  count:955630224 (estimated locally) (FALSE_VALUE,EXECUTABLE)

;;   basic block 4, loop depth 0, count 118111600 (estimated locally), maybe hot
;;    prev block 3, next block 5, flags: (NEW, VISITED)
;;    pred:       3 [11.0% (guessed)]  count:118111600 (estimated locally) (TRUE_VALUE,EXECUTABLE)
  if (b_1 != 0)
    goto <bb 5>; [0.00%]
  else
    goto <bb 6>; [100.00%]
;;    succ:       5 [never]  count:0 (precise) (TRUE_VALUE,EXECUTABLE)
;;                6 [always]  count:118111600 (estimated locally) (FALSE_VALUE,EXECUTABLE)

This is a good representative of what the benchmark code looks like.

The primary effect we want to capture is to realize that the test if (b_1 != 0) is always false and optimize it accordingly.

In the benchmark, this opportunity is well hidden until after the loop optimizers have completed, so the first chance to capture this case is in DOM3.  Furthermore, DOM wants loops normalized with latch blocks/edges.  So instead of bb3 looping back to itself, there's an intermediate empty block during DOM.

I originally thought this was likely to only affect the benchmark.  But when I instrumented the optimization and bootstrapped GCC, much to my surprise there were several hundred similar cases identified in GCC itself.  So it's not as benchmark specific as I'd initially feared.

Anyway, detecting this in DOM is pretty simple.   We detect the infinite loop, including the latch block.  Once we've done that, we walk the PHI nodes and attach equivalences to the appropriate outgoing edge.   That's all we need to do as the rest of DOM is already prepared to handle equivalences on edges.

gcc/
* tree-ssa-dom.cc (single_block_loop_p): New function.
(record_edge_info): Also record equivalences for the outgoing
edge of a single block loop where the condition is an invariant.

gcc/testsuite/
* gcc.dg/infinite-loop.c: New test.

Minor cleanup/prep in DOM

It's a bit weird that free_dom_edge_info leaves a dangling pointer in e->aux.
Not sure what I was thinking.

There are two callers. One wipes e->aux immediately after the call, the other
attaches a newly created object immediately after the call. So we can wipe
e->aux within the call and simplify one of the two call sites.
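
In simplified form, with hypothetical Edge and EdgeInfo types standing in for the real CFG structures, the cleanup looks roughly like this:

```cpp
#include <cassert>

struct EdgeInfo { int payload = 0; };        // hypothetical
struct Edge { EdgeInfo* aux = nullptr; };    // hypothetical

// Free the attached info and clear the pointer, so that no caller is
// left holding a dangling e.aux.
void free_edge_info(Edge& e)
{
  delete e.aux;
  e.aux = nullptr;
}

// The caller that replaces the info overwrites e.aux right away.
EdgeInfo* reset_edge_info(Edge& e)
{
  free_edge_info(e);
  e.aux = new EdgeInfo;
  return e.aux;
}
```

The caller that previously wiped the pointer itself no longer needs to, and a second call on the same edge is harmless.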

This is preparatory work for a minor optimization where we want to detect
another class of edge equivalences in DOM (until something better is available)
and either attach them to an existing edge_info structure or create a new one if
one doesn't currently exist for a given edge.

gcc/
* tree-ssa-dom.cc (free_dom_edge_info): Clear e->aux too.
(free_all_edge_infos): Do not clear e->aux here.

Document -fexcess-precision=16 in target.def

* target.def (TARGET_C_EXCESS_PRECISION): Document
-fexcess-precision=16.

Document -fexcess-precision=16 in tm.texi

I just happened to stumble on this one while trying to sort out the
RISC-V bits.

gcc/ChangeLog

* doc/tm.texi (TARGET_C_EXCESS_PRECISION): Add 16.

RISC-V: Support -fexcess-precision=16

This fixes f19a327077e ("Support -fexcess-precision=16 which will enable
FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 when backend supports _Float16.") on
RISC-V targets.

gcc/ChangeLog

PR target/106815
* config/riscv/riscv.cc (riscv_excess_precision): Add support
for EXCESS_PRECISION_TYPE_FLOAT16.

libstdc++: Remove <sstream> dependency from std::bitset::to_ulong() test

There's no need to use a stringstream to test the to_ulong() member.
This will allow the test to be used in freestanding mode.
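
For illustration only (the value and the function name below are not taken from the test), a bitset can be built directly from a binary literal:

```cpp
#include <bitset>
#include <cassert>

// Build the bitset from a binary literal (C++14) instead of parsing a
// string through a stringstream.
unsigned long to_ulong_without_streams()
{
  const std::bitset<8> bits(0b00101010);
  return bits.to_ulong();
}
```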

libstdc++-v3/ChangeLog:

* testsuite/20_util/bitset/access/to_ulong.cc: Construct bitset
from binary literal instead of using stringstream.

libstdc++: Optimize operator>> for std::bitset

We can improve performance by using a char buffer instead of
basic_string. The loop bound already means we can't overflow the buffer,
and we don't need to keep writing a null character after every character
written to the buffer.

We could just use basic_string::resize(N) to zero-init the whole string,
then overwrite those chars. But that zero-init of all N chars would be
wasted in the case where we are writing to a bitset<N> with large N, but
only end up extracting one or two chars from the stream.

With this change we just use a buffer of uninitialized chars. For a
small-ish bitset (currently <= 256) we can improve performance further
by using alloca instead of the heap.
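
The buffer approach can be sketched as below. This is a simplified model, not the libstdc++ implementation: read_bits is a hypothetical name, the buffer always lives on the stack (the alloca/heap split is omitted), and leading whitespace is not skipped.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>

template <std::size_t N>
std::bitset<N> read_bits(std::istream& in)
{
  static_assert(N > 0, "sketch requires a non-empty bitset");
  char buf[N];   // left uninitialized; only buf[0..len) is ever read
  std::size_t len = 0;
  while (len < N)   // the loop bound keeps writes inside the buffer
    {
      const int c = in.peek();
      if (c != '0' && c != '1')
        break;
      buf[len++] = static_cast<char>(in.get());
    }
  if (len == 0)
    {
      in.setstate(std::ios_base::failbit);
      return {};
    }
  return std::bitset<N>(buf, len);   // no terminating null is needed
}
```

Reading "1011x" into an 8-bit bitset with this helper consumes four characters and leaves 'x' in the stream.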

libstdc++-v3/ChangeLog:

* include/std/bitset (operator>>): Use a simple buffer instead
of std::basic_string.

libstdc++: Remove non-standard public members in std::bitset

This makes _M_copy_from_ptr, _M_copy_from_string and _M_copy_to_string
private, and declares operator<< and operator>> as friends.

Also remove the historical _M_copy_from_string and _M_copy_to_string
overloads. Those were used before DR 396 was implemented but are
not needed now. There are no tests or docs describing them, so I don't
think we intend to support them as extensions.

libstdc++-v3/ChangeLog:

* include/std/bitset (_M_copy_from_ptr, _M_copy_from_string)
(_M_copy_to_string): Change access to private.
(_M_copy_from_string(const basic_string&, size_t, size_t)):
Remove.
(_M_copy_to_string(const basic_string&)): Remove.

libstdc++: Fix broken dg-prune-output

The new pattern in the dg-prune-output directive doesn't work. Instead
of a messy regex full of leaning toothpicks, just match on the
diagnostic text instead of the header paths.

libstdc++-v3/ChangeLog:

* testsuite/20_util/bind/ref_neg.cc: Fix dg-prune-output
directive.

arm, csky: Fix C++ ICEs with _Float16 and __fp16 [PR107080]

On Fri, Sep 30, 2022 at 09:54:49AM -0400, Jason Merrill wrote:
> > Note, there is one further problem on aarch64/arm, types with HFmode
> > (_Float16 and __fp16) are there mangled as Dh (which is standard
> > Itanium mangling:
> >                   ::= Dh # IEEE 754r half-precision floating point (16 bits)
> >                   ::= DF <number> _ # ISO/IEC TS 18661 binary floating point type _FloatN (N bits)
> > so in theory is also ok, but DF16_ is more specific.  Should we just
> > change Dh to DF16_ in those backends, or should __fp16 there be distinct
> > type from _Float16 where __fp16 would mangle Dh and _Float16 DF16_ ?
>
> You argued for keeping __float128 separate from _Float128, does the same
> argument not apply to this case?

Actually, they already were distinct types that just mangled the same.
So the same issue that had to be solved on i?86, ia64 and rs6000 for
_Float64x vs. long double is a problem on arm and aarch64 with _Float16
vs. __fp16.
The following patch fixes it for arm; aarch64 has already been
changed.

> > And there is csky, which mangles __fp16 (but only if type's name is __fp16,
> > not _Float16) as __fp16, that looks clearly invalid to me as it isn't
> > valid in the mangling grammar.  So perhaps just nuke csky's mangle_type
> > and have it mangled as DF16_ by the generic code?

And seems even on csky __fp16 is distinct type from _Float16 (which is a
good thing for consistency, these 3 targets are the only ones that have
__fp16 type), so instead the patch handles it the same as on arm/aarch64,
Dh mangling for __fp16 and DF16_ for _Float16.
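
As a toy model of the resulting scheme (this is not the target hook code, and HalfType, mangle_half and mangle_unary_f are hypothetical names), the two types map to different Itanium codes, so void f(__fp16) and void f(_Float16) get distinct symbols:

```cpp
#include <cassert>
#include <string>

enum class HalfType { Fp16, Float16 };   // hypothetical model of the two types

std::string mangle_half(HalfType t)
{
  switch (t)
    {
    case HalfType::Fp16:
      return "Dh";      // __fp16
    case HalfType::Float16:
      return "DF16_";   // _Float16
    }
  return "";   // not reached for valid enumerators
}

// Mangled name of "void f(T)": prefix _Z, length-prefixed name, parameter type.
std::string mangle_unary_f(HalfType t)
{
  return "_Z1f" + mangle_half(t);
}
```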

2022-09-30  Jakub Jelinek  <jakub@redhat.com>

PR c++/107080
* config/arm/arm.cc (arm_mangle_type): Mangle just __fp16 as Dh
and _Float16 as DF16_.
* config/csky/csky.cc (csky_init_builtins): Fix a comment typo.
(csky_mangle_type): Mangle __fp16 as Dh and _Float16 as DF16_
rather than mangling __fp16 as __fp16.

* g++.target/arm/pr107080.C: New test.

diagnostics: Fix virtual location for -Wuninitialized [PR69543]

Warnings issued for -Wuninitialized have been using the spelling location of
the problematic usage, discarding any information on the location of the macro
expansion point if such usage was in a macro. This makes the warnings
impossible to control reliably with #pragma GCC diagnostic, and also discards
useful context in the diagnostic output. There seems to be no need to discard
the virtual location information, so this patch fixes that.

PR69543 was mostly about _Pragma issues which have been fixed for many years
now. The PR remains open because two of the testcases added in response to it
still have xfails, but those xfails have nothing to do with _Pragma and rather
just with the issue fixed by this patch, so the PR can be closed now as well.

The other testcase modified here, pragma-diagnostic-2.c, was explicitly
testing for the undesirable behavior that was xfailed in pr69543-3.c. I have
adjusted that and also added a new testcase verifying all 3 types of warning
that come from tree-ssa-uninit.cc get the proper location information now.

gcc/ChangeLog:

PR preprocessor/69543
* tree-ssa-uninit.cc (warn_uninit): Stop stripping macro tracking
information away from the diagnostic location.
(maybe_warn_read_write_only): Likewise.
(maybe_warn_operand): Likewise.

gcc/testsuite/ChangeLog:

PR preprocessor/69543
* c-c++-common/pr69543-3.c: Remove xfail.
* c-c++-common/pr69543-4.c: Likewise.
* gcc.dg/cpp/pragma-diagnostic-2.c: Adjust test for new behavior.
* c-c++-common/pragma-diag-16.c: New test.

aarch64: Fix C++ ICEs with _Float16 and __fp16 [PR107080]

On Fri, Sep 30, 2022 at 09:54:49AM -0400, Jason Merrill wrote:
> > Note, there is one further problem on aarch64/arm, types with HFmode
> > (_Float16 and __fp16) are there mangled as Dh (which is standard
> > Itanium mangling:
> >                   ::= Dh # IEEE 754r half-precision floating point (16 bits)
> >                   ::= DF <number> _ # ISO/IEC TS 18661 binary floating point type _FloatN (N bits)
> > so in theory is also ok, but DF16_ is more specific.  Should we just
> > change Dh to DF16_ in those backends, or should __fp16 there be distinct
> > type from _Float16 where __fp16 would mangle Dh and _Float16 DF16_ ?
>
> You argued for keeping __float128 separate from _Float128, does the same
> argument not apply to this case?

Actually, they already were distinct types that just mangled the same.
So the same issue that had to be solved on i?86, ia64 and rs6000 for
_Float64x vs. long double is a problem on arm and aarch64 with _Float16
vs. __fp16.
The following patch fixes it so far for aarch64.

2022-09-30  Jakub Jelinek  <jakub@redhat.com>

PR c++/107080
* config/aarch64/aarch64.cc (aarch64_mangle_type): Mangle just __fp16
as Dh and _Float16 as DF16_.

* g++.target/aarch64/pr107080.C: New test.

i386, rs6000, ia64, s390: Fix C++ ICEs with _Float64x or _Float128 [PR107080]

The following testcase ICEs on x86 as well as ppc64le (the latter
with -mabi=ieeelongdouble), because _Float64x there isn't mangled as
DF64x but e or u9__ieee128 instead.
Those are the mangling that should be used for the non-standard
types with the same mode or for long double, but not for _Float64x.
All the 4 mangle_type targhook implementations start with
type = TYPE_MAIN_VARIANT (type);
so I think it is cleanest to handle it the same in all and return NULL
before the switches on mode or whatever other tests.
s390 doesn't actually have a bug, but while I was there, having
type = TYPE_MAIN_VARIANT (type);
if (TYPE_MAIN_VARIANT (type) == long_double_type_node)
looked useless to me.

Note, there is one further problem on aarch64/arm, types with HFmode
(_Float16 and __fp16) are there mangled as Dh (which is standard
Itanium mangling:
                 ::= Dh # IEEE 754r half-precision floating point (16 bits)
                 ::= DF <number> _ # ISO/IEC TS 18661 binary floating point type _FloatN (N bits)
so in theory is also ok, but DF16_ is more specific.  Should we just
change Dh to DF16_ in those backends, or should __fp16 there be distinct
type from _Float16 where __fp16 would mangle Dh and _Float16 DF16_ ?
And there is csky, which mangles __fp16 (but only if type's name is __fp16,
not _Float16) as __fp16, that looks clearly invalid to me as it isn't
valid in the mangling grammar.  So perhaps just nuke csky's mangle_type
and have it mangled as DF16_ by the generic code?

2022-09-30  Jakub Jelinek  <jakub@redhat.com>

PR c++/107080
* config/i386/i386.cc (ix86_mangle_type): Always return NULL
for float128_type_node or float64x_type_node, don't check
float128t_type_node later on.
* config/ia64/ia64.cc (ia64_mangle_type): Always return NULL
for float128_type_node or float64x_type_node.
* config/rs6000/rs6000.cc (rs6000_mangle_type): Likewise.
Don't check float128_type_node later on.
* config/s390/s390.cc (s390_mangle_type): Don't use
TYPE_MAIN_VARIANT on type which was set to TYPE_MAIN_VARIANT
a few lines earlier.

* g++.dg/cpp23/ext-floating11.C: New test.

testsuite: Windows paths use \ and not /

libstdc++-v3/ChangeLog:

* testsuite/20_util/bind/ref_neg.cc: Prune Windows paths too.

Co-Authored-By: Yvan ROUX <yvan.roux@foss.st.com>
Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>

testsuite: Only run test on target if VMA == LMA

Checking that the triplet matches arm*-*-eabi (or msp430-*-*) is not
enough to know if the execution will enter an endless loop, or if it
will give a meaningful result. As the execution test only works when
VMA and LMA are equal, make sure that this condition is met.

gcc/ChangeLog:

* doc/sourcebuild.texi: Document new vma_equals_lma effective
target check.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp (check_effective_target_vma_equals_lma): New.
* c-c++-common/torture/attr-noinit-1.c: Require VMA == LMA to run.
* c-c++-common/torture/attr-noinit-2.c: Likewise.
* c-c++-common/torture/attr-noinit-3.c: Likewise.
* c-c++-common/torture/attr-persistent-1.c: Likewise.
* c-c++-common/torture/attr-persistent-3.c: Likewise.

Co-Authored-By: Yvan ROUX <yvan.roux@foss.st.com>
Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>

testsuite: Do not prefix linker script with "-Wl,"

The linker script should not be prefixed with "-Wl," - it's not an
input file and does not interfere with the new dump output filename
strategy.

gcc/testsuite/ChangeLog:

* lib/gcc-defs.exp: Do not prefix linker script with "-Wl,".

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>

RISC-V: Add '-m[no]-csr-check' option in gcc.

Add the -m[no]-csr-check option on the gcc side. When -mcsr-check is enabled,
gcc adds csr-check to the .option section and passes it to the assembler.

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_file_start): New .option.
* config/riscv/riscv.opt: New options.
* doc/invoke.texi: New definitions.

c++: streamline built-in trait addition process

Adding a new built-in trait currently involves manual boilerplate
consisting of defining an rid enumerator for the identifier as well as a
corresponding cp_trait_kind enumerator and handling them in various switch
statements, the exact set of which depends on whether the proposed trait
yields (and thus is recognized as) a type or an expression.

To streamline the process, this patch adds a central cp-trait.def file
that tabulates the essential details about each built-in trait (whether
it yields a type or an expression, its code, its spelling and its arity)
and uses this file to automate away the manual boilerplate. It also
migrates all the existing C++-specific built-in traits to use this
approach.

After this change, adding a new built-in trait just entails declaring
it in cp-trait.def and defining its behavior in finish_trait_expr/type
(and handling it in diagnose_trait_expr, if it's an expression-yielding
trait).
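
The general technique is the X-macro pattern. The sketch below keeps the table in a macro so that it fits in one file. The names TRAIT_TABLE, TraitInfo and trait_info are hypothetical, and the three rows are only illustrative, not a copy of cp-trait.def.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for a .def file: one row per trait
// (code, spelling, arity).
#define TRAIT_TABLE(X)          \
  X(IS_CLASS, "__is_class", 1)  \
  X(IS_ENUM, "__is_enum", 1)    \
  X(IS_SAME, "__is_same", 2)

// First expansion: one enumerator per row.
enum TraitKind
{
#define X(code, name, arity) TK_##code,
  TRAIT_TABLE(X)
#undef X
  TK_COUNT
};

struct TraitInfo { const char* name; int arity; };

// Second expansion: a lookup table indexed by TraitKind.
const TraitInfo trait_info[] =
{
#define X(code, name, arity) { name, arity },
  TRAIT_TABLE(X)
#undef X
};
```

Adding a row to the table extends both the enum and the lookup array, with no other boilerplate to update.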

gcc/c-family/ChangeLog:

* c-common.cc (c_common_reswords): Use cp/cp-trait.def to handle
C++ traits.
* c-common.h (enum rid): Likewise.

gcc/cp/ChangeLog:

* constraint.cc (diagnose_trait_expr): Likewise.
* cp-objcp-common.cc (names_builtin_p): Likewise.
* cp-tree.h (enum cp_trait_kind): Likewise.
* cxx-pretty-print.cc (pp_cxx_trait): Likewise.
* parser.cc (cp_keyword_starts_decl_specifier_p): Likewise.
(cp_parser_primary_expression): Likewise.
(cp_parser_trait): Likewise.
(cp_parser_simple_type_specifier): Likewise.
* cp-trait.def: New file.

testsuite: Colon is reserved on Windows

The ':' is reserved in filenames on Windows.

Without this patch, the test case fails with:
.../ben-1_a.C:4:8: error: failed to write compiled module: Invalid argument
.../ben-1_a.C:4:8: note: compiled module file is 'partitions/module:import.mod'

gcc/testsuite:

* g++.dg/modules/ben-1.map: Replace the colon with dash.
* g++.dg/modules/ben-1_a.C: Likewise.

Co-Authored-By: Yvan ROUX <yvan.roux@foss.st.com>
Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>

libstdc++: Add missing <bits/stl_algobase.h> include to <bitset>

libstdc++-v3/ChangeLog:

* include/std/bitset: Include <bits/stl_algobase.h>.

rs6000: Rework ELFv2 support for -fpatchable-function-entry* [PR99888]

As PR99888 and its related PRs show, the current support for
-fpatchable-function-entry on powerpc ELFv2 doesn't work
well when a global entry exists.  For example, with the
command line option -fpatchable-function-entry=3,2, we got
the following without this patch:

  .LPFE1:
  nop
  nop
  .type   foo, @function
  foo:
  nop
  .LFB0:
  .cfi_startproc
  .LCF0:
  0:      addis 2,12,.TOC.-.LCF0@ha
  addi 2,2,.TOC.-.LCF0@l
  .localentry     foo,.-foo

This assembly is unexpected, since the patched nops have
no effect when the function is entered through the local entry.

This patch updates the nops so that they are patched before and
after the local entry.  The result looks like:

  .type   foo, @function
  foo:
  .LFB0:
  .cfi_startproc
  .LCF0:
  0:      addis 2,12,.TOC.-.LCF0@ha
  addi 2,2,.TOC.-.LCF0@l
  nop
  nop
  .localentry     foo,.-foo
  nop
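
For reference, the same N,M split can also be requested for a single
function.  The following is a minimal sketch, assuming GCC's documented
patchable_function_entry function attribute; the name foo mirrors the
listings above and the body is purely illustrative.

```cpp
// Requests the same layout as compiling this one function with
// -fpatchable-function-entry=3,2: three nops in total, two of them
// placed before the function entry point.
__attribute__((patchable_function_entry(3, 2)))
int foo(int x)
{
  return x + 1;  // illustrative body only
}
```

Where exactly the nops land relative to the global and local entry
points on ELFv2 is what this patch changes; the attribute itself only
states the counts.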

PR target/99888
PR target/105649

gcc/ChangeLog:

* doc/invoke.texi (option -fpatchable-function-entry): Adjust the
documentation for PowerPC ELFv2 ABI dual entry points.
* config/rs6000/rs6000-internal.h
(rs6000_print_patchable_function_entry): New function declaration.
* config/rs6000/rs6000-logue.cc (rs6000_output_function_prologue):
Support patchable-function-entry by emitting nops before and after
local entry for the function that needs global entry.
* config/rs6000/rs6000.cc (rs6000_print_patchable_function_entry): Skip
the function that needs global entry till global entry has been
emitted.
* config/rs6000/rs6000.h (struct machine_function): New bool member
global_entry_emitted.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr99888-1.c: New test.
* gcc.target/powerpc/pr99888-2.c: New test.
* gcc.target/powerpc/pr99888-3.c: New test.
* gcc.target/powerpc/pr99888-4.c: New test.
* gcc.target/powerpc/pr99888-5.c: New test.
* gcc.target/powerpc/pr99888-6.c: New test.
* c-c++-common/patchable_function_entry-default.c: Adjust for
powerpc_elfv2 to avoid compilation error.

rs6000/test: Adjust pr104992.c with vect_int_mod [PR106516]

As PR106516 shows, we can get unexpected gimple outputs for
function thud on some targets which support the modulus operation
for vector int. This patch introduces one effective target
vect_int_mod for it, then adjusts the test case with it.

PR testsuite/106516

gcc/testsuite/ChangeLog:

* gcc.dg/pr104992.c: Adjust with vect_int_mod.
* lib/target-supports.exp (check_effective_target_vect_int_mod): New
effective target.

testsuite: [arm] Relax expected register names in MVE tests

These two tests have hardcoded q0 as destination/source of load/store
instructions, but this register is actually used only under
-mfloat-abi=hard. When using -mfloat-abi=softfp, other registers
(eg. q3) can be used to transfer function arguments from core
registers to MVE registers, making the expected regexp fail.

This small patch replaces q0 with q[0-7] to accept any 'q' register.
In several places where we had q[0-9]+, replace it with q[0-7] as MVE
only has q0-q7 registers.
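
The difference between the two patterns can be sketched as follows.
The tests themselves use DejaGnu scan-assembler regexps; this sketch
uses std::regex instead, and both helper names are hypothetical.

```cpp
#include <regex>
#include <string>

// Returns true if `name` is exactly one of the eight MVE vector
// registers q0..q7.
bool is_mve_q_register(const std::string& name)
{
  static const std::regex mve_q("q[0-7]");
  return std::regex_match(name, mve_q);
}

// The looser pattern that the patch replaces in several places: it
// also accepts register numbers that MVE does not have.
bool is_any_q_number(const std::string& name)
{
  static const std::regex any_q("q[0-9]+");
  return std::regex_match(name, any_q);
}
```

With these helpers, "q3" is accepted by both, while "q12" is accepted
only by the looser pattern.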

OK for trunk?

gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/mve_load_memory_modes.c: Update expected
registers.
* gcc.target/arm/mve/mve_store_memory_modes.c: Likewise.

tree-optimization/107095 - fix typo in .MASK_STORE DSE handling

We were using the size of the mask argument rather than the data
argument for the ao_ref.

PR tree-optimization/107095
* tree-ssa-dse.cc (initialize_ao_ref_for_dse): Use data arg
for .MASK_STORE size.

Fortran: Update is_device_ptr for OpenMP 5.1 [PR105318]

OpenMP 5.1 added has_device_addr and relaxed the restrictions for
is_device_ptr, including processing non-type(c_ptr) arguments as
if has_device_addr was used. (There is a semantic difference.)

For completeness, the same change was made for 'use_device_ptr',
where non-type(c_ptr) arguments now use use_device_addr.

Finally, a warning for 'device(omp_{initial,invalid}_device)' was
silenced along the way, as it affected the new testcase.

PR fortran/105318

gcc/fortran/ChangeLog:
* openmp.cc (resolve_omp_clauses): Update is_device_ptr restrictions
for OpenMP 5.1 and map to has_device_addr where applicable; map
use_device_ptr to use_device_addr where applicable.
Silence integer-range warning for device(omp_{initial,invalid}_device).

libgomp/ChangeLog:
* testsuite/libgomp.fortran/is_device_ptr-2.f90: New test.

gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/is_device_ptr-1.f90: Remove dg-error.
* gfortran.dg/gomp/is_device_ptr-2.f90: Likewise.
* gfortran.dg/gomp/is_device_ptr-3.f90: Update tree-scan-dump.

Arrange to --disable-shared by default for VxWorks

This change makes sure that shared libraries for VxWorks are
only built on explicit request, when configured with --enable-shared.

As the support to build shared libs gets in very incrementally,
this provides us with a robust way to guard the relevant pieces
and reduce the risks of accidentally breaking a platform not yet
ready for it.

2022-09-30 Olivier Hainque <hainque@adacore.com>

* configure.ac (*vxworks*): If enable_shared is not
set, set to "no" and add --disable-shared to target and
host_configargs.
* configure: Regenerate.

c++: reduce redundant TARGET_EXPR

An experiment led me to notice that in some cases we were ending up with
TARGET_EXPR initialized by TARGET_EXPR, which isn't useful.

The target_expr_needs_replace change won't make a difference in most cases,
since cp_genericize_init will have already expanded VEC_INIT_EXPR by the
time we consider it, but it is correct.

gcc/cp/ChangeLog:

* cp-gimplify.cc (cp_fold_r) [TARGET_EXPR]: Collapse
TARGET_EXPR within TARGET_EXPR.
* constexpr.cc (cxx_eval_outermost_constant_expr): Avoid
adding redundant TARGET_EXPR.
* cp-tree.h (target_expr_needs_replace): VEC_INIT_EXPR doesn't.

Daily bump.

c: C2x noreturn attribute

C2x adds a standard [[noreturn]] attribute (which can also be spelt
[[_Noreturn]] for use with <stdnoreturn.h>), so allowing non-returning
functions to be declared in a manner compatible with C++; the
_Noreturn function specifier remains available but is marked
obsolescent.

Implement this attribute.  It's more restricted than GNU
__attribute__ ((noreturn)) - that allows function pointers but using
the standard attribute on a function pointer is a constraint
violation.  Thus, the attribute gets its own handler that checks for a
FUNCTION_DECL before calling the handler for the GNU attribute.  Tests
for the attribute are based on those for C11 _Noreturn and for other
C2x attributes.
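
A minimal sketch of the shared spelling, written in C++ (where
[[noreturn]] already exists and which the C2x attribute is meant to be
compatible with).  The function names are hypothetical, and a C
translation unit would end the function with abort () or similar
rather than throwing.

```cpp
#include <stdexcept>
#include <string>

// Never returns normally: it always leaves by throwing.
[[noreturn]] void fail(const std::string& message)
{
  throw std::runtime_error(message);
}

int checked_div(int a, int b)
{
  if (b == 0)
    fail("division by zero");  // control never continues past this call
  return a / b;
}
```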

Bootstrapped with no regressions for x86_64-pc-linux-gnu.

gcc/c-family/
* c-lex.cc (c_common_has_attribute): Handle noreturn attribute for
C.

gcc/c/
* c-decl.cc (handle_std_noreturn_attribute): New function.
(std_attribute_table): Add _Noreturn and noreturn.

gcc/testsuite/
* gcc.dg/c2x-attr-noreturn-1.c, gcc.dg/c2x-attr-noreturn-2.c,
gcc.dg/c2x-attr-noreturn-3.c: New tests.
* gcc.dg/c2x-has-c-attribute-2.c: Also test __has_c_attribute for
noreturn attribute.

Process unsigned overflow relations for plus and minus in range-ops.

If a relation is available, calculate overflow and normal ranges. Then
apply as appropriate.
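
As an illustration of why a relation between the result and an operand
is informative for unsigned arithmetic, consider the source-level
patterns below.  This is only a sketch of the underlying idea, not the
range-ops implementation, and the function names are hypothetical.

```cpp
#include <climits>

// For unsigned operands, sum = a + b wraps modulo 2^N.  The relation
// "sum < a" holds exactly when the addition wrapped.
bool add_wrapped(unsigned a, unsigned b)
{
  unsigned sum = a + b;
  return sum < a;
}

// Mirror image for subtraction: "diff > a" holds exactly when
// a - b wrapped, that is, when b > a.
bool sub_wrapped(unsigned a, unsigned b)
{
  unsigned diff = a - b;
  return diff > a;
}
```

Knowing which side of the relation holds therefore tells us whether
the overflow range or the normal range applies to the operands.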

gcc/
* range-op.cc (plus_minus_ranges): New.
(adjust_op1_for_overflow): New.
(operator_plus::op1_range): Use new adjustment.
(operator_plus::op2_range): Ditto.
(operator_minus::op1_range): Ditto.
* value-relation.h (relation_lt_le_gt_ge_p): New.

gcc/testsuite/
* gcc.dg/tree-ssa/pr79095.c: Test evrp pass rather than vrp1.

Refine ranges using relations in GORI.

This allows GORI to recognize when a relation passed in applies to the
2 operands of the current statement. Check to see if further range
refinement is possible before proceeding.

* gimple-range-gori.cc (gori_compute::refine_using_relation): New.
(gori_compute::compute_operand1_range): Invoke
refine_using_relation when applicable.
(gori_compute::compute_operand2_range): Ditto.
* gimple-range-gori.h (class gori_compute): Adjust prototypes.

Track value_relations in GORI.

This allows GORI to recognize and pass relations along the calculation chain.
This will allow relations between the LHS and the operand being calculated
to be utilized in op1_range and op2_range.

* gimple-range-gori.cc (gori_compute::compute_operand_range):
Create a relation record and pass it along when possible.
(gori_compute::compute_operand1_range): Pass relation along.
(gori_compute::compute_operand2_range): Ditto.
(gori_compute::compute_operand1_and_operand2_range): Ditto.
* gimple-range-gori.h (class gori_compute): Adjust prototypes.
* gimple-range-op.cc (gimple_range_op_handler::calc_op1): Pass
relation to op1_range call.
(gimple_range_op_handler::calc_op2): Pass relation to op2_range call.
* gimple-range-op.h (class gimple_range_op_handler): Adjust
prototypes.

Move class value_relation to the header file.

* value-relation.cc (class value_relation): Move to .h file.
(value_relation::set_relation): Ditto.
(value_relation::value_relation): Ditto.
* value-relation.h (class value_relation): Move from .cc file.
(value_relation::set_relation): Ditto.
(value_relation::value_relation): Ditto.

Audit op1_range and op2_range for undefined LHS.

If the LHS is undefined, GORI should cease looking. There are numerous
places where this happens, and a few potential traps.

* range-op.cc (operator_minus::op2_range): Check for undefined.
(operator_mult::op1_range): Ditto.
(operator_exact_divide::op1_range): Ditto.
(operator_lshift::op1_range): Ditto.
(operator_rshift::op1_range): Ditto.
(operator_cast::op1_range): Ditto.
(operator_bitwise_and::op1_range): Ditto.
(operator_bitwise_or::op1_range): Ditto.
(operator_trunc_mod::op1_range): Ditto.
(operator_trunc_mod::op2_range): Ditto.
(operator_bitwise_not::op1_range): Ditto.
(pointer_or_operator::op1_range): Ditto.
(range_op_handler::op1_range): Ditto.
(range_op_handler::op2_range): Ditto.

Remove undefined behaviour from testcase.

There was a patch posted to remove the undefined behaviour from this
testcase, but it appears never to have been applied.

gcc/testsuite/
PR tree-optimization/102892
* gcc.dg/pr102892-1.c: Remove undefined behaviour.

c++: implicit lookup of std::initializer_list [PR102576]

Here the lookup for the implicit use of std::initializer_list fails
because we do it using get_namespace_binding, which isn't import aware.
Fix this by using lookup_qualified_name instead.

PR c++/102576

gcc/cp/ChangeLog:

* pt.cc (listify): Use lookup_qualified_name instead of
get_namespace_binding.

gcc/testsuite/ChangeLog:

* g++.dg/modules/pr102576_a.H: New test.
* g++.dg/modules/pr102576_b.C: New test.

c++: fix triviality of class with unsatisfied op=

cxx20_pair is trivially copyable because it has a trivial copy constructor
and only a deleted copy assignment operator; the non-triviality of the
unsatisfied copy assignment overload is not considered.

gcc/cp/ChangeLog:

* class.cc (check_methods): Call constraints_satisfied_p.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/cond-triv3.C: New test.

libstdc++: [_GLIBCXX_INLINE_VERSION] Add gdb pretty print for _GLIBCXX_DEBUG

In _GLIBCXX_DEBUG mode containers are in std::__debug namespace but not template
parameters. In _GLIBCXX_INLINE_VERSION mode most types are in std::__8 namespace but
not std::__debug containers. We need to register specific type printers for this
combination.

libstdc++-v3/ChangeLog:

* python/libstdcxx/v6/printers.py (add_one_template_type_printer): Register
printer for types in std::__debug namespace with template parameters in std::__8
namespace.

Improve comments and INITFINI macro use in vxcrtstuff.c

This change augments the comment attached to the use of auto-host.h
in vxcrtstuff.c to better describe the reason for including it and
for the associated series of #undef directives.

It also augments the comment on __dso_handle and removes a redundant
guard on HAVE_INITFINI_ARRAY_SUPPORT for the shared version of the
objects, nested within a section guarded on USE_INITFINI_ARRAY.

2022-09-29 Olivier Hainque <hainque@adacore.com>

libgcc/
* config/vxcrtstuff.c: Improve the comment attached to the use
of auto-host.h and of __dso_handle. Remove redundant guard on
HAVE_INITFINI_ARRAY_SUPPORT within a USE_INITFINI_ARRAY section.

c++: check DECL_INITIAL for constexpr

We were overlooking non-potentially-constant bits in variable initializers
because we didn't walk into DECL_INITIAL.

gcc/cp/ChangeLog:

* constexpr.cc (potential_constant_expression_1): Look into
DECL_INITIAL. Use location wrappers.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/constexpr-local4.C: Expect error sooner.
* g++.dg/cpp2a/consteval24.C: Likewise.
* g++.dg/cpp2a/consteval7.C: Likewise.
* g++.dg/cpp2a/inline-asm3.C: Likewise.

c++: fix class-valued ?: extension

When the gimplifier encounters the same TARGET_EXPR twice, it evaluates
TARGET_EXPR_INITIAL the first time and clears it so that the later
evaluation is just the temporary. With this testcase, using the extension
to treat an omitted middle operand as repeating the first operand, that led
to doing a bitwise copy of the S(1) temporary on return rather than properly
calling the copy constructor.

We can't use S(1) to initialize the return value here anyway, because we
need to materialize it into a temporary so we can convert it to bool and
determine which arm we're evaluating. So let's just treat the middle
operand as an xvalue.
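
For readers unfamiliar with the extension, here is a minimal sketch of
the omitted-middle-operand form.  It uses plain int rather than a class
type, so it shows only the syntax and the single evaluation of the
first operand, not the copy-constructor issue fixed here; all names are
hypothetical.

```cpp
// GNU extension: "a ?: c" behaves like "a ? a : c", except that `a`
// is evaluated only once.
int first_nonzero(int a, int fallback)
{
  return a ?: fallback;
}

int calls = 0;

int next_value()
{
  return ++calls;
}

// next_value() runs exactly once here, even though its result is used
// both as the condition and as the value of the expression.
int demo()
{
  return next_value() ?: -1;
}
```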

PR c++/93046

gcc/cp/ChangeLog:

* call.cc (build_conditional_expr): For a?:c extension, treat
a reused class prvalue as an xvalue.

gcc/testsuite/ChangeLog:

* g++.dg/ext/cond4.C: Add runtime test.

c++: reduce temporaries in ?:

When the sides of ?: are class prvalues, we wrap the COND_EXPR in a
TARGET_EXPR so that both sides will initialize the same temporary. But in
this case we were stripping the outer TARGET_EXPR and conditionally creating
different temporaries, unnecessarily using extra stack. The
recently added TARGET_EXPR_NO_ELIDE flag avoids this.

gcc/cp/ChangeLog:

* call.cc (build_conditional_expr): Set TARGET_EXPR_NO_ELIDE on the
outer TARGET_EXPR.

gcc/testsuite/ChangeLog:

* g++.dg/tree-ssa/cond-temp1.C: New test.

amdgcn: remove unused variable

This was left over from a previous version of the SIMD clone patch.

gcc/ChangeLog:

* config/gcn/gcn.cc (gcn_simd_clone_compute_vecsize_and_simdlen):
Remove unused elt_bits variable.

Comment about HAVE_INITFINI_ARRAY_SUPPORT in vxworks.h

Explain that we rely on compiler .c files
to include auto-host.h before target configuration headers.

2022-09-29 Olivier Hainque <hainque@adacore.com>

gcc/
* config/vxworks.h: Add comment on our use of
HAVE_INITFINI_ARRAY_SUPPORT.

Add an mcmodel=large multilib for aarch64-vxworks

This makes good sense in general anyway, and in particular
with forthcoming support for shared libraries, which will
work for mrtp alone but not yet for mrtp+mcmodel=large.

2022-09-29 Olivier Hainque <hainque@adacore.com>

gcc/
* config/aarch64/t-aarch64-vxworks: Request multilib
variants for mcmodel=large.

Remove TARGET_FLOAT128_ENABLE_TYPE setting for VxWorks

We have, in vxworks.h:

/* linux64.h enables this, not supported in vxWorks. */
#undef TARGET_FLOAT128_ENABLE_TYPE
#define TARGET_FLOAT128_ENABLE_TYPE 0

We inherit linux64.h for a few reasons, but don't really support
float128 for vxworks, so the setting made sense.

Many tests rely on the linux default (1) though, so resetting is
causing lots of failures on compilation tests that would pass otherwise.

Not resetting lets users write code declaring float128
objects but linking will typically fail at some point, so
there's no real adverse effect.

Bottom line is we don't have any particular incentive to alter
the default, whatever the default, so better leave the parameter
alone.

2022-09-29 Olivier Hainque <hainque@adacore.com>

gcc/
* config/rs6000/vxworks.h (TARGET_FLOAT128_ENABLE_TYPE): Remove
resetting to 0.

Robustify DWARF2_UNWIND_INFO handling in vx-common.h

This adjusts vx-common.h to #define DWARF2_UNWIND_INFO to 0
when ARM_UNWIND_INFO is set, preventing defaults.h from
possibly setting DWARF2_UNWIND_INFO to 1 (as well) on its own
afterwards if the macro isn't defined.
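
The ordering issue can be sketched with plain preprocessor logic.  The
second block below is a simplified stand-in for a later defaults-style
header, not a copy of defaults.h, and the helper function exists only
so the result can be inspected.

```cpp
// Stand-in for the target configuration: pretend ARM EH unwinding
// has been selected.
#define ARM_UNWIND_INFO 1

// What the commit describes for vx-common.h: pin the macro to 0
// explicitly instead of leaving it undefined.
#if ARM_UNWIND_INFO
# undef DWARF2_UNWIND_INFO
# define DWARF2_UNWIND_INFO 0
#endif

// Simplified stand-in for a later header that supplies a default
// only when the macro is still undefined.
#ifndef DWARF2_UNWIND_INFO
# define DWARF2_UNWIND_INFO 1
#endif

static_assert(DWARF2_UNWIND_INFO == 0,
              "the explicit 0 survives the later default");

constexpr int dwarf2_unwind_info_setting()
{
  return DWARF2_UNWIND_INFO;
}
```

Had the first block left the macro undefined, the later default would
have set it to 1.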

2022-09-29 Olivier Hainque <hainque@adacore.com>

gcc/
* config/vx-common.h (DWARF2_UNWIND_INFO): #define to 0
when ARM_UNWIND_INFO is set.

OpenACC: whole struct vs. component mappings (PR107028)

This patch fixes an ICE when both a complete struct variable and
components of that struct are mapped on the same directive for OpenACC,
using a modified version of the scheme used for OpenMP in the following
patch:

  https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601558.html

A new function has been added to make sure that the mapping kinds of
the whole struct and the member access are compatible -- conservatively,
so as not to copy more to/from the device than the user expects.

This version of the patch uses a different method to detect duplicate
clauses for OpenACC in oacc_resolve_clause_dependencies, and removes
the now-redundant check in omp_accumulate_sibling_list.  (The latter
check would no longer trigger when we map the whole struct on the same
directive because the component-mapping clauses are now deleted before
the check is executed.)

2022-09-28  Julian Brown  <julian@codesourcery.com>

gcc/
PR middle-end/107028
* gimplify.cc (omp_check_mapping_compatibility,
oacc_resolve_clause_dependencies): New functions.
(omp_accumulate_sibling_list): Remove redundant duplicate clause
detection for OpenACC.
(build_struct_sibling_lists): Skip deleted groups.  Don't build sibling
list for struct variables that are fully mapped on the same directive
for OpenACC.
(gimplify_scan_omp_clauses): Call oacc_resolve_clause_dependencies.

gcc/testsuite/
PR middle-end/107028
* c-c++-common/goacc/struct-component-kind-1.c: New test.
* g++.dg/goacc/pr107028-1.C: New test.
* g++.dg/goacc/pr107028-2.C: New test.
* gfortran.dg/goacc/mapping-tests-5.f90: New test.

c++: implement __remove_cv, __remove_reference and __remove_cvref

This implements builtins for std::remove_cv, std::remove_reference and
std::remove_cvref using TRAIT_TYPE from the previous patch.
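
A minimal sketch of how user code might pick up one of these builtins
when it is available, assuming the __has_builtin check that the
testsuite entry below exercises.  The alias name strip_cvref_t is
hypothetical, and the fallback uses only standard library traits.

```cpp
#include <type_traits>

// Older compilers may not provide __has_builtin at all.
#ifndef __has_builtin
# define __has_builtin(x) 0
#endif

#if __has_builtin(__remove_cvref)
// The compiler computes the type directly, without instantiating a
// class template.
template <class T>
using strip_cvref_t = __remove_cvref(T);
#else
// Portable fallback built from the standard library traits.
template <class T>
using strip_cvref_t =
    typename std::remove_cv<typename std::remove_reference<T>::type>::type;
#endif

static_assert(std::is_same<strip_cvref_t<const volatile int&>, int>::value,
              "cv-qualifiers and the reference are both stripped");
static_assert(std::is_same<strip_cvref_t<int&&>, int>::value,
              "rvalue references are stripped too");
```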

gcc/c-family/ChangeLog:

* c-common.cc (c_common_reswords): Add __remove_cv,
__remove_reference and __remove_cvref.
* c-common.h (enum rid): Add RID_REMOVE_CV, RID_REMOVE_REFERENCE
and RID_REMOVE_CVREF.

gcc/cp/ChangeLog:

* constraint.cc (diagnose_trait_expr): Handle CPTK_REMOVE_CV,
CPTK_REMOVE_REFERENCE and CPTK_REMOVE_CVREF.
* cp-objcp-common.cc (names_builtin_p): Likewise.
* cp-tree.h (enum cp_trait_kind): Add CPTK_REMOVE_CV,
CPTK_REMOVE_REFERENCE and CPTK_REMOVE_CVREF.
* cxx-pretty-print.cc (pp_cxx_trait): Handle CPTK_REMOVE_CV,
CPTK_REMOVE_REFERENCE and CPTK_REMOVE_CVREF.
* parser.cc (cp_keyword_starts_decl_specifier_p): Return true
for RID_REMOVE_CV, RID_REMOVE_REFERENCE and RID_REMOVE_CVREF.
(cp_parser_trait): Handle RID_REMOVE_CV, RID_REMOVE_REFERENCE
and RID_REMOVE_CVREF.
(cp_parser_simple_type_specifier): Likewise.
* semantics.cc (finish_trait_type): Likewise.

libstdc++-v3/ChangeLog:

* include/bits/unique_ptr.h (unique_ptr<_Tp[], _Dp>): Remove
__remove_cv and use __remove_cv_t instead.

gcc/testsuite/ChangeLog:

* g++.dg/ext/has-builtin-1.C: Test existence of __remove_cv,
__remove_reference and __remove_cvref.
* g++.dg/ext/remove_cv.C: New test.
* g++.dg/ext/remove_reference.C: New test.
* g++.dg/ext/remove_cvref.C: New test.

c++: introduce TRAIT_TYPE alongside TRAIT_EXPR

We already have generic support for predicate-like traits that yield a
boolean value via TRAIT_EXPR, but we lack the same support for traits
that yield a type instead of a value.  Such support would streamline
implementing efficient builtins for the standard library type traits.

To that end this patch implements a generic TRAIT_TYPE type alongside
TRAIT_EXPR, and reimplements the existing UNDERLYING_TYPE builtin trait
using this new TRAIT_TYPE.

gcc/cp/ChangeLog:

* cp-objcp-common.cc (cp_common_init_ts): Replace
UNDERLYING_TYPE with TRAIT_TYPE.
* cp-tree.def (TRAIT_TYPE): Define.
(UNDERLYING_TYPE): Remove.
* cp-tree.h (TRAIT_TYPE_KIND_RAW): Define.
(TRAIT_TYPE_KIND): Define.
(TRAIT_TYPE_TYPE1): Define.
(TRAIT_TYPE_TYPE2): Define.
(WILDCARD_TYPE_P): Return true for TRAIT_TYPE.
(finish_trait_type): Declare.
* cxx-pretty-print.cc (cxx_pretty_printer::primary_expression):
Adjust after renaming pp_cxx_trait_expression.
(cxx_pretty_printer::simple_type_specifier) <case TRAIT_TYPE>:
New.
(cxx_pretty_printer::type_id): Replace UNDERLYING_TYPE with
TRAIT_TYPE.
(pp_cxx_trait_expression): Rename to ...
(pp_cxx_trait): ... this.  Handle TRAIT_TYPE as well.  Correct
pretty printing of the trailing arguments.
* cxx-pretty-print.h (pp_cxx_trait_expression): Rename to ...
(pp_cxx_trait): ... this.
* error.cc (dump_type) <case UNDERLYING_TYPE>: Remove.
<case TRAIT_TYPE>: New.
(dump_type_prefix): Replace UNDERLYING_TYPE with TRAIT_TYPE.
(dump_type_suffix): Likewise.
* mangle.cc (write_type) <case UNDERLYING_TYPE>: Remove.
<case TRAIT_TYPE>: New.
* module.cc (trees_out::type_node) <case UNDERLYING_TYPE>:
Remove.
<case TRAIT_TYPE>: New.
(trees_in::tree_node): Likewise.
* parser.cc (cp_parser_primary_expression): Adjust after
renaming cp_parser_trait_expr.
(cp_parser_trait_expr): Rename to ...
(cp_parser_trait): ... this.  Call finish_trait_type for traits
that yield a type.
(cp_parser_simple_type_specifier): Adjust after renaming
cp_parser_trait_expr.
* pt.cc (for_each_template_parm_r) <case UNDERLYING_TYPE>:
Remove.
<case TRAIT_TYPE>: New.
(tsubst): Likewise.
(unify): Replace UNDERLYING_TYPE with TRAIT_TYPE.
(dependent_type_p_r): Likewise.
* semantics.cc (finish_underlying_type): Don't return
UNDERLYING_TYPE anymore when processing_template_decl.
(finish_trait_type): Define.
* tree.cc (strip_typedefs) <case UNDERLYING_TYPE>: Remove.
<case TRAIT_TYPE>: New.
(cp_walk_subtrees): Likewise.
* typeck.cc (structural_comptypes): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/alias-decl-59.C: Adjust expected error message.
* g++.dg/ext/underlying_type7.C: Likewise.
* g++.dg/ext/underlying_type13.C: New test.
* g++.dg/ext/underlying_type14.C: New test.

libstdc++: Guard use of new built-in with __has_builtin

I forgot that non-GCC compilers don't have this built-in yet.

For Clang we could do something like the check below (as described in
P2255), but for now I'm just fixing the regression.

#if __has_builtin(__reference_binds_to_temporary)
  bool _Dangle = __reference_binds_to_temporary(_Tp, _Res_t)
                 && __and_<is_reference<_Tp>,
                           __not_<is_reference<_Res_t>>,
                           is_convertible<__remove_cvref_t<_Res_t>*,
                                          __remove_cvref_t<_Tp>*>>::value;
#endif

libstdc++-v3/ChangeLog:

* include/std/type_traits (__is_invocable_impl): Check
__has_builtin(__reference_converts_from_temporary) before using
built-in.

c++: import/export NTTP objects

This adds smarts to the module machinery to handle NTTP object
VAR_DECLs. Like typeinfo objects, these must be ignored in the symbol
table, streamed specially and recreated on stream in.

gcc/cp/
PR c++/100616
* module.cc (enum tree_tag): Add tt_nttp_var.
(trees_out::decl_node): Handle NTTP objects.
(trees_in::tree_node): Handle tt_nttp_var.
(depset::hash::add_binding_entry): Skip NTTP objects.

gcc/testsuite/
PR c++/100616
* g++.dg/modules/100616_a.H: New.
* g++.dg/modules/100616_b.C: New.
* g++.dg/modules/100616_c.C: New.
* g++.dg/modules/100616_d.C: New.

place `const volatile' objects in read-only sections

It is common for C BPF programs to use variables that are implicitly
set by the BPF loader and run-time.  It is also necessary for these
variables to be stored in read-only storage so the BPF verifier
recognizes them as such.  This leads to declarations using both
`const' and `volatile' qualifiers, like this:

  const volatile unsigned char is_allow_list = 0;

Here `volatile' is used to prevent the compiler from optimizing out the
variable, or turning it into a constant, and `const' to make sure it is
placed in .rodata.

Now, it happens that:

- GCC places `const volatile' objects in the .data section, under the
  assumption that `volatile' somehow voids the `const'.

- LLVM places `const volatile' objects in .rodata, under the
  assumption that `volatile' is orthogonal to `const'.

So there is a divergence that has practical consequences: BPF programs
compiled with GCC do not work properly.
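
A minimal sketch of such a declaration together with a reader, written
so that it also compiles as C++.  The accessor name is hypothetical;
the variable is the one from the example above.

```cpp
// `extern` gives the object external linkage in C++ as well, matching
// the linkage the C declaration above has by default.
extern const volatile unsigned char is_allow_list;
const volatile unsigned char is_allow_list = 0;

// Because of `volatile`, this load must really be performed; the
// compiler may not fold the result to `false` at compile time.
bool allow_list_enabled()
{
  return is_allow_list != 0;
}
```

On an ELF target, the section the object ended up in can be checked in
the symbol table of the object file, for example with objdump -t.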

When looking into this, I found this bugzilla:

  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=25521
  "change semantics of const volatile variables"

which was filed back in 2005, long ago.  This report was already
asking to put `const volatile' objects in .rodata, questioning the
current behavior.

While discussing this in the #gcc IRC channel I was pointed to the
following excerpt from the C18 spec:

   6.7.3 Type qualifiers / 5 The properties associated with qualified
         types are meaningful only for expressions that are
         lvalues [note 135]

   135) The implementation may place a const object that is not
        volatile in a read-only region of storage. Moreover, the
        implementation need not allocate storage for such an object if
        its address is never used.

This footnote may be interpreted as if const objects that are volatile
shouldn't be put in read-only storage.  Even if I personally was not
very convinced of that interpretation (see my earlier comment in BZ
25521) I filed the following issue in the LLVM tracker in order to
discuss the matter:

  https://github.com/llvm/llvm-project/issues/56468

As you can see, Aaron Ballman, one of the LLVM hackers, asked the WG14
reflectors about this.  He reported that the reflectors don't think
footnote 135 has any normative value.

So, not having a normative mandate on either direction, there are two
options:

a) To change GCC to place `const volatile' objects in .rodata instead
   of .data.

b) To change LLVM to place `const volatile' objects in .data instead
   of .rodata.

Considering that:

- One target (bpf-unknown-none) breaks with the current GCC behavior.

- No target/platform relies on the GCC behavior, that we know of.

- Changing the LLVM behavior at this point would be very severely
  traumatic for the BPF people and their users.

I think the right thing to do at this point is a).
Therefore this patch.

Regtested in x86_64-linux-gnu and bpf-unknown-none.
No regressions observed.

gcc/ChangeLog:

PR middle-end/25521
* varasm.cc (categorize_decl_for_section): Place `const volatile'
objects in read-only sections.
(default_select_section): Likewise.

gcc/testsuite/ChangeLog:

PR middle-end/25521
* lib/target-supports.exp (check_effective_target_elf): Define.
* gcc.dg/pr25521.c: New test.

data-ref: Fix ranges_maybe_overlap_p test

dr_may_alias_p rightly used poly_int_tree_p to guard a use of
ranges_maybe_overlap_p, but used the non-poly extractors.
This caused a few failures in the SVE ACLE asm tests.

gcc/
* tree-data-ref.cc (dr_may_alias_p): Use to_poly_widest instead
of to_widest.

aarch64: Remove redundant TARGET_* checks

After previous patches, it's possible to remove TARGET_*
options that are redundant due to (IMO) obvious dependencies.

gcc/
* config/aarch64/aarch64.h (TARGET_CRYPTO, TARGET_SHA3, TARGET_SM4)
(TARGET_DOTPROD): Don't depend on TARGET_SIMD.
(TARGET_AES, TARGET_SHA2): Likewise. Remove TARGET_CRYPTO test.
(TARGET_FP_F16INST): Don't depend on TARGET_FLOAT.
(TARGET_SVE2, TARGET_SVE_F32MM, TARGET_SVE_F64MM): Don't depend
on TARGET_SVE.
(TARGET_SVE2_AES, TARGET_SVE2_BITPERM, TARGET_SVE2_SHA3)
(TARGET_SVE2_SM4): Don't depend on TARGET_SVE2.
(TARGET_F32MM, TARGET_F64MM): Delete.
* config/aarch64/aarch64-c.cc (aarch64_update_cpp_builtins): Guard
float macros with just TARGET_FLOAT rather than TARGET_FLOAT
|| TARGET_SIMD.
* config/aarch64/aarch64-simd.md (copysign<mode>3): Depend
only on TARGET_SIMD, rather than TARGET_FLOAT && TARGET_SIMD.
(aarch64_crypto_aes<aes_op>v16qi): Depend only on TARGET_AES,
rather than TARGET_SIMD && TARGET_AES.
(aarch64_crypto_aes<aesmc_op>v16qi): Likewise.
(*aarch64_crypto_aese_fused): Likewise.
(*aarch64_crypto_aesd_fused): Likewise.
(aarch64_crypto_pmulldi): Likewise.
(aarch64_crypto_pmullv2di): Likewise.
(aarch64_crypto_sha1hsi): Likewise TARGET_SHA2.
(aarch64_crypto_sha1hv4si): Likewise.
(aarch64_be_crypto_sha1hv4si): Likewise.
(aarch64_crypto_sha1su1v4si): Likewise.
(aarch64_crypto_sha1<sha1_op>v4si): Likewise.
(aarch64_crypto_sha1su0v4si): Likewise.
(aarch64_crypto_sha256h<sha256_op>v4si): Likewise.
(aarch64_crypto_sha256su0v4si): Likewise.
(aarch64_crypto_sha256su1v4si): Likewise.
(aarch64_crypto_sha512h<sha512_op>qv2di): Likewise TARGET_SHA3.
(aarch64_crypto_sha512su0qv2di): Likewise.
(aarch64_crypto_sha512su1qv2di, eor3q<mode>4): Likewise.
(aarch64_rax1qv2di, aarch64_xarqv2di, bcaxq<mode>4): Likewise.
(aarch64_sm3ss1qv4si): Likewise TARGET_SM4.
(aarch64_sm3tt<sm3tt_op>qv4si): Likewise.
(aarch64_sm3partw<sm3part_op>qv4si): Likewise.
(aarch64_sm4eqv4si, aarch64_sm4ekeyqv4si): Likewise.
* config/aarch64/aarch64.md (<FLOATUORS:optab>dihf2)
(copysign<GPF:mode>3, copysign<GPF:mode>3_insn)
(xorsign<mode>3): Remove redundant TARGET_FLOAT condition.

aarch64: Tweak handling of -mgeneral-regs-only

-mgeneral-regs-only is effectively "+nofp for the compiler without
changing the assembler's ISA flags". Currently that's implemented
by making TARGET_FLOAT, TARGET_SIMD and TARGET_SVE depend on
!TARGET_GENERAL_REGS_ONLY and then making any feature that needs FP
registers depend (directly or indirectly) on one of those three TARGET
macros. The problem is that it's easy to forget to do the last bit.

This patch instead represents the distinction between "assembler
ISA flags" and "compiler ISA flags" more directly, funnelling
all updates through a new function that sets both sets of flags
together.

gcc/
* config/aarch64/aarch64.opt (aarch64_asm_isa_flags): New variable.
* config/aarch64/aarch64.h (aarch64_asm_isa_flags)
(aarch64_isa_flags): Redefine as read-only macros.
(TARGET_SIMD, TARGET_FLOAT, TARGET_SVE): Don't depend on
!TARGET_GENERAL_REGS_ONLY.
* common/config/aarch64/aarch64-common.cc
(aarch64_set_asm_isa_flags): New function.
(aarch64_handle_option): Call it when updating -mgeneral-regs.
* config/aarch64/aarch64-protos.h (aarch64_simd_switcher): Replace
m_old_isa_flags with m_old_asm_isa_flags.
(aarch64_set_asm_isa_flags): Declare.
* config/aarch64/aarch64-builtins.cc
(aarch64_simd_switcher::aarch64_simd_switcher)
(aarch64_simd_switcher::~aarch64_simd_switcher): Save and restore
aarch64_asm_isa_flags instead of aarch64_isa_flags.
* config/aarch64/aarch64-sve-builtins.cc
(check_required_extensions): Use aarch64_asm_isa_flags instead
of aarch64_isa_flags.
* config/aarch64/aarch64.cc (aarch64_set_asm_isa_flags): New function.
(aarch64_override_options, aarch64_handle_attr_arch)
(aarch64_handle_attr_cpu, aarch64_handle_attr_isa_flags): Use
aarch64_set_asm_isa_flags to set the ISA flags.
(aarch64_option_print, aarch64_declare_function_name)
(aarch64_start_file): Use aarch64_asm_isa_flags instead
of aarch64_isa_flags.
(aarch64_can_inline_p): Check aarch64_asm_isa_flags as well as
aarch64_isa_flags.

aarch64: Tweak contents of flags_on/off fields

After previous changes, it's more convenient if the flags_on and
flags_off fields of all_extensions include the feature flag itself.

gcc/
* common/config/aarch64/aarch64-common.cc (all_extensions):
Include the feature flag in flags_on and flags_off.
(aarch64_parse_extension): Update accordingly.
(aarch64_get_extension_string_for_isa_flags): Likewise.

aarch64: Make more use of aarch64_feature_flags

A previous patch added an aarch64_feature_flags typedef, to abstract
the representation of the feature flags. This patch makes existing
code use the typedef too. Hope I've caught them all!

gcc/
* common/config/aarch64/aarch64-common.cc: Use aarch64_feature_flags
for feature flags throughout.
* config/aarch64/aarch64-protos.h: Likewise.
* config/aarch64/aarch64-sve-builtins.h: Likewise.
* config/aarch64/aarch64-sve-builtins.cc: Likewise.
* config/aarch64/aarch64.cc: Likewise.
* config/aarch64/aarch64.opt: Likewise.
* config/aarch64/driver-aarch64.cc: Likewise.

aarch64: Tweak constness of option-related data

Some of the option structures have all-const member variables.
That doesn't seem necessary: we can just use const on the objects
that are supposed to be read-only.

Also, with the new, more C++-heavy option handling, it seems
better to use constexpr for the static data, to make sure that
we're not adding unexpected overhead.
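
The pattern can be sketched as follows.  The struct, the table, its
two entries and the bit assignments are hypothetical, much-reduced
stand-ins rather than the real aarch64 tables.

```cpp
#include <cstring>

// The members are deliberately not const ...
struct extension_info
{
  const char *name;          // plain C string, nothing to construct
  unsigned long long flag;   // single feature bit
};

// ... the table object itself is what is read-only.  constexpr also
// guarantees constant initialization, so no startup code is needed.
constexpr extension_info all_extensions_sketch[] = {
  { "fp",   1ULL << 0 },
  { "simd", 1ULL << 1 },
};

static_assert(all_extensions_sketch[1].flag == 2,
              "table contents are usable at compile time");

// Linear lookup by name; returns nullptr when the name is unknown.
const extension_info *find_extension(const char *name)
{
  for (const extension_info &ext : all_extensions_sketch)
    if (std::strcmp(ext.name, name) == 0)
      return &ext;
  return nullptr;
}
```

Keeping the members non-const means a local copy of an entry can still
be assigned to, while the table stays immutable.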

gcc/
* common/config/aarch64/aarch64-common.cc (aarch64_option_extension)
(processor_name_to_arch, arch_to_arch_name): Remove const from
member variables.
(all_extensions, all_cores, all_architectures): Make a constexpr.
* config/aarch64/aarch64.cc (processor): Remove const from
member variables.
(all_architectures): Make a constexpr.
* config/aarch64/driver-aarch64.cc (aarch64_core_data)
(aarch64_arch_driver_info): Remove const from member variables.
(aarch64_cpu_data, aarch64_arches): Make a constexpr.
(get_arch_from_id): Return a pointer to const.
(host_detect_local_cpu): Update accordingly.

aarch64: Avoid std::string in static data

Just a minor patch to avoid having to construct std::strings
in static data.

gcc/
* common/config/aarch64/aarch64-common.cc (processor_name_to_arch)
(arch_to_arch_name): Use const char * instead of std::string.

aarch64: Simplify generation of .arch strings

aarch64-common.cc has two arrays, one maintaining the original
definition order and one sorted by population count.  Sorting
by population count was a way of ensuring topological ordering,
taking advantage of the fact that the entries are partially
ordered by the subset relation.  However, the sorting is not
needed now that the .def file is forced to have topological
order from the outset.

Other changes are:

(1) The population count used:

      uint64_t total_flags_a = opt_a->flag_canonical & opt_a->flags_on;
      uint64_t total_flags_b = opt_b->flag_canonical & opt_b->flags_on;
      int popcnt_a = popcount_hwi ((HOST_WIDE_INT)total_flags_a);
      int popcnt_b = popcount_hwi ((HOST_WIDE_INT)total_flags_b);

    where I think the & was supposed to be |.  This meant that the
    counts would always be 1 in practice, since flag_canonical is
    a single bit.  This led us to printing +nofp+nosimd even though
    GCC "knows" (and GAS agrees) that +nofp disables simd.

(2) The .arch output code converts +aes+sha2 to +crypto.  I think
    the main reason for doing this is to support assemblers that
    predate the individual per-feature crypto flags.  It therefore
    seems more natural to treat it as a special case, rather than
    as an instance of a general pattern.  Hopefully we won't do
    something similar in future!

    (There is already special handling of CRC, for different reasons.)

(3) Previously, if the /proc/cpuinfo code saw a feature like sve,
    it would assume the presence of all the features that sve
    depends on.  It would be possible to keep that behaviour
    if necessary, but it was simpler to assume the presence of
    fp16 (say) only when fphp is present.  There's an argument
    that that's more conservatively correct too.
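
To make point (1) concrete, here is a small sketch with a hypothetical ext_entry type and a bitset-based popcount standing in for popcount_hwi.  It assumes, as the message says, that flag_canonical is a single bit.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>

// Hypothetical entry: flag_canonical is the feature's own single bit,
// flags_on is the set of features that the option enables.
struct ext_entry
{
  std::uint64_t flag_canonical;
  std::uint64_t flags_on;
};

// Portable stand-in for popcount_hwi.
inline int
popcount64 (std::uint64_t x)
{
  return static_cast<int> (std::bitset<64> (x).count ());
}

// The old expression: intersecting with a single bit leaves at most one
// bit set, so the count cannot distinguish large sets from small ones.
inline int
count_with_and (const ext_entry &e)
{
  return popcount64 (e.flag_canonical & e.flags_on);
}

// The presumably intended expression: the union counts the feature
// together with everything it enables.
inline int
count_with_or (const ext_entry &e)
{
  return popcount64 (e.flag_canonical | e.flags_on);
}
```

With the & form, every entry compares as equal or nearly equal, so sorting by the count gives no useful ordering.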

gcc/
* common/config/aarch64/aarch64-common.cc
(TARGET_OPTION_INIT_STRUCT): Delete.
(aarch64_option_extension): Remove is_synthetic_flag.
(all_extensions): Update accordingly.
(all_extensions_by_on, opt_ext, opt_ext_cmp): Delete.
(aarch64_option_init_struct, aarch64_contains_opt): Delete.
(aarch64_get_extension_string_for_isa_flags): Rewrite to use
all_extensions instead of all_extensions_by_on.

gcc/testsuite/
* gcc.target/aarch64/cpunative/info_8: Add all dependencies of sve.
* gcc.target/aarch64/cpunative/info_9: Likewise svesm4.
* gcc.target/aarch64/cpunative/info_15: Likewise.
* gcc.target/aarch64/cpunative/info_16: Likewise sve2.
* gcc.target/aarch64/cpunative/info_17: Likewise.
* gcc.target/aarch64/cpunative/native_cpu_2.c: Expect just +nofp
rather than +nofp+nosimd.
* gcc.target/aarch64/cpunative/native_cpu_10.c: Likewise.
* gcc.target/aarch64/target_attr_15.c: Likewise.

aarch64: Simplify feature definitions

Currently the aarch64-option-extensions.def entries, the
aarch64-cores.def entries, and the AARCH64_FL_FOR_* macros
have a transitive closure of dependencies that is maintained by hand.
This is a bit error-prone and is becoming less tenable as more features
are added.  The main point of this patch is to maintain the closure
automatically instead.

For example, the +sve2-aes extension requires sve2 and aes.
This is now described using:

  AARCH64_OPT_EXTENSION("sve2-aes", SVE2_AES, (SVE2, AES), ...)

If life was simple, we could just give the name of the feature
and the list of features that it requires/depends on.  But sadly
things are more complicated.  For example:

- the legacy +crypto option enables aes and sha2 only, but +nocrypto
  disables all crypto-related extensions, including sm4.

- +fp16fml enables fp16, but armv8.4-a enables fp16fml without fp16.
  fp16fml only has an effect when fp16 is also present; see the
  comments for more details.

- +bf16 enables simd, but +bf16+nosimd is valid and enables just the
  scalar bf16 instructions.  rdma behaves similarly.

To handle cases like these, the option entries have extra fields to
specify what an explicit +foo enables and what an explicit +nofoo
disables, in addition to the absolute dependencies.
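
The idea of deriving the closure automatically can be sketched as follows.  This is only an illustration under stated assumptions: the feature list, the direct_deps table and the closure function are hypothetical and are not the contents of aarch64-feature-deps.h.  It relies on the entries being in topological order and needs C++14 for the loop in a constexpr function.

```cpp
#include <cassert>
#include <cstdint>

using feature_flags = std::uint64_t;   // stand-in for aarch64_feature_flags

// Hypothetical feature numbering, listed in topological order:
// every feature appears after the features it depends on.
enum feature : unsigned { FP, SIMD, AES, SVE, SVE2, SVE2_AES, NUM_FEATURES };

constexpr feature_flags
bit (feature f)
{
  return feature_flags (1) << f;
}

// Direct dependencies only; illustrative, not the real table.
constexpr feature_flags direct_deps[NUM_FEATURES] = {
  /* FP       */ 0,
  /* SIMD     */ bit (FP),
  /* AES      */ bit (SIMD),
  /* SVE      */ bit (SIMD),
  /* SVE2     */ bit (SVE),
  /* SVE2_AES */ bit (SVE2) | bit (AES),
};

// Return F plus everything it transitively requires.  The recursion
// terminates because dependencies always have a smaller index than F.
constexpr feature_flags
closure (feature f)
{
  feature_flags result = bit (f);
  for (unsigned i = 0; i < f; ++i)
    if (direct_deps[f] & (feature_flags (1) << i))
      result |= closure (static_cast<feature> (i));
  return result;
}

// Only the direct dependencies (SVE2, AES) were written down; the rest
// is derived at compile time.
static_assert (closure (SVE2_AES)
               == (bit (SVE2_AES) | bit (SVE2) | bit (SVE) | bit (AES)
                   | bit (SIMD) | bit (FP)),
               "closure pulls in indirect dependencies");
```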

The other main changes are:

- AARCH64_FL_* are now defined automatically.

- the feature list for each architecture level moves from aarch64.h
  to aarch64-arches.def.

As a consequence, we now have a (redundant) V8A feature flag.

While there, the patch uses a new typedef, aarch64_feature_flags,
for the set of feature flags.  This should make it easier to switch
to a class if we run out of bits in the uint64_t.

For now the patch hardcodes the fact that crypto is the only
synthetic option.  A later patch will remove this field.

To test for things that might not be covered by the testsuite,
I made the driver print out the all_extensions, all_cores and
all_archs arrays before and after the patch, with the following
tweaks:

- renumber the old AARCH64_FL_* bit assignments to match the .def order
- remove the new V8A flag when printing the new tables
- treat CRYPTO and CRYPTO | AES | SHA2 the same way when printing the
  core tables

(On the last point: some cores enabled just CRYPTO while others enabled
CRYPTO, AES and SHA2.  This doesn't cause a difference in behaviour
because of how the dependent macros are defined.  With the new scheme,
all entries with CRYPTO automatically get AES and SHA2 too.)

The only difference is that +nofp now turns off dotprod.  This was
another instance of an incomplete transitive closure, but unlike the
instances fixed in a previous patch, it had no observable effect.

gcc/
* config/aarch64/aarch64-option-extensions.def: Switch to a new format.
* config/aarch64/aarch64-cores.def: Use the same format to specify
lists of features.
* config/aarch64/aarch64-arches.def: Likewise, moving that information
from aarch64.h.
* config/aarch64/aarch64-opts.h (aarch64_feature_flags): New typedef.
* config/aarch64/aarch64.h (aarch64_feature): New class enum.
Turn AARCH64_FL_* macros into constexprs, getting the definitions
from aarch64-option-extensions.def.  Remove AARCH64_FL_FOR_* macros.
* common/config/aarch64/aarch64-common.cc: Include
aarch64-feature-deps.h.
(all_extensions): Update for new .def format.
(all_extensions_by_on, all_cores, all_architectures): Likewise.
* config/aarch64/driver-aarch64.cc: Include aarch64-feature-deps.h.
(aarch64_extensions): Update for new .def format.
(aarch64_cpu_data, aarch64_arches): Likewise.
* config/aarch64/aarch64.cc: Include aarch64-feature-deps.h.
(all_architectures, all_cores): Update for new .def format.
* config/aarch64/aarch64-sve-builtins.cc
(check_required_extensions): Likewise.

aarch64: Reorder an entry in aarch64-option-extensions.def

aarch64-option-extensions.def was topologically sorted except
for one case: crypto came before its aes and sha2 dependencies.
This patch moves crypto after sha2 instead.

gcc/
* config/aarch64/aarch64-option-extensions.def: Move crypto
after sha2.

gcc/testsuite/
* gcc.target/aarch64/cpunative/native_cpu_0.c: Expect +crypto
to come after +crc.
* gcc.target/aarch64/cpunative/native_cpu_13.c: Likewise.
* gcc.target/aarch64/cpunative/native_cpu_16.c: Likewise.
* gcc.target/aarch64/cpunative/native_cpu_17.c: Likewise.
* gcc.target/aarch64/cpunative/native_cpu_6.c: Likewise.
* gcc.target/aarch64/cpunative/native_cpu_7.c: Likewise.
* gcc.target/aarch64/options_set_2.c: Likewise.
* gcc.target/aarch64/options_set_3.c: Likewise.
* gcc.target/aarch64/options_set_4.c: Likewise.

aarch64: Fix transitive closure of features

aarch64-option-extensions.def requires us to maintain the transitive
closure of options by hand.  This patch fixes a few cases where a
flag was missed.

+noaes and +nosha2 now disable +crypto, which IMO makes more
sense and is consistent with the Clang behaviour.

gcc/
* config/aarch64/aarch64-option-extensions.def (dotprod): Depend
on fp as well as simd.
(sha3): Likewise.
(aes): Likewise.  Make +noaes disable crypto.
(sha2): Likewise +nosha2.  Also make +nosha2 disable sha3 and
sve2-sha3.
(sve2-sha3): Depend on sha2 as well as sha3.

gcc/testsuite/
* gcc.target/aarch64/options_set_6.c: Expect +crypto+nosha2 to
disable crypto but keep aes.
* gcc.target/aarch64/pragma_cpp_predefs_4.c: New test.

aarch64: Remove AARCH64_FL_RCPC8_4 [PR107025]

AARCH64_FL_RCPC8_4 is an odd-one-out in that it has no associated
entry in aarch64-option-extensions.def. This means that, although
it is internally separated from AARCH64_FL_V8_4A, there is no
mechanism for turning it on and off individually, independently
of armv8.4-a.

The only place that the flag was used independently was in the
entry for thunderx3t110, which enabled it alongside V8_3A.
As noted in PR107025, this means that any use of the extension
will fail to assemble.

In the PR trail, Andrew suggested removing the core entry.
That might be best long-term, but since the barrier for removing
command-line options without a deprecation period is very high,
this patch instead just drops the flag from the core entry.
We'll still produce correct code.

gcc/
PR target/107025
* config/aarch64/aarch64.h (AARCH64_FL_RCPC8_4): Delete.
(AARCH64_FL_FOR_V8_4A): Update accordingly.
(AARCH64_ISA_RCPC8_4): Use AARCH64_FL_V8_4A directly.
* config/aarch64/aarch64-cores.def (thunderx3t110): Remove
AARCH64_FL_RCPC8_4.

aarch64: Avoid redundancy in aarch64-cores.def

The flags fields of the aarch64-cores.def always start with
AARCH64_FL_FOR_<ARCH>. After previous changes, <ARCH> is always
identical to the previous field, so we can drop the explicit
AARCH64_FL_FOR_<ARCH> and derive it programmatically.

This isn't a big saving in itself, but it helps with later patches.

gcc/
* config/aarch64/aarch64-cores.def: Remove AARCH64_FL_FOR_<ARCH>
from the flags field.
* common/config/aarch64/aarch64-common.cc (all_cores): Add it
here instead.
* config/aarch64/aarch64.cc (all_cores): Likewise.
* config/aarch64/driver-aarch64.cc (all_cores): Likewise.

aarch64: Small config.gcc cleanups

The aarch64-option-extensions.def parsing in config.gcc had
some code left over from when it tried to parse the whole
macro definition. Also, config.gcc now only looks at the
first fields of the aarch64-arches.def entries.

gcc/
* config.gcc: Remove dead aarch64-option-extensions.def code.
* config/aarch64/aarch64-arches.def: Update comment.

aarch64: Add "V" to aarch64-arches.def names

This patch completes the renaming of architecture-level related
things by adding "V" to the name of the architecture in
aarch64-arches.def. Since the "V" is predictable, we can easily
drop it when we don't need it (as when matching /proc/cpuinfo).

Having a valid C identifier is necessary for later patches.

gcc/
* config/aarch64/aarch64-arches.def: Add a leading "V" to the
ARCH_IDENT fields.
* config/aarch64/aarch64-cores.def: Update accordingly.
* common/config/aarch64/aarch64-common.cc (all_cores): Likewise.
* config/aarch64/aarch64.cc (all_cores): Likewise.
* config/aarch64/driver-aarch64.cc (aarch64_arches): Skip the
leading "V".

aarch64: Rename AARCH64_FL_FOR_ARCH macros

This patch renames AARCH64_FL_FOR_ARCH* macros to follow the
same V<number><profile> names that we (now) use elsewhere.

The names are only temporary -- a later patch will move the
information to the .def file instead. However, it helps with
the sequencing to do this first.

gcc/
* config/aarch64/aarch64.h (AARCH64_FL_FOR_ARCH8): Rename to...
(AARCH64_FL_FOR_V8A): ...this.
(AARCH64_FL_FOR_ARCH8_1): Rename to...
(AARCH64_FL_FOR_V8_1A): ...this.
(AARCH64_FL_FOR_ARCH8_2): Rename to...
(AARCH64_FL_FOR_V8_2A): ...this.
(AARCH64_FL_FOR_ARCH8_3): Rename to...
(AARCH64_FL_FOR_V8_3A): ...this.
(AARCH64_FL_FOR_ARCH8_4): Rename to...
(AARCH64_FL_FOR_V8_4A): ...this.
(AARCH64_FL_FOR_ARCH8_5): Rename to...
(AARCH64_FL_FOR_V8_5A): ...this.
(AARCH64_FL_FOR_ARCH8_6): Rename to...
(AARCH64_FL_FOR_V8_6A): ...this.
(AARCH64_FL_FOR_ARCH8_7): Rename to...
(AARCH64_FL_FOR_V8_7A): ...this.
(AARCH64_FL_FOR_ARCH8_8): Rename to...
(AARCH64_FL_FOR_V8_8A): ...this.
(AARCH64_FL_FOR_ARCH8_R): Rename to...
(AARCH64_FL_FOR_V8R): ...this.
(AARCH64_FL_FOR_ARCH9): Rename to...
(AARCH64_FL_FOR_V9A): ...this.
(AARCH64_FL_FOR_ARCH9_1): Rename to...
(AARCH64_FL_FOR_V9_1A): ...this.
(AARCH64_FL_FOR_ARCH9_2): Rename to...
(AARCH64_FL_FOR_V9_2A): ...this.
(AARCH64_FL_FOR_ARCH9_3): Rename to...
(AARCH64_FL_FOR_V9_3A): ...this.
* common/config/aarch64/aarch64-common.cc (all_cores): Update
accordingly.
* config/aarch64/aarch64-arches.def: Likewise.
* config/aarch64/aarch64-cores.def: Likewise.
* config/aarch64/aarch64.cc (all_cores): Likewise.

aarch64: Rename AARCH64_FL architecture-level macros

Following on from the previous AARCH64_ISA patch, this one adds the
profile name directly to the end of architecture-level AARCH64_FL_*
macros.

gcc/
* config/aarch64/aarch64.h (AARCH64_FL_V8_1, AARCH64_FL_V8_2)
(AARCH64_FL_V8_3, AARCH64_FL_V8_4, AARCH64_FL_V8_5, AARCH64_FL_V8_6)
(AARCH64_FL_V9, AARCH64_FL_V8_7, AARCH64_FL_V8_8, AARCH64_FL_V9_1)
(AARCH64_FL_V9_2, AARCH64_FL_V9_3): Add "A" to the end of the name.
(AARCH64_FL_V8_R): Rename to AARCH64_FL_V8R.
(AARCH64_FL_FOR_ARCH8_1, AARCH64_FL_FOR_ARCH8_2): Update accordingly.
(AARCH64_FL_FOR_ARCH8_3, AARCH64_FL_FOR_ARCH8_4): Likewise.
(AARCH64_FL_FOR_ARCH8_5, AARCH64_FL_FOR_ARCH8_6): Likewise.
(AARCH64_FL_FOR_ARCH8_7, AARCH64_FL_FOR_ARCH8_8): Likewise.
(AARCH64_FL_FOR_ARCH8_R, AARCH64_FL_FOR_ARCH9): Likewise.
(AARCH64_FL_FOR_ARCH9_1, AARCH64_FL_FOR_ARCH9_2): Likewise.
(AARCH64_FL_FOR_ARCH9_3, AARCH64_ISA_V8_2A, AARCH64_ISA_V8_3A)
(AARCH64_ISA_V8_4A, AARCH64_ISA_V8_5A, AARCH64_ISA_V8_6A): Likewise.
(AARCH64_ISA_V8R, AARCH64_ISA_V9A, AARCH64_ISA_V9_1A): Likewise.
(AARCH64_ISA_V9_2A, AARCH64_ISA_V9_3A): Likewise.

aarch64: Rename AARCH64_ISA architecture-level macros

All AARCH64_ISA_* architecture-level macros except AARCH64_ISA_V8_R
are for the A profile: they cause __ARM_ARCH_PROFILE to be set to
'A' and they are associated with architecture names like armv8.4-a.

It's convenient for later patches if we make this explicit
by adding an "A" to the name. Also, rather than add an underscore
(as for V8_R) it's more convenient to add the profile directly
to the number, like we already do in the ARCH_IDENT field of the
aarch64-arches.def entries.

gcc/
* config/aarch64/aarch64.h (AARCH64_ISA_V8_2, AARCH64_ISA_V8_3)
(AARCH64_ISA_V8_4, AARCH64_ISA_V8_5, AARCH64_ISA_V8_6)
(AARCH64_ISA_V9, AARCH64_ISA_V9_1, AARCH64_ISA_V9_2)
(AARCH64_ISA_V9_3): Add "A" to the end of the name.
(AARCH64_ISA_V8_R): Rename to AARCH64_ISA_V8R.
(TARGET_ARMV8_3, TARGET_JSCVT, TARGET_FRINT, TARGET_MEMTAG): Update
accordingly.
* common/config/aarch64/aarch64-common.cc
(aarch64_get_extension_string_for_isa_flags): Likewise.
* config/aarch64/aarch64-c.cc
(aarch64_define_unconditional_macros): Likewise.

Add OPTIONS_H_EXTRA to GTFILES

I have a patch that adds a typedef to aarch64's <cpu>-opts.h.
The typedef is used for a TargetVariable in the .opt file,
which means that it is covered by PCH and so needs to be
visible to gengtype.

<cpu>-opts.h is not included directly in tm.h, but indirectly
by target headers (in this case aarch64.h). There was therefore
nothing that caused it to be added to GTFILES.

gcc/
* Makefile.in (GTFILES): Add OPTIONS_H_EXTRA.

driver, cppdefault: Unbreak bootstrap on Debian/Ubuntu [PR107059]

My recent change to enable _Float{16,32,64,128,32x,64x,128x} for C++
apparently broke bootstrap on some Debian/Ubuntu setups.
Those multiarch targets put some headers into
/usr/include/x86_64-linux-gnu/bits/ etc. subdirectory instead of
/usr/include/bits/.
This is handled by
    /* /usr/include comes dead last.  */
    { NATIVE_SYSTEM_HEADER_DIR, NATIVE_SYSTEM_HEADER_COMPONENT, 0, 0, 1, 2 },
    { NATIVE_SYSTEM_HEADER_DIR, NATIVE_SYSTEM_HEADER_COMPONENT, 0, 0, 1, 0 },
in cppdefault.cc, where the 2 in the last element of the first initializer
means the entry is ignored on non-multiarch and suffixed by the multiarch
dir otherwise, so installed gcc has search path like:
/home/jakub/gcc/obj01inst/usr/local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.0/include
/home/jakub/gcc/obj01inst/usr/local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.0/include-fixed
/usr/local/include
/usr/include/x86_64-linux-gnu
/usr/include
(when installed with DESTDIR=/home/jakub/gcc/obj01inst).
Now, when fixincludes is run, it is processing the whole /usr/include dir
and all its subdirectories, so floatn{,-common}.h actually go into
.../include-fixed/x86_64-linux-gnu/bits/floatn{,-common}.h
because that is where they appear in /usr/include too.
In some setups, /usr/include also contains /usr/include/bits -> x86_64-linux-gnu/bits
symlink and after the r13-2896 tweak it works.
In other setups there is no /usr/include/bits symlink and when one does
#include <bits/floatn.h>
given the above search path, it doesn't find the fixincluded header,
as
/home/jakub/gcc/obj01inst/usr/local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.0/include-fixed/bits/floatn.h
doesn't exist and
/home/jakub/gcc/obj01inst/usr/local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.0/include-fixed/x86_64-linux-gnu/bits/floatn.h
isn't searched and so
/usr/include/x86_64-linux-gnu/bits/floatn.h
wins and we fail because of typedef whatever _Float128; and similar.
The following patch ought to fix this.  The first hunk does so by arranging that
the installed search path actually looks like:
/home/jakub/gcc/obj01inst/usr/local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.0/include
/home/jakub/gcc/obj01inst/usr/local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.0/include-fixed/x86_64-linux-gnu
/home/jakub/gcc/obj01inst/usr/local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.0/include-fixed
/usr/local/include
/usr/include/x86_64-linux-gnu
/usr/include
and thus for include-fixed it treats it the same as /usr/include.
The second FIXED_INCLUDE_DIR entry there is:
     { FIXED_INCLUDE_DIR, "GCC", 0, 0, 0,
       /* A multilib suffix needs adding if different multilibs use
          different headers.  */
#ifdef SYSROOT_HEADERS_SUFFIX_SPEC
       1
#else
       0
#endif
     },
where SYSROOT_HEADERS_SUFFIX_SPEC is defined only on vxworks or mips*-mti-linux
and arranges for multilib path to be appended there.  Neither of those
systems is multiarch.
This isn't enough, because when using the -B option, the driver adds
-isystem .../include-fixed in another place, so the second hunk modifies
that spot in the same way.
/home/jakub/gcc/obj01/gcc/xgcc -B /home/jakub/gcc/obj01/gcc/
then has search path:
/home/jakub/gcc/obj01/gcc/include
/home/jakub/gcc/obj01/gcc/include-fixed/x86_64-linux-gnu
/home/jakub/gcc/obj01/gcc/include-fixed
/usr/local/include
/usr/include/x86_64-linux-gnu
/usr/include
which again is what I think we want to achieve.
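
The resulting ordering can be summarised with a small sketch.  The helper default_search_path is hypothetical and only models the lists shown above; it is not how cpp_include_defaults or do_spec_1 are implemented.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: build the default search order for a GCC private
// directory PREFIX.  MULTIARCH is empty on non-multiarch systems.
inline std::vector<std::string>
default_search_path (const std::string &prefix, const std::string &multiarch)
{
  std::vector<std::string> dirs;
  dirs.push_back (prefix + "/include");
  // New with this patch: the multiarch subdirectory of include-fixed is
  // searched before include-fixed itself, mirroring /usr/include below.
  if (!multiarch.empty ())
    dirs.push_back (prefix + "/include-fixed/" + multiarch);
  dirs.push_back (prefix + "/include-fixed");
  dirs.push_back ("/usr/local/include");
  if (!multiarch.empty ())
    dirs.push_back ("/usr/include/" + multiarch);
  dirs.push_back ("/usr/include");
  return dirs;
}
```

With this order, a lookup of bits/floatn.h reaches the fixincluded copy under include-fixed/x86_64-linux-gnu before the unfixed one under /usr/include/x86_64-linux-gnu.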

2022-09-29  Jakub Jelinek  <jakub@redhat.com>

PR bootstrap/107059
* cppdefault.cc (cpp_include_defaults): If SYSROOT_HEADERS_SUFFIX_SPEC
isn't defined, add FIXED_INCLUDE_DIR entry with multilib flag 2
before FIXED_INCLUDE_DIR entry with multilib flag 0.
* gcc.cc (do_spec_1): If multiarch_dir, add
include-fixed/multiarch_dir paths before include-fixed paths.

support -gz=zstd for both linker and assembler

PR driver/106897

gcc/ChangeLog:

* common.opt: Add -gz=zstd value.
* configure.ac: Detect --compress-debug-sections=zstd
for both linker and assembler.
* configure: Regenerate.
* gcc.cc (LINK_COMPRESS_DEBUG_SPEC): Handle -gz=zstd.
(ASM_COMPRESS_DEBUG_SPEC): Likewise.

ada: Remove duplicated doc comment section

A documentation section was duplicated by mistake in r0-110752.
This commit removes the copy that was added by r0-110752, but
integrates the small editorial change that it brought to the
original.

gcc/ada/

* einfo.ads: Remove documentation duplicate.