Iain Buclaw [Sat, 23 Jan 2021 23:20:25 +0000 (00:20 +0100)]
libphobos: Fix executables segfault on mipsel architecture
The dynamic section on MIPS is read-only, but this was not properly
handled in the runtime library. The segfault only occurred for programs
that linked to the shared libphobos library.
libphobos/ChangeLog:
PR d/98806
* libdruntime/gcc/sections/elf_shared.d (MIPS_Any): Declare version
for MIPS32 and MIPS64.
(getDependencies): Adjust dlpi_addr on MIPS_Any.
Paul Thomas [Sat, 26 Dec 2020 16:44:24 +0000 (16:44 +0000)]
Fortran: Correction to recent patch in light of comments [PR98022].
2020-12-26 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/98022
* data.c (gfc_assign_data_value): Throw an error for inquiry
references. Follow with corrected code that would provide the
expected result and provides clean error recovery.
gcc/testsuite/
PR fortran/98022
* gfortran.dg/data_inquiry_ref.f90: Change to dg-compile and
add errors for inquiry references.
Marek Polacek [Fri, 22 Jan 2021 17:50:53 +0000 (12:50 -0500)]
c++: Crash when deducing template arguments [PR98790]
maybe_instantiate_noexcept doesn't expect to see error_mark_node, but
the new callsite I introduced in r11-6476 can pass error_mark_node to
it. So cope.
gcc/cp/ChangeLog:
PR c++/98790
* pt.c (maybe_instantiate_noexcept): Return false if FN is
error_mark_node.
gcc/testsuite/ChangeLog:
PR c++/98790
* g++.dg/template/deduce8.C: New test.
duplicate_and_interleave is the main fallback way of loading
a repeating sequence of elements into variable-length vectors.
The code handles cases in which the number of elements in the
sequence is potentially several times greater than the number
of elements in a vector.
Let:
- NE be the (compile-time) number of elements in the sequence
- NR be the (compile-time) number of vector results and
- VE be the (run-time) number of elements in each vector
The basic approach is to duplicate each element into a
separate vector, giving NE vectors in total, then use
log2(NE) rows of NE permutes to generate NE results.
In the worst case --- when VE has no known compile-time factor
and NR >= NE --- all of these permutes are necessary. However,
if VE is known to be a multiple of 2**F, then each of the
first F permute rows produces duplicate results; specifically,
the high permute for a given pair is the same as the low permute.
The code dealt with this by reusing the low result for the
high result. This part was OK.
However, having duplicate results from one row meant that the
next row did duplicate work. The redundancies would be optimised
away by later passes, but the code tried to avoid generating them
in the first place. This is the part that went wrong.
Specifically, NR is typically less than NE when some permutes are
redundant, so the code tried to use NR to reduce the amount of work
performed. The problem was that, although it correctly calculated
a conservative bound on how many results were needed in each row,
it chose the wrong results for anything other than the final row.
This doesn't usually matter for fully-packed SVE vectors. We first
try to coalesce smaller elements into larger ones, so normally
VE ends up being 2**VQ (where VQ is the number of 128-bit blocks
in an SVE vector). In that situation we'd only apply the faulty
optimisation to the final row, i.e. the case it handled correctly.
E.g. for one case already tested by the testsuite, we'd have 3 rows
of permutes producing 4 vector results. The scheme produced:
1st row: 8 results from 4 permutes, highs duplicates of lows
2nd row: 8 results from 8 permutes (half of which are actually redundant)
3rd row: 4 results from 4 permutes
However, coalescing elements is trickier for unpacked vectors,
and at the moment we don't try to do it (see the GET_MODE_SIZE
check in can_duplicate_and_interleave_p). Unpacked vectors
therefore stress the code in ways that packed vectors didn't.
The patch fixes this by removing the redundancies from each row,
rather than trying to work around them later. This also removes
the redundant work in the second row of the example above.
gcc/
PR tree-optimization/98535
* tree-vect-slp.c (duplicate_and_interleave): Use quick_grow_cleared.
If the high and low permutes are the same, remove the high permutes
from the working set and only continue with the low ones.
Michael Meissner [Thu, 21 Jan 2021 01:30:22 +0000 (20:30 -0500)]
PowerPC: Backport fix for libgcc long double support.
libgcc/
2021-01-20 Michael Meissner <meissner@linux.ibm.com>
* config/rs6000/t-linux (IBM128_STATIC_OBJS): Back port from
master (12/3/2020). New make variable.
(IBM128_SHARED_OBJS): New make variable.
(IBM128_OBJS): New make variable. Set all objects to use the
explicit IBM format, and disable gnu attributes.
(IBM128_CFLAGS): New make variable.
(gcc_s_compile): Add -mno-gnu-attribute to all shared library
modules.
Martin Jambor [Tue, 19 Jan 2021 14:50:49 +0000 (15:50 +0100)]
ipa-sra: Do not remove return values needed because of non-call EH
IPA-SRA already contains a check to figure out that an otherwise dead
parameter is actually required because of non-call exceptions, but it
is not present at the equivalent spot where SRA figures out whether
the return statement is used for anything useful. This patch adds
that condition there.
Unfortunately, even though this patch should be good enough for any
normal (I'd even say reasonable) use of the compiler, it hints that
when the user manually switches off all sorts of DCE, IPA-SRA would
probably leave behind problematic statements manipulating what
originally were return values, just like it does for parameters (PR
93385). Fixing this properly might unfortunately be a separate issue
from the mentioned bug because the LHS of a call is changed during
call redirection and the caller often is not a clone. But I'll see
what I can do.
Meanwhile, the patch below has been bootstrapped and tested on x86_64.
gcc/ChangeLog:
2021-01-18 Martin Jambor <mjambor@suse.cz>
PR ipa/98690
* ipa-sra.c (ssa_name_only_returned_p): New parameter fun. Check
whether non-call exceptions allow removal of a statement.
(isra_analyze_call): Pass the appropriate function to
ssa_name_only_returned_p.
Rainer Orth [Tue, 8 Dec 2020 12:29:26 +0000 (13:29 +0100)]
testsuite: i386: Require ifunc support in gcc.target/i386/pr98100.c
The new gcc.target/i386/pr98100.c test FAILs on Solaris/x86:
FAIL: gcc.target/i386/pr98100.c (test for excess errors)
Excess errors:
/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.target/i386/pr98100.c:6:1: error: the call requires 'ifunc', which is not supported by this target
Fixed as follows.
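The usual fix for this kind of failure is to gate the test on ifunc support; a
minimal sketch (my assumption about the shape of the fix, not the actual hunk)
is to add the standard directive near the top of the test:

  /* { dg-require-ifunc "" } */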
Tested on i386-pc-solaris2.11 and x86_64-pc-linux-gnu.
Thomas Schwinge [Mon, 30 Nov 2020 14:15:20 +0000 (15:15 +0100)]
[nvptx libgomp plugin] Build only in supported configurations
As recently again discussed in <https://gcc.gnu.org/PR97436> "[nvptx] -m32
support", nvptx offloading other than for 64-bit host has never been
implemented, tested, supported. So we simply shouldn't build the nvptx libgomp
plugin in this case.
This avoids build problems if, for example, in a (standard) bi-arch
x86_64-pc-linux-gnu '-m64'/'-m32' build, libcuda is available only in a 64-bit
variant but not in a 32-bit one, which, for example, is the case if you build
GCC against the CUDA toolkit's 'stubs/libcuda.so' (see
<https://stackoverflow.com/a/52784819>).
This amends PR65099 commit a92defdab79a1268f4b9dcf42b937e4002a4cf15 (r225560)
"[nvptx offloading] Only 64-bit configurations are currently supported" to
match the way we're doing this for the HSA/GCN plugins.
Iain Sandoe [Sat, 31 Oct 2020 09:25:47 +0000 (09:25 +0000)]
Objective-C++ : Fix ICE in potential_constant_expression_1.
We cannot, as things stand, handle Objective-C tree codes in
the switch, so we deal with this by calling out to a function that
has a dummy version when Objective-C is not enabled.
Because of the way the logic works (with a fall through to a
'sorry' in case of unhandled expressions), the function reports
cases that are known to be unsuitable for constant exprs. The
dummy function always reports 'false' and thus will fall through
to the 'sorry'.
The fix for PR libstdc++/82481 should only have applied for targets
where _GLIBCXX_HAVE_TLS is defined. Because it was also done for non-TLS
targets, it isn't possible to use clang's analyzers on non-TLS targets
if the code uses <mutex>. This fixes it by using a NOLINT comment on
the relevant line instead of testing #ifdef __clang_analyzer__ and
compiling different code when analyzing.
I'm not actually able to reproduce the analyzer warning with the tools
from Clang 10.0.1 so I'm not going to try to make the suppression more
specific with NOLINTNEXTLINE(clang-analyzer-code.StackAddressEscape).
libstdc++-v3/ChangeLog:
PR libstdc++/98605
* include/std/mutex (call_once): Use NOLINT to suppress clang
analyzer warnings.
Samuel Thibault [Mon, 21 Dec 2020 14:36:30 +0000 (15:36 +0100)]
hurd: libgcc unwinding over signal trampolines with SIGINFO
When the application sets SA_SIGINFO, the signal trampoline parameters
are different to follow POSIX.
libgcc/
* config/i386/gnu-unwind.h (x86_gnu_fallback_frame_state): Add the
posix siginfo case to struct handler_args. Distinguish between legacy
and siginfo from the second parameter, which is a small sigcode in
the legacy case, and a pointer in the siginfo case.
Richard Biener [Wed, 6 Jan 2021 08:26:55 +0000 (09:26 +0100)]
tree-optimization/98513 - fix bug in range intersection code
This fixes a premature optimization in the range intersection code
which assumes earlier branches have to be taken, not taking into
account that for symbolic ranges we cannot always compare endpoints.
The fix is to instantiate the compare deemed redundant (which then
fails as undecidable for the testcase).
2021-01-06 Richard Biener <rguenther@suse.de>
PR tree-optimization/98513
* value-range.cc (intersect_ranges): Compare the upper bounds
for the expected relation.
vect: Fix bogus alignment assumption in alias checks [PR94994]
This PR is about a case in which the vectoriser was feeding
incorrect alignment information to tree-data-ref.c, leading
to incorrect runtime alias checks. The alignment was taken
from the TREE_TYPE of the DR_REF, which in this case was a
COMPONENT_REF with a normally-aligned type. However, the
underlying MEM_REF was only byte-aligned.
This patch uses dr_alignment to calculate the (byte) alignment
instead, just like we do when creating vector MEM_REFs.
gcc/
PR tree-optimization/94994
* tree-vect-data-refs.c (vect_vfa_align): Use dr_alignment.
gcc/testsuite/
PR tree-optimization/94994
* gcc.dg/vect/pr94994.c: New test.
vect, aarch64: Fix alignment units for IFN_MASK* [PR95401]
The IFN_MASK* functions take two leading arguments: a load or
store pointer and a “cookie”. The type of the cookie is the
type of the access for TBAA purposes (like for MEM_REFs)
while the value of the cookie is the alignment of the access.
This PR was caused by a disagreement about whether the alignment
is measured in bits or bytes.
It looks like this goes back to PR68786, which made the
vectoriser create its own cookie argument rather than reusing
the one created by ifcvt. The alignment value of the new cookie
was measured in bytes (as needed by set_ptr_info_alignment)
while the existing code expected it to be measured in bits.
The folds I added for IFN_MASK_LOAD and STORE then made
things worse.
gcc/
PR tree-optimization/95401
* config/aarch64/aarch64-sve-builtins.cc
(gimple_folder::load_store_cookie): Use bits rather than bytes
for the alignment argument to IFN_MASK_LOAD and IFN_MASK_STORE.
* gimple-fold.c (gimple_fold_mask_load_store_mem_ref): Likewise.
* tree-vect-stmts.c (vectorizable_store): Likewise.
(vectorizable_load): Likewise.
gcc/testsuite/
PR tree-optimization/95401
* g++.dg/vect/pr95401.cc: New test.
* g++.dg/vect/pr95401a.cc: Likewise.
recog: Fix a constrain_operands corner case [PR97144]
aarch64's *add<mode>3_poly_1 has a pattern with the constraints:
"=...,r,&r"
"...,0,rk"
"...,Uai,Uat"
i.e. the penultimate alternative requires operands 0 and 1 to match,
but the final alternative does not allow them to match.
The register allocators dealt with this correctly, and so used
different input and output registers for instructions with Uat
operands. However, constrain_operands carried the penultimate
alternative's matching rule over to the final alternative,
so it would essentially ignore the earlyclobber. This in turn
allowed postreload to convert a correct Uat pairing into an
incorrect one.
The fix is simple: recompute the matching information for each
alternative.
gcc/
PR rtl-optimization/97144
* recog.c (constrain_operands): Initialize matching_operand
for each alternative, rather than only doing it once.
gcc/testsuite/
PR rtl-optimization/97144
* gcc.c-torture/compile/pr97144.c: New test.
* gcc.target/aarch64/sve/pr97144.c: Likewise.
genmodes: Update GET_MODE_MASK when changing NUNITS [PR98214]
The static GET_MODE_MASKs for SVE vectors are based on the
static precisions, which in turn are based on 128-bit SVE.
The precisions are later updated based on -msve-vector-bits
(usually to become variable length), but the GET_MODE_MASK
stayed the same. This caused combine to fold:
(*_extract:DI (subreg:DI (reg:VNxMM R) 0) ...)
to zero because the extracted bits appeared to be insignificant.
gcc/
PR rtl-optimization/98214
* genmodes.c (emit_insn_modes_h): Emit a definition of CONST_MODE_MASK.
(emit_mode_mask): Treat mode_mask_array as non-constant if adj_nunits.
(emit_mode_adjustments): Update GET_MODE_MASK when updating
GET_MODE_NUNITS.
* machmode.h (mode_mask_array): Use CONST_MODE_MASK.
unsigned long long x = ...;
char y = (char) (x << 37);
The overwidening pattern realised that only the low 8 bits
of x << 37 are needed, but then tried to turn that into:
unsigned long long x = ...;
char y = (char) x << 37;
which gives an out-of-range shift. In this case y can simply
be replaced by zero, but as the comment in the patch says,
it's kind-of awkward to do that in the middle of vectorisation.
Most of the overwidening stuff is about keeping operations
as narrow as possible, which is important for vectorisation
but could be counter-productive for scalars (especially on
RISC targets). In contrast, optimising y to zero in the above
feels like an independent optimisation that would benefit scalar
code and that should happen before vectorisation.
gcc/
PR tree-optimization/98302
* tree-vect-patterns.c (vect_determine_precisions_from_users): Make
sure that the precision remains greater than the shift count.
gcc/testsuite/
PR tree-optimization/98302
* gcc.dg/vect/pr98302.c: New test.
vect: Fix missing alias checks for 128-bit SVE [PR98371]
On AArch64, the vectoriser tries various ways of vectorising with both
SVE and Advanced SIMD and picks the best one. All other things being
equal, it prefers earlier attempts over later attempts.
The way this works currently is that, once it has a successful
vectorisation attempt A, it analyses all other attempts as epilogue
loops of A:
/* When pick_lowest_cost_p is true, we should in principle iterate
over all the loop_vec_infos that LOOP_VINFO could replace and
try to vectorize LOOP_VINFO under the same conditions.
E.g. when trying to replace an epilogue loop, we should vectorize
LOOP_VINFO as an epilogue loop with the same VF limit. When trying
to replace the main loop, we should vectorize LOOP_VINFO as a main
loop too.
However, autovectorize_vector_modes is usually sorted as follows:
- Modes that naturally produce lower VFs usually follow modes that
naturally produce higher VFs.
- When modes naturally produce the same VF, maskable modes
usually follow unmaskable ones, so that the maskable mode
can be used to vectorize the epilogue of the unmaskable mode.
This order is preferred because it leads to the maximum
epilogue vectorization opportunities. Targets should only use
a different order if they want to make wide modes available while
disparaging them relative to earlier, smaller modes. The assumption
in that case is that the wider modes are more expensive in some
way that isn't reflected directly in the costs.
There should therefore be few interesting cases in which
LOOP_VINFO fails when treated as an epilogue loop, succeeds when
treated as a standalone loop, and ends up being genuinely cheaper
than FIRST_LOOP_VINFO. */
However, the vectoriser can normally elide alias checks for epilogue
loops, on the basis that the main loop should do them instead.
Converting an epilogue loop to a main loop can therefore cause the alias
checks to be skipped. (It probably also unfairly penalises the original
loop in the cost comparison, given that one loop will have alias checks
and the other won't.)
As the comment says, we should in principle analyse each vector mode
twice: once as a main loop and once as an epilogue. However, doing
that up-front would be quite expensive. This patch instead goes for a
compromise: if an epilogue loop for mode M2 seems better than a main
loop for mode M1, re-analyse with M2 as the main loop.
The patch fixes dg.torture.exp=pr69719.c when testing with
-msve-vector-bits=128.
gcc/
PR tree-optimization/98371
* tree-vect-loop.c (vect_reanalyze_as_main_loop): New function.
(vect_analyze_loop): If an epilogue loop appears to be cheaper
than the main loop, re-analyze it as a main loop before adopting
it as a main loop.
in the 64-bit vst[234] functions. The zero was forced into a
register at expand time, and we relied on combine to fuse the
zero and combine back together into a single combinez pattern.
The problem is that the zero could be hoisted before combine
gets a chance to do its thing.
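For illustration, a minimal use of one of the affected intrinsics (a sketch;
vst2_s32 is my choice of a representative 64-bit vst[234] intrinsic, not
necessarily the one in the testcase):

  #include <arm_neon.h>

  void
  f (int32_t *ptr, int32x2_t a, int32x2_t b)
  {
    /* The expander concatenates each 64-bit input with zero to form the
       128-bit registers used by the store; after the fix the combinez
       patterns handle that zero directly.  */
    int32x2x2_t v = { { a, b } };
    vst2_s32 (ptr, v);
  }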
gcc/
PR target/89057
* config/aarch64/aarch64-simd.md (aarch64_combine<mode>): Accept
aarch64_simd_reg_or_zero for operand 2. Use the combinez patterns
to handle zero operands.
gcc/testsuite/
PR target/89057
* gcc.target/aarch64/pr89057.c: New test.
aarch64: Extend aarch64-autovec-preference==2 to 128-bit SVE
When compiling with -msve-vector-bits=128, aarch64_preferred_simd_mode
would pass the same vector width to aarch64_simd_container_mode for
both SVE and Advanced SIMD, and so Advanced SIMD would always “win”.
This patch instead makes it choose directly between SVE and Advanced
SIMD modes, so that aarch64-autovec-preference==2 and
aarch64-autovec-preference==4 work for this configuration.
(aarch64-autovec-preference shouldn't affect aarch64_simd_container_mode
because that would have an ABI impact for things like GNU vectors.)
gcc/
* config/aarch64/aarch64.c (aarch64_preferred_simd_mode): Use
aarch64_full_sve_mode and aarch64_vq_mode directly, instead of
going via aarch64_simd_container_mode.
The gdb.Type.name attribute isn't present in GDB 7.6, so we get an
exception from StdPathPrinter._iterator.__next__ trying to use it.
The StdPathPrinter._iterator is already passed the type's name in its
constructor, so we can just store that and use it instead of
gdb.Type.name.
libstdc++-v3/ChangeLog:
* python/libstdcxx/v6/printers.py (StdExpPathPrinter): Store the
name of the type and pass it to the iterator.
(StdPathPrinter): Likewise.
* testsuite/libstdc++-prettyprinters/filesystem-ts.cc: New test.
Jonathan Wakely [Wed, 2 Dec 2020 21:39:08 +0000 (21:39 +0000)]
libstdc++: Fix std::any pretty printer [PR 68735]
This fixes errors seen on powerpc64 (big endian only) due to the
printers for std::any and std::experimental::any being unable to find
the manager function.
libstdc++-v3/ChangeLog:
PR libstdc++/65480
PR libstdc++/68735
* python/libstdcxx/v6/printers.py (function_pointer_to_name):
New helper function to get the name of a function from its
address.
(StdExpAnyPrinter.__init__): Use it.
Richard Biener [Mon, 7 Dec 2020 09:29:07 +0000 (10:29 +0100)]
tree-optimization/98117 - fix range set by vectorization on niter IVs
This avoids the degenerate case of a TYPE_MAX_VALUE latch iteration
count value causing wrong range info for the vector IV. There's
still the case of VF == 1 where if we don't know whether we hit the
above case we cannot emit a range.
2020-12-07 Richard Biener <rguenther@suse.de>
PR tree-optimization/98117
* tree-vect-loop-manip.c (vect_gen_vector_loop_niters):
Properly handle degenerate niter when setting the vector
loop IV range.
Richard Biener [Tue, 3 Nov 2020 14:03:41 +0000 (15:03 +0100)]
tree-optimization/97623 - Avoid PRE hoist insertion iteration
We are not really interested in PRE opportunities exposed by
hoisting but only the other way around. So this moves hoist
insertion after PRE iteration finished and removes hoist
insertion iteration altogether.
It also guards access to NEW_SETS properly.
2020-11-11 Richard Biener <rguenther@suse.de>
PR tree-optimization/97623
* tree-ssa-pre.c (insert): Move hoist insertion after PRE
insertion iteration and do not iterate it.
(create_expression_by_pieces): Guard NEW_SETS access.
(insert_into_preds_of_block): Likewise.
Richard Biener [Fri, 30 Oct 2020 12:32:32 +0000 (13:32 +0100)]
tree-optimization/97623 - avoid excessive insert iteration for hoisting
This avoids requiring insert iteration for back-to-back hoisting
opportunities as seen in the added testcase. For the PR at hand
this halves the number of insert iterations retaining only
the hard to avoid PRE / hoist insert back-to-backs.
2020-10-30 Richard Biener <rguenther@suse.de>
PR tree-optimization/97623
* tree-ssa-pre.c (insert): First do hoist insertion in
a backward walk.
Jakub Jelinek [Sat, 9 Jan 2021 09:49:38 +0000 (10:49 +0100)]
tree-cfg: Allow enum types as result of POINTER_DIFF_EXPR [PR98556]
As conversions between signed integers and signed enums with the same
precision are useless in GIMPLE, it seems strange that we require that
POINTER_DIFF_EXPR result must be INTEGER_TYPE.
If we really wanted to require that, we'd need to change the gimplifier
to ensure that, which isn't the case on the following testcase.
What is going on during the gimplification is that when we have the
(enum T) (p - q) cast, it is stripped through
/* Strip away as many useless type conversions as possible
at the toplevel. */
STRIP_USELESS_TYPE_CONVERSION (*expr_p);
and when the MODIFY_EXPR is gimplified, the *to_p has enum T type,
while *from_p has intptr_t type and as there is no conversion in between,
we just create GIMPLE_ASSIGN from that.
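A hedged sketch of the shape of code involved (hypothetical names; choosing the
enum's underlying type as __PTRDIFF_TYPE__ is my assumption, made so that the
conversion is useless in GIMPLE):

  enum E : __PTRDIFF_TYPE__ { };

  E
  f (char *p, char *q)
  {
    return (E) (p - q);  /* POINTER_DIFF_EXPR whose result ends up with enum type.  */
  }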
2021-01-09 Jakub Jelinek <jakub@redhat.com>
PR c++/98556
* tree-cfg.c (verify_gimple_assign_binary): Allow lhs of
POINTER_DIFF_EXPR to be any integral type.
Patrick Palka [Fri, 8 Jan 2021 15:11:25 +0000 (10:11 -0500)]
c++: ICE with constexpr call that returns a PMF [PR98551]
We shouldn't do replace_result_decl after evaluating a call that returns
a PMF because PMF temporaries aren't wrapped in a TARGET_EXPR (and so we
can't trust ctx->object), and PMF initializers can't be self-referential
anyway, so replace_result_decl would always be a no-op.
To that end, this patch changes the relevant AGGREGATE_TYPE_P test to
CLASS_TYPE_P, which should rule out PMFs (as well as arrays, which we
can't return and therefore won't see here). This fixes an ICE from the
sanity check in replace_result_decl in the below testcase during
constexpr evaluation of the call f() in the initializer g(f()).
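A minimal sketch of that g(f()) shape (hypothetical names, not the new
constexpr-pmf2.C test):

  struct A { void m (); };
  using pmf = void (A::*) ();

  constexpr pmf f () { return &A::m; }            // constexpr call returning a PMF
  constexpr bool g (pmf p) { return p == &A::m; }

  static_assert (g (f ()), "");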
gcc/cp/ChangeLog:
PR c++/98551
* constexpr.c (cxx_eval_call_expression): Check CLASS_TYPE_P
instead of AGGREGATE_TYPE_P before calling replace_result_decl.
gcc/testsuite/ChangeLog:
PR c++/98551
* g++.dg/cpp0x/constexpr-pmf2.C: New test.
Patrick Palka [Fri, 31 Jul 2020 02:21:41 +0000 (22:21 -0400)]
c++: decl_constant_value and unsharing [PR96197]
In the testcase from the PR we're seeing excessive memory use (> 5GB)
during constexpr evaluation, almost all of which is due to the call to
decl_constant_value in the VAR_DECL/CONST_DECL branch of
cxx_eval_constant_expression. We reach here every time we evaluate an
ARRAY_REF of a constexpr VAR_DECL, and from there decl_constant_value
makes an unshared copy of the VAR_DECL's initializer. But unsharing
here is unnecessary because callers of cxx_eval_constant_expression
already unshare its result when necessary.
To fix this excessive unsharing, this patch adds a new defaulted
parameter unshare_p to decl_really_constant_value and
decl_constant_value so that callers can control whether to unshare.
As a simplification, we can also move the call to unshare_expr in
constant_value_1 outside of the loop, since doing unshare_expr on a
DECL_P is a no-op.
Now that we no longer unshare the result of decl_constant_value and
decl_really_constant_value from cxx_eval_constant_expression, memory use
during constexpr evaluation for the testcase from the PR falls from ~5GB
to 15MB according to -ftime-report.
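A rough sketch of the kind of pattern involved (hypothetical and much smaller
than the PR's testcase): each table[i] below is an ARRAY_REF of a constexpr
VAR_DECL, and before the patch every evaluation of it unshared the whole
initializer.

  constexpr int table[256] = { 1, 2, 3 };  // remaining elements are zero

  constexpr int
  sum ()
  {
    int s = 0;
    for (int i = 0; i < 256; ++i)
      s += table[i];
    return s;
  }

  static_assert (sum () == 6, "");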
gcc/cp/ChangeLog:
PR c++/96197
* constexpr.c (cxx_eval_constant_expression) <case CONST_DECL>:
Pass false to decl_constant_value and decl_really_constant_value
so that they don't unshare their result.
* cp-tree.h (decl_constant_value): New declaration with an added
bool parameter.
(decl_really_constant_value): Add bool parameter defaulting to
true to existing declaration.
* init.c (constant_value_1): Add bool parameter which controls
whether to unshare the initializer before returning. Call
unshare_expr at most once.
(scalar_constant_value): Pass true to constant_value_1's new
bool parameter.
(decl_really_constant_value): Add bool parameter and forward it
to constant_value_1.
(decl_constant_value): Likewise, but instead define a new
overload with an added bool parameter.
gcc/testsuite/ChangeLog:
PR c++/96197
* g++.dg/cpp1y/constexpr-array8.C: New test.
Iain Sandoe [Wed, 6 Jan 2021 19:40:45 +0000 (19:40 +0000)]
testsuite, coroutines : Fix a bad testcase [PR96504].
Where possible (i.e. where that doesn't alter the intent of a test) we
use a suspend_always as the final suspend and a test that the coroutine
was 'done' to check that the state machine had terminated correctly.
Sometimes, filed PRs have 'suspend_never' as the final suspend expression
and that needs to be changed to match the testsuite style. This is one
I missed and means that the call to 'done()' on the handle is made to an
already-destructed coroutine. Surprisingly, that didn't actually trigger
a failure until glibc 2.32.
Fixed by changing the final suspend to be 'suspend_always'.
gcc/testsuite/ChangeLog:
PR c++/96504
* g++.dg/coroutines/torture/pr95519-05-gro.C: Use suspend_always
as the final suspend point so that we can check that the state
machine has reached the expected point.
This was a bad testcase, found with -fsanitize=address; the final suspend
is 'suspend never' which flows off the end of the coroutine destroying
the promise and the frame. At that point access via the handle is an
error. Fixed by checking that the promise is destroyed via a global var.
gcc/testsuite/ChangeLog:
* g++.dg/coroutines/torture/co-ret-17-void-ret-coro.C: Check for
promise destruction via a global variable.
Paul Thomas [Sun, 2 Aug 2020 09:35:36 +0000 (10:35 +0100)]
This patch fixes PR96325. See the explanatory comment in the testcase.
2020-08-02 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/96325
* primary.c (gfc_match_varspec): In the case that a component
reference is added to an intrinsic type component, emit the
error message in this function.
Paul Thomas [Sat, 26 Dec 2020 15:08:11 +0000 (15:08 +0000)]
Fix failures with -m32 and some memory leaks.
2020-12-23 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/83118
* trans-array.c (gfc_alloc_allocatable_for_assignment): Make
sure that class expressions are captured for dummy arguments by
use of gfc_get_class_from_gfc_expr otherwise the wrong vptr is
used.
* trans-expr.c (gfc_get_class_from_gfc_expr): New function.
(gfc_get_class_from_expr): If a constant expression is
encountered, return NULL_TREE;
(gfc_trans_assignment_1): Deallocate rhs allocatable components
after passing derived type function results to class lhs.
* trans.h : Add prototype for gfc_get_class_from_gfc_expr.
Paul Thomas [Fri, 18 Dec 2020 14:00:11 +0000 (14:00 +0000)]
As well as the PR this patch fixes problems in handling class objects
2020-12-18 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/83118
PR fortran/96012
* resolve.c (resolve_ordinary_assign): Generate a vtable if
necessary for scalar non-polymorphic rhs's to unlimited lhs's.
* trans-array.c (get_class_info_from_ss): New function.
(gfc_trans_allocate_array_storage): Defer obtaining class
element type until all sources of class exprs are tried. Use
class API rather than TREE_OPERAND. Look for class expressions
in ss->info by calling get_class_info_from_ss. After, obtain
the element size for class descriptors. Where the element type
is unknown, cast the data as character(len=size) to overcome
unlimited polymorphic problems.
(gfc_conv_ss_descriptor): Do not fix class variable refs.
(build_class_array_ref, structure_alloc_comps): Replace code
replicating the new function gfc_resize_class_size_with_len.
(gfc_alloc_allocatable_for_assignment): Obtain element size
for lhs in cases of deferred characters and class entities.
Move code for the element size of rhs to start of block. Clean
up extraction of class parameters throughout this function.
After the shape check test whether or not the lhs and rhs
element sizes are the same. Use earlier evaluation of
'cond_null'. Reallocation of lhs only to happen if size changes
or element size changes.
* trans-expr.c (gfc_resize_class_size_with_len): New function.
(gfc_get_class_from_expr): If a constant expression is
encountered, return NULL_TREE;
(trans_scalar_class_assign): New function.
(gfc_conv_procedure_call): Ensure the vtable is present for
passing a non-class actual to an unlimited formal.
(trans_class_vptr_len_assignment): For expressions of type
BT_CLASS, extract the class expression if necessary. Use a
statement block outside the loop body. Ensure that 'rhs' is
of the correct type. Obtain rhs vptr in all circumstances.
(gfc_trans_scalar_assign): Call trans_scalar_class_assign to
make maximum use of the vptr copy in place of assignment.
(trans_class_assignment): Actually do reallocation if needed.
(gfc_trans_assignment_1): Simplify some of the logic with
'realloc_flag'. Set 'vptr_copy' for all array assignments to
unlimited polymorphic lhs.
* trans.c (gfc_build_array_ref): Call gfc_resize_class_size_with_len
to correct span for unlimited polymorphic decls.
* trans.h : Add prototype for gfc_resize_class_size_with_len.
gcc/testsuite/
PR fortran/83118
PR fortran/96012
* gfortran.dg/dependency_60.f90: New test.
* gfortran.dg/class_allocate_25.f90: New test.
* gfortran.dg/class_assign_4.f90: New test.
* gfortran.dg/unlimited_polymorphic_32.f03: New test.
Jakub Jelinek [Tue, 5 Jan 2021 15:37:40 +0000 (16:37 +0100)]
reassoc: Fix reassociation on 32-bit hosts with > 32767 bbs [PR98514]
Apparently reassoc ICEs on large functions (more than 32767 basic blocks
with something to reassociate in those).
The problem is that the pass uses long type to store the ranks, and
the bb ranks are (number of SSA_NAMEs with default defs + 2 + bb->index) << 16,
so with many basic blocks we overflow the ranks and we then have assertions
rank is not negative.
The following patch just uses int64_t instead of long in the pass.
Yes, it means slightly higher memory consumption (one array indexed by
bb->index is twice as large, and one hash_map from trees to the ranks
will grow by 50%), but I think it is better than punting on the
reassociation of large functions on 32-bit hosts and making it
inconsistent e.g. when cross-compiling. Given vec.h uses unsigned for
vector element counts,
we don't really support more than 4G of SSA_NAMEs or more than 2G of basic
blocks in a function, so even with the << 16 we can't really overflow the
int64_t rank counters.
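A quick sanity check of the overflow (my arithmetic, not from the original
message): even with no default-definition SSA_NAMEs, a basic block index of
32767 gives (0 + 2 + 32767) << 16 = 2147549184, which already exceeds the
32-bit LONG_MAX of 2147483647, so the rank wraps negative.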
2021-01-05 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/98514
* tree-ssa-reassoc.c (bb_rank): Change type from long * to
int64_t *.
(operand_rank): Change type from hash_map<tree, long> to
hash_map<tree, int64_t>.
(phi_rank): Change return type from long to int64_t.
(loop_carried_phi): Change block_rank variable type from long to
int64_t.
(propagate_rank): Change return type, rank parameter type and
op_rank variable type from long to int64_t.
(find_operand_rank): Change return type from long to int64_t
and change slot variable type from long * to int64_t *.
(insert_operand_rank): Change rank parameter type from long to
int64_t.
(get_rank): Change return type and rank variable type from long to
int64_t. Use PRId64 instead of ld to print the rank.
(init_reassoc): Change rank variable type from long to int64_t
and adjust correspondingly bb_rank and operand_rank initialization.
Jakub Jelinek [Thu, 31 Dec 2020 10:06:56 +0000 (11:06 +0100)]
wide-int: Fix wi::to_mpz [PR98474]
The following testcase is miscompiled, because niter analysis miscomputes
the number of iterations to 0.
The problem is that niter analysis uses mpz_t (wonder why, wouldn't
widest_int do the same job?) and when wi::to_mpz is called e.g. on the
TYPE_MAX_VALUE of __uint128_t, it initializes the mpz_t result with the
wrong value.
wi::to_mpz has code to handle negative wide_ints in signed types by
inverting all bits, importing to mpz and complementing it, which is fine,
but doesn't handle correctly the case when the wide_int's len (times
HOST_BITS_PER_WIDE_INT) is smaller than precision when wi::neg_p.
E.g. the 0xffffffffffffffffffffffffffffffff TYPE_MAX_VALUE is represented
in wide_int as 0xffffffffffffffff len 1, and wi::to_mpz would create
0xffffffffffffffff mpz_t value from that.
This patch handles it by adding the needed -1 host wide int words (and
also has code to deal with precisions that aren't a multiple of
HOST_BITS_PER_WIDE_INT).
2020-12-31 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/98474
* wide-int.cc (wi::to_mpz): If wide_int has MSB set, but type
is unsigned and excess negative, append set bits after len until
precision.
Jakub Jelinek [Mon, 21 Dec 2020 23:01:34 +0000 (00:01 +0100)]
gimplify: Gimplify value in gimplify_init_ctor_eval_range [PR98353]
gimplify_init_ctor_eval_range wasn't gimplifying value, so if it wasn't
a gimple val, verification at the end of gimplification would ICE (or with
release checking some random pass later on would ICE or misbehave).
2020-12-21 Jakub Jelinek <jakub@redhat.com>
PR c++/98353
* gimplify.c (gimplify_init_ctor_eval_range): Gimplify value before
storing it into cref.
Jakub Jelinek [Mon, 21 Dec 2020 07:59:05 +0000 (08:59 +0100)]
openmp: Fix up handling of addressable temporaries in simd lb, b and incr expressions [PR98383]
For simd, we have code to artificially add locally defined variables into
private clauses if they are addressable, so that omplower turns them into
"omp simd array" variables. As the testcase shows, this is undesirable if
those temporaries only show in the lb, b or incr expressions and nowhere else,
if it is just used there, we really want normal scalar temporaries.
This patch implements that by making sure we don't set for those GOVD_LOCAL-ish
temporaries turned into GOVD_PRIVATE the GOVD_SEEN flag during gimplification
of the lb, b and incr expressions, which means that the private clause isn't
added for those.
2020-12-21 Jakub Jelinek <jakub@redhat.com>
PR c++/98383
* gimplify.c (struct gimplify_omp_ctx): Add in_for_exprs flag.
(gimple_add_tmp_var): For addressable temporaries appearing in
simd lb, b or incr expressions, don't add a private clause unless
it is seen also outside of those expressions in the simd body.
(omp_notice_variable): Likewise.
(gimplify_omp_for): Set and reset in_for_exprs around gimplification
of lb, b or incr expressions.
Jakub Jelinek [Fri, 18 Dec 2020 20:43:20 +0000 (21:43 +0100)]
openmp: Don't optimize shared to firstprivate on task with depend clause
The attached testcase is miscompiled, because we optimize shared clauses
to firstprivate when the task body can't modify the variable, even when the
task has a depend clause. That is wrong, because firstprivate means the
variable will be copied immediately when the task is created, while with
depend clause some other task might change it later before the dependencies
are satisfied and the task should observe the value only after the change.
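A hedged sketch of the problematic shape (hypothetical, not the attached
testcase): if the second task's shared(x) were turned into firstprivate(x),
x would be copied when that task is created, possibly before the first task
has run.

  #include <stdlib.h>

  int
  main ()
  {
    int x = 1;
    #pragma omp parallel
    #pragma omp single
    {
      #pragma omp task depend(out: x) shared(x)
      x = 2;
      #pragma omp task depend(in: x) shared(x)  /* must observe x == 2 */
      if (x != 2)
        abort ();
      #pragma omp taskwait
    }
    return 0;
  }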
2020-12-18 Jakub Jelinek <jakub@redhat.com>
* gimplify.c (struct gimplify_omp_ctx): Add has_depend member.
(gimplify_scan_omp_clauses): Set it to true if OMP_CLAUSE_DEPEND
appears on OMP_TASK.
(gimplify_adjust_omp_clauses_1, gimplify_adjust_omp_clauses): Force
GOVD_WRITTEN on shared variables if task construct has depend clause.
Jakub Jelinek [Sat, 12 Dec 2020 07:36:02 +0000 (08:36 +0100)]
openmp, openacc: Fix up handling of data regions [PR98183]
While the data regions (target data and OpenACC counterparts) aren't
standalone directives, unlike most other OpenMP/OpenACC constructs
we allow (apparently as an extension) exceptions and goto out of
the block. During gimplification we place an *end* call into a finally
block so that it is reached even on exceptions or goto out, etc.
During omplower pass we then add paired #pragma omp return for them,
but due to the exceptions because the region is not SESE we can end up
with #pragma omp return appearing only conditionally in the CFG etc.,
which the ompexp pass can't handle.
For the ompexp pass, we actually don't care about the end part or about
target data nesting, so we can treat it as a standalone directive.
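A hedged sketch of the kind of control flow involved (hypothetical, not the
pr98183 tests): the goto leaves the data region early, so the region is not
single-entry single-exit and the paired OMP_RETURN can end up appearing only
conditionally in the CFG.

  void
  f (int *a, int n, int flag)
  {
    #pragma omp target data map(tofrom: a[0:n])
    {
      if (flag)
        goto out;  /* leaving the data region is allowed as an extension */
      a[0] = 1;
    }
   out:;
  }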
2020-12-12 Jakub Jelinek <jakub@redhat.com>
PR middle-end/98183
* omp-low.c (lower_omp_target): Don't add OMP_RETURN for
data regions.
* omp-expand.c (expand_omp_target): Don't try to remove
OMP_RETURN for data regions.
(build_omp_regions_1, omp_make_gimple_edges): Don't expect
OMP_RETURN for data regions.
* gcc.dg/gomp/pr98183.c: New test.
* gcc.dg/goacc/pr98183.c: New test.
Jakub Jelinek [Tue, 8 Dec 2020 14:44:10 +0000 (15:44 +0100)]
i386: Fix up X87_ENABLE_{FLOAT,ARITH} in conditions [PR94440]
The documentation says
For a named pattern, the condition may not depend on the data in
the insn being matched, but only the target-machine-type flags.
The i386 backend violates that by using flag_excess_precision and
flag_unsafe_math_optimizations in the conditions too, which is bad
when optimize attribute or pragmas are used. The problem is that the
middle-end caches the enabled conditions for the optabs for a particular
switchable target, but multiple functions can share the same
TARGET_OPTION_NODE, but have different TREE_OPTIMIZATION_NODE with different
flag_excess_precision or flag_unsafe_math_optimizations, so the enabled
conditions then match only one of those.
I think best would be to just have a single options node for both the
generic and target options, then such problems wouldn't exist, but that
would be very risky at this point and quite large change.
So, instead the following patch just shadows flag_excess_precision and
flag_unsafe_math_optimizations values for uses in the instruction conditions
in TargetVariable and during set_cfun artificially creates new
TARGET_OPTION_NODE if flag_excess_precision and/or
flag_unsafe_math_optimizations change from what is recorded in their
TARGET_OPTION_NODE. The target nodes are hashed, so worst case we can get 4
times as many target option nodes if one would for each unique target option
try all the flag_excess_precision and flag_unsafe_math_optimizations values.
2020-12-08 Jakub Jelinek <jakub@redhat.com>
PR target/94440
* config/i386/i386.opt (ix86_excess_precision,
ix86_unsafe_math_optimizations): New TargetVariables.
* config/i386/i386.h (X87_ENABLE_ARITH, X87_ENABLE_FLOAT): Use
ix86_unsafe_math_optimizations instead of
flag_unsafe_math_optimizations and ix86_excess_precision instead of
flag_excess_precision.
* config/i386/i386.c (ix86_excess_precision): Rename to ...
(ix86_get_excess_precision): ... this.
(TARGET_C_EXCESS_PRECISION): Define to ix86_get_excess_precision.
* config/i386/i386-options.c (ix86_valid_target_attribute_tree,
ix86_option_override_internal): Update ix86_unsafe_math_optimizations
from flag_unsafe_math_optimizations and ix86_excess_precision
from flag_excess_precision when constructing target option nodes.
(ix86_set_current_function): If flag_unsafe_math_optimizations
or flag_excess_precision is different from the one recorded
in TARGET_OPTION_NODE, create a new target option node for the
current function and switch to that.
Jakub Jelinek [Tue, 8 Dec 2020 09:45:30 +0000 (10:45 +0100)]
openmp: -fopenmp-simd fixes [PR98187]
This patch fixes two bugs in the -fopenmp-simd support. One is that
in C++ #pragma omp parallel master would actually create OMP_PARALLEL
in the IL, which is a big no-no for -fopenmp-simd, we should be creating
only the constructs -fopenmp-simd handles (mainly OMP_SIMD, OMP_LOOP which
is gimplified as simd in that case, declare simd/reduction and ordered simd).
The other bug was that #pragma omp master taskloop simd combined construct
contains simd and thus should be recognized as #pragma omp simd (with only
the simd applicable clauses), but as master wasn't included in
omp_pragmas_simd, we'd ignore it completely instead.
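A hedged sketch of that second case (hypothetical): compiled with
-fopenmp-simd (without -fopenmp), the combined construct below should be
treated as "#pragma omp simd" with only the simd-applicable clauses rather
than being ignored outright.

  void
  f (int *a, int n)
  {
    #pragma omp master taskloop simd
    for (int i = 0; i < n; i++)
      a[i] += 1;
  }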
* parser.c (cp_parser_omp_parallel): For parallel master with
-fopenmp-simd only, just call cp_parser_omp_master instead of
wrapping it in OMP_PARALLEL.
Jakub Jelinek [Sat, 5 Dec 2020 00:30:08 +0000 (01:30 +0100)]
c++: Fix constexpr access to union member through pointer-to-member [PR98122]
We currently incorrectly reject the first testcase, because
cxx_fold_indirect_ref_1 doesn't attempt to handle UNION_TYPEs.
As the second testcase shows, it isn't that easy, because I believe we need
to take into account the active member and prefer that active member over
other members, because if we pick a non-active one, we might reject valid
programs.
2020-12-05 Jakub Jelinek <jakub@redhat.com>
PR c++/98122
* constexpr.c (cxx_union_active_member): New function.
(cxx_fold_indirect_ref_1): Add ctx argument, pass it through to
recursive call. Handle UNION_TYPE.
(cxx_fold_indirect_ref): Add ctx argument, pass it to recursive calls
and cxx_fold_indirect_ref_1.
(cxx_eval_indirect_ref): Adjust cxx_fold_indirect_ref calls.
* g++.dg/cpp1y/constexpr-98122.C: New test.
* g++.dg/cpp2a/constexpr-98122.C: New test.
Jakub Jelinek [Fri, 4 Dec 2020 11:18:21 +0000 (12:18 +0100)]
debug: Fix another vector DECL_MODE ICE [PR98100]
The PR88587 fix changes DECL_MODE of vars with vector type during inlining/cloning
when the vars are copied, so that their DECL_MODE matches their TYPE_MODE in
the new function. Unfortunately, the following testcase still ICEs, the var
isn't really used in the new function and so it isn't copied, but becomes
just a nonlocalized var. So we can't adjust its DECL_MODE because it
appears in multiple functions and needs different modes in between them.
The following patch changes the DEBUG_INSN creation to use TYPE_MODE instead
of DECL_MODE for vars with vector types.
2020-12-04 Jakub Jelinek <jakub@redhat.com>
PR target/98100
* cfgexpand.c (expand_gimple_basic_block): For vars with
vector type, use TYPE_MODE rather than DECL_MODE.
Jakub Jelinek [Wed, 2 Dec 2020 23:29:46 +0000 (00:29 +0100)]
dwarf2out: Fix up add_scalar_info not to create invalid DWARF
As discussed in https://sourceware.org/bugzilla/show_bug.cgi?id=26987 ,
for very large bounds (which don't fit into HOST_WIDE_INT) GCC emits invalid
DWARF.
In DWARF2, DW_AT_{lower,upper}_bound were constant reference class.
In DWARF3 they are block constant reference and the
Static and Dynamic Properties of Types
chapter says:
"For a block, the value is interpreted as a DWARF expression; evaluation of the expression
yields the value of the attribute."
In DWARF4/5 they are constant exprloc reference class.
Now, for add_AT_wide we use DW_FORM_data16 (valid in constant class)
when -gdwarf-5, but otherwise just use DW_FORM_block1, which is not constant
class, but block.
For DWARF3 this means emitting clearly invalid DWARF, because the
DW_FORM_block1 should contain a DWARF expression, not random bytes
containing the constant directly.
For DWARF2/DWARF4/5 it could be considered a GNU extension, but a very badly
designed one when it means something different in DWARF3.
The following patch uses add_AT_wide only if we know we'll be using
DW_FORM_data16, and otherwise wastes 2 extra bytes and emits in there
DW_OP_implicit_value <size> before the constant.
2020-12-03 Jakub Jelinek <jakub@redhat.com>
* dwarf2out.c (add_scalar_info): Only use add_AT_wide for 128-bit
constants and only in dwarf-5 or later, where DW_FORM_data16 is
available. Otherwise use DW_FORM_block*/DW_FORM_exprloc with
DW_OP_implicit_value to describe the constant.
Scott Snyder [Wed, 2 Dec 2020 14:42:56 +0000 (15:42 +0100)]
vec.h: Fix GCC build with -std=gnu++20 [PR98059]
Apparently vec.h doesn't build with -std=c++20/gnu++20, since the
DR2237 r11-532 change.
template <typename T>
class auto_delete_vec
{
private:
auto_delete_vec<T> (const auto_delete_vec<T> &) = delete;
};
which vec.h uses is invalid C++20; one needs to use
auto_delete_vec (const auto_delete_vec &) = delete;
instead which is valid all the way back to C++11 (and without = delete
to C++98).
2020-12-02 Scott Snyder <sss@li-snyder.org>
PR plugins/98059
* vec.h (auto_delete_vec): Use
DISABLE_COPY_AND_ASSIGN(auto_delete_vec) instead of
DISABLE_COPY_AND_ASSIGN(auto_delete_vec<T>) to make it valid C++20
after DR2237.
Jakub Jelinek [Tue, 1 Dec 2020 20:41:44 +0000 (21:41 +0100)]
openmp: Avoid ICE on depend clause on depobj OpenMP construct [PR98072]
Since r11-5430 we ICE on the following testcase. When parsing the depobj
directive we don't really use cp_parser_omp_all_clauses routine which ATM
disables generation of location wrappers and the newly added assertion
that there are no location wrappers thus triggers.
Fixed by adding the location wrappers suppression sentinel.
Longer term, we should handle location wrappers inside of OpenMP clauses.
Jakub Jelinek [Tue, 1 Dec 2020 09:44:40 +0000 (10:44 +0100)]
x86_64: Fix up -fpic -mcmodel=large -fno-plt [PR98063]
On the following testcase with -fpic -mcmodel=large -fno-plt we emit
call puts@GOTPCREL(%rip)
but that is not really appropriate for CM_LARGE_PIC, the .text can be larger
than 2GB in that case and the .got slot further away from %rip than what can
fit into the signed 32-bit immediate.
The following patch computes the address of the .got slot the way it is
computed for that model for function pointer loads, and calls that.
Marek Polacek [Tue, 5 Jan 2021 21:33:07 +0000 (16:33 -0500)]
c++: ICE with deferred noexcept when deducing targs [PR82099]
In this test we ICE in type_throw_all_p because it got a deferred
noexcept which it shouldn't. Here's the story:
In noexcept61.C, we call bar, so we perform overload resolution. When
adding the (only) candidate, we need to deduce template arguments, so
call fn_type_unification as usually. That deduces U to
void (*) (int &, int &)
which is correct, but its noexcept-spec is deferred_noexcept. Then
we call add_function_candidate (bar), wherein we try to create an
implicit conversion sequence for every argument. Since baz<int> is
of unknown type, we instantiate_type it; it is a TEMPLATE_ID_EXPR
so that calls resolve_address_of_overloaded_function. But we crash
there, because target_type contains the deferred_noexcept.
So we need to maybe_instantiate_noexcept before we can compare types.
resolve_overloaded_unification seemed like the appropriate spot, now
fn_type_unification produces the function type with its noexcept-spec
instantiated. This shouldn't go against CWG 1330 because here we
really need to instantiate the noexcept-spec.
This also fixes class-deduction76.C, a dg-ice test I recently added,
therefore this fix also fixes c++/90799, yay.
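A rough, hypothetical sketch of the shape involved (not the actual
noexcept61.C): baz<int>'s noexcept-spec is still deferred when bar's template
argument is deduced from it.

  template <typename T>
  void baz (T &, T &) noexcept (sizeof (T) < 4) { }

  template <typename U>
  void bar (U) { }

  void
  use ()
  {
    bar (baz<int>);  /* deduces U as void (*) (int &, int &) */
  }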
gcc/cp/ChangeLog:
PR c++/82099
* pt.c (resolve_overloaded_unification): Call
maybe_instantiate_noexcept after instantiating the function
decl.
gcc/testsuite/ChangeLog:
PR c++/82099
* g++.dg/cpp0x/noexcept61.C: New test.
Marek Polacek [Sat, 24 Oct 2020 19:26:27 +0000 (15:26 -0400)]
c++: Prevent warnings for value-dependent exprs [PR96742]
Here, in r11-155, I changed the call to uses_template_parms to
type_dependent_expression_p_push to avoid a crash in C++98 in
value_dependent_expression_p on a non-constant expression. But that
prompted a host of complaints that we now warn for value-dependent
expressions in templates. Those warnings are technically valid, but
people still don't want them because they're awkward to avoid. This
patch uses value_dependent_expression_p or type_dependent_expression_p.
But make sure that we don't ICE in value_dependent_expression_p by
checking potential_constant_expression first.
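A hedged sketch of the kinds of value-dependent expressions that should no
longer warn inside templates (illustrative only, not the new tests):

  template <int N> int f (int x) { return x / N; }    // no -Wdiv-by-zero for N == 0
  template <unsigned N> bool g () { return N >= 0; }  // no -Wtype-limits
  template <int N> bool h () { return N == N; }       // no -Wtautological-compare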
gcc/cp/ChangeLog:
PR c++/96675
PR c++/96742
* pt.c (tsubst_copy_and_build): Call value_dependent_expression_p or
type_dependent_expression_p instead of type_dependent_expression_p_push.
But only call value_dependent_expression_p for expressions that are
potential_constant_expression.
gcc/testsuite/ChangeLog:
PR c++/96675
PR c++/96742
* g++.dg/warn/Wdiv-by-zero-3.C: Turn dg-warning into dg-bogus.
* g++.dg/warn/Wtautological-compare3.C: New test.
* g++.dg/warn/Wtype-limits5.C: New test.
* g++.old-deja/g++.pt/crash10.C: Remove dg-warning.
Marek Polacek [Tue, 1 Dec 2020 15:39:08 +0000 (10:39 -0500)]
c++: Fix ICE with inline variable in template [PR97975]
In this test, we have
static inline const int c = b;
in a class template, and we call store_init_value as usual. There, the
value is
IMPLICIT_CONV_EXPR<const float>(b)
which is is_nondependent_static_init_expression but isn't
is_nondependent_constant_expression (they only differ in STRICT).
We call fold_non_dependent_expr, but that just returns the expression
because it only instantiates is_nondependent_constant_expression
expressions. Since we're not checking the initializer of a constexpr
variable, we go on to call maybe_constant_init, whereupon we crash
because it tries to evaluate all is_nondependent_static_init_expression
expressions, which our value is, but it still contains a template code.
I think the fix is to call fold_non_dependent_init instead of
maybe_constant_init, and only call fold_non_dependent_expr on the
"this is a constexpr variable" path so as to avoid instantiating twice
in a row. Outside a template this should also avoid evaluating the
value twice.
gcc/cp/ChangeLog:
PR c++/97975
* constexpr.c (fold_non_dependent_init): Add a tree parameter.
Use it.
* cp-tree.h (fold_non_dependent_init): Add a tree parameter with
a default value.
* typeck2.c (store_init_value): Call fold_non_dependent_expr
only when checking the initializer for constexpr variables.
Call fold_non_dependent_init instead of maybe_constant_init.
gcc/testsuite/ChangeLog:
PR c++/97975
* g++.dg/cpp1z/inline-var8.C: New test.
Marek Polacek [Wed, 2 Dec 2020 15:47:49 +0000 (10:47 -0500)]
c++: ICE with switch and scoped enum bit-fields [PR98043]
In this testcase we are crashing trying to gimplify a switch, because
the types of the switch condition and case constants have different
TYPE_PRECISIONs.
This started with my r5-3726 fix: SWITCH_STMT_TYPE is supposed to be the
original type of the switch condition before any conversions, so in the
C++ FE we need to use unlowered_expr_type to get the unlowered type of
enum bit-fields.
Normally, the switch type is subject to integral promotions, but here
we have a scoped enum type and those don't promote:
enum class B { A };
struct C { B c : 8; };
switch (x.c) // type B
case B::A: // type int, will be converted to B
Here TREE_TYPE is "signed char" but SWITCH_STMT_TYPE is "B". When
gimplifying this in gimplify_switch_expr, the index type is "B" and
we convert all the case values to "B" in preprocess_case_label_vec,
but SWITCH_COND is of type "signed char": gimple_switch_index should
be the (possibly promoted) type, not the original type, so we gimplify
the "x.c" SWITCH_COND to a SSA_NAME of type "signed char". And then
we crash because the precision of the index type doesn't match the
precision of the case value type.
I think it makes sense to do the following; at the end of pop_switch
we've already issued the switch warnings, and since scoped enums don't
promote, it should be okay to use the type of SWITCH_STMT_COND. The
r5-3726 change was about giving warnings for enum bit-fields anyway.
gcc/cp/ChangeLog:
PR c++/98043
* decl.c (pop_switch): If SWITCH_STMT_TYPE is a scoped enum type,
set it to the type of SWITCH_STMT_COND.
Marek Polacek [Wed, 2 Dec 2020 19:33:13 +0000 (14:33 -0500)]
c++: ICE with -fsanitize=vptr and constexpr dynamic_cast [PR98103]
-fsanitize=vptr initializes all vtable pointers to null so that it can
catch invalid calls; see cp_ubsan_maybe_initialize_vtbl_ptrs. That
means that evaluating a vtable reference can produce a null pointer
in this mode, so cxx_eval_dynamic_cast_fn should check that and give
an error.
gcc/cp/ChangeLog:
PR c++/98103
* constexpr.c (cxx_eval_dynamic_cast_fn): If the evaluating of vtable
yields a null pointer, give an error and return. Use objtype.
Marek Polacek [Tue, 5 Jan 2021 21:27:30 +0000 (16:27 -0500)]
c++: Fix wrong error with constexpr destructor [PR97427]
When I implemented the code to detect modifying const objects in
constexpr contexts, we couldn't have constexpr destructors, so I didn't
consider them. But now we can and that caused a bogus error in this
testcase: [class.dtor]p5 says that "const and volatile semantics are not
applied on an object under destruction. They stop being in effect when
the destructor for the most derived object starts." so we have to clear
the TREE_READONLY flag we set on the object after the constructors have
been called to mark it as no-longer-under-construction. In the ~Foo
call it's now an object under destruction, so don't report those errors.
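A minimal sketch of the previously rejected pattern (hypothetical member
names, not constexpr-dtor10.C): the store in ~Foo happens while the object is
under destruction, so its constness is no longer in effect.

  struct Foo
  {
    int i;
    constexpr Foo () : i (0) { }
    constexpr ~Foo () { i = 42; }  // OK: object under destruction
  };

  constexpr bool
  f ()
  {
    const Foo foo;
    return true;
  }

  static_assert (f ());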
gcc/cp/ChangeLog:
PR c++/97427
* constexpr.c (cxx_set_object_constness): New function.
(cxx_eval_call_expression): Set new_obj for destructors too.
Call cxx_set_object_constness to set/unset TREE_READONLY of
the object under construction/destruction.
gcc/testsuite/ChangeLog:
PR c++/97427
* g++.dg/cpp2a/constexpr-dtor10.C: New test.
Uros Bizjak [Tue, 5 Jan 2021 13:42:29 +0000 (14:42 +0100)]
i386: Prevent spurious FP exceptions with _mm_cvt{,t}ps_pi32 [PR98522]
Prevent spurious FP exceptions with _mm_cvt{,t}ps_pi32 for TARGET_MMX_WITH_SSE
by clearing the top 64 bits of the input XMM register.
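For reference, a minimal use of one of the affected intrinsics (a sketch, not
the new pr98522.c test): only the two low floats are converted, so whatever
garbage sits in the upper half of x must not raise exceptions.

  #include <xmmintrin.h>

  __m64
  f (__m128 x)
  {
    return _mm_cvtps_pi32 (x);  /* converts the low two floats to 32-bit ints */
  }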
2021-01-05 Uroš Bizjak <ubizjak@gmail.com>
gcc/
PR target/98522
* config/i386/sse.md (sse_cvtps2pi): Redefine as define_insn_and_split.
Clear the top 64 bits of the input XMM register.
(sse_cvttps2pi): Ditto.
gcc/testsuite
PR target/98522
* gcc.target/i386/pr98522.c: New test.
Harald Anlauf [Fri, 1 Jan 2021 17:55:41 +0000 (18:55 +0100)]
PR fortran/96381 - invalid read in gfc_find_derived_vtab
An invalid declaration of a CLASS instance can lead to an internal state
with inconsistent attributes during parsing that needs to be handled with
sufficient care when processing subsequent statements. Avoid a lookup of
the vtab entry for such cases.
gcc/fortran/ChangeLog:
* class.c (gfc_find_vtab): Add check on attribute is_class.
Iain Sandoe [Sat, 17 Oct 2020 10:21:11 +0000 (11:21 +0100)]
coroutines: Emit error for invalid promise return types [PR97438].
At one stage, use cases were proposed for allowing the promise
type to contain both return_value and return_void. That was
not accepted into C++20, so we should reject it as per the PR.
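A hedged sketch of the now-rejected shape (hypothetical type names, not the
pr97438.C test): the promise type provides both return_void and return_value,
which should be diagnosed.

  #include <coroutine>

  struct task
  {
    struct promise_type
    {
      task get_return_object () { return {}; }
      std::suspend_never initial_suspend () { return {}; }
      std::suspend_never final_suspend () noexcept { return {}; }
      void unhandled_exception () { }
      void return_void () { }
      void return_value (int) { }  // having both is now an error
    };
  };

  task f () { co_return; }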
gcc/cp/ChangeLog:
PR c++/97438
* coroutines.cc (struct coroutine_info): Add a field to
record that we emitted a promise type error.
(coro_promise_type_found_p): Check for the case that the
promise type contains both return_void and return_value.
Emit an error if so, with information about the wrong
type methods.
gcc/testsuite/ChangeLog:
PR c++/97438
* g++.dg/coroutines/pr97438.C: New test.
Iain Sandoe [Wed, 18 Nov 2020 10:06:03 +0000 (10:06 +0000)]
Darwin : Update libtool and dependencies for Darwin20 [PR97865]
The change in major version (and the increment from Darwin19 to 20)
caused libtool tests to fail which resulted in incorrect build settings
for shared libraries.
We take this opportunity to sort out the shared undefined symbols state
rather than propagating the current unsound behaviour into a new rev.
This change means that we default to the case that missing symbols are
considered an error, and if one wants to allow this intentionally, the
configuration for that case should be set appropriately.
Three existing cases need undefined dynamic lookup:
libitm, where there is already a configuration mechanism to add the
flags.
libcc1, where we add simple configuration to add the flags for Darwin.
libsanitizer, where we can add to the existing extra flags.
libcc1/ChangeLog:
PR target/97865
* Makefile.am: Add dynamic_lookup to LD flags for Darwin.
* configure.ac: Test for Darwin host and set a flag.
* Makefile.in: Regenerate.
* configure: Regenerate.
libitm/ChangeLog:
PR target/97865
* configure.tgt: Add dynamic_lookup to XLDFLAGS for Darwin.
* configure: Regenerate.
libsanitizer/ChangeLog:
PR target/97865
* configure.tgt: Add dynamic_lookup to EXTRA_CXXFLAGS for
Darwin.
* configure: Regenerate.
ChangeLog:
PR target/97865
* libtool.m4: Update handling of Darwin platform link flags
for Darwin20.
gcc/ChangeLog:
PR target/97865
* configure: Regenerate.
libatomic/ChangeLog:
PR target/97865
* configure: Regenerate.
libbacktrace/ChangeLog:
PR target/97865
* configure: Regenerate.
libffi/ChangeLog:
PR target/97865
* configure: Regenerate.
libgfortran/ChangeLog:
PR target/97865
* configure: Regenerate.
libgomp/ChangeLog:
PR target/97865
* configure: Regenerate.
* Makefile.in: Update copyright years.