git.ipfire.org Git - thirdparty/gcc.git/log

lto: Ensure we force a change for file/line/column after clear_line_info

As discussed yesterday:
On the streamer out side, we call clear_line_info
in multiple spots which resets the current_* values to something, but on the
reader side, we don't have corresponding resets in the same location, just have
the stream_* static variables that keep the current values through the
entire stream in (so across all the clear_line_info spots in a single LTO
object but also across jumping from one LTO object to another one).
Now, in an earlier version of my patch it actually broke LTO bootstrap
(and a lot of LTO testcases), so for the BLOCK case I've solved it by
clear_line_info setting current_block to something that should never appear,
which means that in the LTO stream after the clear_line_info spots including
the start of the LTO stream we force the block change bit to be set and thus
BLOCK to be streamed and therefore stream_block from earlier to be
ignored.  But for the rest I think that is not the case, so I wonder if we
don't sometimes end up with wrong line/column info because of that, or
please tell me what prevents that.
clear_line_info does:
  ob->current_file = NULL;
  ob->current_line = 0;
  ob->current_col = 0;
  ob->current_sysp = false;
while I think NULL current_file is something that should likely be different
from expanded_location (...).file (UNKNOWN_LOCATION/BUILTINS_LOCATION are
handled separately and not go through the caching), I think line number 0
can sometimes occur and especially column 0 occurs frequently if we ran out
of location_t with columns info.  But then we do:
      bp_pack_value (bp, ob->current_file != xloc.file, 1);
      bp_pack_value (bp, ob->current_line != xloc.line, 1);
      bp_pack_value (bp, ob->current_col != xloc.column, 1);
and stream the details only if the != is true.  If that happens immediately
after clear_line_info and e.g. xloc.column is 0, we would stream 0 bit and
not stream the actual value, so on read-in it would reuse whatever
stream_col etc. were before.  Shouldn't we set some ob->current_* new bit
that would signal we are immediately past clear_line_info which would force
all these != checks to non-zero?  Either by oring something into those
tests, or perhaps:
  if (ob->current_reset)
    {
      if (xloc.file == NULL)
        ob->current_file = "";
      if (xloc.line == 0)
        ob->current_line = 1;
      if (xloc.column == 0)
        ob->current_column = 1;
      ob->current_reset = false;
    }
before doing those bp_pack_value calls with a comment, effectively forcing
all 6 != comparisons to be true?

2020-09-04  Jakub Jelinek  <jakub@redhat.com>

* lto-streamer.h (struct output_block): Add reset_locus member.
* lto-streamer-out.c (clear_line_info): Set reset_locus to true.
(lto_output_location_1): If reset_locus, clear it and ensure
current_{file,line,col} is different from xloc members.

(cherry picked from commit 70d8d9bd93f7912e56a27e64abc9e1e895fe143a)

c++: Fix another PCH hash_map issue [PR96901]

The recent libstdc++ changes caused lots of libstdc++-v3 tests FAILs
on i686-linux, all of them in the same spot during constexpr evaluation
of a recursive _S_gcd call.
The problem is yet another hash_map that used the default hasing of
tree keys through pointer hashing which is preserved across PCH write/read.
During PCH handling, the addresses of GC objects are changed, which means
that the hash values of the keys in such hash tables change without those
hash tables being rehashed.  Which in the fundef_copies_table case usually
means we just don't find a copy of a FUNCTION_DECL body for recursive uses
and start from scratch.  But when the hash table keeps growing, the "dead"
elements in the hash table can sometimes reappear and break things.
In particular what I saw under the debugger is when the fundef_copies_table
hash map has been used on the outer _S_gcd call, it didn't find an entry for
it, so returned a slot with *slot == NULL, which is treated as that the
function itself is used directly (i.e. no recursion), but that addition of
a hash table slot caused the recursive _S_gcd call to actually find
something in the hash table, unfortunately not the new *slot == NULL spot,
but a different one from the pre-PCH streaming which contained the returned
toplevel (non-recursive) call entry for it, which means that for the
recursive _S_gcd call we actually used the same trees as for the outer ones
rather than a copy of those, which breaks constexpr evaluation.

2020-09-03  Jakub Jelinek  <jakub@redhat.com>

PR c++/96901
* tree.h (struct decl_tree_traits): New type.
(decl_tree_map): New typedef.

* constexpr.c (fundef_copies_table): Change type from
hash_map<tree, tree> * to decl_tree_map *.

(cherry picked from commit ba6730bd18371a3dff1e37d2c2ee27233285b597)

c++: Disable -frounding-math during manifestly constant evaluation [PR96862]

As discussed in the PR, fold-const.c punts on floating point constant
evaluation if the result is inexact and -frounding-math is turned on.
      /* Don't constant fold this floating point operation if the
         result may dependent upon the run-time rounding mode and
         flag_rounding_math is set, or if GCC's software emulation
         is unable to accurately represent the result.  */
      if ((flag_rounding_math
           || (MODE_COMPOSITE_P (mode) && !flag_unsafe_math_optimizations))
          && (inexact || !real_identical (&result, &value)))
        return NULL_TREE;
Jonathan said that we should be evaluating them anyway, e.g. conceptually
as if they are done with the default rounding mode before user had a chance
to change that, and e.g. in C in initializers it is also ignored.
In fact, fold-const.c for C initializers turns off various other options:

/* Perform constant folding and related simplification of initializer
   expression EXPR.  These behave identically to "fold_buildN" but ignore
   potential run-time traps and exceptions that fold must preserve.  */

  int saved_signaling_nans = flag_signaling_nans;\
  int saved_trapping_math = flag_trapping_math;\
  int saved_rounding_math = flag_rounding_math;\
  int saved_trapv = flag_trapv;\
  int saved_folding_initializer = folding_initializer;\
  flag_signaling_nans = 0;\
  flag_trapping_math = 0;\
  flag_rounding_math = 0;\
  flag_trapv = 0;\
  folding_initializer = 1;

  flag_signaling_nans = saved_signaling_nans;\
  flag_trapping_math = saved_trapping_math;\
  flag_rounding_math = saved_rounding_math;\
  flag_trapv = saved_trapv;\
  folding_initializer = saved_folding_initializer;

So, shall cxx_eval_outermost_constant_expr instead turn off all those
options (then warning_sentinel wouldn't be the right thing to use, but given
the 8 or how many return stmts in cxx_eval_outermost_constant_expr, we'd
need a RAII class for this.  Not sure about the folding_initializer, that
one is affecting complex multiplication and division constant evaluation
somehow.

2020-09-03  Jakub Jelinek  <jakub@redhat.com>

PR c++/96862
* constexpr.c (cxx_eval_outermost_constant_expr): Temporarily disable
flag_rounding_math during manifestly constant evaluation.

* g++.dg/cpp1z/constexpr-96862.C: New test.

(cherry picked from commit 6641d6d3fe79113f8d9f3ced355aea79bffda822)

lto: Cache location_ts including BLOCKs in GIMPLE streaming [PR94311]

As mentioned in the PR, when compiling valgrind even on fairly small
testcase where in one larger function the location keeps oscillating
between a small line number and 8000-ish line number in the same file
we very quickly run out of all possible location_t numbers and because of
that emit non-sensical line numbers in .debug_line.
There are ways how to decrease speed of depleting location_t numbers
in libcpp, but the main reason of this is that we use
stream_input_location_now for streaming in location_t for gimple_location
and phi arg locations.  libcpp strongly prefers that the locations
it is given are sorted by the different files and by line numbers in
ascending order, otherwise it depletes quickly no matter what and is much
more costly (many extra file changes etc.).
The reason for not caching those were the BLOCKs that were streamed
immediately after the location and encoded into the locations (and for PHIs
we failed to stream the BLOCKs altogether).
This patch enhances the location cache to handle also BLOCKs (but not for
everything, only for the spots we care about the BLOCKs) and also optimizes
the size of the LTO stream by emitting a single bit into a pack whether the
BLOCK changed from last case and only streaming the BLOCK tree if it
changed.

2020-09-03  Jakub Jelinek  <jakub@redhat.com>

PR lto/94311
* gimple.h (gimple_location_ptr, gimple_phi_arg_location_ptr): New
functions.
* streamer-hooks.h (struct streamer_hooks): Add
output_location_and_block callback.  Fix up formatting for
output_location.
(stream_output_location_and_block): Define.
* lto-streamer.h (class lto_location_cache): Fix comment typo.  Add
current_block member.
(lto_location_cache::input_location_and_block): New method.
(lto_location_cache::lto_location_cache): Initialize current_block.
(lto_location_cache::cached_location): Add block member.
(struct output_block): Add current_block member.
(lto_output_location): Formatting fix.
(lto_output_location_and_block): Declare.
* lto-streamer.c (lto_streamer_hooks_init): Initialize
streamer_hooks.output_location_and_block.
* lto-streamer-in.c (lto_location_cache::cmp_loc): Also compare
block members.
(lto_location_cache::apply_location_cache): Handle blocks.
(lto_location_cache::accept_location_cache,
lto_location_cache::revert_location_cache): Fix up function comments.
(lto_location_cache::input_location_and_block): New method.
(lto_location_cache::input_location): Implement using
input_location_and_block.
(input_function): Invoke apply_location_cache after streaming in all
bbs.
* lto-streamer-out.c (clear_line_info): Set current_block.
(lto_output_location_1): New function, moved from lto_output_location,
added block handling.
(lto_output_location): Implement using lto_output_location_1.
(lto_output_location_and_block): New function.
* gimple-streamer-in.c (input_phi): Use input_location_and_block
to input and cache both location and block.
(input_gimple_stmt): Likewise.
* gimple-streamer-out.c (output_phi): Use
stream_output_location_and_block.
(output_gimple_stmt): Likewise.

(cherry picked from commit 3536ff2de8317c430546fd97574d44c5146cef2b)

fortran: Fix o'...' boz to integer/real conversions [PR96859]

The standard says that excess digits from boz are truncated.
For hexadecimal or binary, the routines copy just the number of digits
that will be needed, but for octal we copy number of digits that
contain one extra bit (for 8-bit, 32-bit or 128-bit, i.e. kind 1, 4 and 16)
or two extra bits (for 16-bit or 64-bit, i.e. kind 2 and 8).
The clearing of the first bit is done correctly by changing the first digit
if it is 4-7 to one smaller by 4 (i.e. modulo 4).
The clearing of the first two bits is done by changing 4 or 6 to 0
and 5 or 7 to 1, which is incorrect, because we really want to change the
first digit to 0 if it was even, or to 1 if it was odd, so digits
2 and 3 are mishandled by keeping them as is, rather than changing 2 to 0
and 3 to 1.

2020-09-02 Jakub Jelinek <jakub@redhat.com>

PR fortran/96859
* check.c (gfc_boz2real, gfc_boz2int): When clearing first two bits,
change also '2' to '0' and '3' to '1' rather than just handling '4'
through '7'.

* gfortran.dg/pr96859.f90: New test.

(cherry picked from commit b567d3bd302933adb253aba9069fd8120c485441)

dwarf2out: Fix up dwarf2out_next_real_insn caching [PR96729]

The addition of NOTE_INSN_BEGIN_STMT and NOTE_INSN_INLINE_ENTRY notes
reintroduced quadratic behavior into dwarf2out_var_location.
This function needs to know the next real instruction to which the var
location note applies, but the way final_scan_insn is called outside of
final.c main loop doesn't make it easy to look up the next real insn in
there (and for non-dwarf it is even useless).  Usually next real insn is
only a few notes away, but we can have hundreds of thousands of consecutive
notes only followed by a real insn.  dwarf2out_var_location to avoid the
quadratic behavior contains a cache, it remembers the next note and when it
is called again on that loc_note, it can use the previously computed
dwarf2out_next_real_insn result, rather than walking the insn chain once
again.  But, for NOTE_INSN_{BEGIN_STMT,INLINE_ENTRY} dwarf2out_var_location
is not called while the code puts into the cache those notes, which means if
we have e.g. in the worst case NOTE_INSN_VAR_LOCATION and
NOTE_INSN_BEGIN_STMT notes alternating, the cache is not really used.

The following patch fixes it by looking up the next NOTE_INSN_VAR_LOCATION
if any.  While the lookup could be perhaps done together with looking for
the next real insn once (e.g. in dwarf2out_next_real_insn or its copy),
there are other dwarf2out_next_real_insn callers which don't need/want that
behavior and if there are more than two NOTE_INSN_VAR_LOCATION notes
followed by the same real insn, we need to do that "find next
NOTE_INSN_VAR_LOCATION" walk anyway.

On the testcase from the PR this patch speeds it 2.8times, from 0m0.674s
to 0m0.236s (why it takes for the reporter more than 60s is unknown).

2020-08-26  Jakub Jelinek  <jakub@redhat.com>

PR debug/96729
* dwarf2out.c (dwarf2out_next_real_insn): Adjust function comment.
(dwarf2out_var_location): Look for next_note only if next_real is
non-NULL, in that case look for the first non-deleted
NOTE_INSN_VAR_LOCATION between loc_note and next_real, if any.

(cherry picked from commit ca1afa261d03c9343dff1208325f87d9ba69ec7a)

Daily bump.

Fix bogus error on Value_Size clause for variant record type

This is a regression present on the mainline and 10 branch: the compiler
rejects a Value_Size clause on a discriminated record type with variant.

gcc/ada/ChangeLog:
* gcc-interface/decl.c (set_rm_size): Do not take into account the
Value_Size clause if it is not for the entity itself.

gcc/testsuite/ChangeLog:
* gnat.dg/specs/size_clause5.ads: New test.

Fix uninitialized variable with nested variant record types

This fixes a wrong code issue with nested variant record types: the
compiler generates move instructions that depend on an uninitialized
variable, which was initially a SAVE_EXPR not instantiated early enough.

gcc/ada/ChangeLog:
* gcc-interface/decl.c (build_subst_list): For a definition, make
sure to instantiate the SAVE_EXPRs generated by the elaboration of
the constraints in front of the elaboration of the type itself.

gcc/testsuite/ChangeLog:
* gnat.dg/discr59.adb: New test.
* gnat.dg/discr59_pkg1.ads: New helper.
* gnat.dg/discr59_pkg2.ads: Likewise.

Daily bump.

c++: Fix ICE in reshape_init with init-list [PR95164]

This patch fixes a long-standing bug in reshape_init_r.  Since r209314
we implement DR 1467 which handles list-initialization with a single
initializer of the same type as the target.  In this test this causes
a crash in reshape_init_r when we're processing a constructor that has
undergone the DR 1467 transformation.

Take e.g. the

  foo({{1, {H{k}}}});

line in the attached test.  {H{k}} initializes the field b of H in I.
H{k} is a functional cast, so has TREE_HAS_CONSTRUCTOR set, so is
COMPOUND_LITERAL_P.  We perform the DR 1467 transformation and turn
{H{k}} into H{k}.  Then we attempt to reshape H{k} again and since
first_initializer_p is null and it's COMPOUND_LITERAL_P, we go here:

           else if (COMPOUND_LITERAL_P (stripped_init))
             gcc_assert (!BRACE_ENCLOSED_INITIALIZER_P (stripped_init));

then complain about the missing braces, go to reshape_init_class and ICE
on
               gcc_checking_assert (d->cur->index
                                    == get_class_binding (type, id));

because due to the missing { } we're looking for 'b' in H, but that's
not found.

So we have to be prepared to handle an initializer whose outer braces
have been removed due to DR 1467.

gcc/cp/ChangeLog:

PR c++/95164
* decl.c (reshape_init_r): When initializing an aggregate member
with an initializer from an initializer-list, also consider
COMPOUND_LITERAL_P.

gcc/testsuite/ChangeLog:

PR c++/95164
* g++.dg/cpp0x/initlist123.C: New test.

(cherry picked from commit acbe30bbc884899da72df47d023ebde89f8f47f1)

If the lto plugin encounters a file with multiple symbol sections, each of which also has a v1 symbol extension section[1] then it will attempt to read the extension data for *every* symbol from each of the extension sections.  This results in reading off the end of a buffer with the associated memory corruption that that entails.  This patch fixes that problem.

2020-09-09  Nick Clifton  <nickc@redhat.com>

* lto-plugin.c (struct plugin_symtab): Add last_sym field.
(parse_symtab_extension): Only read as many entries as are
available in the buffer.  Store the data read into the symbol
table indexed from last_sym.  Increment last_sym.

Fortran: Fixes for OpenMP loop-iter privatization (PRs 95109 + 94690)

This commit also fixes a gfortran.dg/gomp/target1.f90 regression;
target1.f90 tests the resolve.c and openmp.c changes.

gcc/fortran/ChangeLog:

PR fortran/95109
PR fortran/94690
* resolve.c (gfc_resolve_code): Also call
gfc_resolve_omp_parallel_blocks for 'distribute parallel do (simd)'.
* openmp.c (gfc_resolve_omp_parallel_blocks): Handle it.
* trans-openmp.c (gfc_trans_omp_target): For TARGET_PARALLEL_DO_SIMD,
call simd not do processing function.

gcc/testsuite/ChangeLog:

PR fortran/95109
PR fortran/94690
* gfortran.dg/gomp/openmp-simd-5.f90: New test.

(cherry picked from commit 61c2d476a52bb108bd05d0226c5522bf0c4b24b5)

[PATCH PR96357][GCC][AArch64]: could not split insn UNSPEC_COND_FSUB with AArch64 SVE

Problem is related to that operand 4 (In original pattern
cond_sub<mode>_any_const) is no longer the same as operand 1, and so
the pattern doesn't match the split condition.

Pattern cond_sub<mode>_any_const is being split by this patch into two
separate patterns:
* Pattern cond_sub<mode>_relaxed_const now matches const_int
SVE_RELAXED_GP operand.
* Pattern cond_sub<mode>_strict_const now matches const_int
SVE_STRICT_GP operand.
* Remove aarch64_sve_pred_dominates_p condition from both patterns.

gcc/ChangeLog:

PR target/96357
* config/aarch64/aarch64-sve.md
(cond_sub<mode>_relaxed_const): Updated and renamed from
cond_sub<mode>_any_const pattern.
(cond_sub<mode>_strict_const): New pattern.

gcc/testsuite/ChangeLog:

PR target/96357
* gcc.target/aarch64/sve/pr96357.c: New test.

Daily bump.

PR fortran/96890 - Wrong answer with intrinsic IALL

The IALL intrinsic would always return 0 when the DIM and MASK arguments
were present since the initial value of repeated BIT-AND operations was
set to 0 instead of -1.

libgfortran/ChangeLog:

* m4/iall.m4: Initial value for result should be -1.
* generated/iall_i1.c (miall_i1): Generated.
* generated/iall_i16.c (miall_i16): Likewise.
* generated/iall_i2.c (miall_i2): Likewise.
* generated/iall_i4.c (miall_i4): Likewise.
* generated/iall_i8.c (miall_i8): Likewise.

gcc/testsuite/ChangeLog:

* gfortran.dg/iall_masked.f90: New test.

(cherry picked from commit 8eeeecbcc17041fdfd3ccc928161ae86e7f9b456)

Fix description of FINDLOC result.

gcc/fortran/ChangeLog:

* intrinsic.texi: Fix description of FINDLOC result.

(cherry picked from commit 213200a27d756df1709be1a1a6a85af97a32fddc)

Daily bump.

Adjust testcase.

Since This testcase is used to check generation of AVX512 vector
comparison, scan-assembler for vmov instruction could be deleted, also
-mprefer-vector-width=512 is added to avoid impact of different
default arch/tune of GCC.

gcc/testsuite
PR target/96574
* gcc.target/i386/pr92865-1.c: Adjust testcase.

Daily bump.

d: Fix ICE in create_tmp_var, at gimple-expr.c:482

Array concatenate expressions were creating more SAVE_EXPRs than what
was necessary. The internal error itself was the result of a forced
temporary being made on a TREE_ADDRESSABLE type.

gcc/d/ChangeLog:

PR d/96924
* expr.cc (ExprVisitor::visit (CatAssignExp *)): Don't force
temporaries needlessly.

gcc/testsuite/ChangeLog:

PR d/96924
* gdc.dg/pr96924.d: New test.

(cherry picked from commit 52908b8de15a1c762a73063f1162bcedfcc993b4)

rs6000, remove improperly defined and unsupported builtins.

gcc/ChangeLog

2020-08-31 Carl Love <cel@us.ibm.com>

PR target/85830
* config/rs6000/altivec.h (vec_popcntb, vec_popcnth, vec_popcntw,
vec_popcntd): Remove defines.

sra: Avoid SRAing if there is an aout-of-bounds access (PR 96820)

The testcase causes and ICE in the SRA verifier on x86_64 when
compiling with -m32 because build_user_friendly_ref_for_offset looks
at an out-of-bounds array_ref within an array_ref which accesses an
offset which does not fit into a signed 32bit integer and turns it
into an array-ref with a negative index.

The best thing is probably to bail out early when encountering an out
of bounds access to a local stack-allocated aggregate (and let the DSE
just delete such statements) which is what the patch does.

I also glanced over to the initial candidate vetting routine to make
sure the size would fit into HWI and noticed that it uses unsigned
variants whereas the rest of SRA operates on signed offsets and
sizes (because get_ref_and_extent does) and so changed that for the
sake of consistency.  These ancient checks operate on sizes of types
as opposed to DECLs but I hope that any issues potentially arising
from that are basically hypothetical.

gcc/ChangeLog:

2020-08-28  Martin Jambor  <mjambor@suse.cz>

PR tree-optimization/96820
* tree-sra.c (create_access): Disqualify candidates with accesses
beyond the end of the original aggregate.
(maybe_add_sra_candidate): Check that candidate type size fits
signed uhwi for the sake of consistency.

gcc/testsuite/ChangeLog:

2020-08-28  Martin Jambor  <mjambor@suse.cz>

PR tree-optimization/96820
* gcc.dg/tree-ssa/pr96820.c: New test.

(cherry picked from commit 8ad3fc6ca46c603d9c3efe8e6d4a8f2ff1a893a4)

bpf: generate indirect calls for xBPF

This patch updates the BPF back end to generate indirect calls via
the 'call %reg' instruction when targetting xBPF.

Additionally, the BPF ASM_SPEC is updated to pass along -mxbpf to
gas, where it is now supported.

2020-09-03 David Faust <david.faust@oracle.com>

gcc/

* config/bpf/bpf.h (ASM_SPEC): Pass -mxbpf to gas, if specified.
* config/bpf/bpf.c (bpf_output_call): Support indirect calls in xBPF.

gcc/testsuite/

* gcc.target/bpf/xbpf-indirect-call-1.c: New test.

(cherry picked from commit c3a0f5373919deff68819de1db88c04261d61a87)

Daily bump.

rs6000: MMA built-in dies with incorrect sharing of tree nodes error

When we expand our MMA built-ins into gimple, we erroneously reused the
accumulator memory reference for both the source input value as well as
the destination output value. This led to a tree sharing error.
The solution is to create separate memory references for the input
and output values.

2020-09-01 Peter Bergner <bergner@linux.ibm.com>

gcc/
PR target/96808
* config/rs6000/rs6000-call.c (rs6000_gimple_fold_mma_builtin): Do not
reuse accumulator memory reference for source and destination accesses.

gcc/testsuite/
PR target/96808
* gcc.target/powerpc/pr96808.c: New test.

(cherry picked from commit 8bc0f24d7a20d89383859907b875a26ce59dc6c8)

libstdc++: Replace __int_limits with __numeric_traits_integer

I recently added std::__detail::__int_limits as a lightweight
alternative to std::numeric_limits, forgetting that the values it
provides (digits, min and max) are already provided by
__gnu_cxx::__numeric_traits.

This change adds __int_traits as an alias for __numeric_traits_integer.
This avoids instantiating __numeric_traits to decide whether to use
__numeric_traits_integer or __numeric_traits_floating. Then all uses of
__int_limits can be replaced with __int_traits, and __int_limits can be
removed.

libstdc++-v3/ChangeLog:

* include/Makefile.am: Remove bits/int_limits.h.
* include/Makefile.in: Regenerate.
* include/bits/int_limits.h: Removed.
* include/bits/parse_numbers.h (_Select_int_base): Replace
__int_limits with __int_traits.
* include/bits/range_access.h (_SSize::operator()): Likewise.
* include/ext/numeric_traits.h (__numeric_traits_integer): Add
static assertion.
(__int_traits): New alias template.
* include/std/bit (__rotl, __rotr, __countl_zero, __countl_one)
(__countr_zero, __countr_one, __popcount, __bit_ceil)
(__bit_floor, __bit_width) Replace __int_limits with
__int_traits.
* include/std/charconv (__to_chars_8, __from_chars_binary)
(__from_chars_alpha_to_num, from_chars): Likewise.
* include/std/memory_resource (polymorphic_allocator::allocate)
(polymorphic_allocator::allocate_object): Likewise.
* include/std/string_view (basic_string_view::_S_compare):
Likewise.
* include/std/utility (cmp_equal, cmp_less, in_range): Likewise.

(cherry picked from commit eb04805be4029716e76532babc0fa9ecb18de96e)

Daily bump.

libstdc++: Fix std::gcd and std::lcm for unsigned integers [PR 92978]

This fixes a bug with mixed signed and unsigned types, where converting
a negative value to the unsigned result type alters the value. The
solution is to obtain the absolute values of the arguments immediately
and to perform the actual GCD or LCM algorithm on two arguments of the
same type.

In order to operate on the most negative number without overflow when
taking its absolute, use an unsigned type for the result of the abs
operation. For example, -INT_MIN will overflow, but -(unsigned)INT_MIN
is (unsigned)INT_MAX+1U which is the correct value.

libstdc++-v3/ChangeLog:

PR libstdc++/92978
* include/std/numeric (__abs_integral): Replace with ...
(__detail::__absu): New function template that returns an
unsigned type, guaranteeing it can represent the most
negative signed value.
(__detail::__gcd, __detail::__lcm): Require arguments to
be unsigned and therefore already non-negative.
(gcd, lcm): Convert arguments to absolute value as unsigned
type before calling __detail::__gcd or __detail::__lcm.
* include/experimental/numeric (gcd, lcm): Likewise.
* testsuite/26_numerics/gcd/gcd_neg.cc: Adjust expected
errors.
* testsuite/26_numerics/lcm/lcm_neg.cc: Likewise.
* testsuite/26_numerics/gcd/92978.cc: New test.
* testsuite/26_numerics/lcm/92978.cc: New test.
* testsuite/experimental/numeric/92978.cc: New test.

(cherry picked from commit 82db1a42e9254c9009bbf8ac01366da4d1ab6df5)

libstdc++: Fix three-way comparison for std::array [PR 96851]

The spaceship operator for std::array uses memcmp when the
__is_byte<value_type> trait is true, but memcmp isn't usable in
constexpr contexts. Also, memcmp should only be used for unsigned byte
types, because it gives the wrong answer for signed chars with negative
values.

We can simply check std::is_constant_evaluated() so that we don't use
memcmp during constant evaluation.

To fix the problem of using memcmp for inappropriate types, this patch
adds new __is_memcmp_ordered and __is_memcmp_ordered_with traits. These
say whether using memcmp will give the right answer for ordering
operations such as lexicographical_compare and three-way comparisons.
The new traits can be used in several places.

Unlike the trunk commit this was backported from, this commit for the
branch doesn't extend the memcmp optimisations to all unsigned integers
on big endian targets. Only narrow character types and std::byte will
use memcmp.

libstdc++-v3/ChangeLog:

PR libstdc++/96851
* include/bits/cpp_type_traits.h (__is_memcmp_ordered):
New trait that says if memcmp can be used for ordering.
(__is_memcmp_ordered_with): Likewise, for two types.
* include/bits/ranges_algo.h (__lexicographical_compare_fn):
Use new traits instead of __is_byte and __numeric_traits.
* include/bits/stl_algobase.h (__lexicographical_compare_aux1)
(__is_byte_iter): Likewise.
* include/std/array (operator<=>): Likewise. Only use memcmp
when std::is_constant_evaluated() is false.
* testsuite/23_containers/array/comparison_operators/96851.cc:
New test.
* testsuite/23_containers/array/tuple_interface/get_neg.cc:
Adjust dg-error line numbers.

(cherry picked from commit 2f983fa69005b603ea1758a013b4134d5b0f24a8)

libstdc++: Use __throw_exception_again macro for -fno-exceptions

libstdc++-v3/ChangeLog:

* include/bits/stl_iterator.h (counted_iterator::operator++(int)):
Use __throw_exception_again macro.

bpf: use the default asm_named_section target hook

This patch makes the BPF backend to not provide its own implementation
of the asm_named_section hook; the default handler works perfectly
well.

2020-09-02 Jose E. Marchesi <jose.marchesi@oracle.com>

gcc/
* config/bpf/bpf.c (bpf_asm_named_section): Delete.
(TARGET_ASM_NAMED_SECTION): Likewise.

(cherry picked from commit 7047a8bab6e41fe9f5dbb29ca170ce416e08dd11)

bpf: use elfos.h

BPF is an ELF-based target, so it definitely benefits from using
elfos.h.  This patch makes the target to use it, and removes
superfluous definitions from bpf.h which are better defined in
elfos.h.

Note that BPF, despite being an ELF target, doesn't use DWARF.  At
some point it will generate DWARF when generating xBPF (-mxbpf) and
BTF when generating plain eBPF, but for the time being it just
generates stabs.

2020-09-02  Jose E. Marchesi  <jemarch@gnu.org>

gcc/
* config.gcc: Use elfos.h in bpf-*-* targets.
* config/bpf/bpf.h (MAX_OFILE_ALIGNMENT): Remove definition.
(COMMON_ASM_OP): Likewise.
(INIT_SECTION_ASM_OP): Likewise.
(FINI_SECTION_ASM_OP): Likewise.
(ASM_OUTPUT_SKIP): Likewise.
(ASM_OUTPUT_ALIGNED_COMMON): Likewise.
(ASM_OUTPUT_ALIGNED_LOCAL): Likewise.

(cherry picked from commit c9d440223594cbf955177628d62a667727a1780a)

Daily bump.

Fortran  : ICE on invalid code PR95398

The CLASS_DATA macro is used to shorten the code accessing the derived
components of an expressions type specification.  If the type is not
BT_CLASS the derived pointer is NULL resulting in an ICE.  To avoid
dereferencing a NULL pointer the type should be BT_CLASS.

2020-09-01  Steven G. Kargl  <kargl@gcc.gnu.org>

gcc/fortran

PR fortran/95398
* resolve.c (resolve_select_type): Add check for BT_CLASS
type before using the CLASS_DATA macro which will have a
NULL pointer to derive components if it isn't BT_CLASS.

2020-09-01  Mark Eggleston  <markeggleston@gcc.gnu.org>

gcc/testsuite

PR fortran/95398
* gfortran.dg/pr95398.f90: New test.

(cherry picked from commit 3d137b75febd1a4ad70bcc64e0f79198f5571b86)

Add missing vn_reference_t::punned initialization

gcc/ChangeLog:

PR tree-optimization/96597
* tree-ssa-sccvn.c (vn_reference_lookup_call): Add missing
initialization of ::punned.
(vn_reference_insert): Use consistently false instead of 0.
(vn_reference_insert_pieces): Likewise.

(cherry picked from commit adc646b10c7168c3c95373ee9321e3760fc4c5f1)

tree-optimization/88240 - stopgap for floating point code-hoisting issues

This adds a stopgap measure to avoid performing code-hoisting
on mixed type loads when the load we'd insert in the hoisting
position would be a floating point one. This is because certain
targets (hello x87) cannot perform floating point loads without
possibly altering the bit representation and thus cannot be used
in place of integral loads.

2020-08-04 Richard Biener <rguenther@suse.de>

PR tree-optimization/88240
* tree-ssa-sccvn.h (vn_reference_s::punned): New flag.
* tree-ssa-sccvn.c (vn_reference_insert): Initialize punned.
(vn_reference_insert_pieces): Likewise.
(visit_reference_op_call): Likewise.
(visit_reference_op_load): Track whether a ref was punned.
* tree-ssa-pre.c (do_hoist_insertion): Refuse to perform hoist
insertion on punned floating point loads.

* gcc.target/i386/pr88240.c: New testcase.

(cherry picked from commit 1af5cdd77985daf76130f527deac425c43df9f49)

Daily bump.

tree-optimization/96854 - SLP reduction of two-operator is broken

This fixes SLP reduction of two-operator operations by marking those
not supported. In fact any live lane out of such an operation cannot
be code-generated correctly.

2020-08-31 Richard Biener <rguenther@suse.de>

PR tree-optimization/96854
* tree-vect-loop.c (vectorizable_live_operation): Disallow
SLP_TREE_TWO_OPERATORS nodes.

* gcc.dg/vect/pr96854.c: New testcase.

Refine expander vec_unpacku_float_hi_v16si/vec_unpacku_float_lo_v16si

gcc/
PR target/96551
* config/i386/sse.md (vec_unpacku_float_hi_v16si): For vector
compare to integer mask, don't use gen_rtx_LT, use
ix86_expand_mask_vec_cmp instead.
(vec_unpacku_float_hi_v16si): Ditto.

gcc/testsuite
* gcc.target/i386/avx512f-pr96551-1.c: New test.
* gcc.target/i386/avx512f-pr96551-2.c: New test.

Fortran: Fix absent-optional handling for nondescriptor arrays (PR94672)

gcc/fortran/ChangeLog:

PR fortran/94672
* trans-array.c (gfc_trans_g77_array): Check against the parm decl and
set the nonparm decl used for the is-present check to NULL if absent.

gcc/testsuite/ChangeLog:

PR fortran/94672
* gfortran.dg/optional_assumed_charlen_2.f90: New test.

(cherry picked from commit cb3c3d63315ceb4dc262e5efb83b42c73c43387d)

Daily bump.

d: Fix no NRVO when returning an array of a non-POD struct

TREE_ADDRESSABLE was not propagated from the RECORD_TYPE to the ARRAY_TYPE, so
NRVO code generation was not being triggered.

gcc/d/ChangeLog:

PR d/96157
* d-codegen.cc (d_build_call): Handle TREE_ADDRESSABLE static arrays.
* types.cc (make_array_type): Propagate TREE_ADDRESSABLE from base
type to static array.

gcc/testsuite/ChangeLog:

PR d/96157
* gdc.dg/pr96157a.d: New test.
* gdc.dg/pr96157b.d: New test.

(cherry picked from commit 312ad889e99ff9458c01518325775e75ab57f272)

d: Limit recursive expansion to a common global limit.

Fixes both a bug where compilation would hang, and an issue where recursive
template limits are hit too early.

gcc/d/ChangeLog:

* dmd/globals.h (Global): Add recursionLimit.
* dmd/dmacro.c (Macro::expand): Limit recursive expansion to
global.recursionLimit.
* dmd/dtemplate.c (deduceType): Likewise.
(TemplateInstance::tryExpandMembers): Likewise.
(TemplateInstance::trySemantic3): Likewise.
(TemplateMixin::semantic): Likewise.
* dmd/expressionsem.c (ExpressionSemanticVisitor::visit): Likewise.
* dmd/mtype.c (Type::noMember): Likewise.
(TypeFunction::semantic): Likewise.
* dmd/optimize.c (Expression_optimize): Likewise.

gcc/testsuite/ChangeLog:

* gdc.test/compilable/ice20092.d: New test.

(cherry picked from commit 0f5c98b6a1a7eed281e359f40bc2e4326f2a2f56)

d: Use read() to load contents of stdin into memory.

This would be an improvement over reading one character at a time.

An ICE was discovered when mixing reading from stdin with `-v', this has been
fixed in upstream DMD and backported as well.

Reviewed-on: https://github.com/dlang/dmd/pull/11620

gcc/d/ChangeLog:

* d-lang.cc (d_parse_file): Use read() to load contents from stdin,
allow the front-end to free the memory after parsing.
* dmd/func.c (FuncDeclaration::semantic): Use module filename if
searchPath returns NULL.

(cherry picked from commit 7421802276e737c2da297599121480833db92de9)

Daily bump.

Add expander for movp2hi and movp2qi.

2020-08-30 Uros Bizjak <ubizjak@gmail.com>

gcc/ChangeLog:
PR target/96744
* config/i386/i386-expand.c (split_double_mode): Also handle
E_P2HImode and E_P2QImode.
* config/i386/sse.md (MASK_DWI): New define_mode_iterator.
(mov<mode>): New expander for P2HI,P2QI.
(*mov<mode>_internal): New define_insn_and_split to split
movement of P2QI/P2HI to 2 movqi/movhi patterns after reload.

gcc/testsuite/ChangeLog:

* gcc.target/i386/double_mask_reg-1.c: New test.

Fix: AVX512VP2INTERSECT should imply AVX512DQ.

gcc/ChangeLog

* common/config/i386/i386-common.c (ix86_handle_option): Set
AVX512DQ when AVX512VP2INTERSECT exists.

Daily bump.

Fix shadd-2.c scan assembler count.

2020-08-27 John David Anglin <danglin@gcc.gnu.org>

gcc/testsuite/
* gcc.target/hppa/shadd-2.c: Adjust times to 4.

rs6000, restrict bfloat convert intrinsic to Power 10.

gcc/ChangeLog

2020-08-26 Carl Love <cel@us.ibm.com>
* config/rs6000/rs6000-builtin.def: (BU_P10V_VSX_1) New builtin
macro expansion.
(XVCVBF16SPN, XVCVSPBF16): Replace macro expansion BU_VSX_1 with
BU_P10V_VSX_1.
* config/rs6000/rs6000-call.c: (VSX_BUILTIN_XVCVSPBF16,
VSX_BUILTIN_XVCVBF16SPN): Replace with P10V_BUILTIN_XVCVSPBF16,
P10V_BUILTIN_XVCVBF16SPN respectively.

Fortran  : ICE for division by zero in declaration PR95882

A length expression containing a divide by zero in a character
declaration will result in an ICE if the constant is anymore
complicated that a contant divided by a constant.

The cause was that char_len_param_value can return MATCH_YES
even if a divide by zero was seen.  Prior to returning check
whether a divide by zero was seen and if so set it to MATCH_ERROR.

2020-08-27  Mark Eggleston  <markeggleston@gcc.gnu.org>

gcc/fortran

PR fortran/95882
* decl.c (char_len_param_value): Check gfc_seen_div0 and
if it is set return MATCH_ERROR.

2020-08-27  Mark Eggleston  <markeggleston@gcc.gnu.org>

gcc/testsuite/

PR fortran/95882
* gfortran.dg/pr95882_1.f90: New test.
* gfortran.dg/pr95882_2.f90: New test.
* gfortran.dg/pr95882_3.f90: New test.
* gfortran.dg/pr95882_4.f90: New test.
* gfortran.dg/pr95882_5.f90: New test.

(cherry picked from commit c336eda750d4e7a0827fedf995da955d6d88d5ca)

arm: Fix -mpure-code support/-mslow-flash-data for armv8-m.base [PR94538]

armv8-m.base (cortex-m23) has the movt instruction, so we need to
disable the define_split to generate a constant in this case,
otherwise we get incorrect insn constraints as described in PR94538.

We also need to fix the pure-code alternative for thumb1_movsi_insn
because the assembler complains with instructions like
movs r0, #:upper8_15:1234
(Internal error in md_apply_fix)
We now generate movs r0, 4 instead.

2020-08-24 Christophe Lyon <christophe.lyon@linaro.org>

PR target/94538
gcc/
* config/arm/thumb1.md: Disable set-constant splitter when
TARGET_HAVE_MOVT.
(thumb1_movsi_insn): Fix -mpure-code
alternative.

PR target/94538
gcc/testsuite/
* gcc.target/arm/pure-code/pr94538-1.c: New test.
* gcc.target/arm/pure-code/pr94538-2.c: New test.

(cherry picked from commit 259d072067997ab8f55afcf735c91b6740fd0425)

Daily bump.

libstdc++: Enable assertions in constexpr string_view members [PR 71960]

Since GCC 6.1 there is no reason we can't just use __glibcxx_assert in
constexpr functions in string_view. As long as the condition is true,
there will be no call to std::__replacement_assert that would make the
function ineligible for constant evaluation.

PR libstdc++/71960
* include/experimental/string_view (basic_string_view):
Enable debug assertions.
* include/std/string_view (basic_string_view):
Likewise.

(cherry picked from commit 3eefb302d2bd8502cb3d8fe44e672b11092ccaf6)

libstdc++: Make variant_npos conversions explicit [PR 96766]

libstdc++-v3/ChangeLog:

PR libstdc++/96766
* include/std/variant (_Variant_storage): Replace implicit
conversions from size_t to __index_type with explicit casts.

(cherry picked from commit 074436cf8cdd2a9ce75cadd36deb8301f00e55b9)

hppa: PR middle-end/87256: Improved hppa_rtx_costs avoids synth_mult madness.

Backport from master:

2020-08-26 Roger Sayle <roger@nextmovesoftware.com>

gcc/ChangeLog
PR middle-end/87256
* config/pa/pa.c (hppa_rtx_costs_shadd_p): New helper function
to check for coefficients supported by shNadd and shladd,l.
(hppa_rtx_costs): Rewrite to avoid using estimates based upon
FACTOR and enable recursing deeper into RTL expressions.
* config/pa/pa.md (shd_internal): Fix define_expand to provide
gen_shd_internal.

hppa: Improve expansion of ashldi3 when !TARGET_64BIT

Backport from master:

2020-08-26 Roger Sayle <roger@nextmovesoftware.com>

* config/pa/pa.md (ashldi3): Additionally, on !TARGET_64BIT
generate a two instruction shd/zdep sequence when shifting
registers by suitable constants.
(shd_internal): New define_expand to provide gen_shd_internal.

Daily bump.

gimple: Ignore *0 = {CLOBBER} in path isolation [PR96722]

Clobbers of MEM_REF with NULL address are just fancy nops, something we just
ignore and don't emit any code for it (ditto for other clobbers), they just
mark end of life on something, so we shouldn't infer from those that there
is some UB.

2020-08-25 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/96722
* gimple.c (infer_nonnull_range): Formatting fix.
(infer_nonnull_range_by_dereference): Return false for clobber stmts.

* g++.dg/opt/pr96722.C: New test.

(cherry picked from commit a5b15fcb954ba63d58f0daa700281aba33b5f24a)

strlen: Fix handle_builtin_string_cmp [PR96758]

The following testcase is miscompiled, because handle_builtin_string_cmp
sees a strncmp call with constant last argument 4, where one of the strings
has an upper bound of 5 bytes (due to it being an array of that size) and
the other has a known string length of 1 and the result is used only in
equality comparison.
It is folded into __builtin_strncmp_eq (str1, str2, 4), which is
incorrect, because that means reading 4 bytes from both strings and
comparing that.  When one of the strings has known strlen of 1, we want to
compare just 2 bytes, not 4, as strncmp shouldn't compare any bytes beyond
the null.
So, the last argument to __builtin_strncmp_eq should be the minimum of the
provided strncmp last argument and the known string length + 1 (assuming
the other string has only a known upper bound due to array size).

Besides that, I've noticed the code has been written with the intent to also
support the case where we know exact string length of both strings (but not
the string content, so we can't compute it at compile time).  In that case,
both cstlen1 and cstlen2 are non-negative and both arysiz1 and arysiz2 are
negative.  We wouldn't optimize that, cmpsiz would be either the strncmp
last argument, or for strcmp the first string length, but varsiz would be
-1 and thus cmpsiz would be never < varsiz.  The patch fixes it by using the
correct length, in that case using the minimum of the two and for strncmp
also the last argument.

2020-08-25  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/96758
* tree-ssa-strlen.c (handle_builtin_string_cmp): If both cstlen1
and cstlen2 are set, set cmpsiz to their minimum, otherwise use the
one that is set.  If bound is used and smaller than cmpsiz, set cmpsiz
to bound.  If both cstlen1 and cstlen2 are set, perform the optimization.

* gcc.dg/strcmpopt_12.c: New test.

(cherry picked from commit f982a6ec9b6d98f5f37114b1d7455c54ce5056b8)

gimple-fold: Don't optimize wierdo floating point value reads [PR95450]

My patch to introduce native_encode_initializer to fold_ctor_reference
apparently broke gnulib/m4 on powerpc64.
There it uses a const union with two doubles and corresponding IBM double
double long double which actually is the largest normalizable long double
value (1 ulp higher than __LDBL_MAX__).  The reason our __LDBL_MAX__ is
smaller is that we internally treat the double double type as one having
106-bit precision, but it actually has a variable 53-bit to 2000-ish bit precision
and for the
0x1.fffffffffffff7ffffffffffffc000p+1023L
value gnulib uses we need 107-bit precision, therefore for GCC __LDBL_MAX__
is
0x1.fffffffffffff7ffffffffffff8000p+1023L
Before my changes, we wouldn't be able to fold_ctor_reference it and it
worked fine at runtime, but with the change we are able to do that, but
because it is larger than anything we can handle internally, we treat it
weirdly.  Similar problem would be if somebody creates this way valid,
but much more than 106 bit precision e.g. 1.0 + 1.0e-768.
Now, I think similar problem could happen e.g. on i?86/x86_64 with long
double there, it also has some weird values in the format, e.g. the
unnormals, pseudo infinities and various other magic values.

This patch for floating point types (including vector and complex types
with such elements) will try to encode the returned value again and punt
if it has different memory representation from the original.  Note, this
is only done in the path where native_encode_initializer was used, in order
not to affect e.g. just reading an unpunned long double value; the value
should be compiler generated in that case and thus should be properly
representable.  It will punt also if e.g. the padding bits are initialized
to non-zero values.

I think the verification that what we encode can be interpreted back
woiuld be only an internal consistency check (so perhaps for ENABLE_CHECKING
if flag_checking only, but if both directions perform it, then we need
to avoid mutual recursion).
While for the other direction (interpretation), at least for the broken by
design long doubles we just know we can't represent in GCC all valid values.
The other floating point formats are just theoretical case, perhaps we would
canonicalize something to a value that wouldn't trigger invalid exception
when without canonicalization it would trigger it at runtime, so let's just
ignore those.

Adjusted (so far untested) patch to do it in native_interpret_real instead
and limit it to the MODE_COMPOSITE_P cases, for which e.g.
fold-const.c/simplify-rtx.c punts in several other places too because we just
know we can't represent everything.

E.g.
      /* Don't constant fold this floating point operation if the
         result may dependent upon the run-time rounding mode and
         flag_rounding_math is set, or if GCC's software emulation
         is unable to accurately represent the result.  */
      if ((flag_rounding_math
           || (MODE_COMPOSITE_P (mode) && !flag_unsafe_math_optimizations))
          && (inexact || !real_identical (&result, &value)))
        return NULL_TREE;
Or perhaps guard it with MODE_COMPOSITE_P (mode) && !flag_unsafe_math_optimizations
too, thus break what gnulib / m4 does with -ffast-math, but not normally?

2020-08-25  Jakub Jelinek  <jakub@redhat.com>

PR target/95450
* fold-const.c (native_interpret_real): For MODE_COMPOSITE_P modes
punt if the to be returned REAL_CST does not encode to the bitwise
same representation.

* gcc.target/powerpc/pr95450.c: New test.

(cherry picked from commit 9f2f79df19fbfaa1c4be313c2f2b5ce04646433e)

c: Fix -Wunused-but-set-* warning with _Generic [PR96571]

The following testcase shows various problems with -Wunused-but-set*
warnings and _Generic construct.  I think it is best to treat the selector
and the ignored expressions as (potentially) read, because when they are
parsed, the vars in there are already marked as TREE_USED.

2020-08-18  Jakub Jelinek  <jakub@redhat.com>

PR c/96571
* c-parser.c (c_parser_generic_selection): Change match_found from bool
to int, holding index of the match.  Call mark_exp_read on the selector
expression and on expressions other than the selected one.

* gcc.dg/Wunused-var-4.c: New test.

(cherry picked from commit 6d42cbe5ad7a7b46437f2576c9920e44dc14b386)

Fix up flag_cunroll_grow_size handling in presence of optimize attr [PR96535]

As the testcase in the PR shows (not included in the patch, as
it seems quite fragile to observe unrolling in the IL), the introduction of
flag_cunroll_grow_size broke optimize attribute related to loop unrolling.
The problem is that the new option flag is set (if not set explicitly) only
in process_options and in rs6000_option_override_internal (and there only if
global_init_p).  So, this means that while it is Optimization option, it
will only be set based on the command line -funroll-loops/-O3/-fpeel-loops
or -funroll-all-loops, which means that if command line does include any of
those, it is enabled even for functions that will through optimize attribute
have all of those disabled, and if command line does not include those,
it will not be enabled for functions that will through optimize attribute
have any of those enabled.

process_options is called just once, so IMHO it should be handling only
non-Optimization option adjustments (various other options suffer from that
too, but as this is a regression from 10.1 on the 10 branch, changing those
is not appropriate).  Similarly, rs6000_option_override_internal is called
only once (with global_init_p) and then for target attribute handling, but
not for optimize attribute handling.

This patch moves the unrolling related handling from process_options into
finish_options which is invoked whenever the options are being finalized,
and the rs6000 specific parts into the override_options_after_change hook
which is called for optimize attribute handling (and unfortunately also
th cfun changes, but what the hook does is cheap) and I've added a call to
that from rs6000_override_options_internal, so it is also called on cmdline
processing and for target attribute.

Furthermore, it stops using AUTODETECT_VALUE, which can work only once,
and instead uses the global_options_set.x_... flags.

2020-08-12  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/96535
* toplev.c (process_options): Move flag_unroll_loops and
flag_cunroll_grow_size handling from here to ...
* opts.c (finish_options): ... here.  For flag_cunroll_grow_size,
don't check for AUTODETECT_VALUE, but instead check
opts_set->x_flag_cunroll_grow_size.
* common.opt (funroll-completely-grow-size): Default to 0.
* config/rs6000/rs6000.c (TARGET_OVERRIDE_OPTIONS_AFTER_CHANGE):
Redefine.
(rs6000_override_options_after_change): New function.
(rs6000_option_override_internal): Call it.  Move there the
flag_cunroll_grow_size, unroll_only_small_loops and
flag_rename_registers handling.

(cherry picked from commit fe9458c280dbd6e8b892db4ca3b64185049c376b)

c-family: Fix ICE in get_atomic_generic_size [PR96545]

As the testcase shows, we would ICE if the type of the first argument of
various atomic builtins was pointer to (non-void) incomplete type, we would
assume that TYPE_SIZE_UNIT must be non-NULL.  This patch diagnoses it
instead.  And also changes the TREE_CODE != INTEGER_CST check to
!tree_fits_uhwi_p, as we use tree_to_uhwi after this and at least in theory
the int could be too large and not fit.

2020-08-11  Jakub Jelinek  <jakub@redhat.com>

PR c/96545
* c-common.c (get_atomic_generic_size): Require that first argument's
type points to a complete type and use tree_fits_uhwi_p instead of
just INTEGER_CST TREE_CODE check for the TYPE_SIZE_UNIT.

* c-c++-common/pr96545.c: New test.

(cherry picked from commit 7840b4dc05539cf5575b3e9ff57ff5f6c3da2cae)

tree: Fix up get_narrower [PR96549]

My changes to get_narrower to support COMPOUND_EXPRs apparently
used a wrong type for the COMPOUND_EXPRs, while e.g. the rhs
type was unsigned short, the COMPOUND_EXPR got int type as that was the
original type of op. The type of COMPOUND_EXPR should be always the type
of the rhs.

2020-08-11 Jakub Jelinek <jakub@redhat.com>

PR c/96549
* tree.c (get_narrower): Use TREE_TYPE (ret) instead of
TREE_TYPE (win) for COMPOUND_EXPRs.

* gcc.c-torture/execute/pr96549.c: New test.

(cherry picked from commit 6b815e113c9aec397a86d7194f66455eb189cc7a)

c++: Fix constexpr evaluation of SPACESHIP_EXPR [PR96497]

The following valid testcase is rejected, because cxx_eval_binary_expression
is called on the SPACESHIP_EXPR with lval = true, as the address of the
spaceship needs to be passed to a method call.
After recursing on the operands and calling genericize_spaceship which turns
it into a TARGET_EXPR with initialization, we call cxx_eval_constant_expression
on it which succeeds, but then we fall through into code that will
VERIFY_CONSTANT (r) which FAILs because it is an address of a variable. Rather
than avoiding that for lval = true and SPACESHIP_EXPR, the patch just tail
calls cxx_eval_constant_expression - I believe that call should perform all
the needed verifications.

2020-08-10 Jakub Jelinek <jakub@redhat.com>

PR c++/96497
* constexpr.c (cxx_eval_binary_expression): For SPACESHIP_EXPR, tail
call cxx_eval_constant_expression after genericize_spaceship to avoid
undesirable further VERIFY_CONSTANT.

* g++.dg/cpp2a/spaceship-constexpr3.C: New test.

(cherry picked from commit 5c64df80df274c753bfc8415bd902e1180e76f6a)

openmp: Handle clauses with gimple sequences in convert_nonlocal_omp_clauses properly

If the walk_body on the various sequences of reduction, lastprivate and/or linear
clauses needs to create a temporary variable, we should declare that variable
in that sequence rather than outside, where it would need to be privatized inside of
the construct.

2020-08-08 Jakub Jelinek <jakub@redhat.com>

PR fortran/93553
* tree-nested.c (convert_nonlocal_omp_clauses): For
OMP_CLAUSE_REDUCTION, OMP_CLAUSE_LASTPRIVATE and OMP_CLAUSE_LINEAR
save info->new_local_var_chain around walks of the clause gimple
sequences and declare_vars if needed into the sequence.

2020-08-08 Tobias Burnus <tobias@codesourcery.com>

PR fortran/93553
* testsuite/libgomp.fortran/pr93553.f90: New test.

(cherry picked from commit 676b5525e8333005bdc1c596ed086f1da27a450f)

openmp: Handle reduction clauses on host teams construct [PR96459]

As the new testcase shows, we weren't actually performing reductions on
host teams construct.  And fixing that revealed a flaw in the for-14.c testcase.
The problem is that the tests perform also initialization and checking around the
calls to the functions with the OpenMP constructs.  In that testcase, all the
tests have been spawned from a teams construct but only the tested loops were
distribute, which means the initialization and checking has been performed
redundantly and racily in each team.  Fixed by performing the initialization
and checking outside of host teams and only do the calls to functions with
the tested constructs inside of host teams.

2020-08-05  Jakub Jelinek  <jakub@redhat.com>

PR middle-end/96459
* omp-low.c (lower_omp_taskreg): Call lower_reduction_clauses even in
for host teams.

* testsuite/libgomp.c/teams-3.c: New test.
* testsuite/libgomp.c-c++-common/for-2.h (OMPTEAMS): Define to nothing
if not defined yet.
(N(test)): Use it before all N(f*) calls.
* testsuite/libgomp.c-c++-common/for-14.c (DO_PRAGMA, OMPTEAMS): Define.
(main): Don't call all test_* functions from within
#pragma omp teams reduction(|:err), call them directly.

(cherry picked from commit 916c7a201a9a1dc94f2c056a773826a26d1daca9)

sra: Bail out when encountering accesses with negative offsets (PR 96730)

I must admit I was quite surprised to see that SRA does not disqualify
an aggregate from any transformations when it encounters an offset for
which get_ref_base_and_extent returns a negative offset.  It may not
matter too much because I sure hope such programs always have
undefined behavior (SRA candidates are local variables on stack) but
it is probably better not to perform weird transformations on them as
build ref model with the new build_reconstructed_reference function
currently happily do for negative offsets (they just copy the existing
expression which is then used as the expression of a "propagated"
access) and of course the compiler must not ICE (as it currently does
because the SRA forest verifier does not like the expression).

gcc/ChangeLog:

2020-08-24  Martin Jambor  <mjambor@suse.cz>

PR tree-optimization/96730
* tree-sra.c (create_access): Disqualify any aggregate with negative
offset access.
(build_ref_for_model): Add assert that offset is non-negative.

gcc/testsuite/ChangeLog:

2020-08-24  Martin Jambor  <mjambor@suse.cz>

PR tree-optimization/96730
* gcc.dg/tree-ssa/pr96730.c: New test.

(cherry picked from commit 556600286dd312d3ddf3d673a8579576862663e3)

Daily bump.

c++: Emit as-base 'tor symbols for final class.  [PR95428]

For PR70462 I stopped emitting the as-base constructor and destructor
variants for final classes, because they can never be called.  Except that
it turns out that clang calls base variants from complete variants, even for
classes with virtual bases, and in some cases inlines them such that the
calls to the base variant are exposed.  So we need to continue to emit the
as-base symbols, even though they're unreachable by G++-compiled code.

gcc/cp/ChangeLog:

PR c++/95428
* optimize.c (populate_clone_array): Revert PR70462 change.
(maybe_clone_body): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/other/final8.C: Adjust expected output.

Fortran  :  get_environment_variable runtime error PR96486

Runtime error occurs when the type of the value argument is
character(0):  "Zero-length string passed as value...".
The status argument, intent(out), will contain -1 if the value
of the environment is too large to fit in the value argument, this
is the case if the type is character(0) so there is no reason to
produce a runtime error if the value argument is zero length.

2020-08-24  Mark Eggleston  <markeggleston@gcc.gnu.org>

libgfortran/

PR fortran/96486
* intrinsics/env.c: If value_len is > 0 blank the string.
Copy the result only if its length is > 0.

2020-08-24  Mark Eggleston  <markeggleston@gcc.gnu.org>

gcc/testsuite/

PR fortran/96486
* gfortran.dg/pr96486.f90: New test.

(cherry picked from commit de09e7ebc9d5555653745a103eef2b20c7f1dd76)

Daily bump.

Update links to Arm docs

gcc/
* doc/extend.texi: Update links to Arm docs.
* doc/invoke.texi: Likewise.

(cherry picked from commit 09698e44c766c4a05ee463d2e36bc1fdac21dce4)

[LTO/offloading] Fix offloading-compilation ICE without -flto (PR84320)

gcc/ChangeLog:
PR ipa/95320
* ipa-utils.h (odr_type_p): Also permit calls with
only flag_generate_offload set.

(cherry picked from commit c5ab336ba106a407a67e84d8faac5b0ea6f18310)

libstdc++: Skip PSTL tests when installed TBB is too old [PR 96718]

These tests do not actually require TBB, because they only inspect the
feature test macros present in the headers. However, if TBB is installed
then its headers will be included, and the version will be checked. If
the version is too old, compilation fails due to a #error directive.

This change disables the tests if TBB is not present, so that we skip
them instead of failing.

libstdc++-v3/ChangeLog:

PR libstdc++/96718
* testsuite/25_algorithms/pstl/feature_test-2.cc: Require
tbb-backend effective target.
* testsuite/25_algorithms/pstl/feature_test-3.cc: Likewise.
* testsuite/25_algorithms/pstl/feature_test-5.cc: Likewise.
* testsuite/25_algorithms/pstl/feature_test.cc: Likewise.

(cherry picked from commit 988fb2f597d67cdf3603654372c020c28448441f)

Daily bump.

d: Adjust backport of PR96250 for front-end implementation.

gcc/d/ChangeLog:

2020-08-21 Iain Buclaw <ibuclaw@gdcproject.org>

PR d/96250
* dmd/expressionsem.c (ExpressionSemanticVisitor::visit(TypeExp)):
Fix cast from Expression to VarExp.

d: Field access in parentheses causes error: need 'this' for 'field' of type 'type'

1. Fixes an ICE in the front-end if a struct symbol were to appear twice
in the compilation unit.

2. Fixes a rejects-valid bug in the front-end where `(symbol)' was being
resolved as a `var' expression, instead of `this.var'.

gcc/d/ChangeLog:

PR d/96250
* dmd/dstruct.c (StructDeclaration::semantic): Error if redefinition
of struct exists in compilation.
* dmd/expressionsem.c (ExpressionSemanticVisitor::visit(TypeExp)):
Rewrite resolved field variables as 'this.var' before semantic.
* dmd/parse.c (Parser::parseUnaryExp): Mark '(type) una_exp' as a
parenthesized expression.

gcc/testsuite/ChangeLog:

PR d/96250
* gdc.test/fail_compilation/fail17492.d: New test.
* gdc.test/compilable/b9490.d: New test.
* gdc.test/compilable/ice14739.d: New test.
* gdc.test/fail_compilation/ice21060.d: New test.
* gdc.test/fail_compilation/imports/ice21060a/package.d: New file.
* gdc.test/fail_compilation/imports/ice21060b/package.d: New file.
* gdc.test/fail_compilation/imports/ice21060c/package.d: New file.
* gdc.test/fail_compilation/imports/ice21060d/package.d: New file.
* gdc.test/runnable/b16278.d: New test.

d: Fix ICE in setValue at dmd/dinterpret.c:7046

This was originally seen when running the testsuite for a 16-bit target,
however, it could be reproduced on 32-bit using long[] as well.

gcc/d/ChangeLog:

* dmd/ctfeexpr.c (isCtfeValueValid): Return true for array literals as
well as structs.
* dmd/dinterpret.c: Don't reinterpret static arrays into dynamic.

gcc/testsuite/ChangeLog:

* gdc.test/compilable/interpret3.d: Add test.
* gdc.test/fail_compilation/reg6769.d: New test.

(cherry picked from commit c43db80477a95f69425f20e4b8f164081695d1e9)

d: Fix ICE using non-local variable: internal compiler error: Segmentation fault

Moves no frame access error to own function, adding use of it for both
when get_framedecl() cannot find a path to the outer function frame, and
guarding get_decl_tree() from recursively calling itself.

gcc/d/ChangeLog:

PR d/96254
* d-codegen.cc (error_no_frame_access): New.
(get_frame_for_symbol): Use fdparent name in error message.
(get_framedecl): Replace call to assert with error.
* d-tree.h (error_no_frame_access): Declare.
* decl.cc (get_decl_tree): Detect recursion and error.

gcc/testsuite/ChangeLog:

PR d/96254
* gdc.dg/pr96254a.d: New test.
* gdc.dg/pr96254b.d: New test.

(cherry picked from commit 2b1c2a4bd9fb555dccde5d67d6da64547064e0e6)

libgomp: adjust nvptx_free callback context checking

Change test for CUDA callback context in nvptx_free() from using
GOMP_PLUGIN_acc_thread () into checking for CUDA_ERROR_NOT_PERMITTED,
for the former only works for OpenACC, but not OpenMP offloading.

2020-08-20 Chung-Lin Tang <cltang@codesourcery.com>

libgomp/
* plugin/plugin-nvptx.c (nvptx_free):
Change "GOMP_PLUGIN_acc_thread () == NULL" test into check of
CUDA_ERROR_NOT_PERMITTED status for cuMemGetAddressRange. Adjust
comments.

(cherry picked from commit f9b9832837b65046a8f01c18597cf615ff61db40)

Daily bump.

libstdc++: Add deprecated attributes to old iostream members

Back in 2017 I removed these prehistoric members (which were deprecated
since C++98) for C++17 mode. But I didn't add deprecated attributes to
most of them, so users didn't get any warning they would be going away.
Apparently some poor souls do actually use some of these names, and so
now that GCC 11 defaults to -std=gnu++17 some code has stopped
compiling.

This adds deprecated attributes to them, so that C++98/03/11/14 code
will get a warning if it uses them. I'll also backport this to the
release branches so that users can find out about the deprecation before
they start using C++17.

libstdc++-v3/ChangeLog:

* include/bits/c++config (_GLIBCXX_DEPRECATED_SUGGEST): New
macro for "use 'foo' instead" message in deprecated warnings.
* include/bits/ios_base.h (io_state, open_mode, seek_dir)
(streampos, streamoff): Use _GLIBCXX_DEPRECATED_SUGGEST.
* include/std/streambuf (stossc): Replace C++11 attribute
with _GLIBCXX_DEPRECATED_SUGGEST.
* include/std/type_traits (__is_nullptr_t): Use
_GLIBCXX_DEPRECATED_SUGGEST instead of _GLIBCXX_DEPRECATED.
* testsuite/27_io/types/1.cc: Check for deprecated warnings.
Also check for io_state, open_mode and seek_dir typedefs.

(cherry picked from commit eef9bf4ca8d90a1751bc4bff03722ee68999eb8e)

arm: Enable no-writeback vldr.16/vstr.16.

There was previously no way to specify that a register operand cannot
have any writeback modifiers, and as a result the argument to vldr.16
and vstr.16 could be erroneously output with post-increment. This
change adds a constraint which forbids all writeback, and
selects it in the relevant case for vldr.16 and vstr.16

gcc/ChangeLog:

PR target/96682
* config/arm/arm-protos.h (arm_coproc_mem_operand_no_writeback):
Declare prototype.
(arm_mve_mode_and_operands_type_check): Declare prototype.
* config/arm/arm.c (arm_coproc_mem_operand): Refactor to use
_arm_coproc_mem_operand.
(arm_coproc_mem_operand_wb): New function to cover full, limited
and no writeback.
(arm_coproc_mem_operand_no_writeback): New constraint for memory
operand with no writeback.
(arm_print_operand): Extend 'E' specifier for memory operand
that does not support writeback.
(arm_mve_mode_and_operands_type_check): New constraint check for
MVE memory operands.
* config/arm/constraints.md: Add Uj constraint for VFP vldr.16
and vstr.16.
* config/arm/vfp.md (*mov_load_vfp_hf16): New pattern for
vldr.16.
(*mov_store_vfp_hf16): New pattern for vstr.16.
(*mov<mode>_vfp_<mode>16): Remove MVE moves.

gcc/testsuite/ChangeLog:

PR target/96682
* gcc.target/arm/mve/intrinsics/mve-vldstr16-no-writeback.c: New test.

(cherry picked from commit 9f6abd2db90151c8966d2d2718ab8c299abf1105)

rs6000: Rename instruction xvcvbf16sp to xvcvbf16spn

The xvcvbf16sp mnemonic, which was just added in ISA 3.1 has been renamed
to xvcvbf16spn, to make it consistent with the other non-signaling conversion
instructions which all end with "n". The only use of this instruction is in
an MMA conversion built-in function, so there is little to no compatibility
issue.

2020-08-18 Peter Bergner <bergner@linux.ibm.com>

gcc/
* config/rs6000/rs6000-builtin.def (BU_VSX_1): Rename xvcvbf16sp to
xvcvbf16spn.
* config/rs6000/rs6000-call.c (builtin_function_type): Likewise.
* config/rs6000/vsx.md: Likewise.
* doc/extend.texi: Likewise.

gcc/testsuite/
* gcc.target/powerpc/mma-builtin-3.c: Rename xvcvbf16sp to xvcvbf16spn.

(cherry picked from commit 94bedeaf694c728607a718d599edb4d74a2813c0)

rs6000: ICE when using an MMA type as a function param or return value [PR96506]

PR96506 shows a problem where we ICE on illegal usage, namely using MMA
types for function arguments and return values. The solution is to flag
these illegal usages as errors early, before we ICE.

2020-08-13 Peter Bergner <bergner@linux.ibm.com>

gcc/
PR target/96506
* config/rs6000/rs6000-call.c (rs6000_promote_function_mode): Disallow
MMA types as return values.
(rs6000_function_arg): Disallow MMA types as function arguments.

gcc/testsuite/
PR target/96506
* gcc.target/powerpc/pr96506.c: New test.

(cherry picked from commit 0ad7e730c142ef6cd0ddc1491a89a7f330caa887)

Daily bump.

c++: Handle enumerator in C++20 alias CTAD. [PR96199]

To form a deduction guide for an alias template, we substitute the template
arguments from the pattern into the deduction guide for the underlying
class. In the case of B(A1<X>), that produces B(A1<B<T,1>::X>) -> B<T,1>.
But since an enumerator doesn't have its own template info, and B<T,1> is a
dependent scope, trying to look up B<T,1>::X fails and we crash. So we need
to produce a SCOPE_REF instead.

And trying to use the members of the template class is wrong for other
members, as well, as it gives a nonsensical result if the class is
specialized.

gcc/cp/ChangeLog:

PR c++/96199
* pt.c (maybe_dependent_member_ref): New.
(tsubst_copy) [CONST_DECL]: Use it.
[VAR_DECL]: Likewise.
(tsubst_aggr_type): Handle nested type.

gcc/testsuite/ChangeLog:

PR c++/96199
* g++.dg/cpp2a/class-deduction-alias4.C: New test.

i386: Fix restore_stack_nonlocal expander [PR96536].

-fcf-protection code in restore_stack_nonlocal uses a branch based on
a clobber result. The patch adds missing compare.

2020-08-18 Uroš Bizjak <ubizjak@gmail.com>

gcc/ChangeLog:

PR target/96536
* config/i386/i386.md (restore_stack_nonlocal):
Add missing compare RTX.

d: Fix ICE Segmentation fault during RTL pass: expand on armhf/armel/s390x

gcc/d/ChangeLog:

PR d/96301
* decl.cc (DeclVisitor::visit (FuncDeclaration *)): Only return
non-trivial structs by invisible reference.

gcc/testsuite/ChangeLog:

PR d/96301
* gdc.dg/pr96301a.d: New test.
* gdc.dg/pr96301b.d: New test.
* gdc.dg/pr96301c.d: New test.

(cherry picked from commit 6bebbc033d8bf2246745ffef7186b0424e08ba6b)

Don't use pinsr/pextr for struct initialization/extraction.

gcc/
PR target/96562
PR target/93897
* config/i386/i386-expand.c (ix86_expand_pinsr): Don't use
pinsr for TImode.
(ix86_expand_pextr): Don't use pextr for TImode.

gcc/testsuite/
* gcc.target/i386/pr96562-1.c: New test.

compiler: export thunks referenced by inline functions

The test case is https://golang.org/cl/248637.

Fixes golang/go#40252

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/248638

Daily bump.