Jakub Jelinek [Fri, 4 Sep 2020 09:53:28 +0000 (11:53 +0200)]
lto: Ensure we force a change for file/line/column after clear_line_info
As discussed yesterday:
On the streamer out side, we call clear_line_info
in multiple spots which resets the current_* values to something, but on the
reader side, we don't have corresponding resets in the same location, just have
the stream_* static variables that keep the current values through the
entire stream in (so across all the clear_line_info spots in a single LTO
object but also across jumping from one LTO object to another one).
Now, in an earlier version of my patch it actually broke LTO bootstrap
(and a lot of LTO testcases), so for the BLOCK case I've solved it by
clear_line_info setting current_block to something that should never appear,
which means that in the LTO stream after the clear_line_info spots including
the start of the LTO stream we force the block change bit to be set and thus
BLOCK to be streamed and therefore stream_block from earlier to be
ignored. But for the rest I think that is not the case, so I wonder if we
don't sometimes end up with wrong line/column info because of that, or
please tell me what prevents that.
clear_line_info does:
ob->current_file = NULL;
ob->current_line = 0;
ob->current_col = 0;
ob->current_sysp = false;
while I think NULL current_file is something that should likely be different
from expanded_location (...).file (UNKNOWN_LOCATION/BUILTINS_LOCATION are
handled separately and do not go through the caching), I think line number 0
can sometimes occur and especially column 0 occurs frequently if we ran out
of location_t with columns info. But then we do:
bp_pack_value (bp, ob->current_file != xloc.file, 1);
bp_pack_value (bp, ob->current_line != xloc.line, 1);
bp_pack_value (bp, ob->current_col != xloc.column, 1);
and stream the details only if the != is true. If that happens immediately
after clear_line_info and e.g. xloc.column is 0, we would stream 0 bit and
not stream the actual value, so on read-in it would reuse whatever
stream_col etc. were before. Shouldn't we set some ob->current_* new bit
that would signal we are immediately past clear_line_info which would force
all these != checks to non-zero? Either by oring something into those
tests, or perhaps:
if (ob->current_reset)
{
if (xloc.file == NULL)
ob->current_file = "";
if (xloc.line == 0)
ob->current_line = 1;
if (xloc.column == 0)
ob->current_col = 1;
ob->current_reset = false;
}
before doing those bp_pack_value calls with a comment, effectively forcing
all 6 != comparisons to be true?
2020-09-04 Jakub Jelinek <jakub@redhat.com>
* lto-streamer.h (struct output_block): Add reset_locus member.
* lto-streamer-out.c (clear_line_info): Set reset_locus to true.
(lto_output_location_1): If reset_locus, clear it and ensure
current_{file,line,col} is different from xloc members.
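
The failure mode described above can be reproduced with a toy delta
encoder. This is only a sketch under stated assumptions: Loc, Record,
Writer, Reader and column_after_reset are hypothetical names, only line and
column are modelled, and none of this is the actual lto-streamer code.

```cpp
#include <cassert>

// Toy model of "stream a field only when it changed"; not GCC's streamer.
struct Loc { int line; int col; };

// One streamed record: a "changed" bit per field plus the value.
struct Record {
  bool line_changed; int line;
  bool col_changed;  int col;
};

struct Writer {
  int current_line = 0;
  int current_col = 0;
  bool reset_locus = true;  // hypothetical analogue of the new member

  void clear_line_info() {
    current_line = 0;
    current_col = 0;
    reset_locus = true;
  }

  Record output(Loc loc, bool honor_reset) {
    if (reset_locus && honor_reset) {
      // Make the cached values differ from loc so every field is emitted.
      if (loc.line == 0) current_line = 1;
      if (loc.col == 0) current_col = 1;
    }
    reset_locus = false;
    Record r = {current_line != loc.line, loc.line,
                current_col != loc.col, loc.col};
    current_line = loc.line;
    current_col = loc.col;
    return r;
  }
};

struct Reader {
  int stream_line = 0;
  int stream_col = 0;  // never reset, like the stream_* statics on read-in

  Loc input(const Record &r) {
    if (r.line_changed) stream_line = r.line;
    if (r.col_changed) stream_col = r.col;
    Loc loc = {stream_line, stream_col};
    return loc;
  }
};

// Streams {10,7}, resets the writer only, then streams {20,0}.
// Returns the column the reader reconstructs for the second location.
int column_after_reset(bool honor_reset) {
  Writer w;
  Reader rd;
  rd.input(w.output({10, 7}, honor_reset));
  w.clear_line_info();
  return rd.input(w.output({20, 0}, honor_reset)).col;
}
```

Without the reset flag the reader keeps the stale column from before the
reset; with it, the column bit is forced on and the real value is streamed.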
Jakub Jelinek [Thu, 3 Sep 2020 19:53:40 +0000 (21:53 +0200)]
c++: Fix another PCH hash_map issue [PR96901]
The recent libstdc++ changes caused lots of libstdc++-v3 tests FAILs
on i686-linux, all of them in the same spot during constexpr evaluation
of a recursive _S_gcd call.
The problem is yet another hash_map that used the default hashing of
tree keys through pointer hashing which is preserved across PCH write/read.
During PCH handling, the addresses of GC objects are changed, which means
that the hash values of the keys in such hash tables change without those
hash tables being rehashed. Which in the fundef_copies_table case usually
means we just don't find a copy of a FUNCTION_DECL body for recursive uses
and start from scratch. But when the hash table keeps growing, the "dead"
elements in the hash table can sometimes reappear and break things.
In particular what I saw under the debugger is when the fundef_copies_table
hash map has been used on the outer _S_gcd call, it didn't find an entry for
it, so returned a slot with *slot == NULL, which is treated as meaning that the
function itself is used directly (i.e. no recursion), but that addition of
a hash table slot caused the recursive _S_gcd call to actually find
something in the hash table, unfortunately not the new *slot == NULL spot,
but a different one from the pre-PCH streaming which contained the returned
toplevel (non-recursive) call entry for it, which means that for the
recursive _S_gcd call we actually used the same trees as for the outer ones
rather than a copy of those, which breaks constexpr evaluation.
2020-09-03 Jakub Jelinek <jakub@redhat.com>
PR c++/96901
* tree.h (struct decl_tree_traits): New type.
(decl_tree_map): New typedef.
* constexpr.c (fundef_copies_table): Change type from
hash_map<tree, tree> * to decl_tree_map *.
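
A rough illustration of why address-based hashing is fragile here, assuming
the new traits hash by a stable per-declaration identifier rather than by
address. Decl, HashByUid and EqByUid are hypothetical stand-ins, and
std::unordered_map is used instead of GCC's hash_map; in the real PCH case
the keys are relocated while the stored hashes are not recomputed, which
this sketch only approximates by looking up a copy at a different address.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_map>

struct Decl { unsigned uid; };  // stand-in for a decl with a stable UID

struct HashByUid {
  std::size_t operator()(const Decl *d) const {
    return std::hash<unsigned>()(d->uid);
  }
};

struct EqByUid {
  bool operator()(const Decl *a, const Decl *b) const {
    return a->uid == b->uid;
  }
};

// Insert a decl, then look it up through a copy living at another address.
bool found_with_uid_traits() {
  Decl before = {42};
  std::unordered_map<const Decl *, int, HashByUid, EqByUid> table;
  table[&before] = 1;
  Decl after = before;  // same declaration, "relocated"
  return table.count(&after) == 1;
}

// Same experiment with the default pointer hash and pointer equality.
bool found_with_pointer_hash() {
  Decl before = {42};
  std::unordered_map<const Decl *, int> table;
  table[&before] = 1;
  Decl after = before;
  return table.count(&after) == 1;
}
```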
Jakub Jelinek [Thu, 3 Sep 2020 18:11:43 +0000 (20:11 +0200)]
c++: Disable -frounding-math during manifestly constant evaluation [PR96862]
As discussed in the PR, fold-const.c punts on floating point constant
evaluation if the result is inexact and -frounding-math is turned on.
/* Don't constant fold this floating point operation if the
result may dependent upon the run-time rounding mode and
flag_rounding_math is set, or if GCC's software emulation
is unable to accurately represent the result. */
if ((flag_rounding_math
|| (MODE_COMPOSITE_P (mode) && !flag_unsafe_math_optimizations))
&& (inexact || !real_identical (&result, &value)))
return NULL_TREE;
Jonathan said that we should be evaluating them anyway, e.g. conceptually
as if they are done with the default rounding mode before the user had a chance
to change that, and e.g. in C in initializers it is also ignored.
In fact, fold-const.c for C initializers turns off various other options:
/* Perform constant folding and related simplification of initializer
expression EXPR. These behave identically to "fold_buildN" but ignore
potential run-time traps and exceptions that fold must preserve. */
int saved_signaling_nans = flag_signaling_nans;\
int saved_trapping_math = flag_trapping_math;\
int saved_rounding_math = flag_rounding_math;\
int saved_trapv = flag_trapv;\
int saved_folding_initializer = folding_initializer;\
flag_signaling_nans = 0;\
flag_trapping_math = 0;\
flag_rounding_math = 0;\
flag_trapv = 0;\
folding_initializer = 1;
So, shall cxx_eval_outermost_constant_expr instead turn off all those
options? Then warning_sentinel wouldn't be the right thing to use, and given
the eight or so return stmts in cxx_eval_outermost_constant_expr, we'd
need a RAII class for this. Not sure about the folding_initializer, that
one is affecting complex multiplication and division constant evaluation
somehow.
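
The RAII idea mentioned above could look roughly like the following sketch.
The two globals are local stand-ins for the option flags, and flag_override
and evaluate are hypothetical names, not anything in the GCC sources.

```cpp
#include <cassert>

// Local stand-ins for option flags; assumptions for this sketch only.
int flag_rounding_math = 1;
int flag_trapping_math = 1;

// Sets a flag for the lifetime of the object and restores it afterwards,
// so every return path gets the restore for free.
class flag_override {
  int &flag_;
  int saved_;

public:
  flag_override(int &flag, int value) : flag_(flag), saved_(flag) {
    flag_ = value;
  }
  ~flag_override() { flag_ = saved_; }
  flag_override(const flag_override &) = delete;
  flag_override &operator=(const flag_override &) = delete;
};

// Returns the sum of the flags as seen during "evaluation".
int evaluate(bool early_exit) {
  flag_override rounding(flag_rounding_math, 0);
  flag_override trapping(flag_trapping_math, 0);
  if (early_exit)
    return flag_rounding_math;  // flags are restored on this path too
  return flag_rounding_math + flag_trapping_math;
}
```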
Jakub Jelinek [Thu, 3 Sep 2020 10:51:01 +0000 (12:51 +0200)]
lto: Cache location_ts including BLOCKs in GIMPLE streaming [PR94311]
As mentioned in the PR, when compiling valgrind even on fairly small
testcase where in one larger function the location keeps oscillating
between a small line number and 8000-ish line number in the same file
we very quickly run out of all possible location_t numbers and because of
that emit nonsensical line numbers in .debug_line.
There are ways to slow down the depletion of location_t numbers
in libcpp, but the main reason for this is that we use
stream_input_location_now for streaming in location_t for gimple_location
and phi arg locations. libcpp strongly prefers that the locations
it is given are sorted by the different files and by line numbers in
ascending order, otherwise it depletes quickly no matter what and is much
more costly (many extra file changes etc.).
The reason for not caching those was the BLOCKs that were streamed
immediately after the location and encoded into the locations (and for PHIs
we failed to stream the BLOCKs altogether).
This patch enhances the location cache to handle also BLOCKs (but not for
everything, only for the spots we care about the BLOCKs) and also optimizes
the size of the LTO stream by emitting a single bit into a pack whether the
BLOCK changed from last case and only streaming the BLOCK tree if it
changed.
2020-09-03 Jakub Jelinek <jakub@redhat.com>
PR lto/94311
* gimple.h (gimple_location_ptr, gimple_phi_arg_location_ptr): New
functions.
* streamer-hooks.h (struct streamer_hooks): Add
output_location_and_block callback. Fix up formatting for
output_location.
(stream_output_location_and_block): Define.
* lto-streamer.h (class lto_location_cache): Fix comment typo. Add
current_block member.
(lto_location_cache::input_location_and_block): New method.
(lto_location_cache::lto_location_cache): Initialize current_block.
(lto_location_cache::cached_location): Add block member.
(struct output_block): Add current_block member.
(lto_output_location): Formatting fix.
(lto_output_location_and_block): Declare.
* lto-streamer.c (lto_streamer_hooks_init): Initialize
streamer_hooks.output_location_and_block.
* lto-streamer-in.c (lto_location_cache::cmp_loc): Also compare
block members.
(lto_location_cache::apply_location_cache): Handle blocks.
(lto_location_cache::accept_location_cache,
lto_location_cache::revert_location_cache): Fix up function comments.
(lto_location_cache::input_location_and_block): New method.
(lto_location_cache::input_location): Implement using
input_location_and_block.
(input_function): Invoke apply_location_cache after streaming in all
bbs.
* lto-streamer-out.c (clear_line_info): Set current_block.
(lto_output_location_1): New function, moved from lto_output_location,
added block handling.
(lto_output_location): Implement using lto_output_location_1.
(lto_output_location_and_block): New function.
* gimple-streamer-in.c (input_phi): Use input_location_and_block
to input and cache both location and block.
(input_gimple_stmt): Likewise.
* gimple-streamer-out.c (output_phi): Use
stream_output_location_and_block.
(output_gimple_stmt): Likewise.
Jakub Jelinek [Wed, 2 Sep 2020 10:18:46 +0000 (12:18 +0200)]
fortran: Fix o'...' boz to integer/real conversions [PR96859]
The standard says that excess digits from boz are truncated.
For hexadecimal or binary, the routines copy just the number of digits
that will be needed, but for octal we copy number of digits that
contain one extra bit (for 8-bit, 32-bit or 128-bit, i.e. kind 1, 4 and 16)
or two extra bits (for 16-bit or 64-bit, i.e. kind 2 and 8).
The clearing of the first bit is done correctly by changing the first digit
if it is 4-7 to one smaller by 4 (i.e. modulo 4).
The clearing of the first two bits is done by changing 4 or 6 to 0
and 5 or 7 to 1, which is incorrect, because we really want to change the
first digit to 0 if it was even, or to 1 if it was odd, so digits
2 and 3 are mishandled by keeping them as is, rather than changing 2 to 0
and 3 to 1.
2020-09-02 Jakub Jelinek <jakub@redhat.com>
PR fortran/96859
* check.c (gfc_boz2real, gfc_boz2int): When clearing first two bits,
change also '2' to '0' and '3' to '1' rather than just handling '4'
through '7'.
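
The digit arithmetic described above can be sketched as follows. Both
function names are hypothetical and this is not the code in check.c; it
only shows that masking the leading octal digit handles '2' and '3' the
same way as '4' through '7'.

```cpp
#include <cassert>

// Number of excess bits when `bits` binary digits are written in octal:
// 1 for 8, 32 and 128 bits, 2 for 16 and 64 bits.
int extra_octal_bits(int bits) {
  return (3 - bits % 3) % 3;
}

// Clear the top `extra_bits` (1 or 2) bits of the leading octal digit.
char truncate_leading_octal_digit(char digit, int extra_bits) {
  assert(digit >= '0' && digit <= '7');
  assert(extra_bits == 1 || extra_bits == 2);
  int value = digit - '0';          // three bits
  int kept_bits = 3 - extra_bits;   // bits that survive truncation
  return static_cast<char>('0' + (value & ((1 << kept_bits) - 1)));
}
```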
Jakub Jelinek [Wed, 26 Aug 2020 08:30:15 +0000 (10:30 +0200)]
dwarf2out: Fix up dwarf2out_next_real_insn caching [PR96729]
The addition of NOTE_INSN_BEGIN_STMT and NOTE_INSN_INLINE_ENTRY notes
reintroduced quadratic behavior into dwarf2out_var_location.
This function needs to know the next real instruction to which the var
location note applies, but the way final_scan_insn is called outside of
final.c main loop doesn't make it easy to look up the next real insn in
there (and for non-dwarf it is even useless). Usually next real insn is
only a few notes away, but we can have hundreds of thousands of consecutive
notes only followed by a real insn. To avoid the quadratic behavior,
dwarf2out_var_location contains a cache: it remembers the next note and when it
is called again on that loc_note, it can use the previously computed
dwarf2out_next_real_insn result, rather than walking the insn chain once
again. But, for NOTE_INSN_{BEGIN_STMT,INLINE_ENTRY} dwarf2out_var_location
is not called while the code puts into the cache those notes, which means if
we have e.g. in the worst case NOTE_INSN_VAR_LOCATION and
NOTE_INSN_BEGIN_STMT notes alternating, the cache is not really used.
The following patch fixes it by looking up the next NOTE_INSN_VAR_LOCATION
if any. While the lookup could be perhaps done together with looking for
the next real insn once (e.g. in dwarf2out_next_real_insn or its copy),
there are other dwarf2out_next_real_insn callers which don't need/want that
behavior and if there are more than two NOTE_INSN_VAR_LOCATION notes
followed by the same real insn, we need to do that "find next
NOTE_INSN_VAR_LOCATION" walk anyway.
On the testcase from the PR this patch speeds it up 2.8 times, from 0m0.674s
to 0m0.236s (why it takes for the reporter more than 60s is unknown).
2020-08-26 Jakub Jelinek <jakub@redhat.com>
PR debug/96729
* dwarf2out.c (dwarf2out_next_real_insn): Adjust function comment.
(dwarf2out_var_location): Look for next_note only if next_real is
non-NULL, in that case look for the first non-deleted
NOTE_INSN_VAR_LOCATION between loc_note and next_real, if any.
Eric Botcazou [Thu, 10 Sep 2020 15:47:32 +0000 (17:47 +0200)]
Fix uninitialized variable with nested variant record types
This fixes a wrong code issue with nested variant record types: the
compiler generates move instructions that depend on an uninitialized
variable, which was initially a SAVE_EXPR not instantiated early enough.
gcc/ada/ChangeLog:
* gcc-interface/decl.c (build_subst_list): For a definition, make
sure to instantiate the SAVE_EXPRs generated by the elaboration of
the constraints in front of the elaboration of the type itself.
gcc/testsuite/ChangeLog:
* gnat.dg/discr59.adb: New test.
* gnat.dg/discr59_pkg1.ads: New helper.
* gnat.dg/discr59_pkg2.ads: Likewise.
Marek Polacek [Fri, 4 Sep 2020 20:04:26 +0000 (16:04 -0400)]
c++: Fix ICE in reshape_init with init-list [PR95164]
This patch fixes a long-standing bug in reshape_init_r. Since r209314
we implement DR 1467 which handles list-initialization with a single
initializer of the same type as the target. In this test this causes
a crash in reshape_init_r when we're processing a constructor that has
undergone the DR 1467 transformation.
Take e.g. the
foo({{1, {H{k}}}});
line in the attached test. {H{k}} initializes the field b of H in I.
H{k} is a functional cast, so has TREE_HAS_CONSTRUCTOR set, so is
COMPOUND_LITERAL_P. We perform the DR 1467 transformation and turn
{H{k}} into H{k}. Then we attempt to reshape H{k} again and since
first_initializer_p is null and it's COMPOUND_LITERAL_P, we go here:
else if (COMPOUND_LITERAL_P (stripped_init))
gcc_assert (!BRACE_ENCLOSED_INITIALIZER_P (stripped_init));
then complain about the missing braces, go to reshape_init_class and ICE
on
gcc_checking_assert (d->cur->index
== get_class_binding (type, id));
because due to the missing { } we're looking for 'b' in H, but that's
not found.
So we have to be prepared to handle an initializer whose outer braces
have been removed due to DR 1467.
gcc/cp/ChangeLog:
PR c++/95164
* decl.c (reshape_init_r): When initializing an aggregate member
with an initializer from an initializer-list, also consider
COMPOUND_LITERAL_P.
gcc/testsuite/ChangeLog:
PR c++/95164
* g++.dg/cpp0x/initlist123.C: New test.
Nick Clifton [Wed, 9 Sep 2020 14:59:12 +0000 (15:59 +0100)]
If the lto plugin encounters a file with multiple symbol sections, each of which also has a v1 symbol extension section[1] then it will attempt to read the extension data for *every* symbol from each of the extension sections. This results in reading off the end of a buffer with the associated memory corruption that that entails. This patch fixes that problem.
2020-09-09 Nick Clifton <nickc@redhat.com>
* lto-plugin.c (struct plugin_symtab): Add last_sym field.
(parse_symtab_extension): Only read as many entries as are
available in the buffer. Store the data read into the symbol
table indexed from last_sym. Increment last_sym.
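
A simplified sketch of the bounded read described in the ChangeLog entry,
assuming one byte of extension data per symbol. SymbolTable and
parse_extension are hypothetical names; the real plugin_symtab layout and
extension format are more involved.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct SymbolTable {
  std::vector<unsigned char> ext_data;  // one slot per symbol
  std::size_t last_sym = 0;             // first slot not yet filled
};

// Copies at most `len` extension bytes into the slots starting at last_sym,
// advancing last_sym, and returns how many bytes were stored.
std::size_t parse_extension(SymbolTable &tab, const unsigned char *data,
                            std::size_t len) {
  std::size_t stored = 0;
  while (stored < len && tab.last_sym < tab.ext_data.size())
    tab.ext_data[tab.last_sym++] = data[stored++];
  return stored;
}
```

Because each call continues from last_sym and stops at the end of its own
buffer, a second extension section fills the following symbols instead of
being read for every symbol.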
Fortran: Fixes for OpenMP loop-iter privatization (PRs 95109 + 94690)
This commit also fixes a gfortran.dg/gomp/target1.f90 regression;
target1.f90 tests the resolve.c and openmp.c changes.
gcc/fortran/ChangeLog:
PR fortran/95109
PR fortran/94690
* resolve.c (gfc_resolve_code): Also call
gfc_resolve_omp_parallel_blocks for 'distribute parallel do (simd)'.
* openmp.c (gfc_resolve_omp_parallel_blocks): Handle it.
* trans-openmp.c (gfc_trans_omp_target): For TARGET_PARALLEL_DO_SIMD,
call simd not do processing function.
gcc/testsuite/ChangeLog:
PR fortran/95109
PR fortran/94690
* gfortran.dg/gomp/openmp-simd-5.f90: New test.
[PATCH PR96357][GCC][AArch64]: could not split insn UNSPEC_COND_FSUB with AArch64 SVE
The problem is that operand 4 (in the original pattern
cond_sub<mode>_any_const) is no longer the same as operand 1, and so
the pattern doesn't match the split condition.
Pattern cond_sub<mode>_any_const is being split by this patch into two
separate patterns:
* Pattern cond_sub<mode>_relaxed_const now matches const_int
SVE_RELAXED_GP operand.
* Pattern cond_sub<mode>_strict_const now matches const_int
SVE_STRICT_GP operand.
* Remove aarch64_sve_pred_dominates_p condition from both patterns.
gcc/ChangeLog:
PR target/96357
* config/aarch64/aarch64-sve.md
(cond_sub<mode>_relaxed_const): Updated and renamed from
cond_sub<mode>_any_const pattern.
(cond_sub<mode>_strict_const): New pattern.
gcc/testsuite/ChangeLog:
PR target/96357
* gcc.target/aarch64/sve/pr96357.c: New test.
Harald Anlauf [Thu, 3 Sep 2020 18:33:14 +0000 (20:33 +0200)]
PR fortran/96890 - Wrong answer with intrinsic IALL
The IALL intrinsic would always return 0 when the DIM and MASK arguments
were present since the initial value of repeated BIT-AND operations was
set to 0 instead of -1.
libgfortran/ChangeLog:
* m4/iall.m4: Initial value for result should be -1.
* generated/iall_i1.c (miall_i1): Generated.
* generated/iall_i16.c (miall_i16): Likewise.
* generated/iall_i2.c (miall_i2): Likewise.
* generated/iall_i4.c (miall_i4): Likewise.
* generated/iall_i8.c (miall_i8): Likewise.
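
The effect of the initial value can be seen in a small sketch of a masked
bitwise-AND reduction. masked_iall is a hypothetical helper, not the
generated libgfortran code; the initial value is a parameter so both
behaviors can be compared.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// AND together the elements whose mask entry is true, starting from
// `initial`.  The identity for bitwise AND is all bits set, i.e. -1.
std::int32_t masked_iall(const std::vector<std::int32_t> &values,
                         const std::vector<bool> &mask,
                         std::int32_t initial) {
  assert(values.size() == mask.size());
  std::int32_t result = initial;
  for (std::size_t i = 0; i < values.size(); ++i)
    if (mask[i])
      result &= values[i];
  return result;
}
```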
liuhongt [Tue, 18 Aug 2020 05:18:03 +0000 (13:18 +0800)]
Adjust testcase.
Since this testcase is used to check generation of AVX512 vector
comparison, scan-assembler for vmov instruction could be deleted, also
-mprefer-vector-width=512 is added to avoid impact of different
default arch/tune of GCC.
d: Fix ICE in create_tmp_var, at gimple-expr.c:482
Array concatenate expressions were creating more SAVE_EXPRs than what
was necessary. The internal error itself was the result of a forced
temporary being made on a TREE_ADDRESSABLE type.
Martin Jambor [Fri, 4 Sep 2020 12:31:16 +0000 (14:31 +0200)]
sra: Avoid SRAing if there is an out-of-bounds access (PR 96820)
The testcase causes an ICE in the SRA verifier on x86_64 when
compiling with -m32 because build_user_friendly_ref_for_offset looks
at an out-of-bounds array_ref within an array_ref which accesses an
offset which does not fit into a signed 32bit integer and turns it
into an array-ref with a negative index.
The best thing is probably to bail out early when encountering an out
of bounds access to a local stack-allocated aggregate (and let the DSE
just delete such statements) which is what the patch does.
I also glanced over to the initial candidate vetting routine to make
sure the size would fit into HWI and noticed that it uses unsigned
variants whereas the rest of SRA operates on signed offsets and
sizes (because get_ref_and_extent does) and so changed that for the
sake of consistency. These ancient checks operate on sizes of types
as opposed to DECLs but I hope that any issues potentially arising
from that are basically hypothetical.
gcc/ChangeLog:
2020-08-28 Martin Jambor <mjambor@suse.cz>
PR tree-optimization/96820
* tree-sra.c (create_access): Disqualify candidates with accesses
beyond the end of the original aggregate.
(maybe_add_sra_candidate): Check that candidate type size fits
signed uhwi for the sake of consistency.
gcc/testsuite/ChangeLog:
2020-08-28 Martin Jambor <mjambor@suse.cz>
PR tree-optimization/96820
* gcc.dg/tree-ssa/pr96820.c: New test.
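
The disqualification test amounts to a range check on signed offsets and
sizes. A minimal sketch, with access_fits as a hypothetical helper that is
written to avoid overflowing when the offset is huge; it is not the actual
check in create_access.

```cpp
#include <cassert>

// True if [offset, offset + size) lies within an aggregate of
// `aggregate_size` units.  Written so that offset + size is never computed.
bool access_fits(long long offset, long long size, long long aggregate_size) {
  return offset >= 0 && size >= 0 && size <= aggregate_size
         && offset <= aggregate_size - size;
}
```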
Peter Bergner [Tue, 1 Sep 2020 18:47:44 +0000 (13:47 -0500)]
rs6000: MMA built-in dies with incorrect sharing of tree nodes error
When we expand our MMA built-ins into gimple, we erroneously reused the
accumulator memory reference for both the source input value as well as
the destination output value. This led to a tree sharing error.
The solution is to create separate memory references for the input
and output values.
2020-09-01 Peter Bergner <bergner@linux.ibm.com>
gcc/
PR target/96808
* config/rs6000/rs6000-call.c (rs6000_gimple_fold_mma_builtin): Do not
reuse accumulator memory reference for source and destination accesses.
gcc/testsuite/
PR target/96808
* gcc.target/powerpc/pr96808.c: New test.
Jonathan Wakely [Tue, 7 Jul 2020 22:26:38 +0000 (23:26 +0100)]
libstdc++: Replace __int_limits with __numeric_traits_integer
I recently added std::__detail::__int_limits as a lightweight
alternative to std::numeric_limits, forgetting that the values it
provides (digits, min and max) are already provided by
__gnu_cxx::__numeric_traits.
This change adds __int_traits as an alias for __numeric_traits_integer.
This avoids instantiating __numeric_traits to decide whether to use
__numeric_traits_integer or __numeric_traits_floating. Then all uses of
__int_limits can be replaced with __int_traits, and __int_limits can be
removed.
Jonathan Wakely [Fri, 28 Aug 2020 21:45:24 +0000 (22:45 +0100)]
libstdc++: Fix std::gcd and std::lcm for unsigned integers [PR 92978]
This fixes a bug with mixed signed and unsigned types, where converting
a negative value to the unsigned result type alters the value. The
solution is to obtain the absolute values of the arguments immediately
and to perform the actual GCD or LCM algorithm on two arguments of the
same type.
In order to operate on the most negative number without overflow when
taking its absolute value, use an unsigned type for the result of the abs
operation. For example, -INT_MIN will overflow, but -(unsigned)INT_MIN
is (unsigned)INT_MAX+1U which is the correct value.
libstdc++-v3/ChangeLog:
PR libstdc++/92978
* include/std/numeric (__abs_integral): Replace with ...
(__detail::__absu): New function template that returns an
unsigned type, guaranteeing it can represent the most
negative signed value.
(__detail::__gcd, __detail::__lcm): Require arguments to
be unsigned and therefore already non-negative.
(gcd, lcm): Convert arguments to absolute value as unsigned
type before calling __detail::__gcd or __detail::__lcm.
* include/experimental/numeric (gcd, lcm): Likewise.
* testsuite/26_numerics/gcd/gcd_neg.cc: Adjust expected
errors.
* testsuite/26_numerics/lcm/lcm_neg.cc: Likewise.
* testsuite/26_numerics/gcd/92978.cc: New test.
* testsuite/26_numerics/lcm/92978.cc: New test.
* testsuite/experimental/numeric/92978.cc: New test.
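
The approach can be illustrated with a small stand-alone sketch. absu and
gcd_u here are hypothetical simplified helpers, not the libstdc++
__detail::__absu and __detail::__gcd templates.

```cpp
#include <cassert>
#include <climits>
#include <type_traits>

// Absolute value returned as the corresponding unsigned type, so the most
// negative value is representable and no signed overflow occurs.
template <typename T>
constexpr typename std::make_unsigned<T>::type absu(T value) {
  using U = typename std::make_unsigned<T>::type;
  return value < 0 ? U(0) - static_cast<U>(value) : static_cast<U>(value);
}

// Euclid's algorithm on operands that are already non-negative.
constexpr unsigned gcd_u(unsigned a, unsigned b) {
  return b == 0 ? a : gcd_u(b, a % b);
}
```

Taking the absolute value first and only then running the algorithm on two
unsigned operands avoids the case where a negative argument is converted
to the unsigned result type before its sign has been removed.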
Jonathan Wakely [Wed, 2 Sep 2020 14:17:24 +0000 (15:17 +0100)]
libstdc++: Fix three-way comparison for std::array [PR 96851]
The spaceship operator for std::array uses memcmp when the
__is_byte<value_type> trait is true, but memcmp isn't usable in
constexpr contexts. Also, memcmp should only be used for unsigned byte
types, because it gives the wrong answer for signed chars with negative
values.
We can simply check std::is_constant_evaluated() so that we don't use
memcmp during constant evaluation.
To fix the problem of using memcmp for inappropriate types, this patch
adds new __is_memcmp_ordered and __is_memcmp_ordered_with traits. These
say whether using memcmp will give the right answer for ordering
operations such as lexicographical_compare and three-way comparisons.
The new traits can be used in several places.
Unlike the trunk commit this was backported from, this commit for the
branch doesn't extend the memcmp optimisations to all unsigned integers
on big endian targets. Only narrow character types and std::byte will
use memcmp.
libstdc++-v3/ChangeLog:
PR libstdc++/96851
* include/bits/cpp_type_traits.h (__is_memcmp_ordered):
New trait that says if memcmp can be used for ordering.
(__is_memcmp_ordered_with): Likewise, for two types.
* include/bits/ranges_algo.h (__lexicographical_compare_fn):
Use new traits instead of __is_byte and __numeric_traits.
* include/bits/stl_algobase.h (__lexicographical_compare_aux1)
(__is_byte_iter): Likewise.
* include/std/array (operator<=>): Likewise. Only use memcmp
when std::is_constant_evaluated() is false.
* testsuite/23_containers/array/comparison_operators/96851.cc:
New test.
* testsuite/23_containers/array/tuple_interface/get_neg.cc:
Adjust dg-error line numbers.
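
The signed char problem is easy to demonstrate in isolation. The two helper
names below are hypothetical and unrelated to the new traits; the sketch
only compares an ordinary value comparison with a one-byte memcmp.

```cpp
#include <cassert>
#include <cstring>

// Ordering by value, as operator< on signed char would do.
bool less_by_value(signed char a, signed char b) {
  return a < b;
}

// Ordering by memcmp, which compares the bytes as unsigned char.
bool less_by_memcmp(signed char a, signed char b) {
  return std::memcmp(&a, &b, 1) < 0;
}
```

For a negative and a positive value the two orderings disagree, because the
negative value's byte is large when read as unsigned char.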
BPF is an ELF-based target, so it definitely benefits from using
elfos.h. This patch makes the target use it, and removes
superfluous definitions from bpf.h which are better defined in
elfos.h.
Note that BPF, despite being an ELF target, doesn't use DWARF. At
some point it will generate DWARF when generating xBPF (-mxbpf) and
BTF when generating plain eBPF, but for the time being it just
generates stabs.
Mark Eggleston [Mon, 1 Jun 2020 07:15:31 +0000 (08:15 +0100)]
Fortran : ICE on invalid code PR95398
The CLASS_DATA macro is used to shorten the code accessing the derived
components of an expression's type specification. If the type is not
BT_CLASS the derived pointer is NULL resulting in an ICE. To avoid
dereferencing a NULL pointer the type should be BT_CLASS.
2020-09-01 Steven G. Kargl <kargl@gcc.gnu.org>
gcc/fortran
PR fortran/95398
* resolve.c (resolve_select_type): Add check for BT_CLASS
type before using the CLASS_DATA macro which will have a
NULL pointer to derive components if it isn't BT_CLASS.
2020-09-01 Mark Eggleston <markeggleston@gcc.gnu.org>
gcc/testsuite
PR fortran/95398
* gfortran.dg/pr95398.f90: New test.
Richard Biener [Tue, 4 Aug 2020 12:10:45 +0000 (14:10 +0200)]
tree-optimization/88240 - stopgap for floating point code-hoisting issues
This adds a stopgap measure to avoid performing code-hoisting
on mixed type loads when the load we'd insert in the hoisting
position would be a floating point one. This is because certain
targets (hello x87) cannot perform floating point loads without
possibly altering the bit representation and thus cannot be used
in place of integral loads.
2020-08-04 Richard Biener <rguenther@suse.de>
PR tree-optimization/88240
* tree-ssa-sccvn.h (vn_reference_s::punned): New flag.
* tree-ssa-sccvn.c (vn_reference_insert): Initialize punned.
(vn_reference_insert_pieces): Likewise.
(visit_reference_op_call): Likewise.
(visit_reference_op_load): Track whether a ref was punned.
* tree-ssa-pre.c (do_hoist_insertion): Refuse to perform hoist
insertion on punned floating point loads.
Richard Biener [Mon, 31 Aug 2020 11:36:09 +0000 (13:36 +0200)]
tree-optimization/96854 - SLP reduction of two-operator is broken
This fixes SLP reduction of two-operator operations by marking those
not supported. In fact any live lane out of such an operation cannot
be code-generated correctly.
gcc/
PR target/96551
* config/i386/sse.md (vec_unpacku_float_hi_v16si): For vector
compare to integer mask, don't use gen_rtx_LT, use
ix86_expand_mask_vec_cmp instead.
(vec_unpacku_float_hi_v16si): Ditto.
gcc/testsuite
* gcc.target/i386/avx512f-pr96551-1.c: New test.
* gcc.target/i386/avx512f-pr96551-2.c: New test.
Tobias Burnus [Fri, 28 Aug 2020 11:54:10 +0000 (13:54 +0200)]
Fortran: Fix absent-optional handling for nondescriptor arrays (PR94672)
gcc/fortran/ChangeLog:
PR fortran/94672
* trans-array.c (gfc_trans_g77_array): Check against the parm decl and
set the nonparm decl used for the is-present check to NULL if absent.
gcc/testsuite/ChangeLog:
PR fortran/94672
* gfortran.dg/optional_assumed_charlen_2.f90: New test.
Iain Buclaw [Mon, 24 Aug 2020 22:39:17 +0000 (00:39 +0200)]
d: Fix no NRVO when returning an array of a non-POD struct
TREE_ADDRESSABLE was not propagated from the RECORD_TYPE to the ARRAY_TYPE, so
NRVO code generation was not being triggered.
gcc/d/ChangeLog:
PR d/96157
* d-codegen.cc (d_build_call): Handle TREE_ADDRESSABLE static arrays.
* types.cc (make_array_type): Propagate TREE_ADDRESSABLE from base
type to static array.
gcc/testsuite/ChangeLog:
PR d/96157
* gdc.dg/pr96157a.d: New test.
* gdc.dg/pr96157b.d: New test.
* d-lang.cc (d_parse_file): Use read() to load contents from stdin,
allow the front-end to free the memory after parsing.
* dmd/func.c (FuncDeclaration::semantic): Use module filename if
searchPath returns NULL.
liuhongt [Wed, 26 Aug 2020 07:24:10 +0000 (15:24 +0800)]
Add expander for movp2hi and movp2qi.
2020-08-30 Uros Bizjak <ubizjak@gmail.com>
gcc/ChangeLog:
PR target/96744
* config/i386/i386-expand.c (split_double_mode): Also handle
E_P2HImode and E_P2QImode.
* config/i386/sse.md (MASK_DWI): New define_mode_iterator.
(mov<mode>): New expander for P2HI,P2QI.
(*mov<mode>_internal): New define_insn_and_split to split
movement of P2QI/P2HI to 2 movqi/movhi patterns after reload.
Mark Eggleston [Fri, 21 Aug 2020 05:39:30 +0000 (06:39 +0100)]
Fortran : ICE for division by zero in declaration PR95882
A length expression containing a divide by zero in a character
declaration will result in an ICE if the constant is any more
complicated than a constant divided by a constant.
The cause was that char_len_param_value can return MATCH_YES
even if a divide by zero was seen. Prior to returning, check
whether a divide by zero was seen and if so return MATCH_ERROR.
2020-08-27 Mark Eggleston <markeggleston@gcc.gnu.org>
gcc/fortran
PR fortran/95882
* decl.c (char_len_param_value): Check gfc_seen_div0 and
if it is set return MATCH_ERROR.
2020-08-27 Mark Eggleston <markeggleston@gcc.gnu.org>
gcc/testsuite/
PR fortran/95882
* gfortran.dg/pr95882_1.f90: New test.
* gfortran.dg/pr95882_2.f90: New test.
* gfortran.dg/pr95882_3.f90: New test.
* gfortran.dg/pr95882_4.f90: New test.
* gfortran.dg/pr95882_5.f90: New test.
Christophe Lyon [Wed, 19 Aug 2020 09:02:21 +0000 (09:02 +0000)]
arm: Fix -mpure-code support/-mslow-flash-data for armv8-m.base [PR94538]
armv8-m.base (cortex-m23) has the movt instruction, so we need to
disable the define_split to generate a constant in this case,
otherwise we get incorrect insn constraints as described in PR94538.
We also need to fix the pure-code alternative for thumb1_movsi_insn
because the assembler complains with instructions like
movs r0, #:upper8_15:1234
(Internal error in md_apply_fix)
We now generate movs r0, 4 instead.
Jonathan Wakely [Wed, 26 Aug 2020 13:47:51 +0000 (14:47 +0100)]
libstdc++: Enable assertions in constexpr string_view members [PR 71960]
Since GCC 6.1 there is no reason we can't just use __glibcxx_assert in
constexpr functions in string_view. As long as the condition is true,
there will be no call to std::__replacement_assert that would make the
function ineligible for constant evaluation.
2020-08-26 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
PR middle-end/87256
* config/pa/pa.c (hppa_rtx_costs_shadd_p): New helper function
to check for coefficients supported by shNadd and shladd,l.
(hppa_rtx_costs): Rewrite to avoid using estimates based upon
FACTOR and enable recursing deeper into RTL expressions.
* config/pa/pa.md (shd_internal): Fix define_expand to provide
gen_shd_internal.
Roger Sayle [Wed, 26 Aug 2020 06:15:15 +0000 (07:15 +0100)]
hppa: Improve expansion of ashldi3 when !TARGET_64BIT
Backport from master:
2020-08-26 Roger Sayle <roger@nextmovesoftware.com>
* config/pa/pa.md (ashldi3): Additionally, on !TARGET_64BIT
generate a two instruction shd/zdep sequence when shifting
registers by suitable constants.
(shd_internal): New define_expand to provide gen_shd_internal.
Jakub Jelinek [Tue, 25 Aug 2020 11:49:40 +0000 (13:49 +0200)]
gimple: Ignore *0 = {CLOBBER} in path isolation [PR96722]
Clobbers of MEM_REF with NULL address are just fancy nops, something we just
ignore and don't emit any code for it (ditto for other clobbers), they just
mark end of life on something, so we shouldn't infer from those that there
is some UB.
Jakub Jelinek [Tue, 25 Aug 2020 11:47:10 +0000 (13:47 +0200)]
strlen: Fix handle_builtin_string_cmp [PR96758]
The following testcase is miscompiled, because handle_builtin_string_cmp
sees a strncmp call with constant last argument 4, where one of the strings
has an upper bound of 5 bytes (due to it being an array of that size) and
the other has a known string length of 1 and the result is used only in
equality comparison.
It is folded into __builtin_strncmp_eq (str1, str2, 4), which is
incorrect, because that means reading 4 bytes from both strings and
comparing that. When one of the strings has known strlen of 1, we want to
compare just 2 bytes, not 4, as strncmp shouldn't compare any bytes beyond
the null.
So, the last argument to __builtin_strncmp_eq should be the minimum of the
provided strncmp last argument and the known string length + 1 (assuming
the other string has only a known upper bound due to array size).
Besides that, I've noticed the code has been written with the intent to also
support the case where we know exact string length of both strings (but not
the string content, so we can't compute it at compile time). In that case,
both cstlen1 and cstlen2 are non-negative and both arysiz1 and arysiz2 are
negative. We wouldn't optimize that, cmpsiz would be either the strncmp
last argument, or for strcmp the first string length, but varsiz would be
-1 and thus cmpsiz would be never < varsiz. The patch fixes it by using the
correct length, in that case using the minimum of the two and for strncmp
also the last argument.
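The size computation described above can be sketched roughly as follows,
assuming C++14 or later. This is not the code in tree-ssa-strlen.c;
sketch_cmpsiz is a hypothetical helper, negative arguments stand for
"unknown" (or "no bound" for plain strcmp), and the array size checks are
left out:

```cpp
// Hypothetical helper: number of bytes worth comparing, or -1 when
// neither string length is known.
constexpr long sketch_cmpsiz(long cstlen1, long cstlen2, long bound)
{
  long len = -1;
  if (cstlen1 >= 0 && cstlen2 >= 0)
    len = cstlen1 < cstlen2 ? cstlen1 : cstlen2;  // both known: minimum
  else if (cstlen1 >= 0)
    len = cstlen1;                                // only the first is known
  else if (cstlen2 >= 0)
    len = cstlen2;                                // only the second is known

  if (len < 0)
    return -1;

  long cmpsiz = len + 1;                          // include the terminating NUL
  if (bound >= 0 && bound < cmpsiz)
    cmpsiz = bound;                               // strncmp bound is smaller
  return cmpsiz;
}
```

For the miscompiled case from this message (bound 4, one known length of 1,
the other string only bounded by its array size) the sketch yields 2 rather
than 4.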
2020-08-25 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/96758
* tree-ssa-strlen.c (handle_builtin_string_cmp): If both cstlen1
and cstlen2 are set, set cmpsiz to their minimum, otherwise use the
one that is set. If bound is used and smaller than cmpsiz, set cmpsiz
to bound. If both cstlen1 and cstlen2 are set, perform the optimization.
Jakub Jelinek [Tue, 25 Aug 2020 05:17:10 +0000 (07:17 +0200)]
gimple-fold: Don't optimize weirdo floating point value reads [PR95450]
My patch to introduce native_encode_initializer to fold_ctor_reference
apparently broke gnulib/m4 on powerpc64.
There it uses a const union with two doubles and corresponding IBM double
double long double which actually is the largest normalizable long double
value (1 ulp higher than __LDBL_MAX__). The reason our __LDBL_MAX__ is
smaller is that we internally treat the double double type as one having
106-bit precision, but it actually has a variable 53-bit to 2000-ish bit precision
and for the
0x1.fffffffffffff7ffffffffffffc000p+1023L
value gnulib uses we need 107-bit precision, therefore for GCC __LDBL_MAX__
is
0x1.fffffffffffff7ffffffffffff8000p+1023L
Before my changes, we wouldn't be able to fold_ctor_reference it and it
worked fine at runtime, but with the change we are able to do that, but
because it is larger than anything we can handle internally, we treat it
weirdly. A similar problem would arise if somebody created this way a valid
value that needs much more than 106-bit precision, e.g. 1.0 + 1.0e-768.
Now, I think similar problem could happen e.g. on i?86/x86_64 with long
double there, it also has some weird values in the format, e.g. the
unnormals, pseudo infinities and various other magic values.
This patch for floating point types (including vector and complex types
with such elements) will try to encode the returned value again and punt
if it has different memory representation from the original. Note, this
is only done in the path where native_encode_initializer was used, in order
not to affect e.g. just reading an unpunned long double value; the value
should be compiler generated in that case and thus should be properly
representable. It will punt also if e.g. the padding bits are initialized
to non-zero values.
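The "encode again and punt on mismatch" idea can be illustrated with a toy
analogue, assuming C++14 or later. Nothing here is GCC code: the "target
format" is a 32-bit unsigned integer, the "internal representation" is float
with its 24 significant bits, and sketch_result and sketch_interpret_checked
are hypothetical names. Comparing the re-encoded integer with the original is
equivalent to comparing their byte images:

```cpp
#include <cstdint>

struct sketch_result {
  bool ok;      // false means "punt, leave the read for run time"
  float value;  // meaningful only when ok is true
};

constexpr sketch_result sketch_interpret_checked(std::uint32_t original)
{
  // Interpret: this conversion may round, since float has fewer bits.
  float value = static_cast<float>(original);

  // Values that rounded up to 2^32 cannot be converted back; punt.
  if (value >= 4294967296.0f)
    return {false, 0.0f};

  // Encode again and compare with what we started from.
  std::uint32_t reencoded = static_cast<std::uint32_t>(value);
  if (reencoded != original)
    return {false, 0.0f};

  return {true, value};
}
```

In this toy, 16777216 (2^24) survives the round trip, while 16777217 does not
and is therefore not folded.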
I think the verification that what we encode can be interpreted back
would be only an internal consistency check (so perhaps for ENABLE_CHECKING
if flag_checking only, but if both directions perform it, then we need
to avoid mutual recursion).
While for the other direction (interpretation), at least for the broken by
design long doubles we just know we can't represent in GCC all valid values.
The other floating point formats are just theoretical case, perhaps we would
canonicalize something to a value that wouldn't trigger invalid exception
when without canonicalization it would trigger it at runtime, so let's just
ignore those.
Adjusted (so far untested) patch to do it in native_interpret_real instead
and limit it to the MODE_COMPOSITE_P cases, for which e.g.
fold-const.c/simplify-rtx.c punts in several other places too because we just
know we can't represent everything.
E.g.
/* Don't constant fold this floating point operation if the
result may dependent upon the run-time rounding mode and
flag_rounding_math is set, or if GCC's software emulation
is unable to accurately represent the result. */
if ((flag_rounding_math
|| (MODE_COMPOSITE_P (mode) && !flag_unsafe_math_optimizations))
&& (inexact || !real_identical (&result, &value)))
return NULL_TREE;
Or perhaps guard it with MODE_COMPOSITE_P (mode) && !flag_unsafe_math_optimizations
too, thus break what gnulib / m4 does with -ffast-math, but not normally?
2020-08-25 Jakub Jelinek <jakub@redhat.com>
PR target/95450
* fold-const.c (native_interpret_real): For MODE_COMPOSITE_P modes
punt if the to be returned REAL_CST does not encode to the bitwise
same representation.
Jakub Jelinek [Tue, 18 Aug 2020 05:51:58 +0000 (07:51 +0200)]
c: Fix -Wunused-but-set-* warning with _Generic [PR96571]
The following testcase shows various problems with -Wunused-but-set*
warnings and _Generic construct. I think it is best to treat the selector
and the ignored expressions as (potentially) read, because when they are
parsed, the vars in there are already marked as TREE_USED.
2020-08-18 Jakub Jelinek <jakub@redhat.com>
PR c/96571
* c-parser.c (c_parser_generic_selection): Change match_found from bool
to int, holding index of the match. Call mark_exp_read on the selector
expression and on expressions other than the selected one.
Jakub Jelinek [Wed, 12 Aug 2020 15:00:41 +0000 (17:00 +0200)]
Fix up flag_cunroll_grow_size handling in presence of optimize attr [PR96535]
As the testcase in the PR shows (not included in the patch, as
it seems quite fragile to observe unrolling in the IL), the introduction of
flag_cunroll_grow_size broke optimize attribute related to loop unrolling.
The problem is that the new option flag is set (if not set explicitly) only
in process_options and in rs6000_option_override_internal (and there only if
global_init_p). So, this means that while it is Optimization option, it
will only be set based on the command line -funroll-loops/-O3/-fpeel-loops
or -funroll-all-loops, which means that if command line does include any of
those, it is enabled even for functions that will through optimize attribute
have all of those disabled, and if command line does not include those,
it will not be enabled for functions that will through optimize attribute
have any of those enabled.
process_options is called just once, so IMHO it should be handling only
non-Optimization option adjustments (various other options suffer from that
too, but as this is a regression from 10.1 on the 10 branch, changing those
is not appropriate). Similarly, rs6000_option_override_internal is called
only once (with global_init_p) and then for target attribute handling, but
not for optimize attribute handling.
This patch moves the unrolling related handling from process_options into
finish_options which is invoked whenever the options are being finalized,
and the rs6000 specific parts into the override_options_after_change hook
which is called for optimize attribute handling (and unfortunately also
on cfun changes, but what the hook does is cheap) and I've added a call to
that from rs6000_override_options_internal, so it is also called on cmdline
processing and for target attribute.
Furthermore, it stops using AUTODETECT_VALUE, which can work only once,
and instead uses the global_options_set.x_... flags.
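A rough sketch of the difference, assuming C++14 or later. The types and the
function below are hypothetical simplifications, not the real gcc_options
machinery, and the condition used to derive the flag is only a paraphrase of
the options named above. Because the "explicitly set" information lives in a
separate record, the finalization can be rerun after an optimize attribute
changes the inputs, whereas a sentinel value stored in the flag itself is
overwritten by the first run:

```cpp
// Hypothetical, heavily simplified option values.
struct sketch_opts {
  int optimize;
  bool unroll_loops;
  bool peel_loops;
  bool cunroll_grow_size;
};

// Hypothetical record of which options the user set explicitly.
struct sketch_opts_set {
  bool cunroll_grow_size;
};

// Re-runnable finalization: derive the flag unless it was set explicitly.
constexpr sketch_opts sketch_finish_options(sketch_opts opts,
                                            sketch_opts_set set)
{
  if (!set.cunroll_grow_size)
    opts.cunroll_grow_size =
        opts.unroll_loops || opts.peel_loops || opts.optimize >= 3;
  return opts;
}
```

Running it once for a -O2 command line and again for a function whose
optimize attribute enables unrolling gives different results for the derived
flag, which is the behaviour the sentinel approach could not provide.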
2020-08-12 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/96535
* toplev.c (process_options): Move flag_unroll_loops and
flag_cunroll_grow_size handling from here to ...
* opts.c (finish_options): ... here. For flag_cunroll_grow_size,
don't check for AUTODETECT_VALUE, but instead check
opts_set->x_flag_cunroll_grow_size.
* common.opt (funroll-completely-grow-size): Default to 0.
* config/rs6000/rs6000.c (TARGET_OVERRIDE_OPTIONS_AFTER_CHANGE):
Redefine.
(rs6000_override_options_after_change): New function.
(rs6000_option_override_internal): Call it. Move there the
flag_cunroll_grow_size, unroll_only_small_loops and
flag_rename_registers handling.
Jakub Jelinek [Tue, 11 Aug 2020 14:46:49 +0000 (16:46 +0200)]
c-family: Fix ICE in get_atomic_generic_size [PR96545]
As the testcase shows, we would ICE if the type of the first argument of
various atomic builtins was pointer to (non-void) incomplete type, we would
assume that TYPE_SIZE_UNIT must be non-NULL. This patch diagnoses it
instead. And also changes the TREE_CODE != INTEGER_CST check to
!tree_fits_uhwi_p, as we use tree_to_uhwi after this and at least in theory
the int could be too large and not fit.
2020-08-11 Jakub Jelinek <jakub@redhat.com>
PR c/96545
* c-common.c (get_atomic_generic_size): Require that first argument's
type points to a complete type and use tree_fits_uhwi_p instead of
just INTEGER_CST TREE_CODE check for the TYPE_SIZE_UNIT.
Jakub Jelinek [Tue, 11 Aug 2020 11:46:14 +0000 (13:46 +0200)]
tree: Fix up get_narrower [PR96549]
My changes to get_narrower to support COMPOUND_EXPRs apparently
used a wrong type for the COMPOUND_EXPRs, while e.g. the rhs
type was unsigned short, the COMPOUND_EXPR got int type as that was the
original type of op. The type of COMPOUND_EXPR should be always the type
of the rhs.
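The same rule is visible at the language level, where a comma expression
takes the type of its right operand. The following is only a source-level
illustration of that rule, not of the tree-level fix; sketch_lhs and
sketch_comma_type are hypothetical names:

```cpp
#include <type_traits>

// Hypothetical helper whose int result is discarded by the comma operator.
inline int sketch_lhs() { return 0; }

// The comma expression has the type of its right operand (unsigned short
// here), not the type of the left operand (int).
using sketch_comma_type =
    decltype((sketch_lhs(), static_cast<unsigned short>(1)));

static_assert(std::is_same<sketch_comma_type, unsigned short>::value,
              "comma expression takes the type of its right operand");
```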
2020-08-11 Jakub Jelinek <jakub@redhat.com>
PR c/96549
* tree.c (get_narrower): Use TREE_TYPE (ret) instead of
TREE_TYPE (win) for COMPOUND_EXPRs.
Jakub Jelinek [Mon, 10 Aug 2020 15:53:46 +0000 (17:53 +0200)]
c++: Fix constexpr evaluation of SPACESHIP_EXPR [PR96497]
The following valid testcase is rejected, because cxx_eval_binary_expression
is called on the SPACESHIP_EXPR with lval = true, as the address of the
spaceship needs to be passed to a method call.
After recursing on the operands and calling genericize_spaceship which turns
it into a TARGET_EXPR with initialization, we call cxx_eval_constant_expression
on it which succeeds, but then we fall through into code that will
VERIFY_CONSTANT (r) which FAILs because it is an address of a variable. Rather
than avoiding that for lval = true and SPACESHIP_EXPR, the patch just tail
calls cxx_eval_constant_expression - I believe that call should perform all
the needed verifications.
2020-08-10 Jakub Jelinek <jakub@redhat.com>
PR c++/96497
* constexpr.c (cxx_eval_binary_expression): For SPACESHIP_EXPR, tail
call cxx_eval_constant_expression after genericize_spaceship to avoid
undesirable further VERIFY_CONSTANT.
Jakub Jelinek [Sat, 8 Aug 2020 09:10:30 +0000 (11:10 +0200)]
openmp: Handle clauses with gimple sequences in convert_nonlocal_omp_clauses properly
If the walk_body on the various sequences of reduction, lastprivate and/or linear
clauses needs to create a temporary variable, we should declare that variable
in that sequence rather than outside, where it would need to be privatized inside of
the construct.
2020-08-08 Jakub Jelinek <jakub@redhat.com>
PR fortran/93553
* tree-nested.c (convert_nonlocal_omp_clauses): For
OMP_CLAUSE_REDUCTION, OMP_CLAUSE_LASTPRIVATE and OMP_CLAUSE_LINEAR
save info->new_local_var_chain around walks of the clause gimple
sequences and declare_vars if needed into the sequence.
Jakub Jelinek [Wed, 5 Aug 2020 08:40:10 +0000 (10:40 +0200)]
openmp: Handle reduction clauses on host teams construct [PR96459]
As the new testcase shows, we weren't actually performing reductions on
host teams construct. And fixing that revealed a flaw in the for-14.c testcase.
The problem is that the tests perform also initialization and checking around the
calls to the functions with the OpenMP constructs. In that testcase, all the
tests have been spawned from a teams construct but only the tested loops were
distribute, which means the initialization and checking has been performed
redundantly and racily in each team. Fixed by performing the initialization
and checking outside of host teams and doing only the calls to functions with
the tested constructs inside of host teams.
2020-08-05 Jakub Jelinek <jakub@redhat.com>
PR middle-end/96459
* omp-low.c (lower_omp_taskreg): Call lower_reduction_clauses even
for host teams.
* testsuite/libgomp.c/teams-3.c: New test.
* testsuite/libgomp.c-c++-common/for-2.h (OMPTEAMS): Define to nothing
if not defined yet.
(N(test)): Use it before all N(f*) calls.
* testsuite/libgomp.c-c++-common/for-14.c (DO_PRAGMA, OMPTEAMS): Define.
(main): Don't call all test_* functions from within
#pragma omp teams reduction(|:err), call them directly.
Martin Jambor [Tue, 25 Aug 2020 14:11:56 +0000 (16:11 +0200)]
sra: Bail out when encountering accesses with negative offsets (PR 96730)
I must admit I was quite surprised to see that SRA does not disqualify
an aggregate from any transformations when it encounters an offset for
which get_ref_base_and_extent returns a negative offset. It may not
matter too much because I sure hope such programs always have
undefined behavior (SRA candidates are local variables on stack) but
it is probably better not to perform weird transformations on them, as
build_ref_for_model with the new build_reconstructed_reference function
currently happily does for negative offsets (it just copies the existing
expression, which is then used as the expression of a "propagated"
access), and of course the compiler must not ICE (as it currently does
because the SRA forest verifier does not like the expression).
gcc/ChangeLog:
2020-08-24 Martin Jambor <mjambor@suse.cz>
PR tree-optimization/96730
* tree-sra.c (create_access): Disqualify any aggregate with negative
offset access.
(build_ref_for_model): Add assert that offset is non-negative.
gcc/testsuite/ChangeLog:
2020-08-24 Martin Jambor <mjambor@suse.cz>
PR tree-optimization/96730
* gcc.dg/tree-ssa/pr96730.c: New test.
Jason Merrill [Fri, 21 Aug 2020 20:23:03 +0000 (16:23 -0400)]
c++: Emit as-base 'tor symbols for final class. [PR95428]
For PR70462 I stopped emitting the as-base constructor and destructor
variants for final classes, because they can never be called. Except that
it turns out that clang calls base variants from complete variants, even for
classes with virtual bases, and in some cases inlines them such that the
calls to the base variant are exposed. So we need to continue to emit the
as-base symbols, even though they're unreachable by G++-compiled code.
A runtime error occurs when the type of the value argument is
character(0): "Zero-length string passed as value...".
The status argument, intent(out), will contain -1 if the value
of the environment variable is too large to fit in the value argument.
This is the case if the type is character(0), so there is no reason to
produce a runtime error if the value argument is zero length.
2020-08-24 Mark Eggleston <markeggleston@gcc.gnu.org>
libgfortran/
PR fortran/96486
* intrinsics/env.c: If value_len is > 0 blank the string.
Copy the result only if its length is > 0.
2020-08-24 Mark Eggleston <markeggleston@gcc.gnu.org>
gcc/testsuite/
PR fortran/96486
* gfortran.dg/pr96486.f90: New test.
Jonathan Wakely [Fri, 21 Aug 2020 11:01:05 +0000 (12:01 +0100)]
libstdc++: Skip PSTL tests when installed TBB is too old [PR 96718]
These tests do not actually require TBB, because they only inspect the
feature test macros present in the headers. However, if TBB is installed
then its headers will be included, and the version will be checked. If
the version is too old, compilation fails due to a #error directive.
This change disables the tests if TBB is not present, so that we skip
them instead of failing.
d: Field access in parentheses causes error: need 'this' for 'field' of type 'type'
1. Fixes an ICE in the front-end if a struct symbol were to appear twice
in the compilation unit.
2. Fixes a rejects-valid bug in the front-end where `(symbol)' was being
resolved as a `var' expression, instead of `this.var'.
gcc/d/ChangeLog:
PR d/96250
* dmd/dstruct.c (StructDeclaration::semantic): Error if redefinition
of struct exists in compilation.
* dmd/expressionsem.c (ExpressionSemanticVisitor::visit(TypeExp)):
Rewrite resolved field variables as 'this.var' before semantic.
* dmd/parse.c (Parser::parseUnaryExp): Mark '(type) una_exp' as a
parenthesized expression.
gcc/testsuite/ChangeLog:
PR d/96250
* gdc.test/fail_compilation/fail17492.d: New test.
* gdc.test/compilable/b9490.d: New test.
* gdc.test/compilable/ice14739.d: New test.
* gdc.test/fail_compilation/ice21060.d: New test.
* gdc.test/fail_compilation/imports/ice21060a/package.d: New file.
* gdc.test/fail_compilation/imports/ice21060b/package.d: New file.
* gdc.test/fail_compilation/imports/ice21060c/package.d: New file.
* gdc.test/fail_compilation/imports/ice21060d/package.d: New file.
* gdc.test/runnable/b16278.d: New test.
Iain Buclaw [Thu, 20 Aug 2020 16:18:40 +0000 (18:18 +0200)]
d: Fix ICE in setValue at dmd/dinterpret.c:7046
This was originally seen when running the testsuite for a 16-bit target,
however, it could be reproduced on 32-bit using long[] as well.
gcc/d/ChangeLog:
* dmd/ctfeexpr.c (isCtfeValueValid): Return true for array literals as
well as structs.
* dmd/dinterpret.c: Don't reinterpret static arrays into dynamic.
gcc/testsuite/ChangeLog:
* gdc.test/compilable/interpret3.d: Add test.
* gdc.test/fail_compilation/reg6769.d: New test.
Moves no frame access error to own function, adding use of it for both
when get_framedecl() cannot find a path to the outer function frame, and
guarding get_decl_tree() from recursively calling itself.
gcc/d/ChangeLog:
PR d/96254
* d-codegen.cc (error_no_frame_access): New.
(get_frame_for_symbol): Use fdparent name in error message.
(get_framedecl): Replace call to assert with error.
* d-tree.h (error_no_frame_access): Declare.
* decl.cc (get_decl_tree): Detect recursion and error.
gcc/testsuite/ChangeLog:
PR d/96254
* gdc.dg/pr96254a.d: New test.
* gdc.dg/pr96254b.d: New test.
Change test for CUDA callback context in nvptx_free() from using
GOMP_PLUGIN_acc_thread () into checking for CUDA_ERROR_NOT_PERMITTED,
for the former only works for OpenACC, but not OpenMP offloading.
libgomp/
* plugin/plugin-nvptx.c (nvptx_free):
Change "GOMP_PLUGIN_acc_thread () == NULL" test into check of
CUDA_ERROR_NOT_PERMITTED status for cuMemGetAddressRange. Adjust
comments.
Jonathan Wakely [Wed, 19 Aug 2020 12:41:26 +0000 (13:41 +0100)]
libstdc++: Add deprecated attributes to old iostream members
Back in 2017 I removed these prehistoric members (which were deprecated
since C++98) for C++17 mode. But I didn't add deprecated attributes to
most of them, so users didn't get any warning they would be going away.
Apparently some poor souls do actually use some of these names, and so
now that GCC 11 defaults to -std=gnu++17 some code has stopped
compiling.
This adds deprecated attributes to them, so that C++98/03/11/14 code
will get a warning if it uses them. I'll also backport this to the
release branches so that users can find out about the deprecation before
they start using C++17.
libstdc++-v3/ChangeLog:
* include/bits/c++config (_GLIBCXX_DEPRECATED_SUGGEST): New
macro for "use 'foo' instead" message in deprecated warnings.
* include/bits/ios_base.h (io_state, open_mode, seek_dir)
(streampos, streamoff): Use _GLIBCXX_DEPRECATED_SUGGEST.
* include/std/streambuf (stossc): Replace C++11 attribute
with _GLIBCXX_DEPRECATED_SUGGEST.
* include/std/type_traits (__is_nullptr_t): Use
_GLIBCXX_DEPRECATED_SUGGEST instead of _GLIBCXX_DEPRECATED.
* testsuite/27_io/types/1.cc: Check for deprecated warnings.
Also check for io_state, open_mode and seek_dir typedefs.
Joe Ramsay [Wed, 29 Jul 2020 13:04:28 +0000 (14:04 +0100)]
arm: Enable no-writeback vldr.16/vstr.16.
There was previously no way to specify that a register operand cannot
have any writeback modifiers, and as a result the argument to vldr.16
and vstr.16 could be erroneously output with post-increment. This
change adds a constraint which forbids all writeback, and
selects it in the relevant case for vldr.16 and vstr.16.
gcc/ChangeLog:
PR target/96682
* config/arm/arm-protos.h (arm_coproc_mem_operand_no_writeback):
Declare prototype.
(arm_mve_mode_and_operands_type_check): Declare prototype.
* config/arm/arm.c (arm_coproc_mem_operand): Refactor to use
_arm_coproc_mem_operand.
(arm_coproc_mem_operand_wb): New function to cover full, limited
and no writeback.
(arm_coproc_mem_operand_no_writeback): New constraint for memory
operand with no writeback.
(arm_print_operand): Extend 'E' specifier for memory operand
that does not support writeback.
(arm_mve_mode_and_operands_type_check): New constraint check for
MVE memory operands.
* config/arm/constraints.md: Add Uj constraint for VFP vldr.16
and vstr.16.
* config/arm/vfp.md (*mov_load_vfp_hf16): New pattern for
vldr.16.
(*mov_store_vfp_hf16): New pattern for vstr.16.
(*mov<mode>_vfp_<mode>16): Remove MVE moves.
gcc/testsuite/ChangeLog:
PR target/96682
* gcc.target/arm/mve/intrinsics/mve-vldstr16-no-writeback.c: New test.
Peter Bergner [Tue, 18 Aug 2020 21:16:11 +0000 (16:16 -0500)]
rs6000: Rename instruction xvcvbf16sp to xvcvbf16spn
The xvcvbf16sp mnemonic, which was just added in ISA 3.1, has been renamed
to xvcvbf16spn, to make it consistent with the other non-signaling conversion
instructions which all end with "n". The only use of this instruction is in
an MMA conversion built-in function, so there is little to no compatibility
issue.
Peter Bergner [Thu, 13 Aug 2020 18:40:39 +0000 (13:40 -0500)]
rs6000: ICE when using an MMA type as a function param or return value [PR96506]
PR96506 shows a problem where we ICE on illegal usage, namely using MMA
types for function arguments and return values. The solution is to flag
these illegal usages as errors early, before we ICE.
2020-08-13 Peter Bergner <bergner@linux.ibm.com>
gcc/
PR target/96506
* config/rs6000/rs6000-call.c (rs6000_promote_function_mode): Disallow
MMA types as return values.
(rs6000_function_arg): Disallow MMA types as function arguments.
gcc/testsuite/
PR target/96506
* gcc.target/powerpc/pr96506.c: New test.
Jason Merrill [Thu, 6 Aug 2020 06:40:10 +0000 (02:40 -0400)]
c++: Handle enumerator in C++20 alias CTAD. [PR96199]
To form a deduction guide for an alias template, we substitute the template
arguments from the pattern into the deduction guide for the underlying
class. In the case of B(A1<X>), that produces B(A1<B<T,1>::X>) -> B<T,1>.
But since an enumerator doesn't have its own template info, and B<T,1> is a
dependent scope, trying to look up B<T,1>::X fails and we crash. So we need
to produce a SCOPE_REF instead.
And trying to use the members of the template class is wrong for other
members, as well, as it gives a nonsensical result if the class is
specialized.
gcc/cp/ChangeLog:
PR c++/96199
* pt.c (maybe_dependent_member_ref): New.
(tsubst_copy) [CONST_DECL]: Use it.
[VAR_DECL]: Likewise.
(tsubst_aggr_type): Handle nested type.
gcc/testsuite/ChangeLog:
PR c++/96199
* g++.dg/cpp2a/class-deduction-alias4.C: New test.
liuhongt [Wed, 12 Aug 2020 02:48:17 +0000 (10:48 +0800)]
Don't use pinsr/pextr for struct initialization/extraction.
gcc/
PR target/96562
PR target/93897
* config/i386/i386-expand.c (ix86_expand_pinsr): Don't use
pinsr for TImode.
(ix86_expand_pextr): Don't use pextr for TImode.
gcc/testsuite/
* gcc.target/i386/pr96562-1.c: New test.