Eric Botcazou [Fri, 13 Mar 2020 08:58:44 +0000 (09:58 +0100)]
Fix incorrect filling of delay slots in branchy code at -O2
The issue is that relax_delay_slots can streamline the CFG in some cases,
in particular remove BARRIERs, but removing BARRIERs changes the way the
instructions are associated with (basic) blocks by the liveness analysis
code in resource.c (find_basic_block) and thus can cause entries in the
cache maintained by resource.c to become outdated, thus producing wrong
answers downstream.
The fix is to invalidate the cache entries affected by the removal of
BARRIERs in relax_delay_slots, i.e. for the instructions down to the
next BARRIER.
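As an illustration only (this is not the code in resource.c), the invalidation can be modeled on a toy instruction list. ToyInsn, BlockCache and clear_cache_until_next_barrier are made-up names, and the cache contents are assumed to be whatever an earlier analysis stored per instruction:

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Toy stand-in for an instruction stream entry; not GCC's rtx_insn.
struct ToyInsn {
    int uid;          // unique id, used as the cache key
    bool is_barrier;  // true for a BARRIER-like marker
};

// Hypothetical cache: insn uid -> block number computed by an earlier analysis.
using BlockCache = std::unordered_map<int, int>;

// Drop the cached entries of every instruction from `start` up to, but not
// including, the next barrier (or the end of the stream).
// Returns the number of entries removed.
std::size_t clear_cache_until_next_barrier(const std::vector<ToyInsn>& insns,
                                           std::size_t start,
                                           BlockCache& cache) {
    assert(start <= insns.size());
    std::size_t removed = 0;
    for (std::size_t i = start; i < insns.size() && !insns[i].is_barrier; ++i)
        removed += cache.erase(insns[i].uid);
    return removed;
}
```

Entries past the next barrier are left alone, since the removed barrier cannot have changed the block they belong to.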
PR rtl-optimization/94119
* resource.h (clear_hashed_info_until_next_barrier): Declare.
* resource.c (clear_hashed_info_until_next_barrier): New function.
* reorg.c (add_to_delay_list): Fix formatting.
(relax_delay_slots): Call clear_hashed_info_until_next_barrier on
the next instruction after removing a BARRIER.
PR87560 reports an ICE when a test case is compiled with -mpower9-vector
and -mno-altivec. This patch terminates compilation with an error when
this combination (and other unreasonable ones) is requested.
Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no
regressions. Reported error is now:
f951: Error: '-mno-altivec' turns off '-mpower9-vector'
2020-03-12 Bill Schmidt <wschmidt@linux.ibm.com>
Backport from master
2020-03-02 Bill Schmidt <wschmidt@linux.ibm.com>
PR target/87560
* rs6000-cpus.def (OTHER_ALTIVEC_MASKS): New #define.
* rs6000.c (rs6000_disable_incompatible_switches): Add table entry
for OPTION_MASK_ALTIVEC.
The filesystem::path::operator+= and filesystem::path::concat functions
operate directly on the native format of the path and so can cause a
path to mutate to a completely different type.
For Windows combining a filename "x" with a filename ":" produces a
root-name "x:". Similarly, a Cygwin root-directory "/" combined with a
root-directory and filename "/x" produces a root-name "//x".
Before this patch the implementation didn't support those kinds of
mutations, assuming that concatenating two filenames would always
produce a filename and concatenating with a root-dir would still have a
root-dir.
This patch fixes it simply by checking for the problem cases and
creating a new path by re-parsing the result of the string
concatenation. This is slightly suboptimal because the argument has
already been parsed if it's a path, but more importantly it doesn't
reuse any excess capacity that the path object being modified might
already have allocated.
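The re-parsing idea can be sketched outside the library as follows. This is a hedged illustration, not the code in fs_path.cc; concat_by_reparsing is a hypothetical helper, and it re-parses unconditionally, whereas the patch does so only for the problem cases:

```cpp
#include <cassert>
#include <filesystem>

namespace fs = std::filesystem;

// Concatenate the native strings and parse the result from scratch, so the
// components reflect the combined text rather than the two inputs.
fs::path concat_by_reparsing(const fs::path& lhs, const fs::path& rhs) {
    fs::path::string_type joined = lhs.native();
    joined += rhs.native();
    return fs::path(joined);
}
```

Because the result is a freshly constructed path, any capacity already allocated by the left operand is not reused, which is the cost the message above mentions.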
Backport from mainline
2020-03-09 Jonathan Wakely <jwakely@redhat.com>
PR libstdc++/94063
* src/c++17/fs_path.cc (path::operator+=(const path&)): Add kluge to
handle concatenations that change the type of the first component.
(path::operator+=(basic_string_view<value_type>)): Likewise.
* testsuite/27_io/filesystem/path/concat/94063.cc: New test.
Jonathan Wakely [Thu, 12 Mar 2020 17:39:05 +0000 (17:39 +0000)]
libstdc++: Fix name of macro in #undef directive
The macro that is defined is _GLIBCXX_NOT_FN_CALL_OP but the macro that
was named in the #undef directive was _GLIBCXX_NOT_FN_CALL. This fixes
the #undef.
Backport from mainline
2020-02-04 Jonathan Wakely <jwakely@redhat.com>
* include/std/functional (_GLIBCXX_NOT_FN_CALL_OP): Un-define after
use.
Jonathan Wakely [Thu, 12 Mar 2020 17:39:04 +0000 (17:39 +0000)]
libstdc++: Fix FS-dependent filesystem tests
These tests were failing on XFS because it doesn't support setting file
timestamps past 2038, so the expected overflow when reading back a huge
timestamp into a file_time_type didn't happen.
Additionally, the std::filesystem::file_time_type::clock has an
epoch that is out of range of 32-bit time_t so testing times around that
epoch may also fail.
This fixes the tests to give up gracefully if the filesystem doesn't
support times that can't be represented in 32-bit time_t.
Backport from mainline
2020-02-28 Jonathan Wakely <jwakely@redhat.com>
* testsuite/27_io/filesystem/operations/last_write_time.cc: Fixes for
filesystems that silently truncate timestamps.
* testsuite/experimental/filesystem/operations/last_write_time.cc:
Likewise.
Jonathan Wakely [Thu, 12 Mar 2020 17:39:04 +0000 (17:39 +0000)]
libstdc++: Ensure root-dir converted to forward slash (PR93244)
Backport from mainline
2020-01-13 Jonathan Wakely <jwakely@redhat.com>
PR libstdc++/93244
* include/bits/fs_path.h (path::generic_string<C,A>)
[_GLIBCXX_FILESYSTEM_IS_WINDOWS]: Convert root-dir to forward-slash.
* testsuite/27_io/filesystem/path/generic/generic_string.cc: Check
root-dir is converted to forward slash in generic pathname.
* testsuite/27_io/filesystem/path/generic/utf.cc: New test.
* testsuite/27_io/filesystem/path/generic/wchar_t.cc: New test.
arm: correct constraints on movsi_compare0 [PR91913]
The peephole that detects a mov of one register to another followed by
a comparison of the original register against zero is only used in Arm
state; but the instruction that matches this is generic to all 32-bit
compilation states. That instruction lacks support for SP which is
permitted in Arm state, but has restrictions in Thumb2 code.
This patch fixes the problem by allowing SP when in ARM state for all
registers; in Thumb state it allows SP only as a source when the
register really is copied to another target.
gcc/ChangeLog:
PR target/91913
Backport from master
* config/arm/arm.md (movsi_compare0): Allow SP as a source register
in Thumb state and also as a destination in Arm state. Add T16
variants.
gcc/testsuite/ChangeLog:
2020-02-10 Jakub Jelinek <jakub@redhat.com>
PR target/91913
Backport from master
* gfortran.dg/pr91913.f90: New test.
Backport from mainline
2020-03-09 Martin Liska <mliska@suse.cz>
PR target/93800
* config/rs6000/rs6000.c (rs6000_option_override_internal):
Remove set of str_align_loops and str_align_jumps as these
should be set in previous 2 conditions in the function.
Backport from mainline
2020-03-09 Martin Liska <mliska@suse.cz>
PR target/93800
* gcc.target/powerpc/pr93800.c: New test.
Eric Botcazou [Wed, 11 Mar 2020 09:47:34 +0000 (10:47 +0100)]
Fix internal error on locally-defined subpools
If the type is derived in the current compilation unit, and Allocate
is not overridden on derivation (as is typically the case with
Root_Storage_Pool_With_Subpools), the entity for Allocate of the
derived type is an alias for System.Storage_Pools.Subpools.Allocate.
The main assertion in gnat_to_gnu_entity fails in this case, since
this is not a definition and Is_Public is false (since the entity
is nested in the same compilation unit).
2020-03-11 Richard Wai <richard@annexi-strayline.com>
* gcc-interface/decl.c (gnat_to_gnu_entity): Also test Is_Public on
the Alias of the entity, if it is present, in the main assertion.
Xionghu Luo [Tue, 10 Mar 2020 01:25:20 +0000 (20:25 -0500)]
Backport to gcc-9: PR92398: Fix testcase failure of pr72804.c
Backport the patch to fix failures on P9 and P8BE, P7LE for PR94036.
Tested; passes on P9/P8/P7.
(gcc-8 is not needed as the test doesn't exist.)
P9LE generated instruction is not worse than P8LE.
mtvsrdd;xxlnot;stxv vs. not;not;std;std.
It can have longer latency, but latency via memory is not so critical,
and this does save decode and other resources. It's hard to choose
which is best. Update the test case to fix failures.
gcc/testsuite/ChangeLog:
2020-03-10 Luo Xiong Hu <luoxhu@linux.ibm.com>
backport from master.
PR testsuite/94036
2019-12-02 Luo Xiong Hu <luoxhu@linux.ibm.com>
PR testsuite/92398
* gcc.target/powerpc/pr72804.c: Split the store function to...
* gcc.target/powerpc/pr92398.h: ... this one. New.
* gcc.target/powerpc/pr92398.p9+.c: New.
* gcc.target/powerpc/pr92398.p9-.c: New.
* lib/target-supports.exp (check_effective_target_p8): New.
(check_effective_target_p9+): New.
Jonathan Wakely [Fri, 6 Mar 2020 12:52:51 +0000 (12:52 +0000)]
libstdc++: Fix call to __glibcxx_rwlock_init (PR 94069)
When the target doesn't define PTHREAD_RWLOCK_INITIALIZER we use a
wrapper around pthread_rwlock_init, but the wrapper only takes one
argument and we try to call it with two.
This went unnoticed on most targets because they do define the
PTHREAD_RWLOCK_INITIALIZER macro, but it causes a bootstrap failure on
darwin8.
Backport from mainline
2020-03-06 Jonathan Wakely <jwakely@redhat.com>
PR libstdc++/94069
* include/std/shared_mutex [!PTHREAD_RWLOCK_INITIALIZER]
(__shared_mutex_pthread::__shared_mutex_pthread()): Remove incorrect
second argument to __glibcxx_rwlock_init.
* testsuite/30_threads/shared_timed_mutex/94069.cc: New test.
Jakub Jelinek [Thu, 5 Mar 2020 18:44:42 +0000 (19:44 +0100)]
i386: Fix some -O0 avx2intrin.h and xopintrin.h intrinsic macros [PR94046]
As the testcases show, the macros we have for -O0 for intrinsics that require
constant argument(s) should first cast the argument to the type the -O1+
inline uses and afterwards to whatever type e.g. a builtin needs.
The PR reported one which violated this, and I've grepped for all double-casts
and grepped out from that meaningful casts where the __m{128,256,512}{,d,i}
first cast is cast to same sized __v* type and has the same kind of element
type (float, double, integral). These 7 macros were using different casts,
and I've double checked them against the inline function types.
2020-03-05 Jakub Jelinek <jakub@redhat.com>
PR target/94046
* config/i386/avx2intrin.h (_mm_mask_i32gather_ps): Fix first cast of
SRC and MASK arguments to __m128 from __m128d.
(_mm256_mask_i32gather_ps): Fix first cast of MASK argument to __m256
from __m256d.
(_mm_mask_i64gather_ps): Fix first cast of MASK argument to __m128
from __m128d.
* config/i386/xopintrin.h (_mm_permute2_pd): Fix first cast of C
argument to __m128i from __m128d.
(_mm256_permute2_pd): Fix first cast of C argument to __m256i from
__m256d.
(_mm_permute2_ps): Fix first cast of C argument to __m128i from __m128.
(_mm256_permute2_ps): Fix first cast of C argument to __m256i from
__m256.
* g++.target/i386/pr94046-1.C: New test.
* g++.target/i386/pr94046-2.C: New test.
Jonathan Wakely [Thu, 5 Mar 2020 17:32:58 +0000 (17:32 +0000)]
libstdc++: Fix some warnings in filesystem tests
There's a -Wunused-but-set-variable warning in operations/all.cc which
can be fixed with [[maybe_unused]].
The statements in operations/copy.cc give -Wunused-value warnings. I
think I meant to use |= rather than !=.
And operations/file_size.cc gets -Wsign-compare warnings.
Backport from mainline
2020-03-05 Jonathan Wakely <jwakely@redhat.com>
* testsuite/27_io/filesystem/operations/all.cc: Mark unused variable.
* testsuite/27_io/filesystem/operations/copy.cc: Fix typo.
* testsuite/experimental/filesystem/operations/copy.cc: Likewise.
* testsuite/27_io/filesystem/operations/file_size.cc: Use correct type
for return value, and in comparison.
* testsuite/experimental/filesystem/operations/file_size.cc: Likewise.
Jonathan Wakely [Thu, 5 Mar 2020 16:52:19 +0000 (16:52 +0000)]
libstdc++: make negative count safe with std::for_each_n
The Library Working Group have approved a change to std::for_each_n that
requires it to handle negative N gracefully, which we were not doing for
random access iterators.
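A minimal sketch of the guard for the random access case, assuming the count has already been converted to std::ptrdiff_t. for_each_n_random_access is a hypothetical name and this is not the code in stl_algo.h:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Without the early return, a negative n would form `first + n` and walk a
// range whose end lies before its beginning.
template <class RandomIt, class Function>
RandomIt for_each_n_random_access(RandomIt first, std::ptrdiff_t n, Function f) {
    if (n <= 0)
        return first;  // negative or zero count: nothing to do
    RandomIt last = first + n;
    for (; first != last; ++first)
        f(*first);
    return last;
}
```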
Backport from mainline
2019-11-07 Jonathan Wakely <jwakely@redhat.com>
* include/bits/stl_algo.h (for_each_n): Handle negative count.
* testsuite/25_algorithms/for_each/for_each_n_debug.cc: New test.
Richard Earnshaw [Thu, 18 Jul 2019 13:56:52 +0000 (13:56 +0000)]
arm: Fix incorrect modes with 'borrow' operations [PR90311]
Looking through the arm backend I noticed that the modes used to pass
comparison types into subtract-with-carry operations were being
incorrectly set. The result is that the compiler is not truly
self-consistent. To clean this up I've introduced a new predicate,
arm_borrow_operation (borrowed from the AArch64 backend) which can
match the comparison type with the required mode and then fixed all
the patterns to use this. The split patterns that were generating
incorrect modes have all obviously been fixed as well.
The basic rule for the use of a borrow is:
- if the condition code was set by a 'subtract-like' operation (subs, cmp),
then use CCmode and LTU.
- if the condition code was set by unsigned overflow of addition (adds), then
use CC_Cmode and GEU.
gcc:
PR target/90311
Backport from master
* config/arm/predicates.md (arm_borrow_operation): New predicate.
* config/arm/arm.md (subdi3_compare1): Use CCmode for the split.
(arm_subdi3, subdi_di_zesidi, subdi_di_sesidi): Likewise.
(subdi_zesidi_zesidi): Likewise.
(negdi2_compare, negdi2_insn): Likewise.
(negdi_extensidi): Likewise.
(negdi_zero_extendsidi): Likewise.
(arm_cmpdi_insn): Likewise.
(subsi3_carryin): Use arm_borrow_operation.
(subsi3_carryin_const): Likewise.
(subsi3_carryin_const0): Likewise.
(subsi3_carryin_compare): Likewise.
(subsi3_carryin_compare_const): Likewise.
(subsi3_carryin_compare_const0): Likewise.
(subsi3_carryin_shift): Likewise.
(rsbsi3_carryin_shift): Likewise.
(negsi2_carryin_compare): Likewise.
gcc/testsuite:
2020-03-05 Jakub Jelinek <jakub@redhat.com>
Backport from master
PR target/90311
* gcc.c-torture/execute/pr90311.c: New test.
Uros Bizjak [Thu, 5 Mar 2020 16:53:03 +0000 (17:53 +0100)]
testsuite: Compile asan_test.C with -Wno-alloc-size-larger-than
asan_test.cc tries to allocate 0xf0000000 bytes for 32bit targets in
a disabled DISABLED_DemoOOM test. Since the testcase is compiled with
-Werror, the compilation fails with:
Paul Thomas [Thu, 5 Mar 2020 10:01:59 +0000 (11:01 +0100)]
Fix ICE in trans_associate_var
2020-03-05 Paul Thomas <pault@gcc.gnu.org>
Backport from trunk
PR fortran/92976
* match.c (select_type_set_tmp): Variable 'selector' to replace
select_type_stack->selector. If the selector array spec has
explicit bounds, make the temporary's bounds deferred.
2020-03-05 Paul Thomas <pault@gcc.gnu.org>
Backport from trunk
PR fortran/92976
* gfortran.dg/select_type_48.f90: New test.
Marek Polacek [Wed, 4 Mar 2020 23:49:37 +0000 (18:49 -0500)]
c++: Fix mismatch in template argument deduction [PR90505]
2020-03-03 Jason Merrill <jason@redhat.com>
Marek Polacek <polacek@redhat.com>
PR c++/90505 - mismatch in template argument deduction.
* pt.c (tsubst): Don't reduce the template level of template
parameters when tf_partial.
* g++.dg/template/deduce4.C: New test.
* g++.dg/template/deduce5.C: New test.
* g++.dg/template/deduce6.C: New test.
* g++.dg/template/deduce7.C: New test.
Jakub Jelinek [Thu, 27 Feb 2020 08:38:12 +0000 (09:38 +0100)]
maintainer-scripts: Speed up git clone in gcc_release
When doing the 8.4-rc1, I've noticed (probably also because of the dying
disk on sourceware) that git clone is extremely slow, and furthermore when
all of us have some local snapshots, it is a waste of resources to download
everything again. Especially for the -f runs when we'll need to wait until
git tag -s asks us for a gpg password interactively.
The following patch adds an option through which one can point the script
at a local gcc .git directory from which it can --dissociate --reference ...
during cloning to speed it up.
2020-02-27 Jakub Jelinek <jakub@redhat.com>
* gcc_release: Add support for -b local-git-repo argument.
Jakub Jelinek [Tue, 3 Mar 2020 09:42:34 +0000 (10:42 +0100)]
explow: Fix ICE caused by plus_constant [PR94002]
The following testcase ICEs in cross to riscv64-linux. The problem is
that we have a DImode integral constant (that doesn't fit into SImode),
which is pushed into a constant pool, and later we access just the first half of
it using a MEM. When plus_constant is called on such a MEM, if the constant
has a mode, we verify the mode, but if it doesn't, we don't, and ICE later on
when we think the CONST_INT is a valid SImode constant.
2020-03-03 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/94002
* explow.c (plus_constant): Punt if cst has VOIDmode and
get_pool_mode is different from mode.
Jason Merrill [Mon, 2 Mar 2020 19:42:47 +0000 (14:42 -0500)]
c++: Fix ({ ... }) array mem-initializer.
Here, we were going down the wrong path in perform_member_init because of
the incorrect parens around the mem-initializer for the array. And then
cxx_eval_vec_init_1 didn't know what to do with a CONSTRUCTOR as the
initializer. For GCC 9, let's just fix the latter issue.
gcc/cp/ChangeLog
2020-03-02 Jason Merrill <jason@redhat.com>
Jason Merrill [Mon, 2 Mar 2020 19:42:47 +0000 (14:42 -0500)]
c++: Allow parm of empty class type in constexpr.
Since copying a class object is defined in terms of the copy constructor,
copying an empty class is OK even if it would otherwise not be usable in a
constant expression. Relatedly, using a parameter as an lvalue is no more
problematic than a local variable, and calling a member function uses the
object as an lvalue.
gcc/cp/ChangeLog
2020-03-02 Jason Merrill <jason@redhat.com>
PR c++/91953
* constexpr.c (potential_constant_expression_1) [PARM_DECL]: Allow
empty class type.
[COMPONENT_REF]: A member function reference doesn't use the object
as an rvalue.
Jason Merrill [Mon, 2 Mar 2020 19:42:47 +0000 (14:42 -0500)]
c++: Fix cast to pointer to VLA.
The C front-end fixed this issue in r257620 by adding a DECL_EXPR from
grokdeclarator. We don't have an easy way to do that in the C++ front-end,
but it works fine to create and prepend a DECL_EXPR when we are genericizing
the NOP_EXPR for the cast.
The C patch wraps the DECL_EXPR in a BIND_EXPR, but that seems unnecessary
in C++; this is just a hook to run gimplify_type_sizes, we aren't actually
declaring anything that we need to worry about scoping for.
gcc/cp/ChangeLog
2020-03-02 Jason Merrill <jason@redhat.com>
PR c++/88256
* cp-gimplify.c (predeclare_vla): New.
(cp_genericize_r) [NOP_EXPR]: Call it.
Jason Merrill [Mon, 2 Mar 2020 19:42:47 +0000 (14:42 -0500)]
checking: avoid verify_type_variant crash on incomplete type.
Here, we end up calling gen_type_die_with_usage for a type that's in the
middle of finish_struct_1, after we set TYPE_NEEDS_CONSTRUCTING on it but
before we copy all the flags to the variants--and, significantly, before we
set its TYPE_SIZE. It seems reasonable to only look at
TYPE_NEEDS_CONSTRUCTING on complete types, since we aren't going to try to
create an object of an incomplete type any other way.
gcc/ChangeLog
2020-03-02 Jason Merrill <jason@redhat.com>
PR c++/92601
* tree.c (verify_type_variant): Only verify TYPE_NEEDS_CONSTRUCTING
of complete types.
Jason Merrill [Mon, 2 Mar 2020 19:42:47 +0000 (14:42 -0500)]
c++: Fix return deduction of lambda in discarded stmt.
A return statement in a discarded statement is not used for return type
deduction, but we still want to do deduction for a return statement in a
lambda in a discarded statement.
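A small C++17 example of the two rules, written for this note rather than taken from the testsuite. The names outer and lam are arbitrary:

```cpp
#include <cassert>
#include <type_traits>

// Non-template function, so the discarded branch is still fully checked.
auto outer() {
    if constexpr (false) {
        // The lambda's own return type must still be deduced (long here),
        // even though the enclosing statement is discarded.
        auto lam = [] { return 42L; };
        return lam();  // discarded return: ignored when deducing outer()
    }
    return 7;  // the only non-discarded return, so outer() returns int
}

static_assert(std::is_same_v<decltype(outer()), int>,
              "the discarded return must not affect deduction");
```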
gcc/cp/ChangeLog
2020-03-02 Jason Merrill <jason@redhat.com>
Jason Merrill [Mon, 2 Mar 2020 19:42:47 +0000 (14:42 -0500)]
PR c++/90732 - ICE with VLA capture and generic lambda.
We were failing to handle VLA capture in tsubst_lambda_expr; initially
building a DECLTYPE_TYPE for the capture and then tsubsting it doesn't give
the special VLA handling. So with this patch we call add_capture again for
VLAs.
gcc/cp/ChangeLog
2020-03-02 Jason Merrill <jason@redhat.com>
PR c++/90732 - ICE with VLA capture and generic lambda.
* pt.c (tsubst_lambda_expr): Repeat add_capture for VLAs.
Jason Merrill [Mon, 2 Mar 2020 19:42:47 +0000 (14:42 -0500)]
c++: Fix attributes with lambda and trailing return type.
My fix for 60503 fixed handling of C++11 attributes following the
lambda-declarator. My patch for 89640 re-added support for GNU attributes,
but attributes after the trailing return type were parsed as applying to the
return type rather than to the function. This patch adjusts parsing of a
trailing-return-type to ignore GNU attributes at the end of the declaration
so that they will be applied to the declaration as a whole.
I also considered parsing the attributes between the closing paren and the
trailing-return-type, and tried a variety of approaches to implementing
that, but I think it's better to stick with the documented rule that "An
attribute specifier list may appear immediately before the comma, '=' or
semicolon terminating the declaration of an identifier...." Anyone
disagree?
Meanwhile, C++ committee discussion about the lack of any way to apply
attributes to a lambda op() seems to have concluded that they should go
between the introducer and declarator, so I've implemented that as well.
gcc/cp/ChangeLog
2020-03-02 Jason Merrill <jason@redhat.com>
PR c++/90333
PR c++/89640
PR c++/60503
* parser.c (cp_parser_type_specifier_seq): Don't parse attributes in
a trailing return type.
(cp_parser_lambda_declarator_opt): Parse C++11 attributes before
parens.
Jakub Jelinek [Thu, 27 Feb 2020 09:45:30 +0000 (10:45 +0100)]
gimplify: Don't optimize register const vars to static [PR93949]
The following testcase is rejected, while it was accepted in 3.4 and earlier
(before tree-ssa merge).
The problem is that we decide to promote the const variable to TREE_STATIC,
but TREE_STATIC DECL_REGISTER VAR_DECLs may only be the global register vars
and so assemble_variable/make_decl_rtl diagnoses it.
Either we do what the following patch does, where we could consider
register as a hint the user doesn't want such optimization, because if
something is forced static, it is not "register" anymore and register static
is not valid in C either, or we could clear DECL_REGISTER instead, but would
still need to punt at least on DECL_HARD_REGISTER cases.
Jakub Jelinek [Thu, 27 Feb 2020 10:21:52 +0000 (11:21 +0100)]
sccvn: Punt on ref->size not multiple of 8 for memset (, 123, ) in 9.x [PR93945]
And here is the corresponding 9.x change, where the patch just punts if
ref->size is not whole bytes, like we already punt if offseti is not byte
aligned.
2020-02-27 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/93945
* tree-ssa-sccvn.c (vn_reference_lookup_3): For memset with non-zero
second operand, require ref->size to be a multiple of BITS_PER_UNIT.
Carl Love [Thu, 27 Feb 2020 00:22:46 +0000 (18:22 -0600)]
PPC64, fix documentation for __builtin_crypto_vpmsum* builtin functions.
PR target/91276 - Doc typos in __builtin_crypto_vpmsum*
gcc/ChangeLog:
2020-02-26 Carl Love <cel@us.ibm.com>
PR target/91276
* doc/extend.texi (PowerPC AltiVec Built-in Functions available on
ISA 3.0): The built-in function name __builtin_crypto_vpmsumb is only
for the vector unsigned char arguments. It is also listed as the
name of the built-in for the vector unsigned short,
vector unsigned int and vector unsigned long long arguments. The
names of the built-ins for these arguments should be
__builtin_crypto_vpmsumh, __builtin_crypto_vpmsumw and
__builtin_crypto_vpmsumd respectively.
Marek Polacek [Fri, 20 Dec 2019 23:30:04 +0000 (23:30 +0000)]
PR c++/92745 - bogus error when initializing array of vectors.
In r268428 I changed reshape_init_r in such a way that when it sees
a nested { } in a CONSTRUCTOR with missing braces, it just returns
the initializer:
+ else if (COMPOUND_LITERAL_P (stripped_init)
...
+ ++d->cur;
+ gcc_assert (!BRACE_ENCLOSED_INITIALIZER_P (stripped_init));
+ return init;
But as this test shows, that's incorrect: if TYPE is an array, we need
to proceed to reshape_init_array_1 which will iterate over the array
initializers:
  /* Loop until there are no more initializers.  */
  for (index = 0;
       d->cur != d->end && (!sized_array_p || index <= max_index_cst);
       ++index)
    {
and update d.cur accordingly. In other words, when reshape_init gets
we recurse on the first element:
{col[0][0], col[1][0], col[2][0], col[3][0]}
and we can't just move d.cur to point to
{col[0][1], col[1][1], col[2][1], col[3][1]}
and return; we need to iterate, so that d.cur ends up being properly
updated, and after all initializers have been seen, points to d.end.
Currently we skip the loop, wherefore we hit this:
  /* Make sure all the element of the constructor were used.  Otherwise,
     issue an error about exceeding initializers.  */
  if (d.cur != d.end)
    {
      if (complain & tf_error)
        error ("too many initializers for %qT", type);
      return error_mark_node;
    }
gcc/cp/ChangeLog
2019-12-19 Marek Polacek <polacek@redhat.com>
PR c++/92745 - bogus error when initializing array of vectors.
* decl.c (reshape_init_r): For a nested compound literal, do
call reshape_init_{class,array,vector}.
gcc/testsuite/ChangeLog
2019-12-19 Marek Polacek <polacek@redhat.com>
Jakub Jelinek <jakub@redhat.com>
PR c++/92745 - bogus error when initializing array of vectors.
* g++.dg/cpp0x/initlist118.C: New test.
* g++.dg/cpp0x/initlist118.C: Add -Wno-psabi -w to dg-options.
Jason Merrill [Wed, 26 Feb 2020 18:03:23 +0000 (13:03 -0500)]
cgraph: A COMDAT decl always has non-zero address.
We should be able to assume that a template instantiation or other COMDAT
has non-zero address even if MAKE_DECL_ONE_ONLY for the target sets
DECL_WEAK and we haven't yet decided to emit a definition in this
translation unit.
gcc/ChangeLog
2020-02-26 Jason Merrill <jason@redhat.com>
PR c++/92003
* symtab.c (symtab_node::nonzero_address): A DECL_COMDAT decl has
non-zero address even if weak and not yet defined.
Jason Merrill [Wed, 26 Feb 2020 18:03:23 +0000 (13:03 -0500)]
c++: Fix constexpr vs. omitted aggregate init.
Value-initialization is importantly different from {}-initialization for
this testcase, where the former calls the deleted S constructor and the
latter initializes S happily.
gcc/cp/ChangeLog
2020-02-26 Jason Merrill <jason@redhat.com>
PR c++/90951
* constexpr.c (cxx_eval_array_reference): {}-initialize missing
elements instead of value-initializing them.
Jason Merrill [Wed, 26 Feb 2020 18:03:23 +0000 (13:03 -0500)]
c++: Fix decltype of empty pack expansion of parm.
In unevaluated context, we only substitute a single PARM_DECL, not the
entire chain, but the handling of an empty pack expansion was missing that
check.
gcc/cp/ChangeLog
2020-02-26 Jason Merrill <jason@redhat.com>
PR c++/93140
* pt.c (tsubst_decl) [PARM_DECL]: Check cp_unevaluated_operand in
handling of TREE_CHAIN for empty pack.
Jonathan Wakely [Wed, 26 Feb 2020 16:31:19 +0000 (16:31 +0000)]
libstdc++: Fix undefined behaviour in random dist serialization (PR93205)
The deserialization functions for random number distributions fail to
check the stream state before using the extracted values. In some cases
this leads to using indeterminate values to resize a vector, and then
filling that vector with indeterminate values.
No values that affect control flow should be used without checking that a
good value was read from the stream.
Additionally, where reasonable to do so, defer modifying any state in
the distribution until all values have been successfully read, to avoid
modifying some of the distribution's parameters and leaving others
unchanged.
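Both rules can be sketched with a toy parameter set. ToyParams and read_params are hypothetical names, and the "count followed by values" wire format is an assumption made for the example, not the format used by the library:

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <utility>
#include <vector>

// Hypothetical parameter set standing in for a distribution's state.
struct ToyParams {
    std::vector<double> weights;
};

// Read "<count> w1 w2 ..." and commit only if every extraction succeeded.
std::istream& read_params(std::istream& is, ToyParams& out) {
    std::size_t n = 0;
    if (!(is >> n))  // never let an unread value control the loop below
        return is;
    std::vector<double> tmp;
    for (std::size_t i = 0; i < n; ++i) {
        double w = 0.0;
        if (!(is >> w))  // failbit is set; `out` has not been touched
            return is;
        tmp.push_back(w);
    }
    out.weights = std::move(tmp);  // all values read: now modify the state
    return is;
}
```

Reading into a temporary first means a failed extraction leaves the caller's object exactly as it was, rather than half updated.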
Backport from mainline
2020-01-09 Jonathan Wakely <jwakely@redhat.com>
PR libstdc++/93205
* include/bits/random.h (operator>>): Check stream operation succeeds.
* include/bits/random.tcc (operator>>): Likewise.
(__extract_params): New function to fill a vector from a stream.
* testsuite/26_numerics/random/pr60037-neg.cc: Adjust dg-error line.
Matheus Castanho [Thu, 13 Feb 2020 23:43:39 +0000 (23:43 +0000)]
rs6000: fixinc: Skip machine_name fix for powerpc*-*-linux*
Some system headers can be broken by the machine_name fix performed
by GCC during the fixincludes step. According to the comment in
fixincludes/fixinc.h:130 :
On some platforms, machine_name doesn't work properly and
breaks some of the header files. Since everything works
properly without it, just wipe the macro list to
disable the fix.
So we can just skip it to avoid trouble.
Backport from trunk
2020-02-13 Matheus Castanho <msc@linux.ibm.com>
fixincludes/
* fixinc.in: Skip machine_name fix on powerpc*-*-linux*.
Jonathan Wakely [Wed, 26 Feb 2020 15:32:34 +0000 (15:32 +0000)]
libstdc++: Replace glibc-specific check for clock_gettime (PR 93325)
It's wrong to assume that clock_gettime is unavailable on any *-*-linux*
target that doesn't have glibc 2.17 or later. Use a generic test instead
of using __GLIBC_PREREQ. Only do that test when is_hosted=yes so that we
don't get an error for cross targets without a working linker.
This ensures that C library's clock_gettime will be used on non-glibc
targets, instead of an incorrect syscall to SYS_clock_gettime.
Backport from mainline
2020-01-28 Jonathan Wakely <jwakely@redhat.com>
PR libstdc++/93325
* acinclude.m4 (GLIBCXX_ENABLE_LIBSTDCXX_TIME): Use AC_SEARCH_LIBS for
clock_gettime instead of explicit glibc version check.
* configure: Regenerate.
Jonathan Wakely [Wed, 26 Feb 2020 15:04:53 +0000 (15:04 +0000)]
libstdc++: Fix regressions in unique_ptr::swap (PR 93562)
The requirements for this function are only that the deleter is
swappable, but we incorrectly require that the element type is complete
and that the deleter can be swapped using std::swap (which requires it
to be move constructible and move assignable).
The fix is to add __uniq_ptr_impl::swap which swaps the pointer and
deleter individually, instead of using the generic std::swap on the
tuple containing them.
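The member-wise approach can be sketched with a toy pair. ToyPtrImpl and SwapOnlyDeleter are made-up names for illustration, not libstdc++'s __uniq_ptr_impl or any real deleter:

```cpp
#include <cassert>
#include <utility>

// Toy stand-in for the pointer/deleter pair.
template <class T, class Deleter>
struct ToyPtrImpl {
    T* ptr;
    Deleter del;

    // Swap the two members one by one. The unqualified call lets ADL find a
    // deleter-specific swap, so the deleter only has to be swappable.
    void swap(ToyPtrImpl& rhs) noexcept {
        using std::swap;
        swap(ptr, rhs.ptr);
        swap(del, rhs.del);
    }
};

// A deleter that is swappable but not assignable, so the generic std::swap
// template could not be used on it.
struct SwapOnlyDeleter {
    int id;
    explicit SwapOnlyDeleter(int i) : id(i) {}
    SwapOnlyDeleter(const SwapOnlyDeleter&) = default;
    SwapOnlyDeleter& operator=(const SwapOnlyDeleter&) = delete;
    void operator()(int* p) const { delete p; }
    friend void swap(SwapOnlyDeleter& a, SwapOnlyDeleter& b) noexcept {
        std::swap(a.id, b.id);
    }
};
```

Note that T is never required to be complete here: only a T* is swapped.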
PR libstdc++/93562
* include/bits/unique_ptr.h (__uniq_ptr_impl::swap): Define.
(unique_ptr::swap, unique_ptr<T[], D>::swap): Call it.
* testsuite/20_util/unique_ptr/modifiers/93562.cc: New test.
Jonathan Wakely [Wed, 26 Feb 2020 14:20:55 +0000 (14:20 +0000)]
libstdc++: Fix freestanding build (PR 92376)
In a freestanding library we don't install the <pstl/pstl_config.h>
header, so don't try to include it unless it exists.
Explicitly declare aligned alloc functions for freestanding, because
<cstdlib> doesn't declare them.
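The "include it only if it exists" pattern looks roughly like this. The header name toy_optional_config.h and the TOY_HAVE_OPTIONAL_CONFIG macro are placeholders chosen for the example; the real change in c++config may be written differently:

```cpp
#include <cassert>

// __has_include is standard since C++17; GCC and Clang also accept it in
// earlier modes. The nested #if keeps older preprocessors from choking on it.
#if defined(__has_include)
#  if __has_include(<toy_optional_config.h>)
#    include <toy_optional_config.h>
#    define TOY_HAVE_OPTIONAL_CONFIG 1
#  endif
#endif
#ifndef TOY_HAVE_OPTIONAL_CONFIG
#  define TOY_HAVE_OPTIONAL_CONFIG 0
#endif

// True only when the optional header was found and included.
constexpr bool toy_has_optional_config = TOY_HAVE_OPTIONAL_CONFIG;
```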
Backport from mainline
2020-01-17 Jonathan Wakely <jwakely@redhat.com>
PR libstdc++/92376
* include/bits/c++config: Only do PSTL config when the header is
present, to fix freestanding.
* libsupc++/new_opa.cc [!_GLIBCXX_HOSTED]: Declare allocation
functions if they were detected by configure.
Jonathan Wakely [Wed, 26 Feb 2020 14:00:07 +0000 (14:00 +0000)]
PR libstdc++/78552 only construct std::locale for C locale once
Backport from mainline
2019-10-09 Jonathan Wakely <jwakely@redhat.com>
PR libstdc++/78552
* src/c++98/locale_init.cc (locale::classic()): Do not construct a new
locale object for every call.
(locale::_S_initialize_once()): Construct C locale here.
Jiufu Guo [Mon, 17 Feb 2020 02:48:39 +0000 (10:48 +0800)]
rs6000: mark clobber for registers changed by untyped_call
As PR93047 said, __builtin_apply/__builtin_return does not work well with
-frename-registers. This is because the return register (e.g. r3) is used to
rename another register before the return register is stored to the stack.
This patch fixes the issue by emitting clobbers for those registers which
may be changed by the untyped call.
The following testcase is miscompiled in 8+.
The problem is that check_no_overlap has a special case for INTEGER_CST
marked stores (i.e. stores of constants), if both all currently merged stores
and the one under consideration for merging with them are marked that way,
it anticipates that other INTEGER_CST marked stores that overlap with those
and precede those (have smaller info->order) could be merged with those and
doesn't punt for them.
In PR86844 and PR87859 fixes I've then added quite large code that is
performed after check_no_overlap and tries to find out if we need and can
merge further INTEGER_CST marked stores, or need to punt.
Unfortunately, that code is there only in the overlapping case code and
the testcase below shows that we really need it even in the adjacent store
case. After sort_by_bitpos we have:
bitpos  width  order  rhs_code
    96     32      3  INTEGER_CST
   128     32      1  INTEGER_CST
   128    128      2  INTEGER_CST
   192     32      0  MEM_REF
Because of the missing PR86844/PR87859-ish code in the adjacent store
case, we merge the adjacent (memory wise) stores 96/32/3 and 128/32/1,
and then we consider the 128-bit store which is in program-order in between
them, but in this case we punt, because the merging would extend the
merged store region from bitpos 96 and 64-bits to bitpos 96 and 160-bits
and that has an overlap with an incompatible store (the MEM_REF one).
The problem is that we can't really punt this way, because the 128-bit
store is in between those two we've merged already, so either we manage
to merge even that one together with the others, or would need to avoid
already merging the 96/32/3 and 128/32/1 stores together.
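The region arithmetic above can be checked with a small C sketch. It is only a model of the bit ranges in the table, not the code in gimple-ssa-store-merging.c; the helper names merged_width and ranges_overlap are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical helper: width in bits of the region that starts at START,
   currently spans WIDTH bits, and is extended to also cover a store at
   BITPOS of size SIZE bits.  */
static unsigned
merged_width (unsigned start, unsigned width, unsigned bitpos, unsigned size)
{
  unsigned end = start + width;
  unsigned store_end = bitpos + size;
  return (store_end > end ? store_end : end) - start;
}

/* Hypothetical helper: true if the half-open bit ranges
   [a, a + aw) and [b, b + bw) share at least one bit.  */
static bool
ranges_overlap (unsigned a, unsigned aw, unsigned b, unsigned bw)
{
  return a < b + bw && b < a + aw;
}
```

Merging 96/32 with 128/32 gives a 64-bit region at bitpos 96, which does not touch the MEM_REF store at 192/32. Adding the 128/128 store grows the region to 160 bits, and that range does overlap the MEM_REF store.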
Now, rather than copying around the PR86844/PR87859 code to the other spot,
we can actually just use the overlapping code, merge_overlapping is really
a superset of merge_into, so that is what the patch does. If doing
adjacent store merge for rhs_code other than INTEGER_CST, I believe the
current code is already fine, check_no_overlap in that case doesn't make
the exception and will punt if there is some earlier (smaller order)
non-mergeable overlapping store. There is just one case that could be
problematic, if the merged_store has BIT_INSERT_EXPRs in them and the
new store is a constant store (INTEGER_CST rhs_code), then check_no_overlap
would do the exception and still would allow the special case. But we
really shouldn't have the special case in that case, so this patch also
changes check_no_overlap to just have a bool whether we should have the
special case or not.
Note, as I said in the PR, for GCC11 we could consider performing some kind
of cheap DSE during the store merging (perhaps guarded with flag_tree_dse).
Another thing to consider is to treat as problematic only non-mergeable
stores that not only have order smaller than last_order as currently, but
also have order larger than first_order, as in this testcase if we actually
ignored (not merged with anything at all) the 192/32/0 store, because it is
not in between the other stores we'd merge, it would be fine to merge the
other 3 stores, though of course the testcase can be easily adjusted by
putting the 192/32 store after the 128/32 store and then this patch would be
still needed. Though, I think I'd need more time thinking this over.
2020-02-26 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/93820
* gimple-ssa-store-merging.c (check_no_overlap): Change RHS_CODE
argument to ALL_INTEGER_CST_P boolean.
(imm_store_chain_info::try_coalesce_bswap): Adjust caller.
(imm_store_chain_info::coalesce_immediate_stores): Likewise. Handle
adjacent INTEGER_CST store into merged_store->only_constants like
overlapping one.
Christophe Lyon [Wed, 19 Feb 2020 20:55:23 +0000 (20:55 +0000)]
ARM: Fix -mpure-code for v6m
When running the testsuite with -fdisable-rtl-fwprop2 and -mpure-code
for cortex-m0, I noticed that some testcases were failing because we
still generate "ldr rX, .LCY", which is what we want to avoid with
-mpure-code. This is latent since a recent improvement in fwprop
(PR88833).
In this patch I change the thumb1_movsi_insn pattern so that it emits
the desired instruction sequence when arm_disable_literal_pool is set.
To achieve that, I introduce a new required_for_purecode attribute to
enable the corresponding alternative in thumb1_movsi_insn and take the
actual instruction sequence length into account.
Backport from mainline
2020-02-25 Christophe Lyon <christophe.lyon@linaro.org>
* config/arm/arm.md (required_for_purecode): New attribute.
(enabled): Handle required_for_purecode.
* config/arm/thumb1.md (thumb1_movsi_insn): Add alternative to
work with -mpure-code.
Christophe Lyon [Tue, 17 Dec 2019 15:43:07 +0000 (15:43 +0000)]
ARM: Add support for -mpure-code in thumb-1 (v6m)
This patch extends support for -mpure-code to all thumb-1 processors,
by removing the need for MOVT.
Symbol addresses are built using upper8_15, upper0_7, lower8_15 and
lower0_7 relocations, and constants are built using sequences of
movs/adds and lsls instructions.
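As a rough illustration of the movs/adds/lsls approach, the C sketch below models how a 32-bit value can be assembled one byte at a time without MOVT or a literal pool. The function name build_imm32 is hypothetical, and the mapping of the relocation names to byte positions (upper8_15 for bits 31..24 down to lower0_7 for bits 7..0) is an assumption inferred from the names, not taken from the patch.

```c
#include <stdint.h>

/* Hypothetical model of the Thumb-1 sequence: load the top byte,
   then repeatedly shift left by 8 and add the next byte.  */
static uint32_t
build_imm32 (uint32_t value)
{
  uint32_t r;

  r = (value >> 24) & 0xff;   /* movs r, #byte3  (assumed upper8_15) */
  r <<= 8;                    /* lsls r, r, #8 */
  r += (value >> 16) & 0xff;  /* adds r, #byte2  (assumed upper0_7) */
  r <<= 8;                    /* lsls r, r, #8 */
  r += (value >> 8) & 0xff;   /* adds r, #byte1  (assumed lower8_15) */
  r <<= 8;                    /* lsls r, r, #8 */
  r += value & 0xff;          /* adds r, #byte0  (assumed lower0_7) */
  return r;
}
```

Each step uses only an 8-bit immediate, which is what lets the value be built in a Thumb-1 low register without reading data from the code section.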
The extension of the *thumb1_movhf pattern always uses the same size
(6) although it can emit a shorter sequence when possible. This is
similar to what *arm32_movhf already does.
CASE_VECTOR_PC_RELATIVE is now false with -mpure-code, to avoid
generating invalid assembly code with differences from symbols from
two different sections (the difference cannot be computed by the
assembler).
Tests pr45701-[12].c needed a small adjustment to avoid matching
upper8_15 when looking for the r8 register.
Test no-literal-pool.c is augmented with __fp16, so it now uses
-mfp16-format=ieee.
Test thumb1-Os-mult.c generates an inline code sequence with
-mpure-code and computes the multiplication by using a sequence of
add/shift rather than using the multiply instruction, so we skip it in
presence of -mpure-code.
With -mcpu=cortex-m0, the pure-code/no-literal-pool.c fails because
code like:
static char *p = "Hello World";
char *
testchar ()
{
return p + 4;
}
By contrast, when using -mcpu=cortex-m4, the code looks like:
.section .rodata
.LC0:
.ascii "Hello World\000"
.data
p:
.word .LC0
testchar:
push {r7}
add r7, sp, #0
movw r3, #:lower16:p
movt r3, #:upper16:p
ldr r3, [r3]
adds r3, r3, #4
mov r0, r3
mov sp, r7
pop {r7}
bx lr
I haven't yet found how to make the code for cortex-m0 apply upper/lower
relocations to "p" instead of .LC2. The current code looks functional,
but could be improved.
Jakub Jelinek [Tue, 25 Feb 2020 12:56:47 +0000 (13:56 +0100)]
combine: Fix find_split_point handling of constant store into ZERO_EXTRACT [PR93908]
git is miscompiled on s390x-linux with -O2 -march=zEC12 -mtune=z13.
I've managed to reduce it into the following testcase. The problem is that
during combine we see the s->k = -1; bitfield store and change the SET_SRC
from a pseudo into a constant:
(set (zero_extract:DI (mem/j:HI (plus:DI (reg/v/f:DI 60 [ s ])
(const_int 10 [0xa])) [0 +0 S2 A16])
(const_int 2 [0x2])
(const_int 7 [0x7]))
(const_int -1 [0xffffffffffffffff]))
This on s390x with the above options isn't recognized as a valid instruction,
so find_split_point decides to handle it as IOR or IOR/AND.
src is -1, mask is 3 and pos is 7.
src != mask (this is also incorrect, we want to set all (both) bits in the
bitfield), so we go for IOR/AND, but instead of trying
mem = (mem & ~0x180) | ((-1 << 7) & 0x180)
we actually try
mem = (mem & ~0x180) | (-1 << 7)
and that is further simplified into:
mem = mem | (-1 << 7)
aka
mem = mem | 0xff80
which doesn't set just the 2-bit bitfield, but also many other bitfields
that shouldn't be touched.
We really should do:
mem = mem | 0x180
instead.
The problem is that we assume that no bits but those low len (2 here) will
be set in the SET_SRC, but there is nothing that can prevent that, we just
should ignore the other bits.
The following patch fixes it by masking src with mask, this way already
the src == mask test will DTRT, and as the code for or_mask uses
gen_int_mode, if the most significant bit is set after shifting it left by
pos, it will be properly sign-extended.
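The arithmetic can be reproduced with a small C model of the 16-bit memory word. This is a sketch of the bit manipulation only, not of the RTL that combine.c generates; the function names are hypothetical, and unsigned 64-bit arithmetic stands in for -1 << 7 because left-shifting a negative value is undefined in C.

```c
#include <stdint.h>

/* Hypothetical model of the old behaviour: SRC is shifted into place
   without first being masked to the LEN-bit field.  */
static uint16_t
store_unmasked (uint16_t mem, uint64_t src, unsigned len, unsigned pos)
{
  uint64_t mask = (UINT64_C (1) << len) - 1;
  return (uint16_t) ((mem & ~(mask << pos)) | (src << pos));
}

/* Hypothetical model of the fix: SRC is ANDed with the field mask
   first, so bits outside the field are ignored.  */
static uint16_t
store_masked (uint16_t mem, uint64_t src, unsigned len, unsigned pos)
{
  uint64_t mask = (UINT64_C (1) << len) - 1;
  src &= mask;
  return (uint16_t) ((mem & ~(mask << pos)) | (src << pos));
}
```

With mem = 0, src = -1 (all ones), len = 2 and pos = 7, the unmasked variant yields 0xff80 and clobbers the neighbouring bits, while the masked variant yields 0x180 and touches only the 2-bit field.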
2020-02-25 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/93908
* combine.c (find_split_point): For store into ZERO_EXTRACT, and src
with mask.