Jakub Jelinek [Mon, 27 Apr 2020 19:14:52 +0000 (21:14 +0200)]
x86: Fix up ix86_atomic_assign_expand_fenv [PR94780]
This function, because it is sometimes called even outside of function
bodies, uses create_tmp_var_raw rather than create_tmp_var. But in order
for that to work, when first referenced, the VAR_DECLs need to appear in a
TARGET_EXPR so that during gimplification the var gets the right
DECL_CONTEXT and is added to local decls. Without that, e.g. tree-nested.c
ICEs on those.
2020-04-27 Jakub Jelinek <jakub@redhat.com>
PR target/94780
* config/i386/i386.c (ix86_atomic_assign_expand_fenv): Use
TARGET_EXPR instead of MODIFY_EXPR for first assignment to
sw_var, exceptions_var, mxcsr_orig_var and mxcsr_mod_var.
Jakub Jelinek [Mon, 27 Apr 2020 14:05:03 +0000 (16:05 +0200)]
c-family: Fix ICE on __builtin_speculation_safe_value () [PR94755]
When this builtin has no parameters, speculation_safe_value_resolve_call
returns BUILT_IN_NONE, but resolve_overloaded_builtin uselessly
dereferences the first param just to return error_mark_node immediately.
The following patch rearranges it so that we only read the first parameter
if fncode is not BUILT_IN_NONE.
2020-04-27 Jakub Jelinek <jakub@redhat.com>
PR c/94755
* c-common.c (resolve_overloaded_builtin): Return error_mark_node for
fncode == BUILT_IN_NONE before initialization of first_param.
Jakub Jelinek [Fri, 24 Apr 2020 22:11:35 +0000 (00:11 +0200)]
c++: Avoid -Wreturn-type warning if a template fn calls noreturn template fn [PR94742]
finish_call_expr already has code to set current_function_returns_abnormally
if a template calls a noreturn function, but on the following testcase it
doesn't call a FUNCTION_DECL, but TEMPLATE_DECL instead, in which case
we didn't check noreturn at all and just assumed it could return.
2020-04-25 Jakub Jelinek <jakub@redhat.com>
PR c++/94742
* semantics.c (finish_call_expr): When looking if all overloads
are noreturn, use STRIP_TEMPLATE to look through TEMPLATE_DECLs.
Jakub Jelinek [Thu, 23 Apr 2020 19:57:50 +0000 (21:57 +0200)]
Shortcut identity VEC_PERM expansion [PR94710]
This PR is about the rs6000 backend emitting wrong assembly
for whole vector shift by 0, and while I think it is desirable
to fix the backend, I don't see a point why the expander should
try to emit that, whole vector shift by 0 is identity, we can just
return the operand.
2020-04-23 Jakub Jelinek <jakub@redhat.com>
PR target/94710
* optabs.c (expand_vec_perm_const): For shift_amt const0_rtx
just return v2.
Jakub Jelinek [Fri, 17 Apr 2020 08:33:27 +0000 (10:33 +0200)]
Fix -fcompare-debug issue in delete_insn_and_edges [PR94618]
delete_insn_and_edges calls purge_dead_edges whenever deleting the last insn
in a bb, whatever it is. If it called it only for mandatory last insns
in the basic block (that may not be followed by DEBUG_INSNs, dunno if that
is control_flow_insn_p or something more complex), that wouldn't be a
problem, but as it calls it on any last insn and can actually do something
in the bb, if such an insn is followed by one more more DEBUG_INSNs and
nothing else in the same bb, we don't call purge_dead_edges with -g and do
call it with -g0.
On the testcase, there are two reg-to-reg moves with REG_EH_REGION notes
(previously memory accesses but simplified and yet not optimized), and the
second is followed by DEBUG_INSNs; the second move is delete_insn_and_edges
and after removing it, for -g0 purge_dead_edges removes the REG_EH_REGION
from the now last insn in the bb (the first reg-to-reg move), while
for -g it isn't called and things diverge from that quickly on.
Fixed by calling purdge_dead_edges even if we remove the last real insn
followed only by DEBUG_INSNs in the same bb.
2020-04-17 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/94618
* cfgrtl.c (delete_insn_and_edges): Set purge not just when
insn is the BB_END of its block, but also when it is only followed
by DEBUG_INSNs in its block.
Jakub Jelinek [Fri, 17 Apr 2020 07:07:49 +0000 (09:07 +0200)]
inliner: Don't ICE on NULL TYPE_DOMAIN [PR94621]
When I've added the VLA tweak for OpenMP to avoid error_mark_nodes in the IL in
type, I forgot that TYPE_DOMAIN could be NULL. Furthermore, as an optimization,
this patch checks the hopefully cheapest condition that is very likely false
most of the time (enabled only during OpenMP handling) first.
2020-04-17 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/94621
* tree-inline.c (remap_type_1): Don't dereference NULL TYPE_DOMAIN.
Move id->adjust_array_error_bounds check first in the condition.
Jakub Jelinek [Thu, 16 Apr 2020 05:19:57 +0000 (07:19 +0200)]
c++: Fix pasto in structured binding diagnostics [PR94571]
This snippet has been copied from the non-structured binding declaration
parsing later in the function, and while for non-structured bindings
it can be followed by comma or semicolon, structured bindings may be
only followed by semicolon.
Or, do we want to have a different message for the case when there is
a comma (and keep this corrected one only if there is something else)
that would explain better what is the bug (or add a fix-it hint)?
Marek said in the PR that clang++ reports
error: decomposition declaration must be the only declaration in its group
There is another thing Marek noted (though, something for different spot),
that diagnostic for auto x(1), [e,f] = test2; could also use a clearer
wording like the above (or a fix-it hint), but the question is if we should
assume [ after , as a structured binding or if we should do some tentative
parsing first to figure out if it looks like a structured binding.
2020-04-16 Jakub Jelinek <jakub@redhat.com>
PR c++/94571
* parser.c (cp_parser_simple_declaration): Fix up a pasto in
diagnostics.
selftest: Work around GCC 4.2 PR33916 bug by optimizing the ctor [PR89494]
GCC 4.2 due to PR33916 miscompiles temp_dump_context ctor, because it doesn't
zero initialize the whole dump_context temporary on which it runs the static
get method and during destruction of the temporary an uninitialized pointer
is deleted.
More recent GCC versions properly zero initialize it and ideally optimize away
the construction/destruction of the temporary, as it isn't used for anything,
but there is no reason to create the temporary, static member functions can
be called without an associated object.
2020-04-15 Gustavo Romero <gromero@linux.ibm.com>
PR bootstrap/89494
* dumpfile.c (selftest::temp_dump_context::temp_dump_context):
Don't construct a dump_context temporary to call static method.
Jakub Jelinek [Wed, 8 Apr 2020 19:22:05 +0000 (21:22 +0200)]
vect: Fix up lowering of TRUNC_MOD_EXPR by negative constant [PR94524]
The first testcase below is miscompiled, because for the division part
of the lowering we canonicalize negative divisors to their absolute value
(similarly how expmed.c canonicalizes it), but when multiplying the division
result back by the VECTOR_CST, we use the original constant, which can
contain negative divisors.
Fixed by computing ABS_EXPR of the VECTOR_CST. Unfortunately, fold-const.c
doesn't support const_unop (ABS_EXPR, VECTOR_CST) and I think it is too late
in GCC 10 cycle to add it now.
Furthermore, while modulo by most negative constant happens to return the
right value, it does that only by invoking UB in the IL, because
we then expand division by that 1U+INT_MAX and say for INT_MIN % INT_MIN
compute the division as -1, and then multiply by INT_MIN, which is signed
integer overflow. We in theory could do the computation in unsigned vector
types instead, but is it worth bothering. People that are doing % INT_MIN
are either testing for standard conformance, or doing something wrong.
So, I've also added punting on % INT_MIN, both in vect lowering and vect
pattern recognition (we punt already for / INT_MIN).
2020-04-08 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/94524
* tree-vect-generic.c (expand_vector_divmod): If any elt of op1 is
negative for signed TRUNC_MOD_EXPR, multiply with absolute value of
op1 rather than op1 itself at the end. Punt for signed modulo by
most negative constant.
* tree-vect-patterns.c (vect_recog_divmod_pattern): Punt for signed
modulo by most negative constant.
* gcc.c-torture/execute/pr94524-1.c: New test.
* gcc.c-torture/execute/pr94524-2.c: New test.
Jakub Jelinek [Wed, 8 Apr 2020 16:24:12 +0000 (18:24 +0200)]
i386: Don't use AVX512F integral masks for V*TImode [PR94438]
The ix86_get_mask_mode hook uses int mask for 512-bit vectors or 128/256-bit
vectors with AVX512VL (that is correct), and only for V*[SD][IF]mode if not
AVX512BW (also correct), but with AVX512BW it would stop checking the
elem_size altogether and pretend the hw has masking support for V*TImode
etc., which it doesn't. That can lead to various ICEs later on.
2020-04-08 Jakub Jelinek <jakub@redhat.com>
PR target/94438
* config/i386/i386.c (ix86_get_mask_mode): Only use int mask for elem_size
1, 2, 4 and 8.
* gcc.target/i386/avx512bw-pr94438.c: New test.
* gcc.target/i386/avx512vlbw-pr94438.c: New test.
Jakub Jelinek [Wed, 8 Apr 2020 13:30:16 +0000 (15:30 +0200)]
c++: Further fix for -fsanitize=vptr [PR94325]
For -fsanitize=vptr, we insert a NULL store into the vptr instead of just
adding a CLOBBER of this. build_clobber_this makes the CLOBBER conditional
on in_charge (implicit) parameter whenever CLASSTYPE_VBASECLASSES, but when
adding this conditionalization to the -fsanitize=vptr code in PR87095,
I wanted it to catch some more cases when the class has CLASSTYPE_VBASECLASSES,
but the vptr is still not shared with something else, otherwise the
sanitization would be less effective.
The following testcase shows that the chosen test that CLASSTYPE_PRIMARY_BINFO
is non-NULL and has BINFO_VIRTUAL_P set wasn't sufficient,
the D class has still sizeof(D) == sizeof(void*) and thus contains just
a single vptr, but while in B::~B() this results in the vptr not being
cleared, in C::~C() this condition isn't true, as CLASSTYPE_PRIMARY_BINFO
in that case is B and is not BINFO_VIRTUAL_P, so it clears the vptr, but the
D::~D() dtor after invoking C::~C() invokes A::~A() with an already cleared
vptr, which is then reported.
The following patch is just a shot in the dark, keep looking through
CLASSTYPE_PRIMARY_BINFO until we find BINFO_VIRTUAL_P, but it works on the
existing testcase as well as this new one.
2020-04-08 Jakub Jelinek <jakub@redhat.com>
PR c++/94325
* decl.c (begin_destructor_body): For CLASSTYPE_VBASECLASSES class
dtors, if CLASSTYPE_PRIMARY_BINFO is non-NULL, but not BINFO_VIRTUAL_P,
look at CLASSTYPE_PRIMARY_BINFO of its BINFO_TYPE if it is not
BINFO_VIRTUAL_P, and so on.
Will Schmidt [Tue, 15 Sep 2020 20:06:08 +0000 (15:06 -0500)]
[PATCH, rs6000] Fix vector long long subtype (PR96139)
Hi,
This corrects an issue with the powerpc vector long long subtypes.
As reported by SjMunroe, when building some code with -Wall, and
attempting to print an element of a "long long vector" with a
long long printf format string, we will report an error because
the vector sub-type was improperly defined as int.
When defining a V2DI_type_node we use a TARGET_POWERPC64 ternary to
define the V2DI_type_node with "vector long" or "vector long long".
We also need to specify the proper sub-type when we define the type.
Due to some file renames, This is a backport and rework of both
[PATCH, rs6000] Fix vector long long subtype (PR96139)
and
[PATCH, rs6000] Testsuite fixup pr96139 tests
PR target/96139
2020-09-03 Will Schmidt <will_schmidt@vnet.ibm.com>
gcc/ChangeLog:
* config/rs6000/rs6000.c (rs6000_init_builtin): Update V2DI_type_node
and unsigned_V2DI_type_node definitions.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/pr96139-a.c: New test.
* gcc.target/powerpc/pr96139-b.c: New test.
* gcc.target/powerpc/pr96139-c.c: New test.
Richard Biener [Wed, 6 May 2020 08:23:15 +0000 (10:23 +0200)]
middle-end/94964 - avoid EH loop entry with CP_SIMPLE_PREHEADERS
Loop optimizers expect to be able to insert on the preheader
edge w/o splitting it thus avoid ending up with a preheader
that enters the loop via an EH edge (or an abnormal edge).
Richard Biener [Fri, 15 May 2020 07:38:54 +0000 (09:38 +0200)]
tree-optimization/95133 - avoid abnormal edges in path splitting
When path splitting tries to detect a CFG diamond make sure it
is composed of normal (non-EH, not abnormal) edges. Otherwise
CFG manipulation later may fail.
2020-05-15 Richard Biener <rguenther@suse.de>
PR tree-optimization/95133
* gimple-ssa-split-paths.c
(find_block_to_duplicate_for_splitting_paths): Check for
normal edges.
Richard Biener [Wed, 17 Jun 2020 12:57:59 +0000 (14:57 +0200)]
tree-optimization/95717 - fix SSA update for vectorizer epilogue
This fixes yet another issue with the custom SSA updating in the
vectorizer when we copy from the non-if-converted loop. We must
not mess with current defs before we updated the BB copies.
2020-06-17 Richard Biener <rguenther@suse.de>
PR tree-optimization/95717
* tree-vect-loop-manip.c (slpeel_tree_duplicate_loop_to_edge_cfg):
Move BB SSA updating before exit/latch PHI current def copying.
This attempts to make is_nothrow_constructible more robust (and
efficient to compile) by not depending on is_constructible. Instead the
__is_constructible intrinsic is used directly. The helper class
__is_nt_constructible_impl which checks whether the construction is
non-throwing now takes a bool template parameter that is substituted by
the result of the instrinsic. This fixes the reported bug by not using
the already-instantiated (and incorrect) value of std::is_constructible.
I don't think it really fixes the problem in general, because
std::is_nothrow_constructible itself could already have been
instantiated in a context where it gives the wrong result. A proper fix
needs to be done in the compiler.
Backported to the gcc-8 and gcc-9 branches to fix PR 96999.
PR libstdc++/94033
* include/std/type_traits (__is_nt_default_constructible_atom): Remove.
(__is_nt_default_constructible_impl): Remove.
(__is_nothrow_default_constructible_impl): Remove.
(__is_nt_constructible_impl): Add bool template parameter. Adjust
partial specializations.
(__is_nothrow_constructible_impl): Replace class template with alias
template.
(is_nothrow_default_constructible): Derive from alias template
__is_nothrow_constructible_impl instead of
__is_nothrow_default_constructible_impl.
* testsuite/20_util/is_nothrow_constructible/94003.cc: New test.
Eric Botcazou [Fri, 11 Sep 2020 08:41:28 +0000 (10:41 +0200)]
Fix crash on array component with nonstandard index type
This is a regression present on mainline, 10 and 9 branches: the compiler
goes into an infinite recursion eventually exhausting the stack for the
declaration of a discriminated record type with an array component having
a discriminant as bound and an index type that is an enumeration type with
a non-standard representation clause.
gcc/ada/ChangeLog:
* gcc-interface/decl.c (gnat_to_gnu_entity) <E_Array_Subtype>: Only
create extra subtypes for discriminants if the RM size of the base
type of the index type is lower than that of the index type.
gcc/testsuite/ChangeLog:
* gnat.dg/specs/discr7.ads: New test.
Eric Botcazou [Thu, 10 Sep 2020 15:47:32 +0000 (17:47 +0200)]
Fix uninitialized variable with nested variant record types
This fixes a wrong code issue with nested variant record types: the
compiler generates move instructions that depend on an uninitialized
variable, which was initially a SAVE_EXPR not instantiated early enough.
gcc/ada/ChangeLog:
* gcc-interface/decl.c (build_subst_list): For a definition, make
sure to instantiate the SAVE_EXPRs generated by the elaboration of
the constraints in front of the elaboration of the type itself.
gcc/testsuite/ChangeLog:
* gnat.dg/discr59.adb: New test.
* gnat.dg/discr59_pkg1.ads: New helper.
* gnat.dg/discr59_pkg2.ads: Likewise.
Harald Anlauf [Thu, 3 Sep 2020 18:33:14 +0000 (20:33 +0200)]
PR fortran/96890 - Wrong answer with intrinsic IALL
The IALL intrinsic would always return 0 when the DIM and MASK arguments
were present since the initial value of repeated BIT-AND operations was
set to 0 instead of -1.
libgfortran/ChangeLog:
* m4/iall.m4: Initial value for result should be -1.
* generated/iall_i1.c (miall_i1): Generated.
* generated/iall_i16.c (miall_i16): Likewise.
* generated/iall_i2.c (miall_i2): Likewise.
* generated/iall_i4.c (miall_i4): Likewise.
* generated/iall_i8.c (miall_i8): Likewise.
Jonathan Wakely [Thu, 30 Apr 2020 14:47:52 +0000 (15:47 +0100)]
libstdc++: Avoid errors in allocator's noexcept-specifier (PR 89510)
This fixes a regression due to the conditional noexcept-specifier on
std::allocator::construct and std::allocator::destroy, as well as the
corresponding members of new_allocator, malloc_allocator, and
allocator_traits. Those noexcept-specifiers were using expressions which
might be ill-formed, which caused errors outside the immediate context
when checking for the presence of construct and destroy in SFINAE
contexts.
The fix is to use the is_nothrow_constructible and
is_nothrow_destructible type traits instead, because those traits are
safe to use even when the construction/destruction itself is not valid.
The is_nothrow_constructible trait will be false for a type that is not
also nothrow-destructible, even if the new-expression used in the
construct function body is actually noexcept. That's not the correct
answer, but isn't a problem because providing a noexcept-specifier on
these functions is not required by the standard anyway. If the answer is
false when it should be true, that's suboptimal but OK (unlike giving
errors for valid code, or giving a true answer when it should be false).
PR libstdc++/89510
* include/bits/alloc_traits.h (allocator_traits::_S_construct)
(allocator_traits::_S_destroy)
(allocator_traits<allocator<T>>::construct): Use traits in
noexcept-specifiers.
* include/bits/allocator.h (allocator<void>::construct)
(allocator<void>::destroy): Likewise.
* include/ext/malloc_allocator.h (malloc_allocator::construct)
(malloc_allocator::destroy): Likewise.
* include/ext/new_allocator.h (new_allocator::construct)
(new_allocator::destroy): Likewise.
* testsuite/20_util/allocator/89510.cc: New test.
* testsuite/ext/malloc_allocator/89510.cc: New test.
* testsuite/ext/new_allocator/89510.cc: New test.
Kewen Lin [Wed, 2 Sep 2020 02:12:32 +0000 (21:12 -0500)]
rs6000: Backport fixes for PR92923 and PR93136
This patch is to backport the fix for PR92923 and its sequent fix
for PR93136. We found the builtin functions needlessly using
VIEW_CONVERT_EXPRs on their operands can probably cause remarkable
performance degradation especailly when they are used in the hotspot.
One typical case is
...github.com/antonblanchard/crc32-vpmsum/blob/master/vec_crc32.c
With this patch, the execution time can improve 47.81%.
Apart from the original fixes, this patch also gets two cases below
updated. During the regression testing I found two cases failed due
to icf optimization able to be adopted with this patch, the function
bodies use tail calls, the expected assembly instructions are gone.
Mark Eggleston [Fri, 21 Aug 2020 05:39:30 +0000 (06:39 +0100)]
Fortran : ICE for division by zero in declaration PR95882
A length expression containing a divide by zero in a character
declaration will result in an ICE if the constant is anymore
complicated that a contant divided by a constant.
The cause was that char_len_param_value can return MATCH_YES
even if a divide by zero was seen. Prior to returning check
whether a divide by zero was seen and if so set it to MATCH_ERROR.
2020-08-27 Mark Eggleston <markeggleston@gcc.gnu.org>
gcc/fortran
PR fortran/95882
* decl.c (char_len_param_value): Check gfc_seen_div0 and
if it is set return MATCH_ERROR.
2020-08-27 Mark Eggleston <markeggleston@gcc.gnu.org>
gcc/testsuite/
PR fortran/95882
* gfortran.dg/pr95882_1.f90: New test.
* gfortran.dg/pr95882_2.f90: New test.
* gfortran.dg/pr95882_3.f90: New test.
* gfortran.dg/pr95882_4.f90: New test.
* gfortran.dg/pr95882_5.f90: New test.
Christophe Lyon [Wed, 19 Aug 2020 09:02:21 +0000 (09:02 +0000)]
arm: Fix -mpure-code support/-mslow-flash-data for armv8-m.base [PR94538]
armv8-m.base (cortex-m23) has the movt instruction, so we need to
disable the define_split to generate a constant in this case,
otherwise we get incorrect insn constraints as described in PR94538.
We also need to fix the pure-code alternative for thumb1_movsi_insn
because the assembler complains with instructions like
movs r0, #:upper8_15:1234
(Internal error in md_apply_fix)
We now generate movs r0, 4 instead.
Jonathan Wakely [Wed, 26 Aug 2020 13:47:51 +0000 (14:47 +0100)]
libstdc++: Enable assertions in constexpr string_view members [PR 71960]
Since GCC 6.1 there is no reason we can't just use __glibcxx_assert in
constexpr functions in string_view. As long as the condition is true,
there will be no call to std::__replacement_assert that would make the
function ineligible for constant evaluation.
Runtime error occurs when the type of the value argument is
character(0): "Zero-length string passed as value...".
The status argument, intent(out), will contain -1 if the value
of the environment is too large to fit in the value argument, this
is the case if the type is character(0) so there is no reason to
produce a runtime error if the value argument is zero length.
2020-08-24 Mark Eggleston <markeggleston@gcc.gnu.org>
libgfortran/
PR fortran/96486
* intrinsics/env.c: If value_len is > 0 blank the string.
Copy the result only if its length is > 0.
2020-08-24 Mark Eggleston <markeggleston@gcc.gnu.org>
gcc/testsuite/
PR fortran/96486
* gfortran.dg/pr96486.f90: New test.
Tamar Christina [Mon, 3 Aug 2020 11:03:17 +0000 (12:03 +0100)]
AArch64: Fix hwasan failure in readline.
My previous fix added an unchecked call to fgets in the new function readline.
fgets can fail when there's an error reading the file in which case it returns
NULL. It also returns NULL when the next character is EOF.
The EOF case is already covered by the existing code but the error case isn't.
This fixes it by returning the empty string on error.
Also I now use strnlen instead of strlen to make sure we never read outside the
buffer.
This was flagged by Matthew Malcomson during his hwasan work.
gcc/ChangeLog:
* config/aarch64/driver-aarch64.c (readline): Check return value fgets.
Tamar Christina [Wed, 8 Jul 2020 13:32:34 +0000 (14:32 +0100)]
AArch64: Add test for -mcpu=native
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/cpunative/aarch64-cpunative.exp: New file.
* gcc.target/aarch64/cpunative/info_0: New test.
* gcc.target/aarch64/cpunative/info_1: New test.
* gcc.target/aarch64/cpunative/info_10: New test.
* gcc.target/aarch64/cpunative/info_11: New test.
* gcc.target/aarch64/cpunative/info_12: New test.
* gcc.target/aarch64/cpunative/info_13: New test.
* gcc.target/aarch64/cpunative/info_14: New test.
* gcc.target/aarch64/cpunative/info_15: New test.
* gcc.target/aarch64/cpunative/info_2: New test.
* gcc.target/aarch64/cpunative/info_3: New test.
* gcc.target/aarch64/cpunative/info_4: New test.
* gcc.target/aarch64/cpunative/info_5: New test.
* gcc.target/aarch64/cpunative/info_6: New test.
* gcc.target/aarch64/cpunative/info_7: New test.
* gcc.target/aarch64/cpunative/info_8: New test.
* gcc.target/aarch64/cpunative/info_9: New test.
* gcc.target/aarch64/cpunative/native_cpu_0.c: New test.
* gcc.target/aarch64/cpunative/native_cpu_1.c: New test.
* gcc.target/aarch64/cpunative/native_cpu_10.c: New test.
* gcc.target/aarch64/cpunative/native_cpu_13.c: New test.
* gcc.target/aarch64/cpunative/native_cpu_14.c: New test.
* gcc.target/aarch64/cpunative/native_cpu_2.c: New test.
* gcc.target/aarch64/cpunative/native_cpu_3.c: New test.
* gcc.target/aarch64/cpunative/native_cpu_4.c: New test.
* gcc.target/aarch64/cpunative/native_cpu_5.c: New test.
* gcc.target/aarch64/cpunative/native_cpu_6.c: New test.
* gcc.target/aarch64/cpunative/native_cpu_7.c: New test.
* gcc.target/aarch64/cpunative/native_cpu_8.c: New test.
Tamar Christina [Fri, 17 Jul 2020 12:10:28 +0000 (13:10 +0100)]
AArch64: Fix bugs in -mcpu=native detection.
This patch fixes a couple of issues in AArch64's -mcpu=native processing:
The buffer used to read the lines from /proc/cpuinfo is 128 bytes long. While
this was enough in the past with the increase in architecture extensions it is
no longer enough. It results in two bugs:
1) No option string longer than 127 characters is correctly parsed. Features
that are supported are silently ignored.
2) It incorrectly enables features that are not present on the machine:
a) It checks for substring matching instead of full word matching. This makes
it incorrectly detect sb support when ssbs is provided instead.
b) Due to the truncation at the 127 char border it also incorrectly enables
features due to the full feature being cut off and the part that is left
accidentally enables something else.
This breaks -mcpu=native detection on some of our newer system.
The patch fixes these issues by reading full lines up to the \n in a string.
This gives us the full feature line. Secondly it creates a set from this string
to:
1) Reduce matching complexity from O(n*m) to O(n*logm).
2) Perform whole word matching instead of substring matching.
To make this code somewhat cleaner I also changed from using char* to using
std::string and std::set.
Note that I have intentionally avoided the use of ifstream and stringstream
to make it easier to backport. I have also not change the substring matching
for the initial line classification as I cannot find a documented cpuinfo format
which leads me to believe there may be kernels out there that require this which
may be why the original code does this.
I also do not want this to break if the kernel adds a new line that is long and
indents the file by two tabs to keep everything aligned. In short I think an
imprecise match is the right thing here.
Test for this is added as the last thing in this series as it requires some
changes to be made to be able to test this.
Jonathan Wakely [Wed, 19 Aug 2020 11:13:03 +0000 (12:13 +0100)]
libstdc++: Add deprecated attributes to old iostream members
Back in 2017 I removed these prehistoric members (which were deprecated
since C++98) for C++17 mode. But I didn't add deprecated attributes to
most of them, so users didn't get any warning they would be going away.
Apparently some poor souls do actually use some of these names, and so
now that GCC 11 defaults to -std=gnu++17 some code has stopped
compiling.
This adds deprecated attributes to them, so that C++98/03/11/14 code
will get a warning if it uses them. I'll also backport this to the
release branches so that users can find out about the deprecation before
they start using C++17.
libstdc++-v3/ChangeLog:
* include/bits/c++config (_GLIBCXX_DEPRECATED_SUGGEST): New
macro for "use 'foo' instead" message in deprecated warnings.
* include/bits/ios_base.h (io_state, open_mode, seek_dir)
(streampos, streamoff): Use _GLIBCXX_DEPRECATED_SUGGEST.
* include/std/streambuf (stossc): Replace C++11 attribute
with _GLIBCXX_DEPRECATED_SUGGEST.
* testsuite/27_io/types/1.cc: Check for deprecated warnings.
Also check for io_state, open_mode and seek_dir typedefs.
Kewen Lin [Wed, 12 Aug 2020 09:19:16 +0000 (04:19 -0500)]
testsuite: Add -fno-common to pr82374.c [PR94077]
As the PR comments show, the case gcc.dg/gomp/pr82374.c fails on
Power7 since gcc8. But it passes from gcc10. By looking into
the difference, it's due to that gcc10 sets -fno-common as default,
which makes vectorizer force the alignment and be able to use
aligned vector load/store on those targets which doesn't support
unaligned vector load/store (here it's Power7).
As Jakub suggested in the PR, this patch is to append -fno-common
into dg-options.
Verified with gcc8/gcc9 releases on ppc64-redhat-linux (Power7).
Christophe Lyon [Wed, 12 Aug 2020 09:22:38 +0000 (09:22 +0000)]
testsuite: Fix gcc.target/arm/stack-protector-1.c for Cortex-M
The stack-protector-1.c test fails when compiled for Cortex-M:
- for Cortex-M0/M1, str r0, [sp #-8]! is not supported
- for Cortex-M3/M4..., the assembler complains that "use of r13 is
deprecated"
This patch replaces the str instruction with
sub sp, sp, #8
str r0, [sp]
and removes the check for r13, which is unlikely to leak the canary
value.
Jakub Jelinek [Mon, 3 Aug 2020 20:55:28 +0000 (22:55 +0200)]
aarch64: Fix up __aarch64_cas16_acq_rel fallback
As mentioned in the PR, the fallback path when LSE is unavailable writes
incorrect registers to the memory if the previous content compares equal
to x0, x1 - it writes copy of x0, x1 from the start of function, but it
should write x2, x3.
2020-08-03 Jakub Jelinek <jakub@redhat.com>
PR target/96402
* config/aarch64/lse.S (__aarch64_cas16_acq_rel): Use x2, x3 instead
of x(tmp0), x(tmp1) in STXP arguments.
arm: Clear canary value after stack_protect_test [PR96191]
The stack_protect_test patterns were leaving the canary value in the
temporary register, meaning that it was often still in registers on
return from the function. An attacker might therefore have been
able to use it to defeat stack-smash protection for a later function.
gcc/
PR target/96191
* config/arm/arm.md (arm_stack_protect_test_insn): Zero out
operand 2 after use.
* config/arm/thumb1.md (thumb1_stack_protect_test_insn): Likewise.
gcc/testsuite/
* gcc.target/arm/stack-protector-1.c: New test.
* gcc.target/arm/stack-protector-2.c: Likewise.
aarch64: Clear canary value after stack_protect_test [PR96191]
The stack_protect_test patterns were leaving the canary value in the
temporary register, meaning that it was often still in registers on
return from the function. An attacker might therefore have been
able to use it to defeat stack-smash protection for a later function.
gcc/
PR target/96191
* config/aarch64/aarch64.md (stack_protect_test_<mode>): Set the
CC register directly, instead of a GPR. Replace the original GPR
destination with an extra scratch register. Zero out operand 3
after use.
(stack_protect_test): Update accordingly.
gcc/testsuite/
PR target/96191
* gcc.target/aarch64/stack-protector-1.c: New test.
* gcc.target/aarch64/stack-protector-2.c: Likewise.
This function was unimplemented, simply returning the native format
string instead.
PR libstdc++/93245
* include/experimental/bits/fs_path.h (path::generic_string<C,T,A>()):
Return the generic format path, not the native one.
* testsuite/experimental/filesystem/path/generic/generic_string.cc:
Improve test coverage.
It's not possible to construct a path::string_type from an allocator of
a different type. Create the correct specialization of basic_string, and
adjust path::_S_str_convert to use a basic_string_view so that it is
independent of the allocator type.
PR libstdc++/94242
* include/bits/fs_path.h (path::_S_str_convert): Replace first
parameter with basic_string_view so that strings with different
allocators can be accepted.
(path::generic_string<C,T,A>()): Use basic_string object that uses the
right allocator type.
* testsuite/27_io/filesystem/path/generic/94242.cc: New test.
* testsuite/27_io/filesystem/path/generic/generic_string.cc: Improve
test coverage.
Qian Jianhua [Fri, 7 Aug 2020 09:39:40 +0000 (10:39 +0100)]
aarch64: Add A64FX machine model
This patch add support for Fujitsu A64FX, as the first step of adding
A64FX machine model.
A64FX is used in FUJITSU Supercomputer PRIMEHPC FX1000,
PRIMEHPC FX700, and supercomputer Fugaku.
The official microarchitecture information of A64FX can be read at
https://github.com/fujitsu/A64FX.
2020-08-07 Qian jianhua <qianjh@cn.fujitsu.com>
gcc/
* config/aarch64/aarch64-cores.def (a64fx): New core.
* config/aarch64/aarch64-tune.md: Regenerated.
* config/aarch64/aarch64.c (a64fx_prefetch_tune, a64fx_tunings): New.
* doc/invoke.texi: Add a64fx to the list.
early-remat: Handle sets of multiple candidate regs [PR94605]
early-remat.c:process_block wasn't handling insns that set multiple
candidate registers, which led to an assertion failure at the end
of the main loop.
Instructions that set two pseudos aren't rematerialisation candidates in
themselves, but we still need to track them if another instruction that
sets the same register is a rematerialisation candidate.
gcc/
PR rtl-optimization/94605
* early-remat.c (early_remat::process_block): Handle insns that
set multiple candidate registers.
gcc/testsuite/
PR rtl-optimization/94605
* gcc.target/aarch64/sve/pr94605.c: New test.
ipa-devirt: Fix crash in obj_type_ref_class [PR95114]
The testcase has failed since r9-5035, because obj_type_ref_class
tries to look up an ODR type when no ODR type information is
available. (The information was available earlier in the
compilation, but was freed during pass_ipa_free_lang_data.)
We then crash dereferencing the null get_odr_type result.
The test passes with -O2. However, it fails again if -fdump-tree-all
is used, since obj_type_ref_class is called indirectly from the
dump routines.
Other code creates ODR type entries on the fly by passing “true”
as the insert parameter. But obj_type_ref_class can't do that
unconditionally, since it should have no side-effects when used
from the dumping code.
Following a suggestion from Honza, this patch adds parameters
to say whether the routines are being called from dump routines
and uses those to derive the insert parameter.
gcc/
PR middle-end/95114
* tree.h (virtual_method_call_p): Add a default-false parameter
that indicates whether the function is being called from dump
routines.
(obj_type_ref_class): Likewise.
* tree.c (virtual_method_call_p): Likewise.
* ipa-devirt.c (obj_type_ref_class): Likewise. Lazily add ODR
type information for the type when the parameter is false.
* tree-pretty-print.c (dump_generic_node): Update calls to
virtual_method_call_p and obj_type_ref_class accordingly.
gcc/testsuite/
PR middle-end/95114
* g++.target/aarch64/pr95114.C: New test.