Jonathan Wakely [Mon, 12 Oct 2020 17:14:01 +0000 (18:14 +0100)]
libstdc++: Fix documentation for return values of copy algos
The doxygen comments for these algos all incorrectly claim to use
(first - last) as the difference from the start of the output range to
the return value. As reported on the mailing list by Johannes Choo, it
should be (last - first).
libstdc++-v3/ChangeLog:
* include/bits/stl_algobase.h (copy, move, copy_backward)
(move_backward): Fix documentation for returned iterator.
Marek Polacek [Thu, 10 Sep 2020 21:27:43 +0000 (17:27 -0400)]
c++: Fix P0846 (ADL and function templates) in template [PR97010]
To quickly recap, P0846 says that a name is also considered to refer to
a template if it is an unqualified-id followed by a < and name lookup
finds either one or more functions or finds nothing.
In a template, when parsing a function call that has type-dependent
arguments, we can't perform ADL right away so we set KOENIG_LOOKUP_P in
the call to remember to do it when instantiating the call
(tsubst_copy_and_build/CALL_EXPR). When the called function is a
function template, we represent the call with a TEMPLATE_ID_EXPR;
usually the operand is an OVERLOAD.
In the P0846 case though, the operand can be an IDENTIFIER_NODE, when
name lookup found nothing when parsing the template name. But we
weren't handling this correctly in tsubst_copy_and_build. First
we need to pass the FUNCTION_P argument from <case TEMPLATE_ID_EXPR> to
<case IDENTIFIER_NODE>, otherwise we give a bogus error. And then in
<case CALL_EXPR> we need to perform ADL. The rest of the changes is to
give better errors when ADL didn't find anything.
gcc/cp/ChangeLog:
PR c++/97010
* pt.c (tsubst_copy_and_build) <case TEMPLATE_ID_EXPR>: Call
tsubst_copy_and_build explicitly instead of using the RECUR macro.
Handle a TEMPLATE_ID_EXPR with an IDENTIFIER_NODE as its operand.
<case CALL_EXPR>: Perform ADL for a TEMPLATE_ID_EXPR with an
IDENTIFIER_NODE as its operand.
gcc/testsuite/ChangeLog:
PR c++/97010
* g++.dg/cpp2a/fn-template21.C: New test.
* g++.dg/cpp2a/fn-template22.C: New test.
Jonathan Wakely [Mon, 21 Sep 2020 13:28:58 +0000 (14:28 +0100)]
libstdc++: Make std::assume_aligned a constexpr function [PR 97132]
The cast from void* to T* in std::assume_aligned is not valid in a
constexpr function. The optimization hint is redundant during constant
evaluation anyway (the compiler can see the object and knows its
alignment). Simply return the original pointer without applying the
__builtin_assume_aligned hint to it when doing constant evaluation.
libstdc++-v3/ChangeLog:
PR libstdc++/97132
* include/std/memory (assume_aligned): Do not use
__builtin_assume_aligned during constant evaluation.
* testsuite/20_util/assume_aligned/1.cc: Improve test.
* testsuite/20_util/assume_aligned/97132.cc: New test.
Harald Anlauf [Sun, 18 Oct 2020 18:15:26 +0000 (20:15 +0200)]
PR libfortran/97063 - Wrong result for vector (step size is negative) * matrix
The MATMUL intrinsic provided a wrong result for rank-1 times rank-2 array
when a negative stride was used for addressing the elements of the rank-1
array, because a check on strides was erroneously placed before the check
on the rank. Interchange order of checks.
arm: Fix the warning -mcpu=cortex-m55 conflicting with -march=armv8.1-m.main (pr97327).
This patch fixes (PR97327) the warning -mcpu=cortex-m55 conflicts with -march=armv8.1-m.main
for -mfloat-abi=soft by adding the isa_bit_mve_float to clearing FP bit list.
The following combination are fixed with this patch:
$ cat bug.c
int main(){
return 0;
}
A few MVE intrinsics had an unsigned variant implement while they are
supported by the hardware. This patch removes them:
__arm_vqrdmlashq_n_u8
__arm_vqrdmlahq_n_u8
__arm_vqdmlahq_n_u8
__arm_vqrdmlashq_n_u16
__arm_vqrdmlahq_n_u16
__arm_vqdmlahq_n_u16
__arm_vqrdmlashq_n_u32
__arm_vqrdmlahq_n_u32
__arm_vqdmlahq_n_u32
__arm_vmlaldavaxq_p_u32
__arm_vmlaldavaxq_p_u16
gcc/
PR target/96914
* config/arm/arm_mve.h (vqdmlashq, vqdmlashq_m): Define.
* config/arm/arm_mve_builtins.def (vqdmlashq_n_s)
(vqdmlashq_m_n_s,): New.
* config/arm/unspecs.md (VQDMLASHQ_N_S, VQDMLASHQ_M_N_S): New
unspecs.
* config/arm/iterators.md (VQDMLASHQ_N_S, VQDMLASHQ_M_N_S): New
attributes.
(VQDMLASHQ_N): New iterator.
* config/arm/mve.md (mve_vqdmlashq_n_, mve_vqdmlashq_m_n_s): New
patterns.
gcc/testsuite/
PR target/96914
* gcc.target/arm/mve/intrinsics/vqdmlashq_m_n_s16.c: New test.
* gcc.target/arm/mve/intrinsics/vqdmlashq_m_n_s32.c: New test.
* gcc.target/arm/mve/intrinsics/vqdmlashq_m_n_s8.c: New test.
* gcc.target/arm/mve/intrinsics/vqdmlashq_n_s16.c: New test.
* gcc.target/arm/mve/intrinsics/vqdmlashq_n_s32.c: New test.
* gcc.target/arm/mve/intrinsics/vqdmlashq_n_s8.c: New test.
Jakub Jelinek [Tue, 13 Oct 2020 17:13:26 +0000 (19:13 +0200)]
combine: Fix up simplify_shift_const_1 for nested ROTATEs [PR97386]
The following testcases are miscompiled (the first one since my improvements
to rotate discovery on GIMPLE, the other one for many years) because
combiner optimizes nested ROTATEs with narrowing SUBREG in between (i.e.
the outer rotate is performed in shorter precision than the inner one) to
just one ROTATE of the rotated constant. While that (under certain
conditions) can work for shifts, it can't work for rotates where we can only
do that with rotates of the same precision.
2020-10-13 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/97386
* combine.c (simplify_shift_const_1): Don't optimize nested ROTATEs if
they have different modes.
* gcc.c-torture/execute/pr97386-1.c: New test.
* gcc.c-torture/execute/pr97386-2.c: New test.
Jakub Jelinek [Thu, 8 Oct 2020 09:10:34 +0000 (11:10 +0200)]
openmp: Set cfun->calls_alloca when needed in OpenMP outlined regions [PR97294]
The following testcase FAILs, because we don't mark the child OpenMP function
as cfun->calls_alloca when it does call alloca. When optimizing, during DCE we
reset those flags and recompute them again, but with -O0 DCE is not performed.
Fixed by calling notice_special_calls when moving insns to the child function.
cfun->calls_alloca is normally set during gimplification and most of the
alloca calls omp-low.c does go through the gimplifier, but one spot didn't
and built the gcall directly, so that one needs to set calls_alloca too.
2020-10-08 Jakub Jelinek <jakub@redhat.com>
PR sanitizer/97294
* tree-cfg.c (move_block_to_fn): Call notice_special_calls on
call stmts being moved into dest_cfun.
* omp-low.c (lower_rec_input_clauses): Set cfun->calls_alloca when
adding __builtin_alloca_with_align call without gimplification.
Jakub Jelinek [Mon, 5 Oct 2020 16:33:17 +0000 (18:33 +0200)]
support TARGET_MEM_REF in C/C++ error pretty-printing [PR97197]
> See my comment above for Martins attempts to improve things. I don't
> really want to try decide what to do with those late diagnostic IL
> printing but my commit was blamed for showing target-mem-ref unsupported.
>
> I don't have much time to spend to think what to best print and what not,
> but yes, printing only the MEM_REF part is certainly imprecise.
Here is an updated version of the patch that prints TARGET_MEM_REF the way
it should be printed - as C representation of what it actually means.
Of course it would be better to have the original expressions, but with the
late diagnostics we no longer have them.
2020-10-05 Richard Biener <rguenther@suse.de>
Jakub Jelinek <jakub@redhat.com>
PR c++/97197
gcc/cp/
* error.c (dump_expr): Handle TARGET_MEM_REF.
gcc/c-family/
* c-pretty-print.c: Include langhooks.h.
(c_pretty_printer::postfix_expression): Handle TARGET_MEM_REF as
expression.
(c_pretty_printer::expression): Handle TARGET_MEM_REF as
unary_expression.
(c_pretty_printer::unary_expression): Handle TARGET_MEM_REF.
With this patch:
...
foo:
vldrw.32 q3, [r0]
vstrw.u32 q0, [q3, #8]! --> (C)
vstrw.32 q3, [r0]
bx lr
...
Without this patch 2 vstrw assembly instructions (A and B) are generated for vstrwq_scatter_base_wb_s32
intrinsic where as fix generates only one vstrw assembly instruction (C).
Joe Ramsay [Tue, 6 Oct 2020 06:33:52 +0000 (07:33 +0100)]
arm: Add +nomve and +nomve.fp options to -mcpu=cortex-m55
This patch rearranges feature bits for MVE and FP to implement the
following flags for -mcpu=cortex-m55.
- +nomve: equivalent to armv8.1-m.main+fp.dp+dsp.
- +nomve.fp: equivalent to armv8.1-m.main+mve+fp.dp (+dsp is implied by +mve).
- +nofp: equivalent to armv8.1-m.main+mve (+dsp is implied by +mve).
- +nodsp: equivalent to armv8.1-m.main+fp.dp.
Combinations of the above:
- +nomve+nofp: equivalent to armv8.1-m.main+dsp.
- +nodsp+nofp: equivalent to armv8.1-m.main.
Due to MVE and FP sharing vfp_base, some new syntax was required in the CPU
description to implement the concept of 'implied bits'. These are non-named
features added to the ISA late, depending on whether one or more features which
depend on them are present. This means vfp_base can be present when only one of
MVE and FP is removed, but absent when both are removed.
gcc/ChangeLog:
2020-07-31 Joe Ramsay <joe.ramsay@arm.com>
* config/arm/arm-cpus.in:
(ALL_FPU_INTERNAL): Remove vfp_base.
(VFPv2): Remove vfp_base.
(MVE): Remove vfp_base.
(vfp_base): Redefine as implied bit dependent on MVE or FP
(cortex-m55): Add flags to disable MVE, MVE FP, FP and DSP extensions.
* config/arm/arm.c (arm_configure_build_target): Add implied bits to ISA.
* config/arm/parsecpu.awk:
(gen_isa): Print implied bits and their dependencies to ISA header.
(gen_data): Add parsing for implied feature bits.
gcc/testsuite/ChangeLog:
* gcc.target/arm/cortex-m55-nodsp-flag-hard.c: New test.
* gcc.target/arm/cortex-m55-nodsp-flag-softfp.c: New test.
* gcc.target/arm/cortex-m55-nodsp-nofp-flag-softfp.c: New test.
* gcc.target/arm/cortex-m55-nofp-flag-hard.c: New test.
* gcc.target/arm/cortex-m55-nofp-flag-softfp.c: New test.
* gcc.target/arm/cortex-m55-nofp-nomve-flag-softfp.c: New test.
* gcc.target/arm/cortex-m55-nomve-flag-hard.c: New test.
* gcc.target/arm/cortex-m55-nomve-flag-softfp.c: New test.
* gcc.target/arm/cortex-m55-nomve.fp-flag-hard.c: New test.
* gcc.target/arm/cortex-m55-nomve.fp-flag-softfp.c: New test.
* gcc.target/arm/multilib.exp: Add tests for -mcpu=cortex-m55.
As the FIXME which this patch removes states, the current code does
not work when a call with multiple speculative targets gets resolved
through parameter tracking during inlining - it feeds the inliner an
edge it has already dealt with. The patch makes the code which should
prevent it aware of the possibility that that speculation can have
more than one target now.
gcc/ChangeLog:
2020-09-30 Martin Jambor <mjambor@suse.cz>
PR ipa/96394
* ipa-prop.c (update_indirect_edges_after_inlining): Do not add
resolved speculation edges to vector of new direct edges even in
presence of multiple speculative direct edges for a single call.
gcc/testsuite/ChangeLog:
2020-09-30 Martin Jambor <mjambor@suse.cz>
PR ipa/96394
* gcc.dg/tree-prof/pr96394.c: New test.
MIPS/libphobos: Fix switchcontext.S assembly for MIPS I ISA
Correct MIPS I assembly build errors in switchcontext.S:
.../libphobos/libdruntime/config/mips/switchcontext.S: Assembler messages:
.../libphobos/libdruntime/config/mips/switchcontext.S:50: Error: opcode not supported on this processor: mips1 (mips1) `sdc1 $f20,(0*8-((6*8+4+(-6*8+4&7))))($sp)'
etc., due to the use of the MIPS II LDC1 and SDC1 hardware instructions
for FP register load and store operations. Instead use the L.D and S.D
generic assembly instructions, which are strict aliases for the LDC1 and
SDC1 instructions respectively and produce identical machine code where
the assembly for the MIPS II or a higher ISA has been requested, however
they become assembly macros and expand to compatible sequences of LWC1
and SWC1 hardware instructions where the assembly for the MIPS I ISA is
in effect.
libphobos/
* libdruntime/config/mips/switchcontext.S [__mips_hard_float]:
Use L.D and S.D generic assembly instructions rather than LDC1
and SDC1 MIPS II hardware instructions.
Martin Liska [Tue, 13 Oct 2020 14:44:47 +0000 (16:44 +0200)]
IPA: fix profile handling in IRA
gcc/ChangeLog:
PR ipa/97295
* profile-count.c (profile_count::to_frequency): Move part of
gcc_assert to STATIC_ASSERT.
* regs.h (REG_FREQ_FROM_BB): Do not use count.to_frequency for
a function that does not have count_max initialized.
Patrick Palka [Thu, 8 Oct 2020 22:10:05 +0000 (18:10 -0400)]
libstdc++: Make ranges::construct_at constexpr-friendly [PR95788]
This rewrites ranges::construct_at in terms of std::construct_at so
that we can piggyback on the compiler's existing support for
intercepting placement new within std::construct_at during constexpr
evaluation, instead of having to additionally teach the compiler about
ranges::construct_at.
While we're making changes to ranges::construct_at, this patch also
declares it conditionally noexcept and qualifies the calls to declval in
its requires-clause.
libstdc++-v3/ChangeLog:
PR libstdc++/95788
* include/bits/ranges_uninitialized.h:
(__construct_at_fn::operator()): Rewrite in terms of
std::construct_at. Declare it conditionally noexcept. Qualify
calls to declval in its requires-clause.
* testsuite/20_util/specialized_algorithms/construct_at/95788.cc:
New test.
Patrick Palka [Thu, 8 Oct 2020 04:05:36 +0000 (00:05 -0400)]
c++: Set the constraints of a class type sooner [PR96229]
In the testcase below, during processing (at parse time) of Y's base
class X<Y>, convert_template_argument calls is_compatible_template_arg
to check if the template argument Y is no more constrained than the
parameter P. But at this point we haven't yet set Y's constraints, so
get_normalized_constraints_from_decl yields NULL_TREE as the normal form
and caches this result into the normalized_map.
We set Y's constraints later in cp_parser_class_specifier_1 but the
stale normal form in the normalized_map remains. This ultimately causes
us to miss the constraint failure for Y<Z> because according to the
cached normal form, Y is not constrained.
This patch fixes this issue by moving up the call to
associate_classtype_constraints so that we set constraints before we
start processing a class's bases.
gcc/cp/ChangeLog:
PR c++/96229
* parser.c (cp_parser_class_specifier_1): Move call to
associate_classtype_constraints from here to ...
(cp_parser_class_head): ... here.
* pt.c (is_compatible_template_arg): Correct documentation to
say "argument is _no_ more constrained than the parameter".
gcc/testsuite/ChangeLog:
PR c++/96229
* g++.dg/cpp2a/concepts-class2.C: New test.
Alex Coplan [Wed, 30 Sep 2020 08:02:47 +0000 (09:02 +0100)]
arm: Fix ICEs in no-literal-pool.c on MVE [PR97251]
This patch fixes ICEs when compiling
gcc/testsuite/gcc.target/arm/pure-code/no-literal-pool.c with
-mfp16-format=ieee -mfloat-abi=hard -march=armv8.1-m.main+mve
-mpure-code.
The existing conditions in the movsf/movdf expanders (as well as the
no_literal_pool patterns) were too restrictive, requiring
TARGET_HARD_FLOAT instead of TARGET_VFP_BASE, which caused unrecognised
insns when compiling this testcase with integer MVE and -mpure-code.
This patch fixes ICEs in gcc.dg/torture/float16-basic.c for
-march=armv8.1-m.main+mve -mfloat-abi=hard. The problem was
that an fp16 argument was (rightly) being passed in FPRs,
but the fp16 move patterns only handled GPRs. LRA then cycled
trying to look for a way of handling the FPR.
It looks like there are three related problems here:
(1) We're using the wrong fp16 move pattern for base MVE.
*mov<mode>_vfp_<mode>16 (the pattern we use for +mve.fp)
works for base MVE too.
(2) The fp16 MVE load and store patterns are separate from the
main move patterns. The loads and stores should instead be
alternatives of the main move patterns, so that LRA knows
what to do with pseudo registers that become stack slots.
(3) The range restrictions for the loads and stores were wrong
for fp16: we were enforcing a multiple of 4 in [-255*4, 255*4]
instead of a multiple of 2 in [-255*2, 255*2].
(2) came from a patch to prevent writeback being used for MVE.
That patch also added a Uj constraint to enforce the correct
memory types for MVE. I think the simplest fix is therefore to merge
the loads and stores back into the main pattern and extend the Uj
constraint so that it acts like Um for non-MVE.
The testcase for that patch was mve-vldstr16-no-writeback.c, whose
main function is:
void
fn1 (__fp16 *pSrc)
{
__fp16 high;
__fp16 *pDst = 0;
unsigned i;
for (i = 0;; i++)
if (pSrc[i])
pDst[i] = high;
}
Fixing (2) causes the store part to fail, not because we're using
writeback, but because we decide to use GPRs to store high (which is
uninitialised, and so gets replaced with zero). This patch therefore
adds some scan-assembler-nots instead. (I wondered about changing the
testcase to initialise high, but that seemed like a bad idea for
a regression test.)
For (3): MVE seems to be the only thing to use arm_coproc_mem_operand_wb
(and its various interfaces) for 16-bit scalars: the Neon patterns only
use it for 32-bit scalars.
I've added new tests to try the various FPR alternatives of the
move patterns. The range of offsets that GCC uses for FPR loads
and stores is the intersection of the range allowed for GPRs and
FPRs, so the tests include GPR<->memory tests as well.
The fp32 and fp64 tests already pass, they're just there for
completeness.
gcc/
* config/arm/arm-protos.h (arm_mve_mode_and_operands_type_check):
Delete.
* config/arm/arm.c (arm_coproc_mem_operand_wb): Use a scale factor
of 2 rather than 4 for 16-bit modes.
(arm_mve_mode_and_operands_type_check): Delete.
* config/arm/constraints.md (Uj): Allow writeback for Neon,
but continue to disallow it for MVE.
* config/arm/arm.md (*arm32_mov<HFBF:mode>): Add !TARGET_HAVE_MVE.
* config/arm/vfp.md (*mov_load_vfp_hf16, *mov_store_vfp_hf16): Fold
back into...
(*mov<mode>_vfp_<mode>16): ...here but use Uj for the FPR memory
constraints. Use for base MVE too.
gcc/testsuite/
* gcc.target/arm/mve/intrinsics/mve-vldstr16-no-writeback.c: Allow
the store to use GPRs instead of FPRs. Add scan-assembler-nots
for writeback.
* gcc.target/arm/armv8_1m-fp16-move-1.c: New test.
* gcc.target/arm/armv8_1m-fp32-move-1.c: Likewise.
* gcc.target/arm/armv8_1m-fp64-move-1.c: Likewise.
Kyrylo Tkachov [Fri, 9 Oct 2020 09:34:15 +0000 (10:34 +0100)]
PR target/97349 AArch64: Incorrect types for some Neon vdupq_n_<...> intrinsics
This patch fixes the PR by adjusting the input types of the intrinsic
prototypes to the ones mandated by ACLE
Turns out the tests in the testsuite were already using the correct
ones, but implicit conversions hid the bug...
Bootstrapped and tested on aarch64-none-linux-gnu.
Iain Buclaw [Sun, 11 Oct 2020 20:20:43 +0000 (22:20 +0200)]
d: Fix alias protection being ignored if used before declaration.
Fixes a symbol resolver bug where a private alias becomes public if used
before its declaration.
gcc/d/ChangeLog:
2020-10-12 Iain Buclaw <ibuclaw@gdcproject.org>
* dmd/declaration.c (AliasDeclaration::aliasSemantic): Apply storage
class and protection attributes.
gcc/testsuite/ChangeLog:
2020-10-12 Iain Buclaw <ibuclaw@gdcproject.org>
* gdc.test/fail_compilation/fail21001.d: New test.
* gdc.test/fail_compilation/imports/fail21001b.d: New test.
* gdc.test/fail_compilation/imports/issue21295ast_node.d: New test.
* gdc.test/fail_compilation/imports/issue21295astcodegen.d: New test.
* gdc.test/fail_compilation/imports/issue21295dtemplate.d: New test.
* gdc.test/fail_compilation/imports/issue21295visitor.d: New test.
* gdc.test/fail_compilation/issue21295.d: New test.
Richard Biener [Thu, 1 Oct 2020 07:29:32 +0000 (09:29 +0200)]
tree-optimization/97255 - missing vector bool pattern of SRAed bool
SRA tends to use VIEW_CONVERT_EXPR when replacing bool fields with
unsigned char fields. Those are not handled in vector bool pattern
detection causing vector true values to leak. The following fixes
this by turning those into b ? 1 : 0 as well.
2020-10-01 Richard Biener <rguenther@suse.de>
PR tree-optimization/97255
* tree-vect-patterns.c (vect_recog_bool_pattern): Also handle
VIEW_CONVERT_EXPR.
Patrick Palka [Thu, 8 Oct 2020 23:31:57 +0000 (19:31 -0400)]
c++: Distinguish alignof and __alignof__ in cp_tree_equal [PR97273]
cp_tree_equal currently considers alignof the same as __alignof__, but
these operators are semantically different ever since r8-7957. In the
testcase below, this causes the second static_assert to fail on targets
where alignof(double) != __alignof__(double) because the specialization
table (which uses cp_tree_equal as its equality predicate) conflates the
two dependent specializations integral_constant<__alignof__(T)> and
integral_constant<alignof(T)>.
This patch makes cp_tree_equal distinguish between these two operators
by inspecting the ALIGNOF_EXPR_STD_P flag.
Harald Anlauf [Sun, 4 Oct 2020 18:24:29 +0000 (20:24 +0200)]
PR fortran/97272 - Wrong answer from MAXLOC with character arg
The optional KIND argument to the MINLOC/MAXLOC intrinsic must not be
passed to the library function, as the kind conversion of the result
is treated explicitly elsewhere.
gcc/fortran/ChangeLog:
PR fortran/97272
* trans-intrinsic.c (strip_kind_from_actual): Helper function for
removal of KIND argument.
(gfc_conv_intrinsic_minmaxloc): Ignore KIND argument here, as it
is treated elsewhere.
gcc/testsuite/ChangeLog:
PR fortran/97272
* gfortran.dg/pr97272.f90: New test.
PR target/97150 AArch64: 2nd parameter of unsigned Neon scalar shift intrinsics should be signed
In this PR the second argument to the intrinsics should be signed but we
use an unsigned one erroneously.
The corresponding builtins are already using the correct types so it's
just a matter of correcting the signatures in arm_neon.h
PR target/96313 AArch64: vqmovun* return types should be unsigned
In this PR we have the wrong return type for some intrinsics. It should
be unsigned, but we implement it as signed.
Fix this by adjusting the type qualifiers used when creating the
builtins and fixing the type in the arm_neon.h intrinsic.
With the adjustment in qualifiers we now don't need to cast the result
when returning.
Bootstrapped and tested on aarch64-none-linux-gnu.
Alan Modra [Thu, 1 Oct 2020 09:44:09 +0000 (19:14 +0930)]
[RS6000] ICE in decompose, at rtl.h:2282
during RTL pass: fwprop1
gcc.dg/pr82596.c: In function 'test_cststring':
gcc.dg/pr82596.c:27:1: internal compiler error: in decompose, at rtl.h:2282
-m32 gcc/testsuite/gcc.dg/pr82596.c fails along with other tests after
applying rtx_cost patches, which exposed a backend bug.
legitimize_address when presented with the following address
(plus (reg) (const_int 0x7ffffffff))
attempts to rewrite it as a high/low sum. The low part is 0xffff, or
-1, making the high part 0x80000000. But this is no longer canonical
for SImode.
* config/rs6000/rs6000.c (rs6000_legitimize_address): Use
gen_int_mode for high part of address constant.
Jonathan Wakely [Wed, 7 Oct 2020 23:05:53 +0000 (00:05 +0100)]
libstdc++: Fix non-reserved names in headers
Fix some bad uses of "ForwardIterator" in <ranges>.
There's also a "il" parameter in a std::seed_seq constructor in <random>
which is only reserved since C++14.
libstdc++-v3/ChangeLog:
* include/bits/random.h (seed_seq(initializer_list<T>)): Rename
parameter to use reserved name.
* include/bits/ranges_algo.h (shift_left, shift_right): Rename
template parameters to use reserved name.
* testsuite/17_intro/names.cc: Check "il". Do not check "d" and
"y" in C++20 mode.
Martin Liska [Mon, 5 Oct 2020 16:03:08 +0000 (18:03 +0200)]
lto: fix LTO debug sections copying.
readelf -S prints:
There are 81999 section headers, starting at offset 0x1f488060:
Section Headers:
[Nr] Name Type Address Off Size ES Flg Lk Inf Al
[ 0] NULL 0000000000000000 000000 01404f 00 81998 0 0
[ 1] .group GROUP 0000000000000000 000040 000008 04 81995 105027 4
...
[81995] .symtab SYMTAB 0000000000000000d5d9298 2db310 18 81997 105026 8
[81996] .symtab_shndx SYMTAB SECTION INDICES 0000000000000000d8b45a8 079dd8 04 81995 0 4
[81997] .strtab STRTAB 0000000000000000d92e380 80460c 00 0 0 1
...
Looking at the documentation:
Table 7–15 ELF sh_link and sh_info Interpretation
sh_type - sh_link
SHT_SYMTAB - The section header index of the associated string table.
SHT_SYMTAB_SHNDX - The section header index of the associated symbol table.
As seen, sh_link of a SHT_SYMTAB always points to a .strtab and readelf
confirms that.
So we need to use reverse mapping taken from
[81996] .symtab_shndx SYMTAB SECTION INDICES 0000000000000000d8b45a8 079dd8 04 81995 0 4
where sh_link points to 81995.
libiberty/ChangeLog:
PR lto/97290
* simple-object-elf.c (simple_object_elf_copy_lto_debug_sections):
Use sh_link of a .symtab_shndx section.
[GCC-10 backport] arm: Move iterators from mve.md to iterators.md to maintain consistency.
To maintain consistency with other Arm Architectures backend, iterators and iterator attributes are moved
from mve.md file to iterators.md. Also move enumerators for MVE unspecs from mve.md file to unspecs.md file.
Joe Ramsay [Fri, 2 Oct 2020 14:28:29 +0000 (15:28 +0100)]
arm: Remove coercion from scalar argument to vmin & vmax intrinsics
This patch fixes an issue with vmin* and vmax* intrinsics which accept
a scalar argument. Previously when the scalar was of different width
to the vector elements this would generate __ARM_undef. This change
allows the scalar argument to be implicitly converted to the correct
width. Also tidied up the relevant unit tests, some of which would
have passed even if only one of two or three intrinsic calls had
compiled correctly.
Patrick Palka [Fri, 2 Oct 2020 14:51:31 +0000 (10:51 -0400)]
libstdc++: Add missing P0896 changes to <iterator>
I noticed that the following changes from this paper were not yet
implemented.
libstdc++-v3/ChangeLog:
* include/bits/stl_iterator.h (reverse_iterator::iter_move):
Define for C++20 as per P0896.
(reverse_iterator::iter_swap): Likewise.
(move_iterator::operator*): Apply P0896 changes for C++20.
(move_iterator::operator[]): Likewise.
* testsuite/24_iterators/reverse_iterator/cust.cc: New test.
Matthias Klose [Tue, 6 Oct 2020 11:41:37 +0000 (13:41 +0200)]
Backport fix for PR/tree-optimization/97236 - fix bad use of VMAT_CONTIGUOUS
This avoids using VMAT_CONTIGUOUS with single-element interleaving
when using V1mode vectors. Instead keep VMAT_ELEMENTWISE but
continue to avoid load-lanes and gathers.
2020-10-01 Richard Biener <rguenther@suse.de>
PR tree-optimization/97236
* tree-vect-stmts.c (get_group_load_store_type): Keep
VMAT_ELEMENTWISE for single-element vectors.
Tobias Burnus [Tue, 6 Oct 2020 09:49:34 +0000 (11:49 +0200)]
configure: Fix in-tree building of GMP on BSD [PR97302]
ChangeLog:
PR target/97302
* configure.ac: Only set with_gmp to /usr/local
if not building in tree.
* configure: Regenerate.
(cherry picked from commit c0d0a722da8583f74a0c192041be2f379cf487c1)
Andreas Krebbel [Wed, 12 Aug 2020 06:02:34 +0000 (08:02 +0200)]
IBM Z: Fix PR96456
The testcase failed because our backend refuses to generate vector
compare instructions for signaling operators with -fno-trapping-math
-fno-finite-math-only.
gcc/ChangeLog:
PR target/96456
* config/s390/s390.h (TARGET_NONSIGNALING_VECTOR_COMPARE_OK): New
macro.
* config/s390/vector.md (vcond_comparison_operator): Use new macro
for the check.
gcc/testsuite/ChangeLog:
PR target/96456
* gcc.target/s390/pr96456.c: New test.
Jakub Jelinek [Thu, 1 Oct 2020 09:18:35 +0000 (11:18 +0200)]
c++: Fix up default initialization with consteval default ctor [PR96994]
> > The following testcase is miscompiled (in particular the a and i
> > initialization). The problem is that build_special_member_call due to
> > the immediate constructors (but not evaluated in constant expression mode)
> > doesn't create a CALL_EXPR, but returns a TARGET_EXPR with CONSTRUCTOR
> > as the initializer for it,
>
> That seems like the bug; at the end of build_over_call, after you
>
> > call = cxx_constant_value (call, obj_arg);
>
> You need to build an INIT_EXPR if obj_arg isn't a dummy.
That works. obj_arg is NULL if it is a dummy from the earlier code.
2020-10-01 Jakub Jelinek <jakub@redhat.com>
PR c++/96994
* call.c (build_over_call): If obj_arg is non-NULL, return INIT_EXPR
setting obj_arg to call.
Jakub Jelinek [Thu, 1 Oct 2020 09:16:44 +0000 (11:16 +0200)]
c++: Handle std::construct_at on automatic vars during constant evaluation [PR97195]
As mentioned in the PR, we only support due to a bug in constant expressions
std::construct_at on non-automatic variables, because we VERIFY_CONSTANT the
second argument of placement new, which fails verification if it is an
address of an automatic variable.
The following patch fixes it by not performing that verification, the
placement new evaluation later on will verify it after it is dereferenced.
2020-10-01 Jakub Jelinek <jakub@redhat.com>
PR c++/97195
* constexpr.c (cxx_eval_call_expression): Don't VERIFY_CONSTANT the
second argument.
Jakub Jelinek [Sat, 26 Sep 2020 08:07:41 +0000 (10:07 +0200)]
powerpc, libcpp: Fix gcc build with clang on power8 [PR97163]
libcpp has two specialized altivec implementations of search_line_fast,
one for power8+ and the other one otherwise.
Both use __attribute__((altivec(vector))) and the GCC builtins rather than
altivec.h and the APIs from there, which is fine, but should be restricted
to when libcpp is built with GCC, so that it can be relied on.
The second elif is
and thus e.g. when built with clang it isn't picked, but the first one was
just guarded with
and so according to the bugreporter clang fails miserably on that.
The following patch fixes that by adding the same GCC_VERSION requirement
as the second version. I don't know where the 4.5 in there comes from and
the exact version doesn't matter that much, as long as it is above 4.2 that
clang pretends to be and smaller or equal to 4.8 as the oldest gcc we
support as bootstrap compiler ATM.
Furthermore, the patch fixes the comment, the version it is talking about is
not pre-GCC 5, but actually the GCC 5+ one.
2020-09-26 Jakub Jelinek <jakub@redhat.com>
PR bootstrap/97163
* lex.c (search_line_fast): Only use _ARCH_PWR8 Altivec version
for GCC >= 4.5.
Jakub Jelinek [Tue, 22 Sep 2020 19:06:32 +0000 (21:06 +0200)]
c++: Ignore __sanitizer_ptr_{sub,cmp} builtin calls during constant expression evaluation [PR97145]
These two builtin calls are added already during parsing before pointer
subtractions or comparisons, normally they perform runtime verification
of whether the pointers point to the same object or different objects,
but during constant expressione valuation we don't really need those
builtins for anything.
2020-09-22 Jakub Jelinek <jakub@redhat.com>
PR c++/97145
* constexpr.c (cxx_eval_builtin_function_call): Return void_node for
calls to __sanitize_ptr_{sub,cmp} builtins.
Kyrylo Tkachov [Fri, 2 Oct 2020 14:23:19 +0000 (15:23 +0100)]
AArch64: Add neoversev1_tunings struct
This patch adds a Neoverse V1-specific tuning struct that currently is
just a deduplication of the N1 struct it was using before and specifying
the SVE width.
This will allow us to tweak Neoverse V1 things in the future as needed.
Bootstrapped and tested on aarch64-none-linux-gnu.
gcc/
* config/aarch64/aarch64.c (neoversev1_tunings): Define.
* config/aarch64/aarch64-cores.def (zeus): Use it.
(neoverse-v1): Likewise.
Before the change gcc did not stream correctly TOPN counters
if counters belonged to a non-local shared object.
As a result zero-section optimization generated TOPN sections
in a form not recognizable by '__gcov_merge_topn'.
The problem happens because in a case of multiple shared objects
'__gcov_merge_topn' function is present in address space multiple
times (once per each object).
The fix is to never rely on function address and predicate on TOPN
counter types.
libgcc/ChangeLog:
PR gcov-profile/96913
* libgcov-driver.c (write_one_data): Avoid function pointer
comparison in TOP streaming decision.
Martin Liska [Thu, 24 Sep 2020 11:34:13 +0000 (13:34 +0200)]
switch conversion: make a rapid speed up
gcc/ChangeLog:
PR tree-optimization/96979
* tree-switch-conversion.c (jump_table_cluster::can_be_handled):
Make a fast bail out.
(bit_test_cluster::can_be_handled): Likewise here.
* tree-switch-conversion.h (get_range): Use wi::to_wide instead
of a folding.
gcc/testsuite/ChangeLog:
PR tree-optimization/96979
* g++.dg/tree-ssa/pr96979.C: New test.
config/i386/t-rtems: Change from mtune to march for multilibs
* config/i386/t-rtems: Change from mtune to march when building
multilibs. The mtune argument tunes or optimizes for a specific
CPU model but does not ensure the generated code is appropriate
for the CPU model. Prior to this patch, i386 compatible code
was always generated but tuned for later models.
Jakub Jelinek [Thu, 1 Oct 2020 09:04:56 +0000 (11:04 +0200)]
s390: Fix up s390_atomic_assign_expand_fenv
The following patch fixes
-FAIL: gcc.dg/pr94780.c (internal compiler error)
-FAIL: gcc.dg/pr94780.c (test for excess errors)
-FAIL: gcc.dg/pr94842.c (internal compiler error)
-FAIL: gcc.dg/pr94842.c (test for excess errors)
on s390x-linux. The fix is essentially the same as has been applied to many
other targets (i386, aarch64, arm, rs6000, alpha, riscv).
2020-10-01 Jakub Jelinek <jakub@redhat.com>
* config/s390/s390.c (s390_atomic_assign_expand_fenv): Use
TARGET_EXPR instead of MODIFY_EXPR for the first assignments to
fenv_var and old_fpc. Formatting fixes.
Joel Hutton [Wed, 30 Sep 2020 15:20:55 +0000 (16:20 +0100)]
[SLP][VECT] Add check to fix 96827
The following patch adds a simple check to prevent slp stmts from
vector constructors being rearranged. vect_attempt_slp_rearrange_stmts
tries to rearrange to avoid a load permutation.
This fixes PR target/96827
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96827
gcc/ChangeLog:
2020-09-29 Joel Hutton <joel.hutton@arm.com>
PR target/96827
* tree-vect-slp.c (vect_analyze_slp): Do not call
vect_attempt_slp_rearrange_stmts for vector constructors.
gcc/testsuite/ChangeLog:
2020-09-29 Joel Hutton <joel.hutton@arm.com>
PR target/96827
* gcc.dg/vect/bb-slp-49.c: New test.
Jim Wilson [Wed, 30 Sep 2020 20:06:28 +0000 (13:06 -0700)]
Fix build failure with zstd versio9n 1.2.0 or older.
Extends the configure check for zstd.h to also verify the zstd version,
since gcc requires features that only exist in 1.3.0 and newer. Without
this patch we get a build error for lto-compress.c when using an old zstd
version.
Backported from master:
2020-09-29 Jim Wilson <jimw@sifive.com>
The Linux kernel has defined the cpuinfo string for the +rng feature, so
this patch adds that to GCC so that -march=native can pick it up.
Bootstrapped and tested on aarch64-none-linux-gnu.
H.J. Lu [Wed, 23 Sep 2020 19:11:45 +0000 (12:11 -0700)]
x86: Use SET operation in MOVDIRI and MOVDIR64B
Since MOVDIRI and MOVDIR64B write to memory, similar to UNSPEC_MOVNT,
use SET operation in MOVDIRI and MOVDIR64B patterns with UNSPEC instead
of UNSPECV.
gcc/
PR target/97184
* config/i386/i386.md (UNSPECV_MOVDIRI): Renamed to ...
(UNSPEC_MOVDIRI): This.
(UNSPECV_MOVDIR64B): Renamed to ...
(UNSPEC_MOVDIR64B): This.
(movdiri<mode>): Use SET operation.
(@movdir64b_<mode>): Likewise.
ira: Fix elimination for global hard FPs [PR97054]
If the hard frame pointer is being used as a global register,
we should skip the usual handling for eliminations. As the
comment says, the register cannot in that case be eliminated
(or eliminated to) and is already marked live where appropriate.
Doing this removes the duplicate error for gcc.target/i386/pr82673.c.
The “cannot be used in 'asm' here” message is meant to be for asm
statements rather than register asms, and the function that the
error is reported against doesn't use asm.
gcc/
2020-09-18 Richard Sandiford <richard.sandiford@arm.com>
PR middle-end/97054
* ira.c (ira_setup_eliminable_regset): Skip the special elimination
handling of the hard frame pointer if the hard frame pointer is fixed.
gcc/testsuite/
2020-09-18 H.J. Lu <hjl.tools@gmail.com>
Richard Sandiford <richard.sandiford@arm.com>
PR middle-end/97054
* g++.target/i386/pr97054.C: New test.
* gcc.target/i386/pr82673.c: Remove redundant extra message.
Here, operands[1] is the address of the canary (&__stack_chk_guard)
and operands[2] is the register that we want to move that address into.
However, the code above instead sets operands[2] to the address of a
constant pool entry that contains &__stack_chk_guard, rather than to
&__stack_chk_guard itself. The sequence therefore does one less
pointer indirection than it should.
The net effect was to use &__stack_chk_guard for stack-smash detection,
instead of using __stack_chk_guard itself.
gcc/
* config/arm/arm.md (*stack_protect_combined_set_insn): For non-PIC,
load the address of the canary rather than the address of the
constant pool entry that points to it.
(*stack_protect_combined_test_insn): Likewise.
gcc/testsuite/
* gcc.target/arm/stack-protector-3.c: New test.
* gcc.target/arm/stack-protector-4.c: Likewise.
aarch64: Prevent canary address being spilled to stack
This patch fixes the equivalent of arm bug PR85434/CVE-2018-12886
for aarch64: under high register pressure, the -fstack-protector
code might spill the address of the canary onto the stack and
reload it at the test site, giving an attacker the opportunity
to change the expected canary value.
This would happen in two cases:
- when generating PIC for -mstack-protector-guard=global
(tested by stack-protector-6.c). This is a direct analogue
of PR85434, which was also about PIC for the global case.
- when using -mstack-protector-guard=sysreg.
The two problems were really separate bugs and caused by separate code,
but it was more convenient to fix them together.
The post-patch code still spills _GLOBAL_OFFSET_TABLE_ for
stack-protector-6.c, which is a more general problem. However,
it no longer spills the canary address itself.
The patch also fixes an ICE when using -mstack-protector-guard=sysreg
with ILP32: even if the register read is SImode, the address
calculation itself should still be DImode.
gcc/
* config/aarch64/aarch64-protos.h (aarch64_salt_type): New enum.
(aarch64_stack_protect_canary_mem): Declare.
* config/aarch64/aarch64.md (UNSPEC_SALT_ADDR): New unspec.
(stack_protect_set): Forward to stack_protect_combined_set.
(stack_protect_combined_set): New pattern. Use
aarch64_stack_protect_canary_mem.
(reg_stack_protect_address_<mode>): Add a salt operand.
(stack_protect_test): Forward to stack_protect_combined_test.
(stack_protect_combined_test): New pattern. Use
aarch64_stack_protect_canary_mem.
* config/aarch64/aarch64.c (strip_salt): New function.
(strip_offset_and_salt): Likewise.
(tls_symbolic_operand_type): Use strip_offset_and_salt.
(aarch64_stack_protect_canary_mem): New function.
(aarch64_cannot_force_const_mem): Use strip_offset_and_salt.
(aarch64_classify_address): Likewise.
(aarch64_symbolic_address_p): Likewise.
(aarch64_print_operand): Likewise.
(aarch64_output_addr_const_extra): New function.
(aarch64_tls_symbol_p): Use strip_salt.
(aarch64_classify_symbol): Likewise.
(aarch64_legitimate_pic_operand_p): Use strip_offset_and_salt.
(aarch64_legitimate_constant_p): Likewise.
(aarch64_mov_operand_p): Use strip_salt.
(TARGET_ASM_OUTPUT_ADDR_CONST_EXTRA): Override.
aarch64: Tweaks to the handling of fixed-length SVE types
This patch is really four things rolled into one, since separating
them seemed artificial:
- Update the mangling of the fixed-length SVE ACLE types to match
the upcoming spec. The idea is to mangle:
VLAT __attribute__((arm_sve_vector_bits(N)))
as an instance __SVE_VLS<VLAT, N> of the template:
__SVE_VLS<typename, unsigned>
- Give the fixed-length types their own TYPE_DECL. This is needed
to make the above mangling fix work, but should also be a minor
QoI improvement for error reporting. Unfortunately, the names are
quite verbose, e.g.:
Previously we would complain that the type isn't an SVE type;
now we complain that it isn't a vector type.
- Don't allow arm_sve_vector_bits(N) to be applied to existing
fixed-length SVE types.
gcc/
* config/aarch64/aarch64-sve-builtins.cc (add_sve_type_attribute):
Take the ACLE name of the type as a parameter and add it as fourth
argument to the "SVE type" attribute.
(register_builtin_types): Update call accordingly.
(register_tuple_type): Likewise. Construct the name of the type
earlier in order to do this.
(get_arm_sve_vector_bits_attributes): New function.
(handle_arm_sve_vector_bits_attribute): Report a more sensible
error message if the attribute is applied to an SVE tuple type.
Don't allow the attribute to be applied to an existing fixed-length
SVE type. Mangle the new type as __SVE_VLS<type, vector-bits>.
Add a dummy TYPE_DECL to the new type.
gcc/testsuite/
* g++.target/aarch64/sve/acle/general-c++/attributes_2.C: New test.
* g++.target/aarch64/sve/acle/general-c++/mangle_6.C: Likewise.
* g++.target/aarch64/sve/acle/general-c++/mangle_7.C: Likewise.
* g++.target/aarch64/sve/acle/general-c++/mangle_8.C: Likewise.
* g++.target/aarch64/sve/acle/general-c++/mangle_9.C: Likewise.
* g++.target/aarch64/sve/acle/general-c++/mangle_10.C: Likewise.
* gcc.target/aarch64/sve/acle/general/attributes_7.c: Check the
error messages reported when arm_sve_vector_bits is applied to
SVE tuple types or to existing fixed-length SVE types.
aarch64: Update the mangling of single SVE vectors and predicates
GCC was implementing an old mangling scheme for single SVE
vectors and predicates (based on the Advanced SIMD one).
The final definition instead put them in the vendor built-in
namespace via the "u" prefix.
gcc/
* config/aarch64/aarch64-sve-builtins.cc (DEF_SVE_TYPE): Add a
leading "u" to each mangled name.
gcc/testsuite/
* g++.target/aarch64/sve/acle/general-c++/mangle_1.C: Add a leading
"u" to the mangling of each SVE vector and predicate type.
* g++.target/aarch64/sve/acle/general-c++/mangle_2.C: Likewise.
* g++.target/aarch64/sve/acle/general-c++/mangle_3.C: Likewise.
* g++.target/aarch64/sve/acle/general-c++/mangle_5.C: Likewise.