2020-10-29 Andrea Corallo <andrea.corallo@arm.com>
* gcc.target/aarch64/advsimd-intrinsics/vld2_lane_bf16_indices_1.c:
Run it also for the arm backend.
* gcc.target/aarch64/advsimd-intrinsics/vld2q_lane_bf16_indices_1.c:
Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vld3_lane_bf16_indices_1.c:
Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vld3q_lane_bf16_indices_1.c:
Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vld4_lane_bf16_indices_1.c:
Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vld4q_lane_bf16_indices_1.c:
Likewise.
* gcc.target/arm/simd/vldn_lane_bf16_1.c: New test.
Ed Schonberg [Thu, 10 Dec 2020 21:26:57 +0000 (22:26 +0100)]
Fix PR ada/98230
It's a rather curious malfunction of the 'Mod attribute applied to the
variable of a loop whose upper bound is dynamic.
gcc/ada/ChangeLog:
PR ada/98230
* exp_attr.adb (Expand_N_Attribute_Reference, case Mod): Use base
type of argument to obtain static bound and required size.
gcc/testsuite/ChangeLog:
* gnat.dg/modular6.adb: New test.
Patrick Palka [Thu, 30 Jul 2020 02:06:44 +0000 (22:06 -0400)]
c++: overload sets and placeholder return type [PR64194]
In the testcase below, template argument deduction for the call
g(id<int>) goes wrong because the functions in the overload set id<int>
each have a yet-undeduced auto return type, and this undeduced return
type makes try_one_overload fail to match up any of the overloads with
g's parameter type, leading to g's template argument going undeduced and
to the overload set going unresolved.
This patch fixes this issue by performing return type deduction via
instantiation before doing try_one_overload, in a manner similar to what
resolve_address_of_overloaded_function does.
gcc/cp/ChangeLog:
PR c++/64194
* pt.c (resolve_overloaded_unification): If the function
template specialization has a placeholder return type,
then instantiate it before attempting unification.
gcc/testsuite/ChangeLog:
PR c++/64194
* g++.dg/cpp1y/auto-fn60.C: New test.
Kewen Lin [Wed, 19 Aug 2020 02:37:39 +0000 (21:37 -0500)]
options: Make --help= see overridden values
Options "-Q --help=params" don't show the final values after
target option overriding, instead it emits the default values
in params.opt (without any explicit param settings).
This patch makes it see overridden values.
gcc/ChangeLog:
* opts-global.c (decode_options): Call target_option_override_hook
before it prints for --help=*.
Jason Merrill [Wed, 25 Nov 2020 22:05:24 +0000 (17:05 -0500)]
c++: Fix deduction from auto template parameter [PR93083]
The check in do_class_deduction to handle passing one class placeholder
template parm as an argument for itself needed to be extended to also handle
equivalent parms from other templates.
Eric Botcazou [Mon, 7 Dec 2020 09:48:06 +0000 (10:48 +0100)]
Fix internal error on library-level type extended locally
The compiler aborts on the local extension of a tagged type declared
at library level, with a progenitor given by an interface type having
a primitive that is a homograph of a primitive of the tagged type.
gcc/ada/ChangeLog:
* gcc-interface/trans.c (maybe_make_gnu_thunk): Return false if the
target is local and thunk and target do not have the same context.
doc/implement-c.texi: About same-as-scalar-type volatile aggregate accesses, PR94600
We say very little about reads and writes to aggregate /
compound objects, just scalar objects (i.e. assignments don't
cause reads). Let's lets say something safe about aggregate
objects, but only for those that are the same size as a scalar
type.
There's an equal-sounding section (Volatiles) in extend.texi,
but this seems a more appropriate place, as specifying the
behavior of a standard qualifier.
gcc:
2020-12-04 Hans-Peter Nilsson <hp@axis.com>
Martin Sebor <msebor@redhat.com>
PR middle-end/94600
* doc/implement-c.texi (Qualifiers implementation): Add blurb
about access to the whole of a volatile aggregate object, only for
same-size as a scalar object.
Eric Botcazou [Fri, 4 Dec 2020 09:04:56 +0000 (10:04 +0100)]
Fix checking failure in IPA-SRA
This is a regression present on the mainline and 10 branch: on the one
hand, IPA-SRA does *not* disqualify accesses with zero size but, on the
other hand, it checks that accesses present in the tree have a (strictly)
positive size, thus trivially yielding an ICE in some cases.
gcc/ChangeLog:
* ipa-sra.c (verify_access_tree_1): Relax assertion on the size.
gcc/testsuite/ChangeLog:
* gnat.dg/opt91.ads, gnat.dg/opt91.adb: New test.
* gnat.dg/opt91_pkg.ads, gnat.dg/opt91_pkg.adb: New helper.
Uros Bizjak [Thu, 3 Dec 2020 16:49:42 +0000 (17:49 +0100)]
i386: Fix up ix86_md_asm_adjust for TImode [PR98086]
ix86_md_asm_adjust assumes that dest_mode can be only [QHSD]Imode
and nothing else. The patch rewrites zero-extension part to use
convert_to_mode to handle TImode and hypothetically even wider modes.
2020-12-03 Uroš Bizjak <ubizjak@gmail.com>
Jakub Jelinek <jakub@redhat.com>
gcc/
PR target/98086
* config/i386/i386.c (ix86_md_asm_adjustmd): Rewrite
zero-extension part to use convert_to_mode.
gcc/testsuite/
PR target/98086
* gcc.target/i386/pr98086.c: New test.
expr: Fix REDUCE_BIT_FIELD for constants [PR95694, PR96151]
This is yet another PR caused by constant integer rtxes not storing
a mode. We were calling REDUCE_BIT_FIELD on a constant integer that
didn't fit in poly_int64, and then tripped the as_a<scalar_int_mode>
assert on VOIDmode.
AFAICT REDUCE_BIT_FIELD is always passed rtxes that have TYPE_MODE
(rather than some other mode) and it just fills in the redundant
sign bits of that TYPE_MODE value. So it should be safe to get
the mode from the type instead of the rtx. The patch does that
and asserts that the modes agree, where information is available.
That on its own is enough to fix the bug, but we might as well
extend the folding case to all constant integers, not just those
that fit poly_int64.
gcc/
PR middle-end/95694
* expr.c (expand_expr_real_2): Get the mode from the type rather
than the rtx, and assert that it is consistent with the mode of
the rtx (where known). Optimize all constant integers, not just
those that can be represented in poly_int64.
gcc/testsuite/
PR middle-end/95694
* gcc.dg/pr95694.c: New test.
where the VQ coefficient is unsigned but is effectively acting
as a negative number. We wrongly give the POLY_INT_CST the range:
[9, INT_MAX]
and things go downhill from there: later iterations of the unrolled
epilogue are wrongly removed as dead.
I guess this is the final nail in the coffin for doing VRP on
POLY_INT_CSTs. For other similarly exotic testcases we could have
overflow for any coefficient, not just those that could be treated
as contextually negative.
Testing TYPE_OVERFLOW_UNDEFINED doesn't seem like an option because we
couldn't handle warn_strict_overflow properly. At this stage we're
just recording a range that might or might not lead to strict-overflow
assumptions later.
It still feels like we should be able to do something here, but for
now removing the code seems safest. It's also telling that there
are no testsuite failures on SVE from doing this.
In r11-2922, Przemek fixed a post-RA instruction match failure
caused by the SVE FP subtraction patterns.. This patch applies
the same fix to the other patterns.
To recap, the issue is around the handling of predication.
We want to do two things:
- Optimise cases in which a predicate is known to be all-true.
- Differentiate cases in which the predicate on an _x ACLE function has
to be kept as-is from cases in which we can make more lanes active.
The former is true by default, the latter is true for certain
combinations of flags in the -ffast-math group.
This is handled by a boolean flag in the unspecs to say whether the
predicate is “strict” or “relaxed”. When combining multiple strict
operations, the predicates used in the operations generally need to
match. When combining multiple relaxed operations, we can ignore the
predicates on nested operations and just use the predicate on the
“outermost” operation.
Originally I'd tried to reduce combinatorial explosion by using
aarch64_sve_pred_dominates_p. This required matching predicates
for strict operations but allowed more combinations for relaxed
operations.
The problem (as I should have remembered) is that C conditions on
insn patterns can't reliably enforce matching operands. If the
same register is used in two different input operands, the RA is
allowed to use different hard registers for those input operands
(and sometimes it has to). So operands that match before RA
might not match afterwards. The only sure way to force a match
is via match_dup.
This patch splits the cases into two. I cry bitter tears at having
to do this, but I think it's the only backportable fix. There might
be some way of using define_subst to generate the cond_* patterns from
the pred_* patterns, with some alternatives strategically disabled in
each case, but that's future work and might not be an improvement.
Since so many patterns now do this, I moved the comments from the
subtraction pattern to a new banner comment at the head of the file.
aarch64: Avoid false dependencies for SVE unary operations
For calls like:
z0 = svabs_s8_x (p0, z1)
we previously generated:
abs z0.b, p0/m, z1.b
However, this creates a false dependency on z0 (the merge input).
This can lead to strange results in some cases, e.g. serialising
the operation behind arbitrary earlier operations, or preventing
two iterations of a loop from being executed in parallel.
This patch therefore ties the input to the output, using a MOVPRFX
if necessary and possible. (The SVE2 unary long instructions do
not support MOVPRFX.)
When testing the patch, I hit a bug in the big-endian SVE move
optimisation in aarch64_maybe_expand_sve_subreg_move. I don't
have an indepenedent testcase for it, so I didn't split it out
into a separate patch.
gcc/
* config/aarch64/aarch64.c (aarch64_maybe_expand_sve_subreg_move):
Do not optimize LRA subregs.
* config/aarch64/aarch64-sve.md
(@aarch64_pred_<SVE_INT_UNARY:optab><mode>): Tie the input to the
output.
(@aarch64_sve_revbhw_<SVE_ALL:mode><PRED_HSD:mode>): Likewise.
(*<ANY_EXTEND:optab><SVE_PARTIAL_I:mode><SVE_HSDI:mode>2): Likewise.
(@aarch64_pred_sxt<SVE_FULL_HSDI:mode><SVE_PARTIAL_I:mode>): Likewise.
(*cnot<mode>): Likewise.
(@aarch64_pred_<SVE_COND_FP_UNARY:optab><mode>): Likewise.
(@aarch64_sve_<optab>_nontrunc<SVE_FULL_F:mode><SVE_FULL_HSDI:mode>):
Likewise.
(@aarch64_sve_<optab>_trunc<VNx2DF_ONLY:mode><VNx4SI_ONLY:mode>):
Likewise.
(@aarch64_sve_<optab>_nonextend<SVE_FULL_HSDI:mode><SVE_FULL_F:mode>):
Likewise.
(@aarch64_sve_<optab>_extend<VNx4SI_ONLY:mode><VNx2DF_ONLY:mode>):
Likewise.
(@aarch64_sve_<optab>_trunc<SVE_FULL_SDF:mode><SVE_FULL_HSF:mode>):
Likewise.
(@aarch64_sve_<optab>_trunc<VNx4SF_ONLY:mode><VNx8BF_ONLY:mode>):
Likewise.
(@aarch64_sve_<optab>_nontrunc<SVE_FULL_HSF:mode><SVE_FULL_SDF:mode>):
Likewise.
* config/aarch64/aarch64-sve2.md
(@aarch64_pred_<SVE2_COND_FP_UNARY_LONG:sve_fp_op><mode>): Likewise.
(@aarch64_pred_<SVE2_COND_FP_UNARY_NARROWB:sve_fp_op><mode>): Likewise.
(@aarch64_pred_<SVE2_U32_UNARY:sve_int_op><mode>): Likewise.
(@aarch64_pred_<SVE2_COND_INT_UNARY_FP:sve_fp_op><mode>): Likewise.
dse: Cope with bigger-than-integer modes [PR98037]
dse.c:find_shift_sequence tries to represent a store and load
back as a shift right followed by a truncation. It therefore
needs to find an integer mode in which to do the shift right.
The loop it uses has the form:
which implicitly assumes that read_mode has an equivalent integer mode.
As shown in the testcase, not all modes have such an integer mode.
This patch just makes the code start from the smallest integer mode and
skip modes that are too small. The loop already breaks at the first
mode wider than word_mode.
gcc/
PR rtl-optimization/98037
* dse.c (find_shift_sequence): Iterate over all integers and
skip modes that are too small.
gcc/testsuite/
PR rtl-optimization/98037
* gcc.target/aarch64/sve/acle/general/pr98037.c: New test.
Richard Biener [Wed, 26 Aug 2020 13:12:17 +0000 (15:12 +0200)]
tree-optimization/96698 - fix ICE when vectorizing nested cycles
This fixes vectorized PHI latch edge updating and delay it until
all of the loop is code generated to deal with the case that the
latch def is a PHI in the same block.
2020-08-26 Richard Biener <rguenther@suse.de>
PR tree-optimization/96698
* tree-vectorizer.h (loop_vec_info::reduc_latch_defs): New.
(loop_vec_info::reduc_latch_slp_defs): Likewise.
* tree-vect-stmts.c (vect_transform_stmt): Only record
stmts to update PHI latches from, perform the update ...
* tree-vect-loop.c (vect_transform_loop): ... here after
vectorizing those PHIs.
(info_for_reduction): Properly handle non-reduction PHIs.
Patrick Palka [Mon, 12 Oct 2020 17:46:21 +0000 (13:46 -0400)]
libstdc++: Apply proposed resolution for LWG 3449 [PR95322]
Now that the frontend bug PR96805 is fixed, we can cleanly apply the
proposed resolution for this issue.
This slightly deviates from the proposed resolution by declaring _CI a
member of take_view instead of take_view::_Sentinel, since it doesn't
depend on anything within _Sentinel anymore.
libstdc++-v3/ChangeLog:
PR libstdc++/95322
* include/std/ranges (take_view::_CI): Define this alias
template as per LWG 3449 and remove ...
(take_view::_Sentinel::_CI): ... this type alias.
(take_view::_Sentinel::operator==): Adjust use of _CI
accordingly. Define a second overload that accepts an iterator
of the opposite constness as per LWG 3449.
(take_while_view::_Sentinel::operator==): Likewise.
* testsuite/std/ranges/adaptors/95322.cc: Add tests for LWG 3449.
Richard Biener [Mon, 9 Nov 2020 14:19:56 +0000 (15:19 +0100)]
tree-optimization/97760 - reduction paths with unhandled live stmt
This makes sure we reject reduction paths with a live stmt that
is not the last one altering the value. This is because we do not
handle this in the epilogue unless there's a scalar epilogue loop.
2020-11-09 Richard Biener <rguenther@suse.de>
PR tree-optimization/97760
* tree-vect-loop.c (check_reduction_path): Reject
reduction paths we do not handle in epilogue generation.
Richard Biener [Mon, 26 Oct 2020 09:08:38 +0000 (10:08 +0100)]
tree-optimization/97539 - reset out-of-loop debug uses before peeling
This makes sure to reset out-of-loop debug uses before vectorizer
loop peeling as we cannot make sure to retain the use-def dominance
relationship when there are no LC SSA nodes.
Richard Biener [Thu, 26 Nov 2020 09:07:06 +0000 (10:07 +0100)]
testsuite/98002 - fix gcc.dg/strncmp-2.c
This makes sure not to free() memory we have mprotected to PROT_NONE
by calling mprotect again with PROT_READ|PROT_WRITE. This avoids
crashing the allocator when in debug mode.
2020-11-16 Richard Biener <rguenther@suse.de>
PR testsuite/98002
* gcc.dg/strncmp-2.c: Call mprotect again before free.
[Obvious] arm: Fix test from failing on some targets [PR91816]
This recently submitted test was found to fail on some Cortex-M
targets. This was because codegen on these CPUs would emit a ldr
instead of a movw/movt pair, resulting in an overall smaller test
(i.e. the branch wasn't as far) and the behaviour being tested
for not being triggered.
This commit doubles the size of the test to account for this.
Eric Botcazou [Sat, 28 Nov 2020 11:54:48 +0000 (12:54 +0100)]
Fix PR target/97939
The little dance around 4096 that add/sub instructions do on the SPARC
needs to be taken into account for the overflow arithmetic operations.
It cannot be done for unsigned overflow, but it can be done for signed
overflow.
gcc/ChangeLog:
PR target/97939
* config/sparc/predicates.md (arith_double_add_operand): Comment.
* config/sparc/sparc.md (uaddvdi4): Use arith_double_operand.
(addvdi4): Use arith_double_add_operand.
(addsi3): Remove useless attributes.
(addvsi4): Use arith_add_operand.
(*cmp_ccv_plus): Likewise and add second alternative accordingly.
(*cmp_ccxv_plus): Likewise.
(*cmp_ccv_plus_set): Likewise.
(*cmp_ccxv_plus_set): Likewise.
(*cmp_ccv_plus_sltu_set): Likewise.
(usubvdi4): Use arith_double_operand.
(subvdi4): Use arith_double_add_operand.
(subsi3): Remove useless attributes.
(subvsi4): Use arith_add_operand.
(*cmp_ccv_minus): Likewise and add second alternative accordingly.
(*cmp_ccxv_minus): Likewise.
(*cmp_ccv_minus_set): Likewise.
(*cmp_ccxv_minus_set): Likewise.
(*cmp_ccv_minus_sltu_set): Likewise.
(negsi2): Use register_operand.
(unegvsi3): Likewise.
(negvsi3) Likewise.
(*cmp_ccnz_neg): Likewise.
(*cmp_ccxnz_neg): Likewise.
(*cmp_ccnz_neg_set): Likewise.
(*cmp_ccxnz_neg_set): Likewise.
(*cmp_ccc_neg_set): Likewise.
(*cmp_ccxc_neg_set): Likewise.
(*cmp_ccc_neg_sltu_set): Likewise.
(*cmp_ccv_neg): Likewise.
(*cmp_ccxv_neg): Likewise.
(*cmp_ccv_neg_set): Likewise.
(*cmp_ccxv_neg_set): Likewise.
(*cmp_ccv_neg_sltu_set): Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/sparc/overflow-6.c: New test.
Eric Botcazou [Thu, 26 Nov 2020 15:38:35 +0000 (16:38 +0100)]
Fix PR target/96607
After 15 years trying to find out what can go into the delay slot of
the call to __tls_get_addr with the Solaris linker, it's now time to
concede defeat and consider it as not to be filled.
gcc/ChangeLog:
PR target/96607
* config/sparc/sparc-protos.h (eligible_for_call_delay): Delete.
* config/sparc/sparc.c (eligible_for_call_delay): Likewise.
* config/sparc/sparc.md (in_call_delay): Likewise.
(tls_delay_slot): New attribute.
(define_delay [call]): Use in_branch_delay.
(tgd_call<P:mode>): Set type to call_no_delay_slot when
tls_delay_slot is false.
(tldm_call<P:mode>): Likewise.
Kyrylo Tkachov [Fri, 27 Nov 2020 09:19:33 +0000 (09:19 +0000)]
aarch64: Introduce --param=aarch64-autovec-preference to select autovec preference in backend
This is a patch that introduces the aarch64-autovec-preference that can
take values from 0 - 4, 0 being the default.
It can be used to override the autovectorisation preferences in the
backend:
0 - use default scheme
1 - only use Advanced SIMD
2 - only use SVE
3 - use Advanced SIMD and SVE, prefer Advanced SIMD in the event of a
tie (as determined by costs)
4 - use Advanced SIMD and SVE, prefer SVE in the event of a tie (as
determined by costs)
It can valuable for experimentation when comparing SVE and Advanced SIMD
autovectorisation strategies.
It achieves this adjusting the order of the interleaved SVE and Advanced
SIMD modes in aarch64_autovectorize_vector_modes.
It also adjusts aarch64_preferred_simd_mode to use the new comparison
function to pick Advanced SIMD or SVE to start with.
Bootstrapped and tested on aarch64-none-linux-gnu.
gcc/
* config/aarch64/aarch64.opt
(-param=aarch64-autovec-preference): Define.
* config/aarch64/aarch64.c (aarch64_override_options_internal):
Set aarch64_sve_compare_costs to 0 when preferring only Advanced
SIMD.
(aarch64_cmp_autovec_modes): Define.
(aarch64_preferred_simd_mode): Adjust to use the above.
(aarch64_autovectorize_vector_modes): Likewise.
* doc/invoke.texi: Document aarch64-autovec-preference param.
Thomas Schwinge [Wed, 25 Nov 2020 19:36:55 +0000 (20:36 +0100)]
Don't create location wrapper nodes within OpenACC clauses
This fixes a GCC 11, 10, 9 regression introduced by commit dfd7fdca2ac17d8b823a16700525824ca312ade0 (Subversion r267272) "C++: more
location wrapper nodes (PR c++/43064, PR c++/43486)". But: this isn't
intending to blame David, because back then, the problem hasn't been visible in
the testsuite (or else I'm sure would've been addressed right away) because of
our all dear friend: missing testsuite coverage. Thus, for GCC 8, I'm likewise
enhancing the testsuite, without the C++ front end code changes.
I actually had presumed that there may be an issue for OpenACC:
<http://mid.mail-archive.com/874lb9qr2u.fsf@euler.schwinge.homeip.net>, so here
we are, two years (and many "wasted" hours...) later...
Jonathan Wakely [Wed, 25 Nov 2020 17:18:44 +0000 (17:18 +0000)]
libstdc++: Fix missing subsumption in std::iterator_traits [PR 97935]
libstdc++-v3/ChangeLog:
PR libstdc++/97935
* include/bits/iterator_concepts.h (__detail::__iter_without_category):
New helper concept.
(__iterator_traits::__cat): Use __detail::__iter_without_category.
* testsuite/24_iterators/associated_types/iterator.traits.cc: New test.
Jonathan Wakely [Thu, 4 Jun 2020 22:20:49 +0000 (23:20 +0100)]
libstdc++: Remove workarounds for constrained nested class templates
With PR c++/92078 and PR c++/92103 both fixed, nested class templates
can now be constrained. That means a number of namespace-scope helpers
can be moved to the class scope, so they're only visible where they're
needed.
* include/bits/iterator_concepts.h (__detail::__ptr, __detail::__ref)
(__detail::__cat, __detail::__diff): Move to class scope in the
relevant __iterator_traits specializations.
(__iterator_traits<>): Use nested class templates instead of ones from
namespace __detail.
* include/bits/stl_iterator.h (__detail::__common_iter_ptr): Move to
class scope in iterator_traits<common_iterator<I, S>>.
(iterator_traits<common_iterator<I, S>>): Use nested class template
instead of __detail::__common_iter_ptr.
Jakub Jelinek [Tue, 24 Nov 2020 08:04:28 +0000 (09:04 +0100)]
openmp: Fix C ICE on OpenMP atomics
c_parser_binary_expression was using build2 to create a temporary holder
for binary expression that c_parser_atomic and c_finish_omp_atomic can then
handle. The latter performs then all the needed checking.
Unfortunately, build2 performs some checking too, e.g. PLUS_EXPR vs.
POINTER_PLUS_EXPR or matching types of the arguments, nothing we can guarantee
at the parsing time. So we need something like C++ build_min_nt*. This
patch implements that inline.
2020-11-24 Jakub Jelinek <jakub@redhat.com>
PR c/97958
* c-parser.c (c_parser_binary_expression): For omp atomic binary
expressions, use make_node instead of build2 to avoid checking build2
performs.
Jakub Jelinek [Fri, 20 Nov 2020 11:26:58 +0000 (12:26 +0100)]
arm: Fix up neon_vector_mem_operand [PR97528]
The documentation for POST_MODIFY says:
Currently, the compiler can only handle second operands of the
form (plus (reg) (reg)) and (plus (reg) (const_int)), where
the first operand of the PLUS has to be the same register as
the first operand of the *_MODIFY.
The following testcase ICEs, because combine just attempts to simplify
things and ends up with
(post_modify (reg1) (plus (mult (reg2) (const_int 4)) (reg1))
but the target predicates accept it, because they only verify
that POST_MODIFY's second operand is PLUS and the second operand
of the PLUS is a REG.
The following patch fixes this by performing further verification that
the POST_MODIFY is in the form it should be.
2020-11-20 Jakub Jelinek <jakub@redhat.com>
PR target/97528
* config/arm/arm.c (neon_vector_mem_operand): For POST_MODIFY, require
first POST_MODIFY operand is a REG and is equal to the first operand
of PLUS.
Jakub Jelinek [Sat, 14 Nov 2020 08:14:19 +0000 (09:14 +0100)]
dwarf2: Emit DW_TAG_unspecified_parameters even in late DWARF [PR97599]
Aldy's PR71855 fix avoided emitting multiple redundant
DW_TAG_unspecified_parameters sub-DIEs of a single DIE by restricting
it to early dwarf only. That unfortunately means if we need to emit
another DIE for the function (whether it is for LTO, or e.g. because of
IPA cloning), we don't emit DW_TAG_unspecified_parameters, it remains
solely in the DW_AT_abstract_origin's referenced DIE.
But DWARF consumers don't really use DW_TAG_unspecified_parameters
from there, like we duplicate DW_TAG_formal_parameter sub-DIEs even in the
clones because either they have some more specific location, or e.g.
a function clone could have fewer or different argument types etc.,
they need to assume that originally stdarg function isn't later stdarg etc.
Unfortunately, while for DW_TAG_formal_parameter sub-DIEs, we can use the
hash tabs to look the PARM_DECLs if we already have the DIEs, for
DW_TAG_unspecified_parameters we don't have an easy way to look it up.
The following patch handles it by trying to figure out if we are creating a
fresh new DIE (in that case we add DW_TAG_unspecified_parameters if it is
stdarg), or if gen_subprogram_die is called again on an pre-existing DIE
to fill in some further details (then it will not touch it).
Except for lto, subr_die != old_die would be good enough, but unfortunately
for LTO the new DIE that will refer to early dwarf created DIE is created
on the fly during lookup_decl_die. So the patch tracks if the DIE has
no children before any children are added to it.
2020-11-14 Jakub Jelinek <jakub@redhat.com>
PR debug/97599
* dwarf2out.c (gen_subprogram_die): Call
gen_unspecified_parameters_die even if not early dwarf, but only
if subr_die is a newly created DIE.
Jason Merrill [Fri, 20 Nov 2020 20:20:45 +0000 (15:20 -0500)]
dwarf2: ICE with local class in unused function [PR97918]
Here, since we only mention bar<B>, we never emit debug information for it.
But we do emit debug information for H<J>::h, so we need to refer to the
debug info for bar<B>::J even though there is no bar<B>. We deal with this
sort of thing in dwarf2out with the limbo_die_list; parentless dies like J
get attached to the CU at EOF. But here, we were flushing the limbo list,
then generating the template argument DIE for H<J> that refers to J, which
adds J to the limbo list, too late to be flushed. So let's flush a little
later.
gcc/ChangeLog:
PR c++/97918
* dwarf2out.c (dwarf2out_early_finish): flush_limbo_die_list
after gen_scheduled_generic_parms_dies.
gcc/testsuite/ChangeLog:
PR c++/97918
* g++.dg/debug/localclass2.C: New test.
Jason Merrill [Thu, 8 Oct 2020 19:43:26 +0000 (15:43 -0400)]
c++: Fix member alias template in C++17 and up. [PR96805]
Here we're trying to push into a<T>::c<N> in order to instantiate t<N>, but
were building a TYPENAME_TYPE for it because a<T> isn't open yet. Don't
do that when we know we're trying to enter the scope.
gcc/cp/ChangeLog:
PR c++/96805
PR c++/96199
* pt.c (tsubst_aggr_type): Don't build a TYPENAME_TYPE when
entering_scope.
(tsubst_template_decl): Use tsubst_aggr_type.
gcc/testsuite/ChangeLog:
PR c++/96805
* g++.dg/cpp0x/alias-decl-pr96805.C: New test.
Richard Earnshaw [Tue, 24 Nov 2020 16:21:17 +0000 (16:21 +0000)]
arm: correctly handle negating INT_MIN in arm_split_atomic_op [PR97534]
arm_split_atomic_op handles subtracting a constant by converting it
into addition of the negated constant. But if the type of the operand
is int and the constant is -1 we currently end up generating invalid
RTL which can lead to an abort later on.
The problem is that in a HOST_WIDE_INT, INT_MIN is represented as
0xffffffff80000000 and the negation of this is 0x0000000080000000, but
that's not a valid constant for use in SImode operations.
The fix is straight-forward which is to use gen_int_mode rather than
simply GEN_INT. This knows how to correctly sign-extend the negated
constant when this is needed.
gcc/
PR target/97534
* config/arm/arm.c (arm_split_atomic_op): Use gen_int_mode when
negating a const_int.
gcc/testsuite
* gcc.dg/pr97534.c: New test.
Thomas Schwinge [Fri, 20 Nov 2020 09:41:46 +0000 (10:41 +0100)]
More explicit checking of which OMP constructs we're expecting, part II
In particular, more precisely highlight what applies generally vs. the special
handling for the current 'parloops'-based OpenACC 'kernels' implementation.
gcc/
* omp-expand.c (expand_oacc_for): More explicit checking of which
OMP constructs we're expecting.
Thomas Schwinge [Fri, 6 Nov 2020 08:51:16 +0000 (09:51 +0100)]
[testsuite] Emit 'warning' instead of 'error' diagnostics for 'dg-optimized', 'dg-missed'
The diagnostics produced by 'dg-optimized', 'dg-missed' aren't error
diagnostics (fatal, meaning: causes compilation to fail) but rather warning
diagnostics (non-fatal, doesn't cause compilation to fail). Thus, same as
'dg-message', these should use 'saved-dg-warning' instead of 'saved-dg-error',
which then prints: "(test for *warnings*, line [...]) instead of currently:
"(test for *errors*, line [...])".
This is a small bug-fix for commit ed2d9d3720adef3a260b8a55e17e744352a901fc
"dumpfile.c: use prefixes other than 'note: ' for
MSG_{OPTIMIZED_LOCATIONS|MISSED_OPTIMIZATION}", which added 'dg-optimized',
'dg-missed'.
gcc/testsuite/
* lib/gcc-dg.exp (dg-optimized, dg-missed): Use 'saved-dg-warning'
instead of 'saved-dg-error'.
Thomas Schwinge [Fri, 6 Nov 2020 08:18:06 +0000 (09:18 +0100)]
[testsuite] Enable column location checking for 'dg-optimized', 'dg-missed'
'process-message' would like the 'msgprefix' argument without trailing space.
This is a small bug-fix for commit ed2d9d3720adef3a260b8a55e17e744352a901fc
"dumpfile.c: use prefixes other than 'note: ' for
MSG_{OPTIMIZED_LOCATIONS|MISSED_OPTIMIZATION}", which added 'dg-optimized',
'dg-missed'.
Iain Buclaw [Sun, 22 Nov 2020 13:29:54 +0000 (14:29 +0100)]
d: Fix OutOfMemoryError thrown when appending to an array with a side effect
When appending a character to an array, the result of that concat
assignment was not the new value of the array, similarly, when appending
an array to another array, side effects were evaluated in reverse to the
expected order of evaluation.
As of this change, the address of the left-hand side expression is
saved and re-used as the result. Its evaluation is now also forced to
occur before the concat operation itself is called.
gcc/d/ChangeLog:
PR d/97889
* expr.cc (ExprVisitor::visit (CatAssignExp *)): Enforce LTR order of
evaluation on left and right hand side expressions.
Jonathan Wakely [Fri, 20 Nov 2020 11:30:33 +0000 (11:30 +0000)]
libstdc++: Remove <memory_resource> dependency from <regex> [PR 92546]
Unlike the other headers that declare alias templates in namespace pmr,
<regex> includes <memory_resource>. That was done because the
pmr::string::const_iterator typedef requires pmr::string to be complete,
which requires pmr::polymorphic_allocator<char> to be complete.
By using __normal_iterator<const char*, pmr::string> instead of the
const_iterator typedef we can avoid the completeness requirement.
This makes <regex> smaller, by not requiring <memory_resource> and its
<shared_mutex> dependency, which depends on <chrono>. Backporting this
will also help with PR 97876, where <stop_token> ends up being needed by
<regex> via <memory_resource>.
libstdc++-v3/ChangeLog:
PR libstdc++/92546
* include/std/regex (pmr::smatch, pmr::wsmatch): Declare using
underlying __normal_iterator type, not nested typedef
basic_string::const_iterator.
Jonathan Wakely [Thu, 19 Nov 2020 22:32:54 +0000 (22:32 +0000)]
libstdc++: Fix compilation error with clang-8 [PR 97876]
This fixes a compilation error with clang-8 and earlier. This change is
only on the gcc-10 branch, not master, because the <stop_token> header
is included indirectly in more places on the branch than on master.
PR libstdc++/97876
* include/std/stop_token (_Stop_state_t): Define default
constructor as user-provided not defaulted.
Jonathan Wakely [Thu, 19 Nov 2020 21:07:06 +0000 (21:07 +0000)]
libstdc++: Avoid calling undefined __gthread_self weak symbol [PR 95989]
Since glibc 2.27 the pthread_self symbol has been defined in libc rather
than libpthread. Because we only call pthread_self through a weak alias
it's possible for statically linked executables to end up without a
definition of pthread_self. This crashes when trying to call an
undefined weak symbol.
We can use the __GLIBC_PREREQ version check to detect the version of
glibc where pthread_self is no longer in libpthread, and call it
directly rather than through the weak reference.
It would be better to check for pthread_self in libc during configure
instead of hardcoding the __GLIBC_PREREQ check. That would be
complicated by the fact that prior to glibc 2.27 libc.a didn't have the
pthread_self symbol, but libc.so.6 did. The configure checks would need
to try to link both statically and dynamically, and the result would
depend on whether the static libc.a happens to be installed during
configure (which could vary between different systems using the same
version of glibc). Doing it properly is left for a future date, as that
will be needed anyway after glibc moves all pthread symbols from
libpthread to libc. When that happens we should revisit the whole
approach of using weak symbols for pthread symbols.
For the purposes of std::this_thread::get_id() we call
pthread_self() directly when using glibc 2.27 or later. Otherwise, if
__gthread_active_p() is true then we know the libpthread symbol is
available so we call that. Otherwise, we are single-threaded and just
use ((__gthread_t)1) as the thread ID.
An undesirable consequence of this change is that code compiled prior to
the change might inline the old definition of this_thread::get_id()
which always returns (__gthread_t)1 in a program that isn't linked to
libpthread. Code compiled after the change will use pthread_self() and
so get a real TID. That could result in the main thread having different
thread::id values in different translation units. This seems acceptable,
as there are not expected to be many uses of thread::id in programs
that aren't linked to libpthread.
An earlier version of this patch also changed __gthread_self() to use
__GLIBC_PREREQ(2, 27) and only use the weak symbol for older glibc. Tha
might still make sense to do, but isn't needed by libstdc++ now.
libstdc++-v3/ChangeLog:
PR libstdc++/95989
* config/os/gnu-linux/os_defines.h (_GLIBCXX_NATIVE_THREAD_ID):
Define new macro to get reliable thread ID.
* include/std/stop_token (_Stop_state_t::_M_request_stop):
Use new macro if it's defined.
(_Stop_state_t::_M_remove_callback): Likewise.
* include/std/thread (this_thread::get_id): Likewise.
* testsuite/30_threads/jthread/95989.cc: New test.
* testsuite/30_threads/this_thread/95989.cc: New test.
Alex Coplan [Thu, 12 Nov 2020 10:03:21 +0000 (10:03 +0000)]
aarch64: Fix SVE2 BCAX pattern [PR97730]
This patch adds a missing not to the SVE2 BCAX (Bitwise clear and
exclusive or) pattern, fixing the PR. Since SVE doesn't have an
unpredicated not instruction, we need to use a (vacuously) predicated
not here.
To ensure that the predicate is instantiated correctly (to all 1s) for
the intrinsics, we pull out a separate expander from the define_insn.
From the ISA reference [1]:
> Bitwise AND elements of the second source vector with the
> corresponding inverted elements of the third source vector, then
> exclusive OR the results with corresponding elements of the first
> source vector.
Uros Bizjak [Thu, 19 Nov 2020 08:26:01 +0000 (09:26 +0100)]
i386: Disable *<absneg:code><mode>2_i387_1 for TARGET_SSE_MATH modes
This pattern interferes with *<absneg:code><mode>2_1 when TARGET_SSE_MATH
modes are active. Combine pass is able to remove (use) RTXes and transforms
*<absneg:code><mode>2_1 to *<absneg:code><mode>2_i387_1 where SSE
alternatives are not available.
2020-11-19 Uroš Bizjak <ubizjak@gmail.com>
gcc/
PR target/97887
* config/i386/i386.md (*<absneg:code><mode>2_i387_1):
Disable for TARGET_SSE_MATH modes.
gcc/testsuite/
PR target/97887
* gcc.target/i386/pr97887.c: New test.
This applies the proposed resolution of LWG 3500, which corrects the
return type and constraints of this member function to use the right
iterator type. Additionally, a nearby local variable is uglified.
libstdc++-v3/ChangeLog:
* include/std/ranges (join_view::_Iterator::_M_satisfy): Uglify
local variable inner.
(join_view::_Iterator::operator->): Use _Inner_iter instead of
_Outer_iter in the function signature as per LWG 3500.
* testsuite/std/ranges/adaptors/join.cc (test08): Test it.
Jonathan Wakely [Tue, 6 Oct 2020 08:41:16 +0000 (09:41 +0100)]
libstdc++: Avoid CTAD for std::ranges::join_view [LWG 3474]
In commit ef275d1f2083f8a1fa1b59a3cd07fd3e8431023e I implemented the
wrong resolution of LWG 3474. This removes the deduction guide and
alters the views::join factory to create the right type explicitly.
libstdc++-v3/ChangeLog:
* include/std/ranges (join_view): Remove deduction guide.
(views::join): Add explicit template argument list to prevent
deducing the wrong type.
* testsuite/std/ranges/adaptors/join.cc: Move test for LWG 3474
here, from ...
* testsuite/std/ranges/adaptors/join_lwg3474.cc: Removed.
Jonathan Wakely [Thu, 10 Sep 2020 17:51:24 +0000 (18:51 +0100)]
libstdc++: Fix macro redefinition warnings
Including <version> after <iterator> gives a warning about redefining
the __cpp_lib_array_constexpr macro. What happens is that <iterator>
sets the C++20 value, then <version> redefines it to the C++17 value,
then undefines it and defines it again to the C++20 value.
This change avoids defining it to the C++17 value when compiling C++20
or later (which also means we no longer need the #undef).
A similar warning happens for __cpp_lib_constexpr_char_traits when
including <version> after any header that includes <bits/char_traits.h>.
libstdc++-v3/ChangeLog:
* include/std/version (__cpp_lib_array_constexpr)
(__cpp_lib_constexpr_char_traits): Only define C++17 value when
compiling C++17.
Jonathan Wakely [Mon, 15 Jun 2020 13:31:26 +0000 (14:31 +0100)]
libstdc++: Update value of __cpp_lib_constexpr_char_traits for C++20
Although not required by SD-6 or the C++20 draft, we define the macro
__cpp_lib_constexpr_char_traits to indicate support for P0432R1. This
updates the value in C++20 mode for the P1032R1 changes to char_traits.
* include/bits/char_traits.h (__cpp_lib_constexpr_char_traits):
Update value for C++20.
* include/std/version (__cpp_lib_constexpr_char_traits): Likewise.
* testsuite/21_strings/char_traits/requirements/constexpr_functions_c++17.cc:
Update expected value.