Iain Sandoe [Wed, 23 Jun 2021 07:13:22 +0000 (08:13 +0100)]
coroutines: Fix a typo in rewriting the function.
When amending the function re-write code, I made a typo in
the block connections. This has not shown up in any test
fails (as far as can be seen) but is a regression in debug
info.
Iain Sandoe [Mon, 3 May 2021 07:22:53 +0000 (08:22 +0100)]
Darwin, X86: Adjust call clobbers to allow for lazy-binding [PR 100152].
We allow public functions defined in a TU to bind locally for PIC
code (the default) on 64bit Mach-O.
If such functions are not inlined, we cannot tell at compile-time if
they might be called via the lazy symbol resolver (this can depend on
options given at link-time). Therefore, we must assume that the lazy
resolver could be used which clobbers R11 and R10.
PR target/100152
* config/i386/i386-expand.c (ix86_expand_call): If a call is
to a non-local-binding, or local but to a public symbol, then
assume that it might be indirected via the lazy symbol binder.
Mark R10 and R11 as clobbered in that case.
Ian Lance Taylor [Mon, 19 Jul 2021 23:47:05 +0000 (16:47 -0700)]
compiler: avoid aliases in receiver types
If a package declares a method on an alias type, the alias would be
used in the export data. This would then trigger a compiler
assertion on import: we should not be adding methods to aliases.
Fix the problem by ensuring that receiver types do not use alias types.
This seems preferable to consistently avoiding aliases in export data,
as aliases can cross packages. And it's painful to try to patch this
while writing the export data, as at that point all the types are known.
Bill Schmidt [Mon, 19 Jul 2021 17:49:17 +0000 (12:49 -0500)]
rs6000: Don't let swaps pass break multiply low-part (PR101129)
Backport from mainline.
2021-07-15 Bill Schmidt <wschmidt@linux.ibm.com>
gcc/
PR target/101129
* config/rs6000/rs6000-p8swap.c (has_part_mult): New.
(rs6000_analyze_swaps): Insns containing a subreg of a mult are
not swappable.
Jonathan Wakely [Wed, 12 May 2021 10:21:51 +0000 (11:21 +0100)]
libstdc++: Fix some problems in PSTL tests
libstdc++-v3/ChangeLog:
* testsuite/25_algorithms/pstl/alg_nonmodifying/find_end.cc:
Increase dg-timeout-factor to 4. Fix -Wunused-parameter
warnings. Replace bitwise AND with logical AND in loop
condition.
* testsuite/25_algorithms/pstl/alg_nonmodifying/search_n.cc:
Replace bitwise AND with logical AND in loop condition.
* testsuite/util/pstl/test_utils.h: Remove unused parameter
names.
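The bitwise-versus-logical AND change matters because `&` evaluates both of its operands, while `&&` short-circuits. A minimal sketch of the difference follows; the function name and data are hypothetical and are not taken from the PSTL tests.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts the leading non-zero elements of `data`.
// With `&&` the bounds check short-circuits, so data[i] is never read once
// i == data.size(). Writing `i < data.size() & data[i] != 0` instead would
// evaluate data[i] even after the bounds check has failed.
std::size_t count_leading_nonzero(const std::vector<int>& data) {
    std::size_t i = 0;
    while (i < data.size() && data[i] != 0)
        ++i;
    assert(i <= data.size());  // the loop never steps past the end
    return i;
}
```

For a vector with no zero element the loop stops at the bounds check alone, which is exactly the case where the bitwise form would read out of range.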
Jonathan Wakely [Tue, 15 Jun 2021 14:07:25 +0000 (15:07 +0100)]
libstdc++: Remove precondition checks from ranges::subrange
The assertion in the subrange constructor causes semantic changes,
because the call to ranges::distance performs additional operations that
are not part of the constructor's specification. That will fail to
compile if the iterator is move-only, because the argument to
ranges::distance is passed by value. It will modify the subrange if the
iterator is not a forward iterator, because incrementing the copy also
affects the _M_begin member. Those problems could be prevented by using
if-constexpr to only do the assertion for copyable forward iterators,
but the call to ranges::distance can also prevent the constructor being
usable in constant expressions. If the member initializers are usable in
constant expressions, but iterator increments or equality comparisons
are not, then the checks done by __glibcxx_assert might
make constant evaluation fail.
This change removes the assertion. Additionally, a new typedef is
introduced to simplify the declarations using __make_unsigned_like_t on
the iterator's difference type.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* include/bits/ranges_util.h (subrange): Add __size_type typedef
and use it to simplify declarations.
(subrange(i, s, n)): Remove assertion.
* testsuite/std/ranges/subrange/constexpr.cc: New test.
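The single-pass problem described above can be reproduced without the ranges library. The sketch below is an analogy under stated assumptions, not the libstdc++ code: count_by_copy is a hypothetical stand-in for a size check built on a distance computation, and std::istream_iterator plays the role of a non-forward iterator whose copies share state.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <sstream>

// Hypothetical stand-in for a distance-based precondition check: the
// iterator is taken by value and the copy is advanced to the end.
template <typename It>
std::ptrdiff_t count_by_copy(It first, It last) {
    std::ptrdiff_t n = 0;
    for (; first != last; ++first)
        ++n;
    return n;
}

// Returns true if counting through a copy drained the underlying stream.
bool size_check_consumes_input() {
    std::istringstream in("1 2 3");
    std::istream_iterator<int> first(in), last;
    const std::ptrdiff_t n = count_by_copy(first, last);
    assert(n == 3);
    int next = 0;
    // The copy shared the stream with `first`, so nothing is left to read.
    return !(in >> next);
}
```

Here the check itself changes what the caller can observe afterwards, which is the kind of semantic change the commit removes.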
Jonathan Wakely [Wed, 14 Jul 2021 19:14:14 +0000 (20:14 +0100)]
libstdc++: Fix std::get<T> for std::tuple [PR101427]
The std::get<T> functions relied on deduction failing if more than one
base class existed for the type T. However the implementation of Core
DR 2303 (in r11-4693) made deduction succeed (and select the
more-derived base class).
This rewrites the implementation of std::get<T> to explicitly check for
more than one occurrence of T in the tuple elements, making it
ill-formed again. Additionally, the large wall of overload resolution
errors described in PR c++/101460 is avoided by making std::get<T> use
__get_helper<I> directly instead of calling std::get<I>, and by adding a
deleted overload of __get_helper<N> for out-of-range N.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
PR libstdc++/101427
* include/std/tuple (tuple_element): Improve static_assert text.
(__get_helper): Add deleted overload.
(get<i>(tuple<T...>&&), get<i>(const tuple<T...>&&)): Use
__get_helper directly.
(__get_helper2): Remove.
(__find_uniq_type_in_pack): New constexpr helper function.
(get<T>): Use __find_uniq_type_in_pack and __get_helper instead
of __get_helper2.
* testsuite/20_util/tuple/element_access/get_neg.cc: Adjust
expected errors.
* testsuite/20_util/tuple/element_access/101427.cc: New test.
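The idea of explicitly checking for exactly one occurrence of T can be sketched as a constexpr function. This is a hedged illustration that needs C++14 or later; find_unique_index is a hypothetical name and the code is not the libstdc++ implementation of __find_uniq_type_in_pack. It returns the index of T in the pack, or the pack size if T is absent or occurs more than once.

```cpp
#include <cstddef>
#include <type_traits>

template <typename T, typename... Types>
constexpr std::size_t find_unique_index() {
    constexpr std::size_t count = sizeof...(Types);
    // One extra slot so the array is never zero-length for an empty pack.
    constexpr bool matches[count + 1] = {std::is_same<T, Types>::value..., false};
    std::size_t found = count;
    for (std::size_t i = 0; i < count; ++i) {
        if (matches[i]) {
            if (found < count)
                return count;  // second occurrence: T is not unique
            found = i;
        }
    }
    return found;  // equals count when T was not found at all
}

static_assert(find_unique_index<int, char, int, long>() == 1, "unique match");
static_assert(find_unique_index<int, int, char, int>() == 3, "duplicate T");
static_assert(find_unique_index<int, char, long>() == 2, "T absent");
```

A caller can then compare the result against the pack size in a static_assert, which gives one clear diagnostic instead of relying on deduction failure.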
Jakub Jelinek [Thu, 1 Jul 2021 06:55:49 +0000 (08:55 +0200)]
openmp - Fix up && and || reductions [PR94366]
As the testcase shows, the special treatment of && and || reduction combiners
where we expand them as omp_out = (omp_out != 0) && (omp_in != 0) (or with ||)
is not needed just for &&/|| on floating point or complex types, but for all
&&/|| reductions - when expanded as omp_out = omp_out && omp_in (not in C but
GENERIC) it is actually gimplified into NOP_EXPRs to bool from both operands,
which turns non-zero values that are multiples of 2 into 0 rather than 1.
This patch just treats all &&/|| the same and furthermore uses bool type
instead of int for the comparisons.
2021-07-01 Jakub Jelinek <jakub@redhat.com>
PR middle-end/94366
gcc/
* omp-low.c (lower_rec_input_clauses): Rename is_fp_and_or to
is_truth_op, set it for TRUTH_*IF_EXPR regardless of new_var's type,
use boolean_type_node instead of integer_type_node as NE_EXPR type.
(lower_reduction_clauses): Likewise.
libgomp/
* testsuite/libgomp.c-c++-common/pr94366.c: New test.
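The difference between the two expansions can be shown with plain functions. This is only a model: the second function assumes that the conversion to bool described above behaves like keeping the low bit of each operand, and both function names are hypothetical.

```cpp
#include <cassert>

// Combiner in the form the patch uses for every &&-reduction:
// compare each operand against zero first.
bool combine_and(int omp_out, int omp_in) {
    return (omp_out != 0) && (omp_in != 0);
}

// Model of the faulty lowering (an assumption for illustration): each
// operand is truncated to a single bit before the logical AND.
bool combine_and_low_bit(int omp_out, int omp_in) {
    return ((omp_out & 1) != 0) && ((omp_in & 1) != 0);
}
```

With operands 2 and 4, `assert(combine_and(2, 4))` holds while `assert(!combine_and_low_bit(2, 4))` shows the even values collapsing to false.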
Tobias Burnus [Tue, 4 May 2021 11:38:03 +0000 (13:38 +0200)]
OpenMP: Support complex/float in && and || reduction
C/C++ permit logical AND and logical OR also with floating-point or complex
arguments by doing an unequal zero comparison; the result is an 'int' with
value one or zero. Hence, those are also permitted as reduction variable,
even though it is not the most sensible thing to do.
gcc/c/ChangeLog:
* c-typeck.c (c_finish_omp_clauses): Accept float + complex
for || and && reductions.
gcc/cp/ChangeLog:
* semantics.c (finish_omp_reduction_clause): Accept float + complex
for || and && reductions.
gcc/ChangeLog:
* omp-low.c (lower_rec_input_clauses, lower_reduction_clauses): Handle
&& and || with floating-point and complex arguments.
gcc/testsuite/ChangeLog:
* gcc.dg/gomp/clause-1.c: Use 'reduction(&:..)' instead of '...(&&:..)'.
libgomp/ChangeLog:
* testsuite/libgomp.c-c++-common/reduction-1.c: New test.
* testsuite/libgomp.c-c++-common/reduction-2.c: New test.
* testsuite/libgomp.c-c++-common/reduction-3.c: New test.
Comparisons of NULLPTR_TYPE operands cause all kinds of problems in the
middle-end and in fold-const.c, various optimizations assume that if they
see e.g. a non-equality comparison with one of the operands being
INTEGER_CST and it is not INTEGRAL_TYPE_P (which has TYPE_{MIN,MAX}_VALUE),
they can build_int_cst (type, 1) to find a successor.
The following patch fixes it by making sure they don't appear in the IL,
optimize them away at cp_fold time as all can be folded.
Though, I've just noticed that clang++ rejects the non-equality comparisons
instead, foo () > 0 with
invalid operands to binary expression ('decltype(nullptr)' (aka 'nullptr_t') and 'int')
and foo () > nullptr with
invalid operands to binary expression ('decltype(nullptr)' (aka 'nullptr_t') and 'nullptr_t')
Shall we reject those too, in addition to or instead of parts of this patch?
If so, wouldn't this patch be still useful for backports, I bet we don't
want to start rejecting it on the release branches when we used to accept it.
2021-07-15 Jakub Jelinek <jakub@redhat.com>
PR c++/101443
* cp-gimplify.c (cp_fold): For comparisons with NULLPTR_TYPE
operands, fold them right away to true or false.
pot_dummy_types is a hash_set from whose traversal the code prints some type
lines. hash_set normally uses default_hash_traits which for pointer types
(the hash set hashes const char *) uses pointer_hash which hashes the
addresses of the pointers except for the least significant 3 bits.
With address space randomization, that results in non-determinism in the
-fdump-go-specs= generated file, each invocation can have different order of
the lines emitted from pot_dummy_types traversal.
This patch fixes it by hashing the string contents instead to make the
hashes reproducible.
2021-07-14 Jakub Jelinek <jakub@redhat.com>
PR go/101407
* godump.c (godump_str_hash): New type.
(godump_container::pot_dummy_types): Use string_hash instead of
ptr_hash in the hash_set.
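The same idea can be sketched with the standard library rather than GCC's hash_set. ContentHash, ContentEqual and NameSet are hypothetical names, and FNV-1a is simply one deterministic string hash chosen for the example; the point is that the hash depends on the characters and not on where the string happens to live.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <unordered_set>

// Hashes the characters of a C string (FNV-1a), not the pointer value.
struct ContentHash {
    std::size_t operator()(const char* s) const {
        std::uint64_t h = 0xcbf29ce484222325ULL;
        for (; *s != '\0'; ++s) {
            h ^= static_cast<unsigned char>(*s);
            h *= 0x100000001b3ULL;
        }
        return static_cast<std::size_t>(h);
    }
};

// Equality must agree with the hash, so compare contents as well.
struct ContentEqual {
    bool operator()(const char* a, const char* b) const {
        return std::strcmp(a, b) == 0;
    }
};

using NameSet = std::unordered_set<const char*, ContentHash, ContentEqual>;

// Two distinct buffers with equal contents must hash identically.
bool same_hash_for_equal_contents() {
    const std::string a = "uintptr_t";
    const std::string b = "uintptr_t";
    assert(a.c_str() != b.c_str());
    return ContentHash()(a.c_str()) == ContentHash()(b.c_str());
}
```

Because nothing in the hash depends on addresses, address space randomization no longer influences bucket placement, and so cannot reorder a traversal between runs.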
Jakub Jelinek [Tue, 13 Jul 2021 07:50:49 +0000 (09:50 +0200)]
libgomp: Don't include limits.h instead of hidden visibility block
sem.h is included in between # pragma GCC visibility push(hidden)
and # pragma GCC visibility pop and includes limits.h there, which
since the introduction of sysconf declaration in recent glibcs
in there causes trouble. libgomp assumes it is compiled by gcc,
so we don't really need to include limits.h there and can use
-__INT_MAX__ - 1 instead (which clang and icc support too for years).
2021-07-13 Jakub Jelinek <jakub@redhat.com>
Florian Weimer <fweimer@redhat.com>
* config/linux/sem.h: Don't include limits.h.
(SEM_WAIT): Define to -__INT_MAX__ - 1 instead of INT_MIN.
* config/linux/affinity.c: Include limits.h.
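The replacement expression can be checked directly. In this sketch sem_wait_value is a hypothetical stand-in for the SEM_WAIT definition, and <climits> is included only to verify the claim, not because the definition needs it.

```cpp
#include <climits>

// The most negative int, spelled with the predefined __INT_MAX__ macro
// (provided by GCC and Clang) so that no header is needed at this point.
constexpr int sem_wait_value = -__INT_MAX__ - 1;

// Included header used only here, to confirm the two spellings agree.
static_assert(sem_wait_value == INT_MIN,
              "-__INT_MAX__ - 1 should equal INT_MIN");
```

Writing `-__INT_MAX__ - 1` instead of negating a literal avoids forming a positive value that does not fit in int.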
Jakub Jelinek [Thu, 1 Jul 2021 07:45:02 +0000 (09:45 +0200)]
dwarf2out: Handle COMPOUND_LITERAL_EXPR in loc_list_from_tree_1 [PR101266]
In this case dwarf2out_decl is called from the FEs with GENERIC but not
yet gimplified expressions in it.
As loc_list_from_tree_1 has an exhaustive list of tree codes it wants to
handle and, in checking builds, asserts that no other codes make it in, we should
handle even GENERIC trees that shouldn't be valid in GIMPLE.
The following patch handles COMPOUND_LITERAL_EXPR by handling it like the
underlying VAR_DECL temporary.
Verified the emitted DWARF is correct (but unoptimized, we emit
DW_OP_lit1 DW_OP_lit1 DW_OP_minus for the upper bound).
Jakub Jelinek [Tue, 29 Jun 2021 09:24:38 +0000 (11:24 +0200)]
match.pd: Avoid (intptr_t)x eq/ne CST to x eq/ne (typeof x) CST opt in GENERIC when sanitizing [PR101210]
When we have (intptr_t) x == cst where x has REFERENCE_TYPE, this
optimization creates x == cst out of it where cst has REFERENCE_TYPE.
If it is done in GENERIC folding, it can result in ubsan failures
where the INTEGER_CST with REFERENCE_TYPE is instrumented.
Fixed by deferring it to GIMPLE folding in this case.
2021-06-29 Jakub Jelinek <jakub@redhat.com>
PR c++/101210
* match.pd ((intptr_t)x eq/ne CST to x eq/ne (typeof x) CST): Don't
perform the optimization in GENERIC when sanitizing and x has a
reference type.
Jakub Jelinek [Thu, 24 Jun 2021 13:58:02 +0000 (15:58 +0200)]
c: Fix up c_parser_has_attribute_expression [PR101176]
This function keeps src_range member of the result uninitialized, which at
least under valgrind can show up later when those uninitialized location_t's
can make it into the IL or location_t hash tables.
2021-06-24 Jakub Jelinek <jakub@redhat.com>
PR c/101176
* c-parser.c (c_parser_has_attribute_expression): Set source range for
the result.
Jakub Jelinek [Thu, 24 Jun 2021 13:55:28 +0000 (15:55 +0200)]
c: Fix C cast error-recovery [PR101171]
The following testcase ICEs during error-recovery, as build_c_cast calls
note_integer_operands on error_mark_node and that wraps it into
C_MAYBE_CONST_EXPR which is unexpected and causes ICE later on.
Seems most other callers of note_integer_operands check early if something
is error_mark_node and return before calling note_integer_operands on it.
The following patch fixes it by not calling on error_mark_node, another
possibility would be to handle error_mark_node in note_integer_operands and
just return it.
2021-06-24 Jakub Jelinek <jakub@redhat.com>
PR c/101171
* c-typeck.c (build_c_cast): Don't call note_integer_operands on
error_mark_node.
Jakub Jelinek [Thu, 24 Jun 2021 10:22:14 +0000 (12:22 +0200)]
stor-layout: Avoid DECL_BIT_FIELD_REPRESENTATIVE with NULL TREE_TYPE [PR101172]
finish_bitfield_representative has an early out if the field after a
bitfield has error_mark_node type, but that early out leads to TREE_TYPE
of the DECL_BIT_FIELD_REPRESENTATIVE being NULL, which breaks assumptions
on code that uses the DECL_BIT_FIELD_REPRESENTATIVE during error-recovery.
The following patch instead sets TREE_TYPE of the representative to
error_mark_node, something the users can deal with better. At this point
the representative can be set as DECL_BIT_FIELD_REPRESENTATIVE for multiple
bitfields, so making sure that we clear the DECL_BIT_FIELD_REPRESENTATIVE
instead would be harder (but doable, e.g. with the error_mark_node TREE_TYPE
set by this patch set some flag in the caller and if the flag is there, walk
all the fields once again and clear all DECL_BIT_FIELD_REPRESENTATIVE that
have error_mark_node TREE_TYPE).
2021-06-24 Jakub Jelinek <jakub@redhat.com>
PR middle-end/101172
* stor-layout.c (finish_bitfield_representative): If nextf has
error_mark_node type, set repr type to error_mark_node too.
Patrick Palka [Fri, 16 Jul 2021 20:21:13 +0000 (16:21 -0400)]
c++: alias CTAD in unevaluated context [PR101233]
This is the alias CTAD version of the CTAD bug PR93248, and the fix is
the same: clear cp_unevaluated_operand so that the entire chain of
DECL_ARGUMENTS gets substituted.
PR c++/101233
gcc/cp/ChangeLog:
* pt.c (alias_ctad_tweaks): Clear cp_unevaluated_operand for
substituting DECL_ARGUMENTS.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/class-deduction-alias10.C: New test.
This PR is about CTAD but the underlying problems are more general;
CTAD is a good trigger for them because of the necessary substitution
into constraints that deduction guide generation entails.
In the testcase below, when generating the implicit deduction guide for
the constrained constructor template for A, we substitute the generic
flattening map 'tsubst_args' into the constructor's constraints. During
this substitution, tsubst_pack_expansion returns a rebuilt pack
expansion for sizeof...(xs), but doesn't carry over the
PACK_EXPANSION_LOCAL_P (and PACK_EXPANSION_SIZEOF_P) flag from the
original tree to the rebuilt one. The flag is otherwise unset on the
original tree but gets set for the rebuilt tree from make_pack_expansion
since at_function_scope_p() is true (we're inside main). This leads to
a crash during satisfaction when substituting into the pack expansion
because we don't have local_specializations set up (and it'd be set up
for us if PACK_EXPANSION_LOCAL_P is unset).
Similarly, tsubst_constraint needs to set cp_unevaluated so that the
substitution performed therein doesn't rely on local_specializations.
This avoids a crash during CTAD for C below.
gcc/cp/ChangeLog:
PR c++/100138
* constraint.cc (tsubst_constraint): Set up cp_unevaluated.
(satisfy_atom): Set up iloc_sentinel before calling
cxx_constant_value.
* pt.c (tsubst_pack_expansion): When returning a rebuilt pack
expansion, carry over PACK_EXPANSION_LOCAL_P and
PACK_EXPANSION_SIZEOF_P from the original pack expansion.
gcc/testsuite/ChangeLog:
PR c++/100138
* g++.dg/cpp2a/concepts-ctad4.C: New test.
Jonathan Wakely [Tue, 15 Jun 2021 13:39:02 +0000 (14:39 +0100)]
libstdc++: Use function object for __decay_copy helper
By changing __cust_access::__decay_copy from a function template to a
function object we avoid ADL. That means it's fine to call it
unqualified (the compiler won't waste time doing ADL in associated
namespaces, and won't try to complete associated types).
This also makes some other minor simplifications to other concepts for the
[range.access] CPOs.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* include/bits/iterator_concepts.h (__cust_access::__decay_copy):
Replace with function object.
(__cust_access::__member_begin, __cust_access::__adl_begin): Use
__decay_copy unqualified.
* include/bits/ranges_base.h (__member_end, __adl_end):
Likewise. Use __range_iter_t for type of ranges::begin.
(__member_rend): Use correct value category for rbegin argument.
(__member_data): Use __decay_copy unqualified.
(__begin_data): Use __range_iter_t for type of ranges::begin.
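Why a function object avoids ADL can be seen in a small sketch. All names here (lib, user, Widget, copy_via_object) are hypothetical, and the code is not the libstdc++ definition. When unqualified lookup finds an object rather than a function, argument-dependent lookup is not performed, so a same-named function in the argument's namespace is never considered.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

namespace user {
struct Widget { int id; };
// A same-named function in the argument's namespace. ADL would find it
// if lib::decay_copy below were a function template.
inline Widget decay_copy(const Widget&) { return Widget{-1}; }
}  // namespace user

namespace lib {
struct DecayCopy {
    template <typename T>
    typename std::decay<T>::type operator()(T&& value) const {
        return std::forward<T>(value);
    }
};
constexpr DecayCopy decay_copy{};  // an object, not a function

template <typename T>
typename std::decay<T>::type copy_via_object(const T& value) {
    // Unqualified call: lookup finds the object above, so no ADL happens.
    return decay_copy(value);
}
}  // namespace lib
```

For example, `assert(lib::copy_via_object(user::Widget{7}).id == 7);` holds, because user::decay_copy is never a candidate.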
H.J. Lu [Fri, 11 Jun 2021 14:31:29 +0000 (07:31 -0700)]
x86: Replace ix86_red_zone_size with ix86_red_zone_used
Add red_zone_used to machine_function to track if red zone is used.
When expanding function prologue, set red_zone_used to true if red
zone is used.
gcc/
PR target/101023
* config/i386/i386.c (ix86_expand_prologue): Set red_zone_used
to true if red zone is used.
(ix86_output_indirect_jmp): Replace ix86_red_zone_size with
ix86_red_zone_used.
* config/i386/i386.h (machine_function): Add red_zone_used.
(ix86_red_zone_size): Removed.
(ix86_red_zone_used): New.
* config/i386/i386.md (peephole2 patterns): Replace
ix86_red_zone_size with ix86_red_zone_used.
gcc/testsuite/
PR target/101023
* g++.target/i386/pr101023a.C: New test.
* g++.target/i386/pr101023b.C: Likewise.
H.J. Lu [Fri, 9 Jul 2021 16:16:01 +0000 (09:16 -0700)]
x86: Don't enable UINTR in 32-bit mode
UINTR is available only in 64-bit mode. Since the codegen target is
unknown when the gcc driver is processing -march=native, to properly
handle UINTR for -march=native:
1. Pass "arch [32|64]" and "tune [32|64]" to host_detect_local_cpu to
indicate 32-bit and 64-bit codegen.
2. Change ix86_option_override_internal to enable UINTR only in 64-bit
mode for -march=CPU when PTA_CPU includes PTA_UINTR.
gcc/
PR target/101395
* config/i386/driver-i386.c (host_detect_local_cpu): Check
"arch [32|64]" and "tune [32|64]" for 32-bit and 64-bit codegen.
Enable UINTR only for 64-bit codegen.
* config/i386/i386-options.c
(ix86_option_override_internal::DEF_PTA): Skip PTA_UINTR if not
in 64-bit mode.
* config/i386/i386.h (ARCH_ARG): New.
(CC1_CPU_SPEC): Pass "[arch|tune] 32" for 32-bit codegen and
"[arch|tune] 64" for 64-bit codegen.
Richard Biener [Fri, 9 Jul 2021 09:13:11 +0000 (11:13 +0200)]
driver/101383 - handle -gtoggle in driver
The driver amends assembler options with for example --gdwarf-5
when debugging is enabled but the check for that does not consider
the effect of -gtoggle which is not handled in the common option
machinery. The following alters debug_info_level according to
-gtoggle mimicking what process_options later does in the compiler.
This in particular avoids changing of the cc1-checksum with every
bootstrap (debug) cycle as we compute that from stage2 where we
use -g -gtoggle but with --gdwarf-5 and no debug info from the
compiler the assembler will fill the line table with the temporary
assembler file names.
2021-07-09 Richard Biener <rguenther@suse.de>
PR driver/101383
* gcc.c (process_command): Process -gtoggle like process_options
would after parsing options.
Andrew MacLeod [Tue, 22 Jun 2021 21:46:05 +0000 (17:46 -0400)]
Do not continue propagating values which cannot be set properly.
If the on-entry cache cannot properly represent a range, do not continue
trying to propagate it.
PR tree-optimization/101148
PR tree-optimization/101014
* gimple-range-cache.cc (ranger_cache::ranger_cache): Adjust.
(ranger_cache::~ranger_cache): Adjust.
(ranger_cache::block_range): Check if propagation disallowed.
(ranger_cache::propagate_cache): Disallow propagation if new value
can't be stored properly.
* gimple-range-cache.h (ranger_cache::m_propfail): New member.
Andrew MacLeod [Tue, 8 Jun 2021 13:43:17 +0000 (09:43 -0400)]
Don't process lookups for debug statements in Ranger.
Although PR 100781 is not an issue in GCC11, it's possible that a similar
situation may arise. The identical fix cannot be easily introduced.
With EVRP always running in hybrid mode, there is no need for ranger to
spawn a lookup for a debug statement in this release.
* gimple-range.cc (gimple_ranger::range_of_expr): Treat debug statements
as contextless queries to avoid additional lookups.
Andrew MacLeod [Mon, 7 Jun 2021 17:18:55 +0000 (13:18 -0400)]
Implement a sparse bitmap representation for Rangers on-entry cache.
Use a sparse representation for the on-entry cache, and utilize it when
the number of basic blocks in the function exceeds param_evrp_sparse_threshold.
Andrew MacLeod [Mon, 7 Jun 2021 17:12:01 +0000 (13:12 -0400)]
Implement multi-bit aligned accessors for sparse bitmap.
Provide set/get routines to allow sparse bitmaps to be treated as an array
of multiple bit values. Only chunk sizes that are powers of 2 are supported.
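The multi-bit accessor idea can be sketched over a dense word vector. This is an illustration only: ChunkArray and its members are hypothetical, it uses std::vector rather than GCC's sparse bitmap, and it assumes 64-bit words. Because a power-of-two chunk size of at most 32 bits divides the word size, a chunk never straddles two words, which is what keeps set and get simple.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

class ChunkArray {
public:
    explicit ChunkArray(unsigned chunk_bits) : chunk_bits_(chunk_bits) {
        assert(chunk_bits >= 1 && chunk_bits <= 32);
        assert((chunk_bits & (chunk_bits - 1)) == 0);  // power of two only
    }

    // Stores the low chunk_bits of `value` in slot `index`.
    void set(std::size_t index, std::uint64_t value) {
        const std::size_t bit = index * chunk_bits_;
        const std::size_t word = bit / 64;
        const unsigned shift = static_cast<unsigned>(bit % 64);
        if (word >= words_.size())
            words_.resize(word + 1, 0);
        const std::uint64_t mask = low_mask() << shift;
        words_[word] = (words_[word] & ~mask) | ((value << shift) & mask);
    }

    // Reads slot `index`; slots that were never written read as 0.
    std::uint64_t get(std::size_t index) const {
        const std::size_t bit = index * chunk_bits_;
        const std::size_t word = bit / 64;
        if (word >= words_.size())
            return 0;
        return (words_[word] >> (bit % 64)) & low_mask();
    }

private:
    std::uint64_t low_mask() const {
        return (std::uint64_t{1} << chunk_bits_) - 1;
    }
    unsigned chunk_bits_;
    std::vector<std::uint64_t> words_;
};
```

A value wider than the chunk is truncated on store, and neighbouring slots are left untouched.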
Andrew MacLeod [Fri, 7 May 2021 16:03:01 +0000 (12:03 -0400)]
Clean up and virtualize the on-entry cache interface.
Cleanup/Virtualize the ssa_block_range class, and implement the current
vector approach as a derived class.
Allow memory allocation from the irange allocator obstack for easy freeing.
* gimple-range-cache.cc (ssa_block_ranges): Virtualize.
(sbr_vector): Renamed from ssa_block_cache.
(sbr_vector::sbr_vector): Allocate from obstack and initialize.
(ssa_block_ranges::~ssa_block_ranges): Remove.
(sbr_vector::set_bb_range): Use varying and undefined cached values.
(ssa_block_ranges::set_bb_varying): Remove.
(sbr_vector::get_bb_range): Adjust assert.
(sbr_vector::bb_range_p): Adjust assert.
(~block_range_cache): No freeing loop required.
(block_range_cache::get_block_ranges): Remove.
(block_range_cache::set_bb_range): Inline get_block_ranges.
(block_range_cache::set_bb_varying): Remove.
* gimple-range-cache.h (set_bb_varying): Remove prototype.
* value-range.h (irange_allocator::get_memory): New.
Odd-numbered indices describing argument access sizes in the fnspec
string can only hold 't' or a digit, as tested in the beginning of the
case. When checking that the size-supplying argument does not have
additional information associated with it, the test that excludes the
't' possibility looks for it at the even position in the fnspec
string. Oops.
This might yield false positives and negatives if a function has a
fnspec in which an argument uses a 't' access-size, and ('t' - '1')
happens to be the index of an argument described in an fnspec string.
Assuming ASCII encoding, it would take a function with at least 68
arguments described in fnspec. Still, probably worth fixing.
for gcc/ChangeLog
* tree-ssa-alias.c (attr_fnspec::verify): Fix index in
non-'t'-sized arg check.
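The figure of 68 arguments follows from the character arithmetic, which can be checked at compile time. The names below are hypothetical, and the check assumes an ASCII execution character set and zero-based argument indices, as the message does.

```cpp
// In ASCII, 't' is 116 and '1' is 49, so the misread index is 67.
constexpr int bogus_index = 't' - '1';

// A zero-based index of 67 only names a real argument if there are
// at least 68 of them.
constexpr int min_args_to_trigger = bogus_index + 1;

static_assert(bogus_index == 67, "assumes an ASCII character set");
static_assert(min_args_to_trigger == 68, "matches the count in the message");
```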
The use of npos triggers a diagnostic as described in PR c++/101361.
This change replaces the use of npos with the exact length, which is
already known. We can further simplify it by inlining the effects of
compare and substr, avoiding the redundant range checks in the latter.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
PR c++/101361
* include/std/string_view (ends_with): Use traits_type::compare
directly.
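A sketch of the approach, comparing the tail with the traits' compare once the lengths are known. It is written against std::string so that it builds before C++17, and this free-standing ends_with is a hypothetical helper, not the std::string_view member.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// True if `s` ends with `suffix`. The length test comes first, so the
// pointer arithmetic below never goes before the start of `s`.
bool ends_with(const std::string& s, const std::string& suffix) {
    const std::size_t len = s.size();
    const std::size_t n = suffix.size();
    return len >= n
        && std::char_traits<char>::compare(s.data() + (len - n),
                                           suffix.data(), n) == 0;
}
```

For instance, `assert(ends_with("changelog.txt", ".txt"));` holds, and an empty suffix always matches because comparing zero characters yields 0.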
Jonathan Wakely [Tue, 13 Jul 2021 11:09:37 +0000 (12:09 +0100)]
libstdc++: Remove duplicate #include in <string_view>
When I added the new C++23 constructor I added a conditional include of
<bits/ranges_base.h>, which was already being included unconditionally.
This removes the unconditional include but changes the condition for the
other one, so it's used for C++20 as well.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* include/std/string_view: Only include <bits/ranges_base.h>
once, and only for C++20 and later.
The std::as_writable_bytes function should be constrained to only accept
writable spans. Currently it can be called but then gives an error in
the function body.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
liuhongt [Tue, 15 Jun 2021 08:25:16 +0000 (16:25 +0800)]
Disparage slightly the mask register alternative for bitwise operations.
The avx512 supports bitwise operations with mask registers, but the
throughput of those instructions is much lower than that of the
corresponding gpr version, so we additionally disparage
slightly the mask register alternative for bitwise operations in the
LRA.
Also when allocno cost of GENERAL_REGS is the same as MASK_REGS, allocate
MASK_REGS first since it has already been disparaged.
gcc/ChangeLog:
PR target/101142
* config/i386/i386.md: (*anddi_1): Disparage slightly the mask
register alternative.
(*and<mode>_1): Ditto.
(*andqi_1): Ditto.
(*andn<mode>_1): Ditto.
(*<code><mode>_1): Ditto.
(*<code>qi_1): Ditto.
(*one_cmpl<mode>2_1): Ditto.
(*one_cmplsi2_1_zext): Ditto.
(*one_cmplqi2_1): Ditto.
* config/i386/i386.c (x86_order_regs_for_local_alloc): Change
the order of mask registers to be before general registers.
Richard Biener [Wed, 14 Jul 2021 09:06:58 +0000 (11:06 +0200)]
tree-optimization/101445 - fix negative stride SLP vect with gaps
The following fixes the IV adjustment for the gap in a negative
stride SLP vectorization. The adjustment was in the wrong direction,
now fixes as in the patch.
2021-07-14 Richard Biener <rguenther@suse.de>
PR tree-optimization/101445
* tree-vect-stmts.c (vectorizable_load): Do the gap adjustment
of the IV in the correct direction for negative stride
accesses.
Patrick Palka [Fri, 9 Jul 2021 14:20:25 +0000 (10:20 -0400)]
c++: requires-expr with dependent extra args [PR101181]
Here we're crashing ultimately because the mechanism for delaying
substitution into a requires-expression (and constexpr if and pack
expansions) doesn't expect to see dependent args. But we end up
capturing dependent args here during substitution into the default
template argument as part of coerce_template_parms for the dependent
specialization p<T>.
This patch enables the commented out code in add_extra_args for handling
this situation. This isn't needed for pack expansions (as the
accompanying comment points out), and it doesn't seem strictly necessary
for constexpr if either, but for requires-expressions delaying even
dependent substitution is important for ensuring we don't evaluate
requirements out of order.
It turns out we also need to make a copy of the arguments when capturing
them so that coerce_template_parms doesn't later add to them and form an
unexpected cycle (REQUIRES_EXPR_EXTRA_ARGS (t) would indirectly point to t).
We also need to make tsubst_template_args handle missing template
arguments, since the arguments we capture from coerce_template_parms
are incomplete at that point.
PR c++/101181
gcc/cp/ChangeLog:
* constraint.cc (tsubst_requires_expr): Pass complain/in_decl to
add_extra_args.
* cp-tree.h (add_extra_args): Add complain/in_decl parameters.
* pt.c (build_extra_args): Make a copy of args.
(add_extra_args): Add complain/in_decl parameters. Enable the
code for handling the case where the extra arguments are
dependent.
(tsubst_pack_expansion): Pass complain/in_decl to
add_extra_args.
(tsubst_template_args): Handle missing template arguments.
(tsubst_expr) <case IF_STMT>: Pass complain/in_decl to
add_extra_args.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/concepts-requires26.C: New test.
* g++.dg/cpp2a/lambda-uneval16.C: New test.
Patrick Palka [Fri, 9 Jul 2021 14:20:22 +0000 (10:20 -0400)]
c++: find_template_parameters and TEMPLATE_DECLs [PR101247]
r12-1989 fixed the testcase in the PR, but unfortunately the fix is
buggy: it breaks the case where the common template between the
TEMPLATE_DECL t and ctx_parms is the innermost template (as in
concepts-memtmpl5.C below). This can be fixed by instead passing the
TREE_TYPE of ctmpl to common_enclosing_class when ctmpl is a class
template.
But even after that's fixed, the analogous case where the innermost
template is a partial specialization is still broken (as in
concepts-memtmpl5a.C below), because ctmpl is always a primary template.
So this patch instead takes a different approach that doesn't rely on
ctx_parms at all: when looking for the template parameters of a
TEMPLATE_DECL that are shared with the current template context, just
walk its DECL_CONTEXT. As long as the template is not overly general
(e.g. we didn't pass it through most_general_template), this should give
us exactly what we want, since if a TEMPLATE_DECL can be referred to
from some template context then the template parameters it uses must all
be in-scope and contained in its DECL_CONTEXT. This effectively makes
us treat TEMPLATE_DECLs more similarly to other _DECLs (whose DECL_CONTEXT
we also walk).
PR c++/101247
gcc/cp/ChangeLog:
* pt.c (any_template_parm_r) <case TEMPLATE_DECL>: Just walk the
DECL_CONTEXT.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/concepts-memtmpl4.C: Uncomment the commented out
example, which we now handle correctly.
* g++.dg/cpp2a/concepts-memtmpl5.C: New test.
* g++.dg/cpp2a/concepts-memtmpl5a.C: New test.
Patrick Palka [Fri, 2 Jul 2021 17:54:57 +0000 (13:54 -0400)]
c++: unqualified member template in constraint [PR101247]
Here any_template_parm_r is failing to mark the template parameters
implicitly used by the unqualified use of 'd' inside the constraint
because the code to do so assumes each level of a template parameter
list points to the corresponding primary template, but here the
parameter level for A in the out-of-line definition of A::B does not
(nor do the parameter levels for A and C in the definition of A::C),
which causes us to overlook the sharing.
So it seems we can't in general depend on the TREE_TYPE of a template
parameter level being non-empty here. This patch partially fixes this
by rewriting the relevant part of any_template_parm_r to not depend on
the TREE_TYPE of outer levels. We still depend on the innermost level
to point to the innermost primary template, so we still crash on the
commented out line in the below testcase.
PR c++/101247
gcc/cp/ChangeLog:
* pt.c (any_template_parm_r) <case TEMPLATE_DECL>: Rewrite to
use common_enclosing_class and to not depend on the TREE_TYPE
of outer levels pointing to the corresponding primary template.
Patrick Palka [Thu, 1 Jul 2021 00:44:52 +0000 (20:44 -0400)]
c++: cxx_eval_array_reference and empty elem type [PR101194]
Here the initializer for x is represented as an empty CONSTRUCTOR due to
its empty element type. So during constexpr evaluation of the ARRAY_REF
x[0], we end up trying to value initialize the omitted element at index 0,
which fails because the element type is not default constructible.
This patch makes cxx_eval_array_reference specifically handle the case
where the element type is an empty type.
PR c++/101194
gcc/cp/ChangeLog:
* constexpr.c (cxx_eval_array_reference): When the element type
is an empty type and the corresponding element is omitted, just
return an empty CONSTRUCTOR instead of attempting value
initialization.
Patrick Palka [Thu, 24 Jun 2021 17:11:44 +0000 (13:11 -0400)]
c++: alias CTAD and aggregate deduction cand [PR98832]
During alias CTAD, we're accidentally ignoring the aggregate deduction
candidate for the underlying template because this guide is added
separately via maybe_aggr_guide (which doesn't yet handle alias
templates) instead of via deduction_guides_for (which does). This patch
makes maybe_aggr_guide handle alias templates in a manner similar to
deduction_guides_for.
PR c++/98832
gcc/cp/ChangeLog:
* pt.c (maybe_aggr_guide): Handle alias templates appropriately.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/class-deduction-alias9.C: New test.
Patrick Palka [Thu, 24 Jun 2021 15:29:02 +0000 (11:29 -0400)]
c++: requires-expression folding [PR101182]
Here we're crashing because cp_fold_function walks into the (templated)
requirements of a requires-expression outside a template, but the
folding routines aren't prepared to handle templated trees. This patch
fixes this by making cp_fold use evaluate_requires_expr to fold a
requires-expression as a whole, which also means we no longer need to
explicitly do so during gimplification. (Note that we delay folding
of such requires-expressions for the sake of better diagnostics when one is
used as the condition of a failed static_assert.)
PR c++/101182
gcc/cp/ChangeLog:
* constraint.cc (evaluate_requires_expr): Adjust function comment.
* cp-gimplify.c (cp_genericize_r) <case REQUIRES_EXPR>: Move to ...
(cp_fold) <case REQUIRES_EXPR>: ... here.
This rewrites ranges::minmax and ranges::minmax_element so that they
perform at most 3*N/2 comparisons, as required by the standard.
In passing, this also fixes PR100387 by avoiding a premature std::move
in ranges::minmax and in std::shift_right.
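As an illustration of how a 3*N/2 bound can be reached, here is a minimal
sketch of the classic pairwise technique. It is not the libstdc++
implementation: the names MinMaxResult and pairwise_minmax are hypothetical,
and the input is assumed to be a non-empty vector of int.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical result type: the extremes plus the number of comparisons made.
struct MinMaxResult {
    int min;
    int max;
    std::size_t comparisons;
};

// Pairwise min/max over a non-empty vector (assumption: v.size() >= 1).
// Elements are taken two at a time: one comparison orders the pair, one
// compares the smaller against the running min, one compares the larger
// against the running max. That is 3 comparisons per 2 elements.
MinMaxResult pairwise_minmax(const std::vector<int>& v) {
    MinMaxResult r{v[0], v[0], 0};
    std::size_t i = 1;
    for (; i + 1 < v.size(); i += 2) {
        int lo = v[i];
        int hi = v[i + 1];
        ++r.comparisons;
        if (hi < lo) std::swap(lo, hi);
        ++r.comparisons;
        if (lo < r.min) r.min = lo;
        ++r.comparisons;
        if (r.max < hi) r.max = hi;
    }
    // At most one element is left over; it costs at most 2 comparisons.
    if (i < v.size()) {
        ++r.comparisons;
        if (v[i] < r.min) {
            r.min = v[i];
        } else {
            ++r.comparisons;
            if (r.max < v[i]) r.max = v[i];
        }
    }
    return r;
}
```

Calling pairwise_minmax on a vector of N elements should report a comparison
count no larger than 3*N/2, whereas the naive approach of testing every
element against both the minimum and the maximum needs about 2*N.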
PR libstdc++/100387
libstdc++-v3/ChangeLog:
* include/bits/ranges_algo.h (__minmax_fn::operator()): Rewrite
to limit comparison complexity to 3*N/2.
(__minmax_element_fn::operator()): Likewise.
(shift_right): Avoid premature std::move of __result.
* testsuite/25_algorithms/minmax/constrained.cc (test04, test05):
New tests.
* testsuite/25_algorithms/minmax_element/constrained.cc (test02):
Likewise.
Patrick Palka [Mon, 7 Jun 2021 16:02:08 +0000 (12:02 -0400)]
c++: access of dtor named by qualified template-id [PR100918]
Here, when resolving the destructor named by Inner<int>::~Inner<int>
(which is valid until C++20) we end up in cp_parser_lookup_name called
indirectly from cp_parser_template_id to look up the name Inner from
the scope Inner<int>. The lookup naturally finds the injected-class-name,
and because the flag is_template is true, we adjust this lookup result
to the TEMPLATE_DECL Inner. We then check access of this adjusted
lookup result. But this access check fails because the lookup scope is
Inner<int> and the context_for_name_lookup for the TEMPLATE_DECL is
Outer (whereas for the injected-class-name it's also Inner<int>).
The simplest fix seems to be to check access of the original lookup
result (the injected-class-name) instead of the adjusted result (the
TEMPLATE_DECL). So this patch moves the access check in
cp_parser_lookup_name to before the injected-class-name adjustment.
PR c++/100918
gcc/cp/ChangeLog:
* parser.c (cp_parser_lookup_name): Check access of the lookup
result before we potentially adjust an injected-class-name to
its TEMPLATE_DECL.
Patrick Palka [Wed, 26 May 2021 12:35:31 +0000 (08:35 -0400)]
c++: Fix reference NTTP binding to noexcept fn [PR97420]
Here, in C++17 mode, convert_nontype_argument_function is rejecting
binding a non-noexcept function reference template parameter to a
noexcept function (encoded as the template argument '*(int (&) (int)) &f').
The first roadblock to making this work is that the argument is wrapped
in an implicit INDIRECT_REF, so we need to unwrap it before calling
strip_fnptr_conv.
The second roadblock is that the NOP_EXPR cast converts from a function
pointer type to a reference type while simultaneously removing the
noexcept qualification, and fnptr_conv_p doesn't consider this cast to
be a function pointer conversion. This patch fixes this by making
fnptr_conv_p treat REFERENCE_TYPEs and POINTER_TYPEs interchangeably.
Finally, in passing, this patch also simplifies noexcept_conv_p by
removing a bunch of redundant checks already performed by its only
caller fnptr_conv_p.
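For illustration, a minimal sketch of the kind of code affected, assuming
C++17 or later (where noexcept is part of the function type). This is not the
testcase from the PR; the names twice, apply and apply_twice are hypothetical.

```cpp
// Hypothetical noexcept function; its type is 'int (int) noexcept'.
int twice(int x) noexcept { return 2 * x; }

// The template parameter is a reference to a function type without noexcept.
template <int (&Fn)(int)>
int apply(int x) { return Fn(x); }

// Binding the non-noexcept reference parameter to the noexcept function.
// A compiler with the fix described above is expected to accept this.
int apply_twice(int x) { return apply<twice>(x); }
```

The template argument here needs a conversion that drops the noexcept
qualification while binding a reference, which is the combination the message
above describes.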
PR c++/97420
gcc/cp/ChangeLog:
* cvt.c (noexcept_conv_p): Remove redundant checks and simplify.
(fnptr_conv_p): Don't call non_reference. Use INDIRECT_TYPE_P
instead of TYPE_PTR_P.
* pt.c (convert_nontype_argument_function): Look through
implicit INDIRECT_REFs before calling strip_fnptr_conv.
Richard Biener [Mon, 5 Jul 2021 09:53:07 +0000 (11:53 +0200)]
middle-end/101291 - set loop copy of versioned loop
This fixes the vectorizer loop versioning code failing to clear
niter related info on the scalar loop as it assumed get_loop_copy
would work even for the outermost loop. The patch makes that
assumption hold by adjusting the loop versioning code.
2021-07-05 Richard Biener <rguenther@suse.de>
PR middle-end/101291
* cfgloopmanip.c (loop_version): Set the loop copy of the
versioned loop to the new loop.
Richard Biener [Fri, 28 May 2021 12:26:06 +0000 (14:26 +0200)]
tree-optimization/100778 - avoid cross-BB vectorization of trapping op
This avoids vectorizing a possibly trapping operation when lanes
are handled in different BBs. I spotted this when working on the
originally reported issue in PR100778.
2021-05-28 Richard Biener <rguenther@suse.de>
PR tree-optimization/100778
* tree-vect-slp.c (vect_build_slp_tree_1): Prevent possibly
trapping ops in different BBs.
Jason Merrill [Fri, 9 Jul 2021 17:50:01 +0000 (13:50 -0400)]
c++: concepts TS and explicit specialization [PR101098]
duplicate_decls was not recognizing the explicit specialization as matching
the implicit specialization of g<Y> because
function_requirements_equivalent_p was seeing the C constraint on the
implicit one and not on the explicit.
PR c++/101098
gcc/cp/ChangeLog:
* decl.c (function_requirements_equivalent_p): Only compare
trailing requirements on a specialization.
Jason Merrill [Mon, 31 May 2021 16:36:25 +0000 (12:36 -0400)]
c++: missing dtor with -fno-elide-constructors [PR100838]
tf_no_cleanup only applies to the outermost TARGET_EXPR, and we already
clear it for nested calls in build_over_call, but in this case both
constructor calls came from convert_like, so we need to clear it in the
recursive call as well. This revealed that we were adding an extra
ck_rvalue in direct-initialization cases where it was wrong.
For GCC 11, limit the changes to -fno-elide-constructors.
PR c++/100838
gcc/cp/ChangeLog:
* call.c (convert_like_internal): Clear tf_no_cleanup when
recursing.
(build_user_type_conversion_1): Only add ck_rvalue if
LOOKUP_ONLYCONVERTING.
Jason Merrill [Wed, 26 May 2021 21:38:42 +0000 (17:38 -0400)]
c++: argument pack with expansion [PR86355]
This testcase revealed that we were using PACK_EXPANSION_EXTRA_ARGS a lot
more than necessary; use_pack_expansion_extra_args_p was meant to use it in the
case of corresponding arguments in different argument packs differing in
whether they are pack expansions, but it was mistakenly also returning true
for the case of a single argument pack containing both expansion and
non-expansion elements.
Surprisingly, just disabling that didn't lead to any regressions in the
testsuite; it seems other changes have prevented us getting to this point
for code that used to exercise it. So this patch limits the check to
arguments in the same position in the packs, and asserts that we never
actually see a mismatch.
PR c++/86355
gcc/cp/ChangeLog:
* pt.c (use_pack_expansion_extra_args_p): Don't compare
args from the same argument pack.
Martin Jambor [Fri, 9 Jul 2021 14:09:53 +0000 (16:09 +0200)]
ipa-sra: Fix thinko when overriding safe_to_import_accesses (PR 101066)
The "new" IPA-SRA has a more difficult job than the previous
not-truly-IPA version when identifying situations in which a parameter
passed by reference can be passed into a third function and only there
converted to one passed by value (and possibly "split" at the same
time).
In order to allow this, two conditions must be fulfilled. First the
call to the third function must happen before any modifications of
memory, because it could change the value passed by reference.
Second, in order to make sure we do not introduce new (invalid)
dereferences, the call must postdominate the entry BB.
The second condition is actually not necessary if the caller function
is also certain to dereference the pointer but the first one must
still hold. Unfortunately, the code making this overriding decision
also happens to trigger when the first condition is not fulfilled.
This is fixed in the following patch.
gcc/ChangeLog:
2021-06-16 Martin Jambor <mjambor@suse.cz>
PR ipa/101066
* ipa-sra.c (class isra_call_summary): New member
m_before_any_store, initialize it in the constructor.
(isra_call_summary::dump): Dump the new field.
(ipa_sra_call_summaries::duplicate): Copy it.
(process_scan_results): Set it.
(isra_write_edge_summary): Stream it.
(isra_read_edge_summary): Likewise.
(param_splitting_across_edge): Only override
safe_to_import_accesses if m_before_any_store is set.
Eric Botcazou [Fri, 9 Jul 2021 10:08:52 +0000 (12:08 +0200)]
Fix build failure on Windows with older binutils
This is the build failure on Windows with binutils for which GNU as accepts
the --gdwarf-5 switch but GNU ld generates broken binaries with DWARF 5.
We already have the HAVE_LD_BROKEN_PE_DWARF5 kludge to disable DWARF 5 in
this case but it only tames the DWARF version in the compiler, so the
driver still passes --gdwarf-5 when invoked on an assembly file with -g.
gcc/
PR target/101377
* gcc.c (ASM_DEBUG_DWARF_OPTION): Set again to --gdwarf2 in
the case where HAVE_AS_WORKING_DWARF_N_FLAG is not defined
and HAVE_LD_BROKEN_PE_DWARF5 is defined.
Marek Polacek [Thu, 8 Jul 2021 00:02:18 +0000 (20:02 -0400)]
c++: Fix noexcept with unevaluated operand [PR101087]
It sounds plausible that this assert
int f();
static_assert(noexcept(sizeof(f())));
should pass: sizeof produces a std::size_t and its operand is not
evaluated, so it can't throw. noexcept should only evaluate to
false for potentially evaluated operands. Therefore I think that
check_noexcept_r shouldn't walk into operands of sizeof/decltype/
alignof/typeof.
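A slightly larger sketch of the distinction, contrasting potentially
evaluated and unevaluated operands. The function names may_throw and
never_throws are hypothetical; both are only declared, which is fine because
they are never called at run time.

```cpp
#include <cstddef>

int may_throw();              // hypothetical; not declared noexcept
int never_throws() noexcept;  // hypothetical; declared noexcept

// Potentially evaluated operand: the call itself could throw.
constexpr bool call_is_noexcept = noexcept(may_throw());

// Potentially evaluated operand, but the callee is noexcept.
constexpr bool safe_call_is_noexcept = noexcept(never_throws());

// Unevaluated operand: sizeof never calls may_throw(), so with the change
// described above this is expected to be true.
constexpr bool sizeof_is_noexcept = noexcept(sizeof(may_throw()));
```

The first value is false and the second is true regardless of this patch;
only the third depends on whether the compiler walks into the operand of
sizeof.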
PR c++/101087
gcc/cp/ChangeLog:
* cp-tree.h (unevaluated_p): New.
* except.c (check_noexcept_r): Use it. Don't walk into
unevaluated operands.
Jason Merrill [Wed, 7 Jul 2021 21:57:40 +0000 (17:57 -0400)]
Revert "c++: Improve init handling"
Apparently looking through these codes means that in a template, we end up
feeding a TARGET_EXPR to fold_non_dependent_expr, which should never
happen. This is a broader issue, but for now let's just revert the change.
Jason Merrill [Thu, 24 Jun 2021 21:32:02 +0000 (17:32 -0400)]
c++: constexpr aggr init of empty class [PR101040]
This is basically the aggregate initializer version of PR97566; as in that
bug, we are trying to initialize empty field 'obj' in 'single' when there's
no CONSTRUCTOR entry for the 'single' base class subobject of 'derived'. As
with that bug, the fix is to stop trying to add entries for empty fields,
this time in cxx_eval_bare_aggregate.
The change to the other function isn't necessary for this version of
the patch, but seems worthwhile for robustness anyway.
This adjusts the loop interchange dependence checking to properly
guard all dependence checks with DDR_REVERSED_P or its inverse.
2021-07-07 Richard Biener <rguenther@suse.de>
PR tree-optimization/101173
PR tree-optimization/101280
* gimple-loop-interchange.cc
(tree_loop_interchange::valid_data_dependences): Properly
guard all dependence checks with DDR_REVERSED_P or its
inverse.
Richard Biener [Tue, 22 Jun 2021 10:13:44 +0000 (12:13 +0200)]
middle-end/101156 - remove not working optimization in gimplification
This removes a premature and not working optimization from the
gimplifier. When gimplification is requested not to produce a SSA
name we try to avoid generating a copy when we did so anyway but
instead replace the LHS of its definition. But that only works in
case there are no uses of the SSA name already which is something
we cannot easily check, so the following removes said optimization.
Statistics on the whole bootstrap show we hit this optimization
only for libiberty/cp-demangle.c and overall we have 21652112
gimplifications where just 240 copies are elided. Preserving
the optimization would require scanning the original expression
and the pre and post sequences for SSA names and uses, that seems
excessive to avoid these 240 copies.
Richard Biener [Tue, 8 Jun 2021 10:52:12 +0000 (12:52 +0200)]
tree-optimization/100923 - fix alias-ref construction wrt availability
This PR shows that building an ao_ref from value-numbers is prone to
expose bogus contextual alias info to the oracle. The following makes
sure to construct ao_refs from SSA names available at the program point
only.
On the way it modifies the awkward valueize_refs[_1] API.
2021-06-08 Richard Biener <rguenther@suse.de>
PR tree-optimization/100923
* tree-ssa-sccvn.c (valueize_refs_1): Take a pointer to
the operand vector to be valueized.
(valueize_refs): Likewise.
(valueize_shared_reference_ops_from_ref): Adjust.
(valueize_shared_reference_ops_from_call): Likewise.
(vn_reference_lookup_3): Likewise.
(vn_reference_lookup_pieces): Likewise. Re-valueize
with honoring availability when we are about to create
the ao_ref and valueized before.
(vn_reference_lookup): Likewise.
(vn_reference_insert_pieces): Adjust.
Richard Biener [Wed, 16 Jun 2021 07:49:18 +0000 (09:49 +0200)]
tree-optimization/101088 - fix SM invalidation issue
When we face a sm_ord vs sm_unord for the same ref during
store sequence merging we assert that the ref is already marked
unsupported. But it can be that it will only be marked so
during the ongoing merging, so instead of asserting, mark it here.
Also apply some optimization to not waste resources to search
for already unsupported refs.
2021-06-16 Richard Biener <rguenther@suse.de>
PR tree-optimization/101088
* tree-ssa-loop-im.c (sm_seq_valid_bb): Only look for
supported refs on edges. Do not assert same ref but
different kind stores are unsupported but mark them so.
(hoist_memory_references): Only look for supported refs
on exits.
David Edelsohn [Thu, 20 May 2021 18:07:18 +0000 (14:07 -0400)]
aix: collect2 text files in archive
Rust places text files in archives. AIX ld ignores such files with a
warning. The collect2 wrapper for ld had been exiting with a fatal
error if it scanned an archive that contained a non-COFF file.
This patch updates collect2.c to issue a warning and ignore the file
member, matching the behavior of AIX ld. GCC can encounter archives
created by Rust and should not issue a fatal error. This changes
fatal_error to warning, with an implicit location and no associated
optimization flag.
gcc/ChangeLog:
2021-05-20 Clement Chigot <clement.chigot@atos.net>
David Edelsohn <dje.gcc@gmail.com>
* collect2.c (scan_prog_file): Issue non-fatal warning for
non-COFF files.
The D front-end semantic pass sometimes declares a temporary inside a
return expression. This is now detected with the RESULT_DECL replacing
the temporary, allowing for RVO to be done.
PR d/101273
gcc/d/ChangeLog:
* toir.cc (IRVisitor::visit (ReturnStatement *)): Detect returns that
use a temporary, and replace with return value.
d: RHS value lost when a target_expr modifies LHS in a cond_expr
To prevent the RHS of an assignment modifying the LHS before the
assignment proper, a target_expr is forced so that function calls that
return with slot optimization modify the temporary instead. This did
not work for conditional expressions however, to give one example. So
now the RHS is always forced to a temporary.
PR d/101282
gcc/d/ChangeLog:
* d-codegen.cc (build_assign): Force target_expr on RHS for non-POD
assignment expressions.