Jakub Jelinek [Thu, 20 May 2021 11:30:48 +0000 (13:30 +0200)]
openmp: Handle explicit linear clause properly in combined constructs with target [PR99928]
linear clause should have the effect of firstprivate+lastprivate (or for IVs
not declared in the construct lastprivate) on outer constructs and eventually
map(tofrom:) on target when combined with it.
2021-05-20 Jakub Jelinek <jakub@redhat.com>
PR middle-end/99928
* gimplify.c (gimplify_scan_omp_clauses) <case OMP_CLAUSE_LINEAR>: For
explicit linear clause when combined with target, make it map(tofrom:)
instead of no clause or firstprivate.
* c-c++-common/gomp/pr99928-4.c: Remove all xfails.
* c-c++-common/gomp/pr99928-5.c: Likewise.
Jason Merrill [Wed, 19 May 2021 21:33:21 +0000 (17:33 -0400)]
c++: _Complex template parameter [PR100634]
We were crashing because invalid_nontype_parm_type_p allowed _Complex
template parms, but convert_nontype_argument didn't know what to do for
them. Let's just disallow it; people can and should use std::complex
instead.
PR c++/100634
gcc/cp/ChangeLog:
* pt.c (invalid_nontype_parm_type_p): Return true for COMPLEX_TYPE.
Jason Merrill [Wed, 19 May 2021 20:40:24 +0000 (16:40 -0400)]
c++: ICE with using and enum [PR100659]
Here the code for 'using enum' is confused by the combination of a
using-decl and an enum that are not from 'using enum'; this CONST_DECL is
from the normal unscoped enum scoping.
PR c++/100659
gcc/cp/ChangeLog:
* cp-tree.h (CONST_DECL_USING_P): Check for null TREE_TYPE.
Jason Merrill [Tue, 18 May 2021 16:29:33 +0000 (12:29 -0400)]
c++: ICE with <=> fallback [PR100367]
Here, when genericizing lexicographical_compare_three_way, we haven't yet
walked the operands, so (a == a) still sees ADDR_EXPR <a>, but this is after
we've changed the type of a to REFERENCE_TYPE. When we try to fold (a == a)
by constexpr evaluation, the constexpr code doesn't understand trying to
take the address of a reference, and we end up crashing.
Fixed by avoiding constexpr evaluation in genericize_spaceship, by using
fold_build2 instead of build_new_op on scalar operands. Class operands
should have been expanded during parsing.
PR c++/100367
PR c++/96299
gcc/cp/ChangeLog:
* method.c (genericize_spaceship): Use fold_build2 for scalar
operands.
Christophe Lyon [Wed, 19 May 2021 14:45:54 +0000 (14:45 +0000)]
arm/testsuite: Fix testcase for PR99977
Some targets (e.g. arm-none-uclinuxfdpiceabi) do not support Thumb-1,
and since the testcase forces -march=armv8-m.base, we need to check
whether this option is actually supported.
Using dg-add-options arm_arch_v8m_base ensures that we pass -mthumb as
needed too.
Jakub Jelinek [Wed, 19 May 2021 07:41:55 +0000 (09:41 +0200)]
openmp: Handle lastprivate on combined target correctly [PR99928]
This patch deals with 2 issues:
1) the gimplifier couldn't differentiate between
#pragma omp parallel master
#pragma omp taskloop simd
and
#pragma omp parallel master taskloop simd
when there is a significant difference for clause handling between
the two; as master construct doesn't have any clauses, we don't currently
represent it during gimplification by a gimplification omp context at all,
so this patch makes sure we don't set OMP_PARALLEL_COMBINED on parallel master
when not combined further. If we ever add a separate master context during
gimplification, we'd use ORT_COMBINED_MASTER vs. ORT_MASTER (or MASKED probably).
2) lastprivate when combined with target should be map(tofrom:) on the target,
this change handles it only when not combined with firstprivate though, that
will need further work (similarly to linear or reduction).
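As a rough illustration of point 2, the sketch below spells out by hand what a combined `target parallel for lastprivate(last)` is meant to behave like: the target region maps the variable tofrom and the inner loop carries the lastprivate clause. The function name `last_index_seen` is hypothetical and not taken from the testsuite; without OpenMP enabled the pragmas are ignored and the loop runs serially with the same result.

```cpp
#include <cassert>

// Explicit spelling of the intended behaviour of the combined construct:
// `last` is mapped tofrom on the target, lastprivate on the inner loop.
int last_index_seen(int n) {
  int last = -1;
#pragma omp target map(tofrom : last)
#pragma omp parallel for lastprivate(last)
  for (int i = 0; i < n; ++i)
    last = i;  // value from the sequentially last iteration is copied out
  return last;
}
```

Calling `last_index_seen(10)` should give 9 whether or not the region is offloaded.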
2021-05-19 Jakub Jelinek <jakub@redhat.com>
PR middle-end/99928
gcc/
* tree.h (OMP_MASTER_COMBINED): Define.
* gimplify.c (gimplify_scan_omp_clauses): Rewrite lastprivate
handling for outer combined/composite constructs to a loop.
Handle lastprivate on combined target.
(gimplify_expr): Formatting fix.
gcc/c/
* c-parser.c (c_parser_omp_master): Set OMP_MASTER_COMBINED on
master when combined with taskloop.
(c_parser_omp_parallel): Don't set OMP_PARALLEL_COMBINED on
parallel master when not combined with taskloop.
gcc/cp/
* parser.c (cp_parser_omp_master): Set OMP_MASTER_COMBINED on
master when combined with taskloop.
(cp_parser_omp_parallel): Don't set OMP_PARALLEL_COMBINED on
parallel master when not combined with taskloop.
gcc/testsuite/
* c-c++-common/gomp/pr99928-2.c: Remove all xfails.
* c-c++-common/gomp/pr99928-12.c: New test.
Jason Merrill [Tue, 18 May 2021 21:15:42 +0000 (17:15 -0400)]
c++: ICE with bad definition of decimal32 [PR100261]
The change to only look at the global binding for non-classes meant that
here, when dealing with decimal32 which is magically mangled like its first
non-static data member, we got a collision with the mangling for float.
Fixed by also looking up an existing binding for such magical classes.
Here we have a pack expansion of a template template parameter pack, of
which the pattern is a TEMPLATE_DECL, which strip_typedefs doesn't want to
see.
PR c++/100372
gcc/cp/ChangeLog:
* tree.c (strip_typedefs): Only look at the pattern of a
TYPE_PACK_EXPANSION if it's a type.
Jason Merrill [Tue, 18 May 2021 16:06:36 +0000 (12:06 -0400)]
c++: "perfect" implicitly deleted move [PR100644]
Here we were ignoring the template constructor because the implicit move
constructor had all perfect conversions. But CWG1402 says that an
implicitly deleted move constructor is ignored by overload resolution; we
implement that instead by preferring any other candidate in joust, to get
better diagnostics, but that means we need to handle that case here as well.
gcc/cp/ChangeLog:
PR c++/100644
* call.c (perfect_candidate_p): An implicitly deleted move
is not perfect.
Jason Merrill [Wed, 14 Apr 2021 15:24:50 +0000 (11:24 -0400)]
c++: constant expressions are evaluated [PR93314]
My GCC 11 patch for PR93314 turned off cp_unevaluated_operand while
processing an id-expression that names a non-static data member, but the
broader issue is that in general, a constant-expression is evaluated even in
an unevaluated operand.
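A minimal sketch of the general point, not the PR's testcase: `sizeof` never evaluates its operand, yet an array bound written inside it is a constant-expression and has to be computed. The names `twice` and `bound_in_unevaluated_operand` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

constexpr int twice(int n) { return 2 * n; }

// The operand of sizeof is unevaluated, but the array bound inside it is a
// constant-expression, so twice(4) must still be computed by the compiler.
constexpr std::size_t bytes = sizeof(char[twice(4)]);
static_assert(bytes == 8, "array bound was evaluated at compile time");

std::size_t bound_in_unevaluated_operand() { return bytes; }
```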
This also fixes 100205, introduced by the earlier patch that couldn't
distinguish between the different allow_non_constant_p cases.
PR c++/100205
PR c++/93314
gcc/cp/ChangeLog:
* cp-tree.h (cp_evaluated): Add reset parm to constructor.
* parser.c (cp_parser_constant_expression): Change
allow_non_constant_p to int. Use cp_evaluated.
(cp_parser_initializer_clause): Pass 2 to allow_non_constant_p.
* semantics.c (finish_id_expression_1): Don't mess with
cp_unevaluated_operand here.
Tom de Vries [Tue, 18 May 2021 06:24:00 +0000 (08:24 +0200)]
[nvptx] Handle memmodel for atomic ops
The atomic ops in nvptx.md have memmodel arguments, which are currently
ignored.
Handle these, fixing test-case fails libgomp.c-c++-common/reduction-{5,6}.c
on volta.
Tested libgomp on x86_64-linux with nvptx accelerator.
gcc/ChangeLog:
2021-05-17 Tom de Vries <tdevries@suse.de>
PR target/100497
* config/nvptx/nvptx-protos.h (nvptx_output_atomic_insn): Declare.
* config/nvptx/nvptx.c (nvptx_output_barrier)
(nvptx_output_atomic_insn): New function.
(nvptx_print_operand): Add support for 'B'.
* config/nvptx/nvptx.md: Use nvptx_output_atomic_insn for atomic
insns.
openmp: Notify team barrier of pending tasks in omp_fulfill_event
The team barrier should be notified of any new tasks that become runnable
as the result of a completing task, otherwise the barrier threads might
not resume processing available tasks, resulting in a hang.
Jonathan Wakely [Mon, 17 May 2021 10:54:06 +0000 (11:54 +0100)]
libstdc++: Fix filesystem::path constraints for volatile [PR 100630]
The constraint check for filesystem::path construction uses
decltype(__is_path_src(declval<Source>())) which means it considers
conversion from an rvalue. When Source is a volatile-qualified type
it cannot use is_path_src(const Unknown&) because a const lvalue
reference can only bind to a non-volatile rvalue.
Since the relevant path members all have a const Source& parameter,
the constraint should be defined in terms of declval<const Source&>(),
not declval<Source>(). This avoids the problem of volatile-qualified
rvalues, because we no longer use an rvalue at all.
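The binding rule behind this can be sketched with a small detection idiom. This is an illustration only: `probe` and `probe_callable` are hypothetical stand-ins, not the library's internal helpers, and `Source` is just one example of a volatile-qualified type.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for an overload taking its argument by const reference.
template <typename T>
char probe(const T&);

// Detects whether probe() is callable with an expression of type Expr.
template <typename Expr, typename = void>
struct probe_callable : std::false_type {};

template <typename Expr>
struct probe_callable<Expr, decltype(void(probe(std::declval<Expr>())))>
    : std::true_type {};

using Source = const char* volatile;

// declval<Source>() is a volatile rvalue: const T& cannot bind to it.
constexpr bool from_rvalue = probe_callable<Source>::value;
// declval<const Source&>() is an lvalue, so the binding is fine.
constexpr bool from_const_lvalue = probe_callable<const Source&>::value;

static_assert(!from_rvalue, "volatile rvalue is rejected");
static_assert(from_const_lvalue, "const lvalue is accepted");
```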
libstdc++-v3/ChangeLog:
PR libstdc++/100630
* include/experimental/bits/fs_path.h (__is_constructible_from):
Test construction from a const lvalue, not an rvalue.
* testsuite/27_io/filesystem/path/construct/100630.cc: New test.
* testsuite/experimental/filesystem/path/construct/100630.cc:
New test.
libstdc++-v3/ChangeLog:
* include/bits/atomic_wait.h (__waiter::_M_do_wait_v): loop
until value change observed.
(__waiter_base::_M_laundered): New member.
(__waiter_base::_M_notify): Check _M_laundered to determine
whether to wake one or all.
(__detail::__atomic_compare): Return true if call to
__builtin_memcmp() == 0.
(__waiter_base::_S_do_spin_v): Adjust predicate.
* testsuite/29_atomics/atomic/wait_notify/100334.cc: New
test.
Alex Coplan [Tue, 27 Apr 2021 13:56:15 +0000 (14:56 +0100)]
arm: Fix ICEs with compare-and-swap and -march=armv8-m.base [PR99977]
The PR shows two ICEs with __sync_bool_compare_and_swap and
-mcpu=cortex-m23 (equivalently, -march=armv8-m.base): one in LRA and one
later on, after the CAS insn is split.
The LRA ICE occurs because the
@atomic_compare_and_swap<CCSI:arch><SIDI:mode>_1 pattern attempts to tie
two output operands together (operands 0 and 1 in the third
alternative). LRA can't handle this, since it doesn't make sense for an
insn to assign to the same operand twice.
The later (post-splitting) ICE occurs because the expansion of the
cbranchsi4_scratch insn doesn't quite go according to plan. As it
stands, arm_split_compare_and_swap calls gen_cbranchsi4_scratch,
attempting to pass a register (neg_bval) to use as a scratch register.
However, since the RTL template has a match_scratch here,
gen_cbranchsi4_scratch ignores this argument and produces a scratch rtx.
Since this is all happening after RA, this is doomed to fail (and we get
an ICE about the insn not matching its constraints).
It seems that the motivation for the choice of constraints in the
atomic_compare_and_swap pattern comes from an attempt to satisfy the
constraints of the cbranchsi4_scratch insn. This insn requires the
scratch register to be the same as the input register in the case that
we use a larger negative immediate (one that satisfies J, but not L).
Of course, as noted above, LRA refuses to assign two output operands to
the same register, so this was never going to work.
The solution I'm proposing here is to collapse the alternatives to the
CAS insn (allowing the two output register operands to be matched to
different registers) and to ensure that the constraints for
cbranchsi4_scratch are met in arm_split_compare_and_swap. We do this by
inserting a move to ensure the source and destination registers match if
necessary (i.e. in the case of large negative immediates).
Another notable change here is that we only do:
emit_move_insn (neg_bval, const1_rtx);
for non-negative immediates. This is because the ADDS instruction used in
the negative case suffices to leave a suitable value in neg_bval: if the
operands compare equal, we don't take the branch (so neg_bval will be
set by the load exclusive). Otherwise, the ADDS will leave a nonzero
value in neg_bval, which will correctly signal that the CAS has failed
when it is later negated.
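For orientation, source of roughly the following shape exercises the negative-immediate path described above when built for such a target. It is a sketch, not the contents of pr99977.c; the function names and the constant -8 are assumptions chosen for illustration.

```cpp
#include <cassert>

// Compare against a negative immediate, the case the split code has to
// handle specially when only Thumb-1 instructions are available.
bool swap_if_minus_eight(int* p, int desired) {
  return __sync_bool_compare_and_swap(p, -8, desired);
}

// Returns the value left in memory after attempting the swap.
int value_after_cas(int initial, int desired) {
  int v = initial;
  swap_if_minus_eight(&v, desired);
  return v;
}
```

On any target the observable behaviour is the same: the store happens only when the old value equals -8.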
gcc/ChangeLog:
PR target/99977
* config/arm/arm.c (arm_split_compare_and_swap): Fix up codegen
with negative immediates: ensure we expand cbranchsi4_scratch
correctly and ensure we satisfy its constraints.
* config/arm/sync.md
(@atomic_compare_and_swap<CCSI:arch><NARROW:mode>_1): Don't
attempt to tie two output operands together with constraints;
collapse two alternatives.
(@atomic_compare_and_swap<CCSI:arch><SIDI:mode>_1): Likewise.
* config/arm/thumb1.md (cbranchsi4_neg_late): New.
gcc/testsuite/ChangeLog:
PR target/99977
* gcc.target/arm/pr99977.c: New test.
Richard Biener [Mon, 17 May 2021 06:51:03 +0000 (08:51 +0200)]
Update mpfr version to 3.1.6
This updates the mpfr version to 3.1.6, which is the last bugfix
release from the 3.1.x series, and avoids our configury printing that
the version is buggy but acceptable.
2021-05-17 Richard Biener <rguenther@suse.de>
contrib/ChangeLog:
* download_prerequisites: Update mpfr version to 3.1.6.
* prerequisites.md5: Update.
* prerequisites.sha512: Likewise.
Jakub Jelinek [Fri, 14 May 2021 08:17:19 +0000 (10:17 +0200)]
openmp: Add testcases to verify OpenMP 5.0 2.14 and OpenMP 5.1 2.17 rules [PR99928]
In preparation of PR99928 patch review, I've prepared testcases with clauses
that need more interesting handling on combined/composite constructs,
in particular firstprivate, lastprivate, firstprivate+lastprivate, linear
(explicit on non-iv, explicit on simd iv, implicit on simd iv, implicit on
simd iv declared in the construct), reduction (scalars, array sections of
array variables, array sections with pointer bases) and in_reduction.
OpenMP 5.0 had the wording broken for reduction, the intended rule to use
map(tofrom:) on target when combined with it was bound only on inscan modifier
presence which makes no sense, as then inscan may not be used, this has
been fixed in 5.1 and I'm just assuming 5.1 wording for that.
There are various cases where e.g. for historical or optimization reasons
GCC slightly deviates from the rules, but in most cases it is something
that shouldn't be really observable, e.g. whether
#pragma omp parallel for firstprivate(x)
is handled as
#pragma omp parallel shared(x)
#pragma omp for firstprivate(x)
or
#pragma omp parallel firstprivate(x)
#pragma omp for
shouldn't be possible to distinguish in user code. I've added FIXMEs
in the testcases about that, but maybe we just should keep it as is
(alternative would be to do it in standard compliant way and transform
into whatever we like after gimplification (e.g. early during omplower)).
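A small sketch of why the two decompositions are indistinguishable to user code: all three spellings below compute the same sum. The function names are hypothetical, and the reduction clause is added here only to make the result observable; without OpenMP enabled the pragmas are ignored and the loops run serially.

```cpp
#include <cassert>

// Combined construct as the user writes it.
long combined_form(int x, int n) {
  long sum = 0;
#pragma omp parallel for firstprivate(x) reduction(+ : sum)
  for (int i = 0; i < n; ++i)
    sum += static_cast<long>(x) * i;
  return sum;
}

// Decomposition with shared on parallel and firstprivate on for.
long firstprivate_on_for(int x, int n) {
  long sum = 0;
#pragma omp parallel shared(x, sum)
#pragma omp for firstprivate(x) reduction(+ : sum)
  for (int i = 0; i < n; ++i)
    sum += static_cast<long>(x) * i;
  return sum;
}

// Decomposition with firstprivate on parallel.
long firstprivate_on_parallel(int x, int n) {
  long sum = 0;
#pragma omp parallel firstprivate(x) shared(sum)
#pragma omp for reduction(+ : sum)
  for (int i = 0; i < n; ++i)
    sum += static_cast<long>(x) * i;
  return sum;
}
```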
In some cases, for historical reasons, we even implement clauses on
constructs which in the standard don't accept them that way and then
handle those magically in omp lowering/expansion, in particular e.g.
#pragma omp parallel for firstprivate(x) lastprivate(x)
we treat as
#pragma omp parallel firstprivate(x) lastprivate(x)
#pragma omp for
even when lastprivate is not valid on parallel. Maybe one day we
could change that if we make sure we don't regress generated code quality.
I've also found a bug in OpenMP 5.0/5.1,
#pragma omp parallel sections firstprivate(x) lastprivate(x)
incorrectly says that it should be handled as
#pragma omp parallel firstprivate(x)
#pragma omp sections lastprivate(x)
which when written that way results in error; filed as
https://github.com/OpenMP/spec/issues/2758
to be fixed in OpenMP 5.2. GCC handles it the way it used to do
and users expect, so nothing to fix on the GCC side.
Also, we don't yet support the in_reduction clause on the target construct,
which means the -11.c testcase can't include any tests about in_reduction
handling on all the composite constructs that include target.
The work found two kinds of bugs on the GCC side, one is the known thing
that we still implement the 4.5 behavior and don't mark for
lastprivate/linear/reduction the list item as map(tofrom:) as mentioned
in PR99928. These cases are xfailed in the tests.
And another one is with r21 and r28 in -{8,9,10}.c tests - we don't add
reduction clause on teams for
#pragma omp {target ,}teams distribute simd reduction(+:r)
even when the spec says that teams shouldn't receive reduction only
when combined with loop construct.
In
make check-gcc check-g++ RUNTESTFLAGS='--target_board=unix\{-m32,-m64\} gomp.exp=pr99928*'
testing this shows:
# of expected passes 5648
# of expected failures 872
and with Tobias' patch applied:
# of expected passes 5648
# of unexpected successes 384
# of expected failures 488
2021-05-13 Jakub Jelinek <jakub@redhat.com>
PR middle-end/99928
* c-c++-common/gomp/pr99928-1.c: New test.
* c-c++-common/gomp/pr99928-2.c: New test.
* c-c++-common/gomp/pr99928-3.c: New test.
* c-c++-common/gomp/pr99928-4.c: New test.
* c-c++-common/gomp/pr99928-5.c: New test.
* c-c++-common/gomp/pr99928-6.c: New test.
* c-c++-common/gomp/pr99928-7.c: New test.
* c-c++-common/gomp/pr99928-8.c: New test.
* c-c++-common/gomp/pr99928-9.c: New test.
* c-c++-common/gomp/pr99928-10.c: New test.
* c-c++-common/gomp/pr99928-11.c: New test.
Tobias Burnus [Fri, 14 May 2021 07:30:01 +0000 (09:30 +0200)]
OpenMP: Support complex/float in && and || reduction
C/C++ permit logical AND and logical OR also with floating-point or complex
arguments by doing an unequal zero comparison; the result is an 'int' with
value one or zero. Hence, those are also permitted as reduction variable,
even though it is not the most sensible thing to do.
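The language rule can be seen in a serial model of such a reduction, shown below without any OpenMP clause. The helper names and sample arrays are hypothetical; note that in C++ the result of `&&` and `||` is `bool` rather than `int`, which converts back to 0.0 or 1.0 when stored in the floating-point accumulator.

```cpp
#include <cassert>

// Sample data for illustration only.
const double kMixed[] = {0.5, -2.0, 0.0};
const double kNonzero[] = {0.5, -2.0, 3.0};

// Serial model of a `&&` reduction whose variable has floating-point type.
bool all_nonzero(const double* v, int n) {
  double acc = 1.0;  // identity for &&
  for (int i = 0; i < n; ++i)
    acc = acc && v[i];  // each operand is compared against zero
  return acc != 0.0;
}

// Serial model of a `||` reduction whose variable has floating-point type.
bool any_nonzero(const double* v, int n) {
  double acc = 0.0;  // identity for ||
  for (int i = 0; i < n; ++i)
    acc = acc || v[i];
  return acc != 0.0;
}
```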
gcc/c/ChangeLog:
* c-typeck.c (c_finish_omp_clauses): Accept float + complex
for || and && reductions.
gcc/cp/ChangeLog:
* semantics.c (finish_omp_reduction_clause): Accept float + complex
for || and && reductions.
gcc/ChangeLog:
* omp-low.c (lower_rec_input_clauses, lower_reduction_clauses): Handle
&& and || with floating-point and complex arguments.
gcc/testsuite/ChangeLog:
* gcc.dg/gomp/clause-1.c: Use 'reduction(&:..)' instead of '...(&&:..)'.
libgomp/ChangeLog:
* testsuite/libgomp.c-c++-common/reduction-1.c: New test.
* testsuite/libgomp.c-c++-common/reduction-2.c: New test.
* testsuite/libgomp.c-c++-common/reduction-3.c: New test.
Tom de Vries [Fri, 14 May 2021 07:24:47 +0000 (09:24 +0200)]
Disable SIMT for user-defined reduction
The test-case included in this patch contains this target region:
...
for (int i0 = 0 ; i0 < N0 ; i0++ )
counter_N0.i += 1;
...
When running with nvptx accelerator, the counter variable is expected to
be N0 after the region, but instead is N0 / 32. The problem is that rather
than getting the result for all warp lanes, we get it for just one lane.
This is caused by the implementation of SIMT being incomplete. It handles
regular reductions, but apparently not user-defined reductions.
For now, handle this by disabling SIMT in this case, specifically by setting
sctx->max_vf to 1.
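For readers unfamiliar with user-defined reductions, a host-only sketch of the pattern follows. It is not the contents of target-44.c: the `Counter` type, the function name, and the use of `parallel for simd` instead of a target region are assumptions made to keep the example self-contained.

```cpp
#include <cassert>

struct Counter {
  int i = 0;  // private copies start from zero via default initialization
};

// User-defined reduction: combine two partial counters by adding them.
#pragma omp declare reduction(+ : Counter : omp_out.i += omp_in.i)

int count_iterations(int n0) {
  Counter counter;
#pragma omp parallel for simd reduction(+ : counter)
  for (int i0 = 0; i0 < n0; i0++)
    counter.i += 1;
  return counter.i;  // expected to equal n0, one increment per iteration
}
```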
Tested libgomp on x86_64-linux with nvptx accelerator.
gcc/ChangeLog:
2021-05-03 Tom de Vries <tdevries@suse.de>
PR target/100321
* omp-low.c (lower_rec_input_clauses): Disable SIMT for user-defined
reduction.
libgomp/ChangeLog:
2021-05-03 Tom de Vries <tdevries@suse.de>
PR target/100321
* testsuite/libgomp.c/target-44.c: New test.
Tom de Vries [Fri, 14 May 2021 07:21:36 +0000 (09:21 +0200)]
Handle alternative IV
Consider the test-case libgomp.c/pr81778.c added in this commit, with
this core loop (note: CANARY_SIZE set to 0 for simplicity):
...
int s = 1;
#pragma omp target simd
for (int i = N - 1; i > -1; i -= s)
a[i] = 1;
...
which, given that N is 32, sets a[0..31] to 1.
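A self-contained version of that loop, as a sketch: the wrapper `count_written` and the explicit map clause are additions for illustration, and the canary handling of the real testcase is left out. With a correct implementation every element is written exactly once for s equal to 1.

```cpp
#include <cassert>

constexpr int N = 32;

// Runs the loop with a variable step and reports how many elements were set.
int count_written(int s) {
  int a[N] = {};
#pragma omp target simd map(tofrom : a)
  for (int i = N - 1; i > -1; i -= s)
    a[i] = 1;
  int written = 0;
  for (int i = 0; i < N; ++i)
    written += a[i];
  return written;
}
```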
On nvptx, the first time bb6 is executed, i is in the 0..31 range (depending
on the lane that is executing) at bb entry.
So we have the following sequence:
- a[0..31] is set to 1
- i is updated to -32..-1
- D.3209 is updated to 1 (being 0 initially)
- bb19 is executed, and if condition (D.3209 < D.3213) == (1 < 32) evaluates
to true
- bb6 is once more executed, which should not happen because all the elements
that needed to be handled were already handled.
- consequently, elements that should not be written are written
- with CANARY_SIZE == 0, we may run into a libgomp error:
...
libgomp: cuCtxSynchronize error: an illegal memory access was encountered
...
and with CANARY_SIZE unmodified, we run into:
...
Expected 0, got 1 at base[-961]
Aborted (core dumped)
...
The cause of this is as follows:
- because the step s is a variable rather than a constant, an alternative
IV (D.3209 in our example) is generated in expand_omp_simd, and the
loop condition is tested in terms of the alternative IV rather than
the original IV (i in our example).
- the SIMT code in expand_omp_simd works by modifying step and initial value.
- The initial value fd->loop.n1 is loaded into a variable n1, which is
modified by the SIMT code and then used there-after.
- The step fd->loop.step is loaded into a variable step, which is modified
by the SIMT code, but afterwards there are uses of both step and
fd->loop.step.
- There are uses of fd->loop.step in the alternative IV handling code,
which should use step instead.
Fix this by introducing an additional variable orig_step, which is not
modified by the SIMT code and replacing all remaining uses of fd->loop.step
by either step or orig_step.
Built on x86_64-linux with nvptx accelerator, tested libgomp.
This fixes for-5.c and for-6.c FAILs I'm currently seeing on a quadro m1200
with driver 450.66.
gcc/ChangeLog:
2020-10-02 Tom de Vries <tdevries@suse.de>
* omp-expand.c (expand_omp_simd): Add orig_step, and replace uses of
fd->loop.step by either step or orig_step.
Tobias Burnus [Fri, 14 May 2021 07:44:19 +0000 (09:44 +0200)]
Fix fallout from merge from releases/gcc-11
Re-apply commit f963d6c79b1d52ce565c772166c3c3e1b1d0aa78 on nvptx.c:
"Remove duplicate SESE code in NVPTX backend"
The code was re-added by a wrongly resolved merge conflict.
Also apply to the OG11 code the change from the commit that caused the merge conflict:
commit da9c085ddbfe61e5954c8ec4e996240fa3a994c0
"nvptx: Fix up nvptx build against latest libstdc++ [PR100375]"
gcc/ChangeLog
* omp-sese.c (omp_sese_pseudo): Use nullptr instead of 0
as first argument of pseudo_node_t constructors.
* config/nvptx/nvptx.c (bb_pair_t, bb_pair_vec_t,
pseudo_node_t, bracket, bracket_vec_t,
bb_sese, bb_sese::~bb_sese, bb_sese::append, bb_sese::remove,
BB_SET_SESE, BB_GET_SESE, nvptx_sese_number, nvptx_sese_pseudo,
nvptx_sese_color, nvptx_find_sese): Remove again.
arm: Remove duplicate definitions from arm_mve.h (pr100419).
This patch removes several duplicated intrinsic definitions from
arm_mve.h mentioned in PR100419 and also fixes the wrong arguments
in a few of the intrinsics' polymorphic variants.
Richard Earnshaw [Thu, 13 May 2021 10:42:58 +0000 (11:42 +0100)]
arm: correctly handle inequality comparisons against max constants [PR100563]
Normally we expect the gimple optimizers to fold away comparisons that
are always true, but at some lower optimization levels this is not
always the case, so the back-end has to be able to generate correct
code in these cases.
In this example, we have a comparison of the form
(unsigned long long) op <= ~0ULL
which, of course is always true.
Normally, in the arm back-end we handle these expansions where the
immediate cannot be handled directly by adding 1 to the constant and
then adjusting the comparison operator:
(unsigned long long) op < CONST + 1
but we cannot do that when the constant is already the largest value.
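The overflow can be reproduced in plain C++ with 64-bit unsigned arithmetic. The helper names below are hypothetical and only model the rewrite; they are not back-end code.

```cpp
#include <cassert>
#include <cstdint>

// The comparison as written in the source.
bool le(std::uint64_t op, std::uint64_t c) { return op <= c; }

// The usual rewrite: op <= c becomes op < c + 1. For c == UINT64_MAX the
// addition wraps to 0, so the rewritten comparison is always false.
bool lt_plus_one(std::uint64_t op, std::uint64_t c) { return op < c + 1; }
```

For ordinary constants the two agree, but for the maximum value `le` is always true while `lt_plus_one` is always false.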
Fortunately, we observe that the comparisons we need to handle this
way are either always true or always false, so instead of forming a
comparison against the maximum value, we can replace it with a
comparison against the minimum value (which just happens to also be a
constant we can handle). So
gcc:
PR target/100563
* config/arm/arm.c (arm_canonicalize_comparison): Correctly
canonicalize DImode inequality comparisons against the
maximum integral value.
Patrick Palka [Fri, 30 Apr 2021 14:59:20 +0000 (10:59 -0400)]
libstdc++: Implement P2367 changes to avoid some list-initialization
This implements the wording changes of P2367R0 "Remove misuses of
list-initialization from Clause 24", modulo the parts that depend
on P1739R4 which we don't yet implement (due to LWG 3407).
libstdc++-v3/ChangeLog:
* include/bits/ranges_util.h (subrange::subrange): Avoid
list-initialization in delegating constructor.
* include/std/ranges (single_view): Replace implicit guide
with explicit deduction guide that decays its argument.
(_Single::operator()): Avoid CTAD when constructing the
single_view object.
(_Iota::operator()): Avoid list-initialization.
(__detail::__can_filter_view, _Filter::operator()): Likewise.
(__detail::__can_transform_view, _Transform::operator()): Likewise.
(take_view::begin): Likewise.
(__detail::__can_take_view, _Take::operator()): Likewise.
(__detail::__can_take_while_view, _TakeWhile::operator()): Likewise.
(__detail::__can_drop_view, _Drop::operator()): Likewise.
(__detail::__can_drop_while_view, _DropWhile::operator()): Likewise.
(split_view::split_view): Use views::single when initializing
_M_pattern.
(__detail::__can_split_view, _Split::operator()): Avoid
list-initialization.
(_Counted::operator()): Likewise.
* testsuite/std/ranges/p2367.cc: New test.
Jakub Jelinek [Wed, 12 May 2021 13:14:35 +0000 (15:14 +0200)]
libcpp: Fix up -fdirectives-only preprocessing of includes not ending with newline [PR100392]
If a header doesn't end with a new-line, with -fdirectives-only we right now
preprocess it as
int i = 1;# 2 "pr100392.c" 2
i.e. the line directive isn't on the next line, which means we fail to parse
it when compiling.
GCC 10 and earlier libcpp/directives-only.c had for this:
if (!pfile->state.skipping && cur != base)
{
/* If the file was not newline terminated, add rlimit, which is
guaranteed to point to a newline, to the end of our range. */
if (cur[-1] != '\n')
{
cur++;
CPP_INCREMENT_LINE (pfile, 0);
lines++;
}
cb->print_lines (lines, base, cur - base);
}
and we have the assertion
/* Files always end in a newline or carriage return. We rely on this for
character peeking safety. */
gcc_assert (buffer->rlimit[0] == '\n' || buffer->rlimit[0] == '\r');
So, this patch just re-adds more or less the same thing, so that we emit
a newline after the include even when it wasn't there before.
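The effect on the output can be modelled with a few lines of string handling. This is a hypothetical helper for illustration, not libcpp code: it appends the missing newline before the line marker is emitted.

```cpp
#include <cassert>
#include <string>

// Hypothetical model: make sure the included text ends with a newline so
// that the following line marker starts on its own line.
std::string emit_with_line_marker(std::string text, const std::string& marker) {
  if (!text.empty() && text.back() != '\n')
    text += '\n';
  return text + marker;
}
```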
2021-05-12 Jakub Jelinek <jakub@redhat.com>
PR preprocessor/100392
* lex.c (cpp_directive_only_process): If buffer doesn't end with '\n',
add buffer->rlimit[0] character to the printed range and
CPP_INCREMENT_LINE and increment line_count.
* gcc.dg/cpp/pr100392.c: New test.
* gcc.dg/cpp/pr100392.h: New file.
Jakub Jelinek [Wed, 12 May 2021 08:38:35 +0000 (10:38 +0200)]
expand: Don't reuse DEBUG_EXPRs with vector type if they have different modes [PR100508]
The inliner doesn't remap DEBUG_EXPR_DECLs, so the same decls can appear
in multiple functions.
Furthermore, expansion reuses corresponding DEBUG_EXPRs too, so they again
can be reused in multiple functions.
Neither of those is a major problem, DEBUG_EXPRs are just magic value holders
and what value they stand for is independent in each function and driven by
what debug stmts or DEBUG_INSNs they are bound to.
Except for DEBUG_EXPR*s with vector types, TYPE_MODE can be either BLKmode
or some vector mode depending on whether current function's enabled ISAs
support that vector mode or not. On the following testcase, we expand it
first in foo function without AVX2 enabled and so the DEBUG_EXPR is
BLKmode, but later the same DEBUG_EXPR_DECL is used in a simd clone with
AVX2 enabled and expansion ICEs because of a mode mismatch.
The following patch fixes that by forcing recreation of a DEBUG_EXPR if
there is a mode mismatch for vector typed DEBUG_EXPR_DECL, DEBUG_EXPRs
will be still reused in between functions otherwise and within the same
function the mode should be always the same.
2021-05-12 Jakub Jelinek <jakub@redhat.com>
PR middle-end/100508
* cfgexpand.c (expand_debug_expr): For DEBUG_EXPR_DECL with vector
type, don't reuse DECL_RTL if it has different mode, instead force
creation of a new DEBUG_EXPR.
Jakub Jelinek [Tue, 11 May 2021 07:07:47 +0000 (09:07 +0200)]
openmp: Fix up taskloop reduction ICE if taskloop has no iterations [PR100471]
When a taskloop doesn't have any iterations, GOMP_taskloop* takes an early
return, doesn't create any tasks and more importantly, doesn't create
a taskgroup and doesn't register task reductions. But, the code emitted
in the callers assumes task reductions have been registered and performs
the reduction handling and task reduction unregistration. The pointer
to the task reduction private variables is reused, on input it is the alignment
and only on output it is the pointer, so in the case of a taskloop with no iterations
the caller attempts to dereference the alignment value as if it was a pointer
and crashes. We could in the early returns register the task reductions
only to have them looped over and unregistered in the caller, but I think
it is better to tell the caller there is nothing to task reduce and bypass
all that.
2021-05-11 Jakub Jelinek <jakub@redhat.com>
PR middle-end/100471
* omp-low.c (lower_omp_task_reductions): For OMP_TASKLOOP, if data
is 0, bypass the reduction loop including
GOMP_taskgroup_reduction_unregister call.
* taskloop.c (GOMP_taskloop): If GOMP_TASK_FLAG_REDUCTION and not
GOMP_TASK_FLAG_NOGROUP, when doing early return clear the task
reduction pointer.
* testsuite/libgomp.c/task-reduction-4.c: New test.
RISC-V: For '-march' and '-mabi' options, add a 'Negative' property that mentions itself.
When using a multi-lib RISC-V toolchain, a bug is triggered when there are two
'-march' options on the command line.
riscv64-unknown-elf-gcc -march=rv32gcp -mabi=ilp32f -march=rv32gcpzp64 HelloWorld.c
/lhome/gengq/riscv64-linux-ptest/lib/gcc/riscv64-unknown-elf/10.2.0/../../../../riscv64-unknown-elf/bin/ld: /lhome/gengq/riscv64-linux-ptest/lib/gcc/riscv64-unknown-elf/10.2.0/../../../../riscv64-unknown-elf/lib/crt0.o: ABI is incompatible with that of the selected emulation:
target emulation `elf64-littleriscv' does not match `elf32-littleriscv'
/lhome/gengq/riscv64-linux-ptest/lib/gcc/riscv64-unknown-elf/10.2.0/../../../../riscv64-unknown-elf/bin/ld: failed to merge target specific data of file /lhome/gengq/riscv64-linux-ptest/lib/gcc/riscv64-unknown-elf/10.2.0/../../../../riscv64-unknown-elf/lib/crt0.o
/lhome/gengq/riscv64-linux-ptest/lib/gcc/riscv64-unknown-elf/10.2.0/../../../../riscv64-unknown-elf/bin/ld: /lhome/gengq/riscv64-linux-ptest/lib/gcc/riscv64-unknown-elf/10.2.0/crtbegin.o: ABI is incompatible with that of the selected emulation:
target emulation `elf64-littleriscv' does not match `elf32-littleriscv'
/lhome/gengq/riscv64-linux-ptest/lib/gcc/riscv64-unknown-elf/10.2.0/../../../../riscv64-unknown-elf/bin/ld: failed to merge target specific data of file /lhome/gengq/riscv64-linux-ptest/lib/gcc/riscv64-unknown-elf/10.2.0/crtbegin.o
......
This patch fixes it: the driver now prunes the extra '-march' and '-mabi'
options and keeps only the last one.
Jonathan Wakely [Tue, 11 May 2021 14:01:01 +0000 (15:01 +0100)]
libstdc++: Fix missing members in std::allocator<void>
The changes in 75c6a925dab5b7af9ab47c10906cb0e140261cc2 were slightly
incorrect, because the converting constructor should be noexcept, and
the POCMA and is_always_equal traits should still be present in C++20.
This fixes it, and slightly refactors the preprocessor conditions and
order of members. Also add comments explaining things.
The non-standard construct and destroy members added for PR 78052 can be
private if allocator_traits<allocator<void>> is made a friend.
libstdc++-v3/ChangeLog:
* include/bits/allocator.h (allocator<void>) [C++20]: Add
missing noexcept to constructor. Restore missing POCMA and
is_always_equal traits.
[C++17]: Make construct and destroy members private and
declare allocator_traits as a friend.
* include/bits/memoryfwd.h (allocator_traits): Declare.
* include/ext/malloc_allocator.h (malloc_allocator::allocate):
Add nodiscard attribute. Add static assertion for LWG 3307.
* include/ext/new_allocator.h (new_allocator::allocate): Add
static assertion for LWG 3307.
* testsuite/20_util/allocator/void.cc: Check that converting
constructor is noexcept. Check for propagation traits and
size_type and difference_type. Check that pointer and
const_pointer are gone in C++20.
Jonathan Wakely [Mon, 10 May 2021 20:06:22 +0000 (21:06 +0100)]
libstdc++: Remove TODO comment
We have a comment saying to replace the simple binary_semaphore type
with std::binary_semaphore, which has been done. However, that isn't
defined on all targets. So keep the simple one here that just implements
the parts of the API needed by <stop_token>, and remove the comment
suggesting it should be replaced.
Jonathan Wakely [Mon, 10 May 2021 19:46:38 +0000 (20:46 +0100)]
libstdc++: Implement proposed resolution to LWG 3548
This has been tentatively approved by LWG. The deleter from a unique_ptr
can be moved into the shared_ptr (at least, since LWG 2802). This uses
std::forward<_Del>(__r.get_deleter()) not std::move(__r.get_deleter())
because we don't want to convert the deleter to an rvalue when _Del is
an lvalue reference type.
This also adds a missing is_move_constructible_v<D> constraint to the
shared_ptr(unique_ptr<Y, D>&&) constructor, which is inherited from the
shared_ptr(Y*, D) constructor due to the use of "equivalent to" in the
specified effects.
libstdc++-v3/ChangeLog:
* include/bits/shared_ptr_base.h (__shared_count(unique_ptr&&)):
Initialize a non-reference deleter from an rvalue, as per LWG
3548.
(__shared_ptr::_UniqCompatible): Add missing constraint.
* testsuite/20_util/shared_ptr/cons/lwg3548.cc: New test.
* testsuite/20_util/shared_ptr/cons/unique_ptr_deleter.cc: Check
constraints.
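The choice of std::forward<_Del> over std::move can be illustrated at the type level. This is a sketch only; `lwg3548_sketch`, `Deleter`, and the alias names are hypothetical and do not appear in the libstdc++ sources.

```cpp
#include <memory>
#include <type_traits>
#include <utility>

namespace lwg3548_sketch {  // hypothetical namespace for this sketch

struct Deleter {
  void operator()(int* p) const { delete p; }
};

// Type of std::forward<Del>(u.get_deleter()) for an lvalue
// std::unique_ptr<int, Del> named u.
template<typename Del>
using forwarded_deleter_t = decltype(std::forward<Del>(
    std::declval<std::unique_ptr<int, Del>&>().get_deleter()));

// Type of std::move(u.get_deleter()) for the same u.
template<typename Del>
using moved_deleter_t = decltype(std::move(
    std::declval<std::unique_ptr<int, Del>&>().get_deleter()));

// Non-reference deleter: forward yields an rvalue, so it can be moved from.
constexpr bool value_deleter_is_rvalue =
    std::is_same<forwarded_deleter_t<Deleter>, Deleter&&>::value;

// Lvalue-reference deleter: forward keeps it an lvalue ...
constexpr bool ref_deleter_stays_lvalue =
    std::is_same<forwarded_deleter_t<Deleter&>, Deleter&>::value;

// ... whereas std::move would convert it to an rvalue.
constexpr bool move_makes_ref_deleter_rvalue =
    std::is_same<moved_deleter_t<Deleter&>, Deleter&&>::value;

}  // namespace lwg3548_sketch
```

All three constants evaluate to true, which is why forwarding moves a by-value deleter but leaves a referenced deleter untouched.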
Jonathan Wakely [Mon, 10 May 2021 15:22:54 +0000 (16:22 +0100)]
libstdc++: Remove redundant -std=gnu++17 option from remaining tests
GCC defaults to -std=gnu++17 now anyway, and using it explicitly in the
dg-options directive prevents running these tests with different modes
such as -std=c++17 or -std=gnu++20.
Jonathan Wakely [Mon, 10 May 2021 15:22:54 +0000 (16:22 +0100)]
libstdc++: Remove redundant -std=gnu++17 option from algorithm tests
GCC defaults to -std=gnu++17 now anyway, and using it explicitly in the
dg-options directive prevents running these tests with different modes
such as -std=c++17 or -std=gnu++20.
Jonathan Wakely [Mon, 10 May 2021 15:22:53 +0000 (16:22 +0100)]
libstdc++: Remove redundant -std=gnu++17 option from containers tests
GCC defaults to -std=gnu++17 now anyway, and using it explicitly in the
dg-options directive prevents running these tests with different modes
such as -std=c++17 or -std=gnu++20.
Jonathan Wakely [Mon, 10 May 2021 15:22:53 +0000 (16:22 +0100)]
libstdc++: Remove redundant -std=gnu++17 option from strings tests
GCC defaults to -std=gnu++17 now anyway, and using it explicitly in the
dg-options directive prevents running these tests with different modes
such as -std=c++17 or -std=gnu++20.
Jonathan Wakely [Mon, 10 May 2021 15:22:53 +0000 (16:22 +0100)]
libstdc++: Remove redundant -std=gnu++17 option from PMR tests
GCC defaults to -std=gnu++17 now anyway, and using it explicitly in the
dg-options directive prevents running these tests with different modes
such as -std=c++17 or -std=gnu++20.
Jonathan Wakely [Mon, 10 May 2021 15:22:53 +0000 (16:22 +0100)]
libstdc++: Remove redundant -std=gnu++17 option from concurrency tests
GCC defaults to -std=gnu++17 now anyway, and using it explicitly in the
dg-options directive prevents running these tests with different modes
such as -std=c++17 or -std=gnu++20.
Jonathan Wakely [Mon, 10 May 2021 15:22:53 +0000 (16:22 +0100)]
libstdc++: Remove redundant -std=gnu++17 option from any/optional/variant tests
GCC defaults to -std=gnu++17 now anyway, and using it explicitly in the
dg-options directive prevents running these tests with different modes
such as -std=c++17 or -std=gnu++20.
Jonathan Wakely [Mon, 10 May 2021 15:22:53 +0000 (16:22 +0100)]
libstdc++: Remove redundant -std=gnu++17 options from filesystem tests
GCC defaults to -std=gnu++17 now anyway, and using it explicitly in the
dg-options directive prevents running these tests with different modes
such as -std=c++17 or -std=gnu++20.
Jonathan Wakely [Mon, 10 May 2021 15:22:53 +0000 (16:22 +0100)]
libstdc++: Remove redundant -std=gnu++17 options from PSTL tests
GCC defaults to -std=gnu++17 now anyway, and using it explicitly in the
dg-options directive prevents running these tests with different modes
such as -std=c++17 or -std=gnu++20.
Jonathan Wakely [Mon, 10 May 2021 12:57:49 +0000 (13:57 +0100)]
libstdc++: Rename test type to avoid clashing with std::any
When PCH are enabled this test file includes <any> and so the
using-directive brings std::any into the global scope. It isn't
currently a problem, because the -std option in the dg-options means
that PCH is not used. If that option is removed, the test fails with PCH
and passes without.
This just renames the type to avoid the name clash (and also the 'none'
type for consistency).
libstdc++-v3/ChangeLog:
* testsuite/20_util/variant/compile.cc: Rename 'any' to avoid
clash with std::any.
Jonathan Wakely [Thu, 6 May 2021 12:40:53 +0000 (13:40 +0100)]
libstdc++: Fix definition of std::remove_cvref_t
I originally defined std::remove_cvref_t in terms of the internal
__remove_cvref_t trait, to avoid instantiating the remove_cvref class
template. However, as described in P1715R0 that is observable by users
and is thus non-conforming.
This defines remove_cvref_t as specified in the standard.
libstdc++-v3/ChangeLog:
* include/std/type_traits (remove_cvref_t): Define in terms of
remove_cvref.
* testsuite/20_util/remove_cvref/value.cc: Check alias.
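The form required by the standard can be sketched with stand-in names (`remove_cvref_sketch` is a hypothetical namespace, used so that the example also compiles before C++20). The `probe` overloads show one way the difference may become visible; this is an assumption about the kind of observation P1715R0 has in mind, not a quote from it. Both declarations name the same function template only because the alias is spelled as `typename remove_cvref<T>::type`; with an alias built on a different internal trait they would be two distinct overloads and the call would be ambiguous.

```cpp
#include <type_traits>

namespace remove_cvref_sketch {  // hypothetical stand-in for namespace std

template<class T>
struct remove_cvref {
  using type = typename std::remove_cv<
      typename std::remove_reference<T>::type>::type;
};

// Defined in terms of the class template, as the standard specifies.
template<class T>
using remove_cvref_t = typename remove_cvref<T>::type;

// Declaration using the class template directly ...
template<class T>
constexpr int probe(typename remove_cvref<T>::type);

// ... and a redeclaration (with definition) using the alias.
template<class T>
constexpr int probe(remove_cvref_t<T>) { return 1; }

}  // namespace remove_cvref_sketch
```

For example, `remove_cvref_sketch::probe<const volatile int&>(0)` resolves to the single template above and returns 1.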
Prior to C++20 it should be ill-formed to use std::make_shared with an
array type (and we don't support the C++20 feature to make it valid yet
anyway).
libstdc++-v3/ChangeLog:
PR libstdc++/99006
* include/bits/shared_ptr.h (allocate_shared): Assert that _Tp
is not an array type.
* include/bits/shared_ptr_base.h (__allocate_shared): Likewise.
* testsuite/20_util/shared_ptr/creation/99006.cc: New test.
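The kind of check described here can be sketched in user code. The wrapper below is hypothetical (`pr99006_sketch::make_shared_no_arrays` is not a libstdc++ function); it only illustrates rejecting array types with a static assertion before forwarding to std::make_shared.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

namespace pr99006_sketch {  // hypothetical namespace for this sketch

template<typename T, typename... Args>
std::shared_ptr<T> make_shared_no_arrays(Args&&... args) {
  // Reject array types at compile time instead of miscompiling or
  // failing deep inside the library.
  static_assert(!std::is_array<T>::value,
                "array types are rejected in this sketch");
  return std::make_shared<T>(std::forward<Args>(args)...);
}

inline void demo() {
  auto p = make_shared_no_arrays<int>(42);
  assert(*p == 42);
  // make_shared_no_arrays<int[4]>();  // would fail the static assertion
}

}  // namespace pr99006_sketch
```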
Philippe Blain [Sat, 13 Mar 2021 00:26:46 +0000 (19:26 -0500)]
libstdc++: Install libstdc++*-gdb.py more robustly [PR 99453]
In order for GDB to auto-load the pretty printers, they must be installed
as "libstdc++.$ext-gdb.py", where 'libstdc++.$ext' is the name of the
object file that is loaded by GDB [1], i.e. the libstdc++ shared library.
The approach taken in libstdc++-v3/python/Makefile.am is to loop over
files matching 'libstdc++*' in $(DESTDIR)$(toolexeclibdir) and choose
the last file matching that glob that is not a symlink, the Libtool
'*.la' file or a Python file.
That works fine for ELF targets where the matching names are:
Try to do a better job of installing the pretty printers with the
correct name by copying the approach taken by isl [2], that is, using
a sed invocation on the Libtool-generated 'libstdc++.la' to read the
correct name for the current platform.
Patrick Palka [Tue, 11 May 2021 17:19:46 +0000 (13:19 -0400)]
libstdc++: Remove extern "C" from Ryu sources
floating_to_chars.cc includes the Ryu sources into an anonymous
namespace as a convenient way to give all its symbols internal linkage.
But an entity declared extern "C" always has external linkage even
from within an anonymous namespace, so this trick doesn't work in the
presence of extern "C", and it causes the Ryu function generic_to_chars
to be visible from libstdc++.a.
This patch removes the only use of extern "C" from our local copy of
Ryu along with some declarations for never-defined functions that GCC
now warns about.
libstdc++-v3/ChangeLog:
* src/c++17/ryu/LOCAL_PATCHES: Update.
* src/c++17/ryu/ryu_generic_128.h: Remove extern "C".
Remove declarations for never-defined functions.
* testsuite/20_util/to_chars/4.cc: New test.
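The pattern at issue looks roughly like the sketch below. The function names are hypothetical and unrelated to the Ryu sources; the comments restate the behaviour described in the commit message rather than anything verified here.

```cpp
#include <cassert>

namespace {

// Ordinary function in an anonymous namespace: internal linkage, so the
// symbol is not visible to other translation units.
int internal_only(int x) { return x + 1; }

// Per the commit message, the extern "C" specification means this function
// still ends up with an externally visible symbol, despite the enclosing
// anonymous namespace.
extern "C" int still_exported(int x) { return x + 2; }

}  // namespace

inline void linkage_demo() {
  assert(internal_only(1) == 2);
  assert(still_exported(1) == 3);
}
```

Inspecting the resulting object file with a symbol-listing tool is one way to compare the visibility of the two functions.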
Chung-Lin Tang [Wed, 5 May 2021 15:11:19 +0000 (08:11 -0700)]
OpenMP 5.0: Implement relaxation of implicit map vs. existing device mappings
This patch implements relaxing the requirements when a map with the implicit
attribute encounters an overlapping existing map. As the OpenMP 5.0 spec
describes on page 320, lines 18-27 (and 5.1 spec, page 352, lines 13-22):
"If a single contiguous part of the original storage of a list item with an
implicit data-mapping attribute has corresponding storage in the device data
environment prior to a task encountering the construct that is associated with
the map clause, only that part of the original storage will have corresponding
storage in the device data environment as a result of the map clause."
Also tracked in the OpenMP spec context as issue #1463:
https://github.com/OpenMP/spec/issues/1463
* gomp-constants.h (GOMP_MAP_IMPLICIT): New special map kind bits value.
(GOMP_MAP_FLAG_SPECIAL_BITS): Define helper mask for whole set of
special map kind bits.
(GOMP_MAP_NONCONTIG_ARRAY_P): Adjust test for non-contiguous array map
kind bits to be more specific.
(GOMP_MAP_IMPLICIT_P): New predicate macro for implicit map kinds.
gcc/ChangeLog:
* tree.h (OMP_CLAUSE_MAP_IMPLICIT_P): New access macro for 'implicit'
bit, using 'base.deprecated_flag' field of tree_node.
* tree-pretty-print.c (dump_omp_clause): Add support for printing
implicit attribute in tree dumping.
* gimplify.c (gimplify_adjust_omp_clauses_1):
Set OMP_CLAUSE_MAP_IMPLICIT_P to 1 if map clause is implicitly created.
(gimplify_adjust_omp_clauses): Adjust place of adding implicitly created
clauses, from simple append, to starting of list, after non-map clauses.
* omp-low.c (lower_omp_target): Add GOMP_MAP_IMPLICIT bits into kind
values passed to libgomp for implicit maps.
* target.c (gomp_map_vars_existing): Add 'bool implicit' parameter, add
implicit map handling to allow a "superset" existing map as valid case.
(get_kind): Adjust to filter out GOMP_MAP_IMPLICIT bits in return value.
(get_implicit): New function to extract implicit status.
(gomp_map_fields_existing): Adjust arguments in calls to
gomp_map_vars_existing, and add uses of get_implicit.
(gomp_map_vars_internal): Likewise.
* testsuite/libgomp.c-c++-common/target-implicit-map-1.c: New test.
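A minimal sketch of the situation being relaxed, assuming compilation with -fopenmp (without it the pragmas are ignored and the code runs on the host). The function name is hypothetical and the code is not the contents of target-implicit-map-1.c: part of an array is mapped explicitly first, and the later target region refers to the whole array through an implicit map.

```cpp
#include <cassert>

// Sums a[10] .. a[59] inside a target region.
inline int sum_premapped_part() {
  int a[100];
  for (int i = 0; i < 100; ++i)
    a[i] = i;

  int sum = 0;

  // Only a[10] .. a[59] gets corresponding storage on the device.
  #pragma omp target enter data map(to: a[10:50])

  // 'a' is not listed in a map clause here, so it is mapped implicitly.
  // The existing partial mapping is what the relaxed rule accepts.
  #pragma omp target map(tofrom: sum)
  for (int i = 10; i < 60; ++i)
    sum += a[i];

  #pragma omp target exit data map(release: a[10:50])
  return sum;
}

inline void implicit_map_demo() {
  assert(sum_premapped_part() == 1725);  // 10 + 11 + ... + 59
}
```

The region only touches elements inside the premapped part, which is the case the quoted specification text covers.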
OpenACC: Fix pattern in dg-bogus in Fortran testcases again
It turned out that a compiler built without offloading support
and one with can produce slightly different diagnostics.
Offloading support implies ENABLE_OFFLOAD which implies that
g->have_offload is set when offloading is actually needed.
In cgraphunit.c, the latter causes flag_generate_offload = 1,
which in turn affects tree.c's free_lang_data.
The result is that the front-end specific diagnostic gets reset
('tree_diagnostics_defaults (global_dc)'), which affects in this
case 'Warning' vs. 'warning' via the Fortran frontend.
Result: 'Warning:' vs. 'warning:'.
Side note: Other FEs also override the diagnostic, leading to
similar differences, e.g. the C++ FE outputs mangled function
names differently, cf. patch thread.
libgomp/ChangeLog:
* testsuite/libgomp.oacc-fortran/par-reduction-2-1.f:
Use [Ww]arning in dg-bogus as FE diagnostic and default
diagnostic differ and the result depends on ENABLE_OFFLOAD.
* testsuite/libgomp.oacc-fortran/par-reduction-2-2.f: Likewise.
* testsuite/libgomp.oacc-fortran/parallel-dims.f90: Likewise.
* testsuite/libgomp.oacc-fortran/parallel-reduction.f90: Likewise.
gcc/testsuite/ChangeLog:
* gfortran.dg/goacc/classify-serial.f95:
Use [Ww]arning in dg-bogus as FE diagnostic and default
diagnostic differ and the result depends on ENABLE_OFFLOAD.
* gfortran.dg/goacc/kernels-decompose-2.f95: Likewise.
* gfortran.dg/goacc/routine-module-mod-1.f90: Likewise.
OpenACC: Fix pattern in dg-bogus in Fortran testcases
libgomp/ChangeLog:
* testsuite/libgomp.oacc-fortran/par-reduction-2-1.f:
Correct spelling in dg-bogus to match -Wopenacc-parallelism.
* testsuite/libgomp.oacc-fortran/par-reduction-2-2.f: Likewise.
* testsuite/libgomp.oacc-fortran/parallel-dims.f90: Likewise.
* testsuite/libgomp.oacc-fortran/parallel-reduction.f90: Likewise.
gcc/testsuite/ChangeLog:
* gfortran.dg/goacc/classify-serial.f95:
Correct spelling in dg-bogus to match -Wopenacc-parallelism.
* gfortran.dg/goacc/kernels-decompose-2.f95: Likewise.
* gfortran.dg/goacc/routine-module-mod-1.f90: Likewise.