Julian Brown [Tue, 18 May 2021 17:22:56 +0000 (10:22 -0700)]
[og11] Rework indirect struct handling for OpenACC in gimplify.c
This patch reworks indirect struct handling in gimplify.c (i.e. for
struct components mapped with "mystruct->a[0:n]", "mystruct->b", etc.),
for OpenACC. The key observation leading to these changes was that
component mappings of references-to-structures is already implemented
and working, and indirect struct component handling via a pointer can
work quite similarly. That lets us remove some earlier, special-case
handling for mapping indirect struct component accesses for OpenACC,
which required the pointed-to struct to be manually mapped before the
indirect component mapping.
With this patch, you can map struct components directly (e.g. an array
slice "mystruct->a[0:n]") just like you can map a non-indirect struct
component slice ("mystruct.a[0:n]"). Both references-to-pointers (with
the former syntax) and references to structs (with the latter syntax)
work now.
For Fortran class pointers, we no longer re-use GOMP_MAP_TO_PSET for the
class metadata (the structure that points to the class data and vptr)
-- it is instead treated as any other struct.
For C++, the struct handling also works for class members ("this->foo"),
without having to explicitly map "this[:1]" first.
For OpenACC, we permit chained indirect component references
("mystruct->a->b[0:n]"), though only the last part of such mappings will
trigger an attach/detach operation. To properly use such a construct
on the target, you must still manually map "mystruct->a[:1]" first --
but there's no need to map "mystruct[:1]" explicitly before that.
This version of the patch avoids altering code paths for OpenMP,
where possible.
2021-06-02 Julian Brown <julian@codesourcery.com>
gcc/fortran/
* trans-openmp.c (gfc_trans_omp_clauses): Don't create GOMP_MAP_TO_PSET
mappings for class metadata, nor GOMP_MAP_POINTER mappings for
POINTER_TYPE_P decls.
gcc/
* gimplify.c (extract_base_bit_offset): Add BASE_IND and OPENMP
parameters. Handle pointer-typed indirect references for OpenACC
alongside reference-typed ones.
(strip_components_and_deref, aggregate_base_p): New functions.
(build_struct_group): Add pointer type indirect ref handling,
including chained references, for OpenACC. Also handle references to
structs for OpenACC. Conditionalise bits for OpenMP only where
appropriate.
(gimplify_scan_omp_clauses): Rework pointer-type indirect structure
access handling to work more like the reference-typed handling for
OpenACC only.
* omp-low.c (scan_sharing_clauses): Handle pointer-type indirect struct
references, and references to pointers to structs also.
gcc/testsuite/
* g++.dg/goacc/member-array-acc.C: New test.
* g++.dg/gomp/member-array-omp.C: New test.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/deep-copy-15.c: New test.
* testsuite/libgomp.oacc-c-c++-common/deep-copy-16.c: New test.
* testsuite/libgomp.oacc-c++/deep-copy-17.C: New test.
Julian Brown [Tue, 18 May 2021 17:08:22 +0000 (10:08 -0700)]
[og11] Refactor struct lowering for OpenACC/OpenMP in gimplify.c
This patch is a second attempt at refactoring struct component mapping
handling for OpenACC/OpenMP during gimplification, after the patch I
posted here:
This patch goes further, in that the struct-handling code is outlined
into its own function (to create the "GOMP_MAP_STRUCT" node and the
sorted list of nodes immediately following it, from a set of mappings
of components of a given struct or derived type). I've also gone through
the list-handling code and attempted to add comments documenting how it
works to the best of my understanding, and broken out a couple of helper
functions in order to (hopefully) have the code self-document better also.
2021-06-02 Julian Brown <julian@codesourcery.com>
gcc/
* gimplify.c (insert_struct_comp_map): Refactor function into...
(build_struct_comp_nodes): This new function. Remove list handling
and improve self-documentation.
(insert_node_after, move_node_after, move_nodes_after,
move_concat_nodes_after): New helper functions.
(build_struct_group): New function to build up GOMP_MAP_STRUCT node
groups to map struct components. Outlined from...
(gimplify_scan_omp_clauses): Here. Call above function.
Julian Brown [Mon, 19 Apr 2021 13:24:41 +0000 (06:24 -0700)]
[og11] Unify ARRAY_REF/INDIRECT_REF stripping code in extract_base_bit_offset
For historical reasons, it seems that extract_base_bit_offset
unnecessarily used two different ways to strip ARRAY_REF/INDIRECT_REF
nodes from component accesses. I verified that the two ways of performing
the operation gave the same results across the whole testsuite (and
several additional benchmarks).
The code was like this since an earlier "mechanical" refactoring by me,
first posted here:
It was never clear to me if there was an important semantic
difference between the two ways of stripping the base before calling
get_inner_reference, but it appears that there is not, so one can go away.
2021-06-02 Julian Brown <julian@codesourcery.com>
gcc/
* gimplify.c (extract_base_bit_offset): Unify ARRAY_REF/INDIRECT_REF
stripping code in first call/subsequent call cases.
It never makes sense for a GOMP_MAP_ATTACH_DETACH mapping to survive
beyond gimplify.c, so this patch rewrites such mappings to GOMP_MAP_ATTACH
or GOMP_MAP_DETACH unconditionally (rather than checking for a list
of types of OpenACC or OpenMP constructs), in cases where it hasn't
otherwise been done already in the preceding code.
2021-06-02 Julian Brown <julian@codesourcery.com>
gcc/
* gimplify.c (gimplify_scan_omp_clauses): Simplify condition
for changing GOMP_MAP_ATTACH_DETACH to GOMP_MAP_ATTACH or
GOMP_MAP_DETACH.
Tobias Burnus [Tue, 1 Jun 2021 12:16:42 +0000 (14:16 +0200)]
Fortran/OpenMP: Support (parallel) master taskloop (simd) [PR99928]
PR middle-end/99928
gcc/fortran/ChangeLog:
* dump-parse-tree.c (show_omp_node, show_code_node): Handle
(parallel) master taskloop (simd).
* frontend-passes.c (gfc_code_walker): Set in_omp_workshare
to false for parallel master taskloop (simd).
* gfortran.h (enum gfc_statement):
Add ST_OMP_(END_)(PARALLEL_)MASTER_TASKLOOP(_SIMD).
(enum gfc_exec_op): EXEC_OMP_(PARALLEL_)MASTER_TASKLOOP(_SIMD).
* match.h (gfc_match_omp_master_taskloop,
gfc_match_omp_master_taskloop_simd,
gfc_match_omp_parallel_master_taskloop,
gfc_match_omp_parallel_master_taskloop_simd): New prototype.
* openmp.c (gfc_match_omp_parallel_master_taskloop,
gfc_match_omp_parallel_master_taskloop_simd,
gfc_match_omp_master_taskloop,
gfc_match_omp_master_taskloop_simd): New.
(gfc_match_omp_taskloop_simd): Permit 'reduction' clause.
(resolve_omp_clauses): Handle new combined directives; remove
inscan-reduction check to reduce multiple errors; add
task-reduction error for 'taskloop simd'.
(gfc_resolve_omp_parallel_blocks,
resolve_omp_do, omp_code_to_statement,
gfc_resolve_omp_directive): Handle new combined constructs.
* parse.c (decode_omp_directive, next_statement,
gfc_ascii_statement, parse_omp_do, parse_omp_structured_block,
parse_executable): Likewise.
* resolve.c (gfc_resolve_blocks, gfc_resolve_code): Likewise.
* st.c (gfc_free_statement): Likewise.
* trans.c (trans_code): Likewise.
* trans-openmp.c (gfc_split_omp_clauses,
gfc_trans_omp_directive): Likewise.
(gfc_trans_omp_parallel_master): Move after gfc_trans_omp_master_taskloop;
handle parallel master taskloop (simd) as well.
(gfc_trans_omp_taskloop): Take gfc_exec_op as arg.
(gfc_trans_omp_master_taskloop): New.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/reduction5.f90: Remove dg-error; the issue is
now diagnosed with less error output.
* gfortran.dg/gomp/scan-1.f90: Likewise.
* gfortran.dg/gomp/pr99928-3.f90: New test.
* gfortran.dg/gomp/taskloop-1.f90: New test.
Tobias Burnus [Mon, 31 May 2021 14:50:19 +0000 (16:50 +0200)]
gfortran.dg/gomp/depend-iterator-{1,2}.f90: Use dg-do compile
'dg-do run' is pointless -1 due to dg-error. And it won't work except
by chance for gomp tests; as -2 only has depend(out:), a 'dg-do compile'
is sufficient.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/depend-iterator-1.f90: Use dg-do compile.
* gfortran.dg/gomp/depend-iterator-2.f90: Use dg-do compile.
Jakub Jelinek [Mon, 31 May 2021 14:49:06 +0000 (16:49 +0200)]
openmp: Add shared to parallel for linear on parallel master taskloop simd [PR99928]
I forgot to add default(none) and defaultmap(none) wherever possible on the
testcases to make sure none of the required clauses are added implicitly (because
in that case it doesn't work with these none arguments of those default* clauses
or works differently with other default* settings.
And that revealed we didn't add shared on parallel for linear clause
on parallel master taskloop simd, so this patch fixes that too.
2021-05-29 Jakub Jelinek <jakub@redhat.com>
PR middle-end/99928
* gimplify.c (gimplify_scan_omp_clauses): For taskloop simd
combined with parallel, make sure to add shared clause to
parallel for explicit linear clause.
* c-c++-common/gomp/pr99928-1.c: Add default(none) to constructs
combined with parallel, teams or taskloop and defaultmap(none)
to constructs combined with target.
* c-c++-common/gomp/pr99928-2.c: Likewise.
* c-c++-common/gomp/pr99928-3.c: Likewise.
* c-c++-common/gomp/pr99928-4.c: Likewise.
* c-c++-common/gomp/pr99928-5.c: Likewise.
* c-c++-common/gomp/pr99928-6.c: Likewise.
* c-c++-common/gomp/pr99928-7.c: Likewise.
* c-c++-common/gomp/pr99928-8.c: Likewise.
* c-c++-common/gomp/pr99928-9.c: Likewise.
* c-c++-common/gomp/pr99928-10.c: Likewise.
* c-c++-common/gomp/pr99928-13.c: Likewise.
* c-c++-common/gomp/pr99928-14.c: Likewise.
This v3 adds a small bug fix, where the initialization of the refcount didn't
handle all cases, fixed by using gomp_refcount_increment here (more consistent).
libgomp/ChangeLog:
* target.c (gomp_map_vars_internal): For new key entries, set
k->refcount to 0, remove initialization of k->structelem_refcount,
use gomp_increment_refcount to consistently handle all increment cases.
Jakub Jelinek [Tue, 25 May 2021 15:24:38 +0000 (17:24 +0200)]
c++: Avoid -Wunused-value false positives on nullptr passed to ellipsis [PR100666]
When passing expressions with decltype(nullptr) type with side-effects to
ellipsis, we pass (void *)0 instead, but for the side-effects evaluate them
on the lhs of a COMPOUND_EXPR. Unfortunately that means we warn about it
if the expression is a call to nodiscard marked function, even when the
result is really used, just needs to be transformed.
Fixed by adding a warning_sentinel.
2021-05-25 Jakub Jelinek <jakub@redhat.com>
PR c++/100666
* call.c (convert_arg_to_ellipsis): For expressions with NULLPTR_TYPE
and side-effects, temporarily disable -Wunused-result warning when
building COMPOUND_EXPR.
* g++.dg/cpp1z/nodiscard8.C: New test.
* g++.dg/cpp1z/nodiscard9.C: New test.
Jakub Jelinek [Tue, 25 May 2021 14:44:35 +0000 (16:44 +0200)]
c++tools: Include <cstdlib> for exit [PR100731]
This TU uses exit, but doesn't include <stdlib.h> or <cstdlib> and relies
on some other header to include it indirectly, which apparently doesn't
happen on reporter's host.
The other <c*> headers aren't guarded either and we rely on a compiler
capable of C++11, so maybe we can rely on <cstdlib> being around
unconditionally.
2021-05-25 Jakub Jelinek <jakub@redhat.com>
PR bootstrap/100731
* server.cc: Include <cstdlib>.
Jakub Jelinek [Thu, 20 May 2021 07:09:07 +0000 (09:09 +0200)]
libcpp: Fix up -fdirectives-only handling of // comments on last line not terminated with newline [PR100646]
As can be seen on the testcases, before the -fdirectives-only preprocessing
rewrite the preprocessor would assume // comments are terminated by the
end of file even when newline wasn't there, but now we error out.
The following patch restores the previous behavior.
2021-05-20 Jakub Jelinek <jakub@redhat.com>
PR preprocessor/100646
* lex.c (cpp_directive_only_process): Treat end of file as termination
for !is_block comments.
* gcc.dg/cpp/pr100646-1.c: New test.
* gcc.dg/cpp/pr100646-2.c: New test.
Jakub Jelinek [Wed, 19 May 2021 10:05:30 +0000 (12:05 +0200)]
builtins: Fix ICE with unprototyped builtin call [PR100576]
For unprototyped builtins the checking we perform is only about whether
the used argument is integral, pointer etc., not the exact precision.
We emit a warning about the problem though:
pr100576.c: In function ‘foo’:
pr100576.c:9:11: warning: implicit declaration of function ‘memcmp’ [-Wimplicit-function-declaration]
9 | int n = memcmp (p, v, b);
| ^~~~~~
pr100576.c:1:1: note: include ‘<string.h>’ or provide a declaration of ‘memcmp’
+++ |+#include <string.h>
1 | /* PR middle-end/100576 */
pr100576.c:9:25: warning: ‘memcmp’ argument 3 type is ‘int’ where ‘long unsigned int’ is expected in a call to built-in function declared without prototype
+[-Wbuiltin-declaration-mismatch]
9 | int n = memcmp (p, v, b);
| ^
It means in the testcase below where the user incorrectly called memcmp
with last argument int rather then size_t, the warning stuff in builtins.c
ICEs because it compares a wide_int from such a bound with another wide_int
which has precision of size_t/sizetype and wide_int asserts the compared
wide_ints are compatible.
Fixed by forcing the bound to have the right type.
2021-05-19 Jakub Jelinek <jakub@redhat.com>
PR middle-end/100576
* builtins.c (check_read_access): Convert bound to size_type_node if
non-NULL.
Jakub Jelinek [Tue, 18 May 2021 08:26:45 +0000 (10:26 +0200)]
regcprop: Avoid DCE of asm goto [PR100590]
The following testcase ICEs, because copyprop_hardreg_forward_1 decides
to DCE asm goto with REG_UNUSED notes (because the output is unused and
asm isn't volatile). But that DCE just removes the asm goto, leaving
a bb with two successors and no insn at the end that would allow that.
The following patch makes sure we drop that way only INSNs and not
JUMP_INSNs or CALL_INSNs.
2021-05-18 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/100590
* regcprop.c (copyprop_hardreg_forward_1): Only DCE dead sets if
they are NONJUMP_INSN_P.
Jakub Jelinek [Tue, 18 May 2021 08:10:17 +0000 (10:10 +0200)]
function: Set dummy DECL_ASSEMBLER_NAME in push_dummy_function [PR100580]
Last year I've added cgraph_node::get_create calls for the dummy
functions used for -fdump-passes, so that it interacts well with pass
disabling/enabling which is cgraph uid based.
Unfortunately, as the following testcase shows, when assembler hash
is present, that wants to compute DECL_ASSEMBLER_NAME and the C++ FE
is unprepared to handle it on the dummy functions which don't have
DECL_NAME etc.
The following patch fixes it by setting up a dummy DECL_ASSEMBLER_NAME
on these, so that the FEs don't need to compute it.
2021-05-18 Jakub Jelinek <jakub@redhat.com>
PR c++/100580
* function.c (push_dummy_function): Set DECL_ARTIFICIAL and
DECL_ASSEMBLER_NAME on the fn_decl.
Jakub Jelinek [Sat, 15 May 2021 08:12:11 +0000 (10:12 +0200)]
regcprop: Fix another cprop_hardreg bug [PR100342]
On Tue, Jan 19, 2021 at 04:10:33PM +0000, Richard Sandiford via Gcc-patches wrote:
> Ah, ok, thanks for the extra context.
>
> So AIUI the problem when recording xmm2<-di isn't just:
>
> [A] partial_subreg_p (vd->e[sr].mode, GET_MODE (src))
>
> but also that:
>
> [B] partial_subreg_p (vd->e[sr].mode, vd->e[vd->e[sr].oldest_regno].mode)
>
> For example, all registers in this sequence can be part of the same chain:
>
> (set (reg:HI R1) (reg:HI R0))
> (set (reg:SI R2) (reg:SI R1)) // [A]
> (set (reg:DI R3) (reg:DI R2)) // [A]
> (set (reg:SI R4) (reg:SI R[0-3]))
> (set (reg:HI R5) (reg:HI R[0-4]))
>
> But:
>
> (set (reg:SI R1) (reg:SI R0))
> (set (reg:HI R2) (reg:HI R1))
> (set (reg:SI R3) (reg:SI R2)) // [A] && [B]
>
> is problematic because it dips below the precision of the oldest regno
> and then increases again.
>
> When this happens, I guess we have two choices:
>
> (1) what the patch does: treat R3 as the start of a new chain.
> (2) pretend that the copy occured in vd->e[sr].mode instead
> (i.e. copy vd->e[sr].mode to vd->e[dr].mode)
>
> I guess (2) would need to be subject to REG_CAN_CHANGE_MODE_P.
> Maybe the optimisation provided by (2) compared to (1) isn't common
> enough to be worth the complication.
>
> I think we should test [B] as well as [A] though. The pass is set
> up to do some quite elaborate mode changes and I think rejecting
> [A] on its own would make some of the other code redundant.
> It also feels like it should be a seperate “if” or “else if”,
> with its own comment.
Unfortunately, we now have a testcase that shows that testing also [B]
is a problem (unfortunately now latent on the trunk, only reproduces
on 10 and 11 branches).
The comment in the patch tries to list just the interesting instructions,
we have a 64-bit value, copy low 8 bit of those to another register,
copy full 64 bits to another register and then clobber the original register.
Before that (set (reg:DI r14) (const_int ...)) we have a chain
DI r14, QI si, DI bp , that instruction drops the DI r14 from that chain, so
we have QI si, DI bp , si being the oldest_regno.
Next DI si is copied into DI dx. Only the low 8 bits of that are defined,
the rest is unspecified, but we would add DI dx into that same chain at the
end, so QI si, DI bp, DI dx [*]. Next si is overwritten, so the chain is
DI bp, DI dx. And then we see (set (reg:DI dx) (reg:DI bp)) and remove it
as redundant, because we think bp and dx are already equivalent, when in
reality that is true only for the lowpart 8 bits.
I believe the [*] marked step above is where the bug is.
The committed regcprop.c (copy_value) change (but only committed to
trunk/11, not to 10) added
else if (partial_subreg_p (vd->e[sr].mode, GET_MODE (src))
&& partial_subreg_p (vd->e[sr].mode,
vd->e[vd->e[sr].oldest_regno].mode))
return;
and while the first partial_subreg_p call returns true, the second one
doesn't; before the (set (reg:DI r14) (const_int ...)) insn it would be
true and we'd return, but as that reg got clobbered, si became the oldest
regno in the chain and so vd->e[vd->e[sr].oldest_regno].mode is QImode
and vd->e[sr].mode is QImode too, so the second partial_subreg_p is false.
But as the testcase shows, what is the oldest_regno in the chain is
something that changes over time, so relying on it for anything is
problematic, something could have a different oldest_regno and later
on get a different oldest_regno (perhaps with different mode) because
the oldest_regno got overwritten and it can change both ways.
The following patch effectively implements your (2) above.
2021-05-15 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/100342
* regcprop.c (copy_value): When copying a source reg in a wider
mode than it has recorded for the value, adjust recorded destination
mode too or punt if !REG_CAN_CHANGE_MODE_P.
Chung-Lin Tang [Thu, 27 May 2021 14:57:09 +0000 (22:57 +0800)]
OpenMP 5.0: Remove array section base-pointer mapping semantics, and other front-end adjustments
This patch largely implements three pieces of functionality:
(1) Per discussion and clarification on the omp-lang mailing list, standards
conforming behavior for mapping array sections should *NOT* also map the
base-pointer, i.e for this code:
struct S { int *ptr; ... };
struct S s;
#pragma omp target enter data map(to: s.ptr[:100])
Currently we generate after gimplify:
map(struct:s [len: 1]) map(alloc:s.ptr [len: 8]) \
map(to:*_1 [len: 400]) map(attach:s.ptr [bias: 0])
which is deemed incorrect. After this patch, the gimplify results are now
adjusted to:
map(attach:s.ptr [bias: 0])
(the attach operation is still generated, and if s.ptr is already mapped prior,
attachment will happen)
The correct way of achieving the base-pointer-also-mapped behavior would be
to use:
This adjustment in behavior required a number of small adjustments here and
there in gimplify, including to accomodate map sequences for C++ references.
(2) Fixes in libgomp/target.c to not overwrite attached pointers when handling
device<->host copies, mainly for the "always" case.
(3) A set of changes to the C/C++ front-ends to extend the allowed component
access syntax in map clauses. This is actually mainly an effort to allow
SPEC HPC to compile. These changes are enabled for both OpenACC and OpenMP.
gcc/c/ChangeLog:
* c-parser.c (struct omp_dim): New struct type for use inside
c_parser_omp_variable_list.
(c_parser_omp_variable_list): Allow multiple levels of array and
component accesses in array section base-pointer expression.
(c_parser_omp_clause_to): Set 'allow_deref' to true in call to
c_parser_omp_var_list_parens.
(c_parser_omp_clause_from): Likewise.
* c-typeck.c (handle_omp_array_sections_1): Extend allowed range
of base-pointer expressions involving INDIRECT/MEM/ARRAY_REF and
POINTER_PLUS_EXPR.
(c_finish_omp_clauses): Extend allowed ranged of expressions
involving INDIRECT/MEM/ARRAY_REF and POINTER_PLUS_EXPR.
gcc/cp/ChangeLog:
* parser.c (struct omp_dim): New struct type for use inside
cp_parser_omp_var_list_no_open.
(cp_parser_omp_var_list_no_open): Allow multiple levels of array and
component accesses in array section base-pointer expression.
(cp_parser_omp_all_clauses): Set 'allow_deref' to true in call to
cp_parser_omp_var_list for to/from clauses.
* semantics.c (handle_omp_array_sections_1): Extend allowed range
of base-pointer expressions involving INDIRECT/MEM/ARRAY_REF and
POINTER_PLUS_EXPR.
(handle_omp_array_sections): Adjust pointer map generation of
references.
(finish_omp_clauses): Extend allowed ranged of expressions
involving INDIRECT/MEM/ARRAY_REF and POINTER_PLUS_EXPR.
gcc/fortran/ChangeLog:
* trans-openmp.c (gfc_trans_omp_array_section): Do not generate
GOMP_MAP_ALWAYS_POINTER map for main array maps of ARRAY_TYPE type.
gcc/ChangeLog:
* gimplify.c (extract_base_bit_offset): Add 'tree *offsetp' parameter,
accomodate case where 'offset' return of get_inner_reference is
non-NULL.
(is_or_contains_p): Further robustify conditions.
(omp_target_reorder_clauses): In alloc/to/from sorting phase, also
move following GOMP_MAP_ALWAYS_POINTER maps along. Add new sorting
phase where we make sure pointers with an attach/detach map are ordered
correctly.
(gimplify_scan_omp_clauses): Add modifications to avoid creating
GOMP_MAP_STRUCT and associated alloc map for attach/detach maps.
gcc/testsuite/ChangeLog:
* c-c++-common/goacc/deep-copy-arrayofstruct.c: Adjust testcase.
* c-c++-common/gomp/target-enter-data-1.c: New testcase.
* c-c++-common/gomp/target-implicit-map-2.c: New testcase.
libgomp/ChangeLog:
* target.c (gomp_map_vars_existing): Make sure attached pointer is
not overwritten during cross-host/device copying.
(gomp_update): Likewise.
(gomp_exit_data): Likewise.
* testsuite/libgomp.c++/target-11.C: Adjust testcase.
* testsuite/libgomp.c++/target-12.C: Likewise.
* testsuite/libgomp.c++/target-15.C: Likewise.
* testsuite/libgomp.c++/target-16.C: Likewise.
* testsuite/libgomp.c++/target-17.C: Likewise.
* testsuite/libgomp.c++/target-21.C: Likewise.
* testsuite/libgomp.c++/target-23.C: Likewise.
* testsuite/libgomp.c/target-23.c: Likewise.
* testsuite/libgomp.c/target-29.c: Likewise.
* testsuite/libgomp.c-c++-common/target-implicit-map-2.c: New testcase.
Chung-Lin Tang [Thu, 27 May 2021 14:22:00 +0000 (22:22 +0800)]
OpenMP 5.0: Implement relaxation of implicit map vs. existing device mappings
This patch is a merge of:
https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570365.html
This patch implements relaxing the requirements when a map with the implicit
attribute encounters an overlapping existing map. As the OpenMP 5.0 spec
describes on page 320, lines 18-27 (and 5.1 spec, page 352, lines 13-22):
"If a single contiguous part of the original storage of a list item with an
implicit data-mapping attribute has corresponding storage in the device data
environment prior to a task encountering the construct that is associated with
the map clause, only that part of the original storage will have corresponding
storage in the device data environment as a result of the map clause."
Also tracked in the OpenMP spec context as issue #1463:
https://github.com/OpenMP/spec/issues/1463
include/ChangeLog:
* gomp-constants.h (GOMP_MAP_IMPLICIT): New special map kind bits value.
(GOMP_MAP_FLAG_SPECIAL_BITS): Define helper mask for whole set of
special map kind bits.
(GOMP_MAP_NONCONTIG_ARRAY_P): Adjust test for non-contiguous array map
kind bits to be more specific.
(GOMP_MAP_IMPLICIT_P): New predicate macro for implicit map kinds.
gcc/ChangeLog:
* tree.h (OMP_CLAUSE_MAP_IMPLICIT_P): New access macro for 'implicit'
bit, using 'base.deprecated_flag' field of tree_node.
* tree-pretty-print.c (dump_omp_clause): Add support for printing
implicit attribute in tree dumping.
* gimplify.c (gimplify_adjust_omp_clauses_1):
Set OMP_CLAUSE_MAP_IMPLICIT_P to 1 if map clause is implicitly created.
(gimplify_adjust_omp_clauses): Adjust place of adding implicitly created
clauses, from simple append, to starting of list, after non-map clauses.
* omp-low.c (lower_omp_target): Add GOMP_MAP_IMPLICIT bits into kind
values passed to libgomp for implicit maps.
* target.c (gomp_map_vars_existing): Add 'bool implicit' parameter, add
implicit map handling to allow a "superset" existing map as valid case.
(get_kind): Adjust to filter out GOMP_MAP_IMPLICIT bits in return value.
(get_implicit): New function to extract implicit status.
(gomp_map_fields_existing): Adjust arguments in calls to
gomp_map_vars_existing, and add uses of get_implicit.
(gomp_map_vars_internal): Likewise.
* testsuite/libgomp.c-c++-common/target-implicit-map-1.c: New test.
Chung-Lin Tang [Thu, 27 May 2021 11:53:45 +0000 (19:53 +0800)]
OpenMP 5.0: Improve OpenMP target support for C++ (includes PR92120 v3)
This patch is a merge of:
https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570886.html
To summarize, this patch set is an improvement for OpenMP target support for
C++, including for inside non-static members, lambda objects, and struct
member deref access expressions. The corresponding modifications for the C
front-end are also included.
This patch supercedes the prior versions of my PR92120 patch
(implicit C++ map(this[:1])), so dubbing this "v3" of patch for that PR.
Prior versions of the PR92120 patch was implemented by recording uses of
'this' in the parser, and then use the recorded uses during "finish" to
create the implicit maps.
When working on supporting lambda objects, this required using a tree-walk
style processing of the OMP_TARGET body, so in only made sense to merge the
entire 'this' processing together with it, so a large part of the parser
changes were dropped, with the main processing in semantics.c now.
Other parser changes to support '->' in map clauses are also with this patch.
gcc/cp/ChangeLog:
* cp-tree.h (finish_omp_target): New declaration.
(finish_omp_target_clauses): Likewise.
* parser.c (cp_parser_omp_clause_map): Adjust call to
cp_parser_omp_var_list_no_open to set 'allow_deref' argument to true.
(cp_parser_omp_target): Factor out code, adjust into calls to new
function finish_omp_target.
* pt.c (tsubst_expr): Add call to finish_omp_target_clauses for
OMP_TARGET case.
* semantics.c (handle_omp_array_sections_1): Add handling to create
'this->member' from 'member' FIELD_DECL.
(handle_omp_array_sections): Likewise.
(finish_omp_clauses): Likewise. Adjust to allow 'this[]' in OpenMP
map clauses. Handle 'A->member' case in map clauses.
(struct omp_target_walk_data): New struct for walking over
target-directive tree body.
(finish_omp_target_clauses_r): New function for tree walk.
(finish_omp_target_clauses): New function.
(finish_omp_target): New function.
gcc/c/ChangeLog:
* c-parser.c (c_parser_omp_clause_map): Set 'allow_deref' argument in
call to c_parser_omp_variable_list to 'true'.
* c-typeck.c (handle_omp_array_sections_1): Add strip of MEM_REF in
array base handling.
(c_finish_omp_clauses): Handle 'A->member' case in map clauses.
gcc/ChangeLog:
* gimplify.c ("tree-hash-traits.h"): Add include.
(gimplify_scan_omp_clauses): Change struct_map_to_clause to type
hash_map<tree_operand, tree> *. Adjust struct map handling to handle
cases of *A and A->B expressions. Under !DECL_P case of
GOMP_CLAUSE_MAP handling, add STRIP_NOPS for indir_p case, add to
struct_deref_set for map(*ptr_to_struct) cases. Add MEM_REF case when
handling component_ref_p case. Add unshare_expr and gimplification
when created GOMP_MAP_STRUCT is not a DECL. Add code to add
firstprivate pointer for *pointer-to-struct case.
(gimplify_adjust_omp_clauses): Move GOMP_MAP_STRUCT removal code for
exit data directives code to earlier position.
* omp-low.c (lower_omp_target):
Handle GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION, and
GOMP_MAP_POINTER_TO_ZERO_LENGTH_ARRAY_SECTION map kinds.
* tree-pretty-print.c (dump_omp_clause): Likewise.
gcc/testsuite/ChangeLog:
* gcc.dg/gomp/target-3.c: New testcase.
* g++.dg/gomp/target-3.C: New testcase.
* g++.dg/gomp/target-lambda-1.C: New testcase.
* g++.dg/gomp/target-this-1.C: New testcase.
* g++.dg/gomp/target-this-2.C: New testcase.
* g++.dg/gomp/target-this-3.C: New testcase.
* g++.dg/gomp/target-this-4.C: New testcase.
* g++.dg/gomp/target-this-5.C: New testcase.
* g++.dg/gomp/this-2.C: Adjust testcase.
include/ChangeLog:
* gomp-constants.h (enum gomp_map_kind):
Add GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION, and
GOMP_MAP_POINTER_TO_ZERO_LENGTH_ARRAY_SECTION map kinds.
(GOMP_MAP_POINTER_P):
Include GOMP_MAP_POINTER_TO_ZERO_LENGTH_ARRAY_SECTION.
libgomp/ChangeLog:
* libgomp.h (gomp_attach_pointer): Add bool parameter.
* oacc-mem.c (acc_attach_async): Update call to gomp_attach_pointer.
(goacc_enter_data_internal): Likewise.
* target.c (gomp_map_vars_existing): Update assert condition to
include GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION.
(gomp_map_pointer): Add 'bool allow_zero_length_array_sections'
parameter, add support for mapping a pointer with NULL target.
(gomp_attach_pointer): Add 'bool allow_zero_length_array_sections'
parameter, add support for attaching a pointer with NULL target.
(gomp_map_vars_internal): Update calls to gomp_map_pointer and
gomp_attach_pointer, add handling for
GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION, and
GOMP_MAP_POINTER_TO_ZERO_LENGTH_ARRAY_SECTION cases.
* testsuite/libgomp.c++/target-23.C: New testcase.
* testsuite/libgomp.c++/target-lambda-1.C: New testcase.
* testsuite/libgomp.c++/target-this-1.C: New testcase.
* testsuite/libgomp.c++/target-this-2.C: New testcase.
* testsuite/libgomp.c++/target-this-3.C: New testcase.
* testsuite/libgomp.c++/target-this-4.C: New testcase.
* testsuite/libgomp.c++/target-this-5.C: New testcase.
David Edelsohn [Fri, 23 Apr 2021 21:45:10 +0000 (17:45 -0400)]
testuite: fix libtdc++ libatomic flags
Some ports require libatomic for atomic operations, at least for some
data types and widths. The libstdc++ testsuite previously was updated
to link against libatomic, but the search path was hard-coded to
something that is not always correct, and the shared library search
path was not set.
The search path was hard-coded to the expected location of the
libatomic build directory relative to the libstdc++ testsuite
directory, but if one uses parallelism when invoking the libstdc++
testsuite, the tests are run in the "normalXX" sub-directories, for
which the hard-coded search path is incorrect. The path also is
incorrect for alternative multilib and tool options.
This patch adopts the logic from gcc/testsuite/lib/atomic-dg.exp to
search for the library and adds the logic to the libstdc++ testsuite
libatomic seatch path code. Previously the libstdc++ testsuite atomic
tests failed depending on the build configuration and if a build of
libatomic was installed in the default search path.
Bootstrapped on powerpc-ibm-aix7.2.3.0.
libstdc++-v3/ChangeLog:
* testsuite/lib/dg-options.exp (atomic_link_flags): New.
(add_options_for_libatomic): Use atomic_link_flags.
AIX uses a compiler-managed TOC for global data, including TLS symbols.
The GCC TOC implementation manages the TOC entries through the constant pool.
TLS symbols sometimes require a function call to obtain the TLS base
pointer. The arguments to the TLS call can conflict with arguments to
a normal function call if the TLS symbol is an argument in the normal call.
GCC specifically checks for this situation and precomputes the TLS
arguments, but the mechanism to check for this requirement utilizes
legitimate_constant_p(). The necessary result of legitimate_constant_p()
for correct TOC behavior and for correct TLS argument behavior is in
conflict.
This patch adds a new target hook precompute_tls_p() to decide if an
argument should be precomputed regardless of the result from
legitmate_constant_p().
Harald Anlauf [Mon, 17 May 2021 19:35:38 +0000 (21:35 +0200)]
PR fortran/98411 - Pointless warning for static variables
Variables with explicit SAVE attribute cannot end up on the stack.
There is no point in checking whether they should be moved off the
stack to static storage.
gcc/fortran/ChangeLog:
PR fortran/98411
* trans-decl.c (gfc_finish_var_decl): Add check for explicit SAVE
attribute.
gcc/testsuite/ChangeLog:
PR fortran/98411
* gfortran.dg/pr98411.f90: New test.
Harald Anlauf [Thu, 27 May 2021 11:55:11 +0000 (13:55 +0200)]
PR fortran/100656 - prevent ICE in gfc_conv_expr_present
gcc/fortran/ChangeLog:
PR fortran/100656
* trans-array.c (gfc_conv_ss_startstride): Do not call check for
presence of a dummy argument when a symbol actually refers to a
non-dummy.
gcc/testsuite/ChangeLog:
PR fortran/100656
* gfortran.dg/bounds_check_22.f90: New test.
Patrick Palka [Fri, 30 Apr 2021 22:45:46 +0000 (18:45 -0400)]
libstdc++: Implement P2328 changes to join_view
This implements the wording changes of P2328R0 "join_view should join
all views of ranges".
libstdc++-v3/ChangeLog:
* include/std/ranges (__detail::__non_propating_cache): Define
as per P2328.
(join_view): Remove constraints on the value and reference types
of the wrapped iterator type as per P2328.
(join_view::_Iterator::_M_satisfy): Adjust as per P2328.
(join_view::_Iterator::operator++): Likewise.
(join_view::_M_inner): Use __non_propating_cache as per P2328.
Remove now-redundant use of __maybe_present_t.
* testsuite/std/ranges/adaptors/join.cc: Include <array>.
(test10): New test.
Patrick Palka [Mon, 24 May 2021 19:24:44 +0000 (15:24 -0400)]
libstdc++: Fix iterator caching inside range adaptors [PR100479]
This fixes two issues with our iterator caching as described in detail
in the PR. Since we recently added the __non_propagating_cache class
template as part of r12-336 for P2328, this patch just rewrites the
problematic _CachedPosition partial specialization in terms of this
class template.
For the offset partial specialization, it's safe to propagate the cached
offset on copy/move, but we should still invalidate the cached offset in
the source object on move.
libstdc++-v3/ChangeLog:
PR libstdc++/100479
* include/std/ranges (__detail::__non_propagating_cache): Move
definition up to before that of _CachedPosition. Make base
class _Optional_base protected instead of private. Add const
overload for operator*.
(__detail::_CachedPosition): Rewrite the partial specialization
for forward ranges as a derived class of __non_propagating_cache.
Remove the size constraint on the partial specialization for
random access ranges. Add copy/move/copy-assignment/move-assignment
members to the offset partial specialization for random
access ranges that propagate the cached value but additionally
invalidate it in the source object on move.
* testsuite/std/ranges/adaptors/100479.cc: New test.
Patrick Palka [Wed, 26 May 2021 20:02:33 +0000 (16:02 -0400)]
c++: access for hidden friend of nested class template [PR100502]
Here, during ahead of time access checking for the private member
EnumeratorRange<T>::end_reached_ in the hidden friend f, we're triggering
the assert in enforce_access that verifies we're not trying to add a
access check for a dependent decl onto TI_DEFERRED_ACCESS_CHECKS.
The special thing about this class member access expression is that
the overall expression is non-dependent (so finish_class_member_access_expr
doesn't exit early at parse time), and then accessible_p rejects the
access (so we don't exit early from enforce access either, and end up
triggering the assert b/c the member itself is dependent). I think
we're correct to reject the access because a hidden friend is not a
member function, so [class.access.nest] doesn't apply, and also a hidden
friend of a nested class is not a friend of the enclosing class.
To fix this ICE, this patch disables ahead of time access checking
during the member lookup in finish_class_member_access_expr. This
avoids potentially pushing an access check for a dependent member onto
TI_DEFERRED_ACCESS_CHECKS, and it's safe because we're going to redo the
same lookup at instantiation time anyway.
PR c++/100502
gcc/cp/ChangeLog:
* typeck.c (finish_class_member_access_expr): Disable ahead
of time access checking during the member lookup.
gcc/testsuite/ChangeLog:
* g++.dg/template/access37.C: New test.
* g++.dg/template/access37a.C: New test.
Jakub Jelinek [Fri, 28 May 2021 09:26:48 +0000 (11:26 +0200)]
openmp: Fix up handling of reduction clause on constructs combined with target [PR99928]
The reduction clause should be copied as map (tofrom: ) to combined target if
present, but as we need different handling of array sections between map and reduction,
doing that during gimplification would be harder.
So, this patch adds them during splitting, and similarly to firstprivate adds them
with a new flag that they should be just ignored/removed if an explicit map clause
of the same list item is present.
The exact rules are to be decided in https://github.com/OpenMP/spec/issues/2766
so this patch just implements something that is IMHO reasonable and exact detailed
testcases for the cornercases will follow once it is clarified.
2021-05-28 Jakub Jelinek <jakub@redhat.com>
PR middle-end/99928
gcc/
* tree.h (OMP_CLAUSE_MAP_IMPLICIT): Define.
gcc/c-family/
* c-omp.c (c_omp_split_clauses): For reduction clause if combined with
target add a map tofrom clause with OMP_CLAUSE_MAP_IMPLICIT.
gcc/c/
* c-typeck.c (handle_omp_array_sections): Copy OMP_CLAUSE_MAP_IMPLICIT.
(c_finish_omp_clauses): Move not just OMP_CLAUSE_FIRSTPRIVATE_IMPLICIT
marked clauses last, but also OMP_CLAUSE_MAP_IMPLICIT. Add
map_firstprivate_head bitmap, set it for GOMP_MAP_FIRSTPRIVATE_POINTER
maps and silently remove OMP_CLAUSE_FIRSTPRIVATE_IMPLICIT if it is
present too. For OMP_CLAUSE_MAP_IMPLICIT silently remove the clause
if present in map_head, map_field_head or map_firstprivate_head
bitmaps.
gcc/cp/
* semantics.c (handle_omp_array_sections): Copy
OMP_CLAUSE_MAP_IMPLICIT.
(finish_omp_clauses): Move not just OMP_CLAUSE_FIRSTPRIVATE_IMPLICIT
marked clauses last, but also OMP_CLAUSE_MAP_IMPLICIT. Add
map_firstprivate_head bitmap, set it for GOMP_MAP_FIRSTPRIVATE_POINTER
maps and silently remove OMP_CLAUSE_FIRSTPRIVATE_IMPLICIT if it is
present too. For OMP_CLAUSE_MAP_IMPLICIT silently remove the clause
if present in map_head, map_field_head or map_firstprivate_head
bitmaps.
gcc/testsuite/
* c-c++-common/gomp/pr99928-8.c: Remove all xfails.
* c-c++-common/gomp/pr99928-9.c: Likewise.
* c-c++-common/gomp/pr99928-10.c: Likewise.
* c-c++-common/gomp/pr99928-16.c: New test.
* testsuite/libgomp.fortran/depend-iterator-2.f90: New test.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/affinity-1.c: New test.
* c-c++-common/gomp/affinity-2.c: New test.
* c-c++-common/gomp/affinity-3.c: New test.
* c-c++-common/gomp/affinity-4.c: New test.
* c-c++-common/gomp/affinity-5.c: New test.
* c-c++-common/gomp/affinity-6.c: New test.
* c-c++-common/gomp/affinity-7.c: New test.
* gfortran.dg/gomp/affinity-clause-1.f90: New test.
* gfortran.dg/gomp/affinity-clause-2.f90: New test.
* gfortran.dg/gomp/affinity-clause-3.f90: New test.
* gfortran.dg/gomp/affinity-clause-4.f90: New test.
* gfortran.dg/gomp/affinity-clause-5.f90: New test.
* gfortran.dg/gomp/affinity-clause-6.f90: New test.
* gfortran.dg/gomp/depend-iterator-1.f90: New test.
* gfortran.dg/gomp/depend-iterator-2.f90: New test.
* gfortran.dg/gomp/depend-iterator-3.f90: New test.
* gfortran.dg/gomp/taskwait.f90: New test.
Jakub Jelinek [Fri, 28 May 2021 08:08:56 +0000 (10:08 +0200)]
libgomp: Add openacc_{cuda,cublas,cudart} effective targets and use them in openacc testsuite
When gcc is configured for nvptx offloading with --without-cuda-driver
and full CUDA isn't installed, many libgomp.oacc-*/* tests fail,
some of them because cuda.h header can't be found, others because
the tests can't be linked against -lcuda, -lcudart or -lcublas.
I usually only have akmod-nvidia and xorg-x11-drv-nvidia-cuda rpms
installed, so libcuda.so.1 can be dlopened and the offloading works,
but linking against those libraries isn't possible nor are the
headers around (for the plugin itself there is the fallback
libgomp/plugin/cuda/cuda.h).
The following patch adds 3 new effective targets and uses them in tests that
needs those.
Richard Earnshaw [Thu, 27 May 2021 09:25:37 +0000 (10:25 +0100)]
arm: Remove use of opts_set in arm_configure_build_target [PR100767]
The variable global_options_set is a reflection of which options have
been explicitly set from the command line in the structure
global_options. But it doesn't describe the contents of a
cl_target_option. cl_target_option is a set of options to apply and
once configured should represent a viable set of options without
needing to know which were explicitly set by the user.
Unfortunately arm_configure_build_target was incorrectly conflating
the two. Fortunately, however, we do not really need to know this
since the various override_options functions should have sanitized the
target_options values before constructing a cl_target_option
structure. It is safe, therefore, to simply drop this parameter to
arm_configure_build_target and rely on checking that various string
parameters are non-null before dereferencing them.
Alex Coplan [Tue, 11 May 2021 12:11:09 +0000 (13:11 +0100)]
arm: Avoid emitting bogus CFA adjusts for CMSE nonsecure calls [PR99725]
The PR shows us attaching REG_CFA_ADJUST_CFA notes to stack pointer
adjustments emitted in cmse_nonsecure_call_inline_register_clear (when
-march=armv8.1-m.main). However, the stack pointer is not guaranteed to
be the CFA reg. If we're at -O0 or we have -fno-omit-frame-pointer, then
the frame pointer will be used as the CFA reg, and these notes on the sp
adjustments will lead to ICEs in dwarf2out_frame_debug_adjust_cfa.
This patch avoids emitting these notes if the current function has a
frame pointer.
gcc/ChangeLog:
PR target/99725
* config/arm/arm.c (cmse_nonsecure_call_inline_register_clear):
Avoid emitting CFA adjusts on the sp if we have the fp.
gcc/testsuite/ChangeLog:
PR target/99725
* gcc.target/arm/cmse/pr99725.c: New test.
Jakub Jelinek [Wed, 26 May 2021 13:15:19 +0000 (15:15 +0200)]
openmp: Fix up handling of target constructs in offloaded routines [PR100573]
OpenMP Nesting of Regions restrictions say:
- If a target update, target data, target enter data, or target exit data
construct is encountered during execution of a target region, the behavior is unspecified.
- If a target construct is encountered during execution of a target region and a device
clause in which the ancestor device-modifier appears is not present on the construct, the
behavior is unspecified.
That wording is about the dynamic (runtime) behavior, not about lexical nesting,
so while it is UB if omp target * is encountered in the target region, we need to make
it compile and link (for lexical nesting of target * inside of target we actually
emit a warning).
To make this work, I had to do multiple changes.
One was to mark .omp_data_{sizes,kinds}.* variables when static as "omp declare target".
Another one was to add stub GOMP_target* entrypoints to nvptx and gcn libgomp.a.
The entrypoint functions shouldn't be called or passed in the offload regions,
otherwise
libgomp: cuLaunchKernel error: too many resources requested for launch
was reported; fixed by changing those arguments of calls to GOMP_target_ext
to NULL.
And we didn't mark the entrypoints "omp target entrypoint" when the caller
has been "omp declare target".
2021-05-26 Jakub Jelinek <jakub@redhat.com>
PR libgomp/100573
gcc/
* omp-low.c: Include omp-offload.h.
(create_omp_child_function): If current_function_decl has
"omp declare target" attribute and is_gimple_omp_offloaded,
remove that attribute from the copy of attribute list and
add "omp target entrypoint" attribute instead.
(lower_omp_target): Mark .omp_data_sizes.* and .omp_data_kinds.*
variables for offloading if in omp_maybe_offloaded_ctx.
* omp-offload.c (pass_omp_target_link::execute): Nullify second
argument to GOMP_target_data_ext in offloaded code.
libgomp/
* config/nvptx/target.c (GOMP_target_ext, GOMP_target_data_ext,
GOMP_target_end_data, GOMP_target_update_ext,
GOMP_target_enter_exit_data): New dummy entrypoints.
* config/gcn/target.c (GOMP_target_ext, GOMP_target_data_ext,
GOMP_target_end_data, GOMP_target_update_ext,
GOMP_target_enter_exit_data): Likewise.
* testsuite/libgomp.c-c++-common/for-3.c (DO_PRAGMA, OMPTEAMS,
OMPFROM, OMPTO): Define.
(main): Remove #pragma omp target teams around all the tests.
* testsuite/libgomp.c-c++-common/target-41.c: New test.
* testsuite/libgomp.c-c++-common/target-42.c: New test.
Harald Anlauf [Sun, 23 May 2021 18:51:14 +0000 (20:51 +0200)]
Fortran: fix passing return value to class(*) dummy argument
gcc/fortran/ChangeLog:
PR fortran/100551
* trans-expr.c (gfc_conv_procedure_call): Adjust check for
implicit conversion of actual argument to an unlimited polymorphic
procedure argument.
gcc/testsuite/ChangeLog:
PR fortran/100551
* gfortran.dg/pr100551.f90: New test.
Uros Bizjak [Tue, 18 May 2021 13:45:54 +0000 (15:45 +0200)]
i386: Fix split_double_mode with paradoxical subreg [PR100626]
split_double_mode calls simplify_gen_subreg, which fails for the
high half of the paradoxical subreg. Return temporary register
instead of NULL RTX in this case.
2021-05-18 Uroš Bizjak <ubizjak@gmail.com>
gcc/
PR target/100626
* config/i386/i386-expand.c (split_double_mode): Return
temporary register when simplify_gen_subreg fails with
the high half od the paradoxical subreg.
Jakub Jelinek [Tue, 25 May 2021 10:45:15 +0000 (12:45 +0200)]
openmp: Fix reduction clause handling on teams distribute simd [PR99928]
When a directive isn't combined with worksharing-loop, it takes much
simpler clause splitting path for reduction, and that one was missing
handling of teams when combined with simd.
2021-05-25 Jakub Jelinek <jakub@redhat.com>
PR middle-end/99928
gcc/c-family/
* c-omp.c (c_omp_split_clauses): Copy reduction to teams when teams is
combined with simd and not with taskloop or for.
gcc/testsuite/
* c-c++-common/gomp/pr99928-8.c: Remove xfails from omp teams r21 and
r28 checks.
* c-c++-common/gomp/pr99928-9.c: Likewise.
* c-c++-common/gomp/pr99928-10.c: Likewise.
libgomp/
* testsuite/libgomp.c-c++-common/reduction-17.c: New test.
This splits can_associate_p into checks for SSA defs and checks
for the type so it can be called from is_reassociable_op to
catch cases not catched by the earlier fix.
2021-05-11 Richard Biener <rguenther@suse.de>
PR tree-optimization/100519
* tree-ssa-reassoc.c (can_associate_p): Split into...
(can_associate_op_p): ... this
(can_associate_type_p): ... and this.
(is_reassociable_op): Call can_associate_op_p.
(break_up_subtract_bb): Call the appropriate predicates.
(reassociate_bb): Likewise.
Richard Biener [Tue, 11 May 2021 11:23:45 +0000 (13:23 +0200)]
ipa/100513 - fix SSA_NAME_DEF_STMT corruption in IPA param manip
This fixes unintended clobbering of SSA_NAME_DEF_STMT of the
cloned/inlined from SSA name during IPA parameter manipulation
of call stmt LHSs. gimple_call_set_lhs adjusts SSA_NAME_DEF_STMT
of the lhs to the stmt being modified but when
ipa_param_body_adjustments::modify_call_stmt is called the
cloning/inlining process has not yet remapped the stmts operands
to the copy variants but they are still original.
2021-05-11 Richard Biener <rguenther@suse.de>
PR ipa/100513
* ipa-param-manipulation.c
(ipa_param_body_adjustments::modify_call_stmt): Avoid
altering SSA_NAME_DEF_STMT by adjusting the calls LHS
via gimple_call_lhs_ptr.
Richard Biener [Tue, 11 May 2021 08:58:35 +0000 (10:58 +0200)]
middle-end/100509 - avoid folding constant to aggregate type
When folding a constant initializer looking through aliases to
incompatible types can lead to us trying to fold a constant
to an aggregate type which can't work. Simply avoid trying
to constant fold non-register typed symbols.
2021-05-11 Richard Biener <rguenther@suse.de>
PR middle-end/100509
* gimple-fold.c (fold_gimple_assign): Only call
get_symbol_constant_value on register type symbols.
Richard Biener [Mon, 10 May 2021 09:37:27 +0000 (11:37 +0200)]
tree-optimization/100492 - avoid irreducible regions in loop distribution
When we distribute away a condition we rely on the ability to
change it to either 1 != 0 or 0 != 0 depending on the direction
of the exit branch in the respective loop. But when the loop
contains an irreducible sub-region then for the conditions inside
this this fails and can lead to infinite loops being generated.
Avoid distibuting loops with irreducible sub-regions.
2021-05-10 Richard Biener <rguenther@suse.de>
PR tree-optimization/100492
* tree-loop-distribution.c (find_seed_stmts_for_distribution):
Find nothing when the loop contains an irreducible region.
PR fortran/86470
* testsuite/libgomp.fortran/class-firstprivate-1.f90: New test.
* testsuite/libgomp.fortran/class-firstprivate-2.f90: New test.
* testsuite/libgomp.fortran/class-firstprivate-3.f90: New test.
gcc/testsuite/ChangeLog:
PR fortran/86470
* gfortran.dg/gomp/class-firstprivate-1.f90: New test.
* gfortran.dg/gomp/class-firstprivate-2.f90: New test.
* gfortran.dg/gomp/class-firstprivate-3.f90: New test.
* gfortran.dg/gomp/class-firstprivate-4.f90: New test.
Alex Coplan [Mon, 10 May 2021 08:46:45 +0000 (09:46 +0100)]
arm: Fix wrong code with MVE V2DImode loads and stores [PR99960]
As the PR shows, we currently miscompile V2DImode loads and stores for
MVE. We're currently using 64-bit loads/stores, but need to be using
128-bit vector loads and stores. Fixed thusly.
Some intrinsics tests were checking that we (incorrectly) used the
64-bit loads/stores: these have been updated.
gcc/ChangeLog:
PR target/99960
* config/arm/mve.md (*mve_mov<mode>): Simplify output code. Use
vldrw.u32 and vstrw.32 for V2D[IF]mode loads and stores.
gcc/testsuite/ChangeLog:
PR target/99960
* gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_s64.c:
Update now that we're (correctly) using full 128-bit vector
loads/stores.
* gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_u64.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_z_s64.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_z_u64.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vuninitializedq_int.c: Likewise.
* gcc.target/arm/mve/intrinsics/vuninitializedq_int1.c:
Likewise.
Jakub Jelinek [Sun, 23 May 2021 11:18:10 +0000 (13:18 +0200)]
openmp: Fix up firstprivate+lastprivate clause handling [PR99928]
The C/C++ clause splitting happens very early during construct parsing,
but only the FEs later on handle possible instantiations, non-static
member handling and array section lowering.
In the OpenMP 5.0/5.1 rules, whether firstprivate is added to combined
target depends on whether it isn't also mentioned in lastprivate or map
clauses, but unfortunately I think such checks are much better done only
when the FEs perform all the above mentioned changes.
So, this patch arranges for the firstprivate clause to be copied or moved
to combined target construct (as before), but sets flags on that clause,
which tell the FE *finish_omp_clauses and the gimplifier it has been added
only conditionally and let the FEs and gimplifier DTRT for these.
2021-05-21 Jakub Jelinek <jakub@redhat.com>
PR middle-end/99928
gcc/
* tree.h (OMP_CLAUSE_FIRSTPRIVATE_IMPLICIT_TARGET): Define.
* gimplify.c (enum gimplify_omp_var_data): Fix up
GOVD_MAP_HAS_ATTACHMENTS value, add GOVD_FIRSTPRIVATE_IMPLICIT.
(omp_lastprivate_for_combined_outer_constructs): If combined target
has GOVD_FIRSTPRIVATE_IMPLICIT set for the decl, change it to
GOVD_MAP | GOVD_SEEN.
(gimplify_scan_omp_clauses): Set GOVD_FIRSTPRIVATE_IMPLICIT for
firstprivate clauses with OMP_CLAUSE_FIRSTPRIVATE_IMPLICIT.
(gimplify_adjust_omp_clauses): For firstprivate clauses with
OMP_CLAUSE_FIRSTPRIVATE_IMPLICIT either clear that bit and
OMP_CLAUSE_FIRSTPRIVATE_IMPLICIT_TARGET too, or remove it and
let it be replaced by implicit map clause.
gcc/c-family/
* c-omp.c (c_omp_split_clauses): Set OMP_CLAUSE_FIRSTPRIVATE_IMPLICIT
on firstprivate clause copy going to target construct, and for
target simd set also OMP_CLAUSE_FIRSTPRIVATE_IMPLICIT_TARGET bit.
gcc/c/
* c-typeck.c (c_finish_omp_clauses): Move firstprivate clauses with
OMP_CLAUSE_FIRSTPRIVATE_IMPLICIT to the end of the chain. Don't error
if a decl is mentioned both in map clause and in such firstprivate
clause unless OMP_CLAUSE_FIRSTPRIVATE_IMPLICIT_TARGET is also set.
gcc/cp/
* semantics.c (finish_omp_clauses): Move firstprivate clauses with
OMP_CLAUSE_FIRSTPRIVATE_IMPLICIT to the end of the chain. Don't error
if a decl is mentioned both in map clause and in such firstprivate
clause unless OMP_CLAUSE_FIRSTPRIVATE_IMPLICIT_TARGET is also set.
gcc/testsuite/
* c-c++-common/gomp/pr99928-3.c: Remove all xfails.
* c-c++-common/gomp/pr99928-15.c: New test.
Jakub Jelinek [Sun, 23 May 2021 11:09:26 +0000 (13:09 +0200)]
openmp: Fix up handling of implicit lastprivate on outer constructs for implicit linear and lastprivate IVs [PR99928]
This patch fixes the handling of lastprivate propagation to outer combined/composite
leaf constructs from implicit linear or lastprivate clauses on simd IVs and adds missing
testsuite coverage for explicit and implicit lastprivate on simd IVs.
2021-05-21 Jakub Jelinek <jakub@redhat.com>
PR middle-end/99928
* gimplify.c (omp_lastprivate_for_combined_outer_constructs): New
function.
(gimplify_scan_omp_clauses) <case OMP_CLAUSE_LASTPRIVATE>: Use it.
(gimplify_omp_for): Likewise.
* c-c++-common/gomp/pr99928-6.c: Remove all xfails.
* c-c++-common/gomp/pr99928-13.c: New test.
* c-c++-common/gomp/pr99928-14.c: New test.
Eric Botcazou [Fri, 21 May 2021 08:57:02 +0000 (10:57 +0200)]
Fix internal error on locally derived bit-packed array type
This is a regression present on the mainline, 11 and 10 branches,
in the form of an ICE on a locally derived bit-packed array type.
gcc/ada/
* gcc-interface/decl.c (gnat_to_gnu_entity) <E_Array_Type>: Process
the implementation type of a packed type implemented specially.
gcc/testsuite/
* gnat.dg/derived_type7.adb, gnat.dg/derived_type7.ads: New test.
Eric Botcazou [Fri, 21 May 2021 08:45:21 +0000 (10:45 +0200)]
Always translate Is_Pure flag into pure in C sense
Gigi has historically translated the Is_Pure flag of the front-end into
the "const" attribute of GNU C. That's correct for subprograms of pure
Ada units, but not fully exact according to the semantics of the flag.
gcc/ada/
* gcc-interface/decl.c (gnat_to_gnu_subprog_type): Always translate
the Is_Pure flag into the "pure" attribute of GNU C.
Eric Botcazou [Fri, 21 May 2021 08:40:41 +0000 (10:40 +0200)]
Fix segfault at run time on strict-alignment platforms
This fixes a regression present on the mainline and 11 branch by
restricting the problematic change dealing with bitfields whose
nomimal subtype is self-referential to the cases where the size
is really lower.
gcc/ada/
* gcc-interface/trans.c (Call_to_gnu): Restrict previous change
to bitfields whose size is not equal to the type size.
(gnat_to_gnu): Likewise.
Eric Botcazou [Fri, 21 May 2021 08:26:50 +0000 (10:26 +0200)]
Fix incorrect SLOC on instruction
This puts the missing SLOC on a statement generated by a return.
gcc/ada/
* gcc-interface/trans.c (gnat_to_gnu) <N_Simple_Return_Statement>:
Put a SLOC on the assignment from the return value to the return
object in the copy-in/copy-out case.
Jason Merrill [Thu, 20 May 2021 01:12:45 +0000 (21:12 -0400)]
c++: designated init with anonymous union [PR100489]
My patch for PR98463 added an assert that tripped on this testcase, because
we ended up with a U CONSTRUCTOR with an initializer for a, which is not a
member of U. We need to wrap the a initializer in another CONSTRUCTOR for
the anonymous union.
There was already support for this in process_init_constructor_record, but
not in process_init_constructor_union. But since this is about brace
elision, it really belongs under reshape_init rather than digest_init, so
this patch moves the handling to reshape_init_class, which also handles
unions.
PR c++/100489
gcc/cp/ChangeLog:
* decl.c (reshape_init_class): Handle designator for
member of anonymous aggregate here.
* typeck2.c (process_init_constructor_record): Not here.
Andreas Krebbel [Tue, 27 Apr 2021 08:09:06 +0000 (10:09 +0200)]
PR100281 C++: Fix SImode pointer handling
The problem appears to be triggered by two locations in the front-end
where non-POINTER_SIZE pointers aren't handled right now.
1. An assertion in strip_typedefs is triggered because the alignment
of the types don't match. This in turn is caused by creating the new
type with build_pointer_type instead of taking the type of the
original pointer into account.
2. An assertion in cp_convert_to_pointer is triggered which expects
the target type to always have POINTER_SIZE.
gcc/cp/ChangeLog:
PR c++/100281
* cvt.c (cp_convert_to_pointer): Use the size of the target
pointer type.
* tree.c (cp_build_reference_type): Call
cp_build_reference_type_for_mode with VOIDmode.
(cp_build_reference_type_for_mode): Rename from
cp_build_reference_type. Add MODE argument and invoke
build_reference_type_for_mode.
(strip_typedefs): Use build_pointer_type_for_mode and
cp_build_reference_type_for_mode for pointers and references.
gcc/ChangeLog:
PR c++/100281
* tree.c (build_reference_type_for_mode)
(build_pointer_type_for_mode): Pick pointer mode if MODE argument
is VOIDmode.
(build_reference_type, build_pointer_type): Invoke
build_*_type_for_mode with VOIDmode.
gcc/testsuite/ChangeLog:
PR c++/100281
* g++.target/s390/pr100281-1.C: New test.
* g++.target/s390/pr100281-2.C: New test.
Joern Rennecke [Thu, 20 May 2021 12:21:41 +0000 (13:21 +0100)]
libstdc++: Disable floating_to_chars.cc on 16 bit targets
This patch conditionally disables the compilation of floating_to_chars.cc
on 16 bit targets, thus fixing a build failure for these targets as
the POW10_SPLIT_2 array exceeds the maximum object size.
libstdc++-v3/
PR libstdc++/100361
* include/std/charconv (to_chars): Hide the overloads for
floating-point types for 16 bit targets.
* src/c++17/floating_to_chars.cc: Don't compile for 16 bit targets.
* testsuite/20_util/to_chars/double.cc: Run this test only on
size32plus targets.
* testsuite/20_util/to_chars/float.cc: Likewise.
* testsuite/20_util/to_chars/long_double.cc: Likewise.
Jakub Jelinek [Thu, 20 May 2021 11:30:48 +0000 (13:30 +0200)]
openmp: Handle explicit linear clause properly in combined constructs with target [PR99928]
linear clause should have the effect of firstprivate+lastprivate (or for IVs
not declared in the construct lastprivate) on outer constructs and eventually
map(tofrom:) on target when combined with it.
2021-05-20 Jakub Jelinek <jakub@redhat.com>
PR middle-end/99928
* gimplify.c (gimplify_scan_omp_clauses) <case OMP_CLAUSE_LINEAR>: For
explicit linear clause when combined with target, make it map(tofrom:)
instead of no clause or firstprivate.
* c-c++-common/gomp/pr99928-4.c: Remove all xfails.
* c-c++-common/gomp/pr99928-5.c: Likewise.
Jason Merrill [Wed, 19 May 2021 21:33:21 +0000 (17:33 -0400)]
c++: _Complex template parameter [PR100634]
We were crashing because invalid_nontype_parm_type_p allowed _Complex
template parms, but convert_nontype_argument didn't know what to do for
them. Let's just disallow it, people can and should use std::complex
instead.
PR c++/100634
gcc/cp/ChangeLog:
* pt.c (invalid_nontype_parm_type_p): Return true for COMPLEX_TYPE.
Jason Merrill [Wed, 19 May 2021 20:40:24 +0000 (16:40 -0400)]
c++: ICE with using and enum [PR100659]
Here the code for 'using enum' is confused by the combination of a
using-decl and an enum that are not from 'using enum'; this CONST_DECL is
from the normal unscoped enum scoping.
PR c++/100659
gcc/cp/ChangeLog:
* cp-tree.h (CONST_DECL_USING_P): Check for null TREE_TYPE.
Jason Merrill [Tue, 18 May 2021 16:29:33 +0000 (12:29 -0400)]
c++: ICE with <=> fallback [PR100367]
Here, when genericizing lexicographical_compare_three_way, we haven't yet
walked the operands, so (a == a) still sees ADDR_EXPR <a>, but this is after
we've changed the type of a to REFERENCE_TYPE. When we try to fold (a == a)
by constexpr evaluation, the constexpr code doesn't understand trying to
take the address of a reference, and we end up crashing.
Fixed by avoiding constexpr evaluation in genericize_spaceship, by using
fold_build2 instead of build_new_op on scalar operands. Class operands
should have been expanded during parsing.
PR c++/100367
PR c++/96299
gcc/cp/ChangeLog:
* method.c (genericize_spaceship): Use fold_build2 for scalar
operands.
Christophe Lyon [Wed, 19 May 2021 14:45:54 +0000 (14:45 +0000)]
arm/testsuite: Fix testcase for PR99977
Some targets (eg arm-none-uclinuxfdpiceabi) do not support Thumb-1,
and since the testcase forces -march=armv8-m.base, we need to check
whether this option is actually supported.
Using dg-add-options arm_arch_v8m_base ensure that we pass -mthumb as
needed too.
Jakub Jelinek [Wed, 19 May 2021 07:41:55 +0000 (09:41 +0200)]
openmp: Handle lastprivate on combined target correctly [PR99928]
This patch deals with 2 issues:
1) the gimplifier couldn't differentiate between
#pragma omp parallel master
#pragma omp taskloop simd
and
#pragma omp parallel master taskloop simd
when there is a significant difference for clause handling between
the two; as master construct doesn't have any clauses, we don't currently
represent it during gimplification by an gimplification omp context at all,
so this patch makes sure we don't set OMP_PARALLEL_COMBINED on parallel master
when not combined further. If we ever add a separate master context during
gimplification, we'd use ORT_COMBINED_MASTER vs. ORT_MASTER (or MASKED probably).
2) lastprivate when combined with target should be map(tofrom:) on the target,
this change handles it only when not combined with firstprivate though, that
will need further work (similarly to linear or reduction).
2021-05-19 Jakub Jelinek <jakub@redhat.com>
PR middle-end/99928
gcc/
* tree.h (OMP_MASTER_COMBINED): Define.
* gimplify.c (gimplify_scan_omp_clauses): Rewrite lastprivate
handling for outer combined/composite constructs to a loop.
Handle lastprivate on combined target.
(gimplify_expr): Formatting fix.
gcc/c/
* c-parser.c (c_parser_omp_master): Set OMP_MASTER_COMBINED on
master when combined with taskloop.
(c_parser_omp_parallel): Don't set OMP_PARALLEL_COMBINED on
parallel master when not combined with taskloop.
gcc/cp/
* parser.c (cp_parser_omp_master): Set OMP_MASTER_COMBINED on
master when combined with taskloop.
(cp_parser_omp_parallel): Don't set OMP_PARALLEL_COMBINED on
parallel master when not combined with taskloop.
gcc/testsuite/
* c-c++-common/gomp/pr99928-2.c: Remove all xfails.
* c-c++-common/gomp/pr99928-12.c: New test.
Jason Merrill [Tue, 18 May 2021 21:15:42 +0000 (17:15 -0400)]
c++: ICE with bad definition of decimal32 [PR100261]
The change to only look at the global binding for non-classes meant that
here, when dealing with decimal32 which is magically mangled like its first
non-static data member, we got a collision with the mangling for float.
Fixed by also looking up an existing binding for such magical classes.
Here we have a pack expansion of a template template parameter pack, of
which the pattern is a TEMPLATE_DECL, which strip_typedefs doesn't want to
see.
PR c++/100372
gcc/cp/ChangeLog:
* tree.c (strip_typedefs): Only look at the pattern of a
TYPE_PACK_EXPANSION if it's a type.
Jason Merrill [Tue, 18 May 2021 16:06:36 +0000 (12:06 -0400)]
c++: "perfect" implicitly deleted move [PR100644]
Here we were ignoring the template constructor because the implicit move
constructor had all perfect conversions. But CWG1402 says that an
implicitly deleted move constructor is ignored by overload resolution; we
implement that instead by preferring any other candidate in joust, to get
better diagnostics, but that means we need to handle that case here as well.
gcc/cp/ChangeLog:
PR c++/100644
* call.c (perfect_candidate_p): An implicitly deleted move
is not perfect.