git.ipfire.org Git - thirdparty/gcc.git/log

Hafiz Abid Qadeer [Wed, 9 Mar 2022 21:40:45 +0000 (21:40 +0000)]

Fix an ICE with allocate directive.

Add case for OMP_CLAUSE_ALLOCATOR in walk_tree_1. This helps fix
an ICE which occurs only on OG11 with allocate directive.

Please note that this change is not needed on master. The code
there handles all clauses in the same way so a special case for
OMP_CLAUSE_ALLOCATOR is not required.

gcc/
* tree.c (walk_tree_1): Add case for OMP_CLAUSE_ALLOCATOR.

Hafiz Abid Qadeer [Wed, 9 Mar 2022 14:09:45 +0000 (14:09 +0000)]

Lower allocate directive (OpenMP 5.0).

Backport of a patch posted at
https://gcc.gnu.org/pipermail/gcc-patches/2022-January/588372.html

This patch looks for malloc/free calls that were generated by an allocate
statement associated with an allocate directive and replaces them with
GOMP_alloc and GOMP_free.
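
As a rough illustration of the rewrite described above, here is a toy model in
C. It is not the omp-low.c implementation: the statement kinds, the flat
statement array, and the name lower_allocate_calls are all hypothetical. It
assumes the kind=allocate and kind=free directive markers that other patches
in this series place next to the malloc and free calls.

```c
#include <stddef.h>

/* Hypothetical statement kinds; real GIMPLE is far richer.  */
enum stmt_kind
{
  MARK_ALLOCATE,   /* allocate directive with kind=allocate */
  MARK_FREE,       /* allocate directive with kind=free */
  CALL_MALLOC,
  CALL_FREE,
  CALL_GOMP_ALLOC,
  CALL_GOMP_FREE,
  OTHER_STMT
};

/* Rewrite the first malloc after a kind=allocate marker and the first
   free after a kind=free marker.  Calls that are not preceded by a
   marker are left alone.  Returns the number of calls rewritten.  */
size_t
lower_allocate_calls (enum stmt_kind *stmts, size_t n)
{
  size_t rewritten = 0;
  int want_alloc = 0, want_free = 0;

  for (size_t i = 0; i < n; i++)
    switch (stmts[i])
      {
      case MARK_ALLOCATE:
        want_alloc = 1;
        break;
      case MARK_FREE:
        want_free = 1;
        break;
      case CALL_MALLOC:
        if (want_alloc)
          {
            stmts[i] = CALL_GOMP_ALLOC;
            want_alloc = 0;
            rewritten++;
          }
        break;
      case CALL_FREE:
        if (want_free)
          {
            stmts[i] = CALL_GOMP_FREE;
            want_free = 0;
            rewritten++;
          }
        break;
      default:
        break;
      }
  return rewritten;
}
```

The point of the markers in this model is that an ordinary malloc or free
elsewhere in the function is never touched; only calls that follow a marker
are replaced.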

gcc/ChangeLog:

* omp-low.c (scan_sharing_clauses): Handle OMP_CLAUSE_ALLOCATOR.
(scan_omp_allocate): New.
(scan_omp_1_stmt): Call it.
(lower_omp_allocate): New function.
(lower_omp_1): Call it.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/allocate-6.f90: Add tests.

libgomp/ChangeLog:

* testsuite/libgomp.fortran/allocate-2.f90: New test.
* gfortran.dg/gomp/allocate-7.f90: New test.
* gfortran.dg/gomp/allocate-8.f90: New test.

Hafiz Abid Qadeer [Wed, 9 Mar 2022 13:42:35 +0000 (13:42 +0000)]

Gimplify allocate directive (OpenMP 5.0).

Backport of a patch posted at
https://gcc.gnu.org/pipermail/gcc-patches/2022-January/588371.html

gcc/ChangeLog:

* doc/gimple.texi: Describe GIMPLE_OMP_ALLOCATE.
* gimple-pretty-print.c (dump_gimple_omp_allocate): New function.
(pp_gimple_stmt_1): Call it.
* gimple.c (gimple_build_omp_allocate): New function.
* gimple.def (GIMPLE_OMP_ALLOCATE): New node.
* gimple.h (enum gf_mask): Add GF_OMP_ALLOCATE_KIND_MASK,
GF_OMP_ALLOCATE_KIND_ALLOCATE and GF_OMP_ALLOCATE_KIND_FREE.
(struct gomp_allocate): New.
(is_a_helper <gomp_allocate *>::test): New.
(is_a_helper <const gomp_allocate *>::test): New.
(gimple_build_omp_allocate): Declare.
(gimple_omp_subcode): Replace GIMPLE_OMP_TEAMS with
GIMPLE_OMP_ALLOCATE.
(gimple_omp_allocate_set_clauses): New.
(gimple_omp_allocate_set_kind): Likewise.
(gimple_omp_allocate_clauses): Likewise.
(gimple_omp_allocate_kind): Likewise.
(CASE_GIMPLE_OMP): Add GIMPLE_OMP_ALLOCATE.
* gimplify.c (gimplify_omp_allocate): New.
(gimplify_expr): Call it.
* gsstruct.def (GSS_OMP_ALLOCATE): Define.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/allocate-6.f90: Add tests.

Hafiz Abid Qadeer [Sat, 8 Jan 2022 18:52:09 +0000 (18:52 +0000)]

Handle cleanup of omp allocated variables (OpenMP 5.0).

Currently we only handle an omp allocate directive that is associated
with an allocate statement. This statement results in malloc and free calls.
The malloc calls are easy to find because they are in the same block as the
allocate directive, but the free calls come in a separate cleanup block. To
help later passes find them, an allocate directive with kind=free is
generated in the cleanup block. The normal allocate directive is given
kind=allocate.

Backport of a patch posted at
https://gcc.gnu.org/pipermail/gcc-patches/2022-January/588370.html

gcc/fortran/ChangeLog:

* gfortran.h (struct access_ref): Declare new members
omp_allocated and omp_allocated_end.
* openmp.c (gfc_match_omp_allocate): Set new_st.resolved_sym to
NULL.
(prepare_omp_allocated_var_list_for_cleanup): New function.
(gfc_resolve_omp_allocate): Call it.
* trans-decl.c (gfc_trans_deferred_vars): Process omp_allocated.
* trans-openmp.c (gfc_trans_omp_allocate): Set kind for the stmt
generated for allocate directive.

gcc/ChangeLog:

* tree-core.h (struct tree_base): Add comments.
* tree-pretty-print.c (dump_generic_node): Handle allocate directive
kind.
* tree.h (OMP_ALLOCATE_KIND_ALLOCATE): New define.
(OMP_ALLOCATE_KIND_FREE): Likewise.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/allocate-6.f90: Test kind of allocate directive.

Hafiz Abid Qadeer [Wed, 9 Mar 2022 11:52:49 +0000 (11:52 +0000)]

Translate allocate directive (OpenMP 5.0).

Backport of a patch posted at
https://gcc.gnu.org/pipermail/gcc-patches/2022-January/588369.html

gcc/fortran/ChangeLog:

* trans-openmp.c (gfc_trans_omp_clauses): Handle OMP_LIST_ALLOCATOR.
(gfc_trans_omp_allocate): New function.
(gfc_trans_omp_directive): Handle EXEC_OMP_ALLOCATE.

gcc/ChangeLog:

* tree-pretty-print.c (dump_omp_clause): Handle OMP_CLAUSE_ALLOCATOR.
(dump_generic_node): Handle OMP_ALLOCATE.
* tree.def (OMP_ALLOCATE): New.
* tree.h (OMP_ALLOCATE_CLAUSES): Likewise.
(OMP_ALLOCATE_DECL): Likewise.
(OMP_ALLOCATE_ALLOCATOR): Likewise.
* tree.c (omp_clause_num_ops): Add entry for OMP_CLAUSE_ALLOCATOR.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/allocate-6.f90: New test.

Hafiz Abid Qadeer [Wed, 9 Mar 2022 11:36:04 +0000 (11:36 +0000)]

Add parsing support for allocate directive (OpenMP 5.0)

Currently we only make use of this directive when it is associated
with an allocate statement.

Backport of a patch posted at
https://gcc.gnu.org/pipermail/gcc-patches/2022-January/588368.html

gcc/fortran/ChangeLog:

* dump-parse-tree.c (show_omp_node): Handle EXEC_OMP_ALLOCATE.
(show_code_node): Likewise.
* gfortran.h (enum gfc_statement): Add ST_OMP_ALLOCATE.
(OMP_LIST_ALLOCATOR): New enum value.
(enum gfc_exec_op): Add EXEC_OMP_ALLOCATE.
* match.h (gfc_match_omp_allocate): New function.
* openmp.c (enum omp_mask1): Add OMP_CLAUSE_ALLOCATOR.
(OMP_ALLOCATE_CLAUSES): New define.
(gfc_match_omp_allocate): New function.
(resolve_omp_clauses): Add ALLOCATOR in clause_names.
(omp_code_to_statement): Handle EXEC_OMP_ALLOCATE.
(EMPTY_VAR_LIST): New define.
(check_allocate_directive_restrictions): New function.
(gfc_resolve_omp_allocate): Likewise.
(gfc_resolve_omp_directive): Handle EXEC_OMP_ALLOCATE.
* parse.c (decode_omp_directive): Handle ST_OMP_ALLOCATE.
(next_statement): Likewise.
(gfc_ascii_statement): Likewise.
* resolve.c (gfc_resolve_code): Handle EXEC_OMP_ALLOCATE.
* st.c (gfc_free_statement): Likewise.
* trans.c (trans_code): Likewise.

Hafiz Abid Qadeer [Mon, 21 Feb 2022 13:54:57 +0000 (13:54 +0000)]

Set omp_requires_mask for dynamic_allocators.

This is a backport of a patch posted at
https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590655.html

This patch fixes an issue where gfortran accepts
'requires dynamic_allocators' but does not set omp_requires_mask
accordingly.

gcc/fortran/ChangeLog:

* parse.c (gfc_parse_file): Set OMP_REQUIRES_DYNAMIC_ALLOCATORS
bit in omp_requires_mask.

Hafiz Abid Qadeer [Fri, 18 Feb 2022 21:28:08 +0000 (21:28 +0000)]

Add a restriction on allocate clause (OpenMP 5.0)

This is a backport of a patch posted at
https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590597.html

An allocate clause in a target region must specify an allocator
unless the compilation unit has a requires directive with the
dynamic_allocators clause. The current implementation of the allocate
clause did not check for this restriction. This patch fills that
gap.
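
The restriction can be written down as a small predicate. The following is
only a sketch of the rule as stated above, not the check added to
scan_sharing_clauses; the struct and function names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical summary of one allocate clause and its context.  */
struct allocate_clause_ctx
{
  bool in_target_region;            /* clause appears in a target region */
  bool has_allocator;               /* clause names an allocator */
  bool requires_dynamic_allocators; /* unit has 'requires dynamic_allocators' */
};

/* Return true if the clause satisfies the restriction: inside a target
   region an allocator is mandatory unless dynamic_allocators is
   required.  Outside a target region the restriction does not apply.  */
bool
allocate_clause_is_valid (const struct allocate_clause_ctx *ctx)
{
  if (!ctx->in_target_region)
    return true;
  return ctx->has_allocator || ctx->requires_dynamic_allocators;
}
```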

gcc/ChangeLog:

* omp-low.c (omp_maybe_offloaded_ctx): New prototype.
(scan_sharing_clauses): Check a restriction on allocate clause.

gcc/testsuite/ChangeLog:

* c-c++-common/gomp/allocate-2.c: Add tests.
* c-c++-common/gomp/allocate-8.c: New test.
* gfortran.dg/gomp/allocate-3.f90: Add tests.
* gcc.dg/gomp/pr104517.c: Update.

Hafiz Abid Qadeer [Mon, 31 Jan 2022 19:02:14 +0000 (19:02 +0000)]

Fix multiple issues in the testcase allocate-1.f90.

This is a backport of a patch posted at
https://gcc.gnu.org/pipermail/gcc-patches/2022-February/589928.html

1. Thomas reported in
https://gcc.gnu.org/pipermail/gcc-patches/2022-January/589039.html
that this testcase fails randomly. The cause was a fixed pool size
that was exhausted when there were a lot of threads. Fixed by removing
the pool_size trait, so that the default pool size is used, which
should be big enough.

2. Array indices have been changed to check the last element in the
array.

3. Remove a redundant assignment and move some code to better match
the C testcase.

libgomp/ChangeLog:

* testsuite/libgomp.fortran/allocate-1.f90: Remove pool_size
trait. Test last index in w and v arrays. Remove redundant
assignment to V(1). Move alignment checks to the end of the
parallel region.

Hafiz Abid Qadeer [Fri, 24 Sep 2021 09:04:12 +0000 (10:04 +0100)]

Add support for allocate clause (OpenMP 5.0).

This patch adds support for the OpenMP 5.0 allocate clause for Fortran. It does
not yet support the allocator-modifier as specified in OpenMP 5.1. The allocate
clause is already supported in C/C++.

This commit contains the following 4 upstream commits.

69561fc781a Add support for allocate clause (OpenMP 5.0).
deb9f18f677 Change kind of integer literal to fix a testcase.
6d498135016 libgomp.fortran/allocate-1.f90: Minor cleanup
f62156eab7b libgomp.fortran/allocate-1.f90: Fix minor cleanup

gcc/fortran/ChangeLog:

* dump-parse-tree.c (show_omp_clauses): Handle OMP_LIST_ALLOCATE.
* gfortran.h (OMP_LIST_ALLOCATE): New enum value.
* openmp.c (enum omp_mask1): Add OMP_CLAUSE_ALLOCATE.
(gfc_match_omp_clauses): Handle OMP_CLAUSE_ALLOCATE.
(OMP_PARALLEL_CLAUSES, OMP_DO_CLAUSES, OMP_SECTIONS_CLAUSES)
(OMP_TASK_CLAUSES, OMP_TASKLOOP_CLAUSES, OMP_TARGET_CLAUSES)
(OMP_TEAMS_CLAUSES, OMP_DISTRIBUTE_CLAUSES)
(OMP_SINGLE_CLAUSES): Add OMP_CLAUSE_ALLOCATE.
(OMP_TASKGROUP_CLAUSES): New.
(gfc_match_omp_taskgroup): Use OMP_TASKGROUP_CLAUSES instead of
OMP_CLAUSE_TASK_REDUCTION.
(resolve_omp_clauses): Handle OMP_LIST_ALLOCATE.
(resolve_omp_do): Avoid warning when loop iteration variable is
in allocate clause.
* trans-openmp.c (gfc_trans_omp_clauses): Handle translation of
allocate clause.
(gfc_split_omp_clauses): Update for OMP_LIST_ALLOCATE.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/allocate-1.f90: New test.
* gfortran.dg/gomp/allocate-2.f90: New test.
* gfortran.dg/gomp/allocate-3.f90: New test.
* gfortran.dg/gomp/collapse1.f90: Update error message.
* gfortran.dg/gomp/openmp-simd-4.f90: Likewise.
* gfortran.dg/gomp/clauses-1.f90: Uncomment allocate clause.

libgomp/ChangeLog:

* testsuite/libgomp.fortran/allocate-1.c: New test.
* testsuite/libgomp.fortran/allocate-1.f90: New test.
* libgomp.texi: Remove string that says that allocate clause
support is for C/C++ only.

(cherry picked from commit 69561fc781aca3dea3aa4d5d562ef5a502965924)

Change kind of integer literal to fix a testcase.

As Thomas reported in
https://gcc.gnu.org/pipermail/gcc-patches/2022-January/588448.html
a test added in my recent allocate clause patch fails with -m32. This was
because the default integer kind matches c_intptr_t with -m32. I have now
changed it to 0_1 so that an integer with kind=1 is always used.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/allocate-2.f90: Change 0 to 0_1.

(cherry picked from commit deb9f18f67788c36f4652bca101d93faf07ecf39)

libgomp.fortran/allocate-1.f90: Minor cleanup

libgomp/ChangeLog:
* testsuite/libgomp.fortran/allocate-1.c (is_64bit_aligned): Renamed
from is_64bit_aligned_.
* testsuite/libgomp.fortran/allocate-1.f90: Fix interface decl
and use it, more implicit none, remove unused argument.

(cherry picked from commit 6d4981350168f1eb3f72149bd7e05b9ba6bec1fd)

libgomp.fortran/allocate-1.f90: Fix minor cleanup

libgomp/ChangeLog:
* testsuite/libgomp.fortran/allocate-1.f90: Remove spurious
STOP of previous commit.

(cherry picked from commit f62156eab7b757d1ee03a11d5c96c72bd3de079c)

Harald Anlauf [Sat, 7 Aug 2021 18:30:32 +0000 (20:30 +0200)]

Fortran: ICE with automatic character object, save, and various options

gcc/fortran/ChangeLog:

PR fortran/68568
* primary.c (gfc_expr_attr): Variable attribute can only be
inquired when symtree is non-NULL.

(cherry picked from commit cd754efa9a5349c693919046b8be074395ea114e)

Harald Anlauf [Sun, 9 Jan 2022 21:08:14 +0000 (22:08 +0100)]

Fortran: reject invalid non-constant pointer initialization targets

gcc/fortran/ChangeLog:

PR fortran/101762
* expr.c (gfc_check_pointer_assign): For pointer initialization
targets, check that subscripts and substring indices in
specifications are constant expressions.

gcc/testsuite/ChangeLog:

PR fortran/101762
* gfortran.dg/pr101762.f90: New test.

(cherry picked from commit 2e63128306ff93d8f53119137dd6c28b2defac94)

Tobias Burnus [Fri, 22 Oct 2021 21:23:06 +0000 (23:23 +0200)]

Fortran: Avoid running into assert with -fcheck= + UBSAN

PR fortran/92621
gcc/fortran/
* trans-expr.c (gfc_trans_assignment_1): Add STRIP_NOPS.

gcc/testsuite/
* gfortran.dg/bind-c-intent-out-2.f90: New test.

(cherry picked from commit 24e99e6ec1cc57f3660c00ff677c7feb16aa94d2)

Tobias Burnus [Wed, 2 Mar 2022 19:02:15 +0000 (20:02 +0100)]

Fortran/OpenMP: class.cc fix for mapping of DT with allocatable components

This commit: OG11 version.
GCC 12/mainline submission (previous commit and this follow up):
https://gcc.gnu.org/pipermail/gcc-patches/2022-March/591144.html

gcc/fortran/ChangeLog:

* class.c (generate_callback_wrapper): Fixes.

Tobias Burnus [Tue, 1 Mar 2022 15:35:08 +0000 (16:35 +0100)]

Fortran/OpenMP: Support mapping of DT with allocatable components

This commit: OG11 version.
GCC 12/mainline submission:
https://gcc.gnu.org/pipermail/gcc-patches/2022-March/591075.html

gcc/fortran/ChangeLog:

* class.c (finalization_scalarizer): Mark syms as artificial.
(generate_callback_wrapper): New.
(gfc_find_derived_vtab): Call it, add _callback comp.
* f95-lang.c (LANG_HOOKS_OMP_DEEP_MAPPING,
LANG_HOOKS_OMP_DEEP_MAPPING_P,
LANG_HOOKS_OMP_DEEP_MAPPING_CNT): Redefine.
* gfortran.h (gfc_import_iso_c_binding_module,
GFC_CLASS_CALLBACK_DEFAULT_FLAG, GFC_CLASS_CALLBACK_VTABLE_FLAG,
GFC_CLASS_CB_ALLOCATABLE, GFC_CLASS_CB_POINTER,
GFC_CLASS_CB_PROC_POINTER, GFC_CLASS_CB_VTABLE,
GFC_CLASS_CB_VPTR): New.
* match.c (select_type_set_tmp): Propagate allocatable property.
* module.c (MOD_VERSION): Bump due to vtab change.
(import_iso_c_binding_module): New import_all arg.
(gfc_import_iso_c_binding_module): New.
(gfc_use_module): Update call.
* openmp.c (resolve_omp_clauses): Accept DT with alloc comps.
* resolve.c (gfc_resolve_formal_arglist, gfc_resolve_intrinsic,
resolve_fl_procedure, resolve_types): Permit some violations
for internal code.
* trans-array.c (gfc_conv_descriptor_stride_get,
gfc_tree_array_size, gfc_full_array_size): Update
for GFC_TYPE_ARRAY_AKIND change.
(gfc_conv_expr_descriptor): Likewise; permit calling with tree code.
* trans-expr.c (VTABLE_CALLBACK_FIELD): Add.
(VTAB_GET_FIELD_GEN): Use it.
(VTABLE_DEALLOCATE_FIELD): Undef at the end.
(gfc_conv_expr_reference): Fixes; avoid unnecessary temp var.
* trans-intrinsic.c (gfc_conv_intrinsic_sizeof,
gfc_conv_associated): Fix class and comp-ref handling.
(conv_isocbinding_function): Remove buggy code.
* trans-openmp.c (gfc_has_alloc_comps): Add ptr_ok arg.
(gfc_omp_private_outer_ref, gfc_walk_alloc_comps,
gfc_omp_clause_default_ctor, gfc_omp_clause_copy_ctor,
gfc_omp_clause_assign_op, gfc_omp_clause_dtor,
gfc_omp_finish_clause): Update call.
(GFC_MAP_TOKEN_DATA, GFC_MAP_TOKEN_SIZES, GFC_MAP_TOKEN_KINDS,
GFC_MAP_TOKEN_DATA_OFFSET, GFC_MAP_TOKEN_OFFSET,
GFC_MAP_TOKEN_FLAGS, GFC_MAP_TOKEN_DETACH): Define.
(gfc_omp_get_token_data, gfc_omp_get_token_sizes,
gfc_omp_get_token_kinds, gfc_omp_get_token_offset_data,
gfc_omp_get_token_offset, gfc_omp_get_token_flags,
gfc_omp_get_token_detach, gfc_omp_get_map_token_type,
gfc_omp_get_cb_type, gfc_omp_gen_deep_map_fn,
gfc_omp_deep_mapping_map, gfc_omp_deep_mapping_item,
gfc_omp_deep_mapping_comps, gfc_omp_gen_simple_loop,
gfc_omp_get_array_size, gfc_omp_elmental_loop,
gfc_omp_deep_map_kind_p, gfc_omp_deep_mapping_int_p,
gfc_omp_deep_mapping_p, gfc_omp_deep_mapping_do,
gfc_omp_deep_mapping_cnt, gfc_omp_deep_mapping): New.
(gfc_trans_omp_array_section): Save clause decl to survive gimplifying.
(gfc_trans_omp_clauses): Likewise; fixes.
* trans-types.c (gfc_build_array_type, gfc_get_derived_type,
gfc_get_array_descr_info): Update array kind to distinguish
different assumed-rank arrays.
* trans.h (gfc_class_vtab_callback_get, gfc_omp_deep_mapping_p,
gfc_omp_deep_mapping_cnt, gfc_omp_deep_mapping): New prototypes.
(enum gfc_array_kind): Additional GFC_ARRAY_ASSUMED_RANK_* entries.

gcc/ChangeLog:

* langhooks-def.h (lhd_omp_deep_mapping_p,
lhd_omp_deep_mapping_cnt, lhd_omp_deep_mapping): New.
(LANG_HOOKS_OMP_DEEP_MAPPING_P, LANG_HOOKS_OMP_DEEP_MAPPING_CNT,
LANG_HOOKS_OMP_DEEP_MAPPING): Define.
(LANG_HOOKS_DECLS): Use it.
* langhooks.c (lhd_omp_deep_mapping_p, lhd_omp_deep_mapping_cnt,
lhd_omp_deep_mapping): New stubs.
* langhooks.h (struct lang_hooks_for_decls): Add new hooks.
* omp-expand.c (expand_omp_target): Handle dynamic-size
addr/sizes/kinds arrays.
* omp-low.c (build_sender_ref, fixup_child_record_type,
scan_sharing_clauses, lower_omp_target): Update to handle
new hooks and dynamic-size addr/sizes/kinds arrays.

libgomp/ChangeLog:

* testsuite/libgomp.fortran/allocatable-comp.f90: New test.
* testsuite/libgomp.fortran/map-alloc-comp-3.f90: New test.
* testsuite/libgomp.fortran/map-alloc-comp-4.f90: New test.
* testsuite/libgomp.fortran/map-alloc-comp-5.f90: New test.
* testsuite/libgomp.fortran/map-alloc-comp-6.f90: New test.
* testsuite/libgomp.fortran/map-alloc-comp-7.f90: New test.

gcc/testsuite/ChangeLog:

* gfortran.dg/c_loc_test_22.f90: Update scan-tree.
* gfortran.dg/finalize_21.f90: Likewise.
* gfortran.dg/gomp/map-alloc-comp-1.f90: Remove sorry dg-error.

Tobias Burnus [Tue, 15 Feb 2022 20:42:33 +0000 (21:42 +0100)]

Fortran/OpenMP: Fix depend-clause handling for c_ptr

gcc/fortran/ChangeLog:

* trans-openmp.cc (gfc_trans_omp_depobj): Fix to alloc/ptr dummy
and for c_ptr.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/depend-4.f90: Add VALUE test, update scan test.
* gfortran.dg/gomp/depend-5.f90: Fix scan tree for -m32.
* gfortran.dg/gomp/depend-6.f90: New test.

(cherry picked from commit 4d74ea551734694c225643c4069b1b4d4d2b05ed)

Tobias Burnus [Tue, 15 Feb 2022 11:26:48 +0000 (12:26 +0100)]

Fortran/OpenMP: Fix depend-clause handling

gcc/fortran/ChangeLog:

* trans-openmp.cc (gfc_trans_omp_clauses, gfc_trans_omp_depobj):
Depend on the proper addr, for ptr/alloc depend on pointee.

libgomp/ChangeLog:

* testsuite/libgomp.fortran/depend-4.f90: New test.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/depend-4.f90: New test.
* gfortran.dg/gomp/depend-5.f90: New test.

(cherry picked from commit 3939c1b11279dc950d2f160eb940dd791f7b40f1)

Tobias Burnus [Thu, 10 Feb 2022 08:30:19 +0000 (09:30 +0100)]

Fortran/OpenMP: Avoid ICE for invalid char array in omp atomic [PR104329]

PR fortran/104329
gcc/fortran/ChangeLog:

* openmp.c (resolve_omp_atomic): Defer extra-code assert after
other diagnostics.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/atomic-28.f90: New test.

(cherry picked from commit 9694f6121982668285a21020b55b44c3099f7042)

Tobias Burnus [Tue, 4 Jan 2022 13:58:06 +0000 (14:58 +0100)]

libgomp/testsuite: Improve omp_get_device_num() tests

Related to r12-6208-gebc853deb7cc0487de9ef6e891a007ba853d1933
"libgomp: Fix GOMP_DEVICE_NUM_VAR stringification during offload image load"

That commit fixed an issue with omp_get_device_num() on gcn/nvptx that
caused it to always return 0.
This commit modifies the tests to iterate over all devices, so that on a
system with multiple non-host devices they would have detected that
always-zero issue.

libgomp/ChangeLog:

* testsuite/libgomp.c-c++-common/target-45.c: Iterate over all devices.
* testsuite/libgomp.fortran/target10.f90: Likewise.

(cherry picked from commit be661959a6b6d8f9c3c8608a746789e7b2ec3ca4)

Tobias Burnus [Sun, 27 Feb 2022 21:00:06 +0000 (22:00 +0100)]

Fortran: Handle compare in OpenMP atomic

gcc/fortran/ChangeLog:

PR fortran/103576
* openmp.c (is_scalar_intrinsic_expr): Fix condition.
(resolve_omp_atomic): Fix/update checks, accept compare.
* trans-openmp.c (gfc_trans_omp_atomic): Handle compare.

libgomp/ChangeLog:

* libgomp.texi (OpenMP 5.1): Set Fortran support for atomic to 'Y'.
* testsuite/libgomp.fortran/atomic-19.f90: New test.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/atomic-25.f90: Remove sorry, fix + add checks.
* gfortran.dg/gomp/atomic-26.f90: Likewise.
* gfortran.dg/gomp/atomic-21.f90: New test.

(cherry picked from commit 494ebfa7c9aacaeb6ec1fccc47a0e49f31eb2bb8)

Tobias Burnus [Sat, 4 Dec 2021 18:39:43 +0000 (19:39 +0100)]

Fortran/OpenMP: Support most of 5.1 atomic extensions

Implements most of the OpenMP 5.1 atomic extensions,
except that 'compare' is parsed but rejected during
resolution (as the trans-openmp.c handling is missing).

gcc/fortran/ChangeLog:

* dump-parse-tree.c (show_omp_clauses): Handle
weak/compare/fail clause.
* gfortran.h (gfc_omp_clauses): Add weak, compare, fail.
* openmp.c (enum omp_mask1, gfc_match_omp_clauses,
OMP_ATOMIC_CLAUSES): Update for new clauses.
(gfc_match_omp_atomic): Update for 5.1 atomic changes.
(is_conversion): Support widening in one go.
(is_scalar_intrinsic_expr): New.
(resolve_omp_atomic): Update for 5.1 atomic changes.
* parse.c (parse_omp_oacc_atomic): Update for compare.
* resolve.c (gfc_resolve_blocks): Update asserts.
* trans-openmp.c (gfc_trans_omp_atomic): Handle new clauses.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/atomic-2.f90: Move now supported code to ...
* gfortran.dg/gomp/atomic.f90: here.
* gfortran.dg/gomp/atomic-10.f90: New test.
* gfortran.dg/gomp/atomic-12.f90: New test.
* gfortran.dg/gomp/atomic-15.f90: New test.
* gfortran.dg/gomp/atomic-16.f90: New test.
* gfortran.dg/gomp/atomic-17.f90: New test.
* gfortran.dg/gomp/atomic-18.f90: New test.
* gfortran.dg/gomp/atomic-19.f90: New test.
* gfortran.dg/gomp/atomic-20.f90: New test.
* gfortran.dg/gomp/atomic-22.f90: New test.
* gfortran.dg/gomp/atomic-24.f90: New test.
* gfortran.dg/gomp/atomic-25.f90: New test.
* gfortran.dg/gomp/atomic-26.f90: New test.

libgomp/ChangeLog:

* libgomp.texi (OpenMP 5.1): Update status.

(cherry picked from commit 689407ef916503b2f5a3c8c07fe7d5ab1913f956)

Tobias Burnus [Mon, 15 Nov 2021 14:44:11 +0000 (15:44 +0100)]

Fortran: openmp: Add support for thread_limit clause on target

gcc/fortran/ChangeLog:

* openmp.c (OMP_TARGET_CLAUSES): Add thread_limit.
* trans-openmp.c (gfc_split_omp_clauses): Add thread_limit also to
teams.

libgomp/ChangeLog:

* testsuite/libgomp.fortran/thread-limit-1.f90: New test.

(cherry picked from commit 82ec4cb3c43c7429be6b902d96770a6435fa068b)

Jakub Jelinek [Mon, 15 Nov 2021 12:20:53 +0000 (13:20 +0100)]

openmp: Add support for thread_limit clause on target

OpenMP 5.1 says that thread_limit clause can also appear on target,
and similarly to teams should affect the thread-limit-var ICV.
On combined target teams, the clause goes to both.
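
For illustration, a combined construct carrying thread_limit might look like
the sketch below. The function name sum_with_thread_limit and the limit of 4
are made up for this example. The code assumes nothing beyond standard OpenMP
syntax: when built without OpenMP support the pragma is ignored and the loop
runs serially, and when no offload device is available the region falls back
to the host.

```c
/* Sum v[0..n-1].  On the combined construct, thread_limit(4) applies to
   the teams part and, per OpenMP 5.1, to the target part as well.  The
   explicit map of 'sum' keeps the result visible on the host after the
   region.  */
long
sum_with_thread_limit (const int *v, int n)
{
  long sum = 0;

  #pragma omp target teams distribute parallel for thread_limit(4) \
          map(to: v[0:n]) map(tofrom: sum) reduction(+: sum)
  for (int i = 0; i < n; i++)
    sum += v[i];

  return sum;
}
```

The result is the same whether or not the region is offloaded; the clause only
bounds how many threads may take part.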

We actually passed thread_limit internally on target already before,
but only used it for gcn/ptx offloading to hint how many threads should be
created and for ptx didn't set thread_limit_var in that case.
Similarly for host fallback.
Also, I found that we weren't copying the args array that contains encoded
thread_limit and num_teams clause for target (etc.) for async target.

2021-11-15 Jakub Jelinek <jakub@redhat.com>

gcc/
* gimplify.c (optimize_target_teams): Only add OMP_CLAUSE_THREAD_LIMIT
to OMP_TARGET_CLAUSES if it isn't there already.
gcc/c-family/
* c-omp.c (c_omp_split_clauses) <case OMP_CLAUSE_THREAD_LIMIT>:
Duplicate to both OMP_TARGET and OMP_TEAMS.
gcc/c/
* c-parser.c (OMP_TARGET_CLAUSE_MASK): Add
PRAGMA_OMP_CLAUSE_THREAD_LIMIT.
gcc/cp/
* parser.c (OMP_TARGET_CLAUSE_MASK): Add
PRAGMA_OMP_CLAUSE_THREAD_LIMIT.
libgomp/
* task.c (gomp_create_target_task): Copy args array as well.
* target.c (gomp_target_fallback): Add args argument.
Set gomp_icv (true)->thread_limit_var if thread_limit is present.
(GOMP_target): Adjust gomp_target_fallback caller.
(GOMP_target_ext): Likewise.
(gomp_target_task_fn): Likewise.
* config/nvptx/team.c (gomp_nvptx_main): Set
gomp_global_icv.thread_limit_var.
* testsuite/libgomp.c-c++-common/thread-limit-1.c: New test.

(cherry picked from commit aea72386831c0c5672f55983034cc709b968daea)

Tobias Burnus [Fri, 12 Nov 2021 16:58:21 +0000 (17:58 +0100)]

Fortran/openmp: Fix '!$omp end'

gcc/fortran/ChangeLog:

* parse.c (decode_omp_directive): Fix permitting 'nowait' for some
combined directives, add missing 'omp end ... loop'.
(gfc_ascii_statement): Fix ST_OMP_END_TEAMS_LOOP result.
* openmp.c (resolve_omp_clauses): Add missing combined loop constructs
case values to the 'if(directive-name: ...)' check.
* trans-openmp.c (gfc_split_omp_clauses): Put nowait on target if
first leaf construct accepting it.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/unexpected-end.f90: Update dg-error.
* gfortran.dg/gomp/clauses-1.f90: New test.
* gfortran.dg/gomp/nowait-2.f90: New test.
* gfortran.dg/gomp/nowait-3.f90: New test.

(cherry picked from commit 48c6cac9caea1dc7c5f50ad3a736f6693e74a11b)

Tobias Burnus [Thu, 11 Nov 2021 16:27:00 +0000 (17:27 +0100)]

Fortran/openmp: Add support for 2 argument num_teams clause

Fortran part to commit r12-5146-g48d7327f2aaf65

gcc/fortran/ChangeLog:

* gfortran.h (struct gfc_omp_clauses): Rename num_teams to
num_teams_upper, add num_teams_lower.
* dump-parse-tree.c (show_omp_clauses): Update to handle
lower-bound num_teams clause.
* frontend-passes.c (gfc_code_walker): Likewise.
* openmp.c (gfc_free_omp_clauses, gfc_match_omp_clauses,
resolve_omp_clauses): Likewise.
* trans-openmp.c (gfc_trans_omp_clauses, gfc_split_omp_clauses,
gfc_trans_omp_target): Likewise.

libgomp/ChangeLog:

* testsuite/libgomp.fortran/teams-1.f90: New test.

(cherry picked from commit 407eaad25f45ccba6e45e6f07d6c69c51cc567f3)

Jakub Jelinek [Thu, 11 Nov 2021 08:42:47 +0000 (09:42 +0100)]

openmp: Add support for 2 argument num_teams clause

In OpenMP 5.1, the num_teams clause can accept either one expression as
before, but in that case its meaning changed: rather than creating
<= expression teams, it now creates == expression teams.  Or it accepts two
expressions separated by :, meaning that the first is the lower bound and the
second the upper bound on how many teams should be created.  The other ways to
set the number of teams are upper bounds with a lower bound of 1.
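
The two clause forms described above can be summarized as a mapping from the
clause arguments to a (lower, upper) pair. This is only a sketch of the
semantics as stated here; the struct and function names are hypothetical and
do not correspond to anything in the parser.

```c
/* Hypothetical representation of a parsed num_teams clause.  */
struct num_teams_clause
{
  int n_args;    /* 1 for num_teams(e), 2 for num_teams(lo : hi) */
  long first;    /* e, or lo */
  long second;   /* hi; unused when n_args == 1 */
};

struct team_bounds
{
  long lower;
  long upper;
};

/* One expression requests exactly that many teams, so both bounds are
   equal.  Two expressions give the lower and upper bound directly.  */
struct team_bounds
num_teams_bounds (struct num_teams_clause c)
{
  struct team_bounds b;

  if (c.n_args == 2)
    {
      b.lower = c.first;
      b.upper = c.second;
    }
  else
    b.lower = b.upper = c.first;

  return b;
}
```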

The following patch does parsing of this for C/C++.  For host teams, we
actually don't need to do anything further right now, we always create
(pretend to create) exactly the requested number of teams, so we can just
evaluate and throw away the lower bound for now.
For teams nested in target, we don't guarantee that though and further
work will be needed.
In particular, omplower now turns the teams part of:
struct S { S (); S (const S &); ~S (); int s; };
void bar (S &, S &);
int baz ();
_Pragma ("omp declare target to (baz)");

void
foo (void)
{
  S a, b;
  #pragma omp target private (a) map (b)
  {
    #pragma omp teams firstprivate (b) num_teams (baz ())
    {
      bar (a, b);
    }
  }
}
into:
  retval.0 = baz ();
  retval.1 = retval.0;
  {
    unsigned int retval.3;
    struct S * D.2549;
    struct S b;

    retval.3 = (unsigned int) retval.1;
    D.2549 = .omp_data_i->b;
    S::S (&b, D.2549);
    #pragma omp teams num_teams(retval.1) firstprivate(b) shared(a)
    __builtin_GOMP_teams (retval.3, 0);
    {
      bar (&a, &b);
    }
    S::~S (&b);
    #pragma omp return(nowait)
  }
IMHO we want a new API, say GOMP_teams3 which will take 3 arguments
instead of 2 (the lower and upper bounds from num_teams and thread_limit)
and will return a bool whether it should do the teams body or not.
And, we should add right before outermost {} above
while (__builtin_GOMP_teams3 ((unsigned) retval.1, (unsigned) retval.1, 0))
and remove the __builtin_GOMP_teams call.  The current function performs
exit equivalent (at least on NVPTX) which seems bad because that means
the destructors of e.g. private variables on target aren't invoked, and
at the current placement neither destructors of the already constructed
privatized variables in teams.
I'll do this next on the compiler side, but I'm afraid I'll need help
with the nvptx and amdgcn implementations.  E.g. for nvptx, we won't be
able to use %ctaid.x .  I think ideal would be to use a .shared
integer variable for the omp_get_team_num value, but I don't have any
experience with that, are .shared variables zero initialized by default,
or do they have random value at start?  PTX docs say they aren't initializable.
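
The control flow of the GOMP_teams3 loop proposed above can be sketched in
plain C. Everything here is hypothetical: sketch_teams3 is a stand-in for the
proposed entry point, struct teams_state is not a libgomp structure, and the
model runs the team bodies one after another on a single thread, which says
nothing about how nvptx or amdgcn would implement it. It only shows why a
bool-returning call in a while loop lets the code after the loop (the
destructors, in the C++ case) always run.

```c
#include <stdbool.h>

/* Hypothetical bookkeeping for one teams region.  */
struct teams_state
{
  unsigned next_team;
  unsigned num_teams;
  bool started;
};

/* Stand-in for the proposed API: returns true while another team body
   should be executed, false once all teams are done.  */
bool
sketch_teams3 (struct teams_state *st, unsigned lower, unsigned upper,
               unsigned thread_limit)
{
  (void) thread_limit;   /* not modelled */

  if (!st->started)
    {
      st->started = true;
      st->num_teams = upper > lower ? upper : lower;
      if (st->num_teams == 0)
        st->num_teams = 1;
      st->next_team = 0;
    }
  else
    st->next_team++;

  return st->next_team < st->num_teams;
}

/* Run the teams body once per team, then the cleanup.  Returns the
   number of bodies executed and counts cleanups in *cleanups.  */
unsigned
run_teams (unsigned lower, unsigned upper, unsigned *cleanups)
{
  struct teams_state st = { 0, 0, false };
  unsigned bodies = 0;

  while (sketch_teams3 (&st, lower, upper, 0))
    bodies++;            /* the teams body would go here */

  (*cleanups)++;         /* reached after the loop, unlike with an exit */
  return bodies;
}
```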

2021-11-11  Jakub Jelinek  <jakub@redhat.com>

gcc/
* tree.h (OMP_CLAUSE_NUM_TEAMS_EXPR): Rename to ...
(OMP_CLAUSE_NUM_TEAMS_UPPER_EXPR): ... this.
(OMP_CLAUSE_NUM_TEAMS_LOWER_EXPR): Define.
* tree.c (omp_clause_num_ops): Increase num ops for
OMP_CLAUSE_NUM_TEAMS to 2.
* tree-pretty-print.c (dump_omp_clause): Print optional lower bound
for OMP_CLAUSE_NUM_TEAMS.
* gimplify.c (gimplify_scan_omp_clauses): Gimplify
OMP_CLAUSE_NUM_TEAMS_LOWER_EXPR if non-NULL.
(optimize_target_teams): Use OMP_CLAUSE_NUM_TEAMS_UPPER_EXPR instead
of OMP_CLAUSE_NUM_TEAMS_EXPR.  Handle OMP_CLAUSE_NUM_TEAMS_LOWER_EXPR.
* omp-low.c (lower_omp_teams): Use OMP_CLAUSE_NUM_TEAMS_UPPER_EXPR
instead of OMP_CLAUSE_NUM_TEAMS_EXPR.
* omp-expand.c (expand_teams_call, get_target_arguments): Likewise.
gcc/c/
* c-parser.c (c_parser_omp_clause_num_teams): Parse optional
lower-bound and store it into OMP_CLAUSE_NUM_TEAMS_LOWER_EXPR.
Use OMP_CLAUSE_NUM_TEAMS_UPPER_EXPR instead of
OMP_CLAUSE_NUM_TEAMS_EXPR.
(c_parser_omp_target): For OMP_CLAUSE_NUM_TEAMS evaluate before
combined target teams even lower-bound expression.
gcc/cp/
* parser.c (cp_parser_omp_clause_num_teams): Parse optional
lower-bound and store it into OMP_CLAUSE_NUM_TEAMS_LOWER_EXPR.
Use OMP_CLAUSE_NUM_TEAMS_UPPER_EXPR instead of
OMP_CLAUSE_NUM_TEAMS_EXPR.
(cp_parser_omp_target): For OMP_CLAUSE_NUM_TEAMS evaluate before
combined target teams even lower-bound expression.
* semantics.c (finish_omp_clauses): Handle
OMP_CLAUSE_NUM_TEAMS_LOWER_EXPR of OMP_CLAUSE_NUM_TEAMS clause.
* pt.c (tsubst_omp_clauses): Likewise.
(tsubst_expr): For OMP_CLAUSE_NUM_TEAMS evaluate before
combined target teams even lower-bound expression.
gcc/fortran/
* trans-openmp.c (gfc_trans_omp_clauses): Use
OMP_CLAUSE_NUM_TEAMS_UPPER_EXPR instead of OMP_CLAUSE_NUM_TEAMS_EXPR.
gcc/testsuite/
* c-c++-common/gomp/clauses-1.c (bar): Supply lower-bound expression
to half of the num_teams clauses.
* c-c++-common/gomp/num-teams-1.c: New test.
* c-c++-common/gomp/num-teams-2.c: New test.
* g++.dg/gomp/attrs-1.C (bar): Supply lower-bound expression
to half of the num_teams clauses.
* g++.dg/gomp/attrs-2.C (bar): Likewise.
* g++.dg/gomp/num-teams-1.C: New test.
* g++.dg/gomp/num-teams-2.C: New test.
libgomp/
* testsuite/libgomp.c-c++-common/teams-1.c: New test.

(cherry picked from commit 48d7327f2aaf65e224f5f0793a65b950297f6c7f)

Tobias Burnus [Sat, 30 Oct 2021 21:45:32 +0000 (23:45 +0200)]

OpenMP: Add strictly nested API call check [PR102972]

The teams construct only permits omp_get_num_teams and omp_get_team_num
as API calls in strictly nested regions; check for it.
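
A minimal sketch of such a name check follows. It is not the code in
omp_runtime_api_call: the function names are hypothetical, and treating every
name that starts with "omp_" as a runtime API call is a simplifying
assumption. User routines are outside the scope of this check and are passed
through.

```c
#include <stdbool.h>
#include <string.h>

/* Simplifying assumption: a name starting with "omp_" is an OpenMP
   runtime API routine.  */
static bool
is_omp_api_name (const char *name)
{
  return strncmp (name, "omp_", 4) == 0;
}

/* Return true if a call to NAME would pass the check when it is
   strictly nested in a teams region.  */
bool
call_allowed_strictly_in_teams (const char *name)
{
  if (!is_omp_api_name (name))
    return true;   /* not an API call, nothing to diagnose here */

  return strcmp (name, "omp_get_num_teams") == 0
         || strcmp (name, "omp_get_team_num") == 0;
}
```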

Additionally, for Fortran, using DECL_NAME does not show the mangled
name, hence DECL_ASSEMBLER_NAME had to be used instead.

Finally, 'target device(ancestor:1)' wrongly rejected non-API calls
as well.

PR middle-end/102972
gcc/ChangeLog:

* omp-low.c (omp_runtime_api_call): Use DECL_ASSEMBLER_NAME to get
internal Fortran name; new permit_num_teams arg to permit
omp_get_num_teams and omp_get_team_num.
(scan_omp_1_stmt): Update call to it, add missing call for
reverse offload, and check for strictly nested API calls in teams.

gcc/testsuite/ChangeLog:

* c-c++-common/gomp/target-device-ancestor-3.c: Add non-API
routine test.
* gfortran.dg/gomp/order-6.f90: Add missing bind(C).
* c-c++-common/gomp/teams-3.c: New test.
* gfortran.dg/gomp/teams-3.f90: New test.
* gfortran.dg/gomp/teams-4.f90: New test.

libgomp/ChangeLog:
* testsuite/libgomp.c-c++-common/icv-3.c: Nest API calls inside
parallel construct.
* testsuite/libgomp.c-c++-common/icv-4.c: Likewise.
* testsuite/libgomp.c/target-3.c: Likewise.
* testsuite/libgomp.c/target-5.c: Likewise.
* testsuite/libgomp.c/target-6.c: Likewise.
* testsuite/libgomp.c/target-teams-1.c: Likewise.
* testsuite/libgomp.c/teams-1.c: Likewise.
* testsuite/libgomp.c/thread-limit-2.c: Likewise.
* testsuite/libgomp.c/thread-limit-3.c: Likewise.
* testsuite/libgomp.c/thread-limit-4.c: Likewise.
* testsuite/libgomp.c/thread-limit-5.c: Likewise.
* testsuite/libgomp.fortran/icv-3.f90: Likewise.
* testsuite/libgomp.fortran/icv-4.f90: Likewise.
* testsuite/libgomp.fortran/teams1.f90: Likewise.

(cherry picked from commit 948d461954f2642ca187f86c19d297ba7a86320f)

commit | commitdiff | tree

Chung-Lin Tang [Thu, 24 Feb 2022 09:07:48 +0000 (01:07 -0800)]

openmp: Handle C/C++ array reference base-pointers in array sections

In cases where a program constructs its own deep-copying for arrays-of-pointers,
e.g.:
   #pragma omp target enter data map(to:level->vectors[:N])
   for (i = 0; i < N; i++)
     #pragma omp target enter data map(to:level->vectors[i][:N])

We need to treat the part of the array reference before the array section
as a base-pointer (here 'level->vectors[i]'), providing pointer-attachment
behavior.

This patch adds this inside handle_omp_array_sections(), tracing the whole
sequence of array dimensions, creating a whole base-pointer reference
iteratively using build_array_ref(). The conditions are that each of the
"absorbed" dimensions must be length==1, and the final reference must be
of pointer-type (so that pointer attachment makes sense).
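
The manual deep-copy pattern described above can be sketched as a small
self-contained C file. This is only an illustration, not code from the patch:
the type level_t and the functions map_level and unmap_level are hypothetical
names, and the rows are assumed to hold n doubles each. Without offloading
support the pragmas are ignored or fall back to the host.

```c
#include <stddef.h>

/* Hypothetical stand-in for the 'level' object in the example above.  */
typedef struct {
  double **vectors;   /* n pointers, each to a row of n doubles */
} level_t;

/* Map the array of pointers first, then each row it points to.  */
void map_level(level_t *level, int n)
{
  #pragma omp target enter data map(to: level->vectors[:n])
  for (int i = 0; i < n; i++)
    {
      /* 'level->vectors[i]' is the base pointer of this array section,
         so the device copy of the pointer can be attached to the row.  */
      #pragma omp target enter data map(to: level->vectors[i][:n])
    }
}

/* Release in the opposite order: rows first, then the pointer array.  */
void unmap_level(level_t *level, int n)
{
  for (int i = 0; i < n; i++)
    {
      #pragma omp target exit data map(release: level->vectors[i][:n])
    }
  #pragma omp target exit data map(release: level->vectors[:n])
}
```

The standalone directives are placed inside braces because they are not
statements on their own and cannot serve as the body of a loop.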

Merged from:
https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590658.html

2022-02-24  Chung-Lin Tang  <cltang@codesourcery.com>

gcc/c/ChangeLog:

* c-typeck.cc (handle_omp_array_sections): Add handling for
creating array-reference base-pointer attachment clause.

gcc/cp/ChangeLog:

* semantics.cc (handle_omp_array_sections): Add handling for
creating array-reference base-pointer attachment clause.

gcc/ChangeLog:

* gimplify.cc (gimplify_scan_omp_clauses): Add case for
attach/detach map kind for ARRAY_REF of POINTER_TYPE.

gcc/testsuite/ChangeLog:

* c-c++-common/gomp/target-enter-data-1.c: Adjust testcase.

libgomp/testsuite/ChangeLog:

* libgomp.c-c++-common/ptr-attach-2.c: New test.

commit | commitdiff | tree

Kwok Cheung Yeung [Fri, 18 Feb 2022 19:00:57 +0000 (19:00 +0000)]

openmp: Improve handling of nested OpenMP metadirectives in C and C++

This patch fixes a misparsing issue when encountering code like:

  #pragma omp metadirective when (<selector_set>={...}: A)
    #pragma omp metadirective when (<selector_set>={...}: B)

When called for the first metadirective, analyze_metadirective_body would
stop just before the colon in the second metadirective because it naively
assumes that the '}' marks the end of a code block.

The assertion for clauses to end parsing at the same point is now disabled
if a parse error has occurred during the parsing of the clause, since some
tokens may not be consumed if a parse error cuts parsing short.
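
The idea of tracking bracket depth separately from brace depth can be
sketched as follows. This is a simplified character scanner written for
illustration only; it is not the token-based code in
analyze_metadirective_body, and the function name body_end and its
track_parens flag are hypothetical.

```c
#include <stddef.h>

/* Return the offset just past the close brace that ends the first block
   in SRC.  With TRACK_PARENS zero, any close brace that brings the brace
   depth back to zero ends the scan (the naive behaviour).  With
   TRACK_PARENS nonzero, the parenthesis depth must be zero as well.  */
size_t body_end(const char *src, int track_parens)
{
  int brace_depth = 0;
  int paren_depth = 0;
  size_t i;

  for (i = 0; src[i] != '\0'; i++)
    {
      switch (src[i])
        {
        case '(':
          paren_depth++;
          break;
        case ')':
          if (paren_depth > 0)
            paren_depth--;
          break;
        case '{':
          brace_depth++;
          break;
        case '}':
          if (brace_depth > 0)
            brace_depth--;
          if (brace_depth == 0 && (!track_parens || paren_depth == 0))
            return i + 1;
          break;
        default:
          break;
        }
    }
  return i;   /* no terminating brace found: consume everything */
}
```

For the input "when (device={kind(cpu)}: A) { body; }" the naive scan stops
right after the selector's close brace, just before the colon, while the
scan that also checks the parenthesis depth continues to the end of the
block.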

2022-02-18  Kwok Cheung Yeung  <kcy@codesourcery.com>

gcc/c/
* c-parser.c (c_parser_omp_construct): Move handling of
PRAGMA_OMP_METADIRECTIVE from here...
(c_parser_pragma): ...to here.
(analyze_metadirective_body): Check that the bracket nesting level
is also zero before stopping the adding of tokens on encountering a
close brace.
(c_parser_omp_metadirective): Modify function signature and update.
Do not assert on remaining tokens if there has been a parse error.

gcc/cp/
* parser.c (cp_parser_omp_construct): Move handling of
PRAGMA_OMP_METADIRECTIVE from here...
(cp_parser_pragma): ...to here.
(analyze_metadirective_body): Check that the bracket
nesting level is also zero before stopping the adding of tokens on
encountering a close brace.
(cp_parser_omp_metadirective): Modify function signature and update.
Do not assert on remaining tokens if there has been a parse error.

gcc/testsuite/
* c-c++-common/gomp/metadirective-1.c (f): Add test for
improperly nested metadirectives.

commit | commitdiff | tree

Kwok Cheung Yeung [Fri, 11 Feb 2022 15:42:50 +0000 (15:42 +0000)]

openmp: More Fortran front-end fixes for metadirectives

This adds a check for declarative OpenMP directives in metadirective
variants (already present in the C/C++ front-ends), and fixes an
ICE when an empty metadirective (i.e. just '!$omp metadirective')
is presented.

2022-02-11 Kwok Cheung Yeung <kcy@codesourcery.com>

gcc/fortran/
* gfortran.h (is_omp_declarative_stmt): New.
* openmp.c (match_omp_metadirective): Reject declarative OpenMP
directives with 'sorry'.
* parse.c (parse_omp_metadirective_body): Check that state stack head
is non-null before dereferencing.
(is_omp_declarative_stmt): New.

gcc/testsuite/
* gfortran.dg/gomp/metadirective-2.f90 (main): Test empty
metadirective.

commit | commitdiff | tree

Kwok Cheung Yeung [Fri, 11 Feb 2022 11:20:18 +0000 (11:20 +0000)]

openmp: Eliminate non-matching metadirective variants early in Fortran front-end

This patch checks during parsing if a metadirective selector is both
resolvable and non-matching - if so, it is removed from further
consideration.  This is both more efficient, and avoids spurious
syntax errors caused by considering combinations of selectors that
lead to invalid combinations of OpenMP directives, when that
combination would never arise in the first place.

This exposes another bug - when metadirectives that are not of the
begin-end variety are nested, we might have to drill up through
multiple layers of the state stack to reach the state for the
next statement.  This is now fixed.

2022-02-11  Kwok Cheung Yeung  <kcy@codesourcery.com>

gcc/
* omp-general.c (DELAY_METADIRECTIVES_AFTER_LTO): Check that cfun is
non-null before dereferencing.

gcc/fortran/
* decl.c (gfc_match_end): Search for first previous state that is not
COMP_OMP_METADIRECTIVE.
* gfortran.h (gfc_skip_omp_metadirective_clause): Add prototype.
* openmp.c (match_omp_metadirective): Skip clause if
result of gfc_skip_omp_metadirective_clause is true.
* trans-openmp.c (gfc_trans_omp_set_selector): Add argument and
disable expression conversion if false.
(gfc_skip_omp_metadirective_clause): New.

gcc/testsuite/
* gfortran.dg/gomp/metadirective-8.f90: New.

commit | commitdiff | tree

Andrew Stubbs [Sat, 12 Feb 2022 23:44:48 +0000 (23:44 +0000)]

amdgcn: Allow vector reductions on constants

Obviously it would be better if these reductions could be evaluated at compile
time, but this will avoid an ICE.

gcc/ChangeLog:

* config/gcn/gcn.c (gcn_expand_reduc_scalar): Use force_reg.

(cherry picked from commit d51cad0b840a14c66732cb6a166c11ddf55d18b2)

commit | commitdiff | tree

Kwok Cheung Yeung [Mon, 31 Jan 2022 13:44:21 +0000 (05:44 -0800)]

openmp: Fix error message in Fortran front-end

An extra comma in an error message causes failures in the Fortran tests for
declare variant, because the message differs from that expected.

2022-01-31 Kwok Cheung Yeung <kcy@codesourcery.com>

gcc/fortran/
* openmp.c (gfc_match_omp_context_selector_specification): Remove
extra comma in error message.

commit | commitdiff | tree

Kwok Cheung Yeung [Fri, 28 Jan 2022 16:57:05 +0000 (16:57 +0000)]

Add missing ChangeLog.omp entries for previous patch

gcc/
* ChangeLog.omp: Update.

gcc/testsuite
* ChangeLog.omp: Update.

libgomp/
* ChangeLog.omp: Update.

commit | commitdiff | tree

Kwok Cheung Yeung [Fri, 28 Jan 2022 13:56:33 +0000 (13:56 +0000)]

openmp: Add warning when functions containing metadirectives with 'construct={target}' called directly

void f(void)
{
  #pragma omp metadirective \
    when (construct={target}: A) \
    default (B)
    ...
}
...
{
  #pragma omp target
    f(); // Target call

  f(); // Local call
}

With the OpenMP 5.0/5.1 specifications, we would expect A to be selected in
the metadirective when the target call is made, but B when f is called
directly outside of a target context.  However, since GCC does not have
separate copies of f for local and target calls, and the construct selector
is static, it must be resolved one way or the other at compile-time (currently
in favour of selecting A), which may be unexpected behaviour.

This patch attempts to detect the above situation, and will emit a warning
if found.

2022-01-28  Kwok Cheung Yeung  <kcy@codesourcery.com>

gcc/
* gimplify.c (gimplify_omp_metadirective): Mark offloadable functions
containing metadirectives with 'construct={target}' in the selector.
* omp-general.c (omp_has_target_constructor_p): New.
* omp-general.h (omp_has_target_constructor_p): New prototype.
* omp-low.c (lower_omp_1): Emit warning if marked functions called
outside of a target context.

gcc/testsuite/
* c-c++-common/gomp/metadirective-4.c (main): Add expected warning.
* gfortran.dg/gomp/metadirective-4.f90 (test): Likewise.

libgomp/
* testsuite/libgomp.c-c++-common/metadirective-2.c (main): Add
expected warning.
* testsuite/libgomp.fortran/metadirective-2.f90 (test): Likewise.

commit | commitdiff | tree

Chung-Lin Tang [Thu, 27 Jan 2022 10:33:00 +0000 (18:33 +0800)]

Fix omp-low ICE for indirect references based off component access [PR103642]

This issue was triggered after the patch extending syntax for component access
in map clauses in commit 0ab29cf0bb68960c1f87405f14b4fb2109254e2f.

In gimplify_scan_omp_clauses, the case for handling indirect accesses (which
creates firstprivate ptr and zero-length array section map for such decls) was
erroneously entered for non-pointer cases (here being the base struct decl),
so the appropriate checks were added there.

The new testcase is a compile-only test for the ICE. The original omptests
t-partial-struct test actually should not execute correctly, because for
map(t.s->a[:N]), map(t.s[:1]) is not implicitly mapped, thus the entire
offloaded access does not work as is (fixing that omptests test is out of
scope here).

2022-01-27 Chung-Lin Tang <cltang@codesourcery.com>

PR middle-end/103642

gcc/ChangeLog:

* gimplify.cc (gimplify_scan_omp_clauses): Do not do indir_p handling
for non-pointer or non-reference-to-pointer cases.

gcc/testsuite/ChangeLog:

* c-c++-common/gomp/pr103642.c: New test.

(cherry picked from commit 1c91b014923f418e0aab789c5cf57facf04bf266)

commit | commitdiff | tree

Chung-Lin Tang [Fri, 14 Jan 2022 13:58:34 +0000 (21:58 +0800)]

openmp: Fix ICE in [PR103705]

Fix ICE for cases like:
#pragma omp target update from(s[0].a[0:1])

where multiple ARRAY_REF nodes exist and require more than one peeling
during [c_]finish_omp_clauses.

PR c++/103705

gcc/c/ChangeLog:

* c-typeck.c (c_finish_omp_clauses): Also continue peeling off of
outer node for ARRAY_REFs.

gcc/cp/ChangeLog:

* semantics.c (finish_omp_clauses): Also continue peeling off of
outer node for ARRAY_REFs.

gcc/testsuite/ChangeLog:

* c-c++-common/gomp/pr103705.c: New test.

(cherry picked from commit cd7484d05cd4b7a9d741fe8bf6c4525406ed7620)

commit | commitdiff | tree

Kwok Cheung Yeung [Tue, 25 Jan 2022 19:50:08 +0000 (11:50 -0800)]

openmp: Add support for 'target_device' context selector set

2022-01-25 Kwok Cheung Yeung <kcy@codesourcery.com>

gcc/
* builtin-types.def (BT_FN_BOOL_INT_CONST_PTR_CONST_PTR_CONST_PTR): New
type.
* omp-builtins.def (BUILT_IN_GOMP_EVALUATE_TARGET_DEVICE): New builtin.
* omp-general.c (omp_context_selector_matches): Handle 'target_device'
selector set.
(omp_dynamic_cond): Generate expression tree for 'target_device'
selector set.
(omp_context_compute_score): Handle selectors in 'target_device' set.

gcc/c/
* c-parser.c (omp_target_device_selectors): New.
(c_parser_omp_context_selector): Accept 'target_device' selector set.
Treat 'device_num' selector as expression.
(c_parser_omp_context_selector_specification): Handle 'target_device'
selector set.

gcc/cp/
* parser.c (omp_target_device_selectors): New.
(cp_parser_omp_context_selector): Accept 'target_device' selector set.
Treat 'device_num' selector as expression.
(cp_parser_omp_context_selector_specification): Handle 'target_device'
selector set.

gcc/fortran/
* openmp.c (omp_target_device_selectors): New.
(gfc_match_omp_context_selector): Accept 'target_device' selector set.
Treat 'device_num' selector as expression.
(gfc_match_omp_context_selector_specification): Handle 'target_device'
selector set.
* types.def (BT_FN_BOOL_INT_CONST_PTR_CONST_PTR_CONST_PTR): New type.

gcc/testsuite/
* c-c++-common/gomp/metadirective-7.c: New.
* gfortran.dg/gomp/metadirective-7.f90: New.

libgomp/
* Makefile.am (libgomp_la_SOURCES): Add selector.c.
* Makefile.in: Regenerate.
* config/gcn/selector.c: New.
* config/linux/selector.c: New.
* config/linux/x86/selector.c: New.
* config/nvptx/selector.c: New.
* libgomp-plugin.h (GOMP_OFFLOAD_evaluate_device): New.
* libgomp.h (struct gomp_device_descr): Add evaluate_device_func field.
* libgomp.map (GOMP_5.1): Add GOMP_evaluate_target_device.
* libgomp_g.h (GOMP_evaluate_current_device): New.
(GOMP_evaluate_target_device): New.
* oacc-host.c (host_evaluate_device): New.
(host_openacc_exec): Initialize evaluate_device_func field to
host_evaluate_device.
* plugin/plugin-gcn.c (GOMP_OFFLOAD_evaluate_device): New.
* plugin/plugin-nvptx.c (struct ptx_device): Add compute_major and
compute_minor fields.
(nvptx_open_device): Read compute capability information from device.
(CHECK_ISA): New macro.
(GOMP_OFFLOAD_evaluate_device): New.
* selector.c: New.
* target.c (GOMP_evaluate_target_device): New.
(gomp_load_plugin_for_device): Load evaluate_device plugin function.
* testsuite/libgomp.c-c++-common/metadirective-5.c: New testcase.
* testsuite/libgomp.fortran/metadirective-5.f90: New testcase.

commit | commitdiff | tree

Kwok Cheung Yeung [Tue, 25 Jan 2022 19:40:58 +0000 (11:40 -0800)]

openmp: Metadirective fixes

Fix regressions introduced by block/statement skipping.

If user condition selector is constant, do not return it as a dynamic
selector.

2022-01-25 Kwok Cheung Yeung <kcy@codesourcery.com>

gcc/c/
* c-parser.c (c_parser_skip_to_end_of_block_or_statement): Track
bracket depth separately from nesting depth.

gcc/cp/
* parser.c (cp_parser_skip_to_end_of_statement): Revert.
(cp_parser_skip_to_end_of_block_or_statement): Track bracket depth
separately from nesting depth.

gcc/
* omp-general.c (omp_dynamic_cond): Do not return user condition if
constant.

commit | commitdiff | tree

Kwok Cheung Yeung [Tue, 25 Jan 2022 19:32:08 +0000 (11:32 -0800)]

openmp: Add testcases for metadirectives

2022-01-25 Kwok Cheung Yeung <kcy@codesourcery.com>

gcc/testsuite/
* c-c++-common/gomp/metadirective-1.c: New.
* c-c++-common/gomp/metadirective-2.c: New.
* c-c++-common/gomp/metadirective-3.c: New.
* c-c++-common/gomp/metadirective-4.c: New.
* c-c++-common/gomp/metadirective-5.c: New.
* c-c++-common/gomp/metadirective-6.c: New.
* gcc.dg/gomp/metadirective-1.c: New.
* gfortran.dg/gomp/metadirective-1.f90: New.
* gfortran.dg/gomp/metadirective-2.f90: New.
* gfortran.dg/gomp/metadirective-3.f90: New.
* gfortran.dg/gomp/metadirective-4.f90: New.
* gfortran.dg/gomp/metadirective-5.f90: New.
* gfortran.dg/gomp/metadirective-6.f90: New.

libgomp/
* testsuite/libgomp.c-c++-common/metadirective-1.c: New.
* testsuite/libgomp.c-c++-common/metadirective-2.c: New.
* testsuite/libgomp.c-c++-common/metadirective-3.c: New.
* testsuite/libgomp.c-c++-common/metadirective-4.c: New.
* testsuite/libgomp.fortran/metadirective-1.f90: New.
* testsuite/libgomp.fortran/metadirective-2.f90: New.
* testsuite/libgomp.fortran/metadirective-3.f90: New.
* testsuite/libgomp.fortran/metadirective-4.f90: New.

commit | commitdiff | tree

Kwok Cheung Yeung [Tue, 25 Jan 2022 19:24:55 +0000 (11:24 -0800)]

openmp, fortran: Add Fortran support for parsing metadirectives

This adds support for parsing OpenMP metadirectives in the Fortran front end.

2022-01-25  Kwok Cheung Yeung  <kcy@codesourcery.com>

gcc/
* omp-general.c (omp_check_context_selector): Revert string length
check.
(omp_context_name_list_prop): Likewise.

gcc/fortran/
* decl.c (gfc_match_end): Handle COMP_OMP_METADIRECTIVE and
COMP_OMP_BEGIN_METADIRECTIVE.
* dump-parse-tree.c (show_omp_node): Handle EXEC_OMP_METADIRECTIVE.
(show_code_node): Handle EXEC_OMP_METADIRECTIVE.
* gfortran.h (enum gfc_statement): Add ST_OMP_METADIRECTIVE,
ST_OMP_BEGIN_METADIRECTIVE and ST_OMP_END_METADIRECTIVE.
(struct gfc_omp_metadirective_clause): New structure.
(gfc_get_omp_metadirective_clause): New macro.
(struct gfc_st_label): Add omp_region field.
(enum gfc_exec_op): Add EXEC_OMP_METADIRECTIVE.
(struct gfc_code): Add omp_metadirective_clauses field.
(gfc_free_omp_metadirective_clauses): New prototype.
(match_omp_directive): New prototype.
* io.c (format_asterisk): Initialize omp_region field.
* match.h (gfc_match_omp_begin_metadirective): New prototype.
(gfc_match_omp_metadirective): New prototype.
* openmp.c (gfc_match_omp_eos): Match ')' in context selectors.
(gfc_free_omp_metadirective_clauses): New.
(gfc_match_omp_clauses): Remove context_selector argument.  Rely on
gfc_match_omp_eos to match end of clauses.
(match_omp): Remove extra argument to gfc_match_omp_clauses.
(gfc_match_omp_context_selector): Remove extra argument to
gfc_match_omp_clauses.  Set gfc_matching_omp_context_selector
before call to gfc_match_omp_clauses and reset after.
(gfc_match_omp_context_selector_specification): Modify to take a
gfc_omp_set_selector** argument.
(gfc_match_omp_declare_variant): Pass set_selectors to
gfc_match_omp_context_selector_specification.
(match_omp_metadirective): New.
(gfc_match_omp_begin_metadirective): New.
(gfc_match_omp_metadirective): New.
(resolve_omp_metadirective): New.
(gfc_resolve_omp_directive): Handle EXEC_OMP_METADIRECTIVE.
* parse.c (gfc_matching_omp_context_selector): New variable.
(gfc_in_metadirective_body): New variable.
(gfc_omp_region_count): New variable.
(decode_omp_directive): Match 'begin metadirective',
'end metadirective' and 'metadirective'.
(match_omp_directive): New.
(case_omp_structured_block): New.
(case_omp_do): New.
(gfc_ascii_statement): Handle metadirective statements.
(gfc_omp_end_stmt): New.
(parse_omp_do): Delegate to gfc_omp_end_stmt.
(parse_omp_structured_block): Delegate to gfc_omp_end_stmt. Handle
ST_OMP_END_METADIRECTIVE.
(parse_omp_metadirective_body): New.
(parse_executable): Delegate to case_omp_structured_block and
case_omp_do.  Return after one statement if compiling regular
metadirective.  Handle metadirective statements.
(gfc_parse_file): Reset gfc_omp_region_count,
gfc_in_metadirective_body and gfc_matching_omp_context_selector.
* parse.h (enum gfc_compile_state): Add COMP_OMP_METADIRECTIVE and
COMP_OMP_BEGIN_METADIRECTIVE.
(gfc_omp_end_stmt): New prototype.
(gfc_matching_omp_context_selector): New declaration.
(gfc_in_metadirective_body): New declaration.
(gfc_omp_region_count): New declaration.
* resolve.c (gfc_resolve_code): Handle EXEC_OMP_METADIRECTIVE.
* st.c (gfc_free_statement): Handle EXEC_OMP_METADIRECTIVE.
* symbol.c (compare_st_labels): Take omp_region into account.
(gfc_get_st_labels): Incorporate omp_region into label.
* trans-decl.c (gfc_get_label_decl): Add omp_region into translated
label name.
* trans-openmp.c (gfc_trans_omp_directive): Handle
EXEC_OMP_METADIRECTIVE.
(gfc_trans_omp_set_selector): Hoist code from...
(gfc_trans_omp_declare_variant): ...here.
(gfc_trans_omp_metadirective): New.
* trans-stmt.h (gfc_trans_omp_metadirective): New prototype.
* trans.c (trans_code): Handle EXEC_OMP_METADIRECTIVE.

commit | commitdiff | tree

Kwok Cheung Yeung [Tue, 25 Jan 2022 19:01:53 +0000 (11:01 -0800)]

openmp: Add C++ support for parsing metadirectives

This adds support for parsing OpenMP metadirectives in the C++ front end.

2022-01-25 Kwok Cheung Yeung <kcy@codesourcery.com>

gcc/cp/
* parser.c (cp_parser_skip_to_end_of_statement): Handle parentheses.
(cp_parser_skip_to_end_of_block_or_statement): Likewise.
(cp_parser_omp_context_selector): Add extra argument. Allow
non-constant expressions.
(cp_parser_omp_context_selector_specification): Add extra argument and
propagate to cp_parser_omp_context_selector.
(analyze_metadirective_body): New.
(cp_parser_omp_metadirective): New.
(cp_parser_omp_construct): Handle PRAGMA_OMP_METADIRECTIVE.
(cp_parser_pragma): Handle PRAGMA_OMP_METADIRECTIVE.

commit | commitdiff | tree

Kwok Cheung Yeung [Tue, 25 Jan 2022 18:49:44 +0000 (10:49 -0800)]

openmp: Add support for streaming metadirectives and resolving them after LTO

This patch adds support for streaming metadirective Gimple statements during
LTO, and adds a metadirective expansion pass that runs after LTO.  This is
required for metadirectives with selectors that can only be resolved from
within the accel compiler.

2022-01-25  Kwok Cheung Yeung  <kcy@codesourcery.com>

gcc/
* Makefile.in (OBJS): Add omp-expand-metadirective.o.
* gimple-streamer-in.c (input_gimple_stmt): Add case for
GIMPLE_OMP_METADIRECTIVE.  Handle metadirective labels.
* gimple-streamer-out.c (output_gimple_stmt): Likewise.
* omp-expand-metadirective.cc: New.
* passes.def: Add pass_omp_expand_metadirective.
* tree-pass.h (make_pass_omp_expand_metadirective): New prototype.

commit | commitdiff | tree

Kwok Cheung Yeung [Tue, 25 Jan 2022 18:42:50 +0000 (10:42 -0800)]

openmp: Add support for resolving metadirectives during parsing and Gimplification

This adds support for resolving metadirectives according to the OpenMP 5.1
specification.  The variants are sorted by score, then gathered into a list
of dynamic replacement candidates.  The metadirective is then expanded into
a sequence of 'if..else' statements to test the dynamic selector and execute
the variant if the selector is satisfied.

If any of the selectors in the list are unresolvable, GCC will give up on
resolving the metadirective and try again later.
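
The resolution strategy described above (order the variants by score, then
test each dynamic selector in turn) can be sketched in plain C. This is a
toy model under stated assumptions, not the code in omp-general.c: the
struct variant layout, the integer cond field standing in for a run-time
selector, and the names sort_by_score and pick_variant are all hypothetical.

```c
#include <stddef.h>

/* Toy model of one metadirective variant.  */
struct variant {
  int score;   /* higher scores are preferred */
  int cond;    /* run-time value of the dynamic selector; nonzero = satisfied */
  int id;      /* identifies the directive variant to execute */
};

/* Stable insertion sort by descending score, so that variants with equal
   scores keep their original (source) order.  */
static void sort_by_score(struct variant *v, size_t n)
{
  for (size_t i = 1; i < n; i++)
    {
      struct variant key = v[i];
      size_t j = i;
      while (j > 0 && v[j - 1].score < key.score)
        {
          v[j] = v[j - 1];
          j--;
        }
      v[j] = key;
    }
}

/* Equivalent of the generated 'if..else' chain: return the first variant,
   in score order, whose selector holds, or DEFAULT_ID if none does.  */
int pick_variant(struct variant *v, size_t n, int default_id)
{
  sort_by_score(v, n);
  for (size_t i = 0; i < n; i++)
    if (v[i].cond)
      return v[i].id;
  return default_id;
}
```

A stable sort is used here so that ties are broken by the order in which
the variants were written.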

2022-01-25  Kwok Cheung Yeung  <kcy@codesourcery.com>

gcc/
* gimplify.c (expand_omp_metadirective): New.
* omp-general.c: Include tree-pretty-print.h.
(DELAY_METADIRECTIVES_AFTER_LTO): New macro.
(omp_context_selector_matches): Delay resolution of selectors.  Allow
non-constant expressions.
(omp_dynamic_cond): New.
(omp_dynamic_selector_p): New.
(sort_variant): New.
(omp_get_dynamic_candidates): New.
(omp_resolve_metadirective): New.
* omp-general.h (struct omp_metadirective_variant): New.
(omp_resolve_metadirective): New prototype.

gcc/c-family/
* c-omp.c (c_omp_expand_metadirective_r): New.
(c_omp_expand_metadirective): New.

commit | commitdiff | tree

Kwok Cheung Yeung [Tue, 25 Jan 2022 18:36:59 +0000 (10:36 -0800)]

openmp: Add middle-end support for metadirectives

This adds a new Gimple statement type GIMPLE_OMP_METADIRECTIVE, which
represents the metadirective in Gimple. In high Gimple, the statement
contains the body of the directive variants, whereas in low Gimple, it
only contains labels to the bodies.

This patch adds support for converting metadirectives from tree to Gimple
form, and handling of the Gimple form (Gimple lowering, OpenMP lowering
and expansion, inlining, SSA handling etc).

Metadirectives should be resolved before they reach the back-end, otherwise
the compiler will crash as GCC does not know how to convert metadirective
Gimple statements to RTX.

2022-01-25 Kwok Cheung Yeung <kcy@codesourcery.com>

gcc/
* gimple-low.c (lower_omp_metadirective): New.
(lower_stmt): Handle GIMPLE_OMP_METADIRECTIVE.
* gimple-pretty-print.c (dump_gimple_omp_metadirective): New.
(pp_gimple_stmt_1): Handle GIMPLE_OMP_METADIRECTIVE.
* gimple-walk.c (walk_gimple_op): Handle GIMPLE_OMP_METADIRECTIVE.
(walk_gimple_stmt): Likewise.
* gimple.c (gimple_alloc_omp_metadirective): New.
(gimple_build_omp_metadirective): New.
(gimple_build_omp_metadirective_variant): New.
* gimple.def (GIMPLE_OMP_METADIRECTIVE): New.
(GIMPLE_OMP_METADIRECTIVE_VARIANT): New.
* gimple.h (gomp_metadirective_variant): New.
(gomp_metadirective): New.
(is_a_helper <gomp_metadirective *>::test): New.
(is_a_helper <gomp_metadirective_variant *>::test): New.
(is_a_helper <const gomp_metadirective *>::test): New.
(is_a_helper <const gomp_metadirective_variant *>::test): New.
(gimple_alloc_omp_metadirective): New prototype.
(gimple_build_omp_metadirective): New prototype.
(gimple_build_omp_metadirective_variant): New prototype.
(gimple_has_substatements): Add GIMPLE_OMP_METADIRECTIVE case.
(gimple_has_ops): Add GIMPLE_OMP_METADIRECTIVE.
(gimple_omp_metadirective_label): New.
(gimple_omp_metadirective_set_label): New.
(gimple_omp_metadirective_variants): New.
(gimple_omp_metadirective_set_variants): New.
(CASE_GIMPLE_OMP): Add GIMPLE_OMP_METADIRECTIVE.
* gimplify.c (is_gimple_stmt): Add OMP_METADIRECTIVE.
(expand_omp_metadirective): New.
(gimplify_omp_metadirective): New.
(gimplify_expr): Add case for OMP_METADIRECTIVE.
* gsstruct.def (GSS_OMP_METADIRECTIVE): New.
(GSS_OMP_METADIRECTIVE_VARIANT): New.
* omp-expand.c (build_omp_regions_1): Handle GIMPLE_OMP_METADIRECTIVE.
(omp_make_gimple_edges): Likewise.
* omp-low.c (struct omp_context): Add next_clone field.
(new_omp_context): Initialize next_clone field.
(clone_omp_context): New.
(delete_omp_context): Delete clone contexts.
(scan_omp_metadirective): New.
(scan_omp_1_stmt): Handle GIMPLE_OMP_METADIRECTIVE.
(lower_omp_metadirective): New.
(lower_omp_1): Handle GIMPLE_OMP_METADIRECTIVE.
* tree-cfg.c (cleanup_dead_labels): Handle GIMPLE_OMP_METADIRECTIVE.
(gimple_redirect_edge_and_branch): Likewise.
* tree-inline.c (remap_gimple_stmt): Handle GIMPLE_OMP_METADIRECTIVE.
(estimate_num_insns): Likewise.
* tree-pretty-print.c (dump_generic_node): Handle OMP_METADIRECTIVE.
* tree-ssa-operands.c (parse_ssa_operands): Handle
GIMPLE_OMP_METADIRECTIVE.

commit | commitdiff | tree

Kwok Cheung Yeung [Tue, 25 Jan 2022 18:31:19 +0000 (10:31 -0800)]

openmp: Add C support for parsing metadirectives

This patch implements parsing for the OpenMP metadirective introduced in
OpenMP 5.0.  Metadirectives are parsed into an OMP_METADIRECTIVE node,
with the variant clauses forming a chain accessible via
OMP_METADIRECTIVE_CLAUSES.  Each clause contains the context selector
and tree for the variant.

User conditions in the selector are now permitted to be non-constant when
used in metadirectives as specified in OpenMP 5.1.
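
As an illustration of a non-constant user condition, the following sketch
selects 'parallel for' only when the trip count exceeds a threshold that is
known at run time. The function name scale and its parameters are
hypothetical. If the compiler does not process OpenMP directives, the pragma
is ignored and the loop simply runs serially.

```c
/* Double every element of A.  The loop is parallelised only when
   N is larger than THRESHOLD, which is evaluated at run time.  */
void scale(double *a, int n, int threshold)
{
  #pragma omp metadirective \
      when (user={condition(n > threshold)}: parallel for)
  for (int i = 0; i < n; i++)
    a[i] *= 2.0;
}
```

No default clause is given, so when the condition is false no directive is
applied and the loop runs as ordinary serial code.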

2022-01-25  Kwok Cheung Yeung  <kcy@codesourcery.com>

gcc/
* omp-general.c (omp_context_selector_matches): Add extra argument.
(omp_resolve_metadirective): New stub function.
* omp-general.h (struct omp_metadirective_variant): New.
(omp_context_selector_matches): Add extra argument.
(omp_resolve_metadirective): New prototype.
* tree.def (OMP_METADIRECTIVE): New.
* tree.h (OMP_METADIRECTIVE_CLAUSES): New macro.

gcc/c/
* c-parser.c (c_parser_skip_to_end_of_block_or_statement): Handle
parentheses in statement.
(c_parser_omp_metadirective): New prototype.
(c_parser_omp_context_selector): Add extra argument.  Allow
non-constant expressions.
(c_parser_omp_context_selector_specification): Add extra argument and
propagate it to c_parser_omp_context_selector.
(analyze_metadirective_body): New.
(c_parser_omp_metadirective): New.
(c_parser_omp_construct): Handle PRAGMA_OMP_METADIRECTIVE.

gcc/c-family
* c-common.h (enum c_omp_directive_kind): Add C_OMP_DIR_META.
(c_omp_expand_metadirective): New prototype.
* c-gimplify.c (genericize_omp_metadirective_stmt): New.
(c_genericize_control_stmt): Handle OMP_METADIRECTIVE tree nodes.
* c-omp.c (omp_directives): Classify metadirectives as C_OMP_DIR_META.
(c_omp_expand_metadirective): New stub function.
* c-pragma.c (omp_pragmas): Add entry for metadirective.
* c-pragma.h (enum pragma_kind): Add PRAGMA_OMP_METADIRECTIVE.

commit | commitdiff | tree

Richard Biener [Thu, 10 Jun 2021 09:03:55 +0000 (11:03 +0200)]

Expose stable sort algorithm to gcc_sort_r and add vec::stablesort

This makes it possible to apply GCC's stable sort algorithm to vec<>
and also use it with the qsort_r compatible interface.

2021-06-10 Richard Biener <rguenther@suse.de>

* system.h (gcc_stablesort_r): Declare.
* sort.cc (gcc_sort_r): Support stable sort.
(gcc_stablesort_r): Define.
* vec.h (vec<>::stablesort): Add.

(cherry-picked from commit 367f52dcc24045b072aeb26bc301a2980b39241f)

commit | commitdiff | tree

Sandra Loosemore [Wed, 19 Jan 2022 20:50:49 +0000 (12:50 -0800)]

Fortran: Fix scope for OMP AFFINITY clause iterator variables [PR103695]

gfc_finish_var_decl was confused by the undocumented overloading of
the proc_name field in struct gfc_namespace to contain iterator
variables for the OpenMP AFFINITY clause, causing it to insert the
decls in the wrong scope.  This patch adds a new distinct field to
hold these variables.

2022-01-23  Sandra Loosemore  <sandra@codesourcery.com>

PR fortran/103695
PR fortran/102621

Backport from mainline:

2022-01-20  Sandra Loosemore  <sandra@codesourcery.com>

gcc/fortran
* gfortran.h (struct gfc_namespace): Add omp_affinity_iterator
field.
* dump-parse-tree.c (show_iterator): Use it.
* openmp.c (gfc_match_iterator): Likewise.
(resolve_omp_clauses): Likewise.
* trans-decl.c (gfc_finish_var_decl): Likewise.
* trans-openmp.c (handle_iterator): Likewise.

gcc/testsuite/
* gfortran.dg/gomp/affinity-clause-3.f90: Adjust pattern.
* gfortran.dg/gomp/pr102621.f90: New.
* gfortran.dg/gomp/pr103695.f90: New.

commit | commitdiff | tree

Marcel Vollweiler [Wed, 19 Jan 2022 13:57:54 +0000 (05:57 -0800)]

libgomp, OpenMP: Fix issue for omp_get_device_num on gcn targets.

Currently omp_get_device_num does not work on gcn targets with more than one
offload device. The reason is that GOMP_DEVICE_NUM_VAR is static in
icv-device.c and thus "__gomp_device_num" is not visible in the offload image.

This patch removes "static" such that "__gomp_device_num" is now part of the
offload image and can now be found in GOMP_OFFLOAD_load_image in the plugin.

This is not an issue for nvptx. There, "__gomp_device_num" is in the offload
image even with "static".

libgomp/ChangeLog:

* config/gcn/icv-device.c: Make GOMP_DEVICE_NUM_VAR public (remove
"static") to make the device num available in the offload image.

(cherry picked from commit 0bd247bbbe4cf396173f09eeec37e116e98f8471)

commit | commitdiff | tree

Chung-Lin Tang [Tue, 4 Jan 2022 09:26:23 +0000 (17:26 +0800)]

libgomp: Fix GOMP_DEVICE_NUM_VAR stringification during offload image load

In the patch that implemented omp_get_device_num(), there was an error where
the stringification of GOMP_DEVICE_NUM_VAR, which is the macro expanding to
the actual symbol used, was erroneously using the STRINGX() macro in the
libgomp offload image symbol search, and expansion of the variable name
string through the additional layer of preprocessor symbol was not properly
achieved.

This patch fixes this by changing to properly use XSTRING(), also from
include/symcat.h.
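
The difference between the two macros can be reproduced in isolation. The
sketch below defines local copies of STRINGX and XSTRING in the form used by
include/symcat.h; DEVICE_NUM_VAR is a hypothetical stand-in for
GOMP_DEVICE_NUM_VAR, and the two helper functions exist only for
illustration.

```c
/* Local copies of the two helpers, in the form used by include/symcat.h.  */
#define STRINGX(s) #s
#define XSTRING(s) STRINGX(s)

/* Hypothetical stand-in for GOMP_DEVICE_NUM_VAR.  */
#define DEVICE_NUM_VAR __gomp_device_num

/* '#' is applied before the argument is macro-expanded, so this yields
   the macro's own name.  */
const char *name_unexpanded(void)
{
  return STRINGX(DEVICE_NUM_VAR);
}

/* The extra macro layer expands the argument first, so this yields the
   symbol name the macro stands for.  */
const char *name_expanded(void)
{
  return XSTRING(DEVICE_NUM_VAR);
}
```

Here name_unexpanded() returns "DEVICE_NUM_VAR", which is not a symbol in
any offload image, while name_expanded() returns "__gomp_device_num".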

libgomp/ChangeLog:

* plugin/plugin-gcn.c (GOMP_OFFLOAD_load_image): Change uses of STRINGX
into XSTRING when looking for GOMP_DEVICE_NUM_VAR in offload image.
* plugin/plugin-nvptx.c (GOMP_OFFLOAD_load_image): Likewise.

(cherry picked from commit fbb592407c9dd244b4cea086cbb90d7bd0bf60bb)

commit | commitdiff | tree

Chung-Lin Tang [Tue, 4 Jan 2022 07:37:15 +0000 (15:37 +0800)]

openmp: Fix ICE in gimplify_omp_affinity [PR103643]

After the PR90030 patch, which removes the universal casting of all Fortran
array pointers to 'c_char*', a Fortran descriptor based array passed into an
affinity() clause now looks like:

- #pragma omp task private(i) shared(b) affinity(*(c_char *) a.data)
+ #pragma omp task private(i) shared(b) affinity(*(integer(kind=4)[0:] * restrict) a.data)

The 'integer(kind=4)[0:]' incomplete type appears to be causing an ICE during
gimplify_expr() due to 'is_gimple_val, fb_rvalue'. The ICE appears to be fixed
just by adjusting to 'is_gimple_lvalue, fb_lvalue'. Considering the use of the
affinity() clause, which should be specifying the location of a particular
object in memory, this probably makes sense.

gcc/ChangeLog:

PR middle-end/103643

* gimplify.c (gimplify_omp_affinity): Adjust gimplify_expr of entire
OMP_CLAUSE_DECL to use 'is_gimple_lvalue, fb_lvalue'.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/pr103643.f90: New test.

(cherry picked from commit 62c8b21d48ab6012ddc50529a39071d902dba31a)


Andrew Stubbs [Tue, 16 Nov 2021 10:32:35 +0000 (10:32 +0000)]

amdgcn: Change offload variable table discovery

Up to now the libgomp GCN plugin has been finding the offload variables
by using a symbol lookup, but the AMD runtime requires that the symbols are
global for that to work. This was ensured by mkoffload as a post-processing
step, but the LLVM 13 assembler no longer accepts this in the case where the
variable was previously declared differently.

This patch switches to locating the symbols directly from the
offload_var_table, which means that only one symbol needs to be forced
global.

This change breaks the libgomp image compatibility so GOMP_VERSION_GCN has
also been bumped.

gcc/ChangeLog:

* config/gcn/mkoffload.c (process_asm): Process the variable table
completely differently.
(process_obj): Encode the variable data differently.

include/ChangeLog:

* gomp-constants.h (GOMP_VERSION_GCN): Bump.

libgomp/ChangeLog:

* plugin/plugin-gcn.c (struct gcn_image_desc): Remove global_variables.
(GOMP_OFFLOAD_load_image): Locate the offload variables via the
table, not individual symbols.

(cherry picked from commit 4a87a8e4b13e979e7c8a626a8f4082715a48e21e)


Andrew Stubbs [Fri, 3 Dec 2021 17:46:41 +0000 (17:46 +0000)]

libgomp, nvptx: low-latency memory allocator

This patch adds support for allocating low-latency ".shared" memory on
NVPTX GPU devices, via omp_low_lat_mem_space and omp_alloc. The memory
can be allocated, reallocated, and freed using a basic but fast algorithm
that is thread-safe. The size of the low-latency heap can be configured
using the GOMP_NVPTX_LOWLAT_POOL environment variable.

The use of the PTX dynamic_smem_size feature means that the minimum version
requirement is now bumped to 4.1 (still old at this point).
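
As a rough usage sketch (not taken from the patch or its tests), a target region might request scratch space through the predefined omp_low_lat_mem_alloc allocator from the OpenMP API. The function name below is hypothetical, and the malloc fallback only exists so that the example also builds without -fopenmp.

```c
#include <assert.h>
#include <stdlib.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Sum the integers 0..n-1, staging them in a scratch buffer requested from
   the low-latency allocator.  Returns -1 if the allocation fails.  Without
   -fopenmp the pragma is ignored and malloc/free are used instead.  */
long
sum_with_lowlat_scratch (int n)
{
  assert (n > 0);
  long sum = -1;
#pragma omp target map(tofrom: sum)
  {
#ifdef _OPENMP
    int *scratch = (int *) omp_alloc (n * sizeof (int), omp_low_lat_mem_alloc);
#else
    int *scratch = (int *) malloc (n * sizeof (int));
#endif
    if (scratch)
      {
        sum = 0;
        for (int i = 0; i < n; i++)
          scratch[i] = i;
        for (int i = 0; i < n; i++)
          sum += scratch[i];
#ifdef _OPENMP
        omp_free (scratch, omp_low_lat_mem_alloc);
#else
        free (scratch);
#endif
      }
  }
  return sum;
}
```

Whether such a request is actually served from ".shared" memory depends on the device, the request size and the configured pool size.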

libgomp/ChangeLog:

* allocator.c (MEMSPACE_ALLOC): New macro.
(MEMSPACE_CALLOC): New macro.
(MEMSPACE_REALLOC): New macro.
(MEMSPACE_FREE): New macro.
(dynamic_smem_size): New constants.
(omp_alloc): Use MEMSPACE_ALLOC.
Implement fall-backs for predefined allocators.
(omp_free): Use MEMSPACE_FREE.
(omp_calloc): Use MEMSPACE_CALLOC.
Implement fall-backs for predefined allocators.
(omp_realloc): Use MEMSPACE_REALLOC.
Implement fall-backs for predefined allocators.
* config/nvptx/team.c (__nvptx_lowlat_heap_root): New variable.
(__nvptx_lowlat_pool): New asm variable.
(gomp_nvptx_main): Initialize the low-latency heap.
* plugin/plugin-nvptx.c (lowlat_pool_size): New variable.
(GOMP_OFFLOAD_init_device): Read the GOMP_NVPTX_LOWLAT_POOL envvar.
(GOMP_OFFLOAD_run): Apply lowlat_pool_size.
* config/nvptx/allocator.c: New file.
* testsuite/libgomp.c/allocators-1.c: New test.
* testsuite/libgomp.c/allocators-2.c: New test.
* testsuite/libgomp.c/allocators-3.c: New test.
* testsuite/libgomp.c/allocators-4.c: New test.
* testsuite/libgomp.c/allocators-5.c: New test.
* testsuite/libgomp.c/allocators-6.c: New test.


Andrew Stubbs [Tue, 21 Dec 2021 10:09:08 +0000 (10:09 +0000)]

nvptx: bump default to PTX 4.1

gcc/ChangeLog:

* config/nvptx/nvptx-opts.h (ptx_version): Change PTX_VERSION_3_1 to
PTX_VERSION_4_1.
* config/nvptx/nvptx.c (nvptx_file_start): Bump minimum PTX version
to 4.1.
* config/nvptx/nvptx.opt (ptx_version): Add 4.1. Change default.
* doc/invoke.texi: -mptx default is now 4.1.


Andrew Stubbs [Thu, 16 Dec 2021 15:30:05 +0000 (15:30 +0000)]

OpenMP: allow requires dynamic_allocators

There's no need to reject the dynamic_allocators requires directive because
we actually do support the feature, and it doesn't have to actually "do"
anything.

gcc/c/ChangeLog:

* c-parser.c (c_parser_omp_requires): Don't "sorry" dynamic_allocators.

gcc/cp/ChangeLog:

* parser.c (cp_parser_omp_requires): Don't "sorry" dynamic_allocators.

gcc/fortran/ChangeLog:

* openmp.c (gfc_match_omp_requires): Don't "sorry" dynamic_allocators.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/requires-8.f90: Reinstate dynamic allocators
requirement.


Chung-Lin Tang [Thu, 2 Dec 2021 10:24:03 +0000 (18:24 +0800)]

fortran: OpenMP/OpenACC array mapping alignment fix (PR90030)

Fix issue with the Fortran front-end when mapping arrays: when creating the
data MEM_REF for the map clause, there was a convention of casting the
referencing pointer to 'c_char *' by
fold_convert (build_pointer_type (char_type_node), ptr).

This causes the alignment passed to the libgomp runtime for array data to be
hardwired to '1', and causes alignment errors on the offload target.

This patch fixes this by removing the char_type_node pointer converts, and
adding gcc_asserts to ensure POINTER_TYPE_P (TREE_TYPE (ptr)).
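
As a loose C analogy (this is not the Fortran front-end code), the alignment that can be inferred from a pointer's static type depends on the pointee type, and for char it is always 1. In the sketch below int32_t stands in for an integer(kind=4) element, and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A char pointee always has alignment 1, whatever the data really is.  */
static_assert (_Alignof (char) == 1, "char alignment is 1 by definition");

/* Alignment a consumer would infer if it only saw a 'char *'.  */
size_t
align_via_char_pointer (void)
{
  return _Alignof (char);
}

/* Alignment inferred from a pointer to the real element type.  */
size_t
align_via_element_pointer (void)
{
  return _Alignof (int32_t);
}
```

Keeping the original pointer type therefore preserves the element alignment instead of collapsing it to 1.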

PR fortran/90030

gcc/fortran/ChangeLog:

* trans-openmp.c (gfc_omp_finish_clause): Remove fold_convert to pointer
to char_type_node, add gcc_assert of POINTER_TYPE_P.
(gfc_trans_omp_array_section): Likewise.
(gfc_trans_omp_clauses): Likewise.

gcc/testsuite/ChangeLog:

* gfortran.dg/goacc/finalize-1.f: Adjust scan test.
* gfortran.dg/gomp/affinity-clause-1.f90: Likewise.
* gfortran.dg/gomp/affinity-clause-5.f90: Likewise.
* gfortran.dg/gomp/defaultmap-4.f90: Likewise.
* gfortran.dg/gomp/defaultmap-5.f90: Likewise.
* gfortran.dg/gomp/defaultmap-6.f90: Likewise.
* gfortran.dg/gomp/map-3.f90: Likewise.
* gfortran.dg/gomp/pr78260-2.f90: Likewise.
* gfortran.dg/gomp/pr78260-3.f90: Likewise.

libgomp/ChangeLog:

* testsuite/libgomp.oacc-fortran/pr90030.f90: New test.
* testsuite/libgomp.fortran/pr90030.f90: New test.

(cherry picked from commit 1ac7a8c9e4798d352eb8c64905dd38086af4e1cd)


Chung-Lin Tang [Fri, 3 Dec 2021 09:27:17 +0000 (17:27 +0800)]

fortran: Fix setting of array lower bound for named arrays

This patch fixes a case of setting array lower bounds, found for particular uses
of SOURCE=/MOLD=. This adjusts the relevant part in gfc_trans_allocate() to
set e3_has_nodescriptor only for non-named arrays.

2021-12-03 Tobias Burnus <tobias@codesourcery.com>

gcc/fortran/ChangeLog:

* trans-stmt.c (gfc_trans_allocate): Set e3_has_nodescriptor to true
only for non-named arrays.

gcc/testsuite/ChangeLog:

* gfortran.dg/allocate_with_source_26.f90: Adjust testcase.
* gfortran.dg/allocate_with_mold_4.f90: New testcase.

(cherry picked from commit 6262e3a22b3d86afc116480bc59a7bb30b0cfd40)


Tobias Burnus [Wed, 24 Nov 2021 10:08:40 +0000 (11:08 +0100)]

Update GMP/MPFR/MPC/ISL version in contrib/download_prerequisites

contrib/
* download_prerequisites: Update to gmp-6.2.1, mpfr-4.1.0,
mpc-1.2.1 and isl-0.24.
* prerequisites.md5: Update hash.
* prerequisites.sha512: Likewise.

(cherry picked from commit be60f80247feb72b47af62cda66c82a0fa6c1cdc)


Frederik Harwath [Tue, 16 Nov 2021 15:22:48 +0000 (16:22 +0100)]

openacc: Adjust test expectations to new "kernels" handling

Adjust tests to changed expectations with the new Graphite-based
"kernels" handling.

libgomp/ChangeLog:

* testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c: Adjust.
* testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Adjust.
* testsuite/libgomp.oacc-c-c++-common/pr84955-1.c: Adjust.
* testsuite/libgomp.oacc-c-c++-common/pr85381-2.c: Adjust.
* testsuite/libgomp.oacc-c-c++-common/pr85381-3.c: Adjust.
* testsuite/libgomp.oacc-c-c++-common/pr85381-4.c: Adjust.
* testsuite/libgomp.oacc-c-c++-common/pr85486-2.c: Adjust.
* testsuite/libgomp.oacc-c-c++-common/pr85486-3.c: Adjust.
* testsuite/libgomp.oacc-c-c++-common/pr85486.c: Adjust.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-1.c: Adjust.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-2.c: Adjust.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-3.c: Adjust.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-4.c: Adjust.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-5.c: Adjust.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-6.c: Adjust.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-7.c: Adjust.
* testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90: Adjust.
* testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90: Adjust.
* testsuite/libgomp.oacc-fortran/kernels-acc-loop-reduction-2.f90: Adjust.
* testsuite/libgomp.oacc-fortran/pr94358-1.f90: Adjust.
* testsuite/libgomp.oacc-fortran/parallel-loop-auto-reduction-2.f90: Removed.

gcc/testsuite/ChangeLog:

* c-c++-common/goacc/acc-icf.c: Adjust.
* c-c++-common/goacc/cache-3-1.c: Adjust.
* c-c++-common/goacc/classify-kernels-unparallelized-graphite.c: Adjust.
* c-c++-common/goacc/classify-kernels.c: Adjust.
* c-c++-common/goacc/classify-serial.c: Adjust.
* c-c++-common/goacc/if-clause-2.c: Adjust.
* c-c++-common/goacc/kernels-decompose-1.c: Adjust.
* c-c++-common/goacc/kernels-decompose-2.c: Adjust.
* c-c++-common/goacc/kernels-decompose-ice-1.c: Adjust.
* c-c++-common/goacc/kernels-decompose-ice-2.c: Adjust.
* c-c++-common/goacc/kernels-loop-3-acc-loop.c: Adjust.
* c-c++-common/goacc/kernels-loop-3.c: Adjust.
* c-c++-common/goacc/loop-2-kernels.c: Adjust.
* c-c++-common/goacc/nested-reductions-2-parallel.c: Adjust.
* c-c++-common/goacc/note-parallelism-1-kernels-loop-auto.c: Adjust.
* c-c++-common/goacc/note-parallelism-1-kernels-loop-independent_seq.c: Adjust.
* c-c++-common/goacc/note-parallelism-1-kernels-loops.c: Adjust.
* c-c++-common/goacc/note-parallelism-1-kernels-straight-line.c: Adjust.
* c-c++-common/goacc/note-parallelism-combined-kernels-loop-auto.c: Adjust.
* c-c++-common/goacc/note-parallelism-combined-kernels-loop-independent_seq.c: Adjust.
* c-c++-common/goacc/note-parallelism-kernels-conditional-loop-independent_seq.c: Adjust.
* c-c++-common/goacc/note-parallelism-kernels-loop-auto.c: Adjust.
* c-c++-common/goacc/note-parallelism-kernels-loop-independent_seq.c: Adjust.
* c-c++-common/goacc/note-parallelism-kernels-loops.c: Adjust.
* c-c++-common/goacc/routine-1.c: Adjust.
* c-c++-common/goacc/routine-level-of-parallelism-2.c: Adjust.
* c-c++-common/goacc/routine-nohost-1.c: Adjust.
* c-c++-common/goacc/uninit-copy-clause.c: Adjust.
* gcc.dg/goacc/loop-processing-1.c: Adjust.
* gcc.dg/goacc/nested-function-1.c: Adjust.
* gfortran.dg/goacc/classify-kernels-unparallelized.f95: Adjust.
* gfortran.dg/goacc/classify-kernels.f95: Adjust.
* gfortran.dg/goacc/classify-parallel.f95: Adjust.
* gfortran.dg/goacc/classify-routine.f95: Adjust.
* gfortran.dg/goacc/classify-serial.f95: Adjust.
* gfortran.dg/goacc/common-block-3.f90: Adjust.
* gfortran.dg/goacc/gang-static.f95: Adjust.
* gfortran.dg/goacc/kernels-decompose-1.f95: Adjust.
* gfortran.dg/goacc/kernels-decompose-2.f95: Adjust.
* gfortran.dg/goacc/kernels-loop-2.f95: Adjust.
* gfortran.dg/goacc/kernels-loop-data-2.f95: Adjust.
* gfortran.dg/goacc/kernels-loop-inner.f95: Adjust.
* gfortran.dg/goacc/kernels-loop.f95: Adjust.
* gfortran.dg/goacc/kernels-tree.f95: Adjust.
* gfortran.dg/goacc/loop-2-kernels.f95: Adjust.
* gfortran.dg/goacc/loop-auto-transfer-2.f90: Adjust.
* gfortran.dg/goacc/loop-auto-transfer-3.f90: Adjust.
* gfortran.dg/goacc/loop-auto-transfer-4.f90: Adjust.
* gfortran.dg/goacc/nested-function-1.f90: Adjust.
* gfortran.dg/goacc/nested-reductions-2-parallel.f90: Adjust.
* gfortran.dg/goacc/pr72741.f90: Adjust.
* gfortran.dg/goacc/private-explicit-kernels-1.f95: Adjust.
* gfortran.dg/goacc/private-predetermined-kernels-1.f95: Adjust.
* gfortran.dg/goacc/routine-module-mod-1.f90: Adjust.
* gfortran.dg/goacc/uninit-copy-clause.f95: Adjust.
* c-c++-common/goacc/note-parallelism-1-kernels-conditional-loop-independent_seq.c: Removed.


Frederik Harwath [Tue, 16 Nov 2021 15:22:29 +0000 (16:22 +0100)]

graphite: Accept loops without data references

It seems that the check that rejects loops without data references is
only included to avoid handling non-profitable loops. Including those
loops in Graphite's analysis enables more consistent diagnostic
messages in OpenACC "kernels" code and does not introduce any
testsuite regressions. If executing Graphite on loops without
data references leads to noticeable compile time slow-downs for
non-OpenACC users of Graphite, the check can be re-introduced but
restricted to non-OpenACC functions.

gcc/ChangeLog:

* graphite-scop-detection.c (scop_detection::harmful_loop_in_region):
Remove check for loops without data references.


Frederik Harwath [Tue, 16 Nov 2021 15:21:57 +0000 (16:21 +0100)]

graphite: Adjust scop loop-nest choice

The find_common_loop function is used in Graphite to obtain a common
super-loop of all loops inside a SCoP.  The function is applied to the
loop of the destination block of the edge that leads into the SESE
region and the loop of the source block of the edge that exits the
region.  The exit block is usually introduced by the canonicalization
of the loop structure that Graphite does to support its code
generation. If it is empty, it may happen that it belongs to the outer
fake loop.  This way, build_alias_set may end up analysing
data-references with respect to this loop although there may exist a
proper super-loop of the SCoP loops.  This does not seem to be correct
in general and it leads to problems with runtime alias check creation
which fails if executed on a loop without niter information.
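
The effect described above can be illustrated with a toy loop tree. This is a simplified sketch: struct toy_loop and toy_common_loop are hypothetical stand-ins, not GCC's loop structure or its find_common_loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified loop-tree node; GCC's real loop structure
   carries far more state.  */
struct toy_loop
{
  struct toy_loop *outer;   /* Enclosing loop, NULL for the fake root loop.  */
  int depth;                /* 0 for the fake root loop.  */
};

/* Return the innermost loop that contains both A and B.  Both nodes must
   belong to the same tree.  */
struct toy_loop *
toy_common_loop (struct toy_loop *a, struct toy_loop *b)
{
  assert (a != NULL && b != NULL);
  /* Bring both nodes to the same depth.  */
  while (a->depth > b->depth)
    a = a->outer;
  while (b->depth > a->depth)
    b = b->outer;
  /* Walk outwards in lockstep until the paths meet.  */
  while (a != b)
    {
      a = a->outer;
      b = b->outer;
    }
  return a;
}
```

If the exit block is attributed to the fake root loop, the common loop is the root even when a real loop encloses all SCoP loops, which is the situation the new scop_context_loop function is meant to avoid.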

gcc/ChangeLog:

* graphite-scop-detection.c (scop_context_loop): New function.
(build_alias_set): Use scop_context_loop instead of find_common_loop.
* graphite-isl-ast-to-gimple.c (graphite_regenerate_ast_isl): Likewise.
* graphite.h (scop_context_loop): New declaration.


Frederik Harwath [Tue, 16 Nov 2021 15:21:42 +0000 (16:21 +0100)]

graphite: Tune parameters for OpenACC use

The default values of some parameters that restrict Graphite's
resource usage are too low for many OpenACC codes.  Furthermore,
exceeding the limits does not always lead to user-visible diagnostic
messages.

This commit increases the parameter values on OpenACC functions.  The
values were chosen to allow for the analysis of all "kernels" regions
in the SPEC ACCEL v1.3 benchmark suite.  Warnings about exceeded
Graphite-related limits are added to the -fopt-info-missed
output. Those warnings are phrased in a uniform way that intentionally
refers to the "data-dependence analysis" of "OpenACC loops" instead of
"a failure in Graphite" to make them easier to understand for users.

gcc/ChangeLog:

* graphite-optimize-isl.c (optimize_isl): Adjust
param_max_isl_operations value for OpenACC functions and add
special warnings if value gets exceeded.

* graphite-scop-detection.c (build_scops): Likewise for
param_graphite_max_arrays_per_scop.

gcc/testsuite/ChangeLog:

* gcc.dg/goacc/graphite-parameter-1.c: New test.
* gcc.dg/goacc/graphite-parameter-2.c: New test.


Frederik Harwath [Tue, 16 Nov 2021 15:20:56 +0000 (16:20 +0100)]

openacc: Disable pass_pre on outlined functions analyzed by Graphite

The additional dependences introduced by partial redundancy
elimination proper and by the code hoisting step of the pass very
often cause Graphite to fail on OpenACC functions. On the other hand,
the pass can also enable the analysis of OpenACC loops (cf. e.g. the
loop-auto-transfer-4.f90 testcase), for instance, because full
redundancy elimination removes definitions that would otherwise
prevent the creation of runtime alias checks outside of the SCoP.

This commit disables the actual partial redundancy elimination step as
well as the code hoisting step of pass_pre on OpenACC functions that
might be handled by Graphite.

gcc/ChangeLog:

* tree-ssa-pre.c (insert): Skip any insertions in OpenACC
functions that might be processed by Graphite.


Frederik Harwath [Tue, 16 Nov 2021 15:20:41 +0000 (16:20 +0100)]

openacc: Handle internal function calls in pass_lim

The loop invariant motion pass correctly refuses to move statements
out of a loop if any other statement in the loop is unanalyzable.  The
pass does not know how to handle the OpenACC internal function calls;
this was not necessary until recently, when the OpenACC device
lowering pass was moved to a later position in the pass pipeline.

This commit changes pass_lim to ignore the OpenACC internal function
calls which do not contain any memory references. The hoisting enabled
by this change can be useful for the data-dependence analysis in
Graphite; for instance, in the outlined functions for OpenACC regions,
all invariant accesses to the ".omp_data_i" struct should be hoisted
out of the OpenACC loop.  This is particularly important for variables
that were scalars in the original loop and which have been turned into
accesses to the struct by the outlining process.  Not hoisting those
can prevent scalar evolution analysis which is crucial for Graphite.
Since any hoisting that introduces intermediate names - and hence,
"fake" dependences - inside the analyzed nest can be harmful to
data-dependence analysis, a flag to restrict the hoisting in OpenACC
functions is added to the pass. The pass instance that executes before
Graphite now runs with this flag set to true and the pass instance
after Graphite runs unrestricted.

A more precise way of selecting the statements for which hoisting
should be enabled is left for a future improvement.
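
The hoisting of invariant struct accesses can be pictured with a hand-written before/after pair. This is only a sketch: struct toy_omp_data is a hypothetical stand-in for the ".omp_data_i" struct, and the second function shows by hand what loop invariant motion would produce.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the struct that outlining creates.  */
struct toy_omp_data
{
  int *a;
  int n;
  int scale;
};

/* Before hoisting: the invariant fields are reloaded on every iteration,
   so the former scalars appear as memory accesses inside the loop.  */
void
scale_unhoisted (struct toy_omp_data *data)
{
  assert (data != NULL);
  for (int i = 0; i < data->n; i++)
    data->a[i] *= data->scale;
}

/* After hoisting: the invariant loads happen once, in front of the loop,
   and the loop body only sees scalars again.  This assumes the array does
   not overlap the struct itself.  */
void
scale_hoisted (struct toy_omp_data *data)
{
  assert (data != NULL);
  int *a = data->a;
  int n = data->n;
  int scale = data->scale;
  for (int i = 0; i < n; i++)
    a[i] *= scale;
}
```

In the second form the loop bound and the scale factor are plain scalars, which is the shape that scalar evolution analysis can work with.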

gcc/ChangeLog:

* passes.def: Set restrict_oacc_hoisting to true for the early
pass_lim instance.
* tree-ssa-loop-im.c (movement_possibility): Add
restrict_oacc_hoisting flag to function; restrict movement if set.
(compute_invariantness): Add restrict_oacc_hoisting flag and pass it on.
(gather_mem_refs_stmt): Skip IFN_GOACC_LOOP and IFN_UNIQUE
calls.
(loop_invariant_motion_in_fun): Add restrict_oacc_hoisting flag and
pass it on.
(pass_lim::execute): Pass on new flags.
* tree-ssa-loop-manip.h (loop_invariant_motion_in_fun): Adjust declaration.
* gimple-loop-interchange.cc (pass_linterchange::execute): Adjust call to
loop_invariant_motion_in_fun.


Frederik Harwath [Tue, 16 Nov 2021 15:20:15 +0000 (16:20 +0100)]

openacc: Warn about "independent" "kernels" loops with data-dependences

This commit concerns loops in OpenACC "kernels" regions that have been marked
up with an explicit "independent" clause by the user, but for which Graphite
found data dependences.  A discussion on the private internal OpenACC mailing
list suggested that warning the user about the dependences would be a more
acceptable solution than reverting the user's decision. This behavior is
implemented by the present commit.
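
A loop of the kind this warning is aimed at might look like the following sketch. The function is hypothetical and not one of the tests; without -fopenacc the pragmas are ignored and the loop runs sequentially.

```c
#include <assert.h>
#include <stddef.h>

/* Running total over A.  Iteration i reads the value written by iteration
   i - 1, so the "independent" clause makes a promise that does not hold.  */
void
running_total (int *a, int n)
{
  assert (a != NULL && n > 0);
#pragma acc kernels copy(a[0:n])
  {
#pragma acc loop independent
    for (int i = 1; i < n; i++)
      a[i] += a[i - 1];
  }
}
```

The user's clause is still honored for such a loop; the diagnostic, controlled by the Wopenacc-false-independent flag added to common.opt, only reports the dependence.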

gcc/ChangeLog:

* common.opt: Add flag Wopenacc-false-independent.
* omp-offload.c (oacc_loop_warn_if_false_independent): New function.
(oacc_loop_fixed_partitions): Call from here.


Andrew Stubbs [Tue, 16 Nov 2021 15:19:53 +0000 (16:19 +0100)]

openacc: Add runtime alias checking for OpenACC kernels

This commit adds the code generation for the runtime alias checks for
OpenACC loops that have been analyzed by Graphite. The runtime alias
check condition gets generated in Graphite. It is evaluated by the
code generated for the IFN_GOACC_LOOP internal function calls. If
aliasing is detected at runtime, the execution dimensions get adjusted
to execute the affected loops sequentially.
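
The idea of a runtime alias check can be sketched in plain C. This is an illustration of the general technique only: the function names are hypothetical, and the real check is generated by Graphite and evaluated in the GOACC_LOOP expansion rather than written by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return 1 if the byte ranges [a, a + a_bytes) and [b, b + b_bytes)
   overlap, 0 otherwise.  */
int
ranges_overlap (const void *a, size_t a_bytes, const void *b, size_t b_bytes)
{
  uintptr_t a_lo = (uintptr_t) a;
  uintptr_t b_lo = (uintptr_t) b;
  return a_lo < b_lo + b_bytes && b_lo < a_lo + a_bytes;
}

/* Compute dst[i] = src[i] + 1 and return the "noalias" flag that chose
   the execution strategy.  */
int
add_one_checked (int *dst, const int *src, size_t n)
{
  assert (dst != NULL && src != NULL);
  int noalias = !ranges_overlap (dst, n * sizeof *dst, src, n * sizeof *src);
  if (noalias)
    {
      /* No overlap: iterations are independent and could be distributed.  */
      for (size_t i = 0; i < n; i++)
        dst[i] = src[i] + 1;
    }
  else
    {
      /* Overlap: keep the original sequential order.  */
      for (size_t i = 0; i < n; i++)
        dst[i] = src[i] + 1;
    }
  return noalias;
}
```

Both branches contain the same loop; only the first could safely be partitioned across gangs, workers or vectors, which mirrors how the generated check selects between partitioned and sequential execution.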

gcc/ChangeLog:

* graphite-isl-ast-to-gimple.c: Include internal-fn.h.
(graphite_oacc_analyze_scop): Implement runtime alias checks.
* omp-expand.c (expand_oacc_for): Add an additional "noalias" parameter
to GOACC_LOOP internal calls, and initialise it to integer_one_node.
* omp-offload.c (oacc_xform_loop): Integrate the runtime alias check
into the GOACC_LOOP expansion.

libgomp/ChangeLog:

* testsuite/libgomp.oacc-c-c++-common/runtime-alias-check-1.c: New test.
* testsuite/libgomp.oacc-c-c++-common/runtime-alias-check-2.c: New test.


Andrew Stubbs [Tue, 16 Nov 2021 15:19:23 +0000 (16:19 +0100)]

openacc: Add data optimization pass

Address PR90591 "Avoid unnecessary data transfer out of OMP
construct", for simple (but common) cases.
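
A simple case of this kind might look like the sketch below, which is not one of the commit's tests. The function is hypothetical, and the comments about which clause each scalar could be narrowed to are assumptions about what a data-mapping optimization can prove, not a description of the exact analysis.

```c
#include <assert.h>
#include <stddef.h>

/* Scale every element and add one.  Both scalars are mapped with "copy",
   as a user might write them.  "factor" is not needed after the region,
   so copying it back is unnecessary (a copyin candidate).  "tmp" is also
   overwritten before every read, so it needs no transfer in either
   direction (a private candidate).  */
int
scale_and_bump (int *a, int n, int factor)
{
  assert (a != NULL && n > 0);
  int tmp = 0;
#pragma acc kernels copy(a[0:n]) copy(factor, tmp)
  {
    for (int i = 0; i < n; i++)
      {
        tmp = a[i] * factor;
        a[i] = tmp + 1;
      }
  }
  return a[n - 1];
}
```

Without -fopenacc the pragma is ignored and the function simply runs on the host.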

This commit adds a pass that optimizes data mapping clauses.
Currently, it can optimize copy/map(tofrom) clauses involving scalars
to copyin/map(to) and further to "private".  The pass is restricted to
"kernels" regions but could be extended to other types of regions.

gcc/ChangeLog:

* Makefile.in: Add pass.
* doc/gimple.texi: TODO.
* gimple-walk.c (walk_gimple_seq_mod): Adjust for backward walking.
* gimple-walk.h (struct walk_stmt_info): Add field.
* passes.def: Add new pass.
* tree-pass.h (make_pass_omp_data_optimize): New declaration.
* omp-data-optimize.cc: New file.

libgomp/ChangeLog:

* testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c:
Expect optimization messages.
* testsuite/libgomp.oacc-fortran/pr94358-1.f90: Likewise.

gcc/testsuite/ChangeLog:

* c-c++-common/goacc/note-parallelism-1-kernels-loops.c: Likewise.
* c-c++-common/goacc/note-parallelism-1-kernels-straight-line.c:
Likewise.
* c-c++-common/goacc/note-parallelism-kernels-loops.c: Likewise.
* c-c++-common/goacc/uninit-copy-clause.c: Likewise.
* gfortran.dg/goacc/uninit-copy-clause.f95: Likewise.
* c-c++-common/goacc/omp_data_optimize-1.c: New test.
* g++.dg/goacc/omp_data_optimize-1.C: New test.
* gfortran.dg/goacc/omp_data_optimize-1.f90: New test.

Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>


Frederik Harwath [Tue, 16 Nov 2021 15:18:02 +0000 (16:18 +0100)]

Add function for printing a single OMP_CLAUSE

Commit 89f4f339130c ("For 'OMP_CLAUSE' in 'dump_generic_node', dump
the whole OMP clause chain") changed the dumping behavior for
OMP_CLAUSEs.  The old behavior is required for a follow-up
commit ("openacc: Add data optimization pass") that optimizes single
OMP_CLAUSEs.

gcc/ChangeLog:

* tree-pretty-print.c (print_omp_clause_to_str): Add new function.
* tree-pretty-print.h (print_omp_clause_to_str): Add declaration.


Frederik Harwath [Tue, 16 Nov 2021 15:17:48 +0000 (16:17 +0100)]

openacc: Remove unused partitioning in "kernels" regions

With the old "kernels" handling, unparallelized regions would
get executed with 1x1x1 partitioning even if the user provided
explicit num_gangs, num_workers clauses etc.

This commit restores this behavior by removing unused partitioning
after assigning the parallelism dimensions to loops.

gcc/ChangeLog:

* omp-offload.c (oacc_remove_unused_partitioning): New function
for removing partitioning that is not used by any loop.
(oacc_validate_dims): Call oacc_remove_unused_partitioning and
enable warnings about unused partitioning.

libgomp/ChangeLog:

* testsuite/libgomp.oacc-c-c++-common/acc_prof-kernels-1.c: Adjust
expectations.


Frederik Harwath [Tue, 16 Nov 2021 15:17:15 +0000 (16:17 +0100)]

openacc: Add further kernels tests

Add some copies of tests to continue covering the old "parloops"-based
"kernels" implementation - until it gets removed from GCC - and
add further tests for the new Graphite-based implementation.

libgomp/ChangeLog:

* testsuite/libgomp.oacc-fortran/parallel-loop-auto-reduction-2.f90:
New test.

gcc/testsuite/ChangeLog:

* c-c++-common/goacc/classify-kernels-unparallelized-graphite.c:
New test.
* c-c++-common/goacc/classify-kernels-unparallelized-parloops.c:
New test.
* c-c++-common/goacc/kernels-decompose-1-parloops.c: New test.
* c-c++-common/goacc/kernels-reduction-parloops.c: New test.
* c-c++-common/goacc/loop-auto-reductions.c: New test.
* c-c++-common/goacc/note-parallelism-1-kernels-loop-auto-parloops.c:
New test.
* c-c++-common/goacc/note-parallelism-kernels-loops-1.c: New test.
* c-c++-common/goacc/note-parallelism-kernels-loops-parloops.c:
New test.
* gfortran.dg/goacc/classify-kernels-unparallelized-parloops.f95:
New test.
* gfortran.dg/goacc/kernels-conversion.f95: New test.
* gfortran.dg/goacc/kernels-decompose-1-parloops.f95: New test.
* gfortran.dg/goacc/kernels-decompose-parloops-2.f95: New test.
* gfortran.dg/goacc/kernels-loop-data-parloops-2.f95: New test.
* gfortran.dg/goacc/kernels-loop-parloops-2.f95: New test.
* gfortran.dg/goacc/kernels-loop-parloops.f95: New test.
* gfortran.dg/goacc/kernels-reductions.f90: New test.


Frederik Harwath [Tue, 16 Nov 2021 15:16:47 +0000 (16:16 +0100)]

openacc: Add "can_be_parallel" flag info to "graph" dumps

gcc/ChangeLog:

* graph.c (oacc_get_fn_attrib): New declaration.
(find_loop_location): New declaration.
(draw_cfg_nodes_for_loop): Print value of the
can_be_parallel flag at the top of loops in OpenACC
functions.


Frederik Harwath [Tue, 16 Nov 2021 15:16:22 +0000 (16:16 +0100)]

openacc: Use Graphite for dependence analysis in "kernels" regions

This commit changes the handling of OpenACC "kernels" to use Graphite
for dependence analysis. To this end, it first introduces a new
internal representation for "kernels" regions which should be analyzed
by Graphite in pass_omp_oacc_kernels_decompose.  This is now the
default for all "kernels" regions, but the old handling is still
available through the command line parameter
"--param=openacc_kernels=decompose-parloops".  The handling of this
new region type in the omp lowering and omp offloading passes follows
the existing handling for "parallel" regions.  This replaces the
specialized handling for "kernels" regions that was previously used
and which was limited in many ways.

Graphite is adjusted to be able to analyze the OpenACC functions that
get outlined from the "kernels" regions. It is enabled to handle the
internal function calls that contain information about OpenACC
constructs. In some places where function calls would be rejected by
Graphite, those calls need to be ignored. In other places, information
about the loop step, bounds etc. needs to be extracted from the
calls. The goal is to enable an analysis of the original loop
parameters although the omp lowering and expansion steps have already
modified the loop structure.  Some parallelization-enabling constructs
such as OpenACC "reduction" and "private"/"firstprivate" clauses must
be recognized and the data-dependences must be adjusted to reflect the
semantics of those constructs.  The data-dependence analysis step in
Graphite has so far been tied to the code generation step.  This
commit introduces a separate data-dependence analysis step that avoids
the code generation.  This is necessary because adjusting the code
generation to create a correct OpenACC loop structure would require
very considerable effort and the goal of this commit is to implement
the dependence analysis only. The ability to use Graphite for
dependence analysis without its code generation might be of
independent interest, but it is so far used for OpenACC purposes
only. In general, all changes to Graphite try to avoid affecting other
uses of Graphite as much as possible.

gcc/ChangeLog:

* Makefile.in: Add graphite-oacc.o
* cfgloop.c (alloc_loop): Set can_be_parallel_valid_p to false.
* cfgloop.h: Add can_be_parallel_valid_p field.
* cfgloopmanip.c (copy_loop_info): Add assert.
* config/nvptx/nvptx.c (nvptx_goacc_reduction_setup):
* doc/invoke.texi: Adjust param openacc-kernels description.
* doc/passes.texi: Adjust pass_ipa_oacc_kernels description.
* flag-types.h (enum openacc_kernels): Add
OPENACC_KERNELS_DECOMPOSE_PARLOOPS.
* gimple-pretty-print.c (dump_gimple_omp_target): Handle
GF_OMP_TARGET_KIND_OACC_PARALLEL_KERNELS_GRAPHITE.
* gimple.h (enum gf_mask): Add
GF_OMP_TARGET_KIND_OACC_PARALLEL_KERNELS_GRAPHITE and
widen GF_OMP_TARGET_KIND_MASK.
(is_gimple_omp_oacc): Handle
GF_OMP_TARGET_KIND_OACC_PARALLEL_KERNELS_GRAPHITE.
(is_gimple_omp_offloaded): Likewise.
* gimplify.c (gimplify_omp_for): Enable reduction localization
for "kernels" regions.
(gimplify_omp_workshare): Likewise.
* graphite-dependences.c (scop_get_reads_and_writes): Handle
"kills" and "reduction" PDRs.
(apply_schedule_on_deps): Add dump output for intermediate
steps of the dependence computation to enable understanding
of unexpected dependences.
(carries_deps): Likewise.
(scop_get_dependences): Handle "kill" operations and add dump
output.
* graphite-isl-ast-to-gimple.c (visit_schedule_loop_node): New function.
(graphite_oacc_analyze_scop): New function.
* graphite-optimize-isl.c (optimize_isl): Remove "static" and
add argument to identify OpenACC use; don't fail on unchanged
schedule in this case.
* graphite-poly.c (new_poly_dr): Handle "kills".
(print_pdr): Likewise.
(new_gimple_poly_bb): Likewise.
(free_gimple_poly_bb): Likewise.
(new_scop): Handle "reduction", "private", and "firstprivate"
hash sets.
(free_scop): Likewise.
(print_isl_space): New function.
(debug_isl_space): New function.
* graphite-scop-detection.c (scop_detection::can_represent_loop):
Don't fail if niter is 0 in OpenACC functions.
(scop_detection::add_scop): Don't reject regions with only one
loop in OpenACC functions.
(ignored_oacc_internal_call_p): New function.
(scan_tree_for_params): Handle VIEW_CONVERT_EXPR.
(stmt_has_side_effects): Ignore internal OpenACC function calls.
(add_write): Likewise.
(add_read): Likewise.
(add_kill): New function.
(add_kills): New function.
(add_oacc_kills): New function.
(try_generate_gimple_bb): Kill false dependences for OpenACC
"private"/"firstprivate" vars.
(gather_bbs::gather_bbs): Determine OpenACC
"private"/"firstprivate" vars in region.
(gather_bbs::before_dom_children): Add assert.
(determine_openacc_reductions): New function.
(build_scops): Determine OpenACC "reduction" vars in SCoP.
* graphite-sese-to-poly.c (oacc_ifn_call_extract): New declaration.
(oacc_internal_call_p): New function.
(build_poly_dr): Ignore internal OpenACC function calls,
handle "reduction" refs.
(build_poly_sr): Likewise; handle "kill" operations.
* graphite.c (graphite_transform_loops): Accept functions with
only a single loop.
(oacc_enable_graphite_p): New function.
(gate_graphite_transforms): Enable pass on OpenACC functions.
* graphite.h (enum poly_dr_type): Add PDR_KILL.
(struct poly_dr): Add "is_reduction" field.
(new_poly_dr): Add argument to declaration.
(pdr_kill_p): New function.
(print_isl_space): New declaration.
(debug_isl_space): New declaration.
(struct scop): Add fields "reductions_vars",
"oacc_firstprivate_vars", and "oacc_private_scalars".
(optimize_isl): New declaration.
(graphite_oacc_analyze_scop): New declaration.
* internal-fn.c (expand_UNIQUE): Handle
IFN_UNIQUE_OACC_PRIVATE_SCALAR and IFN_UNIQUE_OACC_FIRSTPRIVATE
* internal-fn.h: Add OACC_PRIVATE_SCALAR and OACC_FIRSTPRIVATE
* omp-expand.c (struct omp_region): Adjust comment.
(expand_omp_taskloop_for_inner):
(expand_omp_for): Add asserts about expected "kernels" region types.
(mark_loops_in_oacc_kernels_region): Likewise.
(expand_omp_target): Likewise; handle
GF_OMP_TARGET_KIND_OACC_PARALLEL_KERNELS_GRAPHITE.
(build_omp_regions_1): Handle
GF_OMP_TARGET_KIND_OACC_PARALLEL_KERNELS_GRAPHITE.
Likewise.
(omp_make_gimple_edges): Likewise.
* omp-general.c (oacc_get_kernels_attrib): New function.
(oacc_get_fn_dim_size): Allow argument to be NULL.
* omp-general.h (oacc_get_kernels_attrib): New declaration.
* omp-low.c (struct omp_context): Add fields
"oacc_firstprivate_vars" and "oacc_private_scalars".
(was_originally_oacc_kernels): New function.
(is_oacc_kernels):
(is_oacc_kernels_decomposed_graphite_part): New function.
(new_omp_context): Allocate "oacc_firstprivate_vars" and
"oacc_private_scalars" ...
(delete_omp_context): ... and free from here.
(oacc_record_firstprivate_var_clauses): New function.
(oacc_record_private_scalars): New function.
(scan_sharing_clauses): Call functions to record "private"
scalars and "firstprivate" variables.
(check_oacc_kernel_gwv): Add assert.
(ctx_in_oacc_kernels_region): Handle
GF_OMP_TARGET_KIND_OACC_PARALLEL_KERNELS_GRAPHITE.
(scan_omp_for): Likewise.
(check_omp_nesting_restrictions): Likewise.
(lower_oacc_head_mark): Likewise.
(lower_omp_for): Likewise.
(lower_omp_target): Create "private" and "firstprivate" marker
call statements.
(lower_oacc_head_tail): Adjust "private" and "firstprivate"
marker calls.
(lower_oacc_reductions): Emit "private" and "firstprivate"
marker call statements.
(make_oacc_firstprivate_vars_marker): New function.
(make_oacc_private_scalars_marker): New function.
* omp-oacc-kernels-decompose.cc (adjust_region_code_walk_stmt_fn):
Assign GF_OMP_TARGET_KIND_OACC_PARALLEL_KERNELS_GRAPHITE to
region using the new "kernels" handling.
(make_region_seq): Adjust default region type for new
"kernels" handling; no more exceptions, let Graphite handle everything.
(make_region_loop_nest): Likewise; add dump output and assert.
(adjust_nested_loop_clauses): Stop creating "auto" clauses if
loop has "independent", "gang" etc.
(transform_kernels_loop_clauses): Likewise.
* omp-offload.c (oacc_extract_loop_call): New function.
(oacc_loop_get_cfg_loop): New function.
(can_be_parallel_str): New function.
(oacc_loop_can_be_parallel_p): New function.
(oacc_parallel_kernels_graphite_fun_p): New function.
(oacc_parallel_fun_p): New function.
(oacc_loop_transform_auto_into_independent): New function, ...
(oacc_loop_fixed_partitions): ... called from here to transfer
the result of Graphite's analysis to the loop.
(execute_oacc_loop_designation): Handle OpenACC
functions with "parallel_kernels_graphite" attribute.
(execute_oacc_device_lower): Handle
IFN_UNIQUE_OACC_PRIVATE_SCALAR and IFN_UNIQUE_OACC_FIRSTPRIVATE.
* omp-offload.h (oacc_extract_loop_call): Add declaration.
* params.opt: Add "param=openacc-kernels" value "decompose-parloops".
* sese.c (scalar_evolution_in_region): "Redirect" SCEV
analysis to outer loop for IFN_GOACC_LOOP calls.
* sese.h: Add field "kill_scalar_refs".
* tree-chrec.c (chrec_fold_plus_1): Handle VIEW_CONVERT_EXPR
like CASE_CONVERT.
* tree-data-ref.c (dump_data_reference): Include
DR_BASE_ADDRESS and DR_OFFSET in dump output.
(get_references_in_stmt): Don't reject OpenACC internal function
calls.
(graphite_find_data_references_in_stmt): Remove unused variable.
* tree-parloops.c (pass_parallelize_loops::execute): Disable
pass with the new kernels handling, enable if requested explicitly.
* tree-scalar-evolution.c (set_scev_analyze_openacc_calls):
Set flag to enable the analysis of internal OpenACC function
calls (use for Graphite only).
(oacc_call_analyzable_p): New function.
(oacc_ifn_call_extract): New function.
(oacc_simplify): New function.
(add_to_evolution): Simplify OpenACC internal function calls
if applicable.
(follow_ssa_edge_binary): Likewise.
(follow_ssa_edge_expr): Likewise.
(follow_copies_to_constant): Likewise.
(analyze_initial_condition): Likewise.
(interpret_loop_phi): Likewise.
(interpret_gimple_call): New function.
(interpret_rhs_expr): Likewise.
(instantiate_scev_name): Likewise.
(analyze_scalar_evolution_1): Handle GIMPLE_CALL, handle default definitions.
(expression_expensive_p): Consider internal OpenACC calls to
be cheap.
* tree-scalar-evolution.h (set_scev_analyze_openacc_calls):
New declaration.
(oacc_call_analyzable_p): New declaration.
* tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Mark
lhs of internal OpenACC function calls necessary.
* tree-ssa-ifcombine.c (recognize_if_then_else):
* tree-ssa-loop-niter.c (oacc_call_analyzable_p):
(oacc_ifn_call_extract): New declaration.
(interpret_gimple_call): New declaration.
(expand_simple_operations): Handle internal OpenACC function calls.
* tree-ssa-loop.c (gate_oacc_kernels): Disable for new
"kernels" handling.
* graphite-oacc.c: New file.
* graphite-oacc.h: New file.

libgomp/ChangeLog:

* testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Adjust.
* testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90: Adjust.
* testsuite/libgomp.oacc-fortran/kernels-independent.f90: Adjust.
* testsuite/libgomp.oacc-fortran/kernels-loop-1.f90: Adjust.
* testsuite/libgomp.oacc-fortran/pr94358-1.f90: Adjust.

gcc/testsuite/ChangeLog:

* c-c++-common/goacc/classify-kernels.c: Adjust.
* c-c++-common/goacc/note-parallelism-1-kernels-conditional-loop-independent_seq.c: Adjust.
* c-c++-common/goacc/note-parallelism-1-kernels-loops.c: Adjust.
* c-c++-common/goacc/note-parallelism-kernels-loops.c: Adjust.
* c-c++-common/goacc/classify-kernels-unparallelized.c: Removed.
* c-c++-common/goacc/kernels-reduction.c: Removed.
* gfortran.dg/goacc/loop-auto-transfer-2.f90: New test.
* gfortran.dg/goacc/loop-auto-transfer-3.f90: New test.
* gfortran.dg/goacc/loop-auto-transfer-4.f90: New test.

Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>

commit | commitdiff | tree

Frederik Harwath [Tue, 16 Nov 2021 15:15:08 +0000 (16:15 +0100)]

graphite: Add runtime alias checking

Graphite rejects a SCoP if it contains a pair of data references for
which it cannot determine statically if they may alias. This happens
very often, for instance in C code which does not use explicit
"restrict".  This commit adds the possibility to analyze a SCoP
nevertheless and perform an alias check at runtime.  Then, if aliasing
is detected, the execution will fall back to the unoptimized SCoP.

TODO This needs more testing on non-OpenACC code.
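
For illustration only, the idea of a runtime alias check can be sketched in plain C as loop versioning: test at run time whether the two accessed ranges overlap, run the variant that may be reordered if they do not, and fall back to the original order otherwise. This is not the code Graphite generates; the names `ranges_overlap` and `scale_add` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: return 1 if the N-element int ranges starting at
   A and B share any byte.  Integer addresses are compared to avoid
   relational comparison of unrelated pointers.  */
int
ranges_overlap (const int *a, const int *b, size_t n)
{
  uintptr_t lo_a = (uintptr_t) a, hi_a = lo_a + n * sizeof (int);
  uintptr_t lo_b = (uintptr_t) b, hi_b = lo_b + n * sizeof (int);
  return lo_a < hi_b && lo_b < hi_a;
}

/* Compute dst[i] = 2 * src[i] + 1.  Return 1 if the variant that assumes
   no aliasing ran, 0 if the fallback in original order ran.  */
int
scale_add (int *dst, const int *src, size_t n)
{
  assert (dst != NULL && src != NULL);
  if (!ranges_overlap (dst, src, n))
    {
      /* No aliasing: iterations are independent, so any order is valid.
         The reversed loop stands in for an optimized schedule.  */
      for (size_t i = n; i-- > 0;)
        dst[i] = 2 * src[i] + 1;
      return 1;
    }
  /* Possible aliasing: keep the original iteration order.  */
  for (size_t i = 0; i < n; i++)
    dst[i] = 2 * src[i] + 1;
  return 0;
}
```

The check costs a few comparisons per loop entry, which is usually small compared to the loop body it guards.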

gcc/ChangeLog:

        * common.opt: Add fgraphite-runtime-alias-checks.
        * graphite-isl-ast-to-gimple.c
        (generate_alias_cond): New function.
        (graphite_regenerate_ast_isl): Use from here.
        * graphite-poly.c (new_scop): Create unhandled_alias_ddrs vec ...
        (free_scop): ... and release here.
        * graphite-scop-detection.c (dr_defs_outside_region): New function.
        (dr_well_analyzed_for_runtime_alias_check_p): New function.
        (graphite_runtime_alias_check_p): New function.
        (build_alias_set): Record unhandled alias ddrs for later alias check
        creation if flag_graphite_runtime_alias_checks is true instead
        of failing.
        * graphite.h (struct scop): Add field unhandled_alias_ddrs.
        * sese.h (has_operands_from_region_p): New function.

gcc/testsuite/ChangeLog:

        * gcc.dg/graphite/alias-1.c: New test.

commit | commitdiff | tree

Frederik Harwath [Tue, 16 Nov 2021 15:14:41 +0000 (16:14 +0100)]

Move compute_alias_check_pairs to tree-data-ref.c

Move this function from tree-loop-distribution.c to tree-data-ref.c
and make it non-static to enable its use from other parts of GCC.

gcc/ChangeLog:
* tree-loop-distribution.c (data_ref_segment_size): Remove function.
(latch_dominated_by_data_ref): Likewise.
(compute_alias_check_pairs): Likewise.

* tree-data-ref.c (data_ref_segment_size): New function,
copied from tree-loop-distribution.c
(compute_alias_check_pairs): Likewise.
(latch_dominated_by_data_ref): Likewise.

* tree-data-ref.h (compute_alias_check_pairs): New declaration.

commit | commitdiff | tree

Frederik Harwath [Tue, 16 Nov 2021 15:13:51 +0000 (16:13 +0100)]

Fix branch prediction dump message

Instead of, for instance, "Loop got predicted 1 to iterate 10 times"
the message should be "Loop 1 got predicted to iterate 10 times".

gcc/ChangeLog:

* predict.c (pass_profile::execute): Fix dump message.

commit | commitdiff | tree

Frederik Harwath [Tue, 16 Nov 2021 15:13:03 +0000 (16:13 +0100)]

graphite: Fix minor mistakes in comments

gcc/ChangeLog:

* graphite-sese-to-poly.c (build_poly_sr_1): Fix a typo and
a reference to a variable which does not exist.
* graphite-isl-ast-to-gimple.c (gsi_insert_earliest): Fix typo
in comment.

commit | commitdiff | tree

Frederik Harwath [Tue, 16 Nov 2021 15:12:23 +0000 (16:12 +0100)]

graphite: Rename isl_id_for_ssa_name

The SSA names for which this function gets used are always SCoP
parameters and hence "isl_id_for_parameter" is a better name.  It also
explains the prefix "P_" for those names in the ISL representation.

gcc/ChangeLog:

* graphite-sese-to-poly.c (isl_id_for_ssa_name): Rename to ...
  (isl_id_for_parameter): ... this new function name.
  (build_scop_context): Adjust function use.

commit | commitdiff | tree

Frederik Harwath [Tue, 16 Nov 2021 15:11:21 +0000 (16:11 +0100)]

graphite: Extend SCoP detection dump output

Extend dump output to make it easier (for GCC developers) to
understand why Graphite refuses to include a loop in a SCoP.

ChangeLog:

        * graphite-scop-detection.c (scop_detection::can_represent_loop):
        Output reason for failure to dump file.
        (scop_detection::harmful_loop_in_region): Likewise.
        (scop_detection::graphite_can_represent_expr): Likewise.
        (scop_detection::stmt_has_simple_data_refs_p): Likewise.
        (scop_detection::stmt_simple_for_scop_p): Likewise.
        (print_sese_loop_numbers): New function.
        (scop_detection::add_scop): Use from here to print loops in
        rejected SCoP.

commit | commitdiff | tree

Frederik Harwath [Tue, 16 Nov 2021 15:07:34 +0000 (16:07 +0100)]

openacc: Move pass_oacc_device_lower after pass_graphite

The OpenACC device lowering pass must run after the Graphite pass to
allow for the use of Graphite for automatic parallelization of kernels
regions in the future. Experimentation has shown that it is best,
performance-wise, to run pass_oacc_device_lower together with the
related passes pass_oacc_loop_designation and pass_oacc_gimple_workers
early after pass_graphite in pass_tree_loop, at least if the other
tree loop passes are not adjusted. In particular, to enable
vectorization which is crucial for GCN offloading, device lowering
should happen before pass_vectorize. To bring the loops contained in
the offloading functions into the shape expected by the loop
vectorizer, we have to make sure that some passes that previously were
executed only once before pass_tree_loop are also executed on the
offloading functions. To ensure the execution of
pass_oacc_device_lower if pass_tree_loop does not execute (no loops,
no optimizations), we introduce two further copies of the pass to the
pipeline that run if there are no loops or if no optimization is
performed.

gcc/ChangeLog:

* omp-general.c (oacc_get_fn_dim_size): Return 0 on
missing "dims".
* omp-offload.c (pass_oacc_loop_designation::clone): New
member function.
(pass_oacc_gimple_workers::clone): Likewise.
(pass_oacc_gimple_device_lower::clone): Likewise.
* passes.c (pass_data_no_loop_optimizations): New pass_data.
(class pass_no_loop_optimizations): New pass.
(make_pass_no_loop_optimizations): New function.
* passes.def: Move pass_oacc_{loop_designation,
gimple_workers, device_lower} into tree_loop, and add
copies to pass_tree_no_loop and to new
pass_no_loop_optimizations. Add copies of passes pass_ccp,
pass_ipa_warn, pass_complete_unrolli, pass_backprop,
pass_phiprop, pass_fix_loops after the OpenACC passes
in pass_tree_loop.
* tree-ssa-loop-ivcanon.c (pass_complete_unroll::clone):
New member function.
(pass_complete_unrolli::clone): Likewise.
* tree-ssa-loop.c (pass_fix_loops::clone): Likewise.
(pass_tree_loop_init::clone): Likewise.
(pass_tree_loop_done::clone): Likewise.
* tree-ssa-phiprop.c (pass_phiprop::clone): Likewise.

libgomp/ChangeLog:

* testsuite/libgomp.oacc-c-c++-common/pr85486-2.c: Adjust
expected output to pass name changes due to the pass
reordering and cloning.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-2.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-3.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-4.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-5.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-6.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-7.c: Likewise.

gcc/testsuite/ChangeLog:

* gcc.dg/goacc/loop-processing-1.c: Adjust expected output
to pass name changes due to the pass reordering and cloning.
* c-c++-common/goacc/classify-kernels-unparallelized.c: Likewise.
* c-c++-common/goacc/classify-kernels.c: Likewise.
* c-c++-common/goacc/classify-parallel.c: Likewise.
* c-c++-common/goacc/classify-routine.c: Likewise.
* c-c++-common/goacc/routine-nohost-1.c: Likewise.
* c-c++-common/unroll-1.c: Likewise.
* c-c++-common/unroll-4.c: Likewise.
* gcc.dg/goacc/loop-processing-1.c: Likewise.
* gcc.dg/tree-ssa/backprop-1.c: Likewise.
* gcc.dg/tree-ssa/backprop-2.c: Likewise.
* gcc.dg/tree-ssa/backprop-3.c: Likewise.
* gcc.dg/tree-ssa/backprop-4.c: Likewise.
* gcc.dg/tree-ssa/backprop-5.c: Likewise.
* gcc.dg/tree-ssa/backprop-6.c: Likewise.
* gcc.dg/tree-ssa/cunroll-1.c: Likewise.
* gcc.dg/tree-ssa/cunroll-3.c: Likewise.
* gcc.dg/tree-ssa/cunroll-9.c: Likewise.
* gcc.dg/tree-ssa/ldist-17.c: Likewise.
* gcc.dg/tree-ssa/loop-38.c: Likewise.
* gcc.dg/tree-ssa/pr21463.c: Likewise.
* gcc.dg/tree-ssa/pr45427.c: Likewise.
* gcc.dg/tree-ssa/pr61743-1.c: Likewise.
* gcc.dg/unroll-2.c: Likewise.
* gcc.dg/unroll-3.c: Likewise.
* gcc.dg/unroll-4.c: Likewise.
* gcc.dg/unroll-5.c: Likewise.
* gcc.dg/vect/vect-profile-1.c: Likewise.
* c-c++-common/goacc/device-lowering-debug-optimization.c: New test.
* c-c++-common/goacc/device-lowering-no-loops.c: New test.
* c-c++-common/goacc/device-lowering-no-optimization.c: New test.

Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>

commit | commitdiff | tree

Sandra Loosemore [Tue, 16 Nov 2021 15:09:51 +0000 (16:09 +0100)]

Fortran: delinearize multi-dimensional array accesses

The Fortran front end presently linearizes accesses to
multi-dimensional arrays by combining the indices for the various
dimensions into a series of explicit multiplies and adds with
refactoring to allow CSE of invariant parts of the computation.
Unfortunately this representation interferes with Graphite-based loop
optimizations.  It is difficult to recover the original
multi-dimensional form of the access by the time loop optimizations
run because parts of it have already been optimized away or into a
form that is not easily recognizable, so it seems better to have the
Fortran front end produce delinearized accesses to begin with, a set
of nested ARRAY_REFs similar to the existing behavior of the C and C++
front ends.  This is a long-standing problem that has previously been
discussed e.g. in PR 14741 and PR 61000.

This patch is an initial implementation for explicit array accesses
only; it doesn't handle the accesses generated during scalarization of
whole-array or array-section operations, which follow a different code
path.
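
As a rough illustration of the two representations, the following C sketch contrasts an access with explicit index arithmetic against one written as nested array references. It uses C's row-major layout rather than Fortran's column-major one, and the names `get_linearized` and `get_delinearized` as well as the extents are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { ROWS = 3, COLS = 4 };

/* Linearized form: the two indices are folded into one offset by an
   explicit multiply and add, so the 2-D structure is no longer visible.  */
int
get_linearized (const int *base, size_t i, size_t j)
{
  assert (i < ROWS && j < COLS);
  return base[i * COLS + j];
}

/* Delinearized form: one array reference per dimension, which keeps
   each subscript and its extent visible to later analyses.  */
int
get_delinearized (int (*a)[COLS], size_t i, size_t j)
{
  assert (i < ROWS && j < COLS);
  return a[i][j];
}
```

Both functions read the same element; only the shape of the access expression differs, and that shape is what a polyhedral analysis has to recover or is handed directly.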

        gcc/
        * expr.c (get_inner_reference): Handle NOP_EXPR like
        VIEW_CONVERT_EXPR.

        gcc/fortran/
        * lang.opt (-param=delinearize=): New.
        * trans-array.c (get_class_array_vptr): New, split from...
        (build_array_ref): ...here.
        (get_array_lbound, get_array_ubound): New, split from...
        (gfc_conv_array_ref): ...here.  Additional code refactoring
        plus support for delinearization of the array access.

        gcc/testsuite/
        * gfortran.dg/assumed_type_2.f90: Adjust patterns.
        * gfortran.dg/goacc/kernels-loop-inner.f95: Likewise.
        * gfortran.dg/graphite/block-3.f90: Remove xfails.
        * gfortran.dg/graphite/block-4.f90: Likewise.
        * gfortran.dg/inline_matmul_24.f90: Adjust patterns.
        * gfortran.dg/no_arg_check_2.f90: Likewise.
        * gfortran.dg/pr32921.f: Likewise.
        * gfortran.dg/reassoc_4.f: Disable delinearization for this test.

Co-Authored-By: Tobias Burnus <tobias@codesourcery.com>

commit | commitdiff | tree

Richard Sandiford [Sat, 24 Apr 2021 08:35:16 +0000 (09:35 +0100)]

Add dg-final option-based target selectors

This patch adds target selectors of the form:

  { any-opts "opt1" ... "optn" }
  { no-opts "opt1" ... "optn" }

for skipping or xfailing tests based on compiler options.  It only
works for dg-final selectors.

The patch then uses no-opts to exclude -O0 and (sometimes) -Og from
some guality.exp xfails.  AFAICT (based on gcc-testresults) these
tests pass for those options for all targets.

gcc/
* doc/sourcebuild.texi: Document no-opts and any-opts target
selectors.

gcc/testsuite/
* lib/target-supports-dg.exp (selector_expression): Handle any-opts
and no-opts.
* gcc.dg/guality/pr41353-1.c: Exclude -O0 from xfail.
* gcc.dg/guality/pr59776.c: Likewise.
* gcc.dg/guality/pr54970.c: Likewise -O0 and -Og.

commit | commitdiff | tree

Frederik Harwath [Tue, 16 Nov 2021 15:08:40 +0000 (16:08 +0100)]

Fix gimple_debug_cfg declaration

Silence a warning. The argument type did not match the definition.

gcc/ChangeLog:

* tree-cfg.h (gimple_debug_cfg): Change argument type from int
to dump_flags_t.

commit | commitdiff | tree

Tobias Burnus [Wed, 10 Nov 2021 13:48:34 +0000 (14:48 +0100)]

Merge remote-tracking branch 'origin/releases/gcc-11' into devel/omp/gcc-11

Merge up to r11-9233-g3dea90505df136a4b361665772ef8e62306cfcdb (Nov 10, 2021)

commit | commitdiff | tree

Richard Biener [Wed, 10 Nov 2021 10:08:03 +0000 (11:08 +0100)]

testsuite/102690 - XFAIL g++.dg/warn/Warray-bounds-16.C

This XFAILs the bogus diagnostic test and rectifies the expectation
on the optimization.

2021-11-10 Richard Biener <rguenther@suse.de>

PR testsuite/102690
* g++.dg/warn/Warray-bounds-16.C: XFAIL diagnostic part
and optimization.

(cherry picked from commit b2cd32b743ba440e75505ce30c6b5c592ed144ea)

commit | commitdiff | tree

Jakub Jelinek [Wed, 10 Nov 2021 07:51:07 +0000 (08:51 +0100)]

openmp: For default(none) ignore variables created by ubsan_create_data [PR64888]

We weren't ignoring the ubsan variables created by c-ubsan.c before gimplification
(others are added later). One way to fix this would be to introduce further
UBSAN_ internal functions and lower them later (sanopt pass) like other ifns;
this patch instead recognizes those magic vars by name/name of type and DECL_ARTIFICIAL
and TYPE_ARTIFICIAL.

2021-10-21 Jakub Jelinek <jakub@redhat.com>

PR middle-end/64888
gcc/c-family/
* c-omp.c (c_omp_predefined_variable): Return true also for
ubsan_create_data created artificial variables.
gcc/testsuite/
* c-c++-common/ubsan/pr64888.c: New test.

(cherry picked from commit 40dd9d839e52f679d8eabc1c5ca0ca17a5ccfd14)

commit | commitdiff | tree

Thomas Schwinge [Wed, 10 Nov 2021 07:49:34 +0000 (08:49 +0100)]

Restore 'GOMP_OPENACC_DIM' environment variable parsing

... that got broken by recent commit c057ed9c52c6a63a1a692268f916b1a9131cd4b7
"openmp: Fix up strtoul and strtoull uses in libgomp", resulting in spurious
FAILs for tests specifying 'dg-set-target-env-var "GOMP_OPENACC_DIM" "[...]"'.

libgomp/
* env.c (parse_gomp_openacc_dim): Restore parsing.

(cherry picked from commit 00c9ce13a64e324dabd8dfd236882919a3119479)

commit | commitdiff | tree

GCC Administrator [Wed, 10 Nov 2021 00:18:14 +0000 (00:18 +0000)]

Daily bump.

commit | commitdiff | tree

Xionghu Luo [Thu, 4 Nov 2021 01:23:03 +0000 (20:23 -0500)]

rs6000: Fix incorrect fusion constraint [PR102991]

gcc/ChangeLog:

2021-11-05 Xionghu Luo <luoxhu@linux.ibm.com>

PR target/102991
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl: Fix incorrect clobber constraint.

(cherry picked from commit 614b39757b8b61f70ac1c666edb7a01a5fc19cd4)

commit | commitdiff | tree

GCC Administrator [Tue, 9 Nov 2021 00:18:14 +0000 (00:18 +0000)]

Daily bump.

commit | commitdiff | tree

Richard Biener [Mon, 18 Oct 2021 07:10:43 +0000 (09:10 +0200)]

tree-optimization/102798 - avoid copying PTA info to old SSA names

The vectorizer duplicates pointer-info to created pointer bases
but it has to avoid changing points-to info on existing SSA names
because there's now flow-sensitive info in there (pt->pt_null as
set from VRP).

2021-10-18 Richard Biener <rguenther@suse.de>

PR tree-optimization/102798
* tree-vect-data-refs.c (vect_create_addr_base_for_vector_ref):
Only copy points-to info to newly generated SSA names.

* gcc.dg/pr102798.c: New testcase.

commit | commitdiff | tree

Richard Biener [Thu, 30 Sep 2021 13:05:53 +0000 (15:05 +0200)]

middle-end/102518 - avoid invalid GIMPLE during inlining

When inlining we have to avoid mapping a non-lvalue parameter
value into a context that prevents the parameter from being a register.
Formerly the register was TREE_ADDRESSABLE but now it can be
just DECL_NOT_GIMPLE_REG_P.

2021-09-30 Richard Biener <rguenther@suse.de>

PR middle-end/102518
* tree-inline.c (setup_one_parameter): Avoid substituting
an invariant into contexts where a GIMPLE register is not valid.

* gcc.dg/torture/pr102518.c: New testcase.

commit | commitdiff | tree

Richard Biener [Mon, 18 Oct 2021 08:31:19 +0000 (10:31 +0200)]

tree-optimization/102788 - avoid spurious bool pattern fails

Bool pattern recog is required for correctness since vectorized
compares otherwise produce -1 for true so any context where bool
is used as value and not as condition or mask needs to be replaced
with CMP ? 1 : 0.  When we fail to find a vector type for the
result of such use we may not simply elide such transform since
a new bool result can emerge when for example the cast_forwprop
pattern is applied.  So the following avoids failing the
bool pattern recog process and instead does not assign a vector
type for the stmt.
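
The need for the CMP ? 1 : 0 replacement can be illustrated with a scalar C model of a single vector lane. This is only a sketch of the value semantics, not vectorizer code; all function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Model one lane of a vectorized compare: all-ones (-1) for true,
   0 for false.  */
int32_t
lane_compare_lt (int32_t a, int32_t b)
{
  return a < b ? -1 : 0;
}

/* When the bool is used as a value rather than as a condition or mask,
   the lane has to be rewritten as CMP ? 1 : 0.  */
int32_t
mask_to_bool_value (int32_t mask)
{
  assert (mask == 0 || mask == -1);
  return mask ? 1 : 0;
}

/* Count the elements with a[i] < b[i], using the bool as a value.
   Adding the raw mask instead would subtract 1 per match.  */
int32_t
count_less (const int32_t *a, const int32_t *b, int n)
{
  int32_t count = 0;
  for (int i = 0; i < n; i++)
    count += mask_to_bool_value (lane_compare_lt (a[i], b[i]));
  return count;
}
```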

2021-10-18  Richard Biener  <rguenther@suse.de>

PR tree-optimization/102788
* tree-vect-patterns.c (vect_init_pattern_stmt): Allow
a NULL vectype.
(vect_pattern_recog_1): Likewise.
(vect_recog_bool_pattern): Continue matching the pattern
even if we do not have a vector type for a conversion
result.

* g++.dg/vect/pr102788.cc: New testcase.

(cherry picked from commit eb032893675afea4b01cc6ad06a3e0dcfe9b51cd)

commit | commitdiff | tree

Richard Biener [Fri, 15 Oct 2021 06:41:57 +0000 (08:41 +0200)]

ipa/102762 - fix ICE with invalid __builtin_va_arg_pack () use

We have to be careful to not break the argument space calculation.
If there are not enough arguments just do not append any.
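
The kind of guard described here can be sketched as follows. This is a hypothetical helper, not the code in copy_bb; it only shows why an unsigned subtraction needs the check.

```c
#include <assert.h>

/* Hypothetical helper: number of trailing arguments to append when a
   call has NARGS actual arguments and the first NPARMS of them belong
   to named parameters.  */
unsigned
extra_args_to_append (unsigned nargs, unsigned nparms)
{
  /* With unsigned operands, nargs - nparms would wrap around to a huge
     value if nargs < nparms, so append nothing in that case.  */
  if (nargs < nparms)
    return 0;
  unsigned extra = nargs - nparms;
  assert (extra <= nargs);
  return extra;
}
```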

2021-10-15 Richard Biener <rguenther@suse.de>

PR ipa/102762
* tree-inline.c (copy_bb): Avoid underflowing nargs.

* gcc.dg/torture/pr102762.c: New testcase.

(cherry picked from commit 11a4714860d2df6ba496d55379e7dc702d5fc425)

commit | commitdiff | tree

Richard Biener [Tue, 12 Oct 2021 11:42:08 +0000 (13:42 +0200)]

tree-optimization/102572 - fix gathers with invariant mask

This fixes the vector def gathering for invariant masks which
failed to pass in the desired vector type resulting in a non-mask
type being generated.

2021-10-12 Richard Biener <rguenther@suse.de>

PR tree-optimization/102572
* tree-vect-stmts.c (vect_build_gather_load_calls): When
gathering the vectorized defs for the mask pass in the
desired mask vector type so invariants will be handled
correctly.

* g++.dg/vect/pr102572.cc: New testcase.

(cherry picked from commit 9f12a45ef147e563f099c24c293830727e8204cc)

commit | commitdiff | tree

Richard Biener [Tue, 31 Aug 2021 08:28:40 +0000 (10:28 +0200)]

tree-optimization/102139 - fix SLP DR base alignment

When doing whole-function SLP we have to make sure the recorded
base alignments we compute as the maximum alignment seen for a
base anywhere in the function is actually valid at the point
we want to make use of it.

To make this work we now record the stmt the alignment was derived
from in addition to the DR's innermost behavior and we use a
dominance check to verify the recorded info is valid when doing
BB vectorization. For this to work for groups inside a BB that are
separated by a call that might not return we now store the DR
analysis group-id permanently and use that for an additional check
when the DRs are in the same BB.

2021-08-31 Richard Biener <rguenther@suse.de>

PR tree-optimization/102139
* tree-vectorizer.h (vec_base_alignments): Adjust hash-map
type to record a std::pair of the stmt-info and the innermost
loop behavior.
(dr_vec_info::group): New member.
* tree-vect-data-refs.c (vect_record_base_alignment): Adjust.
(vect_compute_data_ref_alignment): Verify the recorded
base alignment can be used.
(data_ref_pair): Remove.
(dr_group_sort_cmp): Adjust.
(vect_analyze_data_ref_accesses): Store the group-ID in the
dr_vec_info and operate on a vector of dr_vec_infos.

* gcc.dg/torture/pr102139.c: New testcase.

(cherry picked from commit 153766ec8351d55cfe8bd6d69bdfc0c2cef71e56)

commit | commitdiff | tree

Richard Biener [Fri, 20 Aug 2021 09:32:00 +0000 (11:32 +0200)]

Refactor BB splitting of DRs for SLP group analysis

This uses the group_id computed to ensure DRs in different BBs do
not get merged into a DR group.  To achieve this we seed the
group from the BB index when group_ids are not computed and we
make sure to bump the group_id when advancing to the next BB for
BB SLP analysis.

This paves the way for relaxing the grouping for BB vectorization
by adjusting its group_id computation.

2021-08-20  Richard Biener  <rguenther@suse.de>

* tree-vect-data-refs.c (dr_group_sort_cmp): Do not compare
BBs.
(vect_analyze_data_ref_accesses): Likewise.  Assign the BB
index as group_id when dataref_groups were not computed.
* tree-vect-slp.c (vect_slp_bbs): Bump current_group when
we advance to the next BB.

(cherry picked from commit 37744f8260857005c8409c9e2e633a05c768a7dd)

commit | commitdiff | tree

Richard Biener [Mon, 11 Oct 2021 14:06:03 +0000 (16:06 +0200)]

middle-end/101480 - overloaded global new/delete

The following fixes the issue of ignoring side-effects on memory
from overloaded global new/delete operators by not marking them
as effectively 'const' apart from other explicitly specified
side-effects.

This will cause

FAIL: g++.dg/warn/Warray-bounds-16.C -std=gnu++1? (test for excess errors)

because we now no longer statically see the initialization loop
never executes because the call to operator new can now clobber 'a.m'.
This seems to be an issue with the warning code and/or ranger so
I'm leaving this FAIL to be addressed as followup.

2021-10-11 Richard Biener <rguenther@suse.de>

PR middle-end/101480
* gimple.c (gimple_call_fnspec): Do not mark operator new/delete
as const.

* g++.dg/torture/pr10148.C: New testcase.

(cherry picked from commit 09a0affdb0598a54835ac4bb0dd6b54122c12916)

commit | commitdiff | tree

Martin Liska [Fri, 5 Nov 2021 15:50:06 +0000 (16:50 +0100)]

gcov-profile: Fix -fcompare-debug with -fprofile-generate [PR100520]

PR gcov-profile/100520

gcc/ChangeLog:

* coverage.c (coverage_compute_profile_id): Strip .gk when
compare debug is used.
* system.h (endswith): New function.
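
A minimal sketch of what a suffix test and the ".gk" stripping could look like is given below. The signature of `endswith` and the helper `strip_gk_suffix` are assumptions made for illustration and are not copied from system.h or coverage.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Return true if STR ends with SUFFIX.  */
bool
endswith (const char *str, const char *suffix)
{
  assert (str != NULL && suffix != NULL);
  size_t str_len = strlen (str);
  size_t suffix_len = strlen (suffix);
  if (str_len < suffix_len)
    return false;
  return memcmp (str + str_len - suffix_len, suffix, suffix_len) == 0;
}

/* Hypothetical helper: truncate NAME in place if it ends with ".gk".
   Return true if the suffix was stripped.  */
bool
strip_gk_suffix (char *name)
{
  if (!endswith (name, ".gk"))
    return false;
  name[strlen (name) - 3] = '\0';
  return true;
}
```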

gcc/testsuite/ChangeLog:

* gcc.dg/pr100520.c: New test.

(cherry picked from commit 7553bd35c876efaf8ab0b6661a6102822b99e6e3)

commit | commitdiff | tree

Martin Liska [Mon, 8 Nov 2021 11:58:28 +0000 (12:58 +0100)]

gcc-changelog: sync from master

contrib/ChangeLog:

* gcc-changelog/git_check_commit.py: Sync from master.
* gcc-changelog/git_commit.py: Likewise.
* gcc-changelog/git_email.py: Likewise.
* gcc-changelog/git_update_version.py: Likewise.
* gcc-changelog/test_email.py: Likewise.
* gcc-changelog/test_patches.txt: Likewise.

commit | commitdiff | tree

Kewen Lin [Tue, 26 Oct 2021 02:05:02 +0000 (21:05 -0500)]

vect: Don't update inits for simd_lane_access DRs [PR102789]

As PR102789 shows, when the vectorizer peels iterations for alignment
in the prologue, vect_update_inits_of_drs updates the inits of some
DRs.  As the failing case shows, we should not update the DR of a
simd_lane_access: its fixed-length storage is sized mainly for the
main loop, so the update can push the access out of bounds and make
it touch an unexpected element.

gcc/ChangeLog:

PR tree-optimization/102789
* tree-vect-loop-manip.c (vect_update_inits_of_drs): Do not
update inits of simd_lane_access.

(cherry picked from commit f3dbd3f36d55178d0a9e4431043cbc950524969a)

Mirror of https://gcc.gnu.org/git/gcc.git
