]> git.ipfire.org Git - thirdparty/gcc.git/log
thirdparty/gcc.git
4 years agoOpenACC: Fix pattern in dg-bogus in Fortran testcases
Tobias Burnus [Mon, 26 Apr 2021 19:57:31 +0000 (21:57 +0200)] 
OpenACC: Fix pattern in dg-bogus in Fortran testcases

libgomp/ChangeLog:

* testsuite/libgomp.oacc-fortran/par-reduction-2-1.f:
Correct spelling in dg-bogus to match -Wopenacc-parallelism.
* testsuite/libgomp.oacc-fortran/par-reduction-2-2.f: Likewise.
* testsuite/libgomp.oacc-fortran/parallel-dims.f90: Likewise.
* testsuite/libgomp.oacc-fortran/parallel-reduction.f90: Likewise.

gcc/testsuite/ChangeLog:

* gfortran.dg/goacc/classify-serial.f95:
Correct spelling in dg-bogus to match -Wopenacc-parallelism.
* gfortran.dg/goacc/kernels-decompose-2.f95: Likewise.
* gfortran.dg/goacc/routine-module-mod-1.f90: Likewise.

(cherry picked from commit 5a26ba75de623f75fb44cddc2a9c982d31c96213)

4 years agoAdd '-Wopenacc-parallelism'
Thomas Schwinge [Fri, 23 Apr 2021 10:23:51 +0000 (12:23 +0200)] 
Add '-Wopenacc-parallelism'

... to diagnose potentially suboptimal choices regarding OpenACC parallelism.

Not enabled by default: too noisy ("*potentially* suboptimal choices"); see
XFAILed 'dg-bogus'es.

gcc/c-family/
* c.opt (Wopenacc-parallelism): New.
gcc/fortran/
* lang.opt (Wopenacc-parallelism): New.
gcc/
* omp-offload.c (oacc_validate_dims): Implement
'-Wopenacc-parallelism'.
* doc/invoke.texi (-Wopenacc-parallelism): Document.
gcc/testsuite/
* c-c++-common/goacc/diag-parallelism-1.c: New.
* c-c++-common/goacc/acc-icf.c: Specify '-Wopenacc-parallelism',
and match diagnostics, as appropriate.
* c-c++-common/goacc/classify-kernels-unparallelized.c: Likewise.
* c-c++-common/goacc/classify-kernels.c: Likewise.
* c-c++-common/goacc/classify-parallel.c: Likewise.
* c-c++-common/goacc/classify-routine.c: Likewise.
* c-c++-common/goacc/classify-serial.c: Likewise.
* c-c++-common/goacc/kernels-decompose-1.c: Likewise.
* c-c++-common/goacc/kernels-decompose-2.c: Likewise.
* c-c++-common/goacc/parallel-dims-1.c: Likewise.
* c-c++-common/goacc/parallel-reduction.c: Likewise.
* c-c++-common/goacc/pr70688.c: Likewise.
* c-c++-common/goacc/routine-1.c: Likewise.
* c-c++-common/goacc/routine-level-of-parallelism-2.c: Likewise.
* c-c++-common/goacc/uninit-dim-clause.c: Likewise.
* gfortran.dg/goacc/classify-kernels-unparallelized.f95: Likewise.
* gfortran.dg/goacc/classify-kernels.f95: Likewise.
* gfortran.dg/goacc/classify-parallel.f95: Likewise.
* gfortran.dg/goacc/classify-routine.f95: Likewise.
* gfortran.dg/goacc/classify-serial.f95: Likewise.
* gfortran.dg/goacc/kernels-decompose-1.f95: Likewise.
* gfortran.dg/goacc/kernels-decompose-2.f95: Likewise.
* gfortran.dg/goacc/parallel-tree.f95: Likewise.
* gfortran.dg/goacc/routine-4.f90: Likewise.
* gfortran.dg/goacc/routine-level-of-parallelism-1.f90: Likewise.
* gfortran.dg/goacc/routine-module-mod-1.f90: Likewise.
* gfortran.dg/goacc/routine-multiple-directives-1.f90: Likewise.
* gfortran.dg/goacc/uninit-dim-clause.f95: Likewise.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/firstprivate-1.c: Specify
'-Wopenacc-parallelism', and match diagnostics, as appropriate.
* testsuite/libgomp.oacc-c-c++-common/loop-auto-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-red-w-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-red-w-2.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-w-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/mode-transitions.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/par-reduction-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/par-reduction-2.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/parallel-reduction.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/pr85381-3.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/private-variables.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/reduction-5.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/reduction-7.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/routine-g-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/routine-w-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/routine-wv-2.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/static-variable-1.c:
Likewise.
* testsuite/libgomp.oacc-fortran/optional-private.f90: Likewise.
* testsuite/libgomp.oacc-fortran/par-reduction-2-1.f: Likewise.
* testsuite/libgomp.oacc-fortran/par-reduction-2-2.f: Likewise.
* testsuite/libgomp.oacc-fortran/parallel-dims.f90: Likewise.
* testsuite/libgomp.oacc-fortran/parallel-reduction.f90: Likewise.
* testsuite/libgomp.oacc-fortran/pr84028.f90: Likewise.
* testsuite/libgomp.oacc-fortran/private-variables.f90: Likewise.
* testsuite/libgomp.oacc-fortran/reduction-1.f90: Likewise.
* testsuite/libgomp.oacc-fortran/reduction-5.f90: Likewise.
* testsuite/libgomp.oacc-fortran/reduction-6.f90: Likewise.
* testsuite/libgomp.oacc-fortran/routine-7.f90: Likewise.

Co-Authored-By: Nathan Sidwell <nathan@codesourcery.com>
Co-Authored-By: Tom de Vries <vries@codesourcery.com>
Co-Authored-By: Julian Brown <julian@codesourcery.com>
Co-Authored-By: Kwok Cheung Yeung <kcy@codesourcery.com>
(cherry picked from commit 22cff118f7526bec195ed6e41452980820fdf3a8)

4 years ago[OpenACC] Don't compile libgomp testcases with '-w'
Thomas Schwinge [Fri, 16 Apr 2021 14:08:19 +0000 (16:08 +0200)] 
[OpenACC] Don't compile libgomp testcases with '-w'

We'd like to actually catch compiler diagnostics (and currently there aren't
any).

libgomp/
* testsuite/libgomp.oacc-c-c++-common/par-reduction-1.c: Don't
compile with '-w'.
* testsuite/libgomp.oacc-c-c++-common/par-reduction-2.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/parallel-reduction.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/reduction-5.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/reduction-6.c: Likewise.
* testsuite/libgomp.oacc-fortran/parallel-reduction.f90: Likewise.
* testsuite/libgomp.oacc-fortran/reduction-1.f90: Likewise.
* testsuite/libgomp.oacc-fortran/reduction-5.f90: Likewise.
* testsuite/libgomp.oacc-fortran/reduction-6.f90: Likewise.
* testsuite/libgomp.oacc-fortran/reduction-7.f90: Likewise.

(cherry picked from commit 7c640779bf042ceb05fe50260307af88fed75407)

4 years agoFix OpenACC tests introduced in GCC 11
Kwok Cheung Yeung [Thu, 29 Apr 2021 22:19:22 +0000 (15:19 -0700)] 
Fix OpenACC tests introduced in GCC 11

OG11 has additional warnings compared to mainline, which are not expected by
tests newly introduced in GCC 11.  Also, some of the dump messages are now
emitted in the oaccloops pass rather than oaccdevlow.

2021-04-29  Kwok Cheung Yeung  <kcy@codesourcery.com>

gcc/testsuite/
* c-c++-common/goacc/classify-serial.c: Surpress warnings.  Dump
oaccloops pass instead of oaccdevlow.
* gfortran.dg/goacc/classify-serial.f95: Likewise.

libgomp/
* testsuite/libgomp.oacc-c-c++-common/pr85381-3.c: Surpress warnings.
* testsuite/libgomp.oacc-c-c++-common/static-variable-1.c: Likewise.

4 years agoUpdate expected messages in data-clause-1 tests
Kwok Cheung Yeung [Thu, 29 Apr 2021 22:06:38 +0000 (15:06 -0700)] 
Update expected messages in data-clause-1 tests

The patch 'Merge non-contiguous array support patches' handles one of the
non-contiguous cases such that it is no longer an error.

2021-04-29  Kwok Cheung Yeung  <kcy@codesourcery.com>

gcc/testsuite/
* c-c++-common/goacc/data-clause-1.c (foo): Remove expected message.
* g++.dg/goacc/data-clause-1.C (foo): Remove expected message.

4 years agoUpdate expected message in kernels-decompose-ice-1.c test
Kwok Cheung Yeung [Thu, 29 Apr 2021 21:51:45 +0000 (14:51 -0700)] 
Update expected message in kernels-decompose-ice-1.c test

The expected message differs between mainline and OG11.  The contents of the
message are not important in this test anyway, so change it to match.

2021-04-29  Kwok Cheung Yeung  <kcy@codesourcery.com>

gcc/testsuite/
* c-c++-common/goacc/kernels-decompose-ice-1.c (main): Update expected
message.

4 years agoUpdate expected messages in kernels-decompose-2 tests
Kwok Cheung Yeung [Thu, 29 Apr 2021 21:38:15 +0000 (14:38 -0700)] 
Update expected messages in kernels-decompose-2 tests

This changes expected messages that differ between mainline and OG11.  On
OG10, these messages were added in the patch:

081a01963ca8 Update expected messages, errors and warnings for "kernels" tests

2021-04-29  Kwok Cheung Yeung  <kcy@codesourcery.com>

gcc/testsuite/
* c-c++-common/goacc/kernels-decompose-2.c (main): Update expected
messages.
* gfortran.dg/goacc/kernels-decompose-2.f95 (main): Update expected
messages.

4 years agolibgomp.fortran/depobj-1.f90: Fix omp_depend_kind
Tobias Burnus [Wed, 21 Apr 2021 20:47:18 +0000 (22:47 +0200)] 
libgomp.fortran/depobj-1.f90: Fix omp_depend_kind

libgomp/
* testsuite/libgomp.fortran/depobj-1.f90: Use omp_lib's
omp_depend_kind instead of defining it as 16.

(cherry picked from commit 0c0bdcc60cfa0e87b3b9abc1460a7073b199d38a)

4 years agoFortran/OpenMP: Add 'omp depobj' and 'depend(mutexinoutset:'
Tobias Burnus [Wed, 21 Apr 2021 08:58:29 +0000 (10:58 +0200)] 
Fortran/OpenMP: Add 'omp depobj' and 'depend(mutexinoutset:'

gcc/fortran/ChangeLog:

* dump-parse-tree.c (show_omp_namelist): Handle depobj + mutexinoutset
in the depend clause.
(show_omp_clauses, show_omp_node, show_code_node): Handle depobj.
* gfortran.h (enum gfc_statement): Add ST_OMP_DEPOBJ.
(enum gfc_omp_depend_op): Add OMP_DEPEND_UNSET,
OMP_DEPEND_MUTEXINOUTSET and OMP_DEPEND_DEPOBJ.
(gfc_omp_clauses): Add destroy, depobj_update and depobj.
(enum gfc_exec_op): Add EXEC_OMP_DEPOBJ
* match.h (gfc_match_omp_depobj): Match 'omp depobj'.
* openmp.c (gfc_match_omp_clauses): Add depobj + mutexinoutset
to depend clause.
(gfc_match_omp_depobj, resolve_omp_clauses, gfc_resolve_omp_directive):
Handle 'omp depobj'.
* parse.c (decode_omp_directive, next_statement, gfc_ascii_statement):
Likewise.
* resolve.c (gfc_resolve_code): Likewise.
* st.c (gfc_free_statement): Likewise.
* trans-openmp.c (gfc_trans_omp_clauses): Handle depobj + mutexinoutset
in the depend clause.
(gfc_trans_omp_depobj, gfc_trans_omp_directive): Handle EXEC_OMP_DEPOBJ.
* trans.c (trans_code): Likewise.

libgomp/ChangeLog:

* testsuite/libgomp.fortran/depobj-1.f90: New test.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/depobj-1.f90: New test.
* gfortran.dg/gomp/depobj-2.f90: New test.

(cherry picked from commit a61c4964cd71446232d62ec9b10a7d45b440dd9f)

4 years agoFix is_oacc_parallel_or_serial for kernel regions
Kwok Cheung Yeung [Wed, 7 Apr 2021 19:49:31 +0000 (12:49 -0700)] 
Fix is_oacc_parallel_or_serial for kernel regions

2021-04-07  Kwok Cheung Yeung  <kcy@codesourcery.com>

gcc/
* omp-low.c (is_oacc_parallel_or_serial): Handle
GF_OMP_TARGET_KIND_OACC_PARALLEL_KERNELS_PARALLELIZED and
GF_OMP_TARGET_KIND_OACC_PARALLEL_KERNELS_GANG_SINGLE.

4 years agoUpdate expected messages in OpenACC tests
Kwok Cheung Yeung [Wed, 7 Apr 2021 18:48:50 +0000 (11:48 -0700)] 
Update expected messages in OpenACC tests

This updates the types of messages expected in the test, and the '-fopt-info'
option used to request them.  The phrasing of the expected messages has also
changed somewhat and has been adjusted to match.

2021-04-07  Kwok Cheung Yeung  <kcy@codesourcery.com>

gcc/testsuite/
* c-c++-common/goacc/classify-kernels-unparallelized.c: Update
additional options and expected messages.
* c-c++-common/goacc/classify-kernels.c: Likewise.
* c-c++-common/goacc/note-parallelism-1-kernels-conditional-loop-independent_seq.c:
Likewise.
* c-c++-common/goacc/note-parallelism-1-kernels-loop-auto.c: Likewise.
* c-c++-common/goacc/note-parallelism-1-kernels-loop-independent_seq.c:
Likewise.
* c-c++-common/goacc/note-parallelism-1-kernels-loops.c: Likewise.
* c-c++-common/goacc/note-parallelism-1-kernels-straight-line.c:
Likewise.
* c-c++-common/goacc/note-parallelism-combined-kernels-loop-auto.c:
Likewise.
* c-c++-common/goacc/note-parallelism-combined-kernels-loop-independent_seq.c:
Likewise.
* c-c++-common/goacc/note-parallelism-kernels-conditional-loop-independent_seq.c:
Likewise.
* c-c++-common/goacc/note-parallelism-kernels-loop-auto.c: Likewise.
* c-c++-common/goacc/note-parallelism-kernels-loop-independent_seq.c:
Likewise.
* c-c++-common/goacc/note-parallelism-kernels-loops.c: Likewise.
* gfortran.dg/goacc/classify-kernels-unparallelized.f95: Likewise.
* gfortran.dg/goacc/classify-kernels.f95: Likewise.

libgomp/
* testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c: Update
additional options and expected messages.

4 years ago[WIP] OpenMP 5.0: requires directive: workaround to fix libgomp IntelMIC plugin build
Thomas Schwinge [Wed, 3 Mar 2021 21:37:58 +0000 (22:37 +0100)] 
[WIP] OpenMP 5.0: requires directive: workaround to fix libgomp IntelMIC plugin build

Fix-up for og10 commit c2e4a17adc0989f216c7fc3f93f150c66adba23a "OpenMP 5.0:
requires directive".

The GCC offloading target configurations don't build/use
'crtoffloadbegin.o'/'crtoffloadtable.o'/'crtoffloadend.o'
('libgcc/offloadstuff.c'), but the libgomp IntelMIC plugin still does link
against libgomp, and the latter unconditionally refers to
'__requires_mask_table', '__requires_mask_table_end':

    make[3]: Entering directory '[...]/build-gcc-offload-x86_64-intelmicemul-linux-gnu/x86_64-intelmicemul-linux-gnu/liboffloadmic/plugin'
    [...]/build-gcc-offload-x86_64-intelmicemul-linux-gnu/./gcc/xg++ [...] -loffloadmic_target -lcoi_device -lgomp -rdynamic ../ofldbegin.o offload_target_main.o ../ofldend.o -o offload_target_main
    ./../../libgomp/.libs/libgomp.so: undefined reference to `__requires_mask_table_end'
    ./../../libgomp/.libs/libgomp.so: undefined reference to `__requires_mask_table'
    collect2: error: ld returned 1 exit status
    Makefile:806: recipe for target 'offload_target_main' failed
    make[3]: *** [offload_target_main] Error 1

I have not researched what a proper fix would look like.

libgomp/
* target.c (__requires_mask_table, __requires_mask_table_end): Add
'__attribute__((weak))'.

4 years agoLambda capturing of pointers and references in target directives
Chung-Lin Tang [Thu, 18 Mar 2021 18:29:20 +0000 (02:29 +0800)] 
Lambda capturing of pointers and references in target directives

This patch implements proper lambda capturing of pointer and reference variables
as specified in OpenMP 5.0. We map the entire closure object as a to-map,
attach pointers to zero-length array sections, and perform mapping of
references.

2021-03-18  Chung-Lin Tang  <cltang@codesourcery.com>

gcc/cp/ChangeLog:

* cp-tree.h (set_omp_target_this_expr): Delete.
(finish_omp_target_clauses): New prototype.
* lambda.c (lambda_expr_this_capture): Remove call to
set_omp_target_this_expr.
* parser.c (cp_parser_omp_target): Likewise.
* pt.c (tsubst_expr): Add call to finish_omp_target_clauses for target
directives.
* semantics.c (omp_target_this_expr): Delete.
(omp_target_ptr_members_accessed): Delete.
(finish_non_static_data_member): Remove call to
set_omp_target_this_expr. Remove use of omp_target_ptr_members_accessed.
(finish_this_expr): Remove call to set_omp_target_this_expr.
(struct omp_target_walk_data): New struct for walking over
target-directive tree body.
(finish_omp_target_clauses_r): New function for tree walk.
(finish_omp_target_clauses): New function, with code factored out from
finish_omp_target. Add lambda object handling case.
(finish_omp_target): Factor code out and adjust to use
finish_omp_target_clauses.
(finish_omp_clauses): Revert prior "Adjustments to allow '*ptr' and
'ptr->member' cases in map clausess.", since not needed with new
organization of target-directive clause processing.

gcc/testsuite/ChangeLog:

* g++.dg/gomp/target-lambda-1.C: New test.

libgomp/testsuite/ChangeLog:

* libgomp.c++/target-lambda-1.C: New test.

4 years agoFix template case of non-static member access inside member functions
Chung-Lin Tang [Thu, 11 Mar 2021 08:31:08 +0000 (00:31 -0800)] 
Fix template case of non-static member access inside member functions

Prior patches for C++ non-static member access had problems under template
classes, due to re-calling of finish_omp_clauses after finish_omp_target
created the implicit maps required, but not of allowed form in finish_omp_clauses.

This patch solves this by slightly relaxing the allowed expressions in
finish_omp_clauses.

2021-03-11  Chung-Lin Tang  <cltang@codesourcery.com>

gcc/cp/ChangeLog:

* semantics.c (finish_omp_clauses): Adjustments to allow '*ptr' and
'ptr->member' cases in map clausess.
(finish_omp_target): Use INDIRECT_REF instead of MEM_REF in created
clauses, add processing_template_decl handling.

gcc/ChangeLog:

* gimplify.c (gimplify_scan_omp_clauses): Under !DECL_P case of
GOMP_CLAUSE_MAP handling, add STRIP_NOPS for indir_p case, add to
struct_deref_set for map(*ptr_to_struct) cases.

gcc/testsuite/ChangeLog:

* g++.dg/gomp/target-this-3.C: Adjust scan test.
* g++.dg/gomp/target-this-4.C: Likewise.
* g++.dg/gomp/target-this-5.C: New test.

libgomp/ChangeLog:

* testsuite/libgomp.c++/target-this-5.C: New test.

4 years agoArrow operator handling for C front-end in OpenMP map clauses
Chung-Lin Tang [Mon, 8 Mar 2021 07:56:52 +0000 (15:56 +0800)] 
Arrow operator handling for C front-end in OpenMP map clauses

This patch merges some of the equivalent changes already done for the C++
front-end to the C parts.

2021-03-08  Chung-Lin Tang  <cltang@codesourcery.com>

gcc/c/ChangeLog:

* c-parser.c (c_parser_omp_clause_map): Set 'allow_deref' argument in
call to c_parser_omp_variable_list to 'true'.
* c-typeck.c (handle_omp_array_sections_1): Add strip of MEM_REF in
array base handling.
(c_finish_omp_clauses): Handle 'A->member' case in map clauses.

gcc/ChangeLog:

* gimplify.c (gimplify_scan_omp_clauses): Add MEM_REF case when
handling component_ref_p case. Add unshare_expr and gimplification
when created GOMP_MAP_STRUCT is not a DECL. Add code to add
firstprivate pointer for *pointer-to-struct case.

gcc/testsuite/ChangeLog:

* gcc.dg/gomp/target-3.c: New test.

4 years agoDWARF: late code range fixup
Andrew Stubbs [Thu, 4 Mar 2021 23:12:17 +0000 (23:12 +0000)] 
DWARF: late code range fixup

Ensure that the parent DWARF subprograms of offload kernel functions have a
code range, and are therefore not discarded by GDB.  This is only necessary
when the parent function does not actually exist in the final binary, which is
commonly the case within the offload device's binary.

This patch replaces 808bdf1bb29 and fdcb23540a2.  It should be squashed with
those before being posted upstream.

gcc/

* gcc/dwarf2out.c (notional_parents_list): New file variable.
(gen_subprogram_die): Record offload kernel functions in
notional_parents_list.
(fixup_notional_parents): New function.
(dwarf2out_finish): Call fixup_notional_parents.
(dwarf2out_c_finalize): Reset notional_parents_list.

4 years agoopenmp: Scale type precision of collapsed iterator variable
Kwok Cheung Yeung [Mon, 1 Mar 2021 22:15:30 +0000 (14:15 -0800)] 
openmp: Scale type precision of collapsed iterator variable

This sets the type precision of the collapsed iterator variable to the
sum of the precision of the collapsed loop variables, up to a maximum of
sizeof(long long) (i.e. 64-bits).

2021-03-01  Kwok Cheung Yeung  <kcy@codesourcery.com>

gcc/
* omp-expand.c (expand_oacc_for): Convert .tile variable to
diff_type before multiplying.
* omp-general.c (omp_extract_for_data): Use accumulated precision
of all collapsed for-loops as precision of iteration variable, up
to the precision of a long long.

libgomp/
* testsuite/libgomp.c-c++-common/collapse-4.c: New.
* testsuite/libgomp.fortran/collapse5.f90: New.

4 years agoAllow static constexpr fields in mappable types for C++
Chung-Lin Tang [Wed, 3 Mar 2021 14:39:10 +0000 (22:39 +0800)] 
Allow static constexpr fields in mappable types for C++

This patch is a merge of:
https://gcc.gnu.org/legacy-ml/gcc-patches/2020-01/msg01246.html

Static members in general disqualify a C++ class from being target mappable,
but static constexprs are inline optimized away, so should not interfere.

OpenMP 5.0 in general lifts the static member limitation, so this
patch will probably further adjusted later.

2021-03-03  Chung-Lin Tang  <cltang@codesourcery.com>

gcc/cp/ChangeLog:

* decl2.c (cp_omp_mappable_type_1): Allow fields with
DECL_DECLARED_CONSTEXPR_P to be mapped.

gcc/testsuite/ChangeLog:

* g++.dg/goacc/static-constexpr-1.C: New test.
* g++.dg/gomp/static-constexpr-1.C: New test.

4 years agonvptx: remove erroneous stack deletion
Andrew Stubbs [Tue, 23 Feb 2021 21:35:08 +0000 (21:35 +0000)] 
nvptx: remove erroneous stack deletion

The stacks are not supposed to be deleted every time memory is allocated, only
when there is insufficient memory.  The unconditional call here seems to be in
error, and is causing a costly reallocation of the stacks before every launch.

libgomp/

* plugin/plugin-nvptx.c (GOMP_OFFLOAD_alloc): Remove early call to
nvptx_stacks_free.

4 years agoDWARF: fix ICE caused by offload debug fix
Andrew Stubbs [Fri, 26 Feb 2021 11:40:06 +0000 (11:40 +0000)] 
DWARF: fix ICE caused by offload debug fix

This should be squashed with 808bdf1bb29 and fdcb23540a2 to go to mainline.

gcc/

* dwarf2out.c (gen_subprogram_die): Replace existing low/high PC
attributes, rather than ICE.

4 years agoRecommit "Enable gimplify GOMP_MAP_STRUCT handling of (COMPONENT_REF (INDIRECT_REF...
Chung-Lin Tang [Fri, 26 Feb 2021 12:14:49 +0000 (20:14 +0800)] 
Recommit "Enable gimplify GOMP_MAP_STRUCT handling of (COMPONENT_REF (INDIRECT_REF ...)) map clauses".

This patch was reverted in commit ea43db9372133e84f95bdb2c6934a5cc3db3d530
due to a regression. Now re-applying to devel/omp/gcc-10 after regression fixed.

This patch tries to allow map(A->ptr) to be properly handled the same way as
map(B.ptr) expressions. map(struct:*A) clauses are now produced during
gimplify.

This patch, as of time of commit, is only pushed to devel/omp/gcc-10, not yet
submitted as mainline patch to upstream.

2021-02-26  Chung-Lin Tang  <cltang@codesourcery.com>

gcc/ChangeLog:

* gimplify.c ("tree-hash-traits.h"): Add include.
(gimplify_scan_omp_clauses): Change struct_map_to_clause to type
hash_map<tree_operand, tree> *. Adjust struct map handling to handle
cases of *A and A->B expressions.
(gimplify_adjust_omp_clauses): Move GOMP_MAP_STRUCT removal code for
exit data directives code to earlier position.

gcc/testsuite/ChangeLog:

* g++.dg/gomp/target-3.C: Adjust testcase gimple scanning.
* g++.dg/gomp/target-this-2.C: Likewise.
* g++.dg/gomp/target-this-3.C: Likewise.
* g++.dg/gomp/target-this-4.C: Likewise.

libgomp/ChangeLog:

* testsuite/libgomp.c++/target-23.C: New testcase.

4 years agoFix regression of array members in OpenMP map clauses.
Chung-Lin Tang [Fri, 26 Feb 2021 12:13:29 +0000 (20:13 +0800)] 
Fix regression of array members in OpenMP map clauses.

Fixed a regression of array members not working in OpenMP map clauses after
commit bf8605f14ec33ea31233a3567f3184fee667b695.

This patch itself probably should be considered a fix for commit aadfc9843.

2021-02-26  Chung-Lin Tang  <cltang@codesourcery.com>

gcc/cp/ChangeLog:

* semantics.c (handle_omp_array_sections): Adjust position of making
COMPONENT_REF from FIELD_DECL to earlier position.

4 years agoOpenMP 5.0: requires directive
Chung-Lin Tang [Tue, 2 Feb 2021 12:34:01 +0000 (20:34 +0800)] 
OpenMP 5.0: requires directive

This is a merge of:
https://gcc.gnu.org/pipermail/gcc-patches/2021-January/563393.html

This patch completes more of the reverse_offload, unified_address, and
unified_shared_memory clauses for the OpenMP 5.0 requires directive,
including runtime verification of the offload target.
(currently no offload devices actually support above features, only
warning messages are emitted)

This may possibly reverted/updated when a final patch is approved
for mainline.

2021-02-02  Chung-Lin Tang  <cltang@codesourcery.com>

gcc/c/ChangeLog:

* c-parser.c (c_parser_declaration_or_fndef): Set
OMP_REQUIRES_TARGET_USED in omp_requires_mask if function has
"omp declare target" attribute.
(c_parser_omp_target_data): Set    OMP_REQUIRES_TARGET_USED in
omp_requires_mask.
(c_parser_omp_target_enter_data): Likewise.
(c_parser_omp_target_exit_data): Likewise.
(c_parser_omp_requires): Adjust to only mention "not implemented yet"
for OMP_REQUIRES_DYNAMIC_ALLOCATORS.

gcc/cp/ChangeLog:

* parser.c (cp_parser_simple_declaration): Set
OMP_REQUIRES_TARGET_USED in omp_requires_mask if function has
"omp declare target" attribute.
(cp_parser_omp_target_data): Set OMP_REQUIRES_TARGET_USED in
omp_requires_mask.
(cp_parser_omp_target_enter_data): Likewise.
(cp_parser_omp_target_exit_data): Likewise.
(cp_parser_omp_requires): Adjust to only mention "not implemented yet"
for OMP_REQUIRES_DYNAMIC_ALLOCATORS.

gcc/fortran/ChangeLog:

* openmp.c (gfc_check_omp_requires): Fix REVERSE_OFFLOAD typo.
(gfc_match_omp_requires): Adjust to only mention "not implemented yet"
for OMP_REQUIRES_DYNAMIC_ALLOCATORS.
* parse.c ("tree.h"): Add include.
("omp-general.h"): Likewise.
(gfc_parse_file): Add code to merge omp_requires to omp_requires_mask.

gcc/ChangeLog:

* omp-offload.c (omp_finish_file): Add code to reate OpenMP requires
mask variable in .gnu.gomp_requires section if needed.

gcc/testsuite/ChangeLog:

* c-c++-common/gomp/requires-4.c: Remove prune of "not supported yet".
* gcc/testsuite/gfortran.dg/gomp/requires-4.f90: Fix REVERSE_OFFLOAD typo.
* gcc/testsuite/gfortran.dg/gomp/requires-8.f90: Likewise.

include/ChangeLog:

* gomp-constants.h (GOMP_REQUIRES_UNIFIED_ADDRESS): New symbol.
(GOMP_REQUIRES_UNIFIED_SHARED_MEMORY): Likewise.
(GOMP_REQUIRES_REVERSE_OFFLOAD): Likewise.

libgcc/ChangeLog:

* offloadstuff.c (__requires_mask_table): New symbol to mark start of
.gnu.gomp_requires section.
(__requires_mask_table_end): New symbol to mark end of
.gnu.gomp_requires section.

libgomp/ChangeLog:

* libgomp-plugin.h (GOMP_OFFLOAD_supported_features): New declaration.
* libgomp.h (struct gomp_device_descr): New 'supported_features_func'
plugin hook field.
* oacc-host.c (host_supported_features): New host hook function.
(host_dispatch): Initialize 'supported_features_func' host hook.
* plugin/plugin-gcn.c (GOMP_OFFLOAD_supported_features): New function.
* plugin/plugin-nvptx.c (GOMP_OFFLOAD_supported_features): Likewise.
* target.c (<stdio.h>): Add include of standard header.
(gomp_requires_mask): New static variable.
(__requires_mask_table): New declaration.
(__requires_mask_table_end): Likewise.
(gomp_load_plugin_for_device): Add loading of 'supported_features' hook.
(gomp_target_init): Add code to summarize .gnu._gomp_requires section
mask values, emit error if inconsistency found.

* testsuite/libgomp.c-c++-common/requires-1.c: New test.
* testsuite/libgomp.c-c++-common/requires-1-aux.c: New file linked with
above test.
* testsuite/libgomp.c-c++-common/requires-2.c: New test.
* testsuite/libgomp.c-c++-common/requires-2-aux.c: New file linked with
above test.

liboffloadmic/ChangeLog:

* plugin/libgomp-plugin-intelmic.cpp (GOMP_OFFLOAD_supported_features):
New function.

4 years agoOpenMP 5.0: map this[:1] in C++ non-static member functions (PR 92120)
Chung-Lin Tang [Tue, 2 Feb 2021 12:31:37 +0000 (20:31 +0800)] 
OpenMP 5.0: map this[:1] in C++ non-static member functions (PR 92120)

This is a merge of:
https://gcc.gnu.org/pipermail/gcc-patches/2020-November/558975.html

This patch creates automatic mapping of map(this[:1]) and pointer members
as zero-length array sections, as specified by the OpenMP 5.0 specification.

This may possibly reverted/updated when a final patch is approved
for mainline.

2021-02-02  Chung-Lin Tang  <cltang@codesourcery.com>

PR middle-end/92120

gcc/cp/ChangeLog:

* cp-tree.h (finish_omp_target): New declaration.
(set_omp_target_this_expr): Likewise.
* lambda.c (lambda_expr_this_capture): Add call to
set_omp_target_this_expr.
* parser.c (cp_parser_omp_target): Factor out code, change to call
finish_omp_target, add re-initing call to set_omp_target_this_expr.
* semantics.c (omp_target_this_expr): New static variable.
(omp_target_ptr_members_accessed): New static hash_map for tracking
accessed non-static pointer-type members.
(finish_non_static_data_member): Add call to set_omp_target_this_expr.
Add recording of non-static pointer-type members access.
(finish_this_expr): Add call to set_omp_target_this_expr.
(set_omp_target_this_expr): New function to set omp_target_this_expr.
(finish_omp_target): New function with code merged from
cp_parser_omp_target, plus code to implement this[:1] and __closure map
clauses for OpenMP.

(handle_omp_array_sections_1): Move code to peel of '*' for
reference-based COMPONENT_REFs before FIELD_DECL transforming.
(finish_omp_clauses): Handle 'A->member' case in map clauses.

gcc/ChangeLog:

* omp-low.c (lower_omp_target):
Handle GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION, and
GOMP_MAP_POINTER_TO_ZERO_LENGTH_ARRAY_SECTION map kinds.
* tree-pretty-print.c (dump_omp_clause): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/gomp/target-this-1.C: New testcase.
* g++.dg/gomp/target-this-2.C: New testcase.
* g++.dg/gomp/target-this-3.C: New testcase.
* g++.dg/gomp/target-this-4.C: New testcase.

include/ChangeLog:

* gomp-constants.h (enum gomp_map_kind):
Add GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION, and
GOMP_MAP_POINTER_TO_ZERO_LENGTH_ARRAY_SECTION map kinds.
(GOMP_MAP_POINTER_P):
Include GOMP_MAP_POINTER_TO_ZERO_LENGTH_ARRAY_SECTION.

libgomp/ChangeLog:

* libgomp.h (gomp_attach_pointer): Add bool parameter.
* oacc-mem.c (acc_attach_async): Update call to gomp_attach_pointer.
(goacc_enter_data_internal): Likewise.
* target.c (gomp_map_vars_existing): Update assert condition to
include GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION.
(gomp_map_pointer): Add 'bool allow_zero_length_array_sections'
parameter, add support for mapping a pointer with NULL target.
(gomp_attach_pointer): Add 'bool allow_zero_length_array_sections'
parameter, add support for attaching a pointer with NULL target.
(gomp_map_vars_internal): Update calls to gomp_map_pointer and
gomp_attach_pointer, add handling for
GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION, and
GOMP_MAP_POINTER_TO_ZERO_LENGTH_ARRAY_SECTION cases.
* testsuite/libgomp.c++/target-this-1.C: New testcase.
* testsuite/libgomp.c++/target-this-2.C: New testcase.
* testsuite/libgomp.c++/target-this-3.C: New testcase.
* testsuite/libgomp.c++/target-this-4.C: New testcase.

4 years agoOpenMP 5.0: Allow multiple clauses mapping same variable
Chung-Lin Tang [Mon, 1 Feb 2021 11:16:47 +0000 (03:16 -0800)] 
OpenMP 5.0: Allow multiple clauses mapping same variable

This is a merge of:
https://gcc.gnu.org/pipermail/gcc-patches/2020-December/562081.html

This patch now allows multiple clauses on the same construct to map
the same variable, which was not valid in OpenMP 4.5, but now allowed
in 5.0.

This may possibly reverted/updated when a final patch is approved
for mainline.

2021-02-01  Chung-Lin Tang  <cltang@codesourcery.com>

gcc/c/ChangeLog:

* c-typeck.c (c_finish_omp_clauses): Adjust to allow duplicate
mapped variables for OpenMP.

gcc/cp/ChangeLog:

* semantics.c (finish_omp_clauses):  Adjust to allow duplicate
mapped variables for OpenMP.

gcc/ChangeLog:

* omp-low.c (install_parm_decl): Add new 'tree key_expr' parameter.
Use key_expr as splay-tree key instead of var itself.
(install_var_field): Add new 'tree key_expr = NULL_TREE'
default parameter. Set splay-tree lookup key to key_expr instead of
var if key_expr is non-NULL. Adjust call to install_parm_decl.
Update comments.
(scan_sharing_clauses): Use clause tree expression as splay-tree key
for map/to/from and OpenACC firstprivate cases when installing the
variable field into the send/receive record type.
(maybe_lookup_field_in_outer_ctx): Add code to search through
construct clauses instead of entirely based on splay-tree lookup.
(lower_oacc_reductions): Adjust to find map-clause of reduction
variable, then create receiver-ref.
(lower_omp_target): Adjust to lookup var field using clause expression.

gcc/testsuite/ChangeLog:

* c-c++-common/gomp/clauses-2.c: Adjust testcase.

4 years agoOpenMP 5.0 Structure element mapping
Chung-Lin Tang [Wed, 27 Jan 2021 13:35:43 +0000 (21:35 +0800)] 
OpenMP 5.0 Structure element mapping

This is a merge of:
https://gcc.gnu.org/pipermail/gcc-patches/2020-December/561139.html

This patch implements the changes to the behavior of target mapping structure
elements, as specified in OpenMP 5.0.

This may possibly reverted/updated when a final patch is approved
for mainline.

libgomp/ChangeLog:

* hashtab.h (htab_clear): New function with initialization code
factored out from...
(htab_create): ...here, adjust to use htab_clear function.

* libgomp.h (REFCOUNT_SPECIAL): New symbol to denote range of
special refcount values, add comments.
(REFCOUNT_INFINITY): Adjust definition to use REFCOUNT_SPECIAL.
(REFCOUNT_LINK): Likewise.
(REFCOUNT_STRUCTELEM): New special refcount range for structure
element siblings.
(REFCOUNT_STRUCTELEM_P): Macro for testing for structure element
sibling maps.
(REFCOUNT_STRUCTELEM_FLAG_FIRST): Flag to indicate first sibling.
(REFCOUNT_STRUCTELEM_FLAG_LAST):  Flag to indicate last sibling.
(REFCOUNT_STRUCTELEM_FIRST_P): Macro to test _FIRST flag.
(REFCOUNT_STRUCTELEM_LAST_P): Macro to test _LAST flag.
(struct splay_tree_key_s): Add structelem_refcount and
structelem_refcount_ptr fields into a union with dynamic_refcount.
Add comments.
(gomp_map_vars): Delete declaration.
(gomp_map_vars_async): Likewise.
(gomp_unmap_vars): Likewise.
(gomp_unmap_vars_async): Likewise.
(goacc_map_vars): New declaration.
(goacc_unmap_vars): Likewise.

* oacc-mem.c (acc_map_data): Adjust to use goacc_map_vars.
(goacc_enter_datum): Likewise.
(goacc_enter_data_internal): Likewise.
* oacc-parallel.c (GOACC_parallel_keyed): Adjust to use goacc_map_vars
and goacc_unmap_vars.
(GOACC_data_start): Adjust to use goacc_map_vars.
(GOACC_data_end): Adjust to use goacc_unmap_vars.

* target.c (hash_entry_type): New typedef.
(htab_alloc): New function hook for hashtab.h.
(htab_free): Likewise.
(htab_hash): Likewise.
(htab_eq): Likewise.
(hashtab.h): Add file include.
(gomp_increment_refcount): New function.
(gomp_decrement_refcount): Likewise.
(gomp_map_vars_existing): Add refcount_set parameter, adjust to use
gomp_increment_refcount.
(gomp_map_fields_existing): Add refcount_set parameter, adjust calls
to gomp_map_vars_existing.

(gomp_map_vars_internal): Add refcount_set parameter, add local openmp_p
variable to guard OpenMP specific paths, adjust calls to
gomp_map_vars_existing, add structure element sibling splay_tree_key
sequence creation code, adjust Fortran map case to avoid increment
under OpenMP.
(gomp_map_vars): Adjust to static, add refcount_set parameter, manage
local refcount_set if caller passed in NULL, adjust call to
gomp_map_vars_internal.
(gomp_map_vars_async): Adjust and rename into...
(goacc_map_vars): ...this new function, adjust call to
gomp_map_vars_internal.

(gomp_remove_splay_tree_key): New function with code factored out from
gomp_remove_var_internal.
(gomp_remove_var_internal): Add code to handle removing multiple
splay_tree_key sequence for structure elements, adjust code to use
gomp_remove_splay_tree_key for splay-tree key removal.
(gomp_unmap_vars_internal): Add refcount_set parameter, adjust to use
gomp_decrement_refcount.
(gomp_unmap_vars): Adjust to static, add refcount_set parameter, manage
local refcount_set if caller passed in NULL, adjust call to
gomp_unmap_vars_internal.
(gomp_unmap_vars_async): Adjust and rename into...
(goacc_unmap_vars): ...this new function, adjust call to
gomp_unmap_vars_internal.
(GOMP_target): Manage refcount_set and adjust calls to gomp_map_vars and
gomp_unmap_vars.
(GOMP_target_ext): Likewise.
(gomp_target_data_fallback): Adjust call to gomp_map_vars.
(GOMP_target_data): Likewise.
(GOMP_target_data_ext): Likewise.
(GOMP_target_end_data): Adjust call to gomp_unmap_vars.
(gomp_exit_data): Add refcount_set parameter, adjust to use
gomp_decrement_refcount, adjust to queue splay-tree keys for removal
after main loop.
(GOMP_target_enter_exit_data): Manage refcount_set and adjust calls to
gomp_map_vars and gomp_exit_data.
(gomp_target_task_fn): Likewise.

* testsuite/libgomp.c-c++-common/refcount-1.c: New testcase.
* testsuite/libgomp.c-c++-common/struct-elem-1.c: New testcase.
* testsuite/libgomp.c-c++-common/struct-elem-2.c: New testcase.
* testsuite/libgomp.c-c++-common/struct-elem-3.c: New testcase.
* testsuite/libgomp.c-c++-common/struct-elem-4.c: New testcase.
* testsuite/libgomp.c-c++-common/struct-elem-5.c: New testcase.

4 years agoTarget mapping C++ members inside member functions
Chung-Lin Tang [Thu, 21 Jan 2021 15:04:26 +0000 (23:04 +0800)] 
Target mapping C++ members inside member functions

This is a merge of:
https://gcc.gnu.org/pipermail/gcc-patches/2020-December/562467.html

This patch fixes some problems with target mapping when inside
C++ member functions:

1. Allow deref '->' in map clauses.
2. Allow this[X] in map clauses.
3. Create map(this->member) from map(member), when encountering
   member's FIELD_DECL.

This may possibly reverted/updated when a final patch is approved
for mainline.

gcc/cp/ChangeLog:
* parser.c (cp_parser_omp_clause_map): Adjust call to
cp_parser_omp_var_list_no_open to set 'allow_deref' argument to true.
* semantics.c (handle_omp_array_sections_1): Add handling to create
'this->member' from 'member' FIELD_DECL.
(finish_omp_clauses): Likewise. Adjust to allow 'this[]' in OpenMP
map clauses.

gcc/testsuite/ChangeLog:
* g++.dg/gomp/target-3.C: New test.
* g++.dg/gomp/this-2.C: Adjust testcase.

4 years agoCorrect fix offload dwarf info
Andrew Stubbs [Sat, 16 Jan 2021 15:18:07 +0000 (15:18 +0000)] 
Correct fix offload dwarf info

The previous patch wasn't quite right, apparently.  Somehow the behaviour
changed after another clean build?  This tweak fixes it.

This patch should be squashed with fdcb23540a2 to go to mainline.

gcc/ChangeLog:

* dwarf2out.c (gen_subprogram_die): Check offload attributes only.

4 years agoDWARF address space for variables
Andrew Stubbs [Fri, 15 Jan 2021 11:26:46 +0000 (11:26 +0000)] 
DWARF address space for variables

Add DWARF address class attributes for variables that exist outside the
generic address space.  In particular, this is the case for gang-private
variables in OpenACC offload kernels.

gcc/ChangeLog:

* dwarf2out.c (add_location_or_const_value_attribute): Set
DW_AT_address_class, if appropriate.

4 years agoamdgcn: DWARF address spaces
Andrew Stubbs [Wed, 7 Oct 2020 13:15:22 +0000 (14:15 +0100)] 
amdgcn: DWARF address spaces

Map GCN address spaces to the proposed DWARF address spaces defined by AMD.

gcc/ChangeLog:

* config/gcn/gcn.c: Include dwarf2.h.
(gcn_addr_space_debug): New function.
(TARGET_ADDR_SPACE_DEBUG): New hook.

4 years agoamdgcn: Fix DWARF variables with alloca
Andrew Stubbs [Fri, 8 Jan 2021 13:56:52 +0000 (13:56 +0000)] 
amdgcn: Fix DWARF variables with alloca

Require a frame pointer for entry functions that use alloca because it isn't
possible to encode the DWARF frame otherwise.  Adjust the CFA definition
expressions accordingly.

gcc/ChangeLog:

* config/gcn/gcn.c (gcn_expand_prologue): Use the frame pointer for
the DWARF CFA, if it exists.
(gcn_frame_pointer_rqd): Require a frame pointer for entry functions
that use alloca.

4 years agoFix offload dwarf info
Andrew Stubbs [Sun, 6 Dec 2020 19:23:55 +0000 (19:23 +0000)] 
Fix offload dwarf info

Add a notional code range to the notional parent function of offload kernel
functions.  This is enough to prevent GDB discarding them.

gcc/ChangeLog:

* dwarf2out.c (gen_subprogram_die): Add high/low_pc attributes for
parents of offload kernels.

4 years ago[og10] vect: Add target hook to prefer gather/scatter instructions
Julian Brown [Wed, 25 Nov 2020 17:08:01 +0000 (09:08 -0800)] 
[og10] vect: Add target hook to prefer gather/scatter instructions

For AMD GCN, the instructions available for loading/storing vectors are
always scatter/gather operations (i.e. there are separate addresses for
each vector lane), so the current heuristic to avoid gather/scatter
operations with too many elements in get_group_load_store_type is
counterproductive. Avoiding such operations in that function can
subsequently lead to a missed vectorization opportunity whereby later
analyses in the vectorizer try to use a very wide array type which is
not available on this target, and thus it bails out.

The attached patch adds a target hook to override the "single_element_p"
heuristic in the function as a target hook, and activates it for GCN. This
allows much better code to be generated for affected loops.

2021-01-13  Julian Brown  <julian@codesourcery.com>

gcc/
* doc/tm.texi.in (TARGET_VECTORIZE_PREFER_GATHER_SCATTER): Add
documentation hook.
* doc/tm.texi: Regenerate.
* target.def (prefer_gather_scatter): Add target hook under vectorizer.
* tree-vect-stmts.c (get_group_load_store_type): Optionally prefer
gather/scatter instructions to scalar/elementwise fallback.
* config/gcn/gcn.c (TARGET_VECTORIZE_PREFER_GATHER_SCATTER): Define
hook.

4 years ago[og10] openacc: Adjust loop lowering for AMD GCN
Julian Brown [Fri, 6 Nov 2020 23:17:29 +0000 (15:17 -0800)] 
[og10] openacc: Adjust loop lowering for AMD GCN

This patch adjusts OpenACC loop lowering in the AMD GCN target compiler
in such a way that the autovectorizer can vectorize the "vector" dimension
of those loops in more cases.

Rather than generating "SIMT" code that executes a scalar instruction
stream for each lane of a vector in lockstep, for GCN we model the GPU
like a typical CPU, with separate instructions to operate on scalar and
vector data. That means that unlike other offload targets, we rely on
the autovectorizer to handle the innermost OpenACC parallelism level,
which is "vector".

Because of this, the OpenACC builtin functions to return the current
vector lane and the vector width return 0 and 1 respectively, despite
the native vector width being 64 elements wide.

This allows generated code to work with our chosen compilation model,
but the way loops are lowered in omp-offload.c:oacc_xform_loop does not
understand the discrepancy between logical (OpenACC) and physical vector
sizes correctly. That means that if a loop is partitioned over e.g. the
worker AND vector dimensions, we actually lower with unit vector size --
meaning that if we then autovectorize, we end up trying to vectorize
over the "worker" dimension rather than the vector one! Then, because
the number of workers is not fixed at compile time, that means the
autovectorizer has a hard time analysing the loop and thus vectorization
often fails entirely.

We can fix this by deducing the true vector width in oacc_xform_loop,
and using that when we are on a "non-SIMT" offload target. We can then
rearrange how loops are lowered in that function so that the loop form
fed to the autovectorizer is more amenable to vectorization -- namely,
the innermost step is set to process each loop iteration sequentially.

For some benchmarks, allowing vectorization to succeed leads to quite
impressive performance improvements -- I've observed between 2.5x and
40x on one machine/GPU combination.

The low-level builtins available to user code (__builtin_goacc_parlevel_id
and __builtin_goacc_parlevel_size) continue to return 0/1 respectively
for the vector dimension for AMD GCN, even if their containing loop is
vectorized -- that's a quirk that we might possibly want to address at
some later date.

Only non-"chunking" loops are handled at present. "Chunking" loops are
still lowered as before.

2021-01-13  Julian Brown  <julian@codesourcery.com>

gcc/
* omp-offload.c (oacc_thread_numbers): Add VF_BY_VECTORIZER parameter.
Add overloaded wrapper for previous arguments & behaviour.
(oacc_xform_loop): Lower vector loops to iterate a multiple of
omp_max_vf times over contiguous steps on non-SIMT targets.

libgomp/testsuite/
* libgomp.oacc-c-c++-common/loop-gwv-1.c: Adjust for loop lowering
changes.
* libgomp.oacc-c-c++-common/loop-wv-1.c: Likewise.
* libgomp.oacc-c-c++-common/loop-red-gwv-1.c: Likewise.
* libgomp.oacc-c-c++-common/loop-red-wv-1.c: Likewise.
* libgomp.oacc-c-c++-common/routine-gwv-1.c: Likewise.
* libgomp.oacc-c-c++-common/routine-wv-1.c: Likewise.

4 years agogomp/pr94874.c: Update scan-tree-dump
Tobias Burnus [Mon, 28 Sep 2020 17:57:20 +0000 (19:57 +0200)] 
gomp/pr94874.c: Update scan-tree-dump

g++ on OG10 adds an additional temporary compared to mainline.

* c-c++-common/gomp/pr94874.c: Update scan-tree-dump.

4 years agoamdgcn: CFI configuration
Andrew Stubbs [Fri, 17 Jul 2020 14:42:26 +0000 (15:42 +0100)] 
amdgcn: CFI configuration

The necessary adjustments to support CFI in ROCGDB (ROCm 3.8+).

The -fomit-frame-pointer option now has different defaults because it has now
become useful.  Otherwise the only change in output is in the debug info.

gcc/

* common/config/gcn/gcn-common.c
(gcn_option_optimization_table): Change OPT_fomit_frame_pointer to -O3.
* config/gcn/gcn.c (move_callee_saved_registers): Emit CFI notes for
prologue register saves.
(gcn_expand_prologue): Prefer the frame pointer when emitting CFI.
(gcn_frame_pointer_rqd): New function.
(gcn_debug_unwind_info): Use UI_DWARF2.
(gcn_dwarf_register_number): Map DWARF_LINK_REGISTER to DWARF PC.
(gcn_dwarf_register_span): DWARF_LINK_REGISTER doesn't span.
(TARGET_FRAME_POINTER_REQUIRED): Define new hook.
* config/gcn/gcn.h (DWARF_FRAME_RETURN_COLUMN): New define.
(DWARF_LINK_REGISTER): New define.
(FIRST_PSEUDO_REGISTER): Increment.
(FIXED_REGISTERS): Add entry for DWARF_LINK_REGISTER.
(CALL_USED_REGISTERS): Likewise.
(REGISTER_NAMES): Likewise.

4 years agodwarf: Multi-register CFI address support
Andrew Stubbs [Mon, 27 Jul 2020 09:55:22 +0000 (10:55 +0100)] 
dwarf: Multi-register CFI address support

Add support for architectures such as AMD GCN, in which the pointer size is
larger than the register size.  This allows the CFI information to include
multi-register locations for the stack pointer, frame pointer, and return
address.

Note that this uses a newly proposed DWARF operator DW_OP_LLVM_piece_end,
which is currently only recognized by the ROCGDB debugger from AMD.  The exact
name and encoding for this operator is subject to change if and when the DWARF
standard accepts it.

gcc/ChangeLog:

* dwarf2cfi.c (dw_stack_pointer_regnum): Change type to struct cfa_reg.
(dw_frame_pointer_regnum): Likewise.
(new_cfi_row): Use set_by_dwreg.
(get_cfa_from_loc_descr): Use set_by_dwreg.  Support register spans
with DW_OP_piece and DW_OP_LLVM_piece_end.  Support DW_OP_lit*,
DW_OP_const*, DW_OP_minus, and DW_OP_plus.
(lookup_cfa_1): Use set_by_dwreg.
(def_cfa_0): Update for cfa_reg and support register spans.
(reg_save): Change sreg parameter to struct cfa_reg.  Support register
spans.
(dwf_cfa_reg): New function.
(dwarf2out_flush_queued_reg_saves): Use dwf_cfa_reg instead of
dwf_regno.
(dwarf2out_frame_debug_def_cfa): Likewise.
(dwarf2out_frame_debug_adjust_cfa): Likewise.
(dwarf2out_frame_debug_cfa_offset): Likewise.  Update reg_save usage.
(dwarf2out_frame_debug_cfa_register): Likewise.
(dwarf2out_frame_debug_expr): Likewise.
(create_pseudo_cfg): Use set_by_dwreg.
(initial_return_save): Use set_by_dwreg and dwf_cfa_reg,
(create_cie_data): Use dwf_cfa_reg.
(execute_dwarf2_frame): Use dwf_cfa_reg.
(dump_cfi_row): Use set_by_dwreg.
* dwarf2out.c (build_span_loc): New function.
(build_cfa_loc): Support register spans.
(build_cfa_aligned_loc): Update cfa_reg usage.
(convert_cfa_to_fb_loc_list): Use set_by_dwreg.
* dwarf2out.h (struct cfa_reg): New type.
(struct dw_cfa_location): Use struct cfa_reg.
(build_span_loc): New prototype.
* gengtype.c (main): Accept poly_uint16_pod type.

include/ChangeLog:

* dwarf2.def (DW_OP_LLVM_piece_end): New extension operator.

4 years agoFix warning messages in libgomp.oacc-fortran/privatized-ref-1.f95 testcase
Kwok Cheung Yeung [Thu, 17 Sep 2020 10:16:04 +0000 (03:16 -0700)] 
Fix warning messages in libgomp.oacc-fortran/privatized-ref-1.f95 testcase

2020-09-17  Kwok Cheung Yeung  <kcy@codesourcery.com>

libgomp/
* testsuite/libgomp.oacc-fortran/privatized-ref-1.f95 (workers, vectors):
Reduce number of workers to 16.

4 years agoXFAIL libgomp.oacc-fortran/privatized-ref-2.f90 on nvptx
Kwok Cheung Yeung [Wed, 16 Sep 2020 17:19:35 +0000 (10:19 -0700)] 
XFAIL libgomp.oacc-fortran/privatized-ref-2.f90 on nvptx

The testcase uses alloca, which is not currently supported on nvptx
(see PR65181).

2020-09-16  Kwok Cheung Yeung  <kcy@codesourcery.com>

libgomp/
* testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: XFAIL on nvptx.

4 years agoFix warning messages in libgomp.oacc-c++/privatized-ref-[23].C testcases
Kwok Cheung Yeung [Wed, 16 Sep 2020 10:18:54 +0000 (03:18 -0700)] 
Fix warning messages in libgomp.oacc-c++/privatized-ref-[23].C testcases

More workers are requested than are supported, resulting in warning messages
as the number of workers is overridden.

2020-09-16  Kwok Cheung Yeung  <kcy@codesourcery.com>

libgomp/
* testsuite/libgomp.oacc-c++/privatized-ref-2.C (workers, vectors):
Reduce number of workers to 16.
* testsuite/libgomp.oacc-c++/privatized-ref-3.C (workers, vectors):
Likewise.

4 years agoRelax some restrictions on the loop bound in kernels loop annotation.
Sandra Loosemore [Sun, 30 Aug 2020 19:15:23 +0000 (12:15 -0700)] 
Relax some restrictions on the loop bound in kernels loop annotation.

OpenACC loop semantics require that the loop bound be computable
before entering the loop, rather than the C/C++ semantics where the
end test is evaluated on every iteration.  Formerly the kernels loop
annotater permitted only constants and variables not modified in the
loop body in the loop bound expression.  This patch relaxes those
restrictions somewhat to allow many forms of expressions involving
such constants and variables, including calls to constant functions.

2020-08-30  Sandra Loosemore  <sandra@codesourcery.com>

gcc/c-family/
* c-omp.c (end_test_ok_for_annotation_r): New.
(end_test_ok_for_annotation): New.
(check_and_annotate_for_loop): Use the new helper function.

gcc/testsuite/
* c-c++-common/goacc/kernels-loop-annotation-21.c: New.
* c-c++-common/goacc/kernels-loop-annotation-22.c: New.

4 years agoClean up loop variable extraction in OpenACC kernels loop annotation.
Sandra Loosemore [Sun, 30 Aug 2020 19:15:23 +0000 (12:15 -0700)] 
Clean up loop variable extraction in OpenACC kernels loop annotation.

The code for identifying annotatable loops in OpenACC kernels regions
previously looked for the loop variable as the left-hand side of the
comparison in the loop end test.  However, front end optimizations
sometimes switch the sense of the comparison, making this method
unreliable.  In particular, it's ambiguous when both operands to the
end test comparison are local variables.

This patch reorders the loop processing to identify the loop variable
from the initializer, rather than the end test. The processing of the
end test then just checks that one of the operands to the comparison
matches the variable appearing in the initializer.  Much of the patch
is code refactoring, moving the initializer analysis out of
annotate_for_loop to check_and_annotate_for_loop so it can be
performed earlier.

2020-08-30  Sandra Loosemore  <sandra@codesourcery.com>

gcc/c-family/
* c-omp.c (annotate_for_loop): Move initializer processing...
(check_and_annotate_for_loop): ... to here.  Allow the loop
variable as either operand to the condition.

4 years agoFix typo; remove XFAIL
Tobias Burnus [Mon, 24 Aug 2020 09:00:51 +0000 (11:00 +0200)] 
Fix typo; remove XFAIL

gcc/
2020-08-24  Tobias Burnus  <tobias@codesourcery.com>

        * omp-sese.c: Fix comment typo.

gcc/testsuite/
2020-08-24  Tobias Burnus  <tobias@codesourcery.com>
            Sandra Loosemore  <sandra@codesourcery.com>

        * testsuite/gfortran.dg/goacc/kernels-decompose-2.f95:
        Fix typo; remove XFAIL.

4 years agoFix patterns in Fortran tests for kernels loop annotation.
Sandra Loosemore [Sun, 23 Aug 2020 05:43:57 +0000 (22:43 -0700)] 
Fix patterns in Fortran tests for kernels loop annotation.

Several of the Fortran tests for kernels loop annotation were failing
due to changes in the formatting of "acc loop" constructs in the dump
file.  Now the "auto" clause appears first, instead of after "private".

2020-08-23   Sandra Loosemore  <sandra@codesourcery.com>

gcc/testsuite/
* gfortran.dg/goacc/kernels-loop-annotation-1.f95: Update
expected output.
* gfortran.dg/goacc/kernels-loop-annotation-2.f95: Likewise.
* gfortran.dg/goacc/kernels-loop-annotation-3.f95: Likewise.
* gfortran.dg/goacc/kernels-loop-annotation-4.f95: Likewise.
* gfortran.dg/goacc/kernels-loop-annotation-5.f95: Likewise.
* gfortran.dg/goacc/kernels-loop-annotation-6.f95: Likewise.
* gfortran.dg/goacc/kernels-loop-annotation-7.f95: Likewise.
* gfortran.dg/goacc/kernels-loop-annotation-8.f95: Likewise.
* gfortran.dg/goacc/kernels-loop-annotation-11.f95: Likewise.
* gfortran.dg/goacc/kernels-loop-annotation-12.f95: Likewise.
* gfortran.dg/goacc/kernels-loop-annotation-13.f95: Likewise.
* gfortran.dg/goacc/kernels-loop-annotation-14.f95: Likewise.
* gfortran.dg/goacc/kernels-loop-annotation-15.f95: Likewise.
* gfortran.dg/goacc/kernels-loop-annotation-16.f95: Likewise.

4 years agoPermit calls to builtins and intrinsics in kernels loops.
Sandra Loosemore [Sun, 23 Aug 2020 01:23:26 +0000 (18:23 -0700)] 
Permit calls to builtins and intrinsics in kernels loops.

This tweak to the OpenACC kernels loop annotation relaxes the
restrictions on function calls in the loop body.  Normally calls to
functions not explicitly marked with a parallelism attribute are not
permitted, but C/C++ builtins and Fortran intrinsics have known
semantics so we can generally permit those without restriction.  If
any turn out to be problematical, we can add on here to recognize
them, or in the processing of the "auto" annotations.

2020-08-22  Sandra Loosemore  <sandra@codesourcery.com>

gcc/c-family/
* c-omp.c (annotate_loops_in_kernels_regions): Test for
calls to builtins.

gcc/fortran/
* openmp.c (check_expr_for_invalid_calls): Check for intrinsic
functions.

gcc/testsuite/
* c-c++-common/goacc/kernels-loop-annotation-20.c: New.
* gfortran.dg/goacc/kernels-loop-annotation-20.f95: New.

4 years agoUpdate dg-* in gfortran.dg/gomp/pr67500.f90
Tobias Burnus [Fri, 21 Aug 2020 11:28:06 +0000 (13:28 +0200)] 
Update dg-* in gfortran.dg/gomp/pr67500.f90

Contrary to GCC 11, OG10 uses an error instead of a warning,
cf. commit 271c7fef548a86676d304b1eb2be5c0d47280bd6.

gcc/testsuite/
        * gfortran.dg/gomp/pr67500.f90: Change dg-warning to
        dg-error.

4 years agoAnnotate inner loops in "acc kernels loop" directives (Fortran).
Sandra Loosemore [Thu, 20 Aug 2020 02:24:43 +0000 (19:24 -0700)] 
Annotate inner loops in "acc kernels loop" directives (Fortran).

Normally explicit loop directives in a kernels region inhibit
automatic annotation of other loops in the same nest, on the theory
that users have indicated they want manual control over that section
of code.  However there seems to be an expectation in user code that
the combined "kernels loop" directive should still allow annotation of
inner loops.  This patch implements this behavior in Fortran.

2020-08-19  Sandra Loosemore  <sandra@codesourcery.com>

gcc/fortran/
* openmp.c (annotate_do_loops_in_kernels): Handle
EXEC_OACC_KERNELS_LOOP separately to permit annotation of inner
loops in a combined "acc kernels loop" directive.

gcc/testsuite/
* gfortran.dg/goacc/kernels-loop-annotation-18.f95: New.
* gfortran.dg/goacc/kernels-loop-annotation-19.f95: New.
* gfortran.dg/goacc/combined-directives.f90: Adjust expected
patterns.
* gfortran.dg/goacc/private-explicit-kernels-1.f95: Likewise.
* gfortran.dg/goacc/private-predetermined-kernels-1.f95:
Likewise.

4 years agoAnnotate inner loops in "acc kernels loop" directives (C/C++).
Sandra Loosemore [Thu, 20 Aug 2020 02:18:57 +0000 (19:18 -0700)] 
Annotate inner loops in "acc kernels loop" directives (C/C++).

Normally explicit loop directives in a kernels region inhibit
automatic annotation of other loops in the same nest, on the theory
that users have indicated they want manual control over that section
of code.  However there seems to be an expectation in user code that
the combined "kernels loop" directive should still allow annotation of
inner loops.  This patch implements this behavior for C and C++.

2020-08-19  Sandra Loosemore  <sandra@codesourcery.com>

gcc/c-family/
* c-omp.c (annotate_loops_in_kernels_regions): Process inner
loops in combined "acc kernels loop" directives.

gcc/testsuite/
* c-c++-common/goacc/kernels-loop-annotation-18.c: New.
* c-c++-common/goacc/kernels-loop-annotation-19.c: New.
* c-c++-common/goacc/combined-directives.c: Adjust expected
patterns.

4 years agoAdd a "combined" flag for "acc kernels loop" etc directives.
Sandra Loosemore [Thu, 20 Aug 2020 02:13:55 +0000 (19:13 -0700)] 
Add a "combined" flag for "acc kernels loop" etc directives.

2020-08-19  Sandra Loosemore  <sandra@codesourcery.com>

gcc/
* tree.h (OACC_LOOP_COMBINED): New.

gcc/c/
* c-parser.c (c_parser_oacc_loop): Set OACC_LOOP_COMBINED.

gcc/cp/
* parser.c (cp_parser_oacc_loop): Set OACC_LOOP_COMBINED.

gcc/fortran/
* trans-openmp.c (gfc_trans_omp_do): Add combined parameter,
use it to set OACC_LOOP_COMBINED.  Update all call sites.

4 years agoFix gfortran.dg/goacc/pr70828.f90 testcase
Kwok Cheung Yeung [Wed, 19 Aug 2020 19:50:42 +0000 (12:50 -0700)] 
Fix gfortran.dg/goacc/pr70828.f90 testcase

Array mapping was changed by the patch '[OpenMP, Fortran] Add
structure/derived-type element mapping'.

2020-08-19  Kwok Cheung Yeung  <kcy@codesourcery.com>

gcc/testsuite/
* gfortran.dg/goacc/pr70828.f90: Update expected output in Gimple
dump.

4 years agoRevert "OpenACC reference count consistency checking"
Julian Brown [Fri, 31 Jul 2020 15:22:00 +0000 (08:22 -0700)] 
Revert "OpenACC reference count consistency checking"

This reverts commit 56ebcdff802678b11deb1a9c3f6d20d33498e944.

4 years ago[og10] openacc: Unshare reduction temporaries for GCN
Julian Brown [Tue, 28 Jul 2020 13:02:50 +0000 (06:02 -0700)] 
[og10] openacc: Unshare reduction temporaries for GCN

The GCN backend uses tree nodes like MEM((__lds TYPE *) <constant>)
for reduction temporaries. Unlike e.g. var decls and SSA names, these
nodes cannot be shared during gimplification, but are so in some
circumstances. This is detected when appropriate --enable-checking
options are used. This patch unshares such nodes when they are reused
more than once.

2020-07-30  Julian Brown  <julian@codesourcery.com>

gcc/
* config/gcn/gcn-tree.c (gcn_goacc_get_worker_red_decl): Do not
cache/share decls for reduction temporaries between invocations.
(gcn_goacc_reduction_teardown): Unshare VAR on second use.
* config/gcn/gcn.c (gcn_init_machine_status): Do not initialise
reduc_decls.
* config/gcn/gcn.h (machine_function): Remove reduc_decls cache.

4 years ago[og10] openacc: Delete useless temp in gcn-tree.c
Julian Brown [Tue, 28 Jul 2020 15:13:36 +0000 (08:13 -0700)] 
[og10] openacc: Delete useless temp in gcn-tree.c

This is an obvious cleanup to get rid of an unnecessary temporary
variable.

2020-07-30  Julian Brown  <julian@codesourcery.com>

gcc/
* config/gcn/gcn-tree.c (gcn_goacc_reduction_teardown): Remove useless
temporary variable "decl".

4 years ago[og10] openacc: Fix parallel-dims.c test
Julian Brown [Thu, 30 Jul 2020 20:34:10 +0000 (13:34 -0700)] 
[og10] openacc: Fix parallel-dims.c test

This test appears to have been missed when enabling worker partitioning
support for AMD GCN. It was still assuming num_workers was fixed at 1.

2020-07-30  Julian Brown  <julian@codesourcery.com>

libgomp/
* testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Fix for GCN
num_workers > 1.

4 years agoFix c-c++-common/goacc/routine-4.c and c-c++-common/goacc/routine-4-extern.c testcases
Kwok Cheung Yeung [Tue, 28 Jul 2020 12:41:14 +0000 (05:41 -0700)] 
Fix c-c++-common/goacc/routine-4.c and c-c++-common/goacc/routine-4-extern.c testcases

'Various OpenACC reduction enhancements - FE changes' (commit
6b3e1f7f05cd360bbd356b3f78511aa2ec3f40c3) introduced checks for gang
reductions on orphan loops.  The checks triggered in the routine-4.c
and routine-4-extern.c testcases, requiring changes that effectively
rendered them useless as test cases.

This patch restores the original intent of the test cases, by restoring
the original tests and removing the orphan loop reductions that were
triggering the new check.

This patch should probably have been part of 'Various OpenACC reduction
enhancements - test cases' (commit 6a0b5806b24bfdefe0b0f3ccbcc51299e5195dca).

2020-07-28  Kwok Cheung Yeung  <kcy@codesourcery.com>

gcc/testsuite/
* c-c++-common/goacc/routine-4.c (seq, vector, worker, gang): Revert
previous changes.  Remove loop reductions.
* c-c++-common/goacc/routine-4-extern.c (seq, vector, worker, gang):
Likewise.

4 years agoomp-low.c: Avoid offload-target lto1 is-missing error by not-privatizing TREE_READONL...
Tobias Burnus [Mon, 27 Jul 2020 06:31:51 +0000 (08:31 +0200)] 
omp-low.c: Avoid offload-target lto1 is-missing error by not-privatizing TREE_READONLY vars

Fixes issue exposed by testsuite/libgomp.oacc-fortran/privatized-ref-2.f90

gcc/ChangeLog:

* omp-low.c (oacc_record_private_var_clauses,
oacc_record_vars_in_bind): Do not privatize read-only static/extern
variables.

4 years agoFix failure in testcase gfortran.dg/goacc/routine-module-mod-1.f90
Kwok Cheung Yeung [Fri, 24 Jul 2020 16:50:49 +0000 (09:50 -0700)] 
Fix failure in testcase gfortran.dg/goacc/routine-module-mod-1.f90

This testcase prints an extra diagnostic message due to 'Default compute
dimensions (compile time)' (commit 4472e4cdcd264b8e9102a2804c0ce35a5865f7a1).

2020-07-24  Kwok Cheung Yeung  <kcy@codesourcery.com>

gcc/testsuite/
* gfortran.dg/goacc/routine-module-mod-1.f90 (g_1): Add
expected output.

4 years agoXFAIL tests in gfortran.dg/goacc/loop-2-kernels.f95
Kwok Cheung Yeung [Fri, 24 Jul 2020 16:50:41 +0000 (09:50 -0700)] 
XFAIL tests in gfortran.dg/goacc/loop-2-kernels.f95

The C-equivalent version of the test (c-c++-common/goacc/loop-2-kernels.c)
has these tests XFAILed in the commit 'Make new OpenACC kernels conversion
the default; adjust and add tests' (commit
757f56ddc43fd80bb8740222ec352111b26d66e9), so the Fortran version should
be XFAILed too.

2020-07-24  Kwok Cheung Yeung  <kcy@codesourcery.com>

gcc/testsuite/
* gfortran.dg/goacc/loop-2-kernels.f95: Add XFAILs.

4 years agoFix broken testcase gcc.dg/goacc/loop-processing-1.c
Kwok Cheung Yeung [Fri, 24 Jul 2020 16:50:34 +0000 (09:50 -0700)] 
Fix broken testcase gcc.dg/goacc/loop-processing-1.c

2020-07-24  Kwok Cheung Yeung  <kcy@codesourcery.com>

gcc/testsuite/
* gcc.dg/goacc/loop-processing-1.c: Remove erroneous line-break.
Specify correct tree pass for output.  Fix expected constant values.

4 years agoFix failures in c-c++-common/goacc/note-parallelism-combined-kernels-loop-auto.c...
Kwok Cheung Yeung [Fri, 24 Jul 2020 16:50:19 +0000 (09:50 -0700)] 
Fix failures in c-c++-common/goacc/note-parallelism-combined-kernels-loop-auto.c testcase

This should have been part of 'Update expected messages, errors and warnings
for "kernels" tests' (commit 081a01963ca8db7ddaaf5871d281321454fd3246).

2020-07-24  Kwok Cheung Yeung  <kcy@codesourcery.com>

gcc/testsuite/
* c-c++-common/goacc/note-parallelism-combined-kernels-loop-auto.c
(main): Remove obsolete expected messages.

4 years agoFix failure in testcase c-c++-common/goacc/routine-nohost-1.c
Kwok Cheung Yeung [Wed, 22 Jul 2020 16:44:45 +0000 (09:44 -0700)] 
Fix failure in testcase c-c++-common/goacc/routine-nohost-1.c

2020-07-22  Kwok Cheung Yeung  <kcy@codesourcery.com>

gcc/testsuite/
* c-c++-common/goacc/routine-nohost-1.c: Change tree dump pass to
oaccloops.

4 years agolibgomp, Fortran: Fix OpenACC "gang reduction on an orphan loop" error message
Frederik Harwath [Mon, 20 Jul 2020 09:24:21 +0000 (11:24 +0200)] 
libgomp, Fortran: Fix OpenACC "gang reduction on an orphan loop" error message

According to the OpenACC standard version 2.5 and later, reductions on
orphaned gang loops are explicitly disallowed (cf.  section "Changes
from Version 2.0 to 2.5").  A loop is "orphaned" if it is not
lexically contained in a compute construct (cf. section "Loop
construct" of the OpenACC standard), i.e. in either a "parallel", a
"serial", or a "kernels" construct.

This commit fixes the check for reductions on orphaned gang loops in
the Fortran frontend which (in contrast to the C, C++ frontends)
erroneously rejects reductions on gang loops that are contained in
"kernels" constructs.

2020-07-20  Frederik Harwath  <frederik@codesourcery.com>

gcc/fortran/

* openmp.c (oacc_is_parallel_or_serial): Removed function.
(oacc_is_kernels): New function.
(oacc_is_compute_construct): New function.
(resolve_oacc_loop_blocks): Use "oacc_is_compute_construct"
instead of "oacc_is_parallel_or_serial" for checking that a
loop is not orphaned.

gcc/testsuite/

* gfortran.dg/goacc/orphan-reductions-2.f90: New test
verifying that the "gang reduction on an orphan loop" error message
is not emitted for non-orphaned loops.

* c-c++-common/goacc/orphan-reductions-2.c: Likewise for C and C++.

4 years agolibgomp.oacc-fortran/firstprivate-int.f90 fix for nonexisting kind-16 int
Tobias Burnus [Thu, 16 Jul 2020 12:37:23 +0000 (14:37 +0200)] 
libgomp.oacc-fortran/firstprivate-int.f90 fix for nonexisting kind-16 int

libgomp/
        * testsuite/libgomp.oacc-fortran/firstprivate-int.f90: Use
        highest available integer kind instead of assuming that kind=16 exists.

4 years agoamdgcn: Tune default OpenMP/OpenACC GPU utilization
Kwok Cheung Yeung [Thu, 29 Aug 2019 17:16:42 +0000 (10:16 -0700)] 
amdgcn: Tune default OpenMP/OpenACC GPU utilization

libgomp/
* plugin/plugin-gcn.c (parse_target_attributes): Automatically set
the number of teams and threads if necessary.
(gcn_exec): Automatically set the number of gangs and workers if
necessary.

Co-Authored-By: Andrew Stubbs <ams@codesourcery.com>
4 years agoopenacc: Remove unnecessary barriers (gimple worker partitioning/broadcast)
Julian Brown [Thu, 13 Feb 2020 14:13:34 +0000 (06:13 -0800)] 
openacc: Remove unnecessary barriers (gimple worker partitioning/broadcast)

This is an optimisation for middle-end worker-partitioning support (used
to support multiple workers on AMD GCN).  At present, barriers may be
emitted in cases where they aren't needed and cannot be optimised away.
This patch stops the extraneous barriers from being emitted in the
first place.

One exception to the above (where the barrier is still needed) is for
predicated blocks of code that perform a write to gang-private shared
memory from one worker.  We must execute a barrier before other workers
read that shared memory location.

2020-07-15  Julian Brown  <julian@codesourcery.com>

gcc/
* config/gcn/gcn.c (gimple.h): Include.
(gcn_fork_join): Emit barrier for worker-level joins.
* omp-sese.c (find_local_vars_to_propagate): Add writes_gangprivate
bitmap parameter. Set bit for blocks containing gang-private variable
writes.
(worker_single_simple): Don't emit barrier after predicated block.
(worker_single_copy): Don't emit barrier if we're not broadcasting
anything and the block contains no gang-private writes.
(neuter_worker_single): Don't predicate blocks that only contain NOPs
or internal marker functions.  Pass has_gangprivate_write argument to
worker_single_copy.
(oacc_do_neutering): Add writes_gangprivate bitmap handling.

4 years agoamdgcn: Add waitcnt after LDS write instructions
Julian Brown [Mon, 10 Feb 2020 20:26:57 +0000 (12:26 -0800)] 
amdgcn: Add waitcnt after LDS write instructions

Data-share write (ds_write) instructions do not necessarily complete
the write to LDS immediately. When a write completes, LGKM_CNT is
decremented. For now, we wait until LGKM_CNT reaches zero after each
ds_write instruction.

This fixes a race condition in the case where LDS is read immediately
after being written. This can happen with broadcast operations.

2020-07-15  Julian Brown  <julian@codesourcery.com>

gcc/
* config/gcn/gcn-valu.md (scatter<mode>_insn_1offset_ds<exec_scatter>):
Add waitcnt.
(*mov<mode>_insn, *movti_insn): Add waitcnt to ds_write alternatives.

4 years agoopenacc: Shared memory layout optimisation
Julian Brown [Mon, 27 Jan 2020 14:14:05 +0000 (06:14 -0800)] 
openacc: Shared memory layout optimisation

This patch implements an algorithm to lay out local data-share (LDS)
space.  It currently works for AMD GCN.  At the moment, LDS is used for
three things:

  1. Gang-private variables
  2. Reduction temporaries (accumulators)
  3. Broadcasting for worker partitioning

After the patch is applied, (2) and (3) are placed at preallocated
locations in LDS, and (1) continues to be handled by the backend (as it
is at present prior to this patch being applied). LDS now looks like this:

  +--------------+ (gang local size + 1024, = 1536)
  | free space   |
  |    ...       |
  | - - - - - - -|
  | worker bcast |
  +--------------+
  | reductions   |
  +--------------+ <<< -mgang-local-size=<number> (def. 512)
  | gang private |
  |    vars      |
  +--------------+ (32)
  | low LDS vars |
  +--------------+ LDS base

So, gang-private space is fixed at a constant amount at compile time
(which can be increased with a command-line switch if necessary
for some given code). The layout algorithm takes out a slice of the
remainder of usable space for reduction vars, and uses the rest for
worker partitioning.

The partitioning algorithm works as follows.

 1. An "adjacency" set is built up for each basic block that might
    do a broadcast. This is calculated by starting at each such block,
    and doing a recursive DFS walk over successors to find the next
    block (or blocks) that *also* does a broadcast
    (dfs_broadcast_reachable_1).

 2. The adjacency set is inverted to get adjacent predecessor blocks also.

 3. Blocks that will perform a broadcast are sorted by size of that
    broadcast: the biggest blocks are handled first.

 4. A splay tree structure is used to calculate the spans of LDS memory
    that are already allocated by the blocks adjacent to this one
    (merge_ranges{,_1}.

 5. The current block's broadcast space is allocated from the first free
    span not allocated in the splay tree structure calculated above
    (first_fit_range). This seems to work quite nicely and efficiently
    with the splay tree structure.

 6. Continue with the next-biggest broadcast block until we're done.

In this way, "adjacent" broadcasts will not use the same piece of
LDS memory.

2020-07-15  Julian Brown  <julian@codesourcery.com>

gcc/
* config/gcn/gcn-protos.h (gcn_goacc_adjust_private_decl): Update
prototype.
* config/gcn/gcn-tree.c (gcn_goacc_get_worker_red_decl): Use
preallocated block of LDS memory.
(gcn_goacc_create_propagation_record): Add OFFSET parameter, and return
temporary LDS space at that offset.  Return pointer in "sender" case.
(gcn_goacc_adjust_private_decl): Return var.
* config/gcn/gcn.c (acc_lds_size, gangprivate_hwm, lds_allocs): New
global vars.
(ACC_LDS_SIZE): Define as acc_lds_size.
(gcn_init_machine_status): Don't initialise lds_allocated and
lds_allocs fields of machine function struct.
(gcn_option_override): Handle default size for gang-private variables
and -mgang-local-size option.
(gcn_expand_prologue): Use LDS_SIZE instead of LDS_SIZE-1 when
initialising M0_REG.
(gcn_shared_mem_layout): New function.
(gcn_print_lds_decl): Update comment. Use global lds_allocs map and
gangprivate_hwm variable.
(TARGET_GOACC_SHARED_MEM_LAYOUT): Define target hook.
* config/gcn/gcn.h (machine_function): Remove lds_allocated,
lds_allocs. Add reduction_base, reduction_limit.
* config/gcn/gcn.opt (gang_local_size_opt): New global.
(mgang-local-size=): New option.
* config/nvptx/nvptx.c (nvptx_goacc_adjust_private_decl): Return var.
* doc/tm.texi.in (TARGET_GOACC_SHARED_MEM_LAYOUT): Place documentation
hook.
* doc/tm.texi: Regenerate.
* omp-offload.c (addr_expr_rewrite_info): Change adjusted_vars to a
hash_map.
(rewrite_addr_expr): Rewrite VAR_DECLs also.
(default_goacc_create_propagation_record): Add OFFSET parameter.
(execute_oacc_gimple_workers): Calculate per-function reduction
temporary and private-variable size.  Call OpenACC shared_mem_layout
hook.  Move num_workers==1 handling here.
(execute_oacc_device_lower): Fix for adjusted_vars being a hash_map
rather than a hash_set.
(pass_oacc_gimple_workers::gate): Remove num_workers==1 handling from
here.  Enable pass for all OpenACC routines in order to call shared
memory-layout hook.
* omp-sese.c (targhooks.h, diagnostic-core.h): Add includes.
(build_sender_ref): Handle sender_decl being pointer.
(worker_single_copy): Add PLACEMENT and ISOLATE_BROADCASTS parameters.
Pass placement argument to create_propagation_record hook invocations.
Handle sender_decl being pointer and isolate_broadcasts inserting extra
barriers.
(blk_offset_map_t): Add typedef.
(neuter_worker_single): Add BLK_OFFSET_MAP parameter.  Pass
preallocated range to worker_single_copy call.
(dfs_broadcast_reachable_1): New function.
(idx_decl_pair_t, used_range_vec_t): New typedefs.
(sort_size_descending): New function.
(addr_range): New class.
(splay_tree_compare_addr_range, splay_tree_free_key, first_fit_range,
merge_ranges_1, merge_ranges): New functions.
(oacc_do_neutering): Add BOUNDS_LO, BOUNDS_HI parameters.  Arrange
layout of shared memory for broadcast operations.
* omp-sese.h (oacc_do_neutering): Update prototype.
* target.def (adjust_private_decl): Change return type to tree.
(create_propagation_record): Add OFFSET parameter.
(shared_mem_layout): New hook.
* targhooks.h (default_goacc_create_propagation_record): Update
prototype.

libgomp/
* testsuite/libgomp.oacc-c-c++-common/broadcast-many.c: New test.

4 years agoopenacc: Turn off worker partitioning if num_workers==1
Julian Brown [Mon, 20 Jan 2020 19:42:28 +0000 (11:42 -0800)] 
openacc: Turn off worker partitioning if num_workers==1

This patch turns off the middle-end worker-partitioning support if the
number of workers for an outlined offload function is one.  In that case,
we do not need to perform the broadcasting/neutering code transformation.

2020-07-15  Julian Brown  <julian@codesourcery.com>

gcc/
* omp-offload.c (pass_oacc_gimple_workers::gate): Disable worker
partitioning if num_workers is 1.

4 years agoopenacc: Fix race condition in Fortran loop collapse tests
Julian Brown [Thu, 25 Jun 2020 14:40:53 +0000 (07:40 -0700)] 
openacc: Fix race condition in Fortran loop collapse tests

The gangs participating in a gang-partitioned loop are not all guaranteed
to complete before some given gang continues to execute beyond that loop.
This means that two existing test cases contain a race condition,
because a loop that may be gang-partitioned is followed immediately by
another loop.  The fix is to place the loops in separate parallel regions.

2020-07-15  Julian Brown  <julian@codesourcery.com>

libgomp/
* testsuite/libgomp.oacc-fortran/collapse-1.f90: Fix race condition.
* testsuite/libgomp.oacc-fortran/collapse-2.f90: Likewise.

4 years agoFix routine-wv-2.c testcase
Kwok Cheung Yeung [Tue, 2 Jun 2020 13:53:29 +0000 (06:53 -0700)] 
Fix routine-wv-2.c testcase

2020-06-02  Kwok Cheung Yeung  <kcy@codesourcery.com>

libgomp/
* testsuite/libgomp.oacc-c-c++-common/routine-wv-2.c (DoWorkVec):
Replace openacc_amdgcn_accel_selected selector with
openacc_radeon_accel_selected.

4 years agoFix goacc/noncontig_array-1.c testcase
Kwok Cheung Yeung [Tue, 2 Jun 2020 13:36:45 +0000 (06:36 -0700)] 
Fix goacc/noncontig_array-1.c testcase

2020-06-02  Kwok Cheung Yeung  <kcy@codesourcery.com>

gcc/testsuite/
* c-c++-common/goacc/noncontig_array-1.c: Dump Gimple pass.

4 years agoEnsure sender/receiver refs are in the correct address space
Kwok Cheung Yeung [Sat, 6 Jun 2020 15:02:00 +0000 (08:02 -0700)] 
Ensure sender/receiver refs are in the correct address space

This fixes ICEs that may occur when the sender or receiver is in a different
address space.

2020-06-06  Kwok Cheung Yeung  <kcy@codesourcery.com>

gcc/
* omp-sese.c (oacc_build_component_ref): Apply address space of
base object to the entire component reference.

4 years agoOpenACC: fix privatization of by-reference arrays
Tobias Burnus [Wed, 3 Jun 2020 13:35:12 +0000 (15:35 +0200)] 
OpenACC: fix privatization of by-reference arrays

Replacing of a by-reference variable in a private clause by a local variable
makes sense; however, for arrays, the size is not directly known by the type.
This causes an ICE via create_tmp_var which indirectly invokes
force_constant_size in this case - but the latter only handled Ada.

gcc/ChangeLog:

        * gimplify.c (localize_reductions): Do not create local
variable for privatized arrays.

libgomp/ChangeLog:

        * testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: New test.

4 years agoFix parallel-dims.f90 testcase for AMD GCN
Kwok Cheung Yeung [Thu, 21 May 2020 20:52:50 +0000 (13:52 -0700)] 
Fix parallel-dims.f90 testcase for AMD GCN

2020-05-21  Kwok Cheung Yeung  <kcy@codesourcery.com>

libgomp/
* testsuite/libgomp.oacc-fortran/parallel-dims-aux.c (aux_gang):
Handle case where AMD GCN is used.
(acc_worker): Likewise.
(acc_vector): Likewise.

4 years agoFix loop-dim-default.c testcase for AMD GCN
Kwok Cheung Yeung [Thu, 21 May 2020 20:43:13 +0000 (13:43 -0700)] 
Fix loop-dim-default.c testcase for AMD GCN

2020-05-21  Kwok Cheung Yeung  <kcy@codesourcery.com>

libgomp/
* testsuite/libgomp.oacc-c-c++-common/loop-dim-default.c (check):
Change ACC_DEVICE_TYPE_gcn to ACC_DEVICE_TYPE_radeon.

4 years agoFix bug in processing of array dimensions in data clauses.
Sandra Loosemore [Tue, 31 Mar 2020 21:29:09 +0000 (14:29 -0700)] 
Fix bug in processing of array dimensions in data clauses.

The g++ front end wraps the array length and low_bound values in
NON_LVALUE_EXPR, causing the subsequent tests for INTEGER_CST to fail.
The test case c-c++-common/goacc/kernels-loop-annotation-1.c was
tickling this bug and giving bogus errors in g++ because it was falling
through to dynamic array code instead of recognizing the constant bounds.

This patch was posted upstream here
https://gcc.gnu.org/pipermail/gcc-patches/2020-March/542694.html
but not yet committed.  It may be that some other fix for this problem
is implemented on mainline instead; check before merging this patch.

2020-03-31  Sandra Loosemore  <sandra@codesourcery.com>

gcc/cp/
* semantics.c (handle_omp_array_sections_1): Call STRIP_NOPS
on length and low_bound;
(handle_omp_array_sections): Likewise.

4 years agoAdditional Fortran testsuite fixes for kernels loops annotation pass.
Sandra Loosemore [Thu, 19 Mar 2020 15:32:24 +0000 (08:32 -0700)] 
Additional Fortran testsuite fixes for kernels loops annotation pass.

These testsuite fixes are specific to the og10 branch, so are being
segregated from the ones that apply to mainline in a separate commit
from the main Fortran kernels loop annotation patch.

2020-03-27  Sandra Loosemore  <sandra@codesourcery.com>

gcc/testsuite/
* gfortran.dg/goacc/classify-kernels-unparallelized.f95: Adjust
line numbering.
* gfortran.dg/goacc/classify-kernels.f95: Likewise.
* gfortran.dg/goacc/kernels-decompose-2.f95: Add
-fno-openacc-kernels-annotate-loops.

4 years agoKernels loops annotation: Fortran.
Sandra Loosemore [Tue, 17 Mar 2020 01:08:01 +0000 (18:08 -0700)] 
Kernels loops annotation: Fortran.

This patch implements the Fortran support for adding "#pragma acc loop auto"
annotations to loops in OpenACC kernels regions.  It implements the same
-fopenacc-kernels-annotate-loops and -Wopenacc-kernels-annotate-loops options
that were previously added (and documented) for the C/C++ front ends.

2020-03-27  Sandra Loosemore  <sandra@codesourcery.com>
    Gergö Barany <gergo@codesourcery.com>

gcc/fortran/
* gfortran.h (gfc_oacc_annotate_loops_in_kernels_regions): Declare.
* lang.opt (Wopenacc-kernels-annotate-loops): New.
(fopenacc-kernels-annotate-loops): New.
* openmp.c: Include options.h.
(enum annotation_state, enum annotation_result): New.
(check_code_for_invalid_calls): New.
(check_expr_for_invalid_calls): New.
(check_for_invalid_calls): New.
(annotate_do_loop): New.
(annotate_do_loops_in_kernels): New.
(compute_goto_targets): New.
(gfc_oacc_annotate_loops_in_kernels_regions): New.
* parse.c (gfc_parse_file): Handle -fopenacc-kernels-annotate-loops.

gcc/testsuite/
* gfortran.dg/goacc/classify-kernels-unparallelized.f95: Add
-fno-openacc-kernels-annotate-loops option.
* gfortran.dg/goacc/classify-kernels.f95: Likewise.
* gfortran.dg/goacc/common-block-3.f90: Likewise.
* gfortran.dg/goacc/kernels-loop-2.f95: Likewise.
* gfortran.dg/goacc/kernels-loop-data-2.f95: Likewise.
* gfortran.dg/goacc/kernels-loop-data-enter-exit-2.f95: Likewise.
* gfortran.dg/goacc/kernels-loop-data-enter-exit.f95: Likewise.
* gfortran.dg/goacc/kernels-loop-data-update.f95: Likewise.
* gfortran.dg/goacc/kernels-loop-data.f95: Likewise.
* gfortran.dg/goacc/kernels-loop-n.f95: Likewise.
* gfortran.dg/goacc/kernels-loop.f95: Likewise.
* gfortran.dg/goacc/kernels-parallel-loop-data-enter-exit.f95:
Likewise.
* gfortran.dg/goacc/kernels-loop-annotation-1.f95: New.
* gfortran.dg/goacc/kernels-loop-annotation-2.f95: New.
* gfortran.dg/goacc/kernels-loop-annotation-3.f95: New.
* gfortran.dg/goacc/kernels-loop-annotation-4.f95: New.
* gfortran.dg/goacc/kernels-loop-annotation-5.f95: New.
* gfortran.dg/goacc/kernels-loop-annotation-6.f95: New.
* gfortran.dg/goacc/kernels-loop-annotation-7.f95: New.
* gfortran.dg/goacc/kernels-loop-annotation-8.f95: New.
* gfortran.dg/goacc/kernels-loop-annotation-9.f95: New.
* gfortran.dg/goacc/kernels-loop-annotation-10.f95: New.
* gfortran.dg/goacc/kernels-loop-annotation-11.f95: New.
* gfortran.dg/goacc/kernels-loop-annotation-12.f95: New.
* gfortran.dg/goacc/kernels-loop-annotation-13.f95: New.
* gfortran.dg/goacc/kernels-loop-annotation-14.f95: New.
* gfortran.dg/goacc/kernels-loop-annotation-15.f95: New.
* gfortran.dg/goacc/kernels-loop-annotation-16.f95: New.

4 years agoAdd -fno-openacc-kernels-annotate-loops option to more testcases.
Sandra Loosemore [Mon, 16 Mar 2020 03:31:09 +0000 (20:31 -0700)] 
Add -fno-openacc-kernels-annotate-loops option to more testcases.

The testcases being tweaked here are present on the og10 branch but not on
FSF mainline.

2020-03-27  Sandra Loosemore  <sandra@codesourcery.com>

gcc/testsuite/
* c-c++-common/goacc/kernels-decompose-2.c: Add
-fno-openacc-kernels-annotate-loops.
* c-c++-common/goacc/note-parallelism-1-kernels-loops.c: Likewise.
* c-c++-common/goacc/note-parallelism-kernels-loops.c: Likewise.

4 years agoKernels loops annotation: C and C++.
Sandra Loosemore [Sun, 15 Mar 2020 22:13:46 +0000 (15:13 -0700)] 
Kernels loops annotation: C and C++.

This patch detects loops in kernels regions that are candidates for
parallelization, and adds "#pragma acc loop auto" annotations to them.
This annotation is controlled by the -fopenacc-kernels-annotate-loops
option, which is enabled by default.  -Wopenacc-kernels-annotate-loops
can be used to produce diagnostics about loops that cannot be annotated.

2020-03-27  Sandra Loosemore  <sandra@codesourcery.com>

Kernels loops annotation: C and C++.

gcc/c-family/
* c-common.h (c_oacc_annotate_loops_in_kernels_regions): Declare.
* c-omp.c: Include tree-iterator.h
(enum annotation_state): New.
(struct annotation_info): New.
(do_not_annotate_loop): New.
(do_not_annotate_loop_nest): New.
(annotation_error): New.
(c_finish_omp_for_internal): Split from c_finish_omp_for.  Use
annotation_error function.  Code refactoring to avoid destructive
changes that cannot be undone in case of error.
(is_local_var): New.
(lang_specific_unwrap_initializer): New.
(annotate_for_loop): New.
(check_and_annotate_for_loop): New.
(annotate_loops_in_kernels_regions): New.
(c_oacc_annotate_loops_in_kernels_regions): New.
* c.opt (Wopenacc-kernels-annotate-loops): New.
(fopenacc-kernels-annotate-loops): New.

gcc/c/
* c-decl.c (c_unwrap_for_init): New.
(finish_function): Call c_oacc_annotate_loops_in_kernels_regions.

gcc/cp/
* decl.c (cp_unwrap_for_init): New.
(finish_function): Call c_oacc_annotate_loops_in_kernels_regions.

gcc/
* doc/invoke.texi (Option Summary): Add entries for
-Wopenacc-kernels-annotate-loops and
-fno-openacc-kernels-annotate-loops.
(Warning Options): Document -Wopenacc-kernels-annotate-loops.
(Optimization Options): Document -fno-openacc-kernels-annotate-loops.

gcc/testsuite/
* c-c++-common/goacc/classify-kernels-unparallelized.c: Add
-fno-openacc-kernels-annotate-loops option.
* c-c++-common/goacc/classify-kernels.c: Likewise.
* c-c++-common/goacc/kernels-counter-var-redundant-load.c: Likewise.
* c-c++-common/goacc/kernels-counter-vars-function-scope.c: Likewise.
* c-c++-common/goacc/kernels-double-reduction-n.c: Likewise.
* c-c++-common/goacc/kernels-double-reduction.c: Likewise.
* c-c++-common/goacc/kernels-loop-2.c: Likewise.
* c-c++-common/goacc/kernels-loop-3.c: Likewise.
* c-c++-common/goacc/kernels-loop-data-2.c: Likewise.
* c-c++-common/goacc/kernels-loop-data-enter-exit-2.c: Likewise.
* c-c++-common/goacc/kernels-loop-data-enter-exit.c: Likewise.
* c-c++-common/goacc/kernels-loop-data-update.c: Likewise.
* c-c++-common/goacc/kernels-loop-data.c: Likewise.
* c-c++-common/goacc/kernels-loop-g.c: Likewise.
* c-c++-common/goacc/kernels-loop-mod-not-zero.c: Likewise.
* c-c++-common/goacc/kernels-loop-n.c: Likewise.
* c-c++-common/goacc/kernels-loop-nest.c: Likewise.
* c-c++-common/goacc/kernels-loop.c: Likewise.
* c-c++-common/goacc/kernels-one-counter-var.c: Likewise.
* c-c++-common/goacc/kernels-parallel-loop-data-enter-exit.c:
Likewise.
* c-c++-common/goacc/kernels-reduction.c: Likewise.
* c-c++-common/goacc/kernels-loop-annotation-1.c: New.
* c-c++-common/goacc/kernels-loop-annotation-2.c: New.
* c-c++-common/goacc/kernels-loop-annotation-3.c: New.
* c-c++-common/goacc/kernels-loop-annotation-4.c: New.
* c-c++-common/goacc/kernels-loop-annotation-5.c: New.
* c-c++-common/goacc/kernels-loop-annotation-6.c: New.
* c-c++-common/goacc/kernels-loop-annotation-7.c: New.
* c-c++-common/goacc/kernels-loop-annotation-8.c: New.
* c-c++-common/goacc/kernels-loop-annotation-9.c: New.
* c-c++-common/goacc/kernels-loop-annotation-10.c: New.
* c-c++-common/goacc/kernels-loop-annotation-11.c: New.
* c-c++-common/goacc/kernels-loop-annotation-12.c: New.
* c-c++-common/goacc/kernels-loop-annotation-13.c: New.
* c-c++-common/goacc/kernels-loop-annotation-14.c: New.
* c-c++-common/goacc/kernels-loop-annotation-15.c: New.
* c-c++-common/goacc/kernels-loop-annotation-16.c: New.
* c-c++-common/goacc/kernels-loop-annotation-17.c: New.

4 years agoAdd XFAIL for libgomp.oacc-c-c++-common/data-firstprivate-1.c
Kwok Cheung Yeung [Tue, 24 Mar 2020 16:36:42 +0000 (09:36 -0700)] 
Add XFAIL for libgomp.oacc-c-c++-common/data-firstprivate-1.c

The firstprivate_int optimization changes the semantics of firstprivate
in this test, so XFAIL it until the correct semantics for firstprivate
have been decided (PR92036).

2020-03-24  Kwok Cheung Yeung  <kcy@codesourcery.com>

libgomp/
* testsuite/libgomp.oacc-c-c++-common/data-firstprivate-1.c: XFAIL
execution test.

4 years agoFix test failure in routine-level-of-parallelism-2.c testcase
Kwok Cheung Yeung [Thu, 19 Mar 2020 22:09:32 +0000 (15:09 -0700)] 
Fix test failure in routine-level-of-parallelism-2.c testcase

c-c++-common/goacc/routine-level-of-parallelism-2.c is supposed to be
equivalent to gfortran.dg/goacc/routine-level-of-parallelism-1.f90,
but is missing some test directives present in the latter.

2020-03-19  Kwok Cheung Yeung  <kcy@codesourcery.com>

gcc/testsuite/
* c-c++-common/goacc/routine-level-of-parallelism-2.c (g_1): Add
missing test directives.

4 years agoAllow gang-level reductions in OpenACC routines with gang-level parallelism
Kwok Cheung Yeung [Thu, 19 Mar 2020 16:44:18 +0000 (09:44 -0700)] 
Allow gang-level reductions in OpenACC routines with gang-level parallelism

2020-03-19  Kwok Cheung Yeung  <kcy@codesourcery.com>

gcc/
* omp-offload.c (oacc_loop_auto_partitions): Check for 'omp declare
target' attributes with a gang clause attached.

4 years agoReduce XFAILs in goacc/kernels-decompose-2.f95
Tobias Burnus [Tue, 17 Mar 2020 10:53:07 +0000 (11:53 +0100)] 
Reduce XFAILs in goacc/kernels-decompose-2.f95

gcc/testsuite/
* gfortran.dg/goacc/kernels-decompose-2.f95: Remove some
xfails as the tests now pass.

4 years agoFix for is_gimple_reg vars to 'data kernels'
Tobias Burnus [Mon, 16 Mar 2020 15:22:57 +0000 (16:22 +0100)] 
Fix for is_gimple_reg vars to 'data kernels'

Nearly all variable mapping is moved from 'kernels' to a surrounding
'data kernels' and then 'force_present' mapped for the 'kernels'. However, as
libgomp.oacc-c-c++-common/declare-vla.c shows, moving 'int i, N' will fail as
there is a special case for is_gimple_reg in mapping and that fails badly if
outside a target region (e.g. offloading = false). As those are transferred by
value and not as a pointer, it makes more sense to only map them at
'kernels' and ignore them for 'data kernels'.
Additionally, as e.g. libgomp.oacc-c-c++-common/kernels-decompose-1.c shows,
one still additionally to handle 'kernels'-declared variables which now are
declared in 'kernels data' and and can be handled as is_gimple_reg.

gcc/
* omp-oacc-kernels-decompose.cc (maybe_build_inner_data_region):
is_gimple_reg vars are not yet mapped, fall through to map is as
before the transformation.
(omp_oacc_kernels_decompose_1): Don't map is_gimple_reg vars.
(decompose_kernels_region_body): Use tofrom for is_gimple_reg vars.
(omp_oacc_kernels_decompose_1): Handle is_gimple_reg vars as without
data kernels.

gcc/testsuite/
* gfortran.dg/goacc/common-block-3.f90: Update scan-tree-dump-times.
* gfortran.dg/goacc/declare-3.f95: Update scan-tree-dump-times.

4 years agoAdd expected warnings to parallel-dims.c and parallel-dims.f90 testcases
Kwok Cheung Yeung [Mon, 16 Mar 2020 14:50:46 +0000 (07:50 -0700)] 
Add expected warnings to parallel-dims.c and parallel-dims.f90 testcases

2020-03-16  Kwok Cheung Yeung  <kcy@codesourcery.com>

libgomp/
* testsuite/libgomp.oacc-c-c++-common/parallel-dims.c (main): Add
expected warnings.
* testsuite/libgomp.oacc-fortran/parallel-dims.f90 (main): Likewise.

4 years agoAllow gang reductions inside serial constructs
Kwok Cheung Yeung [Fri, 13 Mar 2020 18:13:49 +0000 (11:13 -0700)] 
Allow gang reductions inside serial constructs

2020-03-13  Kwok Cheung Yeung  <kcy@codesourcery.com>

gcc/fortran/
* openmp.c (oacc_is_serial, oacc_is_parallel_or_serial): New.
(resolve_oacc_loop_blocks): Use oacc_is_parallel_or_serial instead of
oacc_is_parallel.

4 years agoFix ICE in libgomp.oacc-c++/ref-1.C
Kwok Cheung Yeung [Thu, 12 Mar 2020 14:38:02 +0000 (07:38 -0700)] 
Fix ICE in libgomp.oacc-c++/ref-1.C

2020-03-12  Kwok Cheung Yeung  <kcy@codesourcery.com>

gcc/
* omp-sese.c (install_var_field): Generate a field name for a VAR_DECL
using the DECL_UID if it does not have a DECL_NAME.

4 years agoRun all kernels regions with GOMP_MAP_FORCE_TOFROM mappings synchronously
Julian Brown [Tue, 22 Oct 2019 00:22:31 +0000 (17:22 -0700)] 
Run all kernels regions with GOMP_MAP_FORCE_TOFROM mappings synchronously

gcc/
* omp-oacc-kernels-decompose.cc (decompose_kernels_region_body): Add
inhibit_async parameter.  Force asynchronous kernel launches to
run synchronously if they have problematic variable mappings.
Don't add explicit wait for decomposed kernels regions we forced
synchronous.
(omp_oacc_kernels_decompose_1): Detect problematic variable mappings,
and inhibit asynchronous execution if we find any.

4 years agoFix vector handling for firstprivate of <= pointer size
Tobias Burnus [Fri, 6 Mar 2020 15:24:52 +0000 (16:24 +0100)] 
Fix vector handling for firstprivate of <= pointer size

Test case is the existing libgomp.oacc-c++/firstprivate-mappings-1.C.

* omp-low.c (convert_from_firstprivate_int):
Use VIEW_CONVERT also for vectors.

4 years agoFix execution error in loop-dim-default.c test running on AMD GCN
Kwok Cheung Yeung [Thu, 5 Mar 2020 21:20:55 +0000 (13:20 -0800)] 
Fix execution error in loop-dim-default.c test running on AMD GCN

2020-03-05  Kwok Cheung Yeung  <kcy@codesourcery.com>

libgomp/
* testsuite/libgomp.oacc-c-c++-common/loop-dim-default.c (main):
Allow use of workers on AMD GCN.

4 years agoFix testsuite failures due to expected compiler warnings
Kwok Cheung Yeung [Mon, 2 Mar 2020 16:32:08 +0000 (08:32 -0800)] 
Fix testsuite failures due to expected compiler warnings

2020-03-02  Kwok Cheung Yeung  <kcy@codesourcery.com>

libgomp/
* testsuite/libgomp.oacc-c-c++-common/loop-w-1.c (main): Add
dg-warnings for expected compiler warnings.
* testsuite/libgomp.oacc-c-c++-common/routine-wv-2.c (DoWorkVec):
Likewise.
* testsuite/libgomp.oacc-fortran/optional-private.f90 (t1, t2):
Likewise.

4 years agoFix ICE in oacc_verify_routine_clauses
Kwok Cheung Yeung [Thu, 20 Feb 2020 20:56:13 +0000 (12:56 -0800)] 
Fix ICE in oacc_verify_routine_clauses

2020-02-20  Kwok Cheung Yeung  <kcy@codesourcery.com>

gcc/
* omp-general.c (oacc_verify_routine_clauses): Handle
OMP_CLAUSE_NOHOST.

4 years agolibgomp - fix dg-warning line numbers
Tobias Burnus [Wed, 18 Sep 2019 11:45:34 +0000 (13:45 +0200)] 
libgomp - fix dg-warning line numbers

libgomp/
        * testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Fix dg-warning
        line numbers.

4 years agoFix OpenACC "ephemeral" asynchronous host-to-device copies
Julian Brown [Wed, 11 Sep 2019 20:22:03 +0000 (13:22 -0700)] 
Fix OpenACC "ephemeral" asynchronous host-to-device copies

libgomp/
* libgomp-plugin.h (GOMP_OFFLOAD_openacc_async_host2dev): Update
prototype.
* libgomp.h (gomp_copy_host2dev): Update prototype.
* oacc-host.c (host_openacc_async_host2dev): Add ephemeral parameter.
* oacc-mem.c (memcpy_tofrom_device): Update call to gomp_copy_host2dev.
(update_dev_host): Likewise.
* oacc-parallel.c (GOACC_enter_exit_data): Call async versions of
acc_attach/acc_detach/acc_detach_finalize functions.
* plugin/plugin-gcn.c (GOMP_OFFLOAD_openacc_async_host2dev): Add
ephemeral parameter.  Copy source data to temporary space immediately
if true, and pass to queue_push_copy.
* plugin/plugin-nvptx.c (GOMP_OFFLOAD_openacc_async_host2dev):
Add EPHEMERAL parameter, and FIXME function comment.
* target.c (goacc_device_copy_async): Remove.
(gomp_copy_host2dev): Add ephemeral parameter. Update function comment.
Call async host2dev plugin hook directly.
(gomp_copy_dev2host): Call async dev2host plugin hook directly.
(gomp_map_vars_existing, gomp_map_pointer, gomp_attach_pointer,
gomp_detach_pointer): Update calls to gomp_copy_host2dev.
(gomp_map_vars_internal): Don't use coalescing buffer for asynchronous
copies. Update calls to gomp_copy_host2dev.
(gomp_update): Update calls to gomp_copy_host2dev.
* testsuite/libgomp.oacc-c-c++-common/deep-copy-10.c (main): Fix
async-safety issue. Increase number of iterations.
* testsuite/libgomp.oacc-fortran/lib-16.f90: Fix async-safety issue.
* testsuite/libgomp.oacc-fortran/lib-16-2.f90: Likewise.

4 years agoFix warning syntax and typos in two libgomp tests
Julian Brown [Mon, 20 May 2019 23:31:41 +0000 (16:31 -0700)] 
Fix warning syntax and typos in two libgomp tests

libgomp/
* testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c: Expect
"optimized:" not "note:" in warnings.

4 years agoFortran "declare create"/allocate support for OpenACC
Julian Brown [Tue, 26 Feb 2019 23:48:00 +0000 (15:48 -0800)] 
Fortran "declare create"/allocate support for OpenACC

2018-10-04  Cesar Philippidis  <cesar@codesourcery.com>
            Julian Brown  <julian@codesourcery.com>

gcc/
* omp-low.c (scan_sharing_clauses): Update handling of OpenACC declare
create, declare copyin and declare deviceptr to have local lifetimes.
(convert_to_firstprivate_int): Handle pointer types.
(convert_from_firstprivate_int): Likewise.  Create local storage for
the values being pointed to.  Add new orig_type argument.
(lower_omp_target): Handle GOMP_MAP_DECLARE_{ALLOCATE,DEALLOCATE}.
Add orig_type argument to convert_from_firstprivate_int call.
Allow pointer types with GOMP_MAP_FIRSTPRIVATE_INT.  Don't privatize
firstprivate VLAs.
* tree-pretty-print.c (dump_omp_clause): Handle
GOMP_MAP_DECLARE_{ALLOCATE,DEALLOCATE}.

gcc/fortran/
* gfortran.h (enum gfc_omp_map_op): Add OMP_MAP_DECLARE_ALLOCATE,
OMP_MAP_DECLARE_DEALLOCATE.
(gfc_omp_clauses): Add update_allocatable.
* trans-array.c (gfc_array_allocate): Call
gfc_trans_oacc_declare_allocate for decls that have oacc_declare_create
attribute set.
* trans-decl.c (find_module_oacc_declare_clauses): Relax
oacc_declare_create to OMP_MAP_ALLOC, and oacc_declare_copyin to
OMP_MAP_TO, in order to match OpenACC 2.5 semantics.
* trans-openmp.c (gfc_trans_omp_clauses): Use GOMP_MAP_ALWAYS_POINTER
(for update directive) or GOMP_MAP_FIRSTPRIVATE_POINTER (otherwise) for
allocatable scalar decls.  Handle OMP_MAP_DECLARE_{ALLOCATE,DEALLOCATE}
clauses.
(gfc_trans_oacc_executable_directive): Use GOMP_MAP_ALWAYS_POINTER
for allocatable scalar data clauses inside acc update directives.
(gfc_trans_oacc_declare_allocate): New function.
* trans-stmt.c (gfc_trans_allocate): Call
gfc_trans_oacc_declare_allocate for decls with oacc_declare_create
attribute set.
(gfc_trans_deallocate): Likewise.
* trans.h (gfc_trans_oacc_declare_allocate): Declare.

gcc/testsuite/
* gfortran.dg/goacc/declare-allocatable-1.f90: New test.

include/
* gomp-constants.h (enum gomp_map_kind): Define
GOMP_MAP_DECLARE_{ALLOCATE,DEALLOCATE} and GOMP_MAP_FLAG_SPECIAL_4.

libgomp/
* oacc-mem.c (gomp_acc_declare_allocate): New function.
* oacc-parallel.c (GOACC_enter_exit_data): Handle
GOMP_MAP_DECLARE_{ALLOCATE,DEALLOCATE}.
* testsuite/libgomp.oacc-fortran/allocatable-scalar.f90: New test.
* testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90: New test.
* testsuite/libgomp.oacc-fortran/declare-allocatable-2.f90: New test.
* testsuite/libgomp.oacc-fortran/declare-allocatable-3.f90: New test.
* testsuite/libgomp.oacc-fortran/declare-allocatable-4.f90: New test.

2020-02-19  Julian Brown  <julian@codesourcery.com>

gcc/fortran/
* trans-openmp.c (gfc_omp_check_optional_argument): Handle non-decl
case.

gcc/
* gimplify.c (gimplify_scan_omp_clauses): Handle
GOMP_MAP_DECLARE_ALLOCATE and GOMP_MAP_DECLARE_DEALLOCATE.

libgomp/
* libgomp.h (gomp_acc_declare_allocate): Remove prototype.
* oacc-mem.c (gomp_acc_declare_allocate): Make static.  Add POINTER
argument. Use acc_delete instead of acc_free.  Handle scalar
mappings.
(find_group_last): Handle GOMP_MAP_DECLARE_ALLOCATE and
GOMP_MAP_DECLARE_DEALLOCATE groupings.
(goacc_enter_data_internal): Fix kind check for
GOMP_MAP_DECLARE_ALLOCATE. Pass new pointer argument to
gomp_acc_declare_allocate.
(goacc_exit_data_internal): Unlock device mutex around
gomp_acc_declare_allocate call. Pass new pointer argument. Handle
group pointer mapping for deallocate.

2021-04-07  Kwok Cheung Yeung  <kcy@codesourcery.com>

libgomp/
* oacc-mem.c (goacc_enter_data_internal): Unlock mutex before calling
gomp_acc_declare_allocate and relock it afterwards.

4 years agoRe-do OpenACC private variable resolution
Julian Brown [Mon, 14 Oct 2019 20:12:39 +0000 (13:12 -0700)] 
Re-do OpenACC private variable resolution

gcc/
* config/gcn/gcn-protos.h (gcn_goacc_adjust_gangprivate_decl): Rename
to...
(gcn_goacc_adjust_private_decl): ...this.
* config/gcn/gcn-tree.c (diagnostic-core.h): Include.
(gcn_goacc_adjust_gangprivate_decl): Rename to...
(gcn_goacc_adjust_private_decl): ...this. Add LEVEL parameter.
* config/gcn/gcn.c (TARGET_GOACC_ADJUST_GANGPRIVATE_DECL): Rename to...
(TARGET_GOACC_ADJUST_PRIVATE_DECL): ...this.
* config/nvptx/nvptx.c (tree-pretty-print.h): Include.
(nvptx_goacc_adjust_private_decl): New function.
(TARGET_GOACC_ADJUST_PRIVATE_DECL): Define hook using above function.
* doc/tm.texi.in (TARGET_GOACC_ADJUST_GANGPRIVATE_DECL): Rename to...
(TARGET_GOACC_ADJUST_PRIVATE_DECL): ...this.
* doc/tm.texi: Regenerated.
* internal-fn.c (expand_UNIQUE): Handle IFN_UNIQUE_OACC_PRIVATE.
* internal-fn.h (IFN_UNIQUE_CODES): Add OACC_PRIVATE.
* omp-low.c (omp_context): Remove oacc_partitioning_levels field.
(lower_oacc_reductions): Add PRIVATE_MARKER parameter.  Insert before
fork.
(lower_oacc_head_tail): Add PRIVATE_MARKER parameter. Modify its
gimple call arguments as appropriate. Don't set
oacc_partitioning_levels in omp_context. Pass private_marker to
lower_oacc_reductions.
(oacc_record_private_var_clauses): Don't check for NULL ctx.
(mark_oacc_gangprivate): Remove unused function.
(make_oacc_private_marker): New function.
(lower_omp_for): Only call oacc_record_vars_in_bind for
OpenACC contexts.  Create private marker and pass to
lower_oacc_head_tail.
(lower_omp_target): Remove unnecessary call to
oacc_record_private_var_clauses. Remove call to mark_oacc_gangprivate.
Create private marker and pass to lower_oacc_reductions.
(process_oacc_gangprivate_1): Remove.
(lower_omp_1): Only call oacc_record_vars_in_bind for OpenACC.  Don't
iterate over contexts calling process_oacc_gangprivate_1.
(omp-offload.c (oacc_loop_xform_head_tail): Treat
private-variable markers like fork/join when transforming head/tail
sequences.
(execute_oacc_device_lower): Use IFN_UNIQUE_OACC_PRIVATE instead of
"oacc gangprivate" attributes to determine partitioning level of
variables.  Remove unused variables.
* omp-sese.c (find_gangprivate_vars): New function.
(find_local_vars_to_propagate): Use GANGPRIVATE_VARS parameter instead
of "oacc gangprivate" attribute to determine which variables are
gang-private.
(oacc_do_neutering): Use find_gangprivate_vars.
* target.def (adjust_gangprivate_decl): Rename to...
(adjust_private_decl): ...this.  Update documentation (briefly).

libgomp/
* testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90: Use
oaccdevlow dump and update scanned output.
* testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90: Likewise.
Add missing atomic to force worker partitioning for test variable.

4 years agoHandle references in OpenACC "private" clauses
Julian Brown [Fri, 20 Sep 2019 20:53:10 +0000 (13:53 -0700)] 
Handle references in OpenACC "private" clauses

gcc/
* gimplify.c (localize_reductions): Rewrite references for
OMP_CLAUSE_PRIVATE also.

libgomp/
* testsuite/libgomp.oacc-fortran/privatized-ref-1.f95: New test.
* testsuite/libgomp.oacc-c++/privatized-ref-2.C: New test.
* testsuite/libgomp.oacc-c++/privatized-ref-3.C: New test.

4 years agoReduce testsuite fails
Tobias Burnus [Thu, 19 Sep 2019 13:57:08 +0000 (15:57 +0200)] 
Reduce testsuite fails

gcc/testsuite/
2019-09-19  Tobias Burnus  <tobias@codesourcery.com>

        * gfortran.dg/goacc/classify-kernels-unparallelized.f95: Add
        one dg-message for additional -fopt-info-optimized-omp output.
        * gfortran.dg/goacc/classify-kernels.f95: Likewise.