Tobias Burnus [Wed, 3 Jun 2020 13:35:12 +0000 (15:35 +0200)]
OpenACC: fix privatization of by-reference arrays
Replacing a by-reference variable in a private clause with a local variable
makes sense; however, for arrays, the size is not directly known from the type.
This causes an ICE via create_tmp_var, which indirectly invokes
force_constant_size in this case, but the latter only handles Ada.
gcc/ChangeLog:
* gimplify.c (localize_reductions): Do not create local
variable for privatized arrays.
libgomp/ChangeLog:
* testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: New test.
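As a rough illustration of the clause this fix concerns, the sketch below puts an array in an OpenACC private clause. It is written in C only to show the shape of the construct; the by-reference case fixed above arises in front ends such as Fortran, where the new test lives. The names row_sums, scratch and ROW_LEN are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

enum { ROW_LEN = 4 };   /* hypothetical fixed row length */

/* Sum each row of a flattened n x ROW_LEN matrix into out[i].
   `scratch` is named in the private clause, so it is meant to be
   privatized instead of being shared across the parallel loop.  */
void row_sums(size_t n, const float *in, float *out)
{
  float scratch[ROW_LEN];
  assert(in != NULL && out != NULL);

#pragma acc parallel loop private(scratch) copyin(in[0:n * ROW_LEN]) copyout(out[0:n])
  for (size_t i = 0; i < n; i++) {
    float sum = 0.0f;
    for (size_t j = 0; j < ROW_LEN; j++) {
      scratch[j] = in[i * ROW_LEN + j];
      sum += scratch[j];
    }
    out[i] = sum;
  }
}
```

Built without -fopenacc, the pragma is ignored and the function runs sequentially on the host with the same result.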
libgomp/
* testsuite/libgomp.oacc-fortran/parallel-dims-aux.c (aux_gang):
Handle case where AMD GCN is used.
(acc_worker): Likewise.
(acc_vector): Likewise.
Sandra Loosemore [Tue, 31 Mar 2020 21:29:09 +0000 (14:29 -0700)]
Fix bug in processing of array dimensions in data clauses.
The g++ front end wraps the array length and low_bound values in
NON_LVALUE_EXPR, causing the subsequent tests for INTEGER_CST to fail.
The test case c-c++-common/goacc/kernels-loop-annotation-1.c was
tickling this bug and giving bogus errors in g++ because it was falling
through to dynamic array code instead of recognizing the constant bounds.
This patch was posted upstream here
https://gcc.gnu.org/pipermail/gcc-patches/2020-March/542694.html
but not yet committed. It may be that some other fix for this problem
is implemented on mainline instead; check before merging this patch.
Sandra Loosemore [Thu, 19 Mar 2020 15:32:24 +0000 (08:32 -0700)]
Additional Fortran testsuite fixes for kernels loops annotation pass.
These testsuite fixes are specific to the og10 branch, so are being
segregated from the ones that apply to mainline in a separate commit
from the main Fortran kernels loop annotation patch.
Sandra Loosemore [Tue, 17 Mar 2020 01:08:01 +0000 (18:08 -0700)]
Kernels loops annotation: Fortran.
This patch implements the Fortran support for adding "#pragma acc loop auto"
annotations to loops in OpenACC kernels regions. It implements the same
-fopenacc-kernels-annotate-loops and -Wopenacc-kernels-annotate-loops options
that were previously added (and documented) for the C/C++ front ends.
Sandra Loosemore [Sun, 15 Mar 2020 22:13:46 +0000 (15:13 -0700)]
Kernels loops annotation: C and C++.
This patch detects loops in kernels regions that are candidates for
parallelization, and adds "#pragma acc loop auto" annotations to them.
This annotation is controlled by the -fopenacc-kernels-annotate-loops
option, which is enabled by default. -Wopenacc-kernels-annotate-loops
can be used to produce diagnostics about loops that cannot be annotated.
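A minimal sketch of the kind of loop this pass targets, assuming a simple element-wise loop counts as a candidate: the loop sits inside a kernels region and carries no loop directive of its own. The function name vec_add is invented for illustration.

```c
#include <assert.h>

/* Element-wise vector add inside an OpenACC kernels region.  */
void vec_add(int n, const int *a, const int *b, int *c)
{
  assert(n >= 0);

#pragma acc kernels copyin(a[0:n], b[0:n]) copyout(c[0:n])
  {
    /* No "#pragma acc loop" here.  With -fopenacc-kernels-annotate-loops
       the compiler may treat this loop as if "#pragma acc loop auto"
       had been written in front of it.  */
    for (int i = 0; i < n; i++)
      c[i] = a[i] + b[i];
  }
}
```

When a loop cannot be annotated, -Wopenacc-kernels-annotate-loops is the option described above for finding out why.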
Add XFAIL for libgomp.oacc-c-c++-common/data-firstprivate-1.c
The firstprivate_int optimization changes the semantics of firstprivate
in this test, so XFAIL it until the correct semantics for firstprivate
have been decided (PR92036).
Fix test failure in routine-level-of-parallelism-2.c testcase
c-c++-common/goacc/routine-level-of-parallelism-2.c is supposed to be
equivalent to gfortran.dg/goacc/routine-level-of-parallelism-1.f90,
but is missing some test directives present in the latter.
Tobias Burnus [Mon, 16 Mar 2020 15:22:57 +0000 (16:22 +0100)]
Fix for is_gimple_reg vars to 'data kernels'
Nearly all variable mapping is moved from 'kernels' to a surrounding
'data kernels' and then 'force_present' mapped for the 'kernels'. However, as
libgomp.oacc-c-c++-common/declare-vla.c shows, moving 'int i, N' will fail as
there is a special case for is_gimple_reg in mapping and that fails badly if
outside a target region (e.g. offloading = false). As those are transferred by
value and not as a pointer, it makes more sense to only map them at
'kernels' and ignore them for 'data kernels'.
Additionally, as e.g. libgomp.oacc-c-c++-common/kernels-decompose-1.c shows,
one still additionally has to handle 'kernels'-declared variables which now are
declared in 'kernels data' and can be handled as is_gimple_reg.
gcc/
* omp-oacc-kernels-decompose.cc (maybe_build_inner_data_region):
is_gimple_reg vars are not yet mapped, fall through to map them as
before the transformation.
(omp_oacc_kernels_decompose_1): Don't map is_gimple_reg vars.
(decompose_kernels_region_body): Use tofrom for is_gimple_reg vars.
(omp_oacc_kernels_decompose_1): Handle is_gimple_reg vars as without
data kernels.
Julian Brown [Tue, 22 Oct 2019 00:22:31 +0000 (17:22 -0700)]
Run all kernels regions with GOMP_MAP_FORCE_TOFROM mappings synchronously
gcc/
* omp-oacc-kernels-decompose.cc (decompose_kernels_region_body): Add
inhibit_async parameter. Force asynchronous kernel launches to
run synchronously if they have problematic variable mappings.
Don't add explicit wait for decomposed kernels regions we forced
synchronous.
(omp_oacc_kernels_decompose_1): Detect problematic variable mappings,
and inhibit asynchronous execution if we find any.
Julian Brown [Tue, 26 Feb 2019 23:48:00 +0000 (15:48 -0800)]
Fortran "declare create"/allocate support for OpenACC
2018-10-04 Cesar Philippidis <cesar@codesourcery.com>
Julian Brown <julian@codesourcery.com>
gcc/
* omp-low.c (scan_sharing_clauses): Update handling of OpenACC declare
create, declare copyin and declare deviceptr to have local lifetimes.
(convert_to_firstprivate_int): Handle pointer types.
(convert_from_firstprivate_int): Likewise. Create local storage for
the values being pointed to. Add new orig_type argument.
(lower_omp_target): Handle GOMP_MAP_DECLARE_{ALLOCATE,DEALLOCATE}.
Add orig_type argument to convert_from_firstprivate_int call.
Allow pointer types with GOMP_MAP_FIRSTPRIVATE_INT. Don't privatize
firstprivate VLAs.
* tree-pretty-print.c (dump_omp_clause): Handle
GOMP_MAP_DECLARE_{ALLOCATE,DEALLOCATE}.
gcc/fortran/
* gfortran.h (enum gfc_omp_map_op): Add OMP_MAP_DECLARE_ALLOCATE,
OMP_MAP_DECLARE_DEALLOCATE.
(gfc_omp_clauses): Add update_allocatable.
* trans-array.c (gfc_array_allocate): Call
gfc_trans_oacc_declare_allocate for decls that have oacc_declare_create
attribute set.
* trans-decl.c (find_module_oacc_declare_clauses): Relax
oacc_declare_create to OMP_MAP_ALLOC, and oacc_declare_copyin to
OMP_MAP_TO, in order to match OpenACC 2.5 semantics.
* trans-openmp.c (gfc_trans_omp_clauses): Use GOMP_MAP_ALWAYS_POINTER
(for update directive) or GOMP_MAP_FIRSTPRIVATE_POINTER (otherwise) for
allocatable scalar decls. Handle OMP_MAP_DECLARE_{ALLOCATE,DEALLOCATE}
clauses.
(gfc_trans_oacc_executable_directive): Use GOMP_MAP_ALWAYS_POINTER
for allocatable scalar data clauses inside acc update directives.
(gfc_trans_oacc_declare_allocate): New function.
* trans-stmt.c (gfc_trans_allocate): Call
gfc_trans_oacc_declare_allocate for decls with oacc_declare_create
attribute set.
(gfc_trans_deallocate): Likewise.
* trans.h (gfc_trans_oacc_declare_allocate): Declare.
gcc/testsuite/
* gfortran.dg/goacc/declare-allocatable-1.f90: New test.
include/
* gomp-constants.h (enum gomp_map_kind): Define
GOMP_MAP_DECLARE_{ALLOCATE,DEALLOCATE} and GOMP_MAP_FLAG_SPECIAL_4.
libgomp/
* oacc-mem.c (gomp_acc_declare_allocate): New function.
* oacc-parallel.c (GOACC_enter_exit_data): Handle
GOMP_MAP_DECLARE_{ALLOCATE,DEALLOCATE}.
* testsuite/libgomp.oacc-fortran/allocatable-scalar.f90: New test.
* testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90: New test.
* testsuite/libgomp.oacc-fortran/declare-allocatable-2.f90: New test.
* testsuite/libgomp.oacc-fortran/declare-allocatable-3.f90: New test.
* testsuite/libgomp.oacc-fortran/declare-allocatable-4.f90: New test.
Julian Brown [Mon, 14 Oct 2019 20:12:39 +0000 (13:12 -0700)]
Re-do OpenACC private variable resolution
gcc/
* config/gcn/gcn-protos.h (gcn_goacc_adjust_gangprivate_decl): Rename
to...
(gcn_goacc_adjust_private_decl): ...this.
* config/gcn/gcn-tree.c (diagnostic-core.h): Include.
(gcn_goacc_adjust_gangprivate_decl): Rename to...
(gcn_goacc_adjust_private_decl): ...this. Add LEVEL parameter.
* config/gcn/gcn.c (TARGET_GOACC_ADJUST_GANGPRIVATE_DECL): Rename to...
(TARGET_GOACC_ADJUST_PRIVATE_DECL): ...this.
* config/nvptx/nvptx.c (tree-pretty-print.h): Include.
(nvptx_goacc_adjust_private_decl): New function.
(TARGET_GOACC_ADJUST_PRIVATE_DECL): Define hook using above function.
* doc/tm.texi.in (TARGET_GOACC_ADJUST_GANGPRIVATE_DECL): Rename to...
(TARGET_GOACC_ADJUST_PRIVATE_DECL): ...this.
* doc/tm.texi: Regenerated.
* internal-fn.c (expand_UNIQUE): Handle IFN_UNIQUE_OACC_PRIVATE.
* internal-fn.h (IFN_UNIQUE_CODES): Add OACC_PRIVATE.
* omp-low.c (omp_context): Remove oacc_partitioning_levels field.
(lower_oacc_reductions): Add PRIVATE_MARKER parameter. Insert before
fork.
(lower_oacc_head_tail): Add PRIVATE_MARKER parameter. Modify its
gimple call arguments as appropriate. Don't set
oacc_partitioning_levels in omp_context. Pass private_marker to
lower_oacc_reductions.
(oacc_record_private_var_clauses): Don't check for NULL ctx.
(mark_oacc_gangprivate): Remove unused function.
(make_oacc_private_marker): New function.
(lower_omp_for): Only call oacc_record_vars_in_bind for
OpenACC contexts. Create private marker and pass to
lower_oacc_head_tail.
(lower_omp_target): Remove unnecessary call to
oacc_record_private_var_clauses. Remove call to mark_oacc_gangprivate.
Create private marker and pass to lower_oacc_reductions.
(process_oacc_gangprivate_1): Remove.
(lower_omp_1): Only call oacc_record_vars_in_bind for OpenACC. Don't
iterate over contexts calling process_oacc_gangprivate_1.
* omp-offload.c (oacc_loop_xform_head_tail): Treat
private-variable markers like fork/join when transforming head/tail
sequences.
(execute_oacc_device_lower): Use IFN_UNIQUE_OACC_PRIVATE instead of
"oacc gangprivate" attributes to determine partitioning level of
variables. Remove unused variables.
* omp-sese.c (find_gangprivate_vars): New function.
(find_local_vars_to_propagate): Use GANGPRIVATE_VARS parameter instead
of "oacc gangprivate" attribute to determine which variables are
gang-private.
(oacc_do_neutering): Use find_gangprivate_vars.
* target.def (adjust_gangprivate_decl): Rename to...
(adjust_private_decl): ...this. Update documentation (briefly).
libgomp/
* testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90: Use
oaccdevlow dump and update scanned output.
* testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90: Likewise.
Add missing atomic to force worker partitioning for test variable.
Julian Brown [Fri, 20 Sep 2019 20:53:10 +0000 (13:53 -0700)]
Handle references in OpenACC "private" clauses
gcc/
* gimplify.c (localize_reductions): Rewrite references for
OMP_CLAUSE_PRIVATE also.
libgomp/
* testsuite/libgomp.oacc-fortran/privatized-ref-1.f95: New test.
* testsuite/libgomp.oacc-c++/privatized-ref-2.C: New test.
* testsuite/libgomp.oacc-c++/privatized-ref-3.C: New test.
Julian Brown [Fri, 9 Aug 2019 20:01:33 +0000 (13:01 -0700)]
Wait at end of OpenACC asynchronous kernels regions
gcc/
* omp-oacc-kernels-decompose.cc (add_wait): New function, split out
of...
(add_async_clauses_and_wait): ...here. Call new outlined function.
(decompose_kernels_region_body): Add wait at the end of
explicitly-asynchronous kernels regions.
Julian Brown [Fri, 5 Jul 2019 01:14:41 +0000 (18:14 -0700)]
Assumed-size arrays with non-lexical data mappings
gcc/
* gimplify.c (gimplify_adjust_omp_clauses_1): Raise error for
assumed-size arrays in map clauses for Fortran/OpenMP.
* omp-low.c (lower_omp_target): Set the size of assumed-size Fortran
arrays to one to allow use of data already mapped on the offload device.
gcc/fortran/
* trans-openmp.c (gfc_omp_finish_clause): Change clauses mapping
assumed-size arrays to use the GOMP_MAP_FORCE_PRESENT map type.
Julian Brown [Tue, 28 May 2019 15:42:10 +0000 (08:42 -0700)]
Apply gangprivate attribute to innermost decl
...and fix parallelism-level calculation when applying the attribute.
gcc/
* omp-low.c (mark_oacc_gangprivate): Add CTX parameter. Use to look up
correct decl to add attribute to.
(lower_omp_for): Move "oacc gangprivate" processing from here...
(process_oacc_gangprivate_1): ...to here. New function.
(lower_omp_target): Update call to mark_oacc_gangprivate.
(execute_lower_omp): Call process_oacc_gangprivate_1 for each OMP
context.
libgomp/
* testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90: New test.
Julian Brown [Sun, 19 May 2019 17:42:20 +0000 (10:42 -0700)]
Fix references declared in lexically-enclosing OpenACC data region
gcc/fortran/
* trans-openmp.c (gfc_omp_finish_clause): Guard addition of clauses for
pointers with DECL_P.
gcc/
* gimplify.c (oacc_array_mapping_info): Add REF field.
(gimplify_scan_omp_clauses): Initialise above field for data blocks
passed by reference.
(gomp_oacc_needs_data_present): Handle references.
(gimplify_adjust_omp_clauses_1): Handle references and optional
arguments for variables declared in lexically-enclosing OpenACC data
region.
Julian Brown [Thu, 16 May 2019 12:45:35 +0000 (05:45 -0700)]
Avoid introducing 'create' mapping clauses for loop index variables in kernels regions
gcc/
* omp-oacc-kernels-decompose.cc (find_omp_for_index_vars_1,
find_omp_for_index_vars): New functions.
(maybe_build_inner_data_region): Add IDX_VARS argument. Don't add
CREATE mapping clauses for loop index variables. Set TREE_ADDRESSABLE
flag on newly-mapped declarations as a side effect.
(decompose_kernels_region_body): Call find_omp_for_index_vars. Don't
create PRESENT clause for loop index variables. Pass index variable
set to maybe_build_inner_data_region.
Julian Brown [Wed, 9 Jan 2019 11:41:04 +0000 (03:41 -0800)]
Update OpenACC version to 2.6
gcc/c-family/
* c-cppbuiltin.c (c_cpp_builtins): Update _OPENACC define to 201711.
gcc/doc/
* invoke.texi: Update mention of OpenACC version to 2.6.
gcc/fortran/
* cpp.c (cpp_define_builtins): Update _OPENACC define to 201711.
* gfortran.texi: Update mentions of OpenACC version to 2.6.
* intrinsic.texi: Likewise.
gcc/testsuite/
* c-c++-common/cpp/openacc-define-3.c: Update expected value for
_OPENACC define.
* gfortran.dg/openacc-define-3.f90: Likewise.
libgomp/
* libgomp.texi: Update mentions of OpenACC version to 2.6. Update
section numbers to match version 2.6 of the spec.
* openacc.f90 (openacc_version): Update to 201711.
* openacc_lib.h (openacc_version): Update to 201711.
* testsuite/libgomp.oacc-fortran/openacc_version-1.f: Update expected
openacc_version to 201711.
* testsuite/libgomp.oacc-fortran/openacc_version-2.f90: Likewise.
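User code can observe this change through the _OPENACC preprocessor macro. The sketch below returns its value, or 0 when OpenACC support is not enabled; the function name openacc_version_macro is hypothetical.

```c
#include <assert.h>

/* Return the value of _OPENACC, or 0 if the compiler was not invoked
   with OpenACC support enabled.  */
long openacc_version_macro(void)
{
  long version = 0;
#ifdef _OPENACC
  version = _OPENACC;   /* 201711 with the change described above */
  assert(version > 0);  /* the macro encodes a yyyymm date */
#endif
  return version;
}
```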
[nvptx] Expand OpenACC child function arguments to use CUDA params space
This patch replaces the patches "Use functional parameters for data mappings
in OpenACC child functions" and "Make OpenACC function-parameter explosion
optional".
* config/nvptx/nvptx.c (nvptx_expand_to_rtl_hook): New function
implementing CUDA .params space transformation.
(TARGET_EXPAND_TO_RTL_HOOK): implement hook with
nvptx_expand_to_rtl_hook.
Disable AC_PROG_CXX and consequently a libstdc++ dependency for libffi,
introduced with upstream libffi commit 7d698125b1f0 ("Use the proper C++
compiler to run C++ tests"). This is only needed for the libffi test
suite, which we don't have to support in the GCC tree, as libffi is
maintained as a separate project. The dependency causes a build failure
with the `powerpc64le-linux-gnu' target because of a circular dependency:
a libgomp dependency for libstdc++ and then a libffi dependency
for libgomp, introduced with commit 998eb38b265d ("Use functional
parameters for data mappings in OpenACC child functions").
Julian Brown [Thu, 21 Mar 2019 22:09:24 +0000 (15:09 -0700)]
Add support for gang local storage allocation in shared memory
2018-12-11 Julian Brown <julian@codesourcery.com>
Chung-Lin Tang <cltang@codesourcery.com>
gcc/
* config/nvptx/nvptx.c (tree-hash-traits.h): Include.
(gangprivate_shared_size): New global variable.
(gangprivate_shared_align): Likewise.
(gangprivate_shared_sym): Likewise.
(gangprivate_shared_hmap): Likewise.
(nvptx_option_override): Initialize gangprivate_shared_sym,
gangprivate_shared_align.
(nvptx_file_end): Output gangprivate_shared_sym.
(nvptx_goacc_expand_accel_var): New function.
(nvptx_set_current_function): New function.
(TARGET_SET_CURRENT_FUNCTION): Define hook.
(TARGET_GOACC_EXPAND_ACCEL_VAR): Likewise.
* doc/tm.texi (TARGET_GOACC_EXPAND_ACCEL_VAR): Document new hook.
* doc/tm.texi.in (TARGET_GOACC_EXPAND_ACCEL_VAR): Likewise.
* expr.c (expand_expr_real_1): Remap decls marked with the
"oacc gangprivate" attribute.
* omp-low.c (omp_context): Add oacc_partitioning_level and
oacc_addressable_var_decls fields.
(new_omp_context): Initialize oacc_addressable_var_decls in new
omp_context.
(delete_omp_context): Delete oacc_addressable_var_decls in old
omp_context.
(lower_oacc_head_tail): Record partitioning-level count in omp context.
(oacc_record_private_var_clauses, oacc_record_vars_in_bind)
(mark_oacc_gangprivate): New functions.
(lower_omp_for): Call oacc_record_private_var_clauses with "for"
clauses. Call mark_oacc_gangprivate for gang-partitioned loops.
(lower_omp_target): Call oacc_record_private_var_clauses with "target"
clauses.
Call mark_oacc_gangprivate for offloaded target regions.
(lower_omp_1): Call vars_in_bind for GIMPLE_BIND within OMP regions.
* target.def (expand_accel_var): New hook.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/gang-private-1.c: New test.
* testsuite/libgomp.oacc-c-c++-common/loop-gwv-2.c: New test.
* testsuite/libgomp.oacc-c/pr85465.c: New test.
* testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90: New test.
gcc/testsuite/
* c-c++-common/goacc/reduction-8.c: New test.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/privatize-reduction-1.c: New
test.
* testsuite/libgomp.oacc-c-c++-common/privatize-reduction-2.c: New
test.
Julian Brown [Tue, 26 Feb 2019 23:55:23 +0000 (15:55 -0800)]
Don't mark OpenACC auto loops as independent inside acc parallel regions
2018-09-20 Cesar Philippidis <cesar@codesourcery.com>
gcc/
* omp-low.c (lower_oacc_head_mark): Don't mark OpenACC auto
loops as independent inside acc parallel regions.
gcc/testsuite/
* c-c++-common/goacc/loop-auto-1.c: Adjust test case to conform to
the new behavior of the auto clause in OpenACC 2.5.
* c-c++-common/goacc/loop-auto-2.c: Likewise.
* gcc.dg/goacc/loop-processing-1.c: Likewise.
* c-c++-common/goacc/loop-auto-3.c: New test.
* gfortran.dg/goacc/loop-auto-1.f90: New test.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/loop-auto-1.c: Adjust test case
to conform to the new behavior of the auto clause in OpenACC 2.5.
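To see why an auto loop must not be assumed independent, consider a loop with a loop-carried dependence. This is an illustrative sketch only (the name prefix_sum is invented); num_gangs(1) is added so that a single gang executes the region.

```c
#include <assert.h>

/* In-place prefix sum.  Iteration i reads x[i - 1], which iteration
   i - 1 has just written, so the iterations are not independent.  */
void prefix_sum(int n, int *x)
{
  assert(n >= 0);

#pragma acc parallel num_gangs(1) copy(x[0:n])
  {
    /* "auto" leaves the decision to the compiler.  If this loop were
       marked independent, it could be partitioned and give wrong sums.  */
#pragma acc loop auto
    for (int i = 1; i < n; i++)
      x[i] += x[i - 1];
  }
}
```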
Julian Brown [Tue, 26 Feb 2019 23:10:21 +0000 (15:10 -0800)]
Enable GOMP_MAP_FIRSTPRIVATE_INT for OpenACC
2018-12-22 Cesar Philippidis <cesar@codesourcery.com>
Julian Brown <julian@codesourcery.com>
Tobias Burnus <tobias@codesourcery.com>
gcc/
* omp-low.c (maybe_lookup_field_in_outer_ctx): New function.
(convert_to_firstprivate_int): New function.
(convert_from_firstprivate_int): New function.
(lower_omp_target): Enable GOMP_MAP_FIRSTPRIVATE_INT in OpenACC.
Remove unused variable.
libgomp/
* oacc-parallel.c (GOACC_parallel_keyed): Handle
GOMP_MAP_FIRSTPRIVATE_INT host addresses.
* plugin/plugin-nvptx.c (nvptx_exec): Handle
GOMP_MAP_FIRSTPRIVATE_INT host addresses.
* testsuite/libgomp.oacc-c++/firstprivate-int.C: New test.
* testsuite/libgomp.oacc-c-c++-common/firstprivate-int.c: New
test.
* testsuite/libgomp.oacc-fortran/firstprivate-int.f90: New test.
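A hedged sketch of the idea, assuming GOMP_MAP_FIRSTPRIVATE_INT means that a small integer travels by value in the slot that would otherwise hold a host address. The helpers pack_int and unpack_int are hypothetical and only mimic that encoding; they are not libgomp functions. The fill function shows a firstprivate scalar at the user level.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers: store a small integer in a pointer-sized slot
   and read it back, instead of passing the address of a host copy.  */
void *pack_int(int value)
{
  return (void *)(intptr_t)value;
}

int unpack_int(void *slot)
{
  return (int)(intptr_t)slot;
}

/* User-level view: `value` is a firstprivate scalar, so each executing
   thread starts with its own copy initialized from the host value.  */
void fill(int n, int *out, int value)
{
  assert(n >= 0);

#pragma acc parallel loop firstprivate(value) copyout(out[0:n])
  for (int i = 0; i < n; i++)
    out[i] = value;
}
```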
Julian Brown [Tue, 26 Feb 2019 22:22:41 +0000 (14:22 -0800)]
Fix implicit mapping for array slices on lexically-enclosing data constructs (PR70828)
2018-08-28 Julian Brown <julian@codesourcery.com>
Cesar Philippidis <cesar@codesourcery.com>
gcc/
* gimplify.c (oacc_array_mapping_info): New struct.
(gimplify_omp_ctx): Add decl_data_clause hash map.
(new_omp_context): Zero-initialise above.
(delete_omp_context): Delete above if allocated.
(gimplify_scan_omp_clauses): Scan for array mappings on data constructs,
and record in above map.
(gomp_oacc_needs_data_present): New function.
(gimplify_adjust_omp_clauses_1): Handle data mappings (e.g. array
slices) declared in lexically-enclosing data constructs.
* omp-low.c (lower_omp_target): Allow decl for bias not to be present
in OpenACC context.
gcc/testsuite/
* c-c++-common/goacc/acc-data-chain.c: New test.
* gfortran.dg/goacc/pr70828.f90: New test.
* gfortran.dg/goacc/pr70828-2.f90: New test.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/pr70828.c: New test.
* testsuite/libgomp.oacc-fortran/implicit_copy.f90: New test.
* testsuite/libgomp.oacc-fortran/pr70828.f90: New test.
* testsuite/libgomp.oacc-fortran/pr70828-2.f90: New test.
* testsuite/libgomp.oacc-fortran/pr70828-3.f90: New test.
* testsuite/libgomp.oacc-fortran/pr70828-4.f90: New test.
* testsuite/libgomp.oacc-fortran/pr70828-5.f90: New test.
* testsuite/libgomp.oacc-fortran/pr70828-6.f90: New test.
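The situation this entry addresses can be sketched as follows: an array slice is mapped on a data construct, and a nested compute construct uses the array without a data clause of its own. This is an illustrative example in the spirit of the tests listed above, not a copy of them; the names bump_slice_sum and N are made up.

```c
#include <assert.h>

enum { N = 40 };   /* hypothetical array size */

/* Add 1 to elements 10..29 of a local array on the device and return
   the sum of the whole array.  */
int bump_slice_sum(void)
{
  int a[N];
  int sum = 0;

  for (int i = 0; i < N; i++)
    a[i] = i;

  /* Only the slice a[10:20] (start 10, length 20) is mapped here.  */
#pragma acc data copy(a[10:20])
  {
    /* No data clause on the compute construct: the implicit mapping of
       `a` is expected to use the slice from the enclosing data region.  */
#pragma acc parallel loop
    for (int i = 10; i < 30; i++)
      a[i] += 1;
  }

  for (int i = 0; i < N; i++)
    sum += a[i];
  assert(sum >= 0);
  return sum;
}
```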
Julian Brown [Tue, 26 Feb 2019 22:12:06 +0000 (14:12 -0800)]
Default compute dimensions (compile time)
Typo fix relative to last posted version.
2018-10-05 Nathan Sidwell <nathan@acm.org>
Tom de Vries <tdevries@suse.de>
Thomas Schwinge <thomas@codesourcery.com>
Julian Brown <julian@codesourcery.com>
Julian Brown [Tue, 26 Feb 2019 21:39:03 +0000 (13:39 -0800)]
Generate sequential loop for OpenACC loop directive inside kernels
2019-09-20 Chung-Lin Tang <cltang@codesourcery.com>
Cesar Philippidis <cesar@codesourcery.com>
gcc/
* omp-expand.c (struct omp_region): Add inside_kernels_p field.
(expand_omp_for_generic): Adjust to generate a 'sequential' loop
when GOMP builtin arguments are BUILT_IN_NONE.
(expand_omp_for): Use expand_omp_for_generic to generate a
non-parallelized loop for OMP_FORs inside OpenACC kernels regions.
(expand_omp): Mark inside_kernels_p field true for regions
nested inside OpenACC kernels constructs.
gcc/testsuite/
* c-c++-common/goacc/kernels-loop-acc-loop.c: New test.
* c-c++-common/goacc/kernels-loop-2-acc-loop.c: New test.
* c-c++-common/goacc/kernels-loop-3-acc-loop.c: New test.
* c-c++-common/goacc/kernels-loop-n-acc-loop.c: New test.
* c-c++-common/goacc/kernels-acc-loop-reduction.c: New test.
* c-c++-common/goacc/kernels-acc-loop-smaller-equal.c: New test.
Julian Brown [Wed, 6 Mar 2019 22:44:56 +0000 (14:44 -0800)]
Reinstate kernels-restrict behaviour
This patch contains a small fix for upstream churn relative to the last version
posted.
2018-09-20 Cesar Philippidis <cesar@codesourcery.com>
Julian Brown <julian@codesourcery.com>
* omp-low.c (install_var_field): New base_pointer_restrict
argument.
(scan_sharing_clauses): Update call to install_var_field.
(omp_target_base_pointers_restrict_p): New function.
(scan_omp_target): Update call to install_var_field.
Julian Brown [Tue, 12 Feb 2019 23:14:22 +0000 (15:14 -0800)]
Various OpenACC reduction enhancements - test cases
2018-12-13 Cesar Philippidis <cesar@codesourcery.com>
Nathan Sidwell <nathan@acm.org>
Julian Brown <julian@codesourcery.com>
gcc/testsuite/
* c-c++-common/goacc/orphan-reductions-1.c: New test.
* c-c++-common/goacc/reduction-7.c: New test.
* c-c++-common/goacc/routine-4.c: Update.
* g++.dg/goacc/reductions-1.C: New test.
* gcc.dg/goacc/loop-processing-1.c: Update.
* gfortran.dg/goacc/orphan-reductions-1.f90: New test.
libgomp/
* libgomp.oacc-c-c++-common/par-reduction-3.c: New test.
* libgomp.oacc-c-c++-common/reduction-cplx-flt-2.c: New test.
* libgomp.oacc-fortran/reduction-9.f90: New test.
Julian Brown [Tue, 12 Feb 2019 23:06:55 +0000 (15:06 -0800)]
Various OpenACC reduction enhancements - ME and nvptx changes
Parts of the first posting got lost in the second posting, above.
This version hopefully contains everything.
2018-10-30 Cesar Philippidis <cesar@codesourcery.com>
gcc/
* config/nvptx/nvptx.c (nvptx_propagate_unified): New.
(nvptx_split_blocks): Call it for cond_uni insn.
(nvptx_expand_cond_uni): New.
(enum nvptx_builtins): Add NVPTX_BUILTIN_COND_UNI.
(nvptx_init_builtins): Initialize it.
(nvptx_expand_builtin):
(nvptx_generate_vector_shuffle): Change integral SHIFT operand to
tree BITS operand.
(nvptx_vector_reduction): New.
(nvptx_adjust_reduction_type): New.
(nvptx_goacc_reduction_setup): Use it to adjust the type of ref_to_res.
(nvptx_goacc_reduction_init): Don't update LHS if it doesn't exist.
(nvptx_goacc_reduction_fini): Call nvptx_vector_reduction for vector.
Use it to adjust the type of ref_to_res.
(nvptx_goacc_reduction_teardown):
* config/nvptx/nvptx.md (cond_uni): New pattern.
Julian Brown [Tue, 12 Feb 2019 22:56:12 +0000 (14:56 -0800)]
Various OpenACC reduction enhancements - FE changes
This version differs somewhat from the last version posted upstream
(and addresses some of Jakub's review comments).
2018-12-13 Cesar Philippidis <cesar@codesourcery.com>
Nathan Sidwell <nathan@acm.org>
Julian Brown <julian@codesourcery.com>
gcc/c/
* c-parser.c (c_parser_omp_variable_list): New c_omp_region_type
argument. Use it to specialize handling of OMP_CLAUSE_REDUCTION for
OpenACC.
(c_parser_oacc_data_clause): Add region-type argument.
(c_parser_oacc_data_clause_deviceptr): Likewise.
(c_parser_omp_clause_reduction): Change is_omp boolean parameter to
c_omp_region_type. Update call to c_parser_omp_variable_list.
(c_parser_oacc_all_clauses): Update calls to
c_parser_omp_clause_reduction.
(c_parser_omp_all_clauses): Likewise.
(c_parser_oacc_cache): Update call to c_parser_omp_var_list_parens.
* c-typeck.c (c_finish_omp_clauses): Emit an error on orphan OpenACC
gang reductions. Suppress user-defined reduction error for OpenACC.
gcc/cp/
* parser.c (cp_parser_omp_var_list_no_open): New c_omp_region_type
argument. Use it to specialize handling of OMP_CLAUSE_REDUCTION for
OpenACC.
(cp_parser_omp_var_list): Add c_omp_region_type argument. Update call
to cp_parser_omp_var_list_parens.
(cp_parser_oacc_data_clause): Update call to cp_parser_omp_var_list.
(cp_parser_omp_clause_reduction): Change is_omp boolean parameter to
c_omp_region_type. Update call to cp_parser_omp_var_list_no_open.
(cp_parser_oacc_all_clauses): Update call to
cp_parser_omp_clause_reduction.
(cp_parser_omp_all_clauses): Likewise.
* semantics.c (finish_omp_reduction_clause): Add c_omp_region_type
argument. Suppress user-defined reduction error for OpenACC.
(finish_omp_clauses): Emit an error on orphan OpenACC gang reductions.
gcc/fortran/
* openmp.c (oacc_is_parallel): New.
(resolve_oacc_loop_blocks): Emit an error on orphan OpenACC
gang reductions.
* trans-openmp.c (gfc_omp_clause_copy_ctor): Permit reductions.
Julian Brown [Tue, 12 Feb 2019 22:32:34 +0000 (14:32 -0800)]
Add OpenACC Fortran support for deviceptr and variable in common blocks
2018-06-29 Cesar Philippidis <cesar@codesourcery.com>
James Norris <jnorris@codesourcery.com>
gcc/fortran/
* openmp.c (gfc_match_omp_map_clause): Re-write handling of the
deviceptr clause. Add new common_blocks argument. Propagate it to
gfc_match_omp_variable_list.
(gfc_match_omp_clauses): Update calls to gfc_match_omp_map_clause.
(resolve_positive_int_expr): Promote the warning to an error.
(check_array_not_assumed): Remove pointer check.
(resolve_oacc_nested_loops): Error on do concurrent loops.
* trans-openmp.c (gfc_omp_finish_clause): Don't create pointer data
mappings for deviceptr clauses.
(gfc_trans_omp_clauses): Likewise.
gcc/
* gimplify.c (enum gimplify_omp_var_data): Add GOVD_DEVICEPTR.
(oacc_default_clause): Privatize fortran common blocks.
(omp_notice_variable): Add GOVD_DEVICEPTR attribute when appropriate.
Defer the expansion of DECL_VALUE_EXPR for common block decls.
(gimplify_scan_omp_clauses): Add GOVD_DEVICEPTR attribute when
appropriate.
(gimplify_adjust_omp_clauses_1): Set GOMP_MAP_FORCE_DEVICEPTR for
implicit deviceptr mappings.
Julian Brown [Tue, 12 Feb 2019 14:36:03 +0000 (06:36 -0800)]
Tweak error return value for acc_set_cuda_stream.
The return value of acc_set_cuda_stream is unspecified in OpenACC 2.6.
The testsuite changes might be unnecessary with the current async code.
libgomp/
* oacc-cuda.c (acc_set_cuda_stream): Return 0 on error/invalid
arguments.
* testsuite/libgomp.oacc-c-c++-common/lib-84.c: Handle unnumbered
async stream being an alias for a numbered async stream.
* testsuite/libgomp.oacc-c-c++-common/lib-85.c: Likewise.
gcc/c/
* c-typeck.c (handle_omp_array_sections_1): Add 'bool &non_contiguous'
parameter, adjust recursive call site, add cases for allowing
pointer based multi-dimensional arrays for OpenACC.
(handle_omp_array_sections): Adjust handle_omp_array_sections_1 call,
handle non-contiguous case to create dynamic array map.
gcc/cp/
* semantics.c (handle_omp_array_sections_1): Add 'bool &non_contiguous'
parameter, adjust recursive call site, add cases for allowing
pointer based multi-dimensional arrays for OpenACC.
(handle_omp_array_sections): Adjust handle_omp_array_sections_1 call,
handle non-contiguous case to create dynamic array map.
gcc/fortran/
* f95-lang.c (DEF_FUNCTION_TYPE_VAR_5): New symbol.
* types.def (BT_FN_VOID_INT_SIZE_PTR_PTR_PTR_VAR): New type.
gcc/
* builtin-types.def (BT_FN_VOID_INT_SIZE_PTR_PTR_PTR_VAR): New type.
* omp-builtins.def (BUILT_IN_GOACC_DATA_START): Adjust function type
to new BT_FN_VOID_INT_SIZE_PTR_PTR_PTR_VAR.
* gimplify.c (gimplify_scan_omp_clauses): Skip gimplification of
OMP_CLAUSE_SIZE of non-contiguous array maps (which is a TREE_LIST).
* omp-expand.c (expand_omp_target): Add non-contiguous array descriptor
pointers to variadic arguments.
* omp-low.c (append_field_to_record_type): New function.
(create_noncontig_array_descr_type): Likewise.
(create_noncontig_array_descr_init_code): Likewise.
(scan_sharing_clauses): For non-contiguous array map kinds, check for
supported dimension structure, and install non-contiguous array
variable into current omp_context.
(reorder_noncontig_array_clauses): New function.
(scan_omp_target): Call reorder_noncontig_array_clauses to place
non-contiguous array map clauses at beginning of clause sequence.
(lower_omp_target): Add handling for non-contiguous array map kinds,
add all created non-contiguous array descriptors to
gimple_omp_target_data_arg.
gcc/testsuite/
* c-c++-common/goacc/noncontig_array-1.c: New test.
libgomp/
* libgomp_g.h (GOACC_data_start): Add variadic '...' to declaration.
* libgomp.h (gomp_map_vars_openacc): New function declaration.
* oacc-int.h (struct goacc_ncarray_dim): New struct declaration.
(struct goacc_ncarray_descr_type): Likewise.
(struct goacc_ncarray): Likewise.
(struct goacc_ncarray_info): Likewise.
(goacc_noncontig_array_create_ptrblock): New function declaration.
* oacc-parallel.c (goacc_noncontig_array_count_rows): New function.
(goacc_noncontig_array_compute_sizes): Likewise.
(goacc_noncontig_array_fill_rows_1): Likewise.
(goacc_noncontig_array_fill_rows): Likewise.
(goacc_process_noncontiguous_arrays): Likewise.
(goacc_noncontig_array_create_ptrblock): Likewise.
(GOACC_parallel_keyed): Use goacc_process_noncontiguous_arrays to
handle non-contiguous array descriptors at end of varargs, adjust
to use gomp_map_vars_openacc.
(GOACC_data_start): Likewise. Adjust function type to accept varargs.
* target.c (gomp_map_vars_internal): Add struct goacc_ncarray_info *
nca_info parameter, add handling code for non-contiguous arrays.
(gomp_map_vars_openacc): Add new function for specialization of
gomp_map_vars_internal for OpenACC structured region usage.
* testsuite/libgomp.oacc-c-c++-common/noncontig_array-1.c: New test.
* testsuite/libgomp.oacc-c-c++-common/noncontig_array-2.c: New test.
* testsuite/libgomp.oacc-c-c++-common/noncontig_array-3.c: New test.
* testsuite/libgomp.oacc-c-c++-common/noncontig_array-4.c: New test.
* testsuite/libgomp.oacc-c-c++-common/noncontig_array-utils.h: Support
header for new tests.
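For orientation, a pointer-based multi-dimensional array of the kind these entries refer to might be used as below. This is a sketch under the assumption that a two-level section such as a[0:n][0:m] on an int ** is what the non-contiguous array support accepts; the function name add_scalar_2d is invented.

```c
#include <assert.h>
#include <stddef.h>

/* `a` is an array of n row pointers, each pointing at m ints.  The rows
   need not be adjacent in memory, so the data is non-contiguous.  */
void add_scalar_2d(int n, int m, int **a, int k)
{
  assert(a != NULL);

#pragma acc parallel loop copy(a[0:n][0:m])
  for (int i = 0; i < n; i++)
    for (int j = 0; j < m; j++)
      a[i][j] += k;
}
```

The rows can come from separate allocations, which is what distinguishes this from a flat int a[n][m] array.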
testsuite: Fix up gcc.target/s390/zero-scratch-regs-1.c
Depending on whether GCC is configured using --with-mode=zarch or not,
for the 31bit target instructions are generated either for ESA or
z/Architecture. For the sake of simplicity and robustness test only for
the latter by manually adding the option -mzarch.
gcc/testsuite/ChangeLog:
* gcc.target/s390/zero-scratch-regs-1.c: Force test to run for
z/Architecture only.
gcc/fortran/
PR fortran/100110
* trans-decl.c (gfc_get_symbol_decl): Replace test for host
association with a check that the current and symbol namespaces
are the same.
gcc/testsuite/
PR fortran/100110
* gfortran.dg/pdt_31.f03: New test.
* gfortran.dg/pdt_26.f03: Reduce 'builtin_malloc' count from 9
to 8.
libphobos: Fix SIGBUS in read_encoded_value_with_base on sparc-sun-solaris (PR98584)
Instead of unsafe pointer dereferencing, use memcpy() to read encoded
values from memory. The function `read_encoded_value' has been updated
to accept a ref parameter; this simplifies handling of the pointer to
memory needing to be read.
libphobos/ChangeLog:
PR d/98584
* libdruntime/gcc/deh.d (scanLSDA): Update calls to read_uleb128 and
read_encoded_value.
(actionTableLookup): Update calls to read_sleb128 and
read_encoded_value_with_base.
* libdruntime/gcc/unwind/pe.d (read_uleb128): Update signature.
(read_sleb128): Update signature.
(read_unaligned): New function.
(read_encoded_value_with_base): Update signature. Call read_unaligned
instead of unsafe pointer dereferencing.
(read_encoded_value): Update signature.
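The technique translates directly to C. The sketch below is a C analogue, not the D code from this commit: it reads a 32-bit value with memcpy() so that the source address does not need to be aligned, and it advances the caller's cursor through a pointer-to-pointer, standing in for the ref parameter. The name read_unaligned_u32 is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from a possibly unaligned address and advance the
   caller's cursor past it.  Dereferencing a cast pointer at such an
   address can raise SIGBUS on strict-alignment targets such as SPARC.  */
uint32_t read_unaligned_u32(const unsigned char **cursor)
{
  uint32_t value;

  assert(cursor != NULL && *cursor != NULL);
  memcpy(&value, *cursor, sizeof value);
  *cursor += sizeof value;
  return value;
}
```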
Andrew MacLeod [Fri, 16 Apr 2021 21:08:51 +0000 (17:08 -0400)]
tree-optimization/100081 - Limit depth of logical expression windback.
Limit how many logical expressions GORI will look back through when
evaluating an outgoing edge range.
PR tree-optimization/100081
* gimple-range-cache.h (ranger_cache): Inherit from gori_compute
rather than gori_compute_cache.
* gimple-range-gori.cc (is_gimple_logical_p): Move to top of file.
(range_def_chain::m_logical_depth): New member.
(range_def_chain::range_def_chain): Initialize m_logical_depth.
(range_def_chain::get_def_chain): Don't build defchains through more
than LOGICAL_LIMIT logical expressions.
* params.opt (param_ranger_logical_depth): New.
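The depth limit can be pictured with a toy expression tree. This is a simplified sketch, not the GORI implementation: struct expr, walk and chain_size are hypothetical, and the walk simply counts the nodes it reaches while refusing to look through more than `limit` logical nodes on any path.

```c
#include <assert.h>
#include <stddef.h>

/* Toy expression node; operands are NULL for leaves.  */
struct expr {
  int is_logical;                 /* nonzero for an AND/OR style node */
  const struct expr *op1, *op2;
};

/* Count the nodes reachable from e.  Once `limit` logical nodes have
   been passed on the current path, a further logical node is counted
   but its operands are not visited.  */
static int walk(const struct expr *e, int logical_depth, int limit)
{
  if (e == NULL)
    return 0;
  if (e->is_logical) {
    if (logical_depth >= limit)
      return 1;
    logical_depth++;
  }
  return 1 + walk(e->op1, logical_depth, limit)
           + walk(e->op2, logical_depth, limit);
}

int chain_size(const struct expr *e, int limit)
{
  assert(limit >= 0);
  return walk(e, 0, limit);
}
```

A small limit bounds the work on long chains of logical operations, at the cost of not looking at the operands beyond the cutoff.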
d: Fix ICE when formatting a string with '%' or '`' characters (PR98457)
The percentage character was being confused for a format specifier in
pp_format(), whilst the backtick character was confused for the
beginning of a quoted string in expand_d_format().
Both are now properly escaped to avoid the ICE.
gcc/d/ChangeLog:
PR d/98457
* d-diagnostic.cc (expand_d_format): Handle escaped backticks.
(escape_d_format): New function.
(verror): Call escape_d_format on prefixing strings.
(vdeprecation): Likewise.
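The '%' half of the problem can be sketched in C: before a user-supplied string is used as part of a printf-style format, each '%' is doubled so the formatter prints it literally. This is an illustration of the escaping idea only. The name escape_percent is hypothetical, and the backtick handling mentioned above is not shown.

```c
#include <assert.h>
#include <stddef.h>

/* Copy src into dst, writing "%%" for every '%'.  Returns the length of
   the escaped string, or (size_t)-1 if dst is too small.  */
size_t escape_percent(const char *src, char *dst, size_t dst_size)
{
  size_t out = 0;

  assert(src != NULL && dst != NULL);
  for (; *src != '\0'; src++) {
    size_t need = (*src == '%') ? 2 : 1;
    if (out + need >= dst_size)   /* keep room for the terminator */
      return (size_t)-1;
    if (*src == '%')
      dst[out++] = '%';
    dst[out++] = *src;
  }
  if (out >= dst_size)
    return (size_t)-1;
  dst[out] = '\0';
  return out;
}
```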
Richard Earnshaw [Mon, 19 Apr 2021 15:56:31 +0000 (16:56 +0100)]
arm: partial revert of r11-8168 [PR100067]
This is a partial revert of r11-8168. The overall purpose of the
commit is retained (to fix a bogus warning when -mfpu=<not-auto> is
used in combination with eg -mcpu=neoverse-v1), but it removes the
hunk that changed the subsequent feature bits for features of a
simd/fp unit that cannot be described by -mfpu. While I still think
that is the correct direction of travel, it's somewhat disruptive and
not appropriate for late stage4. I'll revisit for gcc-12.
gcc:
PR target/100067
* config/arm/arm.c (arm_configure_build_target): Do not strip
extended FPU/SIMD feature bits from the target ISA when -mfpu
is specified (partial revert of r11-8168).
Eric Botcazou [Mon, 19 Apr 2021 08:13:36 +0000 (10:13 +0200)]
Fix another -freorder-blocks-and-partition glitch with Windows SEH
Since GCC 8, the -freorder-blocks-and-partition pass can split a function
into hot and cold parts, thus generating 2 FDEs for a single function in
DWARF for exception purposes and doing an equivalent trick for Windows SEH.
Now the Windows system unwinder does not support arbitrarily large frames
and there is even a hard limit on the encoding of the CFI, which changes
the stack allocation strategy when it is exceeded and which must be reflected
everywhere.
gcc/
* config/i386/winnt.c (i386_pe_seh_cold_init): Properly deal with
frames larger than the SEH maximum frame size.
gcc/testsuite/
* gnat.dg/opt92.adb: New test.