Sandra Loosemore [Thu, 19 Mar 2020 15:32:24 +0000 (08:32 -0700)]
Additional Fortran testsuite fixes for kernels loops annotation pass.
These testsuite fixes are specific to the og10 branch, so are being
segregated from the ones that apply to mainline in a separate commit
from the main Fortran kernels loop annotation patch.
Sandra Loosemore [Tue, 17 Mar 2020 01:08:01 +0000 (18:08 -0700)]
Kernels loops annotation: Fortran.
This patch implements the Fortran support for adding "#pragma acc loop auto"
annotations to loops in OpenACC kernels regions. It implements the same
-fopenacc-kernels-annotate-loops and -Wopenacc-kernels-annotate-loops options
that were previously added (and documented) for the C/C++ front ends.
Sandra Loosemore [Sun, 15 Mar 2020 22:13:46 +0000 (15:13 -0700)]
Kernels loops annotation: C and C++.
This patch detects loops in kernels regions that are candidates for
parallelization, and adds "#pragma acc loop auto" annotations to them.
This annotation is controlled by the -fopenacc-kernels-annotate-loops
option, which is enabled by default. -Wopenacc-kernels-annotate-loops
can be used to produce diagnostics about loops that cannot be annotated.
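
As a rough illustration (not taken from the patch), consider a kernels region whose loop carries no explicit loop directive. According to the description above, with -fopenacc-kernels-annotate-loops enabled such a candidate loop would be treated as if "#pragma acc loop auto" had been written on it. The function name `scale` is hypothetical; without -fopenacc the pragma is ignored and the code runs serially.

```c
/* Hypothetical example: scale each element of A by FACTOR.
   The loop has no explicit "acc loop" directive; per the commit
   message, the annotation pass would treat it as if
   "#pragma acc loop auto" preceded it.  */
void
scale (float *a, int n, float factor)
{
#pragma acc kernels copy(a[0:n])
  {
    /* Candidate loop: simple bounds, no early exit.  */
    for (int i = 0; i < n; i++)
      a[i] *= factor;
  }
}
```

Building with -Wopenacc-kernels-annotate-loops should then report loops in kernels regions that could not be annotated.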
Add XFAIL for libgomp.oacc-c-c++-common/data-firstprivate-1.c
The firstprivate_int optimization changes the semantics of firstprivate
in this test, so XFAIL it until the correct semantics for firstprivate
have been decided (PR92036).
Fix test failure in routine-level-of-parallelism-2.c testcase
c-c++-common/goacc/routine-level-of-parallelism-2.c is supposed to be
equivalent to gfortran.dg/goacc/routine-level-of-parallelism-1.f90,
but is missing some test directives present in the latter.
Tobias Burnus [Mon, 16 Mar 2020 15:22:57 +0000 (16:22 +0100)]
Fix for is_gimple_reg vars to 'data kernels'
Nearly all variable mapping is moved from 'kernels' to a surrounding
'data kernels' and then 'force_present' mapped for the 'kernels'. However, as
libgomp.oacc-c-c++-common/declare-vla.c shows, moving 'int i, N' will fail as
there is a special case for is_gimple_reg in mapping and that fails badly if
outside a target region (e.g. offloading = false). As those are transferred by
value and not as a pointer, it makes more sense to only map them at
'kernels' and ignore them for 'data kernels'.
Additionally, as e.g. libgomp.oacc-c-c++-common/kernels-decompose-1.c shows,
one still has to handle 'kernels'-declared variables, which are now
declared in 'data kernels' and can be handled as is_gimple_reg.
gcc/
* omp-oacc-kernels-decompose.cc (maybe_build_inner_data_region):
is_gimple_reg vars are not yet mapped, fall through to map them as
before the transformation.
(omp_oacc_kernels_decompose_1): Don't map is_gimple_reg vars.
(decompose_kernels_region_body): Use tofrom for is_gimple_reg vars.
(omp_oacc_kernels_decompose_1): Handle is_gimple_reg vars as without
data kernels.
Julian Brown [Tue, 22 Oct 2019 00:22:31 +0000 (17:22 -0700)]
Run all kernels regions with GOMP_MAP_FORCE_TOFROM mappings synchronously
gcc/
* omp-oacc-kernels-decompose.cc (decompose_kernels_region_body): Add
inhibit_async parameter. Force asynchronous kernel launches to
run synchronously if they have problematic variable mappings.
Don't add explicit wait for decomposed kernels regions we forced
synchronous.
(omp_oacc_kernels_decompose_1): Detect problematic variable mappings,
and inhibit asynchronous execution if we find any.
Julian Brown [Tue, 26 Feb 2019 23:48:00 +0000 (15:48 -0800)]
Fortran "declare create"/allocate support for OpenACC
2018-10-04 Cesar Philippidis <cesar@codesourcery.com>
Julian Brown <julian@codesourcery.com>
gcc/
* omp-low.c (scan_sharing_clauses): Update handling of OpenACC declare
create, declare copyin and declare deviceptr to have local lifetimes.
(convert_to_firstprivate_int): Handle pointer types.
(convert_from_firstprivate_int): Likewise. Create local storage for
the values being pointed to. Add new orig_type argument.
(lower_omp_target): Handle GOMP_MAP_DECLARE_{ALLOCATE,DEALLOCATE}.
Add orig_type argument to convert_from_firstprivate_int call.
Allow pointer types with GOMP_MAP_FIRSTPRIVATE_INT. Don't privatize
firstprivate VLAs.
* tree-pretty-print.c (dump_omp_clause): Handle
GOMP_MAP_DECLARE_{ALLOCATE,DEALLOCATE}.
gcc/fortran/
* gfortran.h (enum gfc_omp_map_op): Add OMP_MAP_DECLARE_ALLOCATE,
OMP_MAP_DECLARE_DEALLOCATE.
(gfc_omp_clauses): Add update_allocatable.
* trans-array.c (gfc_array_allocate): Call
gfc_trans_oacc_declare_allocate for decls that have oacc_declare_create
attribute set.
* trans-decl.c (find_module_oacc_declare_clauses): Relax
oacc_declare_create to OMP_MAP_ALLOC, and oacc_declare_copyin to
OMP_MAP_TO, in order to match OpenACC 2.5 semantics.
* trans-openmp.c (gfc_trans_omp_clauses): Use GOMP_MAP_ALWAYS_POINTER
(for update directive) or GOMP_MAP_FIRSTPRIVATE_POINTER (otherwise) for
allocatable scalar decls. Handle OMP_MAP_DECLARE_{ALLOCATE,DEALLOCATE}
clauses.
(gfc_trans_oacc_executable_directive): Use GOMP_MAP_ALWAYS_POINTER
for allocatable scalar data clauses inside acc update directives.
(gfc_trans_oacc_declare_allocate): New function.
* trans-stmt.c (gfc_trans_allocate): Call
gfc_trans_oacc_declare_allocate for decls with oacc_declare_create
attribute set.
(gfc_trans_deallocate): Likewise.
* trans.h (gfc_trans_oacc_declare_allocate): Declare.
gcc/testsuite/
* gfortran.dg/goacc/declare-allocatable-1.f90: New test.
include/
* gomp-constants.h (enum gomp_map_kind): Define
GOMP_MAP_DECLARE_{ALLOCATE,DEALLOCATE} and GOMP_MAP_FLAG_SPECIAL_4.
libgomp/
* oacc-mem.c (gomp_acc_declare_allocate): New function.
* oacc-parallel.c (GOACC_enter_exit_data): Handle
GOMP_MAP_DECLARE_{ALLOCATE,DEALLOCATE}.
* testsuite/libgomp.oacc-fortran/allocatable-scalar.f90: New test.
* testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90: New test.
* testsuite/libgomp.oacc-fortran/declare-allocatable-2.f90: New test.
* testsuite/libgomp.oacc-fortran/declare-allocatable-3.f90: New test.
* testsuite/libgomp.oacc-fortran/declare-allocatable-4.f90: New test.
Julian Brown [Mon, 14 Oct 2019 20:12:39 +0000 (13:12 -0700)]
Re-do OpenACC private variable resolution
gcc/
* config/gcn/gcn-protos.h (gcn_goacc_adjust_gangprivate_decl): Rename
to...
(gcn_goacc_adjust_private_decl): ...this.
* config/gcn/gcn-tree.c (diagnostic-core.h): Include.
(gcn_goacc_adjust_gangprivate_decl): Rename to...
(gcn_goacc_adjust_private_decl): ...this. Add LEVEL parameter.
* config/gcn/gcn.c (TARGET_GOACC_ADJUST_GANGPRIVATE_DECL): Rename to...
(TARGET_GOACC_ADJUST_PRIVATE_DECL): ...this.
* config/nvptx/nvptx.c (tree-pretty-print.h): Include.
(nvptx_goacc_adjust_private_decl): New function.
(TARGET_GOACC_ADJUST_PRIVATE_DECL): Define hook using above function.
* doc/tm.texi.in (TARGET_GOACC_ADJUST_GANGPRIVATE_DECL): Rename to...
(TARGET_GOACC_ADJUST_PRIVATE_DECL): ...this.
* doc/tm.texi: Regenerated.
* internal-fn.c (expand_UNIQUE): Handle IFN_UNIQUE_OACC_PRIVATE.
* internal-fn.h (IFN_UNIQUE_CODES): Add OACC_PRIVATE.
* omp-low.c (omp_context): Remove oacc_partitioning_levels field.
(lower_oacc_reductions): Add PRIVATE_MARKER parameter. Insert before
fork.
(lower_oacc_head_tail): Add PRIVATE_MARKER parameter. Modify its
gimple call arguments as appropriate. Don't set
oacc_partitioning_levels in omp_context. Pass private_marker to
lower_oacc_reductions.
(oacc_record_private_var_clauses): Don't check for NULL ctx.
(mark_oacc_gangprivate): Remove unused function.
(make_oacc_private_marker): New function.
(lower_omp_for): Only call oacc_record_vars_in_bind for
OpenACC contexts. Create private marker and pass to
lower_oacc_head_tail.
(lower_omp_target): Remove unnecessary call to
oacc_record_private_var_clauses. Remove call to mark_oacc_gangprivate.
Create private marker and pass to lower_oacc_reductions.
(process_oacc_gangprivate_1): Remove.
(lower_omp_1): Only call oacc_record_vars_in_bind for OpenACC. Don't
iterate over contexts calling process_oacc_gangprivate_1.
* omp-offload.c (oacc_loop_xform_head_tail): Treat
private-variable markers like fork/join when transforming head/tail
sequences.
(execute_oacc_device_lower): Use IFN_UNIQUE_OACC_PRIVATE instead of
"oacc gangprivate" attributes to determine partitioning level of
variables. Remove unused variables.
* omp-sese.c (find_gangprivate_vars): New function.
(find_local_vars_to_propagate): Use GANGPRIVATE_VARS parameter instead
of "oacc gangprivate" attribute to determine which variables are
gang-private.
(oacc_do_neutering): Use find_gangprivate_vars.
* target.def (adjust_gangprivate_decl): Rename to...
(adjust_private_decl): ...this. Update documentation (briefly).
libgomp/
* testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90: Use
oaccdevlow dump and update scanned output.
* testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90: Likewise.
Add missing atomic to force worker partitioning for test variable.
Julian Brown [Fri, 20 Sep 2019 20:53:10 +0000 (13:53 -0700)]
Handle references in OpenACC "private" clauses
gcc/
* gimplify.c (localize_reductions): Rewrite references for
OMP_CLAUSE_PRIVATE also.
libgomp/
* testsuite/libgomp.oacc-fortran/privatized-ref-1.f95: New test.
* testsuite/libgomp.oacc-c++/privatized-ref-2.C: New test.
* testsuite/libgomp.oacc-c++/privatized-ref-3.C: New test.
Julian Brown [Fri, 9 Aug 2019 20:01:33 +0000 (13:01 -0700)]
Wait at end of OpenACC asynchronous kernels regions
gcc/
* omp-oacc-kernels-decompose.cc (add_wait): New function, split out
of...
(add_async_clauses_and_wait): ...here. Call new outlined function.
(decompose_kernels_region_body): Add wait at the end of
explicitly-asynchronous kernels regions.
Julian Brown [Fri, 5 Jul 2019 01:14:41 +0000 (18:14 -0700)]
Assumed-size arrays with non-lexical data mappings
gcc/
* gimplify.c (gimplify_adjust_omp_clauses_1): Raise error for
assumed-size arrays in map clauses for Fortran/OpenMP.
* omp-low.c (lower_omp_target): Set the size of assumed-size Fortran
arrays to one to allow use of data already mapped on the offload device.
gcc/fortran/
* trans-openmp.c (gfc_omp_finish_clause): Change clauses mapping
assumed-size arrays to use the GOMP_MAP_FORCE_PRESENT map type.
Julian Brown [Tue, 28 May 2019 15:42:10 +0000 (08:42 -0700)]
Apply gangprivate attribute to innermost decl
...and fix parallelism-level calculation when applying the attribute.
gcc/
* omp-low.c (mark_oacc_gangprivate): Add CTX parameter. Use to look up
correct decl to add attribute to.
(lower_omp_for): Move "oacc gangprivate" processing from here...
(process_oacc_gangprivate_1): ...to here. New function.
(lower_omp_target): Update call to mark_oacc_gangprivate.
(execute_lower_omp): Call process_oacc_gangprivate_1 for each OMP
context.
libgomp/
* testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90: New test.
Julian Brown [Sun, 19 May 2019 17:42:20 +0000 (10:42 -0700)]
Fix references declared in lexically-enclosing OpenACC data region
gcc/fortran/
* trans-openmp.c (gfc_omp_finish_clause): Guard addition of clauses for
pointers with DECL_P.
gcc/
* gimplify.c (oacc_array_mapping_info): Add REF field.
(gimplify_scan_omp_clauses): Initialise above field for data blocks
passed by reference.
(gomp_oacc_needs_data_present): Handle references.
(gimplify_adjust_omp_clauses_1): Handle references and optional
arguments for variables declared in lexically-enclosing OpenACC data
region.
Julian Brown [Thu, 16 May 2019 12:45:35 +0000 (05:45 -0700)]
Avoid introducing 'create' mapping clauses for loop index variables in kernels regions
gcc/
* omp-oacc-kernels-decompose.cc (find_omp_for_index_vars_1,
find_omp_for_index_vars): New functions.
(maybe_build_inner_data_region): Add IDX_VARS argument. Don't add
CREATE mapping clauses for loop index variables. Set TREE_ADDRESSABLE
flag on newly-mapped declarations as a side effect.
(decompose_kernels_region_body): Call find_omp_for_index_vars. Don't
create PRESENT clause for loop index variables. Pass index variable
set to maybe_build_inner_data_region.
Julian Brown [Wed, 9 Jan 2019 11:41:04 +0000 (03:41 -0800)]
Update OpenACC version to 2.6
gcc/c-family/
* c-cppbuiltin.c (c_cpp_builtins): Update _OPENACC define to 201711.
gcc/doc/
* invoke.texi: Update mention of OpenACC version to 2.6.
gcc/fortran/
* cpp.c (cpp_define_builtins): Update _OPENACC define to 201711.
* gfortran.texi: Update mentions of OpenACC version to 2.6.
* intrinsic.texi: Likewise.
gcc/testsuite/
* c-c++-common/cpp/openacc-define-3.c: Update expected value for
_OPENACC define.
* gfortran.dg/openacc-define-3.f90: Likewise.
libgomp/
* libgomp.texi: Update mentions of OpenACC version to 2.6. Update
section numbers to match version 2.6 of the spec.
* openacc.f90 (openacc_version): Update to 201711.
* openacc_lib.h (openacc_version): Update to 201711.
* testsuite/libgomp.oacc-fortran/openacc_version-1.f: Update expected
openacc_version to 201711.
* testsuite/libgomp.oacc-fortran/openacc_version-2.f90: Likewise.
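
A minimal sketch of how user code can observe the updated version macro. The helper name `openacc_version_or_zero` is hypothetical; `_OPENACC` is only defined when OpenACC support is enabled (e.g. with -fopenacc), and per the ChangeLog above it expands to 201711 after this change.

```c
/* Return the OpenACC version advertised by the compiler, or 0 when
   OpenACC support is not enabled for this translation unit.  */
long
openacc_version_or_zero (void)
{
#ifdef _OPENACC
  return _OPENACC;	/* 201711 corresponds to OpenACC 2.6.  */
#else
  return 0;
#endif
}
```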
[nvptx] Expand OpenACC child function arguments to use CUDA params space
This patch replaces the patches "Use functional parameters for data mappings
in OpenACC child functions" and "Make OpenACC function-parameter explosion
optional".
* config/nvptx/nvptx.c (nvptx_expand_to_rtl_hook): New function
implementing CUDA .params space transformation.
(TARGET_EXPAND_TO_RTL_HOOK): Implement hook with
nvptx_expand_to_rtl_hook.
Disable AC_PROG_CXX and consequently a libstdc++ dependency for libffi,
introduced with upstream libffi commit 7d698125b1f0 ("Use the proper C++
compiler to run C++ tests"). This is only needed for the libffi test
suite, which we don't have to support in the GCC tree, as libffi is
maintained as a separate project. The dependency causes a build failure
with the `powerpc64le-linux-gnu' target due to a circular dependency:
libffi would depend on libstdc++, libstdc++ depends on libgomp, and
libgomp depends on libffi since commit 998eb38b265d ("Use functional
parameters for data mappings in OpenACC child functions").
Julian Brown [Thu, 21 Mar 2019 22:09:24 +0000 (15:09 -0700)]
Add support for gang local storage allocation in shared memory
2018-12-11 Julian Brown <julian@codesourcery.com>
Chung-Lin Tang <cltang@codesourcery.com>
gcc/
* config/nvptx/nvptx.c (tree-hash-traits.h): Include.
(gangprivate_shared_size): New global variable.
(gangprivate_shared_align): Likewise.
(gangprivate_shared_sym): Likewise.
(gangprivate_shared_hmap): Likewise.
(nvptx_option_override): Initialize gangprivate_shared_sym,
gangprivate_shared_align.
(nvptx_file_end): Output gangprivate_shared_sym.
(nvptx_goacc_expand_accel_var): New function.
(nvptx_set_current_function): New function.
(TARGET_SET_CURRENT_FUNCTION): Define hook.
(TARGET_GOACC_EXPAND_ACCEL): Likewise.
* doc/tm.texi (TARGET_GOACC_EXPAND_ACCEL_VAR): Document new hook.
* doc/tm.texi.in (TARGET_GOACC_EXPAND_ACCEL_VAR): Likewise.
* expr.c (expand_expr_real_1): Remap decls marked with the
"oacc gangprivate" attribute.
* omp-low.c (omp_context): Add oacc_partitioning_level and
oacc_addressable_var_decls fields.
(new_omp_context): Initialize oacc_addressable_var_decls in new
omp_context.
(delete_omp_context): Delete oacc_addressable_var_decls in old
omp_context.
(lower_oacc_head_tail): Record partitioning-level count in omp context.
(oacc_record_private_var_clauses, oacc_record_vars_in_bind)
(mark_oacc_gangprivate): New functions.
(lower_omp_for): Call oacc_record_private_var_clauses with "for"
clauses. Call mark_oacc_gangprivate for gang-partitioned loops.
(lower_omp_target): Call oacc_record_private_var_clauses with "target"
clauses.
Call mark_oacc_gangprivate for offloaded target regions.
(lower_omp_1): Call vars_in_bind for GIMPLE_BIND within OMP regions.
* target.def (expand_accel_var): New hook.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/gang-private-1.c: New test.
* testsuite/libgomp.oacc-c-c++-common/loop-gwv-2.c: New test.
* testsuite/libgomp.oacc-c/pr85465.c: New test.
* testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90: New test.
gcc/testsuite/
* c-c++-common/goacc/reduction-8.c: New test.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/privatize-reduction-1.c: New
test.
* testsuite/libgomp.oacc-c-c++-common/privatize-reduction-2.c: New
test.
Julian Brown [Tue, 26 Feb 2019 23:55:23 +0000 (15:55 -0800)]
Don't mark OpenACC auto loops as independent inside acc parallel regions
2018-09-20 Cesar Philippidis <cesar@codesourcery.com>
gcc/
* omp-low.c (lower_oacc_head_mark): Don't mark OpenACC auto
loops as independent inside acc parallel regions.
gcc/testsuite/
* c-c++-common/goacc/loop-auto-1.c: Adjust test case to conform to
the new behavior of the auto clause in OpenACC 2.5.
* c-c++-common/goacc/loop-auto-2.c: Likewise.
* gcc.dg/goacc/loop-processing-1.c: Likewise.
* c-c++-common/goacc/loop-auto-3.c: New test.
* gfortran.dg/goacc/loop-auto-1.f90: New test.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/loop-auto-1.c: Adjust test case
to conform to the new behavior of the auto clause in OpenACC 2.5.
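
To illustrate why the distinction matters, here is a hedged sketch (not one of the adjusted test cases) of a loop with a loop-carried dependence inside a parallel region. With an explicit auto clause the compiler has to analyse the loop itself rather than assume the iterations are independent, so a loop like this one should keep its sequential semantics. The function name `prefix_sum` is hypothetical; without -fopenacc the pragmas are ignored.

```c
/* Hypothetical example: in-place running sum.  Each iteration reads
   the result of the previous one, so the iterations are NOT
   independent and must not be run in parallel.  */
void
prefix_sum (int *a, int n)
{
#pragma acc parallel copy(a[0:n])
  {
#pragma acc loop auto
    for (int i = 1; i < n; i++)
      a[i] += a[i - 1];
  }
}
```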
Julian Brown [Tue, 26 Feb 2019 23:10:21 +0000 (15:10 -0800)]
Enable GOMP_MAP_FIRSTPRIVATE_INT for OpenACC
2018-12-22 Cesar Philippidis <cesar@codesourcery.com>
Julian Brown <julian@codesourcery.com>
Tobias Burnus <tobias@codesourcery.com>
gcc/
* omp-low.c (maybe_lookup_field_in_outer_ctx): New function.
(convert_to_firstprivate_int): New function.
(convert_from_firstprivate_int): New function.
(lower_omp_target): Enable GOMP_MAP_FIRSTPRIVATE_INT in OpenACC.
Remove unused variable.
libgomp/
* oacc-parallel.c (GOACC_parallel_keyed): Handle
GOMP_MAP_FIRSTPRIVATE_INT host addresses.
* plugin/plugin-nvptx.c (nvptx_exec): Handle
GOMP_MAP_FIRSTPRIVATE_INT host addresses.
* testsuite/libgomp.oacc-c++/firstprivate-int.C: New test.
* testsuite/libgomp.oacc-c-c++-common/firstprivate-int.c: New
test.
* testsuite/libgomp.oacc-fortran/firstprivate-int.f90: New test.
Julian Brown [Tue, 26 Feb 2019 22:22:41 +0000 (14:22 -0800)]
Fix implicit mapping for array slices on lexically-enclosing data constructs (PR70828)
2018-08-28 Julian Brown <julian@codesourcery.com>
Cesar Philippidis <cesar@codesourcery.com>
gcc/
* gimplify.c (oacc_array_mapping_info): New struct.
(gimplify_omp_ctx): Add decl_data_clause hash map.
(new_omp_context): Zero-initialise above.
(delete_omp_context): Delete above if allocated.
(gimplify_scan_omp_clauses): Scan for array mappings on data constructs,
and record in above map.
(gomp_oacc_needs_data_present): New function.
(gimplify_adjust_omp_clauses_1): Handle data mappings (e.g. array
slices) declared in lexically-enclosing data constructs.
* omp-low.c (lower_omp_target): Allow decl for bias not to be present
in OpenACC context.
gcc/testsuite/
* c-c++-common/goacc/acc-data-chain.c: New test.
* gfortran.dg/goacc/pr70828.f90: New test.
* gfortran.dg/goacc/pr70828-2.f90: New test.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/pr70828.c: New test.
* testsuite/libgomp.oacc-fortran/implicit_copy.f90: New test.
* testsuite/libgomp.oacc-fortran/pr70828.f90: New test.
* testsuite/libgomp.oacc-fortran/pr70828-2.f90: New test.
* testsuite/libgomp.oacc-fortran/pr70828-3.f90: New test.
* testsuite/libgomp.oacc-fortran/pr70828-4.f90: New test.
* testsuite/libgomp.oacc-fortran/pr70828-5.f90: New test.
* testsuite/libgomp.oacc-fortran/pr70828-6.f90: New test.
Julian Brown [Tue, 26 Feb 2019 22:12:06 +0000 (14:12 -0800)]
Default compute dimensions (compile time)
Typo fix relative to last posted version.
2018-10-05 Nathan Sidwell <nathan@acm.org>
Tom de Vries <tdevries@suse.de>
Thomas Schwinge <thomas@codesourcery.com>
Julian Brown <julian@codesourcery.com>
Julian Brown [Tue, 26 Feb 2019 21:39:03 +0000 (13:39 -0800)]
Generate sequential loop for OpenACC loop directive inside kernels
2019-09-20 Chung-Lin Tang <cltang@codesourcery.com>
Cesar Philippidis <cesar@codesourcery.com>
gcc/
* omp-expand.c (struct omp_region): Add inside_kernels_p field.
(expand_omp_for_generic): Adjust to generate a 'sequential' loop
when GOMP builtin arguments are BUILT_IN_NONE.
(expand_omp_for): Use expand_omp_for_generic to generate a
non-parallelized loop for OMP_FORs inside OpenACC kernels regions.
(expand_omp): Mark inside_kernels_p field true for regions
nested inside OpenACC kernels constructs.
gcc/testsuite/
* c-c++-common/goacc/kernels-loop-acc-loop.c: New test.
* c-c++-common/goacc/kernels-loop-2-acc-loop.c: New test.
* c-c++-common/goacc/kernels-loop-3-acc-loop.c: New test.
* c-c++-common/goacc/kernels-loop-n-acc-loop.c: New test.
* c-c++-common/goacc/kernels-acc-loop-reduction.c: New test.
* c-c++-common/goacc/kernels-acc-loop-smaller-equal.c: New test.
Julian Brown [Wed, 6 Mar 2019 22:44:56 +0000 (14:44 -0800)]
Reinstate kernels-restrict behaviour
This patch contains a small fix for upstream churn relative to the last version
posted.
2018-09-20 Cesar Philippidis <cesar@codesourcery.com>
Julian Brown <julian@codesourcery.com>
* omp-low.c (install_var_field): New base_pointer_restrict
argument.
(scan_sharing_clauses): Update call to install_var_field.
(omp_target_base_pointers_restrict_p): New function.
(scan_omp_target): Update call to install_var_field.
Julian Brown [Tue, 12 Feb 2019 23:14:22 +0000 (15:14 -0800)]
Various OpenACC reduction enhancements - test cases
2018-12-13 Cesar Philippidis <cesar@codesourcery.com>
Nathan Sidwell <nathan@acm.org>
Julian Brown <julian@codesourcery.com>
gcc/testsuite/
* c-c++-common/goacc/orphan-reductions-1.c: New test.
* c-c++-common/goacc/reduction-7.c: New test.
* c-c++-common/goacc/routine-4.c: Update.
* g++.dg/goacc/reductions-1.C: New test.
* gcc.dg/goacc/loop-processing-1.c: Update.
* gfortran.dg/goacc/orphan-reductions-1.f90: New test.
libgomp/
* libgomp.oacc-c-c++-common/par-reduction-3.c: New test.
* libgomp.oacc-c-c++-common/reduction-cplx-flt-2.c: New test.
* libgomp.oacc-fortran/reduction-9.f90: New test.
Julian Brown [Tue, 12 Feb 2019 23:06:55 +0000 (15:06 -0800)]
Various OpenACC reduction enhancements - ME and nvptx changes
Parts of the first posting got lost in the second posting, above.
This version hopefully contains everything.
2018-10-30 Cesar Philippidis <cesar@codesourcery.com>
gcc/
* config/nvptx/nvptx.c (nvptx_propagate_unified): New.
(nvptx_split_blocks): Call it for cond_uni insn.
(nvptx_expand_cond_uni): New.
(enum nvptx_builtins): Add NVPTX_BUILTIN_COND_UNI.
(nvptx_init_builtins): Initialize it.
(nvptx_expand_builtin):
(nvptx_generate_vector_shuffle): Change integral SHIFT operand to
tree BITS operand.
(nvptx_vector_reduction): New.
(nvptx_adjust_reduction_type): New.
(nvptx_goacc_reduction_setup): Use it to adjust the type of ref_to_res.
(nvptx_goacc_reduction_init): Don't update LHS if it doesn't exist.
(nvptx_goacc_reduction_fini): Call nvptx_vector_reduction for vector.
Use it to adjust the type of ref_to_res.
(nvptx_goacc_reduction_teardown):
* config/nvptx/nvptx.md (cond_uni): New pattern.
Julian Brown [Tue, 12 Feb 2019 22:56:12 +0000 (14:56 -0800)]
Various OpenACC reduction enhancements - FE changes
This version differs somewhat from the last version posted upstream
(and addresses some of Jakub's review comments).
2018-12-13 Cesar Philippidis <cesar@codesourcery.com>
Nathan Sidwell <nathan@acm.org>
Julian Brown <julian@codesourcery.com>
gcc/c/
* c-parser.c (c_parser_omp_variable_list): New c_omp_region_type
argument. Use it to specialize handling of OMP_CLAUSE_REDUCTION for
OpenACC.
(c_parser_oacc_data_clause): Add region-type argument.
(c_parser_oacc_data_clause_deviceptr): Likewise.
(c_parser_omp_clause_reduction): Change is_omp boolean parameter to
c_omp_region_type. Update call to c_parser_omp_variable_list.
(c_parser_oacc_all_clauses): Update calls to
c_parser_omp_clause_reduction.
(c_parser_omp_all_clauses): Likewise.
(c_parser_oacc_cache): Update call to c_parser_omp_var_list_parens.
* c-typeck.c (c_finish_omp_clauses): Emit an error on orphan OpenACC
gang reductions. Suppress user-defined reduction error for OpenACC.
gcc/cp/
* parser.c (cp_parser_omp_var_list_no_open): New c_omp_region_type
argument. Use it to specialize handling of OMP_CLAUSE_REDUCTION for
OpenACC.
(cp_parser_omp_var_list): Add c_omp_region_type argument. Update call
to cp_parser_omp_var_list_parens.
(cp_parser_oacc_data_clause): Update call to cp_parser_omp_var_list.
(cp_parser_omp_clause_reduction): Change is_omp boolean parameter to
c_omp_region_type. Update call to cp_parser_omp_var_list_no_open.
(cp_parser_oacc_all_clauses): Update call to
cp_parser_omp_clause_reduction.
(cp_parser_omp_all_clauses): Likewise.
* semantics.c (finish_omp_reduction_clause): Add c_omp_region_type
argument. Suppress user-defined reduction error for OpenACC.
(finish_omp_clauses): Emit an error on orphan OpenACC gang reductions.
gcc/fortran/
* openmp.c (oacc_is_parallel): New.
(resolve_oacc_loop_blocks): Emit an error on orphan OpenACC
gang reductions.
* trans-openmp.c (gfc_omp_clause_copy_ctor): Permit reductions.
Julian Brown [Tue, 12 Feb 2019 22:32:34 +0000 (14:32 -0800)]
Add OpenACC Fortran support for deviceptr and variable in common blocks
2018-06-29 Cesar Philippidis <cesar@codesourcery.com>
James Norris <jnorris@codesourcery.com>
gcc/fortran/
* openmp.c (gfc_match_omp_map_clause): Re-write handling of the
deviceptr clause. Add new common_blocks argument. Propagate it to
gfc_match_omp_variable_list.
(gfc_match_omp_clauses): Update calls to gfc_match_omp_map_clauses.
(resolve_positive_int_expr): Promote the warning to an error.
(check_array_not_assumed): Remove pointer check.
(resolve_oacc_nested_loops): Error on do concurrent loops.
* trans-openmp.c (gfc_omp_finish_clause): Don't create pointer data
mappings for deviceptr clauses.
(gfc_trans_omp_clauses): Likewise.
gcc/
* gimplify.c (enum gimplify_omp_var_data): Add GOVD_DEVICEPTR.
(oacc_default_clause): Privatize Fortran common blocks.
(omp_notice_variable): Add GOVD_DEVICEPTR attribute when appropriate.
Defer the expansion of DECL_VALUE_EXPR for common block decls.
(gimplify_scan_omp_clauses): Add GOVD_DEVICEPTR attribute when
appropriate.
(gimplify_adjust_omp_clauses_1): Set GOMP_MAP_FORCE_DEVICEPTR for
implicit deviceptr mappings.
Julian Brown [Tue, 12 Feb 2019 14:36:03 +0000 (06:36 -0800)]
Tweak error return value for acc_set_cuda_stream.
The return value of acc_set_cuda_stream is unspecified in OpenACC 2.6.
The testsuite changes might be unnecessary with the current async code.
libgomp/
* oacc-cuda.c (acc_set_cuda_stream): Return 0 on error/invalid
arguments.
* testsuite/libgomp.oacc-c-c++-common/lib-84.c: Handle unnumbered
async stream being an alias for a numbered async stream.
* testsuite/libgomp.oacc-c-c++-common/lib-85.c: Likewise.
gcc/c/
* c-typeck.c (handle_omp_array_sections_1): Add 'bool &non_contiguous'
parameter, adjust recursive call site, add cases for allowing
pointer based multi-dimensional arrays for OpenACC.
(handle_omp_array_sections): Adjust handle_omp_array_sections_1 call,
handle non-contiguous case to create dynamic array map.
gcc/cp/
* semantics.c (handle_omp_array_sections_1): Add 'bool &non_contiguous'
parameter, adjust recursive call site, add cases for allowing
pointer based multi-dimensional arrays for OpenACC.
(handle_omp_array_sections): Adjust handle_omp_array_sections_1 call,
handle non-contiguous case to create dynamic array map.
gcc/fortran/
* f95-lang.c (DEF_FUNCTION_TYPE_VAR_5): New symbol.
* types.def (BT_FN_VOID_INT_SIZE_PTR_PTR_PTR_VAR): New type.
gcc/
* builtin-types.def (BT_FN_VOID_INT_SIZE_PTR_PTR_PTR_VAR): New type.
* omp-builtins.def (BUILT_IN_GOACC_DATA_START): Adjust function type
to new BT_FN_VOID_INT_SIZE_PTR_PTR_PTR_VAR.
* gimplify.c (gimplify_scan_omp_clauses): Skip gimplification of
OMP_CLAUSE_SIZE of non-contiguous array maps (which is a TREE_LIST).
* omp-expand.c (expand_omp_target): Add non-contiguous array descriptor
pointers to variadic arguments.
* omp-low.c (append_field_to_record_type): New function.
(create_noncontig_array_descr_type): Likewise.
(create_noncontig_array_descr_init_code): Likewise.
(scan_sharing_clauses): For non-contiguous array map kinds, check for
supported dimension structure, and install non-contiguous array
variable into current omp_context.
(reorder_noncontig_array_clauses): New function.
(scan_omp_target): Call reorder_noncontig_array_clauses to place
non-contiguous array map clauses at beginning of clause sequence.
(lower_omp_target): Add handling for non-contiguous array map kinds,
add all created non-contiguous array descriptors to
gimple_omp_target_data_arg.
gcc/testsuite/
* c-c++-common/goacc/noncontig_array-1.c: New test.
libgomp/
* libgomp_g.h (GOACC_data_start): Add variadic '...' to declaration.
* libgomp.h (gomp_map_vars_openacc): New function declaration.
* oacc-int.h (struct goacc_ncarray_dim): New struct declaration.
(struct goacc_ncarray_descr_type): Likewise.
(struct goacc_ncarray): Likewise.
(struct goacc_ncarray_info): Likewise.
(goacc_noncontig_array_create_ptrblock): New function declaration.
* oacc-parallel.c (goacc_noncontig_array_count_rows): New function.
(goacc_noncontig_array_compute_sizes): Likewise.
(goacc_noncontig_array_fill_rows_1): Likewise.
(goacc_noncontig_array_fill_rows): Likewise.
(goacc_process_noncontiguous_arrays): Likewise.
(goacc_noncontig_array_create_ptrblock): Likewise.
(GOACC_parallel_keyed): Use goacc_process_noncontiguous_arrays to
handle non-contiguous array descriptors at end of varargs, adjust
to use gomp_map_vars_openacc.
(GOACC_data_start): Likewise. Adjust function type to accept varargs.
* target.c (gomp_map_vars_internal): Add struct goacc_ncarray_info *
nca_info parameter, add handling code for non-contiguous arrays.
(gomp_map_vars_openacc): Add new function for specialization of
gomp_map_vars_internal for OpenACC structured region usage.
* testsuite/libgomp.oacc-c-c++-common/noncontig_array-1.c: New test.
* testsuite/libgomp.oacc-c-c++-common/noncontig_array-2.c: New test.
* testsuite/libgomp.oacc-c-c++-common/noncontig_array-3.c: New test.
* testsuite/libgomp.oacc-c-c++-common/noncontig_array-4.c: New test.
* testsuite/libgomp.oacc-c-c++-common/noncontig_array-utils.h: Support
header for new tests.
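
The kind of source this change targets can be sketched as follows. This is an assumption-laden illustration, not one of the new tests: `sum_matrix` is a hypothetical name, and the two-dimensional section `a[0:n][0:m]` on a pointer-to-pointer is expected to be accepted only by a compiler that includes this non-contiguous array support. Without -fopenacc the pragma is ignored and the code runs serially.

```c
/* Hypothetical example: A is an array of N row pointers, each row
   holding M ints.  The rows need not be adjacent in memory, so the
   mapped data is non-contiguous.  */
int
sum_matrix (int **a, int n, int m)
{
  int sum = 0;
#pragma acc parallel loop copyin(a[0:n][0:m]) reduction(+:sum)
  for (int i = 0; i < n; i++)
    for (int j = 0; j < m; j++)
      sum += a[i][j];
  return sum;
}
```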
testsuite: Fix up gcc.target/s390/zero-scratch-regs-1.c
Depending on whether GCC is configured using --with-mode=zarch or not,
instructions for the 31-bit target are generated either for ESA or for
z/Architecture. For the sake of simplicity and robustness, test only
the latter by manually adding the option -mzarch.
gcc/testsuite/ChangeLog:
* gcc.target/s390/zero-scratch-regs-1.c: Force test to run for
z/Architecture only.
gcc/fortran
PR fortran/100110
* trans-decl.c (gfc_get_symbol_decl): Replace test for host
association with a check that the current and symbol namespaces
are the same.
gcc/testsuite/
PR fortran/100110
* gfortran.dg/pdt_31.f03: New test.
* gfortran.dg/pdt_26.f03: Reduce 'builtin_malloc' count from 9
to 8.
libphobos: Fix SIGBUS in read_encoded_value_with_base on sparc-sun-solaris (PR98584)
Instead of unsafe pointer dereferencing, use memcpy() to read encoded
values from memory. The function `read_encoded_value' has been updated
to accept a ref parameter, which simplifies handling of the pointer to
the memory that needs to be read.
libphobos/ChangeLog:
PR d/98584
* libdruntime/gcc/deh.d (scanLSDA): Update calls to read_uleb128 and
read_encoded_value.
(actionTableLookup): Update calls to read_sleb128 and
read_encoded_value_with_base.
* libdruntime/gcc/unwind/pe.d (read_uleb128): Update signature.
(read_sleb128): Update signature.
(read_unaligned): New function.
(read_encoded_value_with_base): Update signature. Call read_unaligned
instead of unsafe pointer dereferencing.
(read_encoded_value): Update signature.
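
The technique can be sketched in C (the actual fix is in D, so this is only an analogue, and the name `read_unaligned_u32` is hypothetical). Dereferencing a cast pointer at a misaligned address can raise SIGBUS on strict-alignment targets such as SPARC, whereas memcpy() has no alignment requirement. Passing the cursor by address mirrors the ref parameter mentioned above.

```c
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from a possibly misaligned address and advance
   the cursor past it.  memcpy is used instead of
   *(const uint32_t *) *cursor, which may trap when misaligned.  */
uint32_t
read_unaligned_u32 (const unsigned char **cursor)
{
  uint32_t value;
  memcpy (&value, *cursor, sizeof value);
  *cursor += sizeof value;
  return value;
}
```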
Andrew MacLeod [Fri, 16 Apr 2021 21:08:51 +0000 (17:08 -0400)]
tree-optimization/100081 - Limit depth of logical expression windback.
Limit how many logical expressions GORI will look back through when
evaluating outgoing edge range.
PR tree-optimization/100081
* gimple-range-cache.h (ranger_cache): Inherit from gori_compute
rather than gori_compute_cache.
* gimple-range-gori.cc (is_gimple_logical_p): Move to top of file.
(range_def_chain::m_logical_depth): New member.
(range_def_chain::range_def_chain): Initialize m_logical_depth.
(range_def_chain::get_def_chain): Don't build defchains through more
than LOGICAL_LIMIT logical expressions.
* params.opt (param_ranger_logical_depth): New.
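The idea of capping how far a walk looks back through logical expressions
can be sketched as follows. This is not the GORI code: the struct expr
node type, the count_chain function, and the limit value of 3 are all
assumptions made for illustration. Only the general shape, a depth
counter that is checked before recursing through a logical node, reflects
the change described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical limit, standing in for a tunable parameter.  */
enum { LOGICAL_LIMIT = 3 };

/* Hypothetical expression node; a logical node combines its operands.  */
struct expr
{
  int is_logical;
  const struct expr *op1;
  const struct expr *op2;
};

/* Count the nodes reachable from E without looking through more than
   LOGICAL_LIMIT nested logical expressions.  */
static int
count_chain (const struct expr *e, int logical_depth)
{
  assert (logical_depth >= 0);
  if (e == NULL)
    return 0;
  if (e->is_logical)
    {
      if (logical_depth >= LOGICAL_LIMIT)
        return 1;   /* Record the node, but do not look through it.  */
      logical_depth++;
    }
  return 1 + count_chain (e->op1, logical_depth)
           + count_chain (e->op2, logical_depth);
}
```

With the cap in place, the work done for a long chain of nested logical
expressions is bounded by the limit rather than by the chain length.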
d: Fix ICE when formatting a string with '%' or '`' characters (PR98457)
The percentage character was being confused for a format specifier in
pp_format(), whilst the backtick character was confused for the
beginning of a quoted string in expand_d_format().
Both are now properly escaped to avoid the ICE.
gcc/d/ChangeLog:
PR d/98457
* d-diagnostic.cc (expand_d_format): Handle escaped backticks.
(escape_d_format): New function.
(verror): Call escape_d_format on prefixing strings.
(vdeprecation): Likewise.
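A minimal sketch of this kind of escaping, written in plain C. The
function name escape_for_format is hypothetical, and the exact escape
sequences are assumptions: the sketch doubles '%' (the usual way to get a
literal percent past a printf-style formatter) and prefixes '`' with a
backslash. The real escape_d_format may differ in detail.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a newly allocated copy of TEXT in which
   every '%' is doubled and every '`' is preceded by a backslash.
   The caller frees the result.  Returns NULL if allocation fails.  */
static char *
escape_for_format (const char *text)
{
  assert (text != NULL);

  /* Worst case: every character needs one extra byte, plus the NUL.  */
  char *out = malloc (2 * strlen (text) + 1);
  if (out == NULL)
    return NULL;

  char *q = out;
  for (const char *p = text; *p != '\0'; p++)
    {
      if (*p == '%')
        *q++ = '%';    /* "%" becomes "%%".  */
      else if (*p == '`')
        *q++ = '\\';   /* "`" becomes "\`".  */
      *q++ = *p;
    }
  *q = '\0';
  return out;
}
```

Escaping only the caller-supplied prefix, rather than the whole message,
leaves genuine format specifiers in the message itself untouched.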
Richard Earnshaw [Mon, 19 Apr 2021 15:56:31 +0000 (16:56 +0100)]
arm: partial revert of r11-8168 [PR100067]
This is a partial revert of r11-8168. The overall purpose of the
commit is retained (to fix a bogus warning when -mfpu=<not-auto> is
used in combination with eg -mcpu=neoverse-v1), but it removes the
hunk that changed the subsequent feature bits for features of a
simd/fp unit that cannot be described by -mfpu. While I still think
that is the correct direction of travel, it's somewhat disruptive and
not appropriate for late stage4. I'll revisit for gcc-12.
gcc:
PR target/100067
* config/arm/arm.c (arm_configure_build_target): Do not strip
extended FPU/SIMD feature bits from the target ISA when -mfpu
is specified (partial revert of r11-8168).
Eric Botcazou [Mon, 19 Apr 2021 08:13:36 +0000 (10:13 +0200)]
Fix another -freorder-blocks-and-partition glitch with Windows SEH
Since GCC 8, the -freorder-blocks-and-partition pass can split a function
into hot and cold parts, thus generating 2 FDEs for a single function in
DWARF for exception purposes and doing an equivalent trick for Windows SEH.
Now the Windows system unwinder does not support arbitrarily large frames
and there is even a hard limit on the encoding of the CFI, which changes
the stack allocation strategy when it is exceeded and which must be
reflected everywhere.
gcc/
* config/i386/winnt.c (i386_pe_seh_cold_init): Properly deal with
frames larger than the SEH maximum frame size.
gcc/testsuite/
* gnat.dg/opt92.adb: New test.
testsuite: Enable zero-scratch-regs-{8,9,10,11}.c on s390*
On s390* the only missing part for the mentioned testcases was a load of
a double floating-point zero via a move (in particular for quite old
machines) which was added in commit 46c47420a5fefd4d9d02b0db347235dd74e20fb2.
Common code implementation is sufficient in order to clear volatile
GPRs, FPRs, and VRs. Access registers a0 and a1 are nonvolatile and not
cleared. Therefore, target hook TARGET_ZERO_CALL_USED_REGS is not
implemented for s390*.
Added a target specific test in order to ensure that all call clobbered
GPRs, FPRs, and VRs are zeroed and all call saved registers are kept.
gcc/testsuite/ChangeLog:
* c-c++-common/zero-scratch-regs-8.c: Enable on s390*.
* c-c++-common/zero-scratch-regs-9.c: Likewise.
* c-c++-common/zero-scratch-regs-10.c: Likewise.
* c-c++-common/zero-scratch-regs-11.c: Likewise.
* gcc.target/s390/zero-scratch-regs-1.c: New test.
Following up on the fix for PR99914, testing on MinGW showed that it does
not support weak linkage in the same way as ELF or Mach-O targets.
So the linkage has been reverted to COMDAT for that target. However,
in order to properly support overriding functions and variables, all
declarations with external linkage must be put in COMDAT. For this, a
new target hook has been added to control the behavior.
gcc/ChangeLog:
PR d/99914
* config/i386/winnt-d.c (TARGET_D_TEMPLATES_ALWAYS_COMDAT): Define.
* doc/tm.texi: Regenerate.
* doc/tm.texi.in (D language and ABI): Add @hook for
TARGET_D_TEMPLATES_ALWAYS_COMDAT.
gcc/d/ChangeLog:
PR d/99914
* d-target.def (d_templates_always_comdat): New hook.
* d-tree.h (mark_needed): Remove prototype.
* decl.cc: Include d-target.h.
(mark_needed): Rename to...
(d_mark_needed): ...this. Make static.
(set_linkage_for_decl): Put variables in comdat if
d_templates_always_comdat.
Following on from adding TARGET_D_REGISTER_OS_TARGET_INFO, this adds the
required handlers to implement `__traits(getTargetInfo, "objectFormat")'
for all platforms that have D support files.
Some back-ends (i386, rs6000, and pa) have some awareness of what
object format they are compiling for, so new getTargetInfo handlers have
been added both to those back-ends and to platform-specific
target files to override the default in the D front-end.
gcc/ChangeLog:
* config/darwin-d.c (darwin_d_handle_target_object_format): New
function.
(darwin_d_register_target_info): New function.
(TARGET_D_REGISTER_OS_TARGET_INFO): Define.
* config/dragonfly-d.c (dragonfly_d_handle_target_object_format): New
function.
(dragonfly_d_register_target_info): New function.
(TARGET_D_REGISTER_OS_TARGET_INFO): Define.
* config/freebsd-d.c (freebsd_d_handle_target_object_format): New
function.
(freebsd_d_register_target_info): New function.
(TARGET_D_REGISTER_OS_TARGET_INFO): Define.
* config/glibc-d.c (glibc_d_handle_target_object_format): New
function.
(glibc_d_register_target_info): New function.
(TARGET_D_REGISTER_OS_TARGET_INFO): Define.
* config/i386/i386-d.c (ix86_d_handle_target_object_format): New
function.
(ix86_d_register_target_info): Add ix86_d_handle_target_object_format
as handler for objectFormat key.
* config/i386/winnt-d.c (winnt_d_handle_target_object_format): New
function.
(winnt_d_register_target_info): New function.
(TARGET_D_REGISTER_OS_TARGET_INFO): Define.
* config/netbsd-d.c (netbsd_d_handle_target_object_format): New
function.
(netbsd_d_register_target_info): New function.
(TARGET_D_REGISTER_OS_TARGET_INFO): Define.
* config/openbsd-d.c (openbsd_d_handle_target_object_format): New
function.
(openbsd_d_register_target_info): New function.
(TARGET_D_REGISTER_OS_TARGET_INFO): Define.
* config/pa/pa-d.c (pa_d_handle_target_object_format): New function.
(pa_d_register_target_info): Add pa_d_handle_target_object_format as
handler for objectFormat key.
* config/rs6000/rs6000-d.c (rs6000_d_handle_target_object_format): New
function.
(rs6000_d_register_target_info): Add
rs6000_d_handle_target_object_format as handler for objectFormat key.
* config/sol2-d.c (solaris_d_handle_target_object_format): New
function.
(solaris_d_register_target_info): New function.
(TARGET_D_REGISTER_OS_TARGET_INFO): Define.
gcc/d/ChangeLog:
* d-target.cc (d_handle_target_object_format): New function.
(d_language_target_info): Add d_handle_target_object_format as handler
for objectFormat key.
(Target::getTargetInfo): Continue if handler returned NULL_TREE.
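The "continue if the handler returned nothing" lookup can be sketched in
C as below. Every name here (info_spec, info_table, lookup_target_info,
the two handlers, and the "elf" result) is hypothetical, and the table
order is an assumption; the sketch only shows how several handlers can be
registered for the same key, with the first non-null answer winning.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef const char *(*info_handler) (void);

/* Hypothetical table entry: a key and the handler that answers it.  */
struct info_spec
{
  const char *name;
  info_handler handler;
};

/* A handler with no opinion returns NULL so the lookup moves on.  */
static const char *generic_object_format (void) { return NULL; }
static const char *platform_object_format (void) { return "elf"; }

static const struct info_spec info_table[] = {
  { "objectFormat", generic_object_format },
  { "objectFormat", platform_object_format },
};

/* Return the first non-NULL answer for KEY, or NULL if none exists.  */
static const char *
lookup_target_info (const char *key)
{
  assert (key != NULL);
  for (size_t i = 0; i < sizeof info_table / sizeof info_table[0]; i++)
    {
      if (strcmp (info_table[i].name, key) != 0)
        continue;
      const char *value = info_table[i].handler ();
      if (value == NULL)
        continue;   /* This handler declined; try the next one.  */
      return value;
    }
  return NULL;
}
```

Letting a handler decline, instead of treating the first match as final,
is what allows a back-end and a platform file to both register for the
same key.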
Jakub Jelinek [Sat, 17 Apr 2021 09:31:30 +0000 (11:31 +0200)]
libstdc++: Update some baseline_symbols.txt
As we have only one P1 left right now, I think it is the right time
to update abi list files in libstdc++.
Here is an update for x86_64/i?86/s390x/ppc64 linux (aarch64 seems
to be correct already). For ppc64le it is missing the IEEE128 symver
symbols, but those need further work on the abi checking side.