From: Julian Brown
Date: Mon, 14 Oct 2019 20:12:39 +0000 (-0700)
Subject: [og9] Re-do OpenACC private variable resolution
X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=833d954448cd353c3e40208ab9916edb5a7c5c5b;p=thirdparty%2Fgcc.git

[og9] Re-do OpenACC private variable resolution

	gcc/
	* config/gcn/gcn-protos.h (gcn_goacc_adjust_gangprivate_decl): Rename
	to...
	(gcn_goacc_adjust_private_decl): ...this.
	* config/gcn/gcn-tree.c (diagnostic-core.h): Include.
	(gcn_goacc_adjust_gangprivate_decl): Rename to...
	(gcn_goacc_adjust_private_decl): ...this. Add LEVEL parameter.
	* config/gcn/gcn.c (TARGET_GOACC_ADJUST_GANGPRIVATE_DECL): Rename to...
	(TARGET_GOACC_ADJUST_PRIVATE_DECL): ...this.
	* config/nvptx/nvptx.c (tree-pretty-print.h): Include.
	(nvptx_goacc_adjust_private_decl): New function.
	(TARGET_GOACC_ADJUST_PRIVATE_DECL): Define hook using above function.
	* doc/tm.texi.in (TARGET_GOACC_ADJUST_GANGPRIVATE_DECL): Rename to...
	(TARGET_GOACC_ADJUST_PRIVATE_DECL): ...this.
	* doc/tm.texi: Regenerated.
	* internal-fn.c (expand_UNIQUE): Handle IFN_UNIQUE_OACC_PRIVATE.
	* internal-fn.h (IFN_UNIQUE_CODES): Add OACC_PRIVATE.
	* omp-low.c (omp_context): Remove oacc_partitioning_levels field.
	(lower_oacc_reductions): Add PRIVATE_MARKER parameter. Insert before
	fork.
	(lower_oacc_head_tail): Add PRIVATE_MARKER parameter. Modify its
	gimple call arguments as appropriate. Don't set
	oacc_partitioning_levels in omp_context. Pass private_marker to
	lower_oacc_reductions.
	(oacc_record_private_var_clauses): Don't check for NULL ctx.
	(make_oacc_private_marker): New function.
	(lower_omp_for): Only call oacc_record_vars_in_bind for
	OpenACC contexts. Create private marker and pass to
	lower_oacc_head_tail.
	(lower_omp_target): Remove unnecessary call to
	oacc_record_private_var_clauses. Remove call to mark_oacc_gangprivate.
	Create private marker and pass to lower_oacc_reductions.
	(process_oacc_gangprivate_1): Remove.
	(lower_omp_1): Only call oacc_record_vars_in_bind for OpenACC.
	Don't iterate over contexts calling process_oacc_gangprivate_1.
	(omp-offload.c (oacc_loop_xform_head_tail): Treat private-variable
	markers like fork/join when transforming head/tail sequences.
	(execute_oacc_device_lower): Use IFN_UNIQUE_OACC_PRIVATE instead of
	"oacc gangprivate" attributes to determine partitioning level of
	variables.
	* omp-sese.c (find_gangprivate_vars): New function.
	(find_local_vars_to_propagate): Use GANGPRIVATE_VARS parameter instead
	of "oacc gangprivate" attribute to determine which variables are
	gang-private.
	(oacc_do_neutering): Use find_gangprivate_vars.
	* target.def (adjust_gangprivate_decl): Rename to...
	(adjust_private_decl): ...this. Update documentation (briefly).

	libgomp/
	* testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90: Use
	oaccdevlow dump and update scanned output.
	* testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90: Likewise.
	Add missing atomic to force worker partitioning for test variable.

(cherry picked from openacc-gcc-9-branch commit bbad7288269195b39603cdfde6c15f9488de83dc)
---

diff --git a/gcc/ChangeLog.omp b/gcc/ChangeLog.omp
index 99734c8982a5..a84569760d32 100644
--- a/gcc/ChangeLog.omp
+++ b/gcc/ChangeLog.omp
@@ -1,3 +1,53 @@
+2019-10-16 Julian Brown
+
+	* config/gcn/gcn-protos.h (gcn_goacc_adjust_gangprivate_decl): Rename
+	to...
+	(gcn_goacc_adjust_private_decl): ...this.
+	* config/gcn/gcn-tree.c (diagnostic-core.h): Include.
+	(gcn_goacc_adjust_gangprivate_decl): Rename to...
+	(gcn_goacc_adjust_private_decl): ...this. Add LEVEL parameter.
+	* config/gcn/gcn.c (TARGET_GOACC_ADJUST_GANGPRIVATE_DECL): Rename to...
+	(TARGET_GOACC_ADJUST_PRIVATE_DECL): ...this.
+	* config/nvptx/nvptx.c (tree-pretty-print.h): Include.
+	(nvptx_goacc_adjust_private_decl): New function.
+	(TARGET_GOACC_ADJUST_PRIVATE_DECL): Define hook using above function.
+	* doc/tm.texi.in (TARGET_GOACC_ADJUST_GANGPRIVATE_DECL): Rename to...
+	(TARGET_GOACC_ADJUST_PRIVATE_DECL): ...this.
+	* doc/tm.texi: Regenerated.
+ * internal-fn.c (expand_UNIQUE): Handle IFN_UNIQUE_OACC_PRIVATE. + * internal-fn.h (IFN_UNIQUE_CODES): Add OACC_PRIVATE. + * omp-low.c (omp_context): Remove oacc_partitioning_levels field. + (lower_oacc_reductions): Add PRIVATE_MARKER parameter. Insert before + fork. + (lower_oacc_head_tail): Add PRIVATE_MARKER parameter. Modify its + gimple call arguments as appropriate. Don't set + oacc_partitioning_levels in omp_context. Pass private_marker to + lower_oacc_reductions. + (oacc_record_private_var_clauses): Don't check for NULL ctx. + (make_oacc_private_marker): New function. + (lower_omp_for): Only call oacc_record_vars_in_bind for + OpenACC contexts. Create private marker and pass to + lower_oacc_head_tail. + (lower_omp_target): Remove unnecessary call to + oacc_record_private_var_clauses. Remove call to mark_oacc_gangprivate. + Create private marker and pass to lower_oacc_reductions. + (process_oacc_gangprivate_1): Remove. + (lower_omp_1): Only call oacc_record_vars_in_bind for OpenACC. Don't + iterate over contexts calling process_oacc_gangprivate_1. + (omp-offload.c (oacc_loop_xform_head_tail): Treat + private-variable markers like fork/join when transforming head/tail + sequences. + (execute_oacc_device_lower): Use IFN_UNIQUE_OACC_PRIVATE instead of + "oacc gangprivate" attributes to determine partitioning level of + variables. + * omp-sese.c (find_gangprivate_vars): New function. + (find_local_vars_to_propagate): Use GANGPRIVATE_VARS parameter instead + of "oacc gangprivate" attribute to determine which variables are + gang-private. + (oacc_do_neutering): Use find_gangprivate_vars. + * target.def (adjust_gangprivate_decl): Rename to... + (adjust_private_decl): ...this. Update documentation (briefly). + 2019-10-09 Tobias Burnus * f95-lang.c (LANG_HOOKS_OMP_ARRAY_DATA): Set to gfc_omp_array_data. 
diff --git a/gcc/config/gcn/gcn-protos.h b/gcc/config/gcn/gcn-protos.h index 1711862c6a29..e33c0598feea 100644 --- a/gcc/config/gcn/gcn-protos.h +++ b/gcc/config/gcn/gcn-protos.h @@ -39,7 +39,7 @@ extern rtx gcn_gen_undef (machine_mode); extern bool gcn_global_address_p (rtx); extern tree gcn_goacc_create_propagation_record (tree record_type, bool sender, const char *name); -extern void gcn_goacc_adjust_gangprivate_decl (tree var); +extern void gcn_goacc_adjust_private_decl (tree var, int level); extern void gcn_goacc_reduction (gcall *call); extern bool gcn_hard_regno_rename_ok (unsigned int from_reg, unsigned int to_reg); diff --git a/gcc/config/gcn/gcn-tree.c b/gcc/config/gcn/gcn-tree.c index 04902a39b299..db8e290dc781 100644 --- a/gcc/config/gcn/gcn-tree.c +++ b/gcc/config/gcn/gcn-tree.c @@ -44,6 +44,7 @@ #include "cgraph.h" #include "targhooks.h" #include "langhooks-def.h" +#include "diagnostic-core.h" /* }}} */ /* {{{ OMP GCN pass. @@ -697,8 +698,11 @@ gcn_goacc_create_propagation_record (tree record_type, bool sender, } void -gcn_goacc_adjust_gangprivate_decl (tree var) +gcn_goacc_adjust_private_decl (tree var, int level) { + if (level != GOMP_DIM_GANG) + return; + tree type = TREE_TYPE (var); tree lds_type = build_qualified_type (type, TYPE_QUALS_NO_ADDR_SPACE (type) diff --git a/gcc/config/gcn/gcn.c b/gcc/config/gcn/gcn.c index e0a558b289a0..2835a3d71419 100644 --- a/gcc/config/gcn/gcn.c +++ b/gcc/config/gcn/gcn.c @@ -6044,8 +6044,8 @@ print_operand (FILE *file, rtx x, int code) #undef TARGET_GOACC_CREATE_PROPAGATION_RECORD #define TARGET_GOACC_CREATE_PROPAGATION_RECORD \ gcn_goacc_create_propagation_record -#undef TARGET_GOACC_ADJUST_GANGPRIVATE_DECL -#define TARGET_GOACC_ADJUST_GANGPRIVATE_DECL gcn_goacc_adjust_gangprivate_decl +#undef TARGET_GOACC_ADJUST_PRIVATE_DECL +#define TARGET_GOACC_ADJUST_PRIVATE_DECL gcn_goacc_adjust_private_decl #undef TARGET_GOACC_FORK_JOIN #define TARGET_GOACC_FORK_JOIN gcn_fork_join #undef TARGET_GOACC_REDUCTION diff --git 
a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c index d6b2881d1108..2a41d5659945 100644 --- a/gcc/config/nvptx/nvptx.c +++ b/gcc/config/nvptx/nvptx.c @@ -76,6 +76,7 @@ #include "intl.h" #include "tree-hash-traits.h" #include "omp-sese.h" +#include "tree-pretty-print.h" /* This file should be included last. */ #include "target-def.h" @@ -6019,6 +6020,28 @@ nvptx_can_change_mode_class (machine_mode, machine_mode, reg_class_t) return false; } +/* Implement TARGET_GOACC_ADJUST_PRIVATE_DECL. Set "oacc gangprivate" + attribute for gang-private variable declarations. */ + +void +nvptx_goacc_adjust_private_decl (tree decl, int level) +{ + if (level != GOMP_DIM_GANG) + return; + + if (!lookup_attribute ("oacc gangprivate", DECL_ATTRIBUTES (decl))) + { + if (dump_file && (dump_flags & TDF_DETAILS)) + { + fprintf (dump_file, "Setting 'oacc gangprivate' attribute for decl:"); + print_generic_decl (dump_file, decl, TDF_SLIM); + fputc ('\n', dump_file); + } + tree id = get_identifier ("oacc gangprivate"); + DECL_ATTRIBUTES (decl) = tree_cons (id, NULL, DECL_ATTRIBUTES (decl)); + } +} + /* Implement TARGET_GOACC_EXPAND_ACCEL_VAR. Place "oacc gangprivate" variables in shared memory. */ @@ -6201,6 +6224,9 @@ nvptx_set_current_function (tree fndecl) #undef TARGET_HAVE_SPECULATION_SAFE_VALUE #define TARGET_HAVE_SPECULATION_SAFE_VALUE speculation_safe_value_not_needed +#undef TARGET_GOACC_ADJUST_PRIVATE_DECL +#define TARGET_GOACC_ADJUST_PRIVATE_DECL nvptx_goacc_adjust_private_decl + #undef TARGET_GOACC_EXPAND_ACCEL_VAR #define TARGET_GOACC_EXPAND_ACCEL_VAR nvptx_goacc_expand_accel_var diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi index 536a436b1c4d..e44f80584219 100644 --- a/gcc/doc/tm.texi +++ b/gcc/doc/tm.texi @@ -6162,8 +6162,9 @@ memories. A return value of NULL indicates that the target does not handle this VAR_DECL, and normal RTL expanding is resumed. 
@end deftypefn -@deftypefn {Target Hook} void TARGET_GOACC_ADJUST_GANGPRIVATE_DECL (tree @var{var}) -Tweak variable declaration for a gang-private variable. +@deftypefn {Target Hook} void TARGET_GOACC_ADJUST_PRIVATE_DECL (tree @var{var}, @var{int}) +Tweak variable declaration for a private variable at the specified +parallelism level. @end deftypefn @deftypevr {Target Hook} bool TARGET_GOACC_WORKER_PARTITIONING diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in index c0b92f25da79..74a1b03f1e8b 100644 --- a/gcc/doc/tm.texi.in +++ b/gcc/doc/tm.texi.in @@ -4210,7 +4210,7 @@ address; but often a machine-dependent strategy can generate better code. @hook TARGET_GOACC_EXPAND_ACCEL_VAR -@hook TARGET_GOACC_ADJUST_GANGPRIVATE_DECL +@hook TARGET_GOACC_ADJUST_PRIVATE_DECL @hook TARGET_GOACC_WORKER_PARTITIONING diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c index 04081f36c4d3..9b5e518cc4b3 100644 --- a/gcc/internal-fn.c +++ b/gcc/internal-fn.c @@ -2617,6 +2617,8 @@ expand_UNIQUE (internal_fn, gcall *stmt) else gcc_unreachable (); break; + case IFN_UNIQUE_OACC_PRIVATE: + break; } if (pattern) diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h index 7164ee5cf3c7..a2810edc1b45 100644 --- a/gcc/internal-fn.h +++ b/gcc/internal-fn.h @@ -36,7 +36,8 @@ along with GCC; see the file COPYING3. If not see #define IFN_UNIQUE_CODES \ DEF(UNSPEC), \ DEF(OACC_FORK), DEF(OACC_JOIN), \ - DEF(OACC_HEAD_MARK), DEF(OACC_TAIL_MARK) + DEF(OACC_HEAD_MARK), DEF(OACC_TAIL_MARK), \ + DEF(OACC_PRIVATE) enum ifn_unique_kind { #define DEF(X) IFN_UNIQUE_##X diff --git a/gcc/omp-low.c b/gcc/omp-low.c index f0d87a686fe2..eddae6444169 100644 --- a/gcc/omp-low.c +++ b/gcc/omp-low.c @@ -147,9 +147,6 @@ struct omp_context /* A tree_list of the reduction clauses in outer contexts. */ tree outer_reduction_clauses; - /* The number of levels of OpenACC partitioning invoked in this context. */ - unsigned oacc_partitioning_levels; - /* Addressable variable decls in this context. 
*/ vec *oacc_addressable_var_decls; }; @@ -6148,8 +6145,9 @@ lower_lastprivate_clauses (tree clauses, tree predicate, gimple_seq *stmt_list, static void lower_oacc_reductions (location_t loc, tree clauses, tree level, bool inner, - gcall *fork, gcall *join, gimple_seq *fork_seq, - gimple_seq *join_seq, omp_context *ctx) + gcall *fork, gcall *private_marker, gcall *join, + gimple_seq *fork_seq, gimple_seq *join_seq, + omp_context *ctx) { gimple_seq before_fork = NULL; gimple_seq after_fork = NULL; @@ -6351,6 +6349,8 @@ lower_oacc_reductions (location_t loc, tree clauses, tree level, bool inner, /* Now stitch things together. */ gimple_seq_add_seq (fork_seq, before_fork); + if (private_marker) + gimple_seq_add_stmt (fork_seq, private_marker); if (fork) gimple_seq_add_stmt (fork_seq, fork); gimple_seq_add_seq (fork_seq, after_fork); @@ -7048,7 +7048,7 @@ lower_oacc_loop_marker (location_t loc, tree ddvar, bool head, HEAD and TAIL. */ static void -lower_oacc_head_tail (location_t loc, tree clauses, +lower_oacc_head_tail (location_t loc, tree clauses, gcall *private_marker, gimple_seq *head, gimple_seq *tail, omp_context *ctx) { bool inner = false; @@ -7056,13 +7056,19 @@ lower_oacc_head_tail (location_t loc, tree clauses, gimple_seq_add_stmt (head, gimple_build_assign (ddvar, integer_zero_node)); unsigned count = lower_oacc_head_mark (loc, ddvar, clauses, head, ctx); + + if (private_marker) + { + gimple_set_location (private_marker, loc); + gimple_call_set_lhs (private_marker, ddvar); + gimple_call_set_arg (private_marker, 1, ddvar); + } + tree fork_kind = build_int_cst (unsigned_type_node, IFN_UNIQUE_OACC_FORK); tree join_kind = build_int_cst (unsigned_type_node, IFN_UNIQUE_OACC_JOIN); gcc_assert (count); - ctx->oacc_partitioning_levels = count; - for (unsigned done = 1; count; count--, done++) { gimple_seq fork_seq = NULL; @@ -7089,7 +7095,8 @@ lower_oacc_head_tail (location_t loc, tree clauses, &join_seq); lower_oacc_reductions (loc, clauses, place, inner, - fork, 
join, &fork_seq, &join_seq, ctx); + fork, (count == 1) ? private_marker : NULL, + join, &fork_seq, &join_seq, ctx); /* Append this level to head. */ gimple_seq_add_seq (head, fork_seq); @@ -8755,9 +8762,6 @@ oacc_record_private_var_clauses (omp_context *ctx, tree clauses) { tree c; - if (!ctx) - return; - for (c = clauses; c; c = OMP_CLAUSE_CHAIN (c)) if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_PRIVATE) { @@ -8821,6 +8825,58 @@ mark_oacc_gangprivate (vec *decls, omp_context *ctx) } } +/* Build an internal UNIQUE function with type IFN_UNIQUE_OACC_PRIVATE listing + the addresses of variables that should be made private at the surrounding + parallelism level. Such functions appear in the gimple code stream in two + forms, e.g. for a partitioned loop: + + .data_dep.6 = .UNIQUE (OACC_HEAD_MARK, .data_dep.6, 1, 68); + .data_dep.6 = .UNIQUE (OACC_PRIVATE, .data_dep.6, -1, &w); + .data_dep.6 = .UNIQUE (OACC_FORK, .data_dep.6, -1); + .data_dep.6 = .UNIQUE (OACC_HEAD_MARK, .data_dep.6); + + or alternatively, OACC_PRIVATE can appear at the top level of a parallel, + not as part of a HEAD_MARK sequence: + + .UNIQUE (OACC_PRIVATE, 0, 0, &w); + + For such stand-alone appearances, the 3rd argument is always 0, denoting + gang partitioning. 
*/ + +static gcall * +make_oacc_private_marker (omp_context *ctx) +{ + int i; + tree decl; + + if (ctx->oacc_addressable_var_decls->length () == 0) + return NULL; + + auto_vec args; + + args.quick_push (build_int_cst (integer_type_node, + IFN_UNIQUE_OACC_PRIVATE)); + args.quick_push (integer_zero_node); + args.quick_push (integer_minus_one_node); + + FOR_EACH_VEC_ELT (*ctx->oacc_addressable_var_decls, i, decl) + { + for (omp_context *thisctx = ctx; thisctx; thisctx = thisctx->outer) + { + tree inner_decl = maybe_lookup_decl (decl, thisctx); + if (inner_decl) + { + decl = inner_decl; + break; + } + } + tree addr = build_fold_addr_expr (decl); + args.safe_push (addr); + } + + return gimple_build_call_internal_vec (IFN_UNIQUE, args); +} + /* Lower code for an OMP loop directive. */ static void @@ -8857,6 +8913,8 @@ lower_omp_for (gimple_stmt_iterator *gsi_p, omp_context *ctx) gbind *inner_bind = as_a (gimple_seq_first_stmt (omp_for_body)); tree vars = gimple_bind_vars (inner_bind); + if (is_gimple_omp_oacc (ctx->stmt)) + oacc_record_vars_in_bind (ctx, vars); gimple_bind_append_vars (new_stmt, vars); /* bind_vars/BLOCK_VARS are being moved to new_stmt/block, don't keep them on the inner_bind and it's block. */ @@ -8953,6 +9011,12 @@ lower_omp_for (gimple_stmt_iterator *gsi_p, omp_context *ctx) lower_omp (gimple_omp_body_ptr (stmt), ctx); + gcall *private_marker = NULL; + if (is_gimple_omp_oacc (ctx->stmt) + && !gimple_seq_empty_p (omp_for_body) + && !gimple_seq_empty_p (omp_for_body)) + private_marker = make_oacc_private_marker (ctx); + /* Lower the header expressions. 
At this point, we can assume that the header is of the form: @@ -8989,7 +9053,7 @@ lower_omp_for (gimple_stmt_iterator *gsi_p, omp_context *ctx) if (is_gimple_omp_oacc (ctx->stmt) && !ctx_in_oacc_kernels_region (ctx)) lower_oacc_head_tail (gimple_location (stmt), - gimple_omp_for_clauses (stmt), + gimple_omp_for_clauses (stmt), private_marker, &oacc_head, &oacc_tail, ctx); /* Add OpenACC partitioning and reduction markers just before the loop. */ @@ -9872,8 +9936,6 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx) clauses = gimple_omp_target_clauses (stmt); - oacc_record_private_var_clauses (ctx, clauses); - gimple_seq dep_ilist = NULL; gimple_seq dep_olist = NULL; if (omp_find_clause (clauses, OMP_CLAUSE_DEPEND)) @@ -10242,8 +10304,6 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx) if (offloaded) { - mark_oacc_gangprivate (ctx->oacc_addressable_var_decls, ctx); - /* Declare all the variables created by mapping and the variables declared in the scope of the target body. */ record_vars_into (ctx->block_vars, child_fn); @@ -11195,8 +11255,14 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx) them as a dummy GANG loop. */ tree level = build_int_cst (integer_type_node, GOMP_DIM_GANG); + gcall *private_marker = make_oacc_private_marker (ctx); + + if (private_marker) + gimple_call_set_arg (private_marker, 2, level); + lower_oacc_reductions (gimple_location (ctx->stmt), clauses, level, - false, NULL, NULL, &fork_seq, &join_seq, ctx); + false, NULL, private_marker, NULL, &fork_seq, + &join_seq, ctx); } gimple_seq_add_seq (&new_body, fork_seq); @@ -11307,26 +11373,6 @@ lower_omp_grid_body (gimple_stmt_iterator *gsi_p, omp_context *ctx) gimple_build_omp_return (false)); } -/* Find gang-private variables in a context. 
*/ - -static int -process_oacc_gangprivate_1 (splay_tree_node node, void * /* data */) -{ - omp_context *ctx = (omp_context *) node->value; - unsigned level_total = 0; - omp_context *thisctx; - - for (thisctx = ctx; thisctx; thisctx = thisctx->outer) - level_total += thisctx->oacc_partitioning_levels; - - /* If the current context and parent contexts are distributed over a - total of one parallelism level, we have gang partitioning. */ - if (level_total == 1) - mark_oacc_gangprivate (ctx->oacc_addressable_var_decls, ctx); - - return 0; -} - /* Helper to lookup dynamic array through nested omp contexts. Returns TREE_LIST of dimensions, and the CTX where it was found in *CTX_P. */ @@ -11666,7 +11712,9 @@ lower_omp_1 (gimple_stmt_iterator *gsi_p, omp_context *ctx) ctx); break; case GIMPLE_BIND: - oacc_record_vars_in_bind (ctx, gimple_bind_vars (as_a (stmt))); + if (ctx && is_gimple_omp_oacc (ctx->stmt)) + oacc_record_vars_in_bind (ctx, + gimple_bind_vars (as_a (stmt))); lower_omp (gimple_bind_body_ptr (as_a (stmt)), ctx); maybe_remove_omp_member_access_dummy_vars (as_a (stmt)); break; @@ -11917,7 +11965,6 @@ execute_lower_omp (void) if (all_contexts) { - splay_tree_foreach (all_contexts, process_oacc_gangprivate_1, NULL); splay_tree_delete (all_contexts); all_contexts = NULL; } diff --git a/gcc/omp-offload.c b/gcc/omp-offload.c index a6f64aac37e8..e489ad3073a2 100644 --- a/gcc/omp-offload.c +++ b/gcc/omp-offload.c @@ -1110,7 +1110,9 @@ oacc_loop_xform_head_tail (gcall *from, int level) = ((enum ifn_unique_kind) TREE_INT_CST_LOW (gimple_call_arg (stmt, 0))); - if (k == IFN_UNIQUE_OACC_FORK || k == IFN_UNIQUE_OACC_JOIN) + if (k == IFN_UNIQUE_OACC_FORK + || k == IFN_UNIQUE_OACC_JOIN + || k == IFN_UNIQUE_OACC_PRIVATE) *gimple_call_arg_ptr (stmt, 2) = replacement; else if (k == kind && stmt != from) break; @@ -1828,6 +1830,8 @@ execute_oacc_device_lower () for (unsigned i = 0; i < GOMP_DIM_MAX; i++) dims[i] = oacc_get_fn_dim_size (current_function_decl, i); + hash_set 
adjusted_vars; + /* Now lower internal loop functions to target-specific code sequences. */ basic_block bb; @@ -1904,6 +1908,43 @@ execute_oacc_device_lower () case IFN_UNIQUE_OACC_TAIL_MARK: remove = true; break; + + case IFN_UNIQUE_OACC_PRIVATE: + { + HOST_WIDE_INT level + = TREE_INT_CST_LOW (gimple_call_arg (call, 2)); + if (level == -1) + break; + for (unsigned i = 3; + i < gimple_call_num_args (call); + i++) + { + tree arg = gimple_call_arg (call, i); + gcc_assert (TREE_CODE (arg) == ADDR_EXPR); + tree decl = TREE_OPERAND (arg, 0); + if (dump_file && (dump_flags & TDF_DETAILS)) + { + static char const *const axes[] = + /* Must be kept in sync with GOMP_DIM + enumeration. */ + { "gang", "worker", "vector" }; + fprintf (dump_file, "Decl UID %u has %s " + "partitioning:", DECL_UID (decl), + axes[level]); + print_generic_decl (dump_file, decl, TDF_SLIM); + fputc ('\n', dump_file); + } + if (targetm.goacc.adjust_private_decl) + { + tree oldtype = TREE_TYPE (decl); + targetm.goacc.adjust_private_decl (decl, level); + if (TREE_TYPE (decl) != oldtype) + adjusted_vars.add (decl); + } + } + remove = true; + } + break; } break; } @@ -1952,21 +1993,10 @@ execute_oacc_device_lower () uses (2). At least on AMD GCN, there are atomic operations that work directly in the LDS address space. 
*/ - if (targetm.goacc.adjust_gangprivate_decl) + if (targetm.goacc.adjust_private_decl) { tree var; unsigned i; - hash_set adjusted_vars; - - FOR_EACH_LOCAL_DECL (cfun, i, var) - { - if (!VAR_P (var) - || !lookup_attribute ("oacc gangprivate", DECL_ATTRIBUTES (var))) - continue; - - targetm.goacc.adjust_gangprivate_decl (var); - adjusted_vars.add (var); - } FOR_ALL_BB_FN (bb, cfun) for (gimple_stmt_iterator gsi = gsi_start_bb (bb); diff --git a/gcc/omp-sese.c b/gcc/omp-sese.c index d72670176771..13d803fb1cdb 100644 --- a/gcc/omp-sese.c +++ b/gcc/omp-sese.c @@ -713,19 +713,61 @@ find_partitioned_var_uses (parallel_g *par, unsigned outer_mask, } } +/* Gang-private variables (typically placed in a GPU's shared memory) do not + need to be processed by the worker-propagation mechanism. Populate the + GANGPRIVATE_VARS set with any such variables found in the current + function. */ + +static void +find_gangprivate_vars (hash_set *gangprivate_vars) +{ + basic_block block; + + FOR_EACH_BB_FN (block, cfun) + { + for (gimple_stmt_iterator gsi = gsi_start_bb (block); + !gsi_end_p (gsi); + gsi_next (&gsi)) + { + gimple *stmt = gsi_stmt (gsi); + + if (gimple_call_internal_p (stmt, IFN_UNIQUE)) + { + enum ifn_unique_kind k = ((enum ifn_unique_kind) + TREE_INT_CST_LOW (gimple_call_arg (stmt, 0))); + if (k == IFN_UNIQUE_OACC_PRIVATE) + { + HOST_WIDE_INT level + = TREE_INT_CST_LOW (gimple_call_arg (stmt, 2)); + if (level != GOMP_DIM_GANG) + continue; + for (unsigned i = 3; i < gimple_call_num_args (stmt); i++) + { + tree arg = gimple_call_arg (stmt, i); + gcc_assert (TREE_CODE (arg) == ADDR_EXPR); + tree decl = TREE_OPERAND (arg, 0); + gangprivate_vars->add (decl); + } + } + } + } + } +} + static void find_local_vars_to_propagate (parallel_g *par, unsigned outer_mask, hash_set *partitioned_var_uses, + hash_set *gangprivate_vars, vec *prop_set) { unsigned mask = outer_mask | par->mask; if (par->inner) find_local_vars_to_propagate (par->inner, mask, partitioned_var_uses, - prop_set); 
+ gangprivate_vars, prop_set); if (par->next) find_local_vars_to_propagate (par->next, outer_mask, partitioned_var_uses, - prop_set); + gangprivate_vars, prop_set); if (!(mask & GOMP_DIM_MASK (GOMP_DIM_WORKER))) { @@ -747,8 +789,7 @@ find_local_vars_to_propagate (parallel_g *par, unsigned outer_mask, || is_global_var (var) || AGGREGATE_TYPE_P (TREE_TYPE (var)) || !partitioned_var_uses->contains (var) - || lookup_attribute ("oacc gangprivate", - DECL_ATTRIBUTES (var))) + || gangprivate_vars->contains (var)) continue; if (stmt_may_clobber_ref_p (stmt, var)) @@ -1353,9 +1394,12 @@ oacc_do_neutering (void) &prop_set); hash_set partitioned_var_uses; + hash_set gangprivate_vars; + find_gangprivate_vars (&gangprivate_vars); find_partitioned_var_uses (par, mask, &partitioned_var_uses); - find_local_vars_to_propagate (par, mask, &partitioned_var_uses, &prop_set); + find_local_vars_to_propagate (par, mask, &partitioned_var_uses, + &gangprivate_vars, &prop_set); FOR_ALL_BB_FN (bb, cfun) { diff --git a/gcc/target.def b/gcc/target.def index c9c3f650e8aa..d4901389cbc4 100644 --- a/gcc/target.def +++ b/gcc/target.def @@ -1730,9 +1730,10 @@ rtx, (tree var), NULL) DEFHOOK -(adjust_gangprivate_decl, -"Tweak variable declaration for a gang-private variable.", -void, (tree var), +(adjust_private_decl, +"Tweak variable declaration for a private variable at the specified\n\ +parallelism level.", +void, (tree var, int), NULL) DEFHOOK diff --git a/libgomp/ChangeLog.omp b/libgomp/ChangeLog.omp index bf880ac0c5e6..b1748accd5dc 100644 --- a/libgomp/ChangeLog.omp +++ b/libgomp/ChangeLog.omp @@ -1,3 +1,10 @@ +2019-10-16 Julian Brown + + * testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90: Use + oaccdevlow dump and update scanned output. + * testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90: Likewise. + Add missing atomic to force worker partitioning for test variable. + 2019-10-16 Julian Brown * testsuite/libgomp.oacc-c-c++-common/serial-dims.c: Support AMD GCN. 
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90 index 9158b6f4768b..dafc70c743e5 100644 --- a/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90 +++ b/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90 @@ -1,8 +1,8 @@ ! Test for "oacc gangprivate" attribute on gang-private variables ! { dg-do run } -! { dg-additional-options "-fdump-tree-omplower-details" } -! { dg-final { scan-tree-dump-times "Setting 'oacc gangprivate' attribute for decl: integer\\(kind=4\\) w;" 1 "omplower" } } */ +! { dg-additional-options "-fdump-tree-oaccdevlow-details" } +! { dg-final { scan-tree-dump-times "Decl UID \[0-9\]+ has gang partitioning: integer\\(kind=4\\) w;" 1 "oaccdevlow" } } */ program main integer :: w, arr(0:31) diff --git a/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90 index d147229d91e5..90e06be24ff5 100644 --- a/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90 +++ b/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90 @@ -1,8 +1,8 @@ -! Test for lack of "oacc gangprivate" attribute on worker-private variables +! Test for worker-private variables ! { dg-do run } -! { dg-additional-options "-fdump-tree-omplower-details" } -! { dg-final { scan-tree-dump-times "Setting 'oacc gangprivate' attribute for decl" 0 "omplower" } } */ +! { dg-additional-options "-fdump-tree-oaccdevlow-details" } +! { dg-final { scan-tree-dump-times "Decl UID \[0-9\]+ has worker partitioning: integer\\(kind=4\\) w;" 1 "oaccdevlow" } } */ program main integer :: w, arr(0:31) @@ -13,7 +13,9 @@ program main w = 0 !$acc loop seq do i = 0, 31 + !$acc atomic update w = w + 1 + !$acc end atomic end do arr(j) = w end do