The test commands for scanning optimization dump files
perform globbing on the argument that specifies the suffix
of the dump files to be scanned. This behavior is currently
undocumented. Furthermore, the current implementation of
"scan-dump" and similar procedures yields an error whenever
the file name globbing matches more than one file (due to an
attempt to call "open" on multiple files) while a failure to
match any file results in an unresolved test.
This commit documents the globbing behavior. The dump
scanning procedures are changed to make the test unresolved
if globbing matches more than one file.
2020-07-02 Frederik Harwath <frederik@codesourcery.com>
* doc/sourcebuild.texi: Describe globbing of the
dump file scanning commands "suffix" argument.
gcc/testsuite/ChangeLog:
2020-07-02 Frederik Harwath <frederik@codesourcery.com>
* lib/scandump.exp (glob-dump-file): New proc.
(scan-dump): Use glob-dump-file for file name expansion.
(scan-dump-times): Likewise.
(scan-dump-dem): Likewise.
(scan-dump-dem-not): Likewise.
gcc/testsuite/
* gfortran.dg/gomp/combined-if.f90: Adjust expected number
of matches depending on whether nvptx offloading is supported.
* lib/target-supports.exp
(check_effective_target_offload_nvptx): New.
fortran: Apply if clause to all sub-constructs in combined OpenMP constructs
The unmodified 'if' clause should be applied to all the sub-constructs that
accept an 'if' clause in a combined OpenMP construct, and not just to the
'parallel' sub-construct.
Aaron Sawdey [Wed, 17 Jun 2020 20:35:55 +0000 (15:35 -0500)]
identify lfs prefixed case PR95347
The same problem also arises for plfs where prefixed_load_p()
doesn't recognize it so we get just lfs in the asm output
with an @pcrel address.
Backport from master
2020-06-03 Aaron Sawdey <acsawdey@linux.ibm.com>
PR target/95347
* config/rs6000/rs6000.c (is_stfs_insn): Rename to
is_lfs_stfs_insn and make it recognize lfs as well.
(prefixed_store_p): Use is_lfs_stfs_insn().
(prefixed_load_p): Use is_lfs_stfs_insn() to recognize lfs.
Aaron Sawdey [Wed, 17 Jun 2020 20:32:53 +0000 (15:32 -0500)]
Correctly identify stfs if prefixed
Because reg_to_non_prefixed() only looks at the register being used, it
doesn't get the right answer for stfs, which leads to us not seeing
that it has a PCREL symbol ref. This patch works around this by
introducing a helper function that inspects the insn to see if it is in
fact a stfs. Then if we use NON_PREFIXED_DEFAULT, address_to_insn_form()
can see that it has the PCREL symbol ref.
Backport from master
2020-06-02 Aaron Sawdey <acsawdey@linux.ibm.com>
PR target/95347
* config/rs6000/rs6000.c (prefixed_store_p): Add special case
for stfs.
(is_stfs_insn): New helper function.
Andrew Stubbs [Tue, 2 Jun 2020 20:00:40 +0000 (21:00 +0100)]
amdgcn: Switch to HSACO v3 binary format
This upgrades the compiler to emit HSA Code Object v3 binaries. This means
changing the assembler directives, and linker command line options.
The gcn-run and libgomp loaders need corresponding alterations. The
relocations no longer need to be fixed up manually, and the kernel symbol
names have changed slightly.
This move makes the binaries compatible with the new rocgdb from ROCm 3.5.
This is a backport from GCC 11.
2020-06-18 Andrew Stubbs <ams@codesourcery.com>
gcc/
* config/gcn/gcn-hsa.h (TEXT_SECTION_ASM_OP): Use ".text".
(BSS_SECTION_ASM_OP): Use ".bss".
(ASM_SPEC): Remove "-mattr=-code-object-v3".
(LINK_SPEC): Add "--export-dynamic".
* config/gcn/gcn-opts.h (processor_type): Replace PROCESSOR_VEGA with
PROCESSOR_VEGA10 and PROCESSOR_VEGA20.
* config/gcn/gcn-run.c (HSA_RUNTIME_LIB): Use ".so.1" variant.
(load_image): Remove obsolete relocation handling.
Add ".kd" suffix to the symbol names.
* config/gcn/gcn.c (MAX_NORMAL_SGPR_COUNT): Set to 62.
(gcn_option_override): Update gcn_isa test.
(gcn_kernel_arg_types): Update all the assembler directives.
Remove the obsolete options.
(gcn_conditional_register_usage): Update MAX_NORMAL_SGPR_COUNT usage.
(gcn_omp_device_kind_arch_isa): Handle PROCESSOR_VEGA10 and
PROCESSOR_VEGA20.
(output_file_start): Rework assembler file header.
(gcn_hsa_declare_function_name): Rework kernel metadata.
* config/gcn/gcn.h (GCN_KERNEL_ARG_TYPES): Set to 16.
* config/gcn/gcn.opt (PROCESSOR_VEGA): Remove enum.
(PROCESSOR_VEGA10): New enum value.
(PROCESSOR_VEGA20): New enum value.
libgomp/
* plugin/plugin-gcn.c (init_environment_variables): Use ".so.1"
variant for HSA_RUNTIME_LIB name.
(find_executable_symbol_1): Delete.
(find_executable_symbol): Delete.
(init_kernel_properties): Add ".kd" suffix to symbol names.
(find_load_offset): Delete.
(create_and_finalize_hsa_program): Remove relocation handling.
As the following testcase shows, the exception for the aarch64
vec_pack_trunc_di is not sufficient on x86, the halfvectype
"vectors" have SImode but the x86 vec_pack_trunc_si meant for
the bool bitmasks combines 2x SImode into DImode, while in the
testcase the halfvectype is 1x SImode "vector" with SImode and
result is 2x HImode "vector" with SImode.
Richard Sandiford's reply:
FWIW, since the aarch64 case was only found by inspection and might
not be useful, personally I'd prefer to drop that case after all.
2020-06-18 Jakub Jelinek <jakub@redhat.com>
PR target/95713
* tree-ssa-forwprop.c (simplify_vector_constructor): Don't allow
scalar mode halfvectype other than vector boolean for
VEC_PACK_TRUNC_EXPR.
[PATCH][GCC] arm: Fix the MVE ACLE vaddq_m polymorphic variants.
Hello,
This patch fixes the MVE ACLE vaddq_m polymorphic variants by modifying the corresponding
intrinsic parameters and vaddq_m polymorphic variant's _Generic case entries in "arm_mve.h"
header file.
This patch modifies the MVE scalar shift RTL patterns. The current patterns
have wrong constraints and predicates due to which the values returned from
MVE scalar shift instructions are overwritten in the code-gen.
+FAIL: libgomp.c/target-39.c (internal compiler error)
+FAIL: libgomp.c/target-39.c (test for excess errors)
+UNRESOLVED: libgomp.c/target-39.c compilation failed to produce executable
This is in a '--enable-offload-targets=[...],hsa' build, with '-foffload=hsa'
enabled (by default).
during GIMPLE pass: hsagen
source-gcc/libgomp/testsuite/libgomp.c/target-39.c: In function ‘main._omp_fn.0.hsa.0’:
source-gcc/libgomp/testsuite/libgomp.c/target-39.c:23:11: internal compiler error: Segmentation fault
23 | #pragma omp target map(from:err)
| ^~~
[...]
GDB:
Program received signal SIGSEGV, Segmentation fault.
fndecl_built_in_p (node=0x0, name=BUILT_IN_PREFETCH) at [...]/source-gcc/gcc/tree.h:6267
6267 return (fndecl_built_in_p (node, BUILT_IN_NORMAL)
(gdb) bt
#0 fndecl_built_in_p (node=0x0, name=BUILT_IN_PREFETCH) at [...]/source-gcc/gcc/tree.h:6267
#1 0x0000000000b19739 in gen_hsa_insns_for_call (stmt=stmt@entry=0x7ffff693b200, hbb=hbb@entry=0x2b152c0) at [...]/source-gcc/gcc/hsa-gen.c:5304
#2 0x0000000000b1aca7 in gen_hsa_insns_for_gimple_stmt (stmt=0x7ffff693b200, hbb=hbb@entry=0x2b152c0) at [...]/source-gcc/gcc/hsa-gen.c:5770
#3 0x0000000000b1bd21 in gen_body_from_gimple () at [...]/source-gcc/gcc/hsa-gen.c:5999
#4 0x0000000000b1dbd2 in generate_hsa (kernel=<optimized out>) at [...]/source-gcc/gcc/hsa-gen.c:6596
#5 0x0000000000b1de66 in (anonymous namespace)::pass_gen_hsail::execute (this=0x2a2aac0) at [...]/source-gcc/gcc/hsa-gen.c:6680
#6 0x0000000000d06f90 in execute_one_pass (pass=pass@entry=0x2a2aac0) at [...]/source-gcc/gcc/passes.c:2502
[...]
(gdb) up
#1 0x0000000000b19739 in gen_hsa_insns_for_call (stmt=stmt@entry=0x7ffff693b200, hbb=hbb@entry=0x2b152c0) at /home/thomas/tmp/source/gcc/build/track-slim-omp/source-gcc/gcc/hsa-gen.c:5304
5304 if (fndecl_built_in_p (function_decl, BUILT_IN_PREFETCH))
(gdb) print function_decl
$1 = (tree) 0x0
(gdb) list
5299 if (!gimple_call_builtin_p (stmt, BUILT_IN_NORMAL))
5300 {
5301 tree function_decl = gimple_call_fndecl (stmt);
5302 /* Prefetch pass can create type-mismatching prefetch builtin calls which
5303 fail the gimple_call_builtin_p test above. Handle them here. */
5304 if (fndecl_built_in_p (function_decl, BUILT_IN_PREFETCH))
5305 return;
5306
5307 if (function_decl == NULL_TREE)
5308 {
The problem is present already since 2016-11-23 commit 56b1c60e412fcf1245b4780871553cbdebb956a3 (r242761) "Merge from HSA branch to
trunk", and the fix obvious enough.
Harald Anlauf [Sun, 14 Jun 2020 14:12:47 +0000 (16:12 +0200)]
PR fortran/95088 - Buffer overflows with PDTs, submodules and long symbols
With PDTs (parameterized derived types) and submodules, name mangling
results in variably long internal symbols. Instead of using a fixed-size
intermediate buffer, which is actually not really needed, just use a
pointer to strings.
2020-06-14 Harald Anlauf <anlauf@gmx.de>
gcc/fortran/
PR fortran/95088
* class.c (get_unique_type_string): Replace use of fixed size
buffer by internally passing a pointer to strings.
Marek Polacek [Wed, 17 Jun 2020 18:38:05 +0000 (14:38 -0400)]
c++: ICE with IMPLICIT_CONV_EXPR in array subscript [PR95508]
Since r10-7096 convert_like, when called in a template, creates an
IMPLICIT_CONV_EXPR when we're converting to/from array type.
In this test, we have e[f], and we're converting f (of type class A) to
int, so convert_like in build_new_op_1 created the IMPLICIT_CONV_EXPR
that got into cp_build_array_ref which calls maybe_constant_value. My
patch above failed to adjust this spot to call fold_non_dependent_expr
instead, which can handle codes like I_C_E in a template. Fixed by
using a new function maybe_fold_non_dependent_expr, which, if the expr
can't be evaluated to a constant, returns the original expression.
Tobias Burnus [Tue, 16 Jun 2020 18:18:31 +0000 (20:18 +0200)]
OpenACC/Fortran: permit 'routine' inside PURE
gcc/fortran/ChangeLog
* parse.c (decode_oacc_directive): Permit 'acc routine' also
inside pure procedures.
* openmp.c (gfc_match_oacc_routine): Inside pure procedures
do not permit gang, worker or vector clauses.
libgomp/ChangeLog:
* testsuite/libgomp.oacc-fortran/routine-10.f90: New test.
gcc/testsuite/ChangeLog:
* gfortran.dg/goacc/pure-elemental-procedures-2.f90: New test.
arm: Correct the grouping of operands in MVE vector scatter store intrinsics (PR94735).
The operands in RTL patterns of MVE vector scatter store intrinsics are
wrongly grouped, because of which few vector loads and stores instructions
are wrongly getting optimized out with -O2.
A new predicate "mve_scatter_memory" is defined in this patch, this predicate
returns TRUE on matching: (mem(reg)) for MVE scatter store intrinsics.
This patch fixes the issue by adding define_expand pattern with
"mve_scatter_memory" predicate and calls the corresponding define_insn by
passing register_operand as first argument. This register_operand is extracted
from the operand with "mve_scatter_memory" predicate in define_expand pattern.
Backported from mainline
2020-06-04 Srinath Parvathaneni <srinath.parvathaneni@arm.com>
arm: Fix the wrong code-gen generated by MVE vector load/store intrinsics (PR94959).
Few MVE intrinsics like vldrbq_s32, vldrhq_s32 etc., the assembler instructions
generated by current compiler are wrong.
eg: vldrbq_s32 generates an assembly instructions `vldrb.s32 q0,[ip]`.
But as per Arm-arm second argument in above instructions must also be a low
register (<= r7). This patch fixes this issue by creating a new predicate
"mve_memory_operand" and constraint "Ux" which allows low registers as arguments
to the generated instructions depending on the mode of the argument. A new constraint
"Ul" is created to handle loading to PC-relative addressing modes for vector
store/load intrinsiscs.
All the corresponding MVE intrinsic generating wrong code-gen as vldrbq_s32
are modified in this patch.
Backported from mainline
2020-05-20 Srinath Parvathaneni <srinath.parvathaneni@arm.com>
gcc/
PR target/94959
* config/arm/arm-protos.h (arm_mode_base_reg_class): Function
declaration.
(mve_vector_mem_operand): Likewise.
* config/arm/arm.c (thumb2_legitimate_address_p): For MVE target check
the load from memory to a core register is legitimate for give mode.
(mve_vector_mem_operand): Define function.
(arm_print_operand): Modify comment.
(arm_mode_base_reg_class): Define.
* config/arm/arm.h (MODE_BASE_REG_CLASS): Modify to add check for
TARGET_HAVE_MVE and expand to arm_mode_base_reg_class on TRUE.
* config/arm/constraints.md (Ux): Likewise.
(Ul): Likewise.
* config/arm/mve.md (mve_mov): Replace constraint Us with Ux and also
add support for missing Vector Store Register and Vector Load Register.
Add a new alternative to support load from memory to PC (or label) in
vector store/load.
(mve_vstrbq_<supf><mode>): Modify constraint Us to Ux.
(mve_vldrbq_<supf><mode>): Modify constriant Us to Ux, predicate to
mve_memory_operand and also modify the MVE instructions to emit.
(mve_vldrbq_z_<supf><mode>): Modify constraint Us to Ux.
(mve_vldrhq_fv8hf): Modify constriant Us to Ux, predicate to
mve_memory_operand and also modify the MVE instructions to emit.
(mve_vldrhq_<supf><mode>): Modify constriant Us to Ux, predicate to
mve_memory_operand and also modify the MVE instructions to emit.
(mve_vldrhq_z_fv8hf): Likewise.
(mve_vldrhq_z_<supf><mode>): Likewise.
(mve_vldrwq_fv4sf): Likewise.
(mve_vldrwq_<supf>v4si): Likewise.
(mve_vldrwq_z_fv4sf): Likewise.
(mve_vldrwq_z_<supf>v4si): Likewise.
(mve_vld1q_f<mode>): Modify constriant Us to Ux.
(mve_vld1q_<supf><mode>): Likewise.
(mve_vstrhq_fv8hf): Modify constriant Us to Ux, predicate to
mve_memory_operand.
(mve_vstrhq_p_fv8hf): Modify constriant Us to Ux, predicate to
mve_memory_operand and also modify the MVE instructions to emit.
(mve_vstrhq_p_<supf><mode>): Likewise.
(mve_vstrhq_<supf><mode>): Modify constriant Us to Ux, predicate to
mve_memory_operand.
(mve_vstrwq_fv4sf): Modify constriant Us to Ux.
(mve_vstrwq_p_fv4sf): Modify constriant Us to Ux and also modify the MVE
instructions to emit.
(mve_vstrwq_p_<supf>v4si): Likewise.
(mve_vstrwq_<supf>v4si): Likewise.Modify constriant Us to Ux.
* config/arm/predicates.md (mve_memory_operand): Define.
Tobias Burnus [Tue, 9 Jun 2020 14:31:22 +0000 (16:31 +0200)]
openmp: ensure variables in offload table are streamed out (PRs 94848 + 95551)
gcc/ChangeLog:
* omp-offload.c (add_decls_addresses_to_decl_constructor,
omp_finish_file): With in_lto_p, stream out all offload-table
items even if the symtab_node does not exist.
Tobias Burnus [Mon, 8 Jun 2020 21:24:57 +0000 (23:24 +0200)]
openmp: ensure variables in offload table are streamed out (PRs 94848 + 95551)
gcc/ChangeLog:
PR lto/94848
PR middle-end/95551
* omp-offload.c (add_decls_addresses_to_decl_constructor,
omp_finish_file): Skip removed items.
* lto-cgraph.c (output_offload_tables): Likewise; set force_output
to this node for variables and functions.
libgomp/ChangeLog:
PR lto/94848
PR middle-end/95551
* testsuite/libgomp.fortran/target-var.f90: New test.
Iain Sandoe [Sun, 14 Jun 2020 16:01:57 +0000 (17:01 +0100)]
coroutines: Handle lambda closure pointers like 'this'.
It was agreed amongst the implementors that the correct
interpretation of the standard is that lambda closure pointers
should be treated in the same manner as class object pointers.
gcc/cp/ChangeLog:
* coroutines.cc (instantiate_coro_traits): Pass a reference
to lambda closure objects to traits instantiation.
(morph_fn_to_coro): Likewise for promise parameter
preview and allocator lookup.
Harald Anlauf [Thu, 11 Jun 2020 18:29:45 +0000 (20:29 +0200)]
PR fortran/95544 - Fix ICE in NULL() argument to intrinsics
Fortran 2018: An argument to an intrinsic procedure other than ASSOCIATED,
NULL, or PRESENT shall be a data object. An EXPR_NULL is not a data
object. Add checks for intrinsics.
2020-06-11 Steven G. Kargl <kargl@gcc.gnu.org>
Harald Anlauf <anlauf@gmx.de>
Harald Anlauf [Tue, 5 May 2020 20:16:50 +0000 (22:16 +0200)]
PR fortran/93366 - ICE on invalid, reject invalid use of NULL() as argument
gcc/fortran/ChangeLog:
2020-05-05 Steve Kargl <kargl@gcc.gnu.org>
Harald Anlauf <anlauf@gmx.de>
PR fortran/93366
* check.c (gfc_check_associated, invalid_null_arg): Factorize
check for presence of invalid NULL() argument.
(gfc_check_kind, gfc_check_merge, gfc_check_shape)
(gfc_check_sizeof, gfc_check_spread, gfc_check_transfer): Use this
check for presence of invalid NULL() arguments.
gcc/testsuite/ChangeLog:
2020-05-05 Harald Anlauf <anlauf@gmx.de>
PR fortran/93366
* gfortran.dg/pr93366.f90: New test.
Harald Anlauf [Thu, 11 Jun 2020 13:48:56 +0000 (15:48 +0200)]
PR fortran/95503 - Fix ICE in gfc_is_simply_contiguous, at fortran/expr.c:5844
The check for assigning a pointer that cannot be determined to be simply
contiguous at compile time to a contiguous pointer does not need to be
invoked if the lhs of the assignment is known to have conflicting attributes.
2020-06-11 Harald Anlauf <anlauf@gmx.de>
gcc/fortran/
PR fortran/95503
* expr.c (gfc_check_pointer_assign): Skip contiguity check of rhs
of pointer assignment if lhs cannot be simply contiguous.
gcc/testsuite/
PR fortran/95503
* gfortran.dg/pr95503.f90: New test.
Thomas Koenig [Sun, 14 Jun 2020 11:39:43 +0000 (13:39 +0200)]
When avoiding double deallocation, look at namespace, expression and component.
Our finalization handling is a mess. Really, we should get to try and get
this fixed for gcc 11.
In the meantime, here is a patch which fixes a regression I introduced
when fixing a regression with a memory leak. The important thing
here is to realize that we do not need to finalize (and deallocate)
multiple times for the same expression and the same component
in the same namespace. It might cause code size regressions, but
better big code than wrong code...
PR fortran/94109
* class.c (finalize_component): Return early if finalization has
already happened for expression and component within namespace.
* gfortran.h (gfc_was_finalized): New type.
(gfc_namespace): Add member was_finalzed.
(gfc_expr): Remove finalized.
* symbol.c (gfc_free_namespace): Free was_finalized.
Jakub Jelinek [Tue, 9 Jun 2020 06:39:36 +0000 (08:39 +0200)]
c-family: Fix up MEM_REF printing [PR95580]
The C FE in the MEM_REF printing ICEs if the type of the first argument
(which due to useless pointer conversions can be an arbitrary type) is a
pointer to an incomplete type. The code just wants to avoid printing a cast
if it is a pointer to single byte elements.
2020-06-09 Jakub Jelinek <jakub@redhat.com>
PR c/95580
* c-pretty-print.c (c_pretty_printer::unary_expression): Handle the
case when MEM_REF's first argument has type pointer to incomplete type.
Jakub Jelinek [Mon, 8 Jun 2020 09:05:10 +0000 (11:05 +0200)]
forwprop: Ignore scalar mode vectors in simplify_vector_constructor [PR95528]
As mentioned in the PR, the problem is that at least the x86 backend asumes
that the vec_unpack* and vec_pack* optabs with integral modes are for the
AVX512-ish vector masks rather than for very small vectors done in GPRs.
The only other target that seems to have a scalar mode vec_{,un}pack* optab
is aarch64 as discussed in the PR, so there is also a condition for that.
All other targets have just vector mode optabs.
2020-06-08 Jakub Jelinek <jakub@redhat.com>
PR target/95528
* tree-ssa-forwprop.c (simplify_vector_constructor): Don't use
VEC_UNPACK*_EXPR or VEC_PACK_TRUNC_EXPR with scalar modes unless the
type is vector boolean.
Jakub Jelinek [Thu, 28 May 2020 21:40:54 +0000 (23:40 +0200)]
c++: Try to complete decomp types [PR95328]
Two years ago Paolo has added the
else if (processing_template_decl && !COMPLETE_TYPE_P (type))
pedwarn (...);
lines into cp_finish_decomp. For type dependent decl we punt much earlier,
but even for types which aren't type dependent COMPLETE_TYPE_P might be
false as this testcase shows, so this patch tries to complete_type first
(the reason for writing it that way is that it is then followed by another
else if and if complete_type returns error_mark_node, we shouldn't report
anything, as a bug should have been reported already.
2020-05-28 Jakub Jelinek <jakub@redhat.com>
PR c++/95328
* decl.c (cp_finish_decomp): Call complete_type before checking
COMPLETE_TYPE_P.
Jakub Jelinek [Thu, 14 May 2020 07:51:05 +0000 (09:51 +0200)]
openmp: Fix placement of 2nd+ preparation statement for PHIs in simd clone lowering [PR95108]
For normal stmts, preparation statements are inserted before the stmt, so if we need multiple,
they are in the correct order, but for PHIs we emit them after labels in the entry successor
bb, and we used to emit them in the reverse order that way.
2020-05-14 Jakub Jelinek <jakub@redhat.com>
PR middle-end/95108
* omp-simd-clone.c (struct modify_stmt_info): Add after_stmt member.
(ipa_simd_modify_stmt_ops): For PHIs, only add before first stmt in
entry block if info->after_stmt is NULL, otherwise add after that stmt
and update it after adding each stmt.
(ipa_simd_modify_function_body): Initialize info.after_stmt.
Jakub Jelinek [Tue, 26 May 2020 07:35:21 +0000 (09:35 +0200)]
openmp: Ensure copy ctor for composite distribute parallel for class iterators is instantiated [PR95197]
During gimplification omp_finish_clause langhook is called in several places
to add the language specific info to the clause like what default/copy ctors,
dtors and assignment operators should be used.
Unfortunately, if it refers to some not yet instantiated method, during
gimplification it is too late and the methods will not be instantiated
anymore. For other cases, the genericizer has code to detect those and
instantiate whatever is needed, this change adds the same for
distribute parallel for class iterators where we under the hood need
a copy constructor for the iterator to implement it.
2020-05-26 Jakub Jelinek <jakub@redhat.com>
PR c++/95197
* gimplify.c (find_combined_omp_for): Move to omp-general.c.
* omp-general.h (find_combined_omp_for): Declare.
* omp-general.c: Include tree-iterator.h.
(find_combined_omp_for): New function, moved from gimplify.c.
* cp-gimplify.c: Include omp-general.h.
(cp_genericize_r) <case OMP_DISTRIBUTE>: For class iteration
variables in composite distribute parallel for, instantiate copy
ctor of their types.
Jakub Jelinek [Wed, 13 May 2020 09:22:37 +0000 (11:22 +0200)]
Fix -fcompare-debug issue in purge_dead_edges [PR95080]
The following testcase fails with -fcompare-debug, the bug used to be latent
since introduction of -fcompare-debug.
The loop at the start of purge_dead_edges behaves differently between -g0
and -g - if the last insn is a DEBUG_INSN, then it skips not just
DEBUG_INSNs but also NOTEs until it finds some other real insn (or bb head),
while with -g0 it will not skip any NOTEs, so if we have
real_insn
note
debug_insn // not present with -g0
then with -g it might remove useless REG_EH_REGION from real_insn, while
with -g0 it will not.
Yet another option would be not skipping NOTE_P in the loop; I couldn't find
in history rationale for why it is done.
2020-05-13 Jakub Jelinek <jakub@redhat.com>
PR debug/95080
* cfgrtl.c (purge_dead_edges): Skip over debug and note insns even
if the last insn is a note.
Iain Sandoe [Sat, 13 Jun 2020 09:36:29 +0000 (10:36 +0100)]
coroutines: Make call argument handling more robust [PR95440]
build_new_method_call is supposed to be able to handle a null
arguments list pointer (when the method has no parms). There
were a couple of places where uses of the argument list pointer
were not defended against NULL.
gcc/cp/ChangeLog:
PR c++/95440
* call.c (add_candidates): Use vec_safe_length() for
testing the arguments list.
(build_new_method_call_1): Use vec_safe_is_empty() when
checking for an empty args list.
gcc/testsuite/ChangeLog:
PR c++/95440
* g++.dg/coroutines/pr95440.C: New test.
Patrick Palka [Thu, 11 Jun 2020 20:33:41 +0000 (16:33 -0400)]
c++: constrained class template friend [PR93467]
This fixes two issues in our handling of constrained class template
friend declarations.
The first issue is that we fail to set the constraints on the injected
class template declaration during tsubst_friend_class.
The second issue is that the template parameter levels within the parsed
constraints of a class template friend declaration are shifted if the
enclosing class is a template, and this shift leads to spurious
constraint mismatch errors in associate_classtype_constraints if the
friend declaration refers to an already declared class template.
gcc/cp/ChangeLog:
PR c++/93467
* constraint.cc (associate_classtype_constraints): If there is a
discrepancy between the current template depth and the template
depth of the original declaration, then adjust the template
parameter depth within the current constraints appropriately.
* pt.c (tsubst_friend_class): Substitute into and set the
constraints on the injected declaration.
gcc/testsuite/ChangeLog:
PR c++/93467
* g++.dg/cpp2a/concepts-friend6.C: New test.
* g++.dg/cpp2a/concepts-friend7.C: New test.
Iain Sandoe [Fri, 12 Jun 2020 07:28:35 +0000 (08:28 +0100)]
coroutines: Correct handling of references in parm copies [PR95350].
Adjust to handle rvalue refs the same way as clang, and to correct
the handling of moves when a copy CTOR is present. This is one area
where we could make things easier for the end-user (as was implemented
before this change), however there needs to be agreement about when the
full statement containing a coroutine call ends (i.e. when the ramp
terminates or when the coroutine terminates).
PR c++/95350
* g++.dg/coroutines/torture/func-params-08.C: Adjust test to
reflect that all rvalue refs are dangling.
* g++.dg/coroutines/torture/func-params-09-awaitable-parms.C:
Likewise.
* g++.dg/coroutines/pr95350.C: New test.
Marek Polacek [Thu, 11 Jun 2020 20:35:33 +0000 (16:35 -0400)]
c++: Fix ICE in check_local_shadow with enum [PR95560]
Another indication that perhaps this warning is emitted too early. We
crash because same_type_p gets a null type: we have an enumerator
without a fixed underlying type and finish_enum_value_list hasn't yet
run. So check if the type is null before calling same_type_p.
PR c++/95560
* name-lookup.c (check_local_shadow): Check if types are
non-null before calling same_type_p.
Marek Polacek [Thu, 11 Jun 2020 20:33:13 +0000 (16:33 -0400)]
c++: explicit(bool) malfunction with dependent expression [PR95066]
I forgot to set DECL_HAS_DEPENDENT_EXPLICIT_SPEC_P when merging two
function declarations and as a sad consequence, we never tsubsted
the dependent explicit-specifier in tsubst_function_decl, leading to
disregarding the explicit-specifier altogether, and wrongly accepting
this test.
PR c++/95066
* decl.c (duplicate_decls): Set DECL_HAS_DEPENDENT_EXPLICIT_SPEC_P.
Marek Polacek [Thu, 11 Jun 2020 20:29:57 +0000 (16:29 -0400)]
c++: ICE with -Wall and constexpr if [PR94937]
An ICE arises here because we call cp_get_callee_fndecl_nofold in a
template, and we've got a CALL_EXPR whose CALL_EXPR_FN is a BASELINK.
This tickles the INDIRECT_TYPE_P assert in cp_get_fndecl_from_callee.
Fixed by turning the assert into a condition and returning NULL_TREE
in that case.
PR c++/94937
* cvt.c (cp_get_fndecl_from_callee): Return NULL_TREE if the function
type is not INDIRECT_TYPE_P.
* decl.c (omp_declare_variant_finalize_one): Call
cp_get_callee_fndecl_nofold instead of looking for the function decl
manually.
* g++.dg/cpp1z/constexpr-if34.C: New test.
* g++.dg/cpp2a/is-constant-evaluated10.C: New test.
Marek Polacek [Thu, 11 Jun 2020 20:27:59 +0000 (16:27 -0400)]
c++: Fix bogus -Wparentheses warning [PR95344]
Since r267272, which added location wrappers, cp_fold loses
TREE_NO_WARNING on a MODIFY_EXPR that finish_parenthesized_expr set, and
that results in a bogus -Wparentheses warning.
I.e., previously we had "b = 1" but now we have "VIEW_CONVERT_EXPR<bool>(b) = 1"
and cp_fold_maybe_rvalue folds away the location wrapper and so we do
2718 x = fold_build2_loc (loc, code, TREE_TYPE (x), op0, op1);
in cp_fold and the flag is lost.
PR c++/95344
* cp-gimplify.c (cp_fold) <case MODIFY_EXPR>: Don't set
TREE_THIS_VOLATILE here.
(cp_fold): Set it here along with TREE_NO_WARNING.
Marek Polacek [Thu, 11 Jun 2020 20:26:41 +0000 (16:26 -0400)]
c++: ICE when shortening right shift [PR94955]
Since r10-6527 fold_for_warn calls maybe_constant_value, which means it
can fold more than it previously could. In this testcase it means that
cp_build_binary_op/RSHIFT_EXPR set short_shift because now we were able
to fold op1 to an INTEGER_CST. But then when actually performing the
shortening we crashed because cp_fold_rvalue wasn't able to fold as much
as f_f_w and so tree_int_cst_sgn crashed on a NOP_EXPR. Therefore the
calls should probably match.
PR c++/94955
* typeck.c (cp_build_binary_op): Use fold_for_warn instead of
cp_fold_rvalue.
Iain Sandoe [Thu, 11 Jun 2020 13:11:14 +0000 (14:11 +0100)]
coroutines: Ensure distinct DTOR trees [PR95137].
Part of the PR notes that there are UBSAN fails for the coroutines
test suite. These are primarily related to the use of the same DTOR
tree in the two edges from the await block. Fixed by building a new
tree for each.
gcc/cp/ChangeLog:
PR c++/95137
* coroutines.cc (expand_one_await_expression): Build separate
DTOR trees for the awaitable object on the destroy and resume
paths.
Harald Anlauf [Sun, 7 Jun 2020 12:47:24 +0000 (14:47 +0200)]
PR fortran/95091 - Buffer overflows with submodules and long symbols
With submodules, name mangling results in long internal symbols. This
requires adjustment of the sizes of temporaries to avoid buffer overflows.
2020-06-07 Harald Anlauf <anlauf@gmx.de>
gcc/fortran/
PR fortran/95091
* class.c (get_unique_type_string, gfc_hash_value): Enlarge
buffers, and check whether the strings returned by
get_unique_type_string() fit.
Patrick Palka [Wed, 10 Jun 2020 21:37:53 +0000 (17:37 -0400)]
libstdc++: Fix some ranges algos optimizations [PR95578]
ranges::copy and a number of other ranges algorithms have unwrapping
optimizations for iterators of type __normal_iterator, move_iterator and
reverse_iterator. But in the checks that guard these optimizations we
currently only test that the iterator of the iterator/sentinel pair has
the appropriate type before proceeding with the corresponding
optimization, and do not also test the sentinel type.
This breaks the testcase in this PR because this testcase constructs via
range adaptors a range whose begin() is a __normal_iterator and whose
end() is a custom sentinel type, and then performs ranges::copy on it.
From there we bogusly perform the __normal_iterator unwrapping
optimization on this iterator/sentinel pair, which immediately leads to
a constraint failure since the custom sentinel type does not model
sentinel_for<int*>.
This patch fixes this issue by refining each of the problematic checks
to also test that the iterator and sentinel types are the same before
applying the corresponding unwrapping optimization. Along the way, some
code simplifications are made.
libstdc++-v3/ChangeLog:
PR libstdc++/95578
* include/bits/ranges_algo.h (__lexicographical_compare_fn):
Also check that the iterator and sentinel have the same type before
applying the unwrapping optimization for __normal_iterator.
Split the check into two, one for the first iterator/sentinel
pair and another for second iterator/sentinel pair. Remove uses
of __niter_base, and remove uses of std::move on a
__normal_iterator.
* include/bits/ranges_algobase.h (__equal_fn): Likewise.
(__copy_or_move): Likewise. Perform similar adjustments for
the reverse_iterator and move_iterator optimizations. Inline
the checks into the if-constexprs, and use using-declarations to
make them less visually noisy. Remove uses of __niter_wrap.
(__copy_or_move_backward): Likewise.
* testsuite/25_algorithms/copy/95578.cc: New test.
* testsuite/25_algorithms/copy_backward/95578.cc: New test.
* testsuite/25_algorithms/equal/95578.cc: New test.
* testsuite/25_algorithms/lexicographical_compare/95578.cc: New test.
* testsuite/25_algorithms/move/95578.cc: New test.
* testsuite/25_algorithms/move_backward/95578.cc: New test.
Iain Sandoe [Wed, 10 Jun 2020 07:12:32 +0000 (08:12 +0100)]
coroutines: Fix missed ramp function return copy elision [PR95346].
Confusingly, "get_return_object ()" can do two things:
- Firstly it can provide the return object for the ramp function (as
the name suggests).
- Secondly if the type of the ramp function is different from that
of the get_return_object call, this is used as a single parameter
to a CTOR for the ramp's return type.
In the first case we can rely on finish_return_stmt () to do the
necessary processing for copy elision.
In the second case, we should have passed a prvalue to the CTOR as
per the standard comment, but I had omitted the rvalue () call. Fixed
thus.
gcc/cp/ChangeLog:
PR c++/95346
* coroutines.cc (morph_fn_to_coro): Ensure that the get-
return-object is constructed correctly; When it is not the
final return value, pass it to the CTOR of the return type
as an rvalue, per the standard comment.
gcc/testsuite/ChangeLog:
PR c++/95346
* g++.dg/coroutines/pr95346.C: New test.
Marek Polacek [Wed, 10 Jun 2020 14:49:08 +0000 (10:49 -0400)]
c++: Fix ICE with delayed parsing of noexcept-specifier [PR95562]
Here we ICE because a DEFERRED_PARSE expression leaked to tsubst_copy.
We create these expressions for deferred noexcept-specifiers in
cp_parser_save_noexcept; they are supposed to be re-parsed in
cp_parser_late_noexcept_specifier. In this case we never got around
to re-parsing it because the noexcept-specifier was attached to a
pointer to a function, not to a function declaration. But we should
not have delayed the parsing here in the first place; we already
avoid delaying the parsing for alias-decls, typedefs, and friend
function declarations. (Clang++ also doesn't delay the parsing
for pointers to function.)
gcc/cp/ChangeLog:
PR c++/95562
* parser.c (cp_parser_direct_declarator): Clear
CP_PARSER_FLAGS_DELAY_NOEXCEPT if the declarator kind is not
cdk_id.
gcc/testsuite/ChangeLog:
PR c++/95562
* g++.dg/cpp0x/noexcept60.C: New test.
Tobias Burnus [Wed, 3 Jun 2020 13:35:12 +0000 (15:35 +0200)]
OpenACC: fix privatization of by-reference arrays
Replacing of a by-reference variable in a private clause by a local variable
makes sense; however, for arrays, the size is not directly known by the type.
This causes an ICE via create_tmp_var which indirectly invokes
force_constant_size in this case - but the latter only handled Ada.
gcc/ChangeLog:
* gimplify.c (localize_reductions): Do not create local
variable for privatized arrays.
libgomp/ChangeLog:
* testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: New test.
libgomp/
* testsuite/libgomp.oacc-fortran/parallel-dims-aux.c (aux_gang):
Handle case where AMD GCN is used.
(acc_worker): Likewise.
(acc_vector): Likewise.
Sandra Loosemore [Tue, 31 Mar 2020 21:29:09 +0000 (14:29 -0700)]
Fix bug in processing of array dimensions in data clauses.
The g++ front end wraps the array length and low_bound values in
NON_LVALUE_EXPR, causing the subsequent tests for INTEGER_CST to fail.
The test case c-c++-common/goacc/kernels-loop-annotation-1.c was
tickling this bug and giving bogus errors in g++ because it was falling
through to dynamic array code instead of recognizing the constant bounds.
This patch was posted upstream here
https://gcc.gnu.org/pipermail/gcc-patches/2020-March/542694.html
but not yet committed. It may be that some other fix for this problem
is implemented on mainline instead; check before merging this patch.
Sandra Loosemore [Thu, 19 Mar 2020 15:32:24 +0000 (08:32 -0700)]
Additional Fortran testsuite fixes for kernels loops annotation pass.
These testsuite fixes are specific to the og10 branch, so are being
segregated from the ones that apply to mainline in a separate commit
from the main Fortran kernels loop annotation patch.
Sandra Loosemore [Tue, 17 Mar 2020 01:08:01 +0000 (18:08 -0700)]
Kernels loops annotation: Fortran.
This patch implements the Fortran support for adding "#pragma acc loop auto"
annotations to loops in OpenACC kernels regions. It implements the same
-fopenacc-kernels-annotate-loops and -Wopenacc-kernels-annotate-loops options
that were previously added (and documented) for the C/C++ front ends.
Sandra Loosemore [Sun, 15 Mar 2020 22:13:46 +0000 (15:13 -0700)]
Kernels loops annotation: C and C++.
This patch detects loops in kernels regions that are candidates for
parallelization, and adds "#pragma acc loop auto" annotations to them.
This annotation is controlled by the -fopenacc-kernels-annotate-loops
option, which is enabled by default. -Wopenacc-kernels-annotate-loops
can be used to produce diagnostics about loops that cannot be annotated.
gcc/c
* c-decl.c (c_break_label, c_cont_label): Delete, and replace
with...
(in_statement): New.
(start_function): Adjust for above change.
(c_push_function_context, c_pop_function_context): Likewise.
* c-lang.h (struct language_function): Likewise.
* c-objc-common.h (LANG_HOOKS_BLOCK_MAY_FALLTHRU): Define.
* c-parser.c (objc_foreach_break_label, objc_foreach_continue_label):
New.
(c_parser_statement_after_labels): Adjust calls to c_finish_bc_stmt.
(c_parser_switch_statement): Adjust break/switch context handling
and calls to renamed functions.
(c_parser_while_statement): Adjust break/switch context handling and
build a WHILE_STMT.
(c_parser_do_statement): Ditto, with DO_STMT respectively.
(c_parser_for_statement): Ditto, with FOR_STMT respectively.
(c_parser_omp_for_loop): Adjust break/switch context handling.
* c-tree.h (c_break_label, c_cont_label): Delete.
(IN_SWITCH_STMT, IN_ITERATION_STMT): Define.
(IN_OMP_BLOCK, IN_OMP_FOR, IN_OBJC_FOREACH): Define.
(in_statement, switch_statement_break_seen_p): Declare.
(c_start_case, c_finish_case): Renamed to...
(c_start_switch, c_finish_switch).
(c_finish_bc_stmt): Adjust arguments.
* c-typeck.c (build_function_call_vec): Don't try to print
statements with %qE format.
(struct c_switch): Rename switch_expr field to switch_stmt.
Add break_stmt_seen_p field.
(c_start_case): Rename to c_start_switch. Build a SWITCH_STMT
instead of a SWITCH_EXPR. Update for changes to struct c_switch.
(do_case): Update for changes to struct c_switch.
(c_finish_case): Rename to c_finish_switch. Update for changes to
struct c_switch and change of representation from SWITCH_EXPR to
SWITCH_STMT.
(c_finish_loop): Delete.
(c_finish_bc_stmt): Update to reflect changes to break/continue
state representation. Build a BREAK_STMT or CONTINUE_STMT instead
of a GOTO_EXPR except for objc foreach loops.
gcc/objc
* objc-act.c (objc_start_method_definition): Update to reflect
changes to break/continue state bookkeeping in C front end.
Move loop and switch tree data structures from cp/ to c-family/.
This patch moves the definitions for DO_STMT, FOR_STMT, WHILE_STMT,
SWITCH_STMT, BREAK_STMT, and CONTINUE_STMT from the C++ front end to
c-family. This includes the genericizers, pretty-printers, and dump
support as well as the tree definitions and accessors. Some related
code for OMP_FOR and similar OMP constructs is also moved.
This patch was previously posted here:
https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534145.html
gcc/c-family/
* c-common.c (c_block_may_fallthrough): New, split from
cxx_block_may_fallthrough in the cp front end.
(c_common_init_ts): Move handling of loop and switch-related
statements here from the cp front end.
* c-common.def (FOR_STMT, WHILE_STMT, DO_STMT): Move here from
cp front end.
(BREAK_STMT, CONTINUE_STMT, SWITCH_STMT): Likewise.
* c-common.h (c_block_may_fallthru): Declare.
(bc_state_t): Move here from cp front end.
(save_bc_state, restore_bc_state): Declare.
(c_genericize_control_stmt): Declare.
(WHILE_COND, WHILE_BODY): Likewise.
(DO_COND, DO_BODY): Likewise.
(FOR_INIT_STMT, FOR_COND, FOR_EXPR, FOR_BODY, FOR_SCOPE): Likewise.
(SWITCH_STMT_COND, SWITCH_STMT_BODY): Likewise.
(SWITCH_STMT_TYPE, SWITCH_STMT_SCOPE): Likewise.
(SWITCH_STMT_ALL_CASES_P, SWITCH_STMT_NO_BREAK_P): Likewise.
(LABEL_DECL_BREAK, LABEL_DECL_CONTINUE): Likewise.
* c-dump.c (dump_stmt): Copy from cp front end.
(c_dump_tree): Move code to handle structured loop and switch
tree nodes here from cp front end.
* c-gimplify.c: Adjust includes.
(enum bc_t, bc_label, begin_bc_block, finish_bc_block): Move from
cp front end.
(save_bc_state, restore_bc_state): New functions using old code
from cp front end.
(get_bc_label, expr_loc_or_loc): Move from cp front end.
(genericize_c_loop): Move from cp front end.
(genericize_for_stmt, genericize_while_stmt): Likewise.
(genericize_do_stmt, genericize_switch_stmt): Likewise.
(genericize_continue_stmt, genericize_break_stmt): Likewise.
(genericize_omp_for_stmt): Likewise.
(c_genericize_control_stmt): New function using code split from
cp front end.
(c_genericize_control_r): New.
(c_genericize): Call walk_tree with c_genericize_control_r.
* c-pretty-print.c (c_pretty_printer::statement): Move code to handle
structured loop and switch tree nodes here from cp front end.
gcc/cp/
* cp-gimplify.c (enum bc_t, bc_label): Move to c-family.
(begin_bc_block, finish_bc_block, get_bc_label): Likewise.
(genericize_cp_loop): Likewise.
(genericize_for_stmt, genericize_while_stmt): Likewise.
(genericize_do_stmt, genericize_switch_stmt): Likewise.
(genericize_continue_stmt, genericize_break_stmt): Likewise.
(genericize_omp_for_stmt): Likewise.
(cp_genericize_r): Call c_genericize_control_stmt instead of
above functions directly.
(cp_genericize): Call save_bc_state and restore_bc_state instead
of manipulating bc_label directly.
* cp-objcp-common.c (cxx_block_may_fallthru): Defer to
c_block_may_fallthru instead of handling SWITCH_STMT here.
(cp_common_init_ts): Move handling of loop and switch-related
statements to c-family.
* cp-tree.def (FOR_STMT, WHILE_STMT, DO_STMT): Move to c-family.
(BREAK_STMT, CONTINUE_STMT, SWITCH_STMT): Likewise.
* cp-tree.h (LABEL_DECL_BREAK, LABEL_DECL_CONTINUE): Likewise.
(WHILE_COND, WHILE_BODY): Likewise.
(DO_COND, DO_BODY): Likewise.
(FOR_INIT_STMT, FOR_COND, FOR_EXPR, FOR_BODY, FOR_SCOPE): Likewise.
(SWITCH_STMT_COND, SWITCH_STMT_BODY): Likewise.
(SWITCH_STMT_TYPE, SWITCH_STMT_SCOPE): Likewise.
(SWITCH_STMT_ALL_CASES_P, SWITCH_STMT_NO_BREAK_P): Likewise.
* cxx-pretty-print.c (cxx_pretty_printer::statement): Move code
to handle structured loop and switch tree nodes to c-family.
* dump.c (cp_dump_tree): Likewise.
gcc/
* doc/generic.texi (Basic Statements): Document SWITCH_EXPR here,
not SWITCH_STMT.
(Statements for C and C++): Rename node to reflect what
the introduction already says about sharing between C and C++
front ends. Copy-edit and correct documentation for structured
loops and switch.
Add XFAIL for libgomp.oacc-c-c++-common/data-firstprivate-1.c
The firstprivate_int optimization changes the semantics of firstprivate
in this test, so XFAIL it until the correct semantics for firstprivate
have been decided (PR92036).
Fix test failure in routine-level-of-parallelism-2.c testcase
c-c++-common/goacc/routine-level-of-parallelism-2.c is supposed to be
equivalent to gfortran.dg/goacc/routine-level-of-parallelism-1.f90,
but is missing some test directives present in the latter.
Tobias Burnus [Mon, 16 Mar 2020 15:22:57 +0000 (16:22 +0100)]
Fix for is_gimple_reg vars to 'data kernels'
Nearly all variable mapping is moved from 'kernels' to a surrounding
'data kernels' and then 'force_present' mapped for the 'kernels'. However, as
libgomp.oacc-c-c++-common/declare-vla.c shows, moving 'int i, N' will fail as
there is a special case for is_gimple_reg in mapping and that fails badly if
outside a target region (e.g. offloading = false). As those are transferred by
value and not as a pointer, it makes more sense to only map them at
'kernels' and ignore them for 'data kernels'.
Additionally, as e.g. libgomp.oacc-c-c++-common/kernels-decompose-1.c shows,
one still additionally to handle 'kernels'-declared variables which now are
declared in 'kernels data' and and can be handled as is_gimple_reg.
gcc/
* omp-oacc-kernels.c (maybe_build_inner_data_region): is_gimple_reg vars
are not yet mapped, fall through to map is as before the transformation.
(transform_kernels_region): Don't map is_gimple_reg vars.
(decompose_kernels_region_body): Use tofrom for is_gimple_reg vars.
(transform_kernels_region): Handle is_gimple_reg vars as without
data kernels.
Julian Brown [Tue, 22 Oct 2019 00:22:31 +0000 (17:22 -0700)]
Run all kernels regions with GOMP_MAP_FORCE_TOFROM mappings synchronously
gcc/
* omp-oacc-kernels.c (decompose_kernels_region_body): Add
inhibit_async parameter. Force asynchronous kernel launches to
run synchronously if they have problematic variable mappings.
Don't add explicit wait for decomposed kernels regions we forced
synchronous.
(transform_kernels_region): Detect problematic variable mappings,
and inhibit asynchronous execution if we find any.
Julian Brown [Tue, 26 Feb 2019 23:48:00 +0000 (15:48 -0800)]
Fortran "declare create"/allocate support for OpenACC
2018-10-04 Cesar Philippidis <cesar@codesourcery.com>
Julian Brown <julian@codesourcery.com>
gcc/
* omp-low.c (scan_sharing_clauses): Update handling of OpenACC declare
create, declare copyin and declare deviceptr to have local lifetimes.
(convert_to_firstprivate_int): Handle pointer types.
(convert_from_firstprivate_int): Likewise. Create local storage for
the values being pointed to. Add new orig_type argument.
(lower_omp_target): Handle GOMP_MAP_DECLARE_{ALLOCATE,DEALLOCATE}.
Add orig_type argument to convert_from_firstprivate_int call.
Allow pointer types with GOMP_MAP_FIRSTPRIVATE_INT. Don't privatize
firstprivate VLAs.
* tree-pretty-print.c (dump_omp_clause): Handle
GOMP_MAP_DECLARE_{ALLOCATE,DEALLOCATE}.
gcc/fortran/
* gfortran.h (enum gfc_omp_map_op): Add OMP_MAP_DECLARE_ALLOCATE,
OMP_MAP_DECLARE_DEALLOCATE.
(gfc_omp_clauses): Add update_allocatable.
* trans-array.c (gfc_array_allocate): Call
gfc_trans_oacc_declare_allocate for decls that have oacc_declare_create
attribute set.
* trans-decl.c (find_module_oacc_declare_clauses): Relax
oacc_declare_create to OMP_MAP_ALLOC, and oacc_declare_copyin to
OMP_MAP_TO, in order to match OpenACC 2.5 semantics.
* trans-openmp.c (gfc_trans_omp_clauses): Use GOMP_MAP_ALWAYS_POINTER
(for update directive) or GOMP_MAP_FIRSTPRIVATE_POINTER (otherwise) for
allocatable scalar decls. Handle OMP_MAP_DECLARE_{ALLOCATE,DEALLOCATE}
clauses.
(gfc_trans_oacc_executable_directive): Use GOMP_MAP_ALWAYS_POINTER
for allocatable scalar data clauses inside acc update directives.
(gfc_trans_oacc_declare_allocate): New function.
* trans-stmt.c (gfc_trans_allocate): Call
gfc_trans_oacc_declare_allocate for decls with oacc_declare_create
attribute set.
(gfc_trans_deallocate): Likewise.
* trans.h (gfc_trans_oacc_declare_allocate): Declare.
gcc/testsuite/
* gfortran.dg/goacc/declare-allocatable-1.f90: New test.
include/
* gomp-constants.h (enum gomp_map_kind): Define
GOMP_MAP_DECLARE_{ALLOCATE,DEALLOCATE} and GOMP_MAP_FLAG_SPECIAL_4.
libgomp/
* oacc-mem.c (gomp_acc_declare_allocate): New function.
* oacc-parallel.c (GOACC_enter_exit_data): Handle
GOMP_MAP_DECLARE_{ALLOCATE,DEALLOCATE}.
* testsuite/libgomp.oacc-fortran/allocatable-scalar.f90: New test.
* testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90: New test.
* testsuite/libgomp.oacc-fortran/declare-allocatable-2.f90: New test.
* testsuite/libgomp.oacc-fortran/declare-allocatable-3.f90: New test.
* testsuite/libgomp.oacc-fortran/declare-allocatable-4.f90: New test.
Julian Brown [Mon, 14 Oct 2019 20:12:39 +0000 (13:12 -0700)]
Re-do OpenACC private variable resolution
gcc/
* config/gcn/gcn-protos.h (gcn_goacc_adjust_gangprivate_decl): Rename
to...
(gcn_goacc_adjust_private_decl): ...this.
* config/gcn/gcn-tree.c (diagnostic-core.h): Include.
(gcn_goacc_adjust_gangprivate_decl): Rename to...
(gcn_goacc_adjust_private_decl): ...this. Add LEVEL parameter.
* config/gcn/gcn.c (TARGET_GOACC_ADJUST_GANGPRIVATE_DECL): Rename to...
(TARGET_GOACC_ADJUST_PRIVATE_DECL): ...this.
* config/nvptx/nvptx.c (tree-pretty-print.h): Include.
(nvptx_goacc_adjust_private_decl): New function.
(TARGET_GOACC_ADJUST_PRIVATE_DECL): Define hook using above function.
* doc/tm.texi.in (TARGET_GOACC_ADJUST_GANGPRIVATE_DECL): Rename to...
(TARGET_GOACC_ADJUST_PRIVATE_DECL): ...this.
* doc/tm.texi: Regenerated.
* internal-fn.c (expand_UNIQUE): Handle IFN_UNIQUE_OACC_PRIVATE.
* internal-fn.h (IFN_UNIQUE_CODES): Add OACC_PRIVATE.
* omp-low.c (omp_context): Remove oacc_partitioning_levels field.
(lower_oacc_reductions): Add PRIVATE_MARKER parameter. Insert before
fork.
(lower_oacc_head_tail): Add PRIVATE_MARKER parameter. Modify its
gimple call arguments as appropriate. Don't set
oacc_partitioning_levels in omp_context. Pass private_marker to
lower_oacc_reductions.
(oacc_record_private_var_clauses): Don't check for NULL ctx.
(mark_oacc_gangprivate): Remove unused function.
(make_oacc_private_marker): New function.
(lower_omp_for): Only call oacc_record_vars_in_bind for
OpenACC contexts. Create private marker and pass to
lower_oacc_head_tail.
(lower_omp_target): Remove unnecessary call to
oacc_record_private_var_clauses. Remove call to mark_oacc_gangprivate.
Create private marker and pass to lower_oacc_reductions.
(process_oacc_gangprivate_1): Remove.
(lower_omp_1): Only call oacc_record_vars_in_bind for OpenACC. Don't
iterate over contexts calling process_oacc_gangprivate_1.
(omp-offload.c (oacc_loop_xform_head_tail): Treat
private-variable markers like fork/join when transforming head/tail
sequences.
(execute_oacc_device_lower): Use IFN_UNIQUE_OACC_PRIVATE instead of
"oacc gangprivate" attributes to determine partitioning level of
variables. Remove unused variables.
* omp-sese.c (find_gangprivate_vars): New function.
(find_local_vars_to_propagate): Use GANGPRIVATE_VARS parameter instead
of "oacc gangprivate" attribute to determine which variables are
gang-private.
(oacc_do_neutering): Use find_gangprivate_vars.
* target.def (adjust_gangprivate_decl): Rename to...
(adjust_private_decl): ...this. Update documentation (briefly).
libgomp/
* testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90: Use
oaccdevlow dump and update scanned output.
* testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90: Likewise.
Add missing atomic to force worker partitioning for test variable.