Jakub Jelinek [Sun, 6 Jun 2021 17:37:06 +0000 (19:37 +0200)]
openmp: Call c_omp_adjust_map_clauses even for combined target [PR100902]
When looking at in_reduction support for target, I've noticed that
c_omp_adjust_map_clauses is not called for the combined target case.
The following patch fixes it.
Unfortunately, there are other issues.
One is (also mentioned in the PR) that currently the pointer attachment
stuff seems to be clause ordering dependent (the standard says that clause
ordering on the same construct does not matter), the baz and qux cases
in the PR are rejected while when swapped it is accepted.
Note, the order of clauses in GCC really is treated as insignificant
initially and only later on the compiler can adjust the ordering (e.g. when
we sort map clauses based on what they refer to etc.) and in particular,
clauses from parsing is reverse of the order in user code, while
c_omp_split_clauses performed for combined/composite constructs typically
reverses that ordering, i.e. makes it follow the user code ordering.
And another one is I'm slightly afraid c_omp_adjust_map_clauses might
misbehave in templates, though haven't tried to verify it with testcases.
When processing_template_decl, the non-dependent clauses will be handled
usually the same as when not in a template, but dependent clauses aren't
processed or only limited processing is done there, and rest is deferred
till later. From quick skimming of c_omp_adjust_map_clauses, it seems
it might not be very happy about non-processed map clauses that might
still have the TREE_LIST representation of array sections, or might
not have finalized decls or base decls etc.
So, for this I wonder if cp_parser_omp_target (and other cp/parser.c
callers of c_omp_adjust_map_clauses) shouldn't call it only
if (!processing_template_decl) - perhaps you could add
cp_omp_adjust_map_clauses wrapper that would be
if (!processing_template_decl)
c_omp_adjust_map_clauses (...);
- and call c_omp_adjust_map_clauses from within pt.c after the clauses
are tsubsted and finish_omp_clauses is called again.
2021-06-06 Jakub Jelinek <jakub@redhat.com>
PR c/100902
* c-parser.c (c_parser_omp_target): Call c_omp_adjust_map_clauses
even when target is combined with other constructs.
* parser.c (cp_parser_omp_target): Call c_omp_adjust_map_clauses
even when target is combined with other constructs.
Jakub Jelinek [Tue, 8 Jun 2021 09:45:02 +0000 (11:45 +0200)]
openmp: Fix ICE on depend(source) clause during cdtor cloning [PR100957]
The depend(source) clause has NULL OMP_CLAUSE_DECL, it has just the
depend kind specified and no arguments. So copy_tree_body_r shouldn't
check TREE_CODE on it without checking it is non-NULL.
2021-06-08 Jakub Jelinek <jakub@redhat.com>
PR c++/100957
* tree-inline.c (copy_tree_body_r): For OMP_CLAUSE_DEPEND don't
check TREE_CODE if OMP_CLAUSE_DECL is NULL.
Tobias Burnus [Tue, 8 Jun 2021 08:04:14 +0000 (10:04 +0200)]
Fortran/OpenMP: Fix clause splitting for target/parallel/teams [PR99928]
PR middle-end/99928
gcc/fortran/ChangeLog:
* trans-openmp.c (gfc_add_clause_implicitly): New.
(gfc_split_omp_clauses): Use it.
(gfc_free_split_omp_clauses): New.
(gfc_trans_omp_do_simd, gfc_trans_omp_parallel_do,
gfc_trans_omp_parallel_do_simd, gfc_trans_omp_distribute,
gfc_trans_omp_teams, gfc_trans_omp_target, gfc_trans_omp_taskloop,
gfc_trans_omp_master_taskloop, gfc_trans_omp_parallel_master): Use it.
Jakub Jelinek [Tue, 8 Jun 2021 06:22:02 +0000 (08:22 +0200)]
openmp: Add testcase for scan directive with nested functions
> In convert_nonlocal_omp_clauses, the following clauses are
> missing: OMP_CLAUSE_AFFINITY, OMP_CLAUSE_DEVICE_TYPE,
> OMP_CLAUSE_EXCLUSIVE, OMP_CLAUSE_INCLUSIVE.
OMP_CLAUSE_{EXCLUSIVE,INCLUSIVE} isn't needed, because we don't
walk the clauses at all for GIMPLE_OMP_SCAN. It would be a bug
if we used the exclusive/inclusive operands after gimplification,
but we apparently don't do that, all we check is whether the
OMP_CLAUSE_KIND of the first clause (all should be the same) is
OMP_CLAUSE_EXCLUSIVE or OMP_CLAUSE_INCLUSIVE, nothing else.
Patrick Palka [Thu, 3 Jun 2021 13:37:11 +0000 (09:37 -0400)]
c++: using-enum and access specifiers [PR100862]
When copying the enumerators imported by a class-scope using-enum
declaration, we need to override current_access_specifier so that
finish_member_declaration gives the copies the same access as the
using-enum decl. (A class-scope using-enum is processed late, so
current_access_specifier at this point is otherwise set to the last
access specifier within the class.) To that end, this patch makes
handle_using_decl call set_current_access_from_decl accordingly.
For consistency, this patch makes build_enumerator use
set_current_access_from_decl too.
PR c++/100862
gcc/cp/ChangeLog:
* pt.c (set_current_access_from_decl): Move to ...
* class.c (set_current_access_from_decl): ... here.
(handle_using_decl): Use it to propagate the access of the
using-enum decl to the copy of the imported enumerator.
* cp-tree.h (set_current_access_from_decl): Declare.
* decl.c (build_enumerator): Simplify using make_temp_override
and set_current_access_from_decl.
Patrick Palka [Fri, 4 Jun 2021 17:46:53 +0000 (13:46 -0400)]
c++: tsubst_function_decl and excess arg levels [PR100102]
Here, when instantiating the dependent alias template
duration::__is_harmonic with args={{T,U},{int}}, we find ourselves
substituting the function decl _S_gcd. Since we have more arg levels
than _S_gcd has parm levels, an old special case in tsubst_function_decl
causes us to unwantedly reduce args to its innermost level, yielding
args={int}, which leads to a nonsensical substitution into the decl
context and eventually a crash.
The comment for this special case refers to three examples for which we
ought to see more arg levels than parm levels here, but none of the
examples actually demonstrate this. In the first example, when
defining S<int>::f(U) parms_depth is 2 and args_depth is 1, and
later when instantiating say S<int>::f<char> both depths are 2. In the
second example, when substituting the template friend declaration
parms_depth is 2 and args_depth is 1, and later when instantiating f
both depths are 1. Finally, the third example is invalid since we can't
specialize a member template of an unspecialized class template like
that.
Given that this reduction code seems no longer relevant for its
documented purpose and that it causes problems as in the PR, this patch
just removes it. Note that as far as bootstrap/regtest is concerned,
this code is dead; the below two tests would be the first to reach it.
PR c++/100102
gcc/cp/ChangeLog:
* pt.c (tsubst_function_decl): Remove old code for reducing
args when it has excess levels.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/alias-decl-72.C: New test.
* g++.dg/cpp0x/alias-decl-72a.C: New test.
Avi Kivity [Mon, 7 Jun 2021 15:19:05 +0000 (11:19 -0400)]
libstdc++: add missing typename for dependent type in ranges::elements_view [PR100900]
Clang complains about the missing typename. I believe it's not required
in a more complete implementation of C++, but it's nicer to support
less complete implementations.
and moves the stripping of ARRAY_REFS/INDIRECT_REFS out of
extract_base_bit_offset and back into the (two) call sites of the
function. The difference between the two ways of looking through these
nodes comes down to (I think) what processing has been done on the
clause in question already: in the case where BASE_REF is non-NULL,
we are processing an OMP_CLAUSE_DECL for the first time. Conversely,
when BASE_REF is NULL, we are processing a node from the sorted list
that is being constructed after a GOMP_MAP_STRUCT node.
2021-06-07 Julian Brown <julian@codesourcery.com>
gcc/
* gimplify.c (extract_base_bit_offset): Don't look through ARRAY_REFs or
INDIRECT_REFs here.
(build_struct_group): Reinstate previous behaviour for handling
ARRAY_REFs/INDIRECT_REFs.
Correct implementation of random_init() when -fcoarray=lib is given.
Backport from mainline.
2021-06-06 Andre Vehreschild <vehre@gcc.gnu.org>
Steve Kargl <kargl@gcc.gnu.org>
gcc/fortran/ChangeLog:
PR fortran/98301
* trans-decl.c (gfc_build_builtin_function_decls): Move decl.
* trans-intrinsic.c (conv_intrinsic_random_init): Use bool for
lib-call of caf_random_init instead of logical (4-byte).
* trans.h: Add tree var for random_init.
libgfortran/ChangeLog:
PR fortran/98301
* caf/libcaf.h (_gfortran_caf_random_init): New function.
* caf/single.c (_gfortran_caf_random_init): New function.
* gfortran.map: Added fndecl.
* intrinsics/random_init.f90: Implement random_init.
Iain Buclaw [Fri, 4 Jun 2021 17:38:26 +0000 (19:38 +0200)]
d: Fix ICE in gimplify_var_or_parm_decl, at gimplify.c:2755 (PR100882)
Constructor calls for temporaries were reusing the TARGET_EXPR_SLOT of a
TARGET_EXPR for an assignment, which later got passed to `build_assign',
which stripped away the outer TARGET_EXPR, leaving a reference to a lone
temporary with no declaration.
This stripping away of the TARGET_EXPR also discarded any cleanups that
may have been assigned to the expression as well.
So now the reuse of TARGET_EXPR_SLOT has been removed, and
`build_assign' now constructs assignments inside the TARGET_EXPR_INITIAL
slot. This has also been extended to `return_expr', to deal with
possibility of a TARGET_EXPR being returned.
gcc/d/ChangeLog:
PR d/100882
* d-codegen.cc (build_assign): Construct initializations inside
TARGET_EXPR_INITIAL.
(compound_expr): Remove intermediate expressions that have no
side-effects.
(return_expr): Construct returns inside TARGET_EXPR_INITIAL.
* expr.cc (ExprVisitor::visit (CallExp *)): Remove useless assignment
to TARGET_EXPR_SLOT.
gcc/testsuite/ChangeLog:
PR d/100882
* gdc.dg/pr100882a.d: New test.
* gdc.dg/pr100882b.d: New test.
* gdc.dg/pr100882c.d: New test.
* gdc.dg/torture/pr100882.d: New test.
Harald Anlauf [Fri, 4 Jun 2021 17:23:48 +0000 (19:23 +0200)]
Fortran - ICE in inline_matmul_assign
Restrict inlining of matmul to those cases where assignment to the
result array does not need special treatment.
gcc/fortran/ChangeLog:
PR fortran/99839
* frontend-passes.c (inline_matmul_assign): Do not inline matmul
if the assignment to the resulting array if it is not of canonical
type (real/integer/complex/logical).
gcc/testsuite/ChangeLog:
PR fortran/99839
* gfortran.dg/inline_matmul_25.f90: New test.
Tobias Burnus [Fri, 4 Jun 2021 15:46:48 +0000 (17:46 +0200)]
Fortran: Fix OpenMP/OpenACC continue-line parsing
gcc/fortran/ChangeLog:
* scanner.c (skip_fixed_omp_sentinel): Set openacc_flag if
this is not an (OpenMP) continuation line.
(skip_fixed_oacc_sentinel): Likewise for openmp_flag and OpenACC.
(gfc_next_char_literal): gfc_error_now to force error for mixed OMP/ACC
continuation once per location and return '\n'.
gcc/testsuite/ChangeLog:
* gfortran.dg/goacc/omp-fixed.f: Re-add test item changed in previous
commit in addition - add more dg-errors and '... end ...' due to changed
parsing.
* gfortran.dg/goacc/omp.f95: Likewise.
* gfortran.dg/goacc-gomp/mixed-1.f: New test.
* gfortran.dg/gomp/pr99928-3.f90: Add 'default(none)', following
C/C++ version of the patch.
* gfortran.dg/gomp/loop-1.f90: New test.
* gfortran.dg/gomp/loop-2.f90: New test.
* gfortran.dg/gomp/pr99928-1.f90: New test; based on C/C++ test.
* gfortran.dg/gomp/pr99928-11.f90: Likewise.
* gfortran.dg/gomp/pr99928-2.f90: Likewise.
* gfortran.dg/gomp/pr99928-4.f90: Likewise.
* gfortran.dg/gomp/pr99928-5.f90: Likewise.
* gfortran.dg/gomp/pr99928-6.f90: Likewise.
* gfortran.dg/gomp/pr99928-8.f90: Likewise.
* gfortran.dg/goacc/omp.f95: Use 'acc kernels loops' instead
of 'acc loops' to hide unrelated bug for now.
* gfortran.dg/goacc/omp-fixed.f: Likewise
Tobias Burnus [Fri, 4 Jun 2021 09:25:50 +0000 (11:25 +0200)]
c++: Fix up attribute handling in methods in templates [PR100872]
The following testcase FAILs because a dependent (late) attribute is never
tsubsted. While the testcase is OpenMP, I think it is a generic C++ FE problem
that could affect any other dependent attribute.
apply_late_template_attributes documents that it relies on
/* save_template_attributes puts the dependent attributes at the beginning of
the list; find the non-dependent ones. */
The "operator binding" attributes that are sometimes added are added to the
head of DECL_ATTRIBUTES list though and because it doesn't have
ATTR_IS_DEPENDENT set it violates this requirement.
The following patch fixes it by adding that attribute after all
ATTR_IS_DEPENDENT attributes. I'm not 100% sure if DECL_ATTRIBUTES can't be
shared by multiple functions (e.g. the cdtor clones), but the code uses
later remove_attribute which could break that too.
Other option would be to copy_list the ATTR_IS_DEPENDENT portion of the
DECL_ATTRIBUTES list if we need to do this, that would be the same as this
patch but replace that *ap = op_attr; at the end with
*ap = NULL_TREE;
DECL_ATTRIBUTES (cfn) = chainon (copy_list (DECL_ATTRIBUTES (cfn)),
op_attr);
Or perhaps set ATTR_IS_DEPENDENT on the "operator bindings" attribute,
though it would need to be studied what would it try to do with the
attribute during tsubst.
2021-06-04 Jakub Jelinek <jakub@redhat.com>
PR c++/100872
* name-lookup.c (maybe_save_operator_binding): Add op_attr after all
ATTR_IS_DEPENDENT attributes in the DECL_ATTRIBUTES list rather than
to the start.
Tobias Burnus [Fri, 4 Jun 2021 09:19:55 +0000 (11:19 +0200)]
openmp: Assorted depend/affinity/iterator related fixes [PR100859]
The depend-iterator-3.C testcases shows various bugs.
1) tsubst_omp_clauses didn't handle OMP_CLAUSE_AFFINITY (should be
handled like OMP_CLAUSE_DEPEND)
2) because locators can be arbitrary lvalue expressions, we need
to allow for C++ array section base (especially when array section
is just an array reference) FIELD_DECLs, handle them as this->member,
but don't need to privatize in any way
3) similarly for this as base
4) depend(inout: this) is invalid, but for different reason than the reported
one, again this is an expression, but not lvalue expression, so that
should be reported
5) the ctor/dtor cloning in the C++ FE (which is using walk_tree with
copy_tree_body_r) didn't handle iterators correctly, walk_tree normally
doesn't walk TREE_PURPOSE of TREE_LIST, and in the iterator case
that TREE_VEC contains also a BLOCK that needs special handling during
copy_tree_body_r
2021-06-03 Jakub Jelinek <jakub@redhat.com>
PR c++/100859
gcc/
* tree-inline.c (copy_tree_body_r): Handle iterators on
OMP_CLAUSE_AFFINITY or OMP_CLAUSE_DEPEND.
gcc/c/
* c-typeck.c (c_finish_omp_clauses): Move OMP_CLAUSE_AFFINITY
after depend only cases.
gcc/cp/
* semantics.c (handle_omp_array_sections_1): For
OMP_CLAUSE_{AFFINITY,DEPEND} handle FIELD_DECL base using
finish_non_static_data_member and allow this as base.
(finish_omp_clauses): Move OMP_CLAUSE_AFFINITY
after depend only cases. Let this be diagnosed by !lvalue_p
case for OMP_CLAUSE_{AFFINITY,DEPEND} and remove useless
assert.
* pt.c (tsubst_omp_clauses): Handle OMP_CLAUSE_AFFINITY.
gcc/testsuite/
* g++.dg/gomp/depend-iterator-3.C: New test.
* g++.dg/gomp/this-1.C: Don't expect any diagnostics for
this as base expression of depend array section, expect a different
error wording for this as depend locator and add testcases
for affinity clauses.
Eric Botcazou [Thu, 3 Jun 2021 11:29:32 +0000 (13:29 +0200)]
Fix miscompilation of predicate on bit-packed array types
This is a regression present on the mainline and 11 branch in the form of a
miscompilation by the new mod/ref IPA pass of code that passes constrained
bit-packed array objets in a call to a subprograms taking unconstrained
bit-packed array parameters, which occurs for predicate on bit-packed array
types for example.
gcc/ada/
* gcc-interface/decl.c (gnat_to_gnu_entity) <E_Array_Type>: Add PAT
local constant and use it throughout. If it is set, use a ref-all
pointer type for the pointer-to-array field of the fat pointer type.
<E_Array_Subtype>: Add PAT local constant and use it throughout.
gcc/testsuite/
* gnat.dg/bit_packed_array6.adb: New test.
* gnat.dg/bit_packed_array6_pkg.ads: New helper.
Eric Botcazou [Thu, 3 Jun 2021 10:39:39 +0000 (12:39 +0200)]
Tame fix for PR ipa/99122
The return part has a major performance impact in Ada where variable-sized
types are first-class citizens, but it turns out that it is not exercized
in the testsuite yet, so back it out for now.
gcc/
PR ipa/99122
* tree-inline.c (inline_forbidden_p): Remove test on return type.
gcc/testsuite/
* gnat.dg/inline22.adb: New test.
Eric Botcazou [Mon, 10 May 2021 15:41:01 +0000 (17:41 +0200)]
Remove call to gcc_unreachable in range-op.cc
The Ada testcase happens to stumble on the call to gcc_unreachable in
operator_bitwise_xor::op1_range, but there is nothing wrong going on
and it's safe to let it go through.
gcc/
* range-op.cc (get_bool_state): Adjust head comment.
(operator_not_equal::op1_range): Fix comment.
(operator_bitwise_xor::op1_range): Remove call to gcc_unreachable.
gcc/testsuite/
* gnat.dg/specs/opt5.ads: New test.
* gnat.dg/specs/opt5_pkg.ads: New helper.
Alex Coplan [Wed, 19 May 2021 14:52:45 +0000 (15:52 +0100)]
arm: Fix ICE with CMSE nonsecure calls on Armv8.1-M [PR100333]
As the PR shows, we ICE shortly after expanding nonsecure calls for Armv8.1-M.
For Armv8.1-M, we have TARGET_HAVE_FPCXT_CMSE. As it stands, the expander
(arm.md:nonsecure_call_internal) moves the callee's address to a register (with
copy_to_suggested_reg) only if !TARGET_HAVE_FPCXT_CMSE.
However, looking at the pattern which the insn appears to be intended to
match (thumb2.md:*nonsecure_call_reg_thumb2_fpcxt), it requires the
callee's address to be in a register.
This patch therefore just forces the callee's address into a register in
the expander.
gcc/ChangeLog:
PR target/100333
* config/arm/arm.md (nonsecure_call_internal): Always ensure
callee's address is in a register.
gcc/testsuite/ChangeLog:
PR target/100333
* gcc.target/arm/cmse/pr100333.c: New test.
Jonathan Wakely [Thu, 20 May 2021 15:39:06 +0000 (16:39 +0100)]
libstdc++: Use __builtin_unreachable for constexpr assertions [PR 100676]
The current implementation of compile-time precondition checks causes
compilation to fail by calling a non-constexpr function declared at
block scope. This breaks the CUDA compiler, which wraps some libstdc++
headers in a pragma that declares everything as a __host__ __device__
function, but others are not wrapped and so everything is a __host__
function. The local declaration thus gets redeclared as two different
types of function, which doesn't work.
Just use __builtin_unreachable to make constant evaluation fail, instead
of the local function declaration. Also simplify the assertion macros,
which has the side effect of giving simpler compilation errors when
using Clang.
libstdc++-v3/ChangeLog:
PR libstdc++/100676
* include/bits/c++config (__glibcxx_assert_1): Rename to ...
(__glibcxx_constexpr_assert): ... this.
(__glibcxx_assert_impl): Use __glibcxx_constexpr_assert.
(__glibcxx_assert): Define as either __glibcxx_constexpr_assert
or __glibcxx_assert_impl.
(__glibcxx_assert_2): Remove
* include/debug/macros.h (_GLIBCXX_DEBUG_VERIFY_AT_F): Use
__glibcxx_constexpr_assert instead of __glibcxx_assert_1.
* testsuite/21_strings/basic_string_view/element_access/char/back_constexpr_neg.cc:
Adjust expected error.
* testsuite/21_strings/basic_string_view/element_access/char/constexpr_neg.cc:
Likewise.
* testsuite/21_strings/basic_string_view/element_access/char/front_constexpr_neg.cc:
Likewise.
Likewise.
* testsuite/21_strings/basic_string_view/element_access/wchar_t/back_constexpr_neg.cc:
Likewise.
* testsuite/21_strings/basic_string_view/element_access/wchar_t/constexpr_neg.cc:
Likewise.
* testsuite/21_strings/basic_string_view/element_access/wchar_t/front_constexpr_neg.cc:
Likewise.
* testsuite/23_containers/span/back_neg.cc: Likewise.
* testsuite/23_containers/span/front_neg.cc: Likewise.
* testsuite/23_containers/span/index_op_neg.cc: Likewise.
Jonathan Wakely [Tue, 1 Jun 2021 15:02:45 +0000 (16:02 +0100)]
libstdc++: Fix return value of std::ranges::advance [PR 100833]
The three-argument form of ranges::advance is supposed to return the
difference between the second argument and the distance the iterator was
advanced. When a non-random-access iterator is not advanced (because it
already equals the sentinel) we were returning 0 rather than n - 0.
libstdc++-v3/ChangeLog:
PR libstdc++/100833
* include/bits/ranges_base.h (ranges::advance(iter, n, sentinel)):
Fix return value for no-op case.
* testsuite/24_iterators/range_operations/advance.cc: Test
return values of three-argument overload.
Jonathan Wakely [Wed, 26 May 2021 16:32:53 +0000 (17:32 +0100)]
libstdc++: Change [range.iter.op] functions to function objects [PR 100768]
The standard specifies std::ranges::distance etc as function templates,
but it also requires them to not be found by ADL, and to suppress ADL
when normal unqualified lookup does find them. That means they need to
be function objects.
libstdc++-v3/ChangeLog:
PR libstdc++/100768
* include/bits/ranges_base.h (advance, distance, next, prev):
Replace function templates with function objects.
* testsuite/24_iterators/headers/iterator/synopsis_c++20.cc:
Adjust for changes to function objects.
* testsuite/std/ranges/adaptors/elements.cc: Add using
declarations for names from namespace ranges.
* testsuite/std/ranges/adaptors/transform.cc: Likewise.
* testsuite/24_iterators/range_operations/100768.cc: New test.
Jonathan Wakely [Tue, 1 Jun 2021 10:00:16 +0000 (11:00 +0100)]
libstdc++: Fix installation of python hooks [PR 99453]
When no shared library is installed, the new code to determine the name
of the -gdb.py file yields an empty string. Use the name of the static
library in that case.
libstdc++-v3/ChangeLog:
PR libstdc++/99453
* python/Makefile.am: Use archive name for printer hook if no
dynamic library name is available.
* python/Makefile.in: Regenerate.
Julian Brown [Tue, 18 May 2021 17:22:56 +0000 (10:22 -0700)]
[og11] Rework indirect struct handling for OpenACC in gimplify.c
This patch reworks indirect struct handling in gimplify.c (i.e. for
struct components mapped with "mystruct->a[0:n]", "mystruct->b", etc.),
for OpenACC. The key observation leading to these changes was that
component mappings of references-to-structures is already implemented
and working, and indirect struct component handling via a pointer can
work quite similarly. That lets us remove some earlier, special-case
handling for mapping indirect struct component accesses for OpenACC,
which required the pointed-to struct to be manually mapped before the
indirect component mapping.
With this patch, you can map struct components directly (e.g. an array
slice "mystruct->a[0:n]") just like you can map a non-indirect struct
component slice ("mystruct.a[0:n]"). Both references-to-pointers (with
the former syntax) and references to structs (with the latter syntax)
work now.
For Fortran class pointers, we no longer re-use GOMP_MAP_TO_PSET for the
class metadata (the structure that points to the class data and vptr)
-- it is instead treated as any other struct.
For C++, the struct handling also works for class members ("this->foo"),
without having to explicitly map "this[:1]" first.
For OpenACC, we permit chained indirect component references
("mystruct->a->b[0:n]"), though only the last part of such mappings will
trigger an attach/detach operation. To properly use such a construct
on the target, you must still manually map "mystruct->a[:1]" first --
but there's no need to map "mystruct[:1]" explicitly before that.
This version of the patch avoids altering code paths for OpenMP,
where possible.
2021-06-02 Julian Brown <julian@codesourcery.com>
gcc/fortran/
* trans-openmp.c (gfc_trans_omp_clauses): Don't create GOMP_MAP_TO_PSET
mappings for class metadata, nor GOMP_MAP_POINTER mappings for
POINTER_TYPE_P decls.
gcc/
* gimplify.c (extract_base_bit_offset): Add BASE_IND and OPENMP
parameters. Handle pointer-typed indirect references for OpenACC
alongside reference-typed ones.
(strip_components_and_deref, aggregate_base_p): New functions.
(build_struct_group): Add pointer type indirect ref handling,
including chained references, for OpenACC. Also handle references to
structs for OpenACC. Conditionalise bits for OpenMP only where
appropriate.
(gimplify_scan_omp_clauses): Rework pointer-type indirect structure
access handling to work more like the reference-typed handling for
OpenACC only.
* omp-low.c (scan_sharing_clauses): Handle pointer-type indirect struct
references, and references to pointers to structs also.
gcc/testsuite/
* g++.dg/goacc/member-array-acc.C: New test.
* g++.dg/gomp/member-array-omp.C: New test.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/deep-copy-15.c: New test.
* testsuite/libgomp.oacc-c-c++-common/deep-copy-16.c: New test.
* testsuite/libgomp.oacc-c++/deep-copy-17.C: New test.
Julian Brown [Tue, 18 May 2021 17:08:22 +0000 (10:08 -0700)]
[og11] Refactor struct lowering for OpenACC/OpenMP in gimplify.c
This patch is a second attempt at refactoring struct component mapping
handling for OpenACC/OpenMP during gimplification, after the patch I
posted here:
This patch goes further, in that the struct-handling code is outlined
into its own function (to create the "GOMP_MAP_STRUCT" node and the
sorted list of nodes immediately following it, from a set of mappings
of components of a given struct or derived type). I've also gone through
the list-handling code and attempted to add comments documenting how it
works to the best of my understanding, and broken out a couple of helper
functions in order to (hopefully) have the code self-document better also.
2021-06-02 Julian Brown <julian@codesourcery.com>
gcc/
* gimplify.c (insert_struct_comp_map): Refactor function into...
(build_struct_comp_nodes): This new function. Remove list handling
and improve self-documentation.
(insert_node_after, move_node_after, move_nodes_after,
move_concat_nodes_after): New helper functions.
(build_struct_group): New function to build up GOMP_MAP_STRUCT node
groups to map struct components. Outlined from...
(gimplify_scan_omp_clauses): Here. Call above function.
Julian Brown [Mon, 19 Apr 2021 13:24:41 +0000 (06:24 -0700)]
[og11] Unify ARRAY_REF/INDIRECT_REF stripping code in extract_base_bit_offset
For historical reasons, it seems that extract_base_bit_offset
unnecessarily used two different ways to strip ARRAY_REF/INDIRECT_REF
nodes from component accesses. I verified that the two ways of performing
the operation gave the same results across the whole testsuite (and
several additional benchmarks).
The code was like this since an earlier "mechanical" refactoring by me,
first posted here:
It was never clear to me if there was an important semantic
difference between the two ways of stripping the base before calling
get_inner_reference, but it appears that there is not, so one can go away.
2021-06-02 Julian Brown <julian@codesourcery.com>
gcc/
* gimplify.c (extract_base_bit_offset): Unify ARRAY_REF/INDIRECT_REF
stripping code in first call/subsequent call cases.
It never makes sense for a GOMP_MAP_ATTACH_DETACH mapping to survive
beyond gimplify.c, so this patch rewrites such mappings to GOMP_MAP_ATTACH
or GOMP_MAP_DETACH unconditionally (rather than checking for a list
of types of OpenACC or OpenMP constructs), in cases where it hasn't
otherwise been done already in the preceding code.
2021-06-02 Julian Brown <julian@codesourcery.com>
gcc/
* gimplify.c (gimplify_scan_omp_clauses): Simplify condition
for changing GOMP_MAP_ATTACH_DETACH to GOMP_MAP_ATTACH or
GOMP_MAP_DETACH.
Jason Merrill [Fri, 28 May 2021 21:05:23 +0000 (17:05 -0400)]
c++: no clobber for C++20 destroying delete [PR91859]
Before C++20 added destroying operator delete, by the time we called
operator delete for a pointer, the object would already be gone. But that
isn't true for destroying delete. Since the optimizers' assumptions about
operator delete are based on either DECL_IS_REPLACEABLE_OPERATOR (which
already is not set) or CALL_FROM_NEW_OR_DELETE_P, let's avoid setting the
latter flag in this case.
Jason Merrill [Fri, 28 May 2021 03:54:52 +0000 (23:54 -0400)]
c++: 'this' adjustment for devirtualized call
My patch for 95719 made us do a better job of finding the actual virtual
function we want to call, but didn't update the 'this' pointer adjustment to
match.
PR c++/100797
PR c++/95719
gcc/cp/ChangeLog:
* call.c (build_over_call): Adjust base_binfo in
resolves_to_fixed_type_p case.
Tobias Burnus [Tue, 1 Jun 2021 12:16:42 +0000 (14:16 +0200)]
Fortran/OpenMP: Support (parallel) master taskloop (simd) [PR99928]
PR middle-end/99928
gcc/fortran/ChangeLog:
* dump-parse-tree.c (show_omp_node, show_code_node): Handle
(parallel) master taskloop (simd).
* frontend-passes.c (gfc_code_walker): Set in_omp_workshare
to false for parallel master taskloop (simd).
* gfortran.h (enum gfc_statement):
Add ST_OMP_(END_)(PARALLEL_)MASTER_TASKLOOP(_SIMD).
(enum gfc_exec_op): EXEC_OMP_(PARALLEL_)MASTER_TASKLOOP(_SIMD).
* match.h (gfc_match_omp_master_taskloop,
gfc_match_omp_master_taskloop_simd,
gfc_match_omp_parallel_master_taskloop,
gfc_match_omp_parallel_master_taskloop_simd): New prototype.
* openmp.c (gfc_match_omp_parallel_master_taskloop,
gfc_match_omp_parallel_master_taskloop_simd,
gfc_match_omp_master_taskloop,
gfc_match_omp_master_taskloop_simd): New.
(gfc_match_omp_taskloop_simd): Permit 'reduction' clause.
(resolve_omp_clauses): Handle new combined directives; remove
inscan-reduction check to reduce multiple errors; add
task-reduction error for 'taskloop simd'.
(gfc_resolve_omp_parallel_blocks,
resolve_omp_do, omp_code_to_statement,
gfc_resolve_omp_directive): Handle new combined constructs.
* parse.c (decode_omp_directive, next_statement,
gfc_ascii_statement, parse_omp_do, parse_omp_structured_block,
parse_executable): Likewise.
* resolve.c (gfc_resolve_blocks, gfc_resolve_code): Likewise.
* st.c (gfc_free_statement): Likewise.
* trans.c (trans_code): Likewise.
* trans-openmp.c (gfc_split_omp_clauses,
gfc_trans_omp_directive): Likewise.
(gfc_trans_omp_parallel_master): Move after gfc_trans_omp_master_taskloop;
handle parallel master taskloop (simd) as well.
(gfc_trans_omp_taskloop): Take gfc_exec_op as arg.
(gfc_trans_omp_master_taskloop): New.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/reduction5.f90: Remove dg-error; the issue is
now diagnosed with less error output.
* gfortran.dg/gomp/scan-1.f90: Likewise.
* gfortran.dg/gomp/pr99928-3.f90: New test.
* gfortran.dg/gomp/taskloop-1.f90: New test.
Tobias Burnus [Mon, 31 May 2021 14:50:19 +0000 (16:50 +0200)]
gfortran.dg/gomp/depend-iterator-{1,2}.f90: Use dg-do compile
'dg-do run' is pointless -1 due to dg-error. And it won't work except
by chance for gomp tests; as -2 only has depend(out:), a 'dg-do compile'
is sufficient.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/depend-iterator-1.f90: Use dg-do compile.
* gfortran.dg/gomp/depend-iterator-2.f90: Use dg-do compile.
Jakub Jelinek [Mon, 31 May 2021 14:49:06 +0000 (16:49 +0200)]
openmp: Add shared to parallel for linear on parallel master taskloop simd [PR99928]
I forgot to add default(none) and defaultmap(none) wherever possible on the
testcases to make sure none of the required clauses are added implicitly (because
in that case it doesn't work with these none arguments of those default* clauses
or works differently with other default* settings.
And that revealed we didn't add shared on parallel for linear clause
on parallel master taskloop simd, so this patch fixes that too.
2021-05-29 Jakub Jelinek <jakub@redhat.com>
PR middle-end/99928
* gimplify.c (gimplify_scan_omp_clauses): For taskloop simd
combined with parallel, make sure to add shared clause to
parallel for explicit linear clause.
* c-c++-common/gomp/pr99928-1.c: Add default(none) to constructs
combined with parallel, teams or taskloop and defaultmap(none)
to constructs combined with target.
* c-c++-common/gomp/pr99928-2.c: Likewise.
* c-c++-common/gomp/pr99928-3.c: Likewise.
* c-c++-common/gomp/pr99928-4.c: Likewise.
* c-c++-common/gomp/pr99928-5.c: Likewise.
* c-c++-common/gomp/pr99928-6.c: Likewise.
* c-c++-common/gomp/pr99928-7.c: Likewise.
* c-c++-common/gomp/pr99928-8.c: Likewise.
* c-c++-common/gomp/pr99928-9.c: Likewise.
* c-c++-common/gomp/pr99928-10.c: Likewise.
* c-c++-common/gomp/pr99928-13.c: Likewise.
* c-c++-common/gomp/pr99928-14.c: Likewise.
This v3 adds a small bug fix, where the initialization of the refcount didn't
handle all cases, fixed by using gomp_refcount_increment here (more consistent).
libgomp/ChangeLog:
* target.c (gomp_map_vars_internal): For new key entries, set
k->refcount to 0, remove initialization of k->structelem_refcount,
use gomp_increment_refcount to consistently handle all increment cases.
Jakub Jelinek [Tue, 25 May 2021 15:24:38 +0000 (17:24 +0200)]
c++: Avoid -Wunused-value false positives on nullptr passed to ellipsis [PR100666]
When passing expressions with decltype(nullptr) type with side-effects to
ellipsis, we pass (void *)0 instead, but for the side-effects evaluate them
on the lhs of a COMPOUND_EXPR. Unfortunately that means we warn about it
if the expression is a call to nodiscard marked function, even when the
result is really used, just needs to be transformed.
Fixed by adding a warning_sentinel.
2021-05-25 Jakub Jelinek <jakub@redhat.com>
PR c++/100666
* call.c (convert_arg_to_ellipsis): For expressions with NULLPTR_TYPE
and side-effects, temporarily disable -Wunused-result warning when
building COMPOUND_EXPR.
* g++.dg/cpp1z/nodiscard8.C: New test.
* g++.dg/cpp1z/nodiscard9.C: New test.
Jakub Jelinek [Tue, 25 May 2021 14:44:35 +0000 (16:44 +0200)]
c++tools: Include <cstdlib> for exit [PR100731]
This TU uses exit, but doesn't include <stdlib.h> or <cstdlib> and relies
on some other header to include it indirectly, which apparently doesn't
happen on reporter's host.
The other <c*> headers aren't guarded either and we rely on a compiler
capable of C++11, so maybe we can rely on <cstdlib> being around
unconditionally.
2021-05-25 Jakub Jelinek <jakub@redhat.com>
PR bootstrap/100731
* server.cc: Include <cstdlib>.
Jakub Jelinek [Thu, 20 May 2021 07:09:07 +0000 (09:09 +0200)]
libcpp: Fix up -fdirectives-only handling of // comments on last line not terminated with newline [PR100646]
As can be seen on the testcases, before the -fdirectives-only preprocessing
rewrite the preprocessor would assume // comments are terminated by the
end of file even when newline wasn't there, but now we error out.
The following patch restores the previous behavior.
2021-05-20 Jakub Jelinek <jakub@redhat.com>
PR preprocessor/100646
* lex.c (cpp_directive_only_process): Treat end of file as termination
for !is_block comments.
* gcc.dg/cpp/pr100646-1.c: New test.
* gcc.dg/cpp/pr100646-2.c: New test.
Jakub Jelinek [Wed, 19 May 2021 10:05:30 +0000 (12:05 +0200)]
builtins: Fix ICE with unprototyped builtin call [PR100576]
For unprototyped builtins the checking we perform is only about whether
the used argument is integral, pointer etc., not the exact precision.
We emit a warning about the problem though:
pr100576.c: In function ‘foo’:
pr100576.c:9:11: warning: implicit declaration of function ‘memcmp’ [-Wimplicit-function-declaration]
9 | int n = memcmp (p, v, b);
| ^~~~~~
pr100576.c:1:1: note: include ‘<string.h>’ or provide a declaration of ‘memcmp’
+++ |+#include <string.h>
1 | /* PR middle-end/100576 */
pr100576.c:9:25: warning: ‘memcmp’ argument 3 type is ‘int’ where ‘long unsigned int’ is expected in a call to built-in function declared without prototype
+[-Wbuiltin-declaration-mismatch]
9 | int n = memcmp (p, v, b);
| ^
It means in the testcase below where the user incorrectly called memcmp
with last argument int rather then size_t, the warning stuff in builtins.c
ICEs because it compares a wide_int from such a bound with another wide_int
which has precision of size_t/sizetype and wide_int asserts the compared
wide_ints are compatible.
Fixed by forcing the bound to have the right type.
2021-05-19 Jakub Jelinek <jakub@redhat.com>
PR middle-end/100576
* builtins.c (check_read_access): Convert bound to size_type_node if
non-NULL.
Jakub Jelinek [Tue, 18 May 2021 08:26:45 +0000 (10:26 +0200)]
regcprop: Avoid DCE of asm goto [PR100590]
The following testcase ICEs, because copyprop_hardreg_forward_1 decides
to DCE asm goto with REG_UNUSED notes (because the output is unused and
asm isn't volatile). But that DCE just removes the asm goto, leaving
a bb with two successors and no insn at the end that would allow that.
The following patch makes sure we drop that way only INSNs and not
JUMP_INSNs or CALL_INSNs.
2021-05-18 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/100590
* regcprop.c (copyprop_hardreg_forward_1): Only DCE dead sets if
they are NONJUMP_INSN_P.
Jakub Jelinek [Tue, 18 May 2021 08:10:17 +0000 (10:10 +0200)]
function: Set dummy DECL_ASSEMBLER_NAME in push_dummy_function [PR100580]
Last year I've added cgraph_node::get_create calls for the dummy
functions used for -fdump-passes, so that it interacts well with pass
disabling/enabling which is cgraph uid based.
Unfortunately, as the following testcase shows, when assembler hash
is present, that wants to compute DECL_ASSEMBLER_NAME and the C++ FE
is unprepared to handle it on the dummy functions which don't have
DECL_NAME etc.
The following patch fixes it by setting up a dummy DECL_ASSEMBLER_NAME
on these, so that the FEs don't need to compute it.
2021-05-18 Jakub Jelinek <jakub@redhat.com>
PR c++/100580
* function.c (push_dummy_function): Set DECL_ARTIFICIAL and
DECL_ASSEMBLER_NAME on the fn_decl.
Jakub Jelinek [Sat, 15 May 2021 08:12:11 +0000 (10:12 +0200)]
regcprop: Fix another cprop_hardreg bug [PR100342]
On Tue, Jan 19, 2021 at 04:10:33PM +0000, Richard Sandiford via Gcc-patches wrote:
> Ah, ok, thanks for the extra context.
>
> So AIUI the problem when recording xmm2<-di isn't just:
>
> [A] partial_subreg_p (vd->e[sr].mode, GET_MODE (src))
>
> but also that:
>
> [B] partial_subreg_p (vd->e[sr].mode, vd->e[vd->e[sr].oldest_regno].mode)
>
> For example, all registers in this sequence can be part of the same chain:
>
> (set (reg:HI R1) (reg:HI R0))
> (set (reg:SI R2) (reg:SI R1)) // [A]
> (set (reg:DI R3) (reg:DI R2)) // [A]
> (set (reg:SI R4) (reg:SI R[0-3]))
> (set (reg:HI R5) (reg:HI R[0-4]))
>
> But:
>
> (set (reg:SI R1) (reg:SI R0))
> (set (reg:HI R2) (reg:HI R1))
> (set (reg:SI R3) (reg:SI R2)) // [A] && [B]
>
> is problematic because it dips below the precision of the oldest regno
> and then increases again.
>
> When this happens, I guess we have two choices:
>
> (1) what the patch does: treat R3 as the start of a new chain.
> (2) pretend that the copy occured in vd->e[sr].mode instead
> (i.e. copy vd->e[sr].mode to vd->e[dr].mode)
>
> I guess (2) would need to be subject to REG_CAN_CHANGE_MODE_P.
> Maybe the optimisation provided by (2) compared to (1) isn't common
> enough to be worth the complication.
>
> I think we should test [B] as well as [A] though. The pass is set
> up to do some quite elaborate mode changes and I think rejecting
> [A] on its own would make some of the other code redundant.
> It also feels like it should be a seperate “if” or “else if”,
> with its own comment.
Unfortunately, we now have a testcase that shows that testing also [B]
is a problem (unfortunately now latent on the trunk, only reproduces
on 10 and 11 branches).
The comment in the patch tries to list just the interesting instructions,
we have a 64-bit value, copy low 8 bit of those to another register,
copy full 64 bits to another register and then clobber the original register.
Before that (set (reg:DI r14) (const_int ...)) we have a chain
DI r14, QI si, DI bp , that instruction drops the DI r14 from that chain, so
we have QI si, DI bp , si being the oldest_regno.
Next DI si is copied into DI dx. Only the low 8 bits of that are defined,
the rest is unspecified, but we would add DI dx into that same chain at the
end, so QI si, DI bp, DI dx [*]. Next si is overwritten, so the chain is
DI bp, DI dx. And then we see (set (reg:DI dx) (reg:DI bp)) and remove it
as redundant, because we think bp and dx are already equivalent, when in
reality that is true only for the lowpart 8 bits.
I believe the [*] marked step above is where the bug is.
The committed regcprop.c (copy_value) change (but only committed to
trunk/11, not to 10) added
else if (partial_subreg_p (vd->e[sr].mode, GET_MODE (src))
&& partial_subreg_p (vd->e[sr].mode,
vd->e[vd->e[sr].oldest_regno].mode))
return;
and while the first partial_subreg_p call returns true, the second one
doesn't; before the (set (reg:DI r14) (const_int ...)) insn it would be
true and we'd return, but as that reg got clobbered, si became the oldest
regno in the chain and so vd->e[vd->e[sr].oldest_regno].mode is QImode
and vd->e[sr].mode is QImode too, so the second partial_subreg_p is false.
But as the testcase shows, what is the oldest_regno in the chain is
something that changes over time, so relying on it for anything is
problematic, something could have a different oldest_regno and later
on get a different oldest_regno (perhaps with different mode) because
the oldest_regno got overwritten and it can change both ways.
The following patch effectively implements your (2) above.
2021-05-15 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/100342
* regcprop.c (copy_value): When copying a source reg in a wider
mode than it has recorded for the value, adjust recorded destination
mode too or punt if !REG_CAN_CHANGE_MODE_P.
Chung-Lin Tang [Thu, 27 May 2021 14:57:09 +0000 (22:57 +0800)]
OpenMP 5.0: Remove array section base-pointer mapping semantics, and other front-end adjustments
This patch largely implements three pieces of functionality:
(1) Per discussion and clarification on the omp-lang mailing list, standards
conforming behavior for mapping array sections should *NOT* also map the
base-pointer, i.e for this code:
struct S { int *ptr; ... };
struct S s;
#pragma omp target enter data map(to: s.ptr[:100])
Currently we generate after gimplify:
map(struct:s [len: 1]) map(alloc:s.ptr [len: 8]) \
map(to:*_1 [len: 400]) map(attach:s.ptr [bias: 0])
which is deemed incorrect. After this patch, the gimplify results are now
adjusted to:
map(attach:s.ptr [bias: 0])
(the attach operation is still generated, and if s.ptr is already mapped prior,
attachment will happen)
The correct way of achieving the base-pointer-also-mapped behavior would be
to use:
This adjustment in behavior required a number of small adjustments here and
there in gimplify, including to accomodate map sequences for C++ references.
(2) Fixes in libgomp/target.c to not overwrite attached pointers when handling
device<->host copies, mainly for the "always" case.
(3) A set of changes to the C/C++ front-ends to extend the allowed component
access syntax in map clauses. This is actually mainly an effort to allow
SPEC HPC to compile. These changes are enabled for both OpenACC and OpenMP.
gcc/c/ChangeLog:
* c-parser.c (struct omp_dim): New struct type for use inside
c_parser_omp_variable_list.
(c_parser_omp_variable_list): Allow multiple levels of array and
component accesses in array section base-pointer expression.
(c_parser_omp_clause_to): Set 'allow_deref' to true in call to
c_parser_omp_var_list_parens.
(c_parser_omp_clause_from): Likewise.
* c-typeck.c (handle_omp_array_sections_1): Extend allowed range
of base-pointer expressions involving INDIRECT/MEM/ARRAY_REF and
POINTER_PLUS_EXPR.
(c_finish_omp_clauses): Extend allowed ranged of expressions
involving INDIRECT/MEM/ARRAY_REF and POINTER_PLUS_EXPR.
gcc/cp/ChangeLog:
* parser.c (struct omp_dim): New struct type for use inside
cp_parser_omp_var_list_no_open.
(cp_parser_omp_var_list_no_open): Allow multiple levels of array and
component accesses in array section base-pointer expression.
(cp_parser_omp_all_clauses): Set 'allow_deref' to true in call to
cp_parser_omp_var_list for to/from clauses.
* semantics.c (handle_omp_array_sections_1): Extend allowed range
of base-pointer expressions involving INDIRECT/MEM/ARRAY_REF and
POINTER_PLUS_EXPR.
(handle_omp_array_sections): Adjust pointer map generation of
references.
(finish_omp_clauses): Extend allowed ranged of expressions
involving INDIRECT/MEM/ARRAY_REF and POINTER_PLUS_EXPR.
gcc/fortran/ChangeLog:
* trans-openmp.c (gfc_trans_omp_array_section): Do not generate
GOMP_MAP_ALWAYS_POINTER map for main array maps of ARRAY_TYPE type.
gcc/ChangeLog:
* gimplify.c (extract_base_bit_offset): Add 'tree *offsetp' parameter,
accomodate case where 'offset' return of get_inner_reference is
non-NULL.
(is_or_contains_p): Further robustify conditions.
(omp_target_reorder_clauses): In alloc/to/from sorting phase, also
move following GOMP_MAP_ALWAYS_POINTER maps along. Add new sorting
phase where we make sure pointers with an attach/detach map are ordered
correctly.
(gimplify_scan_omp_clauses): Add modifications to avoid creating
GOMP_MAP_STRUCT and associated alloc map for attach/detach maps.
gcc/testsuite/ChangeLog:
* c-c++-common/goacc/deep-copy-arrayofstruct.c: Adjust testcase.
* c-c++-common/gomp/target-enter-data-1.c: New testcase.
* c-c++-common/gomp/target-implicit-map-2.c: New testcase.
libgomp/ChangeLog:
* target.c (gomp_map_vars_existing): Make sure attached pointer is
not overwritten during cross-host/device copying.
(gomp_update): Likewise.
(gomp_exit_data): Likewise.
* testsuite/libgomp.c++/target-11.C: Adjust testcase.
* testsuite/libgomp.c++/target-12.C: Likewise.
* testsuite/libgomp.c++/target-15.C: Likewise.
* testsuite/libgomp.c++/target-16.C: Likewise.
* testsuite/libgomp.c++/target-17.C: Likewise.
* testsuite/libgomp.c++/target-21.C: Likewise.
* testsuite/libgomp.c++/target-23.C: Likewise.
* testsuite/libgomp.c/target-23.c: Likewise.
* testsuite/libgomp.c/target-29.c: Likewise.
* testsuite/libgomp.c-c++-common/target-implicit-map-2.c: New testcase.
Chung-Lin Tang [Thu, 27 May 2021 14:22:00 +0000 (22:22 +0800)]
OpenMP 5.0: Implement relaxation of implicit map vs. existing device mappings
This patch is a merge of:
https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570365.html
This patch implements relaxing the requirements when a map with the implicit
attribute encounters an overlapping existing map. As the OpenMP 5.0 spec
describes on page 320, lines 18-27 (and 5.1 spec, page 352, lines 13-22):
"If a single contiguous part of the original storage of a list item with an
implicit data-mapping attribute has corresponding storage in the device data
environment prior to a task encountering the construct that is associated with
the map clause, only that part of the original storage will have corresponding
storage in the device data environment as a result of the map clause."
Also tracked in the OpenMP spec context as issue #1463:
https://github.com/OpenMP/spec/issues/1463
include/ChangeLog:
* gomp-constants.h (GOMP_MAP_IMPLICIT): New special map kind bits value.
(GOMP_MAP_FLAG_SPECIAL_BITS): Define helper mask for whole set of
special map kind bits.
(GOMP_MAP_NONCONTIG_ARRAY_P): Adjust test for non-contiguous array map
kind bits to be more specific.
(GOMP_MAP_IMPLICIT_P): New predicate macro for implicit map kinds.
gcc/ChangeLog:
* tree.h (OMP_CLAUSE_MAP_IMPLICIT_P): New access macro for 'implicit'
bit, using 'base.deprecated_flag' field of tree_node.
* tree-pretty-print.c (dump_omp_clause): Add support for printing
implicit attribute in tree dumping.
* gimplify.c (gimplify_adjust_omp_clauses_1):
Set OMP_CLAUSE_MAP_IMPLICIT_P to 1 if map clause is implicitly created.
(gimplify_adjust_omp_clauses): Adjust place of adding implicitly created
clauses, from simple append, to starting of list, after non-map clauses.
* omp-low.c (lower_omp_target): Add GOMP_MAP_IMPLICIT bits into kind
values passed to libgomp for implicit maps.
* target.c (gomp_map_vars_existing): Add 'bool implicit' parameter, add
implicit map handling to allow a "superset" existing map as valid case.
(get_kind): Adjust to filter out GOMP_MAP_IMPLICIT bits in return value.
(get_implicit): New function to extract implicit status.
(gomp_map_fields_existing): Adjust arguments in calls to
gomp_map_vars_existing, and add uses of get_implicit.
(gomp_map_vars_internal): Likewise.
* testsuite/libgomp.c-c++-common/target-implicit-map-1.c: New test.
Chung-Lin Tang [Thu, 27 May 2021 11:53:45 +0000 (19:53 +0800)]
OpenMP 5.0: Improve OpenMP target support for C++ (includes PR92120 v3)
This patch is a merge of:
https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570886.html
To summarize, this patch set is an improvement for OpenMP target support for
C++, including for inside non-static members, lambda objects, and struct
member deref access expressions. The corresponding modifications for the C
front-end are also included.
This patch supercedes the prior versions of my PR92120 patch
(implicit C++ map(this[:1])), so dubbing this "v3" of patch for that PR.
Prior versions of the PR92120 patch was implemented by recording uses of
'this' in the parser, and then use the recorded uses during "finish" to
create the implicit maps.
When working on supporting lambda objects, this required using a tree-walk
style processing of the OMP_TARGET body, so in only made sense to merge the
entire 'this' processing together with it, so a large part of the parser
changes were dropped, with the main processing in semantics.c now.
Other parser changes to support '->' in map clauses are also with this patch.
gcc/cp/ChangeLog:
* cp-tree.h (finish_omp_target): New declaration.
(finish_omp_target_clauses): Likewise.
* parser.c (cp_parser_omp_clause_map): Adjust call to
cp_parser_omp_var_list_no_open to set 'allow_deref' argument to true.
(cp_parser_omp_target): Factor out code, adjust into calls to new
function finish_omp_target.
* pt.c (tsubst_expr): Add call to finish_omp_target_clauses for
OMP_TARGET case.
* semantics.c (handle_omp_array_sections_1): Add handling to create
'this->member' from 'member' FIELD_DECL.
(handle_omp_array_sections): Likewise.
(finish_omp_clauses): Likewise. Adjust to allow 'this[]' in OpenMP
map clauses. Handle 'A->member' case in map clauses.
(struct omp_target_walk_data): New struct for walking over
target-directive tree body.
(finish_omp_target_clauses_r): New function for tree walk.
(finish_omp_target_clauses): New function.
(finish_omp_target): New function.
gcc/c/ChangeLog:
* c-parser.c (c_parser_omp_clause_map): Set 'allow_deref' argument in
call to c_parser_omp_variable_list to 'true'.
* c-typeck.c (handle_omp_array_sections_1): Add strip of MEM_REF in
array base handling.
(c_finish_omp_clauses): Handle 'A->member' case in map clauses.
gcc/ChangeLog:
* gimplify.c ("tree-hash-traits.h"): Add include.
(gimplify_scan_omp_clauses): Change struct_map_to_clause to type
hash_map<tree_operand, tree> *. Adjust struct map handling to handle
cases of *A and A->B expressions. Under !DECL_P case of
GOMP_CLAUSE_MAP handling, add STRIP_NOPS for indir_p case, add to
struct_deref_set for map(*ptr_to_struct) cases. Add MEM_REF case when
handling component_ref_p case. Add unshare_expr and gimplification
when created GOMP_MAP_STRUCT is not a DECL. Add code to add
firstprivate pointer for *pointer-to-struct case.
(gimplify_adjust_omp_clauses): Move GOMP_MAP_STRUCT removal code for
exit data directives code to earlier position.
* omp-low.c (lower_omp_target):
Handle GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION, and
GOMP_MAP_POINTER_TO_ZERO_LENGTH_ARRAY_SECTION map kinds.
* tree-pretty-print.c (dump_omp_clause): Likewise.
gcc/testsuite/ChangeLog:
* gcc.dg/gomp/target-3.c: New testcase.
* g++.dg/gomp/target-3.C: New testcase.
* g++.dg/gomp/target-lambda-1.C: New testcase.
* g++.dg/gomp/target-this-1.C: New testcase.
* g++.dg/gomp/target-this-2.C: New testcase.
* g++.dg/gomp/target-this-3.C: New testcase.
* g++.dg/gomp/target-this-4.C: New testcase.
* g++.dg/gomp/target-this-5.C: New testcase.
* g++.dg/gomp/this-2.C: Adjust testcase.
include/ChangeLog:
* gomp-constants.h (enum gomp_map_kind):
Add GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION, and
GOMP_MAP_POINTER_TO_ZERO_LENGTH_ARRAY_SECTION map kinds.
(GOMP_MAP_POINTER_P):
Include GOMP_MAP_POINTER_TO_ZERO_LENGTH_ARRAY_SECTION.
libgomp/ChangeLog:
* libgomp.h (gomp_attach_pointer): Add bool parameter.
* oacc-mem.c (acc_attach_async): Update call to gomp_attach_pointer.
(goacc_enter_data_internal): Likewise.
* target.c (gomp_map_vars_existing): Update assert condition to
include GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION.
(gomp_map_pointer): Add 'bool allow_zero_length_array_sections'
parameter, add support for mapping a pointer with NULL target.
(gomp_attach_pointer): Add 'bool allow_zero_length_array_sections'
parameter, add support for attaching a pointer with NULL target.
(gomp_map_vars_internal): Update calls to gomp_map_pointer and
gomp_attach_pointer, add handling for
GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION, and
GOMP_MAP_POINTER_TO_ZERO_LENGTH_ARRAY_SECTION cases.
* testsuite/libgomp.c++/target-23.C: New testcase.
* testsuite/libgomp.c++/target-lambda-1.C: New testcase.
* testsuite/libgomp.c++/target-this-1.C: New testcase.
* testsuite/libgomp.c++/target-this-2.C: New testcase.
* testsuite/libgomp.c++/target-this-3.C: New testcase.
* testsuite/libgomp.c++/target-this-4.C: New testcase.
* testsuite/libgomp.c++/target-this-5.C: New testcase.
David Edelsohn [Fri, 23 Apr 2021 21:45:10 +0000 (17:45 -0400)]
testuite: fix libtdc++ libatomic flags
Some ports require libatomic for atomic operations, at least for some
data types and widths. The libstdc++ testsuite previously was updated
to link against libatomic, but the search path was hard-coded to
something that is not always correct, and the shared library search
path was not set.
The search path was hard-coded to the expected location of the
libatomic build directory relative to the libstdc++ testsuite
directory, but if one uses parallelism when invoking the libstdc++
testsuite, the tests are run in the "normalXX" sub-directories, for
which the hard-coded search path is incorrect. The path also is
incorrect for alternative multilib and tool options.
This patch adopts the logic from gcc/testsuite/lib/atomic-dg.exp to
search for the library and adds the logic to the libstdc++ testsuite
libatomic seatch path code. Previously the libstdc++ testsuite atomic
tests failed depending on the build configuration and if a build of
libatomic was installed in the default search path.
Bootstrapped on powerpc-ibm-aix7.2.3.0.
libstdc++-v3/ChangeLog:
* testsuite/lib/dg-options.exp (atomic_link_flags): New.
(add_options_for_libatomic): Use atomic_link_flags.
AIX uses a compiler-managed TOC for global data, including TLS symbols.
The GCC TOC implementation manages the TOC entries through the constant pool.
TLS symbols sometimes require a function call to obtain the TLS base
pointer. The arguments to the TLS call can conflict with arguments to
a normal function call if the TLS symbol is an argument in the normal call.
GCC specifically checks for this situation and precomputes the TLS
arguments, but the mechanism to check for this requirement utilizes
legitimate_constant_p(). The necessary result of legitimate_constant_p()
for correct TOC behavior and for correct TLS argument behavior is in
conflict.
This patch adds a new target hook precompute_tls_p() to decide if an
argument should be precomputed regardless of the result from
legitmate_constant_p().
Harald Anlauf [Mon, 17 May 2021 19:35:38 +0000 (21:35 +0200)]
PR fortran/98411 - Pointless warning for static variables
Variables with explicit SAVE attribute cannot end up on the stack.
There is no point in checking whether they should be moved off the
stack to static storage.
gcc/fortran/ChangeLog:
PR fortran/98411
* trans-decl.c (gfc_finish_var_decl): Add check for explicit SAVE
attribute.
gcc/testsuite/ChangeLog:
PR fortran/98411
* gfortran.dg/pr98411.f90: New test.
Harald Anlauf [Thu, 27 May 2021 11:55:11 +0000 (13:55 +0200)]
PR fortran/100656 - prevent ICE in gfc_conv_expr_present
gcc/fortran/ChangeLog:
PR fortran/100656
* trans-array.c (gfc_conv_ss_startstride): Do not call check for
presence of a dummy argument when a symbol actually refers to a
non-dummy.
gcc/testsuite/ChangeLog:
PR fortran/100656
* gfortran.dg/bounds_check_22.f90: New test.
Patrick Palka [Fri, 30 Apr 2021 22:45:46 +0000 (18:45 -0400)]
libstdc++: Implement P2328 changes to join_view
This implements the wording changes of P2328R0 "join_view should join
all views of ranges".
libstdc++-v3/ChangeLog:
* include/std/ranges (__detail::__non_propating_cache): Define
as per P2328.
(join_view): Remove constraints on the value and reference types
of the wrapped iterator type as per P2328.
(join_view::_Iterator::_M_satisfy): Adjust as per P2328.
(join_view::_Iterator::operator++): Likewise.
(join_view::_M_inner): Use __non_propating_cache as per P2328.
Remove now-redundant use of __maybe_present_t.
* testsuite/std/ranges/adaptors/join.cc: Include <array>.
(test10): New test.
Patrick Palka [Mon, 24 May 2021 19:24:44 +0000 (15:24 -0400)]
libstdc++: Fix iterator caching inside range adaptors [PR100479]
This fixes two issues with our iterator caching as described in detail
in the PR. Since we recently added the __non_propagating_cache class
template as part of r12-336 for P2328, this patch just rewrites the
problematic _CachedPosition partial specialization in terms of this
class template.
For the offset partial specialization, it's safe to propagate the cached
offset on copy/move, but we should still invalidate the cached offset in
the source object on move.
libstdc++-v3/ChangeLog:
PR libstdc++/100479
* include/std/ranges (__detail::__non_propagating_cache): Move
definition up to before that of _CachedPosition. Make base
class _Optional_base protected instead of private. Add const
overload for operator*.
(__detail::_CachedPosition): Rewrite the partial specialization
for forward ranges as a derived class of __non_propagating_cache.
Remove the size constraint on the partial specialization for
random access ranges. Add copy/move/copy-assignment/move-assignment
members to the offset partial specialization for random
access ranges that propagate the cached value but additionally
invalidate it in the source object on move.
* testsuite/std/ranges/adaptors/100479.cc: New test.
Patrick Palka [Wed, 26 May 2021 20:02:33 +0000 (16:02 -0400)]
c++: access for hidden friend of nested class template [PR100502]
Here, during ahead of time access checking for the private member
EnumeratorRange<T>::end_reached_ in the hidden friend f, we're triggering
the assert in enforce_access that verifies we're not trying to add a
access check for a dependent decl onto TI_DEFERRED_ACCESS_CHECKS.
The special thing about this class member access expression is that
the overall expression is non-dependent (so finish_class_member_access_expr
doesn't exit early at parse time), and then accessible_p rejects the
access (so we don't exit early from enforce access either, and end up
triggering the assert b/c the member itself is dependent). I think
we're correct to reject the access because a hidden friend is not a
member function, so [class.access.nest] doesn't apply, and also a hidden
friend of a nested class is not a friend of the enclosing class.
To fix this ICE, this patch disables ahead of time access checking
during the member lookup in finish_class_member_access_expr. This
avoids potentially pushing an access check for a dependent member onto
TI_DEFERRED_ACCESS_CHECKS, and it's safe because we're going to redo the
same lookup at instantiation time anyway.
PR c++/100502
gcc/cp/ChangeLog:
* typeck.c (finish_class_member_access_expr): Disable ahead
of time access checking during the member lookup.
gcc/testsuite/ChangeLog:
* g++.dg/template/access37.C: New test.
* g++.dg/template/access37a.C: New test.
Jakub Jelinek [Fri, 28 May 2021 09:26:48 +0000 (11:26 +0200)]
openmp: Fix up handling of reduction clause on constructs combined with target [PR99928]
The reduction clause should be copied as map (tofrom: ) to combined target if
present, but as we need different handling of array sections between map and reduction,
doing that during gimplification would be harder.
So, this patch adds them during splitting, and similarly to firstprivate adds them
with a new flag that they should be just ignored/removed if an explicit map clause
of the same list item is present.
The exact rules are to be decided in https://github.com/OpenMP/spec/issues/2766
so this patch just implements something that is IMHO reasonable and exact detailed
testcases for the cornercases will follow once it is clarified.
2021-05-28 Jakub Jelinek <jakub@redhat.com>
PR middle-end/99928
gcc/
* tree.h (OMP_CLAUSE_MAP_IMPLICIT): Define.
gcc/c-family/
* c-omp.c (c_omp_split_clauses): For reduction clause if combined with
target add a map tofrom clause with OMP_CLAUSE_MAP_IMPLICIT.
gcc/c/
* c-typeck.c (handle_omp_array_sections): Copy OMP_CLAUSE_MAP_IMPLICIT.
(c_finish_omp_clauses): Move not just OMP_CLAUSE_FIRSTPRIVATE_IMPLICIT
marked clauses last, but also OMP_CLAUSE_MAP_IMPLICIT. Add
map_firstprivate_head bitmap, set it for GOMP_MAP_FIRSTPRIVATE_POINTER
maps and silently remove OMP_CLAUSE_FIRSTPRIVATE_IMPLICIT if it is
present too. For OMP_CLAUSE_MAP_IMPLICIT silently remove the clause
if present in map_head, map_field_head or map_firstprivate_head
bitmaps.
gcc/cp/
* semantics.c (handle_omp_array_sections): Copy
OMP_CLAUSE_MAP_IMPLICIT.
(finish_omp_clauses): Move not just OMP_CLAUSE_FIRSTPRIVATE_IMPLICIT
marked clauses last, but also OMP_CLAUSE_MAP_IMPLICIT. Add
map_firstprivate_head bitmap, set it for GOMP_MAP_FIRSTPRIVATE_POINTER
maps and silently remove OMP_CLAUSE_FIRSTPRIVATE_IMPLICIT if it is
present too. For OMP_CLAUSE_MAP_IMPLICIT silently remove the clause
if present in map_head, map_field_head or map_firstprivate_head
bitmaps.
gcc/testsuite/
* c-c++-common/gomp/pr99928-8.c: Remove all xfails.
* c-c++-common/gomp/pr99928-9.c: Likewise.
* c-c++-common/gomp/pr99928-10.c: Likewise.
* c-c++-common/gomp/pr99928-16.c: New test.
* testsuite/libgomp.fortran/depend-iterator-2.f90: New test.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/affinity-1.c: New test.
* c-c++-common/gomp/affinity-2.c: New test.
* c-c++-common/gomp/affinity-3.c: New test.
* c-c++-common/gomp/affinity-4.c: New test.
* c-c++-common/gomp/affinity-5.c: New test.
* c-c++-common/gomp/affinity-6.c: New test.
* c-c++-common/gomp/affinity-7.c: New test.
* gfortran.dg/gomp/affinity-clause-1.f90: New test.
* gfortran.dg/gomp/affinity-clause-2.f90: New test.
* gfortran.dg/gomp/affinity-clause-3.f90: New test.
* gfortran.dg/gomp/affinity-clause-4.f90: New test.
* gfortran.dg/gomp/affinity-clause-5.f90: New test.
* gfortran.dg/gomp/affinity-clause-6.f90: New test.
* gfortran.dg/gomp/depend-iterator-1.f90: New test.
* gfortran.dg/gomp/depend-iterator-2.f90: New test.
* gfortran.dg/gomp/depend-iterator-3.f90: New test.
* gfortran.dg/gomp/taskwait.f90: New test.
Jakub Jelinek [Fri, 28 May 2021 08:08:56 +0000 (10:08 +0200)]
libgomp: Add openacc_{cuda,cublas,cudart} effective targets and use them in openacc testsuite
When gcc is configured for nvptx offloading with --without-cuda-driver
and full CUDA isn't installed, many libgomp.oacc-*/* tests fail,
some of them because cuda.h header can't be found, others because
the tests can't be linked against -lcuda, -lcudart or -lcublas.
I usually only have akmod-nvidia and xorg-x11-drv-nvidia-cuda rpms
installed, so libcuda.so.1 can be dlopened and the offloading works,
but linking against those libraries isn't possible nor are the
headers around (for the plugin itself there is the fallback
libgomp/plugin/cuda/cuda.h).
The following patch adds 3 new effective targets and uses them in tests that
needs those.
Richard Earnshaw [Thu, 27 May 2021 09:25:37 +0000 (10:25 +0100)]
arm: Remove use of opts_set in arm_configure_build_target [PR100767]
The variable global_options_set is a reflection of which options have
been explicitly set from the command line in the structure
global_options. But it doesn't describe the contents of a
cl_target_option. cl_target_option is a set of options to apply and
once configured should represent a viable set of options without
needing to know which were explicitly set by the user.
Unfortunately arm_configure_build_target was incorrectly conflating
the two. Fortunately, however, we do not really need to know this
since the various override_options functions should have sanitized the
target_options values before constructing a cl_target_option
structure. It is safe, therefore, to simply drop this parameter to
arm_configure_build_target and rely on checking that various string
parameters are non-null before dereferencing them.
Alex Coplan [Tue, 11 May 2021 12:11:09 +0000 (13:11 +0100)]
arm: Avoid emitting bogus CFA adjusts for CMSE nonsecure calls [PR99725]
The PR shows us attaching REG_CFA_ADJUST_CFA notes to stack pointer
adjustments emitted in cmse_nonsecure_call_inline_register_clear (when
-march=armv8.1-m.main). However, the stack pointer is not guaranteed to
be the CFA reg. If we're at -O0 or we have -fno-omit-frame-pointer, then
the frame pointer will be used as the CFA reg, and these notes on the sp
adjustments will lead to ICEs in dwarf2out_frame_debug_adjust_cfa.
This patch avoids emitting these notes if the current function has a
frame pointer.
gcc/ChangeLog:
PR target/99725
* config/arm/arm.c (cmse_nonsecure_call_inline_register_clear):
Avoid emitting CFA adjusts on the sp if we have the fp.
gcc/testsuite/ChangeLog:
PR target/99725
* gcc.target/arm/cmse/pr99725.c: New test.
Jakub Jelinek [Wed, 26 May 2021 13:15:19 +0000 (15:15 +0200)]
openmp: Fix up handling of target constructs in offloaded routines [PR100573]
OpenMP Nesting of Regions restrictions say:
- If a target update, target data, target enter data, or target exit data
construct is encountered during execution of a target region, the behavior is unspecified.
- If a target construct is encountered during execution of a target region and a device
clause in which the ancestor device-modifier appears is not present on the construct, the
behavior is unspecified.
That wording is about the dynamic (runtime) behavior, not about lexical nesting,
so while it is UB if omp target * is encountered in the target region, we need to make
it compile and link (for lexical nesting of target * inside of target we actually
emit a warning).
To make this work, I had to do multiple changes.
One was to mark .omp_data_{sizes,kinds}.* variables when static as "omp declare target".
Another one was to add stub GOMP_target* entrypoints to nvptx and gcn libgomp.a.
The entrypoint functions shouldn't be called or passed in the offload regions,
otherwise
libgomp: cuLaunchKernel error: too many resources requested for launch
was reported; fixed by changing those arguments of calls to GOMP_target_ext
to NULL.
And we didn't mark the entrypoints "omp target entrypoint" when the caller
has been "omp declare target".
2021-05-26 Jakub Jelinek <jakub@redhat.com>
PR libgomp/100573
gcc/
* omp-low.c: Include omp-offload.h.
(create_omp_child_function): If current_function_decl has
"omp declare target" attribute and is_gimple_omp_offloaded,
remove that attribute from the copy of attribute list and
add "omp target entrypoint" attribute instead.
(lower_omp_target): Mark .omp_data_sizes.* and .omp_data_kinds.*
variables for offloading if in omp_maybe_offloaded_ctx.
* omp-offload.c (pass_omp_target_link::execute): Nullify second
argument to GOMP_target_data_ext in offloaded code.
libgomp/
* config/nvptx/target.c (GOMP_target_ext, GOMP_target_data_ext,
GOMP_target_end_data, GOMP_target_update_ext,
GOMP_target_enter_exit_data): New dummy entrypoints.
* config/gcn/target.c (GOMP_target_ext, GOMP_target_data_ext,
GOMP_target_end_data, GOMP_target_update_ext,
GOMP_target_enter_exit_data): Likewise.
* testsuite/libgomp.c-c++-common/for-3.c (DO_PRAGMA, OMPTEAMS,
OMPFROM, OMPTO): Define.
(main): Remove #pragma omp target teams around all the tests.
* testsuite/libgomp.c-c++-common/target-41.c: New test.
* testsuite/libgomp.c-c++-common/target-42.c: New test.
Harald Anlauf [Sun, 23 May 2021 18:51:14 +0000 (20:51 +0200)]
Fortran: fix passing return value to class(*) dummy argument
gcc/fortran/ChangeLog:
PR fortran/100551
* trans-expr.c (gfc_conv_procedure_call): Adjust check for
implicit conversion of actual argument to an unlimited polymorphic
procedure argument.
gcc/testsuite/ChangeLog:
PR fortran/100551
* gfortran.dg/pr100551.f90: New test.
Uros Bizjak [Tue, 18 May 2021 13:45:54 +0000 (15:45 +0200)]
i386: Fix split_double_mode with paradoxical subreg [PR100626]
split_double_mode calls simplify_gen_subreg, which fails for the
high half of the paradoxical subreg. Return temporary register
instead of NULL RTX in this case.
2021-05-18 Uroš Bizjak <ubizjak@gmail.com>
gcc/
PR target/100626
* config/i386/i386-expand.c (split_double_mode): Return
temporary register when simplify_gen_subreg fails with
the high half od the paradoxical subreg.