Jakub Jelinek [Mon, 5 Sep 2022 07:39:04 +0000 (09:39 +0200)]
openmp: Partial OpenMP 5.2 doacross and omp_cur_iteration support
The following patch implements part of the OpenMP 5.2 changes related
to ordered loops and with the assumed resolution of
https://github.com/OpenMP/spec/issues/3302 issues.
The changes are:
1) the depend clause on stand-alone ordered constructs has been renamed
to doacross (because depend clause has different syntax on other
constructs) with some syntax changes below, depend clause is deprecated
(we'll deprecate stuff on the GCC side only when we have everything else
from 5.2 implemented)
depend(source) -> doacross(source:) or doacross(source:omp_cur_iteration)
depend(sink:vec) -> doacross(sink:vec) (where vec has the same syntax
as before)
2) in 5.1 and before it has been significant whether ordered clause has or
doesn't have an argument, if it didn't, only block-associated ordered
could appear in the body, if it did, only stand-alone ordered could appear
in the body, all loops had to be perfectly nested, no associated
range-based for loops, no linear clause on work-sharing loop and ordered
clause with an argument wasn't allowed on composite for simd.
In 5.2, whether ordered clause has or doesn't have an argument is
insignificant (except for bugs in the standard, #3302 mentions those),
if the argument is missing, it is simply treated as equal to collapse
argument (if any, otherwise 1). The implementation better should be able
to differentiate between ordered and doacross loops at compile time
which previously was through the absence or presence of the argument,
now it is done through looking at the body of the construct lexically
and looking for stand-alone ordered constructs. If there are any,
it is to be handled as doacross loop, otherwise it is ordered loop
(but in that case ordered argument if present must be equal to collapse
argument - 5.2 says instead it must be one, but that is clearly wrong
and mentioned in #3302) - stand-alone ordered constructs must appear
lexically in the body (and had to before as well). For the restrictions
mentioned above, the for simd restriction is gone (stand-alone ordered
can't appear in simd construct, so that is enough), and the other rules
are expected to be changed into something related to presence of
stand-alone ordered constructs in the body
3) 5.2 allows a new syntax, doacross(sink:omp_cur_iteration-1), which
means wait for previous iteration in the iteration space of all the
associated loops
The following patch implements that, except that we sorry for now
on the doacross(sink:omp_cur_iteration-1) syntax during omp expansion
because library side isn't done yet for it. It doesn't implement it for
the Fortran FE either.
Incrementally, I'd like to change the way we differentiate between
stand-alone and block-associated ordered constructs, because the current
way of looking for presence of doacross clause doesn't work well if those
clauses are removed because they had been invalid (wrong syntax or
unknown variables in it etc.) and of course implement
doacross(sink:omp_cur_iteration-1).
2022-09-03 Jakub Jelinek <jakub@redhat.com>
gcc/
* tree-core.h (enum omp_clause_code): Add OMP_CLAUSE_DOACROSS.
(enum omp_clause_depend_kind): Remove OMP_CLAUSE_DEPEND_SOURCE
and OMP_CLAUSE_DEPEND_SINK, add OMP_CLAUSE_DEPEND_INVALID.
(enum omp_clause_doacross_kind): New type.
(struct tree_omp_clause): Add subcode.doacross_kind member.
* tree.h (OMP_CLAUSE_DEPEND_SINK_NEGATIVE): Remove.
(OMP_CLAUSE_DOACROSS_KIND): Define.
(OMP_CLAUSE_DOACROSS_SINK_NEGATIVE): Define.
(OMP_CLAUSE_DOACROSS_DEPEND): Define.
(OMP_CLAUSE_ORDERED_DOACROSS): Define.
* tree.cc (omp_clause_num_ops, omp_clause_code_name): Add
OMP_CLAUSE_DOACROSS entries.
* tree-nested.cc (convert_nonlocal_omp_clauses,
convert_local_omp_clauses): Handle OMP_CLAUSE_DOACROSS.
* tree-pretty-print.cc (dump_omp_clause): Don't handle
OMP_CLAUSE_DEPEND_SOURCE and OMP_CLAUSE_DEPEND_SINK. Handle
OMP_CLAUSE_DOACROSS.
* gimplify.cc (gimplify_omp_depend): Don't handle
OMP_CLAUSE_DEPEND_SOURCE and OMP_CLAUSE_DEPEND_SINK.
(gimplify_scan_omp_clauses): Likewise. Handle OMP_CLAUSE_DOACROSS.
(gimplify_adjust_omp_clauses): Handle OMP_CLAUSE_DOACROSS.
(find_standalone_omp_ordered): New function.
(gimplify_omp_for): When OMP_CLAUSE_ORDERED is present, search
body for OMP_ORDERED with OMP_CLAUSE_DOACROSS and if found,
set OMP_CLAUSE_ORDERED_DOACROSS.
(gimplify_omp_ordered): Don't handle OMP_CLAUSE_DEPEND_SINK or
OMP_CLAUSE_DEPEND_SOURCE, instead check OMP_CLAUSE_DOACROSS, adjust
diagnostics that presence or absence of ordered clause parameter
is irrelevant. Handle doacross(sink:omp_cur_iteration-1). Use
actual user name of the clause - doacross or depend - in diagnostics.
* omp-general.cc (omp_extract_for_data): Don't set fd->ordered
if !OMP_CLAUSE_ORDERED_DOACROSS (t). If
OMP_CLAUSE_ORDERED_DOACROSS (t) but !OMP_CLAUSE_ORDERED_EXPR (t),
set fd->ordered to -1 and set it after the loop in that case to
fd->collapse.
* omp-low.cc (check_omp_nesting_restrictions): Don't handle
OMP_CLAUSE_DEPEND_SOURCE nor OMP_CLAUSE_DEPEND_SINK, instead check
OMP_CLAUSE_DOACROSS. Use actual user name of the clause - doacross
or depend - in diagnostics. Diagnose mixing of stand-alone and
block associated ordered constructs binding to the same loop.
(lower_omp_ordered_clauses): Don't handle OMP_CLAUSE_DEPEND_SINK,
instead handle OMP_CLAUSE_DOACROSS.
(lower_omp_ordered): Look for OMP_CLAUSE_DOACROSS instead of
OMP_CLAUSE_DEPEND.
(lower_depend_clauses): Don't handle OMP_CLAUSE_DEPEND_SOURCE and
OMP_CLAUSE_DEPEND_SINK.
* omp-expand.cc (expand_omp_ordered_sink): Emit a sorry for
doacross(sink:omp_cur_iteration-1).
(expand_omp_ordered_source_sink): Use
OMP_CLAUSE_DOACROSS_SINK_NEGATIVE instead of
OMP_CLAUSE_DEPEND_SINK_NEGATIVE. Use actual user name of the clause
- doacross or depend - in diagnostics.
(expand_omp): Look for OMP_CLAUSE_DOACROSS clause instead of
OMP_CLAUSE_DEPEND.
(build_omp_regions_1): Likewise.
(omp_make_gimple_edges): Likewise.
* lto-streamer-out.cc (hash_tree): Handle OMP_CLAUSE_DOACROSS.
* tree-streamer-in.cc (unpack_ts_omp_clause_value_fields): Likewise.
* tree-streamer-out.cc (pack_ts_omp_clause_value_fields): Likewise.
gcc/c-family/
* c-pragma.h (enum pragma_omp_clause): Add PRAGMA_OMP_CLAUSE_DOACROSS.
* c-omp.cc (c_finish_omp_depobj): Check also for OMP_CLAUSE_DOACROSS
clause and diagnose it. Don't handle OMP_CLAUSE_DEPEND_SOURCE and
OMP_CLAUSE_DEPEND_SINK. Assert kind is not OMP_CLAUSE_DEPEND_INVALID.
gcc/c/
* c-parser.cc (c_parser_omp_clause_name): Handle doacross.
(c_parser_omp_clause_depend_sink): Renamed to ...
(c_parser_omp_clause_doacross_sink): ... this. Add depend_p argument.
Handle parsing of doacross(sink:omp_cur_iteration-1). Use
OMP_CLAUSE_DOACROSS_SINK_NEGATIVE instead of
OMP_CLAUSE_DEPEND_SINK_NEGATIVE, build OMP_CLAUSE_DOACROSS instead
of OMP_CLAUSE_DEPEND and set OMP_CLAUSE_DOACROSS_DEPEND flag on it.
(c_parser_omp_clause_depend): Use OMP_CLAUSE_DOACROSS_SINK and
OMP_CLAUSE_DOACROSS_SOURCE instead of OMP_CLAUSE_DEPEND_SINK and
OMP_CLAUSE_DEPEND_SOURCE, build OMP_CLAUSE_DOACROSS for depend(source)
and set OMP_CLAUSE_DOACROSS_DEPEND on it.
(c_parser_omp_clause_doacross): New function.
(c_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_DOACROSS.
(c_parser_omp_depobj): Use OMP_CLAUSE_DEPEND_INVALID instead of
OMP_CLAUSE_DEPEND_SOURCE.
(c_parser_omp_for_loop): Don't diagnose here linear clause together
with ordered with argument.
(c_parser_omp_simd): Don't diagnose ordered clause with argument on
for simd.
(OMP_ORDERED_DEPEND_CLAUSE_MASK): Add PRAGMA_OMP_CLAUSE_DOACROSS.
(c_parser_omp_ordered): Handle also doacross and adjust for it
diagnostic wording.
* c-typeck.cc (c_finish_omp_clauses): Handle OMP_CLAUSE_DOACROSS.
Don't handle OMP_CLAUSE_DEPEND_SOURCE and OMP_CLAUSE_DEPEND_SINK.
gcc/cp/
* parser.cc (cp_parser_omp_clause_name): Handle doacross.
(cp_parser_omp_clause_depend_sink): Renamed to ...
(cp_parser_omp_clause_doacross_sink): ... this. Add depend_p
argument. Handle parsing of doacross(sink:omp_cur_iteration-1). Use
OMP_CLAUSE_DOACROSS_SINK_NEGATIVE instead of
OMP_CLAUSE_DEPEND_SINK_NEGATIVE, build OMP_CLAUSE_DOACROSS instead
of OMP_CLAUSE_DEPEND and set OMP_CLAUSE_DOACROSS_DEPEND flag on it.
(cp_parser_omp_clause_depend): Use OMP_CLAUSE_DOACROSS_SINK and
OMP_CLAUSE_DOACROSS_SOURCE instead of OMP_CLAUSE_DEPEND_SINK and
OMP_CLAUSE_DEPEND_SOURCE, build OMP_CLAUSE_DOACROSS for depend(source)
and set OMP_CLAUSE_DOACROSS_DEPEND on it.
(cp_parser_omp_clause_doacross): New function.
(cp_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_DOACROSS.
(cp_parser_omp_depobj): Use OMP_CLAUSE_DEPEND_INVALID instead of
OMP_CLAUSE_DEPEND_SOURCE.
(cp_parser_omp_for_loop): Don't diagnose here linear clause together
with ordered with argument.
(cp_parser_omp_simd): Don't diagnose ordered clause with argument on
for simd.
(OMP_ORDERED_DEPEND_CLAUSE_MASK): Add PRAGMA_OMP_CLAUSE_DOACROSS.
(cp_parser_omp_ordered): Handle also doacross and adjust for it
diagnostic wording.
* pt.cc (tsubst_omp_clause_decl): Use
OMP_CLAUSE_DOACROSS_SINK_NEGATIVE instead of
OMP_CLAUSE_DEPEND_SINK_NEGATIVE.
(tsubst_omp_clauses): Handle OMP_CLAUSE_DOACROSS.
(tsubst_expr): Use OMP_CLAUSE_DEPEND_INVALID instead of
OMP_CLAUSE_DEPEND_SOURCE.
* semantics.cc (cp_finish_omp_clause_depend_sink): Rename to ...
(cp_finish_omp_clause_doacross_sink): ... this.
(finish_omp_clauses): Handle OMP_CLAUSE_DOACROSS. Don't handle
OMP_CLAUSE_DEPEND_SOURCE and OMP_CLAUSE_DEPEND_SINK.
gcc/fortran/
* trans-openmp.cc (gfc_trans_omp_clauses): Use
OMP_CLAUSE_DOACROSS_SINK_NEGATIVE instead of
OMP_CLAUSE_DEPEND_SINK_NEGATIVE, build OMP_CLAUSE_DOACROSS
clause instead of OMP_CLAUSE_DEPEND and set OMP_CLAUSE_DOACROSS_DEPEND
on it.
gcc/testsuite/
* c-c++-common/gomp/doacross-2.c: Adjust expected diagnostics.
* c-c++-common/gomp/doacross-5.c: New test.
* c-c++-common/gomp/doacross-6.c: New test.
* c-c++-common/gomp/nesting-2.c: Adjust expected diagnostics.
* c-c++-common/gomp/ordered-3.c: Likewise.
* c-c++-common/gomp/sink-3.c: Likewise.
* gfortran.dg/gomp/nesting-2.f90: Likewise.
Andrew Stubbs [Fri, 5 Aug 2022 12:28:50 +0000 (13:28 +0100)]
omp-simd-clone: Allow fixed-lane vectors
The vecsize_int/vecsize_float has an assumption that all arguments will use
the same bitsize, and vary the number of lanes according to the element size,
but this is inappropriate on targets where the number of lanes is fixed and
the bitsize varies (i.e. amdgcn).
With this change the vecsize can be left zero and the vectorization factor will
be the same for all types.
gcc/ChangeLog:
* doc/tm.texi: Regenerate.
* omp-simd-clone.cc (simd_clone_adjust_return_type): Allow zero
vecsize.
(simd_clone_adjust_argument_types): Likewise.
* target.def (compute_vecsize_and_simdlen): Document the new
vecsize_int and vecsize_float semantics.
Andrew Stubbs [Fri, 15 Jul 2022 08:47:36 +0000 (09:47 +0100)]
amdgcn: Vector procedure call ABI
Adjust the (unofficial) procedure calling ABI such that vector arguments are
passed in vector registers, not on the stack. Scalar arguments continue to
be passed in scalar registers, making a total of 12 argument registers.
The return value is also moved to a vector register (even for scalars; it
would be possible to retain the scalar location, using untyped_call, but
there's no obvious advantage in doing so).
The OG11 version contained code that was not part of the upstream version
as the approach changed; however, when OG12 was created, those were
accidentally applied to OG12.
This commit removes (reverts) the additional code, i.e. the OG12 commits:
* dwarf2cfi.cc (get_cfa_from_loc_descr): Support register spans
with DW_OP_piece and DW_OP_LLVM_piece_end.
* dwarf2out.cc (build_cfa_loc): Support register spans.
include/ChangeLog:
* dwarf2.def (DW_OP_LLVM_piece_end): New extension operator.
Tobias Burnus [Tue, 30 Aug 2022 19:44:10 +0000 (21:44 +0200)]
OpenMP: Support reverse offload (middle end part)
gcc/ChangeLog:
* internal-fn.cc (expand_GOMP_TARGET_REV): New.
* internal-fn.def (GOMP_TARGET_REV): New.
* lto-cgraph.cc (lto_output_node, verify_node_partition): Mark
'omp target device_ancestor_host' as in_other_partition and don't
error if absent.
* omp-low.cc (create_omp_child_function): Mark as 'noclone'.
* omp-expand.cc (expand_omp_target): For reverse offload, remove
sorry, use device = GOMP_DEVICE_HOST_FALLBACK and create
empty-body nohost function.
* omp-offload.cc (execute_omp_device_lower): Handle
IFN_GOMP_TARGET_REV.
(pass_omp_target_link::execute): For ACCEL_COMPILER, don't
nullify fn argument for reverse offload
libgomp/ChangeLog:
* libgomp.texi (OpenMP 5.0): Mark 'ancestor' as implemented but
refer to 'requires'.
* testsuite/libgomp.c-c++-common/reverse-offload-1-aux.c: New test.
* testsuite/libgomp.c-c++-common/reverse-offload-1.c: New test.
* testsuite/libgomp.fortran/reverse-offload-1-aux.f90: New test.
* testsuite/libgomp.fortran/reverse-offload-1.f90: New test.
Tobias Burnus [Tue, 30 Aug 2022 17:48:37 +0000 (19:48 +0200)]
gfortran.dg/gomp/depend-6.f90: minor fix + dump update
Contains the minor fix of upstream commit r13-2152-gf05e3b2c63f3307ba405900f1a80c25b2e87b0a3
to avoid setting the same array element twice, which is:
Exactly the same as previous commit for depend-4.f90, r13-2151.
Additionally, it updates the expected tree dumps due to differences
between GCC 12/OG12 and mainline (GCC13); the latter seems to
use a restricted pointer of type '&a[1]' while OG12 has pointerplus
expressions like 'a + 4'.
The changes include -m32/-m64 handling for the depobj var and in
two cases, the expected count increases from 1 to 2 for code like
'D.\[0-9\]+ = daa'.
Marek Polacek [Mon, 29 Aug 2022 20:54:05 +0000 (16:54 -0400)]
c++: __has_builtin gives the wrong answer [PR106759]
We've supported __is_nothrow_constructible since r11-4386, but
names_builtin_p didn't know about it, so it gave the wrong answer for
#if __has_builtin(__is_nothrow_constructible)
...
#endif
I've tested all C++-only built-ins and only two were missing.
PR c++/106759
gcc/cp/ChangeLog:
* cp-objcp-common.cc (names_builtin_p): Handle RID_IS_NOTHROW_ASSIGNABLE
and RID_IS_NOTHROW_CONSTRUCTIBLE.
Peter Bergner [Sun, 28 Aug 2022 00:44:16 +0000 (19:44 -0500)]
rs6000: Allow conversions of MMA pointer types [PR106017]
GCC incorrectly disables conversions between MMA pointer types, which
are allowed with clang. The original intent was to disable conversions
between MMA types and other other types, but pointer conversions should
have been allowed. The fix is to just remove the MMA pointer conversion
handling code altogether.
H.J. Lu [Thu, 18 Aug 2022 21:17:33 +0000 (14:17 -0700)]
x86: Cast stride to __PTRDIFF_TYPE__ in AMX intrinsics
On 64-bit Windows, long is 32 bits and can't be used as stride in memory
operand when base is a pointer which is 64 bits. Cast stride to
__PTRDIFF_TYPE__, instead of long.
The following patch expands IEEE_CLASS inline in the FE but only for the
powerpc64le-linux IEEE quad real(kind=16), using the __builtin_fpclassify
builtin and explicit check of the MSB mantissa bit in place of missing
__builtin_signbit builtin.
Jakub Jelinek [Wed, 24 Aug 2022 07:57:09 +0000 (09:57 +0200)]
i386: Fix up mode iterators that weren't expanded [PR106721]
Currently, when md file reader sees <something> and something is valid mode
(or code) attribute but which doesn't include case for the current mode
(or code), it just keeps the <something> untouched.
I went through all cases matching <[a-zA-Z] in tmp-mddump.md after make mddump.
One of the cases was related to the V*HF mode additions and there was one typo.
From what I can see, this has been voted in as a DR and as it means
we warn less often than before in -std={gnu,c}++2{0,3} modes or with
-Wvolatile, I wonder if it shouldn't be backported to affected release
branches as well.
2022-08-16 Jakub Jelinek <jakub@redhat.com>
* typeck.cc (cp_build_modify_expr): Implement
P2327R1 - De-deprecating volatile compound operations. Don't warn
for |=, &= or ^= with volatile lhs.
* expr.cc (mark_use) <case MODIFY_EXPR>: Adjust warning wording,
leave out simple.
* g++.dg/cpp2a/volatile1.C: Adjust for de-deprecation of volatile
compound |=, &= and ^= operations.
* g++.dg/cpp2a/volatile3.C: Likewise.
* g++.dg/cpp2a/volatile5.C: Likewise.
Jakub Jelinek [Mon, 15 Aug 2022 11:56:57 +0000 (13:56 +0200)]
ifcvt: Fix up noce_convert_multiple_sets [PR106590]
The following testcase is miscompiled on x86_64-linux.
The problem is in the noce_convert_multiple_sets optimization.
We essentially have:
if (g == 1)
{
g = 1;
f = 23;
}
else
{
g = 2;
f = 20;
}
and for each insn try to create a conditional move sequence.
There is code to detect overlap with the regs used in the condition
and the destinations, so we actually try to construct:
tmp_g = g == 1 ? 1 : 2;
f = g == 1 ? 23 : 20;
g = tmp_g;
which is fine. But, we actually try to create two different
conditional move sequences in each case, seq1 with the whole
(eq (reg/v:HI 82 [ g ]) (const_int 1 [0x1]))
condition and seq2 with cc_cmp
(eq (reg:CCZ 17 flags) (const_int 0 [0]))
to rely on the earlier present comparison. In each case, we
compare the rtx costs and choose the cheaper sequence (seq1 if both
have the same cost).
The problem is that with the skylake tuning,
tmp_g = g == 1 ? 1 : 2;
is actually expanded as
tmp_g = (g == 1) + 1;
in seq1 (which clobbers (reg 17 flags)) and as a cmov in seq2
(which doesn't). The tuning says both have the same cost, so we
pick seq1. Next we check sequences for
f = g == 1 ? 23 : 20; and here the seq2 cmov is cheaper, but it
uses (reg 17 flags) which has been clobbered earlier.
The following patch fixes that by detecting if we in the chosen
sequence clobber some register mentioned in cc_cmp or rev_cc_cmp,
and if yes, arranges for only seq1 (i.e. sequences that emit the
comparison itself) to be used after that.
2022-08-15 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/106590
* ifcvt.cc (check_for_cc_cmp_clobbers): New function.
(noce_convert_multiple_sets_1): If SEQ sets or clobbers any regs
mentioned in cc_cmp or rev_cc_cmp, don't consider seq2 for any
further conditional moves.
Harald Anlauf [Tue, 23 Aug 2022 20:16:14 +0000 (22:16 +0200)]
Fortran: improve error recovery while simplifying size of bad array [PR103694]
gcc/fortran/ChangeLog:
PR fortran/103694
* simplify.cc (simplify_size): The size expression of an array cannot
be simplified if an error occurs while resolving the array spec.
gcc/testsuite/ChangeLog:
PR fortran/103694
* gfortran.dg/pr103694.f90: New test.
Since 256-bit vector integer comparison is under TARGET_AVX2,
and gimple folding for vblendvpd/vblendvps/vpblendvb relies on that.
Restrict gimple fold condition to TARGET_AVX2.
Jonathan Wakely [Tue, 23 Aug 2022 14:46:16 +0000 (15:46 +0100)]
libstdc++: Fix visit<void>(v) for non-void visitors [PR106589]
The optimization for the common case of std::visit forgot to handle the
edge case of passing zero variants to a non-void visitor and converting
the result to void.
libstdc++-v3/ChangeLog:
PR libstdc++/106589
* include/std/variant (__do_visit): Handle is_void<R> for zero
argument case.
* testsuite/20_util/variant/visit_r.cc: Check std::visit<void>(v).
Kewen Lin [Tue, 16 Aug 2022 05:18:51 +0000 (00:18 -0500)]
vect: Don't allow vect_emulated_vector_p type in vectorizable_call [PR106322]
As PR106322 shows, in some cases for some vector type whose
TYPE_MODE is a scalar integral mode instead of a vector mode,
it's possible to obtain wrong target support information when
querying with the scalar integral mode. For example, for the
test case in PR106322, on ppc64 32bit vectorizer gets vector
type "vector(2) short unsigned int" for scalar type "short
unsigned int", its mode is SImode instead of V2HImode. The
target support querying checks umul_highpart optab with SImode
and considers it's supported, then vectorizer further generates
.MULH IFN call for that vector type. Unfortunately it's wrong
to use SImode support for that vector type multiply highpart
here.
This patch is to teach vectorizable_call analysis not to allow
vect_emulated_vector_p type for both vectype_in and vectype_out
as Richi suggested.
PR tree-optimization/106322
gcc/ChangeLog:
* tree-vect-stmts.cc (vectorizable_call): Don't allow
vect_emulated_vector_p type for both vectype_in and vectype_out.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr106322.c: New test.
* gcc.target/powerpc/pr106322.c: New test.
Kewen.Lin [Tue, 16 Aug 2022 05:24:07 +0000 (00:24 -0500)]
rs6000: Adjust mov optabs for opaque modes [PR103353]
As PR103353 shows, we may want to continue to expand built-in
function __builtin_vsx_lxvp, even if we have already emitted
error messages about some missing required conditions. As
shown in that PR, without one explicit mov optab on OOmode
provided, it would call emit_move_insn recursively.
So this patch is to allow the mov pattern to be generated during
expanding phase if compiler has already seen errors.
PR target/103353
gcc/ChangeLog:
* config/rs6000/mma.md (define_expand movoo): Move TARGET_MMA condition
check to preparation statements and add handlings for !TARGET_MMA.
(define_expand movxo): Likewise.
Jonathan Wakely [Mon, 22 Aug 2022 16:24:27 +0000 (17:24 +0100)]
libstdc++: Document linker option for C++23 <stacktrace> [PR105678]
libstdc++-v3/ChangeLog:
PR libstdc++/105678
* doc/xml/manual/using.xml: Document -lstdc++_libbacktrace
requirement for using std::stacktrace. Also adjust -frtti and
-fexceptions to document non-default (i.e. negative) forms.
* doc/html/*: Regenerate.
Additionally, it updates the expected tree dumps due to differences
between GCC 12/OG12 and mainline (GCC13); the latter seems to
use a restricted pointer of type '&a[1]' while OG12 has pointerplus
expressions like 'a + 4'.
The changes include -m32/-m64 handling for the depobj var and in
two cases, the expected count increases from 1 to 2 for code like
'D.\[0-9\]+ = daa'.
gcc/testsuite/
* gfortran.dg/gomp/depend-4.f90: Update expected tree dumps.
Tobias Burnus [Tue, 23 Aug 2022 11:00:38 +0000 (13:00 +0200)]
Fortran: OpenMP fix declare simd inside modules and absent linear step [PR106566]
gcc/fortran/ChangeLog:
PR fortran/106566
* openmp.cc (gfc_match_omp_clauses): Fix setting linear-step value
to 1 when not specified.
(gfc_match_omp_declare_simd): Accept module procedures.
gcc/testsuite/ChangeLog:
PR fortran/106566
* gfortran.dg/gomp/declare-simd-4.f90: New test.
* gfortran.dg/gomp/declare-simd-5.f90: New test.
* gfortran.dg/gomp/declare-simd-6.f90: New test.
Tobias Burnus [Tue, 23 Aug 2022 09:35:01 +0000 (11:35 +0200)]
gcn/mkoffload: Cleanup temporary dbgobj file
The file (suffix ".mkoffload.dbg.o") used to save the dbgobj data
data has to be passed to maybe_unlink for cleanup or -v -save-temps stderr
diagnostic. That was missed before.
This is a partial backport of commit r13-2125, "mkoffload: Cleanup
temporary omp_requires_file", only for GCN's mkoffload and its dbgobj
file as 'omp requires' is not supported on GCC 12 and, hence,
omp_requires_file does not exist on this branch.
gcc/ChangeLog:
* config/gcn/mkoffload.cc (main): Add dbgobj to files_to_cleanup.
Tobias Burnus [Mon, 22 Aug 2022 07:53:25 +0000 (09:53 +0200)]
lto-wrapper.cc: Delete offload_names temp files in case of error [PR106686]
Usually, the caller takes care of the .o files for the offload compilers
(suffix: ".target.o"). However, if an error occurs during processing
(e.g. fatal error by lto1), they were not deleted.
gcc/ChangeLog:
PR lto/106686
* lto-wrapper.cc (free_array_of_ptrs): Move before tool_cleanup.
(tool_cleanup): Unlink offload_names.
(compile_offload_image): Take filename argument to set it early.
(compile_images_for_offload_targets): Update call; set
offload_names to NULL after freeing the array.
Tobias Burnus [Fri, 19 Aug 2022 14:21:43 +0000 (16:21 +0200)]
mkoffload: Cleanup temporary omp_requires_file
The file (suffix ".mkoffload.omp_requires") used to save the 'omp requires'
data has to be passed to maybe_unlink for cleanup or -v -save-temps stderr
diagnostic. That was missed before. - For GCN, the same has to be done for
the files with suffix ".mkoffload.dbg.o".
gcc/ChangeLog:
* config/gcn/mkoffload.cc (main): Add omp_requires_file and dbgobj to
files_to_cleanup.
* config/i386/intelmic-mkoffload.cc (prepare_target_image): Add
omp_requires_file to temp_files.
* config/nvptx/mkoffload.cc (omp_requires_file): New global static var.
(main): Remove local omp_requires_file var.
(tool_cleanup): Handle omp_requires_file.
Tobias Burnus [Wed, 17 Aug 2022 13:59:06 +0000 (15:59 +0200)]
OpenMP requires: Fix diagnostic filename corner case
The issue occurs when there is, e.g., main._omp_fn.0 in two files with
different OpenMP requires clauses. The function entries in the offload
table ends up having the same decl tree and, hence, the diagnostic showed
the same filename for both. Solution: Use the .o filename in this case.
Note that the issue does not occur with same-named 'static' functions and
without the fatal error from the requires diagnostic, there would be
later a linker error due to having two 'main'.
gcc/
* lto-cgraph.cc (input_offload_tables): Improve requires diagnostic
when filenames come out identically.
When splay_tree_prefix is defined, the .h file
defines splay_* macros to add the prefix. However,
before those were only unset when additionally
splay_tree_c was defined.
Additionally, for consistency undefine splay_tree_c
also when no splay_tree_prefix is defined - there
is no interdependence either.
libgomp/ChangeLog:
* splay-tree.h: Fix splay_* macro unsetting if
splay_tree_prefix is defined.
Tobias Burnus [Wed, 17 Aug 2022 12:36:24 +0000 (14:36 +0200)]
OpenMP/C++: Allow classes with static members to be mappable [PR104493]
As this is the last lang-specific user of the omp_mappable_type hook,
the hook is removed, keeping only a generic omp_mappable_type for
incomplete types (or error_node).
* cp-objcp-common.h (LANG_HOOKS_OMP_MAPPABLE_TYPE): Remove.
* cp-tree.h (cp_omp_mappable_type, cp_omp_emit_unmappable_type_notes):
Remove.
* decl2.cc (cp_omp_mappable_type_1, cp_omp_mappable_type,
cp_omp_emit_unmappable_type_notes): Remove.
(cplus_decl_attributes): Call omp_mappable_type instead of
removed langhook.
* decl.cc (cp_finish_decl): Likewise; call cxx_incomplete_type_inform
in lieu of cp_omp_emit_unmappable_type_notes.
* semantics.cc (finish_omp_clauses): Likewise.
gcc/ChangeLog:
* gimplify.cc (omp_notice_variable): Call omp_mappable_type
instead of removed langhook.
* omp-general.h (omp_mappable_type): New prototype.
* omp-general.cc (omp_mappable_type): New; moved from ...
* langhooks.cc (lhd_omp_mappable_type): ... here.
* langhooks-def.h (lhd_omp_mappable_type,
LANG_HOOKS_OMP_MAPPABLE_TYPE): Remove.
(LANG_HOOKS_FOR_TYPES_INITIALIZER): Remote the latter.
* langhooks.h (struct lang_hooks_for_types): Remove
omp_mappable_type.
gcc/testsuite/ChangeLog:
* g++.dg/gomp/unmappable-1.C: Remove dg-error; remove dg-note no
longer shown as TYPE_MAIN_DECL is NULL.
* c-c++-common/gomp/map-incomplete-type.c: New test.
PR106342 - IBM zSystems: Provide vsel for all vector modes
dg.exp=pr104612.c fails with an ICE on s390x, because copysignv2sf3
produces an insn that vsel<mode> is supposed to recognize, but can't,
because it's not defined for V2SF. Fix by defining it for all vector
modes supported by copysign<mode>3.
gcc/ChangeLog:
* config/s390/vector.md (V_HW_FT): New iterator.
* config/s390/vx-builtins.md (vsel<mode>): Use V_HW_FT instead
of V_HW.
Iain Buclaw [Tue, 16 Aug 2022 10:22:10 +0000 (12:22 +0200)]
d: Update DIP links in gdc documentation to point at upstream repository
The wiki links probably worked at some point in the distant past, but
now the official location of tracking all D Improvement Proposals is on
the upstream dlang/DIPs GitHub repository.
PR d/106638
gcc/d/ChangeLog:
* gdc.texi: Update DIP links to point at upstream dlang/DIPs
repository.
Iain Buclaw [Mon, 15 Aug 2022 17:00:43 +0000 (19:00 +0200)]
d: Defer compiling inline definitions until after the module has finished.
This is to prevent the case of when generating the methods of a struct
type, we don't accidentally emit an inline function that references it,
as the outer struct itself would still be incomplete.
gcc/d/ChangeLog:
* d-tree.h (d_defer_declaration): Declare.
* decl.cc (function_needs_inline_definition_p): Defer checking
DECL_UNINLINABLE and DECL_DECLARED_INLINE_P.
(maybe_build_decl_tree): Call d_defer_declaration instead of
build_decl_tree.
* modules.cc (deferred_inline_declarations): New variable.
(build_module_tree): Set deferred_inline_declarations and a handle
declarations pushed to it.
(d_defer_declaration): New function.
Iain Buclaw [Mon, 15 Aug 2022 15:51:03 +0000 (17:51 +0200)]
d: Fix internal compiler error: Segmentation fault at gimple-expr.cc:88
Because complex types are deprecated in the language, the new way to
expose native complex types is by defining an enum with a basetype of a
library-defined struct that is implicitly treated as-if it is native.
As casts are not implicitly added by the front-end when downcasting from
enum to its underlying type, we must insert an explicit cast during the
code generation pass.
PR d/106623
gcc/d/ChangeLog:
* d-codegen.cc (underlying_complex_expr): New function.
(d_build_call): Handle passing native complex objects as the
library-defined equivalent.
* d-tree.h (underlying_complex_expr): Declare.
* expr.cc (ExprVisitor::visit (DotVarExp *)): Call
underlying_complex_expr instead of build_vconvert.
Jason Merrill [Tue, 26 Jul 2022 15:02:21 +0000 (11:02 -0400)]
c++: constexpr, empty base after non-empty [PR106369]
Here the CONSTRUCTOR we were providing for D{} had an entry for the B base
subobject at offset 0 following the entry for the C base, causing
output_constructor_regular_field to ICE due to going backwards. It might be
nice for that function to be more tolerant of empty fields, but it also
seems reasonable for the front end to prune the useless entry.
PR c++/106369
gcc/cp/ChangeLog:
* constexpr.cc (reduced_constant_expression_p): Return false
if a CONSTRUCTOR initializes an empty field.
Marek Polacek [Thu, 28 Apr 2022 20:50:06 +0000 (16:50 -0400)]
c++: pedwarn for empty unnamed enum in decl [PR67048]
[dcl.dcl]/5 says that
enum { };
is ill-formed, and since r197742 we issue a pedwarn. However, the
pedwarn also fires for
enum { } x;
which is well-formed. So only warn when {} is followed by a ;. This
should be correct since you can't have "enum {}, <whatever>" -- that
produces "expected unqualified-id before ',' token".
PR c++/67048
gcc/cp/ChangeLog:
* parser.cc (cp_parser_enum_specifier): Warn about empty unnamed enum
only when it's followed by a semicolon.
Peter Bergner [Sat, 18 Jun 2022 04:43:23 +0000 (23:43 -0500)]
c: Handle initializations of opaque types [PR106016]
The initial commit that added opaque types thought that there couldn't
be any valid initializations for variables of these types, but the test
case in the bug report shows that isn't true. The solution is to handle
OPAQUE_TYPE initializations like the other scalar types.
Richard Biener [Mon, 8 Aug 2022 07:07:23 +0000 (09:07 +0200)]
lto/106540 - fix LTO tree input wrt dwarf2out_register_external_die
I've revisited the earlier two workarounds for dwarf2out_register_external_die
getting duplicate entries. It turns out that r11-525-g03d90a20a1afcb
added dref_queue pruning to lto_input_tree but decl reading uses that
to stream in DECL_INITIAL even when in the middle of SCC streaming.
When that SCC then gets thrown away we can end up with debug nodes
registered which isn't supposed to happen. The following adjusts
the DECL_INITIAL streaming to go the in-SCC way, using lto_input_tree_1,
since no SCCs are expected at this point, just refs.
PR lto/106540
PR lto/106334
* lto-streamer-in.cc (lto_read_tree_1): Use lto_input_tree_1
to input DECL_INITIAL, avoiding to commit drefs.
The macro definition for LIBGCCJIT_HAVE_gcc_jit_context_new_bitcast
was earlier located in the documentation comment for
gcc_jit_context_new_bitcast, making it unavailable to code that
consumed libgccjit.h. This commit moves the definition out of the
comment, making it effective.
gcc/jit/ChangeLog:
* libgccjit.h (LIBGCCJIT_HAVE_gcc_jit_context_new_bitcast): Move
definition out of comment.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Iain Buclaw [Tue, 9 Aug 2022 10:48:14 +0000 (12:48 +0200)]
d: Fix undefined reference to pragma(inline) symbol (PR106563)
Functions that are declared `pragma(inline)' should be treated as if
they are defined in every translation unit they are referenced from,
regardless of visibility protection. Ensure they always get
DECL_ONE_ONLY linkage, and start emitting them into other modules that
import them.
PR d/106563
gcc/d/ChangeLog:
* decl.cc (DeclVisitor::visit (FuncDeclaration *)): Set semanticRun
before generating its symbol.
(function_defined_in_root_p): New function.
(function_needs_inline_definition_p): New function.
(maybe_build_decl_tree): New function.
(get_symbol_decl): Call maybe_build_decl_tree before returning symbol.
(start_function): Use function_defined_in_root_p instead of inline
test for locally defined symbols.
(set_linkage_for_decl): Check for inline functions before private or
protected symbols.
gcc/testsuite/ChangeLog:
* gdc.dg/torture/torture.exp (srcdir): New proc.
* gdc.dg/torture/imports/pr106563math.d: New test.
* gdc.dg/torture/imports/pr106563regex.d: New test.
* gdc.dg/torture/imports/pr106563uni.d: New test.
* gdc.dg/torture/pr106563.d: New test.
Iain Buclaw [Mon, 8 Aug 2022 13:17:47 +0000 (15:17 +0200)]
d: Fix ICE in in add_stack_var, at cfgexpand.cc:476
The type that triggers the ICE never got completed by the semantic
analysis pass. Checking for size forces it to be done, or issue a
compile-time error.
PR d/106555
gcc/d/ChangeLog:
* d-target.cc (Target::isReturnOnStack): Check for return type size.
gcc/testsuite/ChangeLog:
* gdc.dg/imports/pr106555.d: New test.
* gdc.dg/pr106555.d: New test.
Testing has shown that using the load vector pair and store vector pair
instructions for block moves has some performance issues on power10.
A patch on June 11th modified the code so that GCC would not set
-mblock-ops-vector-pair by default if we are tuning for power10, but it would
set the option if we were tuning for a different machine and have load and store
vector pair instructions enabled.
This patch eliminates the code setting -mblock-ops-vector-pair. If you want to
generate load vector pair and store vector pair instructions for block moves,
you must use -mblock-ops-vector-pair.
2022-08-05 Michael Meissner <meissner@linux.ibm.com>
gcc/
* config/rs6000/rs6000.cc (rs6000_option_override_internal): Remove code
setting -mblock-ops-vector-pair. Back port patch from trunk on 8/3.
Jonathan Wakely [Thu, 4 Aug 2022 09:20:18 +0000 (10:20 +0100)]
libstdc++: Rename data members of std::unexpected and std::bad_expected_access
The P2549R1 paper was accepted for C++23. I already implemented it for
our <expected>, but I didn't rename the private daata members, only the
public member functions. This renames the data members for consistency
with the working draft.
libstdc++-v3/ChangeLog:
* include/std/expected (unexpected::_M_val): Rename to _M_unex.
(bad_expected_access::_M_val): Likewise.
Jonathan Wakely [Thu, 4 Aug 2022 09:18:23 +0000 (10:18 +0100)]
libstdc++: Update value of __cpp_lib_ios_noreplace macro
My P2467R1 proposal was accepted for C++23 so there's an official value
for this macro now.
libstdc++-v3/ChangeLog:
* include/bits/ios_base.h (__cpp_lib_ios_noreplace): Update
value to 202207L.
* include/std/version (__cpp_lib_ios_noreplace): Likewise.
* testsuite/27_io/basic_ofstream/open/char/noreplace.cc: Check
for new value.
* testsuite/27_io/basic_ofstream/open/wchar_t/noreplace.cc:
Likewise.
Jonathan Wakely [Mon, 27 Jun 2022 13:43:54 +0000 (14:43 +0100)]
libstdc++: Improve directory iterator abstractions for openat
Currently the _Dir::open_subdir function decides whether to construct a
_Dir_base with just a pathname, or a file descriptor and pathname. But
that means it is tightly coupled to the implementation of
_Dir_base::openat, which is what actually decides whether to use a file
descriptor or not. If the derived class passes a file descriptor and
filename, but the base class expects a full path and ignores the file
descriptor, then recursive_directory_iterator cannot recurse.
This change introduces a new type that provides the union of all the
information available to the derived class (the full pathname, as well
as a file descriptor for a directory and another pathname relative to
that directory). This allows the derived class to be agnostic to how the
base class will use that information.
libstdc++-v3/ChangeLog:
* src/c++17/fs_dir.cc (_Dir::dir_and_pathname):: Replace with
current() returning _At_path.
(_Dir::_Dir, _Dir::open_subdir, _Dir::do_unlink): Adjust.
* src/filesystem/dir-common.h (_Dir_base::_At_path): New class.
(_Dir_base::_Dir_Base, _Dir_base::openat): Use _At_path.
* src/filesystem/dir.cc (_Dir::dir_and_pathname): Replace with
current() returning _At_path.
(_Dir::_Dir, _Dir::open_subdir): Adjust.
Jonathan Wakely [Thu, 28 Jul 2022 19:55:51 +0000 (20:55 +0100)]
libstdc++: Tweak common_iterator::operator-> return type [PR104443]
This adjusts the return type to match the resolution of LWG 3672. There
is no functional difference, because decltype(auto) always deduced a
value anyway, but this makes it simpler and consistent with the working
draft.
libstdc++-v3/ChangeLog:
PR libstdc++/104443
* include/bits/stl_iterator.h (common_iterator::operator->):
Change return type to just auto.
Jonathan Wakely [Tue, 12 Jul 2022 10:18:47 +0000 (11:18 +0100)]
libstdc++: Check for EOF if extraction avoids buffer overflow [PR106248]
In r11-2581-g17abcc77341584 (for LWG 2499) I added overflow checks to
the pre-C++20 operator>>(istream&, char*) overload. Those checks can
cause extraction to stop after filling the buffer, where previously it
would have tried to extract another character and stopped at EOF. When
that happens we no longer set eofbit in the stream state, which is
consistent with the behaviour of the new C++20 overload, but is an
observable and unexpected change in the C++17 behaviour. What makes it
worse is that the behaviour change is dependent on optimization, because
__builtin_object_size is used to detect the buffer size and that only
works when optimizing.
To avoid the unexpected and optimization-dependent change in behaviour,
set eofbit manually if we stopped extracting because of the buffer size
check, but had reached EOF anyway. If the stream's rdstate() != goodbit
or width() is non-zero and smaller than the buffer, there's nothing to
do. Otherwise, we filled the buffer and need to check for EOF, and maybe
set eofbit.
The new check is guarded by #ifdef __OPTIMIZE__ because otherwise
__builtin_object_size is useless. There's no point compiling and
emitting dead code that can't be eliminated because we're not
optimizing.
We could add extra checks that the next character in the buffer is not
whitespace, to detect the case where we stopped early and prevented a
buffer overflow that would have happened otherwise. That would allow us
to assert or set badbit in the stream state when undefined behaviour was
prevented. However, those extra checks would increase the size of the
function, potentially reducing the likelihood of it being inlined, and
so making the buffer size detection less reliable. It seems preferable
to prevent UB and silently truncate, rather than miss the UB and allow
the overflow to happen.
libstdc++-v3/ChangeLog:
PR libstdc++/106248
* include/std/istream [C++17] (operator>>(istream&, char*)):
Set eofbit if we stopped extracting at EOF.
* testsuite/27_io/basic_istream/extractors_character/char/pr106248.cc:
New test.
* testsuite/27_io/basic_istream/extractors_character/wchar_t/pr106248.cc:
New test.
Jonathan Wakely [Fri, 1 Jul 2022 10:40:29 +0000 (11:40 +0100)]
libstdc++: Add nodiscard attribute to filesystem operations
Some of these are not truly "pure" because they access the file system,
e.g. exists and file_size, but they do not modify anything and are only
useful for the return value.
If you really want to use one of those functions just to check whether
an error is reported (either via an exception or an error_code&
argument) you can still do so, but you need to cast the discarded result
to void. Several tests need such a change, because they were indeed
only calling the functions to check for expected errors.
libstdc++-v3/ChangeLog:
* include/bits/fs_ops.h: Add nodiscard to all pure functions.
* include/experimental/bits/fs_ops.h: Likewise.
* testsuite/27_io/filesystem/operations/all.cc: Do not discard
results of absolute and canonical.
* testsuite/27_io/filesystem/operations/absolute.cc: Cast
discarded result to void.
* testsuite/27_io/filesystem/operations/canonical.cc: Likewise.
* testsuite/27_io/filesystem/operations/exists.cc: Likewise.
* testsuite/27_io/filesystem/operations/is_empty.cc: Likewise.
* testsuite/27_io/filesystem/operations/read_symlink.cc:
Likewise.
* testsuite/27_io/filesystem/operations/status.cc: Likewise.
* testsuite/27_io/filesystem/operations/symlink_status.cc:
Likewise.
* testsuite/27_io/filesystem/operations/temp_directory_path.cc:
Likewise.
* testsuite/experimental/filesystem/operations/canonical.cc:
Likewise.
* testsuite/experimental/filesystem/operations/exists.cc:
Likewise.
* testsuite/experimental/filesystem/operations/is_empty.cc:
Likewise.
* testsuite/experimental/filesystem/operations/read_symlink.cc:
Likewise.
* testsuite/experimental/filesystem/operations/temp_directory_path.cc:
Likewise.
Jonathan Wakely [Tue, 14 Jun 2022 13:37:25 +0000 (14:37 +0100)]
libstdc++: Check for size overflow in constexpr allocation [PR105957]
libstdc++-v3/ChangeLog:
PR libstdc++/105957
* include/bits/allocator.h (allocator::allocate): Check for
overflow in constexpr allocation.
* testsuite/20_util/allocator/105975.cc: New test.
Jonathan Wakely [Fri, 10 Jun 2022 13:39:13 +0000 (14:39 +0100)]
libstdc++: Make std::lcm and std::gcd detect overflow [PR105844]
When I fixed PR libstdc++/92978 I introduced a regression whereby
std::lcm(INT_MIN, 1) and std::lcm(50000, 49999) would no longer produce
errors during constant evaluation. Those calls are undefined, because
they violate the preconditions that |m| and the result can be
represented in the return type (which is int in both those cases). The
regression occurred because __absu<unsigned>(INT_MIN) is well-formed,
due to the explicit casts to unsigned in that new helper function, and
the out-of-range multiplication is well-formed, because unsigned
arithmetic wraps instead of overflowing.
To fix 92978 I made std::gcm and std::lcm calculate |m| and |n|
immediately, yielding a common unsigned type that was used to calculate
the result. That was partly correct, but there's no need to use an
unsigned type. Doing so only suppresses the overflow errors so the
compiler can't detect them. This change replaces __absu with __abs_r
that returns the common type (not its corresponding unsigned type). This
way we can detect overflow in __abs_r when required, while still
supporting the most-negative value when it can be represented in the
result type. To detect LCM results that are out of range of the result
type we still need explicit checks, because neither constant evaluation
nor UBsan will complain about unsigned wrapping for cases such as
std::lcm(500000u, 499999u). We can detect those overflows efficiently by
using __builtin_mul_overflow and asserting.
libstdc++-v3/ChangeLog:
PR libstdc++/105844
* include/experimental/numeric (experimental::gcd): Simplify
assertions. Use __abs_r instead of __absu.
(experimental::lcm): Likewise. Remove use of __detail::__lcm so
overflow can be detected.
* include/std/numeric (__detail::__absu): Rename to __abs_r and
change to allow signed result type, so overflow can be detected.
(__detail::__lcm): Remove.
(gcd): Simplify assertions. Use __abs_r instead of __absu.
(lcm): Likewise. Remove use of __detail::__lcm so overflow can
be detected.
* testsuite/26_numerics/gcd/gcd_neg.cc: Adjust dg-error lines.
* testsuite/26_numerics/lcm/lcm_neg.cc: Likewise.
* testsuite/26_numerics/gcd/105844.cc: New test.
* testsuite/26_numerics/lcm/105844.cc: New test.