git.ipfire.org Git - thirdparty/gcc.git/log

openmp: Partial OpenMP 5.2 doacross and omp_cur_iteration support

The following patch implements part of the OpenMP 5.2 changes related
to ordered loops and with the assumed resolution of
https://github.com/OpenMP/spec/issues/3302 issues.

The changes are:
1) the depend clause on stand-alone ordered constructs has been renamed
   to doacross (because depend clause has different syntax on other
   constructs) with some syntax changes below, depend clause is deprecated
   (we'll deprecate stuff on the GCC side only when we have everything else
   from 5.2 implemented)
   depend(source) -> doacross(source:) or doacross(source:omp_cur_iteration)
   depend(sink:vec) -> doacross(sink:vec) (where vec has the same syntax
   as before)
2) in 5.1 and before it has been significant whether ordered clause has or
   doesn't have an argument, if it didn't, only block-associated ordered
   could appear in the body, if it did, only stand-alone ordered could appear
   in the body, all loops had to be perfectly nested, no associated
   range-based for loops, no linear clause on work-sharing loop and ordered
   clause with an argument wasn't allowed on composite for simd.
   In 5.2, whether ordered clause has or doesn't have an argument is
   insignificant (except for bugs in the standard, #3302 mentions those),
   if the argument is missing, it is simply treated as equal to collapse
   argument (if any, otherwise 1).  The implementation better should be able
   to differentiate between ordered and doacross loops at compile time
   which previously was through the absence or presence of the argument,
   now it is done through looking at the body of the construct lexically
   and looking for stand-alone ordered constructs.  If there are any,
   it is to be handled as doacross loop, otherwise it is ordered loop
   (but in that case ordered argument if present must be equal to collapse
   argument - 5.2 says instead it must be one, but that is clearly wrong
   and mentioned in #3302) - stand-alone ordered constructs must appear
   lexically in the body (and had to before as well).  For the restrictions
   mentioned above, the for simd restriction is gone (stand-alone ordered
   can't appear in simd construct, so that is enough), and the other rules
   are expected to be changed into something related to presence of
   stand-alone ordered constructs in the body
3) 5.2 allows a new syntax, doacross(sink:omp_cur_iteration-1), which
   means wait for previous iteration in the iteration space of all the
   associated loops

The following patch implements that, except that we sorry for now
on the doacross(sink:omp_cur_iteration-1) syntax during omp expansion
because library side isn't done yet for it.  It doesn't implement it for
the Fortran FE either.
Incrementally, I'd like to change the way we differentiate between
stand-alone and block-associated ordered constructs, because the current
way of looking for presence of doacross clause doesn't work well if those
clauses are removed because they had been invalid (wrong syntax or
unknown variables in it etc.) and of course implement
doacross(sink:omp_cur_iteration-1).

2022-09-03  Jakub Jelinek  <jakub@redhat.com>

gcc/
* tree-core.h (enum omp_clause_code): Add OMP_CLAUSE_DOACROSS.
(enum omp_clause_depend_kind): Remove OMP_CLAUSE_DEPEND_SOURCE
and OMP_CLAUSE_DEPEND_SINK, add OMP_CLAUSE_DEPEND_INVALID.
(enum omp_clause_doacross_kind): New type.
(struct tree_omp_clause): Add subcode.doacross_kind member.
* tree.h (OMP_CLAUSE_DEPEND_SINK_NEGATIVE): Remove.
(OMP_CLAUSE_DOACROSS_KIND): Define.
(OMP_CLAUSE_DOACROSS_SINK_NEGATIVE): Define.
(OMP_CLAUSE_DOACROSS_DEPEND): Define.
(OMP_CLAUSE_ORDERED_DOACROSS): Define.
* tree.cc (omp_clause_num_ops, omp_clause_code_name): Add
OMP_CLAUSE_DOACROSS entries.
* tree-nested.cc (convert_nonlocal_omp_clauses,
convert_local_omp_clauses): Handle OMP_CLAUSE_DOACROSS.
* tree-pretty-print.cc (dump_omp_clause): Don't handle
OMP_CLAUSE_DEPEND_SOURCE and OMP_CLAUSE_DEPEND_SINK.  Handle
OMP_CLAUSE_DOACROSS.
* gimplify.cc (gimplify_omp_depend): Don't handle
OMP_CLAUSE_DEPEND_SOURCE and OMP_CLAUSE_DEPEND_SINK.
(gimplify_scan_omp_clauses): Likewise.  Handle OMP_CLAUSE_DOACROSS.
(gimplify_adjust_omp_clauses): Handle OMP_CLAUSE_DOACROSS.
(find_standalone_omp_ordered): New function.
(gimplify_omp_for): When OMP_CLAUSE_ORDERED is present, search
body for OMP_ORDERED with OMP_CLAUSE_DOACROSS and if found,
set OMP_CLAUSE_ORDERED_DOACROSS.
(gimplify_omp_ordered): Don't handle OMP_CLAUSE_DEPEND_SINK or
OMP_CLAUSE_DEPEND_SOURCE, instead check OMP_CLAUSE_DOACROSS, adjust
diagnostics that presence or absence of ordered clause parameter
is irrelevant.  Handle doacross(sink:omp_cur_iteration-1).  Use
actual user name of the clause - doacross or depend - in diagnostics.
* omp-general.cc (omp_extract_for_data): Don't set fd->ordered
if !OMP_CLAUSE_ORDERED_DOACROSS (t).  If
OMP_CLAUSE_ORDERED_DOACROSS (t) but !OMP_CLAUSE_ORDERED_EXPR (t),
set fd->ordered to -1 and set it after the loop in that case to
fd->collapse.
* omp-low.cc (check_omp_nesting_restrictions): Don't handle
OMP_CLAUSE_DEPEND_SOURCE nor OMP_CLAUSE_DEPEND_SINK, instead check
OMP_CLAUSE_DOACROSS.  Use actual user name of the clause - doacross
or depend - in diagnostics.  Diagnose mixing of stand-alone and
block associated ordered constructs binding to the same loop.
(lower_omp_ordered_clauses): Don't handle OMP_CLAUSE_DEPEND_SINK,
instead handle OMP_CLAUSE_DOACROSS.
(lower_omp_ordered): Look for OMP_CLAUSE_DOACROSS instead of
OMP_CLAUSE_DEPEND.
(lower_depend_clauses): Don't handle OMP_CLAUSE_DEPEND_SOURCE and
OMP_CLAUSE_DEPEND_SINK.
* omp-expand.cc (expand_omp_ordered_sink): Emit a sorry for
doacross(sink:omp_cur_iteration-1).
(expand_omp_ordered_source_sink): Use
OMP_CLAUSE_DOACROSS_SINK_NEGATIVE instead of
OMP_CLAUSE_DEPEND_SINK_NEGATIVE.  Use actual user name of the clause
- doacross or depend - in diagnostics.
(expand_omp): Look for OMP_CLAUSE_DOACROSS clause instead of
OMP_CLAUSE_DEPEND.
(build_omp_regions_1): Likewise.
(omp_make_gimple_edges): Likewise.
* lto-streamer-out.cc (hash_tree): Handle OMP_CLAUSE_DOACROSS.
* tree-streamer-in.cc (unpack_ts_omp_clause_value_fields): Likewise.
* tree-streamer-out.cc (pack_ts_omp_clause_value_fields): Likewise.
gcc/c-family/
* c-pragma.h (enum pragma_omp_clause): Add PRAGMA_OMP_CLAUSE_DOACROSS.
* c-omp.cc (c_finish_omp_depobj): Check also for OMP_CLAUSE_DOACROSS
clause and diagnose it.  Don't handle OMP_CLAUSE_DEPEND_SOURCE and
OMP_CLAUSE_DEPEND_SINK.  Assert kind is not OMP_CLAUSE_DEPEND_INVALID.
gcc/c/
* c-parser.cc (c_parser_omp_clause_name): Handle doacross.
(c_parser_omp_clause_depend_sink): Renamed to ...
(c_parser_omp_clause_doacross_sink): ... this.  Add depend_p argument.
Handle parsing of doacross(sink:omp_cur_iteration-1).  Use
OMP_CLAUSE_DOACROSS_SINK_NEGATIVE instead of
OMP_CLAUSE_DEPEND_SINK_NEGATIVE, build OMP_CLAUSE_DOACROSS instead
of OMP_CLAUSE_DEPEND and set OMP_CLAUSE_DOACROSS_DEPEND flag on it.
(c_parser_omp_clause_depend): Use OMP_CLAUSE_DOACROSS_SINK and
OMP_CLAUSE_DOACROSS_SOURCE instead of OMP_CLAUSE_DEPEND_SINK and
OMP_CLAUSE_DEPEND_SOURCE, build OMP_CLAUSE_DOACROSS for depend(source)
and set OMP_CLAUSE_DOACROSS_DEPEND on it.
(c_parser_omp_clause_doacross): New function.
(c_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_DOACROSS.
(c_parser_omp_depobj): Use OMP_CLAUSE_DEPEND_INVALID instead of
OMP_CLAUSE_DEPEND_SOURCE.
(c_parser_omp_for_loop): Don't diagnose here linear clause together
with ordered with argument.
(c_parser_omp_simd): Don't diagnose ordered clause with argument on
for simd.
(OMP_ORDERED_DEPEND_CLAUSE_MASK): Add PRAGMA_OMP_CLAUSE_DOACROSS.
(c_parser_omp_ordered): Handle also doacross and adjust for it
diagnostic wording.
* c-typeck.cc (c_finish_omp_clauses): Handle OMP_CLAUSE_DOACROSS.
Don't handle OMP_CLAUSE_DEPEND_SOURCE and OMP_CLAUSE_DEPEND_SINK.
gcc/cp/
* parser.cc (cp_parser_omp_clause_name): Handle doacross.
(cp_parser_omp_clause_depend_sink): Renamed to ...
(cp_parser_omp_clause_doacross_sink): ... this.  Add depend_p
argument.  Handle parsing of doacross(sink:omp_cur_iteration-1).  Use
OMP_CLAUSE_DOACROSS_SINK_NEGATIVE instead of
OMP_CLAUSE_DEPEND_SINK_NEGATIVE, build OMP_CLAUSE_DOACROSS instead
of OMP_CLAUSE_DEPEND and set OMP_CLAUSE_DOACROSS_DEPEND flag on it.
(cp_parser_omp_clause_depend): Use OMP_CLAUSE_DOACROSS_SINK and
OMP_CLAUSE_DOACROSS_SOURCE instead of OMP_CLAUSE_DEPEND_SINK and
OMP_CLAUSE_DEPEND_SOURCE, build OMP_CLAUSE_DOACROSS for depend(source)
and set OMP_CLAUSE_DOACROSS_DEPEND on it.
(cp_parser_omp_clause_doacross): New function.
(cp_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_DOACROSS.
(cp_parser_omp_depobj): Use OMP_CLAUSE_DEPEND_INVALID instead of
OMP_CLAUSE_DEPEND_SOURCE.
(cp_parser_omp_for_loop): Don't diagnose here linear clause together
with ordered with argument.
(cp_parser_omp_simd): Don't diagnose ordered clause with argument on
for simd.
(OMP_ORDERED_DEPEND_CLAUSE_MASK): Add PRAGMA_OMP_CLAUSE_DOACROSS.
(cp_parser_omp_ordered): Handle also doacross and adjust for it
diagnostic wording.
* pt.cc (tsubst_omp_clause_decl): Use
OMP_CLAUSE_DOACROSS_SINK_NEGATIVE instead of
OMP_CLAUSE_DEPEND_SINK_NEGATIVE.
(tsubst_omp_clauses): Handle OMP_CLAUSE_DOACROSS.
(tsubst_expr): Use OMP_CLAUSE_DEPEND_INVALID instead of
OMP_CLAUSE_DEPEND_SOURCE.
* semantics.cc (cp_finish_omp_clause_depend_sink): Rename to ...
(cp_finish_omp_clause_doacross_sink): ... this.
(finish_omp_clauses): Handle OMP_CLAUSE_DOACROSS.  Don't handle
OMP_CLAUSE_DEPEND_SOURCE and OMP_CLAUSE_DEPEND_SINK.
gcc/fortran/
* trans-openmp.cc (gfc_trans_omp_clauses): Use
OMP_CLAUSE_DOACROSS_SINK_NEGATIVE instead of
OMP_CLAUSE_DEPEND_SINK_NEGATIVE, build OMP_CLAUSE_DOACROSS
clause instead of OMP_CLAUSE_DEPEND and set OMP_CLAUSE_DOACROSS_DEPEND
on it.
gcc/testsuite/
* c-c++-common/gomp/doacross-2.c: Adjust expected diagnostics.
* c-c++-common/gomp/doacross-5.c: New test.
* c-c++-common/gomp/doacross-6.c: New test.
* c-c++-common/gomp/nesting-2.c: Adjust expected diagnostics.
* c-c++-common/gomp/ordered-3.c: Likewise.
* c-c++-common/gomp/sink-3.c: Likewise.
* gfortran.dg/gomp/nesting-2.f90: Likewise.

(cherry picked from commit a651e6d59188da8992f8bfae2df1cb4e6316f9e6)

amdgcn: OpenMP SIMD routine support

Enable and configure SIMD clones for amdgcn. This affects both the __simd__
function attribute, and the OpenMP "declare simd" directive.

Note that the masked SIMD variants are generated, but the middle end doesn't
actually support calling them yet.

gcc/ChangeLog:

* config/gcn/gcn.cc (gcn_simd_clone_compute_vecsize_and_simdlen): New.
(gcn_simd_clone_adjust): New.
(gcn_simd_clone_usable): New.
(TARGET_SIMD_CLONE_ADJUST): New.
(TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN): New.
(TARGET_SIMD_CLONE_USABLE): New.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/vect-simd-clone-1.c: Add dg-warning.
* gcc.dg/vect/vect-simd-clone-2.c: Add dg-warning.
* gcc.dg/vect/vect-simd-clone-3.c: Add dg-warning.
* gcc.dg/vect/vect-simd-clone-4.c: Add dg-warning.
* gcc.dg/vect/vect-simd-clone-5.c: Add dg-warning.
* gcc.dg/vect/vect-simd-clone-8.c: Add dg-warning.

(cherry picked from commit b73c49f6f88dd7f7569f9a72c8ceb04598d4c15c)

omp-simd-clone: Unbreak bootstrap

This patch fixes -Werror=sign-compare errors during stage2/stage3.

2022-08-30 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>
Jakub Jelinek <jakub@redhat.com>

* omp-simd-clone.cc (simd_clone_adjust_return_type,
simd_clone_adjust_argument_types): Use known_eq (veclen, 0U)
instead of known_eq (veclen, 0) to avoid -Wsign-compare warnings.

(cherry picked from commit 437bde93dcde8309bb23ee255924c697e8e70df9)

omp-simd-clone: Allow fixed-lane vectors

The vecsize_int/vecsize_float has an assumption that all arguments will use
the same bitsize, and vary the number of lanes according to the element size,
but this is inappropriate on targets where the number of lanes is fixed and
the bitsize varies (i.e. amdgcn).

With this change the vecsize can be left zero and the vectorization factor will
be the same for all types.

gcc/ChangeLog:

* doc/tm.texi: Regenerate.
* omp-simd-clone.cc (simd_clone_adjust_return_type): Allow zero
vecsize.
(simd_clone_adjust_argument_types): Likewise.
* target.def (compute_vecsize_and_simdlen): Document the new
vecsize_int and vecsize_float semantics.

(cherry picked from commit f134a25ee8c29646f35f7e466109f6a7f5b9e824)

amdgcn: Vector procedure call ABI

Adjust the (unofficial) procedure calling ABI such that vector arguments are
passed in vector registers, not on the stack.  Scalar arguments continue to
be passed in scalar registers, making a total of 12 argument registers.

The return value is also moved to a vector register (even for scalars; it
would be possible to retain the scalar location, using untyped_call, but
there's no obvious advantage in doing so).

After this change the ABI is as follows:

s0-s13  : Reserved for kernel launch parameters.
s14-s15 : Frame pointer.
s16-s17 : Stack pointer.
s18-s19 : Link register.
s20-s21 : Exec Save.
s22-s23 : CC Save.
s24-s25 : Scalar arguments.          NO LONGER RETURN VALUE.
s26-s29 : Additional scalar arguments (makes 6 total).
s30-s31 : Static Chain.
v0      : Prologue/epilogue scratch.
v1      : Constant 0, 1, 2, 3, 4, ... 63.
v2-v7   : Prologue/epilogue scratch.
v8-v9   : Return value & vector arguments.              NEW.
v10-v13 : Additional vector arguments (makes 6 total).  NEW.

gcc/ChangeLog:

* config/gcn/gcn.cc (gcn_function_value): Allow vector return values.
(num_arg_regs): Allow vector arguments.
(gcn_function_arg): Likewise.
(gcn_function_arg_advance): Likewise.
(gcn_arg_partial_bytes): Likewise.
(gcn_return_in_memory): Likewise.
(gcn_expand_epilogue): Get return value from v8.
* config/gcn/gcn.h (RETURN_VALUE_REG): Set to v8.
(FIRST_PARM_REG): USE FIRST_SGPR_REG for clarity.
(FIRST_VPARM_REG): New.
(FUNCTION_ARG_REGNO_P): Allow vector parameters.
(struct gcn_args): Add vnum field.
(LIBCALL_VALUE): All vector return values.
* config/gcn/gcn.md (gcn_call_value): Add vector constraints.
(gcn_call_value_indirect): Likewise.

(cherry picked from commit 4e1914625dec4aa09a5671c6294e877dbf4518f5)

Revert OG12-only parts of "dwarf: Multi-register CFI address support"

"dwarf: Multi-register CFI address support." was commited to upstream (GCC12)
as r12-5833-g13b6c7639cfdca892a3f02b63596b097e1839f38

It is based on an earlier version that was committed to OG11
as ce8e18474780aaae1abbfdde0543c948ae97da42

The OG11 version contained code that was not part of the upstream version
as the approach changed; however, when OG12 was created, those were
accidentally applied to OG12.

This commit removes (reverts) the additional code, i.e. the OG12 commits:

Reverts:
commit 3405728e403ce7054f09e9a69846a57f680060db
"dwarf: Multi-register CFI address support"

gcc/ChangeLog:

* dwarf2cfi.cc (get_cfa_from_loc_descr): Support register spans
with DW_OP_piece and DW_OP_LLVM_piece_end.
* dwarf2out.cc (build_cfa_loc): Support register spans.

include/ChangeLog:

* dwarf2.def (DW_OP_LLVM_piece_end): New extension operator.

Reverts:
commit 29ba2e4eeff0381e04a37a3c471c56cd887d2035
"Fix mis-merge of 'dwarf: Multi-register CFI address support'"

gcc/
* dwarf2cfi.cc (get_cfa_from_loc_descr): Check op against DW_OP_bregx.

Merge branch 'releases/gcc-12' into devel/omp/gcc-12

Merge up to r12-8732-g63997f222380a7b718a9045effa4841c00f30f70 (31st Aug 2022)

Daily bump.

Update gcc sv.po

* sv.po: Update.

Fortran: Fix OpenMP clause name in error message

gcc/fortran/ChangeLog:

* openmp.cc (gfc_check_omp_requires): Fix clause name in error.

Co-authored-by: Chung-Lin Tang <cltang@codesourcery.com>
(cherry picked from commit 8af266501795dd76d05faef498dbd3472a01b305)

OpenMP: Support reverse offload (middle end part)

gcc/ChangeLog:

* internal-fn.cc (expand_GOMP_TARGET_REV): New.
* internal-fn.def (GOMP_TARGET_REV): New.
* lto-cgraph.cc (lto_output_node, verify_node_partition): Mark
'omp target device_ancestor_host' as in_other_partition and don't
error if absent.
* omp-low.cc (create_omp_child_function): Mark as 'noclone'.
* omp-expand.cc (expand_omp_target): For reverse offload, remove
sorry, use device = GOMP_DEVICE_HOST_FALLBACK and create
empty-body nohost function.
* omp-offload.cc (execute_omp_device_lower): Handle
IFN_GOMP_TARGET_REV.
(pass_omp_target_link::execute): For ACCEL_COMPILER, don't
nullify fn argument for reverse offload

libgomp/ChangeLog:

* libgomp.texi (OpenMP 5.0): Mark 'ancestor' as implemented but
refer to 'requires'.
* testsuite/libgomp.c-c++-common/reverse-offload-1-aux.c: New test.
* testsuite/libgomp.c-c++-common/reverse-offload-1.c: New test.
* testsuite/libgomp.fortran/reverse-offload-1-aux.f90: New test.
* testsuite/libgomp.fortran/reverse-offload-1.f90: New test.

gcc/testsuite/ChangeLog:

* c-c++-common/gomp/reverse-offload-1.c: Remove dg-sorry.
* c-c++-common/gomp/target-device-ancestor-4.c: Likewise.
* gfortran.dg/gomp/target-device-ancestor-4.f90: Likewise.
* gfortran.dg/gomp/target-device-ancestor-5.f90: Likewise.
* c-c++-common/goacc/classify-kernels-parloops.c: Add 'noclone' to
scan-tree-dump-times.
* c-c++-common/goacc/classify-kernels-unparallelized-parloops.c:
Likewise.
* c-c++-common/goacc/classify-kernels-unparallelized.c: Likewise.
* c-c++-common/goacc/classify-kernels.c: Likewise.
* c-c++-common/goacc/classify-parallel.c: Likewise.
* c-c++-common/goacc/classify-serial.c: Likewise.
* c-c++-common/goacc/kernels-counter-vars-function-scope.c: Likewise.
* c-c++-common/goacc/kernels-loop-2.c: Likewise.
* c-c++-common/goacc/kernels-loop-3.c: Likewise.
* c-c++-common/goacc/kernels-loop-data-2.c: Likewise.
* c-c++-common/goacc/kernels-loop-data-enter-exit-2.c: Likewise.
* c-c++-common/goacc/kernels-loop-data-enter-exit.c: Likewise.
* c-c++-common/goacc/kernels-loop-data-update.c: Likewise.
* c-c++-common/goacc/kernels-loop-data.c: Likewise.
* c-c++-common/goacc/kernels-loop-g.c: Likewise.
* c-c++-common/goacc/kernels-loop-mod-not-zero.c: Likewise.
* c-c++-common/goacc/kernels-loop-n.c: Likewise.
* c-c++-common/goacc/kernels-loop-nest.c: Likewise.
* c-c++-common/goacc/kernels-loop.c: Likewise.
* c-c++-common/goacc/kernels-one-counter-var.c: Likewise.
* c-c++-common/goacc/kernels-parallel-loop-data-enter-exit.c: Likewise.
* gfortran.dg/goacc/classify-kernels-parloops.f95: Likewise.
* gfortran.dg/goacc/classify-kernels-unparallelized-parloops.f95:
Likewise.
* gfortran.dg/goacc/classify-kernels-unparallelized.f95: Likewise.
* gfortran.dg/goacc/classify-kernels.f95: Likewise.
* gfortran.dg/goacc/classify-parallel.f95: Likewise.
* gfortran.dg/goacc/classify-serial.f95: Likewise.
* gfortran.dg/goacc/kernels-loop-2.f95: Likewise.
* gfortran.dg/goacc/kernels-loop-data-2.f95: Likewise.
* gfortran.dg/goacc/kernels-loop-data-enter-exit-2.f95: Likewise.
* gfortran.dg/goacc/kernels-loop-data-enter-exit.f95: Likewise.
* gfortran.dg/goacc/kernels-loop-data-update.f95: Likewise.
* gfortran.dg/goacc/kernels-loop-data.f95: Likewise.
* gfortran.dg/goacc/kernels-loop-n.f95: Likewise.
* gfortran.dg/goacc/kernels-loop.f95: Likewise.
* gfortran.dg/goacc/kernels-parallel-loop-data-enter-exit.f95: Likewise.

(cherry picked from commit d6621a2f3176dd6a593d4f5fa7f85db0234b40d2)

gfortran.dg/gomp/depend-6.f90: minor fix + dump update

Contains the minor fix of upstream commit
r13-2152-gf05e3b2c63f3307ba405900f1a80c25b2e87b0a3
to avoid setting the same array element twice, which is:
Exactly the same as previous commit for depend-4.f90, r13-2151.

Additionally, it updates the expected tree dumps due to differences
between GCC 12/OG12 and mainline (GCC13); the latter seems to
use a restricted pointer of type '&a[1]' while OG12 has pointerplus
expressions like 'a + 4'.

The changes include -m32/-m64 handling for the depobj var and in
two cases, the expected count increases from 1 to 2 for code like
'D.\[0-9\]+ = daa'.

The latter is likewise the same as done for depend-4.f90 on the OG12
branch in commit d77133b29fc51de4e0f37de5f6ad2b5496124346-

gcc/testsuite/
* gfortran.dg/gomp/depend-6.f90: Update expected tree dumps.

Fortran/OpenMP: Fix strictly structured blocks parsing

gcc/fortran/ChangeLog:

* parse.cc (parse_omp_structured_block): When parsing strictly
structured blocks, issue an error if the end-directive comes
before the 'end block'.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/strictly-structured-block-4.f90: New test.

(cherry picked from commit 33f24eb58748e9db7c827662753757c5c2217eb4)

c++: __has_builtin gives the wrong answer [PR106759]

We've supported __is_nothrow_constructible since r11-4386, but
names_builtin_p didn't know about it, so it gave the wrong answer for
#if __has_builtin(__is_nothrow_constructible)
...
#endif

I've tested all C++-only built-ins and only two were missing.

PR c++/106759

gcc/cp/ChangeLog:

* cp-objcp-common.cc (names_builtin_p): Handle RID_IS_NOTHROW_ASSIGNABLE
and RID_IS_NOTHROW_CONSTRUCTIBLE.

gcc/testsuite/ChangeLog:

* g++.dg/ext/has-builtin-1.C: New test.

(cherry picked from commit fe915f35b7d8dc768a2b977c09aa02f933e1d1e9)

sve: Fix fcmuo combine patterns [PR106524]

There's no encoding for fcmuo with zero. This restricts the combine patterns
from accepting zero registers.

gcc/ChangeLog:

PR target/106524
* config/aarch64/aarch64-sve.md (*fcmuo<mode>_nor_combine,
*fcmuo<mode>_bic_combine): Don't accept comparisons against zero.

gcc/testsuite/ChangeLog:

PR target/106524
* gcc.target/aarch64/sve/pr106524.c: New test.

(cherry picked from commit f4ff20d464f90c85919ce2e7fa63e204dcda4e40)

Daily bump.

rs6000: Allow conversions of MMA pointer types [PR106017]

GCC incorrectly disables conversions between MMA pointer types, which
are allowed with clang. The original intent was to disable conversions
between MMA types and other other types, but pointer conversions should
have been allowed. The fix is to just remove the MMA pointer conversion
handling code altogether.

gcc/
PR target/106017
* config/rs6000/rs6000.cc (rs6000_invalid_conversion): Remove handling
of MMA pointer conversions.

gcc/testsuite/
PR target/106017
* gcc.target/powerpc/pr106017.c: New test.

(cherry picked from commit 1ae1325f24cea1698b56e4299d95446a1f7b90a2)

x86: Cast stride to __PTRDIFF_TYPE__ in AMX intrinsics

On 64-bit Windows, long is 32 bits and can't be used as stride in memory
operand when base is a pointer which is 64 bits. Cast stride to
__PTRDIFF_TYPE__, instead of long.

PR target/106714
* config/i386/amxtileintrin.h (_tile_loadd_internal): Cast to
__PTRDIFF_TYPE__.
(_tile_stream_loadd_internal): Likewise.
(_tile_stored_internal): Likewise.

(cherry picked from commit aeb9b58225916bc84a0cd02c6fc77bbb92167e53)

fortran: Expand ieee_arithmetic module's ieee_value inline [PR106579]

The following patch expands IEEE_VALUE function inline in the FE,
but only for the powerpc64le-linux IEEE quad real(kind=16) case.

2022-08-26 Jakub Jelinek <jakub@redhat.com>

PR fortran/106579
* trans-intrinsic.cc: Include realmpfr.h.
(conv_intrinsic_ieee_value): New function.
(gfc_conv_ieee_arithmetic_function): Handle ieee_value.

(cherry picked from commit 0c2d6aa1be2ea85e751852834986ae52d58134d3)

fortran: Expand ieee_arithmetic module's ieee_class inline [PR106579]

The following patch expands IEEE_CLASS inline in the FE but only for the
powerpc64le-linux IEEE quad real(kind=16), using the __builtin_fpclassify
builtin and explicit check of the MSB mantissa bit in place of missing
__builtin_signbit builtin.

2022-08-26 Jakub Jelinek <jakub@redhat.com>

PR fortran/106579
gcc/fortran/
* f95-lang.cc (gfc_init_builtin_functions): Initialize
BUILT_IN_FPCLASSIFY.
* libgfortran.h (IEEE_OTHER_VALUE, IEEE_SIGNALING_NAN,
IEEE_QUIET_NAN, IEEE_NEGATIVE_INF, IEEE_NEGATIVE_NORMAL,
IEEE_NEGATIVE_DENORMAL, IEEE_NEGATIVE_SUBNORMAL,
IEEE_NEGATIVE_ZERO, IEEE_POSITIVE_ZERO, IEEE_POSITIVE_DENORMAL,
IEEE_POSITIVE_SUBNORMAL, IEEE_POSITIVE_NORMAL, IEEE_POSITIVE_INF):
New enum.
* trans-intrinsic.cc (conv_intrinsic_ieee_class): New function.
(gfc_conv_ieee_arithmetic_function): Handle ieee_class.
libgfortran/
* ieee/ieee_helper.c (IEEE_OTHER_VALUE, IEEE_SIGNALING_NAN,
IEEE_QUIET_NAN, IEEE_NEGATIVE_INF, IEEE_NEGATIVE_NORMAL,
IEEE_NEGATIVE_DENORMAL, IEEE_NEGATIVE_SUBNORMAL,
IEEE_NEGATIVE_ZERO, IEEE_POSITIVE_ZERO, IEEE_POSITIVE_DENORMAL,
IEEE_POSITIVE_SUBNORMAL, IEEE_POSITIVE_NORMAL, IEEE_POSITIVE_INF):
Move to gcc/fortran/libgfortran.h.

(cherry picked from commit db630423a97ec6690a8eb0e5c3cb186c91e3740d)

i386: Fix up mode iterators that weren't expanded [PR106721]

Currently, when md file reader sees <something> and something is valid mode
(or code) attribute but which doesn't include case for the current mode
(or code), it just keeps the <something> untouched.
I went through all cases matching <[a-zA-Z] in tmp-mddump.md after make mddump.
One of the cases was related to the V*HF mode additions and there was one typo.

2022-08-24 Jakub Jelinek <jakub@redhat.com>

PR target/106721
* config/i386/sse.md (i128vldq): Add V16HF entry.
(avx512er_vmrcp28<mode><mask_name><round_saeonly_name>): Fix typo,
mask_opernad3 -> mask_operand3.

(cherry picked from commit 846e5c009e360f0c4fe58ff0d3aee03ebe3ca1a9)

c++: Implement P2327R1 - De-deprecating volatile compound operations

From what I can see, this has been voted in as a DR and as it means
we warn less often than before in -std={gnu,c}++2{0,3} modes or with
-Wvolatile, I wonder if it shouldn't be backported to affected release
branches as well.

2022-08-16 Jakub Jelinek <jakub@redhat.com>

* typeck.cc (cp_build_modify_expr): Implement
P2327R1 - De-deprecating volatile compound operations. Don't warn
for |=, &= or ^= with volatile lhs.
* expr.cc (mark_use) <case MODIFY_EXPR>: Adjust warning wording,
leave out simple.

* g++.dg/cpp2a/volatile1.C: Adjust for de-deprecation of volatile
compound |=, &= and ^= operations.
* g++.dg/cpp2a/volatile3.C: Likewise.
* g++.dg/cpp2a/volatile5.C: Likewise.

(cherry picked from commit 6e790ca4615443fa395ac5cdba1ab6c87810985c)

ifcvt: Fix up noce_convert_multiple_sets [PR106590]

The following testcase is miscompiled on x86_64-linux.
The problem is in the noce_convert_multiple_sets optimization.
We essentially have:
if (g == 1)
  {
    g = 1;
    f = 23;
  }
else
  {
    g = 2;
    f = 20;
  }
and for each insn try to create a conditional move sequence.
There is code to detect overlap with the regs used in the condition
and the destinations, so we actually try to construct:
tmp_g = g == 1 ? 1 : 2;
f = g == 1 ? 23 : 20;
g = tmp_g;
which is fine.  But, we actually try to create two different
conditional move sequences in each case, seq1 with the whole
(eq (reg/v:HI 82 [ g ]) (const_int 1 [0x1]))
condition and seq2 with cc_cmp
(eq (reg:CCZ 17 flags) (const_int 0 [0]))
to rely on the earlier present comparison.  In each case, we
compare the rtx costs and choose the cheaper sequence (seq1 if both
have the same cost).
The problem is that with the skylake tuning,
tmp_g = g == 1 ? 1 : 2;
is actually expanded as
tmp_g = (g == 1) + 1;
in seq1 (which clobbers (reg 17 flags)) and as a cmov in seq2
(which doesn't).  The tuning says both have the same cost, so we
pick seq1.  Next we check sequences for
f = g == 1 ? 23 : 20; and here the seq2 cmov is cheaper, but it
uses (reg 17 flags) which has been clobbered earlier.

The following patch fixes that by detecting if we in the chosen
sequence clobber some register mentioned in cc_cmp or rev_cc_cmp,
and if yes, arranges for only seq1 (i.e. sequences that emit the
comparison itself) to be used after that.

2022-08-15  Jakub Jelinek  <jakub@redhat.com>

PR rtl-optimization/106590
* ifcvt.cc (check_for_cc_cmp_clobbers): New function.
(noce_convert_multiple_sets_1): If SEQ sets or clobbers any regs
mentioned in cc_cmp or rev_cc_cmp, don't consider seq2 for any
further conditional moves.

* gcc.dg/torture/pr106590.c: New test.

(cherry picked from commit 3a74a7bf62f47ed0d19866576378724be932ee17)

Daily bump.

Fortran: improve error recovery while simplifying size of bad array [PR103694]

gcc/fortran/ChangeLog:

PR fortran/103694
* simplify.cc (simplify_size): The size expression of an array cannot
be simplified if an error occurs while resolving the array spec.

gcc/testsuite/ChangeLog:

PR fortran/103694
* gfortran.dg/pr103694.f90: New test.

(cherry picked from commit 55d8c5409325001c89c35c3d04d425dec9127146)

Don't gimple fold ymm-version vblendvpd/vblendvps/vpblendvb w/o TARGET_AVX2

Since 256-bit vector integer comparison is under TARGET_AVX2,
and gimple folding for vblendvpd/vblendvps/vpblendvb relies on that.
Restrict gimple fold condition to TARGET_AVX2.

gcc/ChangeLog:

PR target/106704
* config/i386/i386-builtin.def (BDESC): Add
CODE_FOR_avx_blendvpd256/CODE_FOR_avx_blendvps256 to
corresponding builtins.
* config/i386/i386.cc (ix86_gimple_fold_builtin):
Don't fold IX86_BUILTIN_PBLENDVB256, IX86_BUILTIN_BLENDVPS256,
IX86_BUILTIN_BLENDVPD256 w/o TARGET_AVX2.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr106704.c: New test.

Daily bump.

LoongArch: Fix pr106459 by use HWIT instead of 1UL.

gcc/ChangeLog:

PR target/106459
* config/loongarch/loongarch.cc (loongarch_build_integer):
Use HOST_WIDE_INT.
* config/loongarch/loongarch.h (IMM_REACH): Likewise.
(HWIT_1U): New Defined.
(LU12I_OPERAND): Use HOST_WIDE_INT.
(LU32I_OPERAND): Likewise.
(LU52I_OPERAND): Likewise.
(HWIT_UC_0xFFF): Likwise.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/pr106459.c: New test.

(cherry picked from commit b169b67d7dafe2b786f87c31d6b2efc603fd880c)

Daily bump.

libstdc++: Fix visit<void>(v) for non-void visitors [PR106589]

The optimization for the common case of std::visit forgot to handle the
edge case of passing zero variants to a non-void visitor and converting
the result to void.

libstdc++-v3/ChangeLog:

PR libstdc++/106589
* include/std/variant (__do_visit): Handle is_void<R> for zero
argument case.
* testsuite/20_util/variant/visit_r.cc: Check std::visit<void>(v).

(cherry picked from commit e85bb1881e57e53306ede2a15f30d06480d69886)

vect: Don't allow vect_emulated_vector_p type in vectorizable_call [PR106322]

As PR106322 shows, in some cases for some vector type whose
TYPE_MODE is a scalar integral mode instead of a vector mode,
it's possible to obtain wrong target support information when
querying with the scalar integral mode.  For example, for the
test case in PR106322, on ppc64 32bit vectorizer gets vector
type "vector(2) short unsigned int" for scalar type "short
unsigned int", its mode is SImode instead of V2HImode.  The
target support querying checks umul_highpart optab with SImode
and considers it's supported, then vectorizer further generates
.MULH IFN call for that vector type.  Unfortunately it's wrong
to use SImode support for that vector type multiply highpart
here.

This patch is to teach vectorizable_call analysis not to allow
vect_emulated_vector_p type for both vectype_in and vectype_out
as Richi suggested.

PR tree-optimization/106322

gcc/ChangeLog:

* tree-vect-stmts.cc (vectorizable_call): Don't allow
vect_emulated_vector_p type for both vectype_in and vectype_out.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr106322.c: New test.
* gcc.target/powerpc/pr106322.c: New test.

(cherry picked from commit 5239e2bd48fb1e6a1d1b06a1bac49bee0a742e98)

rs6000: Adjust mov optabs for opaque modes [PR103353]

As PR103353 shows, we may want to continue to expand built-in
function __builtin_vsx_lxvp, even if we have already emitted
error messages about some missing required conditions. As
shown in that PR, without one explicit mov optab on OOmode
provided, it would call emit_move_insn recursively.

So this patch is to allow the mov pattern to be generated during
expanding phase if compiler has already seen errors.

PR target/103353

gcc/ChangeLog:

* config/rs6000/mma.md (define_expand movoo): Move TARGET_MMA condition
check to preparation statements and add handlings for !TARGET_MMA.
(define_expand movxo): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr103353.c: New test.

(cherry picked from commit 9367e3a65f874dffc8f8a3b6760e77fd9ed67117)

Daily bump.

libstdc++: Document linker option for C++23 <stacktrace> [PR105678]

libstdc++-v3/ChangeLog:

PR libstdc++/105678
* doc/xml/manual/using.xml: Document -lstdc++_libbacktrace
requirement for using std::stacktrace. Also adjust -frtti and
-fexceptions to document non-default (i.e. negative) forms.
* doc/html/*: Regenerate.

(cherry picked from commit cc4fa7a210b638d6a46f14dab17f2361389d18e1)

Update gcc .po files

* be.po, da.po, de.po, el.po, es.po, fi.po, fr.po, hr.po, id.po,
ja.po, nl.po, ru.po, sr.po, sv.po, tr.po, uk.po, vi.po, zh_CN.po,
zh_TW.po: Update.

Merge branch 'releases/gcc-12' into devel/omp/gcc-12

Merge up to r12-8705-g4218d3abfde1aa3dadfdacb55893f08489e8a064 (23th Aug 2022)

gfortran.dg/gomp/depend-4.f90: minor fix + dump update

Contains the minor fix of upstream commit
r13-2151-g6b2a584ed5bed1b851ee7b4668ac705f20cbb2c4
to avoid setting the same array element twice.

Additionally, it updates the expected tree dumps due to differences
between GCC 12/OG12 and mainline (GCC13); the latter seems to
use a restricted pointer of type '&a[1]' while OG12 has pointerplus
expressions like 'a + 4'.

The changes include -m32/-m64 handling for the depobj var and in
two cases, the expected count increases from 1 to 2 for code like
'D.\[0-9\]+ = daa'.

gcc/testsuite/

* gfortran.dg/gomp/depend-4.f90: Update expected tree dumps.

Fortran: OpenMP fix declare simd inside modules and absent linear step [PR106566]

gcc/fortran/ChangeLog:

PR fortran/106566
* openmp.cc (gfc_match_omp_clauses): Fix setting linear-step value
to 1 when not specified.
(gfc_match_omp_declare_simd): Accept module procedures.

gcc/testsuite/ChangeLog:

PR fortran/106566
* gfortran.dg/gomp/declare-simd-4.f90: New test.
* gfortran.dg/gomp/declare-simd-5.f90: New test.
* gfortran.dg/gomp/declare-simd-6.f90: New test.

(cherry picked from commit 1513512ec7d0751cba30c9c8804f2be462acfb9b)

gcn/mkoffload: Cleanup temporary dbgobj file

The file (suffix ".mkoffload.dbg.o") used to save the dbgobj data
data has to be passed to maybe_unlink for cleanup or -v -save-temps stderr
diagnostic. That was missed before.

This is a partial backport of commit r13-2125, "mkoffload: Cleanup
temporary omp_requires_file", only for GCN's mkoffload and its dbgobj
file as 'omp requires' is not supported on GCC 12 and, hence,
omp_requires_file does not exist on this branch.

gcc/ChangeLog:

* config/gcn/mkoffload.cc (main): Add dbgobj to files_to_cleanup.

(cherry picked from commit 713ec97e593bd4d9915a13bc4047f064fec0e24a)

Fortran: OpenMP fix declare simd inside modules and absent linear step [PR106566]

Partial backport from commit r13-2093, only, as 'OpenMP 5.2 linear
clause syntax' is not on GCC 12.

gcc/fortran/ChangeLog:

PR fortran/106566
* openmp.cc (gfc_match_omp_declare_simd): Accept module procedures.

gcc/testsuite/ChangeLog:

PR fortran/106566
* gfortran.dg/gomp/declare-simd-6.f90: New test.

(cherry picked from commit 1513512ec7d0751cba30c9c8804f2be462acfb9b)

OpenMP: Fix var replacement with 'simd' and linear-step vars [PR106548]

gcc/ChangeLog:

PR middle-end/106548
* omp-low.cc (lower_rec_input_clauses): Use build_outer_var_ref
for 'simd' linear-step values that are variable.

libgomp/ChangeLog:

PR middle-end/106548
* testsuite/libgomp.c/linear-2.c: New test.

(cherry picked from commit d9c9424d2c4f7b25acfc00db0076a65882c8a99f)

Daily bump.

lto-wrapper.cc: Delete offload_names temp files in case of error [PR106686]

Usually, the caller takes care of the .o files for the offload compilers
(suffix: ".target.o"). However, if an error occurs during processing
(e.g. fatal error by lto1), they were not deleted.

gcc/ChangeLog:

PR lto/106686
* lto-wrapper.cc (free_array_of_ptrs): Move before tool_cleanup.
(tool_cleanup): Unlink offload_names.
(compile_offload_image): Take filename argument to set it early.
(compile_images_for_offload_targets): Update call; set
offload_names to NULL after freeing the array.

(cherry picked from commit e228683b244c6afbe3bdfef7ba607fbda813bd0f)

Daily bump.

mkoffload: Cleanup temporary omp_requires_file

The file (suffix ".mkoffload.omp_requires") used to save the 'omp requires'
data has to be passed to maybe_unlink for cleanup or -v -save-temps stderr
diagnostic. That was missed before. - For GCN, the same has to be done for
the files with suffix ".mkoffload.dbg.o".

gcc/ChangeLog:

* config/gcn/mkoffload.cc (main): Add omp_requires_file and dbgobj to
files_to_cleanup.
* config/i386/intelmic-mkoffload.cc (prepare_target_image): Add
omp_requires_file to temp_files.
* config/nvptx/mkoffload.cc (omp_requires_file): New global static var.
(main): Remove local omp_requires_file var.
(tool_cleanup): Handle omp_requires_file.

(cherry picked from commit 713ec97e593bd4d9915a13bc4047f064fec0e24a)

Merge branch 'releases/gcc-12' into devel/omp/gcc-12

Merge up to r12-8698-g2d29d7b240d9ca87cbee5d90c846694125d293af (19th Aug 2022)
This includes the 12.2.0 release and then 12.2.1 post-release version bump

Bump BASE-VER

* BASE-VER: Set to 12.2.1.

Update ChangeLog and version files for release

Daily bump.

Regenerate gcc.pot

* gcc.pot: Regenerate.

OpenMP requires: Fix diagnostic filename corner case

The issue occurs when there is, e.g., main._omp_fn.0 in two files with
different OpenMP requires clauses. The function entries in the offload
table ends up having the same decl tree and, hence, the diagnostic showed
the same filename for both. Solution: Use the .o filename in this case.

Note that the issue does not occur with same-named 'static' functions and
without the fatal error from the requires diagnostic, there would be
later a linker error due to having two 'main'.

gcc/
* lto-cgraph.cc (input_offload_tables): Improve requires diagnostic
when filenames come out identically.

(cherry picked from commit 027b281f1e8de55d959695c7f1e80572fae6dbe7)

libgomp/splay-tree.h: Fix splay_tree_prefix handling

When splay_tree_prefix is defined, the .h file
defines splay_* macros to add the prefix. However,
before those were only unset when additionally
splay_tree_c was defined.
Additionally, for consistency undefine splay_tree_c
also when no splay_tree_prefix is defined - there
is no interdependence either.

libgomp/ChangeLog:

* splay-tree.h: Fix splay_* macro unsetting if
splay_tree_prefix is defined.

(cherry picked from commit 6b4e49fdfcc9bff5459d5a821dd7e9476c7c1c10)

OpenMP/C++: Allow classes with static members to be mappable [PR104493]

As this is the last lang-specific user of the omp_mappable_type hook,
the hook is removed, keeping only a generic omp_mappable_type for
incomplete types (or error_node).

PR c++/104493

gcc/c/ChangeLog:

* c-decl.cc (c_decl_attributes, finish_decl): Call omp_mappable_type
instead of removed langhook.
* c-typeck.cc (c_finish_omp_clauses): Likewise.

gcc/cp/ChangeLog:

* cp-objcp-common.h (LANG_HOOKS_OMP_MAPPABLE_TYPE): Remove.
* cp-tree.h (cp_omp_mappable_type, cp_omp_emit_unmappable_type_notes):
Remove.
* decl2.cc (cp_omp_mappable_type_1, cp_omp_mappable_type,
cp_omp_emit_unmappable_type_notes): Remove.
(cplus_decl_attributes): Call omp_mappable_type instead of
removed langhook.
* decl.cc (cp_finish_decl): Likewise; call cxx_incomplete_type_inform
in lieu of cp_omp_emit_unmappable_type_notes.
* semantics.cc (finish_omp_clauses): Likewise.

gcc/ChangeLog:

* gimplify.cc (omp_notice_variable): Call omp_mappable_type
instead of removed langhook.
* omp-general.h (omp_mappable_type): New prototype.
* omp-general.cc (omp_mappable_type): New; moved from ...
* langhooks.cc (lhd_omp_mappable_type): ... here.
* langhooks-def.h (lhd_omp_mappable_type,
LANG_HOOKS_OMP_MAPPABLE_TYPE): Remove.
(LANG_HOOKS_FOR_TYPES_INITIALIZER): Remote the latter.
* langhooks.h (struct lang_hooks_for_types): Remove
omp_mappable_type.

gcc/testsuite/ChangeLog:

* g++.dg/gomp/unmappable-1.C: Remove dg-error; remove dg-note no
longer shown as TYPE_MAIN_DECL is NULL.
* c-c++-common/gomp/map-incomplete-type.c: New test.

Co-authored-by: Chung-Lin Tang <cltang@codesourcery.com>
(cherry picked from commit 92a5de3df2dc958d6b3d18a0466189ad31f5ae79)

PR106342 - IBM zSystems: Provide vsel for all vector modes

dg.exp=pr104612.c fails with an ICE on s390x, because copysignv2sf3
produces an insn that vsel<mode> is supposed to recognize, but can't,
because it's not defined for V2SF. Fix by defining it for all vector
modes supported by copysign<mode>3.

gcc/ChangeLog:

* config/s390/vector.md (V_HW_FT): New iterator.
* config/s390/vx-builtins.md (vsel<mode>): Use V_HW_FT instead
of V_HW.

(cherry picked from commit 2f17f489de47d46626ed85804c3b810547ef550e)

Merge branch 'releases/gcc-12' into devel/omp/gcc-12

Merge up to r12-8692-gebf3cd1716139f210ff80ba2e2da42e666d72bec (17th Aug 2022)

Daily bump.

d: Update DIP links in gdc documentation to point at upstream repository

The wiki links probably worked at some point in the distant past, but
now the official location of tracking all D Improvement Proposals is on
the upstream dlang/DIPs GitHub repository.

PR d/106638

gcc/d/ChangeLog:

* gdc.texi: Update DIP links to point at upstream dlang/DIPs
repository.

(cherry picked from commit e56b695aa3aed3c0c80616bba569bbeb4a06b5e5)

Daily bump.

d: Defer compiling inline definitions until after the module has finished.

This is to prevent the case of when generating the methods of a struct
type, we don't accidentally emit an inline function that references it,
as the outer struct itself would still be incomplete.

gcc/d/ChangeLog:

* d-tree.h (d_defer_declaration): Declare.
* decl.cc (function_needs_inline_definition_p): Defer checking
DECL_UNINLINABLE and DECL_DECLARED_INLINE_P.
(maybe_build_decl_tree): Call d_defer_declaration instead of
build_decl_tree.
* modules.cc (deferred_inline_declarations): New variable.
(build_module_tree): Set deferred_inline_declarations and a handle
declarations pushed to it.
(d_defer_declaration): New function.

(cherry picked from commit 8db5b71e212debcc4f6a17f80191ca187c307fcb)

d: Fix internal compiler error: Segmentation fault at gimple-expr.cc:88

Because complex types are deprecated in the language, the new way to
expose native complex types is by defining an enum with a basetype of a
library-defined struct that is implicitly treated as-if it is native.
As casts are not implicitly added by the front-end when downcasting from
enum to its underlying type, we must insert an explicit cast during the
code generation pass.

PR d/106623

gcc/d/ChangeLog:

* d-codegen.cc (underlying_complex_expr): New function.
(d_build_call): Handle passing native complex objects as the
library-defined equivalent.
* d-tree.h (underlying_complex_expr): Declare.
* expr.cc (ExprVisitor::visit (DotVarExp *)): Call
underlying_complex_expr instead of build_vconvert.

gcc/testsuite/ChangeLog:

* gdc.dg/torture/pr106623.d: New test.

(cherry picked from commit e206fecaac29f559f4990312b875604eb1ce3ef3)

Daily bump.

c++: constexpr, empty base after non-empty [PR106369]

Here the CONSTRUCTOR we were providing for D{} had an entry for the B base
subobject at offset 0 following the entry for the C base, causing
output_constructor_regular_field to ICE due to going backwards. It might be
nice for that function to be more tolerant of empty fields, but it also
seems reasonable for the front end to prune the useless entry.

PR c++/106369

gcc/cp/ChangeLog:

* constexpr.cc (reduced_constant_expression_p): Return false
if a CONSTRUCTOR initializes an empty field.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/constexpr-lambda27.C: New test.

(cherry picked from commit 9efe4e153d994974afcbba09c3c683f5f4a19c63)

c++: pedwarn for empty unnamed enum in decl [PR67048]

[dcl.dcl]/5 says that

  enum { };

is ill-formed, and since r197742 we issue a pedwarn.  However, the
pedwarn also fires for

   enum { } x;

which is well-formed.  So only warn when {} is followed by a ;.  This
should be correct since you can't have "enum {}, <whatever>" -- that
produces "expected unqualified-id before ',' token".

PR c++/67048

gcc/cp/ChangeLog:

* parser.cc (cp_parser_enum_specifier): Warn about empty unnamed enum
only when it's followed by a semicolon.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/enum42.C: New test.

(cherry picked from commit fd0d3e9121c5aa65150d242676be6bbdc8d4a92a)

c: Handle initializations of opaque types [PR106016]

The initial commit that added opaque types thought that there couldn't
be any valid initializations for variables of these types, but the test
case in the bug report shows that isn't true. The solution is to handle
OPAQUE_TYPE initializations like the other scalar types.

2022-06-17 Peter Bergner <bergner@linux.ibm.com>

gcc/
PR c/106016
* expr.cc (count_type_elements): Handle OPAQUE_TYPE.

gcc/testsuite/
PR c/106016
* gcc.target/powerpc/pr106016.c: New test.

(cherry picked from commit 975658b782f36dcf6eb190966d5b705977bfd5eb)

Daily bump.

aarch64: Implement ACLE Data Intrinsics

This patch adds support for the ACLE Data Intrinsics to the AArch64 port.

gcc/ChangeLog:

2022-07-25 Andre Vieira <andre.simoesdiasvieira@arm.com>

* config/aarch64/aarch64.md (rbit<mode>2): Rename this ...
(@aarch64_rbit<mode>): ... to this and change it in...
(ffs<mode>2,ctz<mode>2): ... here.
(@aarch64_rev16<mode>): New.
* config/aarch64/aarch64-builtins.cc: (aarch64_builtins):
Define the following enum AARCH64_REV16, AARCH64_REV16L,
AARCH64_REV16LL, AARCH64_RBIT, AARCH64_RBITL, AARCH64_RBITLL.
(aarch64_init_data_intrinsics): New.
(aarch64_general_init_builtins): Add call to
aarch64_init_data_intrinsics.
(aarch64_expand_builtin_data_intrinsic): New.
(aarch64_general_expand_builtin): Add call to
aarch64_expand_builtin_data_intrinsic.
* config/aarch64/arm_acle.h (__clz, __clzl, __clzll, __cls, __clsl,
__clsll, __rbit, __rbitl, __rbitll, __rev, __revl, __revll, __rev16,
__rev16l, __rev16ll, __ror, __rorl, __rorll, __revsh): New.

gcc/testsuite/ChangeLog:

2022-07-25 Andre Vieira <andre.simoesdiasvieira@arm.com>

* gcc.target/aarch64/acle/data-intrinsics.c: New test.

(cherry picked from commit eb966d393dfdfd2c80994e4bfcc0dddf85828a73)

Daily bump.

OpenMP: Fix folding with simd's linear clause [PR106492]

gcc/ChangeLog:

PR middle-end/106492
* omp-low.cc (lower_rec_input_clauses): Add missing folding
to data type of linear-clause list item.

gcc/testsuite/ChangeLog:

PR middle-end/106492
* g++.dg/gomp/pr106492.C: New test.

(cherry picked from commit 8a16b9f983824b6b9a25275cd23b6bba8c98b800)

tree-optimization/106513 - fix mistake in bswap symbolic number shifts

This fixes a mistake in typing a local variable in the symbolic
shift routine.

PR tree-optimization/106513
* gimple-ssa-store-merging.cc (do_shift_rotate): Use uint64_t
for head_marker.

* gcc.dg/torture/pr106513.c: New testcase.

(cherry picked from commit f675afa4eeac9910a2c085a95aa04d6d9f2fd8d6)

lto/106540 - fix LTO tree input wrt dwarf2out_register_external_die

I've revisited the earlier two workarounds for dwarf2out_register_external_die
getting duplicate entries. It turns out that r11-525-g03d90a20a1afcb
added dref_queue pruning to lto_input_tree but decl reading uses that
to stream in DECL_INITIAL even when in the middle of SCC streaming.
When that SCC then gets thrown away we can end up with debug nodes
registered which isn't supposed to happen. The following adjusts
the DECL_INITIAL streaming to go the in-SCC way, using lto_input_tree_1,
since no SCCs are expected at this point, just refs.

PR lto/106540
PR lto/106334
* lto-streamer-in.cc (lto_read_tree_1): Use lto_input_tree_1
to input DECL_INITIAL, avoiding to commit drefs.

(cherry picked from commit 2a1448f2763a72c83e2ec496f78243a975b0d44e)

Daily bump.

libgccjit.h: Uncomment macro definition for testing gcc_jit_context_new_bitcast support

(cherry-picked from r13-2004-g9385cd9c74cf66)

The macro definition for LIBGCCJIT_HAVE_gcc_jit_context_new_bitcast
was earlier located in the documentation comment for
gcc_jit_context_new_bitcast, making it unavailable to code that
consumed libgccjit.h. This commit moves the definition out of the
comment, making it effective.

gcc/jit/ChangeLog:
* libgccjit.h (LIBGCCJIT_HAVE_gcc_jit_context_new_bitcast): Move
definition out of comment.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

d: Fix undefined reference to pragma(inline) symbol (PR106563)

Functions that are declared `pragma(inline)' should be treated as if
they are defined in every translation unit they are referenced from,
regardless of visibility protection. Ensure they always get
DECL_ONE_ONLY linkage, and start emitting them into other modules that
import them.

PR d/106563

gcc/d/ChangeLog:

* decl.cc (DeclVisitor::visit (FuncDeclaration *)): Set semanticRun
before generating its symbol.
(function_defined_in_root_p): New function.
(function_needs_inline_definition_p): New function.
(maybe_build_decl_tree): New function.
(get_symbol_decl): Call maybe_build_decl_tree before returning symbol.
(start_function): Use function_defined_in_root_p instead of inline
test for locally defined symbols.
(set_linkage_for_decl): Check for inline functions before private or
protected symbols.

gcc/testsuite/ChangeLog:

* gdc.dg/torture/torture.exp (srcdir): New proc.
* gdc.dg/torture/imports/pr106563math.d: New test.
* gdc.dg/torture/imports/pr106563regex.d: New test.
* gdc.dg/torture/imports/pr106563uni.d: New test.
* gdc.dg/torture/pr106563.d: New test.

(cherry picked from commit 04284176d549ff2565406406a6d53ab4ba8e507d)

Daily bump.

d: Fix ICE in in add_stack_var, at cfgexpand.cc:476

The type that triggers the ICE never got completed by the semantic
analysis pass. Checking for size forces it to be done, or issue a
compile-time error.

PR d/106555

gcc/d/ChangeLog:

* d-target.cc (Target::isReturnOnStack): Check for return type size.

gcc/testsuite/ChangeLog:

* gdc.dg/imports/pr106555.d: New test.
* gdc.dg/pr106555.d: New test.

(cherry picked from commit 4b0253b019943abf2cc5f4db0b7ed67caedffe4a)

Daily bump.

Do not enable -mblock-ops-vector-pair.

Testing has shown that using the load vector pair and store vector pair
instructions for block moves has some performance issues on power10.

A patch on June 11th modified the code so that GCC would not set
-mblock-ops-vector-pair by default if we are tuning for power10, but it would
set the option if we were tuning for a different machine and have load and store
vector pair instructions enabled.

This patch eliminates the code setting -mblock-ops-vector-pair.  If you want to
generate load vector pair and store vector pair instructions for block moves,
you must use -mblock-ops-vector-pair.

2022-08-05   Michael Meissner  <meissner@linux.ibm.com>

gcc/

* config/rs6000/rs6000.cc (rs6000_option_override_internal): Remove code
setting -mblock-ops-vector-pair.  Back port patch from trunk on 8/3.

libstdc++: Make std::string_view(Range&&) constructor explicit

The P2499R0 paper was recently approved for C++23.

libstdc++-v3/ChangeLog:

* include/std/string_view (basic_string_view(Range&&)): Add
explicit as per P2499R0.
* testsuite/21_strings/basic_string_view/cons/char/range_c++20.cc:
Adjust implicit conversions. Check implicit conversions fail.
* testsuite/21_strings/basic_string_view/cons/wchar_t/range_c++20.cc:
Likewise.

(cherry picked from commit 2678386df2cc3505da85e95643327aa928e66a8e)

libstdc++: Rename data members of std::unexpected and std::bad_expected_access

The P2549R1 paper was accepted for C++23. I already implemented it for
our <expected>, but I didn't rename the private daata members, only the
public member functions. This renames the data members for consistency
with the working draft.

libstdc++-v3/ChangeLog:

* include/std/expected (unexpected::_M_val): Rename to _M_unex.
(bad_expected_access::_M_val): Likewise.

(cherry picked from commit 07c7ee4d2d42f4728928556dbbe0700f9a13db90)

libstdc++: Update value of __cpp_lib_ios_noreplace macro

My P2467R1 proposal was accepted for C++23 so there's an official value
for this macro now.

libstdc++-v3/ChangeLog:

* include/bits/ios_base.h (__cpp_lib_ios_noreplace): Update
value to 202207L.
* include/std/version (__cpp_lib_ios_noreplace): Likewise.
* testsuite/27_io/basic_ofstream/open/char/noreplace.cc: Check
for new value.
* testsuite/27_io/basic_ofstream/open/wchar_t/noreplace.cc:
Likewise.

(cherry picked from commit 3e9bd6b2b1782891639fa5d49b7d2a933b8e85cd)

Daily bump.

libstdc++: Improve directory iterator abstractions for openat

Currently the _Dir::open_subdir function decides whether to construct a
_Dir_base with just a pathname, or a file descriptor and pathname. But
that means it is tightly coupled to the implementation of
_Dir_base::openat, which is what actually decides whether to use a file
descriptor or not. If the derived class passes a file descriptor and
filename, but the base class expects a full path and ignores the file
descriptor, then recursive_directory_iterator cannot recurse.

This change introduces a new type that provides the union of all the
information available to the derived class (the full pathname, as well
as a file descriptor for a directory and another pathname relative to
that directory). This allows the derived class to be agnostic to how the
base class will use that information.

libstdc++-v3/ChangeLog:

* src/c++17/fs_dir.cc (_Dir::dir_and_pathname):: Replace with
current() returning _At_path.
(_Dir::_Dir, _Dir::open_subdir, _Dir::do_unlink): Adjust.
* src/filesystem/dir-common.h (_Dir_base::_At_path): New class.
(_Dir_base::_Dir_Base, _Dir_base::openat): Use _At_path.
* src/filesystem/dir.cc (_Dir::dir_and_pathname): Replace with
current() returning _At_path.
(_Dir::_Dir, _Dir::open_subdir): Adjust.

(cherry picked from commit 198781144f33b0ef17dd2094580b5c77ad97d6e8)

Merge branch 'releases/gcc-12' into devel/omp/gcc-12

Merge up to r12-8653-g4e5ca7ff8c9afd3c38245aa6b939cd3ae49bf1fe (3 Aug 2022)

libstdc++: Tweak common_iterator::operator-> return type [PR104443]

This adjusts the return type to match the resolution of LWG 3672. There
is no functional difference, because decltype(auto) always deduced a
value anyway, but this makes it simpler and consistent with the working
draft.

libstdc++-v3/ChangeLog:

PR libstdc++/104443
* include/bits/stl_iterator.h (common_iterator::operator->):
Change return type to just auto.

(cherry picked from commit b5f5d1b36edbcd7d923f2e2653e54e52637c715b)

libstdc++: Check for EOF if extraction avoids buffer overflow [PR106248]

In r11-2581-g17abcc77341584 (for LWG 2499) I added overflow checks to
the pre-C++20 operator>>(istream&, char*) overload. Those checks can
cause extraction to stop after filling the buffer, where previously it
would have tried to extract another character and stopped at EOF. When
that happens we no longer set eofbit in the stream state, which is
consistent with the behaviour of the new C++20 overload, but is an
observable and unexpected change in the C++17 behaviour. What makes it
worse is that the behaviour change is dependent on optimization, because
__builtin_object_size is used to detect the buffer size and that only
works when optimizing.

To avoid the unexpected and optimization-dependent change in behaviour,
set eofbit manually if we stopped extracting because of the buffer size
check, but had reached EOF anyway. If the stream's rdstate() != goodbit
or width() is non-zero and smaller than the buffer, there's nothing to
do. Otherwise, we filled the buffer and need to check for EOF, and maybe
set eofbit.

The new check is guarded by #ifdef __OPTIMIZE__ because otherwise
__builtin_object_size is useless. There's no point compiling and
emitting dead code that can't be eliminated because we're not
optimizing.

We could add extra checks that the next character in the buffer is not
whitespace, to detect the case where we stopped early and prevented a
buffer overflow that would have happened otherwise. That would allow us
to assert or set badbit in the stream state when undefined behaviour was
prevented. However, those extra checks would increase the size of the
function, potentially reducing the likelihood of it being inlined, and
so making the buffer size detection less reliable. It seems preferable
to prevent UB and silently truncate, rather than miss the UB and allow
the overflow to happen.

libstdc++-v3/ChangeLog:

PR libstdc++/106248
* include/std/istream [C++17] (operator>>(istream&, char*)):
Set eofbit if we stopped extracting at EOF.
* testsuite/27_io/basic_istream/extractors_character/char/pr106248.cc:
New test.
* testsuite/27_io/basic_istream/extractors_character/wchar_t/pr106248.cc:
New test.

(cherry picked from commit 5ae74944af1de032d4a27fad4a2287bd3a2163fd)

libstdc++: Add nodiscard attribute to filesystem operations

Some of these are not truly "pure" because they access the file system,
e.g. exists and file_size, but they do not modify anything and are only
useful for the return value.

If you really want to use one of those functions just to check whether
an error is reported (either via an exception or an error_code&
argument) you can still do so, but you need to cast the discarded result
to void. Several tests need such a change, because they were indeed
only calling the functions to check for expected errors.

libstdc++-v3/ChangeLog:

* include/bits/fs_ops.h: Add nodiscard to all pure functions.
* include/experimental/bits/fs_ops.h: Likewise.
* testsuite/27_io/filesystem/operations/all.cc: Do not discard
results of absolute and canonical.
* testsuite/27_io/filesystem/operations/absolute.cc: Cast
discarded result to void.
* testsuite/27_io/filesystem/operations/canonical.cc: Likewise.
* testsuite/27_io/filesystem/operations/exists.cc: Likewise.
* testsuite/27_io/filesystem/operations/is_empty.cc: Likewise.
* testsuite/27_io/filesystem/operations/read_symlink.cc:
Likewise.
* testsuite/27_io/filesystem/operations/status.cc: Likewise.
* testsuite/27_io/filesystem/operations/symlink_status.cc:
Likewise.
* testsuite/27_io/filesystem/operations/temp_directory_path.cc:
Likewise.
* testsuite/experimental/filesystem/operations/canonical.cc:
Likewise.
* testsuite/experimental/filesystem/operations/exists.cc:
Likewise.
* testsuite/experimental/filesystem/operations/is_empty.cc:
Likewise.
* testsuite/experimental/filesystem/operations/read_symlink.cc:
Likewise.
* testsuite/experimental/filesystem/operations/temp_directory_path.cc:
Likewise.

(cherry picked from commit f7a148304a71f3d3ad6845b7966fdc3af88c9e45)

libstdc++: Support constexpr global std::string for size < 15 [PR105995]

I don't think this is required by the standard, but it's easy to
support.

libstdc++-v3/ChangeLog:

PR libstdc++/105995
* include/bits/basic_string.h (_M_use_local_data): Initialize
the entire SSO buffer.
* testsuite/21_strings/basic_string/cons/char/105995.cc: New test.

(cherry picked from commit 98a0d72a610a87e8e383d366e50253ddcc9a51dd)

libstdc++: Fix indentation in allocator base classes

libstdc++-v3/ChangeLog:

* include/bits/new_allocator.h: Fix indentation.
* include/ext/malloc_allocator.h: Likewise.

(cherry picked from commit 29da01709facbcc7efef4fd6767660d417f44531)

libstdc++: Check for size overflow in constexpr allocation [PR105957]

libstdc++-v3/ChangeLog:

PR libstdc++/105957
* include/bits/allocator.h (allocator::allocate): Check for
overflow in constexpr allocation.
* testsuite/20_util/allocator/105975.cc: New test.

(cherry picked from commit 0a9af7b4ef1b8aa85cc8820acf54d41d1569fc10)

libstdc++: Make std::lcm and std::gcd detect overflow [PR105844]

When I fixed PR libstdc++/92978 I introduced a regression whereby
std::lcm(INT_MIN, 1) and std::lcm(50000, 49999) would no longer produce
errors during constant evaluation. Those calls are undefined, because
they violate the preconditions that |m| and the result can be
represented in the return type (which is int in both those cases). The
regression occurred because __absu<unsigned>(INT_MIN) is well-formed,
due to the explicit casts to unsigned in that new helper function, and
the out-of-range multiplication is well-formed, because unsigned
arithmetic wraps instead of overflowing.

To fix 92978 I made std::gcm and std::lcm calculate |m| and |n|
immediately, yielding a common unsigned type that was used to calculate
the result. That was partly correct, but there's no need to use an
unsigned type. Doing so only suppresses the overflow errors so the
compiler can't detect them. This change replaces __absu with __abs_r
that returns the common type (not its corresponding unsigned type). This
way we can detect overflow in __abs_r when required, while still
supporting the most-negative value when it can be represented in the
result type. To detect LCM results that are out of range of the result
type we still need explicit checks, because neither constant evaluation
nor UBsan will complain about unsigned wrapping for cases such as
std::lcm(500000u, 499999u). We can detect those overflows efficiently by
using __builtin_mul_overflow and asserting.

libstdc++-v3/ChangeLog:

PR libstdc++/105844
* include/experimental/numeric (experimental::gcd): Simplify
assertions. Use __abs_r instead of __absu.
(experimental::lcm): Likewise. Remove use of __detail::__lcm so
overflow can be detected.
* include/std/numeric (__detail::__absu): Rename to __abs_r and
change to allow signed result type, so overflow can be detected.
(__detail::__lcm): Remove.
(gcd): Simplify assertions. Use __abs_r instead of __absu.
(lcm): Likewise. Remove use of __detail::__lcm so overflow can
be detected.
* testsuite/26_numerics/gcd/gcd_neg.cc: Adjust dg-error lines.
* testsuite/26_numerics/lcm/lcm_neg.cc: Likewise.
* testsuite/26_numerics/gcd/105844.cc: New test.
* testsuite/26_numerics/lcm/105844.cc: New test.

(cherry picked from commit 671970a5621e18e7079b4ca113e56434c858db66)