git.ipfire.org Git - thirdparty/gcc.git/log

openmp: Add support for declare simd and declare variant in a attribute syntax

This patch adds support for declare simd and declare variant in attribute
syntax.  Either in attribute-specifier-seq at the start of declaration, in
that case it has similar restriction to pragma-syntax, that there is a single
function declaration/definition in the declaration, rather than variable
declaration or more than one function declarations or mix of function and
variable declarations.  Or after the declarator id, in that case it applies
just to the single function declaration and the same declaration can have
multiple such attributes.  Or both.

Furthermore, cp_parser_statement has been adjusted so that it doesn't
accept [[omp::directive (parallel)]] etc. before statements that don't
take attributes at all, or where those attributes don't appertain to
the statement but something else (e.g. to label, using directive,
declaration, etc.).

2021-08-10  Jakub Jelinek  <jakub@redhat.com>

gcc/cp/
* parser.h (struct cp_omp_declare_simd_data): Remove
in_omp_attribute_pragma and clauses members, add loc and attribs.
(struct cp_oacc_routine_data): Remove loc member, add clauses
member.
* parser.c (cp_finalize_omp_declare_simd): New function.
(cp_parser_handle_statement_omp_attributes): Mention in
function comment the function is used also for
attribute-declaration.
(cp_parser_handle_directive_omp_attributes): New function.
(cp_parser_statement): Don't call
cp_parser_handle_statement_omp_attributes if statement doesn't
have attribute-specifier-seq at the beginning at all or if
if those attributes don't appertain to the statement.
(cp_parser_simple_declaration): Call
cp_parser_handle_directive_omp_attributes and
cp_finalize_omp_declare_simd.
(cp_parser_explicit_instantiation): Likewise.
(cp_parser_init_declarator): Initialize prefix_attributes
only after parsing declarators.
(cp_parser_direct_declarator): Call
cp_parser_handle_directive_omp_attributes and
cp_finalize_omp_declare_simd.
(cp_parser_member_declaration): Likewise.
(cp_parser_single_declaration): Likewise.
(cp_parser_omp_declare_simd): Don't initialize
data.in_omp_attribute_pragma, instead initialize
data.attribs[0] and data.attribs[1].
(cp_finish_omp_declare_variant): Remove
in_omp_attribute_pragma argument, instead use
parser->lexer->in_omp_attribute_pragma.
(cp_parser_late_parsing_omp_declare_simd): Adjust
cp_finish_omp_declare_variant caller.  Handle attribute-syntax
declare simd/variant.
gcc/testsuite/
* g++.dg/gomp/attrs-1.C (bar): Add missing semicolon after
[[omp::directive (threadprivate (t2))]].  Add tests with
if/while/switch after parallel in attribute syntax.
(corge): Add missing omp:: before directive.
* g++.dg/gomp/attrs-2.C (bar): Add missing semicolon after
[[omp::directive (threadprivate (t2))]].
* g++.dg/gomp/attrs-10.C: New test.
* g++.dg/gomp/attrs-11.C: New test.

(cherry picked from commit c40c6a50fd4da342c87a715ae83927a37797e094)

Merge remote-tracking branch 'origin/releases/gcc-11' into devel/omp/gcc-11

Merge up to bde6489fe0365b20f8df28157f8434ae7554ac3d (Aug 9, 2021)

openmp: Implement omp_get_device_num routine

This patch implements the omp_get_device_num library routine, specified in
OpenMP 5.0.

GOMP_DEVICE_NUM_VAR is a macro symbol which defines name of a "device number"
variable, is defined on the device-side libgomp, has it's address returned to
host-side libgomp during device initialization, and the host libgomp then
sets its value to the designated device number.

libgomp/ChangeLog:

* icv-device.c (omp_get_device_num): New API function, host side.
* fortran.c (omp_get_device_num_): New interface function.
* libgomp-plugin.h (GOMP_DEVICE_NUM_VAR): Define macro symbol.
* libgomp.map (OMP_5.0.2): New version space with omp_get_device_num,
omp_get_device_num_.
* libgomp.texi (omp_get_device_num): Add documentation for new API
function.
* omp.h.in (omp_get_device_num): Add declaration.
* omp_lib.f90.in (omp_get_device_num): Likewise.
* omp_lib.h.in (omp_get_device_num): Likewise.
* target.c (gomp_load_image_to_device): If additional entry for device
number exists at end of returned entries from 'load_image_func' hook,
copy the assigned device number over to the device variable.

* config/gcn/icv-device.c (GOMP_DEVICE_NUM_VAR): Define static global.
(omp_get_device_num): New API function, device side.
* plugin/plugin-gcn.c ("symcat.h"): Add include.
(GOMP_OFFLOAD_load_image): Add addresses of device GOMP_DEVICE_NUM_VAR
at end of returned 'target_table' entries.

* config/nvptx/icv-device.c (GOMP_DEVICE_NUM_VAR): Define static global.
(omp_get_device_num): New API function, device side.
* plugin/plugin-nvptx.c ("symcat.h"): Add include.
(GOMP_OFFLOAD_load_image): Add addresses of device GOMP_DEVICE_NUM_VAR
at end of returned 'target_table' entries.

* testsuite/lib/libgomp.exp
(check_effective_target_offload_target_intelmic): New function for
testing for intelmic offloading.
* testsuite/libgomp.c-c++-common/target-45.c: New test.
* testsuite/libgomp.fortran/target10.f90: New test.

(cherry picked from commit 0bac793ed6bad2c0c13cd1e93a1aa5808467afc8)

Daily bump.

aarch64: Add -mtune=neoverse-512tvb

This patch adds an option to tune for Neoverse cores that have
a total vector bandwidth of 512 bits (4x128 for Advanced SIMD
and a vector-length-dependent equivalent for SVE). This is intended
to be a compromise between tuning aggressively for a single core like
Neoverse V1 (which can be too narrow) and tuning for AArch64 cores
in general (which can be too wide).

-mcpu=neoverse-512tvb is equivalent to -mcpu=neoverse-v1
-mtune=neoverse-512tvb.

gcc/
* doc/invoke.texi: Document -mtune=neoverse-512tvb and
-mcpu=neoverse-512tvb.
* config/aarch64/aarch64-cores.def (neoverse-512tvb): New entry.
* config/aarch64/aarch64-tune.md: Regenerate.
* config/aarch64/aarch64.c (neoverse512tvb_sve_vector_cost)
(neoverse512tvb_sve_issue_info, neoverse512tvb_vec_issue_info)
(neoverse512tvb_vector_cost, neoverse512tvb_tunings): New structures.
(aarch64_adjust_body_cost_sve): Handle -mtune=neoverse-512tvb.
(aarch64_adjust_body_cost): Likewise.

(cherry picked from commit 048039c49b96875144f67e7789fdea54abf7710b)

aarch64: Restrict issue heuristics to inner vector loop

The AArch64 vector costs try to take issue rates into account.
However, when vectorising an outer loop, we lumped the inner
and outer operations together, which is somewhat meaningless.
This patch restricts the heuristic to the inner loop.

gcc/
* config/aarch64/aarch64.c (aarch64_add_stmt_cost): Only
record issue information for operations that occur in the
innermost loop.

(cherry picked from commit 9690309baf8294b0512b55b133bc102dc0dac5b5)

aarch64: Tweak MLA vector costs

The issue-based vector costs currently assume that a multiply-add
sequence can be implemented using a single instruction.  This is
generally true for scalars (which have a 4-operand instruction)
and SVE (which allows the output to be tied to any input).
However, for Advanced SIMD, multiplying two values and adding
an invariant will end up being a move and an MLA.

The only target to use the issue-based vector costs is Neoverse V1,
which would generally prefer SVE in this case anyway.  I therefore
don't have a self-contained testcase.  However, the distinction
becomes more important with a later patch.

gcc/
* config/aarch64/aarch64.c (aarch64_multiply_add_p): Add a vec_flags
parameter.  Detect cases in which an Advanced SIMD MLA would almost
certainly require a MOV.
(aarch64_count_ops): Update accordingly.

(cherry picked from commit 028059b46ec9aef7dd447792c579f35396751068)

aarch64: Tweak the cost of elementwise stores

When the vectoriser scalarises a strided store, it counts one
scalar_store for each element plus one vec_to_scalar extraction
for each element. However, extracting element 0 is free on AArch64,
so it should have zero cost.

I don't have a testcase that requires this for existing -mtune
options, but it becomes more important with a later patch.

gcc/
* config/aarch64/aarch64.c (aarch64_is_store_elt_extraction): New
function, split out from...
(aarch64_detect_vector_stmt_subtype): ...here.
(aarch64_add_stmt_cost): Treat extracting element 0 as free.

(cherry picked from commit 537afb0857c8f60c2b60a09fad4660420cd13e8f)

aarch64: Add gather_load_xNN_cost tuning fields

This patch adds tuning fields for the total cost of a gather load
instruction. Until now, we've costed them as one scalar load
per element instead. Those scalar_load-based values are also
what the patch uses to fill in the new fields for existing
cost structures.

gcc/
* config/aarch64/aarch64-protos.h (sve_vec_cost):
Add gather_load_x32_cost and gather_load_x64_cost.
* config/aarch64/aarch64.c (generic_sve_vector_cost)
(a64fx_sve_vector_cost, neoversev1_sve_vector_cost): Update
accordingly, using the values given by the scalar_load * number
of elements calculation that we used previously.
(aarch64_detect_vector_stmt_subtype): Use the new fields.

(cherry picked from commit 78770e0e5d9fef70679e1db4eb2fb06596fbb2f8)

aarch64: Split out aarch64_adjust_body_cost_sve

This patch splits the SVE-specific part of aarch64_adjust_body_cost
out into its own subroutine, so that a future patch can call it
more than once. I wondered about using a lambda to avoid having
to pass all the arguments, but in the end this way seemed clearer.

gcc/
* config/aarch64/aarch64.c (aarch64_adjust_body_cost_sve): New
function, split out from...
(aarch64_adjust_body_cost): ...here.

(cherry picked from commit b585f0112f293ace8fadc0c7ace59230140b7472)

aarch64: Add a simple fixed-point class for costing

This patch adds a simple fixed-point class for holding fractional
cost values.  It can exactly represent the reciprocal of any
single-vector SVE element count (including the non-power-of-2 ones).
This means that it can also hold 1/N for all N in [1, 16], which should
be enough for the various *_per_cycle fields.

For now the assumption is that the number of possible reciprocals
is fixed at compile time and so the class should always be able
to hold an exact value.

The class uses a uint64_t to hold the fixed-point value, which means
that it can hold any scaled uint32_t cost.  Normally we don't worry
about overflow when manipulating raw uint32_t costs, but just to be
on the safe side, the class uses saturating arithmetic for all
operations.

As far as the changes to the cost routines themselves go:

- The changes to aarch64_add_stmt_cost and its subroutines are
  just laying groundwork for future patches; no functional change
  intended.

- The changes to aarch64_adjust_body_cost mean that we now
  take fractional differences into account.

gcc/
* config/aarch64/fractional-cost.h: New file.
* config/aarch64/aarch64.c: Include <algorithm> (indirectly)
and cost_fraction.h.
(vec_cost_fraction): New typedef.
(aarch64_detect_scalar_stmt_subtype): Use it for statement costs.
(aarch64_detect_vector_stmt_subtype): Likewise.
(aarch64_sve_adjust_stmt_cost, aarch64_adjust_stmt_cost): Likewise.
(aarch64_estimate_min_cycles_per_iter): Use vec_cost_fraction
for cycle counts.
(aarch64_adjust_body_cost): Likewise.
(aarch64_test_cost_fraction): New function.
(aarch64_run_selftests): Call it.

(cherry picked from commit 83d796d3e58badcb864d179b882979f714ffd162)

aarch64: Turn sve_width tuning field into a bitmask

The tuning structures have an sve_width field that specifies the
number of bits in an SVE vector (or SVE_NOT_IMPLEMENTED if not
applicable).  This patch turns the field into a bitmask so that
it can specify multiple widths at the same time.  For now we
always treat the mininum width as the likely width.

An alternative would have been to add extra fields, which would
have coped correctly with non-power-of-2 widths.  However,
we're very far from supporting constant non-power-of-2 vectors
in GCC, so I think the non-power-of-2 case will in reality always
have to be hidden behind VLA.

gcc/
* config/aarch64/aarch64-protos.h (tune_params::sve_width): Turn
into a bitmask.
* config/aarch64/aarch64.c (aarch64_cmp_autovec_modes): Update
accordingly.
(aarch64_estimated_poly_value): Likewise.  Use the least significant
set bit for the minimum and likely values.  Use the most significant
set bit for the maximum value.

(cherry picked from commit fa3ca6151ccac0e215727641eee36abf6e437b26)

tree-optimization/101505 - properly determine stmt precision for PHIs

Loop vectorization pattern recog fails to walk PHIs when determining
stmt precisions. This fails to recognize non-mask uses for bools
in PHIs and outer loop vectorization.

2021-07-19 Richard Biener <rguenther@suse.de>

PR tree-optimization/101505
* tree-vect-patterns.c (vect_determine_precisions): Walk
PHIs also for loop vectorization.

* gcc.dg/vect/pr101505.c: New testcase.

(cherry picked from commit 8df3ee8f7d85d0708f3c3ca96b55c9230c2ae9f0)

c/101512 - fix missing address-taking in c_common_mark_addressable_vec

c_common_mark_addressable_vec fails to look through C_MAYBE_CONST_EXPR
in the case it isn't at the toplevel.

2021-07-21 Richard Biener <rguenther@suse.de>

PR c/101512
gcc/c-family/
* c-common.c (c_common_mark_addressable_vec): Look through
C_MAYBE_CONST_EXPR even if not at the toplevel.

gcc/testsuite/
* gcc.dg/torture/pr101512.c: New testcase.

(cherry picked from commit e63d76234d18cac731c4f3610d513bd8b39b5520)

Daily bump.

sanitizer: cherry pick 414482751452e54710f16bae58458c66298aaf69

The patch is needed in order to support recent glibc (2.34).

libsanitizer/ChangeLog:

PR sanitizer/101749
* sanitizer_common/sanitizer_posix_libcdep.cpp: Prevent
generation of dependency on _cxa_guard for static
initialization.

Daily bump.

libgomp amdgcn: Fix issues with dynamic OpenMP thread scaling

libgomp/ChangeLog:

* config/gcn/bar.h (gomp_barrier_init): Limit thread count to the
actual physical number.
* config/gcn/team.c (gomp_team_start): Don't attempt to set up
threads that do not exist.

Daily bump.

Fix execution failure of parity_1.f90 on P10 [PR100952]

gcc/
PR target/100952
* config/rs6000/rs6000.md (cstore<mode>4): Fix wrong fall through.

(cherry picked from commit 3382846558e02044598556e66e5ea1cb3115429d)

openmp: Handle OpenMP directives in attribute syntax in attribute-declaration

Now that we parse attribute-declaration (outside of functions), the following
patch handles OpenMP directives in its attribute(s).
What needs handling incrementally is diagnose mismatching begin/end pair
like
[[omp::directive (declare target)]];
int a;
#pragma omp end declare target
or
#pragma omp declare target
int b;
[[omp::directive (end declare target)]];
and handling declare simd/declare variant on declarations (function
definitions and declarations), for those in two different spots.

2021-07-31 Jakub Jelinek <jakub@redhat.com>

* parser.c (cp_parser_declaration): Handle OpenMP directives
in attribute-declaration.

* g++.dg/gomp/attrs-9.C: New test.

(cherry picked from commit 05bcef5a88b34dd13179cabbe902e9135cb40ffe)

Update gcc fr.po.

* fr.po: Update.

mips: Fix up mips_atomic_assign_expand_fenv [PR94780]

Commit message shamelessly copied from 1777beb6b129 by jakub:

This function, because it is sometimes called even outside of function
bodies, uses create_tmp_var_raw rather than create_tmp_var. But in order
for that to work, when first referenced, the VAR_DECLs need to appear in a
TARGET_EXPR so that during gimplification the var gets the right
DECL_CONTEXT and is added to local decls.

gcc/

PR target/94780
* config/mips/mips.c (mips_atomic_assign_expand_fenv): Use
TARGET_EXPR instead of MODIFY_EXPR.

(cherry picked from commit 2065654435e3d97676366f82b939bc9273382dbe)

mips: add MSA vec_cmp and vec_cmpu expand pattern [PR101132]

Middle-end started to emit vec_cmp and vec_cmpu since GCC 11, causing
ICE on MIPS with MSA enabled.  Add the pattern to prevent it.

gcc/

PR target/101132
* config/mips/mips-protos.h (mips_expand_vec_cmp_expr): Declare.
* config/mips/mips.c (mips_expand_vec_cmp_expr): New function.
* config/mips/mips-msa.md (vec_cmp<MSA:mode><mode_i>): New
  expander.
  (vec_cmpu<IMSA:mode><mode_i>): New expander.

gcc/testsuite/

PR target/101132
* gcc.target/mips/pr101132.c: New test.

(cherry picked from commit 45cb789e6adf5d571c574a94b77413c845fed106)

c++: Fix up attribute rollbacks in cp_parser_statement

During the OpenMP directives using C++ attribute syntax work, I've noticed
that cp_parser_statement when parsing various block declarations that do
not allow attribute-specifier-seq at the start rolls back the attributes
only if std_attrs is non-NULL (i.e. some attributes have been parsed),
but doesn't roll back if some tokens were parsed as attribute-specifier-seq,
but didn't yield any attributes (e.g. [[]][[]][[]][[]]), which means
we accept those empty attributes even in places where they don't appear
in the grammar.

The following patch fixes that by instead checking if there are any
tokens to roll back.  This makes the parsing handle the first
function the same as the second one (where some attribute appears).

The testcase contains two xfails, using namespace ... apparently
allows attributes at the start and the attributes shall appeartain to
using in that case.  To be fixed incrementally.

2021-07-30  Jakub Jelinek  <jakub@redhat.com>

* parser.c (cp_parser_statement): Rollback attributes not just
when std_attrs is non-NULL, but whenever
cp_parser_std_attribute_spec_seq parsed any tokens.

* g++.dg/cpp0x/gen-attrs-76.C: New test.

(cherry picked from commit 0ba2003cf306aa98b6ec91c9d849ab9bafcf17c2)

Update gcc de.po.

* de.po: Update.

[libgomp] Restore offloading 'libgomp/fortran.c'

GCN:

    ld: error: undefined symbol: gomp_ialias_omp_display_env
    >>> referenced by fortran.c:744 ([...]/source-gcc/libgomp/fortran.c:744)
    >>>               fortran.o:(omp_display_env_) in archive [...]/build-gcc-offload-amdgcn-amdhsa/amdgcn-amdhsa/libgomp/.libs/libgomp.a
    >>> referenced by fortran.c:744 ([...]/source-gcc/libgomp/fortran.c:744)
    >>>               fortran.o:(omp_display_env_) in archive [...]/build-gcc-offload-amdgcn-amdhsa/amdgcn-amdhsa/libgomp/.libs/libgomp.a
    >>> referenced by fortran.c:750 ([...]/source-gcc/libgomp/fortran.c:750)
    >>>               fortran.o:(omp_display_env_8_) in archive [...]/build-gcc-offload-amdgcn-amdhsa/amdgcn-amdhsa/libgomp/.libs/libgomp.a
    >>> referenced by fortran.c:750 ([...]/source-gcc/libgomp/fortran.c:750)
    >>>               fortran.o:(omp_display_env_8_) in archive [...]/build-gcc-offload-amdgcn-amdhsa/amdgcn-amdhsa/libgomp/.libs/libgomp.a
    collect2: error: ld returned 1 exit status
    mkoffload: fatal error: build-gcc/gcc/x86_64-pc-linux-gnu-accel-amdgcn-amdhsa-gcc returned 1 exit status

nvptx:

    unresolved symbol omp_display_env
    collect2: error: ld returned 1 exit status
    mkoffload: fatal error: [...]/build-gcc/./gcc/x86_64-pc-linux-gnu-accel-nvptx-none-gcc returned 1 exit status

Fix-up for commit 7123ae2455b5a1a2f19f13fa82c377cfda157f23
"Implement OpenMP 5.1 section 3.15: omp_display_env".

libgomp/
* fortran.c (omp_display_env_, omp_display_env_8_): Only
'#ifndef LIBGOMP_OFFLOADED_ONLY'.

Co-Authored-By: Ulrich Drepper <drepper@redhat.com>
(cherry picked from commit 28665ddc7efa48f9b39615e313a2c4a7a66cdb24)

Implement OpenMP 5.1 section 3.15: omp_display_env

This is a new interface which is easily implemented using the
already existing code for the handling of the OMP_DISPLAY_ENV
environment variable.

libgomp/
* env.c (wait_policy, stacksize): New static variables,
move out of handle_omp_display_env.
(omp_display_env): New function.  The meat of the old
handle_omp_display_env function.
(handle_omp_display_env): Change to not take parameters
and instead use the global variables.  Only perform
parsing, defer to omp_display_env for the implementation.
(initialize_env): Remove local variables wait_policy and
stacksize.  Don't pass parameters to handle_omp_display_env.
* fortran.c: Add ialias_redirect for omp_display_env.
(omp_display_env_, omp_display_env_8_): New functions.
* libgomp.map (OMP_5.1): New version.  Add omp_display_env,
omp_display_env_, and omp_display_env_8_.
* omp.h.in: Declare omp_display_env.
* omp_lib.f90.in: Likewise.
* omp_lib.h.in: Likewise.

(cherry picked from commit 7123ae2455b5a1a2f19f13fa82c377cfda157f23)

Merge remote-tracking branch 'origin/releases/gcc-11' into devel/omp/gcc-11

Merge up to d185445c8d331fd5b0a2d1ba76e5402812fdf9d6 (July 29, 2021)

c++: Accept C++11 attribute-definition [PR101582]

As the following testcase shows, we don't parse properly
C++11 attribute-declaration:
https://eel.is/c++draft/dcl.dcl#nt:attribute-declaration

cp_parser_toplevel_declaration just handles empty-declaration parsing
(with diagnostics for C++98) and otherwise calls cp_parser_declaration
which on it calls cp_parser_simple_declaration and rejects it with
"does not declare anything" permerror.

The following patch moves the handling of empty-declaration from
cp_parser_toplevel_declaration to cp_parser_declaration and
handles attribute-declaration in cp_parser_declaration
by parsing the attributes (standard ones only, we've never supported
__attribute__((...)); at namespace scope, so I'm not sure we need to
introduce that), which for C++98 emits the needed diagnostics, and then
warning if there are any attributes that we throw away on the floor.

I'll need this later for OpenMP directives at namespace scope, e.g.
[[omp::directive (requires, atomic_default_mem_order(seq_cst))]];
should be valid at namespace scope (and many other directives).

2021-07-30 Jakub Jelinek <jakub@redhat.com>

PR c++/101582
* parser.c (cp_parser_skip_std_attribute_spec_seq): Add a forward
declaration.
(cp_parser_declaration): Parse empty-declaration and
attribute-declaration.
(cp_parser_toplevel_declaration): Don't parse empty-declaration here.

* g++.dg/cpp0x/gen-attrs-45.C: Expect a warning about ignored
attributes instead of error.
* g++.dg/cpp0x/gen-attrs-75.C: New test.
* g++.dg/modules/pr101582-1.C: New test.

(cherry picked from commit 77ab4e3be2d92b1ff671d58418d852195f10dd20)

Update gcc .po files.

* be.po, da.po, de.po, el.po, es.po, fi.po, fr.po, hr.po, id.po,
ja.po, nl.po, ru.po, sr.po, sv.po, zh_CN.po, zh_TW.po: Update.

rs6000: Add int128 target check to pr101129.c (PR101531)

Backport from mainline.

2021-07-21 Bill Schmidt <wschmidt@linux.ibm.com>

gcc/testsuite/
PR target/101531
* gcc.target/powerpc/pr101129.c: Adjust.

d: Return the correct value for C++ constructor calls (PR101664)

C++ constructors return void, even though the front-end semantic treats
them as implicitly returning `this'. To handle this correctly, the
object reference is cached and used as the result of the expression.

PR d/101664

gcc/d/ChangeLog:

* expr.cc (ExprVisitor::visit (CallExp *)): Use object expression as
result for C++ constructor calls.

gcc/testsuite/ChangeLog:

* gdc.dg/extern-c++/extern-c++.exp: New.
* gdc.dg/extern-c++/pr101664.d: New test.
* gdc.dg/extern-c++/pr101664_1.cc: New test.

(cherry picked from commit 7616ed6307c90b5bbf1bf53550d33cc674ab4b6f)

d: Ensure casting from bool results in either 0 or 1 (PR96435)

If casting from bool, the result is either 0 or 1, any other value
violates @safe code, so enforce that it is never invalid.

PR d/96435

gcc/d/ChangeLog:

* d-convert.cc (convert_for_rvalue): New function.
* d-tree.h (convert_for_rvalue): Declare.
* expr.cc (ExprVisitor::visit (CastExp *)): Use convert_for_rvalue.
(build_return_dtor): Likewise.

gcc/testsuite/ChangeLog:

* gdc.dg/torture/pr96435.d: New test.

(cherry picked from commit 5c9b7408dc578cb2ae142a5c1b724c183497bdb2)

amdgcn: Fix attributes for LLVM-12 [PR 100208]

This should work for a wider range of LLVM 12 variants now.
More work required for LLVM 13 though.

gcc/ChangeLog:

PR target/100208
* config.in: Regenerate.
* config/gcn/gcn-hsa.h (A_FIJI): New define.
(A_900): New define.
(A_906): New define.
(A_908): New define.
(ASM_SPEC): Use A_FIJI, A_900, A_906 and A_908.
* config/gcn/gcn.c (output_file_start): Adjust attributes according
to the assembler capabilities.
* config/gcn/mkoffload.c (main): Likewise.
* configure: Regenerate.
* configure.ac: Add tests for LLVM assembler attribute features.

(cherry picked from commit 1af16666943ef075673501765a13e425e47015cd)

Daily bump.

Correct a mistake in a warnung for -Wnonnull.

In the warning for -Wnonnull when warning about array parameters
with bounds > 0 and which are NULL the numbers referring to the
two arguments are switched. This patch corrects the mistake.

2021-07-28 Martin Uecker <muecker@gwdg.de>

gcc/
* calls.c (maybe_warn_rdwr_sizes): Correct argument
numbers in warning that were switched.

gcc/testsuite/
* gcc.dg/Wnonnull-4.c: Correct argument numbers in warnings.

Fortran: extend check for array arguments and reject CLASS array elements.

gcc/fortran/ChangeLog:

PR fortran/101536
* check.c (array_check): Adjust check for the case of CLASS
arrays.

gcc/testsuite/ChangeLog:

PR fortran/101536
* gfortran.dg/pr101536.f90: New test.

(cherry picked from commit e314cfc371d8b2405a1d81e51b90f9fb24b9061f)

Fortran: ICE, OOM while calculating sizes of derived type array components

gcc/fortran/ChangeLog:

PR fortran/101514
* target-memory.c (gfc_interpret_derived): Size of array component
of derived type can only be computed here for explicit shape.
* trans-types.c (gfc_get_nodesc_array_type): Do not dereference
NULL pointers.

gcc/testsuite/ChangeLog:

PR fortran/101514
* gfortran.dg/pr101514.f90: New test.

(cherry picked from commit c2b15fe27e6a0e42b108111d51acce69628593b4)

Fortran: reject FORMAT tag of unknown type.

gcc/fortran/ChangeLog:

PR fortran/101084
* io.c (resolve_tag_format): Extend FORMAT check to unknown type.

gcc/testsuite/ChangeLog:

PR fortran/101084
* gfortran.dg/fmt_nonchar_3.f90: New test.

(cherry picked from commit f527b8233498b40c8a2c616b82265f2e58aba42a)

d: Wrong evaluation order of binary expressions (PR101640)

The use of fold_build2 can in some cases swap the order of its operands
if that is the more optimal thing to do. However this breaks semantic
guarantee of left-to-right evaluation in D.

PR d/101640

gcc/d/ChangeLog:

* expr.cc (binary_op): Use build2 instead of fold_build2.

gcc/testsuite/ChangeLog:

* gdc.dg/pr96429.d: Update test.
* gdc.dg/pr101640.d: New test.

(cherry picked from commit 54ec50bada94a8ff92edb04ee5216c27fa4bf942)

d: fix ICE at convert_expr(tree_node*, Type*, Type*) (PR101490)

Both the front-end and code generator had a modulo by zero bug when testing if
a conversion from a static array to dynamic array was valid.

PR d/101490

gcc/d/ChangeLog:

* d-codegen.cc (build_array_index): Handle void arrays same as byte.
* d-convert.cc (convert_expr): Handle converting to zero-sized arrays.
* dmd/dcast.c (castTo): Handle casting to zero-sized arrays.

gcc/testsuite/ChangeLog:

* gdc.dg/pr101490.d: New test.
* gdc.test/fail_compilation/fail22144.d: New test.

(cherry picked from commit c936c39f86c74b3bfc6831f694b3165296c99dc0)

d: __FUNCTION__ doesn't work in core.stdc.stdio functions without cast (PR101441)

Backports fix from upstream to allow __FUNCTION__ and
__PRETTY_FUNCTION__ to be used as C string literals.

Reviewed-on: https://github.com/dlang/dmd/pull/12923

PR d/101441

gcc/d/ChangeLog:

* dmd/expression.c (FuncInitExp::resolveLoc): Set type as `string'.
(PrettyFuncInitExp::resolveLoc): Likewise.

gcc/testsuite/ChangeLog:

* gdc.test/compilable/b19002.d: New test.

(cherry picked from commit 1a2306ffe79df89389cc850ce85c586d0f1c8264)

d: Compile-time reflection for supported built-ins (PR101127)

In order to allow user-code to determine whether a back-end builtin is
available without error, LANG_HOOKS_BUILTIN_FUNCTION_EXT_SCOPE has been
defined to delay putting back-end builtin functions until the ISA that
defines them has been declared.

However in D, there is no global namespace.  All builtins get pushed
into the `gcc.builtins' module, which is constructed during the semantic
analysis pass, which has already finished by the time target attributes
are evaluated.  So builtins are not pushed by the new langhook because
they would be ultimately ignored.  Builtins exposed to D code then can
now only be altered by the command-line.

PR d/101127

gcc/d/ChangeLog:

* d-builtins.cc (d_builtin_function_ext_scope): New function.
* d-lang.cc (LANG_HOOKS_BUILTIN_FUNCTION_EXT_SCOPE): Define.
* d-tree.h (d_builtin_function_ext_scope): Declare.

gcc/testsuite/ChangeLog:

* gdc.dg/pr101127a.d: New test.
* gdc.dg/pr101127b.d: New test.

(cherry picked from commit b2f6e1de242fff5713763cd3146dcf3f9dee51ca)

d: Change in DotTemplateExp type semantics leading to regression (PR101619)

By giving dot templates a type, meant that properry resolving silently
started passing for code that should never have passed. The simple fix
is to provide implementations for checkType and checkValue that give an
error about dot templates having neither a value nor type.

Reviewed-on: https://github.com/dlang/dmd/pull/12920

PR d/101619

gcc/d/ChangeLog:

* dmd/expression.c (DotTemplateExp::checkType): New function.
(DotTemplateExp::checkValue): New function.
* dmd/expression.h (class DotTemplateExp): Declare checkType and
checkValue.

gcc/testsuite/ChangeLog:

* gdc.test/fail_compilation/fail7424b.d: Update test.
* gdc.test/fail_compilation/fail7424c.d: Update test.
* gdc.test/fail_compilation/fail7424d.d: Update test.
* gdc.test/fail_compilation/fail7424e.d: Update test.
* gdc.test/fail_compilation/fail7424f.d: Update test.
* gdc.test/fail_compilation/fail7424g.d: Update test.
* gdc.test/fail_compilation/fail7424h.d: Update test.
* gdc.test/fail_compilation/fail7424i.d: Update test.
* gdc.test/compilable/test22133.d: New test.
* gdc.test/fail_compilation/fail22133.d: New test.

(cherry picked from commit 3e2136117487fa839f7601c3e22a2856978fb9d0)

gimple-fold: Fix up __builtin_clear_padding on classes with virtual inheritence [PR101586]

For the following testcase, B is 16-byte type, containing 8-byte
virtual pointer and 1-byte A member, and C contains two FIELD_DECLs,
one with B type and size of just 8-byte and then a field with type
A and 1-byte size.
The __builtin_clear_padding code was upset about the B typed FIELD_DECL
containing FIELD_DECLs beyond the field size and triggered
assertion failure.
This patch makes it ignore all FIELD_DECLs that are (fully) beyond the sz
passed from the caller (except for the flexible array member
diagnostics that is kept).

2021-07-27 Jakub Jelinek <jakub@redhat.com>

PR middle-end/101586
* gimple-fold.c (clear_padding_type): Ignore FIELD_DECLs with byte
positions above or equal to sz except for diagnostics of flexible
array members.

* g++.dg/torture/builtin-clear-padding-4.C: New test.

(cherry picked from commit a21bd3cebd6f54af70a37c18b8fbeae933fb6515)

expmed: Fix store_integral_bit_field [PR101562]

Our documentation says that paradoxical subregs shouldn't appear
in strict_low_part:
'(strict_low_part (subreg:M (reg:N R) 0))'
     This expression code is used in only one context: as the
     destination operand of a 'set' expression.  In addition, the
     operand of this expression must be a non-paradoxical 'subreg'
     expression.
but on the testcase below that triggers UB at runtime
store_integral_bit_field emits exactly that.

The following patch fixes it by ensuring the requirement is satisfied.

2021-07-23  Jakub Jelinek  <jakub@redhat.com>

PR rtl-optimization/101562
* expmed.c (store_integral_bit_field): Only use movstrict_optab
if the operand isn't paradoxical.

* gcc.c-torture/compile/pr101562.c: New test.

(cherry picked from commit 8408d34570c9fe9f3d22a25a76df2a4c64f08477)

Merge remote-tracking branch 'origin/releases/gcc-11' into devel/omp/gcc-11

Merge up to 32e6acb39983541ac76e95d44883f05aa9d1c350 (2021-07-28), which
is the first post-GCC-11.2.0 commit: 'BASE-VER: Set to 11.2.1.'

Update BASE-VER to 11.2.1

This updates BASE-VER to 11.2.1 after the 11.2 release.

2021-07-28 Richard Biener <rguenther@suse.de>

* BASE-VER: Set to 11.2.1

Update ChangeLog and version files for release

Daily bump.

PR fortran/93308/93963/94327/94331/97046 problems raised by descriptor handling

Fortran: Fix attributes and bounds in ISO_Fortran_binding.

2021-07-26 José Rui Faustino de Sousa <jrfsousa@gmail.com>
Tobias Burnus <tobias@codesourcery.com>

PR fortran/93308
PR fortran/93963
PR fortran/94327
PR fortran/94331
PR fortran/97046

gcc/fortran/ChangeLog:

* trans-decl.c (convert_CFI_desc): Only copy out the descriptor
if necessary.
* trans-expr.c (gfc_conv_gfc_desc_to_cfi_desc): Updated attribute
handling which reflect a previous intermediate version of the
standard. Only copy out the descriptor if necessary.

libgfortran/ChangeLog:

* runtime/ISO_Fortran_binding.c (cfi_desc_to_gfc_desc): Add code
to verify the descriptor. Correct bounds calculation.
(gfc_desc_to_cfi_desc): Add code to verify the descriptor.

gcc/testsuite/ChangeLog:

* gfortran.dg/ISO_Fortran_binding_1.f90: Add pointer attribute,
this test is still erroneous but now it compiles.
* gfortran.dg/bind_c_array_params_2.f90: Update regex to match
code changes.
* gfortran.dg/PR93308.f90: New test.
* gfortran.dg/PR93963.f90: New test.
* gfortran.dg/PR94327.c: New test.
* gfortran.dg/PR94327.f90: New test.
* gfortran.dg/PR94331.c: New test.
* gfortran.dg/PR94331.f90: New test.
* gfortran.dg/PR97046.f90: New test.

(cherry picked from commit 0cbf03689e3e7d9d6002b8e5d159ef3716d0404c)

Bind(c): signed char is not a Fortran character type

CFI_allocate and CFI_select_part were incorrectly treating
CFI_type_signed_char as a Fortran character type for the purpose of
deciding whether or not to use the elem_len argument. It is a Fortran
integer type per table 18.2 in the 2018 Fortran standard.

Other functions in ISO_Fortran_binding.c appeared to handle this case
correctly already.

2021-07-15 Sandra Loosemore <sandra@codesourcery.com>

libgfortran/
* runtime/ISO_Fortran_binding.c (CFI_allocate): Don't use elem_len
for CFI_type_signed_char.
(CFI_select_part): Likewise.

(cherry picked from commit e4966e1d1de54441bdaed54eb00084db4193bbbf)

Fortran: set version field in CFI_cdesc_t to CFI_VERSION

When converting a GFC descriptor to a CFI descriptor, it was
incorrectly copying the version number from the former to the latter.
It should be using the value of CFI_VERSION instead (currently 1).
Going the other direction, the GFC version field is initialized to
zero elsewhere, so do that here too.

2021-06-30 Tobias Burnus <tobias@codesourcery.com>
Sandra Loosemore <sandra@codesourcery.com>

libgfortran/
* runtime/ISO_Fortran_binding.c (cfi_desc_to_gfc_desc):
Initialize version field to 0.
(gfc_desc_to_cfi_desc): Initialize version field to CFI_VERSION.

(cherry picked from commit 58b735b70b03884052c80ac032df90eed7059f8d)

Fix Fortran rounding issues, PR fortran/96983.

I was looking at Fortran PR 96983, which fails on the PowerPC when trying to
run the test PR96711.F90.  The compiler ICEs because the PowerPC does not have
a floating point type with a type precision of 128.  The reason is that the
PowerPC has 3 different 128 bit floating point types (__float128/_Float128,
__ibm128, and long double).  Currently long double uses the IBM extended double
type, but we would like to switch to using IEEE 128-bit long doubles in the
future.

In order to prevent the compiler from converting explicit __ibm128 types to
long double when long double uses the IEEE 128-bit representation, we have set
up the precision for __ibm128 to be 128, long double to be 127, and
__float128/_Float128 to be 126.

Originally, I was trying to see if for Fortran, I could change the precision of
long double to be 128 (Fortran doesn't access __ibm128), but it quickly became
hard to get the changes to work.

I looked at the Fortran code in build_round_expr, and I came to the conclusion
that there is no reason to promote the floating point type.  If you just do a
normal round of the value using the current floating point format and then
convert it to the integer type.  We don't have an appropriate built-in function
that provides the equivalent of llround for 128-bit integer types.

This patch fixes the compiler crash.

However, while with this patch, the PowerPC compiler will not crash when
building the test case, it will not run on the current default installation.
The failure is because the test is explicitly expecting 128-bit floating point
to handle 10384593717069655257060992658440192_16 (i.e. 2**113).

By default, the PowerPC uses IBM extended double used for 128-bit floating
point.  The IBM extended double format is a pair of doubles that provides more
mantissa bits but does not grow the expoenent range.  The value in the test is
fine for IEEE 128-bit floating point, but it is too large for the PowerPC
extended double setup.

I have built the following tests with this patch:

   * I have built a bootstrap compiler on a little endian power9 Linux system
     with the default long double format (IBM extended double).  The
     pr96711.f90 test builds, but it does not run due to the range of the
     real*16 exponent.  There were no other regressions in the C/C++/Fortran
     tests.

   * I have built a bootstrap compiler on a little endian power9 Linux system
     with the default long double format set to IEEE 128-bit. I used the
     Advance Toolchain 14.0-2 to provide the IEEE 128-bits.  The compiler was
     configured to build power9 code by default, so the test generated native
     power9 IEEE 128-bit instructions.  The pr96711.f90 test builds and runs
     correctly in this setup.

   * I have built a bootstrap compiler on a big endian power8 Linux system with
     the default long double format (IBM extended double).  Like the first
     case, the pr96711.f90 test does not crash the compiler, but the test fails
     due to the range of the real*16 exponent.    There were no other
     regressions in the C/C++/Fortran tests.

   * I built a bootstrap compiler on my x86_64 laptop.  There were no
     regressions in the tests.

gcc/fortran/
2021-04-21  Michael Meissner  <meissner@linux.ibm.com>

PR fortran/96983
* trans-intrinsic.c (build_round_expr): If int type is larger than
long long, do the round and convert to the integer type.  Do not
try to find a floating point type the exact size of the integer
type.

(cherry picked from commit 3cf04d1afa8a4955a0a9a395dd21ce1b6484aa78)

Daily bump.

Regenerate gcc.pot.

* gcc.pot: Regenerate.

offloading: fix -foffload hinting

PR sanitizer/101425

gcc/ChangeLog:

* gcc.c (check_offload_target_name): Call
candidates_list_and_hint only if we have a candidate.

(cherry picked from commit 9b8b37d1b6301855213b8d4860feaeb74d464c6b)

gcc.c: Add -foffload= to display_help

gcc/ChangeLog:

* common.opt (foffload): Remove help as Driver only.
* gcc.c (display_help): Add -foffload.

(cherry picked from commit 63fe82d80dee997b25ca60fa7d1ed07e97930976)

gcc.c's check_offload_target_name: Fixes to inform hints

gcc/ChangeLog:

* gcc.c (close_at_file, execute): Replace alloca by XALLOCAVEC.
(check_offload_target_name): Fix splitting OFFLOAD_TARGETS into
a candidate list; better inform no offload target is configured
and fix hint extraction when passed target is not '\0' at [len].
* common.opt (foffload): Add tailing '.'.
(foffload-options): Likewise; fix flag name in the help string.

(cherry picked from commit a3ce7d75dd9c0308b8565669f31127436cb2ba9f)

openmp: Add support for omp attributes section and scan directives

This patch adds support for expressing the section and scan directives
using the attribute syntax and additionally fixes some bugs in the attribute
syntax directive handling.
For now it requires that the scan and section directives appear as the only
attribute, not combined with other OpenMP or non-OpenMP attributes on the same
statement.

2021-07-26  Jakub Jelinek  <jakub@redhat.com>

* parser.h (struct cp_lexer): Add orphan_p member.
* parser.c (cp_parser_statement): Don't change in_omp_attribute_pragma
upon restart from CPP_PRAGMA handling.  Fix up condition when a lexer
should be destroyed and adjust saved_tokens if it records tokens from
the to be destroyed lexer.
(cp_parser_omp_section_scan): New function.
(cp_parser_omp_scan_loop_body): Use it.  If
parser->lexer->in_omp_attribute_pragma, allow optional comma
after scan.
(cp_parser_omp_sections_scope): Use cp_parser_omp_section_scan.

* g++.dg/gomp/attrs-1.C: Use attribute syntax even for section
and scan directives.
* g++.dg/gomp/attrs-2.C: Likewise.
* g++.dg/gomp/attrs-6.C: New test.
* g++.dg/gomp/attrs-7.C: New test.
* g++.dg/gomp/attrs-8.C: New test.

(cherry picked from commit acf9d1fd806fabf62dfe232439b11263c191e32d)

Daily bump.

openmp: Add support for __has_attribute(omp::directive) and __has_attribute(omp::sequence)

Now that the C++ FE supports these attributes, but not through registering
them in the attributes tables (they work quite differently from other
attributes), this teaches c_common_has_attributes about those.

2021-07-23 Jakub Jelinek <jakub@redhat.com>

* c-lex.c (c_common_has_attribute): Call canonicalize_attr_name also
on attr_id. Return 1 for omp::directive or omp::sequence in C++11
and later.

* c-c++-common/gomp/attrs-1.c: New test.
* c-c++-common/gomp/attrs-2.c: New test.
* c-c++-common/gomp/attrs-3.c: New test.

(cherry picked from commit 7f7364108f7441e6bd6f6f79a2d991e4e0f71b28)

openmp: Diagnose invalid mixing of the attribute and pragma syntax directives

The OpenMP 5.1 spec says that the attribute and pragma syntax directives
should not be mixed on the same statement.  The following patch adds diagnostic
for that,
  [[omp::directive (...)]]
  #pragma omp ...
is always an error and for the other order
  #pragma omp ...
  [[omp::directive (...)]]
it depends on whether the pragma directive is an OpenMP construct
(then it is an error because it needs a structured block or loop
or statement as body) or e.g. a standalone directive (then it is fine).

Only block scope is handled for now though, namespace scope and class scope
still needs implementing even the basic support.

2021-07-23  Jakub Jelinek  <jakub@redhat.com>

gcc/c-family/
* c-pragma.h (enum pragma_kind): Add PRAGMA_OMP__START_ and
PRAGMA_OMP__LAST_ enumerators.
gcc/cp/
* parser.h (struct cp_parser): Add omp_attrs_forbidden_p member.
* parser.c (cp_parser_handle_statement_omp_attributes): Diagnose
mixing of attribute and pragma syntax directives when seeing
omp::directive if parser->omp_attrs_forbidden_p or if attribute syntax
directives are followed by OpenMP pragma.
(cp_parser_statement): Clear parser->omp_attrs_forbidden_p after
the cp_parser_handle_statement_omp_attributes call.
(cp_parser_omp_structured_block): Add disallow_omp_attrs argument,
if true, set parser->omp_attrs_forbidden_p.
(cp_parser_omp_scan_loop_body, cp_parser_omp_sections_scope): Pass
false as disallow_omp_attrs to cp_parser_omp_structured_block.
(cp_parser_omp_parallel, cp_parser_omp_task): Set
parser->omp_attrs_forbidden_p.
gcc/testsuite/
* g++.dg/gomp/attrs-4.C: New test.
* g++.dg/gomp/attrs-5.C: New test.

(cherry picked from commit 2c5d803d03209478b4f060785c6f6ba2f0de88ad)

Merge remote-tracking branch 'origin/releases/gcc-11' into devel/omp/gcc-11

Merge up to 9ca1fa731d5e1ced991339bb8fc8b7757064b2ab (July 23, 2021)

Daily bump.

[POWER10] __morestack calls from pcrel code

Compiling gcc/testsuite/gcc.dg/split-*.c and others with -mcpu=power10
and linking with a non-pcrel libgcc results in crashes due to the
power10 pcrel code not having r2 set for the generic-morestack.c
functions called from __morestack.  There is also a problem when
non-pcrel code calls a pcrel libgcc.  See the patch comments.

A similar situation theoretically occurs with ELFv1 multi-toc
executables, when __morestack might be located in a different toc
group to its caller.  This patch makes no attempt to fix that, since
the gold linker does not support multi-toc (gold is needed for proper
support of -fsplit-stack code) nor does gcc emit __morestack calls
that support multi-toc.

* config/rs6000/morestack.S (R2_SAVE): Define.
(__morestack): Save and restore r2.  Set up r2 for called
functions.

(cherry picked from commit cd6ca96f5d530e4ee07b65ac8b075119ba5bb035)

Daily bump.

openmp: Fix up omp_check_private [PR101535]

The target data construct shouldn't affect omp_check_private, unless
the decl there is privatized (use_device_* clauses).  The routine
had some code for that, but it just did continue; in a loop that looped
only if the region type is one of selected 4 kinds, so effectively resulted
in return false; instead of looping again.  And not diagnosing lastprivate
(or reduction etc.) on a variable that is private to containing parallel
results in ICEs later on, as there is no original list item to which store
the last result.
The target construct is unclear as it has an implicit parallel region
and it is not obvious if the data privatization clauses on the construct
shall be treated as data privatization on the implicit parallel or just
on the target.  For now treat those as privatization on the implicit
parallel, but treat map clauses as shared on the implicit parallel.

2021-07-21  Jakub Jelinek  <jakub@redhat.com>

PR middle-end/101535
* gimplify.c (omp_check_private): Properly skip ORT_TARGET_DATA
contexts in which decl isn't privatized and for ORT_TARGET return
false if decl is mapped.

* c-c++-common/gomp/pr101535-1.c: New test.
* c-c++-common/gomp/pr101535-2.c: New test.

(cherry picked from commit b136b7a78774107943fe94051c42b5a968a3ad3f)

c++: Ensure OpenMP reduction with reference type references complete type [PR101516]

The following testcase ICEs because we haven't verified if reduction decl
has reference type that TREE_TYPE of the reference is a complete type,
require_complete_type on the decl doesn't ensure that.

2021-07-21 Jakub Jelinek <jakub@redhat.com>

PR c++/101516
* semantics.c (finish_omp_reduction_clause): Also call
complete_type_or_else and return true if it fails.

* g++.dg/gomp/pr101516.C: New test.

(cherry picked from commit aea199f96cf116ba4c81426207acde371556610c)

openmp: Fix up omp_check_private [PR101535]

The target data construct shouldn't affect omp_check_private, unless
the decl there is privatized (use_device_* clauses).  The routine
had some code for that, but it just did continue; in a loop that looped
only if the region type is one of selected 4 kinds, so effectively resulted
in return false; instead of looping again.  And not diagnosing lastprivate
(or reduction etc.) on a variable that is private to containing parallel
results in ICEs later on, as there is no original list item to which store
the last result.
The target construct is unclear as it has an implicit parallel region
and it is not obvious if the data privatization clauses on the construct
shall be treated as data privatization on the implicit parallel or just
on the target.  For now treat those as privatization on the implicit
parallel, but treat map clauses as shared on the implicit parallel.

2021-07-21  Jakub Jelinek  <jakub@redhat.com>

PR middle-end/101535
* gimplify.c (omp_check_private): Properly skip ORT_TARGET_DATA
contexts in which decl isn't privatized and for ORT_TARGET return
false if decl is mapped.

* c-c++-common/gomp/pr101535-1.c: New test.
* c-c++-common/gomp/pr101535-2.c: New test.

(cherry picked from commit b136b7a78774107943fe94051c42b5a968a3ad3f)

c++: Ensure OpenMP reduction with reference type references complete type [PR101516]

The following testcase ICEs because we haven't verified if reduction decl
has reference type that TREE_TYPE of the reference is a complete type,
require_complete_type on the decl doesn't ensure that.

2021-07-21 Jakub Jelinek <jakub@redhat.com>

PR c++/101516
* semantics.c (finish_omp_reduction_clause): Also call
complete_type_or_else and return true if it fails.

* g++.dg/gomp/pr101516.C: New test.

(cherry picked from commit aea199f96cf116ba4c81426207acde371556610c)

Fortran: Fix bind(C) character length checks

gcc/fortran/ChangeLog:

* decl.c (gfc_verify_c_interop_param): Update for F2008 + F2018
changes; reject unsupported bits with 'Error: Sorry,'.
* trans-expr.c (gfc_conv_procedure_call): Fix condition to
For using CFI descriptor with characters.

gcc/testsuite/ChangeLog:

* gfortran.dg/iso_c_binding_char_1.f90: Update dg-error.
* gfortran.dg/pr32599.f03: Use -std=-f2003 + update comment.
* gfortran.dg/bind_c_char_10.f90: New test.
* gfortran.dg/bind_c_char_6.f90: New test.
* gfortran.dg/bind_c_char_7.f90: New test.
* gfortran.dg/bind_c_char_8.f90: New test.
* gfortran.dg/bind_c_char_9.f90: New test.

(cherry picked from commit b3d4011ba10275fbd5d6ec5a16d5aaebbdfb5d3c)

Daily bump.

rs6000: Fix up easy_vector_constant_msb handling [PR101384]

The following gcc.dg/pr101384.c testcase is miscompiled on
powerpc64le-linux.
easy_altivec_constant has code to try construct vector constants with
different element sizes, perhaps different from CONST_VECTOR's mode. But as
written, that works fine for vspltis[bhw] cases, but not for the vspltisw
x,-1; vsl[bhw] x,x,x case, because that creates always a V16QImode, V8HImode
or V4SImode constant containing broadcasted constant with just the MSB set.
The vspltis_constant function etc. expects the vspltis[bhw] instructions
where the small [-16..15] or even [-32..30] constant is sign-extended to the
remaining step bytes, but that is not the case for the 0x80...00 constants,
with step 1 we can't handle e.g.
{ 0x80, 0xff, 0xff, 0xff, 0x80, 0xff, 0xff, 0xff, 0x80, 0xff, 0xff, 0xff, 0x80, 0xff, 0xff, 0xff }
vectors but do want to handle e.g.
{ 0, 0, 0, 0x80, 0, 0, 0, 0x80, 0, 0, 0, 0x80, 0, 0, 0, 0x80 }
and similarly with copies 1 we do want to handle e.g.
{ 0x80808080, 0x80808080, 0x80808080, 0x80808080 }.

This is a simpler version of the fix for backports, which limits the EASY_VECTOR_MSB case
matching to step == 1 && copies == 1, because that is the only case the
splitter handles correctly.

2021-07-20 Jakub Jelinek <jakub@redhat.com>

PR target/101384
* config/rs6000/rs6000.c (vspltis_constant): Accept EASY_VECTOR_MSB
only if step and copies are equal to 1.

* gcc.dg/pr101384.c: New test.

amdgcn: Add -mxnack and -msram-ecc [PR 100208]

gcc/ChangeLog:

PR target/100208
* config/gcn/gcn-hsa.h (DRIVER_SELF_SPECS): New.
(ASM_SPEC): Set -mattr for xnack and sram-ecc.
* config/gcn/gcn-opts.h (enum sram_ecc_type): New.
* config/gcn/gcn-valu.md: Add a warning comment.
* config/gcn/gcn.c (gcn_option_override): Add "sorry" for -mxnack.
(output_file_start): Add xnack and sram-ecc state to ".amdgcn_target".
* config/gcn/gcn.md: Add a warning comment.
* config/gcn/gcn.opt: Add -mxnack and -msram-ecc.
* config/gcn/mkoffload.c (EF_AMDGPU_MACH_AMDGCN_GFX908): Remove
SRAM-ECC flag.
(EF_AMDGPU_XNACK): New.
(EF_AMDGPU_SRAM_ECC): New.
(elf_flags): New.
(copy_early_debug_info): Use elf_flags.
(main): Handle -mxnack and -msram-ecc options.
* doc/invoke.texi: Document -mxnack and -msram-ecc.

gcc/testsuite/ChangeLog:

PR target/100208
* gcc.target/gcn/sram-ecc-1.c: New test.
* gcc.target/gcn/sram-ecc-2.c: New test.
* gcc.target/gcn/sram-ecc-3.c: New test.
* gcc.target/gcn/sram-ecc-4.c: New test.
* gcc.target/gcn/sram-ecc-5.c: New test.
* gcc.target/gcn/sram-ecc-6.c: New test.
* gcc.target/gcn/sram-ecc-7.c: New test.
* gcc.target/gcn/sram-ecc-8.c: New test.

(cherry picked from commit aad32a00b7d2b64ae158b2b167768a9ae3e20f6e)

Merge remote-tracking branch 'origin/releases/gcc-11' into devel/omp/gcc-11

Merge up to 5dd3fe90a5c0067e3e21d81aa904d13a4154eb4a (July 2021)

X86: Provide a CTOR for stringop_algs [PR100246].

Several older compilers fail to build modern GCC because of missing
or incomplete C++11 support.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
PR bootstrap/100246 - [11/12 Regression] GCC will not bootstrap with clang 3.4/3.5 [xcode 5/6, Darwin 12/13]

PR bootstrap/100246

gcc/ChangeLog:

* config/i386/i386.h (struct stringop_algs): Define a CTOR for
this type.

(cherry picked from commit f99f6eb58e1f894dae024f63cc2fe30fa7605e59)

coroutines: Adjust outlined function names [PR95520].

The mechanism used to date for uniquing the coroutine helper
functions (actor, destroy) was over-complicating things and
leading to the noted PR and also difficulties in setting
breakpoints on these functions (so this will help PR99215 as
well).

This implementation delegates the adjustment to the mangling
to write_encoding() which necessitates some book-keeping so
that it is possible to determine which of the coroutine
helper names is to be mangled.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
PR c++/95520 - [coroutines] __builtin_FUNCTION() returns mangled .actor instead of original function name

PR c++/95520

gcc/cp/ChangeLog:

* coroutines.cc (struct coroutine_info): Add fields for
actor and destroy function decls.
(to_ramp): New.
(coro_get_ramp_function): New.
(coro_get_actor_function): New.
(coro_get_destroy_function): New.
(act_des_fn): Set up mapping between ramp, actor and
destroy functions.
(morph_fn_to_coro): Adjust interface to the builder for
helper function decls.
* cp-tree.h (DECL_ACTOR_FN, DECL_DESTROY_FN, DECL_RAMP_FN,
JOIN_STR): New.
* mangle.c (write_encoding): Handle coroutine helpers.
(write_unqualified_name): Handle lambda coroutine helpers.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/pr95520.C: New test.

(cherry picked from commit 237ab3ee49e2f3110accfcc03b6c0df8b4889f15)

coroutines: Factor code. Match original source location in helpers [NFC].

This is primarily a source code refactoring, the only change is to
ensure that the outlined functions are marked to begin at the same
line as the original.  Otherwise, they get the default (which seems
to be input_location, which corresponds to the closing brace at the
point that this is done).  Having the source location point to that
confuses some debuggers.

This is a contributory fix to:
PR c++/99215 - coroutines: debugging with gdb

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/cp/ChangeLog:

* coroutines.cc (build_actor_fn): Move common code to
act_des_fn.
(build_destroy_fn): Likewise.
(act_des_fn): Build the void return here.  Ensure that the
source location matches the original function.

(cherry picked from commit d5b1bb0d197f9141a0f0e510f8d1b598c3df9552)

coroutines: Fix a typo in rewriting the function.

When amending the function re-write code, I made a typo in
the block connections. This has not shown up in any test
fails (as far as can be seen) but is a regression in debug
info.

Fixed thus.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/cp/ChangeLog:

* coroutines.cc
(coro_rewrite_function_body): Connect the replacement
function block to the block nest correctly.

(cherry picked from commit 0d5db79a61af150cba48612c9fbc3267262adb93)

Darwin, X86: Adjust call clobbers to allow for lazy-binding [PR 100152].

We allow public functions defined in a TU to bind locally for PIC
code (the default) on 64bit Mach-O.

If such functions are not inlined, we cannot tell at compile-time if
they might be called via the lazy symbol resolver (this can depend on
options given at link-time). Therefore, we must assume that the lazy
resolver could be used which clobbers R11 and R10.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/ChangeLog:

PR target/100152
* config/i386/i386-expand.c (ix86_expand_call): If a call is
to a non-local-binding, or local but to a public symbol, then
assume that it might be indirected via the lazy symbol binder.
Mark R10 and R10 as clobbered in that case.

(cherry picked from commit 41bd1b190358fce213f5add8396faf14a32d5c23)

i386: Remove atomic_storedi_fpu and atomic_loaddi_fpu peepholes [PR100182]

These patterns result in non-atomic sequence.

2021-07-21 Uroš Bizjak <ubizjak@gmail.com>

gcc/
PR target/100182
* config/i386/sync.md (define_peephole2 atomic_storedi_fpu):
Remove.
(define_peephole2 atomic_loaddi_fpu): Ditto.

gcc/testsuite/
PR target/100182
* gcc.target/i386/pr71245-1.c: Remove.
* gcc.target/i386/pr71245-2.c: Ditto.

Daily bump.

compiler: avoid aliases in receiver types

If a package declares a method on an alias type, the alias would be
used in the export data. This would then trigger a compiler
assertion on import: we should not be adding methods to aliases.

Fix the problem by ensuring that receiver types do not use alias types.
This seems preferable to consistently avoiding aliases in export data,
as aliases can cross packages. And it's painful to try to patch this
while writing the export data, as at that point all the types are known.

Test case is https://golang.org/cl/335172.

Fixes golang/go#47131

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/335729

rs6000: Don't let swaps pass break multiply low-part (PR101129)

Backport from mainline.

2021-07-15 Bill Schmidt <wschmidt@linux.ibm.com>

gcc/
PR target/101129
* config/rs6000/rs6000-p8swap.c (has_part_mult): New.
(rs6000_analyze_swaps): Insns containing a subreg of a mult are
not swappable.

gcc/testsuite/
PR target/101129
* gcc.target/powerpc/pr101129.c: New.

openmp: Initial support for OpenMP directives expressed as C++11 attributes

This is an OpenMP 5.1 feature, but I think it is something very useful for
OpenMP users, so I'm committing it now instead of waiting until all 5.0
work is done.

The support is incomplete, only attributes on statements (or block local
declarations) are supported right now, while for non-executable directives
they should be also supported at namespace scope and at class scope, and
for declarations in all places that appertain to the declarations rather
than e.g. types.

I need to also fix up handling of C++11 non-OpenMP attributes mixed with
OpenMP attributes before block local declarations (currently it throws
them away), probably reject if the directives appertain to labels etc.

In order not to complicate all the OpenMP directive parsing, it is done
by remembering the tokens from the attribute, slightly adjusting them and
feeding them through a temporary new lexer to cp_parse_pragma.

2021-07-02  Jakub Jelinek  <jakub@redhat.com>

gcc/c-family/
* c-common.h (enum c_omp_directive_kind): New enum.
(struct c_omp_directive): New type.
(c_omp_categorize_directive): Declare.
* c-omp.c (omp_directives): New variable.
(c_omp_categorize_directive): New function.
gcc/cp/
* parser.h (struct cp_lexer): Add in_omp_attribute_pragma member.
(struct cp_omp_declare_simd_data): Likewise.
* cp-tree.h (enum cp_tree_index): Add CPTI_OMP_IDENTIFIER.
(omp_identifier): Define.
* parser.c (cp_parser_skip_to_pragma_eol): Handle
in_omp_attribute_pragma CPP_PRAGMA_EOL followed by CPP_EOF.
(cp_parser_require_pragma_eol): Likewise.
(struct cp_omp_attribute_data): New type.
(cp_parser_handle_statement_omp_attributes): New function.
(cp_parser_statement): Handle OpenMP directives in statement's
attribute-specifier-seq.
(cp_parser_omp_directive_args, cp_parser_omp_sequence_args): New
functions.
(cp_parser_std_attribute): Handle omp::directive and omp::sequence
attributes.
(cp_parser_omp_all_clauses): If in_omp_attribute_pragma, allow
a comma also before the first clause.
(cp_parser_omp_allocate): Likewise.
(cp_parser_omp_atomic): Likewise.
(cp_parser_omp_depobj): Likewise.
(cp_parser_omp_flush): Likewise.
(cp_parser_omp_ordered): Likewise.
(cp_parser_omp_declare_simd): Save in_omp_attribute_pragma
into struct cp_omp_declare_simd_data.
(cp_finish_omp_declare_variant): Add in_omp_attribute_pragma
argument.  If set, allow a comma also before match clause.
(cp_parser_late_parsing_omp_declare_simd): If in_omp_attribute_pragma,
allow a comma also before the first clause.  Adjust
cp_finish_omp_declare_variant caller.
(cp_parser_omp_declare_target): If in_omp_attribute_pragma, allow
a comma also before the first clause.
(cp_parser_omp_declare_reduction_exprs): Likewise.
(cp_parser_omp_requires): Likewise.
* decl.c (initialize_predefined_identifiers): Initialize
omp_identifier.
* decl2.c (cplus_decl_attributes): Reject omp::directive and
omp::sequence attributes.
gcc/testsuite/
* g++.dg/gomp/attrs-1.C: New test.
* g++.dg/gomp/attrs-2.C: New test.
* g++.dg/gomp/attrs-3.C: New test.

(cherry picked from commit 9984f63aab93a370101966b7eb198dc61130b3c8)

openmp: Reject #pragma omp atomic update, [PR101297]

I've noticed that we allow a trailing comma on OpenMP atomic construct
if there is at least one clause. Commas should be only allowed to
separate the clauses (or in OpenMP 5.1 to separate directive name
from the clauses).

2021-07-02 Jakub Jelinek <jakub@redhat.com>

PR c/101297
* c-parser.c (c_parser_omp_atomic): Consume comma only if it
appears before a CPP_NAME.

* parser.c (cp_parser_omp_atomic): Consume comma only if it
appears before a CPP_NAME.

* c-c++-common/gomp/atomic-24.c: New test.

(cherry picked from commit 2ca89394280da4afad6074ec3cb7136b6142af7b)

gcc/ChangeLog.omp: Update for last commit

gcc/
* ChangeLog.omp: Update for last cherry-pick commit
'gcc.c: Silence warning in check_offload_target_name'.

libstdc++: Fix some problems in PSTL tests

libstdc++-v3/ChangeLog:

* testsuite/25_algorithms/pstl/alg_nonmodifying/find_end.cc:
Increase dg-timeout-factor to 4. Fix -Wunused-parameter
warnings. Replace bitwise AND with logical AND in loop
condition.
* testsuite/25_algorithms/pstl/alg_nonmodifying/search_n.cc:
Replace bitwise AND with logical AND in loop condition.
* testsuite/util/pstl/test_utils.h: Remove unused parameter
names.

(cherry picked from commit d1adbe5c1bd3f3ba098ff112eed9b61515e2dc20)

libstdc++: Remove precondition checks from ranges::subrange

The assertion in the subrange constructor causes semantic changes,
because the call to ranges::distance performs additional operations that
are not part of the constructor's specification. That will fail to
compile if the iterator is move-only, because the argument to
ranges::distance is passed by value. It will modify the subrange if the
iterator is not a forward iterator, because incrementing the copy also
affects the _M_begin member. Those problems could be prevented by using
if-constexpr to only do the assertion for copyable forward iterators,
but the call to ranges::distance can also prevent the constructor being
usable in constant expressions. If the member initializers are usable in
constant expressions, but iterator increments of equality comparisons
are not, then the checks done by __glibcxx_assert might
make constant evaluation fail.

This change removes the assertion. Additionally, a new typedef is
introduced to simplify the declarations using __make_unsigned_like_t on
the iterator's difference type.

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:

* include/bits/ranges_util.h (subrange): Add __size_type typedef
and use it to simplify declarations.
(subrange(i, s, n)): Remove assertion.
* testsuite/std/ranges/subrange/constexpr.cc: New test.

(cherry picked from commit a88fc03ba7e52d9a072f25d42bb9619fedb7892e)

libstdc++: Fix std::get<T> for std::tuple [PR101427]

The std::get<T> functions relied on deduction failing if more than one
base class existed for the type T. However the implementation of Core
DR 2303 (in r11-4693) made deduction succeed (and select the
more-derived base class).

This rewrites the implementation of std::get<T> to explicitly check for
more than one occurrence of T in the tuple elements, making it
ill-formed again. Additionally, the large wall of overload resolution
errors described in PR c++/101460 is avoided by making std::get<T> use
__get_helper<I> directly instead of calling std::get<I>, and by adding a
deleted overload of __get_helper<N> for out-of-range N.

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:

PR libstdc++/101427
* include/std/tuple (tuple_element): Improve static_assert text.
(__get_helper): Add deleted overload.
(get<i>(tuple<T...>&&), get<i>(const tuple<T...>&&)): Use
__get_helper directly.
(__get_helper2): Remove.
(__find_uniq_type_in_pack): New constexpr helper function.
(get<T>): Use __find_uniq_type_in_pack and __get_helper instead
of __get_helper2.
* testsuite/20_util/tuple/element_access/get_neg.cc: Adjust
expected errors.
* testsuite/20_util/tuple/element_access/101427.cc: New test.

(cherry picked from commit 17855eed7fc76b2cee7fbbc26f84d3c8b99be13c)

Daily bump.

g++.dg/gomp/clause-3.C: Fix - missing in r12-438-g1580fc7 [PR100422]

gcc/testsuite/
PR testsuite/100422
* g++.dg/gomp/clause-3.C: Use 'reduction(&:..)' instead of '...(&&:..)'.

(cherry picked from commit af4e4d35f0b84d7c2f57a7b682a09116e9911142)

openmp - Fix up && and || reductions [PR94366]

As the testcase shows, the special treatment of && and || reduction combiners
where we expand them as omp_out = (omp_out != 0) && (omp_in != 0) (or with ||)
is not needed just for &&/|| on floating point or complex types, but for all
&&/|| reductions - when expanded as omp_out = omp_out && omp_in (not in C but
GENERIC) it is actually gimplified into NOP_EXPRs to bool from both operands,
which turns non-zero values multiple of 2 into 0 rather than 1.

This patch just treats all &&/|| the same and furthermore uses bool type
instead of int for the comparisons.

2021-07-01 Jakub Jelinek <jakub@redhat.com>

PR middle-end/94366
gcc/
* omp-low.c (lower_rec_input_clauses): Rename is_fp_and_or to
is_truth_op, set it for TRUTH_*IF_EXPR regardless of new_var's type,
use boolean_type_node instead of integer_type_node as NE_EXPR type.
(lower_reduction_clauses): Likewise.
libgomp/
* testsuite/libgomp.c-c++-common/pr94366.c: New test.

(cherry picked from commit 91c771ec8a3b649765de3e0a7b04cf946c6649ef)