git.ipfire.org Git - thirdparty/gcc.git/log

Add dg-final option-based target selectors

This patch adds target selectors of the form:

  { any-opts "opt1" ... "optn" }
  { no-opts "opt1" ... "optn" }

for skipping or xfailing tests based on compiler options.  It only
works for dg-final selectors.

The patch then uses no-opts to exclude -O0 and (sometimes) -Og from
some guality.exp xfails.  AFAICT (based on gcc-testresults) these
tests pass for those options for all targets.

gcc/
* doc/sourcebuild.texi: Document no-opts and any-opts target
selectors.

gcc/testsuite/
* lib/target-supports-dg.exp (selector_expression): Handle any-opts
and no-opts.
* gcc.dg/guality/pr41353-1.c: Exclude -O0 from xfail.
* gcc.dg/guality/pr59776.c: Likewise.
* gcc.dg/guality/pr54970.c: Likewise -O0 and -Og.

Fix gimple_debug_cfg declaration

Silence a warning. The argument type did not match the definition.

gcc/ChangeLog:

* tree-cfg.h (gimple_debug_cfg): Change argument type from int
to dump_flags_t.

Merge remote-tracking branch 'origin/releases/gcc-11' into devel/omp/gcc-11

Merge up to r11-9233-g3dea90505df136a4b361665772ef8e62306cfcdb (Nov 10, 2021)

testsuite/102690 - XFAIL g++.dg/warn/Warray-bounds-16.C

This XFAILs the bogus diagnostic test and rectifies the expectation
on the optimization.

2021-11-10 Richard Biener <rguenther@suse.de>

PR testsuite/102690
* g++.dg/warn/Warray-bounds-16.C: XFAIL diagnostic part
and optimization.

(cherry picked from commit b2cd32b743ba440e75505ce30c6b5c592ed144ea)

openmp: For default(none) ignore variables created by ubsan_create_data [PR64888]

We weren't ignoring the ubsan variables created by c-ubsan.c before gimplification
(others are added later). One way to fix this would be to introduce further
UBSAN_ internal functions and lower it later (sanopt pass) like other ifns,
this patch instead recognizes those magic vars by name/name of type and DECL_ARTIFICIAL
and TYPE_ARTIFICIAL.

2021-10-21 Jakub Jelinek <jakub@redhat.com>

PR middle-end/64888
gcc/c-family/
* c-omp.c (c_omp_predefined_variable): Return true also for
ubsan_create_data created artificial variables.
gcc/testsuite/
* c-c++-common/ubsan/pr64888.c: New test.

(cherry picked from commit 40dd9d839e52f679d8eabc1c5ca0ca17a5ccfd14)

Restore 'GOMP_OPENACC_DIM' environment variable parsing

... that got broken by recent commit c057ed9c52c6a63a1a692268f916b1a9131cd4b7
"openmp: Fix up strtoul and strtoull uses in libgomp", resulting in spurious
FAILs for tests specifying 'dg-set-target-env-var "GOMP_OPENACC_DIM" "[...]"'.

libgomp/
* env.c (parse_gomp_openacc_dim): Restore parsing.

(cherry picked from commit 00c9ce13a64e324dabd8dfd236882919a3119479)

Daily bump.

rs6000: Fix incorrect fusion constraint [PR102991]

gcc/ChangeLog:

2021-11-05 Xionghu Luo <luoxhu@linux.ibm.com>

PR target/102991
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl: Fix incorrect clobber constraint.

(cherry picked from commit 614b39757b8b61f70ac1c666edb7a01a5fc19cd4)

Daily bump.

tree-optimization/102798 - avoid copying PTA info to old SSA names

The vectorizer duplicates pointer-info to created pointer bases
but it has to avoid changing points-to info on existing SSA names
because there's now flow-sensitive info in there (pt->pt_null as
set from VRP).

2021-10-18 Richard Biener <rguenther@suse.de>

PR tree-optimization/102798
* tree-vect-data-refs.c (vect_create_addr_base_for_vector_ref):
Only copy points-to info to newly generated SSA names.

* gcc.dg/pr102798.c: New testcase.

middle-end/102518 - avoid invalid GIMPLE during inlining

When inlining we have to avoid mapping a non-lvalue parameter
value into a context that prevents the parameter from being a register.
Formerly the register was TREE_ADDRESSABLE but now it can be
just DECL_NOT_GIMPLE_REG_P.

2021-09-30 Richard Biener <rguenther@suse.de>

PR middle-end/102518
* tree-inline.c (setup_one_parameter): Avoid substituting
an invariant into contexts where a GIMPLE register is not valid.

* gcc.dg/torture/pr102518.c: New testcase.

tree-optimization/102788 - avoid spurious bool pattern fails

Bool pattern recog is required for correctness since vectorized
compares otherwise produce -1 for true so any context where bool
is used as value and not as condition or mask needs to be replaced
with CMP ? 1 : 0.  When we fail to find a vector type for the
result of such use we may not simply elide such transform since
a new bool result can emerge when for example the cast_forwprop
pattern is applied.  So the following avoids failing the
bool pattern recog process and instead does not assign a vector type
for the stmt.
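
To illustrate why the CMP ? 1 : 0 replacement matters, here is a scalar
sketch of a single vector lane. The names lane_compare, lane_bool_value and
count_less are hypothetical and are not GCC functions; the sketch only
assumes, as stated above, that a vectorized compare yields -1 for true.

```cpp
#include <cstdint>

// Hypothetical model of one lane of a vectorized compare:
// true is represented as all-ones (-1), false as 0.
inline std::int32_t lane_compare(std::int32_t a, std::int32_t b)
{
  return a < b ? -1 : 0;
}

// When the bool is used as a value (not as a condition or mask),
// the lane has to be normalized to 0/1, i.e. CMP ? 1 : 0.
inline std::int32_t lane_bool_value(std::int32_t a, std::int32_t b)
{
  return lane_compare(a, b) != 0 ? 1 : 0;
}

// Counts the lanes where a[i] < b[i]. Adding the raw -1 masks instead
// of the normalized values would give a negative count.
inline std::int32_t count_less(const std::int32_t *a,
                               const std::int32_t *b, int n)
{
  std::int32_t count = 0;
  for (int i = 0; i < n; ++i)
    count += lane_bool_value(a[i], b[i]);
  return count;
}
```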

2021-10-18  Richard Biener  <rguenther@suse.de>

PR tree-optimization/102788
* tree-vect-patterns.c (vect_init_pattern_stmt): Allow
a NULL vectype.
(vect_pattern_recog_1): Likewise.
(vect_recog_bool_pattern): Continue matching the pattern
even if we do not have a vector type for a conversion
result.

* g++.dg/vect/pr102788.cc: New testcase.

(cherry picked from commit eb032893675afea4b01cc6ad06a3e0dcfe9b51cd)

ipa/102762 - fix ICE with invalid __builtin_va_arg_pack () use

We have to be careful to not break the argument space calculation.
If there are not enough arguments, just do not append any.
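
A minimal sketch of the kind of guard this describes. The helper
extra_arg_count and its parameters are hypothetical and do not reproduce
the code in copy_bb; the point is only that an unsigned subtraction must be
checked first, otherwise it wraps around to a very large value.

```cpp
// Hypothetical helper: how many trailing arguments would be appended.
// Both counts are unsigned, so "nargs - named_params" without the
// check would wrap around when nargs is the smaller value.
inline unsigned extra_arg_count(unsigned nargs, unsigned named_params)
{
  if (nargs < named_params)
    return 0;  // not enough arguments: append none
  return nargs - named_params;
}
```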

2021-10-15 Richard Biener <rguenther@suse.de>

PR ipa/102762
* tree-inline.c (copy_bb): Avoid underflowing nargs.

* gcc.dg/torture/pr102762.c: New testcase.

(cherry picked from commit 11a4714860d2df6ba496d55379e7dc702d5fc425)

tree-optimization/102572 - fix gathers with invariant mask

This fixes the vector def gathering for invariant masks which
failed to pass in the desired vector type, resulting in a non-mask
type being generated.

2021-10-12 Richard Biener <rguenther@suse.de>

PR tree-optimization/102572
* tree-vect-stmts.c (vect_build_gather_load_calls): When
gathering the vectorized defs for the mask pass in the
desired mask vector type so invariants will be handled
correctly.

* g++.dg/vect/pr102572.cc: New testcase.

(cherry picked from commit 9f12a45ef147e563f099c24c293830727e8204cc)

tree-optimization/102139 - fix SLP DR base alignment

When doing whole-function SLP we have to make sure the recorded
base alignments, which we compute as the maximum alignment seen for a
base anywhere in the function, are actually valid at the point
we want to make use of them.

To make this work we now record the stmt the alignment was derived
from in addition to the DRs innermost behavior and we use a
dominance check to verify the recorded info is valid when doing
BB vectorization. For this to work for groups inside a BB that are
separated by a call that might not return we now store the DR
analysis group-id permanently and use that for an additional check
when the DRs are in the same BB.

2021-08-31 Richard Biener <rguenther@suse.de>

PR tree-optimization/102139
* tree-vectorizer.h (vec_base_alignments): Adjust hash-map
type to record a std::pair of the stmt-info and the innermost
loop behavior.
(dr_vec_info::group): New member.
* tree-vect-data-refs.c (vect_record_base_alignment): Adjust.
(vect_compute_data_ref_alignment): Verify the recorded
base alignment can be used.
(data_ref_pair): Remove.
(dr_group_sort_cmp): Adjust.
(vect_analyze_data_ref_accesses): Store the group-ID in the
dr_vec_info and operate on a vector of dr_vec_infos.

* gcc.dg/torture/pr102139.c: New testcase.

(cherry picked from commit 153766ec8351d55cfe8bd6d69bdfc0c2cef71e56)

Refactor BB splitting of DRs for SLP group analysis

This uses the group_id computed to ensure DRs in different BBs do
not get merged into a DR group.  To achieve this we seed the
group from the BB index when group_ids are not computed and we
make sure to bump the group_id when advancing to the next BB for
BB SLP analysis.

This paves the way for relaxing the grouping for BB vectorization
by adjusting its group_id computation.

2021-08-20  Richard Biener  <rguenther@suse.de>

* tree-vect-data-refs.c (dr_group_sort_cmp): Do not compare
BBs.
(vect_analyze_data_ref_accesses): Likewise.  Assign the BB
index as group_id when dataref_groups were not computed.
* tree-vect-slp.c (vect_slp_bbs): Bump current_group when
we advance to the next BB.

(cherry picked from commit 37744f8260857005c8409c9e2e633a05c768a7dd)

middle-end/101480 - overloaded global new/delete

The following fixes the issue of ignoring side-effects on memory
from overloaded global new/delete operators by not marking them
as effectively 'const' apart from other explicitly specified
side-effects.

This will cause

FAIL: g++.dg/warn/Warray-bounds-16.C -std=gnu++1? (test for excess errors)

because we now no longer statically see the initialization loop
never executes because the call to operator new can now clobber 'a.m'.
This seems to be an issue with the warning code and/or ranger so
I'm leaving this FAIL to be addressed as followup.

2021-10-11 Richard Biener <rguenther@suse.de>

PR middle-end/101480
* gimple.c (gimple_call_fnspec): Do not mark operator new/delete
as const.

* g++.dg/torture/pr10148.C: New testcase.

(cherry picked from commit 09a0affdb0598a54835ac4bb0dd6b54122c12916)

gcov-profile: Fix -fcompare-debug with -fprofile-generate [PR100520]

PR gcov-profile/100520

gcc/ChangeLog:

* coverage.c (coverage_compute_profile_id): Strip .gk when
compare debug is used.
* system.h (endswith): New function.
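
For illustration, a suffix test of the kind the endswith entry describes
could look like the sketch below. The signature is an assumption, and
strip_gk_suffix is a hypothetical helper that is not taken from coverage.c;
it only shows how such a test can be used to drop a trailing ".gk".

```cpp
#include <cstring>
#include <string>

// Sketch of a suffix test; the signature is an assumption.
inline bool endswith(const char *str, const char *suffix)
{
  std::size_t str_len = std::strlen(str);
  std::size_t suffix_len = std::strlen(suffix);
  if (str_len < suffix_len)
    return false;
  return std::memcmp(str + str_len - suffix_len, suffix, suffix_len) == 0;
}

// Hypothetical helper: remove a trailing ".gk" if present, so that two
// names differing only in that suffix compare equal afterwards.
inline std::string strip_gk_suffix(const std::string &name)
{
  if (endswith(name.c_str(), ".gk"))
    return name.substr(0, name.size() - 3);
  return name;
}
```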

gcc/testsuite/ChangeLog:

* gcc.dg/pr100520.c: New test.

(cherry picked from commit 7553bd35c876efaf8ab0b6661a6102822b99e6e3)

gcc-changelog: sync from master

contrib/ChangeLog:

* gcc-changelog/git_check_commit.py: Sync from master.
* gcc-changelog/git_commit.py: Likewise.
* gcc-changelog/git_email.py: Likewise.
* gcc-changelog/git_update_version.py: Likewise.
* gcc-changelog/test_email.py: Likewise.
* gcc-changelog/test_patches.txt: Likewise.

vect: Don't update inits for simd_lane_access DRs [PR102789]

As PR102789 shows, when the vectorizer peels for alignment in the
prologue, vect_update_inits_of_drs updates the inits of some DRs.
As the failing case shows, we shouldn't update the DR for a
simd_lane_access: it has fixed-length storage sized mainly for the
main loop, so the update can push the access out of bounds and make
it touch an unexpected element.

gcc/ChangeLog:

PR tree-optimization/102789
* tree-vect-loop-manip.c (vect_update_inits_of_drs): Do not
update inits of simd_lane_access.

(cherry picked from commit f3dbd3f36d55178d0a9e4431043cbc950524969a)

Daily bump.

Fortran: error recovery on initializing invalid derived type array component

gcc/fortran/ChangeLog:

PR fortran/102816
* resolve.c (resolve_structure_cons): Reject invalid array spec of
a DT component referenced in a structure constructor.

gcc/testsuite/ChangeLog:

PR fortran/102816
* gfortran.dg/pr102816.f90: New test.

(cherry picked from commit 99af0b2f0fe1c0dc8c6d558157e700326d52816a)

Fortran: validate shape of arrays in constructors against declarations

gcc/fortran/ChangeLog:

PR fortran/102685
* decl.c (match_clist_expr): Set rank/shape of clist initializer
to match LHS.
* resolve.c (resolve_structure_cons): In a structure constructor,
compare shapes of array components against declared shape.

gcc/testsuite/ChangeLog:

PR fortran/102685
* gfortran.dg/derived_constructor_char_1.f90: Fix invalid code.
* gfortran.dg/pr70931.f90: Likewise.
* gfortran.dg/transfer_simplify_2.f90: Likewise.
* gfortran.dg/pr102685.f90: New test.

Co-authored-by: Tobias Burnus <tobias@codesourcery.com>
(cherry picked from commit 1e819bd95ebeefc1dc469daa1855ce005cb77822)

Fortran: error recovery on rank mismatch of array and its initializer

gcc/fortran/ChangeLog:

PR fortran/102715
* decl.c (add_init_expr_to_sym): Reject rank mismatch between
array and its initializer.

gcc/testsuite/ChangeLog:

PR fortran/102715
* gfortran.dg/pr68019.f90: Adjust error message.
* gfortran.dg/pr102715.f90: New test.

(cherry picked from commit df2135e88a8f78c853b35246ad426b01b6d08378)

Fortran: fix simplification of array-valued parameter expressions

gcc/fortran/ChangeLog:

PR fortran/102817
* expr.c (simplify_parameter_variable): Copy shape of referenced
subobject when simplifying.

gcc/testsuite/ChangeLog:

PR fortran/102817
* gfortran.dg/pr102817.f90: New test.

(cherry picked from commit bcf3728abe8488882922005166d3065fc5fdfea1)

Fortran: handle initialization of derived type parameter arrays from scalar

gcc/fortran/ChangeLog:

PR fortran/99348
PR fortran/102521
* decl.c (add_init_expr_to_sym): Extend initialization of
parameter arrays from scalars to handle derived types.

gcc/testsuite/ChangeLog:

PR fortran/99348
PR fortran/102521
* gfortran.dg/parameter_array_init_8.f90: New test.

(cherry picked from commit 74ccca380cde5e79e082d39214b306a90ded0344)

Daily bump.

Support TI mode and soft float on PA64

This change implements TI mode on PA64.  Various new patterns are
added to pa.md.  The libgcc build needed modification to build both
DI and TI routines.  We also need various softfp routines to
convert to and from TImode.
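
TImode is GCC's 128-bit integer mode. As an illustration only (this is not
the pa.md pattern, and U128 and add_u128 are hypothetical names), a 128-bit
addition on a 64-bit machine can be modeled as two 64-bit additions linked
by a carry:

```cpp
#include <cstdint>

// Hypothetical 128-bit value held as two 64-bit words.
struct U128
{
  std::uint64_t lo;
  std::uint64_t hi;
};

// Add the low words, detect the carry from unsigned wraparound,
// then add the high words plus that carry.
inline U128 add_u128(U128 a, U128 b)
{
  U128 r;
  r.lo = a.lo + b.lo;
  std::uint64_t carry = r.lo < a.lo ? 1 : 0;
  r.hi = a.hi + b.hi + carry;
  return r;
}
```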

I added full softfp for the -msoft-float option.  At the moment,
this doesn't completely eliminate all use of the floating-point
co-processor.  For this, libgcc needs to be built with -msoft-mult.
The floating-point exception support also needs a soft option.

2021-11-05  John David Anglin  <danglin@gcc.gnu.org>

PR libgomp/96661

gcc/ChangeLog:

* config/pa/pa-modes.def: Add OImode integer type.
* config/pa/pa.c (pa_scalar_mode_supported_p): Allow TImode
for TARGET_64BIT.
* config/pa/pa.h (MIN_UNITS_PER_WORD): Define to UNITS_PER_WORD
if IN_LIBGCC2.
* config/pa/pa.md (addti3, addvti3, subti3, subvti3, negti2,
negvti2, ashlti3, shrpd_internal): New patterns.
Change some multi instruction types to multi.

libgcc/ChangeLog:

* config.host (hppa*64*-*-linux*): Revise tmake_file.
(hppa*64*-*-hpux11*): Likewise.
* config/pa/sfp-exceptions.c: New.
* config/pa/sfp-machine.h: New.
* config/pa/t-dimode: New.
* config/pa/t-softfp-sfdftf: New.

Speed up jump table switch detection.

PR tree-optimization/100393

gcc/ChangeLog:

* tree-switch-conversion.c (group_cluster::dump): Use
get_comparison_count.
(jump_table_cluster::find_jump_tables): Pre-compute number of
comparisons and then decrement it. Cache also max_ratio.
(jump_table_cluster::can_be_handled): Change signature.
* tree-switch-conversion.h (get_comparison_count): New.

(cherry picked from commit c517cf2e685e2903b591d63c1034ff9726cb3822)

gcc: vx-common.h: fix test for VxWorks7

The macro TARGET_VXWORKS7 is always defined (see vxworks-dummy.h).
Thus we need to test its value, not its definedness.
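
The difference between the two tests can be shown with a stand-in macro.
EXAMPLE_TARGET_FLAG is hypothetical; it plays the role of a macro that is
always defined and whose value is 0 when the feature is off.

```cpp
// Hypothetical stand-in for a macro that is always defined,
// here with the value 0 (feature disabled).
#define EXAMPLE_TARGET_FLAG 0

// Tests definedness: #ifdef only asks whether the macro exists.
inline bool enabled_by_definedness()
{
#ifdef EXAMPLE_TARGET_FLAG
  return true;   // taken even though the value is 0
#else
  return false;
#endif
}

// Tests the value: #if evaluates the macro's expansion.
inline bool enabled_by_value()
{
#if EXAMPLE_TARGET_FLAG
  return true;
#else
  return false;  // taken because the value is 0
#endif
}
```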

Fixes aca124df (define NO_DOT_IN_LABEL only in vxworks6).

gcc/ChangeLog:

* config/vx-common.h: Test value of TARGET_VXWORKS7 rather
than definedness.

(cherry picked from commit 44d0243a247dd1280265c649dab26e9486ffa015)

Daily bump.

x86: Check leal/addl gcc.target/i386/amxtile-3.c for x32

Check leal and addl for x32 to fix:

FAIL: gcc.target/i386/amxtile-3.c scan-assembler addq[ \\t]+\\$12
FAIL: gcc.target/i386/amxtile-3.c scan-assembler leaq[ \\t]+4
FAIL: gcc.target/i386/amxtile-3.c scan-assembler leaq[ \\t]+8

* gcc.target/i386/amxtile-3.c: Check leal/addl for x32.

(cherry picked from commit fbe58ba97aff3270877d7fd5600c17687b85964c)

i386: Fix wrong result for AMX-TILE intrinsic when parsing expression.

_tile_loadd, _tile_stored, _tile_streamloadd intrinsics are defined by
macro, so the parameters should be wrapped by parentheses to accept
expressions.
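
The effect of the missing parentheses can be sketched with two hypothetical
macros. SCALE_UNSAFE and SCALE_SAFE are not the AMX intrinsics; they only
show what happens when an expression is passed as a macro argument.

```cpp
// Hypothetical macros, not the real intrinsics.
#define SCALE_UNSAFE(stride) (stride * 4)
#define SCALE_SAFE(stride) ((stride) * 4)

inline int scaled_unsafe(int a, int b)
{
  // Expands to (a + b * 4): only b is multiplied.
  return SCALE_UNSAFE(a + b);
}

inline int scaled_safe(int a, int b)
{
  // Expands to ((a + b) * 4), which is what the caller meant.
  return SCALE_SAFE(a + b);
}
```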

gcc/ChangeLog:

* config/i386/amxtileintrin.h (_tile_loadd_internal): Add
parentheses to base and stride.
(_tile_stream_loadd_internal): Likewise.
(_tile_stored_internal): Likewise.

gcc/testsuite/ChangeLog:
* gcc.target/i386/amxtile-3.c: New test.

Daily bump.

ranger: Fix `-Werror' build error with `ranger_cache::push_poor_value'

Fix a regression from commit 86534c07a390 ("Disable poor value processing
in ranger cache.") that caused GCC to no longer build when `-Werror'
is enabled:

.../gcc/gimple-range-cache.cc: In member function 'bool ranger_cache::push_poor_value(basic_block, tree)':
.../gcc/gimple-range-cache.cc:850:44: error: unused parameter 'bb' [-Werror=unused-parameter]
  850 | ranger_cache::push_poor_value (basic_block bb, tree name)
      |                                ~~~~~~~~~~~~^~
.../gcc/gimple-range-cache.cc:850:53: error: unused parameter 'name' [-Werror=unused-parameter]
  850 | ranger_cache::push_poor_value (basic_block bb, tree name)
      |                                                ~~~~~^~~~

To keep the change to a minimum, mark the parameters reported as unused.
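
One way to mark parameters as unused is sketched below. This uses the
standard C++17 attribute [[maybe_unused]] rather than whatever the patch
itself uses, and push_value_stub with its parameter types is a hypothetical
stand-in for the function named in the diagnostics.

```cpp
// Hypothetical stub: the body no longer uses its parameters, so each
// one is marked to silence -Wunused-parameter while the signature
// stays unchanged for existing callers.
inline bool push_value_stub([[maybe_unused]] int block_index,
                            [[maybe_unused]] const char *name)
{
  return false;  // processing disabled: nothing is pushed
}
```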

gcc/
* gimple-range-cache.cc (ranger_cache::push_poor_value): Mark
parameters unused.

[PR102842] Consider all outputs in generation of matching reloads

Without considering all output insn operands (not only those processed
before), in rare cases LRA can use the same hard register for
different outputs of the insn on different assignment subpasses. The
patch fixes the problem.

gcc/ChangeLog:

PR rtl-optimization/102842
* lra-constraints.c (match_reload): Ignore out in checking values
of outs.
(curr_insn_transform): Collect outputs before doing reloads of operands.

gcc/testsuite/ChangeLog:

PR rtl-optimization/102842
* g++.target/arm/pr102842.C: New test.

ipa/102714 - IPA SRA eliding volatile

The following fixes the volatileness check of IPA SRA which was
looking at the innermost reference when checking TREE_THIS_VOLATILE
but the reference to check is the outermost one.

2021-10-13 Richard Biener <rguenther@suse.de>

PR ipa/102714
* ipa-sra.c (ptr_parm_has_nonarg_uses): Fix volatileness
check.

* gcc.dg/ipa/pr102714.c: New testcase.

(cherry picked from commit 23cd18c60c8188e3d68eda721cdb739199e85e5b)

Daily bump.

Merge remote-tracking branch 'origin/releases/gcc-11' into devel/omp/gcc-11

Merged up to r11-9199-gfdc2700d09530227a520e8d33623b7956582cffb (Nov 2, 2021)

openmp: Add testcase for threadprivate random access class iterators

This adds a testcase for random access class iterators. The diagnostics
can be different between templates and non-templates, as for some
threadprivate vars finish_id_expression replaces them with call to their
corresponding wrapper, but I think it is not that big a deal, we reject
it in either case.

2021-11-02 Jakub Jelinek <jakub@redhat.com>

* g++.dg/gomp/loop-8.C: New test.

(cherry picked from commit fb7fee84813b23487baf0c1094860251229ab5dd)

openmp: Diagnose threadprivate OpenMP loop iterators

We weren't diagnosing the
  "The loop iteration variable may not appear in a threadprivate directive."
restriction, which in 5.0 was just among the Worksharing-Loop
restrictions but in 5.1 is among the Canonical Loop Nest Form restrictions.

This patch diagnoses those.

2021-10-30 Jakub Jelinek <jakub@redhat.com>

* gimplify.c (gimplify_omp_for): Diagnose threadprivate iterators.

* c-c++-common/gomp/loop-10.c: New test.

(cherry picked from commit 6f449bb93b33d63fa8a1b8d021d8d36f27ffe054)

Daily bump.

libstdc++: Fix range access for empty std::valarray [PR103022]

The std::begin and std::end overloads for std::valarray are defined in
terms of std::addressof(v[0]) which is undefined for an empty valarray.
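
A sketch of the idea behind the fix, under the assumption that an empty
valarray should simply give an empty range. The helpers first_element,
past_last_element and sum_ints are hypothetical and are not the libstdc++
implementation; they only avoid evaluating v[0] when there are no elements.

```cpp
#include <cstddef>
#include <memory>
#include <valarray>

// Hypothetical helper: pointer to the first element, or nullptr when
// the valarray is empty, so v[0] is never evaluated on an empty one.
template <typename T>
const T *first_element(const std::valarray<T> &v)
{
  return v.size() != 0 ? std::addressof(v[0]) : nullptr;
}

// One-past-the-end pointer built from the same guarded start.
template <typename T>
const T *past_last_element(const std::valarray<T> &v)
{
  const T *first = first_element(v);
  return first ? first + v.size() : nullptr;
}

// Sum over the pointer range; an empty valarray gives an empty range.
inline int sum_ints(const std::valarray<int> &v)
{
  int total = 0;
  for (const int *p = first_element(v); p != past_last_element(v); ++p)
    total += *p;
  return total;
}
```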

libstdc++-v3/ChangeLog:

PR libstdc++/103022
* include/std/valarray (begin, end): Do not dereference an empty
valarray. Add noexcept and [[nodiscard]].
* testsuite/26_numerics/valarray/range_access.cc: Check empty
valarray. Check iterator properties. Run as well as compiling.
* testsuite/26_numerics/valarray/range_access2.cc: Likewise.
* testsuite/26_numerics/valarray/103022.cc: New test.

(cherry picked from commit 91bac9fed5d082f0b180834110ebc0f46f97599a)

Daily bump.

Update documentation of %X spec

%X
Output the accumulated linker options specified by -Wl or a ‘%x’ spec string

The part about -Wl has been obsolete for 27 years, since this change:

Author: Torbjorn Granlund <tege@gnu.org>
Date:   Thu Oct 27 18:04:25 1994 +0000

    (process_command): Handle -Wl, and -Xlinker similar to -l,

    i.e., preserve their order with respect to linker input files.

Technically speaking, the arguments of -l, -Wl and -Xlinker are input files.

gcc/
* doc/invoke.texi (%X): Remove obsolete reference to -Wl.

Daily bump.

Fortran: do not restrict PDT KIND and LEN type parameters to default integer

gcc/fortran/ChangeLog:

PR fortran/102917
* decl.c (match_attr_spec): Remove invalid integer kind checks on
KIND and LEN attributes of PDTs.

gcc/testsuite/ChangeLog:

PR fortran/102917
* gfortran.dg/pdt_4.f03: Adjust testcase.

(cherry picked from commit cfcb27cfcb1d32b8cf7bc463cc1fc5cacae8d199)

Fix warnings building linux-atomic.c and fptr.c on hppa64-linux

The file fptr.c is specific to 32-bit hppa-linux and should not be
included in LIB2ADD on hppa64-linux.

There is a builtin type mismatch in linux-atomic.c using the type
long long unsigned int for 64-bit atomic operations on hppa64-linux.

2021-10-27 John David Anglin <danglin@gcc.gnu.org>

libgcc/ChangeLog:

* config.host (hppa*64*-*-linux*): Don't add pa/t-linux to
tmake_file.
* config/pa/linux-atomic.c: Define u8, u16 and u64 types.
Use them in FETCH_AND_OP_2, OP_AND_FETCH_2, COMPARE_AND_SWAP_2,
SYNC_LOCK_TEST_AND_SET_2 and SYNC_LOCK_RELEASE_1 macros.
* config/pa/t-linux64 (LIB1ASMSRC): New define.
(LIB1ASMFUNCS): Revise.
(HOST_LIBGCC2_CFLAGS): Add "-DLINUX=1".

sra: Fix corner case of total scalarization with virtual inheritance (PR 102505)

PR 102505 is a situation where SRA takes its initial top-level
access size from a get_ref_base_and_extent called on a COMPONENT_REF,
and thus derived from the FIELD_DECL, which however does not include a
virtual base.  Total scalarization then goes on traversing the type,
which however has a virtual base past the non-virtual bits, tricking SRA
into creating sub-accesses outside of the supposedly encompassing
accesses, which in turn triggers the verifier within the pass.

The patch below fixes that by failing total scalarization when this
situation is detected.

This backport also has commit f217e87972a2a207e793101fc05cfc9dd095c678
squashed into it in order to avoid PR 102886 that the fix introduced
on trunk.

gcc/ChangeLog:

2021-10-20  Martin Jambor  <mjambor@suse.cz>

PR tree-optimization/102505
* tree-sra.c (totally_scalarize_subtree): Check that the
encountered field fits within the access we would like to put it
in.

gcc/testsuite/ChangeLog:

2021-10-20  Martin Jambor  <mjambor@suse.cz>

PR tree-optimization/102505
* g++.dg/torture/pr102505.C: New test.

(cherry picked from commit 701ee067807b80957c65bd7ff94b6099a27181de)

Merge remote-tracking branch 'origin/releases/gcc-11' into devel/omp/gcc-11

Merge up to r11-9188-g2563fba71d0b67cb69b2cd2da649a50a70f2a02f (Oct 27, 2021)

Fortran: Fix 'select rank' for allocatables/pointers

gcc/fortran/ChangeLog:

* trans-stmt.c (gfc_trans_select_rank_cases): Fix condition
for allocatables/pointers.

gcc/testsuite/ChangeLog:

* gfortran.dg/PR93963.f90: Extend testcase by scan-tree-dump test.

(cherry picked from commit 7f899b23f36f94f907a025d3eeaf3e4640544927)

openmp: Document that non-rect loops are not supported in Fortran yet

I've found we claim to support non-rectangular loops, but don't actually
support those in Fortran, as can be seen on:
  integer i, j
  !$omp parallel do collapse(2)
  do i = 0, 10
    do j = 0, i
    end do
  end do
end
To support this, the Fortran FE needs to allow the valid forms of
non-rectangular loops and disallow others, so mainly it needs its
own version of c-omp.c c_omp_check_loop_iv etc., plus for non-rectangular
lb or ub expressions it must emit a TREE_VEC instead of a normal expression
as the C/C++ FEs do, plus testsuite coverage.

2021-10-27  Jakub Jelinek  <jakub@redhat.com>

* libgomp.texi (OpenMP 5.0): Mention that Non-rectangular loop nests
aren't implemented for Fortran yet.

(cherry picked from commit eef811490646a68c9893968a421b351e7bf704e1)

openmp: Allow non-rectangular loops with pointer iterators

This patch handles pointer iterators for non-rectangular loops. They are
more limited than integral iterators of non-rectangular loops, in particular
only var-outer, var-outer + a2, a2 + var-outer or var-outer - a2 can appear
in lb or ub where a2 is some integral loop invariant expression, so no e.g.
multiplication etc.

2021-10-27 Jakub Jelinek <jakub@redhat.com>

gcc/
* omp-expand.c (expand_omp_for_init_counts): Handle non-rectangular
iterators with pointer types.
(expand_omp_for_init_vars, extract_omp_for_update_vars): Likewise.
gcc/c-family/
* c-omp.c (c_omp_check_loop_iv_r): Don't clear 3rd bit for
POINTER_PLUS_EXPR.
(c_omp_check_nonrect_loop_iv): Handle POINTER_PLUS_EXPR.
(c_omp_check_loop_iv): Set kind even if the iterator is non-integral.
gcc/testsuite/
* c-c++-common/gomp/loop-8.c: New test.
* c-c++-common/gomp/loop-9.c: New test.
libgomp/
* testsuite/libgomp.c/loop-26.c: New test.
* testsuite/libgomp.c/loop-27.c: New test.

(cherry picked from commit 2084b5f42a4432da8b0625f9c669bf690ec46468)

openmp: Don't reject some valid initializers or conditions of non-rectangular loops [PR102854]

In C++, if an iterator has or might have (e.g. dependent type) class type we
remember the original init expressions and check those separately for presence
of iterators, because for class iterators we turn those into expressions that
always do contain reference to the current iterator.  But this resulted in
rejecting valid non-rectangular loops where the dependent type is later instantiated
to an integral type.

Non-rectangular loops with class random access iterators remain broken, that is something
to be fixed incrementally.

2021-10-27  Jakub Jelinek  <jakub@redhat.com>

PR c++/102854
gcc/c-family/
* c-common.h (c_omp_check_loop_iv_exprs): Add enum tree_code argument.
* c-omp.c (c_omp_check_loop_iv_r): For trees other than decls,
TREE_VEC, PLUS_EXPR, MINUS_EXPR, MULT_EXPR, POINTER_PLUS_EXPR or
conversions temporarily clear the 3rd bit from d->kind while walking
subtrees.
(c_omp_check_loop_iv_exprs): Add CODE argument.  Or in 4 into data.kind
if possibly non-rectangular.
gcc/cp/
* semantics.c (handle_omp_for_class_iterator,
finish_omp_for): Adjust c_omp_check_loop_iv_exprs caller.
gcc/testsuite/
* g++.dg/gomp/loop-3.C: Don't expect some errors.
* g++.dg/gomp/loop-7.C: New test.

(cherry picked from commit 6b0f35299bd1468ebc13b900a73b7cac6181a2aa)

Daily bump.

gcc/configure: Check for powerpc64le*-*-freebsd*

Only powerpc64-unknown-freebsd was checked for.

Signed-off-by: Piotr Kubaj <pkubaj@FreeBSD.org>
gcc/
* configure.ac: Treat powerpc64*-*-freebsd* the same as
powerpc64-*-freebsd*.
* configure: Regenerate.

(cherry picked from commit a9ef07fe5899fc5998395cdbf96e00af372cfb0b)

Fortran: Fix character(len=cst) dummies with bind(C) [PR102885]

PR fortran/102885

gcc/fortran/ChangeLog:

* trans-decl.c (gfc_conv_cfi_to_gfc): Properly handle nonconstant
character lengths.

gcc/testsuite/ChangeLog:

* gfortran.dg/lto/bind-c-char_0.f90: New test.

(cherry picked from commit a31a3d0421f0cf1f7eefacfec8cbf37e7f91600d)

Daily bump.

Revise -mdisable-fpregs option and add new -msoft-mult option

The behavior of the -mdisable-fpregs option is confusing in that it
doesn't disable the use of the floating-point registers in all situations.
The -msoft-float option disables the use of the floating-point registers in
all situations.  The Linux kernel only needs to disable use of the
xmpyu instruction to avoid using the floating-point registers.

This change revises the -mdisable-fpregs option to disable the use of
the floating-point registers in all situations.  It is now equivalent
to the -msoft-float option.  A new -msoft-mult option is added to
disable use of the xmpyu instruction.  The libgcc library can be
compiled with the -msoft-mult option to avoid using hardware integer
multiplication.

2021-10-24  John David Anglin  <danglin@gcc.gnu.org>

gcc/ChangeLog:

* config/pa/pa-d.c (pa_d_handle_target_float_abi): Don't check
TARGET_DISABLE_FPREGS.
* config/pa/pa.c (fix_range): Use MASK_SOFT_FLOAT instead of
MASK_DISABLE_FPREGS.
(hppa_rtx_costs): Don't check TARGET_DISABLE_FPREGS.  Adjust
cost of hardware integer multiplication.
(pa_conditional_register_usage): Don't check TARGET_DISABLE_FPREGS.
* config/pa/pa.h (INT14_OK_STRICT): Likewise.
* config/pa/pa.md: Don't check TARGET_DISABLE_FPREGS. Check
TARGET_SOFT_FLOAT in patterns that use xmpyu instruction.
* config/pa/pa.opt (mdisable-fpregs): Change target mask to
SOFT_FLOAT.  Revise comment.
(msoft-mult): New option.

Don't use 'G' constraint in integer move patterns

The 'G' constraint only matches a float zero.

2021-10-24 John David Anglin <danglin@gcc.gnu.org>

gcc/ChangeLog:

* config/pa/pa.md: Don't use 'G' constraint in integer move patterns.

Daily bump.

x86: Document -fcf-protection requires i686 or newer

PR target/98667
* doc/invoke.texi: Document -fcf-protection requires i686 or
newer.

(cherry picked from commit 1373066a46d8d47abd97e46a005aef3b3dbfe94a)

testsuite: Fix up gfortran.dg/gomp/strictly*.f90 testcases

While these testcases are dg-do compile only, I think it is better not to
give users bad examples and avoid unnecessary data races in testcases (unless
it is exactly what we want to test). Perhaps one day we'll do some analysis
and warn about data races...

2021-10-21 Jakub Jelinek <jakub@redhat.com>

* gfortran.dg/gomp/strictly-structured-block-1.f90: Use call do_work
instead of x = x + 1 in places where the latter could be a data race.
* gfortran.dg/gomp/strictly-structured-block-2.f90: Likewise.
* gfortran.dg/gomp/strictly-structured-block-3.f90: Likewise.

(cherry picked from commit e633f82fb71817f3232688869c1eb59f60eb78ca)

openmp: Fortran strictly-structured blocks support

This implements strictly-structured blocks support for Fortran, as specified in
OpenMP 5.2. This now allows using a Fortran BLOCK construct as the body of most
OpenMP constructs, with a "!$omp end ..." ending directive optional for that
form.

gcc/fortran/ChangeLog:

* decl.c (gfc_match_end): Add COMP_OMP_STRICTLY_STRUCTURED_BLOCK case
together with COMP_BLOCK.
* parse.c (parse_omp_structured_block): Change return type to
'gfc_statement', add handling for strictly-structured block case, adjust
recursive calls to parse_omp_structured_block.
(parse_executable): Adjust calls to parse_omp_structured_block.
* parse.h (enum gfc_compile_state): Add
COMP_OMP_STRICTLY_STRUCTURED_BLOCK.
* trans-openmp.c (gfc_trans_omp_workshare): Add EXEC_BLOCK case
handling.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/cancel-1.f90: Adjust testcase.
* gfortran.dg/gomp/nesting-3.f90: Adjust testcase.
* gfortran.dg/gomp/strictly-structured-block-1.f90: New test.
* gfortran.dg/gomp/strictly-structured-block-2.f90: New test.
* gfortran.dg/gomp/strictly-structured-block-3.f90: New test.

libgomp/ChangeLog:

* libgomp.texi (Support of strictly structured blocks in Fortran):
Adjust to 'Y'.
* testsuite/libgomp.fortran/task-reduction-16.f90: Adjust testcase.

(cherry picked from commit 2e4659199e814b7ee0f6bd925fd2c0a7610da856)

testsuite/libgomp.oacc-fortran/: Add -Wopenacc-parallelism

The following testcases expect the -Wopenacc-parallelism warning output
but failed because they were not compiled with that option; the fix is to add it.

* testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90: Compile
with -Wopenacc-parallelism.
* testsuite/libgomp.oacc-fortran/declare-allocatable-3.f90: Likewise.
* testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90: Likewise.

Fortran: Fixes and additional tests for shape/ubound/size [PR94070]

This patch reimplements the SHAPE intrinsic to be inlined similarly to
LBOUND and UBOUND, instead of as a library call, to avoid an
unnecessary array copy. Various bugs are also fixed.

gcc/fortran/
PR fortran/94070

* expr.c (gfc_simplify_expr): Handle GFC_ISYM_SHAPE along with
GFC_ISYM_LBOUND and GFC_ISYM_UBOUND.
* trans-array.c (gfc_conv_ss_startstride): Likewise.
(set_loop_bounds): Likewise.
* trans-intrinsic.c (gfc_trans_intrinsic_bound): Extend to
handle SHAPE. Correct logic for zero-size special cases and
detecting assumed-rank arrays associated with an assumed-size
argument.
(gfc_conv_intrinsic_shape): Deleted.
(gfc_conv_intrinsic_function): Handle GFC_ISYM_SHAPE like
GFC_ISYM_LBOUND and GFC_ISYM_UBOUND.
(gfc_add_intrinsic_ss_code): Likewise.
(gfc_walk_intrinsic_bound): Likewise.

gcc/testsuite/
PR fortran/94070

* gfortran.dg/c-interop/shape-bindc.f90: New test.
* gfortran.dg/c-interop/shape-poly.f90: New test.
* gfortran.dg/c-interop/size-bindc.f90: New test.
* gfortran.dg/c-interop/size-poly.f90: New test.
* gfortran.dg/c-interop/ubound-bindc.f90: New test.
* gfortran.dg/c-interop/ubound-poly.f90: New test.

(cherry picked from commit 1af78e731feb9327a17c99ebaa19a4cca1125caf)

Daily bump.

openmp: in_reduction support for Fortran

This patch implements support for the in_reduction clause for Fortran.
It also includes more completion of the taskgroup construct inside the
Fortran front-end, thus allowing task_reduction to work for task and
target constructs.

gcc/fortran/ChangeLog:

* openmp.c (gfc_match_omp_clause_reduction): Add 'openmp_target' default
false parameter. Add 'always,tofrom' map for OMP_LIST_IN_REDUCTION case.
(gfc_match_omp_clauses): Add 'openmp_target' default false parameter,
adjust call to gfc_match_omp_clause_reduction.
(match_omp): Adjust call to gfc_match_omp_clauses.
* trans-openmp.c (gfc_trans_omp_taskgroup): Add call to
gfc_match_omp_clause, create and return block.

gcc/ChangeLog:

* omp-low.c (omp_copy_decl_2): For !ctx, use record_vars to add new copy
as local variable.
(scan_sharing_clauses): Place copy of OMP_CLAUSE_IN_REDUCTION decl in
ctx->outer instead of ctx.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/reduction4.f90: Adjust 'omp target in_reduction' scan
pattern.

libgomp/ChangeLog:

* testsuite/libgomp.fortran/target-in-reduction-1.f90: New test.
* testsuite/libgomp.fortran/target-in-reduction-2.f90: New test.

(cherry picked from commit d98626bf451dea6a28a42d953f7d0bd7659ad4d5)

(This merges the review comments, taken care of in the mainline commit,
referenced above. For OG11, the heavy lifting was already done in
commit 07a380a8a024fbcc61c0098400da9a382b9a7010 )

Avoid exception propagation during bootstrap

This addresses PR ada/100486, which is the bootstrap failure of GCC 11 for
32-bit Windows in the MSYS setup. The PR shows that we cannot rely on
exception propagation being operational during the bootstrap, at least on
the 11 branch, so fix this by removing the problematic raise statement.

gcc/ada/
PR ada/100486
* sem_prag.adb (Check_Valid_Library_Unit_Pragma): Do not raise an
exception as part of the bootstrap.

openmp: Fix up struct gomp_work_share handling [PR102838]

If GOMP_HAVE_EFFICIENT_ALIGNED_ALLOC is not defined, the intent was to
treat the split of the structure between first cacheline (64 bytes)
as mostly write-once, use afterwards and second cacheline as rw just
as an optimization. But as has been reported, with vectorization enabled
at -O2 it can now result in aligned vector 16-byte or larger stores.
When not having posix_memalign/aligned_alloc/memalign or other similar API,
alloc.c emulates it but it needs to allocate extra memory for the dynamic
realignment.
So, for the GOMP_HAVE_EFFICIENT_ALIGNED_ALLOC not defined case, this patch
stops using aligned (64) attribute in the middle of the structure and instead
inserts padding that puts the second half of the structure at offset 64 bytes.

And when GOMP_HAVE_EFFICIENT_ALIGNED_ALLOC is defined, usually it was allocated
as aligned, but for the orphaned case it could still be allocated just with
gomp_malloc without guaranteed proper alignment.
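
As an illustration only (this is not the libgomp code), the padding approach
described above could be sketched as follows. The struct and member names are
hypothetical stand-ins; only the idea of padding the first cacheline out to 64
bytes and verifying the offset with a negative-array-size "static assertion"
is taken from the description.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the mostly write-once first cacheline.  */
struct first_cacheline
{
  long chunk_size;
  long end;
  long incr;
  int mode;
};

/* Hypothetical work-share structure.  Explicit padding, rather than an
   aligned (64) attribute on `lock', puts the read-write half at offset 64,
   so the compiler may not assume the whole object is 64-byte aligned.  */
struct work_share
{
  struct first_cacheline first;
  char pad[64 - sizeof (struct first_cacheline)];
  int lock;
  unsigned threads_completed;
};

/* Poor man's static assertions: the array size becomes -1, and the
   declaration fails to compile, if the condition is false.  */
extern char work_share_check1
  [sizeof (struct first_cacheline) < 64 ? 1 : -1];
extern char work_share_check2
  [offsetof (struct work_share, lock) == 64 ? 1 : -1];
```

The extern array declarations never need a definition, because nothing
references them; they exist only so that a wrong layout is rejected at
compile time.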

2021-10-20 Jakub Jelinek <jakub@redhat.com>

PR libgomp/102838
* libgomp.h (struct gomp_work_share_1st_cacheline): New type.
(struct gomp_work_share): Only use aligned(64) attribute if
GOMP_HAVE_EFFICIENT_ALIGNED_ALLOC is defined, otherwise just
add padding before lock to ensure lock is at offset 64 bytes
into the structure.
(gomp_workshare_struct_check1, gomp_workshare_struct_check2):
New poor man's static assertions.
* work.c (gomp_work_share_start): Use gomp_aligned_alloc instead of
gomp_malloc if GOMP_HAVE_EFFICIENT_ALIGNED_ALLOC.

(cherry picked from commit c7abdf46fb7ac9a0c37f120feff3fcc3a752584f)

gfortran.dg/bind-c-contiguous-5.c: Big-endian fix

gcc/testsuite/

PR fortran/102815
* gfortran.dg/bind-c-contiguous-5.c (do_call, reset_var): Handle
big endian.

(cherry picked from commit d4044db034b40c275b5f287d5854a102d22e07c0)

c++: Fix up push_local_extern_decl_alias error recovery [PR102642]

My recent push_local_extern_decl_alias change broke error-recovery,
do_pushdecl can return error_mark_node and set_decl_tls_model can't be
called on that. There are other code paths that store error_mark_node
into DECL_LOCAL_DECL_ALIAS, with the intent to differentiate the cases
where we haven't yet tried to push it into the namespace scope (NULL)
and one where we have tried it but it failed (error_mark_node), but looking
around, there are other spots where we call functions or do processing
which doesn't tolerate error_mark_node.

So, the first hunk with the testcase fixes the testcase, the others
fix what I've spotted where the fix was easy to figure out (there are I think
3 other spots, mainly for function multiversioning).
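
A minimal sketch of the three-state convention described above, assuming a
much simplified declaration node. The type, the error_mark object and
propagate_tls_model are hypothetical and are not the C++ front-end API; the
point is only that both NULL and the error marker must be filtered out before
the alias is touched.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, much simplified declaration node.  */
struct decl
{
  const char *name;
  struct decl *local_alias;   /* NULL: not tried yet; &error_mark: failed.  */
  int tls_model;
};

/* Hypothetical stand-in for error_mark_node.  */
static struct decl error_mark = { "<error>", NULL, 0 };

/* Copy the TLS model to the namespace-scope alias, but only if there is
   a usable one.  Returns 1 if the alias was updated, 0 otherwise.  */
static int
propagate_tls_model (struct decl *d)
{
  struct decl *alias = d->local_alias;
  if (alias == NULL || alias == &error_mark)
    return 0;
  alias->tls_model = d->tls_model;
  return 1;
}
```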

2021-10-20 Jakub Jelinek <jakub@redhat.com>

PR c++/102642
* name-lookup.c (push_local_extern_decl_alias): Don't call
set_decl_tls_model on error_mark_node.
* decl.c (make_rtl_for_nonlocal_decl): Don't call
set_user_assembler_name on error_mark_node.
* parser.c (cp_parser_oacc_declare): Ignore DECL_LOCAL_DECL_ALIAS
if it is error_mark_node.
(cp_parser_omp_declare_target): Likewise.

* g++.dg/tls/pr102642.C: New test.

(cherry picked from commit 424945258d1778617b5d3d5273f6e1c10e718f80)

Daily bump.

libstdc++: Fix doxygen generation to work with relative paths

In r12-826 I tried to remove some redundant steps from the doxygen
build, but they are needed when configure is run as a relative path. The
use of pwd is to resolve the relative path to an absolute one.

libstdc++-v3/ChangeLog:

* doc/Makefile.am (stamp-html-doxygen, stamp-html-doxygen)
(stamp-latex-doxygen, stamp-man-doxygen): Fix recipes for
relative ${top_srcdir}.
* doc/Makefile.in: Regenerate.

(cherry picked from commit 04d392e8430ca66a3f12b7db4f3cb84788269a48)

Fortran: Fix "str" to scalar descriptor conversion [PR92482]

PR fortran/92482
gcc/fortran/ChangeLog:

* trans-expr.c (gfc_conv_procedure_call): Use TREE_OPERAND not
build_fold_indirect_ref_loc to undo an ADDR_EXPR.

gcc/testsuite/ChangeLog:

* gfortran.dg/bind-c-char-descr.f90: Remove xfail; extend a bit.

(cherry picked from commit 6920d5a1a2834e9c62d441b8f4c6186b01107d13)

Merge remote-tracking branch 'origin/releases/gcc-11' into devel/omp/gcc-11

Merge up to r11-9173-g3de82c6c36fea8fad7145b62ccb3a7c06d1f3c51 (Oct 19, 2021)

Fortran: Fix CLASS conversion check [PR102745]

PR fortran/102745
gcc/fortran/ChangeLog
* intrinsic.c (gfc_convert_type_warn): Fix checks by checking CLASS
and do typecheck in correct order for type extension.
* misc.c (gfc_typename): Print the proper, not the internal, CLASS type name.

gcc/testsuite/ChangeLog
* gfortran.dg/class_72.f90: New.

(cherry picked from commit 017665f63047ce47b087b0b283548a60e5abf3d2)

Merge remote-tracking branch 'origin/releases/gcc-11' into devel/omp/gcc-11

Merge up to r11-9172-g164044da342dc5f21132b2a782115f3fd70c29c5 (Oct 19, 2021).

openmp: Add additional tests for declare variant in Fortran

Add tests to check that explicitly specifying the containing procedure as the
base name for declare variant works.

2021-10-18 Kwok Cheung Yeung <kcy@codesourcery.com>

gcc/testsuite/

* gfortran.dg/gomp/declare-variant-15.f90 (variant2, base2, test2):
Add tests.
* gfortran.dg/gomp/declare-variant-16.f90 (base2, variant2, test2):
Add tests.

(cherry picked from commit 38733234024697d2144613c4a992e970f40afad8)

openmp: Fix handling of numa_domains(1)

If numa-domains is used with num-places count, sometimes the function
could create more places than requested and crash.  This depended on the
content of /sys/devices/system/node/online file, e.g. if the file
contains
0-1,16-17
and all NUMA nodes contain at least one CPU in the cpuset of the program,
then numa_domains(2) or numa_domains(4) (or 5+) work fine while
numa_domains(1) or numa_domains(3) misbehave.  I.e. the function was able
to stop after reaching the limit at the ',' separators (or trivially at the
end), but not within the ranges.
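
The failure mode can be sketched in isolation. collect_nodes below is a
hypothetical helper, not the libgomp function; it assumes the node list has
the "first-last,first-last" form shown above and simply records node numbers
instead of building places. The relevant part is the second condition in the
inner loop.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: walk a node list such as "0-1,16-17" and record at
   most COUNT node numbers in OUT.  Returns how many were recorded.  */
static unsigned long
collect_nodes (const char *list, unsigned long count, unsigned long *out)
{
  unsigned long len = 0;
  const char *p = list;

  while (*p != '\0' && len < count)
    {
      char *end;
      unsigned long nfirst = strtoul (p, &end, 10);
      unsigned long nlast = nfirst;
      if (end == p)
        break;                  /* Not a number: stop parsing.  */
      p = end;
      if (*p == '-')
        {
          nlast = strtoul (p + 1, &end, 10);
          if (end == p + 1)
            break;              /* Missing upper bound: stop parsing.  */
          p = end;
        }
      /* The limit has to be checked inside the range as well; testing it
         only at the ',' separators lets a range overshoot COUNT.  */
      for (; nfirst <= nlast && len < count; nfirst++)
        out[len++] = nfirst;
      if (*p == ',')
        p++;
    }
  return len;
}
```

With the list "0-1,16-17", a count of 1 or 3 ends in the middle of a range,
which is exactly the case the extra loop condition covers.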

2021-10-18  Jakub Jelinek  <jakub@redhat.com>

* config/linux/affinity.c (gomp_affinity_init_numa_domains): Add
&& gomp_places_list_len < count after nfirst <= nlast loop condition.

(cherry picked from commit 3adcf7e104284b4867996b08f37ece50056ee8f6)

Daily bump.

i386: Fix ICE in ix86_print_operand_address [PR 102761]

2021-10-18 Uroš Bizjak <ubizjak@gmail.com>

PR target/102761

gcc/ChangeLog:

* config/i386/i386.c (ix86_print_operand_address):
Error out for non-address_operand asm operands.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr102761.c: New test.

[PR/target 100316] Allow constant address for __builtin___clear_cache.

__builtin___clear_cache used to accept a constant address as its
argument, but it seems that it no longer does, and it does not even
accept a constant address held in a variable when optimization is
enabled:

```
void foo3(){
  void *yy = (void*)0x1000;
  __builtin___clear_cache(yy, yy);
}
```

So this patch makes BEGIN and END accept VOIDmode, as cselib_lookup_mem does, per
Jim Wilson's suggestion.

```
static cselib_val *
cselib_lookup_mem (rtx x, int create)
{
  ...
  addr_mode = GET_MODE (XEXP (x, 0));
  if (addr_mode == VOIDmode)
    addr_mode = Pmode;
```

Changes v2 -> v3:
- Use gcc_assert rather than error; maybe_emit_call_builtin___clear_cache is
for internal use only, and the type is already checked elsewhere.

Changes v1 -> v2:
- Check for CONST_INT instead of checking the mode. No new testcase, since
  a constant value of another type, such as CONST_DOUBLE, will be caught by
  the front end.
e.g.
Code:
```c
void foo(){
  __builtin___clear_cache(1.11, 0);
}
```
Error message:
```
clearcache-double.c: In function 'foo':
clearcache-double.c:2:27: error: incompatible type for argument 1 of '__builtin___clear_cache'
    2 |   __builtin___clear_cache(1.11, 0);
      |                           ^~~~
      |                           |
      |                           double
clearcache-double.c:2:27: note: expected 'void *' but argument is of type 'double'
```

gcc/ChangeLog:

PR target/100316
* builtins.c (maybe_emit_call_builtin___clear_cache): Allow
CONST_INT for BEGIN and END, and use gcc_assert rather than
error.

gcc/testsuite/ChangeLog:

PR target/100316
* gcc.c-torture/compile/pr100316.c: New.

(cherry picked from commit 4e5bc4e4506a7ae7bb88fc925a425652a1da6b2d)

openmp: Fix up handling of OMP_PLACES=threads(1)

When writing the places-*.c tests, I've noticed that we mishandle the threads
abstract name with a specified num-places if num-places isn't a multiple of
the number of hw threads in a core. It then happily ignores the maximum count
and, for the remaining hw threads in a core, overwrites further places that
haven't been allocated.

2021-10-15 Jakub Jelinek <jakub@redhat.com>

* config/linux/affinity.c (gomp_affinity_init_level_1): For level 1
after creating count places clean up and return immediately.
* testsuite/libgomp.c/places-6.c: New test.
* testsuite/libgomp.c/places-7.c: New test.
* testsuite/libgomp.c/places-8.c: New test.

(cherry picked from commit 4764049dd620affcd3e2658dc7f03a6616370a29)

Fix merge of: amdgcn: fix up offload debug linking with LLVM 13

For some odd reason (probably wrong merge conflict resolution),
one of the changes to config/gcn/mkoffload.c of
commit r11-9168-gcc84160c5f470b23b7aed4633f887df113b2675d
disappeared when merging origin/releases/gcc-11 into OG11.
Thus, I (re)applied it manually:

gcc/
* config/gcn/mkoffload.c (main): Just let the attribute flags
pass through.

Merge remote-tracking branch 'origin/releases/gcc-11' into devel/omp/gcc-11

Merge up to r11-9168-gcc84160c5f470b23b7aed4633f887df113b2675d (Oct 18, 2021).

amdgcn: fix up offload debug linking with LLVM 13

Between LLVM 9 and LLVM 13 the attribute works differently in several ways,
and this needs to be allowed for in GCC and mkoffload independently.

This patch fixes up mkoffload when debug info is enabled, which is made more
complicated because the configure test checks whether the attribute option
is accepted silently, but does not check if the assembler actually sets the
ELF flags for that attribute, and mkoffload needs to mimic that behaviour
exactly. The patch therefore removes some of the conditionals.

gcc/ChangeLog:

* config/gcn/gcn-hsa.h (S_FIJI): Set unconditionally.
(S_900): Likewise.
(S_906): Likewise.
* config/gcn/gcn.c: Hard code SRAM ECC settings for old architectures.
* config/gcn/mkoffload.c (ELFABIVERSION_AMDGPU_HSA): Rename to ...
(ELFABIVERSION_AMDGPU_HSA_V3): ... this.
(ELFABIVERSION_AMDGPU_HSA_V4): New.
(SET_SRAM_ECC_UNSUPPORTED): New.
(copy_early_debug_info): Create elf flags to match the other objects.
(main): Just let the attribute flags pass through.

(cherry picked from commit f3d64372d777d7d6068df8167b6751c289963e85)

amdgcn: Fix assembler version incompatibility

This is another case of the global_load instruction format changing in LLVM
(because they fixed a bug). The configure test is already in place to detect
what is needed.

gcc/ChangeLog:

* config/gcn/gcn-valu.md (gather<mode>_insn_2offsets<exec>): Apply
HAVE_GCN_ASM_GLOBAL_LOAD_FIXED.
(scatter<mode>_insn_2offsets<exec_scatter>): Likewise.

(cherry picked from commit 81c362c7c2bccd72d798bf7ea6c74d4b1cc3931f)

amdgcn: Implement -msram-ecc=any

The option was already there, but just an alias for -msram-ecc=on. Now that
LLVM13 supports HSACOv4 and the new ELF flags I can implement the option
properly.

The "any" option is the default in order to ensure that library files work
whichever way the user wants, which means we won't need multilibs to support
the different SRAM ECC hardware configurations.

gcc/ChangeLog:

* config/gcn/gcn-hsa.h (SRAMOPT): Include the whole option string.
Adjust for new -msram-ecc=any behaviour.
(ASM_SPEC): Adjust -mxnack and -msram-ecc usage.
* config/gcn/gcn.c (output_file_start): Implement -msram-ecc=any.
* config/gcn/mkoffload.c (EF_AMDGPU_XNACK): Rename to ...
(EF_AMDGPU_XNACK_V3): ... this.
(EF_AMDGPU_SRAM_ECC): Rename to ...
(EF_AMDGPU_SRAM_ECC_V3): ... this.
(EF_AMDGPU_FEATURE_XNACK_V4): New.
(EF_AMDGPU_FEATURE_XNACK_UNSUPPORTED_V4): New.
(EF_AMDGPU_FEATURE_XNACK_ANY_V4): New.
(EF_AMDGPU_FEATURE_XNACK_OFF_V4): New.
(EF_AMDGPU_FEATURE_XNACK_ON_V4): New.
(EF_AMDGPU_FEATURE_SRAMECC_V4): New.
(EF_AMDGPU_FEATURE_SRAMECC_UNSUPPORTED_V4): New.
(EF_AMDGPU_FEATURE_SRAMECC_ANY_V4): New.
(EF_AMDGPU_FEATURE_SRAMECC_OFF_V4): New.
(EF_AMDGPU_FEATURE_SRAMECC_ON_V4): New.
(SET_XNACK_ON): New.
(SET_XNACK_OFF): New.
(TEST_XNACK): New.
(SET_SRAM_ECC_ON): New.
(SET_SRAM_ECC_ANY): New.
(SET_SRAM_ECC_OFF): New.
(TEST_SRAM_ECC_ANY): New.
(TEST_SRAM_ECC_ON): New.
(main): Implement HSACOv4 and -msram-ecc=any.

(cherry picked from commit 205dafb6edeca08419f4a5976be79bf7c86fd9a1)

amdgcn: Support LLVM 13 assembler syntax

The LLVM devs have changed the assembler architecture attribute names on both
CLI and in the ".amdgcn_target" directive, and changed the attribute syntax
inside the directive, without keeping any backwards compatibility. :-(

This patch improves our configure tests to detect what dialect to use, what
attributes are valid, and adjusts the specs to match.

gcc/ChangeLog:

* config.in: Regenerate.
* config/gcn/gcn-hsa.h (X_FIJI): New macro.
(X_900): New macro.
(X_906): New macro.
(X_908): New macro.
(A_FIJI): Rename to ...
(S_FIJI): ... this.
(A_900): Rename to ...
(S_900): ... this.
(A_906): Rename to ...
(S_906): ... this.
(A_908): Rename to ...
(S_908): ... this.
(SRAMOPT): New macro.
(ASM_SPEC): Adjust xnack option usage.
* config/gcn/gcn.c (output_file_start): Adjust amdgcn_target usage.
* configure: Regenerate.
* configure.ac: Detect LLVM assembler dialect.

(cherry picked from commit 6ca03ca35a58ebf9792aa8a08adf00b6fd3e0015)

amdgcn: Mark s_mulk_i32 as clobbering SCC

The s_mulk_i32 instruction sets the SCC status register according to
whether the multiplication overflows, but that is not currently modelled
in the GCN backend. AFAIK this is a latent bug and hasn't been noticed
"in the wild", but it should be fixed.

2021-06-29 Julian Brown <julian@codesourcery.com>

gcc/
* config/gcn/gcn.md (mulsi3): Make s_mulk_i32 variant clobber SCC.

(cherry picked from commit 5c127c4cac308429cba483a2ac4e175c2ab26165)

amdgcn: Fix attributes for LLVM-12 [PR 100208]

This should work for a wider range of LLVM 12 variants now.
More work required for LLVM 13 though.

gcc/ChangeLog:

PR target/100208
* config.in: Regenerate.
* config/gcn/gcn-hsa.h (A_FIJI): New define.
(A_900): New define.
(A_906): New define.
(A_908): New define.
(ASM_SPEC): Use A_FIJI, A_900, A_906 and A_908.
* config/gcn/gcn.c (output_file_start): Adjust attributes according
to the assembler capabilities.
* config/gcn/mkoffload.c (main): Likewise.
* configure: Regenerate.
* configure.ac: Add tests for LLVM assembler attribute features.

amdgcn: Add -mxnack and -msram-ecc [PR 100208]

gcc/ChangeLog:

PR target/100208
* config/gcn/gcn-hsa.h (DRIVER_SELF_SPECS): New.
(ASM_SPEC): Set -mattr for xnack and sram-ecc.
* config/gcn/gcn-opts.h (enum sram_ecc_type): New.
* config/gcn/gcn-valu.md: Add a warning comment.
* config/gcn/gcn.c (gcn_option_override): Add "sorry" for -mxnack.
(output_file_start): Add xnack and sram-ecc state to ".amdgcn_target".
* config/gcn/gcn.md: Add a warning comment.
* config/gcn/gcn.opt: Add -mxnack and -msram-ecc.
* config/gcn/mkoffload.c (EF_AMDGPU_MACH_AMDGCN_GFX908): Remove
SRAM-ECC flag.
(EF_AMDGPU_XNACK): New.
(EF_AMDGPU_SRAM_ECC): New.
(elf_flags): New.
(copy_early_debug_info): Use elf_flags.
(main): Handle -mxnack and -msram-ecc options.
* doc/invoke.texi: Document -mxnack and -msram-ecc.

gcc/testsuite/ChangeLog:

PR target/100208
* gcc.target/gcn/sram-ecc-1.c: New test.
* gcc.target/gcn/sram-ecc-2.c: New test.
* gcc.target/gcn/sram-ecc-3.c: New test.
* gcc.target/gcn/sram-ecc-4.c: New test.
* gcc.target/gcn/sram-ecc-5.c: New test.
* gcc.target/gcn/sram-ecc-6.c: New test.
* gcc.target/gcn/sram-ecc-7.c: New test.
* gcc.target/gcn/sram-ecc-8.c: New test.

(cherry picked from commit aad32a00b7d2b64ae158b2b167768a9ae3e20f6e)