git.ipfire.org Git - thirdparty/gcc.git/log

nvptx/mkoffload.cc: Fix "$nohost" check

If lhd_set_decl_assembler_name is invoked - in particular if
!TREE_PUBLIC (decl) && !DECL_FILE_SCOPE_P (decl) - the '.nohost' suffix
might change to '.nohost.2'. This happens for the existing reverse offload
testcases via cgraph_node::analyze and is a side effect of
r13-3455-g178ac530fe67e4f2fc439cc4ce89bc19d571ca31 for some reason.

The solution is to check not only for a trailing '$nohost' but also for
'$nohost$' in nvptx/mkoffload.cc.
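
The check described above can be sketched as follows. This is only an
illustration, not the code in nvptx/mkoffload.cc: the helper name
is_nohost_name is hypothetical, and the sketch assumes the symbol name is
available as a NUL-terminated string.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper (not the actual mkoffload.cc code): return true if
   NAME ends in "$nohost" or contains "$nohost$" followed by a further
   suffix such as "2".  */
bool
is_nohost_name (const char *name)
{
  static const char marker[] = "$nohost";
  const size_t mlen = sizeof marker - 1;
  const size_t len = strlen (name);

  /* Case 1: the name ends in "$nohost".  */
  if (len >= mlen && strcmp (name + len - mlen, marker) == 0)
    return true;

  /* Case 2: "$nohost$" appears anywhere, e.g. "foo$nohost$2".  */
  return strstr (name, "$nohost$") != NULL;
}
```

With this sketch, both "foo$nohost" and "foo$nohost$2" are recognized, while
a name without the marker is not.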

gcc/ChangeLog:

* config/nvptx/mkoffload.cc (process): Recognize '$nohost$...'
besides trailing '$nohost' as being for reverse offload.

(cherry picked from commit d59858f6ee7f356f27ccc2d29129826781f9483f)

Merge branch 'releases/gcc-12' into devel/omp/gcc-12

Merge up to r12-8907-g58da1386d2233b8e01aaac8f7c4a61a2ccf52743 (14th Nov 2022)

libgomp: Fix up build on mingw [PR107641]

Pointers should first be cast to intptr_t/uintptr_t before casting
them to another integral type to avoid warnings.
Furthermore, the function has code like
  else if (upper <= UINT_MAX)
    something;
  else
    something_else;
so it seems using unsigned type for upper where upper <= UINT_MAX is always
true is not intended.
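
A minimal sketch of both points, using hypothetical helper names rather than
the actual env.c code. It assumes a value was stored in a pointer-sized slot,
as with params[2] in the ChangeLog entry below.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: recover an integer stored in a void *.  Going
   through uintptr_t first avoids the pointer-to-integer size warning on
   targets where unsigned long is narrower than a pointer.  */
unsigned long
pointer_to_ulong (void *p)
{
  return (unsigned long) (uintptr_t) p;
}

/* Hypothetical helper: with UPPER declared as plain 'unsigned', the
   comparison below would always be true and the 'else' branch in the
   caller could never run.  Declaring it 'unsigned long' makes the test
   meaningful wherever unsigned long is wider than unsigned.  */
bool
fits_in_unsigned (unsigned long upper)
{
  return upper <= UINT_MAX;
}
```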

2022-11-12  Jakub Jelinek  <jakub@redhat.com>

PR libgomp/107641
* env.c (parse_unsigned_long): Cast params[2] to uintptr_t rather than
unsigned long.  Change type of upper from unsigned to unsigned long.

(cherry picked from commit 2a193e9df82917eaf440a20f99a3febe91dcb5fe)

Daily bump.

Add guality testcase for RTL alias analysis fix

gcc/testsuite/
* gcc.dg/guality/param-6.c: New test.

Restore RTL alias analysis for hard frame pointer

The change:

2021-07-28 Bin Cheng <bin.cheng@linux.alibaba.com>

alias.c (init_alias_analysis): Don't skip prologue/epilogue.

broke the alias analysis for the hard frame pointer (when it is used as a
frame pointer, i.e. when the frame pointer is not eliminated) described in
the large comment at the top of the file, because static_reg_base_value is
set for it and, consequently, new_reg_base_value too.

When the instruction saving the stack pointer into the hard frame pointer in
the prologue is processed, it is viewed as a second set of the hard frame
pointer and to a different value by record_set, which then proceeds to reset
new_reg_base_value to 0 and the game is over.

gcc/
* alias.cc (init_alias_analysis): Do not record sets to the hard
frame pointer if the frame pointer has not been eliminated.

Daily bump.

Always use TYPE_MODE instead of DECL_MODE for vector field

e034c5c8957 re PR target/78643 (ICE in convert_move, at expr.c:230)

fixed the case where DECL_MODE of a vector field is BLKmode and its
TYPE_MODE is a vector mode because of target attribute. Remove the
BLKmode check for the case where DECL_MODE of a vector field is a vector
mode and its TYPE_MODE isn't a vector mode because of target attribute.

gcc/

PR target/107304
* expr.cc (get_inner_reference): Always use TYPE_MODE for vector
field with vector raw mode.

gcc/testsuite/

PR target/107304
* gcc.target/i386/pr107304.c: New test.

(cherry picked from commit 1c64aba8cdf6509533f554ad86640f274cdbe37f)

libstdc++: Remove empty <author> elements in manual

This fixes a spurious comma before the list of authors in the PDF
version of the libstdc++ manual.

Also fix the commented-out examples which should show <personblurb> not
<authorblurb>.

libstdc++-v3/ChangeLog:

* doc/xml/authors.xml: Remove empty author element.
* doc/xml/manual/spine.xml: Likewise.
* doc/html/manual/index.html: Regenerate.

(cherry picked from commit 4596339d9fabdcbd66b5a7430fa56544f75ecef1)

Daily bump.

amdgcn: Fix expansion of GCN_BUILTIN_LDEXPV builtin

2022-11-07 Kwok Cheung Yeung <kcy@codesourcery.com>

gcc/
* config/gcn/gcn.cc (gcn_expand_builtin_1): Expand first argument
of GCN_BUILTIN_LDEXPV to V64DFmode.

amdgcn: Various fixes for SIMD math library

Fix an error if VECTOR_RETURN is used on a constant.
Simplify the implementation of ilogb by removing VECTOR_IFs with a trivial
body.

2022-11-07 Kwok Cheung Yeung <kcy@codesourcery.com>

libgcc/
* config/gcn/simd-math/amdgcnmach.h (VECTOR_RETURN): Store value of
return value in a local variable first.
* config/gcn/simd-math/v64df_ilogb.c (ilogb): Simplify.

amdgcn: Fixed intermittent failure in vectorized version of rint

The lane mask was not being updated properly in nested conditionals.
Also fixed an issue causing inaccuracy in double-precision rint.

2022-11-07  Kwok Cheung Yeung  <kcy@codesourcery.com>

libgcc/
* config/gcn/simd-math/v64df_rint.c (rint): Simplified.  Fixed bug in
nested VECTOR_IF.  Fixed issue with signed right-shift.
* config/gcn/simd-math/v64sf_rint.c (rintf): Simplified.  Fixed bug in
nested VECTOR_IF.

Remove AVX512_VP2INTERSECT from PTA_SAPPHIRERAPIDS

gcc/ChangeLog:

* config/i386/driver-i386.cc (host_detect_local_cpu):
Move sapphirerapids out of AVX512_VP2INTERSECT.
* config/i386/i386.h: Remove AVX512_VP2INTERSECT from PTA_SAPPHIRERAPIDS
* doc/invoke.texi: Remove AVX512_VP2INTERSECT from SAPPHIRERAPIDS

Daily bump.

doc: Document correct -fwide-exec-charset defaults [PR41041]

As shown in the PR, the default is not UTF-32 but rather UTF-32BE or
UTF-32LE, avoiding the need for a byte order mark in literals.

gcc/ChangeLog:

PR c/41041
* doc/cppopts.texi: Document -fwide-exec-charset defaults
correctly.

(cherry picked from commit e50ea3a42f058c14ee29327d5277ab0435e3d36b)

Fix recent thinko in operand_equal_p

There is a thinko in a recent improvement made to operand_equal_p where
the code just looks at operand 2 of COMPONENT_REF, if it is present, to
compare addresses. That's wrong because operand 2 contains the number of
DECL_OFFSET_ALIGN-bit-sized words so, when DECL_OFFSET_ALIGN > 8, not all
the bytes are included and some of them are in DECL_FIELD_BIT_OFFSET, see
get_inner_reference for the model computation.

In other words, you would need to compare operand 2 and DECL_OFFSET_ALIGN
and DECL_FIELD_BIT_OFFSET in this situation, but I'm not sure this is worth
the hassle in practice so the fix just removes this alternate handling.
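
As a rough illustration of why operand 2 alone is not enough, the following
sketch models the position of a field as described above. The struct and
function names are hypothetical and do not exist in GCC; the formula is a
simplified reading of the computation in get_inner_reference.

```c
#include <stdint.h>

/* Hypothetical model of the three quantities mentioned above.  */
struct field_pos
{
  uint64_t offset_words;      /* operand 2 of the COMPONENT_REF */
  unsigned offset_align_bits; /* DECL_OFFSET_ALIGN */
  uint64_t bit_offset;        /* DECL_FIELD_BIT_OFFSET */
};

/* Bit position of the field within its containing object: operand 2
   counts DECL_OFFSET_ALIGN-bit words, and the remainder lives in
   DECL_FIELD_BIT_OFFSET.  */
uint64_t
field_bit_position (struct field_pos f)
{
  return f.offset_words * f.offset_align_bits + f.bit_offset;
}
```

Two fields with the same operand 2 and DECL_OFFSET_ALIGN of 128 but bit
offsets of 0 and 64 land at different positions, so comparing operand 2
alone would wrongly treat them as equal.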

gcc/
* fold-const.cc (operand_compare::operand_equal_p) <COMPONENT_REF>:
Do not take into account operand 2.
(operand_compare::hash_operand) <COMPONENT_REF>: Likewise.

gcc/testsuite/
* gnat.dg/opt99.adb: New test.
* gnat.dg/opt99_pkg1.ads, gnat.dg/opt99_pkg1.adb: New helper.
* gnat.dg/opt99_pkg2.ads: Likewise.

Align with: "OpenMP/Fortran: 'target update' with DT components"

This commit partially undoes the OG12 commit
   cb934e37962eeccc8641982b9a9855408979c767
   OpenMP/Fortran: 'target update' with strides + DT components
to match the mainline (GCC 13) version:
   r13-3625-g6629444170f85e9b1e243aa07e3e07a8b9f8fce5
   OpenMP/Fortran: 'target update' with DT components

The difference is that strides are not permitted in the mainline
version; for the reason and to-do, see: https://gcc.gnu.org/PR107517

Interdiff changelog:

2022-11-04  Tobias Burnus  <tobias@codesourcery.com>

gcc/fortran/ChangeLog.omp
        Partial Revert:
        2022-11-02  Tobias Burnus  <tobias@codesourcery.com>

        * openmp.cc (resolve_omp_clauses): Accept noncontiguous arrays.

libgomp/ChangeLog.omp

        * testsuite/libgomp.fortran/target-13.f90: Remove strides to match
        mainline (GCC 13) version.

(cherry picked from commit 6629444170f85e9b1e243aa07e3e07a8b9f8fce5)

Merge branch 'releases/gcc-12' into devel/omp/gcc-12

Merge up to r12-8891-g14a92220a2f061328aae32ee6b5cdc7f62375902 (4th Nov 2022)

Note: This includes in principle some OpenMP/libgomp commits, but those have
been cherry picked already.

Daily bump.

i386: Fix uninitialized register after peephole2 conversion [PR107404]

The "eliminate reg-reg move by inverting the condition of
a cmove #2" peephole2 converts the following sequence:

  473: bx:DI=[r14:DI*0x8+r12:DI]
  960: r15:DI=r8:DI
  485: {flags:CCC=cmp(r15:DI+bx:DI,bx:DI);r15:DI=r15:DI+bx:DI;}
  737: r15:DI={(geu(flags:CCC,0))?r15:DI:bx:DI}

to:

1110: {flags:CCC=cmp(r8:DI+bx:DI,bx:DI);r8:DI=r8:DI+bx:DI;}
1111: r15:DI=[r14:DI*0x8+r12:DI]
1112: r15:DI={(geu(flags:CCC,0))?r8:DI:r15:DI}

Please note that (insn 1110) uses register BX, but its
initialization was eliminated.

Avoid the conversion if the eliminated move initialized a register that is
used in the moved instruction.

2022-11-03  Uroš Bizjak  <ubizjak@gmail.com>

gcc/ChangeLog:

PR target/107404
* config/i386/i386.md (eliminate reg-reg move by inverting the
condition of a cmove #2 peephole2): Check if eliminated move
initialized a register, used in the moved instruction.

gcc/testsuite/ChangeLog:

PR target/107404
* g++.target/i386/pr107404.C: New test.

(cherry picked from commit 553b1d3dd5b9253ebdf66ee3260c717d5b807dd1)

c, c++: Fix up excess precision handling of scalar_to_vector conversion [PR107358]

As mentioned earlier in the C++ excess precision support mail, the following
testcase is broken with excess precision both in C and C++ (though just in C++
it was triggered in real-world code).
scalar_to_vector is called in both FEs after the excess precision promotions
(or stripping of EXCESS_PRECISION_EXPR), so we can then get invalid
diagnostics that say float vector + float involves truncation (on ia32
from long double to float).

The following patch fixes that by calling scalar_to_vector on the operands
before the excess precision promotions, let scalar_to_vector just do the
diagnostics (it does e.g. fold_for_warn so it will fold
EXCESS_PRECISION_EXPR around REAL_CST to constants etc.) but will then
do the actual conversions using the excess precision promoted operands
(so say if we have vector double + (float + float) we don't actually do
vector double + (float) ((long double) float + (long double) float)
but
vector double + (double) ((long double) float + (long double) float)

2022-10-24 Jakub Jelinek <jakub@redhat.com>

PR c++/107358
gcc/c/
* c-typeck.cc (build_binary_op): Pass operands before excess precision
promotions to scalar_to_vector call.
gcc/testsuite/
* c-c++-common/pr107358.c: New test.

(cherry picked from commit 65e3274e363cb2c6bfe6b5e648916eb7696f7e2f)

c++: Fix up constexpr handling of char/signed char/short pre/post inc/decrement [PR105774]

signed char, char or short int pre/post inc/decrement are represented by
normal {PRE,POST}_{INC,DEC}REMENT_EXPRs in the FE and only gimplification
ensures that the {PLUS,MINUS}_EXPR is done in unsigned version of those
types:
    case PREINCREMENT_EXPR:
    case PREDECREMENT_EXPR:
    case POSTINCREMENT_EXPR:
    case POSTDECREMENT_EXPR:
      {
        tree type = TREE_TYPE (TREE_OPERAND (*expr_p, 0));
        if (INTEGRAL_TYPE_P (type) && c_promoting_integer_type_p (type))
          {
            if (!TYPE_OVERFLOW_WRAPS (type))
              type = unsigned_type_for (type);
            return gimplify_self_mod_expr (expr_p, pre_p, post_p, 1, type);
          }
        break;
      }
This means during constant evaluation we need to do it similarly (either
using unsigned_type_for or using widening to integer_type_node).
The following patch does the latter.
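
The widening approach can be sketched in plain C as below. This is an
illustration only, not the constexpr.cc change; the function name
increment_in_int is hypothetical.

```c
#include <limits.h>

/* Hypothetical helper: perform the addition of a signed char increment
   in int, as the integer promotions would.  The sum cannot overflow
   int, even for SCHAR_MAX; narrowing the result back to signed char is
   a separate conversion step and is not done here.  */
int
increment_in_int (signed char c)
{
  return (int) c + 1;
}
```

For SCHAR_MAX the widened sum is SCHAR_MAX + 1, a valid int value, so the
arithmetic itself never overflows; only the conversion back to the narrow
type has to deal with the out-of-range value.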

2022-10-24  Jakub Jelinek  <jakub@redhat.com>

PR c++/105774
* constexpr.cc (cxx_eval_increment_expression): For signed types
that promote to int, evaluate PLUS_EXPR or MINUS_EXPR in int type.

* g++.dg/cpp1y/constexpr-105774.C: New test.

(cherry picked from commit da8c362c4c18cff2f2dfd5c4706bdda7576899a4)

libgomp: Fix up creation of artificial teams

When not in explicit parallel/target/teams construct, we in some cases create
an artificial parallel with a single thread (either to handle target nowait
or for task reduction purposes).  In those cases, it handled again artificially
created implicit task (created by gomp_new_icv for cases where we needed to write
to some ICVs), but as the testcases show, didn't take into account possibility
of this being done from explicit task(s).  The code would destroy/free the previous
task and replace it with the new implicit task.  If task is an explicit task
(when teams is NULL, all explicit tasks behave like if (0)), it is a pointer to
a local stack variable, so freeing it doesn't work, and additionally we shouldn't
lose the explicit tasks - the new implicit task should instead replace the
ancestor task which is the first implicit one.

2022-10-12  Jakub Jelinek  <jakub@redhat.com>

* task.c (gomp_create_artificial_team): Fix up handling of invocations
from within explicit task.
* target.c (GOMP_target_ext): Likewise.
* testsuite/libgomp.c/task-7.c: New test.
* testsuite/libgomp.c/task-8.c: New test.
* testsuite/libgomp.c-c++-common/task-reduction-17.c: New test.
* testsuite/libgomp.c-c++-common/task-reduction-18.c: New test.

(cherry picked from commit a58a965eb73253759f6a3e1c7380392557da89c8)

tree-cfg: Fix a verification diagnostic typo [PR107121]

Obvious typo in diagnostics.

2022-10-02 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/107121
* tree-cfg.cc (verify_gimple_call): Fix a typo in diagnostics,
DEFFERED_INIT -> DEFERRED_INIT.

(cherry picked from commit d01bd0b0f3b8f4c33c437ff10f0b949200627f56)

openmp: Fix ICE with taskgroup at -O0 -fexceptions [PR107001]

The following testcase ICEs because with -O0 -fexceptions GOMP_taskgroup_end
call isn't directly followed by GOMP_RETURN statement, but there are some
conditionals to handle exceptions and we fail to find the correct GOMP_RETURN.

The fix is to treat taskgroup similarly to target data, both of these constructs
emit a try { body } finally { end_call } around the construct's body during
gimplification and we need to see proper construct nesting during gimplification
and omp lowering (including nesting of regions checks), but during omp expansion
we don't really need their nesting anymore, all we need is emit something at
the start of the region and the end of the region is the end API call we've
already emitted during gimplification.  For target data, we weren't adding
GOMP_RETURN statement during omp lowering, so after that pass it is treated
merely like stand-alone omp directives.  This patch does the same for
taskgroup too.

2022-09-24  Jakub Jelinek  <jakub@redhat.com>

PR c/107001
* omp-low.cc (lower_omp_taskgroup): Don't add GOMP_RETURN statement
at the end.
* omp-expand.cc (build_omp_regions_1): Clarify GF_OMP_TARGET_KIND_DATA
is not stand-alone directive.  For GIMPLE_OMP_TASKGROUP, also don't
update parent.
(omp_make_gimple_edges) <case GIMPLE_OMP_TASKGROUP>: Reset
cur_region back after new_omp_region.

* c-c++-common/gomp/pr107001.c: New test.

(cherry picked from commit ad2aab5c816a6fd56b46210c0a4a4c6243da1de9)

openmp, c: Tighten up c_tree_equal [PR106981]

This patch changes c_tree_equal to work more like cp_tree_equal, be
more strict in what it accepts.  The ICE on the first testcase was
due to INTEGER_CST wi::wide (t1) == wi::wide (t2) comparison which
ICEs if the two constants have different precision, but as the second
testcase shows, being too lenient in it can also lead to miscompilation
of valid OpenMP programs where we think certain expression is the same
even when it isn't and can be guaranteed at runtime to represent different
memory location.  So, the patch looks through only NON_LVALUE_EXPRs
and for constants as well as casts requires that the types match before
actually comparing the constant values or recursing on the cast operands.
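
A toy model of the stricter comparison, assuming a much simplified
representation of constants. The types and names below are hypothetical and
unrelated to GCC's tree structures.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for a constant's type and value.  */
enum toy_type { TOY_INT, TOY_LONG };

struct toy_const
{
  enum toy_type type;
  long value;
};

/* Require matching types before comparing values, mirroring the rule
   described above: constants of different types are never equal, even
   if their numeric values coincide.  */
bool
toy_const_equal (const struct toy_const *a, const struct toy_const *b)
{
  if (a->type != b->type)
    return false;
  return a->value == b->value;
}
```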

2022-09-24  Jakub Jelinek  <jakub@redhat.com>

PR c/106981
gcc/c/
* c-typeck.cc (c_tree_equal): Only strip NON_LVALUE_EXPRs at the
start.  For CONSTANT_CLASS_P or CASE_CONVERT: return false if t1 and
t2 have different types.
gcc/testsuite/
* c-c++-common/gomp/pr106981.c: New test.
libgomp/
* testsuite/libgomp.c-c++-common/pr106981.c: New test.

(cherry picked from commit 3c5bccb608c665ac3f62adb1817c42c845812428)

openmp: Fix handling of target constructs in static member functions [PR106829]

Just calling current_nonlambda_class_type in static member functions returns
non-NULL, but something that isn't *this and if unlucky can match part of the
IL and can be added to target clauses.
      if (DECL_NONSTATIC_MEMBER_P (decl)
          && current_class_ptr)
is a guard used elsewhere (in check_accessibility_of_qualified_id).

2022-09-07  Jakub Jelinek  <jakub@redhat.com>

PR c++/106829
* semantics.cc (finish_omp_target_clauses): If current_function_decl
isn't a nonstatic member function, don't set data.current_object to
non-NULL.

* g++.dg/gomp/pr106829.C: New test.

(cherry picked from commit e90af965e5c858ba02c0cdfbac35d0a19da1c2f6)

Daily bump.

Fortran "declare create"/allocate support for OpenACC: adjust 'libgomp.oacc-fortran/declare-allocatable-array_descriptor-1*.f90'

libgomp/
* testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-directive.f90:
Adjust.
* testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-runtime.f90:
Likewise.
* testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1.f90:
New.

Fortran "declare create"/allocate support for OpenACC: adjust 'libgomp.oacc-fortran/declare-allocatable-1*.f90'

libgomp/
* testsuite/libgomp.oacc-fortran/declare-allocatable-1-directive.f90:
Adjust.
* testsuite/libgomp.oacc-fortran/declare-allocatable-1-runtime.f90:
Likewise.
* testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90:
Likewise.

XFAIL some OpenACC 'kernels' confusion in 'libgomp.oacc-fortran/declare-allocatable-1*.f90'

Seen only for certain optimizations levels, as indicated; so there are a few
XPASSes otherwise.

There are neither OpenACC 'kernels' constructs here, nor other 'loop'
constructs with 'auto' clause, so I'm not sure what's going on.

libgomp/
* testsuite/libgomp.oacc-fortran/declare-allocatable-1-directive.f90:
XFAIL some OpenACC 'kernels' confusion.
* testsuite/libgomp.oacc-fortran/declare-allocatable-1-runtime.f90:
Likewise.
* testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90:
Likewise.

Support OpenACC 'declare create' with Fortran allocatable arrays, part II [PR106643, PR96668]

PR libgomp/106643
PR fortran/96668
libgomp/
* oacc-mem.c (goacc_enter_data_internal): Support
OpenACC 'declare create' with Fortran allocatable arrays, part II.
* testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-directive.f90:
Adjust.
* testsuite/libgomp.oacc-fortran/pr106643-1.f90: New.

(cherry picked from commit f6ce1e77bbf5d3a096f52e674bfd7354c6537d10)

Support OpenACC 'declare create' with Fortran allocatable arrays, part I [PR106643]

PR libgomp/106643
libgomp/
* oacc-mem.c (goacc_enter_data_internal): Support
OpenACC 'declare create' with Fortran allocatable arrays, part I.
* testsuite/libgomp.oacc-fortran/declare-allocatable-1-directive.f90:
New.
* testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-directive.f90:
New.

(cherry picked from commit da8e0e1191c5512244a752b30dea0eba83e3d10c)

Add 'libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-runtime.f90'

libgomp/
* testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-runtime.f90:
New.

(cherry picked from commit abeaf3735fe2568b9d5b8096318da866b1fe1e5c)

Add 'libgomp.oacc-fortran/declare-allocatable-1-runtime.f90'

... which is 'libgomp.oacc-fortran/declare-allocatable-1.f90' adjusted
for missing support for OpenACC "Changes from Version 2.0 to 2.5":
"The 'declare create' directive with a Fortran 'allocatable' has new behavior".
Thus, after 'allocate'/before 'deallocate', call 'acc_create'/'acc_delete'
manually.

libgomp/
* testsuite/libgomp.oacc-fortran/declare-allocatable-1-runtime.f90:
New.

(cherry picked from commit 59c6c5dbf267cd9d0a8df72b2a5eb5657b64268e)

Add 'libgomp.oacc-fortran/declare-allocatable-1.f90'

libgomp/
* testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90: New.

Co-authored-by: Thomas Schwinge <thomas@codesourcery.com>
(cherry picked from commit 8c357d884b16cb3c14cba8a61be5b53fd04a6bfe)

Fortran/OpenMP: Fix DT struct-component with 'alloc' and array descr

Using 'map(alloc: var, dt%comp)' requires a 'to' mapping of
the array descriptor, as otherwise the bounds are not available in the
target region. The same applies to character strings.

This patch implements this; however, some additional issues are exposed
by the testcase; those are '#if 0'ed and will be handled later.

Submitted to mainline (but pending review):
https://gcc.gnu.org/pipermail/gcc-patches/2022-November/604887.html

gcc/fortran/ChangeLog:

* trans-openmp.cc (gfc_trans_omp_clauses): Ensure DT struct-comp with
array descriptor and 'alloc:' have the descriptor mapped with 'to:'.

libgomp/ChangeLog:

* testsuite/libgomp.fortran/target-enter-data-3.f90: New test.

libgomp.c-c++-common/requires-4.c: Fix dg-xfail-run-if condition

Seemingly an extra { } is required. This is a follow up to
OG12 commit 0c47ae1c9283a812f832e80e451bfa82519c21e8

libgomp/
* testsuite/libgomp.c-c++-common/requires-4.c: Fix dg-xfail-run-if
condition.

OpenMP/Fortran: 'target update' with strides + DT components

OpenMP 5.0 permits the use of arrays with strides and derived-type
components as list items in the 'from'/'to' clauses
of the 'target update' directive.

Submitted to mainline (but pending review):
https://gcc.gnu.org/pipermail/gcc-patches/2022-October/604687.html

gcc/fortran/ChangeLog:

* openmp.cc (gfc_match_omp_clauses): Permit derived types.
(resolve_omp_clauses): Accept noncontiguous
arrays.
* trans-openmp.cc (gfc_trans_omp_clauses): Fixes for
derived-type changes; fix size for scalars.

libgomp/ChangeLog:

* testsuite/libgomp.fortran/target-11.f90: New test.
* testsuite/libgomp.fortran/target-13.f90: New test.

Merge branch 'releases/gcc-12' into devel/omp/gcc-12

Merge up to r12-8881-gb80a690673272919896ee5939250e50d882f2418 (2nd Nov 2022)

amdgcn: Enable SIMD vectorization of math functions

Calls to vectorized versions of routines in the math library will now
be inserted when vectorizing code containing supported math functions.

2022-11-01 Kwok Cheung Yeung <kcy@codesourcery.com>
Paul-Antoine Arras <pa@codesourcery.com>

gcc/
* builtins.cc (mathfn_built_in_explicit): New.
* config/gcn/gcn.cc: Include case-cfn-macros.h.
(mathfn_built_in_explicit): Add prototype.
(gcn_vectorize_builtin_vectorized_function): New.
(gcn_libc_has_function): New.
(TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION): Define.
(TARGET_LIBC_HAS_FUNCTION): Define.

gcc/testsuite/
* gcc.target/gcn/simd-math-1.c: New testcase.

libgomp/
* testsuite/libgomp.c/simd-math-1.c: New testcase.

amdgcn: Add SIMD versions of math routines to libgcc

These should eventually be moved into Newlib.

2022-11-01  Kwok Cheung Yeung  <kcy@codesourcery.com>
    Paul-Antoine Arras  <pa@codesourcery.com>
    Andrew Jenner  <andrew@codesourcery.com>

libgcc/
* config.host: Add t-simdmath for GCN if header files present.
* config/gcn/simd-math/amdgcnmach.h: New.
* config/gcn/simd-math/v64_mathcnst.c: New.
* config/gcn/simd-math/v64_reent.c: New.
* config/gcn/simd-math/v64df_acos.c: New.
* config/gcn/simd-math/v64df_acosh.c: New.
* config/gcn/simd-math/v64df_asin.c: New.
* config/gcn/simd-math/v64df_asine.c: New.
* config/gcn/simd-math/v64df_asinh.c: New.
* config/gcn/simd-math/v64df_atan.c: New.
* config/gcn/simd-math/v64df_atan2.c: New.
* config/gcn/simd-math/v64df_atangent.c: New.
* config/gcn/simd-math/v64df_atanh.c: New.
* config/gcn/simd-math/v64df_copysign.c: New.
* config/gcn/simd-math/v64df_cos.c: New.
* config/gcn/simd-math/v64df_cosh.c: New.
* config/gcn/simd-math/v64df_erf.c: New.
* config/gcn/simd-math/v64df_exp.c: New.
* config/gcn/simd-math/v64df_exp2.c: New.
* config/gcn/simd-math/v64df_finite.c: New.
* config/gcn/simd-math/v64df_fmod.c: New.
* config/gcn/simd-math/v64df_gamma.c: New.
* config/gcn/simd-math/v64df_hypot.c: New.
* config/gcn/simd-math/v64df_ilogb.c: New.
* config/gcn/simd-math/v64df_isnan.c: New.
* config/gcn/simd-math/v64df_ispos.c: New.
* config/gcn/simd-math/v64df_lgamma.c: New.
* config/gcn/simd-math/v64df_lgamma_r.c: New.
* config/gcn/simd-math/v64df_log.c: New.
* config/gcn/simd-math/v64df_log10.c: New.
* config/gcn/simd-math/v64df_log2.c: New.
* config/gcn/simd-math/v64df_modf.c: New.
* config/gcn/simd-math/v64df_numtest.c: New.
* config/gcn/simd-math/v64df_pow.c: New.
* config/gcn/simd-math/v64df_remainder.c: New.
* config/gcn/simd-math/v64df_rint.c: New.
* config/gcn/simd-math/v64df_scalb.c: New.
* config/gcn/simd-math/v64df_scalbn.c: New.
* config/gcn/simd-math/v64df_signif.c: New.
* config/gcn/simd-math/v64df_sin.c: New.
* config/gcn/simd-math/v64df_sine.c: New.
* config/gcn/simd-math/v64df_sineh.c: New.
* config/gcn/simd-math/v64df_sinh.c: New.
* config/gcn/simd-math/v64df_sqrt.c: New.
* config/gcn/simd-math/v64df_tan.c: New.
* config/gcn/simd-math/v64df_tanh.c: New.
* config/gcn/simd-math/v64df_tgamma.c: New.
* config/gcn/simd-math/v64sf_acos.c: New.
* config/gcn/simd-math/v64sf_acosh.c: New.
* config/gcn/simd-math/v64sf_asin.c: New.
* config/gcn/simd-math/v64sf_asine.c: New.
* config/gcn/simd-math/v64sf_asinh.c: New.
* config/gcn/simd-math/v64sf_atan.c: New.
* config/gcn/simd-math/v64sf_atan2.c: New.
* config/gcn/simd-math/v64sf_atangent.c: New.
* config/gcn/simd-math/v64sf_atanh.c: New.
* config/gcn/simd-math/v64sf_copysign.c: New.
* config/gcn/simd-math/v64sf_cos.c: New.
* config/gcn/simd-math/v64sf_cosh.c: New.
* config/gcn/simd-math/v64sf_erf.c: New.
* config/gcn/simd-math/v64sf_exp.c: New.
* config/gcn/simd-math/v64sf_exp2.c: New.
* config/gcn/simd-math/v64sf_finite.c: New.
* config/gcn/simd-math/v64sf_fmod.c: New.
* config/gcn/simd-math/v64sf_gamma.c: New.
* config/gcn/simd-math/v64sf_hypot.c: New.
* config/gcn/simd-math/v64sf_ilogb.c: New.
* config/gcn/simd-math/v64sf_isnan.c: New.
* config/gcn/simd-math/v64sf_ispos.c: New.
* config/gcn/simd-math/v64sf_lgamma.c: New.
* config/gcn/simd-math/v64sf_lgamma_r.c: New.
* config/gcn/simd-math/v64sf_log.c: New.
* config/gcn/simd-math/v64sf_log10.c: New.
* config/gcn/simd-math/v64sf_log2.c: New.
* config/gcn/simd-math/v64sf_modf.c: New.
* config/gcn/simd-math/v64sf_numtest.c: New.
* config/gcn/simd-math/v64sf_pow.c: New.
* config/gcn/simd-math/v64sf_remainder.c: New.
* config/gcn/simd-math/v64sf_rint.c: New.
* config/gcn/simd-math/v64sf_scalb.c: New.
* config/gcn/simd-math/v64sf_scalbn.c: New.
* config/gcn/simd-math/v64sf_sin.c: New.
* config/gcn/simd-math/v64sf_sine.c: New.
* config/gcn/simd-math/v64sf_sineh.c: New.
* config/gcn/simd-math/v64sf_sinh.c: New.
* config/gcn/simd-math/v64sf_sqrt.c: New.
* config/gcn/simd-math/v64sf_tan.c: New.
* config/gcn/simd-math/v64sf_tanh.c: New.
* config/gcn/simd-math/v64sf_tgamma.c: New.
* config/gcn/t-simdmath: New.

Daily bump.

amdgcn: Add builtins for vector floor/floorf

2022-11-01 Kwok Cheung Yeung <kcy@codesourcery.com>

gcc/
* config/gcn/gcn-builtins.def (FLOORVF): New builtin.
(FLOORV): New builtin.
* config/gcn/gcn.cc (gcn_expand_builtin_1): Expand GCN_BUILTIN_FLOORVF
and GCN_BUILTIN_FLOORV.

amdgcn: Fix expansion of builtin for vector fabs operation

2022-11-01 Kwok Cheung Yeung <kcy@codesourcery.com>

* config/gcn/gcn.cc (gcn_expand_builtin_1): Fix expansion of
GCN_BUILTIN_FABSV.

openmp: Bugfix in omp_expand_metadirective for same blocks/edges to be deleted.

This patch handles an ICE that is thrown in omp_expand_metadirective when
deletion of the basic_block for a metadirective label is attempted more than
once. To avoid this situation, processed labels are added to the existing
list of labels that are not intended to be deleted.

The issue occurred in the attached test case.

gcc/ChangeLog:

* omp-expand-metadirective.cc (omp_expand_metadirective): Add already
processed labels to "labels" (the list of labels not to be deleted).

gcc/testsuite/ChangeLog:

* c-c++-common/gomp/metadirective-8.c: New test.

amdgcn: add fmin/fmax patterns

Add fmin/fmax for scalar, vector, and reductions. The smin/smax patterns are
already using the IEEE compliant hardware instructions anyway, so we can just
expand to use those insns.

gcc/ChangeLog:

* config/gcn/gcn-valu.md (fminmaxop): New iterator.
(<fexpander><mode>3): New define_expand.
(<fexpander><mode>3<exec>): Likewise.
(reduc_<fexpander>_scal_<mode>): Likewise.
* config/gcn/gcn.md (fexpander): New attribute.

(cherry picked from commit 10aa0356118f44e5f4d720a2a4c731b173baa298)

amdgcn: multi-size vector reductions

Add support for vector reductions for any vector width by switching iterators
and generalising the code slightly. There's no one-instruction way to move an
item from lane 31 to lane 0 (63, 15, 7, 3, and 1 are all fine though), and
vec_extract is probably fewer cycles anyway, so now we always reduce to an
SGPR.

gcc/ChangeLog:

* config/gcn/gcn-valu.md (V64_SI): Delete iterator.
(V64_DI): Likewise.
(V64_1REG): Likewise.
(V64_INT_1REG): Likewise.
(V64_2REG): Likewise.
(V64_ALL): Likewise.
(V64_FP): Likewise.
(reduc_<reduc_op>_scal_<mode>): Use V_ALL. Use gen_vec_extract.
(fold_left_plus_<mode>): Use V_FP.
(*<reduc_op>_dpp_shr_<mode>): Use V_1REG.
(*<reduc_op>_dpp_shr_<mode>): Use V_DI.
(*plus_carry_dpp_shr_<mode>): Use V_INT_1REG.
(*plus_carry_in_dpp_shr_<mode>): Use V_SI.
(*plus_carry_dpp_shr_<mode>): Use V_DI.
(mov_from_lane63_<mode>): Delete.
(mov_from_lane63_<mode>): Delete.
* config/gcn/gcn.cc (gcn_expand_reduc_scalar): Support partial vectors.
* config/gcn/gcn.md (unspec): Remove UNSPEC_MOV_FROM_LANE63.

(cherry picked from commit f539029c1ce6fb9163422d1a8b6ac12a2554eaa2)

Daily bump.

Fortran: Add missing TKR initialization to class variables [PR100097, PR100098]

gcc/fortran/ChangeLog:

PR fortran/100097
PR fortran/100098
* trans-array.cc (gfc_trans_class_array): New function to
initialize class descriptor's TKR information.
* trans-array.h (gfc_trans_class_array): Add function prototype.
* trans-decl.cc (gfc_trans_deferred_vars): Add calls to the new
function for both pointers and allocatables.

gcc/testsuite/ChangeLog:

PR fortran/100097
PR fortran/100098
* gfortran.dg/PR100097.f90: New test.
* gfortran.dg/PR100098.f90: New test.

(cherry picked from commit 4cfdaeb2755121ac1069f09898def56469b0fb51)

Daily bump.

Fortran: BOZ literal constants are not compatible to any type [PR103413]

gcc/fortran/ChangeLog:

PR fortran/103413
* symbol.cc (gfc_type_compatible): A boz-literal-constant has no type
and thus is not considered compatible to any type.

gcc/testsuite/ChangeLog:

PR fortran/103413
* gfortran.dg/illegal_boz_arg_4.f90: New test.

(cherry picked from commit f7d28818179247685f3c101f9f2f16366f56309b)

openmp: Allow optional comma after directive-specifier in C/C++

Previously we've been allowing that comma only in C++ when in attribute
form (which was the reason why it has been allowed), but 5.1 allows that
even in pragma form in C/C++ (with clarifications in 5.2) and 5.2
also in Fortran (which this patch doesn't implement).

Note, for directives which take an argument (== unnamed clause),
comma is not allowed in between the directive name and the argument,
like the directive-1.c testcase shows.
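
A small sketch of the pragma form with the optional comma. The function name
sum_to is hypothetical; the sketch assumes a compiler that accepts the
OpenMP 5.1 syntax when OpenMP is enabled (others may reject the comma), and
the pragma is simply ignored when OpenMP is disabled.

```c
/* Hypothetical example: sum the integers 0 .. n-1.  */
int
sum_to (int n)
{
  int sum = 0;

  /* Note the comma between the directive name and the first clause.  */
  #pragma omp parallel for, reduction(+:sum)
  for (int i = 0; i < n; i++)
    sum += i;

  return sum;
}
```

By contrast, for a directive that takes an argument, a comma between the
directive name and that argument remains invalid, as noted above.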

2022-10-28 Jakub Jelinek <jakub@redhat.com>

gcc/c/
* c-parser.cc (c_parser_omp_all_clauses): Allow optional
comma before the first clause.
(c_parser_omp_allocate, c_parser_omp_atomic, c_parser_omp_depobj,
c_parser_omp_flush, c_parser_omp_scan_loop_body,
c_parser_omp_ordered, c_finish_omp_declare_variant,
c_parser_omp_declare_target, c_parser_omp_declare_reduction,
c_parser_omp_requires, c_parser_omp_error,
c_parser_omp_assumption_clauses): Likewise.
gcc/cp/
* parser.cc (cp_parser_omp_all_clauses): Allow optional comma
before the first clause even in pragma syntax.
(cp_parser_omp_allocate, cp_parser_omp_atomic, cp_parser_omp_depobj,
cp_parser_omp_flush, cp_parser_omp_scan_loop_body,
cp_parser_omp_ordered, cp_parser_omp_assumption_clauses,
cp_finish_omp_declare_variant, cp_parser_omp_declare_target,
cp_parser_omp_declare_reduction_exprs, cp_parser_omp_requires,
cp_parser_omp_error): Likewise.
gcc/testsuite/
* c-c++-common/gomp/directive-1.c: New test.
* c-c++-common/gomp/clauses-6.c: New test.
* c-c++-common/gomp/declare-variant-2.c (f75a): Declare.
(f75): Use f75a as variant instead of f1 and don't expect error.
* g++.dg/gomp/clause-4.C (foo): Don't expect error on comma
before first clause.
* gcc.dg/gomp/clause-2.c (foo): Likewise.

(cherry picked from commit 89999f2358724fa4e71c7c3b4de340582c0e43da)

Merge commit '9b116c51a451995f1bae8fdac0748fcf3f06aafe'

[og12] OpenACC: Don't gang-privatize artificial variables: restrict to blocks: ChangeLog

... forgotten in og12 commit 9a50d282f03f7f1e1ad00de917143a2a8e0c0ee0
"[og12] OpenACC: Don't gang-privatize artificial variables: restrict to blocks".

OpenACC: Don't gang-privatize artificial variables [PR90115]

This patch prevents compiler-generated artificial variables from being
treated as privatization candidates for OpenACC.

The rationale is that e.g. "gang-private" variables actually must be
shared by each worker and vector spawned within a particular gang, but
that sharing is not necessary for any compiler-generated variable (at
least at present, but no such need is anticipated either). Variables on
the stack (and machine registers) are already private per-"thread"
(gang, worker and/or vector), and that's fine for artificial variables.

We're restricting this to blocks, as we still need to understand what it
means for a 'DECL_ARTIFICIAL' to appear in a 'private' clause.

Several tests need their scan output patterns adjusted to compensate.

2022-10-14 Julian Brown <julian@codesourcery.com>

PR middle-end/90115
gcc/
* omp-low.cc (oacc_privatization_candidate_p): Artificial vars are not
privatization candidates.

libgomp/
* testsuite/libgomp.oacc-fortran/declare-1.f90: Adjust scan output.
* testsuite/libgomp.oacc-fortran/host_data-5.F90: Likewise.
* testsuite/libgomp.oacc-fortran/if-1.f90: Likewise.
* testsuite/libgomp.oacc-fortran/print-1.f90: Likewise.
* testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: Likewise.

Co-authored-by: Thomas Schwinge <thomas@codesourcery.com>
(cherry picked from commit 11e811d8e2f63667f60f73731bb934273f5882b8)

[og12] OpenACC: Don't gang-privatize artificial variables: restrict to blocks

Follow-up to og12 commit d4504346d2a1d6ffecb8b2d8e3e04ab8ea259785
"[og12] OpenACC: Don't gang-privatize artificial variables", to restore
the previous behavior, until we understand what it means for a
'DECL_ARTIFICIAL' to appear in a 'private' clause.

gcc/
* omp-low.cc (oacc_privatization_candidate_p) <DECL_ARTIFICIAL>:
Restrict to 'block's.
libgomp/
* testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: Adjust.

Resolve '-Wsign-compare' issue in 'gcc/omp-low.cc:lower_rec_simd_input_clauses': ChangeLog

... forgotten in og12 commit 4e32d1582a137d5f34248fdd3e93d35a798f5221
"Resolve '-Wsign-compare' issue in 'gcc/omp-low.cc:lower_rec_simd_input_clauses'".

Resolve '-Wsign-compare' issue in 'gcc/omp-low.cc:lower_rec_simd_input_clauses'

..., introduced in og12 commit 55722a87dd223149dcd41ca9c8eba16ad5b3eddc
"openmp: fix max_vf setting for amdgcn offloading":

    In file included from [...]/source-gcc/gcc/coretypes.h:482,
                     from [...]/source-gcc/gcc/omp-low.cc:27:
    [...]/source-gcc/gcc/poly-int.h: In instantiation of ‘typename if_nonpoly<Ca, bool>::type maybe_lt(const Ca&, const poly_int_pod<N, Cb>&) [with unsigned int N = 1; Ca = int; Cb = long unsigned int; typename if_nonpoly<Ca, bool>::type = bool]’:
    [...]/source-gcc/gcc/poly-int.h:1510:7:   required from ‘poly_int<N, typename poly_result<Ca, typename if_nonpoly<Cb>::type>::type> ordered_max(const poly_int_pod<N, C>&, const Cb&) [with unsigned int N = 1; Ca = long unsigned int; Cb = int; typename poly_result<Ca, typename if_nonpoly<Cb>::type>::type = long unsigned int; typename if_nonpoly<Cb>::type = int]’
    [...]/source-gcc/gcc/omp-low.cc:5180:33:   required from here
    [...]/source-gcc/gcc/poly-int.h:1384:12: error: comparison of integer expressions of different signedness: ‘const int’ and ‘const long unsigned int’ [-Werror=sign-compare]
     1384 |   return a < b.coeffs[0];
          |          ~~^~~~~~~~~~~
    [...]/source-gcc/gcc/poly-int.h: In instantiation of ‘typename if_nonpoly<Cb, bool>::type maybe_lt(const poly_int_pod<N, C>&, const Cb&) [with unsigned int N = 1; Ca = long unsigned int; Cb = int; typename if_nonpoly<Cb, bool>::type = bool]’:
    [...]/source-gcc/gcc/poly-int.h:1515:2:   required from ‘poly_int<N, typename poly_result<Ca, typename if_nonpoly<Cb>::type>::type> ordered_max(const poly_int_pod<N, C>&, const Cb&) [with unsigned int N = 1; Ca = long unsigned int; Cb = int; typename poly_result<Ca, typename if_nonpoly<Cb>::type>::type = long unsigned int; typename if_nonpoly<Cb>::type = int]’
    [...]/source-gcc/gcc/omp-low.cc:5180:33:   required from here
    [...]/source-gcc/gcc/poly-int.h:1373:22: error: comparison of integer expressions of different signedness: ‘const long unsigned int’ and ‘const int’ [-Werror=sign-compare]
     1373 |   return a.coeffs[0] < b;
          |          ~~~~~~~~~~~~^~~

gcc/
* omp-low.cc (lower_rec_simd_input_clauses): For 'ordered_max',
cast 'omp_max_simt_vf ()', 'omp_max_simd_vf ()' to 'unsigned'.
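
A much simplified sketch of the issue follows. It does not use GCC's
poly-int.h; 'mixed_max', 'max_vf' and 'clamp_vf' are hypothetical
stand-ins chosen for illustration. It only shows why casting the signed
value to 'unsigned' avoids the mixed-signedness comparison inside a
templated maximum.

```cpp
// Simplified stand-in for a templated maximum over two operand types.
template <typename Ca, typename Cb>
unsigned long mixed_max (const Ca &a, const Cb &b)
{
  // With Ca = unsigned long and Cb = int, this comparison would draw
  // -Wsign-compare; with Cb = unsigned, both operands are unsigned.
  return a < b ? b : a;
}

// Hypothetical function returning a signed vectorization factor.
int max_vf ()
{
  return 64;
}

unsigned long clamp_vf (unsigned long current)
{
  // The cast keeps the template instantiation free of signed operands.
  return mixed_max (current, (unsigned) max_vf ());
}
```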

Fix target selector syntax in 'gcc.dg/vect/bb-slp-cond-1.c'

... to restore testing lost in recent
commit r13-3225-gbd9a05594d227cde79a67dc715bd9d82e9c464e9
"amdgcn: vector testsuite tweaks" (for example, x86_64-pc-linux-gnu):

    PASS: gcc.dg/vect/bb-slp-cond-1.c (test for excess errors)
    PASS: gcc.dg/vect/bb-slp-cond-1.c -flto -ffat-lto-objects  scan-tree-dump vect "(no need for alias check [^\\n]* when VF is 1|no alias between [^\\n]* when [^\\n]* is outside \$-16, 16\$)"
    [-PASS: gcc.dg/vect/bb-slp-cond-1.c -flto -ffat-lto-objects  scan-tree-dump-times vect "loop vectorized" 1-]
    PASS: gcc.dg/vect/bb-slp-cond-1.c -flto -ffat-lto-objects (test for excess errors)
    PASS: gcc.dg/vect/bb-slp-cond-1.c -flto -ffat-lto-objects execution test
    PASS: gcc.dg/vect/bb-slp-cond-1.c execution test
    PASS: gcc.dg/vect/bb-slp-cond-1.c scan-tree-dump vect "(no need for alias check [^\\n]* when VF is 1|no alias between [^\\n]* when [^\\n]* is outside \$-16, 16\$)"
    [-PASS: gcc.dg/vect/bb-slp-cond-1.c scan-tree-dump-times vect "loop vectorized" 1-]

gcc/testsuite/
* gcc.dg/vect/bb-slp-cond-1.c: Fix target selector syntax.

(cherry picked from commit 0607307768b66a90e27c5bc91a247acc938f070e)

Daily bump.

openacc: Revert erroneous gang reduction changes

This patch reverts some changes related to "gang reduction on an orphan loop"
of commit 3a5e525489f2f808093ae1f12b5d2b406f571ec7 "Various OpenACC reduction
enhancements - FE change" similar to the mainline commit
77d24d43644909852998043335b5a0e09d1e8f02.

gcc/c/ChangeLog:

* c-typeck.cc (c_finish_omp_clauses): Remove "gang reduction on an
orphan loop" checking.

gcc/cp/ChangeLog:

* semantics.cc (finish_omp_clauses): Remove "gang reduction on an
orphan loop" checking.

gcc/fortran/ChangeLog:

* openmp.cc (oacc_is_parallel): Remove.
(resolve_oacc_loop_blocks): Remove "gang reduction on an orphan loop"
checking.

openacc: Revert erroneous gang reduction change in OpenACC test

This patch reverts an erroneous modification of an OpenACC test case in
d27d6c9e1e3bc18ba0113757b743b306ea69f825 "Various OpenACC reduction
enhancements - test cases".

The same reversion was already done on mainline with commit
77d24d43644909852998043335b5a0e09d1e8f02.

gcc/testsuite/ChangeLog:

* gfortran.dg/goacc/orphan-reductions-1.f90: Adjust.

Merge branch 'releases/gcc-12' into devel/omp/gcc-12

Merge up to r12-8872-gca0220d42e075194ca1341c98a0a9a9f3fd1c719 (27th Oct 2022)

lto: do not load LTO stream for aliases [PR107418]

PR lto/107418

gcc/lto/ChangeLog:

* lto-dump.cc (lto_main): Do not load LTO stream for aliases.

(cherry picked from commit be6c75547385c69706370f4e792b04295f708a5a)

IRA: Make sure array is big enough

In commit 081c96621da, the call to resize_reg_info() was moved before
the call to remove_scratches() and the latter one can increase the
number of regs and that would cause an out of bounds usage on the
reg_renumber global array.

Without this patch, the following testcase randomly fails with:
during RTL pass: ira
In file included from /src/gcc/testsuite/gcc.dg/compat/struct-by-value-5b_y.c:13:
/src/gcc/testsuite/gcc.dg/compat/struct-by-value-5b_y.c: In function 'checkgSf13':
/src/gcc/testsuite/gcc.dg/compat/fp-struct-test-by-value-y.h:28:1: internal compiler error: Segmentation fault
/src/gcc/testsuite/gcc.dg/compat/struct-by-value-5b_y.c:22:1: note: in expansion of macro 'TEST'

gcc/ChangeLog:

* ira.cc: Resize array after reg number increased.

Co-Authored-By: Yvan ROUX <yvan.roux@foss.st.com>
Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
(cherry picked from commit 4e1d704243a4f3c4ded47cd0d02427bb7efef069)

Daily bump.

aarch64: update Ampere-1 core definition

This brings the extensions detected by -mcpu=native on Ampere-1 systems
in sync with the defaults generated for -mcpu=ampere1.

Note that some early kernel versions on Ampere1 may misreport the
presence of PAUTH and PREDRES (i.e., -mcpu=native will add 'nopauth'
and 'nopredres').

gcc/ChangeLog:

* config/aarch64/aarch64-cores.def (AARCH64_CORE): Update
Ampere-1 core entry.

(cherry picked from commit db2f5d661239737157cf131de7d4df1c17d8d88d)

aarch64: fix off-by-one in reading cpuinfo

Fixes: 341573406b39
Don't subtract one from the result of strnlen() when trying to point
to the first character after the current string. This issue would
cause individual characters (where the 128 byte buffers are stitched
together) to be lost.
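
A minimal sketch of the stitching technique, assuming a line is read with
fgets in 128-byte steps into a growing buffer. 'read_long_line' is a
hypothetical helper, not the actual 'readline' from driver-aarch64.cc.
The point is that the offset for the next chunk must land on the
terminating NUL of the previous one, so nothing is subtracted from the
strnlen result.

```cpp
#include <cassert>
#include <cstdio>
#include <string.h>   // strnlen (POSIX)
#include <string>
#include <vector>

// Hypothetical helper: read one line of arbitrary length from F.
std::string read_long_line (std::FILE *f)
{
  assert (f != nullptr);
  const std::size_t chunk = 128;
  std::vector<char> buf (chunk, '\0');
  std::size_t last = 0;   // Offset at which the next chunk is written.

  while (std::fgets (buf.data () + last, (int) (buf.size () - last), f))
    {
      // Advance onto the terminating NUL of the text just read.
      // Subtracting one here would let the next chunk overwrite the
      // final character of this one.
      last += strnlen (buf.data () + last, buf.size () - last);
      if (last > 0 && buf[last - 1] == '\n')
        {
          --last;   // Drop the newline and stop.
          break;
        }
      buf.resize (buf.size () + chunk, '\0');
    }
  return std::string (buf.data (), last);
}
```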

gcc/ChangeLog:

* config/aarch64/driver-aarch64.cc (readline): Fix off-by-one.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/cpunative/info_18: New test.
* gcc.target/aarch64/cpunative/native_cpu_18.c: New test.

(cherry picked from commit b1cfbccc41de6aec950c0f662e7e85ab34bfff8a)

Handle operator new with alignment in usm transform.

Since C++17, there is a variant of operator new with alignment. This
patch converts it to omp_aligned_alloc when unified shared memory is
being used.
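
For reference, a sketch of the C++17 call form in question. The helper
name 'aligned_new_is_aligned' is hypothetical, and the example does not
use OpenMP itself; it only shows the aligned allocation and the matching
aligned deallocation that user code would contain.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical helper: allocate SIZE bytes with the aligned form of
// operator new and report whether the result honours ALIGN.
bool aligned_new_is_aligned (std::size_t size, std::size_t align)
{
  // The alignment must be a non-zero power of two.
  assert (align != 0 && (align & (align - 1)) == 0);
  void *p = ::operator new (size, std::align_val_t (align));
  bool ok = reinterpret_cast<std::uintptr_t> (p) % align == 0;
  // Memory from the aligned form must be released with the aligned form.
  ::operator delete (p, std::align_val_t (align));
  return ok;
}
```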

gcc/ChangeLog:

* omp-low.cc (usm_transform): Handle operator new with alignment.

libgomp/ChangeLog:

* testsuite/libgomp.c++/usm-2.C: New test.

gcc/testsuite/ChangeLog:

* g++.dg/gomp/usm-4.C: New test.
* g++.dg/gomp/usm-5.C: New test.

Daily bump.

OpenACC: Correction of reduction enhancement

Commit bce2c92cfec2ae1eb9d79e36dff5a220b688bfa1 "Various OpenACC reduction
enhancements - ME and nvptx changes" introduced several regressions:

        gcc/testsuite/c-c++-common/goacc/nested-reductions-1-routine.c
        gcc/testsuite/c-c++-common/goacc/nested-reductions-2-routine.c
        gcc/testsuite/c-c++-common/goacc/orphan-reductions-2.c
        gcc/testsuite/gfortran.dg/goacc/nested-reductions-1-routine.f90
        gcc/testsuite/gfortran.dg/goacc/nested-reductions-2-routine.f90
        gcc/testsuite/gfortran.dg/goacc/orphan-reductions-2.f90

This fixes the above regressions.

gcc/ChangeLog:

        * omp-offload.cc (oacc_loop_auto_partitions): Removed OLF reduction
        handling.

Relax assertion in profiler

This assertion in branch_prob:

  if (bb == ENTRY_BLOCK_PTR_FOR_FN (cfun)->next_bb)
    {
      location_t loc = DECL_SOURCE_LOCATION (current_function_decl);
      gcc_checking_assert (!RESERVED_LOCATION_P (loc));

had been correct until the fix for PR debug/101598 was installed.

gcc/
* profile.cc (branch_prob): Be prepared for ignored functions with
DECL_SOURCE_LOCATION set to UNKNOWN_LOCATION.

gcc/testsuite:
* gnat.dg/specs/coverage1.ads: New test.
* gnat.dg/specs/variant_part.ads: Minor tweak.
* gnat.dg/specs/weak1.ads: Add dg directive.

IBM zSystems: Fix function_ok_for_sibcall [PR106355]

For a parameter with BLKmode we cannot use REG_NREGS in order to
determine the number of consecutive registers. Streamlined this with
the implementation of s390_function_arg.

Fix some indentation whitespace, too.

gcc/ChangeLog:

PR target/106355
* config/s390/s390.cc (s390_call_saved_register_used): For a
parameter with BLKmode fix determining number of consecutive
registers.

gcc/testsuite/ChangeLog:

* gcc.target/s390/pr106355.h: Common code for new tests.
* gcc.target/s390/pr106355-1.c: New test.
* gcc.target/s390/pr106355-2.c: New test.
* gcc.target/s390/pr106355-3.c: New test.

(cherry picked from commit cb994acc08b67f26a54e7c5dc1f4995a2ce24d98)

i386: fix pedantic warning

PR target/107364

gcc/ChangeLog:

* common/config/i386/i386-cpuinfo.h (enum processor_vendor):
Fix pedantic warning.

(cherry picked from commit f3f000b7689ce9eb6364808072025672af1e4e1b)

x86: fix VENDOR_MAX enum value

PR target/107364

gcc/ChangeLog:

* common/config/i386/i386-cpuinfo.h (enum processor_vendor):
Reorder enum values as BUILTIN_VENDOR_MAX should not point
in the middle of the valid enum values.

(cherry picked from commit f751bf4c5d1aaa1aacfcbdec62881c5ea1175dfb)

Daily bump.

libgomp/nvptx: Prepare for reverse-offload callback handling, resolve spurious SIGSEGVs

Per commit r13-3460-g131d18e928a3ea1ab2d3bf61aa92d68a8a254609
"libgomp/nvptx: Prepare for reverse-offload callback handling",
I'm seeing a lot of libgomp execution test regressions.  Random
example, 'libgomp.c-c++-common/error-1.c':

    [...]
      GOMP_OFFLOAD_run: kernel main$_omp_fn$0: launch [(teams: 1), 1, 1] [(lanes: 32), (threads: 8), 1]

    Thread 1 "a.out" received signal SIGSEGV, Segmentation fault.
    0x00007ffff793b87d in GOMP_OFFLOAD_run (ord=<optimized out>, tgt_fn=<optimized out>, tgt_vars=<optimized out>, args=<optimized out>) at [...]/source-gcc/libgomp/plugin/plugin-nvptx.c:2127
    2127            if (__atomic_load_n (&ptx_dev->rev_data->fn, __ATOMIC_ACQUIRE) != 0)
    (gdb) print ptx_dev
    $1 = (struct ptx_device *) 0x6a55a0
    (gdb) print ptx_dev->rev_data
    $2 = (struct rev_offload *) 0xffffffff00000000
    (gdb) print ptx_dev->rev_data->fn
    Cannot access memory at address 0xffffffff00000000

libgomp/
* plugin/plugin-nvptx.c (nvptx_open_device): Initialize
'ptx_dev->rev_data'.

(cherry picked from commit 205538832b7033699047900cf25928f5920d8b93)

vect: WORKAROUND vectorizer bug

This patch disables vectorization of memory accesses to non-default address
spaces where the pointer size is different to the usual pointer size. This
condition typically occurs in OpenACC programs on amdgcn, where LDS memory is
used for broadcasting gang-private variables between threads. In particular,
see libgomp.oacc-c-c++-common/private-variables.c

The problem is that the address space information is dropped from the various
types in the middle-end and eventually it triggers an ICE trying to do an
address conversion. That ICE can be avoided by defining
POINTERS_EXTEND_UNSIGNED, but that just produces wrong RTL code later on.

A correct solution would ensure that all the vectypes have the correct address
spaces, but I don't have time for that right now.

gcc/ChangeLog:

* tree-vect-data-refs.cc (vect_analyze_data_refs): Workaround an
address-space bug.

amdgcn: disallow USM on gfx908

It does work, but not well and only with the amdgpu.noreply=0 kernel boot
option.

gcc/ChangeLog:

* config/gcn/gcn.cc (gcn_init_cumulative_args): Disallow gfx908.

amdgcn, libgomp: USM allocation update

Allocate Unified Shared Memory via malloc and hsa_amd_svm_attributes_set,
instead of hsa_allocate_memory. This scheme should be more efficient
for memory that is first accessed by the CPU.

libgomp/ChangeLog:

* plugin/plugin-gcn.c (HSA_AMD_SYSTEM_INFO_SVM_SUPPORTED): New.
(HSA_SYSTEM_INFO_SVM_ACCESSIBLE_BY_DEFAULT): New.
(HSA_AMD_SVM_ATTRIB_GLOBAL_FLAG): New.
(HSA_AMD_SVM_GLOBAL_FLAG_COARSE_GRAINED): New.
(hsa_amd_svm_attribute_pair_t): New.
(struct hsa_runtime_fn_info): Add hsa_amd_svm_attributes_set_fn.
(dump_hsa_system_info): Dump HSA_AMD_SYSTEM_INFO_SVM_SUPPORTED and
HSA_SYSTEM_INFO_SVM_ACCESSIBLE_BY_DEFAULT.
(DLSYM_OPT_FN): New.
(init_hsa_runtime_functions): Add hsa_amd_svm_attributes_set.
(GOMP_OFFLOAD_usm_alloc): Use malloc and hsa_amd_svm_attributes_set.
(GOMP_OFFLOAD_usm_free): Use regular free.
* testsuite/libgomp.c/usm-1.c: Add -mxnack=on for amdgcn.
* testsuite/libgomp.c/usm-2.c: Likewise.
* testsuite/libgomp.c/usm-3.c: Likewise.
* testsuite/libgomp.c/usm-4.c: Likewise.

libgomp/nvptx: Prepare for reverse-offload callback handling

This patch adds a stub 'gomp_target_rev' in the host's target.c, which will
later handle the reverse offload.
For nvptx, it adds support for forwarding the offload gomp_target_ext call
to the host by setting values in a struct on the device and querying it on
the host - invoking gomp_target_rev on the result.

include/ChangeLog:

* cuda/cuda.h (enum CUdevice_attribute): Add
CU_DEVICE_ATTRIBUTE_UNIFIED_ADDRESSING.
(CU_MEMHOSTALLOC_DEVICEMAP): Define.
(cuMemHostAlloc): Add prototype.

libgomp/ChangeLog:

* config/nvptx/icv-device.c (GOMP_DEVICE_NUM_VAR): Remove
'static' for this variable.
* config/nvptx/libgomp-nvptx.h: New file.
* config/nvptx/target.c: Include it.
(GOMP_ADDITIONAL_ICVS): Declare extern var.
(GOMP_REV_OFFLOAD_VAR): Declare var.
(GOMP_target_ext): Handle reverse offload.
* libgomp-plugin.h (GOMP_PLUGIN_target_rev): New prototype.
* libgomp-plugin.c (GOMP_PLUGIN_target_rev): New, call ...
* target.c (gomp_target_rev): ... this new stub function.
* libgomp.h (gomp_target_rev): Declare.
* libgomp.map (GOMP_PLUGIN_1.4): New; add GOMP_PLUGIN_target_rev.
* plugin/cuda-lib.def (cuMemHostAlloc): Add.
* plugin/plugin-nvptx.c: Include libgomp-nvptx.h.
(struct ptx_device): Add rev_data member.
(nvptx_open_device): Remove async_engines query, last used in
r10-304-g1f4c5b9b; add unified-address assert check.
(GOMP_OFFLOAD_get_num_devices): Claim unified address
support.
(GOMP_OFFLOAD_load_image): Free rev_fn_table if no
offload functions exist. Make offload var available
on host and device.
(rev_off_dev_to_host_cpy, rev_off_host_to_dev_cpy): New.
(GOMP_OFFLOAD_run): Handle reverse offload.

(cherry picked from commit 052dfa279f5de90b324d60cf787e821b18cf496c)

gcc/testsuite: Change 'cunrolli' to 'cunrolli1' in dump scan + options

The OG12 commit
  3e8b51d143e openacc: Move pass_oacc_device_lower after pass_graphite
adds a new pass, which also re-invokes some previous passes. This
seems to have the effect that the pass names and, hence, the dump
names now have a trailing number.

In that commit, 'cunrolli' was changed to 'cunrolli1' for some testcases.
This commit does likewise for some more testcases. In particular,
. in scan-tree-dump the trailing '1' is crucial to change UNRESOLVED
  to PASS.
. for "-fdisable-tree-cunrolli1" option, it changes FAIL (excess errors)
  to PASS
. even without the change, "-fdump-tree-cunrolli{-details,-optimized}"
  has PASS, but I believe the trailing 1 ensures that only the first
  'cunrolli' dumps.

gcc/testsuite
* g++.dg/ext/unroll-1.C: Change 'cunrolli' to 'cunrolli1' in
dg-options and scan-tree-dump.
* g++.dg/ext/unroll-2.C: Likewise.
* g++.dg/ext/unroll-3.C: Likewise.
* g++.dg/vect/pr36648.cc: Likewise.
* gcc.dg/tree-prof/init-array.c: Likewise.
* gcc.dg/tree-ssa/pr100359.c: Likewise.
* gcc.dg/tree-ssa/pr59597.c: Likewise.
* gcc.dg/unroll-2.c: Likewise.
* gfortran.dg/directive_unroll_1.f90: Likewise.
* gfortran.dg/directive_unroll_4.f90: Likewise.
* gnat.dg/unroll1.adb: Likewise.
* gnat.dg/unroll2.adb: Likewise.

OpenMP: Fix reverse offload GOMP_TARGET_REV IFN corner cases [PR107236]

For 'target parallel' and similarly nested directives, cgraph_node's
calls_declare_variant_alt was not set in the parent region node but in
cfun->decl. Hence, pass_omp_device_lower did not handle the
internal function GOMP_TARGET_REV. The solution is to set it in the
DECL_CONTEXT, which is set in adjust_context_and_scope.

The cgraph_node::create_clone issue is exposed with -O2 for the existing
libgomp.fortran/reverse-offload-1.f90.

PR middle-end/107236

gcc/ChangeLog:
* omp-expand.cc (expand_omp_target): Set calls_declare_variant_alt
in DECL_CONTEXT and not to cfun->decl.
* cgraphclones.cc (cgraph_node::create_clone): Copy also the
node's calls_declare_variant_alt value.

gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/target-device-ancestor-6.f90: New test.

(cherry picked from commit 178ac530fe67e4f2fc439cc4ce89bc19d571ca31)

Merge branch 'releases/gcc-12' into devel/omp/gcc-12

Merge up to r12-8861-g1ccec25cf0c3c9cfd5882c83fd8cc56ea2987bad (24th Oct 2022)

Missing pr104517.c change from: 'Add a restriction on allocate clause (OpenMP 5.0)'

OG12 commit df47c25110474565f521508a1545232550052a75 included everything
of r13-150-g1a8c4d9ed36556a95bd7d53c04d2ec4c95594061 but the change to
gcc/testsuite/gcc.dg/gomp/pr104517.c

This commit cherry-picks the missing changes to that file.
Note: The OG12 commit already contained the ChangeLog.omp entry for
that file.

gcc/testsuite/
* gcc.dg/gomp/pr104517.c: Update.

(cherry picked from commit 1a8c4d9ed36556a95bd7d53c04d2ec4c95594061)

Daily bump.

Fortran: error recovery with references of bad array constructors [PR105633]

gcc/fortran/ChangeLog:

PR fortran/105633
* expr.cc (find_array_section): Move check for NULL pointers so
that both subscript triplets and vector subscripts are covered.

gcc/testsuite/ChangeLog:

PR fortran/105633
* gfortran.dg/pr105633.f90: New test.

Co-authored-by: Steven G. Kargl <kargl@gcc.gnu.org>
(cherry picked from commit ecb20df4fa6d99daa635c7fb662dc0554610777e)

Daily bump.

omp-oacc-kernels-decompose.cc: fix -fcompare-debug with GIMPLE_DEBUG

GIMPLE_DEBUG statements were put in a parallel region of their own, which
is not only pointless but also breaks -fcompare-debug. With this commit,
they are handled like simple assignments: they are placed
into the same body as the loop such that only one parallel region
remains, as without debugging. This fixes the existing testcase
libgomp.oacc-c-c++-common/kernels-loop-g.c.

Note: GIMPLE_DEBUG are only accepted with -fcompare-debug; if they
appear otherwise, decompose_kernels_region_body rejects them with
a sorry (unchanged).

gcc/
* omp-oacc-kernels-decompose.cc (top_level_omp_for_in_stmt,
decompose_kernels_region_body): Handle GIMPLE_DEBUG like
simple assignment.

Add 'libgomp.oacc-c-c++-common/private-big-1.c' [PR105421]

After commit r13-3404-g7c55755d4c760de326809636531478fd7419e1e5
"amdgcn: Use FLAT addressing for all functions with pointer arguments [PR105421]",
"big" private data now works for GCN offloading, too.

PR target/105421
libgomp/
* testsuite/libgomp.oacc-c-c++-common/private-big-1.c: New.

(cherry picked from commit c7ebee2378426eeca425ca5406af213a926f154c)