git.ipfire.org Git - thirdparty/gcc.git/log

Fortran version of libgomp.c-c++-common/icv-{3,4}.c

This adds the Fortran testsuite coverage of
omp_{get_max,set_num}_threads and omp_{s,g}et_teams_thread_limit

libgomp/
* testsuite/libgomp.fortran/icv-3.f90: New.
* testsuite/libgomp.fortran/icv-4.f90: New.

(cherry picked from commit f5a538e1647ae67cf204c5c3b1bd9cca5224dfd1)

Fortran: Various CLASS + assumed-rank fixed [PR102541]

Starting point was PR102541, were a previous patch caused an invalid
e->ref access for class. When testing, it turned out that for
CLASS to CLASS the code was never executed - additionally, issues
appeared for optional and a bogus error for -fcheck=all. In particular:

There were a bunch of issues related to optional CLASS, can have the
'attr.dummy' set in CLASS_DATA (sym) - but sometimes also in 'sym'!?!
Additionally, gfc_variable_attr could return pointer = 1 for nonpointers
when the expr is no longer "var" but "var%_data".

PR fortran/102541

gcc/fortran/ChangeLog:

* check.c (gfc_check_present): Handle optional CLASS.
* interface.c (gfc_compare_actual_formal): Likewise.
* trans-array.c (gfc_trans_g77_array): Likewise.
* trans-decl.c (gfc_build_dummy_array_decl): Likewise.
* trans-types.c (gfc_sym_type): Likewise.
* primary.c (gfc_variable_attr): Fixes for dummy and
pointer when 'class%_data' is passed.
* trans-expr.c (set_dtype_for_unallocated, gfc_conv_procedure_call):
For assumed-rank dummy, fix setting rank for dealloc/notassoc actual
and setting ubound to -1 for assumed-size actuals.

gcc/testsuite/ChangeLog:

* gfortran.dg/assumed_rank_24.f90: New test.

(cherry picked from commit eb92cd57a1ebe7cd7589bdbec34d9ae337752ead)

openmp: Avoid calling clear_type_padding_in_mask in the common case where there can't be any padding

We can use the clear_padding_type_may_have_padding_p function, which
is conservative for e.g. RECORD_TYPE/UNION_TYPE, but for the floating and
complex floating types is accurate. clear_type_padding_in_mask is
more expensive because we need to allocate memory, fill it, call the function
which itself is more expensive and then analyze the memory, so for the
common case of float/double atomics or even long double on most targets
we can avoid that.

2021-10-12 Jakub Jelinek <jakub@redhat.com>

gcc/
* gimple-fold.h (clear_padding_type_may_have_padding_p): Declare.
* gimple-fold.c (clear_padding_type_may_have_padding_p): No longer
static.
gcc/c-family/
* c-omp.c (c_finish_omp_atomic): Use
clear_padding_type_may_have_padding_p.

(cherry picked from commit 8e1fe3f779185cc678493ceda42c2e620a5c1387)

openmp: Add documentation for omp_{get_max, set_num}_threads and omp_{s, g}et_teams_thread_limit

This patch adds documentation for these new OpenMP 5.1 APIs as well as
two new environment variables - OMP_NUM_TEAMS and OMP_TEAMS_THREAD_LIMIT.

2021-10-12 Jakub Jelinek <jakub@redhat.com>

* libgomp.texi (omp_get_max_teams, omp_get_teams_thread_limit,
omp_set_num_teams, omp_set_teams_thread_limit, OMP_NUM_TEAMS,
OMP_TEAMS_THREAD_LIMIT): Document.

(cherry picked from commit 4096bf82a0cda5f79d7e5921b73897f76e00a800)

openmp: Fix up warnings on libgomp.info build

When building libgomp documentation, I see
makeinfo --split-size=5000000  -I ../../../libgomp/../gcc/doc/include -I ../../../libgomp -o libgomp.info ../../../libgomp/libgomp.texi
../../../libgomp/libgomp.texi:503: warning: node next `omp_get_default_device' in menu `omp_get_device_num' and in sectioning `omp_get_dynamic' differ
../../../libgomp/libgomp.texi:528: warning: node prev `omp_get_dynamic' in menu `omp_get_device_num' and in sectioning `omp_get_default_device' differ
../../../libgomp/libgomp.texi:560: warning: node next `omp_get_initial_device' in menu `omp_get_level' and in sectioning `omp_get_device_num' differ
../../../libgomp/libgomp.texi:587: warning: node next `omp_get_device_num' in menu `omp_get_dynamic' and in sectioning `omp_get_level' differ
../../../libgomp/libgomp.texi:587: warning: node prev `omp_get_device_num' in menu `omp_get_default_device' and in sectioning `omp_get_initial_device' differ
../../../libgomp/libgomp.texi:615: warning: node prev `omp_get_level' in menu `omp_get_initial_device' and in sectioning `omp_get_device_num' differ
warnings.  This patch fixes those.

2021-10-12  Jakub Jelinek  <jakub@redhat.com>

* libgomp.texi (omp_get_device_num): Move @node before omp_get_dynamic
to avoid makeinfo warnings.

(cherry picked from commit de7fa7063e99d29b6cc2024a90a3755a1001a154)

openmp: Add testsuite coverage for omp_{get_max,set_num}_threads and omp_{s,g}et_teams_thread_limit

This adds (C/C++ only) testsuite coverage for these new OpenMP 5.1 APIs.

2021-10-12 Jakub Jelinek <jakub@redhat.com>

* testsuite/libgomp.c-c++-common/icv-3.c: New test.
* testsuite/libgomp.c-c++-common/icv-4.c: New test.

(cherry picked from commit 88f5ad524a152711b7345b6bee2e06c5af0e88bc)

libgomp: alloc* test fixes [PR102628, PR102668]

As reported, the alloc-9.c test and alloc-{1,2,3}.F* and alloc-11.f90
tests fail on powerpc64-linux with -m32.
The reason why it fails just there is that malloc doesn't guarantee there
128-bit alignment (historically glibc guaranteed 2 * sizeof (void *)
alignment from malloc).

There are two separate issues.
One is a thinko on my side.
In this part of alloc-9.c test (copied to alloc-11.f90), we have
2 allocators, a with pool size 1024B and alignment 16B and default fallback
and a2 with pool size 512B and alignment 32B and a as fallback allocator.
We start at no allocations in both at line 194 and do:
  p = (int *) omp_alloc (sizeof (int), a2);
// This succeeds in a2 and needs 4+overhead bytes (which includes the 32B alignment)
  p = (int *) omp_realloc (p, 420, a, a2);
// This allocates 420 bytes+overhead in a, with 16B alignment and deallocates the above
  q = (int *) omp_alloc (sizeof (int), a);
// This allocates 4+overhead bytes in a, with 16B alignment
  q = (int *) omp_realloc (q, 420, a2, a);
// This allocates 420+overhead in a2 with 32B alignment
  q = (int *) omp_realloc (q, 768, a2, a2);
// This attempts to reallocate, but as there are elevated alignment
// requirements doesn't try to just realloc (even if it wanted to try that
// a2 is almost full, with 512-420-overhead bytes left in it), so it
// tries to alloc in a2, but there is no space left in the pool, falls
// back to a, which already has 420+overhead bytes allocated in it and
// 1024-420-overhead bytes left and so fails too and fails to default
// non-pool allocator that allocates it, but doesn't guarantee alignment
// higher than malloc guarantees.
// But, the test expected 16B alignment.

So, I've slightly lowered the allocation sizes in that part of the test
420->320 and 768 -> 568, so that the last test still fails to allocate
in a2 (568 > 512-320-overhead) but succeeds in a as fallback, which was
the intent of the test.

Another thing is that alloc-1.F90 seems to be transcription of
libgomp.c-c++-common/alloc-1.c into Fortran, but alloc-1.c had:
  q = (int *) omp_alloc (768, a2);
  if ((((uintptr_t) q) % 16) != 0)
    abort ();
  q[0] = 7;
  q[767 / sizeof (int)] = 8;
  r = (int *) omp_alloc (512, a2);
  if ((((uintptr_t) r) % __alignof (int)) != 0)
    abort ();
there but Fortran has:
        cq = omp_alloc (768_c_size_t, a2)
        if (mod (transfer (cq, intptr), 16_c_intptr_t) /= 0) stop 12
        call c_f_pointer (cq, q, [768 / c_sizeof (i)])
        q(1) = 7
        q(768 / c_sizeof (i)) = 8
        cr = omp_alloc (512_c_size_t, a2)
        if (mod (transfer (cr, intptr), 16_c_intptr_t) /= 0) stop 13
I'm changing the latter to 4_c_intptr_t because other spots in the
testcase do that, Fortran sadly doesn't have c_alignof, but strictly
speaking it isn't correct, __alignof (int) could be on some architectures
smaller than 4.
So probably alloc-1.F90 etc. should also have
! { dg-additional-sources alloc-7.c }
! { dg-prune-output "command-line option '-fintrinsic-modules-path=.*' is valid for Fortran but not for C" }
and use get__alignof_int.

2021-10-12  Jakub Jelinek  <jakub@redhat.com>

PR libgomp/102628
PR libgomp/102668
* testsuite/libgomp.c-c++-common/alloc-9.c (main): Decrease
allocation sizes from 420 to 320 and from 768 to 568.
* testsuite/libgomp.fortran/alloc-11.f90: Likewise.
* testsuite/libgomp.fortran/alloc-1.F90: Change expected alignment
for cr from 16 to 4.

(cherry picked from commit 342aedf0e5f324cc2fb026466a8cc5cc7f839183)

gfortran.dg/gomp/defaultmap-2.f90: Use dg-message not -dg-note

gcc/testsuite/
* gfortran.dg/gomp/defaultmap-2.f90: Replace unsupported
dg-note by generic dg-message.

libgomp: Add tests for omp_atv_serialized and deprecate omp_atv_sequential.

The variable omp_atv_sequential was replaced by omp_atv_serialized in OpenMP
5.1. This was already implemented by Jakub (C/C++, commit ea82325afec) and
Tobias (Fortran, commit fff15bad1ab).

This patch adds two tests to check if omp_atv_serialized is available (one test
for C/C++ and one for Fortran). Besides that omp_atv_sequential is marked as
deprecated in C/C++ and Fortran for OpenMP 5.1.

libgomp/ChangeLog:

* allocator.c (omp_init_allocator): Replace omp_atv_sequential with
omp_atv_serialized.
* omp.h.in: Add deprecated flag for omp_atv_sequential.
* omp_lib.f90.in: Add deprecated flag for omp_atv_sequential.
* testsuite/libgomp.c-c++-common/alloc-10.c: New test.
* testsuite/libgomp.fortran/alloc-12.f90: New test.

(cherry picked from commit f70977936a306e2fb4140361ba47bf5d5cc0a47d)

Merge remote-tracking branch 'origin/releases/gcc-11' into devel/omp/gcc-11

Merge up to r11-9094-gb3dfc8635d26b9c3cbe7731dd7b8be8a2242eab9 (Oct 11, 2021)

openmp: Add omp_set_num_teams, omp_get_max_teams, omp_[gs]et_teams_thread_limit

OpenMP 5.1 adds env vars and functions to set and query new ICVs used
as fallback if thread_limit or num_teams clauses aren't specified on
teams construct.

The following patch implements those, though further work will be needed:
1) OpenMP 5.1 also changed the num_teams clause, so that it can specify
   both lower and upper limit for how many teams should be created and
   changed the meaning when only one expression is provided, instead of
   num_teams(expr) in 5.0 meaning num_teams(1:expr) in 5.1, it now means
   num_teams(expr:expr), i.e. while previously we could create 1 to expr
   teams, in 5.1 we have some low limit by default equal to the single
   expression provided and may not create fewer teams.
   For host teams (which we don't currently implement efficiently for
   NUMA hosts) we trivially satisfy it now by always honoring what the
   user asked for, but for the offloading teams I think we'll need to
   rethink the APIs; currently teams construct is just a call that returns
   and possibly lowers the number of teams; and whenever possible we try
   to evaluate num_teams/thread_limit already on the target construct
   and the GOMP_teams call just sets the number of teams to the minimum
   of provided and requested teams; for some cases e.g. where target
   is not combined with teams and num_teams expression calls some functions
   etc., we need to call those functions in the target region and so it is
   late to figure number of teams, but also hw could just limit what it
   is willing to create; in that case I'm afraid we need to run the target
   body multiple times and arrange for omp_get_team_num () returning the
   right values
2) we need to finally implement the NUMA handling for GOMP_teams_reg
3) I now realize I haven't added some testcase coverage, will do that
   incrementally
4) libgomp.texi needs updates for these new APIs, but also others like
   the allocator

2021-10-11  Jakub Jelinek  <jakub@redhat.com>

gcc/
* omp-low.c (omp_runtime_api_call): Handle omp_get_max_teams,
omp_[sg]et_teams_thread_limit and omp_set_num_teams.
libgomp/
* omp.h.in (omp_set_num_teams, omp_get_max_teams,
omp_set_teams_thread_limit, omp_get_teams_thread_limit): Declare.
* omp_lib.f90.in (omp_set_num_teams, omp_get_max_teams,
omp_set_teams_thread_limit, omp_get_teams_thread_limit): Declare.
* omp_lib.h.in (omp_set_num_teams, omp_get_max_teams,
omp_set_teams_thread_limit, omp_get_teams_thread_limit): Declare.
* libgomp.h (gomp_nteams_var, gomp_teams_thread_limit_var): Declare.
* libgomp.map (OMP_5.1): Export omp_get_max_teams{,_},
omp_get_teams_thread_limit{,_}, omp_set_num_teams{,_,_8_} and
omp_set_teams_thread_limit{,_,_8_}.
* icv.c (omp_set_num_teams, omp_get_max_teams,
omp_set_teams_thread_limit, omp_get_teams_thread_limit): New
functions.
* env.c (gomp_nteams_var, gomp_teams_thread_limit_var): Define.
(omp_display_env): Print OMP_NUM_TEAMS and OMP_TEAMS_THREAD_LIMIT.
(initialize_env): Handle OMP_NUM_TEAMS and OMP_TEAMS_THREAD_LIMIT env
vars.
* teams.c (GOMP_teams_reg): If thread_limit is not specified, use
gomp_teams_thread_limit_var as fallback if not zero.  If num_teams
is not specified, use gomp_nteams_var.
* fortran.c (omp_set_num_teams, omp_get_max_teams,
omp_set_teams_thread_limit, omp_get_teams_thread_limit): Add
ialias_redirect.
(omp_set_num_teams_, omp_set_num_teams_8_, omp_get_max_teams_,
omp_set_teams_thread_limit_, omp_set_teams_thread_limit_8_,
omp_get_teams_thread_limit_): New functions.

(cherry picked from commit 07dd3bcda17f97cf5476c3d6f2f2501c1e0712e6)

Daily bump.

var-tracking: Fix a wrong-debug issue caused by my r10-7665 var-tracking change [PR102441]

Since my r10-7665-g33c45e51b4914008064d9b77f2c1fc0eea1ad060 change, we get
wrong-debug on e.g. the following testcase at -O2 -g on x86_64-linux for the
x parameter:
void bar (int *r);
int
foo (int x)
{
  int r = 0;
  bar (&r);
  return r;
}
At the start of function, we have
        subq    $24, %rsp
        leaq    12(%rsp), %rdi
instructions.  The x parameter is passed in %rdi, but isn't used in the
function and so the leaq instruction overwrites %rdi without remembering
%rdi anywhere.  Before the r10-7665 change (which was trying to fix a large
(3% for 32-bit, 1% for 64-bit x86-64) debug info/loc growth introduced with
r10-7515), the leaq insn above resulted in a MO_VAL_SET micro-operation that
said that the value of sp + 12, a cselib_sp_derived_value_p, is stored into
the %rdi register.  The r10-7665 change added a change to add_stores that
added no micro-operation for the leaq store, with the rationale that the sp
based values can be and will be always computable some other more compact
and primarily more stable way (cfa based expression like DW_OP_fbreg, that
is the same in the whole function).  That is true.  But by throwing the
micro-operation on the floor, we miss another important part of the
MO_VAL_SET, in particular that the destination of the store, %rdi in this
case, now has a different value from what it had before, so the vt_*
dataflow code thinks that even after the leaq instruction %rdi still holds
the x argument value (and changes it to DW_OP_entry_value (%rdi) only in the
middle of the call to bar).  Previously and with the patches below,
the location for x changes already at the end of leaq instruction to
DW_OP_entry_value (%rdi).

My first attempt to fix this was instead of dropping the MO_VAL_SET add
a MO_CLOBBER operation:
--- gcc/var-tracking.c.jj       2021-05-04 21:02:24.196799586 +0200
+++ gcc/var-tracking.c  2021-09-24 19:23:16.420154828 +0200
@@ -6133,7 +6133,9 @@ add_stores (rtx loc, const_rtx expr, voi
     {
       if (preserve)
        preserve_value (v);
-      return;
+      mo.type = MO_CLOBBER;
+      mo.u.loc = loc;
+      goto log_and_return;
     }

   nloc = replace_expr_with_values (oloc);
so don't track that the value lives in the loc destination, but track
that the previous value doesn't live there anymore.  That failed bootstrap
miserably, the vt_* code isn't prepared to see MO_CLOBBER of a MEM that
isn't tracked (e.g. has MEM_EXPR on it that the var-tracking code wants
to track, i.e. track_p in add_stores).  On the other side, thinking about
it more, in the most common case where a cselib_sp_derived_value_p value
is stored into the sp register (and which is the reason why PR94495
testcase got larger), dropping the micro-operation on the floor is the
right thing, because we have that cselib_sp_derived_value_p tracking, any
reads from the sp hard register will be treated as
cselib_sp_derived_value_p.
Then I've tried 3 different patches described below and in the end
what is committed is patch2.
Additionally, I've gathered statistics from cc1plus by always reverting the
var-tracking.c change after finished bootstrap/regtest and rebuilding the
stage3 var-tracking.o and cc1plus, such that it would be comparable.
dwlocstat and .debug_{info,loclists} section sizes detailed below.
patch3 uses MO_VAL_SET (i.e. essentially reversion of the r10-7665
change) when destination is not a REG_P and !track_p, otherwise if
destination is sp drops the micro-operation on the floor (i.e. no change),
otherwise adds a MO_CLOBBER.
patch1 is similar, except it checks for destination not equal to sp and
!track_p, i.e. for !track_p REG_P destinations other than sp it will use
MO_VAL_SET rather than MO_CLOBBER.
Finally, patch2, the shortest patch, uses MO_VAL_SET whenever destination
is not sp and otherwise drops the micro-operation on the floor.
All the 3 patches don't affect the PR94495 testcase, all the changes
there were caused by stores of sp based values into %rsp.

While the patch2 (and patch1 which results in exactly the same sizes)
causes the largest debug loclists/info growth from the 3, it is still quite
minor (0.651% on 64-bit and 0.114% on 32-bit) compared
to the 1% and 3% PR94495 was trying to solve, and I actually think it is the
best thing to do.  Because, if we have say
  int q[10];
  int *p = &q[0];
or similar and we load the &q[0] sp based value into some hard register,
by noting in the debug info that p lives in some hard reg for some part
of the function and a user is trying to change the p var in the debugger,
if we say it lives in some register or memory, there is some chance that
the changing of the value could work successfully (of course, nothing
is guaranteed, we don't have tracking of where each var lives at which
moment for changing purposes (i.e. what register, memory or else you need
to change in order to change behavior of the code)), while if we just say
that p's location is DW_OP_fbreg 16 DW_OP_stack_value, that is a read-only
value one can just print but not change.  Now, for stores of variable
values into the sp register, I don't think we have such an issue, you don't
want debugger to change your stack pointer when user asks to change value
of some variable whose value lives in the stack pointer, that would pretty
much always result in misbehavior of the program.
So, my preference from these 3 is patch2 and that is being committed.

64-bit cc1plus
==============
vanilla
cov%    samples cumul
0..10   1064665/37%     1064665/37%
11..20  35972/1%        1100637/38%
21..30  47969/1%        1148606/40%
31..40  45787/1%        1194393/42%
41..50  57529/2%        1251922/44%
51..60  53974/1%        1305896/46%
61..70  112055/3%       1417951/50%
71..80  79420/2%        1497371/52%
81..90  126225/4%       1623596/57%
91..100 1206682/42%     2830278/100%
  [34] .debug_info       PROGBITS        0000000000000000 2f1c74c a44949f 00      0   0  1
  [38] .debug_loclists   PROGBITS        0000000000000000 ff5d046 506e947 00      0   0  1
patch1 (same as patch2)
cov%    samples cumul
0..10   1064685/37%     1064685/37%
11..20  36011/1%        1100696/38%
21..30  47975/1%        1148671/40%
31..40  45799/1%        1194470/42%
41..50  57566/2%        1252036/44%
51..60  54011/1%        1306047/46%
61..70  112068/3%       1418115/50%
71..80  79421/2%        1497536/52%
81..90  126171/4%       1623707/57%
91..100 1206571/42%     2830278/100%
  [34] .debug_info       PROGBITS        0000000000000000 2f1c74c a448f27 00      0   0  1
  [38] .debug_loclists   PROGBITS        0000000000000000 ff608bc 52070dd 00      0   0  1
patch3
cov%    samples cumul
0..10   1064698/37%     1064698/37%
11..20  36018/1%        1100716/38%
21..30  47977/1%        1148693/40%
31..40  45804/1%        1194497/42%
41..50  57562/2%        1252059/44%
51..60  54018/1%        1306077/46%
61..70  112071/3%       1418148/50%
71..80  79424/2%        1497572/52%
81..90  126172/4%       1623744/57%
91..100 1206534/42%     2830278/100%
  [34] .debug_info       PROGBITS        0000000000000000 2f1c74c a449548 00      0   0  1
  [38] .debug_loclists   PROGBITS        0000000000000000 ff5df39 507acd8 00      0   0  1
So, size of .debug_info+.debug_loclists grows for vanilla -> patch1 (or patch2) by
0.651% and for vanilla -> patch3 by 0.020%.

32-bit cc1plus
==============
vanilla
cov%    samples cumul
0..10   1061892/37%     1061892/37%
11..20  34002/1%        1095894/39%
21..30  43513/1%        1139407/40%
31..40  41667/1%        1181074/42%
41..50  59144/2%        1240218/44%
51..60  47009/1%        1287227/45%
61..70  105069/3%       1392296/49%
71..80  72990/2%        1465286/52%
81..90  125988/4%       1591274/56%
91..100 1208726/43%     2800000/100%
  [33] .debug_info       PROGBITS        00000000 351ab10 8b1c83d 00      0   0  1
  [37] .debug_loclists   PROGBITS        00000000 ebc816e 3fe44fd 00      0   0  1
patch1 (same as patch2)
cov%    samples cumul
0..10   1061999/37%     1061999/37%
11..20  34065/1%        1096064/39%
21..30  43557/1%        1139621/40%
31..40  41690/1%        1181311/42%
41..50  59191/2%        1240502/44%
51..60  47143/1%        1287645/45%
61..70  105045/3%       1392690/49%
71..80  73021/2%        1465711/52%
81..90  125885/4%       1591596/56%
91..100 1208404/43%     2800000/100%
  [33] .debug_info       PROGBITS        00000000 351ab10 8b1c597 00      0   0  1
  [37] .debug_loclists   PROGBITS        00000000 ebca915 401ffad 00      0   0  1
patch3
cov%    samples cumul
0..10   1062006/37%     1062006/37%
11..20  34073/1%        1096079/39%
21..30  43559/1%        1139638/40%
31..40  41693/1%        1181331/42%
41..50  59189/2%        1240520/44%
51..60  47142/1%        1287662/45%
61..70  105054/3%       1392716/49%
71..80  73027/2%        1465743/52%
81..90  125874/4%       1591617/56%
91..100 1208383/43%     2800000/100%
  [33] .debug_info       PROGBITS        00000000 351ab10 8b1c690 00      0   0  1
  [37] .debug_loclists   PROGBITS        00000000 ebca40a 4020a6e 00      0   0  1
So, size of .debug_info+.debug_loclists grows for vanilla -> patch1 (or patch2) by
0.114% and for vanilla -> patch3 by 0.116%.

2021-10-10  Jakub Jelinek  <jakub@redhat.com>

PR debug/102441
* var-tracking.c (add_stores): For cselib_sp_derived_value_p values
use MO_VAL_SET if loc is not sp.

(cherry picked from commit 9583b26f3701ea0456405d84f9a898451a2f7452)

Daily bump.

openmp: Add support for OpenMP 5.1 structured-block-sequences

Related to this is the addition of structured-block-sequence in OpenMP 5.1,
which doesn't change anything for Fortran, but for C/C++ allows multiple
statements instead of just one possibly compound around the separating
directives (section and scan).

I've also made some updates to the OpenMP 5.1 support list in libgomp.texi.

2021-10-09  Jakub Jelinek  <jakub@redhat.com>

gcc/c/
* c-parser.c (c_parser_omp_structured_block_sequence): New function.
(c_parser_omp_scan_loop_body): Use it.
(c_parser_omp_sections_scope): Likewise.
gcc/cp/
* parser.c (cp_parser_omp_structured_block): Remove disallow_omp_attrs
argument.
(cp_parser_omp_structured_block_sequence): New function.
(cp_parser_omp_scan_loop_body): Use it.
(cp_parser_omp_sections_scope): Likewise.
gcc/testsuite/
* c-c++-common/gomp/sections1.c (foo): Don't expect errors on
multiple statements in between section directive(s).  Add testcases
for invalid no statements in between section directive(s).
* gcc.dg/gomp/sections-2.c (foo): Don't expect errors on
multiple statements in between section directive(s).
* g++.dg/gomp/sections-2.C (foo): Likewise.
* g++.dg/gomp/attrs-6.C (foo): Add testcases for multiple
statements in between section directive(s).
(bar): Add testcases for multiple statements in between scan
directive.
* g++.dg/gomp/attrs-7.C (bar): Adjust expected error recovery.
libgomp/
* libgomp.texi (OpenMP 5.1): Mention implemented support for
structured block sequences in C/C++.  Mention support for
unconstrained/reproducible modifiers on order clause.
Mention partial (C/C++ only) support of extentensions to atomics
construct.  Mention partial (C/C++ on clause only) support of
align/allocator modifiers on allocate clause.

(cherry picked from commit 875124eb0822d8905d73bd30d51421ba8afde282)

Daily bump.

Fortran: Add diagnostic for F2018:C839 (TS29113:C535c)

2021-10-08 Sandra Loosemore <sandra@codesourcery.com>

PR fortran/54753

gcc/fortran/
* interface.c (gfc_compare_actual_formal): Add diagnostic
for F2018:C839. Refactor shared code and fix bugs with class
array info lookup, and extend similar diagnostic from PR94110
to also cover class types.

gcc/testsuite/
* gfortran.dg/c-interop/c535c-1.f90: Rewrite and expand.
* gfortran.dg/c-interop/c535c-2.f90: Remove xfails.
* gfortran.dg/c-interop/c535c-3.f90: Likewise.
* gfortran.dg/c-interop/c535c-4.f90: Likewise.
* gfortran.dg/PR94110.f90: Extend to cover class types.

(cherry picked from commit 7afb61087d2cb7a6d27463bab5a7567fac69f97a)

Fortran: Avoid var initialization in interfaces [PR54753]

Intent(out) implies deallocation/default initialization; however, it is
pointless to do this for dummy-arguments symbols of procedures which are
inside an INTERFACE block. – This also fixes a bogus error for the attached
included testcase, but fixing the non-interface version still has to be done.

PR fortran/54753

gcc/fortran/ChangeLog:

* resolve.c (can_generate_init, resolve_fl_variable_derived,
resolve_symbol): Only do initialization with intent(out) if not
inside of an interface block.

(cherry picked from commit 51d9ef7747b2dc439f7456303f0784faf5cdb1d3)

openmp: Fix up declare target handling for vars with DECL_LOCAL_DECL_ALIAS [PR102640]

The introduction of DECL_LOCAL_DECL_ALIAS and push_local_extern_decl_alias
in r11-3699-g4e62aca0e0520e4ed2532f2d8153581190621c1a broke the following
testcase.  The following patch fixes it by treating similarly not just
the variable to or link clause is put on, but also its DECL_LOCAL_DECL_ALIAS
if any.  If it hasn't been created yet, when it is created it will copy
attributes and therefore should get it for free, and as it is an extern,
nothing more than attributes is needed for it.

2021-10-08  Jakub Jelinek  <jakub@redhat.com>

PR c++/102640
gcc/cp/
* parser.c (handle_omp_declare_target_clause): New function.
(cp_parser_omp_declare_target): Use it.
gcc/testsuite/
* c-c++-common/gomp/pr102640.c: New test.

(cherry picked from commit db3d7270b42fe27fb05664c4fdf524ab7ad13a75)

Daily bump.

c++: variadic ttp constraint subsumption [PR99904]

Here we're crashing when level-lowering the variadic constraint C<Ts...>
on the template template parameter TT because tsubst_pack_expansion expects
processing_template_decl to be set during a partial substitution.

PR c++/99904

gcc/cp/ChangeLog:

* pt.c (is_compatible_template_arg): Set processing_template_decl
around tsubst_constraint_info.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-ttp4.C: New test.

(cherry picked from commit 2e6e0d86a06389056d0e7fecc99c547420ad787a)

Daily bump.

c++: unifying equal NONTYPE_ARGUMENT_PACKs [PR102547]

Here during partial ordering of the two partial specializations we end
up in unify with parm=arg=NONTYPE_ARGUMENT_PACK<V0, V1>, and crash shortly
thereafter because uses_template_parms(parms) calls potential_const_expr
which doesn't handle NONTYPE_ARGUMENT_PACK.

This patch fixes this by extending potential_constant_expression to handle
NONTYPE_ARGUMENT_PACK appropriately.

PR c++/102547

gcc/cp/ChangeLog:

* constexpr.c (potential_constant_expression_1): Handle
NONTYPE_ARGUMENT_PACK.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/variadic-partial2.C: New test.
* g++.dg/cpp0x/variadic-partial2a.C: New test.

(cherry picked from commit d4c470c376b4cb82c9a0b7e8a4b88c44d5e4289d)

c++: __is_trivially_xible and multi-arg aggr paren init [PR102535]

is_xible_helper assumes only 0- and 1-argument ctors can be trivial, but
C++20 aggregate paren init means multi-arg ctors can now be trivial too.
This patch relaxes the relevant early exit check accordingly.

PR c++/102535

gcc/cp/ChangeLog:

* method.c (is_xible_helper): Don't exit early for multi-arg
ctors in C++20.

gcc/testsuite/ChangeLog:

* g++.dg/ext/is_trivially_constructible7.C: New test.

(cherry picked from commit 9845c52db38f15740861435f38f7e5ad8a8de2ec)

c++: defaulted comparisons and vptr fields [PR95567]

We need to explicitly skip over vptr fields when synthesizing a
defaulted comparison operator, because next_initializable_field
doesn't do so for us.

PR c++/95567

gcc/cp/ChangeLog:

* method.c (build_comparison_op): Skip DECL_VIRTUAL_P fields.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/spaceship-virtual1.C: New test.

(cherry picked from commit b6bca2e631b54f992c058ca8e445b45e9816690b)

real: fix encoding of negative IEEE double/quad values [PR98216]

In encode_ieee_double/quad, the assignment

unsigned long WORD = r->sign << 31;

is intended to set the 31st bit of WORD whenever the sign bit is set.
But on LP64 hosts it also unintentionally sets the upper 32 bits of WORD,
because r->sign gets promoted from unsigned:1 to int and then the result
of the shift (equal to INT_MIN) gets sign extended from int to long.

In the C++ frontend, this bug causes incorrect mangling of negative
floating point values because the output of real_to_target called from
write_real_cst unexpectedly has the upper 32 bits of this word set,
which the caller doesn't mask out.

This patch fixes this by avoiding the unwanted sign extension. Note
that r0-53976 fixed the same bug in encode_ieee_single long ago.

PR c++/98216
PR c++/91292

gcc/ChangeLog:

* real.c (encode_ieee_double): Avoid unwanted sign extension.
(encode_ieee_quad): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/nontype-float2.C: New test.

(cherry picked from commit 34947d4e97ee72b26491cfe5ff4fa8258fadbe95)

c++: concept-ids and value-dependence [PR102412]

The problem here is that uses_template_parms returns true for all
concept-ids (even those with non-dependent arguments), so when a concept-id
is used as a default template argument then during deduction the default
argument is considered dependent even after substituting into it, which
leads to deduction failure (from type_unification_real).

This patch fixes this by implementing the resolution of CWG 2446 which
says a concept-id is dependent only if its arguments are.

DR 2446
PR c++/102412

gcc/cp/ChangeLog:

* constexpr.c (cxx_eval_constant_expression)
<case TEMPLATE_ID_EXPR>: Check value_dependent_expression_p
instead of processing_template_decl.
* pt.c (value_dependent_expression_p) <case TEMPLATE_ID_EXPR>:
Return true only if any_dependent_template_arguments_p.
(instantiation_dependent_r) <case CALL_EXPR>: Remove this case.
<case TEMPLATE_ID_EXPR>: Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-nondep2.C: New test.
* g++.dg/cpp2a/concepts-nondep3.C: New test.

(cherry picked from commit 9329344a6d81a6a5e3bd171167ebc7b158bb44f4)

c++: constrained variable template issues [PR98486]

This fixes some issues with constrained variable templates:

  - Constraints aren't checked when explicitly specializing a variable
    template.
  - Constraints aren't attached to a static data member template at
    parse time.
  - Constraints don't get propagated when (partially) instantiating a
    static data member template, so we need to make sure to look up
    constraints using the most general template during satisfaction.

PR c++/98486

gcc/cp/ChangeLog:

* constraint.cc (get_normalized_constraints_from_decl): Always
look up constraints using the most general template.
* decl.c (grokdeclarator): Set constraints on a static data
member template.
* pt.c (determine_specialization): Check constraints on a
variable template.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-var-templ1.C: New test.
* g++.dg/cpp2a/concepts-var-templ1a.C: New test.
* g++.dg/cpp2a/concepts-var-templ1b.C: New test.

(cherry picked from commit 2e2e65a46d2674bed53afd211493876ee2b79453)

c++: empty union member activation during constexpr [PR102163]

Here, the union's constructor is defined to activate its empty data
member _M_rest, but during constexpr evaluation of this constructor the
subobject constructor call O::O(&_M_rest, 42) doesn't produce a side
effect that actually activates the member, so the union still appears
uninitialized after its constructor has run. This patch fixes this by
using a dummy MODIFY_EXPR in this situation, whose evaluation ensures
the member gets activated.

PR c++/102163

gcc/cp/ChangeLog:

* constexpr.c (cxx_eval_call_expression): After evaluating a
subobject constructor call for an empty union member, produce a
side effect that makes sure the member gets activated.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/constexpr-empty17.C: New test.

(cherry picked from commit de07cff96abd43f6f65dcf333958899c2ec42598)

c++: aggregate CTAD and brace elision [PR101344]

Here the problem is ultimately that collect_ctor_idx_types always
recurses into an eligible sub-CONSTRUCTOR regardless of whether the
corresponding pair of braces was elided in the original initializer.
This causes us to reject some completely-braced forms of aggregate
CTAD as in the first testcase below, because collect_ctor_idx_types
effectively assumes that the original initializer is always minimally
braced (and so the aggregate deduction candidate is given a function
type that's incompatible with the original completely-braced initializer).

In order to fix this, collect_ctor_idx_types needs to somehow know the
shape of the original initializer when iterating over the reshaped
initializer. To that end this patch makes reshape_init flag sub-ctors
that were built to undo brace elision in the original ctor, so that
collect_ctor_idx_types that determine whether to recurse into a sub-ctor
by simply inspecting this flag.

This happens to also fix PR101820, which is about aggregate CTAD using
designated initializers, for much the same reasons.

A curious case is the "intermediately-braced" initialization of 'e3'
(which we reject) in the first testcase below. It seems to me we're
behaving as specified here (according to [over.match.class.deduct]/1)
because the initializer element x_1={1, 2, 3, 4} corresponds to the
subobject e_1=E::t, hence the type T_1 of the first function parameter
of the aggregate deduction candidate is T(&&)[2][2], but T can't be
deduced from x_1 using this parameter type (as opposed to say T(&&)[4]).

PR c++/101344
PR c++/101803

gcc/cp/ChangeLog:

* cp-tree.h (CONSTRUCTOR_BRACES_ELIDED_P): Define.
* decl.c (reshape_init_r): Set it.
* pt.c (collect_ctor_idx_types): Recurse into a sub-CONSTRUCTOR
iff CONSTRUCTOR_BRACES_ELIDED_P.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/class-deduction-aggr11.C: New test.
* g++.dg/cpp2a/class-deduction-aggr12.C: New test.

(cherry picked from commit be4a4fb516688d7cfe28a80a4aa333f4ecf0b518)

c++: ignore explicit dguides during NTTP CTAD [PR101883]

Since (template) argument passing is a copy-initialization context,
we mustn't consider explicit deduction guides when deducing a CTAD
placeholder type of an NTTP.

PR c++/101883

gcc/cp/ChangeLog:

* pt.c (convert_template_argument): Pass LOOKUP_IMPLICIT to
do_auto_deduction.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/nontype-class49.C: New test.

(cherry picked from commit a6b3db3e8625a3cba1240f0b5e1a29bd6c68b8ca)

Merge remote-tracking branch 'origin/releases/gcc-11' into devel/omp/gcc-11

Merge up to r11-9077-g7d7630fb6636530d5154d68a777b1d7d7086cfe3 (Oct 6, 2021)

Fortran: Fix deprecate warning with parameter

Only warn with !GCC$ ATTRIBUTES DEPRECATED if
deprecated PARMETERS are actually used.

gcc/fortran/ChangeLog:

* resolve.c (resolve_values): Only show
deprecated warning if attr.referenced.

gcc/testsuite/ChangeLog:

* gfortran.dg/attr_deprecated-2.f90: New test.

(cherry picked from commit ece8b0fce6bbfb1e531de8164da47eeed80d3cf1)

openmp: Optimize for OpenMP atomics 2x__builtin_clear_padding+__builtin_memcmp if possible

For the few long double types that do have padding bits, e.g. on x86
the clear_type_padding_in_mask computed mask is
ff ff ff ff ff ff ff ff ff ff 00 00 for 32-bit and
ff ff ff ff ff ff ff ff ff ff 00 00 00 00 00 00 for 64-bit.
Instead of doing __builtin_clear_padding on both operands that will clear the
last 2 or 6 bytes and then memcmp on the whole 12/16 bytes, we can just
memcmp 10 bytes. The code also handles if the padding would be at the start
or both at the start and end, but everything on byte boundaries only and
non-padding bits being contiguous.
This works around a tree-ssa-dse.c bug (but we need to fix it anyway,
as libstdc++ won't do this and as it can deal with arbitrary types, it even
can't do that generally).

2021-10-06 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/102571
* c-omp.c (c_finish_omp_atomic): Optimize the case where type has
padding, but the non-padding bits are contiguous set of bytes
by adjusting the memcmp call arguments instead of emitting
__builtin_clear_padding and then comparing all the type's bytes.

(cherry picked from commit ba837323dbda2bca5a1c8a4c78092a88241dcfa3)

gfortran.dg/gomp/pr43711.f90: Change dg-* for XFAIL->PASS

gcc/testsuite/
* gfortran.dg/gomp/pr43711.f90: Add dg-error + dg-prune-output,
remove dg-excess-errors to change XFAIL to PASS.

(cherry picked from commit 7f4192dd3d84cb3f6584ae847eae18519d1eb76d)

Daily bump.

c++: Fix apply_identity_attributes [PR102548]

The following testcase ICEs on x86_64-linux with -m32 due to a bug in
apply_identity_attributes.  The function is being smart and attempts not
to duplicate the chain unnecessarily, if either there are no attributes
that affect type identity or there is possibly empty set of attributes
that do not affect type identity in the chain followed by attributes
that do affect type identity, it reuses that attribute chain.

The function mishandles the cases where in the chain an attribute affects
type identity and is followed by one or more attributes that don't
affect type identity (and then perhaps some further ones that do).

There are two bugs.  One is that when we notice first attribute that
doesn't affect type identity after first attribute that does affect type
identity (with perhaps some further such attributes in the chain after it),
we want to put into the new chain just attributes starting from
(inclusive) first_ident and up to (exclusive) the current attribute a,
but the code puts into the chain all attributes starting with first_ident,
including the ones that do not affect type identity and if e.g. we have
doesn't0 affects1 doesn't2 affects3 affects4 sequence of attributes, the
resulting sequence would have
affects1 doesn't2 affects3 affects4 affects3 affects4
attributes, i.e. one attribute that shouldn't be there and two attributes
duplicated.  That is fixed by the a2 -> a2 != a change.

The second one is that we ICE once we see second attribute that doesn't
affect type identity after an attribute that affects it.  That is because
first_ident is set to error_mark_node after handling the first attribute
that doesn't affect type identity (i.e. after we've copied the
[first_ident, a) set of attributes to the new chain) to denote that from
that time on, each attribute that affects type identity should be copied
whenever it is seen (the if (as && as->affects_type_identity) code does
that correctly).  But that condition is false and first_ident is
error_mark_node, we enter else if (first_ident) and use TREE_PURPOSE
/TREE_VALUE/TREE_CHAIN on error_mark_node, which ICEs.  When
first_ident is error_mark_node and a doesn't affect type identity,
we want to do nothing.  So that is the && first_ident != error_mark_node
chunk.

2021-10-05  Jakub Jelinek  <jakub@redhat.com>

PR c++/102548
* tree.c (apply_identity_attributes): Fix handling of the
case where an attribute in the list doesn't affect type
identity but some attribute before it does.

* g++.target/i386/pr102548.C: New test.

(cherry picked from commit 737f95bab557584d876f02779ab79fe3cfaacacf)

ubsan: Use -fno{,-}sanitize=float-divide-by-zero for float division by zero recovery [PR102515]

We've been using
-f{,no-}sanitize-recover=integer-divide-by-zero to decide on the float
-fsanitize=float-divide-by-zero instrumentation _abort suffix.
This patch fixes it to use -f{,no-}sanitize-recover=float-divide-by-zero
for it instead.

2021-10-01 Jakub Jelinek <jakub@redhat.com>
Richard Biener <rguenther@suse.de>

PR sanitizer/102515
gcc/c-family/
* c-ubsan.c (ubsan_instrument_division): Check the right
flag_sanitize_recover bit, depending on which sanitization
is done.
gcc/testsuite/
* c-c++-common/ubsan/float-div-by-zero-2.c: New test.

(cherry picked from commit 9c1a633d96926357155d4702b66f8a0ec856a81f)

c++: Fix handling of __thread/thread_local extern vars declared at function scope [PR102496]

The introduction of push_local_extern_decl_alias in
r11-3699-g4e62aca0e0520e4ed2532f2d8153581190621c1a
broke tls vars, while the decl they are created for has the tls model
set properly, nothing sets it for the alias that is actually used,
so accesses to it are done as if they were normal variables.
This is then diagnosed at link time if the definition of the extern
vars is __thread/thread_local.

2021-10-01 Jakub Jelinek <jakub@redhat.com>

PR c++/102496
* name-lookup.c (push_local_extern_decl_alias): Return early even for
tls vars with non-dependent type when processing_template_decl. For
CP_DECL_THREAD_LOCAL_P vars call set_decl_tls_model on alias.

* g++.dg/tls/pr102496-1.C: New test.
* g++.dg/tls/pr102496-2.C: New test.

(cherry picked from commit 701075864ac4d1c6cec936d10f9cfc2aeb8c1699)

IBM Z: Use @PLT symbols for local functions in 64-bit mode

This helps with generating code for kernel hotpatches, which contain
individual functions and are loaded more than 2G away from vmlinux.
This should not create performance regressions for the normal use
cases, because for local functions ld replaces @PLT calls with direct
calls.

gcc/ChangeLog:

* config/s390/predicates.md (bras_sym_operand): Accept all
functions in 64-bit mode, use UNSPEC_PLT31.
(larl_operand): Use UNSPEC_PLT31.
* config/s390/s390.c (s390_loadrelative_operand_p): Likewise.
(legitimize_pic_address): Likewise.
(s390_emit_tls_call_insn): Mark __tls_get_offset as function,
use UNSPEC_PLT31.
(s390_delegitimize_address): Use UNSPEC_PLT31.
(s390_output_addr_const_extra): Likewise.
(print_operand): Add @PLT to TLS calls, handle %K.
(s390_function_profiler): Mark __fentry__/_mcount as function,
use %K, use UNSPEC_PLT31.
(s390_output_mi_thunk): Use only UNSPEC_GOT, use %K.
(s390_emit_call): Use UNSPEC_PLT31.
(s390_emit_tpf_eh_return): Mark __tpf_eh_return as function.
* config/s390/s390.md (UNSPEC_PLT31): Rename from UNSPEC_PLT.
(*movdi_64): Use %K.
(reload_base_64): Likewise.
(*sibcall_brc): Likewise.
(*sibcall_brcl): Likewise.
(*sibcall_value_brc): Likewise.
(*sibcall_value_brcl): Likewise.
(*bras): Likewise.
(*brasl): Likewise.
(*bras_r): Likewise.
(*brasl_r): Likewise.
(*bras_tls): Likewise.
(*brasl_tls): Likewise.
(main_base_64): Likewise.
(reload_base_64): Likewise.
(@split_stack_call<mode>): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/ext/visibility/noPLT.C: Skip on s390x.
* g++.target/s390/mi-thunk.C: New test.
* gcc.target/s390/nodatarel-1.c: Move foostatic to the new
tests.
* gcc.target/s390/pr80080-4.c: Allow @PLT suffix.
* gcc.target/s390/risbg-ll-3.c: Likewise.
* gcc.target/s390/call.h: Common code for the new tests.
* gcc.target/s390/call-z10-pic-nodatarel.c: New test.
* gcc.target/s390/call-z10-pic.c: New test.
* gcc.target/s390/call-z10.c: New test.
* gcc.target/s390/call-z9-pic-nodatarel.c: New test.
* gcc.target/s390/call-z9-pic.c: New test.
* gcc.target/s390/call-z9.c: New test.
* gcc.target/s390/mfentry-m64-pic.c: New test.
* gcc.target/s390/tls.h: Common code for the new TLS tests.
* gcc.target/s390/tls-pic.c: New test.
* gcc.target/s390/tls.c: New test.

(cherry picked from commit 0990d93dd8a)

IBM Z: Define NO_PROFILE_COUNTERS

s390 glibc does not need counters in the .data section, since it stores
edge hits in its own data structure. Therefore counters only waste
space and confuse diffing tools (e.g. kpatch), so don't generate them.

gcc/ChangeLog:

* config/s390/s390.c (s390_function_profiler): Ignore labelno
parameter.
* config/s390/s390.h (NO_PROFILE_COUNTERS): Define.

gcc/testsuite/ChangeLog:

* gcc.target/s390/mnop-mcount-m31-mzarch.c: Adapt to the new
prologue size.
* gcc.target/s390/mnop-mcount-m64.c: Likewise.

(cherry picked from commit a1c1b7a888a)

Daily bump.

Fix testcase counts.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/fusion-p10-ldcmpi.c: Update counts.

d: gdc driver ignores -static-libstdc++ when automatically linking libstdc++ library

Adds handling of `-static-libstc++' in the gdc driver, so that libstdc++
is appropriately linked if libstdc++ is either needed or seen on the
command-line.

PR d/102574

gcc/d/ChangeLog:

* d-spec.cc (lang_specific_driver): Link libstdc++ statically if
-static-libstdc++ was given on command-line.

(cherry picked from commit c86a16b07b76604a8e3d556f135babab80e2b747)

Fortran: Avoid var initialization in interfaces [PR54753]

Intent(out) implies deallocation/default initialization; however, it is
pointless to do this for dummy-arguments symbols of procedures which are
inside an INTERFACE block. – This also fixes a bogus error for the attached
included testcase, but fixing the non-interface version still has to be done.

PR fortran/54753

gcc/fortran/ChangeLog:

* resolve.c (can_generate_init, resolve_fl_variable_derived,
resolve_symbol): Only do initialization with intent(out) if not
inside of an interface block.

(cherry picked from commit 51d9ef7747b2dc439f7456303f0784faf5cdb1d3)

Remove dead code in config/rs6000/vxworks.h

These lines were added last year:

/* Initialize library function table. */
#undef TARGET_INIT_LIBFUNCS
#define TARGET_INIT_LIBFUNCS rs6000_vxworks_init_libfuncs

but TARGET_INIT_LIBFUNCS is #undef-ed in config/rs6000/rs6000.c and
rs6000_vxworks_init_libfuncs is nowhere defined in any case.

gcc/
* config/rs6000/vxworks.h (TARGET_INIT_LIBFUNCS): Delete.

Daily bump.

Fortran: resolve expressions during SIZE simplification

gcc/fortran/ChangeLog:

PR fortran/102458
* simplify.c (simplify_size): Resolve expressions used in array
specifications so that SIZE can be simplified.

gcc/testsuite/ChangeLog:

PR fortran/102458
* gfortran.dg/pr102458b.f90: New test.

(cherry picked from commit b19bbfb1482505367dd19ae4ab1ea19e36802b6a)

Fortran - improve checking for intrinsics allowed in constant expressions

gcc/fortran/ChangeLog:

PR fortran/102458
* expr.c (is_non_constant_intrinsic): Check for intrinsics
excluded in constant expressions (F2018:10.1.12).
(gfc_is_constant_expr): Use that check.

gcc/testsuite/ChangeLog:

PR fortran/102458
* gfortran.dg/pr102458.f90: New test.

(cherry picked from commit 84cccff60a978174271a30042bf7841d2ae436eb)

coroutines: Only set parm copy guard vars if we have exceptions [PR 102454].

For coroutines, we make copies of the original function arguments into
the coroutine frame. Normally, these are destroyed on the proper exit
from the coroutine when the frame is destroyed.

However, if an exception is thrown before the first suspend point is
reached, the cleanup has to happen in the ramp function. These cleanups
are guarded such that they are only applied to any param copies actually
made.

The ICE is caused by an attempt to set the guard variable when there are
no exceptions enabled (the guard var is not created in this case).

Fixed by checking for flag_exceptions in this case too.

While touching this code paths, also clean up the synthetic names used
when a function parm is unnamed.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
PR c++/102454

gcc/cp/ChangeLog:

* coroutines.cc (analyze_fn_parms): Clean up synthetic names for
unnamed function params.
(morph_fn_to_coro): Do not try to set a guard variable for param
DTORs in the ramp, unless we have exceptions active.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/pr102454.C: New test.

(cherry picked from commit fae627162d5f8cfb273b10349883eeb74baaa43f)

coroutines: Make proxy vars for the function arg copies.

This adds top level proxy variables for the coroutine frame
copies of the original function args. These are then available
in the debugger to refer to the frame copies. We rewrite the
function body to use the copies, since the original parms will
no longer be in scope when the coroutine is running.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/cp/ChangeLog:

* coroutines.cc (struct param_info): Add copy_var.
(build_actor_fn): Use simplified param references.
(register_param_uses): Likewise.
(rewrite_param_uses): Likewise.
(analyze_fn_parms): New function.
(coro_rewrite_function_body): Add proxies for the fn
parameters to the outer bind scope of the rewritten code.
(morph_fn_to_coro): Use simplified version of param ref.

(cherry picked from commit 70ee703c479081ac2ea67eb67041551216e66783)

coroutines: Expose implementation state to the debugger.

In the process of transforming a coroutine into the separate representation
as the ramp function and a state machine, we generate some variables that
are of interest to a user during debugging.  Any variable that is persistent
for the execution of the coroutine is placed into the coroutine frame.

In particular:
  The promise object.
  The function pointers for the resumer and destroyer.
  The current resume index (suspend point).
  The handle that represents this coroutine 'self handle'.
  Any handle provided for a continuation coroutine.
  Whether the coroutine frame is allocated and needs to be freed.

Visibility of some of these has already been requested by end users.

This patch ensures that such variables have names that are usable in a
debugger, but are in the reserved namespace for the implementation (they
all begin with _Coro_).  The identifiers are generated lazily when the
first coroutine is encountered.

We place the variables into the outermost bind expression and then add a
DECL_VALUE_EXPR to each that points to the frame entry.

These changes simplify the handling of the variables in the body of the
function (in particular, the use of the DECL_VALUE_EXPR means that we now
no longer need to rewrite proxies for the promise and coroutine handles into
the frame->offset form).

Partial improvement to debugging (PR c++/99215).

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/cp/ChangeLog:

* coroutines.cc (coro_resume_fn_id, coro_destroy_fn_id,
coro_promise_id, coro_frame_needs_free_id, coro_resume_index_id,
coro_self_handle_id, coro_actor_continue_id,
coro_frame_i_a_r_c_id): New.
(coro_init_identifiers): Initialize new name identifiers.
(coro_promise_type_found_p): Use pre-built identifiers.
(struct await_xform_data): Remove unused fields.
(transform_await_expr): Delete code that is now unused.
(build_actor_fn): Simplify interface, use pre-built identifiers and
remove transforms that are no longer needed.
(build_destroy_fn): Use revised field names.
(register_local_var_uses): Use pre-built identifiers.
(coro_rewrite_function_body): Simplify interface, use pre-built
identifiers.  Generate proxy vars in the outer bind expr scope for the
implementation state that we wish to expose.
(morph_fn_to_coro): Adjust comments for new variable names, use pre-
built identifiers.  Remove unused code to generate frame entries for
the implementation state.  Adjust call for build_actor_fn.

(cherry picked from commit c5a735fa9df7eca4666c8da5e51ed9c5ab7cc81a)

coroutines: Support for debugging implementation state.

Some of the state that is associated with the implementation
is of interest to a user debugging a coroutine.  In particular
items such as the suspend point, promise object, and current
suspend point.

These variables live in the coroutine frame, but we can inject
proxies for them into the outermost bind expression of the
coroutine.  Such variables are automatically moved into the
coroutine frame (if they need to persist across a suspend
expression).  PLacing the proxies thus allows the user to
inspect them by name in the debugger.

To implement this, we ensure that (at the outermost scope) the
frame entries are not mangled (coroutine frame variables are
usually mangled with scope nesting information so that they do
not clash).  We can safely avoid doing this for the outermost
scope so that we can map frame entries directly to the variables.

This is partial contribution to debug support (PR 99215).

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/cp/ChangeLog:

* coroutines.cc (register_local_var_uses): Do not mangle
frame entries for the outermost scope.  Record the outer
scope as nesting depth 0.

(cherry picked from commit addf167a23f61c0ec97f6e71577a0623f3fc13e7)

coroutines: Add a helper for creating local vars.

This is primarily code factoring, but we take this opportunity
to rename some of the implementation variables (which we intend
to expose to debugging) so that they are in the implementation
namespace.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/cp/ChangeLog:

* coroutines.cc (coro_build_artificial_var): New.
(build_actor_fn): Use var builder, rename vars to use
implementation namespace.
(coro_rewrite_function_body): Likewise.
(morph_fn_to_coro): Likewise.

(cherry picked from commit a45a7ecdf34311587daa2e90cc732adcefac447b)

coroutines: Use DECL_VALUE_EXPR instead of rewriting vars.

Variables that need to persist over suspension expressions
must be preserved by being copied into the coroutine frame.

The initial implementations do this manually in the transform
code. However, that has various disadvantages - including
that the debug connections are lost between the original var
and the frame copy.

The revised implementation makes use of DECL_VALUE_EXPRs to
contain the frame offset expressions, so that the original
var names are preserved in the code.

This process is also applied to the function parms which are
always copied to the frame. In this case the decls need to be
copied since they are used in two different contexts during
the re-write (in the building of the ramp function, and in
the actor function itself).

This will assist in improvement of debugging (PR 99215).

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/cp/ChangeLog:

* coroutines.cc (transform_local_var_uses): Record
frame offset expressions as DECL_VALUE_EXPRs instead of
rewriting them.

(cherry picked from commit 88974974d8188cf12e87e4ad3d23a8cbdd557f0e)

coroutines : Add a missed begin/finish else clause to the codegen.

Minor code-gen correction.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/cp/ChangeLog:

* coroutines.cc (build_actor_fn): Add begin/finish clauses
to the initial test in the actor function.

(cherry picked from commit 21b4d0ef543d68187d258415b51d0d6676af89fd)

coroutines: No cleanups on goto statements.

Minor cleanup, this is statement not an expression, we do not
need to use finish_expr_stmt here.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/cp/ChangeLog:

* coroutines.cc (await_statement_walker): Use build_stmt and
add_stmt instead of build1 and finish_expr_stmt.

(cherry picked from commit 8406ed9af2655479a9c8469d7acca2cf5784f5d6)

c++: don't call 'rvalue' in coroutines code

A change to check glvalue_p rather than specifically for TARGET_EXPR
revealed issues with the coroutines code's use of the 'rvalue' function,
which shouldn't be used on class glvalues, so I've removed those calls.

In build_co_await I just dropped them, because I don't see anything in the
co_await specification that indicates that we would want to move from an
lvalue result of operator co_await.  And simplified that code while I was
touching it; cp_build_modify_expr (...INIT_EXPR...) will call the
constructor.

In morph_fn_to_coro I changed the handling of the rvalue reference coroutine
frame field to use move, to treat the rval ref as an xvalue.  I used
forward_parm to pass the function parms to the constructor for the field.
And I simplified the return handling so we get the desired rvalue semantics
from the normal implicit move on return.

I question default-initializing the non-void return value of the function if
get_return_object returns void; I'm not messing with it here, but I've filed
PR100476 about it.

gcc/cp/ChangeLog:

* coroutines.cc (build_co_await): Don't call 'rvalue'.
(flatten_await_stmt): Simplify initialization.
(morph_fn_to_coro): Change 'rvalue' to 'move'.  Simplify.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/coro-bad-gro-00-class-gro-scalar-return.C:
Adjust diagnostic.

(cherry picked from commit 14ed21f8749ae359690d9c4a69ca38cc45d0d1b0)

Daily bump.

Add libgomp.fortran/order-reproducible-*.f90

libgomp/ChangeLog:

* testsuite/libgomp.fortran/order-reproducible-1.f90: New test
based on libgomp.c-c++-common/order-reproducible-1.c.
* testsuite/libgomp.fortran/order-reproducible-2.f90: Likewise.
* testsuite/libgomp.fortran/my-usleep.c: New test.

(cherry picked from commit 703d8a4d39bdc3870b60460e75f8ff7fbcb365b4)

Daily bump.

Add/update libgomp.fortran/alloc-*.f90

libgomp/ChangeLog:

* testsuite/libgomp.fortran/alloc-10.f90: Fix alignment check.
* testsuite/libgomp.fortran/alloc-7.f90: Fix array access.
* testsuite/libgomp.fortran/alloc-8.f90: Likewise.
* testsuite/libgomp.fortran/alloc-11.f90: New test for omp_realloc,
based on libgomp.c-c++-common/alloc-9.c.

(cherry picked from commit 2a93d18da3bd79eb0be87f029eefc8b28bb66dec)

openmp: Differentiate between order(concurrent) and order(reproducible:concurrent)

While OpenMP 5.1 implies order(concurrent) is the same thing as
order(reproducible:concurrent), this is going to change in OpenMP 5.2, where
essentially order(concurrent) means nothing is stated on whether it is
reproducible or unconstrained (and is determined by other means, e.g. for/do
with schedule static or runtime with static being selected is implicitly
reproducible, distribute with dist_schedule static is implicitly reproducible,
loop is implicitly reproducible) and when the modifier is specified explicitly,
it overrides the implicit behavior either way.
And, when order(reproducible:concurrent) is used with e.g. schedule(dynamic)
or some other schedule that is by definition not reproducible, it is
implementation's duty to ensure it is reproducible, either by remembering how
it scheduled some loop and then replaying the same schedule when seeing loops
with the same directive/schedule/number of iterations, or by overriding the
schedule to some reproducible one.

This patch doesn't implement the 5.2 wording just yet, but in the FEs
differentiates between the 3 states - no explicit modifier, explicit reproducible
or explicit unconstrainted, so that the middle-end can easily switch any time.
Instead it follows the 5.1 wording where both order(concurrent) (implicit or
explicit) or order(reproducible:concurrent) imply reproducibility.
And, it implements the easier method, when for/do should be reproducible, it
just chooses static schedule. order(concurrent) implies no OpenMP APIs in the
loop body nor threadprivate vars, so the exact scheduling isn't (easily at least)
observable.

2021-10-01 Jakub Jelinek <jakub@redhat.com>

gcc/
* tree.h (OMP_CLAUSE_ORDER_REPRODUCIBLE): Define.
* tree-pretty-print.c (dump_omp_clause) <case OMP_CLAUSE_ORDER>: Print
reproducible: for OMP_CLAUSE_ORDER_REPRODUCIBLE.
* omp-general.c (omp_extract_for_data): If OMP_CLAUSE_ORDER is seen
without OMP_CLAUSE_ORDER_UNCONSTRAINED, overwrite sched_kind to
OMP_CLAUSE_SCHEDULE_STATIC.
gcc/c-family/
* c-omp.c (c_omp_split_clauses): Also copy
OMP_CLAUSE_ORDER_REPRODUCIBLE.
gcc/c/
* c-parser.c (c_parser_omp_clause_order): Set
OMP_CLAUSE_ORDER_REPRODUCIBLE for explicit reproducible: modifier.
gcc/cp/
* parser.c (cp_parser_omp_clause_order): Set
OMP_CLAUSE_ORDER_REPRODUCIBLE for explicit reproducible: modifier.
gcc/fortran/
* gfortran.h (gfc_omp_clauses): Add order_reproducible bitfield.
* dump-parse-tree.c (show_omp_clauses): Print REPRODUCIBLE: for it.
* openmp.c (gfc_match_omp_clauses): Set order_reproducible for
explicit reproducible: modifier.
* trans-openmp.c (gfc_trans_omp_clauses): Set
OMP_CLAUSE_ORDER_REPRODUCIBLE for order_reproducible.
(gfc_split_omp_clauses): Also copy order_reproducible.
gcc/testsuite/
* gfortran.dg/gomp/order-5.f90: Adjust scan-tree-dump-times regexps.
libgomp/
* testsuite/libgomp.c-c++-common/order-reproducible-1.c: New test.
* testsuite/libgomp.c-c++-common/order-reproducible-2.c: New test.

(cherry picked from commit e705b8533aa0a00a65734eb5fd6344295723dccc)

openmp: Avoid PLT relocations for omp_* symbols in libgomp

This patch avoids the following relocations:
readelf -Wr libgomp.so.1.0.0 | grep omp_
00000000000470e0  0000020700000007 R_X86_64_JUMP_SLOT     000000000001d9d0 omp_fulfill_event@@OMP_5.0.1 + 0
0000000000047170  000000b800000007 R_X86_64_JUMP_SLOT     000000000000e760 omp_display_env@@OMP_5.1 + 0
00000000000471e0  000000e800000007 R_X86_64_JUMP_SLOT     000000000000f910 omp_get_initial_device@@OMP_4.5 + 0
0000000000047280  0000019500000007 R_X86_64_JUMP_SLOT     0000000000015940 omp_get_active_level@@OMP_3.0 + 0
00000000000472c8  0000020d00000007 R_X86_64_JUMP_SLOT     0000000000035210 omp_get_team_num@@OMP_4.0 + 0
00000000000472f0  0000014700000007 R_X86_64_JUMP_SLOT     0000000000035200 omp_get_num_teams@@OMP_4.0 + 0
by using ialias{,_call,_redirect} macros as needed.

We still have many acc_* PLT relocations, could somebody please fix those?
readelf -Wr libgomp.so.1.0.0 | grep acc_
0000000000046fb8  000001ed00000006 R_X86_64_GLOB_DAT      0000000000036350 acc_prof_unregister@@OACC_2.5.1 + 0
0000000000046fd8  000000a400000006 R_X86_64_GLOB_DAT      0000000000035f30 acc_prof_register@@OACC_2.5.1 + 0
0000000000046fe0  000001d100000006 R_X86_64_GLOB_DAT      0000000000035ee0 acc_prof_lookup@@OACC_2.5.1 + 0
0000000000047058  000001dd00000007 R_X86_64_JUMP_SLOT     0000000000031f40 acc_create_async@@OACC_2.5 + 0
0000000000047068  0000011500000007 R_X86_64_JUMP_SLOT     000000000002fc60 acc_get_property@@OACC_2.6 + 0
0000000000047070  000001fb00000007 R_X86_64_JUMP_SLOT     0000000000032ce0 acc_wait_all@@OACC_2.0 + 0
0000000000047080  0000006500000007 R_X86_64_JUMP_SLOT     000000000002f990 acc_on_device@@OACC_2.0 + 0
0000000000047088  000000ae00000007 R_X86_64_JUMP_SLOT     0000000000032140 acc_attach_async@@OACC_2.6 + 0
0000000000047090  0000021900000007 R_X86_64_JUMP_SLOT     000000000002f550 acc_get_device_type@@OACC_2.0 + 0
0000000000047098  000001cb00000007 R_X86_64_JUMP_SLOT     0000000000032090 acc_copyout_finalize@@OACC_2.5 + 0
00000000000470a8  0000005200000007 R_X86_64_JUMP_SLOT     0000000000031f80 acc_copyin@@OACC_2.0 + 0
00000000000470b8  000001ad00000007 R_X86_64_JUMP_SLOT     0000000000032030 acc_delete_finalize@@OACC_2.5 + 0
00000000000470e8  0000010900000007 R_X86_64_JUMP_SLOT     0000000000031f00 acc_create@@OACC_2.0 + 0
00000000000470f8  0000005900000007 R_X86_64_JUMP_SLOT     0000000000032b70 acc_wait_async@@OACC_2.0 + 0
0000000000047110  0000013100000007 R_X86_64_JUMP_SLOT     0000000000032860 acc_async_test@@OACC_2.0 + 0
0000000000047118  000001ff00000007 R_X86_64_JUMP_SLOT     000000000002f720 acc_get_device_num@@OACC_2.0 + 0
0000000000047128  0000019100000007 R_X86_64_JUMP_SLOT     0000000000032020 acc_delete_async@@OACC_2.5 + 0
0000000000047130  000001d200000007 R_X86_64_JUMP_SLOT     000000000002efa0 acc_shutdown@@OACC_2.0 + 0
0000000000047150  000000d000000007 R_X86_64_JUMP_SLOT     0000000000031f00 acc_present_or_create@@OACC_2.0 + 0
0000000000047188  0000019200000007 R_X86_64_JUMP_SLOT     0000000000031910 acc_is_present@@OACC_2.0 + 0
0000000000047190  000001aa00000007 R_X86_64_JUMP_SLOT     000000000002fca0 acc_get_property_string@@OACC_2.6 + 0
00000000000471d0  000001bf00000007 R_X86_64_JUMP_SLOT     0000000000032120 acc_update_self_async@@OACC_2.5 + 0
0000000000047200  0000020500000007 R_X86_64_JUMP_SLOT     0000000000032e00 acc_wait_all_async@@OACC_2.0 + 0
0000000000047208  000000a600000007 R_X86_64_JUMP_SLOT     0000000000031790 acc_deviceptr@@OACC_2.0 + 0
0000000000047218  0000007500000007 R_X86_64_JUMP_SLOT     0000000000032000 acc_delete@@OACC_2.0 + 0
0000000000047238  000001e900000007 R_X86_64_JUMP_SLOT     000000000002f3a0 acc_set_device_type@@OACC_2.0 + 0
0000000000047240  000001f600000007 R_X86_64_JUMP_SLOT     000000000002ef20 acc_init@@OACC_2.0 + 0
0000000000047248  0000018800000007 R_X86_64_JUMP_SLOT     0000000000032060 acc_copyout@@OACC_2.0 + 0
0000000000047258  0000021f00000007 R_X86_64_JUMP_SLOT     0000000000032a80 acc_wait@@OACC_2.0 + 0
0000000000047270  000001bc00000007 R_X86_64_JUMP_SLOT     0000000000032100 acc_update_self@@OACC_2.0 + 0
0000000000047288  0000011400000007 R_X86_64_JUMP_SLOT     0000000000032080 acc_copyout_async@@OACC_2.5 + 0
0000000000047290  0000013d00000007 R_X86_64_JUMP_SLOT     000000000002f850 acc_set_device_num@@OACC_2.0 + 0
00000000000472a8  000000c500000007 R_X86_64_JUMP_SLOT     00000000000320e0 acc_update_device_async@@OACC_2.5 + 0
00000000000472c0  0000014600000007 R_X86_64_JUMP_SLOT     0000000000031fc0 acc_copyin_async@@OACC_2.5 + 0
00000000000472f8  0000006a00000007 R_X86_64_JUMP_SLOT     000000000002f310 acc_get_num_devices@@OACC_2.0 + 0
0000000000047350  0000021700000007 R_X86_64_JUMP_SLOT     0000000000031f80 acc_present_or_copyin@@OACC_2.0 + 0
0000000000047360  0000020900000007 R_X86_64_JUMP_SLOT     00000000000320c0 acc_update_device@@OACC_2.0 + 0
0000000000047380  0000008400000007 R_X86_64_JUMP_SLOT     0000000000032950 acc_async_test_all@@OACC_2.0 + 0

2021-10-01  Jakub Jelinek  <jakub@redhat.com>

* affinity-fmt.c (omp_get_team_num, omp_get_num_teams): Add
ialias_redirect.
* env.c (handle_omp_display_env): Use ialias_call.
* icv-device.c: Move ialias right below each function.
(omp_get_device_num): Use ialias_call.
* fortran.c (omp_fulfill_event): Add ialias_redirect.
* icv.c (omp_get_active_level): Add ialias_redirect.

(cherry picked from commit 3749c3aff6512003a61b7cf4a96eff0e8926d781)

openmp: Add alloc_align attribute to omp_aligned_*alloc and testcase for omp_realloc

This patch adds alloc_align attribute to omp_aligned_{,c}alloc so that if
the first argument is constant, GCC can assume requested alignment.

Additionally, it adds testsuite coverage for omp_realloc which I haven't
managed to write in the patch from yesterday.

2021-10-01 Jakub Jelinek <jakub@redhat.com>

* omp.h.in (omp_aligned_alloc, omp_aligned_calloc): Add
__alloc_align__ (1) attribute.
* testsuite/libgomp.c-c++-common/alloc-9.c: New test.

(cherry picked from commit 998e434f8f9ce42842710af9e33c1b9732b22999)

Default to dwarf version 4 on hppa64-hpux

2021-10-01 John David Anglin <danglin@gcc.gnu.org>

gcc/ChangeLog:

PR debug/102373
* config/pa/pa.c (pa_option_override): Default to dwarf version 4
on hppa64-hpux.

Use libiberty snprintf and vsnprintf on hppa*-*-hpux*.

libiberty/ChangeLog:

PR target/100734
* configure.ac: Use libiberty snprintf and vsnprintf on
hppa*-*-hpux*.
* configure: Regenerate.

Fix ICE with stack checking emulation at -O2

On bare-metal platforms, the Ada compiler emulates stack checking (it is
required by the language and tested by ACATS) in the runtime via the
stack_check_libfunc hook of the RTL middle-end. Calls to the function
are generated as libcalls but they now require a proper function type
at -O2 or above.

gcc/
* explow.c: Include langhooks.h.
(set_stack_check_libfunc): Build a proper function type.

Fix PR c++/64697 at -O1 or above

The BFD fix eliminates the link failure and working code is generated at
-O0, but _not_ when optimization is enabled because the optimizer changes:

        movq    .refptr._ZTH1s(%rip), %rax
        testq   %rax, %rax
        je      .L2
        call    _ZTH1s

into:

        leaq    _ZTH1s(%rip), %rax
        testq   %rax, %rax
        je      .L2
        call    _ZTH1s

and the leaq now also gets the relocation overflow.  So the fix is to
teach legitimate_pic_address_disp_p to reject the transformation when
the symbol is an external weak function, which yields:

        cmpq    $0, .refptr._ZTH1s(%rip)
        je      .L2
        call    _ZTH1s

and the cmpq keeps a relocation that does not overflow.

gcc/
PR c++/64697
* config/i386/i386.c (legitimate_pic_address_disp_p): For PE-COFF do
not return true for external weak function symbols in medium model.

Daily bump.

Fortran: fix error recovery for invalid constructor

gcc/fortran/ChangeLog:

PR fortran/102520
* array.c (expand_constructor): Do not dereference NULL pointer.

gcc/testsuite/ChangeLog:

PR fortran/102520
* gfortran.dg/pr102520.f90: New test.

(cherry picked from commit 5e2adfeed21ee584a82cdcdfa7eed41202eb67cd)

Fortran: Fix same_type_as

A test for CLASS(*) + assumed rank was missing; adding a test to
unlimited_polymorphic_1.f03 showed an ICE as backend_decl wasn't
set. While gfc_get_symbol_decl would fix it, the code also assumed
that the class(*) was a variable and could not be a subobject of
a derived type.

PR fortran/71703
PR fortran/84007

gcc/fortran/ChangeLog:

* trans-intrinsic.c (gfc_conv_same_type_as): Fix handling
of UNLIMITED_POLY.
* trans.h (gfc_vtpr_hash_get): Renamed prototype to ...
(gfc_vptr_hash_get): ... this to match function name.

gcc/testsuite/ChangeLog:

* gfortran.dg/c-interop/c535b-1.f90: Remove wrong comment.
* gfortran.dg/unlimited_polymorphic_1.f03: Extend.
* gfortran.dg/unlimited_polymorphic_32.f90: New test.

(cherry picked from commit 643e8f4ee3a2a59a9b96fbcd1ffa8bacbda5b383)

libgomp.fortran/alloc-*.f90: Add missing dg-prune-output

libgomp/
* testsuite/libgomp.fortran/alloc-7.f90: Add dg-prune-output
for -fintrinsic-modules-path= warning of the C compiler.
* testsuite/libgomp.fortran/alloc-9.f90: Likewise.
* testsuite/libgomp.fortran/alloc-10.f90: Likewise.

(cherry picked from commit ef37ddf477ac4b21ec4d1be9260cfd3b431fd4a9)

openmp: Add omp_aligned_{,c}alloc and omp_{c,re}alloc for Fortran

gcc/ChangeLog:

* omp-low.c (omp_runtime_api_call): Add omp_aligned_{,c}alloc and
omp_{c,re}alloc, fix omp_alloc/omp_free.

libgomp/ChangeLog:

* libgomp.texi (OpenMP 5.1): Set implementation status to Y for
omp_aligned_{,c}alloc and omp_{c,re}alloc routines.
* omp_lib.f90.in (omp_aligned_alloc, omp_aligned_calloc, omp_calloc,
omp_realloc): Add.
* omp_lib.h.in (omp_aligned_alloc, omp_aligned_calloc, omp_calloc,
omp_realloc): Add.
* testsuite/libgomp.fortran/alloc-10.f90: New test.
* testsuite/libgomp.fortran/alloc-6.f90: New test.
* testsuite/libgomp.fortran/alloc-7.c: New test.
* testsuite/libgomp.fortran/alloc-7.f90: New test.
* testsuite/libgomp.fortran/alloc-8.f90: New test.
* testsuite/libgomp.fortran/alloc-9.f90: New test.

(cherry picked from commit 70de20db232545daa2d6616e3581313476395ea3)

[Ada] Minor tweaks to System.Dwarf_Line

gcc/ada/

* libgnat/s-dwalin.adb (Parse_Header): Tweak comments.
(Read_Entry_Format_Array): Tweak exception message.
(Symbolic_Address.Set_Result): Likewise.

[Ada] Small optimization to DWARF 5 mode in System.Dwarf_Line

gcc/ada/

* libgnat/s-dwalin.adb (To_File_Name): Fetch only the last string
from the .debug_line_str section.
(Symbolic_Address.Set_Result): Likewise.

[Ada] Follow-up tweaks to System.Dwarf_Line

gcc/ada/

* libgnat/s-dwalin.adb (Skip_Form): Fix cases of DW_FORM_addrx
and DW_FORM_implicit_const. Replace Constraint_Error with
Dwarf_Error.

[Ada] Adjust latest change for ELF platforms

gcc/ada/

* libgnat/s-objrea.adb (Get_Load_Address): Return 0 for ELF.

[Ada] Add support for PE-COFF PIE to System.Dwarf_Line

gcc/ada/

* adaint.c (__gnat_get_executable_load_address): Add Win32 support.
* libgnat/s-objrea.ads (Get_Xcode_Bounds): Fix typo in comment.
(Object_File): Minor reformatting.
(ELF_Object_File): Uncomment predicate.
(PECOFF_Object_File): Likewise.
(XCOFF32_Object_File): Likewise.
* libgnat/s-objrea.adb: Minor reformatting throughout.
(Get_Load_Address): Implement for PE-COFF.
* libgnat/s-dwalin.ads: Remove clause for System.Storage_Elements
and use consistent wording in comments.
(Dwarf_Context): Set type of Low, High and Load_Address to Address.
* libgnat/s-dwalin.adb (Get_Load_Displacement): New function.
(Is_Inside): Call Get_Load_Displacement.
(Low_Address): Likewise.
(Open): Adjust to type change.
(Aranges_Lookup): Change type of Addr to Address.
(Read_Aranges_Entry): Likewise for Start and adjust.
(Enable_Cach): Adjust to type change.
(Symbolic_Address): Change type of Addr to Address.
(Symbolic_Traceback): Call Get_Load_Displacement.

[Ada] Small cleanup in System.Dwarf_Line

gcc/ada/

* libgnat/s-dwalin.ads: Remove clause for Ada.Exceptions.Traceback,
add clause for System.Traceback_Entries and alphabetize.
(AET): Delete.
(STE): New package renaming.
(Symbolic_Traceback): Adjust.
* libgnat/s-dwalin.adb: Remove clauses for Ada.Exceptions.Traceback
and System.Traceback_Entries.
(Symbolic_Traceback): Adjust.

[Ada] Add DWARF 5 support to System.Dwarf_Line

gcc/ada/

* libgnat/s-dwalin.ads: Adjust a few comments left and right.
(Line_Info_Register): Comment out unused components.
(Line_Info_Header): Add DWARF 5 support.
(Dwarf_Context): Likewise. Rename "prologue" into "header".
* libgnat/s-dwalin.adb: Alphabetize "with" clauses.
(DWARF constants): Add DWARF 5 support and reorder.
(For_Each_Row): Adjust.
(Initialize_Pass): Likewise.
(Initialize_State_Machine): Likewise and fix typo.
(Open): Add DWARF 5 support.
(Parse_Prologue): Rename into...
(Parse_Header): ...this and add DWARF 5 support.
(Read_And_Execute_Isn): Rename into...
(Read_And_Execute_Insn): ...this and adjust.
(To_File_Name): Change parameter name and add DWARF 5 support.
(Read_Entry_Format_Array): New procedure.
(Skip_Form): Add DWARF 5 support and reorder.
(Seek_Abbrev): Do not count entries and add DWARF 5 support.
(Debug_Info_Lookup): Add DWARF 5 support.
(Symbolic_Address.Set_Result): Likewise.
(Symbolic_Address): Adjust.

openmp: Add omp_aligned_{,c}alloc and omp_{c,re}alloc

This patch adds new OpenMP 5.1 allocator entrypoints and in addition to that
fixes an omp_alloc bug which is hard to test for - if the first allocator
fails but has a larger alignment trait and has a fallback allocator, either
the default behavior or a user fallback, then the extra alignment will be used
even in the fallback allocation, rather than just starting with whatever
alignment has been requested (in GOMP_alloc or the minimum one in omp_alloc).

Jonathan's comment on IRC this morning made me realize that I should add
alloc_align attributes to 2 of the prototypes and I still need to add testsuite
coverage for omp_realloc, will do that in a follow-up.

2021-09-30  Jakub Jelinek  <jakub@redhat.com>

* omp.h.in (omp_aligned_alloc, omp_calloc, omp_aligned_calloc,
omp_realloc): New prototypes.
(omp_alloc): Move after omp_free prototype, add __malloc__ (omp_free)
attribute.
* allocator.c: Include string.h.
(omp_aligned_alloc): No longer static, add ialias.  Add new_alignment
variable and use it instead of alignment so that when retrying the old
alignment is used again.  Don't retry if new alignment is the same
as old alignment, unless allocator had pool size.
(omp_alloc, GOMP_alloc, GOMP_free): Use ialias_call.
(omp_aligned_calloc, omp_calloc, omp_realloc): New functions.
* libgomp.map (OMP_5.0.2): Export omp_aligned_alloc, omp_calloc,
omp_aligned_calloc and omp_realloc.
* testsuite/libgomp.c-c++-common/alloc-4.c (main): Add
omp_aligned_alloc, omp_calloc and omp_aligned_calloc tests.
* testsuite/libgomp.c-c++-common/alloc-5.c: New test.
* testsuite/libgomp.c-c++-common/alloc-6.c: New test.
* testsuite/libgomp.c-c++-common/alloc-7.c: New test.
* testsuite/libgomp.c-c++-common/alloc-8.c: New test.

(cherry picked from commit b38a4bd10249b5070ea1f4708a0fd228df268c26)

Daily bump.

rs6000: Disable optimizing multiple xxsetaccz instructions into one xxsetaccz

Fwprop will happily optimize two xxsetaccz instructions into one xxsetaccz
by propagating the results of the first to the uses of the second.
We really don't want that to happen given the late priming/depriming of
accumulators.  I fixed this by making the xxsetaccz source operand an
unspec volatile.  I also removed the mma_xxsetaccz define_expand and
define_insn_and_split and replaced it with a simple define_insn.
The expand and splitter patterns were leftovers from the pre opaque mode
code when the xxsetaccz code was part of the movpxi pattern, and we don't
need them now.

Rather than a new test case, I was able to just modify the current test case
to add another __builtin_mma_xxsetaccz call which shows the bad code gen
with unpatched compilers.

2021-09-14  Peter Bergner  <bergner@linux.ibm.com>

gcc/
* config/rs6000/mma.md (unspec): Delete UNSPEC_MMA_XXSETACCZ.
(unspecv): Add UNSPECV_MMA_XXSETACCZ.
(*mma_xxsetaccz): Delete.
(mma_xxsetaccz): Change to define_insn.  Remove operand 1.
Use UNSPECV_MMA_XXSETACCZ.  Update comment.
* config/rs6000/rs6000.c (rs6000_rtx_costs): Use UNSPECV_MMA_XXSETACCZ.

gcc/testsuite/
* gcc.target/powerpc/mma-builtin-6.c: Add second call to xxsetacc
built-in.  Update instruction counts.

(cherry picked from commit f80b9be083e0e7d49e7744b7e531b9aa52acd563)

openmp: Disallow reduction with var private in containing parallel even on scope [PR102504]

The standard has a restriction:
"A list item that appears in a reduction clause of a scope construct must be
shared in the parallel region to which a corresponding scope region binds."
similar to the restriction for worksharing constructs, but we were checking
it only on worksharing constructs and not for scope and ICEd later on during
omp expansion.

2021-09-29 Jakub Jelinek <jakub@redhat.com>

PR middle-end/102504
* gimplify.c (gimplify_scan_omp_clauses): Use omp_check_private even
in OMP_SCOPE clauses, not just on worksharing construct clauses.

* c-c++-common/gomp/scope-4.c: New test.

(cherry picked from commit d3e7bb15e28c554bf4484a912f3b9c18c60ec68f)

Daily bump.

libgomp: Only check for 2*sizeof(void*) int type with Fortran [PR96661]

The depend type is a struct with two pointer members for C/C++ - but for
Fortran OpenMP requires an integer type with kind = omp_depend_kind. Thus,
libgomp's configure checks that an integer type/kind with size 2*sizeof(void*)
is available. However, this integer type/kind is not needed when building without
Fortran support. Thus, only check this when Fortran is enabled.

libgomp/
PR libgomp/96661
* configure.ac: Only check for int-type = 2*size_t support when
building with Fortran support.
* configure: Regenerate.

(cherry picked from commit 1f0a57bd54aed558e0167016dd980177f88f8480)

i386: Don't emit fldpi etc. if -frounding-math [PR102498]

i387 has instructions to store some transcedental numbers into the top of
stack.  The problem is that what exact bit in the last place one gets for
those depends on the current rounding mode, the CPU knows the number with
slightly higher precision.  The compiler assumes rounding to nearest when
comparing them against constants in the IL, but at runtime the rounding
can be different and so some of these depending on rounding mode and the
constant could be 1 ulp higher or smaller than expected.
We only support changing the rounding mode at runtime if the non-default
-frounding-mode option is used, so the following patch just disables
using those constants if that flag is on.

2021-09-28  Jakub Jelinek  <jakub@redhat.com>

PR target/102498
* config/i386/i386.c (standard_80387_constant_p): Don't recognize
special 80387 instruction XFmode constants if flag_rounding_math.

* gcc.target/i386/pr102498.c: New test.

(cherry picked from commit 3b7041e8345c2f1030e58620f28e22d64b2c196b)

gfortran.dg/include_15.f90: Add dg-prune-output [PR102500]

gcc/testsuite/
PR fortran/102500
* gfortran.dg/include_15.f90: Add 'dg-prune-output' to prune
-Wmissing-include-dirs output printed or not depending on
how the testsuite is run.

(cherry picked from commit ce450af5087b95001b003184b8ecc2c9bbf65378)

openmp: Don't call omp_finish_clause on implicitly added private clauses on simd [PR102492]

The gimplifier adds implicit private clauses on SIMD constructs for local
variables in the SIMD body if they are addressable to make sure they use
the magic arrays with "omp simd array" attribute (such that each SIMD lane
has its own copy), but we actually don't need to default privatize etc. those,
the construction for them is done in the SIMD body and so is destruction.
omp_finish_clause for C++ now requires default constructor (and dtor) for private,
so that OpenMP 5.1 default(private) works, but that will never be needed on
SIMD. So, this patch just doesn't call omp_finish_clause for private on simd.
The C and Fortran langhooks don't do anything for private.

2021-09-28 Jakub Jelinek <jakub@redhat.com>

PR middle-end/102492
* gimplify.c (gimplify_adjust_omp_clauses_1): Don't call the
omp_finish_clause langhook on implicitly added OMP_CLAUSE_PRIVATE
clauses on SIMD constructs.

* g++.dg/gomp/simd-3.C: New test.

(cherry picked from commit 4f07769057c45ec9e751ab1c23e0fe4750102840)

Daily bump.

Merge remote-tracking branch 'origin/releases/gcc-11' into devel/omp/gcc-11

Merge up to r11-9033-g654d1bd86a6d62e03530872d519a20bd06ae3a50 (Sept 27, 2021)

Fortran: Fix assumed-size to assumed-rank passing [PR94070]

This code inlines the size0 and size1 libgfortran calls, the former is still
used by libgfortan itself (and by old code). Besides permitting more
optimizations, it also permits to handle assumed-rank dummies better: If the
dummy argument is a nonpointer/nonallocatable, an assumed-size actual arg is
repesented by having ubound == -1 for the last dimension. However, for
allocatable/pointers, this value can also exist. Hence, the dummy arg attr
has to be honored.

For that reason, when calling an assumed-rank procedure with nonpointer,
nonallocatable dummy arguments, the bounds have to be updated to avoid
the case ubound == -1 for the last dimension.

PR fortran/94070

gcc/fortran/ChangeLog:

* trans-array.c (gfc_tree_array_size): New function to
find size inline (whole array or one dimension).
(array_parameter_size): Use it, take stmt_block as arg.
(gfc_conv_array_parameter): Update call.
* trans-array.h (gfc_tree_array_size): Add prototype.
* trans-decl.c (gfor_fndecl_size0, gfor_fndecl_size1): Remove
these global vars.
(gfc_build_intrinsic_function_decls): Remove their initialization.
* trans-expr.c (gfc_conv_procedure_call): Update
bounds of pointer/allocatable actual args to nonallocatable/nonpointer
dummies to be one based.
* trans-intrinsic.c (gfc_conv_intrinsic_shape): Fix case for
assumed rank with allocatable/pointer dummy.
(gfc_conv_intrinsic_size): Update to use inline function.
* trans.h (gfor_fndecl_size0, gfor_fndecl_size1): Remove var decl.

libgfortran/ChangeLog:

* intrinsics/size.c (size0, size1): Comment that now not
used by newer compiler code.

libgomp/ChangeLog:

* testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: Update
expected dg-note output.

gcc/testsuite/ChangeLog:

* gfortran.dg/c-interop/cf-out-descriptor-6.f90: Remove xfail.
* gfortran.dg/c-interop/size.f90: Remove xfail.
* gfortran.dg/intrinsic_size_3.f90: Update scan-tree-dump-times.
* gfortran.dg/transpose_optimization_2.f90: Likewise.
* gfortran.dg/size_optional_dim_1.f90: Add scan-tree-dump-not.
* gfortran.dg/assumed_rank_22.f90: New test.
* gfortran.dg/assumed_rank_22_aux.c: New test.

(cherry picked from commit 00f6de9c69119594f7dad3bd525937c94c8200d0)

Daily bump.

Fortran: Fix associated intrinsic with assumed rank [PR101334]

ASSOCIATE (ptr, tgt) takes as first argument also an assumed-rank array;
however, using it together with a tgt (required to be non assumed rank)
had issues for both scalar and nonscalar tgt.

PR fortran/101334
gcc/fortran/ChangeLog:

* trans-intrinsic.c (gfc_conv_associated): Support assumed-rank
'pointer' with scalar/array 'target' argument.

libgfortran/ChangeLog:

* intrinsics/associated.c (associated): Also check for same rank.

gcc/testsuite/ChangeLog:

* gfortran.dg/associated_assumed_rank.f90: New test.

(cherry picked from commit fe2771b291c2c7c0ac37b75ec5b160937524b60c)

Daily bump.

Fortran: Add missing diagnostic for F2018 C711 (TS29113 C407c)

2021-09-24 Sandra Loosemore <sandra@codesourcery.com>

PR fortran/101333

gcc/fortran/
* interface.c (compare_parameter): Enforce F2018 C711.

gcc/testsuite/
* gfortran.dg/c-interop/c407c-1.f90: Remove xfails.

(cherry picked from commit 2364250eccc389a5f9820ac55f8260d34f229e73)

Fortran: Diagnose default-initialized pointer/allocatable dummies

TS29113 changed what was then C516 in the 2010 Fortran standard (now
C1557 in F2018) from disallowing all of pointer, allocatable, and
optional attributes on dummy arguments to BIND(C) functions, to
disallowing only pointer/allocatable with default-initialization.
gfortran was previously failing to diagnose violations of this
constraint.

2021-09-23 Sandra Loosemore <sandra@codesourcery.com>

PR fortran/101320

gcc/fortran/
* decl.c (gfc_verify_c_interop_param): Handle F2018 C1557,
aka TS29113 C516.

gcc/testsuite/
* gfortran.dg/c-interop/c516.f90: Remove xfails. Add more
tests.

(cherry picked from commit 2646d0e06b170569be1da28fce1d6e2f03a15f60)