git.ipfire.org Git - thirdparty/gcc.git/log

Daily bump.

tree-optimization: [PR102622]: wrong code due to signed one bit integer and "a?-1:0"

Since the problem was already fixed on this branch, we just want to add the
testcase so it does not regress there.
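
For illustration only: a signed one-bit bit-field can hold just the values 0
and -1, which is why "a ? -1 : 0" is a natural expression to store into it.
The sketch below is not the content of bitfld-10.c; the struct and function
names are hypothetical.

```c
/* Hypothetical illustration, not the contents of bitfld-10.c.  */
struct one_bit
{
  signed int b : 1;   /* representable values: 0 and -1 */
};

/* Store "a ? -1 : 0" into the one-bit field and read it back.  */
int
store_and_load (int a)
{
  struct one_bit s;
  s.b = a ? -1 : 0;
  return s.b;
}
```

A correct compiler returns -1 for any nonzero argument and 0 otherwise.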

PR tree-optimization/102622

gcc/testsuite/ChangeLog:

* gcc.c-torture/execute/bitfld-10.c: New test.

(cherry picked from commit 882d806c1a8f9d2d2ade1133de88d63e5d4fe40c)

doc: improve -fsanitize=undefined description

gcc/ChangeLog:
* doc/invoke.texi: Add link to UndefinedBehaviorSanitizer
documentation, mention UBSAN_OPTIONS, similar to what is done
for AddressSanitizer.

(cherry picked from commit 1c0a83eff7bb5b1db997a9726ae6542aec893baa)

Daily bump.

var-tracking: Fix a wrong-debug issue caused by my r10-7665 var-tracking change [PR102441]

Since my r10-7665-g33c45e51b4914008064d9b77f2c1fc0eea1ad060 change, we get
wrong-debug on e.g. the following testcase at -O2 -g on x86_64-linux for the
x parameter:
void bar (int *r);
int
foo (int x)
{
  int r = 0;
  bar (&r);
  return r;
}
At the start of function, we have
        subq    $24, %rsp
        leaq    12(%rsp), %rdi
instructions.  The x parameter is passed in %rdi, but isn't used in the
function and so the leaq instruction overwrites %rdi without remembering
%rdi anywhere.  Before the r10-7665 change (which was trying to fix a large
(3% for 32-bit, 1% for 64-bit x86-64) debug info/loc growth introduced with
r10-7515), the leaq insn above resulted in a MO_VAL_SET micro-operation that
said that the value of sp + 12, a cselib_sp_derived_value_p, is stored into
the %rdi register.  The r10-7665 change modified add_stores so that it
emits no micro-operation for the leaq store, with the rationale that
sp-based values can always be computed in some other, more compact and
above all more stable way (a cfa-based expression like DW_OP_fbreg, which
is the same in the whole function).  That is true.  But by throwing the
micro-operation on the floor, we miss another important part of the
MO_VAL_SET, in particular that the destination of the store, %rdi in this
case, now has a different value from what it had before, so the vt_*
dataflow code thinks that even after the leaq instruction %rdi still holds
the x argument value (and changes it to DW_OP_entry_value (%rdi) only in the
middle of the call to bar).  Previously and with the patches below,
the location for x changes already at the end of leaq instruction to
DW_OP_entry_value (%rdi).

My first attempt to fix this was, instead of dropping the MO_VAL_SET, to
add a MO_CLOBBER operation:
--- gcc/var-tracking.c.jj       2021-05-04 21:02:24.196799586 +0200
+++ gcc/var-tracking.c  2021-09-24 19:23:16.420154828 +0200
@@ -6133,7 +6133,9 @@ add_stores (rtx loc, const_rtx expr, voi
     {
       if (preserve)
        preserve_value (v);
-      return;
+      mo.type = MO_CLOBBER;
+      mo.u.loc = loc;
+      goto log_and_return;
     }

   nloc = replace_expr_with_values (oloc);
so don't track that the value lives in the loc destination, but track
that the previous value doesn't live there anymore.  That failed bootstrap
miserably, the vt_* code isn't prepared to see MO_CLOBBER of a MEM that
isn't tracked (e.g. has MEM_EXPR on it that the var-tracking code wants
to track, i.e. track_p in add_stores).  On the other side, thinking about
it more, in the most common case where a cselib_sp_derived_value_p value
is stored into the sp register (and which is the reason why PR94495
testcase got larger), dropping the micro-operation on the floor is the
right thing, because we have that cselib_sp_derived_value_p tracking, any
reads from the sp hard register will be treated as
cselib_sp_derived_value_p.
I then tried the 3 different patches described below; in the end,
what is committed is patch2.
Additionally, I gathered statistics from cc1plus, each time reverting the
var-tracking.c change after the bootstrap/regtest finished and rebuilding the
stage3 var-tracking.o and cc1plus, so that the results are comparable.
The dwlocstat output and .debug_{info,loclists} section sizes are detailed below.
patch3 uses MO_VAL_SET (i.e. essentially reversion of the r10-7665
change) when destination is not a REG_P and !track_p, otherwise if
destination is sp drops the micro-operation on the floor (i.e. no change),
otherwise adds a MO_CLOBBER.
patch1 is similar, except it checks for destination not equal to sp and
!track_p, i.e. for !track_p REG_P destinations other than sp it will use
MO_VAL_SET rather than MO_CLOBBER.
Finally, patch2, the shortest patch, uses MO_VAL_SET whenever destination
is not sp and otherwise drops the micro-operation on the floor.
None of the 3 patches affects the PR94495 testcase; all the changes
there were caused by stores of sp based values into %rsp.
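
The three variants can be summarized as a small decision function.  This is
only a toy model of the prose above, not code from var-tracking.c: the enum,
the function names and the three boolean inputs are hypothetical stand-ins
for "destination is sp", REG_P (loc) and track_p.

```c
#include <stdbool.h>

/* Toy model only: these names mirror the description, not var-tracking.c.  */
enum action { DROP_MICRO_OP, EMIT_VAL_SET, EMIT_CLOBBER };

/* patch1: MO_VAL_SET when the destination is not sp and !track_p;
   otherwise drop for sp, MO_CLOBBER for everything else.  */
enum action
patch1_action (bool dest_is_sp, bool dest_is_reg, bool track_p)
{
  (void) dest_is_reg;
  if (!dest_is_sp && !track_p)
    return EMIT_VAL_SET;
  return dest_is_sp ? DROP_MICRO_OP : EMIT_CLOBBER;
}

/* patch2: MO_VAL_SET whenever the destination is not sp, else drop.  */
enum action
patch2_action (bool dest_is_sp, bool dest_is_reg, bool track_p)
{
  (void) dest_is_reg;
  (void) track_p;
  return dest_is_sp ? DROP_MICRO_OP : EMIT_VAL_SET;
}

/* patch3: MO_VAL_SET when the destination is not a REG_P and !track_p;
   otherwise drop for sp, MO_CLOBBER for everything else.  */
enum action
patch3_action (bool dest_is_sp, bool dest_is_reg, bool track_p)
{
  if (!dest_is_reg && !track_p)
    return EMIT_VAL_SET;
  return dest_is_sp ? DROP_MICRO_OP : EMIT_CLOBBER;
}
```

For the leaq into %rdi from the testcase (a register other than sp, not
tracked), this model gives MO_VAL_SET for patch1 and patch2 and MO_CLOBBER
for patch3; for a store into sp, all three drop the micro-operation.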

While patch2 (and patch1, which results in exactly the same sizes) causes
the largest debug loclists/info growth of the 3 on 64-bit (on 32-bit, patch3
is marginally larger), it is still quite minor (0.651% on 64-bit and 0.114%
on 32-bit) compared to the 1% and 3% PR94495 was trying to solve, and I
actually think it is the best thing to do.  Because, if we have say
  int q[10];
  int *p = &q[0];
or similar and we load the &q[0] sp based value into some hard register,
by noting in the debug info that p lives in some hard reg for some part
of the function and a user is trying to change the p var in the debugger,
if we say it lives in some register or memory, there is some chance that
the changing of the value could work successfully (of course, nothing
is guaranteed, we don't have tracking of where each var lives at which
moment for changing purposes (i.e. what register, memory or else you need
to change in order to change behavior of the code)), while if we just say
that p's location is DW_OP_fbreg 16 DW_OP_stack_value, that is a read-only
value one can just print but not change.  Now, for stores of variable
values into the sp register, I don't think we have such an issue, you don't
want debugger to change your stack pointer when user asks to change value
of some variable whose value lives in the stack pointer, that would pretty
much always result in misbehavior of the program.
So, my preference from these 3 is patch2 and that is being committed.

64-bit cc1plus
==============
vanilla
cov%    samples cumul
0..10   1064665/37%     1064665/37%
11..20  35972/1%        1100637/38%
21..30  47969/1%        1148606/40%
31..40  45787/1%        1194393/42%
41..50  57529/2%        1251922/44%
51..60  53974/1%        1305896/46%
61..70  112055/3%       1417951/50%
71..80  79420/2%        1497371/52%
81..90  126225/4%       1623596/57%
91..100 1206682/42%     2830278/100%
  [34] .debug_info       PROGBITS        0000000000000000 2f1c74c a44949f 00      0   0  1
  [38] .debug_loclists   PROGBITS        0000000000000000 ff5d046 506e947 00      0   0  1
patch1 (same as patch2)
cov%    samples cumul
0..10   1064685/37%     1064685/37%
11..20  36011/1%        1100696/38%
21..30  47975/1%        1148671/40%
31..40  45799/1%        1194470/42%
41..50  57566/2%        1252036/44%
51..60  54011/1%        1306047/46%
61..70  112068/3%       1418115/50%
71..80  79421/2%        1497536/52%
81..90  126171/4%       1623707/57%
91..100 1206571/42%     2830278/100%
  [34] .debug_info       PROGBITS        0000000000000000 2f1c74c a448f27 00      0   0  1
  [38] .debug_loclists   PROGBITS        0000000000000000 ff608bc 52070dd 00      0   0  1
patch3
cov%    samples cumul
0..10   1064698/37%     1064698/37%
11..20  36018/1%        1100716/38%
21..30  47977/1%        1148693/40%
31..40  45804/1%        1194497/42%
41..50  57562/2%        1252059/44%
51..60  54018/1%        1306077/46%
61..70  112071/3%       1418148/50%
71..80  79424/2%        1497572/52%
81..90  126172/4%       1623744/57%
91..100 1206534/42%     2830278/100%
  [34] .debug_info       PROGBITS        0000000000000000 2f1c74c a449548 00      0   0  1
  [38] .debug_loclists   PROGBITS        0000000000000000 ff5df39 507acd8 00      0   0  1
So, size of .debug_info+.debug_loclists grows for vanilla -> patch1 (or patch2) by
0.651% and for vanilla -> patch3 by 0.020%.

32-bit cc1plus
==============
vanilla
cov%    samples cumul
0..10   1061892/37%     1061892/37%
11..20  34002/1%        1095894/39%
21..30  43513/1%        1139407/40%
31..40  41667/1%        1181074/42%
41..50  59144/2%        1240218/44%
51..60  47009/1%        1287227/45%
61..70  105069/3%       1392296/49%
71..80  72990/2%        1465286/52%
81..90  125988/4%       1591274/56%
91..100 1208726/43%     2800000/100%
  [33] .debug_info       PROGBITS        00000000 351ab10 8b1c83d 00      0   0  1
  [37] .debug_loclists   PROGBITS        00000000 ebc816e 3fe44fd 00      0   0  1
patch1 (same as patch2)
cov%    samples cumul
0..10   1061999/37%     1061999/37%
11..20  34065/1%        1096064/39%
21..30  43557/1%        1139621/40%
31..40  41690/1%        1181311/42%
41..50  59191/2%        1240502/44%
51..60  47143/1%        1287645/45%
61..70  105045/3%       1392690/49%
71..80  73021/2%        1465711/52%
81..90  125885/4%       1591596/56%
91..100 1208404/43%     2800000/100%
  [33] .debug_info       PROGBITS        00000000 351ab10 8b1c597 00      0   0  1
  [37] .debug_loclists   PROGBITS        00000000 ebca915 401ffad 00      0   0  1
patch3
cov%    samples cumul
0..10   1062006/37%     1062006/37%
11..20  34073/1%        1096079/39%
21..30  43559/1%        1139638/40%
31..40  41693/1%        1181331/42%
41..50  59189/2%        1240520/44%
51..60  47142/1%        1287662/45%
61..70  105054/3%       1392716/49%
71..80  73027/2%        1465743/52%
81..90  125874/4%       1591617/56%
91..100 1208383/43%     2800000/100%
  [33] .debug_info       PROGBITS        00000000 351ab10 8b1c690 00      0   0  1
  [37] .debug_loclists   PROGBITS        00000000 ebca40a 4020a6e 00      0   0  1
So, size of .debug_info+.debug_loclists grows for vanilla -> patch1 (or patch2) by
0.114% and for vanilla -> patch3 by 0.116%.
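
The growth percentages quoted above can be reproduced from the section
header lines, where the last hex number before "00" is the section size in
bytes.  A minimal sketch of that arithmetic; the function name is
hypothetical:

```c
#include <assert.h>

/* Percentage growth of .debug_info + .debug_loclists between two builds.
   All four arguments are section sizes in bytes.  */
double
debug_growth_percent (unsigned long long old_info,
                      unsigned long long old_loclists,
                      unsigned long long new_info,
                      unsigned long long new_loclists)
{
  unsigned long long old_total = old_info + old_loclists;
  unsigned long long new_total = new_info + new_loclists;
  assert (old_total > 0);
  return 100.0 * ((double) new_total - (double) old_total)
         / (double) old_total;
}
```

For example, debug_growth_percent (0xa44949f, 0x506e947, 0xa448f27, 0x52070dd)
evaluates the 64-bit vanilla -> patch1 case and gives roughly 0.651.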

2021-10-10  Jakub Jelinek  <jakub@redhat.com>

PR debug/102441
* var-tracking.c (add_stores): For cselib_sp_derived_value_p values
use MO_VAL_SET if loc is not sp.

(cherry picked from commit 9583b26f3701ea0456405d84f9a898451a2f7452)

Daily bump.

openmp: Fix up declare target handling for vars with DECL_LOCAL_DECL_ALIAS [PR102640]

The introduction of DECL_LOCAL_DECL_ALIAS and push_local_extern_decl_alias
in r11-3699-g4e62aca0e0520e4ed2532f2d8153581190621c1a broke the following
testcase.  The following patch fixes it by treating in the same way not just
the variable that the to or link clause is put on, but also its
DECL_LOCAL_DECL_ALIAS, if any.  If the alias hasn't been created yet, it will
copy the attributes when it is created and therefore should get them for free,
and as it is an extern, nothing more than attributes is needed for it.

2021-10-08  Jakub Jelinek  <jakub@redhat.com>

PR c++/102640
gcc/cp/
* parser.c (handle_omp_declare_target_clause): New function.
(cp_parser_omp_declare_target): Use it.
gcc/testsuite/
* c-c++-common/gomp/pr102640.c: New test.

(cherry picked from commit db3d7270b42fe27fb05664c4fdf524ab7ad13a75)

Daily bump.

c++: variadic ttp constraint subsumption [PR99904]

Here we're crashing when level-lowering the variadic constraint C<Ts...>
on the template template parameter TT because tsubst_pack_expansion expects
processing_template_decl to be set during a partial substitution.

PR c++/99904

gcc/cp/ChangeLog:

* pt.c (is_compatible_template_arg): Set processing_template_decl
around tsubst_constraint_info.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-ttp4.C: New test.

(cherry picked from commit 2e6e0d86a06389056d0e7fecc99c547420ad787a)

Daily bump.

c++: unifying equal NONTYPE_ARGUMENT_PACKs [PR102547]

Here during partial ordering of the two partial specializations we end
up in unify with parm=arg=NONTYPE_ARGUMENT_PACK<V0, V1>, and crash shortly
thereafter because uses_template_parms(parms) calls potential_const_expr
which doesn't handle NONTYPE_ARGUMENT_PACK.

This patch fixes this by extending potential_constant_expression to handle
NONTYPE_ARGUMENT_PACK appropriately.

PR c++/102547

gcc/cp/ChangeLog:

* constexpr.c (potential_constant_expression_1): Handle
NONTYPE_ARGUMENT_PACK.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/variadic-partial2.C: New test.
* g++.dg/cpp0x/variadic-partial2a.C: New test.

(cherry picked from commit d4c470c376b4cb82c9a0b7e8a4b88c44d5e4289d)

c++: __is_trivially_xible and multi-arg aggr paren init [PR102535]

is_xible_helper assumes only 0- and 1-argument ctors can be trivial, but
C++20 aggregate paren init means multi-arg ctors can now be trivial too.
This patch relaxes the relevant early exit check accordingly.

PR c++/102535

gcc/cp/ChangeLog:

* method.c (is_xible_helper): Don't exit early for multi-arg
ctors in C++20.

gcc/testsuite/ChangeLog:

* g++.dg/ext/is_trivially_constructible7.C: New test.

(cherry picked from commit 9845c52db38f15740861435f38f7e5ad8a8de2ec)

c++: defaulted comparisons and vptr fields [PR95567]

We need to explicitly skip over vptr fields when synthesizing a
defaulted comparison operator, because next_initializable_field
doesn't do so for us.

PR c++/95567

gcc/cp/ChangeLog:

* method.c (build_comparison_op): Skip DECL_VIRTUAL_P fields.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/spaceship-virtual1.C: New test.

(cherry picked from commit b6bca2e631b54f992c058ca8e445b45e9816690b)

real: fix encoding of negative IEEE double/quad values [PR98216]

In encode_ieee_double/quad, the assignment

unsigned long WORD = r->sign << 31;

is intended to set the 31st bit of WORD whenever the sign bit is set.
But on LP64 hosts it also unintentionally sets the upper 32 bits of WORD,
because r->sign gets promoted from unsigned:1 to int and then the result
of the shift (equal to INT_MIN) gets sign extended from int to long.

In the C++ frontend, this bug causes incorrect mangling of negative
floating point values because the output of real_to_target called from
write_real_cst unexpectedly has the upper 32 bits of this word set,
which the caller doesn't mask out.

This patch fixes this by avoiding the unwanted sign extension. Note
that r0-53976 fixed the same bug in encode_ieee_single long ago.
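
The effect can be modelled in a few lines of C.  This is a sketch, not the
code in real.c: the struct and function names are hypothetical, int is
assumed to be 32 bits wide, and the buggy variant uses INT_MIN directly
(the value the shift produces with GCC) so that the example itself does not
shift into the sign bit.  The "fixed" variant shows one way to avoid the
sign extension; the actual patch may do it differently.

```c
#include <limits.h>

/* Hypothetical stand-in for the sign bit of a real value.  */
struct real_sketch
{
  unsigned int sign : 1;
};

/* Models the buggy path: the shift result is an int equal to INT_MIN
   when the sign is set, and converting that int to unsigned long
   sign-extends it where long is wider than int.  */
unsigned long
sign_word_buggy (const struct real_sketch *r)
{
  int shifted = r->sign ? INT_MIN : 0;
  return shifted;
}

/* One way to avoid it: widen to unsigned long before shifting.  */
unsigned long
sign_word_fixed (const struct real_sketch *r)
{
  return (unsigned long) r->sign << 31;
}
```

On an LP64 host the buggy variant yields 0xffffffff80000000 for a negative
value, while the fixed variant yields 0x80000000.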

PR c++/98216
PR c++/91292

gcc/ChangeLog:

* real.c (encode_ieee_double): Avoid unwanted sign extension.
(encode_ieee_quad): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/nontype-float2.C: New test.

(cherry picked from commit 34947d4e97ee72b26491cfe5ff4fa8258fadbe95)

c++: concept-ids and value-dependence [PR102412]

The problem here is that uses_template_parms returns true for all
concept-ids (even those with non-dependent arguments), so when a concept-id
is used as a default template argument then during deduction the default
argument is considered dependent even after substituting into it, which
leads to deduction failure (from type_unification_real).

This patch fixes this by implementing the resolution of CWG 2446 which
says a concept-id is dependent only if its arguments are.

DR 2446
PR c++/102412

gcc/cp/ChangeLog:

* constexpr.c (cxx_eval_constant_expression)
<case TEMPLATE_ID_EXPR>: Check value_dependent_expression_p
instead of processing_template_decl.
* pt.c (value_dependent_expression_p) <case TEMPLATE_ID_EXPR>:
Return true only if any_dependent_template_arguments_p.
(instantiation_dependent_r) <case CALL_EXPR>: Remove this case.
<case TEMPLATE_ID_EXPR>: Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-nondep2.C: New test.
* g++.dg/cpp2a/concepts-nondep3.C: New test.

(cherry picked from commit 9329344a6d81a6a5e3bd171167ebc7b158bb44f4)

c++: constrained variable template issues [PR98486]

This fixes some issues with constrained variable templates:

  - Constraints aren't checked when explicitly specializing a variable
    template.
  - Constraints aren't attached to a static data member template at
    parse time.
  - Constraints don't get propagated when (partially) instantiating a
    static data member template, so we need to make sure to look up
    constraints using the most general template during satisfaction.

PR c++/98486

gcc/cp/ChangeLog:

* constraint.cc (get_normalized_constraints_from_decl): Always
look up constraints using the most general template.
* decl.c (grokdeclarator): Set constraints on a static data
member template.
* pt.c (determine_specialization): Check constraints on a
variable template.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-var-templ1.C: New test.
* g++.dg/cpp2a/concepts-var-templ1a.C: New test.
* g++.dg/cpp2a/concepts-var-templ1b.C: New test.

(cherry picked from commit 2e2e65a46d2674bed53afd211493876ee2b79453)

c++: empty union member activation during constexpr [PR102163]

Here, the union's constructor is defined to activate its empty data
member _M_rest, but during constexpr evaluation of this constructor the
subobject constructor call O::O(&_M_rest, 42) doesn't produce a side
effect that actually activates the member, so the union still appears
uninitialized after its constructor has run. This patch fixes this by
using a dummy MODIFY_EXPR in this situation, whose evaluation ensures
the member gets activated.

PR c++/102163

gcc/cp/ChangeLog:

* constexpr.c (cxx_eval_call_expression): After evaluating a
subobject constructor call for an empty union member, produce a
side effect that makes sure the member gets activated.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/constexpr-empty17.C: New test.

(cherry picked from commit de07cff96abd43f6f65dcf333958899c2ec42598)

c++: aggregate CTAD and brace elision [PR101344]

Here the problem is ultimately that collect_ctor_idx_types always
recurses into an eligible sub-CONSTRUCTOR regardless of whether the
corresponding pair of braces was elided in the original initializer.
This causes us to reject some completely-braced forms of aggregate
CTAD as in the first testcase below, because collect_ctor_idx_types
effectively assumes that the original initializer is always minimally
braced (and so the aggregate deduction candidate is given a function
type that's incompatible with the original completely-braced initializer).

In order to fix this, collect_ctor_idx_types needs to somehow know the
shape of the original initializer when iterating over the reshaped
initializer. To that end this patch makes reshape_init flag sub-ctors
that were built to undo brace elision in the original ctor, so that
collect_ctor_idx_types can determine whether to recurse into a sub-ctor
by simply inspecting this flag.

This happens to also fix PR101820, which is about aggregate CTAD using
designated initializers, for much the same reasons.

A curious case is the "intermediately-braced" initialization of 'e3'
(which we reject) in the first testcase below. It seems to me we're
behaving as specified here (according to [over.match.class.deduct]/1)
because the initializer element x_1={1, 2, 3, 4} corresponds to the
subobject e_1=E::t, hence the type T_1 of the first function parameter
of the aggregate deduction candidate is T(&&)[2][2], but T can't be
deduced from x_1 using this parameter type (as opposed to say T(&&)[4]).

PR c++/101344
PR c++/101803

gcc/cp/ChangeLog:

* cp-tree.h (CONSTRUCTOR_BRACES_ELIDED_P): Define.
* decl.c (reshape_init_r): Set it.
* pt.c (collect_ctor_idx_types): Recurse into a sub-CONSTRUCTOR
iff CONSTRUCTOR_BRACES_ELIDED_P.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/class-deduction-aggr11.C: New test.
* g++.dg/cpp2a/class-deduction-aggr12.C: New test.

(cherry picked from commit be4a4fb516688d7cfe28a80a4aa333f4ecf0b518)

c++: ignore explicit dguides during NTTP CTAD [PR101883]

Since (template) argument passing is a copy-initialization context,
we mustn't consider explicit deduction guides when deducing a CTAD
placeholder type of an NTTP.

PR c++/101883

gcc/cp/ChangeLog:

* pt.c (convert_template_argument): Pass LOOKUP_IMPLICIT to
do_auto_deduction.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/nontype-class49.C: New test.

(cherry picked from commit a6b3db3e8625a3cba1240f0b5e1a29bd6c68b8ca)

Fortran: Fix deprecate warning with parameter

Only warn with !GCC$ ATTRIBUTES DEPRECATED if
deprecated PARAMETERs are actually used.

gcc/fortran/ChangeLog:

* resolve.c (resolve_values): Only show
deprecated warning if attr.referenced.

gcc/testsuite/ChangeLog:

* gfortran.dg/attr_deprecated-2.f90: New test.

(cherry picked from commit ece8b0fce6bbfb1e531de8164da47eeed80d3cf1)

Daily bump.

c++: Fix apply_identity_attributes [PR102548]

The following testcase ICEs on x86_64-linux with -m32 due to a bug in
apply_identity_attributes.  The function is being smart and attempts not
to duplicate the chain unnecessarily: if either there are no attributes
that affect type identity, or there is a possibly empty set of attributes
that do not affect type identity in the chain followed by attributes
that do affect type identity, it reuses that attribute chain.

The function mishandles the cases where in the chain an attribute affects
type identity and is followed by one or more attributes that don't
affect type identity (and then perhaps some further ones that do).

There are two bugs.  One is that when we notice the first attribute that
doesn't affect type identity after the first attribute that does affect type
identity (with perhaps some further such attributes in the chain after it),
we want to put into the new chain just attributes starting from
(inclusive) first_ident and up to (exclusive) the current attribute a,
but the code puts into the chain all attributes starting with first_ident,
including the ones that do not affect type identity and if e.g. we have
doesn't0 affects1 doesn't2 affects3 affects4 sequence of attributes, the
resulting sequence would have
affects1 doesn't2 affects3 affects4 affects3 affects4
attributes, i.e. one attribute that shouldn't be there and two attributes
duplicated.  That is fixed by the a2 -> a2 != a change.

The second one is that we ICE once we see a second attribute that doesn't
affect type identity after an attribute that affects it.  That is because
first_ident is set to error_mark_node after handling the first attribute
that doesn't affect type identity (i.e. after we've copied the
[first_ident, a) set of attributes to the new chain) to denote that from
that time on, each attribute that affects type identity should be copied
whenever it is seen (the if (as && as->affects_type_identity) code does
that correctly).  But when that condition is false and first_ident is
error_mark_node, we enter else if (first_ident) and use TREE_PURPOSE
/TREE_VALUE/TREE_CHAIN on error_mark_node, which ICEs.  When
first_ident is error_mark_node and a doesn't affect type identity,
we want to do nothing.  So that is the && first_ident != error_mark_node
chunk.
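
The intended result can be sketched on a plain linked list.  This is a
simplified model, not the algorithm in tree.c: struct attr and
identity_attrs are hypothetical, and instead of tracking first_ident it
shares the tail that follows the last attribute not affecting type identity
and copies the affecting attributes before it.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for one entry of an attribute chain.  */
struct attr
{
  int id;
  int affects_identity;
  struct attr *next;
};

/* Return a chain holding only the attributes that affect type identity,
   in their original order.  The all-affecting tail of the input is
   shared; affecting entries before it are copied.  */
struct attr *
identity_attrs (struct attr *chain)
{
  /* The shared tail starts right after the last non-affecting entry.  */
  struct attr *suffix = chain;
  for (struct attr *a = chain; a; a = a->next)
    if (!a->affects_identity)
      suffix = a->next;

  struct attr *head = suffix;
  struct attr **tailp = &head;
  for (struct attr *a = chain; a != suffix; a = a->next)
    {
      if (!a->affects_identity)
        continue;
      struct attr *copy = malloc (sizeof *copy);
      if (!copy)
        abort ();
      *copy = *a;
      copy->next = suffix;
      *tailp = copy;
      tailp = &copy->next;
    }
  return head;
}
```

For the doesn't0 affects1 doesn't2 affects3 affects4 sequence from the
description, this yields a copy of affects1 followed by the original
affects3 affects4 tail, with nothing duplicated.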

2021-10-05  Jakub Jelinek  <jakub@redhat.com>

PR c++/102548
* tree.c (apply_identity_attributes): Fix handling of the
case where an attribute in the list doesn't affect type
identity but some attribute before it does.

* g++.target/i386/pr102548.C: New test.

(cherry picked from commit 737f95bab557584d876f02779ab79fe3cfaacacf)

ubsan: Use -f{,no-}sanitize-recover=float-divide-by-zero for float division by zero recovery [PR102515]

We've been using
-f{,no-}sanitize-recover=integer-divide-by-zero to decide on the float
-fsanitize=float-divide-by-zero instrumentation _abort suffix.
This patch fixes it to use -f{,no-}sanitize-recover=float-divide-by-zero
for it instead.
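
The idea of the fix, as a minimal sketch: the recover bit that is consulted
has to match the sanitizer that instruments the division.  The bit values
and the function name below are hypothetical; they are not GCC's SANITIZE_*
constants or the code in c-ubsan.c.

```c
#include <stdbool.h>

/* Hypothetical bit values for this sketch only.  */
enum
{
  SKETCH_INT_DIVIDE = 1u << 0,
  SKETCH_FLOAT_DIVIDE = 1u << 1
};

/* Return true if the non-recovering (_abort) runtime entry should be
   used.  The bit tested depends on which sanitization is being done.  */
bool
use_abort_variant (unsigned int recover_mask, bool float_division)
{
  unsigned int bit
    = float_division ? SKETCH_FLOAT_DIVIDE : SKETCH_INT_DIVIDE;
  return (recover_mask & bit) == 0;
}
```

With only the integer recover bit set, a float division check in this model
still selects the _abort variant, which is the behavior the patch restores.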

2021-10-01 Jakub Jelinek <jakub@redhat.com>
Richard Biener <rguenther@suse.de>

PR sanitizer/102515
gcc/c-family/
* c-ubsan.c (ubsan_instrument_division): Check the right
flag_sanitize_recover bit, depending on which sanitization
is done.
gcc/testsuite/
* c-c++-common/ubsan/float-div-by-zero-2.c: New test.

(cherry picked from commit 9c1a633d96926357155d4702b66f8a0ec856a81f)

c++: Fix handling of __thread/thread_local extern vars declared at function scope [PR102496]

The introduction of push_local_extern_decl_alias in
r11-3699-g4e62aca0e0520e4ed2532f2d8153581190621c1a
broke tls vars: while the decl they are created for has the tls model
set properly, nothing sets it for the alias that is actually used,
so accesses to it are done as if they were normal variables.
This is then diagnosed at link time if the definition of the extern
vars is __thread/thread_local.
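
The construct in question is an extern declaration of a thread-local
variable at function scope.  A minimal sketch of that shape, written in C
with the GNU __thread keyword; the bug itself was in the C++ front end, the
names are hypothetical, and this is not one of the new tests:

```c
/* Thread-local definition at file scope.  */
__thread int counter;

int
bump (void)
{
  /* Function-scope extern declaration of the thread-local variable;
     accesses through it must still use the TLS access model.  */
  extern __thread int counter;
  return ++counter;
}
```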

2021-10-01 Jakub Jelinek <jakub@redhat.com>

PR c++/102496
* name-lookup.c (push_local_extern_decl_alias): Return early even for
tls vars with non-dependent type when processing_template_decl. For
CP_DECL_THREAD_LOCAL_P vars call set_decl_tls_model on alias.

* g++.dg/tls/pr102496-1.C: New test.
* g++.dg/tls/pr102496-2.C: New test.

(cherry picked from commit 701075864ac4d1c6cec936d10f9cfc2aeb8c1699)

IBM Z: Use @PLT symbols for local functions in 64-bit mode

This helps with generating code for kernel hotpatches, which contain
individual functions and are loaded more than 2G away from vmlinux.
This should not create performance regressions for the normal use
cases, because for local functions ld replaces @PLT calls with direct
calls.

gcc/ChangeLog:

* config/s390/predicates.md (bras_sym_operand): Accept all
functions in 64-bit mode, use UNSPEC_PLT31.
(larl_operand): Use UNSPEC_PLT31.
* config/s390/s390.c (s390_loadrelative_operand_p): Likewise.
(legitimize_pic_address): Likewise.
(s390_emit_tls_call_insn): Mark __tls_get_offset as function,
use UNSPEC_PLT31.
(s390_delegitimize_address): Use UNSPEC_PLT31.
(s390_output_addr_const_extra): Likewise.
(print_operand): Add @PLT to TLS calls, handle %K.
(s390_function_profiler): Mark __fentry__/_mcount as function,
use %K, use UNSPEC_PLT31.
(s390_output_mi_thunk): Use only UNSPEC_GOT, use %K.
(s390_emit_call): Use UNSPEC_PLT31.
(s390_emit_tpf_eh_return): Mark __tpf_eh_return as function.
* config/s390/s390.md (UNSPEC_PLT31): Rename from UNSPEC_PLT.
(*movdi_64): Use %K.
(reload_base_64): Likewise.
(*sibcall_brc): Likewise.
(*sibcall_brcl): Likewise.
(*sibcall_value_brc): Likewise.
(*sibcall_value_brcl): Likewise.
(*bras): Likewise.
(*brasl): Likewise.
(*bras_r): Likewise.
(*brasl_r): Likewise.
(*bras_tls): Likewise.
(*brasl_tls): Likewise.
(main_base_64): Likewise.
(reload_base_64): Likewise.
(@split_stack_call<mode>): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/ext/visibility/noPLT.C: Skip on s390x.
* g++.target/s390/mi-thunk.C: New test.
* gcc.target/s390/nodatarel-1.c: Move foostatic to the new
tests.
* gcc.target/s390/pr80080-4.c: Allow @PLT suffix.
* gcc.target/s390/risbg-ll-3.c: Likewise.
* gcc.target/s390/call.h: Common code for the new tests.
* gcc.target/s390/call-z10-pic-nodatarel.c: New test.
* gcc.target/s390/call-z10-pic.c: New test.
* gcc.target/s390/call-z10.c: New test.
* gcc.target/s390/call-z9-pic-nodatarel.c: New test.
* gcc.target/s390/call-z9-pic.c: New test.
* gcc.target/s390/call-z9.c: New test.
* gcc.target/s390/mfentry-m64-pic.c: New test.
* gcc.target/s390/tls.h: Common code for the new TLS tests.
* gcc.target/s390/tls-pic.c: New test.
* gcc.target/s390/tls.c: New test.

(cherry picked from commit 0990d93dd8a)

IBM Z: Define NO_PROFILE_COUNTERS

s390 glibc does not need counters in the .data section, since it stores
edge hits in its own data structure. Therefore counters only waste
space and confuse diffing tools (e.g. kpatch), so don't generate them.

gcc/ChangeLog:

* config/s390/s390.c (s390_function_profiler): Ignore labelno
parameter.
* config/s390/s390.h (NO_PROFILE_COUNTERS): Define.

gcc/testsuite/ChangeLog:

* gcc.target/s390/mnop-mcount-m31-mzarch.c: Adapt to the new
prologue size.
* gcc.target/s390/mnop-mcount-m64.c: Likewise.

(cherry picked from commit a1c1b7a888a)

Daily bump.

Fix testcase counts.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/fusion-p10-ldcmpi.c: Update counts.

d: gdc driver ignores -static-libstdc++ when automatically linking libstdc++ library

Adds handling of `-static-libstdc++' in the gdc driver, so that libstdc++
is appropriately linked if libstdc++ is either needed or seen on the
command-line.

PR d/102574

gcc/d/ChangeLog:

* d-spec.cc (lang_specific_driver): Link libstdc++ statically if
-static-libstdc++ was given on command-line.

(cherry picked from commit c86a16b07b76604a8e3d556f135babab80e2b747)

Remove dead code in config/rs6000/vxworks.h

These lines were added last year:

/* Initialize library function table. */
#undef TARGET_INIT_LIBFUNCS
#define TARGET_INIT_LIBFUNCS rs6000_vxworks_init_libfuncs

but TARGET_INIT_LIBFUNCS is #undef-ed in config/rs6000/rs6000.c and
rs6000_vxworks_init_libfuncs is nowhere defined in any case.

gcc/
* config/rs6000/vxworks.h (TARGET_INIT_LIBFUNCS): Delete.

Daily bump.

Fortran: resolve expressions during SIZE simplification

gcc/fortran/ChangeLog:

PR fortran/102458
* simplify.c (simplify_size): Resolve expressions used in array
specifications so that SIZE can be simplified.

gcc/testsuite/ChangeLog:

PR fortran/102458
* gfortran.dg/pr102458b.f90: New test.

(cherry picked from commit b19bbfb1482505367dd19ae4ab1ea19e36802b6a)

Fortran - improve checking for intrinsics allowed in constant expressions

gcc/fortran/ChangeLog:

PR fortran/102458
* expr.c (is_non_constant_intrinsic): Check for intrinsics
excluded in constant expressions (F2018:10.1.12).
(gfc_is_constant_expr): Use that check.

gcc/testsuite/ChangeLog:

PR fortran/102458
* gfortran.dg/pr102458.f90: New test.

(cherry picked from commit 84cccff60a978174271a30042bf7841d2ae436eb)

coroutines: Only set parm copy guard vars if we have exceptions [PR 102454].

For coroutines, we make copies of the original function arguments into
the coroutine frame. Normally, these are destroyed on the proper exit
from the coroutine when the frame is destroyed.

However, if an exception is thrown before the first suspend point is
reached, the cleanup has to happen in the ramp function. These cleanups
are guarded such that they are only applied to any param copies actually
made.

The ICE is caused by an attempt to set the guard variable when there are
no exceptions enabled (the guard var is not created in this case).

Fixed by checking for flag_exceptions in this case too.

While touching these code paths, also clean up the synthetic names used
when a function parm is unnamed.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
PR c++/102454

gcc/cp/ChangeLog:

* coroutines.cc (analyze_fn_parms): Clean up synthetic names for
unnamed function params.
(morph_fn_to_coro): Do not try to set a guard variable for param
DTORs in the ramp, unless we have exceptions active.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/pr102454.C: New test.

(cherry picked from commit fae627162d5f8cfb273b10349883eeb74baaa43f)

coroutines: Make proxy vars for the function arg copies.

This adds top level proxy variables for the coroutine frame
copies of the original function args. These are then available
in the debugger to refer to the frame copies. We rewrite the
function body to use the copies, since the original parms will
no longer be in scope when the coroutine is running.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/cp/ChangeLog:

* coroutines.cc (struct param_info): Add copy_var.
(build_actor_fn): Use simplified param references.
(register_param_uses): Likewise.
(rewrite_param_uses): Likewise.
(analyze_fn_parms): New function.
(coro_rewrite_function_body): Add proxies for the fn
parameters to the outer bind scope of the rewritten code.
(morph_fn_to_coro): Use simplified version of param ref.

(cherry picked from commit 70ee703c479081ac2ea67eb67041551216e66783)

coroutines: Expose implementation state to the debugger.

In the process of transforming a coroutine into the separate representation
as the ramp function and a state machine, we generate some variables that
are of interest to a user during debugging.  Any variable that is persistent
for the execution of the coroutine is placed into the coroutine frame.

In particular:
  The promise object.
  The function pointers for the resumer and destroyer.
  The current resume index (suspend point).
  The handle that represents this coroutine 'self handle'.
  Any handle provided for a continuation coroutine.
  Whether the coroutine frame is allocated and needs to be freed.

Visibility of some of these has already been requested by end users.

This patch ensures that such variables have names that are usable in a
debugger, but are in the reserved namespace for the implementation (they
all begin with _Coro_).  The identifiers are generated lazily when the
first coroutine is encountered.

We place the variables into the outermost bind expression and then add a
DECL_VALUE_EXPR to each that points to the frame entry.

These changes simplify the handling of the variables in the body of the
function (in particular, the use of the DECL_VALUE_EXPR means that we now
no longer need to rewrite proxies for the promise and coroutine handles into
the frame->offset form).

Partial improvement to debugging (PR c++/99215).

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/cp/ChangeLog:

* coroutines.cc (coro_resume_fn_id, coro_destroy_fn_id,
coro_promise_id, coro_frame_needs_free_id, coro_resume_index_id,
coro_self_handle_id, coro_actor_continue_id,
coro_frame_i_a_r_c_id): New.
(coro_init_identifiers): Initialize new name identifiers.
(coro_promise_type_found_p): Use pre-built identifiers.
(struct await_xform_data): Remove unused fields.
(transform_await_expr): Delete code that is now unused.
(build_actor_fn): Simplify interface, use pre-built identifiers and
remove transforms that are no longer needed.
(build_destroy_fn): Use revised field names.
(register_local_var_uses): Use pre-built identifiers.
(coro_rewrite_function_body): Simplify interface, use pre-built
identifiers.  Generate proxy vars in the outer bind expr scope for the
implementation state that we wish to expose.
(morph_fn_to_coro): Adjust comments for new variable names, use pre-
built identifiers.  Remove unused code to generate frame entries for
the implementation state.  Adjust call for build_actor_fn.

(cherry picked from commit c5a735fa9df7eca4666c8da5e51ed9c5ab7cc81a)

coroutines: Support for debugging implementation state.

Some of the state that is associated with the implementation
is of interest to a user debugging a coroutine, in particular
items such as the promise object and the current suspend
point.

These variables live in the coroutine frame, but we can inject
proxies for them into the outermost bind expression of the
coroutine.  Such variables are automatically moved into the
coroutine frame (if they need to persist across a suspend
expression).  Placing the proxies thus allows the user to
inspect them by name in the debugger.

To implement this, we ensure that (at the outermost scope) the
frame entries are not mangled (coroutine frame variables are
usually mangled with scope nesting information so that they do
not clash).  We can safely avoid doing this for the outermost
scope so that we can map frame entries directly to the variables.

This is a partial contribution to debug support (PR 99215).

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/cp/ChangeLog:

* coroutines.cc (register_local_var_uses): Do not mangle
frame entries for the outermost scope.  Record the outer
scope as nesting depth 0.

(cherry picked from commit addf167a23f61c0ec97f6e71577a0623f3fc13e7)

coroutines: Add a helper for creating local vars.

This is primarily code factoring, but we take this opportunity
to rename some of the implementation variables (which we intend
to expose to debugging) so that they are in the implementation
namespace.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/cp/ChangeLog:

* coroutines.cc (coro_build_artificial_var): New.
(build_actor_fn): Use var builder, rename vars to use
implementation namespace.
(coro_rewrite_function_body): Likewise.
(morph_fn_to_coro): Likewise.

(cherry picked from commit a45a7ecdf34311587daa2e90cc732adcefac447b)

coroutines: Use DECL_VALUE_EXPR instead of rewriting vars.

Variables that need to persist over suspension expressions
must be preserved by being copied into the coroutine frame.

The initial implementations do this manually in the transform
code. However, that has various disadvantages - including
that the debug connections are lost between the original var
and the frame copy.

The revised implementation makes use of DECL_VALUE_EXPRs to
contain the frame offset expressions, so that the original
var names are preserved in the code.

This process is also applied to the function parms which are
always copied to the frame. In this case the decls need to be
copied since they are used in two different contexts during
the re-write (in the building of the ramp function, and in
the actor function itself).

This will assist in improvement of debugging (PR 99215).

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/cp/ChangeLog:

* coroutines.cc (transform_local_var_uses): Record
frame offset expressions as DECL_VALUE_EXPRs instead of
rewriting them.

(cherry picked from commit 88974974d8188cf12e87e4ad3d23a8cbdd557f0e)

coroutines : Add a missed begin/finish else clause to the codegen.

Minor code-gen correction.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/cp/ChangeLog:

* coroutines.cc (build_actor_fn): Add begin/finish clauses
to the initial test in the actor function.

(cherry picked from commit 21b4d0ef543d68187d258415b51d0d6676af89fd)

coroutines: No cleanups on goto statements.

Minor cleanup: this is a statement, not an expression, so we do not
need to use finish_expr_stmt here.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/cp/ChangeLog:

* coroutines.cc (await_statement_walker): Use build_stmt and
add_stmt instead of build1 and finish_expr_stmt.

(cherry picked from commit 8406ed9af2655479a9c8469d7acca2cf5784f5d6)

c++: don't call 'rvalue' in coroutines code

A change to check glvalue_p rather than specifically for TARGET_EXPR
revealed issues with the coroutines code's use of the 'rvalue' function,
which shouldn't be used on class glvalues, so I've removed those calls.

In build_co_await I just dropped them, because I don't see anything in the
co_await specification that indicates that we would want to move from an
lvalue result of operator co_await.  And simplified that code while I was
touching it; cp_build_modify_expr (...INIT_EXPR...) will call the
constructor.

In morph_fn_to_coro I changed the handling of the rvalue reference coroutine
frame field to use move, to treat the rval ref as an xvalue.  I used
forward_parm to pass the function parms to the constructor for the field.
And I simplified the return handling so we get the desired rvalue semantics
from the normal implicit move on return.

I question default-initializing the non-void return value of the function if
get_return_object returns void; I'm not messing with it here, but I've filed
PR100476 about it.

gcc/cp/ChangeLog:

* coroutines.cc (build_co_await): Don't call 'rvalue'.
(flatten_await_stmt): Simplify initialization.
(morph_fn_to_coro): Change 'rvalue' to 'move'.  Simplify.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/coro-bad-gro-00-class-gro-scalar-return.C:
Adjust diagnostic.

(cherry picked from commit 14ed21f8749ae359690d9c4a69ca38cc45d0d1b0)

Daily bump.

Default to dwarf version 4 on hppa64-hpux

2021-10-01 John David Anglin <danglin@gcc.gnu.org>

gcc/ChangeLog:

PR debug/102373
* config/pa/pa.c (pa_option_override): Default to dwarf version 4
on hppa64-hpux.

Use libiberty snprintf and vsnprintf on hppa*-*-hpux*.

libiberty/ChangeLog:

PR target/100734
* configure.ac: Use libiberty snprintf and vsnprintf on
hppa*-*-hpux*.
* configure: Regenerate.

Fix ICE with stack checking emulation at -O2

On bare-metal platforms, the Ada compiler emulates stack checking (it is
required by the language and tested by ACATS) in the runtime via the
stack_check_libfunc hook of the RTL middle-end. Calls to the function
are generated as libcalls but they now require a proper function type
at -O2 or above.

gcc/
* explow.c: Include langhooks.h.
(set_stack_check_libfunc): Build a proper function type.

Fix PR c++/64697 at -O1 or above

The BFD fix eliminates the link failure and working code is generated at
-O0, but _not_ when optimization is enabled because the optimizer changes:

        movq    .refptr._ZTH1s(%rip), %rax
        testq   %rax, %rax
        je      .L2
        call    _ZTH1s

into:

        leaq    _ZTH1s(%rip), %rax
        testq   %rax, %rax
        je      .L2
        call    _ZTH1s

and the leaq now also gets the relocation overflow.  So the fix is to
teach legitimate_pic_address_disp_p to reject the transformation when
the symbol is an external weak function, which yields:

        cmpq    $0, .refptr._ZTH1s(%rip)
        je      .L2
        call    _ZTH1s

and the cmpq keeps a relocation that does not overflow.

gcc/
PR c++/64697
* config/i386/i386.c (legitimate_pic_address_disp_p): For PE-COFF do
not return true for external weak function symbols in medium model.

Daily bump.

Fortran: fix error recovery for invalid constructor

gcc/fortran/ChangeLog:

PR fortran/102520
* array.c (expand_constructor): Do not dereference NULL pointer.

gcc/testsuite/ChangeLog:

PR fortran/102520
* gfortran.dg/pr102520.f90: New test.

(cherry picked from commit 5e2adfeed21ee584a82cdcdfa7eed41202eb67cd)

[Ada] Minor tweaks to System.Dwarf_Line

gcc/ada/

* libgnat/s-dwalin.adb (Parse_Header): Tweak comments.
(Read_Entry_Format_Array): Tweak exception message.
(Symbolic_Address.Set_Result): Likewise.

[Ada] Small optimization to DWARF 5 mode in System.Dwarf_Line

gcc/ada/

* libgnat/s-dwalin.adb (To_File_Name): Fetch only the last string
from the .debug_line_str section.
(Symbolic_Address.Set_Result): Likewise.

[Ada] Follow-up tweaks to System.Dwarf_Line

gcc/ada/

* libgnat/s-dwalin.adb (Skip_Form): Fix cases of DW_FORM_addrx
and DW_FORM_implicit_const. Replace Constraint_Error with
Dwarf_Error.

[Ada] Adjust latest change for ELF platforms

gcc/ada/

* libgnat/s-objrea.adb (Get_Load_Address): Return 0 for ELF.

[Ada] Add support for PE-COFF PIE to System.Dwarf_Line

gcc/ada/

* adaint.c (__gnat_get_executable_load_address): Add Win32 support.
* libgnat/s-objrea.ads (Get_Xcode_Bounds): Fix typo in comment.
(Object_File): Minor reformatting.
(ELF_Object_File): Uncomment predicate.
(PECOFF_Object_File): Likewise.
(XCOFF32_Object_File): Likewise.
* libgnat/s-objrea.adb: Minor reformatting throughout.
(Get_Load_Address): Implement for PE-COFF.
* libgnat/s-dwalin.ads: Remove clause for System.Storage_Elements
and use consistent wording in comments.
(Dwarf_Context): Set type of Low, High and Load_Address to Address.
* libgnat/s-dwalin.adb (Get_Load_Displacement): New function.
(Is_Inside): Call Get_Load_Displacement.
(Low_Address): Likewise.
(Open): Adjust to type change.
(Aranges_Lookup): Change type of Addr to Address.
(Read_Aranges_Entry): Likewise for Start and adjust.
(Enable_Cache): Adjust to type change.
(Symbolic_Address): Change type of Addr to Address.
(Symbolic_Traceback): Call Get_Load_Displacement.

[Ada] Small cleanup in System.Dwarf_Line

gcc/ada/

* libgnat/s-dwalin.ads: Remove clause for Ada.Exceptions.Traceback,
add clause for System.Traceback_Entries and alphabetize.
(AET): Delete.
(STE): New package renaming.
(Symbolic_Traceback): Adjust.
* libgnat/s-dwalin.adb: Remove clauses for Ada.Exceptions.Traceback
and System.Traceback_Entries.
(Symbolic_Traceback): Adjust.

[Ada] Add DWARF 5 support to System.Dwarf_Line

gcc/ada/

* libgnat/s-dwalin.ads: Adjust a few comments left and right.
(Line_Info_Register): Comment out unused components.
(Line_Info_Header): Add DWARF 5 support.
(Dwarf_Context): Likewise. Rename "prologue" into "header".
* libgnat/s-dwalin.adb: Alphabetize "with" clauses.
(DWARF constants): Add DWARF 5 support and reorder.
(For_Each_Row): Adjust.
(Initialize_Pass): Likewise.
(Initialize_State_Machine): Likewise and fix typo.
(Open): Add DWARF 5 support.
(Parse_Prologue): Rename into...
(Parse_Header): ...this and add DWARF 5 support.
(Read_And_Execute_Isn): Rename into...
(Read_And_Execute_Insn): ...this and adjust.
(To_File_Name): Change parameter name and add DWARF 5 support.
(Read_Entry_Format_Array): New procedure.
(Skip_Form): Add DWARF 5 support and reorder.
(Seek_Abbrev): Do not count entries and add DWARF 5 support.
(Debug_Info_Lookup): Add DWARF 5 support.
(Symbolic_Address.Set_Result): Likewise.
(Symbolic_Address): Adjust.

Daily bump.

rs6000: Disable optimizing multiple xxsetaccz instructions into one xxsetaccz

Fwprop will happily optimize two xxsetaccz instructions into one xxsetaccz
by propagating the results of the first to the uses of the second.
We really don't want that to happen given the late priming/depriming of
accumulators.  I fixed this by making the xxsetaccz source operand an
unspec volatile.  I also removed the mma_xxsetaccz define_expand and
define_insn_and_split and replaced it with a simple define_insn.
The expand and splitter patterns were leftovers from the pre opaque mode
code when the xxsetaccz code was part of the movpxi pattern, and we don't
need them now.

Rather than a new test case, I was able to just modify the current test case
to add another __builtin_mma_xxsetaccz call which shows the bad code gen
with unpatched compilers.

2021-09-14  Peter Bergner  <bergner@linux.ibm.com>

gcc/
* config/rs6000/mma.md (unspec): Delete UNSPEC_MMA_XXSETACCZ.
(unspecv): Add UNSPECV_MMA_XXSETACCZ.
(*mma_xxsetaccz): Delete.
(mma_xxsetaccz): Change to define_insn.  Remove operand 1.
Use UNSPECV_MMA_XXSETACCZ.  Update comment.
* config/rs6000/rs6000.c (rs6000_rtx_costs): Use UNSPECV_MMA_XXSETACCZ.

gcc/testsuite/
* gcc.target/powerpc/mma-builtin-6.c: Add second call to xxsetacc
built-in.  Update instruction counts.

(cherry picked from commit f80b9be083e0e7d49e7744b7e531b9aa52acd563)

Daily bump.

libgomp: Only check for 2*sizeof(void*) int type with Fortran [PR96661]

The depend type is a struct with two pointer members for C/C++ - but for
Fortran OpenMP requires an integer type with kind = omp_depend_kind. Thus,
libgomp's configure checks that an integer type/kind with size 2*sizeof(void*)
is available. However, this integer type/kind is not needed when building without
Fortran support. Thus, only check this when Fortran is enabled.
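
The size relationship that the configure check relies on can be sketched in
C as follows.  This is only an illustration: struct depend_sketch and
depend_kind_bytes are hypothetical names, not libgomp's actual definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the C/C++ depend object: two pointers.  */
struct depend_sketch
{
  void *first;
  void *second;
};

/* Two pointers of equal size and alignment leave no room for padding,
   so the object is exactly 2 * sizeof (void *) bytes.  */
static_assert (sizeof (struct depend_sketch) == 2 * sizeof (void *),
               "depend object is two pointers wide");

/* Size in bytes that a Fortran integer kind would need to have in order
   to hold the same object.  */
static size_t
depend_kind_bytes (void)
{
  return sizeof (struct depend_sketch);
}
```

On a typical 64-bit target this comes to 16 bytes, which is why an integer
type of that width is only needed when the Fortran bindings are built.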

libgomp/
PR libgomp/96661
* configure.ac: Only check for int-type = 2*size_t support when
building with Fortran support.
* configure: Regenerate.

(cherry picked from commit 1f0a57bd54aed558e0167016dd980177f88f8480)

i386: Don't emit fldpi etc. if -frounding-math [PR102498]

i387 has instructions to store some transcendental numbers into the top of
stack.  The problem is that the exact bit in the last place one gets for
those depends on the current rounding mode, because the CPU knows the number
with slightly higher precision.  The compiler assumes rounding to nearest when
comparing them against constants in the IL, but at runtime the rounding
can be different, so depending on the rounding mode and the constant, some
of these could be 1 ulp higher or lower than expected.
We only support changing the rounding mode at runtime if the non-default
-frounding-math option is used, so the following patch just disables
using those constants if that flag is on.

2021-09-28  Jakub Jelinek  <jakub@redhat.com>

PR target/102498
* config/i386/i386.c (standard_80387_constant_p): Don't recognize
special 80387 instruction XFmode constants if flag_rounding_math.

* gcc.target/i386/pr102498.c: New test.

(cherry picked from commit 3b7041e8345c2f1030e58620f28e22d64b2c196b)

Daily bump.

Fix value uninitialization in vn_reference_insert_pieces [PR102400]

2021-09-23 Feng Xue <fxue@os.amperecomputing.com>

gcc/
PR tree-optimization/102400
* tree-ssa-sccvn.c (vn_reference_insert_pieces): Initialize
result_vdef to zero value.

Fix null-pointer dereference in delete_dead_or_redundant_call [PR102451]

2021-09-23 Feng Xue <fxue@os.amperecomputing.com>

gcc/
PR tree-optimization/102451
* tree-ssa-dse.c (delete_dead_or_redundant_call): Record bb of stmt
before removal.

Daily bump.

IBM Z: TPF: Add cc clobber to profiling expanders

The code sequence emitted uses CC internally.

gcc/ChangeLog:

* config/s390/tpf.md (prologue_tpf, epilogue_tpf): Add cc clobber.

(cherry picked from commit e1223ea2f48e8588160b2948f8a1f8e47f9694fd)

IBM Z: Fix PR102222

Avoid emitting a strict low part move if the insv target actually
affects the whole target reg.

gcc/ChangeLog:

PR target/102222
* config/s390/s390.c (s390_expand_insv): Emit a normal move if it
is actually a full copy of the source operand into the target.
Don't emit a strict low part move if source and target mode match.

gcc/testsuite/ChangeLog:

* gcc.target/s390/pr102222.c: New test.

(cherry picked from commit a9b3c451be58f0fe660154323ace7ba72a4211ec)

ipa-fnsummary: Remove inconsistent bp_pack_value

There is one inconsistent bit-field streaming out and in.
On the side of streaming out:

    bp_pack_value (&bp, info->inlinable, 1);
    bp_pack_value (&bp, false, 1);
    bp_pack_value (&bp, info->fp_expressions, 1);

while on the side of streaming in:

    info->inlinable = bp_unpack_value (&bp, 1);
    info->fp_expressions = bp_unpack_value (&bp, 1);

The removal of Cilk Plus support in r8-4956 failed to remove
the streaming out of the bit and instead just changed the value
streamed out to be always false.
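
The effect of an unbalanced pack/unpack sequence can be sketched with a toy
bit stream.  The names below (toy_bits, toy_pack, toy_unpack, write_summary,
read_fp_expressions) are hypothetical and only mimic the shape of the calls
above; they are not GCC's bitpack implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Toy bit stream; a hypothetical stand-in, not GCC's bitpack API.  */
struct toy_bits
{
  uint32_t word;   /* packed bits, least significant bit first */
  unsigned pos;    /* next bit position to write or read */
};

static void
toy_pack (struct toy_bits *bp, unsigned value, unsigned nbits)
{
  assert (nbits > 0 && nbits < 32 && bp->pos + nbits <= 32);
  uint32_t mask = (UINT32_C (1) << nbits) - 1;
  bp->word |= (value & mask) << bp->pos;
  bp->pos += nbits;
}

static unsigned
toy_unpack (struct toy_bits *bp, unsigned nbits)
{
  assert (nbits > 0 && nbits < 32 && bp->pos + nbits <= 32);
  uint32_t mask = (UINT32_C (1) << nbits) - 1;
  unsigned value = (bp->word >> bp->pos) & mask;
  bp->pos += nbits;
  return value;
}

/* Writer: inlinable, a leftover bit that is always 0, then fp_expressions.  */
static struct toy_bits
write_summary (unsigned inlinable, unsigned fp_expressions)
{
  struct toy_bits bp = { 0, 0 };
  toy_pack (&bp, inlinable, 1);
  toy_pack (&bp, 0, 1);
  toy_pack (&bp, fp_expressions, 1);
  bp.pos = 0;   /* rewind so the same stream can be read back */
  return bp;
}

/* Reader.  With skip_dummy == 0 it consumes only two bits, so the leftover
   bit is returned in place of fp_expressions.  */
static unsigned
read_fp_expressions (struct toy_bits bp, int skip_dummy)
{
  (void) toy_unpack (&bp, 1);      /* inlinable */
  if (skip_dummy)
    (void) toy_unpack (&bp, 1);    /* leftover bit */
  return toy_unpack (&bp, 1);      /* fp_expressions */
}
```

With a stream written for fp_expressions == 1, the two-bit reader returns 0,
while the reader that also consumes the dummy bit returns 1.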

By hacking fp_expression_p to always return true, I can see
it reads the wrong fp_expressions value (false) out in wpa.

GCC12 adopts commit 63c6446f77b9001d26f973114450d790749f282b
which removes the inconsistent streaming out instead.

gcc/ChangeLog:

* ipa-fnsummary.c (inline_read_section): Unpack a dummy bit
to keep consistent with the side of streaming out.

Daily bump.

rs6000: Fix ELFv2 r12 use in epilogue

We cannot use r12 here, it is already in use as the GEP (for sibling
calls).

2021-09-08 Segher Boessenkool <segher@kernel.crashing.org>
PR target/102107
* config/rs6000/rs6000-logue.c (rs6000_emit_epilogue): For ELFv2 use
r11 instead of r12 for restoring CR.

(cherry picked from commit 86e6268cff328e27ee6f90e2afc35b6f437a25cd)

rs6000: Don't use r12 for CR save on ELFv2 (PR102107)

CR is saved and/or restored on some paths where GPR12 is already live
since it has a meaning in the calling convention in the ELFv2 ABI.

It is not completely clear to me that we can always use r11 here, but
it does seem safe, there is checking code (to detect conflicts here),
and it is stage 1. So here goes.

2021-09-03 Segher Boessenkool <segher@kernel.crashing.org>

PR target/102107
* config/rs6000/rs6000-logue.c (rs6000_emit_prologue): On ELFv2 use r11
instead of r12 for CR save, in all cases.

(cherry picked from commit 2484f7a4b0f52e6ed04754be336f1fa6fde47f6b)

Fortran - (large) arrays in the main shall be static

gcc/fortran/ChangeLog:

PR fortran/102366
* trans-decl.c (gfc_finish_var_decl): Disable the warning message
for variables moved from stack to static storage if they are
declared in the main, but allow the move to happen.

gcc/testsuite/ChangeLog:

PR fortran/102366
* gfortran.dg/pr102366.f90: New test.

(cherry picked from commit 51166eb2c534692c3c7779def24f83c8c3811b98)

Fix no_fsanitize_address effective target

The implementation of the no_fsanitize_address effective target was copied
from asan-dg.exp without realizing that it does not work outside of this
context (there is a comment explaining why). As a consequence, it always
returns 0, so for example the directive in gnat.dg/asan1.adb:

{ dg-skip-if "no address sanitizer" { no_fsanitize_address } }

does not work. This led some people to add the nonsensical:

{ dg-require-effective-target no_fsanitize_address }

to sanitizer tests, e.g. g++.dg/warn/uninit-pr93100.C, thus disabling them
everywhere instead of just for the problematic targets.

gcc/testsuite/
* lib/target-supports.exp (no_fsanitize_address): Add missing bits.
* gcc.dg/pr91441.c: Likewise.
* gcc.dg/pr96260.c: Likewise.
* gcc.dg/pr96307.c: Likewise.
* gnat.dg/asan1.adb: Likewise.

* g++.dg/abi/anon4.C: Likewise.

Daily bump.

GCC11 - Fortran: combined directives - order(concurrent) not on distribute

While OpenMP 5.1 and GCC 12 permit 'order(concurrent)' on distribute,
OpenMP 5.0 and GCC 11 don't. This patch for GCC 11 ensures the clause also
does not end up on 'distribute' when splitting combined directives.

gcc/fortran/ChangeLog:

* trans-openmp.c (gfc_split_omp_clauses): Don't put 'order(concurrent)'
on 'distribute' for combined directives, matching OpenMP 5.0

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/distribute-order-concurrent.f90: New test.

Daily bump.

Fortran - fix handling of optional allocatable DT arguments with INTENT(OUT)

gcc/fortran/ChangeLog:

PR fortran/102287
* trans-expr.c (gfc_conv_procedure_call): Wrap deallocation of
allocatable components of optional allocatable derived type
procedure arguments with INTENT(OUT) into a presence check.

gcc/testsuite/ChangeLog:

PR fortran/102287
* gfortran.dg/intent_out_14.f90: New test.

(cherry picked from commit cfea7b86f2430b9cb8018379b071f4004233119c)

Fortran - fix ICE during error recovery checking entry characteristics

gcc/fortran/ChangeLog:

PR fortran/102311
* resolve.c (resolve_entries): Attempt to recover cleanly after
rejecting mismatched function entries.

gcc/testsuite/ChangeLog:

PR fortran/102311
* gfortran.dg/entry_25.f90: New test.

(cherry picked from commit b305ec979d9dfc8153859a62a8ab9dd43c3bfc73)

Daily bump.

Fix PR rtl-optimization/102306

This is a duplication of volatile loads introduced during GCC 9 development
by the 2->2 mechanism of the RTL combiner. There is already substantial
checking for volatile references in can_combine_p but it implicitly assumes
that the combination reduces the number of instructions, which is of course
not the case here. So the fix teaches try_combine to abort the combination
when it is about to make a copy of volatile references to preserve them.

gcc/
PR rtl-optimization/102306
* combine.c (try_combine): Abort the combination if we are about to
duplicate volatile references.

gcc/testsuite/
* gcc.target/sparc/20210917-1.c: New test.

Daily bump.

Fortran - fix handling of substring start and end indices

gcc/fortran/ChangeLog:

PR fortran/85130
* expr.c (find_substring_ref): Handle given substring start and
end indices as signed integers, not unsigned.

gcc/testsuite/ChangeLog:

PR fortran/85130
* gfortran.dg/substr_6.f90: Revert commit r8-7574, adding again
test that was erroneously considered as illegal.

(cherry picked from commit 8d93ba93d3b13ac3d3c34404cad87732c809605b)

Fortran - ensure simplification of bounds of array-valued named constants

gcc/fortran/ChangeLog:

PR fortran/82314
* decl.c (add_init_expr_to_sym): For proper initialization of
array-valued named constants the array bounds need to be
simplified before adding the initializer.

gcc/testsuite/ChangeLog:

PR fortran/82314
* gfortran.dg/pr82314.f90: New test.

(cherry picked from commit 104c05c5284b7822d770ee51a7d91946c7e56d50)

sparc: Add scheduling information for LEON5

The LEON5 can often dual issue instructions from the same 64-bit aligned
double word if there are no data dependencies. Add scheduling information
to avoid scheduling unpairable instructions back-to-back.

gcc/ChangeLog:

* config/sparc/sparc-opts.h (enum sparc_processor_type): Add LEON5
* config/sparc/sparc.c (struct processor_costs): Add LEON5 costs
(leon5_adjust_cost): Increase cost of store with data dependency
on ALU instruction and FPU anti-dependencies.
(sparc_option_override): Add LEON5 costs
(sparc_adjust_cost): Add LEON5 cost adjustments
* config/sparc/sparc.h: Add LEON5
* config/sparc/sparc.md: Include LEON5 scheduling information
* config/sparc/sparc.opt: Add LEON5
* doc/invoke.texi: Add LEON5
* config/sparc/leon5.md: New file.

sparc: Add NOP in stack_protect_setsi if sparc_fix_b2bst enabled

This is needed to prevent the Store -> (Non-store or load) -> Store
sequence.

gcc/ChangeLog:

* config/sparc/sparc.md (stack_protect_setsi): Add NOP to prevent
sensitive sequence for B2BST errata workaround.

sparc: Prevent atomic instructions in beginning of functions for UT700

A call to the function might have a load instruction in the delay slot
and a load followed by an atomic function could cause a deadlock.

gcc/ChangeLog:

* config/sparc/sparc.c (sparc_do_work_around_errata): Do not begin
functions with atomic instruction in the UT700 errata workaround.

sparc: Skip all empty assembly statements

This version detects multiple empty assembly statements in a row and also
detects non-memory barrier empty assembly statements (__asm__("")). It
can be used instead of next_active_insn().

gcc/ChangeLog:

* config/sparc/sparc.c (next_active_non_empty_insn): New function
that returns next active non empty assembly instruction.
(sparc_do_work_around_errata): Use new function.

sparc: Treat more instructions as load or store in errata workarounds

Check the attribute of instruction to determine if it performs a store
or load operation. This more generic approach sees the last instruction
in the GOTdata_op model as a potential load and treats the memory barrier
as a potential store instruction.

gcc/ChangeLog:

* config/sparc/sparc.c (store_insn_p): Add predicate for store
attributes.
(load_insn_p): Add predicate for load attributes.
(sparc_do_work_around_errata): Use new predicates.

sparc: Print out bit names for LEON and LEON3 with -mdebug

gcc/ChangeLog:

* config/sparc/sparc.c (dump_target_flag_bits): Print bit names for
LEON and LEON3.

Fix target/101934: aarch64 memset code creates unaligned stores for -mstrict-align

The problem here is that the aarch64_expand_setmem code did not check
STRICT_ALIGNMENT when creating an overlapping store.
This patch adds that check, and the testcase now works.
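
As a rough illustration of why an overlapping store is a problem under strict
alignment, here is the general technique in portable C.  This is a sketch
only, not the aarch64 expander; set_8_to_16_overlapping and
second_store_offset are hypothetical names, and memcpy stands in for an
unaligned store instruction.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Offset of the second 8-byte store relative to the destination.  */
static size_t
second_store_offset (size_t n)
{
  assert (n >= 8 && n <= 16);
  return n - 8;
}

/* Set n bytes (8 <= n <= 16) with two 8-byte stores.  The second store
   starts at dst + n - 8, so it overlaps the first and is generally not
   8-byte aligned even when dst is.  */
static void
set_8_to_16_overlapping (unsigned char *dst, unsigned char byte, size_t n)
{
  uint64_t pattern = UINT64_C (0x0101010101010101) * byte;
  memcpy (dst, &pattern, 8);
  memcpy (dst + second_store_offset (n), &pattern, 8);
}
```

For n == 15 the second store lands at offset 7, which cannot be done with an
aligned 8-byte store.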

gcc/ChangeLog:

PR target/101934
* config/aarch64/aarch64.c (aarch64_expand_setmem):
Check STRICT_ALIGNMENT before creating an overlapping
store.

gcc/testsuite/ChangeLog:

PR target/101934
* gcc.target/aarch64/memset-strict-align-1.c: New test.

(cherry picked from commit a45786e9a31f995087d8cb42bc3a4fe06911e588)

Daily bump.

c++: Fix handling of decls with flexible array members initialized with side-effects [PR88578]

> > Note, if the flexible array member is initialized only with non-constant
> > initializers, we have a worse bug that this patch doesn't solve, the
> > splitting of initializers into constant and dynamic initialization removes
> > the initializer and we don't have just wrong DECL_*SIZE, but nothing is
> > emitted when emitting those vars into assembly either and so the dynamic
> > initialization clobbers other vars that may overlap the variable.
> > I think we need keep an empty CONSTRUCTOR elt in DECL_INITIAL for the
> > flexible array member in that case.
>
> Makes sense.

So, the following patch fixes that.

The typeck2.c change makes sure we keep those CONSTRUCTORs around (although
they should be empty because all their elts had side-effects/was
non-constant if it was removed earlier), and the varasm.c change is to avoid
ICEs on those as well as ICEs on other flex array members that had some
initializers without side-effects, but not on the last array element.

The code was already asserting that the (index of the last elt in the
CONSTRUCTOR + 1) times elt size is equal to TYPE_SIZE_UNIT of the local->val
type, which is true for C flex arrays or for C++ if they don't have any
side-effects or the last elt doesn't have side-effects, this patch changes
that to an assertion that the TYPE_SIZE_UNIT is greater than or equal to the
offset of the end of last element in the CONSTRUCTOR and uses TYPE_SIZE_UNIT
(int_size_in_bytes) in the code later on.

2021-09-15 Jakub Jelinek <jakub@redhat.com>

PR c++/88578
PR c++/102295
gcc/
* varasm.c (output_constructor_regular_field): Instead of assertion
that array_size_for_constructor result is equal to size of
TREE_TYPE (local->val) in bytes, assert that the type size is greater
or equal to array_size_for_constructor result and use type size as
fieldsize.
gcc/cp/
* typeck2.c (split_nonconstant_init_1): Don't throw away empty
initializers of flexible array members if they have non-zero type
size.
gcc/testsuite/
* g++.dg/ext/flexary39.C: New test.
* g++.dg/ext/flexary40.C: New test.

(cherry picked from commit e5d1af8a07ae9fcc40ea5c781c3ad46d20ea12a6)

c++: Update DECL_*SIZE for objects with flexible array members with initializers [PR102295]

The C FE has updated DECL_*SIZE for vars which have initializers for flexible
array members for many years, but the C++ FE kept DECL_*SIZE the same as the
type size (i.e. as if there were zero elements in the flexible array
member). This results e.g. in ELF symbol sizes being too small.
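
The gap between the type size and the object size can be seen from user code.
The sketch below is in C and relies on the GNU extension that allows static
initialization of a flexible array member; struct packet, three_items and
packet_object_size are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct packet
{
  int count;
  int items[];   /* flexible array member */
};

/* GNU extension: a static object may initialize its flexible array member.
   sizeof (struct packet) still counts zero elements of items.  */
static struct packet three_items = { 3, { 10, 20, 30 } };

/* Bytes the object really occupies: the type size plus the initialized
   elements of the flexible array member.  */
static size_t
packet_object_size (const struct packet *p)
{
  assert (p->count >= 0);
  return sizeof (struct packet) + (size_t) p->count * sizeof p->items[0];
}
```

Here the type size covers only count, while the object needs room for three
more ints; a symbol size derived from the type alone would be too small.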

Note, if the flexible array member is initialized only with non-constant
initializers, we have a worse bug that this patch doesn't solve, the
splitting of initializers into constant and dynamic initialization removes
the initializer and we don't have just wrong DECL_*SIZE, but nothing is
emitted when emitting those vars into assembly either and so the dynamic
initialization clobbers other vars that may overlap the variable.
I think we need to keep an empty CONSTRUCTOR elt in DECL_INITIAL for the
flexible array member in that case.

2021-09-14 Jakub Jelinek <jakub@redhat.com>

PR c++/102295
* decl.c (layout_var_decl): For aggregates ending with a flexible
array member, add the size of the initializer for that member to
DECL_SIZE and DECL_SIZE_UNIT.

* g++.target/i386/pr102295.C: New test.

(cherry picked from commit 818c505188ff5cd8eb048eb0e614c4ef732225bd)