gcc/ChangeLog:
* doc/invoke.texi: Add link to UndefinedBehaviorSanitizer
documentation, mention UBSAN_OPTIONS, similar to what is done
for AddressSanitizer.
Jakub Jelinek [Sun, 10 Oct 2021 10:13:22 +0000 (12:13 +0200)]
var-tracking: Fix a wrong-debug issue caused by my r10-7665 var-tracking change [PR102441]
Since my r10-7665-g33c45e51b4914008064d9b77f2c1fc0eea1ad060 change, we get
wrong-debug on e.g. the following testcase at -O2 -g on x86_64-linux for the
x parameter:
void bar (int *r);
int
foo (int x)
{
int r = 0;
bar (&r);
return r;
}
At the start of function, we have
subq $24, %rsp
leaq 12(%rsp), %rdi
instructions. The x parameter is passed in %rdi, but isn't used in the
function and so the leaq instruction overwrites %rdi without remembering
%rdi anywhere. Before the r10-7665 change (which was trying to fix a large
(3% for 32-bit, 1% for 64-bit x86-64) debug info/loc growth introduced with
r10-7515), the leaq insn above resulted in a MO_VAL_SET micro-operation that
said that the value of sp + 12, a cselib_sp_derived_value_p, is stored into
the %rdi register. The r10-7665 change added a change to add_stores that
added no micro-operation for the leaq store, with the rationale that the sp
based values can be and will be always computable some other more compact
and primarily more stable way (cfa based expression like DW_OP_fbreg, that
is the same in the whole function). That is true. But by throwing the
micro-operation on the floor, we miss another important part of the
MO_VAL_SET, in particular that the destination of the store, %rdi in this
case, now has a different value from what it had before, so the vt_*
dataflow code thinks that even after the leaq instruction %rdi still holds
the x argument value (and changes it to DW_OP_entry_value (%rdi) only in the
middle of the call to bar). Previously and with the patches below,
the location for x changes already at the end of leaq instruction to
DW_OP_entry_value (%rdi).
My first attempt to fix this was instead of dropping the MO_VAL_SET add
a MO_CLOBBER operation:
--- gcc/var-tracking.c.jj 2021-05-04 21:02:24.196799586 +0200
+++ gcc/var-tracking.c 2021-09-24 19:23:16.420154828 +0200
@@ -6133,7 +6133,9 @@ add_stores (rtx loc, const_rtx expr, voi
{
if (preserve)
preserve_value (v);
- return;
+ mo.type = MO_CLOBBER;
+ mo.u.loc = loc;
+ goto log_and_return;
}
nloc = replace_expr_with_values (oloc);
so don't track that the value lives in the loc destination, but track
that the previous value doesn't live there anymore. That failed bootstrap
miserably, the vt_* code isn't prepared to see MO_CLOBBER of a MEM that
isn't tracked (e.g. has MEM_EXPR on it that the var-tracking code wants
to track, i.e. track_p in add_stores). On the other side, thinking about
it more, in the most common case where a cselib_sp_derived_value_p value
is stored into the sp register (and which is the reason why PR94495
testcase got larger), dropping the micro-operation on the floor is the
right thing, because we have that cselib_sp_derived_value_p tracking, any
reads from the sp hard register will be treated as
cselib_sp_derived_value_p.
Then I've tried 3 different patches described below and in the end
what is committed is patch2.
Additionally, I've gathered statistics from cc1plus by always reverting the
var-tracking.c change after finished bootstrap/regtest and rebuilding the
stage3 var-tracking.o and cc1plus, such that it would be comparable.
dwlocstat and .debug_{info,loclists} section sizes detailed below.
patch3 uses MO_VAL_SET (i.e. essentially reversion of the r10-7665
change) when destination is not a REG_P and !track_p, otherwise if
destination is sp drops the micro-operation on the floor (i.e. no change),
otherwise adds a MO_CLOBBER.
patch1 is similar, except it checks for destination not equal to sp and
!track_p, i.e. for !track_p REG_P destinations other than sp it will use
MO_VAL_SET rather than MO_CLOBBER.
Finally, patch2, the shortest patch, uses MO_VAL_SET whenever destination
is not sp and otherwise drops the micro-operation on the floor.
All the 3 patches don't affect the PR94495 testcase, all the changes
there were caused by stores of sp based values into %rsp.
While the patch2 (and patch1 which results in exactly the same sizes)
causes the largest debug loclists/info growth from the 3, it is still quite
minor (0.651% on 64-bit and 0.114% on 32-bit) compared
to the 1% and 3% PR94495 was trying to solve, and I actually think it is the
best thing to do. Because, if we have say
int q[10];
int *p = &q[0];
or similar and we load the &q[0] sp based value into some hard register,
by noting in the debug info that p lives in some hard reg for some part
of the function and a user is trying to change the p var in the debugger,
if we say it lives in some register or memory, there is some chance that
the changing of the value could work successfully (of course, nothing
is guaranteed, we don't have tracking of where each var lives at which
moment for changing purposes (i.e. what register, memory or else you need
to change in order to change behavior of the code)), while if we just say
that p's location is DW_OP_fbreg 16 DW_OP_stack_value, that is a read-only
value one can just print but not change. Now, for stores of variable
values into the sp register, I don't think we have such an issue, you don't
want debugger to change your stack pointer when user asks to change value
of some variable whose value lives in the stack pointer, that would pretty
much always result in misbehavior of the program.
So, my preference from these 3 is patch2 and that is being committed.
Jakub Jelinek [Fri, 8 Oct 2021 08:58:56 +0000 (10:58 +0200)]
openmp: Fix up declare target handling for vars with DECL_LOCAL_DECL_ALIAS [PR102640]
The introduction of DECL_LOCAL_DECL_ALIAS and push_local_extern_decl_alias
in r11-3699-g4e62aca0e0520e4ed2532f2d8153581190621c1a broke the following
testcase. The following patch fixes it by treating similarly not just
the variable to or link clause is put on, but also its DECL_LOCAL_DECL_ALIAS
if any. If it hasn't been created yet, when it is created it will copy
attributes and therefore should get it for free, and as it is an extern,
nothing more than attributes is needed for it.
2021-10-08 Jakub Jelinek <jakub@redhat.com>
PR c++/102640
gcc/cp/
* parser.c (handle_omp_declare_target_clause): New function.
(cp_parser_omp_declare_target): Use it.
gcc/testsuite/
* c-c++-common/gomp/pr102640.c: New test.
Here we're crashing when level-lowering the variadic constraint C<Ts...>
on the template template parameter TT because tsubst_pack_expansion expects
processing_template_decl to be set during a partial substitution.
PR c++/99904
gcc/cp/ChangeLog:
* pt.c (is_compatible_template_arg): Set processing_template_decl
around tsubst_constraint_info.
Here during partial ordering of the two partial specializations we end
up in unify with parm=arg=NONTYPE_ARGUMENT_PACK<V0, V1>, and crash shortly
thereafter because uses_template_parms(parms) calls potential_const_expr
which doesn't handle NONTYPE_ARGUMENT_PACK.
This patch fixes this by extending potential_constant_expression to handle
NONTYPE_ARGUMENT_PACK appropriately.
Patrick Palka [Thu, 30 Sep 2021 21:54:17 +0000 (17:54 -0400)]
c++: __is_trivially_xible and multi-arg aggr paren init [PR102535]
is_xible_helper assumes only 0- and 1-argument ctors can be trivial, but
C++20 aggregate paren init means multi-arg ctors can now be trivial too.
This patch relaxes the relevant early exit check accordingly.
PR c++/102535
gcc/cp/ChangeLog:
* method.c (is_xible_helper): Don't exit early for multi-arg
ctors in C++20.
gcc/testsuite/ChangeLog:
* g++.dg/ext/is_trivially_constructible7.C: New test.
Patrick Palka [Fri, 24 Sep 2021 16:36:26 +0000 (12:36 -0400)]
real: fix encoding of negative IEEE double/quad values [PR98216]
In encode_ieee_double/quad, the assignment
unsigned long WORD = r->sign << 31;
is intended to set the 31st bit of WORD whenever the sign bit is set.
But on LP64 hosts it also unintentionally sets the upper 32 bits of WORD,
because r->sign gets promoted from unsigned:1 to int and then the result
of the shift (equal to INT_MIN) gets sign extended from int to long.
In the C++ frontend, this bug causes incorrect mangling of negative
floating point values because the output of real_to_target called from
write_real_cst unexpectedly has the upper 32 bits of this word set,
which the caller doesn't mask out.
This patch fixes this by avoiding the unwanted sign extension. Note
that r0-53976 fixed the same bug in encode_ieee_single long ago.
Patrick Palka [Wed, 22 Sep 2021 15:16:53 +0000 (11:16 -0400)]
c++: concept-ids and value-dependence [PR102412]
The problem here is that uses_template_parms returns true for all
concept-ids (even those with non-dependent arguments), so when a concept-id
is used as a default template argument then during deduction the default
argument is considered dependent even after substituting into it, which
leads to deduction failure (from type_unification_real).
This patch fixes this by implementing the resolution of CWG 2446 which
says a concept-id is dependent only if its arguments are.
DR 2446
PR c++/102412
gcc/cp/ChangeLog:
* constexpr.c (cxx_eval_constant_expression)
<case TEMPLATE_ID_EXPR>: Check value_dependent_expression_p
instead of processing_template_decl.
* pt.c (value_dependent_expression_p) <case TEMPLATE_ID_EXPR>:
Return true only if any_dependent_template_arguments_p.
(instantiation_dependent_r) <case CALL_EXPR>: Remove this case.
<case TEMPLATE_ID_EXPR>: Likewise.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/concepts-nondep2.C: New test.
* g++.dg/cpp2a/concepts-nondep3.C: New test.
This fixes some issues with constrained variable templates:
- Constraints aren't checked when explicitly specializing a variable
template.
- Constraints aren't attached to a static data member template at
parse time.
- Constraints don't get propagated when (partially) instantiating a
static data member template, so we need to make sure to look up
constraints using the most general template during satisfaction.
PR c++/98486
gcc/cp/ChangeLog:
* constraint.cc (get_normalized_constraints_from_decl): Always
look up constraints using the most general template.
* decl.c (grokdeclarator): Set constraints on a static data
member template.
* pt.c (determine_specialization): Check constraints on a
variable template.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/concepts-var-templ1.C: New test.
* g++.dg/cpp2a/concepts-var-templ1a.C: New test.
* g++.dg/cpp2a/concepts-var-templ1b.C: New test.
Patrick Palka [Tue, 14 Sep 2021 15:22:12 +0000 (11:22 -0400)]
c++: empty union member activation during constexpr [PR102163]
Here, the union's constructor is defined to activate its empty data
member _M_rest, but during constexpr evaluation of this constructor the
subobject constructor call O::O(&_M_rest, 42) doesn't produce a side
effect that actually activates the member, so the union still appears
uninitialized after its constructor has run. This patch fixes this by
using a dummy MODIFY_EXPR in this situation, whose evaluation ensures
the member gets activated.
PR c++/102163
gcc/cp/ChangeLog:
* constexpr.c (cxx_eval_call_expression): After evaluating a
subobject constructor call for an empty union member, produce a
side effect that makes sure the member gets activated.
Patrick Palka [Wed, 18 Aug 2021 12:37:45 +0000 (08:37 -0400)]
c++: aggregate CTAD and brace elision [PR101344]
Here the problem is ultimately that collect_ctor_idx_types always
recurses into an eligible sub-CONSTRUCTOR regardless of whether the
corresponding pair of braces was elided in the original initializer.
This causes us to reject some completely-braced forms of aggregate
CTAD as in the first testcase below, because collect_ctor_idx_types
effectively assumes that the original initializer is always minimally
braced (and so the aggregate deduction candidate is given a function
type that's incompatible with the original completely-braced initializer).
In order to fix this, collect_ctor_idx_types needs to somehow know the
shape of the original initializer when iterating over the reshaped
initializer. To that end this patch makes reshape_init flag sub-ctors
that were built to undo brace elision in the original ctor, so that
collect_ctor_idx_types that determine whether to recurse into a sub-ctor
by simply inspecting this flag.
This happens to also fix PR101820, which is about aggregate CTAD using
designated initializers, for much the same reasons.
A curious case is the "intermediately-braced" initialization of 'e3'
(which we reject) in the first testcase below. It seems to me we're
behaving as specified here (according to [over.match.class.deduct]/1)
because the initializer element x_1={1, 2, 3, 4} corresponds to the
subobject e_1=E::t, hence the type T_1 of the first function parameter
of the aggregate deduction candidate is T(&&)[2][2], but T can't be
deduced from x_1 using this parameter type (as opposed to say T(&&)[4]).
PR c++/101344
PR c++/101803
gcc/cp/ChangeLog:
* cp-tree.h (CONSTRUCTOR_BRACES_ELIDED_P): Define.
* decl.c (reshape_init_r): Set it.
* pt.c (collect_ctor_idx_types): Recurse into a sub-CONSTRUCTOR
iff CONSTRUCTOR_BRACES_ELIDED_P.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/class-deduction-aggr11.C: New test.
* g++.dg/cpp2a/class-deduction-aggr12.C: New test.
Patrick Palka [Wed, 18 Aug 2021 12:37:42 +0000 (08:37 -0400)]
c++: ignore explicit dguides during NTTP CTAD [PR101883]
Since (template) argument passing is a copy-initialization context,
we mustn't consider explicit deduction guides when deducing a CTAD
placeholder type of an NTTP.
PR c++/101883
gcc/cp/ChangeLog:
* pt.c (convert_template_argument): Pass LOOKUP_IMPLICIT to
do_auto_deduction.
Jakub Jelinek [Tue, 5 Oct 2021 20:28:38 +0000 (22:28 +0200)]
c++: Fix apply_identity_attributes [PR102548]
The following testcase ICEs on x86_64-linux with -m32 due to a bug in
apply_identity_attributes. The function is being smart and attempts not
to duplicate the chain unnecessarily, if either there are no attributes
that affect type identity or there is possibly empty set of attributes
that do not affect type identity in the chain followed by attributes
that do affect type identity, it reuses that attribute chain.
The function mishandles the cases where in the chain an attribute affects
type identity and is followed by one or more attributes that don't
affect type identity (and then perhaps some further ones that do).
There are two bugs. One is that when we notice first attribute that
doesn't affect type identity after first attribute that does affect type
identity (with perhaps some further such attributes in the chain after it),
we want to put into the new chain just attributes starting from
(inclusive) first_ident and up to (exclusive) the current attribute a,
but the code puts into the chain all attributes starting with first_ident,
including the ones that do not affect type identity and if e.g. we have
doesn't0 affects1 doesn't2 affects3 affects4 sequence of attributes, the
resulting sequence would have
affects1 doesn't2 affects3 affects4 affects3 affects4
attributes, i.e. one attribute that shouldn't be there and two attributes
duplicated. That is fixed by the a2 -> a2 != a change.
The second one is that we ICE once we see second attribute that doesn't
affect type identity after an attribute that affects it. That is because
first_ident is set to error_mark_node after handling the first attribute
that doesn't affect type identity (i.e. after we've copied the
[first_ident, a) set of attributes to the new chain) to denote that from
that time on, each attribute that affects type identity should be copied
whenever it is seen (the if (as && as->affects_type_identity) code does
that correctly). But that condition is false and first_ident is
error_mark_node, we enter else if (first_ident) and use TREE_PURPOSE
/TREE_VALUE/TREE_CHAIN on error_mark_node, which ICEs. When
first_ident is error_mark_node and a doesn't affect type identity,
we want to do nothing. So that is the && first_ident != error_mark_node
chunk.
2021-10-05 Jakub Jelinek <jakub@redhat.com>
PR c++/102548
* tree.c (apply_identity_attributes): Fix handling of the
case where an attribute in the list doesn't affect type
identity but some attribute before it does.
Jakub Jelinek [Fri, 1 Oct 2021 12:27:32 +0000 (14:27 +0200)]
ubsan: Use -fno{,-}sanitize=float-divide-by-zero for float division by zero recovery [PR102515]
We've been using
-f{,no-}sanitize-recover=integer-divide-by-zero to decide on the float
-fsanitize=float-divide-by-zero instrumentation _abort suffix.
This patch fixes it to use -f{,no-}sanitize-recover=float-divide-by-zero
for it instead.
2021-10-01 Jakub Jelinek <jakub@redhat.com>
Richard Biener <rguenther@suse.de>
PR sanitizer/102515
gcc/c-family/
* c-ubsan.c (ubsan_instrument_division): Check the right
flag_sanitize_recover bit, depending on which sanitization
is done.
gcc/testsuite/
* c-c++-common/ubsan/float-div-by-zero-2.c: New test.
Jakub Jelinek [Fri, 1 Oct 2021 08:30:16 +0000 (10:30 +0200)]
c++: Fix handling of __thread/thread_local extern vars declared at function scope [PR102496]
The introduction of push_local_extern_decl_alias in r11-3699-g4e62aca0e0520e4ed2532f2d8153581190621c1a
broke tls vars, while the decl they are created for has the tls model
set properly, nothing sets it for the alias that is actually used,
so accesses to it are done as if they were normal variables.
This is then diagnosed at link time if the definition of the extern
vars is __thread/thread_local.
2021-10-01 Jakub Jelinek <jakub@redhat.com>
PR c++/102496
* name-lookup.c (push_local_extern_decl_alias): Return early even for
tls vars with non-dependent type when processing_template_decl. For
CP_DECL_THREAD_LOCAL_P vars call set_decl_tls_model on alias.
* g++.dg/tls/pr102496-1.C: New test.
* g++.dg/tls/pr102496-2.C: New test.
IBM Z: Use @PLT symbols for local functions in 64-bit mode
This helps with generating code for kernel hotpatches, which contain
individual functions and are loaded more than 2G away from vmlinux.
This should not create performance regressions for the normal use
cases, because for local functions ld replaces @PLT calls with direct
calls.
gcc/ChangeLog:
* config/s390/predicates.md (bras_sym_operand): Accept all
functions in 64-bit mode, use UNSPEC_PLT31.
(larl_operand): Use UNSPEC_PLT31.
* config/s390/s390.c (s390_loadrelative_operand_p): Likewise.
(legitimize_pic_address): Likewise.
(s390_emit_tls_call_insn): Mark __tls_get_offset as function,
use UNSPEC_PLT31.
(s390_delegitimize_address): Use UNSPEC_PLT31.
(s390_output_addr_const_extra): Likewise.
(print_operand): Add @PLT to TLS calls, handle %K.
(s390_function_profiler): Mark __fentry__/_mcount as function,
use %K, use UNSPEC_PLT31.
(s390_output_mi_thunk): Use only UNSPEC_GOT, use %K.
(s390_emit_call): Use UNSPEC_PLT31.
(s390_emit_tpf_eh_return): Mark __tpf_eh_return as function.
* config/s390/s390.md (UNSPEC_PLT31): Rename from UNSPEC_PLT.
(*movdi_64): Use %K.
(reload_base_64): Likewise.
(*sibcall_brc): Likewise.
(*sibcall_brcl): Likewise.
(*sibcall_value_brc): Likewise.
(*sibcall_value_brcl): Likewise.
(*bras): Likewise.
(*brasl): Likewise.
(*bras_r): Likewise.
(*brasl_r): Likewise.
(*bras_tls): Likewise.
(*brasl_tls): Likewise.
(main_base_64): Likewise.
(reload_base_64): Likewise.
(@split_stack_call<mode>): Likewise.
gcc/testsuite/ChangeLog:
* g++.dg/ext/visibility/noPLT.C: Skip on s390x.
* g++.target/s390/mi-thunk.C: New test.
* gcc.target/s390/nodatarel-1.c: Move foostatic to the new
tests.
* gcc.target/s390/pr80080-4.c: Allow @PLT suffix.
* gcc.target/s390/risbg-ll-3.c: Likewise.
* gcc.target/s390/call.h: Common code for the new tests.
* gcc.target/s390/call-z10-pic-nodatarel.c: New test.
* gcc.target/s390/call-z10-pic.c: New test.
* gcc.target/s390/call-z10.c: New test.
* gcc.target/s390/call-z9-pic-nodatarel.c: New test.
* gcc.target/s390/call-z9-pic.c: New test.
* gcc.target/s390/call-z9.c: New test.
* gcc.target/s390/mfentry-m64-pic.c: New test.
* gcc.target/s390/tls.h: Common code for the new TLS tests.
* gcc.target/s390/tls-pic.c: New test.
* gcc.target/s390/tls.c: New test.
Ilya Leoshkevich [Thu, 17 Jun 2021 12:18:17 +0000 (14:18 +0200)]
IBM Z: Define NO_PROFILE_COUNTERS
s390 glibc does not need counters in the .data section, since it stores
edge hits in its own data structure. Therefore counters only waste
space and confuse diffing tools (e.g. kpatch), so don't generate them.
Iain Buclaw [Sun, 3 Oct 2021 14:02:24 +0000 (16:02 +0200)]
d: gdc driver ignores -static-libstdc++ when automatically linking libstdc++ library
Adds handling of `-static-libstc++' in the gdc driver, so that libstdc++
is appropriately linked if libstdc++ is either needed or seen on the
command-line.
PR d/102574
gcc/d/ChangeLog:
* d-spec.cc (lang_specific_driver): Link libstdc++ statically if
-static-libstdc++ was given on command-line.
Harald Anlauf [Fri, 24 Sep 2021 17:10:15 +0000 (19:10 +0200)]
Fortran - improve checking for intrinsics allowed in constant expressions
gcc/fortran/ChangeLog:
PR fortran/102458
* expr.c (is_non_constant_intrinsic): Check for intrinsics
excluded in constant expressions (F2018:10.1.12).
(gfc_is_constant_expr): Use that check.
gcc/testsuite/ChangeLog:
PR fortran/102458
* gfortran.dg/pr102458.f90: New test.
coroutines: Only set parm copy guard vars if we have exceptions [PR 102454].
For coroutines, we make copies of the original function arguments into
the coroutine frame. Normally, these are destroyed on the proper exit
from the coroutine when the frame is destroyed.
However, if an exception is thrown before the first suspend point is
reached, the cleanup has to happen in the ramp function. These cleanups
are guarded such that they are only applied to any param copies actually
made.
The ICE is caused by an attempt to set the guard variable when there are
no exceptions enabled (the guard var is not created in this case).
Fixed by checking for flag_exceptions in this case too.
While touching this code paths, also clean up the synthetic names used
when a function parm is unnamed.
* coroutines.cc (analyze_fn_parms): Clean up synthetic names for
unnamed function params.
(morph_fn_to_coro): Do not try to set a guard variable for param
DTORs in the ramp, unless we have exceptions active.
coroutines: Make proxy vars for the function arg copies.
This adds top level proxy variables for the coroutine frame
copies of the original function args. These are then available
in the debugger to refer to the frame copies. We rewrite the
function body to use the copies, since the original parms will
no longer be in scope when the coroutine is running.
* coroutines.cc (struct param_info): Add copy_var.
(build_actor_fn): Use simplified param references.
(register_param_uses): Likewise.
(rewrite_param_uses): Likewise.
(analyze_fn_parms): New function.
(coro_rewrite_function_body): Add proxies for the fn
parameters to the outer bind scope of the rewritten code.
(morph_fn_to_coro): Use simplified version of param ref.
coroutines: Expose implementation state to the debugger.
In the process of transforming a coroutine into the separate representation
as the ramp function and a state machine, we generate some variables that
are of interest to a user during debugging. Any variable that is persistent
for the execution of the coroutine is placed into the coroutine frame.
In particular:
The promise object.
The function pointers for the resumer and destroyer.
The current resume index (suspend point).
The handle that represents this coroutine 'self handle'.
Any handle provided for a continuation coroutine.
Whether the coroutine frame is allocated and needs to be freed.
Visibility of some of these has already been requested by end users.
This patch ensures that such variables have names that are usable in a
debugger, but are in the reserved namespace for the implementation (they
all begin with _Coro_). The identifiers are generated lazily when the
first coroutine is encountered.
We place the variables into the outermost bind expression and then add a
DECL_VALUE_EXPR to each that points to the frame entry.
These changes simplify the handling of the variables in the body of the
function (in particular, the use of the DECL_VALUE_EXPR means that we now
no longer need to rewrite proxies for the promise and coroutine handles into
the frame->offset form).
* coroutines.cc (coro_resume_fn_id, coro_destroy_fn_id,
coro_promise_id, coro_frame_needs_free_id, coro_resume_index_id,
coro_self_handle_id, coro_actor_continue_id,
coro_frame_i_a_r_c_id): New.
(coro_init_identifiers): Initialize new name identifiers.
(coro_promise_type_found_p): Use pre-built identifiers.
(struct await_xform_data): Remove unused fields.
(transform_await_expr): Delete code that is now unused.
(build_actor_fn): Simplify interface, use pre-built identifiers and
remove transforms that are no longer needed.
(build_destroy_fn): Use revised field names.
(register_local_var_uses): Use pre-built identifiers.
(coro_rewrite_function_body): Simplify interface, use pre-built
identifiers. Generate proxy vars in the outer bind expr scope for the
implementation state that we wish to expose.
(morph_fn_to_coro): Adjust comments for new variable names, use pre-
built identifiers. Remove unused code to generate frame entries for
the implementation state. Adjust call for build_actor_fn.
coroutines: Support for debugging implementation state.
Some of the state that is associated with the implementation
is of interest to a user debugging a coroutine. In particular
items such as the suspend point, promise object, and current
suspend point.
These variables live in the coroutine frame, but we can inject
proxies for them into the outermost bind expression of the
coroutine. Such variables are automatically moved into the
coroutine frame (if they need to persist across a suspend
expression). PLacing the proxies thus allows the user to
inspect them by name in the debugger.
To implement this, we ensure that (at the outermost scope) the
frame entries are not mangled (coroutine frame variables are
usually mangled with scope nesting information so that they do
not clash). We can safely avoid doing this for the outermost
scope so that we can map frame entries directly to the variables.
This is partial contribution to debug support (PR 99215).
This is primarily code factoring, but we take this opportunity
to rename some of the implementation variables (which we intend
to expose to debugging) so that they are in the implementation
namespace.
* coroutines.cc (coro_build_artificial_var): New.
(build_actor_fn): Use var builder, rename vars to use
implementation namespace.
(coro_rewrite_function_body): Likewise.
(morph_fn_to_coro): Likewise.
Iain Sandoe [Wed, 23 Jun 2021 07:19:13 +0000 (08:19 +0100)]
coroutines: Use DECL_VALUE_EXPR instead of rewriting vars.
Variables that need to persist over suspension expressions
must be preserved by being copied into the coroutine frame.
The initial implementations do this manually in the transform
code. However, that has various disadvantages - including
that the debug connections are lost between the original var
and the frame copy.
The revised implementation makes use of DECL_VALUE_EXPRs to
contain the frame offset expressions, so that the original
var names are preserved in the code.
This process is also applied to the function parms which are
always copied to the frame. In this case the decls need to be
copied since they are used in two different contexts during
the re-write (in the building of the ramp function, and in
the actor function itself).
This will assist in improvement of debugging (PR 99215).
Jason Merrill [Wed, 5 May 2021 01:33:36 +0000 (21:33 -0400)]
c++: don't call 'rvalue' in coroutines code
A change to check glvalue_p rather than specifically for TARGET_EXPR
revealed issues with the coroutines code's use of the 'rvalue' function,
which shouldn't be used on class glvalues, so I've removed those calls.
In build_co_await I just dropped them, because I don't see anything in the
co_await specification that indicates that we would want to move from an
lvalue result of operator co_await. And simplified that code while I was
touching it; cp_build_modify_expr (...INIT_EXPR...) will call the
constructor.
In morph_fn_to_coro I changed the handling of the rvalue reference coroutine
frame field to use move, to treat the rval ref as an xvalue. I used
forward_parm to pass the function parms to the constructor for the field.
And I simplified the return handling so we get the desired rvalue semantics
from the normal implicit move on return.
I question default-initializing the non-void return value of the function if
get_return_object returns void; I'm not messing with it here, but I've filed
PR100476 about it.
Eric Botcazou [Fri, 1 Oct 2021 08:56:45 +0000 (10:56 +0200)]
Fix ICE with stack checking emulation at -O2
On bare-metal platforms, the Ada compiler emulates stack checking (it is
required by the language and tested by ACATS) in the runtime via the
stack_check_libfunc hook of the RTL middle-end. Calls to the function
are generated as libcalls but they now require a proper function type
at -O2 or above.
gcc/
* explow.c: Include langhooks.h.
(set_stack_check_libfunc): Build a proper function type.
Eric Botcazou [Fri, 1 Oct 2021 08:49:34 +0000 (10:49 +0200)]
Fix PR c++/64697 at -O1 or above
The BFD fix eliminates the link failure and working code is generated at
-O0, but _not_ when optimization is enabled because the optimizer changes:
movq .refptr._ZTH1s(%rip), %rax
testq %rax, %rax
je .L2
call _ZTH1s
into:
leaq _ZTH1s(%rip), %rax
testq %rax, %rax
je .L2
call _ZTH1s
and the leaq now also gets the relocation overflow. So the fix is to
teach legitimate_pic_address_disp_p to reject the transformation when
the symbol is an external weak function, which yields:
cmpq $0, .refptr._ZTH1s(%rip)
je .L2
call _ZTH1s
and the cmpq keeps a relocation that does not overflow.
gcc/
PR c++/64697
* config/i386/i386.c (legitimate_pic_address_disp_p): For PE-COFF do
not return true for external weak function symbols in medium model.
Eric Botcazou [Thu, 24 Jun 2021 10:19:36 +0000 (12:19 +0200)]
[Ada] Add support for PE-COFF PIE to System.Dwarf_Line
gcc/ada/
* adaint.c (__gnat_get_executable_load_address): Add Win32 support.
* libgnat/s-objrea.ads (Get_Xcode_Bounds): Fix typo in comment.
(Object_File): Minor reformatting.
(ELF_Object_File): Uncomment predicate.
(PECOFF_Object_File): Likewise.
(XCOFF32_Object_File): Likewise.
* libgnat/s-objrea.adb: Minor reformatting throughout.
(Get_Load_Address): Implement for PE-COFF.
* libgnat/s-dwalin.ads: Remove clause for System.Storage_Elements
and use consistent wording in comments.
(Dwarf_Context): Set type of Low, High and Load_Address to Address.
* libgnat/s-dwalin.adb (Get_Load_Displacement): New function.
(Is_Inside): Call Get_Load_Displacement.
(Low_Address): Likewise.
(Open): Adjust to type change.
(Aranges_Lookup): Change type of Addr to Address.
(Read_Aranges_Entry): Likewise for Start and adjust.
(Enable_Cach): Adjust to type change.
(Symbolic_Address): Change type of Addr to Address.
(Symbolic_Traceback): Call Get_Load_Displacement.
Eric Botcazou [Tue, 22 Jun 2021 22:21:00 +0000 (00:21 +0200)]
[Ada] Small cleanup in System.Dwarf_Line
gcc/ada/
* libgnat/s-dwalin.ads: Remove clause for Ada.Exceptions.Traceback,
add clause for System.Traceback_Entries and alphabetize.
(AET): Delete.
(STE): New package renaming.
(Symbolic_Traceback): Adjust.
* libgnat/s-dwalin.adb: Remove clauses for Ada.Exceptions.Traceback
and System.Traceback_Entries.
(Symbolic_Traceback): Adjust.
Peter Bergner [Tue, 14 Sep 2021 15:47:18 +0000 (10:47 -0500)]
rs6000: Disable optimizing multiple xxsetaccz instructions into one xxsetaccz
Fwprop will happily optimize two xxsetaccz instructions into one xxsetaccz
by propagating the results of the first to the uses of the second.
We really don't want that to happen given the late priming/depriming of
accumulators. I fixed this by making the xxsetaccz source operand an
unspec volatile. I also removed the mma_xxsetaccz define_expand and
define_insn_and_split and replaced it with a simple define_insn.
The expand and splitter patterns were leftovers from the pre opaque mode
code when the xxsetaccz code was part of the movpxi pattern, and we don't
need them now.
Rather than a new test case, I was able to just modify the current test case
to add another __builtin_mma_xxsetaccz call which shows the bad code gen
with unpatched compilers.
2021-09-14 Peter Bergner <bergner@linux.ibm.com>
gcc/
* config/rs6000/mma.md (unspec): Delete UNSPEC_MMA_XXSETACCZ.
(unspecv): Add UNSPECV_MMA_XXSETACCZ.
(*mma_xxsetaccz): Delete.
(mma_xxsetaccz): Change to define_insn. Remove operand 1.
Use UNSPECV_MMA_XXSETACCZ. Update comment.
* config/rs6000/rs6000.c (rs6000_rtx_costs): Use UNSPECV_MMA_XXSETACCZ.
gcc/testsuite/
* gcc.target/powerpc/mma-builtin-6.c: Add second call to xxsetacc
built-in. Update instruction counts.
libgomp: Only check for 2*sizeof(void*) int type with Fortran [PR96661]
The depend type is a struct with two pointer members for C/C++ - but for
Fortran OpenMP requires an integer type with kind = omp_depend_kind. Thus,
libgomp's configure checks that an integer type/kind with size 2*sizeof(void*)
is available. However, this integer type/kind is not needed when building without
Fortran support. Thus, only check this when Fortran is enabled.
libgomp/
PR libgomp/96661
* configure.ac: Only check for int-type = 2*size_t support when
building with Fortran support.
* configure: Regenerate.
Jakub Jelinek [Tue, 28 Sep 2021 11:02:51 +0000 (13:02 +0200)]
i386: Don't emit fldpi etc. if -frounding-math [PR102498]
i387 has instructions to store some transcedental numbers into the top of
stack. The problem is that what exact bit in the last place one gets for
those depends on the current rounding mode, the CPU knows the number with
slightly higher precision. The compiler assumes rounding to nearest when
comparing them against constants in the IL, but at runtime the rounding
can be different and so some of these depending on rounding mode and the
constant could be 1 ulp higher or smaller than expected.
We only support changing the rounding mode at runtime if the non-default
-frounding-mode option is used, so the following patch just disables
using those constants if that flag is on.
2021-09-28 Jakub Jelinek <jakub@redhat.com>
PR target/102498
* config/i386/i386.c (standard_80387_constant_p): Don't recognize
special 80387 instruction XFmode constants if flag_rounding_math.
Andreas Krebbel [Wed, 22 Sep 2021 07:32:21 +0000 (09:32 +0200)]
IBM Z: Fix PR102222
Avoid emitting a strict low part move if the insv target actually
affects the whole target reg.
gcc/ChangeLog:
PR target/102222
* config/s390/s390.c (s390_expand_insv): Emit a normal move if it
is actually a full copy of the source operand into the target.
Don't emit a strict low part move if source and target mode match.
The removal of Cilk Plus support r8-4956 missed to remove
the streaming out of the bit, instead just change the value
for streaming out to be always false.
By hacking fp_expression_p to always return true, I can see
it reads the wrong fp_expressions value (false) out in wpa.
We cannot use r12 here, it is already in use as the GEP (for sibling
calls).
2021-09-08 Segher Boessenkool <segher@kernel.crashing.org>
PR target/102107
* config/rs6000/rs6000-logue.c (rs6000_emit_epilogue): For ELFv2 use
r11 instead of r12 for restoring CR.
rs6000: Don't use r12 for CR save on ELFv2 (PR102107)
CR is saved and/or restored on some paths where GPR12 is already live
since it has a meaning in the calling convention in the ELFv2 ABI.
It is not completely clear to me that we can always use r11 here, but
it does seem save, there is checking code (to detect conflicts here),
and it is stage 1. So here goes.
Harald Anlauf [Fri, 17 Sep 2021 19:45:33 +0000 (21:45 +0200)]
Fortran - (large) arrays in the main shall be static
gcc/fortran/ChangeLog:
PR fortran/102366
* trans-decl.c (gfc_finish_var_decl): Disable the warning message
for variables moved from stack to static storange if they are
declared in the main, but allow the move to happen.
gcc/testsuite/ChangeLog:
PR fortran/102366
* gfortran.dg/pr102366.f90: New test.
Eric Botcazou [Tue, 21 Sep 2021 07:25:47 +0000 (09:25 +0200)]
Fix no_fsanitize_address effective target
The implementation of the no_fsanitize_address effective target was copied
from asan-dg.exp without realizing that it does not work outside of this
context (there is a comment explaining why). As a consequence, it always
returns 0, so for example the directive in gnat.dg/asan1.adb:
{ dg-skip-if "no address sanitizer" { no_fsanitize_address } }
does not work. This led some people to add the nonsensical:
GCC11 - Fortran: combined directives - order(concurrent) not on distribute
While OpenMP 5.1 and GCC 12 permits 'order(concurrent)' on distribute,
OpenMP 5.0 and GCC 11 don't. This patch for GCC 11 ensures the clause also
does not end up on 'distribute' when splitting combined directives.
gcc/fortran/ChangeLog:
* trans-openmp.c (gfc_split_omp_clauses): Don't put 'order(concurrent)'
on 'distribute' for combined directives, matching OpenMP 5.0
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/distribute-order-concurrent.f90: New test.
Harald Anlauf [Thu, 16 Sep 2021 18:12:21 +0000 (20:12 +0200)]
Fortran - fix handling of optional allocatable DT arguments with INTENT(OUT)
gcc/fortran/ChangeLog:
PR fortran/102287
* trans-expr.c (gfc_conv_procedure_call): Wrap deallocation of
allocatable components of optional allocatable derived type
procedure arguments with INTENT(OUT) into a presence check.
gcc/testsuite/ChangeLog:
PR fortran/102287
* gfortran.dg/intent_out_14.f90: New test.
Eric Botcazou [Fri, 17 Sep 2021 08:12:12 +0000 (10:12 +0200)]
Fix PR rtl-optimization/102306
This is a duplication of volatile loads introduced during GCC 9 development
by the 2->2 mechanism of the RTL combiner. There is already a substantial
checking for volatile references in can_combine_p but it implicitly assumes
that the combination reduces the number of instructions, which is of course
not the case here. So the fix teaches try_combine to abort the combination
when it is about to make a copy of volatile references to preserve them.
gcc/
PR rtl-optimization/102306
* combine.c (try_combine): Abort the combination if we are about to
duplicate volatile references.
gcc/testsuite/
* gcc.target/sparc/20210917-1.c: New test.
Harald Anlauf [Mon, 13 Sep 2021 17:28:10 +0000 (19:28 +0200)]
Fortran - ensure simplification of bounds of array-valued named constants
gcc/fortran/ChangeLog:
PR fortran/82314
* decl.c (add_init_expr_to_sym): For proper initialization of
array-valued named constants the array bounds need to be
simplified before adding the initializer.
gcc/testsuite/ChangeLog:
PR fortran/82314
* gfortran.dg/pr82314.f90: New test.
Daniel Cederman [Mon, 25 Mar 2019 08:12:17 +0000 (09:12 +0100)]
sparc: Add scheduling information for LEON5
The LEON5 can often dual issue instructions from the same 64-bit aligned
double word if there are no data dependencies. Add scheduling information
to avoid scheduling unpairable instructions back-to-back.
gcc/ChangeLog:
* config/sparc/sparc-opts.h (enum sparc_processor_type): Add LEON5
* config/sparc/sparc.c (struct processor_costs): Add LEON5 costs
(leon5_adjust_cost): Increase cost of store with data dependency
on ALU instruction and FPU anti-dependencies.
(sparc_option_override): Add LEON5 costs
(sparc_adjust_cost): Add LEON5 cost adjustments
* config/sparc/sparc.h: Add LEON5
* config/sparc/sparc.md: Include LEON5 scheduling information
* config/sparc/sparc.opt: Add LEON5
* doc/invoke.texi: Add LEON5
* config/sparc/leon5.md: New file.
Daniel Cederman [Fri, 16 Oct 2020 07:12:30 +0000 (09:12 +0200)]
sparc: Skip all empty assembly statements
This version detects multiple empty assembly statements in a row and also
detects non-memory barrier empty assembly statements (__asm__("")). It
can be used instead of next_active_insn().
gcc/ChangeLog:
* config/sparc/sparc.c (next_active_non_empty_insn): New function
that returns next active non empty assembly instruction.
(sparc_do_work_around_errata): Use new function.
Daniel Cederman [Fri, 25 Sep 2020 11:17:46 +0000 (13:17 +0200)]
sparc: Treat more instructions as load or store in errata workarounds
Check the attribute of instruction to determine if it performs a store
or load operation. This more generic approach sees the last instruction
in the GOTdata_op model as a potential load and treats the memory barrier
as a potential store instruction.
gcc/ChangeLog:
* config/sparc/sparc.c (store_insn_p): Add predicate for store
attributes.
(load_insn_p): Add predicate for load attributes.
(sparc_do_work_around_errata): Use new predicates.
Andrew Pinski [Tue, 31 Aug 2021 04:41:14 +0000 (04:41 +0000)]
Fix target/101934: aarch64 memset code creates unaligned stores for -mstrict-align
The problem here is the aarch64_expand_setmem code did not check
STRICT_ALIGNMENT if it is creating an overlapping store.
This patch adds that check and the testcase works.
gcc/ChangeLog:
PR target/101934
* config/aarch64/aarch64.c (aarch64_expand_setmem):
Check STRICT_ALIGNMENT before creating an overlapping
store.
gcc/testsuite/ChangeLog:
PR target/101934
* gcc.target/aarch64/memset-strict-align-1.c: New test.
Jakub Jelinek [Wed, 15 Sep 2021 20:21:17 +0000 (22:21 +0200)]
c++: Fix handling of decls with flexible array members initialized with side-effects [PR88578]
> > Note, if the flexible array member is initialized only with non-constant
> > initializers, we have a worse bug that this patch doesn't solve, the
> > splitting of initializers into constant and dynamic initialization removes
> > the initializer and we don't have just wrong DECL_*SIZE, but nothing is
> > emitted when emitting those vars into assembly either and so the dynamic
> > initialization clobbers other vars that may overlap the variable.
> > I think we need keep an empty CONSTRUCTOR elt in DECL_INITIAL for the
> > flexible array member in that case.
>
> Makes sense.
So, the following patch fixes that.
The typeck2.c change makes sure we keep those CONSTRUCTORs around (although
they should be empty because all their elts had side-effects/was
non-constant if it was removed earlier), and the varasm.c change is to avoid
ICEs on those as well as ICEs on other flex array members that had some
initializers without side-effects, but not on the last array element.
The code was already asserting that the (index of the last elt in the
CONSTRUCTOR + 1) times elt size is equal to TYPE_SIZE_UNIT of the local->val
type, which is true for C flex arrays or for C++ if they don't have any
side-effects or the last elt doesn't have side-effects, this patch changes
that to assertion that the TYPE_SIZE_UNIT is greater than equal to the
offset of the end of last element in the CONSTRUCTOR and uses TYPE_SIZE_UNIT
(int_size_in_bytes) in the code later on.
2021-09-15 Jakub Jelinek <jakub@redhat.com>
PR c++/88578
PR c++/102295
gcc/
* varasm.c (output_constructor_regular_field): Instead of assertion
that array_size_for_constructor result is equal to size of
TREE_TYPE (local->val) in bytes, assert that the type size is greater
or equal to array_size_for_constructor result and use type size as
fieldsize.
gcc/cp/
* typeck2.c (split_nonconstant_init_1): Don't throw away empty
initializers of flexible array members if they have non-zero type
size.
gcc/testsuite/
* g++.dg/ext/flexary39.C: New test.
* g++.dg/ext/flexary40.C: New test.
Jakub Jelinek [Tue, 14 Sep 2021 14:56:30 +0000 (16:56 +0200)]
c++: Update DECL_*SIZE for objects with flexible array members with initializers [PR102295]
The C FE updates DECL_*SIZE for vars which have initializers for flexible
array members for many years, but C++ FE kept DECL_*SIZE the same as the
type size (i.e. as if there were zero elements in the flexible array
member). This results e.g. in ELF symbol sizes being too small.
Note, if the flexible array member is initialized only with non-constant
initializers, we have a worse bug that this patch doesn't solve, the
splitting of initializers into constant and dynamic initialization removes
the initializer and we don't have just wrong DECL_*SIZE, but nothing is
emitted when emitting those vars into assembly either and so the dynamic
initialization clobbers other vars that may overlap the variable.
I think we need keep an empty CONSTRUCTOR elt in DECL_INITIAL for the
flexible array member in that case.
2021-09-14 Jakub Jelinek <jakub@redhat.com>
PR c++/102295
* decl.c (layout_var_decl): For aggregates ending with a flexible
array member, add the size of the initializer for that member to
DECL_SIZE and DECL_SIZE_UNIT.