Jakub Jelinek [Thu, 11 Nov 2021 09:14:04 +0000 (10:14 +0100)]
dwarf2out: Fix up field_byte_offset [PR101378]
For PCC_BITFIELD_TYPE_MATTERS field_byte_offset has quite large code
to deal with it since many years ago (see it e.g. in GCC 3.2, although it
used to be on HOST_WIDE_INTs, then on double_ints, now on offset_ints).
But that code apparently isn't able to cope with members with empty class
types with [[no_unique_address]] attribute, because the empty classes have
non-zero type size but zero decl size and so one can end up from the
computation with negative offset or offset 1 byte smaller than it should be.
For !PCC_BITFIELD_TYPE_MATTERS, we just use
tree_result = byte_position (decl);
which seems exactly right even for the empty classes or anything which is
not a bitfield (and for which we don't add DW_AT_bit_offset attribute).
So, instead of trying to handle those no_unique_address members in the
current already very complicated code, this limits it to bitfields.
stor-layout.c PCC_BITFIELD_TYPE_MATTERS handling also affects only
bitfields, twice it checks DECL_BIT_FIELD and once DECL_BIT_FIELD_TYPE.
As discussed, this patch uses DECL_BIT_FIELD_TYPE check, because
DECL_BIT_FIELD might be cleared for some bitfields with bitsizes
multiple of BITS_PER_UNIT and e.g.
struct S { int e; int a : 1, b : 7, c : 8, d : 16; } s;
struct T { int a : 1, b : 7; long long c : 8; int d : 16; } t;
int
main ()
{
s.c = 0x55;
s.d = 0xaaaa;
t.c = 0x55;
t.d = 0xaaaa;
s.e++;
}
has different debug info with DECL_BIT_FIELD check.
2021-11-11 Jakub Jelinek <jakub@redhat.com>
PR debug/101378
* dwarf2out.c (field_byte_offset): Do the PCC_BITFIELD_TYPE_MATTERS
handling only for DECL_BIT_FIELD_TYPE decls.
Jakub Jelinek [Thu, 21 Oct 2021 08:27:44 +0000 (10:27 +0200)]
openmp: For default(none) ignore variables created by ubsan_create_data [PR64888]
We weren't ignoring the ubsan variables created by c-ubsan.c before gimplification
(others are added later). One way to fix this would be to introduce further
UBSAN_ internal functions and lower it later (sanopt pass) like other ifns,
this patch instead recognizes those magic vars by name/name of type and DECL_ARTIFICIAL
and TYPE_ARTIFICIAL.
2021-10-21 Jakub Jelinek <jakub@redhat.com>
PR middle-end/64888
gcc/c-family/
* c-omp.c (c_omp_predefined_variable): Return true also for
ubsan_create_data created artificial variables.
gcc/testsuite/
* c-c++-common/ubsan/pr64888.c: New test.
Jakub Jelinek [Tue, 19 Oct 2021 07:24:57 +0000 (09:24 +0200)]
c++: Don't reject calls through PMF during constant evaluation [PR102786]
The following testcase incorrectly rejects the c initializer,
while in the s.*a case cxx_eval_* sees .__pfn reads etc.,
in the s.*&S::foo case get_member_function_from_ptrfunc creates
expressions which use INTEGER_CSTs with type of pointer to METHOD_TYPE.
And cxx_eval_constant_expression rejects any INTEGER_CSTs with pointer
type if they aren't 0.
Either we'd need to make sure we defer such folding till cp_fold but the
function and pfn_from_ptrmemfunc is used from lots of places, or
the following patch just tries to reject only non-zero INTEGER_CSTs
with pointer types if they don't point to METHOD_TYPE in the hope that
all such INTEGER_CSTs with POINTER_TYPE to METHOD_TYPE are result of
folding valid pointer-to-member function expressions.
I don't immediately see how one could create such INTEGER_CSTs otherwise,
cast of integers to PMF is rejected and would have the PMF RECORD_TYPE
anyway, etc.
2021-10-19 Jakub Jelinek <jakub@redhat.com>
PR c++/102786
* constexpr.c (cxx_eval_constant_expression): Don't reject
INTEGER_CSTs with type POINTER_TYPE to METHOD_TYPE.
Jakub Jelinek [Fri, 15 Oct 2021 14:25:25 +0000 (16:25 +0200)]
openmp: Fix up handling of OMP_PLACES=threads(1)
When writing the places-*.c tests, I've noticed that we mishandle threads
abstract name with specified num-places if num-places isn't a multiple of
number of hw threads in a core. It then happily ignores the maximum count
and overwrites for the remaining hw threads in a core further places that
haven't been allocated.
2021-10-15 Jakub Jelinek <jakub@redhat.com>
* config/linux/affinity.c (gomp_affinity_init_level_1): For level 1
after creating count places clean up and return immediately.
* testsuite/libgomp.c/places-6.c: New test.
* testsuite/libgomp.c/places-7.c: New test.
* testsuite/libgomp.c/places-8.c: New test.
Jakub Jelinek [Sun, 10 Oct 2021 10:13:22 +0000 (12:13 +0200)]
var-tracking: Fix a wrong-debug issue caused by my r10-7665 var-tracking change [PR102441]
Since my r10-7665-g33c45e51b4914008064d9b77f2c1fc0eea1ad060 change, we get
wrong-debug on e.g. the following testcase at -O2 -g on x86_64-linux for the
x parameter:
void bar (int *r);
int
foo (int x)
{
int r = 0;
bar (&r);
return r;
}
At the start of function, we have
subq $24, %rsp
leaq 12(%rsp), %rdi
instructions. The x parameter is passed in %rdi, but isn't used in the
function and so the leaq instruction overwrites %rdi without remembering
%rdi anywhere. Before the r10-7665 change (which was trying to fix a large
(3% for 32-bit, 1% for 64-bit x86-64) debug info/loc growth introduced with
r10-7515), the leaq insn above resulted in a MO_VAL_SET micro-operation that
said that the value of sp + 12, a cselib_sp_derived_value_p, is stored into
the %rdi register. The r10-7665 change added a change to add_stores that
added no micro-operation for the leaq store, with the rationale that the sp
based values can be and will be always computable some other more compact
and primarily more stable way (cfa based expression like DW_OP_fbreg, that
is the same in the whole function). That is true. But by throwing the
micro-operation on the floor, we miss another important part of the
MO_VAL_SET, in particular that the destination of the store, %rdi in this
case, now has a different value from what it had before, so the vt_*
dataflow code thinks that even after the leaq instruction %rdi still holds
the x argument value (and changes it to DW_OP_entry_value (%rdi) only in the
middle of the call to bar). Previously and with the patches below,
the location for x changes already at the end of leaq instruction to
DW_OP_entry_value (%rdi).
My first attempt to fix this was instead of dropping the MO_VAL_SET add
a MO_CLOBBER operation:
--- gcc/var-tracking.c.jj 2021-05-04 21:02:24.196799586 +0200
+++ gcc/var-tracking.c 2021-09-24 19:23:16.420154828 +0200
@@ -6133,7 +6133,9 @@ add_stores (rtx loc, const_rtx expr, voi
{
if (preserve)
preserve_value (v);
- return;
+ mo.type = MO_CLOBBER;
+ mo.u.loc = loc;
+ goto log_and_return;
}
nloc = replace_expr_with_values (oloc);
so don't track that the value lives in the loc destination, but track
that the previous value doesn't live there anymore. That failed bootstrap
miserably, the vt_* code isn't prepared to see MO_CLOBBER of a MEM that
isn't tracked (e.g. has MEM_EXPR on it that the var-tracking code wants
to track, i.e. track_p in add_stores). On the other side, thinking about
it more, in the most common case where a cselib_sp_derived_value_p value
is stored into the sp register (and which is the reason why PR94495
testcase got larger), dropping the micro-operation on the floor is the
right thing, because we have that cselib_sp_derived_value_p tracking, any
reads from the sp hard register will be treated as
cselib_sp_derived_value_p.
Then I've tried 3 different patches described below and in the end
what is committed is patch2.
Additionally, I've gathered statistics from cc1plus by always reverting the
var-tracking.c change after finished bootstrap/regtest and rebuilding the
stage3 var-tracking.o and cc1plus, such that it would be comparable.
dwlocstat and .debug_{info,loclists} section sizes detailed below.
patch3 uses MO_VAL_SET (i.e. essentially reversion of the r10-7665
change) when destination is not a REG_P and !track_p, otherwise if
destination is sp drops the micro-operation on the floor (i.e. no change),
otherwise adds a MO_CLOBBER.
patch1 is similar, except it checks for destination not equal to sp and
!track_p, i.e. for !track_p REG_P destinations other than sp it will use
MO_VAL_SET rather than MO_CLOBBER.
Finally, patch2, the shortest patch, uses MO_VAL_SET whenever destination
is not sp and otherwise drops the micro-operation on the floor.
All the 3 patches don't affect the PR94495 testcase, all the changes
there were caused by stores of sp based values into %rsp.
While the patch2 (and patch1 which results in exactly the same sizes)
causes the largest debug loclists/info growth from the 3, it is still quite
minor (0.651% on 64-bit and 0.114% on 32-bit) compared
to the 1% and 3% PR94495 was trying to solve, and I actually think it is the
best thing to do. Because, if we have say
int q[10];
int *p = &q[0];
or similar and we load the &q[0] sp based value into some hard register,
by noting in the debug info that p lives in some hard reg for some part
of the function and a user is trying to change the p var in the debugger,
if we say it lives in some register or memory, there is some chance that
the changing of the value could work successfully (of course, nothing
is guaranteed, we don't have tracking of where each var lives at which
moment for changing purposes (i.e. what register, memory or else you need
to change in order to change behavior of the code)), while if we just say
that p's location is DW_OP_fbreg 16 DW_OP_stack_value, that is a read-only
value one can just print but not change. Now, for stores of variable
values into the sp register, I don't think we have such an issue, you don't
want debugger to change your stack pointer when user asks to change value
of some variable whose value lives in the stack pointer, that would pretty
much always result in misbehavior of the program.
So, my preference from these 3 is patch2 and that is being committed.
Jakub Jelinek [Tue, 5 Oct 2021 20:28:38 +0000 (22:28 +0200)]
c++: Fix apply_identity_attributes [PR102548]
The following testcase ICEs on x86_64-linux with -m32 due to a bug in
apply_identity_attributes. The function is being smart and attempts not
to duplicate the chain unnecessarily, if either there are no attributes
that affect type identity or there is possibly empty set of attributes
that do not affect type identity in the chain followed by attributes
that do affect type identity, it reuses that attribute chain.
The function mishandles the cases where in the chain an attribute affects
type identity and is followed by one or more attributes that don't
affect type identity (and then perhaps some further ones that do).
There are two bugs. One is that when we notice first attribute that
doesn't affect type identity after first attribute that does affect type
identity (with perhaps some further such attributes in the chain after it),
we want to put into the new chain just attributes starting from
(inclusive) first_ident and up to (exclusive) the current attribute a,
but the code puts into the chain all attributes starting with first_ident,
including the ones that do not affect type identity and if e.g. we have
doesn't0 affects1 doesn't2 affects3 affects4 sequence of attributes, the
resulting sequence would have
affects1 doesn't2 affects3 affects4 affects3 affects4
attributes, i.e. one attribute that shouldn't be there and two attributes
duplicated. That is fixed by the a2 -> a2 != a change.
The second one is that we ICE once we see second attribute that doesn't
affect type identity after an attribute that affects it. That is because
first_ident is set to error_mark_node after handling the first attribute
that doesn't affect type identity (i.e. after we've copied the
[first_ident, a) set of attributes to the new chain) to denote that from
that time on, each attribute that affects type identity should be copied
whenever it is seen (the if (as && as->affects_type_identity) code does
that correctly). But that condition is false and first_ident is
error_mark_node, we enter else if (first_ident) and use TREE_PURPOSE
/TREE_VALUE/TREE_CHAIN on error_mark_node, which ICEs. When
first_ident is error_mark_node and a doesn't affect type identity,
we want to do nothing. So that is the && first_ident != error_mark_node
chunk.
2021-10-05 Jakub Jelinek <jakub@redhat.com>
PR c++/102548
* tree.c (apply_identity_attributes): Fix handling of the
case where an attribute in the list doesn't affect type
identity but some attribute before it does.
Jakub Jelinek [Fri, 1 Oct 2021 12:27:32 +0000 (14:27 +0200)]
ubsan: Use -fno{,-}sanitize=float-divide-by-zero for float division by zero recovery [PR102515]
We've been using
-f{,no-}sanitize-recover=integer-divide-by-zero to decide on the float
-fsanitize=float-divide-by-zero instrumentation _abort suffix.
This patch fixes it to use -f{,no-}sanitize-recover=float-divide-by-zero
for it instead.
2021-10-01 Jakub Jelinek <jakub@redhat.com>
Richard Biener <rguenther@suse.de>
PR sanitizer/102515
gcc/c-family/
* c-ubsan.c (ubsan_instrument_division): Check the right
flag_sanitize_recover bit, depending on which sanitization
is done.
gcc/testsuite/
* c-c++-common/ubsan/float-div-by-zero-2.c: New test.
Jakub Jelinek [Tue, 28 Sep 2021 11:02:51 +0000 (13:02 +0200)]
i386: Don't emit fldpi etc. if -frounding-math [PR102498]
i387 has instructions to store some transcedental numbers into the top of
stack. The problem is that what exact bit in the last place one gets for
those depends on the current rounding mode, the CPU knows the number with
slightly higher precision. The compiler assumes rounding to nearest when
comparing them against constants in the IL, but at runtime the rounding
can be different and so some of these depending on rounding mode and the
constant could be 1 ulp higher or smaller than expected.
We only support changing the rounding mode at runtime if the non-default
-frounding-mode option is used, so the following patch just disables
using those constants if that flag is on.
2021-09-28 Jakub Jelinek <jakub@redhat.com>
PR target/102498
* config/i386/i386.c (standard_80387_constant_p): Don't recognize
special 80387 instruction XFmode constants if flag_rounding_math.
Jakub Jelinek [Wed, 15 Sep 2021 20:21:17 +0000 (22:21 +0200)]
c++: Fix handling of decls with flexible array members initialized with side-effects [PR88578]
> > Note, if the flexible array member is initialized only with non-constant
> > initializers, we have a worse bug that this patch doesn't solve, the
> > splitting of initializers into constant and dynamic initialization removes
> > the initializer and we don't have just wrong DECL_*SIZE, but nothing is
> > emitted when emitting those vars into assembly either and so the dynamic
> > initialization clobbers other vars that may overlap the variable.
> > I think we need keep an empty CONSTRUCTOR elt in DECL_INITIAL for the
> > flexible array member in that case.
>
> Makes sense.
So, the following patch fixes that.
The typeck2.c change makes sure we keep those CONSTRUCTORs around (although
they should be empty because all their elts had side-effects/was
non-constant if it was removed earlier), and the varasm.c change is to avoid
ICEs on those as well as ICEs on other flex array members that had some
initializers without side-effects, but not on the last array element.
The code was already asserting that the (index of the last elt in the
CONSTRUCTOR + 1) times elt size is equal to TYPE_SIZE_UNIT of the local->val
type, which is true for C flex arrays or for C++ if they don't have any
side-effects or the last elt doesn't have side-effects, this patch changes
that to assertion that the TYPE_SIZE_UNIT is greater than equal to the
offset of the end of last element in the CONSTRUCTOR and uses TYPE_SIZE_UNIT
(int_size_in_bytes) in the code later on.
2021-09-15 Jakub Jelinek <jakub@redhat.com>
PR c++/88578
PR c++/102295
gcc/
* varasm.c (output_constructor_regular_field): Instead of assertion
that array_size_for_constructor result is equal to size of
TREE_TYPE (local->val) in bytes, assert that the type size is greater
or equal to array_size_for_constructor result and use type size as
fieldsize.
gcc/cp/
* typeck2.c (split_nonconstant_init_1): Don't throw away empty
initializers of flexible array members if they have non-zero type
size.
gcc/testsuite/
* g++.dg/ext/flexary39.C: New test.
* g++.dg/ext/flexary40.C: New test.
Jakub Jelinek [Tue, 14 Sep 2021 14:56:30 +0000 (16:56 +0200)]
c++: Update DECL_*SIZE for objects with flexible array members with initializers [PR102295]
The C FE updates DECL_*SIZE for vars which have initializers for flexible
array members for many years, but C++ FE kept DECL_*SIZE the same as the
type size (i.e. as if there were zero elements in the flexible array
member). This results e.g. in ELF symbol sizes being too small.
Note, if the flexible array member is initialized only with non-constant
initializers, we have a worse bug that this patch doesn't solve, the
splitting of initializers into constant and dynamic initialization removes
the initializer and we don't have just wrong DECL_*SIZE, but nothing is
emitted when emitting those vars into assembly either and so the dynamic
initialization clobbers other vars that may overlap the variable.
I think we need keep an empty CONSTRUCTOR elt in DECL_INITIAL for the
flexible array member in that case.
2021-09-14 Jakub Jelinek <jakub@redhat.com>
PR c++/102295
* decl.c (layout_var_decl): For aggregates ending with a flexible
array member, add the size of the initializer for that member to
DECL_SIZE and DECL_SIZE_UNIT.
Jakub Jelinek [Tue, 14 Sep 2021 14:55:04 +0000 (16:55 +0200)]
c++: Fix __is_*constructible/assignable for templates [PR102305]
is_xible_helper returns error_mark_node (i.e. false from the traits)
for abstract classes by testing ABSTRACT_CLASS_TYPE_P (to) early.
Unfortunately, as the testcase shows, that doesn't work on class templates
that haven't been instantiated yet, ABSTRACT_CLASS_TYPE_P for them is false
until it is instantiated, which is done when the routine later constructs
a dummy object with that type.
The following patch fixes this by calling complete_type first, so that
ABSTRACT_CLASS_TYPE_P test will work properly, while keeping the handling
of arrays with unknown bounds, or incomplete types where it is done
currently.
2021-09-14 Jakub Jelinek <jakub@redhat.com>
PR c++/102305
* method.c (is_xible_helper): Call complete_type on to.
Jakub Jelinek [Wed, 8 Sep 2021 09:25:31 +0000 (11:25 +0200)]
i386: Fix up @xorsign<mode>3_1 [PR102224]
As the testcase shows, we miscompile @xorsign<mode>3_1 if both input
operands are in the same register, because the splitter overwrites op1
before with op1 & mask before using op0.
For dest = xorsign op0, op0 we can actually simplify it from
dest = (op0 & mask) ^ op0 to dest = op0 & ~mask (aka abs).
The expander change is an optimization improvement, if we at expansion
time know it is xorsign op0, op0, we can emit abs right away and get better
code through that.
The @xorsign<mode>3_1 is a fix for the case where xorsign wouldn't be known
to have same operands during expansion, but during RTL optimizations they
would appear. We need to use earlyclobber, we require dest and op1 to be
the same but op0 must be different because we overwrite
op1 first.
2021-09-08 Jakub Jelinek <jakub@redhat.com>
PR target/102224
* config/i386/i386.md (xorsign<mode>3): If operands[1] is equal to
operands[2], emit abs<mode>2 instead.
(@xorsign<mode>3_1): Add early-clobber for output operand.
* gcc.dg/pr102224.c: New test.
* gcc.target/i386/avx-pr102224.c: New test.
Jakub Jelinek [Mon, 23 Aug 2021 09:50:14 +0000 (11:50 +0200)]
dwarf2out: Emit DW_AT_location for global register vars during early dwarf [PR101905]
The following patch emits DW_AT_location for global register variables
already during early dwarf, since usually late_global_decl hook isn't even
called for those, as nothing needs to be emitted for them.
2021-08-23 Jakub Jelinek <jakub@redhat.com>
PR debug/101905
* dwarf2out.c (gen_variable_die): Add DW_AT_location for global
register variables already during early_dwarf if possible.
Jakub Jelinek [Wed, 28 Jul 2021 16:43:15 +0000 (18:43 +0200)]
ubsan: Fix ICEs with DECL_REGISTER tests [PR101624]
The following testcase ICEs, because the base is a CONST_DECL for
the Fortran parameter, and ubsan/sanopt uses DECL_REGISTER macro on it.
/* In VAR_DECL and PARM_DECL nodes, nonzero means declared `register'. */
#define DECL_REGISTER(NODE) (DECL_WRTL_CHECK (NODE)->decl_common.decl_flag_0)
while CONST_DECL doesn't satisfy DECL_WRTL_CHECK.
The following patch checks explicitly for VAR_DECL/PARM_DECL/RESULT_DECL
only before using DECL_REGISTER, assumes other decls aren't DECL_REGISTER.
Not really sure about RESULT_DECL but it at least satisfies DECL_WRTL_CHECK...
2021-07-28 Jakub Jelinek <jakub@redhat.com>
PR middle-end/101624
* ubsan.c (maybe_instrument_pointer_overflow,
instrument_object_size): Only test DECL_REGISTER on VAR_DECLs,
PARM_DECLs or RESULT_DECLs.
* sanopt.c (maybe_optimize_ubsan_ptr_ifn): Likewise.
* gfortran.dg/ubsan/ubsan.exp: New file.
* gfortran.dg/ubsan/pr101624.f90: New test.
Jakub Jelinek [Fri, 23 Jul 2021 17:55:16 +0000 (19:55 +0200)]
expmed: Fix store_integral_bit_field [PR101562]
Our documentation says that paradoxical subregs shouldn't appear
in strict_low_part:
'(strict_low_part (subreg:M (reg:N R) 0))'
This expression code is used in only one context: as the
destination operand of a 'set' expression. In addition, the
operand of this expression must be a non-paradoxical 'subreg'
expression.
but on the testcase below that triggers UB at runtime
store_integral_bit_field emits exactly that.
The following patch fixes it by ensuring the requirement is satisfied.
2021-07-23 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/101562
* expmed.c (store_integral_bit_field): Only use movstrict_optab
if the operand isn't paradoxical.
Jakub Jelinek [Wed, 21 Jul 2021 07:45:02 +0000 (09:45 +0200)]
openmp: Fix up omp_check_private [PR101535]
The target data construct shouldn't affect omp_check_private, unless
the decl there is privatized (use_device_* clauses). The routine
had some code for that, but it just did continue; in a loop that looped
only if the region type is one of selected 4 kinds, so effectively resulted
in return false; instead of looping again. And not diagnosing lastprivate
(or reduction etc.) on a variable that is private to containing parallel
results in ICEs later on, as there is no original list item to which store
the last result.
The target construct is unclear as it has an implicit parallel region
and it is not obvious if the data privatization clauses on the construct
shall be treated as data privatization on the implicit parallel or just
on the target. For now treat those as privatization on the implicit
parallel, but treat map clauses as shared on the implicit parallel.
2021-07-21 Jakub Jelinek <jakub@redhat.com>
PR middle-end/101535
* gimplify.c (omp_check_private): Properly skip ORT_TARGET_DATA
contexts in which decl isn't privatized and for ORT_TARGET return
false if decl is mapped.
* c-c++-common/gomp/pr101535-1.c: New test.
* c-c++-common/gomp/pr101535-2.c: New test.
Jakub Jelinek [Wed, 21 Jul 2021 07:38:59 +0000 (09:38 +0200)]
c++: Ensure OpenMP reduction with reference type references complete type [PR101516]
The following testcase ICEs because we haven't verified if reduction decl
has reference type that TREE_TYPE of the reference is a complete type,
require_complete_type on the decl doesn't ensure that.
2021-07-21 Jakub Jelinek <jakub@redhat.com>
PR c++/101516
* semantics.c (finish_omp_reduction_clause): Also call
complete_type_or_else and return true if it fails.
Jakub Jelinek [Tue, 20 Jul 2021 14:41:29 +0000 (16:41 +0200)]
rs6000: Fix up easy_vector_constant_msb handling [PR101384]
The following gcc.dg/pr101384.c testcase is miscompiled on
powerpc64le-linux.
easy_altivec_constant has code to try construct vector constants with
different element sizes, perhaps different from CONST_VECTOR's mode. But as
written, that works fine for vspltis[bhw] cases, but not for the vspltisw
x,-1; vsl[bhw] x,x,x case, because that creates always a V16QImode, V8HImode
or V4SImode constant containing broadcasted constant with just the MSB set.
The vspltis_constant function etc. expects the vspltis[bhw] instructions
where the small [-16..15] or even [-32..30] constant is sign-extended to the
remaining step bytes, but that is not the case for the 0x80...00 constants,
with step 1 we can't handle e.g.
{ 0x80, 0xff, 0xff, 0xff, 0x80, 0xff, 0xff, 0xff, 0x80, 0xff, 0xff, 0xff, 0x80, 0xff, 0xff, 0xff }
vectors but do want to handle e.g.
{ 0, 0, 0, 0x80, 0, 0, 0, 0x80, 0, 0, 0, 0x80, 0, 0, 0, 0x80 }
and similarly with copies 1 we do want to handle e.g.
{ 0x80808080, 0x80808080, 0x80808080, 0x80808080 }.
This is a simpler version of the fix for backports, which limits the EASY_VECTOR_MSB case
matching to step == 1 && copies == 1, because that is the only case the
splitter handles correctly.
2021-07-20 Jakub Jelinek <jakub@redhat.com>
PR target/101384
* config/rs6000/rs6000.c (vspltis_constant): Accept EASY_VECTOR_MSB
only if step and copies are equal to 1.
Jakub Jelinek [Thu, 1 Jul 2021 06:55:49 +0000 (08:55 +0200)]
openmp - Fix up && and || reductions [PR94366]
As the testcase shows, the special treatment of && and || reduction combiners
where we expand them as omp_out = (omp_out != 0) && (omp_in != 0) (or with ||)
is not needed just for &&/|| on floating point or complex types, but for all
&&/|| reductions - when expanded as omp_out = omp_out && omp_in (not in C but
GENERIC) it is actually gimplified into NOP_EXPRs to bool from both operands,
which turns non-zero values multiple of 2 into 0 rather than 1.
This patch just treats all &&/|| the same and furthermore uses bool type
instead of int for the comparisons.
2021-07-01 Jakub Jelinek <jakub@redhat.com>
PR middle-end/94366
gcc/
* omp-low.c (lower_rec_input_clauses): Rename is_fp_and_or to
is_truth_op, set it for TRUTH_*IF_EXPR regardless of new_var's type,
use boolean_type_node instead of integer_type_node as NE_EXPR type.
(lower_reduction_clauses): Likewise.
libgomp/
* testsuite/libgomp.c-c++-common/pr94366.c: New test.
Tobias Burnus [Tue, 4 May 2021 11:38:03 +0000 (13:38 +0200)]
OpenMP: Support complex/float in && and || reduction
C/C++ permit logical AND and logical OR also with floating-point or complex
arguments by doing an unequal zero comparison; the result is an 'int' with
value one or zero. Hence, those are also permitted as reduction variable,
even though it is not the most sensible thing to do.
gcc/c/ChangeLog:
* c-typeck.c (c_finish_omp_clauses): Accept float + complex
for || and && reductions.
gcc/cp/ChangeLog:
* semantics.c (finish_omp_reduction_clause): Accept float + complex
for || and && reductions.
gcc/ChangeLog:
* omp-low.c (lower_rec_input_clauses, lower_reduction_clauses): Handle
&& and || with floating-point and complex arguments.
gcc/testsuite/ChangeLog:
* gcc.dg/gomp/clause-1.c: Use 'reduction(&:..)' instead of '...(&&:..)'.
libgomp/ChangeLog:
* testsuite/libgomp.c-c++-common/reduction-1.c: New test.
* testsuite/libgomp.c-c++-common/reduction-2.c: New test.
* testsuite/libgomp.c-c++-common/reduction-3.c: New test.
Comparisons of NULLPTR_TYPE operands cause all kinds of problems in the
middle-end and in fold-const.c, various optimizations assume that if they
see e.g. a non-equality comparison with one of the operands being
INTEGER_CST and it is not INTEGRAL_TYPE_P (which has TYPE_{MIN,MAX}_VALUE),
they can build_int_cst (type, 1) to find a successor.
The following patch fixes it by making sure they don't appear in the IL,
optimize them away at cp_fold time as all can be folded.
Though, I've just noticed that clang++ rejects the non-equality comparisons
instead, foo () > 0 with
invalid operands to binary expression ('decltype(nullptr)' (aka 'nullptr_t') and 'int')
and foo () > nullptr with
invalid operands to binary expression ('decltype(nullptr)' (aka 'nullptr_t') and 'nullptr_t')
Shall we reject those too, in addition or instead of parts of this patch?
If so, wouldn't this patch be still useful for backports, I bet we don't
want to start reject it on the release branches when we used to accept it.
2021-07-15 Jakub Jelinek <jakub@redhat.com>
PR c++/101443
* cp-gimplify.c (cp_fold): For comparisons with NULLPTR_TYPE
operands, fold them right away to true or false.
pot_dummy_types is a hash_set from whose traversal the code prints some type
lines. hash_set normally uses default_hash_traits which for pointer types
(the hash set hashes const char *) uses pointer_hash which hashes the
addresses of the pointers except of the least significant 3 bits.
With address space randomization, that results in non-determinism in the
-fdump-go-specs= generated file, each invocation can have different order of
the lines emitted from pot_dummy_types traversal.
This patch fixes it by hashing the string contents instead to make the
hashes reproduceable.
2021-07-14 Jakub Jelinek <jakub@redhat.com>
PR go/101407
* godump.c (godump_str_hash): New type.
(godump_container::pot_dummy_types): Use string_hash instead of
ptr_hash in the hash_set.
Jakub Jelinek [Tue, 13 Jul 2021 07:50:49 +0000 (09:50 +0200)]
libgomp: Don't include limits.h instead of hidden visibility block
sem.h is included in between # pragma GCC visibility push(hidden)
and # pragma GCC visibility pop and includes limits.h there, which
since the introduction of sysconf declaration in recent glibcs
in there causes trouble. libgomp assumes it is compiled by gcc,
so we don't really need to include limits.h there and can use
-__INT_MAX__ - 1 instead (which clang and icc support too for years).
2021-07-13 Jakub Jelinek <jakub@redhat.com>
Florian Weimer <fweimer@redhat.com>
* config/linux/sem.h: Don't include limits.h.
(SEM_WAIT): Define to -__INT_MAX__ - 1 instead of INT_MIN.
* config/linux/affinity.c: Include limits.h.
Jakub Jelinek [Thu, 1 Jul 2021 07:45:02 +0000 (09:45 +0200)]
dwarf2out: Handle COMPOUND_LITERAL_EXPR in loc_list_from_tree_1 [PR101266]
In this case dwarf2out_decl is called from the FEs with GENERIC but not
yet gimplified expressions in it.
As loc_list_from_tree_1 has an exhaustive list of tree codes it wants to
handle and for checking asserts no other codes makes it in, we should
handle even GENERIC trees that shouldn't be valid in GIMPLE.
The following patch handles COMPOUND_LITERAL_EXPR by hnadling it like the
underlying VAR_DECL temporary.
Verified the emitted DWARF is correct (but unoptimized, we emit
DW_OP_lit1 DW_OP_lit1 DW_OP_minus for the upper bound).
Jakub Jelinek [Tue, 29 Jun 2021 09:24:38 +0000 (11:24 +0200)]
match.pd: Avoid (intptr_t)x eq/ne CST to x eq/ne (typeof x) CST opt in GENERIC when sanitizing [PR101210]
When we have (intptr_t) x == cst where x has REFERENCE_TYPE, this
optimization creates x == cst out of it where cst has REFERENCE_TYPE.
If it is done in GENERIC folding, it can results in ubsan failures
where the INTEGER_CST with REFERENCE_TYPE is instrumented.
Fixed by deferring it to GIMPLE folding in this case.
2021-06-29 Jakub Jelinek <jakub@redhat.com>
PR c++/101210
* match.pd ((intptr_t)x eq/ne CST to x eq/ne (typeof x) CST): Don't
perform the optimization in GENERIC when sanitizing and x has a
reference type.
Jakub Jelinek [Thu, 24 Jun 2021 13:58:02 +0000 (15:58 +0200)]
c: Fix up c_parser_has_attribute_expression [PR101176]
This function keeps src_range member of the result uninitialized, which at
least under valgrind can show up later when those uninitialized location_t's
can make it into the IL or location_t hash tables.
2021-06-24 Jakub Jelinek <jakub@redhat.com>
PR c/101176
* c-parser.c (c_parser_has_attribute_expression): Set source range for
the result.
Jakub Jelinek [Thu, 24 Jun 2021 13:55:28 +0000 (15:55 +0200)]
c: Fix C cast error-recovery [PR101171]
The following testcase ICEs during error-recovery, as build_c_cast calls
note_integer_operands on error_mark_node and that wraps it into
C_MAYBE_CONST_EXPR which is unexpected and causes ICE later on.
Seems most other callers of note_integer_operands check early if something
is error_mark_node and return before calling note_integer_operands on it.
The following patch fixes it by not calling on error_mark_node, another
possibility would be to handle error_mark_node in note_integer_operands and
just return it.
2021-06-24 Jakub Jelinek <jakub@redhat.com>
PR c/101171
* c-typeck.c (build_c_cast): Don't call note_integer_operands on
error_mark_node.
Jakub Jelinek [Wed, 23 Jun 2021 08:03:28 +0000 (10:03 +0200)]
openmp: Fix up *_reduction clause handling with UDRs on PARM_DECLs [PR101167]
The following testcase FAILs, because the UDR combiner is invoked incorrectly.
lower_omp_rec_clauses expects that when it sets
DECL_VALUE_EXPR/DECL_HAS_VALUE_EXPR_P
for both the placeholder and the var that everything will be properly
regimplified, but as the variable in question is a PARM_DECL rather than
VAR_DECL, lower_omp_regimplify_p doesn't say that it should be regimplified
and so it is not.
2021-06-23 Jakub Jelinek <jakub@redhat.com>
PR middle-end/101167
* omp-low.c (lower_omp_regimplify_p): Regimplify also PARM_DECLs
and RESULT_DECLs that have DECL_HAS_VALUE_EXPR_P set.
* testsuite/libgomp.c-c++-common/task-reduction-15.c: New test.
Jakub Jelinek [Mon, 21 Jun 2021 11:30:42 +0000 (13:30 +0200)]
inline-asm: Fix ICE with bitfields in "m" operands [PR100785]
Bitfields, while they live in memory, aren't something inline-asm can easily
operate on.
For C and "=m" or "+m", we were diagnosing bitfields in the past in the
FE, where c_mark_addressable had:
case COMPONENT_REF:
if (DECL_C_BIT_FIELD (TREE_OPERAND (x, 1)))
{
error
("cannot take address of bit-field %qD", TREE_OPERAND (x, 1));
return false;
}
but that check got moved in GCC 6 to build_unary_op instead and now we
emit an error during expansion and ICE afterwards (i.e. error-recovery).
For "m" it used to be diagnosed in c_mark_addressable too, but since
GCC 6 it is ice-on-invalid.
For C++, this was never diagnosed in the FE, but used to be diagnosed
in the gimplifier and/or during expansion before 4.8.
The following patch does multiple things:
1) diagnoses it in the FEs
2) simplifies during expansion the inline asm if any errors have been
reported (similarly how e.g. vregs pass if it detects errors on
inline-asm either deletes them or simplifies to bare minimum -
just labels), so that we don't have error-recovery ICEs there
2021-06-11 Jakub Jelinek <jakub@redhat.com>
PR inline-asm/100785
gcc/
* cfgexpand.c (expand_asm_stmt): If errors are emitted,
remove all inputs, outputs and clobbers from the asm and
set template to "".
gcc/c/
* c-typeck.c (c_mark_addressable): Diagnose trying to make
bit-fields addressable.
gcc/cp/
* typeck.c (cxx_mark_addressable): Diagnose trying to make
bit-fields addressable.
gcc/testsuite/
* c-c++-common/pr100785.c: New test.
Jakub Jelinek [Fri, 18 Jun 2021 09:20:40 +0000 (11:20 +0200)]
stor-layout: Don't create DECL_BIT_FIELD_REPRESENTATIVE for QUAL_UNION_TYPE [PR101062]
> The following patch does create them, but treats all such bitfields as if
> they were in a structure where the particular bitfield is the only field.
While the patch passed bootstrap/regtest on the trunk, when trying to
backport it to 11 branch the bootstrap failed with
atree.ads:3844:34: size for "Node_Record" too small
errors. Turns out the error is not about size being too small, but actually
about size being non-constant, and comes from:
/* In a FIELD_DECL of a RECORD_TYPE, this is a pointer to the storage
representative FIELD_DECL. */
#define DECL_BIT_FIELD_REPRESENTATIVE(NODE) \
(FIELD_DECL_CHECK (NODE)->field_decl.qualifier)
/* For a FIELD_DECL in a QUAL_UNION_TYPE, records the expression, which
if nonzero, indicates that the field occupies the type. */
#define DECL_QUALIFIER(NODE) (FIELD_DECL_CHECK (NODE)->field_decl.qualifier)
so by setting up DECL_BIT_FIELD_REPRESENTATIVE in QUAL_UNION_TYPE we
actually set or modify DECL_QUALIFIER and then construct size as COND_EXPRs
with those bit field representatives (e.g. with array type) as conditions
which doesn't fold into constant.
The following patch fixes it by not creating DECL_BIT_FIELD_REPRESENTATIVEs
for QUAL_UNION_TYPE as there is nowhere to store them,
Shall we change tree.h to document that DECL_BIT_FIELD_REPRESENTATIVE
is valid also on UNION_TYPE?
I see:
tree-ssa-alias.c- if (TREE_CODE (type1) == RECORD_TYPE
tree-ssa-alias.c: && DECL_BIT_FIELD_REPRESENTATIVE (field1))
tree-ssa-alias.c: field1 = DECL_BIT_FIELD_REPRESENTATIVE (field1);
tree-ssa-alias.c- if (TREE_CODE (type2) == RECORD_TYPE
tree-ssa-alias.c: && DECL_BIT_FIELD_REPRESENTATIVE (field2))
tree-ssa-alias.c: field2 = DECL_BIT_FIELD_REPRESENTATIVE (field2);
Shall we change that to || == UNION_TYPE or do we assume all fields
are overlapping in a UNION_TYPE already?
At other spots (asan, ubsan, expr.c) it is unclear what will happen
if they see a QUAL_UNION_TYPE with a DECL_QUALIFIER (or does the Ada FE
lower that somehow)?
Jakub Jelinek [Wed, 16 Jun 2021 10:17:55 +0000 (12:17 +0200)]
stor-layout: Create DECL_BIT_FIELD_REPRESENTATIVE even for bitfields in unions [PR101062]
The following testcase is miscompiled on x86_64-linux, the bitfield store
is implemented as a RMW 64-bit operation at d+24 when the d variable has
size of only 28 bytes and scheduling moves in between the R and W part
a store to a different variable that happens to be right after the d
variable.
The reason for this is that we weren't creating
DECL_BIT_FIELD_REPRESENTATIVEs for bitfields in unions.
The following patch does create them, but treats all such bitfields as if
they were in a structure where the particular bitfield is the only field.
2021-06-16 Jakub Jelinek <jakub@redhat.com>
PR middle-end/101062
* stor-layout.c (finish_bitfield_representative): For fields in unions
assume nextf is always NULL.
(finish_bitfield_layout): Compute bit field representatives also in
unions, but handle it as if each bitfield was the only field in the
aggregate.
Jakub Jelinek [Wed, 16 Jun 2021 11:10:48 +0000 (13:10 +0200)]
testsuite: Use noipa attribute instead of noinline, noclone
I've noticed this test now on various arches sometimes FAILs, sometimes
PASSes (the line 12 test in particular).
The problem is that a = 0; initialization in the caller no longer happens
before the f(&a) call as what the argument points to is only used in
debug info.
Making the function noipa forces the caller to initialize it and still
tests what the test wants to test, namely that we don't consider *p as
valid location for the c variable at line 18 (after it has been overwritten
with *p = 1;).
2021-06-16 Jakub Jelinek <jakub@redhat.com>
* gcc.dg/guality/pr49888.c (f): Use noipa attribute instead of
noinline, noclone.
Jakub Jelinek [Wed, 16 Jun 2021 08:45:27 +0000 (10:45 +0200)]
libffi: Fix up x86_64 classify_argument
As the following testcase shows, libffi didn't handle properly
classify_arguments of structures at byte offsets not divisible by
UNITS_PER_WORD. The following patch adjusts it to match what
config/i386/ classify_argument does for that and also ports the
PR38781 fix there (the second chunk).
* src/x86/ffi64.c (classify_argument): For FFI_TYPE_STRUCT set words
to number of words needed for type->size + byte_offset bytes rather
than just type->size bytes. Compute pos before the loop and check
total size of the structure.
* testsuite/libffi.call/nested_struct12.c: New test.
Jakub Jelinek [Tue, 15 Jun 2021 09:36:47 +0000 (11:36 +0200)]
expr: Fix up VEC_PACK_TRUNC_EXPR expansion [PR101046]
The following testcase ICEs, because we have a mode mismatch.
VEC_PACK_TRUNC_EXPR's operands have different modes from the result
(same vector mode size but twice as large element),
but we were passing non-NULL subtarget with the mode of the result
to the expansion of its arguments, so the VEC_PERM_EXPR in one of the
operands which had V8SImode operands and result had V16HImode target.
Fixed by clearing the subtarget if we are changing mode.
2021-06-15 Jakub Jelinek <jakub@redhat.com>
PR target/101046
* expr.c (expand_expr_real_2) <case VEC_PACK_FIX_TRUNC_EXPR,
case VEC_PACK_TRUNC_EXPR>: Clear subtarget when changing mode.
Jakub Jelinek [Fri, 11 Jun 2021 10:59:43 +0000 (12:59 +0200)]
simplify-rtx: Fix up simplify_logical_relational_operation for vector IOR [PR101008]
simplify_relational_operation callees typically return just const0_rtx
or const_true_rtx and then simplify_relational_operation attempts to fix
that up if the comparison result has vector mode, or floating mode,
or punt if it has scalar mode and vector mode operands (it doesn't know how
exactly to deal with the scalar masks).
But, simplify_logical_relational_operation has a special case, where
it attempts to fold (x < y) | (x >= y) etc. and if it determines it is
always true, it just returns const_true_rtx, without doing the dances that
simplify_relational_operation does.
That results in an ICE on the following testcase, where such folding happens
during expansion (of debug stmts into DEBUG_INSNs) and we ICE because
all of sudden a VOIDmode rtx appears where it expects a vector (V4SImode)
rtx.
The following patch fixes that by moving the adjustement into a separate
helper routine and using it from both simplify_relational_operation and
simplify_logical_relational_operation.
2021-06-11 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/101008
* simplify-rtx.c (relational_result): New function.
(simplify_logical_relational_operation,
simplify_relational_operation): Use it.
Jakub Jelinek [Mon, 7 Jun 2021 07:28:31 +0000 (09:28 +0200)]
fold-const: Fix up fold_read_from_vector [PR100887]
The callers of fold_read_from_vector expect that the index they pass is
an index of an element in the vector and the function does that most of the
time. But we allow CONSTRUCTORs with VECTOR_TYPE to have VECTOR_TYPE
elements and in that case every CONSTRUCTOR element represents not just one
index (with the exception of V1 vectors), but multiple.
So returning zero vector if i >= CONSTRUCTOR_NELTS or returning some
CONSTRUCTOR_ELT's value might not be what the callers expect.
Fixed by punting if the first element has vector type.
Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
In theory we could instead recurse (and assert that for CONSTRUCTORs of
vector elements we have always all elements specified like tree-cfg.c
verifies?) after adjusting the index appropriately.
2021-06-07 Jakub Jelinek <jakub@redhat.com>
PR target/100887
* fold-const.c (fold_read_from_vector): Return NULL if trying to
read from a CONSTRUCTOR with vector type elements.
Jakub Jelinek [Mon, 7 Jun 2021 07:25:37 +0000 (09:25 +0200)]
tree-inline: Fix up __builtin_va_arg_pack handling [PR100898]
The following testcase ICEs, because gimple_call_arg_ptr (..., 0)
asserts that there is at least one argument, while we were using
it even if we didn't copy anything just to get a pointer from/to which
the zero arguments should be copied.
Fixed by guarding the memcpy calls. Also, the code was calling
gimple_call_num_args too many times - 5 times instead of 2, so the patch
adds two temporaries for those.
2021-06-07 Jakub Jelinek <jakub@redhat.com>
PR middle-end/100898
* tree-inline.c (copy_bb): Only use gimple_call_arg_ptr if memcpy
should copy any arguments. Don't call gimple_call_num_args
on id->call_stmt or call_stmt more than once.
Jakub Jelinek [Fri, 4 Jun 2021 09:20:02 +0000 (11:20 +0200)]
x86: Fix ix86_expand_vector_init for V*TImode [PR100887]
We have vec_initv4tiv2ti and vec_initv2titi patterns which call
ix86_expand_vector_init and assume it works for those modes. For the
case of construction from two half-sized vectors, the code assumes it
will always succeed, but we have only insn patterns with SImode and DImode
element types. QImode and HImode element types are already handled
by performing it with same sized vectors with SImode elements and the
following patch extends that to V*TImode vectors.
2021-06-04 Jakub Jelinek <jakub@redhat.com>
PR target/100887
* config/i386/i386-expand.c (ix86_expand_vector_init): Handle
concatenation from half-sized modes with TImode elements.
Jakub Jelinek [Tue, 25 May 2021 15:24:38 +0000 (17:24 +0200)]
c++: Avoid -Wunused-value false positives on nullptr passed to ellipsis [PR100666]
When passing expressions with decltype(nullptr) type with side-effects to
ellipsis, we pass (void *)0 instead, but for the side-effects evaluate them
on the lhs of a COMPOUND_EXPR. Unfortunately that means we warn about it
if the expression is a call to nodiscard marked function, even when the
result is really used, just needs to be transformed.
Fixed by adding a warning_sentinel.
2021-05-25 Jakub Jelinek <jakub@redhat.com>
PR c++/100666
* call.c (convert_arg_to_ellipsis): For expressions with NULLPTR_TYPE
and side-effects, temporarily disable -Wunused-result warning when
building COMPOUND_EXPR.
* g++.dg/cpp1z/nodiscard8.C: New test.
* g++.dg/cpp1z/nodiscard9.C: New test.
Jakub Jelinek [Tue, 18 May 2021 08:10:17 +0000 (10:10 +0200)]
function: Set dummy DECL_ASSEMBLER_NAME in push_dummy_function [PR100580]
Last year I've added cgraph_node::get_create calls for the dummy
functions used for -fdump-passes, so that it interacts well with pass
disabling/enabling which is cgraph uid based.
Unfortunately, as the following testcase shows, when assembler hash
is present, that wants to compute DECL_ASSEMBLER_NAME and the C++ FE
is unprepared to handle it on the dummy functions which don't have
DECL_NAME etc.
The following patch fixes it by setting up a dummy DECL_ASSEMBLER_NAME
on these, so that the FEs don't need to compute it.
2021-05-18 Jakub Jelinek <jakub@redhat.com>
PR c++/100580
* function.c (push_dummy_function): Set DECL_ARTIFICIAL and
DECL_ASSEMBLER_NAME on the fn_decl.
Jakub Jelinek [Sat, 15 May 2021 08:12:11 +0000 (10:12 +0200)]
regcprop: Fix another cprop_hardreg bug [PR100342]
On Tue, Jan 19, 2021 at 04:10:33PM +0000, Richard Sandiford via Gcc-patches wrote:
> Ah, ok, thanks for the extra context.
>
> So AIUI the problem when recording xmm2<-di isn't just:
>
> [A] partial_subreg_p (vd->e[sr].mode, GET_MODE (src))
>
> but also that:
>
> [B] partial_subreg_p (vd->e[sr].mode, vd->e[vd->e[sr].oldest_regno].mode)
>
> For example, all registers in this sequence can be part of the same chain:
>
> (set (reg:HI R1) (reg:HI R0))
> (set (reg:SI R2) (reg:SI R1)) // [A]
> (set (reg:DI R3) (reg:DI R2)) // [A]
> (set (reg:SI R4) (reg:SI R[0-3]))
> (set (reg:HI R5) (reg:HI R[0-4]))
>
> But:
>
> (set (reg:SI R1) (reg:SI R0))
> (set (reg:HI R2) (reg:HI R1))
> (set (reg:SI R3) (reg:SI R2)) // [A] && [B]
>
> is problematic because it dips below the precision of the oldest regno
> and then increases again.
>
> When this happens, I guess we have two choices:
>
> (1) what the patch does: treat R3 as the start of a new chain.
> (2) pretend that the copy occured in vd->e[sr].mode instead
> (i.e. copy vd->e[sr].mode to vd->e[dr].mode)
>
> I guess (2) would need to be subject to REG_CAN_CHANGE_MODE_P.
> Maybe the optimisation provided by (2) compared to (1) isn't common
> enough to be worth the complication.
>
> I think we should test [B] as well as [A] though. The pass is set
> up to do some quite elaborate mode changes and I think rejecting
> [A] on its own would make some of the other code redundant.
> It also feels like it should be a seperate “if” or “else if”,
> with its own comment.
Unfortunately, we now have a testcase that shows that testing also [B]
is a problem (unfortunately now latent on the trunk, only reproduces
on 10 and 11 branches).
The comment in the patch tries to list just the interesting instructions,
we have a 64-bit value, copy low 8 bit of those to another register,
copy full 64 bits to another register and then clobber the original register.
Before that (set (reg:DI r14) (const_int ...)) we have a chain
DI r14, QI si, DI bp , that instruction drops the DI r14 from that chain, so
we have QI si, DI bp , si being the oldest_regno.
Next DI si is copied into DI dx. Only the low 8 bits of that are defined,
the rest is unspecified, but we would add DI dx into that same chain at the
end, so QI si, DI bp, DI dx [*]. Next si is overwritten, so the chain is
DI bp, DI dx. And then we see (set (reg:DI dx) (reg:DI bp)) and remove it
as redundant, because we think bp and dx are already equivalent, when in
reality that is true only for the lowpart 8 bits.
I believe the [*] marked step above is where the bug is.
The committed regcprop.c (copy_value) change (but only committed to
trunk/11, not to 10) added
else if (partial_subreg_p (vd->e[sr].mode, GET_MODE (src))
&& partial_subreg_p (vd->e[sr].mode,
vd->e[vd->e[sr].oldest_regno].mode))
return;
and while the first partial_subreg_p call returns true, the second one
doesn't; before the (set (reg:DI r14) (const_int ...)) insn it would be
true and we'd return, but as that reg got clobbered, si became the oldest
regno in the chain and so vd->e[vd->e[sr].oldest_regno].mode is QImode
and vd->e[sr].mode is QImode too, so the second partial_subreg_p is false.
But as the testcase shows, what is the oldest_regno in the chain is
something that changes over time, so relying on it for anything is
problematic, something could have a different oldest_regno and later
on get a different oldest_regno (perhaps with different mode) because
the oldest_regno got overwritten and it can change both ways.
The following patch effectively implements your (2) above.
2021-05-15 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/100342
* regcprop.c (copy_value): When copying a source reg in a wider
mode than it has recorded for the value, adjust recorded destination
mode too or punt if !REG_CAN_CHANGE_MODE_P.
liuhongt [Mon, 18 Jan 2021 08:55:32 +0000 (16:55 +0800)]
Fix incorrect optimization by cprop_hardreg.
If SRC had been assigned a mode narrower than the copy, we can't
always link DEST into the chain even they have same
hard_regno_nregs(i.e. HImode/SImode in i386 backend).
PR rtl-optimization/98694
* regcprop.c (copy_value): If SRC had been assigned a mode
narrower than the copy, we can't link DEST into the chain even
they have same hard_regno_nregs(i.e. HImode/SImode in i386
backend).
gcc/testsuite/ChangeLog:
PR rtl-optimization/98694
* gcc.target/i386/pr98694.c: New test.
Jakub Jelinek [Wed, 12 May 2021 08:38:35 +0000 (10:38 +0200)]
expand: Don't reuse DEBUG_EXPRs with vector type if they have different modes [PR100508]
The inliner doesn't remap DEBUG_EXPR_DECLs, so the same decls can appear
in multiple functions.
Furthermore, expansion reuses corresponding DEBUG_EXPRs too, so they again
can be reused in multiple functions.
Neither of that is a major problem, DEBUG_EXPRs are just magic value holders
and what value they stand for is independent in each function and driven by
what debug stmts or DEBUG_INSNs they are bound to.
Except for DEBUG_EXPR*s with vector types, TYPE_MODE can be either BLKmode
or some vector mode depending on whether current function's enabled ISAs
support that vector mode or not. On the following testcase, we expand it
first in foo function without AVX2 enabled and so the DEBUG_EXPR is
BLKmode, but later the same DEBUG_EXPR_DECL is used in a simd clone with
AVX2 enabled and expansion ICEs because of a mode mismatch.
The following patch fixes that by forcing recreation of a DEBUG_EXPR if
there is a mode mismatch for vector typed DEBUG_EXPR_DECL, DEBUG_EXPRs
will be still reused in between functions otherwise and within the same
function the mode should be always the same.
2021-05-12 Jakub Jelinek <jakub@redhat.com>
PR middle-end/100508
* cfgexpand.c (expand_debug_expr): For DEBUG_EXPR_DECL with vector
type, don't reuse DECL_RTL if it has different mode, instead force
creation of a new DEBUG_EXPR.
Jakub Jelinek [Tue, 11 May 2021 07:07:47 +0000 (09:07 +0200)]
openmp: Fix up taskloop reduction ICE if taskloop has no iterations [PR100471]
When a taskloop doesn't have any iterations, GOMP_taskloop* takes an early
return, doesn't create any tasks and more importantly, doesn't create
a taskgroup and doesn't register task reductions. But, the code emitted
in the callers assumes task reductions have been registered and performs
the reduction handling and task reduction unregistration. The pointer
to the task reduction private variables is reused, on input it is the alignment
and only on output it is the pointer, so in the case taskloop with no iterations
the caller attempts to dereference the alignment value as if it was a pointer
and crashes. We could in the early returns register the task reductions
only to have them looped over and unregistered in the caller, but I think
it is better to tell the caller there is nothing to task reduce and bypass
all that.
2021-05-11 Jakub Jelinek <jakub@redhat.com>
PR middle-end/100471
* omp-low.c (lower_omp_task_reductions): For OMP_TASKLOOP, if data
is 0, bypass the reduction loop including
GOMP_taskgroup_reduction_unregister call.
* taskloop.c (GOMP_taskloop): If GOMP_TASK_FLAG_REDUCTION and not
GOMP_TASK_FLAG_NOGROUP, when doing early return clear the task
reduction pointer.
* testsuite/libgomp.c/task-reduction-4.c: New test.
Eric Botcazou [Tue, 10 May 2022 07:33:16 +0000 (09:33 +0200)]
Fix internal error with vectorization on SPARC
This is a regression present since the 10.x series, but the underlying issue
has been there since the TARGET_VEC_PERM_CONST hook was implemented, in the
form of an ICE when expanding a constant VEC_PERM_EXPR in V4QI, while the
back-end only supports V8QI constant VEC_PERM_EXPRs.
gcc/
PR target/105292
* config/sparc/sparc.c (sparc_vectorize_vec_perm_const): Return
true only for 8-byte vector modes.
gcc/testsuite/
* gcc.target/sparc/20220510-1.c: New test.
Patrick Palka [Tue, 26 Apr 2022 01:49:00 +0000 (21:49 -0400)]
c++: ICE with requires-expr and -Wsequence-point [PR105304]
Here we're crashing from verify_sequence_points for this requires-expr
condition because it contains a templated CAST_EXPR with empty operand,
and verify_tree doesn't ignore this empty operand only because the
manual tail recursion that it performs for unary expression trees skips
the NULL test.
PR c++/105304
gcc/c-family/ChangeLog:
* c-common.c (verify_tree) [restart]: Move up to before the
NULL test.
Patrick Palka [Sat, 26 Mar 2022 14:20:16 +0000 (10:20 -0400)]
c++: ICE when building builtin operator->* set [PR103455]
Here when constructing the builtin operator->* candidate set according
to the available conversion functions for the operand types, we end up
considering a candidate with C1=T (through B's dependent conversion
function) and C2=F, during which we crash from DERIVED_FROM_P because
dependent_type_p sees a TEMPLATE_TYPE_PARM outside of a template
context.
Sidestepping the question of whether we should be considering such a
dependent conversion function here in the first place, it seems futile
to test DERIVED_FROM_P for anything other than an actual class type, so
this patch fixes this ICE by simply guarding the DERIVED_FROM_P test
with CLASS_TYPE_P instead of MAYBE_CLASS_TYPE_P.
PR c++/103455
gcc/cp/ChangeLog:
* call.c (add_builtin_candidate) <case MEMBER_REF>: Test
CLASS_TYPE_P instead of MAYBE_CLASS_TYPE_P.
Patrick Palka [Thu, 17 Feb 2022 13:35:23 +0000 (08:35 -0500)]
c++: double non-dep folding from finish_compound_literal [PR104565]
In finish_compound_literal, we perform non-dependent expr folding before
the call to check_narrowing ever since r9-5973. But ever since r10-7096,
check_narrowing also performs non-dependent expr folding of its own.
This double folding means tsubst will see non-templated trees during the
second folding, which causes a spurious error in the below testcase.
This patch removes the former folding operation; it seems obviated by
the latter one.
Patrick Palka [Tue, 25 Jan 2022 20:04:49 +0000 (15:04 -0500)]
c++: deleted fn and noexcept inst [PR101532, PR104225]
Here when attempting to use B's implicitly deleted default constructor,
mark_used rightfully returns false, but for the wrong reason: it
tries to instantiate the synthesized noexcept specifier which then only
silently fails because get_defaulted_eh_spec suppresses diagnostics
for deleted functions. This lack of diagnostics causes us to crash on
the first testcase below (thanks to the assert in finish_expr_stmt), and
silently accept the second testcase.
To fix this, this patch makes mark_used avoid attempting to instantiate
the noexcept specifier of a deleted function, so that we'll instead
directly reject (and diagnose) the function due to its deletedness.
PR c++/101532
PR c++/104225
gcc/cp/ChangeLog:
* decl2.c (mark_used): Don't consider maybe_instantiate_noexcept
on a deleted function.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/nsdmi-template21.C: New test.
* g++.dg/cpp0x/nsdmi-template21a.C: New test.
Jonathan Wakely [Wed, 13 Apr 2022 13:09:33 +0000 (14:09 +0100)]
libstdc++: Use LTLIBICONV when linking libstdc++.so [PR93602]
This fixes missing libiconv symbols when libstdc++ is built on a system
that has libiconv installed. If the libiconv headers are found then
libstdc++ depends on libiconv_open etc instead of libc's iconv_open. But
without this fix libstdc++ is not linked to the libiconv library that
provides the definitions of those symbols.
As discussed in PR 93602 this changed means that libstdc++.so.6 might
have an rpath pointing to the location of the libiconv.so library. If
that is not desired, then GCC must be configured to link to a static
libiconv.a instead, using either --with-libiconv-type=static or an
in-tree build of libiconv.
Jonathan Wakely [Fri, 6 May 2022 20:19:17 +0000 (21:19 +0100)]
libstdc++: Fix deserialization for std::normal_distribution [PR105502]
This fixes a regression in std::normal_distribution deserialization that
caused the object to be left unchanged if the __state_avail value read
from the stream was false.
libstdc++-v3/ChangeLog:
PR libstdc++/105502
* include/bits/random.tcc
(operator>>(basic_istream<C,T>&, normal_distribution<R>&)):
Update state when __state_avail is false.
* testsuite/26_numerics/random/normal_distribution/operators/serialize.cc:
Check that deserialized object equals serialized one.
Richard Biener [Fri, 8 Apr 2022 11:13:29 +0000 (13:13 +0200)]
tree-optimization/105198 - wrong code with predictive commoning
When predictive commoning looks for a looparound PHI it tries
to match the entry value definition (a load) up with the appropriate
member of the chain. But it fails to consider stmts clobbering
the very same memory location inbetween the load and loop entry.
In theory we could be more clever on must aliases that would be
also picked up from a load (so not exactly stmt_kills_ref_p) and
use the stored value from that if it is an exact match. But we
currently have no way to propagate this information inside predcom.
2022-04-08 Richard Biener <rguenther@suse.de>
PR tree-optimization/105198
* tree-predcom.c (find_looparound_phi): Check whether
the found memory location of the entry value is clobbered
inbetween the value we want to use and loop entry.
Richard Biener [Mon, 7 Feb 2022 08:31:07 +0000 (09:31 +0100)]
middle-end/104402 - split out _Complex compares from COND_EXPRs
This makes sure we always have a _Complex compare split to a
different stmt for the compare operand in a COND_EXPR on GIMPLE.
Complex lowering doesn't handle this and the change is something
we want for all kind of compares at some point.
2022-02-07 Richard Biener <rguenther@suse.de>
PR middle-end/104402
* gimple-expr.c (is_gimple_condexpr): _Complex typed
compares are not valid.
* tree-cfg.c (verify_gimple_assign_ternary): For COND_EXPR
check is_gimple_condexpr.
libphobos: Don't call free on the TLS array in the emutls destroy function.
Fixes a segfault seen on Darwin when a GC scan is ran after a thread has
been destroyed. As the global emutlsArrays hash still has a reference
to the array itself, and tries to iterate all elements.
Setting the length to zero frees all allocated elements in the array,
and ensures that it is skipped when the _d_emutls_scan is called.
Jonathan Wakely [Thu, 9 Dec 2021 22:22:42 +0000 (22:22 +0000)]
libstdc++: Fix ambiguous comparisons for iterators in C++20
Since r11-1571 (c++: Refinements to "more constrained") was changed in
the front end, the following comment from stl_iterator.h stopped being
true:
// These extra overloads are not needed in C++20, because the ones above
// are constrained with a requires-clause and so overload resolution will
// prefer them to greedy unconstrained function templates.
The requires-clause is no longer considered when comparing unrelated
function templates. That means that the constrained operator== specified
in the standard is no longer more constrained than the pathological
comparison operators defined in the testsuite_greedy_ops.h header. This
was causing several tests to FAIL in C++20 mode:
FAIL: 23_containers/deque/types/1.cc (test for excess errors)
FAIL: 23_containers/vector/types/1.cc (test for excess errors)
FAIL: 24_iterators/move_iterator/greedy_ops.cc (test for excess errors)
FAIL: 24_iterators/normal_iterator/greedy_ops.cc (test for excess errors)
FAIL: 24_iterators/reverse_iterator/greedy_ops.cc (test for excess errors)
The solution is to restore some of the non-standard comparison operators
that are more specialized than the greedy operators in the testsuite.
libstdc++-v3/ChangeLog:
* include/bits/stl_iterator.h (operator==, operator<=>): Define
overloads for homogeneous specializations of reverse_iterator,
__normal_iterator and move_iterator.
Jonathan Wakely [Tue, 20 Apr 2021 15:16:13 +0000 (16:16 +0100)]
libstdc++: Do not allocate a zero-size vector<bool> [PR 100153]
The vector<bool>::shrink_to_fit() implementation will allocate new
storage even if the vector is empty. That then leads to the
end-of-storage pointer being non-null and equal to the _M_start._M_p
pointer, which means that _M_end_addr() has undefined behaviour.
The fix is to stop doing a useless zero-sized allocation in
shrink_to_fit(), so that _M_start._M_p and _M_end_of_storage are both
null after an empty vector shrinks.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
PR libstdc++/100153
* include/bits/vector.tcc (vector<bool>::_M_shrink_to_fit()):
When size() is zero just deallocate and reset.
Jonathan Wakely [Mon, 1 Nov 2021 11:32:39 +0000 (11:32 +0000)]
libstdc++: Reorder constraints on std::span::span(Range&&) constructor.
In PR libstdc++/103013 Tim Song pointed out that we could reorder the
constraints of this constructor. That's worth doing just to reduce the
work the compiler has to do during overload resolution, even if it isn't
needed to make the code in the PR work.
Jonathan Wakely [Tue, 9 Nov 2021 10:31:18 +0000 (10:31 +0000)]
libstdc++: Make spurious std::random_device FAIL less likely
It's possible that independent reads from /dev/random and /dev/urandom
could produce the same value by chance. Retry if that happens. The
chances of it happening twice are miniscule.
libstdc++-v3/ChangeLog:
* testsuite/26_numerics/random/random_device/cons/token.cc:
Retry if random devices produce the same value.
Jonathan Wakely [Thu, 21 Jul 2016 19:38:51 +0000 (20:38 +0100)]
libstdc++: Fix out-of-bound array accesses in testsuite
I fixed some undefined behaviour in string tests in r238609, but I only
fixed the narrow char versions. This applies the same fixes to the
wchar_t ones. These problems were found when testing a patch to make
std::basic_string usable in constexpr.
libstdc++-v3/ChangeLog:
* testsuite/21_strings/basic_string/modifiers/append/wchar_t/1.cc:
Fix reads past the end of strings.
* testsuite/21_strings/basic_string/operations/compare/wchar_t/1.cc:
Likewise.
* testsuite/experimental/string_view/operations/compare/wchar_t/1.cc:
Likewise.
Jonathan Wakely [Mon, 12 Apr 2021 10:12:47 +0000 (11:12 +0100)]
libstdc++: Fix test for libstdc++ not including <unistd.h> [PR100117]
The <cxxx> headers for the C library are not under our control, so we
can't prevent them from including <unistd.h>. Change the PR 49745 test
to only include the C++ library headers, not the <cxxx> ones.
To ensure <bits/stdc++.h> isn't included automatically we need to use
no_pch to disable PCH.
libstdc++-v3/ChangeLog:
PR libstdc++/100117
* testsuite/17_intro/headers/c++1998/49745.cc: Explicitly list
all C++ headers instead of including <bits/stdc++.h>
This restores support for std::make_exception_ptr<E&> and for using
std::exception_ptr in C++98.
Because the new non-throwing implementation needs to use std::decay to
handle references the original throwing implementation is used for
C++98.
We also need to change the typeid expression so it doesn't yield the
dynamic type when the function parameter is a reference to a polymorphic
type. Otherwise the new exception object could be caught by any handler
matching the dynamic type, even though the actual exception object is
only a copy of the base class, sliced to the static type.
libstdc++-v3/ChangeLog:
PR libstdc++/103630
* libsupc++/exception_ptr.h (make_exception_ptr): Decay the
template parameter. Use typeid of the static type.
* testsuite/18_support/exception_ptr/103630.cc: New test.
Jonathan Wakely [Fri, 4 Feb 2022 15:23:31 +0000 (15:23 +0000)]
libstdc++: Remove un-implementable noexcept from Filesystem TS operations
LWG 3014 removed these incorrect noexcept specifications from the C++17
std::filesystem operations. They are also incorrect on the experimental
TS versions and should be removed from them too.
The path::begin() fix should have been part of r12-3930-gf2b7f56a15d9cb.
Thanks to Timm Bäder for reporting this one.
libstdc++-v3/ChangeLog:
* include/experimental/bits/fs_fwd.h (copy_file): Remove
incorrect noexcept from declaration.
* include/experimental/bits/fs_path.h (path::begin, path::end):
Add noexcept to declarations, to match definitions.
Jonathan Wakely [Thu, 3 Mar 2022 22:20:32 +0000 (22:20 +0000)]
libstdc++: Fix test failure on AIX
This fixes a test failure due to a non-reserved name in an AIX system
header (included via <pthread.h>). That name clashes with one of the
names we check our own headers for, so skip checking that name on AIX.
libstdc++-v3/ChangeLog:
* testsuite/17_intro/names.cc (func): Undef on AIX.
Jonathan Wakely [Wed, 26 Jan 2022 16:08:51 +0000 (16:08 +0000)]
libstdc++: Avoid overflow in ranges::advance(i, n, bound)
When (bound - i) or n is the most negative value of its type, the
negative of the value will overflow. Instead of abs(n) >= abs(bound - i)
use n >= (bound - i) when positive and n <= (bound - i) when negative.
The function has a precondition that they must have the same sign, so
this works correctly. The precondition check can be moved into the else
branch, and simplified.
The standard requires calling ranges::advance(i, bound) even if i==bound
is already true, which is technically observable, but that's pointless.
We can just return n in that case. Similarly, for i!=bound but n==0 we
are supposed to call ranges::advance(i, n), but that's pointless. An LWG
issue to allow omitting the pointless calls is expected to be filed.
libstdc++-v3/ChangeLog:
* include/bits/range_access.h (ranges::advance): Avoid signed
overflow. Do nothing if already equal to desired result.
* testsuite/24_iterators/range_operations/advance_overflow.cc:
New test.
Jonathan Wakely [Wed, 5 Jan 2022 16:25:47 +0000 (16:25 +0000)]
libstdc++: Do not use std::isdigit in <charconv> [PR103911]
This avoids a potential race condition if std::setlocale is used
concurrently with std::from_chars.
libstdc++-v3/ChangeLog:
PR libstdc++/103911
* include/std/charconv (__from_chars_alpha_to_num): Return
char instead of unsigned char. Change invalid return value to
127 instead of using numeric trait.
(__from_chars_alnum): Fix comment. Do not use std::isdigit.
Change type of variable to char.
PR104228 showed that character lengths were shared between associate
variable and associate targets. This is problematic when the associate
target is itself a variable and gets a variable to hold the length, as
the length variable is added (and all the variables following it in the chain)
to both the associate variable scope and the target variable scope.
This caused an ICE when compiling with -O0 -fsanitize=address.
This change forces the creation of a separate character length for the
associate variable. It also forces the initialization of the character
length variable to avoid regressing associate_32 and associate_47 tests.
--
fortran: Separate associate character lengths earlier [PR104570]
This change workarounds an ICE in the evaluation of the character length
of an array expression referencing an associate variable; the code is
not prepared to see a non-scalar expression as it doesn’t initialize the
scalarizer.
Before this change, associate length symbols get a new gfc_charlen at
resolution stage to unshare them from the associate expression, so that
at translation stage it is a decl specific to the associate symbol that
is initialized, not the decl of some other symbol. This
reinitialization of gfc_charlen happens after expressions referencing
the associate symbol have been parsed, so that those expressions retain
the original gfc_charlen they have copied from the symbol.
At translation stage, the gfc_charlen for the associate symbol is setup
with the decl holding the actual length value, but the expressions have
retained the original gfc_charlen without any decl. So they need to
evaluate the character length, and this is where the ICE happens.
This change moves the reinitialization of gfc_charlen earlier at parsing
stage, so that at resolution stage the gfc_charlen can be retained as
it’s already not shared with any other symbol, and the expressions which
now share their gfc_charlen with the symbol are automatically updated
when the length decl is setup at translation stage. There is no need
any more to evaluate the character length as it has all the required
information, and the ICE doesn’t happen.
The first resolve.c hunk is necessary to avoid regressing on the
associate_35.f90 testcase.
--
PR fortran/104228
PR fortran/104570
gcc/fortran/ChangeLog:
* parse.c (parse_associate): Use a new distinct gfc_charlen if
the copied type has one whose length is not known to be
constant.
* resolve.c (resolve_assoc_var): Also create a new character
length for non-dummy associate targets. Reset charlen if it’s
shared with the associate target regardless of the expression
type. Don’t reinitialize charlen if it’s deferred.
* trans-stmt.c (trans_associate_var): Initialize character
length even if no temporary is used for the associate variable.
gcc/testsuite/ChangeLog:
* gfortran.dg/asan_associate_58.f90: New test.
* gfortran.dg/asan_associate_59.f90: New test.
* gfortran.dg/associate_58.f90: New test.