Jakub Jelinek [Fri, 15 Oct 2021 14:25:25 +0000 (16:25 +0200)]
openmp: Fix up handling of OMP_PLACES=threads(1)
When writing the places-*.c tests, I've noticed that we mishandle the threads
abstract name with a specified num-places if num-places isn't a multiple of
the number of hw threads in a core. The code then happily ignores the maximum
count and, for the remaining hw threads in a core, overwrites further places
that haven't been allocated.
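A minimal sketch of the intended behaviour, assuming a simplified model in
which every place holds exactly one hw thread. The function name, its
parameters and the flat numbering of hw threads are hypothetical and are not
taken from libgomp; the point is only that creation stops as soon as count
places exist, even in the middle of a core.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model, not libgomp code: create one place per hw thread,
   walking the cores in order, and stop as soon as COUNT places exist.  */
size_t
build_thread_places (size_t cores, size_t threads_per_core, size_t count,
                     int *places, size_t capacity)
{
  size_t made = 0;

  assert (count <= capacity);
  for (size_t c = 0; c < cores; c++)
    for (size_t t = 0; t < threads_per_core; t++)
      {
        /* Return immediately once COUNT places were created, so the
           remaining hw threads of this core cannot write past them.  */
        if (made == count)
          return made;
        places[made++] = (int) (c * threads_per_core + t);
      }
  return made;
}
```

With 2 cores of 2 hw threads each and count 1, only places[0] is written.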
2021-10-15 Jakub Jelinek <jakub@redhat.com>
* config/linux/affinity.c (gomp_affinity_init_level_1): For level 1
after creating count places clean up and return immediately.
* testsuite/libgomp.c/places-6.c: New test.
* testsuite/libgomp.c/places-7.c: New test.
* testsuite/libgomp.c/places-8.c: New test.
Jakub Jelinek [Tue, 5 Oct 2021 20:28:38 +0000 (22:28 +0200)]
c++: Fix apply_identity_attributes [PR102548]
The following testcase ICEs on x86_64-linux with -m32 due to a bug in
apply_identity_attributes. The function is being smart and attempts not
to duplicate the chain unnecessarily: if either there are no attributes
that affect type identity, or there is a possibly empty set of attributes
that do not affect type identity in the chain followed by attributes
that do affect type identity, it reuses that attribute chain.
The function mishandles the cases where in the chain an attribute affects
type identity and is followed by one or more attributes that don't
affect type identity (and then perhaps some further ones that do).
There are two bugs. One is that when we notice the first attribute that
doesn't affect type identity after the first attribute that does affect type
identity (with perhaps some further such attributes in the chain after it),
we want to put into the new chain just the attributes starting from
(inclusive) first_ident and up to (exclusive) the current attribute a,
but the code puts into the chain all attributes starting with first_ident,
including the ones that do not affect type identity. If e.g. we have a
doesn't0 affects1 doesn't2 affects3 affects4 sequence of attributes, the
resulting sequence would have
affects1 doesn't2 affects3 affects4 affects3 affects4
attributes, i.e. one attribute that shouldn't be there and two attributes
duplicated. That is fixed by the a2 -> a2 != a change.
The second one is that we ICE once we see a second attribute that doesn't
affect type identity after an attribute that affects it. That is because
first_ident is set to error_mark_node after handling the first attribute
that doesn't affect type identity (i.e. after we've copied the
[first_ident, a) set of attributes to the new chain) to denote that from
that time on, each attribute that affects type identity should be copied
whenever it is seen (the if (as && as->affects_type_identity) code does
that correctly). But when that condition is false and first_ident is
error_mark_node, we enter else if (first_ident) and use TREE_PURPOSE
/TREE_VALUE/TREE_CHAIN on error_mark_node, which ICEs. When
first_ident is error_mark_node and a doesn't affect type identity,
we want to do nothing. So that is the && first_ident != error_mark_node
chunk.
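The expected result for the doesn't0 affects1 doesn't2 affects3 affects4
sequence can be sketched as below. This models only the outcome (keep exactly
the attributes that affect type identity, each once, in order); it
deliberately ignores the chain-reuse optimization of the real function, and
struct attr and keep_identity_attrs are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical stand-in for an attribute list entry.  */
struct attr
{
  const char *name;
  int affects_type_identity;
};

/* Store into OUT pointers to the entries of IN that affect type identity,
   preserving their order, and return how many were kept.  OUT must have
   room for N pointers.  */
size_t
keep_identity_attrs (const struct attr *in, size_t n,
                     const struct attr **out)
{
  size_t kept = 0;

  for (size_t i = 0; i < n; i++)
    if (in[i].affects_type_identity)
      out[kept++] = &in[i];
  return kept;
}
```

For the five attribute sequence above this keeps affects1, affects3 and
affects4, with nothing duplicated.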
2021-10-05 Jakub Jelinek <jakub@redhat.com>
PR c++/102548
* tree.c (apply_identity_attributes): Fix handling of the
case where an attribute in the list doesn't affect type
identity but some attribute before it does.
Jakub Jelinek [Fri, 1 Oct 2021 12:27:32 +0000 (14:27 +0200)]
ubsan: Use -f{,no-}sanitize-recover=float-divide-by-zero for float division by zero recovery [PR102515]
We've been using
-f{,no-}sanitize-recover=integer-divide-by-zero to decide on the float
-fsanitize=float-divide-by-zero instrumentation _abort suffix.
This patch fixes it to use -f{,no-}sanitize-recover=float-divide-by-zero
for it instead.
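A small function that exercises this path, as a sketch: when built with
-fsanitize=float-divide-by-zero, the division below is instrumented, and
adding -fno-sanitize-recover=float-divide-by-zero is expected to make the
first report fatal. The function name is hypothetical, and the claim about
the result being +infinity assumes an IEEE 754 target and a build without
the sanitizer aborting.

```c
#include <float.h>

/* Divide A by B in double precision.  With B == 0.0 and A > 0.0 an
   IEEE 754 target yields +infinity; with -fsanitize=float-divide-by-zero
   the division is additionally reported at runtime.  */
double
divide (double a, double b)
{
  return a / b;
}
```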
2021-10-01 Jakub Jelinek <jakub@redhat.com>
Richard Biener <rguenther@suse.de>
PR sanitizer/102515
gcc/c-family/
* c-ubsan.c (ubsan_instrument_division): Check the right
flag_sanitize_recover bit, depending on which sanitization
is done.
gcc/testsuite/
* c-c++-common/ubsan/float-div-by-zero-2.c: New test.
Jakub Jelinek [Tue, 28 Sep 2021 11:02:51 +0000 (13:02 +0200)]
i386: Don't emit fldpi etc. if -frounding-math [PR102498]
i387 has instructions to store some transcendental numbers into the top of
stack. The problem is that the exact bit in the last place one gets for
those depends on the current rounding mode, as the CPU knows the number with
slightly higher precision. The compiler assumes rounding to nearest when
comparing them against constants in the IL, but at runtime the rounding
mode can be different, and so, depending on the rounding mode and the
constant, some of these could be 1 ulp higher or smaller than expected.
We only support changing the rounding mode at runtime if the non-default
-frounding-math option is used, so the following patch just disables
using those constants if that flag is on.
2021-09-28 Jakub Jelinek <jakub@redhat.com>
PR target/102498
* config/i386/i386.c (standard_80387_constant_p): Don't recognize
special 80387 instruction XFmode constants if flag_rounding_math.
Jakub Jelinek [Wed, 15 Sep 2021 20:21:17 +0000 (22:21 +0200)]
c++: Fix handling of decls with flexible array members initialized with side-effects [PR88578]
> > Note, if the flexible array member is initialized only with non-constant
> > initializers, we have a worse bug that this patch doesn't solve, the
> > splitting of initializers into constant and dynamic initialization removes
> > the initializer and we don't have just wrong DECL_*SIZE, but nothing is
> > emitted when emitting those vars into assembly either and so the dynamic
> > initialization clobbers other vars that may overlap the variable.
> > I think we need keep an empty CONSTRUCTOR elt in DECL_INITIAL for the
> > flexible array member in that case.
>
> Makes sense.
So, the following patch fixes that.
The typeck2.c change makes sure we keep those CONSTRUCTORs around (although
they should be empty, because all their elts had side-effects or were
non-constant if they were removed earlier), and the varasm.c change is to
avoid ICEs on those as well as ICEs on other flex array members that had some
initializers without side-effects, but not on the last array element.
The code was already asserting that the (index of the last elt in the
CONSTRUCTOR + 1) times elt size is equal to TYPE_SIZE_UNIT of the local->val
type, which is true for C flex arrays or for C++ if they don't have any
side-effects or the last elt doesn't have side-effects. This patch changes
that to an assertion that the TYPE_SIZE_UNIT is greater than or equal to the
offset of the end of the last element in the CONSTRUCTOR and uses
TYPE_SIZE_UNIT (int_size_in_bytes) in the code later on.
2021-09-15 Jakub Jelinek <jakub@redhat.com>
PR c++/88578
PR c++/102295
gcc/
* varasm.c (output_constructor_regular_field): Instead of assertion
that array_size_for_constructor result is equal to size of
TREE_TYPE (local->val) in bytes, assert that the type size is greater
or equal to array_size_for_constructor result and use type size as
fieldsize.
gcc/cp/
* typeck2.c (split_nonconstant_init_1): Don't throw away empty
initializers of flexible array members if they have non-zero type
size.
gcc/testsuite/
* g++.dg/ext/flexary39.C: New test.
* g++.dg/ext/flexary40.C: New test.
Jakub Jelinek [Tue, 14 Sep 2021 14:56:30 +0000 (16:56 +0200)]
c++: Update DECL_*SIZE for objects with flexible array members with initializers [PR102295]
The C FE has updated DECL_*SIZE for vars which have initializers for flexible
array members for many years, but the C++ FE kept DECL_*SIZE the same as the
type size (i.e. as if there were zero elements in the flexible array
member). This results e.g. in ELF symbol sizes being too small.
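To illustrate the difference between the type size and the size the object
really needs, here is a sketch using the GNU extension that allows static
initialization of a flexible array member. struct packet and the two helper
functions are hypothetical; the second helper approximates the object size as
the type size plus the size of the initialized elements.

```c
#include <stddef.h>

struct packet
{
  int len;
  int data[];           /* flexible array member */
};

/* GNU extension: a static object whose flexible array member has an
   initializer with three elements.  */
static struct packet pkt = { 3, { 10, 20, 30 } };

/* Size of the type alone, i.e. as if data had zero elements.  */
size_t
packet_type_size (void)
{
  return sizeof (struct packet);
}

/* Approximate size the object needs: the type size plus the elements
   supplied by the initializer.  */
size_t
packet_object_size (void)
{
  return sizeof (struct packet) + (size_t) pkt.len * sizeof pkt.data[0];
}
```

A symbol size that reflects only packet_type_size () would not cover the
three initialized elements.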
Note, if the flexible array member is initialized only with non-constant
initializers, we have a worse bug that this patch doesn't solve, the
splitting of initializers into constant and dynamic initialization removes
the initializer and we don't have just wrong DECL_*SIZE, but nothing is
emitted when emitting those vars into assembly either and so the dynamic
initialization clobbers other vars that may overlap the variable.
I think we need to keep an empty CONSTRUCTOR elt in DECL_INITIAL for the
flexible array member in that case.
2021-09-14 Jakub Jelinek <jakub@redhat.com>
PR c++/102295
* decl.c (layout_var_decl): For aggregates ending with a flexible
array member, add the size of the initializer for that member to
DECL_SIZE and DECL_SIZE_UNIT.
Jakub Jelinek [Tue, 14 Sep 2021 14:55:04 +0000 (16:55 +0200)]
c++: Fix __is_*constructible/assignable for templates [PR102305]
is_xible_helper returns error_mark_node (i.e. false from the traits)
for abstract classes by testing ABSTRACT_CLASS_TYPE_P (to) early.
Unfortunately, as the testcase shows, that doesn't work on class templates
that haven't been instantiated yet: ABSTRACT_CLASS_TYPE_P is false for such
a class until it is instantiated, which is done when the routine later
constructs a dummy object with that type.
The following patch fixes this by calling complete_type first, so that
ABSTRACT_CLASS_TYPE_P test will work properly, while keeping the handling
of arrays with unknown bounds, or incomplete types where it is done
currently.
2021-09-14 Jakub Jelinek <jakub@redhat.com>
PR c++/102305
* method.c (is_xible_helper): Call complete_type on to.
Jakub Jelinek [Wed, 8 Sep 2021 09:25:31 +0000 (11:25 +0200)]
i386: Fix up @xorsign<mode>3_1 [PR102224]
As the testcase shows, we miscompile @xorsign<mode>3_1 if both input
operands are in the same register, because the splitter overwrites op1
with op1 & mask before using op0.
For dest = xorsign op0, op0 we can actually simplify it from
dest = (op0 & mask) ^ op0 to dest = op0 & ~mask (aka abs).
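The identity can be checked on raw bit patterns. This is a sketch on 32-bit
single precision encodings, assuming mask is the sign bit mask; the function
names are hypothetical.

```c
#include <stdint.h>

#define SIGN_MASK UINT32_C (0x80000000)

/* xorsign on bit patterns: flip the sign of X if Y is negative.  */
uint32_t
xorsign_bits (uint32_t x, uint32_t y)
{
  return (y & SIGN_MASK) ^ x;
}

/* abs on bit patterns: clear the sign bit.  */
uint32_t
abs_bits (uint32_t x)
{
  return x & ~SIGN_MASK;
}
```

For any x, xorsign_bits (x, x) clears the sign bit when it is set and leaves
it clear otherwise, which is exactly abs_bits (x).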
The expander change is an optimization improvement, if we at expansion
time know it is xorsign op0, op0, we can emit abs right away and get better
code through that.
The @xorsign<mode>3_1 change is a fix for the case where xorsign wouldn't be
known to have the same operands during expansion, but during RTL
optimizations they would appear. We need to use earlyclobber: we require dest
and op1 to be the same, but op0 must be different because we overwrite
op1 first.
2021-09-08 Jakub Jelinek <jakub@redhat.com>
PR target/102224
* config/i386/i386.md (xorsign<mode>3): If operands[1] is equal to
operands[2], emit abs<mode>2 instead.
(@xorsign<mode>3_1): Add early-clobber for output operand.
* gcc.dg/pr102224.c: New test.
* gcc.target/i386/avx-pr102224.c: New test.
Jakub Jelinek [Mon, 23 Aug 2021 09:50:14 +0000 (11:50 +0200)]
dwarf2out: Emit DW_AT_location for global register vars during early dwarf [PR101905]
The following patch emits DW_AT_location for global register variables
already during early dwarf, since usually late_global_decl hook isn't even
called for those, as nothing needs to be emitted for them.
2021-08-23 Jakub Jelinek <jakub@redhat.com>
PR debug/101905
* dwarf2out.c (gen_variable_die): Add DW_AT_location for global
register variables already during early_dwarf if possible.
Jakub Jelinek [Wed, 28 Jul 2021 16:43:15 +0000 (18:43 +0200)]
ubsan: Fix ICEs with DECL_REGISTER tests [PR101624]
The following testcase ICEs, because the base is a CONST_DECL for
the Fortran parameter, and ubsan/sanopt uses DECL_REGISTER macro on it.
/* In VAR_DECL and PARM_DECL nodes, nonzero means declared `register'. */
#define DECL_REGISTER(NODE) (DECL_WRTL_CHECK (NODE)->decl_common.decl_flag_0)
while CONST_DECL doesn't satisfy DECL_WRTL_CHECK.
The following patch checks explicitly for VAR_DECL/PARM_DECL/RESULT_DECL
only before using DECL_REGISTER, assumes other decls aren't DECL_REGISTER.
Not really sure about RESULT_DECL but it at least satisfies DECL_WRTL_CHECK...
2021-07-28 Jakub Jelinek <jakub@redhat.com>
PR middle-end/101624
* ubsan.c (maybe_instrument_pointer_overflow,
instrument_object_size): Only test DECL_REGISTER on VAR_DECLs,
PARM_DECLs or RESULT_DECLs.
* sanopt.c (maybe_optimize_ubsan_ptr_ifn): Likewise.
* gfortran.dg/ubsan/ubsan.exp: New file.
* gfortran.dg/ubsan/pr101624.f90: New test.
Jakub Jelinek [Fri, 23 Jul 2021 17:55:16 +0000 (19:55 +0200)]
expmed: Fix store_integral_bit_field [PR101562]
Our documentation says that paradoxical subregs shouldn't appear
in strict_low_part:
'(strict_low_part (subreg:M (reg:N R) 0))'
This expression code is used in only one context: as the
destination operand of a 'set' expression. In addition, the
operand of this expression must be a non-paradoxical 'subreg'
expression.
but on the testcase below that triggers UB at runtime
store_integral_bit_field emits exactly that.
The following patch fixes it by ensuring the requirement is satisfied.
2021-07-23 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/101562
* expmed.c (store_integral_bit_field): Only use movstrict_optab
if the operand isn't paradoxical.
Jakub Jelinek [Wed, 21 Jul 2021 07:45:02 +0000 (09:45 +0200)]
openmp: Fix up omp_check_private [PR101535]
The target data construct shouldn't affect omp_check_private, unless
the decl there is privatized (use_device_* clauses). The routine
had some code for that, but it just did continue; in a loop that looped
only if the region type is one of selected 4 kinds, so effectively resulted
in return false; instead of looping again. And not diagnosing lastprivate
(or reduction etc.) on a variable that is private to the containing parallel
results in ICEs later on, as there is no original list item in which to
store the last result.
The target construct is unclear as it has an implicit parallel region
and it is not obvious if the data privatization clauses on the construct
shall be treated as data privatization on the implicit parallel or just
on the target. For now treat those as privatization on the implicit
parallel, but treat map clauses as shared on the implicit parallel.
2021-07-21 Jakub Jelinek <jakub@redhat.com>
PR middle-end/101535
* gimplify.c (omp_check_private): Properly skip ORT_TARGET_DATA
contexts in which decl isn't privatized and for ORT_TARGET return
false if decl is mapped.
* c-c++-common/gomp/pr101535-1.c: New test.
* c-c++-common/gomp/pr101535-2.c: New test.
Jakub Jelinek [Wed, 21 Jul 2021 07:38:59 +0000 (09:38 +0200)]
c++: Ensure OpenMP reduction with reference type references complete type [PR101516]
The following testcase ICEs because, when the reduction decl has reference
type, we haven't verified that TREE_TYPE of the reference is a complete type;
require_complete_type on the decl doesn't ensure that.
2021-07-21 Jakub Jelinek <jakub@redhat.com>
PR c++/101516
* semantics.c (finish_omp_reduction_clause): Also call
complete_type_or_else and return true if it fails.
Jakub Jelinek [Tue, 20 Jul 2021 14:41:29 +0000 (16:41 +0200)]
rs6000: Fix up easy_vector_constant_msb handling [PR101384]
The following gcc.dg/pr101384.c testcase is miscompiled on
powerpc64le-linux.
easy_altivec_constant has code to try to construct vector constants with
different element sizes, perhaps different from CONST_VECTOR's mode. But as
written, that works fine for vspltis[bhw] cases, but not for the vspltisw
x,-1; vsl[bhw] x,x,x case, because that creates always a V16QImode, V8HImode
or V4SImode constant containing broadcasted constant with just the MSB set.
The vspltis_constant function etc. expects the vspltis[bhw] instructions
where the small [-16..15] or even [-32..30] constant is sign-extended to the
remaining step bytes, but that is not the case for the 0x80...00 constants,
with step 1 we can't handle e.g.
{ 0x80, 0xff, 0xff, 0xff, 0x80, 0xff, 0xff, 0xff, 0x80, 0xff, 0xff, 0xff, 0x80, 0xff, 0xff, 0xff }
vectors but do want to handle e.g.
{ 0, 0, 0, 0x80, 0, 0, 0, 0x80, 0, 0, 0, 0x80, 0, 0, 0, 0x80 }
and similarly with copies 1 we do want to handle e.g.
{ 0x80808080, 0x80808080, 0x80808080, 0x80808080 }.
This is a simpler version of the fix for backports, which limits the EASY_VECTOR_MSB case
matching to step == 1 && copies == 1, because that is the only case the
splitter handles correctly.
2021-07-20 Jakub Jelinek <jakub@redhat.com>
PR target/101384
* config/rs6000/rs6000.c (vspltis_constant): Accept EASY_VECTOR_MSB
only if step and copies are equal to 1.
Jakub Jelinek [Thu, 1 Jul 2021 06:55:49 +0000 (08:55 +0200)]
openmp - Fix up && and || reductions [PR94366]
As the testcase shows, the special treatment of && and || reduction combiners
where we expand them as omp_out = (omp_out != 0) && (omp_in != 0) (or with ||)
is not needed just for &&/|| on floating point or complex types, but for all
&&/|| reductions - when expanded as omp_out = omp_out && omp_in (not in C but
GENERIC) it is actually gimplified into NOP_EXPRs to bool from both operands,
which turn non-zero values that are multiples of 2 into 0 rather than 1.
This patch just treats all &&/|| the same and furthermore uses bool type
instead of int for the comparisons.
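The difference between the two expansions can be sketched as follows. The
low_bit helper is an assumed model of the problematic conversion (keeping
only the least significant bit, so even non-zero values become 0); it is not
GCC code, and all names here are hypothetical.

```c
#include <stdbool.h>

/* Assumed model of the bad conversion: only the low bit survives.  */
int
low_bit (int v)
{
  return v & 1;
}

/* Combiner built on the truncating conversion: wrong for even values.  */
int
and_combine_truncating (int omp_out, int omp_in)
{
  return low_bit (omp_out) && low_bit (omp_in);
}

/* Combiner with explicit comparisons against zero and a bool result.  */
bool
and_combine_fixed (int omp_out, int omp_in)
{
  return (omp_out != 0) && (omp_in != 0);
}
```

With operands 2 and 4 the truncating variant yields 0 although both values
are non-zero, while the variant with explicit comparisons yields true.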
2021-07-01 Jakub Jelinek <jakub@redhat.com>
PR middle-end/94366
gcc/
* omp-low.c (lower_rec_input_clauses): Rename is_fp_and_or to
is_truth_op, set it for TRUTH_*IF_EXPR regardless of new_var's type,
use boolean_type_node instead of integer_type_node as NE_EXPR type.
(lower_reduction_clauses): Likewise.
libgomp/
* testsuite/libgomp.c-c++-common/pr94366.c: New test.
Tobias Burnus [Tue, 4 May 2021 11:38:03 +0000 (13:38 +0200)]
OpenMP: Support complex/float in && and || reduction
C/C++ permit logical AND and logical OR also with floating-point or complex
arguments by comparing them for inequality with zero; the result is an 'int'
with value one or zero. Hence, those are also permitted as reduction
variables, even though it is not the most sensible thing to do.
gcc/c/ChangeLog:
* c-typeck.c (c_finish_omp_clauses): Accept float + complex
for || and && reductions.
gcc/cp/ChangeLog:
* semantics.c (finish_omp_reduction_clause): Accept float + complex
for || and && reductions.
gcc/ChangeLog:
* omp-low.c (lower_rec_input_clauses, lower_reduction_clauses): Handle
&& and || with floating-point and complex arguments.
gcc/testsuite/ChangeLog:
* gcc.dg/gomp/clause-1.c: Use 'reduction(&:..)' instead of '...(&&:..)'.
libgomp/ChangeLog:
* testsuite/libgomp.c-c++-common/reduction-1.c: New test.
* testsuite/libgomp.c-c++-common/reduction-2.c: New test.
* testsuite/libgomp.c-c++-common/reduction-3.c: New test.
Comparisons of NULLPTR_TYPE operands cause all kinds of problems in the
middle-end and in fold-const.c, various optimizations assume that if they
see e.g. a non-equality comparison with one of the operands being
INTEGER_CST and it is not INTEGRAL_TYPE_P (which has TYPE_{MIN,MAX}_VALUE),
they can build_int_cst (type, 1) to find a successor.
The following patch fixes it by making sure they don't appear in the IL,
optimize them away at cp_fold time as all can be folded.
Though, I've just noticed that clang++ rejects the non-equality comparisons
instead, foo () > 0 with
invalid operands to binary expression ('decltype(nullptr)' (aka 'nullptr_t') and 'int')
and foo () > nullptr with
invalid operands to binary expression ('decltype(nullptr)' (aka 'nullptr_t') and 'nullptr_t')
Shall we reject those too, in addition to or instead of parts of this patch?
If so, wouldn't this patch still be useful for backports? I bet we don't
want to start rejecting it on the release branches when we used to accept it.
2021-07-15 Jakub Jelinek <jakub@redhat.com>
PR c++/101443
* cp-gimplify.c (cp_fold): For comparisons with NULLPTR_TYPE
operands, fold them right away to true or false.
pot_dummy_types is a hash_set from whose traversal the code prints some type
lines. hash_set normally uses default_hash_traits, which for pointer types
(the hash set hashes const char *) uses pointer_hash, which hashes the
addresses of the pointers except for the least significant 3 bits.
With address space randomization, that results in non-determinism in the
-fdump-go-specs= generated file: each invocation can have a different order
of the lines emitted from the pot_dummy_types traversal.
This patch fixes it by hashing the string contents instead to make the
hashes reproducible.
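Why hashing the contents is reproducible can be sketched with any hash that
looks only at the characters. FNV-1a is used below purely as an illustrative
choice; it is not claimed to be the function GCC uses, and the function name
is hypothetical.

```c
#include <stdint.h>

/* 32-bit FNV-1a over the characters of S.  The result depends only on
   the string contents, never on the address S happens to have.  */
uint32_t
hash_string_contents (const char *s)
{
  uint32_t h = UINT32_C (2166136261);

  for (; *s != '\0'; s++)
    {
      h ^= (unsigned char) *s;
      h *= UINT32_C (16777619);
    }
  return h;
}
```

Two copies of the same string at different addresses hash identically, so
the traversal order no longer depends on where the strings were allocated.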
2021-07-14 Jakub Jelinek <jakub@redhat.com>
PR go/101407
* godump.c (godump_str_hash): New type.
(godump_container::pot_dummy_types): Use string_hash instead of
ptr_hash in the hash_set.
Jakub Jelinek [Tue, 13 Jul 2021 07:50:49 +0000 (09:50 +0200)]
libgomp: Don't include limits.h inside of hidden visibility block
sem.h is included in between # pragma GCC visibility push(hidden)
and # pragma GCC visibility pop and includes limits.h there, which
causes trouble since the introduction of the sysconf declaration
there in recent glibcs. libgomp assumes it is compiled by gcc,
so we don't really need to include limits.h there and can use
-__INT_MAX__ - 1 instead (which clang and icc have supported for years too).
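A quick check that the replacement expression has the same value as INT_MIN,
as a sketch. SEM_WAIT_SKETCH and sem_wait_value are hypothetical names, and
limits.h is included here only to have INT_MIN available for the comparison.

```c
#include <limits.h>

/* Hypothetical name; uses the compiler-predefined __INT_MAX__ macro
   instead of INT_MIN from limits.h.  */
#define SEM_WAIT_SKETCH (-__INT_MAX__ - 1)

/* Compile-time check that both spellings agree.  */
_Static_assert (SEM_WAIT_SKETCH == INT_MIN, "expression must equal INT_MIN");

int
sem_wait_value (void)
{
  return SEM_WAIT_SKETCH;
}
```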
2021-07-13 Jakub Jelinek <jakub@redhat.com>
Florian Weimer <fweimer@redhat.com>
* config/linux/sem.h: Don't include limits.h.
(SEM_WAIT): Define to -__INT_MAX__ - 1 instead of INT_MIN.
* config/linux/affinity.c: Include limits.h.
Jakub Jelinek [Thu, 1 Jul 2021 07:45:02 +0000 (09:45 +0200)]
dwarf2out: Handle COMPOUND_LITERAL_EXPR in loc_list_from_tree_1 [PR101266]
In this case dwarf2out_decl is called from the FEs with GENERIC but not
yet gimplified expressions in it.
As loc_list_from_tree_1 has an exhaustive list of tree codes it wants to
handle and, with checking enabled, asserts that no other codes make it in, we
should handle even GENERIC trees that shouldn't be valid in GIMPLE.
The following patch handles COMPOUND_LITERAL_EXPR by handling it like the
underlying VAR_DECL temporary.
Verified the emitted DWARF is correct (but unoptimized, we emit
DW_OP_lit1 DW_OP_lit1 DW_OP_minus for the upper bound).
Jakub Jelinek [Tue, 29 Jun 2021 09:24:38 +0000 (11:24 +0200)]
match.pd: Avoid (intptr_t)x eq/ne CST to x eq/ne (typeof x) CST opt in GENERIC when sanitizing [PR101210]
When we have (intptr_t) x == cst where x has REFERENCE_TYPE, this
optimization creates x == cst out of it where cst has REFERENCE_TYPE.
If it is done in GENERIC folding, it can result in ubsan failures
where the INTEGER_CST with REFERENCE_TYPE is instrumented.
Fixed by deferring it to GIMPLE folding in this case.
2021-06-29 Jakub Jelinek <jakub@redhat.com>
PR c++/101210
* match.pd ((intptr_t)x eq/ne CST to x eq/ne (typeof x) CST): Don't
perform the optimization in GENERIC when sanitizing and x has a
reference type.
Jakub Jelinek [Thu, 24 Jun 2021 13:58:02 +0000 (15:58 +0200)]
c: Fix up c_parser_has_attribute_expression [PR101176]
This function keeps src_range member of the result uninitialized, which at
least under valgrind can show up later when those uninitialized location_t's
can make it into the IL or location_t hash tables.
2021-06-24 Jakub Jelinek <jakub@redhat.com>
PR c/101176
* c-parser.c (c_parser_has_attribute_expression): Set source range for
the result.
Jakub Jelinek [Thu, 24 Jun 2021 13:55:28 +0000 (15:55 +0200)]
c: Fix C cast error-recovery [PR101171]
The following testcase ICEs during error-recovery, as build_c_cast calls
note_integer_operands on error_mark_node and that wraps it into
C_MAYBE_CONST_EXPR which is unexpected and causes ICE later on.
Seems most other callers of note_integer_operands check early if something
is error_mark_node and return before calling note_integer_operands on it.
The following patch fixes it by not calling it on error_mark_node; another
possibility would be to handle error_mark_node in note_integer_operands and
just return it.
2021-06-24 Jakub Jelinek <jakub@redhat.com>
PR c/101171
* c-typeck.c (build_c_cast): Don't call note_integer_operands on
error_mark_node.
Jakub Jelinek [Wed, 23 Jun 2021 08:03:28 +0000 (10:03 +0200)]
openmp: Fix up *_reduction clause handling with UDRs on PARM_DECLs [PR101167]
The following testcase FAILs, because the UDR combiner is invoked incorrectly.
lower_omp_rec_clauses expects that when it sets
DECL_VALUE_EXPR/DECL_HAS_VALUE_EXPR_P
for both the placeholder and the var that everything will be properly
regimplified, but as the variable in question is a PARM_DECL rather than
VAR_DECL, lower_omp_regimplify_p doesn't say that it should be regimplified
and so it is not.
2021-06-23 Jakub Jelinek <jakub@redhat.com>
PR middle-end/101167
* omp-low.c (lower_omp_regimplify_p): Regimplify also PARM_DECLs
and RESULT_DECLs that have DECL_HAS_VALUE_EXPR_P set.
* testsuite/libgomp.c-c++-common/task-reduction-15.c: New test.
Jakub Jelinek [Mon, 21 Jun 2021 11:30:42 +0000 (13:30 +0200)]
inline-asm: Fix ICE with bitfields in "m" operands [PR100785]
Bitfields, while they live in memory, aren't something inline-asm can easily
operate on.
For C and "=m" or "+m", we were diagnosing bitfields in the past in the
FE, where c_mark_addressable had:
case COMPONENT_REF:
if (DECL_C_BIT_FIELD (TREE_OPERAND (x, 1)))
{
error
("cannot take address of bit-field %qD", TREE_OPERAND (x, 1));
return false;
}
but that check got moved in GCC 6 to build_unary_op instead and now we
emit an error during expansion and ICE afterwards (i.e. error-recovery).
For "m" it used to be diagnosed in c_mark_addressable too, but since
GCC 6 it is ice-on-invalid.
For C++, this was never diagnosed in the FE, but used to be diagnosed
in the gimplifier and/or during expansion before 4.8.
The following patch does multiple things:
1) diagnoses it in the FEs
2) simplifies during expansion the inline asm if any errors have been
reported (similarly to how e.g. the vregs pass, if it detects errors on
inline-asm, either deletes them or simplifies them to the bare minimum -
just labels), so that we don't have error-recovery ICEs there
2021-06-11 Jakub Jelinek <jakub@redhat.com>
PR inline-asm/100785
gcc/
* cfgexpand.c (expand_asm_stmt): If errors are emitted,
remove all inputs, outputs and clobbers from the asm and
set template to "".
gcc/c/
* c-typeck.c (c_mark_addressable): Diagnose trying to make
bit-fields addressable.
gcc/cp/
* typeck.c (cxx_mark_addressable): Diagnose trying to make
bit-fields addressable.
gcc/testsuite/
* c-c++-common/pr100785.c: New test.
Jakub Jelinek [Fri, 18 Jun 2021 09:20:40 +0000 (11:20 +0200)]
stor-layout: Don't create DECL_BIT_FIELD_REPRESENTATIVE for QUAL_UNION_TYPE [PR101062]
> The following patch does create them, but treats all such bitfields as if
> they were in a structure where the particular bitfield is the only field.
While the patch passed bootstrap/regtest on the trunk, when trying to
backport it to 11 branch the bootstrap failed with
atree.ads:3844:34: size for "Node_Record" too small
errors. Turns out the error is not about size being too small, but actually
about size being non-constant, and comes from:
/* In a FIELD_DECL of a RECORD_TYPE, this is a pointer to the storage
representative FIELD_DECL. */
#define DECL_BIT_FIELD_REPRESENTATIVE(NODE) \
(FIELD_DECL_CHECK (NODE)->field_decl.qualifier)
/* For a FIELD_DECL in a QUAL_UNION_TYPE, records the expression, which
if nonzero, indicates that the field occupies the type. */
#define DECL_QUALIFIER(NODE) (FIELD_DECL_CHECK (NODE)->field_decl.qualifier)
so by setting up DECL_BIT_FIELD_REPRESENTATIVE in QUAL_UNION_TYPE we
actually set or modify DECL_QUALIFIER and then construct size as COND_EXPRs
with those bit field representatives (e.g. with array type) as conditions
which doesn't fold into constant.
The following patch fixes it by not creating DECL_BIT_FIELD_REPRESENTATIVEs
for QUAL_UNION_TYPE as there is nowhere to store them.
Shall we change tree.h to document that DECL_BIT_FIELD_REPRESENTATIVE
is valid also on UNION_TYPE?
I see:
tree-ssa-alias.c- if (TREE_CODE (type1) == RECORD_TYPE
tree-ssa-alias.c: && DECL_BIT_FIELD_REPRESENTATIVE (field1))
tree-ssa-alias.c: field1 = DECL_BIT_FIELD_REPRESENTATIVE (field1);
tree-ssa-alias.c- if (TREE_CODE (type2) == RECORD_TYPE
tree-ssa-alias.c: && DECL_BIT_FIELD_REPRESENTATIVE (field2))
tree-ssa-alias.c: field2 = DECL_BIT_FIELD_REPRESENTATIVE (field2);
Shall we change that to || == UNION_TYPE or do we assume all fields
are overlapping in a UNION_TYPE already?
At other spots (asan, ubsan, expr.c) it is unclear what will happen
if they see a QUAL_UNION_TYPE with a DECL_QUALIFIER (or does the Ada FE
lower that somehow)?
Jakub Jelinek [Wed, 16 Jun 2021 10:17:55 +0000 (12:17 +0200)]
stor-layout: Create DECL_BIT_FIELD_REPRESENTATIVE even for bitfields in unions [PR101062]
The following testcase is miscompiled on x86_64-linux, the bitfield store
is implemented as a RMW 64-bit operation at d+24 when the d variable has
size of only 28 bytes and scheduling moves in between the R and W part
a store to a different variable that happens to be right after the d
variable.
The reason for this is that we weren't creating
DECL_BIT_FIELD_REPRESENTATIVEs for bitfields in unions.
The following patch does create them, but treats all such bitfields as if
they were in a structure where the particular bitfield is the only field.
2021-06-16 Jakub Jelinek <jakub@redhat.com>
PR middle-end/101062
* stor-layout.c (finish_bitfield_representative): For fields in unions
assume nextf is always NULL.
(finish_bitfield_layout): Compute bit field representatives also in
unions, but handle it as if each bitfield was the only field in the
aggregate.
Jakub Jelinek [Wed, 16 Jun 2021 11:10:48 +0000 (13:10 +0200)]
testsuite: Use noipa attribute instead of noinline, noclone
I've noticed this test now on various arches sometimes FAILs, sometimes
PASSes (the line 12 test in particular).
The problem is that a = 0; initialization in the caller no longer happens
before the f(&a) call as what the argument points to is only used in
debug info.
Making the function noipa forces the caller to initialize it and still
tests what the test wants to test, namely that we don't consider *p as
valid location for the c variable at line 18 (after it has been overwritten
with *p = 1;).
2021-06-16 Jakub Jelinek <jakub@redhat.com>
* gcc.dg/guality/pr49888.c (f): Use noipa attribute instead of
noinline, noclone.
Jakub Jelinek [Wed, 16 Jun 2021 08:45:27 +0000 (10:45 +0200)]
libffi: Fix up x86_64 classify_argument
As the following testcase shows, libffi didn't properly handle
classify_argument of structures at byte offsets not divisible by
UNITS_PER_WORD. The following patch adjusts it to match what
config/i386/ classify_argument does for that and also ports the
PR38781 fix there (the second chunk).
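The effect of the offset on the word count can be sketched with plain
arithmetic. This assumes 8-byte words and that only the offset within the
first word matters (hence the modulo); the function names are hypothetical
and the code is not taken from libffi.

```c
#include <stddef.h>

#define WORD_BYTES 8    /* assumed UNITS_PER_WORD on x86_64 */

/* Words needed when only the structure size is considered.  */
size_t
words_size_only (size_t size)
{
  return (size + WORD_BYTES - 1) / WORD_BYTES;
}

/* Words needed when the structure starts BYTE_OFFSET bytes into its
   containing aggregate; only the offset within a word is relevant.  */
size_t
words_with_offset (size_t size, size_t byte_offset)
{
  return (size + byte_offset % WORD_BYTES + WORD_BYTES - 1) / WORD_BYTES;
}
```

An 8-byte structure at byte offset 4 straddles two words, which the
size-only computation reports as a single word.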
* src/x86/ffi64.c (classify_argument): For FFI_TYPE_STRUCT set words
to number of words needed for type->size + byte_offset bytes rather
than just type->size bytes. Compute pos before the loop and check
total size of the structure.
* testsuite/libffi.call/nested_struct12.c: New test.
Jakub Jelinek [Tue, 15 Jun 2021 09:36:47 +0000 (11:36 +0200)]
expr: Fix up VEC_PACK_TRUNC_EXPR expansion [PR101046]
The following testcase ICEs, because we have a mode mismatch.
VEC_PACK_TRUNC_EXPR's operands have different modes from the result
(same vector mode size but twice as large element),
but we were passing a non-NULL subtarget with the mode of the result
to the expansion of its arguments, so the VEC_PERM_EXPR in one of the
operands, which had V8SImode operands and result, was given a V16HImode
target.
Fixed by clearing the subtarget if we are changing mode.
2021-06-15 Jakub Jelinek <jakub@redhat.com>
PR target/101046
* expr.c (expand_expr_real_2) <case VEC_PACK_FIX_TRUNC_EXPR,
case VEC_PACK_TRUNC_EXPR>: Clear subtarget when changing mode.
Jakub Jelinek [Mon, 7 Jun 2021 07:25:37 +0000 (09:25 +0200)]
tree-inline: Fix up __builtin_va_arg_pack handling [PR100898]
The following testcase ICEs, because gimple_call_arg_ptr (..., 0)
asserts that there is at least one argument, while we were using
it even if we didn't copy anything just to get a pointer from/to which
the zero arguments should be copied.
Fixed by guarding the memcpy calls. Also, the code was calling
gimple_call_num_args too many times - 5 times instead of 2, so the patch
adds two temporaries for those.
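The guard pattern can be sketched on plain arrays: the source and destination
pointers are only formed and passed to memcpy when at least one element is
to be copied. The function name and parameters are hypothetical and do not
mirror the tree-inline.c code.

```c
#include <stddef.h>
#include <string.h>

/* Append N ints from SRC after the first DST_LEN ints of DST and return
   the new length.  When N is zero neither pointer is used, so SRC may
   even be NULL in that case.  */
size_t
append_args (int *dst, size_t dst_len, const int *src, size_t n)
{
  if (n != 0)
    memcpy (dst + dst_len, src, n * sizeof *src);
  return dst_len + n;
}
```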
2021-06-07 Jakub Jelinek <jakub@redhat.com>
PR middle-end/100898
* tree-inline.c (copy_bb): Only use gimple_call_arg_ptr if memcpy
should copy any arguments. Don't call gimple_call_num_args
on id->call_stmt or call_stmt more than once.
Jakub Jelinek [Fri, 4 Jun 2021 09:20:02 +0000 (11:20 +0200)]
x86: Fix ix86_expand_vector_init for V*TImode [PR100887]
We have vec_initv4tiv2ti and vec_initv2titi patterns which call
ix86_expand_vector_init and assume it works for those modes. For the
case of construction from two half-sized vectors, the code assumes it
will always succeed, but we have only insn patterns with SImode and DImode
element types. QImode and HImode element types are already handled
by performing it with same sized vectors with SImode elements and the
following patch extends that to V*TImode vectors.
2021-06-04 Jakub Jelinek <jakub@redhat.com>
PR target/100887
* config/i386/i386.c (ix86_expand_vector_init): Handle
concatenation from half-sized modes with TImode elements.
Jakub Jelinek [Tue, 25 May 2021 15:24:38 +0000 (17:24 +0200)]
c++: Avoid -Wunused-value false positives on nullptr passed to ellipsis [PR100666]
When passing expressions with decltype(nullptr) type with side-effects to
ellipsis, we pass (void *)0 instead, but for the side-effects evaluate them
on the lhs of a COMPOUND_EXPR. Unfortunately that means we warn about it
if the expression is a call to nodiscard marked function, even when the
result is really used, just needs to be transformed.
Fixed by adding a warning_sentinel.
2021-05-25 Jakub Jelinek <jakub@redhat.com>
PR c++/100666
* call.c (convert_arg_to_ellipsis): For expressions with NULLPTR_TYPE
and side-effects, temporarily disable -Wunused-result warning when
building COMPOUND_EXPR.
* g++.dg/cpp1z/nodiscard8.C: New test.
* g++.dg/cpp1z/nodiscard9.C: New test.
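A minimal sketch of the kind of code affected, assuming a C++17 compiler. The names calls, next_null, sink and pass_to_ellipsis are hypothetical and are not taken from the new tests.

```cpp
#include <cassert>
#include <cstddef>

int calls = 0;

// A nodiscard function returning std::nullptr_t with a side effect.
[[nodiscard]] std::nullptr_t next_null()
{
  ++calls;
  return nullptr;
}

// Variadic sink; the variadic arguments are intentionally not inspected.
void sink(int count, ...)
{
  (void) count;
}

// The result of next_null () is used as an argument, so no warning about
// a discarded result is expected here; the call is still evaluated once
// for its side effect.
int pass_to_ellipsis()
{
  sink(1, next_null());
  return calls;
}
```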
Jakub Jelinek [Wed, 12 May 2021 08:38:35 +0000 (10:38 +0200)]
expand: Don't reuse DEBUG_EXPRs with vector type if they have different modes [PR100508]
The inliner doesn't remap DEBUG_EXPR_DECLs, so the same decls can appear
in multiple functions.
Furthermore, expansion reuses corresponding DEBUG_EXPRs too, so they again
can be reused in multiple functions.
Neither of those is a major problem; DEBUG_EXPRs are just magic value holders
and what value they stand for is independent in each function and driven by
what debug stmts or DEBUG_INSNs they are bound to.
Except that for DEBUG_EXPR*s with vector types, TYPE_MODE can be either BLKmode
or some vector mode depending on whether current function's enabled ISAs
support that vector mode or not. On the following testcase, we expand it
first in foo function without AVX2 enabled and so the DEBUG_EXPR is
BLKmode, but later the same DEBUG_EXPR_DECL is used in a simd clone with
AVX2 enabled and expansion ICEs because of a mode mismatch.
The following patch fixes that by forcing recreation of a DEBUG_EXPR if
there is a mode mismatch for vector typed DEBUG_EXPR_DECL, DEBUG_EXPRs
will be still reused in between functions otherwise and within the same
function the mode should be always the same.
2021-05-12 Jakub Jelinek <jakub@redhat.com>
PR middle-end/100508
* cfgexpand.c (expand_debug_expr): For DEBUG_EXPR_DECL with vector
type, don't reuse DECL_RTL if it has different mode, instead force
creation of a new DEBUG_EXPR.
Jakub Jelinek [Tue, 11 May 2021 07:07:47 +0000 (09:07 +0200)]
openmp: Fix up taskloop reduction ICE if taskloop has no iterations [PR100471]
When a taskloop doesn't have any iterations, GOMP_taskloop* takes an early
return, doesn't create any tasks and more importantly, doesn't create
a taskgroup and doesn't register task reductions. But, the code emitted
in the callers assumes task reductions have been registered and performs
the reduction handling and task reduction unregistration. The pointer
to the task reduction private variables is reused: on input it is the alignment
and only on output it is the pointer, so in the case of a taskloop with no iterations
the caller attempts to dereference the alignment value as if it were a pointer
and crashes. We could in the early returns register the task reductions
only to have them looped over and unregistered in the caller, but I think
it is better to tell the caller there is nothing to task reduce and bypass
all that.
2021-05-11 Jakub Jelinek <jakub@redhat.com>
PR middle-end/100471
* omp-low.c (lower_omp_task_reductions): For OMP_TASKLOOP, if data
is 0, bypass the reduction loop including
GOMP_taskgroup_reduction_unregister call.
* taskloop.c (GOMP_taskloop): If GOMP_TASK_FLAG_REDUCTION and not
GOMP_TASK_FLAG_NOGROUP, when doing early return clear the task
reduction pointer.
* testsuite/libgomp.c/task-reduction-4.c: New test.
Eric Botcazou [Tue, 10 May 2022 07:33:16 +0000 (09:33 +0200)]
Fix internal error with vectorization on SPARC
This is a regression present since the 10.x series, but the underlying issue
has been there since the TARGET_VEC_PERM_CONST hook was implemented, in the
form of an ICE when expanding a constant VEC_PERM_EXPR in V4QI, while the
back-end only supports V8QI constant VEC_PERM_EXPRs.
gcc/
PR target/105292
* config/sparc/sparc.c (sparc_vectorize_vec_perm_const): Return
true only for 8-byte vector modes.
gcc/testsuite/
* gcc.target/sparc/20220510-1.c: New test.
Jonathan Wakely [Fri, 6 May 2022 20:19:17 +0000 (21:19 +0100)]
libstdc++: Fix deserialization for std::normal_distribution [PR105502]
This fixes a regression in std::normal_distribution deserialization that
caused the object to be left unchanged if the __state_avail value read
from the stream was false.
libstdc++-v3/ChangeLog:
PR libstdc++/105502
* include/bits/random.tcc
(operator>>(basic_istream<C,T>&, normal_distribution<R>&)):
Update state when __state_avail is false.
* testsuite/26_numerics/random/normal_distribution/operators/serialize.cc:
Check that deserialized object equals serialized one.
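The property being restored can be checked with a short round trip. This is a sketch only; round_trips is a hypothetical helper, not the code of the updated test. A freshly constructed distribution has no saved value, which is the case the regression affected.

```cpp
#include <cassert>
#include <random>
#include <sstream>

// Write a distribution to a stream and read it back into an object that
// starts with different parameters.  Returns true if the object read back
// compares equal to the one written.
bool round_trips(double mean, double stddev)
{
  std::normal_distribution<double> written(mean, stddev);
  std::normal_distribution<double> read_back; // defaults: mean 0, stddev 1
  std::stringstream stream;
  stream << written;
  stream >> read_back;
  return !stream.fail() && read_back == written;
}
```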
Jonathan Wakely [Fri, 26 Nov 2021 12:07:13 +0000 (12:07 +0000)]
libstdc++: Fix test that fails for C++98 mode
When I backported r11-2760 as r10-8644 I simplified it and didn't add
the new _GLIBCXX11_DEPRECATED macro. That means that the macro used on
the old iostream members does nothing for C++98 mode, and so the test
fails. This adjusts the test to only expect warnings for C++11 and
later.
libstdc++-v3/ChangeLog:
* testsuite/27_io/types/1.cc: Add c++11 target selector to
warnings.
Jonathan Wakely [Fri, 4 Feb 2022 15:23:31 +0000 (15:23 +0000)]
libstdc++: Remove un-implementable noexcept from Filesystem TS operations
LWG 3014 removed these incorrect noexcept specifications from the C++17
std::filesystem operations. They are also incorrect on the experimental
TS versions and should be removed from them too.
This restores support for std::make_exception_ptr<E&> and for using
std::exception_ptr in C++98.
Because the new non-throwing implementation needs to use std::decay to
handle references the original throwing implementation is used for
C++98.
We also need to change the typeid expression so it doesn't yield the
dynamic type when the function parameter is a reference to a polymorphic
type. Otherwise the new exception object could be caught by any handler
matching the dynamic type, even though the actual exception object is
only a copy of the base class, sliced to the static type.
libstdc++-v3/ChangeLog:
PR libstdc++/103630
* libsupc++/exception_ptr.h (make_exception_ptr): Decay the
template parameter. Use typeid of the static type.
* testsuite/18_support/exception_ptr/103630.cc: New test.
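The slicing concern can be shown with a small sketch. Base, Derived and caught_as are hypothetical names and this is not the content of the new test. The stored exception is a copy of the static type, so only a handler for the static type should match.

```cpp
#include <cassert>
#include <exception>

struct Base { virtual ~Base() = default; };
struct Derived : Base { };

// Returns 'D' if a handler for Derived matches, 'B' if only the Base
// handler matches, and '?' otherwise.
char caught_as(const Base &object)
{
  // The template argument is deduced as Base, so the exception object is
  // a sliced Base copy even when `object` refers to a Derived.
  std::exception_ptr stored = std::make_exception_ptr(object);
  try
    {
      std::rethrow_exception(stored);
    }
  catch (const Derived &)
    {
      return 'D';
    }
  catch (const Base &)
    {
      return 'B';
    }
  catch (...)
    {
      return '?';
    }
}
```

If typeid of the dynamic type were recorded instead, the Derived handler could match an object that is really only a Base.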
Jonathan Wakely [Thu, 17 Dec 2020 13:27:04 +0000 (13:27 +0000)]
libstdc++: Test errno macros directly for all targets [PR 93151]
This applies the same changes to the djgpp and mingw versions of
error_constants.h as r11-6137 did for the generic version.
All of these constants are defined as macros by <errno.h> on these
targets, so we can just test the macro directly instead of checking for
it at configure time.
libstdc++-v3/ChangeLog:
PR libstdc++/93151
* config/os/djgpp/error_constants.h: Test POSIX errno macros
directly, instead of corresponding _GLIBCXX_HAVE_EXXX macros.
* config/os/mingw32-w64/error_constants.h: Likewise.
* config/os/mingw32/error_constants.h: Likewise.
Jonathan Wakely [Tue, 15 Dec 2020 20:28:11 +0000 (20:28 +0000)]
libstdc++: Test errno macros directly, not via autoconf [PR 93151]
This fixes a bug caused by a mismatch between the macros defined by
<errno.h> when GCC is built and the macros defined by <errno.h> when
users include <system_error>. If the user code is compiled with
_XOPEN_SOURCE defined to 500 or 600, Darwin suppresses the
ENOTRECOVERABLE and EOWNERDEAD macros, which are not defined by SUSv3
(aka POSIX.1-2001).
Since POSIX requires the errno macros to be macros (and not variables or
enumerators) we can just test for them directly using the preprocessor.
That means that <system_error> will match what is actually defined when
it's included, not what was defined when GCC was built. With that change
there is no need for the GLIBCXX_CHECK_SYSTEM_ERROR configure checks and
they can be removed.
libstdc++-v3/ChangeLog:
PR libstdc++/93151
* acinclude.m4 (GLIBCXX_CHECK_SYSTEM_ERROR): Remove.
* config.h.in: Regenerate.
* configure: Regenerate.
* configure.ac (GLIBCXX_CHECK_SYSTEM_ERROR): Do not use.
* config/os/generic/error_constants.h: Test POSIX errno macros
directly, instead of corresponding _GLIBCXX_HAVE_EXXX macros.
* testsuite/19_diagnostics/headers/system_error/errc_std_c++0x.cc:
Likewise.
* testsuite/19_diagnostics/headers/system_error/93151.cc: New
test.
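The approach of testing the macros at the point of inclusion can be sketched as follows. The enumeration sample_errc and its enumerator names are hypothetical and much shorter than the real error_constants.h; only EDOM and ERANGE are assumed to be defined unconditionally.

```cpp
#include <cassert>
#include <cerrno>

// Each optional enumerator exists only if <errno.h> defines the macro in
// the translation unit that includes this code, not when the library was
// configured.
enum class sample_errc
{
  argument_out_of_domain = EDOM,
#ifdef EOWNERDEAD
  owner_dead = EOWNERDEAD,
#endif
#ifdef ENOTRECOVERABLE
  state_not_recoverable = ENOTRECOVERABLE,
#endif
  result_out_of_range = ERANGE
};
```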
Jonathan Wakely [Tue, 4 May 2021 14:49:38 +0000 (15:49 +0100)]
libstdc++: Fix undefined behaviour in std::string
This fixes a ubsan error when constructing a string with a null pointer:
bits/basic_string.h:534:21: runtime error: applying non-zero offset 18446744073709551615 to null pointer
The _M_construct function only cares whether the second pointer is
non-null, so create a non-null value without undefined arithmetic.
We can also pass the random_access_iterator_tag directly to the
_M_construct function, to avoid going via the tag dispatching
_M_construct_aux, because we know we have pointers not integers here.
libstdc++-v3/ChangeLog:
* include/bits/basic_string.h (basic_string(const CharT*, const A&)):
Do not do arithmetic on null pointer.
Jonathan Wakely [Thu, 12 Aug 2021 16:35:25 +0000 (17:35 +0100)]
libstdc++: Add additional overload of std::lerp [PR101870]
The [cmath.syn] p1 wording about additional overloads sufficient to
handle any arithmetic types also applies to std::lerp. This adds a new
overload of std::lerp that does the required promotions to support
arguments of arbitrary arithmetic types.
A new __promoted_t alias template is added, which the C++17 function
templates std::hypot and std::lerp can use to avoid instantiating the
__promote_3 class template.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
PR libstdc++/101870
* include/c_global/cmath (hypot): Use __promoted_t.
(lerp): Add new overload accepting any arithmetic types.
* include/ext/type_traits.h (__promoted_t): New alias template.
* testsuite/26_numerics/lerp.cc: Moved to...
* testsuite/26_numerics/lerp/1.cc: ...here.
* testsuite/26_numerics/lerp/constexpr.cc: New test.
* testsuite/26_numerics/lerp/version.cc: New test.
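The promotion rule can be sketched independently of the library. The names to_floating, promoted3 and lerp_any are hypothetical, and the interpolation uses the naive formula rather than the careful algorithm required of std::lerp; the point is only how the result type is chosen.

```cpp
#include <cassert>
#include <type_traits>

// Integer types are treated as double; floating-point types are kept.
template<typename T>
  using to_floating
    = typename std::conditional<std::is_integral<T>::value, double, T>::type;

// Result type for three arithmetic arguments: the usual arithmetic
// conversions applied to the floating-point versions of each type.
template<typename A, typename B, typename C>
  using promoted3
    = decltype(to_floating<A>() + to_floating<B>() + to_floating<C>());

// Naive linear interpolation accepting any mix of arithmetic types.
template<typename A, typename B, typename C>
  promoted3<A, B, C>
  lerp_any(A a, B b, C t)
  {
    using R = promoted3<A, B, C>;
    return static_cast<R>(a)
	   + static_cast<R>(t) * (static_cast<R>(b) - static_cast<R>(a));
  }
```

An alias template of this kind gives the promoted type directly, without naming a member of a class template.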
Jonathan Wakely [Tue, 20 Apr 2021 15:16:13 +0000 (16:16 +0100)]
libstdc++: Do not allocate a zero-size vector<bool> [PR 100153]
The vector<bool>::shrink_to_fit() implementation will allocate new
storage even if the vector is empty. That then leads to the
end-of-storage pointer being non-null and equal to the _M_start._M_p
pointer, which means that _M_end_addr() has undefined behaviour.
The fix is to stop doing a useless zero-sized allocation in
shrink_to_fit(), so that _M_start._M_p and _M_end_of_storage are both
null after an empty vector shrinks.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
PR libstdc++/100153
* include/bits/vector.tcc (vector<bool>::_M_shrink_to_fit()):
When size() is zero just deallocate and reset.
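A small sketch of the affected usage; shrunk_empty_vector is a hypothetical helper name. It only exercises the call on an empty vector and does not inspect the internal pointers.

```cpp
#include <cassert>
#include <vector>

// Shrink an empty vector<bool> and hand it back to the caller.
std::vector<bool> shrunk_empty_vector()
{
  std::vector<bool> bits;
  bits.shrink_to_fit(); // nothing to keep, so no storage is needed
  return bits;
}
```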
Jonathan Wakely [Sat, 4 Dec 2021 11:38:25 +0000 (11:38 +0000)]
libstdc++: Initialize member in std::match_results [PR103549]
This fixes a -Wuninitialized warning for std::cmatch m1, m2; m1=m2;
Also name the template parameters in the forward declaration, to get rid
of the <template-parameter-1-1> noise in diagnostics.
libstdc++-v3/ChangeLog:
PR libstdc++/103549
* include/bits/regex.h (match_results): Give names to template
parameters in first declaration.
(match_results::_M_begin): Add default member-initializer.
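The statement sequence from the description can be wrapped in a small function for checking; assign_default_constructed is a hypothetical name. The copy assignment reads every member of m2, which is why an uninitialized member triggered the warning.

```cpp
#include <cassert>
#include <regex>

// Copy-assign one default-constructed match_results to another and
// report whether the result is still in the "not ready" state.
bool assign_default_constructed()
{
  std::cmatch m1, m2;
  m1 = m2;
  return !m1.ready() && m1.empty() && m1 == m2;
}
```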
The path::begin() fix should have been part of r12-3930-gf2b7f56a15d9cb.
Thanks to Timm Bäder for reporting this one.
libstdc++-v3/ChangeLog:
* include/experimental/bits/fs_fwd.h (copy_file): Remove
incorrect noexcept from declaration.
* include/experimental/bits/fs_path.h (path::begin, path::end):
Add noexcept to declarations, to match definitions.
PR104228 showed that character lengths were shared between associate
variable and associate targets. This is problematic when the associate
target is itself a variable and gets a variable to hold the length, as
the length variable is added (and all the variables following it in the chain)
to both the associate variable scope and the target variable scope.
This caused an ICE when compiling with -O0 -fsanitize=address.
This change forces the creation of a separate character length for the
associate variable. It also forces the initialization of the character
length variable to avoid regressing associate_32 and associate_47 tests.
--
fortran: Separate associate character lengths earlier [PR104570]
This change works around an ICE in the evaluation of the character length
of an array expression referencing an associate variable; the code is
not prepared to see a non-scalar expression as it doesn’t initialize the
scalarizer.
Before this change, associate length symbols get a new gfc_charlen at
resolution stage to unshare them from the associate expression, so that
at translation stage it is a decl specific to the associate symbol that
is initialized, not the decl of some other symbol. This
reinitialization of gfc_charlen happens after expressions referencing
the associate symbol have been parsed, so that those expressions retain
the original gfc_charlen they have copied from the symbol.
At translation stage, the gfc_charlen for the associate symbol is set up
with the decl holding the actual length value, but the expressions have
retained the original gfc_charlen without any decl. So they need to
evaluate the character length, and this is where the ICE happens.
This change moves the reinitialization of gfc_charlen earlier at parsing
stage, so that at resolution stage the gfc_charlen can be retained as
it’s already not shared with any other symbol, and the expressions which
now share their gfc_charlen with the symbol are automatically updated
when the length decl is set up at translation stage. There is no need
any more to evaluate the character length as it has all the required
information, and the ICE doesn’t happen.
The first resolve.c hunk is necessary to avoid regressing on the
associate_35.f90 testcase.
--
PR fortran/104228
PR fortran/104570
gcc/fortran/ChangeLog:
* parse.c (parse_associate): Use a new distinct gfc_charlen if
the copied type has one whose length is not known to be
constant.
* resolve.c (resolve_assoc_var): Also create a new character
length for non-dummy associate targets. Reset charlen if it’s
shared with the associate target regardless of the expression
type. Don’t reinitialize charlen if it’s deferred.
* trans-stmt.c (trans_associate_var): Initialize character
length even if no temporary is used for the associate variable.
gcc/testsuite/ChangeLog:
* gfortran.dg/asan_associate_58.f90: New test.
* gfortran.dg/asan_associate_59.f90: New test.
* gfortran.dg/associate_58.f90: New test.
Richard Biener [Mon, 28 Mar 2022 08:07:53 +0000 (10:07 +0200)]
tree-optimization/105070 - annotate bit cluster tests with locations
The following makes sure to annotate the tests generated by
switch lowering bit-clustering with locations which otherwise
can be completely lost even at -O0.
2022-03-28 Richard Biener <rguenther@suse.de>
PR tree-optimization/105070
* tree-switch-conversion.h
(bit_test_cluster::hoist_edge_and_branch_if_true): Add location
argument.
* tree-switch-conversion.c
(bit_test_cluster::hoist_edge_and_branch_if_true): Annotate
cond with location.
(bit_test_cluster::emit): Annotate all generated expressions
with location.
Richard Biener [Wed, 9 Mar 2022 09:55:49 +0000 (10:55 +0100)]
middle-end/104786 - ICE with asm and VLA
The following fixes an ICE observed with a MEM_REF allows_mem asm
operand referencing a VLA. The following makes sure to not attempt
to go the temporary creation way when we cannot.
2022-03-09 Richard Biener <rguenther@suse.de>
PR middle-end/104786
* cfgexpand.c (expand_asm_stmt): Do not generate a copy
for VLAs without an upper size bound.
Richard Biener [Tue, 23 Nov 2021 09:11:41 +0000 (10:11 +0100)]
tree-optimization/103361 - fix unroll-and-jam direction vector handling
This properly uses lambda_int instead of truncating the direction
vector to int, which leads to spurious negative values.
2021-11-23 Richard Biener <rguenther@suse.de>
PR tree-optimization/103361
* gimple-loop-jam.c (adjust_unroll_factor): Use lambda_int
for the dependence distance.
* tree-data-ref.c (print_lambda_vector): Properly print a lambda_int.
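The truncation problem can be sketched in isolation. Here wide_distance is a hypothetical stand-in for a 64-bit type such as lambda_int, and the sketch assumes a 32-bit int with the usual wrap-around conversion.

```cpp
#include <cassert>
#include <cstdint>

using wide_distance = std::int64_t; // hypothetical stand-in for lambda_int

// Returns true if narrowing the distance to int keeps its sign.
bool sign_survives_as_int(wide_distance distance)
{
  int narrowed = static_cast<int>(distance);
  return (narrowed < 0) == (distance < 0);
}
```

A positive distance of 2^31 does not fit in a 32-bit int and comes out negative, which is the kind of value the direction vector handling must not see.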
Richard Biener [Thu, 20 Jan 2022 13:25:51 +0000 (14:25 +0100)]
middle-end/100786 - constant folding from incompatible alias
The following avoids an ICE when constant folding from variables
with aliases of different types. The issue appears in both
folding and CCP; FRE can do fancier things to still constant
fold cases where the load is smaller than the initializer, so
defer it to there.
2022-01-20 Richard Biener <rguenther@suse.de>
PR middle-end/100786
* gimple-fold.c (get_symbol_constant_value): Only return
values of compatible type to the symbol.
libphobos: Don't call free on the TLS array in the emutls destroy function.
Fixes a segfault seen on Darwin when a GC scan is run after a thread has
been destroyed, as the global emutlsArrays hash still has a reference
to the array itself and tries to iterate over all elements.
Setting the length to zero frees all allocated elements in the array,
and ensures that it is skipped when the _d_emutls_scan is called.
Fritz Reese [Tue, 19 Apr 2022 20:45:46 +0000 (16:45 -0400)]
fortran: Fix conv of UNION constructors [PR105310]
This fixes an ICE when a UNION is the (1+8*2^n)-th field in a DEC
STRUCTURE when compiled with -finit-derived -finit-local-zero.
The problem was that CONSTRUCTOR_APPEND_ELT from within gfc_conv_union_initializer
modified the vector pointer, but the pointer was passed by-value,
so the old pointer from the caller (gfc_conv_structure) pointed to freed
memory.
PR fortran/105310
gcc/fortran/ChangeLog:
* trans-expr.c (gfc_conv_union_initializer): Pass vec* by reference.
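The by-value versus by-reference distinction can be sketched with a toy container. The names int_vec, safe_push, append_range and release are hypothetical and are not GCC's vec API; the only property modelled is that growing the container replaces the object, so the pointer itself changes.

```cpp
#include <cassert>
#include <cstddef>

// Toy growable array.  Growing it allocates a new object and frees the
// old one, so any copy of the old pointer becomes dangling.
struct int_vec
{
  std::size_t length;
  std::size_t capacity;
  int *items;
};

// Append a value, replacing *and updating* the caller's pointer on growth.
void safe_push(int_vec *&vec, int value)
{
  if (vec == nullptr || vec->length == vec->capacity)
    {
      std::size_t old_length = vec ? vec->length : 0;
      std::size_t new_capacity = vec ? vec->capacity * 2 : 4;
      int_vec *grown = new int_vec{old_length, new_capacity,
				   new int[new_capacity]};
      for (std::size_t i = 0; i < old_length; ++i)
	grown->items[i] = vec->items[i];
      if (vec)
	{
	  delete[] vec->items;
	  delete vec;
	}
      vec = grown;
    }
  vec->items[vec->length++] = value;
}

// Helper that appends several values.  Taking the pointer by reference
// lets the caller see the pointer produced by any reallocation; taking it
// by value would leave the caller holding the freed object.
void append_range(int_vec *&vec, int first, int count)
{
  for (int i = 0; i < count; ++i)
    safe_push(vec, first + i);
}

void release(int_vec *vec)
{
  if (vec)
    {
      delete[] vec->items;
      delete vec;
    }
}
```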
Alex Coplan [Wed, 6 Apr 2022 10:16:10 +0000 (11:16 +0100)]
arm: Fix ICEs with compare-and-swap and -march=armv8-m.base [PR99977]
The PR shows two ICEs with __sync_bool_compare_and_swap and
-mcpu=cortex-m23 (equivalently, -march=armv8-m.base): one in LRA and one
later on, after the CAS insn is split.
The LRA ICE occurs because the
@atomic_compare_and_swap<CCSI:arch><SIDI:mode>_1 pattern attempts to tie
two output operands together (operands 0 and 1 in the third
alternative). LRA can't handle this, since it doesn't make sense for an
insn to assign to the same operand twice.
The later (post-splitting) ICE occurs because the expansion of the
cbranchsi4_scratch insn doesn't quite go according to plan. As it
stands, arm_split_compare_and_swap calls gen_cbranchsi4_scratch,
attempting to pass a register (neg_bval) to use as a scratch register.
However, since the RTL template has a match_scratch here,
gen_cbranchsi4_scratch ignores this argument and produces a scratch rtx.
Since this is all happening after RA, this is doomed to fail (and we get
an ICE about the insn not matching its constraints).
It seems that the motivation for the choice of constraints in the
atomic_compare_and_swap pattern comes from an attempt to satisfy the
constraints of the cbranchsi4_scratch insn. This insn requires the
scratch register to be the same as the input register in the case that
we use a larger negative immediate (one that satisfies J, but not L).
Of course, as noted above, LRA refuses to assign two output operands to
the same register, so this was never going to work.
The solution I'm proposing here is to collapse the alternatives to the
CAS insn (allowing the two output register operands to be matched to
different registers) and to ensure that the constraints for
cbranchsi4_scratch are met in arm_split_compare_and_swap. We do this by
inserting a move to ensure the source and destination registers match if
necessary (i.e. in the case of large negative immediates).
Another notable change here is that we only do:
emit_move_insn (neg_bval, const1_rtx);
for non-negative immediates. This is because the ADDS instruction used in
the negative case suffices to leave a suitable value in neg_bval: if the
operands compare equal, we don't take the branch (so neg_bval will be
set by the load exclusive). Otherwise, the ADDS will leave a nonzero
value in neg_bval, which will correctly signal that the CAS has failed
when it is later negated.
gcc/ChangeLog:
PR target/99977
* config/arm/arm.c (arm_split_compare_and_swap): Fix up codegen
with negative immediates: ensure we expand cbranchsi4_scratch
correctly and ensure we satisfy its constraints.
* config/arm/sync.md
(@atomic_compare_and_swap<CCSI:arch><NARROW:mode>_1): Don't
attempt to tie two output operands together with constraints;
collapse two alternatives.
(@atomic_compare_and_swap<CCSI:arch><SIDI:mode>_1): Likewise.
* config/arm/thumb1.md (cbranchsi4_neg_late): New.
gcc/testsuite/ChangeLog:
PR target/99977
* gcc.target/arm/pr99977.c: New test.
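A sketch of source code that reaches this path, assuming a compiler that provides the __sync builtins. The function name and the constant -200 are arbitrary illustrations and are not taken from the new test.

```cpp
#include <cassert>

// Compare-and-swap against a negative compile-time constant.  Returns
// true if *location held -200 and was replaced by `desired`.
bool swap_if_minus_200(int *location, int desired)
{
  return __sync_bool_compare_and_swap(location, -200, desired);
}
```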