Jakub Jelinek [Fri, 19 Mar 2021 21:54:31 +0000 (22:54 +0100)]
c: Fix up -Wunused-but-set-* warnings for _Atomics [PR99588]
As the following testcases show, compared to the -D_Atomic= case we have many
-Wunused-but-set-* warning false positives.
When an _Atomic variable/parameter is read, we call mark_exp_read on it in
convert_lvalue_to_rvalue, but build_atomic_assign does not.
For consistency with the non-_Atomic case where we mark_exp_read the lhs
for lhs op= ... but not for lhs = ..., this patch does that too.
But furthermore we need to pattern match the trees emitted by _Atomic store,
so that _Atomic store itself is not marked as being a variable read, but
when the result of the store is used, we mark it.
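A minimal hand-written sketch of the kind of false positive (hypothetical, not
one of the new Wunused-var-* tests): before this change,
-Wunused-but-set-variable could warn about x here even though the compound
assignment reads its value, unlike in the non-_Atomic case.

void
foo (void)
{
  _Atomic int x = 0;
  x += 1;
}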
2021-03-19 Jakub Jelinek <jakub@redhat.com>
PR c/99588
* c-typeck.c (mark_exp_read): Recognize what build_atomic_assign
with modifycode NOP_EXPR produces and mark the _Atomic var as read
if found.
(build_atomic_assign): For modifycode of NOP_EXPR, use COMPOUND_EXPRs
rather than STATEMENT_LIST. Otherwise call mark_exp_read on lhs.
Set TREE_SIDE_EFFECTS on the TARGET_EXPR.
* gcc.dg/Wunused-var-5.c: New test.
* gcc.dg/Wunused-var-6.c: New test.
Jakub Jelinek [Tue, 16 Mar 2021 20:17:44 +0000 (21:17 +0100)]
c++: Ensure correct destruction order of local statics [PR99613]
As mentioned in the PR, if the completions of construction of two local statics
are strongly ordered, their destructors should be run in the reverse order.
As we run __cxa_guard_release before calling __cxa_atexit, it is possible
that we have two threads that access two local statics in the same order
for the first time, one thread wins the __cxa_guard_acquire on the first
one but is rescheduled in between the __cxa_guard_release and __cxa_atexit
calls, then the other thread is scheduled and wins __cxa_guard_acquire
on the second one and calls __cxa_guard_release and __cxa_atexit, and only
afterwards the first thread calls its __cxa_atexit. This means a variable
whose construction was completed strongly after the completion of the other
one's could end up being destructed after that other variable, instead of
before it.
The following patch fixes that by swapping the __cxa_guard_release and
__cxa_atexit calls.
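A hedged illustration of the ordering requirement (hypothetical, not the
testcase from the PR): if one thread finishes constructing the static in a()
strictly before another thread finishes constructing the static in b(), then
at exit ~S for b()'s static must run before ~S for a()'s.

struct S { S (); ~S (); };
S &a () { static S s; return s; }
S &b () { static S s; return s; }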
2021-03-16 Jakub Jelinek <jakub@redhat.com>
PR c++/99613
* decl.c (expand_static_init): For thread guards, call __cxa_atexit
before calling __cxa_guard_release rather than after it. Formatting
fixes.
Jakub Jelinek [Thu, 4 Mar 2021 18:38:08 +0000 (19:38 +0100)]
expand: Fix ICE in store_bit_field_using_insv [PR93235]
The following testcase ICEs on aarch64. The problem is that
op0 is (subreg:HI (reg:HF ...) 0) and because we can't create a SUBREG of a
SUBREG and aarch64 doesn't have HImode insv, only SImode insv,
store_bit_field_using_insv tries to create (subreg:SI (reg:HF ...) 0)
which is not valid for the target and so gen_rtx_SUBREG ICEs.
The following patch fixes it by punting if the to be created SUBREG
doesn't validate, callers of store_bit_field_using_insv can handle
the fallback.
2021-03-04 Jakub Jelinek <jakub@redhat.com>
PR middle-end/93235
* expmed.c (store_bit_field_using_insv): Return false if xop0 is a
SUBREG and a SUBREG to op_mode can't be created.
Jakub Jelinek [Wed, 3 Mar 2021 15:12:23 +0000 (16:12 +0100)]
c++: Fix -fstrong-eval-order for operator &&, || and , [PR82959]
P0145R3 added
"However, the operands are sequenced in the order prescribed for the built-in
operator" rule for overloaded operator calls when using the operator syntax.
op_is_ordered follows that, but handled just the overloaded operators
added in that paper. The &&, || and comma operators already had the rule
that the lhs is sequenced before the rhs in C++98.
The following patch adds those cases to op_is_ordered.
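A hypothetical illustration (not the PR's testcase): with the operator syntax,
f () below must be fully evaluated before g (), even though operator&& is
overloaded.

struct B { bool v; };
B operator&& (B, B);
B f ();
B g ();

void
h ()
{
  f () && g ();
}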
2021-03-03 Jakub Jelinek <jakub@redhat.com>
PR c++/82959
* call.c (op_is_ordered): Handle TRUTH_ANDIF_EXPR, TRUTH_ORIF_EXPR
and COMPOUND_EXPR.
Jakub Jelinek [Wed, 3 Mar 2021 08:55:19 +0000 (09:55 +0100)]
c-family: Avoid ICE on va_arg [PR99324]
build_va_arg calls the middle-end mark_addressable, which e.g. requires that
cfun is non-NULL. The following patch instead calls c_common_mark_addressable_vec,
which is the c-family variant similar to the FE's c_mark_addressable and
cxx_mark_addressable, except that it doesn't error on addresses of register
variables. As taking the address is artificial for the .VA_ARG
ifn and goes away when that is lowered, it is a similar case to the vector
subscripting for which c_common_mark_addressable_vec was added.
2021-03-03 Jakub Jelinek <jakub@redhat.com>
PR c/99324
* c-common.c (build_va_arg): Call c_common_mark_addressable_vec
instead of mark_addressable. Fix a comment typo -
neutrallly -> neutrally.
Jakub Jelinek [Fri, 26 Feb 2021 09:43:28 +0000 (10:43 +0100)]
c++: Fix operator() lookup in lambdas [PR95451]
During name lookup, name-lookup.c uses:
  if (!(!iter->type && HIDDEN_TYPE_BINDING_P (iter))
      && (bool (want & LOOK_want::HIDDEN_LAMBDA)
          || !is_lambda_ignored_entity (iter->value))
      && qualify_lookup (iter->value, want))
    binding = iter->value;
Unfortunately, as the following testcase shows, this doesn't work in
generic lambdas, where we ICE on the auto b = ... lambda and reject the
auto d = ... lambda even though it should be valid. The problem
is that the binding doesn't have a FUNCTION_DECL with
LAMBDA_FUNCTION_P for the operator(), but an OVERLOAD containing a
TEMPLATE_DECL for such a FUNCTION_DECL.
The following patch fixes that in is_lambda_ignored_entity; another
possibility would be to do that before calling is_lambda_ignored_entity
in name-lookup.c.
2021-02-26 Jakub Jelinek <jakub@redhat.com>
PR c++/95451
* lambda.c (is_lambda_ignored_entity): Before checking for
LAMBDA_FUNCTION_P, use OVL_FIRST. Drop FUNCTION_DECL check.
Jakub Jelinek [Wed, 24 Feb 2021 11:10:25 +0000 (12:10 +0100)]
fold-const: Fix up ((1 << x) & y) != 0 folding for vectors [PR99225]
This optimization was written purely with scalar integers in mind, but it
can work fine even with vectors; we just can't use build_int_cst and
need to use build_one_cst instead.
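A hypothetical vector analogue of the folded pattern; the actual fix only
switches how the constant is built, this is just what such a vector case could
look like.

typedef int v4si __attribute__((vector_size (16)));

v4si
foo (v4si x, v4si y)
{
  return (x & ((v4si) {1, 1, 1, 1} << y)) != (v4si) {0, 0, 0, 0};
}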
2021-02-24 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/99225
* fold-const.c (fold_binary_loc) <case NE_EXPR>: In (x & (1 << y)) != 0
to ((x >> y) & 1) != 0 simplifications use build_one_cst instead of
build_int_cst (..., 1). Formatting fixes.
Jakub Jelinek [Tue, 23 Feb 2021 08:49:48 +0000 (09:49 +0100)]
fold-const: Fix ICE in fold_read_from_constant_string on invalid code [PR99204]
fold_read_from_constant_string and expand_expr_real_1 have code to optimize
constant reads from string (tree vs. rtl).
If the STRING_CST array type has a zero low bound, the index is fold converted to
sizetype and so compare_tree_int works fine, but if it has some other
low bound, it calls size_diffop_loc and that function creates a ssizetype
difference from the 2 sizetype operands. expand_expr_real_1 then uses
tree_fits_uhwi_p + compare_tree_int and so works fine, but fold-const.c
only checked whether the index is INTEGER_CST and called compare_tree_int, which
means for a negative index it will succeed and result in UB in the compiler.
2021-02-23 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/99204
* fold-const.c (fold_read_from_constant_string): Check that
tree_fits_uhwi_p (index) rather than just that index is INTEGER_CST.
Jakub Jelinek [Tue, 23 Feb 2021 08:30:18 +0000 (09:30 +0100)]
libstdc++: Fix up constexpr std::char_traits<char>::compare [PR99181]
Because of LWG 467, std::char_traits<char>::lt compares the values
cast to unsigned char rather than char, so even when char is signed
we get an unsigned comparison. std::char_traits<char>::compare uses
__builtin_memcmp and that works the same, but during constexpr evaluation
we were calling __gnu_cxx::char_traits<char_type>::compare. As
char_traits::lt is not virtual, __gnu_cxx::char_traits<char_type>::compare
used __gnu_cxx::char_traits<char_type>::lt rather than
std::char_traits<char>::lt and thus compared chars as signed if char is
signed.
This change fixes it by inlining __gnu_cxx::char_traits<char_type>::compare
into std::char_traits<char>::compare by hand, so that it calls the right
lt method.
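A minimal sketch of the issue (hypothetical, not the new 99181.cc test); with
char signed, the constexpr evaluation used to treat '\xe1' as negative and the
assertion below would fail, while the runtime __builtin_memcmp path compared it
as unsigned. Needs -std=c++17 or later for the constexpr compare.

#include <string>

constexpr char a[] = "a";
constexpr char b[] = "\xe1";
static_assert (std::char_traits<char>::compare (a, b, 1) < 0,
               "must compare as unsigned char");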
2021-02-23 Jakub Jelinek <jakub@redhat.com>
PR libstdc++/99181
* include/bits/char_traits.h (char_traits<char>::compare): For
constexpr evaluation don't call
__gnu_cxx::char_traits<char_type>::compare but do the comparison loop
directly.
* testsuite/21_strings/char_traits/requirements/char/99181.cc: New
test.
Jakub Jelinek [Fri, 19 Feb 2021 11:14:39 +0000 (12:14 +0100)]
tree-cfg: Fix up gimple_merge_blocks FORCED_LABEL handling [PR99034]
The verifiers require that DECL_NONLOCAL or EH_LANDING_PAD_NR
labels are always the first label if there is more than one label.
When merging blocks, we don't honor that though.
On the following testcase, we try to merge blocks:
  <bb 13> [count: 0]:
  <L2>:
  S::~S (&s);
and
  <bb 15> [count: 0]:
  <L0>:
  resx 1
where <L2> is a landing pad label and <L0> is a FORCED_LABEL. And the code puts
the FORCED_LABEL before the landing pad label, violating the verification
requirements.
The following patch fixes it by moving the FORCED_LABEL after the
DECL_NONLOCAL or EH_LANDING_PAD_NR label if it is the first label.
2021-02-19 Jakub Jelinek <jakub@redhat.com>
PR ipa/99034
* tree-cfg.c (gimple_merge_blocks): If bb a starts with eh landing
pad or non-local label, put FORCED_LABELs from bb b after that label
rather than before it.
Jakub Jelinek [Thu, 18 Feb 2021 21:17:52 +0000 (22:17 +0100)]
c: Fix ICE with -fexcess-precision=standard [PR99136]
The following testcase ICEs on i686-linux because c_finish_return wraps
the c_fully_folded retval back into an EXCESS_PRECISION_EXPR, but when the function
return type is void, we don't call convert_for_assignment on it (which would
fold it again); we just put the retval into the RETURN_EXPR's
operand, so nothing removes the EXCESS_PRECISION_EXPR anymore and during
gimplification we ICE as EXCESS_PRECISION_EXPR is not handled there.
This patch fixes it by not adding that EXCESS_PRECISION_EXPR in functions
returning void; the return value is ignored and all we need is to evaluate any
side-effects of the expression.
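A rough sketch of the kind of code involved (hypothetical, not the actual
testcase), to be compiled with -fexcess-precision=standard on i686: the
multiplication gets an EXCESS_PRECISION_EXPR wrapper and the ignored return
value in the void function used to keep it around until the ICE.

void
foo (float x, float y)
{
  return x * y;
}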
2021-02-18 Jakub Jelinek <jakub@redhat.com>
PR c/99136
* c-typeck.c (c_finish_return): Don't wrap retval into
EXCESS_PRECISION_EXPR in functions that return void.
Jakub Jelinek [Wed, 17 Feb 2021 14:03:25 +0000 (15:03 +0100)]
c++: Fix up build_zero_init_1 once more [PR99106]
My earlier build_zero_init_1 patch for flexible array members created
an empty CONSTRUCTOR. As the following testcase shows, that doesn't work
very well, because the middle-end doesn't expect CONSTRUCTOR elements with
incomplete type (which the empty CONSTRUCTOR at the end of the outer CONSTRUCTOR
had).
The following patch just doesn't add any CONSTRUCTOR for flexible array
members; it doesn't seem to be needed.
2021-02-17 Jakub Jelinek <jakub@redhat.com>
PR sanitizer/99106
* init.c (build_zero_init_1): For flexible array members just return
NULL_TREE instead of returning empty CONSTRUCTOR with non-complete
ARRAY_TYPE.
Jakub Jelinek [Mon, 15 Feb 2021 08:16:06 +0000 (09:16 +0100)]
match.pd: Fix up A % (cast) (pow2cst << B) simplification [PR99079]
The (mod @0 (convert?@3 (power_of_two_cand@1 @2))) simplification
uses a tree_nop_conversion_p (type, TREE_TYPE (@3)) condition, but I believe
it doesn't check what it was meant to check. For convert?@3,
TREE_TYPE (@3) is not the type of what it has been converted from, but
of what it has been converted to, which (because it is an operand
of a normal binary operation) needs to be equal or compatible to the type of
the modulo result and first operand - type.
I could fix that by using && tree_nop_conversion_p (type, TREE_TYPE (@1))
and be done with it, but actually most of the non-nop conversions are IMHO
ok and so we would regress those optimizations.
In particular, if we have say narrowing conversions (foo5 and foo6 in
the new testcase), I think we are fine, either the shift of the power of two
constant after narrowing conversion is still that power of two (or negation
of that) and then it will still work, or the result of narrowing conversion
is 0 and then we would have UB which we can ignore.
Similarly, widening conversions where the shift result is unsigned are fine,
or even widening conversions where the shift result is signed, but we sign
extend to a signed wider divisor, the problematic case of INT_MIN will
become x % (long long) INT_MIN and we can still optimize that to
x & (long long) INT_MAX.
What doesn't work is the case in the pr99079.c testcase, widening conversion
of a signed shift result to wider unsigned divisor, where if the shift
is negative, we end up with x % (unsigned long long) INT_MIN which is
x % 0xffffffff80000000ULL where the divisor is not a power of two and
we can't optimize that to x & 0x7fffffffULL.
So, the patch rejects only the single problematic case.
Furthermore, when the shift result is signed, we were introducing UB into
a program which previously didn't have any (well, left shift into the sign
bit is UB in some language/version pairs, but it is definitely valid in
C++20 - I wonder if I shouldn't move the gcc.c-torture/execute/pr99079.c
testcase to g++.dg/torture/pr99079.C and use -std=c++20) by adding that
subtraction of 1: x % (1 << 31) in C++20 is well defined, but
x & ((1 << 31) - 1) triggers UB on the subtraction.
So, the patch performs the subtraction in the unsigned type if it isn't
wrapping.
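Roughly the problematic shape described above (hypothetical, not the actual
pr99079.c): for s == 31 the divisor becomes 0xffffffff80000000ULL, which is
not a power of two, so folding this modulo to a mask would be wrong.

unsigned long long
foo (unsigned long long x, int s)
{
  return x % (unsigned long long) (1 << s);
}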
2021-02-15 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/99079
* match.pd (A % (pow2pcst << N) -> A & ((pow2pcst << N) - 1)): Remove
useless tree_nop_conversion_p (type, TREE_TYPE (@3)) check. Instead
require both type and TREE_TYPE (@1) to be integral types and either
type having smaller or equal precision, or TREE_TYPE (@1) being
unsigned type, or type being signed type. If TREE_TYPE (@1)
doesn't have wrapping overflow, perform the subtraction of one in
unsigned type.
* gcc.dg/fold-modpow2-2.c: New test.
* gcc.c-torture/execute/pr99079.c: New test.
Jakub Jelinek [Thu, 11 Feb 2021 16:24:17 +0000 (17:24 +0100)]
c++: Fix zero initialization of flexible array members [PR99033]
array_type_nelts returns error_mark_node for the type of a flexible array member,
and build_zero_init_1 was placing that error_mark_node into the CONSTRUCTOR,
on which e.g. varasm ICEs. I think there is nothing erroneous about zero
initialization of flexible array members though; such arrays should simply
get no elements, like they do when such classes are constructed (except
when some larger initializer comes from an explicit initializer).
So, this patch handles [] arrays in zero initialization like [0] arrays
and fixes the handling of the [0] arrays - the
tree_int_cst_equal (max_index, integer_minus_one_node) check
didn't do what it was thought to do: max_index is typically an unsigned
integer (sizetype) and so it is never equal to a -1.
What the patch doesn't do, and what maybe would be desirable, is to have the
recursive callers, when error_mark_node is returned for other reasons, not
stick that into the CONSTRUCTOR but return error_mark_node instead. But I
don't have a testcase where that would be needed right now.
2021-02-11 Jakub Jelinek <jakub@redhat.com>
PR c++/99033
* init.c (build_zero_init_1): Handle zero initialization of
flexible array members like initialization of [0] arrays.
Use integer_minus_onep instead of comparison to integer_minus_one_node
and integer_zerop instead of comparison against size_zero_node.
Formatting fixes.
Jakub Jelinek [Wed, 10 Feb 2021 18:52:37 +0000 (19:52 +0100)]
varasm: Fix ICE with -fsyntax-only [PR99035]
My FE change from 2 years ago uses TREE_ASM_WRITTEN in -fsyntax-only
mode more aggressively to avoid "expanding" functions multiple times.
With -fsyntax-only nothing is really expanded, so I think it is acceptable
to adjust the assert and allow declare_weak at any time; with -fsyntax-only
we know it is during parsing only anyway.
2021-02-10 Jakub Jelinek <jakub@redhat.com>
PR c++/99035
* varasm.c (declare_weak): For -fsyntax-only, allow even
TREE_ASM_WRITTEN function decls.
Jakub Jelinek [Wed, 10 Feb 2021 09:34:58 +0000 (10:34 +0100)]
openmp: Temporarily disable into_ssa when gimplifying OpenMP reduction clauses [PR99007]
gimplify_scan_omp_clauses was already calling gimplify_expr with false as
the last argument to make sure it is not an SSA_NAME, but as the testcases show,
that is not enough: SSA_NAME temporaries created during that gimplification
can be reused too, and we can't allow SSA_NAMEs to be used across OpenMP
region boundaries, as we can only firstprivatize decls.
Fixed by temporarily disabling into_ssa.
2021-02-10 Jakub Jelinek <jakub@redhat.com>
PR middle-end/99007
* gimplify.c (gimplify_scan_omp_clauses): For MEM_REF on reductions,
temporarily disable gimplify_ctxp->into_ssa around gimplify_expr
calls.
* g++.dg/gomp/pr99007.C: New test.
* gcc.dg/gomp/pr99007-1.c: New test.
* gcc.dg/gomp/pr99007-2.c: New test.
* gcc.dg/gomp/pr99007-3.c: New test.
Jakub Jelinek [Fri, 5 Feb 2021 09:22:07 +0000 (10:22 +0100)]
c++: Fix ICE with structured binding initialized to incomplete array [PR97878]
We ICE on the following testcase with an incomplete array a: on auto [b] { a };
without giving any kind of diagnostics, and on auto [c] = a; during error-recovery.
The problem is that we get too far through check_initializer and e.g.
store_init_value -> constexpr stuff can't deal with incomplete array types.
As the type of the structured binding artificial variable is always deduced,
I think it is easiest to diagnose this early; even for array types
we'll need the deduced type to be complete rather than just its element
type.
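A reconstruction from the description above (the exact declaration of a is a
guess):

extern int a[];

void
foo ()
{
  auto [b] { a };	// used to ICE without any diagnostics
  auto [c] = a;		// used to ICE during error-recovery
}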
2021-02-05 Jakub Jelinek <jakub@redhat.com>
PR c++/97878
* decl.c (check_array_initializer): For structured bindings, require
the array type to be complete.
Jakub Jelinek [Wed, 3 Feb 2021 08:09:26 +0000 (09:09 +0100)]
ifcvt: Avoid ICEs trying to force_operand random RTL [PR97487]
As the testcase shows, RTL ifcvt can throw random RTL (whatever it found in
some insns) at expand_binop or expand_unop and expects it to do something
(and then will check if it created valid insns and punts if not).
These functions, if the operands don't match, in the end try to
copy_to_mode_reg the operands, which does
  if (!general_operand (x, VOIDmode))
    x = force_operand (x, temp);
but force_operand is far from handling all possible RTL; it will ICE for
the more unusual RTL codes. It basically handles just simple arithmetic and
unary RTL operations that have an optab, and
expand_simple_binop/expand_simple_unop ICE on others.
The following patch fixes it by adding some operand verification (whether
there is a hope that copy_to_mode_reg will succeed on those). It is added
both to noce_emit_move_insn (not needed for this exact testcase,
that function simply tries to recog the insn as is and if it fails,
handles some simple binop/unop cases; the patch performs the verification
of their operands) and noce_try_sign_mask.
2021-02-03 Jakub Jelinek <jakub@redhat.com>
PR middle-end/97487
* ifcvt.c (noce_can_force_operand): New function.
(noce_emit_move_insn): Use it.
(noce_try_sign_mask): Likewise. Formatting fix.
* gcc.dg/pr97487-1.c: New test.
* gcc.dg/pr97487-2.c: New test.
Jakub Jelinek [Fri, 29 Jan 2021 09:30:09 +0000 (10:30 +0100)]
expand: Fix up find_bb_boundaries [PR98331]
When expansion emits some control flow insns etc. inside of a former GIMPLE
basic block, find_bb_boundaries needs to split it into multiple basic
blocks.
The code needs to ignore debug insns in the decisions about how many splits to
do or where among the non-debug insns the splits should be done, but it can
decide where to put debug insns if they can be kept, and otherwise it throws
them away (they can't stay outside of basic blocks).
On the following testcase, we end up in the bb from expander with
  control flow insn
  debug insns
  barrier
  some other insn
(the some other insn is effectively dead after __builtin_unreachable and
we'll optimize that out later).
Without debug insns, we'd do the split when encountering some other insn
and split after PREV_INSN (some other insn), i.e. after barrier (and the
splitting code then moves the barrier in between basic blocks).
But if there are debug insns, we actually split before the first debug insn
that appeared after the control flow insn, i.e. right after the control flow insn,
and get a basic block that starts with debug insns and then has a barrier
in the middle that nothing moves out of the bb. This leads to ICEs, and
even if it didn't, to behavior different from -g0.
The reason for treating debug insns that way is a different case, e.g.
  control flow insn
  debug insns
  some other insn
or even
  control flow insn
  barrier
  debug insns
  some other insn
where splitting before the first such debug insn allows us to keep them
while otherwise we would have to drop them on the floor, and in those
situations we behave the same with -g0 and -g.
So, the following patch fixes it by resetting debug_insn not just when
splitting the blocks (it is set only after seeing a control flow insn and
before splitting for it if needed), but also when seeing a barrier.
This effectively means we always throw away debug insns after a control
flow insn and before the following barrier if any, but there is no way around
that: the control flow insn must be the last in the bb (BB_END) with the BARRIER
after it, and debug insns aren't allowed outside of a bb.
We still handle the other cases fine (when there is no barrier or when
debug insns appear only after the barrier).
2021-01-29 Jakub Jelinek <jakub@redhat.com>
PR debug/98331
* cfgbuild.c (find_bb_boundaries): Reset debug_insn when seeing
a BARRIER.
Jakub Jelinek [Thu, 28 Jan 2021 15:13:11 +0000 (16:13 +0100)]
c++: Fix up handling of register ... asm ("...") vars in templates [PR33661, PR98847]
As the testcase shows, for vars appearing in templates, we don't attach
the asm spec string to the pattern decls, nor pass it back to cp_finish_decl
during instantiation.
The following patch does that.
2021-01-28 Jakub Jelinek <jakub@redhat.com>
PR c++/33661
PR c++/98847
* decl.c (cp_finish_decl): For register vars with asmspec in templates
call set_user_assembler_name and set DECL_HARD_REGISTER.
* pt.c (tsubst_expr): When instantiating DECL_HARD_REGISTER vars,
pass asmspec_tree to cp_finish_decl.
Jakub Jelinek [Tue, 26 Jan 2021 13:48:26 +0000 (14:48 +0100)]
aarch64: Tighten up checks for ubfix [PR98681]
The testcase in the patch doesn't assemble, because the instruction requires
that the penultimate operand (lsb) range is [0, 32] (or [0, 64]) and the last
operand's range is [1, 32 - lsb] (or [1, 64 - lsb]).
The INTVAL (shft_amnt) < GET_MODE_BITSIZE (mode) check will accept the lsb operand
in the range [MIN, 32] (or [MIN, 64]) and then we invoke UB in the
compiler and sometimes it will make it through.
The patch changes all the INTVAL uses in that function to UINTVAL,
which isn't strictly necessary, but can be done (e.g. after the
UINTVAL (shft_amnt) < GET_MODE_BITSIZE (mode) check we know it is not
negative and thus INTVAL (shft_amnt) and UINTVAL (shft_amnt) behave the
same). But I had to add an INTVAL (mask) > 0 check in that case, otherwise we
risk (hypothetically) emitting an instruction that doesn't assemble.
The problem is with masks that have the MSB set. While the instruction
can handle those, e.g.
  ubfiz w1, w0, 13, 19
will do
  (w0 << 13) & 0xffffe000
in RTL we represent SImode constants with the MSB set as negative HOST_WIDE_INTs,
so it will actually be HOST_WIDE_INT_C (0xffffffffffffe000), and
the instruction uses %P3 to print the last operand, which calls
  asm_fprintf (f, "%u", popcount_hwi (INTVAL (x)))
to print that. But that will not print 19, but 51 instead, because it also
includes all the copies of the sign bit.
Not supporting those masks with MSB set isn't a big loss though, they really
shouldn't appear normally, as both GIMPLE and RTL optimizations should
optimize those away (one isn't masking any bits off with such masks, so
just w0 << 13 will do too).
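A rough source-level form of the shift-and-mask from the ubfiz example above
(hypothetical); with the MSB-set SImode mask this combination is simply no
longer matched by aarch64_mask_and_shift_for_ubfiz_p.

unsigned int
foo (unsigned int x)
{
  return (x << 13) & 0xffffe000;
}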
2021-01-26 Jakub Jelinek <jakub@redhat.com>
PR target/98681
* config/aarch64/aarch64.c (aarch64_mask_and_shift_for_ubfiz_p):
Use UINTVAL (shft_amnt) and UINTVAL (mask) instead of INTVAL (shft_amnt)
and INTVAL (mask). Add && INTVAL (mask) > 0 condition.
Jakub Jelinek [Sat, 23 Jan 2021 08:41:58 +0000 (09:41 +0100)]
rs6000: Fix up __m64 typedef in mmintrin.h [PR97301]
The x86 __m64 type is defined as:
/* The Intel API is flexible enough that we must allow aliasing with other
vector types, and their scalar components. */
typedef int __m64 __attribute__ ((__vector_size__ (8), __may_alias__));
and so matches the comment above it in that reads and stores through
pointers to __m64 can alias anything.
But in the rs6000 headers that is the case only for __m128, but not __m64.
The following patch adds that attribute, which fixes the
FAIL: gcc.target/powerpc/sse-movhps-1.c execution test
FAIL: gcc.target/powerpc/sse-movlps-1.c execution test
regressions that appeared when Honza improved ipa-modref.
Jakub Jelinek [Fri, 22 Jan 2021 18:03:23 +0000 (19:03 +0100)]
c++: Fix up ubsan false positives on references [PR95693]
Alex's 2-year-old change to build_zero_init_1 to return a NULL pointer with
reference type for references breaks the sanitizers: the assignment of NULL
to a reference-typed member is then instrumented before it is overwritten
with a non-NULL address later on.
That change has been done to fix error recovery ICE during
process_init_constructor_record, where we:
  if (TYPE_REF_P (fldtype))
    {
      if (complain & tf_error)
        error ("member %qD is uninitialized reference", field);
      else
        return PICFLAG_ERRONEOUS;
    }
a few lines earlier, but then continue and ICE when build_zero_init returns
NULL.
The following patch reverts the build_zero_init_1 change and instead creates
the NULL constants with reference type during the error recovery.
The pr84593.C testcase Alex's change was fixing still works as before.
2021-01-22 Jakub Jelinek <jakub@redhat.com>
PR sanitizer/95693
* init.c (build_zero_init_1): Revert the 2018-03-06 change to
return build_zero_cst for reference types.
* typeck2.c (process_init_constructor_record): Instead call
build_zero_cst here during error recovery instead of build_zero_init.
Jakub Jelinek [Fri, 22 Jan 2021 10:50:18 +0000 (11:50 +0100)]
match.pd: Replace incorrect simplifications into copysign [PR90248]
In the PR Andrew said he has implemented a simplification that has been
added to LLVM, but that actually is not true: what is in there are
X * (X cmp 0.0 ? +-1.0 : -+1.0) simplifications into +-abs(X),
but what has been added into GCC are (X cmp 0.0 ? +-1.0 : -+1.0)
simplifications into copysign(1, +-X) and then
X * copysign (1, +-X) into +-abs (X).
The problem is with the (X cmp 0.0 ? +-1.0 : -+1.0) simplifications;
they don't work correctly when X is zero.
E.g.
(X > 0.0 ? 1.0 : -1.0)
is -1.0 when X is either -0.0 or 0.0, but copysign will make it return
1.0 for 0.0 and -1.0 only for -0.0.
(X >= 0.0 ? 1.0 : -1.0)
is 1.0 when X is either -0.0 or 0.0, but copysign will make it return
still 1.0 for 0.0 and -1.0 for -0.0.
The simplifications were guarded by !HONOR_SIGNED_ZEROS, but as discussed in
the PR, that option doesn't mean that -0.0 will never appear as an operand
of some operation; it is hard to guarantee that without the compiler adding
canonicalizations of -0.0 to 0.0 after most of the operations, which would make
it very slow. It only means the user asserts that they don't care whether the
result of operations will be 0.0 or -0.0. Not to mention that some of the
transformations are incorrect even for positive 0.0.
So, instead of those simplifications this patch recognizes patterns where
those ?: expressions are multiplied by X, directly into +-abs.
That works fine even for 0.0 and -0.0 (as long as we don't care about
whether the result is exactly 0.0 or -0.0 in those cases), because
whether the result of copysign is -1.0 or 1.0 doesn't matter when it is
multiplied by 0.0 or -0.0.
As a follow-up, maybe we should add the simplification mentioned in the PR,
in particular doing copysign by hand through
VIEW_CONVERT_EXPR <int, float_X> < 0 ? -float_constant : float_constant
into copysign (float_constant, float_X). But I think that would need to be
done in phiopt.
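A sketch of the multiply-by-X forms that are now rewritten directly to +-abs
(hypothetical, not the PR's testcases; still subject to the usual FP-math flags
guarding such transforms):

double
f (double x)
{
  return x * (x > 0.0 ? 1.0 : -1.0);	/* -> abs-style result */
}

double
g (double x)
{
  return x * (x < 0.0 ? 1.0 : -1.0);	/* -> negated abs */
}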
Jakub Jelinek [Sat, 9 Jan 2021 09:49:38 +0000 (10:49 +0100)]
tree-cfg: Allow enum types as result of POINTER_DIFF_EXPR [PR98556]
As conversions between signed integers and signed enums with the same
precision are useless in GIMPLE, it seems strange that we require that
POINTER_DIFF_EXPR result must be INTEGER_TYPE.
If we really wanted to require that, we'd need to change the gimplifier
to ensure that, which isn't the case as the following testcase shows.
What is going on during the gimplification is that when we have the
(enum T) (p - q) cast, it is stripped through
/* Strip away as many useless type conversions as possible
at the toplevel. */
STRIP_USELESS_TYPE_CONVERSION (*expr_p);
and when the MODIFY_EXPR is gimplified, the *to_p has enum T type,
while *from_p has intptr_t type and as there is no conversion in between,
we just create GIMPLE_ASSIGN from that.
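A reconstruction of that situation (hypothetical; the enum's fixed underlying
type is chosen here so that it has pointer precision):

enum T : __PTRDIFF_TYPE__ {};

T
foo (char *p, char *q)
{
  return (T) (p - q);
}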
2021-01-09 Jakub Jelinek <jakub@redhat.com>
PR c++/98556
* tree-cfg.c (verify_gimple_assign_binary): Allow lhs of
POINTER_DIFF_EXPR to be any integral type.
Jakub Jelinek [Thu, 31 Dec 2020 10:06:56 +0000 (11:06 +0100)]
wide-int: Fix wi::to_mpz [PR98474]
The following testcase is miscompiled, because niter analysis miscomputes
the number of iterations to 0.
The problem is that niter analysis uses mpz_t (wonder why, wouldn't
widest_int do the same job?) and when wi::to_mpz is called e.g. on the
TYPE_MAX_VALUE of __uint128_t, it initializes the mpz_t result with a wrong
value.
wi::to_mpz has code to handle negative wide_ints in signed types by
inverting all bits, importing that to mpz and complementing it, which is fine,
but it doesn't correctly handle the case when the wide_int's len (times
HOST_BITS_PER_WIDE_INT) is smaller than the precision and wi::neg_p.
E.g. the 0xffffffffffffffffffffffffffffffff TYPE_MAX_VALUE is represented
in wide_int as 0xffffffffffffffff with len 1, and wi::to_mpz would create
an mpz_t value of just 0xffffffffffffffff from that.
This patch handles it by adding the needed -1 host wide int words (and it
also has code to deal with precisions that aren't a multiple of
HOST_BITS_PER_WIDE_INT).
2020-12-31 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/98474
* wide-int.cc (wi::to_mpz): If wide_int has MSB set, but type
is unsigned and excess negative, append set bits after len until
precision.
Jakub Jelinek [Mon, 21 Dec 2020 23:01:34 +0000 (00:01 +0100)]
gimplify: Gimplify value in gimplify_init_ctor_eval_range [PR98353]
gimplify_init_ctor_eval_range wasn't gimplifying value, so if it wasn't
a gimple val, verification at the end of gimplification would ICE (or with
release checking some random pass later on would ICE or misbehave).
2020-12-21 Jakub Jelinek <jakub@redhat.com>
PR c++/98353
* gimplify.c (gimplify_init_ctor_eval_range): Gimplify value before
storing it into cref.
Jakub Jelinek [Fri, 18 Dec 2020 20:43:20 +0000 (21:43 +0100)]
openmp: Don't optimize shared to firstprivate on task with depend clause
The attached testcase is miscompiled, because we optimize shared clauses
to firstprivate when the task body can't modify the variable, even when the
task has a depend clause. That is wrong, because firstprivate means the
variable will be copied immediately when the task is created, while with a
depend clause some other task might change it later, before the dependencies
are satisfied, and the task should observe the value only after that change.
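A hedged sketch of that kind of miscompilation (not the attached testcase):
the second task must observe the value stored by the first one, so x must not
be firstprivatized at task creation time.

int
main ()
{
  int x = 1;
  #pragma omp parallel
  #pragma omp single
  {
    #pragma omp task depend(out: x) shared(x)
    x = 2;
    #pragma omp task depend(in: x) shared(x)
    if (x != 2)
      __builtin_abort ();
    #pragma omp taskwait
  }
  return 0;
}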
2020-12-18 Jakub Jelinek <jakub@redhat.com>
* gimplify.c (struct gimplify_omp_ctx): Add has_depend member.
(gimplify_scan_omp_clauses): Set it to true if OMP_CLAUSE_DEPEND
appears on OMP_TASK.
(gimplify_adjust_omp_clauses_1, gimplify_adjust_omp_clauses): Force
GOVD_WRITTEN on shared variables if task construct has depend clause.
Jakub Jelinek [Sat, 12 Dec 2020 07:36:02 +0000 (08:36 +0100)]
openmp, openacc: Fix up handling of data regions [PR98183]
While the data regions (target data and the OpenACC counterparts) aren't
standalone directives, unlike most other OpenMP/OpenACC constructs
we allow (apparently as an extension) exceptions and goto out of
the block. During gimplification we place an *end* call into a finally
block so that it is reached even on exceptions or goto out etc.
During the omplower pass we then add a paired #pragma omp return for them,
but because the region is not SESE due to the exceptions, we can end up
with #pragma omp return appearing only conditionally in the CFG etc.,
which the ompexp pass can't handle.
For the ompexp pass we actually don't care about the end part or about
target data nesting, so we can treat it as a standalone directive.
2020-12-12 Jakub Jelinek <jakub@redhat.com>
PR middle-end/98183
* omp-low.c (lower_omp_target): Don't add OMP_RETURN for
data regions.
* omp-expand.c (expand_omp_target): Don't try to remove
OMP_RETURN for data regions.
(build_omp_regions_1, omp_make_gimple_edges): Don't expect
OMP_RETURN for data regions.
* gcc.dg/gomp/pr98183.c: New test.
* gcc.dg/goacc/pr98183.c: New test.
Jakub Jelinek [Fri, 4 Dec 2020 11:18:21 +0000 (12:18 +0100)]
debug: Fix another vector DECL_MODE ICE [PR98100]
The PR88587 fix changes DECL_MODE of vars with vector type during inlining/cloning
when the vars are copied, so that their DECL_MODE matches their TYPE_MODE in
the new function. Unfortunately, the following testcase still ICEs: the var
isn't really used in the new function and so it isn't copied, but becomes
just a nonlocalized var. So we can't adjust its DECL_MODE, because it
appears in multiple functions and needs different modes in them.
The following patch changes the DEBUG_INSN creation to use TYPE_MODE instead
of DECL_MODE for vars with vector types.
2020-12-04 Jakub Jelinek <jakub@redhat.com>
PR target/98100
* cfgexpand.c (expand_gimple_basic_block): For vars with
vector type, use TYPE_MODE rather than DECL_MODE.
Jakub Jelinek [Wed, 2 Dec 2020 23:29:46 +0000 (00:29 +0100)]
dwarf2out: Fix up add_scalar_info not to create invalid DWARF
As discussed in https://sourceware.org/bugzilla/show_bug.cgi?id=26987 ,
for very large bounds (which don't fit into HOST_WIDE_INT) GCC emits invalid
DWARF.
In DWARF2, DW_AT_{lower,upper}_bound were constant reference class.
In DWARF3 they are block constant reference and the
Static and Dynamic Properties of Types
chapter says:
"For a block, the value is interpreted as a DWARF expression; evaluation of the expression
yields the value of the attribute."
In DWARF4/5 they are constant exprloc reference class.
Now, for add_AT_wide we use DW_FORM_data16 (valid in constant class)
when -gdwarf-5, but otherwise just use DW_FORM_block1, which is not constant
class, but block.
For DWARF3 this means emitting clearly invalid DWARF, because the
DW_FORM_block1 should contain a DWARF expression, not random bytes
containing the constant directly.
For DWARF2/DWARF4/5 it could be considered a GNU extension, but a very badly
designed one when it means something different in DWARF3.
The following patch uses add_AT_wide only if we know we'll be using
DW_FORM_data16, and otherwise wastes 2 extra bytes and emits in there
DW_OP_implicit_value <size> before the constant.
2020-12-03 Jakub Jelinek <jakub@redhat.com>
* dwarf2out.c (add_scalar_info): Only use add_AT_wide for 128-bit
constants and only in dwarf-5 or later, where DW_FORM_data16 is
available. Otherwise use DW_FORM_block*/DW_FORM_exprloc with
DW_OP_implicit_value to describe the constant.
Jakub Jelinek [Tue, 1 Dec 2020 09:44:40 +0000 (10:44 +0100)]
x86_64: Fix up -fpic -mcmodel=large -fno-plt [PR98063]
On the following testcase with -fpic -mcmodel=large -fno-plt we emit
  call puts@GOTPCREL(%rip)
but that is not really appropriate for CM_LARGE_PIC: the .text can be larger
than 2GB in that case and the .got slot can be further away from %rip than
what can fit into the signed 32-bit immediate.
The following patch computes the address of the .got slot the way it is
computed for that model for function pointer loads, and calls that.
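A trivial sketch (hypothetical, not the testcase): compile with -fpic
-mcmodel=large -fno-plt; the call below must go through an address computed
for the large PIC model rather than a %rip-relative GOT slot.

#include <stdio.h>

int
main ()
{
  puts ("hello");
  return 0;
}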
Jakub Jelinek [Tue, 24 Nov 2020 08:04:28 +0000 (09:04 +0100)]
openmp: Fix C ICE on OpenMP atomics
c_parser_binary_expression was using build2 to create a temporary holder
for binary expression that c_parser_atomic and c_finish_omp_atomic can then
handle. The latter performs then all the needed checking.
Unfortunately, build2 performs some checking too, e.g. PLUS_EXPR vs.
POINTER_PLUS_EXPR or matching types of the arguments, nothing we can guarantee
at the parsing time. So we need something like C++ build_min_nt*. This
patch implements that inline.
2020-11-24 Jakub Jelinek <jakub@redhat.com>
PR c/97958
* c-parser.c (c_parser_binary_expression): For omp atomic binary
expressions, use make_node instead of build2 to avoid checking build2
performs.
Jakub Jelinek [Sat, 14 Nov 2020 08:14:19 +0000 (09:14 +0100)]
dwarf2: Emit DW_TAG_unspecified_parameters even in late DWARF [PR97599]
Aldy's PR71855 fix avoided emitting multiple redundant
DW_TAG_unspecified_parameters sub-DIEs of a single DIE by restricting
it to early dwarf only. That unfortunately means if we need to emit
another DIE for the function (whether it is for LTO, or e.g. because of
IPA cloning), we don't emit DW_TAG_unspecified_parameters, it remains
solely in the DW_AT_abstract_origin's referenced DIE.
But DWARF consumers don't really use DW_TAG_unspecified_parameters
from there; just like we duplicate the DW_TAG_formal_parameter sub-DIEs even
in the clones (because either they have some more specific location, or e.g.
a function clone could have fewer or different argument types etc.),
consumers need to assume that an originally stdarg function need not be stdarg
later etc.
Unfortunately, while for DW_TAG_formal_parameter sub-DIEs we can use the
hash tables to look up the PARM_DECLs if we already have the DIEs, for
DW_TAG_unspecified_parameters we don't have an easy way to look it up.
The following patch handles it by trying to figure out if we are creating a
fresh new DIE (in that case we add DW_TAG_unspecified_parameters if it is
stdarg), or if gen_subprogram_die is called again on a pre-existing DIE
to fill in some further details (then it will not touch it).
Except for LTO, subr_die != old_die would be good enough, but unfortunately
for LTO the new DIE that will refer to the early-dwarf-created DIE is created
on the fly during lookup_decl_die. So the patch tracks whether the DIE had
no children before any children are added to it.
2020-11-14 Jakub Jelinek <jakub@redhat.com>
PR debug/97599
* dwarf2out.c (gen_subprogram_die): Call
gen_unspecified_parameters_die even if not early dwarf, but only
if subr_die is a newly created DIE.
Jakub Jelinek [Tue, 3 Nov 2020 20:42:51 +0000 (21:42 +0100)]
c++: Don't try to parse a function declaration as deduction guide [PR97663]
While these function declarations have NULL decl_specifiers->type,
they still have type specifiers specified, from which the default int
return type is formed, so we shouldn't try to parse those as
deduction guides.
2020-11-03 Jakub Jelinek <jakub@redhat.com>
PR c++/97663
* parser.c (cp_parser_init_declarator): Don't try to parse
C++17 deduction guides if there are any type specifiers even when
type is NULL.
Jakub Jelinek [Wed, 28 Oct 2020 09:24:20 +0000 (10:24 +0100)]
wide-int: Fix up set_bit_large
> >> wide_int new_lb = wi::set_bit (r.lower_bound (0), 127)
> >>
> >> and creates the value:
> >>
> >> p new_lb
> >> {<wide_int_storage> = {val = {-65535, -1, 0}, len = 2, precision = 128},
> >> static is_sign_extended = true}
> >
> > This is non-canonical and so invalid, if the low HWI has the MSB set
> > and the high HWI is -1, it should have been just
> > val = {-65535}, len = 1, precision = 128}
> >
> > I guess the bug is that wi::set_bit_large doesn't call canonize.
>
> Yeah, looks like a micro-optimisation gone wrong.
2020-10-28 Jakub Jelinek <jakub@redhat.com>
* wide-int.cc (wi::set_bit_large): Call canonize unless setting
msb bit and clearing bits above it.
Jakub Jelinek [Tue, 13 Oct 2020 17:13:26 +0000 (19:13 +0200)]
combine: Fix up simplify_shift_const_1 for nested ROTATEs [PR97386]
The following testcases are miscompiled (the first one since my improvements
to rotate discovery on GIMPLE, the other one for many years) because the
combiner optimizes nested ROTATEs with a narrowing SUBREG in between (i.e.
the outer rotate is performed in a shorter precision than the inner one) into
just one ROTATE of the rotated constant. While that (under certain
conditions) can work for shifts, it can't work for rotates, where we can only
do that for rotates of the same precision.
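A hypothetical shape of the problem (not the actual testcases): a rotate in a
narrower mode applied to the truncated result of a wider rotate, which must
not be collapsed into a single rotate.

unsigned char
foo (unsigned int x)
{
  unsigned int r32 = (x << 3) | (x >> 29);		/* 32-bit rotate */
  unsigned char b = (unsigned char) r32;
  return (unsigned char) ((b << 2) | (b >> 6));	/* 8-bit rotate */
}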
2020-10-13 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/97386
* combine.c (simplify_shift_const_1): Don't optimize nested ROTATEs if
they have different modes.
* gcc.c-torture/execute/pr97386-1.c: New test.
* gcc.c-torture/execute/pr97386-2.c: New test.
Jakub Jelinek [Thu, 8 Oct 2020 09:10:34 +0000 (11:10 +0200)]
openmp: Set cfun->calls_alloca when needed in OpenMP outlined regions [PR97294]
The following testcase FAILs because we don't mark the child OpenMP function
with cfun->calls_alloca when it does call alloca. When optimizing, during DCE we
reset those flags and recompute them again, but with -O0 DCE is not performed.
Fixed by calling notice_special_calls when moving statements to the child function.
cfun->calls_alloca is normally set during gimplification, and most of the
alloca calls omp-low.c creates do go through the gimplifier, but one spot didn't
and built the gcall directly, so that one needs to set calls_alloca too.
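A hedged sketch of the first situation (not the actual testcase): at -O0 the
outlined parallel body ends up containing the alloca call, so the child
function needs calls_alloca set when the statement is moved into it.

void
foo (int n)
{
  #pragma omp parallel
  {
    char *p = (char *) __builtin_alloca (n);
    p[0] = 0;
  }
}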
2020-10-08 Jakub Jelinek <jakub@redhat.com>
PR sanitizer/97294
* tree-cfg.c (move_block_to_fn): Call notice_special_calls on
call stmts being moved into dest_cfun.
* omp-low.c (lower_rec_input_clauses): Set cfun->calls_alloca when
adding __builtin_alloca_with_align call without gimplification.
Jakub Jelinek [Sat, 26 Sep 2020 08:07:41 +0000 (10:07 +0200)]
powerpc, libcpp: Fix gcc build with clang on power8 [PR97163]
libcpp has two specialized altivec implementations of search_line_fast,
one for power8+ and the other one otherwise.
Both use __attribute__((altivec(vector))) and the GCC builtins rather than
altivec.h and the APIs from there, which is fine, but should be restricted
to when libcpp is built with GCC, so that it can be relied on.
The second elif is already guarded with a GCC_VERSION >= 4.5 requirement,
and thus e.g. when built with clang it isn't picked, but the first one was
just guarded with the _ARCH_PWR8/Altivec macros,
and so according to the bug reporter clang fails miserably on that.
The following patch fixes that by adding the same GCC_VERSION requirement
as the second version. I don't know where the 4.5 in there comes from, and
the exact version doesn't matter that much, as long as it is above the 4.2 that
clang pretends to be and smaller or equal to 4.8, the oldest gcc we
support as a bootstrap compiler ATM.
Furthermore, the patch fixes the comment; the version it is talking about is
not the pre-GCC 5 one, but actually the GCC 5+ one.
2020-09-26 Jakub Jelinek <jakub@redhat.com>
PR bootstrap/97163
* lex.c (search_line_fast): Only use _ARCH_PWR8 Altivec version
for GCC >= 4.5.
Patrick Palka [Wed, 7 Oct 2020 14:49:00 +0000 (10:49 -0400)]
c++: Distinguish alignof and __alignof__ in cp_tree_equal [PR97273]
cp_tree_equal currently considers alignof the same as __alignof__, but
these operators have been semantically different ever since r8-7957. In the
testcase below, this causes the second static_assert to fail on targets
where alignof(double) != __alignof__(double) because the specialization
table (which uses cp_tree_equal as its equality predicate) conflates the
two dependent specializations integral_constant<__alignof__(T)> and
integral_constant<alignof(T)>.
This patch makes cp_tree_equal distinguish between these two operators
by inspecting the ALIGNOF_EXPR_STD_P flag.
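A hedged reconstruction of the kind of test described (the actual testcase is
not shown here); on a target where the two alignments of double differ, the
second assertion used to fail because both variable templates ended up reusing
the same conflated specialization.

template <unsigned long N>
struct integral_constant { static constexpr unsigned long value = N; };

template <class T> constexpr unsigned long a = integral_constant<alignof (T)>::value;
template <class T> constexpr unsigned long b = integral_constant<__alignof__ (T)>::value;

static_assert (a<double> == alignof (double), "");
static_assert (b<double> == __alignof__ (double), "");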
Jonathan Wakely [Sat, 10 Oct 2020 20:22:12 +0000 (21:22 +0100)]
libstdc++: Replace use of reserved name that clashes [PR 97362]
The name __deref is defined as a macro by Windows headers.
This renames the __deref() helper function to __ref. It doesn't actually
dereference an iterator; it just has the same type as the iterator's
reference type.
libstdc++-v3/ChangeLog:
PR libstdc++/97362
* doc/html/manual/source_code_style.html: Regenerate.
* doc/xml/manual/appendix_contributing.xml: Add __deref to
BADNAMES.
* include/debug/functions.h (_Irreflexive_checker::__deref):
Rename to __ref.
Jonathan Wakely [Wed, 16 Dec 2020 13:37:17 +0000 (13:37 +0000)]
libstdc++: Fix errors from Library Fundamentals TS headers in C++11 [PR 98319]
Currently the <experimental/random>, <experimental/source_location> and
<experimental/utility> headers can be included in C++98 and C++11 modes,
but give errors. With this change they can be included, but define
nothing.
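A trivial check of the new behavior (hypothetical test): in -std=c++11 mode
this should now compile, with the headers simply defining nothing.

#include <experimental/random>
#include <experimental/source_location>
#include <experimental/utility>

int main () { }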
libstdc++-v3/ChangeLog:
PR libstdc++/98319
* include/experimental/memory_resource: Add system_header pragma
and only define contents for C++14 and later.
* include/experimental/random: Only define contents for C++14
and later.
* include/experimental/source_location: Likewise.
* include/experimental/utility: Likewise.
* testsuite/experimental/feat-lib-fund.cc: Include all LFTS
headers that are present. Allow test to run for all modes.
Eric Botcazou [Mon, 19 Apr 2021 08:13:36 +0000 (10:13 +0200)]
Fix another -freorder-blocks-and-partition glitch with Windows SEH
Since GCC 8, the -freorder-blocks-and-partition pass can split a function
into hot and cold parts, thus generating 2 FDEs for a single function in
DWARF for exception purposes and doing an equivalent trick for Windows SEH.
Now the Windows system unwinder does not support arbitrarily large frames,
and there is even a hard limit on the encoding of the CFI; when that limit is
exceeded, the stack allocation strategy changes, and that must be reflected
everywhere.
gcc/
* config/i386/winnt.c (i386_pe_seh_cold_init): Properly deal with
frames larger than the SEH maximum frame size.
gcc/testsuite/
* gnat.dg/opt92.adb: New test.
Thomas Schwinge [Thu, 25 Jun 2020 09:59:42 +0000 (11:59 +0200)]
libgomp HSA/GCN plugins: don't prepend the 'HSA_RUNTIME_LIB' path to 'libhsa-runtime64.so'
For unknown reasons, this had gotten added for the libgomp HSA plugin in commit b8d89b03db5f212919e4571671ebb4f5f8b1e19d (r242749) "Remove build dependence on
HSA run-time", and later propagated into the GCN plugin.
Kyrylo Tkachov [Wed, 17 Mar 2021 18:26:05 +0000 (18:26 +0000)]
aarch64: Fix status return logic in RNG intrinsics
There is a bug with the RNG intrinsics in their return code. The definition says:
"Stores a 64-bit random number into the object pointed to by the argument and returns zero.
If the implementation could not generate a random number within a reasonable period of time
the object pointed to by the input is set to zero and a non-zero value is returned."
This means we should be testing whether to return non-zero with:
CSET W0, EQ
rather than NE.
This patch fixes that.
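A hedged usage sketch of the intrinsic whose status check is being fixed
(requires a target with the rng extension, e.g. -march=armv8.5-a+rng):

#include <arm_acle.h>
#include <stdint.h>

int
get_random (uint64_t *out)
{
  /* Zero on success; non-zero if no random number could be produced.  */
  return __rndr (out);
}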
gcc/ChangeLog:
* config/aarch64/aarch64-builtins.c (aarch64_expand_rng_builtin): Use EQ
to compare against CC_REG rather than NE.
Richard Biener [Wed, 4 Mar 2020 09:40:32 +0000 (10:40 +0100)]
tree-optimization/93964 - adjust ISL code generation for pointer params
Pointers eventually need intermediate conversions in code generation.
Allowing them is much easier than fending them off since niter
and scev expansion easily drag those in.
2020-02-04 Richard Biener <rguenther@suse.de>
PR tree-optimization/93964
* graphite-isl-ast-to-gimple.c
(gcc_expression_from_isl_ast_expr_id): Add intermediate
conversion for pointer to integer converts.
* graphite-scop-detection.c (assign_parameter_index_in_region):
Relax assert.
Richard Biener [Thu, 1 Oct 2020 07:29:32 +0000 (09:29 +0200)]
tree-optimization/97255 - missing vector bool pattern of SRAed bool
SRA tends to use VIEW_CONVERT_EXPR when replacing bool fields with
unsigned char fields. Those are not handled in vector bool pattern
detection causing vector true values to leak. The following fixes
this by turning those into b ? 1 : 0 as well.
2020-10-01 Richard Biener <rguenther@suse.de>
PR tree-optimization/97255
* tree-vect-patterns.c (vect_recog_bool_pattern): Also handle
VIEW_CONVERT_EXPR.
Richard Biener [Thu, 30 Jul 2020 08:24:42 +0000 (10:24 +0200)]
tree-optimization/96370 - make reassoc expr rewrite more robust
In the face of the more complex tricks in reassoc with respect
to negate processing, it can happen that the expression rewrite
is fooled into recursing on a leaf and picking up a bogus expression
code. The following patch makes the expression rewrite more
robust by providing the expression code to it directly, since
it is the same for all operations in a chain.
2020-07-30 Richard Biener <rguenther@suse.de>
PR tree-optimization/96370
* tree-ssa-reassoc.c (rewrite_expr_tree): Add operation
code parameter and use it instead of picking it up from
the stmt that is being rewritten.
(reassociate_bb): Pass down the operation code.
Richard Biener [Fri, 31 Jul 2020 06:41:56 +0000 (08:41 +0200)]
middle-end/96369 - fix missed short-circuiting during range folding
This makes the special case of a constant-evaluated LHS for a
short-circuiting or/and explicit, rather than doing range
merging and eventually exposing a side-effect that shouldn't be
evaluated.
Richard Biener [Tue, 7 Apr 2020 14:29:37 +0000 (16:29 +0200)]
middle-end/94479 - fix gimplification of address
When gimplifying an address operand we may expose an indirect
ref via DECL_VALUE_EXPR for example. This is dealt with in the
code already but it fails to consider that INDIRECT_REFs get
gimplified to MEM_REFs.
Fixed, which makes the ICE observed on x86_64-netbsd go away.
2020-04-07 Richard Biener <rguenther@suse.de>
PR middle-end/94479
* gimplify.c (gimplify_addr_expr): Also consider generated
MEM_REFs.