PR target/96307
* gcc.dg/pr96307.c: New.
* gcc.target/riscv/pr96260.c: Move this test case from here to ...
* gcc.dg/pr96260.c: ... here.
* gcc.target/riscv/pr91441.c: Move this test case from here to ...
* gcc.dg/pr91441.c: ... here.
* lib/target-supports.exp (check_effective_target_no_fsanitize_address):
New proc.
Jakub Jelinek [Fri, 29 Jan 2021 09:30:09 +0000 (10:30 +0100)]
expand: Fix up find_bb_boundaries [PR98331]
When expansion emits some control flow insns etc. inside of a former GIMPLE
basic block, find_bb_boundaries needs to split it into multiple basic
blocks.
The code needs to ignore debug insns when deciding how many splits to do
and where among the non-debug insns each split should happen, but it can
decide where to put debug insns if they can be kept; otherwise it throws
them away (they can't stay outside of basic blocks).
On the following testcase, we end up in the bb from expander with
control flow insn
debug insns
barrier
some other insn
(the some other insn is effectively dead after __builtin_unreachable and
we'll optimize that out later).
Without debug insns, we'd do the split when encountering some other insn
and split after PREV_INSN (some other insn), i.e. after barrier (and the
splitting code then moves the barrier in between basic blocks).
But if there are debug insns, we actually split before the first debug insn
that appeared after the control flow insn, i.e. right after the control flow
insn, and get a basic block that starts with debug insns and then has a
barrier in the middle that nothing ever moves out of the bb.  This leads to
ICEs, and even where it wouldn't, to behavior different from -g0.
The reason for treating debug insns that way is a different case, e.g.
control flow insn
debug insns
some other insn
or even
control flow insn
barrier
debug insns
some other insn
where splitting before the first such debug insn allows us to keep them
while otherwise we would have to drop them on the floor, and in those
situations we behave the same with -g0 and -g.
So, the following patch fixes it by resetting debug_insn not just when
splitting the blocks (it is set only after seeing a control flow insn and
before splitting for it if needed), but also when seeing a barrier.  This
effectively means we always throw away debug insns between a control flow
insn and a following barrier, if any, but there is no way around that: the
control flow insn must be the last in the bb (BB_END) with the BARRIER
right after it, and debug insns aren't allowed outside of a bb.
We still handle the other cases fine (when there is no barrier or when
debug insns appear only after the barrier).
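To make the shape concrete, here is a minimal sketch (hypothetical, not the
PR testcase) of source that can leave a control flow insn, debug insns and
a barrier in one former GIMPLE block at -g:

  int
  f (int x)
  {
    int y = x + 1;               /* y is tracked by debug insns at -g */
    if (x)
      __builtin_unreachable ();  /* expands to a barrier */
    return y;                    /* the "some other insn" */
  }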
2021-01-29 Jakub Jelinek <jakub@redhat.com>
PR debug/98331
* cfgbuild.c (find_bb_boundaries): Reset debug_insn when seeing
a BARRIER.
Jakub Jelinek [Thu, 28 Jan 2021 15:13:11 +0000 (16:13 +0100)]
c++: Fix up handling of register ... asm ("...") vars in templates [PR33661, PR98847]
As the testcase shows, for vars appearing in templates, we don't attach
the asm spec string to the pattern decls, nor pass it back to cp_finish_decl
during instantiation.
The following patch does that.
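For illustration, a hedged sketch of the construct involved (not the PR
testcase; the register name is an x86-64 assumption, and this relies on the
GNU explicit-register-variable extension):

  template <typename T>
  T
  get_reg ()
  {
    register T r asm ("r15");   // asm spec must survive instantiation
    asm ("" : "=r" (r));        // use r as an asm operand in its register
    return r;
  }

  long use () { return get_reg<long> (); }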
2021-01-28 Jakub Jelinek <jakub@redhat.com>
PR c++/33661
PR c++/98847
* decl.c (cp_finish_decl): For register vars with asmspec in templates
call set_user_assembler_name and set DECL_HARD_REGISTER.
* pt.c (tsubst_expr): When instantiating DECL_HARD_REGISTER vars,
pass asmspec_tree to cp_finish_decl.
Jakub Jelinek [Wed, 27 Jan 2021 19:35:21 +0000 (20:35 +0100)]
aarch64: Fix up *aarch64_bfxilsi_uxtw [PR98853]
The https://gcc.gnu.org/legacy-ml/gcc-patches/2018-07/msg01895.html
patch that introduced this pattern claimed:
Would generate:
combine_balanced_int:
bfxil w0, w1, 0, 16
uxtw x0, w0
ret
But with this patch generates:
combine_balanced_int:
bfxil w0, w1, 0, 16
ret
and that is indeed what it should generate, but it doesn't.  It emits
bfxil x0, x1, 0, 16
instead, which doesn't zero-extend from 32 to 64 bits, but preserves
the upper bits of the destination register.
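For reference, a plausible reconstruction (not quoted from the original
patch) of the kind of function involved:

  /* Insert the low 16 bits of b into a and zero-extend the 32-bit
     result to 64 bits: bfxil w0, w1, 0, 16 (plus uxtw x0, w0 when the
     combined pattern isn't used).  */
  unsigned long long
  combine_balanced_int (unsigned int a, unsigned int b)
  {
    return (a & 0xffff0000u) | (b & 0x0000ffffu);
  }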
2021-01-27 Jakub Jelinek <jakub@redhat.com>
PR target/98853
* config/aarch64/aarch64.md (*aarch64_bfxilsi_uxtw): Use
%w0, %w1 and %2 instead of %0, %1 and %2.
* gcc.c-torture/execute/pr98853-1.c: New test.
* gcc.c-torture/execute/pr98853-2.c: New test.
Jakub Jelinek [Tue, 26 Jan 2021 13:48:26 +0000 (14:48 +0100)]
aarch64: Tighten up checks for ubfix [PR98681]
The testcase in the patch doesn't assemble, because the instruction requires
that the penultimate operand (lsb) range is [0, 32] (or [0, 64]) and the last
operand's range is [1, 32 - lsb] (or [1, 64 - lsb]).
The INTVAL (shft_amnt) < GET_MODE_BITSIZE (mode) check will accept an lsb
operand in the range [MIN, 32] (or [MIN, 64]), and then we invoke UB in the
compiler and sometimes such an instruction will make it through.
The patch changes all the INTVAL uses in that function to UINTVAL,
which isn't strictly necessary, but can be done (e.g. after the
UINTVAL (shft_amnt) < GET_MODE_BITSIZE (mode) check we know it is not
negative, and thus INTVAL (shft_amnt) and UINTVAL (shft_amnt) behave the
same).  But I had to add an INTVAL (mask) > 0 check in that case, otherwise
we risk (hypothetically) emitting an instruction that doesn't assemble.
The problem is with masks that have the MSB bit set, while the instruction
can handle those, e.g.
ubfiz w1, w0, 13, 19
will do
(w0 << 13) & 0xffffe000
in RTL we represent SImode constants with MSB set as negative HOST_WIDE_INT,
so it will actually be HOST_WIDE_INT_C (0xffffffffffffe000), and
the instruction uses %P3 to print the last operand, which calls
asm_fprintf (f, "%u", popcount_hwi (INTVAL (x)))
to print that.  But that will print 51 rather than 19, because it also
counts all the copies of the sign bit.
Not supporting those masks with MSB set isn't a big loss though; they
shouldn't normally appear, as both GIMPLE and RTL optimizations should
optimize them away (one isn't masking any bits off with such masks, so
just w0 << 13 will do too).
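A hedged illustration of such a mask in source form; the masking removes no
bits after the shift, so it should normally fold to just the shift:

  unsigned int
  f (unsigned int x)
  {
    /* 0xffffe000 has the MSB set; sign-extended to HOST_WIDE_INT it
       becomes 0xffffffffffffe000, whose popcount is 51, not the 19
       the operand printer would need.  */
    return (x << 13) & 0xffffe000u;   /* ubfiz w1, w0, 13, 19 */
  }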
2021-01-26 Jakub Jelinek <jakub@redhat.com>
PR target/98681
* config/aarch64/aarch64.c (aarch64_mask_and_shift_for_ubfiz_p):
Use UINTVAL (shft_amnt) and UINTVAL (mask) instead of INTVAL (shft_amnt)
and INTVAL (mask). Add && INTVAL (mask) > 0 condition.
Jakub Jelinek [Mon, 25 Jan 2021 09:03:40 +0000 (10:03 +0100)]
fold: Fix up strn{case,}cmp folding [PR98771]
As mentioned in the PR, the compiler behaves differently during strncmp
and strncasecmp folding between 32-bit and 64-bit hosts targeting a 64-bit
target.  I think that is highly undesirable.
The culprit is the host_size_t_cst_p predicate that is used by
fold_const_call, which punts if the target size_t constants don't fit into
host size_t. This patch gets rid of that behavior, instead it punts the
same when it doesn't fit into uhwi.
The predicate was used for strncmp and strncasecmp folding and for bcmp, memcmp and
memchr folding.
The constant is in all cases compared to 0; we can do that whether it fits
into size_t or only into unsigned HOST_WIDE_INT.  It is then used in
s2 <= s0 or s2 <= s1 comparisons, where s0 and s1 already have uhwi type and
represent the sizes of the objects.
The important difference is for strn{,case}cmp folding, where we pass that
s2 value as the last argument to the host functions comparing the c_getstr
results.  If s2 fits into size_t, then my patch makes no difference, but if
it is larger, we know the 2 c_getstr objects need to fit into the host
address space, so a larger s2 should just act essentially as strcmp or
strcasecmp; as none of those objects can occupy 100% of the address space,
using MIN (SIZE_MAX, s2) achieves that.
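A self-contained sketch of that clamping idea (names hypothetical, not the
fold-const-call.c code):

  #include <cstdint>
  #include <cstring>

  // Clamp a possibly wider target length to the host size_t range.
  // Once the length exceeds SIZE_MAX, no host object can be that big,
  // so the call degenerates to a full strcmp-style comparison.
  static int
  host_strncmp_clamped (const char *p, const char *q, std::uint64_t s2)
  {
    std::size_t len = s2 > SIZE_MAX ? SIZE_MAX : (std::size_t) s2;
    return std::strncmp (p, q, len);
  }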
2021-01-25 Jakub Jelinek <jakub@redhat.com>
PR testsuite/98771
* fold-const-call.c (host_size_t_cst_p): Renamed to ...
(size_t_cst_p): ... this. Check and store unsigned HOST_WIDE_INT
value rather than host size_t.
(fold_const_call): Change type of s2 from size_t to
unsigned HOST_WIDE_INT. Use size_t_cst_p instead of
host_size_t_cst_p. For strncmp calls, pass MIN (s2, SIZE_MAX)
instead of s2 as last argument.
Jakub Jelinek [Sat, 23 Jan 2021 08:41:58 +0000 (09:41 +0100)]
rs6000: Fix up __m64 typedef in mmintrin.h [PR97301]
The x86 __m64 type is defined as:
/* The Intel API is flexible enough that we must allow aliasing with other
vector types, and their scalar components. */
typedef int __m64 __attribute__ ((__vector_size__ (8), __may_alias__));
and so matches the comment above it, in that reads and stores through
pointers to __m64 can alias anything.
But in the rs6000 headers that was the case only for __m128, not for __m64.
The following patch adds that attribute, which fixes the
FAIL: gcc.target/powerpc/sse-movhps-1.c execution test
FAIL: gcc.target/powerpc/sse-movlps-1.c execution test
regressions that appeared when Honza improved ipa-modref.
Jakub Jelinek [Fri, 22 Jan 2021 18:03:23 +0000 (19:03 +0100)]
c++: Fix up ubsan false positives on references [PR95693]
Alex's two-year-old change to build_zero_init_1, to return a NULL pointer
with reference type for references, breaks the sanitizers: the assignment of
NULL to a reference-typed member is instrumented before it is overwritten
with a non-NULL address later on.
That change has been done to fix error recovery ICE during
process_init_constructor_record, where we:
if (TYPE_REF_P (fldtype))
{
if (complain & tf_error)
error ("member %qD is uninitialized reference", field);
else
return PICFLAG_ERRONEOUS;
}
a few lines earlier, but then continue and ICE when build_zero_init returns
NULL.
The following patch reverts the build_zero_init_1 change and instead creates
the NULL constants with reference type during the error recovery.
The pr84593.C testcase Alex's change was fixing still works as before.
2021-01-22 Jakub Jelinek <jakub@redhat.com>
PR sanitizer/95693
* init.c (build_zero_init_1): Revert the 2018-03-06 change to
return build_zero_cst for reference types.
* typeck2.c (process_init_constructor_record): Call build_zero_cst
here during error recovery instead of build_zero_init.
Jakub Jelinek [Fri, 22 Jan 2021 10:50:18 +0000 (11:50 +0100)]
match.pd: Replace incorrect simplifications into copysign [PR90248]
In the PR Andrew said he had implemented a simplification that had been
added to LLVM, but that actually is not true.  What is in LLVM are
X * (X cmp 0.0 ? +-1.0 : -+1.0) simplifications into +-abs(X),
while what has been added into GCC are (X cmp 0.0 ? +-1.0 : -+1.0)
simplifications into copysign(1, +-X), and then
X * copysign (1, +-X) into +-abs (X).
The problem is with the (X cmp 0.0 ? +-1.0 : -+1.0) simplifications,
they don't work correctly when X is zero.
E.g.
(X > 0.0 ? 1.0 : -1.0)
is -1.0 when X is either -0.0 or 0.0, but copysign will make it return
1.0 for 0.0 and -1.0 only for -0.0.
(X >= 0.0 ? 1.0 : -1.0)
is 1.0 when X is either -0.0 or 0.0, but copysign will make it still return
1.0 for 0.0, yet -1.0 for -0.0.
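A runnable check of the first case (a quick demonstration, not a testcase
from the patch):

  #include <cmath>
  #include <cstdio>

  int
  main ()
  {
    double x = 0.0;
    // The ternary yields -1.0 at +0.0, the copysign-based fold 1.0.
    std::printf ("%g %g\n", x > 0.0 ? 1.0 : -1.0, std::copysign (1.0, x));
  }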
The simplifications were guarded on !HONOR_SIGNED_ZEROS, but as discussed in
the PR, that option doesn't mean that -0.0 will never appear as an operand of
some operation; it is hard to guarantee that without the compiler adding
canonicalizations of -0.0 to 0.0 after most operations, which would make it
very slow.  It only means the user asserts that they don't care whether the
result of an operation is 0.0 or -0.0.  Not to mention that some of the
transformations are incorrect even for positive 0.0.
So, instead of those simplifications, this patch recognizes patterns where
those ?: expressions are multiplied by X, and simplifies them directly into
+-abs.
That works fine even for 0.0 and -0.0 (as long as we don't care about
whether the result is exactly 0.0 or -0.0 in those cases), because
whether the result of copysign is -1.0 or 1.0 doesn't matter when it is
multiplied by 0.0 or -0.0.
As a follow-up, maybe we should add the simplification mentioned in the PR,
in particular doing copysign by hand through
VIEW_CONVERT_EXPR <int, float_X> < 0 ? -float_constant : float_constant
into copysign (float_constant, float_X). But I think that would need to be
done in phiopt.
Jakub Jelinek [Fri, 22 Jan 2021 10:42:03 +0000 (11:42 +0100)]
on ARRAY_REFs sign-extend offsets only from sizetype's precision [PR98255]
As discussed in the PR, the problem here is that the routines changed in
this patch sign extend the difference of index and low_bound from the
precision of the index, so e.g. when index is unsigned int and contains
value -2U, we treat it as index -2 rather than 0x00000000fffffffeU on 64-bit
arches.
On the other hand, get_inner_reference which is used during expansion, does:
if (! integer_zerop (low_bound))
index = fold_build2 (MINUS_EXPR, TREE_TYPE (index),
index, low_bound);
offset = size_binop (PLUS_EXPR, offset,
size_binop (MULT_EXPR,
fold_convert (sizetype, index),
unit_size));
which effectively requires that either low_bound is constant 0, in which
case the index in ARRAY_REFs can have arbitrary type and is sign- or
zero-extended to sizetype, or low_bound is something else, in which case
index and low_bound must have compatible types; either way it is converted
afterwards to sizetype, and from there a few lines later:
expr.c- if (poly_int_tree_p (offset))
expr.c- {
expr.c: poly_offset_int tem = wi::sext (wi::to_poly_offset (offset),
expr.c- TYPE_PRECISION (sizetype));
The following patch makes those routines match what get_inner_reference is
doing.
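A runnable illustration of the two extension rules on the -2U example above
(plain integer arithmetic, not the wide-int code):

  #include <cstdint>
  #include <cstdio>

  int
  main ()
  {
    std::uint32_t index = -2u;                     // 0xfffffffe
    std::int64_t old_way = (std::int32_t) index;   // sext from 32 bits: -2
    std::int64_t new_way = (std::uint64_t) index;  // 0xfffffffe, as sizetype
    std::printf ("%lld %lld\n", (long long) old_way, (long long) new_way);
  }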
2021-01-22 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/98255
* tree-dfa.c (get_ref_base_and_extent): For ARRAY_REFs, sign
extend index - low_bound from sizetype's precision rather than index
precision.
(get_addr_base_and_unit_offset_1): Likewise.
* tree-ssa-sccvn.c (ao_ref_init_from_vn_reference): Likewise.
* gimple-fold.c (fold_const_aggregate_ref_1): Likewise.
Jakub Jelinek [Thu, 21 Jan 2021 16:20:24 +0000 (17:20 +0100)]
c++: Fix up potential_constant_expression_1 FOR/WHILE_STMT handling [PR98672]
The following testcase is rejected even though it is valid.
The problem is that potential_constant_expression_1 doesn't have the
accurate *jump_target tracking cxx_eval_* has, and when the loop has
a condition that isn't guaranteed to be always true, the body isn't walked
at all. That is mostly a correct conservative behavior, except that it
doesn't detect if there are any return statements in the body, which means
the loop might return instead of falling through to the next statement.
We already have code for return stmt discovery in code snippets we don't
try to evaluate for switches, so this patch reuses that for FOR_STMT
and WHILE_STMT bodies.
Note, I haven't touched FOR_EXPR: with statement expressions it could
have return stmts in it too, or break or continue statements that wouldn't
bind to the current loop but to something outer.  That case is clearly
mishandled by potential_constant_expression_1 even when the condition is
missing or always true, and it wouldn't surprise me if cxx_eval_* didn't
handle it right either, so I'm deferring that to a separate PR for later.
We'd need proper test coverage for all of that.
> Hmm, IF_STMT probably also needs to check the else clause, if the condition
> isn't a known constant.
You're right; I thought it was OK because it recurses with tf_none, but
if the then branch is potentially constant and only the else branch returns,
continues or breaks, then, as the enhanced testcase shows, we were
mishandling it too.
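A hypothetical reduction of the shape involved (not the actual testcase):
the loop condition isn't known to be always true, and the statement after
the loop is never reached at runtime.

  constexpr int
  f (int n)
  {
    for (int i = 0; i < n; ++i)
      if (i == 3)
        return i;   // the only exit actually taken for n > 3
    throw 1;        // rejected if the returns in the body go unnoticed
  }

  static_assert (f (10) == 3, "");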
2021-01-21 Jakub Jelinek <jakub@redhat.com>
PR c++/98672
* constexpr.c (check_for_return_continue_data): Add break_stmt member.
(check_for_return_continue): Also look for BREAK_STMT. Handle
SWITCH_STMT by ignoring break_stmt from its body.
(potential_constant_expression_1) <case FOR_STMT>,
<case WHILE_STMT>: If the condition isn't constant true, check if
the loop body can contain a return stmt.
<case SWITCH_STMT>: Adjust check_for_return_continue_data initializer.
<case IF_STMT>: If recursion with tf_none is successful,
merge *jump_target from the branches - returns with highest priority,
breaks or continues lower. If then branch is potentially constant and
doesn't return, check the else branch if it could return, break or
continue.
Jason Merrill [Fri, 22 Jan 2021 18:17:10 +0000 (13:17 -0500)]
c++: [[no_unique_address]] in empty base [PR98463]
In this testcase, cxx_eval_store_expression got confused trying to build up
CONSTRUCTORs for initializing a subobject because the subobject is a member
of an empty base. In C++14 mode and below we don't build FIELD_DECLs for
empty bases, so the CONSTRUCTOR skipped the empty base, and treated the
member as a member of the derived class, which breaks.
Fixed by recognizing this situation and giving up on trying to build a
CONSTRUCTOR for the inner target at that point; since it doesn't have any
data, we don't need to actually store anything.
Jason Merrill [Tue, 26 Jan 2021 21:04:24 +0000 (16:04 -0500)]
c++: Invisible refs are not restrict [PR97474]
In this testcase, we refer to the a parameter through a reference in its own
member, which we asserted couldn't happen by marking the parameter as
'restrict'. This assumption could also be broken if the address escapes
from the constructor.
gcc/cp/ChangeLog:
PR c++/97474
* call.c (type_passed_as): Don't mark invisiref restrict.
gcc/testsuite/ChangeLog:
PR c++/97474
* g++.dg/torture/pr97474.C: New test.
Jason Merrill [Wed, 13 Jan 2021 18:27:06 +0000 (13:27 -0500)]
c++: Avoid redundant copy in {} init [PR98642]
Here, initializing from { } implies a call to the default constructor for
base. We were then seeing that we're initializing a base subobject, so we
tried to copy the result of that call. This is clearly wrong; we should
initialize the base directly from its default constructor.
Jason Merrill [Fri, 15 Jan 2021 16:42:00 +0000 (11:42 -0500)]
c++: Fix list-init of array of no-copy type [PR63707]
build_vec_init_elt models initialization from some arbitrary object of the
type, i.e. copy, but in the case of list-initialization we don't do a copy
from the elements, we initialize them directly.
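A hedged sketch of the construct (in the spirit of the new test, not copied
from it):

  struct A
  {
    A () = default;
    A (const A &) = delete;  // no copy: elements must be direct-initialized
  };

  struct B { A a[2]; };

  B b = { };                 // list-init of the array must not copy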
gcc/cp/ChangeLog:
PR c++/63707
* tree.c (build_vec_init_expr): Don't call build_vec_init_elt
if we got a CONSTRUCTOR.
gcc/testsuite/ChangeLog:
PR c++/63707
* g++.dg/cpp0x/initlist-array13.C: New test.
Eric Botcazou [Thu, 28 Jan 2021 10:31:35 +0000 (11:31 +0100)]
Fix LTO bootstrap on Windows
The latest fix introduced a comparison of executables and this cannot
directly work on Windows because they are timestamped. Moreover nobody
sets $(exeext) at top level, at least on MinGW, so you get weird behavior
because some tools add the implicit .exe suffix and others do not.
contrib/
PR lto/85574
* compare-lto: Deal with PE-COFF executables specifically.
Kyrylo Tkachov [Thu, 21 Jan 2021 16:33:49 +0000 (16:33 +0000)]
tree-ssa-mathopts: Use proper poly_int64 comparison with param_avoid_fma_max_bits [PR 98766]
We ICE here because we end up comparing a poly_int64 with a scalar using
<= rather than maybe_le.
This patch fixes that in the way rich suggests in the PR.
gcc/ChangeLog:
PR tree-optimization/98766
* tree-ssa-math-opts.c (convert_mult_to_fma): Use maybe_le when
comparing against type size with param_avoid_fma_max_bits.
gcc/testsuite/ChangeLog:
PR tree-optimization/98766
* gcc.dg/pr98766.c: New test.
Eric Botcazou [Tue, 26 Jan 2021 17:54:26 +0000 (18:54 +0100)]
Fix PR ada/98228
This is the profiled bootstrap failure for s390x/Linux on the mainline,
which was introduced by the modref pass but actually exposes an
existing issue in the maybe_pad_type function that is visible on s390x:
too weak a test for the addressability of the inner component.
gcc/ada/
Marius Hillenbrand <mhillen@linux.ibm.com>
PR ada/98228
* gcc-interface/utils.c (maybe_pad_type): Test the size of the new
packable type instead of its alignment for addressability's sake.
Sebastian Huber [Mon, 25 Jan 2021 11:29:05 +0000 (12:29 +0100)]
RTEMS: Fix default linker script
We have to use ENDFILE_SPEC for the default linker script and not
STARTFILE_SPEC, since STARTFILE_SPEC is placed before the user-provided
library search paths.
Sebastian Huber [Fri, 22 Jan 2021 11:45:49 +0000 (12:45 +0100)]
RTEMS: Fix GCC specification
The use of -nostdlib and -nodefaultlibs disables the processing of
LIB_SPEC (%L) as specified by LINK_COMMAND_SPEC and thus disables the
default linker script for RTEMS. Move the linker script to
STARTFILE_SPEC which is controlled by -nostdlib and -nostartfiles. This
fits better since the linker script defines the platform start file
provided by the board support package in RTEMS.
gcc/
* config/rtems.h (STARTFILE_SPEC): Remove nostdlib and
nostartfiles handling since this is already done by
LINK_COMMAND_SPEC. Evaluate qnolinkcmds.
(ENDFILE_SPEC): Remove nostdlib and nostartfiles handling since this
is already done by LINK_COMMAND_SPEC.
(LIB_SPECS): Remove nostdlib and nodefaultlibs handling since
this is already done by LINK_COMMAND_SPEC. Remove qnolinkcmds
evaluation.
Eric Botcazou [Mon, 25 Jan 2021 10:27:29 +0000 (11:27 +0100)]
Fix internal error on extension with interface at -O2
This is a regression present on the mainline, 10 and 9 branches, in the
form of an internal error with the Ada compiler when a covariant-only
thunk is inlined into its caller.
gcc/ada/
* gcc-interface/trans.c (make_covariant_thunk): Set the DECL_CONTEXT
of the parameters and do not set TREE_PUBLIC on the thunk.
(maybe_make_gnu_thunk): Pass the alias to the covariant thunk.
* gcc-interface/utils.c (finish_subprog_decl): Set the DECL_CONTEXT
of the parameters here...
(begin_subprog_body): ...instead of here.
gcc/testsuite/
* gnat.dg/thunk2.adb, gnat.dg/thunk2.ads: New test.
* gnat.dg/thunk2_pkg.ads: New helper.
The compiler can match mpyd.eq r0,r1,r0 as a predicated instruction,
which is incorrect. The mpyd(u) instruction takes as input two 32-bit
registers, returning into a double 64-bit even-odd register pair. For
the predicated case, the ARC instruction decoder expects the
destination register to be the same as the first input register. In
the big-endian case the result is swapped in the destination register
pair; however, the instruction encoding remains the same.  Refurbish
the mpyd(u) patterns to take into account the above observation.
* config/arc/arc.md (mpyd<su_optab>_arcv2hs): New template
pattern.
(*pmpyd<su_optab>_arcv2hs): Likewise.
(*pmpyd<su_optab>_imm_arcv2hs): Likewise.
(mpyd_arcv2hs): Moved into above template.
(mpyd_imm_arcv2hs): Moved into above template.
(mpydu_arcv2hs): Likewise.
(mpydu_imm_arcv2hs): Likewise.
(su_optab): New optab prefix for sign/zero-extending operations.
Paul Thomas [Tue, 29 Dec 2020 17:37:25 +0000 (17:37 +0000)]
Fortran: Fix deferred character lengths in array constructors [PR93833].
2020-12-29 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/93833
* trans-array.c (get_array_ctor_var_strlen): If the character
length backend_decl cannot be found, convert the expression and
use the string length. Clear up some minor white space issues
in the rest of the file.
gcc/testsuite/
PR fortran/93833
* gfortran.dg/deferred_character_36.f90: New test.
Iain Buclaw [Sat, 23 Jan 2021 23:20:25 +0000 (00:20 +0100)]
libphobos: Fix executables segfault on mipsel architecture
The dynamic section on MIPS is read-only, but this was not properly
handled in the runtime library. The segfault only occurred for programs
that linked to the shared libphobos library.
libphobos/ChangeLog:
PR d/98806
* libdruntime/gcc/sections/elf_shared.d (MIPS_Any): Declare version
for MIPS32 and MIPS64.
(getDependencies): Adjust dlpi_addr on MIPS_Any.
Paul Thomas [Sat, 26 Dec 2020 16:44:24 +0000 (16:44 +0000)]
Fortran: Correction to recent patch in light of comments [PR98022].
2020-12-26 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/98022
* data.c (gfc_assign_data_value): Throw an error for inquiry
references. Follow with corrected code that would provide the
expected result and provides clean error recovery.
gcc/testsuite/
PR fortran/98022
* gfortran.dg/data_inquiry_ref.f90: Change to dg-compile and
add errors for inquiry references.
Marek Polacek [Fri, 22 Jan 2021 17:50:53 +0000 (12:50 -0500)]
c++: Crash when deducing template arguments [PR98790]
maybe_instantiate_noexcept doesn't expect to see error_mark_node, but
the new callsite I introduced in r11-6476 can pass error_mark_node to
it. So cope.
gcc/cp/ChangeLog:
PR c++/98790
* pt.c (maybe_instantiate_noexcept): Return false if FN is
error_mark_node.
gcc/testsuite/ChangeLog:
PR c++/98790
* g++.dg/template/deduce8.C: New test.
duplicate_and_interleave is the main fallback way of loading
a repeating sequence of elements into variable-length vectors.
The code handles cases in which the number of elements in the
sequence is potentially several times greater than the number
of elements in a vector.
Let:
- NE be the (compile-time) number of elements in the sequence
- NR be the (compile-time) number of vector results and
- VE be the (run-time) number of elements in each vector
The basic approach is to duplicate each element into a
separate vector, giving NE vectors in total, then use
log2(NE) rows of NE permutes to generate NE results.
In the worst case --- when VE has no known compile-time factor
and NR >= NE --- all of these permutes are necessary. However,
if VE is known to be a multiple of 2**F, then each of the
first F permute rows produces duplicate results; specifically,
the high permute for a given pair is the same as the low permute.
The code dealt with this by reusing the low result for the
high result. This part was OK.
However, having duplicate results from one row meant that the
next row did duplicate work. The redundancies would be optimised
away by later passes, but the code tried to avoid generating them
in the first place. This is the part that went wrong.
Specifically, NR is typically less than NE when some permutes are
redundant, so the code tried to use NR to reduce the amount of work
performed. The problem was that, although it correctly calculated
a conservative bound on how many results were needed in each row,
it chose the wrong results for anything other than the final row.
This doesn't usually matter for fully-packed SVE vectors. We first
try to coalesce smaller elements into larger ones, so normally
VE ends up being 2**VQ (where VQ is the number of 128-bit blocks
in an SVE vector). In that situation we'd only apply the faulty
optimisation to the final row, i.e. the case it handled correctly.
E.g. for cases like the ones already tested by the testsuite, we'd have
3 rows of permutes producing 4 vector results.  The scheme produced:
1st row: 8 results from 4 permutes, highs duplicates of lows
2nd row: 8 results from 8 permutes (half of which are actually redundant)
3rd row: 4 results from 4 permutes
However, coalescing elements is trickier for unpacked vectors,
and at the moment we don't try to do it (see the GET_MODE_SIZE
check in can_duplicate_and_interleave_p). Unpacked vectors
therefore stress the code in ways that packed vectors didn't.
The patch fixes this by removing the redundancies from each row,
rather than trying to work around them later. This also removes
the redundant work in the second row of the example above.
gcc/
PR tree-optimization/98535
* tree-vect-slp.c (duplicate_and_interleave): Use quick_grow_cleared.
If the high and low permutes are the same, remove the high permutes
from the working set and only continue with the low ones.
Michael Meissner [Thu, 21 Jan 2021 01:30:22 +0000 (20:30 -0500)]
PowerPC: Backport fix for libgcc long double support.
libgcc/
2021-01-20 Michael Meissner <meissner@linux.ibm.com>
* config/rs6000/t-linux (IBM128_STATIC_OBJS): Back port from
master (12/3/2020). New make variable.
(IBM128_SHARED_OBJS): New make variable.
(IBM128_OBJS): New make variable. Set all objects to use the
explicit IBM format, and disable gnu attributes.
(IBM128_CFLAGS): New make variable.
(gcc_s_compile): Add -mno-gnu-attribute to all shared library
modules.
Martin Jambor [Tue, 19 Jan 2021 14:50:49 +0000 (15:50 +0100)]
ipa-sra: Do not remove return values needed because of non-call EH
IPA-SRA already contains a check to figure out that an otherwise dead
parameter is actually required because of non-call exceptions, but it
is not present at the equivalent spot where SRA figures out whether
the return statement is used for anything useful. This patch adds
that condition there.
Unfortunately, even though this patch should be good enough for any
normal (I'd even say reasonable) use of the compiler, it hints that
when the user manually switches off all sorts of DCE, IPA-SRA would
probably leave behind problematic statements manipulating what
originally were return values, just like it does for parameters (PR
93385). Fixing this properly might unfortunately be a separate issue
from the mentioned bug because the LHS of a call is changed during
call redirection and the caller often is not a clone. But I'll see
what I can do.
Meanwhile, the patch below has been bootstrapped and tested on x86_64.
gcc/ChangeLog:
2021-01-18 Martin Jambor <mjambor@suse.cz>
PR ipa/98690
* ipa-sra.c (ssa_name_only_returned_p): New parameter fun. Check
whether non-call exceptions allow removal of a statement.
(isra_analyze_call): Pass the appropriate function to
ssa_name_only_returned_p.
Rainer Orth [Tue, 8 Dec 2020 12:29:26 +0000 (13:29 +0100)]
testsuite: i386: Require ifunc support in gcc.target/i386/pr98100.c
The new gcc.target/i386/pr98100.c test FAILs on Solaris/x86:
FAIL: gcc.target/i386/pr98100.c (test for excess errors)
Excess errors:
/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.target/i386/pr98100.c:6:1: error: the call requires 'ifunc', which is not supported by this target
Fixed as follows.
Tested on i386-pc-solaris2.11 and x86_64-pc-linux-gnu.
Thomas Schwinge [Mon, 30 Nov 2020 14:15:20 +0000 (15:15 +0100)]
[nvptx libgomp plugin] Build only in supported configurations
As recently discussed again in <https://gcc.gnu.org/PR97436> "[nvptx] -m32
support", nvptx offloading other than for a 64-bit host has never been
implemented, tested, or supported.  So we simply shouldn't build the nvptx
libgomp plugin in this case.
This avoids build problems if, for example, in a (standard) bi-arch
x86_64-pc-linux-gnu '-m64'/'-m32' build, libcuda is available only in a 64-bit
variant but not in a 32-bit one, which, for example, is the case if you build
GCC against the CUDA toolkit's 'stubs/libcuda.so' (see
<https://stackoverflow.com/a/52784819>).
This amends PR65099 commit a92defdab79a1268f4b9dcf42b937e4002a4cf15 (r225560)
"[nvptx offloading] Only 64-bit configurations are currently supported" to
match the way we're doing this for the HSA/GCN plugins.
Iain Sandoe [Sat, 31 Oct 2020 09:25:47 +0000 (09:25 +0000)]
Objective-C++ : Fix ICE in potential_constant_expression_1.
We cannot, as things stand, handle Objective-C tree codes in
the switch and deal with this by calling out to a function that
has a dummy version when Objective-C is not enabled.
Because of the way the logic works (with a fall through to a
'sorry' in case of unhandled expressions), the function reports
cases that are known to be unsuitable for constant exprs. The
dummy function always reports 'false' and thus will fall through
to the 'sorry'.
The fix for PR libstdc++/82481 should only have applied for targets
where _GLIBCXX_HAVE_TLS is defined. Because it was also done for non-TLS
targets, it isn't possible to use clang's analyzers on non-TLS targets
if the code uses <mutex>. This fixes it by using a NOLINT comment on
the relevant line instead of testing #ifdef __clang_analyzer__ and
compiling different code when analyzing.
I'm not actually able to reproduce the analyzer warning with the tools
from Clang 10.0.1 so I'm not going to try to make the suppression more
specific with NOLINTNEXTLINE(clang-analyzer-code.StackAddressEscape).
libstdc++-v3/ChangeLog:
PR libstdc++/98605
* include/std/mutex (call_once): Use NOLINT to suppress clang
analyzer warnings.
Samuel Thibault [Mon, 21 Dec 2020 14:36:30 +0000 (15:36 +0100)]
hurd: libgcc unwinding over signal trampolines with SIGINFO
When the application sets SA_SIGINFO, the signal trampoline parameters
are different to follow POSIX.
libgcc/
* config/i386/gnu-unwind.h (x86_gnu_fallback_frame_state): Add the
posix siginfo case to struct handler_args. Detect between legacy
and siginfo from the second parameter, which is a small sigcode in
the legacy case, and a pointer in the siginfo case.
Richard Biener [Wed, 6 Jan 2021 08:26:55 +0000 (09:26 +0100)]
tree-optimization/98513 - fix bug in range intersection code
This fixes a premature optimization in the range intersection code
which assumes earlier branches have to be taken, not taking into
account that for symbolic ranges we cannot always compare endpoints.
The fix is to instantiate the compare deemed redundant (which then
fails as undecidable for the testcase).
2021-01-06 Richard Biener <rguenther@suse.de>
PR tree-optimization/98513
* value-range.cc (intersect_ranges): Compare the upper bounds
for the expected relation.
vect: Fix bogus alignment assumption in alias checks [PR94994]
This PR is about a case in which the vectoriser was feeding
incorrect alignment information to tree-data-ref.c, leading
to incorrect runtime alias checks. The alignment was taken
from the TREE_TYPE of the DR_REF, which in this case was a
COMPONENT_REF with a normally-aligned type. However, the
underlying MEM_REF was only byte-aligned.
This patch uses dr_alignment to calculate the (byte) alignment
instead, just like we do when creating vector MEM_REFs.
gcc/
PR tree-optimization/94994
* tree-vect-data-refs.c (vect_vfa_align): Use dr_alignment.
gcc/testsuite/
PR tree-optimization/94994
* gcc.dg/vect/pr94994.c: New test.
vect, aarch64: Fix alignment units for IFN_MASK* [PR95401]
The IFN_MASK* functions take two leading arguments: a load or
store pointer and a “cookie”. The type of the cookie is the
type of the access for TBAA purposes (like for MEM_REFs)
while the value of the cookie is the alignment of the access.
This PR was caused by a disagreement about whether the alignment
is measured in bits or bytes.
It looks like this goes back to PR68786, which made the
vectoriser create its own cookie argument rather than reusing
the one created by ifcvt. The alignment value of the new cookie
was measured in bytes (as needed by set_ptr_info_alignment)
while the existing code expected it to be measured in bits.
The folds I added for IFN_MASK_LOAD and STORE then made
things worse.
gcc/
PR tree-optimization/95401
* config/aarch64/aarch64-sve-builtins.cc
(gimple_folder::load_store_cookie): Use bits rather than bytes
for the alignment argument to IFN_MASK_LOAD and IFN_MASK_STORE.
* gimple-fold.c (gimple_fold_mask_load_store_mem_ref): Likewise.
* tree-vect-stmts.c (vectorizable_store): Likewise.
(vectorizable_load): Likewise.
gcc/testsuite/
PR tree-optimization/95401
* g++.dg/vect/pr95401.cc: New test.
* g++.dg/vect/pr95401a.cc: Likewise.
recog: Fix a constrain_operands corner case [PR97144]
aarch64's *add<mode>3_poly_1 has a pattern with the constraints:
"=...,r,&r"
"...,0,rk"
"...,Uai,Uat"
i.e. the penultimate alternative requires operands 0 and 1 to match,
but the final alternative does not allow them to match.
The register allocators dealt with this correctly, and so used
different input and output registers for instructions with Uat
operands. However, constrain_operands carried the penultimate
alternative's matching rule over to the final alternative,
so it would essentially ignore the earlyclobber. This in turn
allowed postreload to convert a correct Uat pairing into an
incorrect one.
The fix is simple: recompute the matching information for each
alternative.
gcc/
PR rtl-optimization/97144
* recog.c (constrain_operands): Initialize matching_operand
for each alternative, rather than only doing it once.
gcc/testsuite/
PR rtl-optimization/97144
* gcc.c-torture/compile/pr97144.c: New test.
* gcc.target/aarch64/sve/pr97144.c: Likewise.
genmodes: Update GET_MODE_MASK when changing NUNITS [PR98214]
The static GET_MODE_MASKs for SVE vectors are based on the
static precisions, which in turn are based on 128-bit SVE.
The precisions are later updated based on -msve-vector-bits
(usually to become variable length), but the GET_MODE_MASK
stayed the same. This caused combine to fold:
(*_extract:DI (subreg:DI (reg:VNxMM R) 0) ...)
to zero because the extracted bits appeared to be insignificant.
gcc/
PR rtl-optimization/98214
* genmodes.c (emit_insn_modes_h): Emit a definition of CONST_MODE_MASK.
(emit_mode_mask): Treat mode_mask_array as non-constant if adj_nunits.
(emit_mode_adjustments): Update GET_MODE_MASK when updating
GET_MODE_NUNITS.
* machmode.h (mode_mask_array): Use CONST_MODE_MASK.
unsigned long long x = ...;
char y = (char) (x << 37);
The overwidening pattern realised that only the low 8 bits
of x << 37 are needed, but then tried to turn that into:
unsigned long long x = ...;
char y = (char) x << 37;
which gives an out-of-range shift. In this case y can simply
be replaced by zero, but as the comment in the patch says,
it's kind-of awkward to do that in the middle of vectorisation.
Most of the overwidening stuff is about keeping operations
as narrow as possible, which is important for vectorisation
but could be counter-productive for scalars (especially on
RISC targets). In contrast, optimising y to zero in the above
feels like an independent optimisation that would benefit scalar
code and that should happen before vectorisation.
gcc/
PR tree-optimization/98302
* tree-vect-patterns.c (vect_determine_precisions_from_users): Make
sure that the precision remains greater than the shift count.
gcc/testsuite/
PR tree-optimization/98302
* gcc.dg/vect/pr98302.c: New test.
vect: Fix missing alias checks for 128-bit SVE [PR98371]
On AArch64, the vectoriser tries various ways of vectorising with both
SVE and Advanced SIMD and picks the best one. All other things being
equal, it prefers earlier attempts over later attempts.
The way this works currently is that, once it has a successful
vectorisation attempt A, it analyses all other attempts as epilogue
loops of A:
/* When pick_lowest_cost_p is true, we should in principle iterate
over all the loop_vec_infos that LOOP_VINFO could replace and
try to vectorize LOOP_VINFO under the same conditions.
E.g. when trying to replace an epilogue loop, we should vectorize
LOOP_VINFO as an epilogue loop with the same VF limit. When trying
to replace the main loop, we should vectorize LOOP_VINFO as a main
loop too.
However, autovectorize_vector_modes is usually sorted as follows:
- Modes that naturally produce lower VFs usually follow modes that
naturally produce higher VFs.
- When modes naturally produce the same VF, maskable modes
usually follow unmaskable ones, so that the maskable mode
can be used to vectorize the epilogue of the unmaskable mode.
This order is preferred because it leads to the maximum
epilogue vectorization opportunities. Targets should only use
a different order if they want to make wide modes available while
disparaging them relative to earlier, smaller modes. The assumption
in that case is that the wider modes are more expensive in some
way that isn't reflected directly in the costs.
There should therefore be few interesting cases in which
LOOP_VINFO fails when treated as an epilogue loop, succeeds when
treated as a standalone loop, and ends up being genuinely cheaper
than FIRST_LOOP_VINFO. */
However, the vectoriser can normally elide alias checks for epilogue
loops, on the basis that the main loop should do them instead.
Converting an epilogue loop to a main loop can therefore cause the alias
checks to be skipped. (It probably also unfairly penalises the original
loop in the cost comparison, given that one loop will have alias checks
and the other won't.)
As the comment says, we should in principle analyse each vector mode
twice: once as a main loop and once as an epilogue. However, doing
that up-front would be quite expensive. This patch instead goes for a
compromise: if an epilogue loop for mode M2 seems better than a main
loop for mode M1, re-analyse with M2 as the main loop.
The patch fixes dg.torture.exp=pr69719.c when testing with
-msve-vector-bits=128.
gcc/
PR tree-optimization/98371
* tree-vect-loop.c (vect_reanalyze_as_main_loop): New function.
(vect_analyze_loop): If an epilogue loop appears to be cheaper
than the main loop, re-analyze it as a main loop before adopting
it as a main loop.
in the 64-bit vst[234] functions. The zero was forced into a
register at expand time, and we relied on combine to fuse the
zero and combine back together into a single combinez pattern.
The problem is that the zero could be hoisted before combine
gets a chance to do its thing.
gcc/
PR target/89057
* config/aarch64/aarch64-simd.md (aarch64_combine<mode>): Accept
aarch64_simd_reg_or_zero for operand 2. Use the combinez patterns
to handle zero operands.
gcc/testsuite/
PR target/89057
* gcc.target/aarch64/pr89057.c: New test.
aarch64: Extend aarch64-autovec-preference==2 to 128-bit SVE
When compiling with -msve-vector-bits=128, aarch64_preferred_simd_mode
would pass the same vector width to aarch64_simd_container_mode for
both SVE and Advanced SIMD, and so Advanced SIMD would always “win”.
This patch instead makes it choose directly between SVE and Advanced
SIMD modes, so that aarch64-autovec-preference==2 and
aarch64-autovec-preference==4 work for this configuration.
(aarch64-autovec-preference shouldn't affect aarch64_simd_container_mode
because that would have an ABI impact for things like GNU vectors.)
gcc/
* config/aarch64/aarch64.c (aarch64_preferred_simd_mode): Use
aarch64_full_sve_mode and aarch64_vq_mode directly, instead of
going via aarch64_simd_container_mode.
The gdb.Type.name attribute isn't present in GDB 7.6, so we get an
exception from StdPathPrinter._iterator.__next__ trying to use it.
The StdPathPrinter._iterator is already passed the type's name in its
constructor, so we can just store that and use it instead of
gdb.Type.name.
libstdc++-v3/ChangeLog:
* python/libstdcxx/v6/printers.py (StdExpPathPrinter): Store the
name of the type and pass it to the iterator.
(StdPathPrinter): Likewise.
* testsuite/libstdc++-prettyprinters/filesystem-ts.cc: New test.
Jonathan Wakely [Wed, 2 Dec 2020 21:39:08 +0000 (21:39 +0000)]
libstdc++: Fix std::any pretty printer [PR 68735]
This fixes errors seen on powerpc64 (big endian only) due to the
printers for std::any and std::experimental::any being unable to find
the manager function.
libstdc++-v3/ChangeLog:
PR libstdc++/65480
PR libstdc++/68735
* python/libstdcxx/v6/printers.py (function_pointer_to_name):
New helper function to get the name of a function from its
address.
(StdExpAnyPrinter.__init__): Use it.
Richard Biener [Mon, 7 Dec 2020 09:29:07 +0000 (10:29 +0100)]
tree-optimization/98117 - fix range set by vectorization on niter IVs
This avoids the degenerate case of a TYPE_MAX_VALUE latch iteration
count value causing wrong range info for the vector IV. There's
still the case of VF == 1 where if we don't know whether we hit the
above case we cannot emit a range.
2020-12-07 Richard Biener <rguenther@suse.de>
PR tree-optimization/98117
* tree-vect-loop-manip.c (vect_gen_vector_loop_niters):
Properly handle degenerate niter when setting the vector
loop IV range.
Richard Biener [Tue, 3 Nov 2020 14:03:41 +0000 (15:03 +0100)]
tree-optimization/97623 - Avoid PRE hoist insertion iteration
We are not really interested in PRE opportunities exposed by
hoisting, but only the other way around.  So this moves hoist
insertion to after the PRE iteration has finished, and removes the hoist
insertion iteration altogether.
It also guards access to NEW_SETS properly.
2020-11-11 Richard Biener <rguenther@suse.de>
PR tree-optimization/97623
* tree-ssa-pre.c (insert): Move hoist insertion after PRE
insertion iteration and do not iterate it.
(create_expression_by_pieces): Guard NEW_SETS access.
(insert_into_preds_of_block): Likewise.
Richard Biener [Fri, 30 Oct 2020 12:32:32 +0000 (13:32 +0100)]
tree-optimization/97623 - avoid excessive insert iteration for hoisting
This avoids requiring insert iteration for back-to-back hoisting
opportunities as seen in the added testcase. For the PR at hand
this halves the number of insert iterations retaining only
the hard to avoid PRE / hoist insert back-to-backs.
2020-10-30 Richard Biener <rguenther@suse.de>
PR tree-optimization/97623
* tree-ssa-pre.c (insert): First do hoist insertion in
a backward walk.
Jakub Jelinek [Sat, 9 Jan 2021 09:49:38 +0000 (10:49 +0100)]
tree-cfg: Allow enum types as result of POINTER_DIFF_EXPR [PR98556]
As conversions between signed integers and signed enums with the same
precision are useless in GIMPLE, it seems strange that we require that the
POINTER_DIFF_EXPR result must be INTEGER_TYPE.
If we really wanted to require that, we'd need to change the gimplifier
to ensure it, which isn't the case on the following testcase.
What is going on during the gimplification is that when we have the
(enum T) (p - q) cast, it is stripped through
/* Strip away as many useless type conversions as possible
at the toplevel. */
STRIP_USELESS_TYPE_CONVERSION (*expr_p);
and when the MODIFY_EXPR is gimplified, the *to_p has enum T type,
while *from_p has intptr_t type and as there is no conversion in between,
we just create GIMPLE_ASSIGN from that.
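A hypothetical reduction of the testcase shape (assuming an enum with a
fixed underlying type of ptrdiff_t's precision):

  enum E : __PTRDIFF_TYPE__ { };

  E
  f (char *p, char *q)
  {
    return (E) (p - q);   // conversion is useless: same precision and sign
  }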
2021-01-09 Jakub Jelinek <jakub@redhat.com>
PR c++/98556
* tree-cfg.c (verify_gimple_assign_binary): Allow lhs of
POINTER_DIFF_EXPR to be any integral type.
Patrick Palka [Fri, 8 Jan 2021 15:11:25 +0000 (10:11 -0500)]
c++: ICE with constexpr call that returns a PMF [PR98551]
We shouldn't do replace_result_decl after evaluating a call that returns
a PMF because PMF temporaries aren't wrapped in a TARGET_EXPR (and so we
can't trust ctx->object), and PMF initializers can't be self-referential
anyway, so replace_result_decl would always be a no-op.
To that end, this patch changes the relevant AGGREGATE_TYPE_P test to
CLASS_TYPE_P, which should rule out PMFs (as well as arrays, which we
can't return and therefore won't see here). This fixes an ICE from the
sanity check in replace_result_decl in the below testcase during
constexpr evaluation of the call f() in the initializer g(f()).
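A hedged sketch of the g(f()) shape described above (not the new test):

  struct A { void m () {} };
  using PMF = void (A::*) ();

  constexpr PMF f () { return &A::m; }
  constexpr bool g (PMF p) { return p != nullptr; }

  static_assert (g (f ()), "");   // constexpr call returning a PMF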
gcc/cp/ChangeLog:
PR c++/98551
* constexpr.c (cxx_eval_call_expression): Check CLASS_TYPE_P
instead of AGGREGATE_TYPE_P before calling replace_result_decl.
gcc/testsuite/ChangeLog:
PR c++/98551
* g++.dg/cpp0x/constexpr-pmf2.C: New test.
Patrick Palka [Fri, 31 Jul 2020 02:21:41 +0000 (22:21 -0400)]
c++: decl_constant_value and unsharing [PR96197]
In the testcase from the PR we're seeing excessive memory use (> 5GB)
during constexpr evaluation, almost all of which is due to the call to
decl_constant_value in the VAR_DECL/CONST_DECL branch of
cxx_eval_constant_expression. We reach here every time we evaluate an
ARRAY_REF of a constexpr VAR_DECL, and from there decl_constant_value
makes an unshared copy of the VAR_DECL's initializer. But unsharing
here is unnecessary because callers of cxx_eval_constant_expression
already unshare its result when necessary.
To fix this excessive unsharing, this patch adds a new defaulted
parameter unshare_p to decl_really_constant_value and
decl_constant_value so that callers can control whether to unshare.
As a simplification, we can also move the call to unshare_expr in
constant_value_1 outside of the loop, since doing unshare_expr on a
DECL_P is a no-op.
Now that we no longer unshare the result of decl_constant_value and
decl_really_constant_value from cxx_eval_constant_expression, memory use
during constexpr evaluation for the testcase from the PR falls from ~5GB
to 15MB according to -ftime-report.
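A hedged sketch of the usage pattern that triggers many such evaluations
(illustrative only, not the PR testcase):

  constexpr int table[256] = { 1, 2, 3 };   // large constexpr array

  constexpr int
  sum ()
  {
    int s = 0;
    for (int i = 0; i < 256; ++i)
      s += table[i];  // each ARRAY_REF used to unshare table's initializer
    return s;
  }

  static_assert (sum () == 6, "");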
gcc/cp/ChangeLog:
PR c++/96197
* constexpr.c (cxx_eval_constant_expression) <case CONST_DECL>:
Pass false to decl_constant_value and decl_really_constant_value
so that they don't unshare their result.
* cp-tree.h (decl_constant_value): New declaration with an added
bool parameter.
(decl_really_constant_value): Add bool parameter defaulting to
true to existing declaration.
* init.c (constant_value_1): Add bool parameter which controls
whether to unshare the initializer before returning. Call
unshare_expr at most once.
(scalar_constant_value): Pass true to constant_value_1's new
bool parameter.
(decl_really_constant_value): Add bool parameter and forward it
to constant_value_1.
(decl_constant_value): Likewise, but instead define a new
overload with an added bool parameter.
gcc/testsuite/ChangeLog:
PR c++/96197
* g++.dg/cpp1y/constexpr-array8.C: New test.
Iain Sandoe [Wed, 6 Jan 2021 19:40:45 +0000 (19:40 +0000)]
testsuite, coroutines : Fix a bad testcase [PR96504].
Where possible (i.e. where that doesn't alter the intent of a test) we
use a suspend_always as the final suspend and a test that the coroutine
was 'done' to check that the state machine had terminated correctly.
Sometimes, filed PRs have 'suspend_never' as the final suspend expression
and that needs to be changed to match the testsuite style. This is one
I missed and means that the call to 'done()' on the handle is made to an
already-destructed coroutine.  Surprisingly, that didn't actually trigger
a failure until glibc 2.32.
Fixed by changing the final suspend to be 'suspend_always'.
gcc/testsuite/ChangeLog:
PR c++/96504
* g++.dg/coroutines/torture/pr95519-05-gro.C: Use suspend_always
as the final suspend point so that we can check that the state
machine has reached the expected point.
This was a bad testcase, found with -fsanitize=address; the final suspend
is 'suspend_never', which flows off the end of the coroutine, destroying
the promise and the frame.  At that point access via the handle is an
error. Fixed by checking that the promise is destroyed via a global var.
gcc/testsuite/ChangeLog:
* g++.dg/coroutines/torture/co-ret-17-void-ret-coro.C: Check for
promise destruction via a global variable.
Paul Thomas [Sun, 2 Aug 2020 09:35:36 +0000 (10:35 +0100)]
This patch fixes PR96325. See the explanatory comment in the testcase.
2020-08-02 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/96325
* primary.c (gfc_match_varspec): In the case that a component
reference is added to an intrinsic type component, emit the
error message in this function.
Paul Thomas [Sat, 26 Dec 2020 15:08:11 +0000 (15:08 +0000)]
Fix failures with -m32 and some memory leaks.
2020-12-23 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/83118
* trans-array.c (gfc_alloc_allocatable_for_assignment): Make
sure that class expressions are captured for dummy arguments by
use of gfc_get_class_from_gfc_expr otherwise the wrong vptr is
used.
* trans-expr.c (gfc_get_class_from_gfc_expr): New function.
(gfc_get_class_from_expr): If a constant expression is
encountered, return NULL_TREE;
(gfc_trans_assignment_1): Deallocate rhs allocatable components
after passing derived type function results to class lhs.
* trans.h : Add prototype for gfc_get_class_from_gfc_expr.