Jakub Jelinek [Wed, 1 Sep 2021 11:30:51 +0000 (13:30 +0200)]
vectorizer: Fix up vectorization using WIDEN_MINUS_EXPR [PR102124]
The following testcase is miscompiled on aarch64-linux at -O3 since the
introduction of WIDEN_MINUS_EXPR.
The problem is if the inner type (half_type) is unsigned and the result
type in which the subtraction is performed (type) has precision more than
twice as larger as the inner type's precision.
For other widening operations like WIDEN_{PLUS,MULT}_EXPR, if half_type
is unsigned, the addition/multiplication result in itype is also unsigned
and needs to be zero-extended to type.
But subtraction is special, even when half_type is unsigned, the subtraction
behaves as signed (also regardless of whether the result type is signed or
unsigned), 0xfeU - 0xffU is -1 or 0xffffffffU, not 0x0000ffff.
I think it is better not to use mixed signedness of types in
WIDEN_MINUS_EXPR (have unsigned vector of operands and signed result
vector), so this patch instead adds another cast to make sure we always
sign-extend the result from itype to type if type is wider than itype.
2021-09-01 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/102124
* tree-vect-patterns.c (vect_recog_widen_op_pattern): For ORIG_CODE
MINUS_EXPR, if itype is unsigned with smaller precision than type,
add an extra cast to signed variant of itype to ensure sign-extension.
Richard Biener [Wed, 1 Sep 2021 09:49:39 +0000 (11:49 +0200)]
tree-optimization/93491 - avoid PRE of trapping calls across exits
This makes us avoid PREing calls that could trap across other
calls that might not return. The PR88087 testcase has exactly
such case so I've refactored the testcase to contain a valid PRE.
I've also adjusted PRE to not consider pure calls possibly
not returning in line with what we do elsewhere.
Note we don't have a good idea whether a function always returns
normally or whether its body is known to never trap. That's
something IPA could compute.
2021-09-01 Richard Biener <rguenther@suse.de>
PR tree-optimization/93491
* tree-ssa-pre.c (compute_avail): Set BB_MAY_NOTRETURN
after processing the stmt itself. Do not consider
pure functions possibly not returning. Properly avoid
adding possibly trapping calls to EXP_GEN when there's
a preceeding possibly not returning call.
* tree-ssa-sccvn.c (vn_reference_may_trap): Conservatively
not handle calls.
* gcc.dg/torture/pr93491.c: New testcase.
* gcc.dg/tree-ssa/pr88087.c: Change to valid PRE opportunity.
Richard Biener [Tue, 31 Aug 2021 08:28:40 +0000 (10:28 +0200)]
tree-optimization/102139 - fix SLP DR base alignment
When doing whole-function SLP we have to make sure the recorded
base alignments we compute as the maximum alignment seen for a
base anywhere in the function is actually valid at the point
we want to make use of it.
To make this work we now record the stmt the alignment was derived
from in addition to the DRs innermost behavior and we use a
dominance check to verify the recorded info is valid when doing
BB vectorization. For this to work for groups inside a BB that are
separate by a call that might not return we now store the DR
analysis group-id permanently and use that for an additional check
when the DRs are in the same BB.
2021-08-31 Richard Biener <rguenther@suse.de>
PR tree-optimization/102139
* tree-vectorizer.h (vec_base_alignments): Adjust hash-map
type to record a std::pair of the stmt-info and the innermost
loop behavior.
(dr_vec_info::group): New member.
* tree-vect-data-refs.c (vect_record_base_alignment): Adjust.
(vect_compute_data_ref_alignment): Verify the recorded
base alignment can be used.
(data_ref_pair): Remove.
(dr_group_sort_cmp): Adjust.
(vect_analyze_data_ref_accesses): Store the group-ID in the
dr_vec_info and operate on a vector of dr_vec_infos.
YunQiang Su [Tue, 31 Aug 2021 11:19:49 +0000 (07:19 -0400)]
md/define_c_enum: support value assignation
Currently, the enums from define_c_enum and define_enum can only
has values one by one from 0.
In fact we can support the behaviour just like C, aka like
(define_enum "mips_isa" [(mips1 1) mips2 (mips32 32) mips32r2]),
then we can get
enum mips_isa {
MIPS_ISA_MIPS1 = 1,
MIPS_ISA_MIPS2 = 2,
MIPS_ISA_MIPS32 = 32,
MIPS_ISA_MIPS32R2 = 33
};
gcc/ChangeLog:
* read-md.c (md_reader::handle_enum): support value assignation.
* doc/md.texi: record define_c_enum value assignation support.
Jakub Jelinek [Wed, 1 Sep 2021 10:06:25 +0000 (12:06 +0200)]
bswap: Fix up bswap_view_convert handling [PR102141]
bswap_view_convert is used twice in spots where gsi_insert_before is the
right thing, but in the last one it wants to insert preparation stmts
for the VIEW_CONVERT_EXPR emitted with gsi_insert_after, where at the
gsi we still need to insert bswap_stmt and maybe mask_stmt whose lhs
the preparation stmts will use.
So, this patch adds a BEFORE argument to the function and emits the
preparation statements before or after depending on that.
2021-09-01 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/102141
* gimple-ssa-store-merging.c (bswap_view_convert): Add BEFORE
argument. If false, emit stmts after gsi instead of before, and
with GSI_NEW_STMT.
(bswap_replace): Adjust callers. When converting output of bswap,
emit VIEW_CONVERT prepratation stmts after a copy of gsi instead
of before it.
Roger Sayle [Wed, 1 Sep 2021 07:38:39 +0000 (08:38 +0100)]
C: PR c/79412: Poison decls with error_mark_node after type mismatch
This patch fixes an ICE during error-recovery regression in the C front-end.
The symptom is that the middle-end's sanity checking assertions fail during
gimplification when being asked to increment an array, which is non-sense.
The issue is that the C-front end has detected the type mismatch and
reported an error to the user, but hasn't provided any indication of this
to the middle-end, simply passing bogus trees that the optimizers recognize
as invalid.
This appears to be a frequently reported ICE with 94730, 94731, 101036
and 101365 all marked as duplicates.
I believe the correct (polite) fix is to mark the mismatched types as
problematic/dubious in the front-end, when the error is spotted, so that
the middle-end has a heads-up and can be a little more forgiving. This
patch to c-decl.c's duplicate_decls sets (both) mismatched types to
error_mark_node if they are significantly different, and we've issued
an error message. Alas, this is too punitive for FUNCTION_DECLs where
we store return types, parameter lists, parameter types and attributes
in the type, but fortunately the middle-end is already more cautious
about trusting possibly suspect function types.
This fix required one minor change to the testsuite, typedef-var-2.c
where after conflicting type definitions, we now no longer assume that
the (first or) second definition is the correct one. This change only
affects the behaviour after seen_error(), so should be relatively safe.
2021-09-01 Roger Sayle <roger@nextmovesoftware.com>
Joseph Myers <joseph@codesourcery.com>
gcc/c/ChangeLog
PR c/79412
* c-decl.c (duplicate_decls): On significant mismatches, mark the
types of both (non-function) decls as error_mark_node, so that the
middle-end can see the code is malformed.
(free_attr_access_data): Don't process if the type has been set to
error_mark_node.
gcc/testsuite/ChangeLog
PR c/79412
* gcc.dg/pr79412.c: New test case.
* gcc.dg/typedef-var-2.c: Update expeted errors.
Jason Merrill [Wed, 25 Aug 2021 19:10:21 +0000 (15:10 -0400)]
c++: Various small fixes
A copy-paste error, a couple of missed checks to guard undefined accesses,
and we don't need to use type_uses_auto to extract the auto node we just
built.
Patrick Palka [Tue, 31 Aug 2021 17:31:10 +0000 (13:31 -0400)]
c++: check arity before deduction w/ explicit targs [PR12672]
During overload resolution, when the arity of a function template
clearly disagrees with the arity of the call, no specialization of the
function template could yield a viable candidate. The deduction routine
type_unification_real already notices this situation, but not before
it substitutes explicit template arguments into the template, a step
which could induce a hard error. Although it's necessary to perform
this substitution first in order to check arity perfectly (since the
substitution can e.g. expand a non-trailing parameter pack), in most
cases we can determine ahead of time whether there's an arity
disagreement without needing to perform deduction at all.
To that end, this patch implements an (approximate) arity check in
add_template_candidate_real that guards actual deduction. It's enabled
only when there are explicit template arguments since that's when
deduction can force otherwise avoidable template instantiations. (I
experimented with enabling it unconditionally as an optimization, and
observed some improvements to compile time of about 5% but also some
slowdowns of about the same magnitude, so kept it conditional.)
In passing, this adds a least_p parameter to arity_rejection for sake
of consistent diagnostics with unify_arity.
A couple of testcases needed to be adjusted so that deduction continues
to occur as intended after this change. Except in unify6.C, where we
were expecting foo<void ()> to be ill-formed due to substitution
forming a function type with an added 'const', but ISTM this is
permitted by [dcl.fct]/7, so I changed the test accordingly.
PR c++/12672
gcc/cp/ChangeLog:
* call.c (rejection_reason::call_varargs_p): Rename this
previously unused member to ...
(rejection_reason::least_p): ... this.
(arity_rejection): Add least_p parameter.
(add_template_candidate_real): When there are explicit
template arguments, check that the arity of the call agrees with
the arity of the function before attempting deduction.
(print_arity_information): Add least_p parameter.
(print_z_candidate): Adjust call to print_arity_information.
Thomas Schwinge [Fri, 27 Aug 2021 05:49:35 +0000 (07:49 +0200)]
Fix 'OMP_CLAUSE_TILE' operands handling in 'gcc/tree.c:walk_tree_1'
In r245300 (commit 02889d23ee3b02854dff203dd87b9a25e30b61b4)
"OpenACC tile clause support" that one had changed to three operands,
similar to 'OMP_CLAUSE_COLLAPSE'.
There is no (existing) test case where this seems to matter (likewise
for 'OMP_CLAUSE_COLLAPSE'), but it's good to be consistent.
gcc/
* tree.c (walk_tree_1) <OMP_CLAUSE_TILE>: Handle three operands.
Jonathan Wakely [Tue, 31 Aug 2021 15:30:01 +0000 (16:30 +0100)]
libstdc++: Remove redundant noexcept-specifier on definitions
These destructors are noexcept anyway. I removed the redundant noexcept
from the error_category destructor's declaration in r0-123475, but
didn't remove it from the defaulted definition in system_error.cc. That
causes warnings if the library is built with Clang.
This removes the redundant noexcept from ~error_category and
~system_error and adds tests to ensure they really are noexcept.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* src/c++11/system_error.cc (error_category::~error_category()):
Remove noexcept-specifier.
(system_error::~system_error()): Likewise.
* testsuite/19_diagnostics/error_category/noexcept.cc: New test.
* testsuite/19_diagnostics/system_error/noexcept.cc: New test.
Jonathan Wakely [Tue, 31 Aug 2021 12:09:26 +0000 (13:09 +0100)]
libstdc++: Improve error handling in Net TS name resolution
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* include/experimental/internet (__make_resolver_error_code):
Handle EAI_SYSTEM errors.
(basic_resolver_results): Use __make_resolver_error_code. Use
Glibc NI_MAXHOST and NI_MAXSERV values for buffer sizes.
Roger Sayle [Tue, 31 Aug 2021 16:25:58 +0000 (17:25 +0100)]
[Committed] Fix subreg_promoted_mode breakage on various platforms.
My apologies for the inconvenience. My recent patch to preserve
SUBREG_PROMOTED_VAR_P on (extend:HI (subreg/s:QI (reg:SI))), and other
places in the middle-end, has broken the build on several targets.
The change to convert_modes inadvertently used the same
subreg_promoted_mode idiom for retrieving the mode of a SUBREG_REG
as the existing code just a few lines earlier. Alas in the meantime,
the original SUBREG gets replaced by one without SUBREG_PROMOTED_VAR_P,
the whole raison-d'etre for my patch, and I'd not realized/noticed
that subreg_promoted_mode asserts for this. Alas neither the bootstrap
and regression test on x86_64-pc-linux-gnu nor my testing on nvptx-none
must have hit this particular case. The logic of this transformation
is sound, it's the implementation that's bitten me.
This patch has been committed, after another "make bootstrap" on
x86_64-pc-linux-gnu (just in case), and confirmation/pre-approval
from Jeff Law that this indeed fixes the build failures seen on
several platforms.
My humble apologies again.
2021-08-31 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* expr.c (convert_modes): Don't use subreg_promoted_mode on a
SUBREG if it can't be guaranteed to a SUBREG_PROMOTED_VAR_P set.
Instead use the standard (safer) is_a <scalar_int_mode> idiom.
Jason Merrill [Mon, 30 Aug 2021 22:42:05 +0000 (18:42 -0400)]
c++: Improve error recovery with constexpr [PR92193]
The compiler tries to limit error cascades in limit_bad_template_recursion
by avoiding triggering a new instantiation from one that has caused errors.
We were exempting constexpr functions from this because they can be needed
for constant evaluation, but as more and more functions get marked
constexpr, this becomes an over-broad category. So as suggested on IRC,
this patch only exempts functions that are needed for mandatory constant
evaluation.
As noted in the comment, this flag doesn't particularly need to use a bit in
the FUNCTION_DECL, but there were still some free.
PR c++/92193
gcc/cp/ChangeLog:
* cp-tree.h (FNDECL_MANIFESTLY_CONST_EVALUATED): New.
* constexpr.c (cxx_eval_call_expression): Set it.
* pt.c (neglectable_inst_p): Check it.
I'm getting:
FAIL: gcc.dg/vect/pr101145.c scan-tree-dump-times vect "vectorized 1 loops" 7
FAIL: gcc.dg/vect/pr101145_1.c scan-tree-dump-times vect "vectorized 1 loops" 2
FAIL: gcc.dg/vect/pr101145_2.c scan-tree-dump-times vect "vectorized 1 loops" 2
FAIL: gcc.dg/vect/pr101145_3.c scan-tree-dump-times vect "vectorized 1 loops" 2
FAIL: gcc.dg/vect/pr101145.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorized 1 loops" 7
FAIL: gcc.dg/vect/pr101145_1.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorized 1 loops" 2
FAIL: gcc.dg/vect/pr101145_2.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorized 1 loops" 2
FAIL: gcc.dg/vect/pr101145_3.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorized 1 loops" 2
on i686-linux (or x86_64-linux with -m32/-mno-sse).
The problem is that those tests use dg-options, which in */vect/ testsuite
throws away all the carefully added default options to enable vectorization
on each target (and which e.g. vect_int etc. effective targets rely on).
The old way would be to name those tests gcc.dg/vect/O3-pr101145*,
but we can also use dg-additional-options (which doesn't throw the default
options, just appends to them) which is IMO better so that we don't have to
rename the tests.
2021-08-31 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/101145
* gcc.dg/vect/pr101145.c: Use dg-additional-options with just -O3
instead of dg-options with -O3 -fdump-tree-vect-details.
* gcc.dg/vect/pr101145_1.c: Likewise.
* gcc.dg/vect/pr101145_2.c: Likewise.
* gcc.dg/vect/pr101145_3.c: Likewise.
Add support for device-modifiers for 'omp target device'.
'device_num' and 'ancestor' are now parsed on target device constructs for C,
C++, and Fortran (see OpenMP specification 5.0, p. 170). When 'ancestor' is
used, then 'sorry, not supported' is output. Moreover, the restrictions for
'ancestor' are implemented (see OpenMP specification 5.0, p. 174f).
gcc/c/ChangeLog:
* c-parser.c (c_parser_omp_clause_device): Parse device-modifiers 'device_num'
and 'ancestor' in 'target device' clauses.
gcc/cp/ChangeLog:
* parser.c (cp_parser_omp_clause_device): Parse device-modifiers 'device_num'
and 'ancestor' in 'target device' clauses.
* semantics.c (finish_omp_clauses): Error handling. Constant device ids must
evaluate to '1' if 'ancestor' is used.
gcc/fortran/ChangeLog:
* gfortran.h: Add variable for 'ancestor' in struct gfc_omp_clauses.
* openmp.c (gfc_match_omp_clauses): Parse device-modifiers 'device_num'
and 'ancestor' in 'target device' clauses.
* trans-openmp.c (gfc_trans_omp_clauses): Set OMP_CLAUSE_DEVICE_ANCESTOR.
gcc/ChangeLog:
* gimplify.c (gimplify_scan_omp_clauses): Error handling. 'ancestor' only
allowed on target constructs and only with particular other clauses.
* omp-expand.c (expand_omp_target): Output of 'sorry, not supported' if
'ancestor' is used.
* omp-low.c (check_omp_nesting_restrictions): Error handling. No nested OpenMP
structs when 'ancestor' is used.
(scan_omp_1_stmt): No usage of OpenMP runtime routines in a target region when
'ancestor' is used.
* tree-pretty-print.c (dump_omp_clause): Append 'ancestor'.
* tree.h (OMP_CLAUSE_DEVICE_ANCESTOR): Define macro.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/target-device-1.c: New test.
* c-c++-common/gomp/target-device-2.c: New test.
* c-c++-common/gomp/target-device-ancestor-1.c: New test.
* c-c++-common/gomp/target-device-ancestor-2.c: New test.
* c-c++-common/gomp/target-device-ancestor-3.c: New test.
* c-c++-common/gomp/target-device-ancestor-4.c: New test.
* gfortran.dg/gomp/target-device-1.f90: New test.
* gfortran.dg/gomp/target-device-2.f90: New test.
* gfortran.dg/gomp/target-device-ancestor-1.f90: New test.
* gfortran.dg/gomp/target-device-ancestor-2.f90: New test.
* gfortran.dg/gomp/target-device-ancestor-3.f90: New test.
* gfortran.dg/gomp/target-device-ancestor-4.f90: New test.
Roger Sayle [Tue, 31 Aug 2021 10:45:07 +0000 (11:45 +0100)]
Preserve SUBREG_PROMOTED_VAR_P on (extend:HI (subreg/s:QI (reg:SI))).
SUBREG_PROMOTED_VAR_P is a mechanism for tracking that a partial subreg
is correctly zero-extended or sign-extended in the parent register. For
example, the RTL (subreg/s/v:QI (reg/v:SI 23 [ x ]) 0) indicates that the
byte x is zero extended in reg:SI 23, which is useful for optimization.
An example is that zero extending the above QImode value to HImode can
simply use a wider subreg, i.e. (subreg:HI (reg/v:SI 23 [ x ]) 0).
This patch addresses the oversight/missed optimization opportunity that
the new HImode subreg above should retain its SUBREG_PROMOTED_VAR_P
annotation as its value is guaranteed to be correctly extended in the
SImode parent. The code below to preserve SUBREG_PROMOTED_VAR_P is already
present in the middle-end (e.g. simplify-rtx.c:7232-7242) but missing
from one or two (precisely three) places that (accidentally) strip it.
Whilst there I also added another optimization. If we need to extend
the above QImode value beyond the SImode register holding it, say to
DImode, we can eliminate the SUBREG and simply extend from the SImode
register to DImode.
2021-08-31 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* expr.c (convert_modes): Preserve SUBREG_PROMOTED_VAR_P when
creating a (wider) partial subreg from a SUBREG_PROMOTED_VAR_P
subreg.
* simplify-rtx.c (simplify_unary_operation_1) [SIGN_EXTEND]:
Likewise, preserve SUBREG_PROMOTED_VAR_P when creating a (wider)
partial subreg from a SUBREG_PROMOTED_VAR_P subreg. Generate
SIGN_EXTEND of the SUBREG_REG when a subreg would be paradoxical.
[ZERO_EXTEND]: Likewise, preserve SUBREG_PROMOTED_VAR_P when
creating a (wider) partial subreg from a SUBREG_PROMOTED_VAR_P
subreg. Generate ZERO_EXTEND of the SUBREG_REG when a subreg
would be paradoxical.
Roger Sayle [Tue, 31 Aug 2021 10:41:57 +0000 (11:41 +0100)]
Only simplify TRUNCATE to SUBREG on TRULY_NOOP_TRUNCATION targets.
As recently remarked by Jeff Law, SUBREGs are the "forever chemicals"
of GCC's RTL; once created they persist in the environment. The problem,
according to the comment on lines 5428-5438 of combine.c is that
non-tieable SUBREGs interfere with reload/register allocation, so
combine often doesn't touch/clean-up instructions containing a SUBREG.
This is the first and simplest of two patches to tackle that problem,
by teaching combine to avoid converting explicit TRUNCATEs into
SUBREGs that it can't handle.
Consider the following (hypothetical) sequence of instructions on
a STORE_FLAG_VALUE=1 target, which stores a zero or one in an SI
register, then uselessly truncates to QImode, then extends it again.
which ideally (i.e. with this patch) combine would simplify to:
(set (reg:SI 0) (ne:SI (reg:BI 28) (const_int 0)))
Alas currently, during combine the middle TRUNCATE is converted into
a lowpart SUBREG, which subst then turns into (clobber (const_int 0)),
abandoning the attempted combination, that then never reaches recog.
2021-08-31 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* combine.c (combine_simplify_rtx): Avoid converting an explicit
TRUNCATE into a lowpart SUBREG on !TRULY_NOOP_TRUNCATION targets.
* simplify-rtx.c (simplify_unary_operation_1): Likewise.
Quoting from https://gcc.gnu.org/pipermail/gcc/2021-July/236716.html:
--------------------------------------------------------------------
It was pointed out to me off-list that config/aarch64/value-unwind.h
is missing the runtime exception. It looks like a few other files
are too; a fuller list is:
Certainly for the aarch64 file this was simply a mistake;
it seems to have been copied from the i386 version, both of which
reference the runtime exception but don't actually include it.
--------------------------------------------------------------------
Similarly, frv-abi.h referenced the exception but didn't include it.
pa64-hpux-lib.h was missing any reference to the exception.
The decision was that this was simply a mistake
[https://gcc.gnu.org/pipermail/gcc/2021-July/236717.html]:
--------------------------------------------------------------------
[…] It generally is
considered a textual omission. The runtime library components of GCC
are intended to be licensed under the runtime exception, which was
granted and approved at the time of introduction.
--------------------------------------------------------------------
and that we should simply change all of the files above
[https://gcc.gnu.org/pipermail/gcc/2021-July/236719.html]:
--------------------------------------------------------------------
Please correct the text in the files. The files in libgcc used in the
GCC runtime are intended to be licensed with the runtime exception and
GCC previously was granted approval for that licensing and purpose.
[…]
The runtime exception explicitly was intended for this purpose and
usage at the time that GCC received approval to apply the exception.
--------------------------------------------------------------------
Richard Biener [Tue, 31 Aug 2021 07:13:20 +0000 (09:13 +0200)]
middle-end/102129 - avoid TER of possibly trapping expressions
The following avoids applying TER to possibly trapping expressions,
preventing a trapping FP multiplication to be moved across a call
that should not be executed.
2021-08-31 Richard Biener <rguenther@suse.de>
PR middle-end/102129
* tree-ssa-ter.c (find_replaceable_in_bb): Do not move
possibly trapping expressions across calls.
Andrew Burgess [Mon, 30 Aug 2021 20:16:50 +0000 (21:16 +0100)]
gdb: Add a dependency between gdb and libbacktrace
GDB is going to start using libbacktrace, so add a build dependency
between the two modules. This change needs to be added into the GCC
toplevel files, and then back-ported to the binutils-gdb repository.
2021-08-31 Andrew Burgess <andrew.burgess@embecosm.com>
ChangeLog:
* Makefile.def: Add all-gdb dependency on all-libbacktrace.
* Makefile.in: Regenerate.
Jakub Jelinek [Tue, 31 Aug 2021 08:29:23 +0000 (10:29 +0200)]
tree-ssa-ccp: Fix up bit_value_binop on RSHIFT_EXPR [PR102134]
As mentioned in the PR, this hunk is guarded with !wi::neg_p (r1val | r1mask, sgn)
which means if sgn is UNSIGNED, it is always true, but r1val | r1mask in
widest_int is still sign-extended. That means wi::clz (arg) returns 0,
wi::get_precision (arg) returns some very large number
(WIDE_INT_MAX_PRECISION, on x86_64 576 bits) and width is 64, so we end up
with lzcount of -512 where the code afterwards expects a non-negative
lzcount. For arg without the sign bit set the code works right, those
numbers are zero extended and so wi::clz must return wi::get_precision (arg) - width
plus number of leading zero bits within the width precision.
The patch fixes it by handling the sign-extension specially, either it could
be done through wi::neg_p (arg) check, but lzcount == 0 works identically.
2021-08-31 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/102134
* tree-ssa-ccp.c (bit_value_binop) <case RSHIFT_EXPR>: If sgn is
UNSIGNED and r1val | r1mask has MSB set, ensure lzcount doesn't
become negative.
Andrew Pinski [Tue, 31 Aug 2021 05:36:47 +0000 (05:36 +0000)]
Fix gcc.dg/ipa/inline-8.c for -fPIC
The problem here is with -fPIC, both cmp and move
don't bind locally so they are not even tried to be
inlined. This fixes the issue by marking both
functions as static and now the testcase passes
for both -fPIC and -fno-PIC cases.
OK? Tested on x86_64-linux-gnu.
gcc/testsuite/ChangeLog:
* gcc.dg/ipa/inline-8.c: Mark cmp and move as
static so they both bind local and available for
inlinine.
Andrew Pinski [Mon, 30 Aug 2021 22:43:16 +0000 (22:43 +0000)]
Fix PR driver/79181 (and others), not deleting some /tmp/cc* files for LTO.
So the main issue here is that some signals are not setup unlike collect2.
So this merges the setting up of the signal handlers to one function in
collect-utils and has collect2 and lto-wrapper call that function.
OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
gcc/ChangeLog:
PR driver/79181
* collect-utils.c (setup_signals): New declaration.
* collect-utils.h (setup_signals): New function.
* collect2.c (handler): Delete.
(main): Instead of manually setting up the signals,
just call setup_signals.
* lto-wrapper.c (main): Likewise.
Andrew Pinski [Fri, 23 Jul 2021 17:21:06 +0000 (17:21 +0000)]
Fix x86/56337 : 1<<28 alignment is broken
The problem here is the x86_64 back-end uses a signed integer
for alignment and then divides by BITS_PER_UNIT so if we had
INT_MIN (which is what 1<<28*8 is), we would get the wrong result.
This fixes the problem by using unsigned for the argument to
x86_output_aligned_bss and x86_output_aligned_bss.
YunQiang Su [Sat, 28 Aug 2021 11:39:04 +0000 (07:39 -0400)]
MIPS: add .module mipsREV to all output asm file
Currently, the asm output file for MIPS has no rev info.
It can make some trouble, for example:
assembler is mips1 by default,
gcc is fpxx by default.
To assemble the output of gcc -S, we have to pass -mips2
to assembler.
gcc/ChangeLog:
* config/mips/mips.c (mips_module_isa_name): New.
mips_file_start: add .module mipsREV to all asm output
liuhongt [Mon, 30 Aug 2021 07:05:14 +0000 (15:05 +0800)]
Unify UNSPEC_MASKED_EQ/GT to the form of UNSPEC_PCMP.
Currently for evex vpcmpeqb instruction, we have two forms of rtl
template representation, one is (unspec [op1 op2] UNSPEC_MASK_EQ), the
other is (unspec [op1, op2, const_int 0] UNSPEC_PCMP), which increases
the maintenance burden, such as optimization (not: vpcmpeqb)
to (vpcmpneqb) requires two define_insn_and_split to match the two
forms respectively, this patch removes UNSPEC_MASK_EQ/GT, unifying
them into the form of UNSPEC_PCMP.
gcc/ChangeLog:
* config/i386/sse.md (*<avx512>_ucmp<mode>3_1): Change from
define_split to define_insn_and_split.
(*avx2_eq<mode>3): Removed.
(<avx512>_eq<mode>3<mask_scalar_merge_name>): Adjust pattern
(<avx512>_eq<mode>3<mask_scalar_merge_name>_1): Rename to ..
(*<avx512>_eq<mode>3<mask_scalar_merge_name>_1): .. this, and
adjust pattern.
(*avx2_gt<mode>3): Removed.
(<avx512>_gt<mode>3<mask_scalar_merge_name>): Change from
define_insn to define_expand, and adjust pattern.
(UNSPEC_MASKED_EQ, UNSPEC_MASKED_GT): Removed.
David Malcolm [Mon, 30 Aug 2021 22:36:31 +0000 (18:36 -0400)]
analyzer: support "bifurcation"; reimplement realloc [PR99260]
Most of the state-management code in the analyzer involves
modifying state objects in-place, which implies a single outcome.
(I originally implemented in-place modification because I wanted
to avoid having to create copies of state objects, and it's now
very difficult to change this aspect of the analyzer's design)
However, there are various special-cases such as "realloc" for which
it's best to split the state into multiple outcomes.
This patch adds a mechanism for "bifurcating" the analysis in places
where there isn't a split in the CFG, and uses it to implement realloc,
in this case treating it as having 3 possible outcomes:
- failure, returning NULL
- success, growing the buffer in-place without moving it
- success, allocating a new buffer, copying the content of the old
buffer to it, and freeing the old buffer.
gcc/analyzer/ChangeLog:
PR analyzer/99260
* analyzer.h (class custom_edge_info): New class, adapted from
exploded_edge::custom_info_t. Make member functions const.
Make update_model return bool, converting edge param from
reference to a pointer, and adding a ctxt param.
(class path_context): New class.
* call-info.cc: New file.
* call-info.h: New file.
* engine.cc: Include "analyzer/call-info.h" and <memory>.
(impl_region_model_context::impl_region_model_context): Update for
new m_path_ctxt field.
(impl_region_model_context::bifurcate): New.
(impl_region_model_context::terminate_path): New.
(impl_region_model_context::get_malloc_map): New.
(impl_sm_context::impl_sm_context): Update for new m_path_ctxt
field.
(impl_sm_context::get_fndecl_for_call): Likewise.
(impl_sm_context::set_next_state): Likewise.
(impl_sm_context::warn): Likewise.
(impl_sm_context::is_zero_assignment): Likewise.
(impl_sm_context::get_path_context): New.
(impl_sm_context::m_path_ctxt): New.
(impl_region_model_context::on_condition): Update for new
path_ctxt param. Handle m_enode_for_diag being NULL.
(impl_region_model_context::on_phi): Update for new path_ctxt
param.
(exploded_node::on_stmt): Add path_ctxt param, updating ctor calls
to use it as necessary. Use it to bail out after sm-handling,
if needed.
(exploded_node::detect_leaks): Update for new path_ctxt param.
(dynamic_call_info_t::update_model): Update for conversion of
exploded_edge::custom_info_t to custom_edge_info.
(dynamic_call_info_t::add_events_to_path): Likewise.
(rewind_info_t::update_model): Likewise.
(rewind_info_t::add_events_to_path): Likewise.
(exploded_edge::exploded_edge): Likewise.
(exploded_graph::add_edge): Likewise.
(exploded_graph::maybe_process_run_of_before_supernode_enodes):
Update for new path_ctxt param.
(class impl_path_context): New.
(exploded_graph::process_node): Update for new path_ctxt param.
Create an impl_path_context and pass it to exploded_node::on_stmt.
Use it to terminate iterating stmts if terminate_path is called
on it. After processing a run of stmts, query path_ctxt to
potentially terminate the analysis path, and/or to "bifurcate" the
analysis into multiple additional paths.
(feasibility_state::maybe_update_for_edge): Update for new
update_model ctxt param.
* exploded-graph.h
(impl_region_model_context::impl_region_model_context): Add
path_ctxt param.
(impl_region_model_context::bifurcate): New.
(impl_region_model_context::terminate_path): New
(impl_region_model_context::get_ext_state): New.
(impl_region_model_context::get_malloc_map): New.
(impl_region_model_context::m_path_ctxt): New field.
(exploded_node::on_stmt): Add path_ctxt param.
(class exploded_edge::custom_info_t): Move to analyzer.h, renaming
to custom_edge_info, and making the changes as noted in analyzer.h
above.
(exploded_edge::exploded_edge): Update for these changes to
exploded_edge::custom_info_t.
(exploded_edge::m_custom_info): Likewise.
(class dynamic_call_info_t): Likewise.
(class rewind_info_t): Likewise.
(exploded_graph::add_edge): Likewise.
* program-state.cc (program_state::on_edge): Update for new
path_ctxt param.
(program_state::push_call): Likewise.
(program_state::returning_call): Likewise.
(program_state::prune_for_point): Likewise.
* region-model-impl-calls.cc: Include "analyzer/call-info.h".
(call_details::get_fndecl_for_call): New.
(region_model::impl_call_realloc): Reimplement.
* region-model.cc (region_model::on_call_pre): Move call to
impl_call_realloc to...
(region_model::on_call_post): ...here. Consolidate creation
of call_details instance.
(noop_region_model_context::bifurcate): New.
(noop_region_model_context::terminate_path): New.
* region-model.h (call_details::get_call_stmt): New.
(call_details::get_fndecl_for_call): New.
(region_model::on_realloc_with_move): New.
(region_model_context::bifurcate): New.
(region_model_context::terminate_path): New.
(region_model_context::get_ext_state): New.
(region_model_context::get_malloc_map): New.
(noop_region_model_context::bifurcate): New.
(noop_region_model_context::terminate_path): New.
(noop_region_model_context::get_ext_state): New.
(noop_region_model_context::get_malloc_map): New.
* sm-malloc.cc: Include "analyzer/program-state.h".
(malloc_state_machine::on_realloc_call): Reimplement.
(malloc_state_machine::on_realloc_with_move): New.
(region_model::on_realloc_with_move): New.
* sm-signal.cc (class signal_delivery_edge_info_t): Update for
conversion from exploded_edge::custom_info_t to custom_edge_info.
* sm.h (sm_context::get_path_context): New.
* svalue.cc (svalue::maybe_get_constant): Call
unwrap_any_unmergeable.
gcc/testsuite/ChangeLog:
PR analyzer/99260
* gcc.dg/analyzer/capacity-2.c: Update for changes to realloc
analysis.
* gcc.dg/analyzer/pr99193-1.c: Likewise.
* gcc.dg/analyzer/pr99193-3.c: Likewise.
* gcc.dg/analyzer/realloc-1.c: Likewise. Add test coverage for
realloc of non-heap pointer, realloc from mismatching allocator,
and realloc on a freed pointer.
* gcc.dg/analyzer/realloc-2.c: New test.
Jason Merrill [Sun, 29 Aug 2021 22:17:22 +0000 (18:17 -0400)]
c++: limit instantiation with ill-formed class [PR96286]
I noticed that after the static_assert failures in lwg3466.cc, we got
various follow-on errors because we went ahead and tried to instantiate the
promise<T> member functions even after instantiating the class itself ran
into problems. Interrupting instantiation of the class itself seems likely
to cause error-recovery problems, but preventing instantiation of member
functions seems strictly better for error-recovery.
This doesn't fix any of the specific testcases in PR96286, but addresses
part of that problem space.
PR c++/96286
gcc/cp/ChangeLog:
* cp-tree.h (struct lang_type): Add erroneous bit-field.
(CLASSTYPE_ERRONEOUS): New.
* pt.c (limit_bad_template_recursion): Check it.
(instantiate_class_template_1): Set it.
Jason Merrill [Sat, 28 Aug 2021 04:40:29 +0000 (00:40 -0400)]
c++: preserve location through constexpr
While working on the patch for PR101460, I noticed that we were losing the
expression location when folding class prvalue expressions. The final patch
doesn't fold class prvalues, but this still seems a worthwhile change. I
don't add location wrappers for scalar prvalues because many callers are
trying to fold them away.
gcc/cp/ChangeLog:
* constexpr.c (cxx_eval_outermost_constant_expr): Copy
expr location to result.
Jason Merrill [Mon, 30 Aug 2021 13:44:28 +0000 (09:44 -0400)]
c++: fold function template args sooner [PR101460]
As discussed in the PR, we were giving a lot of unnecessary errors for this
testcase because we didn't try to do constant evaluation until
convert_nontype_argument, which happens for each of the candidates. But
when looking at a template-id as the function operand of a call, we can try
to fold arguments before we get into overload resolution.
Andrew Pinski [Mon, 30 Aug 2021 20:30:41 +0000 (20:30 +0000)]
Fix PR 90142: contrib/download_prerequisites uses test ==
Since == is not portable, it is better to use = in contrib/
download_prerequisites. The only place == was used is inside
the function md5_check which is used only on Mac OS X.
Tested on Mac OS X as:
./contrib/download_prerequisites --md5
Both with all files having the correct checksum and one with a broken one.
contrib/ChangeLog:
* download_prerequisites (md5_check): Replace == inside
test with = to be more portable.
Jason Merrill [Fri, 27 Aug 2021 21:28:28 +0000 (17:28 -0400)]
c++: Add warning about missing 'requires'
I noticed that concepts-lambda14.C had two useless requires-expressions:
static_assert(requires { C<T>; });
always succeeds, because C<T> is always a valid expression for any type,
regardless of whether C is satisfied for a particular type. Presumably the
user means
static_assert(requires { requires C<T>; });
to make the C<T> a nested-requirement. Of course,
static_assert(C<T>);
is much simpler and means the same thing; this is more relevant in the
middle of a longer requires-expression, such as the bug this warning found
in cmcstl2:
Harald Anlauf [Mon, 30 Aug 2021 20:41:01 +0000 (22:41 +0200)]
Fortran - correct check for constraint F2008:C628 / F2018:C932
gcc/fortran/ChangeLog:
PR fortran/101349
* resolve.c (resolve_allocate_expr): An unlimited polymorphic
argument to ALLOCATE must be ALLOCATABLE or a POINTER. Fix the
corresponding check.
gcc/testsuite/ChangeLog:
PR fortran/101349
* gfortran.dg/unlimited_polymorphic_33.f90: New test.
Bill Schmidt [Fri, 5 Mar 2021 01:54:00 +0000 (19:54 -0600)]
rs6000: Darwin builtin support
2021-03-04 Bill Schmidt <wschmidt@linux.ibm.com>
gcc/
* config/rs6000/darwin.h (SUBTARGET_INIT_BUILTINS): Use the new
decl when new_builtins_are_live.
* config/rs6000/rs6000-builtin-new.def (__builtin_cfstring): New
built-in.
This reworks the previous attempt to avoid leaving around if-converted
scalar code in BB vectorized loop bodies to keep costing independent
subgraphs which should address the observed regression with 519.lbm_r.
For this to work we now first cost all subgraphs and only after
doing that proceed to emit vectorized code.
2021-08-30 Richard Biener <rguenther@suse.de>
PR tree-optimization/102128
* tree-vect-slp.c (vect_bb_vectorization_profitable_p):
Move scanning for if-converted scalar code to the caller
and instead delay clearing the visited flag for profitable
subgraphs.
(vect_slp_region): Cost all subgraphs before scheduling.
For if-converted BB vectorization scan for scalar COND_EXPRs
and do not vectorize if any found and the cost model is
very-cheap.
Richard Biener [Mon, 30 Aug 2021 07:55:59 +0000 (09:55 +0200)]
Make sure -fexceptions is enabled when -fnon-call-exceptions is
This makes -fexceptions enabled by -fnon-call-exceptions, removing
the odd state of !flag_exceptions && flag_non_call_exceptions from
middle-end consideration.
2021-08-30 Richard Biener <rguenther@suse.de>
* common.opt (fexceptions): Mark
EnabledBy(fnon-call-exceptions).
* doc/invoke.texi (fnon-call-exceptions): Document this
enables -fexceptions.
Sebastian Huber [Tue, 17 Aug 2021 07:53:43 +0000 (09:53 +0200)]
Use __builtin_trap() for abort() if inhibit_libc
abort() is used in gcc_assert() and gcc_unreachable() which is used by target
libraries such as libgcov.a. This patch changes the abort() definition under
certain conditions. If inhibit_libc is defined and abort is not already
defined, then abort() is defined to __builtin_trap().
The inhibit_libc define is usually defined if GCC is built for targets running
in embedded systems which may optionally use a C standard library. If
inhibit_libc is defined, then there may be still a full featured abort()
available. abort() is a heavy weight function which depends on signals and
file streams. For statically linked applications, this means that a dependency
on gcc_assert() pulls in the support for signals and file streams. This could
prevent using gcov to test low end targets for example. Using __builtin_trap()
avoids these dependencies if the target implements a "trap" instruction. The
application or operating system could use a trap handler to react to failed GCC
runtime checks which caused a trap.
gcc/
* tsystem.h (abort): Define abort() if inhibit_libc is defined and it
is not already defined.
YunQiang Su [Sat, 28 Aug 2021 12:17:18 +0000 (08:17 -0400)]
libffi: Fix MIPS r6 support
for some instructions, MIPS r6 uses different encoding other than
the previous releases.
1. mips/n32.S disable .set mips4: since it casuses old insn encoding
is used.
https://github.com/libffi/libffi/pull/396 has been accepted as: 94c102aa69b04337f63498e0e6551fcdce549ae5
2. mips/ffi.c: the encoding for JR is hardcoded: we need to use
different value for r6 and pre-r6.
https://github.com/libffi/libffi/pull/401 has been accpeted as: 746dbe3a6a79a41931c03b51df2972be4d5e5028
libffi/
PR libffi/83636
* src/mips/n32.S: disable .set mips4
* src/mips/ffi.c: use different JR encoding for r6.
liuhongt [Fri, 6 Aug 2021 02:18:43 +0000 (10:18 +0800)]
Make sure we're playing with integral modes before call extract_integral_bit_field.
gcc/ChangeLog:
* expmed.c (extract_bit_field_1): Make sure we're playing with
integral modes before call extract_integral_bit_field.
(extract_integral_bit_field): Add a parameter of type
scalar_int_mode which corresponds to of tmode.
And call extract_and_convert_fixed_bit_field instead of
extract_fixed_bit_field and convert_extracted_bit_field.
(extract_and_convert_fixed_bit_field): New function, it's a
combination of extract_fixed_bit_field and
convert_extracted_bit_field.
Iain Buclaw [Sun, 29 Aug 2021 18:26:06 +0000 (20:26 +0200)]
libiberty: Add support for demangling local D template declarations
The D language now allows multiple different template declarations in
the same function that have the same mangled name. To make the mangled
names unique, a fake parent in the form `__Sddd' is added to the symbol.
This information is not important for the user, so the demangler now
handles and ignores it.
Iain Buclaw [Sun, 29 Aug 2021 17:00:33 +0000 (19:00 +0200)]
libiberty: Add support for D `typeof(*null)' types
The D language has a new bottom type `typeof(*null)'. Null types were
also incorrectly being demangled as `none', this has been fixed to be
`typeof(null)'.
Iain Sandoe [Fri, 27 Aug 2021 18:49:05 +0000 (19:49 +0100)]
Darwin: Fixes for darwin_libc_has_function.
Firstly, the checks for availability need not be run for any
currently supported Darwin version (or for any version of
Darwin on x86). In fact, the only test that is needed that
differs from the default is for the availbaility of sincos.
Test that and then fall back to the default implementation.
Secondly, the funtion appears to be called from the Jit library
before the value of darwin_macosx_version_min has been set up -
at present we work around this by guarding the checks on having
a non-null pointer for darwin_macosx_version_min.
* config/darwin.c (darwin_libc_has_function): Do not run
the checks for x86 or modern Darwin. Make sure that there
is a value set for darwin_macosx_version_min before testing.
Iain Buclaw [Sat, 28 Aug 2021 18:28:02 +0000 (20:28 +0200)]
d: Use `int` to store class and struct flags
gcc/d/ChangeLog:
* typeinfo.cc (TypeInfoVisitor::visit(TypeInfoClassDeclaration *)):
Use int to store type flags.
(TypeInfoVisitor::visit(TypeInfoStructDeclaration *)): Likewise.
Iain Buclaw [Sat, 28 Aug 2021 14:57:03 +0000 (16:57 +0200)]
d: ICE in gimple_register_canonical_type_1, at lto/lto-common.c:430 (PR102094)
User defined types have the TYPE_CXX_ODR_P flag set, but closure frames
did not. This mismatch led to an ICE in the conflict detection for ODR
and interoperable non-ODR types. As a given closure frame is tied
explicitly to a function, it already conforms to ODR.
PR d/102094
gcc/d/ChangeLog:
* d-codegen.cc (build_frame_type): Set TYPE_CXX_ODR_P.
Iain Sandoe [Mon, 15 Mar 2021 21:40:40 +0000 (21:40 +0000)]
testsuite, Darwin : Skip a test requiring strndup in libc.
Before Darwin11 there is no strndup in libc. This test fails with
warning output because of that - so skip it on these versions (since
they are not able to use strndup anyway).
gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/strndup-1.c: Skip for Darwin versions
without strndup support in libc.
Jan Hubicka [Sat, 28 Aug 2021 18:57:08 +0000 (20:57 +0200)]
Improve handling of table overflows in modref_ref_node
gcc/ChangeLog:
* ipa-modref-tree.h (modref_access_node::merge): Break out
logic combining offsets and logic merging ranges to ...
(modref_access_node::combined_offsets): ... here
(modref_access_node::update2): ... here
(modref_access_node::closer_pair_p): New member function.
(modref_access_node::forced_merge): New member function.
(modre_ref_node::insert): Do merging when table is full.
Jonathan Wakely [Sat, 28 Aug 2021 10:05:58 +0000 (11:05 +0100)]
libstdc++: Fix std::allocator<void> for versioned namespace
Removing the allocator<void> specialization for the versioned namespace
breaks _Extptr_allocator<void> because the allocator<void>
specialization was still declared in <bits/memoryfwd.h>, making it an
incomplete type. It wrong to remove that specialization anyway, because
it is still needed pre-C++20.
This removes the #if ! _GLIBCXX_INLINE_VERSION check, so that
allocator<void> is still explicitly specialized for the versioned
namespace, consistent with the normal unversioned namespace mode.
To make _Extptr_allocator<void> usable as a ProtoAllocator, this change
adds a default constructor and converting constructor. That is
consistent with std::allocator<void> since C++20 (and harmless to do for
earlier standards).
I'm also explicitly specializing allocator_traits<allocator<void>> so
that it doesn't need to use allocator<void>::construct and destroy.
Doing that allows those members to be removed, further simplifying
allocator<void>. That new explicit specialization can delete the
allocate, deallocate and max_size members, which are always ill-formed
for allocator<void>.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* include/bits/alloc_traits.h (allocator_traits): Add explicit
specialization for allocator<void>. Improve doxygen comments.
* include/bits/allocator.h (allocator<void>): Restore for the
versioned namespace.
(allocator<void>::construct, allocator<void>::destroy): Remove.
* include/ext/extptr_allocator.h (_Extptr_allocator<void>):
Add default constructor and converting constructor.
Jonathan Wakely [Thu, 26 Aug 2021 23:20:31 +0000 (00:20 +0100)]
libstdc++: Name std::function template parameter
This avoids "<template-parameter-2-2>" being shown in the diagnostics
for ill-formed uses of std::function constructor:
In instantiation of 'std::function<_Res(_ArgTypes ...)>::function(_Functor&&)
[with _Functor = f(f()::_Z1fv.frame*)::<lambda()>;
<template-parameter-2-2> = void; _Res = void; _ArgTypes = {}]'
Instead we get:
In instantiation of 'std::function<_Res(_ArgTypes ...)>::function(_Functor&&)
[with _Functor = f(f()::_Z1fv.frame*)::<lambda()>;
_Constraints = void; _Res = void; _ArgTypes = {}]'
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* include/bits/std_function.h (function::function(F&&)): Give
name to defaulted template parameter, to improve diagnostics.
Use markdown for more doxygen comments.
Alexandre Oliva [Sat, 28 Aug 2021 03:40:14 +0000 (00:40 -0300)]
fix latent bootstrap-debug issue
I've hit a bootstrap-debug error involving large subprograms in
gcc/ada/sem_ch12.adb. I'm afraid I couldn't narrow it down to a
reasonable testcase.
thread1 made different decisions about a block containing a
builtin_eh_filter call because in one compilation, estimate_num_insns
found a cgraph_node for the builtin and could thus get to the
is_simple_builtin test, but in the other it didn't. With different
insn counts, one stage jump-threaded and the other didn't, and the
resulting code diverged quite a bit.
The reason the builtin had a cgraph_node in one case but not the other
was that modref got a chance to analyze the builtin call when it was
the first stmt in the block, and that created the cgraph_node.
However, when it was preceded by debug stmts, the loop in
analyze_function was cut short after the first debug stmt, because the
summary so far was not useful.
This patch fixes both issues: skip debug stmts in the analyze_function
loop, so as to prevent them from affecting any decisions in the loop,
and enable the insn count estimator to get to the is_simple_builtin
test when a cgraph_node has not been created for the builtin.
for gcc/ChangeLog
* ipa-modref.c (analyze_function): Skip debug stmts.
* tree-inline.c (estimate_num_insn): Consider builtins even
without a cgraph_node.
Jason Merrill [Fri, 27 Aug 2021 14:00:49 +0000 (10:00 -0400)]
c++: Set type on dependent ARROW_EXPR
Even if the operand of -> has dependent type, if it's a pointer we know
that the result will be the target type of that pointer. This should avoid
some unnecessary TYPEOF_EXPR when looking up a name after ->.
gcc/cp/ChangeLog:
* typeck2.c (build_x_arrow): Do set TREE_TYPE when operand is
a dependent pointer.