David Malcolm [Wed, 22 Jan 2020 14:37:18 +0000 (09:37 -0500)]
analyzer: fix ICE due to sm-state origin being purged (PR 93382)
The ICE in PR analyzer/93382 is a validation error.
The global variable "idx" acquires a "tainted" state from local array
n1[0]. When the frame is popped, the svalue for n1[0] is purged, but
the "taint" sm_state_map's entry for "idx" has a svalue_id referencing
the now-purged svalue. This is caught by program_state::validate as an
assertion failure.
This patch fixes the issue by resetting the origin id within
sm_state_map entries for the case where the origin id has been purged.
gcc/analyzer/ChangeLog:
PR analyzer/93382
* program-state.cc (sm_state_map::on_svalue_purge): If the
entry survives, but the origin is being purged, then reset the
origin to null.
gcc/testsuite/ChangeLog:
PR analyzer/93382
* gcc.dg/analyzer/pr93382.c: New test.
Andrew Pinski [Wed, 22 Jan 2020 23:34:34 +0000 (23:34 +0000)]
Fix patchable-function-entry on arc
The problem here is arc looks at current_output_insn unconditional
but sometimes current_output_insn is NULL. With patchable-function-entry,
it will be. This is similar to how the nios2, handles "%.".
Committed as obvious after a simple test with -fpatchable-function-entry=1.
ChangeLog:
* config/arc/arc.c (output_short_suffix): Check insn for nullness.
Andrew Pinski [Thu, 23 Jan 2020 00:40:19 +0000 (00:40 +0000)]
Revert "Allow tree-ssa.exp to be run by itself" and move some testcases
This reverts commit 9085381f1931cc3667412c8fff91878184835901 as it was
causing default dg-do to be set incorrectly on most targets.
Instead move testcases that are vect related testcase that
use "dg-require-effective-target vect_*" to the vect test area.
ChangeLog:
* gcc.dg/tree-ssa/pr88497-1.c: Move to ...
* gcc.dg/vect/pr88497-1.c: This.
* gcc.dg/tree-ssa/pr88497-2.c: Move to ...
* gcc.dg/vect/pr88497-2.c: This.
* gcc.dg/tree-ssa/pr88497-3.c: Move to ...
* gcc.dg/vect/pr88497-3.c: This.
* gcc.dg/tree-ssa/pr88497-4.c: Move to ...
* gcc.dg/vect/pr88497-4.c: This.
* gcc.dg/tree-ssa/pr88497-5.c: Move to ...
* gcc.dg/vect/pr88497-5.c: This.
* gcc.dg/tree-ssa/pr88497-6.c: Move to ...
* gcc.dg/vect/pr88497-6.c: This.
* gcc.dg/tree-ssa/pr88497-7.c: Move to ...
* gcc.dg/vect/pr88497-7.c: This.
Revert:
* tree-ssa.exp: Set DEFAULT_VECTCFLAGS and DEFAULT_VECTCFLAGS.
Call check_vect_support_and_set_flags also.
Jason Merrill [Wed, 22 Jan 2020 19:21:06 +0000 (14:21 -0500)]
c-family: Fix problems with blender and PPC from PR 40752 patch.
blender in SPEC is built with -funsigned-char, which is also the default on
PPC, and exposed -Wsign-conversion issues that weren't seen by the x86_64
testsuite. In blender we were complaining about operands to an expression
that we didn't't previously complain about as a whole. So only check
operands after we check the whole expression. Also, to fix the PR 40752
testcases on -funsigned-char targets, don't consider -Wsign-conversion for
the second operand of PLUS_EXPR, especially since fold changes
"x - 5" to "x + (-5)". And don't use SCHAR_MAX with plain char.
PR testsuite/93391 - PR 40752 test fails with unsigned plain char.
PR c++/40752
* c-warn.c (conversion_warning): Check operands only after checking
the whole expression. Don't check second operand of + for sign.
Andrew Pinski [Mon, 20 Jan 2020 22:10:32 +0000 (22:10 +0000)]
Allow tree-ssa.exp to be run by itself
tree-ssa testcases sometimes check autovect effective target
but does not set it up. On MIPS, those testcases fail with
some TCL error messages. This fixes the issue by calling
check_vect_support_and_set_flags inside tree-ssa.exp.
There might be other .exp files which need to be done this
way too but I have not checked all of them.
Tested on x86_64-linux-gnu and a cross to mips64-octeon-linux-gnu.
Both full run of the testsuite and running tree-ssa.exp by itself.
testsuite/ChangeLog:
* tree-ssa.exp: Set DEFAULT_VECTCFLAGS and DEFAULT_VECTCFLAGS.
Call check_vect_support_and_set_flags also.
David Malcolm [Wed, 22 Jan 2020 16:45:58 +0000 (11:45 -0500)]
analyzer: fix setjmp handling with -g (PR 93378)
PR analyzer/93378 reports an ICE at -O1 -g when analyzing a rewind via
longjmp to a setjmp call with.
The root cause is that the rewind_info_t::get_setjmp_call attempts to
locate the setjmp GIMPLE_CALL via within the exploded_node containing
it, but the exploded_node has two stmts: a GIMPLE_DEBUG, then the
GIMPLE_CALL, and so erroneously picks the GIMPLE_DEBUG, leading to
a failed as_a <const gcall *>.
This patch reworks how the analyzer stores information about a setjmp
so that instead of storing an exploded_node *, it instead introduces
a "setjmp_record" struct, for use by both setjmp_svalue and
rewind_info_t. Hence we store the information directly, rather than
attempting to reconstruct it, fixing the bug.
gcc/analyzer/ChangeLog:
PR analyzer/93378
* engine.cc (setjmp_svalue::compare_fields): Update for
replacement of m_enode with m_setjmp_record.
(setjmp_svalue::add_to_hash): Likewise.
(setjmp_svalue::get_index): Rename...
(setjmp_svalue::get_enode_index): ...to this.
(setjmp_svalue::print_details): Update for replacement of m_enode
with m_setjmp_record.
(exploded_node::on_longjmp): Likewise.
* exploded-graph.h (rewind_info_t::m_enode_origin): Replace...
(rewind_info_t::m_setjmp_record): ...with this.
(rewind_info_t::rewind_info_t): Update for replacement of m_enode
with m_setjmp_record.
(rewind_info_t::get_setjmp_point): Likewise.
(rewind_info_t::get_setjmp_call): Likewise.
* region-model.cc (region_model::dump_summary_of_map): Likewise.
(region_model::on_setjmp): Likewise.
* region-model.h (struct setjmp_record): New struct.
(setjmp_svalue::m_enode): Replace...
(setjmp_svalue::m_setjmp_record): ...with this.
(setjmp_svalue::setjmp_svalue): Update for replacement of m_enode
with m_setjmp_record.
(setjmp_svalue::clone): Likewise.
(setjmp_svalue::get_index): Rename...
(setjmp_svalue::get_enode_index): ...to this.
(setjmp_svalue::get_exploded_node): Replace...
(setjmp_svalue::get_setjmp_record): ...with this.
gcc/testsuite/ChangeLog:
PR analyzer/93378
* gcc.dg/analyzer/setjmp-pr93378.c: New test.
David Malcolm [Fri, 17 Jan 2020 14:50:33 +0000 (09:50 -0500)]
analyzer: introduce namespace to avoid ODR clashes (PR 93307)
PR analyzer/93307 reports that in an LTO bootstrap, there are ODR
violations between:
- the "region" type:
gcc/analyzer/region-model.h:792
vs:
gcc/sched-int.h:1443
- the "constraint" type:
gcc/analyzer/constraint-manager.h:121
vs:
gcc/tree-ssa-structalias.c:533
This patches solves this clash by putting all of the analyzer names
within a namespace. I chose "ana" as it is short (to save typing).
The analyzer selftests are moved from namespace "selftest" to
"ana::selftest".
There are various places where the namespace has to be closed
and reopened, to allow e.g. for specializations of templates
in the global namespace.
gcc/ChangeLog:
PR analyzer/93307
* gdbinit.in (break-on-saved-diagnostic): Update for move of
diagnostic_manager into "ana" namespace.
* selftest-run-tests.c (selftest::run_tests): Update for move of
selftest::run_analyzer_selftests to
ana::selftest::run_analyzer_selftests.
Marek Polacek [Tue, 21 Jan 2020 22:38:54 +0000 (17:38 -0500)]
PR c++/92907 - noexcept does not consider "const" in member functions.
Here the problem is that if the noexcept specifier is used in the context
of a const member function, const is not considered for the member variables,
leading to a bogus error. g's const makes its 'this' const, so the first
overload of f should be selected.
In cp_parser_noexcept_specification_opt we inject 'this', but always
unqualified:
25737 if (current_class_type)
25738 inject_this_parameter (current_class_type, TYPE_UNQUALIFIED);
so we need to pass the function's qualifiers down here. In
cp_parser_direct_declarator it's easy: use the just parsed cv_quals, in
cp_parser_late_noexcept_specifier look at the 'this' parameter to figure it
out.
2020-01-22 Marek Polacek <polacek@redhat.com>
PR c++/92907 - noexcept does not consider "const" in member functions.
* parser.c (cp_parser_lambda_declarator_opt): Pass the proper
qualifiers to cp_parser_exception_specification_opt.
(cp_parser_direct_declarator): Pass the function qualifiers to
cp_parser_exception_specification_opt.
(cp_parser_class_specifier_1): Pass the function declaration to
cp_parser_late_noexcept_specifier.
(cp_parser_late_noexcept_specifier): Add a tree parameter. Use it to
pass the qualifiers of the function to
cp_parser_noexcept_specification_opt.
(cp_parser_noexcept_specification_opt): New cp_cv_quals parameter.
Use it in inject_this_parameter.
(cp_parser_exception_specification_opt): New cp_cv_quals parameter.
Use it.
(cp_parser_transaction): Pass TYPE_UNQUALIFIED to
cp_parser_noexcept_specification_opt.
(cp_parser_transaction_expression): Likewise.
Patrick Palka [Thu, 16 Jan 2020 21:46:40 +0000 (16:46 -0500)]
Fix a couple of memory leaks in the C++ frontend
The leak in get_mapped_args is due to auto_vec not properly supporting
destructible elements in that auto_vec's destructor doesn't call the
destructors of its elements.
gcc/cp/ChangeLog:
* constraint.cc (get_mapped_args): Avoid using auto_vec
as a vector element. Release the vectors inside the lists
vector.
* parser.c (cp_literal_operator_id): Free the buffer.
cfgexpand: Update partition size when merging variables
cfgexpand sorts variables by decreasing size, so when merging a later
variable into an earlier one, there's usually no need to update the
merged size.
But for poly_int sizes, the sort function just uses a lexicographical
comparison of the coefficients, so e.g. 2X+2 comes before 0X+32.
Which is bigger depends on the runtime value of X.
This patch therefore takes the upper bound of the two sizes, which
is conservatively correct for variable-length vectors and a no-op
on other targets.
It's probably a bad idea to merge fixed-length and variable-length
variables in practice, but that's really an optimisation decision.
I think we should have this patch as a correctness fix either way.
This is easiest to test using the ACLE, but in principle it could happen
for autovectorised code too, e.g. when using OpenMP vector variables.
It's therefore a regression from GCC 8.
2020-01-22 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* cfgexpand.c (union_stack_vars): Update the size.
gcc/testsuite/
* gcc.target/aarch64/sve/acle/general/stack_vars_1.c: New test.
Martin Liska [Wed, 22 Jan 2020 12:40:12 +0000 (13:40 +0100)]
Fix TOP N counter update.
PR tree-optimization/92924
* libgcov-profiler.c (__gcov_topn_values_profiler_body): First
try to find an existing value, then find an empty slot
if not found.
Richard Biener [Wed, 22 Jan 2020 11:38:12 +0000 (12:38 +0100)]
tree-optimization/93381 fix integer offsetting in points-to analysis
We were incorrectly assuming a merge operation is conservative enough
for not explicitely handled operations but we also need to consider
offsetting within fields when field-sensitive analysis applies.
2020-01-22 Richard Biener <rguenther@suse.de>
PR tree-optimization/93381
* tree-ssa-structalias.c (find_func_aliases): Assume offsetting
throughout, handle all conversions the same.
The two patterns that call aarch64_expand_subvti ensure that {low,high}_in1
is a register, while {low,high}_in2 can be a register or immediate.
subdi3_compare1_imm uses the aarch64_plus_immediate predicate for its last
two operands (the value and negated value), but aarch64_expand_subvti calls
it whenever low_in2 is a CONST_INT, which leads to ICEs during vregs pass,
as the emitted insn is not recognized as valid subdi3_compare1_imm.
The following patch fixes that by only using subdi3_compare1_imm if it is ok
to do so, and otherwise force the constant into register and use the
non-immediate version - subdi3_compare1.
Furthermore, previously the code was calling force_reg on high_in2 only if
low_in2 is CONST_INT, on the (reasonable) assumption is that only if low_in2
is a CONST_INT, high_in2 can be non-REG, but with the above changes even in
the else we might have CONST_INT and force_reg doesn't do anything if the
operand is already a REG, so this patch calls it unconditionally.
2020-01-22 Jakub Jelinek <jakub@redhat.com>
PR target/93335
* config/aarch64/aarch64.c (aarch64_expand_subvti): Only use
gen_subdi3_compare1_imm if low_in2 satisfies aarch64_plus_immediate
predicate, not whenever it is CONST_INT. Otherwise, force_reg it.
Call force_reg on high_in2 unconditionally.
Martin Liska [Wed, 22 Jan 2020 11:08:11 +0000 (12:08 +0100)]
Smart relaxation of TOP N counter.
PR tree-optimization/92924
* profile.c (compute_value_histograms): Divide
all counter values.
PR tree-optimization/92924
* libgcov-driver.c (prune_topn_counter): New.
(prune_counters): Likewise.
(dump_one_gcov): Prune a run-time counter.
* libgcov-profiler.c (__gcov_topn_values_profiler_body):
For a known value, add GCOV_TOPN_VALUES to value.
Otherwise, decrement all counters by one.
Richard Earnshaw [Wed, 22 Jan 2020 10:06:50 +0000 (10:06 +0000)]
contrib: script to create a new vendor branch
This script is intended to create a new vendor branch. Doing so is
not completely obvious if you are not familiar with the upstream
structure, so this takes the pain out of getting it right.
It doesn't check out the branch locally, but does set everything up so
that, if you have push enabled for your vendor branches, then
Jakub Jelinek [Wed, 22 Jan 2020 09:22:16 +0000 (10:22 +0100)]
i386: Fix up -fdollars-in-identifiers with identifiers starting with $ in -masm=att [PR91298]
In AT&T syntax leading $ is special, so if we have identifiers that start
with dollar, we usually fail to assemble it (or assemble incorrectly).
As mentioned in the PR, what works is wrapping the identifiers inside of
parens, like:
movl $($a), %eax
leaq ($a)(,%rdi,4), %rax
movl ($a)(%rip), %eax
movl ($a)+16(%rip), %eax
.globl $a
.type $a, @object
.size $a, 72
$a:
.string "$a"
.quad ($a)
(this is x86_64 -fno-pic -O2). In some places ($a) is not accepted,
like as .globl operand, in .type, .size, so the patch overrides
ASM_OUTPUT_SYMBOL_REF rather than e.g. ASM_OUTPUT_LABELREF.
I didn't want to duplicate what assemble_name is doing (following
transparent aliases), so split assemble_name into two parts; just
mere looking at the first character of a name before calling assemble_name
wouldn't be good enough, a transparent alias could lead from a name
not starting with $ to one starting with it and vice versa.
2020-01-22 Jakub Jelinek <jakub@redhat.com>
PR target/91298
* output.h (assemble_name_resolve): Declare.
* varasm.c (assemble_name_resolve): New function.
(assemble_name): Use it.
* config/i386/i386.h (ASM_OUTPUT_SYMBOL_REF): Define.
* gcc.target/i386/pr91298-1.c: New test.
* gcc.target/i386/pr91298-2.c: New test.
Jakub Jelinek [Wed, 22 Jan 2020 08:50:53 +0000 (09:50 +0100)]
openmp: Teach omp_code_to_statement about rest of OpenMP statements
The omp_code_to_statement function added with the initial OpenACC support
only handled small subset of the OpenMP statements, leading to ICE if
any other OpenMP directive appeared inside of OpenACC directive.
JunMa [Thu, 21 Nov 2019 00:51:22 +0000 (08:51 +0800)]
Add error messages for missing methods of awaitable class
gcc/cp/ChangeLog
* coroutines.cc (lookup_awaitable_member): Lookup an awaitable member.
(lookup_promise_method): Emit diagnostic when get NULL_TREE back only.
(build_co_await): Use lookup_awaitable_member instead of lookup_member.
gcc/testsuite/ChangeLog
* g++.dg/coroutines/coro1-missing-await-method.C: New test.
Andrew Pinski [Fri, 17 Jan 2020 06:54:53 +0000 (06:54 +0000)]
Fix target/93119 (aarch64): ICE with traditional TLS support on ILP32
The problem here was g:23b88fda665d2f995c was not a complete fix
for supporting tranditional TLS on ILP32.
So the problem here is a couple of things, first __tls_get_addr
call will return a C pointer value so we need to use ptr_mode
when we are creating the call. Then we need to convert
back that register to the correct mode, either zero extending
it or just creating a move instruction.
Also symbol_ref can either be in SImode or DImode. So we need to
allow both modes.
Built and tested on aarch64-linux-gnu with no regressions.
Also built a full toolchain (including glibc) defaulting to traditional
TLS that targets ilp32 and lp64.
ChangeLog:
PR target/93119
* config/aarch64/aarch64.md (tlsgd_small_<mode>): Have operand 0
as PTR mode. Have operand 1 as being modeless, it can be P mode.
(*tlsgd_small_<mode>): Likewise.
* config/aarch64/aarch64.c (aarch64_load_symref_appropriately)
<case SYMBOL_SMALL_TLSGD>: Call gen_tlsgd_small_* with a ptr_mode
register. Convert that register back to dest using convert_mode.
Joseph Myers [Wed, 22 Jan 2020 01:23:42 +0000 (01:23 +0000)]
Fix ICE with cast of division by zero (PR c/93348).
Bug 93348 reports an ICE on certain cases of casts of expressions that
may appear only in unevaluated parts of integer constant expressions,
arising from the generation of nested C_MAYBE_CONST_EXPRs. This patch
fixes it by adding a call to remove_c_maybe_const_expr in the
integer-operands case, as is done in other similar cases.
Bootstrapped with no regressions for x86_64-pc-linux-gnu.
PR c/93348
gcc/c:
* c-typeck.c (build_c_cast): Call remove_c_maybe_const_expr on
argument with integer operands.
gcc/testsuite:
* gcc.c-torture/compile/pr93348-1.c: New test.
David Malcolm [Tue, 21 Jan 2020 17:42:36 +0000 (12:42 -0500)]
analyzer: fix qsort issue with array_region keys (PR 93352)
PR analyzer/93352 reports a qsort failure
"comparator not anti-symmetric: -2147483648, -2147483648)"
within the analyzer on code involving an array access of [0x7fffffff + 1].
The issue is that array_region (which uses int for keys into known
values in the array) uses subtraction to implement int_cmp for sorting
the keys, which isn't going to work for boundary values.
Potentially a wider type should be used, but for now this patch fixes
the ICE by using explicit comparisons rather than subtraction to
implement the qsort callback.
gcc/analyzer/ChangeLog:
PR analyzer/93352
* region-model.cc (int_cmp): Rename to...
(array_region::key_cmp): ...this, using key_t rather than int.
Rewrite in terms of comparisons rather than subtraction to
ensure qsort is anti-symmetric when handling extreme values.
(array_region::walk_for_canonicalization): Update for above
renaming.
* region-model.h (array_region::key_cmp): New decl.
gcc/testsuite/ChangeLog:
PR analyzer/93352
* gcc.dg/analyzer/pr93352.c: New test.
Jason Merrill [Tue, 7 Jan 2020 17:20:26 +0000 (12:20 -0500)]
PR c++/40752 - useless -Wconversion with short +=.
This is a longstanding issue with lots of duplicates; people are not
interested in a -Wconversion warning about mychar += 1. So now that warning
depends on -Warith-conversion; otherwise we only warn if operands of the
arithmetic have conversion issues.
* c.opt (-Warith-conversion): New.
* c-warn.c (conversion_warning): Recurse for operands of
operators. Only warn about the whole expression with
-Warith-conversion.
Jason Merrill [Fri, 10 Jan 2020 17:49:03 +0000 (12:49 -0500)]
Handle -Wsign-conversion in conversion_warning.
It seemed strange to me to warn about sign conversion in
unsafe_conversion_p, when other warnings are in conversion_warning, and the
latter function is the only place that asks the former function to warn.
This change is also necessary for my -Warith-conversion patch.
Jim Wilson [Tue, 21 Jan 2020 23:20:19 +0000 (15:20 -0800)]
RISC-V: Fix rtl checking enabled failure with -msave-restore.
Found with an rtl checking enabled build and check. This triggered failures
in the gcc.target/riscv/save-restore* tests. We are using XINT to access an
XWINT value; INTVAL is the preferred solution.
gcc/
* config/riscv/riscv-sr.c (riscv_sr_match_prologue): Use INTVAL
instead of XINT.
Oops. A few stragglers, same as recent update: differing
-march=... options is an error, noticed with e.g.
"make check RUNTESTFLAGS=--target_board=cris-sim/arch=v10"
H.J. Lu [Tue, 21 Jan 2020 22:09:53 +0000 (14:09 -0800)]
i386: Do GNU2 TLS address computation in ptr_mode
Since GNU2 TLS address from glibc run-time is in ptr_mode, we should do
GNU2 TLS address computation in ptr_mode and zero-extend result to Pmode.
gcc/
PR target/93319
* config/i386/i386.c (ix86_tls_module_base): Replace Pmode
with ptr_mode.
(legitimize_tls_address): Do GNU2 TLS address computation in
ptr_mode and zero-extend result to Pmode.
* config/i386/i386.md (@tls_dynamic_gnu2_64_<mode>): Replace
:P with :PTR and Pmode with ptr_mode.
(*tls_dynamic_gnu2_lea_64_<mode>): Likewise.
(*tls_dynamic_gnu2_call_64_<mode>): Likewise.
(*tls_dynamic_gnu2_combine_64_<mode>): Likewise.
gcc/testsuite/
PR target/93319
* gcc.target/i386/pr93319-1a.c: Don't include <stdio.h>.
(test1): Replace printf with __builtin_printf.
Jason Merrill [Tue, 21 Jan 2020 19:21:49 +0000 (14:21 -0500)]
PR c++/60855 - ICE with sizeof VLA capture.
For normal captures we usually look through them within unevaluated context,
but that doesn't work here; trying to take the sizeof of the array in the
enclosing scope tries and fails to evaluate a SAVE_EXPR from the enclosing
scope.
* lambda.c (is_lambda_ignored_entity): Don't look past VLA capture.
Jason Merrill [Tue, 21 Jan 2020 18:22:35 +0000 (13:22 -0500)]
PR c++/90732 - ICE with VLA capture and generic lambda.
We were failing to handle VLA capture in tsubst_lambda_expr; initially
building a DECLTYPE_TYPE for the capture and then tsubsting it doesn't give
the special VLA handling. So with this patch we call add_capture again for
VLAs.
* pt.c (tsubst_lambda_expr): Repeat add_capture for VLAs.
Iain Sandoe [Tue, 21 Jan 2020 20:42:17 +0000 (20:42 +0000)]
[coro] Fix co_await of void type.
gcc/cp
2020-01-21 Iain Sandoe <iain@sandoe.co.uk>
Bin Cheng <bin.cheng@linux.alibaba.com>
* coroutines.cc (coro_promise_type_found_p): Check for NULL return
from complete_type_or_else.
(register_param_uses): Likewise.
(build_co_await): Do not try to use complete_type_or_else for void
types, otherwise for incomplete types, check for NULL return from
complete_type_or_else.
gcc/testsuite
2020-01-21 Bin Cheng <bin.linux@linux.alibaba.com>
* g++.dg/coroutines/co-await-void_type.C: New test.
Jakub Jelinek [Tue, 21 Jan 2020 20:43:03 +0000 (21:43 +0100)]
riscv: Fix up riscv_rtx_costs for RTL checking (PR target/93333)
As mentioned in the PR, during combine rtx_costs can be called sometimes
even on RTL that has not been validated yet and so can contain even operands
that aren't valid in any instruction.
2020-01-21 Jakub Jelinek <jakub@redhat.com>
PR target/93333
* config/riscv/riscv.c (riscv_rtx_costs) <case ZERO_EXTRACT>: Verify
the last two operands are CONST_INT_P before using them as such.
I'd used mode-based element types in the SVE ACLE implementation, but
it turns out that they don't correspond to the <stdint.h> types used by
ILP32 newlib. GCC already knows what the correct <stdint.h> types are,
I just wasn't using the right interface to find them.
A consequence of this is that ILP32 newlib code needs to cast "int *"
pointers to "int32_t *" before passing them to s32 loads and stores,
since int32_t is defined as "long int" rather than "int". That matches
the normal C++ overloading behaviour for this target, where passing
"int *" to:
void f(int32_t *);
void f(int64_t *);
would be ambiguous. It also matches the corresponding <arm_neon.h>
behaviour.
2020-01-21 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* config/aarch64/aarch64-sve-builtins.def: Use get_typenode_from_name
to get the integer element types.
Szabolcs Nagy [Wed, 15 Jan 2020 12:23:40 +0000 (12:23 +0000)]
[AArch64] PR92424: Fix -fpatchable-function-entry=N,M with BTI
This is a workaround that emits a BTI after the function label if that
is followed by a patch area. We try to remove the BTI that follows the
patch area (this may fail e.g. if the first instruction is a PACIASP).
So before this commit -fpatchable-function-entry=3,1 with bti generates
.section __patchable_function_entries
.8byte .LPFE
.text
.LPFE:
nop
foo:
nop
nop
bti c // or paciasp
...
and after this commit
.section __patchable_function_entries
.8byte .LPFE
.text
.LPFE:
nop
foo:
bti c
nop
nop
// may be paciasp
...
and with -fpatchable-function-entry=1 (M=0) the code now is
foo:
bti c
.section __patchable_function_entries
.8byte .LPFE
.text
.LPFE:
nop
// may be paciasp
...
There is a new bti insn in the middle of the patchable area users need
to be aware of unless M=0 (patch area is after the new bti) or M=N
(patch area is before the label, no new bti). Note: bti is not added to
all functions consistently (it can be turned off per function using a
target attribute or the compiler may detect that the function is never
called indirectly), so if bti is inserted in the middle of a patch area
then user code needs to deal with detecting it.
Tested on aarch64-none-linux-gnu.
gcc/ChangeLog:
PR target/92424
* config/aarch64/aarch64.c (aarch64_declare_function_name): Set
cfun->machine->label_is_assembled.
(aarch64_print_patchable_function_entry): New.
(TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY): Define.
* config/aarch64/aarch64.h (struct machine_function): New field,
label_is_assembled.
gcc/testsuite/ChangeLog:
PR target/92424
* gcc.target/aarch64/pr92424-1.c: New test.
* gcc.target/aarch64/pr92424-2.c: New test.
* gcc.target/aarch64/pr92424-3.c: New test.
Commit 9ceec73 introduced intrinsics for the AArch64 FP64 matrix
multiply instructions. These require binutils support for the same
instructions.
This patch adds a DejaGNU test to ensure this binutils support is there
and uses it in the files that need this test.
Testing Done:
Checked on a cross-compiler that:
Tests running for binutils commit e264b5b7a are listed as UNSUPPORTED.
Tests running for binutils commit 26916852e all pass.
gcc/testsuite/ChangeLog:
2020-01-21 Matthew Malcomson <matthew.malcomson@arm.com>
Jan Hubicka [Tue, 21 Jan 2020 15:33:43 +0000 (16:33 +0100)]
Fix updating of call_stmt_site_hash
This patch fixes ICE causes by call stmt site hash going out of sync. For
speculative edges it is assumed to contain a direct call so if we are
removing it hashtable needs to be updated. I realize that the code is ugly
but I will leave cleanup for next stage1.
Bootstrapped/regtested x86_64-linux. This patch makes it possible to build
Firefox again.
PR lto/93318
* cgraph.c (cgraph_edge::resolve_speculation,
cgraph_edge::redirect_call_stmt_to_callee): Fix update of
call_stmt_site_hash.
Jason Merrill [Mon, 20 Jan 2020 18:27:10 +0000 (13:27 -0500)]
PR c++/91476 - anon-namespace reference temp clash between TUs.
The problem in the PR was that make_temporary_var_for_ref_to_temp ran before
determine_visibility, so when we copied the linkage of the reference
variable it had not yet been restricted by its anonymous namespace context,
so the temporary wrongly ended up with TREE_PUBLIC set. The natural
solution is to run determine_visibility earlier. But that needs to happen
after maybe_commonize_var increases the linkage of some local variables, and
on targets without weak symbol support, that function does different things
based on the results of check_initializer, which is what calls
make_temporary_var_for_ref_to_temp. To break this circular dependency I'm
calling maybe_commonize_var early, and then again later if the target
doesn't support weak symbols.
It also occurred to me that make_temporary_var_for_ref_to_temp wasn't
handling DECL_VISIBILITY at all, and verified that we were doing the wrong
thing. So I've combined the linkage-copying code from there and two other
places.
* decl2.c (copy_linkage): Factor out of get_guard.
* call.c (make_temporary_var_for_ref_to_temp): Use it.
* decl.c (cp_finish_decomp): Use it.
(cp_finish_decl): determine_visibility sooner.
Richard Biener [Tue, 21 Jan 2020 09:37:18 +0000 (10:37 +0100)]
tree-optimization/92328 fix value-number with bogus type
We were actually value-numbering two entities with different type
the same rather than just having the same representation in the
hashtable. The following fixes that by wrapping the value in a
to be inserted VIEW_CONVERT_EXPR.
2020-01-21 Richard Biener <rguenther@suse.de>
PR tree-optimization/92328
* tree-ssa-sccvn.c (vn_reference_lookup_3): Preserve
type when value-numbering same-sized store by inserting a
VIEW_CONVERT_EXPR.
(eliminate_dom_walker::eliminate_stmt): When eliminating
a redundant store handle bit-reinterpretation of the same value.
Andrew Pinski [Tue, 21 Jan 2020 08:34:42 +0000 (08:34 +0000)]
Change recursive prepare_block_for_update to use a worklist
Reported as PR 93321, prepare_block_for_update with some huge
recusive inlining can go past the stack limit. Transforming this
recursive into worklist improves the stack usage here and we no
longer seg fault for the testcase. Note the order we walk the siblings
change.
ChangeLog:
PR tree-opt/93321
* tree-into-ssa.c (prepare_block_for_update_1): Split out from ...
(prepare_block_for_update): This. Use a worklist instead of recursing.
* gcc/config/arm/arm.c (clear_operation_p):
Initialise last_regno, skip first iteration
based on the first_set value and use ints instead
of the unnecessary HOST_WIDE_INTs.
Jakub Jelinek [Tue, 21 Jan 2020 08:17:27 +0000 (09:17 +0100)]
powerpc: Fix ICE with fp conditional move (PR target/93073)
The following testcase ICEs, because for TFmode the particular subtraction
pattern (*subtf3) is not enabled with the given options. Using
expand_simple_binop instead of emitting the subtraction by hand just moves
the ICE one insn later, NEG of ABS is not then recognized, etc., but
ultimately the problem is that when rs6000_emit_cmove is called for floating
point operand mode (and earlier condition ensures that in that case
compare_mode is also floating point), the expander makes sure the
operand mode is SFDF, but for the comparison mode nothing checks it, yet
there is just one *fsel* pattern with 2 separate SFDF iterators.
The following patch fixes it by giving up if compare_mode is not SFmode or
DFmode.
2020-01-21 Jakub Jelinek <jakub@redhat.com>
PR target/93073
* config/rs6000/rs6000.c (rs6000_emit_cmove): If using fsel, punt for
compare_mode other than SFmode or DFmode.
Frederik Harwath [Mon, 20 Jan 2020 06:45:43 +0000 (07:45 +0100)]
Add runtime ISA check for amdgcn offloading
The HSA/ROCm runtime rejects binaries not built for the exact GPU device
present. So far, the libgomp amdgcn plugin does not verify that the GPU ISA
and the ISA specified at compile time match before handing over the binary to
the runtime. In case of a mismatch, the user is confronted with an unhelpful
runtime error.
This commit implements a runtime ISA check. In case of an ISA mismatch, the
execution is aborted with a clear error message and a hint at the correct
compilation parameters for the GPU on which the execution has been attempted.
libgomp/
* plugin/plugin-gcn.c (EF_AMDGPU_MACH): New enum.
* (EF_AMDGPU_MACH_MASK): New constant.
* (gcn_isa): New typedef.
* (gcn_gfx801_s): New constant.
* (gcn_gfx803_s): New constant.
* (gcn_gfx900_s): New constant.
* (gcn_gfx906_s): New constant.
* (gcn_isa_name_len): New constant.
* (elf_gcn_isa_field): New function.
* (isa_hsa_name): New function.
* (isa_gcc_name): New function.
* (isa_code): New function.
* (struct agent_info): Add field "device_isa" and remove field
"gfx900_p".
* (GOMP_OFFLOAD_init_device): Adapt agent init to "agent_info"
field changes, fail if device has unknown ISA.
* (parse_target_attributes): Replace "gfx900_p" by "device_isa".
* (isa_matches_agent): New function ...
* (create_and_finalize_hsa_program): ... used from here to check
that the GPU ISA and the code-object ISA match.
Wilco Dijkstra [Mon, 20 Jan 2020 14:29:40 +0000 (14:29 +0000)]
[AArch64] Set jump-align=4 for neoversen1
Testing shows the setting of 32:16 for jump alignment has a significant
codesize cost, however it doesn't make a difference in performance.
So set jump-align to 4 to get 1.6% codesize improvement.
gcc/
* config/aarch64/aarch64.c (neoversen1_tunings): Set jump_align to 4.
Andrew Pinski [Sat, 18 Jan 2020 00:41:06 +0000 (00:41 +0000)]
Fix PR 93242: patchable-function-entry broken on MIPS
On MIPS, .set noreorder/reorder needs to emitted around
the nop. The template for the nop instruction uses %(/%) to
do that. But default_print_patchable_function_entry uses
fprintf rather than output_asm_insn to output the instruction.
This fixes the problem by using output_asm_insn to emit the nop
instruction.
ChangeLog:
PR middle-end/93242
* targhooks.c (default_print_patchable_function_entry): Use
output_asm_insn to emit the nop instruction.
Nathan Sidwell [Mon, 20 Jan 2020 13:39:59 +0000 (05:39 -0800)]
[PR 80005] Fix __has_include
__has_include is funky in that it is macro-like from the POV of #ifdef and
friends, but lexes its parenthesize argument #include-like. We were
failing the second part of that, because we used a forwarding macro to an
internal name, and hence always lexed the argument in macro-parameter
context. We componded that by not setting the right flag when lexing, so
it didn't even know. Mostly users got lucky.
This reimplements the handline.
1) Remove the forwarding, but declare object-like macros that
expand to themselves. This satisfies the #ifdef requirement
2) Correctly set angled_brackets when lexing the parameter. This tells
the lexer (a) <...> is a header name and (b) "..." is too (not a string).
3) Remove the in__has_include lexer state, just tell find_file that that's
what's happenning, so it doesn't emit an error.
We lose the (undocumented) ability to #undef __has_include. That may well
have been an accident of implementation. There are no tests for it.
We gain __has_include behaviour for all users of the preprocessors -- not
just the C-family ones that defined a forwarding macro.
H.J. Lu [Mon, 20 Jan 2020 13:02:14 +0000 (05:02 -0800)]
x32: Add x32 support to -mtls-dialect=gnu2
To add x32 support to -mtls-dialect=gnu2, we need to replace DI with
P in GNU2 TLS patterns. Since DEST set by tls_dynamic_gnu2_64 is in
ptr_mode, PLUS in GNU2 TLS address computation must be done in ptr_mode
to support -maddress-mode=long. Also replace the "{q}" suffix on lea
with "%z0" to support both 32-bit and 64-bit destination register.
Tested on Linux/x86-64.
gcc/
PR target/93319
* config/i386/i386.c (legitimize_tls_address): Pass Pmode to
gen_tls_dynamic_gnu2_64. Compute GNU2 TLS address in ptr_mode.
* config/i386/i386.md (tls_dynamic_gnu2_64): Renamed to ...
(@tls_dynamic_gnu2_64_<mode>): This. Replace DI with P.
(*tls_dynamic_gnu2_lea_64): Renamed to ...
(*tls_dynamic_gnu2_lea_64_<mode>): This. Replace DI with P.
Remove the {q} suffix from lea.
(*tls_dynamic_gnu2_call_64): Renamed to ...
(*tls_dynamic_gnu2_call_64_<mode>): This. Replace DI with P.
(*tls_dynamic_gnu2_combine_64): Renamed to ...
(*tls_dynamic_gnu2_combine_64_<mode>): This. Replace DI with P.
Pass Pmode to gen_tls_dynamic_gnu2_64.
Richard Biener [Mon, 20 Jan 2020 09:36:09 +0000 (10:36 +0100)]
debug/92763 keep DIEs that might be used in DW_TAG_inlined_subroutine
We were pruning type-local subroutine DIEs if their context is unused
despite us later needing those DIEs as abstract origins for inlines.
The patch makes code already present for -fvar-tracking-assignments
unconditional.
2020-01-20 Richard Biener <rguenther@suse.de>
PR debug/92763
* dwarf2out.c (prune_unused_types): Unconditionally mark
called function DIEs.
Richard Earnshaw [Mon, 20 Jan 2020 10:37:29 +0000 (10:37 +0000)]
contrib: New remotes structure for vendor and personal refs
The initial structure for vendor and personal branches makes use of
the default remote (normally origin) for the upstream
repository). Unfortunately, this causes some confusion, especially for
personal branches because a push will not push to the correct upstream
location. This can be 'fixed' by adding a push refspec for the remote,
but that has the unfortunate consequence of breaking the push.default
behaviour for git push, and it becomes too easy to accidentally commit
something unintended to the main parts of the repository.
To work around this, this patch changes the configuration to use
separate 'remotes' for these additional refs, with one remote for the
personal space and another remote for each vendor's space. The
personal space is called after the user's preferred branch-space
prefix (default 'me'), the vendor spaces are called
vendors/<vendor-name>.
As far as possible, I've made the script automatically restructure any
existing fetch or push lines that earlier versions of the scripts may
have created - the gcc-git-customization.sh script will convert all
vendor refs that it can find, so it is not necessary to re-add any
vendors you've already added.
You might, however, want to run
git remote prune <origin>
after running to clean up any stale upstream-refs that might still be
in your local repo, and then
git fetch vendors/<vendor>
or
git fetch <me>
to re-populate the remotes/ structures.
Also, for any branch you already have that tracks a personal or vendor
branch upstream, you might need to run
git config branch.<name>.remote <new-remote>
so that merges and pushes go to the right place (I haven't attempted
to automate this last part).
Please be aware that if you have multiple personal branches set up, then
git push <me>
will still consider all of them for pushing. If you only want to push
one branch, then either write
git push <me> HEAD
or
git push <me> <me>/branch
as appropriate.
And don't forget '-n' (--dry-run) to see what would be done if this
were not a dry run.
Finally, now that the vendors spaces are isolated from each other and
from the other spaces, I've added an option "--enable-push" to
git-fetch-vendor.sh. If passed, then a "push" spec will be added for
that vendor to enable pushing to the upstream. If you re-run the
script for the same vendor without the option, the push spec will be
removed.
* gcc-git-customization.sh: Check that user-supplied remote
name exists before continuting. Use a separate remotes for the
personal commit area. Convert existing personal and vendor
fetch rules to new layout.
* git-fetch-vendor.sh: New vendor layout. Add --enable-push
option.
Martin Liska [Mon, 20 Jan 2020 10:10:30 +0000 (11:10 +0100)]
Record outer non-cleanup region in TREE EH.
PR tree-optimization/93199
* tree-eh.c (struct leh_state): Add
new field outer_non_cleanup.
(cleanup_is_dead_in): Pass leh_state instead
of eh_region. Add a checking that state->outer_non_cleanup
points to outer non-clean up region.
(lower_try_finally): Record outer_non_cleanup
for this_state.
(lower_catch): Likewise.
(lower_eh_filter): Likewise.
(lower_eh_must_not_throw): Likewise.
(lower_cleanup): Likewise.
Eric S. Raymond [Mon, 20 Jan 2020 01:12:29 +0000 (17:12 -0800)]
Clean up references to Subversion in documentation sources.
Clean up references to SVN in in the GCC docs, redirecting to Git
documentation as appropriate.
Where references to "the source code repository" rather than a
specific VCS make sense, I have used them. You might, after
all, change VCSes again someday.
I have not modified either generated HTML files nor maintainer scripts.
These changes should be complete with repect to the documentation tree.
I see no (other) way to, depending on the absence of an option,
add an option for a specific target.
For gcc.dg/torture/pr26515.c and cris-elf, you get an error for
supplying multiple (different) -march=... options (where that
error is desirable), like testing cris-elf with
RUNTESTFLAGS=--target_board=cris-sim/arch=v8, where otherwise
-march=v10 and -march=v8 will both be given, and the test would
fail.
For historians, this was accidentally misordered and committed after
the (first) patch using march_option. Oops.
Jason Merrill [Sun, 19 Jan 2020 14:14:54 +0000 (09:14 -0500)]
PR c++/33799 - destroy return value, take 2.
This patch differs from the reverted patch for 33799 in that it adds the
CLEANUP_STMT for the return value at the end of the function, and only if
we've seen a cleanup that might throw, so it should not affect most C++11
code.
* cp-tree.h (current_retval_sentinel): New macro.
(struct language_function): Add throwing_cleanup bitfield.
* decl.c (cxx_maybe_build_cleanup): Set it.
* except.c (maybe_set_retval_sentinel)
(maybe_splice_retval_cleanup): New functions.
* parser.c (cp_parser_compound_statement): Call
maybe_splice_retval_cleanup.
* typeck.c (check_return_expr): Call maybe_set_retval_sentinel.
Jason Merrill [Sun, 19 Jan 2020 14:15:26 +0000 (09:15 -0500)]
Simplify lambda parsing.
Since we removed the special parsing for C++11 lambdas, it's just been an
open-coded copy of cp_parser_function_body. So let's call it instead. This
avoids the need to change this code in my revised 33799 patch.
* parser.c (cp_parser_lambda_body): Use cp_parser_function_body.
Jan Hubicka [Sun, 19 Jan 2020 15:41:11 +0000 (16:41 +0100)]
Implement speculative call verifier
this patch implements verifier and fixes one bug where speculative calls
produced by ipa-devirt ended up having num_speculative_call_targets = 0
instead of 1.
* cgraph.c (cgraph_edge::make_speculative): Increase number of
speculative targets.
(verify_speculative_call): New function
(cgraph_node::verify_node): Use it.
* ipa-profile.c (ipa_profile): Fix formating; do not set number of
speculations.
Jan Hubicka [Sun, 19 Jan 2020 12:49:38 +0000 (13:49 +0100)]
Fix ICE in speculative_call_info
this fixes two issues with the new multi-target speculation code which reproduce
on Firefox. I can now build firefox with FDO locally but on Mozilla build bots
it still fails with ICE in speculative_call_info.
One problem is that speuclative code compares call_stmt and lto_stmt_uid in
a way that may get unwanted effect when these gets out of sync. It does not
make sense to have both non-zero so I added code clearing it and sanity check
that it is kept this way.
Other problem is cgraph_edge::make_direct not working well with multiple
targets. In this case it removed one speuclative target and the indirect call
leaving other targets in the tree.
This is fixed by iterating across all targets and removing all except the good
one (if it exists).
PR lto/93318
* cgraph.c (cgraph_edge::resolve_speculation): Fix foramting.
(cgraph_edge::make_direct): Remove all indirect targets.
(cgraph_edge::redirect_call_stmt_to_callee): Use make_direct..
(cgraph_node::verify_node): Verify that only one call_stmt or
lto_stmt_uid is set.
* cgraphclones.c (cgraph_edge::clone): Set only one call_stmt or
lto_stmt_uid.
* lto-cgraph.c (lto_output_edge): Simplify streaming of stmt.
(lto_output_ref): Simplify streaming of stmt.
* lto-streamer-in.c (fixup_call_stmt_edges_1): Clear lto_stmt_uid.
Without this, there are, for each compilation of arit.c, 30ish
occurrences of "this statement may fall through
[-Wimplicit-fallthrough=]", for lines that look like
case 32: DS; case 31: DS; case 30: DS; case 29: DS;
config.gcc <obsolete targets>: Add crisv32-*-* and cris-*-linux*
I'm sorry to say that there's no incentive to maintain
crisv32-*-* and cris-*-linux* configurations beyond nostalgia,
(and I'm out of that for the moment). Support in the Linux
kernel for either applicable CRIS variant (CRIS v10 and CRIS
v32) is gone since 2018. Their related part of the cc0
transition workload would be noticable. Note that cris-elf
remains, but crisv32-elf and the CRIS v32 multilib will be
removed, at least for now.
I'm not completely happy about the message (the next-next line
after the context) "*** unless a maintainer comes forward"
because it'd have to be at an infinitesimal maintenance cost to
the cris-elf support. Still, I'm not bothered enough to add
another case construct or means for "planned obsolescence".