Paolo Carlini [Wed, 24 Jan 2018 00:57:18 +0000 (00:57 +0000)]
re PR c++/83921 (GCC rejects constexpr initialization of empty aggregate.)
/cp
2018-01-23 Paolo Carlini <paolo.carlini@oracle.com>
PR c++/83921
* decl.c (check_for_uninitialized_const_var): No longer static; add
bool and tsubst_flags_t parameters; adjust to be usable both in
constexpr contexts and not.
* constexpr.c (potential_constant_expression_1): Use the above.
* cp-tree.h (check_for_uninitialized_const_var): Declare.
/testsuite
2018-01-23 Paolo Carlini <paolo.carlini@oracle.com>
Max Filippov [Tue, 23 Jan 2018 21:42:52 +0000 (21:42 +0000)]
libgcc: xtensa: fix NaN return from add/sub/mul/div helpers
libgcc/
2018-01-23 Max Filippov <jcmvbkbc@gmail.com>
* config/xtensa/ieee754-df.S (__adddf3, __subdf3, __muldf3)
(__divdf3): Make NaN return value quiet.
* config/xtensa/ieee754-sf.S (__addsf3, __subsf3, __mulsf3)
(__divsf3): Make NaN return value quiet.
H.J. Lu [Tue, 23 Jan 2018 19:30:32 +0000 (19:30 +0000)]
i386: Use const reference of struct ix86_frame to avoid copy
We can bind a const reference to struct ix86_frame to avoid making a local
copy of it. ix86_expand_epilogue makes a local copy of struct ix86_frame
and uses its reg_save_offset field as a local variable. This patch uses a
separate local variable for reg_save_offset instead.
Tested on x86-64 with ada.
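A minimal sketch of the pattern described above; the names (frame_info,
current_frame, expand_epilogue) are illustrative stand-ins, not GCC's
ix86_frame machinery:

  #include <cstdint>

  struct frame_info
  {
    int64_t frame_size;
    int64_t reg_save_offset;
  };

  static frame_info current_frame = { 128, 16 };   /* computed earlier */

  static void expand_epilogue ()
  {
    const frame_info &frame = current_frame;          /* reference, no struct copy */
    int64_t reg_save_offset = frame.reg_save_offset;  /* separate local for the field we adjust */
    reg_save_offset += 8;                             /* adjust locally; frame stays const */
    (void) frame.frame_size;
    (void) reg_save_offset;
  }

  int main () { expand_epilogue (); return 0; }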
PR target/83905
* config/i386/i386.c (ix86_expand_prologue): Use const reference
of struct ix86_frame.
(ix86_expand_epilogue): Likewise. Add a local variable for
the reg_save_offset field in struct ix86_frame.
Bin Cheng [Tue, 23 Jan 2018 16:47:03 +0000 (16:47 +0000)]
re PR tree-optimization/82604 (SPEC CPU2006 410.bwaves ~50% performance regression with trunk@253679 when ftree-parallelize-loops is used)
PR tree-optimization/82604
* tree-loop-distribution.c (enum partition_kind): New enum item
PKIND_PARTIAL_MEMSET.
(partition_builtin_p): Support above new enum item.
(generate_code_for_partition): Ditto.
(compute_access_range): Differentiate cases that equality can be
proven at all loops, the innermost loops or no loops.
(classify_builtin_st, classify_builtin_ldst): Adjust call to above
function. Set PKIND_PARTIAL_MEMSET for partition appropriately.
(finalize_partitions, distribute_loop): Don't fuse partition of
PKIND_PARTIAL_MEMSET kind when distributing 3-level loop nest.
(prepare_perfect_loop_nest): Distribute 3-level loop nest only if
parloop is enabled.
Martin Liska [Tue, 23 Jan 2018 15:46:02 +0000 (16:46 +0100)]
Handle trailing arrays in ODR warning (PR lto/81440).
2018-01-23 Martin Liska <mliska@suse.cz>
PR lto/81440
* lto-symtab.c (lto_symtab_merge): Handle and do not warn about
trailing arrays at the end of a struct.
2018-01-23 Martin Liska <mliska@suse.cz>
PR lto/81440
* gcc.dg/lto/pr81440.h: New test.
* gcc.dg/lto/pr81440_0.c: New test.
* gcc.dg/lto/pr81440_1.c: New test.
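A hedged reconstruction of the kind of cross-TU mismatch the warning should
now tolerate (struct msg is an illustrative name, not the actual pr81440
testcase):

  /* Translation unit 1: flexible trailing array.  */
  struct msg { int len; char data[]; };

  /* Translation unit 2: fixed-size trailing array.  With -flto the type
     merge should no longer warn about this trailing-array difference.  */
  struct msg { int len; char data[16]; };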
Martin Liska [Tue, 23 Jan 2018 15:43:59 +0000 (16:43 +0100)]
Remove predictors that are unreliable.
2018-01-23 Martin Liska <mliska@suse.cz>
* predict.def (PRED_INDIR_CALL): Set probability to PROB_EVEN in
order to ignore the predictor.
(PRED_POLYMORPHIC_CALL): Likewise.
(PRED_RECURSIVE_CALL): Likewise.
Martin Liska [Tue, 23 Jan 2018 12:26:37 +0000 (13:26 +0100)]
Clean-up IPA profile dump output.
2018-01-23 Martin Liska <mliska@suse.cz>
* tree-profile.c (tree_profiling): Print a function header so the
reader knows which function is being worked on.
* value-prof.c (gimple_find_values_to_profile): Do not print
uninteresting value histograms.
Martin Liska [Tue, 23 Jan 2018 12:24:55 +0000 (13:24 +0100)]
Fix profile_quality sanity check.
2018-01-22 Martin Liska <mliska@suse.cz>
* profile-count.h (enum profile_quality): Add
profile_uninitialized as the first value. Do not number the
values explicitly, as they are zero-based.
(profile_count::verify): Update sanity check.
(profile_probability::verify): Likewise.
Nathan Sidwell [Tue, 23 Jan 2018 12:18:50 +0000 (12:18 +0000)]
[C++ PATCH] Deprecate ARM-era for scopes
https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01940.html
gcc/cp/
Deprecate ARM-era for scope handling
* decl.c (poplevel): Flag_new_for_scope is boolean-like.
(cxx_init_decl_processing): Deprecate flag_new_for_scope being
cleared.
* name-lookup.c (check_for_out_of_scope_variable): Deprecate and
clean up handling.
* semantics.c (begin_for_scope): Flag_new_for_scope is
boolean-like.
(finish_for_stmt, begin_range_for_stmt): Likewise.
David Malcolm [Tue, 23 Jan 2018 11:10:47 +0000 (11:10 +0000)]
-Warray-bounds: Fix false positive in some "switch" stmts (PR tree-optimization/83510)
PR tree-optimization/83510 reports that r255649 (for
PR tree-optimization/83312) introduced a false positive for
-Warray-bounds for array accesses within certain switch statements:
those for which value-ranges allow more than one case to be reachable,
but for which one or more of the VR-unreachable cases contain
out-of-range array accesses.
In the reproducer, after the switch in f is inlined into g, we have 3 cases
for the switch (case 9, case 10-19, and default), within a loop that
ranges from 0..9.
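A hedged sketch of code with that shape (f, g, and the loop bound come from
the description above; arr and the case bodies are illustrative, not the
actual gcc.c-torture/compile/pr83510.c testcase):

  extern int arr[10];

  static int f (int i)
  {
    switch (i)
      {
      case 9:
        return arr[i];
      case 10 ... 19:        /* GNU case range; VR-unreachable for i in 0..9 */
        return arr[i];       /* out-of-range access that triggered the warning */
      default:
        return 0;
      }
  }

  int g (void)
  {
    int sum = 0;
    for (int i = 0; i < 10; ++i)   /* loop ranges over 0..9 */
      sum += f (i);
    return sum;
  }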
With both the old and new code, vr_values::simplify_switch_using_ranges clears
the EDGE_EXECUTABLE flag on the edge to the "case 10-19" block. This
happens during the dom walk within the substitute_and_fold_engine.
With the old code, the clearing of that EDGE_EXECUTABLE flag led to the
/* Skip blocks that were found to be unreachable. */
code in the old implementation of vrp_prop::check_all_array_refs skipping
the "case 10-19" block.
With the new code, we have a second dom walk, and that dom_walker's ctor
sets all edges to be EDGE_EXECUTABLE, losing that information.
Then, dom_walker::before_dom_children (here, the subclass'
check_array_bounds_dom_walker::before_dom_children) can return one edge, if
there's a unique successor edge, and dom_walker::walk filters the dom walk
to just that edge.
Here we have two VR-valid edges (case 9 and default), and a VR-invalid
successor edge (case 10-19). There's no *unique* valid successor edge,
and hence taken_edge is NULL, and the filtering in dom_walker::walk
doesn't fire.
Hence we've lost the filtering of the "case 10-19" BB, hence the false
positive.
The issue is that we have two dom walks: first within vr_values'
substitute_and_fold_dom_walker (which has skip_unreachable_blocks == false),
then another within vrp_prop::check_all_array_refs (with
skip_unreachable_blocks == true).
Each has different "knowledge" about ruling out edges due to value-ranges,
but we aren't combining that information. The former "knows" about
out-edges at a particular control construct (e.g. at a switch), the latter
"knows" about dominance, but only about unique successors (hence the
problem when two out of three switch cases are valid).
This patch combines the information by preserving the EDGE_EXECUTABLE
flags from the first dom walk, and using it in the second dom walk,
potentially rejecting additional edges.
Doing so fixes the false positive.
I attempted an alternative fix, merging the two dom walks into one, but
that led to crashes in identify_jump_threads, so I went with this, as
a less invasive fix.
gcc/ChangeLog:
PR tree-optimization/83510
* domwalk.c (set_all_edges_as_executable): New function.
(dom_walker::dom_walker): Convert bool param
"skip_unreachable_blocks" to enum reachability. Move setup of
edge flags to set_all_edges_as_executable and only do it when
reachability is REACHABLE_BLOCKS.
* domwalk.h (enum dom_walker::reachability): New enum.
(dom_walker::dom_walker): Convert bool param
"skip_unreachable_blocks" to enum reachability.
(set_all_edges_as_executable): New decl.
* graphite-scop-detection.c (gather_bbs::gather_bbs): Convert
from false for "skip_unreachable_blocks" to ALL_BLOCKS for
"reachability".
* tree-ssa-dom.c (dom_opt_dom_walker::dom_opt_dom_walker): Likewise,
but converting true to REACHABLE_BLOCKS.
* tree-ssa-sccvn.c (sccvn_dom_walker::sccvn_dom_walker): Likewise.
* tree-vrp.c
(check_array_bounds_dom_walker::check_array_bounds_dom_walker):
Likewise, but converting it to REACHABLE_BLOCKS_PRESERVING_FLAGS.
(vrp_dom_walker::vrp_dom_walker): Likewise, but converting it to
REACHABLE_BLOCKS.
(vrp_prop::vrp_finalize): Call set_all_edges_as_executable
if check_all_array_refs will be called.
gcc/testsuite/ChangeLog:
PR tree-optimization/83510
* gcc.c-torture/compile/pr83510.c: New test case.
Fix vect_float markup for a couple of tests (PR 83888)
vect_float is true for arm*-*-* targets, but the support is only
available when -funsafe-math-optimizations is on. This caused
failures in two tests that disable fast-math.
The easiest fix seemed to be to add a new target selector for
"vect_float without special options".
2018-01-23 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
PR testsuite/83888
* doc/sourcebuild.texi (vect_float): Say that the selector
only describes the situation when -funsafe-math-optimizations is on.
(vect_float_strict): Document.
gcc/testsuite/
PR testsuite/83888
* lib/target-supports.exp (check_effective_target_vect_float): Say
that the result only holds when -funsafe-math-optimizations is on.
(check_effective_target_vect_float_strict): New procedure.
* gcc.dg/vect/no-fast-math-vect16.c: Use vect_float_strict instead
of vect_float.
* gcc.dg/vect/vect-reduc-6.c: Likewise.
Disable some patterns for fold-left reductions (PR 83965)
In this PR we recognised a PLUS_EXPR as a fold-left reduction,
then applied pattern matching to convert it to a WIDEN_SUM_EXPR.
We need to keep the original code in this case since we implement
the reduction using scalar rather than vector operations.
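A hedged sketch of the general shape of reduction involved (widen_sum is an
illustrative name, not the pr83965.c testcase): narrower elements summed into
a wider accumulator, which the widen-sum pattern matches but which must stay
as the original code when the reduction has to be done in order:

  int widen_sum (const short *x, int n)
  {
    int sum = 0;
    for (int i = 0; i < n; ++i)
      sum += x[i];        /* short widened to int before accumulating */
    return sum;
  }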
2018-01-23 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
PR tree-optimization/83965
* tree-vect-patterns.c (vect_reassociating_reduction_p): New function.
(vect_recog_dot_prod_pattern, vect_recog_sad_pattern): Use it
instead of checking only for a reduction.
(vect_recog_widen_sum_pattern): Likewise.
gcc/testsuite/
PR tree-optimization/83965
* gcc.dg/vect/pr83965.c: New test.
Jan Hubicka [Tue, 23 Jan 2018 09:55:37 +0000 (10:55 +0100)]
predict.c (probably_never_executed): Only use precise profile info.
* predict.c (probably_never_executed): Only use precise profile info.
(compute_function_frequency): Skip the after-inlining hack since we
now have quality checking.
Richard Biener [Tue, 23 Jan 2018 08:00:20 +0000 (08:00 +0000)]
re PR tree-optimization/83963 ([graphite] ICE in merge_sese, at graphite-scop-detection.c:517)
2018-01-23 Richard Biener <rguenther@suse.de>
PR tree-optimization/83963
* graphite-scop-detection.c (scop_detection::harmful_loop_in_region):
Properly terminate the dominator walk when crossing the exit edge,
not when visiting its source block.
* gfortran.dg/graphite/pr83963.f: New testcase.
* gcc.dg/graphite/pr83963-2.c: Likewise.
Jakub Jelinek [Mon, 22 Jan 2018 22:59:33 +0000 (23:59 +0100)]
re PR tree-optimization/83081 ([arm] gcc.dg/pr80218.c fails since r254888)
PR tree-optimization/83081
* profile-count.h (profile_probability::split): New method.
* dojump.c (do_jump_1) <case TRUTH_ANDIF_EXPR, case TRUTH_ORIF_EXPR>:
Use profile_probability::split.
(do_compare_rtx_and_jump): Fix adjustment of probabilities
when splitting a single conditional jump into 2.
Sebastian Perta [Mon, 22 Jan 2018 20:20:28 +0000 (20:20 +0000)]
rl78-expand.md: New define_expand "bswaphi2"
2018-01-22 Sebastian Perta <sebastian.perta@renesas.com>
* config/rl78/rl78-expand.md: New define_expand "bswaphi2"
* config/rl78/rl78-virt.md: New define_insn "*bswaphi2_virt"
* config/rl78/rl78-real.md: New define_insn "*bswaphi2_real"
Sebastian Perta [Mon, 22 Jan 2018 19:53:55 +0000 (19:53 +0000)]
rl78-protos.h: New function declaration rl78_split_movdi
2018-01-22 Sebastian Perta <sebastian.perta@renesas.com>
* config/rl78/rl78-protos.h: New function declaration rl78_split_movdi
* config/rl78/rl78.md: New define_expand "movdi"
* config/rl78/rl78.c: New function definition rl78_split_movdi
Michael Meissner [Mon, 22 Jan 2018 19:36:18 +0000 (19:36 +0000)]
re PR target/83862 (powerpc: ICE in signbit testcase)
[gcc]
2018-01-22 Michael Meissner <meissner@linux.vnet.ibm.com>
PR target/83862
* config/rs6000/rs6000-protos.h (rs6000_split_signbit): Delete,
no longer used.
* config/rs6000/rs6000.c (rs6000_split_signbit): Likewise.
* config/rs6000/rs6000.md (signbit<mode>2): Change code for IEEE
128-bit to produce an UNSPEC move to get the double word with the
signbit and then a shift directly to do signbit.
(signbit<mode>2_dm): Replace old IEEE 128-bit signbit
implementation with a new version that just does either a direct
move or a regular move. Move memory interface to separate insns.
Move insns so they are next to the expander.
(signbit<mode>2_dm_mem_be): New combiner insns to combine load
with signbit move. Split big and little endian case.
(signbit<mode>2_dm_mem_le): Likewise.
(signbit<mode>2_dm_<su>ext): Delete, no longer used.
(signbit<mode>2_dm2): Likewise.
[gcc/testsuite]
2018-01-22 Michael Meissner <meissner@linux.vnet.ibm.com>
PR target/83862
* gcc.target/powerpc/pr83862.c: New test.
* config/rs6000/rs6000-builtin.def (ST_ELEMREV_V1TI, LD_ELEMREV_V1TI,
LVX_V1TI): Add macro expansion.
* config/rs6000/rs6000-c.c (altivec_builtin_types): Add argument
definitions for VSX_BUILTIN_VEC_XST_BE, VSX_BUILTIN_VEC_ST,
VSX_BUILTIN_VEC_XL, LD_ELEMREV_V1TI builtins.
* config/rs6000/rs6000-p8swap.c (insn_is_swappable_p):
Change check to determine if the instruction is a byte reversing
entry. Fix typo in comment.
* config/rs6000/rs6000.c (altivec_expand_builtin): Add case entry
for VSX_BUILTIN_ST_ELEMREV_V1TI and VSX_BUILTIN_LD_ELEMREV_V1TI.
Add def_builtin calls for new builtins.
* config/rs6000/vsx.md (vsx_st_elemrev_v1ti, vsx_ld_elemrev_v1ti):
Add define_insn expansion.
gcc/testsuite/ChangeLog:
2018-01-22 Carl Love <cel@us.ibm.com>
* gcc.target/powerpc/powerpc.exp: Add torture tests for
builtins-4-runnable.c, builtins-6-runnable.c,
builtins-5-p9-runnable.c, builtins-6-p9-runnable.c.
* gcc.target/powerpc/builtins-6-runnable.c: New test file.
* gcc.target/powerpc/builtins-4-runnable.c: Add additional tests
for signed/unsigned 128-bit and long long int loads.
Janne Blomqvist [Mon, 22 Jan 2018 13:31:08 +0000 (15:31 +0200)]
PR 78534, 83704 Large character lengths
This patch fixes various parts of the code to use a larger type than
int for the character length. Depending on the situation,
HOST_WIDE_INT, size_t, or gfc_charlen_t is appropriate.
Regtested on x86_64-pc-linux-gnu and i686-pc-linux-gnu.
gcc/fortran/ChangeLog:
2018-01-22 Janne Blomqvist <jb@gcc.gnu.org>
PR 78534
PR 83704
* arith.c (gfc_arith_concat): Use size_t for string length.
(gfc_compare_string): Likewise.
(gfc_compare_with_Cstring): Likewise.
* array.c (gfc_resolve_character_array_constructor): Use
HOST_WIDE_INT, gfc_mpz_get_hwi.
* check.c (gfc_check_fe_runtime_error): Use size_t.
* data.c (create_character_initializer): Use HOST_WIDE_INT,
gfc_extract_hwi.
* decl.c (gfc_set_constant_character_len): Use gfc_charlen_t.
(add_init_expr_to_sym): Use HOST_WIDE_INT.
* expr.c (gfc_build_init_expr): Use HOST_WIDE_INT,
gfc_extract_hwi.
(gfc_apply_init): Likewise.
* match.h (gfc_set_constant_character_len): Update prototype.
* primary.c (match_string_constant): Use size_t.
* resolve.c (resolve_ordinary_assign): Use HOST_WIDE_INT,
gfc_mpz_get_hwi.
* simplify.c (init_result_expr): Likewise.
(gfc_simplify_len_trim): Use size_t.
* target-memory.c (gfc_encode_character): Use size_t.
(gfc_target_encode_expr): Use HOST_WIDE_INT, gfc_mpz_get_hwi.
(interpret_array): Use size_t.
(gfc_interpret_character): Likewise.
* target-memory.h (gfc_encode_character): Update prototype.
(gfc_interpret_character): Likewise.
(gfc_target_interpret_expr): Likewise.
* trans-const.c (gfc_build_string_const): Use size_t for length
argument.
(gfc_build_wide_string_const): Likewise.
* trans-const.h (gfc_build_string_const): Likewise.
(gfc_build_wide_string_const): Likewise.
2018-01-22 Janne Blomqvist <jb@gcc.gnu.org>
PR 78534
PR 83704
* gfortran.dg/string_1.f90: Remove printing the length.
Richard Biener [Mon, 22 Jan 2018 13:10:57 +0000 (13:10 +0000)]
re PR tree-optimization/83963 ([graphite] ICE in merge_sese, at graphite-scop-detection.c:517)
2018-01-22 Richard Biener <rguenther@suse.de>
PR tree-optimization/83963
* graphite-scop-detection.c (scop_detection::get_sese): Delay
including the loop exit block.
(scop_detection::merge_sese): Likewise.
(scop_detection::add_scop): Do it here instead.
Sudakshina Das [Mon, 22 Jan 2018 10:56:26 +0000 (10:56 +0000)]
[ARM] Fix test fail with conflicting -mfloat-abi
This patch fixes my earlier test case, which fails for arm-none-eabi
when an explicit user option for -mfloat-abi conflicts with
the test case options. I have added a guard to skip the test
in those cases.
ChangeLog entries:
*** gcc/testsuite/ChangeLog ***
2018-01-22 Sudakshina Das <sudi.das@arm.com>
* gcc.c-torture/compile/pr82096.c: Add dg-skip-if and
dg-require-effective-target directives.
Kyrylo Tkachov [Mon, 22 Jan 2018 10:50:20 +0000 (10:50 +0000)]
[arm] Make gcc.target/arm/copysign_softfloat_1.c more robust
This test has needlessly restrictive requirements: it tries to force a soft-float target and to run.
This makes it unsupportable for any non-soft-float variant.
In fact, the test can be a run-time test for any target, and only the scan-assembler checks are specific to
-mfloat-abi=soft. So this patch makes the test always runnable and predicates the scan-assembler checks
on the new arm_softfloat effective target check.
* doc/sourcebuild.texi (arm_softfloat): Document.
* lib/target-supports.exp (check_effective_target_arm_softfloat):
New procedure.
* gcc.target/arm/copysign_softfloat_1.c: Allow running everywhere.
Adjust scan-assembler checks for soft-float.
re PR testsuite/77734 (FAIL: gcc.dg/plugin/must-tail-call-1.c -fplugin=./must_tail_call_plugin.so (test for excess errors))
PR gcc/77734
* config/pa/pa.c (pa_function_ok_for_sibcall): Use
targetm.binds_local_p instead of TREE_PUBLIC to check local binding.
Move TARGET_PORTABLE_RUNTIME check after TARGET_64BIT check.
Fix vect_def_type handling in x86 scatter support (PR 83940)
As Jakub says in the PR, the problem here was that the x86/built-in
version of the scatter support was using a bogus scatter_src_dt
when calling vect_get_vec_def_for_stmt_copy (and had been since it
was added). The patch uses the vect_def_type from the original
call to vect_is_simple_use instead.
However, Jakub also pointed out that other parts of the load and store
code passed the vector operand rather than the scalar operand to
vect_is_simple_use. That probably works most of the time since
a constant scalar operand should give a constant vector operand,
and likewise for external and internal definitions. But it
definitely seems more robust to pass the scalar operand.
The patch avoids the issue for gather and scatter offsets by
using the cached gs_info.offset_dt. This is safe because gathers
and scatters are never grouped, so there's only one statement operand
to consider. The patch also caches the vect_def_type for mask operands,
which is safe because grouped masked operations share the same mask.
That just leaves the store rhs. We still need to recalculate the
vect_def_type there since different store values in the group can
have different definition types. But since we still have access
to the original scalar operand, it seems better to use that instead.
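A hedged sketch of a masked scatter store of the general shape under
discussion (the function name is illustrative; this is not the pr83940.c
testcase):

  void scatter (int *dst, const int *idx, const int *src, int n)
  {
    for (int i = 0; i < n; ++i)
      if (src[i] > 0)              /* mask operand */
        dst[idx[i]] = src[i];      /* store rhs written through a vector of offsets */
  }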
2018-01-20 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
PR tree-optimization/83940
* tree-vect-stmts.c (vect_truncate_gather_scatter_offset): Set
offset_dt to vect_constant_def rather than vect_unknown_def_type.
(vect_check_load_store_mask): Add a mask_dt_out parameter and
use it to pass back the definition type.
(vect_check_store_rhs): Likewise rhs_dt_out.
(vect_build_gather_load_calls): Add a mask_dt argument and use
it instead of a call to vect_is_simple_use.
(vectorizable_store): Update calls to vect_check_load_store_mask
and vect_check_store_rhs. Use the dt returned by the latter instead
of scatter_src_dt. Use the cached mask_dt and gs_info.offset_dt
instead of calls to vect_is_simple_use. Pass the scalar rather
than the vector operand to vect_is_simple_use when handling
second and subsequent copies of an rhs value.
(vectorizable_load): Update calls to vect_check_load_store_mask
and vect_build_gather_load_calls. Use the cached mask_dt and
gs_info.offset_dt instead of calls to vect_is_simple_use.
gcc/testsuite/
PR tree-optimization/83940
* gcc.dg/torture/pr83940.c: New test.
Jakub Jelinek [Sat, 20 Jan 2018 09:58:31 +0000 (10:58 +0100)]
re PR middle-end/83945 (internal compiler error: Segmentation fault with -O -fcode-hoisting)
PR middle-end/83945
* tree-emutls.c: Include gimplify.h.
(lower_emutls_2): New function.
(lower_emutls_1): If ADDR_EXPR is a gimple invariant and walk_tree
with lower_emutls_2 callback finds some TLS decl in it, unshare_expr
it before further processing.
Jakub Jelinek [Fri, 19 Jan 2018 22:36:04 +0000 (23:36 +0100)]
re PR debug/81570 (create_pseudo_cfg assumes that INCOMING_FRAME_SP_OFFSET is a constant)
PR debug/81570
PR debug/83728
* dwarf2cfi.c (DEFAULT_INCOMING_FRAME_SP_OFFSET): Define to
INCOMING_FRAME_SP_OFFSET if not defined.
(scan_trace): Add ENTRY argument. If true and
DEFAULT_INCOMING_FRAME_SP_OFFSET != INCOMING_FRAME_SP_OFFSET,
emit a note to adjust the CFA offset.
(create_cfi_notes): Adjust scan_trace callers.
(create_cie_data): Use DEFAULT_INCOMING_FRAME_SP_OFFSET rather than
INCOMING_FRAME_SP_OFFSET in the CIE.
* config/i386/i386.h (DEFAULT_INCOMING_FRAME_SP_OFFSET): Define.
* config/stormy16/stormy16.h (DEFAULT_INCOMING_FRAME_SP_OFFSET):
Likewise.
* doc/tm.texi.in (DEFAULT_INCOMING_FRAME_SP_OFFSET): Document.
* doc/tm.texi: Regenerated.
Tony Reix [Fri, 19 Jan 2018 17:45:24 +0000 (17:45 +0000)]
xcoff.c (xcoff_incl_compare): New function.
* xcoff.c (xcoff_incl_compare): New function.
(xcoff_incl_search): New function.
(xcoff_process_linenos): Use bsearch to find include file.
(xcoff_initialize_fileline): Sort include file information.
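A hedged sketch of the sort-then-bsearch scheme the entry describes; the
record layout and the names incl, incl_compare, and incl_search are
illustrative simplifications, not libbacktrace's actual xcoff.c code:

  #include <cstdlib>

  struct incl
  {
    const char *filename;   /* include file name */
    long begin;             /* first line number covered */
    long end;               /* one past the last line number covered */
  };

  /* bsearch callback: the key is a line number, the element an include
     record; return 0 when the line falls inside the record's range.  The
     array must already be sorted by 'begin' (e.g. with qsort).  */
  static int incl_compare (const void *key, const void *elem)
  {
    long lineno = *static_cast<const long *> (key);
    const incl *in = static_cast<const incl *> (elem);
    if (lineno < in->begin)
      return -1;
    if (lineno >= in->end)
      return 1;
    return 0;
  }

  static const incl *incl_search (const incl *incls, size_t count, long lineno)
  {
    return static_cast<const incl *> (bsearch (&lineno, incls, count,
                                               sizeof (incl), incl_compare));
  }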
Martin Liska [Fri, 19 Jan 2018 12:06:18 +0000 (13:06 +0100)]
Adjust predictor values according to SPEC2006 and SPEC2017.
2018-01-19 Martin Liska <mliska@suse.cz>
* predict.def (PRED_LOOP_EXIT): Change from 85 to 89.
(PRED_LOOP_EXIT_WITH_RECURSION): Change from 72 to 78.
(PRED_LOOP_EXTRA_EXIT): Change from 83 to 67.
(PRED_OPCODE_POSITIVE): Change from 64 to 59.
(PRED_TREE_OPCODE_POSITIVE): Change from 64 to 59.
(PRED_CONST_RETURN): Change from 69 to 65.
(PRED_NULL_RETURN): Change from 91 to 71.
(PRED_LOOP_IV_COMPARE_GUESS): Change from 98 to 64.
(PRED_LOOP_GUARD): Change from 66 to 73.
Martin Liska [Fri, 19 Jan 2018 12:05:20 +0000 (13:05 +0100)]
Introduce PROB_UNINITIALIZED constant and use it in predict.def.
2018-01-19 Martin Liska <mliska@suse.cz>
* predict.c (predict_insn_def): Add new assert.
(struct branch_predictor): Change type to signed integer.
(test_prediction_value_range): Amend test to cover
PROB_UNINITIALIZED.
* predict.def (PRED_LOOP_ITERATIONS): Use the new constant.
(PRED_LOOP_ITERATIONS_GUESSED): Likewise.
(PRED_LOOP_ITERATIONS_MAX): Likewise.
(PRED_LOOP_IV_COMPARE): Likewise.
* predict.h (PROB_UNINITIALIZED): Define new constant.
Martin Liska [Fri, 19 Jan 2018 12:03:24 +0000 (13:03 +0100)]
Fix usage of analyze_brprob.py script.
2018-01-19 Martin Liska <mliska@suse.cz>
* analyze_brprob.py: Support new format that can be easily
parsed. Add new column to report.
2018-01-19 Martin Liska <mliska@suse.cz>
* predict.c (dump_prediction): Add new format for
analyze_brprob.py script which is enabled with -details
suboption.
* profile-count.h (precise_p): New function.
Check whether any statements need masking (PR 83922)
This PR is an odd case in which, due to the low optimisation level,
we enter vectorisation with:
  outer1:
    x_1 = PHI <x_3(outer2), ...>;
    ...
  inner:
    x_2 = 0;
    ...
  outer2:
    x_3 = PHI <x_2(inner)>;
These statements are tentatively treated as a double reduction by
vect_force_simple_reduction, but in the end only x_3 and x_2 are marked
as relevant. vect_analyze_loop_operations skips over x_3, leaving the
vectorizable_reduction check to a presumed future test of x_1, which
in this case never happens. We therefore end up vectorising x_2 only
(complete with peeling for niters!) and leave the scalar x_3 in place.
This caused a segfault in the support for fully-masked loops,
since there were no statements that needed masking. Fixed by
checking for that.
But I think this is also a flaw in vect_analyze_loop_operations.
Outer loop vectorisation reduces the number of times that the
inner loop is executed, so it wouldn't necessarily be valid
to leave the scalar x_3 in place for all vectorisable x_2.
There's already code to forbid that when x_1 isn't present:
  /* FORNOW: we currently don't support the case that these phis
     are not used in the outerloop (unless it is double reduction,
     i.e., this phi is vect_reduction_def), cause this case
     requires to actually do something here. */
I think we need to do the same if x_1 is present but not relevant.
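A hedged, much-simplified sketch of a loop nest with that shape (illustrative
only, not the actual pr83922.c testcase):

  int f (int n, int m)
  {
    int x = 1;
    for (int i = 0; i < n; ++i)       /* outer1/outer2 carry x via PHIs */
      for (int j = 0; j < m; ++j)     /* inner */
        x = 0;                        /* x_2 = 0 */
    return x;
  }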
2018-01-19 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
PR tree-optimization/83922
* tree-vect-loop.c (vect_verify_full_masking): Return false if
there are no statements that need masking.
(vect_active_double_reduction_p): New function.
(vect_analyze_loop_operations): Use it when handling phis that
are not in the loop header.
gcc/testsuite/
PR tree-optimization/83922
* gcc.dg/pr83922.c: New test.
This testcase ICEd because we converted the initial value of an
induction to the vector element type even for nested inductions.
This isn't necessary because the initial expression is vectorised
normally, and it meant that init_expr was no longer the original
statement operand by the time we called vect_get_vec_def_for_operand.
Also, adding the conversion code here made the existing SLP conversion
redundant.
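A hedged sketch of a nested induction of the general shape involved
(illustrative only, not the actual pr83914.c testcase): the inner loop's
induction starts from a value defined in the outer loop.

  void f (int *out, int n, int m)
  {
    for (int i = 0; i < n; ++i)
      {
        int x = i;                    /* initial value defined in the outer loop */
        for (int j = 0; j < m; ++j)
          {
            out[i * m + j] = x;
            x += 3;                   /* induction nested within the vectorized loop */
          }
      }
  }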
2018-01-19 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
PR tree-optimization/83914
* tree-vect-loop.c (vectorizable_induction): Don't convert
init_expr or apply the peeling adjustment for inductions
that are nested within the vectorized loop.
gcc/testsuite/
PR tree-optimization/83914
* gcc.dg/vect/pr83914.c: New test.
Kyrylo Tkachov [Fri, 19 Jan 2018 10:41:57 +0000 (10:41 +0000)]
[arm] Fix gcc.target/arm/negdi-[12].c
These tests are failing for a silly reason. They scan for an occurrence of the NEGS instruction.
NEGS (and NEG in general) is a pre-UAL alias of RSB with an immediate of 0 and we only emit it
in one pattern: *thumb2_negsi2_short in thumb2.md. In all other instances of negation we emit
the modern RSB mnemonic. This causes needless differences in assembly output.
For example, for these testcases we emit NEG when compiling for -march=armv7-a, but for armv7ve
we emit RSB, causing the scan-assembler tests to fail.
This patch updates the *thumb2_negsi2_short pattern to use the RSB mnemonic and
fixes the flaky scan-assembler directives.
These tests now pass for my compiler configured with:
--with-cpu=cortex-a15 --with-fpu=neon-vfpv4 --with-float=hard --with-mode=thumb
Bootstrapped and tested on arm-none-linux-gnueabihf as well.
* config/arm/thumb2.md (*thumb2_negsi2_short): Use RSB mnemonic
instead of NEG.
* gcc.target/arm/negdi-1.c: Remove bogus assembler scan for negs.
* gcc.target/arm/negdi-2.c: Likewise.
* gcc.target/arm/thumb-16bit-ops.c: Replace scan for NEGS with RSBS.
Kyrylo Tkachov [Fri, 19 Jan 2018 10:26:53 +0000 (10:26 +0000)]
[arm] Fix gcc.target/arm/pr40956.c
The scan-assembler tests here check for MOVS for Thumb1 and MOV for Thumb2,
but in fact there's no reason why we wouldn't generate MOVS for Thumb2 as well;
it really depends on a lot of optimisation decisions.
The only behaviour we want to test is that we move a 0 constant into a register
only once, which can be achieved with either MOV or MOVS.
Simplify the check by always checking for either MOV or MOVS.
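A hedged sketch of the simplified directive; the exact regular expression in
pr40956.c may differ:

  /* { dg-final { scan-assembler-times "movs?\tr\[0-9\]+, #0" 1 } } */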
Kyrylo Tkachov [Fri, 19 Jan 2018 09:58:37 +0000 (09:58 +0000)]
[arm] Fix gcc.target/arm/pr79058.c
This testcase tests 32-bit ARM state functionality, so add -marm to make that explicit
and to avoid Thumb1 hard-float errors for certain toolchain configurations.
* gcc.target/arm/pr79058.c: Add arm_arm_ok check and -marm to options.
Ian Lance Taylor [Fri, 19 Jan 2018 04:52:12 +0000 (04:52 +0000)]
mksysinfo: force Passwd.Pw_[ug]id from int32 to uint32
Solaris 10 uses int32 for the Pw_uid and Pw_gid fields of Passwd,
but most systems, including Solaris 11, use uint32. Force uint32
for consistency and to fix the build.