* fold-const.c (fold_binary_loc) : Move Convert A/B/C to A/(B*C)
to match.pd.
Move Convert A/(B/C) to (A/B)*C to match.pd.
Move Convert C1/(X*C2) into (C1/C2)/X to match.pd.
Move Optimize (X & (-A)) / A where A is a power of 2, to
X >> log2(A) to match.pd.
* match.pd (rdiv (rdiv:s @0 @1) @2): New simplifier.
(rdiv @0 (rdiv:s @1 @2)): New simplifier.
(div (convert? (bit_and @0 INTEGER_CST@1)) INTEGER_CST@2):
New simplifier.
(rdiv REAL_CST@0 (mult @1 REAL_CST@2)): New simplifier.
spop [Thu, 12 Nov 2015 00:37:47 +0000 (00:37 +0000)]
Preserve the original program while using graphite.
Earlier, graphite used to translate portions of the original program after
scop-detection in order to represent the SCoP into polyhedral model. This was
required because each basic block was represented as independent basic block in
the polyhedral model. So all the cross-basic-block dependencies were translated
out-of-ssa.
With this patch those dependencies are also exposed to the ISL, so there is no
need to modify the original structure of the program.
After this patch we should be able to enable graphite at some default
optimization level.
Highlights:
Remove cross bb scalar to array translation
For reductions, add support for more than just INT_CST
Early bailout on codegen.
Verify loop-closed ssa structure during copy of renames
The uses of exprs should come from bb which dominates the bb
Collect the init value of close phi in loop-guard
Do not follow vuses for close-phi, postpone loop-close phi until the
corresponding loop-phi is processed
Bail out if no bb found to place cond/loop -phis
Move insertion of liveouts at the end of codegen
Insert loop-phis in the loop-header.
This patch passes regtest and bootstrap with BOOT_CFLAGS='-O2 -fgraphite-identity -floop-nest-optimize'
2015-11-11 Aditya Kumar <aditya.k7@samsung.com>
Sebastian Pop <s.pop@samsung.com>
* graphite-isl-ast-to-gimple.c (class translate_isl_ast_to_gimple):
New member codegen_error
(translate_isl_ast_for_loop): Remove call to single_succ_edge and early return.
(translate_isl_ast_node_user): Early return in case of error.
(translate_isl_ast_to_gimple::translate_isl_ast): Same.
(translate_isl_ast_to_gimple::translate_pending_phi_nodes): New.
(add_parameters_to_ivs_params): Remove macro.
(graphite_regenerate_ast_isl): Add if_region pointer to region.
* graphite-poly.c (new_poly_dr): Remove macro.
(print_pdr): Same.
(new_gimple_poly_bb): Same.
(free_gimple_poly_bb): Same.
(print_scop_params): Same.
* graphite-poly.h (struct poly_dr): Same.
(struct poly_bb): Add new_bb.
(gbb_from_bb): Remove dead code.
(pbb_from_bb): Same.
* graphite-scop-detection.c (parameter_index_in_region_1): Same.
(parameter_index_in_region): Same.
(find_scop_parameters): Same.
(build_cross_bb_scalars_def): New.
(build_cross_bb_scalars_use): New.
(graphite_find_cross_bb_scalar_vars): New
(try_generate_gimple_bb): Reads and Writes.
(build_alias_set): Move.
(gather_bbs::before_dom_children): Gather bbs visited.
(build_scops): call build_alias_set.
* graphite-sese-to-poly.c (phi_arg_in_outermost_loop): Delete.
(remove_simple_copy_phi): Delete.
(remove_invariant_phi): Delete.
(simple_copy_phi_p): Delete.
(reduction_phi_p): Delete.
(isl_id_for_dr): Remove unused param.
(parameter_index_in_region_1): Remove macro usage.
(set_scop_parameter_dim): Same.
(add_param_constraints): Same.
(add_conditions_to_constraints): Same
(build_scop_iteration_domain): Same.
(pdr_add_alias_set): Comment.
(add_scalar_version_numbers): New.
(build_poly_dr): ISL id.
(build_scop_drs): Move.
(build_poly_sr_1): Same.
(insert_stmts): Remove.
(build_poly_sr): New.
(new_pbb_from_pbb): Delete.
(insert_out_of_ssa_copy_on_edge): Delete.
(create_zero_dim_array): Delete.
(scalar_close_phi_node_p): Delete.
(propagate_expr_outside_region): Delete.
(rewrite_close_phi_out_of_ssa): Delete.
(rewrite_phi_out_of_ssa): Delete.
(rewrite_degenerate_phi): Delete.
(rewrite_reductions_out_of_ssa): Delete.
(rewrite_cross_bb_scalar_dependence): Delete.
(handle_scalar_deps_crossing_scop_limits):
(rewrite_cross_bb_scalar_deps): Delete.
(build_poly_scop): Remove calls to out-of-ssa functions.
* graphite.c (graphite_transform_loops): Early return in case of codegen error.
* sese.c (debug_rename_map_1): Delete.
(debug_rename_map): Delete.
(sese_record_loop): Remove macro.
(build_sese_loop_nests): Same.
(new_sese_info): Same.
(free_sese_info): Same.
(sese_insert_phis_for_liveouts):
(is_loop_closed_ssa_use): New.
(number_of_phi_nodes): New.
(bb_contains_loop_close_phi_nodes): New.
(bb_contains_loop_phi_nodes): New.
(phi_uses_name): New.
(is_valid_rename):
(get_rename): Add old_bb and loop_phi for more precise matching of
exprs.
(set_rename): Pass region.
(later_of_the_two): New.
(gsi_insert_earliest): New.
(collect_all_ssa_names): New.
(substitute_ssa_name): New.
(rename_all_uses): New.
(get_rename_from_scev): New.
(rename_uses): Pass old_bb for more precise matching of exprs.
(get_def_bb_for_const): New.
(get_new_name): New.
(get_loc): New.
(get_edges): New.
(copy_loop_phi_args): New.
(copy_loop_phi_nodes): New.
(get_loop_init_value): New.
(find_init_value): New.
(find_init_value_close_phi): New.
(copy_loop_close_phi_args): New.
(copy_loop_close_phi_nodes): New.
(add_phi_arg_for_new_expr): New.
(copy_cond_phi_args): New.
(copy_cond_phi_nodes): New.
(copy_phi_nodes): New.
(should_copy_to_new_region): New.
(set_rename_for_each_def): New.
(graphite_copy_stmts_from_block): Early return in case of error.
(copy_bb_and_scalar_dependences): Same.
* sese.h (vec_find): New.
(SESE_PARAMS): Delete.
(SESE_LOOPS): Delete.
(SESE_LOOP_NEST): Delete.
(sese_contains_loop): Remove macro usage.
(sese_nb_params): Same.
(struct gimple_poly_bb): Added read_scalar_refs, write_scalar_refs.
redi [Wed, 11 Nov 2015 17:08:51 +0000 (17:08 +0000)]
Loop in std::this_thread sleep functions
PR libstdc++/60421
* include/std/thread (this_thread::sleep_for): Retry on EINTR.
(this_thread::sleep_until): Retry if time not reached.
* src/c++11/thread.cc (__sleep_for): Retry on EINTR.
* testsuite/30_threads/this_thread/60421.cc: Test interruption and
non-steady clocks.
rguenth [Wed, 11 Nov 2015 14:40:36 +0000 (14:40 +0000)]
2015-11-11 Richard Biener <rguenther@suse.de>
* tree-vectorizer.h (vect_slp_analyze_and_verify_instance_alignment):
Declare.
(vect_analyze_data_refs_alignment): Make loop vect specific.
(vect_verify_datarefs_alignment): Likewise.
* tree-vect-data-refs.c (vect_slp_analyze_data_ref_dependences):
Add missing continue.
(vect_compute_data_ref_alignment): Export.
(vect_compute_data_refs_alignment): Merge into...
(vect_analyze_data_refs_alignment): ... this.
(verify_data_ref_alignment): Split out from ...
(vect_verify_datarefs_alignment): ... here.
(vect_slp_analyze_and_verify_node_alignment): New function.
(vect_slp_analyze_and_verify_instance_alignment): Likewise.
* tree-vect-slp.c (vect_supported_load_permutation_p): Remove
misplaced checks on alignment.
(vect_slp_analyze_bb_1): Add fatal output parameter. Do
alignment analysis after SLP discovery and do it per instance.
(vect_slp_bb): When vect_slp_analyze_bb_1 fatally failed do not
bother to re-try using different vector sizes.
ebotcazou [Wed, 11 Nov 2015 14:22:43 +0000 (14:22 +0000)]
PR target/67265
* ira.c (ira_setup_eliminable_regset): Do not necessarily create the
frame pointer for stack checking if non-call exceptions aren't used.
* config/i386/i386.c (ix86_finalize_stack_realign_flags): Likewise.
segher [Wed, 11 Nov 2015 14:09:30 +0000 (14:09 +0000)]
simplify-rtx: Simplify trunc of and of shiftrt
If we have
(truncate:M1 (and:M2 (lshiftrt:M2 (x:M2) C) C2))
we can write it instead as
(and:M1 (lshiftrt:M1 (truncate:M1 (x:M2)) C) C2)
(if that is valid, of course), which has smaller modes for the
binary ops, and the truncate can often simplify further (if "x"
is a register, for example).
* gcc/simplify-rtx.c (simplify_truncation): Simplify TRUNCATE
of AND of [LA]SHIFTRT.
dardiss [Wed, 11 Nov 2015 13:40:08 +0000 (13:40 +0000)]
Undo delay slot filling and use compact branches in selected cases.
gcc/
* config/mips/mips.c (mips_breakable_sequence_p): New function.
(mips_break_sequence): New function.
(mips_reorg_process_insns) Use them. Use compact branches in selected
situations.
gcc/testsuite/
* gcc.target/mips/split-ds-sequence.c: New test.
marxin [Wed, 11 Nov 2015 11:21:44 +0000 (11:21 +0000)]
Fix various memory leaks
* gimple-ssa-strength-reduction.c (create_phi_basis):
Use auto_vec.
* passes.c (release_dump_file_name): New function.
(pass_init_dump_file): Used from this function.
(pass_fini_dump_file): Likewise.
* tree-sra.c (convert_callers_for_node): Use xstrdup_for_dump.
* var-tracking.c (vt_initialize): Use pool_allocator.
jiwang [Wed, 11 Nov 2015 10:51:31 +0000 (10:51 +0000)]
[Patch] PR tree-optimization/68234 Improve range info for loop Phi node
2015-11-11 Richard Biener <rguenth@gcc.gnu.org>
Jiong Wang <jiong.wang@arm.com>
gcc/
PR tree-optimization/68234
* tree-vrp.c (vrp_visit_phi_node): Extend SCEV check to those loop PHI
node which estimiated to be VR_VARYING initially.
gcc/testsuite/
* gcc.dg/tree-ssa/pr68234.c: New testcase.
rts [Wed, 11 Nov 2015 10:36:00 +0000 (10:36 +0000)]
Tighten up checks when tying chains.
gcc/
* regname.c (scan_rtx_reg): Check the matching number of consecutive
registers when tying chains.
(build_def_use): Move terminated_this_insn earlier in the function.
ian [Tue, 10 Nov 2015 20:31:11 +0000 (20:31 +0000)]
PR go/68255
cmd/go: always use --whole-archive for gccgo packages
This is a backport of https://golang.org/cl/16775.
This is, in effect, what the gc toolchain does. It fixes cases where Go
code refers to a C global variable; without this, if the global variable
was the only thing visible in the C code, the generated cgo file might
not get pulled in from the archive, leaving the Go variable
uninitialized.
This was reported against gccgo as https://gcc.gnu.org/PR68255 .
uros [Tue, 10 Nov 2015 17:48:31 +0000 (17:48 +0000)]
* config/i386/i386.c (ix86_print_operand): Remove dead code that
tried to avoid (%rip) for call operands.
* config/i386/i386.c (ix86_print_operand_address_as): Add no_rip
argument. Do not use RIP relative addressing when no_rip is set.
(ix86_print_operand): Update call to ix86_print_operand_address_as.
(ix86_print_operand_address): Ditto.
* config/i386/i386.md (*movabs<mode>_1): Use %P modifier for
absolute movabs operand 0. Add square braces for -masm=intel.
(*movabs<mode>_2): Ditto for operand 1.
ienkovich [Tue, 10 Nov 2015 12:17:30 +0000 (12:17 +0000)]
gcc/
* optabs.c (expand_binop_directly): Allow scalar mode for
vec_pack_trunc_optab.
* tree-vect-loop.c (vect_determine_vectorization_factor): Skip
boolean vector producers from pattern sequence when computing VF.
* tree-vect-patterns.c (vect_vect_recog_func_ptrs) Add
vect_recog_mask_conversion_pattern.
(search_type_for_mask): Choose the smallest
type if different size types are mixed.
(build_mask_conversion): New.
(vect_recog_mask_conversion_pattern): New.
(vect_pattern_recog_1): Allow scalar mode for boolean vectype.
* tree-vect-stmts.c (vectorizable_mask_load_store): Support masked
load with pattern.
(vectorizable_conversion): Support boolean vectors.
(free_stmt_vec_info): Allow patterns for statements with no lhs.
* tree-vectorizer.h (NUM_PATTERNS): Increase to 14.
rguenth [Tue, 10 Nov 2015 10:14:02 +0000 (10:14 +0000)]
2015-11-10 Richard Biener <rguenther@suse.de>
PR tree-optimization/68240
* tree-ssa-sccvn.c (cond_stmts_equal_p): Handle commutative compares
properly.
(visit_phi): For PHIs with just a single executable edge
take its value directly.
(expressions_equal_p): Handle VN_TOP properly.
ktkachov [Tue, 10 Nov 2015 09:37:51 +0000 (09:37 +0000)]
[AArch64][2/3] Implement negcc, notcc optabs
* config/aarch64/aarch64.md (<neg_not_op><mode>cc): New define_expand.
* config/aarch64/iterators.md (NEG_NOT): New code iterator.
(neg_not_op): New code attribute.
rts [Tue, 10 Nov 2015 09:12:52 +0000 (09:12 +0000)]
Tie chains for move instructions.
gcc/
* regrename.c (create_new_chain): Initialize renamed and tied_chain.
(build_def_use): Initialize terminated_this_insn.
(find_best_rename_reg): Pick and check register from the tied chain.
(regrename_do_replace): Mark head as renamed.
(struct du_head *terminated_this_insn). New static variable.
(scan_rtx_reg): Tie chains in move insns. Set terminated_this_insn.
* regrename.h (struct du_head): Add tied_chain, renamed members.
ramana [Tue, 10 Nov 2015 08:35:21 +0000 (08:35 +0000)]
Workaround PR68256 on AArch64
> This is causing a bootstrap comparison failure in gcc/go/gogo.o.
I've had a look at this and the trigger is the
aarch64_use_constant_blocks_p change which appears to be causing a
bootstrap comparison failure because of differences to offsets when
built with debug and without debug. I don't think the problem is
specifically in the backend but this needs some careful
investigation. For now, in the interest of go bootstraps continuing on
trunk - I'm proposing a patch that partially rolls back the change in
aarch64_use_constant_blocks_p and am still looking into the issue but
it will take me some more time to get to the bottom of the issue.
Bootstrapped on aarch64-none-linux-gnu including (c,c++ and go) -
testing finished ok.
cesar [Tue, 10 Nov 2015 05:23:04 +0000 (05:23 +0000)]
gcc/cp/
* parser.c (cp_finalize_oacc_routine): New boolean first argument.
(cp_ensure_no_oacc_routine): Update call to cp_finalize_oacc_routine.
(cp_parser_simple_declaration): Maintain a boolean first to keep track
of each new declarator. Propagate it to cp_parser_init_declarator.
(cp_parser_init_declarator): New boolean first argument. Propagate it
to cp_parser_save_member_function_body and cp_finalize_oacc_routine.
(cp_parser_member_declaration): Likewise.
(cp_parser_single_declaration): Update call to
cp_parser_init_declarator.
(cp_parser_save_member_function_body): New boolean first_decl argument.
Propagate it to cp_finalize_oacc_routine.
(cp_parser_finish_oacc_routine): New boolean first argument. Use it to
determine if multiple declarators follow a routine construct.
(cp_parser_oacc_routine): Update call to cp_parser_finish_oacc_routine.
gcc/testsuite/
* c-c++-common/goacc/routine-5.c: Enable c++ tests.
msebor [Tue, 10 Nov 2015 02:23:34 +0000 (02:23 +0000)]
PR c++/67913 - new expression with negative size not diagnosed
PR c++/67927 - array new expression with excessive number of elements
not diagnosed
gcc/cp/
* call.c (build_operator_new_call): Do not assume size_check
is non-null, analogously to the top half of the function.
* init.c (build_new_1): Detect and diagnose array sizes in
excess of the maximum of roughly SIZE_MAX / 2.
Insert a runtime check only for arrays with a non-constant size.
(build_new): Detect and diagnose negative array sizes.
gcc/testsuite/
* init/new45.C: New test to verify that operator new is invoked
with or without overhead for a cookie.
* init/new44.C: New test for placement new expressions for arrays
with excessive number of elements.
* init/new43.C: New test for placement new expressions for arrays
with negative number of elements.
* other/new-size-type.C: Expect array new expression with
an excessive number of elements to be rejected.
meissner [Tue, 10 Nov 2015 00:04:03 +0000 (00:04 +0000)]
[gcc]
2015-11-08 Michael Meissner <meissner@linux.vnet.ibm.com>
* config/rs6000/constraints.md (wF constraint): New constraints
for power9/toc fusion.
(wG constraint): Likewise.
* config/rs6000/predicates.md (u6bit_cint_operand): New
predicate, recognize 0..63.
(upper16_cint_operand): New predicate for power9 and toc fusion.
(fpr_reg_operand): Likewise.
(toc_fusion_or_p9_reg_operand): Likewise.
(toc_fusion_mem_raw): Likewise.
(toc_fusion_mem_wrapped): Likewise.
(fusion_gpr_addis): If power9 fusion, allow fusion for a larger
address range.
(fusion_gpr_mem_combo): Delete, use fusion_addis_mem_combo_load
instead.
(fusion_addis_mem_combo_load): Add support for power9 fusion of
floating point loads, floating point stores, and gpr stores.
(fusion_addis_mem_combo_store): Likewise.
(fusion_offsettable_mem_operand): Likewise.
* config/rs6000/rs6000.c (struct rs6000_reg_addr): Add new
elements for power9 fusion.
(rs6000_debug_print_mode): Rework debug information to print more
information about fusion.
(rs6000_init_hard_regno_mode_ok): Setup for power9 fusion
support.
(rs6000_legitimate_address_p): Recognize toc fusion as a valid
offsettable memory address.
(rs6000_rtx_costs): Update costs for new ISA 3.0 instructions.
(emit_fusion_gpr_load): Move most of the code from
emit_fusion_gpr_load into emit_fusion-addis that handles both
power8 and power9 fusion.
(emit_fusion_addis): Likewise.
(emit_fusion_load_store): Likewise.
(fusion_wrap_memory_address): Add support for TOC fusion.
(fusion_split_address): Likewise.
(fusion_p9_p): Add support for power9 fusion.
(expand_fusion_p9_load): Likewise.
(expand_fusion_p9_store): Likewise.
(emit_fusion_p9_load): Likewise.
(emit_fusion_p9_store): Likewise.
* config/rs6000/rs6000.h (TARGET_EXTSWSLI): Macros for support for
new instructions in ISA 3.0.
(TARGET_CTZ): Likewise.
(TARGET_TOC_FUSION_INT): Macros for power9 fusion support.
(TARGET_TOC_FUSION_FP): Likewise.
* config/rs6000/rs6000.md (UNSPEC_FUSION_P9): New power9/toc
fusion unspecs.
(UNSPEC_FUSION_ADDIS): Likewise.
(QHSI mode iterator): New iterator for power9 fusion.
(GPR_FUSION): Likewise.
(FPR_FUSION): Likewise.
(mod<mode>3): Add support for ISA 3.0
modulus instructions.
(umod<mode>3): Likewise.
(divmod peephole): Likewise.
(udivmod peephole): Likewise.
(ctz<mode>2): Add support for ISA 3.0 count trailing zeros scalar
instructions.
(ctz<mode>2_h): Likewise.
(ashdi3_extswsli): Add support for ISA 3.0 EXTSWSLI instruction.
(ashdi3_extswsli_dot): Likewise.
(ashdi3_extswsli_dot2): Likewise.
(power9 fusion splitter): New power9/toc fusion support.
(toc_fusionload_<mode>): Likewise.
(toc_fusionload_di): Likewise.
(fusion_gpr_load_<mode>): Update predicate function.
(power9 fusion peephole2s): New power9/toc fusion support.
(fusion_gpr_<P:mode>_<GPR_FUSION:mode>_load): Likewise.
(fusion_gpr_<P:mode>_<GPR_FUSION:mode>_store): Likewise.
(fusion_fpr_<P:mode>_<FPR_FUSION:mode>_load): Likewise.
(fusion_fpr_<P:mode>_<FPR_FUSION:mode>_store): Likewise.
(fusion_p9_<mode>_constant): Likewise.
[gcc/testsuite]
2015-11-08 Michael Meissner <meissner@linux.vnet.ibm.com>
* lib/target-supports.exp (check_p8vector_hw_available): Split
long line.
(check_vsx_hw_available): Likewise.
(check_p9vector_hw_available): Add new checks for ISA 3.0 hardware
support and for PowerPC float128 support.
(check_p9modulo_hw_available): Likewise.
(check_ppc_float128_sw_available): Likewise.
(check_ppc_float128_hw_available): Likewise.
(check_effective_target_powerpc_p9vector_ok): Likewise.
(check_effective_target_powerpc_p9modulo_ok): Likewise.
(check_effective_target_powerpc_float128_sw_ok): Likewise.
(check_effective_target_powerpc_float128_hw_ok): Likewise.
(is-effective-target): Add new PowerPc targets.
(is-effective-target-keyword): Likewise.
(check_vect_support_and_set_flags): If we have ISA 3.0 vector
instructions, use it.
* gcc.target/powerpc/mod-1.c: New test for ISA 3.0 instructions.
* gcc.target/powerpc/mod-2.c: Likewise.
* gcc.target/powerpc/ctz-1.c: Likewise.
* gcc.target/powerpc/ctz-2.c: Likewise.
* gcc.target/powerpc/extswsli-1.c: Likewise.
* gcc.target/powerpc/extswsli-2.c: Likewise.
* gcc.target/powerpc/extswsli-3.c: Likewise.
* gcc.target/powerpc/fusion.c (fusion_vector): Move to fusion2.c
and allow the test on PowerPC LE.
* gcc.target/powerpc/fusion2.c (fusion_vector): Likewise.
* gcc.target/powerpc/fusion3.c: New file, test power9 fusion.
* gcc.target/powerpc/float128-call.c: Use powerpc_float128_sw_ok
check instead of powerpc_vsx_ok.
* gcc.target/powerpc/float128-mix.c: Likewise.