Teresa Johnson [Mon, 8 Apr 2013 17:39:10 +0000 (17:39 +0000)]
First phase of unifying the computation of profile scale factors/probabilities and the actual scaling to use rounding divides...
First phase of unifying the computation of profile scale factors/probabilities
and the actual scaling to use rounding divides:
- Add new macro GCOV_COMPUTE_SCALE to basic-block.h to compute the scale
factor/probability via a rounding divide.
- Change all locations that already perform rounding divides (inline or via RDIV)
to use the appropriate helper: GCOV_COMPUTE_SCALE, apply_probability or
combine_probabilities.
- Change ipa-cp.c truncating divides to use rounding divides.
- Add comments to all other locations (currently using truncating divides) to
switch them to one of the helpers so they use a rounding divide.
The next phase will convert the locations that still use truncating divides
(marked with a comment in this patch) to rounding divides via the helper methods.
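The difference between a truncating and a rounding divide when computing a scale factor can be sketched as follows. This is a minimal illustration, not the macro from basic-block.h: the names rounding_div, scale_truncating, scale_rounding and apply_scale, and the base value SCALE_BASE, are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed fixed-point base: a scale of SCALE_BASE represents 1.0. */
#define SCALE_BASE 10000

/* Divide a non-negative numerator by a positive denominator,
   rounding to the nearest integer instead of truncating. */
int64_t rounding_div(int64_t num, int64_t den)
{
    assert(num >= 0 && den > 0);
    return (num + den / 2) / den;
}

/* Scale factor num/den in SCALE_BASE units, using a truncating divide. */
int64_t scale_truncating(int64_t num, int64_t den)
{
    assert(num >= 0 && den > 0);
    return num * SCALE_BASE / den;
}

/* The same scale factor, rounded to the nearest unit. */
int64_t scale_rounding(int64_t num, int64_t den)
{
    return rounding_div(num * SCALE_BASE, den);
}

/* Apply a SCALE_BASE-based scale factor to a count, rounding to nearest. */
int64_t apply_scale(int64_t count, int64_t scale)
{
    return rounding_div(count * scale, SCALE_BASE);
}
```

For a ratio of 2/3 the truncating version yields 6666 and the rounding version 6667, so mixing the two styles between computing a scale and applying it can introduce off-by-one differences in scaled counts.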
* config/epiphany/epiphany.md (mov_f+2): New peephole2 pattern.
(cstoresi4): Also allow re-use of zero result when doing a NE
comparison to a non-zero operand.
Use (clobber (scratch)) for first insn if the gpr output is not needed.
* config/epiphany/epiphany.md (logical_op): New code iterator.
(op_mnc): New code attribute.
(<op_mnc>_f, mov_f, cstoresi4): New patterns.
(mov_f+1, mov_f+2): New peephole2 patterns.
* config/epiphany/epiphany.md (GPR_1): New constant.
(define_expand "mov<mode>cc"): FAIL if gen_compare_reg returned 0.
* config/epiphany/epiphany.c (gen_compare_reg):
For flag_finite_math_only, avoid swapping operands when r0 and/or r1
is already in place.
Use GPR_0 / GPR_1 instead of 0/1 for r0/r1 register numbers.
Don't require being called during rtl expansion; if y overlaps r0,
return 0.
(epiphany_compute_frame_size, epiphany_expand_prologue): Use GPR_1.
(epiphany_expand_epilogue): Likewise.
Jakub Jelinek [Mon, 8 Apr 2013 13:46:00 +0000 (15:46 +0200)]
re PR c++/34949 (Dead code in empty destructors.)
PR c++/34949
PR c++/50243
* tree-eh.c (optimize_clobbers): Only remove clobbers if bb doesn't
contain anything but clobbers, at most one __builtin_stack_restore,
optionally debug stmts and final resx, and if it has at least one
incoming EH edge. Don't check for SSA_NAME on LHS of a clobber.
(sink_clobbers): Don't check for SSA_NAME on LHS of a clobber.
Instead of moving clobbers with MEM_REF LHS with SSA_NAME address
which isn't default definition, remove them.
(unsplit_eh, cleanup_empty_eh): Use single_{pred,succ}_{p,edge}
instead of EDGE_COUNT comparisons or EDGE_{PRED,SUCC}.
* tree-ssa-ccp.c (execute_fold_all_builtins): Remove clobbers
with MEM_REF LHS with SSA_NAME address.
* g++.dg/opt/vt3.C: New test.
* g++.dg/opt/vt4.C: New test.
epiphany.h (struct GTY (()) machine_function): Add member lr_slot_known.
* config/epiphany/epiphany.h (struct GTY (()) machine_function):
Add member lr_slot_known.
* config/epiphany/epiphany.md (reload_insi_ra): Compute lr_slot_offs
if necessary.
* config/epiphany/epiphany.c (epiphany_compute_frame_size):
Remove code that sets lr_slot_offset according to what a previous
version of epiphany_emit_save_restore used to do.
(epiphany_emit_save_restore): When doing an lr save or restore,
set/verify lr_slot_known and lr_slot_offset.
Jakub Jelinek [Mon, 8 Apr 2013 08:20:39 +0000 (10:20 +0200)]
tree-loop-distribution.c (const_with_all_bytes_same): New function.
* tree-loop-distribution.c (const_with_all_bytes_same): New function.
(generate_memset_builtin): Only handle integer_all_onesp as -1 val if
TYPE_PRECISION is equal to mode bitsize. Use const_with_all_bytes_same
if possible to compute val.
(classify_partition): Verify CONSTRUCTOR doesn't have any elts.
For QImode integers don't require anything about precision. Use
const_with_all_bytes_same to find out if the constant doesn't have
repeated bytes in it.
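The idea of testing whether a constant consists of one repeated byte, and can therefore be stored with memset, can be sketched on a raw object representation. This is a simplified model, not the GCC function, which works on trees; the name all_bytes_same and its interface are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Return the repeated byte value (0..255) if all LEN bytes of the
   object at P are equal, or -1 if they differ. */
int all_bytes_same(const void *p, size_t len)
{
    const unsigned char *bytes = p;

    assert(p != NULL && len > 0);
    for (size_t i = 1; i < len; i++)
        if (bytes[i] != bytes[0])
            return -1;
    return bytes[0];
}
```

A value such as 0x01010101 qualifies with byte 1, while 0x01020304 does not. Because the check looks at every byte of the stored object, it is independent of endianness.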
* gcc.dg/pr56837.c: New test.
* gcc.dg/tree-ssa/ldist-19.c: Don't check for
"generated memset minus one".
update_web_docs_libstdcxx_svn: No longer ignore all output from the actual copy process.
* update_web_docs_libstdcxx_svn: No longer ignore all output from
the actual copy process.
Check the exit code of the actual copy process; diagnose problems.
Jonathan Wakely [Sun, 7 Apr 2013 15:42:27 +0000 (15:42 +0000)]
forward_list.h: Only include required headers.
* include/bits/forward_list.h: Only include required headers.
(forward_list::reference): Define directly, not using __alloc_traits.
(forward_list::const_reference): Likewise.
Bill Schmidt [Fri, 5 Apr 2013 19:27:58 +0000 (19:27 +0000)]
re PR target/56843 (PowerPC Newton-Raphson reciprocal estimates can be improved)
gcc:
2013-04-05 Bill Schmidt <wschmidt@linux.vnet.ibm.com>
PR target/56843
* config/rs6000/rs6000.c (rs6000_emit_swdiv_high_precision): Remove.
(rs6000_emit_swdiv_low_precision): Remove.
(rs6000_emit_swdiv): Rewrite to handle between one and four
iterations of Newton-Raphson generally; modify required number of
iterations for some cases.
* config/rs6000/rs6000.h (RS6000_RECIP_HIGH_PRECISION_P): Remove.
gcc/testsuite:
2013-04-05 Bill Schmidt <wschmidt@linux.vnet.ibm.com>
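As background for the iteration counts mentioned above, a Newton-Raphson reciprocal refinement can be sketched in plain C. This is only a numeric model under assumed names (refine_reciprocal, divide_by_reciprocal); it takes the initial estimate as a parameter rather than using a hardware estimate instruction, and it is not the RTL emitted by rs6000_emit_swdiv.

```c
#include <assert.h>

/* Refine an initial estimate X0 of 1/D with the given number of
   Newton-Raphson steps, x = x * (2 - d * x).  Each step roughly
   squares the relative error of the estimate. */
double refine_reciprocal(double d, double x0, int iterations)
{
    double x = x0;

    assert(d != 0.0 && iterations >= 0);
    for (int i = 0; i < iterations; i++)
        x = x * (2.0 - d * x);
    return x;
}

/* Approximate n / d as n * (1 / d) using the refined reciprocal. */
double divide_by_reciprocal(double n, double d, double x0, int iterations)
{
    return n * refine_reciprocal(d, x0, iterations);
}
```

Since the relative error is roughly squared per step, the number of iterations needed depends on how accurate the initial estimate is and how many bits the target format requires, which is why a general one-to-four iteration scheme is useful.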
Jonathan Wakely [Fri, 5 Apr 2013 10:03:04 +0000 (10:03 +0000)]
re PR libstdc++/56841 (ld: Unsatisfied symbol "__atomic_exchange_8" in file /test/gnu/gcc/objdir/prev-hppa64-hp-hpux11.11/libstdc++-v3/src/.libs/libstdc++.a[eh_terminate.o])
PR libstdc++/56841
* libsupc++/eh_ptr.cc (rethrow_exception): Use get_unexpected() and
get_terminate() accessors.
* libsupc++/eh_throw.cc (__cxa_throw): Likewise.
* libsupc++/eh_terminate.cc: Use mutex when atomic builtins not
available.
* libsupc++/new_handler.cc: Likewise.
* lib/target-supports.exp (check_effective_target_arm_v8_neon_hw):
New procedure.
(check_effective_target_arm_v8_neon_ok_nocache):
Likewise.
(check_effective_target_arm_v8_neon_ok): Change to use
check_effective_target_arm_v8_neon_ok_nocache.
(add_options_for_arm_v8_neon): Use et_arm_v8_neon_flags to set ARMv8
NEON flags.
(check_effective_target_vect_call_btruncf):
Enable for arm and ARMv8 NEON.
(check_effective_target_vect_call_ceilf): Likewise.
(check_effective_target_vect_call_floorf): Likewise.
(check_effective_target_vect_call_roundf): Likewise.
(check_vect_support_and_set_flags): Handle ARMv8 NEON effective
target.
* config/arm/arm-protos.h (arm_builtin_vectorized_function):
New function prototype.
* config/arm/arm.c (TARGET_VECTORIZE_BUILTINS): Define.
(TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION): Likewise.
(arm_builtin_vectorized_function): New function.
* config/arm/arm_neon_builtins.def: New file.
* config/arm/arm.c (neon_builtin_data): Move contents to
arm_neon_builtins.def.
(enum arm_builtins): Include neon builtin definitions.
(ARM_BUILTIN_NEON_BASE): Move from enum to macro.
* config/arm/t-arm (arm.o): Add dependency on
arm_neon_builtins.def.
Marek Polacek [Thu, 4 Apr 2013 15:48:25 +0000 (15:48 +0000)]
re PR tree-optimization/48186 (ICE: SIGFPE (division by zero) in maybe_hot_frequency_p at predict.c:129 with --param hot-bb-frequency-fraction=0 on basic code)
PR tree-optimization/48186
* predict.c (maybe_hot_frequency_p): Return false if
HOT_BB_FREQUENCY_FRACTION is 0.
(cgraph_maybe_hot_edge_p): Likewise.
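The shape of this fix, an early return before dividing by a user-controlled parameter, can be sketched as below. The function is_maybe_hot and its parameters are hypothetical stand-ins for the predict.c code, which reads the fraction from a --param value.

```c
#include <assert.h>
#include <stdbool.h>

/* Return true if FREQ is at least ENTRY_FREQ / HOT_FRACTION.
   A fraction of 0 would divide by zero, so it is treated as
   "never hot" and handled before the division. */
bool is_maybe_hot(int freq, int entry_freq, int hot_fraction)
{
    assert(freq >= 0 && entry_freq >= 0 && hot_fraction >= 0);
    if (hot_fraction == 0)
        return false;
    return freq >= entry_freq / hot_fraction;
}
```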
Richard Biener [Thu, 4 Apr 2013 10:55:25 +0000 (10:55 +0000)]
re PR tree-optimization/56837 (-ftree-loop-distribute-patterns generates incorrect code)
2013-04-04 Richard Biener <rguenther@suse.de>
PR tree-optimization/56837
* tree-loop-distribution.c (classify_partition): For non-zero
values require that the value has the same precision as its
mode to be useful as memset value.
Nick Clifton [Thu, 4 Apr 2013 07:25:35 +0000 (07:25 +0000)]
oops - omitted this from previous delta:
* config/v850/v850e3v5.md (fmasf4): Use fmaf.s on E3V5
architectures.
(fmssf4): Use fmsf.s on E3V5 architectures.
(fnmasf4): Use fnmaf.s on E3V5 architectures.
(fnmssf4): Use fnmsf.s on E3V5 architectures.
Jason Merrill [Thu, 4 Apr 2013 00:14:00 +0000 (20:14 -0400)]
cp-demangle.c (cplus_demangle_type): Fix function quals.
libiberty/
* cp-demangle.c (cplus_demangle_type): Fix function quals.
(d_pointer_to_member_type): Simplify.
gcc/cp/
* mangle.c (write_type): When writing a function type with
function-cv-quals, don't add the unqualified type as a
substitution candidate.
Teresa Johnson [Wed, 3 Apr 2013 20:51:28 +0000 (20:51 +0000)]
This patch enables the gcov-dump tool to optionally compute and dump the working set information from the counter histogram...
This patch enables the gcov-dump tool to optionally compute and dump
the working set information from the counter histogram, via a new -w option.
This is useful to help understand and tune how the compiler will use
the counter histogram, since it first computes the working set and selects
thresholds based on that.
This required moving the bulk of the compute_working_sets functionality
into gcov-io.c so that it was accessible by gcov-dump.c.
2013-04-03 Teresa Johnson <tejohnson@google.com>
* gcov-io.c (compute_working_sets): Moved most of body of old
compute_working_sets here from profile.c.
* gcov-io.h (NUM_GCOV_WORKING_SETS): Moved here from profile.c.
(gcov_working_set_t): Moved typedef here from basic-block.h.
(compute_working_sets): Declare.
* profile.c (NUM_GCOV_WORKING_SETS): Moved to gcov-io.h.
(get_working_sets): Renamed from compute_working_sets,
replace most of body with call to new compute_working_sets.
(get_exec_counts): Replace call to compute_working_sets
with call to get_working_sets.
* profile.h (get_working_sets): Renamed from
compute_working_sets.
* lto-cgraph.c (input_symtab): Replace call to compute_working_sets
with call to get_working_sets.
* basic-block.h (gcov_working_set_t): Moved to gcov-io.h.
* gcov-dump.c (dump_working_sets): New function.
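A working set, as used here, can be thought of as the smallest group of hottest counters whose sum reaches a given percentage of the total count, together with the smallest counter value in that group. The sketch below models this on a plain array that is already sorted in descending order; the real code works on histogram buckets, and the type working_set and function working_set_for are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of one working set. */
typedef struct {
    size_t num_counters;   /* how many of the hottest counters are needed */
    uint64_t min_counter;  /* smallest counter value inside the set */
} working_set;

/* SORTED_DESC holds N counters in descending order.  Return the
   smallest prefix whose sum reaches PERCENT of the total.
   Assumes total * percent does not overflow 64 bits. */
working_set working_set_for(const uint64_t *sorted_desc, size_t n,
                            unsigned percent)
{
    working_set ws = { 0, 0 };
    uint64_t total = 0, cumulative = 0, target;
    size_t i;

    assert(percent <= 100);
    for (i = 0; i < n; i++)
        total += sorted_desc[i];
    target = total * percent / 100;

    for (i = 0; i < n && cumulative < target; i++) {
        cumulative += sorted_desc[i];
        ws.num_counters = i + 1;
        ws.min_counter = sorted_desc[i];
    }
    return ws;
}
```

The min_counter of a chosen working set is the kind of value that can serve as a hotness threshold: anything executed at least that often belongs to the set.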
Jeff Law [Wed, 3 Apr 2013 19:18:09 +0000 (13:18 -0600)]
re PR tree-optimization/56799 (Runfail after r197060+r197082.)
PR tree-optimization/56799
* tree-ssa-dom.c (record_equivalences_from_incoming_edge): Bring
back test for widening conversion erroneously dropped in prior
change.
PR tree-optimization/56799
* gcc.c-torture/execute/pr56799.c: New test.
Jakub Jelinek [Wed, 3 Apr 2013 15:24:13 +0000 (17:24 +0200)]
re PR c++/56819 (ICE: SIGSEGV in int_cst_value (tree.h:4013) with -fcompare-debug)
PR debug/56819
* tree.c (strip_typedefs): Copy NON_DEFAULT_TEMPLATE_ARGS_COUNT
from args to new_args.
(strip_typedefs_expr): Copy NON_DEFAULT_TEMPLATE_ARGS_COUNT from t to
r instead of doing {S,G}ET_NON_DEFAULT_TEMPLATE_ARGS_COUNT.
Nick Clifton [Wed, 3 Apr 2013 14:06:38 +0000 (14:06 +0000)]
v850e3v5.md (fmasf4): Use fmaf.s on E3V5 architectures.
* config/v850/v850e3v5.md (fmasf4): Use fmaf.s on E3V5
architectures.
(fmssf4): Use fmsf.s on E3V5 architectures.
(fnmasf4): Use fnmaf.s on E3V5 architectures.
(fnmssf4): Use fnmsf.s on E3V5 architectures.
Richard Biener [Wed, 3 Apr 2013 13:41:13 +0000 (13:41 +0000)]
re PR tree-optimization/56817 (ICE in hide_evolution_in_other_loops_than_loop)
2013-04-03 Richard Biener <rguenther@suse.de>
PR tree-optimization/56817
* tree-ssa-loop-ivcanon.c (tree_unroll_loops_completely):
Split out ...
(tree_unroll_loops_completely_1): ... new function to manually
walk the loop tree, properly deferring outer loops of unrolled
loops to later iterations.
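The walk described above (inner loops first, and an outer loop skipped for the current iteration whenever one of its inner loops changed) can be modelled on a toy loop tree. The struct loop_node, its flags, and the function unroll_pass are hypothetical; real unrolling removes the loop rather than setting a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy loop-tree node; all fields are assumptions for illustration. */
struct loop_node {
    struct loop_node *inner;  /* first nested loop, or NULL */
    struct loop_node *next;   /* next sibling loop, or NULL */
    bool can_unroll;          /* loop is a candidate for full unrolling */
    bool unrolled;            /* loop has already been unrolled */
};

/* One pass over the tree rooted at LOOP.  Inner loops are processed
   first.  If any inner loop changed, LOOP itself is deferred to a
   later pass.  Returns true if anything changed. */
bool unroll_pass(struct loop_node *loop)
{
    bool changed = false;

    assert(loop != NULL);
    for (struct loop_node *in = loop->inner; in != NULL; in = in->next)
        changed |= unroll_pass(in);
    if (changed)
        return true;  /* defer the outer loop */

    if (loop->can_unroll && !loop->unrolled) {
        loop->unrolled = true;
        return true;
    }
    return false;
}
```

A caller would repeat unroll_pass on the root until it returns false, so a two-level nest takes two changing passes: the inner loop first, the outer loop on the next iteration.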
gcc/
* simplify-rtx.c (simplify_binary_operation_1) <VEC_SELECT>:
Handle VEC_MERGE.
(simplify_ternary_operation) <VEC_MERGE>: Use unsigned HOST_WIDE_INT
for masks. Test for side effects. Handle nested VEC_MERGE. Handle
equal arguments.
gcc/testsuite/
* gcc.target/i386/merge-1.c: New testcase.
* gcc.target/i386/avx2-vpblendd128-1.c: Make it non-trivial.
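Two identities that make nested and equal-argument VEC_MERGE simplifications plausible can be checked on a small scalar model. The sketch assumes that bit i of the mask selects element i from the first operand and otherwise from the second; the names vec_merge, vec_equal and NELTS are illustrative and this is not the simplify-rtx.c code.

```c
#include <assert.h>

#define NELTS 4

/* Model of a vector merge: element i comes from A if bit i of MASK
   is set, otherwise from B. */
void vec_merge(const int *a, const int *b, unsigned mask, int *out)
{
    assert(a && b && out);
    for (int i = 0; i < NELTS; i++)
        out[i] = ((mask >> i) & 1u) ? a[i] : b[i];
}

/* Return 1 if both NELTS-element vectors are equal, else 0. */
int vec_equal(const int *x, const int *y)
{
    assert(x && y);
    for (int i = 0; i < NELTS; i++)
        if (x[i] != y[i])
            return 0;
    return 1;
}
```

With this model, merging a vector with itself gives the same vector for any mask, and merge(merge(a, b, m), c, n) equals merge(b, c, n) whenever m and n share no set bits, because no element of a can reach the result.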
Jakub Jelinek [Wed, 3 Apr 2013 09:17:44 +0000 (11:17 +0200)]
re PR c/19449 (__builtin_constant_p cannot resolve to const when optimizing)
PR c/19449
* tree.h (force_folding_builtin_constant_p): New decl.
* builtins.c (force_folding_builtin_constant_p): New variable.
(fold_builtin_constant_p): Fold immediately also if
force_folding_builtin_constant_p.
* c-parser.c (c_parser_get_builtin_args): Add choose_expr_p
argument. If set, temporarily OR it into
force_folding_builtin_constant_p while parsing the first argument.
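The construct this change targets is __builtin_constant_p used as the selector of __builtin_choose_expr, which needs a constant first argument. A minimal usage sketch with GCC or Clang follows; the macro IS_CONSTANT and the function check_constants are names invented for the example, and only plain literal arguments are shown.

```c
#include <assert.h>

/* Select 1 or 0 at compile time depending on whether X is a constant
   expression.  Relies on the GCC/Clang builtins named below. */
#define IS_CONSTANT(x) \
    __builtin_choose_expr(__builtin_constant_p(x), 1, 0)

/* Both arguments are compile-time constants, so the first branch of
   the choose expression is selected in each case. */
void check_constants(void)
{
    assert(IS_CONSTANT(42) == 1);
    assert(IS_CONSTANT(6 * 7) == 1);
}
```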
(c_parser_postfix_expression): Adjust callers.