Jeff Law [Fri, 3 May 2013 16:35:04 +0000 (10:35 -0600)]
re PR tree-optimization/57411 (ICE: verify_ssa failed: definition in block 4 does not dominate use in block 11 with -fno-tree-dce -ftree-vectorize)
PR tree-optimization/57411
* tree-vrp.c (simplify_cond_using_ranges): Verify the constant
operand of the condition will bit into the new type when eliminating
a cast feeding a condition.
PR tree-optimization/57411
* gcc.c-torture/execute/pr57144.c: New test.
Uros Bizjak [Fri, 3 May 2013 12:03:29 +0000 (14:03 +0200)]
i386.md (isa): Add x64_sse4_noavx and x64_avx members.
* config/i386/i386.md (isa): Add x64_sse4_noavx and x64_avx members.
(enabled): Handle new members.
* config/i386/sse.md (*vec_concatv2si): Merge from
*vec_concatv2si_sse2 and vec_concatv2si_sse.
(vec_concatv2di): Merge with *vec_concatv2di_rex64.
PR tree-optimization/57027
* tree-ssa-math-opts.c (convert_mult_to_fma): When checking for
fnms opportunity, check we got the prerequisite kind of tree / gimple
before using accessor functions.
Richard Biener [Fri, 3 May 2013 11:09:59 +0000 (11:09 +0000)]
double-int.h (lshift): New overload without precision and arith argument.
2013-05-03 Richard Biener <rguenther@suse.de>
* double-int.h (lshift): New overload without precision
and arith argument.
(operator *=, operator +=, operator -=): Move ...
* double-int.c (operator *=, operator +=, operator -=): ... here
and implement more efficiently.
(mul_double_with_sign): Remove.
(lshift_double): Adjust to take unsinged shift argument, push
dispatching code to callers.
(mul_double_wide_with_sign): Add early out for callers that
are not interested in high parts or overflow.
(lshift): New function.
(lshift, rshift, alshift, arshift, llshift, lrshift): Add
dispatch code here.
(lrotate, rrotate): Use logical shifts.
* expr.c (get_inner_reference): Use lshift.
* fixed-value.c (do_fixed_divide): Likewise.
* tree-dfa.c (get_ref_base_and_extent): Likewise.
* tree-ssa-alias.c (indirect_ref_may_alias_decl_p): Likewise.
(indirect_refs_may_alias_p): Likewise.
(stmt_kills_ref_p_1): Likewise.
Tobias Burnus [Thu, 2 May 2013 16:29:14 +0000 (18:29 +0200)]
re PR fortran/57142 (SIZE/SHAPE overflow despite kind=8)
2013-05-02 Tobias Burnus <burnus@net-b.de>
PR fortran/57142
* simplify.c (gfc_simplify_size): Renamed from
simplify_size; fix kind=8 handling.
(gfc_simplify_size): New function.
(gfc_simplify_shape): Add range check.
* resolve.c (resolve_function): Fix handling
for ISYM_SIZE.
Martin Jambor [Thu, 2 May 2013 14:03:02 +0000 (16:03 +0200)]
re PR middle-end/56988 (ipa-cp incorrectly propagates a field of an aggregate)
2013-05-02 Martin Jambor <mjambor@suse.cz>
PR middle-end/56988
* ipa-prop.h (ipa_agg_replacement_value): New flag by_ref.
* ipa-cp.c (ipa_get_indirect_edge_target_1): Also check that by_ref
flags match.
(find_aggregate_values_for_callers_subset): Fill in the by_ref flag of
ipa_agg_replacement_value structures.
(known_aggs_to_agg_replacement_list): Likewise.
* ipa-prop.c (write_agg_replacement_chain): Stream by_ref flag.
(read_agg_replacement_chain): Likewise.
(ipcp_transform_function): Also check that by_ref flags match.
Richard Biener [Thu, 2 May 2013 13:59:38 +0000 (13:59 +0000)]
graphds.h (struct graph): Add obstack member.
2013-05-02 Richard Biener <rguenther@suse.de>
* graphds.h (struct graph): Add obstack member.
* graphds.c (new_graph): Initialize obstack and allocate
vertices from it.
(add_edge): Allocate edge from the obstack.
(free_graph): Free the obstack instead of all edges and
vertices.
Teresa Johnson [Thu, 2 May 2013 13:20:47 +0000 (13:20 +0000)]
Follow-on patch to r197595 to complete the replacement of truncating divides in...
Follow-on patch to r197595 to complete the replacement of truncating divides
in profile scaling code with rounding divide equivalents using helper routines
in basic-block.h.
In addition to bootstrap and profiledbootstrap builds and tests (with and
without LTO), I built and tested performance of the SPEC cpu2006 benchmarks
with FDO on a Nehalem system. I didn't see any performance changes that
looked significant.
re PR target/57091 (ICE: in assign_by_spills, at lra-assigns.c:1268 with -mcmodel=large and indirect call)
2013-05-01 Vladimir Makarov <vmakarov@redhat.com>
PR target/57091
* lra-constraints.c (best_small_class_operands_num): Remove.
(process_alt_operands): Remove small_class_operands_num. Take
small classes operands into losers and only if the operand is not
matched. Modify debugging output.
(curr_insn_transform): Remove best_small_class_operands_num.
Print insn name.
2013-05-01 Vladimir Makarov <vmakarov@redhat.com>
PR target/57091
* gcc.target/i386/pr57091.c: New test.
* sel-sched.c (move_op_orig_expr_found): Remove insn_emitted
variable. Use just INSN_UID for determining whether an insn
should be only disconnected from the insn stream.
* sel-sched-ir.h (EXPR_WAS_CHANGED): Remove.
Mike Frysinger [Tue, 30 Apr 2013 04:07:23 +0000 (04:07 +0000)]
gcc: arm: linux-eabi: fix handling of armv4 bx fixups when linking
The bpabi.h header already sets up defines to automatically use the
--fix-v4bx flag with the assembler & linker as needed, and creates a
default assembly & linker spec which uses those. Unfortunately, the
linux-eabi.h header clobbers the LINK_SPEC define and doesn't include
the v4bx define when setting up its own. So while the assembler spec
is retained and works fine to generate the right relocs, building for
armv4 targets doesn't invoke the linker correctly so all the relocs
get processed as if we had an armv4t target.
You can see this with -dumpspecs when configuring gcc for an armv4
target and using --with-arch=armv4:
$ armv4l-unknown-linux-gnueabi-gcc -dumpspecs |& grep -B 1 fix-v4bx
*subtarget_extra_asm_spec:
.... %{mcpu=arm8|mcpu=arm810|mcpu=strongarm*|march=armv4|mcpu=fa526|mcpu=fa626:--fix-v4bx} ...
With this fix in place, we also get the link spec:
$ armv4l-unknown-linux-gnueabi-gcc -dumpspecs |& grep -B 1 fix-v4bx
*link:
... %{mcpu=arm8|mcpu=arm810|mcpu=strongarm*|march=armv4|mcpu=fa526|mcpu=fa626:--fix-v4bx} ...
And all my hello world tests / glibc builds automatically turn the
bx insn into the 'mov pc, lr' insn and all is right in the world.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
From-SVN: r198438
Vladimir Makarov [Mon, 29 Apr 2013 19:42:20 +0000 (19:42 +0000)]
re PR target/57097 (ICE: in find_hard_regno_for, at lra-assigns.c:561 with -O2 -fPIC -m32)
2013-04-29 Vladimir Makarov <vmakarov@redhat.com>
PR target/57097
* lra-constraints.c (process_alt_operands): Discourage a bit more
using memory for pseudos. Print cost dump for alternatives.
Modify cost values for conflicts with early clobbers.
(curr_insn_transform): Spill pseudos reassigned to NO_REGS.
2013-04-29 Vladimir Makarov <vmakarov@redhat.com>
PR target/57097
* gcc.target/i386/pr57097.c: New test.
Richard Biener [Mon, 29 Apr 2013 15:06:18 +0000 (15:06 +0000)]
re PR tree-optimization/57075 (verify_flow_info failed: control flow in the middle of basic block)
2013-04-29 Richard Biener <rguenther@suse.de>
PR middle-end/57075
* tree-inline.c (copy_edges_for_bb): Still split the bbs,
even if not adding abnormal edges for calls that can make
abnormal gotos.
Teresa Johnson [Mon, 29 Apr 2013 13:22:46 +0000 (13:22 +0000)]
This patch fixes PR bootstrap/57077.
This patch fixes PR bootstrap/57077. Certain new uses of apply_probability
are actually scaling the counts up, and the scale factor should not
be treated as a probability as the value may exceed REG_BR_PROB_BASE.
One example (from the PR) is when scaling counts up in LTO when merging
profiles. Another example I found when preparing the patch to use
the rounding divide in more places is when inlining COMDAT functions.
Add new helper function apply_scale that does the scaling without
the probability range check. I audited the new uses of apply_probability
and changed the calls as appropriate.
2013-04-29 Teresa Johnson <tejohnson@google.com>
PR bootstrap/57077
* basic-block.h (apply_scale): New function.
(apply_probability): Use apply_scale.
* gimple-streamer-in.c (input_bb): Ditto.
* lto-streamer-in.c (input_cfg): Ditto.
* lto-cgraph.c (merge_profile_summaries): Ditto.
* tree-optimize.c (execute_fixup_cfg): Ditto.
* tree-inline.c (copy_bb): Update comment to use
apply_scale.
(copy_edges_for_bb): Ditto.
(copy_cfg_body): Ditto.
Jeff Law [Mon, 29 Apr 2013 12:52:17 +0000 (06:52 -0600)]
tree-vrp.c (range_fits_type_p): Move to earlier point in file.
* tree-vrp.c (range_fits_type_p): Move to earlier point in file.
(simplify_cond_using_ranges): Generalize code to simplify
COND_EXPRs where one argument is a constant and the other
is an SSA_NAME created by an integral type conversion.
Richard Biener [Mon, 29 Apr 2013 11:31:33 +0000 (11:31 +0000)]
re PR middle-end/57089 (ICE in verify_loop_structure, at cfgloop.c:1647)
2013-04-29 Richard Biener <rguenther@suse.de>
PR middle-end/57089
* omp-low.c (expand_omp_taskreg): If the parent function had
a broken loop tree make sure to schedule a fixup for the child
as well.
(expand_omp_for_generic): Properly add loops.
(expand_omp_for_static_nochunk): Likewise.
(expand_omp_for_static_chunk): Likewise.
(expand_omp_for): For the degenerate case fixup loops.
(expand_omp_sections): Fix default bb placement in loops.
(expand_omp_atomic_pipeline): Properly add loops.
re PR target/54349 (_mm_cvtsi128_si64 unnecessary stores value at stack)
PR target/54349
* config/i386/i386.h (enum ix86_tune_indices)
<X86_TUNE_INTER_UNIT_MOVES_TO_VEC, X86_TUNE_INTER_UNIT_MOVES_FROM_VEC>:
New, split from X86_TUNE_INTER_UNIT_MOVES.
<X86_TUNE_INTER_UNIT_MOVES>: Remove.
(TARGET_INTER_UNIT_MOVES_TO_VEC): New define.
(TARGET_INTER_UNIT_MOVES_FROM_VEC): Ditto.
(TARGET_INTER_UNIT_MOVES): Remove.
* config/i386/i386.c (initial_ix86_tune_features): Update.
Disable X86_TUNE_INTER_UNIT_MOVES_FROM_VEC for m_ATHLON_K8 only.
(ix86_expand_convert_uns_didf_sse): Use
TARGET_INTER_UNIT_MOVES_TO_VEC instead of TARGET_INTER_UNIT_MOVES.
(ix86_expand_vector_init_one_nonzero): Ditto.
(ix86_expand_vector_init_interleave): Ditto.
(inline_secondary_memory_needed): Return true for moves from SSE class
registers for !TARGET_INTER_UNIT_MOVES_FROM_VEC targets and for moves
to SSE class registers for !TARGET_INTER_UNIT_MOVES_TO_VEC targets.
* config/i386/constraints.md (Yi, Ym): Depend on
TARGET_INTER_UNIT_MOVES_TO_VEC.
(Yj, Yn): New constraints.
* config/i386/i386.md (*movdi_internal): Change constraints of
operand 1 from Yi to Yj and from Ym to Yn.
(*movsi_internal): Ditto.
(*movdf_internal): Ditto.
(*movsf_internal): Ditto.
(*float<SWI48x:mode><X87MODEF:mode>2_1): Use
TARGET_INTER_UNIT_MOVES_TO_VEC instead of TARGET_INTER_UNIT_MOVES.
(*float<SWI48x:mode><X87MODEF:mode>2_1 splitters): Ditto.
(floatdi<X87MODEF:mode>2_i387_with_xmm): Ditto.
(floatdi<X87MODEF:mode>2_i387_with_xmm splitters): Ditto.
* config/i386/sse.md (movdi_to_sse): Ditto.
(sse2_stored): Change constraint of operand 1 from Yi to Yj.
Use TARGET_INTER_UNIT_MOVES_FROM_VEC instead of
TARGET_INTER_UNIT_MOVES.
(sse_storeq_rex64): Change constraint of operand 1 from Yi to Yj.
(sse_storeq_rex64 splitter): Use TARGET_INTER_UNIT_MOVES_FROM_VEC
instead of TARGET_INTER_UNIT_MOVES.
* config/i386/mmx.md (*mov<mode>_internal): Change constraint of
operand 1 from Yi to Yj and from Ym to Yn.
Jakub Jelinek [Mon, 29 Apr 2013 07:55:09 +0000 (09:55 +0200)]
re PR tree-optimization/57083 (Wrong constant folding)
PR tree-optimization/57083
* tree-vrp.c (extract_range_from_binary_expr_1): For LSHIFT_EXPR with
non-singleton shift count range, zero extend low_bound for uns case.
Jakub Jelinek [Mon, 29 Apr 2013 07:43:20 +0000 (09:43 +0200)]
predicates.md (general_vector_operand): New predicate.
* config/i386/predicates.md (general_vector_operand): New predicate.
* config/i386/i386.c (const_vector_equal_evenodd_p): New function.
(ix86_expand_mul_widen_evenodd): Force op1 resp. op2 into register
if they aren't nonimmediate operands. If their original values
satisfy const_vector_equal_evenodd_p, don't shift them.
* config/i386/sse.md (mul<mode>3): Use general_vector_operand
predicates. For the SSE4.1 case force operands[{1,2}] into registers
if not nonimmediate_operand.
(vec_widen_smult_even_v4si): Use nonimmediate_operand predicates
instead of register_operand.
(vec_widen_<s>mult_odd_<mode>): Use general_vector_operand predicates.