Andreas Krebbel [Tue, 10 May 2016 09:00:53 +0000 (09:00 +0000)]
S/390: Disable scalar vector instructions with -mno-vx.
Although the scalar variants of the vector instructions aren't
actually vector instructions, they are still executed in the vector
facility and therefore need to be disabled when the facility is
disabled with -mno-vx.
Fixed with the attached patch. Committed to head, GCC 6, and GCC 5
branches.
gcc/ChangeLog:
2016-05-10 Andreas Krebbel <krebbel@linux.vnet.ibm.com>
Richard Biener [Tue, 10 May 2016 08:20:43 +0000 (08:20 +0000)]
re PR tree-optimization/70497 (Missed CSE of subregs on GIMPLE)
2016-05-10 Richard Biener <rguenther@suse.de>
PR tree-optimization/70497
PR tree-optimization/28367
* tree-ssa-sccvn.c (vn_nary_build_or_lookup): New function
split out from ...
(visit_reference_op_load): ... here.
(vn_reference_lookup_3): Use it to handle subreg-like accesses
with simplified BIT_FIELD_REFs.
* tree-ssa-pre.c (eliminate_insert): Handle inserting BIT_FIELD_REFs.
* tree-complex.c (extract_component): Handle BIT_FIELD_REFs
correctly.
* gcc.dg/torture/20160404-1.c: New testcase.
* gcc.dg/tree-ssa/ssa-fre-54.c: Likewise.
* gcc.dg/tree-ssa/ssa-fre-55.c: Likewise.
DWARF: add abstract origin links on lexical blocks DIEs
Track in DWARF which abstract lexical block each concrete one comes from,
so that debuggers can inherit the contents of the former when decoding
the latter. This enables debuggers to properly handle the following case:
* function Child2 is nested in a lexical block, itself nested in
function Child1;
* function Child1 is inlined into some call site;
* function Child2 is never inlined.
Here, Child2 is described in DWARF only in the abstract instance of
Child1. So when debuggers decode Child1's concrete instances, they need
to fetch the definition for Child2 in the corresponding abstract
instance: the DW_AT_abstract_origin link on the lexical block that
embeds Child2 enables them to do that.
Bootstrapped and regtested on x86_64-linux.
gcc/ChangeLog:
* dwarf2out.c (add_abstract_origin_attribute): Adjust
documentation comment. For BLOCK nodes, add a
DW_AT_abstract_origin attribute that points to the DIE generated
for the origin BLOCK.
(gen_lexical_block_die): Call add_abstract_origin_attribute for
blocks from inlined functions.
Joel Sherrill [Tue, 10 May 2016 07:11:00 +0000 (07:11 +0000)]
[RTEMS] Fix moxie libgcc support
libgcc/
PR libgcc/70720
* config.host (moxie-*-rtems*): Merge this stanza with other moxie
targets so the same extra_parts are built. Also have tmake_file add
on to its value rather than override.
Alan Modra [Mon, 9 May 2016 23:12:20 +0000 (08:42 +0930)]
[RS6000] Stop regrename twiddling with split-stack prologue
PR target/70947
* config/rs6000/rs6000.c (rs6000_expand_split_stack_prologue): Stop
regrename modifying insns saving lr before __morestack call.
* config/rs6000/rs6000.md (split_stack_return): Similarly for
insns restoring lr after __morestack call.
Aaron Sawdey [Mon, 9 May 2016 16:56:30 +0000 (16:56 +0000)]
rs6000.c (rs6000_reassociation_width): Add function for TARGET_SCHED_REASSOCIATION_WIDTH to enable parallel...
* config/rs6000/rs6000.c (rs6000_reassociation_width): Add
function for TARGET_SCHED_REASSOCIATION_WIDTH to enable
parallel reassociation for power8 and forward.
Uros Bizjak [Mon, 9 May 2016 15:37:30 +0000 (17:37 +0200)]
i386.md (absneg splitters with general regs): Use general_reg_operand predicate.
* config/i386/i386.md (absneg splitters with general regs): Use
general_reg_operand predicate.
(btsq peephole2): Use x86_64_immediate_operand to check if new
value is suitable for immediate operand. Generate emitted insn
using RTL expressions.
(btcq peephole2): Ditto.
(btrq peephole2): Ditto. Generate correct immediate operand
for AND masking.
Fix handling of negative bitpos in expand_debug_expr
expand_debug_expr handled negative bit positions using:
    else if (bitpos < 0)
      {
        HOST_WIDE_INT units
          = (-bitpos + BITS_PER_UNIT - 1) / BITS_PER_UNIT;
        op0 = adjust_address_nv (op0, mode1, units);
        bitpos += units * BITS_PER_UNIT;
      }
Here "units" is the negative of the (negative) byte offset, so I think
we should be offsetting OP0 by -units instead. E.g. a bitpos of -17
would give units==3, so this code would move OP0 up by 3 bytes and set
bitpos to 7, giving a total bitpos of 31.
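The arithmetic in the -17 example can be checked with a small standalone sketch. It models only the offset computation, not adjust_address_nv itself; the struct and function names are hypothetical, and BITS_PER_UNIT is assumed to be 8.

```c
#include <assert.h>

#define BITS_PER_UNIT 8  /* assumption: 8-bit bytes */

/* Hypothetical model: a byte adjustment applied to OP0 plus the
   residual bit position that remains afterwards.  */
struct split_pos
{
  long byte_offset;
  long bitpos;
};

/* Mirrors the quoted code: OP0 is moved by +units bytes.  */
static struct split_pos
split_as_quoted (long bitpos)
{
  assert (bitpos < 0);
  long units = (-bitpos + BITS_PER_UNIT - 1) / BITS_PER_UNIT;
  struct split_pos r = { units, bitpos + units * BITS_PER_UNIT };
  return r;
}

/* Mirrors the suggestion in the message: OP0 is moved by -units bytes.  */
static struct split_pos
split_as_suggested (long bitpos)
{
  assert (bitpos < 0);
  long units = (-bitpos + BITS_PER_UNIT - 1) / BITS_PER_UNIT;
  struct split_pos r = { -units, bitpos + units * BITS_PER_UNIT };
  return r;
}

/* Effective bit position once the byte adjustment is folded back in.  */
static long
total_bits (struct split_pos p)
{
  return p.byte_offset * BITS_PER_UNIT + p.bitpos;
}
```

For a bitpos of -17, split_as_quoted yields a byte offset of 3 and a residual bitpos of 7, for a total of 31, while split_as_suggested yields -3 and 7, for a total of -17, which matches the requested position.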
Just noticed by inspection. An assert triggered for:
to check whether we had previously seen a nonzero multiple, but "mult" is
a pointer to the previous value rather than the previous value itself.
Noticed by inspection while working on another patch, so I don't have a
testcase. I tried adding an assert for combinations that were wrongly
rejected before but it didn't trigger during a bootstrap and regtest.
Jonathan Wakely [Mon, 9 May 2016 11:50:01 +0000 (12:50 +0100)]
libstdc++/71004 fix recent additions to testcase
PR libstdc++/71004
* testsuite/experimental/filesystem/iterators/
recursive_directory_iterator.cc: Fix test02 to not call member
functions on invalid iterator, and use VERIFY not assert.
Kaushik Phatak [Mon, 9 May 2016 11:44:58 +0000 (11:44 +0000)]
rl78.c (rl78_expand_prologue): Save the MDUC related registers in all interrupt handlers if necessary.
* config/rl78/rl78.c (rl78_expand_prologue): Save the MDUC related
registers in all interrupt handlers if necessary.
(rl78_option_override): Add warning.
(MUST_SAVE_MDUC_REGISTERS): New macro.
(rl78_expand_epilogue): Restore the MDUC registers if necessary.
* config/rl78/rl78.c (check_mduc_usage): New function.
(mduc_regs): New structure to hold MDUC register data.
* config/rl78/rl78.md (is_g13_muldiv_insn): New attribute.
(mulsi3_g13): Add is_g13_muldiv_insn attribute.
(udivmodsi4_g13): Add is_g13_muldiv_insn attribute.
(mulhi3_g13): Add is_g13_muldiv_insn attribute.
* config/rl78/rl78.opt (msave-mduc-in-interrupts): New option.
* doc/invoke.texi (RL78 Options): Add -msave-mduc-in-interrupts.
Bin Cheng [Mon, 9 May 2016 11:44:03 +0000 (11:44 +0000)]
tree-if-conv.c (tree-ssa-loop.h): Include header file.
* tree-if-conv.c (tree-ssa-loop.h): Include header file.
(tree-ssa-loop-niter.h): Ditto.
(idx_within_array_bound, ref_within_array_bound): New functions.
(ifcvt_memrefs_wont_trap): Check if array ref is within bound.
Factor out check on writable base object to ...
(base_object_writable): ... here.
gcc/testsuite/
* gcc.dg/tree-ssa/ifc-9.c: New test.
* gcc.dg/tree-ssa/ifc-10.c: New test.
* gcc.dg/tree-ssa/ifc-11.c: New test.
* gcc.dg/tree-ssa/ifc-12.c: New test.
* gcc.dg/vect/pr61194.c: Remove XFAIL.
* gcc.dg/vect/vect-23.c: Remove XFAIL.
* gcc.dg/vect/vect-mask-store-move-1.c: Revise test check.
Avoid endless run-time recursion for copying single-element tuples where the...
Avoid endless run-time recursion for copying single-element
tuples where the element type is by-value constructible
from any type.
* include/std/tuple (_NotSameTuple): New.
* include/std/tuple (tuple(_UElements&&...)): Use it.
* testsuite/20_util/tuple/cons/element_accepts_anything_byval.cc: New.
Jim Wilson [Sat, 7 May 2016 23:11:57 +0000 (23:11 +0000)]
Emit vmov.i64 to load 0.0 into FP reg when neon enabled.
* config/arm/arm.md (arch): Add neon.
(arch_enabled): Return yes for arch neon when TARGET_NEON.
* config/arm/vfp.md (movdf_vfp): Add w/G as alternative 3. Add
neon_move as type for alt 3. Add arch attr enabling alt 3 for neon.
Emit vmov.i64 for alt 3. Renumber alternatives 3 to 8. Adjust
attributes for alt renumbering. Mark alt 3 as non-predicable.
(thumb2_movdf_vfp): Likewise.
Uros Bizjak [Sat, 7 May 2016 14:36:11 +0000 (16:36 +0200)]
i386.md (*addqi_1): Add preferred_for_speed attribute to disparage alternatives 3 and 4 for...
* config/i386/i386.md (*addqi_1): Add preferred_for_speed attribute
to disparage alternatives 3 and 4 for TARGET_PARTIAL_REG_STALL targets.
(*andqi_1): Add preferred_for_speed attribute to disparage
alternative 2 for TARGET_PARTIAL_REG_STALL targets.
(*<code>qi_1): Ditto.
(*one_cmplqi2_1): Add preferred_for_speed attribute to disparage
alternative 1 for TARGET_PARTIAL_REG_STALL targets.
(*ashlqi3_1): Ditto.
(*swap<mode>): Merge from *swap<mode>_1 and *swap<mode>_2 patterns.
Add preferred_for_size attribute to disparage alternative 0 and
preferred_for_speed attribute to disparage alternative 1 for
TARGET_PARTIAL_REG_STALL targets.
Introduces the nodes used to model connectivity in the escape graph
and related state: a node's escape level and an encoding that will
be added to import and export data.
Uros Bizjak [Fri, 6 May 2016 21:14:20 +0000 (23:14 +0200)]
i386.md (LEAMODE): New mode attribute.
* config/i386/i386.md (LEAMODE): New mode attribute.
(plus to LEA splitter): Rewrite splitter using LEAMODE mode attribute.
(ashift to LEA splitter): Rewrite splitter using SWI mode iterator
and LEAMODE mode attribute. Use VOIDmode const_0_to_3_operand as
operand 2 predicate.
(*lea<mode>_general_2): Use VOIDmode for const248_operand.
(*lea<mode>_general_3): Ditto.
(*lea<mode>_general_4): Use VOIDmode for const_0_to_3_operand.
Chris Manghane [Fri, 6 May 2016 17:37:55 +0000 (17:37 +0000)]
escape: Add skeleton for gc analysis.
Introduces a skeleton replacement escape analysis
which contains four different phases extracted from the escape
analysis implementation in gc/esc.go. Also introduces the
Escape_context each phase uses to make decisions.
Jakub Jelinek [Fri, 6 May 2016 15:23:56 +0000 (17:23 +0200)]
re PR target/70941 (Test miscompiled with -O2.)
PR middle-end/70941
* gcc.dg/torture/pr70941.c (abort): Remove prototype.
(a, b, c, d): Change type from char to signed char.
(main): Compare against (signed char) -1634678893 instead of
hardcoded -109. Use __builtin_abort instead of abort.
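The relationship between the new constant and the old hardcoded value can be sketched as a reduction modulo 256. This is only an illustration, assuming 8-bit bytes and a two's complement signed char; the helper name is hypothetical and is not part of the testcase.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: reduce V modulo 256 and reinterpret the low
   byte as an 8-bit two's complement value.  */
static int
low_byte_as_signed (long long v)
{
  assert (CHAR_BIT == 8);
  /* Conversion to an unsigned type is defined modulo 2^N, so the mask
     extracts the low byte without relying on signed bit patterns.  */
  unsigned int low = (unsigned int) ((unsigned long long) v & 0xffu);
  return low >= 128 ? (int) low - 256 : (int) low;
}
```

Under those assumptions, low_byte_as_signed (-1634678893) is -109: the low byte is 147 (0x93), which reads as -109 when treated as signed.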
David Malcolm [Fri, 6 May 2016 15:18:59 +0000 (15:18 +0000)]
Move name_to_pass_map into class pass_manager
gcc/ChangeLog:
* pass_manager.h (pass_manager::register_pass_name): New method.
(pass_manager::get_pass_by_name): New method.
(pass_manager::create_pass_tab): New method.
(pass_manager::m_name_to_pass_map): New field.
* passes.c (name_to_pass_map): Delete global in favor of field
"m_name_to_pass_map" of pass_manager.
(register_pass_name): Rename from a function to...
(pass_manager::register_pass_name): ...this method, updating
for renaming of global "name_to_pass_map" to field
"m_name_to_pass_map".
(create_pass_tab): Rename from a function to...
(pass_manager::create_pass_tab): ...this method, updating
for renaming of global "name_to_pass_map" to field.
(get_pass_by_name): Rename from a function to...
(pass_manager::get_pass_by_name): ...this method.
(enable_disable_pass): Convert use of get_pass_by_name to
a method call, locating the pass_manager singleton.
Jakub Jelinek [Fri, 6 May 2016 13:13:09 +0000 (15:13 +0200)]
sse.md (*vec_extractv4sf_0, [...]): Use v instead of x in vex or maybe_vex alternatives...
* config/i386/sse.md (*vec_extractv4sf_0, *sse4_1_extractps,
*vec_extractv4sf_mem, vec_extract_lo_v16hi, vec_extract_hi_v16hi,
vec_extract_lo_v32qi, vec_extract_hi_v32qi): Use v instead of x
in vex or maybe_vex alternatives, use maybe_evex instead of vex
in prefix.
Jakub Jelinek [Fri, 6 May 2016 13:12:32 +0000 (15:12 +0200)]
sse.md (*vec_concatv2sf_sse4_1, [...]): Use v instead of x in vex or maybe_vex alternatives...
* config/i386/sse.md (*vec_concatv2sf_sse4_1, *vec_concatv4sf): Use
v instead of x in vex or maybe_vex alternatives, use
maybe_evex instead of vex in prefix.
Jakub Jelinek [Fri, 6 May 2016 13:11:56 +0000 (15:11 +0200)]
sse.md (sse_shufps_<mode>, [...]): Use v instead of x in vex or maybe_vex alternatives...
* config/i386/sse.md (sse_shufps_<mode>, sse_storehps, sse_loadhps,
sse_storelps, sse_movss, avx2_vec_dup<mode>, avx2_vec_dupv8sf_1,
sse2_shufpd_<mode>, sse2_storehpd, sse2_storelpd, sse2_loadhpd,
sse2_loadlpd, sse2_movsd): Use v instead of x in vex or maybe_vex
alternatives, use maybe_evex instead of vex in prefix.
Richard Biener [Fri, 6 May 2016 12:53:26 +0000 (12:53 +0000)]
re PR tree-optimization/70948 (r235622 caused gcc.c-torture/execute/va-arg-pack-1.c execution failure AArch64)
2016-05-06 Richard Biener <rguenther@suse.de>
PR tree-optimization/70948
* tree-ssa-structalias.c (find_func_aliases_for_builtin_call):
Properly clobber all fields of va_list for __builtin_va_start.
Uros Bizjak [Thu, 5 May 2016 22:48:29 +0000 (00:48 +0200)]
re PR target/70873 ([7 Regression] 20% performance regression at 482.sphinx3 after r235442 with -O2 -m32 on Haswell.)
PR target/70873
* config/i386/i386-protos.h (ix86_standard_x87sse_constant_load_p):
New prototype.
* config/i386/i386.c (ix86_standard_x87sse_constant_load_p): New.
* config/i386/i386.md (push mem splitter): Use find_constant_src in
the splitter condition.
(FP load splitter): Use ix86_standard_x87sse_constant_load_p in
the splitter condition.
(FP float_extend load splitter): Ditto.
Uros Bizjak [Thu, 5 May 2016 20:33:42 +0000 (22:33 +0200)]
i386.md (peephole2 patterns): Change true_regnum to REGNUM in all peephole2 patterns.
* config/i386/i386.md (peephole2 patterns): Change true_regnum
to REGNUM in all peephole2 patterns.
(post-reload splitters): Change true_regnum to REGNUM in
post-reload splitters.
(zero_extend splitters): Use general_reg_operand and
nonimmediate_gr_operand predicates.
* openmp.c (gfc_match_omp_clauses): Restructure, so that clause
parsing is done in a big switch based on gfc_peek_ascii_char and
individual clauses under their first letters are sorted too.
Jakub Jelinek [Thu, 5 May 2016 13:26:59 +0000 (15:26 +0200)]
c-parser.c (c_parser_switch_statement): Add IF_P argument, pass it through to c_parser_c99_block_statement.
* c-parser.c (c_parser_switch_statement): Add IF_P argument,
pass it through to c_parser_c99_block_statement.
(c_parser_statement_after_labels): Adjust c_parser_switch_statement
caller.
* parser.c (cp_parser_selection_statement): For RID_SWITCH,
pass if_p instead of NULL to cp_parser_implicitly_scoped_statement.
Alan Modra [Thu, 5 May 2016 00:07:27 +0000 (09:37 +0930)]
[RS6000] TARGET_RELOCATABLE
For ABI_V4, -mrelocatable and -fPIC both generate position independent
code, with some extra "fixup" output for -mrelocatable. The
similarity of these two options has led to the situation where the
sysv4.h SUBTARGET_OVERRIDE_OPTIONS sets flag_pic on seeing
-mrelocatable, and sets TARGET_RELOCATABLE on seeing -fPIC. That
prevents LTO from properly optimizing position dependent executables,
because the mutual dependence of the flags and the fact that LTO
streaming records the state of rs6000_isa_flags, result in flag_pic
being set when it shouldn't be.
So, don't set TARGET_RELOCATABLE when -fPIC. Places that currently
test TARGET_RELOCATABLE can instead test
TARGET_RELOCATABLE || (DEFAULT_ABI == ABI_V4 && flag_pic > 1)
or since TARGET_RELOCATABLE can only be enabled when ABI_V4,
DEFAULT_ABI == ABI_V4 && (TARGET_RELOCATABLE || flag_pic > 1).
Also, since flag_pic is set by -mrelocatable, a number of places that
currently test TARGET_RELOCATABLE can be simplified. I also made
-mrelocatable set TARGET_NO_FP_IN_TOC, allowing TARGET_RELOCATABLE to
be removed from ASM_OUTPUT_SPECIAL_POOL_ENTRY_P. Reducing occurrences
of TARGET_RELOCATABLE is a good thing.
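The claim that the two replacement tests are interchangeable, given that TARGET_RELOCATABLE can only be enabled for ABI_V4, can be checked exhaustively over a small model. The sketch below is an assumption-laden stand-in: the enum, the parameter names, and the function names are hypothetical and only mimic the rs6000 macros.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the rs6000 ABI enum; two values suffice here.  */
enum abi { ABI_AIX, ABI_V4 };

/* TARGET_RELOCATABLE || (DEFAULT_ABI == ABI_V4 && flag_pic > 1)  */
static bool
test_form_a (bool target_relocatable, enum abi default_abi, int flag_pic)
{
  return target_relocatable || (default_abi == ABI_V4 && flag_pic > 1);
}

/* DEFAULT_ABI == ABI_V4 && (TARGET_RELOCATABLE || flag_pic > 1)  */
static bool
test_form_b (bool target_relocatable, enum abi default_abi, int flag_pic)
{
  return default_abi == ABI_V4 && (target_relocatable || flag_pic > 1);
}

/* Count states where the two forms disagree, skipping states that
   violate "TARGET_RELOCATABLE implies ABI_V4".  */
static int
count_disagreements (void)
{
  int n = 0;
  for (int reloc = 0; reloc <= 1; reloc++)
    for (int abi = ABI_AIX; abi <= ABI_V4; abi++)
      for (int pic = 0; pic <= 2; pic++)
        {
          if (reloc && abi != ABI_V4)
            continue;
          if (test_form_a (reloc, (enum abi) abi, pic)
              != test_form_b (reloc, (enum abi) abi, pic))
            n++;
        }
  return n;
}

/* Aborts if the two forms ever differ under the invariant.  */
static void
verify_equivalence (void)
{
  assert (count_disagreements () == 0);
}
```

The two forms differ only in the excluded state where TARGET_RELOCATABLE is set without ABI_V4, which is why the invariant matters.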
Uros Bizjak [Wed, 4 May 2016 21:13:13 +0000 (23:13 +0200)]
re PR target/70873 ([7 Regression] 20% performance regression at 482.sphinx3 after r235442 with -O2 -m32 on Haswell.)
PR target/70873
* config/i386/i386.md
(TARGET_SSE_PARTIAL_REG_DEPENDENCY float_extend sf->df peephole2):
Change to post-epilogue_completed late splitter. Use sse_reg_operand
as operand 0 predicate.
(TARGET_SSE_PARTIAL_REG_DEPENDENCY float_truncate df->sf peephole2):
Ditto.
(TARGET_SSE_PARTIAL_REG_DEPENDENCY float {si,di}->{sf,df} peephole2):
Ditto. Emit the pattern using RTX.
(TARGET_USE_VECTOR_FP_CONVERTS float_extend sf->df splitter):
Use sse_reg_operand as operand 0 predicate. Do not use true_regnum in
the post-reload splitter. Use lowpart_subreg instead of gen_rtx_REG.
(TARGET_USE_VECTOR_FP_CONVERTS float_truncate df->sf splitter):
Ditto.
(TARGET_USE_VECTOR_CONVERTS float si->{sf,df} splitter): Use
sse_reg_operand as operand 0 predicate.
(TARGET_SPLIT_MEM_OPND_FOR_FP_CONVERTS float_extend sf->df peephole2):
Use sse_reg_operand as operand 0 predicate. Use lowpart_subreg
instead of gen_rtx_REG.
(TARGET_SPLIT_MEM_OPND_FOR_FP_CONVERTS float_truncate df->sf peephole2):
Ditto.
cfgcleanup: Fold jumps and conditional branches with returns
This patch makes cfgcleanup optimize jumps to returns. There are three
cases this handles:
-- A jump to a return; this is simplified to just that return.
-- A conditional branch to a return; simplified to a conditional return.
-- A conditional branch that falls through to a return. This is simplified
to a conditional return (with the condition inverted), falling through
to a jump to the original destination. That jump can then be optimized
further, as usual.
This handles all cases the current function.c does, and a few it misses.
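The three cases can be sketched on a toy control-flow graph. This is not the cfgcleanup.c implementation: the block representation, the field names, and fold_returns are hypothetical, and the jump that the third case falls through to is modeled simply by redirecting the fall-through successor.

```c
#include <assert.h>
#include <stdbool.h>

enum term_kind { TERM_JUMP, TERM_COND, TERM_RETURN, TERM_COND_RETURN };

/* Toy basic block: only its terminator and successors are modeled.  */
struct block
{
  enum term_kind kind;
  bool inverted;    /* condition inverted relative to the original */
  int taken;        /* successor when the jump/branch is taken, or -1 */
  int fallthru;     /* fall-through successor, or -1 */
  bool empty_body;  /* block holds nothing but its terminator */
};

/* Toy analogue of the check for a block that is just a return.  */
static bool
is_just_return (const struct block *cfg, int i)
{
  return i >= 0 && cfg[i].kind == TERM_RETURN && cfg[i].empty_body;
}

/* Apply one of the three simplifications to block I if it matches;
   return true if the block was changed.  */
static bool
fold_returns (struct block *cfg, int i)
{
  assert (i >= 0);
  struct block *b = &cfg[i];
  switch (b->kind)
    {
    case TERM_JUMP:
      /* Case 1: jump to a return becomes that return.  */
      if (is_just_return (cfg, b->taken))
        {
          b->kind = TERM_RETURN;
          b->taken = -1;
          return true;
        }
      break;
    case TERM_COND:
      /* Case 2: conditional branch to a return becomes a
         conditional return.  */
      if (is_just_return (cfg, b->taken))
        {
          b->kind = TERM_COND_RETURN;
          b->taken = -1;
          return true;
        }
      /* Case 3: branch around a return becomes an inverted conditional
         return that falls through towards the old branch target.  */
      if (is_just_return (cfg, b->fallthru))
        {
          b->kind = TERM_COND_RETURN;
          b->inverted = !b->inverted;
          b->fallthru = b->taken;
          b->taken = -1;
          return true;
        }
      break;
    default:
      break;
    }
  return false;
}
```

In this model a block that branches to block 2 and falls through to a bare return in block 1 ends up as an inverted conditional return whose fall-through successor is block 2.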
* cfgcleanup.c (bb_is_just_return): New function.
(try_optimize_cfg): Simplify jumps to return, branches to return,
and branches around return.
Jakub Jelinek [Wed, 4 May 2016 20:44:40 +0000 (22:44 +0200)]
re PR c++/70906 (ice in add_expr, at tree.c:7925)
PR c++/70906
PR c++/70933
* tree-core.h (enum operand_equal_flag): Add OEP_HASH_CHECK.
* tree.c (inchash::add_expr): If !IS_EXPR_CODE_CLASS (tclass),
assert flags & OEP_HASH_CHECK, instead of asserting it
never happens. Handle TARGET_EXPR.
* fold-const.c (operand_equal_p): For hash verification,
or in OEP_HASH_CHECK into flags.
* g++.dg/opt/pr70906.C: New test.
* g++.dg/opt/pr70933.C: New test.