Richard Biener [Tue, 21 Mar 2017 11:43:45 +0000 (11:43 +0000)]
re PR tree-optimization/80032 (C++ excessive stack usage (no stack reuse))
2017-03-21 Richard Biener <rguenther@suse.de>
PR tree-optimization/80032
* gimplify.c (gimple_push_cleanup): Add force_uncond parameter,
if set force the cleanup to happen unconditionally.
(gimplify_target_expr): Push inserted clobbers with force_uncond
to avoid them being removed by control-dependent DCE.
Richard Biener [Tue, 21 Mar 2017 11:42:22 +0000 (11:42 +0000)]
re PR tree-optimization/80122 (__builtin_va_arg_pack() and __builtin_va_arg_pack_len() does not work correctly)
2017-03-21 Richard Biener <rguenther@suse.de>
PR tree-optimization/80122
* tree-inline.c (copy_bb): Do not expans va-arg packs or
va_arg_pack_len when the inlined call stmt requires pack
expansion itself.
* tree-inline.h (struct copy_body_data): Make call_stmt a gcall *.
If the dest of an I0 or I1 is used in an insn before I2, as can happen
in various uncommon cases, and we manage to do the combination, the set
is moved to I2, which is wrong. Don't allow combining the insns in this
case.
PR rtl-optimization/79910
* combine.c (can_combine_p): Do not allow combining an I0 or I1
if its dest is used by an insn before I2 (other than the combined
insns themselves, which are properly handled already).
* combine.c (record_used_regs): New static function.
(try_combine): Handle situations where there is an additional
instruction between I2 and I3 which needs to have a LOG_LINK
updated.
Revert:
2017-03-17 Jim Wilson <jim.wilson@linaro.org>
Bill Schmidt [Mon, 20 Mar 2017 20:04:25 +0000 (20:04 +0000)]
re PR tree-optimization/80054 (ICE in verify_ssa with -O3 -march=broadwell/skylake-avx512)
[gcc]
2017-03-20 Bill Schmidt <wschmidt@linux.vnet.ibm.com>
PR tree-optimization/80054
* gimple-ssa-strength-reduction.c (all_phi_incrs_profitable): Fail
the optimization if a PHI or any of its arguments is not dominated
by the candidate's basis. Use gphi* rather than gimple* as
appropriate.
(replace_profitable_candidates): Clean up a gimple* variable that
should be a gphi* variable.
[gcc/testsuite]
2017-03-20 Bill Schmidt <wschmidt@linux.vnet.ibm.com>
PR tree-optimization/80054
* g++.dg/torture/pr80054.C: New file.
Kelvin Nilsen [Mon, 20 Mar 2017 18:05:00 +0000 (18:05 +0000)]
re PR target/79963 (vec_eq_any extracts wrong CR bit when compiling with -mcpu=power9)
gcc/testsuite/ChangeLog:
2017-03-20 Kelvin Nilsen <kelvin@gcc.gnu.org>
PR target/79963
* gcc.target/powerpc/vsu/vec-any-eq-10.c: Add scan-assembler
directive to assure selection of proper bit using rlwinm insn.
* gcc.target/powerpc/vsu/vec-any-eq-14.c: Likewise.
* gcc.target/powerpc/vsu/vec-any-eq-7.c: Likewise.
* gcc.target/powerpc/vsu/vec-any-eq-8.c: Likewise.
* gcc.target/powerpc/vsu/vec-any-eq-9.c: Likewise.
gcc/ChangeLog:
2017-03-20 Kelvin Nilsen <kelvin@gcc.gnu.org>
PR target/79963
* config/rs6000/altivec.h (vec_all_ne): Under __cplusplus__ and
__POWER9_VECTOR__ #ifdef control, change template definition to
use Power9-specific built-in function.
(vec_any_eq): Likewise.
* config/rs6000/vector.md (vector_ae_v2di_p): Change the flag used
to control outcomes from this test.
(vector_ae_<mode>p): For VEC_F modes, likewise.
Palmer Dabbelt [Mon, 20 Mar 2017 16:43:21 +0000 (16:43 +0000)]
RISC-V: Don't prefer FP_REGS for integers
On RISC-V we can't store integers in floating-point registers as this is
forbidden by the ISA. We've always disallowed this, but we were
setting the preferred mode to FP_REGS for some integer modes. This
caused the LRA to blow up with some hard to read error messages.
This patch removes the prefered mode hook, as the right thing to do here
is nothing.
Thanks to Kito for finding the bug, and mpf for the fix. See also
<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79912>.
Palmer Dabbelt [Mon, 20 Mar 2017 16:43:17 +0000 (16:43 +0000)]
Use more conservative fences on RISC-V
The RISC-V memory model is still in the process of being formally
specified, so for now we're going to be safe and add the I/O bits to
userspace fences because there's no way to know if userspace is touching
memory-mapped I/O regions at compile time.
This will have no impact on existing microarchitecutres because they
treat all fences conservatively.
gcc/ChangeLog:
2017-03-17 Palmer Dabbelt <palmer@dabbelt.com>
* config/riscv/riscv.c (riscv_print_operand): Use "fence
iorw,ow".
* config/riscv/sync.mc (mem_thread_fence_1): Use "fence
iorw,iorw".
Richard Biener [Mon, 20 Mar 2017 13:06:58 +0000 (13:06 +0000)]
re PR tree-optimization/80113 (ICE in set_var_live_on_entry at tree-ssa-live.c:1018)
2017-03-20 Richard Biener <rguenther@suse.de>
PR tree-optimization/80113
* graphite-isl-ast-to-gimple.c (copy_loop_phi_nodes): Do not
allocate extra SSA name for PHI def.
(add_close_phis_to_outer_loops): Likewise.
(add_close_phis_to_merge_points): Likewise.
(copy_loop_close_phi_args): Likewise.
(copy_cond_phi_nodes): Likewise.
Martin Liska [Mon, 20 Mar 2017 10:06:00 +0000 (11:06 +0100)]
MPX: fix PR middle-end/79753
2017-03-20 Martin Liska <mliska@suse.cz>
PR middle-end/79753
* tree-chkp.c (chkp_build_returned_bound): Do not build
returned bounds for a LHS that's not a BOUNDED_P type.
2017-03-20 Martin Liska <mliska@suse.cz>
PR middle-end/79753
* gcc.target/i386/mpx/pr79753.c: New test.
Andreas Krebbel [Mon, 20 Mar 2017 09:33:11 +0000 (09:33 +0000)]
S/390: PR78857: Don't use load and test if result is live.
The FP load and test instruction should not be used for a comparison
if the target operand is being used afterwards. It unfortunately
turns SNaNs into QNaNs.
gcc/ChangeLog:
2017-03-20 Andreas Krebbel <krebbel@linux.vnet.ibm.com>
PR target/78857
* config/s390/s390.md ("cmp<mode>_ccs_0"): Add a clobber of the
target operand. A new splitter adds the clobber statement in case
the target operand is dead anyway.
gcc/testsuite/ChangeLog:
2017-03-20 Andreas Krebbel <krebbel@linux.vnet.ibm.com>
PR target/78857
* gcc.target/s390/load-and-test-fp-1.c: New test.
* gcc.target/s390/load-and-test-fp-2.c: New test.
Paul Thomas [Sat, 18 Mar 2017 12:38:02 +0000 (12:38 +0000)]
re PR fortran/79676 ([submodules] Compilation/linking error when module procedures PRIVATE)
2017-03-18 Paul Thomas <pault@gcc.gnu.org>
PR fortran/79676
* module.c (mio_symbol_attribute): Remove reset of the flag
'no_module_procedures'.
(check_for_module_procedures): New function. Move declaration
of 'no_module_procedures' to above it.
(gfc_dump_module): Traverse namespace calling new function.
2017-03-18 Paul Thomas <pault@gcc.gnu.org>
PR fortran/79676
* gfortran.dg/submodule_28.f08 : New test.
Jonathan Wakely [Fri, 17 Mar 2017 19:28:05 +0000 (19:28 +0000)]
Fix alignment bugs in std::codecvt_utf16
* src/c++11/codecvt.cc (range): Add non-type template parameter and
define oerloaded operators for reading and writing code units.
(range<Elem, false>): Define partial specialization for accessing
wide characters in potentially unaligned byte ranges.
(ucs2_span(const char16_t*, const char16_t*, ...))
(ucs4_span(const char16_t*, const char16_t*, ...)): Change parameters
to range<const char16_t, false> in order to avoid unaligned reads.
(__codecvt_utf16_base<char16_t>::do_out)
(__codecvt_utf16_base<char32_t>::do_out)
(__codecvt_utf16_base<wchar_t>::do_out): Use range specialization for
unaligned data to avoid unaligned writes.
(__codecvt_utf16_base<char16_t>::do_in)
(__codecvt_utf16_base<char32_t>::do_in)
(__codecvt_utf16_base<wchar_t>::do_in): Likewise for writes. Return
error if there are unprocessable trailing bytes.
(__codecvt_utf16_base<char16_t>::do_length)
(__codecvt_utf16_base<char32_t>::do_length)
(__codecvt_utf16_base<wchar_t>::do_length): Pass arguments of type
range<const char16_t, false> to span functions.
* testsuite/22_locale/codecvt/codecvt_utf16/misaligned.cc: New test.
Bernd Schmidt [Fri, 17 Mar 2017 15:10:13 +0000 (09:10 -0600)]
re PR rtl-optimization/79910 (wrong code with -O -fweb)
PR rtl-optimization/79910
* combine.c (record_used_regs): New static function.
(try_combine): Handle situations where there is an additional
instruction between I2 and I3 which needs to have a LOG_LINK
updated.
PR rtl-optimization/79910
* gcc.dg/torture/pr79910.c: New test.
Jeff Law [Fri, 17 Mar 2017 15:01:56 +0000 (09:01 -0600)]
re PR tree-optimization/71437 (Performance regression after r235817)
PR tree-optimization/71437
* tree-vrp.c (simplify_stmt_for_jump_threading): Lookup the
conditional in the hash table first.
(vrp_dom_walker::before_dom_children): Extract condition from
ASSERT_EXPR. Record condition, its inverion and any implied
conditions as well.
PR tree-optimization/71437
* gcc.dg/tree-ssa/pr71437.c: New test.
* gcc.dg/tree-ssa/20040305-1.c: Test earlier dump.
* gcc.dg/tree-ssa/ssa-dom-thread-4.c: Adjust for jump threads
now caught by VRP, but which were previously caught by DOM.
Richard Biener [Fri, 17 Mar 2017 12:48:56 +0000 (12:48 +0000)]
re PR c++/80075 (ICE: "statement marked for throw, but doesn’t" with -fnon-call-exceptions)
2017-03-17 Richard Biener <rguenther@suse.de>
PR middle-end/80075
* tree-eh.c (stmt_could_throw_1_p): Only handle gimple assigns.
Properly verify the LHS before the RHS possibly claims to be
handled.
(stmt_could_throw_p): Hande gimple conds fully here. Clobbers
do not throw.
Martin Jambor [Fri, 17 Mar 2017 12:34:27 +0000 (13:34 +0100)]
Document -fipa-vrp
2017-03-17 Martin Jambor <mjambor@suse.cz>
* doc/invoke.texi (Option Options): Include -fipa-vrp in the list.
(List of -O2 options): Likewise.
(-fipa-bit-cp): Replace "ipa" with "interprocedural."
(-fipa-vrp) New.
Alexandre Oliva [Thu, 16 Mar 2017 23:31:01 +0000 (23:31 +0000)]
stabilize store merging
Don't let pointer randomization change the order in which we process
store chains. This may cause SSA_NAMEs to be released in different
order, and if they're reused later, they may cause differences in SSA
partitioning, leading to differences in expand, and ultimately to
different code.
bootstrap-debug-lean (-fcompare-debug) on i686-linux-gnu has failed in
haifa-sched.c since r245196 exposed the latent ordering problem in
store merging. In this case, the IR differences (different SSA names
selected for copies in out-of-SSA, resulting in some off-by-one
differences in pseudos) were not significant enough to be visible in
the compiler output.
for gcc/ChangeLog
* gimple-ssa-store-merging.c (struct imm_store_chain_info):
Add linked-list forward and backlinks. Insert on
construction, remove on destruction.
(class pass_store_merging): Add m_stores_head field.
(pass_store_merging::terminate_and_process_all_chains):
Iterate over m_stores_head list.
(pass_store_merging::terminate_all_aliasing_chains):
Likewise.
(pass_store_merging::execute): Check for debug stmts first.
Push new chains onto the m_stores_head stack.
Michael Meissner [Thu, 16 Mar 2017 20:09:21 +0000 (20:09 +0000)]
re PR target/71294 (ICE in gen_add2_insn, at optabs.c:4442 on powerpc64le-linux)
[gcc]
2017-03-16 Michael Meissner <meissner@linux.vnet.ibm.com>
PR target/71294
* config/rs6000/vsx.md (vsx_splat_<mode>, VSX_D iterator): Allow a
SPLAT operation on ISA 2.07 64-bit systems that have direct move,
but no MTVSRDD support, by doing MTVSRD and XXPERMDI.
[gcc/testsuite]
2017-03-16 Michael Meissner <meissner@linux.vnet.ibm.com>
Jeff Law [Thu, 16 Mar 2017 19:21:33 +0000 (13:21 -0600)]
re PR tree-optimization/71437 (Performance regression after r235817)
PR tree-optimization/71437
* tree-ssa-dom.c (dom_opt_dom_walker): Remove thread_across_edge
member function. Implementation moved into after_dom_children
member function and into the threader's thread_outgoing_edges
function.
(dom_opt_dom_walker::after_dom_children): Simplify by moving
some code into new thread_outgoing_edges.
* tree-ssa-threadedge.c (thread_across_edge): Make static and simplify
definition. Simplify marker handling (do it here). Assume we always
have the available expression and the const/copies tables.
(thread_outgoing_edges): New function extracted from tree-ssa-dom.c
and tree-vrp.c
* tree-ssa-threadedge.h (thread_outgoing_edges): Declare.
* tree-vrp.c (equiv_stack): No longer file scoped.
(vrp_dom_walker): New class.
(vrp_dom_walker::before_dom_children): New member function.
(vrp_dom_walker::after_dom_children): Likewise.
(identify_jump_threads): Setup domwalker. Use it rather than
walking edges in a random order by hand. Simplify setup/finalization.
(finalize_jump_threads): Remove.
(vrp_finalize): Do not call identify_jump_threads here.
(execute_vrp): Do it here instead and call thread_through_all_blocks
here too.
Jeff Law [Thu, 16 Mar 2017 19:21:23 +0000 (13:21 -0600)]
re PR tree-optimization/71437 (Performance regression after r235817)
PR tree-optimization/71437
* tree-ssa-dom.c (pfn_simplify): Add basic_block argument. All
callers changed.
(simplify_stmt_for_jump_threading): Add basic_block argument. All
callers changed.
(lhs_of_dominating_assert): Moved from here into tree-vrp.c.
(dom_opt_dom_walker::thread_across_edge): Remove
handle_dominating_asserts argument. All callers changed.
(record_temporary_equivalences_from_stmts_at_dest): Corresponding
changes. Remove calls to lhs_of_dominating_assert. Other
uses of handle_dominating_asserts turn into unconditional code
(simplify_control_stmt_condition_1): Likewise.
(simplify_control_stmt_condition): Likewise.
(thread_through_normal_block, thread_across_edge): Likewise.
* tree-ssa-threadedge.h (thread_across_edge): Corresponding changes.
* tree-vrp.c (lhs_of_dominating_assert): Move here. Return original
object if it is not an SSA_NAME.
(simplify_stmt_for_jump_threading): Call lhs_of_dominating_assert
before calling into the VRP specific simplifiers.
(identify_jump_threads): Remove handle_dominating_asserts
argument.
Jakub Jelinek [Thu, 16 Mar 2017 16:50:27 +0000 (17:50 +0100)]
re PR fortran/80010 (diagnostics: typo $!)
PR fortran/80010
* parse.c (gfc_ascii_statement): Use !$ACC for ST_OACC_ATOMIC
and ST_OACC_END_ATOMIC, instead of !ACC.
* trans-decl.c (finish_oacc_declare): Use !$ACC instead of $!ACC.
* openmp.c (gfc_match_oacc_declare, gfc_match_oacc_wait,
gfc_resolve_oacc_declare): Likewise.
Jakub Jelinek [Thu, 16 Mar 2017 16:27:08 +0000 (17:27 +0100)]
re PR fortran/79886 (ICE in pp_format, at pretty-print.c:681)
PR fortran/79886
* tree-diagnostic.c (default_tree_printer): No longer static.
* tree-diagnostic.h (default_tree_printer): New prototype.
fortran/
* error.c (gfc_format_decoder): Rename plus argument to set_locus,
remove ATTRIBUTE_UNUSED from all arguments, call default_tree_printer
if not a Fortran specific spec.
* trans-io.c: Include options.h.
(gfc_build_st_parameter): Temporarily disable -Wpadded around layout
of artificial IO data structures.
testsuite/
* gfortran.dg/pr79886.f90: New test.
Jonathan Wakely [Thu, 16 Mar 2017 15:28:02 +0000 (15:28 +0000)]
PR libstdc++/80041 fix codecvt_utf16<wchar_t> to use UTF-16 not UTF-8
PR libstdc++/80041
* src/c++11/codecvt.cc (__codecvt_utf16_base<wchar_t>::do_out)
(__codecvt_utf16_base<wchar_t>::do_in): Convert char arguments to
char16_t to work with UTF-16 instead of UTF-8.
* testsuite/22_locale/codecvt/codecvt_utf16/80041.cc: New test.
PR libstdc++/79980
* include/bits/locale_conv.h (__do_str_codecvt): Set __count on
error path.
* src/c++11/codecvt.cc (operator&=, operator|=, operator~): Overloads
for manipulating codecvt_mode values.
(read_utf16_bom): Compare input to BOM constants instead of integral
constants that depend on endianness. Take mode parameter by
reference and adjust it, to distinguish between no BOM present and
UTF-16BE BOM present.
(ucs4_in, ucs2_span, ucs4_span): Adjust calls to read_utf16_bom.
(surrogates): New enumeration type.
(utf16_in, utf16_out): Add surrogates parameter to choose between
UTF-16 and UCS2 behaviour.
(utf16_span, ucs2_span): Use std::min not std::max.
(ucs2_out): Use std::min not std::max. Disallow surrogate pairs.
(ucs2_in): Likewise. Adjust calls to read_utf16_bom.
* testsuite/22_locale/codecvt/codecvt_utf16/79980.cc: New test.
* testsuite/22_locale/codecvt/codecvt_utf8/79980.cc: New test.
Jonathan Wakely [Thu, 16 Mar 2017 15:27:45 +0000 (15:27 +0000)]
PR libstdc++/79511 fix endianness of UTF-16 data
PR libstdc++/79511
* src/c++11/codecvt.cc (write_utf16_code_point): Don't write 0xffff
as a surrogate pair.
(__codecvt_utf8_utf16_base<char32_t>::do_in): Use native endianness
for internal representation.
(__codecvt_utf8_utf16_base<wchar_t>::do_in): Likewise.
* testsuite/22_locale/codecvt/codecvt_utf8_utf16/79511.cc: New test.
Kyrylo Tkachov [Thu, 16 Mar 2017 10:03:11 +0000 (10:03 +0000)]
[AArch64] Use 'x' constraint for vector HFmode multiplication by indexed element instructions
* config/aarch64/iterators.md (h_con): Return "x" for V4HF and V8HF.
* config/aarch64/aarch64-simd.md (*aarch64_fma4_elt_from_dup<mode>):
Use h_con constraint for operand 1.
(*aarch64_fnma4_elt_from_dup<mode>): Likewise.
(*aarch64_mulx_elt_from_dup<mode>): Likewise for operand 2.
Jeff Law [Thu, 16 Mar 2017 03:19:35 +0000 (21:19 -0600)]
re PR tree-optimization/71437 (Performance regression after r235817)
PR tree-optimization/71437
* tree-ssa-dom.c (struct cond_equivalence): Moved from here into
tree-ssa-scopedtables.
(lookup_avail_expr, build_and_record_new_cond): Likewise.
(record_conditions, record_cond, vuse_eq): Likewise.
(record_edge_info): Adjust to API tweak of record_conditions.
(simplify_stmt_for_jump_threading): Similarly for lookup_avail_expr.
(record_temporary_equivalences, optimize_stmt): Likewise.
(eliminate_redundant_computations): Likewise.
(record_equivalences_from_stmt): Likewise.
* tree-ssa-scopedtables.c: Include options.h and params.h.
(vuse_eq): New function, moved from tree-ssa-dom.c
(build_and_record_new_cond): Likewise.
(record_conditions): Likewise. Accept vector of conditions rather
than edge_equivalence structure for first argument.
for the first argument.
(avail_exprs_stack::lookup_avail_expr): New member function, moved
from tree-ssa-dom.c.
(avail_exprs_stack::record_cond): Likewise.
* tree-ssa-scopedtables.h (struct cond_equivalence): Moved here
from tree-ssa-dom.c.
(avail_exprs_stack): Add new member functions lookup_avail_expr
and record_cond.
(record_conditions): Declare.
Implement LWG 2857, {variant,optional,any}::emplace should return the constructed value.
Implement LWG 2857, {variant,optional,any}::emplace should
return the constructed value.
* include/std/any (emplace(_Args&&...)): Change the return type and
return a reference to the constructed value.
(emplace(initializer_list<_Up>, _Args&&...)): Likewise.
* include/std/optional (emplace(_Args&&...)): Likewise.
(emplace(initializer_list<_Up>, _Args&&...)): Likewise.
* include/std/variant (emplace<_Tp>(_Args&&...)): Likewise.
(emplace<_Tp>(initializer_list<_Up>, _Args&&...)): Likewise.
(emplace<_Np>(_Args&&...)): Likewise.
(emplace<_Np>(initializer_list<_Up>, _Args&&...)): Likewise.
* testsuite/20_util/any/assign/emplace.cc: Add tests for
checking the return value of emplace.
* testsuite/20_util/any/misc/any_cast_neg.cc: Adjust.
* testsuite/20_util/optional/assignment/6.cc: Add tests for
checking the return value of emplace.
* testsuite/20_util/variant/run.cc: Likewise.
It was XFAILed because there was a bug in glibc, but that bug was fixed
nine years ago. Nowadays everyone uses a version of glibc with the bug
fixed, so we should no longer XFAIL the test.
gcc/testsuite/
PR fortran/33271
* gfortran.dg/nint_2.f90: Do not xfail powerpc*-*-linux*.