Yuri Rumyantsev [Tue, 2 Feb 2016 09:46:26 +0000 (09:46 +0000)]
re PR middle-end/68542 (10% 481.wrf performance regression)
gcc/
2016-02-02 Yuri Rumyantsev <ysrumyan@gmail.com>
PR middle-end/68542
* config/i386/i386.c (ix86_expand_branch): Add support for conditional
branch with vector comparison.
* config/i386/sse.md (VI48_AVX): New mode iterator.
(define_expand "cbranch<mode>4"): Add support for conditional branch
with vector comparison.
* tree-vect-loop.c (optimize_mask_stores): New function.
* tree-vect-stmts.c (vectorizable_mask_load_store): Initialize
has_mask_store field of vect_info.
* tree-vectorizer.c (vectorize_loops): Invoke optimize_mask_stores for
vectorized loops having masked stores after vec_info destroy.
* tree-vectorizer.h (loop_vec_info): Add new has_mask_store field and
corresponding macros.
(optimize_mask_stores): Add prototype.
gcc/testsuite
2016-02-02 Yuri Rumyantsev <ysrumyan@gmail.com>
PR middle-end/68542
* gcc.dg/vect/vect-mask-store-move-1.c: New test.
* gcc.target/i386/avx2-vect-mask-store-move1.c: New test.
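For orientation, here is a minimal scalar sketch of the kind of loop this entry appears to target. It is an illustration, not a copy of the new tests. A conditional store in a loop can be vectorized as a masked store. The blocked variant below mimics, in plain C, the idea of guarding the store (and the computation feeding it) with a check that the mask is not all zero. The function names and the BLOCK width are hypothetical, and the exact transformation done by optimize_mask_stores is assumed from the entry text.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK 8  /* hypothetical vector width, in elements */

/* Scalar form: the store happens only where cond[i] is nonzero. */
void cond_store_scalar(int *dst, const int *src, const int *cond, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (cond[i])
            dst[i] = src[i] + 1;
}

/* Blocked form: skip a whole block when its mask is all zero. */
void cond_store_blocked(int *dst, const int *src, const int *cond, size_t n)
{
    assert(dst != NULL && src != NULL && cond != NULL);
    size_t i = 0;
    for (; i + BLOCK <= n; i += BLOCK) {
        int any = 0;
        for (size_t j = 0; j < BLOCK; j++)
            any |= (cond[i + j] != 0);
        if (!any)
            continue;  /* empty mask: neither the store nor its inputs are needed */
        for (size_t j = 0; j < BLOCK; j++)
            if (cond[i + j])
                dst[i + j] = src[i + j] + 1;
    }
    for (; i < n; i++)  /* scalar tail for the remaining elements */
        if (cond[i])
            dst[i] = src[i] + 1;
}
```

Both functions write the same elements; the blocked form only avoids work when a run of BLOCK conditions is false, which is where a branch on a vector comparison would pay off.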
Alan Modra [Tue, 2 Feb 2016 00:01:16 +0000 (10:31 +1030)]
[RS6000] ABI_V4 init of toc section
Since 4c4a180d lto has turned off flag_pic when linking a fixed
position executable. So flag_pic is zero in rs6000_file_start.
However, when we get to actually emitting code, flag_pic may be on
again. This results in undefined references to ".LCTOC1".
PR target/68662
* config/rs6000/rs6000.c (need_toc_init): New var, set it
whenever toc_label_name used.
(rs6000_file_start): Don't set up toc section here,
(rs6000_output_function_epilogue): do so here instead,
(rs6000_xcoff_file_start): and here.
* config/rs6000/rs6000.md (load_toc_aix_si): Set need_toc_init.
(load_toc_aix_di): Likewise.
Jakub Jelinek [Mon, 1 Feb 2016 22:39:31 +0000 (23:39 +0100)]
re PR rtl-optimization/69592 (Compile-time and memory-use hog in combine)
PR rtl-optimization/69592
* rtlanal.c (nonzero_bits_binary_arith_p): New inline function.
(cached_nonzero_bits): Use it instead of ARITHMETIC_P.
(num_sign_bit_copies_binary_arith_p): New inline function.
(cached_num_sign_bit_copies): Use it instead of ARITHMETIC_P.
Jeff Law [Mon, 1 Feb 2016 22:03:57 +0000 (15:03 -0700)]
re PR testsuite/68580 (FAIL: c-c++-common/tsan/pr65400-1.c -O0 execution test)
PR tree-optimization/68580
* params.def (FSM_MAXIMUM_PHI_ARGUMENTS): New param.
* tree-ssa-threadbackward.c
(fsm_find_control_statement_thread_paths): Do not try to walk
through large PHI nodes.
Bin Cheng [Mon, 1 Feb 2016 17:17:47 +0000 (17:17 +0000)]
re PR tree-optimization/67921 ("internal compiler error: in build_polynomial_chrec, at tree-chrec.h:147" when using -fsanitize=undefined)
PR tree-optimization/67921
* fold-const.c (split_tree): New parameters. Convert pointer
type variable part to proper type before negating.
(fold_binary_loc): Pass new arguments to split_tree.
gcc/testsuite/ChangeLog
PR tree-optimization/67921
* c-c++-common/ubsan/pr67921.c: New test.
Richard Biener [Mon, 1 Feb 2016 12:39:04 +0000 (12:39 +0000)]
re PR tree-optimization/69579 (gcc ICE at -O3 and __sigsetjmp with “tree check: expected ssa_name, have integer_cst in compute_optimized_partition_bases”)
2016-02-01 Richard Biener <rguenther@suse.de>
PR tree-optimization/69579
* tree-ssa-loop-ivcanon.c (propagate_constants_for_unrolling):
Do not propagate through abnormal PHI results.
Jakub Jelinek [Mon, 1 Feb 2016 08:47:27 +0000 (09:47 +0100)]
re PR rtl-optimization/69570 (if-conversion bug on i?86)
PR rtl-optimization/69570
* ifcvt.c (bb_ok_for_noce_convert_multiple_sets): Return true only
if there is more than one set, not if there is a single set.
Paul Thomas [Sun, 31 Jan 2016 10:22:05 +0000 (10:22 +0000)]
re PR fortran/67564 (Segfault on sourced allocation statement with class(*) arrays)
2016-01-31 Paul Thomas <pault@gcc.gnu.org>
PR fortran/67564
* trans-expr.c (gfc_conv_procedure_call): For the vtable copy
subroutines, add a string length argument, when the actual
argument is an unlimited polymorphic class object.
2016-01-31 Paul Thomas <pault@gcc.gnu.org>
PR fortran/67564
* gfortran.dg/allocate_with_source_17.f03: New test.
Jakub Jelinek [Sat, 30 Jan 2016 18:04:13 +0000 (19:04 +0100)]
re PR middle-end/69546 (wrong code with -O and simple int128 arithmetics)
PR tree-optimization/69546
* wide-int.cc (wi::divmod_internal): For unsigned division
where both operands fit into uhwi, if o1 is 1 and o0 has
msb set, if dividend_prec is larger than bits per hwi,
clear another quotient word and return 2 instead of 1.
Similarly for remainder with msb in HWI set, if dividend_prec
is larger than bits per hwi.
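The need for the extra quotient word can be sketched as follows. This assumes, as the entry implies, that a multi-word value is kept as signed host words with the last stored word implicitly sign-extended up to the full precision. The helper names are hypothetical and this is not the wide-int.cc code.

```c
#include <assert.h>
#include <stdint.h>

#define BITS_PER_WORD 64

/* Reinterpret an unsigned word as signed without relying on
   implementation-defined narrowing conversions. */
static int64_t to_signed_word(uint64_t q)
{
    if (q >> (BITS_PER_WORD - 1))
        return (int64_t)(q & (uint64_t)INT64_MAX) + INT64_MIN;
    return (int64_t)q;
}

/* Hypothetical helper: store the unsigned 64-bit value Q as a word array
   for a PREC-bit number and return how many words were written. */
int store_unsigned(uint64_t q, unsigned prec, int64_t out[2])
{
    assert(prec >= 1 && prec <= 2 * BITS_PER_WORD);
    out[0] = to_signed_word(q);
    if (prec > BITS_PER_WORD && (q >> (BITS_PER_WORD - 1)) != 0) {
        out[1] = 0;  /* explicit zero word keeps the value non-negative */
        return 2;
    }
    return 1;
}

/* Upper 64 bits implied by a LEN-word array under sign extension. */
int64_t high_word(const int64_t *val, int len)
{
    if (len > 1)
        return val[1];
    return val[0] < 0 ? -1 : 0;
}
```

With a 128-bit precision, a quotient such as 2^63 stored in a single word would read back with an all-ones upper half, i.e. as a negative number; writing the second, zero word avoids that.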
Bill Schmidt [Sat, 30 Jan 2016 01:18:43 +0000 (01:18 +0000)]
re PR target/65546 (FAIL: gcc.dg/vect/costmodel/ppc/costmodel-vect-31a.c)
2016-01-29 Bill Schmidt <wschmidt@linux.vnet.ibm.com>
PR target/65546
* gcc.dg/vect/costmodel/ppc/costmodel-vect-31a.c: Correct
condition being checked, and disable it when the target supports
misaligned loads and stores.
Martin Jambor [Fri, 29 Jan 2016 22:53:28 +0000 (23:53 +0100)]
[hsa] Atomic access memory model fixes
2016-01-29 Martin Jambor <mjambor@suse.cz>
* hsa-gen.c (get_memory_order_name): Mask with MEMMODEL_BASE_MASK.
Use short lowercase names.
(get_memory_order): Mask with MEMMODEL_BASE_MASK. Support
MEMMODEL_CONSUME with acquire semantics and MEMMODEL_SEQ_CST with
acq_rel one. Protect warning against segfaults if
get_memory_order_name returns NULL.
(gen_hsa_ternary_atomic_for_builtin): Support MEMMODEL_SEQ_CST
with release semantics. Do not warn if get_memory_order already did.
(gen_hsa_insns_for_call): Support MEMMODEL_SEQ_CST with acquire
semantics. Fix check for relaxed or acquire semantics. Do not warn
if get_memory_order already did.
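A minimal sketch of the two defensive steps named in this entry: masking the memory model down to its base value before switching on it, and guarding the caller against a NULL name. The enum, the BASE_MASK value, the returned strings, and the function names are all illustrative assumptions, not the definitions used in hsa-gen.c.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed ordering, mirroring the usual C11-style memory orders. */
enum mem_order {
    MO_RELAXED = 0, MO_CONSUME, MO_ACQUIRE,
    MO_RELEASE, MO_ACQ_REL, MO_SEQ_CST
};

/* Hypothetical stand-in for MEMMODEL_BASE_MASK: low bits carry the base
   model, higher bits may carry extra flags. */
#define BASE_MASK 0x7fffu

/* Return a short lowercase name, or NULL for an unknown model. */
const char *memory_order_name(unsigned memmodel)
{
    switch (memmodel & BASE_MASK) {
    case MO_RELAXED: return "relaxed";
    case MO_CONSUME: return "consume";
    case MO_ACQUIRE: return "acquire";
    case MO_RELEASE: return "release";
    case MO_ACQ_REL: return "acq_rel";
    case MO_SEQ_CST: return "seq_cst";
    default:         return NULL;
    }
}

/* Caller-side guard: never hand a NULL name to a diagnostic. */
const char *memory_order_label(unsigned memmodel)
{
    const char *name = memory_order_name(memmodel);
    return name != NULL ? name : "unknown";
}
```

Without the mask, a model carrying an extra flag bit would fall into the default case and yield NULL, which is the situation the warning code has to survive.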
Vladimir Makarov [Fri, 29 Jan 2016 18:47:17 +0000 (18:47 +0000)]
re PR target/69299 (-mavx performance degradation with r232088)
2016-01-29 Vladimir Makarov <vmakarov@redhat.com>
PR target/69299
* config/i386/constraints.md (Bm): Describe as special memory
constraint.
* doc/md.texi (DEFINE_SPECIAL_MEMORY_CONSTRAINT): Describe it.
* genoutput.c (main): Process DEFINE_SPECIAL_MEMORY_CONSTRAINT.
* genpreds.c (struct constraint_data): Add is_special_memory.
(have_special_memory_constraints, special_memory_start): New
static vars.
(special_memory_end): Ditto.
(add_constraint): Add new arg is_special_memory. Add code to
process its true value. Update have_special_memory_constraints.
(process_define_constraint): Pass the new arg.
(process_define_register_constraint): Ditto.
(choose_enum_order): Process special memory.
(write_tm_preds_h): Generate enum const CT_SPECIAL_MEMORY and
function insn_extra_special_memory_constraint.
(main): Process DEFINE_SPECIAL_MEMORY_CONSTRAINT.
* gensupport.c (process_rtx): Process
DEFINE_SPECIAL_MEMORY_CONSTRAINT.
* ira-costs.c (record_reg_classes): Process CT_SPECIAL_MEMORY.
* ira-lives.c (single_reg_class): Use
insn_extra_special_memory_constraint.
* ira.c (ira_setup_alts): Process CT_SPECIAL_MEMORY.
* lra-constraints.c (process_alt_operands): Ditto.
(curr_insn_transform): Use insn_extra_special_memory_constraint.
* recog.c (asm_operand_ok, preprocess_constraints): Process
CT_SPECIAL_MEMORY.
* reload.c (find_reloads): Ditto.
* rtl.def (DEFINE_SPECIAL_MEMORY_CONSTRAINT): New.
* stmt.c (parse_input_constraint): Use
insn_extra_special_memory_constraint.
Jakub Jelinek [Fri, 29 Jan 2016 14:14:56 +0000 (15:14 +0100)]
re PR target/69551 (Wrong code with single element vector insert)
PR target/69551
* config/i386/i386.c (ix86_expand_vector_set) <case V4SImode>: For
SSE1, copy target into the temporary reg first before recursing
on it.
Andrew Bennett [Fri, 29 Jan 2016 13:54:53 +0000 (13:54 +0000)]
p5600-bonding.c (dg-options): Force the test to be always built for p5600.
testsuite/
2016-01-29 Andrew Bennett <andrew.bennett@imgtec.com>
* gcc.target/mips/p5600-bonding.c (dg-options): Force the test to be
always built for p5600.
* gcc.target/mips/mips.exp (mips-dg-options): Add support for the
isa=p5600 dg-option.
Richard Biener [Fri, 29 Jan 2016 11:21:19 +0000 (11:21 +0000)]
re PR middle-end/69547 (no-op array initializer emits an empty loop)
2016-01-29 Richard Biener <rguenther@suse.de>
PR tree-optimization/69547
* tree-ssa-dce.c (mark_aliased_reaching_defs_necessary_1):
Do not mark clobbers necessary.
(mark_all_reaching_defs_necessary_1): Likewise.
Dominik Vogt [Fri, 29 Jan 2016 10:09:13 +0000 (10:09 +0000)]
S/390: Require hardware vector support for test to succeed.
The test case works on S/390 too, but only with -march=z13 or later
(i.e. if GCC can make use of hardware vector support). Otherwise the
optimization gets too complex. The attached patch forces GCC to use
-march=z13 instead of xfail'ing the test on S/390.
gcc/testsuite/ChangeLog
* gcc.dg/tree-ssa/ssa-dom-cse-2.c: Require hardware vector support for
test to succeed.
Marek Polacek [Fri, 29 Jan 2016 09:25:14 +0000 (09:25 +0000)]
re PR c++/69509 (infinite loop compiling a VLA in a recursive constexpr function)
PR c++/69509
PR c++/69516
* constexpr.c (cxx_eval_array_reference): Give the "array subscript
out of bound" error earlier.
* init.c (build_vec_init): Change NE_EXPR into GT_EXPR. Update the
commentary.
* g++.dg/ext/constexpr-vla2.C: New test.
* g++.dg/ext/constexpr-vla3.C: New test.
* g++.dg/ubsan/vla-1.C: Remove dg-shouldfail.
Patrick Palka [Fri, 29 Jan 2016 01:51:03 +0000 (01:51 +0000)]
Fix cp_binding_level reuse logic
gcc/cp/ChangeLog:
* name-lookup.c (begin_scope): After reusing a cp_binding_level
structure, update free_binding_level before the structure's
level_chain field gets cleared, not after.
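The ordering issue described here can be illustrated with a generic free-list sketch. The struct and function names are simplified, hypothetical stand-ins, not the name-lookup.c code. The free-list head has to be advanced through the chain field before the reused structure is zeroed, because zeroing wipes that field.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct level {
    struct level *level_chain;  /* next entry while on the free list */
    int payload;
};

static struct level *free_list;

/* Reuse a structure from the free list if possible, else allocate one. */
struct level *level_alloc(void)
{
    struct level *l;
    if (free_list != NULL) {
        l = free_list;
        free_list = l->level_chain;  /* advance the head first ... */
        memset(l, 0, sizeof *l);     /* ... then clear, which wipes level_chain */
    } else {
        l = calloc(1, sizeof *l);
    }
    return l;
}

/* Put a structure back on the free list. */
void level_release(struct level *l)
{
    assert(l != NULL);
    l->level_chain = free_list;
    free_list = l;
}
```

If the two statements in the reuse branch were swapped, the head would always be set to NULL after the first reuse and every other entry on the list would be lost.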
Uros Bizjak [Thu, 28 Jan 2016 22:32:47 +0000 (23:32 +0100)]
re PR target/69459 (wrong code with -O2 and vector arithmetics @ x86_64)
PR target/69459
* config/i386/constraints.md (C): Only accept constant zero operand.
(BC): New constraint.
* config/i386/sse.md (*mov<mode>_internal): Use BC constraint
instead of C constraint.
* doc/md.texi (Machine Constraints): Update description
of C constraint.
testsuite/ChangeLog:
PR target/69459
* gcc.target/i386/pr69459.c: New test.
Sebastian Pop [Thu, 28 Jan 2016 16:39:10 +0000 (16:39 +0000)]
remove out of sync comments
* graphite-isl-ast-to-gimple.c (class translate_isl_ast_to_gimple):
Remove comments from class declarations: they already appear in the
code next to the definitions.
Wilco Dijkstra [Thu, 28 Jan 2016 15:41:46 +0000 (15:41 +0000)]
A recent change added UNSPEC to the CCMP patterns to stop combine optimizing the immediate in a rare case.
A recent change added UNSPEC to the CCMP patterns to stop combine optimizing
the immediate in a rare case. This requires a fix to the CCMP cost
calculation as the CCMP instruction with unspec is no longer recognized.
Fix the ccmp_1.c test by adding -ffinite-math-only so FCCMPE is emitted on
relational compares.
re PR libstdc++/69450 (libstdc++-v3/include/math.h:66:12: error: 'constexpr bool std::isnan(double)' conflicts with a previous declaration)
PR libstdc++/69450
* acinclude.m4 (GLIBCXX_CHECK_MATH11_PROTO): Split check for obsolete
isinf and isnan functions into two independent checks. Check on hpux.
* config.h.in: Regenerate.
* configure: Regenerate.
* include/c_global/cmath (isinf(double), isnan(double)): Use
_GLIBCXX_HAVE_OBSOLETE_ISINF and _GLIBCXX_HAVE_OBSOLETE_ISNAN,
respectively.
Wilco Dijkstra [Thu, 28 Jan 2016 11:52:08 +0000 (11:52 +0000)]
Add support for vector permute cost since various permutes can expand into a complex sequence of instructions.
Add support for vector permute cost since various permutes can expand
into a complex sequence of instructions. This fixes major performance
regressions due to recent changes in SLP vectorizer (which now vectorizes
more aggressively and emits many complex permutes). Set the cost to > 1
for all microarchitectures so that the number of permutes is usually zero
and regressions disappear.
2016-01-28 Wilco Dijkstra <wdijkstr@arm.com>
* config/aarch64/aarch64.c (generic_vector_cost):
Set vec_permute_cost.
(cortexa57_vector_cost): Likewise.
(exynosm1_vector_cost): Likewise.
(xgene1_vector_cost): Likewise.
(aarch64_builtin_vectorization_cost): Use vec_permute_cost.
* config/aarch64/aarch64-protos.h (cpu_vector_cost):
Add vec_permute_cost entry.
Wilco Dijkstra [Thu, 28 Jan 2016 11:45:06 +0000 (11:45 +0000)]
Several instructions disassemble a zero immediate as wzr/xzr due to using a register operand in the disassembly.
Several instructions disassemble a zero immediate as wzr/xzr due to
using a register operand in the disassembly. Avoid this by removing
the register operand.
re PR fortran/62536 (ICE (segfault) for invalid END BLOCK statement)
gcc/fortran/ChangeLog:
2016-01-28 Andre Vehreschild <vehre@gcc.gnu.org>
PR fortran/62536
* decl.c (gfc_match_end): Only unnest and remove BLOCK namespaces
when the END encountered does not match a BLOCK's end.
gcc/testsuite/ChangeLog:
2016-01-28 Andre Vehreschild <vehre@gcc.gnu.org>
PR fortran/62536
* gfortran.dg/block_15.f08: New test.
* gfortran.dg/block_end_error_1.f90: Need to catch additional error
on incorrectly closed BLOCK.