wilco [Wed, 10 Feb 2016 12:52:23 +0000 (12:52 +0000)]
Enable instruction fusion of dependent AESE; AESMC and AESD; AESIMC pairs.
This can give up to 2x speedup on many AArch64 implementations. Also model
the crypto instructions on Cortex-A57 according to the Optimization Guide.
gcc/
* config/aarch64/aarch64.c (cortexa53_tunings): Enable AES fusion.
(cortexa57_tunings): Likewise.
(cortexa72_tunings): Likewise.
(arch_macro_fusion_pair_p): Add support for AES fusion.
* config/aarch64/aarch64-fusion-pairs.def: Add AES_AESMC entry.
* config/arm/aarch-common.c (aarch_crypto_can_dual_issue):
Allow virtual registers before reload so early scheduling works.
* config/arm/cortex-a57.md (cortex_a57_crypto_simple): Use
correct latency and pipeline.
(cortex_a57_crypto_complex): Likewise.
(cortex_a57_crypto_xor): Likewise.
(define_bypass): Add AES bypass.
rguenth [Wed, 10 Feb 2016 12:46:33 +0000 (12:46 +0000)]
2016-02-10 Richard Biener <rguenther@suse.de>
PR tree-optimization/69726
* passes.def: Add DCE pass before late uninit.
* match.pd: Add A ? B : (!A ? C : X) -> A ? B : C patterns to
really fix up if-conversion's job.
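As a rough illustration of why the pattern in this entry preserves the selected value, the sketch below writes both shapes as plain C functions. The names nested_select, simplified_select and check_select_equivalence are hypothetical and are not taken from match.pd; the sketch says nothing about how the pattern is spelled there.

```c
#include <assert.h>
#include <stdbool.h>

/* Original shape: A ? B : (!A ? C : X).  The else-arm is reached only
   when A is false, so the inner test !A is always true there and X is
   never selected.  */
int nested_select(bool a, int b, int c, int x)
{
    return a ? b : (!a ? c : x);
}

/* Simplified shape: A ? B : C.  */
int simplified_select(bool a, int b, int c)
{
    return a ? b : c;
}

/* Usage: both shapes agree for either value of A.  */
void check_select_equivalence(void)
{
    assert(nested_select(true, 1, 2, 3) == simplified_select(true, 1, 2));
    assert(nested_select(false, 1, 2, 3) == simplified_select(false, 1, 2));
}
```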
meissner [Tue, 9 Feb 2016 22:31:31 +0000 (22:31 +0000)]
[gcc]
2016-02-09 Michael Meissner <meissner@linux.vnet.ibm.com>
PR target/68404
* config/rs6000/predicates.md (fusion_gpr_addis): Prevent fusing
an ADDIS that adds a pointer to a large constant that sets the
upper16 bits with a load operation.
[gcc/testsuite]
2016-02-09 Michael Meissner <meissner@linux.vnet.ibm.com>
PR target/68404
* gcc.target/powerpc/fusion.c: Rewrite test to use TOC fusion
instead of accessing a really large array.
* gcc.target/powerpc/fusion3.c: Likewise.
vries [Tue, 9 Feb 2016 08:52:26 +0000 (08:52 +0000)]
Fix GOMP/GOACC_parallel optimization in ipa-pta
2016-02-09 Tom de Vries <tom@codesourcery.com>
PR tree-optimization/69599
* tree-ssa-structalias.c (fndecl_maybe_in_other_partition): New
function.
(find_func_aliases_for_builtin_call, find_func_clobbers)
(ipa_pta_execute): Handle the case where foo and foo._0 are not in the
same LTO partition.
* testsuite/libgomp.c/omp-nested-3.c: New test.
* testsuite/libgomp.c/pr46032-2.c: New test.
* testsuite/libgomp.oacc-c-c++-common/kernels-2.c: New test.
* testsuite/libgomp.oacc-c-c++-common/parallel-2.c: New test.
ppalka [Mon, 8 Feb 2016 23:06:21 +0000 (23:06 +0000)]
Fix PR c++/69139 (deduction failure with trailing return type)
gcc/cp/ChangeLog:
PR c++/69139
* parser.c (cp_parser_simple_type_specifier): Make the check
for disambiguating between an 'auto' placeholder and an implicit
template parameter more robust.
gcc/testsuite/ChangeLog:
PR c++/69139
* g++.dg/cpp0x/trailing12.C: New test.
* g++.dg/cpp0x/trailing13.C: New test.
jakub [Mon, 8 Feb 2016 20:07:56 +0000 (20:07 +0000)]
PR tree-optimization/69209
* ipa-split.c (split_function): If split part is not
returning retval, retval has gimple type but is not
gimple value, force it into a SSA_NAME first.
libcpp/ChangeLog:
PR preprocessor/69664
* errors.c (cpp_diagnostic_with_line): Only call
rich_location::override_column if the column is non-zero.
* line-map.c (rich_location::override_column): Update columns
within m_ranges[0]. Add assertions to verify that doing so is
sane.
bernds [Mon, 8 Feb 2016 15:31:08 +0000 (15:31 +0000)]
Fix latent LRA remat issue (PR68730)
PR rtl-optimization/68730
* lra-remat.c (insn_to_cand_activation): New static variable.
(lra_remat): Allocate and free it.
(create_cand): New arg activation. Initialize a field in
insn_to_cand_activation if it is nonnull.
(create_cands): Pass the activation insn to create_cand when making
a candidate involving an output reload. Reorganize code a little.
(do_remat): Keep track of active status of candidates in a separate
bitmap.
redi [Mon, 8 Feb 2016 15:22:32 +0000 (15:22 +0000)]
Enable isinf/isnan checks for all targets
PR libstdc++/48891
* acinclude.m4 (GLIBCXX_CHECK_MATH11_PROTO): Enable isinf and isnan
checks for all targets except *-*-solaris2.* and ensure we find the
libc math.h header not our own.
* configure: Regenerate.
rguenth [Mon, 8 Feb 2016 14:51:20 +0000 (14:51 +0000)]
2016-02-08 Richard Biener <rguenther@suse.de>
PR tree-optimization/69719
* tree-vect-data-refs.c (vect_prune_runtime_alias_test_list):
Properly use the absolute value of the difference of the two offsets
to compare or adjust the segment length.
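The point about using the absolute difference can be sketched in isolation. The code below is not the vectorizer's implementation: offset_distance, within_segment and check_offset_distance are hypothetical names, and offsets are modeled as plain 64-bit integers. It only shows that the distance between two offsets must not depend on which one is larger.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Absolute difference of two signed offsets.  The subtraction is done
   in unsigned arithmetic so the result is representable even when the
   offsets have opposite signs.  */
uint64_t offset_distance(int64_t off_a, int64_t off_b)
{
    return off_a >= off_b ? (uint64_t)off_a - (uint64_t)off_b
                          : (uint64_t)off_b - (uint64_t)off_a;
}

/* Assumed check: the two references are considered close when their
   distance does not exceed the segment length.  */
bool within_segment(int64_t off_a, int64_t off_b, uint64_t seg_len)
{
    return offset_distance(off_a, off_b) <= seg_len;
}

/* Usage: the answer is the same whichever offset is passed first.  */
void check_offset_distance(void)
{
    assert(offset_distance(16, 4) == 12);
    assert(offset_distance(4, 16) == 12);
    assert(within_segment(4, 16, 12));
    assert(!within_segment(4, 16, 11));
}
```

A plain signed difference such as off_a - off_b would be negative when off_b is larger, and comparing that against a segment length would give a different answer depending on argument order.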
law [Mon, 8 Feb 2016 08:17:32 +0000 (08:17 +0000)]
PR tree-optimization/65917
* tree-ssa-dom.c (record_temporary_equivalences): Record both
equivalences from if (x == y) style conditionals.
(loop_depth_of_name): Remove.
(record_equality): Remove loop depth check.
* tree-ssa-scopedtables.h (const_and_copies): Refine comments.
(const_and_copies::record_const_or_copy_raw): New member function.
* tree-ssa-scopedtables.c
(const_and_copies::record_const_or_copy_raw): New, factored out of
(const_and_copies::record_const_or_copy): Call new member function.
PR tree-optimization/65917
* gcc.dg/tree-ssa/20030922-2.c: No longer xfailed.
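A minimal source-level sketch of the kind of conditional this entry refers to follows. The function names are hypothetical and the example is not taken from the testsuite. Inside the guarded arm x and y hold the same value, so an optimizer that records the equivalence in both directions may substitute either name for the other.

```c
#include <assert.h>

/* Inside the "if (x == y)" arm the two names are interchangeable:
   replacing x by y gives y - y, replacing y by x gives x - x, and
   both are 0.  */
int diff_when_equal(int x, int y)
{
    if (x == y)
        return x - y;
    return 1;
}

/* Usage: the guarded arm yields 0, the fall-through yields 1.  */
void check_diff_when_equal(void)
{
    assert(diff_when_equal(4, 4) == 0);
    assert(diff_when_equal(4, 5) == 1);
}
```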
jvdelisle [Sun, 7 Feb 2016 20:15:55 +0000 (20:15 +0000)]
2016-02-07 Jerry DeLisle <jvdelisle@gcc.gnu.org>
PR fortran/50555
* primary.c (match_actual_arg): If symbol has attribute flavor of
namelist, generate an error. (gfc_match_rvalue): Likewise return
MATCH_ERROR.
* resolve.c (resolve_symbol): Scan argument list of procedures and
generate an error if a namelist is found.
PR fortran/50555
* gfortran.dg/namelist_args.f90: New test.
law [Fri, 5 Feb 2016 23:49:08 +0000 (23:49 +0000)]
PR tree-optimization/68541
* gimple-ssa-split-paths.c: Include tree-cfg.h and params.h.
(count_stmts_in_block): New function.
(poor_ifcvt_candidate_code): Likewise.
(is_feasible_trace): Add some heuristics to determine when path
splitting is profitable.
(find_block_to_duplicate_for_splitting_paths): Make sure the graph
is a diamond with a single exit.
PR tree-optimization/68541
* gcc.dg/tree-ssa/split-path-2.c: New test.
* gcc.dg/tree-ssa/split-path-3.c: New test.
* gcc.dg/tree-ssa/split-path-4.c: New test.
* gcc.dg/tree-ssa/split-path-5.c: New test.
* gcc.dg/tree-ssa/split-path-6.c: New test.
* gcc.dg/tree-ssa/split-path-7.c: New test.
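For orientation, the loop below has the diamond shape with a single exit that this entry mentions: a then-arm and an else-arm that meet in one join block. The function names are hypothetical and the example is not one of the new tests. Whether the pass actually duplicates the join block here depends on the profitability heuristics this entry adds.

```c
#include <assert.h>

/* Each iteration forms a diamond: test, then-arm, else-arm, join.  */
int sum_adjusted(const int *v, int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++) {
        int t;
        if (v[i] & 1)
            t = v[i] * 3;   /* then-arm: odd values */
        else
            t = v[i] + 7;   /* else-arm: even values */
        sum += t;           /* join block, the candidate for duplication */
    }
    return sum;
}

/* Usage: 1*3 + (2+7) + 3*3 = 21.  */
void check_sum_adjusted(void)
{
    const int v[] = { 1, 2, 3 };
    assert(sum_adjusted(v, 3) == 21);
}
```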
msebor [Fri, 5 Feb 2016 22:27:37 +0000 (22:27 +0000)]
PR c++/69662 - -Wplacement-new on allocated one element array members
gcc/testsuite/ChangeLog:
PR c++/69662
* g++.dg/warn/Wplacement-new-size-1.C: New test.
* g++.dg/warn/Wplacement-new-size-2.C: New test.
gcc/cp/ChangeLog:
PR c++/69662
* init.c (find_field_init): New function.
(warn_placement_new_too_small): Call it. Handle one-element arrays
at ends of structures specially.
gcc/c-family/ChangeLog:
PR c++/69662
* c.opt (Warning options): Update -Wplacement-new to take
an optional argument.
gcc/ChangeLog:
PR c++/69662
* doc/invoke.texi: Update -Wplacement-new to take an optional
argument.
rth [Fri, 5 Feb 2016 22:05:17 +0000 (22:05 +0000)]
PR c/69643
* tree.c (tree_nop_conversion_p): Do not strip casts into or
out of non-standard address spaces.
testsuite/
* gcc.target/i386/addr-space-4.c: New.
* gcc.target/i386/addr-space-5.c: New.
PR fortran/66089
gcc/fortran/
* trans-expr.c (expr_is_variable, gfc_expr_is_variable): Rename
the former to the latter and make it non-static. Update callers.
* gfortran.h (gfc_expr_is_variable): New declaration.
(struct gfc_ss_info): Add field needs_temporary.
* trans-array.c (gfc_scalar_elemental_arg_saved_as_argument):
Tighten the condition on aggregate expressions with a check
that the expression is a variable and doesn't need a temporary.
(gfc_conv_resolve_dependency): Add intermediary reference variable.
Set the needs_temporary field.
gcc/testsuite/
* gfortran.dg/elemental_dependency_6.f90: New.
ppalka [Fri, 5 Feb 2016 14:36:44 +0000 (14:36 +0000)]
Fix PR c++/68948 (wrong code generation due to invalid constructor call)
gcc/cp/ChangeLog:
PR c++/68948
* pt.c (tsubst_baselink): Diagnose an invalid constructor call
if lookup_fnfields returns NULL_TREE and the name being looked
up has the form A::A.
gcc/testsuite/ChangeLog:
PR c++/68948
* g++.dg/template/pr68948.C: New test.
amylaar [Fri, 5 Feb 2016 14:27:26 +0000 (14:27 +0000)]
2016-01-05 Jeremy Bennett <jeremy.bennett@embecosm.com>
* doc/invoke.texi (Optimize Options): In table of --param options
rename second occurrence of tracer-min-branch-ratio to
tracer-min-branch-probability, rename
tracer-min-branch-ratio-feedback to
tracer-min-branch-probability-feedback and clarify description,
rename sched-spec-state-edge-prob-cutoff to
sched-state-edge-prob-cutoff, rename selsched-max-insns-to-rename
to selsched-insns-to-rename, rename lto-minpartition to
lto-min-partition, delete reorder-blocks-duplicate and
reorder-blocks-duplicate-feedback.
krebbel [Fri, 5 Feb 2016 10:25:08 +0000 (10:25 +0000)]
libstdc++: S/390: Add missing baseline_symbols.txt for s390x/-m31.
The attached patch copies the existing
libstdc++-v3/config/abi/post/s390-linux-gnu/baseline_symbols.txt
to .../s390x-linux-gnu/32/baseline_symbols.txt. This fixes the
abi test failure on s390x with -m31.
libstdc++-v3/ChangeLog
* config/abi/post/s390x-linux-gnu/32/baseline_symbols.txt (FUNC):
New file. Copied over from s390-linux-gnu.
krebbel [Fri, 5 Feb 2016 10:08:17 +0000 (10:08 +0000)]
S/390: Fix r6 vararg handling.
This patch fixes a problem introduced with the GPR into FPR slot save
feature for leaf functions.
r6 is an argument register as well as call-saved. Currently we might
decide that it will be a candidate for being saved into an FPR. If it
turns out later that r6 also needs to be saved due to being required
for vararg we undo the FPR save decision and put it on the stack
again. Unfortunately the code did not adjust the GPR restore range
accordingly so that the register does not get restored in the load
multiple.
This fixes the following testcases on s390x:
< FAIL: libgomp.c/doacross-1.c execution test
< FAIL: libgomp.c/doacross-2.c execution test
< FAIL: libgomp.c/doacross-3.c execution test
< FAIL: libgomp.c++/doacross-1.C execution test
gcc/ChangeLog:
2016-02-05 Andreas Krebbel <krebbel@linux.vnet.ibm.com>
PR target/69625
* config/s390/s390.c (SAVE_SLOT_NONE, SAVE_SLOT_STACK): New
defines.
(s390_register_info_gprtofpr): Use new macros above.
(s390_register_info_stdarg_fpr): Adjust max_fpr to better match
its name.
(s390_register_info_stdarg_gpr): Adjust max_gpr to better match
its name. Adjust restore and save gpr ranges.
(s390_register_info_set_ranges): New function.
(s390_register_info): Use new macros above. Call
s390_register_info_set_ranges.
(s390_optimize_register_info): Likewise.
(s390_hard_regno_rename_ok): Use new macros.
(s390_hard_regno_scratch_ok): Likewise.
(s390_emit_epilogue): Likewise.
(s390_can_use_return_insn): Likewise.
(s390_optimize_prologue): Likewise.
* config/s390/s390.md (GPR2_REGNUM, GPR6_REGNUM): New constants.
segher [Thu, 4 Feb 2016 23:09:51 +0000 (23:09 +0000)]
combine: distribute_notes again (PR69567, PR64682)
As it happens the patch I did over a year ago for PR64682 isn't quite
correct. This is PR69567. This fixes it.
PR rtl-optimization/64682
PR rtl-optimization/69567
* combine.c (distribute_notes) <REG_DEAD>: Place the death note
before I2 only if the register is both used and set in I2.
redi [Thu, 4 Feb 2016 21:43:40 +0000 (21:43 +0000)]
Update copyright years in libstdc++ manual and add link
* doc/xml/manual/containers.xml: Add cross-reference to Dual ABI.
* doc/xml/manual/spine.xml: Update copyright years and author blurb.
* doc/html/*: Regenerate.
meissner [Thu, 4 Feb 2016 21:05:14 +0000 (21:05 +0000)]
[gcc]
2016-02-04 Michael Meissner <meissner@linux.vnet.ibm.com>
PR target/69667
* config/rs6000/rs6000.md (mov<mode>_64bit_dm): Use 'd' constraint
instead of 'ws', and 'wh' instead of 'wm' since TFmode/IFmode are
not allowed into the traditional Altivec registers.
(movtd_64bit_nodm): Likewise.
(mov<mode>_32bit, FMOVE128_FPR iterator): Likewise.
[gcc/testsuite]
2016-02-04 Michael Meissner <meissner@linux.vnet.ibm.com>
wilco [Thu, 4 Feb 2016 18:23:35 +0000 (18:23 +0000)]
This patch fixes an exponential issue in ccmp.c. When deciding which ccmp
expansion to use, the tree nodes gs0 and gs1 are fully expanded twice. If
they contain more CCMP opportunities, their subtrees are also expanded twice.
When the trees are complex the expansion takes exponential time and memory.
As a workaround in GCC6 compute the cost of the first expansion early, and
only try the alternative expansion if the cost is low enough. This rarely
affects real code, e.g. SPECINT2006 has identical code size.
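The growth described above can be modeled with a toy recurrence. This is not code from ccmp.c: the function names, the single nested subtree per level, and the use of depth as the "cost" are all assumptions made for illustration. The naive model expands the nested subtree twice at every level; the second model always performs the first expansion and tries the alternative only while the modeled cost stays within max_cost.

```c
#include <assert.h>

/* Toy model: every level expands its nested subtree twice, so the
   work doubles per level (2^depth in total).  */
unsigned long work_naive(unsigned depth)
{
    return depth == 0 ? 1UL : 2UL * work_naive(depth - 1);
}

/* Toy model of the workaround: the first expansion is always done,
   the alternative only when the modeled cost (here simply the depth)
   is at most max_cost.  */
unsigned long work_with_cutoff(unsigned depth, unsigned max_cost)
{
    if (depth == 0)
        return 1UL;
    unsigned long first = work_with_cutoff(depth - 1, max_cost);
    if (depth <= max_cost)
        return first + work_with_cutoff(depth - 1, max_cost);
    return first;
}

/* Usage: ten levels cost 1024 units naively, 8 with a cutoff of 3.  */
void check_work_models(void)
{
    assert(work_naive(10) == 1024UL);
    assert(work_with_cutoff(10, 3) == 8UL);
    assert(work_with_cutoff(3, 3) == work_naive(3));
}
```

In this model the work stops growing once the depth exceeds the cutoff, which mirrors the intent of trying the alternative expansion only when the first one is cheap.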
vapier [Thu, 4 Feb 2016 17:32:11 +0000 (17:32 +0000)]
gcc: invoke: delete -mno-fma4 docs
We don't document the -mno-xxx variants for other flags here, and the
paragraph here specifically says "Each has a corresponding -mno- option
to disable use of these instructions". Drop the -mno-fma4 line.
X gets allocated to an AVX register, as usual for V2TI. The problem is
that the movti for B doesn't then preserve the other half of X, even
though the subreg semantics are supposed to guarantee that.
in which B2 is a no-op and therefore implicit. The handling ought
to be the same regardless of whether there is an rtl insn that
explicitly assigns to (subreg:TI (reg:V2TI X) 16).
This patch implements that idea. Hopefully the comments explain
what's going on.
Tested on x86_64-linux-gnu, aarch64-linux-gnu and arm-linux-gnueabihf.
gcc/
PR rtl-optimization/69577
* reginfo.c (record_subregs_of_mode): Add a partial_def parameter.
(find_subregs_of_mode): Update accordingly. Iterate over partial
definitions.
gcc/testsuite/
PR rtl-optimization/69577
* gcc.target/i386/pr69577.c: New test.