jakub [Tue, 19 Jul 2016 16:47:30 +0000 (16:47 +0000)]
PR middle-end/71734
* g++.dg/vect/pr70729.cc: Don't include string.h or xmmintrin.h.
(my_alloc): Rewritten to use __builtin_posix_memalign and
__SIZE_TYPE__.
(my_free): Use __builtin_free instead of _mm_free.
(Vec::operator=): Use __builtin_memcpy.
PR tree-optimization/71901
* tree-ssa-sccvn.h (struct vn_reference_op_struct): Add
align member, group stuff with the bitfield.
(vn_ref_op_align_unit): New inline.
* tree-ssa-sccvn.c (copy_reference_ops_from_ref): For ARRAY_REFs
record element alignment and operand 3 unchanged.
(ao_ref_init_from_vn_reference): Adjust.
(valueize_refs_1): Likewise.
* tree-ssa-pre.c (create_component_ref_by_pieces_1): Likewise.
PR lto/71907
* lto-streamer-out.c (DFS::DFS_write_tree_body): For blocks
with an abstract origin that is not an inlined function outer
scope add a self-reference as abstract origin.
* tree-streamer-out.c (write_ts_block_tree_pointers): Likewise.
[gcc]
2016-07-18 Michael Meissner <meissner@linux.vnet.ibm.com>
PR target/71493
* config/rs6000/rs6000.c (rs6000_function_value): Fix
unintentional System V.4 structure return breakage for structures
with a single floating point element.
[gcc/testsuite]
2016-07-18 Michael Meissner <meissner@linux.vnet.ibm.com>
PR target/71493
* gcc.target/powerpc/pr71493-1.c: New test.
* gcc.target/powerpc/pr71493-2.c: Likewise.
jakub [Mon, 18 Jul 2016 18:45:18 +0000 (18:45 +0000)]
PR c++/70869
PR c++/71054
* cp-gimplify.c (cp_genericize_r): Revert the 2016-07-07 change.
* tree.c (cp_walk_subtrees): For DECL_EXPR on DECL_ARTIFICIAL
non-static VAR_DECL, walk the decl's DECL_INITIAL, DECL_SIZE and
DECL_SIZE_UNIT.
jakub [Mon, 18 Jul 2016 18:43:19 +0000 (18:43 +0000)]
PR c++/71828
* constexpr.c (cxx_eval_constant_expression) <case REALPART_EXPR>:
For lval don't use cxx_eval_unary_expression and instead recurse
and if needed rebuild the reference.
PR tree-optimization/71734
* tree-ssa-loop-im.c (ref_indep_loop_p_1): Add REF_LOOP argument which
contains REF, use it to check safelen, assume that safelen value
must be greater 1, fix style.
(ref_indep_loop_p_2): Add REF_LOOP argument.
(ref_indep_loop_p): Pass LOOP as additional argument to
ref_indep_loop_p_2.
Allocate constant size dynamic stack space in the prologue
The attached patch fixes a warning during Linux kernel compilation
on S/390 due to -mwarn-dynamicstack and runtime alignment of stack
variables with constant size causing cfun->calls_alloca to be set
(even if alloca is not used at all). The patched code places
constant size runtime aligned variables in the "virtual stack
vars" area instead of creating a "virtual stack dynamic" area.
The kernel uses runtime alignment for the page structure (aligned
to 16 bytes), and apart from triggereing the alloca warning
(-mwarn-dynamicstack), the current Gcc also generates inefficient
code like
(if later optimization passes are able to get rid of the frame
pointer). Is there a specific reason why the patched behaviour
shouldn't be used for all platforms?
--
As the placement of runtime aligned stack variables with constant
size is done completely in the middleend, I don't see a way to fix
this in the backend.
gcc/ChangeLog:
2016-07-18 Dominik Vogt <vogt@linux.vnet.ibm.com>
* cfgexpand.c (expand_stack_vars): Implement synamic stack space
allocation in the prologue.
* explow.c (get_dynamic_stack_base): New function to return an address
expression for the dynamic stack base.
(get_dynamic_stack_size): New function to do the required dynamic stack
space size calculations.
(allocate_dynamic_stack_space): Use new functions.
(align_dynamic_address): Move some code from
allocate_dynamic_stack_space to new function.
* explow.h (get_dynamic_stack_base, get_dynamic_stack_size): Export.
gcc/testsuite/ChangeLog:
2016-07-18 Dominik Vogt <vogt@linux.vnet.ibm.com>
* gcc.target/s390/warn-dynamicstack-1.c: New test.
* gcc.dg/stack-usage-2.c (foo3): Adapt expected warning.
stack-layout-dynamic-1.c: New test.
* config/pa/pa.c (hppa_profile_hook): Allocate stack space for
register parameters. Remove code to initialize argument pointer
on TARGET_64BIT. Optimize call to _mcount when it can be reached
using a pc-relative branch. Cleanup conditional code.
* config/pa/pa.md (call_mcount): New expander.
(call_mcount_nonpic): New insn.
(call_mcount_pic): New insn and split.
(call_mcount_pic_post_reload): New insn.
(call_mcount_64bit): New insn and split.
(call_mcount_64bit_post_reload): New insn.
* config/avr/predicates.md (const_m255_to_m1_operand): New.
* config/avr/constraints.md (Cn8, Ca1, Co1, Yx2): New constraints.
* config/avr/avr.md (add<mode>3) <ALL1>: Make "r,0,r" more
expensive.
(*cmphi.zero-extend.0, *cmphi.zero-extend.1)
(*usum_widenqihi3, *udiff_widenqihi3)
(*addhi3_zero_extend.const): New combiner insns.
(andqi3, iorqi3): Provide "l" (NO_LD_REGS) alternative if
just 1 bit is affected.
* config/avr/avr.c (avr_out_bitop) <QImode>: Don't access xop[3].
(avr_out_compare) [EQ,NE]: Tweak comparing d-regs against -1.
cesar [Fri, 15 Jul 2016 14:13:48 +0000 (14:13 +0000)]
gcc/c/
* c-parser.c (c_parser_oacc_declare): Don't scan for
GOMP_MAP_POINTER.
* c-typeck.c (handle_omp_array_sections): Mark data clauses with
GOMP_MAP_FORCE_{PRESENT,TO,FROM,TOFROM} as potentially having
zero-length subarrays.
gcc/cp/
* parser.c (cp_parser_oacc_declare): Don't scan for
GOMP_MAP_POINTER.
* semantics.c (handle_omp_array_sections): Mark data clauses with
GOMP_MAP_FORCE_{PRESENT,TO,FROM,TOFROM} as potentially having
zero-length subarrays.
gcc/
* omp-low.c (lower_omp_target): Mark data clauses with
GOMP_MAP_FORCE_{PRESENT,TO,FROM,TOFROM} as potentially having
zero-length subarrays.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/zero_length_subarrays.c: New
test.
PR tree-optimization/71887
* tree-ssa-phiopt.c (absorbing_element_p): Add rhs arg and
verify it is not zero for division / modulo handling.
(value_replacement): Adjust.
* cgraphunit.c (cgraph_order_sort_kind): New entry ORDER_VAR_UNDEF.
(output_in_order): Loop over undefined variables too. Output them
via assemble_undefined_decl. Skip variables that correspond to hard
registers or have value-exprs.
* varpool.c (symbol_table::output_variables): Handle undefined
variables together with defined ones.
* tree-ssa-pre.c (get_representative_for): Make sure to return
the value number of SSA names.
(phi_translate_1): get_representative_for cannot return NULL.
(do_pre_regular_insertion): Remove redundant call to
fully_constant_expression.
(do_pre_partial_partial_insertion): Likewise.
* c-decl.c (implicit_decl_warning): Use FUZZY_LOOKUP_FUNCTION_NAME
instead of FUZZY_LOOKUP_NAME.
(lookup_name_fuzzy): For FUZZY_LOOKUP_FUNCTION_NAME consider
FUNCTION_DECLs, {VAR,PARM}_DECLs function pointers and macros.
* tree-scalar-evolution.c (simple_iv_with_niters): New funcion.
(derive_simple_iv_with_niters): New function.
(simple_iv): Rewrite using simple_iv_with_niters.
* tree-scalar-evolution.h (simple_iv_with_niters): New decl.
* tree-ssa-loop-niter.c (number_of_iterations_exit_assumptions): New
function.
(number_of_iterations_exit): Rewrite using above function.
* tree-ssa-loop-niter.h (number_of_iterations_exit_assumptions): New
Decl.
gcc/testsuite
* gcc.dg/tree-ssa/loop-41.c: New test.
2016-07-14 Thomas Preud'homme <thomas.preudhomme@arm.com>
gcc/
* config/arm/arm.h (TARGET_HAVE_LDACQ): Enable for ARMv8-M Mainline.
(TARGET_HAVE_LDACQD): New macro.
* config/arm/sync.md (atomic_loaddi): Use TARGET_HAVE_LDACQD rather
than TARGET_HAVE_LDACQ.
(arm_load_acquire_exclusivedi): Likewise.
(arm_store_release_exclusivedi): Likewise.
2016-07-14 Thomas Preud'homme <thomas.preudhomme@arm.com>
gcc/
PR rtl-optimization/71878
* lra-constraints.c (match_reload): Pass information about other
output operands. Create new unique register value if matching input
operand shares same register value as output operand being considered.
(curr_insn_transform): Record output operands already processed.
* gimple.h (stmt_can_terminate_bb_p): New function.
* tree-cfg.c (need_fake_edge_p): Rename to ...
(stmt_can_terminate_bb_p): ... this; return true if stmt can
throw external; handle const and pure calls.
* tree-ssa-loop-niter.c (loop_only_exit_p): Use it.
PR tree-optimization/71866
* tree-ssa-pre.c (get_constant_for_value_id): Remove.
(do_hoist_insertion): Avoid endless recursion when we
didn't insert anything because we managed to simplify
things down to a constant or SSA name.
(fully_constant_expression): Re-write in terms of ...
* tree-ssa-sccvn.h (vn_nary_simplify): ... this. Declare.
* tree-ssa-sccvn.c (vn_nary_simplify): New wrapper around
vn_nary_build_or_lookup_1.
(vn_nary_build_or_lookup_1): Added flag and renamed from ...
(vn_nary_build_or_lookup): ... this which now wraps it.
* gcc-interface/decl.c (gnat_to_gnu_entity) <E_Access_Type>: Also use
the void pointer type if the designated type is incomplete and has no
full view in LTO mode.
<E_Access_Protected_Subprogram_Type>: Adjust comment.
<E_Incomplete_Type>: Likewise.
* gcc-interface/trans.c (Call_to_gnu): Do not convert to the type of
the actual if it is a dummy type.
ak [Thu, 14 Jul 2016 02:14:56 +0000 (02:14 +0000)]
Some fixes for profile test cases for autofdo
This fixes some basic issues with the profile test cases with autofdo.
- Disable checking for value transformations that autofdo does not
support.
- Disable checking for fixed hit counts which autofdo does not support
- Enable dumping of afdo log file and check right log file.
- Increase run time of test cases to 1M iterations because autofdo needs
a few samples to make sense of a program. The test case don't run
noticeable slower with that.
There are still failures unfortunately, especially the indirect call
transformations do not trigger because autofdo thinks they are not hot.
This can be addressed later.
ak [Thu, 14 Jul 2016 02:14:13 +0000 (02:14 +0000)]
Add dg-final-scan-autofdo and dg-final-scan-not-autofdo
Autofdo outputs to different dump files and doesn't support some
transformation that normal profiling. Add dg-final-scan-autofdo
and dg-final-scan-not-autofdo statements to the test suite
so that the test cases can hande those cases separately.
gcc/testsuite/:
2016-07-13 Andi Kleen <ak@linux.intel.com>
* lib/profopt.exp (dg-final-scan-autofdo,
dg-final-scan-not-autofdo): New functions.
ak [Thu, 14 Jul 2016 02:14:02 +0000 (02:14 +0000)]
Don't run instrumented value profiler changes with afdo
The pass to transform gimple based on value profiling runs with autofdo
on, but currently every transformation fails. For indirect calls autofdo
does it on its own, and it doesn't suppport other value profiling. So don't
run this pass when autofdo is active. This also avoids bogus
dump file entries.
gcc/:
2016-07-13 Andi Kleen <ak@linux.intel.com>
* value-prof.c (gimple_value_profile_transformations): Don't run
when auto_profile is on.
ak [Thu, 14 Jul 2016 02:13:48 +0000 (02:13 +0000)]
Print indirect call changes in afdo dump file
Print some information about indirect call promotions in the afdo dump
file. Do it in the same format as the instrumented profiler so that
the test suite can match on it.
gcc/:
2016-07-13 Andi Kleen <ak@linux.intel.com>
* auto-profile.c (update_inlined_ind_target,
afdo_indirect_call): Print information to dump file.
law [Wed, 13 Jul 2016 22:18:40 +0000 (22:18 +0000)]
* genrecog.c (special_predicate_operand_p): New function.
(predicate_name): Move function.
(validate_pattern): Don't warn about missing mode for all
define_special_predicate predicates.
law [Wed, 13 Jul 2016 22:06:09 +0000 (22:06 +0000)]
PR c++/70926
* cplus-dem.c: Handle large values and overflow when demangling
length variables.
(demangle_template_value_parm): Read only until end of mangled string.
(do_hpacc_template_literal): Likewise.
(do_type): Handle overflow when demangling array indices.