Kyrylo Tkachov [Wed, 25 May 2016 15:53:21 +0000 (15:53 +0000)]
[RTL ifcvt] PR rtl-optimization/66940: Avoid signed overflow in noce_get_alt_condition
PR rtl-optimization/66940
* ifcvt.c (noce_get_alt_condition): Check that incrementing or
decrementing desired_val will not overflow before performing these
operations.
Nick Clifton [Wed, 25 May 2016 14:31:46 +0000 (14:31 +0000)]
msp430.c (msp430_attr): Produce an error if a static interrupt handler is detected.
* config/msp430/msp430.c (msp430_attr): Produce an error if a
static interrupt handler is detected.
* config/msp430/msp430.h (LIB_SPEC): Do not use msp430.ld as the
default linker script.
* config/msp430/msp430.md (movpsihi2_lo): New pattern for loading
the low part of a symbolic pointer.
Richard Biener [Wed, 25 May 2016 11:49:03 +0000 (11:49 +0000)]
re PR tree-optimization/71261 (Trunk GCC hangs on knl and broadwell targets)
2016-05-25 Richard Biener <rguenther@suse.de>
PR tree-optimization/71261
* tree-if-conv.c (ifcvt_split_def_stmt): Walk uses on the
interesting stmt instead of immediate uses when looking
for the use operand to replace.
Michael Meissner [Tue, 24 May 2016 23:19:08 +0000 (23:19 +0000)]
altivec.md (VNEG iterator): New iterator for VNEGW/VNEGD instructions.
[gcc]
2016-05-24 Michael Meissner <meissner@linux.vnet.ibm.com>
* config/rs6000/altivec.md (VNEG iterator): New iterator for
VNEGW/VNEGD instructions.
(p9_neg<mode>2): New insns for ISA 3.0 VNEGW/VNEGD.
(neg<mode>2): Add expander for V2DImode added in ISA 2.06, and
support for ISA 3.0 VNEGW/VNEGD instructions.
[gcc/testsuite]
2016-05-24 Michael Meissner <meissner@linux.vnet.ibm.com>
* gcc.target/powerpc/p9-vneg.c: New test for ISA 3.0 VNEGW/VNEGD
instructions.
c-parser.c (c_parser_oacc_declare): Add support for GOMP_MAP_FIRSTPRIVATE_POINTER.
gcc/c/
* c-parser.c (c_parser_oacc_declare): Add support for
GOMP_MAP_FIRSTPRIVATE_POINTER.
* c-typeck.c (handle_omp_array_sections_1): Replace bool is_omp
argument with enum c_omp_region_type ort.
(handle_omp_array_sections): Likewise. Update call to
handle_omp_array_sections_1.
(c_finish_omp_clauses): Add specific errors and warning messages for
OpenACC. Use firsrtprivate pointers for OpenACC subarrays. Update
call to handle_omp_array_sections.
gcc/cp/
* parser.c (cp_parser_oacc_declare): Add support for
GOMP_MAP_FIRSTPRIVATE_POINTER.
* semantics.c (handle_omp_array_sections_1): Replace bool is_omp
argument with enum c_omp_region_type ort. Don't privatize OpenACC
non-static members.
(handle_omp_array_sections): Replace bool is_omp argument with enum
c_omp_region_type ort. Update call to handle_omp_array_sections_1.
(finish_omp_clauses): Add specific errors and warning messages for
OpenACC. Use firsrtprivate pointers for OpenACC subarrays. Update
call to handle_omp_array_sections.
gcc/
* gimplify.c (omp_notice_variable): Use zero-length arrays for data
pointers inside OACC_DATA regions.
(gimplify_scan_omp_clauses): Prune firstprivate clause associated
with OACC_DATA, OACC_ENTER_DATA and OACC_EXIT data regions.
(gimplify_adjust_omp_clauses): Fix typo in comment.
Michael Meissner [Tue, 24 May 2016 22:45:45 +0000 (22:45 +0000)]
altivec.md (VParity): New mode iterator for vector parity built-in functions.
[gcc]
2016-05-24 Michael Meissner <meissner@linux.vnet.ibm.com>
* config/rs6000/altivec.md (VParity): New mode iterator for vector
parity built-in functions.
(p9v_ctz<mode>2): Add support for ISA 3.0 vector count trailing
zeros.
(p9v_parity<mode>2): Likewise.
* config/rs6000/vector.md (VEC_IP): New mode iterator for vector
parity.
(ctz<mode>2): ISA 3.0 expander for vector count trailing zeros.
(parity<mode>2): ISA 3.0 expander for vector parity.
* config/rs6000/rs6000-builtin.def (BU_P9_MISC_1): New macros for
power9 built-ins.
(BU_P9_64BIT_MISC_0): Likewise.
(BU_P9_MISC_0): Likewise.
(BU_P9V_AV_1): Likewise.
(BU_P9V_AV_2): Likewise.
(BU_P9V_AV_3): Likewise.
(BU_P9V_AV_P): Likewise.
(BU_P9V_VSX_1): Likewise.
(BU_P9V_OVERLOAD_1): Likewise.
(BU_P9V_OVERLOAD_2): Likewise.
(BU_P9V_OVERLOAD_3): Likewise.
(VCTZB): Add vector count trailing zeros support.
(VCTZH): Likewise.
(VCTZW): Likewise.
(VCTZD): Likewise.
(VPRTYBD): Add vector parity support.
(VPRTYBQ): Likewise.
(VPRTYBW): Likewise.
(VCTZ): Add overloaded vector count trailing zeros support.
(VPRTYB): Add overloaded vector parity support.
* config/rs6000/rs6000-c.c (altivec_overloaded_builtins): Add
overloaded vector count trailing zeros and parity instructions.
* config/rs6000/rs6000.md (wd mode attribute): Add V1TI and TI for
vector parity support.
* config/rs6000/altivec.h (vec_vctz): Add ISA 3.0 vector count
trailing zeros support.
(vec_cntlz): Likewise.
(vec_vctzb): Likewise.
(vec_vctzd): Likewise.
(vec_vctzh): Likewise.
(vec_vctzw): Likewise.
(vec_vprtyb): Add ISA 3.0 vector parity support.
(vec_vprtybd): Likewise.
(vec_vprtybw): Likewise.
(vec_vprtybq): Likewise.
* doc/extend.texi (PowerPC AltiVec Built-in Functions): Document
the ISA 3.0 vector count trailing zeros and vector parity built-in
functions.
[gcc/testsuite]
2016-05-24 Michael Meissner <meissner@linux.vnet.ibm.com>
* gcc.target/powerpc/p9-vparity.c: New file to check ISA 3.0
vector parity built-in functions.
* gcc.target/powerpc/ctz-3.c: New file to check ISA 3.0 vector
count trailing zeros automatic vectorization.
* gcc.target/powerpc/ctz-4.c: New file to check ISA 3.0 vector
count trailing zeros built-in functions.
François Dumont [Tue, 24 May 2016 20:55:57 +0000 (20:55 +0000)]
c++config (_GLIBCXX14_USE_CONSTEXPR): New.
2016-05-24 François Dumont <fdumont@gcc.gnu.org>
* include/bits/c++config (_GLIBCXX14_USE_CONSTEXPR): New.
* include/bits/hashtable_policy.h
(_Prime_rehash_policy::__has_load_factor): New. Mark rehash policy
having load factor management.
(_Mask_range_hashing): New.
(__clp2): New.
(_Power2_rehash_policy): New.
(_Inserts<>): Remove last template parameter, _Unique_keys, so that
partial specializations only depend on whether iterators are constant
or not.
* testsuite/23_containers/unordered_set/hash_policy/26132.cc: Adapt to
test new hash policy.
* testsuite/23_containers/unordered_set/hash_policy/load_factor.cc:
Likewise.
* testsuite/23_containers/unordered_set/hash_policy/rehash.cc:
Likewise.
* testsuite/23_containers/unordered_set/insert/hash_policy.cc:
Likewise.
* testsuite/23_containers/unordered_set/max_load_factor/robustness.cc:
Likewise.
* testsuite/23_containers/unordered_set/hash_policy/power2_rehash.cc:
New.
* testsuite/performance/23_containers/insert/54075.cc: Add benchmark
using the new hash policy.
* testsuite/performance/23_containers/insert_erase/41975.cc: Likewise.
Martin Sebor [Tue, 24 May 2016 20:29:36 +0000 (20:29 +0000)]
PR c++/71147 - [6 Regression] Flexible array member wrongly rejected in template
gcc/ChangeLog:
2016-05-24 Martin Sebor <msebor@redhat.com>
PR c++/71147
* gcc/tree.h (complete_or_array_type_p): New inline function.
gcc/testsuite/ChangeLog:
2016-05-24 Martin Sebor <msebor@redhat.com>
PR c++/71147
* g++.dg/ext/flexary16.C: New test.
gcc/cp/ChangeLog:
2016-05-24 Martin Sebor <msebor@redhat.com>
PR c++/71147
* decl.c (layout_var_decl, grokdeclarator): Use complete_or_array_type_p.
* pt.c (instantiate_class_template_1): Try to complete the element
type of a flexible array member.
(can_complete_type_without_circularity): Handle arrays of unknown bound.
* typeck.c (complete_type): Also complete the type of the elements of
arrays with an unspecified bound.
Jakub Jelinek [Tue, 24 May 2016 19:12:42 +0000 (21:12 +0200)]
i386.h (TARGET_AVOID_4BYTE_PREFIXES): Define.
* config/i386/i386.h (TARGET_AVOID_4BYTE_PREFIXES): Define.
* config/i386/constraints.md (Yr): Test TARGET_AVOID_4BYTE_PREFIXES
rather than X86_TUNE_AVOID_4BYTE_PREFIXES.
Jakub Jelinek [Tue, 24 May 2016 19:12:06 +0000 (21:12 +0200)]
sse.md (<sse4_1>_round<ssemodesuffix><avxsizesuffix>): Limit 1st alternative to noavx isa...
* config/i386/sse.md (<sse4_1>_round<ssemodesuffix><avxsizesuffix>):
Limit 1st alternative to noavx isa, split 2nd alternative into one
noavx and one avx alternative, use *x and Bm in the former and
x and m in the latter.
Jakub Jelinek [Tue, 24 May 2016 19:11:33 +0000 (21:11 +0200)]
sse.md (vec_set<mode>_0): Use sse4_noavx isa instead of sse4 for the first alternative...
* config/i386/sse.md (vec_set<mode>_0): Use sse4_noavx isa instead
of sse4 for the first alternative, drop %v from the template
and d operand modifier. Split second alternative into one sse4_noavx
and one avx alternative, use *x instead of *v in the former and v
instead of *v in the latter.
(*sse4_1_extractps): Use noavx isa instead of * for the first
alternative, drop %v from the template. Split second alternative into
one noavx and one avx alternative, use *x instead of *v in the
former and v instead of *v in the latter.
(<vi8_sse4_1_avx2_avx512>_movntdqa): Guard the first 2 alternatives
with noavx and the last one with avx.
(sse4_1_phminposuw): Guard first alternative with noavx isa,
split the second one into one noavx and one avx alternative,
use *x and Bm in the former and x and m in the latter one.
(<sse4_1>_ptest<mode>): Use noavx instead of * for the first two
alternatives.
Jakub Jelinek [Tue, 24 May 2016 19:10:55 +0000 (21:10 +0200)]
sse.md (sse4_1_<code>v8qiv8hi2<mask_name>): Limit first two alternatives to noavx...
* config/i386/sse.md (sse4_1_<code>v8qiv8hi2<mask_name>): Limit
first two alternatives to noavx, use *x instead of *v in the second
one, add avx alternative without *.
(sse4_1_<code>v4qiv4si2<mask_name>, sse4_1_<code>v4hiv4si2<mask_name>,
sse4_1_<code>v2qiv2di2<mask_name>, sse4_1_<code>v2hiv2di2<mask_name>,
sse4_1_<code>v2siv2di2<mask_name>): Likewise.
Jeff Law [Tue, 24 May 2016 16:57:48 +0000 (10:57 -0600)]
tree-ssa-threadbackwards.c (convert_and_register_jump_thread_path): New function, extracted from...
* tree-ssa-threadbackwards.c (convert_and_register_jump_thread_path):
New function, extracted from...
(fsm_find_control_statement_thread_paths): Here. Use the new function.
Allow simple copies and constant initializations in the SSA chain.
The vectorizable_* routines had many instances of:
slp_node || PURE_SLP_STMT (stmt_info)
which gives the misleading impression that we can have
!slp_node && PURE_SLP_STMT (stmt_info). In this context
it's really enough to test slp_node on its own.
There are three cases:
loop vectorisation only:
vectorizable_foo called only with !slp_node
pure SLP:
vectorizable_foo called only with slp_node
hybrid SLP:
(e.g. a vector that's used in SLP statements and also in a reduction)
- vectorizable_foo called once with slp_node for the SLP uses.
- vectorizable_foo called once with !slp_node for the non-SLP uses.
Hybrid SLP isn't possible for stores, so I added an explicit assert
for that.
I also made vectorizable_comparison static, to make it obvious that
no other callers outside tree-vect-stmts.c could use it with the
!slp && PURE_SLP_STMT combination.
Tested on aarch64-linux-gnu and x86_64-linux-gnu.
gcc/
* tree-vectorizer.h (vectorizable_comparison): Delete.
* tree-vect-loop.c (vectorizable_reduction): Remove redundant
PURE_SLP_STMT check.
* tree-vect-stmts.c (vectorizable_call): Likewise.
(vectorizable_simd_clone_call): Likewise.
(vectorizable_conversion): Likewise.
(vectorizable_assignment): Likewise.
(vectorizable_shift): Likewise.
(vectorizable_operation): Likewise.
(vectorizable_load): Likewise.
(vectorizable_condition): Likewise.
(vectorizable_store): Likewise. Assert that we don't have
hybrid SLP.
(vectorizable_comparison): Make static. Remove redundant
PURE_SLP_STMT check.
(vect_transform_stmt): Assert that we always have an slp_node
if PURE_SLP_STMT.
Kyrylo Tkachov [Tue, 24 May 2016 14:04:03 +0000 (14:04 +0000)]
[ARM][4/4] Simplify checks for CONST_INT_P and comparison against 1/0
* config/arm/neon.md (ashldi3_neon): Replace comparison of INTVAL of
operands[2] against 1 with comparison against CONST1_RTX.
(<shift>di3_neon): Likewise.
* config/arm/predicates.md (const0_operand): Replace with comparison
against CONST0_RTX.
Kyrylo Tkachov [Tue, 24 May 2016 13:55:19 +0000 (13:55 +0000)]
[ARM][2/4] Replace casts of 1 to HOST_WIDE_INT by HOST_WIDE_INT_1 and HOST_WIDE_INT_1U
* config/arm/arm.md (andsi3): Replace cast of 1 to HOST_WIDE_INT
with HOST_WIDE_INT_1.
(insv): Likewise.
* config/arm/arm.c (optimal_immediate_sequence): Replace cast of
1 to unsigned HOST_WIDE_INT with HOST_WIDE_INT_1U.
(arm_canonicalize_comparison): Likewise.
(thumb1_rtx_costs): Replace cast of 1 to HOST_WIDE_INT with
HOST_WIDE_INT_1.
(thumb1_size_rtx_costs): Likewise.
(vfp_const_double_index): Replace cast of 1 to unsigned
HOST_WIDE_INT with HOST_WIDE_INT_1U.
(get_jump_table_size): Replace cast of 1 to HOST_WIDE_INT with
HOST_WIDE_INT_1.
(arm_asan_shadow_offset): Replace cast of 1 to unsigned
HOST_WIDE_INT with HOST_WIDE_INT_1U.
* config/arm/neon.md (vec_set<mode>): Replace cast of 1 to
HOST_WIDE_INT with HOST_WIDE_INT_1.
Kyrylo Tkachov [Tue, 24 May 2016 11:32:35 +0000 (11:32 +0000)]
[ARM] PR target/69857 Remove bogus early return false; in gen_operands_ldrd_strd
PR target/69857
* config/arm/arm.c (gen_operands_ldrd_strd): Remove bogus early
return. Reindent transformation comment and mention the ARM state
behavior.
vectorizable_load forces peeling for gaps if the vectorisation factor
is not a multiple of the group size, since in that case we'd normally load
beyond the original scalar accesses but drop the excess elements as part
of a following permute:
This isn't necessary for LOAD_LANES though, since it loads only the
data needed and does the permute itself.
Tested on aarch64-linux-gnu and x86_64-linux-gnu.
gcc/
* tree-vect-stmts.c (vectorizable_load): Reorder checks so that
load_lanes/grouped_load classification comes first. Don't check
whether the vectorization factor is a multiple of the group size
for load_lanes.
gcc/testsuite/
* gcc.dg/vect/vect-load-lanes-peeling-1.c: New test.
vectorizable_load had a curious "force_peeling" variable, with no
comment explaining why we need it for single-element interleaving
but not for other cases. I think it's simply because we weren't
initialising the GROUP_GAP correctly for single loads.
Tested on aarch64-linux-gnu and x86_64-linux-gnu.
gcc/
* tree-vect-data-refs.c (vect_analyze_group_access_1): Set
GROUP_GAP for single-element interleaving.
* tree-vect-stmts.c (vectorizable_load): Remove force_peeling
variable.
Richard Biener [Tue, 24 May 2016 07:55:56 +0000 (07:55 +0000)]
re PR middle-end/70434 (adding an extraneous cast to vector type results in inferior code)
2016-05-24 Richard Biener <rguenther@suse.de>
PR middle-end/70434
PR c/69504
c-family/
* c-common.h (convert_vector_to_pointer_for_subscript): Rename to ...
(convert_vector_to_array_for_subscript): ... this.
* c-common.c (convert_vector_to_pointer_for_subscript): Use a
VIEW_CONVERT_EXPR to an array type. Rename to ...
(convert_vector_to_array_for_subscript): ... this.
c/
* c-typeck.c (build_array_ref): Do not complain about indexing
non-lvalue vectors. Adjust for function name change.
* tree-ssa.c (non_rewritable_mem_ref_base): Make sure to mark
bases which are accessed with non-invariant indices.
* gimple-fold.c (maybe_canonicalize_mem_ref_addr): Re-write
constant index ARRAY_REFs of vectors into BIT_FIELD_REFs.
* c-c++-common/vector-subscript-4.c: New testcase.
* c-c++-common/vector-subscript-5.c: Likewise.
Michael Meissner [Mon, 23 May 2016 23:42:52 +0000 (23:42 +0000)]
re PR target/71201 (PowerPC XXPERM instruction fails on ISA 3.0 system.)
[gcc]
2016-05-23 Michael Meissner <meissner@linux.vnet.ibm.com>
PR target/71201
* config/rs6000/altivec.md (altivec_vperm_<mode>_internal): Drop
ISA 3.0 xxperm fusion alternative.
(altivec_vperm_v8hiv16qi): Likewise.
(altivec_vperm_<mode>_uns_internal): Likewise.
(vperm_v8hiv4si): Likewise.
(vperm_v16qiv8hi): Likewise.
[gcc/testsuite]
2016-05-23 Michael Meissner <meissner@linux.vnet.ibm.com>
Kelvin Nilsen <kelvin@gcc.gnu.org>
* gcc.target/powerpc/p9-permute.c: Run test on big endian as well
as little endian.
[gcc]
2016-05-23 Michael Meissner <meissner@linux.vnet.ibm.com>
Kelvin Nilsen <kelvin@gcc.gnu.org>
* config/rs6000/rs6000.c (rs6000_expand_vector_set): Generate
vpermr/xxpermr on ISA 3.0.
(altivec_expand_vec_perm_le): Likewise.
* config/rs6000/altivec.md (UNSPEC_VPERMR): New unspec.
(altivec_vpermr_<mode>_internal): Add VPERMR/XXPERMR support for
ISA 3.0.
Jason Merrill [Mon, 23 May 2016 21:21:18 +0000 (17:21 -0400)]
PR c++/70735 - generic lambda and local static variable
* pt.c (tsubst_copy): Just return a local variable from
non-template context. Don't call rest_of_decl_compilation for
duplicated static locals.
(tsubst_decl): Set DECL_CONTEXT of local static from another
function.
Jakub Jelinek [Sun, 22 May 2016 10:27:27 +0000 (12:27 +0200)]
sse.md (vec_set_lo_v16hi, [...]): Add alternative with v constraint instead of x and vinserti32x4 insn.
* config/i386/sse.md (vec_set_lo_v16hi, vec_set_hi_v16hi,
vec_set_lo_v32qi, vec_set_hi_v32qi): Add alternative with
v constraint instead of x and vinserti32x4 insn.
* gcc.target/i386/avx512vl-vinserti32x4-3.c: New test.
Jakub Jelinek [Sun, 22 May 2016 10:25:55 +0000 (12:25 +0200)]
sse.md (avx2_vec_dupv4df): Use v instead of x constraint, use maybe_evex prefix instead of vex.
* config/i386/sse.md (avx2_vec_dupv4df): Use v instead of x
constraint, use maybe_evex prefix instead of vex.
(vec_dupv4sf): Use v constraint instead of x for output
operand except for noavx alternative, use Yv constraint
instead of x for input. Use maybe_evex prefix instead of vex.
(*vec_dupv4si): Likewise.
(*vec_dupv2di): Likewise.
* gcc.target/i386/avx512vl-vbroadcast-1.c: New test.