meissner [Tue, 24 May 2016 23:19:08 +0000 (23:19 +0000)]
[gcc]
2016-05-24 Michael Meissner <meissner@linux.vnet.ibm.com>
* config/rs6000/altivec.md (VNEG iterator): New iterator for
VNEGW/VNEGD instructions.
(p9_neg<mode>2): New insns for ISA 3.0 VNEGW/VNEGD.
(neg<mode>2): Add expander for V2DImode added in ISA 2.06, and
support for ISA 3.0 VNEGW/VNEGD instructions.
[gcc/testsuite]
2016-05-24 Michael Meissner <meissner@linux.vnet.ibm.com>
* gcc.target/powerpc/p9-vneg.c: New test for ISA 3.0 VNEGW/VNEGD
instructions.
cesar [Tue, 24 May 2016 22:54:21 +0000 (22:54 +0000)]
gcc/c/
* c-parser.c (c_parser_oacc_declare): Add support for
GOMP_MAP_FIRSTPRIVATE_POINTER.
* c-typeck.c (handle_omp_array_sections_1): Replace bool is_omp
argument with enum c_omp_region_type ort.
(handle_omp_array_sections): Likewise. Update call to
handle_omp_array_sections_1.
(c_finish_omp_clauses): Add specific errors and warning messages for
OpenACC. Use firsrtprivate pointers for OpenACC subarrays. Update
call to handle_omp_array_sections.
gcc/cp/
* parser.c (cp_parser_oacc_declare): Add support for
GOMP_MAP_FIRSTPRIVATE_POINTER.
* semantics.c (handle_omp_array_sections_1): Replace bool is_omp
argument with enum c_omp_region_type ort. Don't privatize OpenACC
non-static members.
(handle_omp_array_sections): Replace bool is_omp argument with enum
c_omp_region_type ort. Update call to handle_omp_array_sections_1.
(finish_omp_clauses): Add specific errors and warning messages for
OpenACC. Use firsrtprivate pointers for OpenACC subarrays. Update
call to handle_omp_array_sections.
gcc/
* gimplify.c (omp_notice_variable): Use zero-length arrays for data
pointers inside OACC_DATA regions.
(gimplify_scan_omp_clauses): Prune firstprivate clause associated
with OACC_DATA, OACC_ENTER_DATA and OACC_EXIT data regions.
(gimplify_adjust_omp_clauses): Fix typo in comment.
meissner [Tue, 24 May 2016 22:45:45 +0000 (22:45 +0000)]
[gcc]
2016-05-24 Michael Meissner <meissner@linux.vnet.ibm.com>
* config/rs6000/altivec.md (VParity): New mode iterator for vector
parity built-in functions.
(p9v_ctz<mode>2): Add support for ISA 3.0 vector count trailing
zeros.
(p9v_parity<mode>2): Likewise.
* config/rs6000/vector.md (VEC_IP): New mode iterator for vector
parity.
(ctz<mode>2): ISA 3.0 expander for vector count trailing zeros.
(parity<mode>2): ISA 3.0 expander for vector parity.
* config/rs6000/rs6000-builtin.def (BU_P9_MISC_1): New macros for
power9 built-ins.
(BU_P9_64BIT_MISC_0): Likewise.
(BU_P9_MISC_0): Likewise.
(BU_P9V_AV_1): Likewise.
(BU_P9V_AV_2): Likewise.
(BU_P9V_AV_3): Likewise.
(BU_P9V_AV_P): Likewise.
(BU_P9V_VSX_1): Likewise.
(BU_P9V_OVERLOAD_1): Likewise.
(BU_P9V_OVERLOAD_2): Likewise.
(BU_P9V_OVERLOAD_3): Likewise.
(VCTZB): Add vector count trailing zeros support.
(VCTZH): Likewise.
(VCTZW): Likewise.
(VCTZD): Likewise.
(VPRTYBD): Add vector parity support.
(VPRTYBQ): Likewise.
(VPRTYBW): Likewise.
(VCTZ): Add overloaded vector count trailing zeros support.
(VPRTYB): Add overloaded vector parity support.
* config/rs6000/rs6000-c.c (altivec_overloaded_builtins): Add
overloaded vector count trailing zeros and parity instructions.
* config/rs6000/rs6000.md (wd mode attribute): Add V1TI and TI for
vector parity support.
* config/rs6000/altivec.h (vec_vctz): Add ISA 3.0 vector count
trailing zeros support.
(vec_cntlz): Likewise.
(vec_vctzb): Likewise.
(vec_vctzd): Likewise.
(vec_vctzh): Likewise.
(vec_vctzw): Likewise.
(vec_vprtyb): Add ISA 3.0 vector parity support.
(vec_vprtybd): Likewise.
(vec_vprtybw): Likewise.
(vec_vprtybq): Likewise.
* doc/extend.texi (PowerPC AltiVec Built-in Functions): Document
the ISA 3.0 vector count trailing zeros and vector parity built-in
functions.
[gcc/testsuite]
2016-05-24 Michael Meissner <meissner@linux.vnet.ibm.com>
* gcc.target/powerpc/p9-vparity.c: New file to check ISA 3.0
vector parity built-in functions.
* gcc.target/powerpc/ctz-3.c: New file to check ISA 3.0 vector
count trailing zeros automatic vectorization.
* gcc.target/powerpc/ctz-4.c: New file to check ISA 3.0 vector
count trailing zeros built-in functions.
fdumont [Tue, 24 May 2016 20:55:57 +0000 (20:55 +0000)]
2016-05-24 François Dumont <fdumont@gcc.gnu.org>
* include/bits/c++config (_GLIBCXX14_USE_CONSTEXPR): New.
* include/bits/hashtable_policy.h
(_Prime_rehash_policy::__has_load_factor): New. Mark rehash policy
having load factor management.
(_Mask_range_hashing): New.
(__clp2): New.
(_Power2_rehash_policy): New.
(_Inserts<>): Remove last template parameter, _Unique_keys, so that
partial specializations only depend on whether iterators are constant
or not.
* testsuite/23_containers/unordered_set/hash_policy/26132.cc: Adapt to
test new hash policy.
* testsuite/23_containers/unordered_set/hash_policy/load_factor.cc:
Likewise.
* testsuite/23_containers/unordered_set/hash_policy/rehash.cc:
Likewise.
* testsuite/23_containers/unordered_set/insert/hash_policy.cc:
Likewise.
* testsuite/23_containers/unordered_set/max_load_factor/robustness.cc:
Likewise.
* testsuite/23_containers/unordered_set/hash_policy/power2_rehash.cc:
New.
* testsuite/performance/23_containers/insert/54075.cc: Add benchmark
using the new hash policy.
* testsuite/performance/23_containers/insert_erase/41975.cc: Likewise.
msebor [Tue, 24 May 2016 20:29:36 +0000 (20:29 +0000)]
PR c++/71147 - [6 Regression] Flexible array member wrongly rejected in template
gcc/ChangeLog:
2016-05-24 Martin Sebor <msebor@redhat.com>
PR c++/71147
* gcc/tree.h (complete_or_array_type_p): New inline function.
gcc/testsuite/ChangeLog:
2016-05-24 Martin Sebor <msebor@redhat.com>
PR c++/71147
* g++.dg/ext/flexary16.C: New test.
gcc/cp/ChangeLog:
2016-05-24 Martin Sebor <msebor@redhat.com>
PR c++/71147
* decl.c (layout_var_decl, grokdeclarator): Use complete_or_array_type_p.
* pt.c (instantiate_class_template_1): Try to complete the element
type of a flexible array member.
(can_complete_type_without_circularity): Handle arrays of unknown bound.
* typeck.c (complete_type): Also complete the type of the elements of
arrays with an unspecified bound.
jakub [Tue, 24 May 2016 19:12:42 +0000 (19:12 +0000)]
* config/i386/i386.h (TARGET_AVOID_4BYTE_PREFIXES): Define.
* config/i386/constraints.md (Yr): Test TARGET_AVOID_4BYTE_PREFIXES
rather than X86_TUNE_AVOID_4BYTE_PREFIXES.
jakub [Tue, 24 May 2016 19:12:06 +0000 (19:12 +0000)]
* config/i386/sse.md (<sse4_1>_round<ssemodesuffix><avxsizesuffix>):
Limit 1st alternative to noavx isa, split 2nd alternative into one
noavx and one avx alternative, use *x and Bm in the former and
x and m in the latter.
jakub [Tue, 24 May 2016 19:11:33 +0000 (19:11 +0000)]
* config/i386/sse.md (vec_set<mode>_0): Use sse4_noavx isa instead
of sse4 for the first alternative, drop %v from the template
and d operand modifier. Split second alternative into one sse4_noavx
and one avx alternative, use *x instead of *v in the former and v
instead of *v in the latter.
(*sse4_1_extractps): Use noavx isa instead of * for the first
alternative, drop %v from the template. Split second alternative into
one noavx and one avx alternative, use *x instead of *v in the
former and v instead of *v in the latter.
(<vi8_sse4_1_avx2_avx512>_movntdqa): Guard the first 2 alternatives
with noavx and the last one with avx.
(sse4_1_phminposuw): Guard first alternative with noavx isa,
split the second one into one noavx and one avx alternative,
use *x and Bm in the former and x and m in the latter one.
(<sse4_1>_ptest<mode>): Use noavx instead of * for the first two
alternatives.
jakub [Tue, 24 May 2016 19:10:55 +0000 (19:10 +0000)]
* config/i386/sse.md (sse4_1_<code>v8qiv8hi2<mask_name>): Limit
first two alternatives to noavx, use *x instead of *v in the second
one, add avx alternative without *.
(sse4_1_<code>v4qiv4si2<mask_name>, sse4_1_<code>v4hiv4si2<mask_name>,
sse4_1_<code>v2qiv2di2<mask_name>, sse4_1_<code>v2hiv2di2<mask_name>,
sse4_1_<code>v2siv2di2<mask_name>): Likewise.
law [Tue, 24 May 2016 16:57:48 +0000 (16:57 +0000)]
* tree-ssa-threadbackwards.c (convert_and_register_jump_thread_path):
New function, extracted from...
(fsm_find_control_statement_thread_paths): Here. Use the new function.
Allow simple copies and constant initializations in the SSA chain.
rsandifo [Tue, 24 May 2016 14:05:20 +0000 (14:05 +0000)]
Clean up PURE_SLP_STMT handling
The vectorizable_* routines had many instances of:
slp_node || PURE_SLP_STMT (stmt_info)
which gives the misleading impression that we can have
!slp_node && PURE_SLP_STMT (stmt_info). In this context
it's really enough to test slp_node on its own.
There are three cases:
loop vectorisation only:
vectorizable_foo called only with !slp_node
pure SLP:
vectorizable_foo called only with slp_node
hybrid SLP:
(e.g. a vector that's used in SLP statements and also in a reduction)
- vectorizable_foo called once with slp_node for the SLP uses.
- vectorizable_foo called once with !slp_node for the non-SLP uses.
Hybrid SLP isn't possible for stores, so I added an explicit assert
for that.
I also made vectorizable_comparison static, to make it obvious that
no other callers outside tree-vect-stmts.c could use it with the
!slp && PURE_SLP_STMT combination.
Tested on aarch64-linux-gnu and x86_64-linux-gnu.
gcc/
* tree-vectorizer.h (vectorizable_comparison): Delete.
* tree-vect-loop.c (vectorizable_reduction): Remove redundant
PURE_SLP_STMT check.
* tree-vect-stmts.c (vectorizable_call): Likewise.
(vectorizable_simd_clone_call): Likewise.
(vectorizable_conversion): Likewise.
(vectorizable_assignment): Likewise.
(vectorizable_shift): Likewise.
(vectorizable_operation): Likewise.
(vectorizable_load): Likewise.
(vectorizable_condition): Likewise.
(vectorizable_store): Likewise. Assert that we don't have
hybrid SLP.
(vectorizable_comparison): Make static. Remove redundant
PURE_SLP_STMT check.
(vect_transform_stmt): Assert that we always have an slp_node
if PURE_SLP_STMT.
ktkachov [Tue, 24 May 2016 14:04:03 +0000 (14:04 +0000)]
[ARM][4/4] Simplify checks for CONST_INT_P and comparison against 1/0
* config/arm/neon.md (ashldi3_neon): Replace comparison of INTVAL of
operands[2] against 1 with comparison against CONST1_RTX.
(<shift>di3_neon): Likewise.
* config/arm/predicates.md (const0_operand): Replace with comparison
against CONST0_RTX.
ktkachov [Tue, 24 May 2016 13:55:19 +0000 (13:55 +0000)]
[ARM][2/4] Replace casts of 1 to HOST_WIDE_INT by HOST_WIDE_INT_1 and HOST_WIDE_INT_1U
* config/arm/arm.md (andsi3): Replace cast of 1 to HOST_WIDE_INT
with HOST_WIDE_INT_1.
(insv): Likewise.
* config/arm/arm.c (optimal_immediate_sequence): Replace cast of
1 to unsigned HOST_WIDE_INT with HOST_WIDE_INT_1U.
(arm_canonicalize_comparison): Likewise.
(thumb1_rtx_costs): Replace cast of 1 to HOST_WIDE_INT with
HOST_WIDE_INT_1.
(thumb1_size_rtx_costs): Likewise.
(vfp_const_double_index): Replace cast of 1 to unsigned
HOST_WIDE_INT with HOST_WIDE_INT_1U.
(get_jump_table_size): Replace cast of 1 to HOST_WIDE_INT with
HOST_WIDE_INT_1.
(arm_asan_shadow_offset): Replace cast of 1 to unsigned
HOST_WIDE_INT with HOST_WIDE_INT_1U.
* config/arm/neon.md (vec_set<mode>): Replace cast of 1 to
HOST_WIDE_INT with HOST_WIDE_INT_1.
ktkachov [Tue, 24 May 2016 11:32:35 +0000 (11:32 +0000)]
[ARM] PR target/69857 Remove bogus early return false; in gen_operands_ldrd_strd
PR target/69857
* config/arm/arm.c (gen_operands_ldrd_strd): Remove bogus early
return. Reindent transformation comment and mention the ARM state
behavior.
rsandifo [Tue, 24 May 2016 10:15:36 +0000 (10:15 +0000)]
Avoid unnecessary peeling for gaps with LD3
vectorizable_load forces peeling for gaps if the vectorisation factor
is not a multiple of the group size, since in that case we'd normally load
beyond the original scalar accesses but drop the excess elements as part
of a following permute:
This isn't necessary for LOAD_LANES though, since it loads only the
data needed and does the permute itself.
Tested on aarch64-linux-gnu and x86_64-linux-gnu.
gcc/
* tree-vect-stmts.c (vectorizable_load): Reorder checks so that
load_lanes/grouped_load classification comes first. Don't check
whether the vectorization factor is a multiple of the group size
for load_lanes.
gcc/testsuite/
* gcc.dg/vect/vect-load-lanes-peeling-1.c: New test.
rsandifo [Tue, 24 May 2016 10:13:35 +0000 (10:13 +0000)]
Fix GROUP_GAP for single-element interleaving
vectorizable_load had a curious "force_peeling" variable, with no
comment explaining why we need it for single-element interleaving
but not for other cases. I think it's simply because we weren't
initialising the GROUP_GAP correctly for single loads.
Tested on aarch64-linux-gnu and x86_64-linux-gnu.
gcc/
* tree-vect-data-refs.c (vect_analyze_group_access_1): Set
GROUP_GAP for single-element interleaving.
* tree-vect-stmts.c (vectorizable_load): Remove force_peeling
variable.
rguenth [Tue, 24 May 2016 07:55:56 +0000 (07:55 +0000)]
2016-05-24 Richard Biener <rguenther@suse.de>
PR middle-end/70434
PR c/69504
c-family/
* c-common.h (convert_vector_to_pointer_for_subscript): Rename to ...
(convert_vector_to_array_for_subscript): ... this.
* c-common.c (convert_vector_to_pointer_for_subscript): Use a
VIEW_CONVERT_EXPR to an array type. Rename to ...
(convert_vector_to_array_for_subscript): ... this.
c/
* c-typeck.c (build_array_ref): Do not complain about indexing
non-lvalue vectors. Adjust for function name change.
* tree-ssa.c (non_rewritable_mem_ref_base): Make sure to mark
bases which are accessed with non-invariant indices.
* gimple-fold.c (maybe_canonicalize_mem_ref_addr): Re-write
constant index ARRAY_REFs of vectors into BIT_FIELD_REFs.
* c-c++-common/vector-subscript-4.c: New testcase.
* c-c++-common/vector-subscript-5.c: Likewise.
meissner [Mon, 23 May 2016 23:42:52 +0000 (23:42 +0000)]
[gcc]
2016-05-23 Michael Meissner <meissner@linux.vnet.ibm.com>
PR target/71201
* config/rs6000/altivec.md (altivec_vperm_<mode>_internal): Drop
ISA 3.0 xxperm fusion alternative.
(altivec_vperm_v8hiv16qi): Likewise.
(altivec_vperm_<mode>_uns_internal): Likewise.
(vperm_v8hiv4si): Likewise.
(vperm_v16qiv8hi): Likewise.
[gcc/testsuite]
2016-05-23 Michael Meissner <meissner@linux.vnet.ibm.com>
Kelvin Nilsen <kelvin@gcc.gnu.org>
* gcc.target/powerpc/p9-permute.c: Run test on big endian as well
as little endian.
[gcc]
2016-05-23 Michael Meissner <meissner@linux.vnet.ibm.com>
Kelvin Nilsen <kelvin@gcc.gnu.org>
* config/rs6000/rs6000.c (rs6000_expand_vector_set): Generate
vpermr/xxpermr on ISA 3.0.
(altivec_expand_vec_perm_le): Likewise.
* config/rs6000/altivec.md (UNSPEC_VPERMR): New unspec.
(altivec_vpermr_<mode>_internal): Add VPERMR/XXPERMR support for
ISA 3.0.
jason [Mon, 23 May 2016 21:21:18 +0000 (21:21 +0000)]
PR c++/70735 - generic lambda and local static variable
* pt.c (tsubst_copy): Just return a local variable from
non-template context. Don't call rest_of_decl_compilation for
duplicated static locals.
(tsubst_decl): Set DECL_CONTEXT of local static from another
function.
jakub [Sun, 22 May 2016 10:27:27 +0000 (10:27 +0000)]
* config/i386/sse.md (vec_set_lo_v16hi, vec_set_hi_v16hi,
vec_set_lo_v32qi, vec_set_hi_v32qi): Add alternative with
v constraint instead of x and vinserti32x4 insn.
* gcc.target/i386/avx512vl-vinserti32x4-3.c: New test.
jakub [Sun, 22 May 2016 10:25:55 +0000 (10:25 +0000)]
* config/i386/sse.md (avx2_vec_dupv4df): Use v instead of x
constraint, use maybe_evex prefix instead of vex.
(vec_dupv4sf): Use v constraint instead of x for output
operand except for noavx alternative, use Yv constraint
instead of x for input. Use maybe_evex prefix instead of vex.
(*vec_dupv4si): Likewise.
(*vec_dupv2di): Likewise.
* gcc.target/i386/avx512vl-vbroadcast-1.c: New test.
ebotcazou [Fri, 20 May 2016 21:46:58 +0000 (21:46 +0000)]
* tree-vrp.c (compare_values_warnv): Simplify handling of symbolic
ranges by calling get_single_symbol and tidy up. Look more closely
into NAME + CST1 vs CST2 comparisons if type overflow is undefined.
ada/
* gcc-interface/decl.c (gnat_to_gnu_entity) <E_Signed_Integer_Subtype>:
Make same-sized subtypes of signed base types signed.
* gcc-interface/utils.c (make_type_from_size): Adjust to above change.
(unchecked_convert): Likewise.
jamborm [Fri, 20 May 2016 21:04:31 +0000 (21:04 +0000)]
[PR 70884] Constant pool SRA fix
2016-05-20 Martin Jambor <mjambor@suse.cz>
PR tree-optimization/70884
* tree-sra.c (initialize_constant_pool_replacements): Do not check
should_scalarize_away_bitmap and cannot_scalarize_away_bitmap bits.
(sort_and_splice_var_accesses): Do not consider multiple scalar reads
of constant pool data as a reason for scalarization.
seurer [Fri, 20 May 2016 20:31:52 +0000 (20:31 +0000)]
This patch changes some of the dejagnu options to better restrict
where the test cases run so that they will no longer cause failures on
power7 machines.
Based on a subsequent patch I also updated the code formatting (indentation,
etc.) for the code from the original patch (r235577) in both the test cases
and in rs6000-c.c.
[gcc]
2016-05-20 Bill Seurer <seurer@linux.vnet.ibm.com>
* config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin): Fix
code formatting in ALTIVEC_BUILTIN_VEC_ADDE section.
[gcc/testsuite]
2016-05-20 Bill Seurer <seurer@linux.vnet.ibm.com>
nathan [Fri, 20 May 2016 19:52:50 +0000 (19:52 +0000)]
* config/nvptx/nptx.c (nvptx_option_override): Only set
flag_toplevel_reorder, if not explicitly specified. Set
flag_no_common, unless explicitly specified.
testsuite/
* gcc.target/nvptx/uninit-decl.c: Force common storage, add
non-common cases.
* gcc.dg/tree-ssa/ssa-store-ccp-2.c: Add -fcommon.