Tom de Vries [Wed, 19 Jul 2017 13:05:21 +0000 (13:05 +0000)]
Add v2si support for nvptx
2017-07-19 Tom de Vries <tom@codesourcery.com>
* config/nvptx/nvptx-modes.def: New file. Add V2SImode.
* config/nvptx/nvptx.c (nvptx_ptx_type_from_mode): Handle V2SImode.
(nvptx_vector_mode_supported): New function. Allow V2SImode.
(TARGET_VECTOR_MODE_SUPPORTED_P): Redefine to nvptx_vector_mode_supported.
* config/nvptx/nvptx.md (VECIM): New mode iterator. Add V2SI.
(mov<VECIM>_insn): New define_insn.
(define_expand "mov<VECIM>): New define_expand.
* gcc.target/nvptx/slp-run.c: New test.
* gcc.target/nvptx/slp.c: New test.
* gcc.target/nvptx/v2si-cvt.c: New test.
* gcc.target/nvptx/v2si-run.c: New test.
* gcc.target/nvptx/v2si.c: New test.
* gcc.target/nvptx/vec.inc: New test.
Jakub Jelinek [Wed, 19 Jul 2017 12:31:59 +0000 (14:31 +0200)]
re PR tree-optimization/81346 (Missed constant propagation into comparison)
PR tree-optimization/81346
* fold-const.h (fold_div_compare, range_check_type): Declare.
* fold-const.c (range_check_type): New function.
(build_range_check): Use range_check_type.
(fold_div_compare): No longer static, rewritten into
a match.pd helper function.
(fold_comparison): Don't call fold_div_compare here.
* match.pd (X / C1 op C2): New optimization using fold_div_compare
as helper function.
* gcc.dg/tree-ssa/pr81346-1.c: New test.
* gcc.dg/tree-ssa/pr81346-2.c: New test.
* gcc.dg/tree-ssa/pr81346-3.c: New test.
* gcc.dg/tree-ssa/pr81346-4.c: New test.
* gcc.target/i386/umod-3.c: Hide comparison against 1 from the
compiler to avoid X / C1 op C2 optimization to trigger.
Ian Lance Taylor [Tue, 18 Jul 2017 23:29:15 +0000 (23:29 +0000)]
compiler: insert backend type conversion for closure func ptr
In Func_expression::do_get_backend when creating the backend
representation for a closure, create a backend type conversion to
account for potential differences between the closure struct type
(where the number of fields is dependent on the number of values
referenced in the closure) and the generic function descriptor type
(struct with single function pointer field).
Ian Lance Taylor [Tue, 18 Jul 2017 23:14:29 +0000 (23:14 +0000)]
re PR go/81451 (missing futex check - libgo/runtime/thread-linux.c:12:0 futex.h:13:12: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘long’)
PR go/81451
runtime: inline runtime_osinit
We had two identical copies of runtime_osinit. They set runtime_ncpu,
a variable that is no longer used. Removing that leaves us with two lines.
Inline those two lines in the two places the function was called.
Ian Lance Taylor [Tue, 18 Jul 2017 22:31:00 +0000 (22:31 +0000)]
compiler: pass correct 'function' flag to circular_pointer_type
The code in Named_type::do_get_backend was not passing the correct
flag value for circular function types to Backend::circular_pointer_type
(it was always setting this flag to false). Pass a true value if the
type being converted is a function type.
Ian Lance Taylor [Tue, 18 Jul 2017 22:06:31 +0000 (22:06 +0000)]
re PR go/81324 (libgo does not build with glibc 2.18)
PR go/81324
sysinfo.c: ignore ptrace_peeksiginfo_args from <linux/ptrace.h>
With some versions of glibc and GNU/Linux ptrace_pseeksiginfo_args is
defined in both <sys/ptrace.h> and <linux/ptrace.h>. We don't actually
care about the struct, so use a #define to avoid a redefinition error.
re PR target/81471 (internal compiler error: in curr_insn_transform, at lra-constraints.c:3495)
PR target/81471
* config/i386/i386.md (rorx_immediate_operand): New mode attribute.
(*bmi2_rorx<mode>3_1): Use rorx_immediate_operand as
operand 2 predicate.
(*bmi2_rorxsi3_1_zext): Use const_0_to_31_operand as
operand 2 predicate.
(ror,rol -> rorx splitters): Use const_int_operand as
operand 2 predicate.
testsuite/ChangeLog:
PR target/81471
* gcc.target/i386/pr81471.c: New test.
Jan Hubicka [Tue, 18 Jul 2017 13:49:30 +0000 (15:49 +0200)]
re PR tree-optimization/81462 (ICE in estimate_bb_frequencies at gcc/predict.c:3546)
PR middle-end/81462
* predict.c (set_even_probabilities): Cleanup; do not affect
probabilities that are already known.
(combine_predictions_for_bb): Call even when count is set.
* class.c (classtype_has_move_assign_or_move_ctor): Declare.
(add_implicitly_declared_members): Use it.
(type_has_move_constructor, type_has_move_assign): Merge into ...
(classtype_has_move_assign_or_move_ctor): ... this new function.
* cp-tree.h (type_has_move_constructor, type_has_move_assign): Delete.
Robin Dapp [Tue, 18 Jul 2017 09:23:35 +0000 (09:23 +0000)]
Fix PR81362: Vector peeling
npeel was erroneously overwritten by vect_peeling_hash_get_lowest_cost
although the corresponding dataref is not used afterwards. It should
be safe to get rid of the npeel parameter since we use the returned
peeling_info's npeel anyway. Also removed the body_cost_vec parameter
which is not used elsewhere.
gcc/ChangeLog:
2017-07-18 Robin Dapp <rdapp@linux.vnet.ibm.com>
* tree-vect-data-refs.c (vect_enhance_data_refs_alignment): Remove
body_cost_vec from _vect_peel_extended_info.
(vect_peeling_hash_get_lowest_cost): Do not set body_cost_vec.
(vect_peeling_hash_choose_best_peeling): Remove body_cost_vec and
npeel.
Richard Biener [Tue, 18 Jul 2017 07:35:40 +0000 (07:35 +0000)]
re PR tree-optimization/80620 (gcc produces wrong code with -O3)
2017-07-18 Richard Biener <rguenther@suse.de>
PR tree-optimization/80620
PR tree-optimization/81403
* tree-ssa-pre.c (phi_translate_1): Clear range and points-to
info when re-using a VN table entry.
* gcc.dg/torture/pr80620.c: New testcase.
* gcc.dg/torture/pr81403.c: Likewise.
Richard Biener [Tue, 18 Jul 2017 07:26:04 +0000 (07:26 +0000)]
re PR tree-optimization/81418 (ICE in vect_get_vec_def_for_stmt_copy)
2017-06-18 Richard Biener <rguenther@suse.de>
PR tree-optimization/81418
* tree-vect-loop.c (vectorizable_reduction): Properly compute
vectype_in. Verify that with lane-reducing reduction operations
we have a single def-use cycle.
class.c (type_has_user_declared_move_constructor, [...]): Combine into ...
* class.c (type_has_user_declared_move_constructor,
type_has_user_declared_move_assign): Combine into ...
(classtype_has_user_move_assign_or_move_ctor_p): ... this new function.
* cp-tree.h (type_has_user_declared_move_constructor,
type_has_user_declared_move_assign): Combine into ...
(classtype_has_user_move_assign_or_move_ctor_p): ... this. Declare.
* method.c (maybe_explain_implicit_delete): Use it.
(lazily_declare_fn): Use it.
* tree.c (type_has_nontrivial_copy_init): Use it.
Emitting subregs in the expand will result in broken code due to LRA handling of them. Issue observed while turning on mlra and mexpand-adddi options using dejagnu test suite. Deprecate this
option.
[ARC] [LRA] Avoid emitting COND_EXEC during expand.
Emmitting COND_EXEC rtxes during expand does introduces errors due to LRA handling of them. Issue discovered while running dejagnu test suit with mlra option on.
* config/arc/arc.md (clzsi2): Expand to an arc_clzsi2 instruction
that also clobbers the CC register. The old expand code is moved
to ...
(*arc_clzsi2): ... here.
(ctzsi2): Expand to an arc_ctzsi2 instruction that also clobbers
the CC register. The old expand code is moved to ...
(arc_ctzsi2): ... here.
Bin Cheng [Mon, 17 Jul 2017 11:40:54 +0000 (11:40 +0000)]
re PR tree-optimization/81369 (ICE in generate_code_for_partition)
PR target/81369
* tree-loop-distribution.c (classify_partition): Only assert on
numer of iterations.
(merge_dep_scc_partitions): Delete prameter. Update function call.
(distribute_loop): Remove code handling loop with unknown niters.
(pass_loop_distribution::execute): Skip loop with unknown niters.
Bin Cheng [Mon, 17 Jul 2017 11:34:30 +0000 (11:34 +0000)]
re PR tree-optimization/81374 (ICE in bb_top_order_cmp, at tree-loop-distribution.c:391)
PR tree-optimization/81374
* tree-loop-distribution.c (pass_loop_distribution::execute): Record
the max index of basic blocks, rather than number of basic blocks.
This patch refactors a number of functions and compiler hooks into using a
single function which checks if a rtx is suited for pic or not. Removed
functions are arc_legitimate_pc_offset_p and arc_legitimate_pic_operand_p
beeing replaced by calls to arc_legitimate_pic_addr_p. Thus we have an
unitary way of checking a rtx beeing pic.
* config/arc/arc-protos.h (arc_legitimate_pc_offset_p): Remove
proto.
(arc_legitimate_pic_operand_p): Likewise.
* config/arc/arc.c (arc_legitimate_pic_operand_p): Remove
function.
(arc_needs_pcl_p): Likewise.
(arc_legitimate_pc_offset_p): Likewise.
(arc_legitimate_pic_addr_p): Remove LABEL_REF case, as this
function is also used in constrains.md.
(arc_legitimate_constant_p): Use arc_legitimate_pic_addr_p to
validate pic constants. Handle CONST_INT, CONST_DOUBLE, MINUS and
PLUS. Only return true/false in known cases, otherwise assert.
(arc_legitimate_address_p): Remove arc_legitimate_pic_addr_p as it
is already called in arc_legitimate_constant_p.
* config/arc/arc.h (CONSTANT_ADDRESS_P): Consider also LABEL for
pic addresses.
(LEGITIMATE_PIC_OPERAND_P): Use
arc_raw_symbolic_reference_mentioned_p function.
* config/arc/constraints.md (Cpc): Use arc_legitimate_pic_addr_p
function.
(Cal): Likewise.
(C32): Likewise.
Jakub Jelinek [Mon, 17 Jul 2017 09:10:23 +0000 (11:10 +0200)]
re PR tree-optimization/81365 (GCC miscompiles swap)
PR tree-optimization/81365
* tree-ssa-phiprop.c (propagate_with_phi): When considering hoisting
aggregate moves onto bb predecessor edges, make sure there are no
loads that could alias the lhs in between the start of bb and the
loads from *phi.
Jakub Jelinek [Mon, 17 Jul 2017 08:14:16 +0000 (10:14 +0200)]
re PR tree-optimization/81396 (Optimization of reading Little-Endian 64-bit number with portable code has a regression)
PR tree-optimization/81396
* tree-ssa-math-opts.c (struct symbolic_number): Add n_ops field.
(init_symbolic_number): Initialize it to 1.
(perform_symbolic_merge): Add n_ops from both operands into the new
n_ops.
(find_bswap_or_nop): Don't consider n->n == cmpnop computations
without base_addr as useless if they need more than one operation.
(bswap_replace): Handle !bswap case for NULL base_addr.
Sebastian Huber [Mon, 17 Jul 2017 05:27:13 +0000 (05:27 +0000)]
[SPARC/RTEMS] Add __FIX_LEON3FT_B2BST
In case the LEON3FT back-to-back store workaround is active
(sparc_fix_b2bst), then define the builtin define __FIX_LEON3FT_B2BST on
RTEMS. The intended use case for this is operating system code in
assembly language. See also:
Eric Botcazou [Sun, 16 Jul 2017 22:03:54 +0000 (22:03 +0000)]
re PR rtl-optimization/81424 (internal error on GPRbuild with -O2)
PR rtl-optimization/81424
* optabs.c (prepare_cmp_insn): Use copy_to_reg instead of force_reg
to remove potential trapping from operands if -fnon-call-exceptions.
* parser.c (cp_parser_cast_expression): Use %q#T instead of %qT
in old-style cast diagnostic.
* typeck.c (maybe_warn_about_useless_cast): Use %q#T instead of %qT
in useless cast diagnostic.
* error.c (type_to_string): Remove enum special handling.
* g++.dg/cpp1z/direct-enum-init1.C: Revert special enum handling.
* g++.dg/warn/pr12242.C: Likewise.
parser.c (cp_parser_cast_expression): Use %q#T instead of %qT in old-style cast diagnostic.
* parser.c (cp_parser_cast_expression): Use %q#T instead of %qT
in old-style cast diagnostic.
* typeck.c (maybe_warn_about_useless_cast): Use %q#T instead of %qT
in useless cast diagnostic.
* error.c (type_to_string): Remove enum special handling.
* g++.dg/cpp1z/direct-enum-init1.C: Revert special enum handling.
* g++.dg/warn/pr12242.C: Likewise.
Ian Lance Taylor [Fri, 14 Jul 2017 22:25:26 +0000 (22:25 +0000)]
libgo: don't copy semt into runtime.inc
https://gcc.gnu.org/PR81449 reports a problem with the definition semt
in runtime.inc on some systems. Since the C code in libgo/runtime
doesn't need semt, just don't copy it into runtime.inc.
rs6000-c.c (altivec_overloaded_builtins): Add array entries to represent __ieee128 versions of the scalar_test_data_class...
gcc/ChangeLog:
2017-07-14 Kelvin Nilsen <kelvin@gcc.gnu.org>
* config/rs6000/rs6000-c.c (altivec_overloaded_builtins): Add
array entries to represent __ieee128 versions of the
scalar_test_data_class, scalar_test_neg, scalar_extract_exp,
scalar_extract_sig, and scalar_insert_exp built-in functions.
(altivec_resolve_overloaded_builtin): Add special case handling
for the __builtin_scalar_insert_exp function, as represented by
the P9V_BUILTIN_VEC_VSIEDP constant.
* config/rs6000/rs6000-builtin.def (VSEEQP): Add scalar extract
exponent support for __ieee128 argument.
(VSESQP): Add scalar extract signature support for __ieee128
argument.
(VSTDCNQP): Add scalar test negative support for __ieee128
argument.
(VSIEQP): Add scalar insert exponent support for __int128 argument
with __ieee128 result.
(VSIEQPF): Add scalar insert exponent support for __ieee128
argument with __ieee128 result.
(VSTDCQP): Add scalar test data class support for __ieee128
argument.
(VSTDCNQP): Add overload support for scalar test negative with
__ieee128 argument.
(VSTDCQP): Add overload support for scalar test data class
__ieee128 argument.
* config/rs6000/vsx.md (UNSPEC_VSX_SXSIG) Replace
UNSPEC_VSX_SXSIGDP.
(UNSPEC_VSX_SIEXPQP): New constant.
(xsxexpqp): New insn for VSX scalar extract exponent quad
precision.
(xsxsigqp): New insn for VSX scalar extract significand quad
precision.
(xsiexpqpf): New insn for VSX scalar insert exponent quad
precision with floating point argument.
(xststdcqp): New expand for VSX scalar test data class quad
precision.
(xststdcnegqp): New expand for VSX scalar test negative quad
precision.
(xststdcqp): New insn to match expansions for VSX scalar test data
class quad precision and VSX scalar test negative quad precision.
* config/rs6000/rs6000.c (rs6000_expand_binop_builtin): Add
special case operand checking to enforce that second operand of
VSX scalar test data class with quad precision argument is a 7-bit
unsigned literal.
* doc/extend.texi (PowerPC AltiVec Built-in Functions): Add
prototypes and descriptions of __ieee128 versions of
scalar_extract_exp, scalar_extract_sig, scalar_insert_exp,
scalar_test_data_class, and scalar_test_neg built-in functions.
gcc/testsuite/ChangeLog:
2017-07-14 Kelvin Nilsen <kelvin@gcc.gnu.org>
* gcc.target/powerpc/bfp/scalar-cmp-exp-eq-3.c: New test.
* gcc.target/powerpc/bfp/scalar-cmp-exp-eq-4.c: New test.
* gcc.target/powerpc/bfp/scalar-cmp-exp-gt-3.c: New test.
* gcc.target/powerpc/bfp/scalar-cmp-exp-gt-4.c: New test.
* gcc.target/powerpc/bfp/scalar-cmp-exp-lt-3.c: New test.
* gcc.target/powerpc/bfp/scalar-cmp-exp-lt-4.c: New test.
* gcc.target/powerpc/bfp/scalar-cmp-exp-unordered-3.c: New test.
* gcc.target/powerpc/bfp/scalar-cmp-exp-unordered-4.c: New test.
* gcc.target/powerpc/bfp/scalar-extract-exp-3.c: New test.
* gcc.target/powerpc/bfp/scalar-extract-exp-4.c: New test.
* gcc.target/powerpc/bfp/scalar-extract-exp-5.c: New test.
* gcc.target/powerpc/bfp/scalar-extract-exp-6.c: New test.
* gcc.target/powerpc/bfp/scalar-extract-exp-7.c: New test.
* gcc.target/powerpc/bfp/scalar-extract-sig-3.c: New test.
* gcc.target/powerpc/bfp/scalar-extract-sig-4.c: New test.
* gcc.target/powerpc/bfp/scalar-extract-sig-5.c: New test.
* gcc.target/powerpc/bfp/scalar-extract-sig-6.c: New test.
* gcc.target/powerpc/bfp/scalar-extract-sig-7.c: New test.
* gcc.target/powerpc/bfp/scalar-insert-exp-10.c: New test.
* gcc.target/powerpc/bfp/scalar-insert-exp-11.c: New test.
* gcc.target/powerpc/bfp/scalar-insert-exp-12.c: New test.
* gcc.target/powerpc/bfp/scalar-insert-exp-13.c: New test.
* gcc.target/powerpc/bfp/scalar-insert-exp-14.c: New test.
* gcc.target/powerpc/bfp/scalar-insert-exp-15.c: New test.
* gcc.target/powerpc/bfp/scalar-insert-exp-6.c: New test.
* gcc.target/powerpc/bfp/scalar-insert-exp-7.c: New test.
* gcc.target/powerpc/bfp/scalar-insert-exp-8.c: New test.
* gcc.target/powerpc/bfp/scalar-insert-exp-9.c: New test.
* gcc.target/powerpc/bfp/scalar-test-data-class-10.c: New test.
* gcc.target/powerpc/bfp/scalar-test-data-class-11.c: New test.
* gcc.target/powerpc/bfp/scalar-test-data-class-12.c: New test.
* gcc.target/powerpc/bfp/scalar-test-data-class-13.c: New test.
* gcc.target/powerpc/bfp/scalar-test-data-class-14.c: New test.
* gcc.target/powerpc/bfp/scalar-test-data-class-15.c: New test.
* gcc.target/powerpc/bfp/scalar-test-data-class-8.c: New test.
* gcc.target/powerpc/bfp/scalar-test-data-class-9.c: New test.
* gcc.target/powerpc/bfp/scalar-test-neg-4.c: New test.
* gcc.target/powerpc/bfp/scalar-test-neg-5.c: New test.
* gcc.target/powerpc/bfp/scalar-test-neg-6.c: New test.
* gcc.target/powerpc/bfp/scalar-test-neg-7.c: New test.
* gcc.target/powerpc/bfp/scalar-test-neg-8.c: New test.
* gcc.target/powerpc/bfp/vec-extract-exp-4.c: New test.
* gcc.target/powerpc/bfp/vec-extract-exp-5.c: New test.
* gcc.target/powerpc/bfp/vec-extract-sig-4.c: New test.
* gcc.target/powerpc/bfp/vec-extract-sig-5.c: New test.
* gcc.target/powerpc/bfp/vec-insert-exp-10.c: New test.
* gcc.target/powerpc/bfp/vec-insert-exp-11.c: New test.
* gcc.target/powerpc/bfp/vec-insert-exp-8.c: New test.
* gcc.target/powerpc/bfp/vec-insert-exp-9.c: New test.
* gcc.target/powerpc/bfp/vec-test-data-class-8.c: New test.
* gcc.target/powerpc/bfp/vec-test-data-class-9.c: New test.
Jason Merrill [Fri, 14 Jul 2017 19:13:49 +0000 (15:13 -0400)]
Constrain std::variant constructor for class template argument deduction
2017-07-14 Jason Merrill <jason@redhat.com>
Jonathan Wakely <jwakely@redhat.com>
* include/std/variant (variant::variant(_Tp&&)): Constrain to remove
the constructor for empty variants from the candidate functions
during class template argument deduction.
* testsuite/20_util/variant/deduction.cc: New.
Co-Authored-By: Jonathan Wakely <jwakely@redhat.com>
From-SVN: r250213
[ARM] Fix definition of __ARM_FEATURE_NUMERIC_MAXMIN
Definition of __ARM_FEATURE_NUMERIC_MAXMIN checks for
TARGET_ARM_ARCH >= 8 and TARGET_NEON being true in addition to
TARGET_VFP5. However, instructions covered by this macro are part of
FPv5 which is available in ARMv7E-M architecture. This commit fixes the
macro to only check for TARGET_VFP5.
2017-07-14 Thomas Preud'homme <thomas.preudhomme@arm.com>
gcc/
* config/arm/arm-c.c (arm_cpu_builtins): Define
__ARM_FEATURE_NUMERIC_MAXMIN solely based on TARGET_VFP5.
2017-07-14 Thomas Preud'homme <thomas.preudhomme@arm.com>
gcc/
* config/arm/arm-cpus.in (cortex-r52): Add new entry.
(armv8-r): Set ARM Cortex-R52 as default CPU.
* config/arm/arm-tables.opt: Regenerate.
* config/arm/arm-tune.md: Regenerate.
* config/arm/driver-arm.c (arm_cpu_table): Add entry for ARM
Cortex-R52.
* doc/invoke.texi: Mention -mtune=cortex-r52 and availability of fp.dp
extension for -mcpu=cortex-r52.
fp-armv8 is currently defined as a double precision FPv5 with 32 D
registers *and* a special FP_ARMv8 bit. However FP for ARMv8 should only
bring 32 D registers on top of FPv5-D16 so this FP_ARMv8 bit is
spurious. As a consequence, many instruction patterns which are guarded
by TARGET_FPU_ARMV8 are unavailable to FPv5-D16 and FPv5-SP-D16.
This commit gets rid of TARGET_FPU_ARMV8 and rewire all uses to
expressions based on TARGET_VFP5, TARGET_VFPD32 and TARGET_VFP_DOUBLE.
It also redefine ISA_FP_ARMv8 to include the D32 capability to
distinguish it from FPv5-D16. At last, it sets the +fp.sp for ARMv8-R to
enable FPv5-SP-D16 (ie FP for ARMv8 with single precision only and 16 D
registers).
2017-07-14 Thomas Preud'homme <thomas.preudhomme@arm.com>
gcc/
* config/arm/arm-isa.h (isa_bit_FP_ARMv8): Delete enumerator.
(ISA_FP_ARMv8): Define as ISA_FPv5 and ISA_FP_D32.
* config/arm/arm-cpus.in (armv8-r): Define fp.sp as enabling FPv5.
(fp-armv8): Define it as FP_ARMv8 only.
config/arm/arm.h (TARGET_FPU_ARMV8): Delete.
(TARGET_VFP_FP16INST): Define using TARGET_VFP5 rather than
TARGET_FPU_ARMV8.
config/arm/arm.c (arm_rtx_costs_internal): Replace checks against
TARGET_FPU_ARMV8 by checks against TARGET_VFP5.
* config/arm/arm-builtins.c (arm_builtin_vectorized_function): Define
first ARM_CHECK_BUILTIN_MODE definition using TARGET_VFP5 rather
than TARGET_FPU_ARMV8.
* config/arm/arm-c.c (arm_cpu_builtins): Likewise for
__ARM_FEATURE_NUMERIC_MAXMIN macro definition.
* config/arm/arm.md (cmov<mode>): Condition on TARGET_VFP5 rather than
TARGET_FPU_ARMV8.
* config/arm/neon.md (neon_vrint): Likewise.
(neon_vcvt): Likewise.
(neon_<fmaxmin_op><mode>): Likewise.
(<fmaxmin><mode>3): Likewise.
* config/arm/vfp.md (l<vrint_pattern><su_optab><mode>si2): Likewise.
* config/arm/predicates.md (arm_cond_move_operator): Check against
TARGET_VFP5 rather than TARGET_FPU_ARMV8 and fix spacing.