vec_cmps assign the result of a vector comparison to a mask.
The optab was called with the destination having mode mask_mode
but with the source (the comparison) having mode VOIDmode,
which led to invalid rtl if the source operand was used directly.
gcc/
2016-11-15 Richard Sandiford <richard.sandiford@arm.com>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
* optabs.c (vector_compare_rtx): Add a cmp_mode parameter
and use it in the final call to gen_rtx_fmt_ee.
(expand_vec_cond_expr): Update accordingly.
(expand_vec_cmp_expr): Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r242489
local_cprop_find_used_regs punted on all multiword registers,
with the comment:
/* Setting a subreg of a register larger than word_mode leaves
the non-written words unchanged. */
But this only applies if the outer mode is smaller than the
inner mode. If they're the same size then writes to the subreg
are a normal full update.
This patch uses df_read_modify_subreg_p instead. A later patch
adds more uses of the same routine, but this part had a (positive)
effect on code generation for the testsuite whereas the others
seemed to be simple clean-ups.
gcc/
2016-11-15 Richard Sandiford <richard.sandiford@arm.com>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
* cprop.c (local_cprop_find_used_regs): Use df_read_modify_subreg_p.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r242488
Andrew Burgess [Wed, 16 Nov 2016 11:42:43 +0000 (11:42 +0000)]
[ARC] Fix LE tests for nps400 variant.
gcc/arc: New peephole2 and little endian arc test fixes
Resolve some test failures introduced for little endian arc as a result
of the recent arc/nps400 additions.
There's a new peephole2 optimisation to merge together two zero_extracts
in order that the movb instruction can be used.
One of the test cases is extended so that the test does something
meaningful in both big and little endian arc mode.
Other tests have their expected results updated to reflect improvements
in other areas of GCC.
gcc/ChangeLog:
Andrew Burgess <andrew.burgess@embecosm.com>
* config/arc/arc.md (movb peephole2): New peephole2 to merge two
zero_extract operations to allow a movb to occur.
* gcc.target/arc/movb-1.c: Update little endian arc results.
* gcc.target/arc/movb-2.c: Likewise.
* gcc.target/arc/movb-5.c: Likewise.
* gcc.target/arc/movh_cl-1.c: Extend test to cover little endian
arc.
Fix PR78294 - thread sanitizer broken when using ld.gold
When one uses ld.gold to build gcc, the thread sanitizer doesn't work,
because gold is more conservative when applying TLS relaxations than
ld.bfd. In this case a missing initial-exec attribute on a declaration
causes gcc to assume the general dynamic model. With ld.bfd this gets
relaxed to initial exec when linking the shared library, so the missing
attribute doesn't matter. But ld.gold doesn't perform this optimization
and this leads to crashes on tsan instrumented binaries.
The CONCAT handling in emit_group_load chooses between doing
an extraction from a single component or forcing the whole
thing to memory and extracting from there. The condition for
the former (more efficient) option was:
On the one hand this seems dangerous, since the second line
allows bit ranges that start in the first component and leak
into the second. On the other hand it seems strange to allow
references that start after the first byte of the second
component but not those that start after the first byte
of the first component. This led to a pessimisation of
things like gcc.dg/builtins-54.c for hppa64-hp-hpux11.23.
This patch simply checks whether the reference is contained
within a single component. It also makes sure that we do
an extraction on anything that doesn't span the whole
component (even if it's constant).
gcc/
2016-11-15 Richard Sandiford <richard.sandiford@arm.com>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
* expr.c (emit_group_load_1): Tighten check for whether an
access involves only one operand of a CONCAT. Use extract_bit_field
for constants if the bit range does span the whole operand.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r242477
Fix handling of unknown sizes in rtx_addr_can_trap_p
If the size passed in to rtx_addr_can_trap_p was zero, the frame
handling would get the size from the mode instead. However, this
too can be zero if the mode is BLKmode, i.e. if we have a BLKmode
memory reference with no MEM_SIZE (which should be rare these days).
This meant that the conditions for a 4-byte access at offset X were
stricter than those for an access of unknown size at offset X.
This patch checks whether the size is still zero, as the
SYMBOL_REF handling does.
gcc/
2016-11-15 Richard Sandiford <richard.sandiford@arm.com>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
Fix nb_iterations_estimate calculation in tree-vect-loop.c
vect_transform_loop has to reduce three iteration counts by
the vectorisation factor: nb_iterations_upper_bound,
nb_iterations_likely_upper_bound and nb_iterations_estimate.
All three are latch execution counts rather than loop body
execution counts. The calculations were taking that into
account for the first two, but not for nb_iterations_estimate.
This patch updates the way the calculations are done to fix
this and to add a bit more commentary about what is going on.
gcc/
2016-11-15 Richard Sandiford <richard.sandiford@arm.com>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
* tree-vect-loop.c (vect_transform_loop): Protect the updates of
all three iteration counts with an any_* test. Use a single update
for each count. Fix the calculation of nb_iterations_estimate.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r242475
Kyrylo Tkachov [Wed, 16 Nov 2016 09:02:18 +0000 (09:02 +0000)]
[ARM] PR target/78364: Add proper restrictions to zero and sign_extract patterns operands
PR target/78364
* config/arm/arm.md (*extv_reg): Restrict operands 2 and 3 to the
proper ranges for an SBFX instruction.
(extzv_t2): Likewise for UBFX.
Richard Biener [Wed, 16 Nov 2016 08:42:20 +0000 (08:42 +0000)]
re PR tree-optimization/78348 ([7 REGRESSION] 15% performance drop for coremark-pro/nnet-test after r242038)
2016-11-16 Richard Biener <rguenther@suse.de>
PR tree-optimization/78348
* tree-loop-distribution.c (enum partition_kind): Add PKIND_MEMMOVE.
(generate_memcpy_builtin): Honor PKIND_MEMCPY on the partition.
(classify_partition): Set PKIND_MEMCPY if dependence analysis
revealed no dependency, PKIND_MEMMOVE otherwise.
Jakub Jelinek [Wed, 16 Nov 2016 08:28:50 +0000 (09:28 +0100)]
re PR sanitizer/77823 (ICE: in ubsan_encode_value, at ubsan.c:137 with -fsanitize=undefined and vector types)
PR sanitizer/77823
* ubsan.c (ubsan_build_overflow_builtin): Add DATAP argument, if
it points to non-NULL tree, use it instead of ubsan_create_data.
(instrument_si_overflow): Handle vector signed integer overflow
checking.
* ubsan.h (ubsan_build_overflow_builtin): Add DATAP argument.
* tree-vrp.c (simplify_internal_call_using_ranges): Punt for
vector IFN_UBSAN_CHECK_*.
* internal-fn.c (expand_addsub_overflow): Add DATAP argument,
pass it through to ubsan_build_overflow_builtin.
(expand_neg_overflow, expand_mul_overflow): Likewise.
(expand_vector_ubsan_overflow): New function.
(expand_UBSAN_CHECK_ADD, expand_UBSAN_CHECK_SUB,
expand_UBSAN_CHECK_MUL): Use tit for vector arithmetics.
(expand_arith_overflow): Adjust expand_*_overflow callers.
* c-c++-common/ubsan/overflow-vec-1.c: New test.
* c-c++-common/ubsan/overflow-vec-2.c: New test.
* genattrtab.c (attr_rtx_1): Avoid allocating new rtx objects.
Clear ATTR_CURR_SIMPLIFIED_P for re-used binary rtx objects.
Use DEF_ATTR_STRING for string arguments. Use RTL_HASH for
integer arguments. Only set ATTR_PERMANENT_P on newly hashed
rtx when all sub-rtx are also permanent.
(attr_eq): Simplify.
(attr_copy_rtx): Remove.
(make_canonical, get_attr_value): Use attr_equal_p.
(copy_boolean): Rehash NOT.
(simplify_test_exp_in_temp,
optimize_attrs): Remove call to attr_copy_rtx.
(attr_alt_intersection, attr_alt_union,
attr_alt_complement, mk_attr_alt): Rehash EQ_ATTR_ALT.
(make_automaton_attrs): Use attr_eq.
Jonathan Wakely [Tue, 15 Nov 2016 19:32:44 +0000 (19:32 +0000)]
Make std::tuple_size<cv T> SFINAE-friendly (LWG 2770)
* doc/xml/manual/intro.xml: Document LWG 2770 status. Remove entries
for 2742 and 2748.
* doc/html/*: Regenerate.
* include/std/utility (__tuple_size_cv_impl): New helper to safely
detect tuple_size<T>::value, as per LWG 2770.
(tuple_size<cv T>): Adjust partial specializations to derive from
__tuple_size_cv_impl.
* testsuite/20_util/tuple/cv_tuple_size.cc: Test SFINAE-friendliness.
Mark Wielaard [Tue, 15 Nov 2016 19:31:59 +0000 (19:31 +0000)]
libiberty: demangler crash with missing :? or fold expression component.
When constructing an :? or fold expression that requires a third
expression only the first and second were explicitly checked to
not be NULL. Since the third expression is also required in these
constructs it needs to be explicitly checked and rejected when missing.
Otherwise the demangler will crash once it tries to d_print the
NULL component. Added two examples to demangle-expected of strings
that would crash before this fix.
Mark Wielaard [Tue, 15 Nov 2016 19:31:50 +0000 (19:31 +0000)]
libiberty: Fix some demangler crashes caused by reading past end of input.
In various situations the cplus_demangle () function could read past the
end of input causing crashes. Add checks in various places to not advance
the demangle string location and fail early when end of string is reached.
Add various examples of input strings to the testsuite that would crash
test-demangle before the fixes.
Found by using the American Fuzzy Lop (afl) fuzzer.
libiberty/ChangeLog:
* cplus-dem.c (demangle_signature): After 'H', template function,
no success and don't advance position if end of string reached.
(demangle_template): After 'z', template name, return zero on
premature end of string.
(gnu_special): Guard strchr against searching for zero characters.
(do_type): If member, only advance mangled string when 'F' found.
* testsuite/demangle-expected: Add examples of strings that could
crash the demangler by reading past end of input.
Uros Bizjak [Tue, 15 Nov 2016 19:26:41 +0000 (20:26 +0100)]
funcspec-56.inc: New file.
* gcc.target/i386/funcspec-56.inc: New file.
* gcc.target/i386.funcspec-5.c: Include funcspec-56.inc. Remove
common 32-bit and 64-bit function specific options.
* gcc.target/i386.funcspec-6.c: Ditto.
Several definitions of INCOMING_RETURN_ADDR_RTX used
gen_rtx_REG (VOIDmode, ...), which with later patches
would trip an assert. This patch converts them to use
Pmode instead.
gcc/
2016-11-15 Richard Sandiford <richard.sandiford@arm.com>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
After simplifying the operands of a PLUS, canon_rtx checked only
for cases in which one of the simplified operands was a constant,
falling back to gen_rtx_PLUS otherwise. This left the PLUS in a
non-canonical order if one of the simplified operands was
(plus (reg R1) (const_int X)); we'd end up with:
(plus (plus (reg R1) (const_int Y)) (reg R2))
rather than:
(plus (plus (reg R1) (reg R2)) (const_int Y))
Fixing this exposed new DSE opportunities on spu-elf in
gcc.c-torture/execute/builtins/strcat-chk.c but otherwise
it doesn't seem to have much practical effect.
gcc/
2016-11-15 Richard Sandiford <richard.sandiford@arm.com>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
* alias.c (canon_rtx): Use simplify_gen_binary.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r242445
LOAD_EXTEND_OP only applies to scalar integer modes that are narrower
than a word. However, callers weren't consistent about which of these
checks they made beforehand, and also weren't consistent about whether
"smaller" was based on (bit)size or precision (IMO it's the latter).
This patch adds a wrapper to try to make the macro easier to use.
LOAD_EXTEND_OP is often used to disable transformations that aren't
beneficial when extends from memory are free, so being stricter about
the check accidentally exposed more optimisation opportunities.
"SUBREG_BYTE (...) == 0" and subreg_lowpart_p are implied by
paradoxical_subreg_p, so the patch also removes some redundant tests.
The patch doesn't change reload, since different checks could have
unforeseen consequences.
gcc/
2016-11-15 Richard Sandiford <richard.sandiford@arm.com>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
Fix simplify_shift_const_1 handling of vector shifts
simplify_shift_const_1 handles both shifts of scalars by scalars
and shifts of vectors by scalars. For vectors this means that
each element is shifted by the same amount.
However:
(a) the two cases weren't always distinguished, so we'd try
things for vectors that only made sense for scalars.
(b) a lot of the range and bitcount checks were based on the
bitsize or precision of the full shifted operand, rather
than the mode of each element.
Fixing (b) accidentally exposed more optimisation opportunities,
although that wasn't the point of the patch.
gcc/
2016-11-15 Richard Sandiford <richard.sandiford@arm.com>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
* combine.c (simplify_shift_const_1): Use the number of bits
in the inner mode to determine the range of the shift.
When handling shifts of vectors, skip any rules that apply
only to scalars.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r242442
Tim Shen [Tue, 15 Nov 2016 17:26:59 +0000 (17:26 +0000)]
variant: Remove variant<T&>...
* include/std/variant: Remove variant<T&>, variant<void>, variant<> support
to rebase on the post-Issaquah design.
* testsuite/20_util/variant/compile.cc: Likewise.
Jakub Jelinek [Tue, 15 Nov 2016 17:05:23 +0000 (18:05 +0100)]
decl.c (cp_finish_decomp): For DECL_NAMESPACE_SCOPE_P decl, set DECL_ASSEMBLER_NAME.
* decl.c (cp_finish_decomp): For DECL_NAMESPACE_SCOPE_P decl,
set DECL_ASSEMBLER_NAME.
* parser.c (cp_parser_decomposition_declaration): Likewise
if returning error_mark_node.
* mangle.c (mangle_decomp): New function.
* cp-tree.h (mangle_decomp): New declaration.
gcc/
* config/mips/mips.c (mips16_emit_constants): Emit `consttable'
insn at the beginning of the constant pool.
(mips_insert_insn_pseudos): New function.
(mips_machine_reorg2): Call it.
* config/mips/mips.md (unspec): Add UNSPEC_CONSTTABLE and
UNSPEC_INSN_PSEUDO enum values.
(insn_pseudo, consttable): New insns.
gcc/testsuite/
* gcc.target/mips/insn-casesi.c: New test case.
* gcc.target/mips/insn-pseudo-1.c: New test case.
* gcc.target/mips/insn-pseudo-2.c: New test case.
* gcc.target/mips/insn-pseudo-3.c: New test case.
* gcc.target/mips/insn-pseudo-4.c: New test case.
* gcc.target/mips/insn-tablejump.c: New test case.
Jonathan Wakely [Tue, 15 Nov 2016 14:33:20 +0000 (14:33 +0000)]
Add std::string constructor for substring of string_view (LWG 2742)
* doc/xml/manual/intro.xml: Document LWG 2742 status.
* doc/html/*: Regenerate.
* include/bits/basic_string.h
(basic_string(const T&, size_type, size_type, const Allocator&)): Add
constructor for substring of basic_string_view, as per LWG 2742 but
with additional constraint to fix ambiguity.
* testsuite/21_strings/basic_string/cons/char/9.cc: New test.
* testsuite/21_strings/basic_string/cons/wchar_t/9.cc: New test.
Jonathan Wakely [Tue, 15 Nov 2016 14:33:09 +0000 (14:33 +0000)]
Constrain swap overload for std::optional (LWG 2748)
* doc/xml/manual/intro.xml: Document LWG 2748 status.
* include/std/optional (optional<T>::swap): Use is_nothrow_swappable_v
for exception specification.
(swap(optional<T>&, optional<T>&)): Disable when T is not swappable.
* testsuite/20_util/optional/swap/2.cc: New test.
Michael Matz [Tue, 15 Nov 2016 14:02:28 +0000 (14:02 +0000)]
re PR target/77881 (Non-optimal signed comparison on x86_64 since r146817)
PR missed-optimization/77881
* combine.c (simplify_comparison): Remove useless subregs
also inside the loop, not just after it.
(make_compound_operation): Recognize some subregs as being
masking as well.
Jason Merrill [Tue, 15 Nov 2016 05:22:28 +0000 (00:22 -0500)]
Various C++17 decomposition fixes.
* tree.c (bitfield_p): New.
* cp-tree.h: Declare it.
* typeck.c (cxx_sizeof_expr, cxx_alignof_expr)
(cp_build_addr_expr_1): Use it instead of DECL_C_BIT_FIELD.
* decl.c (cp_finish_decomp): Look through reference. Always
SET_DECL_DECOMPOSITION_P.
* semantics.c (finish_decltype_type): Adjust decomposition handling.
Ian Lance Taylor [Mon, 14 Nov 2016 23:16:04 +0000 (23:16 +0000)]
runtime: don't crash if signal handler info argument is nil
Apparently on Solaris 10 a SA_SIGINFO signal handler can be invoked with
a nil info argument. I would not have believed it but I've now seen it
happen, and the sigaction man page actually says "If the second argument
is not equal to NULL, it points to a siginfo_t structure...." So, if
that happens, don't crash.
Also fix another case where we want to make sure that &T{} does not
allocate.
Implement P0504R0 (Revisiting in-place tag types for any/optional/variant).
Implement P0504R0 (Revisiting in-place tag types for
any/optional/variant).
* include/std/any (any(_ValueType&& __value)): Constrain
the __is_in_place_type with the decayed type.
(make_any): Adjust to use the new tag type.
* include/std/utility (in_place_tag): Remove.
(in_place_t): Turn into a non-reference tag type.
(__in_place, __in_place_type, __in_place_index): Remove.
(in_place): Turn into an inline variable of non-reference
tag type.
(in_place<_Tp>): Remove.
(in_place_index<_Idx>): Remove.
(in_place_type_t): New.
(in_place_type): Turn into a variable template of non-reference
type.
(in_place_index_t): New.
(in_place_index): Turn into a variable template of non-reference
type.
* include/std/variant
(_Variant_storage(in_place_index_t<_Np>, _Args&&...)): Adjust to
use the new tag type.
(_Union(in_place_index_t<0>, _Args&&...)): Likewise.
(_Union(in_place_index_t<_Np>, _Args&&...)): Likewise.
(_Variant_base()): Likewise.
(variant(_Tp&&)): Likewise.
(variant(in_place_type_t<_Tp>, _Args&&...)): Likewise.
(variant(in_place_type_t<_Tp>, initializer_list<_Up>,
_Args&&...)): Likewise.
(variant(in_place_index_t<_Np>, _Args&&...)): Likewise.
(variant(in_place_index_t<_Np>, initializer_list<_Up>,
_Args&&...)): Likewise
(variant(allocator_arg_t, const _Alloc&)): Likewise.
(variant(allocator_arg_t, const _Alloc&, _Tp&&)): Likewise.
(variant(allocator_arg_t, const _Alloc&, in_place_type_t<_Tp>,
_Args&&...)): Likewise.
(variant(allocator_arg_t, const _Alloc&, in_place_type_t<_Tp>,
initializer_list<_Up>, _Args&&...)): Likewise.
(variant(allocator_arg_t, const _Alloc&, in_place_index_t<_Np>,
_Args&&...)): Likewise.
(variant(allocator_arg_t, const _Alloc&, in_place_index_t<_Np>,
initializer_list<_Up>, _Args&&...)): Likewise.
(emplace(_Args&&...)): Likewise.
(emplace(initializer_list<_Up>, _Args&&...)): Likewise.
* testsuite/20_util/any/cons/explicit.cc: Likewise.
* testsuite/20_util/any/cons/in_place.cc: Likewise.
* testsuite/20_util/any/requirements.cc: Add tests to
check that any is not constructible from the new in_place_type_t
of any value category.
* testsuite/20_util/in_place/requirements.cc: Adjust to
use the new tag type.
* testsuite/20_util/variant/compile.cc: Likewise.
* testsuite/20_util/variant/run.cc: Likewise.
tree-ssa-math-opts.c (find_bswap_or_nop): Zero out bytes in cmpxchg and cmpnop in two steps...
2016-11-14 Thomas Preud'homme <thomas.preudhomme@arm.com>
gcc/
* tree-ssa-math-opts.c (find_bswap_or_nop): Zero out bytes in cmpxchg
and cmpnop in two steps: first the ones not accessed in original gimple
expression in a endian independent way and then the ones not accessed
in the final result in an endian-specific way.
(bswap_replace): Stop doing big endian adjustment.
* tree-ssanames.c (make_ssa_name_fn): New argument, check for version
and assign proper version for parsed ssa names.
* tree-ssanames.h (make_ssa_name_fn): Add new argument to the function.
* internal-fn.c (expand_PHI): New function.
* internal-fn.h (expand_PHI): Declared here.
* internal-fn.def: New defination for PHI.
* tree-cfg.c (lower_phi_internal_fn): New function.
(build_gimple_cfg): Call it.
(verify_gimple_call): Condition for passing label as arg in internal
function PHI.
* tree-into-ssa.c (rewrite_add_phi_arguments): Handle already
present PHIs with arguments.
* tree-ssanames.c (make_ssa_name_fn): New argument, check for version
and assign proper version for parsed ssa names.
* tree-ssanames.h (make_ssa_name_fn): Add new argument to the function.
* internal-fn.c (expand_PHI): New function.
* internal-fn.h (expand_PHI): Declared here.
* internal-fn.def: New defination for PHI.
* tree-cfg.c (lower_phi_internal_fn): New function.
(build_gimple_cfg): Call it.
(verify_gimple_call): Condition for passing label as arg in internal
function PHI.
* tree-into-ssa.c (rewrite_add_phi_arguments): Handle already
present PHIs with arguments.
Martin Liska [Mon, 14 Nov 2016 12:09:48 +0000 (13:09 +0100)]
Introduce -fprofile-update=prefer-atomic
PR bootstrap/78069
* common.opt: Add prefer-atomic as a new enum value for
-fprofile-update.
* coretypes.h: Likewise.
* doc/invoke.texi: Document the new option value.
* gcc.c: Replace atomic with prefer-atomic. Remove warning.
* tree-profile.c (tree_profiling): Select default value
of -fprofile-update when 'prefer-atomic' is selected.
PR bootstrap/78069
* gcc.dg/no_profile_instrument_function-attr-1.c: Update test
to match scanned pattern.
* gcc.dg/tree-ssa/ssa-lim-11.c: Likewise.
Wilco Dijkstra [Mon, 14 Nov 2016 12:07:03 +0000 (12:07 +0000)]
The second patch updates the Cortex-A57 scheduler now that we can differentiate between shifts and bitfield inserts.
The second patch updates the Cortex-A57 scheduler now that we can differentiate
between shifts and bitfield inserts. The Cortex-A57 Software Optimization Guide
indicates that BFM operations use the integer multi-cycle pipeline, while ARM
UXTB/H instructions use the Integer 1 or Integer 0 pipelines, so swap the bfm
and extend reservations. This results in minor scheduling differences.
Wilco Dijkstra [Mon, 14 Nov 2016 12:04:11 +0000 (12:04 +0000)]
Currently the SBFM, UBFM and BFM instructions all use the attribute "bfm".
SBFM and UBFM include all shifts on AArch64, which are simpler than bitfield
insert. Add a new bfx attribute for these instructions so that they can be
modelled more accurately in the future. There is no difference in code
generation.
Wilco Dijkstra [Mon, 14 Nov 2016 11:51:33 +0000 (11:51 +0000)]
The existing vector costs stop some beneficial vectorization.
The existing vector costs stop some beneficial vectorization. This is mostly
due to vector statement cost being set to 3 as well as vector loads having a
higher cost than scalar loads. This means that even when we vectorize 4x, it
is possible that the cost of a vectorized loop is similar to the scalar
version, and we fail to vectorize.
Using a cost of 3 for a vector operation suggests they are 3 times as
expensive as scalar operations. Since most vector operations have a
similar throughput as scalar operations, this is not correct.
Using slightly lower values for these heuristics now allows this loop
and many others to be vectorized. On a proprietary benchmark the gain
from vectorizing this loop is around 15-30% which shows vectorizing it is
indeed beneficial.
* config/aarch64/aarch64.c (cortexa57_vector_cost):
Change vec_stmt_cost, vec_align_load_cost and vec_unalign_load_cost.
Jakub Jelinek [Mon, 14 Nov 2016 07:54:50 +0000 (08:54 +0100)]
Implement P0217R3 - C++17 structured bindings
Implement P0217R3 - C++17 structured bindings
* g++.dg/cpp1z/decomp1.C: New test.
* g++.dg/cpp1z/decomp2.C: New test.
* g++.dg/cpp1z/decomp3.C: New test.
* g++.dg/cpp1z/decomp4.C: New test.
* g++.dg/cpp1z/decomp5.C: New test.
* g++.dg/cpp1z/decomp6.C: New test.
* g++.dg/cpp1z/decomp7.C: New test.
* g++.dg/cpp1z/decomp8.C: New test.
* g++.dg/cpp1z/decomp9.C: New test.
* g++.dg/cpp1z/decomp10.C: New test.
Co-Authored-By: Jason Merrill <jason@redhat.com>
From-SVN: r242378
Jason Merrill [Mon, 14 Nov 2016 04:58:45 +0000 (23:58 -0500)]
Improve various diagnostic issues.
* call.c (build_new_method_call_1): Include template arguments in
error message.
(print_error_for_call_failure): Likewise.
(build_new_function_call): Pass them in.
* name-lookup.c (supplement_binding_1): Don't complain about a
conflict with an erroneous declaration.
* error.c (dump_decl): Fix printing of alias declaration.
* decl.c (make_typename_type): Call cxx_incomplete_type_error.
* parser.c (cp_parser_diagnose_invalid_type_name): Likewise.
* semantics.c (perform_koenig_lookup): Don't wrap an error in
TEMPLATE_ID_EXPR.
Jonathan Wakely [Sun, 13 Nov 2016 22:57:45 +0000 (22:57 +0000)]
Add array support to std::shared_ptr for C++17
* doc/xml/manual/status_cxx2017.xml: Update status.
* doc/html/manual/status.html: Regenerate.
* include/bits/shared_ptr.h (shared_ptr(unique_ptr<_Yp, _Del>)): Add
extension constructor to maintain C++14 behaviour.
* include/bits/shared_ptr_base.h (__sp_array_delete): Add new struct.
(__shared_count(_Ptr, false_type), __shared_count(_Ptr, true_type)):
New constructors.
(__sp_compatible_with, __sp_is_constructible): Add specializations
for array support.
(__sp_is_constructible_arr, __sp_is_constructible_arrN): New helpers.
(__shared_ptr_access): New base class for observer member functions.
(__shared_ptr::element_type): Use remove_extent.
(__shared_ptr::_UniqCompatible): Add __sp_compatible_with check.
(__shared_ptr(_Yp*)): Use tag dispatching to call new __shared_count
constructor.
(__shared_ptr(unique_ptr<_Yp, _Del>)): Add extension constructor.
(__shared_ptr::operator*, __shared_ptr::operator->): Remove and
inherit from __shared_ptr_access base class.
(__shared_ptr::__has_esft_base): Return false for array types.
(__weak_ptr::element_type): Use remove_extent.
* include/experimental/bits/shared_ptr.h (__libfund_v1): Remove.
(__shared_ptr<__libfund_v1<_Tp>>): Remove specializations.
(__wak_ptr<__libfund_v1<_Tp>>): Likewise.
(experimental::__sp_compatible_v): Redefine using
__sp_compatible_with.
(experimental::__sp_is_constructible_v): Redefine using
__sp_is_constructible.
(get_deleter, operator<<): Change argument from __shared_ptr to
shared_ptr.
* testsuite/20_util/shared_ptr/cons/array.cc: New test.
* testsuite/20_util/shared_ptr/cons/unique_ptr_array.cc: Adjust for
new behaviour.
* testsuite/20_util/shared_ptr/observers/array.cc: Test observers for
arrays.
* testsuite/20_util/shared_ptr/observers/array_neg.cc: New test.