Paolo Carlini [Tue, 14 May 2019 11:43:55 +0000 (11:43 +0000)]
Reapply r270597.
2019-05-14 Paolo Carlini <paolo.carlini@oracle.com>
PR preprocessor/90382
* decl.c (grokdeclarator): Fix value assigned to typespec_loc, use
min_location.
2019-05-14 Paolo Carlini <paolo.carlini@oracle.com>
PR preprocessor/90382
* g++.dg/diagnostic/trailing1.C: New test.
Jonathan Wakely [Tue, 14 May 2019 11:17:23 +0000 (12:17 +0100)]
Add __gnu_test::NullablePointer utility to testsuite
* testsuite/20_util/allocator_traits/members/allocate_hint_nonpod.cc:
Use operator-> to access raw pointer member.
* testsuite/23_containers/vector/59829.cc: Likewise.
* testsuite/23_containers/vector/bool/80893.cc: Likewise.
* testsuite/libstdc++-prettyprinters/cxx11.cc: Use NullablePointer.
* testsuite/util/testsuite_allocator.h (NullablePointer): New utility
for tests.
(PointerBase, PointerBase_void): Derive from NullablePointer and use
its constructors and equality operators. Change converting
constructors to use operator-> to access private member of the other
pointer type.
(PointerBase_void::operator->()): Add, for access to private member.
(operator-(PointerBase, PointerBase)): Change to hidden friend.
(operator==(PointerBase, PointerBase)): Remove.
(operator!=(PointerBase, PointerBase)): Remove.
Jonathan Wakely [Tue, 14 May 2019 11:17:18 +0000 (12:17 +0100)]
Fix unique_ptr pretty printer for empty classes
The printer was confused when unique_ptr<T,D>::pointer is an empty
class, or the deleter is not empty. Instead of assuming the tuple has a
single _M_head_impl member manually inspect the tuple base classes to
get the first element.
* python/libstdcxx/v6/printers.py (UniquePointerPrinter.__init__): Do
not assume field called _M_head_impl is the first tuple element.
* testsuite/libstdc++-prettyprinters/compat.cc: Make tuple
implementation more accurate.
* testsuite/libstdc++-prettyprinters/cxx11.cc: Check unique_ptr with
empty pointer type and non-empty deleter.
Jonathan Wakely [Tue, 14 May 2019 11:17:11 +0000 (12:17 +0100)]
LWG 2899 - Make is_move_constructible correct for unique_ptr
* include/bits/unique_ptr.h (__uniq_ptr_impl): Add move constructor,
move assignment operator.
(__uniq_ptr_impl::release(), __uniq_ptr_impl::reset(pointer)): Add.
(__uniq_ptr_data): New class template with conditionally deleted
special members.
(unique_ptr, unique_ptr<T[], D>): Change type of data member from
__uniq_ptr_impl<T, D> to __uniq_ptr_data<T, D>. Define move
constructor and move assignment operator as defaulted.
(unique_ptr::release(), unique_ptr<T[], D>::release()): Forward to
__uniq_ptr_impl::release().
(unique_ptr::reset(pointer), unique_ptr<T[], D>::reset<U>(U)): Forward
to __uniq_ptr_impl::reset(pointer).
* python/libstdcxx/v6/printers.py (UniquePointerPrinter.__init__):
Check for new __uniq_ptr_data type.
* testsuite/20_util/unique_ptr/dr2899.cc: New test.
Richard Biener [Tue, 14 May 2019 09:11:15 +0000 (09:11 +0000)]
re PR tree-optimization/88828 (Inefficient update of the first element of vector registers)
2019-05-14 Richard Biener <rguenther@suse.de>
H.J. Lu <hongjiu.lu@intel.com>
PR tree-optimization/88828
* tree-ssa-forwprop.c (simplify_vector_constructor): Handle
permuting in a single non-constant element not extracted
from a vector.
Jonathan Wakely [Mon, 13 May 2019 20:12:06 +0000 (21:12 +0100)]
PR libstdc++/90454.cc path construction from void*
Make the filesystem::path constructors SFINAE away for void* arguments,
instead of giving an error due to iterator_traits<void*>::reference.
PR libstdc++/90454.cc path construction from void*
* include/bits/fs_path.h (path::_Path): Use remove_pointer so that
pointers to void are rejected as well as void.
* include/experimental/bits/fs_path.h (path::_Path): Likewise.
* testsuite/27_io/filesystem/path/construct/80762.cc: Also check
pointers to void.
* testsuite/experimental/filesystem/path/construct/80762.cc: Likewise.
Jonathan Wakely [Mon, 13 May 2019 20:11:52 +0000 (21:11 +0100)]
Fix testsuite regression caused by r271077
* g++.dg/cpp0x/Wattributes1.C: Adjust dg-error line number to fix
regression, by matching a note on any line.
* g++.dg/cpp0x/Wattributes2.C: Add another copy that checks the
correct line number is matched without depending on a library header.
* oacc-async.c (get_goacc_thread): New function.
(get_goacc_thread_device): New function.
(lookup_goacc_asyncqueue): New function.
(get_goacc_asyncqueue): New function.
(acc_async_test): Adjust code to use new async design.
(acc_async_test_all): Likewise.
(acc_wait): Likewise.
(acc_wait_async): Likewise.
(acc_wait_all): Likewise.
(acc_wait_all_async): Likewise.
(goacc_async_free): New function.
(goacc_init_asyncqueues): Likewise.
(goacc_fini_asyncqueues): Likewise.
* oacc-cuda.c (acc_get_cuda_stream): Adjust code to use new async
design.
(acc_set_cuda_stream): Likewise.
* oacc-host.c (host_openacc_exec): Adjust parameters, remove 'async'.
(host_openacc_register_async_cleanup): Remove.
(host_openacc_async_exec): New function.
(host_openacc_async_test): Adjust parameters.
(host_openacc_async_test_all): Remove.
(host_openacc_async_wait): Remove.
(host_openacc_async_wait_async): Remove.
(host_openacc_async_wait_all): Remove.
(host_openacc_async_wait_all_async): Remove.
(host_openacc_async_set_async): Remove.
(host_openacc_async_synchronize): New function.
(host_openacc_async_serialize): New function.
(host_openacc_async_host2dev): New function.
(host_openacc_async_dev2host): New function.
(host_openacc_async_queue_callback): New function.
(host_openacc_async_construct): New function.
(host_openacc_async_destruct): New function.
(struct gomp_device_descr host_dispatch): Remove initialization of old
interface, add intialization of new async sub-struct.
* oacc-init.c (acc_shutdown_1): Adjust to use gomp_fini_device.
(goacc_attach_host_thread_to_device): Remove old async code usage.
* oacc-int.h (goacc_init_asyncqueues): New declaration.
(goacc_fini_asyncqueues): Likewise.
(goacc_async_copyout_unmap_vars): Likewise.
(goacc_async_free): Likewise.
(get_goacc_asyncqueue): Likewise.
(lookup_goacc_asyncqueue): Likewise.
* oacc-mem.c (memcpy_tofrom_device): Adjust code to use new async
design.
(present_create_copy): Adjust code to use new async design.
(delete_copyout): Likewise.
(update_dev_host): Likewise.
(gomp_acc_insert_pointer): Add async parameter, adjust code to use new
async design.
(gomp_acc_remove_pointer): Adjust code to use new async design.
* oacc-parallel.c (GOACC_parallel_keyed): Adjust code to use new async
design.
(GOACC_enter_exit_data): Likewise.
(goacc_wait): Likewise.
(GOACC_update): Likewise.
* oacc-plugin.c (GOMP_PLUGIN_async_unmap_vars): Change to assert fail
when called, warn as obsolete in comment.
* target.c (goacc_device_copy_async): New function.
(gomp_copy_host2dev): Remove 'static', add goacc_asyncqueue parameter,
add goacc_device_copy_async case.
(gomp_copy_dev2host): Likewise.
(gomp_map_vars_existing): Add goacc_asyncqueue parameter, adjust code.
(gomp_map_pointer): Likewise.
(gomp_map_fields_existing): Likewise.
(gomp_map_vars_internal): New always_inline function, renamed from
gomp_map_vars.
(gomp_map_vars): Implement by calling gomp_map_vars_internal.
(gomp_map_vars_async): Implement by calling gomp_map_vars_internal,
passing goacc_asyncqueue argument.
(gomp_unmap_tgt): Remove static, add attribute_hidden.
(gomp_unref_tgt): New function.
(gomp_unmap_vars_internal): New always_inline function, renamed from
gomp_unmap_vars.
(gomp_unmap_vars): Implement by calling gomp_unmap_vars_internal.
(gomp_unmap_vars_async): Implement by calling
gomp_unmap_vars_internal, passing goacc_asyncqueue argument.
(gomp_fini_device): New function.
(gomp_exit_data): Adjust gomp_copy_dev2host call.
(gomp_load_plugin_for_device): Remove old interface, adjust to load
new async interface.
(gomp_target_fini): Adjust code to call gomp_fini_device.
Nathan Sidwell [Mon, 13 May 2019 11:56:57 +0000 (11:56 +0000)]
[DWARF] dwarf2out cleanups
https://gcc.gnu.org/ml/gcc-patches/2019-05/msg00485.html
* dwarf2out.c (breakout_comdat_types): Move comment to correct
piece of code.
(const_ok_for_output_1): Balance parens around #if/#else/#endif
(gen_member_die): Move abstract origin check earlier. Only VARs
can be static_inline_p. Simplify splicing control flow.
Richard Biener [Mon, 13 May 2019 11:37:21 +0000 (11:37 +0000)]
re PR tree-optimization/90402 (ICE in slpeel_duplicate_current_defs_from_edges)
2019-05-13 Richard Biener <rguenther@suse.de>
PR tree-optimization/90402
* tree-if-conv.c (tree_if_conversion): Value number only
the loop body by making the latch an exit of the region
as well.
* tree-ssa-sccvn.c (process_bb): Add flag whether to skip
processing PHIs.
(do_rpo_vn): Deal with multiple edges into the entry block
that are not backedges inside the region by skipping PHIs
of the entry block.
Richard Biener [Mon, 13 May 2019 11:22:21 +0000 (11:22 +0000)]
re PR tree-optimization/90316 (large compile time increase in opt / alias stmt walking for Go example)
2019-05-13 Richard Biener <rguenther@suse.de>
PR tree-optimization/90316
* tree-ssa-pre.c (insert_aux): Fold into ...
(insert): ... this function. Use a RPO walk to reduce the
number of required iterations.
Jonathan Wakely [Mon, 13 May 2019 10:50:21 +0000 (11:50 +0100)]
Remove Profile Mode, deprecated since GCC 7.1
The Profile Mode extension is not used by anybody, nor maintained by
anybody. The containers do not support the full API specified in recent
standards, and so enabling Profile Mode is not source compatible with
much modern C++ code. The heuristics that would check the profile
information and make useful suggestions never materialized, so it isn't
useful.
Jonathan Wakely [Mon, 13 May 2019 10:49:58 +0000 (11:49 +0100)]
Remove array_allocator extension, deprecated since 4.9.0
This type is not a conforming allocator, because it cannot be reliably
rebound to allocate for a different type. The result of the rebind
transformation still uses the same underlying std::tr1::array<T, 1>
array, which may not be correctly aligned or even have elements the
right size for the value_type of the rebound allocator.
It has been deprecated for several years and should now be removed.
* doc/xml/manual/allocator.xml: Remove documentation for
array_allocator.
* doc/xml/manual/evolution.xml: Document array_allocator removal.
* doc/xml/manual/using.xml: Remove header from documentation.
* include/Makefile.am: Remove <ext/array_allocator.h> header.
* include/Makefile.in: Regenerate.
* include/ext/array_allocator.h: Remove.
* include/precompiled/extc++.h: Do not include removed header.
* testsuite/ext/array_allocator/1.cc: Remove.
* testsuite/ext/array_allocator/2.cc: Remove.
* testsuite/ext/array_allocator/26875.cc: Remove.
* testsuite/ext/array_allocator/3.cc: Remove.
* testsuite/ext/array_allocator/check_deallocate_null.cc: Remove.
* testsuite/ext/array_allocator/check_delete.cc: Remove.
* testsuite/ext/array_allocator/check_new.cc: Remove.
* testsuite/ext/array_allocator/variadic_construct.cc: Remove.
* testsuite/ext/headers.cc: Do not include removed header.
Martin Liska [Mon, 13 May 2019 07:05:23 +0000 (09:05 +0200)]
Do not follow zero edges in cycle detection (PR gcov-profile/90380).
2019-05-13 Martin Liska <mliska@suse.cz>
PR gcov-profile/90380
* gcov.c (handle_cycle): Do not support zero cycle count,
it should not be possible.
(path_contains_zero_cycle_arc): New function.
(circuit): Ignore zero cycle arc counts.
Martin Liska [Mon, 13 May 2019 07:04:58 +0000 (09:04 +0200)]
Test for not existence of a negative loop (PR gcov-profile/90380).
2019-05-13 Martin Liska <mliska@suse.cz>
PR gcov-profile/90380
* gcov.c (enum loop_type): Remove the enum and
the operator.
(handle_cycle): Assert that we should not reach
a negative count.
(circuit): Use loop_found instead of a tri-state loop_type.
(get_cycles_count): Do not handle NEGATIVE_LOOP as it can't
happen.
Iain Sandoe [Sun, 12 May 2019 19:26:16 +0000 (19:26 +0000)]
darwin, powerpc - set .machine in an asm file.
The asm file fails to build if we use a modern assembler
which checks that the machine is consistent with the
filetype. Fixed by adjusting in a similar manner to
other assembler.
libgcc/
2019-05-12 Iain Sandoe <iain@sandoe.co.uk>
* config/rs6000/darwin-vecsave.S: Set .machine appropriately.
Iain Sandoe [Sun, 12 May 2019 19:07:49 +0000 (19:07 +0000)]
x86 - fix pr82920
The various thunks output codes have inconsisten output
mechanisms. The patch factors out some common code that
writes out the jumps and uses the regular output scheme
that accounts for __USER_LABEL_PREFIX__.
The testsuite changes are largely mechanical compensation
for the revised output (and the fact that Darwin doesn't
use non-PIC by default).
gcc/
2019-05-12 Iain Sandoe <iain@sandoe.co.uk>
PR target/82920
* config/i386/i386.c (ix86_output_jmp_thunk_or_indirect): New.
(ix86_output_indirect_branch_via_reg): Use output mechanism
accounting for __USER_LABEL_PREFIX__.
(ix86_output_indirect_branch_via_push): Likewise.
(ix86_output_function_return): Likewise.
(ix86_output_indirect_function_return): Likewise.
The recent AArch64 absolute difference patterns had to go through
some hoops to pair max/min rtx codes with the same signedness.
I also need to pair signed/unsigned codes with sign/zero extension
for some SVE ACLE patterns.
This patch therefore supports <...> as rtx codes, like we already
do for modes.
2019-05-12 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* doc/md.texi: Document use of code attributes in rtx patterns.
* read-md.h (rtx_reader::rtx_alloc_for_name): New member function.
* read-rtl.c (find_code): Split out search loops into...
(maybe_find_code): ...this new function.
(check_code_iterator): Make the error message more informative.
(check_code_attribute): New function.
(rtx_reader::rtx_alloc_for_name): Likewise.
(rtx_reader::read_rtx_code): Use rtx_alloc_for_name.
* config/aarch64/predicates.md (aarch64_smin, aarch64_umin): Delete.
* config/aarch64/aarch64-simd.md (*aarch64_<su>abd<mode>_3): Use
<max_opp> directly as an rtx code instead of via a match_operator.
* config/aarch64/aarch64-sve.md (aarch64_<su>abd<mode>_3): Likewise.
(<su>abd<mode>_3): Update accordingly.
Janne Blomqvist [Sun, 12 May 2019 08:26:18 +0000 (11:26 +0300)]
fortran: C++ support for generating C prototypes
When generating C prototypes for Fortran procedures with the
-fc-prototypes and -fc-prototypes-external options, print a snippet
defining macros for complex types, and add C++ support by suppressing
mangling.
fortran/ChangeLog:
2019-05-12 Janne Blomqvist <jb@gcc.gnu.org>
* dump-parse-tree.c (get_c_type_name): Use macros for complex type
names.
* parse.c (gfc_parse_file): Define complex macros, add CPP support
when printing C prototypes.
Iain Sandoe [Sat, 11 May 2019 15:05:58 +0000 (15:05 +0000)]
testsuite, darwin] Fix PR81058.
The tests fail because Darwin indirects common accesses which causes different
codegen and the mismatch in output. Placing the vars in regular .data section
fixes that.
gcc/testsuite/
2019-05-11 Iain Sandoe <iain@sandoe.co.uk>
PR testsuite/81058
* gcc.target/i386/avx512bw-vpmovswb-1.c: Use regular data section
for variables on Darwin, rather than common.
* gcc.target/i386/avx512bw-vpmovuswb-1.c: Likewise.
* gcc.target/i386/avx512bw-vpmovwb-1.c: Likewise.
Ian Lance Taylor [Sat, 11 May 2019 01:12:37 +0000 (01:12 +0000)]
runtime: set up g early
runtime.throw needs a g to work properly. Set up g early, to
ensure that if something goes wrong in the runtime startup (e.g.
runtime.check fails), the program terminates in a reasonable way.
Jonathan Wakely [Fri, 10 May 2019 21:41:23 +0000 (22:41 +0100)]
PR libstdc++/81266 fix std::thread::native_handle_type test
The test uses remove_pointer because in most cases native_handle_type is
a pointer to the actual type that the C++ class contains. However, for
std::thread, native_handle_type is the same type as the type contained
in std::thread, and so remove_pointer is not needed. On targets where
pthread_t is a pointer type remove_pointer<native_handle_type> is not a
no-op, instead it transforms pthread_t and causes the test to fail.
The fix is to not apply remove_pointer when testing std::thread.
PR libstdc++/81266
* testsuite/util/thread/all.h: Do not use remove_pointer for
std::thread::native_handle_type.
A disabled specialization should not be callable, so move the function
call operator into a new base class which correctly implements the
disabled hash semantics. For the versioned namespace configuration do
not derive from __poison_hash in the enabled case, as the empty base
class serves no purpose but potentially increases the object size. For
the default configuration that base class must be kept, to preserve
layout.
An enabled specialization should not be unconditionally noexcept,
because the underlying hash object might throw.
PR libstdc++/90388
* include/bits/unique_ptr.h (default_delete, default_delete<T[]>):
Use _Require for constraints.
(operator>(nullptr_t, const unique_ptr<T,D>&)): Implement exactly as
per the standard.
(__uniq_ptr_hash): New base class with conditionally-disabled call
operator.
(hash<unique_ptr<T,D>>): Derive from __uniq_ptr_hash.
* testsuite/20_util/default_delete/48631_neg.cc: Adjust dg-error line.
* testsuite/20_util/unique_ptr/hash/90388.cc: New test.
Paul Thomas [Fri, 10 May 2019 07:59:42 +0000 (07:59 +0000)]
re PR fortran/90093 (Extended C interop: optional argument incorrectly identified as PRESENT)
2019-05-10 Paul Thomas <pault@gcc.gnu.org>
PR fortran/90093
* trans-decl.c (convert_CFI_desc): Test that the dummy is
present before doing any of the conversions.
PR fortran/90352
* decl.c (gfc_verify_c_interop_param): Restore the error for
charlen > 1 actual arguments passed to bind(C) procs.
Clean up trailing white space.
PR fortran/90355
* trans-array.c (gfc_trans_create_temp_array): Set the 'span'
field to the element length for all types.
(gfc_conv_expr_descriptor): The force_no_tmp flag is used to
prevent temporary creation, especially for substrings.
* trans-decl.c (gfc_trans_deferred_vars): Rather than assert
that the backend decl for the string length is non-null, use it
as a condition before calling gfc_trans_vla_type_sizes.
* trans-expr.c (gfc_conv_gfc_desc_to_cfi_desc): 'force_no_tmp'
is set before calling gfc_conv_expr_descriptor.
* trans.c (get_array_span): Move the code for extracting 'span'
from gfc_build_array_ref to this function. This is specific to
descriptors that are component and indirect references.
* trans.h : Add the force_no_tmp flag bitfield to gfc_se.
2019-05-10 Paul Thomas <pault@gcc.gnu.org>
PR fortran/90093
* gfortran.dg/ISO_Fortran_binding_12.f90: New test.
* gfortran.dg/ISO_Fortran_binding_12.c: Supplementary code.
PR fortran/90352
* gfortran.dg/iso_c_binding_char_1.f90: New test.
PR fortran/90355
* gfortran.dg/ISO_Fortran_binding_4.f90: Add 'substr' to test
the direct passing of substrings as descriptors to bind(C).
* gfortran.dg/assign_10.f90: Increase the tree_dump count of
'atmp' to account for the setting of the 'span' field.
* gfortran.dg/transpose_optimization_2.f90: Ditto.
Martin Liska [Fri, 10 May 2019 06:31:32 +0000 (08:31 +0200)]
Fix location where lto-dump is installed.
2019-05-10 Martin Liska <mliska@suse.cz>
* Make-lang.in: Use program_transform_name for lto-dump.
* config-lang.in: Do not mark lto-dump compiler as we don't
want to have it installed at
lib/gcc/x86_64-pc-linux-gnu/10.0.0/lto-dump.
Cherry Zhang [Thu, 9 May 2019 21:24:56 +0000 (21:24 +0000)]
compiler: avoid copy for string([]byte) conversion used in map keys
If a string([]byte) conversion is used immediately as a key for a
map read, we don't need to copy the backing store of the byte
slice, as mapaccess does not keep a reference to it.
The gc compiler does more than this: it also avoids the copy if
the map key is a composite literal that contains the conversion
as a field, like, T{ ... { ..., string(b), ... }, ... }. For now,
we just optimize the simple case, which is probably most common.
[arm] PR target/90405 fix regression for thumb1 with -mtpcs-leaf-frame
-mtpcs-leaf-frame causes an APCS-style backtrace frame to be created
on the stack. This should probably be deprecated, but it did reveal
an issue with the patch I committed previously to improve the code
generation when pushing high registers, in that
thumb_find_work_register had a different idea as to which registers
were available as scratch registers.
The new code actually does a better job of finding a viable work
register and doesn't rely so much on assumptions about the ABI, so it
seems better to adapt thumb_find_work_register to the new approach.
This way we can eliminate some rather crufty code.
gcc:
PR target/90405
* config/arm/arm.c (callee_saved_reg_p): Move before
thumb_find_work_register.
(thumb1_prologue_unused_call_clobbered_lo_regs): Move before
thumb_find_work_register. Only call df_get_live_out once.
(thumb1_epilogue_unused_call_clobbered_lo_regs): Likewise.
(thumb_find_work_register): Use
thumb1_prologue_unused_call_clobbered_lo_regs instead of ad hoc
algorithms to locate a spare call clobbered reg.
gcc/testsuite:
PR target/90405
* gcc.target/arm/pr90405.c: New test.
Martin Liska [Thu, 9 May 2019 13:03:45 +0000 (15:03 +0200)]
Support {MIN,MAX}_EXPR in GIMPLE FE.
2019-05-09 Martin Liska <mliska@suse.cz>
* gimple-pretty-print.c (dump_binary_rhs): Dump MIN_EXPR
and MAX_EXPR in GIMPLE FE format.
2019-05-09 Martin Liska <mliska@suse.cz>
* gimple-parser.c (c_parser_gimple_statement): Support __MIN and
__MAX.
(c_parser_gimple_unary_expression): Parse also binary expression
__MIN and __MAX.
(c_parser_gimple_parentized_binary_expression): New function.
2019-05-09 Martin Liska <mliska@suse.cz>
Thomas Schwinge [Thu, 9 May 2019 09:51:59 +0000 (11:51 +0200)]
[PR89221] Continue to default to '--disable-frame-pointer' for x86 GNU systems
The recent trunk r270914 for PR89221 "--enable-frame-pointer does not work as
intended" fixed a scripting defect in the x86 '--enable-frame-pointer'
handling.
This has the side effect that, for example, for '--target=i686-gnu' this is now
enabled by default: 'USE_IX86_FRAME_POINTER=1' is added to 'tm_defines'. Given
that it's highly unlikely that anyone would now suddenly want
'--enable-frame-pointer' as the default for any kind of GNU system, I'm
changing the default back for GNU systems, to match that of a 'target_os' of
'linux* | darwin[8912]*'.
gcc/
PR target/89221
* configure.ac (--enable-frame-pointer): Disable by default for
GNU systems.
* configure: Regenerate.
This patch makes a number of corrections to rs6000_register_move_cost,
adds a new register union class, GEN_OR_VSX_REGS, and adjusts insn
alternative costs to suit.
The patch initially just corrected register move cost when direct
moves are available, but that resulted in regressions. Inspection of
those regressions showed ALL_REGS being used as the register allocno
class, which isn't ideal. gcc/doc/tm.texi says: "You should define a
class for the union of two classes whenever some instruction allows
both classes". Thus, define GEN_OR_VSX_REGS for the register
allocator. (IRA wants to use the union of two register classes when
the costs of the classes are below memory cost, which happens more
often with the low direct move cost.)
As per https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89271#c11 we ought
to be returning the minimal cost for union classes. That can be done
by rs6000_register_move_cost testing for vsx first, where the number
of regs for a given mode might be smaller than the same mode in gprs,
and changing the LINK_OR_CTR_REGS case to exclude SPEC_OR_GEN_REGS and
NON_FLOAT_REGS.
I removed the VECTOR_MEM_VSX_P test since that leads to silly results
for scalar mode moves between altivec and float when TARGET_VSX. eg.
rs6000_register_move_cost:, ret=2, mode=DF, from=FLOAT_REGS, to=FLOAT_REGS
rs6000_register_move_cost:, ret=16, mode=DF, from=FLOAT_REGS, to=ALTIVEC_REGS
rs6000_register_move_cost:, ret=2, mode=DF, from=FLOAT_REGS, to=VSX_REGS
The patch also fixes wrong results for moves within and between any of
the non-gpr, non-vsx special reg classes. The comment about "moving
between two similar registers is just one instruction" is false. We
can't move lr to ctr directly, for example. I believe the intent of
the "reg_classes_intersect_p (to, from)" was to cover moves within
float or altivec, so I moved that test inside the code handling vsx,
and made sure the intersection wasn't anything besides vsx by masking
off everything else. Masking isn't strictly necessary at the moment,
but would be if we create a GEN_OR_ALTIVEC_REGS class some time in the
future.
TARGET_IRA_CHANGE_PSEUDO_ALLOCNO_CLASS is needed for rs6000 in order
to fix the 20% cactus_adm spec regression when using GEN_OR_VSX_REGS
as an allocno class. It is similar to the aarch64 version but without
any selection by regno mode if the best class is a union class.
PR target/89271
* config/rs6000/rs6000.h (enum reg_class, REG_CLASS_NAMES),
(REG_CLASS_CONTENTS): Add GEN_OR_VSX_REGS class.
* config/rs6000/rs6000.c (rs6000_register_move_cost): Correct
cost for general <-> vsx when direct moves are available.
Cost union classes at minimal cost for any reg in the class.
Correct calculation for moves between vsx, float, and altivec.
Don't return a low cost for moves between special regs. Don't
use hard coded register numbers.
(TARGET_IRA_CHANGE_PSEUDO_ALLOCNO_CLASS): Define.
(rs6000_ira_change_pseudo_allocno_class): New function.
* config/rs6000/rs6000.md (movsi_internal1, mov<mode>_internal),
(movdi_internal32, movdi_internal64): Remove '*' from vsx register
alternatives.
(movsi_internal1): Don't disparage vector alternatives.
(mov<mode>_internal): Likewise, excepting alternative that
will be split.
* config/rs6000/vsx.md (vsx_splat_<mode>_reg): Don't disparage
we <- b alternative.
Cherry Zhang [Wed, 8 May 2019 23:06:52 +0000 (23:06 +0000)]
compiler: avoid copy for string([]byte) conversion used in string comparison
If a string([]byte) conversion is used immediately in a string
comparison, we don't need to copy the backing store of the byte
slice, as the string comparison doesn't hold any reference to
it. Instead, just create a string header from the byte slice and
pass it for comparison.
A new type of expression, String_value_expression, is introduced,
for constructing string headers.
Thomas Koenig [Wed, 8 May 2019 21:55:13 +0000 (21:55 +0000)]
re PR fortran/90351 (-fc-prototypes does not dump prototypes for external procedures)
2019-05-08 Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/90351
PR fortran/90329
* gfortran.dg/dump-parse-tree.c: Include version.h.
(gfc_dump_external_c_prototypes): New function.
(get_c_type_name): Select "char" as a name for a simple char.
Adjust to handling external functions. Also handle complex.
(write_decl): Add argument bind_c. Adjust for dumping of external
procedures.
(write_proc): Likewise.
(write_interop_decl): Add bind_c argument to call of write_proc.
* gfortran.h: Add prototype for gfc_dump_external_c_prototypes.
* lang.opt: Add -fc-prototypes-external flag.
* parse.c (gfc_parse_file): Move dumping of BIND(C) prototypes.
Call gfc_dump_external_c_prototypes if option is set.
* invoke.texi: Document -fc-prototypes-external.
compiler: generate memmove for non-pointer slice copy
The builtin copy function is lowered to runtime functions
slicecopy, stringslicecopy, or typedslicecopy. The first two are
basically thin wrappers of memmove. Instead of making a runtime
call, we can just use __builtin_memmove. This gives the compiler
backend opportunities for further optimizations.
Move the lowering of builtin copy function to flatten phase for
the ease of rewriting.
Also do this optimization for the copy part of append(s1, s2...).
Jakub Jelinek [Wed, 8 May 2019 17:06:46 +0000 (19:06 +0200)]
re PR c++/59813 (tail-call elimination didn't fire for left-shift of char to cout)
PR c++/59813
PR tree-optimization/89060
* tree-ssa-live.h (live_vars_map): New typedef.
(compute_live_vars, live_vars_at_stmt, destroy_live_vars): Declare.
* tree-ssa-live.c: Include gimple-walk.h and cfganal.h.
(struct compute_live_vars_data): New type.
(compute_live_vars_visit, compute_live_vars_1, compute_live_vars,
live_vars_at_stmt, destroy_live_vars): New functions.
* tree-tailcall.c: Include tree-ssa-live.h.
(live_vars, live_vars_vec): New global variables.
(find_tail_calls): Perform variable life analysis before punting.
(tree_optimize_tail_calls_1): Clean up live_vars and live_vars_vec.
* tree-inline.h (struct copy_body_data): Add eh_landing_pad_dest
member.
* tree-inline.c (add_clobbers_to_eh_landing_pad): Remove BB argument.
Perform variable life analysis to select variables that really need
clobbers added.
(copy_edges_for_bb): Don't call add_clobbers_to_eh_landing_pad here,
instead set id->eh_landing_pad_dest and assert it is the same.
(copy_cfg_body): Call it here if id->eh_landing_pad_dest is non-NULL.
This patch fixes a problem with the thumb1 prologue code where the link
register could be unconditionally used as a scratch register even if the
return value was still live at the end of the prologue.
Additionally, the patch improves the code generated when we are not
using many low call-saved registers to make use of any unused call
clobbered registers to help with the saving of high registers that
cannot be pushed directly (quite rare in normal code as the register
allocator correctly prefers low registers).
2019-05-08 Mihail Ionescu <mihail.ionescu@arm.com>
Richard Earnshaw <rearnsha@arm.com>
gcc:
PR target/88167
* config/arm/arm.c (thumb1_prologue_unused_call_clobbered_lo_regs): New
function.
(thumb1_epilogue_unused_call_clobbered_lo_regs): New function.
(thumb1_compute_save_core_reg_mask): Don't force a spare work
register if both the epilogue and prologue can use call-clobbered
regs.
(thumb1_unexpanded_epilogue): Use
thumb1_epilogue_unused_call_clobbered_lo_regs. Reverse the logic for
picking temporaries for restoring high regs to match that of the
prologue where possible.
(thumb1_expand_prologue): Add any usable call-clobbered low registers to
the list of work registers. Detect if the return address is still live
at the end of the prologue and avoid using it for a work register if so.
If the return address is not live, add LR to the list of pushable regs
after the first pass.
gcc/testsuite:
PR target/88167
* gcc.target/arm/pr88167-1.c: New test.
* gcc.target/arm/pr88167-2.c: New test.
Co-Authored-By: Richard Earnshaw <rearnsha@arm.com>
From-SVN: r271012
Bin Cheng [Wed, 8 May 2019 11:37:45 +0000 (11:37 +0000)]
re PR tree-optimization/90078 (ICE with deep templates caused by overflow)
PR tree-optimization/90078
* tree-ssa-loop-ivopts.c (INFTY): Increase value for infinite cost.
(struct comp_cost): Promote type of members to int64_t.
(infinite_cost): Don't set complexity in initialization.
(comp_cost::operator +,-,+=,-+,/=,*=): Assert when cost computation
overflows to infinite_cost.
(adjust_setup_cost): Promote type of parameter and cost computation
to int64_t.
(struct ainc_cost_data, struct iv_ca): Promote type of member to
int64_t.
(get_scaled_computation_cost_at, determine_iv_cost): Promote type of
cost computation to int64_t.
(determine_group_iv_costs, iv_ca_dump, find_optimal_iv_set): Use
int64_t's format specifier in dump.
gcc/testsuite
* g++.dg/tree-ssa/pr90078.C: New test.