Oleg Endo [Tue, 31 May 2016 15:06:25 +0000 (15:06 +0000)]
Fix wrong-code issues of RX atomic operations.
gcc/
* config/rx/rx.md (FETCHOP_NO_MINUS): New code iterator.
(atomic_<fetchop_name>_fetchsi): Extract minus operator into ...
(atomic_sub_fetchsi): ... this new pattern.
(mvtc): Add CC_REG clobber.
Roger Sayle [Tue, 31 May 2016 11:30:56 +0000 (11:30 +0000)]
builtins.c (java_builtins): Use popcount* and bswap* builtins to implement bitCount() and reverseBytes()...
2016-05-31 Roger Sayle <roger@nextmovesoftware.com>
gcc/java:
* builtins.c (java_builtins): Use popcount* and bswap* builtins to
implement bitCount() and reverseBytes() methods in java.lang.Integer
and friends.
(initialize_builtins): Annotate math builtins with ECF_LEAF. Call
define_builtin for the new popcount* and bswap* builtins.
libjava:
* testsuite/libjava.lang/BuiltinBitCount.java: New test case.
* testsuite/libjava.lang/BuiltinReverseBytes.java: Likewise.
Jan Hubicka [Tue, 31 May 2016 10:42:59 +0000 (12:42 +0200)]
loop-init.c (gate): Do not enale RTL loop unroller with -fpeel-loops.
* loop-init.c (gate): Do not enale RTL loop unroller with -fpeel-loops.
It no longer does that.
* toplev.c (process_options): Do not enable flag_web with -fpeel-loops.
Martin Sebor [Mon, 30 May 2016 22:56:43 +0000 (22:56 +0000)]
PR c++/71306 - bogus -Wplacement-new with an array element
gcc/cp/ChangeLog:
2016-05-27 Martin Sebor <msebor@redhat.com>
PR c++/71306
* init.c (warn_placement_new_too_small): Handle placement new arguments
that are elements of arrays more carefully. Remove a pointless loop.
gcc/testsuite/ChangeLog:
2016-05-27 Martin Sebor <msebor@redhat.com>
PR c++/71306
* g++.dg/warn/Wplacement-new-size-3.C: New test.
Jakub Jelinek [Mon, 30 May 2016 21:36:24 +0000 (23:36 +0200)]
re PR c++/71349 (Combined async target clause parsing issues)
PR c++/71349
* c-parser.c (c_parser_omp_for): Don't disallow nowait clause
when combined with target construct.
* parser.c (cp_parser_omp_for): Don't disallow nowait clause
when combined with target construct.
(cp_parser_omp_parallel): Pass cclauses == NULL as last argument
to cp_parser_omp_all_clauses.
* c-omp.c (c_omp_split_clauses): Put OMP_CLAUSE_DEPEND to
C_OMP_CLAUSE_SPLIT_TARGET. Put OMP_CLAUSE_NOWAIT to
C_OMP_CLAUSE_SPLIT_TARGET if combined with target construct,
instead of C_OMP_CLAUSE_SPLIT_FOR.
Paolo Carlini [Mon, 30 May 2016 19:18:13 +0000 (19:18 +0000)]
re PR c++/71238 (Undeclared function message imprecisely points to error column)
/cp
2016-05-30 Paolo Carlini <paolo.carlini@oracle.com>
PR c++/71238
* lex.c (unqualified_name_lookup_error): Take a location too.
(unqualified_fn_lookup_error): Take a cp_expr.
* cp-tree.h (unqualified_name_lookup_error,
unqualified_fn_lookup_error): Adjust declarations.
* semantics.c (perform_koenig_lookup): Adjust
unqualified_fn_lookup_error call, pass the location of
the identifier too as part of a cp_expr.
/testsuite
2016-05-30 Paolo Carlini <paolo.carlini@oracle.com>
Andi Kleen [Mon, 30 May 2016 18:13:12 +0000 (18:13 +0000)]
Don't cause ICEs when auto profile file is not found with checking
Currently, on a checking enabled compiler when -fauto-profile does
not find the profile feedback file it errors out with assertation
failures. Add proper errors for this case.
gcc/:
2016-05-30 Andi Kleen <ak@linux.intel.com>
* auto-profile.c (read_profile): Replace asserts with errors
when file does not exist.
* gcov-io.c (gcov_read_words): Dito.
Martin Liska [Mon, 30 May 2016 16:04:50 +0000 (18:04 +0200)]
Add profiling support for IVOPTS
* tree-ssa-loop-ivopts.c (get_computation_cost_at): Scale
computed costs by frequency of BB they belong to.
(get_scaled_computation_cost_at): New function.
PR middle-end/71269
PR middle-end/71252
* tree-ssa-reassoc.c (insert_stmt_before_use): Use find_insert_point so
that inserted stmt will not dominate stmts that defines its operand.
(rewrite_expr_tree): Add stmt_to_insert before adding the use stmt.
(rewrite_expr_tree_parallel): Likewise.
Eric Botcazou [Mon, 30 May 2016 08:48:17 +0000 (08:48 +0000)]
visium.c (visium_split_double_add): Minor tweaks.
* config/visium/visium.c (visium_split_double_add): Minor tweaks.
(visium_expand_copysign): Use gen_int_mode directly.
(visium_compute_frame_size): Minor tweaks.
Alan Modra [Sat, 28 May 2016 00:22:56 +0000 (09:52 +0930)]
ira.c bb_loop_depth again
Follow the same practice as other places in ira.c, where
free_dominance_info is called along with loop_optimizer_finalize. Not
doing so causes an ICE on gcc-5-branch, so avoid that possibility on
trunk.
Jeff Law [Fri, 27 May 2016 16:32:38 +0000 (10:32 -0600)]
tree-ssa-threadedge.c: Remove include of tree-ssa-threadbackward.h.
* tree-ssa-threadedge.c: Remove include of tree-ssa-threadbackward.h.
(thread_across_edge): Remove calls to find_jump_threads_backwards.
* passes.def: Add jump threading passes before DOM/VRP.
* tree-ssa-threadbackward.c (find_jump_threads_backwards): Change
argument to a basic block from an edge. Remove tests which are
handled elsewhere.
(pass_data_thread_jumps, class pass_thread_jumps): New.
(pass_thread_jumps::gate, pass_thread_jumps::execute): New.
(make_pass_thread_jumps): Likewise.
* tree-pass.h (make_pass_thread_jumps): Declare.
* config/visium/visium-protos.h (split_double_move): Rename into...
(visium_split_double_move): ...this.
(visium_split_double_add): Declare.
* config/visium/visium.c (split_double_move): Rename into...
(visium_split_double_move): ...this.
(visium_split_double_add): New function.
(visium_expand_copysign): Renumber operands for consistency.
* config/visium/visium.md (DImode move splitter): Adjust to renaming.
(DFmode move splitter): Likewise.
(*addi3_insn): Split by means of visium_split_double_add.
(*adddi3_insn_flags): Delete.
(*plus_plus_sltu<subst_arith>): New insn.
(*subdi3_insn): Split by means of visium_split_double_add.
(subdi3_insn_flags): Delete.
(*minus_minus_sltu<subst_arith>): New insn.
(*negdi2_insn): Split by means of visium_split_double_add.
(*negdi2_insn_flags): Delete.
re PR c++/69855 (Missing diagnostic for overload that only differs by return type)
/cp
PR c++/69855
* name-lookup.c (pushdecl_maybe_friend_1): Push local function
decls into the global scope after stripping template bits
and setting DECL_ANTICIPATED.
Wilco Dijkstra [Fri, 27 May 2016 12:15:47 +0000 (12:15 +0000)]
Remove aarch64_cannot_change_mode_class as the underlying issue (PR67609) has been resolved.
Remove aarch64_cannot_change_mode_class as the underlying issue
(PR67609) has been resolved. This avoids a few unnecessary lane
widening operations like:
Michael Meissner [Thu, 26 May 2016 21:38:19 +0000 (21:38 +0000)]
rs6000.c (rs6000_emit_p9_fp_minmax): New function for ISA 3.0 min/max support.
[gcc]
2016-05-26 Michael Meissner <meissner@linux.vnet.ibm.com>
* config/rs6000/rs6000.c (rs6000_emit_p9_fp_minmax): New function
for ISA 3.0 min/max support.
(rs6000_emit_p9_fp_cmove): New function for ISA 3.0 floating point
conditional move support.
(rs6000_emit_cmove): Call rs6000_emit_p9_fp_minmax and
rs6000_emit_p9_fp_cmove if the ISA 3.0 instructions are
available.
* config/rs6000/rs6000.md (SFDF2): New iterator to allow doing
conditional moves where the comparison type is different from move
type.
(fp_minmax): New code iterator for smin/smax.
(minmax): New code attributes for min/max.
(SMINMAX): Likewise.
(smax<mode>3): Combine min, max insns into one insn using the
fp_minmax code iterator. Add support for ISA 3.0 min/max
instructions that don't need -ffast-math.
(s<minmax><mode>3): Likewise.
(smax<mode>3_vsx): Likewise.
(smin<mode>3): Likewise.
(s<minmax><mode>3_vsx): Likewise.
(smin<mode>3_vsx): Likewise.
(pre-VSX min/max splitters): Likewise.
(s<minmax><mode>3_fpr): Likewise.
(movsfcc): Rewrite floating point conditional moves to combine
SFmode/DFmode into a single insn.
(mov<mode>cc): Likewise.
(movdfcc): Likewise.
(fselsfsf4): Combine FSEL cases into a single insn, using SFDF and
SFDF2 iterators to handle all combinations.
(fseldfsf4): Likewise.
(fsel<SFDF:mode><SFDF2:mode>4): Likewise.
(fseldfdf4): Likewise.
(fselsfdf4): Likewise.
(mov<SFDF:mode><SFDF2:mode>cc_p9): Add support for the ISA 3.0
comparison instructions that set a 0/-1 mask, and use it for
floating point conditional move via XXSEL.
(fpmask<mode>): Likewise.
(xxsel<mode>): Likewise.
* config/rs6000/predicates.md (min_max_operator): Delete, no
longer used.
(fpmask_comparison_operaton): New insn for ISA 3.0 comparison
instructions that generate a 0/-1 mask for use with XXSEL.
* config/rs6000/rs6000.h (TARGET_MINMAX_SF): New helper macros to
say whether floating point min/max is available, either through
FSEL, ISA 2.06 min/max, and ISA 3.0 min/max instrucitons.
(TARGET_MINMAX_DF): Likewise.
[gcc/testsuite]
2016-05-26 Michael Meissner <meissner@linux.vnet.ibm.com>
* gcc.target/powerpc/p9-minmax-1.c: New tests for ISA 3.0
floating point min/max/comparison instructions.
* gcc.target/powerpc/p9-minmax-2.c: Likewise.
Patrick Palka [Thu, 26 May 2016 18:17:43 +0000 (18:17 +0000)]
Fix PR c++/70822 (bogus error with parenthesized SCOPE_REF)
gcc/cp/ChangeLog:
PR c++/70822
PR c++/70106
* cp-tree.h (REF_PARENTHESIZED_P): Make this flag apply to
SCOPE_REFs too.
* pt.c (tsubst_qualified_id): If REF_PARENTHESIZED_P is set
on the qualified_id then propagate it to the resulting
expression.
(do_auto_deduction): Check REF_PARENTHESIZED_P on SCOPE_REFs
too.
* semantics.c (force_paren_expr): If given a SCOPE_REF, just set
its REF_PARENTHESIZED_P flag.
gcc/testsuite/ChangeLog:
PR c++/70822
PR c++/70106
* g++.dg/cpp1y/auto-fn32.C: New test.
* g++.dg/cpp1y/paren4.C: New test.
tree-ssa-loop-ivopts.c:loop_body_includes_call was treating internal
calls such as IFN_SQRT as clobbering all caller-saved registers, which
I don't think is appropriate for any current internal function.
Tested on aarch64-linux-gnu and x86_64-linux-gnu.
gcc/
* tree-ssa-loop-ivopts.c (loop_body_includes_call): Don't assume
that internal functions will clobber all caller-saved registers.
Wilco Dijkstra [Thu, 26 May 2016 12:25:51 +0000 (12:25 +0000)]
GCC expands switch statements in a very simplistic way and tries to use a table...
GCC expands switch statements in a very simplistic way and tries to use a table
expansion even when it is a bad idea for performance or codesize.
GCC typically emits extremely sparse tables that contain mostly default entries
(something which currently cannot be tuned by backends). Additionally the
computation of the minimum/maximum label offsets is too simplistic so the
tables are often twice as large as necessary.
The cost of a table switch is significant due to the setup overhead, the table
lookup (which due to being sparse and large adds unnecessary cachemisses)
and hard to predict indirect jump. Therefore it is best to avoid using a table
unless there are many real case labels.
This patch fixes that by setting the default aarch64_case_values_threshold to
16 when the per-CPU tuning is not set. On SPEC2006 this improves the switch
heavy benchmarks GCC and perlbench both in performance (1-2%) as well as size
(0.5-1% smaller).
gcc/
* config/aarch64/aarch64.c (aarch64_case_values_threshold):
Return a better case_values_threshold when optimizing.
* target.c (gomp_device_copy): New function.
(gomp_copy_host2dev): Likewise.
(gomp_copy_dev2host): Likewise.
(gomp_free_device_memory): Likewise.
(gomp_map_vars_existing): Adjust to call gomp_copy_host2dev.
(gomp_map_pointer): Likewise.
(gomp_map_vars): Adjust to call gomp_copy_host2dev, handle
NULL value from alloc_func plugin hook.
(gomp_unmap_tgt): Adjust to call gomp_free_device_memory.
(gomp_copy_from_async): Adjust to call gomp_copy_dev2host.
(gomp_unmap_vars): Likewise.
(gomp_update): Adjust to call gomp_copy_dev2host and
gomp_copy_host2dev functions.
(gomp_unload_image_from_device): Handle false value from
unload_image_func plugin hook.
(gomp_init_device): Handle false value from init_device_func
plugin hook.
(gomp_exit_data): Adjust to call gomp_copy_dev2host.
(omp_target_free): Adjust to call gomp_free_device_memory.
(omp_target_memcpy): Handle return values from host2dev_func,
dev2host_func, and dev2dev_func plugin hooks.
(omp_target_memcpy_rect_worker): Likewise.
(gomp_target_fini): Handle false value from fini_device_func
plugin hook.
* libgomp.h (struct gomp_device_descr): Adjust return type of
init_device_func, fini_device_func, unload_image_func, free_func,
dev2host_func,host2dev_func, and dev2dev_func plugin hooks to 'bool'.
* oacc-init.c (acc_shutdown_1): Handle false value from
fini_device_func plugin hook.
* oacc-host.c (host_init_device): Change return type to bool.
(host_fini_device): Likewise.
(host_unload_image): Likewise.
(host_free): Likewise.
(host_dev2host): Likewise.
(host_host2dev): Likewise.
* oacc-mem.c (acc_free): Handle plugin hook fatal error case.
(acc_memcpy_to_device): Likewise.
(acc_memcpy_from_device): Likewise.
(delete_copyout): Add libfnname parameter, handle free_func
hook fatal error case.
(acc_delete): Adjust delete_copyout call.
(acc_copyout): Likewise.
(update_dev_host): Move gomp_mutex_unlock to after
host2dev/dev2host hook calls.
* plugin/plugin-hsa.c (hsa_warn): Adjust 'hsa_error' local variable
to 'hsa_error_msg', for clarity.
(hsa_fatal): Likewise.
(hsa_error): New function.
(init_hsa_context): Change return type to bool, adjust to return
false on error.
(GOMP_OFFLOAD_get_num_devices): Adjust to handle init_hsa_context
return value.
(GOMP_OFFLOAD_init_device): Change return type to bool, adjust to
return false on error.
(get_agent_info): Adjust to return NULL on error.
(destroy_hsa_program): Change return type to bool, adjust to
return false on error.
(GOMP_OFFLOAD_load_image): Adjust to return -1 on error.
(destroy_module): Change return type to bool, adjust to
return false on error.
(GOMP_OFFLOAD_unload_image): Likewise.
(GOMP_OFFLOAD_fini_device): Likewise.
(GOMP_OFFLOAD_alloc): Change to return NULL when called.
(GOMP_OFFLOAD_free): Change to return false when called.
(GOMP_OFFLOAD_dev2host): Likewise.
(GOMP_OFFLOAD_host2dev): Likewise.
(GOMP_OFFLOAD_dev2dev): Likewise.
* plugin/plugin-nvptx.c (CUDA_CALL_ERET): New convenience macro.
(CUDA_CALL): Likewise.
(CUDA_CALL_ASSERT): Likewise.
(map_init): Change return type to bool, use CUDA_CALL* macros.
(map_fini): Likewise.
(init_streams_for_device): Change return type to bool, adjust
call to map_init.
(fini_streams_for_device): Change return type to bool, adjust
call to map_fini.
(select_stream_for_async): Release stream_lock before calls to
GOMP_PLUGIN_fatal, adjust call to map_init.
(nvptx_init): Use CUDA_CALL* macros.
(nvptx_attach_host_thread_to_device): Change return type to bool,
use CUDA_CALL* macros.
(nvptx_open_device): Use CUDA_CALL* macros.
(nvptx_close_device): Change return type to bool, use CUDA_CALL*
macros.
(nvptx_get_num_devices): Use CUDA_CALL* macros.
(link_ptx): Change return type to bool, use CUDA_CALL* macros.
(nvptx_exec): Use CUDA_CALL* macros.
(nvptx_alloc): Use CUDA_CALL* macros.
(nvptx_free): Change return type to bool, use CUDA_CALL* macros.
(nvptx_host2dev): Likewise.
(nvptx_dev2host): Likewise.
(nvptx_wait): Use CUDA_CALL* macros.
(nvptx_wait_async): Likewise.
(nvptx_wait_all): Likewise.
(nvptx_wait_all_async): Likewise.
(nvptx_set_cuda_stream): Adjust order of stream_lock acquire,
use CUDA_CALL* macros, adjust call to map_fini.
(GOMP_OFFLOAD_init_device): Change return type to bool,
adjust code accordingly.
(GOMP_OFFLOAD_fini_device): Likewise.
(GOMP_OFFLOAD_load_image): Adjust calls to
nvptx_attach_host_thread_to_device/link_ptx to handle errors,
use CUDA_CALL* macros.
(GOMP_OFFLOAD_unload_image): Change return type to bool, adjust
return code.
(GOMP_OFFLOAD_alloc): Adjust calls to code to handle error return.
(GOMP_OFFLOAD_free): Change return type to bool, adjust calls to
handle error return.
(GOMP_OFFLOAD_dev2host): Likewise.
(GOMP_OFFLOAD_host2dev): Likewise.
(GOMP_OFFLOAD_openacc_register_async_cleanup): Use CUDA_CALL* macros.
(GOMP_OFFLOAD_openacc_create_thread_data): Likewise.