This patch makes genmatch match calls based on combined_fn rather
than built_in_function and extends the matching to internal functions.
It also uses fold_const_call to fold the calls to a constant, rather
than going through fold_builtin_n.
In order to slightly simplify the code and remove potential
ambiguity, the patch enforces lower case for tree codes
(foo->FOO_EXPR), caps for functions (no built_in_hypot->BUILT_IN_HYPOT)
and requires an exact match for user-defined identifiers. The first two
were already met in practice but there were a couple of cases where
operator lists were defined in one case and used in another.
Tested on x86_64-linux-gnu, aarch64-linux-gnu and arm-linux-gnueabi.
gcc/
* match.pd: Use HYPOT and COS rather than hypot and cos.
Use CASE_CFN_* macros. Guard log/exp folds with
SCALAR_FLOAT_TYPE_P.
* genmatch.c (internal_fn): New enum.
(fn_id::fn): Change to an unsigned int.
(fn_id::fn_id): Accept internal_fn too.
(add_builtin): Rename to...
(add_function): ...this and turn into a template.
(get_operator): Only try one variation if the original name fails.
Only add _EXPR if the original name was all lower case.
Try converting internal and built-in function names to their
CFN equivalents.
(expr::gen_transform): Use maybe_build_call_expr_loc for generic.
(dt_simplify::gen_1): Likewise.
(dt_node::gen_kids_1): Use gimple_call_combined_fn for gimple
and get_call_combined_fn for generic.
(dt_simplify::gen): Use combined_fn as the type of fn_ids.
(decision_tree::gen): Likewise.
(main): Use lower case in the strings for {VIEW_,}CONVERT[012].
Use add_function rather than add_builtin. Register internal
functions too.
* generic-match-head.c: Include case-cfn-macros.h.
* gimple-fold.c (replace_stmt_with_simplification): Use
gimple_call_combined_fn to test whether we can keep an
existing call.
* gimple-match.h (code_helper): Replace built_in_function
with combined_fn.
* gimple-match-head.c: Include fold-const-call.h, internal-fn.h
and case-fn-macros.h.
(gimple_resimplify1): Use fold_const_call.
(gimple_resimplify2, gimple_resimplify3): Likewise.
(build_call_internal, build_call): New functions.
(maybe_push_res_to_seq): Use them.
(gimple_simplify): Use fold_const_call. Set *rcode to a combined_fn
rather than a built-in function.
* tree.h (build_call_expr_internal_loc): Declare.
(maybe_build_call_expr_loc): Likewise.
* tree.c (build_call_expr_internal_loc_array): New function.
(maybe_build_call_expr_loc): Likewise.
This patch extends mathfn_built_in to handle combined_fn, but keeps the
old built_in_function interface around since it's a common case.
Tested on x86_64-linux-gnu, aarch64-linux-gnu and arm-linux-gnueabi.
gcc/
* builtins.h (mathfn_built_in): Add a variant that takes
a combined_fn.
* builtins.c: Include case-cfn-macros.h.
(CASE_MATHFN): Use CASE_CFN_*.
(CASE_MATHFN_REENT): Use CFN_ codes.
(mathfn_built_in_2, mathfn_built_in_1): Replace built_in_function
argument with a combined_fn.
(mathfn_built_in): Add a variant that takes a combined_fn.
(expand_builtin_int_roundingfn_2): Update callers accordingly.
(fold_builtin_sincos, fold_builtin_classify): Likewise.
Another patch to extend uses of built_in_function to combined_fn,
this time in tree-vect-patterns.c. The old code didn't handle the
long double pow variants, but I think that's because noone had a target
that would benefit rather than because the code would mishandle them.
Tested on x86_64-linux-gnu, aarch64-linux-gnu and arm-linux-gnueabi.
gcc/
* tree-vect-patterns.c: Include case-cfn-macros.h.
(vect_recog_pow_pattern): Use combined_fn instead of built-in codes.
Another patch to extend uses of built_in_function to combined_fn, this time
in tree-ssa-math-opts.c.
Tested on x86_64-linux-gnu, aarch64-linux-gnu and arm-linux-gnueabi.
gcc/
* tree-ssa-math-opts.c: Include case-cfn-macros.h.
(execute_cse_sincos_1): Use combined_fn instead of built-in codes.
(pass_cse_sincos::execute): Likewise.
Another patch to extend uses of built_in_function to combined_fn, this time
in tree-vrp.c.
Tested on x86_64-linux-gnu, aarch64-linux-gnu and arm-linux-gnueabi.
gcc/
* tree-vrp.c: Include case-cfn-macros.h.
(extract_range_basic): Switch on combined_fn rather than handling
built-in functions and internal functions separately.
This patch generalises fold-const.[hc] routines to use combined_fn
instead of built_in_function. It also updates gimple-ssa-backprop,c
since the update is simple and it avoids churn on the call to
negate_mathfn_p.
Tested on x86_64-linux-gnu, aarch64-linux-gnu and arm-linux-gnueabi.
gcc/
* fold-const.h (negate_mathfn_p): Take a combined_fn rather
than a built_in_function.
(tree_call_nonnegative_warnv_p): Take a combined_fn rather than
a function decl.
(integer_valued_real_call_p): Likewise.
* fold-const.c: Include case-cfn-macros.h
(negate_mathfn_p): Take a combined_fn rather than a built_in_function.
(negate_expr_p): Update accordingly.
(tree_call_nonnegative_warnv_p): Take a combined_fn rather than
a function decl.
(integer_valued_real_call_p): Likewise.
(tree_invalid_nonnegative_warnv_p): Update accordingly.
(integer_valued_real_p): Likewise.
* gimple-fold.c (gimple_call_nonnegative_warnv_p): Update call
to tree_call_nonnegative_warnv_p.
(gimple_call_integer_valued_real_p): Likewise
integer_valued_real_call_p.
* gimple-ssa-backprop.c: Include case-cfn-macros.h.
(backprop::process_builtin_call_use): Extend to combined_fn.
(strip_sign_op_1): Likewise.
(backprop::process_use): Don't check for built-in calls here.
(backprop::execute): Likewise.
(backprop::optimize_builtin_call): Update call to negate_mathfn_p.
This patch automatically generates case macros such as:
CASE_CFN_SQRT
for each {F,,L} floating-point built-in function and each {,L,LL,IMAX}
integer built-in function. The macros match the same built-in
functions as CASE_FLT_FN and CASE_INT_FN but in addition include
the associated internal function, if any.
The idea is to make sure that users of combined_fn don't need to know
which built-in functions have internal-function equivalents. If we add
a new function to internal-fn.def, all combined_fn users should pick it
up automatically.
The generator wants to use "hash_set <nofree_string_hash>",
so the patch follows hash_map in using the types given by the
traits as the key. This is a no-op for current users of hash_set.
Tested on x86_64-linux-gnu, aarch64-linux-gnu and arm-linux-gnueabi.
gcc/
* Makefile.in (HASH_TABLE_H): Add GGC_H.
(MOSTLYCLEANFILES, generated_files): Add case-fn-macros.h.
(s-case-cfn-macros, case-cfn-macros.h, build/gencfn-macros.o)
(build/gencfn-macros$(build_exeext): New rules.
(genprogerr): Add cfn-macros.
* hash-set.h (hash_set): Use the traits value_type as the key.
* gencfn-macros.c: New file.
This patch adds internal function equivalents of all the INT_FN functions.
Unlike the math functions, these functions never set errno and the internal
functions should be exactly equivalent to the built-in ones. The reason
for defining the internal functions is so that we can extend the
functionality to other modes, in particular vector modes.
Tested on x86_64-linux-gnu, aarch64-linux-gnu and arm-linux-gnueabi.
gcc/
* internal-fn.def (DEF_INTERNAL_INT_FN): New macro.
(CLRSB, CLZ, CTZ, FFS, PARITY, POPCOUNT): New functions.
* builtins.c (associated_internal_fn): Handle them.
This patch adds internal functions for simple FLT_FN built-in functions,
in cases where an associated optab already exists. Unlike some of the
built-in functions, these internal functions never set errno.
LDEXP is an odd-one out in that its second operand is an integer.
All the others operate on uniform types.
The patch also adds a function to query the internal function associated
with a built-in function (if any), and another to test whether a given
gcall could be replaced by a call to an internal function on the current
target (as long as the caller deals with errno appropriately).
Tested on x86_64-linux-gnu, aarch64-linux-gnu and arm-linux-gnueabi.
gcc/
* builtins.h (associated_internal_fn): Declare.
(replacement_internal_fn): Likewise.
* builtins.c: Include internal-fn.h
(associated_internal_fn, replacement_internal_fn): New functions.
* internal-fn.def (DEF_INTERNAL_FLT_FN): New macro.
(ACOS, ASIN, ATAN, COS, EXP, EXP10, EXP2, EXPM1, LOG, LOG10, LOG1P)
(LOG2, LOGB, SIGNIFICAND, SIN, SQRT, TAN, CEIL, FLOOR, NEARBYINT)
(RINT, ROUND, TRUNC, ATAN2, COPYSIGN, FMOD, POW, REMAINDER, SCALB)
(LDEXP): New functions.
* internal-fn.c: Include recog.h.
(unary_direct, binary_direct): New macros.
(expand_direct_optab_fn): New function.
(expand_unary_optab_fn): New macro.
(expand_binary_optab_fn): Likewise.
(direct_unary_optab_supported_p): Likewise.
(direct_binary_optab_supported_p): Likewise.
Add basic support for direct_optab internal functions
This patch adds a concept of internal functions that map directly to an
optab (here called "direct internal functions"). The function can only
be used if the associated optab can be used.
so the patch converts them to the new infrastructure. These four
all need different types of optabs, but future patches will add
regular unary and binary ones.
In general we need one or two modes to decide whether an optab is
supported, depending on whether it's a convert_optab or not.
This in turn means that we need up to two types to decide whether
an internal function is supported. The patch records which types
are needed for each internal function, using -1 if the return type
should be used and N>=0 if the type of argument N should be used.
(LOAD_LANES and STORE_LANES are unusual in that both optab modes
come from the same array type.)
Tested on x86_64-linux-gnu, aarch64-linux-gnu and arm-linux-gnueabi.
gcc/
* coretypes.h (tree_pair): New type.
* internal-fn.def (DEF_INTERNAL_OPTAB_FN): New macro. Use it
for MASK_LOAD, LOAD_LANES, MASK_STORE and STORE_LANES.
* internal-fn.h (direct_internal_fn_info): New structure.
(direct_internal_fn_array): Declare.
(direct_internal_fn_p, direct_internal_fn): New functions.
(direct_internal_fn_types, direct_internal_fn_supported_p): Declare.
* internal-fn.c (not_direct, mask_load_direct, load_lanes_direct)
(mask_store_direct, store_lanes_direct): New macros.
(direct_internal_fn_array) New array.
(get_multi_vector_move): Return the optab handler without asserting
that it is available.
(expand_LOAD_LANES): Rename to...
(expand_load_lanes_optab_fn): ...this and add an optab argument.
(expand_STORE_LANES): Rename to...
(expand_store_lanes_optab_fn): ...this and add an optab argument.
(expand_MASK_LOAD): Rename to...
(expand_mask_load_optab_fn): ...this and add an optab argument.
(expand_MASK_STORE): Rename to...
(expand_mask_store_optab_fn): ...this and add an optab argument.
(direct_internal_fn_types, direct_optab_supported_p)
(multi_vector_optab_supported_p, direct_internal_fn_supported_p)
(direct_internal_fn_supported_p): New functions.
(direct_mask_load_optab_supported_p): New macro.
(direct_load_lanes_optab_supported_p): Likewise.
(direct_mask_store_optab_supported_p): Likewise.
(direct_store_lanes_optab_supported_p): Likewise.
I'm working on a patch series that needs to be able to treat built-in
functions and internal functions in a similar way. This patch adds a
new enum, combined_fn, that combines the two together. It also adds
utility functions for seeing which combined_fn (if any) is called by
a given CALL_EXPR or gcall.
Tested on x86_64-linux-gnu, aarch64-linux-gnu and arm-linux-gnueabi.
gcc/
* tree-core.h (internal_fn): Move immediately after the definition
of built_in_function.
(combined_fn): New enum.
* tree.h (as_combined_fn, builtin_fn_p, as_builtin_fn)
(internal_fn_p, as_internal_fn): New functions.
(get_call_combined_fn, combined_fn_name): Declare.
* tree.c (get_call_combined_fn): New function.
(combined_fn_name): Likewise.
* gimple.h (gimple_call_combined_fn): Declare.
* gimple.c (gimple_call_combined_fn): New function.
PR target/56036
* doc/invoke.texi (Option Summary): Add -mms-bitfields to x86
option list.
(x86 Options): Add -mms-bitfields and -mno-ms-bitfields. Move
discussion of the Microsoft structure layout details here from
its former home in extend.texi.
* doc/extend.texi (x86 Variable Attributes): Replace detailed
discussion with pointer to its new location. Add cross-reference
to corresponding type attributes.
(x86 Type Attributes): Add cross-references to command-line options
and variable attributes.
Kyrylo Tkachov [Tue, 17 Nov 2015 13:20:08 +0000 (13:20 +0000)]
[ARM] PR 68143 Properly update memory offsets when expanding setmem
PR target/68143
* config/arm/arm.c (arm_block_set_unaligned_vect): Keep track of
offset from dstbase and use it appropriately in
adjust_automodify_address.
(arm_block_set_aligned_vect): Likewise.
* tree-if-conv.c: Include varasm.h
(ref_DR_map): Define.
(baseref_DR_map): Like wise
(struct ifc_dr): Add new tree predicate field.
(hash_memrefs_baserefs_and_store_DRs_read_written_info): New function.
(memrefs_read_or_written_unconditionally): Remove.
(write_memrefs_written_at_least_once): Remove.
(ifcvt_memrefs_wont_trap): Use hash maps to query
unconditional read/written information.
(if_convertible_loop_p_1): Initialize hash maps and predicates
before hashing data references and delete hashmaps at the end.
Michael Meissner [Mon, 16 Nov 2015 22:13:21 +0000 (22:13 +0000)]
vsx.md (VSX_L): Do not include IBM extended double 128-bit types...
2015-11-16 Michael Meissner <meissner@linux.vnet.ibm.com>
* config/rs6000/vsx.md (VSX_L): Do not include IBM extended double
128-bit types, just types that fit in a single vector.
* config/rs6000/rs6000.md (FMOVE128_GPR): Likewise.
Steven G. Kargl [Mon, 16 Nov 2015 19:15:25 +0000 (19:15 +0000)]
re PR fortran/58027 ("Arithmetic overflow converting ..." in PARAMETER triggers an ICE)
2015-11-16 Steven G. Kargl <kargl@gcc.gnu.org>
PR fortran/58027
PR fortran/60993
* expr.c (gfc_check_init_expr): Prevent a redundant check when a
__convert_* function was inserted into an array constructor.
(gfc_check_assign_symbol): Check for an initialization expression
when a __convert_* was inserted.
2015-11-16 Steven G. Kargl <kargl@gcc.gnu.org>
PR fortran/58027
PR fortran/60993
* gfortran.dg/pr58027.f90: New test.
simplify-rtx: Simplify sign_extend of lshiftrt to zero_extend (PR68330)
Since r230164, in PR68330 combine ends up with a sign_extend of an
lshiftrt by some constant, and it does not know to morph that into a
zero_extract (the extend will always extend with zeroes). I think
it is best to let simplify-rtx always replace such a sign_extend by
a zero_extend, after which everything works as expected.
Richard Biener [Mon, 16 Nov 2015 15:04:00 +0000 (15:04 +0000)]
re PR tree-optimization/68306 (ICE: in vectorizable_store, at tree-vect-stmts.c:5651)
2015-11-16 Richard Biener <rguenther@suse.de>
PR tree-optimization/68306
* tree-vect-data-refs.c (vect_verify_datarefs_alignment): Fix
bogus copying from verify_data_ref_alignment and use continue
instead of return.
Oleg Endo [Mon, 16 Nov 2015 14:11:50 +0000 (14:11 +0000)]
re PR target/68277 ([SH]: error: insn does not satisfy its constraints when compiling erlang)
gcc/
PR target/68277
* config/sh/sh.md (addsi3_scr): Handle reg overlap of operands[0] and
operands[2].
(*addsi3): Add another insn_and_split variant for reload.
Co-Authored-By: Kaz Kojima <kkojima@gcc.gnu.org>
From-SVN: r230425
Tom de Vries [Mon, 16 Nov 2015 12:40:41 +0000 (12:40 +0000)]
Remove first_pass_instance from pass_ccp
2015-11-16 Tom de Vries <tom@codesourcery.com>
* passes.def: Add arg to pass_ccp pass instantiation.
* tree-ssa-ccp.c (ccp_finalize): Add param nonzero_p. Use nonzero_p
instead of first_pass_instance.
(do_ssa_ccp): Add and handle param nonzero_p.
(pass_ccp::pass_ccp): Initialize nonzero_p.
(pass_ccp::set_pass_param): New member function. Set nonzero_p.
(pass_ccp::execute): Call do_ssa_ccp with extra arg.
(pass_ccp::nonzero_p): New private member.
Tom de Vries [Mon, 16 Nov 2015 12:40:33 +0000 (12:40 +0000)]
Remove first_pass_instance from pass_object_sizes
2015-11-16 Tom de Vries <tom@codesourcery.com>
* passes.def: Add arg to pass_object_sizes pass instantiation.
* tree-object-size.c (pass_object_sizes::pass_object_sizes): Initialize
insert_min_max_p.
(pass_object_sizes::set_pass_param): New member function. Set
insert_min_max_p.
(pass_object_sizes::insert_min_max_p): New private member.
(pass_object_sizes::execute): Use insert_min_max_p instead of
first_pass_instance.
Tom de Vries [Mon, 16 Nov 2015 12:40:24 +0000 (12:40 +0000)]
Remove first_pass_instance from pass_dominator
2015-11-16 Tom de Vries <tom@codesourcery.com>
* passes.def: Add arg to pass_dominator pass instantiation.
* tree-pass.h (first_pass_instance): Remove pass_dominator-related bit
of comment.
* tree-ssa-dom.c (pass_dominator::pass_dominator): Initialize
may_peel_loop_headers_p.
(pass_dominator::set_pass_param): New member function. Set
may_peel_loop_headers_p.
(pass_dominator::may_peel_loop_headers_p): New private member.
(pass_dominator::execute): Use may_peel_loop_headers_p instead of
first_pass_instance.
Tom de Vries [Mon, 16 Nov 2015 12:40:14 +0000 (12:40 +0000)]
Remove first_pass_instance from pass_reassoc
2015-11-16 Tom de Vries <tom@codesourcery.com>
* passes.def: Add arg to pass_reassoc pass instantiation.
* tree-ssa-reassoc.c (reassoc_insert_powi_p): New static variable.
(acceptable_pow_call, reassociate_bb): Use reassoc_insert_powi_p instead
of first_pass_instance.
(execute_reassoc): Add and handle insert_powi_p parameter.
(pass_reassoc::insert_powi_p): New private member.
(pass_reassoc::pass_reassoc): Initialize insert_powi_p.
(pass_reassoc::set_pass_param): New member function. Set insert_powi_p.
(pass_reassoc::execute): Call execute_reassoc with extra arg.
Eric Botcazou [Mon, 16 Nov 2015 12:16:54 +0000 (12:16 +0000)]
i386.c (ix86_adjust_stack_and_probe): Adjust and use an lea instruction when possible.
* config/i386/i386.c (ix86_adjust_stack_and_probe): Adjust and use
an lea instruction when possible.
(output_adjust_stack_and_probe): Rotate the loop and simplify.
(ix86_emit_probe_stack_range): Adjust.
(output_probe_stack_range): Rotate the loop and simplify.
Trevor Saunders [Mon, 16 Nov 2015 02:28:15 +0000 (02:28 +0000)]
PR 68366 - include emit-rtl.h in sdbout.c
Some of the pa target macros rely on macros in emit-rtl.h and sdbout.c
uses some of those macros, which means that sdbout.c needs to include
emit-rtl.h.
Paul Thomas [Sun, 15 Nov 2015 14:07:52 +0000 (14:07 +0000)]
re PR fortran/50221 (Allocatable string length fails with array assignment)
2015-11-15 Paul Thomas <pault@gcc.gnu.org>
PR fortran/50221
PR fortran/68216
PR fortran/63932
PR fortran/66408
* trans_array.c (gfc_conv_scalarized_array_ref): Pass the
symbol decl for deferred character length array references.
* trans-stmt.c (gfc_trans_allocate): Keep the string lengths
to update deferred length character string lengths.
* trans-types.c (gfc_get_dtype_rank_type); Use the string
length of deferred character types for the dtype size.
* trans.c (gfc_build_array_ref): For references to deferred
character arrays, use the domain max value, if it is a variable
to set the 'span' and use pointer arithmetic for acces to the
element.
(trans_code): Set gfc_current_locus for diagnostic purposes.
PR fortran/67674
* trans-expr.c (gfc_conv_procedure_call): Do not fix deferred
string lengths of components.
PR fortran/49954
* resolve.c (deferred_op_assign): New function.
(gfc_resolve_code): Call it.
* trans-array.c (concat_str_length): New function.
(gfc_alloc_allocatable_for_assignment): Jump directly to alloc/
realloc blocks for deferred character length arrays because the
string length might change, even if the shape is the same. Call
concat_str_length to obtain the string length for concatenation
since it is needed to compute the lhs string length.
Set the descriptor dtype appropriately for the new string
length.
* trans-expr.c (gfc_trans_assignment_1): Use the rse string
length for all characters, other than deferred types. For
concatenation operators, push the rse.pre block to the inner
most loop so that the temporary pointer and the assignments
are properly placed.
2015-11-15 Paul Thomas <pault@gcc.gnu.org>
PR fortran/50221
* gfortran.dg/deferred_character_1.f90: New test.
* gfortran.dg/deferred_character_4.f90: New test for comment
#4 of the PR.
PR fortran/68216
* gfortran.dg/deferred_character_2.f90: New test.
PR fortran/67674
* gfortran.dg/deferred_character_3.f90: New test.
PR fortran/63932
* gfortran.dg/deferred_character_5.f90: New test.
PR fortran/66408
* gfortran.dg/deferred_character_6.f90: New test.
PR fortran/49954
* gfortran.dg/deferred_character_7.f90: New test.
Andreas Tobler [Sat, 14 Nov 2015 21:17:24 +0000 (22:17 +0100)]
acinclude.m4 (GLIBCXX_ENABLE_CLOCALE): Change locale implementation from darwin to DragonFly.
2015-11-14 Andreas Tobler <andreast@gcc.gnu.org>
* acinclude.m4 (GLIBCXX_ENABLE_CLOCALE): Change locale implementation
from darwin to DragonFly.
* configure: Regenerate.
* config/os/bsd/freebsd/ctype_configure_char.cc: Improve locale
support, do it the same as DragonFly.
* config/os/bsd/freebsd/os_defines.h: Add fine grained C99 defines.
* omp-low.c (lower_omp_ordered): Add argument to GOMP_SMD_ORDERED_*
internal calls - 0 if ordered simd and 1 for ordered threads simd.
* tree-vectorizer.c (adjust_simduid_builtins): If GOMP_SIMD_ORDERED_*
argument is 1, replace it with GOMP_ordered_* call instead of removing
it.
gcc/c/
2015-11-14 Jakub Jelinek <jakub@redhat.com>
* c-typeck.c (c_finish_omp_clauses): Don't mark
GOMP_MAP_FIRSTPRIVATE_POINTER decls addressable.
gcc/cp/
2015-11-14 Jakub Jelinek <jakub@redhat.com>
* semantics.c (finish_omp_clauses): Don't mark
GOMP_MAP_FIRSTPRIVATE_POINTER decls addressable.
libgomp/
2015-11-14 Jakub Jelinek <jakub@redhat.com>
Aldy Hernandez <aldyh@redhat.com>
Ilya Verbin <ilya.verbin@intel.com>
* ordered.c (gomp_doacross_init, GOMP_doacross_post,
GOMP_doacross_wait, gomp_doacross_ull_init, GOMP_doacross_ull_post,
GOMP_doacross_ull_wait): For GFS_GUIDED don't divide number of
iterators or IV by chunk size.
* parallel.c (gomp_resolve_num_threads): Don't assume that
if thr->ts.team is non-NULL, then pool must be non-NULL.
* libgomp-plugin.h (GOMP_PLUGIN_target_task_completion): Declare.
* libgomp.map (GOMP_PLUGIN_1.1): New symbol version, export
GOMP_PLUGIN_target_task_completion.
* Makefile.am (libgomp_la_SOURCES): Add priority_queue.c.
* Makefile.in: Regenerate.
* libgomp.h: Shuffle prototypes and forward definitions around so
priority queues can be defined.
(enum gomp_task_kind): Add GOMP_TASK_ASYNC_RUNNING.
(enum gomp_target_task_state): New enum.
(struct gomp_target_task): Add state, tgt, task and team fields.
(gomp_create_target_task): Change return type to bool, add
state argument.
(gomp_target_task_fn): Change return type to bool.
(struct gomp_device_descr): Add async_run_func.
(struct gomp_task): Remove children, next_child, prev_child,
next_queue, prev_queue, next_taskgroup, prev_taskgroup.
Add pnode field.
(struct gomp_taskgroup): Remove children.
Add taskgroup_queue.
(struct gomp_team): Change task_queue type to a priority queue.
(splay_compare): Define inline.
(priority_queue_offset): New.
(priority_node_to_task): New.
(task_to_priority_node): New.
* oacc-mem.c: Do not include splay-tree.h.
* priority_queue.c: New file.
* priority_queue.h: New file.
* splay-tree.c: Do not include splay-tree.h.
(splay_tree_foreach_internal): New.
(splay_tree_foreach): New.
* splay-tree.h: Become re-entrant if splay_tree_prefix is defined.
(splay_tree_callback): Define typedef.
* target.c (splay_compare): Move to libgomp.h.
(GOMP_target): Don't adjust *thr in any way around running offloaded
task.
(GOMP_target_ext): Likewise. Handle target nowait.
(GOMP_target_update_ext, GOMP_target_enter_exit_data): Check
return value from gomp_create_target_task, if false, fallthrough
as if no dependencies exist.
(gomp_target_task_fn): Change return type to bool, return true
if the task should have another part scheduled later. Handle
target nowait.
(gomp_load_plugin_for_device): Initialize async_run.
* task.c (gomp_init_task): Initialize children_queue.
(gomp_clear_parent_in_list): New.
(gomp_clear_parent_in_tree): New.
(gomp_clear_parent): Handle priorities.
(GOMP_task): Likewise.
(priority_queue_move_task_first,
gomp_target_task_completion, GOMP_PLUGIN_target_task_completion):
New functions.
(gomp_create_target_task): Use priority queues. Change return type
to bool, add state argument, return false if for async
{{enter,exit} data,update} constructs no dependencies need to be
waited for, handle target nowait. Set task->fn to NULL instead of
gomp_target_task_fn.
(verify_children_queue): Remove.
(priority_list_upgrade_task): New.
(priority_queue_upgrade_task): New.
(verify_task_queue): Remove.
(priority_list_downgrade_task): New.
(priority_queue_downgrade_task): New.
(gomp_task_run_pre): Use priority queues.
Abstract code out to priority_queue_downgrade_task.
(gomp_task_run_post_handle_dependers): Use priority queues.
(gomp_task_run_post_remove_parent): Likewise.
(gomp_task_run_post_remove_taskgroup): Likewise.
(gomp_barrier_handle_tasks): Likewise. Handle target nowait target
tasks specially.
(GOMP_taskwait): Likewise.
(gomp_task_maybe_wait_for_dependencies): Likewise. Abstract code to
priority-queue_upgrade_task.
(GOMP_taskgroup_start): Use priority queues.
(GOMP_taskgroup_end): Likewise. Handle target nowait target tasks
specially. If taskgroup is NULL, and thr->ts.level is 0, act as a
barrier.
* taskloop.c (GOMP_taskloop): Handle priorities.
* team.c (gomp_new_team): Call priority_queue_init.
(free_team): Call priority_queue_free.
(gomp_free_thread): Call gomp_team_end if thr->ts.team is artificial
team created for target nowait in implicit parallel region.
(gomp_team_start): For nested check, test thr->ts.level instead of
thr->ts.team != NULL.
* testsuite/libgomp.c/doacross-3.c: New test.
* testsuite/libgomp.c/ordered-5.c: New test.
* testsuite/libgomp.c/priority.c: New test.
* testsuite/libgomp.c/target-31.c: New test.
* testsuite/libgomp.c/target-32.c: New test.
* testsuite/libgomp.c/target-33.c: New test.
* testsuite/libgomp.c/target-34.c: New test.
liboffloadmic/
2015-11-14 Ilya Verbin <ilya.verbin@intel.com>
* runtime/offload_host.cpp (task_completion_callback): New
variable.
(offload_proxy_task_completed_ooo): Call task_completion_callback.
(__offload_register_task_callback): New function.
* runtime/offload_host.h (__offload_register_task_callback): New
declaration.
* plugin/libgomp-plugin-intelmic.cpp (offload): Add async_data
argument, handle async offloading.
(register_main_image): Call register_main_image.
(GOMP_OFFLOAD_init_device, get_target_table, GOMP_OFFLOAD_alloc,
GOMP_OFFLOAD_free, GOMP_OFFLOAD_host2dev, GOMP_OFFLOAD_dev2host,
GOMP_OFFLOAD_dev2dev) Adjust offload callers.
(GOMP_OFFLOAD_async_run): New function.
(GOMP_OFFLOAD_run): Implement using GOMP_OFFLOAD_async_run.
Steven G. Kargl [Sat, 14 Nov 2015 17:31:16 +0000 (17:31 +0000)]
re PR fortran/67803 (ICE on concatenating wrong character array constructor)
2015-11-14 Steven G. Kargl <kargl@gcc.gnu.org>
PR fortran/67803
* array.c (gfc_match_array_constructor): If array constructor included
a CHARACTER typespec, check array elements for compatible type.
2015-11-14 Steven G. Kargl <kargl@gcc.gnu.org>
PR fortran/67803
* gfortran.dg/pr67803.f90: New test.
Jonathan Wakely [Sat, 14 Nov 2015 17:24:42 +0000 (17:24 +0000)]
Fix std::wstring capacity test for short wchar_t
* testsuite/21_strings/basic_string/capacity/char/18654.cc: Use
real minimum capacity.
* testsuite/21_strings/basic_string/capacity/wchar_t/18654.cc:
Likewise.
Kai Tietz [Fri, 13 Nov 2015 22:24:45 +0000 (22:24 +0000)]
Add non-folding variants for convert_to_*.
2015-11-13 Kai Tietz <ktietz70@googlemail.com>
Marek Polacek <polacek@redhat.com>
Jason Merrill <jason@redhat.com>
gcc/
* convert.c (maybe_fold_build1_loc): New.
(maybe_fold_build2_loc): New.
(convert_to_pointer_1): Split out from convert_to_pointer.
(convert_to_pointer_nofold): New.
(convert_to_real_1): Split out from convert_to_real.
(convert_to_real_nofold): New.
(convert_to_integer_1): Split out from convert_to_integer.
(convert_to_integer_nofold): New.
(convert_to_complex_1): Split out from convert_to_complex.
(convert_to_complex_nofold): New.
* convert.h: Declare new functions.
* tree-complex.c (create_one_component_var): Break up line to
avoid sequence point issues.
gcc/c-family/
* c-lex.c (interpret_float): Use fold_convert.
Co-Authored-By: Jason Merrill <jason@redhat.com> Co-Authored-By: Marek Polacek <polacek@redhat.com>
From-SVN: r230359
Nathan Sidwell [Fri, 13 Nov 2015 21:51:32 +0000 (21:51 +0000)]
omp-low.c (scan_sharing_clauses): Accept INDEPENDENT, AUTO & SEQ.
gcc/
* gcc/omp-low.c (scan_sharing_clauses): Accept INDEPENDENT, AUTO &
SEQ.
(oacc_loop_fixed_partitions): Correct return type to bool.
(oacc_loop_auto_partitions): New.
(oacc_loop_partition): Take mask argument, call
oacc_loop_auto_partitions.
(execute_oacc_device_lower): Provide mask to oacc_loop_partition.