Bernd Schmidt [Thu, 24 Nov 2016 12:22:16 +0000 (12:22 +0000)]
re PR rtl-optimization/78120 (If conversion no longer performed)
PR rtl-optimization/78120
* ifcvt.c (noce_conversion_profitable_p): Check original cost in all
cases, and additionally test against max_seq_cost for speed
optimization.
(noce_process_if_block): Compute an estimate for the original cost when
optimizing for speed, using the minimum of then and else block costs.
testsuite/
PR rtl-optimization/78120
* gcc.target/i386/pr78120.c: New test.
Eric Botcazou [Thu, 24 Nov 2016 12:02:53 +0000 (12:02 +0000)]
re PR middle-end/78429 (ICE in set_value_range, at tree-vrp.c on non-standard boolean)
PR middle-end/78429
* tree.h (wi::fits_to_boolean_p): New predicate.
(wi::fits_to_tree_p): Use it for boolean types.
* tree.c (int_fits_type_p): Likewise.
Martin Liska [Thu, 24 Nov 2016 09:42:18 +0000 (10:42 +0100)]
Fix print_node for CONSTRUCTORs
* print-tree.c (struct bucket): Remove.
(print_node): Add new argument which drives whether a tree node
is printed briefly or not.
(debug_tree): Replace a custom hash table with hash_set<T>.
* print-tree.h (print_node): Add the argument.
Joseph Myers [Wed, 23 Nov 2016 23:34:05 +0000 (23:34 +0000)]
Fix e500 offset handling for TImode.
Given my previous fix for a missing insn pattern for e500, building
glibc runs into an assembler error "Error: operand out of range (256
is not between 0 and 248)". This comes from an insn:
This patch adjusts the offset handling for TImode - and TDmode and
PTImode in case such subregs can arise for them - to be the same as
for TFmode, so that proper SPE offset checks are made in the
TARGET_E500_DOUBLE case.
This allows the glibc build to complete. Testing shows 372 FAILs
across the gcc, g++ and libstdc++ testsuites; more cleanup is
certainly needed, but this gets to the point where the toolchain at
least builds so it's possible to compare test results when fixing
bugs.
* config/rs6000/rs6000.c (rs6000_legitimate_offset_address_p): For
TARGET_E500_DOUBLE. handle TDmode, TImode and PTImode the same as
TFmode, IFmode and KFmode.
Joseph Myers [Wed, 23 Nov 2016 23:32:54 +0000 (23:32 +0000)]
Add another e500 subreg pattern.
Building glibc for powerpc-linux-gnuspe --enable-e500-double, given
the patch <https://gcc.gnu.org/ml/gcc-patches/2016-11/msg02404.html>
applied, fails with errors such as:
../sysdeps/ieee754/ldbl-128ibm/s_modfl.c: In function '__modfl':
../sysdeps/ieee754/ldbl-128ibm/s_modfl.c:91:1: error: unrecognizable insn:
}
^
(insn 31 30 32 2 (set (reg:DF 203)
(subreg:DF (reg:TI 202) 8)) "../sysdeps/ieee754/ldbl-128ibm/s_modfl.c":44 -1
(nil))
../sysdeps/ieee754/ldbl-128ibm/s_modfl.c:91:1: internal compiler error: in extract_insn, at recog.c:2311
This patch adds an insn pattern similar to various patterns already
present to handle extracting such a subreg. This allows the glibc
build to get further, until it runs into an assembler error for which
I have another patch.
gcc:
* config/rs6000/spe.md (*frob_<SPE64:mode>_ti_8): New insn
pattern.
gcc/testsuite:
* gcc.c-torture/compile/20161123-1.c: New test.
Jakub Jelinek [Wed, 23 Nov 2016 19:50:23 +0000 (20:50 +0100)]
re PR tree-optimization/78482 (wrong code at -O3 in both 32-bit and 64-bit modes on x86_64-linux-gnu)
PR tree-optimization/78482
* gcc.dg/torture/pr78482.c (c, d): Use signed char instead of char.
(bar): New function.
(main): Call bar instead of printf.
Jakub Jelinek [Wed, 23 Nov 2016 19:28:41 +0000 (20:28 +0100)]
re PR middle-end/69183 (ICE when using OpenMP PRIVATE keyword in OMP DO loop not explicitly encapsulated in OMP PARALLEL region)
PR middle-end/69183
* omp-low.c (build_outer_var_ref): Change lastprivate argument
to code, pass it recursively, adjust uses. For OMP_CLAUSE_PRIVATE
on worksharing constructs, treat it like clauses on simd construct.
Formatting fix.
(lower_rec_input_clauses): For OMP_CLAUSE_PRIVATE_OUTER_REF pass
OMP_CLAUSE_PRIVATE as last argument to build_outer_var_ref.
(lower_lastprivate_clauses): Pass OMP_CLAUSE_LASTPRIVATE instead
of true as last argument to build_outer_var_ref.
Uros Bizjak [Wed, 23 Nov 2016 19:05:53 +0000 (20:05 +0100)]
i386.md (*movqi_internal): Calculate mode attribute of alternatives 7,8,9 depending on TARGET_AVX512DQ.
* gcc.target/config/i386.md (*movqi_internal): Calculate mode
attribute of alternatives 7,8,9 depending on TARGET_AVX512DQ.
<TYPE_MSKMOV>: Emit kmovw for MODE_HI insn mode attribute.
(*k<logic><mode>): Calculate mode attribute depending on
TARGET_AVX512DQ. Emit k<logic>w for MODE_HI insn mode attribute.
(*andqi_1): Calculate mode attribute of alternative 3 depending
on TARGET_AVX512DQ. Emit kandw for MODE_HI insn mode attribute.
(kandn<mode>): Calculate mode attribute of alternative 2 depending
on TARGET_AVX512DQ. Emit kandnw for MODE_HI insn mode attribute.
(kxnor<mode>): Merge insn patterns using SWI1248_AVX512BW mode
iterator. Calculate mode attribute of alternative 1 depending
on TARGET_AVX512DQ. Emit kxnorw for MODE_HI insn mode attribute.
(*one_cmplqi2_1): Calculate mode attribute of alternative 2 depending
on TARGET_AVX512DQ. Emit knotw for MODE_HI insn mode attribute.
Jakub Jelinek [Wed, 23 Nov 2016 18:45:27 +0000 (19:45 +0100)]
re PR c++/77907 (Add "const" to argument of constexpr constructor causes the object to be left in unconstructed state)
PR c++/77907
* cp-gimplify.c (cp_fold) <case CALL_EXPR>: When calling constructor
and maybe_constant_value returns non-CALL_EXPR, create INIT_EXPR
with the object on lhs and maybe_constant_value returned expr on rhs.
PR middle-end/78153
* gimple-fold.c (fold_stmt_1): Handle case for GIMPLE_RETURN.
* tree-vrp.c (extract_range_basic): Handle case for
CFN_BUILT_IN_STRLEN.
testsuite/
* gcc.dg/tree-ssa/pr78153-1.c: New test.
* gcc.dg/tree-ssa/pr78153-2.c: Likewise.
James Greenhalgh [Wed, 23 Nov 2016 17:33:39 +0000 (17:33 +0000)]
[Patch 16/17 libgcc ARM] Half to double precision conversions
gcc/
* config/arm/arm.c (arm_convert_to_type): Delete.
(TARGET_CONVERT_TO_TYPE): Delete.
(arm_init_libfuncs): Enable trunc_optab from DFmode to HFmode.
(arm_libcall_uses_aapcs_base): Add trunc_optab from DF- to HFmode.
* config/arm/arm.h (TARGET_FP16_TO_DOUBLE): New.
* config/arm/arm.md (truncdfhf2): Only convert through SFmode if we
are in fast math mode, and have no single step hardware instruction.
(extendhfdf2): Only expand through SFmode if we don't have a
single-step hardware instruction.
* config/arm/vfp.md (*truncdfhf2): New.
(extendhfdf2): Likewise.
* config/arm/fp16.c (struct format): New.
(binary32): New.
(__gnu_float2h_internal): New. Body moved from
__gnu_f2h_internal and generalize.
(_gnu_f2h_internal): Move body to function __gnu_float2h_internal.
Call it with binary32.
Co-Authored-By: Matthew Wahab <matthew.wahab@arm.com>
From-SVN: r242781
James Greenhalgh [Wed, 23 Nov 2016 17:23:12 +0000 (17:23 +0000)]
[Patch 6/17] Migrate excess precision logic to use TARGET_EXCESS_PRECISION
gcc/
* toplev.c (init_excess_precision): Delete most logic.
* tree.c (excess_precision_type): Rewrite to use
TARGET_EXCESS_PRECISION.
* doc/invoke.texi (-fexcess-precision): Document behaviour in a
more generic fashion.
* ginclude/float.h: Wrap definition of FLT_EVAL_METHOD in
__STDC_WANT_IEC_60559_TYPES_EXT__.
gcc/c-family/
* c-common.c (excess_precision_mode_join): New.
(c_ts18661_flt_eval_method): New.
(c_c11_flt_eval_method): Likewise.
(c_flt_eval_method): Likewise.
* c-common.h (excess_precision_mode_join): New.
(c_flt_eval_method): Likewise.
* c-cppbuiltin.c (c_cpp_flt_eval_method_iec_559): New.
(cpp_iec_559_value): Call it.
(c_cpp_builtins): Modify logic for __LIBGCC_*_EXCESS_PRECISION__,
call c_flt_eval_method to set __FLT_EVAL_METHOD__ and
__FLT_EVAL_METHOD_TS_18661_3__.
Jakub Jelinek [Wed, 23 Nov 2016 15:59:25 +0000 (16:59 +0100)]
re PR c++/71450 (ICE on invalid C++11 code on x86_64-linux-gnu: in tree check: expected record_type or union_type or qual_union_type, have template_type_parm in lookup_base, at cp/search.c:203)
PR c++/71450
* pt.c (tsubst_copy): Return error_mark_node when mark_used
fails, even when complain & tf_error.
* g++.dg/cpp0x/pr71450-1.C: New test.
* g++.dg/cpp0x/pr71450-2.C: New test.
Jakub Jelinek [Wed, 23 Nov 2016 15:54:39 +0000 (16:54 +0100)]
re PR c++/77739 (internal compiler error: in create_tmp_var, at gimple-expr.c:524)
PR c++/77739
* cp-gimplify.c (cp_gimplify_tree) <case VEC_INIT_EXPR>: Pass
false as handle_invisiref_parm_p to cp_genericize_tree.
(struct cp_genericize_data): Add handle_invisiref_parm_p field.
(cp_genericize_r): Don't wrap is_invisiref_parm into references
if !wtd->handle_invisiref_parm_p.
(cp_genericize_tree): Add handle_invisiref_parm_p argument,
set wtd.handle_invisiref_parm_p to it.
(cp_genericize): Pass true as handle_invisiref_parm_p to
cp_genericize_tree. Formatting fix.
Martin Jambor [Wed, 23 Nov 2016 14:51:02 +0000 (15:51 +0100)]
backport: hsa-builtins.def: New file.
Merge from HSA branch to trunk
2016-11-23 Martin Jambor <mjambor@suse.cz>
Martin Liska <mliska@suse.cz>
gcc/
* hsa-builtins.def: New file.
* Makefile.in (BUILTINS_DEF): Add hsa-builtins.def dependency.
* builtins.def: Include hsa-builtins.def.
(DEF_HSA_BUILTIN): New macro.
* dumpfile.h (OPTGROUP_OPENMP): Define.
* dumpfile.c (optgroup_options): Added OPTGROUP_OPENMP.
* gimple.h (gf_mask): Added elements GF_OMP_FOR_GRID_INTRA_GROUP and
GF_OMP_FOR_GRID_GROUP_ITER.
(gimple_omp_for_grid_phony): Added checking assert.
(gimple_omp_for_set_grid_phony): Likewise.
(gimple_omp_for_grid_intra_group): New function.
(gimple_omp_for_set_grid_intra_group): Likewise.
(gimple_omp_for_grid_group_iter): Likewise.
(gimple_omp_for_set_grid_group_iter): Likewise.
* omp-low.c (check_omp_nesting_restrictions): Allow GRID loop where
previosuly only distribute loop was permitted.
(lower_lastprivate_clauses): Allow non tcc_comparison predicates.
(grid_get_kernel_launch_attributes): Support multiple HSA grid
dimensions.
(grid_expand_omp_for_loop): Likewise and also support standalone
distribute constructs. New parameter INTRA_GROUP, updated both users.
(grid_expand_target_grid_body): Support standalone distribute
constructs.
(pass_data_expand_omp): Changed optinfo_flags to OPTGROUP_OPENMP.
(pass_data_expand_omp_ssa): Likewise.
(pass_data_omp_device_lower): Likewsie.
(pass_data_lower_omp): Likewise.
(pass_data_diagnose_omp_blocks): Likewise.
(pass_data_oacc_device_lower): Likewise.
(pass_data_omp_target_link): Likewise.
(grid_lastprivate_predicate): New function.
(lower_omp_for_lastprivate): Call grid_lastprivate_predicate for
gridified loops.
(lower_omp_for): Support standalone distribute constructs.
(grid_prop): New type.
(grid_safe_assignment_p): Check for assignments to group_sizes, new
parameter GRID.
(grid_seq_only_contains_local_assignments): New parameter GRID, pass
it to callee.
(grid_find_single_omp_among_assignments_1): Likewise, improve missed
optimization info messages.
(grid_find_single_omp_among_assignments): Likewise.
(grid_find_ungridifiable_statement): Do not bail out for SIMDs.
(grid_parallel_clauses_gridifiable): New function.
(grid_inner_loop_gridifiable_p): Likewise.
(grid_dist_follows_simple_pattern): Likewise.
(grid_gfor_follows_tiling_pattern): Likewise.
(grid_call_permissible_in_distribute_p): Likewise.
(grid_handle_call_in_distribute): Likewise.
(grid_dist_follows_tiling_pattern): Likewise.
(grid_target_follows_gridifiable_pattern): Support standalone distribute
constructs.
(grid_var_segment): New enum.
(grid_mark_variable_segment): New function.
(grid_copy_leading_local_assignments): Call grid_mark_variable_segment
if a new argument says so.
(grid_process_grid_body): New function.
(grid_eliminate_combined_simd_part): Likewise.
(grid_mark_tiling_loops): Likewise.
(grid_mark_tiling_parallels_and_loops): Likewise.
(grid_process_kernel_body_copy): Support standalone distribute
constructs.
(grid_attempt_target_gridification): New grid variable holding overall
gridification state. Support standalone distribute constructs and
collapse clauses.
* doc/optinfo.texi (Optimization groups): Document OPTGROUP_OPENMP.
* hsa.h (hsa_bb): Add method method append_phi.
(hsa_insn_br): Renamed to hsa_insn_cbr, renamed all
occurences in all files too.
(hsa_insn_br): New class, now the ancestor of hsa_incn_cbr.
(is_a_helper <hsa_insn_br *>::test): New function.
(is_a_helper <hsa_insn_cbr *>::test): Adjust to only cover conditional
branch instructions.
(hsa_insn_signal): Make a direct descendant of
hsa_insn_basic. Add memorder constructor parameter and
m_memory_order and m_signalop member variables.
(hsa_insn_queue): Changed constructor parameters to common form.
Added m_segment and m_memory_order member variables.
(hsa_summary_t): Add private member function
process_gpu_implementation_attributes.
(hsa_function_summary): Rename m_binded_function to
m_bound_function.
(hsa_insn_basic_p): Remove typedef.
(hsa_op_with_type): Change hsa_insn_basic_p into plain pointers.
(hsa_op_reg_p): Remove typedef.
(hsa_function_representation): Change hsa_op_reg_p into plain
pointers.
(hsa_insn_phi): Removed new and delete operators.
(hsa_insn_br): Likewise.
(hsa_insn_cbr): Likewise.
(hsa_insn_sbr): Likewise.
(hsa_insn_cmp): Likewise.
(hsa_insn_mem): Likewise.
(hsa_insn_atomic): Likewise.
(hsa_insn_signal): Likewise.
(hsa_insn_seg): Likewise.
(hsa_insn_call): Likewise.
(hsa_insn_arg_block): Likewise.
(hsa_insn_comment): Likewise.
(hsa_insn_srctype): Likewise.
(hsa_insn_packed): Likewise.
(hsa_insn_cvt): Likewise.
(hsa_insn_alloca): Likewise.
* hsa.c (hsa_destroy_insn): Also handle instances of hsa_insn_br.
(process_gpu_implementation_attributes): New function.
(link_functions): Move some functionality into it. Adjust after
renaming m_binded_functions to m_bound_functions.
(hsa_insn_basic::op_output_p): Add BRIG_OPCODE_DEBUGTRAP
to the list of instructions with no output registers.
(get_in_type): Return this if it is a register of
matching size.
(hsa_get_declaration_name): Moved to...
* hsa-gen.c (hsa_get_declaration_name): ...here. Allocate
temporary string on an obstack instead from ggc.
(query_hsa_grid): Renamed to query_hsa_grid_dim, reimplemented, cut
down to two overloads.
(hsa_allocp_operand_address): Removed.
(hsa_allocp_operand_immed): Likewise.
(hsa_allocp_operand_reg): Likewise.
(hsa_allocp_operand_code_list): Likewise.
(hsa_allocp_operand_operand_list): Likewise.
(hsa_allocp_inst_basic): Likewise.
(hsa_allocp_inst_phi): Likewise.
(hsa_allocp_inst_mem): Likewise.
(hsa_allocp_inst_atomic): Likewise.
(hsa_allocp_inst_signal): Likewise.
(hsa_allocp_inst_seg): Likewise.
(hsa_allocp_inst_cmp): Likewise.
(hsa_allocp_inst_br): Likewise.
(hsa_allocp_inst_sbr): Likewise.
(hsa_allocp_inst_call): Likewise.
(hsa_allocp_inst_arg_block): Likewise.
(hsa_allocp_inst_comment): Likewise.
(hsa_allocp_inst_queue): Likewise.
(hsa_allocp_inst_srctype): Likewise.
(hsa_allocp_inst_packed): Likewise.
(hsa_allocp_inst_cvt): Likewise.
(hsa_allocp_inst_alloca): Likewise.
(hsa_allocp_bb): Likewise.
(hsa_obstack): New.
(hsa_init_data_for_cfun): Initialize obstack.
(hsa_deinit_data_for_cfun): Release memory of the obstack.
(hsa_op_immed::operator new): Use obstack instead of object_allocator.
(hsa_op_reg::operator new): Likewise.
(hsa_op_address::operator new): Likewise.
(hsa_op_code_list::operator new): Likewise.
(hsa_op_operand_list::operator new): Likewise.
(hsa_insn_basic::operator new): Likewise.
(hsa_insn_phi::operator new): Likewise.
(hsa_insn_br::operator new): Likewise.
(hsa_insn_sbr::operator new): Likewise.
(hsa_insn_cmp::operator new): Likewise.
(hsa_insn_mem::operator new): Likewise.
(hsa_insn_atomic::operator new): Likewise.
(hsa_insn_signal::operator new): Likewise.
(hsa_insn_seg::operator new): Likewise.
(hsa_insn_call::operator new): Likewise.
(hsa_insn_arg_block::operator new): Likewise.
(hsa_insn_comment::operator new): Likewise.
(hsa_insn_srctype::operator new): Likewise.
(hsa_insn_packed::operator new): Likewise.
(hsa_insn_cvt::operator new): Likewise.
(hsa_insn_alloca::operator new): Likewise.
(hsa_init_new_bb): Likewise.
(hsa_bb::append_phi): New function.
(gen_hsa_phi_from_gimple_phi): Use it.
(get_symbol_for_decl): Fix dinstinguishing between
global and local functions. Put local variables into a segment
according to their attribute or static flag, if there is one.
(hsa_insn_br::hsa_insn_br): New.
(hsa_insn_br::operator new): Likewise.
(hsa_insn_cbr::hsa_insn_cbr): Set width via ancestor constructor.
(query_hsa_grid_nodim): New function.
(multiply_grid_dim_characteristics): Likewise.
(gen_get_num_threads): Likewise.
(gen_get_num_teams): Reimplemented.
(gen_get_team_num): Likewise.
(gen_hsa_insns_for_known_library_call): Updated calls to the above
helper functions.
(get_memory_order_name): Removed.
(get_memory_order): Likewise.
(hsa_memorder_from_tree): New function.
(gen_hsa_ternary_atomic_for_builtin): Renamed to
gen_hsa_atomic_for_builtin, can also create signals.
(gen_hsa_insns_for_call): Handle many new builtins. Adjust to use
hsa_memory_order_from_tree and gen_hsa_atomic_for_builtin.
(hsa_insn_atomic): Fix function comment.
(hsa_insn_signal::hsa_insn_signal): Fix comment. Update call to
ancestor constructor and initialization of new member variables.
(hsa_insn_queue::hsa_insn_queue): Added initialization of new
member variables.
(hsa_get_host_function): Handle functions with no bound CPU
implementation. Fix binded to bound.
(get_brig_function_name): Likewise.
(HSA_SORRY_ATV): Remove semicolon after macro.
(HSA_SORRY_AT): Likewise.
(omp_simple_builtin::generate): Add missing semicolons.
(hsa_insn_phi::operator new): Removed.
(hsa_insn_br::operator new): Likewise.
(hsa_insn_cbr::operator new): Likewise.
(hsa_insn_sbr::operator new): Likewise.
(hsa_insn_cmp::operator new): Likewise.
(hsa_insn_mem::operator new): Likewise.
(hsa_insn_atomic::operator new): Likewise.
(hsa_insn_signal::operator new): Likewise.
(hsa_insn_seg::operator new): Likewise.
(hsa_insn_call::operator new): Likewise.
(hsa_insn_arg_block::operator new): Likewise.
(hsa_insn_comment::operator new): Likewise.
(hsa_insn_srctype::operator new): Likewise.
(hsa_insn_packed::operator new): Likewise.
(hsa_insn_cvt::operator new): Likewise.
(hsa_insn_alloca::operator new): Likewise.
(get_symbol_for_decl): Accept CONST_DECLs, put them to
readonly segment.
(gen_hsa_addr): Also process CONST_DECLs.
(gen_hsa_addr_insns): Process CONST_DECLs by creating private
copies.
(gen_hsa_unary_operation): Make sure the function does
not use bittype source type for firstbit and lastbit operations.
(gen_hsa_popcount_to_dest): Make sure the function uses a bittype
source type.
* hsa-brig.c (emit_insn_operands): Cope with zero operands in an
instruction.
(emit_branch_insn): Renamed to emit_cond_branch_insn.
Emit the width stored in the class.
(emit_generic_branch_insn): New function.
(emit_insn): Call emit_generic_branch_insn.
(emit_signal_insn): Remove obsolete comment. Update
member variable name, pick a type according to profile.
(emit_alloca_insn): Remove obsolete comment.
(emit_atomic_insn): Likewise.
(emit_queue_insn): Get segment and memory order from the IR object.
(hsa_brig_section): Make allocate_new_chunk, chunks
and cur_chunk provate, add a default NULL parameter to add method.
(hsa_brig_section::add): Added a new parameter, store pointer to
output data there if it is non-NULL.
(emit_function_directives): Use this new parameter instead of
calculating the pointer itself, fix function comment.
(hsa_brig_emit_function): Add forgotten endian conversion.
(hsa_output_kernels): Remove unnecessary building of
kernel_dependencies_vector_type.
(emit_immediate_operand): Declare.
(emit_directive_variable): Also emit initializers of CONST_DECLs.
(gen_hsa_insn_for_internal_fn_call): Also handle IFN_RSQRT.
(verify_function_arguments): Properly detect variadic
arguments.
* hsa-dump.c (hsa_width_specifier_name): New function.
(dump_hsa_insn_1): Dump generic branch instructions, update signal
member variable name. Special dumping for queue objects.
* ipa-hsa.c (process_hsa_functions): Adjust after renaming
m_binded_functions to m_bound_functions. Copy externally visible flag
to the node.
(ipa_hsa_write_summary): Likewise.
(ipa_hsa_read_section): Likewise.
gcc/fortran/
* f95-lang.c (DEF_HSA_BUILTIN): New macro.
Richard Biener [Wed, 23 Nov 2016 14:40:05 +0000 (14:40 +0000)]
re PR tree-optimization/78396 (gcc.dg/vect/bb-slp-cond-1.c FAILs after fix for PR77848)
2016-11-23 Richard Biener <rguenther@suse.de>
PR tree-optimization/78396
* tree-vectorizer.c (vectorize_loops): If an innermost loop didn't
vectorize try vectorizing an if-converted body using BB vectorization.
This isn't intended to change the behaviour, just rewrite the
existing logic in a different (and hopefully clearer) way.
The new form -- particularly the part based on the "block"
concept -- is easier to convert to polynomial sizes.
gcc/
2016-11-15 Richard Sandiford <richard.sandiford@arm.com>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
* rtlanal.c (subreg_get_info): Use more local variables.
Remark that for HARD_REGNO_NREGS_HAS_PADDING, each scalar unit
occupies at least one register. Assume that full hard registers
have consistent endianness. Share previously-duplicated if block.
Rework the main handling so that it operates on independently-
addressable YMODE-sized blocks. Use subreg_size_lowpart_offset
to check lowpart offsets, without trying to find an equivalent
integer mode first. Handle WORDS_BIG_ENDIAN != REG_WORDS_BIG_ENDIAN
as a final register-endianness correction.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r242758
combine: Convert subreg-of-lshiftrt to zero_extract properly (PR78390)
r242414, for PR77881, introduces some bugs (PR78390, PR78438, PR78477).
It all has the same root cause: that patch makes combine convert every
lowpart subreg of a logical shift right to a zero_extract. This cannot
work at all if it is not a constant shift, and it has to be a bit more
careful exactly which bits it extracts.
PR target/77881
PR bootstrap/78390
PR target/78438
PR bootstrap/78477
* combine.c (make_compound_operation_int): Do not convert a subreg of
a non-constant logical shift right to a zero_extract. Handle the case
where some zero bits have been shifted into the range covered by that
subreg.
Provide versions of subreg_lowpart_offset and subreg_highpart_offset
that work on mode sizes rather than modes. Also provide a routine
that converts an lsb position to a subreg offset.
The intent (in combination with later patches) is to move the
handling of the BYTES_BIG_ENDIAN != WORDS_BIG_ENDIAN case into
just two places, so that for other combinations we don't have
to split offsets into words and subwords.
gcc/
2016-11-15 Richard Sandiford <richard.sandiford@arm.com>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
* rtl.h (subreg_size_offset_from_lsb): Declare.
(subreg_offset_from_lsb): New function.
(subreg_size_lowpart_offset): Declare.
(subreg_lowpart_offset): Turn into an inline function.
(subreg_size_highpart_offset): Declare.
(subreg_highpart_offset): Turn into an inline function.
* emit-rtl.c (subreg_size_lowpart_offset): New function.
(subreg_size_highpart_offset): Likewise
* rtlanal.c (subreg_size_offset_from_lsb): Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r242755
Richard Biener [Wed, 23 Nov 2016 14:25:48 +0000 (14:25 +0000)]
re PR tree-optimization/78482 (wrong code at -O3 in both 32-bit and 64-bit modes on x86_64-linux-gnu)
2016-11-23 Richard Biener <rguenther@suse.de>
PR tree-optimization/78482
* tree-cfgcleanup.c: Include tree-ssa-loop-niter.h.
(remove_forwarder_block_with_phi): When merging with a loop
header creates a new latch reset number of iteration information
of the loop.
Bin Cheng [Wed, 23 Nov 2016 12:44:08 +0000 (12:44 +0000)]
fold-const.c (fold_cond_expr_with_comparison): Move simplification for A cmp C1 ? A : C2 to below, also simplify remaining code.
* fold-const.c (fold_cond_expr_with_comparison): Move simplification
for A cmp C1 ? A : C2 to below, also simplify remaining code.
* match.pd: Move and extend simplification from above to here:
(cond (cmp (convert1? x) c1) (convert2? x) c2) -> (minmax (x c)).
* tree-if-conv.c (ifcvt_follow_ssa_use_edges): New func.
(predicate_scalar_phi): Call fold_stmt using the new valueize func.
gcc/testsuite
* gcc.dg/fold-cond_expr-1.c: New test.
* gcc.dg/fold-condcmpconv-1.c: New test.
* gcc.dg/fold-condcmpconv-2.c: New test.
Paolo Bonzini [Wed, 23 Nov 2016 10:06:07 +0000 (10:06 +0000)]
system.h (HAVE_DESIGNATED_INITIALIZERS, [...]): Do not use "defined" in macros.
gcc:
2016-11-23 Paolo Bonzini <bonzini@gnu.org>
* system.h (HAVE_DESIGNATED_INITIALIZERS,
HAVE_DESIGNATED_UNION_INITIALIZERS): Do not use
"defined" in macros.
* doc/cpp.texi (Defined): Mention -Wexpansion-to-defined.
* doc/cppopts.texi (Invocation): Document -Wexpansion-to-defined.
* doc/invoke.texi (Warning Options): Document -Wexpansion-to-defined.
gcc/c-family:
2016-11-23 Paolo Bonzini <bonzini@gnu.org>
* c.opt (Wexpansion-to-defined): New.
gcc/testsuite:
2016-11-23 Paolo Bonzini <bonzini@gnu.org>
* gcc.dg/cpp/defined.c: Mark newly introduced warnings and
adjust for warning->pedwarn change.
* gcc.dg/cpp/defined-syshdr.c,
gcc.dg/cpp/defined-Wexpansion-to-defined.c,
gcc.dg/cpp/defined-Wextra-Wno-expansion-to-defined.c,
gcc.dg/cpp/defined-Wextra.c,
gcc.dg/cpp/defined-Wno-expansion-to-defined.c: New testcases.
libcpp:
2016-11-23 Paolo Bonzini <bonzini@gnu.org>
* include/cpplib.h (struct cpp_options): Add new member
warn_expansion_to_defined.
(CPP_W_EXPANSION_TO_DEFINED): New enum member.
* expr.c (parse_defined): Warn for all uses of "defined"
in macros, and tie warning to CPP_W_EXPANSION_TO_DEFINED.
Make it a pedwarning instead of a warning.
* system.h (HAVE_DESIGNATED_INITIALIZERS): Do not use
"defined" in macros.
The test fails for avr because fn1 does not get inlined into fn2. Inlining
occurs for x86_64 because fn1's computed size equals call_stmt_size. For the
avr, 32 bit memory moves are more expensive, and b[3] = p10[a] results in
a bigger size for fn1, preventing the inlining.
Add -finline-small-functions to force early inliner to inline fn1.
Georg-Johann Lay [Wed, 23 Nov 2016 09:17:57 +0000 (09:17 +0000)]
re PR target/60300 ([avr] Suboptimal stack pointer manipulation for frame setup)
gcc/
PR target/60300
* config/avr/constraints.md (Csp): Widen range to [-11..6].
* config/avr/avr.c (avr_prologue_setup_frame): Limit number
of RCALLs in prologue to 3.
Jakub Jelinek [Wed, 23 Nov 2016 08:08:47 +0000 (09:08 +0100)]
re PR target/78451 (FAIL: gcc.target/i386/sse-22a.c: error: inlining failed in call to always_inline '_mm512_setzero_ps')
PR target/78451
* c-pragma.c (handle_pragma_target): Don't replace
current_target_pragma, but chainon the new args to the current one.
* gcc.target/i386/pr78451.c: New test.
* gcc.target/i386/pr69255-1.c: Use #pragma GCC push_options
and #pragma GCC pop_options around the first #pragma GCC target.
* gcc.target/i386/pr69255-2.c: Likewise.
* gcc.target/i386/pr69255-3.c: Likewise.
Michael Collison [Wed, 23 Nov 2016 07:47:25 +0000 (07:47 +0000)]
2016-11-22 Michael Collison <michael.collison@arm.com>
* config/aarch64/aarch64-protos.h
(aarch64_and_split_imm1, aarch64_and_split_imm2)
(aarch64_and_bitmask_imm): New prototypes
* config/aarch64/aarch64.c (aarch64_and_split_imm1):
New overloaded function to create bit mask covering the
lowest to highest bits set.
(aarch64_and_split_imm2): New overloaded functions to create bit
mask of zeros between first and last bit set.
(aarch64_and_bitmask_imm): New function to determine if a integer
is a valid two instruction "and" operation.
* config/aarch64/aarch64.md:(and<mode>3): New define_insn and _split
allowing wider range of constants with "and" operations.
* (ior<mode>3, xor<mode>3): Use new LOGICAL2 iterator to prevent
"and" operator from matching restricted constant range used for
ior and xor operators.
* config/aarch64/constraints.md (UsO constraint): New SImode constraint
for constants in "and" operantions.
(UsP constraint): New DImode constraint for constants in "and" operations.
* config/aarch64/iterators.md (lconst2): New mode iterator.
(LOGICAL2): New code iterator.
* config/aarch64/predicates.md (aarch64_logical_and_immediate): New
predicate
(aarch64_logical_and_operand): New predicate allowing extended constants
for "and" operations.
* testsuite/gcc.target/aarch64/and_const.c: New test to verify
additional constants are recognized and fewer instructions generated.
* testsuite/gcc.target/aarch64/and_const2.c: New test to verify
additional constants are recognized and fewer instructions generated.
Than McIntosh [Tue, 22 Nov 2016 22:28:05 +0000 (22:28 +0000)]
compiler: relocate ID encoding utilities to gofrontend
Relocate the code that encodes/sanitizes identifiers to make them
assembler-friendly, moving it from the back end to the front end; the
decisions about when to encode an identifier and the calls to the
encoding helpers now take place entirely in gofrontend.
Ian Lance Taylor [Tue, 22 Nov 2016 17:58:04 +0000 (17:58 +0000)]
runtime: rewrite panic/defer code from C to Go
The actual stack unwind code is still in C, but the rest of the code,
notably all the memory allocation, is now in Go. The names are changed
to the names used in the Go 1.7 runtime, but the code is necessarily
somewhat different.
The __go_makefunc_can_recover function is dropped, as the uses of it
were removed in https://golang.org/cl/198770044.
Jakub Jelinek [Tue, 22 Nov 2016 17:56:43 +0000 (18:56 +0100)]
OpenMP loop cloning for SIMT execution
2016-11-22 Jakub Jelinek <jakub@redhat.com>
Alexander Monakov <amonakov@ispras.ru>
* internal-fn.c (expand_GOMP_USE_SIMT): New function.
* tree.c (omp_clause_num_ops): OMP_CLAUSE__SIMT_ has 0 operands.
(omp_clause_code_name): Add _simt_ name.
(walk_tree_1): Handle OMP_CLAUSE__SIMT_.
* tree-core.h (enum omp_clause_code): Add OMP_CLAUSE__SIMT_.
* omp-low.c (scan_sharing_clauses): Handle OMP_CLAUSE__SIMT_.
(scan_omp_simd): New function.
(scan_omp_1_stmt): Use it in target regions if needed.
(omp_max_vf): Don't max with omp_max_simt_vf.
(lower_rec_simd_input_clauses): Use omp_max_simt_vf if
OMP_CLAUSE__SIMT_ is present.
(lower_rec_input_clauses): Compute maybe_simt from presence of
OMP_CLAUSE__SIMT_.
(lower_lastprivate_clauses): Likewise.
(expand_omp_simd): Likewise.
(execute_omp_device_lower): Lower IFN_GOMP_USE_SIMT.
* internal-fn.def (GOMP_USE_SIMT): New internal function.
* tree-pretty-print.c (dump_omp_clause): Handle OMP_CLAUSE__SIMT_.
Co-Authored-By: Alexander Monakov <amonakov@ispras.ru>
From-SVN: r242714
Carl Love [Tue, 22 Nov 2016 16:49:02 +0000 (16:49 +0000)]
rs6000-c.c: Add built-in support for vector compare equal and vector compare not equal.
gcc/ChangeLog:
2016-11-21 Carl Love <cel@us.ibm.com>
* config/rs6000/rs6000-c.c: Add built-in support for vector compare
equal and vector compare not equal. The vector compares take two
arguments of type vector bool char, vector bool short, vector bool int,
vector bool long long with the same return type.
* doc/extend.texi: Update built-in documentation file for the new
powerpc built-ins.
gcc/testsuite/ChangeLog:
2016-11-21 Carl Love <cel@us.ibm.com>
* gcc.target/powerpc/builtins-3.c: New file to test the new
built-ins for vector compare equal and vector compare not equal.