Add a target hook for getting an ABI from a function type
This patch adds a target hook that allows targets to return
the ABI associated with a particular function type. Generally,
when multiple ABIs are in use, it must be possible to tell from
a function type and its attributes which ABI it is using.
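As a rough illustration of the idea (not the actual hook implementation), the sketch below picks an ABI purely from a function type's attributes. The types predefined_abi and fn_type and the function fntype_abi_sketch are hypothetical stand-ins for GCC's internal structures; only the aarch64_vector_pcs attribute name comes from this log.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for GCC's internal types; not the real definitions.
struct predefined_abi { int id; };
struct fn_type { std::vector<std::string> attributes; };

const predefined_abi default_abi{0};  // assumed: the base ABI
const predefined_abi simd_abi{1};     // assumed: the vector PCS variant

// Sketch of a target hook: the ABI is decided from the function type
// and its attributes alone, with no other context.
const predefined_abi &fntype_abi_sketch(const fn_type &type) {
  for (const std::string &attr : type.attributes)
    if (attr == "aarch64_vector_pcs")
      return simd_abi;
  return default_abi;
}
```

A type without the attribute falls back to the default ABI, which is presumably what a target without the hook would get as well.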
2019-09-30 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* target.def (fntype_abi): New target hook.
* doc/tm.texi.in (TARGET_FNTYPE_ABI): Likewise.
* doc/tm.texi: Regenerate.
* target.h (predefined_function_abi): Declare.
* function-abi.cc (fntype_abi): Call targetm.calls.fntype_abi,
if defined.
* config/aarch64/aarch64.h (ARM_PCS_SIMD): New arm_pcs value.
* config/aarch64/aarch64.c: Include function-abi.h.
(aarch64_simd_abi, aarch64_fntype_abi): New functions.
(TARGET_FNTYPE_ABI): Define.
This patch adds new structures and functions for handling
multiple ABIs in a translation unit. The structures are:
- predefined_function_abi: describes a static, predefined ABI
- function_abi: describes either a predefined ABI or a local
variant of one (e.g. taking -fipa-ra into account)
The patch adds functions for getting the ABI from a given type
or decl; a later patch will also add a function for getting the
ABI of the target of a call insn.
Although ABIs are about much more than call-clobber/saved choices,
I wanted to keep the name general in case we add more ABI-related
information in future.
2019-09-30 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* Makefile.in (OBJS): Add function-abi.o.
(GTFILES): Add function-abi.h.
* function-abi.cc: New file.
* function-abi.h: Likewise.
* emit-rtl.h (rtl_data::abi): New field.
* function.c: Include function-abi.h.
(prepare_function_start): Initialize crtl->abi.
* read-rtl-function.c: Include regs.h and function-abi.h.
(read_rtl_function_body): Initialize crtl->abi.
(read_rtl_function_body_from_file_range): Likewise.
* reginfo.c: Include function-abi.h.
(init_reg_sets_1): Initialize default_function_abi.
(globalize_reg): Call add_full_reg_clobber for each predefined ABI
when making a register global.
* target-globals.h (this_target_function_abi_info): Declare.
(target_globals::function_abi_info): New field.
(restore_target_globals): Copy it.
* target-globals.c: Include function-abi.h.
(default_target_globals): Initialize the function_abi_info field.
(target_globals): Allocate it.
(save_target_globals): Free it.
The aarch64_vector_pcs handling in aarch64_hard_regno_call_part_clobbered
checks whether the mode might be bigger than 16 bytes, since on SVE
targets the (non-SVE) vector PCS only guarantees that the low 16 bytes
are preserved. But for multi-register modes, we should instead test
whether each single-register part might be bigger than 16 bytes.
(The size is always divided evenly between registers.)
The testcase uses XImode as an example where this helps.
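The per-register test can be sketched as follows. This is a simplified model that assumes fixed byte sizes (the real code has to handle sizes that "might be" bigger on SVE targets), and part_clobbered_sketch is a hypothetical name.

```cpp
// Hypothetical helper: mode_size is the total size of the mode in bytes,
// nregs the number of registers it occupies. The size is assumed to be
// divided evenly between registers, as the commit message states.
bool part_clobbered_sketch(unsigned mode_size, unsigned nregs) {
  unsigned per_reg_size = mode_size / nregs;
  // Only the low 16 bytes of each register are guaranteed to be preserved.
  return per_reg_size > 16;
}
```

With XImode (64 bytes spread over 4 registers) each part is 16 bytes, so the sketch reports no partial clobber, whereas testing the whole 64-byte mode against 16 would have.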
2019-09-30 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* config/aarch64/aarch64.c (aarch64_hard_regno_call_part_clobbered):
For multi-register modes, test how big each register part is.
gcc/testsuite/
* gcc.target/aarch64/torture/simd-abi-8.c: New test.
Introduce rtx_alloca, alloca_raw_REG and alloca_rtx_fmt_*
When one passes short-lived fake rtxes to backends in order to test
their capabilities, it might be beneficial to allocate these rtxes on
stack in order to reduce the load on GC.
Provide macro counterparts of some of the gen_* functions for that
purpose.
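A minimal sketch of the pattern, assuming a platform that provides <alloca.h>. The struct fake_rtx, the function init_fake_reg and the macro ALLOCA_FAKE_REG are hypothetical and only mirror the init_*/alloca_* split described above. The allocation has to be a macro, because stack memory obtained inside a helper function would be released when that function returns.

```cpp
#include <alloca.h>
#include <cstring>

// Hypothetical, heavily simplified stand-in for an rtx.
struct fake_rtx { int code; int regno; };

// Initialize caller-provided storage; does not allocate anything itself.
inline fake_rtx *init_fake_reg(void *storage, int regno) {
  fake_rtx *x = static_cast<fake_rtx *>(storage);
  std::memset(x, 0, sizeof *x);
  x->code = 1;  // assumed code for "REG" in this sketch
  x->regno = regno;
  return x;
}

// Allocate in the caller's stack frame, then initialize in place.
#define ALLOCA_FAKE_REG(regno) \
  init_fake_reg(alloca(sizeof(fake_rtx)), (regno))
```

The object lives until the calling function returns, so it must not be stored anywhere that outlives the caller.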
gcc/ChangeLog:
2019-09-30 Ilya Leoshkevich <iii@linux.ibm.com>
* emit-rtl.c (init_raw_REG): New function.
(gen_raw_REG): Use init_raw_REG.
* gengenrtl.c (gendef): Emit init_* functions and alloca_*
macros.
* rtl.c (rtx_alloc_stat_v): Use rtx_init.
* rtl.h (rtx_init): New function.
(rtx_alloca): New function.
(init_raw_REG): New function.
(alloca_raw_REG): New macro.
Michael Meissner [Mon, 30 Sep 2019 13:49:13 +0000 (13:49 +0000)]
Add initial support for prefixed/PC-relative addressing.
2019-09-30 Michael Meissner <meissner@linux.ibm.com>
* config/rs6000/predicates.md (pcrel_address): Delete predicate.
(pcrel_local_address): Replace pcrel_address predicate, use the
new function address_to_insn_form.
(pcrel_external_address): Replace with new implementation using
address_to_insn_form.
(prefixed_mem_operand): Delete predicate which is now unused.
(pcrel_external_mem_operand): Delete predicate which is now
unused.
* config/rs6000/rs6000-protos.h (enum insn_form): New
enumeration.
(enum non_prefixed): New enumeration.
(address_to_insn_form): New declaration.
(prefixed_load_p): New declaration.
(prefixed_store_p): New declaration.
(prefixed_paddi_p): New declaration.
(rs6000_asm_output_opcode): New declaration.
(rs6000_final_prescan_insn): Move declaration and update calling
signature.
(address_is_prefixed): New helper inline function.
* config/rs6000/rs6000.c (print_operand_address): Check for either
PC-relative local symbols or PC-relative external symbols.
(rs6000_emit_move): Support loading PC-relative addresses.
(mode_supports_prefixed_address_p): Delete, no longer used.
(rs6000_prefixed_address_mode_p): Delete, no longer used.
(address_to_insn_form): New function to decode an address format.
(reg_to_non_prefixed): New function to identify what the
non-prefixed memory instruction format is for a register.
(prefixed_load_p): New function to identify prefixed loads.
(prefixed_store_p): New function to identify prefixed stores.
(prefixed_paddi_p): New function to identify prefixed load
immediates.
(next_insn_prefixed_p): New static state variable.
(rs6000_final_prescan_insn): New function to determine if an insn
uses a prefixed instruction.
(rs6000_asm_output_opcode): New function to emit 'p' in front of a
prefixed instruction.
* config/rs6000/rs6000.h (FINAL_PRESCAN_INSN): New target hook.
(ASM_OUTPUT_OPCODE): New target hook.
* config/rs6000/rs6000.md (prefixed): New insn attribute for
prefixed instructions.
(prefixed_length): New insn attribute for the size of prefixed
instructions.
(non_prefixed_length): New insn attribute for the size of
non-prefixed instructions.
(pcrel_local_addr): New insn to load up a local PC-relative
address.
(pcrel_extern_addr): New insn to load up an external PC-relative
address.
(mov<mode>_64bit_dm): Split the alternatives for loading 0.0 to a
GPR and loading a 128-bit floating point type to a GPR.
Richard Biener [Mon, 30 Sep 2019 11:59:16 +0000 (11:59 +0000)]
gimple.c (gimple_get_lhs): For PHIs return the result.
2019-09-30 Richard Biener <rguenther@suse.de>
* gimple.c (gimple_get_lhs): For PHIs return the result.
* tree-vectorizer.h (vectorizable_live_operation): Also get the
SLP instance as argument.
* tree-vect-loop.c (vect_analyze_loop_operations): Also handle
double-reduction PHIs with vectorizable_lc_phi.
(vect_analyze_loop_operations): Adjust.
(vect_create_epilog_for_reduction): Remove all code not dealing
with reduction LC PHI or epilogue generation.
(vectorizable_live_operation): Call vect_create_epilog_for_reduction
for live stmts of reductions.
* tree-vect-stmts.c (vectorizable_condition): When !for_reduction
do not handle defs that are not vect_internal_def.
(can_vectorize_live_stmts): Adjust.
(vect_analyze_stmt): When the vectorized stmt defined a value
used on backedges adjust the backedge uses of vectorized PHIs.
Jonathan Wakely [Mon, 30 Sep 2019 11:52:08 +0000 (12:52 +0100)]
Implement LWG 3255 for std::span constructors
Also fix the constraints on span(Container&) and span(const Container&)
constructors so that they aren't used for const spans or const arrays.
* include/std/span (span(element_type(&)[N]))
(span(array<value_type, N>&), span(const array<value_type, N>&)):
Deduce array element type to allow safe const conversions (LWG 3255).
[!_GLIBCXX_P1394] (span(Container&), span(const Container&)): Use
remove_cv_t on arguments to __is_std_span and __is_std_array.
* testsuite/23_containers/span/lwg3255.cc: New test.
Martin Jambor [Mon, 30 Sep 2019 08:18:59 +0000 (10:18 +0200)]
[PR 91853] Prevent IPA-SRA ICEs on type-mismatched calls
2019-09-30 Martin Jambor <mjambor@suse.cz>
PR ipa/91853
* tree-inline.c (force_value_to_type): New function.
(setup_one_parameter): Use force_value_to_type to convert type.
* tree-inline.h (force_value_to_type): Declare.
* ipa-param-manipulation.c (ipa_param_adjustments::modify_call): Deal
with register type mismatches.
Andreas Tobler [Mon, 30 Sep 2019 07:54:52 +0000 (09:54 +0200)]
config.gcc: Use the secure-plt on FreeBSD 13 and upwards for 32-bit PowerPC.
2019-09-30 Andreas Tobler <andreast@gcc.gnu.org>
* config.gcc: Use the secure-plt on FreeBSD 13 and upwards for
32-bit PowerPC.
Define TARGET_FREEBSD32_SECURE_PLT for 64-bit PowerPC.
* config/rs6000/t-freebsd64: Make use of the above define and build
the 32-bit libraries with secure-plt.
Replace the define_expand and two define_insns with a single
@macho_low_<mode> and update callers.
gcc/ChangeLog:
2019-09-29 Iain Sandoe <iain@sandoe.co.uk>
* config/darwin.c (gen_macho_low): Amend to include the mode
argument.
(machopic_indirect_data_reference): Amend gen_macho_low call
to include mode argument.
* config/rs6000/rs6000.c (emit_move): Likewise. Amend a comment.
* config/rs6000/darwin.md (@macho_low_<mode>): New, replaces
the macho_low expander and two define_insn entries.
Paul Thomas [Sun, 29 Sep 2019 10:12:42 +0000 (10:12 +0000)]
re PR fortran/91726 (ICE in gfc_conv_array_ref, at fortran/trans-array.c:3612)
2019-09-29 Paul Thomas <pault@gcc.gnu.org>
PR fortran/91726
* resolve.c (gfc_expr_to_initialize): Bail out with a copy of
the original expression if the array ref is a scalar and the
array_spec has corank.
* trans-array.c (gfc_conv_array_ref): Such expressions are OK
even if the array ref codimen is zero.
* trans-expr.c (gfc_get_class_from_expr): New function taken
from gfc_get_vptr_from_expr.
(gfc_get_vptr_from_expr): Call new function.
* trans-stmt.c (trans_associate_var): If one of these is a
target expression, extract the class expression from the target
and copy its fields to a new target variable.
* trans.h: Add prototype for gfc_get_class_from_expr.
2019-09-29 Paul Thomas <pault@gcc.gnu.org>
PR fortran/91726
* gfortran.dg/coarray_poly_9.f90: New test.
Drop the expander and use a mode iterator on the define_insn
for @macho_high_<mode> instead.
gcc/ChangeLog:
2019-09-28 Iain Sandoe <iain@sandoe.co.uk>
* config/darwin.c (gen_macho_high): Amend to include the mode
argument.
(machopic_indirect_data_reference): Amend gen_macho_high call
to include mode argument.
(machopic_legitimize_pic_address): Likewise.
* config/rs6000/rs6000.c (rs6000_legitimize_address):
* config/rs6000/darwin.md (@macho_high_<mode>): New, replaces
the macho_high expander and two define_insn entries.
Steven G. Kargl [Sat, 28 Sep 2019 16:26:43 +0000 (16:26 +0000)]
re PR fortran/91864 (ICE in gfc_check_do_variable, at fortran/parse.c:4405)
2019-09-28 Steven G. Kargl <kargl@gcc.gnu.org>
PR fortran/91864
* gcc/fortran/io.c (match_io_element): An inquiry parameter cannot be
read into.
* gcc/fortran/match.c (gfc_match_allocate): An inquiry parameter
can be neither an allocate-object nor stat variable.
(gfc_match_deallocate): An inquiry parameter cannot be deallocated.
Marek Polacek [Sat, 28 Sep 2019 15:35:37 +0000 (15:35 +0000)]
PR c++/91889 - follow-up fix for DR 2352.
* call.c (involves_qualification_conversion_p): New function.
(direct_reference_binding): Build a ck_qual if the conversion
would involve a qualification conversion.
(convert_like_real): Strip the conversion created by the ck_qual
in direct_reference_binding.
* g++.dg/cpp0x/ref-bind3.C: Add dg-error.
* g++.dg/cpp0x/ref-bind4.C: New test.
* g++.dg/cpp0x/ref-bind5.C: New test.
* g++.dg/cpp0x/ref-bind6.C: New test.
* g++.old-deja/g++.pt/spec35.C: Revert earlier change.
Ian Lance Taylor [Sat, 28 Sep 2019 00:16:57 +0000 (00:16 +0000)]
compiler: resolve importing ambiguity for more complex function calls
Tweak the exporter for inlinable function bodies to work around a
problem with importing of function calls whose function expressions
are not simple function names. In the bug in question, the function
body exporter was writing out a function call of the form
(*(*FuncTyp)(var))(arg)
which produced an export data representation of
*$convert(<type 5>, var)(x)
which is hard to parse unambiguously. Fix: change the export data
emitter to introduce parens around the function expression for more
complex calls.
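The shape of the fix can be sketched like this. The function export_call_sketch and its parameters are hypothetical, and the exact text of the export data is an assumption; the point is only that a callee which is not a simple name gets wrapped in parentheses.

```cpp
#include <string>

// Hypothetical emitter: write a call expression, parenthesizing the
// function expression unless it is a simple function name.
std::string export_call_sketch(const std::string &callee,
                               bool callee_is_simple_name,
                               const std::string &args) {
  std::string out = callee_is_simple_name ? callee : "(" + callee + ")";
  return out + "(" + args + ")";
}
```

For the case above this would give (*$convert(<type 5>, var))(x), so an importer can tell where the function expression ends and the argument list begins.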
Regenerate `configure' scripts for `uclinuxfdpiceabi' libtool.m4 update
A change made with r275564 ("[ARM/FDPIC v6 02/24] [ARM] FDPIC: Handle
arm*-*-uclinuxfdpiceabi in configure scripts") to libtool.m4 has not
regenerated all the `configure' scripts affected. Fix it.
Jakub Jelinek [Fri, 27 Sep 2019 20:14:24 +0000 (22:14 +0200)]
re PR c++/88203 (assert does not compile with OpenMP's pragma omp parallel for default(none))
PR c++/88203
c-family/
* c-common.h (c_omp_predefined_variable): Declare.
* c-omp.c (c_omp_predefined_variable): New function.
(c_omp_predetermined_sharing): Return OMP_CLAUSE_DEFAULT_SHARED
for predefined variables.
c/
* c-parser.c (c_parser_predefined_identifier): New function.
(c_parser_postfix_expression): Use it.
(c_parser_omp_variable_list): Parse predefined identifiers.
* c-typeck.c (c_finish_omp_clauses): Allow predefined variables
in shared and firstprivate clauses, even when they are predetermined
shared.
cp/
* parser.c (cp_parser_omp_var_list_no_open): Parse predefined
variables.
* semantics.c (finish_omp_clauses): Allow predefined variables in
shared and firstprivate clauses, even when they are predetermined
shared.
* cp-gimplify.c (cxx_omp_predetermined_sharing_1): Return
OMP_CLAUSE_DEFAULT_SHARED for predefined variables.
testsuite/
* c-c++-common/gomp/pr88203-1.c: New test.
* c-c++-common/gomp/pr88203-2.c: New test.
* c-c++-common/gomp/pr88203-3.c: New test.
Jason Merrill [Fri, 27 Sep 2019 18:23:10 +0000 (14:23 -0400)]
constexpr.c (cxx_fold_indirect_ref): Use similar_type_p.
* constexpr.c (cxx_fold_indirect_ref): Use similar_type_p.
Merging the similar_type_p change to the concepts branch broke a cmcstl2
testcase; investigating led me to this small testcase which has always
failed on trunk.
Jason Merrill [Fri, 27 Sep 2019 18:19:55 +0000 (14:19 -0400)]
cp-tree.h (class iloc_sentinel): New.
* cp-tree.h (class iloc_sentinel): New.
We didn't already have a sentinel for input_location, and while
temp_override would work, it would also happily set input_location to 0,
which breaks things that try to look up the associated filename.
* decl.c (grokdeclarator, finish_enum_value_list): Use it.
* mangle.c (mangle_decl_string): Use it.
* pt.c (perform_typedefs_access_check): Use it.
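A minimal sketch of such a sentinel, under the assumption that "location 0" means an unknown location. The names iloc_sentinel_sketch and input_location_sketch are hypothetical stand-ins, not the real class or global.

```cpp
typedef unsigned int location_t;

// Stand-in for the global input_location; 42 is an arbitrary initial value.
location_t input_location_sketch = 42;

// RAII sentinel: override the location for the lifetime of the object,
// but never install the unknown location 0.
class iloc_sentinel_sketch {
  location_t saved_;

public:
  explicit iloc_sentinel_sketch(location_t loc)
      : saved_(input_location_sketch) {
    if (loc != 0)
      input_location_sketch = loc;
  }
  ~iloc_sentinel_sketch() { input_location_sketch = saved_; }

  // Copying would restore the saved location twice, so forbid it.
  iloc_sentinel_sketch(const iloc_sentinel_sketch &) = delete;
  iloc_sentinel_sketch &operator=(const iloc_sentinel_sketch &) = delete;
};
```

The difference from a generic override is the guard in the constructor: passing 0 leaves the current location in place, so later filename lookups still work.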
Ian Lance Taylor [Fri, 27 Sep 2019 17:51:43 +0000 (17:51 +0000)]
compiler: don't read known type, simplify Import::finalize_methods
With the current export format, if we already know the type, we don't
have to read and parse the definition.
We only use the finalizer in Import::finalize_methods, so make it a
local variable. To match Finalize_methods::type, only put struct
types into real_for_named.
Ian Lance Taylor [Fri, 27 Sep 2019 17:34:58 +0000 (17:34 +0000)]
compiler: only check whether struct or array types are big
Fetching the size of a type typically involves a hash table lookup,
and is generally non-trivial. The escape analysis code calls is_big
more than one might expect. So only fetch the size if we need it.
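The idea can be sketched as below. The types type_kind and type_sketch, the function is_big_sketch and the limit parameter are all hypothetical; the counter merely stands in for the cost of the hash table lookup.

```cpp
#include <cstdint>

enum class type_kind { integer, pointer, structure, array };

// Hypothetical type record; size_lookups counts the expensive size queries.
struct type_sketch {
  type_kind kind;
  std::int64_t size;
  mutable int size_lookups;
};

// Only struct and array types can be big, so check the kind first and
// fetch the size only when it can affect the answer.
bool is_big_sketch(const type_sketch &t, std::int64_t limit) {
  if (t.kind != type_kind::structure && t.kind != type_kind::array)
    return false;
  ++t.size_lookups;  // stands in for the non-trivial size computation
  return t.size > limit;
}
```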
* tree-vectorizer.h (_stmt_vec_info::reduc_fn): New.
(STMT_VINFO_REDUC_FN): Likewise.
* tree-vectorizer.c (vec_info::new_stmt_vec_info): Initialize
STMT_VINFO_REDUC_FN.
* tree-vect-loop.c (vect_is_simple_reduction): Fix STMT_VINFO_REDUC_IDX
for condition reductions.
(vect_create_epilog_for_reduction): Compute all required state
from the stmt to be vectorized.
(vectorizable_reduction): Simplify vect_create_epilog_for_reduction
invocation and remove then-dead code. For single def-use chains
record only a single vector stmt.
Jakub Jelinek [Fri, 27 Sep 2019 10:28:48 +0000 (12:28 +0200)]
re PR tree-optimization/91885 (ICE when compiling SPEC 2017 blender benchmark with -O3 -fprofile-generate)
PR tree-optimization/91885
* gcc.dg/pr91885.c (__int64_t): Change from long to long long.
(__uint64_t): Change from unsigned long to unsigned long long.
[AArch64] Split built-in function codes into major and minor codes
It was easier to add the SVE ACLE support without enumerating every
function at build time. This in turn meant that it was easier if the
SVE builtins occupied a distinct numberspace from the existing AArch64
ones, which *are* enumerated at build time. This patch therefore
divides the built-in functions codes into "major" and "minor" codes.
At present the major code is just "general", but the SVE patch will add
"SVE" as well.
Also, it was convenient to put the SVE ACLE support in its own file,
so the patch makes aarch64.c provide the frontline target hooks directly,
forwarding to the other files for the real work.
The reason for organising the files this way is that aarch64.c needs
to define the target hook macros whatever happens, and having aarch64.c
macros forward to aarch64-builtins.c functions and aarch64-builtins.c
functions forward to the SVE file seemed a bit indirect. Doing things
the way the patch does them puts aarch64-builtins.c and the SVE code on
more of an equal footing.
The aarch64_(general_)gimple_fold_builtin change is mostly just
reindentation.
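One way to picture the major/minor split is the sketch below, which packs the class into the low bits of the function code. The names and the shift value of 1 are assumptions for illustration; they are modelled on, but not copied from, the AARCH64_BUILTIN_SHIFT and AARCH64_BUILTIN_CLASS constants listed in the ChangeLog.

```cpp
// Hypothetical major codes: "general" today, "SVE" once that patch lands.
enum builtin_class_sketch { BUILTIN_GENERAL = 0, BUILTIN_SVE = 1 };

// Assumed layout: the low BUILTIN_SHIFT bits hold the major code,
// the remaining bits hold the minor code (the per-class subcode).
const unsigned BUILTIN_SHIFT = 1;
const unsigned BUILTIN_CLASS_MASK = (1u << BUILTIN_SHIFT) - 1;

unsigned make_builtin_code(unsigned minor, builtin_class_sketch major) {
  return (minor << BUILTIN_SHIFT) | major;
}

builtin_class_sketch builtin_major(unsigned code) {
  return static_cast<builtin_class_sketch>(code & BUILTIN_CLASS_MASK);
}

unsigned builtin_minor(unsigned code) { return code >> BUILTIN_SHIFT; }
```

A front-line hook would then switch on builtin_major (code) and hand builtin_minor (code) to the file that owns that class, which keeps the two numberspaces independent.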
2019-09-27 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* config/aarch64/aarch64-protos.h (aarch64_builtin_class): New enum.
(AARCH64_BUILTIN_SHIFT, AARCH64_BUILTIN_CLASS): New constants.
(aarch64_gimple_fold_builtin, aarch64_mangle_builtin_type)
(aarch64_fold_builtin, aarch64_init_builtins, aarch64_expand_builtin):
(aarch64_builtin_decl, aarch64_builtin_rsqrt): Delete.
(aarch64_general_mangle_builtin_type, aarch64_general_init_builtins):
(aarch64_general_fold_builtin, aarch64_general_gimple_fold_builtin):
(aarch64_general_expand_builtin, aarch64_general_builtin_decl):
(aarch64_general_builtin_rsqrt): Declare.
* config/aarch64/aarch64-builtins.c (aarch64_general_add_builtin):
New function.
(aarch64_mangle_builtin_type): Rename to...
(aarch64_general_mangle_builtin_type): ...this.
(aarch64_init_fcmla_laneq_builtins, aarch64_init_simd_builtins)
(aarch64_init_crc32_builtins, aarch64_init_builtin_rsqrt)
(aarch64_init_pauth_hint_builtins, aarch64_init_tme_builtins): Use
aarch64_general_add_builtin instead of add_builtin_function.
(aarch64_init_builtins): Rename to...
(aarch64_general_init_builtins): ...this. Use
aarch64_general_add_builtin instead of add_builtin_function.
(aarch64_builtin_decl): Rename to...
(aarch64_general_builtin_decl): ...this and remove the unused
arguments.
(aarch64_expand_builtin): Rename to...
(aarch64_general_expand_builtin): ...this and remove the unused
arguments.
(aarch64_builtin_rsqrt): Rename to...
(aarch64_general_builtin_rsqrt): ...this.
(aarch64_fold_builtin): Rename to...
(aarch64_general_fold_builtin): ...this. Take the function subcode
and return type as arguments. Remove the "ignored" argument.
(aarch64_gimple_fold_builtin): Rename to...
(aarch64_general_gimple_fold_builtin): ...this. Take the function
subcode and gcall as arguments, and return the new function call.
* config/aarch64/aarch64.c (aarch64_init_builtins)
(aarch64_fold_builtin, aarch64_gimple_fold_builtin)
(aarch64_expand_builtin, aarch64_builtin_decl): New functions.
(aarch64_builtin_reciprocal): Call aarch64_general_builtin_rsqrt
instead of aarch64_builtin_rsqrt.
(aarch64_mangle_type): Call aarch64_general_mangle_builtin_type
instead of aarch64_mangle_builtin_type.
[C][C++] Allow targets to check calls to BUILT_IN_MD functions
For SVE, we'd like the frontends to check calls to target-specific
built-in functions in the same way that they already do for "normal"
builtins. This patch adds a target hook for that and extends
check_builtin_function_arguments accordingly.
A slight complication is that when TARGET_RESOLVE_OVERLOADED_BUILTIN
has resolved an overload, it can use build_function_call_vec to build
the call to the underlying non-overloaded function decl. This in
turn coerces the arguments to the function type and then calls
check_builtin_function_arguments to check the final call. If the
target does find a problem in this final call, it can be useful
to refer to the original overloaded function decl in diagnostics,
since that's what the user wrote.
The patch therefore passes the original decl as a final optional
parameter to build_function_call_vec.
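The effect of the optional parameter can be sketched as follows. The struct decl_sketch, the function diagnose_call_sketch and the message text are hypothetical; the sketch only shows how a defaulted final argument lets diagnostics name the function the user actually wrote.

```cpp
#include <string>

// Hypothetical stand-in for a function decl.
struct decl_sketch { std::string name; };

// fndecl is the resolved, non-overloaded function. orig_fndecl, when
// given, is the overloaded function named in the source.
std::string diagnose_call_sketch(const decl_sketch &fndecl,
                                 const decl_sketch *orig_fndecl = nullptr) {
  const decl_sketch &shown = orig_fndecl ? *orig_fndecl : fndecl;
  return "invalid argument in call to '" + shown.name + "'";
}
```

Existing callers that pass only the resolved decl keep their current behaviour, which is why the parameter is optional and comes last.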
2019-09-27 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* target.def (check_builtin_call): New target hook.
* doc/tm.texi.in (TARGET_CHECK_BUILTIN_CALL): New @hook.
* doc/tm.texi: Regenerate.
gcc/c-family/
* c-common.h (build_function_call_vec): Take the original
function decl as an optional final parameter.
(check_builtin_function_arguments): Take the original function decl.
* c-common.c (check_builtin_function_arguments): Likewise.
Handle all built-in functions, not just BUILT_IN_NORMAL ones.
Use targetm.check_builtin_call to check BUILT_IN_MD functions.
gcc/c/
* c-typeck.c (build_function_call_vec): Take the original function
decl as an optional final parameter. Pass all built-in calls to
check_builtin_function_arguments.
gcc/cp/
* cp-tree.h (build_cxx_call): Take the original function decl
as an optional final parameter.
(cp_build_function_call_vec): Likewise.
* call.c (build_cxx_call): Likewise. Pass all built-in calls to
check_builtin_function_arguments.
* typeck.c (build_function_call_vec): Take the original function
decl as an optional final parameter and pass it to
cp_build_function_call_vec.
(cp_build_function_call_vec): Take the original function
decl as an optional final parameter and pass it to build_cxx_call.
Fix reduc_index==1 handling for COND_REDUCTION (PR91909)
The then/else order of the VEC_COND_EXPRs created by
vect_create_epilog_for_reduction needs to line up with the
main VEC_COND_EXPR.
2019-09-27 Richard Sandiford <richard.sandiford@arm.com>
gcc/
PR tree-optimization/91909
* tree-vect-loop.c (vect_create_epilog_for_reduction): Take a
reduc_index parameter. When handling COND_REDUCTION, make sure
that the reduction phi operand is in the correct arm of the
VEC_COND_EXPR.
(vectorizable_reduction): Pass reduc_index to the above.
Zero-sized fields do not get processed by finish_record_type: they're
removed from the field list before and reinserted after, so their
DECL_SIZE_UNIT remains unset, causing the translation of assignment
statements with use_memset_p, in quite unusual circumstances, to use a
NULL_TREE as the memset length. This patch sets DECL_SIZE_UNIT for
the zero-sized fields, that don't go through language-independent
layout, in language-specific layout.
for gcc/ada/ChangeLog
* gcc-interface/decl.c (components_to_record): Set
DECL_SIZE_UNIT for zero-sized fields.
Eric Botcazou [Thu, 26 Sep 2019 21:43:51 +0000 (21:43 +0000)]
charset.c (UCS_LIMIT): New macro.
* charset.c (UCS_LIMIT): New macro.
(ucn_valid_in_identifier): Use it instead of a hardcoded constant.
(_cpp_valid_ucn): Issue a pedantic warning for UCNs larger than
UCS_LIMIT outside of identifiers in C and in C++2a or later.
Max Filippov [Thu, 26 Sep 2019 20:51:27 +0000 (20:51 +0000)]
xtensa: fix PR target/91880
Xtensa hwloop_optimize segfaults when zero overhead loop is about to be
inserted as the first instruction of the function.
Insert zero overhead loop instruction into new basic block before the
loop when basic block that precedes the loop is empty.
2019-09-26 Max Filippov <jcmvbkbc@gmail.com>
gcc/
* config/xtensa/xtensa.c (hwloop_optimize): Insert zero overhead
loop instruction into new basic block before the loop when basic
block that precedes the loop is empty.
gcc/testsuite/
* gcc.target/xtensa/pr91880.c: New test case.
* gcc.target/xtensa/xtensa.exp: New test suite.
We can use the mode iterators directly with an @pattern to avoid the
need for an expander that was only there to pass the mode through.
gcc/ChangeLog:
2019-09-26 Iain Sandoe <iain@sandoe.co.uk>
* config/rs6000/darwin.md: Replace the expanders for
load_macho_picbase and reload_macho_picbase with use of '@'
in their respective define_insns.
(nonlocal_goto_receiver): Pass Pmode to gen_reload_macho_picbase.
* config/rs6000/rs6000-logue.c (rs6000_emit_prologue): Pass
Pmode to gen_load_macho_picbase.
* config/rs6000/rs6000.md: Likewise.
Richard Biener [Thu, 26 Sep 2019 16:54:51 +0000 (16:54 +0000)]
re PR tree-optimization/91896 (ICE in vect_get_vec_def_for_stmt_copy, at tree-vect-stmts.c:1687)
2019-09-25 Richard Biener <rguenther@suse.de>
PR tree-optimization/91896
* tree-vect-loop.c (vectorizable_reduction): The single
def-use cycle optimization cannot apply when there's more
than one pattern stmt involved.
Richard Biener [Thu, 26 Sep 2019 13:52:45 +0000 (13:52 +0000)]
tree-vect-loop.c (vect_analyze_loop_operations): Also call vectorizable_reduction for vect_double_reduction_def.
2019-09-26 Richard Biener <rguenther@suse.de>
* tree-vect-loop.c (vect_analyze_loop_operations): Also call
vectorizable_reduction for vect_double_reduction_def.
(vect_transform_loop): Likewise.
(vect_create_epilog_for_reduction): Move double-reduction
PHI creation and preheader argument setting of PHIs ...
(vectorizable_reduction): ... here. Also process
vect_double_reduction_def PHIs, creating the vectorized
PHI nodes, remembering the scalar adjustment computed for
the epilogue in STMT_VINFO_REDUC_EPILOGUE_ADJUSTMENT.
Remember the original reduction code in STMT_VINFO_REDUC_CODE.
* tree-vectorizer.c (vec_info::new_stmt_vec_info):
Initialize STMT_VINFO_REDUC_CODE.
* tree-vectorizer.h (_stmt_vec_info::reduc_epilogue_adjustment): New.
(_stmt_vec_info::reduc_code): Likewise.
(STMT_VINFO_REDUC_EPILOGUE_ADJUSTMENT): Likewise.
(STMT_VINFO_REDUC_CODE): Likewise.
This patch implements some more SIMD32 intrinsics, but these ones have a DImode result+addend.
Apart from that there's nothing too exciting about them.
Bootstrapped and tested on arm-none-linux-gnueabihf.
* config/arm/arm.md (arm_<simd32_op>): New define_insn.
* config/arm/arm_acle.h (__smlald, __smlaldx, __smlsld, __smlsldx):
Define.
* config/arm/arm_acle.h: Define builtins for the above.
* config/arm/iterators.md (SIMD32_DIMODE): New int_iterator.
(simd32_op): Handle the above.
* config/arm/unspecs.md: Define unspecs for the above.
This patch is part of a series to implement the SIMD32 ACLE intrinsics [1].
The interesting parts implementation-wise involve adding support for setting and reading
the Q bit for saturation and the GE-bits for the packed SIMD instructions.
That will come in a later patch.
For now, this patch implements the other intrinsics that don't need anything special;
just a mapping from arm_acle.h function to builtin to RTL expander+unspec.
I've compressed as many as I could with iterators so that we end up needing only 3
new define_insns.
Bootstrapped and tested on arm-none-linux-gnueabihf.
My recent assemble_real patch (r275873) meant that we now output
negative FP16 constants in the same way as we'd output an integer
subreg of them. This patch updates gcc.target/arm/fp16-* accordingly.
2019-09-26 Richard Sandiford <richard.sandiford@arm.com>
gcc/testsuite/
* gcc.target/arm/fp16-compile-alt-3.c: Expect (__fp16) -2.0
to be written as a negative short rather than a positive one.
* gcc.target/arm/fp16-compile-ieee-3.c: Likewise.
David Malcolm [Wed, 25 Sep 2019 19:32:44 +0000 (19:32 +0000)]
Colorize %L and %C text to match diagnostic_show_locus (PR fortran/91426)
gcc/fortran/ChangeLog:
PR fortran/91426
* error.c (curr_diagnostic): New static variable.
(gfc_report_diagnostic): New static function.
(gfc_warning): Replace call to diagnostic_report_diagnostic with
call to gfc_report_diagnostic.
(gfc_format_decoder): Colorize the text of %L and %C to match the
colorization used by diagnostic_show_locus.
(gfc_warning_now_at): Replace call to diagnostic_report_diagnostic with
call to gfc_report_diagnostic.
(gfc_warning_now): Likewise.
(gfc_warning_internal): Likewise.
(gfc_error_now): Likewise.
(gfc_fatal_error): Likewise.
(gfc_error_opt): Likewise.
(gfc_internal_error): Likewise.
Martin Jambor [Wed, 25 Sep 2019 14:24:33 +0000 (16:24 +0200)]
Remove newly unused function and variable in tree-sra
Hi,
Martin and his clang warnings discovered that I forgot to remove a
static inline function and a variable when ripping out the old IPA-SRA
from tree-sra.c and both are now unused. Thus I am doing that now
with the patch below which I will commit as obvious (after including
it in a round of a bootstrap and testing on an x86_64-linux).
[AArch64] Use implementation namespace consistently in arm_neon.h
We're somewhat inconsistent in arm_neon.h when it comes to using the implementation namespace for local
identifiers. This means things like:
#define hash_abcd 0
#define hash_e 1
#define wk 2
#include "arm_neon.h"
uint32x4_t
foo (uint32x4_t a, uint32_t b, uint32x4_t c)
{
  return vsha1cq_u32 (a, b, c);
}
don't compile.
This patch fixes these issues throughout the whole of arm_neon.h.
Bootstrapped and tested on aarch64-none-linux-gnu.
The advsimd-intrinsics.exp tests pass just fine.
Richard Biener [Wed, 25 Sep 2019 13:09:25 +0000 (13:09 +0000)]
re PR tree-optimization/91896 (ICE in vect_get_vec_def_for_stmt_copy, at tree-vect-stmts.c:1687)
2019-09-25 Richard Biener <rguenther@suse.de>
PR tree-optimization/91896
* tree-vect-loop.c (vectorizable_reduction): The single
def-use cycle optimization cannot apply when there's more
than one pattern stmt involved.
[AARCH64] Add support for new control bits CTR_EL0.DIC and CTR_EL0.IDC
The DCache clean and ICache invalidation requirements for
instruction-to-data coherence are discoverable through new fields in CTR_EL0.
Support the two bits: when they are enabled, the CPU core does
not execute the unnecessary DCache clean or ICache invalidation
instructions.
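A small sketch of how the two bits might be tested, assuming the architectural positions IDC = bit 28 and DIC = bit 29 of CTR_EL0. The function names are hypothetical, and the register value is passed in as a plain integer so the sketch does not depend on AArch64 system-register access.

```cpp
#include <cstdint>

// Assumed field positions in CTR_EL0.
const std::uint64_t CTR_IDC = 1ull << 28;  // set: DCache clean not required
const std::uint64_t CTR_DIC = 1ull << 29;  // set: ICache invalidation not required

// True if a data cache clean is still needed for instruction-to-data
// coherence.
bool needs_dcache_clean(std::uint64_t ctr_el0) {
  return (ctr_el0 & CTR_IDC) == 0;
}

// True if an instruction cache invalidation is still needed.
bool needs_icache_invalidate(std::uint64_t ctr_el0) {
  return (ctr_el0 & CTR_DIC) == 0;
}
```

A cache-synchronization routine would consult these predicates once and skip the corresponding maintenance loop when the answer is false.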