rs6000-p8swap.c (const_load_sequence_p): Revise this function to return false if the definition used by the swap...
gcc/ChangeLog:
2017-09-27 Kelvin Nilsen <kelvin@gcc.gnu.org>
* config/rs6000/rs6000-p8swap.c (const_load_sequence_p): Revise
this function to return false if the definition used by the swap
instruction is artificial, or if the memory address from which the
constant value is loaded is not represented by a base address held
in a register or if the base address register is a frame or stack
pointer. Additionally, return false if the base address of the
loaded constant is a SYMBOL_REF but is not considered to be a
constant.
(replace_swapped_load_constant): New function.
(rs6000_analyze_swaps): Add a new pass to replace a swap of a
loaded constant vector with a load of a swapped constant vector.
gcc/testsuite/ChangeLog:
2017-09-27 Kelvin Nilsen <kelvin@gcc.gnu.org>
* gcc.target/powerpc/swaps-p8-28.c: New test.
* gcc.target/powerpc/swaps-p8-29.c: New test.
* gcc.target/powerpc/swaps-p8-30.c: New test.
* gcc.target/powerpc/swaps-p8-31.c: New test.
* gcc.target/powerpc/swaps-p8-32.c: New test.
* gcc.target/powerpc/swaps-p8-33.c: New test.
* gcc.target/powerpc/swaps-p8-34.c: New test.
* gcc.target/powerpc/swaps-p8-35.c: New test.
* gcc.target/powerpc/swaps-p8-36.c: New test.
* gcc.target/powerpc/swaps-p8-37.c: New test.
* gcc.target/powerpc/swaps-p8-38.c: New test.
* gcc.target/powerpc/swaps-p8-39.c: New test.
* gcc.target/powerpc/swaps-p8-40.c: New test.
* gcc.target/powerpc/swaps-p8-41.c: New test.
* gcc.target/powerpc/swaps-p8-42.c: New test.
* gcc.target/powerpc/swaps-p8-43.c: New test.
* gcc.target/powerpc/swaps-p8-44.c: New test.
* gcc.target/powerpc/swaps-p8-45.c: New test.
François Dumont [Wed, 27 Sep 2017 20:16:43 +0000 (20:16 +0000)]
22131.cc: Make test less istreambuf_iterator implementation dependent.
2017-09-27 François Dumont <fdumont@gcc.gnu.org>
* testsuite/22_locale/money_get/get/char/22131.cc: Make test less
istreambuf_iterator implementation dependent.
* testsuite/22_locale/money_get/get/wchar_t/22131.cc: Likewise.
Ian Lance Taylor [Wed, 27 Sep 2017 17:46:33 +0000 (17:46 +0000)]
compiler: fix crash on struct that embeds pointer type
The type verification code that enforces rules about the types of
embedded struct fields was not properly handling the case where the
pointed-to type is a pointer type, e.g.
type s *struct{ C int }
type t struct{ *s }
which is illegal according to the spec. Tweak the verifier to catch
this case, and add a guard in the lowering pass to make sure that we
don't crash on invalid accesses to field "C" in type "t" above.
[BRIGFE] Improved support for function and module scope group
segment variables.
PRM specs defines function and module scope group segment variables
as an experimental feature. However, PRM test suite uses and
hcc relies on them. In addition, hcc assumes certain group variable
layout in its dynamic group segment allocation code.
We cannot have global group memory offsets if we want to
both have kernel-specific group segment size and multiple kernels
calling the same functions that use function scope group memory
variables.
Now group segment is handled by separate book keeping of module
scope and function (kernel) offsets. Each function has a "frame"
in the group segment offset to which is given as an argument.
* graphite-scop-detection.c (find_scop_parameters): Move
loop bound handling ...
(gather_bbs::before_dom_children): ... here, avoiding the need
to build scop_info->loop_nest.
(record_loop_in_sese): Remove.
* sese.h (sese_info_t::loop_nest): Remove.
* sese.c (new_sese_info): Do not allocate loop_nest.
(free_sese_info): Do not free loop_nest.
Richard Biener [Wed, 27 Sep 2017 13:06:34 +0000 (13:06 +0000)]
graphite.h (scop::max_alias_set): New member.
2017-09-27 Richard Biener <rguenther@suse.de>
* graphite.h (scop::max_alias_set): New member.
* graphite-scop-detection.c: Remove references to non-existing
--param in comments.
(build_alias_sets): Record the maximum alias set used for drs.
(build_scops): Support zero as unlimited for
--param graphite-max-arrays-per-scop.
* graphite-sese-to-poly.c (add_scalar_version_numbers): Remove
and inline into ...
(build_poly_sr_1): ... here. Compute alias set based on the
maximum alias set used for drs rather than
PARAM_GRAPHITE_MAX_ARRAYS_PER_SCOP
* doc/invoke.texi (graphite-max-bbs-per-function): Remove.
(graphite-max-nb-scop-params): Document special value zero.
* domwalk.h (dom_walker::STOP): New symbolical constant.
(dom_walker::dom_walker): Add optional parameter for bb to
RPO mapping.
(dom_walker::~dom_walker): Declare.
(dom_walker::before_dom_children): Document STOP return value.
(dom_walker::m_user_bb_to_rpo): New member.
(dom_walker::m_bb_to_rpo): Likewise.
* domwalk.c (dom_walker::dom_walker): Compute bb to RPO
mapping here if not provided by the user.
(dom_walker::~dom_walker): Free bb to RPO mapping if not
provided by the user.
(dom_walker::STOP): Define.
(dom_walker::walk): Do not compute bb to RPO mapping here.
Support STOP return value from before_dom_children to stop
walking.
* graphite-optimize-isl.c (optimize_isl): If the schedule
is the same still generate code if -fgraphite-identity
or -floop-parallelize-all are given.
* graphite-scop-detection.c: Include cfganal.h.
(gather_bbs::gather_bbs): Get and pass through bb to RPO
mapping.
(gather_bbs::before_dom_children): Return STOP for BBs
not in the region.
(build_scops): Compute bb to RPO mapping and pass it to
the domwalk. Treat --param graphite-max-nb-scop-params=0
as not limiting the number of params.
* graphite.c (graphite_initialize): Remove limit on the
number of basic-blocks in a function.
* params.def (PARAM_GRAPHITE_MAX_BBS_PER_FUNCTION): Remove.
(PARAM_GRAPHITE_MAX_NB_SCOP_PARAMS): Adjust to documented
default value of 10.
Michael Meissner [Wed, 27 Sep 2017 01:20:24 +0000 (01:20 +0000)]
vsx.md (peephole for optimizing move SF to GPR): Adjust code to eliminate needing to do the shift right 32-bits operation after...
[gcc]
2017-09-26 Michael Meissner <meissner@linux.vnet.ibm.com>
* config/rs6000/vsx.md (peephole for optimizing move SF to GPR):
Adjust code to eliminate needing to do the shift right 32-bits
operation after XSCVDPSPN.
[gcc/testsuite]
2017-09-26 Michael Meissner <meissner@linux.vnet.ibm.com>
* gcc.target/powerpc/pr71977-1.c: Update test to know that we
don't generate a 32-bit shift after doing XSCVDPSPN.
* gcc.target/powerpc/direct-move-float1.c: Likewise.
* gcc.target/powerpc/direct-move-float3.c: New test.
Janus Weil [Tue, 26 Sep 2017 20:28:00 +0000 (22:28 +0200)]
re PR fortran/82143 (add a -fdefault-real-16 flag)
2017-09-26 Janus Weil <janus@gcc.gnu.org>
PR fortran/82143
PR fortran/82324
* doc/sourcebuild.texi: Document fortran_real_10 and fortran_real_16.
2017-09-26 Janus Weil <janus@gcc.gnu.org>
PR fortran/82143
PR fortran/82324
* lib/target-supports.exp (check_effective_target_fortran_real_10): New.
* gfortran.dg/promotion_3.f90: Only run if real(16) is available.
* gfortran.dg/promotion_4.f90: Only run if real(10) is available.
Don't assume that DOUBLE PRECISION has kind=16.
Michael Meissner [Tue, 26 Sep 2017 18:12:33 +0000 (18:12 +0000)]
rs6000.md (movsi_from_df): Optimize converting a DFmode to a SFmode...
2017-09-26 Michael Meissner <meissner@linux.vnet.ibm.com>
* config/rs6000/rs6000.md (movsi_from_df): Optimize converting a
DFmode to a SFmode, and then needing to move the SFmode to a GPR
to use the XSCVDPSP instruction instead of FRSP and XSCVDPSPN.
Michael Meissner [Tue, 26 Sep 2017 18:04:37 +0000 (18:04 +0000)]
rs6000.md (movsi_from_sf): Adjust code to eliminate doing a 32-bit shift right or vector extract after...
2017-09-26 Michael Meissner <meissner@linux.vnet.ibm.com>
* config/rs6000/rs6000.md (movsi_from_sf): Adjust code to
eliminate doing a 32-bit shift right or vector extract after doing
XSCVDPSPN. Use zero_extendsidi2 instead of p8_mfvsrd_4_disf to
move the value to the GPRs.
(movdi_from_sf_zero_ext): Likewise.
(reload_gpr_from_vsxsf): Likewise.
(p8_mfvsrd_4_disf): Delete, no longer used.
Michael Meissner [Tue, 26 Sep 2017 17:37:14 +0000 (17:37 +0000)]
rs6000.md (extendsi<mode>2): Add a splitter to do sign extension from a vector register to a GPR by doing a...
2017-09-26 Michael Meissner <meissner@linux.vnet.ibm.com>
* config/rs6000/rs6000.md (extendsi<mode>2): Add a splitter to do
sign extension from a vector register to a GPR by doing a 32-bit
direct move and then an EXTSW.
(extendsi<mode>2 splitter): Likewise.
Martin Jambor [Tue, 26 Sep 2017 17:15:29 +0000 (19:15 +0200)]
Make SRA qsort comparator transitive
2017-09-26 Martin Jambor <mjambor@suse.cz>
* tree-sra.c (compare_access_positions): Put integral types first,
stabilize sorting of integral types, remove conditions putting
non-full-precision integers last.
(sort_and_splice_var_accesses): Disable scalarization if a
non-integert would be represented by a non-full-precision integer.
Joseph Myers [Tue, 26 Sep 2017 16:35:53 +0000 (17:35 +0100)]
Enable no-exec stacks for more targets using the Linux kernel.
Building glibc for many different configurations and running the
compilation parts of the testsuite runs into failures of the
elf/check-execstack test for hppa and microblaze. Those
configurations default to executable stacks in the Linux kernel
because of VM_DATA_DEFAULT_FLAGS definitions including VM_EXEC
(VM_DATA_DEFAULT_FLAGS being the default definition of
VM_STACK_DEFAULT_FLAGS).
This fails because those configurations are not generating
.note.GNU-stack sections to indicate that programs do not need an
executable stack. This patch fixes GCC to generate those sections on
those architectures (when configured for a target using the Linux
kernel), as it does on other architectures, together with adding that
section to libgcc .S sources, with the same code as used on other
architectures (or a variant using "#ifdef __linux__" instead of the
usual "#if defined(__ELF__) && defined(__linux__)" for microblaze, as
that configuration doesn't use elfos.h and so doesn't define __ELF__).
This suffices to eliminate that glibc test failure. (For hppa, the
compilation parts of the glibc testsuite still fail because of the
separate elf/check-textrel failure.)
gcc:
* config/microblaze/linux.h (TARGET_ASM_FILE_END): Likewise.
* config/pa/pa.h (NEED_INDICATE_EXEC_STACK): Likewise.
* config/pa/pa-linux.h (NEED_INDICATE_EXEC_STACK): Likewise.
* config/pa/pa.c (pa_hpux_file_end): Rename to pa_file_end.
Define unconditionally, with [ASM_OUTPUT_EXTERNAL_REAL]
conditionals inside the function instead of around it. Call
file_end_indicate_exec_stack if NEED_INDICATE_EXEC_STACK.
(TARGET_ASM_FILE_END): Define unconditionally to pa_file_end.
Jakub Jelinek [Tue, 26 Sep 2017 13:58:11 +0000 (15:58 +0200)]
re PR middle-end/35691 (Missed (a == 0) && (b == 0) into (a|(typeof(a)(b)) == 0 when the types don't match)
PR middle-end/35691
* tree-ssa-reassoc.c (update_range_test): Dump r->exp each time
if it is different SSA_NAME.
(optimize_range_tests_cmp_bitwise): New function.
(optimize_range_tests): Call it.
* gcc.dg/pr35691-5.c: New test.
* gcc.dg/pr35691-6.c: New test.
Andreas Krebbel [Tue, 26 Sep 2017 10:35:27 +0000 (10:35 +0000)]
S/390: Fix vector fp unordered compares
V2DF mode was still hard-coded here.
gcc/ChangeLog:
2017-09-26 Andreas Krebbel <krebbel@linux.vnet.ibm.com>
* config/s390/s390.c (s390_expand_vec_compare): Use the new mode
independent expanders.
* config/s390/vector.md ("vec_cmpuneq", "vec_cmpltgt")
("vec_ordered", "vec_unordered"): New expanders.
Andreas Krebbel [Tue, 26 Sep 2017 10:33:37 +0000 (10:33 +0000)]
S/390: Add support for vec_shr
gcc/ChangeLog:
2017-09-26 Andreas Krebbel <krebbel@linux.vnet.ibm.com>
* config/s390/predicates.md ("const_shift_by_byte_operand"): New
predicate.
* config/s390/vector.md ("*vec_srb<mode>"): Change modes to V_128
and V16QI.
("*vec_slb<mode>"): New insn pattern.
("vec_shr_<mode>"): New expander.
* config/s390/vx-builtins.md ("vec_slb<mode>"): Turn into expander
and force the shift count operand to V16QImode.
("vec_srb<mode>"): Set shift count mode to V16QI.
Andreas Krebbel [Tue, 26 Sep 2017 10:32:58 +0000 (10:32 +0000)]
S/390: Add widening vector mult lo/hi patterns
Add support for widening vector multiply lo/hi patterns. These do not
directly match on IBM Z instructions but can be emulated with even/odd
+ vector merge.
gcc/ChangeLog:
2017-09-26 Andreas Krebbel <krebbel@linux.vnet.ibm.com>
* config/s390/vector.md ("vec_widen_umult_lo_<mode>")
("vec_widen_umult_hi_<mode>", "vec_widen_smult_lo_<mode>")
("vec_widen_smult_hi_<mode>"): New expander definitions.
Richard Earnshaw [Tue, 26 Sep 2017 09:33:49 +0000 (09:33 +0000)]
[ARM] PR82175 - fix -mcpu=native not working correctly.
The new option processing machinery relies on %< rules in the specs to
suppress options that are rewritten. Suppression appears to be a two
phase process where the option is partially suppressed when %< is
processed and then fully suppressed at the end of the string. Strings
are separated by commas and there can be multiple strings used to form
DRIVER_SELF_SPECS.
The fix in this case is to separate the driver self specs for ARM into
separate rules as described; this forces the -m{cpu,tune,arch}=native
options to be properly removed before proceeding to the next rule set.
PR target/82175
* config/arm/arm.h (DRIVER_SELF_SPECS): Separate sub-rules with commas.
PR demangler/82195
* cp-demangle.c (d_encoding): Strip return type when name is a
LOCAL_NAME.
(d_local_name): Strip return type of enclosing TYPED_NAME.
* testsuite/demangle-expected: Add and adjust tests.
Jeff Law [Mon, 25 Sep 2017 23:13:55 +0000 (17:13 -0600)]
rs6000-protos.h (output_probe_stack_range): Update prototype for new argument.
* config/rs6000/rs6000-protos.h (output_probe_stack_range): Update
prototype for new argument.
* config/rs6000/rs6000.c (rs6000_emit_allocate_stack_1): New function,
mostly extracted from rs6000_emit_allocate_stack.
(rs6000_emit_probe_stack_range_stack_clash): New function.
(rs6000_emit_allocate_stack): Call
rs6000_emit_probe_stack_range_stack_clash as needed.
(rs6000_emit_probe_stack_range): Add additional argument
to call to gen_probe_stack_range{si,di}.
(output_probe_stack_range): New.
(output_probe_stack_range_1): Renamed from output_probe_stack_range.
(output_probe_stack_range_stack_clash): New.
(rs6000_emit_prologue): Emit notes into dump file as requested.
* rs6000.md (allocate_stack): Handle -fstack-clash-protection.
(probe_stack_range<P:mode>): Operand 0 is now early-clobbered.
Add additional operand and pass it to output_probe_stack_range.
* lib/target-supports.exp
(check_effective_target_supports_stack_clash_protection): Enable for
rs6000 and powerpc targets.
Bin Cheng [Mon, 25 Sep 2017 17:32:36 +0000 (17:32 +0000)]
re PR tree-optimization/82163 (ICE on valid code at -O3 on x86_64-linux-gnu: in check_loop_closed_ssa_use, at tree-ssa-loop-manip.c:707)
PR tree-optimization/82163
* tree-ssa-loop-manip.h (verify_loop_closed_ssa): New parameter.
(checking_verify_loop_closed_ssa): New parameter.
* tree-ssa-loop-manip.c (check_loop_closed_ssa_use): Delete.
(check_loop_closed_ssa_stmt): Delete.
(check_loop_closed_ssa_def, check_loop_closed_ssa_bb): New functions.
(verify_loop_closed_ssa): Check loop closed ssa form for LOOP.
(tree_transform_and_unroll_loop): Check loop closed ssa form only for
changed loops.
gcc/testsuite
* gcc.dg/tree-ssa/pr82163.c: New test.
Thomas Koenig [Mon, 25 Sep 2017 16:49:48 +0000 (16:49 +0000)]
lang.opt: Add -Wdo-subscript.
2017-09-25 Thomas Koenig <tkoenig@gcc.gnu.org>
* lang.opt: Add -Wdo-subscript.
* frontend-passes.c (do_t): New type.
(doloop_list): Use variable of do_type.
(if_level): Variable to track if levels.
(select_level): Variable to track select levels.
(gfc_run_passes): Initialize i_level and select_level.
(doloop_code): Record current level of if + select
level in doloop_list. Add seen_goto if there could
be a branch outside the loop. Use different type for
doloop_list.
(doloop_function): Call do_intent and do_subscript; move
functionality of checking INTENT to do_intent.
(insert_index_t): New type, for callback_insert_index.
(callback_insert_index): New function.
(insert_index): New function.
(do_subscript): New function.
(do_intent): New function.
(gfc_code_walker): Keep track of if_level and select_level.
* invoke.texi: Document -Wdo-subscript.
2017-09-25 Thomas Koenig <tkoenig@gcc.gnu.org>
* gfortran.dg/do_subscript_1.f90: New test.
* gfortran.dg/do_subscript_2.f90: New test.
* gfortran.dg/gomp/associate1.f90: Add out of bounds warning.
* gfortran.dg/predcom-1.f: Adjust loop bounds.
* gfortran.dg/unconstrained_commons.f: Add out of bounds warning.
was very common, so the patch adds a canned definition for that,
called constant_alignment_word_strings. Some ports had a variation
that used a port-local FASTEST_ALIGNMENT instead of BITS_PER_WORD;
the patch uses constant_alignment_word_strings if FASTEST_ALIGNMENT
was always BITS_PER_WORD and a port-local hook function otherwise.
2017-09-25 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* target.def (constant_alignment): New hook.
* defaults.h (CONSTANT_ALIGNMENT): Delete.
* doc/tm.texi.in (CONSTANT_ALIGNMENT): Replace with...
(TARGET_CONSTANT_ALIGNMENT): ...this new hook.
* doc/tm.texi: Regenerate.
* targhooks.h (default_constant_alignment): Declare.
(constant_alignment_word_strings): Likewise.
* targhooks.c (default_constant_alignment): New function.
(constant_alignment_word_strings): Likewise.
* builtins.c (get_object_alignment_2): Use targetm.constant_alignment
instead of CONSTANT_ALIGNMENT.
* varasm.c (align_variable, get_variable_align, build_constant_desc)
(force_const_mem): Likewise.
* config/aarch64/aarch64.h (CONSTANT_ALIGNMENT): Delete.
* config/aarch64/aarch64.c (aarch64_constant_alignment): New function.
(aarch64_classify_address): Call it instead of CONSTANT_ALIGNMENT.
(TARGET_CONSTANT_ALIGNMENT): Redefine.
* config/alpha/alpha.h (CONSTANT_ALIGNMENT): Delete commented-out
definition.
* config/arc/arc.h (CONSTANT_ALIGNMENT): Delete.
* config/arc/arc.c (TARGET_CONSTANT_ALIGNMENT): Redefine to
constant_alignment_word_strings.
* config/arm/arm.h (CONSTANT_ALIGNMENT_FACTOR): Delete.
(CONSTANT_ALIGNMENT): Likewise.
* config/arm/arm.c (TARGET_CONSTANT_ALIGNMENT): Redefine.
(arm_constant_alignment): New function.
* config/bfin/bfin.h (CONSTANT_ALIGNMENT): Delete.
* config/bfin/bfin.c (TARGET_CONSTANT_ALIGNMENT): Redefine to
constant_alignment_word_strings.
* config/cr16/cr16.h (CONSTANT_ALIGNMENT): Delete.
* config/cr16/cr16.c (TARGET_CONSTANT_ALIGNMENT): Redefine to
constant_alignment_word_strings.
* config/cris/cris.h (CONSTANT_ALIGNMENT): Delete.
* config/cris/cris.c (TARGET_CONSTANT_ALIGNMENT): Redefine.
(cris_constant_alignment): New function.
* config/epiphany/epiphany.h (CONSTANT_ALIGNMENT): Delete.
* config/epiphany/epiphany.c (TARGET_CONSTANT_ALIGNMENT): Redefine.
(epiphany_constant_alignment): New function.
* config/fr30/fr30.h (CONSTANT_ALIGNMENT): Delete.
* config/fr30/fr30.c (TARGET_CONSTANT_ALIGNMENT): Redefine to
constant_alignment_word_strings.
* config/frv/frv.h (CONSTANT_ALIGNMENT): Delete.
* config/frv/frv.c (TARGET_CONSTANT_ALIGNMENT): Redefine to
constant_alignment_word_strings.
* config/ft32/ft32.h (CONSTANT_ALIGNMENT): Delete.
* config/ft32/ft32.c (TARGET_CONSTANT_ALIGNMENT): Redefine to
constant_alignment_word_strings.
* config/i386/i386.h (CONSTANT_ALIGNMENT): Delete.
* config/i386/i386-protos.h (ix86_constant_alignment): Delete.
* config/i386/i386.c (ix86_constant_alignment): Make static.
Use the same interface as the target hook.
(TARGET_CONSTANT_ALIGNMENT): Redefine.
* config/ia64/ia64.h (CONSTANT_ALIGNMENT): Delete.
* config/ia64/ia64.c (TARGET_CONSTANT_ALIGNMENT): Redefine to
constant_alignment_word_strings.
* config/iq2000/iq2000.h (CONSTANT_ALIGNMENT): Delete.
* config/iq2000/iq2000.c (iq2000_constant_alignment): New function.
(TARGET_CONSTANT_ALIGNMENT): Redefine.
* config/lm32/lm32.h (CONSTANT_ALIGNMENT): Delete.
* config/lm32/lm32.c (TARGET_CONSTANT_ALIGNMENT): Redefine to
constant_alignment_word_strings.
* config/m32r/m32r.h (CONSTANT_ALIGNMENT): Delete.
* config/m32r/m32r.c (TARGET_CONSTANT_ALIGNMENT): Redefine to
constant_alignment_word_strings.
* config/mcore/mcore.h (CONSTANT_ALIGNMENT): Delete.
* config/mcore/mcore.c (TARGET_CONSTANT_ALIGNMENT): Redefine to
constant_alignment_word_strings.
* config/microblaze/microblaze.h (CONSTANT_ALIGNMENT): Delete.
* config/microblaze/microblaze.c (microblaze_constant_alignment):
New function.
(TARGET_CONSTANT_ALIGNMENT): Redefine.
* config/mips/mips.h (CONSTANT_ALIGNMENT): Delete.
* config/mips/mips.c (mips_constant_alignment): New function.
(TARGET_CONSTANT_ALIGNMENT): Redefine.
* config/mmix/mmix.h (CONSTANT_ALIGNMENT): Delete.
* config/mmix/mmix-protos.h (mmix_constant_alignment): Delete.
* config/mmix/mmix.c (TARGET_CONSTANT_ALIGNMENT): Redefine.
(mmix_constant_alignment): Make static. Use the same interface
as the target hook.
* config/moxie/moxie.h (CONSTANT_ALIGNMENT): Delete.
* config/moxie/moxie.c (TARGET_CONSTANT_ALIGNMENT): Redefine to
constant_alignment_word_strings.
* config/nios2/nios2.h (CONSTANT_ALIGNMENT): Delete.
* config/nios2/nios2.c (TARGET_CONSTANT_ALIGNMENT): Redefine to
constant_alignment_word_strings.
* config/pa/pa.h (CONSTANT_ALIGNMENT): Delete.
* config/pa/pa.c (TARGET_CONSTANT_ALIGNMENT): Redefine to
constant_alignment_word_strings.
* config/powerpcspe/powerpcspe.h (CONSTANT_ALIGNMENT): Delete.
* config/powerpcspe/powerpcspe.c (TARGET_CONSTANT_ALIGNMENT): Redefine.
(rs6000_constant_alignment): New function.
* config/riscv/riscv.h (CONSTANT_ALIGNMENT): Delete.
* config/riscv/riscv.c (riscv_constant_alignment): New function.
(TARGET_CONSTANT_ALIGNMENT): Redefine.
* config/rs6000/rs6000.h (CONSTANT_ALIGNMENT): Delete.
* config/rs6000/rs6000.c (TARGET_CONSTANT_ALIGNMENT): Redefine.
(rs6000_constant_alignment): New function.
* config/s390/s390.h (CONSTANT_ALIGNMENT): Delete.
* config/s390/s390.c (s390_constant_alignment): New function.
(TARGET_CONSTANT_ALIGNMENT): Redefine.
* config/sh/sh.h (CONSTANT_ALIGNMENT): Delete.
* config/sh/sh.c (TARGET_CONSTANT_ALIGNMENT): Redefine to
constant_alignment_word_strings.
* config/sparc/sparc.h (CONSTANT_ALIGNMENT): Delete.
* config/sparc/sparc.c (TARGET_CONSTANT_ALIGNMENT): Redefine.
(sparc_constant_alignment): New function.
* config/spu/spu.h (CONSTANT_ALIGNMENT): Delete.
* config/spu/spu.c (spu_constant_alignment): New function.
(TARGET_CONSTANT_ALIGNMENT): Redefine.
* config/stormy16/stormy16.h (CONSTANT_ALIGNMENT): Delete.
* config/stormy16/stormy16.c (TARGET_CONSTANT_ALIGNMENT): Redefine to
constant_alignment_word_strings.
* config/tilegx/tilegx.h (CONSTANT_ALIGNMENT): Delete.
* config/tilegx/tilegx.c (TARGET_CONSTANT_ALIGNMENT): Redefine to
constant_alignment_word_strings.
* config/tilepro/tilepro.h (CONSTANT_ALIGNMENT): Delete.
* config/tilepro/tilepro.c (TARGET_CONSTANT_ALIGNMENT): Redefine to
constant_alignment_word_strings.
* config/visium/visium.h (CONSTANT_ALIGNMENT): Delete.
* config/visium/visium.c (TARGET_CONSTANT_ALIGNMENT): Redefine.
(visium_constant_alignment): New function.
* config/xtensa/xtensa.h (CONSTANT_ALIGNMENT): Delete.
* config/xtensa/xtensa.c (TARGET_CONSTANT_ALIGNMENT): Redefine.
(xtensa_constant_alignment): New function.
* system.h (CONSTANT_ALIGNMENT): Poison.
Will Schmidt [Mon, 25 Sep 2017 14:35:02 +0000 (14:35 +0000)]
rs6000.c (rs6000_gimple_fold_builtin): Add handling for early folding of vector stores (ALTIVEC_BUILTIN_ST_*).
[gcc]
2017-09-25 Will Schmidt <will_schmidt@vnet.ibm.com>
* config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Add handling
for early folding of vector stores (ALTIVEC_BUILTIN_ST_*).
(rs6000_builtin_valid_without_lhs): New helper function.
* config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin):
Remove obsoleted code for handling ALTIVEC_BUILTIN_VEC_ST.
This patch changes the element type of (auto_)vec_perm_indices from
unsigned char to unsigned short. This is needed for fixed-length
2048-bit SVE. (SVE is variable-length by default, but it's possible
to ask for specific vector lengths if you want to.)
2017-09-25 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* target.h (vec_perm_indices): Use unsigned short rather than
unsigned char.
(auto_vec_perm_indices): Likewise.
* config/aarch64/aarch64.c (aarch64_vectorize_vec_perm_const_ok):
Use unsigned int rather than unsigned char.
* config/arm/arm.c (arm_vectorize_vec_perm_const_ok): Likewise.
Richard Biener [Mon, 25 Sep 2017 13:19:16 +0000 (13:19 +0000)]
cfgloop.h (sort_sibling_loops): Declare.
2017-09-25 Richard Biener <rguenther@suse.de>
* cfgloop.h (sort_sibling_loops): Declare.
* cfgloop.c (sort_sibling_loops_cmp): New helper.
(sort_sibling_loops): New function sorting the sibling loop list
in RPO order.
* graphite.c (graphite_transform_loops): Sort sibling loops.
Update interface to TARGET_VECTORIZE_VEC_PERM_CONST_OK
This patch makes TARGET_VECTORIZE_VEC_PERM_CONST_OK take the permute
vector in the form of a vec_perm_indices instead of an unsigned char *.
It follows on from the recent patch that did the same in target-independent
code.
It was easy to make ARM and AArch64 use vec_perm_indices internally
as well, and converting AArch64 helps with SVE. I did try doing the same
for the other ports, but the surgery needed was much more invasive and
much less obviously correct.
2017-09-22 Richard Sandiford <richard.sandifird@linaro.org>
gcc/
* target.def (vec_perm_const_ok): Change sel parameter to
vec_perm_indices.
* optabs-query.c (can_vec_perm_p): Update accordingly.
* doc/tm.texi: Regenerate.
* config/aarch64/aarch64.c (expand_vec_perm_d): Change perm to
auto_vec_perm_indices and remove separate nelt field.
(aarch64_evpc_trn, aarch64_evpc_uzp, aarch64_evpc_zip)
(aarch64_evpc_ext, aarch64_evpc_rev, aarch64_evpc_dup)
(aarch64_evpc_tbl, aarch64_expand_vec_perm_const_1)
(aarch64_expand_vec_perm_const): Update accordingly.
(aarch64_vectorize_vec_perm_const_ok): Likewise. Change sel
to vec_perm_indices.
* config/arm/arm.c (expand_vec_perm_d): Change perm to
auto_vec_perm_indices and remove separate nelt field.
(arm_evpc_neon_vuzp, arm_evpc_neon_vzip, arm_evpc_neon_vrev)
(arm_evpc_neon_vtrn, arm_evpc_neon_vext, arm_evpc_neon_vtbl)
(arm_expand_vec_perm_const_1, arm_expand_vec_perm_const): Update
accordingly.
(arm_vectorize_vec_perm_const_ok): Likewise. Change sel
to vec_perm_indices.
* config/i386/i386.c (ix86_vectorize_vec_perm_const_ok): Change
sel to vec_perm_indices.
* config/ia64/ia64.c (ia64_vectorize_vec_perm_const_ok): Likewise.
* config/mips/mips.c (mips_vectorize_vec_perm_const_ok): Likewise.
* config/powerpcspe/powerpcspe.c (rs6000_vectorize_vec_perm_const_ok):
Likewise.
* config/rs6000/rs6000.c (rs6000_vectorize_vec_perm_const_ok):
Likewise.
[PR82155] Fix crash in dwarf2out_abstract_function
This patch is an attempt to fix the crash reported in PR82155.
When generating a C++ class method for a class that is itself nested in
a class method, dwarf2out_early_global_decl currently leaves the
existing context DIE as it is if it already exists. However, it is
possible that this call happens at a point where this context DIE is
just a declaration that is itself not located in its own context.
From there, if dwarf2out_early_global_decl is not called on any of the
FUNCTION_DECL in the context chain, DIEs will be left badly scoped and
some (such as the nested method) will be removed by the type pruning
machinery. As a consequence, dwarf2out_abstract_function will will
crash when called on the corresponding DECL because it asserts that the
DECL has a DIE.
This patch fixes this crash making dwarf2out_early_global_decl process
context DIEs the same way we process abstract origins for FUNCTION_DECL:
if the corresponding DIE exists but is only a declaration, call
dwarf2out_decl anyway on it so that it is turned into a more complete
DIE and so that it is relocated in the proper context.
Bootstrapped and regtested on x86_64-linux.
gcc/
PR debug/82155
* dwarf2out.c (dwarf2out_early_global_decl): Call dwarf2out_decl
on the FUNCTION_DECL function context if it has a DIE that is a
declaration.
* aspects.adb, bindgen.adb, clean.adb, erroutc.adb, exp_ch13.adb,
exp_dbug.adb, exp_unst.adb, exp_util.adb, frontend.adb, gnat1drv.adb,
gnatdll.adb, gnatlink.adb, gnatls.adb, gnatname.adb, gnatxref.adb,
gnatfind.adb, libgnat/a-cfhama.ads, libgnat/a-exetim__mingw.adb,
libgnat/a-strmap.adb, libgnat/a-teioed.adb, libgnat/g-alvety.ads,
libgnat/g-expect.adb, libgnat/g-regist.adb, libgnat/g-socket.adb,
libgnat/g-socthi__mingw.ads, libgnat/s-stausa.adb,
libgnat/s-tsmona__linux.adb, libgnat/s-tsmona__mingw.adb,
libgnarl/s-taenca.adb, libgnarl/s-tassta.adb, libgnarl/s-tarest.adb,
libgnarl/s-tpobop.adb, make.adb, makeusg.adb, namet.adb, output.ads,
put_scos.adb, repinfo.adb, rtsfind.adb, scn.ads, sem_attr.adb,
sem_aux.ads, sem_warn.ads, targparm.adb, xr_tabls.adb, xref_lib.adb:
Removal of ineffective use-clauses.
* exp_ch9.adb (Is_Simple_Barrier_Name): Check for false positives with
constant folded barriers.
* ghost.adb, sprint.adb, sem_ch10.adb, sem_warn.adb: Change access to
Subtype_Marks and Names list in use-clause nodes to their new singular
counterparts (e.g. Subtype_Mark, Name).
* par.adb, par-ch8.adb (Append_Use_Clause): Created to set
Prev_Ids and More_Ids in use-clause nodes.
(P_Use_Clause): Modify to take a list as a parameter.
(P_Use_Package_Clause, P_Use_Type_Clause): Divide names and
subtype_marks within an aggregate use-clauses into individual clauses.
* par-ch3.adb, par-ch10.adb, par-ch12.adb: Trivally modify call to
P_Use_Clause to match its new behavior.
* sem.adb (Analyze): Mark use clauses for non-overloaded entities.
* sem_ch4.adb (Try_One_Interp): Add sanity check to handle previous
errors.
* sem_ch6.adb (Analyze_Generic_Subprogram_Body,
Analyze_Subprogram_Body_Helper): Update use clause chain at the end of
the declarative region.
* sem_ch7.adb (Analyze_Package_Body_Helper): Update use clause chain
after analysis (Analyze_Package_Specification): Update use clause chain
when there is no body.
* sem_ch8.ads, sem_ch8.adb (Analyze_Use_Package, Analyze_Use_Type): Add
parameter to determine weither the installation of scopes should also
propagate on the use-clause "chain".
(Mark_Use_Clauses): Created to traverse use-clause chains and determine
what constitutes a valid "use" of a clause.
(Update_Use_Clause_Chain): Created to aggregate common machinary used
to clean up use-clause chains (and warn on ineffectiveness) at the end
of declaritive regions.
* sem_ch8.adb (Analyze_Package_Name): Created to perform analysis on a
package name from a use-package clause.
(Analyze_Package_Name_List): Created to perform analysis on a list of
package names (similar to Analyze_Package_Name).
(Find_Most_Prev): Created to traverse to the beginning of a given
use-clause chain.
(Most_Decendant_Use_Clause): Create to identify which clause from a
given set is highest in scope (not always the most prev).
(Use_One_Package, Use_One_Type): Major cleanup and reorganization to
handle the new chaining algorithm, also many changes related to
redundant clauses. A new parameter has also been added to force
installation to handle certain cases.
* sem_ch9.adb (Analyze_Entry_Body, Analyze_Protected_Body,
Analyze_Task_Body): Mark use clauses on relevant entities.
* sem_ch10.adb, sem_ch10.ads (Install_Context_Clauses,
Install_Parents): Add parameter to determine weither the installation
of scopes should also propagate on the use-clause "chain".
* sem_ch12.adb (Inline_Instance_Body): Add flag in call to
Install_Context to avoid redundant chaining of use-clauses.
* sem_ch13.adb: Minor reformatting.
* sem_res.adb (Resolve): Mark use clauses on operators.
(Resolve_Call, Resolve_Entity_Name): Mark use clauses on relevant
entities.
* sinfo.adb, sinfo.ads (Is_Effective_Use_Clause,
Set_Is_Effective_Use_Clause): Add new flag to N_Use_Clause nodes to
represent any given clause's usage/reference/necessity.
(Prev_Use_Clause, Set_Prev_Use_Clause): Add new field to N_Use_Clause
nodes to allow loose chaining of redundant clauses.
(Set_Used_Operations, Set_Subtype_Mark, Set_Prev_Ids, Set_Names,
Set_More_Ids, Set_Name): Modify set procedure calls to reflect
reorganization in node fields.
* types.ads (Source_File_Index): Adjust index bounds.
(No_Access_To_Source_File): New constant.
2017-09-25 Ed Schonberg <schonberg@adacore.com>
* sem_ch13.adb (Analyze_One_Aspect): In ASIS mode make a full copy of
the expression to be used in the generated attribute specification
(rather than relocating it) to avoid resolving a potentially malformed
tree when the expression is resolved through an ASIS-specific call to
Resolve_Aspect_Expressions. This manifests itself as a crash on a
function with parameter associations.
* exp_spark.adb (Expand_SPARK_Indexed_Component,
Expand_SPARK_Selected_Component): New procedures to insert explicit
dereference if required.
(Expand_SPARK): Call the new procedures.
2017-09-25 Patrick Bernardi <bernardi@adacore.com>
* adaint.c (win32_wait): Properly handle error and take into account
the WIN32 limitation on the number of simultaneous wait objects.
2017-09-25 Yannick Moy <moy@adacore.com>
* sem_ch3.adb (Constant_Redeclaration): Do not insert a call to the
invariant procedure in GNATprove mode.
* sem_ch5.adb (Analyze_Assignment): Likewise.
Richard Biener [Mon, 25 Sep 2017 09:48:31 +0000 (09:48 +0000)]
graphite-optimize-isl.c (optimize_isl): Fail and dump if ISL errors other than isl_error_quota happen.
2017-09-25 Richard Biener <rguenther@suse.de>
* graphite-optimize-isl.c (optimize_isl): Fail and dump if
ISL errors other than isl_error_quota happen. Dump if the
schedule is the same.
* graphite-sese-to-poly.c (build_poly_scop): Fail on ISL
errors instead of aborting inside ISL.
* adabkend.adb (Call_Back_End): Fix wording of "front-end" and
"back-end" in comments.
2017-09-25 Ed Schonberg <schonberg@adacore.com>
* exp_ch6.adb (Expand_Call_Helper): The extra accessibility check in a
call that appears in a classwide precondition and that mentions an
access formal of the subprogram, must use the accessibility level of
the actual in the call. This is one case in which a reference to a
formal parameter appears outside of the body of the subprogram.
* sem_prag.adb (Analyze_Constituent): Raise Unrecoverable_Error rather
than Program_Error because U_E is more in line with respect to the
intended behavior.
2017-09-25 Ed Schonberg <schonberg@adacore.com>
* sem_ch13.adb (Resolve_Aspect_Expressions): The expression for aspect
Storage_Size does not freeze, and thus can include references to
deferred constants.
* exp_spark.adb (Expand_SPARK_Potential_Renaming): Do not process a
reference when it appears within a pragma of no significance to SPARK.
(In_Insignificant_Pragma): New routine.
* sem_prag.ads: Add new table Pragma_Significant_In_SPARK.
2017-09-25 Ed Schonberg <schonberg@adacore.com>
* sem_ch12.adb (Analyze_Associations, case N_Formal_Package): If the
actual is a renaming, indicate that it is the renamed package that must
be frozen before the instantiation.
2017-09-25 Yannick Moy <moy@adacore.com>
* doc/gnat_ugn/gnat_and_program_execution.rst: Fix typo in description
of dimensionality system in GNAT UG.
* gnat_ugn.texi: Regenerate.
2017-09-25 Yannick Moy <moy@adacore.com>
* gnat1drv.adb: Call Check_Safe_Pointers from the frontend in
GNATprove_Mode when switch -gnatdF used.
2017-09-25 Piotr Trojanek <trojanek@adacore.com>
* adabkend.adb (Call_Back_End): Reset Current_Error_Node when starting
the backend.
exp_imgv.adb (Expand_Image_Attribute): Disable the optimized expansion of user-defined enumeration types when...
gcc/ada/
2017-09-25 Javier Miranda <miranda@adacore.com>
* exp_imgv.adb (Expand_Image_Attribute): Disable the optimized
expansion of user-defined enumeration types when the generation of
names for enumeration literals is suppressed.
2017-09-25 Gary Dismukes <dismukes@adacore.com>
* libgnarl/s-taprop__linux.adb: Minor reformatting.
2017-09-25 Ed Schonberg <schonberg@adacore.com>
* sem_ch13.adb (Resolve_Aspect_Expressions): Do not resolve identifiers
that appear as selector names of parameter associations, as these are
never resolved by visibility.
2017-09-25 Justin Squirek <squirek@adacore.com>
* sem_res.adb (Resolve_Entry): Generate reference for index entities.
* exp_imgv.adb (Is_User_Defined_Enumeration_Type): New subprogram.
(Expand_User_Defined_Enumeration_Image): New subprogram.
(Expand_Image_Attribute): Enable speed-optimized expansion of
user-defined enumeration types when we are compiling with optimizations
enabled.
2017-09-25 Piotr Trojanek <trojanek@adacore.com>
* sem_util.adb (Has_Null_Abstract_State): Remove, as an exactly same
routine is already provided by Einfo.
* einfo.adb (Has_Null_Abstract_State): Replace with the body from
Sem_Util, which had better comments and avoided double calls to
Abstract_State.
[Patch, Darwin] Fix PR80556 by linking the system unwinder ahead of libgcc_eh.
PR target/80556
* config/i386/darwin.h (REAL_LIB_SPEC): New; put libSystem ahead
of libgcc_eh for m64.
* config/i386/darwin64.h: Likewise.
/* WORKAROUND pr80556:
For x86_64 Darwin10 and later, the unwinder is in libunwind (redirected
from libSystem). This doesn't use the keymgr (see keymgr.c) and therefore
the calls that libgcc makes to obtain the KEYMGR_GCC3_DW2_OBJ_LIST are not
updated to include new images, and might not even be valid for a single
image.
Therefore, for 64b exes at least, we must use the libunwind implementation,
even when static-libgcc is specified. We put libSystem first so that
unwinder symbols are satisfied from there.
* exp_ch3.adb: Rename Comp_Type_Simple to be Comp_Simple_Init.
2017-09-25 Doug Rupp <rupp@adacore.com>
* libgnarl/s-taprop__linux.adb (Base_Monotonic_Clock): New variable.
(Compute_Base_Monotonic_Clock): New function.
(Timed_Sleep): Adjust to use Base_Monotonic_Clock.
(Timed_Delay): Likewise.
(Monotonic_Clock): Likewise.
* s-oscons-tmplt.c (CLOCK_MONOTONIC): Use on Linux.
Janne Blomqvist [Mon, 25 Sep 2017 06:44:18 +0000 (09:44 +0300)]
Remove unnecessary fold_convert in gfc_(un)likely
This patch removes an unnecessary fold_convert to boolean_type_node at
the end of gfc_likely and gfc_unlikely. It makes no difference to the
generated code, but makes tree dumps a little bit cleaner.
H.J. Lu [Sun, 24 Sep 2017 21:37:09 +0000 (14:37 -0700)]
x32: Encode %esp as %rsp to avoid 0x67 prefix
Since the upper 32 bits of stack register are always zero for x32, we
can encode %esp as %rsp to avoid 0x67 prefix in address if there is no
index or base register.
gcc/
PR target/82267
* config/i386/i386.c (ix86_print_operand_address_as): Encode
%esp as %rsp to avoid 0x67 prefix if there is no index or base
register.
gcc/testsuite/
PR target/82267
* gcc.target/i386/pr82267.c: New test.
Janus Weil [Sat, 23 Sep 2017 13:15:20 +0000 (15:15 +0200)]
re PR fortran/82143 (add a -fdefault-real-16 flag)
2017-09-23 Janus Weil <janus@gcc.gnu.org>
PR fortran/82143
* lang.opt: Add the options -fdefault-real-10 and -fdefault-real-16.
Rename flag_default_real to flag_default_real_8.
* invoke.texi: Add documentation.
* module.c (use_iso_fortran_env_module): flag_default_real is renamed.
* trans-types.c (gfc_init_kinds): Implement the flags
-fdefault-real-10 and -fdefault-real-16. Make -fdefault-double-8 work
without -fdefault-real-8.
2017-09-23 Janus Weil <janus@gcc.gnu.org>
PR fortran/82143
* gfortran.dg/promotion_3.f90: New test case.
* gfortran.dg/promotion_4.f90: New test case.
Jakub Jelinek [Fri, 22 Sep 2017 18:56:23 +0000 (20:56 +0200)]
re PR middle-end/35691 (Missed (a == 0) && (b == 0) into (a|(typeof(a)(b)) == 0 when the types don't match)
PR middle-end/35691
* match.pd: Simplify x == -1 & y == -1 into (x & y) == -1
and x != -1 | y != -1 into (x & y) != -1.
* gcc.dg/pr35691-1.c: Use -fdump-tree-forwprop1-details
instead of -fdump-tree-forwprop-details in dg-options.
* gcc.dg/pr35691-2.c: Likewise.
* gcc.dg/pr35691-3.c: New test.
* gcc.dg/pr35691-4.c: New test.
Jakub Jelinek [Fri, 22 Sep 2017 18:55:21 +0000 (20:55 +0200)]
re PR sanitizer/81929 (exponential slowdown in undefined behavior sanitizer for streaming)
PR sanitizer/81929
* tree.c (struct replace_placeholders_t): Add pset field.
(replace_placeholders_r): Call cp_walk_tree with d->pset as
last argument instead of NULL. Formatting fix.
(replace_placeholders): Add pset variable, add its address
into data. Pass &pset instead of NULL to cp_walk_tree.
Steve Ellcey [Fri, 22 Sep 2017 18:04:18 +0000 (18:04 +0000)]
config.gcc: Add new case statement to set default_gnu_indirect_function.
2017-09-22 Steve Ellcey <sellcey@cavium.com>
* config.gcc: Add new case statement to set
default_gnu_indirect_function. Remove it from x86_64-*-linux*,
i[34567]86-*, powerpc*-*-linux*spe*, powerpc*-*-linux*, s390-*-linux*,
s390x-*-linux* case statements. Added aarch64 to the list of
supported architectures.
PR82289: Computing peeling costs for irrelevant drs
This PR shows that we weren't filtering out irrelevant stmts in
vect_get_peeling_costs_all_drs (unlike related loops in which
we iterate over all datarefs).
2017-09-22 Richard Sandiford <richard.sandiford@linaro.org>
The vectoriser aligned vectors to TYPE_ALIGN unconditionally, although
there was also a hard-coded assumption that this was equal to the type
size. This was inconvenient for SVE for two reasons:
- When compiling for a specific power-of-2 SVE vector length, we might
want to align to a full vector. However, the TYPE_ALIGN is governed
by the ABI alignment, which is 128 bits regardless of size.
- For vector-length-agnostic code it doesn't usually make sense to align,
since the runtime vector length might not be a power of two. Even for
power of two sizes, there's no guarantee that aligning to the previous
16 bytes will be an improveent.
This patch therefore adds a target hook to control the preferred
vectoriser (as opposed to ABI) alignment.
2017-09-22 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* target.def (preferred_vector_alignment): New hook.
* doc/tm.texi.in (TARGET_VECTORIZE_PREFERRED_VECTOR_ALIGNMENT): New
hook.
* doc/tm.texi: Regenerate.
* targhooks.h (default_preferred_vector_alignment): Declare.
* targhooks.c (default_preferred_vector_alignment): New function.
* tree-vectorizer.h (dataref_aux): Add a target_alignment field.
Expand commentary.
(DR_TARGET_ALIGNMENT): New macro.
(aligned_access_p): Update commentary.
(vect_known_alignment_in_bytes): New function.
* tree-vect-data-refs.c (vect_calculate_required_alignment): New
function.
(vect_compute_data_ref_alignment): Set DR_TARGET_ALIGNMENT.
Calculate the misalignment based on the target alignment rather than
the vector size.
(vect_update_misalignment_for_peel): Use DR_TARGET_ALIGMENT
rather than TYPE_ALIGN / BITS_PER_UNIT to update the misalignment.
(vect_enhance_data_refs_alignment): Mask the byte misalignment with
the target alignment, rather than masking the element misalignment
with the number of elements in a vector. Also use the target
alignment when calculating the maximum number of peels.
(vect_find_same_alignment_drs): Use vect_calculate_required_alignment
instead of TYPE_ALIGN_UNIT.
(vect_duplicate_ssa_name_ptr_info): Remove stmt_info parameter.
Measure DR_MISALIGNMENT relative to DR_TARGET_ALIGNMENT.
(vect_create_addr_base_for_vector_ref): Update call accordingly.
(vect_create_data_ref_ptr): Likewise.
(vect_setup_realignment): Realign by ANDing with
-DR_TARGET_MISALIGNMENT.
* tree-vect-loop-manip.c (vect_gen_prolog_loop_niters): Calculate
the number of peels based on DR_TARGET_ALIGNMENT.
* tree-vect-stmts.c (get_group_load_store_type): Compare the gap
with the guaranteed alignment boundary when deciding whether
overrun is OK.
(vectorizable_mask_load_store): Interpret DR_MISALIGNMENT
relative to DR_TARGET_ALIGNMENT instead of TYPE_ALIGN_UNIT.
(ensure_base_align): Remove stmt_info parameter. Get the
target base alignment from DR_TARGET_ALIGNMENT.
(vectorizable_store): Update call accordingly. Interpret
DR_MISALIGNMENT relative to DR_TARGET_ALIGNMENT instead of
TYPE_ALIGN_UNIT.
(vectorizable_load): Likewise.
gcc/testsuite/
* gcc.dg/vect/vect-outer-3a.c: Adjust dump scan for new wording
of alignment message.
* gcc.dg/vect/vect-outer-3a-big-array.c: Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r253101
This patch adds a helper function for getting the number of bytes
accessed by an unvectorised data reference, which helps when general
modes have a variable size.
2017-09-22 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* tree-vectorizer.h (vect_get_scalar_dr_size): New function.
* tree-vect-data-refs.c (vect_update_misalignment_for_peel): Use it.
(vect_enhance_data_refs_alignment): Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r253099
Richard Earnshaw [Fri, 22 Sep 2017 16:24:46 +0000 (17:24 +0100)]
[arm] Improve error checking in parsecpu.awk
This patch adds a bit more error checking to parsecpu.awk to ensure
that statements are not missing arguments or have excess arguments
beyond those permitted. It also slightly improves the handling of
errors so that we terminate properly if parsing fails and be as
helpful as we can while in the parsing phase.
2017-09-22 Richard Earnshaw <richard.earnshaw@arm.com>
* config/arm/parsecpu.awk (fatal): Note that we've encountered an
error. Only quit immediately if parsing is complete.
(BEGIN): Initialize fatal_err and parse_done.
(begin fpu, end fpu): Check number of arguments.
(begin arch, end arch): Likewise.
(begin cpu, end cpu): Likewise.
(cname, tune for, tune flags, architecture, fpu, option): Likewise.
(optalias): Likewise.
Richard Earnshaw [Fri, 22 Sep 2017 16:17:11 +0000 (17:17 +0100)]
[arm] auto-generate arm-isa.h from CPU descriptions
This patch autogenerates arm-isa.h from new entries in arm-cpus.in.
This has the primary advantage that it makes the description file more
self-contained, but it also solves the 'array dimensioning' problem
that Tamar recently encountered. It adds two new constructs to
arm-cpus.in: features and fgroups. Fgroups are simply a way of naming
a group of feature bits so that they can be referenced together. We
follow the convention that feature bits are all lower case, while
fgroups are (predominantly) upper case. This is helpful as in some
contexts they share the same namespace. Most of the minor changes in
this patch are related to adopting this new naming convention.
2017-09-22 Richard Earnshaw <richard.earnshaw@arm.com>
* config.gcc (arm*-*-*): Don't add arm-isa.h to tm_p_file.
* config/arm/arm-isa.h: Delete. Move definitions to ...
* arm-cpus.in: ... here. Use new feature and fgroup values.
* config/arm/arm.c (arm_option_override): Use lower case for feature
bit names.
* config/arm/arm.h (TARGET_HARD_FLOAT): Likewise.
(TARGET_VFP3, TARGET_VFP5, TARGET_FMA): Likewise.
* config/arm/parsecpu.awk (END): Add new command 'isa'.
(isa_pfx): Delete.
(print_isa_bits_for): New function.
(gen_isa): New function.
(gen_comm_data): Use print_isa_bits_for.
(define feature): New keyword.
(define fgroup): New keyword.
* config/arm/t-arm (TM_H): Remove.
(GTM_H): Add arm-isa.h.
(arm-isa.h): Add rule to generate file.
* common/config/arm/arm-common.c: (arm_canon_arch_option): Use lower
case for feature bit names.