Paul Thomas [Wed, 31 Jan 2018 20:28:35 +0000 (20:28 +0000)]
re PR fortran/84088 ([nvptx] libgomp.oacc-fortran/declare-*.f90 execution fails)
2018-01-31 Paul Thomas <pault@gcc.gnu.org>
PR fortran/84088
* trans-expr.c (gfc_conv_procedure_call): If the parm expr is
an address expression passed to an assumed rank dummy, convert
to an indirect reference.
2018-01-31 Paul Thomas <pault@gcc.gnu.org>
PR fortran/84088
* gfortran.dg/pr84088.f90: New test.
Ian Lance Taylor [Wed, 31 Jan 2018 18:35:58 +0000 (18:35 +0000)]
compiler: lower expression types in lowering pass
Ensure that array types with complicated length expressions are
handled correctly by lowering expression types in the lowering pass.
This required some adjustment of constant expression types to not
report too many errors for circular constant expressions. We now
record error types in the Named_constant type. If we find the
circularity due to lowering the Named_constant, we use that location
for the error message; this retains the error location we used to use.
Ian Lance Taylor [Wed, 31 Jan 2018 18:25:17 +0000 (18:25 +0000)]
runtime: fix type descriptor name in C code
I forgot to update the name of the map[string]bool type descriptor
used in go-fieldtrack.c. This didn't cause any errors because it's a
weak symbol, and the current testsuite has no field tracking tests.
Ian Lance Taylor [Wed, 31 Jan 2018 14:43:37 +0000 (14:43 +0000)]
net: rename TestAddr6 to avoid gotest confusion
On ppc64 gotest treats data variables whose names begin with "Test" as
tests to run. This is to support the function descriptors used for
ppc64 ELF ABI v1. This causes gotest to think that TestAddr6 is a
test, when it is actually a variable. For a simple fix until we can
figure out how to write gotest properly, rename the variable.
Janne Blomqvist [Wed, 31 Jan 2018 14:16:22 +0000 (16:16 +0200)]
Use pointer sized array indices.
Using pointer sized variables (e.g. size_t / ptrdiff_t) when the
variables are used as array indices allows accessing larger arrays,
and can be a slight performance improvement because no sign extension,
zero extension, or masking is needed.
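A minimal sketch of the idea, not taken from the patch itself: the
function name sum_elements is hypothetical, and the only point is that
the loop index has type size_t, which is pointer sized on common targets.

```c
#include <assert.h>
#include <stddef.h>

/* Sum the first n elements of a.  The index is a size_t, so it has the
   same width as a pointer on common targets; an int index would
   typically need widening before each address computation on an LP64
   target and could not count past INT_MAX.  */
long
sum_elements (const int *a, size_t n)
{
  assert (a != NULL || n == 0);
  long total = 0;
  for (size_t i = 0; i < n; i++)
    total += a[i];
  return total;
}
```

Calling sum_elements on a four-element array {1, 2, 3, 4} returns 10.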
Janne Blomqvist [Wed, 31 Jan 2018 13:23:20 +0000 (15:23 +0200)]
PR 78534 Reinstate better string copy algorithm
As part of the change to larger character lengths, the string copy
algorithm was temporarily pessimized to get around some spurious
-Wstringop-overflow warnings. Having tried a number of variations of
this algorithm I have managed to get it down to one spurious warning,
only with -O1 optimization, in the testsuite. This patch reinstates
the optimized variant and modifies this one testcase to ignore the
warning.
Regtested on x86_64-pc-linux-gnu.
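For illustration only, the run-time effect of a Fortran character
assignment can be sketched in C as a bounded copy followed by a memset
of blanks. The helper name fortran_string_copy is hypothetical; the
actual patch builds the equivalent trees inside trans-expr.c rather
than calling a function like this.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy src (slen characters) into dest (dlen characters) with Fortran
   assignment semantics: truncate if the source is longer, otherwise
   pad the remainder with blanks.  Neither buffer is NUL-terminated.  */
void
fortran_string_copy (char *dest, size_t dlen, const char *src, size_t slen)
{
  assert (dest != NULL && src != NULL);
  size_t ncopy = slen < dlen ? slen : dlen;
  memmove (dest, src, ncopy);                    /* buffers may overlap  */
  if (dlen > ncopy)
    memset (dest + ncopy, ' ', dlen - ncopy);    /* blank padding  */
}
```

When both lengths are compile-time constants, a compiler can usually
expand the memmove and memset calls inline, which is the kind of
opportunity the ChangeLog entry refers to.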
gcc/fortran/ChangeLog:
2018-01-31 Janne Blomqvist <jb@gcc.gnu.org>
PR fortran/78534
* trans-expr.c (fill_with_spaces): Use memset instead of
generating loop.
(gfc_trans_string_copy): Improve opportunity to use builtins with
constant lengths.
gcc/testsuite/ChangeLog:
2018-01-31 Janne Blomqvist <jb@gcc.gnu.org>
PR fortran/78534
* gfortran.dg/allocate_deferred_char_scalar_1.f03: Prune
-Wstringop-overflow warnings due to spurious warning with -O1.
* gfortran.dg/char_cast_1.f90: Update dump scan pattern.
* gfortran.dg/transfer_intrinsic_1.f90: Likewise.
This test has never passed, as far as I know.
The PR details the vectoriser deficiency.
I propose we XFAIL it with a reference to the PR.
PR tree-optimization/64946
* gcc.target/aarch64/vect-abs-compile.c: XFAIL byte and half-word
scan-assembler checks.
Eric Botcazou [Wed, 31 Jan 2018 10:03:06 +0000 (10:03 +0000)]
re PR rtl-optimization/84071 (wrong elimination of zero-extension after sign-extended load)
PR rtl-optimization/84071
* combine.c (record_dead_and_set_regs_1): Record the source unmodified
for a paradoxical SUBREG on a WORD_REGISTER_OPERATIONS target.
* config/arc/arc-protos.h (arc_is_uncached_mem_p): Function proto.
* config/arc/arc.c (arc_handle_uncached_attribute): New function.
(arc_attribute_table): Add 'uncached' attribute.
(arc_print_operand): Print '.di' flag for uncached memory
accesses.
(arc_in_small_data_p): Do not consider for small data the uncached
types.
(arc_is_uncached_mem_p): New function.
* config/arc/predicates.md (compact_store_memory_operand): Check
for uncached memory accesses.
(nonvol_nonimm_operand): Likewise.
* gcc/doc/extend.texi (ARC Type Attribute): New subsection.
Jakub Jelinek [Wed, 31 Jan 2018 08:31:52 +0000 (09:31 +0100)]
re PR preprocessor/69869 (internal compiler error: Segmentation fault in call to skip_macro_block_comment when using '-traditional-cpp')
PR preprocessor/69869
* traditional.c (skip_macro_block_comment): Return bool, true if
the macro block comment is unterminated.
(copy_comment): Use return value from skip_macro_block_comment instead
of always false.
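A rough sketch of the pattern, assuming a much simpler scanner than the
one in traditional.c: the function skip_block_comment and its signature
are hypothetical. It reports an unterminated comment through its return
value so that the caller can act on it instead of assuming false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Advance *pos past a block comment body in buf[0..len).  *pos must
   point just after the opening delimiter.  Returns true if the buffer
   ends before the closing delimiter is found, in which case *pos is
   left at len.  */
bool
skip_block_comment (const char *buf, size_t len, size_t *pos)
{
  assert (*pos <= len);
  size_t i = *pos;
  while (i + 1 < len)
    {
      if (buf[i] == '*' && buf[i + 1] == '/')
        {
          *pos = i + 2;
          return false;        /* terminated normally  */
        }
      i++;
    }
  *pos = len;
  return true;                 /* unterminated  */
}
```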
Ian Lance Taylor [Wed, 31 Jan 2018 02:11:03 +0000 (02:11 +0000)]
compiler: Function_type and Backend_function_type should not be identical
Function_type and Backend_function_type have different backend
representations, so they should not be identical. Otherwise it
confuses the Type::type_btypes map.
Jakub Jelinek [Tue, 30 Jan 2018 20:03:04 +0000 (21:03 +0100)]
re PR target/83986 (ICE in maybe_record_trace_start, at dwarf2cfi.c:2348)
PR rtl-optimization/83986
* sched-deps.c (sched_analyze_insn): For frame related insns, add anti
dependence against last_pending_memory_flush in addition to
pending_jump_insns.
Alexandre Oliva [Tue, 30 Jan 2018 17:40:50 +0000 (17:40 +0000)]
[PR81611] accept copies in simple_iv_increment_p
If there are copies between the GIMPLE_PHI at the loop body and the
increment that reaches it (presumably through a back edge), still
regard it as a simple_iv_increment, so that we won't consider the
value in the back edge eligible for forwprop. Doing so would risk
making the phi node and the incremented conflicting value live
within the loop, and the phi node to be preserved for propagated
uses after the loop.
Jakub Jelinek [Tue, 30 Jan 2018 15:58:22 +0000 (16:58 +0100)]
re PR tree-optimization/84111 (Compile time hog w/ -O2)
PR tree-optimization/84111
* tree-ssa-loop-ivcanon.c (tree_unroll_loops_completely_1): Skip
inner loops added during recursion, as they don't have up-to-date
SSA form.
Jan Hubicka [Tue, 30 Jan 2018 13:17:40 +0000 (14:17 +0100)]
re PR lto/83954 (LTO: Bogus -Wlto-type-mismatch warning for array of pointer to incomplete type)
PR lto/83954
* lto-symtab.c (warn_type_compatibility_p): Silence false positive
for type match warning on arrays of pointers.
* gcc.dg/lto/pr83954.h: New testcase.
* gcc.dg/lto/pr83954_0.c: New testcase.
* gcc.dg/lto/pr83954_1.c: New testcase.
Richard Biener [Tue, 30 Jan 2018 11:19:47 +0000 (11:19 +0000)]
re PR target/83008 ([performance] Is it better to avoid extra instructions in data passing between loops?)
2018-01-30 Richard Biener <rguenther@suse.de>
PR tree-optimization/83008
* tree-vect-slp.c (vect_analyze_slp_cost_1): Properly cost
invariant and constant vector uses in stmts when they need
more than one stmt.
[AArch64] Fix sve/extract_[12].c for big-endian SVE
sve/extract_[12].c were relying on the target-independent optimisation
that removes a redundant vec_select, so that we don't end up with
things like:
dup v0.4s, v0.4s[0]
...use s0...
But that optimisation rightly doesn't trigger for big-endian targets,
because GCC expects lane 0 to be in the high part of the register
rather than the low part.
SVE breaks this assumption -- see the comment at the head of
aarch64-sve.md for details -- so the optimisation is valid for
both endiannesses. Long term, we probably need some kind of target
hook to make GCC aware of this.
But there's another problem with the current extract pattern: it doesn't
tell the register allocator how cheap an extraction of lane 0 is with
tied registers. It seems better to split the lane 0 case out into
its own pattern and use tied operands for the FPR<-SIMD case,
so that using different registers has the cost of an extra reload.
I think we want this for both endiannesses, regardless of the hook
described above.
Also, the gen_lowpart in this pattern fails for aarch64_be due to
TARGET_CAN_CHANGE_MODE_CLASS restrictions, so the patch uses gen_rtx_REG
instead. We're only creating this rtl in order to print it, so there's
no need for anything fancier.
2018-01-30 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* config/aarch64/aarch64-sve.md (*vec_extract<mode><Vel>_0): New
pattern.
(*vec_extract<mode><Vel>_v128): Require a nonzero lane number.
Use gen_rtx_REG rather than gen_lowpart.
Reviewed-by: James Greenhalgh <james.greenhalgh@arm.com>
From-SVN: r257178
LRA was using a subreg offset of 0 whenever constraints matched
two operands with different modes. That leads to an invalid offset
(and ICE) on big-endian targets if one of the modes is narrower
than a word. E.g. if a (reg:SI X) is matched to a (reg:QI Y),
the big-endian subreg should be (subreg:QI (reg:SI X) 3) rather
than (subreg:QI (reg:SI X) 0).
But this raises the issue of what the behaviour should be when the
matched operands occupy different numbers of registers. Should the
register numbers match, or should the locations of the lsbs match?
Although the documentation isn't clear, reload went for the second
interpretation (which seems the most natural to me):
/* On a REG_WORDS_BIG_ENDIAN machine, point to the last register of a
multiple hard register group of scalar integer registers, so that
for example (reg:DI 0) and (reg:SI 1) will be considered the same
register. */
So I think this means that we can/must use the lowpart offset
unconditionally, rather than trying to separate out the multi-register
case. This also matches the LRA handling of constant integers, which
already uses lowpart subregs.
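The QImode-in-SImode example above can be checked with a simplified
model of the lowpart offset. This is a sketch under the assumption that
byte and word endianness agree; the helper lowpart_offset is
hypothetical and is not GCC's subreg_lowpart_offset, which handles more
cases.

```c
#include <assert.h>
#include <stdbool.h>

/* Byte offset of the least significant outer_size bytes inside a value
   of inner_size bytes.  Simplified: assumes byte and word endianness
   agree.  */
unsigned int
lowpart_offset (unsigned int outer_size, unsigned int inner_size,
                bool big_endian)
{
  assert (outer_size <= inner_size);
  return big_endian ? inner_size - outer_size : 0;
}
```

With a 1-byte outer mode and a 4-byte inner mode this gives 3 on a
big-endian target and 0 on a little-endian one, matching the subreg
offsets quoted above.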
The patch fixes gcc.target/aarch64/sve/extract_[34].c for aarch64_be.
2018-01-30 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* lra-constraints.c (match_reload): Use subreg_lowpart_offset
rather than 0 when creating partial subregs.
Kyrylo Tkachov [Tue, 30 Jan 2018 09:13:39 +0000 (09:13 +0000)]
[testsuite] XFAIL gcc.dg/tree-ssa/ssa-dom-cse-2.c on non-NEON arm targets
This test fails to optimise away the PLUS reduction in the loop on arm targets when vectorisation
is not enabled due to absence of SIMD instructions.
From reading the logs and the PR I gather that the presence or absence of SIMD affects the passing of this test
on other targets as well, as evidenced by the long list of xfail targets.
This list looks quite unwieldy to me, but here is a patch adding non-NEON arm to that list.
* gcc.dg/tree-ssa/ssa-dom-cse-2.c: XFAIL on !arm_neon arm targets.
Jeff Law [Tue, 30 Jan 2018 05:30:40 +0000 (22:30 -0700)]
re PR testsuite/81010 (test case gcc.target/powerpc/pr56605.c fails starting with r248958)
PR testsuite/81010
* gcc.target/powerpc/pr56605.c: Update various dg- directives to
better match other tests which require vsx. Verify the zero
extension is part of the test in the combiner dump.
Ian Lance Taylor [Tue, 30 Jan 2018 04:48:55 +0000 (04:48 +0000)]
internal/syscall/unix: add randomTrap for sh/shbe
CL 84555 added support for the SuperH architecture, but didn't add the
randomTrap definition to be used for the getrandom syscall on Linux.
Add it now.
Michael Meissner [Mon, 29 Jan 2018 22:30:34 +0000 (22:30 +0000)]
re PR target/81550 (gcc.target/powerpc/loop_align.c fails starting with r250482)
2018-01-29 Michael Meissner <meissner@linux.vnet.ibm.com>
PR target/81550
* config/rs6000/rs6000.c (rs6000_setup_reg_addr_masks): If DFmode
and SFmode can go in Altivec registers (-mcpu=power7 for DFmode,
-mcpu=power8 for SFmode) don't set the PRE_INCDEC or PRE_MODIFY
flags. This restores the settings used before the 2017-07-24 change.
Turning off pre increment/decrement/modify allows IVOPTS to
optimize DF/SF loops where the index is an int.
Ian Lance Taylor [Mon, 29 Jan 2018 20:58:23 +0000 (20:58 +0000)]
compiler: don't insert write barriers if we've seen errors
The compiler skips the escape analysis pass if it has seen any errors.
The write barrier pass, especially the check-escapes portion, relies
on escape analysis running. So don't run this pass if there have been
any errors, as it may cause further unreliable error reports.
Richard Biener [Mon, 29 Jan 2018 15:22:55 +0000 (15:22 +0000)]
re PR libgomp/84086 ([8 Regresssion] segfault in instantiate_scev_r for libgomp.fortran/examples-4/simd-2.f90 -O1)
2018-01-29 Richard Biener <rguenther@suse.de>
PR tree-optimization/84086
* tree-ssanames.c: Include cfgloop.h and tree-scalar-evolution.h.
(flush_ssaname_freelist): When SSA names were released reset
the SCEV hash table.
Jonathan Wakely [Mon, 29 Jan 2018 12:33:32 +0000 (12:33 +0000)]
PR libstdc++/83658 fix exception-safety in std::any::emplace
PR libstdc++/83658
* include/std/any (any::__do_emplace): Only set _M_manager after
constructing the contained object.
* testsuite/20_util/any/misc/any_cast_neg.cc: Adjust dg-error line.
* testsuite/20_util/any/modifiers/83658.cc: New test.
Jakub Jelinek [Sat, 27 Jan 2018 06:27:47 +0000 (07:27 +0100)]
c-cppbuiltin.c (c_cpp_builtins): Use ggc_strdup for the fp_suffix argument.
* c-cppbuiltin.c (c_cpp_builtins): Use ggc_strdup for the fp_suffix
argument.
(LAZY_HEX_FP_VALUES_CNT): Define.
(lazy_hex_fp_values): Allow up to LAZY_HEX_FP_VALUES_CNT lazy hex fp
values rather than just 12.
(builtin_define_with_hex_fp_value): Likewise.
* include/cpplib.h (enum cpp_builtin_type): Change BT_LAST_USER from
BT_FIRST_USER + 31 to BT_FIRST_USER + 63.
This patch merges the safe-indirect-jump-1.c and -8.c testcases,
since they do the same thing. On the 64-bit and AIX ABIs the indirect
call is not a sibcall, since there is code generated after the call
(the restore of r2). On the 32-bit non-AIX ABIs it is a sibcall.
* gcc.target/powerpc/safe-indirect-jump-1.c: Build on all targets.
Make expected output depend on whether we expect sibcalls or not.
* gcc.target/powerpc/safe-indirect-jump-8.c: Delete (merged into
safe-indirect-jump-1.c).
Steven G. Kargl [Fri, 26 Jan 2018 19:33:16 +0000 (19:33 +0000)]
re PR fortran/83998 (ICE in gfc_conv_intrinsic_dot_product, at fortran/trans-intrinsic.c:4403)
2018-01-26 Steven G. Kargl <kargl@gcc.gnu.org>
PR fortran/83998
* simplify.c (compute_dot_product): Initialize result to INTEGER(1) 0
or .false. The summation does the correct type conversion.
(gfc_simplify_dot_product): Special case zero-sized arrays.
This patch fixes the testsuite failures gcc.target/aarch64/subs_compare_1.c and subs_compare_2.c.
The tests check that we combine a sequence like:
sub w2, w0, w1
cmp w0, w1
into
subs w2, w0, w1
This is done by a couple of peepholes in aarch64.md.
Unfortunately, due to scheduling and other optimisations, the SUB and CMP
can come in a different order:
cmp w0, w1
sub w0, w0, w1
The existing peepholes cannot catch that, so we fail to combine the two.
This patch adds a peephole that matches the CMP as the first insn and the SUB as the second
and outputs a SUBS. It is almost equivalent to the existing peephole that matches SUB first and CMP second,
except that it does not require the output register of the SUB to differ from the input registers.
Remember "sub w0, w0, w1 ; cmp w0, w1" is *not* equivalent to: "subs w0, w0, w1"
but "cmp w0, w1 ; sub w0, w0, w1" is.
So this is what this patch does. It adds a peephole for the case above and one for the SUB-immediate variant
(because the SUB-immediate is represented as PLUS-of-negated-immediate and thus has different RTL structure).
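The equivalence claimed above can be checked with a small C model of
the 32-bit NZCV flags. This is an illustrative sketch, not part of the
patch; struct flags and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct flags { bool n, z, c, v; };

/* NZCV flags produced by a 32-bit CMP or SUBS computing a - b.  */
struct flags
sub_flags (uint32_t a, uint32_t b)
{
  uint32_t r = a - b;
  struct flags f;
  f.n = (r >> 31) != 0;
  f.z = r == 0;
  f.c = a >= b;                              /* carry set: no borrow  */
  f.v = (((a ^ b) & (a ^ r)) >> 31) != 0;    /* signed overflow  */
  return f;
}

bool
flags_equal (struct flags x, struct flags y)
{
  return x.n == y.n && x.z == y.z && x.c == y.c && x.v == y.v;
}

/* Models: cmp w0, w1 ; sub w0, w0, w1  */
struct flags
cmp_then_sub (uint32_t *w0, uint32_t w1)
{
  struct flags f = sub_flags (*w0, w1);
  *w0 = *w0 - w1;
  return f;
}

/* Models: sub w0, w0, w1 ; cmp w0, w1  */
struct flags
sub_then_cmp (uint32_t *w0, uint32_t w1)
{
  *w0 = *w0 - w1;
  return sub_flags (*w0, w1);    /* compares the already updated w0  */
}

/* Models: subs w0, w0, w1  */
struct flags
subs (uint32_t *w0, uint32_t w1)
{
  struct flags f = sub_flags (*w0, w1);
  *w0 = *w0 - w1;
  return f;
}

/* Assert that CMP-then-SUB matches SUBS for one input pair.  */
void
check_cmp_sub_is_subs (uint32_t a, uint32_t b)
{
  uint32_t x = a, y = a;
  struct flags f1 = cmp_then_sub (&x, b);
  struct flags f2 = subs (&y, b);
  assert (x == y && flags_equal (f1, f2));
}
```

With w0 = w1 = 5, SUBS and CMP-then-SUB both set Z, while SUB-then-CMP
compares 0 against 5 and leaves Z clear, which is why the SUB-first
peephole needs the extra register restriction.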
Bootstrapped and tested on aarch64-none-linux-gnu.
* config/aarch64/aarch64.md: Add peepholes for CMP + SUB -> SUBS
and CMP + SUB-immediate -> SUBS.