Jan Hubicka [Tue, 6 Feb 2018 13:27:04 +0000 (14:27 +0100)]
re PR lto/81004 (linking failed with -flto and static libboost_program_options)
PR lto/81004
* lto.c: Include builtins.h.
(register_resolution): Merge resolutions in case trees were
merged across units.
(lto_maybe_register_decl): Break out from ...
(lto_read_decls): ... here.
(unify_scc): Also register decls here.
(read_cgraph_and_symbols): Sanity check that all resolutions were
read.
"dict" was added in Tcl 8.5, but until a couple of weeks ago the
testsuite had worked with 8.4.
This patch uses arrays instead, like we do for the caching in
target-supports.exp. It is a bit uglier than using dicts was,
but hopefully not too bad...
2018-02-05 Richard Sandiford <richard.sandiford@linaro.org>
gcc/testsuite/
* lib/lto.exp (lto_handle_diagnostics): Remove messages_by_file
argument and use dg-messages-by-file instead. Expect it to be
an array rather than a dict.
(lto-link-and-maybe-run): Remove messages_by_file argument and
use an upvar for dg-messages-by-file. Update call to
lto_handle_diagnostics.
(lto-get-options): Treat dg-messages-by-file as an array
rather than a dict.
(lto-get-options-main): Likewise. Set the entry rather than appending.
(lto-execute): Treat dg-messages-by-file as an array rather than
a dict. Update call to lto-link-and-maybe-run.
These tests started passing after r257293, which had the side-effect
of renumbering the SSA names and leaving the COND_EXPRs in their
natural order.
This does show a deeper underlying issue that code generation is too
sensitive to internal things like SSA_NAME versions, but it no longer
affects these particular tests (for now).
2018-02-05 Richard Sandiford <richard.sandiford@linaro.org>
compiler: update iota handling, fix using iota in array length
CL 71750 changed the definition of how iota works. This patch updates
gccgo for the new definition.
We've been mishandling iota appearing in a type that appears in a
const expression, as in `c = len([iota]int{})`. Correct that by copying
type expressions when we copy an expression. For simplicity only copy
when it can change the size of a type, as that is the only case where
iota in a type can affect the value of a constant (I think). This is
still a bunch of changes, but almost all boilerplate.
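A minimal sketch of the case described above, using the expression from the commit message. The constant names are made up for illustration. Each line of the const block repeats the same expression with a different iota, so the array type inside it has a different length each time:

```go
package main

import "fmt"

// Each constant repeats the same expression, but iota differs per line,
// so the array type inside the expression has a different length.
// The names lenZero, lenOne and lenTwo are illustrative only.
const (
	lenZero = len([iota]int{}) // iota == 0
	lenOne                     // iota == 1, expression repeated implicitly
	lenTwo                     // iota == 2
)

func main() {
	fmt.Println(lenZero, lenOne, lenTwo) // prints "0 1 2"
}
```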
compiler: permit empty statements after fallthrough
The language spec permits empty statements after a fallthrough
statement, so implement that. Also give a better error message when a
fallthrough statement is in the wrong place. The test case for this
is in the master repository, test/fixedbugs/issue14540.go, just not
yet in the gccgo repository.
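As a rough illustration (not the test case itself), the following sketch has an empty statement after a fallthrough. The function name describe is hypothetical. Note that gofmt would remove the lone semicolon, so it has to be written by hand:

```go
package main

import "fmt"

// describe returns the text appended by every case that runs for n.
func describe(n int) string {
	out := ""
	switch n {
	case 0:
		out += "zero "
		fallthrough
		; // empty statement after fallthrough; the spec permits this
	case 1:
		out += "one"
	}
	return out
}

func main() {
	fmt.Printf("%q %q %q\n", describe(0), describe(1), describe(2))
}
```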
compiler: in range, evaluate array if it has receives or calls
The last change was incomplete, in that it did not evaluate the array
argument in some cases where it had to be evaluated. This reuses the
existing code for checking whether len/cap is constant.
Also clean up the use of _ as the second variable in a for/range,
which was previously inconsistent depending on whether the statement
used = or :=.
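A small sketch of the situation this covers, with hypothetical names. The range expression has array type, but because it contains a function call, len of it is not a constant and the call must still happen once:

```go
package main

import "fmt"

// calls counts how often source runs.
var calls int

// source has a side effect, so evaluating it cannot be skipped.
func source() [3]int {
	calls++
	return [3]int{10, 20, 30}
}

// sumIndexes ranges over the result of a call using only the index.
func sumIndexes() int {
	sum := 0
	for i := range source() {
		sum += i
	}
	return sum
}

func main() {
	sum := sumIndexes()
	fmt.Println(sum, calls)
}
```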
compiler: give error for non-int arguments to make
This implements a requirement of the language spec.
While we're here fix the value returned by the type method of a
builtin call expression to make, although this doesn't seem to make
any difference anywhere since we lower this to a runtime call before
the determine_types pass anyhow.
There is already a test for this error in the master repository:
test/fixedbugs/issue16949.go. It just hasn't made it into the gccgo
testsuite yet.
H.J. Lu [Fri, 2 Feb 2018 16:43:46 +0000 (16:43 +0000)]
i386: Pass INVALID_REGNUM as invalid register number
* config/i386/i386.c (ix86_output_function_return): Pass
INVALID_REGNUM, instead of -1, as invalid register number to
indirect_thunk_name and output_indirect_thunk.
go-gcc.cc (Gcc_backend::type_size): Return 0 for void_type_node.
* go-gcc.cc (Gcc_backend::type_size): Return 0 for
void_type_node.
(Gcc_backend::convert_expression): Don't convert if the type of
expr_tree is void_type_node.
(Gcc_backend::array_index_expression): Don't index if the type of
the array expression is void_type_node.
(Gcc_backend::init_statement): Don't initialize if the type of the
initializer expression is void_type_node.
(Gcc_backend::assignment_statement): Don't assign if the type of
either the left or right hand side is void_type_node.
(Gcc_backend::temporary_variable): Don't initialize if the type of
the initializer expression is void_type_node.
compiler: don't incorrectly evaluate range variable
The language spec says that in `for i = range x`, in which there is no
second iteration variable, if len(x) is constant, then x is not
evaluated. This only matters when x is an expression that panics but
whose type is an array type; in such a case, we should not evaluate x,
since len of any array type is a constant.
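The same spec rule can be seen with a nil pointer to an array, which is a closely related case rather than the exact expression this change handles. In this sketch (names are illustrative) len(p) is a constant, so p is never dereferenced and the loop does not panic:

```go
package main

import "fmt"

// countIndexes ranges over a nil pointer to an array using only the
// index variable. It returns the iteration count and the last index.
func countIndexes() (int, int) {
	var p *[3]int // nil; never dereferenced below
	n := 0
	var i int
	for i = range p {
		n++
	}
	return n, i
}

func main() {
	fmt.Println(countIndexes())
}
```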
Igor Tsimbalist [Fri, 2 Feb 2018 10:06:39 +0000 (11:06 +0100)]
PR84066 Wrong shadow stack register size is saved for x32
x32 is a 64-bit process with 32-bit software pointers, and the kernel may
place the x32 shadow stack above 4GB. We need to save and restore the 64-bit
shadow stack register for x32. The builtin jmp buf size is 5 pointers, so we
have space to save a 64-bit shadow stack pointer: 32-bit SP, 32-bit FP,
32-bit IP, 64-bit SSP for x32.
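The space argument can be checked with simple arithmetic. This is only a worked sketch of the sizes quoted above, and the identifiers are hypothetical, not names from i386.md:

```go
package main

import "fmt"

const (
	pointerBytes = 4 // x32 pointers are 32 bits wide
	bufPointers  = 5 // builtin jmp buf size, in pointers
)

// bufBytes is the space available in the builtin jmp buf on x32.
func bufBytes() int { return bufPointers * pointerBytes }

// savedBytes is the space needed for SP, FP and IP (32 bits each)
// plus a full 64-bit shadow stack pointer.
func savedBytes() int { return 3*4 + 8 }

func main() {
	fmt.Println(savedBytes(), bufBytes(), savedBytes() <= bufBytes())
}
```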
PR target/84066
* gcc/config/i386/i386.md: Replace Pmode with word_mode in
builtin_setjmp_setup and builtin_longjmp to support x32.
* gcc/testsuite/gcc.target/i386/cet-sjlj-6a.c: New test.
* gcc/testsuite/gcc.target/i386/cet-sjlj-6b.c: Likewise.
On ia64, a separate stack is used for saving/restoring register frames,
occupying the other end of the stack mapping. This must also be scanned
for pointers into the heap.
math: adjust compilation flags, use them when testing
We were using special compilation flags for the math package, but we
weren't using them when testing. That meant that our tests were not
checking the real code we were providing. Fix that.
Fixing that revealed that we were not using a good set of flags, or at
least were not using flags that let the tests pass. Adjust the flags
to stop using -funsafe-math-optimizations on x86. Instead always use
-ffp-contract=off -fno-math-errno -fno-trapping-math for all targets.
Janne Blomqvist [Thu, 1 Feb 2018 19:47:15 +0000 (21:47 +0200)]
PR 83975 Associate target with non-constant character length
When associating a variable of type character, if the length of the
target isn't known at compile time, generate an error. See PR 83344
for more details.
Regtested on x86_64-pc-linux-gnu.
gcc/fortran/ChangeLog:
2018-02-01 Janne Blomqvist <jb@gcc.gnu.org>
PR 83975
PR 83344
* resolve.c (resolve_assoc_var): Generate an error if
target length unknown.
Peter Bergner [Thu, 1 Feb 2018 18:26:51 +0000 (12:26 -0600)]
re PR target/56010 (Powerpc, -mcpu=native and -mtune=native use the wrong name for target 7450)
PR target/56010
PR target/83743
* config/rs6000/driver-rs6000.c: #include "diagnostic.h".
#include "opts.h".
(rs6000_supported_cpu_names): New static variable.
(linux_cpu_translation_table): Likewise.
(elf_platform) <cpu>: Define new static variable and use it.
Translate kernel AT_PLATFORM name to canonical name if needed.
Error if platform name is unknown.
Jeff Law [Thu, 1 Feb 2018 16:22:56 +0000 (09:22 -0700)]
re PR target/84128 (i686: Stack spilling in -fstack-clash-protection prologue neglects %esp change)
PR target/84128
* config/i386/i386.c (release_scratch_register_on_entry): Add new
OFFSET and RELEASE_VIA_POP arguments. Use SP+OFFSET to restore
the scratch if RELEASE_VIA_POP is false.
(ix86_adjust_stack_and_probe_stack_clash): Un-constify SIZE.
If we have to save a temporary register, decrement SIZE appropriately.
Pass new arguments to release_scratch_register_on_entry.
(ix86_adjust_stack_and_probe): Likewise.
(ix86_emit_probe_stack_range): Pass new arguments to
release_scratch_register_on_entry.
PR target/84128
* gcc.target/i386/pr84128.c: New test.
Otherwise on a 64-bit system we will read the 32-bit value as a 64-bit
value. Since getaddrinfo returns negative numbers as error values,
these will be interpreted as numbers like 0xfffffffe rather than -2,
and the comparisons with values like syscall.EAI_NONAME will fail.
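A simplified model of the failure, assuming the upper 32 bits of the misread value come out as zero. The function names are illustrative and this is not the libgo code:

```go
package main

import "fmt"

// zeroExtended models reading a 32-bit C int as a 64-bit value
// without propagating the sign bit into the upper half.
func zeroExtended(code int32) int64 { return int64(uint32(code)) }

// signExtended models reading the value with its real 32-bit type first.
func signExtended(code int32) int64 { return int64(code) }

func main() {
	const errCode = -2 // illustrative negative error value
	fmt.Printf("%#x %d\n", zeroExtended(errCode), signExtended(errCode))
}
```

With the first form, a comparison against a negative constant can never succeed.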
Use range info in split_constant_offset (PR 81635)
This patch implements the original suggestion for fixing PR 81635:
use range info in split_constant_offset to see whether a conversion
of a wrapping type can be split. The range info problem described in:
The patch is part 1. There needs to be a follow-on patch to handle:
  for (unsigned int i = 0; i < n; i += 4)
    {
      ...[i + 2]...
      ...[i + 3]...
    }
which the old SCEV test handles, but which the range check doesn't.
At the moment we record that the low two bits of "i" are clear,
but we still end up with a maximum range of 0xffffffff rather than
0xfffffffc.
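To see why the tighter maximum matters, here is a small sketch of the arithmetic in Go rather than in the compiler's own terms. The helper wraps and the constant names are hypothetical. Adding 3 to the largest multiple of 4 does not wrap, while adding 3 to 0xffffffff does:

```go
package main

import "fmt"

const (
	maxAny       = ^uint32(0)  // 0xffffffff: range without the alignment fact
	maxMultiple4 = maxAny &^ 3 // 0xfffffffc: largest value with low two bits clear
)

// wraps reports whether i+off overflows a 32-bit unsigned integer.
func wraps(i, off uint32) bool { return i+off < i }

func main() {
	fmt.Println(wraps(maxAny, 3), wraps(maxMultiple4, 3))
}
```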
2018-01-31 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
PR tree-optimization/81635
* tree-data-ref.c (split_constant_offset_1): For types that
wrap on overflow, try to use range info to prove that wrapping
cannot occur.
gcc/testsuite/
PR tree-optimization/81635
* gcc.dg/vect/bb-slp-pr81635-1.c: New test.
* gcc.dg/vect/bb-slp-pr81635-2.c: Likewise.
Renlin Li [Thu, 1 Feb 2018 13:02:24 +0000 (13:02 +0000)]
[PR83370][AARCH64]Use tighter register constraint for sibcall patterns.
In the aarch64 backend, the ip0/ip1 registers are used in the prologue/epilogue as
temporary registers.
When the compiler performs sibcall optimization, it may use the
ip0/ip1 registers to hold the address for an indirect function call. However,
those two registers might be clobbered by the epilogue code, which makes the
final sibcall instruction invalid.
The patch renames the register class CALLER_SAVE_REGS to TAILCALL_ADDR_REGS
to reflect its usage, and removes the IP registers from this class.
Richard Biener [Thu, 1 Feb 2018 12:51:24 +0000 (12:51 +0000)]
domwalk.h (dom_walker::dom_walker): Add additional constructor for specifying RPO order and allow NULL for that.
2018-02-01 Richard Biener <rguenther@suse.de>
* domwalk.h (dom_walker::dom_walker): Add additional constructor
for specifying RPO order and allow NULL for that.
* domwalk.c (dom_walker::dom_walker): Likewise.
(dom_walker::walk): Handle NULL RPO order.
* tree-into-ssa.c (rewrite_dom_walker): Do not walk dom children
in RPO order.
(rewrite_update_dom_walker): Likewise.
(mark_def_dom_walker): Likewise.
[AArch64] Fix SVE testsuite failures for ILP32 (PR 83846)
The SVE tests are split into code-quality compile tests and runtime
tests. A lot of the former are geared towards LP64. It would be
possible (but tedious!) to mark up every line that is expected to work
only for LP64, but I think it would be a constant source of problems.
Since the code has not been tuned for ILP32 yet, I think the best
thing is to select only the runtime tests for that combination.
They all pass on aarch64-elf and aarch64_be-elf except vec-cond-[34].c,
which are unsupported due to the lack of fenv support.
The patch also replaces uses of built-in types with stdint.h types
where possible. (This excludes tests that change the endianness,
since we can't assume that system header files work in that case.)
2018-02-01 Richard Sandiford <richard.sandiford@linaro.org>
[AArch64] Handle SVE subregs that are effectively REVs
Subreg reads should be equivalent to storing the inner register to
memory and loading the appropriate memory bytes back, with subreg
writes doing the reverse. For the reasons explained in the comments,
this isn't what happens for big-endian SVE if we simply reinterpret
one vector register as having a different element size, so the
conceptual store and load is needed in the general case.
However, that obviously produces poor code if we do it too often.
The patch therefore adds a pattern for handling the operation in
registers. This copes with the important case of a VIEW_CONVERT
created by tree-vect-slp.c:duplicate_and_interleave.
It might make sense to tighten the predicates in aarch64-sve.md so
that such subregs are not allowed as operands to most instructions,
but that's future work.
This fixes the sve/slp_*.c tests on aarch64_be.
2018-02-01 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* config/aarch64/aarch64-protos.h (aarch64_split_sve_subreg_move)
(aarch64_maybe_expand_sve_subreg_move): Declare.
* config/aarch64/aarch64.md (UNSPEC_REV_SUBREG): New unspec.
* config/aarch64/predicates.md (aarch64_any_register_operand): New
predicate.
* config/aarch64/aarch64-sve.md (mov<mode>): Optimize subreg moves
that are semantically a reverse operation.
(*aarch64_sve_mov<mode>_subreg_be): New pattern.
* config/aarch64/aarch64.c (aarch64_maybe_expand_sve_subreg_move)
(aarch64_replace_reg_mode, aarch64_split_sve_subreg_move): New
functions.
(aarch64_can_change_mode_class): For big-endian, forbid changes
between two SVE modes if they have different element sizes.
Reviewed-by: James Greenhalgh <james.greenhalgh@arm.com>
From-SVN: r257289
This patch deals with cases in which a CONST_VECTOR contains a
repeating bit pattern that is wider than one element but narrower
than 128 bits. The current code:
* treats the repeating pattern as a single element
* uses the associated LD1R to load and replicate it (such as LD1RD
for 64-bit patterns)
* uses a subreg to cast the result back to the original vector type
The problem is that for big-endian targets, the final cast is
effectively a form of element reverse. E.g. say we're using LD1RD to load
16-bit elements, with h being the high parts and l being the low parts:
                 +-----+-----+-----+-----+-----+----
lanes            |  0  |  1  |  2  |  3  |  4  | ...
                 +-----+-----+-----+-----+-----+----
memory bytes     |h0 l0 h1 l1 h2 l2 h3 l3 h0 l0 ....
                 +----------------------------------
                  V  V  V  V  V  V  V  V
       ----------+-----------------------+
register    .... |           0           |
after  ----------+-----------------------+ lsb
LD1RD  .... h3 l3 h0 l0 h1 l1 h2 l2 h3 l3|
       ----------------------------------+
A later patch fixes the handling of general subregs to account
for this, but it means that we need to do a REV instruction
after the load. It seems better to use LD1RQ[BHW] on a 128-bit
pattern instead, since that gets the endianness right without
a separate fixup instruction.
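The element reverse can be modelled outside the compiler. The sketch below is a simplified byte-level model, not SVE semantics. It stores four 16-bit elements big-endian, loads them as one 64-bit big-endian value, and then reads 16-bit lanes starting from the lsb. The function name is hypothetical:

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// lanesAfterWideLoad stores elems as big-endian 16-bit values, loads the
// 8 bytes as a single big-endian 64-bit value, and reinterprets that
// value as four 16-bit lanes, with lane 0 at the least significant end.
func lanesAfterWideLoad(elems [4]uint16) [4]uint16 {
	var mem [8]byte
	for i, e := range elems {
		binary.BigEndian.PutUint16(mem[2*i:], e)
	}
	v := binary.BigEndian.Uint64(mem[:])
	var lanes [4]uint16
	for k := range lanes {
		lanes[k] = uint16(v >> (16 * uint(k)))
	}
	return lanes
}

func main() {
	fmt.Println(lanesAfterWideLoad([4]uint16{0, 1, 2, 3})) // prints "[3 2 1 0]"
}
```

In this model the lanes come back in reverse order, which is the fixup a REV would otherwise have to undo.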
2018-02-01 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* config/aarch64/aarch64.c (aarch64_expand_sve_const_vector): Prefer
the TImode handling for big-endian targets.
gcc/testsuite/
* gcc.target/aarch64/sve/slp_2.c: Expect LD1RQ to be used instead
of LD1R[HWD] for multi-element constants on big-endian targets.
* gcc.target/aarch64/sve/slp_3.c: Likewise.
* gcc.target/aarch64/sve/slp_4.c: Likewise.
Reviewed-by: James Greenhalgh <james.greenhalgh@arm.com>
From-SVN: r257288
The fallback way of handling a repeated 128-bit constant vector for SVE
is to force the 128 bits to the constant pool and use LD1RQ to load it.
Previously the code always used the byte variant of LD1RQ (LD1RQB),
with a preceding BSWAP for big-endian targets. However, that BSWAP
doesn't handle all cases correctly.
The simplest fix seemed to be to use the LD1RQ appropriate for the
element size.
This helps to fix some of the sve/slp_*.c tests for aarch64_be,
although a later patch is needed as well.
2018-02-01 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* config/aarch64/aarch64-sve.md (sve_ld1rq): Replace with...
(*sve_ld1rq<Vesize>): ... this new pattern. Handle all element sizes,
not just bytes.
* config/aarch64/aarch64.c (aarch64_expand_sve_widened_duplicate):
Remove BSWAP handling for big-endian targets and use the form of
LD1RQ appropriate for the mode.
gcc/testsuite/
* gcc.target/aarch64/sve/slp_2.c: Expect LD1RQD rather than LD1RQB.
* gcc.target/aarch64/sve/slp_3.c: Expect LD1RQW rather than LD1RQB.
* gcc.target/aarch64/sve/slp_4.c: Expect LD1RQH rather than LD1RQB.
Reviewed-by: James Greenhalgh <james.greenhalgh@arm.com>
From-SVN: r257287
[AArch64] Generalise aarch64_simd_valid_immediate for SVE
The current aarch64_simd_valid_immediate code predates the move
to the new CONST_VECTOR representation, so for variable-length SVE
it only handles duplicates of single elements, rather than duplicates
of repeating patterns.
This patch removes the restriction. It means that the validity
of a duplicated constant depends only on the bit pattern, not on
the mode used to represent it.
The patch is needed by a later big-endian fix.
2018-02-01 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* config/aarch64/aarch64.c (aarch64_simd_valid_immediate): Handle
all CONST_VECTOR_DUPLICATE_P vectors, not just those with a single
duplicated element.
Reviewed-by: James Greenhalgh <james.greenhalgh@arm.com>
From-SVN: r257286
aarch64_secondary_reload enforced a secondary reload via
aarch64_sve_reload_be for memory and pseudo registers, but failed
to do the same for subregs of pseudo registers. To avoid this and
any similar problems, the patch instead tests for things that the move
patterns handle directly; if the operand isn't one of those, we should
use the reload pattern instead.
The patch fixes an ICE in sve/mask_struct_store_3.c for aarch64_be,
where the bogus target description was (rightly) causing LRA to cycle.
2018-02-01 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
PR target/83845
* config/aarch64/aarch64.c (aarch64_secondary_reload): Tighten
check for operands that need to go through aarch64_sve_reload_be.
Reviewed-by: James Greenhalgh <james.greenhalgh@arm.com>
From-SVN: r257285
Janne Blomqvist [Thu, 1 Feb 2018 07:41:03 +0000 (09:41 +0200)]
PR 83705 Repeat with large values
This patch fixes the regression by increasing the limit where we fall
back to runtime to 2**28 elements, which is the same limit where
previous releases failed. There are still bugs in the runtime
evaluation, so in many cases longer characters will still fail, so
print a warning message.
Regtested on x86_64-pc-linux-gnu.
gcc/fortran/ChangeLog:
2018-02-01 Janne Blomqvist <jb@gcc.gnu.org>
PR fortran/83705
* simplify.c (gfc_simplify_repeat): Increase limit for deferring
to runtime, print a warning message.
We already dereference the pointer to copy the value, but if the
method does not use the value then the pointer dereference may be
optimized away. Do an explicit nil check so that we get the panic
that is required.
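The message does not say which construct is affected, so the following is only an assumed illustration of the required behaviour: calling a value method through a nil pointer must panic even when the method ignores its receiver. All names are hypothetical:

```go
package main

import "fmt"

type point struct{ x, y int }

// ignore is a value method that never looks at its receiver.
func (point) ignore() {}

// callOnNil calls the value method through a nil pointer and reports
// whether the call panicked.
func callOnNil() (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	var p *point
	p.ignore() // shorthand for (*p).ignore(): copying *p must panic
	return false
}

func main() {
	fmt.Println(callOnNil())
}
```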
Jakub Jelinek [Wed, 31 Jan 2018 20:47:48 +0000 (21:47 +0100)]
re PR fortran/84116 (ICE in gfc_match_omp_clauses, at fortran/openmp.c:1354)
PR fortran/84116
* openmp.c (gfc_match_omp_clauses): If all the linear
gfc_match_omp_variable_list calls failed, don't gfc_free_omp_namelist
nor set *head = NULL. Formatting fixes.
Jason Merrill [Wed, 31 Jan 2018 20:46:36 +0000 (21:46 +0100)]
re PR c++/83993 (ICE: constant not recomputed when ADDR_EXPR changed)
PR c++/83993
* constexpr.c (cxx_eval_outermost_constant_expr): Build NOP_EXPR
around non-constant ADDR_EXPRs rather than clearing TREE_CONSTANT
on ADDR_EXPR.
Jakub Jelinek [Wed, 31 Jan 2018 20:45:41 +0000 (21:45 +0100)]
re PR c++/83993 (ICE: constant not recomputed when ADDR_EXPR changed)
PR c++/83993
* constexpr.c (diag_array_subscript): Emit different diagnostics
if TYPE_DOMAIN (arraytype) is NULL.
(cxx_eval_array_reference, cxx_eval_store_expression): For arrays
with NULL TYPE_DOMAIN use size_zero_node as nelts.
* g++.dg/init/pr83993-1.C: New test.
* g++.dg/cpp0x/pr83993.C: New test.
Paul Thomas [Wed, 31 Jan 2018 20:28:35 +0000 (20:28 +0000)]
re PR fortran/84088 ([nvptx] libgomp.oacc-fortran/declare-*.f90 execution fails)
2018-01-31 Paul Thomas <pault@gcc.gnu.org>
PR fortran/84088
* trans-expr.c (gfc_conv_procedure_call): If the parm expr is
an address expression passed to an assumed rank dummy, convert
to an indirect reference.
2018-01-31 Paul Thomas <pault@gcc.gnu.org>
PR fortran/84088
* gfortran.dg/pr84088.f90: New test.