Alan Modra [Sat, 30 Apr 2016 00:35:39 +0000 (10:05 +0930)]
[RS6000] Allow saving of fixed regs.
As I noted a long time ago in the comment on fixed_reg_p, the real
problem with saving fixed/global regs is that exception frame
unwinding might restore them. So don't emit eh_frame info for any
such reg, and the unwinder won't restore them.
Also, tidy rs6000_savres_strategy. Delaying some checks means we
won't iterate over regs quite so often.
* config/rs6000/rs6000.c (rs6000_savres_strategy): Force inline
restoring when fixed_reg_p, but allow out-of-line or stmw save.
Check for user regs later to avoid unnecessary looping over regs.
Merge user reg check with non-saved reg check. Don't force
inline VR restore when static chain used.
(rs6000_frame_related): Omit eh_frame info for user regs when
saving.
(fixed_regs_p): Delete.
Alan Modra [Sat, 30 Apr 2016 00:01:07 +0000 (09:31 +0930)]
Goodbye REG_LIVE_LENGTH
* regs.h (struct reg_info_t): Delete live_length.
(REG_LIVE_LENGTH): Delete macro.
* regstat.c (regstat_bb_compute_ri): Delete artificial_uses,
local_live, local_processed and local_live_last_luid params.
Replace bb_index param with bb. Don't set REG_LIVE_LENGTH.
Formatting fixes.
(regstat_compute_ri): Adjust for above. Don't set
REG_LIVE_LENGTH.
(dump_reg_info): Don't print live length.
* ira.c (update_equiv_regs): Replace test of REG_LIVE_LENGTH
with test of setjmp_crosses. Don't set REG_LIVE_LENGTH.
Localize loop_depth var.
Paolo Carlini [Sat, 30 Apr 2016 00:00:51 +0000 (00:00 +0000)]
re PR c++/66644 (Rejects C++11 in-class anonymous union members initialization)
/cp
2016-04-29 Paolo Carlini <paolo.carlini@oracle.com>
PR c++/66644
* class.c (check_field_decl): Remove final int* parameter, change
the return type to bool; fix logic in order not to reject multiple
initialized fields in anonymous struct.
(check_field_decls): Adjust call.
/testsuite
2016-04-29 Paolo Carlini <paolo.carlini@oracle.com>
Alan Modra [Sat, 30 Apr 2016 00:00:22 +0000 (09:30 +0930)]
ira.c validate_equiv_mem
This function is used to validate REG_EQUIV notes generated by ira,
and to validate potential insn combines performed by ira. The two
conditions are not exactly the same, with reload being more
restrictive. Separate them so more combines/moves can occur.
For example, this sequence from cfgexpand.c:expand_gimple_cond
callq _Z18update_bb_for_insnP15basic_block_def
mov 0x10(%rbx),%rdi
mov 0x0(%rip),%rbp # x_rtl+0x34
callq _Z9safe_as_aIP8rtx_insn7rtx_defET_PT0_
mov %r13,%rdx
mov %rbp,%rsi
mov %rax,%rdi
callq _Z18create_basic_blockP7rtx_defS0_P15basic_block_def
Alan Modra [Fri, 29 Apr 2016 23:59:22 +0000 (09:29 +0930)]
ira.c use DF infrastructure for combine_and_move_insns
This patch actually improves generated code, because REG_DEAD notes
used by the old insn scan are not always present. On x86_64, see
gcc/wide-int-print.o:print_hex for an example of a function that is
smaller and uses one less callee saved reg.
* ira.c (combine_and_move_insns): Rather than scanning insns,
use DF infrastucture to find use and def insns.
Alan Modra [Fri, 29 Apr 2016 23:58:17 +0000 (09:28 +0930)]
ira.c combine_and_move_insns, and ordering of functions
Notes added by add_store_equivs are not used directly or indirectly by
combine_and_move_insns. add_store_equivs can therefore run later
without affecting the output of combine_and_move_insns, and thus
add_store_equivs need not take into account potentially moved insns.
Since not all potentially combined/moved insns are in fact combined or
moved, this may allow add_store_equivs to add more REG_EQUIV notes.
grow_reg_equivs isn't needed until the reload reg_equivs array is
changed.
ira.c (combine_and_move_insns): Move invariant conditions..
(ira.c): ..to here. Call combine_and_move_insns before
add_store_equivs. Call grow_reg_equivs later. Allocate
req_equiv later using max_reg_num() rather than global max_regno.
(contains_replace_regs): Delete.
(add_store_equivs): Remove contains_replace_regs test.
Alan Modra [Fri, 29 Apr 2016 23:57:00 +0000 (09:27 +0930)]
ira.c tidies: split update_reg_equivs
* ira.c (add_store_equivs, combine_and_move_insns): New functions,
split out from..
(update_reg_equivs): ..here. Move allocation and freeing of
reg_equiv, and calls to grow_reg_equivs, init_alias_analysis,
end_alias_analysis to..
(ira): ..here.
Patrick Palka [Fri, 29 Apr 2016 19:15:25 +0000 (19:15 +0000)]
tree-ssa-threadedge.c (simplify_control_stmt_condition): Split out into ...
2016-04-29 Patrick Palka <ppalka@gcc.gnu.org>
* tree-ssa-threadedge.c (simplify_control_stmt_condition): Split
out into ...
(simplify_control_stmt_condition_1): ... here. Recurse into
BIT_AND_EXPRs and BIT_IOR_EXPRs.
* gcc.dg/tree-ssa/ssa-thread-14.c: New test.
* gcc.dg/tree-ssa/ssa-thread-11.c: Update expected output.
re PR middle-end/70626 (bogus results in 'acc parallel loop' reductions)
gcc/c-family/
PR middle-end/70626
* c-common.h (c_oacc_split_loop_clauses): Add boolean argument.
* c-omp.c (c_oacc_split_loop_clauses): Use it to duplicate
reduction clauses in acc parallel loops.
gcc/c/
PR middle-end/70626
* c-parser.c (c_parser_oacc_loop): Don't augment mask with
OACC_LOOP_CLAUSE_MASK.
(c_parser_oacc_kernels_parallel): Update call to
c_oacc_split_loop_clauses.
gcc/cp/
PR middle-end/70626
* parser.c (cp_parser_oacc_loop): Don't augment mask with
OACC_LOOP_CLAUSE_MASK.
(cp_parser_oacc_kernels_parallel): Update call to
c_oacc_split_loop_clauses.
gcc/fortran/
PR middle-end/70626
* trans-openmp.c (gfc_trans_oacc_combined_directive): Duplicate
the reduction clause in both parallel and loop directives.
gcc/testsuite/
PR middle-end/70626
* c-c++-common/goacc/combined-reduction.c: New test.
* gfortran.dg/goacc/reduction-2.f95: Add check for kernels reductions.
libgomp/
PR middle-end/70626
* testsuite/libgomp.oacc-c++/template-reduction.C: Adjust test.
* testsuite/libgomp.oacc-c-c++-common/combined-reduction.c: New test.
* testsuite/libgomp.oacc-fortran/combined-reduction.f90: New test.
tree-vect-loop.c (vect_transform_loop): Fix nb_iterations_upper_bound computation for vectorized loop.
gcc/
* tree-vect-loop.c (vect_transform_loop): Fix
nb_iterations_upper_bound computation for vectorized loop.
gcc/testsuite/
* gcc.target/i386/vect-unpack-2.c (avx512bw_test): Avoid
optimization of vector loop.
* gcc.target/i386/vect-unpack-3.c: New test.
* gcc.dg/vect/vect-nb-iter-ub-1.c: New test.
* gcc.dg/vect/vect-nb-iter-ub-2.c: New test.
* gcc.dg/vect/vect-nb-iter-ub-3.c: New test.
Andreas Krebbel [Fri, 29 Apr 2016 09:17:35 +0000 (09:17 +0000)]
S/390: Replace LDER with LDR.
For performance reasons it is important to write the full 64 bits of
an FPR target reg even when dealing with 32 bit values. So we chose
lder over ler for 32 bit float register moves. lder zero-extends the
32 bit value from the source reg to 64 bit in the target. However,
since it actually doesn't matter whether we write the upper 32 bits
with zeros or with any other garbage we can also use ldr instead. It
is bit shorter and therefore will do good for I-Cache usage.
gcc/ChangeLog:
2016-04-29 Andreas Krebbel <krebbel@linux.vnet.ibm.com>
This fixes an issue with the long displacement memory address
constraints S and T. These were defined to only accept long
displacement addresses. This is wrong since a memory constraint must
not reject an address with a 0 displacement. Reload relies on being
able to turn an invalid memory address into a valid one by reloading
the address into a base register. The S and T constraints would
reject such an address.
This isn't really a problem for the backend since we used the
constraints with that knowledge there but it is a problem for people
writing inline assemblies.
gcc/ChangeLog:
2016-04-29 Ulrich Weigand <uweigand@de.ibm.com>
* config/s390/constraints.md ("U", "W"): Invoke
s390_mem_constraint with "ZR" and "ZT".
* config/s390/s390.c (s390_check_qrst_address): Reject invalid
addresses when using LRA. Accept also short displacements for S
and T constraints. Do not check for long displacement target for
S and T constraints.
(s390_mem_constraint): Remove handling of U and W constraints.
* config/s390/s390.md (various patterns): Remove the short
displacement constraints (Q and R) if a long displacement
constraint is present. Add longdisp as required CPU capability.
* config/s390/vector.md: Likewise.
* config/s390/vx-builtins.md: Likewise.
[ARC] Fix unwanted match for sign extend 16-bit constant.
The combine pass may conclude umulhisi3_imm pattern can accept also sign
extended 16-bit constants. This patch prohibits the combine in considering
this pattern as suitable.
Richard Biener [Fri, 29 Apr 2016 08:36:49 +0000 (08:36 +0000)]
re PR tree-optimization/13962 ([tree-ssa] make "fold" use alias information to optimize pointer comparisons)
2016-04-29 Richard Biener <rguenther@suse.de>
PR tree-optimization/13962
PR tree-optimization/65686
* tree-ssa-alias.h (ptrs_compare_unequal): Declare.
* tree-ssa-alias.c (ptrs_compare_unequal): New function
using PTA to compare pointers.
* match.pd: Add pattern for pointer equality compare simplification
using ptrs_compare_unequal.
i386.md (Load+RegOp to Mov+MemOp peephole2): Use SWI mode iterator.
* config/i386/i386.md (Load+RegOp to Mov+MemOp peephole2):
Use SWI mode iterator. Use general_reg_operand predicate.
(Load+RegOp to Mov+MemOp peephole2 with vector regs): Split
peephole to MMX and SSE part. Use mmx_reg_operand and sse_reg_operand
predicates.
Ian Lance Taylor [Thu, 28 Apr 2016 19:12:13 +0000 (19:12 +0000)]
compiler: Export String_index_expression.
Exports String_index_expression and adds the getter `string` that
returns the underlying string. This will be used to handle string
indexing different from array indexing in escape analysis.
i386.md (peephole2s for operations with memory inputs): Use SWI mode iterator.
* config/i386/i386.md (peephole2s for operations with memory inputs):
Use SWI mode iterator.
(peephole2s for operations with memory outputs): Ditto.
Do not check for stack checking probe.
Jason Merrill [Thu, 28 Apr 2016 19:01:13 +0000 (15:01 -0400)]
cvt.c (cp_get_callee): New.
* cvt.c (cp_get_callee): New.
* constexpr.c (get_function_named_in_call): Use it.
* cxx-pretty-print.c (postfix_expression): Use it.
* except.c (check_noexcept_r): Use it.
* method.c (check_nontriv): Use it.
* tree.c (build_aggr_init_expr): Use it.
* cp-tree.h: Declare it.
In r193072 sbitmap_popcount was removed, so we cannot ask for the popcount
of an sbitmap anymore. Nothing calls sbitmap_alloc_with_popcount either.
This patch removes everything else popcount-related from sbitmap.
* cfganal.c (bitmap_intersection_of_succs): Delete assert checking
dst->popcount.
(bitmap_intersection_of_preds): Ditto.
(bitmap_union_of_succs): Ditto.
(bitmap_union_of_preds): Ditto.
* sbitmap.c (do_popcount): Delete.
(BITMAP_DEBUGGING): Delete.
(sbitmap_verify_popcount): Delete.
(sbitmap_alloc): Don't initialize the popcount field.
(sbitmap_alloc_with_popcount): Delete.
(sbitmap_resize): Don't resize the popcount array.
(sbitmap_vector_alloc): Don't initialize the popcount field.
(bitmap_copy): Don't copy the popcount array.
(bitmap_clear): Don't clear the popcount array.
(bitmap_clear): Delete the popcount array handling.
(bitmap_ior_and_compl): Delete the popcount assert.
(bitmap_not): Ditto.
(bitmap_and_compl): Ditto.
(bitmap_and): Delete the popcount array handling.
(bitmap_xor): Ditto.
(bitmap_ior): Ditto.
(bitmap_or_and): Delete the popcount assert.
(bitmap_and_or): Ditto.
(popcount_table): Delete.
(sbitmap_elt_popcount): Delete.
* sbitmap.h (simple_bitmap_def): Delete the popcount field.
(bitmap_set_bit): Delete the popcount assert.
(bitmap_clear_bit): Ditto.
(sbitmap_free): Don't free the popcount array.
(sbitmap_alloc_with_popcount): Delete declaration.
(sbitmap_popcount): Ditto.
Jakub Jelinek [Thu, 28 Apr 2016 17:10:14 +0000 (19:10 +0200)]
re PR target/70821 (x86_64: __atomic_fetch_add/sub() uses XADD rather than DECL in some cases)
PR target/70821
* config/i386/sync.md (define_peephole2 *atomic_fetch_add_cmp<mode>):
Add new peephole2 where the first insn is *mov<mode>_or instead of
*mov<mode>_internal.
Expanders do not have more elements in the operands array than declared
in the pattern. So, we cannot use operands[5] here. Instead just
declare and use another rtx.
PR target/70668
* config/nds32/nds32.md (casesi): Don't access the operands array
out of bounds.
Bill Seurer [Thu, 28 Apr 2016 16:01:52 +0000 (16:01 +0000)]
This patch adds support for the signed and unsigned int versions of the...
This patch adds support for the signed and unsigned int versions of the
vec_adde altivec builtins from the Power Architecture 64-Bit ELF V2 ABI
OpenPOWER ABI for Linux Supplement (16 July 2015 Version 1.1). There are
many of the builtins that are missing and this is the first of a series
of patches to add them.
There aren't instructions for the int versions of vec_adde so the
output code is built from other built-ins that do have instructions
which in this case is just two vec_adds with a vec_and to ensure the
carry vector is comprised of only the values 0 or 1.
The new test cases are executable tests which verify that the generated
code produces expected values. C macros were used so that the same
test case could be used for both the signed and unsigned versions. An
extra executable test case is also included to ensure that the modified
support for the __int128 versions of vec_adde is not broken. The same
test case could not be used for both int and __int128 because of some
differences in loading and storing the vectors.
Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no
regressions. Is this ok for trunk?
[gcc]
2016-04-28 Bill Seurer <seurer@linux.vnet.ibm.com>
* config/rs6000/rs6000-builtin.def (vec_adde): Change vec_adde to a
special case builtin.
* config/rs6000/rs6000-c.c (altivec_overloaded_builtins): Remove
ALTIVEC_BUILTIN_VEC_ADDE.
* config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin): Add
support for ALTIVEC_BUILTIN_VEC_ADDE.
* config/rs6000/rs6000.c (altivec_init_builtins): Add definition
for __builtin_vec_adde.
[gcc/testsuite]
2016-04-28 Bill Seurer <seurer@linux.vnet.ibm.com>
* gcc.target/powerpc/vec-adde.c: New test.
* gcc.target/powerpc/vec-adde-int128.c: New test.
* gcc.target/i386/avx-vround-1.c: New test.
* gcc.target/i386/avx-vround-2.c: New test.
* gcc.target/i386/avx512vl-vround-1.c: New test.
* gcc.target/i386/avx512vl-vround-2.c: New test.
Martin Jambor [Thu, 28 Apr 2016 14:35:04 +0000 (16:35 +0200)]
Verify that context of local DECLs is the current function
2016-04-28 Martin Jambor <mjambor@suse.cz>
* tree-cfg.c (verify_expr): Verify that local declarations belong to
this function. Call verify_expr on MEM_REFs and bases of other
handled_components.
[ARC] Don't use drsub* instructions when selecting fpuda.
The double precision floating point assist instructions are not
implementing the reverse double subtract instruction (drsub) found in
the FPX extension.
* config/arc/arc.md (cpu_facility): Add fpx variant.
(subdf3): Prohibit use reverse sub when assist operations option
is enabled.
* config/arc/fpx.md (subdf3_insn, *dsubh_peep2_insn): Allow drsub
instructions only when FPX is enabled.
* testsuite/gcc.target/arc/trsub.c: New test.
i386.md (*fop_<mode>_1_mixed): Do not check for mult_operator when calculating "type" attribute.
* config/i386/i386.md (*fop_<mode>_1_mixed): Do not check for
mult_operator when calculating "type" attribute.
(*fop_<mode>_1_i387): Ditto.
(*fop_xf_1_i387): Ditto.
(x87 stack loads peephole2): Add "reg = op (mem, reg)" peephole2.
Use std::swap to swap operands. Use RTL expressions to generate
converted pattern.
Eduard Sanou [Thu, 28 Apr 2016 09:12:05 +0000 (09:12 +0000)]
c-common.c (get_source_date_epoch): New function...
gcc/c-family/ChangeLog:
2016-04-28 Eduard Sanou <dhole@openmailbox.org>
Matthias Klose <doko@debian.org>
* c-common.c (get_source_date_epoch): New function, gets the environment
variable SOURCE_DATE_EPOCH and parses it as long long with error
handling.
* c-common.h (get_source_date_epoch): Prototype.
* c-lex.c (c_lex_with_flags): set parse_in->source_date_epoch.
gcc/ChangeLog:
2016-04-28 Eduard Sanou <dhole@openmailbox.org>
Matthias Klose <doko@debian.org>
2016-04-28 Eduard Sanou <dhole@openmailbox.org>
Matthias Klose <doko@debian.org>
* include/cpplib.h (cpp_init_source_date_epoch): Prototype.
* init.c (cpp_init_source_date_epoch): New function.
* internal.h: Added source_date_epoch variable to struct
cpp_reader to store a reproducible date.
* macro.c (_cpp_builtin_macro_text): Set pfile->date timestamp from
pfile->source_date_epoch instead of localtime if source_date_epoch is
set, to be used for __DATE__ and __TIME__ macros to help reproducible
builds.