Bill Seurer [Fri, 19 Jul 2019 18:33:59 +0000 (18:33 +0000)]
[PATCH, rs6000] Split up rs6000.c.
The source file rs6000.c has grown to unreasonable size and is being
split up into several smaller source files. This should improve
compilation speed for building gcc.
This is the second of several patches to do this and moves most of the
function call and builtin code to a new source file.
Bootstrapped and tested on powerpc64le-unknown-linux-gnu and
powerpc64-unknown-linux-gnu with no regressions. Is this ok for trunk?
2019-07-17 Bill Seurer <seurer@linux.vnet.ibm.com>
After some discussion, we've decided to rename the +bitperm feature
flag to +sve2-bitperm, so that it's consistent with the other SVE2
feature flags. The associated macro was already
__ARM_FEATURE_SVE2_BITPERM, so only the feature flag itself
needs to change.
2019-07-19 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* doc/invoke.texi: Rename the AArch64 +bitperm extension flag
to +sve2-bitperm.
* config/aarch64/aarch64-option-extensions.def: Likewise.
Richard Biener [Fri, 19 Jul 2019 08:47:41 +0000 (08:47 +0000)]
re PR tree-optimization/91207 (Wrong code with -O3)
2019-07-19 Richard Biener <rguenther@suse.de>
PR tree-optimization/91207
Revert
2019-07-17 Richard Biener <rguenther@suse.de>
PR tree-optimization/91178
* tree-vect-stmts.c (get_group_load_store_type): For SLP
loads with a gap larger than the vector size always use
VMAT_STRIDED_SLP.
(vectorizable_load): For VMAT_STRIDED_SLP with a permutation
avoid loading vectors that are only contained in the gap
and thus are not needed.
Jason Merrill [Fri, 19 Jul 2019 06:52:47 +0000 (02:52 -0400)]
PR c++/90098 - partial specialization and class non-type parms.
A non-type template parameter of class type used in an expression has
const-qualified type; the pt.c hunks deal with this difference from the
unqualified type of the parameter declaration. When we use such a
parameter as an argument to another template, we don't want to confuse
things by copying it; we should pass it straight through. And we might as
well skip copying other classes in constant evaluation context in a
template, too; we'll get the copy semantics at instantiation time.
Michael Meissner [Thu, 18 Jul 2019 19:07:13 +0000 (19:07 +0000)]
Rename function.
2019-07-18 Michael Meissner <meissner@linux.ibm.com>
* config/rs6000/predicates.md (prefixed_mem_operand): Call
rs6000_prefixed_address_mode_p instead of rs6000_prefixed_address.
* config/rs6000/rs6000-protos.h (rs6000_prefixed_address_mode_p):
Rename function from rs6000_prefixed_address.
* config/rs6000/rs6000.c (rs6000_prefixed_address_mode_p): Rename
function from rs6000_prefixed_address.
Michael Meissner [Thu, 18 Jul 2019 18:16:43 +0000 (18:16 +0000)]
Update PowerPC compiler for pc-relative support.
2019-07-18 Michael Meissner <meissner@linux.ibm.com>
* config/rs6000/aix.h (TARGET_HAS_TOC): Rename TARGET_TOC to
TARGET_HAS_TOC.
(TARGET_TOC): Likewise.
(TARGET_NO_TOC): Delete here, define TARGET_NO_TOC_OR_PCREL in
rs6000.h.
* config/rs6000/darwin.h (TARGET_HAS_TOC): Rename TARGET_TOC to
TARGET_HAS_TOC.
(TARGET_TOC): Likewise.
(TARGET_NO_TOC): Delete here, define TARGET_NO_TOC_OR_PCREL in
rs6000.h.
* config/rs6000/linux64.h (TARGET_HAS_TOC): Rename TARGET_TOC to
TARGET_HAS_TOC.
(TARGET_TOC): Likewise.
* config/rs6000/rs6000.c (rs6000_option_override_internal): Add
check to require -mcmodel=medium for pc-relative addressing.
(create_TOC_reference): Add assertion for TARGET_TOC.
(rs6000_legitimize_address): Use TARGET_NO_TOC_OR_PCREL instead of
TARGET_NO_TOC.
(rs6000_emit_move): Likewise.
(TOC_alias_set): Rename TOC alias set static variable from 'set'
to 'TOC_alias_set'.
(get_TOC_alias_set): Likewise.
(output_toc): Use TARGET_NO_TOC_OR_PCREL instead of
TARGET_NO_TOC.
(rs6000_can_eliminate): Likewise.
* config/rs6000/rs6000.h (TARGET_TOC): Define in terms of
TARGET_HAS_TOC and not pc-relative.
(TARGET_NO_TOC_OR_PCREL): New macro to replace TARGET_NO_TOC.
* config/rs6000/sysv4.h (TARGET_HAS_TOC): Rename TARGET_TOC to
TARGET_HAS_TOC.
(TARGET_TOC): Likewise.
(TARGET_NO_TOC): Delete here, define TARGET_NO_TOC_OR_PCREL in
rs6000.h.
re PR target/91188 (strict_low_part operations do not work)
PR target/91188
* config/i386/i386.md (*addqi_1_slp): Use register_operand predicate
for operand 0. Do not use (match_dup) to match operand 1 with
operand 0. Add check in insn constraint that either input operand
matches operand 0. Use SWI12 mode iterator to also handle
HImode operands.
(*and<mode>_1_slp): Ditto.
(*<code>qi_1_slp): Ditto.
(*sub<mode>_1_slp): Use register_operand predicate for operand 0.
Do not use (match_dup) to match operand 1 with operand 0. Add
check in insn constraint that operand 1 matches operand 0.
Use SWI12 mode iterator to also handle HImode operands.
(*ashl<mode>3_1_slp): Ditto.
(*<shift_insn><mode>3_1_slp): Ditto.
(*<rotate_insn><mode>3_1_slp): Ditto.
Ian Lance Taylor [Thu, 18 Jul 2019 16:51:00 +0000 (16:51 +0000)]
compiler: fix bug in importing blocks from inline functions
This patch fixes a buglet in the function body importer. Add hooks for
keeping a stack of blocks corresponding to the block nesting in the
imported function. This ensures that local variables and temps wind up
correctly scoped and don't introduce collisions.
Makefile.rtl, [...]: Introduce a "STANDALONE" mode where C runtime files do not have any dependency...
* Makefile.rtl, expect.c, env.c, aux-io.c, mkdir.c, initialize.c,
cstreams.c, raise.c, tracebak.c, adadecode.c, init.c, raise-gcc.c,
argv.c, adaint.c, adaint.h, ctrl_c.c, sysdep.c, rtinit.c, cio.c,
seh_init.c, exit.c, targext.c: Introduce a "STANDALONE" mode where C
runtime files do not have any dependency on GCC include files.
Remove unnecessary includes.
Remove remaining references to VMS in runtime C file.
* runtime.h: New file.
Sylvia Taylor [Thu, 18 Jul 2019 16:02:05 +0000 (16:02 +0000)]
[patch2/2][arm]: remove builtin expand for sha1
This patch removes the builtin expand handling for sha1h/c/m/p and
replaces it with expand patterns. This should make it more consistent
with how we handle intrinsic implementations and cleans up the custom
sha1 code in the arm_expand builtins for unop and ternop.
2019-07-18 Sylvia Taylor <sylvia.taylor@arm.com>
* config/arm/arm-builtins.c
(arm_expand_ternop_builtin): Remove explicit sha1 builtin handling.
(arm_expand_unop_builtin): Likewise.
* config/arm/crypto.md
(crypto_sha1h): Convert from define_insn to define_expand.
(crypto_<crypto_pattern>): Likewise.
(crypto_sha1h_lb): New define_insn.
(crypto_<crypto_pattern>_lb): Likewise.
Sylvia Taylor [Thu, 18 Jul 2019 15:42:13 +0000 (15:42 +0000)]
[patch1/2][arm][PR90317]: fix sha1 patterns
This patch fixes:
1) ICE message thrown when using the crypto_sha1h intrinsic due to
incompatible mode used for zero_extend. Removed zero extend as it is
not a good choice for vector modes and using an equivalent single
mode like TI (128bits) instead of V4SI produces extra instructions
making it inefficient.
This affects gcc version 8 and above.
2) Incorrect combine optimizations made due to vec_select usage
in the sha1 patterns on arm. The patterns should only combine
a vec select within a sha1h<op> instruction when the lane is 0.
This affects gcc version 5 and above.
- Fixed by explicitly declaring the valid const int for such
optimizations. For cases when the lane is not 0, the vector
lane selection now occurs in, e.g., a vmov instruction prior
to sha1h<op>.
- Updated the sha1h testcases on arm to check for additional
cases with custom vector lane selection.
The intrinsic functions for the sha1 patterns have also been
simplified which seems to eliminate extra vmovs like:
- vmov.i32 q8, #0.
demangle.h (rust_is_mangled): Move to libiberty/rust-demangle.h.
include/
* demangle.h (rust_is_mangled): Move to libiberty/rust-demangle.h.
(rust_demangle_sym): Move to libiberty/rust-demangle.h.
libiberty/
* cplus-dem.c: Include rust-demangle.h.
* rust-demangle.c: Include rust-demangle.h.
* rust-demangle.h: New file.
Richard Earnshaw [Thu, 18 Jul 2019 13:56:52 +0000 (13:56 +0000)]
[arm] Fix incorrect modes with 'borrow' operations
Looking through the arm backend I noticed that the modes used to pass
comparison types into subtract-with-carry operations were being
incorrectly set. The result is that the compiler is not truly
self-consistent. To clean this up I've introduced a new predicate,
arm_borrow_operation (borrowed from the AArch64 backend) which can
match the comparison type with the required mode and then fixed all
the patterns to use this. The split patterns that were generating
incorrect modes have all obviously been fixed as well.
The basic rule for the use of a borrow is:
- if the condition code was set by a 'subtract-like' operation (subs, cmp),
then use CCmode and LTU.
- if the condition code was set by unsigned overflow of addition (adds), then
use CC_Cmode and GEU.
* config/arm/predicates.md (arm_borrow_operation): New predicate.
* config/arm/arm.md (subdi3_compare1): Use CCmode for the split.
(arm_subdi3, subdi_di_zesidi, subdi_di_sesidi): Likewise.
(subdi_zesidi_zesidi): Likewise.
(negdi2_compare, negdi2_insn): Likewise.
(negdi_extensidi): Likewise.
(negdi_zero_extendsidi): Likewise.
(arm_cmpdi_insn): Likewise.
(subsi3_carryin): Use arm_borrow_operation.
(subsi3_carryin_const): Likewise.
(subsi3_carryin_const0): Likewise.
(subsi3_carryin_compare): Likewise.
(subsi3_carryin_compare_const): Likewise.
(subsi3_carryin_compare_const0): Likewise.
(subsi3_carryin_shift): Likewise.
(rsbsi3_carryin_shift): Likewise.
(negsi2_carryin_compare): Likewise.
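The borrow rule above can be illustrated in portable C++. This is only a sketch: the helper names are hypothetical, and it models the flag tests with ordinary unsigned comparisons rather than the backend's RTL. My reading is that both cases identify the carry flag being clear, which is when a subtract-with-carry takes the extra one away.

```cpp
#include <cstdint>

// SUBS/CMP view: a borrow is needed when a < b as unsigned values (LTU).
bool borrow_after_subs(std::uint32_t a, std::uint32_t b) { return a < b; }

// ADDS view: the addition did not wrap when sum >= a as unsigned values
// (GEU), i.e. there was no unsigned overflow.
bool no_overflow_after_adds(std::uint32_t a, std::uint32_t b) {
  std::uint32_t sum = a + b;
  return sum >= a;
}

// 64-bit subtraction built from 32-bit halves, the shape of a
// subtract followed by a subtract-with-borrow.
std::uint64_t sub64(std::uint64_t x, std::uint64_t y) {
  std::uint32_t xl = static_cast<std::uint32_t>(x);
  std::uint32_t xh = static_cast<std::uint32_t>(x >> 32);
  std::uint32_t yl = static_cast<std::uint32_t>(y);
  std::uint32_t yh = static_cast<std::uint32_t>(y >> 32);
  std::uint32_t lo = xl - yl;
  std::uint32_t hi = xh - yh - (borrow_after_subs(xl, yl) ? 1u : 0u);
  return (static_cast<std::uint64_t>(hi) << 32) | lo;
}
```

The high half only comes out right if the borrow is derived with the comparison that matches the instruction that set the flags, which is the consistency the new predicate is meant to enforce.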
Jan Hubicka [Thu, 18 Jul 2019 13:08:34 +0000 (15:08 +0200)]
lto-common.c (gimple_register_canonical_type_1): Do not look for non-ODR conflicts of types in anonymous namespaces.
* lto-common.c (gimple_register_canonical_type_1): Do not look for
non-ODR conflicts of types in anonymous namespaces.
(unify_scc): Do not merge anonymous namespace types.
* g++.dg/lto/alias-5_0.C: New testcase.
* g++.dg/lto/alias-5_1.C: New.
* g++.dg/lto/alias-5_2.c: New.
Bin Cheng [Thu, 18 Jul 2019 08:38:09 +0000 (08:38 +0000)]
re PR tree-optimization/91137 (Wrong code with -O3)
PR tree-optimization/91137
* tree-ssa-loop-ivopts.c (struct ivopts_data): New field.
(tree_ssa_iv_optimize_init, alloc_iv, tree_ssa_iv_optimize_finalize):
Init, use and fini the above new field.
(determine_base_object_1): New function.
(determine_base_object): Reimplement using walk_tree.
gcc/testsuite
PR tree-optimization/91137
* gcc.c-torture/execute/pr91137.c: New test.
This change is needed to avoid a regression in gcc.dg/ifcvt-3.c
for a later patch. Without it, we enter CSE with a dead comparison left
by if-conversion and then eliminate the second (live) comparison in
favour of the dead one. That's functionally correct in itself, but it
meant that we'd combine the subtraction and comparison into a SUBS
before we have a chance to fold away the subtraction.
2019-07-18 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* basic-block.h (CLEANUP_FORCE_FAST_DCE): New macro.
* cfgcleanup.c (cleanup_cfg): Call run_fast_dce if
CLEANUP_FORCE_FAST_DCE is set.
* ifcvt.c (rest_of_handle_if_conversion): Pass
CLEANUP_FORCE_FAST_DCE to the final cleanup_cfg call if
if-conversion succeeded.
Ian Lance Taylor [Thu, 18 Jul 2019 05:05:20 +0000 (05:05 +0000)]
compiler: fix bug in handling of unordered set during exporting
In CL 183850 a change was made to combine tracking/discovery of
exported types and imported packages during export data generation. As
a result of this refactoring a bug was introduced: the new code can
potentially insert items into the exports set (an unordered_set) while
iterating through the same set, which is illegal according to the spec
for std::unordered_set.
This patch fixes the problem by changing the type discovery phase to
iterate through a separate list of sorted exports, as opposed to
iterating through the main unordered set. Also included is a change
to fix the code that looks for variables that are referenced from
inlined routine bodies (this code wasn't scanning all of the function
that it needed to scan).
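The fix pattern described above can be sketched as follows. The names (ExportSet, discover, the ".dep" suffix) are hypothetical and do not come from the gofrontend sources; the point is only that iteration runs over a sorted snapshot, so insertions into the unordered set cannot invalidate the loop's iterators.

```cpp
#include <algorithm>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for the exporter's set of exported names.
using ExportSet = std::unordered_set<std::string>;

// Walk the exports in sorted order and record one extra entry per export
// (a made-up discovery rule). The loop iterates over `sorted`, a copy, so
// inserting into `exports` is safe even if the set rehashes.
std::vector<std::string> discover(ExportSet& exports) {
  std::vector<std::string> sorted(exports.begin(), exports.end());
  std::sort(sorted.begin(), sorted.end());
  for (const std::string& name : sorted)
    exports.insert(name + ".dep");
  return sorted;
}
```

Sorting the snapshot also makes the visiting order deterministic, which an unordered container does not guarantee.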
-Wmissing-attributes: check that we avoid duplicates and false positives
The initial patch for PR 81824 fixed various possibilities of
-Wmissing-attributes reporting duplicates and false positives. The
test that avoided them was a little obscure, though, so this patch
rewrites it into a more self-evident form.
The patch also adds a testcase that already passed, but that
explicitly covers some of the possibilities of reporting duplicates
and false positives that preexisting tests did not cover.
for gcc/ChangeLog
PR middle-end/81824
* attribs.c (decls_mismatched_attributes): Simplify the logic
that avoids duplicates and false positives.
for gcc/testsuite/ChangeLog
PR middle-end/81824
* g++.dg/Wmissing-attributes-1.C: New. Some of its fragments
are from Martin Sebor.
pa.c (pa_som_asm_init_sections): Don't force all constant data into data section when generating PIC code.
* config/pa/pa.c (pa_som_asm_init_sections): Don't force all constant
data into data section when generating PIC code.
(pa_select_section): Use pa_reloc_rw_mask() to qualify relocs.
(pa_reloc_rw_mask): Return 3 when generating PIC code and when
generating code for SOM targets earlier than HP-UX 11. Otherwise,
return 2 for SOM and 0 for other targets.
Richard Biener [Wed, 17 Jul 2019 11:21:49 +0000 (11:21 +0000)]
re PR tree-optimization/91178 (Infinite recursion in split_constant_offset in slp after r260289)
2019-07-17 Richard Biener <rguenther@suse.de>
PR tree-optimization/91178
* tree-ssa.c (release_defs_bitset): Iterate from higher to
lower SSA names to avoid quadratic behavior in the common case.
* tree-data-ref.c (split_constant_offset): Add limit argument
and pass it down. Initialize it from PARAM_SSA_NAME_DEF_CHAIN_LIMIT.
(split_constant_offset_1): Add limit argument and use it to
limit SSA def walking. Optimize the common plus/minus case.
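As a rough illustration of bounding a definition-chain walk with a limit argument, consider the sketch below. The Def type and split_offset function are hypothetical models, not the data structures in tree-data-ref.c, and the real limit comes from PARAM_SSA_NAME_DEF_CHAIN_LIMIT rather than a caller-chosen number.

```cpp
// Hypothetical model of an SSA definition "name = base + addend".
// base == nullptr marks a name whose definition is opaque.
struct Def {
  const Def* base;
  long addend;
};

// Accumulate constant addends along the definition chain, visiting at most
// `limit` definitions. `*stop` receives the name where the walk ended.
long split_offset(const Def* d, unsigned limit, const Def** stop) {
  long offset = 0;
  while (d->base != nullptr && limit > 0) {
    offset += d->addend;
    d = d->base;
    --limit;
  }
  *stop = d;
  return offset;
}
```

When the budget runs out, the walk returns the partial offset together with the name it stopped at, so the result stays correct and is merely less simplified.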
Richard Biener [Wed, 17 Jul 2019 10:26:25 +0000 (10:26 +0000)]
re PR tree-optimization/91178 (Infinite recursion in split_constant_offset in slp after r260289)
2019-07-17 Richard Biener <rguenther@suse.de>
PR tree-optimization/91178
* tree-vect-stmts.c (get_group_load_store_type): For SLP
loads with a gap larger than the vector size always use
VMAT_STRIDED_SLP.
(vectorizable_load): For VMAT_STRIDED_SLP with a permutation
avoid loading vectors that are only contained in the gap
and thus are not needed.
Jakub Jelinek [Wed, 17 Jul 2019 07:15:30 +0000 (09:15 +0200)]
re PR tree-optimization/91157 (ICE: verify_gimple failed (error: position plus size exceeds size of referenced object in 'bit_field_ref'))
PR tree-optimization/91157
* tree-vect-generic.c (expand_vector_comparison): Handle lhs being
a vector boolean with scalar mode.
(expand_vector_condition): Handle first operand being a vector boolean
with scalar mode.
(expand_vector_operations_1): For comparisons, don't bail out early
if the return type is vector boolean with scalar mode, but comparison
operand type is not.
* gcc.target/i386/avx512f-pr91157.c: New test.
* gcc.target/i386/avx512bw-pr91157.c: New test.
Jakub Jelinek [Wed, 17 Jul 2019 07:13:17 +0000 (09:13 +0200)]
re PR tree-optimization/91157 (ICE: verify_gimple failed (error: position plus size exceeds size of referenced object in 'bit_field_ref'))
PR tree-optimization/91157
* tree-vect-generic.c (expand_vector_comparison): Handle lhs being
a vector boolean with scalar mode.
(expand_vector_condition): Handle first operand being a vector boolean
with scalar mode.
(expand_vector_operations_1): For comparisons, don't bail out early
if the return type is vector boolean with scalar mode, but comparison
operand type is not.
* gcc.target/i386/avx512f-pr91157.c: New test.
* gcc.target/i386/avx512bw-pr91157.c: New test.
i386.md (*testdi_1): Match CCZmode for constants that might have the SImode sign bit set.
* config/i386/i386.md (*testdi_1): Match CCZmode for
constants that might have the SImode sign bit set.
(*testqi_1_maybe_si): Remove "!" constraint modifier.
Use correct constraints for pentium pairing.
(*test<mode>_1): Ditto.
Jeff Law [Tue, 16 Jul 2019 14:44:44 +0000 (08:44 -0600)]
re PR rtl-optimization/91173 (ICE: in int_mode_for_mode, at stor-layout.c:403)
PR rtl-optimization/91173
* tree-ssa-address.c (addr_for_mem_ref): If the base is an
SSA_NAME with a constant value, fold its value into the offset
and clear the base before calling gen_addr_rtx.
Jan Hubicka [Tue, 16 Jul 2019 09:29:17 +0000 (11:29 +0200)]
alias-1_0.C: Use -O3.
* g++.dg/lto/alias-1_0.C: Use -O3.
* g++.dg/lto/alias-2_0.C: Use -O3.
* g++.dg/lto/alias-3_0.C: Add loop to enable inlining with
-fno-use-linker-plugin.
* g++.dg/lto/alias-3_1.C: Remove dg-lto-do and dg-lto-options.
Jason Merrill [Tue, 16 Jul 2019 08:57:03 +0000 (04:57 -0400)]
Simplify range location creation in C++ parser.
Many places in the parser follow the same pattern of capturing the location
of the last lexed token, either before or after lexing it, and then using
that as the end of a location range; this can be simplified by passing the
lexer to make_location and grabbing the token location there.
* parser.c (make_location): Add overload taking cp_lexer* as last
parameter.
Jason Merrill [Tue, 16 Jul 2019 08:50:16 +0000 (04:50 -0400)]
Simplify type-specifier parsing.
Previously, the tentative parses for optional type-specifier and to support
class template argument deduction were combined awkwardly. This
reorganization was motivated by the new concepts branch.
* parser.c (cp_parser_simple_type_specifier): Separate tentative
parses for optional type-spec and CTAD.
Jason Merrill [Tue, 16 Jul 2019 08:49:04 +0000 (04:49 -0400)]
Fix g++.dg/template/pr84789.C on new concepts branch.
On the concepts branch I ran into trouble where a pre-parsed dependent
nested-name-specifier got replaced on a subsequent parse with is_declaration
by one with typenames resolved, which was then used wrongly on a further
parse with !is_declaration.
* parser.c (cp_parser_nested_name_specifier_opt): If the token is
already CPP_NESTED_NAME_SPECIFIER, leave it alone.
This patch reports an error if the .md file has an unscoped
attribute that maps to more than one possible value.
2019-07-16 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* read-md.h (md_reader::record_potential_iterator_use): Add a
file_location parameter.
* read-rtl.c (attribute_use::loc): New field.
(map_attr_string): Take a file_location parameter. Report cases
in which attributes map to multiple distinct values.
(apply_attribute_uses): Update call accordingly.
(md_reader::handle_overloaded_name): Likewise.
(md_reader::apply_iterator_to_string): Likewise. Skip empty
nonnull strings.
(record_attribute_use): Take a file_location parameter.
Initialize attribute_use::loc.
(md_reader::record_potential_iterator_use): Take a file_location
parameter. Update call to record_attribute_use.
(rtx_reader::rtx_alloc_for_name): Update call accordingly.
(rtx_reader::read_rtx_code): Likewise.
(rtx_reader::read_rtx_operand): Likewise. Record a location
for implicitly-expanded empty strings.
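The ambiguity check this entry describes can be sketched in isolation. The code below is a simplified model with hypothetical names (Lookup, classify); it assumes the caller has already collected the value an unscoped attribute would take under each applicable iterator, and it does not reflect the actual data structures in read-rtl.c.

```cpp
#include <set>
#include <string>
#include <vector>

enum class Lookup { NotFound, Unique, Ambiguous };

// `candidates` holds the value the attribute maps to under each iterator
// that could apply. The use is ambiguous if those values are not all equal.
Lookup classify(const std::vector<std::string>& candidates,
                std::string* value) {
  std::set<std::string> distinct(candidates.begin(), candidates.end());
  if (distinct.empty())
    return Lookup::NotFound;
  if (distinct.size() > 1)
    return Lookup::Ambiguous;   // would be reported as an error
  *value = *distinct.begin();
  return Lookup::Unique;
}
```

Several iterators agreeing on the same value still counts as unique here, matching the wording "more than one possible value" above.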
Also make it public, so that clients can use the location for error
reporting.
2019-07-16 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* read-md.h (md_reader::ptr_loc): Moved from read-md.c.
Use file_location instead of separate fields.
(md_reader::set_md_ptr_loc): Take a file_location instead of a
separate filename and line number.
* read-md.c (ptr_loc): As above.
(md_reader::copy_md_ptr_loc): Update for new ptr_loc layout.
(md_reader::fprint_md_ptr_loc): Likewise.
(md_reader::set_md_ptr_loc): Likewise. Take a file_location
instead of a separate filename and line number.
(md_reader::read_string): Update call accordingly.
This patch is part of a series that fixes ambiguous attribute
uses in .md files, i.e. cases in which attributes didn't use
<ITER:ATTR> to specify an iterator, and in which <ATTR> could
have different values depending on the iterator chosen.
No behavioural change -- produces the same code as before except
for formatting and line numbers.
2019-07-16 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* config/rs6000/rs6000.md (*mov<mode>_update1): Explicitly
use <SFDF:mode>, <SFDF:MODE>, <SFDF:Ff> and <SFDF:bits> rather than
leaving the choice between SFDF and P implicit.
(*mov<mode>_update2): Likewise.
(*cmp<IBM128:mode>_internal2): Explicitly use <IBM128:MODE>
rather than leaving the choice between IBM128 and GPR implicit.
(*fix<uns>_trunc<IEEE128:mode><QHSI:mode>2_mem): Explicitly use
<IEEE128:MODE> rather than leaving the choice between IEEE128 and
QHSI implicit.
(AltiVec define_peephole2s): Explicitly use <ALTIVEC_DFORM:MODE>
rather than leaving the choice between ALTIVEC_DFORM and P implicit.
* config/rs6000/vsx.md
(*vsx_ext_<VSX_EXTRACT_I:VS_scalar>_fl_<FL_CONV:mode>)
(*vsx_ext_<VSX_EXTRACT_I:VS_scalar>_ufl_<FL_CONV:mode>): Explicitly
use <FL_CONV:VSisa> rather than leaving the choice between FL_CONV
and VSX_EXTRACT_I implicit.
This patch is part of a series that fixes ambiguous attribute
uses in .md files, i.e. cases in which attributes didn't use
<ITER:ATTR> to specify an iterator, and in which <ATTR> could
have different values depending on the iterator chosen.
No behavioural change -- produces the same code as before.
2019-07-16 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* config/mips/micromips.md (*movep<MOVEP1:mode><MOVEP2:mode>):
Explicitly use <MOVEP1:MODE> for the mode attribute.
Ian Lance Taylor [Mon, 15 Jul 2019 21:17:16 +0000 (21:17 +0000)]
runtime: expose the g variable
Currently, getg is implemented in C, which loads the thread-local
g variable. The g variable is declared static in C.
This CL exposes the g variable, so it can be accessed from the Go
side. This allows the Go compiler to inline getg calls to direct
access of g.
Currently, the actual inlining is only implemented in the gollvm
compiler. The g variable is thread-local and the compiler backend
may choose to cache the TLS address in a register or on stack. If
a thread switch happens the cache may become invalid. I don't
know how to disable the TLS address cache in gccgo, therefore
the inlining of getg is not implemented. In the future gccgo may
gain this if we know how to do it safely.
Richard Biener [Mon, 15 Jul 2019 12:48:47 +0000 (12:48 +0000)]
re PR tree-optimization/91162 (ICE: tree check: expected class 'type', have 'exceptional' (error_mark) in useless_type_conversion_p, at gimple-expr.c:86 (error: invalid 'PHI' argument))
2019-07-15 Richard Biener <rguenther@suse.de>
PR middle-end/91162
* tree-cfg.c (move_block_to_fn): When releasing a virtual PHI
node make sure to replace all uses with something valid.
Kewen Lin [Mon, 15 Jul 2019 05:12:05 +0000 (05:12 +0000)]
re PR tree-optimization/88497 (Improve Accumulation in Auto-Vectorized Code)
gcc/ChangeLog
2019-07-15 Kewen Lin <linkw@gcc.gnu.org>
PR tree-optimization/88497
* tree-ssa-reassoc.c (reassociate_bb): Swap the positions of
GIMPLE_BINARY_RHS check and gimple_visited_p check, call new
function undistribute_bitref_for_vector.
(undistribute_bitref_for_vector): New function.
(cleanup_vinfo_map): Likewise.
(sort_by_mach_mode): Likewise.
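As I understand the PR, the new function targets sums of individual lanes taken from several vectors and rewrites them so the vectors are added first and the lanes extracted afterwards. The sketch below shows the two equivalent forms with plain arrays; V4 and both function names are hypothetical, and integer lanes with small values are used so that reordering the additions is exact.

```cpp
#include <array>
#include <cstddef>

using V4 = std::array<int, 4>;

// Before: every lane of every vector is extracted and added as a scalar.
int sum_lanes_separately(const V4& a, const V4& b) {
  return a[0] + a[1] + a[2] + a[3] + b[0] + b[1] + b[2] + b[3];
}

// After: one element-wise vector addition, then a single set of lane
// extractions from the combined vector.
int sum_after_vector_add(const V4& a, const V4& b) {
  V4 t{};
  for (std::size_t i = 0; i < t.size(); ++i)
    t[i] = a[i] + b[i];
  return t[0] + t[1] + t[2] + t[3];
}
```

The second form needs half as many lane extractions, and the saving grows with the number of vectors being accumulated.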
Jerry DeLisle [Sun, 14 Jul 2019 22:52:58 +0000 (22:52 +0000)]
re PR fortran/87233 (Constraint C1279 still followed after f2008 standard revision (?))
2019-07-14 Jerry DeLisle <jvdelisle@gcc.gnu.org>
PR fortran/87233
* expr.c (check_restricted): Relax constraint C1279 which was
removed from F2008 and above.
* gfortran.dg/initialization_14.f90: Modify to now pass by
removing two dg-error commands. Added comments.
* gfortran.dg/initialization_30.f90: New test that includes the
two tests removed above with the 'dg-options -std=f95'.
i386.md (nonmemory_szext_operand): New mode attribute.
* config/i386/i386.md (nonmemory_szext_operand): New mode attribute.
(test<mode>_ccno_1): Macroize insn pattern from testsi_ccno_1
and testdi_ccno_1 using SWI48 mode attribute.
(*testdi_1): Use x86_64_szext_nonmemory_operand instead of
x86_64_szext_general_operand.
(*testqi_1_maybe_si): Use nonmemory_operand instead of general_operand.
(*test<mode>_1): Use nonmemory_szext_operand mode attribute
instead of general_operand mode attribute.
gdbhooks.py: dump-fn, dot-fn: cast ret values of fopen/fclose
Work around the following
(gdb) Python Exception <class 'gdb.error'> 'fclose@@GLIBC_2.2.5' has
unknown return type; cast the call to its declared return type:
(gdb) Error occurred in Python: 'fclose@@GLIBC_2.2.5' has unknown
return type; cast the call to its declared return type
This is due to GDB not being able to pick up and use the return types from
debug info for external declarations.
2019-07-14 Vladislav Ivanishin <vlad@ispras.ru>
* gdbhooks.py (DumpFn.invoke): Add explicit casts of return values of
fopen and fclose to their respective types.
(DotFn.invoke): Ditto.
Jan Hubicka [Sun, 14 Jul 2019 11:57:10 +0000 (13:57 +0200)]
ipa-fnsummary.c (ipa_dump_hints): Do not dump array_index.
* ipa-fnsummary.c (ipa_dump_hints): Do not dump array_index.
(ipa_fn_summary::~ipa_fn_summary): Do not destroy array_index.
(ipa_fn_summary_t::duplicate): Do not duplicate array_index.
(array_index_predicate): Remove.
(analyze_function_body): Account cost for variable-offset array
indexing.
(estimate_node_size_and_time): Do not compute array index hint.
(ipa_merge_fn_summary_after_inlining): Do not merge array index hint.
(inline_read_section): Do not read array index hint.
(ipa_fn_summary_write): Do not write array index hint.
* doc/invoke.texi (ipa-cp-array-index-hint-bonus): Remove.
* ipa-cp.c (hint_time_bonus): Remove.
* ipa-fnsummary.h (ipa_hints_vals): Remove array_index.
(ipa_fn_summary): Remove array_index.
* ipa-inline.c (want_inline_small_function_p): Do not use
array_index.
(edge_badness): Likewise.
* params.def (PARAM_IPA_CP_ARRAY_INDEX_HINT_BONUS): Remove.
Jan Hubicka [Sat, 13 Jul 2019 18:46:40 +0000 (20:46 +0200)]
tree-ssa-alias.c (component_ref_to_zero_sized_trailing_array_p): Break out from ...
* tree-ssa-alias.c (component_ref_to_zero_sized_trailing_array_p):
Break out from ...
(aliasing_component_refs_walk): Break out from ...
(aliasing_component_refs_p): ... here.
We currently get a lot of build warnings like
/home/segher/src/gcc/gcc/config/rs6000/rs6000-c.c:7039:12: warning: misspelled term 'builtin function' in format; use 'built-in function' instead [-Wformat-diag]
7039 | error ("builtin function %qs not supported in this compiler "
| ^~~~~~~~~~~~~~~~
That would print something like
builtin function '__builtin_example' not supported in this compiler
Changing that to "built-in" as suggested only makes this worse.
Instead, let's just remove the whole "builtin function" phrase.
* gimplify.c (struct gimplify_omp_ctx): Add order_concurrent member.
(omp_notice_threadprivate_variable): Diagnose threadprivate variable
uses inside of order(concurrent) constructs.
(gimplify_scan_omp_clauses): Set ctx->order_concurrent if
OMP_CLAUSE_ORDER is seen.
* omp-low.c (struct omp_context): Add order_concurrent member.
(scan_sharing_clauses): Set ctx->order_concurrent if
OMP_CLAUSE_ORDER is seen.
(check_omp_nesting_restrictions): Diagnose ordered or atomic inside
of simd order(concurrent). Diagnose constructs not allowed inside of
for order(concurrent).
(setjmp_or_longjmp_p): Add a context and TREE_PUBLIC check to avoid
complaining about static double setjmp (double); or class static
methods or non-global namespace setjmps.
(omp_runtime_api_call): New function.
(scan_omp_1_stmt): Diagnose OpenMP runtime API calls inside of
order(concurrent) loops.
* c-c++-common/gomp/order-3.c: New test.
* c-c++-common/gomp/order-4.c: New test.
During GCC-9, the codegen for unreachable switch case statements changed
such that the (undefined) behaviour of reaching such statements is directed
to one of the existing switch cases. This means that the testcase which
deals with the old behaviour can no longer work (and there is nothing to test
with it). The [Darwin-specific] test is now redundant and can be removed.
Jan Hubicka [Fri, 12 Jul 2019 16:56:57 +0000 (16:56 +0000)]
tree-ssa-alias.c (same_tmr_indexing_p): Break out from ...
* tree-ssa-alias.c (same_tmr_indexing_p): Break out from ...
(indirect_refs_may_alias_p): ... here.
(nonoverlapping_component_refs_since_match_p): Support also non-trivial
mem refs in the access paths.
* gcc.dg/tree-ssa/alias-access-path-9.c: New testcase.
Jiangning Liu [Fri, 12 Jul 2019 16:28:43 +0000 (16:28 +0000)]
re PR tree-optimization/89430 (A missing ifcvt optimization to generate csel)
2019-07-12 Jiangning Liu <jiangning.liu@amperecomputing.com>
PR tree-optimization/89430
* tree-ssa-phiopt.c (cond_store_replacement): Support conditional
store elimination for local variable without address escape.
PR tree-optimization/89430
* gcc.dg/tree-ssa/pr89430-1.c: New test.
* gcc.dg/tree-ssa/pr89430-2.c: New test.
* gcc.dg/tree-ssa/pr89430-3.c: New test.
* gcc.dg/tree-ssa/pr89430-4.c: New test.
* gcc.dg/tree-ssa/pr89430-5.c: New test.
* gcc.dg/tree-ssa/pr89430-6.c: New test.
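The effect of conditional store elimination on a non-escaping local can be shown by writing both forms by hand. This is a sketch with hypothetical function names, not output from the pass: the second function performs the store unconditionally and selects the stored value instead, which is the shape that can become a conditional select.

```cpp
// Original form: the store only happens when `cond` is true.
int with_branch(bool cond, int idx, int val) {
  int local[4] = {10, 20, 30, 40};   // address never leaves this function
  if (cond)
    local[idx] = val;
  return local[0] + local[1] + local[2] + local[3];
}

// Hand-written equivalent of the transformed code: always store, but
// store the old value back when `cond` is false.
int branch_free(bool cond, int idx, int val) {
  int local[4] = {10, 20, 30, 40};
  int old = local[idx];
  local[idx] = cond ? val : old;
  return local[0] + local[1] + local[2] + local[3];
}
```

Because the array's address does not escape, no other code can observe the extra store, which is presumably why the change limits itself to locals without address escape.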
The only preexisting use of GIMPLE_EH_ELSE, for transactional memory
commits, did not allow exceptions to escape from the ELSE path. The
trick it uses to allow the ELSE path to see the propagating exception
does not work very well if the exception cleanup raises further
exceptions: the ELSE block is configured to handle exceptions in
itself. This confuses the heck out of CFG and EH cleanups.
Basing the lowering context for the ELSE block on outer_state, rather
than this_state, gets us the expected enclosing handler.
for gcc/ChangeLog
* tree-eh.c (honor_protect_cleanup_actions): Use outer_
rather than this_state as the lowering context for the ELSE
seq in a GIMPLE_EH_ELSE.