Paul A. Clarke [Fri, 26 Oct 2018 17:23:46 +0000 (17:23 +0000)]
[rs6000] x86 vector intrinsics compatibility: clean-ups for 32bit support
Implement various corrections in the compatibility implementations of the
x86 vector intrinsics found after enabling 32bit mode for the associated
test cases. (Actual enablement coming in a subsequent patch.)
Jakub Jelinek [Fri, 26 Oct 2018 10:26:17 +0000 (12:26 +0200)]
gcc_release (error, inform): Use $@ instead of $1.
* gcc_release (error, inform): Use $@ instead of $1.
(build_sources): Check for ^[[:blank:]]*GCC in both index.html
and changes.html, rather than for GCC in one and ^GCC in another one.
Jan Hubicka [Fri, 26 Oct 2018 07:20:01 +0000 (09:20 +0200)]
ipa-devirt.c (warn_odr): Make static.
* ipa-devirt.c (warn_odr): Make static.
(types_same_for_odr): Drop strict variant.
(types_odr_comparable): Likewise.
(odr_or_derived_type_p): Look for main variants.
(odr_name_hasher::equal): Cleanup comment.
(odr_subtypes_equivalent): Add warn and warned arguments; check main
variants.
(type_variants_equivalent_p): Break out from ...
(odr_types_equivalent): ... here; go for main variants where needed.
(warn_odr): ... here; turn static.
(warn_types_mismatch): Compare mangled names of main variants.
* ipa-utils.h (types_odr_comparable): Drop strict parameter.
(type_with_linkage_p): Sanity check that we look at main variant.
* lto.c (lto_read_decls): Only consider main variant to be ODR type.
* tree.h (types_same_for_odr): Drop strict argument.
Richard Biener [Fri, 26 Oct 2018 07:12:02 +0000 (07:12 +0000)]
re PR tree-optimization/87746 (ICE in vect_update_misalignment_for_peel, at tree-vect-data-refs.c:1035)
2018-10-26 Richard Biener <rguenther@suse.de>
PR tree-optimization/87746
* tree-vect-data-refs.c (vect_update_misalignment_for_peel):
Simplify and fix WRT strided store groups with size not
equal to step in element count.
(vect_analyze_group_access_1): Dump the whole group.
Ian Lance Taylor [Fri, 26 Oct 2018 02:43:35 +0000 (02:43 +0000)]
libgo: don't use wc in gotest
The wc command is not in the GNU approved list of Makefile utilities
(https://www.gnu.org/prep/standards/html_node/Utilities-in-Makefiles.html#Utilities-in-Makefiles).
Ian Lance Taylor [Thu, 25 Oct 2018 22:18:08 +0000 (22:18 +0000)]
compiler: improve name mangling for package paths
The current implementation of Gogo::pkgpath_for_symbol was written in
a way that allowed two distinct package paths to map to the same
symbol, which could cause collisions at link time or compile time.
Switch to a better mangling scheme to ensure that we get a unique
package path symbol for each package. In the new scheme, instead of having
separate mangling schemes for identifiers and package paths, the
main identifier mangler ("go_encode_id") now handles mangling of
both package path characters and identifier characters.
The new mangling scheme is more intrusive: "foo/bar.Baz" is mangled as
"foo..z2fbar.Baz" instead of "foo_bar.Baz". To mitigate this, this
patch also adds a demangling capability so that function names
returned from runtime.CallersFrames are converted back to their
original unmangled form.
Changing the pkgpath_for_symbol scheme requires updating a number of
//go:linkname directives and C "__asm__" directives to match the new
scheme, as well as updating the 'gotest' driver (which makes
assumptions about the correct mapping from pkgpath symbol to package
name).
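The example in the message above ("foo/bar.Baz" becoming "foo..z2fbar.Baz") can be sketched as follows. This is an illustration only: the helper names are hypothetical, and the rule used here (every byte that is not an ASCII letter, digit or underscore becomes "..z" plus two hex digits) is an assumption inferred from that single example, not the gofrontend implementation of go_encode_id.

```cpp
#include <cctype>
#include <cstdio>
#include <string>

// Hypothetical helper (not gofrontend code): keep ASCII letters, digits and
// '_' unchanged, and encode every other byte as "..zXX" with two lowercase
// hex digits. This rule is an assumption based on the "foo..z2fbar" example.
std::string encode_pkgpath_sketch(const std::string& pkgpath) {
    std::string out;
    for (unsigned char c : pkgpath) {
        if (std::isalnum(c) || c == '_') {
            out += static_cast<char>(c);
        } else {
            char buf[8];
            std::snprintf(buf, sizeof buf, "..z%02x", static_cast<unsigned>(c));
            out += buf;
        }
    }
    return out;
}

// Joins the encoded package path and the identifier with '.', matching the
// shape of the "foo..z2fbar.Baz" example.
std::string mangle_sketch(const std::string& pkgpath, const std::string& name) {
    return encode_pkgpath_sketch(pkgpath) + "." + name;
}
```

Under this sketch the package paths "foo/bar" and "foo_bar" produce different symbols, which is the collision the old "foo_bar.Baz" form could not avoid.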
Bill Schmidt [Thu, 25 Oct 2018 20:16:39 +0000 (20:16 +0000)]
emmintrin.h (_mm_slli_epi16): Replace deprecated function with vec_sl.
2018-10-25 Bill Schmidt <wschmidt@linux.ibm.com>
Jinsong Ji <jji@us.ibm.com>
* config/rs6000/emmintrin.h (_mm_slli_epi16): Replace deprecated
function with vec_sl.
(_mm_slli_epi32): Likewise.
(_mm_slli_epi64): Likewise.
(_mm_srai_epi16): Replace deprecated function with vec_sra.
(_mm_srai_epi32): Likewise.
(_mm_srli_epi16): Replace deprecated function with vec_sr.
(_mm_srli_epi32): Likewise.
(_mm_srli_epi64): Likewise.
(_mm_sll_epi16): Replace deprecated function with vec_sl.
(_mm_sll_epi32): Likewise.
(_mm_sll_epi64): Likewise.
(_mm_sra_epi16): Replace deprecated function with vec_sra.
(_mm_sra_epi32): Likewise.
(_mm_srl_epi16): Replace deprecated function with vec_sr.
(_mm_srl_epi32): Likewise.
(_mm_srl_epi64): Likewise.
Co-Authored-By: Jinsong Ji <jji@us.ibm.com>
From-SVN: r265508
Bill Schmidt [Thu, 25 Oct 2018 20:14:40 +0000 (20:14 +0000)]
emmintrin.h (_mm_sll_epi16): Replace comparison operators with vec_cmp* for compatibility due to unfortunate...
2018-10-25 Bill Schmidt <wschmidt@linux.ibm.com>
Jinsong Ji <jji@us.ibm.com>
* gcc/config/rs6000/emmintrin.h (_mm_sll_epi16): Replace
comparison operators with vec_cmp* for compatibility due to
unfortunate history; clean up formatting and use types more
appropriately.
(_mm_sll_epi32): Likewise.
(_mm_sll_epi64): Likewise.
(_mm_srl_epi16): Likewise.
(_mm_srl_epi32): Likewise.
(_mm_srl_epi64): Likewise.
Co-Authored-By: Jinsong Ji <jji@us.ibm.com>
From-SVN: r265507
Bill Schmidt [Thu, 25 Oct 2018 20:09:24 +0000 (20:09 +0000)]
emmintrin.h (_mm_sll_epi64): Remove wrong cast.
2018-10-25 Bill Schmidt <wschmidt@linux.ibm.com>
Jinsong Ji <jji@us.ibm.com>
* config/rs6000/emmintrin.h (_mm_sll_epi64): Remove wrong cast.
* config/rs6000/xmmintrin.h (_mm_min_ps): Change m's type to
__vector __bool int. Use vec_cmpgt in preference to deprecated
function vec_vcmpgtfp.
(_mm_max_ps): Likewise.
Co-Authored-By: Jinsong Ji <jji@us.ibm.com>
From-SVN: r265506
Jonathan Wakely [Thu, 25 Oct 2018 15:34:04 +0000 (16:34 +0100)]
PR libstdc++/87749 fix (and optimize) string move construction
The move assignment operator for the SSO string uses assign(const basic_string&)
when either:
(1) the source string is "local" and so the contents of the small string
buffer need to be copied, or
(2) the allocator does not propagate and is_always_equal is false.
Case (1) is suboptimal, because the assign member is not noexcept and
the compiler isn't smart enough to see it won't actually throw in this
case. This causes extra code in the move assignment operator so that any
exception will be turned into a call to std::terminate. This can be
fixed by copying small strings inline instead of calling assign.
Case (2) is a bug, because the specific instances of the allocators
could be equal even if is_always_equal is false. This can result in an
unnecessary deep copy (and potentially-throwing allocation) when the
storage should be moved. This can be fixed by simply checking if the
allocators are equal.
PR libstdc++/87749
* include/bits/basic_string.h [_GLIBCXX_USE_CXX11_ABI]
(basic_string::operator=(basic_string&&)): For short strings copy the
buffer inline. Only fall back to using assign(const basic_string&) to
do a deep copy when reallocation is needed.
* testsuite/21_strings/basic_string/modifiers/assign/char/87749.cc:
New test.
* testsuite/21_strings/basic_string/modifiers/assign/char/
move_assign_optim.cc: New test.
* testsuite/21_strings/basic_string/modifiers/assign/wchar_t/87749.cc:
New test.
* testsuite/21_strings/basic_string/modifiers/assign/wchar_t/
move_assign_optim.cc: New test.
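The decision described in the message above can be summarised with a small sketch. The enum, the function and its parameter names are hypothetical and only model the three outcomes (inline copy, buffer steal, deep copy); this is not the libstdc++ code.

```cpp
// Hypothetical model of the three outcomes discussed above.
enum class MoveStrategy { CopyInline, StealBuffer, DeepCopy };

// Illustrative decision helper; parameter names are assumptions, not
// identifiers from basic_string.h.
MoveStrategy choose_move_strategy(bool source_is_local,
                                  bool allocator_propagates,
                                  bool allocator_always_equal,
                                  bool allocators_compare_equal) {
    // Case (1): a small string lives in the local buffer, so its characters
    // are copied inline without going through the potentially-throwing assign.
    if (source_is_local)
        return MoveStrategy::CopyInline;
    // Case (2), fixed: the heap buffer can be taken over whenever the two
    // allocator instances are interchangeable, even if is_always_equal is false.
    if (allocator_propagates || allocator_always_equal || allocators_compare_equal)
        return MoveStrategy::StealBuffer;
    // Only genuinely unequal, non-propagating allocators need a deep copy.
    return MoveStrategy::DeepCopy;
}
```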
Jan Hubicka [Thu, 25 Oct 2018 14:33:27 +0000 (16:33 +0200)]
ipa-devirt.c (main_odr_variant): Remove.
* ipa-devirt.c (main_odr_variant): Remove.
(hash_odr_name, types_same_for_odr, types_odr_comparable,
odr_name_hasher::equal, odr_subtypes_equivalent_p):
Drop use of main_odr_variant.
(add_type_duplicate): Silence confused warnings on integer types.
(get_odr_type): Always look for main variant.
(register_odr_type): Simplify.
Matching it to movdi_larl improves the code, eliminating one
instruction and the literal pool entry:
larl %r1,.LANCHOR0+8
stgrl %r1,.LANCHOR0
br %r14
Taking it one step further, there is no reason to keep movdi_64 and
movdi_larl separate, since this could potentially improve code in other
ways by giving lra one more alternative to choose from.
gcc/ChangeLog:
2018-10-25 Ilya Leoshkevich <iii@linux.ibm.com>
* config/s390/constraints.md (ZL): New constraint.
* config/s390/s390.c (legitimate_pic_operand_p): Accept LARL
operands.
* config/s390/s390.md (movdi_larl): Remove.
(movdi_64): Add the LARL alternative.
gcc/testsuite/ChangeLog:
2018-10-25 Ilya Leoshkevich <iii@linux.ibm.com>
* gcc.target/s390/global-array-almost-huge-element.c: New test.
* gcc.target/s390/global-array-almost-negative-huge-element.c: New test.
* gcc.target/s390/global-array-element-pic.c: New test.
* gcc.target/s390/global-array-even-element.c: New test.
* gcc.target/s390/global-array-huge-element.c: New test.
* gcc.target/s390/global-array-negative-huge-element.c: New test.
* gcc.target/s390/global-array-odd-element.c: New test.
Ilya Leoshkevich [Thu, 25 Oct 2018 13:47:10 +0000 (13:47 +0000)]
Fix rtx_code_size static initialization order fiasco
r264556 and r264537 changed the format of EQ_ATTR_ALT RTXs to "ww",
which also required adjusting rtx_code_size initializer. In order to
simplify things, the list of rtx_codes known to use HOST_WIDE_INTs was
replaced by the format string check. However, unlike the old one, this
new check cannot always be performed at compile time, in which case a
static constructor is generated. This may lead to a static
initialization order fiasco with respect to other static constructors
in the compiler; in the case of PR87747, cselib's pool_allocator.
gcc/ChangeLog:
2018-10-25 Ilya Leoshkevich <iii@linux.ibm.com>
PR bootstrap/87747
* rtl.c (RTX_CODE_HWINT_P_1): New helper macro.
(RTX_CODE_HWINT_P): New macro.
(rtx_code_size): Use RTX_CODE_HWINT_P ().
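As a rough illustration of the approach, a table whose initializers are macros over enumerators only is constant-initialized and needs no static constructor. All names below are toy stand-ins, not the ones in rtl.c, and the use of constexpr is an assumption made here to have the compiler enforce compile-time evaluation.

```cpp
#include <cstddef>

// Toy stand-in for the rtx code enumeration.
enum ToyCode { TOY_INT, TOY_REG, TOY_WIDE, TOY_NUM_CODES };

// Hypothetical predicate written as a macro over enumerators only, so each
// expansion is an integral constant expression.
#define TOY_CODE_HWINT_P(CODE) ((CODE) == TOY_INT || (CODE) == TOY_WIDE)

// Size chosen per code; still a constant expression after expansion.
#define TOY_SIZE(CODE) \
    (TOY_CODE_HWINT_P(CODE) ? sizeof(long long) : sizeof(void*))

// constexpr forces constant initialization: the table is usable before any
// static constructor in any translation unit runs.
constexpr std::size_t toy_code_size[TOY_NUM_CODES] = {
    TOY_SIZE(TOY_INT),
    TOY_SIZE(TOY_REG),
    TOY_SIZE(TOY_WIDE),
};

static_assert(toy_code_size[TOY_INT] == sizeof(long long),
              "table must be known at compile time");
```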
dg-cmp-results: display NA->FAIL & NA->UNRESOLVED by default
Currently, dg-cmp-results will not print anything for a test that was
not run before, even if it is a FAIL or UNRESOLVED now. This means that
when contributing a code change together with a testcase in the same
commit one must run dg-cmp-results twice: once to check for regression
on a full testsuite run and once against the new testcase with -v -v.
This also prevents using dg-cmp-results on sum files generated with
test_summary since these would not contain PASS.
This patch changes dg-cmp-results to print NA->FAIL and NA->UNRESOLVED
changes by default.
2018-10-25 Thomas Preud'homme <thomas.preudhomme@linaro.org>
contrib/
* dg-cmp-results.sh: Print NA->FAIL and NA->UNRESOLVED changes at
default verbosity.
gcc.dg/sibcall-9.c and gcc.dg/sibcall-10.c give execution failures
on ARM when compiled with -fPIC because the PIC access to the volatile
variable v creates an extra spill, which causes the frame sizes of the
two recursive functions to differ. Making the variable static
solves the issue because the variable can be accessed in a PC-relative way,
which avoids the spill while still testing sibling calls as originally
intended.
2018-10-25 Thomas Preud'homme <thomas.preudhomme@linaro.org>
gcc/testsuite/
* gcc.dg/sibcall-9.c: Make v static.
* gcc.dg/sibcall-10.c: Likewise.
Richard Biener [Thu, 25 Oct 2018 08:59:07 +0000 (08:59 +0000)]
re PR tree-optimization/87665 (gcc HEAD (svn: 265340) breaks elements on resize)
2018-10-25 Richard Biener <rguenther@suse.de>
PR tree-optimization/87665
PR tree-optimization/87745
* tree-vectorizer.h (get_earlier_stmt): Remove.
(get_later_stmt): Pick up UID from the original non-pattern stmt.
Jakub Jelinek [Thu, 25 Oct 2018 07:56:55 +0000 (09:56 +0200)]
re PR fortran/87725 (OpenMP 4.5 clause schedule(simd,monotonic:static) not understood)
PR fortran/87725
* openmp.c (gfc_match_omp_clauses): Parse simd, monotonic and
nonmonotonic modifiers regardless of if they have been parsed
already or if the opposite one has. Fix up check whether
comma after modifier should be parsed.
(resolve_omp_clauses): Diagnose schedule modifier restrictions.
* c-c++-common/gomp/schedule-modifiers-1.c (bar): Separate modifier
from kind with a colon rather than comma.
* gfortran.dg/gomp/schedule-modifiers-1.f90: New test.
* gfortran.dg/gomp/schedule-modifiers-2.f90: New test.
combine: Don't do make_more_copies for dest PC (PR87720)
Jumps are written in RTL as moves to PC. But the latter has no mode,
so we shouldn't try to use it. Since the optimization this routine
performs does not really help for jumps at all, let's just skip it.
PR rtl-optimization/87720
* combine.c (make_more_copies): Skip if the dest is pc_rtx.
Alexandre Oliva [Wed, 24 Oct 2018 21:55:39 +0000 (21:55 +0000)]
gOlogy: do not change code in isolate-paths for warnings only
The isolate-paths pass is activated by various -f flags, but also by
-Wnull-dereference. Most of its codegen changes are conditioned on at
least one of the -f flags, but those that detect, warn about and
isolate paths that return the address of local variables are enabled
even if the pass is activated only by -Wnull-dereference.
-W flags should not cause codegen changes, so this patch makes the
codegen changes conditional on the presence of any of the -f flags
that activate the pass. Should we have a separate option to activate
only this kind of transformation?
for gcc/ChangeLog
* gimple-ssa-isolate-paths.c
(find_implicit_erroneous_behavior): Do not change code if the
pass is running for warnings only.
(find_explicit_erroneous_behavior): Likewise.
Michael Meissner [Wed, 24 Oct 2018 20:16:31 +0000 (20:16 +0000)]
rs6000.c (TARGET_MANGLE_DECL_ASSEMBLER_NAME): Define as rs6000_mangle_decl_assembler_name.
[gcc]
2018-10-24 Michael Meissner <meissner@linux.ibm.com>
* config/rs6000/rs6000.c (TARGET_MANGLE_DECL_ASSEMBLER_NAME):
Define as rs6000_mangle_decl_assembler_name.
(rs6000_mangle_decl_assembler_name): If the user switched from IBM
long double to IEEE long double, switch the names of the long
double built-in functions to be <func>f128 instead of <func>l.
[gcc/testsuite]
2018-10-24 Michael Meissner <meissner@linux.ibm.com>
* gcc.target/powerpc/float128-math.c: New test to make sure the
long double built-in function names use the f128 form if the user
switched from IBM long double to IEEE long double.
* gcc.target/powerpc/ppc-fortran/ieee128-math.f90: Likewise.
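The renaming described above (<func>l becoming <func>f128) might look like the following string-level sketch. The function and its flag are hypothetical; the real hook works on declarations rather than bare strings, and the check for "is this a long double built-in" is reduced here to a caller-supplied boolean.

```cpp
#include <string>

// Hypothetical helper: for a long double built-in whose name ends in 'l',
// replace that suffix with "f128". Names that are not long double built-ins
// (for example "ceil", which merely ends in 'l') are returned unchanged.
std::string ieee128_name_sketch(const std::string& name,
                                bool is_long_double_builtin) {
    if (is_long_double_builtin && name.size() > 1 && name.back() == 'l')
        return name.substr(0, name.size() - 1) + "f128";
    return name;
}
```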
Jakub Jelinek [Wed, 24 Oct 2018 19:39:23 +0000 (21:39 +0200)]
re PR c++/86288 (Recognize __gnu and/or __gnu__ as attribute-namespace)
PR c++/86288
* parser.c (cp_parser_std_attribute): Canonicalize attr_ns, and when
:: is not present and attr_ns non-NULL, canonicalize also attr_id.
(cp_parser_attribute_spec): Fix comment typo.
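Canonicalizing an attribute namespace or name amounts to dropping a matching pair of leading and trailing double underscores, so that "__gnu__" is treated like "gnu". A minimal sketch of that string rule, with a hypothetical function name and operating on std::string rather than on the identifier nodes the parser uses:

```cpp
#include <string>

// Sketch only: strip one leading "__" and one trailing "__" when both are
// present and something remains in between; otherwise return the name as is.
std::string canonicalize_attr_name_sketch(const std::string& name) {
    const std::size_t n = name.size();
    if (n > 4 && name.compare(0, 2, "__") == 0
        && name.compare(n - 2, 2, "__") == 0)
        return name.substr(2, n - 4);
    return name;
}
```

With this rule, [[__gnu__::__always_inline__]] and [[gnu::always_inline]] reduce to the same pair of names.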
Martin Liska [Wed, 24 Oct 2018 13:52:21 +0000 (15:52 +0200)]
Switch conversion: support any ax + b transformation (PR tree-optimization/84436).
2018-10-24 Martin Liska <mliska@suse.cz>
PR tree-optimization/84436
* tree-switch-conversion.c (switch_conversion::contains_same_values_p):
Remove.
(switch_conversion::contains_linear_function_p): New.
(switch_conversion::build_one_array): Support linear
transformation on input.
* tree-switch-conversion.h (struct switch_conversion): Add
contains_linear_function_p declaration.
2018-10-24 Martin Liska <mliska@suse.cz>
PR tree-optimization/84436
* gcc.dg/tree-ssa/pr84436-1.c: New test.
* gcc.dg/tree-ssa/pr84436-2.c: New test.
* gcc.dg/tree-ssa/pr84436-3.c: New test.
* gcc.dg/tree-ssa/pr84436-4.c: New test.
* gcc.dg/tree-ssa/pr84436-5.c: New test.
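The idea behind recognising an "ax + b" switch can be sketched as a check that the values returned for consecutive case labels lie on one line. The struct and function below are hypothetical, use plain long arithmetic, and ignore overflow; they are not the code in tree-switch-conversion.c.

```cpp
#include <cstddef>
#include <vector>

// Result of the check: ok is true when values[i] == a * (first_case + i) + b
// holds for every i.
struct LinearFunctionSketch {
    bool ok;
    long a;
    long b;
};

// Hypothetical check: derive a and b from the first two values, then verify
// the remaining ones. Overflow is not handled in this sketch.
LinearFunctionSketch find_linear_function(const std::vector<long>& values,
                                          long first_case) {
    if (values.size() < 2)
        return {false, 0, 0};
    const long a = values[1] - values[0];
    const long b = values[0] - a * first_case;
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (values[i] != a * (first_case + static_cast<long>(i)) + b)
            return {false, 0, 0};
    }
    return {true, a, b};
}
```

For case labels 1..4 mapping to 5, 7, 9, 11 the check yields a = 2 and b = 3, so the whole switch can be replaced by 2 * x + 3 after the range test.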
Ilya Leoshkevich [Wed, 24 Oct 2018 12:04:53 +0000 (12:04 +0000)]
S/390: Fix ICE in s390_check_qrst_address ()
In r265371 (S/390: Make "b" constraint match literal pool references)
the CONSTANT_POOL_ADDRESS_P () check was moved from
s390_loadrelative_operand_p () to s390_check_qrst_address (). However,
in the original code it was guarded by SYMBOL_REF_P (), which was not
added to the new code.
gcc/ChangeLog:
2018-10-24 Ilya Leoshkevich <iii@linux.ibm.com>
* config/s390/s390.c (s390_check_qrst_address): Add the missing
SYMBOL_REF_P () check.
Richard Biener [Wed, 24 Oct 2018 11:46:58 +0000 (11:46 +0000)]
re PR tree-optimization/87105 (Autovectorization [X86, SSE2, AVX2, DoublePrecision])
2018-10-24 Richard Biener <rguenther@suse.de>
PR tree-optimization/87105
* tree-vect-data-refs.c (vect_analyze_group_access_1): Adjust
dump classification.
(vect_analyze_data_ref_accesses): Handle duplicate loads and
stores by splitting the affected group after the fact.
* tree-vect-slp.c (vect_build_slp_tree_2): Dump when we
fail the SLP build because of size constraints.
* gcc.dg/vect/bb-slp-39.c: New testcase.
* gfortran.dg/vect/pr83232.f90: Un-XFAIL.
Richard Biener [Wed, 24 Oct 2018 09:42:19 +0000 (09:42 +0000)]
re PR tree-optimization/84013 (wrong __restrict clique with inline asm operand)
2018-10-24 Richard Biener <rguenther@suse.de>
PR tree-optimization/84013
* tree-ssa-structalias.c (struct msdi_data): New struct for
marshalling data to walk_stmt_load_store_ops.
(maybe_set_dependence_info): Refactor as callback for
walk_stmt_load_store_ops.
(compute_dependence_clique): Set restrict info on all stmt kinds.
Using a delegating constructor to implement these constructors means
that they instantiate the destructor, which requires the element_type to
be complete. In C++11 and C++14 they were specified to be delegating,
but that was changed as part of LWG 2801 so in C++17 they don't require
a complete type (as was intended all along).
PR libstdc++/87704
* include/bits/unique_ptr.h (unique_ptr::unique_ptr(nullptr_t)): Do
not delegate to default constructor.
(unique_ptr<T[], D>::unique_ptr(nullptr_t)): Likewise.
* testsuite/20_util/unique_ptr/cons/incomplete.cc: New test.
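A small sketch of the situation the fix addresses: constructing a null unique_ptr for a type that is only forward-declared. Placement new is used so the destructor is never instantiated in this translation unit; the function names are illustrative, and the assumption is a standard library with the non-delegating constructors (older releases may reject this in C++11 and C++14 modes).

```cpp
#include <memory>
#include <new>

struct Incomplete;  // deliberately never defined here

using IncompletePtr = std::unique_ptr<Incomplete>;

// Constructs a null unique_ptr in caller-provided storage and reports whether
// it is empty. The object is intentionally never destroyed, so ~unique_ptr,
// which would need a complete type, is not instantiated.
bool construct_null_in(void* storage) {
    IncompletePtr* p = ::new (storage) IncompletePtr(nullptr);
    return p->get() == nullptr;
}

// Convenience wrapper providing suitably sized and aligned storage.
bool null_unique_ptr_of_incomplete_type_ok() {
    alignas(IncompletePtr) static unsigned char storage[sizeof(IncompletePtr)];
    return construct_null_in(storage);
}
```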
Richard Biener [Tue, 23 Oct 2018 11:34:56 +0000 (11:34 +0000)]
re PR tree-optimization/87105 (Autovectorization [X86, SSE2, AVX2, DoublePrecision])
2018-10-23 Richard Biener <rguenther@suse.de>
PR tree-optimization/87105
PR tree-optimization/87608
* passes.def (pass_all_early_optimizations): Add early phi-opt
after dce.
* tree-ssa-phiopt.c (value_replacement): Ignore NOPs and predicts in
addition to debug stmts.
(tree_ssa_phiopt_worker): Add early_p argument, do only min/max
and abs replacement early.
* tree-cfg.c (gimple_empty_block_p): Likewise.
* g++.dg/tree-ssa/phiopt-1.C: New testcase.
* g++.dg/vect/slp-pr87105.cc: Likewise.
* g++.dg/tree-ssa/pr21463.C: Scan phiopt2 because this testcase
relies on phiprop run before.
* g++.dg/tree-ssa/pr30738.C: Likewise.
* g++.dg/tree-ssa/pr57380.C: Likewise.
* gcc.dg/tree-ssa/pr84859.c: Likewise.
* gcc.dg/tree-ssa/pr45397.c: Scan phiopt2 because phiopt1 is
confused by copies in the IL left by EVRP.
* gcc.dg/tree-ssa/phi-opt-5.c: Likewise, this time confused
by predictors.
* gcc.dg/tree-ssa/phi-opt-12.c: Scan phiopt2.
* gcc.dg/pr24574.c: Likewise.
* g++.dg/tree-ssa/pr86544.C: Scan phiopt4.
Richard Earnshaw [Tue, 23 Oct 2018 10:19:15 +0000 (10:19 +0000)]
[arm] Update default CPUs during configure
There are a couple of places in config.gcc where the default CPU is
still arm6, but that was removed as a supported CPU earlier this year.
This patch fixes those entries.
The default CPU for configurations that do not explicitly set a
default is now arm7tdmi (so it assumes Thumb is available). Given that
StrongARM is on the deprecated list, this is a better default than we
had previously.
For NetBSD the default is StrongARM; this is the only remaining port
that uses the old ABI and really still carries support for non-Thumb
based targets.
PR target/86383
* config.gcc (arm*-*-netbsdelf*): Default to StrongARM if no CPU
specified to configure.
(arm*-*-*): Use ARM7TDMI as the target CPU if no default provided.
Paul Thomas [Tue, 23 Oct 2018 08:27:14 +0000 (08:27 +0000)]
re PR fortran/85603 (ICE with character array substring assignment)
2018-10-23 Paul Thomas <pault@gcc.gnu.org>
PR fortran/85603
* frontend-passes.c (get_len_call): New function to generate a
call to intrinsic LEN.
(create_var): Use this to make length expressions for variable
rhs string lengths.
Clean up some white space issues.
2018-10-23 Paul Thomas <pault@gcc.gnu.org>
PR fortran/85603
* gfortran.dg/deferred_character_23.f90: Check reallocation is
occurring as it should and a regression caused by version 1 of
this patch.
Ian Lance Taylor [Tue, 23 Oct 2018 02:46:41 +0000 (02:46 +0000)]
compiler: export indexed type data, read unexported types lazily
Introduce a new "types" command to the export data to record the
number of types and the size of their export data. It is immediately
followed by new "type" commands that can be indexed. Parse all the
exported types immediately so that we register them, but parse other
type data only as needed.
On most targets every function starts with moves from the parameter
passing (hard) registers into pseudos. Similarly, after every call
there is a move from the return register into a pseudo. These moves
usually combine with later instructions (leaving pretty much the same
instruction, just with a hard reg instead of a pseudo).
This isn't a good idea. Register allocation can get rid of unnecessary
moves just fine, and moving the parameter passing registers into many
later instructions tends to prevent good register allocation. This
patch disallows combining moves from a hard (non-fixed) register.
This also avoids the problem mentioned in PR87600 #c3 (combining hard
registers into inline assembler is problematic).
Because the register move can often be combined with other instructions
*itself*, for example for setting some condition code, this patch adds
extra copies via new pseudos after every copy-from-hard-reg.
On some targets this reduces average code size. On others it increases
it a bit, 0.1% or 0.2% or so. (I tested this on all *-linux targets).
PR rtl-optimization/87600
* combine.c: Add include of expr.h.
(cant_combine_insn_p): Do not combine moves from any hard non-fixed
register to a pseudo.
(make_more_copies): New function, add a copy to a new pseudo after
the moves from hard registers into pseudos.
(rest_of_handle_combine): Declare rebuild_jump_labels_after_combine
later. Call make_more_copies.
GCC will currently increment "reject" once, for operand 0, and then decrement
it once for each of the other operands, ending with reject == -2 and an
assertion failure. If there's a conflict then it might try to decrement reject
yet again.
Incidentally, what these patterns are trying to achieve is an allocation in
which operand 0 may match one of the other operands, but may not partially
overlap any of them. Ideally there'd be a better way to do this.
In any case, it will affect any pattern in which multiple operands may (or
must) match an early-clobber operand.
The patch only allows a reject-- when one has not already occurred, for that
operand.
2018-10-22 Andrew Stubbs <ams@codesourcery.com>
gcc/
* lra-constraints.c (process_alt_operands): New local array,
matching_early_clobber. Check matching_early_clobber before
decrementing reject, and set matching_early_clobber after.
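The counting problem can be modelled with a toy sketch. It assumes operand 0 is the early-clobber operand and that some number of other operands match it; the functions are hypothetical and are far simpler than process_alt_operands.

```cpp
// Toy model of the behaviour before the patch: one increment for the
// early-clobber operand 0, then one decrement per matching operand.
int reject_before_patch(int matching_operands) {
    int reject = 0;
    reject++;
    for (int i = 0; i < matching_operands; ++i)
        reject--;
    return reject;
}

// Toy model of the behaviour after the patch: a flag (standing in for the
// matching_early_clobber entry of operand 0) allows the decrement only once.
int reject_after_patch(int matching_operands) {
    int reject = 0;
    bool matching_early_clobber = false;
    reject++;
    for (int i = 0; i < matching_operands; ++i) {
        if (!matching_early_clobber) {
            reject--;
            matching_early_clobber = true;
        }
    }
    return reject;
}
```

With three matching operands the first model ends at -2, the value mentioned above, while the second stays at 0.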
rs6000: Handle print_operand_address for unexpected RTL (PR87598)
As the PR shows, the user can force this to be called on at least some
RTL that is not a valid address. Most targets treat this as if the
user knows best; let's do the same.
PR target/87598
* config/rs6000/rs6000.c (print_operand_address): For unexpected RTL
call output_addr_const and hope for the best.
Richard Biener [Mon, 22 Oct 2018 13:57:47 +0000 (13:57 +0000)]
2018-10-22 Richard Biener <rguenther@suse.de>
* gimple-ssa-evrp-analyze.c
(evrp_range_analyzer::record_ranges_from_incoming_edge): Be
smarter about what ranges to use.
* tree-vrp.c (add_assert_info): Dump here.
(register_edge_assert_for_2): Instead of here at multiple but
not all places.
Steven Bosscher [Mon, 22 Oct 2018 13:54:23 +0000 (13:54 +0000)]
re PR middle-end/63155 (memory hog)
2018-10-22 Steven Bosscher <steven@gcc.gnu.org>
Richard Biener <rguenther@suse.de>
* bitmap.h: Update data structure documentation, including a
description of bitmap views as either linked-lists or splay trees.
(struct bitmap_element_def): Update comments for splay tree bitmaps.
(struct bitmap_head_def): Likewise.
(bitmap_list_view, bitmap_tree_view): New prototypes.
(bitmap_initialize_stat): Initialize a bitmap_head's indx and
tree_form fields.
(bmp_iter_set_init): Assert the iterated bitmaps are in list form.
(bmp_iter_and_init, bmp_iter_and_compl_init): Likewise.
* bitmap.c (bitmap_elem_to_freelist): Unregister overhead of a
released bitmap element here.
(bitmap_element_free): Remove.
(bitmap_elt_clear_from): Work on splay tree bitmaps.
(bitmap_list_link_element): Renamed from bitmap_element_link. Move
this function and similar ones such that linked-list bitmap implementation
functions are grouped.
(bitmap_list_unlink_element): Renamed from bitmap_element_unlink,
and moved for grouping.
(bitmap_list_insert_element_after): Renamed from
bitmap_elt_insert_after, and moved for grouping.
(bitmap_list_find_element): New function spliced from bitmap_find_bit.
(bitmap_tree_link_left, bitmap_tree_link_right,
bitmap_tree_rotate_left, bitmap_tree_rotate_right, bitmap_tree_splay,
bitmap_tree_link_element, bitmap_tree_unlink_element,
bitmap_tree_find_element): New functions for splay-tree bitmap
implementation.
(bitmap_element_link, bitmap_element_unlink, bitmap_elt_insert_after):
Renamed and moved, see above entries.
(bitmap_tree_listify_from): New function to convert part of a splay
tree bitmap to a linked-list bitmap.
(bitmap_list_view): Convert a splay tree bitmap to linked-list form.
(bitmap_tree_view): Convert a linked-list bitmap to splay tree form.
(bitmap_find_bit): Remove.
(bitmap_clear, bitmap_clear_bit, bitmap_set_bit,
bitmap_single_bit_set_p, bitmap_first_set_bit, bitmap_last_set_bit):
Handle splay tree bitmaps.
(bitmap_copy, bitmap_count_bits, bitmap_and, bitmap_and_into,
bitmap_elt_copy, bitmap_and_compl, bitmap_and_compl_into,
bitmap_compl_and_into, bitmap_elt_ior, bitmap_ior, bitmap_ior_into,
bitmap_xor, bitmap_xor_into, bitmap_equal_p, bitmap_intersect_p,
bitmap_intersect_compl_p, bitmap_ior_and_compl,
bitmap_ior_and_compl_into, bitmap_set_range, bitmap_clear_range,
bitmap_hash): Reject trying to act on splay tree bitmaps. Make
corresponding changes to use linked-list specific bitmap_element
manipulation functions as applicable for efficiency.
(bitmap_tree_to_vec): New function.
(debug_bitmap_elt_file): New function split out from ...
(debug_bitmap_file): ... here. Handle splay tree bitmaps.
(bitmap_print): Likewise.
PR tree-optimization/63155
* tree-ssa-propagate.c (ssa_prop_init): Use tree-view for the
SSA edge worklists.
* tree-ssa-coalesce.c (coalesce_ssa_name): Populate used_in_copies
in tree-view.
+/* Unaligned version of the same type. */
+typedef float __m128_u __attribute__ ((__vector_size__ (16), __may_alias__,
+ __aligned__ (1)));
+
/* Internal data types for implementing the intrinsics. */
typedef float __v4sf __attribute__ ((__vector_size__ (16)));
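What an unaligned variant of a vector type allows can be sketched with the GCC/Clang vector extension. The type and function names below are illustrative and are not the ones declared in xmmintrin.h; the sketch assumes a compiler that supports the vector_size, may_alias and aligned attributes.

```cpp
// Illustrative 16-byte float vector with alignment reduced to 1, so it may be
// loaded from any address, and marked may_alias so it may overlay float data.
typedef float v4sf_u_sketch
    __attribute__((__vector_size__(16), __may_alias__, __aligned__(1)));

// Loads four floats from a pointer that need not be 16-byte aligned and
// returns their sum.
float sum4_unaligned(const float* p) {
    const v4sf_u_sketch v = *reinterpret_cast<const v4sf_u_sketch*>(p);
    return v[0] + v[1] + v[2] + v[3];
}
```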
Eric Botcazou [Mon, 22 Oct 2018 11:03:17 +0000 (11:03 +0000)]
utils.c (unchecked_convert): Use local variables for the biased and reverse SSO attributes of both types.
* gcc-interface/utils.c (unchecked_convert): Use local variables for
the biased and reverse SSO attributes of both types.
Further extend the processing of integral types in the presence of
reverse SSO to all scalar types.
Martin Jambor [Mon, 22 Oct 2018 08:27:50 +0000 (10:27 +0200)]
Add a fun parameter to three stmt_could_throw... functions
This long patch only does one simple thing: it adds an explicit function
parameter to predicates stmt_could_throw_p, stmt_can_throw_external
and stmt_can_throw_internal.
My motivation was the ability to use stmt_can_throw_external in IPA
analysis phase without the need to push cfun. As I have discovered,
we were already doing that in cgraph.c, which this patch avoids as
well. In the process, I had to add a struct function parameter to
stmt_could_throw_p and decided to also change the interface of
stmt_can_throw_internal just for the sake of some minimal consistency.
In the process I have discovered that calling method
cgraph_node::create_version_clone_with_body (used by ipa-split,
ipa-sra, OMP simd and multiple_target) leads to calls of
stmt_can_throw_external with NULL cfun. I have worked around this by
making stmt_can_throw_external and stmt_could_throw_p gracefully
accept NULL and just be pessimistic in that case. The problem with
fixing this in a better way is that struct function for the clone is
created after cloning edges where we attempt to push the not yet
existing cfun, and moving it before would require a bit of surgery in
tree-inline.c. A slightly hackish but simpler fix might be to
explicitly pass the "old" function to symbol_table::create_edge
because it should be just as good at that moment. In any event, that
is a topic for another patch.
I believe that currently we incorrectly use cfun in
maybe_clean_eh_stmt_fn and maybe_duplicate_eh_stmt_fn, both in
tree-eh.c, and so I have fixed these cases too. The bulk of other
changes is just mechanical adding of cfun to all users.
Bootstrapped and tested on x86_64-linux (also with extra NULLing and
restoring cfun to double check it is not used in a place I missed), OK
for trunk?
Thanks,
Martin
2018-10-22 Martin Jambor <mjambor@suse.cz>
* tree-eh.h (stmt_could_throw_p): Add function parameter.
(stmt_can_throw_external): Likewise.
(stmt_can_throw_internal): Likewise.
* tree-eh.c (lower_eh_constructs_2): Pass cfun to stmt_could_throw_p.
(lower_eh_constructs_2): Likewise.
(stmt_could_throw_p): Add fun parameter, use it instead of cfun.
(stmt_can_throw_external): Likewise.
(stmt_can_throw_internal): Likewise.
(maybe_clean_eh_stmt_fn): Pass cfun to stmt_could_throw_p.
(maybe_clean_or_replace_eh_stmt): Pass cfun to stmt_could_throw_p.
(maybe_duplicate_eh_stmt_fn): Pass new_fun to stmt_could_throw_p.
(maybe_duplicate_eh_stmt): Pass cfun to stmt_could_throw_p.
(pass_lower_eh_dispatch::execute): Pass cfun to
stmt_can_throw_external.
(cleanup_empty_eh): Likewise.
(verify_eh_edges): Pass cfun to stmt_could_throw_p.
* cgraph.c (cgraph_edge::set_call_stmt): Pass a function to
stmt_can_throw_external instead of pushing it to cfun.
(symbol_table::create_edge): Likewise.
* gimple-fold.c (fold_builtin_atomic_compare_exchange): Pass cfun to
stmt_can_throw_internal.
* gimple-ssa-evrp.c (evrp_dom_walker::before_dom_children): Pass cfun
to stmt_could_throw_p.
* gimple-ssa-store-merging.c (handled_load): Pass cfun to
stmt_can_throw_internal.
(pass_store_merging::execute): Likewise.
* gimple-ssa-strength-reduction.c
(find_candidates_dom_walker::before_dom_children): Pass cfun to
stmt_could_throw_p.
* gimplify-me.c (gimple_regimplify_operands): Pass cfun to
stmt_can_throw_internal.
* ipa-pure-const.c (check_call): Pass cfun to stmt_could_throw_p and
to stmt_can_throw_external.
(check_stmt): Pass cfun to stmt_could_throw_p.
(check_stmt): Pass cfun to stmt_can_throw_external.
(pass_nothrow::execute): Likewise.
* trans-mem.c (expand_call_tm): Pass cfun to stmt_can_throw_internal.
* tree-cfg.c (is_ctrl_altering_stmt): Pass cfun to
stmt_can_throw_internal.
(verify_gimple_in_cfg): Pass cfun to stmt_could_throw_p.
(stmt_can_terminate_bb_p): Pass cfun to stmt_can_throw_external.
(gimple_purge_dead_eh_edges): Pass cfun to stmt_can_throw_internal.
* tree-complex.c (expand_complex_libcall): Pass cfun to
stmt_could_throw_p and to stmt_can_throw_internal.
(expand_complex_multiplication): Pass cfun to stmt_can_throw_internal.
* tree-inline.c (copy_edges_for_bb): Likewise.
(maybe_move_debug_stmts_to_successors): Likewise.
* tree-outof-ssa.c (ssa_is_replaceable_p): Pass cfun to
stmt_could_throw_p.
* tree-parloops.c (oacc_entry_exit_ok_1): Likewise.
* tree-sra.c (scan_function): Pass cfun to stmt_can_throw_external.
* tree-ssa-alias.c (stmt_kills_ref_p): Pass cfun to
stmt_can_throw_internal.
* tree-ssa-ccp.c (optimize_atomic_bit_test_and): Likewise.
* tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Pass cfun to
stmt_could_throw_p.
(mark_aliased_reaching_defs_necessary_1): Pass cfun to
stmt_can_throw_internal.
* tree-ssa-forwprop.c (pass_forwprop::execute): Likewise.
* tree-ssa-loop-im.c (movement_possibility): Pass cfun to
stmt_could_throw_p.
* tree-ssa-loop-ivopts.c (find_givs_in_stmt_scev): Likewise.
(add_autoinc_candidates): Pass cfun to stmt_can_throw_internal.
* tree-ssa-math-opts.c (pass_cse_reciprocals::execute): Likewise.
(convert_mult_to_fma_1): Likewise.
(convert_to_divmod): Likewise.
* tree-ssa-phiprop.c (propagate_with_phi): Likewise.
* tree-ssa-pre.c (compute_avail): Pass cfun to stmt_could_throw_p.
* tree-ssa-propagate.c
(substitute_and_fold_dom_walker::before_dom_children): Likewise.
* tree-ssa-reassoc.c (suitable_cond_bb): Likewise.
(maybe_optimize_range_tests): Likewise.
(linearize_expr_tree): Likewise.
(reassociate_bb): Likewise.
* tree-ssa-sccvn.c (copy_reference_ops_from_call): Likewise.
* tree-ssa-scopedtables.c (hashable_expr_equal_p): Likewise.
* tree-ssa-strlen.c (adjust_last_stmt): Likewise.
(handle_char_store): Likewise.
* tree-vect-data-refs.c (vect_find_stmt_data_reference): Pass cfun to
stmt_can_throw_internal.
* tree-vect-patterns.c (check_bool_pattern): Pass cfun to
stmt_could_throw_p.
* tree-vect-stmts.c (vect_finish_stmt_generation_1): Likewise.
(vectorizable_call): Pass cfun to stmt_can_throw_internal.
(vectorizable_simd_clone_call): Likewise.
* value-prof.c (gimple_ic): Pass cfun to stmt_could_throw_p.
(gimple_stringop_fixed_value): Likewise.
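The shape of the interface change in the stmt_could_throw... patch above can be sketched with toy types: the predicate takes the function explicitly instead of consulting a global, and a null function yields the pessimistic answer. None of the types or names below are GCC's, and the rule that every call may throw is a simplification made for this sketch.

```cpp
// Toy stand-ins for struct function and a statement.
struct FunctionSketch {
    bool can_throw_non_call_exceptions;
};

struct StmtSketch {
    bool is_call;
    bool may_trap;
};

// The function is passed explicitly. When it is null, nothing is known about
// it, so the statement is judged on its own properties (pessimistic).
bool stmt_could_throw_sketch(const FunctionSketch* fun, const StmtSketch& stmt) {
    if (stmt.is_call)
        return true;  // simplification: treat every call as possibly throwing
    if (fun != nullptr && !fun->can_throw_non_call_exceptions)
        return false;  // known function that forbids non-call exceptions
    return stmt.may_trap;
}
```

A caller in an IPA pass can hand in the function it is analysing without first making it the current function, which is the motivation given in the message.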