Thomas Koenig [Tue, 2 Jan 2018 18:14:04 +0000 (18:14 +0000)]
re PR fortran/45689 ([F03] Missing transformational intrinsic in the trans_func_f2003 list)
2018-01-02 Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/45689
* intrinsic.c (add_function): Add gfc_simplify_maxloc and
gfc_simplify_minloc to maxloc and minloc, respectively.
* intrinsic.h: Add prototypes for gfc_simplify_minloc
and gfc_simplify_maxloc.
* simplify.c (min_max_chose): Adjust prototype. Modify function
to have a return value which indicates if the extremum was found.
(is_constant_array_expr): Fix typo in comment.
(simplify_minmaxloc_to_scalar): New function.
(simplify_minmaxloc_nodim): New function.
(new_array): New function.
(simplify_minmaxloc_to_array): New function.
(gfc_simplify_minmaxloc): New function.
(simplify_minloc): New function.
(simplify_maxloc): New function.
2018-01-02 Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/45689
* gfortran.dg/minloc_4.f90: New test case.
* gfortran.dg/maxloc_4.f90: New test case.
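To illustrate what a compile-time simplification of MINLOC has to compute, here is a minimal C sketch of the rank-one case without DIM or MASK. The function name `minloc_rank1` is hypothetical and is not gfortran code; the 1-based result and the 0 result for a zero-sized array follow the Fortran definition of the intrinsic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not gfortran code: return the 1-based position of
   the first minimum of a rank-one integer array, or 0 if it is empty.  */
size_t
minloc_rank1 (const int *a, size_t n)
{
  assert (a != NULL || n == 0);
  size_t pos = 0;               /* 0 means "no extremum found yet" */
  for (size_t i = 0; i < n; i++)
    if (pos == 0 || a[i] < a[pos - 1])
      pos = i + 1;              /* strict '<' keeps the first occurrence */
  return pos;
}
```

The "found" state is tracked through `pos == 0`, which loosely mirrors the ChangeLog note about a return value indicating whether the extremum was found.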
Jakub Jelinek [Tue, 2 Jan 2018 18:04:19 +0000 (19:04 +0100)]
re PR c++/83556 (ICE in gimplify_expr, at gimplify.c:12004)
PR c++/83556
* tree.c (replace_placeholders_r): Pass NULL as last argument to
cp_walk_tree instead of d->pset. If non-TREE_CONSTANT and
non-PLACEHOLDER_EXPR tree has been seen already, set *walk_subtrees
to false and return.
(replace_placeholders): Pass NULL instead of &pset as last argument
to cp_walk_tree.
Janne Blomqvist [Tue, 2 Jan 2018 13:25:10 +0000 (15:25 +0200)]
PR libgfortran/83649 Chunk large reads and writes
It turns out that Linux never reads or writes more than 2147479552
bytes in a single syscall. For writes this is not a problem as
libgfortran already contains a loop around write() to handle short
writes. But for reads we cannot do this, since then read will hang if
we have a short read when reading from the terminal. Also, there are
reports that macOS fails I/O's larger than 2 GB. Thus, to work around
these issues do large reads/writes in chunks.
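The chunking idea can be sketched in C as follows. This is not the libgfortran code: the names `chunked_write` and `chunked_read` and the `max_chunk` parameter are assumptions made for illustration (the patch uses a MAX_CHUNK define in io/unix.c). Requests that fit in one chunk use a single read() so that a short read from a terminal returns at once; only larger requests loop.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Write all N bytes, passing at most MAX_CHUNK bytes to each write().  */
ssize_t
chunked_write (int fd, const void *buf, size_t n, size_t max_chunk)
{
  assert (max_chunk > 0);
  const char *p = buf;
  size_t left = n;
  while (left > 0)
    {
      size_t want = left < max_chunk ? left : max_chunk;
      ssize_t got = write (fd, p, want);
      if (got == -1)
        {
          if (errno == EINTR)
            continue;           /* interrupted, retry the same chunk */
          return -1;
        }
      p += got;
      left -= (size_t) got;
    }
  return (ssize_t) n;
}

/* Read up to N bytes.  A request of at most MAX_CHUNK bytes is served by
   a single successful read(); larger requests loop until N bytes have
   arrived or end of file is reached.  */
ssize_t
chunked_read (int fd, void *buf, size_t n, size_t max_chunk)
{
  assert (max_chunk > 0);
  char *p = buf;
  size_t left = n;
  for (;;)
    {
      size_t want = left < max_chunk ? left : max_chunk;
      ssize_t got = read (fd, p, want);
      if (got == -1)
        {
          if (errno == EINTR)
            continue;
          return -1;
        }
      p += got;
      left -= (size_t) got;
      if (got == 0 || left == 0 || n <= max_chunk)
        break;
    }
  return (ssize_t) (n - left);
}
```

With a small `max_chunk`, the same code path can be exercised through a pipe without allocating gigabytes.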
The testcase from the PR
    program largewr
      integer(kind=1) :: a(2_8**31+1)
      a = 0
      a(size(a, kind=8)) = 1
      open(10, file="largewr.dat", access="stream", form="unformatted")
      write (10) a
      close(10)
      a(size(a, kind=8)) = 2
      open(10, file="largewr.dat", access="stream", form="unformatted")
      read (10) a
      if (a(size(a, kind=8)) == 1) then
        print *, "All is well"
      else
        print *, "Oh no"
      end if
    end program largewr
fails on trunk but works with the patch.
Regtested on x86_64-pc-linux-gnu, committed to trunk.
libgfortran/ChangeLog:
2018-01-02 Janne Blomqvist <jb@gcc.gnu.org>
PR libgfortran/83649
* io/unix.c (MAX_CHUNK): New define.
(raw_read): For reads larger than MAX_CHUNK, loop.
(raw_write): Write no more than MAX_CHUNK bytes per iteration.
Richard Biener [Tue, 2 Jan 2018 08:45:05 +0000 (08:45 +0000)]
re PR lto/83452 (FAIL: gfortran.dg/save_6.f90 -O0 (test for excess errors))
2018-01-02 Richard Biener <rguenther@suse.de>
PR lto/83452
* simple-object-elf.c (simple_object_elf_copy_lto_debug_section):
Do not use UNDEF locals for removed symbols but instead just
define them in the first prevailing section and with no name.
Use the same gnu_lto_v1 name for all removed globals we promote to
WEAK UNDEFs so hpux can use a stub to provide this symbol. Clear
sh_info and sh_link in removed sections.
Paul Thomas [Mon, 1 Jan 2018 17:36:41 +0000 (17:36 +0000)]
re PR fortran/83076 (ICE in gfc_deallocate_scalar_with_status, at fortran/trans.c:1598)
2018-01-01 Paul Thomas <pault@gcc.gnu.org>
PR fortran/83076
* resolve.c (resolve_fl_derived0): Add caf_token fields for
allocatable and pointer scalars, when -fcoarray selected.
* trans-types.c (gfc_copy_dt_decls_ifequal): Copy the token
field as well as the backend_decl.
(gfc_get_derived_type): Flag GFC_FCOARRAY_LIB for module
derived types that are not vtypes. Components with caf_token
attribute are pvoid types. For a component requiring it, find
the caf_token field and have the component token field point to
its backend_decl.
PR fortran/83319
* trans-types.c (gfc_get_array_descriptor_base): Add the token
field to the descriptor even when codimen not set.
2018-01-01 Paul Thomas <pault@gcc.gnu.org>
PR fortran/83076
* gfortran.dg/coarray_45.f90: New test.
PR fortran/83319
* gfortran.dg/coarray_46.f90: New test.
Jakub Jelinek [Sun, 31 Dec 2017 23:52:41 +0000 (00:52 +0100)]
re PR c/83595 (ICE: in linemap_macro_map_lookup, at libcpp/line-map.c:1008 on invalid code)
PR c/83595
* c-parser.c (c_parser_braced_init, c_parser_initelt,
c_parser_conditional_expression, c_parser_cast_expression,
c_parser_sizeof_expression, c_parser_alignof_expression,
c_parser_postfix_expression, c_parser_omp_declare_reduction,
c_parser_transaction_expression): Use set_error () method instead
of setting value member to error_mark_node.
Jakub Jelinek [Sun, 31 Dec 2017 23:52:01 +0000 (00:52 +0100)]
re PR rtl-optimization/83608 (ICE in convert_move, at expr.c:229 in GIMPLE store merging pass)
PR middle-end/83608
* expr.c (store_expr_with_bounds): Use simplify_gen_subreg instead of
convert_modes if target mode has the right size, but different mode
class.
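As a loose analogy in plain C (not GCC RTL code), the difference between accessing the same bits in another mode and converting a value can be shown with a float and a 32-bit integer. The helper names are hypothetical, and the sketch assumes `float` is IEEE 754 binary32.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same size, different class: keep the bits, change how they are viewed.
   This is the subreg-like operation.  */
uint32_t
reinterpret_bits (float f)
{
  uint32_t u;
  assert (sizeof u == sizeof f);
  memcpy (&u, &f, sizeof u);
  return u;
}

/* Value conversion: the numeric value is kept, the bits change.  */
uint32_t
convert_value (float f)
{
  return (uint32_t) f;
}
```

For 1.0f the first helper yields the encoding 0x3f800000, while the second yields the integer 1.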
Jakub Jelinek [Sun, 31 Dec 2017 23:51:14 +0000 (00:51 +0100)]
re PR tree-optimization/83609 (ICE in read_complex_part at gcc/expr.c:3202)
PR middle-end/83609
* expr.c (expand_assignment): Fix up a typo in simplify_gen_subreg
last argument when extracting from CONCAT. If either from_real or
from_imag is NULL, use expansion through memory. If result is not
a CONCAT and simplify_gen_subreg fails, try to simplify_gen_subreg
the parts directly to inner mode, if even that fails, use expansion
through memory.
* gcc.dg/pr83609.c: New test.
* g++.dg/opt/pr83609.C: New test.
Jakub Jelinek [Sun, 31 Dec 2017 23:50:32 +0000 (00:50 +0100)]
re PR middle-end/83623 (ICE: in convert_move, at expr.c:248 with -march=knl and 16bit vector bswap/rotate)
PR middle-end/83623
* expmed.c (expand_shift_1): For 2-byte rotates by BITS_PER_UNIT,
check for bswap in mode rather than HImode and use that in expand_unop
too.
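The equivalence this entry relies on, that rotating a 2-byte value by BITS_PER_UNIT (8 on the usual targets) is a byte swap, can be checked with a small C sketch. The helper names `rotl16` and `bswap16` are illustrative only and are not taken from expmed.c.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 16-bit value left by COUNT bits.  */
uint16_t
rotl16 (uint16_t x, unsigned count)
{
  assert (count < 16);
  if (count == 0)
    return x;
  return (uint16_t) (((unsigned) x << count) | ((unsigned) x >> (16 - count)));
}

/* Swap the two bytes of a 16-bit value.  */
uint16_t
bswap16 (uint16_t x)
{
  return (uint16_t) (((unsigned) x << 8) | ((unsigned) x >> 8));
}
```

For every 16-bit input, `rotl16 (x, 8)` and `bswap16 (x)` agree, which is why the expander can use a bswap pattern for such rotates.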
Jakub Jelinek [Sun, 31 Dec 2017 23:49:42 +0000 (00:49 +0100)]
* gcc.target/i386/i386.exp
(check_effective_target_avx512vpopcntdqvl): New proc.
* gcc.target/i386/avx512vpopcntdqvl-vpopcntd-1.c: Use
avx512vpopcntdqvl effective target rather than avx512vpopcntdq.
* gcc.target/i386/avx512vpopcntdqvl-vpopcntq-1.c: Likewise.
Tom de Vries [Sat, 30 Dec 2017 17:02:00 +0000 (17:02 +0000)]
Prune removed funcs from offload table
2017-12-30 Tom de Vries <tom@codesourcery.com>
PR libgomp/83046
* omp-expand.c (expand_omp_target): If in_lto_p, mark offload_funcs with
DECL_PRESERVE_P.
* lto-streamer-out.c (prune_offload_funcs): New function. Remove
offload_funcs entries that no longer have a corresponding cgraph_node.
Mark the remaining ones as DECL_PRESERVE_P.
(output_lto): Call prune_offload_funcs.
* testsuite/libgomp.oacc-c-c++-common/pr83046.c: New test.
* testsuite/libgomp.c-c++-common/pr83046.c: New test.
Jerry DeLisle [Fri, 29 Dec 2017 19:25:31 +0000 (19:25 +0000)]
re PR fortran/83560 (list-directed formatting of INTEGER is missing plus on output when output open with SIGN='PLUS')
2017-12-29 Jerry DeLisle <jvdelisle@gcc.gnu.org>
PR libgfortran/83560
* io/write.c (write_integer): Modify to use write_decimal.
For namelist mode, suppress leading blanks and emit them as
trailing blanks. Change parameter from len to kind for better
readability.
(nml_write_obj): Fix comment style.
Paul Thomas [Fri, 29 Dec 2017 14:27:59 +0000 (14:27 +0000)]
re PR fortran/83567 (Parametrized derived types: Segmentation fault when assigning a function return value)
2017-12-28 Paul Thomas <pault@gcc.gnu.org>
PR fortran/83567
* trans-expr.c (gfc_trans_assignment_1): Free parameterized
components of the lhs if dealloc is set.
* trans-decl.c (gfc_trans_deferred_vars): Do not free the
parameterized components of function results on leaving scope.
2017-12-28 Paul Thomas <pault@gcc.gnu.org>
PR fortran/83567
* gfortran.dg/pdt_26.f90: New test.
Michael Meissner [Thu, 28 Dec 2017 21:19:12 +0000 (21:19 +0000)]
builtins.def: (_Float<N> and _Float<N>X BUILT_IN_CEIL): Add _Float<N> and _Float<N>X variants...
[gcc]
2017-12-28 Michael Meissner <meissner@linux.vnet.ibm.com>
* builtins.def: (_Float<N> and _Float<N>X BUILT_IN_CEIL): Add
_Float<N> and _Float<N>X variants for rounding built-in
functions.
(_Float<N> and _Float<N>X BUILT_IN_FLOOR): Likewise.
(_Float<N> and _Float<N>X BUILT_IN_NEARBYINT): Likewise.
(_Float<N> and _Float<N>X BUILT_IN_RINT): Likewise.
(_Float<N> and _Float<N>X BUILT_IN_ROUND): Likewise.
(_Float<N> and _Float<N>X BUILT_IN_TRUNC): Likewise.
* builtins.c (mathfn_built_in_2): Likewise.
* internal-fn.def (CEIL): Likewise.
(FLOOR): Likewise.
(NEARBYINT): Likewise.
(RINT): Likewise.
(ROUND): Likewise.
(TRUNC): Likewise.
* convert.c (convert_to_integer_1): Likewise.
* fold-const.c (tree_call_nonnegative_warnv_p): Likewise.
(integer_valued_real_call_p): Likewise.
* fold-const-call.c (fold_const_call_ss): Likewise.
* gencfn-macros.c (print_case_cfn): Change CFN and operator
printers to take a const char * suffix instead of a bool.
(print_define_operator_list): Likewise.
(fltall_suffixes): New list of suffixes, that include the
traditional suffixes as well as all of the _Float<N> and
_Float<N>X suffixes.
(main): For _Float<N> and _Float<N>X functions, emit both
<name>_FN and <name>_ALL variants. The <macro>_FN variant only
has the _Float<N> and _Float<N>X case names or operators. The
<name>_ALL variant has both the traditional and the
_Float<N>/_Float<N>X case names or operators.
* match.pd (COPYSIGN optimizations): Provide optimizations for
_Float<N> and _Float<N>X types where possible.
(MIN/MAX optimizations): Likewise.
(sqrt optimizations): Likewise.
(rounding optimizations): Likewise.
[gcc/c]
2017-12-28 Michael Meissner <meissner@linux.vnet.ibm.com>
[rs6000] Use gen_int_mode in ieee_128bit_negative_zero
Previously we'd generate a non-canonical zero-extended CONST_INT
instead of a sign-extended one, which tripped the assert for
canonical CONST_INTs after a later patch.
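What "canonical" means here can be sketched in plain C: a constant for a mode of a given width is stored sign-extended from that width. The function below is a hypothetical stand-in for that normalization and is not the GCC implementation; the width in bits is passed directly instead of a machine mode.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: truncate VALUE to WIDTH bits and sign-extend the
   result, which is the canonical form for a constant of that width.  */
int64_t
sign_extend_to_width (uint64_t value, unsigned width)
{
  assert (width >= 1 && width <= 64);
  if (width == 64)
    return (int64_t) value;     /* assumes two's complement conversion */
  uint64_t mask = (UINT64_C (1) << width) - 1;
  uint64_t sign = UINT64_C (1) << (width - 1);
  value &= mask;
  /* Flip the sign bit, then subtract it again in signed arithmetic.  */
  return (int64_t) (value ^ sign) - (int64_t) sign;
}
```

A zero-extended 0x80000000 for a 32-bit mode is therefore non-canonical; the canonical form is the negative value -2147483648.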
2017-12-28 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* config/powerpcspe/powerpcspe.md (ieee_128bit_negative_zero): Use
gen_int_mode rather than GEN_INT.
* config/rs6000/rs6000.md (ieee_128bit_negative_zero): Likewise.
Use valid_for_const_vector_p instead of CONSTANT_P
This patch makes the VEC_SERIES code use valid_for_const_vector_p
instead of CONSTANT_P, to match what we already do for VEC_DUPLICATE.
This showed up as a failure in gcc.c-torture/execute/pr28982b.c for -m32
on x86_64-linux-gnu after later patches.
2017-12-28 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* emit-rtl.c (gen_const_vec_series): Use valid_for_const_vector_p
instead of CONSTANT_P.
(gen_vec_series): Likewise.
* simplify-rtx.c (simplify_binary_operation_1): Likewise.
Janne Blomqvist [Thu, 28 Dec 2017 18:49:12 +0000 (20:49 +0200)]
PR fortran/83344 Don't set bogus constant value
This patch does not fix PR 83344, but merely fixes an error where we
used to set a constant character length value from a non-constant
expression, and thus set it to some bogus value.
As a result of this, I have commented out part of the associate_22.f90
test which otherwise generates a warning message.
Regtested on x86_64-pc-linux-gnu.
gcc/fortran/ChangeLog:
2017-12-28 Janne Blomqvist <jb@gcc.gnu.org>
PR fortran/83344
* resolve.c (resolve_assoc_var): Don't set the constant value
unless the target is a constant expression.
gcc/testsuite/ChangeLog:
2017-12-28 Janne Blomqvist <jb@gcc.gnu.org>
PR fortran/83344
* gfortran.dg/associate_22.f90: Comment out part of test.
Paul Thomas [Thu, 28 Dec 2017 13:22:36 +0000 (13:22 +0000)]
re PR fortran/83567 (Parametrized derived types: Segmentation fault when assigning a function return value)
2017-12-28 Paul Thomas <pault@gcc.gnu.org>
PR fortran/83567
* trans-expr.c (gfc_trans_assignment_1): Free parameterized
components of the lhs if dealloc is set.
* trans-decl.c (gfc_trans_deferred_vars): Do not free the
parameterized components of function results on leaving scope.
2017-12-28 Paul Thomas <pault@gcc.gnu.org>
PR fortran/83567
* gfortran.dg/pdt_26.f90: New test.
Jonathan Wakely [Wed, 27 Dec 2017 22:18:08 +0000 (22:18 +0000)]
PR libstdc++/83600 fix end iterator for unready std::match_results
PR libstdc++/83600
* include/bits/regex.h (match_results::end()): Return valid iterator
when not ready.
* testsuite/28_regex/match_results/ctors/char/default.cc: Check that
unready objects are empty and have equal begin and end iterators.
* testsuite/28_regex/match_results/ctors/wchar_t/default.cc: Likewise.
Martin Liska [Wed, 27 Dec 2017 09:30:14 +0000 (10:30 +0100)]
Assign result of get_string_length to a SSA_NAME (PR tree-optimization/83552).
2017-12-27 Martin Liska <mliska@suse.cz>
PR tree-optimization/83552
* tree-ssa-strlen.c (fold_strstr_to_strncmp): Assign result
of get_string_length to a SSA_NAME if not a GIMPLE value.
2017-12-27 Martin Liska <mliska@suse.cz>
PR tree-optimization/83552
* gcc.dg/pr83552.c: New test.
Jakub Jelinek [Sat, 23 Dec 2017 08:40:19 +0000 (09:40 +0100)]
re PR c++/83553 (compiler removes body of the for-loop, although there is a case label inside)
PR c++/83553
* fold-const.c (struct contains_label_data): New type.
(contains_label_1): Return non-NULL even for CASE_LABEL_EXPR, unless
inside of a SWITCH_BODY seen during the walk.
(contains_label_p): Use walk_tree instead of
walk_tree_without_duplicates, prepare data for contains_label_1 and
provide own pset.
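The kind of code affected can be illustrated with a short C sketch. It is not the testcase from the PR; the function names are made up. The loop condition is false from the start, yet the body is reachable because the enclosing switch jumps to a case label inside it, so the body must not be discarded.

```c
#include <assert.h>

/* Return 1 if the loop body was executed, 0 otherwise.  */
int
body_reached (int selector)
{
  int reached = 0;
  int i = 0;
  switch (selector)
    {
      for (; i < 0; i++)        /* condition is false on entry */
        {
    case 1:                     /* but the switch can jump straight here */
          reached = 1;
        }
    }
  return reached;
}

/* Usage: selector 1 enters the body through the label; selector 0
   matches no label and skips the whole switch body.  */
void
check_body_reached (void)
{
  assert (body_reached (1) == 1);
  assert (body_reached (0) == 0);
}
```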
Jakub Jelinek [Fri, 22 Dec 2017 18:04:18 +0000 (19:04 +0100)]
re PR debug/83550 (Bad location of DW_TAG_structure_type with forward declaration since r224161)
PR debug/83550
* c-decl.c (finish_struct): Set DECL_SOURCE_LOCATION on
TYPE_STUB_DECL and call rest_of_type_compilation before processing
incomplete vars rather than after it.
Jakub Jelinek [Fri, 22 Dec 2017 18:01:58 +0000 (19:01 +0100)]
re PR debug/83547 ((statement-frontiers) error: void value not ignored as it ought to be)
PR debug/83547
* tree-iterator.c (alloc_stmt_list): Start with cleared
TREE_SIDE_EFFECTS regardless whether a new STATEMENT_LIST is allocated
or old one reused.
c/
* c-typeck.c (c_finish_stmt_expr): Ignore !TREE_SIDE_EFFECTS as
indicator of ({ }), instead skip all trailing DEBUG_BEGIN_STMTs first,
and consider empty ones if there are no other stmts. For
-Wunused-value walk all statements before the one only followed by
DEBUG_BEGIN_STMTs.
testsuite/
* gcc.c-torture/compile/pr83547.c: New test.
Ian Lance Taylor [Fri, 22 Dec 2017 16:43:28 +0000 (16:43 +0000)]
compiler: do not propagate address-taken of a slice element to the slice
Array_index_expression may be used for indexing or slicing an array
or a slice. If a slice element is address-taken, the slice itself is
not necessarily address-taken. Only propagate address-taken for
arrays.
Ian Lance Taylor [Fri, 22 Dec 2017 15:55:10 +0000 (15:55 +0000)]
compiler: bring escape analysis mostly in line with gc compiler
This CL ports the latest (~Go 1.10) escape analysis code from
the gc compiler. Changes include:
- In the gc compiler, the variable expression is represented
with the variable node itself (ONAME). It is the same node
used in the AST for multiple var expressions for the same
variable. In our case, the var expressions nodes are distinct
nodes. We need to propagate the escape state from/to the
underlying variable in getter and setter. We already do it in
the setter. Do it in the getter as well.
- At the point of escape analysis, some AST constructs have not
been lowered to runtime calls, for example, map literal
construction and some builtin calls. Change the analysis to
work on the non-lowered AST constructs instead of call
expressions for them. For this to work, the analysis needs to
look into Builtin_call_expression. Move its class definition
from expressions.cc to expressions.h, and add necessary
accessors. Also fix bugs in the handling of other runtime calls
(selectsend, ifaceX2Y2, etc.).
- Handle closures properly. The analysis tracks the function
reference expression, and the escape state is propagated to
the underlying heap expression for get_backend to do stack
allocation for non-escaping closures.
- Fix add_dereference. Before, this was doing expr->deref(),
which undoes an indirection instead of adding one. In the gc
compiler, it adds a level of indirection, which is modeled as
an OIND node regardless of the type of the expression. We
can't do this for non-pointer typed expression, otherwise it
will result in a type error. Instead, we model it with a
special flavor of Node, "indirect". The flood phase handles
this by incrementing its level.
- Slicing of an array was not handled correctly. The gc compiler
has an implicit (compiler inserted) OADDR node for the array,
so the analysis is actually performed on the address of the
array. We don't have this implicit address-of expression in
the AST. Instead, we model this by adding an implicit child to
the Node of the Array_index_expression representing slicing of
an array.
- Array_index_expression may represent indexing or slicing. The
code distinguishes them by looking at whether the type of the
expression is a slice. This does not work if the slice element
is a slice. Instead, check whether its end() is NULL.
- Temporary references were handled only in a limited case, as
part of address-of expression. This CL handles it in general.
The analysis uses the Temporary_statement as the point of
tracking, and forwards Temporary_reference_expression to the
underlying statement when needed.
- Handle call return value flows, especially multiple return
values. This includes porting part of CL 8202, CL 20102, and
other fixes.
- Support go:noescape pragma.
- Add special handling for self assignment like
b.buf = b.buf[m:n]. (CL 3162)
- Remove ESCAPE_SCOPE, which was treated essentially the same as
ESCAPE_HEAP, and was removed from the gc compiler. (CL 32130)
- Run flood phase until fix point. (CL 30693)
- Unnamed parameters do not escape. (CL 38600)
- Various small bug fixes and improvements.
"make check-go" passes except the one test in math/big, when the
escape analysis is on. The escape analysis is still not run by
default.
Igor Tsimbalist [Fri, 22 Dec 2017 11:41:02 +0000 (12:41 +0100)]
This is a follow up patch for pr83488 to fix an error in setting...
This is a follow up patch for pr83488 to fix an error in setting
OPTION_MASK_ISA_AVX512VNNI_SET and OPTION_MASK_ISA_AVX512F_SET bits.
They were both set in ix86_isa_flags2 while being defined in
different ISA sets. Additionally move OPTION_MASK_ISA_AVX512VNNI_SET
to ix86_isa_flags as it can be used with OPTION_MASK_ISA_AVX512VL_SET.
gcc/
* common/config/i386/i386-common.c (OPTION_MASK_ISA_AVX512VNNI_SET):
Or in OPTION_MASK_ISA_AVX512F_SET.
(OPTION_MASK_ISA_AVX512F_UNSET): Or in
OPTION_MASK_ISA_AVX512VNNI_UNSET.
(ix86_handle_option): Adjust for
OPTION_MASK_ISA_AVX512VNNI_*SET being in ix86_isa_flags.
* config/i386/i386-builtin.def: Move VNNI builtins from ARGS2
section to ARGS.
* config/i386/i386-c.c: Check for OPTION_MASK_ISA_AVX512VNNI in
isa_flag instead of isa_flag2.
* config/i386/i386.c (ix86_target_string): Move -mavx512vnni from
isa_opts2 to isa_opts.
* config/i386/i386.opt (mavx512vnni): Move from ix86_isa_flags2
to ix86_isa_flags.
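The SET/UNSET mask composition described in the i386-common.c entries above can be sketched in C. The bit positions below are hypothetical (the real masks are generated from i386.opt), and a single `uint64_t` stands in for ix86_isa_flags. Composite masks like these only work when every bit they name lives in the same flags word, which is the reason for moving the VNNI bit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit assignments within one flags word.  */
#define ISA_AVX512F      (UINT64_C (1) << 0)
#define ISA_AVX512VNNI   (UINT64_C (1) << 1)

/* Enabling an extension also enables what it depends on.  */
#define ISA_AVX512F_SET       ISA_AVX512F
#define ISA_AVX512VNNI_SET    (ISA_AVX512VNNI | ISA_AVX512F_SET)

/* Disabling a base feature also disables what depends on it.  */
#define ISA_AVX512VNNI_UNSET  ISA_AVX512VNNI
#define ISA_AVX512F_UNSET     (ISA_AVX512F | ISA_AVX512VNNI_UNSET)

uint64_t
enable_isa (uint64_t flags, uint64_t set_mask)
{
  assert (set_mask != 0);
  return flags | set_mask;
}

uint64_t
disable_isa (uint64_t flags, uint64_t unset_mask)
{
  assert (unset_mask != 0);
  return flags & ~unset_mask;
}
```

Enabling VNNI sets both bits, and disabling AVX512F afterwards clears both again.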
Ian Lance Taylor [Fri, 22 Dec 2017 03:27:00 +0000 (03:27 +0000)]
compiler: improve escape analysis diagnostics
This CL brings escape analysis diagnostics closer to the gc
compiler's. This makes porting and debugging escape analysis
code easier. A few changes:
- In the gc compiler, the variable expression is represented
with the variable node itself (ONAME), the location of which
is the location of definition. We add a definition_location
method to Node, and make use of it when the gc compiler emits
diagnostics at the definition locations.
- In the gc compiler, methods are named T.M or (*T).M. Add the
type to the method name when possible.
- Print "moved to heap" messages only for variables.
- Reduce some duplicated diagnostics.
- Print "does not escape" messages in more of the situations in
which the gc compiler does.
- Remove the special handling for closure numbers. In gofrontend,
closures are named "$nested#" where # is a global counter
starting from 0, whereas in the gc compiler they are named
"outer.func#" where # is a per-function counter starting from
1. We tried to adjust the closure name to better match the
ones in the gc compiler, however, it cannot match exactly
because of the difference of the counter. Instead, just print
"outer.$nested#".
Alexandre Oliva [Fri, 22 Dec 2017 02:07:31 +0000 (02:07 +0000)]
[SFN] sync up debug-only stmt list's side effects with empty stmts too
for gcc/c-family/ChangeLog
PR debug/83527
PR debug/83419
* c-semantics.c (only_debug_stmts_after_p): New.
(pop_stmt_list): Clear side effects in debug-only stmt list.
Check for single nondebug stmt followed by debug stmts only.
Jakub Jelinek [Thu, 21 Dec 2017 23:10:45 +0000 (00:10 +0100)]
re PR middle-end/83487 (ICE in expand_call, at calls.c:4098)
PR middle-end/83487
* config/i386/i386.c (ix86_function_arg_boundary): Return
PARM_BOUNDARY for TYPE_EMPTY_P types.
* gcc.c-torture/compile/pr83487.c: New test.
* gcc.dg/compat/pr83487-1.h: New file.
* gcc.dg/compat/pr83487-1_main.c: New test.
* gcc.dg/compat/pr83487-1_x.c: New file.
* gcc.dg/compat/pr83487-1_y.c: New file.
* gcc.dg/compat/pr83487-2_main.c: New test.
* gcc.dg/compat/pr83487-2_x.c: New file.
* gcc.dg/compat/pr83487-2_y.c: New file.
* g++.dg/abi/pr83487.C: New test.
* g++.dg/compat/abi/pr83487-1_main.C: New test.
* g++.dg/compat/abi/pr83487-1_x.C: New file.
* g++.dg/compat/abi/pr83487-1_y.C: New file.
* g++.dg/compat/abi/pr83487-2_main.C: New test.
* g++.dg/compat/abi/pr83487-2_x.C: New file.
* g++.dg/compat/abi/pr83487-2_y.C: New file.
Jakub Jelinek [Thu, 21 Dec 2017 19:28:10 +0000 (20:28 +0100)]
re PR rtl-optimization/80747 (gcc.dg/tree-ssa/tailrecursion-4.c fails with ICE when compiled with options "-fprofile-use -freorder-blocks-and-partition")
PR rtl-optimization/80747
PR rtl-optimization/83512
* cfgrtl.c (force_nonfallthru_and_redirect): When splitting
succ edge from ENTRY, copy partition from e->dest to the newly
created bb.
* bb-reorder.c (reorder_basic_blocks_simple): If last_tail is
ENTRY, use BB_PARTITION of its successor block as current_partition.
Don't copy partition when splitting succ edge from ENTRY.
* gcc.dg/pr80747.c: New test.
* gcc.dg/pr83512.c: New test.
Jakub Jelinek [Thu, 21 Dec 2017 19:26:34 +0000 (20:26 +0100)]
re PR tree-optimization/83521 (ICE: verify_gimple failed (error: invalid operand in unary operation))
PR tree-optimization/83521
* tree-ssa-phiopt.c (factor_out_conditional_conversion): Use
gimple_build_assign without code on result of
fold_build1 (VIEW_CONVERT_EXPR, ...), as it might not create
a VIEW_CONVERT_EXPR.
Alexandre Oliva [Thu, 21 Dec 2017 18:14:21 +0000 (18:14 +0000)]
[-fcompare-debug] retain insn locations when turning dbr seq into return
A number of -fcompare-debug errors on sparc arise as we split a dbr
SEQUENCE back into separate insns to turn the branch into a return.
If we just take the location from the PREV_INSN, it might be a debug
insn without INSN_LOCATION, or an insn with an unrelated location.
But that's silly: each of the SEQUENCEd insns is still an insn with
its own INSN_LOCATION, so use that instead, even though some may have
been adjusted while constructing the SEQUENCE.
for gcc/ChangeLog
* reorg.c (make_return_insns): Reemit each insn with its own
location.
Alexandre Oliva [Thu, 21 Dec 2017 18:14:06 +0000 (18:14 +0000)]
[SFN] propagate single-nondebug-stmt's side effects to enclosing list
Statements without side effects, preceded by debug begin stmt markers,
would become a statement list with side effects, although the stmt on
its own would be extracted from the list and remain not having side
effects. This causes debug info and possibly codegen differences.
This patch fixes it, identifying the situation in which the stmt would
have been extracted from the stmt list, and propagating the side
effects flag from the stmt to the list.
for gcc/ChangeLog
PR debug/83419
* c-family/c-semantics.c (pop_stmt_list): Propagate side
effects from single nondebug stmt to container list.
James Greenhalgh [Thu, 21 Dec 2017 16:39:43 +0000 (16:39 +0000)]
[patch AArch64] Do not perform a vector splat for vector initialisation if it is not useful
Our current vector initialisation code will first duplicate
the first element to both lanes, then overwrite the top lane with a new
value.
This duplication can be clunky and wasteful.
Better would be to simply use the fact that we will always be overwriting
the remaining bits, and simply move the first element to the correct place
(implicitly zeroing all other bits).
We also need a new pattern in simplify-rtx.c:simplify_ternary_operation,
to ensure we can still simplify:
(vec_concat:OUTER x:INNER y:INNER) or (vec_concat y x)
---
gcc/
* config/aarch64/aarch64.c (aarch64_expand_vector_init): Modify code
generation for cases where splatting a value is not useful.
* simplify-rtx.c (simplify_ternary_operation): Simplify vec_merge
across a vec_duplicate and a paradoxical subreg forming a vector
mode to a vec_concat.
Kyrylo Tkachov [Thu, 21 Dec 2017 15:02:49 +0000 (15:02 +0000)]
[arm] Specify +dotprod support for Cortex-A55 and Cortex-A75 in native system detection
Since support for -mcpu=cortex-a55 and -mcpu=cortex-a75
was added, we have added support for the +dotprod extension,
which these CPUs implement.
The arm-cpus.in entries for these processors already specify
it. However, the table in driver-arm.c was
not adding +dotprod to the -march string that it generates.
This patch fixes that oversight.
In the future I'd like the arm_cpu_table in driver-arm.c
to be auto-generated somehow from the arm-cpus.in data so
that we don't have to keep track of discrepancies explicitly...
Bootstrapped and tested on arm-none-linux-gnueabihf.
* config/arm/driver-arm.c (arm_cpu_table): Specify dotprod
support for Cortex-A55 and Cortex-A75.