This patch enables the new Transactional Memory Extension announced recently
as part of Arm's new architecture technologies.
We introduce a new optional extension "tme" to enable this. The following
instructions are part of the extension:
* tstart <Xt>
* ttest <Xt>
* tcommit
* tcancel #<imm>
We have also added ACLE intrinsics for the instructions.
[Arm][CMSE] Add warn_unused_result attribute to cmse functions
At present it is possible to call the CMSE functions for checking
addresses (such as cmse_check_address_range) and forget to check/use
the return value. This patch makes the interfaces more robust against
programmer error by marking these functions with the warn_unused_result
attribute. With this set, any use of these functions that does not use
the result will produce a warning.
The warning is emitted at the default warning level whenever the result
of one of the cmse functions is not used.
For the following function:
void foo()
{
  int *data;
  cmse_check_address_range((int*)data, 0, 0);
}
The following warning is emitted:
warning: ignoring return value of 'cmse_check_address_range' declared
with attribute 'warn_unused_result' [-Wunused-result]
6 | cmse_check_address_range((int*)data, 0, 0);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
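For comparison, here is a minimal sketch of a caller that does use the
result. It does not use the real CMSE intrinsic (which needs
<arm_cmse.h> and -mcmse); check_range, store_byte and buf are
hypothetical stand-ins invented for illustration, and the range policy
is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical buffer that stands in for an accessible memory region.  */
static unsigned char buf[64];

/* Hypothetical stand-in for an address-checking function; not the CMSE
   API.  Returns P if [P, P + SIZE) lies inside BUF, otherwise NULL.  */
__attribute__ ((warn_unused_result))
static void *
check_range (void *p, size_t size)
{
  uintptr_t start = (uintptr_t) p;
  uintptr_t lo = (uintptr_t) buf;
  uintptr_t hi = lo + sizeof buf;

  if (start < lo || start > hi || size > hi - start)
    return NULL;
  return p;
}

/* The caller tests the returned pointer before using it, so no
   -Wunused-result warning is emitted for this call.  */
static int
store_byte (void *p, unsigned char value)
{
  assert (p != NULL);
  unsigned char *checked = check_range (p, 1);
  if (checked == NULL)
    return -1;
  *checked = value;
  return 0;
}
```

Calling check_range (p, 1) as a bare statement instead would trigger
the same kind of warning as shown above.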
jakub [Wed, 31 Jul 2019 07:49:56 +0000 (07:49 +0000)]
PR middle-end/91301
* gimplify.c (gimplify_omp_for): If for class iterator on
distribute parallel for there is no data sharing clause
on inner_for_stmt, look for private clause on combined
parallel too and if found, move it to inner_for_stmt.
lra_insn_reg and lra_operand_data have both a bitmask of earlyclobber
alternatives and an overall boolean. The danger is that we then test
the overall boolean when really we should be testing for a particular
alternative. This patch gets rid of the boolean and tests the mask
against zero when we really do need to test "any alternative might
be earlyclobber". (I think the only instance of that is the
LRA_UNKNOWN_ALT handling in lra-lives.c:reg_early_clobber_p.)
This is needed (and tested) by an upcoming SVE patch.
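The distinction between "this alternative" and "any alternative" can be
sketched as follows. This is a simplified model, not the LRA data
structures: alternative_mask, UNKNOWN_ALT and early_clobber_in_alt_p are
names assumed for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical simplification of the per-operand data: bit N is set
   when the operand is earlyclobber in alternative N.  */
typedef uint64_t alternative_mask;

/* Illustrative stand-in for "no alternative has been chosen yet".  */
#define UNKNOWN_ALT (-1)

static bool
early_clobber_in_alt_p (alternative_mask early_clobber_alts, int alt)
{
  if (alt == UNKNOWN_ALT)
    /* Only here is "any alternative might be earlyclobber" the right
       question, so compare the whole mask against zero.  */
    return early_clobber_alts != 0;

  assert (alt >= 0 && alt < 64);
  /* Otherwise test the bit for the alternative actually in use.  */
  return (early_clobber_alts >> alt) & 1;
}
```

With a separate overall boolean it is easy to answer the first question
when the second was intended; keeping only the mask forces the caller to
say which one it means.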
2019-07-31 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* lra-int.h (lra_operand_data): Remove early_clobber field.
(lra_insn_reg): Likewise.
* lra.c (debug_operand_data): Update accordingly.
(setup_operand_alternative): Likewise.
(new_insn_reg): Likewise. Remove early_clobber parameter.
(collect_non_operand_hard_regs): Update call accordingly.
Don't assign to lra_insn_reg::early_clobber.
(add_regs_to_insn_regno_info): Remove early_clobber parameter
and update calls to new_insn_reg.
(lra_update_insn_regno_info): Update calls accordingly.
* lra-constraints.c (update_and_check_small_class_inputs): Take the
alternative number as a parameter and test whether the operand
is earlyclobbered in that particular alternative.
(process_alt_operands): Update call accordingly. Use per-alternative
checks for earlyclobber here too.
* lra-lives.c (reg_early_clobber_p): Check early_clobber_alts
against zero for LRA_UNKNOWN_ALT.
PR fortran/91296
* interface.c (compare_actual_expr): When checking for aliasing, add
a case to handle REF_INQUIRY (e.g., foo(x%re, x%im) do not alias).
2019-07-30 Steven G. Kargl <kargl@gcc.gnu.org>
PR fortran/91296
* gfortran.dg/pr91296.f90: New test.
Adjust literal pool offset in Thumb-2 movsi patterns
My previous change to the Thumb-2 movsi patterns caused a codesize regression
with -Os in large functions. Fix this by using the literal pool offset of the
16-bit literal load so that the literal pool is dumped earlier, reducing the
number of 32-bit literal loads.
Bootstrap & regress OK on arm-none-linux-gnueabihf --with-cpu=cortex-a57
Use edge->indirect_unknown_callee in cgraph_edge::make_direct (PR ipa/89330).
2019-07-30 Martin Liska <mliska@suse.cz>
PR ipa/89330
* cgraph.c (cgraph_edge::make_direct): Use
edge->indirect_unknown_callee as edge->resolve_speculation can
deallocate edge which is this pointer.
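The hazard can be sketched in plain C. This is a hypothetical model, not
the cgraph code: struct edge and both functions are simplified stand-ins,
and the assumption is only that resolving a speculation may free the
original object and hand back a different one.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, heavily simplified edge.  */
struct edge
{
  int speculative;
  int indirect_unknown_callee;
};

/* Assumed behaviour: resolving frees E and returns a replacement.  */
static struct edge *
resolve_speculation (struct edge *e)
{
  struct edge *r = malloc (sizeof *r);
  assert (r != NULL);
  r->speculative = 0;
  r->indirect_unknown_callee = e->indirect_unknown_callee;
  free (e);
  return r;
}

static struct edge *
make_direct (struct edge *self)
{
  struct edge *e = self;
  if (e->speculative)
    e = resolve_speculation (e);  /* SELF may be dangling from here on.  */

  /* Read and write through E, never through SELF.  */
  if (e->indirect_unknown_callee)
    e->indirect_unknown_callee = 0;
  return e;
}
```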
Deduce automatically number of cores for -flto option.
2019-07-30 Martin Liska <mliska@suse.cz>
* doc/invoke.texi: Document new behavior.
* lto-wrapper.c (cpuset_popcount): New function
is a copy of libgomp/config/linux/proc.c.
(init_num_threads): Likewise.
(run_gcc): Automatically detect core count for -flto.
(jobserver_active_p): New function.
PR tree-optimization/91257
* bitmap.h (bitmap_ior_into_and_free): Declare.
* bitmap.c (bitmap_list_unlink_element): Add defaulted param
whether to add the unlinked element to the freelist.
(bitmap_list_insert_element_after): Add defaulted param for
an already allocated element.
(bitmap_ior_into_and_free): New function.
* tree-ssa-structalias.c (condense_visit): Reduce the
points-to and edge bitmaps of the SCC members in a
logarithmic fashion rather than all to one.
Mark 2nd argument of delete operator as needed (PR tree-optimization/91270).
2019-07-30 Martin Liska <mliska@suse.cz>
PR tree-optimization/91270
* tree-ssa-dce.c (propagate_necessity): Mark 2nd argument
of delete operator as needed.
2019-07-30 Martin Liska <mliska@suse.cz>
PR tree-optimization/91270
* g++.dg/torture/pr91270.C: New test.
This patch extends the FMA handling in tree-ssa-math-opts.c so
that it can cope with conditional multiplications as well as
unconditional multiplications. The addition or subtraction must then
have the same condition as the multiplication (at least for now).
E.g. we can currently fold:
(IFN_COND_ADD cond (mul x y) z fallback)
-> (IFN_COND_FMA cond x y z fallback)
This patch also allows:
(IFN_COND_ADD cond (IFN_COND_MUL cond x y <whatever>) z fallback)
-> (IFN_COND_FMA cond x y z fallback)
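A scalar model shows why the fallback of the inner multiplication does
not matter. The functions below are illustrative stand-ins for the
internal functions, not GCC code, and they ignore the single rounding of
a real fused multiply-add, so the comparison is only meaningful for
values where both forms are exact.

```c
#include <assert.h>
#include <stdbool.h>

/* Each model yields the operation's result when COND is true and
   FALLBACK otherwise.  */
static double
cond_mul (bool cond, double x, double y, double fallback)
{
  return cond ? x * y : fallback;
}

static double
cond_add (bool cond, double a, double b, double fallback)
{
  return cond ? a + b : fallback;
}

static double
cond_fma (bool cond, double x, double y, double z, double fallback)
{
  return cond ? x * y + z : fallback;
}

/* When COND is false the outer addition returns FALLBACK and never
   looks at the inner result, so the inner fallback (-1.0 here, chosen
   arbitrarily) is irrelevant.  */
static void
check_fold (bool cond, double x, double y, double z, double fallback)
{
  double nested = cond_add (cond, cond_mul (cond, x, y, -1.0), z, fallback);
  assert (nested == cond_fma (cond, x, y, z, fallback));
}
```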
2019-07-30 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-ssa-math-opts.c (convert_mult_to_fma): Add a mul_cond
parameter. When nonnull, make sure that the addition or subtraction
has the same condition.
(math_opts_dom_walker::after_dom_children): Try convert_mult_to_fma
for CFN_COND_MUL too.
gcc/testsuite/
* gcc.dg/vect/vect-cond-arith-7.c: New test.
jakub [Tue, 30 Jul 2019 07:28:22 +0000 (07:28 +0000)]
PR middle-end/91216
* omp-low.c (global_nonaddressable_vars): New variable.
(use_pointer_for_field): For global decls, if they are non-addressable,
remember it in the global_nonaddressable_vars bitmap, if they are
addressable and in the global_nonaddressable_vars bitmap, ignore their
TREE_ADDRESSABLE bit.
(omp_copy_decl_2): Clear TREE_ADDRESSABLE also on private copies of
vars in global_nonaddressable_vars bitmap.
(execute_lower_omp): Free global_nonaddressable_vars bitmap.
jakub [Tue, 30 Jul 2019 07:13:04 +0000 (07:13 +0000)]
PR target/91150
* config/i386/i386-expand.c (expand_vec_perm_blend): Change mask type
from unsigned to unsigned HOST_WIDE_INT. For E_V64QImode cast
comparison to unsigned HOST_WIDE_INT before shifting it left.
* config/i386/i386.md (movstrict<mode>): Use register_operand
predicate for operand 0. Add expander condition. Assert that
operand 0 is a SUBREG RTX.
(*movstrict<mode>_1): Use register_operand predicate for operand 0.
Update operand constraints and insn condition.
(zero_extend<mode>si2_and): Do not call gen_movstrict<mode>.
(zero_extendqihi2_and): Do not call gen_movstrictqi.
(*setcc_qi_slp): Use register_operand predicate for operand 0.
Update operand 0 constraints.
(setcc_qi_slp splitters): Use register_operand predicate for operand 0.
MSP430: Disallow use of code/data regions in the small memory model
gcc/ChangeLog:
2019-07-29 Jozef Lawrynowicz <jozef.l@mittosystems.com>
* config/msp430/msp430.h (DRIVER_SELF_SPECS): Define and emit errors
when -m{code,data}-region are used without -mlarge.
* config/msp430/msp430.c (msp430_option_override): Error when a
non-default code or data region is used without -mlarge.
(msp430_section_attr): Emit a warning and do not add upper/lower/either
attributes when they are used without -mlarge.
gcc/testsuite/ChangeLog:
2019-07-29 Jozef Lawrynowicz <jozef.l@mittosystems.com>
* gcc.target/msp430/pr78818-data-region.c: Add -mlarge to dg-options.
* gcc.target/msp430/region-misuse-code.c: New test.
* gcc.target/msp430/region-misuse-data.c: Likewise.
* gcc.target/msp430/region-misuse-code-data.c: Likewise.
* gcc.target/msp430/region-attribute-misuse.c: Likewise.
inchash::hash::add_wide_int operated directly on the raw encoding
of the wide_int, including any redundant upper bits. The problem
with that is that the upper bits are only defined for some wide-int
storage types (including wide_int itself). wi::to_wide(tree) instead
returns a value that is extended according to the signedness of the
type (so that wi::to_widest can use the same encoding) while rtxes
have the awkward special case of BI, which can be zero-extended
rather than sign-extended.
In the PR, we computed a hash for a "normal" sign-extended wide_int
while the existing entries hashed wi::to_wide(tree). This gives
different results for unsigned types that have the top bit set.
The patch fixes that by hashing the canonical sign-extended form even
if the raw encoding happens to be different.
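As a minimal sketch of the idea, restricted to a single 64-bit element:
canonicalize to the sign-extended form first, then hash. The sext and
hash_value helpers and the FNV-1a mixing are assumptions for
illustration; they are not the wide-int or inchash implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low PRECISION bits of V to 64 bits.  */
static uint64_t
sext (uint64_t v, unsigned precision)
{
  assert (precision >= 1 && precision <= 64);
  if (precision == 64)
    return v;
  uint64_t sign = (uint64_t) 1 << (precision - 1);
  v &= (sign << 1) - 1;        /* drop any redundant upper bits */
  return (v ^ sign) - sign;    /* replicate the sign bit upwards */
}

/* Toy FNV-1a hash over the canonical form; a stand-in for the real
   hash, used only to show that the encoding no longer matters.  */
static uint64_t
hash_value (uint64_t v, unsigned precision)
{
  uint64_t canon = sext (v, precision);
  uint64_t h = 14695981039346656037ULL;
  for (int i = 0; i < 8; i++)
    {
      h ^= (canon >> (8 * i)) & 0xff;
      h *= 1099511628211ULL;
    }
  return h;
}
```

For an 8-bit value with the top bit set, the zero-extended encoding 0x80
and the sign-extended encoding 0xffffffffffffff80 now hash identically.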
2019-07-29 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* wide-int.h (generic_wide_int::sext_elt): New function.
* inchash.h (hash::add_wide_int): Use it instead of elt.
PR fortran/90813
* dump-parse-tree.c (show_global_symbol): New function.
(gfc_dump_global_symbols): New function.
* gfortran.h (gfc_traverse_gsymbol): Add prototype.
(gfc_dump_global_symbols): Likewise.
* invoke.texi: Document -fdump-fortran-global.
* lang.opt: Add -fdump-fortran-global.
* parse.c (gfc_parse_file): Handle flag_dump_fortran_global.
* symbol.c (gfc_traverse_gsymbol): New function.
* trans-decl.c (sym_identifier): New function.
(mangled_identifier): New function, doing most of the work
of gfc_sym_mangled_identifier.
(gfc_sym_mangled_identifier): Use mangled_identifier. Add mangled
identifier to global symbol table.
(get_proc_pointer_decl): Use backend decl from global identifier
if present.
[arm] Make ACLE builtins use arm_* namespace for expanders
The builtins from <arm_acle.h> use fairly general expander names such as
"crc", "mcr" etc.
These run the risk of being reserved by the midend in the future.
Let's namespace them to arm_* as is convention.
The recursive_init_error class is defined in a header, with an inline
constructor, but the definition of the vtable and destructor are not
exported from the shared library. With -fkeep-inline-functions the
constructor gets emitted in user code, and requires the (non-exported)
vtable. This fails to link.
As far as I can tell, the recursive_init_error class definition was
moved into <cxxabi.h> so it could be documented with Doxygen, not for
any technical reason. But now it's there (and documented), somebody
could be relying on it, by catching that type and possibly performing
derived-to-base conversions to the std::exception base class. So the
conservative fix is to leave the class definition in the header but make
the constructor non-inline. This still allows the type to be caught and
still defines its base class. User code can no longer construct objects
of that type, but that's not something we need to support.
PR libstdc++/51333
* libsupc++/cxxabi.h (__gnu_cxx::recursive_init_error): Do not define
constructor inline.
* libsupc++/guard_error.cc (__gnu_cxx::recursive_init_error): Define
constructor.
* testsuite/18_support/51333.cc: New test.
PR tree-optimization/91257
* tree-vrp.c (operand_less_p): Avoid dispatching to fold for
most cases, instead call compare_values which handles the
symbolic ranges we handle specially.
(compare_values_warnv): Do not call operand_less_p but open-code
the effective fold calls. Avoid converting so much.
Prevent tree-ssa-dce.c from deleting stores at -Og
DCE tries to delete dead stores to local data and also tries to insert
debug binds for simple cases:
  /* If this is a store into a variable that is being optimized away,
     add a debug bind stmt if possible.  */
  if (MAY_HAVE_DEBUG_BIND_STMTS
      && gimple_assign_single_p (stmt)
      && is_gimple_val (gimple_assign_rhs1 (stmt)))
    {
      tree lhs = gimple_assign_lhs (stmt);
      if ((VAR_P (lhs) || TREE_CODE (lhs) == PARM_DECL)
          && !DECL_IGNORED_P (lhs)
          && is_gimple_reg_type (TREE_TYPE (lhs))
          && !is_global_var (lhs)
          && !DECL_HAS_VALUE_EXPR_P (lhs))
        {
          tree rhs = gimple_assign_rhs1 (stmt);
          gdebug *note
            = gimple_build_debug_bind (lhs, unshare_expr (rhs), stmt);
          gsi_insert_after (i, note, GSI_SAME_STMT);
        }
    }
But this doesn't help for things like "print *ptr" when ptr points
to the local variable (tests Og-dce-1.c and Og-dce-2.c). It can
also introduce wrong debug info for earlier references (second test
in Og-dce-3.c) or make earlier references unavailable (first test
in Og-dce-3.c).
So for -Og I think it'd be better not to delete any stmts with
vdefs for now. This also means that we can avoid the potentially
expensive vop walks (which already have a cut-off, but still).
The patch also fixes the Og failures in gcc.dg/guality/pr54970.c
(PR 86638).
2019-07-29 Richard Sandiford <richard.sandiford@arm.com>
gcc/
PR debug/86638
* tree-ssa-dce.c (keep_all_vdefs_p): New function.
(mark_stmt_if_obviously_necessary): Mark all stmts with vdefs as
necessary if keep_all_vdefs_p is true.
(mark_aliased_reaching_defs_necessary): Add a gcc_checking_assert
that keep_all_vdefs_p is false.
(mark_all_reaching_defs_necessary): Likewise.
(propagate_necessity): Skip the vuse scan if keep_all_vdefs_p is true.
This patch stops gimple and rtl DSE from running by default at -Og.
The idea is both to improve compile time and to stop us from deleting
stores that we can't track in debug info.
We could rein this back in future for stores to local variables
with is_gimple_reg_type, but at the moment we don't have any
infrastructure for switching between binds to specific values
and binds to evolving memory locations. Even then, location
tracking only works for direct references to the variables, and doesn't
for example help with printing dereferenced pointers (see the next patch
in the series for an example).
I'm also not sure that DSE is important enough for -Og to justify the
compile time cost -- especially in the case of RTL DSE, which is pretty
expensive.
2019-07-29 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* common.opt (Og): Change the initial value of flag_dse to 0.
* opts.c (default_options_table): Move OPT_ftree_dse from
OPT_LEVELS_1_PLUS to OPT_LEVELS_1_PLUS_NOT_DEBUG. Also add
OPT_fdse to OPT_LEVELS_1_PLUS_NOT_DEBUG. Put the OPT_ftree_pta
entry before the OPT_ftree_sra entry.
* doc/invoke.texi (Og): Add -fdse and -ftree-dse to the list
of flags disabled by Og.
gcc/testsuite/
* c-c++-common/guality/Og-global-dse-1.c: New test.
Prevent -Og from deleting stores to write-only variables
This patch prevents -Og from deleting stores to write-only variables,
so that the values are still available when debugging. This seems
more convenient than forcing users to use __attribute__((used))
(probably conditionally, if it's not something they want in release
builds).
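A write-only variable of the kind meant here might look like the
following sketch. The names last_status and process are hypothetical,
and the snippet only illustrates the source pattern; it does not
demonstrate what any particular optimization level keeps or removes.

```c
#include <assert.h>

/* Write-only: stored below but never read by the program itself.  Its
   only purpose is to be inspected from a debugger.  */
static int last_status;

static int
process (int x)
{
  assert (x >= 0);
  int status = x % 3;
  last_status = status;   /* e.g. "print last_status" in the debugger */
  return x * 2;
}
```

The manual alternative would be to declare the variable as
static int last_status __attribute__ ((used)), typically under a
debug-only macro.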
2019-07-29 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-cfg.c (execute_fixup_cfg): Don't delete stores to write-only
variables for -Og.
gcc/testsuite/
* c-c++-common/guality/Og-static-wo-1.c: New test.
* g++.dg/guality/guality.exp: Separate the c-c++-common tests into
"Og" and "general" tests. Run the latter at -O0 and -Og only.
* gcc.dg/guality/guality.exp: Likewise.
There isn't a 1:1 mapping from SVE intrinsics to SVE instructions,
but the intrinsics are still close enough to the instructions for
there to be a specific preferred sequence (or sometimes choice of
preferred sequences) for a given combination of operands. Sometimes
these sequences will be one instruction, sometimes they'll be several.
I therefore wanted a convenient way of matching the exact assembly
implementation of a given function. It's possible to do that using
single scan-assembler lines, but:
(a) they become hard to read for multiline matches
(b) the PASS/FAIL lines tend to be overly long
(c) it's useful to have a single place that skips over uninteresting
lines, such as entry block labels and .cfi_* directives, without
being overly broad
This patch therefore adds a new check-function-bodies dg-final test
that looks for specially-formatted comments. As a demo, the patch
converts the SVE vec_init tests to use the new harness instead of
scan-assembler.
The regexps in parse_function_bodies are fairly general, but might
still need to be extended in future for targets like Darwin or AIX.
2019-07-29 Richard Sandiford <richard.sandiford@arm.com>
Generalise VEC_DUPLICATE folding for variable-length vectors
This patch uses the constant vector encoding scheme to handle
more cases of a VEC_DUPLICATE of another vector. Duplicating
any fixed-length vector is fine, and duplicating a variable-length
vector is OK as long as that vector is also a duplicate of a
fixed-length sequence.
Other cases fell through to:
if (VECTOR_MODE_P (mode) && GET_CODE (op) == CONST_VECTOR)
which was only expecting to deal with elementwise operations.
2019-07-29 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* simplify-rtx.c (simplify_const_unary_operation): Fold a
VEC_DUPLICATE of a fixed-length vector even if the result
is variable-length. Likewise fold a duplicate of a
variable-length vector if the variable-length vector is
itself a duplicate of a fixed-length sequence.
(test_vector_ops_duplicate): Test more cases.
Implement more rtx vector folds on variable-length vectors
This patch extends the tree-level folding of variable-length vectors
so that it can also be used on rtxes. The first step is to move
the tree_vector_builder new_unary/binary_operator routines to the
parent vector_builder class (which in turn means adding a new
template parameter). The second step is to make simplify-rtx.c
use a direct rtx analogue of the VECTOR_CST handling in fold-const.c.
2019-07-29 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* vector-builder.h (vector_builder): Add a shape template parameter.
(vector_builder::new_unary_operation): New function, generalizing
the old tree_vector_builder function.
(vector_builder::new_binary_operation): Likewise.
(vector_builder::binary_encoded_nelts): Likewise.
* int-vector-builder.h (int_vector_builder): Update template
parameters to vector_builder.
(int_vector_builder::shape_nelts): New function.
* rtx-vector-builder.h (rtx_vector_builder): Update template
parameters to vector_builder.
(rtx_vector_builder::shape_nelts): New function.
(rtx_vector_builder::nelts_of): Likewise.
(rtx_vector_builder::npatterns_of): Likewise.
(rtx_vector_builder::nelts_per_pattern_of): Likewise.
* tree-vector-builder.h (tree_vector_builder): Update template
parameters to vector_builder.
(tree_vector_builder::shape_nelts): New function.
(tree_vector_builder::nelts_of): Likewise.
(tree_vector_builder::npatterns_of): Likewise.
(tree_vector_builder::nelts_per_pattern_of): Likewise.
* tree-vector-builder.c (tree_vector_builder::new_unary_operation)
(tree_vector_builder::new_binary_operation): Delete.
(tree_vector_builder::binary_encoded_nelts): Likewise.
* simplify-rtx.c: Include rtx-vector-builder.h.
(distributes_over_addition_p): New function.
(simplify_const_unary_operation)
(simplify_const_binary_operation): Generalize handling of vector
constants to include variable-length vectors.
(test_vector_ops_series): Add more tests.
Release cgraph_{node,edge} via ggc_free (PR ipa/89330).
2019-07-28 Martin Liska <mliska@suse.cz>
PR ipa/89330
* cgraph.c (symbol_table::create_edge): Always allocate
a cgraph_edge.
(symbol_table::free_edge): Store summary_id to
edge_released_summary_ids if != -1;
* cgraph.h (NEXT_FREE_NODE): Remove.
(SET_NEXT_FREE_NODE): Likewise.
(NEXT_FREE_EDGE): Likewise.
(symbol_table::release_symbol): Store summary_id to
cgraph_released_summary_ids if != -1;
(symbol_table::allocate_cgraph_symbol): Always allocate
a cgraph_node.
[RS6000] PR91135, __linux__ not defined with -mcall-aixdesc on 9.x and ppc64
This patch makes the obvious fix for PR91135, and deletes extraneous
copies of GNU_USER_TARGET_D_OS_VERSIONS that appear in rs6000/linux.h
and rs6000/linux64.h. Since all configurations using either of these
files also include linux.h there is no need to duplicate the macro.
PR target/91135
* config/rs6000/linux.h (GNU_USER_TARGET_D_OS_VERSIONS): Don't
define.
* config/rs6000/linux64.h (TARGET_OS_CPP_BUILTINS): Invoke
GNU_USER_TARGET_OS_CPP_BUILTINS for aixdesc abi.
(GNU_USER_TARGET_D_OS_VERSIONS): Don't define.
[RS6000] Make assembler command line cpu match default for gcc
When gcc is configured using --with-cpu=<cpu>, the specified cpu
effectively becomes a default -mcpu=<cpu> passed to gcc. This then
affects the cpu passed to gas via ASM_CPU_SPEC. If gcc is not
configured using --with-cpu then the cpu passed to gas is that given
by ASM_DEFAULT_SPEC, which currently does not match the default flags
selected in default64.h. This patch makes ASM_DEFAULT_SPEC agree with
TARGET_DEFAULT flags.
rs6000/default64.h appears in three places in config.gcc, the first
one immediately followed by rs6000/freebsd64.h in $tm_file, and the
other two immediately followed by rs6000/linux64.h. To be able to
define ASM_DEFAULT_SPEC in rs6000/default64.h we don't want to
redefine in the other two files. rs6000/freebsd64.h is easy since
that file is always preceded by rs6000/default64.h, but
rs6000/linux64.h can appear without rs6000/default64.h (a
powerpc*-linux config where the default is -m32). In that case we
will have TARGET_DEFAULT flags of 0 (from rs6000/sysv4.h) and want to
use -mppc without -m64 and -mppc64 with -m64. This can be done by
using the rs6000/rtems.h ASM_DEFAULT_SPEC in rs6000/sysv4.h, a change
that won't affect sysv4 configurations where -m64 is invalid.
The patch also introduces ASM_DEFAULT_EXTRA for the altivec variant
targets so as to enable -maltivec by default.
* doc/xml/manual/documentation_hacking.xml: Fix broken reference
to the Doxygen manual. Avoid a "here" link on the way.
Fix another broken link to Doxygen docblocks.
[Darwin, PPC, testsuite] Fix fail for bmi2-bzhi64-1a.c
This test is failing with older cpus because the included header needs both
altivec and vsx to be enabled to succeed in compiling. Without this (if these
are not defaults for the cpu) there are errors like:
In file included from ... x86intrin.h:41,
from ... bmi2-bzhi64-1a.c:6:
... xmmintrin.h: In function '_mm_loadu_ps':
... xmmintrin.h:122:11:
error: incompatible types when returning type 'int' but '__m128' {aka '__vector(4) float'} was expected
<snip>
... xmmintrin.h: In function '_mm_cvtps_pi32':
... xmmintrin.h:996:3:
error: use of 'long long' in AltiVec types is invalid without '-mvsx'
<snip>
Fixed by adding -maltivec -mvsx to the options.
gcc/testsuite/
2019-07-27 Iain Sandoe <iain@sandoe.co.uk>
* gcc.target/powerpc/bmi2-bzhi64-1a.c: Add options to enable altivec
and vsx.
Darwin's "size" command has a different header line, reflecting the Mach-O
section naming conventions. This causes tests using the command to fail
because scanasm.exp expects and checks specific layout of the header line.
AArch64: Make processing less fragile in config.gcc
config.gcc currently requires each option definition to be on a single line,
because its grep commands select only the first line of a definition.
As a result it does not select the right bits for options that are spread over
multiple lines when the --with-arch configure option is used. The failure is
silent: you just get a compiler with an incorrect set of default flags.
The current rules are quite rigid:
1) No space between the AARCH64_OPT_EXTENSION and the opening (.
2) No space between the opening ( and the extension name.
3) No space after the extension name before the ,.
4) Spaces are only allowed after a , and around |.
This patch makes this a lot less fragile by using the C pre-processor to flatten
the list and then provides much more flexible regex using group matching to
process the options instead of string replacement. This removes all the
restrictions above and makes the code a bit more readable.
gcc/ChangeLog:
PR target/89517
* config.gcc: Relax parsing of AARCH64_OPT_EXTENSION.
* config/aarch64/aarch64-option-extensions.def: Add new comments
and restore easier to read options.
Add rules to strip away unneeded type casts in expressions
This patch moves part of the type conversion code from convert.c to match.pd
because match.pd is able to apply these transformations in the presence of
intermediate temporary variables.
Concretely it makes both these cases behave the same
float e = (float)a * (float)b;
*c = (_Float16)e;
and
*c = (_Float16)((float)a * (float)b);
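The two source forms can be compared with a portable analogue. This
sketch substitutes float/double for _Float16/float, since _Float16 is not
available on every target; that substitution and the function names are
assumptions. It only shows that the two spellings are semantically the
same computation, not which instructions the compiler emits for them.

```c
#include <assert.h>

/* Product computed via a named temporary of the wider type.  */
static float
mul_via_temp (float a, float b)
{
  double e = (double) a * (double) b;
  return (float) e;
}

/* The same product written as a single expression.  */
static float
mul_direct (float a, float b)
{
  return (float) ((double) a * (double) b);
}

/* Both spellings must agree for any non-NaN inputs.  */
static void
check_same (float a, float b)
{
  assert (mul_via_temp (a, b) == mul_direct (a, b));
}
```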
gcc/ChangeLog:
* convert.c (convert_to_real_1): Move part of conversion code...
* match.pd: ...To here.
[PR 89330] Remove non-useful speculations from new_edges
2019-07-26 Martin Jambor <mjambor@suse.cz>
PR ipa/89330
* ipa-inline-transform.c (check_speculations_1): New function.
(push_all_edges_in_set_to_vec): Likewise.
(check_speculations): Use check_speculations_1, new parameter
new_edges.
(inline_call): Pass new_edges to check_speculations.
* ipa-inline.c (add_new_edges_to_heap): Assert edge_callee is not
NULL.
(speculation_useful_p): Early return true if edge is inlined, remove
later checks for inline_failed.
testsuite/
* g++.dg/lto/pr89330_[01].C: New test.
* g++.dg/tree-prof/devirt.C: Added -fno-profile-values to dg-options.
[Darwin, testsuite] Address PR91087 - XFAIL parts of pr16855.C.
The testcase is failing to instrument part of the source because of a bug
in the ordering of static DTORs. It seems unlikely that this is generically
fixable in the toolchain (and given that it's likely to be a dynamic loader
change would not be expected to be applied retrospectively to OS versions
that are out of support). To avoid the testsuite noise, xfail the count lines
that don't match (we can adjust the xfails as/when the upstream bug is fixed).
dejagnu xfails do not seem to work when embedded in a line like:
~Test (void) { .... /* count(1) { xfail ... } */ }
the closing brace seems to confuse the parser. The solution is to expand the
text onto three lines.
2019-07-25 Iain Sandoe <iain@sandoe.co.uk>
PR gcov-profile/91087
* g++.dg/gcov/pr16855.C: Xfail the count lines for the DTORs and the
"final" line for the failure summaries. Adjust source layout so that
dejagnu xfail expressions work.
PR fortran/65819
* dependency.h (gfc_dep_resolver): Add optional argument identical.
* dependency.c (gfc_check_dependency): Do not always return 1 if
the symbol is the same. Pass on identical to gfc_dep_resolver.
(gfc_check_element_vs_element): Whitespace fix.
(gfc_dep_resolver): Adjust comment for function. If identical is
true, return 1 if any overlap has been found.
2019-07-25 Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/65819
* gfortran.dg/dependency_54.f90: New test.
* cif-code.def (NEVER_CALL): New code.
* ipa-inline.c (want_inline_small_function_p): Fix formatting issues.
Set the failure to CIF_NEVER_CALL if the IPA count is zero.
The Thumb-2 movsi patterns try to prefer low registers for loads and stores.
However this is done incorrectly by using 2 separate variants with 'l' and 'h'
register classes. The register allocator will only use low registers, and
as a result we end up with significantly more spills and moves to high
registers. Fix this by merging the alternatives and use 'l*r' to indicate
preference for low registers. This saves ~400 instructions from the pr77308
testcase.
* tree-vrp.c (extract_range_from_multiplicative_op): Add
type parameter and use it instead of guessing expression
type from the first operand.
(extract_range_from_binary_expr): Pass expr_type down.
[arm][committed] Clean up code iterator usage in satsi* patterns
GCC 10 now supports using RTL codes as code attributes (thanks
Richard), allowing us to map smax to smin and vice versa.
This means we can clean up their use in the saturation patterns, which
currently take the cross product of [smin, smax] and use the pattern
predicate to cancel out the nonsense ones.
* config/arm/arm.md (SATrev): Change to code attribute.
(*satsi_<SAT:code>): Adjust for the above.
(*satsi_<SAT:code>_shift): Likewise.
Generalize get_most_common_single_value to return n_th value & count
Currently get_most_common_single_value can only return the top histogram
<value, count> pair. Sort the values after reading them from disk so that
later uses can ask for the nth value, and rename the function to
get_nth_most_common_value.
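The sort-then-index approach can be sketched as below. The hist_entry
layout, the tie-breaking rule and the function names are assumptions for
illustration and do not mirror the value-prof.c data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical histogram entry: a profiled value and its hit count.  */
struct hist_entry
{
  long value;
  long count;
};

/* Order by descending count; break ties on value so that the result is
   deterministic.  */
static int
by_count_desc (const void *pa, const void *pb)
{
  const struct hist_entry *a = pa;
  const struct hist_entry *b = pb;
  if (a->count != b->count)
    return a->count < b->count ? 1 : -1;
  return (a->value > b->value) - (a->value < b->value);
}

/* Sort once, right after loading the histogram.  */
static void
sort_hist (struct hist_entry *h, size_t len)
{
  qsort (h, len, sizeof *h, by_count_desc);
}

/* H must already be sorted.  N == 0 selects the most common value.  */
static bool
get_nth_most_common (const struct hist_entry *h, size_t len, size_t n,
                     long *value, long *count)
{
  assert (h != NULL || len == 0);
  if (n >= len)
    return false;
  *value = h[n].value;
  *count = h[n].count;
  return true;
}
```

Sorting at load time keeps each later lookup a constant-time index,
which is why n can simply default to 0 for existing callers.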
gcc/ChangeLog:
2019-07-15 Xiong Hu Luo <luoxhu@linux.ibm.com>
* ipa-profile.c (get_most_common_single_value): Use
get_nth_most_common_value.
* profile.c (sort_hist_value): New function.
(compute_value_histograms): Call sort_hist_value to sort the
values after loading from disk.
* value-prof.c (get_most_common_single_value): Rename to ...
get_nth_most_common_value. Add input params n, return
the n_th value and count.
(gimple_divmod_fixed_value_transform): Use
get_nth_most_common_value.
(gimple_ic_transform): Likewise.
(gimple_stringops_transform): Likewise.
* value-prof.h (get_most_common_single_value): Add input params
n, default to 0.
PR tree-optimization/91183 - strlen of a strcpy result with a conditional source not folded
PR tree-optimization/86688 - missing -Wstringop-overflow using a non-string local array in strnlen with excessive bound
gcc/ChangeLog:
PR tree-optimization/91183
PR tree-optimization/86688
* builtins.c (compute_objsize): Handle MEM_REF.
* tree-ssa-strlen.c (class ssa_name_limit_t): New.
(get_min_string_length): Remove.
(count_nonzero_bytes): New function.
(handle_char_store): Rename...
(handle_store): to this. Handle multibyte stores via integer types.
(strlen_check_and_optimize_stmt): Adjust conditional and the called
function name.
gcc/testsuite/ChangeLog:
PR tree-optimization/91183
PR tree-optimization/86688
* gcc.dg/Wstringop-overflow-14.c: New test.
* gcc.dg/attr-nonstring-2.c: Remove xfails.
* gcc.dg/strlenopt-70.c: New test.
* gcc.dg/strlenopt-71.c: New test.
* gcc.dg/strlenopt-72.c: New test.
* gcc.dg/strlenopt-8.c: Remove xfails.
PR driver/80545
* diagnostic.c (diagnostic_classify_diagnostic): Use lang_mask.
(diagnostic_report_diagnostic): Same.
* diagnostic.h (diagnostic_context::option_enabled): Add an argument.
(diagnostic_context::lang_mask): New data member.
* ipa-pure-const.c (suggest_attribute): Use
lang_hooks.option_lang_mask ().
* opts-common.c (option_enabled): Handle new argument.
(get_option_state): Pass an additional argument.
* opts.c (print_filtered_help): Print supported languages for
unsupported options. Adjust printing of current state.
* opts.h (option_enabled): Add argument.
* toplev.c (print_switch_values): Use lang_mask.
(general_init): Set global_dc->lang_mask.
law [Wed, 24 Jul 2019 18:08:51 +0000 (18:08 +0000)]
* gimplify.c (flag_instrument_functions_exclude_p): Include
namespace/class information in the printable name.
* opts.c (add_comma_separated_to_vector): Add NUL terminator
to tokens entered into the vector.
When entering an interrupt, not only the call-saved registers need to
be placed on the stack but also the call-clobbered ones. Moreover, the
ARC700 return-from-interrupt instruction needs to be rtie, the same
as for ARCv2 CPUs, while the ARC6xx family uses the j.f [ilinkX]
instruction. Additionally, we need to save the state of the ZOL
machinery, namely the lp_count, lp_end and lp_start registers. For
architectures that use extension registers (i.e., HS48) we need
to save/restore them as well.
* config/arc/arc-protos.h (arc_output_function_epilogue): Delete
declaration.
(arc_compute_frame_size): Millicode is disabled when compiling
ISR.
(arc_return_address_register): Likewise.
(arc_compute_function_type): Likewise.
(arc_compute_frame_size): Likewise.
(secondary_reload_info): Likewise.
(arc_get_unalign): Likewise.
(arc_can_use_return_insn): Declare.
* config/arc/arc.c (AUX_LP_START): Define
(AUX_LP_END): Likewise.
(arc_frame_info): Update gmask member to 64-bit datum.
(GMASK_LEN): Update.
(arc_compute_function_type): Make it static, move it forward.
(arc_must_save_register): Update, consider the extra regs.
(arc_compute_millicode_save_restore_regs): Update to use the 64
bit gmask.
(arc_compute_frame_size): Likewise.
(arc_enter_leave_p): Likewise.
(arc_save_callee_saves): Likewise.
(arc_restore_callee_saves): Likewise.
(arc_save_callee_enter): Likewise.
(arc_restore_callee_leave): Likewise.
(arc_save_callee_milli): Likewise.
(arc_restore_callee_milli): Likewise.
(arc_expand_prologue): Add new interrupt handling.
(arc_return_address_register): Make it static, move it forward.
(arc_expand_epilogue): Add new interrupt handling.
(arc_get_unalign): Delete.
(arc_epilogue_uses): Make sure we do not remove the extra
saved/restored registers when interrupt.
(arc_can_use_return_insn): New function.
(push_reg): Likewise.
(pop_reg): Likewise.
(arc_save_callee_saves): Add ZOL and FPX aux registers saving
procedures.
(arc_restore_callee_saves): Likewise, but restoring.
* config/arc/arc.md (VUNSPEC_ARC_ARC600_RTIE): Define.
(R33_REG): Likewise.
(R34_REG): Likewise.
(R35_REG): Likewise.
(R36_REG): Likewise.
(R37_REG): Likewise.
(R38_REG): Likewise.
(R39_REG): Likewise.
(R45_REG): Likewise.
(R46_REG): Likewise.
(R47_REG): Likewise.
(R48_REG): Likewise.
(R49_REG): Likewise.
(R50_REG): Likewise.
(R51_REG): Likewise.
(R52_REG): Likewise.
(R53_REG): Likewise.
(R54_REG): Likewise.
(R55_REG): Likewise.
(R56_REG): Likewise.
(R58_REG): Likewise.
(type): Add rtie attribute.
(in_call_delay_slot): Use RETURN_ADDR_REGNUM.
(movsi_insn): Accept moves to lp_count.
(rtie): Update pattern.
(simple_return): Simplify it, don't use this pattern as a return
from an interrupt.
(arc600_rtie): New pattern.
(p_return_i): Clean up.
(return): Likewise.
* config/arc/builtins.def (rtie): Only available for non ARC6xx
family CPUs.
* config/arc/predicates.md (move_src_operand): Consider lp_count
as a register.
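The move to a 64-bit gmask can be pictured with a small register-mask sketch. The names below (sketch_mask_add, sketch_mask_test, sketch_mask_count) are hypothetical and are not taken from arc.c; the sketch only illustrates why register numbers above 31, such as the R58_REG defined in this patch, need a 64-bit datum.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, not the arc.c implementation.  One bit per
   hard register; a 32-bit mask could not represent r32..r63.  */

/* Return GMASK with register REGNO marked as "must be saved".  */
uint64_t sketch_mask_add(uint64_t gmask, unsigned regno)
{
  assert(regno < 64);
  return gmask | (UINT64_C(1) << regno);
}

/* Return nonzero if register REGNO is marked in GMASK.  */
int sketch_mask_test(uint64_t gmask, unsigned regno)
{
  assert(regno < 64);
  return (gmask >> regno) & 1;
}

/* Count the marked registers, e.g. to size the save area.  */
unsigned sketch_mask_count(uint64_t gmask)
{
  unsigned n = 0;
  while (gmask)
    {
      gmask &= gmask - 1;   /* clear the lowest set bit */
      n++;
    }
  return n;
}
```

With these helpers, marking register 58 sets bit 58, which is only representable because the mask is 64 bits wide.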
This patch implements the addv, subv, and mulv patterns for signed
integers.
gcc/ChangeLog:
2019-07-24 Andreas Krebbel <krebbel@linux.ibm.com>
* config/s390/predicates.md (addv_const_operand): New predicate.
* config/s390/s390-modes.def (CCO): New condition code mode.
* config/s390/s390.c (s390_match_ccmode_set): Handle E_CCOmode.
(s390_branch_condition_mask): Likewise.
* config/s390/s390.md ("addv<mode>4", "subv<mode>4")
("mulv<mode>4"): New expanders.
("*addv<mode>3_ccoverflow", "*addv<mode>3_ccoverflow_const")
("*subv<mode>3_ccoverflow", "*mulv<mode>3_ccoverflow"): New
pattern definitions.
gcc/testsuite/ChangeLog:
2019-07-24 Andreas Krebbel <krebbel@linux.ibm.com>
* gcc.target/s390/addsub-signed-overflow-1.c: New test.
* gcc.target/s390/addsub-signed-overflow-2.c: New test.
* gcc.target/s390/mul-signed-overflow-1.c: New test.
* gcc.target/s390/mul-signed-overflow-2.c: New test.
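For orientation, signed overflow checks of this kind are usually written in C with GCC's __builtin_*_overflow builtins. The following is a minimal sketch, assuming those builtins are what ends up using the new expanders; the helper names are illustrative and are not part of the patch or its tests.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Each helper returns true when the signed operation overflows.
   The wrapped result is still written to *out in that case.  */
bool checked_add(int a, int b, int *out)
{
  return __builtin_add_overflow(a, b, out);
}

bool checked_sub(int a, int b, int *out)
{
  return __builtin_sub_overflow(a, b, out);
}

bool checked_mul(int a, int b, int *out)
{
  return __builtin_mul_overflow(a, b, out);
}

/* Small self-check of the behaviour described above.  */
void overflow_demo(void)
{
  int r;
  assert(!checked_add(2, 3, &r) && r == 5);
  assert(checked_add(INT_MAX, 1, &r));
  assert(checked_sub(INT_MIN, 1, &r));
  assert(checked_mul(INT_MAX, 2, &r));
}
```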
PR middle-end/91166
* match.pd (vec_perm_expr(v, v, mask) -> v): New pattern.
(define_predicates): Add entry for uniform_vector_p.
(vec_same_elem_p): New match pattern.
testsuite/
* gcc.target/aarch64/sve/pr91166.c: New test.
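The idea behind the vec_perm_expr(v, v, mask) -> v pattern can be checked on plain arrays. This is a sketch under the assumption, suggested by the uniform_vector_p and vec_same_elem_p entries, that the fold applies when every element of v is the same; the function names are hypothetical and model the permute in scalar C rather than using match.pd or vector types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True if all N elements of V are equal.  */
bool is_uniform(const int *v, size_t n)
{
  for (size_t i = 1; i < n; i++)
    if (v[i] != v[0])
      return false;
  return true;
}

/* Scalar model of a two-input permute: element i of OUT is taken
   from the concatenation of A and B at index MASK[i] mod 2N.  */
void model_vec_perm(const int *a, const int *b, const unsigned *mask,
                    int *out, size_t n)
{
  assert(n > 0);
  for (size_t i = 0; i < n; i++)
    {
      size_t sel = mask[i] % (2 * n);
      out[i] = sel < n ? a[sel] : b[sel - n];
    }
}

/* True if permuting V with itself under MASK gives back V.  */
bool perm_is_identity(const int *v, const unsigned *mask, size_t n)
{
  int out[16];
  assert(n <= 16);
  model_vec_perm(v, v, mask, out, n);
  for (size_t i = 0; i < n; i++)
    if (out[i] != v[i])
      return false;
  return true;
}
```

For a uniform input, perm_is_identity holds for any mask, which is why the permute can be dropped; for a non-uniform input it generally does not.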
Messed up the commit, and missed changes to gcc/config.gcc and to the comments
in some of the headers.
2019-07-24 Iain Sandoe <iain@sandoe.co.uk>
gcc/
PR bootstrap/87030
* config/i386/darwin.h (REAL_LIBGCC_SPEC): Move from here...
* config/i386/darwin32-biarch.h: ... to here.
* config/i386/darwin64-biarch.h: Adjust comments.
* config/rs6000/darwin32-biarch.h: Likewise.
* config/rs6000/darwin64-biarch.h: Likewise.
* config.gcc: Missed commit from r273746
(*-*-darwin*): Don't include CPU t-darwin here.
(i[34567]86-*-darwin*): Adjust to use biarch files. Produce
an error message if i686-darwin configuration is attempted for
Darwin >= 18.
* arith.c (gfc_convert_integer, gfc_convert_real, gfc_convert_complex):
Move to ...
* primary.c (convert_integer, convert_real, convert_complex): ... here.
Rename and make static functions.
(match_integer_constant): Use convert_integer.
(match_real_constant): Use convert_real.
(match_complex_constant): Use convert_complex.
* arith.h (gfc_convert_integer, gfc_convert_real, gfc_convert_complex):
Remove prototypes.
* array.c (match_array_cons_element): A BOZ cannot be a data
statement value. Jump to a common exit point.
* check.c (gfc_invalid_boz): New function. Emit error or warning
for a BOZ in an invalid context.
(boz_args_check): Move to top of file to prevent need of forward
declaration.
(is_boz_constant): New function. Check that BOZ expr is constant.
(gfc_boz2real): New function. In-place conversion of BOZ literal
constant to REAL in accordance with F2018.
(gfc_boz2int): New function. In-place conversion of BOZ literal
constant to INTEGER in accordance with F2018.
(gfc_check_achar, gfc_check_char, gfc_check_float): Use gfc_invalid_boz.
Convert BOZ as needed.
(gfc_check_bge_bgt_ble_blt): Enforce F2018 requirements on BGE,
BGT, BLE, and BLT intrinsic functions.
(gfc_check_cmplx): Re-organize to check kind, if present, first.
Convert BOZ real and/or imaginary parts as needed in accordance with
F2018.
(gfc_check_complex): Use gfc_invalid_boz. Convert BOZ as needed.
(gfc_check_dcmplx, gfc_check_dble): Convert BOZ as needed.
(gfc_check_dshift): Make dshift[lr] conform to F2018 standard.
(gfc_check_iand_ieor_ior): Make IAND, IEOR, and IOR conform to
F2018 standard.
(gfc_check_int): Conform to F2018 standard.
(gfc_check_intconv): Deprecate SHORT and LONG aliases for INT2 and
INT. Simply return for a BOZ argument. See gfc_simplify_intconv.
(gfc_check_merge_bits): Make MERGE_BITS conform to Fortran 2018
standard.
(gfc_check_real): Remove incorrect comment. Check kind, if present,
first. Simply return for a BOZ argument. See gfc_simplify_real.
(gfc_check_and): Re-do error handling for BOZ arguments. Remove
special casing ts.type != BT_INTEGER or BT_LOGICAL.
* decl.c (match_old_style_init): Check for BOZ in old-style
initialization. Issue error or warning depending on
-fallow-invalid-boz option. Issue error if variable is not an
INTEGER or REAL and the value is BOZ.
* expr.c (gfc_copy_expr): Copy a BT_BOZ gfc_expr.
(gfc_check_assign): Re-do error handling for a BOZ in an assignment
statement. Do in-place conversion of RHS based on LHS type of
INTEGER or REAL.
* gfortran.h (gfc_expr): Add a boz component. Remove is_boz component.
(gfc_boz2int, gfc_boz2real, gfc_invalid_boz): New prototypes.
* interface.c (gfc_extend_assign): Guard against replacing an
intrinsic involving a BOZ literal constant on RHS.
* invoke.texi: Document -fallow-invalid-boz.
* lang.opt: New option. -fallow-invalid-boz.
* libgfortran.h (bt): Elevate BOZ to a basic type.
* misc.c (gfc_basic_typename, gfc_typename): Translate BT_BOZ to BOZ.
* primary.c (convert_integer, convert_real, convert_complex): to here.
Rename and make static functions.
* primary.c (match_boz_constant): Rewrite parsing of a BOZ. Re-do
error handling. Deprecate 'X' for hexadecimal and postfix notation.
Use -fallow-invalid-boz and gfc_invalid_boz to accept deprecated code.
* resolve.c (resolve_ordinary_assign): Rework a RHS that is a
BOZ literal constant. Use gfc_invalid_boz to allow previous
nonstandard behavior. Remove range checking of BOZ conversion.
* simplify.c (convert_boz): Remove function.
(simplify_cmplx): Remove conversion of BOZ constants, because
conversion is done in gfc_check_cmplx.
(gfc_simplify_float): Remove conversion of BOZ constant, because
conversion is done in gfc_check_float.
(simplify_intconv): Use gfc_boz2int to convert BOZ to INTEGER.
Remove range checking for BOZ conversion.
(gfc_simplify_real): Use k, if present, to determine kind. Convert
BOZ to REAL. Remove range checking for BOZ conversion.
* target-memory.c (gfc_convert_boz): Rewrite to deal with conversion of
a BOZ to a REAL value.
This is about 32/64b host and multilib support across the range of Darwin
systems.
Prior to Darwin8 (OS X 10.4), the toolchains support only PowerPC and only 32b.
On Darwin8 it is possible to target a 64b multilib, but with support limited
to a few of the main libraries on the system (not a recommended configuration).
From Darwin9 to Darwin17 (OS X 10.5 to 10.13) it is possible to have either
32 or 64b hosted toolchains, with support for a 64 or 32b multilib respectively.
On Darwin9 the kernel is 32b, but with support for 64b executables, so it's
conventional to build a 32b host toolchain supporting a 64b multilib. However
this is not enforced (merely a convention).
There is also some platform hardware supporting Darwin10/11 which is only 32b
and for which the same situation applies. However, from Darwin10 to Darwin17,
the majority of platform hardware supports a 64b kernel and it's conventional
to build a 64b host toolchain with support for a 32b multilib.
On/from Darwin18 (OS X 10.14), the development headers (in the SDK) no longer
expose the interfaces for the 32b multilib support (although sufficient runtime
support remains installed that the testsuite can be run for a 32b multilib).
The PR is raised against this latter situation since the absence of exposed
interfaces causes a 'default' bootstrap to fail regardless of the availability of
the runtimes. Given the number of permutations, I felt it warranted a general
solution, especially since the current scheme of target headers and t-make
fragments has become somewhat messy.
The changes here enforce the single 32b PowerPC multilib for Darwin < 8 and the
single X86 64b multilib for Darwin >= 18. This means that there is no longer
any need to configure Darwin18+ '--disable-multilib', but also that if you want
to use the ability to continue to test the compiler's 32b multilib there, you
need to make a configuration targeting an earlier OS version (and using the
SDK from that).
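The version rules described above can be summarised in a short sketch. The function names are hypothetical and nothing here is taken from config.gcc; it only restates the Darwin-to-OS X numbering and the biarch cut-offs given in the text (single multilib for Darwin < 8 and for Darwin >= 18).

```c
#include <assert.h>
#include <stdbool.h>

/* For the versions named in the text, Darwin N corresponds to
   OS X 10.(N - 4): Darwin8 is 10.4 and Darwin18 is 10.14.  */
int osx_minor_from_darwin(int darwin_major)
{
  assert(darwin_major >= 8 && darwin_major <= 18);
  return darwin_major - 4;
}

/* True if a second (32b or 64b) multilib may be configured.
   Below Darwin8 only the 32b multilib exists, and from Darwin18
   only the 64b one is supported by the SDK headers.  */
bool darwin_allows_biarch(int darwin_major)
{
  return darwin_major >= 8 && darwin_major < 18;
}
```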
gcc/
PR bootstrap/87030
* config.gcc (*-*-darwin*): Don't include CPU t-darwin here.
(i[34567]86-*-darwin*): Adjust to use biarch files. Produce
an error message if i686-darwin configuration is attempted for
Darwin >= 18.
(x86_64-*-darwin*): Switch to single multilib for Darwin >= 18.
(powerpc-*-darwin*): Use biarch files where needed.
(powerpc64-*-darwin*): Likewise.
* config/i386/darwin.h (REAL_LIBGCC_SPEC): Move to new biarch file.
(DARWIN_ARCH_SPEC, DARWIN_SUBARCH_SPEC): Revise for default single
arch case.
* config/i386/darwin32-biarch.h: New.
* config/i386/darwin64.h: Rename.
* gcc/config/i386/darwin64-biarch.h: To this.
* config/i386/t-darwin: Rename.
* gcc/config/i386/t-darwin32-biarch: To this.
* config/i386/t-darwin64: Rename.
* gcc/config/i386/t-darwin64-biarch: To this.
* config/rs6000/darwin32-biarch.h: New.
* config/rs6000/darwin64.h: Rename.
* config/rs6000/darwin64-biarch.h: To this.
(DARWIN_ARCH_SPEC, DARWIN_SUBARCH_SPEC): Revise for default single
arch case.
* config/rs6000/t-darwin8: Rename.
* config/rs6000/t-darwin32-biarch: To this.
* config/rs6000/t-darwin64: Rename.
* config/rs6000/t-darwin64-biarch: To this.
ian [Tue, 23 Jul 2019 17:20:36 +0000 (17:20 +0000)]
compiler: use correct value type in 2-case select send
In the channel-send case, the value to be sent may need an
(implicit) type conversion to the channel element type. This CL
ensures that we use the correct value type for the send.