git.ipfire.org Git - thirdparty/gcc.git/log

mips: add MSA vec_cmp and vec_cmpu expand pattern [PR101132]

Middle-end started to emit vec_cmp and vec_cmpu since GCC 11, causing
ICE on MIPS with MSA enabled.  Add the pattern to prevent it.

gcc/

PR target/101132
* config/mips/mips-protos.h (mips_expand_vec_cmp_expr): Declare.
* config/mips/mips.c (mips_expand_vec_cmp_expr): New function.
* config/mips/mips-msa.md (vec_cmp<MSA:mode><mode_i>): New
  expander.
  (vec_cmpu<IMSA:mode><mode_i>): New expander.

gcc/testsuite/

PR target/101132
* gcc.target/mips/pr101132.c: New test.

c++: Implement P0466R5 __cpp_lib_is_pointer_interconvertible compiler helpers [PR101539]

The following patch attempts to implement the compiler helpers for
libstdc++ std::is_pointer_interconvertible_base_of trait and
std::is_pointer_interconvertible_with_class template function.

For the former __is_pointer_interconvertible_base_of trait that checks first
whether base and derived aren't non-union class types that are the same
ignoring toplevel cv-qualifiers, otherwise if derived is unambiguously
derived from base without cv-qualifiers, derived being a complete type,
and if so, my limited understanding of any derived object being
pointer-interconvertible with base subobject IMHO implies (because one can't
inherit from unions or unions can't inherit) that we check if derived is
standard layout type and we walk bases of derived
recursively, stopping on a class that has any non-static data members and
check if any of the bases is base.  On class with non-static data members
no bases are compared already.
Upon discussions, this is something that maybe should have been changed
in the standard with CWG 2254 and the patch no longer performs this and
assumes all base subobjects of standard-layout class types are
pointer-interconvertible with the whole class objects.

The latter is implemented using a FE
__builtin_is_pointer_interconvertible_with_class, but because on the library
side it will be a template function, the builtin takes ... arguments and
only during folding verifies it has a single argument with pointer to member
type.  The initial errors IMHO can only happen if one uses the builtin
incorrectly by hand, the template function should ensure that it has
exactly a single argument that has pointer to member type.
Otherwise, again with my limited understanding of what
the template function should do and pointer-interconvertibility,
it folds to false for pointer-to-member-function, errors if
basetype of the OFFSET_TYPE is incomplete, folds to false
for non-std-layout non-union basetype, then finds the first non-static
data member in the basetype or its bases (by ignoring
DECL_FIELD_IS_BASE FIELD_DECLs that are empty, recursing into
DECL_FIELD_IS_BASE FIELD_DECLs type that are non-empty (I think
std layout should ensure there is at most one), for unions
checks if membertype is same type as any of the union FIELD_DECLs,
for non-unions the first other FIELD_DECL only, and for anonymous
aggregates similarly (union vs. non-union) but recurses into the
anon aggr types with std layout check for anon structures.  If
membertype doesn't match the type of first non-static data member
(or for unions any of the members), then the builtin folds to false,
otherwise the built folds to a check whether the argument is equal
to OFFSET_TYPE of 0 or not, either at compile time if it is constant
(e.g. for constexpr folding) or at runtime otherwise.

As I wrote in the PR, I've tried my testcases with MSVC on godbolt
that claims to implement it, and https://godbolt.org/z/3PnjM33vM
for the first testcase shows it disagrees with my expectations on
static_assert (std::is_pointer_interconvertible_base_of_v<D, F>);
static_assert (std::is_pointer_interconvertible_base_of_v<E, F>);
static_assert (!std::is_pointer_interconvertible_base_of_v<D, G>);
static_assert (!std::is_pointer_interconvertible_base_of_v<D, I>);
static_assert (std::is_pointer_interconvertible_base_of_v<H, volatile I>);
Is that a bug in my patch or is MSVC buggy on these (or mix thereof)?
https://godbolt.org/z/aYeYnne9d
shows the second testcase, here it differs on:
static_assert (std::is_pointer_interconvertible_with_class<F, int> (&F::b));
static_assert (std::is_pointer_interconvertible_with_class<I, int> (&I::g));
static_assert (std::is_pointer_interconvertible_with_class<L, int> (&L::b));
static_assert (std::is_pointer_interconvertible_with_class (&V::a));
static_assert (std::is_pointer_interconvertible_with_class (&V::b));
Again, my bug, MSVC bug, mix thereof?
According to Jason the <D, G>, <D, I> case are the subject of the
CWG 2254 above discussed change and the rest are likely MSVC bugs.

Oh, and there is another thing, the standard has an example:
struct A { int a; };                    // a standard-layout class
struct B { int b; };                    // a standard-layout class
struct C: public A, public B { };       // not a standard-layout class

static_assert( is_pointer_interconvertible_with_class( &C::b ) );
  // Succeeds because, despite its appearance, &C::b has type
  // “pointer to member of B of type int”.
static_assert( is_pointer_interconvertible_with_class<C>( &C::b ) );
  // Forces the use of class C, and fails.
It seems to work as written with MSVC (second assertion fails),
but fails with GCC with the patch:
/tmp/1.C:22:57: error: no matching function for call to ‘is_pointer_interconvertible_with_class<C>(int B::*)’
   22 | static_assert( is_pointer_interconvertible_with_class<C>( &C::b ) );
      |                ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~
/tmp/1.C:8:1: note: candidate: ‘template<class S, class M> constexpr bool std::is_pointer_interconvertible_with_class(M S::*)’
    8 | is_pointer_interconvertible_with_class (M S::*m) noexcept
      | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/1.C:8:1: note:   template argument deduction/substitution failed:
/tmp/1.C:22:57: note:   mismatched types ‘C’ and ‘B’
   22 | static_assert( is_pointer_interconvertible_with_class<C>( &C::b ) );
      |                ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~
the second int argument isn't deduced.

This boils down to:
template <class S, class M>
bool foo (M S::*m) noexcept;
struct A { int a; };
struct B { int b; };
struct C : public A, public B {};
bool a = foo (&C::b);
bool b = foo<C, int> (&C::b);
bool c = foo<C> (&C::b);
which with /std:c++20 or -std=c++20 is accepted by latest MSVC and ICC but
rejected by GCC and clang (in both cases on the last line).
Is this a GCC/clang bug in argument deduction (in that case I think we want
a separate PR), or a bug in ICC/MSVC and the standard itself that should
specify in the examples both template arguments instead of just the first?
And this has been raised with the CWG.

2021-07-30  Jakub Jelinek  <jakub@redhat.com>

PR c++/101539
gcc/c-family/
* c-common.h (enum rid): Add RID_IS_POINTER_INTERCONVERTIBLE_BASE_OF.
* c-common.c (c_common_reswords): Add
__is_pointer_interconvertible_base_of.
gcc/cp/
* cp-tree.h (enum cp_trait_kind): Add
CPTK_IS_POINTER_INTERCONVERTIBLE_BASE_OF.
(enum cp_built_in_function): Add
CP_BUILT_IN_IS_POINTER_INTERCONVERTIBLE_WITH_CLASS.
(fold_builtin_is_pointer_inverconvertible_with_class): Declare.
* parser.c (cp_parser_primary_expression): Handle
RID_IS_POINTER_INTERCONVERTIBLE_BASE_OF.
(cp_parser_trait_expr): Likewise.
* cp-objcp-common.c (names_builtin_p): Likewise.
* constraint.cc (diagnose_trait_expr): Handle
CPTK_IS_POINTER_INTERCONVERTIBLE_BASE_OF.
* decl.c (cxx_init_decl_processing): Register
__builtin_is_pointer_interconvertible_with_class builtin.
* constexpr.c (cxx_eval_builtin_function_call): Handle
CP_BUILT_IN_IS_POINTER_INTERCONVERTIBLE_WITH_CLASS builtin.
* semantics.c (pointer_interconvertible_base_of_p,
first_nonstatic_data_member_p,
fold_builtin_is_pointer_inverconvertible_with_class): New functions.
(trait_expr_value): Handle CPTK_IS_POINTER_INTERCONVERTIBLE_BASE_OF.
(finish_trait_expr): Likewise.  Formatting fix.
* cp-gimplify.c (cp_gimplify_expr): Fold
CP_BUILT_IN_IS_POINTER_INTERCONVERTIBLE_WITH_CLASS.  Call
fndecl_built_in_p just once.
(cp_fold): Likewise.
* tree.c (builtin_valid_in_constant_expr_p): Handle
CP_BUILT_IN_IS_POINTER_INTERCONVERTIBLE_WITH_CLASS.  Call
fndecl_built_in_p just once.
* cxx-pretty-print.c (pp_cxx_trait_expression): Handle
CPTK_IS_POINTER_INTERCONVERTIBLE_BASE_OF.
gcc/testsuite/
* g++.dg/cpp2a/is-pointer-interconvertible-base-of1.C: New test.
* g++.dg/cpp2a/is-pointer-interconvertible-with-class1.C: New test.
* g++.dg/cpp2a/is-pointer-interconvertible-with-class2.C: New test.
* g++.dg/cpp2a/is-pointer-interconvertible-with-class3.C: New test.
* g++.dg/cpp2a/is-pointer-interconvertible-with-class4.C: New test.
* g++.dg/cpp2a/is-pointer-interconvertible-with-class5.C: New test.
* g++.dg/cpp2a/is-pointer-interconvertible-with-class6.C: New test.

c++: Reject anonymous struct with bases

In discussion of jakub's patch for C++20 pointer-interconvertibility, it
came up that we allow anonymous structs to have bases, but don't do anything
usable with them. Let's reject it.

The comment change is something I noticed while looking for the right place
to diagnose this: finish_struct_anon does not actually check for anything
invalid, so it shouldn't claim to.

gcc/cp/ChangeLog:

* class.c (finish_struct_anon): Improve comment.
* decl.c (fixup_anonymous_aggr): Reject anonymous struct
with bases.

gcc/testsuite/ChangeLog:

* g++.dg/ext/anon-struct8.C: New test.

c++: Fix up attribute rollbacks in cp_parser_statement

During the OpenMP directives using C++ attribute syntax work, I've noticed
that cp_parser_statement when parsing various block declarations that do
not allow attribute-specifier-seq at the start rolls back the attributes
only if std_attrs is non-NULL (i.e. some attributes have been parsed),
but doesn't roll back if some tokens were parsed as attribute-specifier-seq,
but didn't yield any attributes (e.g. [[]][[]][[]][[]]), which means
we accept those empty attributes even in places where they don't appear
in the grammar.

The following patch fixes that by instead checking if there are any
tokens to roll back.  This makes the parsing handle the first
function the same as the second one (where some attribute appears).

The testcase contains two xfails, using namespace ... apparently
allows attributes at the start and the attributes shall appeartain to
using in that case.  To be fixed incrementally.

2021-07-30  Jakub Jelinek  <jakub@redhat.com>

* parser.c (cp_parser_statement): Rollback attributes not just
when std_attrs is non-NULL, but whenever
cp_parser_std_attribute_spec_seq parsed any tokens.

* g++.dg/cpp0x/gen-attrs-76.C: New test.

Update gcc de.po.

* de.po: Update.

x86: Don't enable LZCNT/POPCNT if disabled explicitly

gcc/

PR target/101685
* config/i386/i386-options.c (ix86_option_override_internal):
Don't enable LZCNT/POPCNT if they have been disabled explicitly.

gcc/testsuite/

PR target/101685
* gcc.target/i386/pr101685.c: New test.

d: Remove dead code from binary_op.

The front-end ensures that both sides have been casted to the same type
before being given to the lowering pass.

gcc/d/ChangeLog:

* expr.cc (binary_op): Remove dead code.

d: Always layout initializer for the m_RTInfo field in TypeInfo_Class

Makes it explicit that the default value is set to NULL.

gcc/d/ChangeLog:

* typeinfo.cc (TypeInfoVisitor::visit (TypeInfoClassDeclaration *)):
Always layout initializer for the m_RTInfo field.

d: Don't generate a PREDICT_EXPR when assert contracts are turned off.

This expression is just discarded by add_stmt, so never reaches the
middle-end.

gcc/d/ChangeLog:

* expr.cc (ExprVisitor::visit (AssertExp *)): Don't generate
PREDICT_EXPR.

d: Clarify comment for generating static array assignment with literal.

The code block is done as an optimization to elide a call to the runtime
library helpers _d_arrayctor or _d_arrayassign.

gcc/d/ChangeLog:

* expr.cc (ExprVisitor::visit (AssignExp *)): Clarify comment
for generating static array assignment with literal.

d: Only handle named enums in enum_initializer_decl

Anonymous enums neither generate an initializer nor typeinfo symbol, so
it's safe to assert that all enum declarations passed to this function
always have an identifier.

gcc/d/ChangeLog:

* decl.cc (enum_initializer_decl): Only handle named enums.

d: Set COMDAT and visibility of thunks only if they are public.

It is not expected to have a member function that can be non-public, but
this guards against any internal errors that might occur should that
ever change in the front-end.

gcc/d/ChangeLog:

* decl.cc (make_thunk): Set COMDAT and visibility of thunks only if
they are public.

d: Factor aggregate_initializer_decl to set the sinit for aggregate declarations.

The self-hosted implementation of the D front-end changes the type of
`sinit' to a void pointer, which requires an explicit cast to `tree'.

gcc/d/ChangeLog:

* decl.cc (DeclVisitor::visit (StructDeclaration *)): Don't use sinit
for declaration directly.
(DeclVisitor::visit (ClassDeclaration *)): Likewise.
(aggregate_initializer_decl): Likewise. Set sinit after creating.

d: Use Identifier::idPool to generate anonymous field name.

The self-hosted implementation of the D front-end does not export
Identifier::generateId, so handle name generation inline instead.

gcc/d/ChangeLog:

* d-builtins.cc (build_frontend_type): Use Identifier::idPool to
generate anonymous field name.

d: Use hasMonitor to determine whether to emit a __monitor field in D classes

This helper introduced by the front-end is a better gate, and allows the
front-end to change rules for what gets a monitor in the future.

gcc/d/ChangeLog:

* types.cc (layout_aggregate_type): Call hasMonitor.
* typeinfo.cc (TypeInfoVisitor::layout_base): Likewise.
(layout_cpp_typeinfo): Likewise. Don't emit vtable unless
have_typeinfo_p.

d: Insert null terminator in obstack buffers

Covers cases where functions that handle the extracted strings ignore
the explicit length. This isn't something that's known to happen in the
current front-end, but the self-hosted front-end has been observed to do
this in its conversions between D and C-style strings.

gcc/d/ChangeLog:

* d-lang.cc (deps_add_target): Insert null terminator in buffer.
(deps_write): Likewise.
(d_parse_file): Likewise.

d: Drop any field or parameter types that got cached before conversion failed.

This ensures there are no dangling references to AST members that have
been freed, either explcitly or by the garbage collector.

gcc/d/ChangeLog:

* d-builtins.cc (build_frontend_type): Restore builtin_converted_decls
length on conversion failure.

d: Factor d_nested_class and d_nested_struct into single function.

Both do the exact same operation, just on different AST nodes.

gcc/d/ChangeLog:

* d-codegen.cc (d_nested_class): Rename to ...
(get_outer_function): ... this. Handle all aggregate declarations.
(d_nested_struct): Remove.
(find_this_tree): Use get_outer_function.
(get_framedecl): Likewise.

Mark gcc.dg/shrink-wrap-loop.c as XFAIL.

It occurs to me that I should not have disabled early jump threading in
this test, as it may hide an actual defect.  I have reverted my change
and XFAILed the test instead.  I have also opened a PR101690 to keep track
of this problem.

gcc/testsuite/ChangeLog:

* gcc.dg/shrink-wrap-loop.c: Enable early jump threading.  Mark as
XFAIL.

[libgomp] Restore offloading 'libgomp/fortran.c'

GCN:

    ld: error: undefined symbol: gomp_ialias_omp_display_env
    >>> referenced by fortran.c:744 ([...]/source-gcc/libgomp/fortran.c:744)
    >>>               fortran.o:(omp_display_env_) in archive [...]/build-gcc-offload-amdgcn-amdhsa/amdgcn-amdhsa/libgomp/.libs/libgomp.a
    >>> referenced by fortran.c:744 ([...]/source-gcc/libgomp/fortran.c:744)
    >>>               fortran.o:(omp_display_env_) in archive [...]/build-gcc-offload-amdgcn-amdhsa/amdgcn-amdhsa/libgomp/.libs/libgomp.a
    >>> referenced by fortran.c:750 ([...]/source-gcc/libgomp/fortran.c:750)
    >>>               fortran.o:(omp_display_env_8_) in archive [...]/build-gcc-offload-amdgcn-amdhsa/amdgcn-amdhsa/libgomp/.libs/libgomp.a
    >>> referenced by fortran.c:750 ([...]/source-gcc/libgomp/fortran.c:750)
    >>>               fortran.o:(omp_display_env_8_) in archive [...]/build-gcc-offload-amdgcn-amdhsa/amdgcn-amdhsa/libgomp/.libs/libgomp.a
    collect2: error: ld returned 1 exit status
    mkoffload: fatal error: build-gcc/gcc/x86_64-pc-linux-gnu-accel-amdgcn-amdhsa-gcc returned 1 exit status

nvptx:

    unresolved symbol omp_display_env
    collect2: error: ld returned 1 exit status
    mkoffload: fatal error: [...]/build-gcc/./gcc/x86_64-pc-linux-gnu-accel-nvptx-none-gcc returned 1 exit status

Fix-up for commit 7123ae2455b5a1a2f19f13fa82c377cfda157f23
"Implement OpenMP 5.1 section 3.15: omp_display_env".

libgomp/
* fortran.c (omp_display_env_, omp_display_env_8_): Only
'#ifndef LIBGOMP_OFFLOADED_ONLY'.

Co-Authored-By: Ulrich Drepper <drepper@redhat.com>

arm/66791: Replace builtins in vld1.

gcc/ChangeLog:

PR target/66791
* config/arm/arm_neon.h (vld1_p64): Replace call to builtin by
explicitly dereferencing __a.
(vld1_s64): Likewise.
(vld1_u64): Likewise.
* config/arm/arm_neon_builtins.def (vld1): Remove entry for di
and change to VAR13.

Replace evrp use in loop versioning with ranger.

This patch replaces the evrp_range_analyzer in the loop versioning code
with an on-demand ranger.

Tested on x86-64 Linux.

gcc/ChangeLog:

* gimple-loop-versioning.cc (lv_dom_walker::lv_dom_walker): Remove
use of m_range_analyzer.
(loop_versioning::lv_dom_walker::before_dom_children): Same.
(loop_versioning::lv_dom_walker::after_dom_children): Remove.
(loop_versioning::prune_loop_conditions): Replace vr_values use
with range_query interface.
(pass_loop_versioning::execute): Use ranger.

c++: Accept C++11 attribute-definition [PR101582]

As the following testcase shows, we don't parse properly
C++11 attribute-declaration:
https://eel.is/c++draft/dcl.dcl#nt:attribute-declaration

cp_parser_toplevel_declaration just handles empty-declaration parsing
(with diagnostics for C++98) and otherwise calls cp_parser_declaration
which on it calls cp_parser_simple_declaration and rejects it with
"does not declare anything" permerror.

The following patch moves the handling of empty-declaration from
cp_parser_toplevel_declaration to cp_parser_declaration and
handles attribute-declaration in cp_parser_declaration
by parsing the attributes (standard ones only, we've never supported
__attribute__((...)); at namespace scope, so I'm not sure we need to
introduce that), which for C++98 emits the needed diagnostics, and then
warning if there are any attributes that we throw away on the floor.

I'll need this later for OpenMP directives at namespace scope, e.g.
[[omp::directive (requires, atomic_default_mem_order(seq_cst))]];
should be valid at namespace scope (and many other directives).

2021-07-30 Jakub Jelinek <jakub@redhat.com>

PR c++/101582
* parser.c (cp_parser_skip_std_attribute_spec_seq): Add a forward
declaration.
(cp_parser_declaration): Parse empty-declaration and
attribute-declaration.
(cp_parser_toplevel_declaration): Don't parse empty-declaration here.

* g++.dg/cpp0x/gen-attrs-45.C: Expect a warning about ignored
attributes instead of error.
* g++.dg/cpp0x/gen-attrs-75.C: New test.
* g++.dg/modules/pr101582-1.C: New test.

ipa-devirt: check precision mismatch of enum values [PR101396]

We are comparing enum values (in wide_int) to check ODR violation.
However, if we compare two wide_int values with different precision,
we'll trigger an assert, leading to ICE. With enum-base introduced
in C++11, it's easy to sink into this situation.

To fix the issue, we need to explicitly check this kind of mismatch,
and emit a proper warning message if there is such one.

gcc/

PR ipa/101396
* ipa-devirt.c (ipa_odr_read_section): Compare the precision of
enum values, and emit a warning if they mismatch.

gcc/testsuite/

PR ipa/101396
* g++.dg/lto/pr101396_0.C: New test.
* g++.dg/lto/pr101396_1.C: New test.

Use range-based for loops for traversing loops

This patch follows Martin's suggestion here[1], to support
range based loop for iterating loops, analogously to the
patch for vec[2].

For example, use below range-based for loop

for (auto loop : loops_list (cfun, 0))

to replace the previous macro FOR_EACH_LOOP

FOR_EACH_LOOP (loop, 0)

[1] https://gcc.gnu.org/pipermail/gcc-patches/2021-June/573424.html
[2] https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572315.html

gcc/ChangeLog:

* cfgloop.h (as_const): New function.
(class loop_iterator): Rename to ...
(class loops_list): ... this.
(loop_iterator::next): Rename to ...
(loops_list::Iter::fill_curr_loop): ... this and adjust.
(loop_iterator::loop_iterator): Rename to ...
(loops_list::loops_list): ... this and adjust.
(loops_list::Iter): New class.
(loops_list::iterator): New type.
(loops_list::const_iterator): New type.
(loops_list::begin): New function.
(loops_list::end): Likewise.
(loops_list::begin const): Likewise.
(loops_list::end const): Likewise.
(FOR_EACH_LOOP): Remove.
(FOR_EACH_LOOP_FN): Remove.
* cfgloop.c (flow_loops_dump): Adjust FOR_EACH_LOOP* with range-based
for loop with loops_list instance.
(sort_sibling_loops): Likewise.
(disambiguate_loops_with_multiple_latches): Likewise.
(verify_loop_structure): Likewise.
* cfgloopmanip.c (create_preheaders): Likewise.
(force_single_succ_latches): Likewise.
* config/aarch64/falkor-tag-collision-avoidance.c
(execute_tag_collision_avoidance): Likewise.
* config/mn10300/mn10300.c (mn10300_scan_for_setlb_lcc): Likewise.
* config/s390/s390.c (s390_adjust_loops): Likewise.
* doc/loop.texi: Likewise.
* gimple-loop-interchange.cc (pass_linterchange::execute): Likewise.
* gimple-loop-jam.c (tree_loop_unroll_and_jam): Likewise.
* gimple-loop-versioning.cc (loop_versioning::analyze_blocks): Likewise.
(loop_versioning::make_versioning_decisions): Likewise.
* gimple-ssa-split-paths.c (split_paths): Likewise.
* graphite-isl-ast-to-gimple.c (graphite_regenerate_ast_isl): Likewise.
* graphite.c (canonicalize_loop_form): Likewise.
(graphite_transform_loops): Likewise.
* ipa-fnsummary.c (analyze_function_body): Likewise.
* ipa-pure-const.c (analyze_function): Likewise.
* loop-doloop.c (doloop_optimize_loops): Likewise.
* loop-init.c (loop_optimizer_finalize): Likewise.
(fix_loop_structure): Likewise.
* loop-invariant.c (calculate_loop_reg_pressure): Likewise.
(move_loop_invariants): Likewise.
* loop-unroll.c (decide_unrolling): Likewise.
(unroll_loops): Likewise.
* modulo-sched.c (sms_schedule): Likewise.
* predict.c (predict_loops): Likewise.
(pass_profile::execute): Likewise.
* profile.c (branch_prob): Likewise.
* sel-sched-ir.c (sel_finish_pipelining): Likewise.
(sel_find_rgns): Likewise.
* tree-cfg.c (replace_loop_annotate): Likewise.
(replace_uses_by): Likewise.
(move_sese_region_to_fn): Likewise.
* tree-if-conv.c (pass_if_conversion::execute): Likewise.
* tree-loop-distribution.c (loop_distribution::execute): Likewise.
* tree-parloops.c (parallelize_loops): Likewise.
* tree-predcom.c (tree_predictive_commoning): Likewise.
* tree-scalar-evolution.c (scev_initialize): Likewise.
(scev_reset): Likewise.
* tree-ssa-dce.c (find_obviously_necessary_stmts): Likewise.
* tree-ssa-live.c (remove_unused_locals): Likewise.
* tree-ssa-loop-ch.c (ch_base::copy_headers): Likewise.
* tree-ssa-loop-im.c (analyze_memory_references): Likewise.
(tree_ssa_lim_initialize): Likewise.
* tree-ssa-loop-ivcanon.c (canonicalize_induction_variables): Likewise.
* tree-ssa-loop-ivopts.c (tree_ssa_iv_optimize): Likewise.
* tree-ssa-loop-manip.c (get_loops_exits): Likewise.
* tree-ssa-loop-niter.c (estimate_numbers_of_iterations): Likewise.
(free_numbers_of_iterations_estimates): Likewise.
* tree-ssa-loop-prefetch.c (tree_ssa_prefetch_arrays): Likewise.
* tree-ssa-loop-split.c (tree_ssa_split_loops): Likewise.
* tree-ssa-loop-unswitch.c (tree_ssa_unswitch_loops): Likewise.
* tree-ssa-loop.c (gate_oacc_kernels): Likewise.
(pass_scev_cprop::execute): Likewise.
* tree-ssa-propagate.c (clean_up_loop_closed_phi): Likewise.
* tree-ssa-sccvn.c (do_rpo_vn): Likewise.
* tree-ssa-threadupdate.c
(jump_thread_path_registry::thread_through_all_blocks): Likewise.
* tree-vectorizer.c (vectorize_loops): Likewise.
* tree-vrp.c (vrp_asserts::find_assert_locations): Likewise.

fix breakage from "libstdc++: Remove unnecessary uses of <utility>"

Commit r12-2534 was incomplete and (by inspection derived
from an MMIX build) failing for targets without an insn for
compare_and_swap for pointer-size objects, IOW for targets
for which "ATOMIC_POINTER_LOCK_FREE != 2" is true:

x/gcc/libstdc++-v3/src/c++17/memory_resource.cc: In member function
'std::pmr::memory_resource*
std::pmr::{anonymous}::atomic_mem_res::exchange(std::pmr::memory_resource*)':
x/gcc/libstdc++-v3/src/c++17/memory_resource.cc:140:21: error:
'exchange' is not a member of 'std'
140 | return std::exchange(val, r);
| ^~~~~~~~
make[5]: *** [Makefile:577: memory_resource.lo] Error 1
make[5]: Leaving directory
'/home/hp/tmp/newmmix-r12-2579-p3/gccobj/mmix/libstdc++-v3/src/c++17'

This fix was derived from edits elsewhere in that patch.

Tested mmix-knuth-mmixware, restoring build (together with
target-reviving patches as MMIX is currently and at that commit
broken for target-specific reasons).

libstdc++-v3/:
* src/c++17/memory_resource.cc: Use __exchange instead
of std::exchange.

Fix MMIX breakage; ICE in df_ref_record, at df-scan.c:2598

This bug made me dive into some of the murkier waters of gcc, namely
the source of operand 2 to the "call" pattern.  It can be pretty
poisonous, but is unused (either directly or later) by most targets.

The target function_arg (and function_incoming_arg), can unless
specially handled, cause a VOIDmode reg RTX to be generated, for the
function arguments end-marker.  This is then passed on by expand_call
to the target "call" pattern, as operand[2] (which is wrongly
documented or wrongly implemented, see comment in mmix.c) but unused
by most targets that do not handle it specially, as in operand 2 not
making it into the insn generated for the "call" (et al) patterns.  Of
course, the MMIX port stands out here: the RTX makes it into the
generated RTX but is then actually unused and is just a placeholder;
see mmix_print_operand 'p'.

Anyway, df-scan inspects the emitted call rtx and horks on the
void-mode RTX (actually: that it represents a zero-sized register
range) from r12-1702.

While I could replace or remove the emitted unused call insn operand,
that would still leave unusable rtx to future users of function_arg
actually looking for next_arg_reg.  Better replace VOIDmode with
DImode here; that's the "natural" mode of MMIX registers.

(As a future improvement, I'll also remove the placeholder argument
and replace the intended user; the print_operand output modifier 'p'
modifier (as in "PUSHJ $%p2,%0") with some punctuation, perhaps '!'
(as in "PUSHJ $%!,%0").

I inspected all ports, but other targets emit a special
function_arg_info::end_marker cookie or just don't emit "call"
operand[2] (etc) in the expanded "call" pattern.

gcc:
* config/mmix/mmix.c (mmix_function_arg_1): Avoid
generating a VOIDmode register for e.g the
function_arg_info::end_marker.

Update gcc .po files.

* be.po, da.po, de.po, el.po, es.po, fi.po, fr.po, hr.po, id.po,
ja.po, nl.po, ru.po, sr.po, sv.po, zh_CN.po, zh_TW.po: Update.

Reinstate branch-on-bit insns for H8

gcc/
* config/h8300/h8300-modes.def: Add CCZ, CCV and CCC, drop CCZNV.
* config/h8300/h8300.md (H8cc mode iterator): Add CCZ.
(cc mode_attr): Similarly.
(ccz subst_attr): Similarly.
* config/h8300/jumpcall.md: Add new patterns for branch-on-bit.
* config/h8300/testcompare.md: Remove various cc0 based patterns
that had been commented out. Add pattern to set CCZ from a bit
test.

Xfail just the failing assertion and correct target.

Related to:
PR middle-end/101674 - gcc.dg/uninit-pred-9_b.c fails after jump threading rewrite

gcc/testsuite:
PR middle-end/101674
* gcc.dg/uninit-pred-9_b.c: Xfail just the failing assertion and
correct target.

d: Generate Object class if it doesn't exist during TypeInfo emission (PR101672)

Having a stub will prevent errors from occuring when compiling D code
with an empty object.d. Though if it were to actually be used
implicitly then an error should occur.

PR d/101672

gcc/d/ChangeLog:

* typeinfo.cc (make_frontend_typeinfo): Generate Object class if it
doesn't exist.
(check_typeinfo_type): Don't warn if there's no location.

gcc/testsuite/ChangeLog:

* gdc.dg/pr100967.d: Update test.
* gdc.dg/pr101672.d: New test.

d: Return the correct value for C++ constructor calls (PR101664)

C++ constructors return void, even though the front-end semantic treats
them as implicitly returning `this'. To handle this correctly, the
object reference is cached and used as the result of the expression.

PR d/101664

gcc/d/ChangeLog:

* expr.cc (ExprVisitor::visit (CallExp *)): Use object expression as
result for C++ constructor calls.

gcc/testsuite/ChangeLog:

* gdc.dg/extern-c++/extern-c++.exp: New.
* gdc.dg/extern-c++/pr101664.d: New test.
* gdc.dg/extern-c++/pr101664_1.cc: New test.

d: Ensure casting from bool results in either 0 or 1 (PR96435)

If casting from bool, the result is either 0 or 1, any other value
violates @safe code, so enforce that it is never invalid.

PR d/96435

gcc/d/ChangeLog:

* d-convert.cc (convert_for_rvalue): New function.
* d-tree.h (convert_for_rvalue): Declare.
* expr.cc (ExprVisitor::visit (CastExp *)): Use convert_for_rvalue.
(build_return_dtor): Likewise.

gcc/testsuite/ChangeLog:

* gdc.dg/torture/pr96435.d: New test.

d: Remove generated D header files on error (PR101657)

If an error occurs later during compilation, remember that we generated
the headers, so that they can be removed before exit.

PR d/101657

gcc/d/ChangeLog:

* d-lang.cc (d_parse_file): Remove generated D header files on error.

gcc/testsuite/ChangeLog:

* gdc.dg/pr101657.d: New test.

d: Don't escape quoted format strings in escape_d_format (PR101656)

If the format string is enclosed by two '`' characters, then don't
escape the first and laster characters.

PR d/101656

gcc/d/ChangeLog:

* d-diagnostic.cc (escape_d_format): Don't escape quoted format
strings.

testsuite: Fix up two tests for recent libstdc++ header changes [PR101647]

After recent libstdc++ header changes <functional> no longer includes
(parts of?) <array> and doesn't have to and <memory> no longer includes
(parts of?) <initializer_list>.
This patch fixes:
testsuite/g++.dg/pr71389.C:10:39: error: aggregate 'std::array<std::array<int, 16>, 16> v13' has incomplete type and cannot be defined
as well as
testsuite/g++.dg/cpp0x/initlist48.C:11:6: error: 'initializer_list' in namespace 'std' does not name a template type; did you mean 'uninitialized_fill'?

2021-07-29 Jakub Jelinek <jakub@redhat.com>

PR testsuite/101647
* g++.dg/pr71389.C: Include <array> instead of <functional>.
* g++.dg/cpp0x/initlist48.C: Include also <initializer_list>.

[OpenACC] Extract 'pass_oacc_loop_designation' out of 'pass_oacc_device_lower'

This really is a separate step -- and another pass to be added between the two,
later on.

gcc/
* omp-offload.c (oacc_loop_xform_head_tail, oacc_loop_process):
'update_stmt' after modification.
(pass_oacc_loop_designation): New function, extracted out of...
(pass_oacc_device_lower): ... this.
(pass_data_oacc_loop_designation, pass_oacc_loop_designation)
(make_pass_oacc_loop_designation): New
* passes.def: Add it.
* tree-parloops.c (create_parallel_loop): Adjust.
* tree-pass.h (make_pass_oacc_loop_designation): New.
gcc/testsuite/
* c-c++-common/goacc/classify-kernels-unparallelized.c:
's%oaccdevlow%oaccloops%g'.
* c-c++-common/goacc/classify-kernels.c: Likewise.
* c-c++-common/goacc/classify-parallel.c: Likewise.
* c-c++-common/goacc/classify-routine-nohost.c: Likewise.
* c-c++-common/goacc/classify-routine.c: Likewise.
* c-c++-common/goacc/classify-serial.c: Likewise.
* c-c++-common/goacc/routine-nohost-1.c: Likewise.
* g++.dg/goacc/template.C: Likewise.
* gcc.dg/goacc/loop-processing-1.c: Likewise.
* gfortran.dg/goacc/classify-kernels-unparallelized.f95: Likewise.
* gfortran.dg/goacc/classify-kernels.f95: Likewise.
* gfortran.dg/goacc/classify-parallel.f95: Likewise.
* gfortran.dg/goacc/classify-routine-nohost.f95: Likewise.
* gfortran.dg/goacc/classify-routine.f95: Likewise.
* gfortran.dg/goacc/classify-serial.f95: Likewise.
* gfortran.dg/goacc/routine-multiple-directives-1.f90: Likewise.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/pr85486-2.c:
's%oaccdevlow%oaccloops%g'.
* testsuite/libgomp.oacc-c-c++-common/pr85486-3.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/pr85486.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/routine-nohost-1.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-1.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-2.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-3.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-4.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-5.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-6.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-7.c:
Likewise.
* testsuite/libgomp.oacc-fortran/routine-nohost-1.f90: Likewise.

Co-Authored-By: Julian Brown <julian@codesourcery.com>
Co-Authored-By: Kwok Cheung Yeung <kcy@codesourcery.com>

Fix failed test cases caused by disabling mode promotion for pseudos [PR100952]

gcc/testsuite
PR target/100952
* gcc.target/powerpc/pr56605.c: Change matching
conditions.
* gcc.target/powerpc/pr81348.c: Likewise.

Backwards jump threader rewrite with ranger.

This is a rewrite of the backwards threader with a ranger based solver.

The code is divided into two parts: the path solver in
gimple-range-path.*, and the path discovery bits in
tree-ssa-threadbackward.c.

The legacy code is still available with --param=threader-mode=legacy,
but will be removed shortly after.

gcc/ChangeLog:

* Makefile.in (tree-ssa-loop-im.o-warn): New.
* flag-types.h (enum threader_mode): New.
* params.opt: Add entry for --param=threader-mode.
* tree-ssa-threadbackward.c (THREADER_ITERATIVE_MODE): New.
(class back_threader): New.
(back_threader::back_threader): New.
(back_threader::~back_threader): New.
(back_threader::maybe_register_path): New.
(back_threader::find_taken_edge): New.
(back_threader::find_taken_edge_switch): New.
(back_threader::find_taken_edge_cond): New.
(back_threader::resolve_def): New.
(back_threader::resolve_phi): New.
(back_threader::find_paths_to_names): New.
(back_threader::find_paths): New.
(dump_path): New.
(debug): New.
(thread_jumps::find_jump_threads_backwards): Call ranger threader.
(thread_jumps::find_jump_threads_backwards_with_ranger): New.
(pass_thread_jumps::execute): Abstract out code...
(try_thread_blocks): ...here.
* tree-ssa-threadedge.c (jump_threader::thread_outgoing_edges):
Abstract out threading candidate code to...
(single_succ_to_potentially_threadable_block): ...here.
* tree-ssa-threadedge.h (single_succ_to_potentially_threadable_block):
New.
* tree-ssa-threadupdate.c (register_jump_thread): Return boolean.
* tree-ssa-threadupdate.h (class jump_thread_path_registry):
Return bool from register_jump_thread.

libgomp/ChangeLog:

* testsuite/libgomp.graphite/force-parallel-4.c: Adjust for
threader.
* testsuite/libgomp.graphite/force-parallel-8.c: Same.

gcc/testsuite/ChangeLog:

* g++.dg/debug/dwarf2/deallocator.C: Adjust for threader.
* gcc.c-torture/compile/pr83510.c: Same.
* dg.dg/analyzer/pr94851-2.c: Same.
* gcc.dg/loop-unswitch-2.c: Same.
* gcc.dg/old-style-asm-1.c: Same.
* gcc.dg/pr68317.c: Same.
* gcc.dg/pr97567-2.c: Same.
* gcc.dg/predict-9.c: Same.
* gcc.dg/shrink-wrap-loop.c: Same.
* gcc.dg/sibcall-1.c: Same.
* gcc.dg/tree-ssa/builtin-sprintf-3.c: Same.
* gcc.dg/tree-ssa/pr21001.c: Same.
* gcc.dg/tree-ssa/pr21294.c: Same.
* gcc.dg/tree-ssa/pr21417.c: Same.
* gcc.dg/tree-ssa/pr21458-2.c: Same.
* gcc.dg/tree-ssa/pr21563.c: Same.
* gcc.dg/tree-ssa/pr49039.c: Same.
* gcc.dg/tree-ssa/pr61839_1.c: Same.
* gcc.dg/tree-ssa/pr61839_3.c: Same.
* gcc.dg/tree-ssa/pr77445-2.c: Same.
* gcc.dg/tree-ssa/split-path-4.c: Same.
* gcc.dg/tree-ssa/ssa-dom-thread-11.c: Same.
* gcc.dg/tree-ssa/ssa-dom-thread-12.c: Same.
* gcc.dg/tree-ssa/ssa-dom-thread-14.c: Same.
* gcc.dg/tree-ssa/ssa-dom-thread-18.c: Same.
* gcc.dg/tree-ssa/ssa-dom-thread-6.c: Same.
* gcc.dg/tree-ssa/ssa-dom-thread-7.c: Same.
* gcc.dg/tree-ssa/ssa-fre-48.c: Same.
* gcc.dg/tree-ssa/ssa-thread-11.c: Same.
* gcc.dg/tree-ssa/ssa-thread-12.c: Same.
* gcc.dg/tree-ssa/ssa-thread-14.c: Same.
* gcc.dg/tree-ssa/vrp02.c: Same.
* gcc.dg/tree-ssa/vrp03.c: Same.
* gcc.dg/tree-ssa/vrp05.c: Same.
* gcc.dg/tree-ssa/vrp06.c: Same.
* gcc.dg/tree-ssa/vrp07.c: Same.
* gcc.dg/tree-ssa/vrp09.c: Same.
* gcc.dg/tree-ssa/vrp19.c: Same.
* gcc.dg/tree-ssa/vrp20.c: Same.
* gcc.dg/tree-ssa/vrp33.c: Same.
* gcc.dg/uninit-pred-9_b.c: Same.
* gcc.dg/uninit-pr61112.c: Same.
* gcc.dg/vect/bb-slp-16.c: Same.
* gcc.target/i386/avx2-vect-aggressive.c: Same.
* gcc.dg/tree-ssa/ranger-threader-1.c: New test.
* gcc.dg/tree-ssa/ranger-threader-2.c: New test.
* gcc.dg/tree-ssa/ranger-threader-3.c: New test.
* gcc.dg/tree-ssa/ranger-threader-4.c: New test.
* gcc.dg/tree-ssa/ranger-threader-5.c: New test.

c/101512 - fix missing address-taking in c_common_mark_addressable_vec

c_common_mark_addressable_vec fails to look through C_MAYBE_CONST_EXPR
in the case it isn't at the toplevel.

2021-07-21 Richard Biener <rguenther@suse.de>

PR c/101512
gcc/c-family/
* c-common.c (c_common_mark_addressable_vec): Look through
C_MAYBE_CONST_EXPR even if not at the toplevel.

gcc/testsuite/
* gcc.dg/torture/pr101512.c: New testcase.

Adjust docu of TARGET_VECTORIZE_VEC_PERM_CONST

gcc/ChangeLog:

* target.def: in0 and in1 do not need to be registers.
* doc/tm.texi: Regenerate.

analyzer: : Refactor callstring to work with pairs of supernodes.

2021-07-25 Ankur Saini <arsenic@sourceware.org>

gcc/analyzer/ChangeLog:
* call-string.cc (call_string::element_t::operator==): New operator.
(call_String::element_t::operator!=): New operator.
(call_string::element_t::get_caller_function): New function.
(call_string::element_t::get_callee_function): New function.
(call_string::call_string): Refactor to Initialise m_elements.
(call_string::operator=): Refactor to work with m_elements.
(call_string::operator==): Likewise.
(call_string::to_json): Likewise.
(call_string::hash): Refactor to hash e.m_caller.
(call_string::push_call): Refactor to work with m_elements.
(call_string::push_call): New overload to push call via supernodes.
(call_string::pop): Refactor to work with m_elements.
(call_string::calc_recursion_depth): Likewise.
(call_string::cmp): Likewise.
(call_string::validate): Likewise.
(call_string::operator[]): Likewise.
* call-string.h (class supernode): New forward decl.
(struct call_string::element_t): New struct.
(call_string::call_string): Refactor to initialise m_elements.
(call_string::bool empty_p): Refactor to work with m_elements.
(call_string::get_callee_node): New decl.
(call_string::get_caller_node): New decl.
(m_elements): Replaces m_return_edges.
* program-point.cc (program_point::get_function_at_depth): Refactor to
work with new call-string format.
(program_point::validate): Likewise.
(program_point::on_edge): Likewise.

Adjust/Refine testcases.

gcc/testsuite/ChangeLog:

PR target/99881
* gcc.target/i386/pr91446.c:
* gcc.target/i386/pr92658-avx512bw-2.c:
* gcc.target/i386/pr92658-sse4-2.c:
* gcc.target/i386/pr92658-sse4.c:
* gcc.target/i386/pr99881.c:

Add a separate function to calculate cost for WIDEN_MULT_EXPR.

gcc/ChangeLog:

PR target/39821
* config/i386/i386.c (ix86_widen_mult_cost): New function.
(ix86_add_stmt_cost): Use ix86_widen_mult_cost for
WIDEN_MULT_EXPR.

gcc/testsuite/ChangeLog:

PR target/39821
* gcc.target/i386/sse2-pr39821.c: New test.
* gcc.target/i386/sse4-pr39821.c: New test.

Use preferred mode for doloop IV [PR61837]

Currently, doloop.xx variable is using the type as niter which may be
shorter than word size.  For some targets, it would be better to use
word size type.  For example, on 64bit system, to access 32bit value,
subreg maybe used.  Then using 64bit type maybe better for niter if
it can be present in both 32bit and 64bit.

This patch add target hook to query preferred mode for doloop IV,
and update mode accordingly.

gcc/ChangeLog:

2021-07-29  Jiufu Guo  <guojiufu@linux.ibm.com>

PR target/61837
* config/rs6000/rs6000.c (TARGET_PREFERRED_DOLOOP_MODE): New hook.
(rs6000_preferred_doloop_mode): New hook.
* doc/tm.texi: Regenerate.
* doc/tm.texi.in: Add hook preferred_doloop_mode.
* target.def (preferred_doloop_mode): New hook.
* targhooks.c (default_preferred_doloop_mode): New hook.
* targhooks.h (default_preferred_doloop_mode): New hook.
* tree-ssa-loop-ivopts.c (compute_doloop_base_on_mode): New function.
(add_iv_candidate_for_doloop): Call targetm.preferred_doloop_mode
and compute_doloop_base_on_mode.

gcc/testsuite/ChangeLog:

2021-07-29  Jiufu Guo  <guojiufu@linux.ibm.com>

PR target/61837
* gcc.target/powerpc/pr61837.c: New test.

Daily bump.

Correct uninitialized object offset and size computation [PR101494].

Resolves:
PR middle-end/101494 - -Wuninitialized false alarm with memrchr of size 0

gcc/ChangeLog:

PR middle-end/101494
* tree-ssa-uninit.c (maybe_warn_operand): Correct object offset
and size computation.

gcc/testsuite/ChangeLog:

PR middle-end/101494
* gcc.dg/uninit-pr101494.c: New test.

Correct -Warray-bounds handling if function pointers [PR101601].

Resolves:
PR middle-end/101601 - -Warray-bounds triggers error: arrays of functions are not meaningful

PR middle-end/101601

gcc/ChangeLog:

* gimple-array-bounds.cc (array_bounds_checker::check_mem_ref): Remove
a pointless test.
Handle pointers to functions.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Warray-bounds-25.C: New test.
* gcc.dg/Warray-bounds-85.c: New test.

Add new gimple-ssa-warn-access pass.

gcc/ChangeLog:

* Makefile.in (OBJS): Add gimple-ssa-warn-access.o and pointer-query.o.
* attribs.h (fndecl_dealloc_argno): Move fndecl_dealloc_argno to tree.h.
* builtins.c (compute_objsize_r): Move to pointer-query.cc.
(access_ref::access_ref): Same.
(access_ref::phi): Same.
(access_ref::get_ref): Same.
(access_ref::size_remaining): Same.
(access_ref::offset_in_range): Same.
(access_ref::add_offset): Same.
(access_ref::inform_access): Same.
(ssa_name_limit_t::visit_phi): Same.
(ssa_name_limit_t::leave_phi): Same.
(ssa_name_limit_t::next): Same.
(ssa_name_limit_t::next_phi): Same.
(ssa_name_limit_t::~ssa_name_limit_t): Same.
(pointer_query::pointer_query): Same.
(pointer_query::get_ref): Same.
(pointer_query::put_ref): Same.
(pointer_query::flush_cache): Same.
(warn_string_no_nul): Move to gimple-ssa-warn-access.cc.
(check_nul_terminated_array): Same.
(unterminated_array): Same.
(maybe_warn_for_bound): Same.
(check_read_access): Same.
(warn_for_access): Same.
(get_size_range): Same.
(check_access): Same.
(gimple_call_alloc_size): Move to tree.c.
(gimple_parm_array_size): Move to pointer-query.cc.
(get_offset_range): Same.
(gimple_call_return_array): Same.
(handle_min_max_size): Same.
(handle_array_ref): Same.
(handle_mem_ref): Same.
(compute_objsize): Same.
(gimple_call_alloc_p): Move to gimple-ssa-warn-access.cc.
(call_dealloc_argno): Same.
(fndecl_dealloc_argno): Same.
(new_delete_mismatch_p): Same.
(matching_alloc_calls_p): Same.
(warn_dealloc_offset): Same.
(maybe_emit_free_warning): Same.
* builtins.h (check_nul_terminated_array): Move to
gimple-ssa-warn-access.h.
(check_nul_terminated_array): Same.
(warn_string_no_nul): Same.
(unterminated_array): Same.
(class ssa_name_limit_t): Same.
(class pointer_query): Same.
(struct access_ref): Same.
(class range_query): Same.
(struct access_data): Same.
(gimple_call_alloc_size): Same.
(gimple_parm_array_size): Same.
(compute_objsize): Same.
(class access_data): Same.
(maybe_emit_free_warning): Same.
* calls.c (initialize_argument_information): Remove call to
maybe_emit_free_warning.
* gimple-array-bounds.cc: Include new header..
* gimple-fold.c: Same.
* gimple-ssa-sprintf.c: Same.
* gimple-ssa-warn-restrict.c: Same.
* passes.def: Add pass_warn_access.
* tree-pass.h (make_pass_warn_access): Declare.
* tree-ssa-strlen.c: Include new headers.
* tree.c (fndecl_dealloc_argno): Move here from builtins.c.
* tree.h (fndecl_dealloc_argno): Move here from attribs.h.
* gimple-ssa-warn-access.cc: New file.
* gimple-ssa-warn-access.h: New file.
* pointer-query.cc: New file.
* pointer-query.h: New file.

gcc/cp/ChangeLog:

* init.c: Include new header.

PR 100168: Fix call test on power10.

Fix a test that was checking for 64-bit TOC calls, to also allow for
PC-relative calls.

2021-07-28 Michael Meissner <meissner@linux.ibm.com>

gcc/testsuite
PR testsuite/100168
* gcc.dg/pr56727-2.c: Add support for PC-relative calls.

analyzer: play better with -fsanitize=bounds

gcc/analyzer/ChangeLog:
* region-model.cc (region_model::on_call_pre): Treat
IFN_UBSAN_BOUNDS, BUILT_IN_STACK_SAVE, and BUILT_IN_STACK_RESTORE
as no-ops, rather than handling them as unknown functions.

gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/torture/ubsan-1.c: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

analyzer: remove redundant return value from various impl_call_*

gcc/analyzer/ChangeLog:
* region-model-impl-calls.cc (region_model::impl_call_alloca):
Drop redundant return value.
(region_model::impl_call_builtin_expect): Likewise.
(region_model::impl_call_calloc): Likewise.
(region_model::impl_call_malloc): Likewise.
(region_model::impl_call_memset): Likewise.
(region_model::impl_call_operator_new): Likewise.
(region_model::impl_call_operator_delete): Likewise.
(region_model::impl_call_strlen): Likewise.
* region-model.cc (region_model::on_call_pre): Fix return value of
known functions that don't have unknown side-effects.
* region-model.h (region_model::impl_call_alloca): Drop redundant
return value.
(region_model::impl_call_builtin_expect): Likewise.
(region_model::impl_call_calloc): Likewise.
(region_model::impl_call_malloc): Likewise.
(region_model::impl_call_memset): Likewise.
(region_model::impl_call_strlen): Likewise.
(region_model::impl_call_operator_new): Likewise.
(region_model::impl_call_operator_delete): Likewise.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

Fortran: ICE in resolve_allocate_deallocate for invalid STAT argument

gcc/fortran/ChangeLog:

PR fortran/101564
* expr.c (gfc_check_vardef_context): Add check for KIND and LEN
parameter inquiries.
* match.c (gfc_match): Fix comment for %v code.
(gfc_match_allocate, gfc_match_deallocate): Replace use of %v code
by %e in gfc_match to allow for function references as STAT and
ERRMSG arguments.
* resolve.c (resolve_allocate_deallocate): Avoid NULL pointer
dereferences and shortcut for bad STAT and ERRMSG argument to
(DE)ALLOCATE. Remove bogus parts of checks for STAT and ERRMSG.

gcc/testsuite/ChangeLog:

PR fortran/101564
* gfortran.dg/allocate_stat_3.f90: New test.
* gfortran.dg/allocate_stat.f90: Adjust error messages.
* gfortran.dg/implicit_11.f90: Likewise.
* gfortran.dg/inquiry_type_ref_3.f90: Likewise.

ubsan: Fix ICEs with DECL_REGISTER tests [PR101624]

The following testcase ICEs, because the base is a CONST_DECL for
the Fortran parameter, and ubsan/sanopt uses DECL_REGISTER macro on it.
/* In VAR_DECL and PARM_DECL nodes, nonzero means declared `register'. */
#define DECL_REGISTER(NODE) (DECL_WRTL_CHECK (NODE)->decl_common.decl_flag_0)
while CONST_DECL doesn't satisfy DECL_WRTL_CHECK.

The following patch checks explicitly for VAR_DECL/PARM_DECL/RESULT_DECL
only before using DECL_REGISTER, assumes other decls aren't DECL_REGISTER.
Not really sure about RESULT_DECL but it at least satisfies DECL_WRTL_CHECK...

2021-07-28 Jakub Jelinek <jakub@redhat.com>

PR middle-end/101624
* ubsan.c (maybe_instrument_pointer_overflow,
instrument_object_size): Only test DECL_REGISTER on VAR_DECLs,
PARM_DECLs or RESULT_DECLs.
* sanopt.c (maybe_optimize_ubsan_ptr_ifn): Likewise.

* gfortran.dg/ubsan/ubsan.exp: New file.
* gfortran.dg/ubsan/pr101624.f90: New test.

match.pd: Fix up recent __builtin_bswap16 simplifications [PR101642]

The following testcase ICEs. The problem is that for __builtin_bswap16
(and only that, others are fine) the argument of the builtin is promoted
to int while the patterns assume it is not and is the same as that of
the return type.
For the bswap simplifications before these new ones it just means we
fail to optimize stuff like __builtin_bswap16 (__builtin_bswap16 (x))
because there are casts in between, but the last one, equality comparison
of __builtin_bswap16 with integer constant results in ICE, because
we create comparison with incompatible types of the operands, and the
other might be fine because usually we bit and the operand before promoting,
but I think it is too dangerous to rely on it, one day we find out that
because it is operand to such a built in, we can throw away any changes
that affect the upper bits and all of sudden it would misbehave.

So, this patch introduces converts that shouldn't do anything for
bswap{32,64,128} and should fix these issues for bswap16.

2021-07-28 Jakub Jelinek <jakub@redhat.com>

PR middle-end/101642
* match.pd (bswap16 (x) == bswap16 (y)): Cast both operands
to type of bswap16 for comparison.
(bswap16 (x) == cst): Cast bswap16 operand to type of cst.

* gcc.c-torture/compile/pr101642.c: New test.

IBM Z: Fix 5 tests in 31-bit mode

gcc/testsuite/ChangeLog:

* gcc.target/s390/global-array-element-pic2.c: Add -mzarch, add
an expectation for 31-bit mode.
* gcc.target/s390/load-imm64-1.c: Use unsigned long long.
* gcc.target/s390/load-imm64-2.c: Likewise.
* gcc.target/s390/vector/long-double-vx-macro-off-on.c: Use
-mzarch.
* gcc.target/s390/vector/long-double-vx-macro-on-off.c:
Likewise.

tree-optimization/101615 - SLP permute opt with CTOR roots

CTOR roots are not explicitely represented so we have to make sure
to materialize permutes on SLP graph entries to them.

2021-07-28 Richard Biener <rguenther@suse.de>

PR tree-optimization/101615
* tree-vect-slp.c (vect_optimize_slp): Materialize permutes
at CTOR SLP graph entries.

* gcc.dg/vect/bb-slp-pr101615-2.c: New testcase.

aarch64: Add smov alternative to sign_extend pattern

In the testcase here we were generating a umov + sxth to move
a half-word value from SIMD to GP regs with sign-extension.
We can use a single smov instruction for it instead but the
sign-extend pattern was missing the right alternative.
The *zero_extend<SHORT:mode><GPI:mode>2_aarch64 pattern for
zero-extension already has the right alternative for
the analogous umov instruction, so this mirrors that pattern.

Bootstrapped and tested on aarch64-none-linux-gnu.

The test gcc.target/aarch64/sve/clastb_4.c is adjusted to scan for
the clastb h0, p0, h0, z0.h form
instead of
the clastb w0, p0, w0, z0.h form.

This is an improvement as the W forms of the clast instructions are more expensive.

gcc/ChangeLog:

* config/aarch64/aarch64.md (*extend<SHORT:mode><GPI:mode>2_aarch64):
Add "r,w" alternative.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/smov_1.c: New test.
* gcc.target/aarch64/sve/clastb_4.c: Adjust clast scan-assembler.

x86: Don't set AVX_U128_DIRTY when zeroing YMM/ZMM register

There is no SSE <-> AVX transition penalty if the upper bits of YMM/ZMM
registers are unchanged and YMM/ZMM store doesn't change the upper bits
of YMM/ZMM registers.

1. Since zeroing YMM/ZMM register is implemented with zeroing XMM
register, don't set AVX_U128_DIRTY when zeroing YMM/ZMM register.
2. Since store doesn't change the INIT state on the upper bits of
YMM/ZMM register, don't set AVX_U128_DIRTY on store if the source
of store was never non-zero.

Here are the vzeroupper count differences on SPEC CPU 2017 with

-Ofast -march=skylake-avx512

                Before  After    Diff
500.perlbench_r 226 225 -0.44%
502.gcc_r       1263 1103 -12.67%
503.bwaves_r    14 14 0.00%
505.mcf_r       29 28 -3.45%
507.cactuBSSN_r 4651 4628 -0.49%
508.namd_r      433 432 -0.23%
510.parest_r    20380 19347 -5.07%
511.povray_r    495 452 -8.69%
519.lbm_r       2 2 0.00%
520.omnetpp_r   5954 5677 -4.65%
521.wrf_r       12353 12339 -0.11%
523.xalancbmk_r 13137 13001 -1.04%
525.x264_r      192 191 -0.52%
526.blender_r   2515 2366 -5.92%
527.cam4_r      4601 4583 -0.39%
531.deepsjeng_r 20 19 -5.00%
538.imagick_r   898 805 -10.36%
541.leela_r     427 399 -6.56%
544.nab_r       74 74 0.00%
548.exchange2_r 72 72 0.00%
549.fotonik3d_r 318 318 0.00%
554.roms_r      558 554 -0.72%
557.xz_r        79 52 -34.18%

and performance differences are within noise range.

gcc/

PR target/101456
* config/i386/i386.c (ix86_avx_u128_mode_needed): Don't set
AVX_U128_DIRTY when all bits are zero.

gcc/testsuite/

PR target/101456
* gcc.target/i386/pr101456-1.c: New test.
* gcc.target/i386/pr101456-2.c: Likewise.

tree-optimization/101615 - SLP permute opt of existing vectors

This fixes one issue discovered when analyzing PR101615, namely
we happily push permutes to pre-existing vectors but end up
not actually permuting them. In fact we don't want to, so force
materialization on the external.

It doesn't fix the original testcase though.

2021-07-28 Richard Biener <rguenther@suse.de>

PR tree-optimization/101615
* tree-vect-slp.c (vect_optimize_slp): Pre-existing vector
external nodes cannot be permuted so make them perm_out 0.

* gcc.dg/vect/bb-slp-pr101615-1.c: New testcase.

amdgcn: Fix attributes for LLVM-12 [PR 100208]

This should work for a wider range of LLVM 12 variants now.
More work required for LLVM 13 though.

gcc/ChangeLog:

PR target/100208
* config.in: Regenerate.
* config/gcn/gcn-hsa.h (A_FIJI): New define.
(A_900): New define.
(A_906): New define.
(A_908): New define.
(ASM_SPEC): Use A_FIJI, A_900, A_906 and A_908.
* config/gcn/gcn.c (output_file_start): Adjust attributes according
to the assembler capabilities.
* config/gcn/mkoffload.c (main): Likewise.
* configure: Regenerate.
* configure.ac: Add tests for LLVM assembler attribute features.

Return undefined on edges which are not executed.

When a branch has been folded, mark any range requests on the unexecutable edge as
UNDEFINED.

* gimple-range-gori.cc (gori_compute::outgoing_edge_range_p): Check for
cond_false and cond_true on branches.

analyzer: Handle strdup builtins

Consolidate allocator builtin handling and add support for
__builtin_strdup and __builtin_strndup.

gcc/analyzer/ChangeLog:

* analyzer.cc (is_named_call_p, is_std_named_call_p): Make
first argument a const_tree.
* analyzer.h (is_named_call_p, -s_std_named_call_p): Likewise.
* sm-malloc.cc (known_allocator_p): New function.
(malloc_state_machine::on_stmt): Use it.

gcc/testsuite/ChangeLog:

* gcc.dg/analyzer/strdup-1.c (test_4, test_5, test_6): New
tests.

analyzer: Recognize __builtin_free as a matching deallocator

Recognize __builtin_free as being equivalent to free when passed into
__attribute__((malloc ())), similar to how it is treated when it is
encountered as a call. This fixes spurious warnings in glibc where
xmalloc family of allocators as well as reallocarray, memalign,
etc. are declared to have __builtin_free as the free function.

gcc/analyzer/ChangeLog:

* sm-malloc.cc
(malloc_state_machine::get_or_create_deallocator): Recognize
__builtin_free.

gcc/testsuite/ChangeLog:

* gcc.dg/analyzer/attr-malloc-1.c (compatible_alloc,
compatible_alloc2): New extern allocator declarations.
(test_9, test_10): New tests.

d: Wrong evaluation order of binary expressions (PR101640)

The use of fold_build2 can in some cases swap the order of its operands
if that is the more optimal thing to do. However this breaks semantic
guarantee of left-to-right evaluation in D.

PR d/101640

gcc/d/ChangeLog:

* expr.cc (binary_op): Use build2 instead of fold_build2.

gcc/testsuite/ChangeLog:

* gdc.dg/pr96429.d: Update test.
* gdc.dg/pr101640.d: New test.

d: fix ICE at convert_expr(tree_node*, Type*, Type*) (PR101490)

Both the front-end and code generator had a modulo by zero bug when testing if
a conversion from a static array to dynamic array was valid.

PR d/101490

gcc/d/ChangeLog:

* dmd/MERGE: Merge upstream dmd 27e388b4c.
* d-codegen.cc (build_array_index): Handle void arrays same as byte.
* d-convert.cc (convert_expr): Handle converting to zero-sized arrays.

gcc/testsuite/ChangeLog:

* gdc.dg/pr101490.d: New test.

d: __FUNCTION__ doesn't work in core.stdc.stdio functions without cast (PR101441)

Backports fix from upstream to allow __FUNCTION__ and
__PRETTY_FUNCTION__ to be used as C string literals.

Reviewed-on: https://github.com/dlang/dmd/pull/12923

PR d/101441

gcc/d/ChangeLog:

* dmd/MERGE: Merge upstream dmd f8c1ca928.

d: Compile-time reflection for supported built-ins (PR101127)

In order to allow user-code to determine whether a back-end builtin is
available without error, LANG_HOOKS_BUILTIN_FUNCTION_EXT_SCOPE has been
defined to delay putting back-end builtin functions until the ISA that
defines them has been declared.

However in D, there is no global namespace.  All builtins get pushed
into the `gcc.builtins' module, which is constructed during the semantic
analysis pass, which has already finished by the time target attributes
are evaluated.  So builtins are not pushed by the new langhook because
they would be ultimately ignored.  Builtins exposed to D code then can
now only be altered by the command-line.

PR d/101127

gcc/d/ChangeLog:

* d-builtins.cc (d_builtin_function_ext_scope): New function.
* d-lang.cc (LANG_HOOKS_BUILTIN_FUNCTION_EXT_SCOPE): Define.
* d-tree.h (d_builtin_function_ext_scope): Declare.

gcc/testsuite/ChangeLog:

* gdc.dg/pr101127a.d: New test.
* gdc.dg/pr101127b.d: New test.

d: Change in DotTemplateExp type semantics leading to regression (PR101619)

By giving dot templates a type, meant that properry resolving silently
started passing for code that should never have passed. The simple fix
is to provide implementations for checkType and checkValue that give an
error about dot templates having neither a value nor type.

Reviewed-on: https://github.com/dlang/dmd/pull/12920

PR d/101619

gcc/d/ChangeLog:

* dmd/MERGE: Merge upstream dmd 1d8386a63.

IBM Z: Enable LSan and TSan

libsanitizer/ChangeLog:

* configure.tgt (s390*-*-linux*): Enable LSan and TSan for
s390x.

AArch64: use stable sorting in generating ldp/stp

In some corner cases, we have code as below:
  [base + 0x310] = A
  [base + 0x320] = B
  [base + 0x330] = C
  [base + 0x320] = D
unstable sorting could result in wrong value in offset 0x320.  The
patch fixes it by using gcc_stablesort.

2021-07-28  Bin Cheng  <bin.cheng@linux.alibaba.com>

* config/aarch64/aarch64.c (aarch64_gen_adjusted_ldpstp): use
gcc_stablesort.

Don't skip prologue/epilogue when initializing alias.

Register might be modified in prologue/epilogue, which shouldn't
be skipped in alias info analysis.

2021-07-28 Bin Cheng <bin.cheng@linux.alibaba.com>

gcc/
* alias.c (init_alias_analysis): Don't skip prologue/epilogue.

i386: Improve AVX2 expansion of vector >> vector DImode arithm. shifts [PR101611]

AVX2 introduced vector >> vector shifts, but unfortunately for V{2,4}DImode
it only supports logical and not arithmetic shifts, only AVX512F for
V8DImode or AVX512VL for V{2,4}DImode fixed that omission.
Earlier in GCC12 cycle I've committed vector >> scalar arithmetic shift
emulation using various sequences, this patch handles the vector >> vector
case.  No need to adjust costs, the previous cost adjustment actually
covers even the vector by vector shifts.
The patch emits the right arithmetic V{2,4}DImode shifts using 2 logical right
V{2,4}DImode shifts (once of the original operands, once of sign mask
constant by the vector shift count), xor and subtraction, on each element
(long long) x >> y is done as
(((unsigned long long) x >> y) ^ (0x8000000000000000ULL >> y))
- (0x8000000000000000ULL >> y)
i.e. if x doesn't have in some element the MSB set, it is just the logical
shift, if it does, then the xor and subtraction cause also all higher bits
to be set.

2021-07-28  Jakub Jelinek  <jakub@redhat.com>

PR target/101611
* config/i386/sse.md (vashr<mode>3): Split into vashrv8di3 expander
and vashrv4di3 expander, where the latter requires just TARGET_AVX2
and has special !TARGET_AVX512VL expansion.
(vashrv2di3<mask_name>): Rename to ...
(vashrv2di3): ... this.  Change condition to TARGET_XOP || TARGET_AVX2
and add special !TARGET_XOP && !TARGET_AVX512VL expansion.

* gcc.target/i386/avx2-pr101611-1.c: New test.
* gcc.target/i386/avx2-pr101611-2.c: New test.

Correct a mistake in a warnung for -Wnonnull.

In the warning for -Wnonnull when warning about array parameters
with bounds > 0 and which are NULL the numbers referring to the
two arguments are switched. This patch corrects the mistake.

2021-07-28 Martin Uecker <muecker@gwdg.de>

gcc/
* calls.c (maybe_warn_rdwr_sizes): Correct argument
numbers in warning that were switched.

gcc/testsuite/
* gcc.dg/Wnonnull-4.c: Correct argument numbers in warnings.

Bind(c): Improve error checking in CFI_* functions

This patch adds additional run-time checking for invalid arguments to
CFI_establish and CFI_setpointer.  It also changes existing messages
throughout the CFI_* functions to use PRIiPTR to format CFI_index_t
values instead of casting them to int and using %d (which may not work
on targets where int is a smaller type), simplifies wording of some
messages, and fixes issues with capitalization, typos, and the like.
Additionally some coding standards problems such as >80 character lines
are addressed.

2021-07-24  Sandra Loosemore  <sandra@codesourcery.com>

PR libfortran/101317

libgfortran/
* runtime/ISO_Fortran_binding.c: Include <inttypes.h>.
(CFI_address): Tidy error messages and comments.
(CFI_allocate): Likewise.
(CFI_deallocate): Likewise.
(CFI_establish): Likewise.  Add new checks for validity of
elem_len when it's used, plus type argument and extents.
(CFI_is_contiguous): Tidy error messages and comments.
(CFI_section): Likewise.  Refactor some repetitive code to
make it more understandable.
(CFI_select_part): Likewise.
(CFI_setpointer): Likewise.  Check that source is not an
unallocated allocatable array or an assumed-size array.

gcc/testsuite/
* gfortran.dg/ISO_Fortran_binding_17.f90: Fix typo in error
message patterns.

Bind(c): Fix bugs in CFI_section

CFI_section was incorrectly adjusting the base pointer for the result
array twice in different ways.  It was also overwriting the array
dimension info in the result descriptor before computing the base
address offset from the source descriptor, which caused problems if
the two descriptors are the same.  This patch fixes both problems and
makes the code simpler, too.

A consequence of this patch is that the result array is now 0-based in
all dimensions instead of starting at the numbering to match the first
element of the source array.  The Fortran standard only specifies the
shape of the result array, not its lower bounds, so this is permitted
and probably less confusing for users as well as implementors.

2021-07-17  Sandra Loosemore  <sandra@codesourcery.com>

PR libfortran/101310

libgfortran/
* runtime/ISO_Fortran_binding.c (CFI_section): Fix the base
address computation and simplify the code.

gcc/testsuite/
* gfortran.dg/ISO_Fortran_binding_1.c (section_c): Remove
incorrect assertions.

Fix ISO_Fortran_binding.h paths in gfortran testsuite

ISO_Fortran_binding.h is now generated in the libgfortran build
directory where it is on the default include path. Adjust includes in
the gfortran testsuite not to include an explicit path pointing at the
source directory.

2021-07-27 Sandra Loosemore <sandra@codesourcery.com>

gcc/testsuite/
PR libfortran/101305
* gfortran.dg/ISO_Fortran_binding_1.c: Adjust include path.
* gfortran.dg/ISO_Fortran_binding_10.c: Likewise.
* gfortran.dg/ISO_Fortran_binding_11.c: Likewise.
* gfortran.dg/ISO_Fortran_binding_12.c: Likewise.
* gfortran.dg/ISO_Fortran_binding_15.c: Likewise.
* gfortran.dg/ISO_Fortran_binding_16.c: Likewise.
* gfortran.dg/ISO_Fortran_binding_17.c: Likewise.
* gfortran.dg/ISO_Fortran_binding_18.c: Likewise.
* gfortran.dg/ISO_Fortran_binding_3.c: Likewise.
* gfortran.dg/ISO_Fortran_binding_5.c: Likewise.
* gfortran.dg/ISO_Fortran_binding_6.c: Likewise.
* gfortran.dg/ISO_Fortran_binding_7.c: Likewise.
* gfortran.dg/ISO_Fortran_binding_8.c: Likewise.
* gfortran.dg/ISO_Fortran_binding_9.c: Likewise.
* gfortran.dg/PR94327.c: Likewise.
* gfortran.dg/PR94331.c: Likewise.
* gfortran.dg/bind_c_array_params_3_aux.c: Likewise.
* gfortran.dg/iso_fortran_binding_uint8_array_driver.c: Likewise.
* gfortran.dg/pr93524.c: Likewise.

Bind(C): Correct sizes of some types in CFI_establish

CFI_establish was failing to set the default elem_len correctly for
CFI_type_cptr, CFI_type_cfunptr, CFI_type_long_double, and
CFI_type_long_double_Complex.

2021-07-13 Sandra Loosemore <sandra@codesourcery.com>

libgfortran/
PR libfortran/101305
* runtime/ISO_Fortran_binding.c (CFI_establish): Special-case
CFI_type_cptr and CFI_type_cfunptr. Correct size of long double
on targets where it has kind 10.

Bind(C): Fix type encodings in ISO_Fortran_binding.h

ISO_Fortran_binding.h had many incorrect hardwired kind encodings in
the definitions of the CFI_type_* macros.  Additionally, not all
targets support all the defined type encodings, and the Fortran
standard requires those macros to have a negative value.

This patch changes ISO_Fortran_binding.h to use sizeof instead of
hard-coded sizes, and assembles it from fragments that reflect the
set of types supported by the target.

2021-07-22  Sandra Loosemore  <sandra@codesourcery.com>
    Tobias Burnus  <tobias@codesourcery.com>

libgfortran/
PR libfortran/101305
* ISO_Fortran_binding.h: Fix hard-coded sizes and split into...
* ISO_Fortran_binding-1-tmpl.h: New file.
* ISO_Fortran_binding-2-tmpl.h: New file.
* ISO_Fortran_binding-3-tmpl.h: New file.
* Makefile.am: Add rule for generating ISO_Fortran_binding.h.
Adjust pathnames to that file.
* Makefile.in: Regenerated.
* mk-kinds-h.sh: New file.
* runtime/ISO_Fortran_binding.c: Fix include path.

vect: Fix wrong check in vect_recog_mulhs_pattern [PR101596]

As PR101596 showed, vect_recog_mulhs_pattern uses target_precision to
check the scale_term is expected or not, it could be wrong when the
precision of the actual used new_type larger than target_precision as
shown by the example.

This patch is to use precision of new_type instead of target_precision
for the scale_term matching check.

Bootstrapped & regtested on powerpc64le-linux-gnu P10,
powerpc64-linux-gnu P8, x86_64-redhat-linux and aarch64-linux-gnu.

gcc/ChangeLog:

PR tree-optimization/101596
* tree-vect-patterns.c (vect_recog_mulhs_pattern): Fix wrong check
by using new_type's precision instead.

gcc/testsuite/ChangeLog:

PR tree-optimization/101596
* gcc.target/powerpc/pr101596-1.c: New test.
* gcc.target/powerpc/pr101596-2.c: Likewise.
* gcc.target/powerpc/pr101596-3.c: Likewise.

Add the member integer_to_sse to processor_cost as a cost simulation for movd/pinsrd. It will be used to calculate the cost of vec_construct.

gcc/ChangeLog:

PR target/99881
* config/i386/i386.h (processor_costs): Add new member
integer_to_sse.
* config/i386/x86-tune-costs.h (ix86_size_cost, i386_cost,
i486_cost, pentium_cost, lakemont_cost, pentiumpro_cost,
geode_cost, k6_cost, athlon_cost, k8_cost, amdfam10_cost,
bdver_cost, znver1_cost, znver2_cost, znver3_cost,
btver1_cost, btver2_cost, btver3_cost, pentium4_cost,
nocona_cost, atom_cost, atom_cost, slm_cost, intel_cost,
generic_cost, core_cost): Initialize integer_to_sse same value
as sse_op.
(skylake_cost): Initialize integer_to_sse twice as much as sse_op.
* config/i386/i386.c (ix86_builtin_vectorization_cost):
Use integer_to_sse instead of sse_op to calculate the cost of
vec_construct.

gcc/testsuite/ChangeLog:

PR target/99881
* gcc.target/i386/pr99881.c: New test.

Daily bump.

rs6000: Write static initializations for overload tables

2021-06-07 Bill Schmidt <wschmidt@linux.ibm.com>

gcc/
* config/rs6000/rs6000-gen-builtins.c (write_ovld_static_init): New
function.
(write_init_file): Call write_ovld_static_init.

rs6000: Write static initializations for built-in table

2021-07-26 Bill Schmidt <wschmidt@linux.ibm.com>

gcc/
* config/rs6000/rs6000-gen-builtins.c (write_bif_static_init): New
function.
(write_init_file): Call write_bif_static_init.

rs6000: Write output to the builtins init file, part 3 of 3

2021-07-27 Bill Schmidt <wschmidt@linux.ibm.com>

gcc/
* config/rs6000/rs6000-gen-builtins.c (typemap): New struct.
(TYPE_MAP_SIZE): New macro.
(type_map): New initialized variable.
(typemap_cmp): New function.
(write_type_node): Likewise.
(write_fntype_init): Implement.

Let -Wuninitialized assume built-ins don't change const arguments [PR101584].

PR tree-optimization/101584 - missing -Wuninitialized with an allocated object after a built-in call

gcc/ChangeLog:

PR tree-optimization/101584
* tree-ssa-uninit.c (builtin_call_nomodifying_p): New function.
(check_defs): Call it.

gcc/testsuite/ChangeLog:

PR tree-optimization/101584
* gcc.dg/uninit-38.c: Remove assertions.
* gcc.dg/uninit-41.c: New test.

libstdc++: Simplify std::optional::value()

The structure of these functions likely dates from the time before G++
fully supported C++14 extended constexpr, so that the throw expression
had to be the operand of a conditional expression. That is not true now,
so we can use a more straightforward version of the code.

We can also simplify the declaration of __throw_bad_optional_access by
using the C++11-style [[noreturn]] attribute so that a separate
declaration isn't needed.

libstdc++-v3/ChangeLog:

* include/experimental/optional (__throw_bad_optional_access):
Replace GNU attribute with C++11 attribute.
(optional::value, optional::value_or): Use if statements
instead of conditional expressions.
* include/std/optional (__throw_bad_optional_access)
(optional::value, optional::value_or): Likewise.

testsuite: Add missing C++ includes to tests [PR101646]

These tests stopped working after some libstdc++ refactoring, because
they aren't including what they use.

gcc/testsuite/ChangeLog:

PR testsuite/101646
* g++.dg/coroutines/pr99047.C:
* g++.dg/pr71655.C:

Use OEP_DECL_NAME when comparing VLA bounds [PR101585].

Resolves:
PR c/101585 - Bad interaction of -fsanitize=undefined and -Wvla-parameters

gcc/c-family:

PR c/101585
* c-warn.c (warn_parm_ptrarray_mismatch): Use OEP_DECL_NAME.

gcc/testsuite:
PR c/101585
* gcc.dg/Wvla-parameter-13.c: New test.

Implement OpenMP 5.1 section 3.15: omp_display_env

This is a new interface which is easily implemented using the
already existing code for the handling of the OMP_DISPLAY_ENV
environment variable.

libgomp/
* env.c (wait_policy, stacksize): New static variables,
move out of handle_omp_display_env.
(omp_display_env): New function.  The meat of the old
handle_omp_display_env function.
(handle_omp_display_env): Change to not take parameters
and instead use the global variables.  Only perform
parsing, defer to omp_display_env for the implementation.
(initialize_env): Remove local variables wait_policy and
stacksize.  Don't pass parameters to handle_omp_display_env.
* fortran.c: Add ialias_redirect for omp_display_env.
(omp_display_env_, omp_display_env_8_): New functions.
* libgomp.map (OMP_5.1): New version.  Add omp_display_env,
omp_display_env_, and omp_display_env_8_.
* omp.h.in: Declare omp_display_env.
* omp_lib.f90.in: Likewise.
* omp_lib.h.in: Likewise.

Fix argument to pthread_join

gcc/testsuite
* g++.dg/gcov/gcov-threads-1.C: Fix argument to pthread_join.

Abstract out (forward) jump threader state handling.

The *forward* jump threader has multiple places where it pushes and
pops state, and where it sets context up for the jump threading
simplifier callback.  Not only are the idioms repetitive, but the only
reason for passing const_and_copies, avail_exprs_stack, and the evrp
engine around are so we can set up context.

As part of my jump threading work, I will divorce the evrp engine from
the DOM jump threader, replacing it with a subset of the path solver I
have just contributed.  Since this will entail passing even more
context around, I've abstracted out the state handling so it can be
passed around in one object.  This cleans up the code, and also makes
it trivial to set up context with another engine in the future.

FWIW, I've used these cleanups and the path solver in a POC to improve
DOM's threaded edges by an additional 5%, and the overall threading
opportunities in the compiler by 1%.  This is in addition to the gains
I have documented in the backwards threader rewrite.

There are no functional changes with this patch.

Tested on x86-64 Linux.

gcc/ChangeLog:

* tree-ssa-dom.c (dom_jump_threader_simplifier):
Put avail_exprs_stack in the class, instead of passing it to
jump_threader_simplifier.
(dom_jump_threader_simplifier::simplify): Add state argument.
(dom_opt_dom_walker): Add state.
(pass_dominator::execute): Pass state to threader.
(dom_opt_dom_walker::before_dom_children): Use state.
* tree-ssa-threadedge.c (jump_threader::jump_threader): Replace
arguments by state.
(jump_threader::record_temporary_equivalences_from_phis):
Register equivalences through the state variable.
(jump_threader::record_temporary_equivalences_from_stmts_at_dest):
Record ranges in a statement through the state variable.
(jump_threader::simplify_control_stmt_condition): Pass state to
simplify.
(jump_threader::simplify_control_stmt_condition_1): Same.
(jump_threader::thread_around_empty_blocks): Remove obsolete
comment.
(jump_threader::thread_through_normal_block): Record equivalences
on edge through the state variable.
(jump_threader::thread_across_edge): Abstract state pushing.
(jt_state::jt_state): New.
(jt_state::push): New.
(jt_state::pop): New.
(jt_state::register_equiv): New.
(jt_state::record_ranges_from_stmt): New.
(jt_state::register_equivs_on_edge): New.
(jump_threader_simplifier::jump_threader_simplifier): Move from
header.
(jump_threader_simplifier::simplify): Add state argument.
* tree-ssa-threadedge.h (class jt_state): New.
(class jump_threader): Add state to constructor.
(class jump_threader_simplifier): Add state to simplify.  Remove
avail_exprs_stack from class.
* tree-vrp.c (vrp_jump_threader_simplifier::simplify): Add state
argument.
(vrp_jump_threader::vrp_jump_threader): Add state.
(vrp_jump_threader::~vrp_jump_threader): Cleanup state.

c++: Reject ordered comparison of null pointers [PR99701]

When implementing DR 1512 in r11-467 I neglected to reject ordered
comparison of two null pointers, like nullptr < nullptr. This patch
fixes that omission.

DR 1512
PR c++/99701

gcc/cp/ChangeLog:

* cp-gimplify.c (cp_fold): Remove {LE,LT,GE,GT_EXPR} from
a switch.
* typeck.c (cp_build_binary_op): Reject ordered comparison
of two null pointers.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/nullptr11.C: Remove invalid tests.
* g++.dg/cpp0x/nullptr46.C: Add dg-error.
* g++.dg/cpp2a/spaceship-err7.C: New test.
* g++.dg/expr/ptr-comp4.C: New test.

libstdc++-v3/ChangeLog:

* testsuite/20_util/tuple/comparison_operators/overloaded.cc:
Move a line...
* testsuite/20_util/tuple/comparison_operators/overloaded2.cc:
...here. New test.

libstdc++: Adjust whitespace in <bits/cow_string.h>

libstdc++-v3/ChangeLog:

* include/bits/cow_string.h: Consistently use tab for
indentation.

libstdc++: Move COW string definitions to separate header

This moves the definitions of the COW string to a separate file, so that
they don't need to be preprocessed for the common case. We could also
move the SSO string definitions to a new file, so that they don't need
to be preprocessed for the old ABI case, but that would require more
shovel work because there are some parts of <bits/basic_string.h> and
<bits/basic_string.tcc> that are common to both definitions.

libstdc++-v3/ChangeLog:

* include/Makefile.am: Add new header.
* include/Makefile.in: Regenerate.
* include/bits/basic_string.h [!_GLIBCXX_USE_CXX11_ABI]
(basic_string): Move definition of Copy-on-Write string to
new file.
* include/bits/basic_string.tcc: Likewise.
* include/bits/cow_string.h: New file.

libstdc++: Remove unnecessary uses of <utility>

The <algorithm> header includes <utility>, with a comment referring to
UK-300, a National Body comment on the C++11 draft. That comment
proposed to move std::swap to <utility> and then require <algorithm> to
include <utility>. The comment was rejected, so we do not need to
implement the suggestion. For backwards compatibility with C++03 we do
want <algorithm> to define std::swap, but it does so anyway via
<bits/move.h>. We don't need the whole of <utility> to do that.

A few other headers that need std::swap can include <bits/move.h> to
get it, instead of <utility>.

There are several headers that include <utility> to get std::pair, but
they can use <bits/stl_pair.h> to get it without also including the
rel_ops namespace and other contents of <utility>.

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:

* include/std/algorithm: Do not include <utility>.
* include/std/functional: Likewise.
* include/std/regex: Include <bits/stl_pair.h> instead of
<utility>.
* include/debug/map.h: Likewise.
* include/debug/multimap.h: Likewise.
* include/debug/multiset.h: Likewise.
* include/debug/set.h: Likewise.
* include/debug/vector: Likewise.
* include/bits/fs_path.h: Likewise.
* include/bits/unique_ptr.h: Do not include <utility>.
* include/experimental/any: Likewise.
* include/experimental/executor: Likewise.
* include/experimental/memory: Likewise.
* include/experimental/optional: Likewise.
* include/experimental/socket: Use __exchange instead
of std::exchange.
* src/filesystem/ops-common.h: Likewise.
* testsuite/20_util/default_delete/48631_neg.cc: Adjust expected
errors to not use a hardcoded line number.
* testsuite/20_util/default_delete/void_neg.cc: Likewise.
* testsuite/20_util/specialized_algorithms/uninitialized_copy/constrained.cc:
Include <utility> for std::as_const.
* testsuite/20_util/specialized_algorithms/uninitialized_default_construct/constrained.cc:
Likewise.
* testsuite/20_util/specialized_algorithms/uninitialized_move/constrained.cc:
Likewise.
* testsuite/20_util/specialized_algorithms/uninitialized_value_construct/constrained.cc:
Likewise.
* testsuite/23_containers/vector/cons/destructible_debug_neg.cc:
Adjust dg-error line number.

libstdc++: Reduce header dependencies on <array> and <utility>

This refactoring reduces the memory usage and compilation time to parse
a number of headers that depend on std::pair, std::tuple or std::array.
Previously the headers for these class templates were all intertwined,
due to the common dependency on std::tuple_size, std::tuple_element and
their std::get overloads. This decouples the headers by moving some
parts of <utility> into a new <bits/utility.h> header. This means that
<array> and <tuple> no longer need to include the whole of <utility>,
and <tuple> no longer needs to include <array>.

This decoupling benefits headers such as <thread> and <scoped_allocator>
which only need std::tuple, and so no longer have to parse std::array.

Some other headers such as <any>, <optional> and <variant> no longer
need to include <utility> just for the std::in_place tag types, so
do not have to parse the std::pair definitions.

Removing direct uses of <utility> also means that the std::rel_ops
namespace is not transitively declared by other headers.

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:

* include/Makefile.am: Add bits/utility.h header.
* include/Makefile.in: Regenerate.
* include/bits/utility.h: New file.
* include/std/utility (tuple_size, tuple_element): Move
to new header.
* include/std/type_traits (__is_tuple_like_impl<tuple<T...>>):
Move to <tuple>.
(_Index_tuple, _Build_index_tuple, integer_sequence): Likewise.
(in_place_t, in_place_index_t, in_place_type_t): Likewise.
* include/bits/ranges_util.h: Include new header instead of
<utility>.
* include/bits/stl_pair.h (tuple_size, tuple_element): Move
partial specializations for std::pair here.
(get): Move overloads for std::pair here.
* include/std/any: Include new header instead of <utility>.
* include/std/array: Likewise.
* include/std/memory_resource: Likewise.
* include/std/optional: Likewise.
* include/std/variant: Likewise.
* include/std/tuple: Likewise.
(__is_tuple_like_impl<tuple<T...>>): Move here.
(get) Declare overloads for std::array.
* include/std/version (__cpp_lib_tuples_by_type): Change type
to long.
* testsuite/20_util/optional/84601.cc: Include <utility>.
* testsuite/20_util/specialized_algorithms/uninitialized_fill/constrained.cc:
Likewise.
* testsuite/23_containers/array/tuple_interface/get_neg.cc:
Adjust dg-error line numbers.
* testsuite/std/ranges/access/cbegin.cc: Include <utility>.
* testsuite/std/ranges/access/cend.cc: Likewise.
* testsuite/std/ranges/access/end.cc: Likewise.
* testsuite/std/ranges/single_view.cc: Likewise.

Implement basic block path solver.

This is is the main basic block path solver for use in the ranger-based
backwards threader. Given a path of BBs, the class can solve the final
conditional or any SSA name used in calculating the final conditional.

gcc/ChangeLog:

* Makefile.in (OBJS): Add gimple-range-path.o.
* gimple-range-path.cc: New file.
* gimple-range-path.h: New file.

simplify-rtx: Push sign/zero-extension inside vec_duplicate

As a general principle, vec_duplicate should be as close to the root
of an expression as possible. Where unary operations have
vec_duplicate as an argument, these operations should be pushed
inside the vec_duplicate.

This patch modifies unary operation simplification to push
sign/zero-extension of a scalar inside vec_duplicate.

This patch also updates all RTL patterns in aarch64-simd.md to use
the new canonical form.

gcc/ChangeLog:

2021-07-19 Jonathan Wright <jonathan.wright@arm.com>

* config/aarch64/aarch64-simd.md: Push sign/zero-extension
inside vec_duplicate for all patterns.
* simplify-rtx.c (simplify_context::simplify_unary_operation_1):
Push sign/zero-extension inside vec_duplicate.

Don't use libgomp 'cbuf' buffering with OpenACC 'async'

The host data might not be computed yet (by an earlier asynchronous compute
region, for example.

libgomp/
* target.c (gomp_coalesce_buf_add): Update comment.
(gomp_copy_host2dev, gomp_map_vars_internal): Don't expect to see
'aq && cbuf'.
(gomp_map_vars_internal): Only 'if (!aq)', do
'gomp_coalesce_buf_add'.
* testsuite/libgomp.oacc-c-c++-common/async-data-1-2.c: Remove
XFAIL.

Co-Authored-By: Julian Brown <julian@codesourcery.com>