
libcpp: escape non-ASCII source bytes in -Wbidi-chars= [PR103026]

This flags rich_locations associated with -Wbidi-chars= so that
non-ASCII bytes will be escaped when printing the source lines
(using the diagnostics support I added in
r12-4825-gbd5e882cf6e0def3dd1bc106075d59a303fe0d1e).

In particular, this ensures that the printed source lines will
be pure ASCII, and thus the visual ordering of the characters
will be the same as the logical ordering.

Before:

  Wbidi-chars-1.c: In function ‘main’:
  Wbidi-chars-1.c:6:43: warning: unpaired UTF-8 bidirectional control character detected [-Wbidi-chars=]
      6 |     /*‮ } ⁦if (isAdmin)⁩ ⁦ begin admins only */
        |                                           ^
  Wbidi-chars-1.c:9:28: warning: unpaired UTF-8 bidirectional control character detected [-Wbidi-chars=]
      9 |     /* end admins only ‮ { ⁦*/
        |                            ^

  Wbidi-chars-11.c:6:15: warning: UTF-8 vs UCN mismatch when closing a context by "U+202C (POP DIRECTIONAL FORMATTING)" [-Wbidi-chars=]
      6 | int LRE_‪_PDF_\u202c;
        |               ^
  Wbidi-chars-11.c:8:19: warning: UTF-8 vs UCN mismatch when closing a context by "U+202C (POP DIRECTIONAL FORMATTING)" [-Wbidi-chars=]
      8 | int LRE_\u202a_PDF_‬_;
        |                   ^
  Wbidi-chars-11.c:10:28: warning: UTF-8 vs UCN mismatch when closing a context by "U+202C (POP DIRECTIONAL FORMATTING)" [-Wbidi-chars=]
     10 | const char *s1 = "LRE_‪_PDF_\u202c";
        |                            ^
  Wbidi-chars-11.c:12:33: warning: UTF-8 vs UCN mismatch when closing a context by "U+202C (POP DIRECTIONAL FORMATTING)" [-Wbidi-chars=]
     12 | const char *s2 = "LRE_\u202a_PDF_‬";
        |                                 ^

After:

  Wbidi-chars-1.c: In function ‘main’:
  Wbidi-chars-1.c:6:43: warning: unpaired UTF-8 bidirectional control character detected [-Wbidi-chars=]
      6 |     /*<U+202E> } <U+2066>if (isAdmin)<U+2069> <U+2066> begin admins only */
        |                                                                           ^
  Wbidi-chars-1.c:9:28: warning: unpaired UTF-8 bidirectional control character detected [-Wbidi-chars=]
      9 |     /* end admins only <U+202E> { <U+2066>*/
        |                                            ^

  Wbidi-chars-11.c:6:15: warning: UTF-8 vs UCN mismatch when closing a context by "U+202C (POP DIRECTIONAL FORMATTING)" [-Wbidi-chars=]
      6 | int LRE_<U+202A>_PDF_\u202c;
        |                       ^
  Wbidi-chars-11.c:8:19: warning: UTF-8 vs UCN mismatch when closing a context by "U+202C (POP DIRECTIONAL FORMATTING)" [-Wbidi-chars=]
      8 | int LRE_\u202a_PDF_<U+202C>_;
        |                   ^
  Wbidi-chars-11.c:10:28: warning: UTF-8 vs UCN mismatch when closing a context by "U+202C (POP DIRECTIONAL FORMATTING)" [-Wbidi-chars=]
     10 | const char *s1 = "LRE_<U+202A>_PDF_\u202c";
        |                                    ^
  Wbidi-chars-11.c:12:33: warning: UTF-8 vs UCN mismatch when closing a context by "U+202C (POP DIRECTIONAL FORMATTING)" [-Wbidi-chars=]
     12 | const char *s2 = "LRE_\u202a_PDF_<U+202C>";
        |                                 ^
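
As an illustration of the escaping seen in the "After" output, here is a
minimal sketch that rewrites a source line so that it is pure ASCII.  It is
not the diagnostics code this patch uses: escape_non_ascii is a hypothetical
helper, and printing undecodable bytes as <XX> is a choice made for this
sketch only.  Overlong encodings are not rejected.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper (not GCC's API): decode UTF-8 and replace every
// non-ASCII code point with "<U+XXXX>".  Bytes that do not start a
// decodable sequence are emitted one at a time as "<XX>".
std::string escape_non_ascii(const std::string &line)
{
  std::string out;
  for (std::size_t i = 0; i < line.size();)
    {
      unsigned char b = static_cast<unsigned char>(line[i]);
      if (b < 0x80)
        {
          out += static_cast<char>(b);   // plain ASCII passes through
          ++i;
          continue;
        }

      // Work out the sequence length and payload bits from the lead byte.
      std::size_t len = 0;
      unsigned long cp = 0;
      if ((b & 0xE0) == 0xC0)      { len = 2; cp = b & 0x1F; }
      else if ((b & 0xF0) == 0xE0) { len = 3; cp = b & 0x0F; }
      else if ((b & 0xF8) == 0xF0) { len = 4; cp = b & 0x07; }

      // Every following byte must be a continuation byte (10xxxxxx).
      bool ok = len != 0 && i + len <= line.size();
      for (std::size_t k = 1; ok && k < len; ++k)
        {
          unsigned char c = static_cast<unsigned char>(line[i + k]);
          if ((c & 0xC0) != 0x80)
            ok = false;
          else
            cp = (cp << 6) | (c & 0x3F);
        }

      char buf[16];
      if (ok)
        {
          std::snprintf(buf, sizeof buf, "<U+%04lX>", cp);
          i += len;
        }
      else
        {
          std::snprintf(buf, sizeof buf, "<%02X>", static_cast<unsigned>(b));
          ++i;
        }
      out += buf;
    }
  return out;
}
```

Because the result contains only ASCII, a terminal has nothing to reorder,
which is the property the commit message relies on.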

libcpp/ChangeLog:
PR preprocessor/103026
* lex.c (maybe_warn_bidi_on_close): Use a rich_location
and call set_escape_on_output (true) on it.
(maybe_warn_bidi_on_char): Likewise.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

Avoid pathological function redeclarations when checking access sizes [PR102759].

Resolves:
PR tree-optimization/102759 - ICE: Segmentation fault in maybe_check_access_sizes since r12-2976-gb48d4e6818674898

gcc/ChangeLog:

PR tree-optimization/102759
* gimple-array-bounds.cc (build_printable_array_type): Move...
* gimple-ssa-warn-access.cc (build_printable_array_type): Avoid
pathological function redeclarations that remove a previously
declared prototype.
Improve formatting of function arguments in informational notes.
* pointer-query.cc (build_printable_array_type): ...to here.
* pointer-query.h (build_printable_array_type): Declared.

gcc/testsuite/ChangeLog:

PR tree-optimization/102759
* gcc.dg/Warray-parameter-10.c: New test.
* gcc.dg/Wstringop-overflow-82.c: New test.

x86: Add -mharden-sls=[none|all|return|indirect-branch]

Add -mharden-sls= to mitigate against straight line speculation (SLS)
for function return and indirect branch by adding an INT3 instruction
after function return and indirect branch.

gcc/

PR target/102952
* config/i386/i386-opts.h (harden_sls): New enum.
* config/i386/i386.c (output_indirect_thunk): Mitigate against
SLS for function return.
(ix86_output_function_return): Likewise.
(ix86_output_jmp_thunk_or_indirect): Mitigate against indirect
branch.
(ix86_output_indirect_jmp): Likewise.
(ix86_output_call_insn): Likewise.
* config/i386/i386.opt: Add -mharden-sls=.
* doc/invoke.texi: Document -mharden-sls=.

gcc/testsuite/

PR target/102952
* gcc.target/i386/harden-sls-1.c: New test.
* gcc.target/i386/harden-sls-2.c: Likewise.
* gcc.target/i386/harden-sls-3.c: Likewise.
* gcc.target/i386/harden-sls-4.c: Likewise.
* gcc.target/i386/harden-sls-5.c: Likewise.

x86: Remove "%!" before ret

Before MPX was removed, "%!" was mapped to

        case '!':
          if (ix86_bnd_prefixed_insn_p (current_output_insn))
            fputs ("bnd ", file);
          return;

After CET was added and MPX was removed, "%!" was mapped to

        case '!':
          if (ix86_notrack_prefixed_insn_p (current_output_insn))
            fputs ("notrack ", file);
          return;

ix86_notrack_prefixed_insn_p always returns false on ret since the
notrack prefix is only for indirect branches.  Remove the unused "%!"
before ret.

PR target/103307
* config/i386/i386.c (ix86_code_end): Remove "%!" before ret.
(ix86_output_function_return): Likewise.
* config/i386/i386.md (simple_return_pop_internal): Likewise.

Fix modref summary streaming

Fixes a bug in streaming in the modref access tree that now causes a failure
of the gamess benchmark.  The bug is quite old (present in the GCC 11 release),
but it needs a quite interesting series of events to manifest.  In particular:
1) At LTO time, ISRA turns some parameters passed by reference into scalars.
2) At LTO time, modref computes summaries for the old parameters and then
    updates them, but does so quite stupidly, believing that the loads from
    parameters are now unknown loads (rather than optimized out).
    This renders the summary not very useful, since it thinks every memory
    location aliasing int is now accessed (as opposed to a parameter
    dereference).
3) At stream-in we notice too early that the summary is useless, set the
    every_access flag and drop the list.  However, while reading the rest of
    the summary we overwrite the flag back to 0, which makes us lose part of
    the summary.
4) The right selection of partitions needs to be done to avoid late modref
    recalculating and thus fixing the summary.

This patch fixes the stream in bug, however we also should fix updating of
summaries.
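
The stream-in problem from step 3 can be pictured with a toy model.  This is
only a sketch under assumptions: RefNode, apply_flag_buggy and
apply_flag_fixed are hypothetical names and do not reproduce the code in
ipa-modref.c.

```cpp
#include <vector>

// Toy model only: RefNode and both functions are hypothetical names.
struct RefNode
{
  bool every_access = false;
  std::vector<int> accesses;

  // Give up on tracking individual accesses.
  void collapse()
  {
    every_access = true;
    accesses.clear();
  }
};

// Buggy pattern: a flag read later from the stream replaces the node's
// state, so a node that was already collapsed can be switched back to
// "precise" even though its access list is gone.
void apply_flag_buggy(RefNode &node, bool streamed_every_access)
{
  node.every_access = streamed_every_access;
}

// Safer pattern: the streamed flag can only move the node towards
// "collapsed", never away from it.
void apply_flag_fixed(RefNode &node, bool streamed_every_access)
{
  if (streamed_every_access)
    node.collapse();
}
```

With the buggy variant, a collapsed node ends up with every_access cleared
and an empty list, which reads as "no accesses at all" and is exactly the
kind of lost information described above.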

gcc/ChangeLog:

2021-11-17  Jan Hubicka  <hubicka@ucw.cz>

PR ipa/103246
* ipa-modref.c (read_modref_records): Fix streaming in of every_access
flag.

i386: Redefine indirect_thunks_used as HARD_REG_SET.

Change indirect_thunks_used to HARD_REG_SET to avoid recalculations
of correct register numbers and allow usage of SET/TEST_HARD_REG_BIT
accessors.

2021-11-17 Uroš Bizjak <ubizjak@gmail.com>

gcc/ChangeLog:

* config/i386/i386.c (indirect_thunks_used): Redefine as HARD_REG_SET.
(ix86_code_end): Use TEST_HARD_REG_BIT on indirect_thunks_used.
(ix86_output_indirect_branch_via_reg): Use SET_HARD_REG_BIT
on indirect_thunks_used.
(ix86_output_indirect_function_return): Ditto.

Add very basic IPA part of modref-kill analysis

gcc/ChangeLog:

2021-11-17 Jan Hubicka <hubicka@ucw.cz>

* ipa-modref-tree.c: Include cgraph.h and tree-streamer.h.
(modref_access_node::stream_out): New member function.
(modref_access_node::stream_in): New member function.
* ipa-modref-tree.h (modref_access_node::stream_out,
modref_access_node::stream_in): Declare.
* ipa-modref.c (modref_summary_lto::useful_p): Free useless kills.
(modref_summary_lto::dump): Dump kills.
(analyze_store): Record kills for LTO.
(analyze_stmt): Likewise.
(modref_summaries_lto::duplicate): Duplicate kills.
(write_modref_records): Use new stream_out member function.
(read_modref_records): Likewise.
(modref_write): Stream out kills.
(read_section): Stream in kills.
(remap_kills): New function.
(update_signature): Use it.

i386: Introduce LEGACY_SSE_REGNO_P predicate

Introduce LEGACY_SSE_REGNO_P predicate to simplify a couple of places.

No functional changes.

2021-11-17 Uroš Bizjak <ubizjak@gmail.com>

gcc/ChangeLog:

* config/i386/i386.h (LEGACY_SSE_REGNO_P): New predicate.
(SSE_REGNO_P): Use LEGACY_SSE_REGNO_P predicate.
* config/i386/i386.c (zero_all_vector_registers):
Use LEGACY_SSE_REGNO_P predicate.
(ix86_register_priority): Use REX_INT_REGNO_P, REX_SSE_REGNO_P
and EXT_REG_SSE_REGNO_P predicates.
(ix86_hard_regno_call_part_clobbered): Use REX_SSE_REGNO_P
and LEGACY_SSE_REGNO_P predicates.

Handle folded nonconstant array bounds [PR101702]

PR c/101702 - ICE: in handle_argspec_attribute, at c-family/c-attribs.c:3623

gcc/c/ChangeLog:

PR c/101702
* c-decl.c (get_parm_array_spec): Strip casts earlier and fold array
bounds before deciding if they're constant.

gcc/testsuite/ChangeLog:

PR c/101702
* gcc.dg/Warray-parameter-11.c: New test.

doc: document -fimplicit-constexpr

I forgot this in the implementation patch.

gcc/ChangeLog:

* doc/invoke.texi (C++ Dialect Options): Document
-fimplicit-constexpr.

libstdc++: Use std::construct_at in net::ip::address

Using placement-new isn't valid in constant expressions, so this
replaces it with std::construct_at (via the std::_Construct function
that is usable before C++20).

libstdc++-v3/ChangeLog:

* include/experimental/internet (address): Use std::_Construct
to initialize union members.

libstdc++: Simplify std::string constructors

Several std::basic_string constructors dispatch to one of the
two-argument overloads of _M_construct, which then dispatches again to
_M_construct_aux to detect whether the arguments are iterators or not.
That then dispatches to one of _M_construct(size_type, char_type) or
_M_construct(Iter, Iter, iterator_traits<Iter>::iterator_category{}).

For most of those constructors this is a waste of time, because we know
the arguments are already iterators. For basic_string(const CharT*) and
basic_string(initializer_list<C>) we know that we call _M_construct with
two pointers, and for basic_string(const basic_string&) we call it with
two const_iterators. Those constructors can call the three-argument
overload of _M_construct with the iterator category tag right away,
without the intermediate dispatching.

The case where this doesn't apply is basic_string(InputIter, InputIter),
but for C++11 and later this is constrained so we know it's an iterator
here as well. We can restrict the dispatching in this constructor to
only be done for C++98 and to call _M_construct_aux directly, which
allows us to remove the two-argument _M_construct(InputIter, InputIter)
overload entirely.

N.B. When calling the three-arg _M_construct with pointers or string
iterators, we pass forward_iterator_tag not random_access_iterator_tag.
This is because it makes no difference which overload gets called, and
simplifies overload resolution to not have to do a base-to-derived
check. If we ever add a new overload of _M_construct for random access
iterators we would have to revisit this, but that seems unlikely.
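
The dispatching idea can be sketched with a small stand-in class.  TinyString
and its construct overloads are hypothetical and much simpler than
basic_string; the sketch only shows a constructor naming the iterator
category directly when it already knows its arguments are pointers.

```cpp
#include <cstddef>
#include <iterator>
#include <string>

// Hypothetical stand-in for a string-like class; not libstdc++ code.
class TinyString
{
public:
  // The arguments are known to be pointers, so the forward-iterator
  // overload is selected directly, without consulting iterator_traits.
  explicit TinyString(const char *s)
  {
    construct(s, s + std::char_traits<char>::length(s),
              std::forward_iterator_tag());
  }

  // Generic iterator pair: the category has to be looked up first.
  template<typename It>
  TinyString(It first, It last)
  {
    construct(first, last,
              typename std::iterator_traits<It>::iterator_category());
  }

  const std::string &str() const { return data_; }

private:
  // Single-pass iterators: append one element at a time.
  template<typename It>
  void construct(It first, It last, std::input_iterator_tag)
  {
    for (; first != last; ++first)
      data_.push_back(*first);
  }

  // Multi-pass iterators: measure the range, then allocate once.
  template<typename It>
  void construct(It first, It last, std::forward_iterator_tag)
  {
    data_.reserve(static_cast<std::size_t>(std::distance(first, last)));
    data_.append(first, last);
  }

  std::string data_;
};
```

Passing forward_iterator_tag is an exact match for the second overload, so
no derived-to-base conversion of the tag is needed to pick it.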

This patch also moves the __is_null_pointer checks from the three-arg
_M_construct into the constructors where a null pointer argument is
actually possible. This avoids redundant checks where we know we have a
non-null pointer, or don't have a pointer at all.

Finally, this patch replaces some try-blocks with an RAII type, so that
memory is deallocated during unwinding. This avoids the overhead of
catching and rethrowing an exception.
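
A minimal sketch of the try-block to RAII change, using hypothetical names
(BufferGuard, make_buffer) rather than the type added to basic_string.tcc:

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical guard: owns a buffer until release() is called.
struct BufferGuard
{
  char *ptr;

  explicit BufferGuard(char *p) : ptr(p) {}
  ~BufferGuard() { delete[] ptr; }   // no-op once ptr has been released

  // Hand ownership to the caller and disarm the destructor.
  char *release()
  {
    char *p = ptr;
    ptr = nullptr;
    return p;
  }

  BufferGuard(const BufferGuard &) = delete;
  BufferGuard &operator=(const BufferGuard &) = delete;
};

// Allocate n bytes and let `fill` initialize them.  If `fill` throws, the
// guard's destructor frees the buffer during unwinding, so no
// catch-and-rethrow is needed here.
template<typename Fill>
char *make_buffer(std::size_t n, Fill fill)
{
  BufferGuard guard(new char[n]);
  fill(guard.ptr, n);
  return guard.release();   // success: the caller now owns the buffer
}
```

The success path pays only for resetting one pointer, and the exception
propagates to the caller unchanged.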

libstdc++-v3/ChangeLog:

* include/bits/basic_string.h (_M_construct_aux): Only define
for C++98. Remove constexpr.
(_M_construct_aux_2): Likewise.
(_M_construct(InputIter, InputIter)): Remove.
(basic_string(const basic_string&)): Call _M_construct with
iterator category argument.
(basic_string(const basic_string&, size_type, const Alloc&)):
Likewise.
(basic_string(const basic_string&, size_type, size_type)):
Likewise.
(basic_string(const charT*, size_type, const Alloc&)): Likewise.
Check for null pointer.
(basic_string(const charT*, const Alloc&)): Likewise.
(basic_string(initializer_list<charT>, const Alloc&)): Call
_M_construct with iterator category argument.
(basic_string(const basic_string&, const Alloc&)): Likewise.
(basic_string(basic_string&&, const Alloc&)): Likewise.
(basic_string(_InputIter, _InputIter, const Alloc&)): Likewise
for C++11 and later, call _M_construct_aux for C++98.
* include/bits/basic_string.tcc
(_M_construct(I, I, input_iterator_tag)): Replace try-block with
RAII type.
(_M_construct(I, I, forward_iterator_tag)): Likewise. Remove
__is_null_pointer check.

libstdc++: Set active member of union in std::string [PR103295]

Clang diagnoses that the new constexpr std::string constructors are not
usable in constant expressions, because they start to write to members
of the union without setting an active member.

This adds a new helper function which returns the address of the local
buffer after making it the active member.

This doesn't fix all problems with Clang, because it still refuses to
write to memory returned by the allocator.

libstdc++-v3/ChangeLog:

PR libstdc++/103295
* include/bits/basic_string.h (_M_use_local_data()): New
member function to make local buffer the active member.
(assign(const basic_string&)): Use it.
* include/bits/basic_string.tcc (_M_construct, reserve()):
Likewise.

libstdc++: Fix std::type_info::before for ARM [PR103240]

The r179236 fix for std::type_info::operator== should also have been
applied to std::type_info::before. Otherwise two distinct types can
compare equivalent due to using a string comparison, when they should do
a pointer comparison.
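
The intended ordering can be sketched as a free function.  This is an
illustration and not the libsupc++ source: name_before is a hypothetical
name, its arguments stand for the unadjusted names, and it assumes that a
leading '*' marks a name that must be ordered by address.

```cpp
#include <cstring>
#include <functional>

// Hypothetical model of the ordering; `a` and `b` are unadjusted names,
// including any leading '*'.
bool name_before(const char *a, const char *b)
{
  if (a[0] == '*')
    // Names with the prefix are ordered by address.  std::less gives a
    // total order even for pointers into unrelated objects.
    return std::less<const char *>()(a, b);

  // All other names are ordered by their spelling.
  return std::strcmp(a, b) < 0;
}
```

If the prefix were stripped before the check, two distinct names with
identical spelling would fall through to strcmp and neither would order
before the other, which is the "compare equivalent" symptom described above.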

libstdc++-v3/ChangeLog:

PR libstdc++/103240
* libsupc++/tinfo2.cc (type_info::before): Use unadjusted name
to check for the '*' prefix.
* testsuite/util/testsuite_shared.cc: Add type_info object for
use in new test.
* testsuite/18_support/type_info/103240.cc: New test.

Fix two mips target tests compromised by recent IPA work

gcc/testsuite
* gcc.target/mips/frame-header-1.c (bar): Add noipa attribute.
* gcc.target/mips/frame-header-2.c (bar): Likewise.

libcpp: Fix up handling of block comments in -fdirectives-only mode [PR103130]

Normal preprocessing, -fdirectives-only preprocessing before Nathan's
rewrite, and all other compilers I've tried on godbolt treat even \*/
as end of a block comment, but the new -fdirectives-only handling doesn't.
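
The expected behaviour can be sketched with a tiny scanner.
skip_block_comment is a hypothetical function for illustration, not the
cpp_directive_only_process code.

```cpp
#include <cstddef>
#include <string>

// `pos` is the index just past the opening "/*".  Returns the index just
// past the closing "*/", or std::string::npos if the comment is
// unterminated.
std::size_t skip_block_comment(const std::string &src, std::size_t pos)
{
  for (std::size_t i = pos; i + 1 < src.size(); ++i)
    // The character before '*' is deliberately not inspected, so a
    // backslash there does not prevent "*/" from ending the comment.
    if (src[i] == '*' && src[i + 1] == '/')
      return i + 2;
  return std::string::npos;
}
```

Starting the scan after the opening delimiter also keeps "/*/" from being
treated as a complete comment.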

2021-11-17 Jakub Jelinek <jakub@redhat.com>

PR preprocessor/103130
* lex.c (cpp_directive_only_process): Treat even \*/ as end of block
comment.

* c-c++-common/cpp/dir-only-9.c: New test.

aarch64: Add new vector mode V8DI

This patch adds a new V8DI mode, which will be used with the new Armv8.7-A
LS64 extension intrinsics.

gcc/ChangeLog:

* config/aarch64/aarch64-modes.def (VECTOR_MODE): New V8DI mode.
* config/aarch64/aarch64.c (aarch64_hard_regno_mode_ok): Handle
V8DImode.
* config/aarch64/iterators.md (define_mode_attr nunits): Add entry
for V8DI.

Fix ICE when mixing VLAs and statement expressions [PR91038]

When returning VM-types from statement expressions, this can
lead to an ICE when declarations from the statement expression
are referred to later. Most of these issues can be addressed by
gimplifying the base expression earlier in gimplify_compound_lval.
Another issue is fixed by wrapping the pointer expression in
pointer_int_sum. This fixes PR91038 and some of the test cases
from PR29970 (structs with VLA members need further work).

gcc/
PR c/91038
PR c/29970
* gimplify.c (gimplify_var_or_parm_decl): Update comment.
(gimplify_compound_lval): Gimplify base expression first.
(gimplify_target_expr): Add comment.

gcc/c-family/
PR c/91038
PR c/29970
* c-common.c (pointer_int_sum): Make sure pointer expressions
are evaluated first when the size expression depends on them for
variably-modified types.

gcc/testsuite/
PR c/91038
PR c/29970
* gcc.dg/vla-stexp-3.c: New test.
* gcc.dg/vla-stexp-4.c: New test.
* gcc.dg/vla-stexp-5.c: New test.
* gcc.dg/vla-stexp-6.c: New test.
* gcc.dg/vla-stexp-7.c: New test.
* gcc.dg/vla-stexp-8.c: New test.
* gcc.dg/vla-stexp-9.c: New test.

lim: Reset flow sensitive info even for pointers [PR103192]

Since 2014, lim has been clearing SSA_NAME_RANGE_INFO for integral SSA_NAMEs
when moving them from conditional contexts inside a loop to an unconditional
position before the loop, but as the miscompilation of gimplify.c shows, we
need to treat pointers the same way: for them too we need to reset whether
the pointer can/can't be null and the recorded pointer alignment.

This fixes
-FAIL: libgomp.c/../libgomp.c-c++-common/target-in-reduction-2.c (internal compiler error)
-FAIL: libgomp.c/../libgomp.c-c++-common/target-in-reduction-2.c (test for excess errors)
-UNRESOLVED: libgomp.c/../libgomp.c-c++-common/target-in-reduction-2.c compilation failed to produce executable
-FAIL: libgomp.c++/../libgomp.c-c++-common/target-in-reduction-2.c (internal compiler error)
-FAIL: libgomp.c++/../libgomp.c-c++-common/target-in-reduction-2.c (test for excess errors)
-UNRESOLVED: libgomp.c++/../libgomp.c-c++-common/target-in-reduction-2.c compilation failed to produce executable
-FAIL: libgomp.c++/target-in-reduction-2.C (internal compiler error)
-FAIL: libgomp.c++/target-in-reduction-2.C (test for excess errors)
-UNRESOLVED: libgomp.c++/target-in-reduction-2.C compilation failed to produce executable
on both x86_64 and i686.

2021-11-17 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/103192
* tree-ssa-loop-im.c (move_computations_worker): Use
reset_flow_sensitive_info instead of manually clearing
SSA_NAME_RANGE_INFO and do it for all SSA_NAMEs, not just ones
with integral types.

ranger: Fix up fold_using_range::range_of_address [PR103255]

For &base->member, if the offset isn't constant or isn't zero,
-fdelete-null-pointer-checks is enabled, -fwrapv-pointer is not, and base
has a range that doesn't include NULL, we return the range of the base.
Usually it isn't a big deal, because for most pointers we just use
varying, range_zero and range_nonzero ranges and nothing beyond that,
but if a pointer is initialized from a constant, we actually track the
exact range and in that case this causes miscompilation.
As discussed on IRC, I think doing something like:
              offset_int off2;
              if (off_cst && off.is_constant (&off2))
                {
                  tree cst = wide_int_to_tree (sizetype, off2 / BITS_PER_UNIT);
                  // adjust range r with POINTER_PLUS_EXPR cst
                  if (!range_includes_zero_p (&r))
                    return true;
                }
              // Fallback
              r = range_nonzero (TREE_TYPE (gimple_assign_rhs1 (stmt)));
              return true;
could work, but given that most of the pointer ranges are just the simple
ones, perhaps it is too much for little benefit.

2021-11-17  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/103255
* gimple-range-fold.cc (fold_using_range::range_of_address): Return
range_nonzero rather than unadjusted base's range.  Formatting fixes.

* gcc.c-torture/execute/pr103255.c: New test.

Add IFN_COND_FMIN/FMAX functions

This patch adds conditional forms of FMAX and FMIN, following
the pattern for existing conditional binary functions.

gcc/
* doc/md.texi (cond_fmin@var{mode}, cond_fmax@var{mode}): Document.
* optabs.def (cond_fmin_optab, cond_fmax_optab): New optabs.
* internal-fn.def (COND_FMIN, COND_FMAX): New functions.
* internal-fn.c (first_commutative_argument): Handle them.
(FOR_EACH_COND_FN_PAIR): Likewise.
* match.pd (UNCOND_BINARY, COND_BINARY): Likewise.
* config/aarch64/aarch64-sve.md (cond_<fmaxmin><mode>): New
pattern.

gcc/testsuite/
* gcc.target/aarch64/sve/cond_fmaxnm_5.c: New test.
* gcc.target/aarch64/sve/cond_fmaxnm_5_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fmaxnm_6.c: Likewise.
* gcc.target/aarch64/sve/cond_fmaxnm_6_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fmaxnm_7.c: Likewise.
* gcc.target/aarch64/sve/cond_fmaxnm_7_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fmaxnm_8.c: Likewise.
* gcc.target/aarch64/sve/cond_fmaxnm_8_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_5.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_5_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_6.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_6_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_7.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_7_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_8.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_8_run.c: Likewise.

i386: Fix non-robust split condition in define_insn_and_split

This patch fixes some non-robust split conditions in some
define_insn_and_splits, so that each of them is applied on top of
the corresponding condition of the define_insn part; otherwise the
splitting could be performed unexpectedly.

gcc/ChangeLog:

* config/i386/i386.md (*add<dwi>3_doubleword, *addv<dwi>4_doubleword,
*addv<dwi>4_doubleword_1, *sub<dwi>3_doubleword,
*subv<dwi>4_doubleword, *subv<dwi>4_doubleword_1,
*add<dwi>3_doubleword_cc_overflow_1, *divmodsi4_const,
*neg<dwi>2_doubleword, *tls_dynamic_gnu2_combine_64_<mode>): Fix split
condition.

Fix PR 103288, ICE after PHI-OPT, move an assignment when still in use for another bb

The problem is that r12-5300-gf98f373dd822b35c allows phiopt to recognize more basic blocks
but missed one location where phiopt could move an assignment from the middle block
to the non-middle one. This patch fixes that.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

PR tree-optimization/103288

gcc/ChangeLog:

* tree-ssa-phiopt.c (value_replacement): Return early if middle
block has more than one pred.

gcc/testsuite/ChangeLog:

* gcc.c-torture/compile/pr103288-1.c: New test.

visium: Fix non-robust split condition in define_insn_and_split

This patch fixes some non-robust split conditions in some
define_insn_and_splits, so that each of them is applied on top of
the corresponding condition of the define_insn part; otherwise the
splitting could be performed unexpectedly.

gcc/ChangeLog:

* config/visium/visium.md (*add<mode>3_insn, *addsi3_insn, *addi3_insn,
*sub<mode>3_insn, *subsi3_insn, *subdi3_insn, *neg<mode>2_insn,
*negdi2_insn, *and<mode>3_insn, *ior<mode>3_insn, *xor<mode>3_insn,
*one_cmpl<mode>2_insn, *ashl<mode>3_insn, *ashr<mode>3_insn,
*lshr<mode>3_insn, *trunchiqi2_insn, *truncsihi2_insn,
*truncdisi2_insn, *extendqihi2_insn, *extendqisi2_insn,
*extendhisi2_insn, *extendsidi2_insn, *zero_extendqihi2_insn,
*zero_extendqisi2_insn, *zero_extendsidi2_insn): Fix split condition.

libcpp: Implement -Wbidi-chars for CVE-2021-42574 [PR103026]

From a link below:
"An issue was discovered in the Bidirectional Algorithm in the Unicode
Specification through 14.0. It permits the visual reordering of
characters via control sequences, which can be used to craft source code
that renders different logic than the logical ordering of tokens
ingested by compilers and interpreters. Adversaries can leverage this to
encode source code for compilers accepting Unicode such that targeted
vulnerabilities are introduced invisibly to human reviewers."

More info:
https://nvd.nist.gov/vuln/detail/CVE-2021-42574
https://trojansource.codes/

This is not a compiler bug.  However, to mitigate the problem, this patch
implements -Wbidi-chars=[none|unpaired|any] to warn about possibly
misleading Unicode bidirectional control characters the preprocessor may
encounter.

The default is =unpaired, which warns about improperly terminated
bidirectional control characters; e.g. an LRE without its corresponding PDF.
The level =any warns about any use of bidirectional control characters.

This patch handles both UCNs and UTF-8 characters.  UCNs designating
bidi characters in identifiers are accepted since r204886.  Then r217144
enabled -fextended-identifiers by default.  Extended characters in C/C++
identifiers have been accepted since r275979.  However, this patch still
warns about mixing UTF-8 and UCN bidi characters; there seems to be no
good reason to allow mixing them.

We warn in different contexts: comments (both C and C++-style), string
literals, character constants, and identifiers.  As expected, UCNs are ignored
in comments and raw string literals.  The bidirectional control characters
can nest so this patch handles that as well.
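
The nesting check for a single context (one comment, string literal or
identifier) can be sketched as follows.  This is a simplified model, not the
code in lex.c: BidiKind, classify and has_unpaired_bidi are hypothetical
names, the input is assumed to be already decoded to code points, and the
UTF-8 vs UCN distinction is not modelled.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical classification of the bidirectional control characters.
enum class BidiKind { None, Embedding, Isolate, PopEmbedding, PopIsolate };

BidiKind classify(char32_t c)
{
  switch (c)
    {
    case 0x202A: case 0x202B:                // LRE, RLE
    case 0x202D: case 0x202E:                // LRO, RLO
      return BidiKind::Embedding;
    case 0x2066: case 0x2067: case 0x2068:   // LRI, RLI, FSI
      return BidiKind::Isolate;
    case 0x202C:                             // PDF
      return BidiKind::PopEmbedding;
    case 0x2069:                             // PDI
      return BidiKind::PopIsolate;
    default:
      return BidiKind::None;
    }
}

// True if some bidi context opened in `text` is still open at its end.
bool has_unpaired_bidi(const std::u32string &text)
{
  std::vector<BidiKind> open;
  for (char32_t c : text)
    {
      BidiKind kind = classify(c);
      switch (kind)
        {
        case BidiKind::Embedding:
        case BidiKind::Isolate:
          open.push_back(kind);
          break;
        case BidiKind::PopEmbedding:
          // PDF closes the innermost embedding or override, if any.
          if (!open.empty() && open.back() == BidiKind::Embedding)
            open.pop_back();
          break;
        case BidiKind::PopIsolate:
          // PDI closes the innermost isolate and everything opened in it.
          for (std::size_t i = open.size(); i-- > 0;)
            if (open[i] == BidiKind::Isolate)
              {
                open.resize(i);
                break;
              }
          break;
        case BidiKind::None:
          break;
        }
    }
  return !open.empty();
}
```

A caller modelling =unpaired would run this once per context and warn when
it returns true; =any would instead warn whenever classify returns anything
other than BidiKind::None.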

I have not included nor tested this at all with Fortran (which also has
string literals and line comments).

Dave M. posted patches improving diagnostics involving Unicode characters.
This patch does not make use of this new infrastructure yet.

PR preprocessor/103026

gcc/c-family/ChangeLog:

* c.opt (Wbidi-chars, Wbidi-chars=): New option.

gcc/ChangeLog:

* doc/invoke.texi: Document -Wbidi-chars.

libcpp/ChangeLog:

* include/cpplib.h (enum cpp_bidirectional_level): New.
(struct cpp_options): Add cpp_warn_bidirectional.
(enum cpp_warning_reason): Add CPP_W_BIDIRECTIONAL.
* internal.h (struct cpp_reader): Add warn_bidi_p member
function.
* init.c (cpp_create_reader): Set cpp_warn_bidirectional.
* lex.c (bidi): New namespace.
(get_bidi_utf8): New function.
(get_bidi_ucn): Likewise.
(maybe_warn_bidi_on_close): Likewise.
(maybe_warn_bidi_on_char): Likewise.
(_cpp_skip_block_comment): Implement warning about bidirectional
control characters.
(skip_line_comment): Likewise.
(forms_identifier_p): Likewise.
(lex_identifier): Likewise.
(lex_string): Likewise.
(lex_raw_string): Likewise.

gcc/testsuite/ChangeLog:

* c-c++-common/Wbidi-chars-1.c: New test.
* c-c++-common/Wbidi-chars-2.c: New test.
* c-c++-common/Wbidi-chars-3.c: New test.
* c-c++-common/Wbidi-chars-4.c: New test.
* c-c++-common/Wbidi-chars-5.c: New test.
* c-c++-common/Wbidi-chars-6.c: New test.
* c-c++-common/Wbidi-chars-7.c: New test.
* c-c++-common/Wbidi-chars-8.c: New test.
* c-c++-common/Wbidi-chars-9.c: New test.
* c-c++-common/Wbidi-chars-10.c: New test.
* c-c++-common/Wbidi-chars-11.c: New test.
* c-c++-common/Wbidi-chars-12.c: New test.
* c-c++-common/Wbidi-chars-13.c: New test.
* c-c++-common/Wbidi-chars-14.c: New test.
* c-c++-common/Wbidi-chars-15.c: New test.
* c-c++-common/Wbidi-chars-16.c: New test.
* c-c++-common/Wbidi-chars-17.c: New test.

analyzer: fix missing -Wanalyzer-write-to-const [PR102695]

This patch fixes -Wanalyzer-write-to-const so that it will complain
about attempts to write to functions and to labels.
It also "teaches" the analyzer about strchr, in that strchr can either
return a pointer into the input area (and thus -Wanalyzer-write-to-const
can now complain about writes into a string literal seen this way),
or return NULL (and thus the analyzer can complain about NULL
dereferences if the result is used without a check).
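
For reference, a use of strchr that respects both outcomes looks like the
sketch below.  replace_first is a hypothetical helper, not part of the
patch or its tests; it handles the NULL result and writes only into
caller-provided writable storage.

```cpp
#include <cstring>

// Replace the first occurrence of `from` in the writable buffer `buf`.
// `from` is expected not to be '\0'.  Returns false if nothing was found.
bool replace_first(char *buf, char from, char to)
{
  char *p = std::strchr(buf, from);
  if (p == nullptr)
    return false;   // not found: writing through p would dereference NULL
  *p = to;          // p points into buf, which the caller owns and may modify
  return true;
}
```

Calling such a helper on a string literal, or dropping the NULL check, gives
the two kinds of problem the analyzer can now report.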

gcc/analyzer/ChangeLog:
PR analyzer/102695
* region-model-impl-calls.cc (region_model::impl_call_strchr): New.
* region-model-manager.cc
(region_model_manager::maybe_fold_unaryop): Simplify cast to
pointer type of an existing pointer to a region.
* region-model.cc (region_model::on_call_pre): Handle
BUILT_IN_STRCHR and "strchr".
(write_to_const_diagnostic::emit): Add auto_diagnostic_group. Add
alternate wordings for functions and labels.
(write_to_const_diagnostic::describe_final_event): Add alternate
wordings for functions and labels.
(region_model::check_for_writable_region): Handle RK_FUNCTION and
RK_LABEL.
* region-model.h (region_model::impl_call_strchr): New decl.

gcc/testsuite/ChangeLog:
PR analyzer/102695
* gcc.dg/analyzer/pr102695.c: New test.
* gcc.dg/analyzer/strchr-1.c: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

analyzer: don't assume target has alloca [PR102779]

gcc/testsuite/ChangeLog:
PR analyzer/102779
* gcc.dg/analyzer/capacity-1.c: Add dg-require-effective-target
alloca. Use __builtin_alloca rather than alloca.
* gcc.dg/analyzer/capacity-3.c: Likewise.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

Fix clearing of to_info_lto in ipa_merge_modref_summary_after_inlining

This patch fixes a bug that caused some optimizations to be dropped with
-fdump-ipa-inline.

gcc/ChangeLog:

2021-11-17 Jan Hubicka <hubicka@ucw.cz>

PR ipa/103246
* ipa-modref.c (ipa_merge_modref_summary_after_inlining): Fix clearing
of to_info_lto.

Daily bump.

libstdc++: Fix tests for constexpr std::string

Some tests fail when run with -D_GLIBCXX_USE_CXX11_ABI or -std=gnu++20.

libstdc++-v3/ChangeLog:

* include/bits/basic_string.h (operator<=>): Use constexpr
unconditionally.
* testsuite/21_strings/basic_string/modifiers/constexpr.cc:
Require cxx11-abi effective target.
* testsuite/21_strings/headers/string/synopsis.cc: Add
conditional constexpr to declarations, and adjust relational
operators for C++20.

c-family: don't cache large vecs

Patrick observed recently that an element of the vector cache could be
arbitrarily large. Let's only cache relatively small vecs.
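
The policy can be sketched with a small cache of reusable vectors.  VecCache
is a hypothetical stand-in, not the c-common.c code, and measuring size by
std::vector capacity is an assumption of this sketch.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Threshold taken from the ChangeLog entry below.
const std::size_t kMaxCachedCapacity = 16;

// Hypothetical cache of released vectors that can be handed out again.
class VecCache
{
public:
  // Reuse a cached vector if there is one, otherwise start a fresh one.
  std::vector<int> get()
  {
    if (cache_.empty())
      return std::vector<int>();
    std::vector<int> v = std::move(cache_.back());
    cache_.pop_back();
    return v;
  }

  // Keep small vectors for reuse; large ones are simply destroyed.
  void release(std::vector<int> v)
  {
    if (v.capacity() >= kMaxCachedCapacity)
      return;       // too large: freed when v goes out of scope
    v.clear();      // drop the contents, keep the allocation
    cache_.push_back(std::move(v));
  }

  std::size_t cached() const { return cache_.size(); }

private:
  std::vector<std::vector<int>> cache_;
};
```

This bounds the memory held by each cached entry, at the cost of
reallocating when a large vector is needed again.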

gcc/c-family/ChangeLog:

* c-common.c (release_tree_vector): Only cache vecs smaller than
16 elements.

Use modref summaries for byte-wise dead store elimination.

gcc/ChangeLog:

* ipa-modref.c (get_modref_function_summary): Declare.
* ipa-modref.h (get_modref_function_summary): New function.
* tree-ssa-dse.c (clear_live_bytes_for_ref): Break out from ...
(clear_bytes_written_by): ... here; also clear memory killed by
calls.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/modref-dse-4.c: New test.

MAINTAINERS: Add myself to DCO section and update email address

ChangeLog:

* MAINTAINERS: Add myself to DCO section and update email address.

Fortran: avoid NULL pointer dereference on invalid range in logical SELECT CASE

gcc/fortran/ChangeLog:

PR fortran/103286
* resolve.c (resolve_select): Choose appropriate range limit to
avoid NULL pointer dereference when generating error message.

gcc/testsuite/ChangeLog:

PR fortran/103286
* gfortran.dg/pr103286.f90: New test.

configure, Darwin: Set appropriate defaults for host-shared.

Darwin x86_64 and aarch64 platforms are PIC (shared) by default,
and user-space code must be built in this mode. The patch
ensures that this is set correctly and applies a default when
--enable-host-shared is not set.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
ChangeLog:

* configure: Regenerate.
* configure.ac: Ensure that PIC (shared) defaults are set
correctly for Darwin.

PCH: Make the save and restore diagnostics more robust.

When saving, if we cannot obtain a suitable memory segment there
is no point in continuing, so exit with an error.

When reading in the PCH, we have a situation that the read-in
data will replace the line tables used by the diagnostics output.
However, the state of the read-in line tables is indeterminate
at some points where diagnostics might be needed.

To make this more robust, we save the existing line tables at
the start and, once we have read in the pointer to the new one,
put that to one side and restore the original table. This
avoids compiler hangs if the read or memory acquisition code
issues an assert, fatal_error, segv etc.

Once the read is complete, we swap in the new line table that
came from the PCH.

If the read-in PCH is corrupted then we still have a broken
compilation w.r.t any future diagnostics - but there is little
that can be done about that without more careful validation of
the file.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/ChangeLog:

* ggc-common.c (gt_pch_save): If we cannot find a suitable
memory segment for save, then error-out, do not try to
continue.
(gt_pch_restore): Save the existing line table, and when
the replacement is being read, use that when constructing
diagnostics.

rs6000: MMA test case emits wrong code when building a vector pair [PR102976]

PR102976 shows a test case where we generate wrong code when building
a vector pair from 2 vector registers.  The bug here is that with unlucky
register assignments, we can clobber one of the input operands before
we write both registers of the output operand.  The solution is to use
early-clobbers in the assemble pair and accumulator patterns.

2021-11-16  Peter Bergner  <bergner@linux.ibm.com>

gcc/
PR target/102976
* config/rs6000/mma.md (*vsx_assemble_pair): Add early-clobber for
output operand.
(*mma_assemble_acc): Likewise.

gcc/testsuite/
PR target/102976
* gcc.target/powerpc/pr102976.c: New test.

fortran: Identify arguments by their names

This provides a new function to get the name of a dummy argument,
so that identifying an argument can be made using just its name
instead of a mix of name matching (for keyword actual arguments)
and argument counting (for other actual arguments).

gcc/fortran/ChangeLog:
* interface.c (gfc_dummy_arg_get_name): New function.
* gfortran.h (gfc_dummy_arg_get_name): Declare it.
* trans-array.c (arg_evaluated_for_scalarization): Pass a dummy
argument wrapper as argument instead of an actual argument
and an index number. Check it’s non-NULL. Use its name
to identify it.
(gfc_walk_elemental_function_args): Update call to
arg_evaluated_for_scalarization. Remove argument counting.

fortran: Delete redundant missing_arg_type field

Now that we can get information about an actual arg's associated
dummy using the associated_dummy attribute, the field missing_arg_type
contains redundant information.
This removes it.

gcc/fortran/ChangeLog:
* gfortran.h (gfc_actual_arglist::missing_arg_type): Remove.
* interface.c (gfc_compare_actual_formal): Remove
missing_arg_type initialization.
* intrinsic.c (sort_actual): Ditto.
* trans-expr.c (gfc_conv_procedure_call): Use associated_dummy
and gfc_dummy_arg_get_typespec to get the dummy argument type.

fortran: simplify elemental arguments walking

This adds two functions working with the wrapper struct gfc_dummy_arg
and uses them to slightly simplify the walking of elemental
procedure arguments for scalarization. As information about dummy arguments
can be obtained from the actual argument through the just-introduced
associated_dummy field, there is no need to carry around the procedure
interface and walk dummy arguments manually together with actual arguments.

gcc/fortran/ChangeLog:
* interface.c (gfc_dummy_arg_get_typespec,
gfc_dummy_arg_is_optional): New functions.
* gfortran.h (gfc_dummy_arg_get_typespec,
gfc_dummy_arg_is_optional): Declare them.
* trans.h (gfc_ss_info::dummy_arg): Use the wrapper type
as declaration type.
* trans-array.c (gfc_scalar_elemental_arg_saved_as_reference):
use gfc_dummy_arg_get_typespec function to get the type.
(gfc_walk_elemental_function_args): Remove proc_ifc argument.
Get info about the dummy arg using the associated_dummy field.
* trans-array.h (gfc_walk_elemental_function_args): Update declaration.
* trans-intrinsic.c (gfc_walk_intrinsic_function):
Update call to gfc_walk_elemental_function_args.
* trans-stmt.c (gfc_trans_call): Ditto.
(get_proc_ifc_for_call): Remove.

fortran: Reverse actual vs dummy argument mapping

There was originally no way from an actual argument to get
to the corresponding dummy argument, even if the job of sorting
and matching actual with dummy arguments was done.
The closest was a field named actual in gfc_intrinsic_arg that was
used as scratch data when sorting arguments of one specific call.
However that value was overwritten later on as arguments of another
call to the same procedure were sorted and matched.

This change removes that field from gfc_intrinsic_arg and adds instead
a new field associated_dummy in gfc_actual_arglist.

The new field has as type a new wrapper struct gfc_dummy_arg that provides
a common interface to both dummy arguments of user-defined procedures
(which have type gfc_formal_arglist) and dummy arguments of intrinsic procedures
(which have type gfc_intrinsic_arg).
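
A wrapper of this kind can be sketched as a tagged union with one accessor that hides the difference between the two kinds. The types and names below (UserDummy, IntrinsicDummy, DummyArg, wrap, dummy_name) are hypothetical stand-ins for illustration, not the gfortran declarations.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the two kinds of dummy-argument records.
struct UserDummy { std::string sym_name; };
struct IntrinsicDummy { const char* name; };

enum class DummyKind { Intrinsic, NonIntrinsic };

// Wrapper giving both kinds a common interface.
struct DummyArg {
    DummyKind kind;
    union {
        const IntrinsicDummy* intrinsic;
        const UserDummy* non_intrinsic;
    } u;
};

DummyArg wrap(const IntrinsicDummy& d) {
    DummyArg arg;
    arg.kind = DummyKind::Intrinsic;
    arg.u.intrinsic = &d;
    return arg;
}

DummyArg wrap(const UserDummy& d) {
    DummyArg arg;
    arg.kind = DummyKind::NonIntrinsic;
    arg.u.non_intrinsic = &d;
    return arg;
}

// Callers get the name without caring which kind they hold.
std::string dummy_name(const DummyArg& arg) {
    switch (arg.kind) {
    case DummyKind::Intrinsic:
        return arg.u.intrinsic->name;
    case DummyKind::NonIntrinsic:
        return arg.u.non_intrinsic->sym_name;
    }
    assert(false && "unknown dummy argument kind");
    return std::string();
}
```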

As the removed field was used in the code sorting and matching arguments,
that code has to be updated. Two local vectors with matching indices
are introduced for respectively dummy and actual arguments, and the
loops are modified to use indices and update those argument vectors.

gcc/fortran/ChangeLog:
* gfortran.h (gfc_dummy_arg_kind, gfc_dummy_arg): New.
(gfc_actual_arglist): New field associated_dummy.
(gfc_intrinsic_arg): Remove field actual.
* interface.c (get_nonintrinsic_dummy_arg): New.
(gfc_compare_actual_formal): Initialize associated_dummy.
* intrinsic.c (get_intrinsic_dummy_arg): New.
(sort_actual): Add argument vectors.
Use loops with indices on argument vectors.
Initialize associated_dummy.

fortran: Tiny sort_actual internal refactoring

Preliminary refactoring to make further changes more obvious.
No functional change.

gcc/fortran/ChangeLog:
* intrinsic.c (sort_actual): initialise variable and use it earlier.

libstdc++: Merge latest Ryu sources

libstdc++-v3/ChangeLog:

* src/c++17/ryu/MERGE: Update the commit hash.
* src/c++17/ryu/d2s_intrinsics.h: Merge from Ryu's master
branch.

Signed-off-by: Patrick Palka <ppalka@redhat.com>

libstdc++: Implement constexpr std::basic_string for C++20

This is only supported for the cxx11 ABI, not for COW strings.

libstdc++-v3/ChangeLog:

* include/bits/basic_string.h (basic_string, operator""s): Add
constexpr for C++20.
(basic_string::basic_string(basic_string&&)): Only copy
initialized portion of the buffer.
(basic_string::basic_string(basic_string&&, const Alloc&)):
Likewise.
* include/bits/basic_string.tcc (basic_string): Add constexpr
for C++20.
(basic_string::swap(basic_string&)): Only copy initialized
portions of the buffers.
(basic_string::_M_replace): Add constexpr implementation that
doesn't depend on pointer comparisons.
* include/bits/cow_string.h: Adjust comment.
* include/ext/type_traits.h (__is_null_pointer): Add constexpr.
* include/std/string (erase, erase_if): Add constexpr.
* include/std/version (__cpp_lib_constexpr_string): Update
value.
* testsuite/21_strings/basic_string/cons/char/constexpr.cc:
New test.
* testsuite/21_strings/basic_string/cons/wchar_t/constexpr.cc:
New test.
* testsuite/21_strings/basic_string/literals/constexpr.cc:
New test.
* testsuite/21_strings/basic_string/modifiers/constexpr.cc: New test.
* testsuite/21_strings/basic_string/modifiers/swap/char/constexpr.cc:
New test.
* testsuite/21_strings/basic_string/modifiers/swap/wchar_t/constexpr.cc:
New test.
* testsuite/21_strings/basic_string/version.cc: New test.

libstdc++: Use hidden friends for vector<bool>::reference swap overloads

These swap overloads are non-standard, but are needed to make swap work
for vector<bool>::reference rvalues. They don't need to be called
explicitly, only via ADL, so hide them from normal lookup. This is what
I've proposed as the resolution to LWG 3638.
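
The hidden-friend technique itself can be illustrated with a small proxy class. This is a sketch with a hypothetical demo::BitRef type, not the libstdc++ _Bit_reference code: the swap overload is declared only inside the class, so it is found through ADL but not by ordinary lookup.

```cpp
#include <cassert>

namespace demo {

// Hypothetical proxy reference to one bit of an unsigned word.
class BitRef {
    unsigned* word_;
    unsigned mask_;

public:
    BitRef(unsigned* word, unsigned mask) : word_(word), mask_(mask) {
        assert(word != nullptr);
    }

    operator bool() const { return (*word_ & mask_) != 0; }

    BitRef& operator=(bool value) {
        if (value)
            *word_ |= mask_;
        else
            *word_ &= ~mask_;
        return *this;
    }

    // Hidden friend: declared only inside the class, so ordinary lookup
    // does not see it; it is found solely through ADL.
    friend void swap(BitRef a, BitRef b) {
        const bool tmp = a;
        a = static_cast<bool>(b);
        b = tmp;
    }
};

}  // namespace demo

// Swaps bit 0 and bit 1 of the given word.
unsigned swap_low_two_bits(unsigned word) {
    // Both arguments are rvalues; the unqualified call finds the friend via ADL.
    swap(demo::BitRef(&word, 1u), demo::BitRef(&word, 2u));
    return word;
}
```

Taking the proxies by value is what lets the overload accept rvalues.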

libstdc++-v3/ChangeLog:

* include/bits/stl_bvector.h (swap(_Bit_reference, _Bit_reference))
(swap(_Bit_reference, bool&), swap(bool&, _Bit_reference)):
Define as hidden friends of _Bit_reference.

Avoid assuming maximum string length is constant [PR102960].

Resolves:
PR tree-optimization/102960 - ICE: in sign_mask, at wide-int.h:855 in GCC 10.3.0

gcc/ChangeLog:

PR tree-optimization/102960
* gimple-fold.c (get_range_strlen): Take bitmap as an argument rather
than a pointer to it.
(get_range_strlen_tree): Same. Remove bitmap allocation. Use
an auto_bitmap.
(get_maxval_strlen): Use an auto_bitmap.
* tree-ssa-strlen.c (get_range_strlen_dynamic): Factor out PHI
handling...
(get_range_strlen_phi): ...into this function.
Avoid assuming maximum string length is constant.
(printf_strlen_execute): Dump pointer query cache contents when
details are requested.

gcc/testsuite/ChangeLog:

PR tree-optimization/102960
* gcc.dg/Wstringop-overflow-84.c: New test.

shrn-combine-10: update test to current codegen.

When the rshrn commit was reverted I missed this testcase.
This now updates it.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/shrn-combine-10.c: Use shrn.

signbit-2: make test check for scalar or vector versions

This updates the signbit-2 test to check for
the scalar optimization if the target does not
support vectorization.

gcc/testsuite/ChangeLog:

* gcc.dg/signbit-2.c: Check vect or scalar.

analyzer: fix overeager sharing of bounded_range instances [PR102662]

This was leading to an assertion failure ICE on a switch stmt when using
-fstrict-enums, due to erroneously reusing a range involving one enum
with a range involving a different enum.

gcc/analyzer/ChangeLog:
PR analyzer/102662
* constraint-manager.cc (bounded_range::operator==): Require the
types to be the same for equality.

gcc/testsuite/ChangeLog:
PR analyzer/102662
* g++.dg/analyzer/pr102662.C: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

c++: improve print_node of PTRMEM_CST

It's been inconvenient that pretty-printing of PTRMEM_CST didn't display
what member the constant refers to.

Adding that is complicated by the absence of a langhook for CONSTANT_CLASS_P
nodes; the simplest fix for that is to use the tcc_exceptional hook for
tcc_constant as well.

gcc/cp/ChangeLog:

* ptree.c (cxx_print_xnode): Handle PTRMEM_CST.

gcc/ChangeLog:

* langhooks.h (struct lang_hooks): Adjust comment.
* print-tree.c (print_node): Also call print_xnode hook for
tcc_constant class.

tree-optimization: [PR103218] Fold ((type)(a<0)) << SIGNBITOFA into ((type)a) & signbit

This folds ((type)(a<0)) << SIGNBITOFA into ((type)a) & signbit inside match.pd.
This was already handled in fold-const by:
/* A < 0 ? <sign bit of A> : 0 is simply (A & <sign bit of A>). */
I have not removed that code, as we only simplify "a ? POW2 : 0" at the gimple level to "a << CST1"
and fold actually does the reverse, folding "(a<0)<<CST" into "(a<0) ? 1<<CST : 0".
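
For a 32-bit operand, the equivalence behind this fold can be checked with a small sketch. The function names are hypothetical, and the fixed width of 32 bits is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Form before the fold: widen the comparison result, then shift it
// into the sign-bit position of a 32-bit value.
std::uint32_t sign_by_shift(std::int32_t a) {
    return static_cast<std::uint32_t>(a < 0) << 31;
}

// Form after the fold: convert the operand and mask out its sign bit.
std::uint32_t sign_by_mask(std::int32_t a) {
    return static_cast<std::uint32_t>(a) & UINT32_C(0x80000000);
}

// Asserts that both forms agree for the given input.
void check_fold(std::int32_t a) {
    assert(sign_by_shift(a) == sign_by_mask(a));
}
```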

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

PR tree-optimization/103218

gcc/ChangeLog:

* match.pd: New pattern for "((type)(a<0)) << SIGNBITOFA".

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/pr103218-1.c: New test.

libstdc++: Fix out-of-bound array accesses in testsuite

I fixed some undefined behaviour in string tests in r238609, but I only
fixed the narrow char versions. This applies the same fixes to the
wchar_t ones. These problems were found when testing a patch to make
std::basic_string usable in constexpr.

libstdc++-v3/ChangeLog:

* testsuite/21_strings/basic_string/modifiers/append/wchar_t/1.cc:
Fix reads past the end of strings.
* testsuite/21_strings/basic_string/operations/compare/wchar_t/1.cc:
Likewise.
* testsuite/experimental/string_view/operations/compare/wchar_t/1.cc:
Likewise.

libstdc++: Fix typos in tests

libstdc++-v3/ChangeLog:

* testsuite/21_strings/basic_string/allocator/71964.cc: Fix
typo.
* testsuite/23_containers/set/allocator/71964.cc: Likewise.

arc: Update (u)maddhisi4 patterns

The (u)maddhisi4 patterns use the ARC VMAC2H(U) instruction
with a null destination; however, VMAC2H(U) doesn't
rewrite the accumulator. This patch solves the destination issue
of VMAC2H by replacing it with the DMACH(U) instruction.

gcc/

* config/arc/arc.md (maddhisi4): Use a single move to accumulator.
(umaddhisi4): Likewise.
(machi): Update pattern.
(umachi): Likewise.

gcc/testsuite/

* gcc.target/arc/tmac-4.c: New test.

Signed-off-by: Claudiu Zissulescu <claziss@synopsys.com>

tree-optimization/102880 - improve CD-DCE

The PR shows a missed control-dependent DCE caused by CFG cleanup
merging a forwarder resulting in a partially degenerate PHI node.
With control-dependent DCE we need to mark control dependences
of incoming edges into PHIs as necessary but that is unnecessarily
conservative for the case when two edges have the same value.
There is no easy way to mark only a subset of control dependences
of both edges necessary so the fix is to produce forwarder blocks
where then the control dependence captures the requirements more
precisely.

For gcc.dg/tree-ssa/ssa-dom-thread-7.c the number of edges in the
CFG decreases as we have commonized PHI arguments which in turn
results in different threadings. The testcase is too complex
and the dump scanning too simple to do anything meaningful here
but to adjust the number of expected threads.

The same CFG massaging could be useful at RTL expansion time to
reduce the number of copies we need to insert on edges.

FAIL: gcc.dg/tree-ssa/ssa-hoist-4.c scan-tree-dump-times optimized "MAX_EXPR" 1

2021-11-12 Richard Biener <rguenther@suse.de>

PR tree-optimization/102880
* tree-ssa-dce.c (sort_phi_args): New function.
(make_forwarders_with_degenerate_phis): Likewise.
(perform_tree_ssa_dce): Call
make_forwarders_with_degenerate_phis.

* gcc.dg/tree-ssa/pr102880.c: New testcase.
* gcc.dg/tree-ssa/pr69270-3.c: Robustify.
* gcc.dg/tree-ssa/ssa-dom-thread-7.c: Change the number of
expected threadings.

tree-optimization/102880 - make PHI-OPT recognize more CFGs

This allows extra edges into the middle BB for the PHI-OPT
transforms using replace_phi_edge_with_variable that do not
end up moving stmts from that middle BB.  This avoids regressing
gcc.dg/tree-ssa/ssa-hoist-4.c with the actual fix for PR102880
where CFG cleanup has the choice to remove two forwarders and
picks "the wrong" leading to

   if (a > b) /
       /\    /
      /  <BB>
     /    |
  # PHI <a, b>

rather than

   if (a > b)  |
       /\      |
    <BB> \     |
     /    \    |
  # PHI <a, b, b>

but it's relatively straight-forward to support extra edges
into the middle-BB in paths ending in replace_phi_edge_with_variable
and that do not require moving stmts.  That's because we really
only want to remove the edge from the condition to the middle BB.
Of course actually doing that means updating dominators in non-trivial
ways which is why I kept the original code for the single edge
case and simply defer to CFG cleanup by adjusting the condition for
the complicated case.

The testcase needs to be a GIMPLE one since it's quite unreliable
to produce the desired CFG.

2021-11-15  Richard Biener  <rguenther@suse.de>

PR tree-optimization/102880
* tree-ssa-phiopt.c (tree_ssa_phiopt_worker): Push
single_pred (bb1) condition to places that really need it.
(match_simplify_replacement): Likewise.
(value_replacement): Likewise.
(replace_phi_edge_with_variable): Deal with extra edges
into the middle BB.

* gcc.dg/tree-ssa/phi-opt-26.c: New testcase.

arc: Update arc specific tests

Update assembly output test patterns. Also take into account the
platform on which the test is executed (baremetal or linux).

gcc/testsuite/ChangeLog:

* gcc.target/arc/add_n-combine.c: Update test patterns.
* gcc.target/arc/builtin_eh.c: Update test for linux platforms.
* gcc.target/arc/mul64-1.c: Disable this test while running on
linux.
* gcc.target/arc/tls-gd.c: Update matching patterns.
* gcc.target/arc/tls-ie.c: Likewise.
* gcc.target/arc/tls-ld.c: Likewise.
* gcc.target/arc/uncached-8.c: Likewise.

Signed-off-by: Claudiu Zissulescu <claziss@synopsys.com>

Replace more DEBUG_EXPR_DECL creations with build_debug_expr_decl

As discussed on the mailing list, this patch replaces all but one
remaining open coded constructions of DEBUG_EXPR_DECL with calls to
build_debug_expr_decl, even if - in order not to introduce any
functional change - the mode of the constructed decl is then
overwritten.

It is not clear if changing the mode has any effect in practice and
therefore I have added a FIXME note to code which does it, as
requested.

After this patch, DEBUG_EXPR_DECLs are created only by
build_debug_expr_decl and make_debug_expr_from_rtl which looks like
it should be left alone.

gcc/ChangeLog:

2021-11-11 Martin Jambor <mjambor@suse.cz>

* cfgexpand.c (expand_gimple_basic_block): Use build_debug_expr_decl,
add a fixme note about the mode assignment perhaps being unnecessary.
* ipa-param-manipulation.c (ipa_param_adjustments::modify_call):
Likewise.
(ipa_param_body_adjustments::mark_dead_statements): Likewise.
(ipa_param_body_adjustments::reset_debug_stmts): Likewise.
* tree-inline.c (remap_ssa_name): Likewise.
(tree_function_versioning): Likewise.
* tree-into-ssa.c (rewrite_debug_stmt_uses): Likewise.
* tree-ssa-loop-ivopts.c (remove_unused_ivs): Likewise.
* tree-ssa.c (insert_debug_temp_for_var_def): Likewise.

ipa-sra: Testcase that removing a "returns_nonnull" retval works

Since we can now remove return values of functions with the returns_nonnull
type attribute, I'll feel a bit safer if we test that this does not ICE
when someone attempts to access a non-existent call LHS. Eventually
we should probably drop the attribute when this happens.

gcc/testsuite/ChangeLog:

2021-11-15 Martin Jambor <mjambor@suse.cz>

* gcc.dg/ipa/ipa-sra-ret-nonull.c: New test.

libgomp: Mark thread_limit clause to target construct as implemented

After the Fortran changes we can mark it as implemented...

2021-11-16 Jakub Jelinek <jakub@redhat.com>

* libgomp.texi (OpenMP 5.1): Mark thread_limit clause to target
construct as implemented.

openmp: Regimplify operands of GIMPLE_COND in a few more places [PR103208]

As the testcase shows, the non-rectangular loop expansion code didn't
try to regimplify operands of GIMPLE_CONDs it built in some cases.
I have added a helper function which does that and used it in some places
that were regimplifying already to simplify those spots, plus added it
in a couple of other places where it was needed.

2021-11-16 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/103208
* omp-expand.c (expand_omp_build_cond): New function.
(expand_omp_for_init_counts, expand_omp_for_init_vars,
expand_omp_for_static_nochunk, expand_omp_for_static_chunk): Use it.

* c-c++-common/gomp/loop-11.c: New test.

waccess: Fix up pass_waccess::check_alloc_size_call [PR102009]

This function punts if the builtins have no arguments, but as can be seen
on the testcase, even if it has some arguments but alloc_size attribute's
arguments point to arguments that aren't passed, we get a warning earlier
from the FE but should punt rather than ICE on it.
Other users of alloc_size attribute e.g. in
tree-object-size.c (alloc_object_size) punt similarly and similarly
even in the same TU maybe_warn_nonstring_arg correctly verifies calls have
enough arguments.

2021-11-16 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/102009
* gimple-ssa-warn-access.cc (pass_waccess::check_alloc_size_call):
Punt if any of alloc_size arguments is out of bounds vs. number of
call arguments.

* gcc.dg/pr102009.c: New test.

x86_64: Avoid rorx rotation instructions with -Os.

This patch teaches the i386 backend to avoid using BMI2's rorx
instructions when optimizing for size.  The benefits are shown
with the following example:

unsigned int ror1(unsigned int x) { return (x >> 1) | (x << 31); }
unsigned int ror2(unsigned int x) { return (x >> 2) | (x << 30); }
unsigned int rol2(unsigned int x) { return (x >> 30) | (x << 2); }
unsigned int rol1(unsigned int x) { return (x >> 31) | (x << 1); }

which currently with -Os -march=cascadelake generates:

ror1: rorx    $1, %edi, %eax // 6 bytes
        ret
ror2: rorx    $2, %edi, %eax // 6 bytes
        ret
rol2: rorx    $30, %edi, %eax // 6 bytes
        ret
rol1: rorx    $31, %edi, %eax // 6 bytes
        ret

but with this patch now generates:

ror1: movl    %edi, %eax // 2 bytes
        rorl    %eax // 2 bytes
        ret
ror2: movl    %edi, %eax // 2 bytes
        rorl    $2, %eax // 3 bytes
        ret
rol2: movl    %edi, %eax // 2 bytes
        roll    $2, %eax // 3 bytes
        ret
rol1: movl    %edi, %eax // 2 bytes
        roll    %eax // 2 bytes
        ret

I've confirmed that this patch is a win on the CSiBE benchmark,
even though rotations are rare, where for example libmspack/test/md5.o
shrinks from 5824 bytes to 5632 bytes.

2021-11-16  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
* config/i386/i386.md (*bmi2_rorx<mode>3_1): Make conditional
on !optimize_function_for_size_p.
(*<any_rotate><mode>3_1): Add preferred_for_size attribute.
(define_splits): Conditionalize on !optimize_function_for_size_p.
(*bmi2_rorxsi3_1_zext): Likewise.
(*<any_rotate>si2_1_zext): Add preferred_for_size attribute.
(define_splits): Conditionalize on !optimize_function_for_size_p.

Fix uninitialized access in merge_call_side_effects

gcc/ChangeLog:

PR ipa/103262
* ipa-modref.c (merge_call_side_effects): Fix uninitialized
access.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/modref-dse-5.c: New test.

tree-optimization: [PR103245] Improve detection of abs pattern using multiplication

So while working on PR 103228 (and a few others), I noticed the testcase for PR 94785
was failing. The problem is that the nop_convert moved from being inside the IOR to be
outside of it. I also noticed the patch for PR 103228 was not needed to reproduce the
issue either.
This patch combines the two patterns together for the abs match when using multiplication
and adds a few places where nop_convert is optional.
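
As an illustration, assuming the idiom in question is the (-(x < 0) | 1) * x spelling of abs (the message above does not quote the pattern, so this is an assumption), the source-level form looks like the sketch below. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdlib>

// (-(x < 0) | 1) evaluates to -1 for negative x and to 1 otherwise,
// so multiplying by x yields its absolute value.
// As with std::abs, the result is undefined for INT_MIN.
int abs_by_multiplication(int x) {
    return (-(x < 0) | 1) * x;
}

// Asserts that the idiom agrees with std::abs for the given input.
void check_abs(int x) {
    assert(abs_by_multiplication(x) == std::abs(x));
}
```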

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

PR tree-optimization/103245

gcc/ChangeLog:

* match.pd: Combine the abs pattern matching using multiplication.
Adding optional nop_convert too.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/pr103245-1.c: New test.

Add a missing return when transforming atomic bit test and operations

When failing to transform equivalent, but slightly different cases of
atomic bit test and operations to their canonical forms, return
immediately.

gcc/

PR middle-end/103268
* tree-ssa-ccp.c (optimize_atomic_bit_test_and): Add a missing
return.

gcc/testsuite/

PR middle-end/103268
* gcc.dg/pr103268-1.c: New test.
* gcc.dg/pr103268-2.c: Likewise.

Update my email address.

* MAINTAINERS: Update my address.

Daily bump.

c++: Add -fimplicit-constexpr

With each successive C++ standard the restrictions on the use of the
constexpr keyword for functions get weaker and weaker; it recently occurred
to me that it is heading toward the same fate as the C register keyword,
which was once useful for optimization but became obsolete. Similarly, it
seems to me that we should be able to just treat inlines as constexpr
functions and not make people add the extra keyword everywhere.

There were a lot of testcase changes needed; many disabling errors about
non-constexpr functions that are now constexpr, and many disabling implicit
constexpr so that the tests can check the same thing as before, whether
that's mangling or whatever.

gcc/c-family/ChangeLog:

* c.opt: Add -fimplicit-constexpr.
* c-cppbuiltin.c: Define __cpp_implicit_constexpr.
* c-opts.c (c_common_post_options): Disable below C++14.

gcc/cp/ChangeLog:

* cp-tree.h (struct lang_decl_fn): Add implicit_constexpr.
(decl_implicit_constexpr_p): New.
* class.c (type_maybe_constexpr_destructor): Use
TYPE_HAS_TRIVIAL_DESTRUCTOR and maybe_constexpr_fn.
(finalize_literal_type_property): Simplify.
* constexpr.c (is_valid_constexpr_fn): Check for dtor.
(maybe_save_constexpr_fundef): Try to set DECL_DECLARED_CONSTEXPR_P
on inlines.
(cxx_eval_call_expression): Use maybe_constexpr_fn.
(maybe_constexpr_fn): Handle flag_implicit_constexpr.
(var_in_maybe_constexpr_fn): Use maybe_constexpr_fn.
(potential_constant_expression_1): Likewise.
(decl_implicit_constexpr_p): New.
* decl.c (validate_constexpr_redeclaration): Allow change with
-fimplicit-constexpr.
(grok_special_member_properties): Use maybe_constexpr_fn.
* error.c (dump_function_decl): Don't print 'constexpr'
if it's implicit.
* Make-lang.in (check-c++-all): Update.

libstdc++-v3/ChangeLog:

* testsuite/20_util/to_address/1_neg.cc: Adjust error.
* testsuite/26_numerics/random/concept.cc: Adjust asserts.

gcc/testsuite/ChangeLog:

* lib/g++-dg.exp: Handle "impcx".
* lib/target-supports.exp
(check_effective_target_implicit_constexpr): New.
* g++.dg/abi/abi-tag16.C:
* g++.dg/abi/abi-tag18a.C:
* g++.dg/abi/guard4.C:
* g++.dg/abi/lambda-defarg1.C:
* g++.dg/abi/mangle26.C:
* g++.dg/cpp0x/constexpr-diag3.C:
* g++.dg/cpp0x/constexpr-ex1.C:
* g++.dg/cpp0x/constexpr-ice5.C:
* g++.dg/cpp0x/constexpr-incomplete2.C:
* g++.dg/cpp0x/constexpr-memfn1.C:
* g++.dg/cpp0x/constexpr-neg3.C:
* g++.dg/cpp0x/constexpr-specialization.C:
* g++.dg/cpp0x/inh-ctor19.C:
* g++.dg/cpp0x/inh-ctor30.C:
* g++.dg/cpp0x/lambda/lambda-mangle3.C:
* g++.dg/cpp0x/lambda/lambda-mangle5.C:
* g++.dg/cpp1y/auto-fn12.C:
* g++.dg/cpp1y/constexpr-loop5.C:
* g++.dg/cpp1z/constexpr-lambda7.C:
* g++.dg/cpp2a/constexpr-dtor3.C:
* g++.dg/cpp2a/constexpr-new13.C:
* g++.dg/cpp2a/constinit11.C:
* g++.dg/cpp2a/constinit12.C:
* g++.dg/cpp2a/constinit14.C:
* g++.dg/cpp2a/constinit15.C:
* g++.dg/cpp2a/spaceship-constexpr1.C:
* g++.dg/cpp2a/spaceship-eq3.C:
* g++.dg/cpp2a/udlit-class-nttp-neg2.C:
* g++.dg/debug/dwarf2/auto1.C:
* g++.dg/debug/dwarf2/cdtor-1.C:
* g++.dg/debug/dwarf2/lambda1.C:
* g++.dg/debug/dwarf2/pr54508.C:
* g++.dg/debug/dwarf2/pubnames-2.C:
* g++.dg/debug/dwarf2/pubnames-3.C:
* g++.dg/ext/is_literal_type3.C:
* g++.dg/ext/visibility/template7.C:
* g++.dg/gcov/gcov-12.C:
* g++.dg/gcov/gcov-2.C:
* g++.dg/ipa/devirt-35.C:
* g++.dg/ipa/devirt-36.C:
* g++.dg/ipa/devirt-37.C:
* g++.dg/ipa/devirt-44.C:
* g++.dg/ipa/imm-devirt-1.C:
* g++.dg/lookup/builtin5.C:
* g++.dg/lto/inline-crossmodule-1_0.C:
* g++.dg/modules/enum-1_a.C:
* g++.dg/modules/fn-inline-1_c.C:
* g++.dg/modules/pmf-1_b.C:
* g++.dg/modules/used-1_c.C:
* g++.dg/tls/thread_local11.C:
* g++.dg/tls/thread_local11a.C:
* g++.dg/tm/pr46653.C:
* g++.dg/ubsan/pr70035.C:
* g++.old-deja/g++.other/delete6.C:
* g++.dg/modules/pmf-1_a.H:
Adjust for implicit constexpr.

c++: split_nonconstant_init and flexarrays

split_nonconstant_init was doing the wrong thing for both the initialization
and cleanup here; we know the size from the initializer, and we can pass it
along. This doesn't make the testcase work, since the y destructor is still
broken, but it removes the wrong error for the aggregate initialization.

gcc/cp/ChangeLog:

* typeck2.c (split_nonconstant_init_1): Handle flexarrays better.

gcc/testsuite/ChangeLog:

* g++.dg/ext/flexary37.C: Remove expected error.

gimple-fold: Use ranges to simplify strncat and snprintf

Use ranges for lengths and object sizes in strncat and snprintf to
determine if they can be transformed into simpler operations.
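
The library-level fact such a transformation relies on is that strncat behaves like strcat whenever the bound is at least the length of the source string. The sketch below demonstrates that with two hypothetical helpers; it does not model the range analysis itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Appends at most `bound` characters of `src` to the prefix "ab".
std::string append_with_bound(const char* src, std::size_t bound) {
    char buf[32] = "ab";
    assert(std::strlen(src) < sizeof buf - 2);  // prefix + src + NUL must fit
    std::strncat(buf, src, bound);
    return buf;
}

// Appends all of `src` to the prefix "ab".
std::string append_unbounded(const char* src) {
    char buf[32] = "ab";
    assert(std::strlen(src) < sizeof buf - 2);  // prefix + src + NUL must fit
    std::strcat(buf, src);
    return buf;
}
```

When the bound is smaller than the source length the two calls differ, which is why the bound has to be known to be large enough before simplifying.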

gcc/ChangeLog:

* gimple-fold.c (gimple_fold_builtin_strncat): Use ranges to
determine if it is safe to transform to strcat.
(gimple_fold_builtin_snprintf): Likewise.

gcc/testsuite/ChangeLog:

* gcc.dg/fold-stringops-2.c: Define size_t.
(safe1): Adjust.
(safe4): New test.
* gcc.dg/fold-stringops-3.c: New test.

Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>

gimple-fold: Use ranges to simplify _chk calls

Instead of comparing LEN and SIZE only if they are constants, use their
ranges to decide if LEN will always be lower than or same as SIZE.
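
The range comparison can be sketched as below, assuming each value is described by a closed interval. ValueRange and always_lower_or_equal are hypothetical names for illustration, not the known_lower implementation in gimple-fold.c.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical closed interval [min, max] of possible values.
struct ValueRange {
    std::uint64_t min;
    std::uint64_t max;
};

// LEN is always lower than or equal to SIZE when the largest possible
// LEN does not exceed the smallest possible SIZE.
bool always_lower_or_equal(const ValueRange& len, const ValueRange& size) {
    assert(len.min <= len.max && size.min <= size.max);
    return len.max <= size.min;
}
```

A constant is simply the degenerate range with min equal to max, so the constant-only comparison is a special case of this check.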

This change ends up putting the stringop-overflow warning line number
against the strcpy implementation, so adjust the warning check to be
line number agnostic.

gcc/ChangeLog:

* gimple-fold.c (known_lower): New function.
(gimple_fold_builtin_strncat_chk,
gimple_fold_builtin_memory_chk, gimple_fold_builtin_stxcpy_chk,
gimple_fold_builtin_stxncpy_chk,
gimple_fold_builtin_snprintf_chk,
gimple_fold_builtin_sprintf_chk): Use it.

gcc/testsuite/ChangeLog:

* gcc.dg/Wobjsize-1.c: Make warning change line agnostic.
* gcc.dg/fold-stringops-2.c: New test.

Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>

gimple-fold: Transform stp*cpy_chk to str*cpy directly

Avoid going through another folding cycle and use the ignore flag to
directly transform BUILT_IN_STPCPY_CHK to BUILT_IN_STRCPY when set,
likewise for BUILT_IN_STPNCPY_CHK to BUILT_IN_STRNCPY.

Dump the transformation in dump_file so that we can verify in tests that
the direct transformation actually happened.

gcc/ChangeLog:

* gimple-fold.c (dump_transformation): New function.
(gimple_fold_builtin_stxcpy_chk,
gimple_fold_builtin_stxncpy_chk): Use it. Simplify to
BUILT_IN_STRNCPY if return value is not used.

gcc/testsuite/ChangeLog:

* gcc.dg/fold-stringops-1.c: New test.

Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>

Check optab before transforming atomic bit test and operations

Check optab before transforming equivalent, but slightly different cases
of atomic bit test and operations to their canonical forms.

gcc/

PR middle-end/103184
* tree-ssa-ccp.c (optimize_atomic_bit_test_and): Check optab
before transforming equivalent, but slightly different cases to
their canonical forms.

gcc/testsuite/

PR middle-end/103184
* gcc.dg/pr103184-1.c: New test.
* gcc.dg/pr103184-2.c: Likewise.

IPA: Provide a mechanism to register static DTORs via cxa_atexit.

For at least one target (Darwin) the platform convention is to
register static destructors (i.e. __attribute__((destructor)))
with __cxa_atexit rather than placing them into a list that is
run by some other mechanism.

This patch provides a target hook that allows a target to opt
into this and handling for the process in ipa_cdtor_merge ().

When the mode is enabled (dtors_from_cxa_atexit is set) we:

* Generate new CTORs to register static destructors with
   __cxa_atexit and add them to the existing list of CTORs;
   we then process the revised CTORs list.

* We sort the DTORs into priority and then TU order; this
   means that they are registered in that order with
   __cxa_atexit () and therefore will be run in the reverse
   order.

* Likewise, CTORs are sorted into priority and then TU order,
   which means that they will run in that order.

This matches the behavior of using init/fini (or
mod_init_func/mod_term_func) sections.
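
The ordering rule for DTORs can be modelled with a short sketch: sort by priority and then TU order, register in that order, and run last-registered-first as exit handlers do. The Dtor record and dtor_run_order function are hypothetical, and ascending sort order is an assumption; this is not the ipa.c code.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record for one static destructor.
struct Dtor {
    int priority;
    int tu_order;
    std::string name;
};

// Returns the names in the order the destructors would run.
std::vector<std::string> dtor_run_order(std::vector<Dtor> dtors) {
    // Sort into priority order, then TU order (ascending is assumed here).
    std::sort(dtors.begin(), dtors.end(), [](const Dtor& a, const Dtor& b) {
        if (a.priority != b.priority)
            return a.priority < b.priority;
        return a.tu_order < b.tu_order;
    });

    // Registration happens in sorted order.
    std::vector<std::string> registered;
    for (const Dtor& d : dtors)
        registered.push_back(d.name);

    // Handlers registered for exit run last-registered-first.
    std::vector<std::string> run(registered.rbegin(), registered.rend());
    assert(run.size() == dtors.size());
    return run;
}
```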

This also fixes a bug where Fortran needs a DTOR to be run to
close IO.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
PR fortran/102992

gcc/ChangeLog:

* config/darwin.h (TARGET_DTORS_FROM_CXA_ATEXIT): New.
* doc/tm.texi: Regenerated.
* doc/tm.texi.in: Add TARGET_DTORS_FROM_CXA_ATEXIT hook.
* ipa.c (cgraph_build_static_cdtor_1): Return the built
function decl.
(build_cxa_atexit_decl): New.
(build_dso_handle_decl): New.
(build_cxa_dtor_registrations): New.
(compare_cdtor_tu_order): New.
(build_cxa_atexit_fns): New.
(ipa_cdtor_merge): If dtors_from_cxa_atexit is set,
process the DTORs/CTORs accordingly.
(pass_ipa_cdtor_merge::gate): Also run if
dtors_from_cxa_atexit is set.
* target.def (dtors_from_cxa_atexit): New hook.

configure, Darwin: Check ld64 support for -platform-version.

Newer versions of ld64 allow specifying the OS target (e.g.
macos or ios), the version and the SDK version all in a single
command. This checks the availability of the command for the
current toolchain.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/ChangeLog:

* config.in: Regenerate.
* configure: Regenerate.
* configure.ac: Test ld64 for -platform-version support.

testsuite, Darwin: In tsvc.h, use malloc for Darwin <= 9.

Earlier Darwin versions do not have posix_memalign() but the
malloc implementation is guaranteed to produce memory suitably
aligned for the largest vector type.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/testsuite/ChangeLog:

* gcc.dg/vect/tsvc/tsvc.h: Use malloc for Darwin 9 and
earlier.

Ada, Darwin : Use DSYMUTIL_FOR_TARGET in libgnat/gnarl builds.

Most of the time we get away with using the dsymutil that is
installed with the latest Xcode, however for some cross-compilation
cases that does not work.

We now have the ability to specify the correct dsymutil to use for
the toolchain (--with-dsymutil=) and we should use that specified
tool for debug link. Fixes cross-compilers from x86-64 to powerpc.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/ada/ChangeLog:

* gcc-interface/Makefile.in: Use DSYMUTIL_FOR_TARGET in
libgnat/libgnarl recipes.

libstdc++: Unordered containers merge re-use hash code

When merging two unordered containers with the same hasher, we can re-use the
hash code from the cache, if any.

Also, in the context of the merge operation on multi-containers, use the
previous insert iterator as a hint for the next insert.
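
As an illustration of the user-facing operation this change optimizes, here is
a minimal sketch of the C++17 merge() member on two containers that share the
default std::hash<std::string> hasher. The helper names merge_sets and
merge_multisets are made up for this example. Whether a cached hash code is
re-used is an internal libstdc++ detail and cannot be observed from this code.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>

// Hypothetical helper: merge `src` into `dst` and report how many elements
// stayed behind in `src` (those whose keys were already present in `dst`).
std::size_t merge_sets(std::unordered_set<std::string>& dst,
                       std::unordered_set<std::string>& src)
{
    dst.merge(src);  // C++17: nodes are spliced over, elements are not copied
    return src.size();
}

// Hypothetical helper: a multi-container accepts duplicates, so every node
// moves and `src` ends up empty. Returns the new size of `dst`.
std::size_t merge_multisets(std::unordered_multiset<std::string>& dst,
                            std::unordered_multiset<std::string>& src)
{
    dst.merge(src);
    return dst.size();
}
```

With dst = {"x", "y"} and src = {"y", "z"}, merge_sets leaves "y" in src,
while merge_multisets moves both elements.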

libstdc++-v3/ChangeLog:

* include/bits/hashtable_policy.h:
(_Hash_code_base<>::_M_hash_code(const _Hash&, const _Hash_node_value<_Value, true>&)): New.
(_Hash_code_base<>::_M_hash_code<_H2>(const _H2&, const _Hash_node_value<>&)): New.
* include/bits/hashtable.h (_Hashtable<>::_M_merge_unique): Use latter.
(_Hashtable<>::_M_merge_multi): Likewise.
* testsuite/23_containers/unordered_multiset/modifiers/merge.cc (test05): New test.
* testsuite/23_containers/unordered_set/modifiers/merge.cc (test04): New test.

Use 'location_hash' for 'gcc/diagnostic-spec.h:nowarn_map'

Instead of hard-coded '0'/'UINT_MAX', we now use the 'RESERVED_LOCATION_P'
values 'UNKNOWN_LOCATION'/'BUILTINS_LOCATION' as spare values for
'Empty'/'Deleted', and generally simplify the code.

gcc/
* diagnostic-spec.h (typedef xint_hash_t)
(typedef xint_hash_map_t): Replace with...
(typedef nowarn_map_t): ... this.
(nowarn_map): Adjust.
* diagnostic-spec.c (nowarn_map, suppress_warning_at): Likewise.

Use 'location_hash' for 'seen_locations' in 'gcc/profile.c:branch_prob'

Follow-up to commit 102fcf94e625a2016a65829c73a42bd6c2420376
"Fix GCOV CFG related issues": considering the current
'int_hash <location_t, 0, 2>', per 'libcpp/include/line-map.h':

      Actual     | Value                         | Meaning
      -----------+-------------------------------+-------------------------------
      0x00000000 | UNKNOWN_LOCATION (gcc/input.h)| Unknown/invalid location.
      -----------+-------------------------------+-------------------------------
      0x00000001 | BUILTINS_LOCATION             | The location for declarations
                 |   (gcc/input.h)               | in "<built-in>"
      -----------+-------------------------------+-------------------------------
      0x00000002 | RESERVED_LOCATION_COUNT       | The first location to be
                 | (also                         | handed out, and the
                 |  ordmap[0]->start_location)   | first line in ordmap 0

... this currently uses value '0' ('UNKNOWN_LOCATION') as the spare value for
'Empty', and value '2' ('RESERVED_LOCATION_COUNT') as the spare value for
'Deleted', which is questionable?

What actually does get put into 'seen_locations' is (mostly...)
restricted/gated by '!RESERVED_LOCATION_P' (which is true unless
'UNKNOWN_LOCATION' or 'BUILTINS_LOCATION'), thus we may simply use
'location_hash'.
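
To illustrate why the choice of spare values matters, here is a minimal sketch
of an open-addressing set that marks slots with two reserved key values. It is
not GCC's hash_table; LocSet, loc_t, kEmpty, kDeleted and is_reserved are
hypothetical names, with 0 and 1 standing in for 'UNKNOWN_LOCATION' and
'BUILTINS_LOCATION'. The table has a fixed capacity and never grows.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using loc_t = std::uint32_t;    // stand-in for location_t (assumption)
constexpr loc_t kEmpty = 0;     // plays the role of UNKNOWN_LOCATION
constexpr loc_t kDeleted = 1;   // plays the role of BUILTINS_LOCATION

// Mirrors the idea of RESERVED_LOCATION_P: these keys are never stored.
constexpr bool is_reserved(loc_t l) { return l <= kDeleted; }

class LocSet {
public:
    // `capacity` must be greater than zero.
    explicit LocSet(std::size_t capacity = 16) : slots_(capacity, kEmpty) {}

    // Returns true if `l` was newly inserted; reserved keys are rejected.
    bool insert(loc_t l) {
        if (is_reserved(l) || find(l) != slots_.size()) return false;
        const std::size_t n = slots_.size();
        std::size_t p = l % n;
        for (std::size_t i = 0; i < n; ++i, p = (p + 1) % n) {
            if (is_reserved(slots_[p])) {  // Empty or Deleted: reusable slot
                slots_[p] = l;
                return true;
            }
        }
        return false;  // table is full
    }

    bool contains(loc_t l) const {
        return !is_reserved(l) && find(l) != slots_.size();
    }

    // Leaves a Deleted marker so that longer probe chains stay reachable.
    bool erase(loc_t l) {
        if (is_reserved(l)) return false;
        const std::size_t p = find(l);
        if (p == slots_.size()) return false;
        slots_[p] = kDeleted;
        return true;
    }

private:
    // Index of the slot holding `l`, or slots_.size() if it is absent.
    std::size_t find(loc_t l) const {
        const std::size_t n = slots_.size();
        std::size_t p = l % n;
        for (std::size_t i = 0; i < n; ++i, p = (p + 1) % n) {
            if (slots_[p] == l) return p;
            if (slots_[p] == kEmpty) break;  // the probe chain ends here
        }
        return n;
    }

    std::vector<loc_t> slots_;
};
```

Because only 0 and 1 are reserved in this sketch, the value 2 remains an
ordinary key, which is not the case when 2 is used as the 'Deleted' marker.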

gcc/
* profile.c (branch_prob): Use 'location_hash' for
'seen_locations'.

Drop tree overflow in irange setter.

Drop meaningless overflow that may creep into the IL.

gcc/ChangeLog:

PR tree-optimization/103207
* value-range.cc (irange::set): Drop overflow.

gcc/testsuite/ChangeLog:

* gcc.dg/pr103207.c: New test.

Fortran: openmp: Add support for thread_limit clause on target

gcc/fortran/ChangeLog:

* openmp.c (OMP_TARGET_CLAUSES): Add thread_limit.
* trans-openmp.c (gfc_split_omp_clauses): Add thread_limit also to
teams.

libgomp/ChangeLog:

* testsuite/libgomp.fortran/thread-limit-1.f90: New test.

testsuite: Add testcase for already fixed PR [PR100469]

This bug, introduced in r11-7448-gff92ede8d269375f800e1b347a48f4698874b4a3,
has already been fixed by r12-1354-g2d2ed777b23ab6503027039e0adbfe1162f52b2f
aka PR100852 fix.

2021-11-15 Jakub Jelinek <jakub@redhat.com>

PR debug/100469
* g++.dg/opt/pr100469.C: New test.

x86: Add gcc.target/i386/pr103205-2.c

PR target/103205
* gcc.target/i386/pr103205-2.c: New test.

libffi: Update LOCAL_PATCHES

Add

commit a91f844ef449d0dd1cf2e0e47b0ade0d8a6304e1
Author: Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>
Date: Mon Nov 15 10:24:27 2021 +0100

libffi: Use #define instead of .macro in src/x86/win64.S [PR102874]

to LOCAL_PATCHES.

* LOCAL_PATCHES: Add commit a91f844ef44.

openmp: Add support for thread_limit clause on target

OpenMP 5.1 says that thread_limit clause can also appear on target,
and similarly to teams should affect the thread-limit-var ICV.
On combined target teams, the clause goes to both.

We actually passed thread_limit internally on target already before,
but only used it for gcn/ptx offloading to hint how many threads should be
created and for ptx didn't set thread_limit_var in that case.
Similarly for host fallback.
Also, I found that we weren't copying the args array that contains encoded
thread_limit and num_teams clause for target (etc.) for async target.

2021-11-15 Jakub Jelinek <jakub@redhat.com>

gcc/
* gimplify.c (optimize_target_teams): Only add OMP_CLAUSE_THREAD_LIMIT
to OMP_TARGET_CLAUSES if it isn't there already.
gcc/c-family/
* c-omp.c (c_omp_split_clauses) <case OMP_CLAUSE_THREAD_LIMIT>:
Duplicate to both OMP_TARGET and OMP_TEAMS.
gcc/c/
* c-parser.c (OMP_TARGET_CLAUSE_MASK): Add
PRAGMA_OMP_CLAUSE_THREAD_LIMIT.
gcc/cp/
* parser.c (OMP_TARGET_CLAUSE_MASK): Add
PRAGMA_OMP_CLAUSE_THREAD_LIMIT.
libgomp/
* task.c (gomp_create_target_task): Copy args array as well.
* target.c (gomp_target_fallback): Add args argument.
Set gomp_icv (true)->thread_limit_var if thread_limit is present.
(GOMP_target): Adjust gomp_target_fallback caller.
(GOMP_target_ext): Likewise.
(gomp_target_task_fn): Likewise.
* config/nvptx/team.c (gomp_nvptx_main): Set
gomp_global_icv.thread_limit_var.
* testsuite/libgomp.c-c++-common/thread-limit-1.c: New test.

Fix PHI ordering problems in the path solver.

After auditing the PHI range calculations, I'm not convinced we've
caught all the corner cases.  They haven't shown up in the wild (yet),
but better safe than sorry.

We shouldn't write anything to the cache or trigger additional
lookups while calculating a PHI, as this may cause ordering problems.
We should resolve the PHI with either the cache as it stands, or by
asking for ranges on entry to the path.  I've documented this.

There was one dubious case where we called fold_range in
ssa_range_in_phi, which mostly by luck wasn't triggering lookups,
because fold_range solves a PHI by calling range_on_edge, which is set
to pick up global ranges by default in path_range_query.  This is
fragile, so I've rewritten the call to explicitly use cached or global
ranges.

Also, the cache should be avoided in ssa_range_in_phi when the arg is
defined in the PHI's block, as not doing so could create an ordering
problem.  We have a similar check when calculating relations in PHIs.

Tested on x86-64 & ppc64le Linux.

gcc/ChangeLog:

* gimple-range-path.cc (path_range_query::internal_range_of_expr):
Remove useless code.
(path_range_query::ssa_defined_in_bb): New.
(path_range_query::ssa_range_in_phi): Avoid fold_range call that
could trigger additional lookups.
Do not use the cache for ARGs defined in this block.
(path_range_query::compute_ranges_in_block): Use ssa_defined_in_bb.
(path_range_query::maybe_register_phi_relation): Same.
(path_range_query::range_of_stmt): Adjust comment.
* gimple-range-path.h (ssa_defined_in_bb): New.

path solver: Default to global range if nothing found.

This has been a long time coming, but we weren't able to make the
change because of some unrelated regressions.

Tested on x86-64 & ppc64le Linux.

gcc/ChangeLog:

* gimple-range-path.cc (path_range_query::internal_range_of_expr):
Default to global range if nothing found.

gcc/testsuite/ChangeLog:

* g++.dg/tree-ssa/pr31146-2.C: Add -fno-thread-jumps.

tree-optimization/103237 - avoid vectorizing unhandled double reductions

Double reductions which have multiple LC PHIs in the inner loop
are not handled correctly during transformation since those PHIs
are not properly classified as reduction. The following disables
vectorizing them.

2021-11-15 Richard Biener <rguenther@suse.de>

PR tree-optimization/103237
* tree-vect-loop.c (vect_is_simple_reduction): Fail for
double reductions with multiple inner loop LC PHI nodes.

* gcc.dg/torture/pr103237.c: New testcase.

PR target/103069: Relax cmpxchg loop for x86 target

From the CPU's point of view, getting a cache line for writing is more
expensive than reading.  See Appendix A.2 Spinlock in:

https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/xeon-lock-scaling-analysis-paper.pdf

The full compare and swap grabs the cache line in exclusive state and causes
excessive cache line bouncing.

The atomic_fetch_{or,xor,and,nand} builtins generate a cmpxchg loop under
-march=x86-64 like:

movl v(%rip), %eax
.L2:
movl %eax, %ecx
movl %eax, %edx
orl $1, %ecx
lock cmpxchgl %ecx, v(%rip)
jne .L2
movl %edx, %eax
andl $1, %eax
ret

To relax the above loop, GCC should first emit a normal load, then check it
and jump to .L2 if cmpxchgl may fail. Before the jump to .L2, PAUSE should be
inserted to yield the CPU to another hyperthread and to save power, so the
code looks like:

.L84:
        movl    (%rdi), %ecx
        movl    %eax, %edx
        orl     %esi, %edx
        cmpl    %eax, %ecx
        jne     .L82
        lock cmpxchgl   %edx, (%rdi)
        jne     .L84
.L82:
        rep nop
        jmp     .L84

This patch adds corresponding atomic_fetch_op expanders to insert the load,
compare and pause for all the atomic logic fetch builtins. It also adds the
flag -mrelax-cmpxchg-loop to control whether to generate the relaxed loop.
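
For readers who prefer source code to assembly, the relaxed loop can be
approximated at the C++ level as below. This is only a sketch of the
load/compare/pause idea, not the code GCC emits: relaxed_fetch_or and cpu_relax
are hypothetical names, the memory orders are an assumption, and refreshing the
expected value after the pause is a choice made for this sketch.

```cpp
#include <atomic>

// PAUSE hint on x86 with GCC or Clang; a no-op elsewhere (assumption).
static inline void cpu_relax()
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_ia32_pause();
#endif
}

// Returns the value held before the OR, like std::atomic::fetch_or.
unsigned relaxed_fetch_or(std::atomic<unsigned>& v, unsigned mask)
{
    unsigned expected = v.load(std::memory_order_relaxed);
    for (;;) {
        // Plain load first: reading does not require exclusive ownership
        // of the cache line.
        const unsigned seen = v.load(std::memory_order_relaxed);
        if (seen != expected) {
            // The compare-and-swap would fail, so skip it, pause, and
            // retry with the freshly observed value.
            cpu_relax();
            expected = seen;
            continue;
        }
        // On failure compare_exchange_weak stores the current value into
        // `expected`; on success `expected` still holds the old value.
        if (v.compare_exchange_weak(expected, expected | mask,
                                    std::memory_order_seq_cst,
                                    std::memory_order_relaxed))
            return expected;
    }
}
```

The point of the extra load is that contended waiters spin on a shared read
instead of repeatedly requesting the line for writing.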

gcc/ChangeLog:

PR target/103069
* config/i386/i386-expand.c (ix86_expand_atomic_fetch_op_loop):
New expand function.
* config/i386/i386-options.c (ix86_target_string): Add
-mrelax-cmpxchg-loop flag.
(ix86_valid_target_attribute_inner_p): Likewise.
* config/i386/i386-protos.h (ix86_expand_atomic_fetch_op_loop):
New expand function prototype.
* config/i386/i386.opt: Add -mrelax-cmpxchg-loop.
* config/i386/sync.md (atomic_fetch_<logic><mode>): New expander
for SI,HI,QI modes.
(atomic_<logic>_fetch<mode>): Likewise.
(atomic_fetch_nand<mode>): Likewise.
(atomic_nand_fetch<mode>): Likewise.
(atomic_fetch_<logic><mode>): New expander for DI,TI modes.
(atomic_<logic>_fetch<mode>): Likewise.
(atomic_fetch_nand<mode>): Likewise.
(atomic_nand_fetch<mode>): Likewise.
* doc/invoke.texi: Document -mrelax-cmpxchg-loop.

gcc/testsuite/ChangeLog:

PR target/103069
* gcc.target/i386/pr103069-1.c: New test.
* gcc.target/i386/pr103069-2.c: Ditto.

tree-optimization/103219 - avoid ICE in unroll-and-jam

For no particularly good reason unroll-and-jam uses single_dom_exit
to determine the exit for the region it wants to run VN on.  That
happens to ICE because of the dominance restriction.  Use single_exit
instead.

2021-11-15  Richard Biener  <rguenther@suse.de>

PR tree-optimization/103219
* gimple-loop-jam.c (tree_loop_unroll_and_jam): Use single_exit
to determine the exit for the VN region.

* gcc.dg/torture/pr103219.c: New testcase.

[tree-vectorizer.c] Merge pass_vectorize::execute with vectorize_loops and replace occurrences of cfun with function param.

gcc/ChangeLog:
* tree-ssa-loop.c (pass_vectorize): Move to tree-vectorizer.c.
(pass_data_vectorize): Likewise.
(make_pass_vectorize): Likewise.
* tree-vectorizer.c (vectorize_loops): Merge with
pass_vectorize::execute and replace cfun occurrences with fun param.
(adjust_simduid_builtins): Add fun param, replace cfun occurrences with
fun, and adjust callers appropriately.
(note_simd_array_uses): Likewise.
(vect_loop_dist_alias_call): Likewise.
(set_uid_loop_bbs): Likewise.
(vect_transform_loops): Likewise.
(try_vectorize_loop_1): Likewise.
(try_vectorize_loop): Likewise.

libffi: Use #define instead of .macro in src/x86/win64.S [PR102874]

The libffi 3.4.2 import badly broke Solaris/x86 bootstrap with the native
assembler:

Assembler:
        "/vol/gcc/src/hg/master/local/libffi/src/x86/win64.S", line 88 :
Illegal mnemonic
        Near line: ".macro epilogue"
        "/vol/gcc/src/hg/master/local/libffi/src/x86/win64.S", line 88 : Syntax
error
        Near line: ".macro epilogue"
        "/vol/gcc/src/hg/master/local/libffi/src/x86/win64.S", line 95 :
Illegal mnemonic
        Near line: ".endm"
        "/vol/gcc/src/hg/master/local/libffi/src/x86/win64.S", line 95 : Syntax
error
        Near line: ".endm"
        "/vol/gcc/src/hg/master/local/libffi/src/x86/win64.S", line 100 :
Illegal mnemonic
        Near line: " epilogue"
        "/vol/gcc/src/hg/master/local/libffi/src/x86/win64.S", line 100 :
Syntax error
        Near line: "epilogue"

Solaris as doesn't support .macro/.endm.

Fixed by using #define instead of the unportable .macro.

Tested on i386-pc-solaris2.11 and x86_64-pc-linux-gnu.

The bug has been reported upstream
(https://github.com/libffi/libffi/issues/665); a corresponding pull
request is also pending (https://github.com/libffi/libffi/pull/669).

2021-10-21  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

libffi:
PR libffi/102874
* src/x86/win64.S (epilogue): Use #define instead of .macro.

testsuite: i386: Require dfp in gcc.target/i386/pr101346.c

gcc.target/i386/pr101346.c currently FAILs on Solaris/x86:

FAIL: gcc.target/i386/pr101346.c (test for excess errors)

Excess errors:
/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.target/i386/pr101346.c:6:1:
error: decimal floating-point not supported for this target
/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.target/i386/pr101346.c:7:6:
error: decimal floating-point not supported for this target
/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.target/i386/pr101346.c:9:12:
warning: implicit declaration of function '__builtin_fabsd128'; did you
mean '__builtin_fabsf128'? [-Wimplicit-function-declaration]

Fixed by requiring dfp support. Tested on i386-pc-solaris2.11 and
x86_64-pc-linux-gnu.

2021-10-20 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>

gcc/testsuite:
* gcc.target/i386/pr101346.c: Require dfp support.

i386: Fix up x86 atomic_bit_test* expanders for !TARGET_HIMODE_MATH [PR103205]

With !TARGET_HIMODE_MATH, the OPTAB_DIRECT expand_simple_binop calls fail and
so we ICE. We don't really care if they are done promoted in SImode instead.
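
For context, the kind of source that can reach these expanders is a 16-bit
atomic bit test written as a fetch-and-mask idiom. The sketch below shows that
idiom with hypothetical helper names. Whether GCC turns it into a bit-test
instruction depends on version, target and flags, and the tuning needed to get
!TARGET_HIMODE_MATH is not reproduced here.

```cpp
#include <atomic>

// Atomically set bit `bit` (0..15) and report whether it was set before.
bool test_and_set_bit(std::atomic<unsigned short>& flags, unsigned bit)
{
    const unsigned short mask = static_cast<unsigned short>(1u << bit);
    return (flags.fetch_or(mask) & mask) != 0;
}

// Atomically clear bit `bit` (0..15) and report whether it was set before.
bool test_and_reset_bit(std::atomic<unsigned short>& flags, unsigned bit)
{
    const unsigned short mask = static_cast<unsigned short>(1u << bit);
    return (flags.fetch_and(static_cast<unsigned short>(~mask)) & mask) != 0;
}

// Atomically flip bit `bit` (0..15) and report whether it was set before.
bool test_and_complement_bit(std::atomic<unsigned short>& flags, unsigned bit)
{
    const unsigned short mask = static_cast<unsigned short>(1u << bit);
    return (flags.fetch_xor(mask) & mask) != 0;
}
```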

2021-11-15 Jakub Jelinek <jakub@redhat.com>

PR target/103205
* config/i386/sync.md (atomic_bit_test_and_set<mode>,
atomic_bit_test_and_complement<mode>,
atomic_bit_test_and_reset<mode>): Use OPTAB_WIDEN instead of
OPTAB_DIRECT.

* gcc.target/i386/pr103205.c: New test.

libgomp, nvptx: Honor OpenMP 5.1 num_teams lower bound

Here is a PTX implementation of what I was talking about: for
num_teams_upper 0, or whenever num_teams_lower <= num_blocks, the current
implementation is fine. But if the user explicitly asks for more teams than
we can provide in hardware, we need to stop assuming that
omp_get_team_num () is equal to the hw team id and instead use some team
specific memory (it is .shared for PTX) or, if none is provided, an array
indexed by the hw team id, and run some teams serially within the same hw
thread.
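
The serial execution described above can be modelled on the host as follows.
This is a simplified sketch, not the libgomp code: schedule_teams is a
hypothetical function, and the assumption that each hardware block starts at
its own block id and then advances its team number by num_blocks until the
teams run out is made for this illustration.

```cpp
#include <vector>

// For each hardware block, list the logical team numbers it would run in
// order. Assumption: a block starts with its own id as the team number and
// bumps it by num_blocks for each further team it runs serially.
std::vector<std::vector<unsigned>> schedule_teams(unsigned num_blocks,
                                                  unsigned num_teams_lower)
{
    // If fewer teams are requested than there are blocks, every block
    // still runs exactly one team, as in the existing behaviour.
    const unsigned total =
        num_teams_lower > num_blocks ? num_teams_lower : num_blocks;

    std::vector<std::vector<unsigned>> per_block(num_blocks);
    for (unsigned block = 0; block < num_blocks; ++block)
        for (unsigned team = block; team < total; team += num_blocks)
            per_block[block].push_back(team);
    return per_block;
}
```

With two blocks and a lower bound of five teams, block 0 runs teams 0, 2 and 4
and block 1 runs teams 1 and 3, so the team number no longer equals the
hardware block id.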

2021-11-15 Jakub Jelinek <jakub@redhat.com>

* config/nvptx/team.c (__gomp_team_num): Define as
__attribute__((shared)) var.
(gomp_nvptx_main): Initialize __gomp_team_num to 0.
* config/nvptx/target.c (__gomp_team_num): Declare as
extern __attribute__((shared)) var.
(GOMP_teams4): Use __gomp_team_num as the team number instead of
%ctaid.x. If first, initialize it to %ctaid.x. If num_teams_lower
is bigger than num_blocks, use num_teams_lower teams and arrange for
bumping of __gomp_team_num if !first and returning false once we run
out of teams.
* config/nvptx/teams.c (__gomp_team_num): Declare as
extern __attribute__((shared)) var.
(omp_get_team_num): Return __gomp_team_num value instead of %ctaid.x.

libgomp: Add a testcase for omp_get_num_teams inside of target inside of host teams

This is https://github.com/OpenMP/spec/issues/3183
There is an agreement that we should return 1 team inside of target,
even if that target is inside of host teams. We were doing that
when offloading but not during host fallback; r12-5151 should fix that
even for host fallback.

2021-11-15 Jakub Jelinek <jakub@redhat.com>

* testsuite/libgomp.c/teams-5.c: New test.

c++: location of lambda object and conversion call

Two things that had poor location info: we weren't giving the TARGET_EXPR
for a lambda object any location, and the call to a conversion function was
getting whatever input_location happened to be.

gcc/cp/ChangeLog:

* call.c (perform_implicit_conversion_flags): Use the location of
the argument.
* lambda.c (build_lambda_object): Set location on the TARGET_EXPR.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/lambda/lambda-switch.C: Adjust expected location.

c++: check constexpr constructor body

The implicit constexpr patch revealed that our checks for constexpr
constructors that could possibly produce a constant value (which
otherwise are IFNDR) were failing to look at most of the function body.
Fixing that required some library tweaks.

gcc/cp/ChangeLog:

* constexpr.c (maybe_save_constexpr_fundef): Also check whether the
body of a constructor is potentially constant.

libstdc++-v3/ChangeLog:

* src/c++17/memory_resource.cc: Add missing constexpr.
* include/experimental/internet: Only mark copy constructor
as constexpr with __cpp_constexpr_dynamic_alloc.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/constexpr-89285-2.C: Expect error.
* g++.dg/cpp1y/constexpr-89285.C: Adjust error.