git.ipfire.org Git - thirdparty/gcc.git/log

libstdc++: Add missing feature-test macro in <memory>

Per version.syn#2, <memory> is required to define
__cpp_lib_addressof_constexpr as 201603L.

Bootstrapped and tested on aarch64-linux-gnu.

Signed-off-by: Dhruv Chawla <dhruvc@nvidia.com>
libstdc++-v3/ChangeLog:
* include/std/memory: Define __glibcxx_want_addressof_constexpr.
* testsuite/20_util/headers/memory/version.cc: Test for macro
value.

(cherry picked from commit 0e65fef8717f404cf9c85bff51bf87d534f87828)

Daily bump.

c++/coroutines: Fix awaiter var creation [PR116506]

Awaiters always need to have a coroutine state frame copy since
they persist across potential supensions. It simplifies the later
analysis considerably to assign these early which we do when
building co_await expressions.

The cleanups in r15-3146-g47dbd69b1, unfortunately elided some of
processing used to cater for cases where the var created from an
xvalue, or is a pointer/reference type.

Corrected thus.

PR c++/116506
PR c++/116880

gcc/cp/ChangeLog:

* coroutines.cc (build_co_await): Ensure that xvalues are
materialised. Handle references/pointer values in awaiter
access expressions.
(is_stable_lvalue): New.
* decl.cc (cxx_maybe_build_cleanup): Handle null arg.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/pr116506.C: New test.
* g++.dg/coroutines/pr116880.C: New test.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
Co-authored-by: Jason Merrill <jason@redhat.com>
(cherry picked from commit 4c743798b1d4530b327dad7c606c610f3811fdbf)

c++: fix conversion issues

back-port the coroutine part of this.

Some issues caught by a check from another patch:

In build_ramp_function, we're assigning REFERENCE_TYPE things, so we need to
build the assignment directly rather than rely on functions that implement
C++ semantics.

gcc/cp/ChangeLog:

* coroutines.cc (cp_coroutine_transform::build_ramp_function): Build
INIT_EXPR directly.

(cherry picked from commit dd3f3c71df66ed6fd3872ab780f5813831100d1c)

c++, coroutines:Ensure bind exprs are visited once [PR98935].

Recent changes in the C++ FE and the coroutines implementation have
exposed a latent issue in which a bind expression containing a var
that we need to capture in the coroutine state gets visited twice.
This causes an ICE (from a checking assert).  Fixed by adding a pset
to the relevant tree walk.  Exit the callback early when the tree is
not a BIND_EXPR.

PR c++/98935

gcc/cp/ChangeLog:

* coroutines.cc (register_local_var_uses): Add a pset to the
tree walk to avoid visiting the same BIND_EXPR twice.  Make
an early exit for cases that the callback does not apply.
(cp_coroutine_transform::apply_transforms): Add a pset to the
tree walk for register_local_vars.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/pr98935.C: New test.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
(cherry picked from commit bd8c7e71f516bae29a5a9e517b266141458f3977)

c++/coro: ignore cleanup_point_exprs while expanding awaits [PR116793]

If we reach a CLEANUP_POINT_EXPR while trying to walk statements, we
actually care about the statement or statement list contained within it.

Indeed, such a construction started happening with
r15-3513-g964577c31df206, after temporary promotion.  In the test case
presented in PR116793, the compiler generated:

  <<cleanup_point {
    struct _cleanup_task Aw0 [value-expr: frame_ptr->Aw0_2_3];
    int T002 [value-expr: frame_ptr->T002_2_3];

      int T002 [value-expr: frame_ptr->T002_2_3];
    <<cleanup_point <<< Unknown tree: expr_stmt
      (void) (T002 = TARGET_EXPR <D.20994, 3>) >>>>>;
      struct _cleanup_task Aw0 [value-expr: frame_ptr->Aw0_2_3];
    <<cleanup_point <<< Unknown tree: expr_stmt
      (void) (Aw0 = TARGET_EXPR <D.20995, func ((int &) &T002)>) >>>>>;
    <<cleanup_point <<< Unknown tree: expr_stmt
      (void) (D.22450 = <<< Unknown tree: co_await
        TARGET_EXPR <D.20995, func ((int &) &T002)>
        Aw0

        {_cleanup_task::await_ready (&Aw0), _cleanup_task::await_suspend<_task1::promise_type> (&Aw0, TARGET_EXPR <D.21078, _Coro_self_handle>), <<< Unknown tree: aggr_init_expr
          4
          await_resume
          D.22443
          &Aw0 >>>}
        0 >>>) >>>>>;
    <<cleanup_point <<< Unknown tree: expr_stmt
      (void) (D.20991 = (struct tuple &) &D.22450) >>>>>;
  }
  D.22467 = 1;
  int & i [value-expr: frame_ptr->i_1_2];
  <<cleanup_point <<< Unknown tree: expr_stmt
    (void) (i = std::get<0, int&> (NON_LVALUE_EXPR <D.20991>)) >>>>>;>>;

... i.e. a statement list within a cleanup point.  In such a case, we
don't actually care about the cleanup point, but we do care about the
statement inside, so, we can just walk down into the CLEANUP_POINT_EXPR.

PR c++/116793

gcc/cp/ChangeLog:

* coroutines.cc (await_statement_expander): Just process
subtrees if encountering a CLEANUP_POINT_EXPR.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/pr116793-1.C: New test.

(cherry picked from commit d888a8a8dcf391197ae82e2bbf99507effc27950)

c++: simplify handling implicit INDIRECT_REF and co_await in convert_to_void

convert_to_void has, so far, when converting a co_await expression to
void altered the await_resume expression of a co_await so that it is
also converted to void.  This meant that the type of the await_resume
expression, which is also supposed to be the type of the whole co_await
expression, was not the same as the type of the CO_AWAIT_EXPR tree.

While this has not caused problems so far, it is unexpected, I think.

Also, convert_to_void had a special case when an INDIRECT_REF wrapped a
CALL_EXPR.  In this case, we also diagnosed maybe_warn_nodiscard.  This
was a duplication of logic related to converting call expressions to
void.

Instead, we can generalize a bit, and rather discard the expression that
was implicitly dereferenced instead.

This patch changes the diagnostic of:

  void f(struct S* x) { static_cast<volatile S&>(*x); }

... from:

  warning: indirection will not access object of incomplete type
           'volatile S' in statement

... to:

  warning: implicit dereference will not access object of type
           ‘volatile S’ in statement

... but should have no impact in other cases.

gcc/cp/ChangeLog:

* coroutines.cc (co_await_get_resume_call): Return a tree
directly, rather than a tree pointer.
* cp-tree.h (co_await_get_resume_call): Adjust signature
accordingly.
* cvt.cc (convert_to_void): Do not alter CO_AWAIT_EXPRs when
discarding them.  Simplify handling implicit INDIRECT_REFs.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/nodiscard-1.C: New test.

(cherry picked from commit de03ef6337b0a368d61c74b790313b4216c7ed6e)

c++/coro: prevent ICV_STATEMENT diagnostics in temp promotion [PR116502]

If such a diagnostic is necessary, it has already been emitted,
otherwise, it is not correct and emitting it here is inactionable by the
user, and bogus.

PR c++/116502

gcc/cp/ChangeLog:

* coroutines.cc (maybe_promote_temps): Convert temporary
initializers to void without complaining.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/maybe-unused-1.C: New test.
* g++.dg/coroutines/pr116502.C: New test.

(cherry picked from commit 05e4f07cad1eacf869c10622cae2a9cdee3b6a7a)

c++, coroutines: Revise promise construction/destruction.

In examining the coroutine testcases for unexpected diagnostic output
for 'Wall', I found a 'statement has no effect' warning for the promise
construction in one case.  In particular, the case is where the users
promise type has an implicit CTOR but a user-provided DTOR. Further, the
type does not actually need constructing.

In very early versions of the coroutines code we used to check
TYPE_NEEDS_CONSTRUCTING() to determine whether to attempt to build
a constructor call for the promise.  During review, it was suggested
to use type_build_ctor_call () instead.

This latter call checks the constructors in the type (both user-defined
and implicit) and returns true, amongst other cases if any of the found
CTORs are marked as deprecated.

In a number of places (for example [class.copy.ctor] / 6) the standard
says that some version of an implicit CTOR is deprecated when the user
provides a DTOR.

Thus, for this specific arrangement of promise type, type_build_ctor_call
returns true, because of (for example) a deprecated implicit copy CTOR.

We are not going to use any of the deprecated CTORs and thus will not
see warnings from this - however, since the call returned true, we have
now determined that we should attempt to build a constructor call.

Note as above, the type does not actually require construction and thus
one might expect either a NULL_TREE or error_mark_node in response to
the build_special_member_call ().  However, in practice the function
returns the original instance object instead of a call or some error.

When we add that as a statement it triggers the 'statement has no effect'
warning.

The patch here rearranges the promise construction/destruction code to
allow for the case that a DTOR is required independently of a CTOR. In
addition, we check that the return from build_special_member_call ()
has side effects before we add it as a statement.

gcc/cp/ChangeLog:

* coroutines.cc
(cp_coroutine_transform::build_ramp_function): Separate the
build of promise constructor and destructor.  When evaluating
the constructor, check that build_special_member_call returns
an expression with side effects before adding it.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
(cherry picked from commit 7d1483921941d21d91f929ef0d59a9794b1946b4)

c++, coroutines: Fix handling of bool await_suspend() [PR115905].

As noted in the PR the action of the existing implementation was to
treat a false value from await_suspend () as equivalent to "do not
suspend". Actually it needs to be the equivalent of "resume" - and
we need to restart the dispatcher - since the await_suspend() body
could have already resumed the coroutine.
See also https://github.com/cplusplus/CWG/issues/601 (NAD) for more
discussion.

Since we need to amend the await expansion and the actor build, take
the opportunity to clean up and modernise the code there. Note that
we need to make the jump back to the dispatcher without any scope
exit cleanups (so we have to use the .CO_SUSPN IFN to do this).

PR c++/115905

gcc/cp/ChangeLog:

* coroutines.cc (struct coro_aw_data): Add a member for the
restart dispatch label.
(expand_one_await_expression): Rework to modernise and to
handle the boolean await_suspend() case.
(build_actor_fn): Rework the dispatcher and allow for a jump
back to the dispatcher.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/torture/pr115905.C: New test.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
(cherry picked from commit 368ba7aed46d57d093c0180baae4dc0e0ba468b6)

coros: mark .CO_YIELD as LEAF [PR106973]

We rely on .CO_YIELD calls being followed by an assignment (optionally)
and then a switch/if in the same basic block. This implies that a
.CO_YIELD can never end a block. However, since a call to .CO_YIELD is
still a call, if the function containing it calls setjmp, GCC thinks
that the .CO_YIELD can introduce abnormal control flow, and generates an
edge for the call.

We know this is not the case; .CO_YIELD calls get removed quite early on
and have no effect, and result in no other calls, so .CO_YIELD can be
considered a leaf function, preventing generating an edge when calling
it.

PR c++/106973 - coroutine generator and setjmp

PR c++/106973

gcc/ChangeLog:

* internal-fn.def (CO_YIELD): Mark as ECF_LEAF.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/pr106973.C: New test.

(cherry picked from commit 7b7ad3f4b2455072f42e7884b93fd96ebb920bc8)

c++: don't remove labels during coro-early-expand-ifns [PR105104]

In some scenarios, it is possible for the CFG cleanup to cause one of
the labels mentioned in CO_YIELD, which coro-early-expand-ifns intends
to remove, to become part of some statement. As a result, when that
label is removed, the statement it became part of becomes invalid,
crashing the compiler.

There doesn't appear to be a reason to remove the labels (anymore, at
least), so let's not do that.

PR c++/105104

gcc/ChangeLog:

* coroutine-passes.cc (execute_early_expand_coro_ifns): Don't
remove any labels.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/torture/pr105104.C: New test.

(cherry picked from commit d9c54e9a036189e8961ec17e118fccf794d7bfab)

c++, coroutines: The frame pointer is used in the helpers [PR116482].

We have a bogus warning about the coroutine state frame pointers
being apparently unused in the resume and destroy functions. Fixed
by making the parameters DECL_ARTIFICIAL.

PR c++/116482

gcc/cp/ChangeLog:

* coroutines.cc
(coro_build_actor_or_destroy_function): Make the parameter
decls DECL_ARTIFICIAL.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/pr116482.C: New test.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
(cherry picked from commit 8d6d6c864442a1cc987b3e6bcb1d903ceb975e4a)

c++/coros: do not assume coros don't nest [PR113457]

In the testcase presented in the PR, during template expansion, an
tsubst of an operand causes a lambda coroutine to be processed, causing
it to get an initial suspend and final suspend.  The code for assigning
awaitable var names (get_awaitable_var) assumed that the sequence Is ->
Is -> Fs -> Fs is impossible (i.e. that one could only 'open' one
coroutine before closing it at a time), and reset the counter used for
unique numbering each time a final suspend occured.  This assumption is
false in a few cases, usually when lambdas are involved.

Instead of storing this counter in a static-storage variable, we can
store it in coroutine_info.  This struct is local to each function, so
we don't need to worry about "cross-contamination" nor resetting.

PR c++/113457

gcc/cp/ChangeLog:

* coroutines.cc (struct coroutine_info): Add integer field
awaitable_number.  This is a counter used for assigning unique
names to awaitable temporaries.
(get_awaitable_var): Use awaitable_number from coroutine_info
instead of the static int awn.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/pr113457-1.C: New test.
* g++.dg/coroutines/pr113457.C: New test.

(cherry picked from commit 5cca7517c5868b7b9aa13992145eb6082ac5d5b9)

c++, coroutines: Look through initial_await target exprs [PR110635].

In the case that the initial awaiter returns an object, the initial await
can be a target expression and we need to look at its initializer to cast
the await_resume() to void and to wrap in a compound expression that sets
the initial_await_resume_called flag.

PR c++/110635

gcc/cp/ChangeLog:

* coroutines.cc
(cp_coroutine_transform::wrap_original_function_body): Look through
initial await target expressions to find the actual co_await_expr
that we need to update.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/pr110635.C: New test.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
(cherry picked from commit c442a9b78bdbebdbcb4a8f91bc36961eb732fbdf)

c++, coroutines: Rework handling of throwing_cleanups [PR102051].

In the fix for PR95822 (r11-7402) we set throwing_cleanup false in the top
level of the coroutine transform code. However, as the current PR shows,
that is not sufficient.

Any use of cxx_maybe_build_cleanup() can reset the flag, which causes the
check_return_expr () logic to try to add a guard variable and set it.

For the coroutine code, we need to handle the cleanups separately, since
the responsibility for them changes after the first resume point, which
we handle in the ramp exception processing.

Fix this by forcing the "throwing_cleanup" flag false right before the
processing of the return expression.

PR c++/102051

gcc/cp/ChangeLog:

* coroutines.cc
(cp_coroutine_transform::build_ramp_function): Handle
"throwing_cleanup" here instead of ...
(cp_coroutine_transform::apply_transforms): ... here.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/pr102051.C: New test.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
(cherry picked from commit f0315f7a325ffccb446fe378fcdfccda6eead8ba)

c++, coroutines: Allow convertible get_return_on_allocation_fail [PR109682].

We have been requiring the get_return_on_allocation_fail() call to have the
same type as the ramp. This is not intended by the standard, so relax that
to allow anything convertible to the ramp return.

PR c++/109682

gcc/cp/ChangeLog:

* coroutines.cc
(cp_coroutine_transform::build_ramp_function): Allow for cases where
get_return_on_allocation_fail has a type convertible to the ramp
return type.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/pr109682.C: New test.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
(cherry picked from commit f4915e6c4cd42e7d6f397dc36fab507cc47dad05)

c++, coroutines: Only allow void get_return_object if the ramp is void [PR100476].

Require that the value returned by get_return_object is convertible to
the ramp return. This means that the only time we allow a void
get_return_object, is when the ramp is also a void function.

We diagnose this early to allow us to exit the ramp build if the return
values are incompatible.

PR c++/100476

gcc/cp/ChangeLog:

* coroutines.cc
(cp_coroutine_transform::build_ramp_function): Remove special
handling of void get_return_object expressions.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/coro-bad-gro-01-void-gro-non-class-coro.C:
Adjust expected diagnostic.
* g++.dg/coroutines/pr102489.C: Avoid void get_return_object.
* g++.dg/coroutines/pr103868.C: Likewise.
* g++.dg/coroutines/pr94879-folly-1.C: Likewise.
* g++.dg/coroutines/pr94883-folly-2.C: Likewise.
* g++.dg/coroutines/pr96749-2.C: Likewise.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
(cherry picked from commit a0b431033c307982123abbff752045cfe7eda47f)

Daily bump.

c++, coroutines: check for members we use in handle_types [PR105475]

Currently, it is possible to ICE GCC by giving it sufficiently broken
code, where sufficiently broken means a std::coroutine_handle missing a
default on the promise_type template argument, and missing members.
As the code generator relies on lookups in the coroutine_handle never
failing (and has no way to signal that error), lets do it ahead of time,
save the result, and use that.  This saves us some lookups and allows us
to propagate an error.

PR c++/105475 - coroutines: ICE in coerce_template_parms, at cp/pt.cc:9183

This includes refactoring from 370a0dee55569, ffd521d8dcddcd4 and
00019b88e714c29c387.

gcc/cp/ChangeLog:

PR c++/105475
* coroutines.cc (struct coroutine_info): Add from_address.
Carries the from_address member we looked up earlier.
(coro_resume_identifier): Remove.  Unused.
(coro_init_identifiers): Do not initialize the above.
(struct susp_frame_data): Remove unused members, provide a CTOR.
(void_coro_handle_address): New variable.  Contains the baselink
for the std::coroutine_handle<void>::address() instance method.
(get_handle_type_address): New function.  Looks up and validates
handle_type::address in a given handle_type.
(get_handle_type_from_address): New function.  Looks up and
validates handle_type::from_address in a given handle_type.
(coro_promise_type_found_p): Remove reliance on
coroutine_handle<> defaulting the promise type to void.  Store
get_handle_type_* results where appropriate.
(struct local_vars_frame_data): Add a CTOR.
(replace_continue): Look up expression type.
(get_coroutine_from_address): New helper.  Gets the
handle_type::from_address BASELINK from a coroutine_info.
(morph_fn_to_coro): Use susp_frame_data CTOR, and make the suspend
state hash map local to the morph function. Use CTOR for
local_vars_frame_data instead of brace init.
(build_actor_fn): Use the get_coroutine_from_address helper and
void_coro_handle_address.

gcc/testsuite/ChangeLog:

PR c++/105475
* g++.dg/coroutines/pr103868.C: Add std::coroutine_handle
members we check for now.
* g++.dg/coroutines/pr105287.C: Ditto.
* g++.dg/coroutines/pr105301.C: Ditto.
* g++.dg/coroutines/pr94528.C: Ditto.
* g++.dg/coroutines/pr94879-folly-1.C: Ditto.
* g++.dg/coroutines/pr94883-folly-2.C: Ditto.
* g++.dg/coroutines/pr98118.C: Ditto.
* g++.dg/coroutines/pr105475.C: New test.
* g++.dg/coroutines/pr105475-1.C: New test.
* g++.dg/coroutines/pr105475-2.C: New test.
* g++.dg/coroutines/pr105475-3.C: New test.
* g++.dg/coroutines/pr105475-4.C: New test.
* g++.dg/coroutines/pr105475-5.C: New test.
* g++.dg/coroutines/pr105475-6.C: New test.
* g++.dg/coroutines/pr105475-broken-spec.C: New test.
* g++.dg/coroutines/pr105475-broken-spec-2.C: New test.

Co-authored-by: Arsen Arsenović <arsen@aarsen.me>
(cherry picked from commit 5b4476a165565cb20729c0a97a3f43b060595209)

c++/coroutines: only defer expanding co_{await,return,yield} if dependent [PR112341]

By doing so, we can get diagnostics in template decls when we know we
can.  For instance, in the following:

  awaitable g();
  template<typename>
  task f()
  {
    co_await g();
    co_yield 1;
    co_return "foo";
  }

... the coroutine promise type in each statement is always
std::coroutine_handle<task>::promise_type, and all of the operands are
not type-dependent, so we can always compute the resulting types (and
expected types) of these expressions and statements.

Also, when we do not know the type of the CO_AWAIT_EXPR or
CO_YIELD_EXPR, we now return NULL_TREE as the type rather than
unknown_type_node.  This is more correct, since the type is not unknown,
it just isn't determined yet.  This also means we can remove the
CO_AWAIT_EXPR and CO_YIELD_EXPR special-cases from
type_dependent_expression_p.

PR c++/112341 - error: insufficient contextual information to determine type on co_await result in function template

gcc/cp/ChangeLog:

PR c++/112341
* coroutines.cc (struct coroutine_info): Also cache the
traits type.
(ensure_coro_initialized): New function.  Makes sure we have
initialized the coroutine state successfully, or informs the
caller should it fail to do so.  Extracted from
coro_promise_type_found_p.
(coro_get_traits_class): New function.  Gets the (cached)
coroutine traits type for a given coroutine.  Extracted from
coro_promise_type_found_p and refactored to cache the result.
(coro_promise_type_found_p): Use the two functions above.
(build_template_co_await_expr): New function.  Builds a
CO_AWAIT_EXPR representing a CO_AWAIT_EXPR in a template
declaration.
(build_co_await): Use the above if processing_template_decl, and
give it a proper type.
(coro_dependent_p): New function.  Returns true iff its
argument is a type-dependent expression OR the current functions
traits class is type dependent.
(finish_co_await_expr): Defer expansion only in the case
coro_dependent_p returns true.
(finish_co_yield_expr): Ditto.
(finish_co_return_stmt): Ditto.
* pt.cc (type_dependent_expression_p): Do not treat
CO_AWAIT/CO_YIELD specially.

gcc/testsuite/ChangeLog:

PR c++/112341
* g++.dg/coroutines/pr112341-2.C: New test.
* g++.dg/coroutines/pr112341-3.C: New test.
* g++.dg/coroutines/torture/co-yield-03-tmpl-nondependent.C: New
test.
* g++.dg/coroutines/pr112341.C: New test.

(cherry picked from commit 32e678b2ed752154b2f96719e33f11a7c6417f20)

c++: diagnose usage of co_await and co_yield in default args [PR115906]

This is a partial fix for PR115906. Per [expr.await] 2s3, "An
await-expression shall not appear in a default argument
([dcl.fct.default])". This patch introduces the diagnostic in that
case, and in the case of a co_yield (as co_yield is defined in terms of
co_await, so prerequisites of co_await hold).

PR c++/115906 - [coroutines] missing diagnostic and ICE when co_await used as default argument in function declaration

gcc/cp/ChangeLog:

PR c++/115906
* parser.cc (cp_parser_unary_expression): Reject await
expressions if use of local variables is currently forbidden.
(cp_parser_yield_expression): Reject yield expressions if use of
local variables is currently forbidden.

gcc/testsuite/ChangeLog:

PR c++/115906
* g++.dg/coroutines/pr115906-yield.C: New test.
* g++.dg/coroutines/pr115906.C: New test.
* g++.dg/coroutines/co-await-syntax-02-outside-fn.C: Don't rely
on default arguments.
* g++.dg/coroutines/co-yield-syntax-01-outside-fn.C: Ditto.

(cherry picked from commit 0c382da0943dc7d14455ba2ada2f620a25bd1366)

c++: fix ICE on FUNCTION_DECLs inside coroutines [PR115906]

When register_local_var_uses iterates a BIND_EXPRs BIND_EXPR_VARS, it
fails to account for the fact that FUNCTION_DECLs might be present, and
later passes it to DECL_HAS_VALUE_EXPR_P.  This leads to a tree check
failure in DECL_HAS_VALUE_EXPR_P:

  tree check: expected var_decl or parm_decl or result_decl, have
  function_decl in register_local_var_uses

We only care about PARM_DECL and VAR_DECL, so select only those.

PR c++/115906 - [coroutines] missing diagnostic and ICE when co_await used as default argument in function declaration

gcc/cp/ChangeLog:

PR c++/115906
* coroutines.cc (register_local_var_uses): Only process
PARM_DECL and VAR_DECLs.

gcc/testsuite/ChangeLog:

PR c++/115906
* g++.dg/coroutines/coro-function-decl.C: New test.

(cherry picked from commit a362c9ca4ef6585e678f899705043a9aa10dd670)

cp/coroutines: do not rewrite parameters in unevaluated contexts

It is possible to use parameters of a parent function of a lambda in
unevaluated contexts without capturing them.  By not capturing them, we
work around the usual mechanism we use to prevent rewriting captured
parameters.  Prevent this by simply skipping rewrites in unevaluated
contexts.  Those won't mind the value not being present anyway.

This prevents an ICE during parameter substitution.  In the testcase
from the PR, the rewriting machinery finds a param in the body of the
coroutine, which it did not previously encounter while processing the
coroutine declaration, and that does not have a DECL_VALUE_EXPR, and
fails.

gcc/cp/ChangeLog:

PR c++/111728
* coroutines.cc (rewrite_param_uses): Skip unevaluated
subexpressions.

gcc/testsuite/ChangeLog:

PR c++/111728
* g++.dg/coroutines/pr111728.C: New test.

(cherry picked from commit 1a37d6b732506f8c3f9e9452c9dc6a456f25397b)

c++, coroutines, contracts: Handle coroutine and void functions [PR110871,PR110872,PR115434].

The current implementation of contracts emits the checks into function
bodies in three places; for pre-conditions at the start of the body,
for asserts in-line in the function body and for post-conditions as an
addition to return statements.

In general (at least with existing "2a" contract semantics) the in-line
contract asserts behave as expected.

However, the mechanism is not applicable to:

* Handling pre conditions in coroutines since, for those, the standard
  specifies a wrapping of the original function body by functionality
  implementing initial and final suspends (along with some housekeeping
  to route exceptions).  Thus for such transformed function bodies, the
  preconditions then get actioned after the initial suspend, which does
  not behave as intended.

  * Handling post conditions in functions that do not have return
    statements (which applies to coroutines and void functions).

In the following, we identify a potentially transformed function body
(in the case of coroutines, this is usually called the "ramp()" function).

The patch here re-implements the code insertion in one of the two
following ways (code for exposition only):

  * For functions with no post-conditions we wrap the potentially
    transformed function as follows:

  {
     handle_pre_condition_checking ();
     potentially_transformed_function_body ();
  }

  This implements the intent that the preconditions are processed after
  the function parameters are initialised but before any other actions.

  * For functions with post-conditions:

  if (preconditions_exist)
    handle_pre_condition_checking ();
  try
   {
     potentially_transformed_function_body ();
   }
  finally
   {
     handle_post_condition_checking ();
   }
  else [only if the function is not marked noexcept(true) ]
   {
     ;
   }

In this, post-conditions [that might apply to the return value etc.]
are evaluated on every non-exceptional edge out of the function.

At present, the model here is that exceptions thrown by the function
propagate upwards as if there were no contracts present.  If the desired
semantic becomes that an exception is counted as equivalent to a contract
violation - then we can add a second handler in place of the empty
statement.

This patch specifically does not address changes to code-gen and constexpr
handling that are contained in P2900.

PR c++/115434
PR c++/110871
PR c++/110872

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_constant_expression): Handle EH_ELSE_EXPR.
* contracts.cc (finish_contract_attribute): Remove excess line.
(build_contract_condition_function): Post condition handlers are
void now.
(emit_postconditions_cleanup): Remove.
(emit_postconditions): New.
(add_pre_condition_fn_call): New.
(add_post_condition_fn_call): New.
(apply_preconditions): New.
(apply_postconditions): New.
(maybe_apply_function_contracts): New.
(apply_postcondition_to_return): Remove.
* contracts.h (apply_postcondition_to_return): Remove.
(maybe_apply_function_contracts): Add.
* coroutines.cc (coro_build_actor_or_destroy_function): Do not
copy contracts to coroutine helpers.
* decl.cc (finish_function): Handle wrapping a possibly
transformed function body in contract checks.
* typeck.cc (check_return_expr): Remove handling of post
conditions on return expressions.

gcc/ChangeLog:

* gimplify.cc (struct gimplify_ctx): Add a flag to show we are
expending a handler.
(gimplify_expr): When we are expanding a handler, and the body
transforms might have re-written DECL_RESULT into a gimple var,
ensure that hander references to DECL_RESULT are also re-written
to refer to the gimple var.  When we are processing an EH_ELSE
expression, then add it if either of the cleanup slots is in
use.

gcc/testsuite/ChangeLog:

* g++.dg/contracts/pr115434.C: New test.
* g++.dg/coroutines/pr110871.C: New test.
* g++.dg/coroutines/pr110872.C: New test.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
(cherry picked from commit d1706235ed2b274a2d1fa3c3039b5874b4ae7a0e)

configure, Darwin: Recognise new naming for Xcode ld.

The latest editions of XCode have altered the identify reported by 'ld -v'
(again). This means that GCC configure no longer detects the version.

Fixed by adding the new name to the set checked.

gcc/ChangeLog:

* configure: Regenerate.
* configure.ac: Recognise PROJECT:ld-mmmm.nn.aa as an identifier
for Darwin's static linker.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
(cherry picked from commit 7f56a8e8ad1c33d358e9e09fcbaf263c2caba1b9)

includes, Darwin: Handle modular use for macOS SDKs [PR116827].

Recent changes to the OS SDKs have altered the way in which include guards
are used for a number of headers when C++ modules are enabled.  Instead of
placing the guards in the included header, they are being placed in the
including header.  This breaks the assumptions in the current GCC stddef.h
specifically, that the presence of __PTRDIFF_T and __SIZE_T means that the
relevant defs are already made.  However in the case of the module-enabled
C++ with these SDKs, that is no longer true.

stddef.h has a large body of special-cases already, but it seems that the
only viable solution here is to add a new one specifically for __APPLE__
and modular code.

This fixes around 280 new fails in the modules test-suite; it is needed on
all open branches that support modules.

PR target/116827

gcc/ChangeLog:

* ginclude/stddef.h: Undefine __PTRDIFF_T and __SIZE_T for module-
enabled c++ on Darwin/macOS platforms.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
(cherry picked from commit 9cf6b52d04df22726d88eef113211b3cc08515de)

testsuite, gm2: Use -B option for libstdc++ where required.

We need to add testsuite  options to locate gm2 libs and libstdc++.

Usually '-L' options are added to point to the relevant directories for
the uninstalled libraries.

In cases where libraries are available as both shared and convenience some
additional checks are made.

For some targets -static-xxxx options are handled by specs substitution and
need a '-B' option rather than '-L'.  For Darwin, when embedded runpaths are
in use (the default for all versions after macOS 10.11), '-B' is also needed
to provide the runpath.

When '-B' is used, this results in a '-L' for each path that exists (so that
appending a '-L' as well is a needless duplicate).  There are also cases
where tools warn for duplicates, leading to spurious fails.

Therefore the objective of the code here is to add just one '-L' or '-B' for
each of the libraries.

Currently, we are forcing the full paths to each of the gm2 convenience libs
onto the link line and therefore the B/L logic is not needed there.  It would
need to be added if/when gm2 is tested with shared libraries

gcc/testsuite/ChangeLog:

* lib/gm2.exp: Arrange for a '-B' option to be added for the
libstdc++ paths on targets that need it.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
(cherry picked from commit 6b9ceac9e4e2be304c39e6bc8744edf21faac4fb)

Darwin: Pass -macos_version_min to the linker [PR119172].

For binaries to be notarised, the SDK version must be available.
Since we do not, at present, parse this information we have been
passing "0.0" to ld64. This now results in a warning and a fail
to notarise. As a quick-fix, we can fall back to letting ld64
figure out the SDK version (which it does for -macos_version_min).

TODO: Parse the SDKSetting.plist at some point.

cherry-picked from 952e17223d3a9 and fc728cfd569e291a5

PR target/119172

gcc/ChangeLog:

* config.in: Regenerate.
* config/darwin.h (DARWIN_PLATFORM_ID): Add the option to
use -macos_version_min where available.
* configure: Regenerate.
* configure.ac: Check for ld64 support of -macos_version_min.

Co-authored-by: Andrew Pinski <quic_apinski@quicinc.com>
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

fixincludes: adjust stdio fix for macOS 15 headers

fixincludes/ChangeLog:

* fixincl.x: Regenerate.
* inclhack.def (apple_local_stdio_fn_deprecation): Also apply to
_stdio.h.

(cherry picked from commit 1dc143181550573c9c902fb7a3b495e9b409d0b0)

libgcc, Darwin: Drop the legacy library build for macOS >= 10.12 [PR116809].

From macOSX15 SDK, the unwinder no longer exports some of the symbols used
in that library which (a) causes bootstrap fail and (b) means that the
legacy library is no longer useful.

No open branch of GCC emits references to this library - and any already
-built code that depends on the symbols would need rework anyway.

We have been asked to extend this back to the earliest OS vesion supported
by the SDK (10.12).

PR target/116809

libgcc/ChangeLog:

* config.host: Build legacy libgcc_s.1 on hosts before macOS 10.12.
* config/i386/t-darwin: Remove reference to legacy libgcc_s.1
* config/rs6000/t-darwin: Likewise.
* config/t-darwin-libgccs1: New file.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
(cherry picked from commit d9cafa0c4f0a81304d9b95a78ccc8e9003c6d7a3)

Daily bump.

Fortran: pure subroutine with pure procedure as dummy [PR106948]

PR fortran/106948

gcc/fortran/ChangeLog:

* resolve.cc (gfc_pure_function): If a function has been resolved,
but esym is not yet set, look at its attributes to see whether it
is pure or elemental.

gcc/testsuite/ChangeLog:

* gfortran.dg/pure_formal_proc_4.f90: New test.

(cherry picked from commit 4e3060ee17e6eb8bab718d320199f713533dbbd6)

c++: UNBOUND_CLASS_TEMPLATE context substitution [PR119981]

In r15-123 and r14-11434 we unconditionally set processing_template_decl
when substituting the context of an UNBOUND_CLASS_TEMPLATE, in order to
handle instantiation of the dependently scoped friend declaration

  template<int N>
  template<class T>
  friend class A<N>::B;

where the scope A<N> remains dependent after instantiation.  But this
turns out to misbehave for the UNBOUND_CLASS_TEMPLATE in the below
testcase representing

  g<[]{}>::template fn

since with the flag set substituting the args of test3 into the lambda
causes us to defer the substitution and yield a lambda that still looks
dependent, which in turn makes g<[]{}> still dependent and not suitable
for qualified name lookup.

This patch restricts setting processing_template_decl during
UNBOUND_CLASS_TEMPLATE substitution to the case where there are multiple
levels of introduced template parameters, as in the friend declaration.
(This means we need to substitute the template parameter list(s) first,
which makes sense since they lexically appear first.)

PR c++/119981
PR c++/119378

gcc/cp/ChangeLog:

* pt.cc (tsubst) <case UNBOUND_CLASS_TEMPLATE>: Substitute
into template parameter list first.  When substituting the
context, only set processing_template_decl if there's more
than one level of introduced template parameters.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/lambda-targ15.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
(cherry picked from commit 05ea8baf6ff96c77a9a2467d5c45b1ed575fca92)

testsuite: Force -mcmodel=small for gcc.target/aarch64/pr115258.c

The test implicitly assumed the default code model and so failed
for -mcmodel=tiny.

gcc/testsuite/
* gcc.target/aarch64/pr115258.c: Add -mcmodel=small.

(cherry picked from commit 3584aab37f54bcd220c7061568af777e37f4f6ed)

Fix GNAT build failure for x86/FreeBSD

gcc/ada/
PR ada/112958
* Makefile.rtl (LIBGNAT_TARGET_PAIRS) [x86 FreeBSD]: Add specific
version of s-dorepr.adb.
* libgnat/s-dorepr__freebsd.adb: New file.

AVR: target/119989 - Add missing clobbers to xload_<mode>_libgcc.

libgcc's __xload_1...4 is clobbering Z (and also R21 is some cases),
but avr.md had clobbers of respective GPRs only up to reload.
Outcome was that code reading from the same __memx address twice
could be wrong. This patch adds respective clobbers.

gcc/
* config/avr/avr.md (xload_<mode>_libgcc): Clobber R21, Z.

gcc/testsuite/
* gcc.target/avr/torture/pr119989.h: New file.
* gcc.target/avr/torture/pr119989-memx-1.c: New test.
* gcc.target/avr/torture/pr119989-memx-2.c: New test.
* gcc.target/avr/torture/pr119989-memx-3.c: New test.
* gcc.target/avr/torture/pr119989-memx-4.c: New test.

libstdc++: [_GLIBCXX_INLINE_VERSION] Fix tests failures

Adapt testsuite v3_target_compile to strip version namespace from compiler
output so that dg-error and dg-warning directives do not need to consider it.

libstdc++-v3/ChangeLog:

* testsuite/lib/libstdc++.exp (v3_target_compile): Strip version namespace
from compiler output.
* testsuite/20_util/function/cons/70692.cc: Remove now useless __8 namespace
pattern.
* testsuite/23_containers/map/48101_neg.cc: Likewise.
* testsuite/23_containers/multimap/48101_neg.cc: Likewise.

Co-authored-by: Jonathan Wakely <jwakely@redhat.com>
(cherry picked from commit 8709d6a17830c8a9f48cb3ac6dfc6af76f2e1e81)

RISC-V: vsetvl: elide abnormal edges from LCM computations [PR119533]

vsetvl phase4 uses LCM guided info to insert VSETVL insns, including a
straggler loop for "mising vsetvls" on certain edges. Currently it
asserts on encountering EDGE_ABNORMAL.

When enabling go frontend with V enabled, libgo build hits the assert.

The solution is to prevent abnormal edges from getting into LCM at all
(my prior attempt at this just ignored them after LCM which is not
right). Existing invalid_opt_bb_p () current does this for BB predecessors
but not for successors which is what the patch adds.

Crucially, the ICE/fix also depends on avoiding vsetvl hoisting past
non-transparent blocks: That is taken care of by Robin's patch
"RISC-V: Do not lift up vsetvl into non-transparent blocks [PR119547]"
for a different yet related issue.

Reported-by: Heinrich Schuchardt <heinrich.schuchardt@canonical.com>
Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
PR target/119533

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (invalid_opt_bb_p): Check for
EDGE_ABNOMAL.
(pre_vsetvl::compute_lcm_local_properties): Initialize kill
bitmap.
Debug dump skipped edge.

gcc/testsuite/ChangeLog:

* go.dg/pr119533-riscv.go: New test.
* go.dg/pr119533-riscv-2.go: New test.

(cherry picked from commit edb4867412895100b3addc525bc0dba0ea90c7f9)

RISC-V: Do not lift up vsetvl into non-transparent blocks [PR119547].

When lifting up a vsetvl into a block we currently don't consider the
block's transparency with respect to the vsetvl as in other parts of the
pass. This patch does not perform the lift when transparency is not
guaranteed.

This condition is more restrictive than necessary as we can still
perform a vsetvl lift if the conflicting register is only every used
in vsetvls and no regular insns but given how late we are in the GCC 15
cycle it seems better to defer this. Therefore
gcc.target/riscv/rvv/vsetvl/avl_single-68.c is XFAILed for now.

This issue was found in OpenCV where it manifests as a runtime error.
Zhijin Zeng debugged PR119547 and provided an initial patch.

Reported-By: 曾治金 <zhijin.zeng@spacemit.com>
PR target/119547

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (pre_vsetvl::earliest_fuse_vsetvl_info):
Do not perform lift if block is not transparent.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/vsetvl/avl_single-68.c: xfail.
* g++.target/riscv/rvv/autovec/pr119547.C: New test.
* g++.target/riscv/rvv/autovec/pr119547-2.C: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_conflict-3.c: Adjust.
* gcc.target/riscv/rvv/vsetvl/vlmax_conflict-12.c: Adjust.
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-10.c: Adjust.

(cherry picked from commit 517f7e3f02b4c945d2b4bdabb490961cf986391e)

Daily bump.

Remove other processors from X86_TUNE_DEST_FALSE_DEP_FOR_GLC except GLC

Since the tune if only for GLC(sapphirerapids and alderlake-P).

gcc/ChangeLog:

* config/i386/x86-tune.def (X86_TUNE_DEST_FALSE_DEP_FOR_GLC):
Remove other processor except for GLC since this one is only
for GLC.

(cherry picked from commit 1ad6e171b126a82f38b1e8cbfd207f1d91c58a59)

libstdc++: Improve optional's <=> constraint recursion workaround [PR104606]

It turns out the reason the behavior of this testcase changed after CWG
2369 is because validity of the substituted return type is now checked
later, after constraints. So a more reliable workaround for this issue
is to add a constraint to check the validity of the return type earlier,
matching the pre-CWG 2369 semantics.

PR libstdc++/104606

libstdc++-v3/ChangeLog:

* include/std/optional (operator<=>): Revert r14-9771 change.
Add constraint checking the validity of the return type
compare_three_way_result_t before the three_way_comparable_with
constraint.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
(cherry picked from commit 815f1f27a1dba2f0acd1f02d0beafedadebe967c)

libstdc++: Add code comment documenting LWG 4027 change [PR118083]

PR libstdc++/118083

libstdc++-v3/ChangeLog:

* include/bits/ranges_base.h
(ranges::__access::__possibly_const_range): Mention LWG 4027.

(cherry picked from commit 640697f7c2def415db81c84010ae25be0785d867)

libstdc++: Implement LWG 4027 change to possibly-const-range [PR118083]

LWG 4027 effectively makes the const range access CPOs ranges::cfoo behave
more consistently across C++23 and C++20 (pre-P2278R4) and also more
consistently with the std::cfoo range accessors, as the below testcase
adjustments demonstrate (which mostly consist of reverting workarounds
added by r14-3771-gf12e26f3496275 and r13-7186-g0d94c6df183375).

In passing fix PR118083 which reports that the input_range constraint on
possibly-const-range is missing in our implementation. A consequence of
this is that the const range access CPOs now consistently reject a non-range
argument, and so in some our of tests we need to introduce otherwise
unused begin/end members.

PR libstdc++/118083

libstdc++-v3/ChangeLog:

* include/bits/ranges_base.h
(ranges::__access::__possibly_const_range): Adjust logic as per
LWG 4027. Add missing input_range constraint.
* testsuite/std/ranges/access/cbegin.cc (test05): Verify LWG
4027 testcases.
* testsuite/std/ranges/access/cdata.cc: Adjust, simplify and
consolidate some tests after the above.
* testsuite/std/ranges/access/cend.cc: Likewise.
* testsuite/std/ranges/access/crbegin.cc: Likewise.
* testsuite/std/ranges/access/crend.cc: Likewise.
* testsuite/std/ranges/adaptors/join.cc: Likewise.
* testsuite/std/ranges/adaptors/take_while.cc: Likewise.
* testsuite/std/ranges/adaptors/transform.cc: Likewise.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
(cherry picked from commit 1b9e4fe2ff5f4711406cdcf0e6e183b247d9f42b)

c++: prev declared hidden tmpl friend inst, cont [PR119807]

When remapping existing specializations of a hidden template friend from
a previous declaration to the new definition, we must remap only those
specializations that match this new definition, but currently we
remap all specializations (since they all appear in the same
DECL_TEMPLATE_INSTANTIATIONS list of the most general template).

Concretely, in the first testcase below, we form two specializations of
the friend A::f, one with arguments {{0},{bool}} and another with
arguments {{1},{bool}}.  Later when instantiating B, we need to remap
these specializations.  During the B<0> instantiation we only want to
remap the first specialization, and during the B<1> instantiation only
the second specialization, but currently we remap both specializations
twice.

tsubst_friend_function needs to determine if an existing specialization
matches the shape of the new definition, which is tricky in general,
e.g. if the outer template parameters may not match up.  Fortunately we
don't have to reinvent the wheel here since is_specialization_of_friend
seems to do exactly what we need.  We can check this unconditionally,
but I think it's only necessary when dealing with specializations formed
from a class template scope previous declaration, hence the
TMPL_ARGS_HAVE_MULTIPLE_LEVELS check.

PR c++/119807
PR c++/112288

gcc/cp/ChangeLog:

* pt.cc (tsubst_friend_function): Skip remapping an
existing specialization if it doesn't match the shape of
the new friend definition.

gcc/testsuite/ChangeLog:

* g++.dg/template/friend86.C: New test.
* g++.dg/template/friend87.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
(cherry picked from commit 369461d0749790f1291f76096064d583d2547934)

Daily bump.

c++: format attribute redeclaration [PR116954]

Here when merging the two decls, remove_contract_attributes loses
ATTR_IS_DEPENDENT on the format attribute, so apply_late_template_attributes
just returns, so the attribute doesn't get propagated to the type where the
warning looks for it.

Fixed by using copy_node instead of tree_cons to preserve flags.

PR c++/116954

gcc/cp/ChangeLog:

* contracts.cc (remove_contract_attributes): Preserve flags
on the attribute list.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wformat-3.C: New test.

(cherry picked from commit b0d7d644f3c25af9bf60c948ab26aa7b09a68787)

c++: shortcut constexpr vector ctor [PR113835]

Since std::vector became usable in constant evaluation in C++20, a vector
variable with static storage duration might be manifestly
constant-evaluated, so we properly try to constant-evaluate its initializer.
But it can never succeed since the result will always refer to the result of
operator new, so trying is a waste of time. Potentially a large waste of
time for a large vector, as in the testcase in the PR.

So, let's recognize this case and skip trying constant-evaluation. I do
this only for the case of an integer argument, as that's the case that's
easy to write but slow to (fail to) evaluate.

In the test, I use dg-timeout-factor to lower the default timeout from 300
seconds to 15; on my laptop, compilation without the patch takes about 20
seconds versus about 2 with the patch.

is_std_class comes from r15-4953.

PR c++/113835

gcc/cp/ChangeLog:

* cp-tree.h (is_std_class): Declare.
* constexpr.cc (is_std_class): New function.
(is_std_allocator): Use it.
(cxx_eval_outermost_constant_expr): Bail out early
for std::vector(N).

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/constexpr-vector1.C: New test.

(cherry picked from commit 764f02327f7b2dc6ac5abaf89038e51cf0ee6d13)

middle-end: fix masking for partial vectors and early break [PR119351]

The following testcase shows an incorrect masked codegen:

#define N 512
#define START 1
#define END 505

int x[N] __attribute__((aligned(32)));

int __attribute__((noipa))
foo (void)
{
  int z = 0;
  for (unsigned int i = START; i < END; ++i)
    {
      z++;
      if (x[i] > 0)
        continue;

      return z;
    }
  return -1;
}

notice how there's a continue there instead of a break.  This means we generate
a control flow where success stays within the loop iteration:

  mask_patt_9.12_46 = vect__1.11_45 > { 0, 0, 0, 0 };
  vec_mask_and_47 = mask_patt_9.12_46 & loop_mask_41;
  if (vec_mask_and_47 == { -1, -1, -1, -1 })
    goto <bb 4>; [41.48%]
  else
    goto <bb 15>; [58.52%]

However when loop_mask_41 is a partial mask this comparison can lead to an
incorrect match.  In this case the mask is:

  # loop_mask_41 = PHI <next_mask_63(6), { 0, -1, -1, -1 }(2)>

due to peeling for alignment with masking and compiling with
-msve-vector-bits=128.

At codegen time we generate:

ptrue   p15.s, vl4
ptrue   p7.b, vl1
not     p7.b, p15/z, p7.b
.L5:
ld1w    z29.s, p7/z, [x1, x0, lsl 2]
cmpgt   p7.s, p7/z, z29.s, #0
not     p7.b, p15/z, p7.b
ptest   p15, p7.b
b.none  .L2
...<early exit>...

Here the basic blocks are rotated and a not is generated.
But the generated not is unmasked (or predicated over an ALL true mask in this
case).  This has the unintended side-effect of flipping the results of the
inactive lanes (which were zero'd by the cmpgt) into -1.  Which then incorrectly
causes us to not take the branch to .L2.

This is happening because we're not comparing against the right value for the
forall case.  This patch gets rid of the forall case by rewriting the
if(all(mask)) into if (!all(mask)) which is the same as if (any(~mask)) by
negating the masks and flipping the branches.

1. For unmasked loops we simply reduce the ~mask.
2. For masked loops we reduce (~mask & loop_mask) which is the same as
   doing (mask & loop_mask) ^ loop_mask.

For the above we now generate:

.L5:
        ld1w    z28.s, p7/z, [x1, x0, lsl 2]
        cmple   p7.s, p7/z, z28.s, #0
        ptest   p15, p7.b
        b.none  .L2

This fixes gromacs with > 1 OpenMP threads and improves performance.

gcc/ChangeLog:

PR tree-optimization/119351
* tree-vect-stmts.cc (vectorizable_early_exit): Mask both operands of
the gcond for partial masking support.

gcc/testsuite/ChangeLog:

PR tree-optimization/119351
* gcc.target/aarch64/sve/pr119351.c: New test.
* gcc.target/aarch64/sve/pr119351_run.c: New test.

    (cherry picked from commit 7cf5503e0af52f5b726da4274a148590c57a458a)

aarch64: force operand to fresh register to avoid subreg issues [PR118892]

When the input is already a subreg and we try to make a paradoxical
subreg out of it for copysign this can fail if it violates the subreg
relationship.

Use force_lowpart_subreg instead of lowpart_subreg to then force the
results to a register instead of ICEing.

gcc/ChangeLog:

PR target/118892
* config/aarch64/aarch64.md (copysign<GPF:mode>3): Use
force_lowpart_subreg instead of lowpart_subreg.

gcc/testsuite/ChangeLog:

PR target/118892
* gcc.target/aarch64/copysign-pr118892.c: New test.

Daily bump.

libbacktrace: Add hpux fileline support

Fixes libstdc++ stacktrace tests.

2025-04-10 John David Anglin <danglin@gcc.gnu.org>

libbacktrace/ChangeLog:
* fileline.c (hpux_get_executable_path): New.
(fileline_initialize): Add pass to get hpux executable path.

Daily bump.

sra: Clear grp_same_access_path of acesses created by total scalarization (PR118924)

During analysis of PR 118924 it was discussed that total scalarization
invents access paths (strings of COMPONENT_REFs and possibly even
ARRAY_REFs) which did not exist in the program before which can have
unintended effects on subsequent AA queries. Although not doing that
does not mean that SRA cannot create such situations (see the bug for
more info), it has been agreed that not doing this is generally better.
This patch therfore makes SRA fall back on creating simple MEM_REFs when
accessing components of an aggregate corresponding to what a SRA
variable now represents.

gcc/ChangeLog:

2025-03-26 Martin Jambor <mjambor@suse.cz>

PR tree-optimization/118924
* tree-sra.cc (create_total_scalarization_access): Set
grp_same_access_path flag to zero.

(cherry picked from commit 40445711b8af113ef423d8bcac1a7ce1c47f62d7)

sra: Avoid creating TBAA hazards (PR118924)

The testcase in PR 118924, when compiled on Aarch64, contains an
gimple aggregate assignment statement in between different types which
are types_compatible_p but behave differently for the purposes of
alias analysis.

SRA replaces the statement with a series of scalar assignments which
however have LHSs access chains modeled on the RHS type and so do not
alias with a subsequent reads and so are DSEd.

SRA clearly gets its "same_access_path" logic subtly wrong.  One issue
is that the same_access_path_p function probably should be implemented
more along the lines of (parts of ao_compare::compare_ao_refs) instead
of internally relying on operand_equal_p.  That is however not the
problem in the PR and so I will deal with it only later.

The issue here is that even when the access path is the same, it must
not be bolted on an aggregate type that does not match.  This patch
does that, taking just one simple function from the
ao_compare::compare_ao_refs machinery and using it to detect the
situation.  The rest is just merging the information in between
accesses of the same access group.

I looked at how many times we come across such assignment during
"make stage2-bubble" of GCC (configured with only c and C++ and
without multilib and libsanitizers) and on an x86_64 there were 87924
such assignments (though now I realize not all of them had to be
aggregate), so they do happen.  The patch leads to about 5% increase
of cases where we don't use an "access path" but resort to a
MEM_REF (from 90209 to 95204).  On an Aarch64, there were 92268 such
assignments and the increase of falling back to MEM_REFs was by
4% (but from a bigger base 132983 to 107991).

gcc/ChangeLog:

2025-04-04  Martin Jambor  <mjambor@suse.cz>

PR tree-optimization/118924
* tree-ssa-alias-compare.h (types_equal_for_same_type_for_tbaa_p):
Declare.
* tree-ssa-alias.cc: Include ipa-utils.h.
(types_equal_for_same_type_for_tbaa_p): New public overloaded variant.
* tree-sra.cc: Include tree-ssa-alias-compare.h.
(create_access): Initialzie grp_same_access_path to true.
(build_accesses_from_assign): Detect tbaa hazards and clear
grp_same_access_path fields of involved accesses when they occur.
(sort_and_splice_var_accesses): Take previous values of
grp_same_access_path into account.

gcc/testsuite/ChangeLog:

2025-03-25  Martin Jambor  <mjambor@suse.cz>

PR tree-optimization/118924
* g++.dg/tree-ssa/pr118924.C: New test.

(cherry picked from commit 07d243670020b339380194f6125cde87ada56148)

s390: Accept only Pmode for registers AP/FP/RA [PR119235]

gcc/ChangeLog:

PR target/119235
* config/s390/s390.cc (s390_hard_regno_mode_ok): Accept only
Pmode for registers AP/FP/RA.

(cherry picked from commit 2b383ae2a6e5fc0530bfd8b86ad0e6b27e760bd2)

c++: templates, attributes, #pragma target [PR114772]

Since r12-5426 apply_late_template_attributes suppresses various global
state to avoid applying active pragmas to earlier declarations; we also
need to override target_option_current_node.

PR c++/114772
PR c++/101180

gcc/cp/ChangeLog:

* pt.cc (apply_late_template_attributes): Also override
target_option_current_node.

gcc/testsuite/ChangeLog:

* g++.dg/ext/pragma-target2.C: New test.

(cherry picked from commit 5fdb0145fb9499f5db9e27f775895ce4a39215e4)

libcpp: Fix incorrect line numbers in large files [PR108900]

This patch addresses an issue in the C preprocessor where incorrect
line number information is generated when processing files with a
large number of lines. The problem arises from improper handling
of location intervals in the line map, particularly when locations
exceed LINE_MAP_MAX_LOCATION_WITH_PACKED_RANGES.

By ensuring that the highest location is not decremented if it
would move to a different ordinary map, this fix resolves
the line number discrepancies observed in certain test cases.
This change improves the accuracy of line number reporting, benefiting
users relying on precise code coverage and debugging information.

Tested x86_64-linux.

libcpp/ChangeLog:

PR preprocessor/108900
* files.cc (_cpp_stack_file): Do not decrement highest_location
across distinct maps.

Signed-off-by: Jeremy Bettis <jbettis@google.com>
Signed-off-by: Yash Shinde <Yash.Shinde@windriver.com>
(cherry picked from commit d9b56c65a2697e0d7a6c0f15f1977803dc94579b)

c++: constexpr, trivial, and non-alias target [PR111075]

On Darwin and other targets with !can_alias_cdtor, we instead go to
maybe_thunk_ctor, which builds a thunk function that calls the general
constructor. And then cp_fold tries to constant-evaluate that call, and we
ICE because we don't expect to ever be asked to constant-evaluate a call to
a trivial function.

No new test because this fixes g++.dg/torture/tail-padding1.C on affected
targets.

PR c++/111075

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_call_expression): Allow trivial
call from a thunk.

(cherry picked from commit 628aecb050bbbc4bb0bd4468c474623e20d64e21)

Daily bump.

[14] Use --param=aarch64-autovec-preference=2 instead of =sve-only

This updates the backported test.

PR tree-optimization/119706
* g++.target/aarch64/sve/pr119706.C: Adjust --param
aarch64-autovec-preference.

tree-optimization/119778 - properly mark abnormal edge sources during inlining

When inlining a call that abnormally transfers control-flow we make
all inlined calls that can possibly transfer abnormal control-flow
do so as well. But we failed to mark the calls as altering
control-flow. This results in inconsistent behavior later and
possibly wrong-code (we'd eventually prune those edges).

PR tree-optimization/119778
* tree-inline.cc (copy_edges_for_bb): Mark calls that are
source of abnormal edges as altering control-flow.

* g++.dg/torture/pr119778.C: New testcase.

(cherry picked from commit a48f934211434cac1be951c207ee76e4b4340fac)

aarch64: Add test case.

This patch adds a test case to the testsuite for PR119706.
The bug was already fixed by
https://gcc.gnu.org/pipermail/gcc-patches/2025-April/680573.html.

OK for mainline?

Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com>
gcc/testsuite/
PR tree-optimization/119706
* g++.target/aarch64/sve/pr119706.C: New test.

(cherry picked from commit f6e6e6d9ba1d71fdd02a2c570d60217db6c5a31b)

middle-end/119706 - allow POLY_INT_CST as is_gimple_mem_ref_addr

We currently only INTEGER_CST, but not POLY_INT_CST, which leads
to the situation that when the POLY_INT_CST is only indrectly
present via a SSA def the IL is valid but when propagated it's not.
That's unsustainable.

PR middle-end/119706
* gimple-expr.cc (is_gimple_mem_ref_addr): Also allow
POLY_INT_CST.

(cherry picked from commit bf812c6ad83ec0b241bb3fecc7e68f883b6083df)

target/119549 - fixup handling of -mno-sse4 in target attribute

The following fixes ix86_valid_target_attribute_inner_p to properly
handle target("no-sse4") via OPT_mno_sse4 rather than as unset OPT_msse4.
I've added asserts to ix86_handle_option that RejectNegative is honored
for both.

PR target/119549
* common/config/i386/i386-common.cc (ix86_handle_option):
Assert that both OPT_msse4 and OPT_mno_sse4 are never unset.
* config/i386/i386-options.cc (ix86_valid_target_attribute_inner_p):
Process negated OPT_msse4 as OPT_mno_sse4.

* gcc.target/i386/pr119549.c: New testcase.

(cherry picked from commit ae2912748b9913fe9cd198db1b3fa704d0cbc728)

tree-optimization/119534 - reject bogus emulated vectorized gather

The following makes sure to reject the attempts to emulate a vector
gather when the discovered index vector type is a vector mask.

PR tree-optimization/119534
* tree-vect-stmts.cc (get_load_store_type): Reject
VECTOR_BOOLEAN_TYPE_P offset vector type for emulated gathers.

* gcc.dg/vect/pr119534.c: New testcase.

(cherry picked from commit d0cc14c62ad7403afcab3c2e38851d3ab179352f)

Daily bump.

d: Fix ICE: type variant differs by TYPE_MAX_VALUE with -g [PR119826]

Forward referenced enum types were never fixed up after the main
ENUMERAL_TYPE was finished. All flags set are now propagated to all
variants after its mode, size, and alignment has been calculated.

PR d/119826

gcc/d/ChangeLog:

* types.cc (TypeVisitor::visit (TypeEnum *)): Propagate flags of main
enum types to all forward-referenced variants.

gcc/testsuite/ChangeLog:

* gdc.dg/debug/imports/pr119826b.d: New test.
* gdc.dg/debug/pr119826.d: New test.

(cherry picked from commit c5ffab99a5a962aa955310e74ca0a4be5c1acf30)

d: Fix ICE in dwarf2out_imported_module_or_decl, at dwarf2out.cc:27676 [PR119817]

The ImportVisitor method for handling the importing of overload sets was
pushing NULL_TREE to the array of import decls, which in turn got passed
to `debug_hooks->imported_module_or_decl', triggering the observed
internal compiler error.

NULL_TREE is returned from `build_import_decl' when the symbol was
ignored for being non-trivial to represent in debug, for example,
template or tuple declarations. So similarly "skip" adding the symbol
when this is the case for overload sets too.

PR d/119817

gcc/d/ChangeLog:

* imports.cc (ImportVisitor::visit (OverloadSet *)): Don't push
NULL_TREE to vector of import symbols.

gcc/testsuite/ChangeLog:

* gdc.dg/debug/imports/m119817/a.d: New test.
* gdc.dg/debug/imports/m119817/b.d: New test.
* gdc.dg/debug/imports/m119817/package.d: New test.
* gdc.dg/debug/pr119817.d: New test.

(cherry picked from commit f5ed7d19c965de9ccb158d77e929b17459bf65b5)

d: Fix forward referenced enums missing type names in debug info [PR118309]

Calling `rest_of_type_compilation' as the D types were built meant that
debug info was being emitted before all forward references were
resolved, resulting in DW_AT_name's to be missing.

Instead, defer outputting type debug information until all modules have
been parsed and generated in `d_finish_compilation'.

PR d/118309

gcc/d/ChangeLog:

* modules.cc: Include debug.h
(d_finish_compilation): Call debug_hooks->type_decl on all TYPE_DECLs.
* types.cc: Remove toplev.h include.
(finish_aggregate_type): Don't call rest_of_type_compilation or
rest_of_decl_compilation on type.
(TypeVisitor::visit (TypeEnum *)): Likewise.

gcc/testsuite/ChangeLog:

* gdc.dg/debug/dwarf2/pr118309.d: New test.

(cherry picked from commit cee353c2653d274768a67677c8ea37fd23422b3c)

libatomic: Fix up libat_{,un}lock_n for mingw [PR119796]

Here is just a port of the previously posted patch to mingw which
clearly has the same problems.

2025-04-16 Jakub Jelinek <jakub@redhat.com>

PR libgcc/101075
PR libgcc/119796
* config/mingw/lock.c (libat_lock_n, libat_unlock_n): Start with
computing how many locks will be needed and take into account
((uintptr_t)ptr % WATCH_SIZE). If some locks from the end of the
locks array and others from the start of it will be needed, first
lock the ones from the start followed by ones from the end.

(cherry picked from commit 34fe8e90007afbc87941df9b01ffcf8747c11497)

libatomic: Fix up libat_{,un}lock_n [PR119796]

As mentioned in the PR (and I think in PR101075 too), we can run into
deadlock with libat_lock_n calls with larger n.
As mentioned in PR66842, we use multiple locks (normally 64 mutexes
for each 64 byte cache line in 4KiB page) and currently can lock more
than one lock, in particular for n [0, 64] a single lock, for n [65, 128]
2 locks, for n [129, 192] 3 locks etc.
There are two problems with this:
1) we can deadlock if there is some wrap-around, because the locks are
   acquired always in the order from addr_hash (ptr) up to
   locks[NLOCKS-1].mutex and then if needed from locks[0].mutex onwards;
   so if e.g. 2 threads perform libat_lock_n with n = 2048+64, in one
   case at pointer starting at page boundary and in another case at
   page boundary + 2048 bytes, the first thread can lock the first
   32 mutexes, the second thread can lock the last 32 mutexes and
   then first thread wait for the lock 32 held by second thread and
   second thread wait for the lock 0 held by the first thread;
   fixed below by always locking the locks in order of increasing
   index, if there is a wrap-around, by locking in 2 loops, first
   locking some locks at the start of the array and second at the
   end of it
2) the number of locks seems to be determined solely depending on the
   n value, I think that is wrong, we don't know the structure alignment
   on the libatomic side, it could very well be 1 byte aligned struct,
   and so how many cachelines are actually (partly or fully) occupied
   by the atomic access depends not just on the size, but also on
   ptr % WATCH_SIZE, e.g. 2 byte structure at address page_boundary+63
   should IMHO lock 2 locks because it occupies the first and second
   cacheline

Note, before this patch it locked exactly one lock for n = 0, while
with this patch it could lock either no locks at all (if it is at cacheline
boundary) or 1 (otherwise).
Dunno of libatomic APIs can be called for zero sizes and whether
we actually care that much how many mutexes are locked in that case,
because one can't actually read/write anything into zero sized memory.
If you think it is important, I could add else if (nlocks == 0) nlocks = 1;
in both spots.

2025-04-16  Jakub Jelinek  <jakub@redhat.com>

PR libgcc/101075
PR libgcc/119796
* config/posix/lock.c (libat_lock_n, libat_unlock_n): Start with
computing how many locks will be needed and take into account
((uintptr_t)ptr % WATCH_SIZE).  If some locks from the end of the
locks array and others from the start of it will be needed, first
lock the ones from the start followed by ones from the end.

(cherry picked from commit 61dfb0747afcece3b7a690807b83b366ff34f329)

bitintlower: Fix interaction of gimple_assign_copy_p stmts vs. has_single_use [PR119808]

The following testcase is miscompiled, because we emit a CLOBBER in a place
where it shouldn't be emitted.
Before lowering we have:
  b_5 = 0;
  b.0_6 = b_5;
  b.1_1 = (unsigned _BitInt(129)) b.0_6;
...
  <retval> = b_5;
The bitint coalescing assigns the same partition/underlying variable
for both b_5 and b.0_6 (possible because there is a copy assignment)
and of course a different one for b.1_1 (and other SSA_NAMEs in between).
This is -O0 so stmts aren't DCEd and aren't propagated that much etc.
It is -O0 so we also don't try to optimize and omit some names from m_names
and handle multiple stmts at once, so the expansion emits essentially
  bitint.4 = {};
  bitint.4 = bitint.4;
  bitint.2 = cast of bitint.4;
  bitint.4 = CLOBBER;
...
  <retval> = bitint.4;
and the CLOBBER is the problem because bitint.4 is still live afterwards.
We emit the clobbers to improve code generation, but do it only for
(initially) has_single_use SSA_NAMEs (remembered in m_single_use_names)
being used, if they don't have the same partition on the lhs and a few
other conditions.
The problem above is that b.0_6 which is used in the cast has_single_use
and so was in m_single_use_names bitmask and the lhs in that case is
bitint.2, so a different partition.  But there is gimple_assign_copy_p
with SSA_NAME rhs1 and the partitioning special cases those and while
b.0_6 is single use, b_5 has multiple uses.  I believe this ought to be
a problem solely in the case of such copy stmts and its special case
by the partitioning, if instead of b.0_6 = b_5; there would be
b.0_6 = b_5 + 1; or whatever other stmts that performs or may perform
changes on the value, partitioning couldn't assign the same partition
to b.0_6 and b_5 if b_5 is used later, it couldn't have two different
(or potentially different) values in the same bitint.N var.  With
copy that is possible though.

So the following patch fixes it by being more careful when we set
m_single_use_names, don't set it if it is a has_single_use SSA_NAME
but SSA_NAME_DEF_STMT of it is a copy stmt with SSA_NAME rhs1 and that
rhs1 doesn't have single use, or has_single_use but SSA_NAME_DEF_STMT of it
is a copy stmt etc.

Just to make sure it doesn't change code generation too much, I've gathered
statistics how many times
      if (m_first
          && m_single_use_names
          && m_vars[p] != m_lhs
          && m_after_stmt
          && bitmap_bit_p (m_single_use_names, SSA_NAME_VERSION (op)))
        {
          tree clobber = build_clobber (TREE_TYPE (m_vars[p]),
                                        CLOBBER_STORAGE_END);
          g = gimple_build_assign (m_vars[p], clobber);
          gimple_stmt_iterator gsi = gsi_for_stmt (m_after_stmt);
          gsi_insert_after (&gsi, g, GSI_SAME_STMT);
        }
emits a clobber on
make check-gcc GCC_TEST_RUN_EXPENSIVE=1 RUNTESTFLAGS="--target_board=unix\{-m64,-m32\} GCC_TEST_RUN_EXPENSIVE=1 dg.exp='*bitint* pr112673.c builtin-stdc-bit-*.c pr112566-2.c pr112511.c pr116588.c pr116003.c pr113693.c pr113602.c flex-array-counted-by-7.c' dg-torture.exp='*bitint* pr116480-2.c pr114312.c pr114121.c' dfp.exp=*bitint* i386.exp='pr118017.c pr117946.c apx-ndd-x32-2a.c' vect.exp='vect-early-break_99-pr113287.c' tree-ssa.exp=pr113735.c"
and before this patch it was 41010 clobbers and after it is 40968,
so difference is 42 clobbers, 0.1% fewer.

2025-04-16  Jakub Jelinek  <jakub@redhat.com>

PR middle-end/119808
* gimple-lower-bitint.cc (gimple_lower_bitint): Don't set
m_single_use_names bits for SSA_NAMEs which have single use but
their SSA_NAME_DEF_STMT is a copy from another SSA_NAME which doesn't
have a single use, or single use which is such a copy etc.

(cherry picked from commit 5a48e7732d6aa0aaf12b508fa640125e6c4d14b9)

expmed: Always use QImode for init_expmed set_zero_cost [PR119785]

This is a regression on some targets introduced I believe by r6-2055
which added mode argument to set_src_cost.

The problem here is that in the first iteration, mode is always QImode
and we get as -Os zero cost set_src_cost (const0_rtx, QImode, false).
But then we use the mode variable for iterating over int, partial int
and vector int modes, so for the second iteration we call set_src_cost
with mode which is at that time (machine_mode) (MAX_MODE_VECTOR_INT + 1).

In the x86 case that happens to be V2HFmode and we don't crash (and
compute the same 0 cost as we would for QImode).
But e.g. in the SPARC case (machine_mode) (MAX_MODE_VECTOR_INT + 1) is
MAX_MACHINE_MODE and that does all kinds of weird things especially
when doing ubsan bootstrap.

Fixed by always using QImode.

2025-04-14 Jakub Jelinek <jakub@redhat.com>

PR rtl-optimization/119785
* expmed.cc (init_expmed): Always pass QImode rather than mode to
set_src_cost passed to set_zero_cost.

(cherry picked from commit f96a54350afcf7f3c90d0ecb51d7683d826acc00)

driver: On linux hosts disable ASLR during -freport-bug [PR119727]

Andi had a useful comment that even with the PR119727 workaround to
ignore differences in libbacktrace printed addresses, it is still better
to turn off ASLR when easily possible, e.g. in case some address leaks
in somewhere in the ICE message elsewhere, or to verify the ICE doesn't
depend on a particular library/binary load addresses.

The following patch adds a configure check and uses personality syscall
to turn off randomization for further -freport-bug subprocesses.

2025-04-14 Jakub Jelinek <jakub@redhat.com>

PR driver/119727
* configure.ac (HOST_HAS_PERSONALITY_ADDR_NO_RANDOMIZE): New check.
* gcc.cc: Include sys/personality.h if
HOST_HAS_PERSONALITY_ADDR_NO_RANDOMIZE is defined.
(try_generate_repro): Call
personality (personality (0xffffffffU) | ADDR_NO_RANDOMIZE)
if HOST_HAS_PERSONALITY_ADDR_NO_RANDOMIZE is defined.
* config.in: Regenerate.
* configure: Regenerate.

(cherry picked from commit 5a32e85810d33dc46b1b5fe2803ee787d40709d5)

driver: Fix up -freport-bug for ASLR [PR119727]

With --enable-host-pie -freport-bug almost never prepares preprocessed
source and instead emits
The bug is not reproducible, so it is likely a hardware or OS problem.
message even for bogus which are 100% reproducible.
The way -freport-bug works is that it reruns it 3 times, capturing stdout
and stderr from each and then tries to compare the outputs in between
different runs.
The libbacktrace emitted hexadecimal addresses at the start of the lines
can differ between runs due to ASLR, either of the PIE executable, or
even if not PIE if there is some frame with e.g. libc function (say
crash in strlen/memcpy etc.).

The following patch fixes it by ignoring such differences at the start of
the lines.

2025-04-12 Jakub Jelinek <jakub@redhat.com>

PR driver/119727
* gcc.cc (files_equal_p): Rewritten using fopen/fgets/fclose instead
of open/fstat/read/close. At the start of lines, ignore lowercase
hexadecimal addresses followed by space.

(cherry picked from commit 8b2ceb421f045ee8b39d7941f39f1e9a67217583)

bitintlower: Fix up handling of SSA_NAME copies in coalescing [PR119722]

The following patch is miscompiled, because during the limited
SSA name coalescing the bitintlower pass does we incorrectly don't
register a conflict.
This is on
  <bb 4> [local count: 1073741824]:
  # b_17 = PHI <b_19(3), 8(2)>
  g.4_13 = g;
  _14 = g.4_13 >> 50;
  _15 = (unsigned int) _14;
  _21 = b_17;
  _16 = (unsigned int) _21;
  s_22 = _15 + _16;
  return s_22;
basic block where in the map->bitint bitmap we track 14, 17 and 19.
The build_bitint_stmt_ssa_conflicts "hook" has special code where
it tracks uses at the final statements of mergeable operations, so
e.g. the
  _16 = (unsigned int) _21;
statement is considered to be use of b_17 because _21 is not in
map->bitmap (or large_huge.m_names), i.e. is mergeable.
The problem is that build_ssa_conflict_graph has special code to handle
SSA_NAME copies and _21 = b_17; is gimple_assign_copy_p.  In such cases
it calls live_track_clear_var on the rhs1.  The problem is that
on the above bb, after we note in the _16 = (unsigned int) _21;
stmt we need b_17 the generic code makes us forget that because
of the copy statement, and then build_bitint_stmt_ssa_conflicts
ignores it completely (because _21 is large/huge bitint and is
not in map->bitint, so assumed to be handled by a later stmt in the
bb, for backwards walk like this before this one).
As the b_17 use is ignored, the coalescing thinks it can put
all of b_17, b_19 and _14 into the same partition, which is wrong,
while we can and should coalesce b_17 and b_19, _14 needs to be a different
temporary because b_17 is set before and used after _14 has been written.

The following patch fixes it by handling gimple_assign_copy_p in two
separate spots, move the generic coalesce handling of it after
build_ssa_conflict_graph (where build_ssa_conflict_graph handling
doesn't fall through to that, it does continue after the call) and
inside of build_ssa_conflict_graph it performs it too, but only if
the lhs is not mergeable large/huge bitint.

2025-04-12  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/119722
* gimple-lower-bitint.h (build_bitint_stmt_ssa_conflicts): Add
CLEAR argument.
* gimple-lower-bitint.cc (build_bitint_stmt_ssa_conflicts): Add
CLEAR argument.  Call clear on gimple_assign_copy_p rhs1 if lhs
is large/huge bitint unless lhs is not in names.
* tree-ssa-coalesce.cc (build_ssa_conflict_graph): Adjust
build_bitint_stmt_ssa_conflicts caller.  Move gimple_assign_copy_p
handling to after the build_bitint_stmt_ssa_conflicts call.

(cherry picked from commit 3f9dfb94eab1ab1bbf9a2b5e20d1f61e36516063)

bitintlower: Fix up handling of nested casts in m_upward_2limbs cases [PR119707]

The following testcase is miscompiled I believe starting with
PR112941 r14-6742.  That commit fixed the bitint-55.c testcase.
The m_first initialization for such conversion initializes 2 SSA_NAMEs,
one is PHI result on the loop (m_data[save_data_cnt]) and the other
(m_data[save_data_cnt+1]) is the argument of that PHI from the latch
edge initialized somewhere in the loop.  Both of these are used to
propagate sign extension (i.e. either 0 or all ones limb) from the
iteration with the sign bit of a narrower type to following iterations.
The bitint-55.c testcase was ICEing with invalid SSA forms as it was
using unconditionally the PHI argument SSA_NAME even in places which
weren't dominated by that.  And the code which was touched is about
handling constant idx, so if e.g. there are nested casts and the
outer one does conditional code based on index comparison with
a particular constant index.
In the following testcase there are 2 nested casts, one from signed
_BitInt(129) to unsigned _BitInt(255) and the outer from unsigned
_BitInt(255) to unsigned _BitInt(256).  The m_upward_2limbs case which
is used for handling mergeable arithmetics (like +-|&^ and casts etc.)
one loop iteration handles 2 limbs, the first half the even ones, the
second half the odd ones.
And for these 2 conversions, the special one for the inner conversion
on x86_64 is with index 2 where the sign bit of _BitInt(129) is present,
while for the outer one index 3 where we need to mask off the most
significant bit.
The r15-6742 change started using m_data[save_data_cnt] for all constant
indexes if it is still inside of the loop (and it is sign extension).
But that doesn't work correctly for the case where the inner conversion
produces the sign extension limb in the loop for an even index and
the outer conversion needs to special case the immediately next conversion,
because in that case using the PHI result will see still 0 there rather
than the updated value from the handling of previous limb.
So the following patch special cases this and uses the other SSA_NAME.

Commented IL, trying to lower
  _1 = (unsigned _BitInt(255)) y_4(D);
  _2 = (unsigned _BitInt(256)) _1;
  _3 = _2 + x_5(D);
  <retval> = _3;
we were emitting
  <bb 3> [local count: 1073741824]:
  # _8 = PHI <0(2), _9(12)>     // This is the limb index
  # _10 = PHI <0(2), _11(12)>   // Sign extension limb from inner cast (0 or ~0UL)
  # _22 = PHI <0(2), _23(12)>   // Overflow bit from addition of previous limb
  if (_8 <= 2)
    goto <bb 4>; [80.00%]
  else
    goto <bb 7>; [20.00%]

  <bb 4> [local count: 1073741824]:
  if (_8 == 2)
    goto <bb 6>; [20.00%]
  else
    goto <bb 5>; [80.00%]

  <bb 5> [local count: 1073741824]:
  _12 = VIEW_CONVERT_EXPR<unsigned long[3]>(y)[_8];     // Full limbs in y
  goto <bb 7>; [100.00%]

  <bb 6> [local count: 214748360]:
  _13 = MEM <unsigned long> [(_BitInt(129) *)&y + 16B]; // y[2] which
  _14 = (<unnamed-signed:1>) _13;                       // needs to be
  _15 = (unsigned long) _14;                            // sign extended
  _16 = (signed long) _15;                              // to full
  _17 = _16 >> 63;                                      // limb
  _18 = (unsigned long) _17;

  <bb 7> [local count: 1073741824]:
  # _19 = PHI <_12(5), _10(3), _15(6)>  // Limb to add for result of casts
  # _20 = PHI <0(5), _10(3), _18(6)>    // Sign extension limb from previous limb
  _11 = _20;                            // PHI _10 argument above
  _21 = VIEW_CONVERT_EXPR<unsigned long[4]>(x)[_8];
  _24 = .UADDC (_19, _21, _22);
  _25 = IMAGPART_EXPR <_24>;
  _26 = REALPART_EXPR <_24>;
  VIEW_CONVERT_EXPR<unsigned long[4]>(<retval>)[_8] = _26;
  _27 = _8 + 1;
  if (_27 == 3)                 // For the outer cast limb 3 is special
    goto <bb 11>; [20.00%]
  else
    goto <bb 8>; [80.00%]

  <bb 8> [local count: 1073741824]:
  if (_27 < 2)
    goto <bb 9>; [80.00%]
  else
    goto <bb 10>; [20.00%]

  <bb 9> [local count: 1073741824]:
  _28 = VIEW_CONVERT_EXPR<unsigned long[3]>(y)[_27];    // These are used in full

  <bb 10> [local count: 1073741824]:
  # _29 = PHI <_28(9), _11(8)>
  goto <bb 12>; [100.00%]

  <bb 11> [local count: 214748360]:
// And HERE is the actual bug.  Using _10 for idx 3 will mean it is always
// zero there and doesn't contain the _18 value propagated to it.
// It should be
// _30 = (<unnamed-unsigned:63>) _11;
// Now if the outer conversion had special iteration say 5, we could
// have used _10 fine here, by that time it already propagates through
// the PHI.
  _30 = (<unnamed-unsigned:63>) _10;
  _31 = (unsigned long) _30;

  <bb 12> [local count: 1073741824]:
  # _32 = PHI <_29(10), _31(11)>
  _33 = VIEW_CONVERT_EXPR<unsigned long[4]>(x)[_27];
  _34 = .UADDC (_32, _33, _25);
  _23 = IMAGPART_EXPR <_34>;
  _35 = REALPART_EXPR <_34>;
  VIEW_CONVERT_EXPR<unsigned long[4]>(<retval>)[_27] = _35;
  _9 = _8 + 2;
  if (_9 != 4)
    goto <bb 3>; [0.05%]
  else
    goto <bb 13>; [99.95%]

2025-04-11  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/119707
* gimple-lower-bitint.cc (bitint_large_huge::handle_cast): Only use
m_data[save_data_cnt] instead of m_data[save_data_cnt + 1] if
idx is odd and equal to low + 1.  Remember tree_to_uhwi (idx) in
a temporary instead of calling the function multiple times.

* gcc.dg/torture/bitint-76.c: New test.

(cherry picked from commit b57d7ef4bdda8f939d804bfe40123cb9e4b447b3)

libquadmath: Fix up THREEp96 constant in expq

Here is a cherry-pick from glibc [BZ #32411] fix.

As mentioned by the reporter in a pull request against gcc-mirror,
the THREEp96 constant in e_expl.c is incorrect, it is actually 0x3.p+94f128
rather than 0x3.p+96f128.

The algorithm uses that to compute the t2 integer (tval2), by whose
delta it adjusts the x+xl pair and then in the result uses the precomputed
exp value for that entry.
Using 0x3.p+94f128 rather than 0x3.p+96f128 results in tval2 sometimes
being one smaller, sometimes one larger than the desired value, thus can mean
the x+xl pair after adjustment will be larger in absolute value than it
should be.

DesWursters created a test program for this
https://github.com/DesWurstes/comparefloats
and his results were
total: 1135000000 not_equal: 4322 earlier_score: 674 later_score: 3648
I've modified this so with
https://sourceware.org/bugzilla/show_bug.cgi?id=32411#c3
so that it actually tests pseudo-random _Float128 values with range
(-16384.,16384) with strong bias on values larger than 0.0002 in absolute
value (so that tval1/tval2 aren't zero most of the time) and that gave
total: 10000000000 not_equal: 29861 earlier_score: 4606 later_score: 25255
So, in both cases, in most cases the change doesn't result in any differences,
and in those rare cases where does, about 85% have smaller ulp than without
the patch.
Additionally I've tried
https://sourceware.org/bugzilla/show_bug.cgi?id=32411#c4
and in 2 billion iterations it didn't find any case where x+xl after the
adjustments without this change would be smaller in absolute value compared
to x+xl after the adjustments with this change.

2025-04-09 Jakub Jelinek <jakub@redhat.com>

* math/expq.c (C): Fix up THREEp96 constant.

(cherry picked from commit e081ced345c45581a4891361c08e50e07720239e)

lto: lto-opts fixes [PR119625]

I can reproduce a really weird error in our distro i686 trunk gcc
(but haven't managed to reproduce it with vanilla trunk yet).
echo 'void foo (void) {}' > a.c; gcc -O2 -flto=auto -m32 -march=i686 -ffat-lto-objects -fhardened -o a.o -c a.c; gcc -O2 -flto=auto -m32 -march=i686 -r -o a.lo a.o
lto1: fatal error: open  failed: No such file or directory
compilation terminated.
lto-wrapper: fatal error: gcc returned 1 exit status
The error is because
cat ./a.lo.lto.o-args.0
""
a.o
My suspicion is that this "" in there is caused by weird .gnu.lto_.opts
section content during
gcc -O2 -flto=auto -m32 -march=i686 -ffat-lto-objects -fhardened -S -o a.s -c a.c
compilation (and I can reproduce that one with vanilla trunk).
The above results in
        .section        .gnu.lto_.opts,"e",@progbits
        .string "'-fno-openmp' '-fno-openacc' '-fPIC' '' '-m32' '-march=i686' '-O2' '-flto=auto' '-ffat-lto-objects'"
There are two weird things, one (IMHO the cause of the "" later on) is
the '' part, I think it comes from lto_write_options doing
append_to_collect_gcc_options (&temporary_obstack, &first_p, "");
IMHO it shouldn't call append_to_collect_gcc_options at all for that case.

The -fhardened option causes global_options.x_flag_cf_protection
to be set to CF_FULL and later on the backend option processing
sets it to CF_FULL | CF_SET (i.e. 7, a value not handled in
lto_write_options).

The following patch fixes it by not emitting anything there if
flag_cf_protection is one of the unhandled values.

Perhaps it could incrementally use
switch (global_options.x_flag_cf_protection & ~CF_SET)
instead, dunno.

And the other problem is that the -fPIC in there is really weird.
Our distro compiler or vanilla configured trunk certainly doesn't
default to -fPIC and -fhardened uses -fPIE when
-fPIC/-fpic/-fno-pie/-fno-pic is not specified, so I was expecting
-fPIE in there.
The thing is that the -fpie option causes setting of both
global_options.x_flag_pi{c,e} to 1, -fPIE both to 2:
      /* If -fPIE or -fpie is used, turn on PIC.  */
      if (opts->x_flag_pie)
        opts->x_flag_pic = opts->x_flag_pie;
      else if (opts->x_flag_pic == -1)
        opts->x_flag_pic = 0;
      if (opts->x_flag_pic && !opts->x_flag_pie)
        opts->x_flag_shlib = 1;
so checking first for flag_pic == 2 and then flag_pic == 1
and only afterwards for flag_pie means we never print
-fPIE/-fpie.

Or do you want something further (like
switch (global_options.x_flag_cf_protection & ~CF_SET)
)?

2025-04-04  Jakub Jelinek  <jakub@redhat.com>

PR lto/119625
* lto-opts.cc (lto_write_options): If neither flag_pic nor
flag_pie are set, check first for flag_pie and only later
for flag_pic rather than the other way around, use a temporary
variable.  If flag_cf_protection is not set, don't append anything
if flag_cf_protection is none of CF_{NONE,FULL,BRANCH,RETURN} and
use a temporary variable.

(cherry picked from commit d25728c98682c058bfda79333c94b0a8cf2a3f49)

c: Fix ICEs with -fsanitize=pointer-{subtract,compare} [PR119582]

The following testcase ICEs because c_fully_fold isn't performed on the
arguments of __sanitizer_ptr_{sub,cmp} builtins and so e.g.
C_MAYBE_CONST_EXPR can leak into the gimplifier where it ICEs.

2025-04-02 Jakub Jelinek <jakub@redhat.com>

PR c/119582
* c-typeck.cc (pointer_diff, build_binary_op): Call c_fully_fold on
__sanitizer_ptr_sub or __sanitizer_ptr_cmp arguments.

* gcc.dg/asan/pr119582.c: New test.

(cherry picked from commit 29bc904cb827615ed9f36bc3742ccc4ac77515ec)

combine: Use reg_used_between_p rather than modified_between_p in two spots [PR119291]

The following testcase is miscompiled on x86_64-linux at -O2 by the combiner.
We have from earlier combinations
(insn 22 21 23 4 (set (reg:SI 104 [ _7 ])
        (const_int 0 [0])) "pr119291.c":25:15 96 {*movsi_internal}
     (nil))
(insn 23 22 24 4 (set (reg/v:SI 117 [ e ])
        (reg/v:SI 116 [ e ])) 96 {*movsi_internal}
     (expr_list:REG_DEAD (reg/v:SI 116 [ e ])
        (nil)))
(note 24 23 25 4 NOTE_INSN_DELETED)
(insn 25 24 26 4 (parallel [
            (set (reg:CCZ 17 flags)
                (compare:CCZ (neg:SI (reg:SI 104 [ _7 ]))
                    (const_int 0 [0])))
            (set (reg/v:SI 116 [ e ])
                (neg:SI (reg:SI 104 [ _7 ])))
        ]) "pr119291.c":26:13 977 {*negsi_2}
     (expr_list:REG_DEAD (reg:SI 104 [ _7 ])
        (nil)))
(note 26 25 27 4 NOTE_INSN_DELETED)
(insn 27 26 28 4 (set (reg:DI 128 [ _9 ])
        (ne:DI (reg:CCZ 17 flags)
            (const_int 0 [0]))) "pr119291.c":26:13 1447 {*setcc_di_1}
     (expr_list:REG_DEAD (reg:CCZ 17 flags)
        (nil)))
and try_combine is called on i3 25 and i2 22 (second time)
and reach the hunk being patched with simplified i3
(insn 25 24 26 4 (parallel [
            (set (pc)
                (pc))
            (set (reg/v:SI 116 [ e ])
                (const_int 0 [0]))
        ]) "pr119291.c":28:13 977 {*negsi_2}
     (expr_list:REG_DEAD (reg:SI 104 [ _7 ])
        (nil)))
and
(insn 22 21 23 4 (set (reg:SI 104 [ _7 ])
        (const_int 0 [0])) "pr119291.c":27:15 96 {*movsi_internal}
     (nil))
Now, the try_combine code there attempts to split two independent
sets in newpat by moving one of them to i2.
And among other tests it checks
!modified_between_p (SET_DEST (set1), i2, i3)
which is certainly needed, if there would be say
(set (reg/v:SI 116 [ e ]) (const_int 42 [0x2a]))
in between i2 and i3, we couldn't do that, as that set would overwrite
the value set by set1 we want to move to the i2 position.
But in this case pseudo 116 isn't set in between i2 and i3, but used
(and additionally there is a REG_DEAD note for it).

This is equally bad for the move, because while the i3 insn
and later will see the pseudo value that we set, the insn in between
which uses the value will see a different value from the one that
it should see.

As we don't check for that, in the end try_combine succeeds and
changes the IL to:
(insn 22 21 23 4 (set (reg/v:SI 116 [ e ])
        (const_int 0 [0])) "pr119291.c":27:15 96 {*movsi_internal}
     (nil))
(insn 23 22 24 4 (set (reg/v:SI 117 [ e ])
        (reg/v:SI 116 [ e ])) 96 {*movsi_internal}
     (expr_list:REG_DEAD (reg/v:SI 116 [ e ])
        (nil)))
(note 24 23 25 4 NOTE_INSN_DELETED)
(insn 25 24 26 4 (set (pc)
        (pc)) "pr119291.c":28:13 2147483647 {NOOP_MOVE}
     (nil))
(note 26 25 27 4 NOTE_INSN_DELETED)
(insn 27 26 28 4 (set (reg:DI 128 [ _9 ])
        (const_int 0 [0])) "pr119291.c":28:13 95 {*movdi_internal}
     (nil))
(note, the i3 got turned into a nop and try_combine also modified insn 27).

The following patch replaces the modified_between_p
tests with reg_used_between_p, my understanding is that
modified_between_p is a subset of reg_used_between_p, so one
doesn't need both.

Looking at this some more today, I think we should special case
set_noop_p because that can be put into i2 (except for the JUMP_P
violations), currently both modified_between_p (pc_rtx, i2, i3)
and reg_used_between_p (pc_rtx, i2, i3) returns false.
I'll post a patch incrementally for that (but that feels like
new optimization, so probably not something that should be backported).

On Tue, Apr 01, 2025 at 11:27:25AM +0200, Richard Biener wrote:
> Can we constrain SET_DEST (set1/set0) to a REG_P in combine?  Why
> does the comment talk about memory?

I was worried about making too risky changes this late in stage4
(and especially also for backports).  Most of this code is 1992-ish.
I think many of the functions are just misnamed, the reg_ in there doesn't
match what those functions do (bet they initially supported just REGs
and later on support for other kinds of expressions was added, but haven't
done git archeology to prove that).

What we know for sure is:
           && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 0))) != ZERO_EXTRACT
           && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 0))) != STRICT_LOW_PART
           && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 1))) != ZERO_EXTRACT
           && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 1))) != STRICT_LOW_PART
that is checked earlier in the condition.
Then it calls
           && ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 1)),
                                  XVECEXP (newpat, 0, 0))
           && ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 0)),
                                  XVECEXP (newpat, 0, 1))
While it has reg_* in it, that function mostly calls reg_overlap_mentioned_p
which is also misnamed, that function handles just fine all of
REG, MEM, SUBREG of REG, (SUBREG of MEM not, see below), ZERO_EXTRACT,
STRICT_LOW_PART, PC and even some further cases.
So, IMHO SET_DEST (set0) or SET_DEST (set0) can be certainly a REG, SUBREG
of REG, PC (at least the REG and PC cases are triggered on the testcase)
and quite possibly also MEM (SUBREG of MEM not, see below).

Now, the code uses !modified_between_p (SET_SRC (set{1,0}), i2, i3) where that
function for constants just returns false, for PC returns true, for REG
returns reg_set_between_p, for MEM recurses on the address, for
MEM_READONLY_P otherwise returns false, otherwise checks using alias.cc code
whether the memory could have been modified in between, for all other
rtxes recurses on the subrtxes.  This part didn't change in my patch.

I've only changed those
-         && !modified_between_p (SET_DEST (set{1,0}), i2, i3)
+         && !reg_used_between_p (SET_DEST (set{1,0}), i2, i3)
where the former has been described above and clearly handles all of
REG, SUBREG of REG, PC, MEM and SUBREG of MEM among other things.

The replacement reg_used_between_p calls reg_overlap_mentioned_p on each
instruction in between i2 and i3.  So, there is clearly a difference
in behavior if SET_DEST (set{1,0}) is pc_rtx, in that case modified_between_p
returns unconditionally true even if there are no instructions in between,
but reg_used_between_p if there are no non-debug insns in between returns
false.  Sorry for missing that, guess I should check for that (with the
exception of the noop moves which are often (set (pc) (pc)) and handled
by the incremental patch).  In fact not just that, reg_used_between_p
will only return true for PC if it is mentioned anywhere in the insns
in between.
Anyway, except for that, for REG it calls refers_to_regno_p
and so should find any occurrences of any of the REG or parts of it for hard
registers, for MEM returns true if it sees any MEMs in insns in between
(conservatively), for SUBREGs apparently it relies on it being SUBREG of REG
(so doesn't handle SUBREG of MEM) and handles SUBREG of REG like the
SUBREG_REG, PC I've already described.

Now, because reg_overlap_mentioned_p doesn't handle SUBREG of MEM, I think
already the initial
           && ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 1)),
                                  XVECEXP (newpat, 0, 0))
           && ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 0)),
                                  XVECEXP (newpat, 0, 1))
calls would have failed --enable-checking=rtl or would have misbehaved, so
I think there is no need to check for it further.

To your question why I don't use reg_referenced_p, that is because
reg_referenced_p is something to call on one insn pattern, while
reg_used_between_p is pretty much that on all insns in between two
instructions (excluding the boundaries).

So, I think it would be safer to add && SET_DEST (set{1,0} != pc_rtx
checks to preserve former behavior, like in the following version.

2025-04-01  Jakub Jelinek  <jakub@redhat.com>

PR rtl-optimization/119291
* combine.cc (try_combine): For splitting of PARALLEL with
2 independent SETs into i2 and i3 sets check reg_used_between_p
of the SET_DESTs rather than just modified_between_p.

* gcc.c-torture/execute/pr119291.c: New test.

(cherry picked from commit 19ba913517b5e2a001fa9c0f060a1ac74430c027)

LoongArch: Change {dg-do-what-default} save and restore logical.

The set of {dg-do-what-default} to 'run' may lead some test hang
during make check.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/vector/loongarch-vector.exp: Change
{dg-do-what-default} save and restore logical.

(cherry picked from commit dd982198656d914a4958bf86356a4c996c728b9d)

Daily bump.

libstdc++: Correct preprocessing checks for floatX_t and bfloat_16 formatting

Floating points types _Float16, _Float32, _Float64, and bfloat16,
can be formatted only if std::to_chars overloads for such types
were provided. Currently this is only the case for architectures
where float and double are 32-bits and 64-bits IEEE floating points types.

This patch updates the preprocessing checks for formatters
for above types to check _GLIBCXX_FLOAT_IS_IEEE_BINARY32
and _GLIBCXX_DOUBLE_IS_IEEE_BINARY64. Making them non-formattable
on non-IEEE architectures.

Remove a potential UB, where we could produce basic_format_arg
with _M_type set to _Arg_fp32 or _Arg_fp64, that was later not
handled by `_M_visit`.

libstdc++-v3/ChangeLog:

* include/std/format (formatter<_Float16, _CharT>): Define only if
_GLIBCXX_FLOAT_IS_IEEE_BINARY32 macro is defined.
(formatter<_Float16, _CharT>): As above.
(formatter<__gnu_cxx::__bfloat16_t, _CharT>): As above.
(formatter<_Float64, _CharT>): Define only if
_GLIBCXX_DOUBLE_IS_IEEE_BINARY64 is defined.
(basic_format_arg::_S_to_arg_type): Normalize _Float32 and _Float64
only to float and double respectivelly.
(basic_format_arg::_S_to_enum): Remove handling of _Float32 and _Float64.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
(cherry picked from commit 445128c12cf22081223f7385196ee3889ef4c4b2)

i386: Enable -mnop-mcount for -fpic with PLTs [PR119386]

-mnop-mcount can be trivially enabled for -fPIC codegen as long as PLTs
are being used, given that the instruction encodings are identical, only
the target may resolve differently depending on how the linker decides
to incorporate the object file.

So relax the option check, and add a test to ensure that 5-byte NOPs are
emitted when -mnop-mcount is being used.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
gcc/ChangeLog:

PR target/119386
* config/i386/i386-options.cc: Permit -mnop-mcount when
using -fpic with PLTs.

gcc/testsuite/ChangeLog:

PR target/119386
* gcc.target/i386/pr119386-3.c: New test.

(cherry picked from commit 6b4569a3ebdd0df44d87d67a18272ec0b878f2ee)

i386: Prefer PLT indirection for __fentry__ calls under -fPIC [PR119386]

Commit bde21de1205 ("i386: Honour -mdirect-extern-access when calling
__fentry__") updated the logic that emits mcount() / __fentry__() calls
into function prologues when profiling is enabled, to avoid GOT-based
indirect calls when a direct call would suffice.

There are two problems with that change:
- it relies on -mdirect-extern-access rather than -fno-plt to decide
  whether or not a direct [PLT based] call is appropriate;
- for the PLT case, it falls through to x86_print_call_or_nop(), which
  does not emit the @PLT suffix, resulting in the wrong relocation to be
  used (R_X86_64_PC32 instead of R_X86_64_PLT32)

Fix this by testing flag_plt instead of ix86_direct_extern_access, and
updating x86_print_call_or_nop() to take flag_pic and flag_plt into
account. This also ensures that -mnop-mcount works as expected when
emitting the PLT based profiling calls.

While at it, fix the 32-bit logic as well, and issue a PLT call unless
PLTs are explicitly disabled.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119386

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
gcc/ChangeLog:

PR target/119386
* config/i386/i386.cc (x86_print_call_or_nop): Add @PLT suffix
where appropriate.
(x86_function_profiler): Fall through to x86_print_call_or_nop()
for PIC codegen when flag_plt is set.

gcc/testsuite/ChangeLog:

PR target/119386
* gcc.target/i386/pr119386-1.c: New test.
* gcc.target/i386/pr119386-2.c: New test.

(cherry picked from commit 9b0ae0a8d70603960f3c578d261efd18c02b803f)

Daily bump.

Fix wrong optimization of conditional expression with enumeration type

This is a regression introduced on the mainline and 14 branch by:
https://gcc.gnu.org/pipermail/gcc-cvs/2023-October/391658.html

The change bypasses int_fits_type_p (essentially) to work around the
signedness constraints, but in doing so disregards the peculiarities
of boolean types whose precision is not 1 dealt with by the predicate,
leading to the creation of a problematic conversion here.

Fixed by special-casing boolean types whose precision is not 1, as done
in several other places.

gcc/
* tree-ssa-phiopt.cc (factor_out_conditional_operation): Do not
bypass the int_fits_type_p test for boolean types whose precision
is not 1.

gcc/testsuite/
* gnat.dg/opt105.adb: New test.
* gnat.dg/opt105_pkg.ads, gnat.dg/opt105_pkg.adb: New helper.

aarch64: Disable sysreg feature gating

This applies to the sysreg read/write intrinsics __arm_[wr]sr*. It does
not depend on changes to Binutils, because GCC converts recognised
sysreg names to an encoding based form, which is already ungated in Binutils.

We have, however, agreed to make an equivalent change in Binutils (which
would then disable feature gating for sysreg accesses in inline
assembly), but this has not yet been posted upstream.

In the future we may introduce a new flag to renable some checking,
but these checks could not be comprehensive because many system
registers depend on architecture features that don't have corresponding
GCC/GAS --march options. This would also depend on addressing numerous
inconsistencies in the existing list of sysreg feature dependencies.

gcc/ChangeLog:

* config/aarch64/aarch64.cc
(aarch64_valid_sysreg_name_p): Remove feature check.
(aarch64_retrieve_sysreg): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/acle/rwsr-ungated.c: New test.

x86: Update gcc.target/i386/apx-interrupt-1.c

ix86_add_cfa_restore_note omits the REG_CFA_RESTORE REG note for registers
pushed in red-zone.  Since

commit 0a074b8c7e79f9d9359d044f1499b0a9ce9d2801
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Sun Apr 13 12:20:42 2025 -0700

    APX: Don't use red-zone with 32 GPRs and no caller-saved registers

disabled red-zone, update gcc.target/i386/apx-interrupt-1.c to expect
31 .cfi_restore directives.

PR target/119784
* gcc.target/i386/apx-interrupt-1.c: Expect 31 .cfi_restore
directives.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit 5ed2fa4768f3d318b8ace5bd4a095596e06fad7b)