git.ipfire.org Git - thirdparty/gcc.git/log

Update ChangeLog and version files for release

Daily bump.

Fortran: Suppress failing part of testcase [PR109345]

2024-06-02 Paul Thomas <pault@gcc.gnu.org>

gcc/testsuite/
PR fortran/109345
* gfortran.dg/pr109345.f90: Comment out test of component refs.

Fortran: Fix gimplification error on assignment to pointer [PR103391]

PR fortran/103391

gcc/fortran/ChangeLog:

* trans-expr.cc (gfc_trans_assignment_1): Do not use poly assign
for pointer arrays on lhs (as it is done for allocatables
already).

gcc/testsuite/ChangeLog:

* gfortran.dg/assign_12.f90: New test.

(cherry picked from commit 04909c7ecc023874c3444b85f88c60b7b7cc7778)

Daily bump.

testsuite: Add testcase for GCC 13 branch s390 bug [PR120480]

This got broken with r13-9727 and fixed with either of
r13-9729 or r13-9728.

2025-05-30 Jakub Jelinek <jakub@redhat.com>

PR target/120480
* gcc.dg/pr120480.c: New test.

(cherry picked from commit c13d5b939fee565047394475952878dc5394fb74)

s390: Use match_scratch instead of scratch in define_split [PR119834]

The following testcase ICEs since r15-1579 (addition of late combiner),
because *clrmem_short can't be split.
The problem is that the define_insn uses
   (use (match_operand 1 "nonmemory_operand" "n,a,a,a"))
   (use (match_operand 2 "immediate_operand" "X,R,X,X"))
   (clobber (match_scratch:P 3 "=X,X,X,&a"))
and define_split assumed that if operands[1] is const_int_operand,
match_scratch will be always scratch, and it will be reg only if
it was the last alternative where operands[1] is a reg.
The pattern doesn't guarantee it though, of course RA will not try to
uselessly assign a reg there if it is not needed, but during RA
on the testcase below we match the last alternative, but then comes
late combiner and propagates const_int 3 into operands[1].  And that
matches fine, match_scratch matches either scratch or reg and the constraint
in that case is X for the first variant, so still just fine.  But we won't
split that because the splitters only expect scratch.

The following patch fixes it by using match_scratch instead of scratch,
so that it accepts either.

2025-04-17  Jakub Jelinek  <jakub@redhat.com>

PR target/119834
* config/s390/s390.md (define_split after *cpymem_short): Use
(clobber (match_scratch N)) instead of (clobber (scratch)).  Use
(match_dup 4) and operands[4] instead of (match_dup 3) and operands[3]
in the last of those.
(define_split after *clrmem_short): Use (clobber (match_scratch N))
instead of (clobber (scratch)).
(define_split after *cmpmem_short): Likewise.

* g++.target/s390/pr119834.C: New test.

(cherry picked from commit 22fe83d6fc9f59311241c981bcad58b61e2056d4)

Revert "[LRA]: Backporting solutions for PR112918 and PR113354 to solve PR99015"

This reverts commit af73c8bf5168848275bf909ee44fbb8f4973438f.

Daily bump.

[LRA]: Backporting solutions for PR112918 and PR113354 to solve PR99015

Patches for PR112918 and PR11354 depend on each other and can not be
clearly applied to gcc-13 branch. So patches were modified and
combined.

gcc/ChangeLog:

PR rtl-optimization/99015
* lra-constraints.cc (enough_allocatable_hard_regs_p): Extract
from in_class_p.
(in_class_p): Use it with added conditions.
(process_alt_operands): Try to change class too.
(curr_insn_transform): Pass true to in_class_p for reg operand
win. Spill pseudo only used in the insn if the corresponding
operand does not require hard register anymore.

Daily bump.

tree-sra: Do not create stores into const aggregates (PR111873)

This patch fixes (hopefully the) one remaining place where gimple SRA
was still creating a load into const aggregates.  It occurs when there
is a replacement for a load but that replacement is not type
compatible - typically because it is a single field structure.

I have used testcases from duplicates because the original test-case
no longer reproduces for me.

gcc/ChangeLog:

2025-05-13  Martin Jambor  <mjambor@suse.cz>

PR tree-optimization/111873
* tree-sra.cc (sra_modify_expr): When processing a load which has
a type-incompatible replacement, do not store the contents of the
replacement into the original aggregate when that aggregate is
const.

gcc/testsuite/ChangeLog:

2025-05-13  Martin Jambor  <mjambor@suse.cz>

* gcc.dg/ipa/pr120044-1.c: New test.
* gcc.dg/ipa/pr120044-2.c: Likewise.
* gcc.dg/tree-ssa/pr114864.c: Likewise.

(cherry picked from commit 9d039eff453f777c58642ff16178c1ce2a4be6ab)

configure, Darwin: Recognise new naming for Xcode ld.

The latest editions of XCode have altered the identify reported by 'ld -v'
(again). This means that GCC configure no longer detects the version.

Fixed by adding the new name to the set checked.

gcc/ChangeLog:

* configure: Regenerate.
* configure.ac: Recognise PROJECT:ld-mmmm.nn.aa as an identifier
for Darwin's static linker.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
(cherry picked from commit 7f56a8e8ad1c33d358e9e09fcbaf263c2caba1b9)

testsuite, gm2: Use -B option for libstdc++ where required.

We need to add testsuite  options to locate gm2 libs and libstdc++.

Usually '-L' options are added to point to the relevant directories for
the uninstalled libraries.

In cases where libraries are available as both shared and convenience some
additional checks are made.

For some targets -static-xxxx options are handled by specs substitution and
need a '-B' option rather than '-L'.  For Darwin, when embedded runpaths are
in use (the default for all versions after macOS 10.11), '-B' is also needed
to provide the runpath.

When '-B' is used, this results in a '-L' for each path that exists (so that
appending a '-L' as well is a needless duplicate).  There are also cases
where tools warn for duplicates, leading to spurious fails.

Therefore the objective of the code here is to add just one '-L' or '-B' for
each of the libraries.

Currently, we are forcing the full paths to each of the gm2 convenience libs
onto the link line and therefore the B/L logic is not needed there.  It would
need to be added if/when gm2 is tested with shared libraries

gcc/testsuite/ChangeLog:

* lib/gm2.exp: Arrange for a '-B' option to be added for the
libstdc++ paths on targets that need it.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
(cherry picked from commit 6b9ceac9e4e2be304c39e6bc8744edf21faac4fb)

Darwin: Pass -macos_version_min to the linker [PR119172].

For binaries to be notarised, the SDK version must be available.
Since we do not, at present, parse this information we have been
passing "0.0" to ld64. This now results in a warning and a fail
to notarise. As a quick-fix, we can fall back to letting ld64
figure out the SDK version (which it does for -macos_version_min).

TODO: Parse the SDKSetting.plist at some point.

cherry-picked from 952e17223d3a9 and fc728cfd569e291a5

PR target/119172

gcc/ChangeLog:

* config.in: Regenerate.
* config/darwin.h (DARWIN_PLATFORM_ID): Add the option to
use -macos_version_min where available.
* configure: Regenerate.
* configure.ac: Check for ld64 support of -macos_version_min.

Co-authored-by: Andrew Pinski <quic_apinski@quicinc.com>
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

fixincludes: adjust stdio fix for macOS 15 headers

fixincludes/ChangeLog:

* fixincl.x: Regenerate.
* inclhack.def (apple_local_stdio_fn_deprecation): Also apply to
_stdio.h.

(cherry picked from commit 1dc143181550573c9c902fb7a3b495e9b409d0b0)

libgcc, Darwin: Drop the legacy library build for macOS >= 10.12 [PR116809].

From macOSX15 SDK, the unwinder no longer exports some of the symbols used
in that library which (a) causes bootstrap fail and (b) means that the
legacy library is no longer useful.

No open branch of GCC emits references to this library - and any already
-built code that depends on the symbols would need rework anyway.

We have been asked to extend this back to the earliest OS vesion supported
by the SDK (10.12).

PR target/116809

libgcc/ChangeLog:

* config.host: Build legacy libgcc_s.1 on hosts before macOS 10.12.
* config/i386/t-darwin: Remove reference to legacy libgcc_s.1
* config/rs6000/t-darwin: Likewise.
* config/t-darwin-libgccs1: New file.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
(cherry picked from commit d9cafa0c4f0a81304d9b95a78ccc8e9003c6d7a3)

s390x: Fix vec_xl/vec_xst type aliasing [PR114676]

The requirements of the vec_xl/vec_xst intrinsincs wrt aliasing of the
pointer argument are not really documented. As it turns out, users
are likely to get it wrong. With this patch we let the pointer
argument alias everything in order to make it more robust for users.

gcc/ChangeLog:

PR target/114676
* config/s390/s390-c.cc (s390_expand_overloaded_builtin): Use a
MEM_REF with an addend of type ptr_type_node.

gcc/testsuite/ChangeLog:

PR target/114676
* gcc.target/s390/zvector/pr114676.c: New test.

Suggested-by: Jakub Jelinek <jakub@redhat.com>
(cherry picked from commit 42189f21b22c43ac8ab46edf5f6a7b4d99bc86a5)

Daily bump.

libstdc++: Fix backported test [PR112490]

On the 13 branch and older, C++ >= 20 tests need an explicit dg-options
directive specifying the -std flag, otherwise they won't run by default.

PR libstdc++/112490

libstdc++-v3/ChangeLog:

* testsuite/24_iterators/const_iterator/112490.cc: Add
dg-options directive.

libstdc++: Fix complexity of drop_view::begin() const [PR112641]

Views are required to have a amortized O(1) begin(), but our drop_view's
const begin overload is O(n) for non-common ranges with a non-sized
sentinel. This patch reimplements it so that it's O(1) always. See
also LWG 4009.

PR libstdc++/112641

libstdc++-v3/ChangeLog:

* include/std/ranges (drop_view::begin): Reimplement const
overload so that it's O(1) always.
* testsuite/std/ranges/adaptors/drop.cc (test10): New test.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
(cherry picked from commit 7f622ee83fbbcf4a4ca70e020db8a0ce4b556b61)

libstdc++: Fix constraint recursion in basic_const_iterator operator- [PR115046]

It was proposed in PR112490 to also adjust basic_const_iterator's friend
operator-(sent, iter) overload alongside the r15-7757-g4342c50ca84ae5
adjustments to its comparison operators, but we lacked a concrete
testcase demonstrating fixable constraint recursion there.  It turns out
Hewill Kang's PR115046 is such a testcase!  So this patch makes the same
adjustments to that overload as well, fixing PR115046.  The LWG 4218 P/R
will need to get adjusted too.

PR libstdc++/115046
PR libstdc++/112490

libstdc++-v3/ChangeLog:

* include/bits/stl_iterator.h (basic_const_iterator::operator-):
Replace non-dependent basic_const_iterator function parameter with
a dependent one of type basic_const_iterator<_It2> where _It2
matches _It.
* testsuite/std/ranges/adaptors/as_const/1.cc (test04): New test.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
(cherry picked from commit d69f73c0334486f3c66937388f02008736809e87)

libstdc++: Fix constraint recursion in basic_const_iterator relops [PR112490]

Here for

  using RCI = reverse_iterator<basic_const_iterator<vector<int>::iterator>>
  static_assert(std::totally_ordered<RCI>);

we effectively need to check the requirement

  requires (RCI x) { x RELOP x; }  for each RELOP in {<, >, <=, >=}

which we expect to be straightforwardly satisfied by reverse_iterator's
namespace-scope relops.  But due to ADL we find ourselves also
considering the basic_const_iterator relop friends, which before CWG
2369 would be quickly discarded since RCI clearly isn't convertible to
basic_const_iterator.  After CWG 2369 though we must first check these
relops' constraints (with _It = vector<int>::iterator and _It2 = RCI),
which entails checking totally_ordered<RCI> recursively.

This patch fixes this by turning the problematic non-dependent function
parameters of type basic_const_iterator<_It> into dependent ones of
type basic_const_iterator<_It3> where _It3 is constrained to match _It.
Thus the basic_const_iterator relop friends now get quickly discarded
during deduction and before the constraint check if the second operand
isn't a specialization of basic_const_iterator (or derived from one)
like before CWG 2369.

PR libstdc++/112490

libstdc++-v3/ChangeLog:

* include/bits/stl_iterator.h (basic_const_iterator::operator<):
Replace non-dependent basic_const_iterator function parameter with
a dependent one of type basic_const_iterator<_It3> where _It3
matches _It.
(basic_const_iterator::operator>): Likewise.
(basic_const_iterator::operator<=): Likewise.
(basic_const_iterator::operator>=): Likewise.
* testsuite/24_iterators/const_iterator/112490.cc: New test.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
(cherry picked from commit 4342c50ca84ae5448c0128c52120f4fe9005f203)

libstdc++: Fix ref_view branch of views::as_const [PR119135]

Unlike for span<X> and empty_view<X>, the range_reference_t of
ref_view<X> doesn't correspond to X. This patch fixes the ref_view
branch of views::as_const to correctly query its underlying range
type X.

PR libstdc++/119135

libstdc++-v3/ChangeLog:

* include/std/ranges: Include <utility>.
(views::__detail::__is_ref_view): Replace with ...
(views::__detail::__is_constable_ref_view): ... this.
(views::_AsConst::operator()): Replace bogus use of element_type
in the ref_view branch.
* testsuite/std/ranges/adaptors/as_const/1.cc (test03): Extend
test.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
(cherry picked from commit 50359c0a44381edb6dbd9359ef2ebdadbcc3ed42)

libstdc++: Implement P2540R1 change to views::cartesian_product()

This paper changes the identity element of views::cartesian_product to a
singleton range instead of an empty range. It was approved alongside
the main cartesian_product paper P2374R4, but unfortunately was overlooked
when implementing the main paper.

libstdc++-v3/ChangeLog:

* include/std/ranges (views::_CartesianProduct::operator()):
Adjust identity case as per P2540R1.
* testsuite/std/ranges/cartesian_product/1.cc (test01):
Adjust expected result of the identity case.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
(cherry picked from commit 98966f32f906303cd7d2e1c418f320c483e6dcbe)

libstdc++: Implement P2836R1 changes to const_iterator

libstdc++-v3/ChangeLog:

* include/bits/stl_iterator.h (const_iterator): Define conversion
operators as per P2836R1.
* include/std/ranges (__cpp_lib_ranges_as_const): Update value.
* include/std/version (__cpp_lib_ranges_as_const): Likewise.
* testsuite/24_iterators/const_iterator/1.cc (test04): New test.
* testsuite/std/ranges/adaptors/as_const/1.cc: Adjust expected
value of __cpp_lib_ranges_as_const.
* testsuite/std/ranges/version_c++23.cc: Likewise.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
(cherry picked from commit 731444b3c39e3dc3dd8778f430a38742861dcca1)

libstdc++: Implement LWG 3664 changes to ranges::distance

libstdc++-v3/ChangeLog:

* include/bits/ranges_base.h (__distance_fn::operator()):
Adjust iterator/sentinel overloads as per LWG 3664.
* testsuite/24_iterators/range_operations/distance.cc:
Test LWG 3664 example.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
(cherry picked from commit 7c0d1e9f2a2f1d41d9eb755c36c871d92638c4b7)

c++: using non-dep array var of unknown bound [PR115358]

For a non-dependent array variable of unknown bound, it seems we need to
try instantiating its definition upon use in a template context for sake
of proper checking and typing of the overall expression, like we do for
function specializations with deduced return type.

PR c++/115358

gcc/cp/ChangeLog:

* decl2.cc (mark_used): Call maybe_instantiate_decl for an array
variable with unknown bound.
* semantics.cc (finish_decltype_type): Remove now redundant
handling of array variables with unknown bound.
* typeck.cc (cxx_sizeof_expr): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/template/array37.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
(cherry picked from commit e3915c1ad56591cbd68229a64c941c38330abd69)

c++: template-id dependence wrt local static arg [PR117792]

Here we end up ICEing at instantiation time for the call to
f<local_static> ultimately because we wrongly consider the call to be
non-dependent, and so we specialize f ahead of time and then get
confused when fully substituting this specialization.

The call is dependent due to [temp.dep.temp]/3 and we miss that because
function template-id arguments aren't coerced until overload resolution,
and so the local static template argument lacks an implicit cast to
reference type that value_dependent_expression_p looks for before
considering dependence of the address. Other kinds of template-ids aren't
affected since they're coerced ahead of time.

So when considering dependence of a function template-id, we need to
conservatively consider dependence of the address of each argument (if
applicable).

PR c++/117792

gcc/cp/ChangeLog:

* pt.cc (type_dependent_expression_p): Consider the dependence
of the address of each template argument of a function
template-id.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/nontype7.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
(cherry picked from commit 40f0f6ab75a391906bed40cbdc098b0df3a91af7)

c++: memfn pointer as NTTP argument considered unused [PR119233]

This is just the member function pointer version of PR c++/105848,
in which our non-dependent call pruning may cause us to not mark an
otherwise unused function pointer template argument as used.

PR c++/119233

gcc/cp/ChangeLog:

* pt.cc (mark_template_arguments_used): Also handle member
function pointers.

gcc/testsuite/ChangeLog:

* g++.dg/template/fn-ptr5.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
(cherry picked from commit 51b1c0a2dde8ada0856c8a8cf2c1d26ac1657787)

c++: bad 'this' conversion for nullary memfn [PR106760]

Here we notice the 'this' conversion for the call f<void>() is bad, so
we correctly defer deduction for the template candidate, but we end up
never adding it to 'bad_cands' since missing_conversion_p for it returns
false (its only argument is 'this' which has already been determined to
be bad). This is not a huge deal, but it causes us to longer accept the
call with -fpermissive in release builds, and a tree check ICE in checking
builds.

So if we have a non-strictly viable template candidate that has not been
instantiated, then we need to add it to 'bad_cands' even if no argument
conversion is missing.

PR c++/106760

gcc/cp/ChangeLog:

* call.cc (add_candidates): Relax test for adding a candidate
to 'bad_cands' to also accept an uninstantiated template candidate
that has no missing conversions.

gcc/testsuite/ChangeLog:

* g++.dg/ext/conv3.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
(cherry picked from commit 50073ffae0a9b8feb9b36fdafdebd9885f6d7dc8)

c++: normalizing ttp constraints [PR115656]

Here we normalize the constraint same_as<T, bool> for the first
time during ttp coercion of B / UU, specifically constraint subsumption
checking.  During this normalization the set of in-scope template
parameters i.e. current_template_parms is empty, which we rely on
during normalization of the ttp constraints since we pass in_decl=NULL_TREE
to norm_info.  And this tricks the satisfaction cache into thinking that
the satisfaction value of same_as<T, bool> is independent of its template
parameters, and we incorrectly conflate the satisfaction value with
T = bool vs T = long and accept the specialization A<long, B>.

Since is_compatible_template_arg rewrites the ttp's constraints to
be in terms of the argument template's parameters, and since it's
the only caller of weakly_subsumes, the latter funcion can instead
pass in_decl=tmpl to avoid relying on current_template_parms.  This
patch implements this, and in turns renames weakly_subsumes to
ttp_subsumes to reflect that this predicate is now hardcoded for this
one caller.

PR c++/115656

gcc/cp/ChangeLog:

* constraint.cc (weakly_subsumes): Pass in_decl=tmpl to
get_normalized_constraints_from_info.  Rename to ...
(ttp_subsumes): ... this.
* cp-tree.h (weakly_subsumes): Rename to ...
(ttp_subsumes): ... this.
* pt.cc (is_compatible_template_arg): Adjust after renaming.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-ttp7.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
(cherry picked from commit 2861eb34e30973cb991a7964af7cfeae014a98b0)

c++: decltype(auto) deduction of statement-expression [PR116418]

r8-7538 for PR84968 made strip_typedefs_expr diagnose STATEMENT_LIST
so that we reject statement-expressions in noexcept-specifiers to
match our behavior in template arguments (which the parser diagnoses
directly).

Later r11-7452 made decltype(auto) deduction canonicalize the expression
(as an implementation detail) which in turn calls strip_typedefs_expr,
and so ever since we inadvertently reject decltype(auto) deduction of a
statement-expression.

This patch just removes the diagnostic in strip_typedefs_expr and instead
treats statement-expressions similar to lambda-expressions. The function
doesn't seem like the right place for such a diagnostic and so it seems
easier to just accept rather than try to reject them in a suitable place.

PR c++/116418

gcc/cp/ChangeLog:

* tree.cc (strip_typedefs_expr) <case STATEMENT_LIST>: Replace
this error path with ...
<case STMT_EXPR>: ... this, returning the original tree.

gcc/testsuite/ChangeLog:

* g++.dg/eh/pr84968.C: No longer expect an ahead of time diagnostic
for the statement-expresssion. Instantiate the template and expect
an incomplete type error instead.
* g++.dg/ext/stmtexpr26.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
(cherry picked from commit 12bdcc3d7970860b9d66ed4dea203bde8fd68d4d)

c++: ICE during requires-expr partial subst [PR118060]

Here during partial substitution of the requires-expression (as part of
CTAD constraint rewriting) we segfault from the INDIRECT_REF case of
convert_to_void due *f(u) being type-dependent. We should just defer
checking convert_to_void until satisfaction.

PR c++/118060

gcc/cp/ChangeLog:

* constraint.cc (tsubst_valid_expression_requirement): Don't
check convert_to_void during partial substitution.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-requires40.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
(cherry picked from commit ca79349c050c27ff466735ba78d2e2bbce56ffdc)

c++: c->B::m access resolved through current inst [PR116320]

Here when checking the access of (the injected-class-name) B in c->B::m
at parse time, we notice its context B (now the type) is a base of the
object type C<T>, so we proceed to use C<T> as the effective qualifying
type. But this C<T> is the dependent specialization not the primary
template type, so it has empty TYPE_BINFO, which leads to a segfault later
from perform_or_defer_access_check.

The reason the DERIVED_FROM_P (B, C<T>) test guarding this code path works
despite C<T> having empty TYPE_BINFO is because of its currently_open_class
logic (added in r9-713-gd9338471b91bbe) which replaces a dependent
specialization with the primary template type if we're inside it. So the
safest fix seems to be to call currently_open_class in the caller as well.

PR c++/116320

gcc/cp/ChangeLog:

* semantics.cc (check_accessibility_of_qualified_id): Try
currently_open_class when using the object type as the
effective qualifying type.

gcc/testsuite/ChangeLog:

* g++.dg/template/access42.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
(cherry picked from commit 484f139ccd3b631a777802e810a632678b42ffab)

AVR: target/120441 - Fix f7_exp for |x| ≥ 512.

f7_exp limited exponents to 512, but 1023 * ln2 ≈ 709,
hence 1024 is a correct limit.

libgcc/config/avr/libf7/
PR target/120441
* libf7.c (f7_exp): Limit aa->expo to 10 (not to 9).

(cherry picked from commit 672569cee76a1927d14b5eb754a5ff0b9cee1bc8)

Daily bump.

c++: Fix explicit instantiation of const variable templates after earlier implicit instantation [PR113976]

Already previously instantiated const variable templates had
cp_apply_type_quals_to_decl called when they were instantiated,
but if they need runtime initialization, their TREE_READONLY flag
has been subsequently cleared.
Explicit variable template instantiation calls grokdeclarator which
calls cp_apply_type_quals_to_decl on them again, setting TREE_READONLY
flag again, but nothing clears it afterwards, so we emit such
instantiations into rodata sections and segfault when the dynamic
initialization attempts to initialize them.

The following patch fixes that by not calling cp_apply_type_quals_to_decl
on already instantiated variable declarations.

2024-02-28 Jakub Jelinek <jakub@redhat.com>
Patrick Palka <ppalka@redhat.com>

PR c++/113976
* decl.cc (grokdeclarator): Don't call cp_apply_type_quals_to_decl
on DECL_TEMPLATE_INSTANTIATED VAR_DECLs.

* g++.dg/cpp1y/var-templ87.C: New test.

(cherry picked from commit 29ac92436aa5c702e9e02c206e7590ebd806398e)

i386: Change RTL representation of bt[lq] [PR118623]

The following testcase is miscompiled because of RTL represententation
of bt{l,q} insn followed by e.g. j{c,nc} being misleading to what it
actually does.
Let's look e.g. at
(define_insn_and_split "*jcc_bt<mode>"
  [(set (pc)
        (if_then_else (match_operator 0 "bt_comparison_operator"
                        [(zero_extract:SWI48
                           (match_operand:SWI48 1 "nonimmediate_operand")
                           (const_int 1)
                           (match_operand:QI 2 "nonmemory_operand"))
                         (const_int 0)])
                      (label_ref (match_operand 3))
                      (pc)))
   (clobber (reg:CC FLAGS_REG))]
  "(TARGET_USE_BT || optimize_function_for_size_p (cfun))
   && (CONST_INT_P (operands[2])
       ? (INTVAL (operands[2]) < GET_MODE_BITSIZE (<MODE>mode)
          && INTVAL (operands[2])
               >= (optimize_function_for_size_p (cfun) ? 8 : 32))
       : !memory_operand (operands[1], <MODE>mode))
   && ix86_pre_reload_split ()"
  "#"
  "&& 1"
  [(set (reg:CCC FLAGS_REG)
        (compare:CCC
          (zero_extract:SWI48
            (match_dup 1)
            (const_int 1)
            (match_dup 2))
          (const_int 0)))
   (set (pc)
        (if_then_else (match_op_dup 0 [(reg:CCC FLAGS_REG) (const_int 0)])
                      (label_ref (match_dup 3))
                      (pc)))]
{
  operands[0] = shallow_copy_rtx (operands[0]);
  PUT_CODE (operands[0], reverse_condition (GET_CODE (operands[0])));
})
The define_insn part in RTL describes exactly what it does,
jumps to op3 if bit op2 in op1 is set (for op0 NE) or not set (for op0 EQ).
The problem is with what it splits into.
put_condition_code %C1 for CCCmode comparisons emits c for EQ and LTU,
nc for NE and GEU and ICEs otherwise.
CCCmode is used mainly for carry out of add/adc, borrow out of sub/sbb,
in those cases e.g. for add we have
(set (reg:CCC flags) (compare:CCC (plus:M x y) x))
and use (ltu (reg:CCC flags) (const_int 0)) for carry set and
(geu (reg:CCC flags) (const_int 0)) for carry not set.  These cases
model in RTL what is actually happening, compare in infinite precision
x from the result of finite precision addition in M mode and if it is
less than unsigned (i.e. overflow happened), carry is set.
Another use of CCCmode is in UNSPEC_* patterns, those are used with
(eq (reg:CCC flags) (const_int 0)) for carry set and ne for unset,
given the UNSPEC no big deal, the middle-end doesn't know what means
set or unset.
But for the bt{l,q}; j{c,nc} case the above splits it into
(set (reg:CCC flags) (compare:CCC (zero_extract) (const_int 0)))
for bt and
(set (pc) (if_then_else (eq (reg:CCC flags) (const_int 0)) (label_ref) (pc)))
for the bit set case (so that the jump expands to jc) and ne for
the bit not set case (so that the jump expands to jnc).
Similarly for the different splitters for cmov and set{c,nc} etc.
The problem is that when the middle-end reads this RTL, it feels
the exact opposite to it.  If zero_extract is 1, flags is set
to comparison of 1 and 0 and that would mean using ne ne in the
if_then_else, and vice versa.

So, in order to better describe in RTL what is actually happening,
one possibility would be to swap the behavior of put_condition_code
and use NE + LTU -> c and EQ + GEU -> nc rather than the current
EQ + LTU -> c and NE + GEU -> nc; and adjust everything.  The
following patch uses a more limited approach, instead of representing
bt{l,q}; j{c,nc} case as written above it uses
(set (reg:CCC flags) (compare:CCC (const_int 0) (zero_extract)))
and
(set (pc) (if_then_else (ltu (reg:CCC flags) (const_int 0)) (label_ref) (pc)))
which uses the existing put_condition_code but describes what the
insns actually do in RTL clearly.  If zero_extract is 1,
then flags are LTU, 0U < 1U, if zero_extract is 0, then flags are GEU,
0U >= 0U.  The patch adjusts the *bt<mode> define_insn and all the
splitters to it and its comparisons/conditional moves/setXX.

2025-02-10  Jakub Jelinek  <jakub@redhat.com>

PR target/118623
* config/i386/i386.md (*bt<mode>): Represent bt as
compare:CCC of const0_rtx and zero_extract rather than
zero_extract and const0_rtx.
(*jcc_bt<mode>): Likewise.  Use LTU and GEU as flags test
instead of EQ and NE.
(*jcc_bt<mode>_1): Likewise.
(*jcc_bt<mode>_mask): Likewise.
(*jcc_bt<mode>_mask_1): Likewise.
(Help combine recognize bt followed by cmov splitter): Likewise.
(*bt<mode>_setcqi): Likewise.
(*bt<mode>_setncqi): Likewise.
(*bt<mode>_setnc<mode>): Likewise.

* gcc.c-torture/execute/pr118623.c: New test.

(cherry picked from commit 92142019b6cd0cf1fe483203cf3ec451a9848a42)

alias: Perform offset arithmetics in poly_offset_int rather than poly_int64 [PR118819]

This PR is about ubsan error on the c - cx1 + cy1 evaluation in the first
hunk.

The following patch hopefully fixes that by doing the additions/subtractions
in poly_offset_int rather than poly_int64 and then converting back to poly_int64.
If it doesn't fit, -1 is returned (which means it is unknown if there is a conflict
or not).

2025-02-27 Jakub Jelinek <jakub@redhat.com>

PR middle-end/118819
* alias.cc (memrefs_conflict_p): Perform arithmetics on c, xsize and
ysize in poly_offset_int and return -1 if it is not representable in
poly_int64.

(cherry picked from commit b570f48c3dfb9ca3d640467cff67e569904009d4)

cselib: Fix up previous patch for SPARC [PR117239]

Sorry, our CI bot just notified me I broke SPARC build. There are two
#ifdef STACK_ADDRESS_OFFSET
guarded snippets and the macro is only defined on SPARC target, so I didn't
notice there was a syntax error.

Fixed thusly.

2025-02-05 Jakub Jelinek <jakub@redhat.com>

PR rtl-optimization/117239
* cselib.cc (cselib_init): Remove spurious closing paren in
the #ifdef STACK_ADDRESS_OFFSET specific code.

(cherry picked from commit 6094801d6fd7849d2d95ce78f7c6ef01686b9f63)

cselib: For CALL_INSNs to const/pure fns invalidate memory below sp [PR117239]

The following testcase is miscompiled on x86_64 during postreload.
After reload (with IPA-RA figuring out the calls don't modify any
registers but %rax for return value) postreload sees
(insn 14 12 15 2 (set (mem:DI (plus:DI (reg/f:DI 7 sp)
                (const_int 16 [0x10])) [0  S8 A64])
        (reg:DI 1 dx [orig:105 q+16 ] [105])) "pr117239.c":18:7 95 {*movdi_internal}
     (nil))
(call_insn/i 15 14 16 2 (set (reg:SI 0 ax)
        (call (mem:QI (symbol_ref:DI ("baz") [flags 0x3]  <function_decl 0x7ffb2e2bdf00 r>) [0 baz S1 A8])
            (const_int 24 [0x18]))) "pr117239.c":18:7 1476 {*call_value}
     (expr_list:REG_CALL_DECL (symbol_ref:DI ("baz") [flags 0x3]  <function_decl 0x7ffb2e2bdf00 baz>)
        (expr_list:REG_EH_REGION (const_int 0 [0])
            (nil)))
    (nil))
(insn 16 15 18 2 (parallel [
            (set (reg/f:DI 7 sp)
                (plus:DI (reg/f:DI 7 sp)
                    (const_int 24 [0x18])))
            (clobber (reg:CC 17 flags))
        ]) "pr117239.c":18:7 285 {*adddi_1}
     (expr_list:REG_ARGS_SIZE (const_int 0 [0])
        (nil)))
...
(call_insn/i 19 18 21 2 (set (reg:SI 0 ax)
        (call (mem:QI (symbol_ref:DI ("foo") [flags 0x3]  <function_decl 0x7ffb2e2bdb00 l>) [0 foo S1 A8])
            (const_int 0 [0]))) "pr117239.c":19:3 1476 {*call_value}
     (expr_list:REG_CALL_DECL (symbol_ref:DI ("foo") [flags 0x3]  <function_decl 0x7ffb2e2bdb00 foo>)
        (expr_list:REG_EH_REGION (const_int 0 [0])
            (nil)))
    (nil))
(insn 21 19 26 2 (parallel [
            (set (reg/f:DI 7 sp)
                (plus:DI (reg/f:DI 7 sp)
                    (const_int -24 [0xffffffffffffffe8])))
            (clobber (reg:CC 17 flags))
        ]) "pr117239.c":19:3 discrim 1 285 {*adddi_1}
     (expr_list:REG_ARGS_SIZE (const_int 24 [0x18])
        (nil)))
(insn 26 21 24 2 (set (mem:DI (plus:DI (reg/f:DI 7 sp)
                (const_int 16 [0x10])) [0  S8 A64])
        (reg:DI 1 dx [orig:105 q+16 ] [105])) "pr117239.c":19:3 discrim 1 95 {*movdi_internal}
     (nil))
i.e.
        movq    %rdx, 16(%rsp)
        call    baz
        addq    $24, %rsp
...
        call    foo
        subq    $24, %rsp
        movq    %rdx, 16(%rsp)
Now, postreload uses cselib and cselib remembered that %rdx value has been
stored into 16(%rsp).  Both baz and foo are pure calls.  If they weren't,
when processing those CALL_INSNs cselib would invalidate all MEMs
      if (RTL_LOOPING_CONST_OR_PURE_CALL_P (insn)
          || !(RTL_CONST_OR_PURE_CALL_P (insn)))
        cselib_invalidate_mem (callmem);
where callmem is (mem:BLK (scratch)).  But they are pure, so instead the
code just invalidates the argument slots from CALL_INSN_FUNCTION_USAGE.
The calls actually clobber more than that, even const/pure calls clobber
all memory below the stack pointer.  And that is something that hasn't been
invalidated.  In this failing testcase, the call to baz is not a big deal,
we don't have anything remembered in memory below %rsp at that call.
But then we increment %rsp by 24, so the %rsp+16 is now 8 bytes below stack
and do the call to foo.  And that call now actually, not just in theory,
clobbers the memory below the stack pointer (in particular overwrites it
with the return value).  But cselib does not invalidate.  Then %rsp
is decremented again (in preparation for another call, to bar) and cselib
is processing store of %rdx (which IPA-RA says has not been modified by
either baz or foo calls) to %rsp + 16, and it sees the memory already has
that value, so the store is useless, let's remove it.
But it is not, the call to foo has changed it, so it needs to be stored
again.

The following patch adds targetted invalidation of memory below stack
pointer (or on SPARC memory below stack pointer + 2047 when stack bias is
used, or on PA memory above stack pointer instead).
It does so only in !ACCUMULATE_OUTGOING_ARGS or cfun->calls_alloca functions,
because in other functions the stack pointer should be constant from
the end of prologue till start of epilogue and so nothing should be stored
within the function below the stack pointer.

Now, memory below stack pointer is special, except for functions using
alloca/VLAs I believe no addressable memory should be there, it should be
purely outgoing function argument area, if we take address of some automatic
variable, it should live all the time above the outgoing function argument
area.  So on top of just trying to flush memory below stack pointer
(represented by %rsp - PTRDIFF_MAX with PTRDIFF_MAX size on most arches),
the patch tries to optimize and only invalidate memory that has address
clearly derived from stack pointer (memory with other bases is not
invalidated) and if we can prove (we see same SP_DERIVED_VALUE_P bases in
both VALUEs) it is above current stack, also don't call
canon_anti_dependence which might just give up in certain cases.

I've gathered statistics from x86_64-linux and i686-linux
bootstraps/regtests.  During -m64 compilations from those, there were
3718396 + 42634 + 27761 cases of processing MEMs in cselib_invalidate_mem
(callmem[1]) calls, the first number is number of MEMs not invalidated
because of the optimization, i.e.
+             if (sp_derived_base == NULL_RTX)
+               {
+                 has_mem = true;
+                 num_mems++;
+                 p = &(*p)->next;
+                 continue;
+               }
in the patch, the second number is number of MEMs not invalidated because
canon_anti_dependence returned false and finally the last number is number
of MEMs actually invalidated (so that is what hasn't been invalidated
before).  During -m32 compilations the numbers were
1422412 + 39354 + 16509 with the same meaning.

Note, when there is no red zone, in theory even the sp = sp + incr
instruction invalidates memory below the new stack pointer, as signal
can come and overwrite the memory.  So maybe we should be invalidating
something at those instructions as well.  But in leaf functions we certainly
can have even addressable automatic vars in the red zone (which would make
it harder to distinguish), on the other side aren't normally storing
anything below the red zone, and in non-leaf it should normally be just the
outgoing arguments area.

2025-02-05  Jakub Jelinek  <jakub@redhat.com>

PR rtl-optimization/117239
* cselib.cc: Include predict.h.
(callmem): Change type from rtx to rtx[2].
(cselib_preserve_only_values): Use callmem[0] rather than callmem.
(cselib_invalidate_mem): Optimize and don't try to invalidate
for the mem_rtx == callmem[1] case MEMs which clearly can't be
below the stack pointer.
(cselib_process_insn): Use callmem[0] rather than callmem.
For const/pure calls also call cselib_invalidate_mem (callmem[1])
in !ACCUMULATE_OUTGOING_ARGS or cfun->calls_alloca functions.
(cselib_init): Initialize callmem[0] rather than callmem and also
initialize callmem[1].

* gcc.dg/pr117239.c: New test.

(cherry picked from commit 886ce970eb096bb302228c891f0c8a889c79ad40)

gimple-fold: Avoid ICEs with bogus declarations like const attribute no snprintf [PR117358]

When one puts incorrect const or pure attributes on declarations of various
C APIs which have corresponding builtins (vs. what they actually do), we can
get tons of ICEs in gimple-fold.cc.

The following patch fixes it by giving up gimple_fold_builtin_* folding
if the functions don't have gimple_vdef (or for pure functions like
bcmp/strchr/strstr gimple_vuse) when in SSA form (during gimplification
they will surely have both of those NULL even when declared correctly,
yet it is highly desirable to fold them).

Or shall I replace
!gimple_vdef (stmt) && gimple_in_ssa_p (cfun)
tests with
(gimple_call_flags (stmt) & (ECF_CONST | ECF_PURE | ECF_NOVOPS)) != 0
and
!gimple_vuse (stmt) && gimple_in_ssa_p (cfun)
with
(gimple_call_flags (stmt) & (ECF_CONST | ECF_NOVOPS)) != 0
?

2024-11-28 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/117358
* gimple-fold.cc (gimple_fold_builtin_memory_op): Punt if stmt has no
vdef in ssa form.
(gimple_fold_builtin_bcmp): Punt if stmt has no vuse in ssa form.
(gimple_fold_builtin_bcopy): Punt if stmt has no vdef in ssa form.
(gimple_fold_builtin_bzero): Likewise.
(gimple_fold_builtin_memset): Likewise. Use return false instead of
return NULL_TREE.
(gimple_fold_builtin_strcpy): Punt if stmt has no vdef in ssa form.
(gimple_fold_builtin_strncpy): Likewise.
(gimple_fold_builtin_strchr): Punt if stmt has no vuse in ssa form.
(gimple_fold_builtin_strstr): Likewise.
(gimple_fold_builtin_strcat): Punt if stmt has no vdef in ssa form.
(gimple_fold_builtin_strcat_chk): Likewise.
(gimple_fold_builtin_strncat): Likewise.
(gimple_fold_builtin_strncat_chk): Likewise.
(gimple_fold_builtin_string_compare): Likewise.
(gimple_fold_builtin_fputs): Likewise.
(gimple_fold_builtin_memory_chk): Likewise.
(gimple_fold_builtin_stxcpy_chk): Likewise.
(gimple_fold_builtin_stxncpy_chk): Likewise.
(gimple_fold_builtin_stpcpy): Likewise.
(gimple_fold_builtin_snprintf_chk): Likewise.
(gimple_fold_builtin_sprintf_chk): Likewise.
(gimple_fold_builtin_sprintf): Likewise.
(gimple_fold_builtin_snprintf): Likewise.
(gimple_fold_builtin_fprintf): Likewise.
(gimple_fold_builtin_printf): Likewise.
(gimple_fold_builtin_realloc): Likewise.

* gcc.c-torture/compile/pr117358.c: New test.

(cherry picked from commit 29032dfa57629d1713a97b17a785273823993a91)

doloop: Fix up doloop df use [PR116799]

The following testcases are miscompiled on s390x-linux, because the
doloop_optimize
  /* Ensure that the new sequence doesn't clobber a register that
     is live at the end of the block.  */
  {
    bitmap modified = BITMAP_ALLOC (NULL);

    for (rtx_insn *i = doloop_seq; i != NULL; i = NEXT_INSN (i))
      note_stores (i, record_reg_sets, modified);

    basic_block loop_end = desc->out_edge->src;
    bool fail = bitmap_intersect_p (df_get_live_out (loop_end), modified);
check doesn't work as intended.
The problem is that it uses df, but the df analysis was only done using
  iv_analysis_loop_init (loop);
->
  df_analyze_loop (loop);
which computes df inside on the bbs of the loop.
While loop_end bb is inside of the loop, df_get_live_out computed that
way includes registers set in the loop and used at the start of the next
iteration, but doesn't include registers set in the loop (or before the
loop) and used after the loop.

The following patch fixes that by doing whole function df_analyze first,
changes the loop iteration mode from 0 to LI_ONLY_INNERMOST (on many
targets which use can_use_doloop_if_innermost target hook a so are known
to only handle innermost loops) or LI_FROM_INNERMOST (I think only bfin
actually allows non-innermost loops) and checking not just
df_get_live_out (loop_end) (that is needed for something used by the
next iteration), but also df_get_live_in (desc->out_edge->dest),
i.e. what will be used after the loop.  df of such a bb shouldn't
be affected by the df_analyze_loop and so should be from df_analyze
of the whole function.

2024-12-05  Jakub Jelinek  <jakub@redhat.com>

PR rtl-optimization/113994
PR rtl-optimization/116799
* loop-doloop.cc: Include targhooks.h.
(doloop_optimize): Also punt on intersection of modified
with df_get_live_in (desc->out_edge->dest).
(doloop_optimize_loops): Call df_analyze.  Use
LI_ONLY_INNERMOST or LI_FROM_INNERMOST instead of 0 as
second loops_list argument.

* gcc.c-torture/execute/pr116799.c: New test.
* g++.dg/torture/pr113994.C: New test.

(cherry picked from commit 0eed81612ad6eac2bec60286348a103d4dc02a5a)

c++: Disable deprecated/unavailable diagnostics when creating thunks for methods with such attributes [PR116636]

On the following testcase, we emit false positive warnings/errors about using
the deprecated or unavailable methods when creating thunks for them, even
when nothing (in the testcase so far) actually used those.

The following patch temporarily disables that diagnostics when creating
the thunks.

2024-09-12 Jakub Jelinek <jakub@redhat.com>

PR c++/116636
* method.cc: Include decl.h.
(use_thunk): Temporarily change deprecated_state to
UNAVAILABLE_DEPRECATED_SUPPRESS.

* g++.dg/warn/deprecated-19.C: New test.

(cherry picked from commit 4026d89d623e322920b052f7ac0d940ef267dc0f)

c++: vtable referring to "unavailable" virtual fn [PR116606]

mark_vtable_entries already has

   /* It's OK for the vtable to refer to deprecated virtual functions.  */
   warning_sentinel w(warn_deprecated_decl);

but that doesn't cover __attribute__((unavailable)).  We can use the
following override to cover both.

PR c++/116606

gcc/cp/ChangeLog:

* decl2.cc (mark_vtable_entries): Temporarily override deprecated_state to
UNAVAILABLE_DEPRECATED_SUPPRESS.  Remove a warning_sentinel.

gcc/testsuite/ChangeLog:

* g++.dg/ext/attr-unavailable-13.C: New test.

(cherry picked from commit d9d34f9a91371dea4bab0b54b2d7f762a6cc23e0)

asan: Don't fold some strlens with -fsanitize=address [PR110676]

The UB on the following testcase isn't diagnosed by -fsanitize=address,
because we see that the array has a single element and optimize the
strlen to 0. I think it is fine to assume e.g. for range purposes the
lower bound for the strlen as long as we don't try to optimize
strlen (str)
where we know that it returns [26, 42] to
26 + strlen (str + 26), but for the upper bound we really want to punt
on optimizing that for -fsanitize=address to read all the bytes of the
string and diagnose if we run to object end etc.

2024-02-06 Jakub Jelinek <jakub@redhat.com>

PR sanitizer/110676
* gimple-fold.cc (gimple_fold_builtin_strlen): For -fsanitize=address
reset maxlen to sizetype maximum.

* gcc.dg/asan/pr110676.c: New test.

(cherry picked from commit d3eac7d96de790df51859f63c13838f153b416de)

s390: Fix tf_to_fprx2

Insn tf_to_fprx2 moves a TF value into a floating-point register pair.
For alternative 0, the input is a vector register, however, in the else
case instruction ldr is emitted which expects floating-point register
operands only.  Thus, this works only for vector registers which overlap
with floating-point registers.  Replace ldr with vlr so that the
remaining vector registers are dealt with, too.  Emitting a vlr instead
of a ldr is fine since the destination register %v0 is part of a
floating-point register pair which means that the low half of %v0 is
ignored in the end anyway and therefore may be clobbered.

gcc/ChangeLog:

* config/s390/vector.md: Fix tf_to_fprx2 by using vlr instead of
ldr.

(cherry picked from commit 8519b8ba9dd9567a5f90966351c1e758dbf511a4)

c++: format attribute redeclaration [PR116954]

Here when merging the two decls, remove_contract_attributes loses
ATTR_IS_DEPENDENT on the format attribute, so apply_late_template_attributes
just returns, so the attribute doesn't get propagated to the type where the
warning looks for it.

Fixed by using copy_node instead of tree_cons to preserve flags.

PR c++/116954

gcc/cp/ChangeLog:

* contracts.cc (remove_contract_attributes): Preserve flags
on the attribute list.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wformat-3.C: New test.

(cherry picked from commit b0d7d644f3c25af9bf60c948ab26aa7b09a68787)

PR tree-optimization/113673: Avoid load merging when potentially trapping.

This patch fixes PR tree-optimization/113673, a P2 ice-on-valid regression
caused by load merging of (ptr[0]<<8)+ptr[1] when -ftrapv has been
specified.  When the operator is | or ^ this is safe, but for addition
of signed integer types, a trap may be generated/required, so merging this
idiom into a single non-trapping instruction is inappropriate, confusing
the compiler by transforming a basic block with an exception edge into one
without.

This revision implements Richard Biener's feedback to add an early check
for stmt_can_throw_internal (cfun, stmt) to prevent transforming in the
presence of any statement that could trap, not just overflow on addition.
The one other tweak included in this patch is to mark the local function
find_bswap_or_nop_load as static ensuring that it isn't called from outside
this file, and guaranteeing that it is dominated by stmt_can_throw_internal
checking.

2024-06-24  Roger Sayle  <roger@nextmovesoftware.com>
    Richard Biener  <rguenther@suse.de>

gcc/ChangeLog
PR tree-optimization/113673
* gimple-ssa-store-merging.cc (find_bswap_or_nop_load): Make static.
(find_bswap_or_nop_1): Avoid transformations (load merging) when
stmt_can_throw_internal indicates that a statement can trap.

gcc/testsuite/ChangeLog
PR tree-optimization/113673
* g++.dg/pr113673.C: New test case.

(cherry picked from commit d8b05aef77443e1d3d8f3f5d2c56ac49a503fee3)

Fix ICE with -fdump-tree-moref

gcc/ChangeLog:

PR ipa/116055
* ipa-modref.cc (analyze_function): Do not ICE when flags regress.

(cherry picked from commit 98baaa17561ca299eefc98f469f4326e551604c9)

tree-optimization/119534 - reject bogus emulated vectorized gather

The following makes sure to reject the attempts to emulate a vector
gather when the discovered index vector type is a vector mask.

PR tree-optimization/119534
* tree-vect-stmts.cc (get_load_store_type): Reject
VECTOR_BOOLEAN_TYPE_P offset vector type for emulated gathers.

* gcc.dg/vect/pr119534.c: New testcase.

(cherry picked from commit d0cc14c62ad7403afcab3c2e38851d3ab179352f)

tree-optimization/98845 - ICE with tail-merging and DCE/DSE disabled

The following shows that tail-merging will make dead SSA defs live
in paths where it wasn't before, possibly introducing UB or as
in this case, uses of abnormals that eventually fail coalescing
later. The fix is to register such defs for stmt comparison.

PR tree-optimization/98845
* tree-ssa-tail-merge.cc (stmt_local_def): Consider a
def with no uses not local.

* gcc.dg/pr98845.c: New testcase.
* gcc.dg/pr81192.c: Adjust.

(cherry picked from commit 6b8a8c9fd68c5dabaec5ddbc25efeade44f37a14)

middle-end/119119 - re-gimplification of empty CTOR assignments

The following testcase runs into a re-gimplification issue during
inlining when processing

  MEM[(struct e *)this_2(D)].a = {};

where re-gimplification does not handle assignments in the same
way than the gimplifier but instead relies on rhs_predicate_for
and gimplifying the RHS standalone.  This fails to handle
special-casing of CTORs.  The is_gimple_mem_rhs_or_call predicate
already handles clobbers but not empty CTORs so we end up in
the fallback code trying to force the CTOR into a separate stmt
using a temporary - but as we have a non-copyable type here that ICEs.

The following generalizes empty CTORs in is_gimple_mem_rhs_or_call
since those need no additional re-gimplification.

PR middle-end/119119
* gimplify.cc (is_gimple_mem_rhs_or_call): All empty CTORs
are OK when not a register type.

* g++.dg/torture/pr11911.C: New testcase.

(cherry picked from commit 3bd61c1dfaa2d7153eb4be82f423533ea937d0f9)

tree-optimization/119057 - bogus double reduction detection

We are detecting a cycle as double reduction where the inner loop
cycle has extra out-of-loop uses.  This clashes at least with
assumptions from the SLP discovery code which says the cycle
isn't reachable from another SLP instance.  It also was not intended
to support this case, in fact with GCC 14 we seem to generate wrong
code here.

PR tree-optimization/119057
* tree-vect-loop.cc (check_reduction_path): Add argument
specifying whether we're analyzing the inner loop of a
double reduction.  Do not allow extra uses outside of the
double reduction cycle in this case.
(vect_is_simple_reduction): Adjust.

* gcc.dg/vect/pr119057.c: New testcase.

(cherry picked from commit 758de6263dfc7ba8701965fa468691ac23cb7eb5)

tree-optimization/117979 - failed irreducible loop update from DCE

When CD-DCE creates forwarders to reduce false control dependences
it fails to update the irreducible state of edge and the forwarder
block in case the fowarder groups both normal (entry) and edges
from an irreducible region (necessarily backedges). This is because
when we split the first edge, if that's a normal edge, the forwarder
and its edge to the original block will not be marked as part
of the irreducible region but when we then redirect an edge from
within the region it becomes so.

The following fixes this up.

Note I think creating a forwarder that includes backedges is
likely not going to help, but at this stage I don't want to change
the CFG going into DCE. For regular loops we'll have a single
entry and a single backedge by means of loop init and will never
create a forwarder - so this is solely happening for irreducible
regions where it's harder to prove that such forwarder doesn't help.

PR tree-optimization/117979
* tree-ssa-dce.cc (make_forwarders_with_degenerate_phis):
Properly update the irreducible region state.

* gcc.dg/torture/pr117979.c: New testcase.

(cherry picked from commit eca04660a2c9546c8ecefc5288395eb8a9fdc168)

tree-optimization/119778 - properly mark abnormal edge sources during inlining

When inlining a call that abnormally transfers control-flow we make
all inlined calls that can possibly transfer abnormal control-flow
do so as well. But we failed to mark the calls as altering
control-flow. This results in inconsistent behavior later and
possibly wrong-code (we'd eventually prune those edges).

PR tree-optimization/119778
* tree-inline.cc (copy_edges_for_bb): Mark calls that are
source of abnormal edges as altering control-flow.

* g++.dg/torture/pr119778.C: New testcase.

(cherry picked from commit a48f934211434cac1be951c207ee76e4b4340fac)

[14] Use --param=aarch64-autovec-preference=2 instead of =sve-only

This updates the backported test.

PR tree-optimization/119706
* g++.target/aarch64/sve/pr119706.C: Adjust --param
aarch64-autovec-preference.

(cherry picked from commit d7b5fe3400cf15150ae18d522859cafcdd6a9315)

aarch64: Add test case.

This patch adds a test case to the testsuite for PR119706.
The bug was already fixed by
https://gcc.gnu.org/pipermail/gcc-patches/2025-April/680573.html.

OK for mainline?

Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com>
gcc/testsuite/
PR tree-optimization/119706
* g++.target/aarch64/sve/pr119706.C: New test.

(cherry picked from commit f6e6e6d9ba1d71fdd02a2c570d60217db6c5a31b)

middle-end/119706 - allow POLY_INT_CST as is_gimple_mem_ref_addr

We currently only INTEGER_CST, but not POLY_INT_CST, which leads
to the situation that when the POLY_INT_CST is only indrectly
present via a SSA def the IL is valid but when propagated it's not.
That's unsustainable.

PR middle-end/119706
* gimple-expr.cc (is_gimple_mem_ref_addr): Also allow
POLY_INT_CST.

(cherry picked from commit bf812c6ad83ec0b241bb3fecc7e68f883b6083df)

dwarf2out: Propagate dtprel into the .debug_addr table in resolve_addr_in_expr

For a debugger to display statically-allocated[0] TLS variables the compiler
must communicate information[1] that can be used in conjunction with knowledge
of the runtime enviroment[2] to calculate a location for the variable for
each thread. That need gives rise to dw_loc_dtprel in dwarf2out, a flag tracking
whether the location description is dtprel, or relative to the
"dynamic thread pointer". Location descriptions in the .debug_info section for
TLS variables need to be relocated by the static linker accordingly, and
dw_loc_dtprel controls emission of the needed relocations.

This is further complicated by -gsplit-dwarf. -gsplit-dwarf is designed to allow
as much debugging information as possible to bypass the static linker to improve
linking performance. One of the ways that is done is by introducing a layer of
indirection for relocatable values[3]. That gives rise to addr_index_table which
ultimately results in the .debug_addr section.

While the code handling addr_index_table clearly contemplates the existence of
dtprel entries[4] resolve_addr_in_expr does not, and the result is that when
using -gsplit-dwarf the DWARF for TLS variables contains an address[5] rather
than an offset, and debuggers can't work with that.

This is visible on a trivial example. Compile

```
static __thread int tls_var;

int main(void) {
  tls_var = 42;
  return 0;
}
```

with -g and -g -gsplit-dwarf. Run the program under gdb. When examining the
value of tls_var before and after the assignment, -g behaves as one would
expect but -g -gsplit-dwarf does not. If the user is lucky and the miscalculated
address is not mapped, gdb will print "Cannot access memory at address ...".
If the user is unlucky and the miscalculated address is mapped, gdb will simply
give the wrong value. You can further confirm that the issue is the address
calculation by asking gdb for the address of tls_var and comparing that to what
one would expect.[6]

Thankfully this is trivial to fix by modifying resolve_addr_in_expr to propagate
the dtprel character of the location where necessary. gdb begins working as
expected and the diff in the generated assembly is clear.

```
        .section        .debug_addr,"",@progbits
        .long   0x14
        .value  0x5
        .byte   0x8
        .byte   0
.Ldebug_addr0:
-       .quad   tls_var
+       .long   tls_var@dtpoff, 0
        .quad   .LFB0
```

[0] Referring to e.g. __thread as statically-allocated vs. e.g. a
    dynamically-allocated pthread_key_create() call.
[1] Generally an offset in a TLS block.
[2] With glibc, provided by libthread_db.so.
[3] Relocatable values are moved to a table in the .debug_addr section, those
    values in .debug_info are replaced with special values that look up indexes
    in that table, and then the static linker elsewhere assigns a single per-CU
    starting index in the .debug_addr section, allowing those special values to
    remain permanently fixed and the resulting data to be ignored by the linker.
[4] ate_kind_rtx_dtprel exists, after all, and new_addr_loc_descr does produce
    it where appropriate.
[5] e.g. an address in the .tbss/.tdata section.
[6] e.g. on x86-64 by examining %fsbase and the offset in the assembly

2025-05-01  Kyle Huey  <me@kylehuey.com>

* dwarf2out.cc (resolve_addr_in_expr): Propagate dtprel into the address
table when appropriate.

(cherry picked from commit 02e95abdde9742cecd7d1211e572549c1e56b8b1)

Daily bump.

libfortran: Fix up _gfortran_{,m,s}findloc2_s{1,4} [PR120196]

As mentioned in the PR, _gfortran_{,m,s}findloc2_s{1,4} iterate too many
times in the back case if nothing is found.
For !back, the loops are for (i = 1; i <= extent; i++) so i is in the
body [1, extent] if nothing is found, but for back it is
for (i = extent; i >= 0; i--) so i is in the body [0, extent] and compares
one element before the start of the array.
Note, findloc1_s{1,4} uses
          for (n = len; n > 0; n--, src -= delta * len_array)
for the back loop and
          for (n = 1; n <= len; n++, src += delta * len_array)
for !back.  This patch fixes that.
The testcase fails under valgrind without the libgfortran changes and
succeeds with those.

2025-05-13  Jakub Jelinek  <jakub@redhat.com>

PR libfortran/120196
* m4/ifindloc2.m4 (header1, header2): For back use i > 0 rather than
i >= 0 as for condition.
* generated/findloc2_s1.c: Regenerate.
* generated/findloc2_s4.c: Regenerate.

* gfortran.dg/pr120196.f90: New test.

(cherry picked from commit 748a7bc4624e7b000f6fdb93a8cf7da73ff193bb)

libfortran: Fix up _gfortran_s{max,min}loc1_{4,8,16}_s{1,4} [PR120191]

There is a bug in _gfortran_s{max,min}loc1_{4,8,16}_s{1,4} which the
following testcase shows.
The functions return but then crash in the caller.
Seems that is because buffer overflows, I believe those functions for
if (mask == NULL || *mask) condition being false are supposed to fill in
the result array with all zeros (or allocate it and fill it with zeros).
My understanding is the result array in that case is integer(kind={4,8,16})
and should have the extents the character input array has.
The problem is that it uses * string_len in the extent multiplication:
      extent[n] = GFC_DESCRIPTOR_EXTENT(array,n) * string_len;
and
      extent[n] =
        GFC_DESCRIPTOR_EXTENT(array,n + 1) * string_len;
which is I guess fine and desirable for the extents of the character array,
but not for the extents of the destination array.  Yet the code uses
that extent array for that purpose (and no other purposes).
Here it uses it to set the dimensions for the case where it needs to
allocate (as well as size):
      for (n = 0; n < rank; n++)
        {
          if (n == 0)
            str = 1;
          else
            str = GFC_DESCRIPTOR_STRIDE(retarray,n-1) * extent[n-1];
          GFC_DIMENSION_SET(retarray->dim[n], 0, extent[n] - 1, str);
        }
Here it uses it for bounds checking of the destination:
      if (unlikely (compile_options.bounds_check))
        {
          for (n=0; n < rank; n++)
            {
              index_type ret_extent;

              ret_extent = GFC_DESCRIPTOR_EXTENT(retarray,n);
              if (extent[n] != ret_extent)
                runtime_error ("Incorrect extent in return value of"
                               " MAXLOC intrinsic in dimension %ld:"
                               " is %ld, should be %ld", (long int) n + 1,
                               (long int) ret_extent, (long int) extent[n]);
            }
        }
and here to find out how many retarray elements to actually fill in each
dimension:
  while(1)
    {
      *dest = 0;
      count[0]++;
      dest += dstride[0];
      n = 0;
      while (count[n] == extent[n])
        {
          /* When we get to the end of a dimension, reset it and increment
             the next dimension.  */
          count[n] = 0;
          /* We could precalculate these products, but this is a less
             frequently used path so probably not worth it.  */
          dest -= dstride[n] * extent[n];
Seems maxloc1s.m4 and minloc1s.m4 are the only users of ifunction-s.m4,
so we can change SCALAR_ARRAY_FUNCTION in there without breaking anything
else.

2025-05-13  Jakub Jelinek  <jakub@redhat.com>

PR fortran/120191
* m4/ifunction-s.m4 (SCALAR_ARRAY_FUNCTION): Don't multiply
GFC_DESCRIPTOR_EXTENT(array,) by string_len.
* generated/maxloc1_4_s1.c: Regenerate.
* generated/maxloc1_4_s4.c: Regenerate.
* generated/maxloc1_8_s1.c: Regenerate.
* generated/maxloc1_8_s4.c: Regenerate.
* generated/maxloc1_16_s1.c: Regenerate.
* generated/maxloc1_16_s4.c: Regenerate.
* generated/minloc1_4_s1.c: Regenerate.
* generated/minloc1_4_s4.c: Regenerate.
* generated/minloc1_8_s1.c: Regenerate.
* generated/minloc1_8_s4.c: Regenerate.
* generated/minloc1_16_s1.c: Regenerate.
* generated/minloc1_16_s4.c: Regenerate.

* gfortran.dg/pr120191_3.f90: New test.

(cherry picked from commit 781cfc454b8dc24952fe7f4c5c409296dca505e1)

libfortran: Fix up _gfortran_s{max,min}loc2_{4,8,16}_s{1,4} [PR120191]

I've tried to write a testcase for the BT_CHARACTER maxloc/minloc with named
or unnamed arguments and indeed the just posted patch fixed the arguments
in there in multiple cases to match what the library expects.
But the testcase still fails, due to library problems.

One dealt with in this patch are _gfortran_s{max,min}loc2_{4,8,16}_s{1,4}
functions.  Those are trivial wrappers around
_gfortrani_{max,min}loc2_{4,8,16}_s{1,4} which should call those functions
if the scalar mask is true and just return 0 otherwise.
The two bugs I see there is that the back, len arguments are swapped,
which means that it always acts as back=.true. and for len will use
character length of 1 or 0 instead of the desired one.
The _gfortrani_{max,min}loc2_{4,8,16}_s{1,4} functions have prototypes like
GFC_INTEGER_4
maxloc2_4_s1 (gfc_array_s1 * const restrict array, GFC_LOGICAL_4 back, gfc_charlen_type len)
so back comes before len, ditto for the
GFC_INTEGER_4
smaxloc2_4_s1 (gfc_array_s1 * const restrict array,
               GFC_LOGICAL_4 *mask, GFC_LOGICAL_4 back, gfc_charlen_type len)
The other problem is that it was just testing if (mask).  In my limited
Fortran understanding that means that the optional argument mask was
supplied but nothing about its actual value.  Other scalar mask generated
routines use if (mask == NULL || *mask) as the condition when to call the
non-masked function, i.e. when mask is not supplied (then it should act like
.true. mask) or when it is supplied and evaluates to .true.).

2025-05-13  Jakub Jelinek  <jakub@redhat.com>

PR fortran/120191
* m4/maxloc2s.m4: For smaxloc2 call maxloc2 if mask is NULL or *mask.
Swap back and len arguments.
* m4/minloc2s.m4: Likewise.
* generated/maxloc2_4_s1.c: Regenerate.
* generated/maxloc2_4_s4.c: Regenerate.
* generated/maxloc2_8_s1.c: Regenerate.
* generated/maxloc2_8_s4.c: Regenerate.
* generated/maxloc2_16_s1.c: Regenerate.
* generated/maxloc2_16_s4.c: Regenerate.
* generated/minloc2_4_s1.c: Regenerate.
* generated/minloc2_4_s4.c: Regenerate.
* generated/minloc2_8_s1.c: Regenerate.
* generated/minloc2_8_s4.c: Regenerate.
* generated/minloc2_16_s1.c: Regenerate.
* generated/minloc2_16_s4.c: Regenerate.

* gfortran.dg/pr120191_2.f90: New test.

(cherry picked from commit 482f2192d4ef6af55acae2dc3e0df00b8487cc7d)

fortran: Fix up minloc/maxloc lowering [PR120191]

We need to drop the kind argument from what is passed to the
library, but need to do it not only when one uses the argument name
for it (so kind=4 etc.) but also when one passes all the arguments
to the intrinsics.

The following patch uses what gfc_conv_intrinsic_findloc uses,
which looks more efficient and cleaner, we already set automatic
vars to point to the kind and back actual arguments, so we can just
free/clear expr on the former and set name to "%VAL" on the latter.

And similarly clears dim argument for the BT_CHARACTER case when using
maxloc2/minloc2, again regardless of whether it was named or not.

2025-05-13  Jakub Jelinek  <jakub@redhat.com>
    Daniil Kochergin  <daniil2472s@gmail.com>
    Tobias Burnus  <tburnus@baylibre.com>

PR fortran/120191
* trans-intrinsic.cc (strip_kind_from_actual): Remove.
(gfc_conv_intrinsic_minmaxloc): Don't call strip_kind_from_actual.
Free and clear kind_arg->expr if non-NULL.  Set back_arg->name to
"%VAL" instead of a loop looking for last argument.  Remove actual
variable, use array_arg instead.  Free and clear dim_arg->expr if
non-NULL for BT_CHARACTER cases instead of using a loop.

* gfortran.dg/pr120191_1.f90: New test.

(cherry picked from commit ec249be3c287c6f1dfb328712ac9c39e6fa95eca)

rs6000: Ignore OPTION_MASK_SAVE_TOC_INDIRECT differences in inlining decisions [PR119327]

The following testcase FAILs because the always_inline function can't
be inlined.
The rs6000 backend has similarly to other targets a hook which rejects
inlining which would bring in new ISAs which aren't there in the caller.
And this hook rejects this because of OPTION_MASK_SAVE_TOC_INDIRECT
differences.
This flag is set if explicitly requested or by default depending on
whether the current function looks hot (or at least not cold):
  if ((rs6000_isa_flags_explicit & OPTION_MASK_SAVE_TOC_INDIRECT) == 0
      && flag_shrink_wrap_separate
      && optimize_function_for_speed_p (cfun))
    rs6000_isa_flags |= OPTION_MASK_SAVE_TOC_INDIRECT;
The target nodes that are being compared here are actually the default
target node (which was created when cfun was NULL) vs. one that was
created for the always_inline function when it wasn't NULL, so one
doesn't have it, the other does.
In any case, this flag feels like a tuning decision rather than hard
ISA requirement and I see no problems why we couldn't inline
even explicit -msave-toc-indirect function into -mno-save-toc-indirect
or vice versa.
We already ignore OPTION_MASK_P{8,10}_FUSION which are also more
like tuning flags.

2025-04-22  Jakub Jelinek  <jakub@redhat.com>

PR target/119327
* config/rs6000/rs6000.cc (rs6000_can_inline_p): Ignore also
OPTION_MASK_SAVE_TOC_INDIRECT differences.

* g++.dg/opt/pr119327.C: New test.

(cherry picked from commit 4b62cf555b5446cb02fc471519cf1afa09e1a108)

Daily bump.

crypto/tls: fix Config.Time in tests using expired certificates

This is a backport of https://go.dev/cl/640237 from the main repo.

Fixes PR go/118286

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/640435
(cherry picked from commit ed1493e12ed75e837e9b9aa794ed24daf397df7c)

Backport new assume implementation and cache.

Rework the assume pass to work properly and fail conservatively when it
does. Also move it to its own file.

PR tree-optimization/117287
gcc/
* Makefile.in (IBJS): Add tree-assume.o
* gimple-range-fold.cc (class fur_edge): Relocate to...
* gimple-range-fold.h (class fur_edge): Here.
* gimple-range.cc (assume_query::assume_range_p): Remove.
(assume_query::range_of_expr): Remove.
(assume_query::assume_query): Move to tree-assume.cc.
(assume_query::~assume_query): Remove.
(assume_query::calculate_op): Move to tree-assume.cc.
(assume_query::calculate_phi): Likewise.
(assume_query::check_taken_edge): Remove.
(assume_query::calculate_stmt): Move to tree-assume.cc.
(assume_query::dump): Remove.
* gimple-range.h (class assume_query): Move to tree-assume.cc
* tree-assume.cc: New
* tree-vrp.cc (struct pass_data_assumptions): Move to tree-assume.cc.
(class pass_assumptions): Likewise.
(make_pass_assumptions): Likewise.

gcc/testsuite/
* g++.dg/cpp23/pr117287-attr.C: New.

Daily bump.

ipa: Do not emit info about temporary clones to ipa-clones dump (PR119852)

As described in PR 119852, the output of -fdump-ipa-clones can contain
"(null)" as the suffix/reason for cloning when we need to create a
clone to hold the original function during recursive inlining.  Such
clone is never output and so should not be part of the dump output
either.

gcc/ChangeLog:

2025-04-23  Martin Jambor  <mjambor@suse.cz>

PR ipa/119852
* cgraphclones.cc (dump_callgraph_transformation): Document the
function.  Do not dump if suffix is NULL.

gcc/testsuite/ChangeLog:

2025-04-23  Martin Jambor  <mjambor@suse.cz>

PR ipa/119852
* gcc.dg/ipa/pr119852.c: New test.

(cherry picked from commit fb5829a01651d427a63a12c44ecc8baa47dbfc83)

Document option -fdump-ipa-clones

I have noticed that the option -fdump-ipa-clones is not documented
although there are users who depend on it. This patch adds the
missing documentation along with the description of the information it
dumps and the format it uses.

I am never quite sure which of the texinfo mark-ups is the most
appropriate in which situation, I'll of course incorporate any
feedback on this as well as the general wording of the text.

After we settle on a version, I'd like to backport the documentation
also at least to GCC 15, 14 and 13.

Is it perhaps OK for master and the branches or what would better be
changed?

Thanks,

Martin

gcc/ChangeLog:

2025-04-23 Martin Jambor <mjambor@suse.cz>

* doc/invoke.texi (Developer Options): Document -fdump-ipa-clones.

(cherry picked from commit 6ecc2fee06bdd60da0e9b3fe6660b553dbdca3ca)

Daily bump.

final: Fix get_attr_length for asm goto [PR118411]

The problem is for inline-asm goto, the outer rtl insn type
is a jump_insn and get_attr_length does not handle ASM specially
unlike if the outer rtl insn type was just insn.

This fixes the issue by adding support for both CALL_INSN and JUMP_INSN
with asm.

OK? Bootstrapped and tested on x86_64-linux-gnu.

PR middle-end/118411

gcc/ChangeLog:

* final.cc (get_attr_length_1): Handle asm for CALL_INSN
and JUMP_INSNs.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
(cherry picked from commit c1729df6ec1eff4815a9cdd71392691ce21da028)

libstdc++: Update C++23 status table

This should have been updated for the GCC 13.1 release.

libstdc++-v3/ChangeLog:

* doc/xml/manual/status_cxx2023.xml: Update status of proposals
implemented for GCC 13.
* doc/html/manual/status.html: Regenerate.

Daily bump.

testsuite: Add testcase for already fixed PR [PR117498]

This wrong-code issue has been fixed with r15-7249.
We still emit warnings which are questionable and perhaps we'd
get better generated code if niters determined the loop has only a single
iteration without UB and we'd punt on vectorizing it (or unrolling).

2025-01-31 Jakub Jelinek <jakub@redhat.com>

PR middle-end/117498
* gcc.c-torture/execute/pr117498.c: New test.

(cherry picked from commit 9fc0683082067801e3790f7cfffedbf5441e0f82)

Daily bump.

aarch64: Fix CFA offsets in non-initial stack probes [PR119610]

PR119610 is about incorrect CFI output for a stack probe when that
probe is not the initial allocation.  The main aarch64 stack probe
function, aarch64_allocate_and_probe_stack_space, implicitly assumed
that the incoming stack pointer pointed to the top of the frame,
and thus held the CFA.

aarch64_save_callee_saves and aarch64_restore_callee_saves use a
parameter called bytes_below_sp to track how far the stack pointer
is above the base of the static frame.  This patch does the same
thing for aarch64_allocate_and_probe_stack_space.

Also, I noticed that the SVE path was attaching the first CFA note
to the wrong instruction: it was attaching the note to the calculation
of the stack size, rather than to the r11<-sp copy.

gcc/
PR target/119610
* config/aarch64/aarch64.cc (aarch64_allocate_and_probe_stack_space):
Add a bytes_below_sp parameter and use it to calculate the CFA
offsets.  Attach the first SVE CFA note to the move into the
associated temporary register.
(aarch64_allocate_and_probe_stack_space): Update calls accordingly.
Start out with bytes_per_sp set to the frame size and decrement
it after each allocation.

gcc/testsuite/
PR target/119610
* g++.dg/torture/pr119610.C: New test.
* g++.target/aarch64/sve/pr119610-sve.C: Likewise.

(cherry picked from commit fa61afef18a8566d1907a5ae0e7754e1eac207d9)

Daily bump.

libstdc++/112351 - deal with __gthread_once failure during locale init

The following makes the C++98 locale init path follow the way the
C++11 performs initialization. This way we deal with pthread_once
failing, falling back to non-threadsafe initialization which, given we
initialize from the library, should be serialized by the dynamic
loader already.

PR libstdc++/112351
libstdc++-v3/
* src/c++98/locale.cc (locale::facet::_S_initialize_once):
Check whether _S_c_locale is already initialized.
(locale::facet::_S_get_c_locale): Always perform non-threadsafe
init when threadsafe init failed.

(cherry picked from commit 7562f089a190953b8ef615b90b7b0520e812a930)

Daily bump.

debug/101533 - ICE with variant typedef DIE generation

There's a sanity check in gen_type_die_with_usage that trips
unnecessarily for a case where the relevant DIE has already been
generated successfully in other ways. The following keys the
existing TREE_ASM_WRITTEN check on the correct object, honoring
this and does nothing instead of ICEing for the testcase at hand.

PR debug/101533
* dwarf2out.cc (gen_type_die_with_usage): When we have
output the typedef already do nothing for a typedef variant.
Do not set TREE_ASM_WRITTEN on the type.

* g++.dg/debug/pr101533.C: New testcase.

(cherry picked from commit 99a3f013c3bb8bc022ca488b40aa18fd97b5224d)

middle-end/101478 - ICE with degenerate address during gimplification

When we gimplify &MEM[0B + 4] we are re-folding the address in case
types are not canonical which ends up with a constant address that
recompute_tree_invariant_for_addr_expr ICEs on. Properly guard
that call.

PR middle-end/101478
* gimplify.cc (gimplify_addr_expr): Check we still have an
ADDR_EXPR before calling recompute_tree_invariant_for_addr_expr.

* gcc.dg/pr101478.c: New testcase.

(cherry picked from commit 33ead6400ad59d4b38fa0527a9a7b53a28114ab7)