git.ipfire.org Git - thirdparty/gcc.git/log

Daily bump.

match: Remove redundant type checks from `(T1)(a bit_op (T2)b)` pattern.

As mentioned in https://gcc.gnu.org/pipermail/gcc-patches/2026-January/705657.html,
there were some redundant checks in this pattern. In the first if,
the check for pointer and OFFSET_TYPE is redundant as there is a check for
INTEGRAL_TYPE_P beforehand. For the second one, the check for INTEGRAL_TYPE_P
on the innermost type is not needed as there is a types_match right afterwards.

Pushed as obvious after bootstrap/test on x86_64-linux-gnu.

gcc/ChangeLog:

* match.pd (`(T1)(a bit_op (T2)b)`): Remove redundant
type checks.

Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>

c++: modules and coroutines

While working on another issue I found that currently modules do not
work with coroutines at all.  This patch fixes a number of issues in
both the coroutines logic and modules logic to ensure that they play
well together.  To summarize:

- The coroutine proxy objects did not have a DECL_CONTEXT set (required
  for modules to merge declarations).

- The coroutine transformation functions are always non-inline, even
  for an inline ramp function, which means that modules need an override
  to ensure the definitions are available where needed.

- Coroutine transformation functions were not marked DECL_COROUTINE_P,
  despite accessors implying that they were.

- In an importing TU we had lost the connection between the ramp
  functions and the transform functions, as they were kept in a pair
  of global maps.

- Modules streaming couldn't discriminate between the actor or destroy
  functions when merging.

- Modules streaming wasn't setting the cfun->coroutine_component flag,
  needed to activate the middle-end coroutine lowering pass.

This patch also separates the coroutine_info_table initialization from
the ensure_coro_initialized function.  If the first time we see a
coroutine is from a module import, we need to register the
transformation functions now but calling ensure_coro_initialized would
lookup e.g. std::coroutine_traits, which may only be visible from this
module that we're currently reading, causing a recursive load.
Separating the concerns allows this to work correctly.

gcc/cp/ChangeLog:

* coroutines.cc (create_coroutine_info_table): New function.
(get_or_insert_coroutine_info): Mark static.
(ensure_coro_initialized): Likewise; use
create_coroutine_info_table.
(coro_promise_type_found_p): Set DECL_CONTEXT for proxies.
(coro_set_ramp_function): New function.
(coro_set_transform_functions): New function.
(coro_build_actor_or_destroy_function): Use
coro_set_ramp_function, mark as DECL_COROUTINE_P.
* cp-tree.h (coro_set_transform_functions): Declare.
(coro_set_ramp_function): Declare.
* module.cc (struct merge_key): New field coro_disc.
(dumper::impl::nested_name): Distinguish coroutine transform
functions.
(get_coroutine_discriminator): New function.
(trees_out::key_mergeable): Stream coroutine discriminator.
(check_mergeable_decl): Adjust comment, check for matching
coroutine discriminator.
(trees_in::key_mergeable): Read coroutine discriminator.
(has_definition): Override for coroutine transform functions.
(trees_out::write_function_def): Stream linked ramp, actor, and
destroy functions for coroutines.
(trees_in::read_function_def): Read them.
(module_state::read_cluster): Set cfun->coroutine_component.

gcc/testsuite/ChangeLog:

* g++.dg/modules/coro-1_a.C: New test.
* g++.dg/modules/coro-1_b.C: New test.

Reviewed-by: Iain Sandoe <iain@sandoe.co.uk>
Reviewed-by: Jason Merrill <jason@redhat.com>
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>

c++/modules: Update lang_decl_bool streaming

The set of lang_decl flags that we were streaming had gotten out of sync
with the current list; update them.

One notable change is that anticipated_p, which had previously been
deliberately skipped, is now only used for DECL_OMP_PRIVATIZED_MEMBER,
and so should probably be streamed as well.

gcc/cp/ChangeLog:

* module.cc (trees_out::lang_decl_bools): Update list of flags.
(trees_in::lang_decl_bools): Likewise.

Reviewed-by: Jason Merrill <jason@redhat.com>
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>

match: (X >> C) NE/EQ 0 -> X LT/GE 0 [PR123109]

Implement (X >> C) NE/EQ 0 -> X LT/GE 0 in match.pd instead of fold-const.cc.
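
As an illustration only (not part of the patch), the equivalence can be
checked in plain C. The sketch assumes that C is one less than the bit width
of X and that right-shifting a negative signed value is an arithmetic shift
(implementation-defined in ISO C, but what GCC does). All helper names are
made up for this example.

```c
#include <assert.h>
#include <limits.h>

/* Assumption: the shift count is the bit width of int minus one.  */
enum { SIGN_SHIFT = sizeof (int) * CHAR_BIT - 1 };

/* Original form: (X >> C) != 0.  */
static int
shifted_ne_zero (int x)
{
  return (x >> SIGN_SHIFT) != 0;
}

/* Simplified form: X < 0.  */
static int
less_than_zero (int x)
{
  return x < 0;
}

/* Original form: (X >> C) == 0.  */
static int
shifted_eq_zero (int x)
{
  return (x >> SIGN_SHIFT) == 0;
}

/* Simplified form: X >= 0.  */
static int
greater_equal_zero (int x)
{
  return x >= 0;
}
```

Under those assumptions, only the sign bit survives the shift, so the
comparison against zero is just a sign test.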

Bootstrapped and tested on x86_64 and aarch64.

PR tree-optimization/123109

gcc/ChangeLog:

* fold-const.cc (fold_binary_loc): Remove (X >> C) NE/EQ 0 -> X LT/GE 0
folding.
* match.pd (`(X >> C) NE/EQ 0 -> X LT/GE 0`): New pattern.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/vrp99.c: Update test.
* gcc.dg/pr123109.c: New test.

Signed-off-by: Pengxuan Zheng <pengxuan.zheng@oss.qualcomm.com>

match: Add simplification of `(a*zero_one_valued_p) & b` if `a & b` simplifies [PR119402]

This is a small reassociation of `a*bool & b` into `(a & b) * bool`, applied
only if `a & b` simplifies, since `b` could be `~a` or `a` or something
else that simplifies when anded with `a`.
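
A minimal C sketch of why the reassociation is safe, assuming the flag is
known to be 0 or 1 (modelled here with bool). The function names are
hypothetical and this is not the match.pd pattern itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Original form: (a * flag) & b, where flag is 0 or 1.  */
static unsigned
mul_then_and (unsigned a, unsigned b, bool flag)
{
  return (a * flag) & b;
}

/* Reassociated form: (a & b) * flag.  */
static unsigned
and_then_mul (unsigned a, unsigned b, bool flag)
{
  return (a & b) * flag;
}

/* With b == ~a the inner "a & b" folds to 0, so the whole
   expression is 0 regardless of flag.  */
static unsigned
with_inverted_operand (unsigned a, bool flag)
{
  return mul_then_and (a, ~a, flag);
}
```

Because the flag is either 0 or 1, multiplying by it either clears the value
or leaves it unchanged, which commutes with the bitwise AND.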

Note this fixes a regression for aarch64 where the cost of a multiply vs `&-` changed
in GCC 14 and can no longer optimize some cases at the RTL level.

Bootstrapped and tested on x86_64-linux-gnu.

PR tree-optimization/119402
gcc/ChangeLog:

* match.pd (`(a*zero_one_valued_p) & b`): New pattern.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/bitops-14.c: New test.
* gcc.dg/tree-ssa/bitops-15.c: New test.

Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>

testsuite/aarch64: Fix aarch64/signbitv2sf.c [PR122522]

The problem here is that after some heuristics changes the check
loop is now unrolled, so we eliminate the array. This means
the check for not having -2147483648 no longer works as
we don't handle SLP in this case.
So the best option is to force the check loop not to unroll
(no vectorize) as this is just testing we SLP the normal
signbit places rather than dealing with the checking loop.

Pushed as obvious after testing the testcase on aarch64-linux-gnu.

PR testsuite/122522
gcc/testsuite/ChangeLog:

* gcc.target/aarch64/signbitv2sf.c (main): Disable
unrolling and vectorizer for the checking loop.

Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>

c: fix checking ICE related to transparent unions and atomic [PR123309]

When matching function arguments in composite_type_internal and one
type comes from a transparent union, it is possible to end up with
atomic and non-atomic types because this case is not handled correctly.
The type matching logic is rewritten in a cleaner way to use helper
functions and to not walk the argument lists three times. With this
change, a checking assertion can be added to test for matching qualifiers
for pointers. (In general, this assumption is still violated for
function return types.)

PR c/123309

gcc/c/ChangeLog:
* c-typeck.cc (transparent_union_replacement): New function.
(composite_type_internal): Rewrite logic.
(type_lists_compatible_p): Remove dead code for NULL arguments.

gcc/testsuite/ChangeLog:
* gcc.dg/pr123309.c: New test.
* gcc.dg/union-composite-type.c: New test.

libstdc++: Fix handling iterators with proxy subscript in heap algorithms.

This patch replaces the uses of subscripts in heap algorithms that were introduced
in r16-4100-gaaeca77a79a9a8 with dereferences of advanced iterators.

The Cpp17RandomAccessIterator requirements allow operator[] to return any
type that is convertible to reference; however, user-provided comparators are
required only to accept the result of dereferencing the iterator (i.e. reference
directly). This is visible when the comparator defines an operator() whose
template arguments can be deduced from reference (which will fail on a proxy)
or that accepts types convertible from reference (see included tests).

For testing we introduce a new proxy_random_access_iterator_wrapper iterator
in testsuite_iterators.h that returns a proxy type from the subscript operator.
This is a separate type (instead of an additional template argument and aliases),
as it is used for tests that work with C++98.

libstdc++-v3/ChangeLog:

* include/bits/stl_heap.h (std::__is_heap_until, std::__push_heap)
(std::__adjust_heap): Replace subscript with dereference of
advanced iterator.
* testsuite/util/testsuite_iterators.h (__gnu_test::subscript_proxy)
(__gnu_test::proxy_random_access_iterator_wrapper): Define.
* testsuite/25_algorithms/sort_heap/check_proxy_brackets.cc: New test.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>

Fortran: Detect missing quote in namelist read.

PR libfortran/123012

libgfortran/ChangeLog:

* io/list_read.c (read_character): Add new check after
get_string and provide better comments.

gcc/testsuite/ChangeLog:

* gfortran.dg/namelist_101.f90: New test.

ifcvt: Improve `cmp?a&b:a` to try with -1 [PR123312]

After the current improvements to ifcvt, on some targets for
cmp?a&b:a it is better to produce `(cmp?b:-1) & a` rather than
`(!cmp?a:0)|(a & b)`. So this extends noce_try_cond_zero_arith (with
a rename to noce_try_cond_arith) to see if `cmp ? a : -1` is cheaper than
`!cmp?a:0`.
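
For reference, the three forms mentioned above compute the same value. The
following C sketch (with made-up function names, and not the RTL that ifcvt
actually emits) spells them out:

```c
#include <assert.h>

/* Source form: cmp ? a & b : a.  */
static unsigned
cond_and (int cmp, unsigned a, unsigned b)
{
  return cmp ? (a & b) : a;
}

/* Form using the all-ones identity of AND: (cmp ? b : -1) & a.  */
static unsigned
cond_and_via_minus_one (int cmp, unsigned a, unsigned b)
{
  return (cmp ? b : ~0u) & a;
}

/* Form using a zero operand: (!cmp ? a : 0) | (a & b).  */
static unsigned
cond_and_via_zero (int cmp, unsigned a, unsigned b)
{
  return (!cmp ? a : 0u) | (a & b);
}
```

The -1 variant needs one conditional select and one AND, whereas the zero
variant needs a select, an AND and an OR; which one is cheaper still depends
on the target's costs.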

Bootstrapped and tested on x86_64-linux-gnu.

PR rtl-optimization/123312
gcc/ChangeLog:

* ifcvt.cc (noce_try_cond_zero_arith): Rename to ...
(noce_try_cond_arith): This. For AND try `cmp ? a : -1`
also to see which one cost less.
(noce_process_if_block): Handle the rename.

Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>

winnt-utf8.manifest: make long path aware

Based on:
https://learn.microsoft.com/en-us/windows/win32/sbscs/application-manifests#longpathaware

gcc:
* config/i386/winnt-utf8.manifest: enable longPathAware.

Signed-off-by: Jonathan Yong <10walls@gmail.com>

winnt-utf8.manifest: Use XML example from Microsoft

Based on example from:
https://learn.microsoft.com/en-us/windows/win32/sbscs/application-manifests#activecodepage

PR driver/108865

gcc:
* config/i386/winnt-utf8.manifest: correct XML tags

Signed-off-by: Jonathan Yong <10walls@gmail.com>

[PR tree-optimization/123530] Fix ICE in recently added match.pd pattern

The gimple optimization passes can create negative shift counts and pass them
into the simplification routines as seen by the code in pr123530.  If we then
call tree_to_uhwi on those values we get a nice little ICE.

This guards the tree_to_uhwi calls on tree_fits_uhwi_p and resolves the ICE.  I
just protected them all in this recently added pattern.
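
The guard-before-extract idea can be sketched in plain C as below. The
helpers are hypothetical stand-ins operating on int64_t; GCC's
tree_fits_uhwi_p and tree_to_uhwi work on trees and are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a "fits in unsigned" predicate.  */
static bool
fits_unsigned (int64_t value)
{
  return value >= 0;
}

/* Extract VALUE as an unsigned shift count only when that is safe.
   Returns false and leaves *COUNT untouched otherwise, so the caller
   can simply decline to apply the simplification.  */
static bool
get_shift_count (int64_t value, uint64_t *count)
{
  if (!fits_unsigned (value))
    return false;
  *count = (uint64_t) value;
  return true;
}
```

A negative count makes the extraction fail cleanly instead of tripping an
assertion.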

Bootstrapped and regression tested on x86 and riscv.  Also tested on the rest
of the embedded targets without any regressions.

Pushing to the trunk.

PR tree-optimization/123530
gcc/
* match.pd (reassociating xor to enable rotations): Verify constants
fit into a uhwi before trying to extract them as a uhwi.

gcc/testsuite/
* gcc.dg/torture/pr123530.c: New test.

middle-end/123573 - fix VEC_PERM folding more

The following fixes the fix from r16-6709-ga4716ece529dfd some
more by making sure that folding a permute to one operand sees
vectors with the same number of elements, and by inserting a
VIEW_CONVERT_EXPR for the case where one is VLA and one is VLS
(when the VLA case is actually constant, like with
-msve-vector-bits=128). It also turns the assert in fold_vec_perm
(which this pattern eventually dispatches to) that output and input
element numbers match into a check (as the comment already indicates).

Testcases are in the target specific aarch64 testsuite already.

PR middle-end/123573
* fold-const.cc (fold_vec_perm): Actually check, not assert,
that input and output vector element numbers agree.
* match.pd (vec_perm @0 @1 @2): Make sure element numbers
are the same when folding to an input vector and wrap that
inside a VIEW_CONVERT_EXPR.

forwprop: Fix type mismatch in vec constructor [PR123525].

This issue got raised after r16-6671 in which I removed checks for
number-of-element equality.  In the splat case with conversion:

  vector(16) int w;
  vector(8) long int v;
  _13 = BIT_FIELD_REF <w_12(D), 32, 160>;
  _2 = (long int) _13;
  _3 = (long int) _13;
  ...
  _9 = (long int) _13;
  _1 = {_2, _3, _4, _5, _6, _7, _8, _9};

right now we do
  _16 = VEC_PERM_EXPR <w_12(D), w_12(D), { 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5 }>;
  _17 = VIEW_CONVERT_EXPR<vector(8) intD.6>(_16);

where the view convert is actually an optimized
  _17 = BIT_FIELD_REF (_16, 512, 0);

512 is the size of the unconverted source but we should actually use the
converted source type.  That's what this patch does.

PR tree-optimization/123525

gcc/ChangeLog:

* tree-ssa-forwprop.cc (simplify_vector_constructor): Use
converted source type for conversion bit field ref.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/pr123525.c: New test.
* g++.dg/vect/pr123525-2.cc: New test.

if-conv: Prevent vector types in scalar cond reduction [PR123301].

Currently we allow vector types in scalar conditional reductions by
accident (via the GNU vector extension). This patch prevents that.

PR tree-optimization/123301

gcc/ChangeLog:

* tree-if-conv.cc (convert_scalar_cond_reduction):
Disallow vector types.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/pr123301.c: New test.

rtlanal: Determine nonzero bits of popcount from operand [PR123501].

The PR involves large mask vectors (e.g. V128BI) from which we take
the popcount. Currently a (popcount:DI (V128BI)) is assumed to have
at most 8 set bits as we assume the popcount operand also has DImode.

This patch uses the operand mode for unary operations and thus
calculates a proper nonzero-bits mask.
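
As a rough model of why the operand width matters (a simplified sketch with
a made-up helper, not the code in rtlanal.cc): the popcount of a W-bit
operand lies in [0, W], so the bits that may be nonzero in the result depend
on W.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: mask of result bits that may be nonzero for the
   popcount of an operand that is OPERAND_BITS wide (OPERAND_BITS >= 1).
   The result is at most OPERAND_BITS, which needs
   floor (log2 (OPERAND_BITS)) + 1 bits.  */
static uint64_t
popcount_nonzero_mask (unsigned operand_bits)
{
  unsigned lg = 0;
  while ((operand_bits >> (lg + 1)) != 0)
    lg++;
  return (UINT64_C (2) << lg) - 1;
}
```

In this model a mask derived from a 64-bit width cannot represent a popcount
of 128, while one derived from the 128-bit operand width can.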

We could do the same estimate for ctz and clz but they use nonzero in a
non-poly way and I didn't want to change more than necessary. Therefore
the patch just returns -1 when we have a different operand mode for
ctz/clz.

PR rtl-optimization/123501
PR rtl-optimization/123444

gcc/ChangeLog:

* rtlanal.cc (nonzero_bits1): Use operand mode instead of
operation mode.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/reduc/pr123501.c: New test.

amdgcn: Adjust failure mode for gfx908 USM: 'libgomp.fortran/map-alloc-comp-9-usm.f90'

The change/rationale that commit 1cf9fda4936de54198858b8f54cd9707a3725f4e
"amdgcn: Adjust failure mode for gfx908 USM" applied to a number of test cases
likewise applies to 'libgomp.fortran/map-alloc-comp-9-usm.f90'.

libgomp/
* testsuite/libgomp.fortran/map-alloc-comp-9-usm.f90: Require
working Unified Shared Memory to run the test.

openmp: Bump Version from 4.5 to 5.2 (2/4): Some more '-Wno-deprecated-openmp'

These changes should've been included in
commit 382edf047effcd5b1ce66389742bd1b3e178ac95
"openmp: Bump Version from 4.5 to 5.2 (2/4)", to avoid some more instances of:

    warning: use of 'omp declare target' as a synonym for 'omp begin declare target' has been deprecated since OpenMP 5.2 [-Wdeprecated-openmp]

    warning: 'to' clause with 'declare target' deprecated since OpenMP 5.2, use 'enter' [-Wdeprecated-openmp]

    Warning: Non-C_PTR type argument at (1) is deprecated, use HAS_DEVICE_ADDR [-Wdeprecated-openmp]

    Warning: 'to' clause with 'declare target' at (1) deprecated since OpenMP 5.2, use 'enter' [-Wdeprecated-openmp]

libgomp/
* testsuite/libgomp.c++/examples-4/declare_target-2.C: Add
'-Wno-deprecated-openmp'.
* testsuite/libgomp.c/declare-variant-3-sm30.c: Likewise.
* testsuite/libgomp.c/declare-variant-3-sm35.c: Likewise.
* testsuite/libgomp.c/declare-variant-3-sm37.c: Likewise.
* testsuite/libgomp.c/declare-variant-3-sm52.c: Likewise.
* testsuite/libgomp.c/declare-variant-3-sm53.c: Likewise.
* testsuite/libgomp.c/declare-variant-3-sm61.c: Likewise.
* testsuite/libgomp.c/declare-variant-3-sm70.c: Likewise.
* testsuite/libgomp.c/declare-variant-3-sm75.c: Likewise.
* testsuite/libgomp.c/declare-variant-3-sm80.c: Likewise.
* testsuite/libgomp.c/declare-variant-3-sm89.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx10-3-generic.c:
Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx1030.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx1031.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx1032.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx1033.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx1034.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx1035.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx1036.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx11-generic.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx1100.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx1101.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx1102.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx1103.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx1150.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx1151.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx1152.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx1153.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx9-4-generic.c:
Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx9-generic.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx900.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx902.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx904.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx906.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx908.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx909.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx90a.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx90c.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx942.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx950.c: Likewise.
* testsuite/libgomp.c/examples-4/async_target-2.c: Likewise.
* testsuite/libgomp.c/interop-hsa.c: Likewise.
* testsuite/libgomp.c/target-20.c: Likewise.
* testsuite/libgomp.c/target-simd-clone-1.c: Likewise.
* testsuite/libgomp.c/target-simd-clone-2.c: Likewise.
* testsuite/libgomp.c/target-simd-clone-3.c: Likewise.
* testsuite/libgomp.fortran/alloc-managed-1.f90: Likewise.
* testsuite/libgomp.fortran/target9.f90: Likewise.

openmp: Bump Version from 4.5 to 5.2 (2/4): 'libgomp.oacc-c-c++-common/vred2d-128.c' [PR123098]

'libgomp.oacc-c-c++-common/vred2d-128.c' had gotten '-Wno-deprecated-openmp'
applied as part of commit 382edf047effcd5b1ce66389742bd1b3e178ac95
"openmp: Bump Version from 4.5 to 5.2 (2/4)", which conceptually doesn't make
sense, as 'libgomp.oacc-c-c++-common/vred2d-128.c' isn't an OpenMP test case.
In commit 9c119b0fdd9ba5a6821c0b4c5874ade8f4969109
"openmp: Limit - reduction -Wdeprecated-openmp diagnostics to OpenMP, testsuite fixes [PR123098]",
the erroneous diagnostic got disabled, so we don't need
'-Wno-deprecated-openmp' anymore.

PR testsuite/123098
libgomp/
* testsuite/libgomp.oacc-c-c++-common/vred2d-128.c: Remove
'-Wno-deprecated-openmp'.

Use -latomic_asneeded or -lgcc_s_asneeded to workaround libtool issues [PR123396]

On Mon, Jan 12, 2026 at 12:13:35PM +0100, Florian Weimer wrote:
> One way to work around the libtool problem would be to stick the
> as-needed into an existing .so linker script, or create a new one under
> a different name (say libatomic_optional.so) that has AS_NEEDED in it,
> and link with -latomic_optional. Then libtool would not have to be
> taught about --push-state/--pop-state etc.

That seems to work.

So far bootstrapped (c,c++,fortran,lto only) and make install tested
on x86_64-linux, tested on a small program without need for libatomic and
struct S { char a[25]; };
_Atomic struct S s;

int main () { struct S t = s; s = t; }
which does at -O0.
Before this patch I got
for i in `find x86_64-pc-linux-gnu/ -name lib\*.so.\*.\*`; do ldd -u $i 2>&1 | grep -q libatomic.so.1 && echo $i; done
x86_64-pc-linux-gnu/libsanitizer/ubsan/.libs/libubsan.so.1.0.0
x86_64-pc-linux-gnu/libsanitizer/asan/.libs/libasan.so.8.0.0
x86_64-pc-linux-gnu/libsanitizer/hwasan/.libs/libhwasan.so.0.0.0
x86_64-pc-linux-gnu/libsanitizer/lsan/.libs/liblsan.so.0.0.0
x86_64-pc-linux-gnu/libsanitizer/tsan/.libs/libtsan.so.2.0.0
x86_64-pc-linux-gnu/32/libsanitizer/ubsan/.libs/libubsan.so.1.0.0
x86_64-pc-linux-gnu/32/libsanitizer/asan/.libs/libasan.so.8.0.0
x86_64-pc-linux-gnu/32/libstdc++-v3/src/.libs/libstdc++.so.6.0.35
x86_64-pc-linux-gnu/libgcobol/.libs/libgcobol.so.2.0.0
x86_64-pc-linux-gnu/libstdc++-v3/src/.libs/libstdc++.so.6.0.35
With this patch it prints nothing.

2026-01-13 Jakub Jelinek <jakub@redhat.com>

PR libstdc++/123396
gcc/
* configure.ac (gcc_cv_ld_use_as_needed_ldscript): New test.
(USE_LD_AS_NEEDED_LDSCRIPT): New AC_DEFINE.
* gcc.cc (LINK_LIBATOMIC_SPEC): Use "-latomic_asneeded" instead
of LD_AS_NEEDED_OPTION " -latomic " LD_NO_AS_NEEDED_OPTION
if USE_LD_AS_NEEDED_LDSCRIPT is defined.
(init_gcc_specs): Use "-lgcc_s_asneeded" instead of
LD_AS_NEEDED_OPTION " -lgcc_s " LD_NO_AS_NEEDED_OPTION
if USE_LD_AS_NEEDED_LDSCRIPT is defined.
* config.in: Regenerate.
* configure: Regenerate.
libatomic/
* acinclude.m4 (LIBAT_BUILD_ASNEEDED_SOLINK): New AM_CONDITIONAL.
* libatomic_asneeded.so: New file.
* libatomic_asneeded.a: New file.
* Makefile.am (toolexeclib_DATA): Set if LIBAT_BUILD_ASNEEDED_SOLINK.
(all-local): Install those files into gcc subdir.
* Makefile.in: Regenerate.
* configure: Regenerate.
libgcc/
* config/t-slibgcc (SHLIB_ASNEEDED_SOLINK,
SHLIB_MAKE_ASNEEDED_SOLINK, SHLIB_INSTALL_ASNEEDED_SOLINK): New
vars.
(SHLIB_LINK): Include $(SHLIB_MAKE_ASNEEDED_SOLINK).
(SHLIB_INSTALL): Include $(SHLIB_INSTALL_ASNEEDED_SOLINK).

Fortran: Check constant PDT type specification parameters [PR112460]

2026-01-14 Paul Thomas <pault@gcc.gnu.org>

gcc/fortran
PR fortran/112460
* array.cc (resolve_array_list): Stash the first PDT element
and check its type specification parameters against those of
subsequent elements.
* expr.cc (get_parm_list_from_expr): New function to extract the
type spec lists from expressions to be compared.
(gfc_check_type_spec_parms): New function to compare type spec
lists between two expressions. Emit an error if any constant
values are different.
(gfc_check_assign): Check that the PDT type specification parms
are the same on lhs and rhs.
* gfortran.h : Add prototype for gfc_check_type_spec_parms.
* trans-expr.cc (copyable_array_p): PDT arrays are not copyable

gcc/testsuite
PR fortran/112460
* gfortran.dg/pdt_81.f03: New test.

tree-optimization/123539 - signed UB in vector reduction

With previous changes I overlooked one use of vectype.

PR tree-optimization/123539
* tree-vect-loop.cc (vect_create_epilog_for_reduction):
Use the compute vectype to pun down to smaller or element
size for by-element reductions.

xfail store_merging_19.c for the same reason as store_merging_18.c

store_merging_19.c is almost the same as store_merging_18.c except
it has assume align in it to allow it to work on strict align targets.
Somehow I noticed 18 but not 19 when I was looking at the test results
for failures.

Pushed as obvious.

gcc/testsuite/ChangeLog:

* gcc.dg/store_merging_19.c: xfail.

Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>

VN: Fix VN ICE for large _BitInt types

gcc.dg/torture/bitint-18.c triggers an ICE in push_partial_def when
compiling for RISC-V with -O2. The issue occurs because
build_nonstandard_integer_type cannot handle bit widths larger than
MAX_FIXED_MODE_SIZE.

For BITINT_TYPE with maxsizei > MAX_FIXED_MODE_SIZE, use build_bitint_type
instead of build_nonstandard_integer_type, similar to what tree-sra.cc does.

gcc/ChangeLog:

* tree-ssa-sccvn.cc (vn_walk_cb_data::push_partial_def): Use
build_bitint_type for BITINT_TYPE when maxsizei exceeds
MAX_FIXED_MODE_SIZE.

RISC-V: Add support for _BitInt [PR117581]

This patch implements _BitInt support for RISC-V target by defining the
type layout and ABI requirements.  The limb mode selection is based on
the bit width, using appropriate integer modes from QImode to TImode.
The implementation also adds the necessary libgcc version symbols for
_BitInt runtime support functions.

Changes in v3:
- Require sync_char_short effective target for bitint-64.c, bitint-82.c
  and bitint-84.c tests since they use atomic operations.
- Add -fno-section-anchors to bitint-32-on-rv64.c and adjust expected
  assembly output patterns.

Changes in v2:
- limb_mode uses up to XLEN when N > XLEN, which is a different setting from
  the abi_limb_mode.
- Add missing floatbitinthf in libgcc.

gcc/ChangeLog:

PR target/117581
* config/riscv/riscv.cc (riscv_bitint_type_info): New function.
(TARGET_C_BITINT_TYPE_INFO): Define.

gcc/testsuite/ChangeLog:

PR target/117581
* gcc.dg/torture/bitint-64.c: Add sync_char_short effective target
requirement.
* gcc.dg/torture/bitint-82.c: Likewise.
* gcc.dg/torture/bitint-84.c: Likewise.
* gcc.target/riscv/bitint-32-on-rv64.c: New test.
* gcc.target/riscv/bitint-alignments.c: New test.
* gcc.target/riscv/bitint-args.c: New test.
* gcc.target/riscv/bitint-sizes.c: New test.

libgcc/ChangeLog:

PR target/117581
* config/riscv/libgcc-riscv.ver: New file.
* config/riscv/t-elf (SHLIB_MAPFILES): Add libgcc-riscv.ver.
* config/riscv/t-softfp32 (softfp_extras): Add floatbitinttf and
fixtfbitint.

pr122458.c: Replace .quad with .dc.a

Replace .quad with .dc.a to avoid

/export/build/gnu/tools-build/gcc/build-x86_64-linux/gcc/xgcc -B/export/build/gnu/tools-build/gcc/build-x86_64-linux/gcc/ /export/gnu/import/git/sources/gcc/gcc/testsuite/gcc.dg/ipa/pr122458.c -m32 -fdiagnostics-plain-output -O2 -lm -o pr122458.exe
/usr/local/bin/as: /tmp/cc9Bw0pX.o: unsupported relocation type: 0x1
/tmp/ccGrIiOC.s: Assembler messages:
/tmp/ccGrIiOC.s:4: Error: cannot represent relocation type BFD_RELOC_64
compiler exited with status 1
FAIL: gcc.dg/ipa/pr122458.c (test for excess errors)

for 32-bit targets.

PR ipa/122458
* gcc.dg/ipa/pr122458.c: Replace .quad with .dc.a.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>

Add TARGET_MMX_WITH_SSE to the condition of all 64-bit _Float16 vector related patterns.

gcc/ChangeLog:

PR target/123484
* config/i386/mmx.md (divv4hf3): Add TARGET_MMX_WITH_SSE to
the condition.
(cmlav4hf4): Ditto.
(cmla_conjv4hf4): Ditto.
(cmulv4hf3): Ditto.
(cmul_conjv4hf3): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr123484.c: New test.

Daily bump.

match: Simplify `(T1)(a bit_op (T2)b)` to `((T1)a bit_op b)` When b is T1 type and truncating from T2 [PR122845]

This adds the simplification of:
```
  <unnamed-signed:3> _1;

  _2 = (signed char) _1;
  _3 = _2 ^ -47;
  _4 = (<unnamed-signed:3>) _3;
```

to:
```
  <unnamed-signed:3> _n;
  _4 = _1 ^ -47;
```
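
The arithmetic behind this can be sketched in C with fixed-width unsigned
types standing in for T1 (narrow) and T2 (wide). This is an illustration with
hypothetical function names, not the GIMPLE above; note that C's integer
promotions still widen the narrow operands to int, so the sketch only models
the values.

```c
#include <assert.h>
#include <stdint.h>

/* Original form: (T1)(a bit_op (T2)b), with T1 = uint8_t, T2 = uint32_t.  */
static uint8_t
widen_xor_truncate (uint32_t a, uint8_t b)
{
  return (uint8_t) (a ^ (uint32_t) b);
}

/* Simplified form: ((T1)a bit_op b).  */
static uint8_t
truncate_then_xor (uint32_t a, uint8_t b)
{
  return (uint8_t) ((uint8_t) a ^ b);
}
```

Bitwise operations act on each bit independently, so the bits dropped by the
truncation never influence the bits that are kept.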

This also fixes PR 122843 by optimizing out the xor such that we get:
```
  _1 = b.a;
  _21 = (<unnamed-signed:3>) t_23(D);
  // t_23 in the original testcase was 200 so this is reduced to 0
  _5 = _1 ^ _21;
  # .MEM_24 = VDEF <.MEM_13>
  b.a = _5;
```
And then, since there is no cast, this pattern catches it:
`(bit_xor (convert1? (bit_xor:c @0 @1)) (convert2? (bit_xor:c @0 @2)))`
As we get:
```
  _21 = (<unnamed-signed:3>) t_23(D);
  _5 = _1 ^ _21;
  _22 = (<unnamed-signed:3>) t_23(D);
  _7 = _5 ^ _22;
  _25 = (<unnamed-signed:3>) t_23(D);
  _8 = _7 ^ _25;
  _26 = (<unnamed-signed:3>) t_23(D);
  _9 = _7 ^ _26;
```
after unrolling, and then fre will optimize away all of those xors.

Bootstrapped and tested on x86_64-linux-gnu.

PR tree-optimization/122845
PR tree-optimization/122843
gcc/ChangeLog:

* match.pd (`(T1)(a bit_op (T2)b)`): Also
simplify if T1 is the same type as b and T2 is wider
type than T1.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/bitops-12.c: New test.
* gcc.dg/tree-ssa/bitops-13.c: New test.
* gcc.dg/store_merging_18.c: xfail store merging.

Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>

Fortran: Add additional checks for constant expressions.

PR fortran/91960

gcc/fortran/ChangeLog:

* resolve.cc (resolve_fl_parameter): Check the righthand symbol
is a constant expression.

gcc/testsuite/ChangeLog:

* gfortran.dg/pr69962.f90: Adjust testcase to ignore new error message.
* gfortran.dg/pr91960_1.f90: New test.
* gfortran.dg/pr91960_2.f90: New test.

c++: deferred noexcept parsing for friend tmpl spec [PR123189]

Since we now defer noexcept parsing for templated friends, a couple of
routines related to deferred parsing need to be updated to cope with friend
template specializations -- their TI_TEMPLATE is a TREE_LIST rather than
a TEMPLATE_DECL, and they don't introduce new template parameters.

PR c++/123189

gcc/cp/ChangeLog:

* name-lookup.cc (binding_to_template_parms_of_scope_p):
Gracefully handle TEMPLATE_INFO whose TI_TEMPLATE is a TREE_LIST.
* pt.cc (maybe_begin_member_template_processing): For a friend
template specialization consider its class context instead.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/noexcept92.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>

c++: tweak testcase for --stds=impcx

Implicit constexpr makes the use of x disappear, avoiding the exposure and
thus the diagnostic.

gcc/testsuite/ChangeLog:

* g++.dg/modules/internal-17_b.C: Add -fno-implicit-constexpr.

c++: more gnu_inline linkage adjustment

Since r16-6477 we allow a gnu_inline to be a key method, because it is only
emitted in one place. It occurs to me that we should make the same
adjustment to other places that check DECL_DECLARED_INLINE_P to decide if a
function has inline/vague/comdat linkage.

PR libstdc++/123326

gcc/cp/ChangeLog:

* cp-tree.h (DECL_NONGNU_INLINE_P): New.
* decl.cc (duplicate_decls, start_decl): Check it.
* decl2.cc (vague_linkage_p, import_export_class): Likewise.
(vtables_uniquely_emitted, import_export_decl): Likewise.
* class.cc (determine_key_method): Check it instead of
lookup_attribute.

tree-optimization/123528 - tighten bool pattern check

The following makes sure we're only applying bool patterns for
conversions to scalar integer or float types.

PR tree-optimization/123528
* tree-vect-patterns.cc (vect_recog_bool_pattern): Restore
INTEGRAL_TYPE_P check but also allow SCALAR_FLOAT_TYPE_P.

* gcc.dg/vect/vect-pr12358.c: New testcase.

libiberty: Make `objalloc_free' `free'-like WRT null pointer

Inspired by a suggestion from Jan Beulich to make one of `objalloc_free'
callers `free'-like with respect to null pointer argument handling, make
the function return with no action taken rather than crashing when such
a pointer is passed. This is to make the API consistent with ISO C and
to relieve all the callers from having to check for a null pointer.
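
The convention can be sketched as follows. The toy pool below is purely
hypothetical and is not libiberty's objalloc; it only shows a release
function that, like free, does nothing when handed a null pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical pool type, for illustration only.  */
struct toy_pool
{
  void *chunk;
};

/* Create a pool backed by SIZE bytes; returns NULL on failure.  */
static struct toy_pool *
toy_pool_create (size_t size)
{
  struct toy_pool *pool = malloc (sizeof *pool);
  if (pool == NULL)
    return NULL;
  pool->chunk = malloc (size);
  if (pool->chunk == NULL)
    {
      free (pool);
      return NULL;
    }
  return pool;
}

/* Release POOL and its storage.  Like free, a null POOL is a no-op,
   so callers need not check before calling.  */
static void
toy_pool_free (struct toy_pool *pool)
{
  if (pool == NULL)
    return;
  free (pool->chunk);
  free (pool);
}
```

With this shape, cleanup paths can call the release function
unconditionally, even when creation failed earlier.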

libiberty/
* objalloc.c (objalloc_free): Don't use the pointer passed if
null.

aarch64: Fix copyright year.

Initial patch mistakenly added copyright year 2025.

gcc/testsuite/Changelog:

* gcc.target/aarch64/pch/aarch64-pch.exp: Fix copyright year.

aarch64: Add tests checking use of arm_sve.h et al in a pch [PR123457]

These tests check that including files using aarch64 specific pragmas in
headers that become precompiled headers works.

Built and tested for aarch64-linux-gnu on top of Andrew's patch.

gcc/testsuite/Changelog:
PR target/123457
* gcc.target/aarch64/pch/aarch64-pch.exp: Add new testsuite.
* gcc.target/aarch64/pch/pch_arm_acle.c: Add new test file.
* gcc.target/aarch64/pch/pch_arm_acle.hs: Likewise.
* gcc.target/aarch64/pch/pch_arm_acle_include_post.c: Likewise.
* gcc.target/aarch64/pch/pch_arm_acle_include_post.hs: Likewise.
* gcc.target/aarch64/pch/pch_arm_multiple.c: Likewise.
* gcc.target/aarch64/pch/pch_arm_multiple.hs: Likewise.
* gcc.target/aarch64/pch/pch_arm_multiple_include_post.c: Likewise.
* gcc.target/aarch64/pch/pch_arm_multiple_include_post.hs: Likewise.
* gcc.target/aarch64/pch/pch_arm_neon.c: Likewise.
* gcc.target/aarch64/pch/pch_arm_neon.hs: Likewise.
* gcc.target/aarch64/pch/pch_arm_neon_include_post.c: Likewise.
* gcc.target/aarch64/pch/pch_arm_neon_include_post.hs: Likewise.
* gcc.target/aarch64/pch/pch_arm_neon_sve_bridge.c: Likewise.
* gcc.target/aarch64/pch/pch_arm_neon_sve_bridge.hs: Likewise.
* gcc.target/aarch64/pch/pch_arm_neon_sve_bridge_include_post.c: Likewise.
* gcc.target/aarch64/pch/pch_arm_neon_sve_bridge_include_post.hs: Likewise.
* gcc.target/aarch64/pch/pch_arm_sme.c: Likewise.
* gcc.target/aarch64/pch/pch_arm_sme.hs: Likewise.
* gcc.target/aarch64/pch/pch_arm_sme_include_post.c: Likewise.
* gcc.target/aarch64/pch/pch_arm_sme_include_post.hs: Likewise.
* gcc.target/aarch64/pch/pch_arm_sve.c: Likewise.
* gcc.target/aarch64/pch/pch_arm_sve.hs: Likewise.
* gcc.target/aarch64/pch/pch_arm_sve_include_post.c: Likewise.
* gcc.target/aarch64/pch/pch_arm_sve_include_post.hs: Likewise.

ipa-cp: Fix ipa-bit-cp test for recipient_only lattices

Unfortunately I made a silly copy-and-paste error in my patch
introducing the recipient_only flag.  This patch fixes it, correctly
bailing out in ipa-bit-cp when it is set during propagation.

gcc/ChangeLog:

2026-01-12  Martin Jambor  <mjambor@suse.cz>

PR ipa/123543
* ipa-cp.cc (propagate_bits_across_jump_function): Fix test for
recipient_only_p.

gcc/testsuite/ChangeLog:

2026-01-12  Martin Jambor  <mjambor@suse.cz>

PR ipa/123543
* gcc.dg/ipa/pr123543.c: New test.

aarch64: Update target checks for sme2 fp8

Commits gcc-16-6381-g226d5fd59dc8 and gcc-16-6380-gef533d234293 had
insufficient target checks and this caused regressions on the linaro CI
which uses an older binutils version.

This change adds the needed checks.

gcc/testsuite/ChangeLog:
* gcc.target/aarch64/sme2/acle-asm/cvt_mf8_bf16_x2.c: Added target checks.
* gcc.target/aarch64/sme2/acle-asm/cvt_mf8_f16_x2.c: Likewise.
* gcc.target/aarch64/sme2/acle-asm/cvt_mf8_f32_x4.c: Likewise.
* gcc.target/aarch64/sme2/acle-asm/cvtn_mf8_f32_x4.c: Likewise.
* gcc.target/aarch64/sme2/acle-asm/scale_f16_x2.c: Likewise.
* gcc.target/aarch64/sme2/acle-asm/scale_f16_x4.c: Likewise.
* gcc.target/aarch64/sme2/acle-asm/scale_f32_x2.c: Likewise.
* gcc.target/aarch64/sme2/acle-asm/scale_f32_x4.c: Likewise.
* gcc.target/aarch64/sme2/acle-asm/scale_f64_x2.c: Likewise.
* gcc.target/aarch64/sme2/acle-asm/scale_f64_x4.c: Likewise.

Bump BASE-VER to 16.0.1 now that we are in stage4.

* BASE-VER: Bump to 16.0.1.

s390: Fix ABI issue in libstdc++.so.6

On Sat, Jan 10, 2026 at 05:24:15PM +0100, Stefan Schulze Frielinghaus wrote:
> libstdc++-v3/ChangeLog:
>
>       * config/abi/post/s390x-linux-gnu/baseline_symbols.txt: Add
>       names {,P,K}DF16.

This is wrong - an ABI issue.

You can't export new symbols in CXXABI_1.3.14 symbol version when they
weren't exported there in GCC 13.1 already.
Symbols new in GCC 16 like these should be exported in CXXABI_1.3.17.

Fixed thusly.

2026-01-12  Jakub Jelinek  <jakub@redhat.com>

* config/abi/pre/gnu.ver (CXXABI_1.3.14): Don't export _ZTI*DF16_ on
s390x.
(CXXABI_1.3.17): Export _ZTI*DF16_ on s390x.
* config/abi/post/s390x-linux-gnu/baseline_symbols.txt: Remove
_ZTI{,P,K}DF16_.

tree-optimization/122830 - move VN through aggregate copies

The following generalizes the few hacks we have to more loosely
allow VN through aggregate copies into a more general (but also
restrictive) feature that rewrites the lookup to a new base with
a constant offset.  This should now allow all constant-indexed
aggregate copies, and it never leaves any stray components behind
while hoping for the best.

This resolves the diagnostic regression reported in PR122824.
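
For readers unfamiliar with the term, the following is a sketch of the kind of source pattern a constant-indexed aggregate copy refers to.  The struct and function names are hypothetical, and nothing here claims which GCC version or option set actually forwards the value; it only shows a load from a copy that could be looked up in the original object at a constant offset.

```cpp
#include <cassert>

// Hypothetical aggregate with two array members.
struct pair_of_arrays {
  int a[4];
  int b[4];
};

int read_through_copy(int v) {
  pair_of_arrays src{};      // zero-initialized source
  src.b[2] = v;              // store at a constant index
  pair_of_arrays dst = src;  // aggregate copy
  return dst.b[2];           // load from the copy at the same constant offset
}

// Small usage example.
void read_through_copy_demo() {
  assert(read_through_copy(7) == 7);
}
```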

PR tree-optimization/122830
PR tree-optimization/122824
* tree-ssa-sccvn.cc (vn_reference_lookup_3): Generalize
aggregate copy handling when no variable offsets are
involved.

* gcc.dg/tree-ssa/ssa-fre-112.c: New testcase.
* g++.dg/warn/Warray-bounds-pr122824.C: Likewise.

Fix extra_off mis-computation during aggregate copy VN

With the rewrite of aggregate copy handling in r16-2729-g0d276cd378e7a4
there's an error introduced which accumulates extra_off even if we
throw away some of the tentative component consumption. The following
fixes this.

* tree-ssa-sccvn.cc (vn_reference_lookup_3): Only tentatively
accumulate extra_off when tentatively consuming components
during aggregate copy handling.

libstdc++: Stop using some reserved names in src/c++20/atomic.cc

libstdc++-v3/ChangeLog:

* src/c++20/atomic.cc (__detail::__spin_impl): Do not use
reserved names for variables.

libstdc++: Improve comments on __wait_args::_M_setup_proxy_wait

libstdc++-v3/ChangeLog:

* include/bits/atomic_wait.h (__wait_args): Improve comments.
* src/c++20/atomic.cc (__wait_args::_M_setup_proxy_wait):
Improve comment.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>

libstdc++: Fix generate_canonical test for 128bit floating points.

This updates test01 so that it properly handles 128bit floating points,
including the situation where long double uses such a representation.
Firstly, the computation of skips is corrected by discarding a number
of values equal to the number of calls required to generate an element
(skips now correctly become zero for all non-float types).  Furthermore,
the histogram checks for types using the iec559 representation are moved
inside the test01 function, so we use the correct value for long double,
depending on the number of mantissa digits on the given platform.

We also extend the test to cover __float128, to test 128bit floating
point on more platforms.

libstdc++-v3/ChangeLog:

* testsuite/26_numerics/random/uniform_real_distribution/operators/gencanon.cc:
Updated test.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>

middle-end/123175 - fix parts of const VEC_PERM with relaxed input sizes

The following fixes enough of const VEC_PERM folding and lowering
to deal with the fallout for the two testcases from the PR. We
usually do not generate such problematic VEC_PERM expressions, but
we allow those since GCC 14. As can be seen we mishandle those,
including failure to expand/lower them by zero-extending inputs (which is
what __builtin_shufflevector does).

I'm unsure as to what extent we get such permutes but Tamar indicates
that aarch64 can handle those at least.

PR middle-end/123175
* match.pd (vec_perm @0 @1 @2): Fixup for inputs having a
different number of elements than the result.
* tree-vect-generic.cc (lower_vec_perm): Likewise.

* gcc.dg/torture/pr123175-1.c: New testcase.
* gcc.dg/torture/pr123175-2.c: Likewise.

libgomp: Skip libgomp.c++/target-cdtor-2.C on Solaris [PR81337]

The libgomp.c++/target-cdtor-2.C test FAILs on Solaris:

FAIL: libgomp.c++/target-cdtor-2.C output pattern test

Compared to the Linux output

~S, 5, 1
[...]
finiDH1, 1

the Solaris output has a different order:

finiDH1, 1
[...]
~S, 5, 1

This is another instance of the long-standing PR c++/81337.  As detailed
there, the relative order of ~S::S() and __attribute__((destructor()))
functions isn't guaranteed.  Since xfail'ing the dg-output parts isn't
practical, this patch skips the whole test on Solaris.

Tested on i386-pc-solaris2.11 and x86_64-pc-linux-gnu.

2025-12-16  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

libgomp:
PR c++/81337
* testsuite/libgomp.c++/target-cdtor-2.C: Skip on Solaris.
Fix comments.

c++: Improve diagnostic for implicit conversion errors [PR115163]

This patch adds a note to indicate whether any viable explicit conversion
functions were skipped when an implicit conversion fails.
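
To make the situation concrete, here is a small sketch of a type whose only conversion function is explicit.  The Meters type is hypothetical, and the exact wording of the new note is not reproduced here.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical type for illustration only.
struct Meters {
  double value;
  explicit operator double() const { return value; }
};

// Implicit conversion is rejected: the only conversion function is explicit.
static_assert(!std::is_convertible<Meters, double>::value,
              "explicit conversion functions are skipped for implicit conversions");
// Direct-initialization may still use the explicit conversion function.
static_assert(std::is_constructible<double, Meters>::value,
              "direct-initialization considers explicit conversion functions");

double to_double(const Meters &m) {
  // double d = m;  // ill-formed: the kind of error the diagnostic concerns
  return static_cast<double>(m);
}

// Small usage example.
void to_double_demo() {
  assert(to_double(Meters{2.5}) == 2.5);
}
```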

Perhaps the base diagnostic in ocp_convert can be further improved for
class types as well, as the current message is not very clear, but I've
not looked into that for this patch.

PR c++/115163

gcc/cp/ChangeLog:

* call.cc (implicit_conversion_error): Add flags argument, call
maybe_show_nonconverting_candidate.
(build_converted_constant_expr_internal): Pass flags to
implicit_conversion_error.
(perform_implicit_conversion_flags): Likewise.
* cvt.cc (ocp_convert): Call maybe_show_nonconverting_candidate
on conversion error.

gcc/testsuite/ChangeLog:

* g++.dg/ext/is_convertible7.C: Add new testcases.
* g++.dg/diagnostic/explicit2.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>

simplify-rtx: Fix up shift/rotate VOIDmode count handling [PR123523]

The following testcase ICEs on i686-linux, because the HW in that
case implements the shift as shifting by 64-bit count (anything larger
or equal to number of bits in the first operand's element results
in 0 or sign copies), so the machine description implements it as
such as well.
Now, because shifts/rotates can have different modes on the first
and second operand, when the second one has VOIDmode (i.e. CONST_INT,
I think CONST_WIDE_INT has non-VOIDmode and CONST_DOUBLE with VOIDmode
is hopefully very rarely used), we need to choose some mode for the
wide_int conversion. And so far we've been choosing BITS_PER_WORD/word_mode
or the mode of the first operand's element, whichever is wider.
That works fine on 64-bit targets, as CONST_INT always has at most 64 bits,
but on 32-bit targets it uses SImode.

Because HOST_BITS_PER_WIDE_INT is always 64, the following patch just
uses that plus DImode instead of BITS_PER_WORD and word_mode.
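
The effect of the choice can be sketched with a simplified model.  The helper names below are hypothetical and this is not the simplify-rtx.cc code; it only models "pick the wider of the element precision and some minimum" and checks whether a 64-bit constant is representable in the chosen precision.

```cpp
#include <cassert>
#include <cstdint>

// Old rule (simplified): the wider of the element precision and BITS_PER_WORD.
unsigned count_precision_old(unsigned elem_bits, unsigned bits_per_word) {
  return elem_bits > bits_per_word ? elem_bits : bits_per_word;
}

// New rule (simplified): the wider of the element precision and
// HOST_BITS_PER_WIDE_INT, which the commit message states is always 64.
unsigned count_precision_new(unsigned elem_bits) {
  const unsigned host_bits_per_wide_int = 64;
  return elem_bits > host_bits_per_wide_int ? elem_bits : host_bits_per_wide_int;
}

// Does VALUE fit as a sign-extended constant of PRECISION bits (1..64)?
bool fits_signed(std::int64_t value, unsigned precision) {
  if (precision >= 64)
    return true;
  const std::int64_t lo = -(std::int64_t{1} << (precision - 1));
  const std::int64_t hi = (std::int64_t{1} << (precision - 1)) - 1;
  return value >= lo && value <= hi;
}

// Small usage example: a count of 2^32 on a target with 32-bit words.
void count_precision_demo() {
  const std::int64_t count = std::int64_t{1} << 32;
  assert(!fits_signed(count, count_precision_old(32, 32)));
  assert(fits_signed(count, count_precision_new(32)));
}
```

Under this model a count that needs more than 32 bits is not representable with the old choice on a 32-bit target, while the 64-bit choice always covers what a CONST_INT can hold.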

2026-01-12 Jakub Jelinek <jakub@redhat.com>

PR rtl-optimization/123523
* simplify-rtx.cc (simplify_const_binary_operation): Use
DImode for VOIDmode shift and truncation counts if int_mode
is narrower than HOST_BITS_PER_WIDE_INT rather than
word_mode if int_mode is narrower than BITS_PER_WORD.

* gcc.target/i386/pr123523.c: New test.

c++: Remove gnu::gnu_inline attribute on inheriting ctors [PR123526]

The recent addition of gnu::gnu_inline attributes to some C++26 constexpr
methods broke classes which inherit e.g. from std::logic_error or other
C++26 classes with gnu::gnu_inline constructors and use inheriting
constructors.  On std::logic_error etc. it has the desired effect that
the ctor itself can be constexpr evaluated and even inlined, but is not
emitted in each TU that needs it and didn't inline it, but is still
contained in libstdc++.{a,so.6}.
Unfortunately inheriting ctors inherit also attributes of the corresponding
ctors except those that clone_attrs filter out and that includes the
gnu_inline attribute if explicitly specified on the base class ctor.
That has the undesirable effect that the implementation detail of e.g.
the std::logic_error class leaks into the behavior of a class that inherits
from it if it is using inheriting constructors, those will result in
undefined symbols for the inheriting constructors if they aren't inlined,
unless one also inherits from it in some TU without gnu_inline there (e.g.
one compiled with -std=c++23 or earlier).

So, the following patch fixes it by removing the gnu::gnu_inline attribute
from the inheriting constructor.  Not done in clone_attrs because that
function is also used for the normal constructor cloning and in that case
we do want to clone those attributes.
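
The user-side pattern affected looks roughly like the following sketch.  The config_error class is hypothetical; per the description above, classes of this shape could end up with undefined symbols for the inheriting constructors when the base class constructors carry gnu::gnu_inline.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical user class that reuses std::logic_error's constructors.
struct config_error : std::logic_error {
  using std::logic_error::logic_error;  // inheriting constructors
};

// Construct the derived exception through an inherited constructor.
std::string describe(const std::string &what) {
  config_error e(what);
  return e.what();
}

// Small usage example.
void describe_demo() {
  assert(describe("bad key") == "bad key");
}
```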

2026-01-12  Jakub Jelinek  <jakub@redhat.com>

PR c++/123526
* method.cc: Include attribs.h.
(implicitly_declare_fn): Remove gnu::gnu_inline attribute.

* g++.dg/ext/gnu-inline-inh-ctor1.C: New test.
* g++.dg/ext/gnu-inline-inh-ctor2.C: New test.

testsuite: Remove lp64 requirement from gcc.target/i386/pr123121.c [PR123121]

The test gcc.target/i386/pr123121.c does not rely on LP64-specific
behavior. Drop the dg-require-effective-target lp64 directive so the
test can run on 32-bit i386 targets as well.

PR rtl-optimization/123121

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr123121.c: Remove dg-require-effective-target lp64.

testsuite: i386: Disable AVX512BW/DQ tests with Solaris/x86 as [PR123415]

Several AVX512BW and AVX512DQ tests FAIL on Solaris/x86 with the native
assembler.  As detailed in the PR, this is for two reasons:

* Due to a misunderstanding, %k0 isn't accepted as source or destination
  register of some insns.

* {sae} is considered implicit for some insns, so specifying it
  explicitly was deemed unnecessary.

It's unclear if and when this will be fixed, so avx512bw and avx512dq
tests are disabled for now.

Tested on i386-pc-solaris2.11 and x86_64-pc-linux-gnu.

2025-12-23  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

gcc/testsuite:
PR target/123415
* lib/target-supports.exp (check_effective_target_avx512dq):
Disable with Solaris/x86 as.
(check_effective_target_avx512bw): Likewise.

[rs6000] [testsuite] Fix test-frame-related.c [PR123129]

The testcase test-frame-related.c fails in 32-bit mode due to
constraints not matching. Use -mpowerpc64 option to ensure that the
testcase works with -m32.

gcc/testsuite:
PR testsuite/123129
* gcc.dg/rtl/powerpc/test-frame-related.c: Add -mpowerpc64.

AutoFDO: Fix missing null-pointer check in offline_unrealized_inlines

This was a trivial check that was missing and was causing ICEs due to
segmentation faults in some tests.

Bootstrapped and regtested on aarch64-linux-gnu.

Signed-off-by: Dhruv Chawla <dhruvc@nvidia.com>
gcc/ChangeLog:

* auto-profile.cc (autofdo_source_profile::offline_unrealized_inlines):
Add missing check for in_map.

testsuite: Disable vector-compare-1.C for arm targets [PR121752]

So arm is a bit special: non_strict_align is sometimes true but
it does not represent the true value of STRICT_ALIGN inside the compiler,
so this testcase fails.  This disables the testcase for arm targets, where
STRICT_ALIGN is always true even when there are unaligned loads.

Pushed as obvious after testing on x86_64 and arm-eabi (with -march=armv7) to make
sure the testcase no longer runs on arm.

PR testsuite/121752
gcc/testsuite/ChangeLog:

* g++.dg/tree-ssa/vector-compare-1.C: Disable for arm targets.

Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>

Fortran: Test cases from previously fixed bug

Adding two testcases from Gerhard Steinmetz from 2016-08-30.
These have had the dejagnu directives added.  The last comment
in the PR, from Andrew Pinski, notes that the PR was fixed in the 9.3,
10+ timeframe.  The testcases are small.  Committing the tests to
ensure things are not broken in the future.

PR fortran/77415

gcc/testsuite/ChangeLog:

* gfortran.dg/pr77415_1.f90: New test.
* gfortran.dg/pr77415_2.f90: New test.

libga68: Make it possible to debug the GC

If GC_DEBUG is defined then all-upper-case macros will expand to calls
to the debug variant of collector functions.

So add the configury bit to define GC_DEBUG if the user wants and
switch all `GC_` calls to the corresponding macros.
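
The macro pattern can be sketched as follows.  All names here (TOY_GC_DEBUG, TOY_GC_MALLOC, toy_gc_malloc, toy_gc_debug_malloc) are hypothetical stand-ins that mimic the collector's convention; they are not the collector's API, and the toy allocators simply forward to malloc.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

static int debug_calls = 0;  // counts calls routed to the debug variant

// Plain variant (toy: forwards to malloc, no garbage collection).
void *toy_gc_malloc(std::size_t n) { return std::malloc(n); }

// Debug variant: additionally receives the call site.
void *toy_gc_debug_malloc(std::size_t n, const char *file, int line) {
  (void) file;
  (void) line;
  ++debug_calls;
  return std::malloc(n);
}

// The all-upper-case macro selects the variant at compile time.
#ifdef TOY_GC_DEBUG
#define TOY_GC_MALLOC(n) toy_gc_debug_malloc ((n), __FILE__, __LINE__)
#else
#define TOY_GC_MALLOC(n) toy_gc_malloc (n)
#endif

// Code written against the macro picks up the debug variant automatically.
void *allocate_cell(std::size_t n) { return TOY_GC_MALLOC (n); }

// Small usage example (built here without TOY_GC_DEBUG).
void allocate_cell_demo() {
  void *p = allocate_cell(8);
  assert(p != nullptr);
  std::free(p);
}
```

Code that calls the lower-case function directly bypasses the switch, which is why routing every call through the macro matters.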

libga68/ChangeLog:

* configure: Regenerate.
* configure.ac: Add --enable-algol68-gc-debug option and
define GC_DEBUG accordingly.
* ga68-alloc.c (_libga68_realloc): Use the C macro version of
the GC function.
(_libga68_realloc_unchecked): Likewise.
(_libga68_malloc): Likewise.

Signed-off-by: Pietro Monteiro <pietro@sociotechnical.xyz>

Daily bump.

lto: Fix SegFault in ICF caused by missing body

During LTO symbol merging, weak symbols may be resolved to an external
definition.
We reset the symbol, so the body might be released in the unreachability
pass.  But we didn't mark the symbol with body_removed, so ICF assumed
the body was still there, causing a segfault.

gcc/lto/ChangeLog:

* lto-symtab.cc (lto_symtab_merge_symbols): Set body_removed
for symbols resolved outside of IR.

gcc/testsuite/ChangeLog:

* gcc.dg/lto/attr-weakref-2_0.c: New test.
* gcc.dg/lto/attr-weakref-2_1.c: New test.

lto: Add toplevel simple assembly heuristics

This new pass heuristically detects symbols referenced by toplevel
assembly to prevent their optimization.

The heuristic works by comparing identifiers in the assembly to known
symbols.
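
A minimal sketch of such an identifier comparison is shown below.  The function name and the tokenization rule (letters, digits, '_' and '$') are assumptions made for illustration; the actual pass in asm-toplevel.cc may tokenize differently.

```cpp
#include <cassert>
#include <cctype>
#include <set>
#include <string>

// Return the known symbols whose names appear as whole identifiers in ASM_TEXT.
std::set<std::string>
symbols_mentioned_in_asm(const std::string &asm_text,
                         const std::set<std::string> &known) {
  std::set<std::string> hits;
  std::string ident;
  // Finish the current identifier and record it if it names a known symbol.
  auto flush = [&] {
    if (!ident.empty() && known.count(ident) != 0)
      hits.insert(ident);
    ident.clear();
  };
  for (unsigned char c : asm_text) {
    const bool ident_char = std::isalnum(c) != 0 || c == '_' || c == '$';
    if (ident_char)
      ident.push_back(static_cast<char>(c));
    else
      flush();
  }
  flush();
  return hits;
}

// Small usage example.
void symbols_mentioned_demo() {
  const std::set<std::string> known = {"foo", "bar", "baz"};
  const auto hits = symbols_mentioned_in_asm(".globl foo\nfoo:\n\tcall bar\n", known);
  assert(hits == (std::set<std::string>{"bar", "foo"}));
}
```

Being a heuristic, a match only says that the name occurs in the assembly text; it does not prove that the assembly really references the symbol.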

The pass is split into 2 passes, in LGEN and in WPA.
There must be one pass for WPA to be able to reference any symbol.
However in WPA there may be multiple symbols with the same name,
so we handle those local symbols in LGEN.

gcc/ChangeLog:

* asm-toplevel.cc (mark_fragile_ref_by_asm):
Add marked_local to handle symbol as local.
(ipa_asm_heuristics): New.
(class pass_ipa_asm): New.
(make_pass_ipa_asm_lgen): New.
(make_pass_ipa_asm_wpa): New.
* common.opt: New flto-toplevel-asm-heuristics.
* passes.def: New asm passes.
* timevar.def (TV_IPA_LTO_ASM): New.
* tree-pass.h (make_pass_ipa_asm_lgen): New.
(make_pass_ipa_asm_wpa): New.

gcc/testsuite/ChangeLog:

* gcc.dg/lto/toplevel-simple-asm-1_0.c: New test.
* gcc.dg/lto/toplevel-simple-asm-1_1.c: New test.
* gcc.dg/lto/toplevel-simple-asm-2_0.c: New test.
* gcc.dg/lto/toplevel-simple-asm-2_1.c: New test.

lto: Allow other partitionings for toplevel assembly

For balanced and max partitioning this adds proper partitioning of asm
and related symbols.

The special symbols are partitioned with 1to1 and joined together if
there is no name conflict. All other symbols are partitioned with the
requested partitioning.
In typical usage, with a small number of toplevel assembly statements and no
name conflicts, all special symbols will be in the single first partition.
Balanced partitioning will continue filling the last asm partition.

gcc/lto/ChangeLog:

* lto-partition.cc (join_partitions): Declare.
(lto_1_to_1_map): Split out to..
(map_1_to_1): ..here.
(create_asm_partition): Replaced by..
(create_asm_partitions): ..this.
(lto_max_map): Use new create_asm_partitions.
(lto_balanced_map): Use new create_asm_partitions.

gcc/testsuite/ChangeLog:

* gcc.dg/lto/toplevel-extended-asm-2_0.c: More partitionings.
* gcc.dg/lto/toplevel-extended-asm-2_1.c: Likewise.

lto: Handle .local symbols in toplevel extended assembly

.local symbols cannot become global, so we have to use must_remain_in_tu.

There is no way to mark declaration as both external and static/.local
in C. So we have to disable the implicit definition of static variables.
Also .local asm function still produces "used but never defined" warning.

gcc/ChangeLog:

* asm-toplevel.cc (mark_fragile_ref_by_asm): New.
(struct constraint_data): New.
(walk_through_constraints): Handle .local definitions.
(analyze_toplevel_extended_asm): Propagate constraint_data.

gcc/testsuite/ChangeLog:

* gcc.dg/lto/toplevel-extended-asm-2_0.c: New test.
* gcc.dg/lto/toplevel-extended-asm-2_1.c: New test.
* gcc.dg/lto/toplevel-extended-asm-3_0.c: New test.
* gcc.dg/lto/toplevel-extended-asm-3_1.c: New test.

lto: Add must_remain_in_tu flags to symtab_node

With toplevel assembly we are sometimes not allowed to globalize static
symbols. So such symbols cannot be in more than one partition.

must_remain_in_tu_* guarantee that such symbols or references to them do
not escape the original translation unit. Thus 1to1 partitioning is always
valid.

gcc/ChangeLog:

* cgraph.h: Add must_remain_in_tu_*.
* cgraphclones.cc (cgraph_node::create_clone): Propagate
must_remain_in_tu_body.
* cif-code.def (MUST_REMAIN_IN_TU): New.
* ipa-icf.cc (sem_function::equals_wpa): Check
must_remain_in_tu_*
(sem_variable::equals_wpa): Likewise.
* ipa-inline-transform.cc (inline_call): Propagate
must_remain_in_tu_body.
* ipa-inline.cc (can_inline_edge_p): Check
must_remain_in_tu_body.
* lto-cgraph.cc (lto_output_node): Output must_remain_in_tu_*
(lto_output_varpool_node): Likewise.
(input_overwrite_node): Input must_remain_in_tu_*.
(input_varpool_node): Likewise.
* tree.cc (decl_address_ip_invariant_p): Check
must_remain_in_tu_name.
* varpool.cc (varpool_node::ctor_useable_for_folding_p): Check
must_remain_in_tu_body.

gcc/lto/ChangeLog:

* lto-symtab.cc (lto_cgraph_replace_node): Propagate
must_remain_in_tu_*.
(lto_varpool_replace_node): Likewise.

lto: Compute partition boundary with asm_nodes

Previous patch added asm_node streaming, so we need to add referenced
symbols to partition.

asm_nodes must be added to partition before computing the boundary.

gcc/ChangeLog:

* lto-cgraph.cc (compute_ltrans_boundary): Add symbols
referenced from asm_nodes.
* lto-streamer-out.cc (lto_output): Move adding asm_nodes
to...
* passes.cc (ipa_write_summaries): ...here.

gcc/testsuite/ChangeLog:

* gcc.dg/lto/toplevel-extended-asm-1_0.c: New test.
* gcc.dg/lto/toplevel-extended-asm-1_1.c: New test.

lto: Stream toplevel extended assembly

Streaming of toplevel extended assembly was missing implementation.

Streaming must be after merging of decls, otherwise we would have to
fix the pointers to new decls.

gcc/ChangeLog:

* ipa-free-lang-data.cc (find_decls_types_in_asm): New.
(free_lang_data_in_cgraph): Use find_decls_types_in_asm.
* lto-cgraph.cc (input_cgraph_1): Move asm to..
(input_toplevel_asms): ..here.
* lto-streamer-in.cc (lto_input_toplevel_asms):
Allow extended asm.
* lto-streamer-out.cc (lto_output_toplevel_asms):
Allow extended asm.
(lto_output_toplevel_asms): Allow ASM_EXPR.
* lto-streamer.h (input_toplevel_asms): New.

gcc/lto/ChangeLog:

* lto-common.cc (read_cgraph_and_symbols): Call
input_toplevel_asms after decl merging.

gcc/testsuite/ChangeLog:

* g++.dg/lto/toplevel_asm-0_0.C: New test.

ipa: Analyze toplevel extended assembly

Analyzes references from toplevel extended assembly.

We cannot perform IPA optimizations with toplevel assembly, so
symtab_node only needs ref_by_asm to know that it should not be removed.

PR ipa/122458

gcc/ChangeLog:

* Makefile.in: Add new file.
* cgraph.h (analyze_toplevel_extended_asm): New.
* cgraphunit.cc (symbol_table::finalize_compilation_unit):
Call analyze_toplevel_extended_asm.
* asm-toplevel.cc: New file.

gcc/lto/ChangeLog:

* lto-common.cc (read_cgraph_and_symbols):
Call analyze_toplevel_extended_asm.

gcc/testsuite/ChangeLog:

* gcc.dg/ipa/pr122458.c: New test.

ipa: Add flag ref_by_asm to symtab_node

ref_by_asm will be used by toplevel assembly to mark symbols that cannot
be removed.
It largely overlaps with force_output.  The main difference is that ref_by_asm
is meaningful on declarations, where it prevents their removal.  force_output on
a declaration is ignored, which cannot be easily changed, since several
places depend on this behavior.

Global ref_by_asm should not be localized, because they cannot benefit
from it. It would only result in complications, for example by renaming
the symbol.

Notes on different solutions in unreachability analysis:
First unreachability analysis is done in analyze_functions. Marking
ref_by_asm declarations as needed from the start would require reprocessing,
because some declarations may gain definition during the analysis.
Since at this point declarations do not need adding any other symbol,
we can check for ref_by_asm at the end, next to referred_to_p check.

Second unreachability analysis is in remove_unreachable_nodes. Here
declarations (or symbols in_other_partition) may require an alias.
So there we need to add the declarations from the start.

gcc/ChangeLog:

* cgraph.cc (cgraph_node_cannot_be_local_p_1): Check ref_by_asm.
(cgraph_node::verify_node): Likewise.
* cgraph.h (cgraph_node::only_called_directly_or_aliased_p):
Likewise.
(cgraph_node::can_remove_if_no_direct_calls_and_refs_p):
Likewise.
(varpool_node::can_remove_if_no_refs_p): Likewise.
(varpool_node::all_refs_explicit_p): Likewise.
* cgraphunit.cc (symtab_node::needed_p): Likewise.
(analyze_functions): Likewise.
* gimple-ssa-pta-constraints.cc (refered_from_nonlocal_fn):
Likewise.
(refered_from_nonlocal_var): Likewise.
(ipa_create_global_variable_infos): Likewise.
* ipa-comdats.cc (ipa_comdats): Likewise.
* ipa-visibility.cc (cgraph_externally_visible_p): Likewise.
(varpool_node::externally_visible_p): Likewise.
* ipa.cc (symbol_table::remove_unreachable_nodes): Likewise.
* lto-cgraph.cc (lto_output_node): Output ref_by_asm.
(lto_output_varpool_node): Likewise.
(input_overwrite_node): Input ref_by_asm.
(input_varpool_node): Likewise.
* symtab.cc (address_matters_1): Check ref_by_asm.

gcc/lto/ChangeLog:

* lto-symtab.cc (lto_cgraph_replace_node): Propagate ref_by_asm.
(lto_varpool_replace_node): Propagate ref_by_asm.

CRIS: Handle POST_INC in cris_rtx_costs

POST_INC is a code that's only supposed to be valid in an address, so
it should only be calculated through the TARGET_ADDRESS_COST hook, not
by the TARGET_RTX_COSTS hook.  But, because rtx_cost does not
special-case MEM costs by calling TARGET_ADDRESS_COST, we get here as
part of e.g. the auto-inc-dec and combine passes, so deal with it for
the time being.  Without this, the cost is the value of size_factor *
COSTS_N_INSNS (1), i.e. 4 per word.  There's no obvious observable
effect for generated code (coremark, libgcc and newlib-libc checked
for -march=v10), but it may make a difference in the future, so be
safe and correct the cost.

Tested at r16-6493-ge77ba7ef8c75 for cris-elf.  That the cost actually
is changed is most simply observed by applying -dp when compiling
int incref(int n, char *p)
{
   int sum = 0;

   while (n--)
     sum += *p++;

   return sum;
}
and seeing that the cost for the single autoincrement is changed from e.g.
adds.b [$r11+],$r10 ;# 15 [c=12 l=2]  *addsqisi_swap/1
to
adds.b [$r11+],$r10 ;# 15 [c=8 l=2]  *addsqisi_swap/1
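
The arithmetic behind the two -dp figures can be sketched as follows.  The function names are hypothetical, and two things are assumptions rather than facts from the port: a size_factor of 1 for this insn, and that the rest of the insn costs 8 both before and after the change.

```cpp
// Scale described above: COSTS_N_INSNS (1) corresponds to 4.
constexpr int costs_n_insns(int n) { return n * 4; }

// Cost formerly charged for POST_INC: size_factor * COSTS_N_INSNS (1).
constexpr int old_post_inc_cost(int size_factor) {
  return size_factor * costs_n_insns(1);
}

// After the change POST_INC is handled as an operator without cost.
constexpr int new_post_inc_cost() { return 0; }

// Assumption: total cost is the rest of the insn plus the POST_INC part.
constexpr int insn_cost(int rest_of_insn, int post_inc) {
  return rest_of_insn + post_inc;
}

static_assert(insn_cost(8, old_post_inc_cost(1)) == 12, "matches c=12 before");
static_assert(insn_cost(8, new_post_inc_cost()) == 8, "matches c=8 after");
```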

gcc:
* config/cris/cris.cc (cris_rtx_costs) <POST_INC>: Handle POST_INC
as ZERO_EXTEND and SIGN_EXTEND, i.e. as an operator without cost.

remove inclusion of tickLib.h from gthr-vxworks.h

This header is not used any more and its inclusion is problematic
when building against Helix Cert as it might end up dragging LLVM-specific
headers from spinLockLib.h.

libgcc/
* config/gthr-vxworks.h: Remove #include of tickLib.h.

Fortran: [PR123012] Silent acceptance of unquoted character items

PR libfortran/123012

libgfortran/ChangeLog:

* io/list_read.c (read_character): Add new check when no
quote is provided and the character string is digits only.

gcc/testsuite/ChangeLog:

* gfortran.dg/namelist_100.f90: New test.

Daily bump.

Fix regression on mcore-elf port after recent switch conversion change

Filip's recent change to re-enable switch conversion at -Og triggered a
regression on the mcore-elf target.

If we look at tree-switch-conversion.cc we have this:

  if (flag_pic)
    return false;

The mcore-elf port defines a dummy ASM_OUTPUT_ADDR_DIFF_ELT which is designed
to trigger an assembler syntax error and thus fail loudly.  That definition
comes from a time when it appears we had to define that macro in every port,
even if it wasn't being used.

These days we do not need to define that macro unless it's really needed.  And
a definition like the one for mcore-elf will cause problems
(compile/pr69102.c).  That definition has also been the cause of a long
standing failure in the port (gcc.dg/pr47446-2.c).

Naturally this has been through a round of testing where it fixes the two
issues noted above without any regressions.

gcc/
* config/mcore/mcore.h (ASM_OUTPUT_ADDR_DIFF_ELT): Remove.

s390: Add HF mode support

This patch adds support for _Float16.  At the time of writing, there is
no hardware _Float16 support on s390.  Therefore, _Float16 operations
have to be extended and truncated, which is supported via soft-fp.

The ABI demands that _Float16 values are left aligned in FP registers,
as is already the case for 32-bit FP values.  If vector
extensions are available, copying between left-aligned FPRs and
right-aligned GPRs is natively supported.  Without vector extensions,
the alignment has to be taken care of manually.  For target z10,
instructions lgdr/ldgr can be used in conjunction with shifts.  Copying
via lgdr from an FPR into a GPR is the easy case since for the shift the
target GPR can be utilized.  However, copying via ldgr from a GPR into a
FPR requires a secondary reload register which is used for the shift
result and is then copied into the FPR.  Prior to z10, there is no hardware
support in order to copy directly between FPRs and GPRs.  Therefore, in
order to copy from a GPR into an FPR we would require a secondary reload
register for the shift and secondary memory for copying the aligned
value.  Since this is not supported, _Float16 support starts with z10.
As a consequence, for all targets older than z10 test
libstdc++-abi/abi_check fails.
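
The alignment handling can be sketched with plain integer shifts.  This is a model only: the function names are hypothetical, a 64-bit register is represented as a std::uint64_t, and "left aligned" is taken to mean that the 16 value bits occupy the most significant bits.

```cpp
#include <cassert>
#include <cstdint>

// GPR holds the 16 value bits right aligned; produce the left-aligned
// 64-bit image that would then be copied into an FPR.
std::uint64_t gpr_to_fpr_image(std::uint16_t half_bits) {
  return std::uint64_t{half_bits} << 48;
}

// FPR image holds the value left aligned; shift it down to the
// right-aligned form used in a GPR.
std::uint16_t fpr_image_to_gpr(std::uint64_t fpr_image) {
  return static_cast<std::uint16_t>(fpr_image >> 48);
}

// Small usage example: 0x3C00 is the binary16 encoding of 1.0.
void half_alignment_demo() {
  assert(gpr_to_fpr_image(0x3C00) == 0x3C00000000000000ULL);
  assert(fpr_image_to_gpr(gpr_to_fpr_image(0x3C00)) == 0x3C00);
}
```

In the GPR-to-FPR direction the shifted value has to live somewhere before the copy, which corresponds to the secondary reload register mentioned above.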

gcc/ChangeLog:

* config/s390/s390-modes.def (FLOAT_MODE): Add HF mode.
(VECTOR_MODE): Add V{1,2,4,8,16}HF modes.
* config/s390/s390.cc (s390_scalar_mode_supported_p): For 64-bit
targets z10 and newer support HF mode.
(s390_vector_mode_supported_p): Add HF mode.
(s390_register_move_cost): Keep HF mode operands in registers.
(s390_legitimate_constant_p): Support zero constant.
(s390_secondary_reload): For GPR to FPR moves a secondary reload
register is required.
(s390_secondary_memory_needed): GPR<->FPR moves don't require
secondary memory.
(s390_libgcc_floating_mode_supported_p): For 64-bit targets z10
and newer support HF mode.
(s390_hard_regno_mode_ok): Allow HF mode for FPRs and VRs.
(s390_function_arg_float): Consider HF mode, too.
(s390_excess_precision): For EXCESS_PRECISION_TYPE_FLOAT16
return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16.
(TARGET_LIBGCC_FLOATING_MODE_SUPPORTED_P): Define.
* config/s390/s390.md (movhf): Define.
(reload_half_gprtofpr_z10): Define.
(signbithf2): Define.
* config/s390/vector.md: Add new vector modes to various
iterators.

libgcc/ChangeLog:

* config.host: Include s390/t-float16.
* config/s390/libgcc-glibc.ver: Export symbols
__trunc{sf,df,tf}hf2, __extendhf{sf,df,tf}2, __fix{,uns}hfti,
__float{,un}tihf, __floatbitinthf.
* config/s390/t-softfp: Add to softfp_extras instead of setting
it.
* configure: Regenerate.
* configure.ac: Support float16 only for 64-bit targets z10 and
newer.
* config/s390/_dpd_dd_to_hf.c: New file.
* config/s390/_dpd_hf_to_dd.c: New file.
* config/s390/_dpd_hf_to_sd.c: New file.
* config/s390/_dpd_hf_to_td.c: New file.
* config/s390/_dpd_sd_to_hf.c: New file.
* config/s390/_dpd_td_to_hf.c: New file.
* config/s390/t-float16: New file.

libstdc++-v3/ChangeLog:

* config/abi/post/s390x-linux-gnu/baseline_symbols.txt: Add
names {,P,K}DF16.

gcc/testsuite/ChangeLog:

* g++.target/s390/float16-1.C: New test.
* g++.target/s390/float16-2.C: New test.
* gcc.target/s390/float16-1-2.h: New test.
* gcc.target/s390/float16-1.c: New test.
* gcc.target/s390/float16-10.c: New test.
* gcc.target/s390/float16-2.c: New test.
* gcc.target/s390/float16-3.c: New test.
* gcc.target/s390/float16-4.c: New test.
* gcc.target/s390/float16-5.c: New test.
* gcc.target/s390/float16-6.c: New test.
* gcc.target/s390/float16-7.c: New test.
* gcc.target/s390/float16-8.c: New test.
* gcc.target/s390/float16-9.c: New test.
* gcc.target/s390/float16-signbit.h: New test.
* gcc.target/s390/vector/vec-extract-4.c: New test.
* gcc.target/s390/vector/vec-float16-1.c: New test.

Ada, Darwin: Fix bootstrap after recent warning improvements.

Similar to the changes in r16-6620, the improved gnatwu warning finds a 'use'
clause that is not needed in s-osinte__darwin.adb, leading to a bootstrap
failure when building the libraries.

Fixed by removing the extraneous 'use' clause.

gcc/ada/ChangeLog:

* libgnarl/s-osinte__darwin.adb: Delete unneeded use clause.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>

libstdc++: Fix std::system_category().message(int) on mingw32 target

On the mingw32 target, std::system_category().message(int) uses the
FormatMessage API to format error messages.  When the error message
contains insert sequences, it is unsafe not to use the
FORMAT_MESSAGE_IGNORE_INSERTS flag, as seen at:
https://devblogs.microsoft.com/oldnewthing/20071128-00/?p=24353

The output of FormatMessage ends with "\r\n" and includes a full stop
character used by the current thread's UI language. Now, we will remove
"\r\n" and any trailing '.' from the output in any language environment.

In the testsuite for std::system_category().message(int), we first
switch the thread UI language to en-US to meet expectations in any
language environment.
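
The trimming step can be sketched portably as below.  The helper name trim_system_message is hypothetical and this is not the libstdc++ code; it does not call FormatMessage and only shows the post-processing of an already formatted message.

```cpp
#include <cassert>
#include <string>

// Remove a trailing "\r\n" (any run of CR/LF) and then any trailing '.'.
std::string trim_system_message(std::string msg) {
  while (!msg.empty() && (msg.back() == '\r' || msg.back() == '\n'))
    msg.pop_back();
  while (!msg.empty() && msg.back() == '.')
    msg.pop_back();
  return msg;
}

// Small usage example.
void trim_system_message_demo() {
  assert(trim_system_message("Access is denied.\r\n") == "Access is denied");
}
```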

libstdc++-v3/ChangeLog:

* src/c++11/system_error.cc (system_error_category) [_WIN32]:
Use FormatMessageA function instead of FormatMessage macro.
* testsuite/19_diagnostics/error_category/system_category.cc:
Fix typo in __MINGW32__ macro name. Adjust behavior on the
mingw32 target.

cfgcleanup: Protect latches always [PR123417]

So it turns out LOOPS_MAY_HAVE_MULTIPLE_LATCHES is set in places
during compilation.  Setting it only means there might be multiple
latches currently.  It does not mean let's go in and delete them
all, which is what remove_forwarder_block does currently.  This
was happening before my set of patches too, but since it was
only happening in the merge_phi pass, latches were not cleared away
all of the time and then recreated.

This solves the problem by protecting latches all of the time
instead of being dependent on LOOPS_MAY_HAVE_MULTIPLE_LATCHES not being set.

vect-uncounted_7.c needs to be xfailed here because we no longer
vectorize the code.  Note the IR between GCC 15 and after this patch
is the same, so I think this was just a case where the testcase
was added after the remove forwarder changes and should not have
vectorized (or should have vectorized differently).

Bootstrapped and tested on x86_64-linux-gnu.

PR tree-optimization/123417
gcc/ChangeLog:

* tree-cfgcleanup.cc (maybe_remove_forwarder_block): Always
protect latches.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/vect-uncounted_7.c: xfail vect test.

Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>

toplevel: Unbreak Ada build [PR123490]

As written earlier, the config-ml.in change from the
--with-multi-buildlist patch broke the Ada build. Ada uses
RTSDIR = rts$(subst /,_,$(MULTISUBDIR))
and expects that the primary multilib will result in rts
rather than the rts_ it results in after the --with-multi-buildlist
changes.

The following patch fixes it by restoring previous behavior for
ml_subdir / MULTISUBDIR such that for primary multilib it is
still empty rather than /.
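
To illustrate why an empty MULTISUBDIR matters, here is a small model of
the make expression rts$(subst /,_,$(MULTISUBDIR)). The function name
rts_dir and the "/32" multilib subdirectory are assumptions made for the
example only.

```cpp
#include <algorithm>
#include <string>

// Models  rts$(subst /,_,$(MULTISUBDIR)) : replace every '/' in the
// multilib subdirectory with '_' and prefix the result with "rts".
std::string rts_dir(std::string multisubdir)
{
  std::replace(multisubdir.begin(), multisubdir.end(), '/', '_');
  return "rts" + multisubdir;
}
```

An empty MULTISUBDIR gives "rts", whereas "/" gives "rts_", which is the
directory name the Ada build did not expect.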

2026-01-10 Jakub Jelinek <jakub@redhat.com>

PR ada/123490
* config-ml.in: Restore ml_subdir being empty instead of /
for the primary multilib.

ranger: Verify gimple_call_num_args for several builtins [PR123431]

While gimple_call_combined_fn already does call
gimple_builtin_call_types_compatible_p and for most builtins ensures
the right types of arguments, for type generic builtins it does not;
from the POV of that function those functions are rettype (...).
Now, while the FE does some number of argument checking for the type
generic builtins, as the testcase below shows, it can be gamed.

So, this patch checks the number of arguments for type generic builtins
and does nothing if they have an unexpected number of arguments.
Also, for the returns arg case it verifies it can access the first argument.

2026-01-10 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/123431
* gimple-range-op.cc (gimple_range_op_handler::maybe_builtin_call):
Punt if type-generic builtins with a single argument don't have
exactly one argument. For returns_arg punt if call doesn't have
at least one argument.

* gcc.dg/pr123431.c: New test.

c: Fix ignored qualifier issue for enumerations [PR123435,PR123463]

We accept a mismatch in qualifiers for enumerations and integers
because we switch to the underlying type before checking that qualifiers
match.

PR c/123435
PR c/123463

gcc/c/ChangeLog:
* c-typeck.cc (comptypes_internal): Test for qualifiers first.

gcc/testsuite/ChangeLog:
* gcc.dg/pr123435-1.c: New test.
* gcc.dg/pr123435-2.c: New test.
* gcc.dg/pr123463.c: New test.

libstdc++: constexpr flat_map and flat_multimap

This patch makes flat_map and flat_multimap constexpr as part of P3372R3.

libstdc++-v3/ChangeLog:

* include/bits/version.def: Add FTM.
* include/bits/version.h: Regenerate.
* include/std/flat_map: Add constexpr.
* testsuite/23_containers/flat_map/1.cc: Add constexpr test.
* testsuite/23_containers/flat_multimap/1.cc: Add constexpr test.

a68: Add exit function to POSIX prelude

Add the procedure `posixexit'.

gcc/algol68/ChangeLog:

* a68-low-posix.cc (a68_posix_setexitstatus): Delete function.
(a68_posix_exit): New function.
* a68-low-prelude.cc (a68_lower_setexitstatus): Delete function.
(a68_lower_posixexit): New function.
* a68-low-runtime.def (SET_EXIT_STATUS): Delete definition.
(POSIX_EXIT): Add definition for posixexit.
* a68-parser-prelude.cc (posix_prelude): Remove setexitstatus
identifier from and add posixexit identifier to standenv.
* a68.h (a68_posix_setexitstatus): Delete prototype.
(a68_lower_setexitstatus): Likewise.
(a68_posix_exit): New prototype.
(a68_lower_posixexit): Likewise.
* ga68.texi:

libga68/ChangeLog:

* ga68-posix.c (_libga68_posixexit): New function.
* ga68.h (_libga68_posixexit): New prototype.
(_libga68_set_exit_status): Delete prototype.
* ga68.map: Remove _libga68_set_exit_status from and add
_libga68_posixexit to the global map.
* libga68.c: include <stdlib.h>.
(_libga68_set_exit_status): Delete function.
(main): Return EXIT_SUCCESS.

gcc/testsuite/ChangeLog:

* algol68/execute/posix-exit-1.a68: New test.

Signed-off-by: Pietro Monteiro <pietro@sociotechnical.xyz>

Daily bump.

a68: Escape @ in ga68.texi

Signed-off-by: Jose E. Marchesi <jemarch@gnu.org>
gcc/algol68/ChangeLog:

* ga68.texi (Worthy characters): Escape @.

forwprop: Use ssizetype for mask [PR123414].

RVV's vectors can get very large with LMUL8. In the PR we have
256-element char vectors which get permuted. For permuting them
we use a mask vectype that is deduced from the element type
without checking if the permute indices fit this type.
That leads to an invalid permute mask which gets optimized away.

This patch uses ssizetype as masktype instead.
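
The range problem can be checked with a little arithmetic. The sketch
below assumes a two-input permute, whose indices run from 0 to 2*N-1 for
N-element vectors, and uses std::ptrdiff_t as a stand-in for ssizetype;
both helper names are hypothetical and none of this is the forwprop code.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>

// True if every index in [0, max_index] is representable in IndexT.
template <typename IndexT>
constexpr bool indices_fit(unsigned long long max_index)
{
  return max_index
         <= static_cast<unsigned long long>(std::numeric_limits<IndexT>::max());
}

// Largest index of a two-input permute of NELTS-element vectors
// (assumption: indices select from the concatenation of both inputs).
constexpr unsigned long long max_two_input_index(unsigned long long nelts)
{
  return 2 * nelts - 1;
}

// For 256-element char vectors the largest index is 511, which fits in
// neither 8-bit type but does fit in a pointer-sized signed type.
static_assert(!indices_fit<std::int8_t>(max_two_input_index(256)), "");
static_assert(!indices_fit<std::uint8_t>(max_two_input_index(256)), "");
static_assert(indices_fit<std::ptrdiff_t>(max_two_input_index(256)), "");
```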

PR tree-optimization/123414

gcc/ChangeLog:

* tree-ssa-forwprop.cc (simplify_vector_constructor):
Use ssizetype as mask type.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/pr123414.c: New test.

Fix broken bootstrap on FreeBSD.

As analyzed by Steve, on FreeBSD __gthread_t is a pointer type.
I thought it the cleanest solution to remove the #ifdef in gfc_unit,
make the "self" member an intptr_t and cast the return value of
__gthread_self () to that type.

PR fortran/123512

libgfortran/ChangeLog:

* io/io.h: Change type of self to intptr_t.
* io/async.h (LOCK_UNIT): Cast __gthread_self () to intptr_t.
(TRYLOCK_UNIT): Likewise.
(OWN_THREAD_ID): Likewise.

Update Copyright for gen-evolution.awk and gen-cxxapi-file.py

On Fri, Jan 09, 2026 at 05:54:47PM +0000, Joseph Myers wrote:
> I think updates to gcc/config/loongarch/genopts/gen-evolution.awk's calls
> to copyright_header are needed as well.  At present, building for
> loongarch can result in files in the source tree being reverted to older
> copyright dates because the generation hasn't been updated (discovered via
> my glibc bot with GCC mainline stopping updating its GCC source tree
> because such modifications appeared in the sources).  Of course this also
> shows up missing entries in contrib/gcc_update for the three files
> generated by gen-evolution.awk.

gen-evolution.awk was explicitly blacklisted
and so was gen-cxxapi-file.py, both because update-copyright.py
matched Copyright line also within the printing code but it wasn't
matching the expected form.
Fixed by making sure the printing code doesn't match it by using
print "   Copy" "right (C) " ... in the awk case and
Copy{:s}right in the python case (with "" arg added).

2026-01-09  Jakub Jelinek  <jakub@redhat.com>

contrib/
* update-copyright.py (GCCFilter): Don't filter out
gen-evolution.awk and gen-cxxapi-file.py.
gcc/
* config/loongarch/genopts/gen-evolution.awk: Update
copyright year.
(copyright_header): Separate parts of Copyright word
with " " so that it doesn't get matched by update-copyright.py.
(gen_full_header, gen_full_source, gen_full_def): Include
2026 year in the ranges.
gcc/cp/
* gen-cxxapi-file.py: Update copyright year.  Separate
parts of Copyright word with {:s} so that it doesn't get matched
by update-copyright.py.

analyzer: port pop_frame_callbacks to pub/sub

More simplification/consolidation of some callback logic in analyzer in
favor of using the analyzer pub/sub channel.

No functional change intended.

gcc/analyzer/ChangeLog:
* common.h (struct on_frame_popped): New.
(subscriber::on_message): New vfunc for on_frame_popped.
* region-model.cc: Include "context.h" and "channels.h".
(region_model::pop_frame_callbacks): Delete.
(region_model::pop_frame): Port from notify_on_pop_frame to
using pub/sub channel.
* region-model.h (pop_frame_callback): Delete typedef.
(region_model::register_pop_frame_callback): Delete.
(region_model::pop_frame_callbacks): Delete.

gcc/testsuite/ChangeLog:
* gcc.dg/plugin/analyzer_cpython_plugin.cc
(cpython_analyzer_events_subscriber::on_message): Implement for
on_frame_popped.
(plugin_init): Drop call to
region_model::register_pop_frame_callback in favor of the above
pub/sub handler.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

analyzer: port translation_unit_callbacks to pub/sub

Simplification/consolidation of some callback logic in analyzer in
favor of using the analyzer pub/sub channel.

No functional change intended.

gcc/analyzer/ChangeLog:
* analyzer-language.cc: Include "context.h" and "channels.h".
(finish_translation_unit_callbacks): Delete.
(register_finish_translation_unit_callback): Delete.
(run_callbacks): Delete.
(on_finish_translation_unit): Port from run_callbacks to pub/sub.
* analyzer-language.h (finish_translation_unit_callback): Delete
typedef.
(register_finish_translation_unit_callback): Delete decl.
* common.h (class translation_unit): New forward decl.
(struct analyzer_events::on_tu_finished): New.
(analyzer_events::subscriber::on_message): Add vfunc for
on_tu_finished messages.

gcc/testsuite/ChangeLog:
* gcc.dg/plugin/analyzer_cpython_plugin.cc
(cpython_analyzer_events_subscriber::on_message): New.
(plugin_init): Port stashing of named types and global vars to
pub/sub framework.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

analyzer: replace PLUGIN_ANALYZER_INIT with a pub/sub channel

This patch eliminates the PLUGIN_ANALYZER_INIT event in favor of a new
analyzer_events_channel that can be subscribed to, and ports all the
in-tree analyzer plugins to using it.

The PLUGIN_* approach isn't typesafe, and the name suggests it's only
meant to be used for plugins, whereas the pub/sub approach is typesafe,
and treats the publish/subscribe network as orthogonal to whether the
code is built into the executable or is a plugin.

gcc/analyzer/ChangeLog:
* common.h: Define INCLUDE_LIST.
(class plugin_analyzer_init_iface): Replace with...
(gcc::topics::analyzer_events::on_ana_init): ...this.
(gcc::topics::analyzer_events::subscriber): New.
* engine.cc: Include "context.h" and "channels.h".
(class plugin_analyzer_init_impl): Replace with...
(class impl_on_ana_init): ...this. Fix some overlong lines.
(impl_run_checkers): Port from PLUGIN_ANALYZER_INIT to using
publish/subscribe framework.

gcc/ChangeLog:
* channels.h (gcc::topics::analyzer_events::subscriber): New
forward decl.
(compiler_channels::analyzer_events_channel): New field.
* doc/plugins.texi (PLUGIN_ANALYZER_INIT): Delete.
* plugin.cc (register_callback): Delete PLUGIN_ANALYZER_INIT.
(invoke_plugin_callbacks_full): Likewise.
* plugin.def (PLUGIN_ANALYZER_INIT): Delete this event.

gcc/testsuite/ChangeLog:
* gcc.dg/plugin/analyzer_cpython_plugin.cc: Port from
PLUGIN_ANALYZER_INIT to subscribing to analyzer_events_channel.
* gcc.dg/plugin/analyzer_gil_plugin.cc: Likewise.
* gcc.dg/plugin/analyzer_kernel_plugin.cc: Likewise.
* gcc.dg/plugin/analyzer_known_fns_plugin.cc: Likewise.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

diagnostics: add optional CFG dumps to SARIF/HTML output sinks

This patch adds a new key/value pair "cfgs={yes,no}" to diagnostics
sinks, "no" by default.

If set to "yes" for a SARIF sink, then GCC will add the internal state
of the CFG for all functions after each pertinent optimization pass in
graph form to theRun.graphs in the SARIF output.

If set to "yes" for an HTML sink, the generated HTML will contain SVG
displaying the graphs, adapted from code in graph.cc.

Text sinks ignore it.

The SARIF output is thus a machine-readable serialization of (some of)
GCC's intermediate representation (as JSON), but it's much less than
GCC-XML used to provide.  The precise form of the information is
documented as subject to change without notice.

Currently it shows both gimple statements and RTL instructions,
depending on the pass.  My hope is that it should be possible to write a
"cfg-grep" tool that can read the SARIF and automatically identify
in which pass a particular piece of our IR appeared or disappeared,
for tracking down bugs in our optimization passes.

Implementation-wise:
* this uses the publish-subscribe mechanism from the earlier patch, by
having the diagnostics sink subscribe to pass_events::after_pass
messages from the pass_events_channel.
* the patch adds a new hook to cfghooks.h for dumping a basic block
into a SARIF property bag

gcc/ChangeLog:
* Makefile.in (OBJS): Add tree-diagnostic-cfg.o.
(OBJS-libcommon): Add custom-sarif-properties/cfg.o,
diagnostics/digraphs-to-dot.o, and
diagnostics/digraphs-to-dot-from-cfg.o.
* cfghooks.cc: Define INCLUDE_VECTOR.  Add includes of
"diagnostics/sarif-sink.h" and "custom-sarif-properties/cfg.h".
(dump_bb_as_sarif_properties): New.
* cfghooks.h (diagnostics::sarif_builder): New forward decl.
(json::object): New forward decl.
(cfg_hooks::dump_bb_as_sarif_properties): New callback field.
(dump_bb_as_sarif_properties): New decl.
* cfgrtl.cc (rtl_cfg_hooks): Populate the new callback
field with rtl_dump_bb_as_sarif_properties.
(cfg_layout_rtl_cfg_hooks): Likewise.
* custom-sarif-properties/cfg.cc: New file.
* custom-sarif-properties/cfg.h: New file.
* diagnostics/digraphs-to-dot-from-cfg.cc: New file, partly
adapted from gcc/graph.cc.
* diagnostics/digraphs-to-dot.cc: New file.
* diagnostics/digraphs-to-dot.h: New file, based on material in...
* diagnostics/digraphs.cc: Include
"diagnostics/digraphs-to-dot.h".
(class conversion_to_dot): Rework and move to above.
(make_dot_graph_from_diagnostic_graph): Likewise.
(make_dot_node_from_digraph_node): Likewise.
(make_dot_edge_from_digraph_edge): Likewise.
(conversion_to_dot::get_dot_id_for_node): Likewise.
(conversion_to_dot::has_edges_p): Likewise.
(digraph::make_dot_graph): Use to_dot::converter::make and invoke
the result to make the dot graph.
* diagnostics/digraphs.h (digraph::get_all_nodes): New accessor.
* diagnostics/html-sink.cc
(html_builder::m_per_logical_loc_graphs): New field.
(html_builder::add_graph_for_logical_loc): New.
(html_sink::report_digraph_for_logical_location): New.
* diagnostics/sarif-sink.cc (sarif_array_of_unique::get_element):
New.
(sarif_builder::report_digraph_for_logical_location): New.
(sarif_sink::report_digraph_for_logical_location): New.
* diagnostics/sink.h: Include "diagnostics/logical-locations.h".
(sink::report_digraph_for_logical_location): New vfunc.
* diagnostics/text-sink.h
(text_sink::report_digraph_for_logical_location): New.
* doc/invoke.texi (fdiagnostics-add-output): Clarify wording.
Distinguish between scheme-specific vs GCC-specific keys, and add
"cfgs" as the first example of the latter.
* gimple-pretty-print.cc: Include "cfghooks.h", "json.h", and
"custom-sarif-properties/cfg.h".
(gimple_dump_bb_as_sarif_properties): New.
* gimple-pretty-print.h (diagnostics::sarif_builder): New forward
decl.
(json::object): Likewise.
(gimple_dump_bb_as_sarif_properties): New.
* graphviz.cc (get_compass_pt_from_string): New.
* graphviz.h (get_compass_pt_from_string): New decl.
* libsarifreplay.cc (sarif_replayer::handle_graph_object): Fix
overlong line.
* opts-common.cc: Define INCLUDE_VECTOR.
* opts-diagnostic.cc: Define INCLUDE_LIST.  Include
"diagnostics/sarif-sink.h", "tree-diagnostic-sink-extensions.h",
"opts-diagnostic.h", and "pub-sub.h".
(class gcc_extra_keys): New class.
(opt_spec_context::opt_spec_context): Add "client_keys" param and
pass to dc_spec_context.
(handle_gcc_specific_keys): New.
(try_to_make_sink): New.
(gcc_extension_factory::singleton): New.
(handle_OPT_fdiagnostics_add_output_): Rework to use
try_to_make_sink.
(handle_OPT_fdiagnostics_set_output_): Likewise.
* opts-diagnostic.h: Include "diagnostics/sink.h".
(class gcc_extension_factory): New.
* opts.cc: Define INCLUDE_LIST.
* print-rtl.cc: Include "dumpfile.h", "cfghooks.h", "json.h", and
"custom-sarif-properties/cfg.h".
(rtl_dump_bb_as_sarif_properties): New.
* print-rtl.h (diagnostics::sarif_builder): New forward decl.
(json::object): Likewise.
(rtl_dump_bb_as_sarif_properties): New decl.
* tree-cfg.cc (gimple_cfg_hooks): Use
gimple_dump_bb_as_sarif_properties for new callback field.
* tree-diagnostic-cfg.cc: New file, based on material in graph.cc.
* tree-diagnostic-sink-extensions.h: New file.
* tree-diagnostic.cc: Define INCLUDE_LIST.  Include
"tree-diagnostic-sink-extensions.h".
(compiler_ext_factory): New.
(tree_diagnostics_defaults): Set gcc_extension_factory::singleton
to be compiler_ext_factory.

gcc/testsuite/ChangeLog:
* gcc.dg/diagnostic-cfgs-html.py: New test.
* gcc.dg/diagnostic-cfgs-sarif.py: New test.
* gcc.dg/diagnostic-cfgs.c: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

Add publish/subscribe topics and channel: pass_events

This patch adds a new "struct compiler_channels" to hold channels
relating to the compiler that plugins (or diagnostic sinks) might want
to subscribe to events for, accessed from the global gcc::context
object, along with a new gcc/topics/ source subdirectory to hold
strongly-typed publish/subscribe topics relating to the compiler.

For now, there is just one: pass_events_channel, which, if there are any
subscribers, issues notifications about passes starting/stopping on
particular functions, using topics::pass_events, declared in
topics/pass-events.h, but followup patches add more kinds of
notification channel.

A toy plugin in the testsuite shows how this could be used to build a
progress notification UI for the compiler, and a followup patch uses the
channel to (optionally) capture CFG information at each stage of
optimization in machine-readable form into a SARIF sink.

gcc/ChangeLog:
* channels.h: New file.
* context.cc: Define INCLUDE_LIST. Include "channels.h".
(gcc::context::context): Create m_channels.
(gcc::context::~context): Delete it.
* context.h (struct compiler_channels): New forward decl.
(gcc::context::get_channels): New accessor.
(gcc::context::m_channels): New field.
* passes.cc: Define INCLUDE_LIST. Include "topics/pass-events.h"
and "channels.h".
(execute_one_pass): If the global context's pass_events_channel
has subscribers, publish before_pass and after_pass events to it.
* topics/pass-events.h: New file.

gcc/testsuite/ChangeLog:
* gcc.dg/plugin/plugin.exp: Add progress_notifications_plugin.cc.
* gcc.dg/plugin/progress_notifications_plugin.cc: New test plugin.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

Add pub-sub.{h,cc}

This patch introduces a publish/subscribe mechanism, allowing for
loosely-coupled senders and receivers, with strongly-typed messages
passing between them. For example, a GCC subsystem could publish
messages about events, and a plugin could subscribe to them.

An example can be seen in the selftests.
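
For readers without the tree at hand, the general idea can be sketched
as below. This is not the API of pub-sub.h: the channel template, the
pass_started message type and the use of std::function callbacks are all
assumptions chosen to keep the sketch short.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical message type, standing in for a strongly-typed topic.
struct pass_started
{
  std::string pass_name;
};

// Minimal typed channel: subscribers register a callback for MessageT
// and publish () delivers each message to all of them in order.
template <typename MessageT>
class channel
{
public:
  using handler = std::function<void (const MessageT &)>;

  void subscribe (handler h) { m_handlers.push_back (std::move (h)); }

  // Lets a publisher skip building messages nobody will receive.
  bool has_subscribers () const { return !m_handlers.empty (); }

  void publish (const MessageT &msg) const
  {
    for (const handler &h : m_handlers)
      h (msg);
  }

private:
  std::vector<handler> m_handlers;
};
```

The sender only needs the channel and the message type, and the receiver
only needs to subscribe, so neither side has to know about the other.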

gcc/ChangeLog:
* Makefile.in (OBJS-libcommon): Add pub-sub.o.
* pub-sub.cc: New file.
* pub-sub.h: New file.
* selftest-run-tests.cc (selftest::run_tests): Call
selftest::pub_sub_cc_tests.
* selftest.h (selftest::pub_sub_cc_tests): New decl.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

aarch64: Fix PCH for sve builtins [PR123457]

The problem here is function_table was not in the GGC memory space and not
streamed out. So even though the builtins were reloaded, function_table was
a nullptr as it was not reloaded.

Also noticed initial_indexes should be marked with GTY so it is reloaded correctly
from PCH.

Built and tested for aarch64-linux-gnu.

PR target/123457
gcc/ChangeLog:

* config/aarch64/aarch64-sve-builtins.cc (struct registered_function_hasher):
Change base class to ggc_ptr_hash.
(initial_indexes): Mark with GTY.
(function_table): Likewise.
(handle_arm_sve_h): Allocate function_table from ggc instead of heap.

Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>

c: Optimize TARGET_EXPRs for _Atomic loads [PR123475]

On the following testcase we emit a false positive warning that
a temporary (TARGET_EXPR slot) is used uninitialized, from the early_uninit
pass.

This regressed with my change to instrument for
-ftrivial-auto-var-init={zero,pattern} not just DECL_EXPRs, but also
TARGET_EXPR initializations if the TARGET_EXPR_INITIALIZER has void type.
Those cases are where the initializer doesn't necessarily have to initialize
the whole TARGET_EXPR slot, or might use parts or the whole slot before
those are initialized; this is how e.g. various C++ temporary objects are
constructed.

The problem is in pass interaction.  The FE creates a TARGET_EXPR with
void type initializer because the initializer is originally
__atomic_load (&expr, &tmp, SEQ_CST); but it is folded instantly into
(void) (tmp = (type) __atomic_load_N (&expr, SEQ_CST)).  The FE also
marks the TARGET_EXPR slot as TREE_ADDRESSABLE, because it would be
addressable if libatomic were used, but nothing in the IL then takes its address.
Now, since my r16-4212 change which was mainly for C++26 compliance
we see the TARGET_EXPR and because it has void type TARGET_EXPR_INITIALIZER,
we start with tmp = .DEFERRED_INIT (...); just in case the initialization
would attempt to use the slot before initialization or not initialize fully.
Because tmp is TREE_ADDRESSABLE and has gimple reg type, it is actually not
gimplified as tmp = .DEFERRED_INIT (...); but as _1 = .DEFERRED_INIT (...);
tmp = _1; but because it is not actually address taken in the IL, already
the ssa pass turns it into SSA_NAME (dead one), so we have
_1 = .DEFERRED_INIT (...); _2 = _1; and _2 is unused.  Next comes
early_uninit and warns on the dead SSA_NAME copy that it uses an
uninitialized var.

The following patch attempts to fix that by checking if
c_build_function_call_vec has optimized the call right away into pure
assignment to the TARGET_EXPR slot without the slot being used anywhere
else in the expression and 1) clearing again TREE_ADDRESSABLE on the slot,
because it isn't really addressable 2) optimizing the TARGET_EXPR, so that
it doesn't have void type TARGET_EXPR_INITIALIZER by changing it to the rhs
of the MODIFY_EXPR.  That way gimplifier doesn't bother creating
.DEFERRED_INIT for it at all.

Or should something like this be done instead in the TARGET_EXPR
gimplification?  I mean not the TREE_ADDRESSABLE clearing, that can't be
done without knowing what we know in the FE, but the rest, generally
TARGET_EXPR with initializer (void) (TARGET_EXPR_SLOT = something)
where something doesn't refer to TARGET_EXPR_SLOT can be optimized into
just something TARGET_EXPR_INITIALIZER.

2026-01-09  Jakub Jelinek  <jakub@redhat.com>

PR c/123475
* c-typeck.cc (c_find_var_r): New function.
(convert_lvalue_to_rvalue): If c_build_function_call_vec
folded __atomic_load (&expr, &tmp, SEQ_CST); into
(void) (tmp = __atomic_load_<N> (&expr, SEQ_CST)), drop
TREE_ADDRESSABLE flag from tmp and set TARGET_EXPR
initializer just to the rhs of the MODIFY_EXPR.

* gcc.dg/pr123475.c: New test.

doc: List more valid -x option arguments

We miss quite a few -x option arguments that can be specified.

2026-01-09 Jakub Jelinek <jakub@redhat.com>

* doc/invoke.texi (-x): Add c++-system-module, objc-cpp-output,
objc++-cpp-output, adascil, adawhy, modula-2, modula-2-cpp-output,
rust, algol68 and lto as further possible option arguments.

libstdc++: Simplify use_proxy_wait function

The __wait_args::_M_setup_proxy_wait function must only be called when
_M_obj == addr is true, so it's redundant for _M_setup_proxy_wait to
pass addr to use_proxy_wait. That address is already passed as
args._M_old anyway.

libstdc++-v3/ChangeLog:

* src/c++20/atomic.cc (use_proxy_wait): Remove unused second
parameter.
(__wait_args::_M_setup_proxy_wait): Remove second argument.
(__notify_impl): Likewise.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>

libstdc++: Fix proxy wait detection in atomic wait

A failed assertion was observed with std::atomic<bool>::wait when the
loop in __atomic_wait_address is entered and calls _M_setup_wait a
second time, after waking from __wait_impl. When the first call to
_M_setup_wait makes a call to _M_setup_proxy_wait that function decides
that a proxy wait is needed for an object of type bool, and it updates
the _M_obj and _M_obj_size members to refer to the futex in the proxy
state, instead of referring to the bool object itself. The next time
_M_setup_wait is called it calls _M_setup_proxy_wait again but now it
sees _M_obj_size == sizeof(futex) and so this time decides a proxy wait
is *not* needed, and then fails the __glibcxx_assert(_M_obj == addr)
check.

The problem is that _M_setup_proxy_wait wasn't correctly handling the
case where it's called a second time, after the decision to use a proxy
wait has already been made. That can be fixed in _M_setup_proxy_wait by
checking if _M_obj != addr, which implies that a proxy wait has already
been set up by a previous call. In that case, _M_setup_proxy_wait should
only update _M_old to the latest value of the proxy _M_ver.

This change means that _M_setup_proxy_wait is safe to call repeatedly
for a proxy wait, and will only update _M_wait_state, _M_obj, and
_M_obj_size on the first call. On the second and subsequent calls, those
variables are already correctly set for the proxy wait so don't need to
be set again.
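
The repeat-call behaviour can be modelled with a simplified stand-in.
Everything here is hypothetical: wait_args_model and proxy_state are not
the libstdc++ types, and treating sizeof (unsigned) as the futex size is
an assumption made only for the sketch.

```cpp
#include <cstddef>

// Hypothetical stand-in for the proxy state with a futex-sized version.
struct proxy_state
{
  unsigned ver = 0;
};

// Simplified model of the members touched by _M_setup_proxy_wait.
struct wait_args_model
{
  const void *obj;       // object being waited on (or the proxy version)
  std::size_t obj_size;  // size of that object
  unsigned old = 0;      // last observed proxy version

  // Returns true if a proxy wait is in use.  ADDR is the user's object.
  bool setup_proxy_wait (const void *addr, proxy_state &proxy)
  {
    if (obj != addr)
      {
        // A previous call already switched to the proxy: only refresh old.
        old = proxy.ver;
        return true;
      }
    if (obj_size == sizeof (unsigned))
      return false;  // assumed futex-sized object: wait on it directly
    // First call for an object that needs a proxy.
    obj = &proxy.ver;
    obj_size = sizeof (proxy.ver);
    old = proxy.ver;
    return true;
  }
};
```

In this model a second call for a bool-sized object still reports a
proxy wait and only reloads the version, instead of seeing the proxy's
size and wrongly concluding that no proxy is needed.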

For non-proxy waits, calling _M_setup_proxy_wait more than once is safe,
but pessimizes performance. The caller shouldn't make a second call to
_M_setup_proxy_wait because we don't need to check again if a proxy wait
should be used (the answer won't change) and we don't need to load a
value from the proxy _M_ver.

However, it was difficult to detect the case of a non-proxy wait,
because _M_setup_wait doesn't know if it's being called the first time
(when _M_setup_proxy_wait is called to make the initial decision) or a
subsequent time (in which case _M_obj == addr implies a non-proxy wait
was already decided on). As a result, _M_setup_proxy_wait was being used
every time to see if it's a proxy wait. We can resolve this by splitting
the _M_setup_wait function into _M_setup_wait and _M_on_wake, where the
former is only called once to do the initial setup and the latter is
called after __wait_impl returns, to prepare to check the predicate and
possibly wait again. The new _M_on_wake function can avoid unnecessary
calls to _M_setup_proxy_wait by checking _M_obj == addr to identify a
non-proxy wait.

The three callers of _M_setup_wait are updated to use _M_on_wake instead
of _M_setup_wait after waking from a waiting function. This change
revealed a latent performance bug in __atomic_wait_address_for which was
not passing __res to _M_setup_wait, so a new value was always loaded
even when __res._M_has_val was true. By splitting _M_on_wake out of
_M_setup_wait this problem became more obvious, because we no longer
have _M_setup_wait doing two different jobs, depending on whether it was
passed the optional third argument or not.

libstdc++-v3/ChangeLog:

* include/bits/atomic_timed_wait.h (__atomic_wait_address_until):
Use _M_on_wake instead of _M_setup_wait after waking.
(__atomic_wait_address_for): Likewise.
* include/bits/atomic_wait.h (__atomic_wait_address): Likewise.
(__wait_args::_M_setup_wait): Remove third parameter and move
code to update _M_old to ...
(__wait_args::_M_on_wake): New member function to update _M_old
after waking, only calling _M_setup_proxy_wait if needed.
(__wait_args::_M_store): New member function to update _M_old
from a value, for non-proxy waits.
* src/c++20/atomic.cc (__wait_args::_M_setup_proxy_wait): If
_M_obj is not addr, only load a new value and return true.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>