git.ipfire.org Git - thirdparty/gcc.git/log

Daily bump.

Update gcc sv.po

* sv.po: Update.

[RISC-V][PR tree-optimization/57650] Detect more czero opportunities

So in pr57650 we have RTL like this:

> (set (reg:DI 147)
>     (and:DI (gt:DI (reg:DI 153 [ y ])
>             (reg:DI 154 [ z ]))
>         (ne:DI (reg/v/f:DI 138 [ x ])
>             (const_int 0 [0]))))

That's going to generate:

        sgt     a1,a1,a2
        snez    a5,a0
        and     a5,a5,a1

But with zicond we can do better.  That's really just:

        sgt     a1,a1,a2
        czero.eqz       a1,a1,a0

We already had patterns to clean this kind of mess up a bit, but they needed a
bit more generalization.  First they only accepted NE forms, but EQ is just as
valid and just requires us to select between czero.nez and czero.eqz.  Second
the AND is commutative, so the equality test can appear in either position.
With those generalizations we can get the desired code.  Note I'm not trying to
tackle the larger problems with 57650, just the low level code generation
inefficiencies.
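
As a rough C model of why the two sequences agree (a sketch only, not the
actual zicond.md patterns): czero.eqz rd,rs1,rs2 yields 0 when rs2 is zero and
rs1 otherwise, and czero.nez is the mirror image.  The function names below
are made up for illustration.

```c
#include <stdint.h>

/* Model of czero.eqz rd,rs1,rs2: rd = (rs2 == 0) ? 0 : rs1.  */
int64_t czero_eqz (int64_t rs1, int64_t rs2)
{
  return rs2 == 0 ? 0 : rs1;
}

/* Model of czero.nez rd,rs1,rs2: rd = (rs2 != 0) ? 0 : rs1.  */
int64_t czero_nez (int64_t rs1, int64_t rs2)
{
  return rs2 != 0 ? 0 : rs1;
}

/* NE form as three operations: sgt, snez, and.  */
int64_t and_ne_plain (int64_t x, int64_t y, int64_t z)
{
  return (int64_t) (y > z) & (int64_t) (x != 0);
}

/* NE form as two operations: sgt, czero.eqz.  */
int64_t and_ne_zicond (int64_t x, int64_t y, int64_t z)
{
  return czero_eqz (y > z, x);
}

/* EQ form as three operations: sgt, seqz, and.  */
int64_t and_eq_plain (int64_t x, int64_t y, int64_t z)
{
  return (int64_t) (y > z) & (int64_t) (x == 0);
}

/* EQ form as two operations: sgt, czero.nez.  */
int64_t and_eq_zicond (int64_t x, int64_t y, int64_t z)
{
  return czero_nez (y > z, x);
}
```

Since the AND is commutative, the same model applies with the equality test in
either operand position.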

This has been in my tester for a while without regressions and is being
exercised during a bootstrap on the BPI.  I'll wait for pre-commit CI to render
a verdict.

PR tree-optimization/57650
gcc/
* config/riscv/zicond.md: Generalize patterns which identify
a logical AND of an equality test and some other sCC insn to
handle more cases.

gcc/testsuite/
* gcc.target/riscv/pr57650.c: New test.

c++: fix decltype(id) for pointer-to-data-member access expr [PR124978]

Here after substitution into decltype(X), X is the expanded but not
constant-evaluated pointer-to-data-member access expression

  *((const int *) *cw<Divide{42}>::value + (sizetype) *cw<&Divide::value>::value)

and finish_decltype_type wrongly strips the outermost INDIRECT_REF under
the assumption that it's an implicit dereference of a reference, but here
it's an explicit pointer dereference.  This causes the decltype to yield
const int* instead of the expected int.

This patch fixes this particular bug by checking REFERENCE_REF_P instead
of INDIRECT_REF_P which additionally verifies the dereferenced thing
actually has reference type.  The decltype now yields the correct type
modulo an unnecessary const due to the separate bug PR115314.

PR c++/124978
PR c++/115314

gcc/cp/ChangeLog:

* semantics.cc (finish_decltype_type): Check REFERENCE_REF_P
instead of INDIRECT_REF_P before stripping implicit dereferences.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/nontype-class74.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>

c++/modules: defer completion of streamed-in cNTTPs [PR124953]

Here we hit lazy loading recursion when streaming in the cNTTP object
wrap<Storage>{}, via get_template_parm_object -> cp_finish_decl ->
ensure_literal_type_for_constexpr_object -> complete_type; apparently
the class definition of wrap<Storage> hasn't been streamed in yet.
If we disable that literal type check for NTTP objects, we still hit
recursion, from layout_var_decl.

It seems prudent to defer calling cp_finish_decl for NTTP objects until
after lazy loading has completed like we do for expand_or_defer_fn and
cdtors. This patch arranges that, as a follow-up to the previous
NTTP object streaming fixes r15-3031 and r16-318.

PR c++/124953

gcc/cp/ChangeLog:

* module.cc (trees_in::tree_node) <tt_nttp_var>: Push the result
of get_template_parm_object to post_load_decls.
(post_load_processing): Call cp_finish_decl on any not yet
completed NTTP objects.
* pt.cc (get_template_parm_object): Don't call cp_finish_decl
when !check_init.

gcc/testsuite/ChangeLog:

* g++.dg/modules/tpl-nttp-3_a.H: New test.
* g++.dg/modules/tpl-nttp-3_b.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>

aarch64: Remove redundant m_curr_insn initialization/de-initialization

m_curr_insn has a default member initializer (= nullptr) in the class
declaration, so the explicit assignment in the ctor is redundant.

The assignment in the dtor is also unnecessary since the member's lifetime ends
with the object.

Signed-off-by: Soumya AR <soumyaa@nvidia.com>
gcc/ChangeLog:

* config/aarch64/aarch64-narrow-gp-writes.cc
(narrow_gp_writes::narrow_gp_writes): Remove redundant m_curr_insn
initialization.
(narrow_gp_writes::~narrow_gp_writes): Remove redundant m_curr_insn
de-initialization.

ext-dce: Promote narrow operations to wider mode when extended bits are dead

When an operation like (sign_extend:DI (plus:SI ...)) has dead extended
bits, promote the inner operation to the wider mode, eliminating the
extension wrapper. This enables combine to see DI-mode sequences and
form instructions like sh1add, sh2add, sh3add on RISC-V.

Only promote candidates that form chains — where one candidate's result
feeds into another's operand. Standalone (isolated) promotions are
skipped because they cause regressions on targets with free sign
extension (e.g., RISC-V W-suffix instructions): they prevent combine
from folding sext.w patterns and break combine split patterns that
depend on the sign_extend wrapper (sh1add, packw).

Chain detection tracks promotion candidates and their register
connections within each basic block, propagating through copies
created by optimized extensions.
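
The reason promotion is safe when the extended bits are dead can be sketched
in C: the low 32 bits of a sum do not depend on the upper bits of the operands,
so a consumer that only reads the low 32 bits cannot tell the narrow and wide
forms apart.  This is an illustration of the arithmetic only, not of the
ext-dce implementation, and the function names are hypothetical.

```c
#include <stdint.h>

/* Models (sign_extend:DI (plus:SI a b)): add in 32 bits, then sign-extend.
   The conversion to int32_t relies on GCC's modular behaviour for
   out-of-range values (implementation-defined in ISO C).  */
int64_t add_narrow_then_extend (int64_t a, int64_t b)
{
  uint32_t sum = (uint32_t) a + (uint32_t) b;   /* wraps modulo 2^32 */
  return (int64_t) (int32_t) sum;
}

/* Models (plus:DI a b): the promoted form, computed in 64 bits.  */
int64_t add_wide (int64_t a, int64_t b)
{
  return (int64_t) ((uint64_t) a + (uint64_t) b);
}

/* A consumer that only reads the low 32 bits, so bits 32..63 are dead.  */
uint32_t low32 (int64_t v)
{
  return (uint32_t) (uint64_t) v;
}
```

The two forms can differ in the upper 32 bits (for example when the 32-bit add
overflows), which is why the transformation is only valid when those bits are
known to be dead.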

gcc/ChangeLog:

* ext-dce.cc (promotion_candidate_info): New struct.
(copy_info): New struct.
(promotion_candidates, promotable_dests): New file-scope variables.
(consumed_by_candidate, promotion_copies): Likewise.
(ext_dce_try_promote_operation): New function to promote
sign/zero-extended arithmetic to wider mode.
(ext_dce_record_promotion_candidate): New function to record
promotion candidates for deferred chain analysis.
(ext_dce_promote_chained_candidates): New function to promote
only chained candidates.
(ext_dce_process_uses): Record candidates instead of promoting
immediately; propagate chain info through optimized copies.
(ext_dce_process_bb): Call ext_dce_promote_chained_candidates
after processing all insns in a block.
(ext_dce_init): Allocate chain detection bitmaps.
(ext_dce_finish): Free chain detection data structures.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/ext-dce-promote-2.c: Update to verify both
chain promotions (sh1add, sh3add) and standalone skipping.

ext-dce: Only remove REG_EQUAL/EQUIV notes on successful optimization

In ext_dce_try_optimize_extension, REG_EQUAL/EQUIV notes were removed
unconditionally after attempting validate_change, even when the
validation failed and the insn was reverted to its original state.
This could cause subsequent passes to generate different (incorrect)
code because they lost the REG_EQUAL hint on an unchanged insn.

Guard the note removal with the 'ok' flag so notes are only stripped
when validate_change actually committed the transformation.

gcc/ChangeLog:

* ext-dce.cc (ext_dce_try_optimize_extension): Only remove
REG_EQUAL/EQUIV notes when validate_change succeeds.

aarch64: Update br_mispredict_factor for generic tunings

After some testing, we have found that a br_mispredict_factor of 7 is more
suitable than the default factor of 6 that was proposed in d7aebc72899.
6 can be too restrictive on certain workloads and reject cheaper csels in favour
of conditional branches.

On an Olympus core, this change improves SPEC2017 fp rate geomean by 1% while
the int rate geomean is unchanged. There are no visible regressions >1%.

Additionally, github.com/facebook/zstd retains the performance improvement this
patch introduced.

Signed-off-by: Soumya AR <soumyaa@nvidia.com>
gcc/ChangeLog:

* config/aarch64/tuning_models/generic.h: Update br_mispredict_factor
to 7.

match.pd: Relax single_use for fold-to-zero comparisons

The single_use restriction on the X +- C1 CMP C2 -> X CMP C2 -+ C1
simplification (for eq/ne) prevents folding patterns like (++*a == 1)
into (*a == 0) when the defining SSA value has multiple uses.

Comparing against zero is cheaper on most targets (beqz on RISC-V,
cbz on AArch64), so the transform is profitable even when the
defining SSA has multiple uses.  Relax single_use when the folded
comparison constant is zero.

For example, given:
  _1 = *a;
  _2 = _1 + 1;
  *a = _2;
  if (_2 == 1)

match.pd now produces:
  if (_1 == 0)

which generates beqz/cbz instead of li+beq/cmp+b.eq.
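
A source-level sketch of the same fold, assuming an unsigned element type so
that wrap-around is well defined (the new test may well use a different type;
the function names here are illustrative only):

```c
/* Before the fold: the incremented value is compared against 1.  */
int incr_is_one (unsigned *a)
{
  unsigned old = *a;
  unsigned inc = old + 1;
  *a = inc;                 /* second use of the incremented value */
  return inc == 1;
}

/* After the fold: the original value is compared against 0.  */
int incr_is_one_folded (unsigned *a)
{
  unsigned old = *a;
  *a = old + 1;
  return old == 0;
}
```

Both functions store the same value and return the same result for every
input, including the wrapping case where *a is the maximum unsigned value.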

This is a partial fix towards the issue described in PR120283.

gcc/ChangeLog:

* match.pd: Relax single_use for eq/ne when folded constant
is zero.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/forwprop-pre-incr-cmp.c: New test.

c++: constexpr union with no active member [PR124910]

Patrick pointed out that while r16-8767 made a union constant after
destroying its active member, we still weren't treating a union that never
had an active member as constant; the difference is the
CONSTRUCTOR_NO_CLEARING flag, and what that means to
reduced_constant_expression_p.

It seems to me that since P2686 [expr.const] says whether a prvalue
expression is a constant expression depends on the constituent values, and
[intro.object] says that only the active member is a constituent value of a
union, a union with no active member has no constituent values and so is
vacuously constant, like an object of empty type. P2686 as a whole is not a
DR, but the draft was previously unclear, and CWG2658 also clarified that
copying a union is equivalent to copying the active member *if any*.

I was somewhat surprised that none of the existing tests needed to be
changed.

PR c++/124910
DR 2658

gcc/cp/ChangeLog:

* constexpr.cc (reduced_constant_expression_p): Allow a union
with no active member.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/constexpr-union12.C: New test.

[LRA]: Fix reg notes update

There is a typo in using dead_set instead of set in
clear_sparseset_regnos and regnos_in_sparseset_p. This can result in
wrong unused (stale) notes and wrong or worse code generation by
optimizations using unused notes after RA.

gcc/ChangeLog:

* lra-lives.cc (clear_sparseset_regnos, regnos_in_sparseset_p):
Use set instead of dead_set.

[IRA]: Fix some cost calculation.

ira_memory_move_cost is used in many places in IRA, but I found 2 places where
the load and store costs were swapped, i.e. the load cost was used where the
store cost was needed and vice versa. The patch fixes this.

gcc/ChangeLog:

* ira-costs.cc (record_reg_classes): When calculating alt_cost use
the right cost of memory-reg move.
* ira-emit.cc (emit_move_list): Use load cost instead of store for
moving memory to reg.

[RA]: Fix some typos and remove unused code

The following patch fixes different harmless typos and removes some unused code.

gcc/ChangeLog:

* ira-build.cc (add_to_conflicts): Use sizeof(ira_object_p)
instead of sizeof(ira_allocno_t) for allocations.
* ira-color.cc (print_hard_reg_set): Fix printing hard reg set.
* ira-emit.cc (allocno_last_set, allocno_last_set_check): Remove
unused static variables.
* ira.cc (combine_and_move_insns): Fix dead note recognition.
(ira_remove_insn_scratches): Use dump_file instead of
ira_dump_file.
* lra-constraints.cc (match_reload): Remove always true condition.
(undo_optional_reloads): Fix recognition of clobber for assertion.

match: (X * C1) + (X << C2) -> X * (C1 + (1 << C2)) [PR124886]

This patch adds the following match pattern.
(X * C1) + (X << C2) -> X * (C1 + (1 << C2))
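
For a concrete instance, take C1 = 5 and C2 = 3 (values picked here purely for
illustration).  With unsigned arithmetic both sides wrap modulo 2^32, so the
two functions below agree for every input:

```c
#include <stdint.h>

/* (X * C1) + (X << C2) with C1 = 5, C2 = 3.  */
uint32_t mul_plus_shift (uint32_t x)
{
  return x * 5u + (x << 3);
}

/* X * (C1 + (1 << C2)), i.e. x * 13.  */
uint32_t single_mul (uint32_t x)
{
  return x * (5u + (1u << 3));
}
```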

Bootstrapped and tested on x86_64-linux-gnu and aarch64-linux-gnu.

PR tree-optimization/124886

gcc/ChangeLog:

* match.pd ((X * C1) + (X << C2) -> X * (C1 + (1 << C2))): New pattern.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/pr124886.c: New test.

Signed-off-by: Pengxuan Zheng <pengxuan.zheng@oss.qualcomm.com>

[RISC-V][PR target/121268] Add splitters to improve andn generation

So if we have something like (and (not X) (not Y)) where X or Y is a simple
register and the other is possibly more complex, but implementable with a
single instruction, we want to split at the complex expression.  Let's say
it's Y above.  We want to generate

(set (temp) (not Y))
(set (dest) (and (not X) (temp)))

The most interesting cases for Y exploit the identities ~x = -x - 1 and
(x & -x) - 1 = (x - 1) & ~x.

If we take two functions from the PR:

unsigned int f1(unsigned int x)
{
    return ~(x | -x);
}

unsigned int f3(unsigned int x)
{
    return (x & -x) - 1;
}

GCC currently generates this on rv64:

f1:
        negw    a5,a0
        or      a0,a5,a0
        not     a0,a0
        ret

f3:
        negw    a5,a0
        and     a0,a5,a0
        addiw   a0,a0,-1
        ret

After this patch we generate:

f1:
        addiw   a5,a0,-1
        andn    a0,a5,a0
        ret

f3:
        addiw   a5,a0,-1
        andn    a0,a5,a0
        ret
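
Both functions end up as the same addiw/andn pair because each reduces to
(x - 1) & ~x: for f1, ~(x | -x) = ~x & ~(-x) and ~(-x) = x - 1 in 2's
complement.  A small C check of that claim (the function names are
illustrative, not from the testcase):

```c
#include <stdint.h>

/* f1 from the PR.  */
uint32_t f1_orig (uint32_t x)
{
  return ~(x | -x);
}

/* f3 from the PR.  */
uint32_t f3_orig (uint32_t x)
{
  return (x & -x) - 1;
}

/* What "addiw a5,a0,-1 ; andn a0,a5,a0" computes: (x - 1) & ~x.  */
uint32_t f_andn (uint32_t x)
{
  return (x - 1) & ~x;
}
```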

I considered doing these in simplify-rtx.  My biggest worry is over-fitting to
the way the RISC-V port expresses the "w" form instructions.  So I stuck with a
target specific solution.

It's just a few 3->2 splitters.   The bulk of the patch has been in my tester
for a while, but the last pattern is new after I did some experimentation on
rv32 to make sure it's generating sensible code too.  The runs in my tester
have all been without regressions.  Obviously I'll be waiting on the pre-commit
CI system to render a verdict.

PR target/121268
gcc/
* config/riscv/bitmanip.md: Add splitters to exploit identities
that relate subtraction and bitwise negation on 2's complement
arithmetic.

gcc/testsuite/
* gcc.target/riscv/pr121268.c: New test.

aarch64/testsuite: add LTO coverage for branch-protection notes and attributes

Recent binutils (e.g. 2.46) switched AArch64 branch-protection emission
from .note.gnu.property to build attributes
(Tag_Feature_BTI, Tag_Feature_PAC, Tag_Feature_GCS) when GCC is
configured with such toolchains.

PR target/124365 exposed an issue where -flto with
-mbranch-protection=standard caused loss of branch-protection metadata in build
attributes. This was due to an LTO bug, now fixed upstream
(8b39ec70741b7fb9d059b6944f30a6743dea996a).

Add tests to verify both forms in LTO builds, covering:
• older binutils behaviour (.note.gnu.property), and
• newer binutils behaviour (build attributes).

This ensures branch-protection metadata is preserved across LTO for both
toolchain configurations.

PR target/124365

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/lto/lto.exp: New DejaGnu test driver for LTO tests
for aarch64. Copied from gcc/testsuite/gcc.target/arm/lto/lto.exp with
minor changes.
* gcc.target/aarch64/lto/pr124365-build-attributes-1_0.c: New test
for build attributes with branch protection.
* gcc.target/aarch64/lto/pr124365-build-attributes-1_1.c: Companion
source file for the LTO test.
* gcc.target/aarch64/lto/pr124365-build-attributes-2_0.c: New test
for build attributes without branch protection.
* gcc.target/aarch64/lto/pr124365-build-attributes-2_1.c: Companion
source file for the LTO test with branch protection enabled.
* gcc.target/aarch64/lto/pr124365-gnu-property-1_0.c: New test for
`.note.gnu.property` with branch protection.
* gcc.target/aarch64/lto/pr124365-gnu-property-1_1.c: Companion
source file for the LTO test.
* gcc.target/aarch64/lto/pr124365-gnu-property-2_0.c: New test for
`.note.gnu.property` without branch protection.
* gcc.target/aarch64/lto/pr124365-gnu-property-2_1.c: Companion
source file for the LTO test with branch protection enabled.

testsuite: Extend object-readelf beyond attributes

object-readelf in lib/lto.exp was hard-wired to use readelf -A,
limiting it to attribute checks. Extend it to accept a readelf option and
a regex, where the option selects the readelf flag and the regex is
matched against the output.

Add wrapper procedures for common use cases:

• attribute checks, and
• note checks.

Also add support for negative checks via an "is-negative" argument,
which requires that the regex is not present in the output.

gcc/ChangeLog:

* doc/sourcebuild.texi (Scan object metadata with readelf): Document
object-readelf-attributes, object-readelf-attributes-not,
object-readelf-notes, and object-readelf-notes-not as regex-based
checks with optional target/xfail selectors.

gcc/testsuite/ChangeLog:

* lib/lto.exp (object-readelf): Accept a readelf option and a single
regex; match against full readelf output. Keep positive/negative
behaviour via wrappers.
(object-readelf-attributes, object-readelf-attributes-not,
object-readelf-notes, object-readelf-notes-not): Implement as wrappers
over the generic matcher.
* gcc.dg-selftests/dg-final.exp (dg_final_directive_check_num_args):
Update for object-readelf-* wrappers to regex-style arguments (1..3).
* gcc.target/arm/lto/pr61123-enum-size_0.c: Update to use
object-readelf-attributes with a single regex.

[PATCH v3] tree-optimization: lower mempcpy to memcpy when result is unused [PR93556]

This patch allows the GIMPLE folder to transform __builtin_mempcpy into
__builtin_memcpy in cases where the return value is ignored. This is beneficial
because most targets have an efficient implementation for memcpy.
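
The distinction the folder relies on can be sketched as follows: mempcpy
returns dest + n whereas memcpy returns dest, so the two are interchangeable
only when nobody looks at the result.  The function names are illustrative,
and the example assumes a C library that provides mempcpy (e.g. glibc).

```c
#include <stddef.h>
#include <string.h>

/* Result ignored: a candidate for being folded to __builtin_memcpy.  */
void copy_ignoring_result (char *dst, const char *src, size_t n)
{
  __builtin_mempcpy (dst, src, n);
}

/* Result used (it is dst + n): the call has to stay a mempcpy.  */
char *copy_using_result (char *dst, const char *src, size_t n)
{
  return __builtin_mempcpy (dst, src, n);
}
```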

Existing tests that relied on the unfolded mempcpy have been duplicated - one
version now takes the folded mempcpy into account, and the other intentionally
prevents the folding from happening.

Bootstrapped and regression tested on x86_64-linux-gnu.

PR tree-optimization/93556

gcc/ChangeLog:

* gimple-fold.cc (gimple_fold_builtin_mempcpy): New function.
(gimple_fold_builtin): Handle BUILT_IN_MEMPCPY.

gcc/testsuite/ChangeLog:

* gcc.dg/pr79223.c: Rename to gcc.dg/pr79223-1.c and update scans.
* gcc.dg/tree-prof/val-prof-7.c: Rename to
gcc.dg/tree-prof/val-prof-7-1.c and update scans.
* gcc.dg/tree-ssa/builtins-folding-gimple-3.c: Update scans.
* gcc.dg/builtin-mempcpy-1.c: New test.
* gcc.dg/builtin-mempcpy-2.c: New test.
* gcc.dg/pr79223-2.c: New test.
* gcc.dg/tree-prof/val-prof-7-2.c: New test.
* gcc.dg/tree-ssa/builtins-folding-gimple-4.c: New test.

Signed-off-by: Netanel Komm <netanelkomm@gmail.com>

libstdc++: Fix up std::is_scalar for std::meta::info [PR125024]

https://eel.is/c++draft/basic.types.general#9.sentence-1 says that
std::meta::info and its cv-qualified versions are scalar types too
(and in https://eel.is/c++draft/basic.fundamental#19.sentence-1
that they are fundamental types too).
Now, on the reflection side, eval_is_scalar_type is handled
in the compiler and uses SCALAR_TYPE_P (type) which includes
REFLECTION_TYPE_P check and eval_is_fundamental_type includes that
explicitly too.
std::is_fundamental uses
   template<typename _Tp>
     struct is_fundamental
     : public __or_<is_arithmetic<_Tp>, is_void<_Tp>,
                    is_null_pointer<_Tp>
#if __cpp_impl_reflection >= 202506L
                    , is_reflection<_Tp>
#endif
                    >::type
     { };
but for std::is_scalar we apparently forgot to include is_reflection.

The following patch fixes that.

2026-04-26  Jakub Jelinek  <jakub@redhat.com>

PR libstdc++/125024
* include/std/type_traits (std::is_scalar): For
__cpp_impl_reflection >= 202506L handle is_reflection types as
scalar.
* testsuite/20_util/is_scalar/reflection.cc: New test.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

testsuite: Fix up bitint-95.c test [PR124988]

I forgot to add the usual guards of bitint tests to bitint-95.c test
(which were done even in the 4 other tests from the same commit).

2026-04-27 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/124988
* gcc.dg/torture/bitint-95.c: Add bitint effective targets and
guard parts of test which need _BitInt(192) support with
__BITINT_MAXWIDTH__ >= 192.

tree-optimization/125025 - ICE with niter analysis and UBSAN

The following avoids trying to compute the absolute step by
negating a signed step, instead, as done in one other place
already, first convert to unsigned and then negate.
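
In plain C terms the issue looks like this (a sketch of the idea, not the
tree-ssa-loop-niter.cc code; abs_step is a made-up name): negating the most
negative signed value overflows, while negating after conversion to unsigned
simply wraps.

```c
#include <limits.h>

/* Absolute value of a signed step, computed without signed overflow.
   -step would be undefined for step == INT_MIN; -(unsigned) step is not.  */
unsigned abs_step (int step)
{
  unsigned ustep = (unsigned) step;
  return step < 0 ? -ustep : ustep;
}
```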

PR tree-optimization/125025
* tree-ssa-loop-niter.cc (number_of_iterations_ne): Avoid
negation of most negative signed integer.
(number_of_iterations_lt): Likewise.

* gcc.dg/torture/pr125025.c: New testcase.

tree-optimization/125019 - fix ICE with recurrence vectorization

This fixes an oversight with the PR124677 fix.

PR tree-optimization/125019
* tree-vect-loop.cc (vectorizable_recurr): Properly guard
against hitting last stmt when searching for the insertion
place.

* gcc.dg/pr125019.c: New testcase.

Daily bump.

match: Optimize `signed < 0 ? positive : min<signed, positive>` into `(signed)min<(unsigned), (unsigned)positive>` [PR110262]

While looking into PR 110252 a few years back, I noticed this missed
optimization in code from sel-sched.cc. I only realized today
I could generalize it to handle more than just 1 to all positive
values.
This adds the pattern to optimize:
signed < 0 ? positive : min<signed, positive>
into:
unsigned ts = signed;
unsigned ps = positive;
unsigned ru = min<ts, ps>;
(signed)ru

Bootstrapped and tested on x86_64-linux-gnu.

PR tree-optimization/110262

gcc/ChangeLog:

* match.pd (`signed < 0 ? positive : min<signed, positive>`): New
pattern.

gcc/testsuite/ChangeLog:

* gcc.dg/pr110262-1.c: New test.
* gcc.dg/tree-ssa/phi-opt-46.c: New test.
* gcc.dg/tree-ssa/phi-opt-47.c: New test.

Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>

install: Use Binutils over binutils

gcc:
* doc/install.texi (Prerequisites): Use Binutils over binutils to
refer to that project.
(Downloading the source): Ditto.
(Configuration): Ditto.
(Building): Ditto.
(Specific): Ditto.

rtl-optimization: Simplify vec_select of a vec_select.

This patch adds an RTL optimization to simplify-rtx.cc to simplify a
vec_select of a vec_select.

A motivating example is the following code on x86_64:

typedef unsigned int v4si __attribute__((vector_size(16)));

v4si foo(v4si vec, int val) {
    vec[1] = val;
    vec[2] = val;
    return vec;
}

with -O2, GCC currently generates the following code:

foo:    movd    %edi, %xmm1
        pshufd  $225, %xmm0, %xmm0 // swap elements 0 and 1
        movss   %xmm1, %xmm0 // overwrite element 0
        pshufd  $225, %xmm0, %xmm0 // swap elements 0 and 1
        pshufd  $198, %xmm0, %xmm0 // swap elements 0 and 3
        movss   %xmm1, %xmm0 // overwrite element 0
        pshufd  $198, %xmm0, %xmm0 // swap elements 0 and 3

Notice there are two consecutive pshufd instructions, permuting the
same register.  During combine, we see:

Trying 11 -> 14:
   11: r103:V4SI=vec_select(r103:V4SI,parallel)
   14: r105:V4SI=vec_select(r103:V4SI,parallel)
      REG_DEAD r103:V4SI
Failed to match this instruction:
(set (reg:V4SI 105 [ vec_5 ])
    (vec_select:V4SI (vec_select:V4SI (reg:V4SI 103 [ vec_4 ])
            (parallel [
                    (const_int 1 [0x1])
                    (const_int 0 [0])
                    (const_int 2 [0x2])
                    (const_int 3 [0x3])
                ]))
        (parallel [
                (const_int 2 [0x2])
                (const_int 1 [0x1])
                (const_int 0 [0])
                (const_int 3 [0x3])
            ])))

Clearly a permutation of a permutation is another permutation, so
the above expression can be simplified/canonicalized.  Conveniently
there's already code in simplify_rtx to spot that a vec_select of
vec_select is an identity, this patch extends that functionality to
simplify a vec_select of a vec_select to a single vec_select.
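
The composition rule itself is small.  Given vec_select semantics of
result[i] = src[sel[i]], the combined selector is inner[outer[i]].  The sketch
below is not the simplify-rtx.cc code and its helper names are hypothetical; it
just reproduces the arithmetic, including the pshufd immediate encoding of two
bits per element.

```c
#include <stddef.h>

/* Compose two selectors: applying INNER and then OUTER to a vector is
   the same as applying COMBINED once.  */
void compose_selectors (const int *inner, const int *outer,
                        int *combined, size_t n)
{
  for (size_t i = 0; i < n; i++)
    combined[i] = inner[outer[i]];
}

/* Encode a 4-element selector as a pshufd immediate (2 bits per lane,
   lane 0 in the low bits).  */
unsigned pshufd_imm (const int *sel)
{
  return (unsigned) (sel[0] | sel[1] << 2 | sel[2] << 4 | sel[3] << 6);
}
```

With inner = {1, 0, 2, 3} (immediate 225) and outer = {2, 1, 0, 3}
(immediate 198) from the dump above, the combined selector is {2, 0, 1, 3},
which encodes as 210.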

With this transformation in simplify-rtx.cc, combine now reports:

Trying 11 -> 14:
   11: r103:V4SI=vec_select(r103:V4SI,parallel)
   14: r105:V4SI=vec_select(r103:V4SI,parallel)
      REG_DEAD r103:V4SI
Successfully matched this instruction:
(set (reg:V4SI 105 [ vec_5 ])
    (vec_select:V4SI (reg:V4SI 103 [ vec_4 ])
        (parallel [
                (const_int 2 [0x2])
                (const_int 0 [0])
                (const_int 1 [0x1])
                (const_int 3 [0x3])
            ])))
allowing combination of insns 11 and 14
original costs 4 + 4 = 8
replacement cost 4

And for the example above, we now generate:

foo:    movd    %edi, %xmm1
        pshufd  $225, %xmm0, %xmm0
        movss   %xmm1, %xmm0
        pshufd  $210, %xmm0, %xmm0
        movss   %xmm1, %xmm0
        pshufd  $198, %xmm0, %xmm0
        ret

2026-04-26  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
* simplify-rtx.cc (simplify_context::simplify_binary_operation_1)
<case VEC_SELECT>: Simplify a (non-identity) vec_select of a
vec_select.

gcc/testsuite/ChangeLog
* gcc.target/i386/sse2-pshufd-2.c: New test case.

PR tree-optimization/124715: pow(0,-1) sets errno with -fmath-errno

This patch addresses PR tree-optimization/124715, where it is unsafe for
GCC (specifically match.pd) to transform pow(x,-1) into 1.0/x if x may be
zero, because pow(0,-1) sets errno and the division does not, unless
-fno-math-errno (included in -ffast-math) is specified.

2026-04-26 Roger Sayle <roger@nextmovesoftware.com>

gcc/ChangeLog
PR tree-optimization/124715
* match.pd (simpify pows): Check flag_errno_math before simplifying
pow(x,-1) -> 1/x when x could be zero.

gcc/testsuite/ChangeLog
PR tree-optimization/124715
* gcc.dg/no-math-errno-5.c: New test case.
* gcc.dg/no-math-errno-6.c: Likewise.

i386: Refactor AVX512 comparisons in machine description sse.md.

This patch refactors/tidies up the define_insns for vector comparisons
on 512-bit vectors in sse.md.  The motivation is that the current
organization (accidentally) introduces dubious instructions such as
avx512f_cmpv16si3_mask_round and avx512vl_cmpv2di3_mask_round, which
are integer comparisons that specify a floating point rounding mode!?

The problem is caused by the decomposition of mode iterators.
Currently, sse.md uses four patterns: (1) for signed comparisons
of floating point and large integer modes (V48H), (2) for signed
comparisons of small integer modes (VI12), (3) for unsigned
comparisons of small integer modes (VI12) and (4) for unsigned
comparisons of large integer modes (VI48).  The first pattern
also allows for variants specifying the FP rounding mode.

The refactoring below uses a more sensible decomposition into
only three patterns: (1) for [signed] comparisons of floating
point modes (VFH), (2) for signed comparisons of integers (VI1248)
and (3) for unsigned comparisons of integers (VI1248).

For the record, to show this produces the same coverage:

V48H = v{16,8,4}si v{8,4,2}di v{32,16,8}hf v{16,8,4}sf v{8,4,2}df
VI12 = v{64,32,16}qi v{32,16,8}hi

VFH = v{32,16,8}hf v{16,8,4}sf v{8,4,2}df
VI1248 = v{64,32,16}qi v{32,16,8}hi v{16,8,4}si v{8,4,2}di

The simplification also allows a clean-up of predicates
(for operand[3]) as there are 8 integer comparison operators
and 32 floating point comparison operators, and we no longer
need cmp_imm_predicate to restrict range based upon <mode>.

V48H cmp_imm_predicate -> VFH const_0_to_31_operand (FP)
VI12 cmp_imm_predicate -> VI1248 const_0_to_7_operand (signed)
VI12 const_0_to_7_operand ->
VI48 const_0_to_7_operand -> VI1248 const_0_to_7_operand (unsigned)

There are no changes other than removing the nonsensical patterns
from insn-emit, insn-recog and friends.

2026-04-26  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
* config/i386/sse.md
(<avx512>_cmp<mode>3<mask_scalar_merge_name><round_saeonly_name>):
Change mode iterator from V48H_AVX512VL to VFH_AVX512VL and op3's
predicate from <cmp_imm_predicate> to const_0_to_31_operand.
(<avx512>_cmp<mode>3<mask_scalar_merge_name>): Change mode
iterator from VI12_AVX512VL to VI1248_AVX512VLBW.
(<avx512>_ucmp<mode>3<mask_scalar_merge_name>): Likewise.

Daily bump.

[RISC-V][PR rtl-optimization/56096] Improve equality comparisons of a logical AND expressions

This BZ shows that we can improve certain comparisons for RISC-V.  In
particular if we are testing the result of a logical AND for equality and one
operand of the AND requires synthesis, we may be able to do better if we right
shift away any trailing zeros from the constant and shift the other input as
well.  This wins when the shifted constant does not require synthesis.

That may in turn allow improvement of a select of 0 and 2^n based on the
zero/nonzero status of a logical AND.  Essentially we can rewrite the sequence
to remove a data dependency.

Concretely:

>
> unsigned f1 (unsigned x, unsigned m)
> {
>     x >>= ((m & 0x008080) ? 8 : 0);
>     return x;
> }

Compiles into:

>         li      a5,32768
>         addi    a5,a5,128
>         and     a1,a1,a5
>         snez    a1,a1
>         slliw   a1,a1,3
>         srlw    a0,a0,a1
>         ret

But after this patch we generate this instead:

>         srai    a5,a1,7
>         andi    a5,a5,257
>         li      a4,8
>         czero.eqz       a1,a4,a5
>         srlw    a0,a0,a1
>         ret

It's just one less instruction, but the li can issue whenever the uarch wants
before the srlw as it has no incoming dependency.  So we're slightly denser
on encoding and slightly more efficient as well.  Much like 57650, I'm focused
on the low level RISC-V codegen issues, not the broader issues that are raised
in the PR.
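
The constant rewrite in the example above can be checked directly in C: 0x8080
has bits 7 and 15 set, so shifting the input right by 7 lets us test against
0x101 (257) instead, which fits in an andi immediate.  This only models the
arithmetic (using a logical shift on an unsigned value), and the function names
are illustrative.

```c
#include <stdint.h>

/* Original test: the mask 0x8080 needs li + addi to synthesize.  */
int test_orig (uint32_t m)
{
  return (m & 0x8080u) != 0;
}

/* Shifted test: trailing zeros removed from the mask, input shifted too.  */
int test_shifted (uint32_t m)
{
  return ((m >> 7) & 0x101u) != 0;
}

/* The select of 0 or 8 feeding the shift, in both forms.  */
uint32_t shift_orig (uint32_t x, uint32_t m)
{
  return x >> (test_orig (m) ? 8 : 0);
}

uint32_t shift_new (uint32_t x, uint32_t m)
{
  return x >> (test_shifted (m) ? 8 : 0);
}
```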

This has been in my tree for a while, so it's been tested on riscv32-elf,
riscv64-elf and bootstrapped on the BPI which has support for czero.  Waiting
on pre-commit CI before moving forward.

PR rtl-optimization/56096
gcc/
* config/riscv/riscv.md: Add new patterns to optimize certain cases with
a logical AND feeding an equality test against zero.

gcc/testsuite/

* gcc.target/riscv/pr56096.c: New test.

scev/niter: Use INTEGRAL_NB_TYPE_P instead of direct comparison to INTEGER_TYPE [PR124061]

I noticed this while looking into PR 124052. This is not the first time we had
direct type comparison against INTEGER_TYPE which should have been different.
As mentioned in PR 124052, I didn't include bool types so I needed a new macro
to simplify things.

Bootstrapped and tested on x86_64-linux-gnu.

PR tree-optimization/124061

gcc/ChangeLog:

* tree-scalar-evolution.cc (interpret_rhs_expr): Use
INTEGRAL_NB_TYPE_P instead of comparing the code to INTEGER_TYPE.
* tree-ssa-loop-niter.cc (number_of_iterations_ne): Likewise.
(number_of_iterations_cltz): Likewise.
(number_of_iterations_exit_assumptions): Likewise.
* tree.h (INTEGRAL_NB_TYPE_P): New macro.

gcc/testsuite/ChangeLog:

* g++.dg/opt/enum-loop-1.C: New test.
* gcc.dg/tree-ssa/bitint-loop-opt-1.c: New test.

Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>

[RISC-V][PR target/123904] Improve bit masking of shifted values

If we are masking off bits on the upper and lower part of a register on riscv,
depending on the precise mask it may be best implemented as a shift triplet.
ie, shift left to clear upper bits, shift right to clear lower bits, shift left
again to put the bits into their proper position.

If the input value is already left shifted and the shift count corresponds to
the low mask bits, then we can get away with just two shifts. We shift left to
clear the relevant high bits, then shift right to put them into their proper
position.
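
A C sketch of both forms on a 64-bit value, for a mask with ones in bits
lo..hi where 0 <= lo <= hi <= 63.  It illustrates the shift arithmetic only,
not the new splitter, and all names are made up for the example.

```c
#include <stdint.h>

/* Reference mask with ones in bits lo..hi.  */
uint64_t bit_range_mask (unsigned lo, unsigned hi)
{
  return (~UINT64_C (0) >> (63 - hi)) & (~UINT64_C (0) << lo);
}

/* General case, x & mask as a shift triplet.  */
uint64_t mask_three_shifts (uint64_t x, unsigned lo, unsigned hi)
{
  x <<= 63 - hi;            /* clear the bits above hi */
  x >>= (63 - hi) + lo;     /* clear the bits below lo */
  return x << lo;           /* move the field back into place */
}

/* Input is y << lo, so the low bits are already zero: (y << lo) & mask
   needs only two shifts.  */
uint64_t mask_shifted_two_shifts (uint64_t y, unsigned lo, unsigned hi)
{
  y <<= lo + (63 - hi);     /* clear the relevant high bits */
  return y >> (63 - hi);    /* put the field into its final position */
}
```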

This likely came from spec or coremark given it was reported to me by the RAU
team a while back. But the testcase didn't include enough breadcrumbs to know
for sure.

This has been repeatedly bootstrapped and regression tested on the Pioneer and
BPI as well as regularly regression tested on the riscv32-elf and riscv64-elf
embedded targets.

I'll wait for pre-commit CI to spin before pushing to the trunk.

PR target/123904
gcc/
* config/riscv/riscv.md (masking shifted value): New splitter to
optimize certain masking operations on shifted values.

gcc/testsuite/
* gcc.target/riscv/pr123904.c: New test.

[RISC-V][PR target/123838] Improve code generated for shifts with counts 31-N or 63-N

A shift count expressed as 31 - n ends up generating code like this:

        li      a5,31
        subw    a5,a5,a1
        sllw    a0,a0,a5
        ret

Note how we had to load the constant 31 into a register for the subtraction. But instead of
using 31 - n we can use a bit-not as it'll do precisely what we need in the
bits that the shift instruction actually uses.  This results in:

        not     a1, a1
        sllw    a0, a0, a1
        ret

The core idea we're exploiting here is the processor implements
SHIFT_COUNT_TRUNCATED semantics, so an SI shift only cares about the low 5 bits
and DI the low 6 bits of the shift count.  And if we think about what bit
pattern -1 would be in those cases we get 31 and 63.  We then exploit the
identity

-x = ~x + 1  // identity
-1 - x = ~x  // a tiny bit of algebra

So in these limited cases we can replace the -1 - x with ~x.
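
The equivalence can be checked with a small C sketch. The function names are
made up for this example, and the explicit "& 31" / "& 63" masks are only there
because C leaves over-wide shifts undefined; they model the truncation the
hardware shift applies to its count.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit shift with the count written as 31 - n.  */
static uint32_t
shl_sub_si (uint32_t x, uint32_t n)
{
  return x << ((31u - n) & 31u);
}

/* Same shift with the count computed as a bitwise NOT.  */
static uint32_t
shl_not_si (uint32_t x, uint32_t n)
{
  return x << (~n & 31u);
}

/* 64-bit variants: 63 - n versus ~n, truncated to 6 bits.  */
static uint64_t
shl_sub_di (uint64_t x, uint64_t n)
{
  return x << ((63u - n) & 63u);
}

static uint64_t
shl_not_di (uint64_t x, uint64_t n)
{
  return x << (~n & 63u);
}

/* Both forms agree for every count, including out-of-range ones.  */
static int
check_shift_counts (void)
{
  for (uint32_t n = 0; n < 256; n++)
    {
      assert (shl_sub_si (0x80000001u, n) == shl_not_si (0x80000001u, n));
      assert (shl_sub_di (UINT64_C (0x8000000000000001), n)
              == shl_not_di (UINT64_C (0x8000000000000001), n));
    }
  return 1;
}
```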

I didn't implement this in simplify-rtx.  It wasn't actually going to help
because while the RISC-V chip implements SHIFT_COUNT_TRUNCATED semantics, it
doesn't define SHIFT_COUNT_TRUNCATED for "reasons".

So there's two patterns.  One for an X mode destination, naturally the shift
count is 31/63 - n for SI/DI respectively.  It's a bit odd that the subtraction
is always SImode, but that's probably narrowing happening somewhere.

The second pattern covers the "w" forms for rv64.

This trick probably works for the zbs instructions as well. That's going to be
a whole lot more patterns and I haven't seen this idiom show up anywhere in
practice, so it doesn't seem like a good cost/benefit analysis.

This spun overnight on riscv32-elf and riscv64-elf and on the Pioneer without
regressions.  I'll wait for pre-commit CI to do its thing before pushing.

PR target/123838
gcc/
* config/riscv/riscv.md: Use splitters to simplify shifts where
the shift count is 31-N or 63-N.

gcc/testsuite
* gcc.target/riscv/pr123838.c: New test.

Co-authored-by: Austin Law <austinklaw@gmail.com>

c: Fix recursive structure / union redeclaration with qualifiers [PR124303]

We reject correct recursive redeclarations when qualifiers are involved.
The reason is that the check is done before the variants are completed.

PR c/124303

gcc/c/ChangeLog:
* c-decl.cc (finish_struct): Check for consistency of
declarations after completing variants.

gcc/testsuite/ChangeLog:
* gcc.dg/pr124303.c: New test.

RISC-V: Add test for vec_duplicate + vmsle.vv combine with GR2VR cost 0, 1 and 15

Add asm dump check and run test for the vec_duplicate + vmsle.vv
combine to vmsle.vx, with GR2VR cost 0, 2 and 15.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add asm check
for vmsle.vx.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_binary.h: Add test
helper macros.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_binary_data.h: Add test
data for run test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vmsle-run-1-i16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vmsle-run-1-i32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vmsle-run-1-i64.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vmsle-run-1-i8.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

RISC-V: Combine vec_duplicate + vmsle.vv to vmsle.vx on GR2VR cost

This patch combines vec_duplicate + vmsle.vv into vmsle.vx, as in the
example below.  The related pattern depends on the cost of vec_duplicate
from GR2VR: late-combine performs the combination if the GR2VR cost is
zero and rejects it if the GR2VR cost is greater than zero.

Assume we have asm code like the following, with a GR2VR cost of 0.

Before this patch:
        beq a3,zero,.L8
        vsetvli a5,zero,e32,m1,ta,ma
        vmv.v.x v2,a2
        ...
    .L3:
        vsetvli a5,a3,e32,m1,ta,ma
        ...
        vmsle.vv v1,v2,v3
        ...
        bne a3,zero,.L3

After this patch:
        beq a3,zero,.L8
        ...
    .L3:
        vsetvli a5,a3,e32,m1,ta,ma
        ...
        vmsle.vx v1,a2,v3
        ...
        bne a3,zero,.L3

gcc/ChangeLog:

* config/riscv/predicates.md: Add ge to the swappable
cmp operator iterator.
* config/riscv/riscv-v.cc (get_swapped_cmp_rtx_code): Take
care of the swapped rtx code as well.

Signed-off-by: Pan Li <pan2.li@intel.com>

match.pd: remove bit set/bit clear branch mispredict [PR64567]

Add two patterns to eliminate mispredicts in the following bit ops
scenarios:

- checking if a single bit is not set, and in this case set it: always
set the bit;
- checking if a bitmask is set (even partially), and in this case clear
it: always clear the bitmask.
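
A minimal C sketch of the two scenarios, with hypothetical function names, shows
why the conditional forms can be replaced by unconditional ones. Note that the
first case relies on the tested value being a single bit, as stated above.

```c
#include <assert.h>

/* Branchy forms, as they might be written in source code.  */
static unsigned
set_bit_branchy (unsigned a, unsigned bit)
{
  if ((a & bit) == 0)   /* Bit not set: set it.  */
    a |= bit;
  return a;
}

static unsigned
clear_mask_branchy (unsigned a, unsigned mask)
{
  if (a & mask)         /* Mask set, even partially: clear it.  */
    a &= ~mask;
  return a;
}

/* Branchless equivalents.  */
static unsigned
set_bit_always (unsigned a, unsigned bit)
{
  return a | bit;
}

static unsigned
clear_mask_always (unsigned a, unsigned mask)
{
  return a & ~mask;
}

/* Exhaustive check over 8-bit values, single bits and all masks.  */
static int
check_bit_patterns (void)
{
  for (unsigned a = 0; a < 256; a++)
    {
      for (unsigned i = 0; i < 8; i++)
        assert (set_bit_branchy (a, 1u << i) == set_bit_always (a, 1u << i));
      for (unsigned mask = 0; mask < 256; mask++)
        assert (clear_mask_branchy (a, mask) == clear_mask_always (a, mask));
    }
  return 1;
}
```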

Bootstrapped and tested with x86_64-pc-linux-gnu.

PR tree-optimization/64567

gcc/ChangeLog:

* match.pd (`cond (bit_and A IMM) (bit_or A IMM) A`): New
pattern.
(`cond (bit_and A IMM) (bit_and A ~IMM) A`): New pattern.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/pr64567-2.c: New test.
* gcc.dg/tree-ssa/pr64567.c: New test.

tree-ssa-strlen: Use gimple_build/gimple_convert_to_ptrofftype [PR122989]

Replace convert_to_ptrofftype, force_gimple_operand_gsi,
gimple_build_assign, and gsi_insert_before with
gimple_convert_to_ptrofftype and gimple_build.

gcc/ChangeLog:

PR tree-optimization/122989
* tree-ssa-strlen.cc (get_string_length): Use
gimple_convert_to_ptrofftype and gimple_build instead of
convert_to_ptrofftype/force_gimple_operand_gsi/gimple_build_assign.

Signed-off-by: Avinal Kumar <avinal.xlvii@gmail.com>

testsuite: Fix gcc.target/x86_64/abi tests on FreeBSD

The gcc.target/x86_64/abi tests currently FAIL on FreeBSD/amd64 when
using GNU ld.  Most of the failures are like

FAIL: gcc.target/x86_64/abi/test_3_element_struct_and_unions.c compilation,  -O0

gld-2.46: warning: /tmp//cckSN7Ts.o: missing .note.GNU-stack section implies executable stack
gld-2.46: NOTE: This behaviour is deprecated and will be removed in a future version of the linker

UNRESOLVED: gcc.target/x86_64/abi/test_3_element_struct_and_unions.c execution,
-O0

This causes more than 1000 failures.  This patch fixes this by emitting
.note.GNU-stack on FreeBSD, too.

With this fixed, the ms-sysv tests now FAIL with

FAIL: gcc.target/x86_64/abi/ms-sysv/ms-sysv.c  -O2 "-DGEN_ARGS=-p0" (test for excess errors)
UNRESOLVED: gcc.target/x86_64/abi/ms-sysv/ms-sysv.c  -O2 "-DGEN_ARGS=-p0" compilation failed to produce executable

Excess errors:
ms-sysv/ms-sysv-generated.h:30:1: error: bp cannot be used in 'asm' here

Like Solaris, FreeBSD empirically needs --omit-rbp-clobbers in GEN_ARGS.

There's one more failure:

FAIL: gcc.target/x86_64/abi/callabi/leaf-2.c scan-assembler-not %rsp

This test expects -fomit-frame-pointer, while FreeBSD defaults to
-fno-omit-frame-pointer.

Bootstrapped without regressions on amd64-pc-freebsd15.0 with both gld
and /usr/bin/ld (lld), and x86_64-pc-linux-gnu.

2026-03-18  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

gcc/testsuite:
* gcc.target/x86_64/abi/asm-support.S: Use .note.GNU-stack on
FreeBSD, too.
* gcc.target/x86_64/abi/avx/asm-support.S: Likewise.
* gcc.target/x86_64/abi/avx512f/asm-support.S: Likewise.
* gcc.target/x86_64/abi/avx512fp16/asm-support.S: Likewise.
* gcc.target/x86_64/abi/avx512fp16/m256h/asm-support.S: Likewise.
* gcc.target/x86_64/abi/avx512fp16/m512h/asm-support.S: Likewise.
* gcc.target/x86_64/abi/bf16/asm-support.S: Likewise.
* gcc.target/x86_64/abi/bf16/m256bf16/asm-support.S: Likewise.
* gcc.target/x86_64/abi/bf16/m512bf16/asm-support.S: Likewise.
* gcc.target/x86_64/abi/ms-sysv/do-test.S: Likewise.
Update comment.

* gcc.target/x86_64/abi/ms-sysv/ms-sysv.exp (runtest_ms_sysv): Add
--omit-rbp-clobbers on FreeBSD.

* gcc.target/x86_64/abi/callabi/leaf-2.c (dg-options): Add
-fomit-frame-pointer.

[RISC-V][PR target/124984] Fix RTL checking abort in thead memory address classification

As shown in the PR, we can trigger an RTL checking abort when classifying thead
specific addressing modes. As far as I can tell, the code is supposed to be
extracting the constant value from the multiply operation, but instead is
referencing the wrong object.

The fix is trivial. I don't think this is anywhere near serious enough to try
to get into the imminent gcc-16 release. So after pre-commit testing is done
I'll push to the trunk, then backport in a week or so after the gcc-16 release
has been made.

This has been regression tested on riscv64-elf and riscv32-elf. While it will
spin on the Pioneer overnight, which has the relevant thead extensions, they
aren't enabled by default, so I don't really expect any meaningful improvements
to coverage.

PR target/124984
gcc/
* config/riscv/thead.cc (th_memidx_classify_address_index): Extract
constant multiplicand value from the right object.

gcc/testsuite
* gcc.target/riscv/pr124984.c: New test.

Daily bump.

testsuite: New effective-target sleep

libgfortran calls sleep, which is not available on all targets.

gcc:
* doc/sourcebuild.texi (Effective-Target Keywords): Document 'sleep'.

gcc/testsuite:
* lib/target-supports.exp (check_effective_target_sleep): New.

[RISC-V][PR rtl-optimization/80770] Canonicalize extending byte loads for RISC-V

In the process of debugging pr80770 with Shreya it became apparent that a
failure to CSE certain memory references was inhibiting Shreya's RTL
simplification from firing in all the cases we cared about as the simplifier
requires two operands to be the same pseudo.

The failure to CSE stems from having two QI loads which are sign extended to
different sized destinations.  As it turns out the code to fix that was
something I already had in flight as it's a small piece of eliminating a few
define_insn_and_split patterns (or simplifying them down to just a
define_split).

To expose the missed CSE what we really want to do is extend the value out to
word mode in a temporary, then use a lowpart extraction to set the real
destination.  The key being we haven't changed the size of the load, just how
widely it gets extended. Think of it as canonicalization for the purposes of
CSE.
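
In C terms the idea looks roughly like the sketch below. It assumes a 64-bit
word and uses made-up function names; the real change is in the RTL expanders,
not in C source.

```c
#include <assert.h>
#include <stdint.h>

/* Direct form: the byte is sign extended straight to 16 bits.  */
static int16_t
load_ext_direct (const int8_t *p)
{
  return (int16_t) *p;
}

/* Canonical form: extend the same byte out to a full word (64 bits
   assumed here), then take the low part for the real destination.
   Loads extended to different widths now share the word-sized value.  */
static int16_t
load_ext_via_word (const int8_t *p)
{
  int64_t word = (int64_t) *p;
  return (int16_t) word;   /* Value fits, so the narrowing is exact.  */
}

/* Both forms agree for every possible byte value.  */
static int
check_byte_extension (void)
{
  for (int v = -128; v <= 127; v++)
    {
      int8_t b = (int8_t) v;
      assert (load_ext_direct (&b) == load_ext_via_word (&b));
    }
  return 1;
}
```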

This isn't the full set of changes I had in flight in that space, but does
clean things up enough for QImode loads to get CSE'd better and is enough to
trigger Shreya's pr80770 changes consistently for the testcases we have on
RISC-V.

This has been spinning in my tester for a while.  So it's clean on riscv64-elf,
riscv32-elf as well as bootstrapped and regression tested on the Pioneer and
BPI-F3.  I'll wait for the pre-commit tester to do its thing before pushing to
the trunk.

In case it's not obvious, I'm focused on trickling RISC-V target improvements
right now so as not to potentially interfere with the release process.  So this
doesn't include Shreya's simplify-rtx.cc changes.

PR rtl-optimization/80770
gcc/
* config/riscv/riscv.md (zero_extendqi<SUPERQI:mode>2): Always extend
out to a word and use a subreg lowpart extraction to get the right bits.
(extend<SHORT:mode><SUPERQI:mode>2): Similarly.

gcc/testsuite
* gcc.target/riscv/rvv/base/vwaddsub-1.c: Adjust expected output.

mips: Fix ICE on mips64-elf by removing MAX_FIXED_MODE_SIZE override [PR120144]

The definition of MAX_FIXED_MODE_SIZE did not account for MIPS supporting
TImode, which causes an internal compiler error when building libstdc++. Upon further
investigation, this definition appears to be a historical mistake.

This patch removes the MAX_FIXED_MODE_SIZE override, which fixes the error.

PR target/120144

gcc/ChangeLog:

* config/mips/mips.h (MAX_FIXED_MODE_SIZE): Remove.

gcc/testsuite/ChangeLog:

* gcc.dg/torture/pr120144.c: New test.

Signed-off-by: Carter Rennick <carter.rennick@gmail.com>

tree-ssa-dce: eliminate dead relaxed atomic loads with no LHS [PR123966]

A relaxed atomic load whose result is never used has no observable
effect: the value is discarded and __ATOMIC_RELAXED provides no
inter-thread synchronisation guarantee.

Fix this by adding an early-return check for
BUILT_IN_ATOMIC_LOAD_1/2/4/8/16 calls that have no LHS and a
compile-time-constant relaxed memory order.
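
The kind of call this covers looks roughly like the C sketch below. The
function and variable names are hypothetical, and whether a given call is
actually removed depends on the compiler version and optimization level.

```c
#include <assert.h>

static int shared_flag = 7;

/* The loaded value is discarded and the memory order is the
   compile-time constant __ATOMIC_RELAXED: the shape of call the
   change describes as having no observable effect.  */
static void
touch_relaxed (int *p)
{
  (void) __atomic_load_n (p, __ATOMIC_RELAXED);
}

/* Here the result is used, so the load has to stay.  */
static int
read_relaxed (int *p)
{
  return __atomic_load_n (p, __ATOMIC_RELAXED);
}

/* Behaviour is the same whether or not the dead load is removed.  */
static int
check_atomic_loads (void)
{
  touch_relaxed (&shared_flag);
  assert (read_relaxed (&shared_flag) == 7);
  return 1;
}
```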

PR tree-optimization/123966

gcc/ChangeLog:
* tree-ssa-dce.cc (mark_stmt_if_obviously_necessary):
Don't mark a relaxed atomic load with no LHS as necessary.

gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/pr123966.c: New test.

Signed-off-by: Eikansh Gupta <eikansh.gupta@oss.qualcomm.com>

libstdc++: Disallow duration of cv-qualified types and references.

This implements LWG 4481, "Disallow chrono::duration<const T, P>",
which was approved in Croydon 2026.

libstdc++-v3/ChangeLog:

* include/bits/chrono.h: Add static_assert requiring cv-unqualified
non-reference type.
* testsuite/20_util/duration/io.cc: Remove const-qualifier in
stream manipulators tests.
* testsuite/20_util/duration/requirements/typedefs_neg4.cc:
New test.

[PATCH] RISC-V: Add vector cost model for Spacemit-X60

This patch implements a dedicated vector cost model for the Spacemit-X60
core. The cost values are derived from micro-benchmarking
data provided by the Camel CDR project.

Following discussions during the RISC-V Patchwork Meeting and based on
the upstream review process, this model applies a clamping
for long-latency instructions. Specifically, all long reservations
are capped at 7 cycles.

As we do not have access to the SPEC CPU benchmark suite, no testing
was performed using that suite. The implementation is based on the
cycle counts reported in the linked data source.

Data source:
https://camel-cdr.github.io/rvv-bench-results/spacemit_x60/index.html

Discussion reference:
https://gcc.gnu.org/pipermail/gcc-patches/2026-February/707625.html

gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_sched_adjust_cost): Enable
TARGET_ADJUST_LMUL_COST for spacemit_x60.
* config/riscv/spacemit-x60.md: Add vector pipeline model
for Spacemit-X60.

Co-authored-by: Dusan Stojkovic <Dusan.Stojkovic@rt-rk.com>
Co-authored-by: Nikola Ratkovac <Nikola.Ratkovac@rt-rk.com>

libstdc++: Restore check for validity of std::get for elements_view.

Resolves LWG3797, "elements_view insufficiently constrained".

When P2165R4 updated __has_tuple_element in C++23 to reuse the __tuple_like
concept, it dropped the requirement that get be valid, assuming that for a
tuple-like type of size N, get<I> on an lvalue is well-formed for any I < N.
This however does not hold for ranges::subrange (tuple-like of size 2) with a
move-only iterator, for which get can only be applied to an rvalue. As a
consequence, the constraints allowed instantiating elements_view for a range of
such subranges, but instantiating its iterator led to a hard error from the
iterator_category computation.

This patch applies the requirements on validity of get also in C++23 and
later standard modes.

libstdc++-v3/ChangeLog:

* include/std/ranges (__detail::__has_tuple_element): Check
if std::get<_Nm>(__t) returns referenceable type also for C++23
and later.
* testsuite/std/ranges/adaptors/elements.cc: Add test covering
vector of ranges::subrange with move-only iterator.

Reviewed-by: Patrick Palka <ppalka@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>

Some TLC to vect_create_new_slp_node APIs

The following properly documents the overloads of vect_create_new_slp_node
and adjusts callers in tree-vect-slp-patterns.cc

* tree-vect-slp.cc (vect_create_new_slp_node): Assert that 'code'
is either ERROR_MARK or VEC_PERM_EXPR. Document properly.
* tree-vect-slp-patterns.cc (vect_build_swap_evenodd_node):
Use lane_permutation_t.
(vect_build_combine_node): Likewise. Pass VEC_PERM_EXPR
as code.

Do not pass vector type to scalar costing

The following drops passing of the vector type to scalar stmt costing
for BB vectorization.

* tree-vect-slp.cc (vect_bb_slp_scalar_cost): Do not pass
vector type to costing.

[RISC-V][V2][PR target/123839] Improve subset of constant permutes for RISC-V

There's a set of constant permutes that are currently implemented
via vslideup+vcompress which requires a mask (and setup of the
mask), but which can be implemented via vslideup+vslidedown.

This has been tested on riscv{32,64}-elf as well as in a BPI-F3 which
is configured to use V by default.

PR target/123839
gcc/
* config/riscv/riscv-v.cc (shuffle_slide_patterns): Use a
vslideup+vslidedown pair rather than a vcompress-based
sequence.

gcc/testsuite
* gcc.target/riscv/rvv/autovec/binop/vcompress-avlprop-1.c: Adjust
expected output.
* gcc.target/riscv/rvv/autovec/pr123839.c: New test.

rs6000: Don't fold stuff for C++ during targetm.resolve_overloaded_builtin [PR124133]

The following testcase ICEs starting with the removal of NON_DEPENDENT_EXPR
in GCC 14.  The problem is that while parsing templates if all the arguments
of the overloaded builtins are non-dependent types,
targetm.resolve_overloaded_builtin can be called on it.  And trying to
fold_convert or fold_build2 subexpressions of such arguments can ICE,
because they can contain various FE specific trees, or standard trees
with NULL_TREE types, or e.g. type mismatches in binary tree operands etc.
All that goes away later when the trees are instantiated and
targetm.resolve_overloaded_builtin is called again, but if it ICEs while
doing that, it won't reach that point.  And the reason to call that
hook in that case if none of the arguments are type dependent is to figure
out if the result type is also non-dependent.

Given the general desire to fold stuff in the FE during parsing as little
as possible and fold it only during cp_fold later on and because from the
target *-c.cc files it isn't easily possible to find out if it is
processing_template_decl or not, the following patch just stops folding
anything in the arguments, calls convert instead of fold_convert and
just build2 instead of fold_build2 etc. when in C++ (and keeps doing what
it did for C).

2026-04-24  Jakub Jelinek  <jakub@redhat.com>

PR target/124133
* config/rs6000/rs6000-c.cc (c_fold_convert): New function.
(c_fold_build2_loc): Likewise.
(fully_fold_convert): Use c_fold_convert instead of fold_convert.
(altivec_build_resolved_builtin): Likewise.  Use c_fold_build2_loc
instead of fold_build2.
(resolve_vec_mul, resolve_vec_adde_sube, resolve_vec_addec_subec):
Use c_fold_build2_loc instead of fold_build2_loc.
(resolve_vec_splats, resolve_vec_extract): Use c_fold_convert instead
of fold_convert.
(resolve_vec_insert): Use c_fold_build2_loc instead of fold_build2.
(altivec_resolve_overloaded_builtin): Use c_fold_convert instead
of fold_convert.

* g++.target/powerpc/pr124133-1.C: New test.
* g++.target/powerpc/pr124133-2.C: New test.

Reviewed-by: Michael Meissner <meissner@linux.ibm.com>

bitintlower: Padding bit fixes, part 5 [PR123635]

The following patch is hopefully the last missing part of the _BitInt
bitint_extended padding bit fixes, this time for
__builtin_{add,sub,mul}_overflow.  For __builtin_{add,sub}_overflow,
the extension in the padding bits of a partial limb (if any) is already
done in some cases during the handling of the limbs (and the last
hunk in gimple-lower-bitint.cc just adds it to one spot where it was
missing).  The extension in the padding bits of a full limb of padding
bits (if any) and for __builtin_mul_overflow partial limb too is done
in finish_arith_overflow.  If both var and obj are NULL, it is
__builtin_*_overflow_p or __builtin_*_overflow that ignores the result
of the operation and only cares about whether it overflowed or not; in
that case there is nothing to extend.

2026-04-24  Jakub Jelinek  <jakub@redhat.com>

PR middle-end/123635
PR tree-optimization/124988
* gimple-lower-bitint.cc (bitint_large_huge::finish_arith_overflow):
Handle bitint_extend.
(bitint_large_huge::lower_addsub_overflow): Fix up comment spelling.
For bitint_extended extend the partial limb if any.

* gcc.dg/torture/bitint-91.c: New test.
* gcc.dg/torture/bitint-92.c: New test.
* gcc.dg/torture/bitint-93.c: New test.
* gcc.dg/torture/bitint-94.c: New test.
* gcc.dg/torture/bitint-95.c: New test.

Reviewed-by: Richard Biener <rguenth@suse.de>

libstdc++: Reject using views::iota on iota_view.

Resolves LWG4096, views::iota(views::iota(0)) should be rejected.

For __e of type _Tp that is a specialization of iota_view, the CTAD-based
expression iota_view(__e) is well formed, and creates a copy of __e.
As iota_view<decay_t<_Tp>> is ill-formed in this case (iota_view is not
weakly_incrementable), using that type explicitly as the return type removes
the overload from overload resolution in this case.

The (now redundant) __detail::__can_iota_view constraint in the template head
is preserved, to provide error messages consistent with adaptors for other
non-incrementable types.

libstdc++-v3/ChangeLog:

* include/std/ranges (_Iota::operator()(_Tp&&)): Replace
auto return type and CTAD with iota_view<decay_t<_Tp>>.
* testsuite/std/ranges/iota/iota_view.cc: Tests if
views::iota(iota_view) is rejected.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>

libstdc++: Constrain views::adjacent(_transform)?<0> to forward_ranges.

This resolves LWG 4098, "views::adjacent<0> should reject non-forward ranges"
which was approved in Sofia 2024.

libstdc++-v3/ChangeLog:

* include/std/ranges (_AdjacentTransform::operator())
(_Adjacent::operator()): Require forward_range for N == 0.
* testsuite/std/ranges/adaptors/adjacent/1.cc: Test if input_ranges
are rejected.
* testsuite/std/ranges/adaptors/adjacent_transform/1.cc: Likewise.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>

libstdc++: Add _GLIBCXX_RESOLVE_LIB_DEFECTS comment for LWG4083.

The LWG4083, "views::as_rvalue should reject non-input ranges" is resolved,
as input_range<_Range> is implied by __detail::__can_as_rvalue_view<_Range>.

libstdc++-v3/ChangeLog:

* include/std/ranges: Add comment for LWG4083.

x86_cse: Use integer load for CONST_VECTOR load

CONST_VECTOR load no larger than integer register

(set (reg:V2QI 294)
(const_vector:V2QI [(const_int 0 [0]) repeated x2]))

can use integer load. Use inner mode as the scalar mode for CONST_VECTOR
load source.

gcc/

PR target/125009
* config/i386/i386-features.cc (ix86_place_single_vector_set):
Support CONST_VECTOR load no larger than integer register.
(ix86_broadcast_inner): Use inner mode as the scalar mode for
CONST_VECTOR load source.
(pass_x86_cse::x86_cse): Generate CONST_VECTOR broadcast source
for CONST_VECTOR load no larger than integer register.

gcc/testsuite/

PR target/125009
* g++.target/i386/pr125009.C: New test.
* gcc.target/i386/pr125009.c: Likewise.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>

libstdc++: Provide iterator type for basic_const_iterator.

This resolves LWG 4253, "basic_const_iterator should provide iterator_type"
which was approved in Kona 2025.

libstdc++-v3/ChangeLog:

* include/bits/stl_iterator.h (basic_const_iterator::iterator_type):
Define.
* testsuite/24_iterators/const_iterator/1.cc: Tests for
iterator_type.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>

tree-optimization/124843 - vectorize inversion of scalar bools

Scalar bool inversion vectorization fails due to bools having
bit precision. The following adds a pattern to rewrite it
to the corresponding BIT_XOR_EXPR operation which we can vectorize
just fine.
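
For a value that is known to be 0 or 1, inversion and XOR with 1 give the same
result, which the C sketch below illustrates. The function names are made up
for this example, and the loop is only the sort of source that might lead to
such an inversion; the pattern itself works on GIMPLE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Inversion as it is usually written.  */
static void
invert_bools (bool *dst, const bool *src, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = !src[i];
}

/* The same operation expressed as XOR with 1.  */
static void
invert_bools_xor (bool *dst, const bool *src, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = src[i] ^ 1;
}

/* Both loops produce the same, inverted, output.  */
static int
check_bool_inversion (void)
{
  const bool src[4] = { false, true, true, false };
  bool a[4], b[4];
  invert_bools (a, src, 4);
  invert_bools_xor (b, src, 4);
  for (size_t i = 0; i < 4; i++)
    assert (a[i] == b[i] && a[i] != src[i]);
  return 1;
}
```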

PR tree-optimization/124843
* tree-vect-patterns.cc (vect_recog_bool_pattern): Recognize
BIT_NOT_EXPR of scalar bools and rewrite with BIT_XOR_EXPR.

* gcc.dg/vect/vect-bool-4.c: New testcase.

Improve points-to after vectorization

The following teaches the vectorizer to create points-to info from
non-pointer accesses like copy_ref_info does.

* tree-vect-data-refs.cc (vect_duplicate_ssa_name_ptr_info):
Create points-to info from decl-based accesses.
(vect_create_addr_base_for_vector_ref): Adjust.
(vect_create_data_ref_ptr): Likewise.
(bump_vector_ptr): Likewise.

SLP pattern TLC

The following removes STMT_VINFO_SLP_VECT_ONLY_PATTERN which only
exists so we can do some cleanup that doesn't seem to be necessary.
We've been cleaning the original to pattern stmt link, but
add_pattern_stmt never sets that up - it only sets up the pattern
to original stmt link, so the SLP pattern is only reachable from
the pattern SLP node's representative.

* tree-vectorizer.h (_stmt_vec_info::slp_vect_pattern_only_p):
Remove.
(STMT_VINFO_SLP_VECT_ONLY_PATTERN): Likewise.
* tree-vectorizer.cc (vec_info::new_stmt_vec_info): Do not
initialize STMT_VINFO_SLP_VECT_ONLY_PATTERN.
* tree-vect-loop.cc (vect_analyze_loop_2): Nothing to do
for SLP pattern stmts that are not reachable from scalar
stmts anyway. Remove dead code.
* tree-vect-slp-patterns.cc (complex_pattern::build): Do not
set STMT_VINFO_SLP_VECT_ONLY_PATTERN.
(addsub_pattern::build): Likewise.
* tree-vect-slp.cc (vect_free_slp_tree): Remove dead code.

SLP pattern TLC

The following removes setting of STMT_VINFO_REDUC_DEF on pattern
stmts - those are only ever checked on original scalar stmts now.
But for that to work we have to make the related stmt of the new
SLP pattern stmts the original stmt of a possible pattern.

The only valid SLP_TREE_CODE are VEC_PERM_EXPR and ERROR_MARK,
do not set it to CALL_EXPR.

* tree-vect-slp-patterns.cc (complex_pattern::build):
Add pattern for the original stmt, do not set
STMT_VINFO_REDUC_DEF.
(addsub_pattern::build): Likewise.

libstdc++: Update tzdata to 2026a

Import the new 2026a tzdata.zi file and new leapseconds expiry date.

libstdc++-v3/ChangeLog:

* include/std/chrono (chrono::__detail::__get_leap_second_info):
Update expiry date for leap seconds list.
* src/c++20/tzdb.cc (tzdb_list::_Node::_S_read_leap_seconds):
Likewise.
* src/c++20/tzdata.zi: Import new file from 2026a release.

Do not use DEFAULT_CFLAGS in ieee.exp [PR125003]

ieee.exp tries to inherit flags from DEFAULT_CFLAGS, which is sometimes set and sometimes unset.
When it is set, it is set to "-ansi -pedantic-errors", which causes spurious failures.

Introduce a new variable, DEFAULT_IEEE_CFLAGS, which is independent of
DEFAULT_CFLAGS, but which boards may still override if needed.
It includes the default of "-w -fno-inline" as it was in the old style testcases.

The target-specific flags are not stored in DEFAULT_IEEE_CFLAGS, although they
are still needed in the default flags passed to the compiler. They can't be stored
there because on x86, depending on whether -m32 or -m64 comes first, -ffloat-store
might or might not end up included for -m64, and we don't want it there for -m64.

PR testsuite/125003

gcc/testsuite/ChangeLog:

* gcc.c-torture/execute/ieee/ieee.exp: Rewrite the default flags
and set DEFAULT_IEEE_CFLAGS if not already set.

Co-authored-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>

match.pd: x != -CST ? x + CST : 0 -> x + CST [PR122996]

This patch simplifies expressions of the form x != CST1 ? x + CST2 : 0
into x + CST2 when CST1 == -CST2. This comes up, for example, when
dealing with 'rtrim'-style operations.
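
The reasoning is that when x == -CST2 the sum x + CST2 is already 0, so both
arms of the conditional agree. A small C sketch, with a hypothetical constant
and function names:

```c
#include <assert.h>

#define CST 5   /* Hypothetical constant, for illustration only.  */

/* Original form: x != -CST ? x + CST : 0.  */
static int
cond_add (int x)
{
  return x != -CST ? x + CST : 0;
}

/* Simplified form: the excluded case x == -CST yields 0 anyway.  */
static int
plain_add (int x)
{
  return x + CST;
}

/* Check a range of inputs around the special value.  */
static int
check_cond_add (void)
{
  for (int x = -1000; x <= 1000; x++)
    assert (cond_add (x) == plain_add (x));
  return 1;
}
```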

Bootstrapped and regression tested on x86_64-pc-linux-gnu.

PR tree-optimization/122996

gcc/ChangeLog:

* match.pd (x != CST1 ? x + CST2 : 0 -> x + CST2): New pattern.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/pr122996.c: New test.

Signed-off-by: Netanel Komm <netanelkomm@gmail.com>

tree-optimization/124947 - IVOPTs emits uninit use

The following prevents IVOPTs from rewriting a compare using an IV
that involves undefined SSA vars.

PR tree-optimization/124947
* tree-ssa-loop-ivopts.cc (may_eliminate_iv): Do not use
a candidate that involves undefs.

* gcc.dg/pr124947.c: New testcase.

tree-optimization/124946 - signed overflow with emulated mixed dot-prod

When biasing the unsigned vector operand we have to perform that in
an unsigned type to avoid possible signed overflow.

PR tree-optimization/124946
* tree-vect-loop.cc (vect_emulate_mixed_dot_prod): Perform the
constant biasing in an unsigned type.

Daily bump.

c++/modules: stream PTRMEM_CST_LOCATION and TRAIT_EXPR_LOCATION

gcc/cp/ChangeLog:

* module.cc (trees_out::core_vals) <case PTRMEM_CST>:
Stream PTRMEM_CST_LOCATION.
<case TRAIT_EXPR>: Stream TRAIT_EXPR_LOCATION.
(trees_in::core_vals): As in trees_out::core_vals.

Reviewed-by: Jason Merrill <jason@redhat.com>

c++/modules: PTRMEM_CST member considered unused [PR124981]

Here in _b.C the needed specialization A<B, &B::g> has already been
instantiated in module M, so we stream it in rather than instantiate it.
We then proceed to instantiate A<B, &B::g>::f() whose definition invokes
the pointer-to-member &B::g but it turns out that nothing has marked
B::g as used in this TU so we neglect to emit it and linking fails.

We do mark B::g as used during instantiation of A<B, &B::g> via
mark_template_arguments_used, but this instantiation happens in module M
not the importer, and TREE_USED is deliberately not streamed.

This patch fixes this by setting TREE_USED on PTRMEM_CST_MEMBER during
stream-in, via the RTU macro, which seems sufficient to ensure B::g gets
emitted. This macro is already used for streaming in subexpressions and
BASELINK_FUNCTIONS so using it for PTRMEM_CST_MEMBER doesn't seem too
out of place.

PR c++/124981

gcc/cp/ChangeLog:

* module.cc (trees_in::core_vals) <case PTRMEM_CST>: Use RTU
instead of RT to stream PTRMEM_CST_MEMBER.

gcc/testsuite/ChangeLog:

* g++.dg/modules/ptrmem-1_a.C: New test.
* g++.dg/modules/ptrmem-1_b.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>

c++: introduce lookup_annotation

This patch introduces a new helper for looking up annotations.

gcc/cp/ChangeLog:

* cp-tree.h (lookup_annotation): Declare.
* decl.cc (grokfndecl): Use lookup_annotation.
(grokdeclarator): Likewise.
* name-lookup.cc (push_local_extern_decl_alias): Likewise.
* parser.cc (cp_parser_decomposition_declaration): Likewise.
* reflect.cc (eval_annotations_of): Likewise.
* tree.cc (lookup_annotation): New.

Reviewed-by: Jason Merrill <jason@redhat.com>

c++: add lk_module

During Reflection review it came up that we don't have lk_module.
Instead, we're checking lk_external && DECL_MODULE_ATTACH_P &&
!DECL_MODULE_EXPORT_P. This patch adds lk_module which allows further
cleanups.

I'm not sure the cp_parser_template_argument change is required.

gcc/cp/ChangeLog:

* cp-tree.h (enum linkage_kind): Add lk_module.
* module.cc (check_module_decl_linkage): Use DECL_EXTERNAL_LINKAGE_P.
* name-lookup.cc (check_can_export_using_decl): Don't check for
attachment.
* parser.cc (cp_parser_template_argument): Check that linkage isn't
lk_module.
* reflect.cc (eval_has_module_linkage): Check lk_module.
(eval_has_external_linkage): Use DECL_EXTERNAL_LINKAGE_P.
* tree.cc (decl_linkage): Return lk_module if appropriate.

Reviewed-by: Jason Merrill <jason@redhat.com>

c++: CWG 2229, cv-qualified unnamed bit-fields [PR123935]

This implements [class.bit]/2: An unnamed bit-field shall not be
declared with a cv-qualified type. This was clarified in DR 2229.

DR 2229
PR c++/123935

gcc/cp/ChangeLog:

* decl2.cc (grokbitfield): Add pedwarn for cv-qualified unnamed
bit-fields.

gcc/testsuite/ChangeLog:

* g++.dg/DRs/dr2229.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>

c++/reflection: erroneous access check on dependent splice [PR124989]

When processing &[:R:] in cp_parser_splice_expression, we call
build_offset_ref with access checking turned off via push_ and
pop_deferring_access_checks, but the same pair of calls is not
present around the call to build_offset_ref in tsubst_splice_expr
and so the following test fails to compile due to access control
checking failures.

PR c++/124989

gcc/cp/ChangeLog:

* pt.cc (tsubst_splice_expr): Turn off access checking for the
build_offset_ref call.

gcc/testsuite/ChangeLog:

* g++.dg/reflect/member24.C: New test.

Reviewed-by: Patrick Palka <ppalka@redhat.com>

[PATCH] RISC-V: Omit ghost from the pipeline-checker output

I noticed some pipelines erroneously provide reservations for ghost,
likely because the pipeline-checker will report it as missing.

This patch filters out all default reservations (only ghost now) from
riscv.md.

gcc/ChangeLog:

* config/riscv/pipeline-checker: Filter tuneless insn types.

Signed-off-by: Michiel Derhaeg <michiel@synopsys.com>

c++: revert fix for PR41127 [PR118374]

Previously, we did not parse definitely in cp_parser_enum_specifier
after seeing CPP_COLON, since we allowed for bitfield widths to follow
"enum identifier :" in member-declarations. However, ISO says that in
such a situation, the colon should be parsed as an enum-base
([dcl.enum]/a), which means bitfield widths are not allowed. This
patch reverts the changes which allowed for bitfield widths, since
parsing definitely improves diagnostics for errant underlying types.

This reverts SVN r151246.

PR c++/118374
PR c++/41127

gcc/cp/ChangeLog:

* parser.cc (cp_parser_enum_specifier): Parse definitely
before cp_parser_type_specifier_seq.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/enum1.C: Update test.
* g++.dg/parse/enum5.C: Expect error with bitfield width
and enum-key in member.
* g++.dg/cpp0x/enum45.C: New test.

c++: Add support for [[gnu::trivial_abi]] attribute [PR107187]

Implement the trivial_abi attribute for GCC to fix ABI compatibility
issues with Clang. Currently, GCC silently ignores this attribute,
causing call convention mismatches when linking GCC-compiled code with
Clang-compiled object files that use trivial_abi types.

This attribute allows types to be treated as trivial for ABI purposes,
so that they are passed in registers instead of by invisible reference.
The attribute is supported with `__attribute__((trivial_abi))` and
`[[clang::trivial_abi]]` spellings.
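
A hedged sketch of the kind of type this affects. `Handle`, `consume`, `run_demo`, `live_handles` and the `SKETCH_TRIVIAL_ABI` macro are hypothetical names. The macro applies the attribute only where the compiler reports support, so the code still builds elsewhere without the ABI change:

```cpp
#include <cassert>
#include <utility>

// Apply the attribute only if the compiler says it knows it.
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(clang::trivial_abi)
#    define SKETCH_TRIVIAL_ABI [[clang::trivial_abi]]
#  endif
#endif
#ifndef SKETCH_TRIVIAL_ABI
#  define SKETCH_TRIVIAL_ABI
#endif

int live_handles = 0;  // number of Handle objects currently alive

// Non-trivial move constructor and destructor: without the attribute
// such a type is normally passed by invisible reference.
struct SKETCH_TRIVIAL_ABI Handle {
  int value;
  explicit Handle(int v) : value(v) { ++live_handles; }
  Handle(Handle&& other) noexcept : value(other.value) { ++live_handles; }
  Handle(const Handle&) = delete;
  ~Handle() { --live_handles; }
};

// By-value parameter: this is where the calling convention differs.
int consume(Handle h) { return h.value; }

int run_demo() {
  Handle h(42);
  return consume(std::move(h));
}
```

The observable behaviour is the same either way (every `Handle` is destroyed exactly once). What changes is how the argument travels, which is why a caller and callee that disagree on the attribute cannot be linked together safely.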

PR c++/107187

gcc/cp/ChangeLog:

* cp-tree.h (has_trivial_abi_attribute): New function.
(validate_trivial_abi_attribute): Declare.
(classtype_has_non_deleted_copy_or_move_ctor): Declare.
(cxx_clang_attribute_table): Declare.
* tree.cc (handle_trivial_abi_attribute): New function.
(handle_gnu_trivial_abi_attribute): New function.
(classtype_has_trivial_abi): New function.
(validate_trivial_abi_attribute): New function.
(cxx_gnu_attributes): Add trivial_abi entry.
(cxx_clang_attributes): New table for [[clang::trivial_abi]].
* class.cc (finish_struct_bits): Skip BLKmode for types with
trivial_abi attribute.
(classtype_has_non_deleted_copy_or_move_ctor): New function.
(finish_struct_1): Call validate_trivial_abi_attribute before
finish_struct_bits.
* cp-objcp-common.h (cp_objcp_attribute_table): Register
cxx_clang_attribute_table.
* decl.cc (store_parm_decls): Register cleanups for trivial_abi
parameters.

gcc/ChangeLog:

* doc/extend.texi: Document __attribute__((trivial_abi)).

Signed-off-by: Yuxuan Chen <i@yuxuan.ch>
Reviewed-by: Jason Merrill <jason@redhat.com>

c++: fix typo in consteval, array, modules [PR124973]

Argh, I must have typoed when I realized that we wanted to check
ff_genericize here rather than !ff_only_non_odr. And didn't notice the
problem because I also forgot the -O in the testcase.

PR c++/124973

gcc/cp/ChangeLog:

* cp-gimplify.cc (cp_fold_r): Fix typo.

gcc/testsuite/ChangeLog:

* g++.dg/modules/consteval-1_b.C: Add -O.

RISC-V: Add SUBREG_PROMOTED annotation to min/max si3 expansion

The <bitmanip_optab>si3 expansion for smin/smax/umin/umax sign-extends
both inputs and then performs the DImode min/max, which returns one of
its inputs unchanged. The result is therefore always sign-extended,
but the missing SUBREG_PROMOTED annotation on the lowpart caused GCC
to emit a redundant sext.w.

Add the SUBREG_PROMOTED_VAR_P / SUBREG_PROMOTED_SET(SRP_SIGNED)
annotation, matching rotrsi3, rotlsi3, and other si3 expansions.
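
As an illustration only, the reasoning above can be modelled in plain C++: sign-extend both 32-bit inputs, take the 64-bit max, and observe that the result is still a sign-extended 32-bit value. The function names are hypothetical and this models the argument, not the RTL:

```cpp
#include <cassert>
#include <cstdint>

// Source-level shape that can map to the smax si3 expansion.
std::int32_t smax32(std::int32_t a, std::int32_t b) { return a > b ? a : b; }

// Model of the expansion: widen both inputs, then do the DImode max.
std::int64_t smax_via_di(std::int32_t a, std::int32_t b) {
  std::int64_t wa = a;  // sign-extends
  std::int64_t wb = b;  // sign-extends
  return wa > wb ? wa : wb;
}

// True if v already equals the sign extension of its low 32 bits,
// i.e. a following sext.w would be a no-op.
bool is_sign_extended(std::int64_t v) {
  return v >= INT32_MIN && v <= INT32_MAX;
}
```

Because the max returns one of its already-extended inputs unchanged, `is_sign_extended(smax_via_di(a, b))` holds for every pair of inputs.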

gcc/ChangeLog:

* config/riscv/bitmanip.md (<bitmanip_optab>si3): Add
SUBREG_PROMOTED annotation to lowpart result.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zbb-min-max-05.c: New test.
* gcc.target/riscv/zbb-min-max-06.c: New test.
* gcc.target/riscv/zbb-min-max-07-run.c: New test.

[PR target/124029][RISC-V] Adjust cost of comparisons

> Given this is a relatively straightforward define_split, it is likely a good
> case for Austin to chase down.

Actually it is easier than that.
The middle-end has a costing mechanism for this already:
```
;; cmp: le, old cst: (const_int 268435455 [0xfffffff]) new cst: (const_int 268435456 [0x10000000])
;; old cst cost: 4, new cst cost: 4
```

You need to implement a COMPARE cost part of the riscv_rtx_costs like it is
done for aarch64_rtx_costs.

It won't be 100% exact because RISC-V has no COMPARE instruction, but at
least the costs can then reflect which of the two constants is cheaper to
generate.
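
For illustration, the two forms in the dump above are equivalent comparisons against neighbouring constants. The function names are hypothetical. The remark about materialization cost in the comments is an assumption about typical RISC-V code generation, not something the message states:

```cpp
#include <cassert>

// "le" against the old constant 0xfffffff.
bool le_old(long x) { return x <= 268435455L; }

// "lt" against the new constant 0x10000000. Assumption: this value can
// usually be loaded with a single lui, while 0xfffffff needs lui + addi.
bool lt_new(long x) { return x < 268435456L; }
```

Both functions return the same result for every `x`, so the choice between them is purely a question of which constant is cheaper, which is what the COMPARE costing is meant to capture.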

PR target/124029
gcc/
* config/riscv/riscv.cc (riscv_rtx_costs): Improve costing of COMPARE
nodes.

gcc/testsuite/
* gcc.target/riscv/pr124029.c: New test.
* gcc.target/riscv/rvv/autovec/struct/struct_vect-2.c: Adjust
expected output.

Co-authored-by: Jeff Law <jeffrey.law@oss.qualcomm.com>

c++/reflection: reflect on dependent class template [PR124926]

Here we issue a bogus error for

  ^^Cls<T>::template Inner

where Inner turns out to be a class type, but we created a SCOPE_REF
because we can't know in advance what it will substitute into, and

  ^^typename Cls<T>::template Inner

is invalid.  The typename can only be used in

  ^^typename Cls<T>::template Inner<int>

We're taking a reflection so both types and non-types are valid, so
I think we shouldn't give the error for ^^, and take the reflection
of the TEMPLATE_DECL.

PR c++/124926

gcc/cp/ChangeLog:

* pt.cc (tsubst_qualified_id): Rename name_lookup_p parameter to
reflecting_p.  Check !reflecting_p instead of name_lookup_p.  Do
not give the "instantiation yields a type" error when reflecting_p
is true.
(tsubst_expr) <case REFLECT_EXPR>: Adjust the call to
tsubst_qualified_id.

gcc/testsuite/ChangeLog:

* g++.dg/reflect/dep15.C: New test.

Reviewed-by: Patrick Palka <ppalka@redhat.com>

testsuite: Check configured assembler in gcc.misc-tests/options.exp

When I recently dropped --with-gnu-as from my builds which had become
unnecessary, several tests started to FAIL, e.g.

FAIL: compiler driver --coverage option(s) (assembler options)

and all other "(assembler options)" tests in gcc.misc-tests/options.exp.

This happens because my builds use something like
--with-as=/vol/gcc/bin/gas-2.46 instead of relying on a random bundled
version of gas.  Therefore the configured assembler name doesn't end in
"as".

The assembler options check in options.exp (check_for_all_options) looks
for

" *as(\\.exe)? .*$as_pattern"

in the gcc -v output, with an empty as_pattern.

While gcc was configured with --with-gnu-as, the gcc -v output starts
with

Configured with: ... --with-gnu-as ...

later followed by

/vol/gcc/bin/gas-2.46 -v -V -Qy -s --32 -o /var/tmp//ccr7.tAa.o /var/tmp//ccRnpmxb.s

Since Tcl does multiline matching by default, the first line did match
the pattern although the actual assembler invocation does not.

To avoid this, this patch does two things:

* Use newline-sensitive matching.

* Check for the actual configured assembler instead of assuming its name
  ends in as.

Tested on i386-pc-solaris2.11 configured with --with-as as above and
x86_64-pc-linux-gnu without --with-as, thus picking up as from PATH.

2026-04-14  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

gcc/testsuite:
* gcc.misc-tests/options.exp (check_for_all_options): Check for
configured assembler.

libstdc++: Include bool conversion in noexcept specification of indirect::operator==.

This expands the resolution of LWG4325 to heterogeneous comparison
with T per standard draft (see corresponding commit [1]).

[1] https://github.com/cplusplus/draft/pull/8935/changes/833d635d648cdbd06c9935acccf925ee0aea3c79
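
A minimal sketch of why the conversion matters, using a hypothetical wrapper `Box` in place of `std::indirect` (none of the names below come from libstdc++). The element comparison itself is noexcept, but it returns a type whose conversion to bool may throw:

```cpp
#include <cassert>
#include <utility>

// Result type whose conversion to bool is potentially throwing.
struct MaybeThrowingBool {
  bool b;
  explicit operator bool() const noexcept(false) { return b; }
};

struct Widget {
  int id;
  friend MaybeThrowingBool operator==(const Widget& x, const Widget& y) noexcept {
    return {x.id == y.id};
  }
};

// Hypothetical stand-in for std::indirect.
template <typename T>
class Box {
  T value_;
public:
  explicit Box(T v) : value_(std::move(v)) {}
  const T& operator*() const noexcept { return value_; }

  // The noexcept specification covers the comparison AND the conversion
  // of its result to bool, matching what the body actually evaluates.
  template <typename U>
  friend bool operator==(const Box& lhs, const U& rhs)
      noexcept(noexcept(static_cast<bool>(*lhs == rhs))) {
    return static_cast<bool>(*lhs == rhs);
  }
};

static_assert(noexcept(std::declval<const Box<int>&>() == 1),
              "int comparison cannot throw");
static_assert(!noexcept(std::declval<const Box<Widget>&>() ==
                        std::declval<const Widget&>()),
              "bool conversion may throw, so operator== is not noexcept");

bool same_id(int a, int b) { return Box<Widget>(Widget{a}) == Widget{b}; }
```

Had the specification been only `noexcept(noexcept(*lhs == rhs))`, the `Widget` case would claim noexcept while the body could still throw during the conversion.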

libstdc++-v3/ChangeLog:

* include/bits/indirect.h (indirect::operator==): Adjust
noexcept specification.
* testsuite/std/memory/indirect/relops.cc: New test for noexcept
specification.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>

libstdc++: Implement __integral_constant_like in terms of __constexpr_wrapper_like.

This implements LWG4486, "integral-constant-like and constexpr-wrapper-like
exposition-only concept duplication".

libstdc++-v3/ChangeLog:

* include/bits/simd_details.h (simd::__constexpr_wrapper_like):
Move to...
* include/std/concepts (std::__constexpr_wrapper_like): Moved
from bits/simd_details.h.
* include/std/span (std::__integral_constant_like): Define in
terms of __constexpr_wrapper_like.
* testsuite/std/simd/traits_impl.cc: Added using declaration
for std::__constexpr_wrapper_like.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>

i386: Bump STACK_CHECK_PROTECT for 64-bit Windows

This is required for -fstack-check to be able to properly recover from a
stack overflow condition on (some configurations of) Windows Server 2025.

gcc/
* config/i386/cygming.h (STACK_CHECK_PROTECT): Define.

libgcc: Honor LDFLAGS_FOR_TARGET for shared libgcc on Windows

Unlike for other targets, LDFLAGS_FOR_TARGET is not honored on Windows when
the shared libgcc is built.

libgcc/
* config/i386/t-slibgcc-cygming (SHLIB_LINK): Add $(LDFLAGS).

x86: Don't check SSE2 in x86_cse::gate

commit 5cf1b9a03ec5b617af8c50c1e9c0d223083fd7f2
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Fri Aug 19 11:50:41 2022 -0700

    x86-64: Remove redundant TLS calls

changed the x86_cse pass to also remove redundant TLS calls.  Remove the
SSE2 check in x86_cse::gate so that redundant TLS calls are removed when
SSE is disabled.
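
For orientation only, a hypothetical source shape with repeated thread-local accesses. Whether any TLS call is emitted at all depends on the TLS model and options (for example position-independent code); that is an assumption here, not something taken from the test in this commit:

```cpp
#include <cassert>

thread_local int tls_counter = 0;  // hypothetical thread-local variable

// Three accesses to the same thread-local in one function. Where the
// TLS model needs a call to find the variable's address, one lookup
// can serve all three accesses.
int bump_twice() {
  ++tls_counter;
  ++tls_counter;
  return tls_counter;
}
```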

gcc/

PR target/124994
* config/i386/i386-features.cc (x86_cse::gate): Drop TARGET_SSE2.

gcc/testsuite/

PR target/124994
* gcc.target/i386/pr124994.c: New test.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>

[PATCH] RISC-V: Remove redundant CALL_P check

Ready for trunk (or gcc-17 since it's stage4 for gcc-16 now)?

------------------------------------------------------------------
From: 钟居哲 <juzhe.zhong@rivai.ai>
Send Time: Thu, Jan 8, 2026, 10:51
To: Bohan Lei <garthlei@linux.alibaba.com>
CC: "gcc-patches" <gcc-patches@gcc.gnu.org>; "pan2.li" <pan2.li@intel.com>; Bohan Lei <garthlei@linux.alibaba.com>
Subject: Re: [PATCH] RISC-V: Remove redundant CALL_P check

LGTM

From: "Bohan Lei" <garthlei@linux.alibaba.com>
Date: Thu, Jan 8, 2026, 10:49
Subject: [PATCH] RISC-V: Remove redundant CALL_P check
To: <gcc-patches@gcc.gnu.org>
Cc: <juzhe.zhong@rivai.ai>, <pan2.li@intel.com>, "Bohan Lei" <garthlei@linux.alibaba.com>

Since we are using `reg_set_p` to check VXRM definition, the `CALL_P`
check has become redundant.  VXRM is marked as call-used in riscv.h, and
`reg_set_p` in `vxrm_unknown_p` should always return true when a call is
encountered.

gcc/ChangeLog:

* config/riscv/riscv.cc (vxrm_unknown_p): Remove `CALL_P` check.

Daily bump.

c++: consteval, array, modules [PR124973]

Here the consteval holder constructor calls the defaulted element_array
constructor, which uses a VEC_INIT_EXPR to call the defaulted element
constructor.

When we read in the holder constructor, we need to clone it, so we call
finish_function, which calls cp_fold_function_non_odr_use, which tries to
constant-evaluate the call to the element_array constructor.  This
eventually wants to evaluate the VEC_INIT_EXPR, which wants to call the
element constructor (complete object clone).  But we haven't cloned the
element constructor yet, so mark_used tries to synthesize it again, which
breaks because the constructor is already defined, just not cloned yet.

We should have cloned the element constructor first, but we didn't know that
the element_array constructor depends on it because VEC_INIT_EXPR doesn't
express that; build_vec_init_expr calls build_vec_init_elt and then throws
it away.  Perhaps we want to add the elt_init as an additional operand that
is used to express dependencies, but ignored in expansion?

It would also be nice not to repeat all the finish_function passes when
loading a function from a module; we already did
cp_fold_function_non_odr_use and such for this function before writing out
the module; doing it again is a waste of time.

But also, trying to constant-evaluate the element_array constructor is wrong
for _non_odr_use, it shouldn't be doing any optimization folding.

Furthermore, since the TARGET_EXPR is wrapped in an INIT_EXPR, we should
never have tried to fold it by itself, before cp_genericize_init_expr has a
chance to elide it.  So let's only do that folding when ff_genericize, like
the other TARGET_EXPR transformations.  This is a much simpler fix for this
testcase.

While we're at it, let's also suppress the other flag_no_inline-conditional
folding when ff_only_non_odr.

PR c++/124973
PR c++/120502
PR c++/120005

gcc/cp/ChangeLog:

* cp-gimplify.cc (cp_fold_r) <case TARGET_EXPR>: Only
do optimization folding when ff_genericize.
(cp_fold) <case CALL_EXPR>: Don't do
optimization folding when ff_only_non_odr.

gcc/testsuite/ChangeLog:

* g++.dg/modules/consteval-1_a.C: New test.
* g++.dg/modules/consteval-1_b.C: New test.

ipa: fix 'writing' typo in comment

gcc/ChangeLog:

* ipa-prop.cc (param_type_may_change_p): Fix comment typo.

lto: fix spelling in comment

gcc/lto/ChangeLog:

* lto-symtab.cc (lto_varpool_replace_node): Fix spelling of 'warning'.

contrib: header-tools: fix spelling

contrib/header-tools/ChangeLog:

* show-headers: Fix spelling of 'additional'.

i386: fix typo in comment

Followup to r16-2675-gdf82965344f641.

gcc/ChangeLog:

* config/i386/i386.cc (ix86_get_callcvt): Say 'regparm' in comment,
not 'regparam'.

a68: Fix make install-html

Currently, trying to do "make install-html" after a build results in an
error:

  $ make install-html
    ⋮
  Doing install-html in gcc
  make[2]: Entering directory '/tmp/build/gcc'
  make[2]: *** No rule to make target '/tmp/build/gcc/HTML/gcc-16.0.1/ga68-coding-guidelines.info', needed by 'algol68.install-html'.  Stop.
  make[2]: Leaving directory '/tmp/build/gcc'
  make[1]: *** [Makefile:5054: install-html-gcc] Error 1
  make[1]: Leaving directory '/tmp/build'
  make: *** [Makefile:1929: do-install-html] Error 2

The problem is a typo in a dependency of the algol68.install-html rule.
Fix it by removing the ".info" suffix.

With this change, "make install-html" succeeds but ga68-internals and
ga68-coding-guidelines don't get installed.  Assuming this is
unintentional, extend the for loop to also install them.

gcc/algol68/ChangeLog:
* Make-lang.in (algol68.install-html): Fix
ga68-coding-guidelines dependency.  Install all dependencies.

Regenerate gcc.pot

* gcc.pot: Regenerate.

cfghooks: Pass data to callback function of make_forwarder_block

This is a cleanup that is long overdue and should have been done
years ago. Instead of setting some global/static variables for the
callback function to check, we pass the data down to the callback
function. This reduces the number of global variables (which should help
with the Parallel GCC project). Also, since mfb_keep_just was exported outside
of cfgloopmanip.cc (it was used in tree-ssa-threadupdate.cc), it reduces
what is shared between files.
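
The pattern can be sketched as follows. All types and names below (`Edge`, `EdgePredicate`, `count_redirected`, `keep_just`, `demo`) are hypothetical stand-ins and do not reproduce GCC's real signatures:

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an edge; not GCC's edge type.
struct Edge { int id; };

// The predicate now receives caller-supplied data instead of reading
// a file-scope static.
using EdgePredicate = bool (*)(const Edge&, void* data);

// Counts the edges the predicate selects, forwarding `data` unchanged.
int count_redirected(const std::vector<Edge>& edges, EdgePredicate pred,
                     void* data) {
  int n = 0;
  for (const Edge& e : edges)
    if (pred(e, data))
      ++n;
  return n;
}

// Callback in the spirit of mfb_keep_just: select every edge except
// the one passed in through `data`.
bool keep_just(const Edge& e, void* data) {
  const Edge* kept = static_cast<const Edge*>(data);
  return e.id != kept->id;
}

int demo() {
  std::vector<Edge> edges{{1}, {2}, {3}};
  Edge kept{2};  // a local variable replaces the old static
  return count_redirected(edges, keep_just, &kept);
}
```

With the state held in a local, two callers can run the same callback with different data without interfering with each other.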

I found this useful when working on PR 123113 as I needed a new callback
function.

Bootstrapped and tested on x86_64-linux-gnu.

gcc/ChangeLog:

* cfghooks.cc (make_forwarder_block): New data argument,
pass it down to redirect_edge_p.
* cfghooks.h (make_forwarder_block): Add void* argument.
* cfgloop.cc (mfb_reis_set): Remove.
(mfb_redirect_edges_in_set): Add new data argument.
Use it instead of mfb_reis_set.
(form_subloop): Create a local variable instead of
mfb_reis_set. Update call to make_forwarder_block.
(merge_latch_edges): Likewise.
* cfgloopmanip.cc (mfb_kj_edge): Remove.
(mfb_keep_just): Add new data argument.
Use it instead of mfb_kj_edge.
(create_preheader): Use local variable instead of
mfb_kj_edge. Update call to make_forwarder_block.
* cfgloopmanip.h (mfb_keep_just): Add void* argument.
* tree-cfgcleanup.cc (mfb_keep_latches): Add unused void* argument.
(cleanup_tree_cfg_noloop): Update call to make_forwarder_block.
* tree-ssa-threadupdate.cc
(fwd_jt_path_registry::thread_through_loop_header): Use local
variable instead of mfb_kj_edge. Update call to make_forwarder_block.

Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>

cfghooks: Remove new_bb_cbk callback from make_forwarder_block

This callback seems to be unused since it was allowed to be NULL
in r0-78960-g89f8f30f356532 (19 years ago), so let's just remove it.
This is also the first step in changing the callback to make_forwarder_block.

Bootstrapped and tested on x86_64-linux-gnu.

gcc/ChangeLog:

* cfghooks.cc (make_forwarder_block): Remove new_bb_cbk argument.
* cfghooks.h (make_forwarder_block): Remove last argument.
* cfgloop.cc (form_subloop): Update call to make_forwarder_block.
(merge_latch_edges): Likewise.
* cfgloopmanip.cc (create_preheader): Likewise.
* tree-cfgcleanup.cc (cleanup_tree_cfg_noloop): Likewise.
* tree-ssa-threadupdate.cc
(fwd_jt_path_registry::thread_through_loop_header): Likewise.

Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>

doc: grammar fix for -ffunction-cse

gcc/ChangeLog:

* doc/invoke.texi (-ffunction-cse): Add missing full stop.