git.ipfire.org Git - thirdparty/gcc.git/log
2 hours agoDisallow scan-store vectorization in epilogues master trunk
Richard Biener [Wed, 30 Jul 2025 13:05:19 +0000 (15:05 +0200)] 
Disallow scan-store vectorization in epilogues

The following disallows vectorizing epilogues containing scan-stores.
Since code generation works by walking gimple stmts it is not ready
for this when cleaning up epilogue vectorization.  I believe
scan-store vectorization needs most of the work done during SLP
discovery to reflect the data flow.

* tree-vect-stmts.cc (check_scan_store): Remove redundant
slp_node check.  Disallow epilogue vectorization.

2 hours agoAvoid passing vectype != NULL when costing scalar IL
Richard Biener [Tue, 29 Jul 2025 07:20:42 +0000 (09:20 +0200)] 
Avoid passing vectype != NULL when costing scalar IL

The following makes sure to not leak a set vectype on a stmt when
doing scalar IL costing as this can confuse vector cost models
which do not look at m_costing_for_scalar most of the time.

* tree-vectorizer.h (vector_costs::costing_for_scalar): New
accessor.
(add_stmt_cost): For scalar costing force vectype to NULL.
Verify we do not pass in a SLP node.

5 hours agoRISC-V: Adding H to the canonical order [PR121312]
Kito Cheng [Thu, 31 Jul 2025 03:02:45 +0000 (11:02 +0800)] 
RISC-V: Adding H to the canonical order [PR121312]

We added H into canonical order before, but forgot to add it to
arch-canonicalize as well...

gcc/ChangeLog:

PR target/121312
* config/riscv/arch-canonicalize: Add H extension to the
canonical order.

8 hours agoDaily bump.
GCC Administrator [Thu, 31 Jul 2025 00:21:08 +0000 (00:21 +0000)] 
Daily bump.

10 hours ago[sanitizer_common] Remove reference to obsolete termio ioctls (#138822)
Sam James [Fri, 25 Jul 2025 18:45:18 +0000 (19:45 +0100)] 
[sanitizer_common] Remove reference to obsolete termio ioctls (#138822)

Cherry picked from LLVM commit c99b1bcd505064f2e086e6b1034ce0b0c91ea5b9.

The termio ioctls are no longer used after commit 59978b21ad9c
("[sanitizer_common] Remove interceptors for deprecated struct termio
(#137403)"), remove them.  Fixes this build error:

../../../../libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.cpp:765:27: error: invalid application of ‘sizeof’ to incomplete type ‘__sanitizer::termio’
  765 |   unsigned IOCTL_TCGETA = TCGETA;
      |                           ^~~~~~
../../../../libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.cpp:769:27: error: invalid application of ‘sizeof’ to incomplete type ‘__sanitizer::termio’
  769 |   unsigned IOCTL_TCSETA = TCSETA;
      |                           ^~~~~~
../../../../libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.cpp:770:28: error: invalid application of ‘sizeof’ to incomplete type ‘__sanitizer::termio’
  770 |   unsigned IOCTL_TCSETAF = TCSETAF;
      |                            ^~~~~~~
../../../../libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.cpp:771:28: error: invalid application of ‘sizeof’ to incomplete type ‘__sanitizer::termio’
  771 |   unsigned IOCTL_TCSETAW = TCSETAW;
      |                            ^~~~~~~

10 hours agoUpdate cpplib sr.po
Joseph Myers [Wed, 30 Jul 2025 22:12:41 +0000 (22:12 +0000)] 
Update cpplib sr.po

* sr.po: Update.

10 hours agoc++: Don't assume trait funcs return error_mark_node when tf_error is passed [PR121291]
Nathaniel Shead [Tue, 29 Jul 2025 12:20:32 +0000 (22:20 +1000)] 
c++: Don't assume trait funcs return error_mark_node when tf_error is passed [PR121291]

For the sake of determining if there are other errors in user code to
report early, many trait functions don't always return error_mark_node
if not called in a SFINAE context (i.e., tf_error is set).  This patch
removes some assumptions on this behaviour I'd made when improving
diagnostics of builtin traits.

PR c++/121291

gcc/cp/ChangeLog:

* constraint.cc (diagnose_trait_expr): Remove assumption about
failures returning error_mark_node.
* except.cc (explain_not_noexcept): Allow expr not being
noexcept.
* method.cc (build_invoke): Adjust comment.
(is_trivially_xible): Always note non-trivial components if expr
is not null or error_mark_node.
(is_nothrow_xible): Likewise for non-noexcept components.
(is_nothrow_convertible): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/ext/is_invocable7.C: New test.
* g++.dg/ext/is_nothrow_convertible5.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Patrick Palka <ppalka@redhat.com>
13 hours agolibstdc++: Fix test when dual abi disabled
François Dumont [Tue, 29 Jul 2025 04:32:52 +0000 (06:32 +0200)] 
libstdc++: Fix test when dual abi disabled

When !_GLIBCXX_USE_DUAL_ABI the old COW std::string implementation is being used,
which does not generate the expected error diagnostics.

libstdc++-v3/ChangeLog:

* testsuite/std/time/format/data_not_present_neg.cc: Remove _GLIBCXX_USE_DUAL_ABI
check.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
16 hours agoc++: improve non-constant template arg diagnostic
Jason Merrill [Wed, 30 Jul 2025 16:00:06 +0000 (12:00 -0400)] 
c++: improve non-constant template arg diagnostic

A conversation today pointed out that the current diagnostic for this case
doesn't mention the constant evaluation failure, it just says e.g.

"'p' is not a valid template argument for 'int*' because it is not the
address of a variable"

This patch improves the diagnostic in C++17 and above (post-N4268) to
diagnose failed constant-evaluation.

gcc/cp/ChangeLog:

* pt.cc (convert_nontype_argument_function): Check
cxx_constant_value on failure.
(invalid_tparm_referent_p): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/tc1/dr49.C: Adjust diagnostic.
* g++.dg/template/func2.C: Likewise.
* g++.dg/cpp1z/nontype8.C: New test.

17 hours agosimplify-rtx: Add `(subreg (not a))` simplification for word_mode [PR121308]
Andrew Pinski [Wed, 30 Jul 2025 04:49:16 +0000 (21:49 -0700)] 
simplify-rtx: Add `(subreg (not a))` simplification for word_mode [PR121308]

Right now in simplify_subreg, there is code to try to simplify for word_mode
with the binary bitwise operators.  The unary bitwise operator is not handled,
which causes an odd mismatch that the new self-testing code added with
r16-2614-g965564eafb721f was not expecting.

The self testing code was for testing the newly added code but since there
was already code that handles word_mode, we hit the mismatch but only
for targets where word_mode is SImode (or smaller).

This adds the code to handle `not` in a similar fashion as the other
bitwise operators for word_mode.
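
The identity that makes this simplification valid can be checked with plain integers.  The sketch below is not GCC code: `word_of` and `not_commutes_with_word_extract` are hypothetical helpers that model taking a 32-bit word (as if word_mode were SImode) out of a 64-bit value.

```cpp
#include <cstdint>

// Hypothetical model of a word-sized subreg: pick one 32-bit word
// (word_index 0 or 1) out of a 64-bit value.
constexpr std::uint32_t word_of(std::uint64_t v, unsigned word_index)
{
  return static_cast<std::uint32_t>(v >> (32u * word_index));
}

// True when taking a word of ~a gives the same bits as inverting the
// same word of a, i.e. when `not` commutes with the word extraction.
constexpr bool not_commutes_with_word_extract(std::uint64_t a,
                                              unsigned word_index)
{
  return word_of(~a, word_index)
         == static_cast<std::uint32_t>(~word_of(a, word_index));
}

static_assert(not_commutes_with_word_extract(0x0123456789abcdefULL, 0),
              "low word");
static_assert(not_commutes_with_word_extract(0x0123456789abcdefULL, 1),
              "high word");
```

The same bitwise argument is what already justifies the word_mode handling of the binary operators; the sketch only illustrates the arithmetic, not the RTL conditions the patch checks.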

Changes since v1:
* v2: add `&& SCALAR_INT_MODE_P (innermode)` to the conditional.

Bootstrapped and tested on x86_64-linux-gnu.

PR rtl-optimization/121308
gcc/ChangeLog:

* simplify-rtx.cc (simplify_context::simplify_subreg): Handle
subreg of `not` with word_mode to make it symmetric with the
other bitwise operators.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
17 hours agoIFCVT: Fix factor_out_operators correctly for more than 1 phi [PR121295]
Andrew Pinski [Tue, 29 Jul 2025 15:46:01 +0000 (08:46 -0700)] 
IFCVT: Fix factor_out_operators correctly for more than 1 phi [PR121295]

r16-2590-ga51bf9e10182cf was not the correct fix for this in the end.
Instead a much simpler and more localized fix is needed: just change the phi
that is being worked on to use the new result and arguments that come from the
factored out operator.
This solves the issue of not having the result in the IR and causing issues that way.

Bootstrapped and tested on x86_64-linux-gnu.
Note this depends on reverting r16-2590-ga51bf9e10182cf.

PR tree-optimization/121236
PR tree-optimization/121295

gcc/ChangeLog:

* tree-if-conv.cc (factor_out_operators): Change the phi node
to the new result and args.

gcc/testsuite/ChangeLog:

* gcc.dg/torture/pr121236-1.c: New test.
* gcc.dg/torture/pr121295-1.c: New test.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
17 hours agoRevert "ifcvt: Fix ifcvt for multiple phi nodes after factoring operator [PR121236]"
Andrew Pinski [Tue, 29 Jul 2025 15:28:00 +0000 (08:28 -0700)] 
Revert "ifcvt: Fix ifcvt for multiple phi nodes after factoring operator [PR121236]"

This reverts commit a51bf9e10182cf7ac858db0ea6c5cb11b4f12377.

18 hours agoReport read errors when reading auto-profile
Jan Hubicka [Wed, 30 Jul 2025 14:09:12 +0000 (16:09 +0200)] 
Report read errors when reading auto-profile

Currently -fauto-profile will happily read a truncated file without any warning
and interpret it as a zero profile, which will in turn result in slow code.
This patch exports gcov_is_error and adds checks so truncated files are detected.
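
The failure mode and the fix can be sketched with a toy reader.  This is a minimal model, not the gcov-io implementation: the `Reader` class, its `is_error` method and this `read_profile` are hypothetical stand-ins, assuming only that a read past the end of the data returns 0 and latches an error flag.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <utility>
#include <vector>

// Hypothetical stand-in for a gcov-style reader: reading past the end
// returns 0 and latches an error flag instead of failing loudly.
class Reader
{
public:
  explicit Reader(std::vector<unsigned char> bytes) : data_(std::move(bytes)) {}

  std::uint32_t read_u32()
  {
    if (data_.size() - pos_ < sizeof(std::uint32_t))
      {
        error_ = true;
        return 0;
      }
    std::uint32_t v;
    std::memcpy(&v, data_.data() + pos_, sizeof v);  // host byte order
    pos_ += sizeof v;
    return v;
  }

  bool is_error() const { return error_; }

private:
  std::vector<unsigned char> data_;
  std::size_t pos_ = 0;
  bool error_ = false;
};

// Reads a count followed by that many values.  Without the is_error
// checks a truncated input would silently look like an all-zero profile.
bool read_profile(Reader &r, std::vector<std::uint32_t> &out)
{
  std::uint32_t n = r.read_u32();
  if (r.is_error())
    return false;
  for (std::uint32_t i = 0; i < n; ++i)
    {
      std::uint32_t v = r.read_u32();
      if (r.is_error())
        return false;
      out.push_back(v);
    }
  return true;
}
```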

gcc/ChangeLog:

* auto-profile.cc (string_table::read): Check gcov_is_error.
(read_profile): Likewise.
* gcov-io.cc (gcov_is_error): Export for gcc linkage.
* gcov-io.h (gcov_is_error): Declare.

19 hours ago[x86] factor out worker from ix86_builtin_vectorization_cost
Richard Biener [Wed, 30 Jul 2025 11:01:18 +0000 (13:01 +0200)] 
[x86] factor out worker from ix86_builtin_vectorization_cost

The following factors out a worker that gets a mode argument
rather than a vectype argument.  That makes a difference when
we hit the fallback in add_stmt_cost for scalar stmts where
vectype might be NULL and thus mode is derived from the scalar
stmt there.  But ix86_builtin_vectorization_cost does not
have access to the stmt.  So the patch instead dispatches
to the new ix86_default_vector_cost there, passing down the mode
we derived from the stmt.

This is to avoid regressions with a patch that makes even more
scalar stmt costings have a vectype passed.

* config/i386/i386.cc (ix86_default_vector_cost): Split
out from ...
(ix86_builtin_vectorization_cost): ... this and use
mode instead of vectype as argument.
(ix86_vector_costs::add_stmt_cost): Call
ix86_default_vector_cost instead of ix86_builtin_vectorization_cost.

19 hours agos390: Implement spaceship optab [PR117015]
Stefan Schulze Frielinghaus [Wed, 30 Jul 2025 13:25:54 +0000 (15:25 +0200)] 
s390: Implement spaceship optab [PR117015]

gcc/ChangeLog:

PR target/117015
* config/s390/s390-protos.h (s390_expand_int_spaceship): New
function.
(s390_expand_fp_spaceship): New function.
* config/s390/s390.cc (s390_expand_int_spaceship): New function.
(s390_expand_fp_spaceship): New function.
* config/s390/s390.md (spaceship<mode>4): New expander.

gcc/testsuite/ChangeLog:

* gcc.target/s390/spaceship-fp-1.c: New test.
* gcc.target/s390/spaceship-fp-2.c: New test.
* gcc.target/s390/spaceship-fp-3.c: New test.
* gcc.target/s390/spaceship-fp-4.c: New test.
* gcc.target/s390/spaceship-int-1.c: New test.
* gcc.target/s390/spaceship-int-2.c: New test.
* gcc.target/s390/spaceship-int-3.c: New test.

19 hours agocprop: Allow jump bypassing for single set insns
Stefan Schulze Frielinghaus [Wed, 30 Jul 2025 13:25:54 +0000 (15:25 +0200)] 
cprop: Allow jump bypassing for single set insns

During jump bypassing also consider insns of the form

(insn 25 57 26 9 (parallel [
            (set (reg:CCZ 33 %cc)
                (compare:CCZ (reg:SI 60 [ _9 ])
                    (const_int 0 [0])))
            (clobber (scratch:SI))
        ]) "spaceship-fp-4.c":27:1 1746 {*tstsi_cconly_extimm}
     (nil))

by testing for a single set insn during bypass_conditional_jumps().
This is a requirement for test gcc.target/s390/spaceship-fp-4.c of the
subsequent commit.

In order to silence

cprop.cc:1621:40: error: 'setcc_dest' may be used uninitialized [-Werror=maybe-uninitialized]
 1621 |             src = simplify_replace_rtx (src, setcc_dest, setcc_src);
      |                   ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~

initialize setcc_{dest,src} in bypass_block() although this is not
really required.

gcc/ChangeLog:

* cprop.cc (bypass_block): Extract single set.
(bypass_conditional_jumps): Ditto.

19 hours agox86: Transform to "pushq $-1; popq reg" for -Oz
H.J. Lu [Tue, 29 Jul 2025 18:22:35 +0000 (11:22 -0700)] 
x86: Transform to "pushq $-1; popq reg" for -Oz

commit 4c80062d7b8c272e2e193b8074a8440dbb4fe588
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Sun May 25 07:40:29 2025 +0800

    x86: Enable *mov<mode>_(and|or) only for -Oz

disabled transformation from "movq $-1,reg" to "pushq $-1; popq reg" for
-Oz.  But for legacy integer registers, the former is 4 bytes and the
latter is 3 bytes.  Enable such transformation for -Oz.

gcc/

PR target/120427
* config/i386/i386.md (peephole2): Transform "movq $-1,reg" to
"pushq $-1; popq reg" for -Oz if reg is a legacy integer register.

gcc/testsuite/

PR target/120427
* gcc.target/i386/pr120427-5.c: New test.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
19 hours agoauto-profile fixes
Jan Hubicka [Wed, 30 Jul 2025 12:53:21 +0000 (14:53 +0200)] 
auto-profile fixes

This patch silences the warning about bad locations in function_instance::match,
which is issued when the profile contains records for line numbers that are not
matched by the function body.  While this is a bogus profile (and we will end up losing
the profile data), create_gcov does not have enough information to output them
correctly in all contexts since in dwarf5 we output multiple locations per single
instruction (possibly coming from different inlines) while it can only represent
one inline stack.

The patch also fixes an issue with profile scaling.  By making force_nonzero
take cutoffs into account, I made the test for the counter being non-zero before
scaling too aggressive.

gcc/ChangeLog:

* auto-profile.cc (function_instance::match): Disable warning
about bogus locations since dwarf does not represent enough
info to output them correctly in all cases.
(add_scale): Use nonzero_p instead of orig.force_nonzero () == orig.
(afdo_adjust_guessed_profile): Add missing newline in dump
file.

19 hours agoFix symbol_table::change_decl_assembler_name when DECL_RTL is already computed
Jan Hubicka [Wed, 30 Jul 2025 12:48:43 +0000 (14:48 +0200)] 
Fix symbol_table::change_decl_assembler_name when DECL_RTL is already computed

While working on a patch assigning unique names to static symbols I noticed that
fortran symbols are not renamed since the frontend calls make_decl_rtl.  This
gets DECL_ASSEMBLER_NAME and DECL_RTL out of sync.  I think we can drop that
call, but it is also a good idea to avoid this inconsistency, so this patch makes
symbol_table::change_decl_assembler_name recompute DECL_RTL in this case.

gcc/ChangeLog:

* symtab.cc (symbol_table::change_decl_assembler_name): Recompute DECL_RTL
in case it is already computed.

19 hours agoFix fasle profile insonsistency error
Jan Hubicka [Wed, 30 Jul 2025 13:05:18 +0000 (15:05 +0200)] 
Fix fasle profile insonsistency error

This patch fixes a false inconsistent profile error message seen when building SPEC with
-fprofile-use -fdump-ipa-profile.
The problem is that with dumping, tree_estimate_probability is run in dry run
mode to report success rates of heuristics.  It however runs determine_unlikely_bbs,
which overwrites some counts to profile_count::zero, and later value profiling sees
the mismatch.

In a sane profile determine_unlikely_bbs should almost always be a no-op since it
should only drop to 0 things that are known to be unlikely executed.  What
happens here is that there is a comdat where the profile is lost and we see a
call with non-zero count calling a function with zero count and "fix" the profile
by making the call have zero count, too.

I also extended the unlikely predicates to avoid tampering with predictions when
the prediction is believed to be reliable.  This also keeps us from dropping all
EH regions to 0 count, as tested by the testcase.

gcc/ChangeLog:

* predict.cc (unlikely_executed_edge_p): Ignore EDGE_EH if profile
is reliable.
(unlikely_executed_stmt_p): Special case builtin_trap/unreachable and
ignore other heuristics for reliable profiles.
(tree_estimate_probability): Disable unlikely bb detection when
doing dry run.

gcc/testsuite/ChangeLog:

* g++.dg/tree-prof/eh1.C: New test.

20 hours agovect: Add target hook to prefer gather/scatter instructions
Andrew Stubbs [Mon, 28 Jul 2025 13:58:03 +0000 (13:58 +0000)] 
vect: Add target hook to prefer gather/scatter instructions

For AMD GCN, the instructions available for loading/storing vectors are
always scatter/gather operations (i.e. there are separate addresses for
each vector lane), so the current heuristic to avoid gather/scatter
operations with too many elements in get_group_load_store_type is
counterproductive. Avoiding such operations in that function can
subsequently lead to a missed vectorization opportunity whereby later
analyses in the vectorizer try to use a very wide array type which is
not available on this target, and thus it bails out.

This patch adds a target hook to override the "single_element_p"
heuristic in that function, and activates it for GCN.  This
allows much better code to be generated for affected loops.

Co-authored-by: Julian Brown <julian@codesourcery.com>
gcc/
* doc/tm.texi.in (TARGET_VECTORIZE_PREFER_GATHER_SCATTER): Add
documentation hook.
* doc/tm.texi: Regenerate.
* target.def (prefer_gather_scatter): Add target hook under vectorizer.
* hooks.cc (hook_bool_mode_int_unsigned_false): New function.
* hooks.h (hook_bool_mode_int_unsigned_false): New prototype.
* tree-vect-stmts.cc (vect_use_strided_gather_scatters_p): Add
parameters group_size and single_element_p, and rework to use
targetm.vectorize.prefer_gather_scatter.
(get_group_load_store_type): Move some of the condition into
vect_use_strided_gather_scatters_p.
* config/gcn/gcn.cc (gcn_prefer_gather_scatter): New function.
(TARGET_VECTORIZE_PREFER_GATHER_SCATTER): Define hook.

20 hours agoDon't pass vector params through to offload targets
Andrew Stubbs [Thu, 24 Jul 2025 12:58:31 +0000 (12:58 +0000)] 
Don't pass vector params through to offload targets

The optimization options are deliberately passed through to the LTO compiler,
but when the same mechanism is reused for offloading it ends up forcing the
host compiler settings onto the device compiler.  Maybe this should be removed
completely, but this patch just fixes a few of them.  In particular,
param_vect_partial_vector_usage is disabled by x86 and this really hurts amdgcn.

I also fixed an ambiguous else warning in the generated file by adding braces.

gcc/ChangeLog:

* config/gcn/gcn.cc (gcn_option_override): Add note to set default for
param_vect_partial_vector_usage to "1".
* optc-save-gen.awk: Don't pass through options marked "NoOffload".
* params.opt (-param=vect-epilogues-nomask): Add NoOffload.
(-param=vect-partial-vector-usage): Likewise.
(-param=vect-inner-loop-cost-factor): Likewise.

21 hours agotree-optimization/121130 - vectorizable_call cannot handle .MASK_CALL
Richard Biener [Wed, 30 Jul 2025 10:34:20 +0000 (12:34 +0200)] 
tree-optimization/121130 - vectorizable_call cannot handle .MASK_CALL

The following makes it correctly reject them,
vectorizable_simd_clone_call is solely responsible for them.

PR tree-optimization/121130
* tree-vect-stmts.cc (vectorizable_call): Bail out for
.MASK_CALL.

* gcc.dg/vect/vect-simd-pr121130.c: New testcase.

21 hours agoc++: Make __extension__ silence -Wlong-long pedwarns/warnings [PR121133]
Jakub Jelinek [Wed, 30 Jul 2025 11:23:56 +0000 (13:23 +0200)] 
c++: Make __extension__ silence -Wlong-long pedwarns/warnings [PR121133]

The PR13358 r0-92909 change changed the diagnostics on long long
in C++ (either with -std=c++98 or -Wlong-long), but unlike the
C FE we unfortunately warn even in the
__extension__ long long a;
etc. cases.  The C FE in that case in
disable_extension_diagnostics saves and clears not just
pedantic flag but also warn_long_long (and several others), while
C++ FE only temporarily disables pedantic.

The following patch makes it behave like the C FE in this regard,
though (__extension__ 1LL) still doesn't work because of the
separate lexing (and I must say I have no idea how to fix that).

Or do you prefer a solution closer to the C FE, cp_parser_extension_opt
saving the values into a bitfield and have another function to restore
the state (or use RAII)?
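
For illustration only, the RAII alternative mentioned above could look roughly like the sketch below.  It is not the parser code: `pedantic_flag`, `warn_long_long_flag`, `ExtensionGuard` and the two helper functions are hypothetical stand-ins that only model saving, clearing and restoring the two flags.

```cpp
// Stand-ins for the front-end flags; these globals are hypothetical
// and only model the save/clear/restore pattern.
static int pedantic_flag = 1;
static int warn_long_long_flag = 1;

// Clears both flags on construction and restores the saved values on
// destruction, so every exit path from the guarded scope restores them.
class ExtensionGuard
{
public:
  ExtensionGuard()
    : saved_pedantic_(pedantic_flag),
      saved_long_long_(warn_long_long_flag)
  {
    pedantic_flag = 0;
    warn_long_long_flag = 0;
  }

  ~ExtensionGuard()
  {
    pedantic_flag = saved_pedantic_;
    warn_long_long_flag = saved_long_long_;
  }

  ExtensionGuard(const ExtensionGuard &) = delete;
  ExtensionGuard &operator=(const ExtensionGuard &) = delete;

private:
  int saved_pedantic_;
  int saved_long_long_;
};

// Models "would a long long diagnostic be issued right now?".
bool would_warn_long_long()
{
  return pedantic_flag != 0 || warn_long_long_flag != 0;
}

// Models parsing a declaration prefixed by __extension__.
bool warns_inside_extension()
{
  ExtensionGuard guard;
  return would_warn_long_long();
}
```

A guard like this trades the explicit extra argument for scope-based restoration; whether that fits the parser's existing conventions is a separate question.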

2025-07-30  Jakub Jelinek  <jakub@redhat.com>

PR c++/121133
* parser.cc (cp_parser_unary_expression): Adjust
cp_parser_extension_opt caller and restore warn_long_long.
(cp_parser_declaration): Likewise.
(cp_parser_block_declaration): Likewise.
(cp_parser_member_declaration): Likewise.
(cp_parser_extension_opt): Add SAVED_LONG_LONG argument,
save previous warn_long_long state into it and clear it
for __extension__.

* g++.dg/warn/pr121133-1.C: New test.
* g++.dg/warn/pr121133-2.C: New test.
* g++.dg/warn/pr121133-3.C: New test.
* g++.dg/warn/pr121133-4.C: New test.

21 hours agolibcpp: Fix up comma diagnostics in preprocessor for C++ [PR120778]
Jakub Jelinek [Wed, 30 Jul 2025 11:20:59 +0000 (13:20 +0200)] 
libcpp: Fix up comma diagnostics in preprocessor for C++ [PR120778]

The P2843R3 Preprocessing is never undefined paper contains comments
that various compilers handle comma operators in preprocessor expressions
incorrectly and I think they are right.

In both C and C++ the grammar uses constant-expression non-terminal
for #if/#elif and in both C and C++ that NT is conditional-expression,
so
  #if 1, 2
is IMHO clearly wrong in both languages.

C89 then says for constant-expression
"Constant expressions shall not contain assignment, increment, decrement,
function-call, or comma operators, except when they are contained within the
operand of a sizeof operator."
Because all the remaining identifiers in the #if/#elif expression are
replaced with 0 I think assignments, increment, decrement and function-call
aren't that big deal because (0 = 1) or ++4 etc. are all invalid, but
for comma expressions I think it matters.  In r0-56429 PR456 Joseph has
added !CPP_OPTION (pfile, c99) to handle that correctly.
Then C99 changed that to:
"Constant expressions shall not contain assignment, increment, decrement, function-call,
or comma operators, except when they are contained within a subexpression that is not
evaluated."
That made for C99+
  #if 1 || (1, 2)
etc. valid but
  #if (1, 2)
is still invalid, ditto
  #if 1 ? 1, 2 : 3

In C++ I can't find anything like that though, and as can be seen on say
int a[(1, 2)];
int b[1 ? 1, 2 : 3];
being accepted by C++ and rejected by C while
int c[1, 2];
int d[1 ? 2 : 3, 4];
being rejected in both C and C++, so I think for C++ it is indeed just the
grammar that prevents #if 1, 2.  When it is the second operand of ?: or
inside of () the grammar just uses expression and that allows comma
operator.

So, the following patch uses different decisions for C++ when to diagnose
comma operator in preprocessor expressions, for C++ tracks if it is inside
of () (obviously () around #embed clauses don't count unless one uses
limit ((1, 2)) etc.) or inside of the second ?: operand and allows comma
operator there and disallows elsewhere.
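
The tracking rule described above can be modelled with a toy scanner.  This is a simplified sketch, not the libcpp implementation: `commas_allowed` is a hypothetical function that works on raw characters rather than preprocessor tokens, so it ignores character literals and #embed clauses.

```cpp
#include <string_view>

// Toy model of the C++ rule: a counter goes up on '(' and '?', down on
// ')' and ':', and a comma is accepted only while the counter is
// non-zero.  Returns false as soon as a comma appears at depth zero.
bool commas_allowed(std::string_view expr)
{
  int comma_ok = 0;
  for (char c : expr)
    {
      switch (c)
        {
        case '(':
        case '?':
          ++comma_ok;
          break;
        case ')':
        case ':':
          --comma_ok;
          break;
        case ',':
          if (comma_ok == 0)
            return false;
          break;
        default:
          break;
        }
    }
  return true;
}
```

With this model `1, 2` and `1 ? 2 : 3, 4` are rejected, while `(1, 2)` and `1 ? 1, 2 : 3` are accepted, matching the array-bound examples above.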

BTW, I wonder if anything in the standard disallows <=> in the preprocessor
expressions.  Say
  #if (0 <=> 1) < 0
etc.
  #include <compare>
  constexpr int a = (0 <=> 1) < 0;
is valid (but not valid without #include <compare>) and the expressions
don't use any identifiers.

2025-07-30  Jakub Jelinek  <jakub@redhat.com>

PR c++/120778
* internal.h (struct lexer_state): Add comma_ok member.
* expr.cc (_cpp_parse_expr): Initialize it to 0, increment on
CPP_OPEN_PAREN and CPP_QUERY, decrement on CPP_CLOSE_PAREN
and CPP_COLON.
(num_binary_op): For C++ pedwarn on comma operator if
pfile->state.comma_ok is 0 instead of !c99 or skip_eval.

* g++.dg/cpp/if-comma-1.C: New test.

22 hours agovect: Add missing skip-vector check for peeling with versioning [PR121020]
Pengfei Li [Wed, 30 Jul 2025 09:54:14 +0000 (10:54 +0100)] 
vect: Add missing skip-vector check for peeling with versioning [PR121020]

This fixes a miscompilation issue introduced by the enablement of
combined loop peeling and versioning. A test case that reproduces the
issue is included in the patch.

When performing loop peeling, GCC usually inserts a skip-vector check.
This ensures that after peeling, there are enough remaining iterations
to enter the main vectorized loop. Previously, the check was omitted if
loop versioning for alignment was applied. It was safe before because
versioning and peeling for alignment were mutually exclusive.

However, with combined peeling and versioning enabled, this is not safe
any more. A loop may be peeled and versioned at the same time. Without
the skip-vector check, the main vectorized loop can be entered even if
its iteration count is zero. This can cause the loop to run many more
iterations than needed, resulting in incorrect results.

To fix this, the patch updates the condition of omitting the skip-vector
check to when versioning is performed alone without peeling.
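
As a rough model of what the skip-vector check guards against, consider the sketch below.  It is an assumption-laden simplification, not the condition built in vect_do_peeling: `enter_vector_loop` is a hypothetical helper, `peel` is the number of iterations consumed by the prologue, and `vf` is the vectorization factor.

```cpp
#include <cstdint>

// Hypothetical model of a skip-vector check: after `peel` scalar
// iterations run in the prologue, enter the main vector loop only if at
// least one full vector iteration (vf scalar iterations) remains.
bool enter_vector_loop(std::uint64_t niters, std::uint64_t peel,
                       std::uint64_t vf)
{
  if (peel >= niters)
    return false;               // the prologue consumed every iteration
  return niters - peel >= vf;
}
```

For example, with 3 iterations, 3 of them peeled and a vectorization factor of 4, nothing remains for the vector loop, so it has to be skipped.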

gcc/ChangeLog:

PR tree-optimization/121020
* tree-vect-loop-manip.cc (vect_do_peeling): Update the
condition of omitting the skip-vector check.
* tree-vectorizer.h (LOOP_VINFO_USE_VERSIONING_WITHOUT_PEELING):
Add a helper macro.

gcc/testsuite/ChangeLog:

PR tree-optimization/121020
* gcc.dg/vect/vect-early-break_138-pr121020.c: New test.

22 hours agovect: Fix insufficient alignment requirement for speculative loads [PR121190]
Pengfei Li [Wed, 30 Jul 2025 09:51:11 +0000 (10:51 +0100)] 
vect: Fix insufficient alignment requirement for speculative loads [PR121190]

This patch fixes a segmentation fault issue that can occur in vectorized
loops with an early break. When GCC vectorizes such loops, it may insert
a versioning check to ensure that data references (DRs) with speculative
loads are aligned. The check normally requires DRs to be aligned to the
vector mode size, which prevents generated vector load instructions from
crossing page boundaries.

However, this is not sufficient when a single scalar load is vectorized
into multiple loads within the same iteration. In such cases, even if
none of the vector loads crosses page boundaries, subsequent loads after
the first one may still access memory beyond the current valid page.

Consider the following loop as an example:

while (i < MAX_COMPARE) {
  if (*(p + i) != *(q + i))
    return i;
  i++;
}

When compiled with "-O3 -march=znver2" on x86, the vectorized loop may
include instructions like:

vmovdqa (%rcx,%rax), %ymm0
vmovdqa 32(%rcx,%rax), %ymm1
vpcmpeqq (%rdx,%rax), %ymm0, %ymm0
vpcmpeqq 32(%rdx,%rax), %ymm1, %ymm1

Note two speculative vector loads are generated for each DR (p and q).
The first vmovdqa and vpcmpeqq are safe due to the vector size (32-byte)
alignment, but the following ones (at offset 32) may not be safe because
they could read from the beginning of the next memory page, potentially
leading to segmentation faults.

To avoid the issue, this patch increases the alignment requirement for
speculative loads to DR_TARGET_ALIGNMENT. It ensures all vector loads in
the same vector iteration access memory within the same page.
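
The page arithmetic can be checked with a small sketch.  It is only a model under stated assumptions: `loads_stay_in_page` is a hypothetical helper, pages are assumed to be 4096 bytes, and the stricter alignment is taken to be the total number of bytes loaded per vector iteration (64 for two 32-byte loads).

```cpp
#include <cstddef>
#include <cstdint>

// True if `count` consecutive vector loads of `vec_bytes` each, starting
// at `addr`, all stay inside the page that contains `addr`.
// `count` must be at least 1.
bool loads_stay_in_page(std::uintptr_t addr, std::size_t vec_bytes,
                        unsigned count, std::size_t page_bytes = 4096)
{
  std::uintptr_t first_page = addr / page_bytes;
  std::uintptr_t last_byte = addr + vec_bytes * count - 1;
  return last_byte / page_bytes == first_page;
}
```

An address of 4064 is 32-byte aligned and the first 32-byte load ends exactly at the page boundary, but the second load at offset 32 starts on the next page.  A 64-byte aligned address such as 4032 keeps both loads within one page.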

gcc/ChangeLog:

PR tree-optimization/121190
* tree-vect-data-refs.cc (vect_enhance_data_refs_alignment):
Increase alignment requirement for speculative loads.

gcc/testsuite/ChangeLog:

PR tree-optimization/121190
* gcc.dg/vect/vect-early-break_52.c: Update an unsafe test.
* gcc.dg/vect/vect-early-break_137-pr121190.c: New test.

23 hours agoaarch64: Fix sme2+faminmax intrisic gating (PR 121300)
Alfie Richards [Tue, 29 Jul 2025 14:16:40 +0000 (14:16 +0000)] 
aarch64: Fix sme2+faminmax intrisic gating (PR 121300)

Fixes the feature gating for the SME2+FAMINMAX intrinsics.

PR target/121300

gcc/ChangeLog:

* config/aarch64/aarch64-sve-builtins-sme.def (svamin/svamax): Fix
arch gating.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/pr121300.c: New test.

23 hours agotree-optimization/121304 - set memory_access_type before reading it
Richard Biener [Wed, 30 Jul 2025 07:44:07 +0000 (09:44 +0200)] 
tree-optimization/121304 - set memory_access_type before reading it

The following re-orders gather/scatter handling back to be before
we check for fallback situations, specifically make sure to set
memory_access_type before reading it.

* tree-vect-stmts.cc (get_group_load_store_type):
Process STMT_VINFO_GATHER_SCATTER before reading
memory_access_type.

23 hours agoaarch64: Add support for unpacked SVE FP conditional ternary arithmetic
Spencer Abson [Wed, 30 Jul 2025 08:58:50 +0000 (08:58 +0000)] 
aarch64: Add support for unpacked SVE FP conditional ternary arithmetic

This patch extends the expander for fma, fnma, fms, and fnms to support
partial SVE FP modes.

We add the missing BF16 tests, which we can now trigger, having
implemented the conditional expander.

We also add tests for the 'merging with multiplicand' case, which this
expander canonicalizes (albeit under SVE_STRICT_GP).

gcc/ChangeLog:

* config/aarch64/aarch64-sve.md (@cond_<optab><mode>): Extend
to support partial FP modes.
(*cond_<optab><mode>_2_strict): Extend from SVE_FULL_F to SVE_F,
use aarch64_predicate_operand.
(*cond_<optab><mode>_4_strict): Extend from SVE_FULL_F_B16B16 to
SVE_F_B16B16, use aarch64_predicate_operand.
(*cond_<optab><mode>_any_strict):  Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve/unpacked_cond_fmla_1.c: Add test cases
for merging with multiplicand.
* gcc.target/aarch64/sve/unpacked_cond_fmls_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_fnmla_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_fnmls_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_fmla_2.c: New test.
* gcc.target/aarch64/sve/unpacked_cond_fmls_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_fnmla_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_fnmls_2.c: Likewise.
* g++.target/aarch64/sve/unpacked_cond_ternary_bf16_1.C: Likewise.
* g++.target/aarch64/sve/unpacked_cond_ternary_bf16_2.C: Likewise.

24 hours agoaarch64: Relaxed SEL combiner patterns for unpacked SVE FP ternary arithmetic
Spencer Abson [Wed, 30 Jul 2025 08:20:58 +0000 (08:20 +0000)] 
aarch64: Relaxed SEL combiner patterns for unpacked SVE FP ternary arithmetic

Extend the ternary op/UNSPEC_SEL combiner patterns from SVE_FULL_F/
SVE_FULL_F_BF to SVE_F/SVE_F_BF, where the strictness value is
SVE_RELAXED_GP.

We can only reliably test the 'merging with the third input' (addend)
and 'independent value' patterns at this stage as the canonicalisation that
reorders the multiplicands based on the second SEL input would be performed
by the conditional expander.

Another difficulty is that we can't test these fused multiply/SEL combines
without using __builtin_fma and friends.  The reason for this is as follows:

We support COND_ADD, COND_SUB, and COND_MUL optabs, so match.pd will
canonicalize patterns like ADD/SUB/MUL combined with a VEC_COND_EXPR into
these conditional forms.  Later, when widening_mul tries to fold these into
conditional fused multiply operations, the transformation fails - simply
because we haven’t implemented those conditional fused multiply optabs yet.

Hence why this patch lacks tests for BFloat16...

gcc/ChangeLog:

* config/aarch64/aarch64-sve.md (*cond_<optab><mode>_2_relaxed):
Extend from SVE_FULL_F to SVE_F.
(*cond_<optab><mode>_4_relaxed): Extend from SVE_FULL_F_B16B16
to SVE_F_B16B16.
(*cond_<optab><mode>_any_relaxed): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve/unpacked_cond_fmla_1.c: New test.
* gcc.target/aarch64/sve/unpacked_cond_fmls_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_fnmla_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_fnmls_1.c: Likewise.

24 hours agofortran: Remove useless elements count variable
Mikael Morin [Sun, 27 Jul 2025 12:47:14 +0000 (14:47 +0200)] 
fortran: Remove useless elements count variable

The function gfc_array_init_size evaluates the number of array elements
to a variable from a caller, but the single caller providing the
variable actually doesn't use it.

This change removes the variable and the function arguments passing its
address down the call chain.

gcc/fortran/ChangeLog:

* trans-array.cc (gfc_array_init_size): Remove the nelems
argument.
(gfc_array_allocate): Update caller.  Remove the nelems
argument.
* trans-stmt.cc (gfc_trans_allocate): Update caller.  Remove the
nelems variable.
* trans-array.h (gfc_array_allocate): Update prototype.

24 hours agofortran: implement split for fortran 2023
Yuao Ma [Sun, 27 Jul 2025 11:41:25 +0000 (19:41 +0800)] 
fortran: implement split for fortran 2023

This patch includes the implementation, documentation, and test case for SPLIT.

gcc/fortran/ChangeLog:

* check.cc (gfc_check_split): Argument check for SPLIT.
* gfortran.h (enum gfc_isym_id): Define GFC_ISYM_SPLIT.
* intrinsic.cc (add_subroutines): Register SPLIT intrinsic.
* intrinsic.h (gfc_check_split): New decl.
(gfc_resolve_split): Ditto.
* intrinsic.texi: SPLIT documentation.
* iresolve.cc (gfc_resolve_split): Add resolved_sym for SPLIT.
* trans-decl.cc (gfc_build_intrinsic_function_decls): Add decl for
SPLIT in libgfortran.
* trans-intrinsic.cc (conv_intrinsic_split): SPLIT codegen.
(gfc_conv_intrinsic_subroutine): Handle SPLIT case.
* trans.h (GTY): Declare gfor_fndecl_string_split{, _char4}.

libgfortran/ChangeLog:

* gfortran.map: Add split symbol.
* intrinsics/string_intrinsics_inc.c (string_split):
Runtime support for SPLIT.

gcc/testsuite/ChangeLog:

* gfortran.dg/split_1.f90: New test.
* gfortran.dg/split_2.f90: New test.
* gfortran.dg/split_3.f90: New test.
* gfortran.dg/split_4.f90: New test.

Signed-off-by: Yuao Ma <c8ef@outlook.com>
24 hours agoaarch64: Add support for unpacked SVE FP ternary arithmetic
Spencer Abson [Wed, 30 Jul 2025 07:59:42 +0000 (07:59 +0000)] 
aarch64: Add support for unpacked SVE FP ternary arithmetic

This patch extends the expander for unconditional fma, fnma, fms, and
fnms, so that it supports partial SVE FP modes.

gcc/ChangeLog:

* config/aarch64/aarch64-sve.md (<optab><mode>4): Extend from
SVE_FULL_F_B16B16 to SVE_F_B16B16.  Use aarch64_sve_fp_pred instead
of aarch64_ptrue_reg.
(@aarch64_pred_<optab><mode>): Extend from SVE_FULL_F_B16B16 to
SVE_F_B16B16.  Use aarch64_predicate_operand.

gcc/testsuite/ChangeLog:

* g++.target/aarch64/sve/unpacked_ternary_bf16_1.C: New test.
* g++.target/aarch64/sve/unpacked_ternary_bf16_2.C: Likewise.
* gcc.target/aarch64/sve/unpacked_fmla_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_fmla_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_fmls_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_fmls_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_fnmla_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_fnmla_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_fnmls_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_fnmls_2.c: Likewise.

25 hours agoRemove V64SFmode and V64SImode.
liuhongt [Tue, 29 Jul 2025 03:01:54 +0000 (20:01 -0700)] 
Remove V64SFmode and V64SImode.

These modes were needed by avx5124vnniw/avx5124fmaps, which have been
removed by r15-656-ge1a7e2c54d52d0.

gcc/ChangeLog:

* config/i386/i386-modes.def: Remove VECTOR_MODES(FLOAT, 256)
and VECTOR_MODE (INT, SI, 64).
* config/i386/i386.cc (ix86_hard_regno_nregs): Remove related
code for V64SF/V64SImode.

25 hours agoEliminate redundant vpextrq/vpinsrq when move TI to V4SI.
liuhongt [Tue, 29 Jul 2025 07:01:37 +0000 (00:01 -0700)] 
Eliminate redundant vpextrq/vpinsrq when move TI to V4SI.

r14-1902-g96c3539f2a3813 split a TImode move into 2 DImode moves; it's
supposed to optimize TImode in parameter/return, since according to the
psABI it's stored in 2 general registers.

But when TImode is not in parameter/return, it could create redundancy,
as in the PR.

The patch adds a splitter to handle that.

i.e.
(insn 10 9 14 2 (set (subreg:V2DI (reg:V4SI 98 [ <retval> ]) 0)
     (vec_concat:V2DI (subreg:DI (reg:TI 101) 0)
 (subreg:DI (reg:TI 101) 8)))
 8442 {vec_concatv2di}
(expr_list:REG_DEAD (reg:TI 101)

gcc/ChangeLog:

PR target/121274
* config/i386/sse.md (*vec_concatv2di_0): Add a splitter
before it.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr121274.c: New test.

31 hours agoRISC-V: Add testcases for unsigned avg ceil vx combine.
Pan Li [Mon, 28 Jul 2025 12:12:31 +0000 (20:12 +0800)] 
RISC-V: Add testcases for unsigned avg ceil vx combine.

The unsigned avg ceil shares the vaaddux.vx for the vx combine,
so add test cases to make sure it works as expected.
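
For readers unfamiliar with the operation, unsigned "avg ceil" is the
rounding-up average (a + b + 1) >> 1 computed without losing the carry.
A minimal scalar sketch follows; the helper names are hypothetical and do
not come from the testsuite, and the sketch says nothing about how the
vx combine itself is implemented.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar reference (the name is not from the testsuite):
// rounding-up average of two unsigned 8-bit values.  The sum is formed
// in a wider type so the carry out of bit 7 is not lost.
inline std::uint8_t avg_ceil_u8(std::uint8_t a, std::uint8_t b)
{
  const std::uint16_t sum = static_cast<std::uint16_t>(a + b + 1);
  return static_cast<std::uint8_t>(sum >> 1);
}

// Small usage example.
inline void avg_ceil_u8_examples()
{
  assert(avg_ceil_u8(1, 2) == 2);       // (1 + 2 + 1) >> 1
  assert(avg_ceil_u8(255, 255) == 255); // no wrap-around at the top
  assert(avg_ceil_u8(0, 0) == 0);
}
```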

The following test suites passed for this patch series.
* The rv64gcv full regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u64.c: Add asm check
for unsigned avg ceil.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_binary.h: Add test
helper macros.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_binary_data.h: Add
test data.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vaadd-run-2-u16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vaadd-run-2-u32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vaadd-run-2-u64.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vaadd-run-2-u8.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
32 hours agoDaily bump.
GCC Administrator [Wed, 30 Jul 2025 00:19:13 +0000 (00:19 +0000)] 
Daily bump.

32 hours agosimplify-rtx: Fix Distribute subregs over logic ops [PR121302]
Andrew Pinski [Wed, 30 Jul 2025 00:02:44 +0000 (17:02 -0700)] 
simplify-rtx: Fix Distribute subregs over logic ops [PR121302]

r16-2614-g965564eafb721f had a typo where it would assume byte==0
rather than use the byte (offset) that was passed.

This fixes that typo and also fixes the comment since it is not just
about lowerpart subregs but all non-paradoxical subregs.

Pushed as obvious after bootstrap/test on x86_64-linux-gnu.

PR rtl-optimization/121302
gcc/ChangeLog:

* simplify-rtx.cc (simplify_context::simplify_subreg): Use
byte instead of 0 when calling simplify_subreg.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
38 hours agotestsuite: Cleanup after auto-profile testcases when auto-profile is not supported...
Andrew Pinski [Tue, 29 Jul 2025 17:38:58 +0000 (10:38 -0700)] 
testsuite: Cleanup after auto-profile testcases when auto-profile is not supported [PR121215]

The problem here is that tree-prof.exp does not clean up when a testcase requires
auto-profile but it is not supported and the testcase uses dg-additional-sources.
Currently additional_sources is not reset to "", so when another testcase comes along
it thinks that is the additional source to be added.

Committed as obvious after testing:
make check-gcc RUNTESTFLAGS="tree-prof.exp=afdo-crossmodule-1.c tree-ssa.exp=pr67891.c"
to make sure pr67891.c now no longer uses the additional source.

PR testsuite/121215
gcc/testsuite/ChangeLog:

* lib/profopt.exp (profopt-execute): Call cleanup-after-saved-dg-test
if returning early for the failing -fauto-profile case.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
39 hours agoaarch64: Add support for unpacked SVE FP conditional binary arithmetic
Spencer Abson [Tue, 29 Jul 2025 16:37:26 +0000 (16:37 +0000)] 
aarch64: Add support for unpacked SVE FP conditional binary arithmetic

This patch extends the expander for conditional smax, smin, add, sub, mul,
min, max, and div to support partial SVE FP modes.

If exceptions from undefined vector elements must be suppressed, this
expansion converts the container-level predicate to an element-level one, and
ensures that these elements are inactive for the operation.  In practice, this
is a predicate AND with the existing mask and a container-size PTRUE.

gcc/ChangeLog:

* config/aarch64/aarch64-protos.h (aarch64_sve_emit_masked_fp_pred):
Declare.
* config/aarch64/aarch64-sve.md (and<mode>3):  Change this to...
(@and<mode>3): ...this, so that we can use gen_and3.
(@cond_<optab><mode>): Extend from SVE_FULL_F_B16B16 to SVE_F_B16B16,
use aarch64_predicate_operand.
(*cond_<optab><mode>_2_strict): Likewise.
(*cond_<optab><mode>_3_strict): Likewise.
(*cond_<optab><mode>_any_strict): Likewise.
(*cond_<optab><mode>_2_const_strict): Extend from SVE_FULL_F to SVE_F,
use aarch64_predicate_operand.
(*cond_<optab><mode>_any_const_strict): Likewise.
(*cond_sub<mode>_3_const_strict): Likewise.
(*cond_sub<mode>_const_strict): Likewise.
(*vcond_mask_<mode><vpred>): Use aarch64_predicate_operand, and update
the comment here.
* config/aarch64/aarch64.cc (aarch64_sve_emit_masked_fp_pred): New
function.  Helper to mask the predicate in conditional expanders.

gcc/testsuite/ChangeLog:

* g++.target/aarch64/sve/unpacked_cond_binary_bf16_2.C: New test.
* gcc.target/aarch64/sve/unpacked_cond_builtin_fmax_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_builtin_fmin_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_fadd_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_fdiv_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_fmaxnm_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_fminnm_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_fmul_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_fsubr_2.c: Likewise.

40 hours agox86: Pass -mno-80387 to compile pr121208-1(a|b).c
H.J. Lu [Tue, 29 Jul 2025 16:11:34 +0000 (09:11 -0700)] 
x86: Pass -mno-80387 to compile pr121208-1(a|b).c

Pass -mno-80387 to compile pr121208-1(a|b).c to silence

.../pr121208-1a.c:11:1: sorry, unimplemented: 80387 instructions aren’t allowed in a function with the ‘no_caller_saved_registers’ attribute

PR target/121208
* gcc.target/i386/pr121208-1a.c (dg-options): Add -mno-80387.
* gcc.target/i386/pr121208-1b.c (dg-options): Likewise.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
41 hours agotestsuite: Adjust s390x params for vector tests.
Juergen Christ [Tue, 29 Jul 2025 14:23:24 +0000 (16:23 +0200)] 
testsuite: Adjust s390x params for vector tests.

Loop peeling and minimal loop vectorization threshold prevented loop
vectorization in these examples.  Adjust parameters in the test to
make the test pass.

Signed-off-by: Juergen Christ <jchrist@linux.ibm.com>
PR testsuite/121286
PR testsuite/121288

gcc/testsuite/ChangeLog:

* gcc.dg/vect/pr112325.c: Adjust parameters for s390.
* gcc.dg/vect/pr117888-1.c: Ditto.

41 hours agoRISC-V: Generate -mcpu and -mtune options from riscv-cores.def.
Dongyan Chen [Wed, 25 Jun 2025 13:20:25 +0000 (21:20 +0800)] 
RISC-V: Generate -mcpu and -mtune options from riscv-cores.def.

Automatically generate -mcpu and -mtune options in invoke.texi from
the unified riscv-cores.def metadata, ensuring documentation stays in sync
with definitions and reducing manual maintenance.

gcc/ChangeLog:

* Makefile.in: Add riscv-mcpu.texi and riscv-mtune.texi to the list
of files to be processed by the Texinfo generator.
* config/riscv/t-riscv: Add rule for generating riscv-mcpu.texi
and riscv-mtune.texi.
* doc/invoke.texi: Replace hand‑written extension table with
`@include riscv-mcpu.texi` and `@include riscv-mtune.texi` to
pull in auto‑generated entries.
* config/riscv/gen-riscv-mcpu-texi.cc: New file.
* config/riscv/gen-riscv-mtune-texi.cc: New file.
* doc/riscv-mcpu.texi: New file.
* doc/riscv-mtune.texi: New file.

41 hours agosimplify-rtx: Simplify subregs of logic ops
Richard Sandiford [Tue, 29 Jul 2025 14:58:34 +0000 (15:58 +0100)] 
simplify-rtx: Simplify subregs of logic ops

This patch adds a new rule for distributing lowpart subregs through
ANDs, IORs, and XORs with a constant, in cases where one of the terms
then disappears.  For example:

  (lowpart-subreg:QI (and:HI x 0x100))

simplifies to zero and

  (lowpart-subreg:QI (and:HI x 0xff))

simplifies to (lowpart-subreg:QI x).

This would often be handled at some point using nonzero bits.  However,
the specific case I want the optimisation for is SVE predicates,
where nonzero bit tracking isn't currently an option.  Specifically:
the predicate modes VNx8BI, VNx4BI and VNx2BI have the same size as
VNx16BI, but treat only every second, fourth, or eighth bit as
significant.  Thus if we have:

  (subreg:VNx8BI (and:VNx16BI x C))

where C is the repeating constant { 1, 0, 1, 0, ... }, then the
AND only clears bits that are made insignificant by the subreg,
and so the result is equal to (subreg:VNx8BI x).  Later patches
rely on this.
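
The two integer examples above have a direct scalar analogue: taking the
low byte of a 16-bit AND.  The following sketch is only an illustration
of the arithmetic, with hypothetical helper names; it is not the
simplify-rtx.cc code and does not model the SVE predicate case.

```cpp
#include <cassert>
#include <cstdint>

// Scalar analogue of (lowpart-subreg:QI (and:HI x C)): take the low
// byte of a 16-bit AND.  The name is hypothetical.
inline std::uint8_t low_byte_of_and(std::uint16_t x, std::uint16_t c)
{
  return static_cast<std::uint8_t>(x & c);
}

// Exhaustively check the two cases from the commit message.
inline void check_lowpart_and_examples()
{
  for (std::uint32_t i = 0; i <= 0xffff; ++i)
    {
      const std::uint16_t x = static_cast<std::uint16_t>(i);
      // 0x100 has no bits in the low byte, so the result is always zero.
      assert(low_byte_of_and(x, 0x100) == 0);
      // 0xff keeps every bit of the low byte, so the AND disappears.
      assert(low_byte_of_and(x, 0xff) == static_cast<std::uint8_t>(x));
    }
}
```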

gcc/
* simplify-rtx.cc (simplify_context::simplify_subreg): Distribute
lowpart subregs through AND/IOR/XOR, if doing so eliminates one
of the terms.
(test_scalar_int_ext_ops): Add some tests of the above for integers.
* config/aarch64/aarch64.cc (aarch64_test_sve_folding): Likewise
add tests for predicate modes.

41 hours agotestsuite: Generalise aarch64/saturating_arithmetic*.c
Richard Sandiford [Tue, 29 Jul 2025 14:58:33 +0000 (15:58 +0100)] 
testsuite: Generalise aarch64/saturating_arithmetic*.c

gcc.target/aarch64/saturating_arithmetic_{1,2}.c expect w0 and w1 to
be duplicated into vectors.  The tests expected the duplication of w1
to happen first, but the other order would be fine too.  A later
simplify-rtx.cc patch happens to change the order.

gcc/testsuite/
* gcc.target/aarch64/saturating_arithmetic_1.c: Allow w0 and w1
to be duplicated in either order.
* gcc.target/aarch64/saturating_arithmetic_2.c: Likewise.

41 hours agotestsuite: Make aarch64/cmpbr.c more forgiving
Richard Sandiford [Tue, 29 Jul 2025 14:58:33 +0000 (15:58 +0100)] 
testsuite: Make aarch64/cmpbr.c more forgiving

The 8-bit and 16-bit tests in cmpbr.c assumed an inverted operand
order ("w1, w0"), but it's possible to use the uninverted operand
order too.  This patch generalises the tests to support both forms.

This is a prerequisite for a later patch that adds a new
simplify-rtx.cc rule.

gcc/testsuite/
* gcc.target/aarch64/cmpbr.c: Support both operand orders
for 8-bit and 16-bit comparisons.

41 hours agoaarch64: Fix function_expander::get_reg_target
Richard Sandiford [Tue, 29 Jul 2025 14:58:32 +0000 (15:58 +0100)] 
aarch64: Fix function_expander::get_reg_target

function_expander::get_reg_target didn't actually check for a register,
meaning that it could return a memory target instead.  That doesn't
really matter for the current direct and indirect uses (svundef*,
svcreate*, and svset*) but it will for later patches.

gcc/
* config/aarch64/aarch64-sve-builtins.cc
(function_expander::get_reg_target): Check whether the target
is a valid register_operand.

41 hours ago[modula2] Tidyup remove unused local variables
Gaius Mulley [Tue, 29 Jul 2025 14:52:58 +0000 (15:52 +0100)] 
[modula2] Tidyup remove unused local variables

This patch removes unused local variables from three procedures.

gcc/m2/ChangeLog:

* gm2-compiler/M2GenGCC.mod (FoldBecomes): Remove all
local variables.
(CodeIndrX): Remove length.
Remove newstr.
* gm2-compiler/M2Range.mod (FoldTypeIndrX): Remove desType.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
42 hours agoasf: Fix case of multiple stores with base offset [PR120660]
Konstantinos Eleftheriou [Fri, 18 Jul 2025 11:46:41 +0000 (04:46 -0700)] 
asf: Fix case of multiple stores with base offset [PR120660]

When having multiple stores with the same offset as the load, in the
case that we are eliminating the load, we were generating a mov instruction
for both of them, leading to the overwrite of the register containing the
loaded value.

This patch fixes this issue by generating a mov instruction only for the
first store in the store-load sequence that has the same offset as the load.
For the next ones that might be encountered, we use bit-field insertion.

Bootstrapped/regtested on AArch64 and x86_64.

PR rtl-optimization/120660

gcc/ChangeLog:

* avoid-store-forwarding.cc (process_store_forwarding):
Fix instruction generation when having multiple stores with
base offset.

gcc/testsuite/ChangeLog:

* gcc.dg/pr120660.c: New test.

42 hours agolibstdc++: Test using range_format::map as format_kind.
Tomasz Kamiński [Tue, 29 Jul 2025 12:59:35 +0000 (14:59 +0200)] 
libstdc++: Test using range_format::map as format_kind.

This addresses a TODO from the test file.

libstdc++-v3/ChangeLog:

* testsuite/std/format/ranges/format_kind.cc: New test.

Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
42 hours agoRISC-V: Remove use of structured binding to fix compiler warning
Christoph Müllner [Mon, 28 Jul 2025 15:31:06 +0000 (17:31 +0200)] 
RISC-V: Remove use of structured binding to fix compiler warning

Function riscv_ext_is_subset () uses structured bindings to iterate over
all keys and values of an unordered map.  However, this is only
available since C++17 and causes a warning like this:
  warning: structured bindings only available with ‘-std=c++17’
This patch addresses the warning.
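
As an illustration of this kind of rewrite, the sketch below iterates
over an unordered map through the pair's first/second members instead
of a structured binding.  The function and its purpose are hypothetical
and are not taken from riscv-common.cc.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical helper, not taken from riscv-common.cc: sum the values
// of a map.
// C++17 form (warns when building with an older -std=):
//   for (const auto &[key, value] : m) total += value;
// Pre-C++17 form: iterate over the pair and use first/second.
inline int sum_values(const std::unordered_map<std::string, int> &m)
{
  int total = 0;
  for (const auto &entry : m)
    {
      const std::string &key = entry.first;
      const int value = entry.second;
      (void) key; // Unused here; shown only to mirror the binding.
      total += value;
    }
  return total;
}

// Small usage example.
inline void sum_values_example()
{
  const std::unordered_map<std::string, int> m{{"a", 1}, {"b", 2}};
  assert(sum_values(m) == 3);
}
```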

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc (riscv_ext_is_subset):
Remove use of structured binding to fix compiler warning.

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
43 hours agoasf: Skip when an instruction doesn't satisfy the constraints [PR119795]
Konstantinos Eleftheriou [Wed, 25 Jun 2025 11:24:52 +0000 (13:24 +0200)] 
asf: Skip when an instruction doesn't satisfy the constraints [PR119795]

While scanning the instructions and upon reaching an instruction that
doesn't satisfy the constraints that we have set, we were removing the
already detected stores, but we were continuing to add stores from that
point onward. This was causing issues when the address ranges from later
stores overlapped with the load's address, leading to partial and wrong
update of the register containing the loaded value.

With this patch, we are skipping the transformation for stores that operate
on the load's address range, when stores that operate on the same range
have been deleted due to constraint violations.

PR rtl-optimization/119795

gcc/ChangeLog:

* avoid-store-forwarding.cc
(store_forwarding_analyzer::avoid_store_forwarding): Skip
transformations for stores that operate on the same address
range as deleted ones.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr119795.c: New test.

43 hours agoRISC-V: Add test cases for mul based unsigned scalar SAT_MUL
Pan Li [Sat, 26 Jul 2025 08:38:23 +0000 (16:38 +0800)] 
RISC-V: Add test cases for mul based unsigned scalar SAT_MUL

Add run and tree-optimized check for mul based unsigned scalar SAT_MUL
instead of the widen_mul.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat/sat_u_mul-run-1-u16-from-u64.c: Add rv64
target for run.
* gcc.target/riscv/sat/sat_u_mul-run-1-u32-from-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_mul-run-1-u8-from-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_mul-1-u16-from-u32.c: New test.
* gcc.target/riscv/sat/sat_u_mul-1-u8-from-u16.c: New test.
* gcc.target/riscv/sat/sat_u_mul-1-u8-from-u32.c: New test.
* gcc.target/riscv/sat/sat_u_mul-2-u16-from-u64.c: New test.
* gcc.target/riscv/sat/sat_u_mul-2-u32-from-u64.c: New test.
* gcc.target/riscv/sat/sat_u_mul-2-u8-from-u64.c: New test.
* gcc.target/riscv/sat/sat_u_mul-run-1-u16-from-u32.c: New test.
* gcc.target/riscv/sat/sat_u_mul-run-1-u8-from-u16.c: New test.
* gcc.target/riscv/sat/sat_u_mul-run-1-u8-from-u32.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
43 hours agoMatch: Introduce mul based pattern for unsigned SAT_MUL
Pan Li [Sat, 26 Jul 2025 08:32:08 +0000 (16:32 +0800)] 
Match: Introduce mul based pattern for unsigned SAT_MUL

Like the widen_mul based pattern, we would like to introduce the mul based
pattern as well.  The pattern is quite simple compared to the
widen_mul one, so it is added as a new pattern instead of via the for loop in match.pd.
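
For context, an unsigned saturating multiply clamps the product to the
maximum of the result type.  The sketch below multiplies in a wider
type with a plain multiplication and then clamps; the helper name is
hypothetical, and whether this exact source form is the one recognised
by the new match.pd pattern is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: u16 saturating multiply computed with an
// ordinary 32-bit multiplication, then clamped to the u16 maximum.
inline std::uint16_t sat_u_mul_u16(std::uint16_t a, std::uint16_t b)
{
  const std::uint32_t wide
    = static_cast<std::uint32_t>(a) * static_cast<std::uint32_t>(b);
  const std::uint32_t max = std::numeric_limits<std::uint16_t>::max();
  return static_cast<std::uint16_t>(wide > max ? max : wide);
}

// Small usage example.
inline void sat_u_mul_u16_examples()
{
  assert(sat_u_mul_u16(255, 255) == 65025); // fits, returned unchanged
  assert(sat_u_mul_u16(256, 256) == 65535); // 65536 saturates
  assert(sat_u_mul_u16(0, 65535) == 0);
}
```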

gcc/ChangeLog:

* match.pd: Add mul based unsigned SAT_MUL.

Signed-off-by: Pan Li <pan2.li@intel.com>
44 hours agoAnother testcase for PR120687
Richard Biener [Tue, 29 Jul 2025 12:01:46 +0000 (14:01 +0200)] 
Another testcase for PR120687

This shows reassoc is harmful even with len == 3.

PR tree-optimization/120687
* gcc.dg/vect/pr120687-3.c: New testcase.

45 hours agotestsuite: Fix C++14 test failure with modules test [PR121285]
Nathaniel Shead [Tue, 29 Jul 2025 11:20:03 +0000 (21:20 +1000)] 
testsuite: Fix C++14 test failure with modules test [PR121285]

I hadn't validated this test worked in C++14 before submitting, fixed
thusly.

PR testsuite/121285

gcc/testsuite/ChangeLog:

* g++.dg/modules/class-11_a.H: Make static_asserts valid for
C++14.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
46 hours agotree-optimization/120687 - avoid disturbing reduction chains in reassoc
Richard Biener [Tue, 29 Jul 2025 08:05:32 +0000 (10:05 +0200)] 
tree-optimization/120687 - avoid disturbing reduction chains in reassoc

Reassoc carefully ranks operands to form reduction chains for
vectorization so we are careful to not apply any width related
changes in the early pass.  Unfortunately we are not careful
enough.  The following gates fma related re-ordering and also
the >= 3 ops tail "optimization" which is the culprit here.

This does not fix the reported inefficient vectorization when
using signed integer reductions yet.

PR tree-optimization/120687
* tree-ssa-reassoc.cc (reassociate_bb): Do not disturb
the sorted operand order in the early pass.
* tree-vect-slp.cc (vect_analyze_slp): Dump when a detected
reduction chain fails SLP discovery.

* gcc.dg/vect/pr120687-1.c: New testcase.
* gcc.dg/vect/pr120687-2.c: Likewise.

2 days agoFix UB in string_slice::operator== (PR 121261)
Alfie Richards [Mon, 28 Jul 2025 13:32:45 +0000 (13:32 +0000)] 
Fix UB in string_slice::operator== (PR 121261)

This adds a nullptr check to fix a regression where it is possible to call
`memcmp (NULL, NULL, 0)` which is UB prior to C26.

This fixes the bootstrap-ubsan build.
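
The general shape of such a guard can be sketched as follows.  The
`slice` type and `slices_equal` function are hypothetical stand-ins,
not the string_slice class from vec.h, and returning early for empty
slices is just one way to avoid passing null pointers to memcmp.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a (pointer, length) string view; this is
// not the string_slice class from vec.h.
struct slice
{
  const char *ptr;
  std::size_t len;
};

inline bool slices_equal(const slice &a, const slice &b)
{
  if (a.len != b.len)
    return false;
  // Both slices are empty: nothing to compare, and memcmp is never
  // reached with (possibly null) pointers and a zero length.
  if (a.len == 0)
    return true;
  return std::memcmp(a.ptr, b.ptr, a.len) == 0;
}

// Small usage example.
inline void slices_equal_examples()
{
  assert(slices_equal(slice{nullptr, 0}, slice{nullptr, 0}));
  assert(slices_equal(slice{"abc", 3}, slice{"abc", 3}));
  assert(!slices_equal(slice{"abc", 3}, slice{"abd", 3}));
}
```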

gcc/ChangeLog:
PR middle-end/121261
* vec.h: Add null ptr check.

2 days agoPR modula2/121289 Poor warning location when using Wstyle option
Gaius Mulley [Tue, 29 Jul 2025 08:09:58 +0000 (09:09 +0100)] 
PR modula2/121289 Poor warning location when using Wstyle option

This patch adds a token location parameter to CheckVariableAgainstKeyword
and dependants ensuring that the warning is generated from the
token associated with the variable rather than the end of the statement.

gcc/m2/ChangeLog:

PR modula2/121289
* gm2-compiler/M2Students.def (CheckVariableAgainstKeyword): New
parameter tok.
* gm2-compiler/M2Students.mod (CheckVariableAgainstKeyword): New
parameter tok.
Pass tok to PerformVariableKeywordCheck.
(PerformVariableKeywordCheck): New parameter tok.
Pass tok to MetaErrorStringT0.
* gm2-compiler/P2SymBuild.mod (BuildVariable): Pass tok to
CheckVariableAgainstKeyword.
* gm2-libs-iso/LowLong.mod (except): Replace with ...
(exceptSrc): ... this.
* gm2-libs-iso/LowReal.mod (except): Replace with ...
(exceptSrc): ... this.
* gm2-libs-iso/LowShort.mod (except): Replace with ...
(exceptSrc): ... this.
* gm2-libs-iso/Processes.mod (Wait): Replace from with fromCor.
* gm2-libs-iso/RndFile.mod (EndPos): Replace end with endP.
* gm2-libs/SCmdArgs.mod (GetArg): Replace start with startPos.
Replace end with endPos.
(NArg): Replace start with startPos.
Replace end with endPos.

gcc/testsuite/ChangeLog:

PR modula2/121289
* gm2/warnings/style/fail/badvarname.mod: New test.
* gm2/warnings/style/fail/warnings-style-fail.exp: New test.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2 days agotestsuite: Restore dg-do run on pr116906 and pr78185 tests
Christophe Lyon [Mon, 26 May 2025 15:07:47 +0000 (15:07 +0000)] 
testsuite: Restore dg-do run on pr116906 and pr78185 tests

Commit r15-7152-g57b706d141b87c removed
/* { dg-do run { target *-*-linux* *-*-gnu* *-*-uclinux* } } */

from these tests, turning them into 'compile' only tests, even when
they could be executed.

This patch adds
/* { dg-do run } */

which is OK since the tests are correctly skipped if needed thanks to
the following effective-targets (alarm and signal).

With this patch we have again two entries for these tests on linux targets:
* compile (test for excess errors)
* execution test

gcc/testsuite/ChangeLog:
* gcc.dg/pr116906-1.c: Add 'dg-do run'.
* gcc.dg/pr116906-2.c: Likewise.
* gcc.dg/pr78185.c: Likewise.

2 days agocalls: Allow musttail calls to noreturn [PR121159]
Jakub Jelinek [Tue, 29 Jul 2025 07:49:55 +0000 (09:49 +0200)] 
calls: Allow musttail calls to noreturn [PR121159]

In the PR119483 r15-9003 change we've allowed musttail calls to noreturn
functions, after all the decision not to normally tail call noreturn
functions is not because it is not possible to tail call those, but because
it screws up backtraces.  As the following testcase shows, we've done that
only for functions not declared [[noreturn]]/_Noreturn but later on
discovered through IPA as noreturn.  Functions explicitly declared
[[noreturn]] have (for historical reasons) volatile FUNCTION_TYPE and
the FUNCTION_DECLs are volatile as well, so in order to support those
we shouldn't complain on ECF_NORETURN (we've stopped doing so for musttail
in PR119483) but also shouldn't complain about TYPE_VOLATILE on their
FUNCTION_TYPE (something that IPA doesn't change, I think it only sets
TREE_THIS_VOLATILE on the FUNCTION_DECL).  volatile on function type
really means noreturn as well, it has no other meaning.

2025-07-29  Jakub Jelinek  <jakub@redhat.com>

PR middle-end/121159
* calls.cc (can_implement_as_sibling_call_p): Don't reject declared
noreturn functions in musttail calls.

* c-c++-common/pr121159.c: New test.
* gcc.dg/plugin/must-tail-call-2.c (test_5): Don't expect an error.

2 days agooutput: Move a special # (256) to a new macro
Andrew Pinski [Tue, 29 Jul 2025 03:22:53 +0000 (20:22 -0700)] 
output: Move a special # (256) to a new macro

This is a followup to the review of the mergability of CSWTCH patch
located at https://gcc.gnu.org/pipermail/gcc-patches/2025-July/690810.html.
It moves the special # (256) to a macro so it is not used bare in the source
and only needs to be changed in one place.
This special # was added with r0-37392-g201556f0e00580, which added the original
mergeable section support to gcc.

Pushed as obvious after build and test on x86_64.

gcc/ChangeLog:

* output.h (MAX_ALIGN_MERGABLE): New define.
* tree-switch-conversion.cc (switch_conversion::build_one_array):
Use MAX_ALIGN_MERGABLE instead of 256.
* varasm.cc (mergeable_string_section): Likewise.
(mergeable_constant_section): Likewise.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
2 days agoImprove mergability of CSWTCH [PR120523]
Andrew Pinski [Fri, 25 Jul 2025 23:16:36 +0000 (16:16 -0700)] 
Improve mergability of CSWTCH [PR120523]

When I did r16-1067-gaa935ce40a7, I thought it would be
enough to mark the decl as mergeable to get it to merge on
all targets. It turns out a few things needed to be changed
to support it being mergeable on all targets.
The first thing is to improve the selection of the mergeable
section: instead of basing it on the DECL's mode, it
should be based on the size.
The second thing that needed to happen is to change the
alignment of the CSWTCH decl so that it is aligned to the next power
of 2 of the size if the size is less than 32 bytes
(the max mergeable size that is supported).
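
The alignment rule described above can be sketched in isolation as
follows.  Both helpers are hypothetical, work in bytes, and only mirror
the wording of this message; they are not the code in
tree-switch-conversion.cc, and the exact boundary handling there is an
assumption.

```cpp
#include <cassert>

// Hypothetical helper: smallest power of two that is >= n (1 for n == 0).
inline unsigned next_pow2(unsigned n)
{
  unsigned p = 1;
  while (p < n)
    p <<= 1;
  return p;
}

// Hypothetical helper: for objects smaller than 32 bytes, raise the
// alignment to the next power of two of the size; otherwise keep the
// existing alignment.  All quantities are in bytes.
inline unsigned cswtch_align_bytes(unsigned size_bytes, unsigned align_bytes)
{
  const unsigned max_mergeable_bytes = 32;
  if (size_bytes >= max_mergeable_bytes)
    return align_bytes;
  const unsigned wanted = next_pow2(size_bytes);
  return wanted > align_bytes ? wanted : align_bytes;
}

// Small usage example.
inline void cswtch_align_examples()
{
  assert(cswtch_align_bytes(12, 4) == 16); // 12 rounds up to 16
  assert(cswtch_align_bytes(16, 4) == 16); // already a power of two
  assert(cswtch_align_bytes(40, 8) == 8);  // too large, left alone
}
```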

With these changes, cswtch-6.c passes on ia32 and other targets.
And the new testcase cswtch-7.c will pass now too.

Note I noticed that Darwin's darwin_mergeable_constant_section could
be "fixed" up to use DECL_SIZE instead of DECL_MODE, but I am not
sure it makes a huge difference.

Bootstrapped and tested on x86_64-linux-gnu.

PR middle-end/120523
gcc/ChangeLog:

* output.h (mergeable_constant_section): New declaration taking
unsigned HOST_WIDE_INT for the size.
* tree-switch-conversion.cc (switch_conversion::build_one_array):
Increase the alignment of CSWTCH for sizes less than 32 bytes.
* varasm.cc (mergeable_constant_section): Split in two:
one that takes the size as an unsigned HOST_WIDE_INT and the
other that takes the size as a tree.
(default_elf_select_section): Pass DECL_SIZE instead of
DECL_MODE to mergeable_constant_section.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/cswtch-7.c: New test.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
2 days agoUn-factor vectorizable_load parts
Richard Biener [Mon, 28 Jul 2025 13:04:01 +0000 (15:04 +0200)] 
Un-factor vectorizable_load parts

When the costing refactoring happened we ended up with some strange
inter-mixing of VMAT unrelated code.  The following moves stuff
closer to where it's actually used, at the expense of duplicating
some lines.

* tree-vect-stmts.cc (vectorizable_load): Un-factor VMAT
specific code to their handling blocks.

2 days agoEliminate gather-scatter-info offset_dt member
Richard Biener [Mon, 28 Jul 2025 12:22:44 +0000 (14:22 +0200)] 
Eliminate gather-scatter-info offset_dt member

The following removes this only set member.  Slightly complicated
by the hoops get_group_load_store_type jumps through.  I've simplified
that, noting the offset vector type that's relevant is that of the
actual offset SLP node, not of what vect_check_gather_scatter (re-)computes.

* tree-vectorizer.h (gather_scatter_info::offset_dt): Remove.
* tree-vect-data-refs.cc (vect_describe_gather_scatter_call):
Do not set it.
(vect_check_gather_scatter): Likewise.
* tree-vect-stmts.cc (vect_truncate_gather_scatter_offset):
Likewise.
(get_group_load_store_type): Use the vector type of the offset
SLP child.  Do not re-check vect_is_simple_use validated by
SLP build.

2 days agoDaily bump.
GCC Administrator [Tue, 29 Jul 2025 00:19:24 +0000 (00:19 +0000)] 
Daily bump.

2 days agoAVR: target/121277 - Don't load 0x800000 with const __flashx *x = NULL.
Georg-Johann Lay [Mon, 28 Jul 2025 19:44:06 +0000 (21:44 +0200)] 
AVR: target/121277 - Don't load 0x800000 with const __flashx *x = NULL.

Converting from generic AS to __flashx used the same rule as
for __memx, which tags RAM (generic AS) locations by setting bit 23.
The justification was that generic isn't a subset of __flashx, though
that led to surprises with code like const __flashx *x = NULL.

The natural thing to do is to just load 0x000000 in that case,
so that the null pointer works in __flashx as expected.

Apart from that, converting NULL to __flashx (or __flash) no longer
raises a -Waddr-space-convert diagnostic.

gcc/
PR target/121277
* config/avr/avr.cc (avr_addr_space_convert): When converting
from generic AS to __flashx, don't set bit 23.
(avr_convert_to_type): Don't -Waddr-space-convert when NULL
is converted to __flashx or to __flash.

2 days agoifcvt: Fix ifcvt for multiple phi nodes after factoring operator [PR121236]
Andrew Pinski [Fri, 25 Jul 2025 20:54:32 +0000 (13:54 -0700)] 
ifcvt: Fix ifcvt for multiple phi nodes after factoring operator [PR121236]

When I added the factor operations to ifcvt, I messed up the handling of removing
the phi nodes. The fix is that we need to remove the phi node that was factored out
as we factored out the operator, because otherwise scev can go wrong when it comes
to detecting if the new args are from a reduction.

We also need to change the interface for is_cond_scalar_reduction, as the
phi node that was being passed no longer exists after the factoring, so we need
to pass the parts that were being used.

PR tree-optimization/121236

gcc/ChangeLog:

* tree-if-conv.cc (is_cond_scalar_reduction): Instead of phi argument,
pass bb and res of the phi.
(factor_out_operators): Add iterator for the phi. Remove the phi
if this is the first time. Return if we had removed the phi.
(predicate_scalar_phi): Add the phi iterator argument.
Update call to is_cond_scalar_reduction.
Update call to factor_out_operators and set the return value to true
when factor_out_operators returns true.
(predicate_all_scalar_phis): Don't remove the phi if predicate_scalar_phi
already removed it.

gcc/testsuite/ChangeLog:

* gcc.dg/torture/pr121236-1.c: New test.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
2 days agox86: Disallow -mtls-dialect=gnu with no_caller_saved_registers
H.J. Lu [Thu, 24 Jul 2025 14:38:13 +0000 (07:38 -0700)] 
x86: Disallow -mtls-dialect=gnu with no_caller_saved_registers

__tls_get_addr doesn't preserve vector registers.  When a function
with no_caller_saved_registers attribute calls __tls_get_addr, YMM
and ZMM registers will be clobbered.  Issue an error and suggest
-mtls-dialect=gnu2 in this case.

gcc/

PR target/121208
* config/i386/i386.cc (ix86_tls_get_addr): Issue an error for
-mtls-dialect=gnu with no_caller_saved_registers attribute and
suggest -mtls-dialect=gnu2.

gcc/testsuite/

PR target/121208
* gcc.target/i386/pr121208-1a.c: New test.
* gcc.target/i386/pr121208-1b.c: Likewise.
* gcc.target/i386/pr121208-2a.c: Likewise.
* gcc.target/i386/pr121208-2b.c: Likewise.
* gcc.target/i386/pr121208-3a.c: Likewise.
* gcc.target/i386/pr121208-3b.c: Likewise.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2 days agolibstdc++: Teach std::distance and std::advance about C++20 iterators [PR102181]
Jonathan Wakely [Fri, 18 Jul 2025 17:42:20 +0000 (18:42 +0100)] 
libstdc++: Teach std::distance and std::advance about C++20 iterators [PR102181]

When the C++98 std::distance and std::advance functions (and C++11
std::next and std::prev) are used with C++20 iterators there can be
unexpected results, ranging from compilation failure to decreased
performance to undefined behaviour.

An iterator which satisfies std::input_iterator but does not meet the
Cpp17InputIterator requirements might have std::output_iterator_tag for
its std::iterator_traits<I>::iterator_category, which means it currently
cannot be used with std::advance at all. However, the implementation of
std::advance for a Cpp17InputIterator doesn't do anything that isn't
valid for iterator types satisfying C++20 std::input_iterator.

Similarly, a type satisfying C++20 std::bidirectional_iterator might be
usable with std::prev, if it weren't for the fact that its C++17
iterator_category is std::input_iterator_tag.

Finally, a type satisfying C++20 std::random_access_iterator might use a
slower implementation for std::distance or std::advance if its C++17
iterator_category is not std::random_access_iterator_tag.

This commit adds a __promotable_iterator concept to detect C++20
iterators which explicitly define an iterator_concept member, and which
either have no iterator_category, or their iterator_category is weaker
than their iterator_concept. This is used by std::distance and
std::advance to detect iterators which should dispatch based on their
iterator_concept instead of their iterator_category. This means that
those functions just work and do the right thing for C++20 iterators
which would otherwise fail to compile or have suboptimal performance.
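
The dispatch idea can be sketched in a few lines. This is a simplified
illustration, not the libstdc++ implementation: it uses the C++17 detection
idiom instead of a C++20 concept, it prefers iterator_concept whenever the
member exists (the commit only does so when the concept is stronger than the
category), and dispatch_tag, sketch_distance and counting_iterator are
hypothetical names.

```cpp
#include <cstddef>
#include <iterator>
#include <type_traits>

// Counts increments so a caller can see which code path ran.
static int g_increments = 0;

// Tag to dispatch on: the classic iterator_category by default ...
template <typename It, typename = void>
struct dispatch_tag {
    using type = typename std::iterator_traits<It>::iterator_category;
};

// ... or iterator_concept if the iterator type declares one.
template <typename It>
struct dispatch_tag<It, std::void_t<typename It::iterator_concept>> {
    using type = typename It::iterator_concept;
};

// Linear fallback: walk from first to last.
template <typename It>
std::ptrdiff_t distance_impl(It first, It last, std::input_iterator_tag) {
    std::ptrdiff_t n = 0;
    for (; first != last; ++first)
        ++n;
    return n;
}

// Constant-time path for random access iterators.
template <typename It>
std::ptrdiff_t distance_impl(It first, It last, std::random_access_iterator_tag) {
    return last - first;
}

template <typename It>
std::ptrdiff_t sketch_distance(It first, It last) {
    return distance_impl(first, last, typename dispatch_tag<It>::type{});
}

// Hypothetical iterator: operator* returns by value, so its C++17 category
// is only input_iterator_tag, while its C++20 concept is random access.
struct counting_iterator {
    using value_type = int;
    using difference_type = std::ptrdiff_t;
    using pointer = void;
    using reference = int;
    using iterator_category = std::input_iterator_tag;
    using iterator_concept = std::random_access_iterator_tag;

    int value;
    explicit counting_iterator(int v) : value(v) {}
    int operator*() const { return value; }
    counting_iterator& operator++() { ++value; ++g_increments; return *this; }
    friend std::ptrdiff_t operator-(counting_iterator a, counting_iterator b) {
        return a.value - b.value;
    }
    friend bool operator!=(counting_iterator a, counting_iterator b) {
        return a.value != b.value;
    }
};
```

Calling sketch_distance(counting_iterator{0}, counting_iterator{1000}) takes
the subtraction path and leaves g_increments untouched, whereas dispatching on
iterator_category alone would have stepped through all 1000 elements.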

This is related to LWG 3197, which considers making it undefined to use
std::prev with types which do not meet the Cpp17BidirectionalIterator
requirements.  I think making it work, as in this commit, is a better
solution than banning it (or rejecting it at compile-time as libc++
does).

PR libstdc++/102181

libstdc++-v3/ChangeLog:

* include/bits/stl_iterator_base_funcs.h (distance, advance):
Check C++20 iterator concepts and handle appropriately.
(__detail::__iter_category_converts_to_concept): New concept.
(__detail::__promotable_iterator): New concept.
* testsuite/24_iterators/operations/cxx20_iterators.cc: New
test.

Reviewed-by: Patrick Palka <ppalka@redhat.com>
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
2 days agogit_commit.py: add "diagnostics" to bug components
David Malcolm [Mon, 28 Jul 2025 14:56:07 +0000 (10:56 -0400)] 
git_commit.py: add "diagnostics" to bug components

contrib/ChangeLog
* gcc-changelog/git_commit.py: Add "diagnostics" to bug
components.

2 days agorestore bootstrap with --enable-checking=release [PR121260]
Mikael Pettersson [Mon, 28 Jul 2025 11:44:46 +0000 (13:44 +0200)] 
restore bootstrap with --enable-checking=release [PR121260]

Current trunk doesn't bootstrap with --enable-checking=release
due to improper nesting of namespaces and #if CHECKING_P blocks.
This corrects that.
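
As an illustration of the kind of nesting that stays balanced in both
configurations, here is a minimal sketch. The namespace and function names
are hypothetical and the actual breakage in the listed files may have looked
different. The point is only that every namespace opened inside an
#if CHECKING_P block must also be closed inside it.

```cpp
// Models --enable-checking=release when nothing else defines the macro.
#ifndef CHECKING_P
#define CHECKING_P 0
#endif

namespace diagnostics_sketch {

int answer() { return 42; }

#if CHECKING_P
// Opened and closed within the same conditional block.  If the closing
// brace sat after the #endif instead, the release configuration would see
// one '}' too many and fail to compile.
namespace selftest {
int run() { return answer() == 42 ? 0 : 1; }
} // namespace selftest
#endif // CHECKING_P

} // namespace diagnostics_sketch
```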

gcc/
PR other/121260
* diagnostics/changes.cc: Correct nesting of namespaces
and #if CHECKING_P blocks.
* diagnostics/context.cc: Likewise.
* diagnostics/html-sink.cc: Likewise.
* diagnostics/output-spec.cc: Likewise.
* diagnostics/sarif-sink.cc: Likewise.

Signed-off-by: Mikael Pettersson <mikpelinux@gmail.com>
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2 days agonvptx/nvptx.opt: Update -march-map= for newer sm_xxx: test cases
Thomas Schwinge [Mon, 28 Jul 2025 13:55:24 +0000 (15:55 +0200)] 
nvptx/nvptx.opt: Update -march-map= for newer sm_xxx: test cases

Test cases for commit 60ba2b61af23e6d561c5cbab8df57ea093ade3b3
"nvptx/nvptx.opt: Update -march-map= for newer sm_xxx".

gcc/testsuite/
* gcc.target/nvptx/march-map=sm_100.c: New.
* gcc.target/nvptx/march-map=sm_100a.c: Likewise.
* gcc.target/nvptx/march-map=sm_100f.c: Likewise.
* gcc.target/nvptx/march-map=sm_101.c: Likewise.
* gcc.target/nvptx/march-map=sm_101a.c: Likewise.
* gcc.target/nvptx/march-map=sm_101f.c: Likewise.
* gcc.target/nvptx/march-map=sm_103.c: Likewise.
* gcc.target/nvptx/march-map=sm_103a.c: Likewise.
* gcc.target/nvptx/march-map=sm_103f.c: Likewise.
* gcc.target/nvptx/march-map=sm_120.c: Likewise.
* gcc.target/nvptx/march-map=sm_120a.c: Likewise.
* gcc.target/nvptx/march-map=sm_120f.c: Likewise.
* gcc.target/nvptx/march-map=sm_121.c: Likewise.
* gcc.target/nvptx/march-map=sm_121a.c: Likewise.
* gcc.target/nvptx/march-map=sm_121f.c: Likewise.

2 days agonvptx/nvptx.opt: Update -march-map= for newer sm_xxx
Tobias Burnus [Mon, 12 May 2025 15:12:36 +0000 (17:12 +0200)] 
nvptx/nvptx.opt: Update -march-map= for newer sm_xxx

Usage of the -march-map=: "Select the closest available '-march=' value
that is not more capable."

As PTX ISA 8.6/8.7 (= unreleased CUDA 12.7 + CUDA 12.8) added the
Nvidia Blackwell GPUs SM_100, SM_101, and SM_120, it makes sense to
add them as well. Note that all three come as sm_XXX and sm_XXXa.

PTX ISA 8.8 (CUDA 12.9) added SM_103 and SM_121 and the new 'f' suffix
for all SM_1xx.

Internally, GCC currently generates the same code for >= sm_80 (Ampere);
however, as GCC's -march= also supports sm_89 (Ada), the sm_1xx values
(Blackwell) added here will map to sm_89.

[Naming note: while ptx code generated for sm_X can also run with sm_Y
if Y > X, code generated for sm_XXXa can (generally) only run on
the specific hardware; and sm_XXXf implies compatibility with only
subsequent targets in the same family.]
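
The selection rule quoted above ("closest available value that is not more
capable") can be sketched as follows. This is only a model: the ChangeLog
suggests that nvptx.opt lists each mapping explicitly rather than computing
it, the list of available values below is an assumption made for the
example, and the 'a'/'f' suffixes are ignored.

```cpp
// Assumed set of supported -march= values, in ascending order.
constexpr int kAvailable[] = {30, 35, 53, 70, 75, 80, 89};

// Returns the largest available sm value that does not exceed the
// requested one, or -1 if every available value is more capable.
constexpr int map_march(int requested_sm) {
    int best = -1;
    for (int sm : kAvailable)
        if (sm <= requested_sm)
            best = sm;
    return best;
}

static_assert(map_march(100) == 89, "Blackwell maps to sm_89 in this model");
static_assert(map_march(86) == 80, "sm_86 falls back to sm_80 in this model");
```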

gcc/ChangeLog:

* config/nvptx/nvptx.opt (march-map=): Add sm_100{,f,a},
sm_101{,f,a}, sm_103{,a,f}, sm_120{,a,f} and sm_121{,f,a}.

2 days agogcn: Fix CDNA3 atomics' buffer invalidation
Tobias Burnus [Mon, 28 Jul 2025 13:45:06 +0000 (15:45 +0200)] 
gcn: Fix CDNA3 atomics' buffer invalidation

For device (agent) scope atomics, as needed when there is more than one team,
a buffer_wbl2 followed by s_waitcnt is required. During the initial porting,
the pre-atomic instruction was accidentally replaced by buffer_inv sc1, which is
not quite the right instruction.

gcc/ChangeLog:

* config/gcn/gcn.md (atomic_load, atomic_store, atomic_exchange):
Fix CDNA3 L2 cache write-back before atomic instructions.

2 days agoConst correctness for gather-scatter info
Richard Biener [Mon, 28 Jul 2025 11:46:18 +0000 (13:46 +0200)] 
Const correctness for gather-scatter info

The following adds const qualification to gather_scatter_info *
parameters for various APIs in the vectorizer.

* tree-vect-stmts.cc (check_load_store_for_partial_vectors):
Make *gs_info const.
(vect_build_one_gather_load_call): Likewise.
(vect_build_one_scatter_store_call): Likewise.
(vect_get_gather_scatter_ops): Likewise.
(vect_get_strided_load_store_ops): Likewise.

2 days agogcn: Add more s_nop for MI300
Tobias Burnus [Mon, 28 Jul 2025 13:15:24 +0000 (15:15 +0200)] 
gcn: Add more s_nop for MI300

Implement another case where the CDNA3 ISA documentation requires s_nop, and
add a comment explaining why another case does not need to be handled. Also
add one case where an s_nop is required by MI300A hardware but does not seem
to be mentioned in the CDNA3 ISA documentation.

gcc/ChangeLog:

* config/gcn/gcn.md  (define_attr "vcmp"): Add with values
vcmp/vcmpx/no.
(*movbi, cstoredi4.., cstore<mode>4): Set it.
* config/gcn/gcn-valu.md (vec_cmp<mode>...): Likewise.
* config/gcn/gcn.cc (gcn_cmpx_insn_p): Remove.
(gcn_md_reorg): Add two new conditions for MI300.

2 days agogcn: Add 'nops' insn, extend comments
Tobias Burnus [Mon, 28 Jul 2025 10:16:02 +0000 (12:16 +0200)] 
gcn: Add 'nops' insn, extend comments

Use 's_nops' with a number instead of multiple 's_nop' instructions when
manually adding 1 to 5 wait states. This helps with
the instruction cache and helps a tiny bit with PR119367 where
a two-byte variable overflows in the debugging location view handling.

Add a comment about 'sc0' to TARGET_GLC_NAME as for atomics it is
unrelated to the scope but to whether the result is stored; i.e.
using e.g. 'sc1' instead of 'sc0' will have undesired consequences!

Update the comment above print_operand_address to document 'R' and 'V';
those are used below as "Temporary hack.", but it makes sense to see
them in the list.

gcc/ChangeLog:

* config/gcn/gcn-opts.h (enum hsaco_attr_type): Add comment
about 'sc0'.
* config/gcn/gcn.cc (gcn_md_reorg): Use gen_nops instead of gen_nop.
(print_operand_address): Document 'R' and 'V' in the
pre-function comment as well.
* config/gcn/gcn.md (nops): Add.

2 days agolibstdc++: provide debug impl of P2697 ctor [PR119742]
Nathan Myers [Mon, 7 Jul 2025 20:54:26 +0000 (16:54 -0400)] 
libstdc++: provide debug impl of P2697 ctor [PR119742]

This adds the new bitset constructor from string_view
defined in P2697 to the debug version of the type.

libstdc++-v3/ChangeLog:
PR libstdc++/119742
* include/debug/bitset: Add new ctor.

2 days agotree-optimization/121256 - properly support SLP in vectorizable recurrence
Richard Biener [Sun, 27 Jul 2025 16:42:25 +0000 (18:42 +0200)] 
tree-optimization/121256 - properly support SLP in vectorizable recurrence

We failed to build the correct initialization vector.  For VLA
vectors and a non-uniform initialization vector this rejects
vectorization for now.

PR tree-optimization/121256
* tree-vect-loop.cc (vectorizable_recurr): Build a correct
initialization vector for SLP_TREE_LANES > 1.

* gcc.dg/vect/vect-recurr-pr121256.c: New testcase.
* gcc.dg/vect/vect-recurr-pr121256-2.c: Likewise.

2 days agolibstdc++: Fix style issues in <mdspan>.
Luc Grosheintz [Sun, 27 Jul 2025 12:40:10 +0000 (14:40 +0200)] 
libstdc++: Fix style issues in <mdspan>.

libstdc++-v3/ChangeLog:

* include/std/mdspan: Small stylistic adjustments.

Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com>
2 days agoMove STMT_VINFO_TYPE to SLP_TREE_TYPE
Richard Biener [Sun, 27 Jul 2025 16:42:18 +0000 (18:42 +0200)] 
Move STMT_VINFO_TYPE to SLP_TREE_TYPE

I am at a point where I want to store additional information from
analysis (from loads and stores) to re-use them at transform stage
without repeating the analysis.  I do not want to add to
stmt_vec_info at this point, so this starts adding kind specific
sub-structures by moving the STMT_VINFO_TYPE field to the SLP
tree and adding a (dummy for now) union tagged by it to receive
such data.

The change is largely mechanical after RISC-V has been prepared
to have a SLP node around.

I have settled for a union (supposed to get pointers to data).
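
A minimal sketch of the shape described here, a kind tag plus a union of
pointers to kind-specific data. All names (node_kind, load_data,
slp_node_sketch and so on) are hypothetical and do not reflect the real
_slp_tree layout.

```cpp
// Hypothetical kinds; the real tag is the former STMT_VINFO_TYPE value.
enum class node_kind { undef, load, store };

// Hypothetical per-kind analysis results.
struct load_data { int alignment; };
struct store_data { bool masked; };

// The union only holds pointers; 'type' says which member is valid.
union payload {
    void* none;
    load_data* load;
    store_data* store;
};

struct slp_node_sketch {
    node_kind type = node_kind::undef;
    payload u{};  // zero-initializes the first member
};

// Reads the payload only after checking the tag.
inline int load_alignment_or(const slp_node_sketch& n, int fallback) {
    return (n.type == node_kind::load && n.u.load) ? n.u.load->alignment
                                                   : fallback;
}
```

Keeping the tag next to the union means the transform stage can pick up what
analysis stored without re-deriving the statement kind.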

As a follow-up, this enables getting rid of SLP_TREE_CODE and making
VEC_PERM therein a separate type, unifying its handling.

* tree-vectorizer.h (_slp_tree::type): Add.
(_slp_tree::u): Likewise.
(_stmt_vec_info::type): Remove.
(STMT_VINFO_TYPE): Likewise.
(SLP_TREE_TYPE): New.
* tree-vectorizer.cc (vec_info::new_stmt_vec_info): Do not
initialize type.
* tree-vect-slp.cc (_slp_tree::_slp_tree): Initialize type.
(vect_slp_analyze_node_operations): Adjust.
(vect_schedule_slp_node): Likewise.
* tree-vect-patterns.cc (vect_init_pattern_stmt): Do not
copy STMT_VINFO_TYPE.
* tree-vect-loop.cc: Set SLP_TREE_TYPE instead of
STMT_VINFO_TYPE everywhere.
(vect_create_loop_vinfo): Do not set STMT_VINFO_TYPE on
loop conditions.
* tree-vect-stmts.cc: Set SLP_TREE_TYPE instead of
STMT_VINFO_TYPE everywhere.
(vect_analyze_stmt): Adjust.
(vect_transform_stmt): Likewise.
* config/aarch64/aarch64.cc (aarch64_vector_costs::count_ops):
Access SLP_TREE_TYPE instead of STMT_VINFO_TYPE.
* config/i386/i386.cc (ix86_vector_costs::add_stmt_cost):
Remove non-SLP element-wise load/store matching.
* config/rs6000/rs6000.cc
(rs6000_cost_data::update_target_cost_per_stmt): Pass in
the SLP node.  Use that to get at the memory access
kind and type.
(rs6000_cost_data::add_stmt_cost): Pass down SLP node.
* config/riscv/riscv-vector-costs.cc (variable_vectorized_p):
Use SLP_TREE_TYPE.
(costs::need_additional_vector_vars_p): Likewise.
(costs::update_local_live_ranges): Likewise.

2 days agoada: Minor typo fix in comment
Marc Poulhiès [Thu, 17 Jul 2025 13:00:25 +0000 (15:00 +0200)] 
ada: Minor typo fix in comment

gcc/ada/ChangeLog:

* gcc-interface/trans.cc (gnat_to_gnu): Fix typo in comment.

3 days agoaarch64: Add tuning model for Olympus core.
Jennifer Schmitz [Mon, 21 Jul 2025 17:07:20 +0000 (10:07 -0700)] 
aarch64: Add tuning model for Olympus core.

This patch adds a new tuning model for the NVIDIA Olympus core.
The values used here are based on the Software Optimization Guide
that will be published imminently.

Bootstrapped and tested on aarch64-linux-gnu, no regression.

OK for trunk?
OK to backport to GCC 15?

Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com>
Co-Authored-By: Dhruv Chawla <dhruvc@nvidia.com>
gcc/ChangeLog:

* config/aarch64/aarch64-cores.def (olympus): Use olympus tuning
model.
* config/aarch64/aarch64.cc: Include olympus.h.
* config/aarch64/tuning_models/olympus.h: New file.

3 days agolibstdc++: Refactor tests for mdspan related accessors.
Luc Grosheintz [Sun, 27 Jul 2025 13:37:56 +0000 (15:37 +0200)] 
libstdc++: Refactor tests for mdspan related accessors.

Versions 1, 2 and 3 of the patch for adding aligned_accessor had a
bug in the constraints that allowed conversion of

  aligned_accessor<T, N> a = aligned_accessor<const T, N>{};

and prevented the reverse.

The file mdspan/accessors/generic.cc already contains code that checks
all variations of the constraint. This commit allows passing in two
different accessors, enabling it to be reused more widely.
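
The intended direction of the constraint can be sketched with a stripped-down
accessor. This is an illustration only: accessor_sketch is a hypothetical
type, the alignment parameter N is left out, and the array-pointer
convertibility check is the idiom that std::default_accessor specifies for
its converting constructor.

```cpp
#include <type_traits>

template <typename T>
struct accessor_sketch {
    accessor_sketch() = default;

    // Converting constructor: allowed only if U(*)[] converts to T(*)[],
    // i.e. when the conversion merely adds cv-qualification.
    template <typename U,
              typename = std::enable_if_t<
                  std::is_convertible_v<U (*)[], T (*)[]>>>
    accessor_sketch(const accessor_sketch<U>&) noexcept {}
};

// Adding const is fine ...
static_assert(
    std::is_convertible_v<accessor_sketch<int>, accessor_sketch<const int>>,
    "T -> const T must be accepted");
// ... dropping const is what the buggy versions accidentally allowed.
static_assert(
    !std::is_convertible_v<accessor_sketch<const int>, accessor_sketch<int>>,
    "const T -> T must be rejected");
```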

libstdc++-v3/ChangeLog:

* testsuite/23_containers/mdspan/accessors/generic.cc: Refactor
test_ctor.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com>
3 days agolibstdc++: Support braces as arguments for std::erase on inplace_vector [PR121196]
Tomasz Kamiński [Fri, 25 Jul 2025 12:50:26 +0000 (14:50 +0200)] 
libstdc++: Support braces as arguments for std::erase on inplace_vector [PR121196]

PR libstdc++/121196

libstdc++-v3/ChangeLog:

* include/std/inplace_vector (std::erase): Provide default argument
for _Up parameter.
* testsuite/23_containers/inplace_vector/erasure.cc: Add test for
using braced-init-list as arguments to erase_if and use function
to verify content of inplace_vector.

Reviewed-by: Patrick Palka <ppalka@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
3 days agoLoongArch: Remove the definition of CASE_VECTOR_SHORTEN_MODE.
Lulu Cheng [Thu, 24 Jul 2025 11:07:25 +0000 (19:07 +0800)] 
LoongArch: Remove the definition of CASE_VECTOR_SHORTEN_MODE.

On LoongArch, the switch jump-table always stores absolute
addresses, so there is no need to define the macro
CASE_VECTOR_SHORTEN_MODE.

gcc/ChangeLog:

* config/loongarch/loongarch.h
(CASE_VECTOR_SHORTEN_MODE): Delete.

3 days agoxtensa: Fix remaining inaccuracies in xtensa_is_insn_L32R_p()
Takayuki 'January June' Suwa [Fri, 25 Jul 2025 01:40:42 +0000 (10:40 +0900)] 
xtensa: Fix remaining inaccuracies in xtensa_is_insn_L32R_p()

The previous fix also had some flaws:

- The TARGET_CONST16 check was a bit premature
- It didn't take into account the possibility of the RTL expression
   "(set (reg:SF gpr) (const_int))", especially when TARGET_AUTOLITPOOLS is
   configured

This patch fixes the above.

gcc/ChangeLog:

* config/xtensa/xtensa.cc (xtensa_is_insn_L32R_p):
Re-rewrite to more accurately capture insns that could be L32R machine
instructions wherever possible, and add comments that help understand
the intent of the process.

3 days agoDaily bump.
GCC Administrator [Mon, 28 Jul 2025 00:17:10 +0000 (00:17 +0000)] 
Daily bump.

3 days agofortran: Consistently use the same assignment reallocation condition [PR121185]
Mikael Morin [Sun, 27 Jul 2025 15:11:58 +0000 (17:11 +0200)] 
fortran: Consistently use the same assignment reallocation condition [PR121185]

This is a follow-up to:
r16-2248-gac8e536526393580bc9a4339bab2f8603eff8a47
fortran: Delay evaluation of array bounds after reallocation

That revision delayed the evaluation of array bounds, with changes in
two places: in the scalarizer where we save expressions without
evaluating their values to variables, and in the reallocation code where
we evaluate to variables the expressions previously saved.  The effect
should not have been visible in scalarized code, as the saving to a
variable was only delayed after reallocation.

Unfortunately, that is not actually the case: there are cases where
expressions that were saved to variables before the change are no
longer saved after it.  The reason for that is differing conditions guarding
the omission of the evaluation to variables in the scalarizer on one
hand, and the emission of reallocation code with the saving to variables
on the other hand.  There is an additional check that avoids the
emission of reallocation code if we can prove at compile time that both
sides of the assignment are conformable.

This change moves up the reallocation code condition definition, so that
it can be used as well to flag the left hand side array as
reallocatable, and omit the evaluation of expressions in the exact same
conditions where the reallocation code would catch those unevaluated
expressions.

An explicit call to gfc_fix_class_refs is added before the evaluation of
the reallocation code condition.  It was implicit before, by the call to
gfc_walk_expr.

This is not a correctness issue, but PR #121185, that made the problem
apparent, exhibited wrong code examples where the lack of an
intermediary variable was making visible a class container at the
beginning of an array reference, causing the non-polymorphic array
reference to be evaluated in a polymorphic way.

The preceding commits have already fixed the PR #121185 test, so I
haven't found any addition to the testsuite that would reliably test
this change.

PR fortran/121185

gcc/fortran/ChangeLog:

* trans-expr.cc (gfc_trans_assignment_1): Use the same condition
to set the is_alloc_lhs flag and to decide to generate
reallocation code.  Add explicit call to gfc_fix_class_refs
before evaluating the condition.

3 days agofortran: Trigger reference saving on pointer dereference [PR121185]
Mikael Morin [Sun, 27 Jul 2025 15:11:50 +0000 (17:11 +0200)] 
fortran: Trigger reference saving on pointer dereference [PR121185]

This is a follow-up to revision:
r16-2371-g8f41c87654fd819e48c9f6f1ac3d87e35794d310
fortran: Factor array descriptor references

That revision introduced new variables to limit repeated subexpressions
in array descriptor references.  The change added a walk along the
reference from child to parent, that selected subreferences worth
saving and applied the saving if the reference proved non-trivial
enough.  Trivialness was defined in a comment as: only made of a DECL
and NOPs and COMPONENTs.  But the case of a pointer dereference didn't
trigger the saving, so the code was also considering a dereference as if
it was trivial.

This change triggers the reference saving on pointer dereferences,
making the trivialness as defined by the code aligned with the comment.

This change is not strictly speaking a bug fix, but PR #121185 exhibited
wrong code examples where the lack of a variable hiding the polymorphic
leading part of a non-polymorphic array reference was causing the latter
to be evaluated in a polymorphic way.

PR fortran/121185

gcc/fortran/ChangeLog:

* trans-array.cc (set_factored_descriptor_value): Also trigger
the saving of the previously selected reference on encountering
an INDIRECT_REF.  Extract the saving code...
(save_ref): ... here as a new function.

gcc/testsuite/ChangeLog:

* gfortran.dg/assign_14.f90: New test.

3 days agofortran: Bound class container lookup after array descriptor [PR121185]
Mikael Morin [Sun, 27 Jul 2025 15:11:40 +0000 (17:11 +0200)] 
fortran: Bound class container lookup after array descriptor [PR121185]

Don't look for a class container too far after an array descriptor.
This avoids generating a polymorphic array reference, using the virtual
table of a parent object, to access a non-polymorphic child having a
type unrelated to that of the parent.

PR fortran/121185

gcc/fortran/ChangeLog:

* trans-expr.cc (gfc_get_class_from_expr): Give up class
container lookup on the second COMPONENT_REF after an array
descriptor.

gcc/testsuite/ChangeLog:

* gfortran.dg/assign_13.f90: New test.

4 days agoRISC-V: Add test case for vaadd.vx combine polluting VXRM
Pan Li [Fri, 25 Jul 2025 13:29:29 +0000 (21:29 +0800)] 
RISC-V: Add test case for vaadd.vx combine polluting VXRM

Add asm check to make sure vx combine of vaadd.vx will not pollute
the vxrm.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-fixed-vxrm-1-i16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-fixed-vxrm-1-i32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-fixed-vxrm-1-i64.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-fixed-vxrm-1-i8.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
4 days agoRISC-V: Add test for vec_duplicate + vaadd.vv combine case 1 with GR2VR cost 0, 1...
Pan Li [Fri, 25 Jul 2025 13:28:24 +0000 (21:28 +0800)] 
RISC-V: Add test for vec_duplicate + vaadd.vv combine case 1 with GR2VR cost 0, 1 and 2

Add asm dump check test for vec_duplicate + vaadd.vv combine to
vaadd.vx, with GR2VR costs of 0, 1 and 2.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add asm check.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i8.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>
4 days agoRISC-V: Add test for vec_duplicate + vaadd.vv combine case 0 with GR2VR cost 0, 1...
Pan Li [Fri, 25 Jul 2025 13:26:27 +0000 (21:26 +0800)] 
RISC-V: Add test for vec_duplicate + vaadd.vv combine case 0 with GR2VR cost 0, 1 and 15

Add asm dump check and run test for vec_duplicate + vaadd.vv
combine to vaadd.vx, with GR2VR costs of 0, 2 and 15.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add asm check.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_binary.h: Add test helper
macros.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_binary_data.h: Add test
data for run test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vaadd-run-1-i16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vaadd-run-1-i32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vaadd-run-1-i64.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vaadd-run-1-i8.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
4 days agoRISC-V: Combine vec_duplicate + vaadd.vv to vaadd.vx on GR2VR cost
Pan Li [Fri, 25 Jul 2025 13:22:47 +0000 (21:22 +0800)] 
RISC-V: Combine vec_duplicate + vaadd.vv to vaadd.vx on GR2VR cost

This patch combines vec_duplicate + vaadd.vv into vaadd.vx, as in the
example code below.  The related pattern depends on the cost of
vec_duplicate from GR2VR: late-combine performs the combination if the
GR2VR cost is zero and rejects it if the GR2VR cost is greater than zero.

Assume we have example code like the following, with a GR2VR cost of 0.

  #define DEF_AVG_FLOOR(NT, WT)        \
  NT                                   \
  test_##NT##_avg_floor(NT x, NT y)    \
  {                                    \
    return (NT)(((WT)x + (WT)y) >> 1); \
  }

  #define AVG_FLOOR_FUNC(T)      test_##T##_avg_floor

  DEF_AVG_FLOOR(int32_t, int64_t)
  DEF_VX_BINARY_CASE_2_WRAP(T, AVG_FLOOR_FUNC(T), avg_floor)

Before this patch:
      beq a3,zero,.L8
      vsetvli a5,zero,e32,m1,ta,ma
      vmv.v.x v2,a2
      slli    a3,a3,32
      srli    a3,a3,32
  .L3:
      vsetvli a5,a3,e32,m1,ta,ma
      vle32.v v1,0(a1)
      slli    a4,a5,2
      sub a3,a3,a5
      add a1,a1,a4
      vaadd.vv v1,v1,v2
      vse32.v v1,0(a0)
      add a0,a0,a4
      bne a3,zero,.L3

After this patch:
      beq a3,zero,.L8
      slli    a3,a3,32
      srli    a3,a3,32
  .L3:
      vsetvli a5,a3,e32,m1,ta,ma
      vle32.v v1,0(a1)
      slli    a4,a5,2
      sub a3,a3,a5
      add a1,a1,a4
      vaadd.vx v1,v1,a2
      vse32.v v1,0(a0)
      add a0,a0,a4
      bne a3,zero,.L3
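
For reference, this is the scalar computation behind the listings, written
out without the macros. test_int32_t_avg_floor is the expansion of
DEF_AVG_FLOOR(int32_t, int64_t) shown earlier. The loop wrapper avg_floor_vx
is a hypothetical stand-in for what DEF_VX_BINARY_CASE_2_WRAP generates,
which is not shown in this message.

```cpp
#include <cstddef>
#include <cstdint>

// Expansion of DEF_AVG_FLOOR(int32_t, int64_t): add in the wider type so
// the sum cannot overflow, then halve.  The shift of a negative sum relies
// on arithmetic right shift, which GCC provides (guaranteed since C++20).
inline std::int32_t test_int32_t_avg_floor(std::int32_t x, std::int32_t y) {
    return (std::int32_t)(((std::int64_t)x + (std::int64_t)y) >> 1);
}

// Hypothetical loop shape: every element is averaged with the same scalar
// x.  Broadcasting x is the vec_duplicate; using it directly is the .vx form.
inline void avg_floor_vx(std::int32_t* out, const std::int32_t* in,
                         std::int32_t x, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        out[i] = test_int32_t_avg_floor(in[i], x);
}
```

The widening matters at the extremes: averaging INT32_MAX with itself yields
INT32_MAX here, while a plain 32-bit addition would overflow.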

gcc/ChangeLog:

* config/riscv/riscv-v.cc (expand_vx_binary_vxrm_vec_vec_dup):
Add new case UNSPEC_VAADD.
(expand_vx_binary_vxrm_vec_dup_vec): Ditto.
* config/riscv/riscv.cc (riscv_rtx_costs): Ditto.
* config/riscv/vector-iterators.md: Add new case UNSPEC_VAADD to
iterator.

Signed-off-by: Pan Li <pan2.li@intel.com>
4 days agoRISC-V: Fix another vf FP16 combine run test failures
Pan Li [Fri, 25 Jul 2025 14:11:13 +0000 (22:11 +0800)] 
RISC-V: Fix another vf FP16 combine run test failures

As with Robin's fix for the vf combine f16.c run tests, there are still
some similar failures.  This patch fixes them in the same way.

Will commit it directly if the CI agrees.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfwnmacc-run-1-f16.c:
Add zvfh requirements and options.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfwnmsac-run-1-f16.c:
Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>
4 days agoDaily bump.
GCC Administrator [Sun, 27 Jul 2025 00:16:31 +0000 (00:16 +0000)] 
Daily bump.

4 days agoPrevent mixups of IDENTIFIER_TRANSPARENT_ALIAS and IDENTIFIER_INTERNAL_P better ...
Nathaniel Shead [Wed, 23 Jul 2025 23:41:00 +0000 (09:41 +1000)] 
Prevent mixups of IDENTIFIER_TRANSPARENT_ALIAS and IDENTIFIER_INTERNAL_P better [PR120855]

The assertion failure on ASM_OUTPUT_WEAKREF targets since my r16-1738
was caused because the 'TREE_CHAIN (id)' check in assemble_name_resolve
no longer implies that ID is a transparent alias, since internal
identifiers can have a TREE_CHAIN as well.

I still don't think it's possible for a transparent alias to be an
internal identifier in the sense added, so this patch simply constrains
the assertion better so that it doesn't fail spuriously.  I also added a
couple of other assertions to help validate this assumption.

PR middle-end/120855

gcc/ChangeLog:

* cgraphunit.cc (symbol_table::compile): Assert a transparent
alias is not an internal identifier.
* symtab.cc (symbol_table::change_decl_assembler_name):
Likewise.
* varasm.cc (assemble_name_resolve): Check for
IDENTIFIER_TRANSPARENT_ALIAS instead of just TREE_CHAIN.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>