git.ipfire.org Git - thirdparty/gcc.git/log
2 weeks ago  Rewrite assign_discriminators
Jan Hubicka [Fri, 11 Jul 2025 11:01:13 +0000 (13:01 +0200)] 
Rewrite assign_discriminators

To assign debug locations to corresponding statements auto-fdo uses
discriminators.  Documentation says that if a given statement belongs to
multiple basic blocks, the discriminator distinguishes them.

The current implementation however only works for statements that expand into a
sequence of gimple statements which forms a linear sequence, since it
essentially tracks a current location and renews it each time a new BB is found.
This is commonly not true for C++ code as in:

  <bb 25> :
  [simulator/csimplemodule.cc:379:85] _40 = std::__cxx11::basic_string<char>::c_str ([simulator/csimplemodule.cc:379:85] &D.80680);
  [simulator/csimplemodule.cc:379:85 discrim 13] _41 = [simulator/csimplemodule.cc:379:85] &this->D.78503.D.78106.D.72008.D.68585.D.67935.D.67879.D.67782;
  [simulator/csimplemodule.cc:379:85 discrim 13] _42 = &this->D.78503.D.78106.D.72008.D.68585.D.67935.D.67879.D.67782;
  [simulator/csimplemodule.cc:377:45] _43 = this->D.78503.D.78106.D.72008.D.68585.D.67935.D.67879.D.67782._vptr.cObject;
  [simulator/csimplemodule.cc:377:45] _44 = _43 + 40;
  [simulator/csimplemodule.cc:377:45] _45 = [simulator/csimplemodule.cc:377:45] *_44;
  [simulator/csimplemodule.cc:379:85] D.89001 = OBJ_TYPE_REF(_45;(const struct cObject)_42->5B) (_41);

This is a fragment of code that is expanded from:

371         if (this!=simulation.getContextModule())
372             throw cRuntimeError("send()/sendDelayed() of module (%s)%s called in the context of "
373                                 "module (%s)%s: method called from the latter module "
374                                 "lacks Enter_Method() or Enter_Method_Silent()? "
375                                 "Also, if message to be sent is passed from that module, "
376                                 "you'll need to call take(msg) after Enter_Method() as well",
377                                 getClassName(), getFullPath().c_str(),
378                                 simulation.getContextModule()->getClassName(),
379                                 simulation.getContextModule()->getFullPath().c_str());

Notice that 379:85 is interleaved with 377:45 and the pass does not assign a new discriminator.
With the patch we get:

  <bb 25> :
  [simulator/csimplemodule.cc:379:85 discrim 7] _40 = std::__cxx11::basic_string<char>::c_str ([simulator/csimplemodule.cc:379:85] &D.80680);
  [simulator/csimplemodule.cc:379:85 discrim 8] _41 = [simulator/csimplemodule.cc:379:85] &this->D.78503.D.78106.D.72008.D.68585.D.67935.D.67879.D.67782;
  [simulator/csimplemodule.cc:379:85 discrim 8] _42 = &this->D.78503.D.78106.D.72008.D.68585.D.67935.D.67879.D.67782;
  [simulator/csimplemodule.cc:377:45 discrim 1] _43 = this->D.78503.D.78106.D.72008.D.68585.D.67935.D.67879.D.67782._vptr.cObject;
  [simulator/csimplemodule.cc:377:45 discrim 1] _44 = _43 + 40;
  [simulator/csimplemodule.cc:377:45 discrim 1] _45 = [simulator/csimplemodule.cc:377:45] *_44;
  [simulator/csimplemodule.cc:379:85 discrim 8] D.89001 = OBJ_TYPE_REF(_45;(const struct cObject)_42->5B) (_41);

There are earlier statements with line number 379, which is why the call gets discriminator 7.
After that the discriminator is increased.  There are two reasons for it:
 1) AFDO requires every callsite to have a unique lineno:discriminator pair
 2) the call may not terminate and thus the profile of the first statement
    may be higher than the rest.
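
The policy described above can be modelled in a few lines.  This is only a
minimal sketch, not the code in tree-cfg.cc: Stmt, LineState and
assign_discriminators_sketch are hypothetical names, a location is reduced to
a line number, and numbering starts at 0, which need not match what GCC emits.
A line gets a fresh discriminator the first time it is seen in a basic block,
keeps it when it reappears later in the same block, and gets a new one after
every call.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical, simplified statement: basic block index, source line,
// whether it is a call, and the discriminator assigned to it.
struct Stmt {
  int bb;
  int line;
  bool is_call;
  int discriminator;
};

// Per-line bookkeeping (hypothetical): the last block that used the line,
// the discriminator currently in use, and the next free discriminator.
struct LineState {
  int last_bb = -1;
  int current = 0;
  int next = 0;
};

void assign_discriminators_sketch(std::vector<Stmt> &stmts) {
  std::map<int, LineState> state;
  for (Stmt &s : stmts) {
    assert(s.bb >= 0);
    LineState &ls = state[s.line];
    if (ls.last_bb != s.bb) {
      // First use of this line in this block: allocate a new discriminator.
      ls.current = ls.next++;
      ls.last_bb = s.bb;
    }
    // Interleaving with other lines inside the block does not change it.
    s.discriminator = ls.current;
    if (s.is_call) {
      // Statements after a call get their own discriminator, so every call
      // site has a unique line:discriminator pair.
      ls.current = ls.next++;
    }
  }
}
```

In this model, a block containing a call on line 379, another statement on
line 379, a statement on line 377 and then line 379 again keeps the same
discriminator for both post-call statements on line 379.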

The old pass also contained logic to skip debug statements.  This is not a good
idea since we output them to the debug output and if the AFDO tool picks these
locations up they will be misplaced in basic blocks.

Debug statements are naturally quite useful to track back the AFDO profiles
and in the meantime LLVM folks implemented something similar called pseudoprobe.
I think it makes sense to enable debug statements with -fauto-profile even if
debug info is off and make use of them as done in this patch.

Sadly the AFDO tool is quite broken and built around the assumption that every
address has at most one debug location assigned to it (i.e. debug info before
debug statements were introduced). I have a WIP patch fixing this.

Note that LLVM also has -fdebug-info-for-auto-profile (on by default it seems)
that controls discriminator production and some other little bits.  I wonder if
we want to have something similar.  Should it be -gdebug-info-for-auto-profile
instead?

gcc/ChangeLog:

* opts.cc (finish_options): Enable debug_nonbind_markers_p for
auto-profile.
* tree-cfg.cc (struct locus_discrim_map): Remove.
(struct locus_discrim_hasher): Remove.
(locus_discrim_hasher::hash): Remove.
(locus_discrim_hasher::equal): Remove.
(first_non_label_nondebug_stmt): Remove.
(build_gimple_cfg): Do not allocate discriminator tables.
(next_discriminator_for_locus): Remove.
(same_line_p): Remove.
(struct discrim_entry): New structure.
(assign_discriminator): Rewrite.
(assign_discriminators): Rewrite.

2 weeks ago  Fix ICE in speculative devirtualization
Jan Hubicka [Fri, 11 Jul 2025 10:37:24 +0000 (12:37 +0200)] 
Fix ICE in speculative devirtualization

This patch fixes an ICE building lto1 with autoprofiledbootstrap and in pr114790.
What happens is that auto-fdo speculatively devirtualizes to a wrong target.
This is due to a bug where it mixes up dwarf names and linkage names of inline
functions, which I need to fix as well.
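
For readers unfamiliar with the transformation, speculative devirtualization
can be pictured at the source level as a guarded direct call.  This is a
hedged analogy only: Base, Likely and call_f_speculatively are hypothetical
names, and GCC performs the transformation on call graph edges and GIMPLE
rather than through typeid.

```cpp
#include <cassert>
#include <typeinfo>

struct Base {
  virtual ~Base() = default;
  virtual int f() const { return 1; }
};

struct Likely : Base {
  int f() const override { return 2; }
};

// Source-level picture of a speculative call: guess the likely target,
// call it directly when the guess holds, otherwise fall back to the
// ordinary indirect (virtual) call.
int call_f_speculatively(const Base &b) {
  if (typeid(b) == typeid(Likely))
    return static_cast<const Likely &>(b).Likely::f();  // direct call
  return b.f();                                         // indirect fallback
}

// Whatever the guess, the result must match plain virtual dispatch.
void self_check() {
  Base b;
  Likely l;
  assert(call_f_speculatively(b) == b.f());
  assert(call_f_speculatively(l) == l.f());
}
```

A wrong guess only costs the fallback path in this picture.  The ICE concerns
the compiler's bookkeeping of the speculative and direct call edges, not the
semantics of the guarded call.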

Later we clone at WPA time. At ltrans time the clone is materialized and the
call is turned into a direct call (this optimization is missed by ipa-cp
propagation).  At this time we should resolve the speculation but we don't.
As a result we get an error from the verifier after inlining complaining that
there is a speculative call with the corresponding direct call lacking the
speculative flag.

This seems to be a long-standing problem in
cgraph_update_edges_for_call_stmt_node but I suppose it does not trigger since
we usually speculate correctly or notice the direct call at WPA time already.

Bootstrapped/regtested x86_64-linux.

gcc/ChangeLog:

PR ipa/114790
* cgraph.cc (cgraph_update_edges_for_call_stmt_node): Resolve devirtualization
if call statement was optimized out or turned to direct call.

gcc/testsuite/ChangeLog:

* g++.dg/lto/pr114790_0.C: New test.
* g++.dg/lto/pr114790_1.C: New test.

2 weeks ago  ipa: Disallow signature changes in fun->has_musttail functions [PR121023]
Jakub Jelinek [Fri, 11 Jul 2025 10:09:44 +0000 (12:09 +0200)] 
ipa: Disallow signature changes in fun->has_musttail functions [PR121023]

As the following testcase shows e.g. on ia32, letting IPA opts change
signature of functions which have [[{gnu,clang}::musttail]] calls
can turn programs that would be compiled normally into something
that is rejected because the caller has fewer argument stack slots
than the function being tail called.

The following patch prevents signature changes for such functions.
It is perhaps too big a hammer in some cases, but it might be hard
to try to figure out what signature changes are still acceptable and which
are not at IPA time.

2025-07-11  Jakub Jelinek  <jakub@redhat.com>
    Martin Jambor  <mjambor@suse.cz>

PR ipa/121023
* ipa-fnsummary.cc (compute_fn_summary): Disallow signature changes
on cfun->has_musttail functions.

* c-c++-common/musttail32.c: New test.

2 weeks ago  i386: Add a new peephole2 for PR91384 under APX_F
Hu, Lin1 [Tue, 8 Apr 2025 07:43:59 +0000 (15:43 +0800)] 
i386: Add a new peephole2 for PR91384 under APX_F

gcc/ChangeLog:

PR target/91384
* config/i386/i386.md: Add new peephole2 to optimize *negsi_1
followed by *cmpsi_ccno_1 with APX_F.

gcc/testsuite/ChangeLog:

PR target/91384
* gcc.target/i386/pr91384-1.c: New test.

2 weeks ago  properly compute fp/mode for scalar ops for vectorizer costing
Richard Biener [Thu, 10 Jul 2025 11:30:30 +0000 (13:30 +0200)] 
properly compute fp/mode for scalar ops for vectorizer costing

The x86 add_stmt_hook relies on the passed vectype to determine
the mode and whether it is FP for a scalar operation.  This is
unreliable now for stmts involving patterns and in the future when
there is no vector type passed for scalar operations.

To be least disruptive I've kept using the vector type if it is passed.

* config/i386/i386.cc (ix86_vector_costs::add_stmt_cost): Use
the LHS of a scalar stmt to determine mode and whether it is FP.

2 weeks ago  cobol: Fix build on 32-bit Darwin [PR120621]
Rainer Orth [Fri, 11 Jul 2025 07:56:18 +0000 (09:56 +0200)] 
cobol: Fix build on 32-bit Darwin [PR120621]

Bootstrapping trunk with 32-bit-default on Mac OS X 10.11
(i386-apple-darwin15) fails:

/vol/gcc/src/hg/master/local/gcc/cobol/lexio.cc: In static member function 'static void cdftext::process_file(filespan_t, int, bool)':
/vol/gcc/src/hg/master/local/gcc/cobol/lexio.cc:1859:14: error: format '%u' expects argument of type 'unsigned int', but argument 4 has type 'size_t' {aka 'long unsigned int'} [-Werror=format=]
 1859 |       dbgmsg("%s:%d: line " HOST_SIZE_T_PRINT_UNSIGNED ", opening %s on fd %d",
      |              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 1860 |              __func__, __LINE__,mfile.lineno(),
      |                                 ~~~~~~~~~~~~~~
      |                                             |
      |                                             size_t {aka long unsigned int}
In file included from /vol/gcc/src/hg/master/local/gcc/system.h:1244,
                 from /vol/gcc/src/hg/master/local/gcc/cobol/cobol-system.h:61,
                 from /vol/gcc/src/hg/master/local/gcc/cobol/lexio.cc:33:
/vol/gcc/src/hg/master/local/gcc/hwint.h:135:51: note: format string is defined here
  135 | #define HOST_SIZE_T_PRINT_UNSIGNED "%" GCC_PRISZ "u"
      |                                     ~~~~~~~~~~~~~~^
      |                                                   |
      |                                                   unsigned int
      |                                     %" GCC_PRISZ "lu

On Darwin, size_t is always long unsigned int.  However, unsigned int
and long unsigned int are both 32-bit, so hwint.h selects %u for the
format.  As documented there, the arg needs to be cast to fmt_size_t to
avoid the error.

This isn't an issue on other 32-bit platforms like Solaris/i386 or
Linux/i686 since they use unsigned int for size_t.

/vol/gcc/src/hg/master/local/gcc/cobol/parse.y: In function 'int yyparse()':
/vol/gcc/src/hg/master/local/gcc/cobol/parse.y:10215:36: error: format '%zu' expects argument of type 'size_t', but argument 4 has type 'int' [-Werror=format=]
10215 |                     error_msg(loc, "FUNCTION %qs has "
      |                                    ^~~~~~~~~~~~~~~~~~~
10216 |                               "inconsistent parameter type %zu (%qs)",
      |                               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
10217 |                               keyword_str($1), p - args.data(), name_of(p->field) );
      |                                                                ~~~~~~~~~~~~~~~
      |                                                                  |
      |                                                                  int

The arg (p - args.data()) is ptrdiff_t (int on 32-bit Darwin), while
the %zu format expects size_t (long unsigned int).  The patch therefore
casts the ptrdiff_t arg to long and prints it as such.
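
The cast-to-long idiom can be shown outside the compiler sources.  The
following is a minimal sketch using plain snprintf instead of GCC's error_msg.
describe_offset is a hypothetical helper and the message text is only
borrowed from the diagnostic above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Formats the distance between two pointers into the same array.  The
// ptrdiff_t value is cast to long and printed with %ld, so the format and
// the argument agree whether ptrdiff_t is int or long on the host.
std::string describe_offset(const int *base, const int *p) {
  assert(base != nullptr && p != nullptr);
  std::ptrdiff_t offset = p - base;
  char buf[64];
  std::snprintf(buf, sizeof buf, "inconsistent parameter type %ld",
                static_cast<long>(offset));
  return std::string(buf);
}
```

Casting to a fixed, known type sidesteps the host-dependent choice between
int and long for ptrdiff_t, at the cost of assuming the value fits in long.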

There are two more instances of the same problem:

/vol/gcc/src/hg/master/local/gcc/cobol/util.cc: In member function 'void cbl_field_t::report_invalid_initial_value(const YYLTYPE&) const':
/vol/gcc/src/hg/master/local/gcc/cobol/util.cc:905:80: error: format '%zu' expects argument of type 'size_t', but argument 6 has type 'int' [-Werror=format=]
  905 |                 error_msg(loc, "%s cannot represent VALUE %qs exactly (max %c%zu)",
      |                                                                              ~~^
      |                                                                                |
      |                                                                                long unsigned int
      |                                                                              %u
  906 |                           name, data.initial, '.', pend - p);
      |                                                    ~~~~~~~~
      |                                                         |
      |                                                         int

In file included from /vol/gcc/src/hg/master/local/gcc/cobol/scan.l:48:
/vol/gcc/src/hg/master/local/gcc/cobol/scan_ante.h: In function 'int numstr_of(const char*, radix_t)':
/vol/gcc/src/hg/master/local/gcc/cobol/scan_ante.h:152:25: error: format '%zu' expects argument of type 'size_t', but argument 4 has type 'int' [-Werror=format=]
  152 |       error_msg(yylloc, "significand of %s has more than 36 digits (%zu)", input, nx);
      |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~         ~~
      |                                                                                   |
      |                                                                                   int

Fixed in the same way.

Bootstrapped without regressions on i386-apple-darwin15,
x86_64-apple-darwin, i386-pc-solaris2.11, amd64-pc-solaris2.11,
i686-pc-linux-gnu, and x86_64-pc-linux-gnu.

2025-06-23  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

gcc/cobol:
PR cobol/120621
* lexio.cc (parse_replace_pairs): Cast mfile.lineno() to fmt_size_t.
* parse.y (intrinsic): Print ptrdiff_t using %ld, cast arg to long.
* scan_ante.h (numstr_of): Print nx using %ld, cast arg to long.
* util.cc (cbl_field_t::report_invalid_initial_value): Print
ptrdiff_t using %ld, cast arg to long.

2 weeks ago  libstdc++: Always treat __float128 as a floating-point type
Jonathan Wakely [Wed, 2 Jul 2025 20:16:30 +0000 (21:16 +0100)] 
libstdc++: Always treat __float128 as a floating-point type

Similar to the previous commit that made is_integral_v<__int128>
unconditionally true, this makes is_floating_point_v<__float128>
unconditionally true. With the new extended floating-point types in
C++23 (std::float64_t etc.) it seems unhelpful for is_floating_point_v
to be true for them, but not for __float128. Especially as it is true on
some targets, because __float128 is just a typedef for long double.

This change makes is_floating_point_v<__float128> true whenever the type
is defined, giving less surprising and more portable behaviour.

libstdc++-v3/ChangeLog:

* include/bits/cpp_type_traits.h (__is_floating<__float128>):
Do not depend on __STRICT_ANSI__.
* include/bits/stl_algobase.h (__size_to_integer(__float128)):
Likewise.
* include/std/type_traits (__is_floating_point_helper<__float128>):
Likewise.

Reviewed-by: Patrick Palka <ppalka@redhat.com>

2 weeks ago  libstdc++: Treat __int128 as a real integral type [PR96710]
Jonathan Wakely [Fri, 16 May 2025 12:33:23 +0000 (13:33 +0100)] 
libstdc++: Treat __int128 as a real integral type [PR96710]

Since LWG 3828 (included in C++23) implementations are allowed to have
extended integer types that are wider than intmax_t. This means we no
longer have to make is_integral_v<__int128> false for strict -std=c++23
mode, removing the confusing inconsistency with -std=gnu++23 (where
is_integral_v<__int128> is true).

This change makes __int128 a true integral type for all modes, treating
LWG 3828 as a DR for previous standards. Most of the change just
involves removing special cases where we wanted to treat __int128 and
unsigned __int128 as integral types even when is_integral_v was false.

There are still some preprocessor conditionals needed, because on some
targets the compiler defines the macro __GLIBCXX_TYPE_INT_N_0 as
__int128 in non-strict modes. Because we define explicit specializations
of templates such as is_integral for all the INT_N types, we already
have a specialization of is_integral<__int128> in non-strict modes, and
so to avoid a redefinition we must only define
is_integral<__int128> for strict modes.

libstdc++-v3/ChangeLog:

PR libstdc++/96710
* include/bits/cpp_type_traits.h (__is_integer): Define explicit
specializations for __int128.
(__memcpyable_integer): Remove explicit specializations for
__int128.
* include/bits/iterator_concepts.h (incrementable_traits):
Likewise.
(__is_signed_int128, __is_unsigned_int128, __is_int128): Remove.
(__is_integer_like, __is_signed_integer_like): Remove check for
__int128.
* include/bits/max_size_type.h: Remove all uses of __is_int128
in constraints.
* include/bits/ranges_base.h (__to_unsigned_like): Remove
overloads for __int128.
(ranges::ssize): Remove special case for __int128.
* include/bits/stl_algobase.h (__size_to_integer): Define
__int128 overloads for strict modes.
* include/ext/numeric_traits.h (__is_integer_nonstrict): Remove
explicit specializations for __int128.
* include/std/charconv (to_chars): Define overloads for
__int128.
* include/std/format (__format::make_unsigned_t): Remove.
(__format::to_chars): Remove.
* include/std/limits (numeric_limits): Define explicit
specializations for __int128.
* include/std/type_traits (__is_integral_helper): Likewise.
(__make_unsigned, __make_signed): Likewise.

Reviewed-by: Patrick Palka <ppalka@redhat.com>

2 weeks ago  Fortran: Implement F2018 IMPORT statements [PR106135]
Paul Thomas [Fri, 11 Jul 2025 07:28:27 +0000 (08:28 +0100)] 
Fortran:  Implement F2018 IMPORT statements [PR106135]

2025-09-09  Paul Thomas  <pault@gcc.gnu.org>

gcc/fortran
PR fortran/106135
* decl.cc (build_sym): Emit an error if a symbol associated by
an IMPORT, ONLY or IMPORT, all statement is being redeclared.
(gfc_match_import): Parse and check the F2018 versions of the
IMPORT statement. For scopes other than an interface body, if
the symbol cannot be found in the host scope, generate it and
set it up such that gfc_fixup_sibling_symbols can transfer its
'imported attribute' if it turns out to be a not yet parsed
procedure. Test for violations of C897-8100.
* gfortran.h : Add 'import_only' to the gfc_symtree structure.
Add the enum, 'importstate', which is used for values of the new
field 'import_state' in gfc_namespace.
* parse.cc (gfc_fixup_sibling_symbols): Transfer the attribute
'imported' to the new symbol.
* resolve.cc (check_sym_import_status, check_import_status):
New functions to test symbols and expressions for violations of
F2018:C8102.
(resolve_call): Test the 'resolved_sym' against C8102 by a call
to 'check_sym_import_status'.
(gfc_resolve_expr): If the expression is OK and an IMPORT
statement has been registered in the current scope, test C102
by calling 'check_import_status'.
(resolve_select_type): Test the declared derived type in TYPE
IS and CLASS IS statements.

gcc/testsuite/
PR fortran/106135
* gfortran.dg/import3.f90: Use -std=f2008 and comment on change
in error message texts with f2018.
* gfortran.dg/import12.f90: New test.

2 weeks ago  Stop updating gcc-12 branch
Richard Biener [Fri, 11 Jul 2025 06:32:26 +0000 (08:32 +0200)] 
Stop updating gcc-12 branch

contrib/
* gcc-changelog/git_update_version.py: Stop updating gcc-12
branch.

2 weeks ago  Daily bump.
GCC Administrator [Fri, 11 Jul 2025 00:19:26 +0000 (00:19 +0000)] 
Daily bump.

3 weeks ago  c++: Save 8 further bytes from lang_type allocations
Jakub Jelinek [Thu, 10 Jul 2025 22:05:23 +0000 (00:05 +0200)] 
c++: Save 8 further bytes from lang_type allocations

The following patch implements the
/* FIXME reuse another field?  */
comment on the lambda_expr member.
I think (and asserts in the patch seem to confirm) CLASSTYPE_KEY_METHOD
is only ever non-NULL for TYPE_POLYMORPHIC_P and on the other side
CLASSTYPE_LAMBDA_EXPR is only used on closure types which are never
polymorphic.

So, the patch just uses one member for both, with the accessor macros
changed to be no longer lvalues and adding SET_* variants of the macros
for setters.
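
The idea of sharing one member between two mutually exclusive uses can be
sketched as follows.  This is an illustration under stated assumptions, not
the cp-tree.h macros: ClassInfo and the four accessor functions are
hypothetical stand-ins, with a bool playing the role of TYPE_POLYMORPHIC_P
and a void pointer playing the role of the tree member.

```cpp
#include <cassert>

// Hypothetical stand-in for the relevant part of lang_type.
struct ClassInfo {
  bool polymorphic;    // stand-in for TYPE_POLYMORPHIC_P
  const void *shared;  // key method if polymorphic, lambda expr otherwise
};

// Getters are plain rvalues: each yields null when the shared slot
// currently belongs to the other use.
const void *key_method(const ClassInfo &c) {
  return c.polymorphic ? c.shared : nullptr;
}

const void *lambda_expr(const ClassInfo &c) {
  return c.polymorphic ? nullptr : c.shared;
}

// Because the getters are no longer lvalues, writes go through setters,
// which also check that the slot is used consistently.
void set_key_method(ClassInfo &c, const void *method) {
  assert(c.polymorphic);
  c.shared = method;
}

void set_lambda_expr(ClassInfo &c, const void *lambda) {
  assert(!c.polymorphic);
  c.shared = lambda;
}
```

The saving comes from dropping one pointer-sized member.  The price is that
every former assignment through the accessor has to become a setter call.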

2025-07-11  Jakub Jelinek  <jakub@redhat.com>

* cp-tree.h (struct lang_type): Add comment before key_method.
Remove lambda_expr.
(CLASSTYPE_KEY_METHOD): Give NULL_TREE if not TYPE_POLYMORPHIC_P.
(SET_CLASSTYPE_KEY_METHOD): Define.
(CLASSTYPE_LAMBDA_EXPR): Give NULL_TREE if TYPE_POLYMORPHIC_P.
Use key_method member instead of lambda_expr.
(SET_CLASSTYPE_LAMBDA_EXPR): Define.
* class.cc (determine_key_method): Use SET_CLASSTYPE_KEY_METHOD
macro.
* decl.cc (xref_tag): Use SET_CLASSTYPE_LAMBDA_EXPR macro.
* lambda.cc (begin_lambda_type): Likewise.
* module.cc (trees_in::read_class_def): Use SET_CLASSTYPE_LAMBDA_EXPR
and SET_CLASSTYPE_KEY_METHOD macros, assert lambda is NULL if
TYPE_POLYMORPHIC_P and otherwise assert key_method is NULL.

3 weeks ago  c++: Fix up final handling in C++98 [PR120628]
Jakub Jelinek [Thu, 10 Jul 2025 21:47:42 +0000 (23:47 +0200)] 
c++: Fix up final handling in C++98 [PR120628]

The following patch is on top of the
https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686210.html
patch which stopped treating override as conditional keyword in
class properties.
This PR mentions another problem; we emit a bogus warning on code like
struct C {}; struct C final = {};
in C++98.  In this case we parse final as conditional keyword in C++
(including pedwarn) but the caller then immediately aborts the tentative
parse because it isn't followed by { nor (in some cases) : .
I think we certainly shouldn't pedwarn on it, but I think we even shouldn't
warn for it say for -Wc++11-compat, because we don't actually treat the
identifier as conditional keyword even in C++11 and later.
The patch only does this if final is the only class property conditional
keyword, if one uses
struct S __final final __final = {};
one gets the warning and duplicate diagnostics and later parsing errors.

2025-07-10  Jakub Jelinek  <jakub@redhat.com>

PR c++/120628
* parser.cc (cp_parser_elaborated_type_specifier): Use
cp_parser_nth_token_starts_class_definition_p with extra argument 1
instead of cp_parser_next_token_starts_class_definition_p.
(cp_parser_class_property_specifier_seq_opt): For final conditional
keyword in C++98 check if the token after it isn't
cp_parser_nth_token_starts_class_definition_p nor CPP_NAME and in
that case break without consuming it nor warning.
(cp_parser_class_head): Use
cp_parser_nth_token_starts_class_definition_p with extra argument 1
instead of cp_parser_next_token_starts_class_definition_p.
(cp_parser_next_token_starts_class_definition_p): Renamed to ...
(cp_parser_nth_token_starts_class_definition_p): ... this.  Add N
argument.  Use cp_lexer_peek_nth_token instead of cp_lexer_peek_token.

* g++.dg/cpp0x/final1.C: New test.
* g++.dg/cpp0x/final2.C: New test.
* g++.dg/cpp0x/override6.C: New test.

3 weeks ago  c++: Don't incorrectly reject override after class head name [PR120569]
Jakub Jelinek [Thu, 10 Jul 2025 21:41:56 +0000 (23:41 +0200)] 
c++: Don't incorrectly reject override after class head name [PR120569]

While the
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2025/p2786r13.html#c03-compatibility-changes-for-annex-c-diff.cpp03.dcl.dcl
hunk was dropped because
struct C {}; struct C final {};
is actually not valid C++98 (which didn't have list initialization), we
actually also reject
struct D {}; struct D override {};
and that IMHO is valid all the way from C++11 onwards.
Especially in the light of P2786R13 adding new contextual keywords, I think
it is better to use a separate routine for parsing the
class-virt-specifier-seq (in C++11, there was export next to final),
class-virt-specifier (in C++14 to C++23) and
class-property-specifier-seq (in C++26) instead of using the same function
for virt-specifier-seq and class-property-specifier-seq.

2025-07-10  Jakub Jelinek  <jakub@redhat.com>

PR c++/120569
* parser.cc (cp_parser_class_property_specifier_seq_opt): New
function.
(cp_parser_class_head): Use it instead of
cp_parser_property_specifier_seq_opt.  Don't diagnose
VIRT_SPEC_OVERRIDE here.  Formatting fix.

* g++.dg/cpp0x/override2.C: Expect different diagnostics with
override or duplicate final.
* g++.dg/cpp0x/override5.C: New test.
* g++.dg/cpp0x/duplicate1.C: Expect different diagnostics with
duplicate final.

3 weeks ago  c++, libstdc++: Implement C++26 P3068R5 - constexpr exceptions [PR117785]
Jakub Jelinek [Thu, 10 Jul 2025 21:26:15 +0000 (23:26 +0200)] 
c++, libstdc++: Implement C++26 P3068R5 - constexpr exceptions [PR117785]

The following patch implements the C++26 P3068R5 - constexpr exceptions
paper.

As the IL cxx_eval_constant* functions process already contains the low
level calls like __cxa_{allocate,free}_exception, __cxa_{,re}throw etc.,
the patch just makes 10 extern "C" __cxa_* functions magic builtins which
during constant evaluation pretend to be constexpr even when not declared
so and handle them directly, plus does the same for 3 std namespace
functions - std::uncaught_exceptions, std::current_exception and
std::rethrow_exception and adds one new FE builtin -
__builtin_eh_ptr_adjust_ref which the library can use instead of the
_M_addref and _M_release out of line methods (this one instead of
recognizing _M_* as magic too because those are clearly specific to
libstdc++ and e.g. libc++ could use something else).

The patch uses magic VAR_DECLs with heap_{uninit_,,deleted_}identifier
DECL_NAME like for operator new/delete for objects allocated with
__cxa_allocate_exception, just sets their DECL_LANG_SPECIFIC so that
we can track their reference count as well (with std::exception_ptr
the same exception object can be referenced multiple times and we want
to destruct and free only when it reaches zero refcount).

For uncaught exceptions being propagated, the patch uses a new kind of
*jump_target, which is that magic VAR_DECL described above.
The largest change in the patch is making the jump_target argument non-optional
in cxx_eval_constant_expression and all functions it calls that need it.
This is because exceptions can be thrown from pretty much everywhere, e.g.
binary expression can throw in either operand.  And the patch also adds
if (*jump_target) return NULL_TREE; or similar in many spots, so that we
don't crash because cxx_eval_constant_expression returned NULL_TREE
somewhere before actually trying to use it and so that we don't uselessly
dive into other operands etc.
Note, with statement expressions actually this was something we just didn't
handle correctly before, one can validly have:
  a = ({ if (x) return 42; 12; }) + b;
or in the other operand, or break/continue instead of return if it is
somewhere in a loop/switch; and it isn't ok to branch from one operand to
another one through some kind of goto.
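
The early-return pattern can be illustrated with a toy evaluator.  This is a
minimal sketch under stated assumptions, not constexpr.cc: Kind, Expr and
eval are hypothetical, the thrown object is represented by a pointer to the
Throw node instead of a magic VAR_DECL, and the visited counter exists only
to make the short-circuiting observable.

```cpp
#include <cassert>

enum class Kind { Constant, Add, Throw };

// Hypothetical expression node.
struct Expr {
  Kind kind;
  int value;        // used by Constant
  const Expr *lhs;  // used by Add
  const Expr *rhs;  // used by Add
};

// Evaluates E.  When evaluation is interrupted, *jump_target is set to the
// Throw node and the returned value is meaningless, so every caller must
// check *jump_target before using a result.
int eval(const Expr &e, const Expr **jump_target, int *visited) {
  assert(jump_target != nullptr);  // the argument is not optional
  ++*visited;
  switch (e.kind) {
  case Kind::Constant:
    return e.value;
  case Kind::Throw:
    *jump_target = &e;
    return 0;
  case Kind::Add: {
    int l = eval(*e.lhs, jump_target, visited);
    if (*jump_target)
      return 0;  // do not dive into the other operand
    int r = eval(*e.rhs, jump_target, visited);
    if (*jump_target)
      return 0;
    return l + r;
  }
  }
  return 0;
}
```

With an Add node whose left operand throws, the right operand is never
visited and the caller sees the Throw node in *jump_target, which mirrors
why the check has to be repeated after each recursive call.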

On the potential_constant_expression_1 side, important change was to
set *jump_target conservatively on calls that could throw for C++26 (the
patch uses magic void_node for potential_constant_expression* instead of
VAR_DECL, so that we don't have to create new VAR_DECLs there uselessly).
Without that change, several methods in libstdc++ wouldn't work correctly.
I'm not sure what exactly potential_constant_expression_1 maps to in the
C++26 standard wording now and whether doing that is ok, because basically
after the first call to non-noexcept function it stops checking stuff.

And, in some spots where I know potential_constant_expression_1 didn't
check some subexpressions (e.g. the EH only cleanups or TRY_BLOCK handlers)
I've added *potential_constant_expression* calls during cxx_eval_constant*,
not sure if I need to do that because potential_constant_expression_1 is
very conservative and just doesn't recurse on subexpressions in many cases.

2025-07-10  Jakub Jelinek  <jakub@redhat.com>

PR c++/117785
gcc/c-family/
* c-cppbuiltin.cc (c_cpp_builtins): Predefine
__cpp_constexpr_exceptions=202411L for C++26.
gcc/cp/
* constexpr.cc: Implement C++26 P3068R5 - constexpr exceptions.
(class constexpr_global_ctx): Add caught_exceptions and
uncaught_exceptions members.
(constexpr_global_ctx::constexpr_global_ctx): Initialize
uncaught_exceptions.
(returns, breaks, continues, switches): Move earlier.
(throws): New function.
(exception_what_str, diagnose_std_terminate,
diagnose_uncaught_exception): New functions.
(enum cxa_builtin): New type.
(cxx_cxa_builtin_fn_p, cxx_eval_cxa_builtin_fn): New functions.
(cxx_eval_builtin_function_call): Add jump_target argument.  Call
cxx_eval_cxa_builtin_fn for __builtin_eh_ptr_adjust_ref.  Adjust
cxx_eval_constant_expression calls, if it results in jmp_target,
set *jump_target to it and return.
(cxx_bind_parameters_in_call): Add jump_target argument.  Pass
it through to cxx_eval_constant_expression.  If it sets *jump_target,
break.
(fold_operand): Adjust cxx_eval_constant_expression caller.
(cxx_eval_assert): Likewise.  If it set jmp_target, return true.
(cxx_eval_internal_function): Add jump_target argument.  Pass it
through to cxx_eval_constant_expression.  Return early if *jump_target
after recursing on args.
(cxx_eval_dynamic_cast_fn): Likewise.  Don't set reference_p for
C++26 with -fexceptions.
(cxx_eval_thunk_call): Add jump_target argument.  Pass it through
to cxx_eval_constant_expression.
(cxx_set_object_constness): Likewise.  Don't set TREE_READONLY if
throws (jump_target).
(cxx_eval_call_expression): Add jump_target argument.  Pass it
through to cxx_eval_internal_function, cxx_eval_builtin_function_call,
cxx_eval_thunk_call, cxx_eval_dynamic_cast_fn and
cxx_set_object_constness.  Pass it through also
cxx_eval_constant_expression on arguments, cxx_bind_parameters_in_call
and cxx_fold_indirect_ref and for those cases return early
if *jump_target.  Call cxx_eval_cxa_builtin_fn for cxx_cxa_builtin_fn_p
functions.  For cxx_eval_constant_expression on body, pass address of
cleared jmp_target automatic variable, if it throws propagate
to *jump_target and make it non-cacheable.  For C++26 don't diagnose
calls to non-constexpr functions before cxx_bind_parameters_in_call
could report some argument throwing an exception.
(cxx_eval_unary_expression): Add jump_target argument.  Pass it
through to cxx_eval_constant_expression and return early
if *jump_target after the call.
(cxx_fold_pointer_plus_expression): Likewise.
(cxx_eval_binary_expression): Likewise and similarly for
cxx_fold_pointer_plus_expression call.
(cxx_eval_conditional_expression): Pass jump_target to
cxx_eval_constant_expression on first operand and return early
if *jump_target after the call.
(cxx_eval_vector_conditional_expression): Add jump_target argument.
Pass it through to cxx_eval_constant_expression for all 3 arguments
and return early if *jump_target after any of those calls.
(get_array_or_vector_nelts): Add jump_target argument.  Pass it
through to cxx_eval_constant_expression.
(eval_and_check_array_index): Add jump_target argument.  Pass it
through to cxx_eval_constant_expression calls and return early after
each of them if *jump_target.
(cxx_eval_array_reference): Likewise.
(cxx_eval_component_reference): Likewise.
(cxx_eval_bit_field_ref): Likewise.
(cxx_eval_bit_cast): Likewise.  Assert CHECKING_P call doesn't
throw or return.
(cxx_eval_logical_expression): Add jump_target argument.  Pass it
through to cxx_eval_constant_expression calls and return early after
each of them if *jump_target.
(cxx_eval_bare_aggregate): Likewise.
(cxx_eval_vec_init_1): Add jump_target argument.  Pass it through
to cxx_eval_bare_aggregate and recursive call.  Pass it through
to get_array_or_vector_nelts and cxx_eval_constant_expression
and return early after it if *jump_target.
(cxx_eval_vec_init): Add jump_target argument.  Pass it through
to cxx_eval_constant_expression and cxx_eval_vec_init_1.
(cxx_union_active_member): Add jump_target argument.  Pass it
through to cxx_eval_constant_expression and return early after it
if *jump_target.
(cxx_fold_indirect_ref_1): Add jump_target argument.  Pass it
through to cxx_union_active_member and recursive calls.
(cxx_eval_indirect_ref): Add jump_target argument.  Pass it through
to cxx_fold_indirect_ref_1 calls and to recursive call, in which
case return early after it if *jump_target.
(cxx_fold_indirect_ref): Add jump_target argument.  Pass it through
to cxx_fold_indirect_ref and cxx_eval_constant_expression calls and
return early after those if *jump_target.
(cxx_eval_trinary_expression): Add jump_target argument.  Pass it
through to cxx_eval_constant_expression calls and return early after
those if *jump_target.
(cxx_eval_store_expression): Add jump_target argument.  Pass it
through to cxx_eval_constant_expression and eval_and_check_array_index
calls and return early after those if *jump_target.
(cxx_eval_increment_expression): Add jump_target argument.  Pass it
through to cxx_eval_constant_expression calls and return early after
those if *jump_target.
(label_matches): Handle VAR_DECL case.
(cxx_eval_statement_list): Remove local_target variable and
!jump_target handling.  Handle throws (jump_target) like returns or
breaks.
(cxx_eval_loop_expr): Remove local_target variable and !jump_target
handling.  Pass it through to cxx_eval_constant_expression.  Handle
throws (jump_target) like returns.
(cxx_eval_switch_expr): Pass jump_target through to
cxx_eval_constant_expression on cond, return early after it
if *jump_target.
(build_new_constexpr_heap_type): Add jump_target argument.  Pass it
through to cxx_eval_constant_expression calls, return early after
those if *jump_target.
(merge_jump_target): New function.
(cxx_eval_constant_expression): Make jump_target argument no longer
defaulted, don't test jump_target for NULL.  Pass jump_target
through to recursive calls, cxx_eval_call_expression,
cxx_eval_store_expression, cxx_eval_indirect_ref,
cxx_eval_unary_expression, cxx_eval_binary_expression,
cxx_eval_logical_expression, cxx_eval_array_reference,
cxx_eval_component_reference, cxx_eval_bit_field_ref,
cxx_eval_vector_conditional_expression, cxx_eval_bare_aggregate,
cxx_eval_vec_init, cxx_eval_trinary_expression, cxx_fold_indirect_ref,
build_new_constexpr_heap_type, cxx_eval_increment_expression,
cxx_eval_bit_cast and return early after some of those
if *jump_target as needed.
(cxx_eval_constant_expression) <case TARGET_EXPR>: For C++26 push
also CLEANUP_EH_ONLY cleanups, with NULL_TREE marker after them.
(cxx_eval_constant_expression) <case RETURN_EXPR>: Don't
override *jump_target if throws (jump_target).
(cxx_eval_constant_expression) <case TRY_CATCH_EXPR, case TRY_BLOCK,
case MUST_NOT_THROW_EXPR, case TRY_FINALLY_EXPR, case CLEANUP_STMT>:
Handle C++26 constant expressions.
(cxx_eval_constant_expression) <case CLEANUP_POINT_EXPR>: For C++26
with throws (jump_target) evaluate the CLEANUP_EH_ONLY cleanups as
well, and if not throws (jump_target) skip those.  Set *jump_target
if some of the cleanups threw.
(cxx_eval_constant_expression) <case THROW_EXPR>: Recurse on operand
for C++26.
(cxx_eval_outermost_constant_expr): Diagnose uncaught exceptions both
from main expression and cleanups, diagnose also
break/continue/returns from the main expression.  Handle
CLEANUP_EH_ONLY cleanup markers.  Don't diagnose mutable poison stuff
if non_constant_p.  Use different diagnostics for non-deleted heap
allocations if they were allocated by __cxa_allocate_exception.
(callee_might_throw): New function.
(struct check_for_return_continue_data): Add could_throw field.
(check_for_return_continue): Handle AGGR_INIT_EXPR and CALL_EXPR and
set d->could_throw if they could throw.
(potential_constant_expression_1): For CALL_EXPR allow
cxx_dynamic_cast_fn_p calls.  For C++26 set *jump_target to void_node
for calls that could throw.  For C++26 if call to non-constexpr call
is seen, try to evaluate arguments first and if they could throw,
don't diagnose call to non-constexpr function nor return false.
Adjust check_for_return_continue_data initializers and
set *jump_target to void_node if data.could_throw_p.  For C++26
recurse on THROW_EXPR argument.  Add comment explaining TRY_BLOCK
handling with C++26 exceptions.  Handle throws like returns in some
cases.
* cp-tree.h (MUST_NOT_THROW_NOEXCEPT_P, MUST_NOT_THROW_THROW_P,
MUST_NOT_THROW_CATCH_P, DECL_EXCEPTION_REFCOUNT): Define.
(DECL_LOCAL_DECL_P): Fix comment typo, VARIABLE_DECL -> VAR_DECL.
(enum cp_built_in_function): Add CP_BUILT_IN_EH_PTR_ADJUST_REF.
(handler_match_for_exception_type): Declare.
* call.cc (handler_match_for_exception_type): New function.
* except.cc (initialize_handler_parm): Set MUST_NOT_THROW_CATCH_P
on newly created MUST_NOT_THROW_EXPR.
(begin_eh_spec_block): Set MUST_NOT_THROW_NOEXCEPT_P.
(wrap_cleanups_r): Set MUST_NOT_THROW_THROW_P.
(build_throw): Add another TARGET_EXPR whose scope spans
until after the __cxa_throw call and copy pointer value from ptr
to it and use it in __cxa_throw argument.
* tree.cc (builtin_valid_in_constant_expr_p): Handle
CP_BUILT_IN_EH_PTR_ADJUST_REF.
* decl.cc (cxx_init_decl_processing): Initialize
__builtin_eh_ptr_adjust_ref FE builtin.
* pt.cc (tsubst_stmt) <case MUST_NOT_THROW_EXPR>: Copy the
MUST_NOT_THROW_NOEXCEPT_P, MUST_NOT_THROW_THROW_P and
MUST_NOT_THROW_CATCH_P flags.
* cp-gimplify.cc (cp_gimplify_expr) <case CALL_EXPR>: Error on
non-folded CP_BUILT_IN_EH_PTR_ADJUST_REF calls.
gcc/testsuite/
* g++.dg/cpp0x/constexpr-ellipsis2.C: Expect different diagnostics for
C++26.
* g++.dg/cpp0x/constexpr-throw.C: Likewise.
* g++.dg/cpp1y/constexpr-84192.C: Expect different diagnostics.
* g++.dg/cpp1y/constexpr-throw.C: Expect different diagnostics for
C++26.
* g++.dg/cpp1z/constexpr-asm-5.C: Likewise.
* g++.dg/cpp26/constexpr-eh1.C: New test.
* g++.dg/cpp26/constexpr-eh2.C: New test.
* g++.dg/cpp26/constexpr-eh3.C: New test.
* g++.dg/cpp26/constexpr-eh4.C: New test.
* g++.dg/cpp26/constexpr-eh5.C: New test.
* g++.dg/cpp26/constexpr-eh6.C: New test.
* g++.dg/cpp26/constexpr-eh7.C: New test.
* g++.dg/cpp26/constexpr-eh8.C: New test.
* g++.dg/cpp26/constexpr-eh9.C: New test.
* g++.dg/cpp26/constexpr-eh10.C: New test.
* g++.dg/cpp26/constexpr-eh11.C: New test.
* g++.dg/cpp26/constexpr-eh12.C: New test.
* g++.dg/cpp26/constexpr-eh13.C: New test.
* g++.dg/cpp26/constexpr-eh14.C: New test.
* g++.dg/cpp26/constexpr-eh15.C: New test.
* g++.dg/cpp26/feat-cxx26.C: Change formatting in __cpp_pack_indexing
and __cpp_pp_embed test.  Add __cpp_constexpr_exceptions test.
* g++.dg/cpp26/static_assert1.C: Expect different diagnostics for
C++26.
* g++.dg/cpp2a/consteval34.C: Likewise.
* g++.dg/cpp2a/consteval-memfn1.C: Likewise.
* g++.dg/cpp2a/constexpr-dynamic4.C: For C++26 add std::exception and
std::bad_cast definitions and expect different diagnostics.
* g++.dg/cpp2a/constexpr-dynamic6.C: Likewise.
* g++.dg/cpp2a/constexpr-dynamic7.C: Likewise.
* g++.dg/cpp2a/constexpr-dynamic8.C: Likewise.
* g++.dg/cpp2a/constexpr-dynamic9.C: Likewise.
* g++.dg/cpp2a/constexpr-dynamic11.C: Likewise.
* g++.dg/cpp2a/constexpr-dynamic14.C: Likewise.
* g++.dg/cpp2a/constexpr-dynamic18.C: Likewise.
* g++.dg/cpp2a/constexpr-new27.C: New test.
* g++.dg/cpp2a/constexpr-typeid5.C: New test.
libstdc++-v3/
* include/bits/version.def (constexpr_exceptions): New.
* include/bits/version.h: Regenerate.
* libsupc++/exception (std::bad_exception::bad_exception): Add
_GLIBCXX26_CONSTEXPR.
(std::bad_exception::~bad_exception, std::bad_exception::what): For
C++26 add constexpr and define inline.
* libsupc++/exception.h (std::exception::exception,
std::exception::operator=): Add _GLIBCXX26_CONSTEXPR.
(std::exception::~exception, std::exception::what): For C++26 add
constexpr and define inline.
* libsupc++/exception_ptr.h (std::make_exception_ptr): Add
_GLIBCXX26_CONSTEXPR.  For if consteval use just throw with
current_exception() in catch.
(std::exception_ptr::exception_ptr(void*)): For C++26 add constexpr
and define inline.
(std::exception_ptr::exception_ptr()): Add _GLIBCXX26_CONSTEXPR.
(std::exception_ptr::exception_ptr(const exception_ptr&)): Likewise.
Use __builtin_eh_ptr_adjust_ref if consteval and compiler has it
instead of _M_addref.
(std::exception_ptr::exception_ptr(nullptr_t)): Add
_GLIBCXX26_CONSTEXPR.
(std::exception_ptr::exception_ptr(exception_ptr&&)): Likewise.
(std::exception_ptr::operator=): Likewise.
(std::exception_ptr::~exception_ptr): Likewise.  Use
__builtin_eh_ptr_adjust_ref if consteval and compiler has it
instead of _M_release.
(std::exception_ptr::swap): Add _GLIBCXX26_CONSTEXPR.
(std::exception_ptr::operator bool): Likewise.
(std::exception_ptr::operator==): Likewise.
* libsupc++/nested_exception.h
(std::nested_exception::nested_exception): Add _GLIBCXX26_CONSTEXPR.
(std::nested_exception::operator=): Likewise.
(std::nested_exception::~nested_exception): For C++26 add constexpr
and define inline.
(std::nested_exception::rethrow_if_nested): Add _GLIBCXX26_CONSTEXPR.
(std::nested_exception::nested_ptr): Likewise.
(std::_Nested_exception::_Nested_exception): Likewise.
(std::throw_with_nested, std::rethrow_if_nested): Likewise.
* libsupc++/new (std::bad_alloc::bad_alloc): Likewise.
(std::bad_alloc::operator=): Likewise.
(std::bad_alloc::~bad_alloc): For C++26 add constexpr and define
inline.
(std::bad_alloc::what): Likewise.
(std::bad_array_new_length::bad_array_new_length): Add
_GLIBCXX26_CONSTEXPR.
(std::bad_array_new_length::~bad_array_new_length): For C++26 add
constexpr and define inline.
(std::bad_array_new_length::what): Likewise.
* libsupc++/typeinfo (std::bad_cast::bad_cast): Add
_GLIBCXX26_CONSTEXPR.
(std::bad_cast::~bad_cast): For C++26 add constexpr and define inline.
(std::bad_cast::what): Likewise.
(std::bad_typeid::bad_typeid): Add _GLIBCXX26_CONSTEXPR.
(std::bad_typeid::~bad_typeid): For C++26 add constexpr and define
inline.
(std::bad_typeid::what): Likewise.
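
Many of the entries above repeat one pattern: pass jump_target through to the recursive evaluation and return early if *jump_target is set afterwards.  The following is only a toy sketch of that control flow, not GCC's evaluator; Expr, eval, constant, thrown, add, throws and value_of are hypothetical names, and a plain pointer stands in for the jump target.

```cpp
#include <vector>

// Toy expression tree.  None of these names are GCC's; they only mirror the
// shape of the pattern described in the ChangeLog entries.
struct Expr {
  enum Kind { Const, Add, Throw } kind;
  int value;                    // payload for Const and Throw
  std::vector<Expr> operands;   // exactly two operands for Add
};

Expr constant(int v) { return Expr{Expr::Const, v, {}}; }
Expr thrown(int v) { return Expr{Expr::Throw, v, {}}; }
Expr add(Expr a, Expr b) { return Expr{Expr::Add, 0, {a, b}}; }

// *jump_target stays null while evaluation proceeds normally and points at
// the throwing expression once a throw is in flight.
int eval(const Expr &e, const Expr **jump_target) {
  switch (e.kind) {
    case Expr::Const:
      return e.value;
    case Expr::Throw:
      *jump_target = &e;            // mark the throw as in flight
      return 0;
    case Expr::Add: {
      int lhs = eval(e.operands[0], jump_target);
      if (*jump_target) return 0;   // return early after the first operand
      int rhs = eval(e.operands[1], jump_target);
      if (*jump_target) return 0;   // and again after the second
      return lhs + rhs;
    }
  }
  return 0;
}

// Convenience wrappers for callers that only care about one outcome.
bool throws(const Expr &e) {
  const Expr *jump_target = nullptr;
  eval(e, &jump_target);
  return jump_target != nullptr;
}

int value_of(const Expr &e) {
  const Expr *jump_target = nullptr;
  return eval(e, &jump_target);
}
```

Because every caller checks the target right after each sub-evaluation, a throw in a deeply nested operand unwinds through all enclosing evaluations without any of them using a partial result.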

3 weeks agoaarch64: Guard VF-based costing with !m_costing_for_scalar
Richard Sandiford [Thu, 10 Jul 2025 21:00:41 +0000 (22:00 +0100)] 
aarch64: Guard VF-based costing with !m_costing_for_scalar

g:4b47acfe2b626d1276e229a0cf165e934813df6c caused a segfault
in aarch64_vector_costs::analyze_loop_vinfo when costing scalar
code, since we'd end up dividing by a zero VF.
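
As a rough illustration of the kind of guard involved (not the aarch64 code itself; CostState and per_iteration_cost are hypothetical names), a VF-based computation can be skipped whenever scalar code is being costed:

```cpp
// Hypothetical stand-in for the costing state; only the two fields that
// matter for the guard are modelled.
struct CostState {
  bool costing_for_scalar;   // true while costing the scalar loop
  unsigned vf;               // vectorization factor; 0 when costing scalar code
};

// Returns total_cost divided by the VF when a VF applies, and total_cost
// unchanged otherwise, so the division never sees a zero VF.
unsigned per_iteration_cost(const CostState &s, unsigned total_cost) {
  if (!s.costing_for_scalar && s.vf != 0)
    return total_cost / s.vf;
  return total_cost;
}
```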

Much of the structure of the aarch64 costing code dates from
a stage 4 patch, when we had to work within the bounds of what
the target-independent code did.  Some of it could do with a
rework now that we're not so constrained.

This patch is therefore an emergency fix rather than the best
long-term solution.  I'll revisit when I have more time to think
about it.

gcc/
* config/aarch64/aarch64.cc (aarch64_vector_costs::add_stmt_cost):
Guard VF-based costing with !m_costing_for_scalar.

3 weeks agoReduce the # of arguments of .ACCESS_WITH_SIZE from 6 to 4.
Qing Zhao [Wed, 9 Jul 2025 21:31:55 +0000 (21:31 +0000)] 
Reduce the # of arguments of .ACCESS_WITH_SIZE from 6 to 4.

This is an improvement to the design of internal function .ACCESS_WITH_SIZE.

Currently, the .ACCESS_WITH_SIZE is designed as:

   ACCESS_WITH_SIZE (REF_TO_OBJ, REF_TO_SIZE, CLASS_OF_SIZE,
     TYPE_OF_SIZE, ACCESS_MODE, TYPE_SIZE_UNIT for element)
   which returns the REF_TO_OBJ same as the 1st argument;

   1st argument REF_TO_OBJ: The reference to the object;
   2nd argument REF_TO_SIZE: The reference to the size of the object,
   3rd argument CLASS_OF_SIZE: The size referenced by the REF_TO_SIZE represents
     0: the number of bytes.
     1: the number of the elements of the object type;
   4th argument TYPE_OF_SIZE: A constant 0 with its TYPE being the same as the
     TYPE of the object referenced by REF_TO_SIZE
   5th argument ACCESS_MODE:
     -1: Unknown access semantics
      0: none
      1: read_only
      2: write_only
      3: read_write
   6th argument: The TYPE_SIZE_UNIT of the element TYPE of the FAM when 3rd
      argument is 1. NULL when 3rd argument is 0.

Among the 6 arguments:
 A. The 3rd argument CLASS_OF_SIZE is not needed. If the REF_TO_SIZE represents
    the number of bytes, simply pass 1 to the TYPE_SIZE_UNIT argument.
 B. The 4th and the 5th arguments can be combined into 1 argument, whose TYPE
    represents the TYPE_OF_SIZE, and the constant value represents the
    ACCESS_MODE.

As a result, the new design of the .ACCESS_WITH_SIZE is:

   ACCESS_WITH_SIZE (REF_TO_OBJ, REF_TO_SIZE,
     TYPE_OF_SIZE + ACCESS_MODE, TYPE_SIZE_UNIT for element)
   which returns the REF_TO_OBJ same as the 1st argument;

   1st argument REF_TO_OBJ: The reference to the object;
   2nd argument REF_TO_SIZE: The reference to the size of the object,
   3rd argument TYPE_OF_SIZE + ACCESS_MODE: An integer constant with a pointer
     TYPE.
     The pointee TYPE of the pointer TYPE is the TYPE of the object referenced
     by REF_TO_SIZE.
     The integer constant value represents the ACCESS_MODE:
       0: none
       1: read_only
       2: write_only
       3: read_write
   4th argument: The TYPE_SIZE_UNIT of the element TYPE of the array.
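
To make point A concrete, here is a small model of the four-argument form.  It is a sketch under assumptions, not the internal function: AccessWithSize, AccessMode and object_size_in_bytes are hypothetical names, the size is modelled as a 64-bit integer, and the pointer type that carries TYPE_OF_SIZE in the 3rd argument is reduced to the mode value alone.

```cpp
#include <cstdint>

// The constant value carried by the 3rd argument.
enum class AccessMode { none = 0, read_only = 1, write_only = 2, read_write = 3 };

// Hypothetical model of the four arguments.
struct AccessWithSize {
  const void *ref_to_obj;              // 1st: reference to the object
  const std::uint64_t *ref_to_size;    // 2nd: reference to the size
  AccessMode mode;                     // 3rd: constant value (type omitted)
  std::uint64_t type_size_unit;        // 4th: element size in bytes
};

// With CLASS_OF_SIZE gone, one formula covers both cases: an element count
// is scaled by the element size, and a byte count is passed with a
// TYPE_SIZE_UNIT of 1.
std::uint64_t object_size_in_bytes(const AccessWithSize &call) {
  return *call.ref_to_size * call.type_size_unit;
}
```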

gcc/c-family/ChangeLog:

* c-ubsan.cc (get_bound_from_access_with_size): Adjust the position
of the arguments per the new design.

gcc/c/ChangeLog:

* c-typeck.cc (build_access_with_size_for_counted_by): Update comments.
Adjust the arguments per the new design.

gcc/ChangeLog:

* internal-fn.cc (expand_ACCESS_WITH_SIZE): Update comments.
* internal-fn.def (ACCESS_WITH_SIZE): Update comments.
* tree-object-size.cc (access_with_size_object_size): Update comments.
Adjust the arguments per the new design.

3 weeks agoPassing TYPE_SIZE_UNIT of the element as the 6th argument to .ACCESS_WITH_SIZE (PR121000)
Qing Zhao [Wed, 9 Jul 2025 20:10:30 +0000 (20:10 +0000)] 
Passing TYPE_SIZE_UNIT of the element as the 6th argument to .ACCESS_WITH_SIZE (PR121000)

The size of the element of the FAM _cannot_ reliably depend on the original
TYPE of the FAM that we passed as the 6th parameter to the .ACCESS_WITH_SIZE:

     TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (gimple_call_arg (call, 5))))

when the element of the FAM has a variable length type. Since the variable
that represents TYPE_SIZE_UNIT has no explicit usage in the original IL,
compiler transformations (such as DSE) that are applied before the object_size
phase might eliminate the whole definition of the variable that represents
the TYPE_SIZE_UNIT of the element of the FAM.

In order to resolve this issue, instead of passing the original TYPE of the
FAM as the 6th argument to .ACCESS_WITH_SIZE, we should explicitly pass the
original TYPE_SIZE_UNIT of the element TYPE of the FAM as the 6th argument
to the call to .ACCESS_WITH_SIZE.

PR middle-end/121000

gcc/c/ChangeLog:

* c-typeck.cc (build_access_with_size_for_counted_by): Update comments.
Pass TYPE_SIZE_UNIT of the element as the 6th argument.

gcc/ChangeLog:

* internal-fn.cc (expand_ACCESS_WITH_SIZE): Update comments.
* internal-fn.def (ACCESS_WITH_SIZE): Update comments.
* tree-object-size.cc (access_with_size_object_size): Update comments.
Get the element_size from the 6th argument directly.

gcc/testsuite/ChangeLog:

* gcc.dg/flex-array-counted-by-pr121000.c: New test.

3 weeks agotestsuite: Fix unallocated array usage in test
Mikael Morin [Sat, 5 Jul 2025 13:05:20 +0000 (15:05 +0200)] 
testsuite: Fix unallocated array usage in test

gcc/testsuite/ChangeLog:

* gfortran.dg/asan/array_constructor_1.f90: Allocate array
before using it.

3 weeks agoaarch64: Fix LD1Q and ST1Q failures for big-endian
Richard Sandiford [Thu, 10 Jul 2025 15:54:45 +0000 (16:54 +0100)] 
aarch64: Fix LD1Q and ST1Q failures for big-endian

LD1Q gathers and ST1Q scatters are unusual in that they operate
on 128-bit blocks (effectively VNx1TI).  However, we don't have
modes or ACLE types for 128-bit integers, and 128-bit integers
are not the intended use case.  Instead, the instructions are
intended to be used in "hybrid VLA" operations, where each 128-bit
block is an Advanced SIMD vector.

The normal SVE modes therefore capture the intended use case better
than VNx1TI would.  For example, VNx2DI is effectively N copies
of V2DI, VNx4SI N copies of V4SI, etc.

Since there is only one LD1Q instruction and one ST1Q instruction,
the ACLE support used a single pattern for each, with the loaded or
stored data having mode VNx2DI.  The ST1Q pattern was generated by:

    rtx data = e.args.last ();
    e.args.last () = force_lowpart_subreg (VNx2DImode, data, GET_MODE (data));
    e.prepare_gather_address_operands (1, false);
    return e.use_exact_insn (CODE_FOR_aarch64_scatter_st1q);

where the force_lowpart_subreg bitcast the stored data to VNx2DI.
But such subregs require an element reverse on big-endian targets
(see the comment at the head of aarch64-sve.md), which wasn't the
intention.  The code should have used aarch64_sve_reinterpret instead.

The LD1Q pattern was used as follows:

    e.prepare_gather_address_operands (1, false);
    return e.use_exact_insn (CODE_FOR_aarch64_gather_ld1q);

which always returns a VNx2DI value, leaving the caller to bitcast
that to the correct mode.  That bitcast again uses subregs and has
the same problem as above.

However, for the reasons explained in the comment, using
aarch64_sve_reinterpret does not work well for LD1Q.  The patch
instead parameterises the LD1Q based on the required data mode.

gcc/
* config/aarch64/aarch64-sve2.md (aarch64_gather_ld1q): Replace with...
(@aarch64_gather_ld1q<mode>): ...this, parameterizing based on mode.
* config/aarch64/aarch64-sve-builtins-sve2.cc
(svld1q_gather_impl::expand): Update accordingly.
(svst1q_scatter_impl::expand): Use aarch64_sve_reinterpret
instead of force_lowpart_subreg.

3 weeks agocobol: Add PUSH and POP to CDF.
James K. Lowden [Wed, 9 Jul 2025 22:14:40 +0000 (18:14 -0400)] 
cobol: Add PUSH and POP to CDF.

Introduce cdf_directives_t class to centralize management of CDF
state. Move existing CDF state variables and functions into the new
class.
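
The idea of centralizing directive state behind PUSH and POP can be sketched as a current state plus a stack of saved copies.  This is not the gcobol class: CdfState, CdfDirectives and their fields are hypothetical, and the real cdf_directives_t keeps separate push/pop functions per kind of state, as the ChangeLog below shows.

```cpp
#include <stack>
#include <string>

// Hypothetical snapshot of directive state; the fields are illustrative.
struct CdfState {
  std::string call_convention = "COBOL";
  bool free_format = false;
};

// One object owns the current state and the saved copies.
class CdfDirectives {
 public:
  CdfState current;

  // PUSH: remember the current state.
  void push() { saved_.push(current); }

  // POP: restore the most recently pushed state.  Returns false when there
  // is nothing to restore, leaving the current state untouched.
  bool pop() {
    if (saved_.empty()) return false;
    current = saved_.top();
    saved_.pop();
    return true;
  }

 private:
  std::stack<CdfState> saved_;
};
```

A copybook can then change a setting between a PUSH and a POP without leaking the change into the including source.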

gcc/cobol/ChangeLog:

PR cobol/120765
* cdf.y: Extend grammar for new CDF syntax, relocate dictionary.
* cdfval.h (cdf_dictionary): Use new CDF dictionary.
* dts.h: Remove useless assignment, note incorrect behavior.
* except.cc: Remove obsolete EC state.
* gcobol.1: Document CDF in its own section.
* genapi.cc (parser_statement_begin): Use new EC state function.
(parser_file_merge): Same.
(parser_check_fatal_exception): Same.
* genutil.cc (get_and_check_refstart_and_reflen): Same.
(get_depending_on_value_from_odo): Same.
(get_data_offset): Same.
(process_this_exception): Same.
* lexio.cc (check_push_pop_directive): New function.
(check_source_format_directive): Restrict regex search to 1 line.
(cdftext::free_form_reference_format): Use new function.
* parse.y: Define new CDF tokens, use new CDF state.
* parse_ante.h (cdf_tokens): Use new CDF state.
(redefined_token): Same.
(class prog_descr_t): Remove obsolete CDF state.
(class program_stack_t): Same.
(current_call_convention): Same.
* scan.l: Recognize new CDF tokens.
* scan_post.h (is_cdf_token): Same.
* symbols.h (cdf_current_tokens): Change current_call_convention to return void.
* token_names.h: Regenerate.
* udf/stored-char-length.cbl: Use new PUSH/POP CDF functionality.
* util.cc (class cdf_directives_t): Define cdf_directives_t.
(current_call_convention): Same.
(cdf_current_tokens): Same.
(cdf_dictionary): Same.
(cdf_enabled_exceptions): Same.
(cdf_push): Same.
(cdf_push_call_convention): Same.
(cdf_push_current_tokens): Same.
(cdf_push_dictionary): Same.
(cdf_push_enabled_exceptions): Same.
(cdf_push_source_format): Same.
(cdf_pop): Same.
(cdf_pop_call_convention): Same.
(cdf_pop_current_tokens): Same.
(cdf_pop_dictionary): Same.
(cdf_pop_enabled_exceptions): Same.
(cdf_pop_source_format): Same.
* util.h (cdf_push): Declare cdf_directives_t.
(cdf_push_call_convention): Same.
(cdf_push_current_tokens): Same.
(cdf_push_dictionary): Same.
(cdf_push_enabled_exceptions): Same.
(cdf_push_source_format): Same.
(cdf_pop): Same.
(cdf_pop_call_convention): Same.
(cdf_pop_current_tokens): Same.
(cdf_pop_dictionary): Same.
(cdf_pop_source_format): Same.
(cdf_pop_enabled_exceptions): Same.

libgcobol/ChangeLog:

* common-defs.h (cdf_enabled_exceptions): Use new CDF state.

3 weeks agoFixes to auto-profile and Gimple matching.
Jan Hubicka [Thu, 10 Jul 2025 14:56:21 +0000 (16:56 +0200)] 
Fixes to auto-profile and Gimple matching.

This patch fixes several issues I noticed in gimple matching and the
-Wauto-profile warning.  One problem is that we mismatched symbols with user
names, such as "*strlen" instead of "strlen".  I added raw_symbol_name to strip
the extra '*', which is OK on ELF targets, the only targets we support with
auto-profile, but eventually we will want to add the user prefix.  There is a
sorry about this.  Also I think dwarf2out is wrong:

static void
add_linkage_attr (dw_die_ref die, tree decl)
{
  const char *name = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl));

  /* Mimic what assemble_name_raw does with a leading '*'.  */
  if (name[0] == '*')
    name = &name[1];
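
The '*' stripping described above amounts to the following.  This is a sketch rather than the auto-profile.cc code: strip_leading_star and same_symbol are hypothetical helpers, and, like the patch, it ignores any user label prefix.

```cpp
#include <cstring>

// Drop the leading '*' that marks an assembler name as "use verbatim".
const char *strip_leading_star(const char *assembler_name) {
  return assembler_name[0] == '*' ? assembler_name + 1 : assembler_name;
}

// Compare two names after stripping, so "*strlen" matches "strlen".
bool same_symbol(const char *a, const char *b) {
  return std::strcmp(strip_leading_star(a), strip_leading_star(b)) == 0;
}
```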

The patch also fixes locations of warnings.  I used the location of the
problematic statement as the warning_at parameter but also included info about
the containing function.  This made warning_at ignore the first location, which
is fixed now.

I also fixed the ICE with -Wno-auto-profile discussed earlier.

Bootstrapped/regtested x86_64-linux.  Autoprofiled bootstrap now fails for
weird reasons for me (it does not build the training stage), so I will try to
debug this before committing.

gcc/ChangeLog:

* auto-profile.cc: Include output.h.
(function_instance::set_call_location): Also sanity check
that location is known.
(raw_symbol_name): Two new static functions.
(dump_inline_stack): Use it.
(string_table::get_index_by_decl): Likewise.
(function_instance::get_cgraph_node): Likewise.
(function_instance::get_function_instance_by_decl): Fix typo
in warning; use raw names; fix lineno decoding.
(match_with_target): Add containing function parameter;
correctly output function and call location in warning.
(function_instance::lookup_count): Fix warning locations.
(function_instance::match): Fix warning locations; avoid
crash with mismatched callee; do not warn about broken callsites
twice.
(autofdo_source_profile::offline_external_functions): Use
raw_assembler_name.
(walk_block): Use raw_assembler_name.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-prof/afdo-inline.c: Add user symbol names.

3 weeks agoexpand: ICE if asked to expand RDIV with non-float type.
Robin Dapp [Wed, 9 Jul 2025 13:58:05 +0000 (15:58 +0200)] 
expand: ICE if asked to expand RDIV with non-float type.

This patch adds asserts that ensure we only expand an RDIV_EXPR with
actual float mode.  It also replaces the RDIV_EXPR in setting a
vectorized loop's length by EXACT_DIV_EXPR.  The code in question is
only used with length-control targets (riscv, powerpc, s390).
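
The distinction can be illustrated outside the compiler: real division only makes sense for floating-point operands, while exact division is an integer division whose remainder is known to be zero.  The helper below is a hypothetical sketch of that contract, not GCC code.

```cpp
#include <cassert>
#include <type_traits>

// Integer division that is only valid when the remainder is zero, which is
// the contract EXACT_DIV_EXPR expresses.  Floating-point operands are
// rejected at compile time, mirroring the new asserts in spirit.
template <typename T>
T exact_div(T n, T d) {
  static_assert(std::is_integral<T>::value,
                "exact_div is for integer types only");
  assert(d != 0 && n % d == 0);
  return n / d;
}
```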

PR target/121014

gcc/ChangeLog:

* cfgexpand.cc (expand_debug_expr): Assert FLOAT_MODE_P.
* optabs-tree.cc (optab_for_tree_code): Assert FLOAT_TYPE_P.
* tree-vect-loop.cc (vect_get_loop_len): Use EXACT_DIV_EXPR.

3 weeks agoRISC-V: Make zero-stride load broadcast a tunable.
Robin Dapp [Thu, 10 Jul 2025 07:41:48 +0000 (09:41 +0200)] 
RISC-V: Make zero-stride load broadcast a tunable.

This patch makes the zero-stride load broadcast idiom dependent on a
uarch-tunable "use_zero_stride_load".  Right now we have quite a few
paths that reach a strided load and some of them are not exactly
straightforward.
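
In outline, the tunable turns one codegen choice into a per-uarch decision.  The sketch below is an assumption-laden model, not riscv.cc: TuneParam, BroadcastVia and choose_memory_broadcast are hypothetical, and the fallback of loading the scalar and then splatting it is my reading of what happens when the idiom is disabled.

```cpp
// Hypothetical slice of a tuning structure.
struct TuneParam {
  bool use_zero_stride_load;
};

enum class BroadcastVia {
  ZeroStrideLoad,       // strided vector load with a stride of zero
  ScalarLoadThenSplat   // scalar load followed by a register broadcast
};

// Mirrors the role of strided_load_broadcast_p: a single query point.
bool strided_load_broadcast_p(const TuneParam &tune) {
  return tune.use_zero_stride_load;
}

// Pick how to broadcast a value that currently lives in memory.
BroadcastVia choose_memory_broadcast(const TuneParam &tune) {
  return strided_load_broadcast_p(tune) ? BroadcastVia::ZeroStrideLoad
                                        : BroadcastVia::ScalarLoadThenSplat;
}
```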

While broadcast is relatively rare on rv64 targets it is more common on
rv32 targets that want to vectorize 64-bit elements.

While the patch is more involved than I would have liked it could have
even touched more places.  The whole broadcast-like insn path feels a
bit hackish due to the several optimizations we employ.  Some of the
complications stem from the fact that we lump together real broadcasts,
vector single-element sets, and strided broadcasts.  The strided-load
alternatives currently require a memory_constraint to work properly
which causes more complications when trying to disable just these.

In short, the whole pred_broadcast handling in combination with the
sew64_scalar_helper could use work in the future.  I was about to start
with it in this patch but soon realized that it would only distract from
the original intent.  What can help in the future is to split strided and
non-strided broadcasts entirely, as well as the single-element sets.

It is not yet clear whether we need to pay special attention to misaligned
strided loads (PR120782).

I regtested on rv32 and rv64 with strided_load_broadcast_p forced to
true and false.  With either I didn't observe any new execution failures
but obviously there are new scan failures with strided broadcast turned
off.

PR target/118734

gcc/ChangeLog:

* config/riscv/constraints.md (Wdm): Use tunable for Wdm
constraint.
* config/riscv/riscv-protos.h (emit_avltype_insn): Declare.
(can_be_broadcasted_p): Rename to...
(can_be_broadcast_p): ...this.
* config/riscv/predicates.md: Use renamed function.
(strided_load_broadcast_p): Declare.
* config/riscv/riscv-selftests.cc (run_broadcast_selftests):
Only run broadcast selftest if strided broadcasts are OK.
* config/riscv/riscv-v.cc (emit_avltype_insn): New function.
(sew64_scalar_helper): Only emit a pred_broadcast if the new
tunable says so.
(can_be_broadcasted_p): Rename to...
(can_be_broadcast_p): ...this and use new tunable.
* config/riscv/riscv.cc (struct riscv_tune_param): Add strided
broadcast tunable.
(strided_load_broadcast_p): Implement.
* config/riscv/vector.md: Use strided_load_broadcast_p () and
work around 64-bit broadcast on rv32 targets.

3 weeks ago[PATCH] libgcc: PR target/116363 Fix SFtype to UDWtype conversion
Jan Dubiec [Thu, 10 Jul 2025 13:41:08 +0000 (07:41 -0600)] 
[PATCH] libgcc: PR target/116363 Fix SFtype to UDWtype conversion

This patch fixes SFtype to UDWtype (aka float to unsigned long long)
conversion on targets without DFmode, e.g. H8/300H.  It relies solely
on SFtype->UWtype and UWtype->UDWtype conversions/casts.  The existing code
in line 2218 (counter = a) assigns/casts a float which is *never* less
than Wtype_MAXp1_F to a UWtype int, which of course does not have enough
capacity.
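
One way to do such a conversion using only float-to-32-bit and 32-to-64-bit steps is to split the value at 2^32.  This is a sketch of the idea, not the libgcc2.c fix: float_to_u64 is a hypothetical name, 32-bit words are assumed, and inputs of 2^64 or more are outside its contract.

```cpp
#include <cstdint>

// Convert a float to a 64-bit unsigned integer without any double
// arithmetic.  Precondition: a < 2^64.  Negative values and NaN yield 0.
std::uint64_t float_to_u64(float a) {
  const float two32 = 4294967296.0f;   // 2^32, exactly representable

  if (!(a > 0.0f))
    return 0;
  if (a < two32)
    return static_cast<std::uint32_t>(a);   // fits in one word

  // Dividing by a power of two is exact, and truncating keeps at most the
  // 24 significant bits a float has, so hi converts back to float exactly.
  std::uint32_t hi = static_cast<std::uint32_t>(a / two32);

  // The remainder is therefore exact as well and lies in [0, 2^32).
  float rest = a - static_cast<float>(hi) * two32;
  std::uint32_t lo = static_cast<std::uint32_t>(rest);

  return (static_cast<std::uint64_t>(hi) << 32) | lo;
}
```

The key point is that the value at or above 2^32 is never cast to a 32-bit type directly; only the scaled high part and the remainder are.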

PR target/116363

libgcc/ChangeLog:

* libgcc2.c (__fixunssfDI): Fix SFtype to UDWtype conversion for targets
without LIBGCC2_HAS_DF_MODE defined.

3 weeks ago[RISC-V] Detect new fusions for RISC-V
Daniel Barboza [Thu, 10 Jul 2025 13:28:38 +0000 (07:28 -0600)] 
[RISC-V] Detect new fusions for RISC-V

This is primarily Daniel's work...  He's chasing things in QEMU & LLVM right
now so I'm doing a bit of clean-up and shepherding this patch forward.

--

Instruction fusion is a reasonably common way to improve the performance of
code on many architectures/designs.  A few years ago we submitted (via VRULL I
suspect) fusion support for a number of cases in the RISC-V space.

We made each type of fusion selectable independently in the tuning structure so
that designs which implemented some particular set of fusions could select just
the ones their design implemented.  This patch adds to that generic
infrastructure.

In particular we're introducing additional load fusions, store pair fusions,
bitfield extractions and a few B extension related fusions.

Conceptually for the new load fusions we're adding the ability to fuse most
add/shNadd instructions with a subsequent load.  There are a couple of
exceptions, but in general the expectation is that if we have add/shNadd for
address computation, then they can potentially fuse with the load where the
address gets used.

We've had limited forms of store pair fusion for a while.  Essentially we
required both stores to be 64 bits wide and land on opposite sides of a 128 bit
cache line.  That was enough to help prologues and a few other things, but was
fairly restrictive.  The new cases capture store pairs where the two stores
have the same size and hit consecutive memory locations.  For example, storing
consecutive bytes with sb+sb is fusible.
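
The generalized store-pair condition can be written as a small predicate.  This is a sketch, not riscv_macro_fusion_pair_p: Store and consecutive_store_pair_p are hypothetical, addresses are modelled as a base register plus constant offset, and accepting either ordering of the two stores is an assumption.

```cpp
// Hypothetical view of a store: base register number, byte offset from
// that base, and access size in bytes.
struct Store {
  int base_reg;
  long offset;
  unsigned size;
};

// Two stores are candidates when they use the same base, have the same
// size, and touch back-to-back memory locations.
bool consecutive_store_pair_p(const Store &a, const Store &b) {
  if (a.base_reg != b.base_reg || a.size != b.size)
    return false;
  long size = static_cast<long>(a.size);
  return b.offset == a.offset + size || a.offset == b.offset + size;
}
```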

For bitfield extractions we can fuse together a shift left followed by a shift
right for arbitrary shift counts, whereas previously we restricted the shift
counts to those implementing sign/zero extensions of 8 and 16 bit objects.
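
For reference, the shift pair in question computes an unsigned bitfield extraction.  The helper below is a hypothetical illustration on 64-bit values; with lsb 0 and a width of 8 or 16 it reduces to the zero extensions that were already handled.

```cpp
#include <cstdint>

// Extract `width` bits starting at bit `lsb` using a left shift followed by
// a logical right shift.  Precondition: width >= 1 and lsb + width <= 64.
std::uint64_t extract_bits(std::uint64_t x, unsigned lsb, unsigned width) {
  unsigned left = 64 - lsb - width;   // push the field to the top
  unsigned right = 64 - width;        // bring it back down to bit 0
  return (x << left) >> right;
}
```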

Finally some B extension fusions.  orc.b+not which shows up in string
comparisons, ctz+andi (deepsjeng?), neg+max (synthesized abs).

I hope these prove to be useful to other RISC-V designs.  I wouldn't be
surprised if we have to break down the new load fusions further for some
designs.  If we need to do that it wouldn't be hard.

FWIW, our data indicates the generalized store fusions followed by the expanded
load fusions are the most important cases for the new code.

These have been tested with crosses and bootstrapped on the BPI.

Waiting on pre-commit CI before moving forward (though it has been failing to
pick up some patches recently...)

gcc/
* config/riscv/riscv.cc (riscv_fusion_pairs): Add new cases.
(riscv_set_is_add): New function.
(riscv_set_is_addi, riscv_set_is_adduw, riscv_set_is_shNadd): Likewise.
(riscv_set_is_shNadduw): Likewise.
(riscv_macro_fusion_pair_p): Add new fusion cases.

Co-authored-by: Jeff Law <jlaw@ventanamicro.com>
3 weeks agotestsuite: Add -funwind-tables to sve*/pfalse* tests
Richard Sandiford [Thu, 10 Jul 2025 13:23:57 +0000 (14:23 +0100)] 
testsuite: Add -funwind-tables to sve*/pfalse* tests

The SVE svpfalse folding tests use CFI directives to delimit the
function bodies.  That requires -funwind-tables to be enabled,
which is true by default for *-linux-gnu targets, but not for *-elf.

gcc/testsuite/
* gcc.target/aarch64/sve/pfalse-binary.c: Add -funwind-tables.
* gcc.target/aarch64/sve/pfalse-binary_int_opt_n.c: Likewise.
* gcc.target/aarch64/sve/pfalse-binary_opt_n.c: Likewise.
* gcc.target/aarch64/sve/pfalse-binary_opt_single_n.c: Likewise.
* gcc.target/aarch64/sve/pfalse-binary_rotate.c: Likewise.
* gcc.target/aarch64/sve/pfalse-binary_uint64_opt_n.c: Likewise.
* gcc.target/aarch64/sve/pfalse-binary_uint_opt_n.c: Likewise.
* gcc.target/aarch64/sve/pfalse-binaryxn.c: Likewise.
* gcc.target/aarch64/sve/pfalse-clast.c: Likewise.
* gcc.target/aarch64/sve/pfalse-compare_opt_n.c: Likewise.
* gcc.target/aarch64/sve/pfalse-compare_wide_opt_n.c: Likewise.
* gcc.target/aarch64/sve/pfalse-count_pred.c: Likewise.
* gcc.target/aarch64/sve/pfalse-fold_left.c: Likewise.
* gcc.target/aarch64/sve/pfalse-load.c: Likewise.
* gcc.target/aarch64/sve/pfalse-load_ext.c: Likewise.
* gcc.target/aarch64/sve/pfalse-load_ext_gather_index.c: Likewise.
* gcc.target/aarch64/sve/pfalse-load_ext_gather_offset.c: Likewise.
* gcc.target/aarch64/sve/pfalse-load_gather_sv.c: Likewise.
* gcc.target/aarch64/sve/pfalse-load_gather_vs.c: Likewise.
* gcc.target/aarch64/sve/pfalse-load_replicate.c: Likewise.
* gcc.target/aarch64/sve/pfalse-prefetch.c: Likewise.
* gcc.target/aarch64/sve/pfalse-prefetch_gather_index.c: Likewise.
* gcc.target/aarch64/sve/pfalse-prefetch_gather_offset.c: Likewise.
* gcc.target/aarch64/sve/pfalse-ptest.c: Likewise.
* gcc.target/aarch64/sve/pfalse-rdffr.c: Likewise.
* gcc.target/aarch64/sve/pfalse-reduction.c: Likewise.
* gcc.target/aarch64/sve/pfalse-reduction_wide.c: Likewise.
* gcc.target/aarch64/sve/pfalse-shift_right_imm.c: Likewise.
* gcc.target/aarch64/sve/pfalse-store.c: Likewise.
* gcc.target/aarch64/sve/pfalse-store_scatter_index.c: Likewise.
* gcc.target/aarch64/sve/pfalse-store_scatter_offset.c: Likewise.
* gcc.target/aarch64/sve/pfalse-storexn.c: Likewise.
* gcc.target/aarch64/sve/pfalse-ternary_opt_n.c: Likewise.
* gcc.target/aarch64/sve/pfalse-ternary_rotate.c: Likewise.
* gcc.target/aarch64/sve/pfalse-unary.c: Likewise.
* gcc.target/aarch64/sve/pfalse-unary_convert_narrowt.c: Likewise.
* gcc.target/aarch64/sve/pfalse-unary_convertxn.c: Likewise.
* gcc.target/aarch64/sve/pfalse-unary_n.c: Likewise.
* gcc.target/aarch64/sve/pfalse-unary_pred.c: Likewise.
* gcc.target/aarch64/sve/pfalse-unary_to_uint.c: Likewise.
* gcc.target/aarch64/sve/pfalse-unaryxn.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-binary.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-binary_int_opt_n.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-binary_int_opt_single_n.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-binary_opt_n.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-binary_opt_single_n.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-binary_to_uint.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-binary_uint_opt_n.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-binary_wide.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-compare.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-load_ext_gather_index_restricted.c,
* gcc.target/aarch64/sve2/pfalse-load_ext_gather_offset_restricted.c,
* gcc.target/aarch64/sve2/pfalse-load_gather_sv_restricted.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-load_gather_vs.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-shift_left_imm_to_uint.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-shift_right_imm.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-store_scatter_index_restricted.c,
* gcc.target/aarch64/sve2/pfalse-store_scatter_offset_restricted.c,
* gcc.target/aarch64/sve2/pfalse-unary.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-unary_convert.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-unary_convert_narrowt.c: Likewise.
* gcc.target/aarch64/sve2/pfalse-unary_to_int.c: Likewise.

3 weeks ago Handle failed gcond pattern gracefully
Richard Biener [Thu, 10 Jul 2025 09:26:04 +0000 (11:26 +0200)] 
Handle failed gcond pattern gracefully

SLP analysis of early break conditions asserts pattern recognition
canonicalized all of them.  But the pattern can fail, for example
when vector types cannot be computed.  So fail gracefully here rather
than ICE when vector types have not been computed yet.

* tree-vect-slp.cc (vect_analyze_slp): Fail for non-canonical
gconds.

3 weeks ago Adjust reduction with conversion SLP build
Richard Biener [Thu, 10 Jul 2025 09:23:59 +0000 (11:23 +0200)] 
Adjust reduction with conversion SLP build

The following adjusts how we set SLP_TREE_VECTYPE for the conversion
node we build when fixing up the reduction with conversion SLP instance.
This should probably see more TLC, but the following avoids relying
on STMT_VINFO_VECTYPE for this.

* tree-vect-slp.cc (vect_build_slp_instance): Do not use
SLP_TREE_VECTYPE to determine the conversion back to the
reduction IV.

3 weeks ago Avoid vect_is_simple_use call from vectorizable_reduction
Richard Biener [Thu, 10 Jul 2025 09:21:26 +0000 (11:21 +0200)] 
Avoid vect_is_simple_use call from vectorizable_reduction

When analyzing the reduction cycle we determine the reduction input
vector type.  For lane-reducing ops we look at the input, but instead
of using vect_is_simple_use, which is problematic for SLP, we should
simply get the vector type of the SLP operand.  If that is not set and
we make one up, we should also record it so that it stays that way.

* tree-vect-loop.cc (vectorizable_reduction): Avoid
vect_is_simple_use and record a vector type if we come
up with one.

3 weeks ago Avoid vect_is_simple_use call from get_load_store_type
Richard Biener [Thu, 10 Jul 2025 08:25:03 +0000 (10:25 +0200)] 
Avoid vect_is_simple_use call from get_load_store_type

This isn't the required refactoring of vect_check_gather_scatter
but it avoids a now unnecessary call to vect_is_simple_use which
is problematic because it looks at STMT_VINFO_VECTYPE which we
want to get rid of.  SLP build already ensures vect_is_simple_use
on all lane defs, so all we need is to populate the offset_vectype
and offset_dt which is not always set by vect_check_gather_scatter.
Both are easy to get from the SLP child directly.

* tree-vect-stmts.cc (get_load_store_type): Do not use
vect_is_simple_use to fill gather/scatter offset operand
vectype and dt.

3 weeks ago Pass SLP node down to cost hook for reduction cost
Richard Biener [Thu, 10 Jul 2025 08:08:23 +0000 (10:08 +0200)] 
Pass SLP node down to cost hook for reduction cost

The following arranges vector reduction costs to hand down the
SLP node (of the reduction stmt) to the cost hooks, not only the
stmt_info.  This also avoids accessing STMT_VINFO_VECTYPE of an
unrelated stmt to the node that is subject to code generation.

* tree-vect-loop.cc (vect_model_reduction_cost): Get SLP
node instead of stmt_info and use that when recording costs.

3 weeks ago aarch64: PR target/120999: Adjust operands for movprfx alternative of NBSL implementa...
Kyrylo Tkachov [Wed, 9 Jul 2025 17:04:01 +0000 (10:04 -0700)] 
aarch64: PR target/120999: Adjust operands for movprfx alternative of NBSL implementation of NOR

While the SVE2 NBSL instruction accepts MOVPRFX to add more flexibility
due to its tied operands, the destination of the movprfx cannot also be
a source operand.  But the offending pattern in aarch64-sve2.md tries
to do exactly that for the "=?&w,w,w" alternative and gas warns for the
attached testcase.

This patch adjusts that alternative to avoid taking operand 0 as an input
in the NBSL again.

So for the testcase in the patch we now generate:
nor_z:
        movprfx z0, z1
        nbsl    z0.d, z0.d, z2.d, z1.d
        ret

instead of the previous:
nor_z:
        movprfx z0, z1
        nbsl    z0.d, z0.d, z2.d, z0.d
        ret

which generated a gas warning.
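
As a sanity check of the identity the fixed alternative relies on, the
following scalar C++ sketch models NBSL on 64-bit values.  It assumes the
usual SVE2 definition of BSL (take bits of the first source where the
selector is 1 and bits of the second source elsewhere), with NBSL as its
bitwise inverse.  The helper names nbsl and nor_via_nbsl are made up for
illustration; they are not GCC or ACLE functions.

```cpp
#include <cstdint>

// Scalar model of SVE2 NBSL (assumed semantics): inverted bitwise select.
// Bits come from `first` where `sel` is 1 and from `second` where it is 0,
// then the whole result is inverted.
constexpr std::uint64_t nbsl(std::uint64_t first, std::uint64_t second,
                             std::uint64_t sel)
{
  return ~((first & sel) | (second & ~sel));
}

// Mirrors "movprfx z0, z1; nbsl z0.d, z0.d, z2.d, z1.d": the destination
// is preloaded with a (z1), and the selector reads z1 rather than z0.
constexpr std::uint64_t nor_via_nbsl(std::uint64_t a, std::uint64_t b)
{
  return nbsl(a, b, a);
}

// (a & a) | (b & ~a) == a | b, so the inverted select is NOR.
static_assert(nor_via_nbsl(0xF0F0u, 0x0FF0u) == ~std::uint64_t{0xFFF0u},
              "NBSL with the selector equal to the first source is NOR");
```

The point of the sketch is only that the selector must carry the same value
as the first source; in the fixed sequence it is read from z1, so the NBSL
no longer takes the movprfx destination as an extra input.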

Bootstrapped and tested on aarch64-none-linux-gnu.

Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com>
gcc/

PR target/120999
* config/aarch64/aarch64-sve2.md (*aarch64_sve2_nor<mode>):
Adjust movprfx alternative.

gcc/testsuite/

PR target/120999
* gcc.target/aarch64/sve2/pr120999.c: New test.

3 weeks ago aarch64: Extend HVLA permutations to big-endian
Richard Sandiford [Thu, 10 Jul 2025 09:57:28 +0000 (10:57 +0100)] 
aarch64: Extend HVLA permutations to big-endian

TARGET_VECTORIZE_VEC_PERM_CONST has code to match the SVE2.1
"hybrid VLA" DUPQ, EXTQ, UZPQ{1,2}, and ZIPQ{1,2} instructions.
This matching was conditional on !BYTES_BIG_ENDIAN.

The ACLE code also lowered the associated SVE2.1 intrinsics into
suitable VEC_PERM_EXPRs.  This lowering was not conditional on
!BYTES_BIG_ENDIAN.

The mismatch led to lots of ICEs in the ACLE tests on big-endian
targets: we lowered to VEC_PERM_EXPRs that are not supported.

I think the !BYTES_BIG_ENDIAN restriction was unnecessary.
SVE maps the first memory element to the least significant end of
the register for both endiannesses, so no endian correction or lane
number adjustment is necessary.

This is in some ways a bit counterintuitive.  ZIPQ1 is conceptually
"apply Advanced SIMD ZIP1 to each 128-bit block" and endianness does
matter when choosing between Advanced SIMD ZIP1 and ZIP2.  For example,
the V4SI permute selector { 0, 4, 1, 5 } corresponds to ZIP1 for little-
endian and ZIP2 for big-endian.  But the difference between the hybrid
VLA and Advanced SIMD permute selectors is a consequence of the
difference between the SVE and Advanced SIMD element orders.
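
To make the selector example concrete, here is a small C++ sketch of how a
permute selector picks lanes out of the concatenation of two input vectors,
which is how VEC_PERM_EXPR indexes its operands.  The permute and
interleave_low helpers are illustrative names only.  The sketch numbers
lanes in array order and does not model how either architecture maps lanes
to register bits, which is where the endianness difference described above
comes from.

```cpp
#include <array>
#include <cstddef>

// Apply a permute selector to the concatenation { a, b }: indices
// 0..N-1 pick from a, indices N..2N-1 pick from b.
template <std::size_t N>
std::array<int, N> permute(const std::array<int, N>& a,
                           const std::array<int, N>& b,
                           const std::array<std::size_t, N>& sel)
{
  std::array<int, N> out{};
  for (std::size_t i = 0; i < N; ++i) {
    std::size_t idx = sel[i] % (2 * N);   // keep the index in range
    out[i] = idx < N ? a[idx] : b[idx - N];
  }
  return out;
}

// The V4SI selector { 0, 4, 1, 5 } interleaves the low halves of a and b,
// giving { a[0], b[0], a[1], b[1] }.
inline std::array<int, 4> interleave_low(const std::array<int, 4>& a,
                                         const std::array<int, 4>& b)
{
  return permute<4>(a, b, {{0, 4, 1, 5}});
}
```

Whether that interleave is called ZIP1 or ZIP2 then depends only on which
end of the register holds lane 0 for the instruction set in question.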

The same thing applies to ACLE intrinsics.  The current lowering of
svzipq1 etc. is correct for both endiannesses.  If ACLE code does:

  2x svld1_s32 + svzipq1_s32 + svst1_s32

then the byte-for-byte result is the same for both endiannesses.
On big-endian targets, this is different from using the Advanced SIMD
sequence below for each 128-bit block:

  2x LDR + ZIP1 + STR

In contrast, the byte-for-byte result of:

  2x svld1q_gather_s32 + svzipq1_s32 + svst1q_scatter_s32

depends on endianness, since the quadword gathers and scatters use
Advanced SIMD byte ordering for each 128-bit block.  This gather/scatter
sequence behaves in the same way as the Advanced SIMD LDR+ZIP1+STR
sequence for both endiannesses.

Programmers writing ACLE code have to be aware of this difference
if they want to support both endiannesses.

The patch includes some new execution tests to verify the expansion
of the VEC_PERM_EXPRs.

gcc/
* doc/sourcebuild.texi (aarch64_sve2_hw, aarch64_sve2p1_hw): Document.
* config/aarch64/aarch64.cc (aarch64_evpc_hvla): Extend to
BYTES_BIG_ENDIAN.

gcc/testsuite/
* lib/target-supports.exp (check_effective_target_aarch64_sve2p1_hw):
New proc.
* gcc.target/aarch64/sve2/dupq_1.c: Extend to big-endian.  Add
noipa attributes.
* gcc.target/aarch64/sve2/extq_1.c: Likewise.
* gcc.target/aarch64/sve2/uzpq_1.c: Likewise.
* gcc.target/aarch64/sve2/zipq_1.c: Likewise.
* gcc.target/aarch64/sve2/dupq_1_run.c: New test.
* gcc.target/aarch64/sve2/extq_1_run.c: Likewise.
* gcc.target/aarch64/sve2/uzpq_1_run.c: Likewise.
* gcc.target/aarch64/sve2/zipq_1_run.c: Likewise.

3 weeks ago Remove dead code dealing with non-SLP
Richard Biener [Thu, 10 Jul 2025 07:44:50 +0000 (09:44 +0200)] 
Remove dead code dealing with non-SLP

After vect_analyze_loop_operations is gone we can clean up
vect_analyze_stmt as it is no longer called out of SLP context.

* tree-vectorizer.h (vect_analyze_stmt): Remove stmt-info
and need_to_vectorize arguments.
* tree-vect-slp.cc (vect_slp_analyze_node_operations_1):
Adjust.
* tree-vect-stmts.cc (can_vectorize_live_stmts): Remove
stmt_info argument and remove non-SLP path.
(vect_analyze_stmt): Remove stmt_info and need_to_vectorize
argument and prune paths no longer reachable.
(vect_transform_stmt): Adjust.

3 weeks ago Comment spelling fix: tunning -> tuning
Jakub Jelinek [Thu, 10 Jul 2025 08:23:31 +0000 (10:23 +0200)] 
Comment spelling fix: tunning -> tuning

Kyrylo noticed another spelling bug and, as usual, the same mistake
happens in multiple places.

2025-07-10  Jakub Jelinek  <jakub@redhat.com>

* config/i386/x86-tune.def: Change "Tunning the" to "tuning" in
comment and use semicolon instead of dot in comment.
* loop-unroll.cc (decide_unroll_stupid): Comment spelling fix,
tunning -> tuning.

3 weeks ago Change bellow in comments to below
Jakub Jelinek [Thu, 10 Jul 2025 08:16:43 +0000 (10:16 +0200)] 
Change bellow in comments to below

While I'm not a native English speaker, I believe all the uses
of bellow (roar/bark/...) in comments in gcc are meant to be
below (beneath/under/...).

2025-07-10  Jakub Jelinek  <jakub@redhat.com>

gcc/
* tree-vect-loop.cc (scale_profile_for_vect_loop): Comment
spelling fix: bellow -> below.
* ipa-polymorphic-call.cc (record_known_type): Likewise.
* config/i386/x86-tune.def: Likewise.
* config/riscv/vector.md (*vsetvldi_no_side_effects_si_extend):
Likewise.
* tree-scalar-evolution.cc (iv_can_overflow_p): Likewise.
* ipa-devirt.cc (add_type_duplicate): Likewise.
* tree-ssa-loop-niter.cc (maybe_lower_iteration_bound): Likewise.
* gimple-ssa-sccopy.cc: Likewise.
* cgraphunit.cc: Likewise.
* graphite.h (struct poly_dr): Likewise.
* ipa-reference.cc (ignore_edge_p): Likewise.
* tree-ssa-alias.cc (ao_compare::compare_ao_refs): Likewise.
* profile-count.h (profile_probability::probably_reliable_p):
Likewise.
* ipa-inline-transform.cc (inline_call): Likewise.
gcc/ada/
* par-load.adb: Comment spelling fix: bellow -> below.
* libgnarl/s-taskin.ads: Likewise.
gcc/testsuite/
* gfortran.dg/g77/980310-3.f: Comment spelling fix: bellow -> below.
* jit.dg/test-debuginfo.c: Likewise.
libstdc++-v3/
* testsuite/22_locale/codecvt/codecvt_unicode.h
(ucs2_to_utf8_out_error): Comment spelling fix: bellow -> below.
(utf16_to_ucs2_in_error): Likewise.

3 weeks ago Remove vect_dissolve_slp_only_groups
Richard Biener [Wed, 9 Jul 2025 13:10:26 +0000 (15:10 +0200)] 
Remove vect_dissolve_slp_only_groups

This function dissolves DR groups that are not subject to SLP, which
means it is no longer necessary.

* tree-vect-loop.cc (vect_dissolve_slp_only_groups): Remove.
(vect_analyze_loop_2): Do not call it.

3 weeks ago Remove vect_analyze_loop_operations
Richard Biener [Wed, 9 Jul 2025 13:04:12 +0000 (15:04 +0200)] 
Remove vect_analyze_loop_operations

This removes the remains of vect_analyze_loop_operations.  All the
checks it still does on LC PHIs of inner loops in outer loop
vectorization should be handled by vectorizable_lc_phi.

* tree-vect-loop.cc (vect_active_double_reduction_p): Remove.
(vect_analyze_loop_operations): Remove.
(vect_analyze_loop_2): Do not call it.

3 weeks ago Remove non-SLP vectorization factor determining
Richard Biener [Wed, 9 Jul 2025 10:53:45 +0000 (12:53 +0200)] 
Remove non-SLP vectorization factor determining

The following removes the VF determining step from non-SLP stmts.
For now we keep setting STMT_VINFO_VECTYPE for all stmts, since there
are too many places to fix, including some more complicated ones, so
this is deferred to a followup.

Along the way this removes vect_update_vf_for_slp, merging the check for
present hybrid SLP stmts into vect_detect_hybrid_slp and failing analysis
early.  This also removes the need to essentially duplicate this check in
the stmt walk of vect_analyze_loop_operations.  Getting rid of that,
and performing some other checks earlier, is also deferred to a followup.

* tree-vect-loop.cc (vect_determine_vf_for_stmt_1): Rename
to ...
(vect_determine_vectype_for_stmt_1): ... this and only set
STMT_VINFO_VECTYPE.  Fail for single-element vector types.
(vect_determine_vf_for_stmt): Rename to ...
(vect_determine_vectype_for_stmt): ... this and only set
STMT_VINFO_VECTYPE. Fail for single-element vector types.
(vect_determine_vectorization_factor): Rename to ...
(vect_set_stmts_vectype): ... this and only set STMT_VINFO_VECTYPE.
(vect_update_vf_for_slp): Remove.
(vect_analyze_loop_operations): Remove walk over stmts.
(vect_analyze_loop_2): Call vect_set_stmts_vectype instead of
vect_determine_vectorization_factor.  Set vectorization factor
from LOOP_VINFO_SLP_UNROLLING_FACTOR.  Fail if vect_detect_hybrid_slp
detects hybrid stmts or when vect_make_slp_decision finds
nothing to SLP.
* tree-vect-slp.cc (vect_detect_hybrid_slp): Move check
whether we have any hybrid stmts here from vect_update_vf_for_slp.
* tree-vect-stmts.cc (vect_analyze_stmt): Remove loop over
stmts.
* tree-vectorizer.h (vect_detect_hybrid_slp): Update.

3 weeks ago RISCV: Remove the v extension requirement for sat scalar run test
Pan Li [Wed, 9 Jul 2025 02:40:52 +0000 (10:40 +0800)] 
RISCV: Remove the v extension requirement for sat scalar run test

The sat scalar run test should not require the v extension, thus
take rv32 || rv64 instead of riscv_v for the requirement.

The following test suites pass with this patch series.
* The rv64gcv full regression test.
* The rv32gcv full regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat/sat_s_add-run-1-i16.c: Take rv32 || rv64
instead of riscv_v for scalar run test.
* gcc.target/riscv/sat/sat_s_add-run-1-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-run-1-i64.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-run-1-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-run-2-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-run-2-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-run-2-i64.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-run-2-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-run-3-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-run-3-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-run-3-i64.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-run-3-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-run-4-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-run-4-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-run-4-i64.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-run-4-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-1-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-1-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-1-i64.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-1-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-2-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-2-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-2-i64.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-2-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-3-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-3-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-3-i64.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-3-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-4-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-4-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-4-i64.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-4-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-1-i16-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-1-i32-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-1-i32-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-1-i64-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-1-i64-to-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-1-i64-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-2-i16-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-2-i32-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-2-i32-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-2-i64-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-2-i64-to-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-2-i64-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-3-i16-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-3-i32-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-3-i32-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-3-i64-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-3-i64-to-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-3-i64-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-4-i16-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-4-i32-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-4-i32-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-4-i64-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-4-i64-to-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-4-i64-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-5-i16-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-5-i32-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-5-i32-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-5-i64-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-5-i64-to-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-5-i64-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-6-i16-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-6-i32-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-6-i32-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-6-i64-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-6-i64-to-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-6-i64-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-7-i16-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-7-i32-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-7-i32-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-7-i64-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-7-i64-to-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-7-i64-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-8-i16-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-8-i32-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-8-i32-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-8-i64-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-8-i64-to-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-8-i64-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-1-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-1-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-1-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-1-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-2-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-2-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-2-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-2-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-3-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-3-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-3-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-3-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-4-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-4-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-4-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-4-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-5-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-5-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-5-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-5-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-6-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-6-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-6-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-6-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-7-u16-from-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-7-u16-from-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-7-u32-from-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-7-u8-from-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-7-u8-from-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_add-run-7-u8-from-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-run-1-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-run-1-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-run-1-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-run-1-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-run-2-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-run-2-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-run-2-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-run-2-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-run-3-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-run-3-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-run-3-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-run-3-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-run-4-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-run-4-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-run-4-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_add_imm-run-4-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-1-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-1-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-1-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-1-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-10-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-10-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-10-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-10-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-11-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-11-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-11-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-11-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-12-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-12-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-12-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-12-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-2-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-2-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-2-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-2-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-3-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-3-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-3-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-3-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-4-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-4-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-4-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-4-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-5-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-5-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-5-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-5-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-6-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-6-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-6-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-6-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-7-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-7-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-7-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-7-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-8-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-8-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-8-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-8-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-9-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-9-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-9-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub-run-9-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-run-1-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-run-1-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-run-1-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-run-1-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-run-2-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-run-2-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-run-2-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-run-2-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-run-3-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-run-3-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-run-3-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-run-3-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-run-4-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-run-4-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-run-4-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_sub_imm-run-4-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-1-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-1-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-1-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-1-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-2-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-2-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-2-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-2-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-3-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-3-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-3-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-3-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-4-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-4-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-4-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-4-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-5-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-5-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-5-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-5-u8.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-6-u16.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-6-u32.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-6-u64.c: Ditto.
* gcc.target/riscv/sat/sat_u_trunc-run-6-u8.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>

3 weeks ago Daily bump.
GCC Administrator [Thu, 10 Jul 2025 00:20:18 +0000 (00:20 +0000)] 
Daily bump.

3 weeks ago cobol: Development round-up. [PR120765, PR119337, PR120794]
Robert Dubner [Wed, 9 Jul 2025 16:24:38 +0000 (12:24 -0400)] 
cobol: Development round-up. [PR120765, PR119337, PR120794]

This collection of changes reflects development by both Jim Lowden and Bob
Dubner.  It includes fixes to the cobcd script; refinements to the multiple-
period syntax; changes to the parser; implementation of DISPLAY/ACCEPT to and
from ENVIRONMENT-NAME, ENVIRONMENT-VALUE, ARGUMENT-NUMBER, ARGUMENT-VALUE and
minor changes to genapi.cc to cut down on the number of cppcheck warnings.

Co-authored-by: James K. Lowden <jklowden@cobolworx.com>
Co-authored-by: Robert Dubner <rdubner@symas.com>
gcc/cobol/ChangeLog:

PR cobol/120765
PR cobol/119337
PR cobol/120794
* Make-lang.in: Take control of the .cc.o rule.
* cbldiag.h (error_msg_direct): New declaration.
(gcc_location_dump): Forward declaration.
(location_dump): Use gcc_location_dump.
* cdf.y: Change some tokens.
* gcobc: Change dialect handling.
* genapi.cc (parser_call_targets_dump): Temporarily remove from service.
(parser_compile_dcls): Combine temporary arrays.
(get_binary_value_from_float): Apply const to one parameter.
(depending_on_value): Localize a boolean variable.
(normal_normal_compare): Likewise.
(cobol_compare): Eliminate cppcheck warning.
(combined_name): Apply const to an input parameter.
(parser_perform): Apply const to a variable.
(parser_accept): Improve handling of special_name_t parameter and
the exception conditions.
(parser_display): Improve handling of special_name_t parameter; use the
os_filename[] string when appropriate.
(program_end_stuff): Rename shadowing variable.
(parser_division): Consolidate temporary char[] arrays.
(parser_file_start): Apply const to a parameter.
(inspect_replacing): Likewise.
(parser_program_hierarchy): Rename shadowing variable.
(mh_identical): Apply const to parameters.
(float_type_of): Likewise.
(picky_memcpy): Likewise.
(mh_numeric_display): Likewise.
(mh_little_endian): Likewise.
(mh_source_is_group): Apply static to a variable.
(move_helper): Quiet a cppcheck warning.
* genapi.h (parser_accept): Add exceptions to declaration.
(parser_accept_under_discussion): Add declaration.
(parser_display): Change to std::vector; add exceptions to declaration.
* lexio.cc (cdf_source_format): Improve source code location handling.
(source_format_t::infer): Likewise.
(is_fixed_format): Likewise.
(is_reference_format): Likewise.
(left_margin): Likewise.
(right_margin): Likewise.
(cobol_set_indicator_column): Likewise.
(include_debug): Likewise.
(continues_at): Likewise.
(indicated): Likewise.
(check_source_format_directive): Likewise.
(cdftext::free_form_reference_format): Likewise.
* parse.y: Tokens; program and function names; DISPLAY and ACCEPT
handling.
* parse_ante.h (class tokenset_t): Removed.
(class current_tokens_t): Removed.
(field_of): Removed.
* scan.l: Token handling.
* scan_ante.h (level_found): Comment.
* scan_post.h (start_condition_str): Remove cast author_state:.
* symbols.cc (symbols_update): Change error message.
(symbol_table_init): Correct and reorder entries.
(symbol_unresolved_file_key): New function definition.
(cbl_file_key_t::deforward): Change error message.
* symbols.h (symbol_unresolved_file_key): New declaration.
(keyword_tok): New function.
(redefined_token): New function.
(class current_tokens_t): New class.
* symfind.cc (symbol_match): Revise error message.
* token_names.h: Reorder and change numbers in comments.
* util.cc (class cdf_directives_t): New class.
(cobol_set_indicator_column): New function.
(cdf_source_format): New function.
(gcc_location_set_impl): Improve column handling in token_location.
(gcc_location_dump): New function.
(class temp_loc_t): Modify constructor.
(error_msg_direct): New function.
* util.h (class source_format_t): New class.

libgcobol/ChangeLog:

* libgcobol.cc (__gg__accept_envar): ACCEPT/DISPLAY environment variables.
(accept_envar): Likewise.
(default_exception_handler): Refine system log entries.
(open_syslog): Likewise.
(__gg__set_env_name): ACCEPT/DISPLAY environment variables.
(__gg__get_env_name): ACCEPT/DISPLAY environment variables.
(__gg__get_env_value): ACCEPT/DISPLAY environment variables.
(__gg__set_env_value): ACCEPT/DISPLAY environment variables.
(__gg__fprintf_stderr): Adjust __attribute__ for printf.
(__gg__set_arg_num): ACCEPT/DISPLAY command-line arguments.
(__gg__accept_arg_value): ACCEPT/DISPLAY command-line arguments.
(__gg__get_file_descriptor): DISPLAY on os_filename[] /dev device.

3 weeks ago libstdc++: Fix __uninitialized_default for constexpr case
Jonathan Wakely [Tue, 8 Jul 2025 09:48:21 +0000 (10:48 +0100)] 
libstdc++: Fix __uninitialized_default for constexpr case

We should not use the std::fill optimization for trivial types during
constant evaluation, because we need to begin the lifetime of all
objects, even trivially default constructible ones.

This fixes a bug that Clang diagnosed:

include/c++/16.0.0/bits/stl_algobase.h:925:11: note: assignment to object outside its lifetime is not allowed in a constant expression
  925 |         *__first = __val;
      |         ~~~~~~~~~^~~~~~~

I initially just added the #ifdef __cpp_lib_is_constant_evaluated check,
but that gave warnings with GCC because the function isn't constexpr
until C++26. So then I tried checking __glibcxx_raw_memory_algorithms
for the value indicating constexpr uninitialized_value_construct, but
that macro depends on __cpp_constexpr >= 202406 and Clang 19 doesn't
support constexpr placement new, so doesn't define it.

So I decided to just change __uninitialized_default to use
_GLIBCXX20_CONSTEXPR which is consistent with __uninitialized_default_n
(which needs to be constexpr because it's used by std::vector). We don't
currently need to use __uninitialized_default in constexpr contexts for
C++20 code, but we might find uses for it, so now it would be possible.

libstdc++-v3/ChangeLog:

* include/bits/stl_uninitialized.h (__uninitialized_default):
Do not use optimized implementation for constexpr case. Use
_GLIBCXX20_CONSTEXPR instead of _GLIBCXX26_CONSTEXPR.

3 weeks agolibstdc++: Add more template keywords to <mdspan> for Clang
Jonathan Wakely [Tue, 8 Jul 2025 21:04:29 +0000 (22:04 +0100)] 
libstdc++: Add more template keywords to <mdspan> for Clang

This fixes:

include/c++/16.0.0/mdspan:1182:33: error: use 'template' keyword to treat 'mapping' as a dependent template name
 1182 |               const typename _OLayout::mapping<_OExtents>&>
      |                                        ^
include/c++/16.0.0/mdspan:1185:31: error: use 'template' keyword to treat 'mapping' as a dependent template name
 1185 |             const typename _OLayout::mapping<_OExtents>&, mapping_type>
      |                                      ^

libstdc++-v3/ChangeLog:

* include/std/mdspan (mdspan): Add template keyword for
dependent name.
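
For illustration, a minimal sketch of the rule Clang enforces here.
DemoLayout, DemoExtents and mapping_of are hypothetical names chosen for
this example; this is not the <mdspan> code itself.

```cpp
#include <type_traits>

// Hypothetical stand-in for a layout policy with a nested member template.
struct DemoLayout {
  template<typename Extents>
  struct mapping { using extents_type = Extents; };
};

struct DemoExtents {};

// Layout is a template parameter, so Layout::mapping is a dependent name.
// The 'template' keyword tells the parser that '<' opens a template
// argument list rather than being a less-than operator.
template<typename Layout, typename Extents>
using mapping_of = typename Layout::template mapping<Extents>;

static_assert(std::is_same<mapping_of<DemoLayout, DemoExtents>,
                           DemoLayout::mapping<DemoExtents>>::value,
              "the dependent member template names the nested mapping");
```

GCC accepts the spelling without 'template' in some contexts where Clang
does not, which is why the keyword has to be added explicitly.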

3 weeks agolibstdc++: Do not use list-initialization in std::span members [PR120997]
Jonathan Wakely [Tue, 8 Jul 2025 13:56:39 +0000 (14:56 +0100)] 
libstdc++: Do not use list-initialization in std::span members [PR120997]

As the bug report shows, for span<const bool> the return statements of
the form `return {data(), count};` will use the new C++26 constructor,
span(initializer_list<element_type>).

Although the conversions from data() to bool and count to bool are
narrowing and should be ill-formed, in system headers the narrowing
diagnostics are suppressed. In any case, even if the compiler diagnosed
them as ill-formed, we still don't want the initializer_list constructor
to be used. We want to use the span(element_type*, size_t) constructor
instead.

Replace the braced-init-list uses with S(data(), count) where S is the
correct return type. We need to make similar changes in the C++26
working draft, which will be taken care of via an LWG issue.
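
The same overload-resolution preference can be seen in a small sketch
that does not involve narrowing.  Seq, braced and parens are hypothetical
names; Seq only mimics the pair of constructor shapes described above and
is not std::span.

```cpp
#include <initializer_list>

// Hypothetical type with a (count, value) constructor and an
// initializer_list constructor, analogous to the two span constructors.
struct Seq {
  int size;
  bool from_list;
  constexpr Seq(int count, int /*value*/) : size(count), from_list(false) {}
  constexpr Seq(std::initializer_list<int> il)
    : size(static_cast<int>(il.size())), from_list(true) {}
};

// A braced-init-list prefers the initializer_list constructor.
constexpr Seq braced(int count, int value) { return {count, value}; }

// Parentheses select the (count, value) constructor.
constexpr Seq parens(int count, int value) { return Seq(count, value); }

static_assert(braced(3, 7).from_list, "braces pick initializer_list");
static_assert(!parens(3, 7).from_list, "parentheses pick (count, value)");
```

With braces the two arguments become two elements; with parentheses they
keep their meaning as a count and a value.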

libstdc++-v3/ChangeLog:

PR libstdc++/120997
* include/std/span (span::first, span::last, span::subspan): Do
not use braced-init-list for return statements.
* testsuite/23_containers/span/120997.cc: New test.

3 weeks agoaarch64: Fix endianness of DFmode vector constants
Richard Sandiford [Wed, 9 Jul 2025 16:44:20 +0000 (17:44 +0100)] 
aarch64: Fix endianness of DFmode vector constants

aarch64_simd_valid_imm tries to decompose a constant into a repeating
series of 64 bits, since most Advanced SIMD and SVE immediate forms
require that.  (The exceptions are handled first.)  It does this by
building up a byte-level register image, lsb first.  If the image does
turn out to repeat every 64 bits, it loads the first 64 bits into an
integer.

At this point, endianness has mostly been dealt with.  Endianness
applies to transfers between registers and memory, whereas at this
point we're dealing purely with register values.

However, one of the things we try is to bitcast the value to a float
and use FMOV.  This involves splitting the value into 32-bit chunks
(stored as longs) and passing them to real_from_target.  The problem
being fixed by this patch is that, when a value spans multiple 32-bit
chunks, real_from_target expects them to be in memory rather than
register order.  Thus index 0 is the most significant chunk if
FLOAT_WORDS_BIG_ENDIAN and the least significant chunk otherwise.
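
A minimal sketch of the chunk ordering described above, assuming a
64-bit value split into two 32-bit chunks.  Chunks and
split_in_memory_order are hypothetical names; the real code passes the
chunks to real_from_target, which is not modelled here.

```cpp
#include <cstdint>

struct Chunks { std::uint32_t word[2]; };

// Split a 64-bit register image into two 32-bit chunks in memory order:
// most significant chunk first when float words are big-endian, least
// significant chunk first otherwise.
constexpr Chunks split_in_memory_order(std::uint64_t bits,
                                       bool float_words_big_endian) {
  return float_words_big_endian
    ? Chunks{{static_cast<std::uint32_t>(bits >> 32),
              static_cast<std::uint32_t>(bits & 0xffffffffu)}}
    : Chunks{{static_cast<std::uint32_t>(bits & 0xffffffffu),
              static_cast<std::uint32_t>(bits >> 32)}};
}

// IEEE 754 binary64 bit pattern of 1.0.
constexpr std::uint64_t kOneAsDouble = 0x3ff0000000000000ull;

static_assert(split_in_memory_order(kOneAsDouble, true).word[0]
              == 0x3ff00000u, "big-endian: index 0 is most significant");
static_assert(split_in_memory_order(kOneAsDouble, false).word[0]
              == 0u, "little-endian: index 0 is least significant");
```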

This fixes aarch64/sve/cond_fadd_1.c and various other tests
for aarch64_be-elf.

gcc/
* config/aarch64/aarch64.cc (aarch64_simd_valid_imm): Account
for FLOAT_WORDS_BIG_ENDIAN when building a floating-point value.

3 weeks agoFix ICE in afdo_adjust_guessed_profile
Jan Hubicka [Wed, 9 Jul 2025 16:30:09 +0000 (18:30 +0200)] 
Fix ICE in afdo_adjust_guessed_profile

gcc/ChangeLog:

* auto-profile.cc (afdo_adjust_guessed_profile): Add forgotten
if (dump_file) guard.

3 weeks agoc++: add passing testcases [PR120243]
Jason Merrill [Wed, 9 Jul 2025 15:13:19 +0000 (11:13 -0400)] 
c++: add passing testcases [PR120243]

These pass now; the first was fixed by r16-1507.

PR c++/120243

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/torture/pr120243-unhandled-1.C: New test.
* g++.dg/coroutines/torture/pr120243-unhandled-2.C: New test.

3 weeks agoc++: generic lambda in template arg [PR121012]
Jason Merrill [Wed, 9 Jul 2025 15:03:31 +0000 (11:03 -0400)] 
c++: generic lambda in template arg [PR121012]

My r16-2065 adding missed errors for auto in a template arg in a lambda
parameter also introduced a bogus error on this testcase, where the auto is
both in a lambda parameter and in a template arg, but in the other order,
which is OK.  So we should clear in_template_argument_list_p for lambdas
like we do so many other parser flags.

PR c++/121012
PR c++/120917

gcc/cp/ChangeLog:

* parser.cc (cp_parser_lambda_expression): Clear
parser->in_template_argument_list_p.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/lambda-targ17.C: New test.

3 weeks agoc++: 'this' in lambda in noexcept-spec [PR121008]
Jason Merrill [Wed, 9 Jul 2025 14:38:20 +0000 (10:38 -0400)] 
c++: 'this' in lambda in noexcept-spec [PR121008]

In r16-970 I changed finish_this_expr to look at current_class_type rather
than current_class_ptr to accommodate explicit object lambdas.  But here in
a lambda in the noexcept-spec, the closure type doesn't yet have the
function as its context, so lambda_expr_this_capture can't find the function
and gives up.  But in this context current_class_ptr refers to the
function's 'this', so let's go back to using it in that case.

PR c++/121008
PR c++/113563

gcc/cp/ChangeLog:

* semantics.cc (finish_this_expr): Do check current_class_ref for
non-lambda.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/lambda-uneval28.C: New test.

3 weeks agoc++: optional template after :: causing error [PR119838]
Marek Polacek [Tue, 8 Jul 2025 18:36:37 +0000 (14:36 -0400)] 
c++: optional template after :: causing error [PR119838]

Found while working on Reflection where we currently reject:

  constexpr auto r = ^^::template C<int>::type;

which should work, because "::template C<int>::" should match the

  nested-name-specifier template(opt) simple-template-id ::

production where the template is optional.  This bug is not limited
to Reflection as demonstrated by the attached test case, so I'm
submitting it separately.

The check_template_keyword_in_nested_name_spec call should ensure that
we're dealing with a template-id if we've seen "template".

PR c++/119838

gcc/cp/ChangeLog:

* parser.cc (cp_parser_nested_name_specifier_opt): New global_p
parameter.  Look for "template" when global_p is true.
(cp_parser_simple_type_specifier): Pass global_p to
cp_parser_nested_name_specifier_opt.

gcc/testsuite/ChangeLog:

* g++.dg/parse/template32.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
3 weeks agoaarch64: Some fixes for SVE INDEX constants
Richard Sandiford [Wed, 9 Jul 2025 15:39:20 +0000 (16:39 +0100)] 
aarch64: Some fixes for SVE INDEX constants

When using SVE INDEX to load an Advanced SIMD vector, we need to
take account of the different element ordering for big-endian
targets.  For example, when big-endian targets store the V4SI
constant { 0, 1, 2, 3 } in registers, 0 becomes the most
significant element, whereas INDEX always operates from the
least significant element.  A big-endian target would therefore
load V4SI { 0, 1, 2, 3 } using:

    INDEX Z0.S, #3, #-1

rather than little-endian's:

    INDEX Z0.S, #0, #1
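
The two forms above follow from reversing the series.  A minimal sketch,
assuming a linear series with a constant step; IndexOperands and
index_operands are hypothetical names, not functions from aarch64.cc.

```cpp
struct IndexOperands { long base; long step; };

// For a constant { first, first + step, ... } with nelts elements, return
// the base and step to give to INDEX.  INDEX counts up from the least
// significant element, which holds the last element of the constant on a
// big-endian target, so the series is reversed there.
constexpr IndexOperands index_operands(long first, long step, int nelts,
                                       bool big_endian) {
  return big_endian
    ? IndexOperands{first + (nelts - 1) * step, -step}
    : IndexOperands{first, step};
}

// V4SI { 0, 1, 2, 3 } as in the example above.
static_assert(index_operands(0, 1, 4, true).base == 3, "big-endian base");
static_assert(index_operands(0, 1, 4, true).step == -1, "big-endian step");
static_assert(index_operands(0, 1, 4, false).base == 0, "little-endian base");
```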

While there, I noticed that we would only check the first vector
in a multi-vector SVE constant, which would trigger an ICE if the
other vectors turned out to be invalid.  This is pretty difficult to
trigger at the moment, since we only allow single-register modes to be
used as frontend & middle-end vector modes, but it can be seen using
the RTL frontend.

gcc/
* config/aarch64/aarch64.cc (aarch64_sve_index_series_p): New
function, split out from...
(aarch64_simd_valid_imm): ...here.  Account for the different
SVE and Advanced SIMD element orders on big-endian targets.
Check each vector in a structure mode.

gcc/testsuite/
* gcc.dg/rtl/aarch64/vec-series-1.c: New test.
* gcc.dg/rtl/aarch64/vec-series-2.c: Likewise.
* gcc.target/aarch64/sve/acle/general/dupq_2.c: Fix expected
output for this big-endian test.
* gcc.target/aarch64/sve/acle/general/dupq_4.c: Likewise.
* gcc.target/aarch64/sve/vec_init_3.c: Restrict to little-endian
targets and add more tests.
* gcc.target/aarch64/sve/vec_init_4.c: New big-endian version
of vec_init_3.c.

3 weeks agoMake the RTL frontend set REG_NREGS correctly
Richard Sandiford [Wed, 9 Jul 2025 15:39:20 +0000 (16:39 +0100)] 
Make the RTL frontend set REG_NREGS correctly

While working on a new testcase that uses the RTL frontend,
I hit a bug where a (reg ...) that spans multiple hard registers
had REG_NREGS set to 1.  This caused various things to misbehave.
For example, if the (reg ...) in question was used as crtl->return_rtx,
only the first register in the group would be marked as live on exit.

gcc/
* read-rtl-function.cc (function_reader::read_rtx_operand_r): Use
hard_regno_nregs to work out REG_NREGS for hard registers.

3 weeks agolibiberty: add routines to handle type-sensitive doubly linked lists
Matthieu Longo [Mon, 10 Feb 2025 11:24:57 +0000 (11:24 +0000)] 
libiberty: add routines to handle type-sensitive doubly linked lists

The implementation of these methods relies on duck typing at compile
time.
The structure corresponding to a node of a doubly linked list needs
to define members 'prev' and 'next', which are pointers to the node
type.
The structure wrapping the nodes and other metadata (first, last, size)
needs to define pointers 'first' and 'last' to the node type, and
an integer member 'size'.

Mutating methods can be bundled together and declared at once via a
single macro, or can be declared separately. The merge sort is bundled
separately.
There are 3 types of macros:
1. for the declaration of prototypes: to use in a header file for a
   public declaration, or as a forward declaration in the source file
   for private declaration.
2. for the declaration of the implementation: to use always in a
   source file.
3. for the invocation of the functions.

The methods can be declared either public or private via the second
argument of the declaration macros.

List of currently implemented methods:
- LINKED_LIST_*:
    - APPEND: insert a node at the end of the list.
    - PREPEND: insert a node at the beginning of the list.
    - INSERT_BEFORE: insert a node before the given node.
    - POP_FRONT: remove the first node of the list.
    - POP_BACK: remove the last node of the list.
    - REMOVE: remove the given node from the list.
    - SWAP: swap the two given nodes in the list.
- LINKED_LIST_MERGE_SORT: a merge sort implementation.
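
The duck-typing requirement can be sketched with C++ templates instead of
the C macros in doubly-linked-list.h.  Everything below (DemoNode,
DemoList, demo_append, demo_pop_front, demo_run) is a hypothetical
illustration and not the libiberty interface; it only assumes the member
names 'prev', 'next', 'first', 'last' and 'size' described above.

```cpp
#include <cstddef>

// Any node type works as long as it has 'prev' and 'next' members.
struct DemoNode { int value; DemoNode* prev; DemoNode* next; };

// Any wrapper works as long as it has 'first', 'last' and 'size'.
struct DemoList { DemoNode* first; DemoNode* last; std::size_t size; };

// Insert a node at the end of the list.
template<typename List, typename Node>
constexpr void demo_append(List& list, Node* node) {
  node->prev = list.last;
  node->next = nullptr;
  if (list.last) list.last->next = node; else list.first = node;
  list.last = node;
  ++list.size;
}

// Remove and return the first node, or nullptr if the list is empty.
template<typename List>
constexpr auto demo_pop_front(List& list) -> decltype(list.first) {
  auto node = list.first;
  if (!node) return nullptr;
  list.first = node->next;
  if (list.first) list.first->prev = nullptr; else list.last = nullptr;
  node->prev = node->next = nullptr;
  --list.size;
  return node;
}

// Append two nodes, pop the front, and encode the popped value and the
// remaining size as value * 10 + size.
constexpr int demo_run() {
  DemoNode a{1, nullptr, nullptr}, b{2, nullptr, nullptr};
  DemoList list{nullptr, nullptr, 0};
  demo_append(list, &a);
  demo_append(list, &b);
  DemoNode* front = demo_pop_front(list);
  return front->value * 10 + static_cast<int>(list.size);
}

static_assert(demo_run() == 11, "popped node 1, one node left");
```

The templates never name DemoNode or DemoList, so any pair of structures
with the expected members can be used with them.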

include/ChangeLog:

* doubly-linked-list.h: New file.

libiberty/ChangeLog:

* Makefile.in: Add new header.
* testsuite/Makefile.in: Add new test.
* testsuite/test-doubly-linked-list.c: New test.

3 weeks agoRISC-V: Add test for vec_duplicate + vssub.vv combine case 1 with GR2VR cost 0, 1...
Pan Li [Mon, 7 Jul 2025 03:17:00 +0000 (11:17 +0800)] 
RISC-V: Add test for vec_duplicate + vssub.vv combine case 1 with GR2VR cost 0, 1 and 2

Add asm dump check tests for the vec_duplicate + vssub.vv combine to
vssub.vx, with GR2VR costs of 0, 1 and 2.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add asm check.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i8.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>
3 weeks agoRISC-V: Add test for vec_duplicate + vssub.vv combine case 0 with GR2VR cost 0, 2...
Pan Li [Mon, 7 Jul 2025 03:13:15 +0000 (11:13 +0800)] 
RISC-V: Add test for vec_duplicate + vssub.vv combine case 0 with GR2VR cost 0, 2 and 15

Add asm dump check and run tests for the vec_duplicate + vssub.vv
combine to vssub.vx, with GR2VR costs of 0, 2 and 15.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add asm check.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_binary.h: Add test
helper macros.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_binary_data.h: Add test
data for run test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vssub-run-1-i16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vssub-run-1-i32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vssub-run-1-i64.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vssub-run-1-i8.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
3 weeks agoRISC-V: Combine vec_duplicate + vssub.vv to vssub.vx on GR2VR cost
Pan Li [Mon, 7 Jul 2025 03:07:11 +0000 (11:07 +0800)] 
RISC-V: Combine vec_duplicate + vssub.vv to vssub.vx on GR2VR cost

This patch combines vec_duplicate + vssub.vv into vssub.vx, as in the
example below.  The related pattern depends on the cost of vec_duplicate
from GR2VR: late-combine performs the combination if the GR2VR cost is
zero, and rejects it if the GR2VR cost is greater than zero.

Assume we have example code like the following, with a GR2VR cost of 0.

  #define DEF_SAT_S_ADD(T, UT, MIN, MAX) \
  T                                      \
  test_##T##_sat_add (T x, T y)          \
  {                                      \
    T sum = (UT)x + (UT)y;               \
    return (x ^ y) < 0                   \
      ? sum                              \
      : (sum ^ x) >= 0                   \
        ? sum                            \
        : x < 0 ? MIN : MAX;             \
  }

  DEF_SAT_S_ADD(int32_t, uint32_t, INT32_MIN, INT32_MAX)
  DEF_VX_BINARY_CASE_2_WRAP(T, SAT_S_ADD_FUNC(T), sat_add)

Before this patch:
  10   │ test_vx_binary_or_int32_t_case_0:
  11   │     beq a3,zero,.L8
  12   │     vsetvli a5,zero,e32,m1,ta,ma
  13   │     vmv.v.x v2,a2
  14   │     slli    a3,a3,32
  15   │     srli    a3,a3,32
  16   │ .L3:
  17   │     vsetvli a5,a3,e32,m1,ta,ma
  18   │     vle32.v v1,0(a1)
  19   │     slli    a4,a5,2
  20   │     sub a3,a3,a5
  21   │     add a1,a1,a4
  22   │     vssub.vv v1,v1,v2
  23   │     vse32.v v1,0(a0)
  24   │     add a0,a0,a4
  25   │     bne a3,zero,.L3

After this patch:
  10   │ test_vx_binary_or_int32_t_case_0:
  11   │     beq a3,zero,.L8
  12   │     slli    a3,a3,32
  13   │     srli    a3,a3,32
  14   │ .L3:
  15   │     vsetvli a5,a3,e32,m1,ta,ma
  16   │     vle32.v v1,0(a1)
  17   │     slli    a4,a5,2
  18   │     sub a3,a3,a5
  19   │     add a1,a1,a4
  20   │     vssub.vx v1,v1,a2
  21   │     vse32.v v1,0(a0)
  22   │     add a0,a0,a4
  23   │     bne a3,zero,.L3
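
For reference, a scalar sketch of the per-element arithmetic that a
signed saturating subtract performs on 32-bit lanes.  sat_sub_i32 is a
hypothetical helper written for this note, not code from the patch or the
testsuite, and it does not model any saturation status flag.

```cpp
#include <cstdint>
#include <limits>

// Signed saturating subtraction on one 32-bit lane: compute the exact
// difference in 64 bits, then clamp it to the int32_t range.
constexpr std::int32_t sat_sub_i32(std::int32_t x, std::int32_t y) {
  const std::int64_t diff =
    static_cast<std::int64_t>(x) - static_cast<std::int64_t>(y);
  const std::int64_t lo = std::numeric_limits<std::int32_t>::min();
  const std::int64_t hi = std::numeric_limits<std::int32_t>::max();
  return static_cast<std::int32_t>(diff < lo ? lo : diff > hi ? hi : diff);
}

static_assert(sat_sub_i32(5, 7) == -2, "no saturation");
static_assert(sat_sub_i32(std::numeric_limits<std::int32_t>::min(), 1)
              == std::numeric_limits<std::int32_t>::min(),
              "clamps at the minimum");
```

In the vector-scalar form the same scalar y is applied to every lane,
which is why the broadcast (vmv.v.x) is no longer needed.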

gcc/ChangeLog:

* config/riscv/riscv-v.cc (expand_vx_binary_vec_vec_dup): Add
new case SS_MINUS.
* config/riscv/riscv.cc (riscv_rtx_costs): Ditto.
* config/riscv/vector-iterators.md: Add new op ss_minus.

Signed-off-by: Pan Li <pan2.li@intel.com>
3 weeks ago[PATCH] RISC-V: Enable zvfh for vector-scalar half-float run tests
Paul-Antoine Arras [Wed, 9 Jul 2025 14:36:24 +0000 (08:36 -0600)] 
[PATCH] RISC-V: Enable zvfh for vector-scalar half-float run tests

zvfh is not enabled at the testsuite level. It has to be enabled on a testcase
by testcase basis. This was correctly done for compile tests but not for run
tests. This patch fixes it.
Also, to ensure correct results with half-precision floats, MAX_RELATIVE_DIFF is
set according to the type.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vf_mulop_run.h: Set
MAX_RELATIVE_DIFF depending on type.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfmacc-run-1-f16.c: Enable zvfh.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfmadd-run-1-f16.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfmsac-run-1-f16.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfmsub-run-1-f16.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfnmacc-run-1-f16.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfnmadd-run-1-f16.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfnmsac-run-1-f16.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfnmsub-run-1-f16.c: Likewise.

3 weeks ago[PATCH] RISC-V: Adjust testdata for unsigned vector SAT_SUB
Ciyan Pan [Wed, 9 Jul 2025 14:31:25 +0000 (08:31 -0600)] 
[PATCH] RISC-V: Adjust testdata for unsigned vector SAT_SUB

This patch moves the test data for unsigned vector SAT_SUB into vec_sat_data.h.

Passed the rv64gcv regression test.

Signed-off-by: Ciyan Pan <panciyan@eswincomputing.com>
gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/sat/vec_sat_arith.h: Add vec_sat_u_sub_fmt wrap define.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_data.h: Add vec_sat_u_sub test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-1-u16.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-1-u32.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-1-u64.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-1-u8.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-10-u16.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-10-u32.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-10-u64.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-10-u8.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-2-u16.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-2-u32.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-2-u64.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-2-u8.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-3-u16.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-3-u32.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-3-u64.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-3-u8.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-4-u16.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-4-u32.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-4-u64.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-4-u8.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-5-u16.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-5-u32.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-5-u64.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-5-u8.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-6-u16.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-6-u32.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-6-u64.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-6-u8.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-7-u16.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-7-u32.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-7-u64.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-7-u8.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-8-u16.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-8-u32.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-8-u64.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-8-u8.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-9-u16.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-9-u32.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-9-u64.c: Remove test data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-9-u8.c: Remove test data.

3 weeks agotestsuite: Add a couple of fstack_protector guards
Richard Sandiford [Wed, 9 Jul 2025 14:01:17 +0000 (15:01 +0100)] 
testsuite: Add a couple of fstack_protector guards

These tests required runtime support for -fstack-protector,
but didn't test for it.

gcc/testsuite/
* gcc.target/aarch64/pr118348_1.c: Require fstack_protector.
* gcc.target/aarch64/pr118348_2.c: Likewise.

3 weeks agoext-dce: Fix subreg_lsb is_constant assumption (2)
Richard Sandiford [Wed, 9 Jul 2025 13:59:34 +0000 (14:59 +0100)] 
ext-dce: Fix subreg_lsb is_constant assumption (2)

This patch fixes another instance of the problem described in the
cover note for g:bf3037e923e9f91d93ab64bdf73a37f64f659fb9.

gcc/
* ext-dce.cc (ext_dce_process_uses): Apply is_constant directly
to the subreg_lsb.

3 weeks ago[PATCH] [PR target/109286] H8/300: Fix warnings about initfini sections missing attri...
Jan Dubiec [Wed, 9 Jul 2025 12:09:20 +0000 (06:09 -0600)] 
[PATCH] [PR target/109286] H8/300: Fix warnings about initfini sections missing attributes

The patch changes the order of inclusions, i.e. elfos.h is now included
before the target-specific h8300/h8300.h, in a way similar to a few other
targets.  Thanks to this change it is possible to override macros from
elfos.h in h8300/h8300.h, in particular the .init/.fini section definitions.

PR target/109286

gcc/ChangeLog:

* config.gcc: Include elfos.h before h8300/h8300.h.

* config/h8300/h8300.h (INIT_SECTION_ASM_OP): Override
default version from elfos.h.
(FINI_SECTION_ASM_OP): Ditto.
(ASM_DECLARE_FUNCTION_NAME): Ditto.
(ASM_GENERATE_INTERNAL_LABEL): Macro removed because it was
being overridden in elfos.h anyway.
(ASM_OUTPUT_SKIP): Ditto.

3 weeks agogimple-fold: extend vector simplification to match scalar bitwise optimizations ...
Icen Zeyada [Wed, 9 Jul 2025 11:57:17 +0000 (12:57 +0100)] 
gimple-fold: extend vector simplification to match scalar bitwise optimizations [PR119196]

    Generalize existing scalar gimple_fold rules to apply the same
    bitwise comparison simplifications to vector types.  Previously, an
    expression like

        (x < y) && (x > y)

    would fold to `false` if x and y are scalars, but equivalent vector
    comparisons were left untouched.  This patch enables folding of
    patterns of the form

        (cmp x y) bit_and (cmp x y)
        (cmp x y) bit_ior (cmp x y)
        (cmp x y) bit_xor (cmp x y)

    for vector operands as well, ensuring consistent optimization across
    all data types.
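
    As a sanity check of the kind of identity involved, here is a
    lane-by-lane sketch for integer elements.  The names are hypothetical
    and the code does not use match.pd or vector types; it only checks,
    per lane, identities that a folder could exploit.

```cpp
// Per-lane identities for integer x and y:
//   (x < y) & (x > y)  is always false
//   (x < y) | (x > y)  equals (x != y)
//   (x < y) ^ (x > y)  equals (x != y)
constexpr bool lane_identities_hold(int x, int y) {
  const bool lt = x < y;
  const bool gt = x > y;
  return !(lt & gt)
      && ((lt | gt) == (x != y))
      && ((lt ^ gt) == (x != y));
}

// Check every lane of two 4-element "vectors".
constexpr bool all_lanes_hold(const int (&xs)[4], const int (&ys)[4]) {
  for (int i = 0; i < 4; ++i)
    if (!lane_identities_hold(xs[i], ys[i]))
      return false;
  return true;
}

constexpr int kDemoX[4] = {1, 5, 7, -3};
constexpr int kDemoY[4] = {1, 2, 9, -3};

static_assert(all_lanes_hold(kDemoX, kDemoY), "identities hold per lane");
```

    Floating-point lanes need more care because of NaNs, so the sketch is
    restricted to integers.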

gcc/ChangeLog:

PR tree-optimization/119196
* match.pd: Allow scalar optimizations with bitwise AND/OR/XOR to apply to vectors.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/vector-compare-5.c: Add new test for vector compare simplification.

Signed-off-by: Icen Zeyada <Icen.Zeyada2@arm.com>
3 weeks agotree-simplify: unify simple_comparison ops in vec_cond for bit and/or/xor [PR119196]
Icen Zeyada [Wed, 9 Jul 2025 11:57:11 +0000 (12:57 +0100)] 
tree-simplify: unify simple_comparison ops in vec_cond for bit and/or/xor [PR119196]

Merge simple_comparison patterns under a single vec_cond_expr for bit_and,
bit_ior, and bit_xor in the simplify pass.

Ensure that when both operands of a bit_and, bit_ior, or bit_xor are simple_comparison
results, they reside within the same vec_cond_expr rather than separate ones.
This prepares the AST so that subsequent transformations (e.g., folding the
comparisons if possible) can take effect.

gcc/ChangeLog:

PR tree-optimization/119196
* match.pd: Merge multiple vec_cond_expr in a single one for
bit_and, bit_ior and bit_xor.

Signed-off-by: Icen Zeyada <Icen.Zeyada2@arm.com>
3 weeks ago[RISC-V][PR target/120642] Avoid propagating constant AVL for theadvector
Jeff Law [Wed, 9 Jul 2025 11:23:34 +0000 (05:23 -0600)] 
[RISC-V][PR target/120642] Avoid propagating constant AVL for theadvector

AVL propagation currently assumes that it can propagate a constant AVL into any
vector insn and trips an assert if the insn fails to recognize after such a
propagation.

However, for xtheadvector that is not a correct assumption; xtheadvector does
not allow the vector length to be a constant integer (other than zero, which
is allowed via x0).

After consulting with Jin Ma (thanks!) we agree the right fix is to avoid
creating the immediate AVL for xtheadvector.

This has been tested in my tester, just waiting for the pre-commit tester to
spin it.

PR target/120642
gcc/
* config/riscv/riscv-avlprop.cc (pass_avlprop::execute): Do not do
constant AVL propagation for xtheadvector.

gcc/testsuite/
* gcc.target/riscv/rvv/xtheadvector/pr120642.c: New test.

3 weeks agolibstdc++: Add smart ptr owner_equals and owner_hash [PR117403]
Paul Keir [Tue, 8 Jul 2025 12:36:49 +0000 (13:36 +0100)] 
libstdc++: Add smart ptr owner_equals and owner_hash [PR117403]

New structs and member functions added to C++26 by P1901R2.
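
The ownership equivalence that owner_equal expresses can already be
written with owner_before.  The sketch below uses only that pre-C++26
member; same_owner, DemoPair and demo_aliasing_shares_owner are
hypothetical names and not part of the patch.

```cpp
#include <cassert>
#include <memory>

// Two pointers are owner-equivalent when neither orders before the other.
template<typename T, typename U>
bool same_owner(const std::shared_ptr<T>& a, const std::shared_ptr<U>& b) {
  return !a.owner_before(b) && !b.owner_before(a);
}

struct DemoPair { int first; int second; };

// An aliasing shared_ptr points at a member but shares the control block.
inline bool demo_aliasing_shares_owner() {
  auto whole = std::make_shared<DemoPair>(DemoPair{1, 2});
  std::shared_ptr<int> part(whole, &whole->second);
  auto other = std::make_shared<int>(2);

  // The stored pointers differ even though the owner is the same.
  assert(static_cast<void*>(part.get()) != static_cast<void*>(whole.get()));

  return same_owner(whole, part) && !same_owner(part, other);
}
```

Comparing with operator== would look at the stored pointers instead, so
whole and part would not be treated as related.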

libstdc++-v3/ChangeLog:

PR libstdc++/117403
* include/bits/shared_ptr.h (shared_ptr::owner_equal)
(shared_ptr::owner_hash, weak_ptr::owner_equal)
(weak_ptr::owner_hash): Define new member functions.
* include/bits/shared_ptr_base.h (owner_equal, owner_hash):
Define new structs.
* include/bits/version.def (smart_ptr_owner_equality): Define.
* include/bits/version.h: Regenerate.
* include/std/memory: Added define for
__glibcxx_want_smart_ptr_owner_equality.
* testsuite/20_util/owner_equal/version.cc: New test.
* testsuite/20_util/owner_equal/cmp.cc: New test.
* testsuite/20_util/owner_equal/noexcept.cc: New test.
* testsuite/20_util/owner_hash/cmp.cc: New test.
* testsuite/20_util/owner_hash/noexcept.cc: New test.
* testsuite/20_util/shared_ptr/observers/owner_equal.cc: New
test.
* testsuite/20_util/shared_ptr/observers/owner_hash.cc:
New test.
* testsuite/20_util/weak_ptr/observers/owner_equal.cc: New test.
* testsuite/20_util/weak_ptr/observers/owner_hash.cc: New test.

Signed-off-by: Paul Keir <paul.keir@uws.ac.uk>
3 weeks agolibstdc++: Added missing members to numeric_limits specializations for integer-class...
Mateusz Zych [Tue, 8 Jul 2025 09:51:07 +0000 (10:51 +0100)] 
libstdc++: Added missing members to numeric_limits specializations for integer-class types

[iterator.concept.winc]/11 says that std::numeric_limits should be
specialized for integer-class types, with each member defined
appropriately.

libstdc++-v3/ChangeLog:

* include/bits/max_size_type.h (numeric_limits<__max_size_type>):
New members.
(numeric_limits<__max_diff_type>): Likewise.
* testsuite/std/ranges/iota/max_size_type.cc: New test cases.

Signed-off-by: Mateusz Zych <mte.zych@gmail.com>
3 weeks agoAvoid accessing STMT_VINFO_VECTYPE
Richard Biener [Wed, 9 Jul 2025 09:23:30 +0000 (11:23 +0200)] 
Avoid accessing STMT_VINFO_VECTYPE

The following fixes up two places where we access STMT_VINFO_VECTYPE that
are not covered by the fixup in vect_analyze/transform_stmt that sets it
from SLP_TREE_VECTYPE.

* tree-vect-loop.cc (vectorizable_reduction): Get the
output vector type from slp_for_stmt_info.
* tree-vect-stmts.cc (vect_analyze_stmt): Bail out earlier
for PURE_SLP_STMT when doing loop stmt analysis.

3 weeks agotestsuite/120093 - fix gcc.dg/vect/pr101145.c
Richard Biener [Wed, 9 Jul 2025 11:10:13 +0000 (13:10 +0200)] 
testsuite/120093 - fix gcc.dg/vect/pr101145.c

The following changes noinline to noipa to avoid having IPA-CP clones
confusing the vectorized loop counting.

PR testsuite/120093
* gcc.dg/vect/pr101145.c: Use noipa instead of noinline
attribute.

3 weeks agos390: Fix vector pattern tests for -m31.
Juergen Christ [Wed, 9 Jul 2025 09:19:50 +0000 (11:19 +0200)] 
s390: Fix vector pattern tests for -m31.

Vectorization of int patterns requires a 64-bit long type (at least the
way the tests are coded).  Fix this to only test for successful
vectorization on 64-bit targets.

Signed-off-by: Juergen Christ <jchrist@linux.ibm.com>
gcc/testsuite/ChangeLog:

* gcc.target/s390/vector/pattern-avg-1.c: Fix on -m31.
* gcc.target/s390/vector/pattern-mulh-1.c: Fix on -m31.
* gcc.target/s390/vector/pattern-mulh-2.c: Fix on -m31.

3 weeks agoImprove afdo_adjust_guessed_profile
Jan Hubicka [Wed, 9 Jul 2025 09:51:03 +0000 (11:51 +0200)] 
Improve afdo_adjust_guessed_profile

This patch makes afdo_adjust_guessed_profile more robust.  Instead of using
the median of scales we compute a robust average where each weight is taken
from the execution count of the edge the scale originates from.  I also added
a cap, since in some cases the scaling factor may end up being very large,
introducing artificially hot regions of the program that confuse
ipa-profile's histogram-based cutoff.  This was the problem with roms.
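
A minimal sketch of a weighted average with a cap, under stated
assumptions: capped_weighted_scale is a hypothetical helper, the cap is
passed in directly rather than derived from the maximum count in the
function, and the fallback for zero total weight is made up for the
example.  It is not the code in auto-profile.cc.

```cpp
// Weighted average of scales[i] with weights[i], limited to at most cap.
constexpr double capped_weighted_scale(const double* scales,
                                       const double* weights,
                                       int n, double cap) {
  double weighted_sum = 0.0;
  double total_weight = 0.0;
  for (int i = 0; i < n; ++i) {
    weighted_sum += scales[i] * weights[i];
    total_weight += weights[i];
  }
  if (total_weight <= 0.0)
    return 1.0;  // hypothetical fallback: leave counts unscaled
  const double average = weighted_sum / total_weight;
  return average > cap ? cap : average;
}

constexpr double kDemoScales[] = {1.0, 3.0};
constexpr double kDemoWeights[] = {1.0, 3.0};

// (1*1 + 3*3) / (1 + 3) = 2.5
static_assert(capped_weighted_scale(kDemoScales, kDemoWeights, 2, 10.0)
              == 2.5, "uncapped weighted average");
static_assert(capped_weighted_scale(kDemoScales, kDemoWeights, 2, 2.0)
              == 2.0, "the cap limits the result");
```

Compared with a median, a heavily executed edge pulls the result towards
its own scale, and the cap bounds the effect of outliers.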

Bootstrapped/regtested x86_64-linux, committed.

gcc/ChangeLog:

* auto-profile.cc (struct scale): New structure.
(add_scale): Also record weights.
(afdo_adjust_guessed_profile): Compute robust average
of scales and cap by max count in function.

3 weeks agoFix profile scaling in tree-inline.cc:initialize_cfun
Jan Hubicka [Mon, 7 Jul 2025 17:20:25 +0000 (19:20 +0200)] 
Fix profile scaling in tree-inline.cc:initialize_cfun

initialize_cfun calls
 profile_count::adjust_for_ipa_scaling (&num, &den);
but then the result is never used.  This patch fixes it.  Overall, scaling
of the entry/exit blocks is a bit sloppy in tree-inline.  I will see if I can
clean it up.

* tree-inline.cc (initialize_cfun): Use num and den for scaling.

3 weeks agoFix auto-profile.cc:get_original_name
Jan Hubicka [Mon, 7 Jul 2025 15:18:23 +0000 (17:18 +0200)] 
Fix auto-profile.cc:get_original_name

There are two bugs in get_original_name.  First, the for loop walking the list
of known suffixes uses sizeof (suffixes), so it eventually walks to an empty
suffix.  The second problem is that strcmp may accept suffixes that are longer,
i.e. mix up .isra with .israabc.  This is probably not a big deal, but the
first bug makes get_original_name effectively strip all suffixes, even
important ones, on my setup.
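
Both pitfalls can be shown in a small sketch.  The suffix list and the
names kDemoSuffixes, same_text and is_known_suffix are hypothetical and
do not reproduce the table or the fix in auto-profile.cc.

```cpp
#include <cstddef>

// Hypothetical list of known clone suffixes.
constexpr const char* kDemoSuffixes[] = {"isra", "constprop", "part"};

// sizeof (kDemoSuffixes) is a size in bytes; the element count is the
// byte size divided by the size of one element.
constexpr std::size_t kNumDemoSuffixes =
  sizeof (kDemoSuffixes) / sizeof (kDemoSuffixes[0]);

// Exact comparison: both strings must end at the same position.
constexpr bool same_text(const char* a, const char* b) {
  while (*a != '\0' && *a == *b) { ++a; ++b; }
  return *a == *b;
}

// True only if candidate is exactly one of the known suffixes.
constexpr bool is_known_suffix(const char* candidate) {
  for (std::size_t i = 0; i < kNumDemoSuffixes; ++i)
    if (same_text(candidate, kDemoSuffixes[i]))
      return true;
  return false;
}

static_assert(kNumDemoSuffixes == 3, "count elements, not bytes");
static_assert(is_known_suffix("isra"), "exact match is accepted");
static_assert(!is_known_suffix("israabc"), "longer text is rejected");
```

Looping up to sizeof (kDemoSuffixes) instead would read past the three
entries, which is the first problem described above.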

gcc/ChangeLog:

* auto-profile.cc (get_original_name): Fix loop walking the
suffixes.

3 weeks agolibstdc++: Fix memory_resource.cc bootstrap failure for non-gthreads targets
Jonathan Wakely [Wed, 9 Jul 2025 09:14:23 +0000 (10:14 +0100)] 
libstdc++: Fix memory_resource.cc bootstrap failure for non-gthreads targets

The new choose_block_size function added in r16-2112-gac2fb60a67d6d1 was
defined inside an #ifdef _GLIBCXX_HAS_GTHREADS group, which means that
it's not available for single-threaded targets, and so can't be used by
unsynchronized_pool_resource. Move it before that preprocessor group so
it's always defined.

libstdc++-v3/ChangeLog:

* src/c++17/memory_resource.cc: Adjust indentation of unnamed
namespaces.
(pool_sizes): Add comment.
(choose_block_size): Move outside preprocessor group for
gthreads targets.
* testsuite/20_util/synchronized_pool_resource/118681.cc:
Require gthreads.

3 weeks agoFix 'main' function in 'gcc.dg/builtin-dynamic-object-size-pr120780.c'
Thomas Schwinge [Wed, 9 Jul 2025 08:06:39 +0000 (10:06 +0200)] 
Fix 'main' function in 'gcc.dg/builtin-dynamic-object-size-pr120780.c'

Fix-up for commit 72e85d46472716e670cbe6e967109473b8d12d38
"tree-optimization/120780: Support object size for containing objects".
'size_t sz' is unused here, and GCC/nvptx doesn't accept this:

    spawn -ignore SIGHUP [...]/nvptx-none-run ./builtin-dynamic-object-size-pr120780.exe
    error   : Prototype doesn't match for 'main' in 'input file 1 at offset 1924', first defined in 'input file 1 at offset 1924'
    nvptx-run: cuLinkAddData failed: unknown error (CUDA_ERROR_UNKNOWN, 999)
    FAIL: gcc.dg/builtin-dynamic-object-size-pr120780.c execution test

gcc/testsuite/
* gcc.dg/builtin-dynamic-object-size-pr120780.c: Fix 'main' function.

3 weeks agoarm: remove useless push/pop pragmas in arm_neon.h
Christophe Lyon [Wed, 30 Apr 2025 11:07:52 +0000 (11:07 +0000)] 
arm: remove useless push/pop pragmas in arm_neon.h

Remove #pragma GCC target ("arch=armv8.2-a+bf16") since it matches the
preceding pragma GCC target and is thus useless.

gcc/ChangeLog:

* config/arm/arm_neon.h: Remove useless push/pop pragmas.

3 weeks agomiddle-end: Use rounding division for ranges for partial vectors [PR120922]
Tamar Christina [Wed, 9 Jul 2025 07:42:02 +0000 (08:42 +0100)] 
middle-end: Use rounding division for ranges for partial vectors [PR120922]

This patch adds support for niters ranges for partial
vector loops.

Because the last iteration is partial, the bounds should be at least 1,
with niters // vf as the maximum.
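
As a rough illustration of why a rounding (ceiling) division is needed when the
last iteration is partial, the sketch below compares it with a truncating
division.  The helper names floor_div and ceil_div are hypothetical and not
taken from tree-vect-loop-manip.cc; vf is assumed to be non-zero.

```cpp
#include <cstdint>

// Truncating division: counts only full vector iterations.
constexpr std::uint64_t floor_div(std::uint64_t n, std::uint64_t vf)
{
  return n / vf;
}

// Ceiling division: also counts a final partial iteration, if any.
constexpr std::uint64_t ceil_div(std::uint64_t n, std::uint64_t vf)
{
  return n / vf + (n % vf != 0 ? 1 : 0);
}

// 10 scalar iterations with vf == 4: two full vectors plus one partial one.
static_assert(floor_div(10, 4) == 2);
static_assert(ceil_div(10, 4) == 3);

// A vf larger than the iteration count still needs one (masked) iteration.
static_assert(floor_div(255, 1024) == 0);
static_assert(ceil_div(255, 1024) == 1);
```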

gcc/ChangeLog:

PR tree-optimization/120922
* tree-vect-loop-manip.cc (vect_gen_vector_loop_niters): Support range
for partial vectors.

3 weeks agomiddle-end: don't set range on partial vectors [PR120922]
Tamar Christina [Wed, 9 Jul 2025 07:39:35 +0000 (08:39 +0100)] 
middle-end: don't set range on partial vectors [PR120922]

Before the change in g:309dbcea2cabb31bde1a65cdfd30bb7f87b170a2 we would never
set a range for loops that have a constant VF and require partial vectors.

I think a range could be set, since I think the number of latch executions is a
ceiling division of TYPE_MAX_VALUE / vf, to account for the partial iteration.

This would then also deal with the ICE in the PR, where the chosen VF was
much higher than TYPE_MAX_VALUE and a mask is relied upon to make it safe.

Since the patch was not supposed to change behavior, I've added an additional
partial vector check to the const_vf > 0 check to make it explicit that we only
set it on non-partial vectors (the alternative would have been to swap the order
of the vf.constant(&const_vf) check, but that would have hidden the requirement
sneakily).

The second patch adds support for ranges for partial masks.

gcc/ChangeLog:

PR tree-optimization/120922
* tree-vect-loop-manip.cc (vect_gen_vector_loop_niters): Don't set range
for partial vectors.

gcc/testsuite/ChangeLog:

PR tree-optimization/120922
* gcc.dg/vect/pr120922.c: New test.

3 weeks agolibstdc++: Update some baseline_symbols.txt (x32)
H.J. Lu [Wed, 9 Jul 2025 01:43:52 +0000 (09:43 +0800)] 
libstdc++: Update some baseline_symbols.txt (x32)

* config/abi/post/x86_64-linux-gnu/x32/baseline_symbols.txt:
Updated.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
3 weeks agoRISC-V: Disable uint128_t testcase of SAT_MUL when rv32
Pan Li [Tue, 8 Jul 2025 02:46:29 +0000 (10:46 +0800)] 
RISC-V: Disable uint128_t testcase of SAT_MUL when rv32

rv32 doesn't support __uint128, so we get an error like the one
below during testing.

error: '__int128' is not supported on this target.

Thus, we disable the uint128_t related tests on rv32.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat/sat_arith.h: Add xlen check for
uint128_t.
* gcc.target/riscv/sat/sat_u_mul-run-1-u16-from-u128.c: Enable
run test for rv64 only.
* gcc.target/riscv/sat/sat_u_mul-run-1-u32-from-u128.c: Ditto.
* gcc.target/riscv/sat/sat_u_mul-run-1-u64-from-u128.c: Ditto.
* gcc.target/riscv/sat/sat_u_mul-run-1-u8-from-u128.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>
3 weeks agoDaily bump.
GCC Administrator [Wed, 9 Jul 2025 00:20:02 +0000 (00:20 +0000)] 
Daily bump.

3 weeks agolibstdc++: Fix double free in new pool resource test [PR118681]
Jonathan Wakely [Tue, 8 Jul 2025 23:54:33 +0000 (00:54 +0100)] 
libstdc++: Fix double free in new pool resource test [PR118681]

This was supposed to free p1 and p2, not free p2 twice.

libstdc++-v3/ChangeLog:

PR libstdc++/118681
* testsuite/20_util/unsynchronized_pool_resource/118681.cc: Fix
deallocate argument.

3 weeks agoruntime: avoid libc memmove and memclr
Ian Lance Taylor [Tue, 1 Jul 2025 04:26:11 +0000 (21:26 -0700)] 
runtime: avoid libc memmove and memclr

The libc memmove and memclr don't reliably operate on full memory words.
We already avoided them on PPC64, but the same problem can occur even
on x86, where some processors use "rep movsb" and "rep stosb".
Always use C code that stores full memory words.

While we're here, clean up the C code. We don't need special handling
if the memmove/memclr pointers are not pointer-aligned.

Unfortunately, this will likely be slower. Perhaps some day we can
have our own assembly code that operates a word at a time,
or we can use different operations when we know there are no pointers.

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/685178

3 weeks agosyscall: pass correct pointer to system call in recvmsgRaw
Ian Lance Taylor [Tue, 1 Jul 2025 04:23:41 +0000 (21:23 -0700)] 
syscall: pass correct pointer to system call in recvmsgRaw

The code in recvmsgRaw, introduced in https://go.dev/cl/384695,
incorrectly passed &rsa to the recvmsg system call.
But in recvmsgRaw rsa is already a pointer passed by the caller.
This change passes the correct pointer.

I'm guessing that this didn't show up in the testsuite because
we run the tests in short mode.

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/685177

3 weeks agolibstdc++: Ensure pool resources meet alignment requirements [PR118681]
Jonathan Wakely [Fri, 4 Jul 2025 15:44:13 +0000 (16:44 +0100)] 
libstdc++: Ensure pool resources meet alignment requirements [PR118681]

For allocations with size > alignment and size % alignment != 0 we were
sometimes returning pointers that did not meet the requested alignment.
For example, allocate(24, 16) would select the pool for 24-byte objects
and the second allocation from that pool (at offset 24 bytes into the
pool) is only 8-byte aligned not 16-byte aligned.

The pool resources need to round up the requested allocation size to a
multiple of the alignment, so that the selected pool will always return
allocations that meet the alignment requirement.
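
The rounding step can be sketched as follows.  This is not the libstdc++
choose_block_size function; round_up_to_alignment is a hypothetical helper,
and the alignment is assumed to be a power of two.

```cpp
#include <cstddef>

// Hypothetical helper: round `bytes` up to the next multiple of `alignment`.
// Assumes `alignment` is a non-zero power of two.
constexpr std::size_t
round_up_to_alignment(std::size_t bytes, std::size_t alignment)
{
  return (bytes + alignment - 1) & ~(alignment - 1);
}

// allocate(24, 16): a 24-byte pool would place its second block at offset 24,
// which is only 8-byte aligned.  Rounding up selects 32-byte blocks instead.
static_assert(24 % 16 == 8);
static_assert(round_up_to_alignment(24, 16) == 32);
static_assert(round_up_to_alignment(24, 16) % 16 == 0);

// Sizes that are already a multiple of the alignment are unchanged.
static_assert(round_up_to_alignment(32, 16) == 32);
static_assert(round_up_to_alignment(24, 8) == 24);
```

Because every block size chosen this way is a multiple of the alignment, every
block offset within a suitably aligned pool is aligned as well.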

libstdc++-v3/ChangeLog:

PR libstdc++/118681
* src/c++17/memory_resource.cc (choose_block_size): New
function.
(synchronized_pool_resource::do_allocate): Use choose_block_size
to determine appropriate block size.
(synchronized_pool_resource::do_deallocate): Likewise.
(unsynchronized_pool_resource::do_allocate): Likewise.
(unsynchronized_pool_resource::do_deallocate): Likewise.
* testsuite/20_util/synchronized_pool_resource/118681.cc: New
test.
* testsuite/20_util/unsynchronized_pool_resource/118681.cc: New
test.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
3 weeks agoc++: bogus error with union in qualified name [PR83469]
Marek Polacek [Tue, 8 Jul 2025 14:09:36 +0000 (10:09 -0400)] 
c++: bogus error with union in qualified name [PR83469]

While working on Reflection I noticed that we reject:

  union U { int i; };
  constexpr auto r = ^^typename ::U;

which is due to PR83469.  Andrew P. posted a patch in 2021:
https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586344.html
for which I had some comments but an updated patch never came.

~~
There are a few issues here with typenames and unions (and even struct
keywords with unions). First in cp_parser_check_class_key,
we need to allow typenames to name union types and union key
to be able to use with typenames.

The next issue is we need to record if we had a union key,
right now we just record it was a struct/class/typename one
which is wrong.
~~

This patch is an updated and cleaned up version; I've also addressed
a missing bit in pt.cc.

PR c++/83469
PR c++/93809

gcc/cp/ChangeLog:

* cp-tree.h (UNION_TYPE_P): Define.
(TYPENAME_IS_UNION_P): Define.
* decl.cc (struct typename_info): Add union_p field.
(struct typename_hasher::equal): Compare union_p field.
(build_typename_type): Use ti.union_p for union_type.  Set
TYPENAME_IS_UNION_P.
* error.cc (dump_type) <case TYPENAME_TYPE>: Handle
TYPENAME_IS_UNION_P.
* module.cc (trees_out::type_node): Likewise.
* parser.cc (cp_parser_check_class_key): Allow typename key for union
types and allow union keyword for typename types.
* pt.cc (tsubst) <case TYPENAME_TYPE>: Don't conflate unions with
class_type.  For TYPENAME_IS_CLASS_P, check NON_UNION_CLASS_TYPE_P
rather than CLASS_TYPE_P.  Add TYPENAME_IS_UNION_P handling.

gcc/testsuite/ChangeLog:

* g++.dg/template/error45.C: Adjust dg-error.
* g++.dg/warn/Wredundant-tags-3.C: Remove xfail.
* g++.dg/parse/union1.C: New test.
* g++.dg/parse/union2.C: New test.
* g++.dg/parse/union3.C: New test.
* g++.dg/parse/union4.C: New test.
* g++.dg/parse/union5.C: New test.
* g++.dg/parse/union6.C: New test.

Co-authored-by: Andrew Pinski <quic_apinski@quicinc.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
3 weeks agoxtensa: Fix B[GE/LT]UI instructions with immediate values of 32768 or 65536 not being...
Takayuki 'January June' Suwa [Mon, 7 Jul 2025 14:40:17 +0000 (23:40 +0900)] 
xtensa: Fix B[GE/LT]UI instructions with immediate values of 32768 or 65536 not being emitted

This is because in canonicalize_comparison() in gcc/expmed.cc, the COMPARE
rtx_cost() for the immediate values in the title does not change between
the old and new versions.  This patch fixes that.

(note: Currently, this patch only works if some constant propagation
optimizations are enabled (-O2 or higher) or if bare large constant
assignments are possible (-mconst16 or -mauto-litpools).  In the future
I hope to make it work at -O1...)

gcc/ChangeLog:

* config/xtensa/xtensa.cc (xtensa_b4const_or_zero):
Remove.
(xtensa_b4const): Add a case where the value is 0, and rename
to xtensa_b4const_or_zero.
(xtensa_rtx_costs): Fix to also consider the result of
xtensa_b4constu().

gcc/testsuite/ChangeLog:

* gcc.target/xtensa/BGEUI-BLTUI-32k-64k.c: New.

3 weeks agolibstdc++: Fix _GLIBCXX_DEBUG std::forward_list build regression
Jonathan Wakely [Tue, 8 Jul 2025 17:20:13 +0000 (19:20 +0200)] 
libstdc++: Fix _GLIBCXX_DEBUG std::forward_list build regression

Commit 2fd6f42c17a8040dbd3460ca34d93695dacf8575 broke _GLIBCXX_DEBUG
std::forward_list implementation.

libstdc++-v3/ChangeLog:

* include/debug/forward_list (_Safe_forward_list<>::_M_swap):
Adapt to _M_this() signature change.

3 weeks agoc++: Implement part of C++26 P2686R4 - constexpr structured bindings [PR117784]
Jakub Jelinek [Tue, 8 Jul 2025 17:21:55 +0000 (19:21 +0200)] 
c++: Implement part of C++26 P2686R4 - constexpr structured bindings [PR117784]

The following patch implements the constexpr structured bindings part of
the P2686R4 paper, so the [dcl.pre], [dcl.struct.bind], [dcl.constinit]
and first hunk in [dcl.constexpr] changes.
The paper doesn't have a feature test macro and the constexpr structured
binding part of it seems more or less self-contained, so I think it is useful
to get this in independently from the rest.
Of course, automatic constexpr/constinit structured bindings in the
tuple cases or automatic constexpr/constinit structured bindings with auto &
will not really work for now.
Another reason for the split is that for C++ < 26, I think what the patch
implements is basically what the users will see, i.e. we can accept
constexpr or constinit structured binding with pedwarn, but I think we can't
change the constant expression rules in C++ < 26.

I plan to look at the rest of the paper.

2025-07-08  Jakub Jelinek  <jakub@redhat.com>

PR c++/117784
* decl.cc: Implement part of C++26 P2686R4 - constexpr structured
bindings.
(cp_finish_decl): Pedwarn for C++23 and older on constinit on
structured bindings except for static/thread_local where it uses
earlier error.
(grokdeclarator): Pedwarn on constexpr structured bindings for
C++23 and older instead of emitting error always, don't clear
constexpr_p in that case.
* parser.cc (cp_parser_decomposition_declaration): Copy over
DECL_DECLARED_CONSTEXPR_P and DECL_DECLARED_CONSTINIT_P flags.

* g++.dg/cpp1z/decomp3.C (test): For constexpr structured binding
initialize from constexpr var instead of non-constexpr and expect
just a pedwarn for C++23 and older instead of error always.
* g++.dg/cpp26/decomp9.C (foo): Likewise.
* g++.dg/cpp26/decomp22.C: New test.
* g++.dg/cpp26/decomp23.C: New test.
* g++.dg/cpp26/decomp24.C: New test.
* g++.dg/cpp26/decomp25.C: New test.

3 weeks agolibstdc++: Do not expose set_brackets/set_separator for formatter with format_kind...
Tomasz Kamiński [Tue, 8 Jul 2025 08:04:41 +0000 (10:04 +0200)] 
libstdc++: Do not expose set_brackets/set_separator for formatter with format_kind other than sequence [PR119861]

The standard defines separate specializations of range-default-formatter, of
which only the one for range_format::sequence provides the set_brackets and
set_separator methods. We implemented it as one specialization and exposed
these methods for any range_format other than string or debug_string, i.e. when
range_formatter was used as the underlying formatter.

PR libstdc++/119861

libstdc++-v3/ChangeLog:

* include/std/format (formatter<_Rg, _CharT>::set_separator)
(formatter<_Rg, _CharT>::set_brackets): Constrain with
(format_kind<_Rg> == range_format::sequence).
* testsuite/std/format/ranges/pr119861_neg.cc: New test.

3 weeks agolibstdc++: Better CTAD for span and mdspan [PR120914].
Luc Grosheintz [Tue, 8 Jul 2025 09:49:21 +0000 (11:49 +0200)] 
libstdc++: Better CTAD for span and mdspan [PR120914].

This implements P3029R1. In P3029R1, the CTAD for span is refined to
permit deducing the extent of the span from an integral constant, e.g.

  span((T*) ptr, integral_constant<size_t, 5>{});

is deduced as span<T, 5>. Similarly, in

  auto exts = extents(integral_constant<int, 2>{});
  auto md = mdspan((T*) ptr, integral_constant<int, 2>{});

exts and md have types extents<size_t, 2> and mdspan<double,
extents<size_t, 2>>, respectively.

PR libstdc++/120914

libstdc++-v3/ChangeLog:

* include/std/span (span): Update CTAD to enable
integral constants [P3029R1].
* include/std/mdspan (extents): ditto.
(mdspan): ditto.
* testsuite/23_containers/span/deduction.cc: Test deduction
guide.
* testsuite/23_containers/mdspan/extents/misc.cc: ditto.
* testsuite/23_containers/mdspan/mdspan.cc: ditto.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com>
3 weeks agos390: Always compute address of stack protector guard
Stefan Schulze Frielinghaus [Tue, 8 Jul 2025 14:40:34 +0000 (16:40 +0200)] 
s390: Always compute address of stack protector guard

Computing the address of the thread pointer on s390 involves multiple
instructions and therefore bears the risk that the address of the canary
or intermediate values of it are spilled after prologue in order to be
reloaded for the epilogue.  Since there exists no mechanism to ensure
that a value is not coming from stack, as a precaution compute the
address always twice, i.e., one time for the prologue and one time for
the epilogue.  Note, even if there were such a mechanism, emitting
optimal code is non-trivial since there exist cases with opposing
requirements as e.g. if the thread pointer is not only computed for the
TLS guard but also for other TLS objects.  For the latter accesses it is
desired to spill and reload the thread pointer instead of recomputing it
whereas for the former it is not.

gcc/ChangeLog:

* config/s390/s390.md (stack_protect_get_tpsi): New insn.
(stack_protect_get_tpdi): New insn.
(stack_protect_set): Use new insn.
(stack_protect_test): Use new insn.

gcc/testsuite/ChangeLog:

* gcc.target/s390/stack-protector-guard-tls-1.c: New test.

3 weeks agolibstdc++: Silence a warning in a test for span.
Luc Grosheintz [Tue, 8 Jul 2025 09:49:20 +0000 (11:49 +0200)] 
libstdc++: Silence a warning in a test for span.

In a test of span, there's an unused variable myspan. This
commit silences the warning.

libstdc++-v3/ChangeLog:

* testsuite/23_containers/span/contiguous_range_neg.cc: Silence
warning about unused variable myspan.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com>
3 weeks agoAvoid IPA opts around guality plumbing
Richard Biener [Tue, 8 Jul 2025 11:46:01 +0000 (13:46 +0200)] 
Avoid IPA opts around guality plumbing

The following avoids inlining the actual main() (renamed to
guality_main) into the guality plumbing.  This can cause
jump threading opportunities to appear and generally increase
the chance that what we actually test isn't what we think.  Likewise
make guality_check noipa instead of just noinline.

gcc/testsuite/
* gcc.dg/guality/guality.h (guality_main): Declare noipa.
(guality_check): Likewise.

3 weeks agoRISC-V: Do not use vsetivli for THeadVector.
Robin Dapp [Tue, 8 Jul 2025 09:35:12 +0000 (11:35 +0200)] 
RISC-V: Do not use vsetivli for THeadVector.

In emit_vlmax_insn_lra we use a vsetivli for an immediate AVL.
XTHeadVector does not support this, so guard appropriately.

PR target/120461

gcc/ChangeLog:

* config/riscv/riscv-v.cc (emit_vlmax_insn_lra): Do not emit
vsetivli for XTHeadVector.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/xtheadvector/pr120461.c: New test.

3 weeks agoRISC-V: Ignore non-types in builtin function hash.
Robin Dapp [Tue, 8 Jul 2025 09:17:41 +0000 (11:17 +0200)] 
RISC-V: Ignore non-types in builtin function hash.

If a user passes a string that doesn't represent a variable we still try
to compute a hash for its type.  Its tree does not represent a type but
just an exceptional, though.  This patch just ignores it, leaving the
error to the checking code later.

PR target/113829

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins.cc (registered_function::overloaded_hash):
Skip non-type arguments.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr113829.c: New test.

3 weeks agolibstdc++: Set feature test macro for complete C++23 mdspan [PR107761].
Luc Grosheintz [Tue, 8 Jul 2025 08:24:27 +0000 (10:24 +0200)] 
libstdc++: Set feature test macro for complete C++23 mdspan [PR107761].

PR libstdc++/107761

libstdc++-v3/ChangeLog:

* include/bits/version.def (mdspan): Set to 202207 and remove
no_stdname.
* include/bits/version.h: Regenerate.
* testsuite/23_containers/mdspan/version.cc: Test presence
of feature test macro.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com>
3 weeks ago[PATCH] riscv: allow zero in zacas subword atomic cas
Andreas Schwab [Tue, 8 Jul 2025 13:32:17 +0000 (07:32 -0600)] 
[PATCH] riscv: allow zero in zacas subword atomic cas

gcc:
PR target/120995
* config/riscv/sync.md (zacas_atomic_cas_value_strong<mode>):
Allow op3 to be zero.

gcc/testsuite:
PR target/120995
* gcc.target/riscv/amo/zabha-zacas-atomic-cas.c: New test.

3 weeks agolibstdc++: Implement mdspan and tests [PR107761].
Luc Grosheintz [Tue, 8 Jul 2025 08:24:26 +0000 (10:24 +0200)] 
libstdc++: Implement mdspan and tests [PR107761].

Implements the class mdspan as described in N4950, i.e. without P3029.
It also adds tests for mdspan. This commit completes the implementation
of P0009, i.e. the C++23 part of <mdspan>.

PR libstdc++/107761

libstdc++-v3/ChangeLog:

* include/std/mdspan (mdspan): New class.
* src/c++23/std.cc.in (mdspan): Add.
* testsuite/23_containers/mdspan/class_mandate_neg.cc: New test.
* testsuite/23_containers/mdspan/mdspan.cc: New test.
* testsuite/23_containers/mdspan/layout_like.h: Add class
LayoutLike which models a user-defined layout.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com>