git.ipfire.org Git - thirdparty/gcc.git/log
4 weeks ago [sanitizer_common] Fix build on ppc64+musl (#120036)
Jeff Law [Fri, 27 Jun 2025 21:11:41 +0000 (15:11 -0600)] 
[sanitizer_common] Fix build on ppc64+musl (#120036)

Cherry picked from LLVM commit 801b519dfd01e21da0be17aa8f8dc2ceb0eb9e77.

In powerpc64-unknown-linux-musl, signal.h does not include asm/ptrace.h,
which causes "member access into incomplete type 'struct pt_regs'"
errors. Include the header explicitly to fix this.

Also, sanitizer_linux_libcdep.cpp uses TlsPreTcbSize, which is not defined
on this platform. Guard the branch with a macro.

4 weeks ago Fortran: follow-up fix to checking of renamed-on-use interface name [PR120784]
Harald Anlauf [Fri, 27 Jun 2025 21:00:48 +0000 (23:00 +0200)] 
Fortran: follow-up fix to checking of renamed-on-use interface name [PR120784]

Commit r16-1633 introduced a regression for imported interfaces that were
not renamed-on-use, since the related logic did not take into account that
the absence of renaming could be represented by an empty string.
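
As a rough illustration of the failure mode described above, the sketch
below treats an empty local name as "not renamed on use" and falls back to
the original name.  The helper effective_name and its parameters are
hypothetical and are not taken from interface.cc; the real check in
gfc_match_end_interface may be structured quite differently.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical helper (not GCC code): the name under which a use-associated
// interface is visible locally.  An empty local_name stands for "the
// interface was not renamed on use", so the original name applies.
std::string_view effective_name(std::string_view use_name,
                                std::string_view local_name)
{
  assert(!use_name.empty());
  return local_name.empty() ? use_name : local_name;
}
```

Comparing against local_name alone, without the empty-string fallback, would
wrongly reject an interface that was imported without renaming.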

PR fortran/120784

gcc/fortran/ChangeLog:

* interface.cc (gfc_match_end_interface): Detect empty local_name.

gcc/testsuite/ChangeLog:

* gfortran.dg/interface_63.f90: Extend testcase.

4 weeks ago c++: fix decltype_p handling for binary expressions
Jason Merrill [Wed, 25 Jun 2025 20:26:56 +0000 (16:26 -0400)] 
c++: fix decltype_p handling for binary expressions

With Jakub's constexpr virtual base patch,
23_containers/vector/bool/cmp_c++20.cc failed the assert I add to
fixed_type_or_null, meaning that it returned the wrong value.  Let's fix the
result as well as adding the assert, and fix cp_parser_binary_expression to
properly wrap any class-type calls in the operands in TARGET_EXPR even
within a decltype so we don't hit the assert.

gcc/cp/ChangeLog:

* class.cc (fixed_type_or_null): Handle class-type CALL_EXPR.
* parser.cc (cp_parser_binary_expression): Fix decltype_p handling.

4 weeks ago libstdc++: Directly implement ranges::shuffle [PR100795]
Patrick Palka [Fri, 27 Jun 2025 17:53:40 +0000 (13:53 -0400)] 
libstdc++: Directly implement ranges::shuffle [PR100795]

PR libstdc++/100795

libstdc++-v3/ChangeLog:

* include/bits/ranges_algo.h (shuffle_fn::operator()):
Reimplement directly, based on the stl_algo.h implementation.
* testsuite/25_algorithms/shuffle/constrained.cc (test02):
New test.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

4 weeks ago libstdc++: Directly implement ranges::sample [PR100795]
Patrick Palka [Fri, 27 Jun 2025 17:53:37 +0000 (13:53 -0400)] 
libstdc++: Directly implement ranges::sample [PR100795]

PR libstdc++/100795

libstdc++-v3/ChangeLog:

* include/bits/ranges_algo.h (__sample_fn::operator()):
Reimplement the forward_iterator branch directly, based
on the stl_algo.h implementation.  Add explicit cast to
_Out's difference_type in the !forward_iterator branch.
* testsuite/25_algorithms/sample/constrained.cc (test02):
New test.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

4 weeks ago libstdc++: Directly implement ranges::nth_element [PR100795]
Patrick Palka [Fri, 27 Jun 2025 17:53:34 +0000 (13:53 -0400)] 
libstdc++: Directly implement ranges::nth_element [PR100795]

PR libstdc++/100795

libstdc++-v3/ChangeLog:

* include/bits/ranges_algo.h (__detail::__introselect): New,
based on the stl_algo.h implementation.
(nth_element_fn::operator()): Reimplement in terms of the above.
* testsuite/25_algorithms/nth_element/constrained.cc:

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

4 weeks ago libstdc++: Directly implement ranges::stable_partition [PR100795]
Patrick Palka [Fri, 27 Jun 2025 17:53:32 +0000 (13:53 -0400)] 
libstdc++: Directly implement ranges::stable_partition [PR100795]

PR libstdc++/100795

libstdc++-v3/ChangeLog:

* include/bits/ranges_algo.h (__detail::__find_if_not_n): New,
based on the stl_algo.h implementation.
(__detail::__stable_partition_adaptive): Likewise.
(__stable_partition_fn::operator()): Reimplement in terms of
the above.
* testsuite/25_algorithms/stable_partition/constrained.cc
(test03): New test.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

4 weeks ago libstdc++: Directly implement ranges::stable_sort [PR100795]
Patrick Palka [Fri, 27 Jun 2025 17:53:29 +0000 (13:53 -0400)] 
libstdc++: Directly implement ranges::stable_sort [PR100795]

PR libstdc++/100795

libstdc++-v3/ChangeLog:

* include/bits/ranges_algo.h (__detail::__move_merge): New,
based on the stl_algo.h implementation.
(__detail::__merge_sort_loop): Likewise.
(__detail::__chunk_insertion_sort): Likewise.
(__detail::__merge_sort_with_buffer): Likewise.
(__detail::__stable_sort_adaptive): Likewise.
(__detail::__stable_sort_adaptive_resize): Likewise.
(__detail::__inplace_stable_sort): Likewise.
(__stable_sort_fn::operator()): Reimplement in terms of the above.
* testsuite/25_algorithms/stable_sort/constrained.cc:

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

4 weeks ago libstdc++: Directly implement ranges::inplace_merge [PR100795]
Patrick Palka [Fri, 27 Jun 2025 17:53:26 +0000 (13:53 -0400)] 
libstdc++: Directly implement ranges::inplace_merge [PR100795]

As with the previous patch, this patch reimplements ranges::inplace_merge
directly instead of incorrectly forwarding to std::inplace_merge.  In
addition to the compatibility changes listed in the previous patch we
also:

  - explicitly cast the difference type (which can be an integer class) to
    ptrdiff_t when constructing a _Temporary_buffer

PR libstdc++/100795

libstdc++-v3/ChangeLog:

* include/bits/ranges_algo.h (__detail::__move_merge_adaptive):
New, based on the stl_algo.h implementation.
(__detail::__move_merge_adaptive_backward): Likewise.
(__detail::__rotate_adaptive): Likewise.
(__detail::__merge_adaptive): Likewise.
(__detail::__merge_adaptive_resize): Likewise.
(__detail::__merge_without_buffer): Likewise.
(__inplace_merge_fn::operator()): Reimplement in terms of the
above.
* testsuite/25_algorithms/inplace_merge/constrained.cc (test03):
New test.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

4 weeks ago libstdc++: Directly implement ranges::sort [PR100795]
Patrick Palka [Fri, 27 Jun 2025 17:53:06 +0000 (13:53 -0400)] 
libstdc++: Directly implement ranges::sort [PR100795]

As with the previous patch, this patch reimplements ranges::sort
directly instead of incorrectly forwarding to std::sort.  In addition to
the compatibility changes listed in the previous patch we also:

  - use ranges::iter_swap instead of std::iter_swap
  - use ranges::move_backward instead of std::move_backward
  - use __bit_width and __to_unsigned_like instead of __lg

PR libstdc++/100795
PR libstdc++/118209

libstdc++-v3/ChangeLog:

* include/bits/max_size_type.h (__bit_width): New explicit
specialization for __max_size_type.
* include/bits/ranges_algo.h (__detail::__move_median_to_first):
New, based on the stl_algo.h implementation.
(__detail::__unguarded_linear_insert): Likewise.
(__detail::__insertion_sort): Likewise.
(__detail::__sort_threshold): Likewise.
(__detail::__unguarded_insertion_sort): Likewise.
(__detail::__final_insertion_sort): Likewise.
(__detail::__unguarded_partition): Likewise.
(__detail::__unguarded_partition_pivot): Likewise.
(__detail::__heap_select): Likewise.
(__detail::__partial_sort): Likewise.
(__detail::__introsort_loop): Likewise.
(__sort_fn::operator()): Reimplement in terms of the above.
* testsuite/25_algorithms/sort/118209.cc: New test.
* testsuite/25_algorithms/sort/constrained.cc (test03): New test.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

4 weeks ago libstdc++: Directly implement ranges::heap algos [PR100795]
Patrick Palka [Fri, 27 Jun 2025 17:52:56 +0000 (13:52 -0400)] 
libstdc++: Directly implement ranges::heap algos [PR100795]

ranges::push_heap, ranges::pop_heap, ranges::make_heap and
ranges::sort_heap are currently defined in terms of the corresponding
STL-style algorithms, but this is incorrect because the STL-style
algorithms rely on the legacy iterator system, and so misbehave when
passed a narrowly C++20 random access iterator.  The other ranges heap
algos, ranges::is_heap and ranges::is_heap_until, are implemented
directly already and have no known issues.

This patch reimplements these ranges:: algos directly instead, based
closely on the legacy stl_heap.h implementation, with the following
changes for compatibility with the C++20 iterator system:

  - handle non-common ranges by computing the corresponding end iterator
  - use ranges::iter_move instead of std::move(*iter)
  - use iter_value_t / iter_difference_t instead of iterator_traits

Besides these changes, the implementation of these algorithms is
intended to mirror the stl_heap.h implementations, for ease of
maintenance and review.

Note that we don't explicitly pass the projection function throughout,
instead we just create and pass a composite predicate via __make_comp_proj.

PR libstdc++/100795

libstdc++-v3/ChangeLog:

* include/bits/ranges_algo.h (__detail::__push_heap): New,
based on the stl_heap.h implementation.
(__push_heap_fn::operator()): Reimplement in terms of the above.
(__detail::__adjust_heap): New, based on the stl_heap.h
implementation.
(__detail::__pop_heap): Likewise.
(__pop_heap_fn::operator()): Reimplement in terms of the above.
(__make_heap_fn::operator()): Likewise.
(__sort_heap_fn::operator()): Likewise.
* testsuite/25_algorithms/heap/constrained.cc (test03): New
test.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

4 weeks ago libstdc++: Use runtime format for internal format calls in chrono [PR110739]
Tomasz Kamiński [Fri, 27 Jun 2025 07:50:18 +0000 (09:50 +0200)] 
libstdc++: Use runtime format for internal format calls in chrono [PR110739]

This patch adjusts all internal std::format calls inside __formatter_chrono
to use a runtime format string, and thus avoids compile-time checking of the
format string's validity. The majority of cases are covered by calling the
newly introduced _S_empty_fs() function, which returns a
__Runtime_format_string containing _S_empty_spec, instead of passing the
latter directly.

In the case of _M_j we use the _S_str_d3 function (extracted from _S_str_d2),
eliminating the call to std::format outside of the unlikely scenario in which
the day of year is greater than 1000 (this may happen for year_month_day with
month greater than 12). As a consequence, outside of handling subseconds, we
no longer delegate to std::format or construct temporary strings when
formatting chrono types with ok() values.
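
To illustrate why the common case needs neither std::format nor a temporary
string, the sketch below writes a value as exactly three zero-padded decimal
digits into a caller-provided buffer.  The name str_d3 and its signature are
hypothetical; the real _S_str_d3 in chrono_io.h is assumed to follow the
same general idea but may differ in interface and supported range.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical helper (not the libstdc++ function): format n in [0, 999]
// as three zero-padded decimal digits without allocating.
std::string_view str_d3(char (&buf)[3], unsigned n)
{
  assert(n < 1000);
  buf[0] = static_cast<char>('0' + n / 100);
  buf[1] = static_cast<char>('0' + (n / 10) % 10);
  buf[2] = static_cast<char>('0' + n % 10);
  // The view refers to buf, so buf must outlive the returned view.
  return std::string_view(buf, 3);
}
```

A caller would fall back to a general formatting routine only for values
that do not fit in three digits.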

PR libstdc++/110739

libstdc++-v3/ChangeLog:

* include/bits/chrono_io.h (__formatter_chrono::_S_empty_fs): Define.
(__formatter_chrono::_S_str_d2): Use _S_str_d3 for 3+ digits and
place always_inline attribute after comment.
(__formatter_chrono::_S_str_d3): Extracted from _S_str_d2.
(__formatter_chrono::_M_H_I, __formatter_chrono::_M_R_X): Replace
_S_empty_spec with _S_empty_fs().
(__formatter_chrono::_M_j): Likewise and use _S_str_d3 in common
case.
(__format::operator-(_ChronoParts, _ChronoParts))
(__format::operator-=(_ChronoParts, _ChronoParts))
(__formatter_chrono::_S_fill_two_digits)
(__formatter_chrono::_S_str_d1): Place always_inline attribute
after comment.

4 weeks ago c++/modules: Avoid name clashes when streaming internal labels [PR98735,PR118904]
Nathaniel Shead [Tue, 20 May 2025 15:18:49 +0000 (01:18 +1000)] 
c++/modules: Avoid name clashes when streaming internal labels [PR98735,PR118904]

The frontend creates some variables that need to be given unique names
for the TU so that they can unambiguously be accessed.  Historically
this has been done with a global counter local to each place that needs
an internal label, but this doesn't work with modules as depending on
what declarations have been imported, some counter values may have
already been used.

This patch reworks the situation to instead have a single collection of
counters for the TU, and a new function 'generate_internal_label' that
gets the next label with given prefix using that counter.  Modules
streaming can then use this function to regenerate new names on
stream-in for any such decls, guaranteeing uniqueness within the TU.
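
The idea of a single per-TU collection of counters keyed by prefix can be
sketched as below.  The class InternalLabels, its generate member and the
"prefix.N" spelling are assumptions made for illustration only; they are not
the signature or the label format of GCC's generate_internal_label.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the per-TU collection of counters.
class InternalLabels
{
  std::unordered_map<std::string, unsigned> next_;

public:
  // Return the next unused label for PREFIX.  The "prefix.N" spelling is
  // an assumption of this sketch, not GCC's actual format.
  std::string generate(const std::string& prefix)
  {
    assert(!prefix.empty());
    unsigned n = next_[prefix]++;  // a new prefix starts counting at 0
    return prefix + "." + std::to_string(n);
  }
};
```

On stream-in, a decl that carried such a name in the exporting TU would be
given a fresh one from the importing TU's counters, so two imports can never
collide on a counter value that was already used.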

These labels should only be used for internal entities so there should
be no issues with the names differing from TU to TU; we will need to
handle this if we ever start checking ODR of definitions we're merging
but that's an issue for later.

For proof of concept, this patch makes use of the new API for
__builtin_source_location and ubsan; there are probably other places
in the frontend where this change will need to be made as well.
One other change this exposes is that both of these components rely
on the definition of the VAR_DECLs they create, so stream that too
for uncontexted variables.

PR c++/98735
PR c++/118904

gcc/cp/ChangeLog:

* cp-gimplify.cc (source_location_id): Remove.
(fold_builtin_source_location): Use generate_internal_label.
* module.cc (enum tree_tag): Add 'tt_internal_id' enumerator.
(trees_out::tree_value): Adjust assertion, write definitions
of uncontexted VAR_DECLs.
(trees_in::tree_value): Read variable definitions.
(trees_out::tree_node): Write internal labels, adjust assert.
(trees_in::tree_node): Read internal labels.

gcc/ChangeLog:

* tree.cc (struct identifier_hash): New type.
(struct identifier_count_traits): New traits.
(internal_label_nums): New hash map.
(generate_internal_label): New function.
(prefix_for_internal_label): New function.
* tree.h (IDENTIFIER_INTERNAL_P): New macro.
(generate_internal_label): Declare.
(prefix_for_internal_label): Declare.
* ubsan.cc (ubsan_ids): Remove.
(ubsan_type_descriptor): Use generate_internal_label.
(ubsan_create_data): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/modules/src-loc-1.h: New test.
* g++.dg/modules/src-loc-1_a.H: New test.
* g++.dg/modules/src-loc-1_b.C: New test.
* g++.dg/modules/src-loc-1_c.C: New test.
* g++.dg/modules/ubsan-1_a.C: New test.
* g++.dg/modules/ubsan-1_b.C: New test.
* g++.dg/ubsan/module-1-aux.cc: New test.
* g++.dg/ubsan/module-1.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>

4 weeks ago c++/modules: Support streaming new size cookie for constexpr [PR120040]
Nathaniel Shead [Tue, 20 May 2025 13:09:07 +0000 (23:09 +1000)] 
c++/modules: Support streaming new size cookie for constexpr [PR120040]

This type currently has a TYPE_NAME of an IDENTIFIER_NODE.  Although the
documentation indicates this is legal, this confuses modules streaming
which expects all RECORD_TYPEs to have a TYPE_DECL, which is used to
determine the context and merge key, etc.

PR c++/120040

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_constant_expression): Handle TYPE_NAME
now being a TYPE_DECL rather than just an IDENTIFIER_NODE.
* init.cc (build_new_constexpr_heap_type): Build a TYPE_DECL for
the returned type; mark the type as artificial.
* module.cc (trees_out::type_node): Add some assertions.

gcc/testsuite/ChangeLog:

* g++.dg/modules/pr120040_a.C: New test.
* g++.dg/modules/pr120040_b.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>

4 weeks ago c++/modules: Implement streaming of uncontexted TYPE_DECLs [PR98735]
Nathaniel Shead [Tue, 20 May 2025 13:07:20 +0000 (23:07 +1000)] 
c++/modules: Implement streaming of uncontexted TYPE_DECLs [PR98735]

Currently, most declarations must have a DECL_CONTEXT for modules
streaming to behave correctly, so that they can have an appropriate
merge key generated and be correctly deduplicated on import.

There are a few exceptions, however, for internally generated
declarations that will never be merged and don't necessarily have an
appropriate parent to key off for the context.  One case that's come up
a few times is TYPE_DECLs, especially temporary RECORD_TYPEs used as
intermediaries within expressions.

Previously I've tried to give all such types a DECL_CONTEXT, but in some
cases that has ended up being infeasible, such as with the types
generated by UBSan (which are shared with the C frontend and don't know
their context, especially when created at global scope).  Additionally,
these types often don't have many of the parts that a normal struct
declaration created via parsing user code would have, which confuses
module streaming.

Given that these types are typically intended to be one-off and unique
anyway, this patch instead adds support for by-value streaming of
uncontexted TYPE_DECLs.  The patch only supports streaming the bare
minimum amount of fields needed for the cases I've come across so far;
in general the preference should still be to ensure that DECL_CONTEXT is
set where possible.

PR c++/98735
PR c++/120040

gcc/cp/ChangeLog:

* module.cc (trees_out::tree_value): Write TYPE_DECLs.
(trees_in::tree_value): Read TYPE_DECLs.
(trees_out::tree_node): Support uncontexted TYPE_DECLs, and
ensure that all parts of a by-value decl are marked for
streaming.
(trees_out::get_merge_kind): Treat members of uncontexted types
as always unique.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>

4 weeks ago libstdc++: Fix warnings introduced by type-erasing for chrono commits [PR110739]
Tomasz Kamiński [Fri, 27 Jun 2025 10:35:53 +0000 (12:35 +0200)] 
libstdc++: Fix warnings introduced by type-erasing for chrono commits [PR110739]

The r16-1709-g4b3cefed1a08344495fedec4982d85168bd8173f caused `-Woverflow`
in the empty_spec.cc file. This warning is not caused by any issue in shipping
code; it results from taking too much of a shortcut when implementing a
test-only custom representation type Rep, where long was always used to store
a value. In particular, the common type for Rep and long long int was de facto
long. This is addressed by adding an Under template parameter that controls
the type of the stored value, and handling it properly in the common_type
specializations. No changes to shipping code are necessary.

Secondly, extracting the _M_locale_fmt calls in r16-1712-gcaac94 resulted in
the __ctx format parameter no longer being used. This patch removes that
parameter entirely, and replaces the _FormatContext template parameter with an
_OutIter parameter for __out. For consistency, the type of __out is decoupled
from _FormatContext for functions that still need the context:
 * to extract locale (_M_A_a, _M_B_b, _M_c, _M_p, _M_r, _M_subsecs)
 * perform formatting for duration/subseconds (_M_Q, _M_T, _M_S, _M_subsecs)

PR libstdc++/110739

libstdc++-v3/ChangeLog:

* include/bits/chrono_io.h (__formatter_chrono::_M_format_to):
Rename _Out to _OutIter for consistency, and update calls
to specifier functions.
(__formatter_chrono::_M_wi, __formatter_chrono::_M_C_y_Y)
(__formatter_chrono::_M_D_x, __formatter_chrono::_M_d_e)
(__formatter_chrono::_M_F, __formatter_chrono::_M_g_G)
(__formatter_chrono::_M_H_I, __formatter_chrono::_M_j)
(__formatter_chrono::_M_m, __formatter_chrono::_M_M)
(__formatter_chrono::_M_q, __formatter_chrono::_M_R_X)
(__formatter_chrono::_M_u_w, __formatter_chrono::_M_U_V_W)
(__formatter_chrono::_M_z, __formatter_chrono::_M_z):
Remove _FormatContext parameter, and introduce _OutIter
for __out type.
(__formatter_chrono::_M_a_A, __formatter_chrono::_M_B_b)
(__formatter_chrono::_M_p, __formatter_chrono::_M_Q)
(__formatter_chrono::_M_r, __formatter_chrono::_M_S)
(__formatter_chrono::_M_subsecs, __formatter_chrono::_M_T):
Introduce separate _OutIter template parameter for __out.
(__formatter_chrono::_M_c, __formatter_chrono::_M_T):
Likewise, and adjust calls to specifiers functions.
* testsuite/std/time/format/empty_spec.cc: Make underlying
type for Rep configurable.

4 weeks ago Fix afdo profiles for functions that were not early-inlined
Jan Hubicka [Fri, 27 Jun 2025 14:10:31 +0000 (16:10 +0200)] 
Fix afdo profiles for functions that were not early-inlined

This patch should finish the offlining infrastructure by offlining
(prior to AFDO annotation) all inline function instances that were not
early inlined.  This is mostly the case of recursive inlining or when
-fno-auto-profile-inlining is used, which should now produce comparable
code.

I also cleaned up offlining of self-recursive functions, which now
happens through the worklist and reduces problems with recursive invocation
of the function merging modifying data structures at unexpected places.

gcc/ChangeLog:

* auto-profile.cc (function_instance::set_name,
function_instance::set_realized, function_instance::realized_p,
function_instance::set_in_worklist,
function_instance::clear_in_worklist,
function_instance::in_worklist_p): New member functions.
(function_instance::in_worklist, function_instance::realized_):
New.
(get_relative_location_for_locus): Break out from ....
(get_relative_location_for_stmt): ... here.
(function_instance::~function_instance): Sanity check that
removed function is not in worklist.
(function_instance::merge): Do not offline realized instances.
(function_instance::offline): Make private; add duplicate functions
to worklist rather than merging immediately.
(function_instance::offline_if_in_set):  Cleanup.
(function_instance::remove_external_functions): Likewise.
(function_instance::offline_if_not_realized): New member function.
(autofdo_source_profile::offline_external_functions): Handle delayed
functions.
(autofdo_source_profile::offline_unrealized_inlines): New member function.
(walk_block): New function.
(mark_realized_functions): New function.
(afdo_annotate_cfg): Fix dump.
(auto_profile): Mark realized functions and offline rest; do not compute
fn summary.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-prof/afdo-crossmodule-1.c: Update template.

4 weeks ago AVR: target/113934 - Use LRA by default.
Georg-Johann Lay [Fri, 27 Jun 2025 13:44:40 +0000 (15:44 +0200)] 
AVR: target/113934 - Use LRA by default.

Now that the patches for PR120424 are upstream, the last known bug
associated with avr+lra has been fixed: PR118591.  So we can pull the
switch that turns on LRA by default.

This patch only sets -mlra by default.  It doesn't do any Reload related
cleanup or removal from the avr backend, hence -mno-lra still works.

The only new problem is that gcc.dg/torture/pr64088.c fails with LRA
but not with Reload.  Though that test case is awkward since it is UB
but expects the compiler to behave in a specific way which avr-gcc
doesn't do: PR116780.

This patch also avoids a relatively recent ICE that breaks building libgcc:
R24:DI is allowed per hard_regno_mode_ok, but R26:SI is disallowed
for Reload for old reasons.  Outcome is that a split2 pattern for
R24:DI = zero_extend:DI (R22:SI) runs into an ICE.

AVR-LibC builds fine with this patch.
The AVR-LibC testsuite passes without errors.

gcc/
PR target/113934
* config/avr/avr.opt (-mlra): Turn on by default.

4 weeks ago [RISC-V][PR target/119971] Avoid losing shift count masking
Jeff Law [Fri, 27 Jun 2025 13:00:15 +0000 (07:00 -0600)] 
[RISC-V][PR target/119971] Avoid losing shift count masking

Fix typo spotted by Bernhard Reutner-Fischer.

PR target/119971

gcc/testsuite/
* gcc.target/riscv/pr119971.c: Fix typo.

4 weeks ago tree-optimization/120808 - SLP patterns with FMA/FMS
Richard Biener [Fri, 27 Jun 2025 09:42:45 +0000 (11:42 +0200)] 
tree-optimization/120808 - SLP patterns with FMA/FMS

The following amends the SLP addsub pattern to also match blends
of .FMA/.FMS and form .FMADDSUB even when -ffp-contract=off.
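
For orientation, the kind of source that produces a blend of .FMA and .FMS
is sketched below: explicit fused multiply-adds whose addend alternates in
sign from lane to lane.  The function fmaddsub4 is a hypothetical example
and is not the content of bb-slp-pr120808.c; the lane convention (subtract
in even lanes, add in odd lanes) follows the x86 fmaddsub instructions, and
whether GCC actually forms .FMADDSUB from this depends on target and flags.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical example (not the GCC testcase): even lanes compute
// a*b - c, odd lanes compute a*b + c, each as one fused operation.
void fmaddsub4(const double* a, const double* b, const double* c,
               double* out)
{
  assert(a && b && c && out);
  out[0] = std::fma(a[0], b[0], -c[0]);  // FMS lane
  out[1] = std::fma(a[1], b[1], c[1]);   // FMA lane
  out[2] = std::fma(a[2], b[2], -c[2]);  // FMS lane
  out[3] = std::fma(a[3], b[3], c[3]);   // FMA lane
}
```

Because the fused operations are written explicitly, they exist regardless
of -ffp-contract, which is the situation the commit message refers to.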

PR tree-optimization/120808
* tree-vect-slp-patterns.cc (vect_match_expression_p):
Take a code_helper and also match calls.
(addsub_pattern::recognize): Handle .FMA/.FMS pairs
in addition to PLUS/MINUS.
(addsub_pattern::build): Adjust.

* gcc.dg/vect/bb-slp-pr120808.c: Now also expect FMADDSUB
patterns to be matched.

4 weeks ago Fixup vector epilog analysis skipping when not using partial vectors
Richard Biener [Thu, 26 Jun 2025 09:38:47 +0000 (11:38 +0200)] 
Fixup vector epilog analysis skipping when not using partial vectors

The following avoids re-analyzing the loop as epilogue when not
using partial vectors and the mode is the same as the autodetected
vector mode and that has a too high VF for a non-predicated loop.
This situation occurs almost always on x86 and saves us one
re-analysis unless --param vect-partial-vector-usage is non-default.

* tree-vectorizer.h (vect_chooses_same_modes_p): New
overload.
* tree-vect-stmts.cc (vect_chooses_same_modes_p): Likewise.
* tree-vect-loop.cc (vect_analyze_loop): Prune epilogue
analysis further when not using partial vectors.

4 weeks ago Fixup partial_vectors_supported_p use
Richard Biener [Thu, 26 Jun 2025 09:45:05 +0000 (11:45 +0200)] 
Fixup partial_vectors_supported_p use

The following fixes the computation of supports_partial_vectors which
is used to prune the set of modes to iterate over for epilog
vectorization.  The used partial_vectors_supported_p predicate
only looks for while_ult, while we also support predication when
mask modes are integer modes, as for AVX512.

I've noticed this isn't very effective on x86_64 anyway since
if the main loop mode is autodetected we skip re-analyzing
mode_i == 0, but then mode_i == 1 is usually the very same
large mode.  A patch for this will follow, but this will
regress without the fix below.

* tree-vect-loop.cc (vect_analyze_loop): Consider AVX512
style masking when computing supports_partial_vectors.

4 weeks ago libstdc++: Fix Darwin bootstrap by simplifying ver file syntax.
Iain Sandoe [Thu, 26 Jun 2025 22:43:02 +0000 (23:43 +0100)] 
libstdc++: Fix Darwin bootstrap by simplifying ver file syntax.

The symbol parsing script does not handle the closing brace of a new
symbol group and the identifier for the inherited group to be on
different lines, which r16-1708-gaf5b72cf9f564 introduced. Fixed by
making the conditional encompass both the brace and the identifier.

libstdc++-v3/ChangeLog:

* config/abi/pre/gnu.ver: Keep the closing brace of the
CXXABI_1.3.17 symbol group together with the identifier
for the inherited group.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>

4 weeks ago c++: Add fix note for how to declare main in a module
Nathaniel Shead [Wed, 25 Jun 2025 11:24:40 +0000 (21:24 +1000)] 
c++: Add fix note for how to declare main in a module

This patch adds a note to help users unfamiliar with modules terminology
understand how to declare main in a named module since P3618.

There doesn't appear to be an easy robust location available for "the
start of this declaration" that I could find to attach a fixit to, but
the explanation should suffice.

gcc/cp/ChangeLog:

* decl.cc (grokfndecl): Add explanation of how to attach to
global module.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>

4 weeks ago docs: fix a typo in used attribute documentation
Tamar Christina [Fri, 27 Jun 2025 07:23:58 +0000 (08:23 +0100)] 
docs: fix a typo in used attribute documentation

This fixes a small typo in the Label attributes docs.

gcc/ChangeLog:

* doc/extend.texi: Fix typo in used attribute docs.

4 weeks ago x86: Handle vector broadcast source
H.J. Lu [Thu, 26 Jun 2025 02:05:30 +0000 (10:05 +0800)] 
x86: Handle vector broadcast source

Use the inner scalar mode of vector broadcast source in:

  (set (reg:V8DF 394)
       (vec_duplicate:V8DF (reg:V2DF 190 [ alpha ])))

to compute the vector mode for broadcast from vector source.

gcc/

PR target/120830
* config/i386/i386-features.cc (ix86_get_vector_cse_mode): Handle
vector broadcast source.

gcc/testsuite/

PR target/120830
* g++.target/i386/pr120830.C: New test.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>

5 weeks ago [lra] catch all to-sp eliminations with nonzero offsets [PR120424]
Alexandre Oliva [Fri, 27 Jun 2025 00:01:29 +0000 (21:01 -0300)] 
[lra] catch all to-sp eliminations with nonzero offsets [PR120424]

An x86_64-linux-gnu native with ix86_frame_pointer_required modified
to return true for nonzero frames, to exercise
lra_update_fp2sp_elimination, reveals in stage1 testing that wrong
code is generated for gcc.c-torture/execute/ieee/fp-cmp-8l.c:
argp-to-sp eliminations are used for one_test to pass its arguments on
to *pos, and the sp offsets survive the disabling of that elimination.

We didn't really have to disable that elimination, but the x86 backend
disables eliminations to sp if frame_pointer_needed.

This change extends the catching of fp2sp eliminations to all (?)
eliminations to sp with nonzero offsets, since none of them can be
properly reversed and would silently lead to wrong code.

By accepting nonzero offsets, we bootstrap with
-maccumulate-outgoing-args on x86_64-linux-gnu (with
ix86_frame_pointer_required modified to return true on nonzero frame
size).

for  gcc/ChangeLog

PR rtl-optimization/120424
* lra-eliminations.cc (elimination_2sp_occurred_p): Rename
from...
(elimination_fp2sp_occured_p): ... this.  Adjust all uses.
(lra_eliminate_regs_1): Don't require a from-frame-pointer
elimination to set it.
(update_reg_eliminate): Likewise to test it.

5 weeks ago [lra] apply elimination offsets to MEM in autoinc address [PR120424]
Alexandre Oliva [Fri, 27 Jun 2025 00:01:27 +0000 (21:01 -0300)] 
[lra] apply elimination offsets to MEM in autoinc address [PR120424]

When attempting to bootstrap arm-linux-gnueabihf with
{BOOT_C,T}FLAGS='-g -O2 -fnon-call-exceptions
-fstack-clash-protection', gmp fails to build in stage2: gen-fac's
mpz_and gets miscompiled.

A pseudo is initialized before a loop and used in a PRE_INC load
inside a loop.  It gets spilled just as the fp2sp elimination is
disabled, and only the initialization gets adjusted with elimination
offsets.  The unadjusted stack slot within the PRE_INC load ends up
reloaded later, but only when the FP offset has already missed its
chance to be adjusted.

Arrange for lra_eliminate_regs_1 to adjust autoinc addresses that are
MEMs themselves.

for  gcc/ChangeLog

PR rtl-optimization/120424
* lra-eliminations.cc (lra_eliminate_regs_1): Adjust autoinc
addresses that are MEMs.

5 weeks ago [lra] reorder operations in lra_update_fp2sp_elimination [PR120424]
Alexandre Oliva [Fri, 27 Jun 2025 00:01:26 +0000 (21:01 -0300)] 
[lra] reorder operations in lra_update_fp2sp_elimination [PR120424]

The various recent additions to lra_update_fp2sp_elimination rendered
it somewhat confusing, with intermixed groups of statements pertaining
to three different major actions: disabling the elimination,
recomputing live ranges, and spilling uses of the frame pointer.
Reorder them for readability.

for  gcc/ChangeLog

PR rtl-optimization/120424
* lra-eliminations.cc (lra_update_fp2sp_elimination): Reorder
and regroup related statements.

5 weeks ago [lra] rework deactivation of fp2sp elimination [PR120424]
Alexandre Oliva [Fri, 27 Jun 2025 00:01:24 +0000 (21:01 -0300)] 
[lra] rework deactivation of fp2sp elimination [PR120424]

Deactivating the fp2sp elimination in lra_update_fp2sp_elimination
prevents update_reg_eliminate from propagating the fp2sp elimination
offset to the next chosen elimination, so it may retain -1 as the
prev_offset, and prev_offset will be taken as an already-applied
offset that needs to be compensated in the next round of spilling and
reloading.  This affects, for example, crtbegin.o's
__do_global_dtors_aux on arm-linux-gnueabihf in a {BOOT_C,T}FLAGS='-O2
-g -fnon-call-exceptions -fstack-clash-protection' bootstrap.

Alas, just retaining that elimination causes spills to use the fp2sp
elimination, including applying sp offsets, which breaks e.g. an
x86_64-linux-gnu native bootstrap with ix86_frame_pointer_required
modified to return true on nonzero frame size.

The middle-ground solution is to keep the elimination active, so that
its offsets are applied and propagated on to the subsequent fp
elimination, but without introducing sp offsets, so that
e.g. pr103973-18.c on the modified x86_64-linux-gnu doesn't get
adjacent argument pushes of two adjacent on-stack temporaries ending
up pushing the same temporary because of undesired adjustments.

for  gcc/ChangeLog

PR rtl-optimization/120424
* lra-eliminations.cc (lra_update_fp2sp_elimination):
Avoid sp offsets in further fp2sp eliminations...
(update_reg_eliminate): ... and restore to_rtx before assert
checking.

5 weeks ago[lra] recompute ranges upon disabling fp2sp elimination [PR120424]
Alexandre Oliva [Fri, 27 Jun 2025 00:01:22 +0000 (21:01 -0300)] 
[lra] recompute ranges upon disabling fp2sp elimination [PR120424]

If the frame size grows to nonzero, arm_frame_pointer_required may
flip to true under -fstack-clash-protection -fnon-call-exceptions, and
that may disable the fp2sp elimination part-way through lra.

If pseudos had got assigned to the frame pointer register before that,
they have to be spilled, and that requires complete live range
information.  If !lra_reg_spill_p, lra_spill won't have live ranges
for such pseudos, and they could end up sharing spill slots with other
pseudos whose live ranges actually overlap.

This affects at least Ada.Strings.Wide_Superbounded.Super_Insert and
.Super_Replace_Slice in libgnat/a-stwisu.adb, when compiled with -O2
-fstack-clash-protection -march=armv7 (implied Thumb2), causing
acats-4's cdd2a01 to fail.

Recomputing live ranges including registers may renumber and compress
points, so we have to recompute the aggregated live ranges for
already-assigned spill slots as well.

As a safety net, reject empty live ranges when computing slot sharing.
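
To illustrate why an empty live range is dangerous for slot sharing, here is a minimal sketch. It is not the code from lra-spills.cc: the Range and LiveRanges types and both function names are hypothetical, and plain vectors of inclusive [start, finish] points stand in for LRA's live range lists. A naive intersection test reports "no overlap" for a pseudo with no recorded ranges, so such a pseudo would appear compatible with every slot; the guarded check refuses to share instead.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a live range: inclusive program points.
struct Range { int start; int finish; };
using LiveRanges = std::vector<Range>;

// Naive test: true if any range of A overlaps any range of B.
// An empty list never intersects anything.
bool ranges_intersect(const LiveRanges &a, const LiveRanges &b)
{
  for (const Range &x : a)
    for (const Range &y : b)
      {
        assert(x.start <= x.finish && y.start <= y.finish);
        if (x.start <= y.finish && y.start <= x.finish)
          return true;
      }
  return false;
}

// Guarded test: an empty list means liveness is unknown, so be
// conservative and refuse to share a slot.
bool can_share_slot(const LiveRanges &a, const LiveRanges &b)
{
  if (a.empty() || b.empty())
    return false;
  return !ranges_intersect(a, b);
}
```

With ranges [1,3] and [4,6] sharing is allowed; with [1,3] and [3,4], or with an empty list on either side, it is not.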

for  gcc/ChangeLog

PR rtl-optimization/120424
* lra-eliminations.cc (lra_update_fp2sp_elimination):
Compute complete live ranges and recompute slots' live ranges
if needed.
* lra-lives.cc (lra_reset_live_range_list): New.
(lra_complete_live_ranges): New.
* lra-spills.cc (assign_spill_hard_regs): Reject empty live
ranges.
(add_pseudo_to_slot): Likewise.
(lra_recompute_slots_live_ranges): New.
* lra-int.h (lra_reset_live_range_list): Declare.
(lra_complete_live_ranges): Declare.
(lra_recompute_slots_live_ranges): Declare.

5 weeks ago[genoutput] mark scratch outputs as eliminable [PR120424]
Alexandre Oliva [Fri, 27 Jun 2025 00:01:21 +0000 (21:01 -0300)] 
[genoutput] mark scratch outputs as eliminable [PR120424]

acats' fdd2a00.read is miscompiled on arm-linux-gnu with -O2
-fstack-clash-protection -march=armv7-a -marm: a clobbered scratch
register in a *iorsi3_compare0_scratch pattern gets initially assigned
to the frame pointer register, but at some point during lra the frame
size grows to nonzero, arm_frame_pointer_required flips to true, and
the fp2sp elimination has to be disabled, so the scratch register gets
spilled to a stack slot.

It needs to get the sfp elimination at that point, because later
rounds of elimination will assume the previous round's offset has
already been applied.  But since scratch matches are not regarded as
eliminable by genoutput, we don't attempt elimination in the clobbered
stack slot MEM rtx.

Later on, lra issues a reload for that slot, using a new pseudo
allocated to a hardware register, that gets stored in the stack slot
after the original insn.  Elimination in that reload store insn
eventually updates the elimination offset, but it's an incremental
update, assuming that the offset so far has already been applied.

Without applying the initial offset, the store ends up overlapping
with the function's register save area, corrupting a caller's
call-saved register.

AFAICT the old reload's elimination wouldn't be harmed by allowing
elimination in scratch operands, so I'm enabling eliminable for them
regardless.  Should it be found to make a difference, we could
presumably set a different bit in eliminable to enable reload and lra
to tell them apart and behave accordingly.

for  gcc/ChangeLog

PR rtl-optimization/120424
* genoutput.cc (scan_operands): Make MATCH_SCRATCHes eliminable.

5 weeks ago[lra] inactivate disabled fp2sp elimination [PR120424]
Alexandre Oliva [Fri, 27 Jun 2025 00:01:19 +0000 (21:01 -0300)] 
[lra] inactivate disabled fp2sp elimination [PR120424]

Even after we disable the fp2sp elimination when it is the active
elimination for the fp, spilling might use it before
update_reg_eliminate runs and inactivates it for good.  If it is used,
update_reg_eliminate will fail the check that fp2sp was not used.

Since we keep track of uses of this specific elimination, and
lra_update_fp2sp_elimination checks it before disabling it, we know it
hasn't been used, so we can inactivate it without any ill effects.

This fixes the pr118591-1.c avr-none regression exposed by the
PR120424 fix.

for  gcc/ChangeLog

PR rtl-optimization/120424
* lra-eliminations.cc (lra_update_fp2sp_elimination):
Inactivate the unused fp2sp elimination right away.

5 weeks agopru: Split 64-bit moves into a sequence of 32-bit moves
Dimitar Dimitrov [Sun, 9 Feb 2025 15:55:03 +0000 (17:55 +0200)] 
pru: Split 64-bit moves into a sequence of 32-bit moves

The 64-bit register-to-register moves on PRU are implemented with two
instructions moving 32-bit registers.  Defining a split for the 64-bit
moves allows this to be described in RTL, and thus one of the 32-bit
moves to be eliminated if the destination register is dead.

Also, split the loading of non-trivial 64-bit integer constants.  The
resulting 32-bit integer constants have better chance to be loaded with
something more optimal than an "ldi32".
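
As a rough illustration of the constant split (plain C++, not the RTL splitter from pru.md; the function names split64 and join64 are made up for this sketch), a 64-bit constant decomposes into two independent 32-bit halves, each of which can then be materialized on its own:

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Recombine two 32-bit halves into the original 64-bit value.
std::uint64_t join64(std::uint32_t lo, std::uint32_t hi)
{
  return (static_cast<std::uint64_t>(hi) << 32) | lo;
}

// Split a 64-bit value into (low, high) 32-bit halves.
std::pair<std::uint32_t, std::uint32_t> split64(std::uint64_t value)
{
  std::uint32_t lo = static_cast<std::uint32_t>(value & 0xFFFFFFFFu);
  std::uint32_t hi = static_cast<std::uint32_t>(value >> 32);
  assert(join64(lo, hi) == value);  // the split loses no information
  return {lo, hi};
}
```

For example, 0x0000000100000000 splits into a low half of 0 and a high half of 1, two small constants rather than one wide one.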

For now do the splits only after register allocation, because LRA does
not yet efficiently handle subregs.  See
https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651366.html

This patch shows a slight size improvement for the wikisort benchmark from
embench-iot:

Benchmark          size-before  size-after  difference
---------          -----------  ----------  ----------
aha-mont64               1,648       1,648           0
crc32                      104         104           0
depthconv                1,172       1,172           0
edn                      3,040       3,040           0
huffbench                1,616       1,616           0
matmult-int                748         748           0
md5sum                     700         700           0
nettle-aes               2,664       2,664           0
nettle-sha256            5,732       5,732           0
nsichneu                21,372      21,372           0
picojpeg                 9,716       9,716           0
qrduino                  8,556       8,556           0
sglib-combined           3,724       3,724           0
slre                     3,488       3,488           0
statemate                1,132       1,132           0
tarfind                    652         652           0
ud                       1,004       1,004           0
wikisort                18,120      18,092         -28
xgboost                    300         300           0

gcc/ChangeLog:

* config/pru/pru.md (reg move splitter): New splitter for 64-bit
register moves into two 32-bit moves.
(const_int move splitter): New splitter for 64-bit constant
integer moves into two 32-bit moves.

gcc/testsuite/ChangeLog:

* gcc.target/pru/mov64-subreg-1.c: New test.
* gcc.target/pru/mov64-subreg-2.c: New test.

Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
5 weeks agodiagnostics: make 5 more fields of diagnostic_context private
David Malcolm [Thu, 26 Jun 2025 17:29:36 +0000 (13:29 -0400)] 
diagnostics: make 5 more fields of diagnostic_context private

No functional change intended.

gcc/ada/ChangeLog:
* gcc-interface/misc.cc (gnat_init): Use
diagnostic_context::set_internal_error_callback.

gcc/c-family/ChangeLog:
* c-opts.cc (c_common_diagnostics_set_defaults): Use
diagnostic_context::set_permissive_option.

gcc/cp/ChangeLog:
* error.cc (cxx_initialize_diagnostics): Use
diagnostic_context::set_adjust_diagnostic_info_callback.

gcc/ChangeLog:
* diagnostic.h (diagnostic_context::set_permissive_option): New.
(diagnostic_context::set_fatal_errors): New.
(diagnostic_context::set_internal_error_callback): New.
(diagnostic_context::set_adjust_diagnostic_info_callback): New.
(diagnostic_context::inhibit_notes): New.
(diagnostic_context::m_opt_permissive): Make private.
(diagnostic_context::m_fatal_errors): Likewise.
(diagnostic_context::m_internal_error): Likewise.
(diagnostic_context::m_adjust_diagnostic_info): Likewise.
(diagnostic_context::m_inhibit_notes_p): Likewise.
(diagnostic_inhibit_notes): Delete.
* opts.cc (common_handle_option): Use
diagnostic_context::set_fatal_errors.
* toplev.cc (internal_error_function): Use
diagnostic_context::set_internal_error_callback.
(general_init): Likewise.
(process_options): Use diagnostic_context::inhibit_notes.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
5 weeks agodiagnostics, testsuite: don't assume host has "dot" [PR120809]
David Malcolm [Thu, 26 Jun 2025 17:28:50 +0000 (13:28 -0400)] 
diagnostics, testsuite: don't assume host has "dot" [PR120809]

gcc/ChangeLog:
PR analyzer/120809
* diagnostic-format-html.cc
(html_builder::maybe_make_state_diagram): Bulletproof against the
SVG generation failing.
* xml.cc (xml::printer::push_element): Assert that the ptr is
nonnull.
(xml::printer::append): Likewise.

gcc/testsuite/ChangeLog:
PR analyzer/120809
* gcc.dg/analyzer/state-diagram-5.c: Split out into...
* gcc.dg/analyzer/state-diagram-5-html.c: ...this, adding
dg-require-dot...
* gcc.dg/analyzer/state-diagram-5-sarif.c: ...and this.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
5 weeks agodiagnostics: refactor sarif_scheme_handler::make_sink
David Malcolm [Thu, 26 Jun 2025 17:28:44 +0000 (13:28 -0400)] 
diagnostics: refactor sarif_scheme_handler::make_sink

No functional change intended.

gcc/ChangeLog:
* diagnostic-output-spec.cc (sarif_scheme_handler::make_sink):
Split out creation of sarif_generation_options and
sarif_serialization_format into...
(sarif_scheme_handler::make_sarif_gen_opts): ...this...
(sarif_scheme_handler::make_sarif_serialization_object): ...and
this.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
5 weeks agoRISC-V: update prepare_ternary_operands to handle vector-scalar case [PR120828]
Paul-Antoine Arras [Wed, 25 Jun 2025 16:42:00 +0000 (16:42 +0000)] 
RISC-V: update prepare_ternary_operands to handle vector-scalar case [PR120828]

This is a followup to 92e1893e0 "RISC-V: Add patterns for vector-scalar
multiply-(subtract-)accumulate" that caused an ICE in some cases where the mult
operands were wrongly swapped.
This patch ensures that operands are not swapped in the vector-scalar case.

PR target/120828

gcc/ChangeLog:

* config/riscv/riscv-v.cc (prepare_ternary_operands): Handle the
vector-scalar case.

5 weeks agolibstdc++: Lift chrono localized formatting to main chrono format loop [PR110739]
Tomasz Kamiński [Wed, 25 Jun 2025 14:58:31 +0000 (16:58 +0200)] 
libstdc++: Lift chrono localized formatting to main chrono format loop [PR110739]

This patch extracts the calls to _M_locale_fmt and the construction of struct tm
from the functions dedicated to each specifier into the main format loop in the
_M_format_to functions. This removes code duplicated across specifiers.

To allow _M_locale_fmt to only be called if localized formatting is enabled
('L' is present in chrono-format-spec), we provide implementations for the
locale-specific specifiers (%c, %r, %x, %X) that produce the same result
as locale::classic():
 * %c is implemented as separate _M_c method
 * %r is implemented as separate _M_r method
 * %x is implemented together with %D, as they provide same behavior,
 * %X is implemented together with %R as _M_R_X, as both of them do not include
   subseconds.
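
The equivalences that the classic-locale implementations rely on can be checked with plain std::strftime, which the C standard defines for the "C" locale. This is only a sketch of those equivalences, not the chrono_io.h code; the helper names format_tm and sample_tm are made up here, and the program is assumed to run in the default "C" locale (no setlocale call).

```cpp
#include <cassert>
#include <cstddef>
#include <ctime>
#include <string>

// Format TM with std::strftime and return the result as a string.
std::string format_tm(const char *fmt, const std::tm &tm)
{
  assert(fmt != nullptr);
  char buf[128];
  std::size_t n = std::strftime(buf, sizeof buf, fmt, &tm);
  return std::string(buf, n);
}

// Wednesday, 25 June 2025, 14:58:31.
std::tm sample_tm()
{
  std::tm tm{};
  tm.tm_year = 2025 - 1900;
  tm.tm_mon = 5;     // months are zero-based
  tm.tm_mday = 25;
  tm.tm_hour = 14;
  tm.tm_min = 58;
  tm.tm_sec = 31;
  tm.tm_wday = 3;    // days since Sunday
  tm.tm_yday = 175;  // days since 1 January
  return tm;
}
```

In the "C" locale, format_tm("%x", t) matches format_tm("%D", t), and "%X" matches "%H:%M:%S". Note that strftime has no notion of subseconds, so the distinction the patch draws between %X and %T does not show up here.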

The handling of subseconds was also extracted to the _M_subsecs function that is
used by the _M_S and _M_T specifiers. _M_T is now implemented in terms of
_M_R_X (printing time without subseconds) and _M_subsecs.

The __mod parameter responsible for triggering localized formatting was removed
from methods handling most of specifiers, except:
 * _M_S (for %S) for which it determines if subseconds should be included,
 * _M_z (for %z) for which it determines if ':' is used as separator.

PR libstdc++/110739

libstdc++-v3/ChangeLog:

* include/bits/chrono_io.h (__formatter_chrono::_M_use_locale_fmt):
Define.
(__formatter_chrono::_M_locale_fmt): Moved to front of the class.
(__formatter_chrono::_M_format_to): Construct and initialize
struct tm and call _M_locale_fmt if needed.
(__formatter_chrono::_M_c_r_x_X): Split into separate methods.
(__formatter_chrono::_M_c, __formatter_chrono::_M_r): Define.
(__formatter_chrono::_M_D): Renamed to _M_D_x.
(__formatter_chrono::_M_D_x): Renamed from _M_D.
(__formatter_chrono::_M_R_T): Split into _M_R_X and _M_T.
(__formatter_chrono::_M_R_X): Extracted from _M_R_T.
(__formatter_chrono::_M_T): Define in terms of _M_R_X and _M_subsecs.
(__formatter_chrono::_M_subsecs): Extracted from _M_S.
(__formatter_chrono::_M_S): Replaced __mod with __subs argument,
removed _M_locale_fmt call, and delegate to _M_subsecs.
(__formatter_chrono::_M_C_y_Y, __formatter_chrono::_M_d_e)
(__formatter_chrono::_M_H_I, __formatter_chrono::_M_m)
(__formatter_chrono::_M_u_w, __formatter_chrono::_M_U_V_W): Remove
__mod argument and call to _M_locale_fmt.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
5 weeks agocontrib/mklog.py: Add main function
Alex Coplan [Thu, 19 Jun 2025 13:44:06 +0000 (14:44 +0100)] 
contrib/mklog.py: Add main function

This adds a main() function to mklog.py (like e.g. check_GNU_style.py
has), which makes it easier to import and invoke from another python
script.  This is useful when using a wrapper script to set up the python
environment.

Smoke tested by using the modified mklog.py to generate the ChangeLog
for this patch.

contrib/ChangeLog:

* mklog.py (main): New.

5 weeks agorust: Silence a clang warning in borrow-checker-diagnostics
Martin Jambor [Mon, 23 Jun 2025 21:52:20 +0000 (23:52 +0200)] 
rust: Silence a clang warning in borrow-checker-diagnostics

When compiling
gcc/rust/checks/errors/borrowck/rust-borrow-checker-diagnostics.cc
with clang, it emits the following warning:

  gcc/rust/checks/errors/borrowck/rust-borrow-checker-diagnostics.cc:145:46: warning: non-constant-expression cannot be narrowed from type 'Polonius::Loan' (aka 'unsigned long') to 'uint32_t' (aka 'unsigned int') in initializer list [-Wc++11-narrowing]

I'd hope that for indexing that is never really a problem,
nevertheless if narrowing is taking place, I guess it can be argued it
should be made explicit.
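
A minimal sketch of the pattern (the LoanRef struct and make_loan_ref function are hypothetical, not the types from the Rust front end): list-initializing a 32-bit member from a wider non-constant value is a narrowing conversion, and an explicit cast states the intent.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical aggregate with a 32-bit index member.
struct LoanRef { std::uint32_t index; };

LoanRef make_loan_ref(std::size_t loan)
{
  // LoanRef{loan} would narrow from std::size_t to std::uint32_t inside
  // a braced initializer, which clang reports as -Wc++11-narrowing.
  assert(loan <= std::numeric_limits<std::uint32_t>::max());
  return LoanRef{static_cast<std::uint32_t>(loan)};
}
```

The assertion is an addition of this sketch; it documents the assumption that the index always fits.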

gcc/rust/ChangeLog:

2025-06-23  Martin Jambor  <mjambor@suse.cz>

* checks/errors/borrowck/rust-borrow-checker-diagnostics.cc
(BorrowCheckerDiagnostics::get_loan): Type cast loan to uint32_t.

5 weeks agolibstdc++: Type-erase chrono-data for formatting [PR110739]
Tomasz Kamiński [Tue, 24 Jun 2025 12:07:46 +0000 (14:07 +0200)] 
libstdc++: Type-erase chrono-data for formatting [PR110739]

This patch reworks the formatting for the chrono types, such that they are all
formatted in terms of _ChronoData class, that includes all required fields.
Populating each required field is performed in formatter for specific type,
based on the chrono-spec used.

To facilitate the above, _ChronoSpec now includes an additional _M_needed field
that represents the chrono data that is referenced by the format spec (this value
is also configured for __defSpec). This value differs from the value of
__parts passed to _M_parse, which does include all fields that can be computed
from input (e.g. weekday_indexed can be computed for year_month_day). Later
it is used to fill _ChronoData, in particular _M_fill_* family of functions,
to determine if given field needs to be set, and thus its value needs to be
computed.

In consequence _ChronoParts enum was extended with additional values, that
allows more fine grained identification:
 * _TimeOfDay is separated into _HoursMinutesSeconds and _Subseconds,
 * _TimeZone is separated into _ZoneAbbrev and _ZoneOffset,
 * _LocalDays, _WeekdayIndex are defined and included in _Date,
 * _Duration is removed, and instead _EpochUnits and _UnitSuffix are
   introduced.
Furthermore, to avoid name conflicts _ChronoParts is now defined as enum class,
with additional operators that simplify uses.
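
The general technique can be sketched as follows. The enumerator names and bit values are invented for illustration and do not match _ChronoParts; only the shape of the operators (intersection, set difference, comparison against nullptr as an emptiness test) follows the ChangeLog entry for this patch.

```cpp
#include <cassert>

// Hypothetical scoped flag set; the values are illustrative only.
enum class Parts : unsigned {
  None = 0,
  Year = 1u << 0,
  Month = 1u << 1,
  Day = 1u << 2,
  Weekday = 1u << 3,
  Subseconds = 1u << 4,
  Date = Year | Month | Day | Weekday
};

constexpr Parts operator|(Parts a, Parts b)
{ return static_cast<Parts>(static_cast<unsigned>(a) | static_cast<unsigned>(b)); }

constexpr Parts operator&(Parts a, Parts b)
{ return static_cast<Parts>(static_cast<unsigned>(a) & static_cast<unsigned>(b)); }

// Set difference: the bits of A that are not in B.
constexpr Parts operator-(Parts a, Parts b)
{ return static_cast<Parts>(static_cast<unsigned>(a) & ~static_cast<unsigned>(b)); }

// Comparing against nullptr tests for the empty set.
constexpr bool operator==(Parts a, decltype(nullptr))
{ return static_cast<unsigned>(a) == 0; }

// True if any of the parts in WHICH is among the NEEDED ones.
inline bool needs(Parts needed, Parts which)
{
  assert(!(which == nullptr));  // asking about no parts is a caller bug
  return !((needed & which) == nullptr);
}

static_assert((Parts::Date - Parts::Weekday)
              == (Parts::Year | Parts::Month | Parts::Day),
              "set difference removes only the named bits");
```

A scoped enum keeps names such as Date out of the enclosing namespace, at the price of having to define every bitwise operator explicitly.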

In addition to fields that can be printed using chrono-spec, _ChronoData stores:
 * Total days in wall time (_M_ldays), day of year (_M_day_of_year) - used by
   struct tm construction, and for ISO calendar computation.
 * Total seconds in wall time (_M_lseconds) - this value may be different from
   sum of days, hours, minutes, seconds (e.g. see utc_time below). Included
   to allow future extension, like printing total minutes.
 * Total seconds since epoch - may differ from the above due to the offset. Again to be
   used with future extension (e.g. %s as proposed in P2945R1).
 * Subseconds - count of attoseconds (10^(-18)), in addition to printing can
   be used to  compute fractional hours, minutes.
Both total seconds fields use a single _TotalSeconds enumerator in
_ChronoParts, that when present in combination with _EpochUnits or _LocalDays
indicates that _M_eseconds (_EpochSeconds) or _M_lseconds (_LocalSeconds) are
provided/required.

To handle type formatting of time since epoch ('%Q'|_EpochUnits), we use the
format_args mechanism, where the result of +d.count() (see LWG4118) is erased
into make_format_args to local __arg_store, that is later referenced by
_M_ereps (_M_ereps.get(0)).

To handle precision values, and in preparation for allowing users to configure them,
we store the precision as third element of _M_ereps (_M_ereps.get(2)); this
allows duration with precision to be printed using "{0:{2}}". For subseconds
the precision is handled differently depending on the representation:
 * for integral reps, _M_subseconds value is used to determine fractional value,
   precision is trimmed to 18 digits;
 * for floating-points, _M_ereps stores duration<Rep> initialized with only
   fractional seconds, that is later formatted with precision.
Always using the _M_subseconds field for integral durations means that we do not
use the formatter for user-defined durations that are considered to be integral
(see empty_spec.cc file change). To avoid potentially expensive computation
of _M_subseconds, we make sure that _ChronoParts::_Subseconds is set only if
_Subseconds are needed. In particular we remove this flag for localized output
in _M_parse.

Construction of the _M_ereps as described above is handled by __formatter_duration,
that is then used to format duration, hh_mm_ss and time_points specializations.
This class also handles _UnitSuffix, the _M_units_suffix field is populated
either with predefined suffix (chrono::__detail::__units_suffix) or one produced
locally.

Finally, formatters for the types listed below contain type-specific logic:
 * hh_mm_ss - we do not compute total duration and seconds, unless explicitly
   requested, as such computation may overflow;
 * utc_time - for time during leap second insertion, the _M_seconds field is
   increased to 60;
 * __local_time_fmt - exception is thrown if zone offset (_ZoneOffset) or
   abbreviation (_ZoneAbbrev) is requested, but corresponding pointer is null,
   furthermore conversion from `char` to `wchar_t` for abbreviation is performed
   if needed.

PR libstdc++/110739

libstdc++-v3/ChangeLog:

* include/bits/chrono_io.h (__format::__no_timezone_available):
Removed, replaced with separate throws in formatter for
__local_time_fmt
(__format::_ChronoParts): Defined additional enumerators and
declared as enum class.
(__format::operator&(_ChronoParts, _ChronoParts))
(__format::operator&=(_ChronoParts&, _ChronoParts))
(__format::operator-(_ChronoParts, _ChronoParts))
(__format::operator-=(_ChronoParts&, _ChronoParts))
(__format::operator==(_ChronoParts, decltype(nullptr)))
(_ChronoSpec::_M_time_only, _ChronoSpec::_M_floating_point_rep)
(_ChronoSpec::_M_custom_rep, _ChronoSpec::_M_needed)
(_ChronoSpec::_M_needs, __format::_ChronoData): Define.
(__format::__formatter_chrono): Redefine to accept _ChronoData.
(__formatter_chrono::_M_format_to_ostream): Moved to
__formatter_duration.
(__format::__formatter_duration): Define.
(__formatter_chrono_info::format): Pass value-constructed
_ChronoData.
(std::formatter<chrono::day, _CharT>)
(std::formatter<chrono::month, _CharT>)
(std::formatter<chrono::year, _CharT>)
(std::formatter<chrono::weekday, _CharT>)
(std::formatter<chrono::weekday_indexed, _CharT>)
(std::formatter<chrono::weekday_last, _CharT>)
(std::formatter<chrono::month_day, _CharT>)
(std::formatter<chrono::month_day_last, _CharT>)
(std::formatter<chrono::month_weekday, _CharT>)
(std::formatter<chrono::month_weekday_indexed, _CharT>)
(std::formatter<chrono::month_weekday_last, _CharT>)
(std::formatter<chrono::year_month, _CharT>)
(std::formatter<chrono::year_month_day, _CharT>)
(std::formatter<chrono::year_month_day_last, _CharT>)
(std::formatter<chrono::year_month_weekday, _CharT>)
(std::formatter<chrono::year_month_weekday_indexed, _CharT>)
(std::formatter<chrono::year_month_weekday_last, _CharT>):
Construct _ChronoData in format, and configure _M_needed in
_ChronoSpec.
(std::formatter<chrono::duration<_Rep, _Period>, _CharT>)
(std::formatter<chrono::hh_mm_ss<_Duration>, _CharT>)
(std::formatter<chrono::sys_time<_Duration>, _CharT>)
(std::formatter<chrono::utc_time<_Duration>, _CharT>)
(std::formatter<chrono::tai_time<_Duration>, _CharT>)
(std::formatter<chrono::gps_time<_Duration>, _CharT>)
(std::formatter<chrono::file_time<_Duration>, _CharT>)
(std::formatter<chrono::local_time<_Duration>, _CharT>)
(std::formatter<chrono::_detail::__local_time_fmt<_Duration>, _CharT>):
Reworked in terms of __formatter_duration and _ChronoData.
(std::formatter<chrono::_detail::__utc_leap_second<_Duration>, _CharT>):
Removed.
(_Parser<_Duration>::operator()): Adjusted for _ChronoParts
being enum class.
* include/std/chrono (__detail::__utc_leap_second): Removed,
replaced with simply bumping _M_seconds in _ChronoData.
* testsuite/std/time/format/empty_spec.cc: Updated %S integral
output.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
5 weeks agolibstdc++: Implement C++26 P2927R3 - Inspecting exception_ptr
Jakub Jelinek [Thu, 26 Jun 2025 14:18:38 +0000 (16:18 +0200)] 
libstdc++: Implement C++26 P2927R3 - Inspecting exception_ptr

The following patch attempts to implement the C++26 P2927R3 - Inspecting exception_ptr
paper (but not including P3748R0, I plan to play with it incrementally and
it will really depend on the Constexpr exceptions patch).

The function template is implemented using an out of line private method of
exception_ptr, so that P3748R0 then can use if consteval and provide a
constant evaluation variant of it.
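
For context, here is the rethrow-and-catch idiom available before C++26 for asking what an exception_ptr holds. It is not the libstdc++ implementation of std::exception_ptr_cast, and the helper name holds_exception is made up. It returns only a bool, because whether std::rethrow_exception copies the exception object is unspecified, so handing out a pointer from the handler would not be portable.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>

// True if P holds an exception of type E or of a type derived from E.
// Costs a full throw/catch round trip.
template <typename E>
bool holds_exception(const std::exception_ptr &p)
{
  if (!p)
    return false;
  try
    {
      std::rethrow_exception(p);
    }
  catch (const E &)
    {
      return true;
    }
  catch (...)
    {
      return false;
    }
}

void exception_example()
{
  std::exception_ptr p = std::make_exception_ptr(std::out_of_range("index"));
  assert(holds_exception<std::out_of_range>(p));
  assert(holds_exception<std::logic_error>(p));  // a base class also matches
  assert(!holds_exception<std::runtime_error>(p));
}
```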

2025-06-26  Jakub Jelinek  <jakub@redhat.com>

* include/bits/version.def (exception_ptr_cast): Add.
* include/bits/version.h: Regenerate.
* libsupc++/exception: Define __glibcxx_want_exception_ptr_cast before
including bits/version.h.
* libsupc++/exception_ptr.h (std::exception_ptr_cast): Define.
(std::__exception_ptr::exception_ptr::_M_exception_ptr_cast): Declare.
* libsupc++/eh_ptr.cc
(std::__exception_ptr::exception_ptr::_M_exception_ptr_cast): Define.
* src/c++23/std.cc.in (std::exception_ptr_cast): Export.
* config/abi/pre/gnu.ver: Export
_ZNKSt15__exception_ptr13exception_ptr21_M_exception_ptr_castERKSt9type_info
at CXXABI_1.3.17.
* testsuite/util/testsuite_abi.cc (check_version): Allow CXXABI_1.3.17.
* testsuite/18_support/exception_ptr/exception_ptr_cast.cc: New test.

5 weeks agoc++, libstdc++: Implement C++26 P2830R10 - Constexpr Type Ordering
Jakub Jelinek [Thu, 26 Jun 2025 14:15:20 +0000 (16:15 +0200)] 
c++, libstdc++: Implement C++26 P2830R10 - Constexpr Type Ordering

The following patch attempts to implement the C++26 P2830R10 - Constexpr Type
Ordering paper, with a minor change that std::type_order<T, U> class template
doesn't derive from integer_constant, because std::strong_ordering is not
a structural type (except in MSVC), so instead it is just a class template
with static constexpr strong_ordering value member and also value_type,
type and 2 operators.

The paper mostly talks about using something other than mangled names for
the ordering, but given that the mangler is part of the GCC C++ FE, using
the mangler seems to be the best ordering choice to me.

2025-06-26  Jakub Jelinek  <jakub@redhat.com>

gcc/cp/
* cp-trait.def: Implement C++26 P2830R10 - Constexpr Type Ordering.
(TYPE_ORDER): New.
* method.cc (type_order_value): Define.
* cp-tree.h (type_order_value): Declare.
* semantics.cc (trait_expr_value): Use gcc_unreachable also
for CPTK_TYPE_ORDER, adjust comment.
(finish_trait_expr): Handle CPTK_TYPE_ORDER.
* constraint.cc (diagnose_trait_expr): Likewise.
gcc/testsuite/
* g++.dg/cpp26/type-order1.C: New test.
* g++.dg/cpp26/type-order2.C: New test.
* g++.dg/cpp26/type-order3.C: New test.
libstdc++-v3/
* include/bits/version.def (type_order): New.
* include/bits/version.h: Regenerate.
* libsupc++/compare: Define __glibcxx_want_type_order before
including bits/version.h.
(std::type_order, std::type_order_v): New trait and template variable.
* src/c++23/std.cc.in (std::type_order, std::type_order_v): Export.
* testsuite/18_support/comparisons/type_order/1.cc: New test.

5 weeks agoi386: Introduce crc_rev<mode>si4 expanders [PR120719]
Uros Bizjak [Thu, 26 Jun 2025 12:13:01 +0000 (14:13 +0200)] 
i386: Introduce crc_rev<mode>si4 expanders [PR120719]

Introduce crc_rev<mode>si4 expanders to generate CRC32 instruction when using
__builtin_rev_crc32_data* builtins with 0x1EDC6F41 polynomial and -mcrc32.
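
For reference, a bitwise software version of the reflected CRC with this polynomial (commonly called CRC-32C) looks roughly like the sketch below. The function names are invented, and the code makes no claim about the exact semantics of the GCC builtins; the initial value and final inversion in crc32c are the usual CRC-32C convention, applied on top of the raw per-byte update.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Bit-reflected form of the polynomial 0x1EDC6F41.
constexpr std::uint32_t kCrc32cReflected = 0x82F63B78u;

// Fold one byte into a running reflected CRC, least significant bit first.
std::uint32_t crc32c_update_byte(std::uint32_t crc, std::uint8_t data)
{
  crc ^= data;
  for (int bit = 0; bit < 8; ++bit)
    crc = (crc >> 1) ^ ((crc & 1u) ? kCrc32cReflected : 0u);
  return crc;
}

// CRC-32C of a buffer, using the conventional all-ones initial value
// and final inversion.
std::uint32_t crc32c(const unsigned char *buf, std::size_t len)
{
  assert(buf != nullptr || len == 0);
  std::uint32_t crc = 0xFFFFFFFFu;
  for (std::size_t i = 0; i < len; ++i)
    crc = crc32c_update_byte(crc, buf[i]);
  return crc ^ 0xFFFFFFFFu;
}
```

The standard check value for the nine ASCII bytes "123456789" under this convention is 0xE3069283.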

PR target/120719

gcc/ChangeLog:

* config/i386/i386.md (crc_rev<SWI124:mode>si4): New expander.

gcc/testsuite/ChangeLog:

* gcc.target/i386/crc-builtin-crc32.c: New test.

5 weeks agoRISC-V: Fix build issue
Kito Cheng [Thu, 26 Jun 2025 06:35:47 +0000 (14:35 +0800)] 
RISC-V: Fix build issue

Apparently I forgot to squash this fix into the previous commit before I
push...

gcc/ChangeLog:

* config/riscv/riscv.md: Fix build issue.

5 weeks agolto-ltrans-cache: Remove unused private member
Martin Jambor [Thu, 26 Jun 2025 09:34:46 +0000 (11:34 +0200)] 
lto-ltrans-cache: Remove unused private member

When building GCC with clang, it warns that the private member suffix
in class ltrans_file_cache (defined in lto-ltrans-cache.h) is not used
which indeed looks like it is the case.  This patch therefore removes
it along with its initialization in the constructor.

gcc/ChangeLog:

2025-06-24  Martin Jambor  <mjambor@suse.cz>

* lto-ltrans-cache.h (class ltrans_file_cache): Remove member prefix.
* lto-ltrans-cache.cc (ltrans_file_cache::ltrans_file_cache): Do
not initialize member prefix.

5 weeks agoRISC-V: Add comment and reorder the include files in riscv.md [NFC]
Kito Cheng [Thu, 26 Jun 2025 06:26:57 +0000 (14:26 +0800)] 
RISC-V: Add comment and reorder the include files in riscv.md [NFC]

This patch adds a comment to the riscv.md file to clarify the purpose of
the file and reorders the include files for better organization.

gcc/ChangeLog:

* config/riscv/riscv.md: Add comment and reorder include
files.

5 weeks agotree-vect-stmts.cc: Remove an unused shadowed variable
Martin Jambor [Mon, 23 Jun 2025 22:08:39 +0000 (00:08 +0200)] 
tree-vect-stmts.cc: Remove an unused shadowed variable

When compiling tree-vect-stmts.cc with clang, it emits a warning:

  gcc/tree-vect-stmts.cc:14930:19: warning: unused variable 'mode_iter' [-Wunused-variable]

And indeed, there are two mode_iter local variables in function
supportable_indirect_convert_operation and the first one is not used
at all.  This patch removes it.

gcc/ChangeLog:

2025-06-24  Martin Jambor  <mjambor@suse.cz>

* tree-vect-stmts.cc (supportable_indirect_convert_operation):
Remove an unused shadowed variable.

5 weeks agoSilence a clang warning in tree-vect-slp.cc about an unused variable
Martin Jambor [Tue, 24 Jun 2025 09:22:19 +0000 (11:22 +0200)] 
Silence a clang warning in tree-vect-slp.cc about an unused variable

Since r15-4695-gd17e672ce82e69 (Richard Biener: Assert finished
vectorizer pattern COND_EXPR transition), the static const array
cond_expr_maps is unused and when GCC is compiled with clang, it warns
about that.

This patch simply removes the variable.

gcc/ChangeLog:

2025-06-24  Martin Jambor  <mjambor@suse.cz>

* tree-vect-slp.cc (cond_expr_maps): Remove.

5 weeks agofortran: Avoid freeing uninitialized value
Martin Jambor [Thu, 26 Jun 2025 09:25:20 +0000 (11:25 +0200)] 
fortran: Avoid freeing uninitialized value

When compiling fortran/match.cc, clang emits a warning

  fortran/match.cc:5301:7: warning: variable 'p' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized]

which looks accurate, so this patch adds an initialization of p to
avoid the use.
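
The general shape of the problem, sketched outside of gfortran (the function below is hypothetical and unrelated to gfc_match_nullify): when every exit path funnels through one cleanup label that frees a pointer, the pointer has to be initialized before the first jump to that label.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Copy SRC into OUT through a temporary heap buffer.  Returns true on
// success.  All exits share the cleanup label, so P starts out null.
bool duplicate_nonempty(const char *src, char *out, std::size_t out_size)
{
  char *p = nullptr;  // without this, the early exits would free garbage
  bool ok = false;
  std::size_t len = 0;

  assert(out != nullptr);
  if (src == nullptr || *src == '\0')
    goto cleanup;

  len = std::strlen(src);
  if (len + 1 > out_size)
    goto cleanup;

  p = static_cast<char *>(std::malloc(len + 1));
  if (p == nullptr)
    goto cleanup;

  std::memcpy(p, src, len + 1);
  std::memcpy(out, p, len + 1);
  ok = true;

cleanup:
  std::free(p);  // free(nullptr) is a no-op
  return ok;
}
```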

gcc/fortran/ChangeLog:

2025-06-23  Martin Jambor  <mjambor@suse.cz>

* match.cc (gfc_match_nullify): Initialize p to NULL.

5 weeks agoAdd testcase for afdo offlining and fix two bugs
Jan Hubicka [Thu, 26 Jun 2025 08:48:20 +0000 (10:48 +0200)] 
Add testcase for afdo offlining and fix two bugs

This patch adds a testcase that offlining works and profile info is not lost.
While doing it I noticed a pasto that made the dump to be "afdo" and not
"afdo_offline" and also that not all functions are processed as the range
for does not expect new values to be put to the vector.  Fixed thus.
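
The worklist issue can be sketched with std::vector standing in for GCC's own vector type (the function name and the halving rule are invented for illustration). A range-based for fixes its end iterator up front and its iterators are invalidated when the vector reallocates, so elements pushed during the walk are not handled safely; re-reading size() on every iteration picks them up.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Process every item, including items appended while processing.
std::vector<int> drain_worklist(std::vector<int> worklist)
{
  std::vector<int> processed;
  for (std::size_t i = 0; i < worklist.size(); ++i)
    {
      int item = worklist[i];  // copy first: push_back may reallocate
      assert(item >= 0);       // this sketch only handles non-negative items
      processed.push_back(item);
      if (item > 1)
        worklist.push_back(item / 2);  // new work discovered on the fly
    }
  return processed;
}
```

Starting from a worklist containing only 8, the loop visits 8, 4, 2 and 1.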

gcc/ChangeLog:

* auto-profile.cc (function_instance::merge): Add TODO.
(autofdo_source_profile::offline_external_functions):
Do not use range for on the worklist.
* timevar.def (TV_IPA_AUTOFDO_OFFLINE): New timevar.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-prof/afdo-crossmodule-1.c: New test.
* gcc.dg/tree-prof/afdo-crossmodule-1b.c: New test.

5 weeks agoFortran: Prevent creation of unused tree.
Andre Vehreschild [Wed, 25 Jun 2025 10:27:35 +0000 (12:27 +0200)] 
Fortran: Prevent creation of unused tree.

gcc/fortran/ChangeLog:

* trans.cc (gfc_allocate_using_malloc): Prevent possible memory
leak when allocation was already done.

5 weeks agoFortran: Fix wasting memory in coarray single mode.
Andre Vehreschild [Wed, 25 Jun 2025 10:27:04 +0000 (12:27 +0200)] 
Fortran: Fix wasting memory in coarray single mode.

gcc/fortran/ChangeLog:

* resolve.cc (resolve_fl_derived0): Do not create the token
component when not in coarray lib mode.
* trans-types.cc: Do not access the token when not in coarray
lib mode.

5 weeks agoFortran: Fix out of bounds access in structure constructor's clean up [PR120711]
Andre Vehreschild [Wed, 25 Jun 2025 07:12:35 +0000 (09:12 +0200)] 
Fortran: Fix out of bounds access in structure constructor's clean up [PR120711]

A structure constructor's generated clean up code was using an offset
variable, which was manipulated before the clean up was run leading to
an out of bounds access.

PR fortran/120711

gcc/fortran/ChangeLog:

* trans-array.cc (gfc_trans_array_ctor_element): Store the value
of the offset for reuse.

gcc/testsuite/ChangeLog:

* gfortran.dg/asan/array_constructor_1.f90: New test.

5 weeks agoAvoid some lost AFDO profiles with LTO
Jan Hubicka [Thu, 26 Jun 2025 07:06:52 +0000 (09:06 +0200)] 
Avoid some lost AFDO profiles with LTO

This patch fixes some of the cases where we lose profile info because we do not
perform inlining that happened at the train run before AFDO annotation is done.
This is a common problem with LTO in the case cross-module inlining happened.

I added afdo_offline pass that does two things:
 1) collect set of all functions defined in current unit
 2) walk all toplevel function instances.  If function instance corresponds
    to a defined symbol, walk everything inlined to it.  If crossmodule
    inlining is seen, remove the inline instances and recursively look into
    inline instances that go back to the current unit and turn them to offline
    ones

    If function instance corresponds to external symbol, remove it but
    also look for functions inlined to it that belong to current module.

When merging profile we also need to recursively merge profiles of inlined
functions and if the inlining decisions do not match, offline the bodies.
This is somewhat fragile since recursive calls may trigger modifications of
functions currently being merged, but I hope I chased away problems with that -
will give it a second thought to see if this can be reorganized into a worklist
fashion that is more safe.

I noticed that functions may appear in the afdo data either as their
symbol name or dwarf name (since inline functions may not have known symbol
name).  There is already some logic to handle that but it is broken in the
case both names are used.

To mitigate the problem I also added logic to translate dwarf names
to symbol names in case both are used.  This prevents profile loss, e.g.
in exchange2.  Here the digits_2 function appears by its dwarf name (digits_2)
but is also cloned, which makes it appear by its symbol name (__*digits_2).

All profile massaging is done before early optimization so the VPT targets of
offline bodies are correct.  We still will lose profile if early inlining
fails.  I will add second pass to afdo to offline these.

Last problem is that in case we early inlined more than expected (which now
happens more often due to offlining) the profile will be lost and filled by
static profile.  Problem here is that we need to somehow scale the profile of
inline instance but I do not see how to determine invocation counts.  Will try
to look into that incrementally - perhaps we can keep some info from offlining.

There is also now a dump infrastructure that prints the profile in
the same format as the dump_gcov tool.

autoprofiledbootstrapped, regtested x86_64-linux, will commit it shortly.

Honza

gcc/ChangeLog:

* auto-profile.cc (name_index_set, name_index_map): New types.
(dump_afdo_loc): New function.
(dump_inline_stack): Simplify.
(function_instance::merge): Merge recursively inlined functions;
offline if necessary; collect new functions.
(function_instance::offline): New member function.
(function_instance::offline_if_in_set): New member function.
(function_instance::remove_external_functions): New member function.
(function_instance::dump): New member function.
(function_instance::debug): New member function.
(function_instance::dump_inline_stack): New member function.
(function_instance::find_icall_target_map): Use removed_icall_target.
(function_instance::remove_icall_target): Only mark icall target removed.
(autofdo_source_profile::offline_external_functions): New function.
(function_instance::read_function_instance): Record inlined_to pointers;
use -1 for unknown head counts.
(autofdo_source_profile::get_function_instance_by_name_index): New
function.
(autofdo_source_profile::add_function_instance): New member function.
(autofdo_source_profile::read): Do not leak memory; fix formatting.
(read_profile): Fix formatting.
(afdo_annotate_cfg): Likewise.
(class pass_ipa_auto_profile_offline): New pass.
(make_pass_ipa_auto_profile_offline): New function.
* passes.def (pass_ipa_auto_profile_offline): Add
* tree-pass.h (make_pass_ipa_auto_profile): Declare

gcc/testsuite/ChangeLog:

* gcc.dg/tree-prof/indir-call-prof-2.c: Update template.

5 weeks agox86: Also handle all 1s float vector constant
H.J. Lu [Wed, 25 Jun 2025 22:08:51 +0000 (06:08 +0800)] 
x86: Also handle all 1s float vector constant

Since float vector constant

(const_vector:V4SF [(const_double:SF -QNaN [-QNaN]) repeated x4])

is an all 1s float vector constant, update the remove_redundant_vector
pass to replace

(insn 20 18 21 2 (set (reg:V4SF 124)
        (const_vector:V4SF [
                (const_double:SF -QNaN [-QNaN]) repeated x4
            ])) "x.cc":26:5 2426 {movv4sf_internal}
     (nil))

with

(insn 49 2 5 2 (set (reg:V16QI 135)
        (const_vector:V16QI [
                (const_int -1 [0xffffffffffffffff]) repeated x16
            ])) -1
     (nil))
...
(insn 20 18 21 2 (set (reg:V4SF 124)
        (subreg:V4SF (reg:V16QI 135) 0)) "x.cc":26:5 2426 {movv4sf_internal}
     (nil))
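
The equivalence the commit message relies on can be checked from user code.
A minimal sketch, assuming a 32-bit IEEE-754 float; the helper names
(float_from_bits, is_all_ones, check_all_ones_float) are hypothetical and
not part of GCC:

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");

// Reinterpret a 32-bit pattern as a float without violating aliasing rules.
inline float float_from_bits(std::uint32_t bits) {
  float f;
  std::memcpy(&f, &bits, sizeof f);
  return f;
}

// True if every bit in the float's object representation is set.
inline bool is_all_ones(float f) {
  std::uint32_t bits;
  std::memcpy(&bits, &f, sizeof bits);
  return bits == 0xFFFFFFFFu;
}

// Usage: the all-ones pattern is a NaN with the sign bit set.
inline void check_all_ones_float() {
  const float f = float_from_bits(0xFFFFFFFFu);
  assert(std::isnan(f));
  assert(std::signbit(f));
  assert(is_all_ones(f));
  assert(!is_all_ones(1.0f));
}
```

Because the bit pattern is the same, a V4SF constant of four such values can
share a register with a V16QI constant of sixteen -1 bytes.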

gcc/

PR target/120819
* config/i386/i386-features.cc (ix86_broadcast_inner): Also handle
all 1s float vector constant.

gcc/testsuite/

PR target/120819
* g++.target/i386/pr120819.C: New test.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
5 weeks agox86: Handle REG_EH_REGION note in DEF_INSN
H.J. Lu [Wed, 25 Jun 2025 04:50:53 +0000 (12:50 +0800)] 
x86: Handle REG_EH_REGION note in DEF_INSN

For tcpsock_test.go in libgo tests,

commit aba3b9d3a48a0703fd565f7c5f0caf604f59970b
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Fri May 9 07:17:07 2025 +0800

    x86: Extend the remove_redundant_vector pass

added an instruction:

(insn 501 101 102 21 (set (reg:V2DI 234)
        (vec_duplicate:V2DI (reg:DI 111 [ _46 ]))) "tcpsock_test.go":691:12 discrim 1 -1
     (nil))

after

(insn 101 100 501 21 (set (reg:DI 111 [ _46 ])
        (mem:DI (reg/f:DI 110 [ _45 ]) [5 *_45+0 S8 A64])) "tcpsock_test.go":691:12 discrim 1 99 {*movdi_internal}
     (expr_list:REG_DEAD (reg/f:DI 110 [ _45 ])
        (expr_list:REG_EH_REGION (const_int 1 [0x1])
            (nil))))

which resulted in

(insn 101 100 501 21 (set (reg:DI 111 [ _46 ])
        (mem:DI (reg/f:DI 110 [ _45 ]) [5 *_45+0 S8 A64])) "tcpsock_test.go":691:12 discrim 1 99 {*movdi_internal}
     (expr_list:REG_DEAD (reg/f:DI 110 [ _45 ])
        (expr_list:REG_EH_REGION (const_int 1 [0x1])
            (nil))))
(insn 501 101 102 21 (set (reg:V2DI 234)
        (vec_duplicate:V2DI (reg:DI 111 [ _46 ]))) "tcpsock_test.go":691:12 discrim 1 -1
     (nil))

and caused:

tcpsock_test.go: In function 'net.TestTCPBig..func2':
tcpsock_test.go:684:28: error: in basic block 21:
  684 |                         go func() {
      |                            ^
tcpsock_test.go:684:28: error: flow control insn inside a basic block
(insn 101 100 501 21 (set (reg:DI 111 [ _46 ])
        (mem:DI (reg/f:DI 110 [ _45 ]) [5 *_45+0 S8 A64])) "tcpsock_test.go":691:12 discrim 1 99 {*movdi_internal}
     (expr_list:REG_DEAD (reg/f:DI 110 [ _45 ])
        (expr_list:REG_EH_REGION (const_int 1 [0x1])
            (nil))))
during RTL pass: rrvl
tcpsock_test.go:684:28: internal compiler error: in rtl_verify_bb_insns, at cfgrtl.cc:2834

Copy the REG_EH_REGION note to the newly added instruction and split the
block after the previous instruction.

PR target/120816
* config/i386/i386-features.cc (remove_redundant_vector_load):
Handle REG_EH_REGION note in DEF_INSN.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
5 weeks agox86: Add preserve_none and update no_caller_saved_registers attributes
H.J. Lu [Sun, 13 Apr 2025 18:38:24 +0000 (11:38 -0700)] 
x86: Add preserve_none and update no_caller_saved_registers attributes

Add preserve_none attribute which is similar to no_callee_saved_registers
attribute, except on x86-64, r12, r13, r14, r15, rdi and rsi registers are
used for integer parameter passing.  This can be used in an interpreter
to avoid saving/restoring the registers in functions which process byte
codes.  It improved the pystones benchmark by 6-7%:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119628#c15

Remove -mgeneral-regs-only restriction on no_caller_saved_registers
attribute.  Only SSE is allowed since SSE XMM register load preserves
the upper bits in YMM/ZMM register while YMM register load zeros the
upper 256 bits of ZMM register, and preserving 32 ZMM registers can
be quite expensive.
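
A minimal sketch of how the attribute might be applied to byte-code
handlers, assuming a compiler that reports support through __has_attribute
(otherwise the macro expands to nothing).  The VM type, the opcodes and the
PRESERVE_NONE macro are hypothetical illustrations, not code from this patch:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Fall back to a plain function when the compiler lacks the attribute.
#if defined(__has_attribute)
#  if __has_attribute(preserve_none)
#    define PRESERVE_NONE __attribute__((preserve_none))
#  endif
#endif
#ifndef PRESERVE_NONE
#  define PRESERVE_NONE
#endif

struct VM { std::int64_t acc; };

// Hypothetical handlers: with the attribute, they need not save or
// restore callee-saved registers.
PRESERVE_NONE void op_inc(VM *vm) { vm->acc += 1; }
PRESERVE_NONE void op_dbl(VM *vm) { vm->acc *= 2; }

// Run a tiny hypothetical byte-code program: 0 = increment, 1 = double.
inline std::int64_t run(const std::uint8_t *code, std::size_t n) {
  VM vm = {0};
  for (std::size_t i = 0; i < n; ++i) {
    if (code[i] == 0)
      op_inc(&vm);
    else
      op_dbl(&vm);
  }
  return vm.acc;
}

// Usage: inc, double, inc, double gives ((0 + 1) * 2 + 1) * 2 = 6.
inline void check_run() {
  const std::uint8_t code[] = {0, 1, 0, 1};
  assert(run(code, 4) == 6);
}
```

The observable result is the same with or without the attribute; only the
register save/restore code around the handlers differs.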

gcc/

PR target/119628
* config/i386/i386-expand.cc (ix86_expand_call): Call
ix86_type_no_callee_saved_registers_p instead of looking up
no_callee_saved_registers attribute.
* config/i386/i386-options.cc (ix86_set_func_type): Look up
preserve_none attribute.  Check preserve_none attribute for
interrupt attribute.  Don't check no_caller_saved_registers nor
no_callee_saved_registers conflicts here.
(ix86_set_func_type): Check no_callee_saved_registers before
checking no_caller_saved_registers attribute.
(ix86_set_current_function): Allow SSE with
no_caller_saved_registers attribute.
(ix86_handle_call_saved_registers_attribute): Check preserve_none,
no_callee_saved_registers and no_caller_saved_registers conflicts.
(ix86_gnu_attributes): Add preserve_none attribute.
* config/i386/i386-protos.h (ix86_type_no_callee_saved_registers_p):
New.
* config/i386/i386.cc
(x86_64_preserve_none_int_parameter_registers): New.
(ix86_using_red_zone): Don't use red-zone when there are no
caller-saved registers with SSE.
(ix86_type_no_callee_saved_registers_p): New.
(ix86_function_ok_for_sibcall): Also check TYPE_PRESERVE_NONE
and call ix86_type_no_callee_saved_registers_p instead of looking
up no_callee_saved_registers attribute.
(ix86_comp_type_attributes): Call
ix86_type_no_callee_saved_registers_p instead of looking up
no_callee_saved_registers attribute.  Return 0 if preserve_none
attribute doesn't match in 64-bit mode.
(ix86_function_arg_regno_p): For cfun with TYPE_PRESERVE_NONE,
use x86_64_preserve_none_int_parameter_registers.
(init_cumulative_args): Set preserve_none_abi.
(function_arg_64): Use x86_64_preserve_none_int_parameter_registers
with preserve_none attribute.
(setup_incoming_varargs_64): Use
x86_64_preserve_none_int_parameter_registers with preserve_none
attribute.
(ix86_save_reg): Treat TYPE_PRESERVE_NONE like
TYPE_NO_CALLEE_SAVED_REGISTERS.
(ix86_nsaved_sseregs): Allow saving XMM registers for
no_caller_saved_registers attribute.
(ix86_compute_frame_layout): Likewise.
(x86_this_parameter): Use
x86_64_preserve_none_int_parameter_registers with preserve_none
attribute.
* config/i386/i386.h (ix86_args): Add preserve_none_abi.
(call_saved_registers_type): Add TYPE_PRESERVE_NONE.
(machine_function): Change call_saved_registers to 3 bits.
* doc/extend.texi: Add preserve_none attribute.  Update
no_caller_saved_registers attribute to remove -mgeneral-regs-only
restriction.

gcc/testsuite/

PR target/119628
* gcc.target/i386/no-callee-saved-3.c: Adjust error location.
* gcc.target/i386/no-callee-saved-19a.c: New test.
* gcc.target/i386/no-callee-saved-19b.c: Likewise.
* gcc.target/i386/no-callee-saved-19c.c: Likewise.
* gcc.target/i386/no-callee-saved-19d.c: Likewise.
* gcc.target/i386/no-callee-saved-19e.c: Likewise.
* gcc.target/i386/preserve-none-1.c: Likewise.
* gcc.target/i386/preserve-none-2.c: Likewise.
* gcc.target/i386/preserve-none-3.c: Likewise.
* gcc.target/i386/preserve-none-4.c: Likewise.
* gcc.target/i386/preserve-none-5.c: Likewise.
* gcc.target/i386/preserve-none-6.c: Likewise.
* gcc.target/i386/preserve-none-7.c: Likewise.
* gcc.target/i386/preserve-none-8.c: Likewise.
* gcc.target/i386/preserve-none-9.c: Likewise.
* gcc.target/i386/preserve-none-10.c: Likewise.
* gcc.target/i386/preserve-none-11.c: Likewise.
* gcc.target/i386/preserve-none-12.c: Likewise.
* gcc.target/i386/preserve-none-13.c: Likewise.
* gcc.target/i386/preserve-none-14.c: Likewise.
* gcc.target/i386/preserve-none-15.c: Likewise.
* gcc.target/i386/preserve-none-16.c: Likewise.
* gcc.target/i386/preserve-none-17.c: Likewise.
* gcc.target/i386/preserve-none-18.c: Likewise.
* gcc.target/i386/preserve-none-19.c: Likewise.
* gcc.target/i386/preserve-none-20.c: Likewise.
* gcc.target/i386/preserve-none-21.c: Likewise.
* gcc.target/i386/preserve-none-22.c: Likewise.
* gcc.target/i386/preserve-none-23.c: Likewise.
* gcc.target/i386/preserve-none-24.c: Likewise.
* gcc.target/i386/preserve-none-25.c: Likewise.
* gcc.target/i386/preserve-none-26.c: Likewise.
* gcc.target/i386/preserve-none-27.c: Likewise.
* gcc.target/i386/preserve-none-28.c: Likewise.
* gcc.target/i386/preserve-none-29.c: Likewise.
* gcc.target/i386/preserve-none-30a.c: Likewise.
* gcc.target/i386/preserve-none-30b.c: Likewise.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
5 weeks agoDaily bump.
GCC Administrator [Thu, 26 Jun 2025 00:19:59 +0000 (00:19 +0000)] 
Daily bump.

5 weeks agox86: Add debug dump for the remove_redundant_vector pass
H.J. Lu [Sat, 10 May 2025 08:57:58 +0000 (16:57 +0800)] 
x86: Add debug dump for the remove_redundant_vector pass

Add debug dump for the remove_redundant_vector pass with the following
output:

Replace:

(insn 7 4 8 2 (set (reg:V2DI 103)
        (const_vector:V2DI [
                (const_int 0 [0]) repeated x2
            ])) "x.c":8:13 2406 {movv2di_internal}
     (nil))

with:

(insn 7 4 8 2 (set (reg:V2DI 103)
        (subreg:V2DI (reg:V32QI 109) 0)) "x.c":8:13 2406 {movv2di_internal}
     (nil))

...

Replace:

(insn 16 15 17 3 (set (reg:V4DI 105)
        (const_vector:V4DI [
                (const_int 0 [0]) repeated x4
            ])) "x.c":13:28 2405 {movv4di_internal}
     (nil))

with:

(insn 16 15 17 3 (set (reg:V4DI 105)
        (subreg:V4DI (reg:V32QI 109) 0)) "x.c":13:28 2405 {movv4di_internal}
     (nil))

...

Place:

(insn 25 5 23 2 (set (reg:V32QI 109)
        (const_vector:V32QI [
                (const_int 0 [0]) repeated x32
            ])) -1
     (nil))

after:

(insn 23 25 24 2 (set (reg/f:DI 107 [ mem1 ])
        (reg:DI 5 di [ mem1 ])) "x.c":5:1 95 {*movdi_internal}
     (expr_list:REG_DEAD (reg:DI 5 di [ mem1 ])
        (nil)))

in the *.309r.rrvl debug dump.

* config/i386/i386-features.cc (ix86_place_single_vector_set):
Add debug dump.
(replace_vector_const): Likewise.
(remove_redundant_vector_load): Likewise.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
5 weeks agoarc: Use intrinsics for __builtin_mul_overflow ()
Luis Silva [Wed, 25 Jun 2025 14:58:35 +0000 (17:58 +0300)] 
arc: Use intrinsics for __builtin_mul_overflow ()

This patch handles both signed and unsigned builtin multiplication
overflow.

It uses the "mpy.f" instruction to set the condition codes based on the
result.  If the multiplication overflows, the V flag is set, and a
subsequent conditional move is predicated on that flag.

For example, set "1" to "r0" in case of overflow:

        mov_s   r0,1
        mpy.f   r0,r0,r1
        j_s.d   [blink]
        mov.nv  r0,0
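
For reference, a minimal sketch of source code that uses the builtin for
both signednesses.  The wrapper names are hypothetical; only
__builtin_mul_overflow itself is the GCC builtin:

```cpp
#include <cassert>
#include <climits>

// Returns true if a * b overflows int; *out receives the wrapped product.
inline bool mul_overflows(int a, int b, int *out) {
  return __builtin_mul_overflow(a, b, out);
}

// Unsigned variant: true if the product does not fit in unsigned.
inline bool umul_overflows(unsigned a, unsigned b, unsigned *out) {
  return __builtin_mul_overflow(a, b, out);
}

// Usage: small products fit, products past the type's range do not.
inline void check_mul_overflow() {
  int r;
  assert(!mul_overflows(6, 7, &r) && r == 42);
  assert(mul_overflows(INT_MAX, 2, &r));
  unsigned u;
  assert(umul_overflows(UINT_MAX, 2u, &u) && u == UINT_MAX - 1u);
}
```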

gcc/ChangeLog:

* config/arc/arc.md (<su_optab>mulvsi4): New define_expand.
(<su_optab>mulsi3_Vcmp): New define_insn.

Signed-off-by: Luis Silva <luiss@synopsys.com>
5 weeks agoarc: Add commutative multiplication patterns
Luis Silva [Wed, 25 Jun 2025 14:54:12 +0000 (17:54 +0300)] 
arc: Add commutative multiplication patterns

This patch introduces two new instruction patterns:

    `*mulsi3_cmp0`: This pattern performs a multiplication and sets
    the CC_Z register based on the result, while also storing the
    result of the multiplication in a general-purpose register.

    `*mulsi3_cmp0_noout`: This pattern performs a multiplication and
    sets the CC_Z register based on the result without storing the
    result in a general-purpose register.

These patterns are optimized to generate code using the `mpy.f`
instruction, specifically used where the result is compared to zero.

In addition, the previous commutative multiplication implementation
was removed.  It took the negative flag into account, which is wrong.
The new implementation only considers the zero flag.

A test case has been added to verify the correctness of these changes.
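
A minimal sketch of the kind of source these patterns are presumably meant
to match: a product that is only compared against zero, and a product that
is both stored and compared.  The function names are hypothetical and
unsigned arithmetic is used to keep the example free of overflow issues:

```cpp
#include <cassert>

// Product is only tested against zero (no result needed afterwards).
inline bool product_is_zero(unsigned a, unsigned b) {
  return a * b == 0u;
}

// Product is stored and also tested against zero.
inline bool store_product(unsigned a, unsigned b, unsigned *out) {
  *out = a * b;
  return *out != 0u;
}

// Usage: a zero operand makes the product zero.
inline void check_product_cmp0() {
  unsigned r;
  assert(product_is_zero(0u, 5u));
  assert(!product_is_zero(3u, 4u));
  assert(store_product(3u, 4u, &r) && r == 12u);
}
```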

gcc/ChangeLog:

* config/arc/arc.cc (arc_select_cc_mode): Handle multiplication
results compared against zero, selecting CC_Zmode.
* config/arc/arc.md (*mulsi3_cmp0): New define_insn.
(*mulsi3_cmp0_noout): New define_insn.

gcc/testsuite/ChangeLog:

* gcc.target/arc/mult-cmp0.c: New test.

Signed-off-by: Luis Silva <luiss@synopsys.com>
5 weeks agoarc: testsuite: Scan rlc instead of mov.hs
Luis Silva [Wed, 25 Jun 2025 14:45:37 +0000 (17:45 +0300)] 
arc: testsuite: Scan rlc instead of mov.hs

Due to the patch by Roger Sayle,
09881218137f4af9b7c894c2d350cf2ff8e0ee23, which introduces the use of
the `rlc rX,0` instruction in place of the `mov.hs`, the add overflow
test case needs to be updated.  The previous test case was validating
the `mov.hs` instruction, but now it must validate the `rlc`
instruction as the new behavior.

gcc/testsuite/ChangeLog:

* gcc.target/arc/overflow-1.c: Replace mov.hs with rlc.

Signed-off-by: Luis Silva <luiss@synopsys.com>
5 weeks agoARC: Use intrinsics for __builtin_sub_overflow*()
Shahab Vahedi [Wed, 25 Jun 2025 14:37:02 +0000 (17:37 +0300)] 
ARC: Use intrinsics for __builtin_sub_overflow*()

This patch covers signed and unsigned subtractions.  The generated code
would be something along these lines:

signed:
  sub.f   r0, r1, r2
  b.v     @label

unsigned:
  sub.f   r0, r1, r2
  b.c     @label
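
A minimal sketch of the source-level view, with hypothetical wrapper names:
the signed case corresponds to testing the overflow (V) condition and the
unsigned case to testing the borrow (C) condition.

```cpp
#include <cassert>
#include <climits>

// Signed: returns true when a - b is representable, storing it in *out.
inline bool checked_sub(int a, int b, int *out) {
  return !__builtin_sub_overflow(a, b, out);
}

// Unsigned: returns true when a - b needs a borrow (that is, a < b).
inline bool usub_borrows(unsigned a, unsigned b, unsigned *out) {
  return __builtin_sub_overflow(a, b, out);
}

// Usage: INT_MIN - 1 overflows; 0u - 1u borrows and wraps to UINT_MAX.
inline void check_sub_overflow() {
  int r;
  assert(checked_sub(5, 7, &r) && r == -2);
  assert(!checked_sub(INT_MIN, 1, &r));
  unsigned u;
  assert(usub_borrows(0u, 1u, &u) && u == UINT_MAX);
  assert(!usub_borrows(7u, 5u, &u) && u == 2u);
}
```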

gcc/

* config/arc/arc.md (subsi3_v, subvsi4, subsi3_c): New patterns.

gcc/testsuite/

* gcc.target/arc/overflow-2.c: New file.

5 weeks agoARC: Use intrinsics for __builtin_add_overflow*()
Shahab Vahedi [Wed, 25 Jun 2025 14:22:45 +0000 (17:22 +0300)] 
ARC: Use intrinsics for __builtin_add_overflow*()

This patch covers signed and unsigned additions.  The generated code
would be something along these lines:

signed:
  add.f   r0, r1, r2
  b.v     @label

unsigned:
  add.f   r0, r1, r2
  b.c     @label
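
One common use of the builtin is saturating arithmetic.  A minimal sketch,
with hypothetical function names, in which the overflow branch is the
unlikely path:

```cpp
#include <cassert>
#include <climits>

// Signed addition that clamps to INT_MIN/INT_MAX instead of overflowing.
inline int saturating_add(int a, int b) {
  int r;
  if (__builtin_add_overflow(a, b, &r)) {
    // Signed overflow requires both operands to share a sign, so the
    // sign of a gives the direction of the overflow.
    return a < 0 ? INT_MIN : INT_MAX;
  }
  return r;
}

// Unsigned: returns true when a + b produces a carry out.
inline bool uadd_carries(unsigned a, unsigned b, unsigned *out) {
  return __builtin_add_overflow(a, b, out);
}

// Usage: results clamp at the ends of the int range.
inline void check_add_overflow() {
  assert(saturating_add(2, 3) == 5);
  assert(saturating_add(INT_MAX, 1) == INT_MAX);
  assert(saturating_add(INT_MIN, -1) == INT_MIN);
  unsigned u;
  assert(uadd_carries(UINT_MAX, 1u, &u) && u == 0u);
}
```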

gcc/

* config/arc/arc-modes.def (CC_V): New mode.
* config/arc/arc-protos.h (arc_gen_unlikely_cbranch): New
function declaration.
* config/arc/arc.cc (arc_gen_unlikely_cbranch): New
function.
(get_arc_condition_code): Handle new mode.
* config/arc/arc.md (addvsi3_v, addvsi4, addsi3_c, uaddvsi4): New
patterns.
* config/arc/predicates.md (proper_comparison_operator): Handle
the new V_mode.
(equality_comparison_operator): Likewise.

gcc/testsuite/

* gcc.target/arc/overflow-1.c: New file

5 weeks agodiagnostics: Mark path_label::get_effects as final override
Martin Jambor [Wed, 25 Jun 2025 15:11:34 +0000 (17:11 +0200)] 
diagnostics: Mark path_label::get_effects as final override

When compiling diagnostic-path-output.cc with clang, it warns that
path_label::get_effects should be marked as override.  That looks like
a good idea and from a brief look I also believe it should be marked
as final (the other override in the class is marked as both), so this
patch does that.

Likewise for html_output_format::after_diagnostic in
diagnostic-format-html.cc which also already has quite a few member
functions marked as final override.
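
A minimal sketch of the idiom, using hypothetical stand-in classes rather
than the real diagnostic classes:

```cpp
#include <cassert>

// Hypothetical base class with a virtual hook.
struct label_base {
  virtual ~label_base() = default;
  virtual int get_effects() const { return 0; }
};

// "override" makes the compiler verify that a base virtual is overridden;
// "final" forbids further overriding in classes derived from this one.
struct path_label_like : label_base {
  int get_effects() const final override { return 1; }
};

// Usage: the call through the base reference reaches the override.
inline void check_final_override() {
  path_label_like p;
  const label_base &b = p;
  assert(b.get_effects() == 1);
}
```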

gcc/ChangeLog:

2025-06-24  Martin Jambor  <mjambor@suse.cz>

* diagnostic-path-output.cc (path_label::get_effects): Mark as
final override.
* diagnostic-format-html.cc
(html_output_format::after_diagnostic): Likewise.

5 weeks agoranger-op: Use CFN_ constant instead of plain BUILTIN_ one
Martin Jambor [Mon, 23 Jun 2025 16:21:34 +0000 (18:21 +0200)] 
ranger-op: Use CFN_ constant instead of plain BUILTIN_ one

When compiling gimple-range-op.cc, clang issues the warning:

  gimple-range-op.cc:1419:18: warning: comparison of different enumeration types in switch statement ('combined_fn' and 'built_in_function') [-Wenum-compare-switch]

which I hope is harmless, but all other switch cases use CFN_ prefixed
constants, so I guess the ISINF case should too.
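
A minimal sketch of the situation with two hypothetical enumerations.  It
assumes, as an illustration only, that the matching enumerators have the
same numeric value, in which case mixing them in a switch compiles and
behaves the same but is flagged by the compiler:

```cpp
#include <cassert>

// Hypothetical stand-ins for built_in_function and combined_fn.
enum builtin_kind { BK_ISINF = 3 };
enum combined_kind { CK_ISINF = 3, CK_ISNAN = 4 };

// The switch operand is a combined_kind, so every case label uses an
// enumerator of that type.  Writing "case BK_ISINF:" here would still
// compile, but compares two different enumeration types.
inline int classify(combined_kind k) {
  switch (k) {
    case CK_ISINF:
      return 1;
    case CK_ISNAN:
      return 2;
  }
  return 0;
}

// Usage: each enumerator selects its own case.
inline void check_classify() {
  assert(classify(CK_ISINF) == 1);
  assert(classify(CK_ISNAN) == 2);
}
```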

gcc/ChangeLog:

2025-06-23  Martin Jambor  <mjambor@suse.cz>

* gimple-range-op.cc
(gimple_range_op_handler::maybe_builtin_call): Use
CFN_BUILT_IN_ISINF instead of BUILT_IN_ISINF.

5 weeks agovalue-relation.h: Mark dom_oracle::next_relation as override
Martin Jambor [Wed, 25 Jun 2025 15:03:39 +0000 (17:03 +0200)] 
value-relation.h: Mark dom_oracle::next_relation as override

When GCC is compiled with clang, it emits a warning that
dom_oracle::next_relation is not marked as override even though it
does override a virtual function of its ancestor.  This patch marks it
as such to silence the warning and for the sake of consistency.

There are other member functions in the class which are marked as
final override but this particular function is in the protected
section so I decided to just mark it as override.

gcc/ChangeLog:

2025-06-24  Martin Jambor  <mjambor@suse.cz>

* value-relation.h (class dom_oracle): Mark member function
next_relation as override.

5 weeks agotree-ssa-propagate.h: Mark two functions as override
Martin Jambor [Wed, 25 Jun 2025 15:02:10 +0000 (17:02 +0200)] 
tree-ssa-propagate.h: Mark two functions as override

When tree-ssa-propagate.h is compiled with clang, it complains that
member functions value_of_expr and range_of_expr of class
substitute_and_fold_engine are not marked as override even though they
do override virtual functions of the ancestor class.  This patch
merely adds the keyword to silence the warning and for consistency's
sake.

I did not make this part of the previous patch because I wanted to
point out that the first case is quite unusual, a virtual function
with a functional body (range_query::value_of_expr) is being
overridden with a pure virtual function.  I assume it was a conscious
decision but adding the override keyword seems even more important
then.
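
The unusual case described above can be reproduced in a few lines.  A
minimal sketch with hypothetical classes (query, engine, concrete_engine)
standing in for the real ones:

```cpp
#include <cassert>
#include <type_traits>

// Base class: the virtual function has a body.
struct query {
  virtual ~query() = default;
  virtual int value_of_expr(int e) { return e; }
};

// Intermediate class: re-declares the function as pure virtual, forcing
// every concrete subclass to supply its own definition.
struct engine : query {
  int value_of_expr(int e) override = 0;
};

struct concrete_engine : engine {
  int value_of_expr(int e) override { return e * 2; }
};

static_assert(std::is_abstract<engine>::value,
              "re-declaring as pure makes the class abstract");

// Usage: dispatch through the base reaches the concrete definition.
inline void check_pure_override() {
  concrete_engine c;
  query &q = c;
  assert(q.value_of_expr(21) == 42);
}
```

With override in place, a later change to the base signature turns into a
compile error instead of silently introducing a new pure virtual function.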

gcc/ChangeLog:

2025-06-24  Martin Jambor  <mjambor@suse.cz>

* tree-ssa-propagate.h (class substitute_and_fold_engine): Mark
member functions value_of_expr and range_of_expr as override.

5 weeks agoranger: Mark several member functions as final override
Martin Jambor [Wed, 25 Jun 2025 14:59:12 +0000 (16:59 +0200)] 
ranger: Mark several member functions as final override

When GCC is built with clang, it emits warnings that several member
functions of various ranger classes override a virtual function of an
ancestor but are not marked with the override keyword.  After
inspecting the cases, I found that all these classes had other member
functions marked as final override, so I added the final keyword
everywhere too.

In some cases other such overrides were not explicitly marked as
virtual, which made formatting easier.  For that reason and also for
consistency, in such cases I removed the virtual keyword from the
functions I marked as final override too.

gcc/ChangeLog:

2025-06-24  Martin Jambor  <mjambor@suse.cz>

* range-op-mixed.h (class operator_plus): Mark member function
overflow_free_p as final override.
(class operator_minus): Likewise.
(class operator_mult): Likewise.
* range-op-ptr.cc (class pointer_plus_operator): Mark member
function lhs_op1_relation as final override.
* range-op.cc (class operator_div::): Mark member functions
op2_range and update_bitmask as final override.
(class operator_logical_and): Mark member functions fold_range,
op1_range and op2_range as final override.  Remove unnecessary
virtual.
(class operator_logical_or): Likewise.
(class operator_logical_not): Mark member functions fold_range and
op1_range as final override.  Remove unnecessary virtual.
(class operator_absu): Mark member functions wi_fold as final
override.

5 weeks agocoroutines: Remove unused private member in cp_coroutine_transform
Martin Jambor [Wed, 25 Jun 2025 14:56:58 +0000 (16:56 +0200)] 
coroutines: Remove unused private member in cp_coroutine_transform

When building GCC with clang, it warns that the private member suffix
in class cp_coroutine_transform (defined in gcc/cp/coroutines.h) is
not used which indeed looks like it is the case.  This patch therefore
removes it.

gcc/cp/ChangeLog:

2025-06-24  Martin Jambor  <mjambor@suse.cz>

* coroutines.h (class cp_coroutine_transform): Remove member
orig_fn_body.

5 weeks agoMark pass_sccopy gate and execute functions as final override
Martin Jambor [Wed, 25 Jun 2025 14:53:03 +0000 (16:53 +0200)] 
Mark pass_sccopy gate and execute functions as final override

It is customary to mark the gate and execute functions of the classes
representing passes as final override but this is missing in
pass_sccopy.  This patch adds it which also silences clang warnings
about it.

gcc/ChangeLog:

2025-06-24  Martin Jambor  <mjambor@suse.cz>

* gimple-ssa-sccopy.cc (class pass_sccopy): Mark member functions
gate and execute as final override.

5 weeks agoMark rtl_avoid_store_forwarding functions final override
Martin Jambor [Wed, 25 Jun 2025 14:48:44 +0000 (16:48 +0200)] 
Mark rtl_avoid_store_forwarding functions final override

It is customary to mark the gate and execute functions of the classes
representing passes as final override but this is missing in
pass_rtl_avoid_store_forwarding.  This patch adds it which also
silences a clang warning about it.

gcc/ChangeLog:

2025-06-24  Martin Jambor  <mjambor@suse.cz>

* avoid-store-forwarding.cc (class
pass_rtl_avoid_store_forwarding): Mark member function gate as
final override.

5 weeks agoRemove unused vector in value-relation.cc.
Andrew MacLeod [Tue, 24 Jun 2025 20:51:56 +0000 (16:51 -0400)] 
Remove unused vector in value-relation.cc.

The relation_to_code vector in value-relation is now unused, so we can
remove it.

* value-relation.cc (relation_to_code): Remove.

5 weeks agoPromote verify_range to vrange.
Andrew MacLeod [Fri, 20 Jun 2025 01:19:27 +0000 (21:19 -0400)] 
Promote verify_range to vrange.

Most range classes had a verify_range, but they were all private.  Make it a
supported routine from vrange.

* value-range.cc (frange::verify_range): Constify.
(irange::verify_range): Constify.
* value-range.h (vrange::verify_range): New.
(irange::verify_range): Make public.
(prange::verify_range): Make public.
(prange::verify_range): Make public.
(value_range::verify_range): New.

5 weeks agoget_bitmask is sometimes less refined.
Andrew MacLeod [Tue, 24 Jun 2025 17:10:56 +0000 (13:10 -0400)] 
get_bitmask is sometimes less refined.

get_bitmask intersects the current mask with a mask generated from the
range.  If the two masks are incompatible, it currently returns UNKNOWN.
Instead, it should return the original mask, or information is lost.
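
The idea can be illustrated with a simplified known-bits model.  This is a
sketch under stated assumptions, not GCC's irange_bitmask: the known_bits
type, its fields and the refine function are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Simplified model: a set bit in "unknown" means nothing is known about
// that bit; otherwise the bit equals the corresponding bit of "value".
struct known_bits {
  std::uint32_t value;
  std::uint32_t unknown;
};

// Two masks are incompatible if a bit known in both has different values.
inline bool conflicts(known_bits a, known_bits b) {
  const std::uint32_t both_known = ~a.unknown & ~b.unknown;
  return ((a.value ^ b.value) & both_known) != 0;
}

// Combine the current mask with one derived from the range.  On conflict,
// keep the current mask instead of discarding everything.
inline known_bits refine(known_bits current, known_bits from_range) {
  if (conflicts(current, from_range))
    return current;
  known_bits r;
  r.unknown = current.unknown & from_range.unknown;
  r.value = (current.value & ~current.unknown) |
            (from_range.value & ~from_range.unknown);
  return r;
}

// Usage: a conflict on bit 0 leaves the current mask untouched.
inline void check_refine() {
  const known_bits low_is_one = {1u, ~1u};
  const known_bits low_is_zero = {0u, ~1u};
  const known_bits r = refine(low_is_one, low_is_zero);
  assert(r.value == 1u && r.unknown == ~1u);
}
```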

* value-range.cc (irange::get_bitmask): Return original mask if
result is unknown.
(assert_snap_result): New.
(test_irange_snap_bounds): New.
(range_tests_misc): Call test_irange_snap_bounds.

5 weeks agotree-optimization/109892 - SLP reduction of fma
Richard Biener [Wed, 25 Jun 2025 08:36:59 +0000 (10:36 +0200)] 
tree-optimization/109892 - SLP reduction of fma

The following adds the ability to vectorize a fma reduction pair
as SLP reduction (we cannot yet handle ternary association in
reduction vectorization).
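
A minimal sketch of a loop with two interleaved fma accumulators, which is
one reading of "fma reduction pair".  The function name and data layout are
hypothetical, and whether a given compiler vectorizes it depends on target
and flags:

```cpp
#include <cassert>
#include <cmath>

// Two independent accumulators, each updated by a fused multiply-add over
// alternating elements; n counts pairs, so a and b hold 2 * n elements.
inline void fma_pair(const double *a, const double *b, int n, double *out) {
  double x = 0.0, y = 0.0;
  for (int i = 0; i < n; ++i) {
    x = std::fma(a[2 * i], b[2 * i], x);
    y = std::fma(a[2 * i + 1], b[2 * i + 1], y);
  }
  out[0] = x;
  out[1] = y;
}

// Usage: x = 1*5 + 3*7 = 26 and y = 2*6 + 4*8 = 44.
inline void check_fma_pair() {
  const double a[] = {1.0, 2.0, 3.0, 4.0};
  const double b[] = {5.0, 6.0, 7.0, 8.0};
  double out[2];
  fma_pair(a, b, 2, out);
  assert(out[0] == 26.0 && out[1] == 44.0);
}
```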

PR tree-optimization/109892
* tree-vect-loop.cc (check_reduction_path): Handle fma.
(vectorizable_reduction): Apply FOLD_LEFT_REDUCTION code
generation constraints.

* gcc.dg/vect/vect-reduc-fma-1.c: New testcase.
* gcc.dg/vect/vect-reduc-fma-2.c: Likewise.
* gcc.dg/vect/vect-reduc-fma-3.c: Likewise.

5 weeks agotree-optimization/120808 - SLP build with mixed .FMA/.FMS
Richard Biener [Wed, 25 Jun 2025 07:24:41 +0000 (09:24 +0200)] 
tree-optimization/120808 - SLP build with mixed .FMA/.FMS

The following allows SLP build to succeed when mixing .FMA/.FMS
in different lanes, like we handle mixed plus/minus.  This does not
yet address SLP pattern matching not being able to form
a FMADDSUB from this.
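
A minimal sketch of adjacent lanes that mix a fused multiply-add with a
fused multiply-subtract.  The function name is hypothetical, and the
mapping to .FMA/.FMS internal functions is an assumption about how the
compiler represents these calls:

```cpp
#include <cassert>
#include <cmath>

// Lane 0 computes a*b + c, lane 1 computes a*b - c.
inline void fma_fms(const double *a, const double *b, const double *c,
                    double *out) {
  out[0] = std::fma(a[0], b[0], c[0]);
  out[1] = std::fma(a[1], b[1], -c[1]);
}

// Usage: 2*4 + 1 = 9 and 3*5 - 1 = 14.
inline void check_fma_fms() {
  const double a[] = {2.0, 3.0};
  const double b[] = {4.0, 5.0};
  const double c[] = {1.0, 1.0};
  double out[2];
  fma_fms(a, b, c, out);
  assert(out[0] == 9.0 && out[1] == 14.0);
}
```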

PR tree-optimization/120808
* tree-vectorizer.h (compatible_calls_p): Add flag to
indicate a FMA/FMS pair is allowed.
* tree-vect-slp.cc (compatible_calls_p): Likewise.
(vect_build_slp_tree_1): Allow mixed .FMA/.FMS as two-operator.
(vect_build_slp_tree_2): Handle calls in two-operator SLP build.
* tree-vect-slp-patterns.cc (compatible_complex_nodes_p):
Adjust.

* gcc.dg/vect/bb-slp-pr120808.c: New testcase.

5 weeks agoivopts: Change constant_multiple_of to expand aff nodes.
Alfie Richards [Tue, 24 Jun 2025 13:49:27 +0000 (13:49 +0000)] 
ivopts: Change constant_multiple_of to expand aff nodes.

This changes the calls to tree_to_aff_combination in constant_multiple_of to
tree_to_aff_combination_expand along with associated plumbing of ivopts_data
and required cache.

This improves cases such as:

```c
void f(int *p1, int *p2, unsigned long step, unsigned long end, svbool_t pg) {
    for (unsigned long i = 0; i < end; i += step) {
        svst1(pg, p1, svld1_s32(pg, p2));
        p1 += step;
        p2 += step;
    }
}
```

Where ivopts previously didn't expand the SSA variables for the step increments
and so lacked the ability to group all the IVs and ended up with:

```
f:
cbz x3, .L1
mov x4, 0
.L3:
ld1w z31.s, p0/z, [x1]
add x4, x4, x2
st1w z31.s, p0, [x0]
add x1, x1, x2, lsl 2
add x0, x0, x2, lsl 2
cmp x3, x4
bhi .L3
.L1:
ret
```

After this change we end up with:

```
f:
cbz x3, .L1
mov x4, 0
.L3:
ld1w z31.s, p0/z, [x1, x4, lsl 2]
st1w z31.s, p0, [x0, x4, lsl 2]
add x4, x4, x2
cmp x3, x4
bhi .L3
.L1:
ret
```

gcc/ChangeLog:

* tree-ssa-loop-ivopts.cc (constant_multiple_of): Change
tree_to_aff_combination to tree_to_aff_combination_expand and add
parameter to take ivopts_data.
(get_computation_aff_1): Change parameters and calls to include
ivopts_data.
(get_computation_aff): Ditto.
(get_computation_at): Ditto.
(get_debug_computation_at): Ditto.
(get_computation_cost): Ditto.
(rewrite_use_nonlinear_expr): Ditto.
(rewrite_use_address): Ditto.
(rewrite_use_compare): Ditto.
(remove_unused_ivs): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve/adr_7.c: New test.

5 weeks agolibstdc++: Test for %S precision for durations with integral representation.
Tomasz Kamiński [Tue, 24 Jun 2025 11:49:26 +0000 (13:49 +0200)] 
libstdc++: Test for %S precision for durations with integral representation.

Existing tests are extended to cover cases where no precision is specified,
or it is specified as zero.  The precision value is ignored in all cases.

libstdc++-v3/ChangeLog:

* testsuite/std/time/format/precision.cc: New tests.

5 weeks agortl-ssa: Rewrite process_uses_of_deleted_def [PR120745]
Richard Sandiford [Wed, 25 Jun 2025 09:44:34 +0000 (10:44 +0100)] 
rtl-ssa: Rewrite process_uses_of_deleted_def [PR120745]

process_uses_of_deleted_def seems to have been written on the assumption
that non-degenerate phis would be explicitly deleted by an insn_change,
and that the function therefore only needed to delete degenerate phis.
But that was inconsistent with the rest of the code, and wouldn't be
very convenient in any case.

This patch therefore rewrites process_uses_of_deleted_def to handle
general phis.

I'm not aware that this fixes any issues in current code, but it is
needed to enable the rtl-ssa dce work that Ondřej and Honza are
working on.

gcc/
PR rtl-optimization/120745
* rtl-ssa/changes.cc (process_uses_of_deleted_def): Rewrite to
handle deletions of non-degenerate phis.

5 weeks agolibstdc++: Report compilation error on formatting "%d" from month_last [PR120650]
Tomasz Kamiński [Tue, 24 Jun 2025 07:17:12 +0000 (09:17 +0200)] 
libstdc++: Report compilation error on formatting "%d" from month_last [PR120650]

For month_day_last we incorrectly reported day information to be available, which led
to format_error being thrown from the call to formatter::format at runtime, instead
of making the call to format ill-formed.

The included test covers most of the combinations of _ChronoParts and format
specifiers.

PR libstdc++/120650

libstdc++-v3/ChangeLog:

* include/bits/chrono_io.h
(formatter<chrono::month_day_last,_CharT>::parse): Call _M_parse with
only Month being available.
* testsuite/std/time/format/data_not_present_neg.cc: New test.

5 weeks agox86: Update -mtune=intel for Diamond Rapids/Clearwater Forest
H.J. Lu [Tue, 24 Jun 2025 23:40:31 +0000 (07:40 +0800)] 
x86: Update -mtune=intel for Diamond Rapids/Clearwater Forest

-mtune=intel is used to generate a single binary to run well on both big
core and small core, similar to hybrid CPUs.  Update -mtune=intel to tune
for Diamond Rapids and Clearwater Forest, instead of Silvermont.

PR target/120815
* common/config/i386/i386-common.cc (processor_alias_table):
Replace CPU_SLM/PTA_NEHALEM with CPU_HASWELL/PTA_HASWELL for
PROCESSOR_INTEL.
* config/i386/i386-options.cc (processor_cost_table): Replace
intel_cost with alderlake_cost.
* config/i386/x86-tune-costs.h (intel_cost): Removed.
* config/i386/x86-tune-sched.cc (ix86_issue_rate): Treat
PROCESSOR_INTEL like PROCESSOR_ALDERLAKE.
(ix86_adjust_cost): Likewise.
* doc/invoke.texi: Update -mtune=intel for Diamond Rapids and
Clearwater Forest.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
5 weeks agoi386: Remove CLDEMOTE for clients
Haochen Jiang [Wed, 25 Jun 2025 02:34:37 +0000 (10:34 +0800)] 
i386: Remove CLDEMOTE for clients

CLDEMOTE is not enabled on clients according to the SDM, which only mentions
that it will be enabled on Xeon and Atom servers.  Remove it from client
processors, starting with Alder Lake (where it was introduced).

gcc/ChangeLog:

* config/i386/i386.h (PTA_ALDERLAKE): Use PTA_GOLDMONT_PLUS
as base to remove PTA_CLDEMOTE.
(PTA_SIERRAFOREST): Add PTA_CLDEMOTE since PTA_ALDERLAKE
does not include that anymore.
* doc/invoke.texi: Update texi file.

5 weeks agoRISC-V: Add Profiles RVA/B23S64 support.
Jiawei [Tue, 24 Jun 2025 09:34:05 +0000 (17:34 +0800)] 
RISC-V: Add Profiles RVA/B23S64 support.

This patch adds support for the RISC-V Profiles RVA23S64 and RVB23S64.

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc: New Profiles.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/arch-rva23s.c: New test.
* gcc.target/riscv/arch-rvb23s.c: New test.

5 weeks agoAdd -fauto-profile-inlining
Jan Hubicka [Wed, 25 Jun 2025 01:01:29 +0000 (03:01 +0200)] 
Add -fauto-profile-inlining

This patch adds -fauto-profile-inlining, which can be used to control
the auto-profile directed inlining.

gcc/ChangeLog:

* common.opt (fauto-profile-inlining): New.
* doc/invoke.texi (-fauto-profile-inlining): Document.
* ipa-inline.cc (inline_functions_by_afdo): Check
flag_auto_profile.
(early_inliner): Also do inline_functions_by_afdo with
!flag_early_inlining.

5 weeks agoRemove early inlining from afdo pass
Jan Hubicka [Wed, 25 Jun 2025 00:59:54 +0000 (02:59 +0200)] 
Remove early inlining from afdo pass

This patch removes early inlining from the afdo pass, since all inlining should now
happen from the early inliner.  I tested this on SPEC and there are 3 inlines
happening here which are blocked at early-inline time by hitting the large function
growth limit.  We probably want to bypass that limit; I will look into that
incrementally.

This should make the non-inlined function profile merging hopefully easier.

It may still make sense to separate the afdo inliner from the early inliner to solve
the non-transitivity issues, which is not that hard to do with the current code
organization.  However, this should be a separate IPA pass rather than another
part of the afdo pass, since it can be conceptually separate.

gcc/ChangeLog:

* auto-profile.cc: Update toplevel comment.
(early_inline): Remove.
(auto_profile): Don't do early inlining.

5 weeks agoDaily bump.
GCC Administrator [Wed, 25 Jun 2025 00:19:34 +0000 (00:19 +0000)] 
Daily bump.

5 weeks agogcn: Fix glc vs. sc0 handling for scalar memory access
Tobias Burnus [Tue, 24 Jun 2025 21:55:27 +0000 (23:55 +0200)] 
gcn: Fix glc vs. sc0 handling for scalar memory access

gfx942 still uses glc for scalar access ('s_...') and only uses
sc0/nt/sc1 for vector access.

gcc/ChangeLog:

* config/gcn/gcn-opts.h (TARGET_GLC_NAME): Fix and extend the
description in the comment.
* config/gcn/gcn.cc (print_operand): Extend the comment about
'G' and 'g'.
* config/gcn/gcn.md: Use 'glc' instead of %G where appropriate.

5 weeks agoFortran/OpenACC: Add Fortran support for acc_attach/acc_detach
Tobias Burnus [Tue, 24 Jun 2025 21:28:57 +0000 (23:28 +0200)] 
Fortran/OpenACC: Add Fortran support for acc_attach/acc_detach

While C/C++ have supported the acc_attach{,_async} and
acc_detach{,_finalize}{,_async} routines for a long time, the Fortran
API routines were only added in OpenACC 3.3.

Unfortunately, they cannot be implemented directly in the library, as
GCC will introduce a temporary array descriptor in some cases, which
causes the attachment to be attempted on this temporary variable instead
of on the original one.

Therefore, those API routines are handled in a special way in the compiler.

gcc/fortran/ChangeLog:

* trans-stmt.cc (gfc_trans_call_acc_attach_detach): New.
(gfc_trans_call): Call it.

libgomp/ChangeLog:

* libgomp.texi (acc_attach, acc_detach): Update for Fortran
version.
* openacc.f90 (acc_attach{,_async}, acc_detach{,_finalize}{,_async}):
Add.
* openacc_lib.h: Likewise.
* testsuite/libgomp.oacc-fortran/acc-attach-detach-1.f90: New test.
* testsuite/libgomp.oacc-fortran/acc-attach-detach-2.f90: New test.

5 weeks agoRISC-V: Add patterns for vector-scalar multiply-(subtract-)accumulate [PR119100]
Paul-Antoine Arras [Tue, 24 Jun 2025 21:42:50 +0000 (15:42 -0600)] 
RISC-V: Add patterns for vector-scalar multiply-(subtract-)accumulate [PR119100]

This pattern enables the combine pass (or late-combine, depending on the case)
to merge a vec_duplicate into a plus-mult or minus-mult RTL instruction.

Before this patch, we have two instructions, e.g.:
  vfmv.v.f       v6,fa0
  vfmacc.vv      v2,v6,v4

After, we get only one:
  vfmacc.vf      v2,fa0,v4
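
For illustration, a loop of the following shape is the kind of source in
which a scalar is broadcast and then fed into a multiply-accumulate.  This
is only a sketch: the function name scale_accumulate is hypothetical and is
not taken from the testsuite, and whether vfmacc.vf is actually emitted
depends on the target flags and on the vectorizer's decisions.

```cpp
#include <cstddef>

// Hypothetical kernel (not from the GCC testsuite):
// acc[i] = acc[i] + scalar * in[i], with the same scalar for every element.
void scale_accumulate (float *acc, const float *in, float scalar,
                       std::size_t n)
{
  for (std::size_t i = 0; i < n; ++i)
    acc[i] += scalar * in[i];
}
```

The scalar operand plays the role of fa0 in the assembly above, and the
accumulator array corresponds to the destination vector v2.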

PR target/119100

gcc/ChangeLog:

* config/riscv/autovec-opt.md (*<optab>_vf_<mode>): Handle both add and
acc FMA variants.
* config/riscv/vector.md (*pred_mul_<optab><mode>_scalar_undef): New.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vf-1-f16.c: Add vfmacc and vfmsac.
* gcc.target/riscv/rvv/autovec/vx_vf/vf-1-f32.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf-1-f64.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf-2-f16.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf-2-f32.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf-2-f64.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf-3-f16.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf-3-f32.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf-3-f64.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf-4-f16.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf-4-f32.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf-4-f64.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_mulop.h: Add support for acc
variants.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_mulop_run.h: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfmadd-run-1-f16.c: Define
TEST_OUT.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfmadd-run-1-f32.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfmadd-run-1-f64.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfmsub-run-1-f16.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfmsub-run-1-f32.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfmsub-run-1-f64.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfnmadd-run-1-f16.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfnmadd-run-1-f32.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfnmadd-run-1-f64.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfnmsub-run-1-f16.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfnmsub-run-1-f32.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfnmsub-run-1-f64.c: Likewise.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfmacc-run-1-f16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfmacc-run-1-f32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfmacc-run-1-f64.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfmsac-run-1-f16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfmsac-run-1-f32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfmsac-run-1-f64.c: New test.

5 weeks agoFortran: fix ICE in verify_gimple_in_seq with substrings [PR120743]
Harald Anlauf [Tue, 24 Jun 2025 18:46:38 +0000 (20:46 +0200)] 
Fortran: fix ICE in verify_gimple_in_seq with substrings [PR120743]

PR fortran/120743

gcc/fortran/ChangeLog:

* trans-expr.cc (gfc_conv_substring): Substring indices are of
type gfc_charlen_type_node.  Convert to size_type_node for
pointer arithmetic only after offset adjustments have been made.

gcc/testsuite/ChangeLog:

* gfortran.dg/pr120743.f90: New test.

Co-authored-by: Jerry DeLisle <jvdelisle@gcc.gnu.org>
Co-authored-by: Mikael Morin <mikael@gcc.gnu.org>
5 weeks agoc++: Implement C++26 P3618R0 - Allow attaching main to the global module [PR120773]
Jakub Jelinek [Tue, 24 Jun 2025 17:00:11 +0000 (19:00 +0200)] 
c++: Implement C++26 P3618R0 - Allow attaching main to the global module [PR120773]

The following patch implements the P3618R0 paper by tweaking pedwarn
condition, adjusting pedwarn wording, adjusting one testcase and adding 4
new ones.  The paper was voted in as DR, so it isn't guarded on C++ version.

2025-06-24  Jakub Jelinek  <jakub@redhat.com>

PR c++/120773
* decl.cc (grokfndecl): Implement C++26 P3618R0 - Allow attaching
main to the global module.  Only pedwarn for current_lang_name
other than lang_name_cplusplus and adjust pedwarn wording.

* g++.dg/parse/linkage5.C: Don't expect error on
extern "C++" int main ();.
* g++.dg/parse/linkage7.C: New test.
* g++.dg/parse/linkage8.C: New test.
* g++.dg/modules/main-2.C: New test.
* g++.dg/modules/main-3.C: New test.

5 weeks agoi386: Convert LEA stack adjust insn to SUB when FLAGS_REG is dead
Uros Bizjak [Tue, 24 Jun 2025 09:02:02 +0000 (11:02 +0200)] 
i386: Convert LEA stack adjust insn to SUB when FLAGS_REG is dead

ADD/SUB is faster than LEA for most processors. Also, there are
several peephole2 patterns available that convert prologue esp
subtractions to pushes (at the end of i386.md). These process only
patterns with flags reg clobber, so they are ineffective
with clobber-less stack ptr adjustments, introduced by r16-1551
("x86: Enable separate shrink wrapping").

Introduce a peephole2 pattern that adds a clobber to clobber-less
stack ptr adjustments when FLAGS_REG is dead.

gcc/ChangeLog:

* config/i386/i386.md
(@pro_epilogue_adjust_stack_add_nocc<mode>): Add type attribute.
(pro_epilogue_adjust_stack_add_nocc peephole2 pattern):
Convert pro_epilogue_adjust_stack_add_nocc variant to
pro_epilogue_adjust_stack_add when FLAGS_REG is dead.

5 weeks agoRemove non-SLP path from vectorizable_load
Richard Biener [Tue, 24 Jun 2025 12:38:19 +0000 (14:38 +0200)] 
Remove non-SLP path from vectorizable_load

This cleans the rest of vectorizable_load from non-SLP, propagates
out ncopies == 1, and elides loops from 0 to ncopies.

* tree-vect-stmts.cc (vectorizable_load): Remove non-SLP
paths and propagate out ncopies == 1.

5 weeks agodiagnostic: fix for older version of GCC
Marc Poulhiès [Tue, 24 Jun 2025 13:12:30 +0000 (15:12 +0200)] 
diagnostic: fix for older version of GCC

Having both an enum and a variable with the same name triggers an error with
gcc 5.
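
The commit message does not show the exact construct that gcc 5 rejected,
so the following is only a sketch of the renamed form, under the assumption
that the enum is used with qualified enumerators inside the function.  The
enumerators, the function name color_for_state and the returned strings are
hypothetical.

```cpp
// Hypothetical enum; the enumerators are illustrative only.
enum class dynalloc_state { unknown, allocated, freed };

// The parameter is named dynalloc_st rather than dynalloc_state, so the
// enum name is not hidden by a variable of the same name in this scope.
const char *color_for_state (dynalloc_state dynalloc_st)
{
  switch (dynalloc_st)
    {
    case dynalloc_state::allocated:
      return "green";
    case dynalloc_state::freed:
      return "red";
    default:
      return "gray";
    }
}
```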

gcc/ChangeLog:
* diagnostic-state-to-dot.cc (get_color_for_dynalloc_state):
Rename argument dynalloc_state to dynalloc_st.
(add_title_tr): Rename argument style to styl.
(on_xml_node): Rename local variable dynalloc_state to dynalloc_st.

5 weeks agolibstdc++: Unnecessary type completion in __is_complete_or_unbounded [PR120717]
Patrick Palka [Tue, 24 Jun 2025 13:33:25 +0000 (09:33 -0400)] 
libstdc++: Unnecessary type completion in __is_complete_or_unbounded [PR120717]

When checking __is_complete_or_unbounded on a reference to incomplete
type, we overeagerly try to instantiate/complete the referenced type
which besides being unnecessary may also produce an unexpected
-Wsfinae-incomplete warning (added in r16-1527) if the referenced type
is later defined.

This patch fixes this by effectively restricting the sizeof check to
object (except unknown-bound array) types.  In passing simplify the
implementation by using is_object instead of is_function/reference/void
and introducing a __maybe_complete_object_type helper.
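
The idea can be sketched outside of libstdc++ as follows.  This is not the
library's implementation: the names maybe_complete_object, has_size and
complete_or_unbounded are hypothetical stand-ins, and the sketch only
assumes what the message states, namely that sizeof is applied solely to
object types other than arrays of unknown bound.

```cpp
#include <cstddef>
#include <type_traits>

// Arrays of unknown bound (T[]) are detected by partial specialization.
template<typename T> struct is_unknown_bound_array : std::false_type {};
template<typename T> struct is_unknown_bound_array<T[]> : std::true_type {};

// Hypothetical analogue of __maybe_complete_object_type.
template<typename T>
struct maybe_complete_object
  : std::integral_constant<bool, std::is_object<T>::value
                                 && !is_unknown_bound_array<T>::value> {};

// sizeof(T) is only well-formed for complete types; otherwise the first
// overload drops out of overload resolution and the fallback is chosen.
template<typename T, std::size_t = sizeof (T)>
std::true_type has_size_impl (int);
template<typename T>
std::false_type has_size_impl (...);
template<typename T>
struct has_size : decltype (has_size_impl<T> (0)) {};

// References, functions, void and T[] are accepted without applying sizeof.
template<typename T, bool = maybe_complete_object<T>::value>
struct complete_or_unbounded : std::true_type {};

// Only maybe-complete object types go through the sizeof check.
template<typename T>
struct complete_or_unbounded<T, true> : has_size<T> {};

struct incomplete_tag;      // declared, never defined
struct complete_tag {};
```

With this shape, complete_or_unbounded<incomplete_tag&> is true without
sizeof ever being applied, so the referenced type is not completed.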

PR libstdc++/120717

libstdc++-v3/ChangeLog:

* include/std/type_traits (__maybe_complete_object_type): New
helper trait, factored out from ...
(__is_complete_or_unbounded): ... here.  Only check sizeof on a
__maybe_complete_object_type type.  Fix formatting.
* testsuite/20_util/is_complete_or_unbounded/120717.cc: New test.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Co-authored-by: Jonathan Wakely <jwakely@redhat.com>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
5 weeks agogcc: remove atan from edom_only_function
Yuao Ma [Mon, 23 Jun 2025 16:06:16 +0000 (00:06 +0800)] 
gcc: remove atan from edom_only_function

According to the man page, atan does not produce an error. According to the C23
standard draft (N3088), a range error occurs for atan if a nonzero x is too
close to zero. Neither of them mentions that atan will result in a domain error.
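
A quick way to observe this on a given C library is to clear errno, call
atan, and check whether EDOM was reported.  The helper name
atan_reports_edom is hypothetical, and the result reflects the libm in use
rather than anything GCC guarantees.

```cpp
#include <cerrno>
#include <cmath>

// Returns true if calling atan (x) leaves errno set to EDOM.
bool atan_reports_edom (double x)
{
  errno = 0;
  // volatile keeps the call from being discarded as unused.
  volatile double result = std::atan (x);
  (void) result;
  return errno == EDOM;
}
```

Since atan is defined for every real argument, no input should make this
helper return true on a conforming implementation.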

gcc/ChangeLog:

* tree-call-cdce.cc (edom_only_function): Remove atan.

Signed-off-by: Yuao Ma <c8ef@outlook.com>
5 weeks agos390: Fix float vector extract for pre-z13
Juergen Christ [Wed, 18 Jun 2025 13:16:28 +0000 (15:16 +0200)] 
s390: Fix float vector extract for pre-z13

Also provide the vec_extract patterns for floats on pre-z13 machines
to prevent ICEing in those cases.

gcc/ChangeLog:

* config/s390/vector.md (VF): Don't restrict modes.
(VEC_SET_SINGLEFLOAT): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/s390/vector/vec-extract-1.c: Fix test on arch11.
* gcc.target/s390/vector/vec-set-1.c: Run test on arch11.
* gcc.target/s390/vector/vec-extract-2.c: New test.

Signed-off-by: Juergen Christ <jchrist@linux.ibm.com>