git.ipfire.org Git - thirdparty/gcc.git/log
3 months agolibstdc++: Fix mingw build by using _M_span [PR119970]
Tomasz Kamiński [Mon, 28 Apr 2025 06:53:59 +0000 (08:53 +0200)] 
libstdc++: Fix mingw build by using _M_span [PR119970]

r16-142-g01e5ef3e8b9128 changed the return type of _Str_sink::view()
to basic_string_view<_CharT>. Mutable access is now provided by the _M_span
function, which the mingw code path uses instead.

PR libstdc++/119970

libstdc++-v3/ChangeLog:

* include/std/ostream (vprint_unicode) [_WIN32 && !__CYGWIN__]: Call
_Str_sink::_M_span instead of view.
* include/std/print (vprint_unicode) [_WIN32 && !__CYGWIN__]: Call
_Str_sink::_M_span instead of view.

3 months agolibstdc++: Strip reference and cv-qual in range deduction guides for maps.
Tomasz Kamiński [Thu, 20 Mar 2025 08:02:03 +0000 (09:02 +0100)] 
libstdc++: Strip reference and cv-qual in range deduction guides for maps.

This implements the part of LWG4223 that adjusts the deduction guides for map types
(map, unordered_map, flat_map and their non-unique equivalents) from a "range"
(std::from_range, iterator pair), such that reference and cv-qualification are
stripped from the elements of the pair-like value_type.

In combination with r15-8296-gd50171bc07006d, LWG4223 is now fully implemented.
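
As a rough illustration of why remove_const_t alone is not enough for
pair-like value types whose elements are references, consider the sketch
below. The names strip_cvref_t and ref_pair are hypothetical; strip_cvref_t is
a stand-in for std::remove_cvref_t, not the libstdc++ helper touched by this
patch.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical stand-in for std::remove_cvref_t (C++20), spelled out so the
// sketch also compiles as C++14.
template <typename T>
using strip_cvref_t = std::remove_cv_t<std::remove_reference_t<T>>;

// A pair-like value_type whose elements are references.
using ref_pair = std::pair<const int&, volatile long&>;

// remove_const_t only looks at top-level const, so a reference to const
// passes through unchanged...
static_assert(std::is_same<std::remove_const_t<ref_pair::first_type>,
                           const int&>::value,
              "remove_const_t leaves 'const int&' alone");

// ...whereas stripping the reference and the cv-qualifiers yields the plain
// type one would expect as a deduced key or mapped type.
static_assert(std::is_same<strip_cvref_t<ref_pair::first_type>, int>::value,
              "key type becomes int");
static_assert(std::is_same<strip_cvref_t<ref_pair::second_type>, long>::value,
              "mapped type becomes long");
```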

libstdc++-v3/ChangeLog:

* include/bits/ranges_base.h (__detail::__range_key_type):
Replace remove_const_t with remove_cvref_t.
(__detail::__range_mapped_type): Apply remove_cvref_t.
* include/bits/stl_iterator.h: (__detail::__iter_key_t):
Replace remove_const_t with __remove_cvref_t.
(__detail::__iter_val_t): Apply __remove_cvref_t.
* testsuite/23_containers/flat_map/1.cc: New tests.
* testsuite/23_containers/flat_multimap/1.cc: New tests.
* testsuite/23_containers/map/cons/deduction.cc: New tests.
* testsuite/23_containers/map/cons/from_range.cc: New tests.
* testsuite/23_containers/multimap/cons/deduction.cc: New tests.
* testsuite/23_containers/multimap/cons/from_range.cc: New tests.
* testsuite/23_containers/unordered_map/cons/deduction.cc: New tests.
* testsuite/23_containers/unordered_map/cons/from_range.cc: New tests.
* testsuite/23_containers/unordered_multimap/cons/deduction.cc:
New tests.
* testsuite/23_containers/unordered_multimap/cons/from_range.cc:
New tests.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
3 months agolibstdc++: Implement missing allocator-aware constructors for unordered containers.
Tomasz Kamiński [Tue, 18 Mar 2025 15:10:48 +0000 (16:10 +0100)] 
libstdc++: Implement missing allocator-aware constructors for unordered containers.

This patch implements the remainder of LWG2713 (after r15-8293-g64f5c854597759)
by adding the missing allocator-aware versions of the unordered associative
container constructors that accept a pair of iterators or an initializer_list,
and the corresponding deduction guides.

libstdc++-v3/ChangeLog:

* include/bits/unordered_map.h (unordered_map):
Define constructors accepting:
(_InputIterator, _InputIterator, const allocator_type&),
(initializer_list<value_type>, const allocator_type&),
(unordered_multimap): Likewise.
* include/debug/unordered_map (unordered_map): Likewise.
(unordered_multimap): Likewise.
* include/bits/unordered_set.h (unordered_set):
Define constructors and deduction guide accepting:
(_InputIterator, _InputIterator, const allocator_type&),
(initializer_list<value_type>, const allocator_type&),
(unordered_multiset): Likewise.
* include/debug/unordered_set (unordered_set): Likewise.
(unordered_multiset): Likewise.
* testsuite/23_containers/unordered_map/cons/66055.cc: New tests.
* testsuite/23_containers/unordered_map/cons/deduction.cc: New tests.
* testsuite/23_containers/unordered_multimap/cons/66055.cc: New tests.
* testsuite/23_containers/unordered_multimap/cons/deduction.cc: New
tests.
* testsuite/23_containers/unordered_multiset/cons/66055.cc: New tests.
* testsuite/23_containers/unordered_multiset/cons/deduction.cc: New
tests.
* testsuite/23_containers/unordered_set/cons/66055.cc: New tests.
* testsuite/23_containers/unordered_set/cons/deduction.cc: New tests.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
3 months agotailc: Improve tail recursion handling [PR119493]
Jakub Jelinek [Mon, 28 Apr 2025 07:22:50 +0000 (09:22 +0200)] 
tailc: Improve tail recursion handling [PR119493]

Here is a patch to improve the tail recursion handling also for
non-musttail calls.

2025-04-28  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/119493
* tree-tailcall.cc (find_tail_calls): Handle non-gimple_reg_type
arguments which aren't just passed through for tail recursions
even for non-musttail calls.
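
A minimal sketch of the kind of function this appears to target, assuming
"non-gimple_reg_type arguments" means aggregate arguments that are modified
before the recursive call rather than forwarded unchanged. Counter and
sum_ticks are hypothetical names, not taken from the PR.

```cpp
// Aggregate argument: not a scalar that would live in a single register.
struct Counter { int ticks[16]; };

// Tail-recursive, but the aggregate argument changes on every step, so it is
// not "just passed through" to the recursive call.
constexpr int sum_ticks(Counter c, int n, int acc)
{
  if (n == 0)
    return acc;
  c.ticks[0] += 1;                                // modify the by-value aggregate
  return sum_ticks(c, n - 1, acc + c.ticks[0]);   // call in tail position
}

// 1 + 2 + 3, evaluated at compile time.
static_assert(sum_ticks(Counter{}, 3, 0) == 6, "three steps accumulate 1+2+3");
```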

3 months agoc-family: Improve location for -Wunknown-pragmas in a _Pragma [PR118838]
Lewis Hyatt [Tue, 11 Feb 2025 18:45:41 +0000 (13:45 -0500)] 
c-family: Improve location for -Wunknown-pragmas in a _Pragma [PR118838]

The warning for -Wunknown-pragmas is issued at the location provided by
libcpp to the def_pragma() callback. This location is
cpp_reader::directive_line, which is a location for the start of the line
only; it is also not a valid location in case the unknown pragma was lexed
from a _Pragma string. These factors make it impossible to suppress
-Wunknown-pragmas via _Pragma("GCC diagnostic...") directives on the same
source line, as in the PR and the test case. Address that by issuing the
warning at a better location returned by cpp_get_diagnostic_override_loc().
libcpp already maintains this location to handle _Pragma-related diagnostics
internally; all that was needed was to make a publicly accessible version of it.
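
The situation described above can be sketched as follows. This is not the test
case from the PR: the macro name and the pragma name are hypothetical, chosen
only to show a suppression and an unknown pragma that both come from _Pragma
operators on a single source line.

```cpp
// Hypothetical macro: everything expands on the line where it is used, so the
// suppression and the unknown pragma share one source line.
#define QUIET_UNKNOWN_PRAGMA \
  _Pragma("GCC diagnostic push") \
  _Pragma("GCC diagnostic ignored \"-Wunknown-pragmas\"") \
  _Pragma("hypothetical_vendor_pragma on") \
  _Pragma("GCC diagnostic pop")

QUIET_UNKNOWN_PRAGMA
constexpr int after_pragmas() { return 42; }
```

Whether -Wunknown-pragmas is actually silenced here depends on the compiler
version; the commit message says this was not possible before the change.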

gcc/c-family/ChangeLog:

PR c/118838
* c-lex.cc (cb_def_pragma): Call cpp_get_diagnostic_override_loc()
to get a valid location at which to issue -Wunknown-pragmas, in case
it was triggered from a _Pragma.

libcpp/ChangeLog:

PR c/118838
* errors.cc (cpp_get_diagnostic_override_loc): New function.
* include/cpplib.h (cpp_get_diagnostic_override_loc): Declare.

gcc/testsuite/ChangeLog:

PR c/118838
* c-c++-common/cpp/pragma-diagnostic-loc-2.c: New test.
* g++.dg/gomp/macro-4.C: Adjust expected output.
* gcc.dg/gomp/macro-4.c: Likewise.
* gcc.dg/cpp/Wunknown-pragmas-1.c: Likewise.

3 months agogcc: For Windows x86-32, always attempt to realign stack regardless of SSE
LIU Hao [Sun, 27 Apr 2025 10:18:34 +0000 (18:18 +0800)] 
gcc: For Windows x86-32, always attempt to realign stack regardless of SSE

For Windows x86-32 targets, the Microsoft ABI only guarantees that the stack
is aligned to 4-byte boundaries. GCC knows about the default alignment of the
stack. However, before this commit, it did not realign the stack unless SSE
was also enabled.

When a stricter (larger) alignment is requested, it is always necessary to
realign the stack, as is already done for Solaris.
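
A small sketch of code that requests such a stricter alignment, independent of
SSE. Scratch and local_is_aligned are hypothetical names used only for
illustration.

```cpp
#include <cstdint>

// Hypothetical over-aligned type: 32 bytes is stricter than the 4-byte stack
// alignment that the text above attributes to the 32-bit Microsoft ABI.
struct alignas(32) Scratch { unsigned char bytes[32]; };

// Reports whether a local Scratch landed on a 32-byte boundary. When the
// incoming stack is only 4-byte aligned, this can hold only if the function
// prologue realigns the stack.
bool local_is_aligned()
{
  volatile Scratch s = {};
  return reinterpret_cast<std::uintptr_t>(&s) % alignof(Scratch) == 0;
}
```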

Reference: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111107#c14
Signed-off-by: LIU Hao <lh_mouse@126.com>
Signed-off-by: Jonathan Yong <10walls@gmail.com>
gcc/ChangeLog:

PR target/111107
* config/i386/cygming.h (STACK_REALIGN_DEFAULT): Copy from sol2.h.

3 months agoFix size_t in id-15.c and infoleak-net-ethtool-ioctl.c for llp64
Jonathan Yong [Thu, 24 Apr 2025 07:42:17 +0000 (07:42 +0000)] 
Fix size_t in id-15.c and infoleak-net-ethtool-ioctl.c for llp64

Use __SIZE_TYPE__ for size_t types so that it works for
llp64.
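
For context, a short sketch of the idiom: __SIZE_TYPE__ is a macro predefined
by GCC (and Clang) that names the type underlying size_t, so it stays correct
on LLP64 targets where unsigned long is narrower than a pointer. The alias
name portable_size_t is hypothetical.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical alias; the tests in this commit use __SIZE_TYPE__ directly.
typedef __SIZE_TYPE__ portable_size_t;

// Matches size_t on every target, unlike a hard-coded 'unsigned long'.
static_assert(std::is_same<portable_size_t, std::size_t>::value,
              "__SIZE_TYPE__ is the type behind size_t");
```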

Signed-off-by: Jonathan Yong <10walls@gmail.com>
gcc/testsuite/ChangeLog:

* gcc.dg/graphite/id-15.c: Use __SIZE_TYPE__ instead of
unsigned long.
* gcc.dg/plugin/infoleak-net-ethtool-ioctl.c: ditto.

3 months agoDaily bump.
GCC Administrator [Mon, 28 Apr 2025 00:18:29 +0000 (00:18 +0000)] 
Daily bump.

3 months agoc++/modules: Ensure DECL_FRIEND_CONTEXT is streamed [PR119939]
Nathaniel Shead [Fri, 25 Apr 2025 14:10:34 +0000 (00:10 +1000)] 
c++/modules: Ensure DECL_FRIEND_CONTEXT is streamed [PR119939]

An instantiated friend function relies on DECL_FRIEND_CONTEXT being set
to be able to recover the template arguments of the class that
instantiated it, despite not being a template itself.  This patch
ensures that this data is streamed even when DECL_CLASS_SCOPE_P is not
true.
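
To illustrate what "an instantiated friend function that is not a template
itself" looks like, here is a minimal sketch without modules. Box and peek are
hypothetical names and this is not the test case added by the patch.

```cpp
template <typename T>
struct Box
{
  T value;

  // peek is not a function template, yet a distinct peek is instantiated
  // together with each Box<T>; its meaning depends on the template arguments
  // of the enclosing class.
  friend constexpr T peek(const Box& b) { return b.value; }
};

// Found by argument-dependent lookup through Box<int>.
static_assert(peek(Box<int>{42}) == 42, "friend instantiated for Box<int>");
```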

PR c++/119939

gcc/cp/ChangeLog:

* module.cc (trees_out::lang_decl_vals): Also stream
lang->u.fn.context when DECL_UNIQUE_FRIEND_P.
(trees_in::lang_decl_vals): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/modules/concept-11_a.H: New test.
* g++.dg/modules/concept-11_b.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
3 months agossa-fre-4.c: Enable for all targets and adjust scan match
H.J. Lu [Sun, 10 Nov 2024 09:55:20 +0000 (17:55 +0800)] 
ssa-fre-4.c: Enable for all targets and adjust scan match

Since the C frontend no longer promotes char arguments, enable ssa-fre-4.c
for all targets and adjust scan match.

PR middle-end/112877
* gcc.dg/tree-ssa/ssa-fre-4.c: Enable for all targets and adjust
scan match.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
3 months agoscev-cast.c: Enable for all targets and adjust scan matches
H.J. Lu [Sun, 10 Nov 2024 08:50:46 +0000 (16:50 +0800)] 
scev-cast.c: Enable for all targets and adjust scan matches

Since the C frontend no longer promotes char arguments, enable scev-cast.c
for all targets and adjust scan matches.

PR middle-end/112877
* gcc.dg/tree-ssa/scev-cast.c: Enable for all targets and adjust
scan match.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
3 months agovect-simd-clone-1[6-8][cd].c: Expect in-branch clones for x86
H.J. Lu [Sun, 10 Nov 2024 08:41:10 +0000 (16:41 +0800)] 
vect-simd-clone-1[6-8][cd].c: Expect in-branch clones for x86

Since the C frontend no longer promotes char and short arguments, expect
in-branch clones for x86.

PR middle-end/112877
* gcc.dg/vect/vect-simd-clone-16c.c: Expect in-branch clones for
x86.
* gcc.dg/vect/vect-simd-clone-16d.c: Likewise.
* gcc.dg/vect/vect-simd-clone-17c.c: Likewise.
* gcc.dg/vect/vect-simd-clone-17d.c: Likewise.
* gcc.dg/vect/vect-simd-clone-18c.c: Likewise.
* gcc.dg/vect/vect-simd-clone-18d.c: Likewise.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
3 months agoi386: Adjust apx-ndd.c for frontend promotion removal
H.J. Lu [Sun, 10 Nov 2024 03:27:14 +0000 (11:27 +0800)] 
i386: Adjust apx-ndd.c for frontend promotion removal

Since the C frontend no longer promotes integer arguments smaller than int,
the apx-ndd.c codegen is slightly different:

apx-ndd.s (original) 2024-11-10 06:07:09.894876973 +0800
apx-ndd.s (updated)  2024-11-10 06:06:59.371860565 +0800
@@ -17,7 +17,7 @@ foo_add_char:
 foo1_add_char:
 .LFB1:
  .cfi_startproc
- leal (%rsi,%rdi), %eax
+ leal (%rdi,%rsi), %eax
  ret
  .cfi_endproc
 .LFE1:
@@ -50,7 +50,7 @@ foo_add_short:
 foo1_add_short:
 .LFB4:
  .cfi_startproc
- leal (%rsi,%rdi), %eax
+ leal (%rdi,%rsi), %eax
  ret
  .cfi_endproc
 .LFE4:
@@ -413,7 +413,7 @@ foo_and_char:
 foo1_and_char:
 .LFB37:
  .cfi_startproc
- andl %edi, %esi, %eax
+ andl %esi, %edi, %eax
  ret
  .cfi_endproc
 .LFE37:
@@ -435,7 +435,7 @@ foo_and_short:
 foo1_and_short:
 .LFB39:
  .cfi_startproc
- andl %edi, %esi, %eax
+ andl %esi, %edi, %eax
  ret
  .cfi_endproc
 .LFE39:
@@ -501,7 +501,7 @@ foo_or_char:
 foo1_or_char:
 .LFB45:
  .cfi_startproc
- orl %edi, %esi, %eax
+ orl %esi, %edi, %eax
  ret
  .cfi_endproc
 .LFE45:
@@ -523,7 +523,7 @@ foo_or_short:
 foo1_or_short:
 .LFB47:
  .cfi_startproc
- orl %edi, %esi, %eax
+ orl %esi, %edi, %eax
  ret
  .cfi_endproc
 .LFE47:
@@ -589,7 +589,7 @@ foo_xor_char:
 foo1_xor_char:
 .LFB53:
  .cfi_startproc
- xorl %edi, %esi, %eax
+ xorl %esi, %edi, %eax
  ret
  .cfi_endproc
 .LFE53:
@@ -611,7 +611,7 @@ foo_xor_short:
 foo1_xor_short:
 .LFB55:
  .cfi_startproc
- xorl %edi, %esi, %eax
+ xorl %esi, %edi, %eax
  ret
  .cfi_endproc
 .LFE55:
@@ -1018,7 +1018,7 @@ foo4_rol_uint64_t:
 foo1_imul_short:
 .LFB92:
  .cfi_startproc
- imull %edi, %esi, %eax
+ imull %esi, %edi, %eax
  ret
  .cfi_endproc
 .LFE92:

Adjust the assembler scans.

PR middle-end/112877
* gcc.target/i386/apx-ndd.c: Adjusted.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
3 months agoDrop targetm.promote_prototypes from C, C++ and Ada frontends
H.J. Lu [Wed, 20 Nov 2024 23:54:35 +0000 (07:54 +0800)] 
Drop targetm.promote_prototypes from C, C++ and Ada frontends

Remove the targetm.calls.promote_prototypes call from C, C++ and Ada
frontends.

gcc/

PR c/48274
PR middle-end/112877
PR middle-end/118288
* gimple.cc (gimple_builtin_call_types_compatible_p): Remove the
targetm.calls.promote_prototypes call.
* tree.cc (tree_builtin_call_types_compatible_p): Likewise.

gcc/ada/

PR middle-end/112877
* gcc-interface/utils.cc (create_param_decl): Remove the
targetm.calls.promote_prototypes call.

gcc/c/

PR c/48274
PR middle-end/112877
PR middle-end/118288
* c-decl.cc (start_decl): Remove the
targetm.calls.promote_prototypes call.
(store_parm_decls_oldstyle): Likewise.
(finish_function): Likewise.
* c-typeck.cc (convert_argument): Likewise.
(c_safe_arg_type_equiv_p): Likewise.

gcc/cp/

PR middle-end/112877
* call.cc (type_passed_as): Remove the
targetm.calls.promote_prototypes call.
(convert_for_arg_passing): Likewise.
* typeck.cc (cxx_safe_arg_type_equiv_p): Likewise.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
3 months agoHonor TARGET_PROMOTE_PROTOTYPES during RTL expand
H.J. Lu [Thu, 21 Nov 2024 00:11:06 +0000 (08:11 +0800)] 
Honor TARGET_PROMOTE_PROTOTYPES during RTL expand

Promote integer arguments smaller than int if TARGET_PROMOTE_PROTOTYPES
returns true.

gcc/

PR middle-end/112877
* calls.cc (initialize_argument_information): Promote small integer
arguments if TARGET_PROMOTE_PROTOTYPES returns true.

gcc/testsuite/

PR middle-end/112877
* gfortran.dg/pr112877-1.f90: New test.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
3 months agoRISC-V: Extract vector stepped for expand_const_vector [NFC]
Pan Li [Thu, 17 Apr 2025 02:27:17 +0000 (10:27 +0800)] 
RISC-V: Extract vector stepped for expand_const_vector [NFC]

Since expand_const_vector is quite long (about 500 lines)
and complicated, we would like to extract the different cases
into separate functions.  For example, the const vector stepped
case will be extracted into expand_const_vector_stepped.

The following test suite passed for this patch.
* The rv64gcv full regression test.

gcc/ChangeLog:

* config/riscv/riscv-v.cc (expand_const_vector): Extract
const vector stepped into separated func.
(expand_const_vector_single_step_npatterns): Add new func
to take care of single step.
(expand_const_vector_interleaved_stepped_npatterns): Add new
func to take care of interleaved step.
(expand_const_vector_stepped): Add new func to take care of
const vector stepped.

Signed-off-by: Pan Li <pan2.li@intel.com>
3 months agoRISC-V: Extract vector duplicate for expand_const_vector [NFC]
Pan Li [Wed, 16 Apr 2025 07:47:21 +0000 (15:47 +0800)] 
RISC-V: Extract vector duplicate for expand_const_vector [NFC]

Since expand_const_vector is quite long (about 500 lines)
and complicated, we would like to extract the different cases
into separate functions.  For example, the const vector duplicate
case will be extracted into expand_const_vector_duplicate, with
expand_const_vector_duplicate_repeating and
expand_const_vector_duplicate_default as the underlying functions.

The following test suite passed for this patch.
* The rv64gcv full regression test.

gcc/ChangeLog:

* config/riscv/riscv-v.cc (expand_const_vector_duplicate_repeating):
Add new func to take care of vector duplicate with repeating.
(expand_const_vector_duplicate_default): Add new func to take
care of default const vector duplicate.
(expand_const_vector_duplicate): Add new func to take care
of all const vector duplicate.
(expand_const_vector): Extract const vector duplicate into
separated function.

Signed-off-by: Pan Li <pan2.li@intel.com>
3 months agoRISC-V: Extract vec_series for expand_const_vector [NFC]
Pan Li [Wed, 16 Apr 2025 06:43:23 +0000 (14:43 +0800)] 
RISC-V: Extract vec_series for expand_const_vector [NFC]

Since expand_const_vector is quite long (about 500 lines)
and complicated, we would like to extract the different cases
into separate functions.  For example, the const vec_series
case will be extracted into expand_const_vec_series.

The following test suite passed for this patch.
* The rv64gcv full regression test.

gcc/ChangeLog:

* config/riscv/riscv-v.cc (expand_const_vec_series): Add new
func to take care of the const vec_series.
(expand_const_vector): Extract const vec_series into separated
function.

Signed-off-by: Pan Li <pan2.li@intel.com>
3 months agoRISC-V: Extract vec_duplicate for expand_const_vector [NFC]
Pan Li [Wed, 16 Apr 2025 03:16:21 +0000 (11:16 +0800)] 
RISC-V: Extract vec_duplicate for expand_const_vector [NFC]

Since expand_const_vector is quite long (about 500 lines)
and complicated, we would like to extract the different cases
into separate functions.  For example, the const vec_duplicate
case will be extracted into expand_const_vec_duplicate.

The following test suite passed for this patch.
* The rv64gcv full regression test.

gcc/ChangeLog:

* config/riscv/riscv-v.cc (expand_const_vector): Extract
const vec_duplicate into separated function.
(expand_const_vec_duplicate): Add new func to take care
of the const vec_duplicate.

Signed-off-by: Pan Li <pan2.li@intel.com>
3 months agoRefactor msse4 and mno-sse4.
liuhongt [Tue, 1 Apr 2025 07:30:07 +0000 (00:30 -0700)] 
Refactor msse4 and mno-sse4.

gcc/ChangeLog:

PR target/119549
* common/config/i386/i386-common.cc (ix86_handle_option):
Refactor msse4 and mno-sse4.
* config/i386/i386.opt (msse4): Remove RejectNegative.
(mno-sse4): Remove the entry.
* config/i386/i386-options.cc
(ix86_valid_target_attribute_inner_p): Remove special code
which handles mno-sse4.

3 months agoDaily bump.
GCC Administrator [Sun, 27 Apr 2025 00:16:47 +0000 (00:16 +0000)] 
Daily bump.

3 months agoFix i386 vectorizer cost of FP scalar MAX_EXPR and MIN_EXPR
Jan Hubicka [Sat, 26 Apr 2025 20:10:19 +0000 (22:10 +0200)] 
Fix i386 vectorizer cost of FP scalar MAX_EXPR and MIN_EXPR

I introduced a bug with last-minute cleanups unifying the scalar and vector SSE conditional.
This patch fixes it and restores the cost of 1 for SSE scalar MIN/MAX.

Bootstrapped/regtested x86_64-linux, committed.

gcc/ChangeLog:

PR target/105275
* config/i386/i386.cc (ix86_vector_costs::add_stmt_cost): Fix cost of FP scalar
MAX_EXPR and MIN_EXPR

3 months agoAdd m32c*-*-* to the list of obsolete targets
Iain Buclaw [Fri, 25 Apr 2025 17:45:07 +0000 (19:45 +0200)] 
Add m32c*-*-* to the list of obsolete targets

This patch marks m32c*-*-* targets obsolete in GCC 16.  The target has
not had a maintainer since GCC 9, and fails to compile even the
simplest of functions since GCC 8 (reported in PR83670).

contrib/ChangeLog:

* config-list.mk: Add m32c*-*-* to the list of obsoleted targets.

gcc/ChangeLog:

* config.gcc (LIST): --enable-obsolete for m32c-elf.

3 months agosimplify-rtx: Simplify `(zero_extend (and x CST))` -> (and (subreg x) CST)
Andrew Pinski [Wed, 5 Feb 2025 22:44:25 +0000 (14:44 -0800)] 
simplify-rtx: Simplify `(zero_extend (and x CST))` -> (and (subreg x) CST)

This adds the simplification of a ZERO_EXTEND of an AND. This optimization
was already handled in combine via combine_simplify_rtx and the handling
there of compound_operations (ZERO_EXTRACT).
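
The identity behind this simplification can be checked with plain integers.
The sketch below models a 32-bit value zero-extended to 64 bits; the names are
hypothetical, and it assumes a constant whose top bit is clear in the narrow
mode. The log does not state the exact conditions the patch checks.

```cpp
#include <cstdint>

// Hypothetical CST: fits in the narrow (32-bit) mode with its top bit clear.
constexpr std::uint32_t kMask = 0x00ffu;

// (zero_extend (and x CST)): mask in the narrow mode, then widen.
constexpr std::uint64_t extend_of_and(std::uint64_t wide)
{
  return static_cast<std::uint64_t>(static_cast<std::uint32_t>(wide) & kMask);
}

// (and (subreg x) CST): view x in the wide mode, where the upper half may hold
// arbitrary bits, and let the mask clear them.
constexpr std::uint64_t and_of_wide(std::uint64_t wide)
{
  return wide & static_cast<std::uint64_t>(kMask);
}

// The upper 32 bits of the input stand in for the undefined upper bits.
static_assert(extend_of_and(0xdeadbeefcafef00dULL)
                == and_of_wide(0xdeadbeefcafef00dULL),
              "both forms agree");
```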

Build and tested for aarch64-linux-gnu.
Bootstrapped and tested on x86_64-linux-gnu.

gcc/ChangeLog:

* simplify-rtx.cc (simplify_context::simplify_unary_operation_1) <case ZERO_EXTEND>:
Add simplification for AND with a constant.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
3 months agoDaily bump.
GCC Administrator [Sat, 26 Apr 2025 00:19:18 +0000 (00:19 +0000)] 
Daily bump.

3 months agotestsuite: Skip tests incompatible with generic thunk support
Dimitar Dimitrov [Sat, 11 Jan 2025 16:03:15 +0000 (18:03 +0200)] 
testsuite: Skip tests incompatible with generic thunk support

Some backends do not define TARGET_ASM_OUTPUT_MI_THUNK.  But the generic
thunk support cannot emit code for calling variadic methods of
multiple-inheritance classes.  Example error for pru-unknown-elf:

 .../gcc/gcc/testsuite/g++.dg/ipa/pr83549.C:7:24: error: generic thunk code fails for method 'virtual void C::_ZThn4_N1C3fooEz(...)' which uses '...'

Disable the affected tests for all targets which do not define
TARGET_ASM_OUTPUT_MI_THUNK.
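
For readers unfamiliar with the failure mode, a minimal sketch of the shape of
code involved, assuming the usual ABI where the second base sits at a non-zero
offset. The names are hypothetical and this is not one of the affected tests.

```cpp
struct A { virtual ~A() = default; virtual int foo(...) { return 1; } int a = 0; };
struct B { virtual ~B() = default; virtual int foo(...) { return 2; } int b = 0; };

// C overrides a variadic virtual method inherited from two bases. A call
// through a B* has to adjust 'this' from the B subobject to the full C
// object, which is done by a thunk that must forward the '...' arguments.
struct C : A, B { int foo(...) override { return 3; } };

int call_through_second_base()
{
  C c;
  B* b = &c;            // points at the B subobject inside c
  return b->foo(1, 2);  // dispatches to C::foo
}
```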

Ensured that test results with and without this patch for
x86_64-pc-linux-gnu are the same.

gcc/ChangeLog:

* doc/sourcebuild.texi: Document variadic_mi_thunk effective
target check.

gcc/testsuite/ChangeLog:

* g++.dg/ipa/pr83549.C: Require effective target
variadic_mi_thunk.
* g++.dg/ipa/pr83667.C: Ditto.
* g++.dg/torture/pr81812.C: Ditto.
* g++.old-deja/g++.jason/thunk3.C: Ditto.
* lib/target-supports.exp
(check_effective_target_variadic_mi_thunk): New function.

Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
3 months agotestsuite: Add require target for SJLJ exception implementation
Dimitar Dimitrov [Sat, 18 Jan 2025 15:10:43 +0000 (17:10 +0200)] 
testsuite: Add require target for SJLJ exception implementation

Testcases for musttail call optimization fail on pru-unknown-elf:
  FAIL: c-c++-common/musttail14.c  -std=gnu++17 (test for excess errors)
  Excess errors:
  .../gcc/gcc/testsuite/c-c++-common/musttail14.c:37:14: error: cannot tail-call: caller uses sjlj exceptions

Silence these errors by disabling the tests if target uses SJLJ for
implementing exceptions.  Use a new effective target check for this.

Ensured that test results with and without this patch for
x86_64-pc-linux-gnu are the same.

gcc/ChangeLog:

* doc/sourcebuild.texi: Document effective target
using_sjlj_exceptions.

gcc/testsuite/ChangeLog:

* c-c++-common/musttail14.c: Disable test if effective target
using_sjlj_exceptions.
* c-c++-common/musttail22.c: Ditto.
* g++.dg/musttail8.C: Ditto.
* g++.dg/musttail9.C: Ditto.
* g++.dg/opt/musttail3.C: Ditto.
* g++.dg/opt/musttail4.C: Ditto.
* g++.dg/opt/musttail5.C: Ditto.
* g++.dg/opt/pr119613.C: Ditto.
* lib/target-supports.exp
(check_effective_target_using_sjlj_exceptions): New check.

Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
3 months agolibstdc++: Use markdown in some Doxygen comments
Jonathan Wakely [Fri, 25 Apr 2025 15:14:19 +0000 (16:14 +0100)] 
libstdc++: Use markdown in some Doxygen comments

libstdc++-v3/ChangeLog:

* include/bits/ptr_traits.h (to_address): Use markdown for
formatting in Doxygen comments.

3 months agolibstdc++: Add some makefile dependencies
Jonathan Wakely [Thu, 10 Apr 2025 11:56:43 +0000 (12:56 +0100)] 
libstdc++: Add some makefile dependencies

This ensures that wstring-inst.o and similar files will be rebuilt when
string-inst.cc changes.

libstdc++-v3/ChangeLog:

* src/c++11/Makefile.am: Add prerequisites for targets that
depend on string-inst.cc.
* src/c++11/Makefile.in: Regenerate.

3 months agolibstdc++: Micro-optimization for std::addressof
Jonathan Wakely [Fri, 25 Apr 2025 14:49:22 +0000 (15:49 +0100)] 
libstdc++: Micro-optimization for std::addressof

Currently std::addressof calls std::__addressof which uses
__builtin_addressof. This leads to me preferring std::__addressof in some
code, to avoid the extra hop. But it's not as though the implementation
of std::__addressof is complicated and reusing it avoids any code
duplication.

So let's just make std::addressof use the built-in directly, and then we
only need to use std::__addressof in C++98 code. (Transitioning existing
uses of std::__addressof to std::addressof isn't included in this
change.)

The front end does fold std::addressof with -ffold-simple-inlines but
this change still seems worthwhile.
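
A minimal sketch of the idea, not the libstdc++ implementation: a function
that forwards straight to __builtin_addressof (a GCC and Clang built-in), shown
against a type that overloads unary &. Tricky and address_of are hypothetical
names.

```cpp
// A type that hijacks unary & (for illustration only).
struct Tricky
{
  int value;
  constexpr Tricky* operator&() const { return nullptr; }
};

// Goes directly to the compiler built-in, with no intermediate helper call.
template <typename T>
constexpr T* address_of(T& r) noexcept
{
  return __builtin_addressof(r);
}

constexpr Tricky tricky{7};

static_assert(&tricky == nullptr, "the overloaded operator& hides the address");
static_assert(address_of(tricky)->value == 7, "the built-in finds the object");
```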

libstdc++-v3/ChangeLog:

* include/bits/move.h (addressof): Use __builtin_addressof
directly.

3 months agolibstdc++: Remove c++26 dg-error lines for -Wdelete-incomplete errors
Jonathan Wakely [Fri, 25 Apr 2025 14:57:56 +0000 (15:57 +0100)] 
libstdc++: Remove c++26 dg-error lines for -Wdelete-incomplete errors

This fixes:
FAIL: tr1/2_general_utilities/shared_ptr/cons/43820_neg.cc  -std=gnu++26  (test for errors, line 283)
FAIL: tr1/2_general_utilities/shared_ptr/cons/43820_neg.cc  -std=gnu++26  (test for errors, line 305)

This is another consequence of r16-133-g8acea9ffa82ed8 which prevents
the -Wdelete-incomplete errors that happen after the first error.

libstdc++-v3/ChangeLog:

* testsuite/tr1/2_general_utilities/shared_ptr/cons/43820_neg.cc:
Remove dg-error directives for additional c++26 errors.

3 months agomatch: Move `(cmp (cond @0 @1 @2) @3)` simplification after the bool compare simplifc...
Andrew Pinski [Tue, 22 Apr 2025 22:13:39 +0000 (15:13 -0700)] 
match: Move `(cmp (cond @0 @1 @2) @3)` simplification after the bool compare simplifcation

This moves the `(cmp (cond @0 @1 @2) @3)` simplification to be after the boolean comparison
simplifications so that we don't end up simplifying into the same thing for a GIMPLE_COND.

gcc/ChangeLog:

* match.pd: Move `(cmp (cond @0 @1 @2) @3)` simplification after
the bool comparison simplifications.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
3 months agogimple: Fix comment before gimple_cond_make_false/gimple_cond_make_true
Andrew Pinski [Wed, 23 Apr 2025 20:48:16 +0000 (13:48 -0700)] 
gimple: Fix comment before gimple_cond_make_false/gimple_cond_make_true

I noticed the comments and the code don't match.
The correct form is:
'if (0 != 0)': false
and
'if (1 != 0)': true

That is always NE and always 0 as the second operand.
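
The two canonical forms can be modelled with a toy structure. This is only a
sketch: CmpCode, CondModel and the helper functions are hypothetical and do
not mirror GCC's gimple data structures.

```cpp
enum class CmpCode { NE, EQ };   // hypothetical, not GCC's tree_code

struct CondModel { CmpCode code; int lhs; int rhs; };

// Always NE, always 0 as the second operand.
constexpr CondModel make_false() { return {CmpCode::NE, 0, 0}; }  // if (0 != 0)
constexpr CondModel make_true()  { return {CmpCode::NE, 1, 0}; }  // if (1 != 0)

constexpr bool evaluate(CondModel c)
{
  return c.code == CmpCode::NE ? c.lhs != c.rhs : c.lhs == c.rhs;
}

static_assert(!evaluate(make_false()), "0 != 0 is false");
static_assert(evaluate(make_true()), "1 != 0 is true");
```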

Also there is a spello for statement in the comment in
front of gimple_cond_true_p.

Pushed as obvious.

gcc/ChangeLog:

* gimple.h (gimple_cond_make_false): Fix comment.
(gimple_cond_make_true): Likewise.
(gimple_cond_true_p): Fix spello for statement in comment.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
3 months agoFortran: fix procedure pointer handling with -fcheck=pointer [PR102900]
Harald Anlauf [Thu, 24 Apr 2025 19:28:35 +0000 (21:28 +0200)] 
Fortran: fix procedure pointer handling with -fcheck=pointer [PR102900]

PR fortran/102900

gcc/fortran/ChangeLog:

* trans-decl.cc (gfc_generate_function_code): Use sym->result
when generating fake result decl for functions returning
allocatable or pointer results.
* trans-expr.cc (gfc_conv_procedure_call): When checking the
pointer status of an actual argument passed to a non-allocatable,
non-pointer dummy which is of type CLASS, do not check the
class container of the actual if it is just a procedure pointer.
(gfc_trans_pointer_assignment): Fix treatment of assignment to
NULL of a procedure pointer.

gcc/testsuite/ChangeLog:

* gfortran.dg/proc_ptr_52.f90: Add -fcheck=pointer to options.
* gfortran.dg/proc_ptr_57.f90: New test.

3 months agoc++: pruning non-captures in noexcept lambda [PR119764]
Jason Merrill [Mon, 14 Apr 2025 16:18:06 +0000 (12:18 -0400)] 
c++: pruning non-captures in noexcept lambda [PR119764]

The patch for PR87185 fixed the ICE without fixing the underlying problem,
that we were failing to find the declaration of the capture proxy that we
are trying to decide whether to prune.  Fixed by looking at the right index
in stmt_list_stack.

Since this changes captures, it changes the ABI of noexcept lambdas; we
haven't worked hard to maintain lambda capture ABI, but it's easy enough to
control here.

PR c++/119764
PR c++/87185

gcc/cp/ChangeLog:

* lambda.cc (insert_capture_proxy): Handle noexcept lambda.
(prune_lambda_captures): Likewise, in ABI v21.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/lambda/lambda-noexcept1.C: New test.

3 months agoc++: add -fabi-version=21
Jason Merrill [Tue, 22 Apr 2025 20:37:30 +0000 (16:37 -0400)] 
c++: add -fabi-version=21

I'm about to add a bugfix that changes the ABI of noexcept lambdas, so first
let's add the new ABI version.  And I think it's time to update the
compatibility version; let's bump to GCC 13, before the addition of concepts
mangling.

gcc/ChangeLog:

* common.opt: Add ABI v21.

gcc/c-family/ChangeLog:

* c-opts.cc (c_common_post_options): Bump default ABI to 21
and compat ABI to 18.

gcc/testsuite/ChangeLog:

* g++.dg/abi/macro0.C: Update for -fabi-version=21.

3 months agolibstdc++: Rename std::latch data member
Jonathan Wakely [Thu, 30 Jan 2025 12:07:48 +0000 (12:07 +0000)] 
libstdc++: Rename std::latch data member

Rename _M_a to match the name of the exposition-only data member shown
in the standard, i.e. 'counter'.

libstdc++-v3/ChangeLog:

* include/std/latch (latch::_M_a): Rename to _M_counter.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
3 months agoicf: Remove unused constructors of sem_function and sem_variable
Andrew Pinski [Tue, 25 Mar 2025 05:32:54 +0000 (22:32 -0700)] 
icf: Remove unused constructors of sem_function and sem_variable

The constructors for sem_function and sem_variable that take just
the bitmap obstack and NOT the cgraph node were unused,
so let's remove them.

gcc/ChangeLog:

* ipa-icf.cc (sem_function::sem_function): Remove
the obstack argument version one.
(sem_variable::sem_variable): Likewise.
* ipa-icf.h (sem_function): Remove ctor for
obstack argument only one.
(sem_variable): Likewise.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
3 months agoicf: Remove nop code from sem_function::init.
Andrew Pinski [Tue, 25 Mar 2025 05:27:30 +0000 (22:27 -0700)] 
icf: Remove nop code from sem_function::init.

Here we had:
  node = node;
Which does nothing so let's remove it.

gcc/ChangeLog:

* ipa-icf.cc (sem_function::init): Remove assignment of node from itself.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
3 months agophiopt: Remove calls.h include [PR119811]
Andrew Pinski [Sat, 19 Apr 2025 00:10:12 +0000 (17:10 -0700)] 
phiopt: Remove calls.h include [PR119811]

When the patch https://gcc.gnu.org/pipermail/gcc-patches/2024-August/660807.html was reworked into r15-3047-g404d947d8ddd3c,
the include of calls.h was kept even though it was no longer needed.

Pushed as obvious.

PR tree-optimization/119811
gcc/ChangeLog:

* tree-ssa-phiopt.cc: Remove calls.h include.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
3 months agocobol: New testcases.
Robert Dubner [Fri, 25 Apr 2025 14:19:35 +0000 (10:19 -0400)] 
cobol: New testcases.

These testcases are derived from the cobolworx run_fundamental.at file.

gcc/testsuite

* cobol.dg/group2/88_level_with_FALSE_IS_clause.cob: New testcase.
* cobol.dg/group2/88_level_with_FILLER.cob: Likewise.
* cobol.dg/group2/88_level_with_THRU.cob: Likewise.
* cobol.dg/group2/ADD_CORRESPONDING.cob: Likewise.
* cobol.dg/group2/ADD_SUBTRACT_CORR_mixed_fix___float.cob: Likewise.
* cobol.dg/group2/ALPHABETIC-LOWER_test.cob: Likewise.
* cobol.dg/group2/ALPHABETIC_test.cob: Likewise.
* cobol.dg/group2/ALPHABETIC-UPPER_test.cob: Likewise.
* cobol.dg/group2/BLANK_WHEN_ZERO.cob: Likewise.
* cobol.dg/group2/Check_for_equality_of_COMP-1___COMP-2.cob: Likewise.
* cobol.dg/group2/Compare_COMP-2_with_floating-point_literal.cob: Likewise.
* cobol.dg/group2/Contained_program_visibility__3_.cob: Likewise.
* cobol.dg/group2/Contained_program_visibility__4_.cob: Likewise.
* cobol.dg/group2/Context_sensitive_words__1_.cob: Likewise.
* cobol.dg/group2/Context_sensitive_words__2_.cob: Likewise.
* cobol.dg/group2/Context_sensitive_words__3_.cob: Likewise.
* cobol.dg/group2/Context_sensitive_words__4_.cob: Likewise.
* cobol.dg/group2/Context_sensitive_words__5_.cob: Likewise.
* cobol.dg/group2/Context_sensitive_words__6_.cob: Likewise.
* cobol.dg/group2/Context_sensitive_words__7_.cob: Likewise.
* cobol.dg/group2/Context_sensitive_words__8_.cob: Likewise.
* cobol.dg/group2/debugging_lines__not_active_.cob: Likewise.
* cobol.dg/group2/debugging_lines__WITH_DEBUGGING_MODE_.cob: Likewise.
* cobol.dg/group2/DEBUG_Line.cob: Likewise.
* cobol.dg/group2/DISPLAY_and_assignment_NumericDisplay.cob: Likewise.
* cobol.dg/group2/DISPLAY_data_items_with_MOVE_statement.cob: Likewise.
* cobol.dg/group2/DISPLAY_data_items_with_VALUE_clause.cob: Likewise.
* cobol.dg/group2/DISPLAY_literals__DECIMAL-POINT_is_COMMA.cob: Likewise.
* cobol.dg/group2/GLOBAL_at_lower_level.cob: Likewise.
* cobol.dg/group2/GLOBAL_at_same_level.cob: Likewise.
* cobol.dg/group2/GLOBAL_FD__1_.cob: Likewise.
* cobol.dg/group2/GLOBAL_FD__2_.cob: Likewise.
* cobol.dg/group2/GLOBAL_FD__3_.cob: Likewise.
* cobol.dg/group2/GLOBAL_FD__4_.cob: Likewise.
* cobol.dg/group2/Hexadecimal_literal.cob: Likewise.
* cobol.dg/group2/integer_arithmetic_on_floating-point_var.cob: Likewise.
* cobol.dg/group2/MULTIPLY_BY_literal_in_INITIAL_program.cob: Likewise.
* cobol.dg/group2/Named_conditionals_-_fixed__float__and_alphabetic.cob: Likewise.
* cobol.dg/group2/Numeric_operations__1_.cob: Likewise.
* cobol.dg/group2/Numeric_operations__2_.cob: Likewise.
* cobol.dg/group2/Numeric_operations__3_.cob: Likewise.
* cobol.dg/group2/Numeric_operations__4_.cob: Likewise.
* cobol.dg/group2/Numeric_operations__5_.cob: Likewise.
* cobol.dg/group2/Numeric_operations__7_.cob: Likewise.
* cobol.dg/group2/Numeric_operations__8_.cob: Likewise.
* cobol.dg/group2/ROUNDED_AWAY-FROM-ZERO.cob: Likewise.
* cobol.dg/group2/ROUNDED_NEAREST-AWAY-FROM-ZERO.cob: Likewise.
* cobol.dg/group2/ROUNDED_NEAREST-EVEN.cob: Likewise.
* cobol.dg/group2/ROUNDED_NEAREST-TOWARD-ZERO.cob: Likewise.
* cobol.dg/group2/ROUNDED_TOWARD-GREATER.cob: Likewise.
* cobol.dg/group2/ROUNDED_TOWARD-LESSER.cob: Likewise.
* cobol.dg/group2/ROUNDED_TRUNCATION.cob: Likewise.
* cobol.dg/group2/ROUNDING_omnibus_Floating-Point_from_COMPUTE.cob: Likewise.
* cobol.dg/group2/ROUNDING_omnibus_NumericDisplay_from_COMPUTE.cob: Likewise.
* cobol.dg/group2/Separate_sign_positions__1_.cob: Likewise.
* cobol.dg/group2/Separate_sign_positions__2_.cob: Likewise.
* cobol.dg/group2/Simple_p-scaling.cob: Likewise.
* cobol.dg/group2/Simple_TYPEDEF.cob: Likewise.
* cobol.dg/group2/ADD_SUBTRACT_CORR_mixed_fix___float.out: New known-good result.
* cobol.dg/group2/BLANK_WHEN_ZERO.out: Likewise.
* cobol.dg/group2/Contained_program_visibility__4_.out: Likewise.
* cobol.dg/group2/Context_sensitive_words__1_.out: Likewise.
* cobol.dg/group2/Context_sensitive_words__2_.out: Likewise.
* cobol.dg/group2/Context_sensitive_words__3_.out: Likewise.
* cobol.dg/group2/Context_sensitive_words__4_.out: Likewise.
* cobol.dg/group2/Context_sensitive_words__5_.out: Likewise.
* cobol.dg/group2/Context_sensitive_words__6_.out: Likewise.
* cobol.dg/group2/Context_sensitive_words__7_.out: Likewise.
* cobol.dg/group2/Context_sensitive_words__8_.out: Likewise.
* cobol.dg/group2/debugging_lines__not_active_.out: Likewise.
* cobol.dg/group2/debugging_lines__WITH_DEBUGGING_MODE_.out: Likewise.
* cobol.dg/group2/DEBUG_Line.out: Likewise.
* cobol.dg/group2/DISPLAY_and_assignment_NumericDisplay.out: Likewise.
* cobol.dg/group2/DISPLAY_data_items_with_MOVE_statement.out: Likewise.
* cobol.dg/group2/DISPLAY_data_items_with_VALUE_clause.out: Likewise.
* cobol.dg/group2/DISPLAY_literals__DECIMAL-POINT_is_COMMA.out: Likewise.
* cobol.dg/group2/GLOBAL_at_lower_level.out: Likewise.
* cobol.dg/group2/GLOBAL_at_same_level.out: Likewise.
* cobol.dg/group2/Hexadecimal_literal.out: Likewise.
* cobol.dg/group2/Named_conditionals_-_fixed__float__and_alphabetic.out: Likewise.
* cobol.dg/group2/ROUNDED_AWAY-FROM-ZERO.out: Likewise.
* cobol.dg/group2/ROUNDED_NEAREST-AWAY-FROM-ZERO.out: Likewise.
* cobol.dg/group2/ROUNDED_NEAREST-EVEN.out: Likewise.
* cobol.dg/group2/ROUNDED_NEAREST-TOWARD-ZERO.out: Likewise.
* cobol.dg/group2/ROUNDED_TOWARD-GREATER.out: Likewise.
* cobol.dg/group2/ROUNDED_TOWARD-LESSER.out: Likewise.
* cobol.dg/group2/ROUNDED_TRUNCATION.out: Likewise.
* cobol.dg/group2/ROUNDING_omnibus_Floating-Point_from_COMPUTE.out: Likewise.
* cobol.dg/group2/ROUNDING_omnibus_NumericDisplay_from_COMPUTE.out: Likewise.
* cobol.dg/group2/Separate_sign_positions__1_.out: Likewise.
* cobol.dg/group2/Separate_sign_positions__2_.out: Likewise.
* cobol.dg/group2/Simple_p-scaling.out: Likewise.

3 months agolibstdc++: Minimalize temporary allocations when width is specified [PR109162]
Tomasz Kamiński [Wed, 23 Apr 2025 11:17:09 +0000 (13:17 +0200)] 
libstdc++: Minimalize temporary allocations when width is specified [PR109162]

When a width parameter is specified for formatting a range, tuple or escaped
presentation of a string, we used to format characters to a temporary string,
and write the produced sequence padded according to the spec. However, once the
estimated width of the formatted representation of the input is larger than the
value of the spec width, it can be written directly to the output. This limits
the size of the required allocation, especially for large ranges.

Similarly, if a precision (maximum) width is provided for string presentation,
only a prefix of the sequence with estimated width not greater than the
precision needs to be buffered.

To realize the above, this commit implements a new _Padding_sink specialization.
This sink holds an output iterator, a value of padding width, (optionally) a
maximum width and a string buffer inherited from _Str_sink.
Then any incoming characters are treated in one of the following ways, depending
on the estimated width W of the written sequence:
* written to string if W is smaller than padding width and maximum width (if present)
* ignored, if W is greater than maximum width
* written to output iterator, if W is greater than padding width

The padding sink is used instead of _Str_sink in __format::__format_padded,
__formatter_str::_M_format_escaped functions.
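
The three-way treatment above can be sketched with a simplified model. This is
not the libstdc++ implementation: PaddingSinkModel and its members are
hypothetical names, the output iterator is replaced by a std::string reference,
every character is assumed to have estimated width 1, and padding is always
left-aligned.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical, simplified model of a padding sink.
// Assumption: each char contributes an estimated width of 1.
class PaddingSinkModel
{
public:
  PaddingSinkModel(std::string& out, std::size_t pad_width,
                   std::optional<std::size_t> max_width = std::nullopt)
  : out_(out), pad_width_(pad_width), max_width_(max_width)
  { }

  void write(char c)
  {
    if (max_width_ && width_ >= *max_width_)
      return;                     // past the maximum width: ignored
    ++width_;
    if (!direct_ && width_ > pad_width_)
      {
        // Padding can no longer be needed: flush and stop buffering.
        out_ += buffer_;
        buffer_.clear();
        direct_ = true;
      }
    if (direct_)
      out_.push_back(c);          // written straight to the output
    else
      buffer_.push_back(c);       // still short enough to need padding
  }

  void write(const std::string& s)
  {
    for (char c : s)
      write(c);
  }

  // Emit whatever is still buffered, padded on the right with fill.
  void finish(char fill = ' ')
  {
    if (direct_)
      return;
    assert(buffer_.size() <= pad_width_);
    out_ += buffer_;
    out_.append(pad_width_ - width_, fill);
    buffer_.clear();
    direct_ = true;
  }

  std::size_t buffered() const { return buffer_.size(); }

private:
  std::string& out_;
  std::string buffer_;
  std::size_t pad_width_;
  std::optional<std::size_t> max_width_;
  std::size_t width_ = 0;
  bool direct_ = false;
};
```

In this model the buffer never holds more than pad_width characters, which is
the point of the change: the temporary allocation is bounded by the requested
width rather than by the size of the formatted input.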

Furthermore, the __formatter_str::_M_format implementation was reworked to:
* reduce the number of instantiations by delegating to _Rg& and const _Rg& overloads,
* write the non-debug presentation to _Out directly or via _Padding_sink,
* limit the string size to the maximum width, if one is specified for debug
  format with a non-unicode encoding.

PR libstdc++/109162

libstdc++-v3/ChangeLog:

* include/bits/formatfwd.h (__simply_formattable_range): Moved from
std/format.
* include/std/format (__formatter_str::_format): Extracted escaped
string handling to separate method...
(__formatter_str::_M_format_escaped): Use _Padding_sink.
(__formatter_str::_M_format): Adjusted implementation.
(__formatter_str::_S_trunc): Extracted as namespace function...
(__format::_truncate): Extracted from __formatter_str::_S_trunc.
(__format::_Seq_sink): Removed forward declarations, made members
protected and non-final.
(_Seq_sink::_M_trim): Define.
(_Seq_sink::_M_span): Renamed from view.
(_Seq_sink::view): Returns string_view instead of span.
(__format::_Str_sink): Moved after _Seq_sink.
(__format::__format_padded): Use _Padding_sink.
* testsuite/std/format/debug.cc: Add timeout and new tests.
* testsuite/std/format/ranges/sequence.cc: Specify unicode as
encoding and new tests.
* testsuite/std/format/ranges/string.cc: Likewise.
* testsuite/std/format/tuple.cc: Likewise.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
3 months agolibstdc++: Replace leftover std::queue with Adaptor in ranges/adaptors.cc.
Tomasz Kamiński [Fri, 25 Apr 2025 12:55:30 +0000 (14:55 +0200)] 
libstdc++: Replace leftover std::queue with Adaptor in ranges/adaptors.cc.

This was left over from a work-in-progress state, where only std::queue was
tested.

libstdc++-v3/ChangeLog:

* testsuite/std/format/ranges/adaptors.cc: Updated test.

3 months agomodulo-sched: reject loop conditions when not decrementing with one [PR 116479]
Andre Vieira [Fri, 25 Apr 2025 13:02:43 +0000 (14:02 +0100)] 
modulo-sched: reject loop conditions when not decrementing with one [PR 116479]

In the commit titled 'doloop: Add support for predicated vectorized loops' the
doloop_condition_get function was changed to accept loops with decrements
larger than 1.  This patch rejects such loops for modulo-sched.

gcc/ChangeLog:

PR rtl-optimization/116479
* modulo-sched.cc (doloop_register_get): Reject conditions with
decrements that are not 1.

gcc/testsuite/ChangeLog:

* gcc.dg/pr116479.c: New test.

3 months agos390: Allow 5+ argument tail-calls in some -m31 -mzarch special cases [PR119873]
Jakub Jelinek [Fri, 25 Apr 2025 12:42:01 +0000 (14:42 +0200)] 
s390: Allow 5+ argument tail-calls in some -m31 -mzarch special cases [PR119873]

Here is a patch to handle the PARALLEL case too.
I think we can just use rtx_equal_p there, because it will always use
SImode in the EXPR_LIST REGs in that case.

2025-04-25  Jakub Jelinek  <jakub@redhat.com>

PR target/119873
* config/s390/s390.cc (s390_call_saved_register_used): Don't return
true if default definition of PARM_DECL SSA_NAME of the same register
is passed in call saved register in the PARALLEL case either.

* gcc.target/s390/pr119873-5.c: New test.

3 months agolibstdc++: Remove c++98_only dg-error
Jonathan Wakely [Fri, 25 Apr 2025 11:35:01 +0000 (12:35 +0100)] 
libstdc++: Remove c++98_only dg-error

This fixes
FAIL: 22_locale/ctype/is/string/89728_neg.cc  -std=gnu++98  (test for errors, line )

Since r16-133-g8acea9ffa82ed8 we don't keep issuing more errors after
the first one, so this dg-error no longer matches anything.

libstdc++-v3/ChangeLog:

* testsuite/22_locale/ctype/is/string/89728_neg.cc: Remove
dg-error for c++98_only effective target.

3 months agolibstdc++: Constrain formatter for thread::id [PR119918]
Tomasz Kamiński [Thu, 24 Apr 2025 07:32:24 +0000 (09:32 +0200)] 
libstdc++: Constrain formatter for thread::id [PR119918]

This patch adds the constraint __formatter::__char to the _CharT type parameter
of the formatter<thread::id, _CharT> specialization, matching the constraint
on formatting of integers/pointers that are used as native handles.
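
The effect of constraining the character type can be sketched in C++17 terms.
The names below (is_format_char_v, ThreadIdModel, FormatterModel) are
hypothetical stand-ins, and the assumption that only char and wchar_t qualify
is made for illustration; the real specialization uses an internal concept.

```cpp
#include <type_traits>

// Hypothetical stand-in for the internal character-type constraint.
// Assumption: only char and wchar_t are accepted.
template<typename CharT>
  constexpr bool is_format_char_v
    = std::is_same_v<CharT, char> || std::is_same_v<CharT, wchar_t>;

// Hypothetical stand-in for thread::id.
struct ThreadIdModel { unsigned long handle; };

// Primary template: models a disabled formatter.
template<typename T, typename CharT, typename = void>
  struct FormatterModel
  { static constexpr bool enabled = false; };

// The specialization only exists for accepted character types.
template<typename CharT>
  struct FormatterModel<ThreadIdModel, CharT,
                        std::enable_if_t<is_format_char_v<CharT>>>
  { static constexpr bool enabled = true; };
```

With this shape, FormatterModel<ThreadIdModel, char16_t> falls back to the
disabled primary template instead of selecting a specialization that could not
work for that character type.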

The dependency on the <format> header is changed to <bits/formatfwd.h>.
To achieve that, formatting of pointers is extracted from the void const*
specialization to an internal __formatter_ptr<_CharT>, which can be forward
declared.

Finally, the handle representation is now printed directly to __fc.out()
by the formatter for the handle type. To support this, internal formatters
can now be constructed from a _Spec object as an alternative to invoking the
parse method.

PR libstdc++/119918

libstdc++-v3/ChangeLog:

* include/bits/formatfwd.h (__format::_Align): Moved from std/format.
(std::__throw_format_error, __format::__formatter_str)
(__format::__formatter_ptr): Declare.
* include/std/format (__format::_Align): Moved to bits/formatfwd.h.
(__formatter_int::__formatter_int): Define.
(__format::__formatter_ptr): Extracted from formatter for const void*.
(std::formatter<const void*, _CharT>, formatter<void*, _CharT>)
(std::formatter<nullptr_t, _CharT>): Delegate to __formatter_ptr<_CharT>.
* include/std/thread (std::formatter<thread::id, _CharT>): Constrain
_CharT template parameter.
(formatter<thread::id, _CharT>::parse): Specify default alignment, and
qualify __throw_format_error to disable ADL.
(formatter<thread::id, _CharT>::format): Use formatters to write directly
to output.
* testsuite/30_threads/thread/id/output.cc: Tests for formatting thread::id
representing not-a-thread with padding and formattable concept.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
3 months agolibstdc++: Define __cpp_lib_format_ranges in format header [PR109162]
Tomasz Kamiński [Tue, 22 Apr 2025 07:56:42 +0000 (09:56 +0200)] 
libstdc++: Define __cpp_lib_format_ranges in format header [PR109162]

As P2286R8 and P2585R1 are now fully implemented, we now define the
__cpp_lib_format_ranges feature test macro.
This macro is provided only in <format>.
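
User code would typically test the macro after including <format>. A minimal
sketch, where have_format_ranges is a hypothetical helper name and the
__has_include guard is only there so the snippet also compiles with libraries
that lack the header:

```cpp
#if __has_include(<format>)
# include <format>   // per the text above, the macro is defined here
#endif

// Hypothetical helper: reports whether range formatting is advertised.
constexpr bool
have_format_ranges()
{
#ifdef __cpp_lib_format_ranges
  return true;
#else
  return false;
#endif
}
```

Code that formats ranges can then be guarded with
`if constexpr (have_format_ranges())` or with the macro directly.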

Uses of internal __glibcxx_format_ranges are also updated.

PR libstdc++/109162

libstdc++-v3/ChangeLog:

* include/bits/version.def (format_ranges): Remove no_stdname and
update value.
* include/bits/version.h: Regenerate.
* src/c++23/std.cc.in: Replace __glibcxx_format_ranges with
__cpp_lib_format_ranges.
* testsuite/std/format/formatter/lwg3944.cc: Likewise.
* testsuite/std/format/parse_ctx.cc: Likewise.
* testsuite/std/format/string.cc: Likewise.
* testsuite/std/format/ranges/feature_test.cc: New test.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
3 months agolibstdc++: Implement formatters for queue, priority_queue and stack [PR109162]
Tomasz Kamiński [Fri, 18 Apr 2025 12:56:39 +0000 (14:56 +0200)] 
libstdc++: Implement formatters for queue, priority_queue and stack [PR109162]

This patch implements formatter specializations for standard container adaptors
(queue, priority_queue and stack) from P2286R8.

To be able to access the protected `c` member, the adaptors befriend the
corresponding formatter specializations. Note that such a specialization
may be disabled if the container is not formattable; in that case
befriending it is harmless.
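
The befriending technique can be sketched outside the library. QueueModel and
FormatterModel are hypothetical names, and the output is built with
std::to_string instead of the real formatting machinery:

```cpp
#include <cassert>
#include <deque>
#include <string>

// Forward declaration, so the adaptor can name the formatter (hypothetical).
template<typename Adaptor>
  struct FormatterModel;

// Hypothetical adaptor with a protected container member named c.
template<typename T, typename Container = std::deque<T>>
  class QueueModel
  {
  public:
    void push(const T& v) { c.push_back(v); }

  protected:
    Container c;

    // Grants the matching formatter specialization access to c.
    friend struct FormatterModel<QueueModel>;
  };

template<typename T, typename Container>
  struct FormatterModel<QueueModel<T, Container>>
  {
    static std::string
    format(const QueueModel<T, Container>& q)
    {
      std::string out = "[";
      bool first = true;
      for (const T& v : q.c)   // allowed only because of the friendship
        {
          if (!first)
            out += ", ";
          out += std::to_string(v);
          first = false;
        }
      assert(out.front() == '[');
      return out + "]";
    }
  };
```

Because the friend declaration names a specialization of a template that is
only forward declared, the adaptor header does not need the formatter's
definition.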

As in the case of previous commits, the signatures of the user-facing parse
and format methods of the provided formatters deviate from the standard by
constraining types of parameters:
 * _CharT is constrained to __formatter::__char
 * basic_format_parse_context<_CharT> for parse argument
 * basic_format_context<_Out, _CharT> for format second argument
The standard specifies all of the above as unconstrained types. In particular,
the _CharT constraint allows us to befriend all allowed specializations.

Furthermore, the standard specifies these formatters as delegating to
formatter<ranges::ref_view<const? _Container>, charT>, which in turn
delegates to range_formatter. This patch avoids one level of indirection,
and the dependency on ranges::ref_view.  This is technically observable if a
user specializes formatter<std::ref_view<PD>> where PD is a program-defined
container, but I do not think this case is worth the extra indirection.

This patch also moves formattable and its dependencies to formatfwd.h,
so it can be used in the adaptors' formatters without including the format
header. The definition of _Iter_for is changed from an alias denoting
back_insert_iterator<basic_string<_CharT>> to a forward-declared struct with
a nested type typedef that points to the same type.
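
Why a struct helps can be sketched as follows. IterForModel and make_out are
hypothetical names; the two commented halves stand for a forwarding header and
the full header:

```cpp
#include <cassert>
#include <iterator>
#include <string>
#include <type_traits>

// "Forward header" part: the struct is only declared, so nothing about
// back_insert_iterator has to be visible at this point.
template<typename CharT>
  struct IterForModel;

// Declarations may already refer to the nested type.
template<typename CharT>
  typename IterForModel<CharT>::type
  make_out(std::basic_string<CharT>& s);

// "Full header" part: the definition supplies the nested typedef.
template<typename CharT>
  struct IterForModel
  { using type = std::back_insert_iterator<std::basic_string<CharT>>; };

template<typename CharT>
  typename IterForModel<CharT>::type
  make_out(std::basic_string<CharT>& s)
  {
    assert(s.size() < s.max_size());
    return std::back_inserter(s);
  }
```

An alias template has to name the aliased type at its point of declaration,
whereas the struct defers that until it is defined.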

PR libstdc++/109162

libstdc++-v3/ChangeLog:

* include/bits/formatfwd.h (__format::__parsable_with)
(__format::__formattable_with, __format::__formattable_impl)
(__format::__has_debug_format, __format::__const_formattable_range)
(__format::__maybe_const_range, __format::__maybe_const)
(std::formattable): Moved from std/format.
(__format::_Iter_for, std::range_formatter): Forward declare.
* include/bits/stl_queue.h (std::formatter): Forward declare.
(std::queue, std::priority_queue): Befriend formatter specializations.
* include/bits/stl_stack.h (std::formatter): Forward declare.
(std::stack): Befriend formatter specializations.
* include/std/format (__format::_Iter_for): Define as struct with
(__format::__parsable_with, __format::__formattable_with)
(__format::__formattable_impl, __format::__has_debug_format)
(__format::__const_formattable_range, __format::__maybe_const_range)
(__format::__maybe_const, std::formattable): Moved to bits/formatfwd.h.
(std::range_formatter): Remove default argument specified in declaration
in bits/formatfwd.h.
* include/std/queue: Include bits/version.h before bits/stl_queue.h.
(formatter<queue<_Tp, _Container, _Compare>, _CharT>)
(formatter<priority_queue<_Tp, _Container, _Compare>, _CharT>): Define.
* include/std/stack: Include bits/version.h before bits/stl_stack.h
(formatter<stack<_Tp, _Container, _Compare>, _CharT>): Define.
* testsuite/std/format/ranges/adaptors.cc: New test.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
3 months agoOpenMP, GCN: Add interop-hsa testcase
Andrew Stubbs [Thu, 24 Apr 2025 16:50:08 +0000 (16:50 +0000)] 
OpenMP, GCN: Add interop-hsa testcase

This testcase ensures that the interop HSA support is sufficient to run
a kernel manually on the same device.

libgomp/ChangeLog:

* testsuite/libgomp.c/interop-hsa.c: New test.

3 months agoc++: bad pending_template recursion
Jason Merrill [Fri, 18 Apr 2025 22:00:34 +0000 (18:00 -0400)] 
c++: bad pending_template recursion

limit_bad_template_recursion currently avoids immediate instantiation of
templates from uses in an already ill-formed instantiation, but we still can
get unnecessary recursive instantiation in pending_templates if the
instantiation was queued before the error.

Initially this regressed several libstdc++ tests which seemed to rely on a
static_assert in a function called from another that is separately ill-formed.
For instance, in the 48101_neg.cc tests, we first got an error in find(), then
later instantiate _S_key() (called from find) and got the static_assert error
from there. r16-131-g876d1a22dfaf87 and r16-132-g901900bc37566c changed
the library code (and tests) to make the expected static_assert errors
happen earlier.

gcc/cp/ChangeLog:

* cp-tree.h (struct tinst_level): Add had_errors bit.
* pt.cc (push_tinst_level_loc): Clear it.
(pop_tinst_level): Set it.
(reopen_tinst_level): Check it.
(instantiate_pending_templates): Call limit_bad_template_recursion.

gcc/testsuite/ChangeLog:

* g++.dg/template/recurse5.C: New test.

3 months agolibstdc++: Improve diagnostics for std::packaged_task invocable checks
Jonathan Wakely [Thu, 24 Apr 2025 20:55:16 +0000 (21:55 +0100)] 
libstdc++: Improve diagnostics for std::packaged_task invocable checks

Moving the static_assert that checks is_invocable_r_v into _Task_state
means it is checked when we instantiate that class template.

Replacing the __create_task_state function with a static member function
_Task_state::_S_create ensures we instantiate _Task_state and trigger
the static_assert immediately, not deep inside the implementation of
std::allocate_shared. This results in shorter diagnostics that don't
show deeply-nested template instantiations before the static_assert
failure.

Placing the static_assert at class scope also helps us to fail earlier
than waiting until when the _Task_state::_M_run virtual function is
instantiated. That also makes the diagnostics shorter and easier to read
(although for C++11 and C++14 modes the class-scope static_assert
doesn't check is_invocable_r, so dangling references aren't detected
until _M_run is instantiated).
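
The idea of checking at class scope and instantiating the class from a static
factory can be sketched like this. TaskStateModel, create and run are
hypothetical names, not the <future> internals, and the sketch assumes C++17
for is_invocable_r_v:

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical model of a task state with the check at class scope.
template<typename Fn, typename Res, typename... Args>
  struct TaskStateModel
  {
    // Evaluated as soon as the class template is instantiated.
    static_assert(std::is_invocable_r_v<Res, Fn&, Args...>,
                  "callable must be invocable with Args and return Res");

    explicit
    TaskStateModel(Fn fn) : fn_(std::move(fn)) { }

    // Naming TaskStateModel::create instantiates the class, so the check
    // fires here rather than deep inside the allocation machinery.
    static std::shared_ptr<TaskStateModel>
    create(Fn fn)
    {
      auto state = std::make_shared<TaskStateModel>(std::move(fn));
      assert(state != nullptr);
      return state;
    }

    Res
    run(Args... args)
    { return fn_(std::forward<Args>(args)...); }

    Fn fn_;
  };
```

With a callable that does not match the signature, the first diagnostic in
this model is the static_assert message, before any error from run.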

libstdc++-v3/ChangeLog:

* include/std/future (__future_base::_Task_state): Check
invocable requirement here.
(__future_base::_Task_state::_S_create): New static member
function.
(__future_base::_Task_state::_M_reset): Use _S_create.
(__create_task_state): Remove.
(packaged_task): Use _Task_state::_S_create instead of
__create_task_state.
* testsuite/30_threads/packaged_task/cons/dangling_ref.cc:
Adjust dg-error patterns.
* testsuite/30_threads/packaged_task/cons/lwg4154_neg.cc:
Likewise.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
3 months agolibstdc++: Add _M_key_compare helper to associative containers
Jonathan Wakely [Thu, 24 Apr 2025 13:58:58 +0000 (14:58 +0100)] 
libstdc++: Add _M_key_compare helper to associative containers

In r10-452-ge625ccc21a91f3 I noted that we don't have an accessor for
invoking _M_impl._M_key_compare in the associative containers. That
meant that the static assertions to check for valid comparison functions
were squirrelled away in _Rb_tree::_S_key instead. As Jason noted in
https://gcc.gnu.org/pipermail/gcc-patches/2025-April/681436.html this
means that the static assertions fail later than we'd like.

This change adds a new _Rb_tree::_M_key_compare member function which
invokes the _M_impl._M_key_compare function object, and then moves the
static_assert from _S_key into _M_key_compare. Now if the static_assert
fails, that's the first error we get, before the "no match for call" and
"invalid conversion" errors.

Because the new function is const-qualified, we now treat LWG 2542 as a
DR for older standards, requiring the comparison function to be const
invocable. Previously we only enforced the LWG 2542 rule for C++17 and
later.
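
A reduced model of such a helper, with hypothetical names (TreeModel,
key_compare, ConstLess, NonConstLess) and assuming C++17 for is_invocable_v:

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical model of a tree that funnels comparisons through one helper.
template<typename Key, typename Compare>
  class TreeModel
  {
  public:
    explicit
    TreeModel(Compare cmp = Compare()) : cmp_(cmp) { }

    // Const-qualified, so cmp_ is invoked as a const object.
    bool
    key_compare(const Key& a, const Key& b) const
    {
      static_assert(
        std::is_invocable_v<const Compare&, const Key&, const Key&>,
        "comparison object must be invocable as const");
      return cmp_(a, b);
    }

  private:
    Compare cmp_;
  };

// Accepted: the call operator is const.
struct ConstLess
{ bool operator()(int a, int b) const { return a < b; } };

// Rejected once key_compare is instantiated: the call operator is non-const.
struct NonConstLess
{ bool operator()(int a, int b) { return a < b; } };
```

In this model, instantiating TreeModel<int, NonConstLess>::key_compare fails
on the static_assert first, which mirrors the ordering of diagnostics
described above.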

I did consider deprecating support for comparisons which aren't const
invocable, something like this:

  // Before LWG 2542 it wasn't strictly necessary for _Compare to be
  // const invocable, if you only used non-const container members.
  // Define a non-const overload for pre-C++17, deprecated for C++11/14.
  #if __cplusplus < 201103L
  bool
  _M_key_compare(const _Key& __k1, const _Key& __k2)
  { return _M_impl._M_key_compare(__k1, __k2); }
  #elif __cplusplus < 201703L
  template<typename _Key1, typename _Key2>
    [[__deprecated__("support for comparison functions that are not "
                     "const invocable is deprecated")]]
    __enable_if_t<
    __and_<__is_invocable<_Compare&, const _Key1&, const _Key2&>,
           __not_<__is_invocable<const _Compare&, const _Key1&, const _Key2&>>>::value,
           bool>
    _M_key_compare(const _Key1& __k1, const _Key2& __k2)
    {
      static_assert(
        __is_invocable<_Compare&, const _Key&, const _Key&>::value,
        "comparison object must be invocable with two arguments of key type"
      );
      return _M_impl._M_key_compare(__k1, __k2);
    }
  #endif

But I decided that this isn't necessary, because we've been enforcing
the C++17 rule since GCC 8.4 and 9.2, and C++17 has been the default
since GCC 11.1. Users have had plenty of time to fix their invalid
comparison functions.

libstdc++-v3/ChangeLog:

* include/bits/stl_tree.h (_Rb_tree::_M_key_compare): New member
function to invoke comparison function.
(_Rb_tree): Use new member function instead of accessing the
comparison function directly.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
3 months agoGCN, nvptx offloading: Host/device compatibility: Itanium C++ ABI, DSO Object Destruc...
Thomas Schwinge [Wed, 23 Apr 2025 08:51:48 +0000 (10:51 +0200)] 
GCN, nvptx offloading: Host/device compatibility: Itanium C++ ABI, DSO Object Destruction API [PR119853, PR119854]

'__dso_handle' for '__cxa_atexit', '__cxa_finalize'.  See
<https://itanium-cxx-abi.github.io/cxx-abi/abi.html#dso-dtor>.

PR target/119853
PR target/119854
libgcc/
* config/gcn/crt0.c (_fini_array): Call
'__GCC_offload___cxa_finalize'.
* config/nvptx/gbl-ctors.c (__static_do_global_dtors): Likewise.
libgomp/
* target-cxa-dso-dtor.c: New.
* config/accel/target-cxa-dso-dtor.c: Likewise.
* Makefile.am (libgomp_la_SOURCES): Add it.
* Makefile.in: Regenerate.
* testsuite/libgomp.c++/target-cdtor-1.C: New.
* testsuite/libgomp.c++/target-cdtor-2.C: Likewise.

3 months agoAdd 'libgomp.c-c++-common/target-cdtor-1.c'
Thomas Schwinge [Wed, 23 Apr 2025 15:35:29 +0000 (17:35 +0200)] 
Add 'libgomp.c-c++-common/target-cdtor-1.c'

libgomp/
* testsuite/libgomp.c-c++-common/target-cdtor-1.c: New.

3 months agoGCN: Properly switch sections in 'gcn_hsa_declare_function_name' [PR119737]
Andrew Pinski [Mon, 21 Apr 2025 22:32:26 +0000 (22:32 +0000)] 
GCN: Properly switch sections in 'gcn_hsa_declare_function_name' [PR119737]

There are GCN/C++ target as well as offloading codes, where the hard-coded
section names in 'gcn_hsa_declare_function_name' do not fit, and assembly thus
fails:

    LLVM ERROR: Size expression must be absolute.

This commit progresses GCN target:

    [-FAIL: g++.dg/init/call1.C  -std=gnu++17 (internal compiler error: Aborted signal terminated program as)-]
    [-FAIL:-]{+PASS:+} g++.dg/init/call1.C  -std=gnu++17 (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} g++.dg/init/call1.C  -std=gnu++17 [-compilation failed to produce executable-]{+execution test+}
    [-FAIL: g++.dg/init/call1.C  -std=gnu++26 (internal compiler error: Aborted signal terminated program as)-]
    [-FAIL:-]{+PASS:+} g++.dg/init/call1.C  -std=gnu++26 (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} g++.dg/init/call1.C  -std=gnu++26 [-compilation failed to produce executable-]{+execution test+}
    UNSUPPORTED: g++.dg/init/call1.C  -std=gnu++98: exception handling not supported

..., and GCN offloading:

    [-XFAIL: libgomp.c++/target-exceptions-throw-1.C (internal compiler error: Aborted signal terminated program as)-]
    [-XFAIL: libgomp.c++/target-exceptions-throw-1.C PR119737 at line 7 (test for bogus messages, line )-]
    [-XFAIL:-]{+PASS:+} libgomp.c++/target-exceptions-throw-1.C (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} libgomp.c++/target-exceptions-throw-1.C [-compilation failed to produce executable-]{+execution test+}
    {+PASS: libgomp.c++/target-exceptions-throw-1.C output pattern test+}

    [-XFAIL: libgomp.c++/target-exceptions-throw-2.C (internal compiler error: Aborted signal terminated program as)-]
    [-XFAIL: libgomp.c++/target-exceptions-throw-2.C PR119737 at line 7 (test for bogus messages, line )-]
    [-XFAIL:-]{+PASS:+} libgomp.c++/target-exceptions-throw-2.C (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} libgomp.c++/target-exceptions-throw-2.C [-compilation failed to produce executable-]{+execution test+}
    {+PASS: libgomp.c++/target-exceptions-throw-2.C output pattern test+}

    [-XFAIL: libgomp.oacc-c++/exceptions-throw-1.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O2  (internal compiler error: Aborted signal terminated program as)-]
    [-XFAIL: libgomp.oacc-c++/exceptions-throw-1.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O2  PR119737 at line 7 (test for bogus messages, line )-]
    [-XFAIL:-]{+PASS:+} libgomp.oacc-c++/exceptions-throw-1.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O2  (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} libgomp.oacc-c++/exceptions-throw-1.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O2  [-compilation failed to produce executable-]{+execution test+}
    {+PASS: libgomp.oacc-c++/exceptions-throw-1.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O2  output pattern test+}

    [-XFAIL: libgomp.oacc-c++/exceptions-throw-2.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O2  (internal compiler error: Aborted signal terminated program as)-]
    [-XFAIL: libgomp.oacc-c++/exceptions-throw-2.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O2  PR119737 at line 9 (test for bogus messages, line )-]
    [-XFAIL:-]{+PASS:+} libgomp.oacc-c++/exceptions-throw-2.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O2  (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} libgomp.oacc-c++/exceptions-throw-2.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O2  [-compilation failed to produce executable-]{+execution test+}
    {+PASS: libgomp.oacc-c++/exceptions-throw-2.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O2  output pattern test+}

PR target/119737
gcc/
* config/gcn/gcn.cc (gcn_hsa_declare_function_name): Properly
switch sections.
libgomp/
* testsuite/libgomp.c++/target-exceptions-throw-1.C: Remove
PR119737 XFAILing.
* testsuite/libgomp.c++/target-exceptions-throw-2.C: Likewise.
* testsuite/libgomp.oacc-c++/exceptions-throw-1.C: Likewise.
* testsuite/libgomp.oacc-c++/exceptions-throw-2.C: Likewise.

Co-authored-by: Thomas Schwinge <tschwinge@baylibre.com>
3 months agoAdjust 'libgomp.c++/target-exceptions-pr118794-1.C' for 'targetm.arm_eabi_unwinder...
Thomas Schwinge [Tue, 22 Apr 2025 11:41:22 +0000 (13:41 +0200)] 
Adjust 'libgomp.c++/target-exceptions-pr118794-1.C' for 'targetm.arm_eabi_unwinder' [PR118794]

Fix-up for commit aa3e72f943032e5f074b2bd2fd06d130dda8760b
"Add test cases for exception handling constructs in dead code for GCN, nvptx target and OpenMP 'target' offloading [PR118794]":
we need to adjust for configurations with 'targetm.arm_eabi_unwinder', as per:

    gcc/config/arm/arm.cc:#define TARGET_ARM_EABI_UNWINDER true
    gcc/config/c6x/c6x.cc:#define TARGET_ARM_EABI_UNWINDER true

..., which for ARM is conditional on '#if ARM_UNWIND_INFO' (defined in
'gcc/config/arm/bpabi.h', used for various GCC configurations), and for
C6x is unconditional.

This gets us:

    --- target-exceptions-pr118794-1.C.269t.optimized
    +++ target-exceptions-pr118794-1.C.270t.optimized
    [...]
     __attribute__((omp declare target))
     void f ()
    [...]
       gimple_call <__dt_comp , NULL, &c>
    -  gimple_call <__builtin_eh_pointer, _7, 2>
    -  gimple_call <__builtin_unwind_resume, NULL, _7>
    +  gimple_call <__builtin_cxa_end_cleanup, NULL>

     }
    [...]

PR target/118794
libgomp/
* testsuite/libgomp.c++/target-exceptions-pr118794-1.C: Adjust for
'targetm.arm_eabi_unwinder'.
* testsuite/libgomp.c++/target-exceptions-pr118794-1-offload-sorry-GCN.C:
Likewise.
* testsuite/libgomp.c++/target-exceptions-pr118794-1-offload-sorry-nvptx.C:
Likewise.

3 months agoAdjust gcc_release for id href web transformations
Jakub Jelinek [Fri, 25 Apr 2025 08:23:15 +0000 (10:23 +0200)] 
Adjust gcc_release for id href web transformations

We now have some script which transforms e.g.
<h2 id="15.1">GCC 15.1</h2>
line in gcc-15/changes.html to
<h2 id="15.1"><a href="#15.1">GCC 15.1</a></h2>

This unfortunately breaks the gcc_release script, which looks for
GCC 15.1 appearing in gennews after optional blanks from the start of
the line in the NEWS file. That is no longer the case; instead there is
[129]GCC 15.1
or something like that, with a URL later on
 129. https://gcc.gnu.org/gcc-15/changes.html#15.1

The following patch handles this.

2025-04-25  Jakub Jelinek  <jakub@redhat.com>

* gcc_release: Allow optional \[[0-9]+\] before GCC major.minor
in the NEWS file.
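
The matching rule can be illustrated in C++ with std::regex. This is only a
sketch of the pattern: gcc_release itself is a script, and both the helper
name is_release_heading and the exact expression are assumptions.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: true if the line starts with optional blanks, an
// optional [N] link marker, and then "GCC major.minor".
inline bool
is_release_heading(const std::string& line)
{
  assert(line.find('\n') == std::string::npos);  // expects a single line
  static const std::regex pattern(
    R"(^[ \t]*(\[[0-9]+\])?GCC [0-9]+\.[0-9]+)");
  return std::regex_search(line, pattern);
}
```

Both "GCC 15.1" and "[129]GCC 15.1" are accepted, while the later URL line
that starts with the link number is not.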

3 months agoUpdate gennews for GCC 15.
Jakub Jelinek [Fri, 25 Apr 2025 07:53:35 +0000 (09:53 +0200)] 
Update gennews for GCC 15.

2025-04-25  Jakub Jelinek  <jakub@redhat.com>

* gennews (files): Add files for GCC 15.

3 months ago[PATCH] RISC-V: Imply C from Zca whenever possible [PR119122]
Yuriy Kolerov [Fri, 25 Apr 2025 03:22:16 +0000 (21:22 -0600)] 
[PATCH] RISC-V: Imply C from Zca whenever possible [PR119122]

GCC must imply the C extension from the Zca extension whenever
possible. This is necessary for achieving compatibility
between different march strings which may in fact be
the same.

E.g., if an rv32ic multilib configuration is present in
GCC, then GCC will not choose this configuration for
linking if -march=rv32i_zca is passed.

Here is a more practical example. From RISC-V
Instruction Set Manual:

    Therefore common ISA strings can be updated as follows
    to include the relevant Zc extensions, for example:
        - RV32IMC becomes RV32IM_Zce
        - RV32IMCF becomes RV32IMF_Zce

With the current implication rules this will not work well
if an rv32imc configuration is present and a user
passes -march=rv32im_zce. This is how we can check
this with a simple empty test.c source file:

$ riscv64-unknown-elf-gcc -march=rv32ic -mabi=ilp32 -mriscv-attribute -S test.c
$ grep "attribute arch" test.s
        .attribute arch, "rv32i2p1_c2p0_zca1p0"
$ riscv64-unknown-elf-gcc -march=rv32i_zce -mabi=ilp32 -mriscv-attribute -S test.c
$ grep "attribute arch" test.s
        .attribute arch, "rv32i2p1_zicsr2p0_zca1p0_zcb1p0_zce1p0_zcmp1p0_zcmt1p0"

According to current GCC these march strings are
incompatible: the first one contains c2p0 and the
second one doesn't.

To introduce such implication rule we need to carefully
cover all possible combinations with these extensions:
zca, zcf, zcd, F and D.

According to the same manual:

    As C defines the same instructions as Zca, Zcf and
    Zcd, the rule is that:
        - C always implies Zca
        - C+F implies Zcf (RV32 only)
        - C+D implies Zcd

Here is a full list of cases:

    1. rv32i_zca implies C.
    2. rv32if_zca_zcf implies C.
    3. rv32ifd_zca_zcf_zcd implies C.
    4. rv64i_zca implies C.
    5. rv64ifd_zca_zcd implies C.
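
The five cases can be condensed into a small predicate. This is a sketch
derived only from the list above, not the rule as written in
riscv-common.cc; zca_implies_c is a hypothetical name and extensions are
passed as lowercase strings.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical helper: decides whether an extension set containing Zca
// is equivalent to one containing C.
inline bool
zca_implies_c(int xlen, const std::set<std::string>& exts)
{
  assert(xlen == 32 || xlen == 64);
  if (!exts.count("zca"))
    return false;
  // C+F implies Zcf on RV32 only, so an RV32 F configuration needs Zcf.
  if (xlen == 32 && exts.count("f") && !exts.count("zcf"))
    return false;
  // C+D implies Zcd, so a D configuration needs Zcd.
  if (exts.count("d") && !exts.count("zcd"))
    return false;
  return true;
}
```

For example, rv32if_zca without Zcf does not imply C in this model, because
C together with F would have brought in Zcf.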

PR target/119122

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc (riscv_implied_info): Add a rule
for Zca to C implication.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/arch-25.c: Fix dg-error expectation.
* gcc.target/riscv/attribute-c-1.c: New test.
* gcc.target/riscv/attribute-c-2.c: New test.
* gcc.target/riscv/attribute-c-3.c: New test.
* gcc.target/riscv/attribute-c-4.c: New test.
* gcc.target/riscv/attribute-c-5.c: New test.
* gcc.target/riscv/attribute-c-6.c: New test.
* gcc.target/riscv/attribute-c-7.c: New test.
* gcc.target/riscv/attribute-c-8.c: New test.
* gcc.target/riscv/attribute-zce-1.c: Update Zce tests.
* gcc.target/riscv/attribute-zce-2.c: Likewise.
* gcc.target/riscv/attribute-zce-3.c: Likewise.
* gcc.target/riscv/attribute-zce-4.c: Likewise.

3 months agoDaily bump.
GCC Administrator [Fri, 25 Apr 2025 00:18:00 +0000 (00:18 +0000)] 
Daily bump.

3 months agolibstdc++: Remove unnecessary dg-prune-output from tests
Jonathan Wakely [Thu, 24 Apr 2025 13:50:36 +0000 (14:50 +0100)] 
libstdc++: Remove unnecessary dg-prune-output from tests

There are no errors matching this pattern in these tests (only in the
deque/48101_neg.cc and vector/48101_neg.cc tests).

libstdc++-v3/ChangeLog:

* testsuite/23_containers/forward_list/48101_neg.cc: Remove
dg-prune-output that doesn't match anything.
* testsuite/23_containers/list/48101_neg.cc: Likewise.
* testsuite/23_containers/multiset/48101_neg.cc: Likewise.
* testsuite/23_containers/set/48101_neg.cc: Likewise.

3 months agos390: Allow 5+ argument tail-calls in some special cases [PR119873]
Jakub Jelinek [Thu, 24 Apr 2025 21:44:28 +0000 (23:44 +0200)] 
s390: Allow 5+ argument tail-calls in some special cases [PR119873]

protobuf (and therefore firefox too) currently doesn't build on s390*-linux.
The problem is that it uses [[clang::musttail]] attribute heavily, and in
llvm (IMHO llvm bug) [[clang::musttail]] calls with 5+ arguments on
s390*-linux are silently accepted and result in a normal non-tail call.
In GCC we just reject those because the target hook refuses to tail call it
(IMHO the right behavior).
Now, the reason why that happens is as s390_function_ok_for_sibcall attempts
to explain, the 5th argument (assuming normal <= wordsize integer or pointer
arguments, nothing that needs 2+ registers) is passed in %r6 which is not
call clobbered, so we can't do tail call when we'd have to change content
of that register and then caller would assume %r6 content didn't change and
use it again.
In the protobuf case though, the 5th argument is always passed through
from the caller to the musttail callee unmodified, so one can actually
emit just jg tail_called_function or perhaps tweak some registers but
keep %r6 untouched, and in that case I think it is just fine to tail call
it (at least unless the stack slots used for 6+ argument can't be modified
by the callee in the ABI and nothing checks for that).

So, the following patch checks for this special case, where the argument
which uses %r6 is passed in a single register and it is passed default
definition of SSA_NAME of a PARM_DECL with the same DECL_INCOMING_RTL.
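
The accepted pattern can be illustrated with a short C sketch. The function names and the MUSTTAIL macro are made up for this example; the macro expands to the musttail attribute only when the compiler reports support for it, and the fifth argument (the one that would be passed in %r6 on s390*-linux) is forwarded to the callee unmodified.

```c
/* Use the musttail statement attribute only where the compiler
   advertises it; otherwise fall back to an ordinary call.  */
#if defined(__has_attribute)
# if __has_attribute(musttail)
#  define MUSTTAIL __attribute__((musttail))
# endif
#endif
#ifndef MUSTTAIL
# define MUSTTAIL
#endif

/* Hypothetical callee taking five word-sized arguments.  */
long
finish (long a, long b, long c, long d, long ctx)
{
  return a + b + c + d + ctx;
}

/* The first four arguments may change, but CTX, the fifth one, is the
   caller's own incoming parameter passed through untouched, so the
   call-saved register holding it never needs to be modified.  */
long
step (long a, long b, long c, long d, long ctx)
{
  MUSTTAIL return finish (a + 1, b, c, d, ctx);
}
```

Had step passed ctx + 1 instead of ctx, the register would have to change before the jump, which is the situation the target hook still refuses.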

It won't really work at -O0 but should work for -O1 and above, at least when
one doesn't really try to modify the parameter conditionally and hope it will
be optimized away in the end.

2025-04-24  Jakub Jelinek  <jakub@redhat.com>
    Stefan Schulze Frielinghaus  <stefansf@gcc.gnu.org>

PR target/119873
* config/s390/s390.cc (s390_call_saved_register_used): Don't return
true if default definition of PARM_DECL SSA_NAME of the same register
is passed in call saved register.
(s390_function_ok_for_sibcall): Adjust comment.

* gcc.target/s390/pr119873-1.c: New test.
* gcc.target/s390/pr119873-2.c: New test.
* gcc.target/s390/pr119873-3.c: New test.
* gcc.target/s390/pr119873-4.c: New test.

3 months agolibstdc++: Add lvalue overload for generator::yield_value
Jonathan Wakely [Thu, 12 Dec 2024 00:32:08 +0000 (00:32 +0000)] 
libstdc++: Add lvalue overload for generator::yield_value

This was approved in Wrocław as LWG 3899.

This avoids creating a new coroutine frame to co_yield the elements of
an lvalue generator.

libstdc++-v3/ChangeLog:

* include/std/generator (generator::yield_value): Add overload
taking lvalue element_of view, as per LWG 3899.
* testsuite/24_iterators/range_generators/lwg3899.cc: New test.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Reviewed-by: Arsen Arsenović <arsen@aarsen.me>
3 months agoPR modula2/115276: libgm2 wraptime.cc field access all return -1.
Gaius Mulley [Thu, 24 Apr 2025 21:09:19 +0000 (22:09 +0100)] 
PR modula2/115276: libgm2 wraptime.cc field access all return -1.

This patch provides autoconf tests for each field used in wraptime.cc
referencing struct tm and struct timeval.
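
A guarded accessor of the kind described might look like the sketch below. The macro name HAVE_STRUCT_TM_TM_YEAR and the function name are assumptions made for illustration. The macro is defined by hand here so the example is self-contained; in a real build it would come from the configure-generated config.h. If such a macro is never defined, the accessor always takes the fallback branch, which matches the symptom in the PR title.

```c
#include <time.h>

/* Assumption: normally provided by config.h after a configure check.  */
#define HAVE_STRUCT_TM_TM_YEAR 1

/* Return the year field of T, or -1 when struct tm is not known to
   have a tm_year member on this host.  */
int
sketch_get_year (const struct tm *t)
{
#if defined (HAVE_STRUCT_TM_TM_YEAR)
  return t->tm_year;
#else
  (void) t;
  return -1;
#endif
}
```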

libgm2/ChangeLog:

PR modula2/115276
* config.h.in: Regenerate.
* configure: Regenerate.
* configure.ac (AC_STRUCT_TIMEZONE): Add.
(AC_CHECK_MEMBER): Test for struct tm.tm_year.
(AC_CHECK_MEMBER): Test for struct tm.tm_mon.
(AC_CHECK_MEMBER): Test for struct tm.tm_mday.
(AC_CHECK_MEMBER): Test for struct tm.tm_hour.
(AC_CHECK_MEMBER): Test for struct tm.tm_min.
(AC_CHECK_MEMBER): Test for struct tm.tm_sec.
(AC_CHECK_MEMBER): Test for struct tm.tm_year.
(AC_CHECK_MEMBER): Test for struct tm.tm_yday.
(AC_CHECK_MEMBER): Test for struct tm.tm_wday.
(AC_CHECK_MEMBER): Test for struct tm.tm_isdst.
(AC_CHECK_MEMBER): Test for struct timeval.tv_sec.
(AC_CHECK_MEMBER): Test for struct timeval.tv_sec.
(AC_CHECK_MEMBER): Test for struct timeval.tv_usec.
* libm2iso/wraptime.cc (InitTimeval): Guard against lack of
struct timeval and malloc.
(InitTimezone): Guard against lack of struct tm.tm_zone
and malloc.
(KillTimezone): Ditto.
(InitTimeval): Guard against lack of struct timeval
and malloc.
(KillTimeval): Guard against lack of malloc.
(settimeofday): Guard against lack of struct tm.tm_zone.
(GetFractions): Guard against lack of struct timeval.
(localtime_r): Ditto.
(GetYear): Guard against lack of struct tm.
(GetMonth): Ditto.
(GetDay): Ditto.
(GetHour): Ditto.
(GetMinute): Ditto.
(GetSecond): Ditto.
(GetSummerTime): Ditto.
(GetDST): Guard against lack of struct timezone.
(SetTimezone): Ditto.
(SetTimeval): Guard against lack of struct tm.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
3 months agocobol: Repair some exception processing logic.
Robert Dubner [Thu, 24 Apr 2025 20:26:58 +0000 (16:26 -0400)] 
cobol: Repair some exception processing logic.

This patch changes the exception processing logic for the calculation of
reference modifications and table subscripts to be more in accordance with
ISO specifications.

It also adjusts the processing of RETURN-CODE when calling routines that
have no CALL ... RETURNING phrase.

gcc/cobol

* genapi.cc: (initialize_variable_internal): Change TRACE1 formatting.
(create_and_call): Repair RETURN-CODE processing.
(mh_source_is_group): Repair run-time IF type comparison.
(psa_FldLiteralA): Change TRACE1 formatting.
(parser_symbol_add): Eliminate unnecessary code.
* genutil.cc: Eliminate SET_EXCEPTION_CODE macro.
(get_data_offset_dest): Repair set_exception_code logic.
(get_data_offset_source): Likewise.
(get_binary_value): Likewise.
(refer_refmod_length): Likewise.
(refer_fill_depends): Likewise.
(refer_offset_dest): Likewise.
(refer_size_dest): Likewise.
(refer_offset_source): Likewise.

gcc/testsuite

* cobol.dg/group1/declarative_1.cob: Adjust for repaired exception logic.

3 months agoFix i386 vectorizer cost of COND_EXPR and MIN_MAX with one of parameters 0 or -1
Jan Hubicka [Thu, 24 Apr 2025 16:37:55 +0000 (18:37 +0200)] 
Fix i386 vectorizer cost of COND_EXPR and MIN_MAX with one of parameters 0 or -1

gcc/ChangeLog:

PR target/119919
* config/i386/i386.cc (ix86_vector_costs::add_stmt_cost): Account
correctly cond_expr and min/max when one of operands is 0 or -1.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr119919.c: New test.

3 months agoFix ICE building deepsjeng with -fprofile-use
Jan Hubicka [Thu, 24 Apr 2025 16:35:54 +0000 (18:35 +0200)] 
Fix ICE building deepsjeng with -fprofile-use

The problem here is division by zero, since adjusted 0 > precise 0. Fixed by
using the right test.
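
The pitfall can be modelled with a small C sketch. It assumes a simplified count type carrying a value and a quality, in which a comparison against the precise zero also looks at the quality; all names here are hypothetical and this is not the profile_count implementation.

```c
#include <stdbool.h>
#include <stdint.h>

enum quality { ADJUSTED, PRECISE };

/* Hypothetical simplified profile count.  */
struct count
{
  uint64_t value;
  enum quality q;
};

static const struct count precise_zero = { 0, PRECISE };

bool
count_equal (struct count a, struct count b)
{
  return a.value == b.value && a.q == b.q;
}

/* Fragile test: anything not identical to the precise zero counts as
   "greater than zero", including an adjusted 0.  */
bool
greater_than_zero (struct count c)
{
  return !count_equal (c, precise_zero);
}

/* Robust test: only the value matters.  */
bool
count_nonzero_p (struct count c)
{
  return c.value != 0;
}

/* Scale N by NUM/DEN, guarding the division with the robust test.  */
uint64_t
scale (uint64_t n, struct count num, struct count den)
{
  if (!count_nonzero_p (den))
    return 0;
  return n * num.value / den.value;
}
```

In this model an adjusted 0 passes greater_than_zero but fails count_nonzero_p, so only the latter is safe as a guard before dividing.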

gcc/ChangeLog:

PR ipa/119924
* ipa-cp.cc (update_counts_for_self_gen_clones): Use nonzero_p.
(update_profiling_info): Likewise.
(update_specialized_profile): Likewise.

3 months agolibgomp/testsuite: Fix hip_header_nvidia check, add workaround to test
Tobias Burnus [Thu, 24 Apr 2025 16:26:30 +0000 (18:26 +0200)] 
libgomp/testsuite: Fix hip_header_nvidia check, add workaround to test

This is all about using AMD's HIP header files with
__HIP_PLATFORM_NVIDIA__ defined, i.e. HIP with Nvidia/CUDA; in that case,
HIP is a thin layer on top of CUDA.

First, the check_effective_target_gomp_hip_header_nvidia check failed;
to fix it, -Wno-deprecated-declarations was added - and likewise to the
two affected testcases that actually used the HIP headers on Nvidia.

Doing so, the HIP test was successful but the HIP-BLAS one showed two
issues:

* One seems to be related to include search paths as the HIP header uses
  #include "library_types.h" to include that CUDA header. Seemingly, it
  tried to include (again) the HIP header hip/library_types.h, not the
  CUDA one. I guess, some tweaking of -isystem vs. -I could have
  prevented this, but the simpler workaround was to just explicitly
  include the CUDA one before the HIP header files.

* Once done, everything compiled but linking failed as the association
  between three HIP-BLAS functions and their CUDA-BLAS ones did not
  work. Solution: Just add three #define for mapping them.

libgomp/ChangeLog:

* testsuite/lib/libgomp.exp
(check_effective_target_gomp_hip_header_nvidia): Compile with
"-Wno-deprecated-declarations".
* testsuite/libgomp.c/interop-hip-nvidia-full.c: Likewise.
* testsuite/libgomp.c/interop-hipblas-nvidia-full.c: Likewise.
* testsuite/libgomp.c/interop-hipblas.h: Add workarounds
when using the HIP headers with __HIP_PLATFORM_NVIDIA__.

3 months agolibstdc++: Add std::deque<>::shrink_to_fit test
François Dumont [Thu, 10 Apr 2025 18:58:11 +0000 (20:58 +0200)] 
libstdc++: Add std::deque<>::shrink_to_fit test

The existing test is currently testing std::vector. Adapt it for std::deque.

libstdc++-v3/ChangeLog:

* testsuite/util/replacement_memory_operators.h: Adapt for -fno-exceptions
context.
* testsuite/23_containers/deque/capacity/shrink_to_fit.cc: Adapt test
to check std::deque shrink_to_fit method.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Reviewed-by: Tomasz Kaminski <tkaminsk@redhat.com>
3 months agoc++: attribute duplication [PR116954]
Jason Merrill [Thu, 24 Apr 2025 09:15:01 +0000 (05:15 -0400)] 
c++: attribute duplication [PR116954]

As a followup to the previous patch for 116954, there's no reason to do
anything in remove_contract_attributes if contracts aren't enabled.

PR c++/116954

gcc/cp/ChangeLog:

* contracts.cc (remove_contract_attributes): Return early if
not enabled.

3 months agoaarch64: Fix CFA offsets in non-initial stack probes [PR119610]
Richard Sandiford [Thu, 24 Apr 2025 13:31:49 +0000 (14:31 +0100)] 
aarch64: Fix CFA offsets in non-initial stack probes [PR119610]

PR119610 is about incorrect CFI output for a stack probe when that
probe is not the initial allocation.  The main aarch64 stack probe
function, aarch64_allocate_and_probe_stack_space, implicitly assumed
that the incoming stack pointer pointed to the top of the frame,
and thus held the CFA.

aarch64_save_callee_saves and aarch64_restore_callee_saves use a
parameter called bytes_below_sp to track how far the stack pointer
is above the base of the static frame.  This patch does the same
thing for aarch64_allocate_and_probe_stack_space.
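
The bookkeeping can be sketched in a few lines of C, assuming (as the text says) that the CFA sits at the top of the static frame. The function names are made up for illustration and do not correspond to functions in aarch64.cc.

```c
/* Offset to add to the stack pointer to reach the CFA, given the total
   static frame size and the number of frame bytes still below SP.  */
long
cfa_offset_from_sp (long frame_size, long bytes_below_sp)
{
  return frame_size - bytes_below_sp;
}

/* Model one allocation of SIZE bytes: SP moves down, so fewer bytes of
   the frame remain below it.  Returns the CFA offset that is valid
   after the allocation.  */
long
allocate (long frame_size, long *bytes_below_sp, long size)
{
  *bytes_below_sp -= size;
  return cfa_offset_from_sp (frame_size, *bytes_below_sp);
}
```

For a 4352-byte frame allocated as 256 bytes followed by 4096 bytes, the CFA offsets are 256 and then 4352. A probe routine that assumed SP held the CFA on entry would report 4096 for the second step instead.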

Also, I noticed that the SVE path was attaching the first CFA note
to the wrong instruction: it was attaching the note to the calculation
of the stack size, rather than to the r11<-sp copy.

gcc/
PR target/119610
* config/aarch64/aarch64.cc (aarch64_allocate_and_probe_stack_space):
Add a bytes_below_sp parameter and use it to calculate the CFA
offsets.  Attach the first SVE CFA note to the move into the
associated temporary register.
(aarch64_allocate_and_probe_stack_space): Update calls accordingly.
Start out with bytes_below_sp set to the frame size and decrement
it after each allocation.

gcc/testsuite/
PR target/119610
* g++.dg/torture/pr119610.C: New test.
* g++.target/aarch64/sve/pr119610-sve.C: Likewise.

3 months agoc: Allow $@` in GNU23/GNU2Y raw string delimiters [PR110343]
Jakub Jelinek [Thu, 24 Apr 2025 13:29:50 +0000 (15:29 +0200)] 
c: Allow $@` in GNU23/GNU2Y raw string delimiters [PR110343]

Aaron mentioned in the PR that late in C23 N3124 was adopted and
$@` are now part of basic character set.  The paper has been implemented
in GCC from what I can see, but we should allow for GNU23/2Y $@` in
raw string delimiters as well, like they are allowed for C++26, because
the delimiters can contain anything from basic character set but space,
()\, tab, form-feed, newline and backspace.
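
As a rough illustration, a per-character check for delimiter validity could look like the following C sketch. The function name and the allow_extra flag are hypothetical, and the letter and digit range tests assume an ASCII-compatible execution character set; this is not the code in lex_raw_string.

```c
#include <stdbool.h>
#include <string.h>

/* Return true if C may appear in a raw string delimiter.  ALLOW_EXTRA
   selects whether $, @ and ` are treated as part of the basic
   character set.  Space, parentheses, backslash and control characters
   are rejected simply by not being listed.  */
bool
raw_delim_char_ok (char c, bool allow_extra)
{
  static const char punct[] = "_{}[]#<>%:;.?*+-/^&|~!=,\"'";

  if (c == '\0')
    return false;
  if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
      || (c >= '0' && c <= '9'))
    return true;
  if (strchr (punct, c) != NULL)
    return true;
  if (allow_extra && (c == '$' || c == '@' || c == '`'))
    return true;
  return false;
}
```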

2025-04-24  Jakub Jelinek  <jakub@redhat.com>

PR c++/110343
* lex.cc (lex_raw_string): For C allow $@` in raw string delimiters
if CPP_OPTION (pfile, low_ucns) i.e. for C23 and later.

* gcc.dg/raw-string-1.c: New test.

3 months agoopts.cc: Use opts rather than opts_set for validating -fipa-reorder-for-locality
Kyrylo Tkachov [Thu, 24 Apr 2025 12:33:54 +0000 (05:33 -0700)] 
opts.cc: Use opts rather than opts_set for validating -fipa-reorder-for-locality

This ensures -fno-ipa-reorder-for-locality doesn't complain with an explicit
-flto-partition=.
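
The distinction can be sketched in C as follows, assuming a simplified options record in which opts holds the effective values and opts_set records which options appeared on the command line in either their positive or negative form. The struct and function names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical cut-down options record.  */
struct options
{
  int flag_ipa_reorder_for_locality;
  int flag_lto_partition;
};

/* Report a conflict only when the locality reordering is actually
   enabled (OPTS) and -flto-partition= was given explicitly (OPTS_SET).
   Testing OPTS_SET for the first flag would also fire for
   -fno-ipa-reorder-for-locality, which sets the "explicit" bit while
   leaving the effective value at 0.  */
bool
conflict_p (const struct options *opts, const struct options *opts_set)
{
  return opts->flag_ipa_reorder_for_locality
         && opts_set->flag_lto_partition;
}
```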

Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com>
* opts.cc (validate_ipa_reorder_locality_lto_partition): Check opts
instead of opts_set for x_flag_ipa_reorder_for_locality.
(finish_options): Update call site.

3 months agolibgomp: Add additional OpenMP interop runtime tests
Tobias Burnus [Thu, 24 Apr 2025 12:36:37 +0000 (14:36 +0200)] 
libgomp: Add additional OpenMP interop runtime tests

Add checks for nowait/depend and checks that the returned
CUDA, CUDA_DRIVER and HIP interop objects actually work.

While the CUDA/CUDA_DRIVER ones are only for Nvidia GPUs, HIP
works on both AMD and Nvidia GPUs; on Nvidia GPUs, it is a
very thin wrapper around CUDA.

For Fortran, only a HIP test has been added - using hipfort.

While libgomp.c-c++-common/interop-2.c always works - even without
GPU - and checks for depend / nowait, all others require that
runtime libraries are found at link (and execution) time:
For Nvidia GPUs, libcuda + libcudart or libcublas,
For AMD GPUs, libamdhip64 or libhipblas.

The header files and hipfort modules do not need to be present as a
fallback has been implemented, but if they are, they get used.

Due to the combinations, the basic 1x C/C++, 4x C and 1x Fortran tests
yield 1x C/C++, 14x C and 4x Fortran run-test files.

libgomp/ChangeLog:

* testsuite/lib/libgomp.exp (check_effective_target_openacc_cublas,
check_effective_target_openacc_cudart): Update description as
the check requires more.
(check_effective_target_openacc_libcuda,
check_effective_target_openacc_libcublas,
check_effective_target_openacc_libcudart,
check_effective_target_gomp_hip_header_amd,
check_effective_target_gomp_hip_header_nvidia,
check_effective_target_gomp_hipfort_module,
check_effective_target_gomp_libamdhip64,
check_effective_target_gomp_libhipblas): New.
* testsuite/libgomp.c-c++-common/interop-2.c: New test.
* testsuite/libgomp.c/interop-cublas-full.c: New test.
* testsuite/libgomp.c/interop-cublas-libonly.c: New test.
* testsuite/libgomp.c/interop-cuda-full.c: New test.
* testsuite/libgomp.c/interop-cuda-libonly.c: New test.
* testsuite/libgomp.c/interop-hip-amd-full.c: New test.
* testsuite/libgomp.c/interop-hip-amd-no-hip-header.c: New test.
* testsuite/libgomp.c/interop-hip-nvidia-full.c: New test.
* testsuite/libgomp.c/interop-hip-nvidia-no-headers.c: New test.
* testsuite/libgomp.c/interop-hip-nvidia-no-hip-header.c: New test.
* testsuite/libgomp.c/interop-hip.h: New test.
* testsuite/libgomp.c/interop-hipblas-amd-full.c: New test.
* testsuite/libgomp.c/interop-hipblas-amd-no-hip-header.c: New test.
* testsuite/libgomp.c/interop-hipblas-nvidia-full.c: New test.
* testsuite/libgomp.c/interop-hipblas-nvidia-no-headers.c: New test.
* testsuite/libgomp.c/interop-hipblas-nvidia-no-hip-header.c: New test.
* testsuite/libgomp.c/interop-hipblas.h: New test.
* testsuite/libgomp.fortran/interop-hip-amd-full.F90: New test.
* testsuite/libgomp.fortran/interop-hip-amd-no-module.F90: New test.
* testsuite/libgomp.fortran/interop-hip-nvidia-full.F90: New test.
* testsuite/libgomp.fortran/interop-hip-nvidia-no-module.F90: New test.
* testsuite/libgomp.fortran/interop-hip.h: New test.

3 months agoopts.cc Simplify handling of explicit -flto-partition= and -fipa-reorder-for-locality
Kyrylo Tkachov [Thu, 24 Apr 2025 07:34:09 +0000 (00:34 -0700)] 
opts.cc Simplify handling of explicit -flto-partition= and -fipa-reorder-for-locality

The handling of an explicit -flto-partition= and -fipa-reorder-for-locality
should be simpler.  No need to have a new default option.  We can use opts_set
to check if -flto-partition is explicitly set and use that information in the
error handling.
Remove -flto-partition=default and update accordingly.

Bootstrapped and tested on aarch64-none-linux-gnu.

Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com>
gcc/

* common.opt (LTO_PARTITION_DEFAULT): Delete.
(flto-partition=): Change default back to balanced.
* flag-types.h (lto_partition_model): Remove LTO_PARTITION_DEFAULT.
* opts.cc (validate_ipa_reorder_locality_lto_partition):
Check opts_set->x_flag_lto_partition instead of LTO_PARTITION_DEFAULT.
(finish_options): Remove handling of LTO_PARTITION_DEFAULT.

gcc/testsuite/

* gcc.dg/completion-2.c: Remove check for default.

3 months agoPR modula2/119915: Sprintf1 repeats the entire format string if it starts with a...
Gaius Mulley [Thu, 24 Apr 2025 10:15:18 +0000 (11:15 +0100)] 
PR modula2/119915: Sprintf1 repeats the entire format string if it starts with a directive

This bugfix is for FormatStrings to ensure that in the case of %x, %u the
procedure function PerformFormatString uses Copy rather than Slice to
avoid the case of an upper bound of zero in Slice.  Oddly the %d case
had the correct code.

gcc/m2/ChangeLog:

PR modula2/119915
* gm2-libs/FormatStrings.mod (PerformFormatString): Handle
the %u and %x format specifiers in a similar way to the %d
specifier.  Avoid using Slice and use Copy instead.

gcc/testsuite/ChangeLog:

PR modula2/119915
* gm2/pimlib/run/pass/format2.mod: New test.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
3 months agodwarf2out: Decrease dw_loc_descr_node and dw_attr_struct struct sizes [PR119711]
Jakub Jelinek [Thu, 24 Apr 2025 08:29:34 +0000 (10:29 +0200)] 
dwarf2out: Decrease dw_loc_descr_node and dw_attr_struct struct sizes [PR119711]

As noted by Richi on a large testcase, there are unnecessary paddings
in some heavily used dwarf2out.{h,cc} structures on 64-bit hosts.

struct dw_val_node {
        enum dw_val_class          val_class;            /*     0     4 */

        /* XXX 4 bytes hole, try to pack */

        struct addr_table_entry *  val_entry;            /*     8     8 */
        union dw_val_struct_union  v;                    /*    16    16 */

        /* size: 32, cachelines: 1, members: 3 */
        /* sum members: 28, holes: 1, sum holes: 4 */
        /* last cacheline: 32 bytes */
};
struct dw_loc_descr_node {
        dw_loc_descr_ref           dw_loc_next;          /*     0     8 */
        enum dwarf_location_atom   dw_loc_opc:8;         /*     8: 0  4 */
        unsigned int               dtprel:1;             /*     8: 8  4 */
        unsigned int               frame_offset_rel:1;   /*     8: 9  4 */

        /* XXX 22 bits hole, try to pack */

        int                        dw_loc_addr;          /*    12     4 */
        struct dw_val_node         dw_loc_oprnd1;        /*    16    32 */
        struct dw_val_node         dw_loc_oprnd2;        /*    48    32 */

        /* size: 80, cachelines: 2, members: 7 */
        /* sum members: 76 */
        /* sum bitfield members: 10 bits, bit holes: 1, sum bit holes: 22 bits */
        /* last cacheline: 16 bytes */
};
struct dw_attr_struct {
        enum dwarf_attribute       dw_attr;              /*     0     4 */

        /* XXX 4 bytes hole, try to pack */

        struct dw_val_node         dw_attr_val;          /*     8    32 */

        /* size: 40, cachelines: 1, members: 2 */
        /* sum members: 36, holes: 1, sum holes: 4 */
        /* last cacheline: 40 bytes */
};

The following patch is an (admittedly not very clean) attempt to decrease
size of dw_loc_descr_node from 80 bytes to 72 and (more importantly)
dw_attr_struct from 40 bytes to 32 by moving the dw_attr member from
dw_attr_struct into dw_attr_val's padding and similarly move
dw_loc_opc/dtprel/frame_offset_rel members into dw_loc_oprnd1 padding
and dw_loc_addr into dw_loc_oprnd2 padding.
All we need to ensure is that nothing tries to copy whole dw_val_node
structs unless it is copied as part of whole dw_loc_descr_node or
dw_attr_struct copy.
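
The idea can be shown with a reduced C sketch. All type, field and macro names below are invented for the illustration and are much simpler than the real dwarf2out structures; the sizes in the comments assume a typical LP64 host.

```c
enum val_class { VAL_NONE, VAL_ADDR };
enum attr_code { ATTR_NAME, ATTR_TYPE };

union payload { void *ptr; unsigned long long words[2]; };

/* Before: the value node has a 4-byte hole after VAL_CLASS, and the
   owning struct adds its own 4-byte field plus another hole.  */
struct val_old
{
  enum val_class val_class;   /* 4 bytes, then 4 bytes of padding */
  void *val_entry;
  union payload v;
};
struct attr_old
{
  enum attr_code attr;        /* 4 bytes, then 4 bytes of padding */
  struct val_old val;         /* 32 bytes; 40 in total */
};

/* After: the owner's field lives in the value node's former hole.  */
struct val_new
{
  enum val_class val_class;
  union { enum attr_code attr; } u;   /* storage lent to the owner */
  void *val_entry;
  union payload v;
};
struct attr_new
{
  struct val_new val;         /* 32 bytes in total */
};
#define ATTR_CODE(a) ((a)->val.u.attr)

/* Copy only the value proper, never the owner's data kept in U.  */
void
copy_value (struct val_new *dst, const struct val_new *src)
{
  dst->val_class = src->val_class;
  dst->val_entry = src->val_entry;
  dst->v = src->v;
}
```

A plain struct assignment of one val_new to another would overwrite the destination owner's ATTR_CODE, which is why the copy has to be done member by member.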

To verify that wasn't the case, I've temporarily added a deleted copy ctor
to dw_val_node and then looked at all the errors/warnings caused by that,
and those were just from memcpy/memmove or structure assignments of whole
dw_loc_descr_node/dw_attr_struct.

2025-04-24  Jakub Jelinek  <jakub@redhat.com>

PR debug/119711
* dwarf2out.h (struct dw_val_node): Add u member.
(struct dw_loc_descr_node): Remove dw_loc_opc, dtprel,
frame_offset_rel and dw_loc_addr members.
(dw_loc_opc, dw_loc_dtprel, dw_loc_frame_offset_rel, dw_loc_addr):
Define.
(struct dw_attr_struct): Remove dw_attr member.
(dw_attr): Define.
* dwarf2out.cc (loc_descr_equal_p_1): Use dw_loc_dtprel instead of
dtprel.
(output_loc_operands, new_addr_loc_descr, loc_checksum,
loc_checksum_ordered): Likewise.
(resolve_args_picking_1): Use dw_loc_frame_offset_rel instead of
frame_offset_rel.
(loc_list_from_tree_1): Likewise.
(resolve_addr_in_expr): Use dw_loc_dtprel instead of dtprel.
(copy_deref_exprloc): Copy val_class, val_entry and v members
instead of whole dw_loc_oprnd1 and dw_loc_oprnd2.
(optimize_string_length): Copy val_class, val_entry and v members
instead of whole dw_attr_val.
(hash_loc_operands): Use dw_loc_dtprel instead of dtprel.
(compare_loc_operands, compare_locs): Likewise.

3 months agotarget: [PR103750] Also handle avx512 kmask & immediate 15 or 3 when VF is 4/2.
liuhongt [Tue, 8 Apr 2025 06:50:53 +0000 (23:50 -0700)] 
target: [PR103750] Also handle avx512 kmask & immediate 15 or 3 when VF is 4/2.

Since the upper bits are already cleared by the comparison
instructions.
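
The reasoning can be checked with a plain C simulation that uses no AVX512 intrinsics. The function below is a made-up stand-in for a 4-lane vector compare writing an 8-bit mask: since only four lanes exist, bits 4 to 7 of the result are always zero and a following AND with 15 changes nothing.

```c
#include <stdint.h>

/* Simulated 4-lane compare: bit I of the result is set when
   A[I] == B[I].  Bits 4..7 are never set.  */
uint8_t
cmp4_mask (const int32_t a[4], const int32_t b[4])
{
  uint8_t m = 0;
  for (int i = 0; i < 4; i++)
    if (a[i] == b[i])
      m |= (uint8_t) (1u << i);
  return m;
}
```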

gcc/ChangeLog:
PR target/103750
* config/i386/sse.md (*<avx512>_cmp<mode>3_and15): New define_insn.
(*<avx512>_ucmp<mode>3_and15): Ditto.
(*<avx512>_cmp<mode>3_and3): Ditto.
(*avx512vl_ucmpv2di3_and3): Ditto.
(*<avx512>_cmp<V48H_AVX512VL:mode>3_zero_extend<SWI248x:mode>):
Change operands[3] predicate to <cmp_imm_predicate>.
(*<avx512>_cmp<V48H_AVX512VL:mode>3_zero_extend<SWI248x:mode>_2):
Ditto.
(*<avx512>_cmp<mode>3): Add GET_MODE_NUNITS (<MODE>mode) >= 8
to the condition.
(*<avx512>_ucmp<mode>3): Ditto.
(V48_AVX512VL_4): New mode iterator.
(VI48_AVX512VL_4): Ditto.
(V8_AVX512VL_2): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/avx512vl-pr103750-1.c: New test.
* gcc.target/i386/avx512f-pr96891-3.c: Adjust testcase.
* gcc.target/i386/avx512f-vpcmpgtuq-1.c: Ditto.
* gcc.target/i386/avx512vl-vpcmpeqq-1.c: Ditto.
* gcc.target/i386/avx512vl-vpcmpequq-1.c: Ditto.
* gcc.target/i386/avx512vl-vpcmpgeq-1.c: Ditto.
* gcc.target/i386/avx512vl-vpcmpgeuq-1.c: Ditto.
* gcc.target/i386/avx512vl-vpcmpgtq-1.c: Ditto.
* gcc.target/i386/avx512vl-vpcmpgtuq-1.c: Ditto.
* gcc.target/i386/avx512vl-vpcmpleq-1.c: Ditto.
* gcc.target/i386/avx512vl-vpcmpleuq-1.c: Ditto.
* gcc.target/i386/avx512vl-vpcmpltq-1.c: Ditto.
* gcc.target/i386/avx512vl-vpcmpltuq-1.c: Ditto.
* gcc.target/i386/avx512vl-vpcmpneqq-1.c: Ditto.
* gcc.target/i386/avx512vl-vpcmpnequq-1.c: Ditto.

3 months agoPR modula2/119914 No error message generated when passing a Ztype to an unbounded...
Gaius Mulley [Thu, 24 Apr 2025 01:39:36 +0000 (02:39 +0100)] 
PR modula2/119914 No error message generated when passing a Ztype to an unbounded array

This patch detects constants of type ZType, RType or CType being passed to unbounded
arrays and generates an error message highlighting the formal and
actual parameters in error.

gcc/m2/ChangeLog:

PR modula2/119914
* gm2-compiler/M2Check.mod (checkConstMeta): Add check for
Ztype, Rtype and Ctype and unbounded arrays.
(IsZRCType): New procedure function.
(isZRC): Add comment.
* gm2-compiler/M2Quads.mod:
* gm2-compiler/M2Range.mod (gdbinit): New procedure.
(BreakWhenRangeCreated): Ditto.
(CheckBreak): Ditto.
(InitRange): Call CheckBreak.
(Init): Add gdbhook and initialize interactive watch point.
* gm2-compiler/SymbolTable.def (GetNthParamAnyClosest): New
procedure function.
* gm2-compiler/SymbolTable.mod (BreakSym): Remove constant.
(BreakSym): Add Variable.
(stop): Remove.
(gdbhook): New procedure.
(BreakWhenSymCreated): Ditto.
(CheckBreak): Ditto.
(NewSym): Call CheckBreak.
(Init): Add gdbhook and initialize interactive watch point.
(MakeProcedure): Replace guarded call to stop with CheckBreak.
(GetNthParamChoice): New procedure function.
(GetNthParamOrdered): Ditto.
(GetNthParamAnyClosest): Ditto.
(GetOuterModuleScope): Ditto.

gcc/testsuite/ChangeLog:

PR modula2/119914
* gm2/pim/fail/constintarraybyte.mod: New test.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
3 months agoRegenerate gcc.pot
Joseph Myers [Wed, 23 Apr 2025 19:25:23 +0000 (19:25 +0000)] 
Regenerate gcc.pot

* gcc.pot: Regenerate.

3 months agotestsuite: Require fstack_protector for no-stack-protector-attr-3.C
Dimitar Dimitrov [Wed, 12 Mar 2025 20:22:45 +0000 (22:22 +0200)] 
testsuite: Require fstack_protector for no-stack-protector-attr-3.C

The test fails on pru-unknown-elf with:
   cc1plus: warning: '-fstack-protector' not supported for this target

Even though the compiled functions have the feature disabled using an
attribute, the command line option is still not supported by some targets.

Tested x86_64-pc-linux-gnu and ensured that g++.sum is the same with and
without this patch.

gcc/testsuite/ChangeLog:

* g++.dg/no-stack-protector-attr-3.C: Require effective target
fstack_protector.

Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
3 months agoEnable ip-cp cloning over non-hot edges
Jan Hubicka [Wed, 23 Apr 2025 16:39:14 +0000 (18:39 +0200)] 
Enable ip-cp cloning over non-hot edges

Currently enabling profile feedback regresses x264 and exchange. In both cases the root of the
issue is that the ipa-cp cost model thinks cloning is not relevant when feedback is available
while it clones without feedback.

Consider:

__attribute__ ((used))
int a[1000];

__attribute__ ((noinline))
void
test2(int sz)
{
  for (int i = 0; i < sz; i++)
  a[i]++;
  asm volatile (""::"m"(a));
}

__attribute__ ((noinline))
void
test1 (int sz)
{
  for (int i = 0; i < 1000; i++)
  test2(sz);
}
int main()
{
test1(1000);
return 0;
}

Here we want to clone both test1 and test2 and specialize for 1000, but
ipa-cp will not do that, since it will skip the call main->test1 as not hot since
it is called just once both with and without profile feedback.
In this simple testcase even without profile feedback we will track that main
is called once.

I think the testcase shows that hotness of a call is not that relevant when
deciding whether we want to propagate constants across it.  ipa-cp with IPA
profile can compute an overall estimate of time saved (which is the existing time
benefit, i.e. the time saved per invocation of the function, multiplied by the
number of executions) and see if the result is big enough. An easy check is to
simply call maybe_hot_p on the resulting count.

So this patch makes ipa-cp consider all call sites except those known to be
unlikely executed (i.e. run 0 times in the train run or known to lead to something
bad) as interesting, which makes ipa-cp propagate across them, find cloning
candidates and feed them into good_cloning_opportunity_p.

For this I added cs_interesting_for_ipcp_p which also attempts to do the right
thing with partial training.

Now good_cloning_opportunity_p will currently return false, since it will figure
out that the call edge is not very frequent.
It already kind of knows that the frequency of the call instruction itself is not too
important, but instead of computing overall time saved, it tries to compare it
with param_ipa_cp_profile_count_base percentage of counts of call edges.  I
think this is not very relevant since estimated time saved per call can be
large.  So I dropped this logic and replaced it with simple use of overall
saved time.

Since ipa-cp is not dealing well with the cases where it hits the allowed unit
growth limit, we probably want to be more careful, so I keep existing metric
with this change.

So now we get:

Evaluating opportunities for test1/3.
 - considering value 1000 for param #0 sz (caller_count: 1)
     good_cloning_opportunity_p (time: 1, size: 8, count_sum: 1 (precise), overall time saved: 1 (adjusted)) -> evaluation: 0.12, threshold: 500
     not cloning: time saved is not hot
     good_cloning_opportunity_p (time: 129001, size: 20, count_sum: 1 (precise), overall time saved: 129001 (adjusted)) -> evaluation: 6450.05, threshold: 500

The first call to good_cloning_opportunity_p considers the case where only test1 is
cloned. In this case time saved is 1 (for passing the value around) and since
it is called just once (count_sum) overall time saved is 1 which is not
considered hot and we also get a very low evaluation score.

In the second call we consider the cloning chain test1->test2.  In this case time
saved is large (129001) since test2 is invoked many times and it is used to
control the loop.  We still know that the count is 1 but overall time is
129001 which is already considered relevant and we clone.
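
The arithmetic in the dump can be reproduced with a short C sketch. The overall time saved is the per-invocation saving multiplied by the number of executions, as described above. The evaluation formula (overall time saved divided by size) is an assumption inferred from the two dump lines, not taken from the ipa-cp sources, and the function names are hypothetical.

```c
/* Overall time saved: per-invocation saving times execution count.  */
double
overall_time_saved (double time_per_call, double count_sum)
{
  return time_per_call * count_sum;
}

/* Assumption inferred from the dump: the evaluation score is the
   overall time saved divided by the size cost of the clone.  */
double
evaluation (double time_per_call, double count_sum, double size)
{
  return overall_time_saved (time_per_call, count_sum) / size;
}

/* Nonzero when the score reaches THRESHOLD.  */
int
worth_cloning (double time_per_call, double count_sum, double size,
               double threshold)
{
  return evaluation (time_per_call, count_sum, size) >= threshold;
}
```

With the dump's numbers, time 1 and size 8 give a score of 0.125, far below the threshold of 500, while time 129001 and size 20 give 6450.05, which clears it.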

I also try to do something sensible in case we have calls both with
and without IPA profile (which can happen for comdats where profile got missing
or with LTO if some units were not trained).
Instead of checking whether sum of calls with known profile is nonzero, I keep
track if there are other calls and if so, also try the local heuristics that
is used without profile feedback.

The patch improves SPECint with -Ofast -fprofile-use by approx 1% by speeding
up x264 from 99.3s to 91.3s (9%) and exchange from 99.7s to 95.5s (3.3%).

We still get better x264 runtime without profile (86.4s for x264 and 93.8s for exchange).

The main problem I see is that ipa-cp has the global limit for growth of 10%
but does not consider the opportunities in priority order.  Consequently if the
limit is hit, randomly some clone opportunities are dropped in favour of
others.

I dumped unit size changes with -flto -Ofast build of SPEC2017. Without patch I get:

orig new growth
588677 605385 102.838229
4378 6037 137.894016
484650 494851 102.104818
4111 4111 100.000000
99953 103519 103.567677
106181 114889 108.201091
21389 21597 100.972462
24925 26746 107.305918
15308 23974 156.610922
27354 27906 102.017986
494 494 100.000000
4631 4631 100.000000
863216 872729 101.102042
126604 126604 100.000000
605138 627156 103.638509
4112 4112 100.000000
222006 231293 104.183220
2952 3384 114.634146
37584 39807 105.914751
4111 4111 100.000000
13226 13226 100.000000
4111 4111 100.000000
326215 337396 103.427494
25240 25433 100.764659
64644 65972 102.054328
127223 132300 103.990631
494 494 100.000000

Small units can grow up to 16000 instructions and other units are
large. So there is only one 156% growth hitting limits which is exchange
that has recursive cloning that is handled specially.

With profile feedback ipa-cp basically shuts itself off:

333815 333891 100.022767
2559 2974 116.217272
217576 217581 100.002298
2749 2749 100.000000
64652 64716 100.098992
68416 69707 101.886986
13171 13171 100.000000
11849 11849 100.000000
10519 16180 153.816903
15843 15843 100.000000
231 231 100.000000
3624 3624 100.000000
573385 573386 100.000174
97623 97623 100.000000
295673 295676 100.001015
2750 2750 100.000000
130723 130726 100.002295
2334 2334 100.000000
19313 19313 100.000000
2749 2749 100.000000
517331 517331 100.000000
6707 6707 100.000000
2749 2749 100.000000
193638 193638 100.000000
16425 16425 100.000000
47154 47154 100.000000
96422 96422 100.000000
231 231 100.000000

So we essentially clone only exchange and mcf (116%).
With patch and no FDO I get:

588677 605385 102.838229
4378 6037 137.894016
484519 494698 102.100846
4111 4111 100.000000
99953 103519 103.567677
106181 114889 108.201091
21389 22632 105.811398
24854 26620 107.105496
15308 23974 156.610922
27354 28039 102.504204
494 494 100.000000
4631 4631 100.000000
4631 4631 100.000000
126604 126630 100.020536
4112 4112 100.000000
222006 231293 104.183220
2952 3384 114.634146
37584 39807 105.914751
2760715 2835539 102.710312
4111 4111 100.000000
13226 13226 100.000000
4111 4111 100.000000
326215 337396 103.427494
25240 25433 100.764659
64644 65972 102.054328
127223 132300 103.990631
494 494 100.000000

which seems essentially the same as without the patch. However, with FDO I get:
333815 350363 104.957237
2559 3345 130.715123
217469 220765 101.515618
485599 488772 100.653420
2749 2749 100.000000
64652 74265 114.868836
68416 87484 127.870674
13171 20656 156.829398
11792 11990 101.679104
10519 17028 161.878506
15843 16119 101.742094
231 231 100.000000
573336 573336 100.000000
97623 97623 100.000000
295497 296208 100.240612
2750 2750 100.000000
130723 133341 102.002708
2334 2334 100.000000
19313 19368 100.284782
2749 2749 100.000000
6707 6755 100.715670
2749 2749 100.000000
193638 194712 100.554643
16425 17377 105.796043
47154 47154 100.000000
96422 96422 100.000000
231 231 100.000000

So here we get 114% and 127% growth in x264 (two different binaries),
56% growth in Deepsjeng and 61% growth in Exchange, which are all above the
10% cutoff.

Bootstrapped/regtested x86_64-linux.

gcc/ChangeLog:

* ipa-cp.cc (base_count): Remove.
(struct caller_statistics): Rename n_hot_calls to n_interesting_calls;
add called_without_ipa_profile.
(init_caller_stats): Update.
(cs_interesting_for_ipcp_p): New function.
(gather_caller_stats): Collect n_interesting_calls and
called_without_profile.
(ipcp_cloning_candidate_p): Use n_interesting_calls rather than hot.
(good_cloning_opportunity_p): Rewrite heuristics when IPA profile is
present.
(estimate_local_effects): Update.
(value_topo_info::propagate_effects): Update.
(compare_edge_profile_counts): Remove.
(ipcp_propagate_stage): Do not collect base_count.
(get_info_about_necessary_edges): Record whether function is called
without profile.
(decide_about_value): Update.
(ipa_cp_cc_finalize): Do not initialize base_count.
* profile-count.cc (profile_count::operator*): New.
(profile_count::operator*=): New.
* profile-count.h (profile_count::operator*): Declare.
(profile_count::operator*=): Declare.
* params.opt: Remove ipa-cp-profile-count-base.
* doc/invoke.texi: Likewise.

3 months agoCost truth_value exprs in i386 vectorizer costs.
Jan Hubicka [Wed, 23 Apr 2025 15:04:32 +0000 (17:04 +0200)] 
Cost truth_value exprs in i386 vectorizer costs.

This patch implements costing of truth_value exprs.  I.e.
  a = b < c;
Those now seem to be the most common operations that go to the addss path
except for int->fp and fp->int conversions.

For integer we use setcc, for FP there is CMccSS and variants which set the
destination register as a mask (i.e. -1 on true and 0 on false).  Technically
these need res&1 to get 1 on true, 0 on false, but looking at examples
where this is used, it is common that the resulting code is optimized to avoid
the need for this (except for cases where the result is directly saved to memory).
For this reason I am accounting for only one sse_op (CMccSS) itself.
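
The mask-versus-truth-value distinction above can be modelled in plain C++. This is only a sketch: compare_mask_lt and truth_value_lt are hypothetical helpers that imitate the all-ones/all-zeros result of a scalar SSE compare and do not use any real intrinsics.

```cpp
#include <cassert>
#include <cstdint>

// Models the mask a scalar SSE compare leaves in the destination:
// all bits set (-1) when the comparison is true, 0 when it is false.
std::uint32_t compare_mask_lt(float b, float c)
{
  return (b < c) ? 0xFFFFFFFFu : 0u;
}

// Turns the mask into the C truth value (1 or 0) with the res & 1 step.
int truth_value_lt(float b, float c)
{
  return static_cast<int>(compare_mask_lt(b, c) & 1u);
}
```

When the mask is consumed directly, for example as the selector of a blend, the extra AND is not needed, which matches the reasoning for costing only the compare.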

gcc/ChangeLog:

* config/i386/i386.cc (ix86_vector_costs::add_stmt_cost): Cost truth_value
exprs.

3 months agoUpdate gcc sv.po
Joseph Myers [Wed, 23 Apr 2025 15:01:42 +0000 (15:01 +0000)] 
Update gcc sv.po

* sv.po: Update.

3 months agolibstdc++: Update baseline symbols for powerpc-linux and powerpc64-linux
Andreas Schwab [Wed, 23 Apr 2025 13:24:40 +0000 (15:24 +0200)] 
libstdc++: Update baseline symbols for powerpc-linux and powerpc64-linux

* config/abi/post/powerpc-linux-gnu/baseline_symbols.txt: Update.
* config/abi/post/powerpc64-linux-gnu/32/baseline_symbols.txt: Update.
* config/abi/post/powerpc64-linux-gnu/baseline_symbols.txt: Update.

3 months agotestsuite: aarch64: arm: Enable vld1x?.c and vst1x?.c on arm [PR71233]
Christophe Lyon [Fri, 14 Mar 2025 15:04:29 +0000 (15:04 +0000)] 
testsuite: aarch64: arm: Enable vld1x?.c and vst1x?.c on arm [PR71233]

r14-7202-gc8ec3e1327cb1e added vld1xN and vst1xN intrinsics and some
tests on arm, but didn't enable some existing tests.

Since these tests are shared with aarch64, this patch removes the
'dg-skip-if "unimplemented" { arm*-*-* }' directives and relies on the
advsimd-intrinsics.exp driver to define the appropriate flags and
dg-do-what action.  (A previous patch removed 'dg-do run', and this
patch removes 'dg-options "-O3"' which would override the options
computed by the test driver)

float16 intrinsics require the neon-fp16 FPU, which is possibly
enabled by advsimd-intrinsics.exp, so we include them unconditionally
on aarch64 or if fp16 is enabled on arm.

poly64 intrinsics would require crypto-neon-fp-armv8: the patch
enables the corresponding tests on aarch64 only, since for arm they
are already covered by other tests in gcc.target/arm/simd/.  For some
reason, poly64 tests were missing from x2 and x3 tests, so the patch
adds them as needed.

Tested on aarch64-linux-gnu (no change), arm-linux-gnueabihf (the
additional tests are executed) and various flavors of arm-none-eabi
(the additional tests are compiled-only on M-profile, executed on
A-profile).

gcc/testsuite/
PR target/71233
* gcc.target/aarch64/advsimd-intrinsics/vld1x2.c: Enable on arm.
* gcc.target/aarch64/advsimd-intrinsics/vld1x3.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vld1x4.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vst1x2.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vst1x3.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vst1x4.c: Likewise.

3 months agotestsuite: Skip g++.dg/eh/pr119507.C on Solaris/SPARC with as
Rainer Orth [Wed, 23 Apr 2025 11:09:40 +0000 (13:09 +0200)] 
testsuite: Skip g++.dg/eh/pr119507.C on Solaris/SPARC with as

The new g++.dg/eh/pr119507.C test FAILs on Solaris/SPARC with the native as:

FAIL: g++.dg/eh/pr119507.C  -std=gnu++17  scan-assembler-times .section[\\t ][^\\n]*.gcc_except_table._Z6comdatv 1
FAIL: g++.dg/eh/pr119507.C  -std=gnu++17  scan-assembler-times .section[\\t ][^\\n]*.gcc_except_table._Z7comdat1v 1
FAIL: g++.dg/eh/pr119507.C  -std=gnu++26  scan-assembler-times .section[\\t ][^\\n]*.gcc_except_table._Z6comdatv 1
FAIL: g++.dg/eh/pr119507.C  -std=gnu++26  scan-assembler-times .section[\\t ][^\\n]*.gcc_except_table._Z7comdat1v 1
FAIL: g++.dg/eh/pr119507.C  -std=gnu++98  scan-assembler-times .section[\\t ][^\\n]*.gcc_except_table._Z6comdatv 1
FAIL: g++.dg/eh/pr119507.C  -std=gnu++98  scan-assembler-times .section[\\t ][^\\n]*.gcc_except_table._Z7comdat1v 1

This happens because the syntax for COMDAT sections is vastly different
from the one used by gas.

Rather than trying to handle this, this patch just skips the test.

Tested on sparc-sun-solaris2.11 with both as and gas,
i386-pc-solaris2.11, and x86_64-pc-linux-gnu.

2025-04-23  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

gcc/testsuite:
* g++.dg/eh/pr119507.C: Skip on sparc*-*-solaris2* && !gas.

3 months agolibstdc++: fix possible undefined atomic lock-free type aliases in module std
ZENG Hao [Sun, 20 Apr 2025 09:02:16 +0000 (17:02 +0800)] 
libstdc++: fix possible undefined atomic lock-free type aliases in module std

When building for 'i386-*' targets, all basic types are 'sometimes lock-free'
and thus std::atomic_signed_lock_free and std::atomic_unsigned_lock_free are
not declared. In the header <atomic>, they are placed in preprocessor
condition __cpp_lib_atomic_lock_free_type_aliases. In module std, they should
be the same.
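
A minimal sketch of how user code can apply the same guard. The names counter_type, have_lock_free_aliases and bump are hypothetical; only the feature-test macro and the std::atomic types come from the standard library.

```cpp
#include <atomic>
#include <cassert>

// Use the lock-free alias only when the library advertises it; otherwise
// fall back to a plain std::atomic<unsigned>.
#ifdef __cpp_lib_atomic_lock_free_type_aliases
using counter_type = std::atomic_unsigned_lock_free;
constexpr bool have_lock_free_aliases = true;
#else
using counter_type = std::atomic<unsigned>;
constexpr bool have_lock_free_aliases = false;
#endif

// Atomically increments the counter and returns the new value.
counter_type::value_type bump(counter_type &c)
{
  return c.fetch_add(1) + 1;
}
```

Code written this way compiles whether or not the aliases are declared for the target.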

libstdc++-v3/ChangeLog:

* src/c++23/std.cc.in (atomic_signed_lock_free): Guard with
preprocessor check for __cpp_lib_atomic_lock_free_type_aliases.
(atomic_unsigned_lock_free): Likewise.

3 months agoFortran: Use correct location in check of coarray functions [PR119200]
Andre Vehreschild [Tue, 22 Apr 2025 08:11:52 +0000 (10:11 +0200)] 
Fortran: Use correct location in check of coarray functions [PR119200]

Use gfc_current_intrinsic_where during check(), because
gfc_current_locus is not set to the correct location, or not set at all.

PR fortran/119200

gcc/fortran/ChangeLog:

* check.cc (gfc_check_lcobound): Use locus from intrinsic_where.
(gfc_check_image_index): Same.
(gfc_check_num_images): Same.
(gfc_check_team_number): Same.
(gfc_check_this_image): Same.
(gfc_check_ucobound): Same.

3 months agotestsuite: AMDGCN test for vect-early-break_38.c as well to consistent architecture...
Tamar Christina [Wed, 23 Apr 2025 07:07:23 +0000 (08:07 +0100)] 
testsuite: AMDGCN test for vect-early-break_38.c as well to consistent architecture [PR119286]

I had missed this one during the AMDGCN test failures.

Like vect-early-break_18.c, this test is also scalarizing the
loads and thus leading to unexpected vectorization for this
testcase.

gcc/testsuite/ChangeLog:

PR target/119286
* gcc.dg/vect/vect-early-break_38.c: Force -march=gfx908 for amdgcn.

3 months agoOpenMP: Add libgomp.fortran/target-enter-data-8.f90
Tobias Burnus [Wed, 23 Apr 2025 07:03:00 +0000 (09:03 +0200)] 
OpenMP: Add libgomp.fortran/target-enter-data-8.f90

Add another testcase for Fortran deep mapping of allocatable components.

libgomp/ChangeLog:

* testsuite/libgomp.fortran/target-enter-data-8.f90: New test.

3 months agoAccept allones or 0 operand for vcond_mask op1.
liuhongt [Mon, 31 Mar 2025 03:15:41 +0000 (20:15 -0700)] 
Accept allones or 0 operand for vcond_mask op1.

Since ix86_expand_sse_movcc will simplify them into a simple vmov, vpand
or vpandn.

gcc/ChangeLog:

* config/i386/predicates.md (vector_or_0_or_1s_operand): New predicate.
(nonimm_or_0_or_1s_operand): Ditto.
* config/i386/sse.md (vcond_mask_<mode><sseintvecmodelower>):
Extend the predicate of operands[1] to accept 0 or allones
operands.
(vcond_mask_<mode><sseintvecmodelower>): Ditto.
(vcond_mask_v1tiv1ti): Ditto.
(vcond_mask_<mode><sseintvecmodelower>): Ditto.
* config/i386/i386.md (mov<mode>cc): Ditto for operands[2] and
operands[3].
* config/i386/i386-expand.cc (ix86_expand_sse_fp_minmax):
Force immediate_operand to register.

gcc/testsuite/ChangeLog:

* gcc.target/i386/blendv-to-maxmin.c: New test.
* gcc.target/i386/blendv-to-pand.c: New test.

3 months agoDaily bump.
GCC Administrator [Wed, 23 Apr 2025 00:18:18 +0000 (00:18 +0000)] 
Daily bump.

3 months agoFix vectorizer costs of COND_EXPR, MIN_EXPR, MAX_EXPR, ABS_EXPR, ABSU_EXPR
Jan Hubicka [Tue, 22 Apr 2025 21:47:14 +0000 (23:47 +0200)] 
Fix vectorizer costs of COND_EXPR, MIN_EXPR, MAX_EXPR, ABS_EXPR, ABSU_EXPR

This patch adds special cases for vectorizer costs in COND_EXPR, MIN_EXPR,
MAX_EXPR, ABS_EXPR and ABSU_EXPR.   We previously costed ABS_EXPR and ABSU_EXPR
but it was only correct for the FP variant (where it corresponds to andss clearing
the sign bit).  Integer abs/absu is open coded as a conditional move for SSE2 and
SSE3 introduced an instruction.

MIN_EXPR/MAX_EXPR compile to minss/maxss for FP and according to Agner Fog's
tables they cost the same as sse_op on all targets. The integer variants translate
to a single instruction since SSE3.

COND_EXPR translates to an open-coded conditional move for SSE2, SSE4.1 simplified
the sequence and AVX512 introduced mask registers.
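
To make the FP versus integer difference concrete, here is a scalar sketch of the two abs strategies mentioned above. fabs_by_mask and iabs_by_select are hypothetical helpers, not code from the patch, and the float version assumes IEEE-754 binary32.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// FP abs as a single AND that clears the sign bit (what andss with a
// constant mask does).  Assumes float is IEEE-754 binary32.
float fabs_by_mask(float x)
{
  std::uint32_t bits;
  std::memcpy(&bits, &x, sizeof bits);
  bits &= 0x7FFFFFFFu;
  std::memcpy(&x, &bits, sizeof x);
  return x;
}

// Integer abs open coded as compare + negate + select, the shape of a
// conditional-move sequence.  The negation is done on the unsigned type
// to avoid signed overflow for INT32_MIN.
std::int32_t iabs_by_select(std::int32_t x)
{
  const std::int32_t neg =
      static_cast<std::int32_t>(0u - static_cast<std::uint32_t>(x));
  return x < 0 ? neg : x;
}
```

The FP form is one logical operation, while the integer form needs several, which is why a single cost for both was only right for FP.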

gcc/ChangeLog:

* config/i386/i386.cc (ix86_vector_costs::add_stmt_cost): Add special cases
for COND_EXPR; make MIN_EXPR, MAX_EXPR, ABS_EXPR and ABSU_EXPR more realistic.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr89618-2.c: XFAIL.

3 months agors6000: Ignore OPTION_MASK_SAVE_TOC_INDIRECT differences in inlining decisions [PR119327]
Jakub Jelinek [Tue, 22 Apr 2025 19:27:28 +0000 (21:27 +0200)] 
rs6000: Ignore OPTION_MASK_SAVE_TOC_INDIRECT differences in inlining decisions [PR119327]

The following testcase FAILs because the always_inline function can't
be inlined.
The rs6000 backend has, similarly to other targets, a hook which rejects
inlining which would bring in new ISAs which aren't there in the caller.
And this hook rejects this because of OPTION_MASK_SAVE_TOC_INDIRECT
differences.
This flag is set if explicitly requested or by default depending on
whether the current function looks hot (or at least not cold):
  if ((rs6000_isa_flags_explicit & OPTION_MASK_SAVE_TOC_INDIRECT) == 0
      && flag_shrink_wrap_separate
      && optimize_function_for_speed_p (cfun))
    rs6000_isa_flags |= OPTION_MASK_SAVE_TOC_INDIRECT;
The target nodes that are being compared here are actually the default
target node (which was created when cfun was NULL) vs. one that was
created for the always_inline function when it wasn't NULL, so one
doesn't have it, the other does.
In any case, this flag feels like a tuning decision rather than a hard
ISA requirement and I see no reason why we couldn't inline
even an explicit -msave-toc-indirect function into a -mno-save-toc-indirect
one or vice versa.
We already ignore OPTION_MASK_P{8,10}_FUSION which are also more
like tuning flags.
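
The idea can be sketched as a subset test on flag words that masks out tuning-only bits first. This is an assumption-laden illustration, not the rs6000_can_inline_p implementation: the flag constants, their values and can_inline are all hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag bits; the real OPTION_MASK_* values differ.
constexpr std::uint64_t MASK_VSX = 1u << 0;
constexpr std::uint64_t MASK_P8_FUSION = 1u << 1;
constexpr std::uint64_t MASK_SAVE_TOC_INDIRECT = 1u << 2;

// Bits treated as tuning decisions and ignored when comparing targets.
constexpr std::uint64_t TUNING_MASKS = MASK_P8_FUSION | MASK_SAVE_TOC_INDIRECT;

// Inlining is allowed when every non-tuning ISA bit the callee needs is
// also enabled in the caller.
bool can_inline(std::uint64_t caller_isa, std::uint64_t callee_isa)
{
  const std::uint64_t callee_required = callee_isa & ~TUNING_MASKS;
  return (caller_isa & callee_required) == callee_required;
}
```

Under this model a difference in the save-toc-indirect bit alone never blocks inlining, while a missing ISA bit in the caller still does.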

2025-04-22  Jakub Jelinek  <jakub@redhat.com>

PR target/119327
* config/rs6000/rs6000.cc (rs6000_can_inline_p): Ignore also
OPTION_MASK_SAVE_TOC_INDIRECT differences.

* g++.dg/opt/pr119327.C: New test.

3 months agoRevert "libstdc++: Optimize std::projected<I, std::identity>" [PR119888]
Patrick Palka [Tue, 22 Apr 2025 16:52:34 +0000 (12:52 -0400)] 
Revert "libstdc++: Optimize std::projected<I, std::identity>" [PR119888]

This non-standard optimization breaks real-world code that expects the
result of std::projected to always (be a class type and) have a value_type
member, which isn't true for e.g. I=int*, so revert it for now.

PR libstdc++/119888

This reverts commit 51761c50f843d5be4e24172535e4524b5072f24c.

3 months agoaarch64: Define __ARM_FEATURE_FAMINMAX
Richard Sandiford [Tue, 22 Apr 2025 16:19:15 +0000 (17:19 +0100)] 
aarch64: Define __ARM_FEATURE_FAMINMAX

We implemented FAMINMAX ACLE support but failed to define the
associated feature macro.

gcc/
* config/aarch64/aarch64-c.cc (aarch64_update_cpp_builtins): Define
__ARM_FEATURE_FAMINMAX.

gcc/testsuite/
* gcc.target/aarch64/pragma_cpp_predefs_4.c: Test
__ARM_FEATURE_FAMINMAX.

3 months agoInduction vectorizer: prevent ICE for scalable types
Spencer Abson [Thu, 20 Mar 2025 12:18:57 +0000 (12:18 +0000)] 
Induction vectorizer: prevent ICE for scalable types

We currently check that the target supports PLUS_EXPR and MINUS_EXPR
with step_vectype (a fix for pr103523).  However, vectorizable_induction
can emit a vectorized MULT_EXPR when calculating the step of each IV for
SLP, and both MULT_EXPR/FLOAT_EXPR when calculating VEC_INIT for float
inductions.

gcc/ChangeLog:

* tree-vect-loop.cc (vectorizable_induction): Add target support
checks for vectorized MULT_EXPR and FLOAT_EXPR where necessary for
scalable types.
Prefer target_supports_op_p over directly_supports_p for these tree
codes.
(vectorizable_nonlinear_induction): Fix a doc comment while I'm
here.

3 months agoAArch64: Emit half-precision FCMP/FCMPE
Spencer Abson [Fri, 31 Jan 2025 19:05:57 +0000 (19:05 +0000)] 
AArch64: Emit half-precision FCMP/FCMPE

Enable a target with FEAT_FP16 to emit the half-precision variants
of FCMP/FCMPE.

gcc/ChangeLog:

* config/aarch64/aarch64.md: Update cbranch, cstore, fcmp
and fcmpe to use the GPF_F16 iterator for floating-point
modes.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/_Float16_cmp_1.c: New test.
* gcc.target/aarch64/_Float16_cmp_2.c: New (negative) test.