git.ipfire.org Git - thirdparty/gcc.git/log
6 hours ago  libgccjit: Fix error on Power architectures caused by wrong jit_target_objs
Antoni Boucher [Thu, 9 Oct 2025 20:51:42 +0000 (16:51 -0400)] 
libgccjit: Fix error on Power architectures caused by wrong jit_target_objs

gcc/ChangeLog:
* config.gcc (jit_target_objs): Don't set this variable since
the object files don't exist.

7 hours ago  c2y: Allow unspecified arrays in generic association.
Martin Uecker [Sun, 12 Oct 2025 18:00:22 +0000 (20:00 +0200)] 
c2y: Allow unspecified arrays in generic association.

To allow unspecified arrays in generic association add a new
declaration context GENERIC_ASSOC for grokdeclarator and new
function grokgenassoc to be used by the parser.  The error
about unspecified array is moved from build_array_declarator
to grokdeclarator to be able to check for this.

gcc/c/ChangeLog:
* c-decl.cc (build_array_declarator): Remove error.
(grokgenassoc): New function.
(grokdeclarator): Add error.
* c-parser.cc (c_parser_generic_selection): Use grokgenassoc.
* c-tree.h (grokgenassoc): Add prototype.

gcc/testsuite/ChangeLog:
* gcc.dg/c2y-generic-6.c: New test.
* gcc.dg/c2y-generic-7.c: New test.

7 hours ago  c++: Implement C++23 P2674R1 - A trait for implicit lifetime types
Jakub Jelinek [Tue, 21 Oct 2025 17:17:11 +0000 (19:17 +0200)] 
c++: Implement C++23 P2674R1 - A trait for implicit lifetime types

The following patch attempts to implement the compiler side of the
C++23 P2674R1 paper.  As mentioned in the paper, since CWG2605
the trait isn't really implementable purely on the library side.

Because it is implemented completely on the compiler side, it
just uses SCALAR_TYPE_P and so can e.g. accept __int128 even in
-std=c++23 mode, even when std::is_scalar_v<__int128> is false in
that case.  And as an extension it (like Clang) accepts _Complex
types and vector types.
I must say I'm quite surprised that any array types are considered
implicit-lifetime, even if their element type is not, but perhaps
there is some reason for that.
Because std::is_array_v<int[0]> is false, it returns false for that
as well, dunno if that shouldn't be changed for implicit-lifetime.
It accepts also VLAs.

The library part has been split into a separate patch still pending
review; committing it now so that reflection can use it in its
std::meta::is_implicit_lifetime_type implementation.

2025-10-21  Jakub Jelinek  <jakub@redhat.com>

gcc/cp/
* cp-tree.h: Implement C++23 P2674R1 - A trait for implicit lifetime
types.
(implicit_lifetime_type_p): Declare.
* tree.cc (implicit_lifetime_type_p): New function.
* cp-trait.def (IS_IMPLICIT_LIFETIME): New unary trait.
* semantics.cc (trait_expr_value): Handle CPTK_IS_IMPLICIT_LIFETIME.
(finish_trait_expr): Likewise.
* constraint.cc (diagnose_trait_expr): Likewise.
gcc/testsuite/
* g++.dg/ext/is_implicit_lifetime.C: New test.

8 hours ago  arm: testsuite: [MVE] Fix expected code for vadcq_m and vsbcq_m [PR122189]
Christophe Lyon [Mon, 20 Oct 2025 14:31:21 +0000 (14:31 +0000)] 
arm: testsuite: [MVE] Fix expected code for vadcq_m and vsbcq_m [PR122189]

The original versions of these tests only took into account code
generated with -mfloat-abi=hard.

Depending on how the toolchain is configured, arm_v8_1m_mve may use
-mfloat-abi=softfp, which generates a different instruction order.

Depending on the -mtune setting, the order can also vary, so the patch
adds -fno-schedule-insns -fno-schedule-insns2 to avoid such
maintenance issues.

In particular, this fixes the failures with:
 -mthumb -march=armv7e-m+fp.dp -mtune=cortex-m7 -mfloat-abi=hard -mfpu=auto
 -mthumb -march=armv6s-m -mtune=cortex-m0 -mfloat-abi=soft -mfpu=auto

gcc/testsuite/ChangeLog:

PR target/122189
* gcc.target/arm/mve/intrinsics/vadcq_m_s32.c
* gcc.target/arm/mve/intrinsics/vadcq_m_u32.c
* gcc.target/arm/mve/intrinsics/vsbcq_m_s32.c
* gcc.target/arm/mve/intrinsics/vsbcq_m_u32.c

10 hours ago  OpenMP: Handle non-executable directives in intervening code [PR120180,PR122306]
Paul-Antoine Arras [Thu, 16 Oct 2025 16:22:08 +0000 (17:22 +0100)] 
OpenMP: Handle non-executable directives in intervening code [PR120180,PR122306]

OpenMP 6 permits non-executable directives in intervening code; this commit adds
support for a sensible subset, namely metadirectives, nothing, assume, and
'error at(compilation)'.
Also handle the special case where a metadirective can be resolved at parse time
to 'omp nothing'.
This fixes a build issue that affects 10 out of 12 SPECaccel benchmarks.
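
To illustrate what "intervening code" means here, the sketch below places a
non-executable directive between the two loops of a collapsed nest.  It is an
illustration only, not one of the new test cases: fill_grid, grid, ROWS and
COLS are made-up names, and the assumption is that a compiler containing this
change accepts the nest with -fopenmp.  Without -fopenmp the pragmas are
ignored and the loops simply run serially.

```c
#define ROWS 4
#define COLS 3

/* Hypothetical output array, filled by the collapsed loop nest.  */
static int grid[ROWS][COLS];

static void
fill_grid (void)
{
#pragma omp parallel for collapse(2)
  for (int i = 0; i < ROWS; i++)
    {
      /* Intervening code: a non-executable directive sitting between
         the two associated loops of the collapsed nest.  */
#pragma omp nothing
      for (int j = 0; j < COLS; j++)
        grid[i][j] = i * COLS + j;
    }
}
```

An 'error at(execution)' directive in the same position is rejected according
to the ChangeLog below, so it is deliberately not used in this sketch.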

Co-authored by: Tobias Burnus <tburnus@baylibre.com>

PR c/120180
PR fortran/122306

gcc/c/ChangeLog:

* c-parser.cc (c_parser_pragma): Accept a subset of non-executable
OpenMP directives in intervening code.
(c_parser_omp_error): Reject 'error at(execution)' in intervening code.
(c_parser_omp_metadirective): Return early if only one selector matches
and it resolves to 'omp nothing'.

gcc/cp/ChangeLog:

* parser.cc (cp_parser_omp_metadirective): Return early if only one
selector matches and it resolves to 'omp nothing'.
(cp_parser_omp_error): Reject 'error at(execution)' in intervening code.
(cp_parser_pragma): Accept a subset of non-executable OpenMP directives
as intervening code.

gcc/fortran/ChangeLog:

* gfortran.h (enum gfc_exec_op): Add EXEC_OMP_FIRST_OPENMP_EXEC and
EXEC_OMP_LAST_OPENMP_EXEC.
* openmp.cc (gfc_match_omp_context_selector): Remove static. Remove
checks on score. Add cleanup. Remove checks on trait properties.
(gfc_match_omp_context_selector_specification): Remove static. Adjust
calls to gfc_match_omp_context_selector.
(gfc_match_omp_declare_variant): Adjust call to
gfc_match_omp_context_selector_specification.
(match_omp_metadirective): Likewise.
(icode_code_error_callback): Reject all statements except
'assume' and 'metadirective'.
(gfc_resolve_omp_context_selector): New function.
(resolve_omp_metadirective): Skip metadirectives whose context selectors
can be statically resolved to false. Replace metadirective by its body
if only 'nothing' remains.
(gfc_resolve_omp_declare): Call gfc_resolve_omp_context_selector for
each variant.

gcc/testsuite/ChangeLog:

* c-c++-common/gomp/imperfect1.c: Adjust dg-error.
* c-c++-common/gomp/imperfect4.c: Likewise.
* c-c++-common/gomp/pr120180.c: Move to...
* c-c++-common/gomp/pr120180-1.c: ...here. Remove dg-error.
* g++.dg/gomp/attrs-imperfect1.C: Adjust dg-error.
* g++.dg/gomp/attrs-imperfect4.C: Likewise.
* gfortran.dg/gomp/declare-variant-2.f90: Adjust dg-error.
* gfortran.dg/gomp/declare-variant-20.f90: Likewise.
* c-c++-common/gomp/pr120180-2.c: New test.
* g++.dg/gomp/pr120180-1.C: New test.
* gfortran.dg/gomp/pr120180-1.f90: New test.
* gfortran.dg/gomp/pr120180-2.f90: New test.
* gfortran.dg/gomp/pr122306-1.f90: New file.
* gfortran.dg/gomp/pr122306-2.f90: New file.

12 hours ago  x86_64: Start TImode STV chains from zero-extension or *concatditi.
Roger Sayle [Tue, 21 Oct 2025 12:14:58 +0000 (13:14 +0100)] 
x86_64: Start TImode STV chains from zero-extension or *concatditi.

Currently x86_64's TImode STV pass has the restriction that candidate
chains must start with a TImode load from memory.  This patch improves
the functionality of STV to allow zero-extensions and construction of
TImode pseudos from two DImode values (i.e. *concatditi) to both be
considered candidate chain initiators.  For example, this allows chains
starting from an __int128 function argument to be processed by STV.

Compiled with -O2 on x86_64:

__int128 m0,m1,m2,m3;
void foo(__int128 m)
{
    m0 = m;
    m1 = m;
    m2 = m;
    m3 = m;
}

Previously generated:

foo:    xchgq   %rdi, %rsi
        movq    %rsi, m0(%rip)
        movq    %rdi, m0+8(%rip)
        movq    %rsi, m1(%rip)
        movq    %rdi, m1+8(%rip)
        movq    %rsi, m2(%rip)
        movq    %rdi, m2+8(%rip)
        movq    %rsi, m3(%rip)
        movq    %rdi, m3+8(%rip)
        ret

With the patch, we now generate:

foo:    movq    %rdi, %xmm0
        movq    %rsi, %xmm1
        punpcklqdq      %xmm1, %xmm0
        movaps  %xmm0, m0(%rip)
        movaps  %xmm0, m1(%rip)
        movaps  %xmm0, m2(%rip)
        movaps  %xmm0, m3(%rip)
        ret

or with -mavx2:

foo:    vmovq   %rdi, %xmm1
        vpinsrq $1, %rsi, %xmm1, %xmm0
        vmovdqa %xmm0, m0(%rip)
        vmovdqa %xmm0, m1(%rip)
        vmovdqa %xmm0, m2(%rip)
        vmovdqa %xmm0, m3(%rip)
        ret

Likewise, for zero-extension:

__int128 m0,m1,m2,m3;
void bar(unsigned long x)
{
    __int128 m = x;
    m0 = m;
    m1 = m;
    m2 = m;
    m3 = m;
}

Previously with -O2:

bar:    movq    %rdi, m0(%rip)
        movq    $0, m0+8(%rip)
        movq    %rdi, m1(%rip)
        movq    $0, m1+8(%rip)
        movq    %rdi, m2(%rip)
        movq    $0, m2+8(%rip)
        movq    %rdi, m3(%rip)
        movq    $0, m3+8(%rip)
        ret

with this patch:

bar:    movq    %rdi, %xmm0
        movaps  %xmm0, m0(%rip)
        movaps  %xmm0, m1(%rip)
        movaps  %xmm0, m2(%rip)
        movaps  %xmm0, m3(%rip)
        ret

As shown in the examples above, the scalar-to-vector (STV) conversion of
*concatditi has an overhead [treating two DImode registers as a TImode
value is free on x86_64], but specifying this penalty allows the STV
pass to make an informed decision on whether the total cost/gain of the
chain is a net win.

2025-10-21  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
* config/i386/i386-features.cc (timode_concatdi_p): New
function to recognize the various variants of *concatditi3_[1-7].
(scalar_chain::add_insn): Like VEC_SELECT, ZERO_EXTEND and
timode_concatdi_p instructions don't require their input
operands to be converted (to TImode).
(timode_scalar_chain::compute_convert_gain): Split/clone XOR and
IOR cases from AND case, to handle timode_concatdi_p costs.
<case PLUS>: Handle timode_concatdi_p conversion costs.
<case ZERO_EXTEND>: Provide costs of DImode to TImode extension.
(timode_convert_concatdi): Helper function to transform
a *concatditi3 instruction into a vec_concatv2di instruction.
(timode_scalar_chain::convert_insn): Split/clone XOR and IOR
cases from AND case, to handle timode_concatdi_p using the new
timode_convert_concatdi helper function.
<case ZERO_EXTEND>: Convert zero_extendditi2 to *vec_concatv2di_0.
<case PLUS>: Handle timode_concatdi_p using the new
timode_convert_concatdi helper function.
(timode_scalar_to_vector_candidate_p): Support timode_concatdi_p
instructions in IOR, XOR and PLUS cases.
<case ZERO_EXTEND>: Consider zero extension of a register from
DImode to TImode to be a candidate.

gcc/testsuite/ChangeLog
* gcc.target/i386/sse4_1-stv-10.c: New test case.
* gcc.target/i386/sse4_1-stv-11.c: Likewise.
* gcc.target/i386/sse4_1-stv-12.c: Likewise.

13 hours ago  OpenMP: Update directive arrays used for 'omp assume(s)' with contains/absent
Tobias Burnus [Tue, 21 Oct 2025 11:31:19 +0000 (13:31 +0200)] 
OpenMP: Update directive arrays used for 'omp assume(s)' with contains/absent

Both Fortran and C/C++ have an array with classifications of directives;
currently, this array is only used to handle the restrictions of the
contains/absent clauses to the assume/assumes directives.

For C/C++, uncommenting 'declare mapper' was missed. Additionally,
'end ...' is a directive but not a directive name; hence, those
are now rejected as 'unknown directive' instead of as 'invalid'
directive.

Additionally, both lists now list newer entries (commented out) for
OpenMP 6.x - and a note (comment) was added for C/C++'s
'begin metadirective' and for Fortran's 'allocate', respectively.

gcc/c-family/ChangeLog:

* c-omp.cc (c_omp_directives): Uncomment 'declare mapper',
add comment to 'begin metadirective', add 6.x unimplemented
directives as commented-out entries.

gcc/c/ChangeLog:

* c-parser.cc (c_parser_omp_assumption_clauses): Switch to
'unknown' not 'invalid' directive name for end directives.

gcc/cp/ChangeLog:

* parser.cc (cp_parser_omp_assumption_clauses): Switch to
'unknown' not 'invalid' directive name for end directives.

gcc/fortran/ChangeLog:

* openmp.cc (gfc_omp_directive): Add comment to 'allocate';
add 6.x unimplemented directives as commented-out entries.

gcc/testsuite/ChangeLog:

* c-c++-common/gomp/assumes-2.c: Adjust for the 'invalid'
to 'unknown' change for end directives.
* c-c++-common/gomp/begin-assumes-2.c: Likewise.
* c-c++-common/gomp/assume-2.c: Likewise. Check 'declare
mapper'.

13 hours ago  tree-optimization/120687 - reduction chain with UB on signed overflow
Richard Biener [Mon, 20 Oct 2025 12:44:27 +0000 (14:44 +0200)] 
tree-optimization/120687 - reduction chain with UB on signed overflow

The following adds the ability to discover a reduction chain on a
series of statements that invoke undefined behavior on integer overflow.
Such UB inhibits the reassoc pass from associating stmts in the way
naturally leading to a reduction chain.  The common mistake on the
source side is to rely on the += operator to sum multiple inputs.

After the refactoring of how we handle reduction chains we can
easily use vect_slp_linearize_chain to do this ourselves and
rely on the vectorizer punning operations to unsigned given reduction
vectorization always associates.
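
A minimal sketch of the source-side pattern, as I read the description above;
it is not the new testcase, and the names sum_compound, sum_chain and N are
made up for illustration.  Both functions compute the same value; they differ
only in how the additions are associated in the source.

```c
#define N 64

/* Compound assignment: a[i] + a[i+1] + a[i+2] + a[i+3] is summed first
   and only then added to sum, so the accumulator does not flow through
   every addition.  With signed int the additions cannot be freely
   reassociated, since that could introduce an overflow the source did
   not have.  */
static int
sum_compound (const int *a)
{
  int sum = 0;
  for (int i = 0; i < N; i += 4)
    sum += a[i] + a[i + 1] + a[i + 2] + a[i + 3];
  return sum;
}

/* Explicit chain: the accumulator is threaded through each statement,
   which is the shape a reduction chain naturally has.  */
static int
sum_chain (const int *a)
{
  int sum = 0;
  for (int i = 0; i < N; i += 4)
    {
      sum = sum + a[i];
      sum = sum + a[i + 1];
      sum = sum + a[i + 2];
      sum = sum + a[i + 3];
    }
  return sum;
}
```

Whether a given compiler vectorizes either form is not something this sketch
shows; it only makes the two association orders concrete.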

PR tree-optimization/120687
* tree-vect-slp.cc (vect_analyze_slp_reduc_chain): When
there's no natural reduction chain see if vect_slp_linearize_chain
can recover one and build the SLP instance manually in that
case.
(vect_schedule_slp): Deal with NULL lanes when looking for
stores to remove.
* tree-vect-loop.cc (vect_transform_cycle_phi): Dump when we
are successfully transforming a reduction chain.

* gcc.dg/vect/vect-reduc-chain-4.c: New testcase.

13 hours ago  Fix partial epilog for bool vectors
Richard Biener [Tue, 21 Oct 2025 09:09:41 +0000 (11:09 +0200)] 
Fix partial epilog for bool vectors

When we do epilogue vectorization the partial reduction of a bool
vector via vect_create_partial_epilog ends up being done on an
integer vector but we fail to pun back to a bool vector at the end,
causing an ICE later.  I couldn't manage to create a testcase
running into the failure but a pending patch will expose this on
gcc.dg/vect/vect-switch-ifcvt-3.c.

* tree-vect-loop.cc (vect_create_partial_epilog): Pun back
to the requested type if necessary.

15 hours ago  vect: Fix regression for PR104116
Avinash Jayakar [Tue, 21 Oct 2025 09:33:41 +0000 (15:03 +0530)] 
vect: Fix regression for PR104116

The commit gcc-16-4464-g6883d51304f added 30 new tests for testing
vectorization of {FLOOR,CEIL,ROUND}_{DIV,MOD}_EXPR. A few of them failed
on certain targets because the run-time check loop was vectorized, which
was not intended.
This patch disables optimization for all of the run-time check loops so
that the count of vectorized loops is always 1.

2025-10-21  Avinash Jayakar  <avinashd@linux.ibm.com>

gcc/testsuite/ChangeLog:
PR target/104116
* gcc.dg/vect/pr104116-ceil-div-2.c: disable vectorization.
* gcc.dg/vect/pr104116-ceil-div-pow2.c: disable vectorization.
* gcc.dg/vect/pr104116-ceil-div.c: disable vectorization.
* gcc.dg/vect/pr104116-ceil-mod-2.c: disable vectorization.
* gcc.dg/vect/pr104116-ceil-mod-pow2.c: disable vectorization.
* gcc.dg/vect/pr104116-ceil-mod.c: disable vectorization.
* gcc.dg/vect/pr104116-ceil-udiv-2.c: disable vectorization.
* gcc.dg/vect/pr104116-ceil-udiv-pow2.c: disable vectorization.
* gcc.dg/vect/pr104116-ceil-udiv.c: disable vectorization.
* gcc.dg/vect/pr104116-ceil-umod-2.c: disable vectorization.
* gcc.dg/vect/pr104116-ceil-umod-pow2.c: disable vectorization.
* gcc.dg/vect/pr104116-ceil-umod.c: disable vectorization.
* gcc.dg/vect/pr104116-floor-div-2.c: disable vectorization.
* gcc.dg/vect/pr104116-floor-div-pow2.c: disable vectorization.
* gcc.dg/vect/pr104116-floor-div.c: disable vectorization.
* gcc.dg/vect/pr104116-floor-mod-2.c: disable vectorization.
* gcc.dg/vect/pr104116-floor-mod-pow2.c: disable vectorization.
* gcc.dg/vect/pr104116-floor-mod.c: disable vectorization.
* gcc.dg/vect/pr104116-round-div-2.c: disable vectorization.
* gcc.dg/vect/pr104116-round-div-pow2.c: disable vectorization.
* gcc.dg/vect/pr104116-round-div.c: disable vectorization.
* gcc.dg/vect/pr104116-round-mod-2.c: disable vectorization.
* gcc.dg/vect/pr104116-round-mod-pow2.c: disable vectorization.
* gcc.dg/vect/pr104116-round-mod.c: disable vectorization.
* gcc.dg/vect/pr104116-round-udiv-2.c: disable vectorization.
* gcc.dg/vect/pr104116-round-udiv-pow2.c: disable vectorization.
* gcc.dg/vect/pr104116-round-udiv.c: disable vectorization.
* gcc.dg/vect/pr104116-round-umod-2.c: disable vectorization.
* gcc.dg/vect/pr104116-round-umod-pow2.c: disable vectorization.
* gcc.dg/vect/pr104116-round-umod.c: disable vectorization.
* gcc.dg/vect/pr104116.h (init_arr): use std idiom, correct
indentation.
(init_uarr): use std idiom.

15 hours ago  match: Add support for convert `((signed)x) < 0` to `x >= (unsigned)SIGNED_TYPE_MIN...
Andrew Pinski [Mon, 20 Oct 2025 22:48:43 +0000 (15:48 -0700)] 
match: Add support for convert `((signed)x) < 0` to `x >= (unsigned)SIGNED_TYPE_MIN` while detecting min/max [PR110068]

This copies the optimization which was done to fix PR 95699 from minmax_replacement
to the detection of MIN/MAX in match.
This is another step in getting rid of minmax_replacement in phiopt.  There are still a few
more min/max detections that need to be handled before the removal. pr101024-1.c adds one
example of that but since the testcase currently passes I didn't xfail it.

pr110068-1.c adds a testcase which was not detected beforehand either.
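
The equivalence behind the transformation can be checked with a small sketch.
The function names are hypothetical, and the code assumes GCC's documented
behavior of reducing an out-of-range unsigned-to-signed conversion modulo 2^N
(ISO C leaves that conversion implementation-defined).  The last two functions
show one conditional of the `(type1)x CMP CST1 ? (type2)x : CST2` shape that
is a MAX in disguise; whether this exact form is among those now detected is
an assumption on my part.

```c
#include <limits.h>
#include <stdbool.h>

/* Sign test written through a cast to signed.  */
static bool
is_negative_via_cast (unsigned int x)
{
  return (int) x < 0;
}

/* The same test as an unsigned comparison against (unsigned) INT_MIN.  */
static bool
is_negative_via_unsigned (unsigned int x)
{
  return x >= (unsigned int) INT_MIN;
}

/* Keeps x when its sign bit is set, otherwise yields (unsigned) INT_MIN.  */
static unsigned int
select_via_cast (unsigned int x)
{
  return (int) x < 0 ? x : (unsigned int) INT_MIN;
}

/* The same selection spelled as an explicit unsigned maximum.  */
static unsigned int
select_via_max (unsigned int x)
{
  unsigned int lo = (unsigned int) INT_MIN;
  return x > lo ? x : lo;
}
```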

Changes since v1:
* v2: Fix comment about how it is transformed.
      Use SIGNED_TYPE_MIN everywhere instead of mixing in SIGNED_TYPE_MAX too.

Bootstrapped and tested on x86_64-linux-gnu.

PR tree-optimization/95699
PR tree-optimization/101024
PR tree-optimization/110068

gcc/ChangeLog:

* match.pd (`(type1)x CMP CST1 ? (type2)x : CST2`): Treat
`(signed)x </>= 0` as `x >=/< SIGNED_TYPE_MIN`

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/pr101024-1.c: New test.
* gcc.dg/tree-ssa/pr110068-1.c: New test.

Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>

16 hours ago  Redefine ASM_PREFERRED_EH_DATA_FORMAT for ppc[64]-vxworks
Olivier Hainque [Sat, 20 Apr 2024 11:43:56 +0000 (08:43 -0300)] 
Redefine ASM_PREFERRED_EH_DATA_FORMAT for ppc[64]-vxworks

This patch redefines ASM_PREFERRED_EH_DATA_FORMAT from the
otherwise inherited linux variant, preventing DW_EH_PE_indirect
in 64bit DKMs, where they are not strictly
needed and where the runtime load could resolve the DW.refs to
symbols of the same name within a different DKM loaded previously.

gcc/
* config/rs6000/vxworks.h (ASM_PREFERRED_EH_DATA_FORMAT):
Redefine.

16 hours ago  Replace VSB_DIR by sysroot ref in VXWORKS_ADDITIONAL_CPP_SPEC
Olivier Hainque [Sat, 20 Apr 2024 15:37:51 +0000 (12:37 -0300)] 
Replace VSB_DIR by sysroot ref in VXWORKS_ADDITIONAL_CPP_SPEC

VXWORKS_ADDITIONAL_CPP_SPEC has an artificial guard on
-fself-test to prevent all-gcc build failures from self-tests
in environments where VSB_DIR is not defined.

The libraries are not built during such
checks; having a VxWorks installation at hand is not necessary, and
requiring VSB_DIR to be defined is inappropriate.

This patch replaces the use of %:getenv(VSB_DIR) by $sysroot references
which allows removing the artificial guard on -fself-test.

gcc/
* config/vxworks.h (VXWORKS_ADDITIONAL_CPP_SPEC):
Remove guard on -fself-test and replace %:getenv(VSB_DIR) by
sysroot references.

24 hours ago  Daily bump.
GCC Administrator [Tue, 21 Oct 2025 00:20:03 +0000 (00:20 +0000)] 
Daily bump.

26 hours ago  Fix minor RISC-V testsuite failure
Jeff Law [Mon, 20 Oct 2025 22:31:20 +0000 (16:31 -0600)] 
Fix minor RISC-V testsuite failure

This fixes reduc-8 yet again.  This time the required "a2" moved to the other source operand of the add.  So the regexp is further expanded to allow add anyreg,anyreg,a2 or add anyreg,a2,anyreg.
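
A rough sketch of what accepting either operand order looks like, written
with POSIX regex.h rather than the DejaGnu scan the test itself uses.  The
pattern and the function name matches_add_with_a2 are hypothetical and are not
taken from reduc-8.c.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pattern: "add", whitespace, any destination register, then
   either "anyreg,a2" or "a2,anyreg" as the two source operands.  */
static const char add_a2_pattern[] =
  "add[ \t]+[a-z0-9]+,([a-z0-9]+,a2|a2,[a-z0-9]+)$";

/* Returns true if LINE is an add with a2 as one of its source operands.  */
static bool
matches_add_with_a2 (const char *line)
{
  regex_t re;
  if (regcomp (&re, add_a2_pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return false;
  bool ok = regexec (&re, line, 0, NULL, 0) == 0;
  regfree (&re);
  return ok;
}
```

With this pattern both "add a5,a4,a2" and "add a5,a2,a4" match, while an add
that does not read a2 does not.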

gcc/testsuite
* gcc.target/riscv/rvv/autovec/reduc/reduc-8.c: Adjust expected output.

26 hours ago  Ada: Add missing qualifier for integer literal
Eric Botcazou [Mon, 20 Oct 2025 22:21:37 +0000 (00:21 +0200)] 
Ada: Add missing qualifier for integer literal

gcc/ada/
PR ada/102078
* affinity.c (__gnat_set_affinity_mask): Add U qualifier.

26 hours ago  ipa: Delete callback edges when redirecting to unreachable.
Josef Melcr [Sat, 18 Oct 2025 10:47:17 +0000 (12:47 +0200)] 
ipa: Delete callback edges when redirecting to unreachable.

When a callback-carrying edge is redirected to __builtin_unreachable,
the associated callbacks will never get called, so the corresponding
callback edges must be deleted, as they no longer reflect the reality.

The line in analyze_function_body is an obvious typo I discovered during
debugging, so I decided to bundle it in.

gcc/ChangeLog:

* ipa-fnsummary.cc (redirect_to_unreachable): Purge callback
edges when redirecting the carrying edge.
(analyze_function_body): Fix typo.

Signed-off-by: Josef Melcr <jmelcr02@gmail.com>

26 hours ago  libgccjit: Add gcc_jit_context_new_array_type_u64
Antoni Boucher [Sat, 4 Mar 2023 05:44:49 +0000 (00:44 -0500)] 
libgccjit: Add gcc_jit_context_new_array_type_u64

gcc/jit/ChangeLog:

* docs/topics/compatibility.rst (LIBGCCJIT_ABI_37): New ABI tag.
* docs/topics/types.rst: Document
gcc_jit_context_new_array_type_u64.
* jit-playback.cc (new_array_type): Change num_elements type to
uint64_t.
* jit-playback.h (new_array_type): Change num_elements type to
uint64_t.
* jit-recording.cc (recording::context::new_array_type): Change
num_elements type to uint64_t.
(recording::array_type::make_debug_string): Use uint64_t
format.
(recording::array_type::write_reproducer): Switch to
gcc_jit_context_new_array_type_u64.
* jit-recording.h (class array_type): Change num_elements type
to uint64_t.
(new_array_type): Change num_elements type to uint64_t.
(num_elements): Change return type to uint64_t.
* libgccjit.cc (gcc_jit_context_new_array_type_u64):
New function.
* libgccjit.h (gcc_jit_context_new_array_type_u64):
New function.
* libgccjit.exports: New function.
* libgccjit.map: New function.

gcc/testsuite/ChangeLog:

* jit.dg/all-non-failing-tests.h: Add test-arrays-u64.c.
* jit.dg/test-arrays-u64.c: New test.

26 hours ago  testsuite: Move ipcp-cb* from ipa to libgomp
Josef Melcr [Mon, 20 Oct 2025 21:52:16 +0000 (23:52 +0200)] 
testsuite: Move ipcp-cb* from ipa to libgomp

This patch addresses the incorrectly placed tests, which fail if the
testsuite is run and gcc has not been installed yet, as discussed
here:
https://gcc.gnu.org/pipermail/gcc-patches/2025-October/698095.html.

gcc/testsuite/ChangeLog:
* gcc.dg/ipa/ipcp-cb-spec1.c: Moved to libgomp/testsuite/libgomp.c/.
* gcc.dg/ipa/ipcp-cb-spec2.c: Likewise.
* gcc.dg/ipa/ipcp-cb1.c: Likewise.
libgomp/ChangeLog:
* testsuite/libgomp.c/ipcp-cb-spec1.c: Moved from
gcc/testsuite/gcc.dg/ipa/.
* testsuite/libgomp.c/ipcp-cb-spec2.c: Likewise.
* testsuite/libgomp.c/ipcp-cb1.c: Likewise.

Signed-off-by: Josef Melcr <jmelcr02@gmail.com>

26 hours ago  Ada: Fix incorrect specification of GNAT.Calendar.Time_IO "%c"
Eric Botcazou [Mon, 20 Oct 2025 21:57:01 +0000 (23:57 +0200)] 
Ada: Fix incorrect specification of GNAT.Calendar.Time_IO "%c"

The timezone is not printed by the "%c" specifier.

gcc/ada/
PR ada/32318
* libgnat/g-catiio.adb (Image_Helper) <'c'>: Fix comment.

27 hours ago  libgccjit: Do not treat warnings as errors
Antoni Boucher [Mon, 12 Feb 2024 23:49:43 +0000 (19:49 -0400)] 
libgccjit: Do not treat warnings as errors

gcc/jit/ChangeLog:

* jit-playback.cc (add_error, add_error_va): Send DK_ERROR to
add_error_va.
(add_diagnostic): Call add_diagnostic instead of add_error.
* jit-recording.cc (DEFINE_DIAGNOSTIC_KIND): New define.
(recording::context::add_diagnostic): New function.
(recording::context::add_error): Send DK_ERROR to add_error_va.
(recording::context::add_error_va): New parameter diagnostic_kind.
* jit-recording.h (add_diagnostic): New function.
(add_error_va): New parameter diagnostic_kind.
* libgccjit.cc (jit_error): Send DK_ERROR to add_error_va.

gcc/testsuite/ChangeLog:

* jit.dg/test-error-array-bounds.c: Fix test.

27 hours ago  libgccjit: Fix infinite recursion in gt_ggc_mx_lang_tree_node
Antoni Boucher [Fri, 3 Jun 2022 01:14:06 +0000 (21:14 -0400)] 
libgccjit: Fix infinite recursion in gt_ggc_mx_lang_tree_node

2022-06-02  Antoni Boucher  <bouanto@zoho.com>

gcc/jit/
PR jit/105827
* dummy-frontend.cc: Fix lang_tree_node.
* jit-common.h: New function (jit_tree_chain_next) used by
lang_tree_node.

27 hours ago  libgccjit: Support more target builtin types
Antoni Boucher [Tue, 18 Mar 2025 16:51:23 +0000 (12:51 -0400)] 
libgccjit: Support more target builtin types

This also adds an option to abort on unsupported type in order to be able
to detect new unsupported types more easily.

gcc/jit/ChangeLog:
PR jit/117886
* dummy-frontend.cc: Support some missing types.
* jit-playback.h (get_abort_on_unsupported_target_builtin): New
function.
* jit-recording.cc (get_abort_on_unsupported_target_builtin,
set_abort_on_unsupported_target_builtin): New functions.
* jit-recording.h (get_abort_on_unsupported_target_builtin,
set_abort_on_unsupported_target_builtin): New functions.
(m_abort_on_unsupported_target_builtin): New field.
* libgccjit.cc
(gcc_jit_context_set_abort_on_unsupported_target_builtin): New
function.
* libgccjit.h
(gcc_jit_context_set_abort_on_unsupported_target_builtin): New
function.
* libgccjit.exports (LIBGCCJIT_ABI_36): New ABI tag.
* libgccjit.map (LIBGCCJIT_ABI_36): New ABI tag.
* docs/topics/compatibility.rst (LIBGCCJIT_ABI_36): New ABI tag.
* docs/topics/contexts.rst: Document new function.

29 hours ago  hurd: Add OPTION_GLIBC_P and OPTION_GLIBC
Svante Signell [Sun, 6 Feb 2022 11:43:23 +0000 (11:43 +0000)] 
hurd: Add OPTION_GLIBC_P and OPTION_GLIBC

GNU/Hurd uses glibc just like GNU/Linux.

This is needed for gcc to notice that glibc supports split stack in
finish_options.

PR go/104290
gcc/ChangeLog:
* config/gnu.h (OPTION_GLIBC_P, OPTION_GLIBC): Define.

29 hours ago  c++, gimplify: Implement C++26 P2795R5 - Erroneous behavior for uninitialized reads...
Thomas Schwinge [Mon, 20 Oct 2025 15:36:49 +0000 (17:36 +0200)] 
c++, gimplify: Implement C++26 P2795R5 - Erroneous behavior for uninitialized reads: Adjust 'libgomp.c++/{target-flex-101.C,target-std__flat_map-concurrent.C,target-std__flat_multimap-concurrent.C}' [PR114457, PR122268, PR120450]

With commit r16-4212-gf256a13f8aed833fe964a2ba541b7b30ad9b4a76
"c++, gimplify: Implement C++26 P2795R5 - Erroneous behavior for uninitialized reads [PR114457]",
we acquired:

    {+FAIL: libgomp.c++/target-flex-101.C (internal compiler error: in assign_temp, at function.cc:990)+}
    [-PASS:-]{+FAIL:+} libgomp.c++/target-flex-101.C (test for excess errors)
    [-PASS:-]{+UNRESOLVED:+} libgomp.c++/target-flex-101.C [-execution test-]{+compilation failed to produce executable+}

... for GCN, nvptx offloading compilation, and on the other hand:

    [-XFAIL:-]{+XPASS:+} libgomp.c++/target-std__flat_map-concurrent.C (internal compiler error[-: in assign_temp, at function.cc:990)-]
    [-XFAIL:-]{+XPASS:+} libgomp.c++/target-std__flat_map-concurrent.C (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} libgomp.c++/target-std__flat_map-concurrent.C [-compilation failed to produce executable-]{+execution test+}

    [-XFAIL:-]{+XPASS:+} libgomp.c++/target-std__flat_multimap-concurrent.C (internal compiler error[-: in assign_temp, at function.cc:990)-]
    [-XFAIL:-]{+XPASS:+} libgomp.c++/target-std__flat_multimap-concurrent.C (test for excess errors)
    [-UNRESOLVED:-]{+PASS:+} libgomp.c++/target-std__flat_multimap-concurrent.C [-compilation failed to produce executable-]{+execution test+}

... for GCN offloading compilation (already PASSed for nvptx).

Note that these test cases explicitly use '-std=c++23', so don't undergo the
new C++26 P2795R5 functionality.  Yet, comparing before vs. after that commit,
in the 'gimple' dumps (that is, early host compilation), there are a lot of
changes where 'gimple_assign <constructor, [...], {CLOBBER(bob)}, NULL, NULL>'s
and relatedly 'gimple_bind's newly appear/no longer appear elsewhere.  This
leads to correspondingly different code at the beginning of offloading
compilation.  Why/how that now ('libgomp.c++/target-flex-101.C') vs. before
('libgomp.c++/{target-std__flat_map-concurrent.C,target-std__flat_multimap-concurrent.C}')
translates into 'expand' ICEs, I can't tell.

PR c++/114457
PR c++/122268
PR c++/120450
libgomp/
* testsuite/libgomp.c++/target-flex-101.C: XFAIL GCN, nvptx
offloading compilation.
* testsuite/libgomp.c++/target-std__flat_map-concurrent.C:
Un-XFAIL GCN offloading compilation.
* testsuite/libgomp.c++/target-std__flat_multimap-concurrent.C:
Likewise.

29 hours ago  c++, gimplify: Implement C++26 P2795R5 - Erroneous behavior for uninitialized reads...
Thomas Schwinge [Mon, 20 Oct 2025 15:36:49 +0000 (17:36 +0200)] 
c++, gimplify: Implement C++26 P2795R5 - Erroneous behavior for uninitialized reads: Adjust 'c-c++-common/goacc/kernels-decompose-pr100280-1.c' [PR114457]

With commit r16-4212-gf256a13f8aed833fe964a2ba541b7b30ad9b4a76
"c++, gimplify: Implement C++26 P2795R5 - Erroneous behavior for uninitialized reads [PR114457]",
we acquired:

    @@ -181180,8 +184423,8 @@ PASS: c-c++-common/goacc/kernels-decompose-pr100280-1.c  -std=c++26  at line 14
    PASS: c-c++-common/goacc/kernels-decompose-pr100280-1.c  -std=c++26  at line 15 (test for warnings, line 12)
    PASS: c-c++-common/goacc/kernels-decompose-pr100280-1.c  -std=c++26  at line 16 (test for warnings, line 12)
    PASS: c-c++-common/goacc/kernels-decompose-pr100280-1.c  -std=c++26 (test for excess errors)
    [-XFAIL:-]{+XPASS:+} c-c++-common/goacc/kernels-decompose-pr100280-1.c  -std=c++26 TODO at line 18 (test for warnings, line 19)
    [-XFAIL:-]{+XPASS:+} c-c++-common/goacc/kernels-decompose-pr100280-1.c  -std=c++26 TODO location at line 17 (test for bogus messages, line 10)

As in other OpenACC 'kernels' test cases, the underlying issue again is
PR121975 "Various goacc failures with -ftrivial-auto-var-init=zero" (to be
resolved later on).

PR c++/114457
gcc/testsuite/
* c-c++-common/goacc/kernels-decompose-pr100280-1.c: Skip for
c++26 until PR121975 is fixed.

29 hours ago  Ada: Fix Default_Component_Value aspect wrongly ignored on derived type
Eric Botcazou [Mon, 20 Oct 2025 19:01:06 +0000 (21:01 +0200)] 
Ada: Fix Default_Component_Value aspect wrongly ignored on derived type

This is again an old issue, which was mostly fixed a few releases ago except
for the specific case of an array type derived from String.

gcc/ada/
PR ada/68179
* exp_ch3.adb (Expand_Freeze_Array_Type): Build an initialization
procedure for a type derived from String declared with the aspect
Default_Component_Value.

gcc/testsuite/
* gnat.dg/component_value1.adb: New test.

29 hours ago  Ada: Fix use type clause invalidated by use clause in nested package
Eric Botcazou [Mon, 20 Oct 2025 18:48:39 +0000 (20:48 +0200)] 
Ada: Fix use type clause invalidated by use clause in nested package

This is an old issue, whereby a use type clause is partially invalidated by
a use clause in a nested package, a variant of PR ada/64869 recently fixed.
The problem occurs only for unusual primitive operators because of a small
oversight in the implementation.  The fix simply aligns this implementation
with the one exercised by PR ada/64869, which is more robust.

gcc/ada/
PR ada/52319
* sem_ch7.adb (Uninstall_Declarations): Use direct test on Nkind
to spot operators.
* sem_ch8.adb (End_Use_Package): Also test the Etype of operators
to spot those which are primitive operators of use-visible types.

gcc/testsuite/
* gnat.dg/use_type3.adb: New test.

31 hours ago  Ensure use of gcc's version of stdatomic.h in gthr-vxworks
Ashley Gay [Fri, 8 Mar 2024 11:30:35 +0000 (11:30 +0000)] 
Ensure use of gcc's version of stdatomic.h in gthr-vxworks

VxWorks provides its own version of the standard stdatomic.h, possibly
relying on non-gcc builtins, and our implementation of the gthr API resorts
to VxWorks specific functions for atomicity features.

When compiling libgcc (with gcc), make sure gcc's version of stdatomic.h
is used: #include it here, first, then define the macro used to guard the
system version so it doesn't get expanded when included indirectly by
other system headers.

2025-10-20  Olivier Hainque  <hainque@adacore.com>
    Ashley Gay  <gay@adacore.com>

libgcc/
* config/gthr-vxworks.h: Include stdatomic.h and prevent indirect
inclusion of contents from the system version of that header.

31 hours agoTidy bits of libgcc/config/gthr-vxworks
Olivier Hainque [Wed, 9 Jul 2025 10:48:34 +0000 (10:48 +0000)] 
Tidy bits of libgcc/config/gthr-vxworks

This addresses a variety of warnings about missing prototypes
or suspicious ptr-to-function conversions.

libgcc/
* config/gthr-vxworks-thread.c (__init_gthread_tcb): Make static.
(__delete_gthread_tcb): Likewise.
(__task_wrapper): Likewise.
(__gthread_create): Convert __task_wrapper to (void *) before going
to (FUNCPTR).
* config/gthr-vxworks-tls.c (tls_delete_hook): Accommodate prototype
variations between kernel and rtp. Return STATUS.

32 hours agoxtensa: Make all memory constraints special
Takayuki 'January June' Suwa [Thu, 9 Oct 2025 21:33:36 +0000 (06:33 +0900)] 
xtensa: Make all memory constraints special

In a previous commit (fb7b82964f54192d0723a45c0657d2eb7c5ac97c), we fixed an issue
where loads from literal pool to a hardware floating-point register were double-
indirected; that is, the address of the literal pool entry was temporarily loaded
from another entry into the address (GP) register, and then loaded from that
address into the FP register.  However, we discovered that the same issue could
occur in rare cases when loading FP constants into address registers.

Similarly, this problem can be avoided by prefixing the corresponding alternative
constraint with '^' to increase the cost of Reload/LRA, but as a more fundamental
and comprehensive solution, this patch defines all memory constraint definitions
using define_special_memory_constraint, so that reloads cannot occur for addresses
(based on a good suggestion from Jeff Law).

gcc/ChangeLog:

* config/xtensa/constraints.md (R, U):
Change define_memory_constraint to define_special_memory_constraint.
* config/xtensa/xtensa.md
(movsi_internal, movhi_internal, movqi_internal):
Rearrange their alternatives in the order of constant assignment, register-
register move, load, store and special.  And also consolidate overlapping
alternatives.
(movsf_internal): Rearrange the alternatives as above, and remove the '^'
alternative character which is no longer needed.

32 hours agoxtensa: Make individual use of CONST16 instruction
Takayuki 'January June' Suwa [Thu, 9 Oct 2025 21:29:16 +0000 (06:29 +0900)] 
xtensa: Make individual use of CONST16 instruction

Until now, in Xtensa ISA, the CONST16 machine instruction (which shifts a
specified register left by half a word and stores a 16-bit constant value
in the low halfword of the register) has always been used in pairs and
only for full-word constant value assignments.

This patch provides a new insn definition for using CONST16 alone, and
also adds a constantsynth method that saves one byte for constant assign-
ments within a certain range when TARGET_DENSITY is also enabled.
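
As a reading aid, the CONST16 behaviour described above can be modelled
in plain C.  This is only a sketch of the semantics as worded in this
message; the function names are made up for illustration and are not
part of the Xtensa port, and the new single-instruction synthesis method
itself is not reproduced here.

```c
#include <stdint.h>

/* Model of one CONST16 as described above: shift the register left by
   half a word and place the 16-bit immediate in the low halfword.
   (Hypothetical helper name, for illustration only.)  */
uint32_t
const16 (uint32_t reg, uint16_t imm)
{
  return (reg << 16) | imm;
}

/* The traditional paired use: two CONST16 steps build a full-word
   constant, high halfword first, regardless of the initial contents.  */
uint32_t
load_const_pair (uint32_t value)
{
  uint32_t r = 0;
  r = const16 (r, (uint16_t) (value >> 16));
  r = const16 (r, (uint16_t) (value & 0xffff));
  return r;
}
```

A single const16 () step keeps the old low halfword as the new high
halfword, which is what makes using the instruction on its own useful
when the register already holds a suitable value.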

gcc/ChangeLog:

* config/xtensa/xtensa.cc
(constantsynth_method_const16): New.
(constantsynth_methods): Append constantsynth_method_const16().
(constantsynth_info): Add cost calculation for full-word constant
assignment when TARGET_CONST16 is enabled.
(constantsynth_pass1): Change it so that it works regardless of
TARGET_CONST16.
* config/xtensa/xtensa.md (*xtensa_const16): New.

32 hours agoxtensa: Apply split_DI_SF_DF_const() even if TARGET_CONST16 or TARGET_AUTOLITPOOLS
Takayuki 'January June' Suwa [Thu, 9 Oct 2025 21:26:54 +0000 (06:26 +0900)] 
xtensa: Apply split_DI_SF_DF_const() even if TARGET_CONST16 or TARGET_AUTOLITPOOLS

Otherwise, if TARGET_CONST16 or TARGET_AUTOLITPOOLS is enabled, DI/SF/DFmode
constant assignments will not benefit from their splitting or constantsynth.

gcc/ChangeLog:

* config/xtensa/xtensa.cc (do_largeconst):
Change split_DI_SF_DF_const() to be called unconditionally.

32 hours agolibstdc++: Implement P3060R3: Add std::views::indices(n)
Yuao Ma [Tue, 14 Oct 2025 16:28:48 +0000 (00:28 +0800)] 
libstdc++: Implement P3060R3: Add std::views::indices(n)

This patch adds the views::indices function using iota.

libstdc++-v3/ChangeLog:

* include/bits/version.def: Add ranges_indices FTM.
* include/bits/version.h: Regenerate.
* include/std/ranges: Implement views::indices.
* testsuite/std/ranges/indices/1.cc: New test.

32 hours agoInclude linux-protos.h for ppc*vxworks7r2
Olivier Hainque [Thu, 10 Jul 2025 09:38:30 +0000 (09:38 +0000)] 
Include linux-protos.h for ppc*vxworks7r2

This provides prototypes for target hooks dragged in through linux.h,
in a similar fashion as the ppc*-linux ports do.

gcc/
* config.gcc (powerpc*-wrs-vxworks7r*): Add linux-protos.h
to tm_p_file.

33 hours agolibstdc++: Deduce function_ref<M&() noexcept> from member object pointers.
Tomasz Kamiński [Mon, 20 Oct 2025 13:19:43 +0000 (15:19 +0200)] 
libstdc++: Deduce function_ref<M&() noexcept> from member object pointers.

Implement resolution of LWG4425.

libstdc++-v3/ChangeLog:

* include/bits/funcwrap.h (__polyfunc::__deduce_funcref):
Adjust signature produced for member object pointers.
* testsuite/20_util/function_ref/deduction.cc: Update tests.

33 hours agoInfer TOOL/TOOL_FAMILY from vxworks-predef.h on VxWorks7
Olivier Hainque [Sat, 20 Apr 2024 15:35:41 +0000 (12:35 -0300)] 
Infer TOOL/TOOL_FAMILY from vxworks-predef.h on VxWorks7

This change moves, for VxWorks 7, the setting of the TOOL
and TOOL_FAMILY macros from a builtin_define to a run-time
computation from vxworks-predef.h.

This is useful on Vx7 to allow a single toolchain to be used
for instances of VxWorks based on either a gnu or an llvm system
toolchain for a given cpu (typically, powerpc).

This is achieved by leveraging the existence of a very basic
autoconf.h file in all VxWorks 7 VSBs, #included directly from
vxworks-predef.h.

gcc/
* config/vxworks.h (VXWORKS_OS_CPP_BUILTINS): Only
builtin_define TOOL and TOOL_FAMILY for !TARGET_VXWORKS7.
Augment comment on VXWORKS_PERSONALITY.
* config/vxworks/vxworks-predef.h: Infer TOOL and TOOL_FAMILY
from the VSB autoconf.h when we have one, determined by the presence
of a _VSB_CONFIG_FILE definition.

libgcc/
* config/t-vxworks: -include vxworks-predef.h explicitly, as the
automatic inclusion is disabled by -nostdinc.

35 hours agoaarch64: Add support for menable-sysreg-checking flag.
Srinath Parvathaneni [Mon, 20 Oct 2025 13:07:55 +0000 (14:07 +0100)] 
aarch64: Add support for menable-sysreg-checking flag.

Hi All,

In the current Binutils we have disabled the feature gating for sysreg
by default and we have introduced a new flag "-menable-sysreg-checking"
to re-enable some of this checking.

However in GCC, we have disabled the feature gating of sysreg to read/write
intrinsics __arm_[wr]sr* and we have not added any mechanism to check the
feature gating if needed similar to Binutils.

This patch adds the support for the flag "-menable-sysreg-checking" which
re-enables some of the feature checking of sysreg to read/write intrinsics
__arm_[wr]sr* similar to Binutils.

For inline assembly, sysreg checks are not performed by CC1 and are
instead delegated to the assembler. By default, the assembler does not
perform these checks either. With this patch, the -menable-sysreg-checking
flag passed to the compiler will also be propagated to the assembler,
enabling sysreg checking for inline assembly.

gcc/ChangeLog:

* config/aarch64/aarch64-elf.h (ASM_SPEC): Update the macro.
* config/aarch64/aarch64.cc (aarch64_valid_sysreg_name_p):
Add feature check condition.
(aarch64_retrieve_sysreg): Likewise.
* config/aarch64/aarch64.opt (menable-sysreg-checking):
Define new flag.
* doc/invoke.texi (menable-sysreg-checking): Document new flag.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/acle/asm-inlined-sysreg-1.c: New test.
* gcc.target/aarch64/acle/asm-inlined-sysreg-2.c: Likewise.
* gcc.target/aarch64/acle/rwsr-gated-1.c: Likewise.
* gcc.target/aarch64/acle/rwsr-gated-2.c: Likewise.
* lib/target-supports.exp
(check_effective_target_aarch64_sysreg_guarding_ok): Check
assembler support of -menable-sysreg-checking flag.

35 hours agoMAINTAINERS: Add myself to vectorizer maintainer list
Tamar Christina [Mon, 20 Oct 2025 13:12:31 +0000 (14:12 +0100)] 
MAINTAINERS: Add myself to vectorizer maintainer list

Following the announcement on
https://gcc.gnu.org/pipermail/gcc/2025-October/246833.html
adding myself to vectorizer maintainer list.

ChangeLog:

* MAINTAINERS (Various Maintainers): Add myself for the vectorizer.

36 hours agoFix minor testsuite scan failures for RISC-V
Jeff Law [Mon, 20 Oct 2025 11:48:54 +0000 (05:48 -0600)] 
Fix minor testsuite scan failures for RISC-V

This fixes minor testsuite fallout after some of Jan's recent changes, nothing
of real significance, just minor changes in codegen causing scan tests to fail.
It's mostly an -O1/-Og problem and we can just skip the tests for those.

gcc/testsuite
* gcc.target/riscv/rvv/vsetvl/imm_switch-6.c: Skip scan-asm test for -O1 too.
* gcc.target/riscv/rvv/vsetvl/imm_switch-7.c: Likewise.
* gcc.target/riscv/shrink-wrap-1.c: Likewise.  Skip for -Og as well.
* gcc.target/riscv/xandes/xandesperf-1.c: Adjust expected output.

37 hours agoAda: Use Osint.Program_Name in gnatchop
Nicolas Boulenguez [Mon, 20 Oct 2025 11:08:22 +0000 (13:08 +0200)] 
Ada: Use Osint.Program_Name in gnatchop

This aligns gnatchop with the other GNAT tools when it comes to locating
GCC's driver executable.

gcc/ada/
PR ada/87777
* gnatchop.adb: Add with clause for Osint.
(Locate_Executable): Delete.
(Gnatchop): Use Osint.Program_Name and Locate_Exec_On_Path instead
of Locate_Executable to locate GCC's driver executable.

37 hours agotop-level: Add forgejo sanity checks
Christophe Lyon [Mon, 1 Sep 2025 15:51:33 +0000 (15:51 +0000)] 
top-level: Add forgejo sanity checks

Add a sample workflow for Forgejo, as an example of integrated CI.

To keep it lightweight, we run only two small checks on each patch of
the series:
- contrib/check_GNU_style.py
  which catches common mistakes (spaces vs tab, missing spaces, ...)
  but has some false positive warnings.

- contrib/gcc-changelog/git_check_commit.py
  which checks the commit message and ChangeLog entry

In order to run both checks even if the other fails, we use two steps
with 'continue-on-error: true', and we need a 'final-result'
consolidation step to generate the global status.

ChangeLog:
* .forgejo/workflows/sanity-checks.yaml: New file.

38 hours agolibstdc++: Remove undeclared macros from configure.ac [PR122322]
Jonathan Wakely [Sat, 18 Oct 2025 21:05:43 +0000 (22:05 +0100)] 
libstdc++: Remove undeclared macros from configure.ac [PR122322]

The additions in r16-4443-g651bf5126da124 cause errors when running
autoreconf.

libstdc++-v3/ChangeLog:

PR libstdc++/122322
* configure.ac (with_newlib) <*-rtems*>: Remove
HAVE_SYS_IOCTL_H, _GLIBCXX_USE_LINK, _GLIBCXX_USE_READLINK,
_GLIBCXX_USE_SYMLINK, _GLIBCXX_USE_TRUNCATE, and
_GLIBCXX_USE_FDOPENDIR. Remove duplicates.
* configure: Regenerate.

39 hours agoAda: Fix spurious warning for renaming of component of VFA record
Eric Botcazou [Mon, 20 Oct 2025 09:21:21 +0000 (11:21 +0200)] 
Ada: Fix spurious warning for renaming of component of VFA record

This is a regression present on the mainline and all active branches: the
compiler gives a spurious "is not referenced" warning for the renaming of
a component of a Volatile_Full_Access record.

gcc/ada/
PR ada/107536
* exp_ch2.adb (Expand_Renaming): Mark the entity as referenced.

gcc/testsuite/
* gnat.dg/renaming18.adb: New test.

39 hours agotree-optimization/121631 - UB in vector epilogue
Richard Biener [Fri, 17 Oct 2025 12:03:21 +0000 (14:03 +0200)] 
tree-optimization/121631 - UB in vector epilogue

The vectorizer fails to take UB due to signed overflow into account
when generating code for the epilogue of a signed reduction.  The
following tries to make sure to perform the actual reduction
computations in an unsigned type.  I did not bother to adjust
inputs to internal functions like .REDUC_PLUS.
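
The idea of doing the reduction arithmetic in an unsigned type can be
illustrated at the source level.  The following is a minimal scalar
sketch, not the code the vectorizer emits; the function name is
hypothetical.

```c
#include <stddef.h>

/* Sum signed values while performing every addition in unsigned
   arithmetic, where wrap-around is well defined.  Partial sums may
   therefore exceed INT_MAX without invoking undefined behaviour.  */
int
reduce_plus_wrapping (const int *v, size_t n)
{
  unsigned int acc = 0;
  for (size_t i = 0; i < n; i++)
    acc += (unsigned int) v[i];
  /* Converting back is implementation-defined when the value does not
     fit in int; GCC reduces it modulo 2^N.  */
  return (int) acc;
}
```

This matters for a vector epilogue because the lanes are combined in a
different order than in the scalar loop, so an intermediate sum can
overflow even when the final result is representable.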

PR tree-optimization/121631
* tree-vect-loop.cc (vect_create_epilog_for_reduction):
When the reduction operation invokes UB on signed overflow
make sure to perform operations with it on an unsigned type.

40 hours agoImplement bool reduction vectorization
Richard Biener [Thu, 9 Oct 2025 12:03:29 +0000 (14:03 +0200)] 
Implement bool reduction vectorization

Currently we mess up here in two places.  One is pattern recognition
which computes a mask-precision for a bool reduction PHI that's
inconsistent with that of the latch definition.  This is solved by
iterating the mask-precision computation.  The second is that the
reduction epilogue generation and the code querying support for it
isn't ready for mask inputs.  The following fixes this by falling
back to doing all the epilogue processing on a data type again, if
the target does not support a direct mask reduction.  For that we
utilize the newly added reduc_sbool_{and,ior,xor}_scal optabs
so we can go the direct IFN path on masks if the target supports
that.  In the future we can also implement an additional fallback
for IOR and AND reductions using a scalar cond-expr like
mask != 0 ? true : false, but the new optabs provide more information
to the target.
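
For orientation, these are the kinds of scalar loops a bool reduction
refers to.  They are written for illustration only and are not taken
from the new testcases; the function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* IOR reduction: true if any element is nonzero.  */
bool
any_nonzero (const int *a, size_t n)
{
  bool r = false;
  for (size_t i = 0; i < n; i++)
    r |= a[i] != 0;
  return r;
}

/* AND reduction: true if every element is nonzero.  */
bool
all_nonzero (const int *a, size_t n)
{
  bool r = true;
  for (size_t i = 0; i < n; i++)
    r &= a[i] != 0;
  return r;
}

/* XOR reduction: true if the number of nonzero elements is odd.  */
bool
odd_nonzero_count (const int *a, size_t n)
{
  bool r = false;
  for (size_t i = 0; i < n; i++)
    r ^= a[i] != 0;
  return r;
}
```

In each loop the accumulator and the value combined into it are both
booleans, which is why the vectorized form works on vector masks rather
than on data vectors.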

PR tree-optimization/101639
PR tree-optimization/103495
* tree-vectorizer.h (vect_reduc_info_s): Add reduc_type_for_mask.
(VECT_REDUC_INFO_VECTYPE_FOR_MASK): New.
* tree-vect-patterns.cc (vect_determine_mask_precision):
Return whether the mask precision changed.
(vect_determine_precisions): Iterate mask precision computation
for loop vectorization.
* tree-vect-loop.cc (get_initial_defs_for_reduction): Properly
convert non-mask initial values to a mask initial def for
the reduction.
(sbool_reduction_fn_for_fn): New function.
(vect_create_epilog_for_reduction): For a mask input convert
it to the vector type analysis decided to use.  Use a regular
conversion for the final convert to the scalar code type.
(vectorizable_reduction): Support mask reductions.  Verify
we can compute a data vector from the mask result or a direct
mask reduction is provided by the target.

* gcc.dg/vect/vect-reduc-bool-1.c: New testcase.
* gcc.dg/vect/vect-reduc-bool-2.c: Likewise.
* gcc.dg/vect/vect-reduc-bool-3.c: Likewise.
* gcc.dg/vect/vect-reduc-bool-4.c: Likewise.
* gcc.dg/vect/vect-reduc-bool-5.c: Likewise.
* gcc.dg/vect/vect-reduc-bool-6.c: Likewise.
* gcc.dg/vect/vect-reduc-bool-7.c: Likewise.
* gcc.dg/vect/vect-reduc-bool-8.c: Likewise.

40 hours agoAdd reduc_sbool_{and,ior,xor}_scal optabs
Richard Biener [Tue, 14 Oct 2025 09:48:43 +0000 (11:48 +0200)] 
Add reduc_sbool_{and,ior,xor}_scal optabs

The following adds named patterns for reducing of vector masks with
AND, IOR and XOR to be used by the vectorizer.  A slight complication
are targets using scalar integer modes as mask modes, as for those
the mode for low-precision masks is ambiguous.  For this reason the
optab follows what vec_pack_sbool_trunc does and passes an additional
CONST_INT operand indicating the number of lanes in the input mask.
Note this is done always when the vector mask mode is an integer mode
and never otherwise.
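
Why the lane count is needed can be seen from a scalar model of a mask
held in an integer mode.  This is a sketch under the assumption that bit
i of the integer represents lane i; the helper names are hypothetical
and this is not the expander code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bits that are meaningful for a mask with NLANES lanes.  */
static uint64_t
lane_bits (unsigned nlanes)
{
  return nlanes >= 64 ? ~UINT64_C (0) : (UINT64_C (1) << nlanes) - 1;
}

/* AND reduction: all live lanes set.  */
bool
reduc_and (uint64_t mask, unsigned nlanes)
{
  uint64_t live = lane_bits (nlanes);
  return (mask & live) == live;
}

/* IOR reduction: any live lane set.  */
bool
reduc_ior (uint64_t mask, unsigned nlanes)
{
  return (mask & lane_bits (nlanes)) != 0;
}

/* XOR reduction: odd number of live lanes set.  */
bool
reduc_xor (uint64_t mask, unsigned nlanes)
{
  uint64_t v = mask & lane_bits (nlanes);
  bool parity = false;
  for (; v != 0; v &= v - 1)
    parity = !parity;
  return parity;
}
```

The same integer value 0x0f is "all set" for a 4-lane mask but not for
an 8-lane one, so the mode alone cannot tell the two cases apart.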

* doc/md.texi (reduc_sbool_{and,ior,xor}_scal_<mode>): Document.
* optabs.def (reduc_sbool_and_scal_optab,
reduc_sbool_ior_scal_optab, reduc_sbool_xor_scal_optab): New.
* internal-fn.def (REDUC_SBOOL_AND, REDUC_SBOOL_IOR,
REDUC_SBOOL_XOR): Likewise.
* internal-fn.cc (reduc_sbool_direct): New initializer.
(expand_reduc_sbool_optab_fn): New expander.
(direct_reduc_sbool_optab_supported_p): New.

41 hours agoUpdate auto-vectorizer maintenance area
Richard Biener [Mon, 20 Oct 2025 07:31:15 +0000 (09:31 +0200)] 
Update auto-vectorizer maintenance area

The following adjusts the attribution of the auto-vectorizer area
to say 'vectorizer (+ tree-if-conv)' as approved by the SC.

* MAINTAINERS (auto-vectorizer): Change attribution to
vectorizer (+ tree-if-conv).

42 hours agox86: Optimize copysign (x, const_double)
H.J. Lu [Sun, 19 Oct 2025 01:13:52 +0000 (09:13 +0800)] 
x86: Optimize copysign (x, const_double)

After

commit 3f176e1adc6bc9cc2c21222d776b51d9f43cb66b
Author: Tamar Christina <tamar.christina@arm.com>
Date:   Thu Nov 9 13:59:39 2023 +0000

    middle-end: optimize fneg (fabs (x)) to copysign (x, -1) [PR109154]

fneg (fabs (x)) is expanded to copysign (x, -1).  Swap constraints for
operands[1] and operands[2] in copysign<mode>3 pattern to optimize

  y = copysign (x, const_double)

instead of

  y = copysign (const_double, x)

Simplify

  y = copysign (x, positive_const_double)

to

  y = ~signbit_mask & x

and

  y = copysign (x, negative_const_double)

to

  y = signbit_mask | x
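
The two simplifications can be checked with a small scalar model.  This
is a sketch assuming double is IEEE 754 binary64; the helper names are
made up for illustration and it does not show the x86 expansion itself.

```c
#include <stdint.h>
#include <string.h>

/* Sign bit of an IEEE 754 binary64 value (assumed format of double).  */
#define SIGNBIT_MASK UINT64_C (0x8000000000000000)

static uint64_t
bits_of (double x)
{
  uint64_t u;
  memcpy (&u, &x, sizeof u);
  return u;
}

static double
double_of (uint64_t u)
{
  double x;
  memcpy (&x, &u, sizeof x);
  return x;
}

/* copysign (x, positive_const_double): clear the sign bit of x.  */
double
copysign_to_positive (double x)
{
  return double_of (bits_of (x) & ~SIGNBIT_MASK);
}

/* copysign (x, negative_const_double): set the sign bit of x.  */
double
copysign_to_negative (double x)
{
  return double_of (bits_of (x) | SIGNBIT_MASK);
}
```

Only the sign of the constant matters, so a single AND or OR with a mask
replaces the general three-operation sequence.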

gcc/

PR target/99930
PR target/122323
* config/i386/i386-expand.cc (ix86_expand_copysign): Swap
operands[1] with operands[2].  Optimize copysign (x, const_double)
instead of copysign (const_double, x).
* config/i386/i386.md (copysign<mode>3): Swap constraints for
operands[1] and operands[2].

gcc/testsuite/

PR target/99930
PR target/122323
* gcc.target/i386/builtin-copysign-2.c: New test.
* gcc.target/i386/builtin-copysign-3.c: Likewise.
* gcc.target/i386/builtin-copysign-4.c: Likewise.
* gcc.target/i386/builtin-copysign-5.c: Likewise.
* gcc.target/i386/builtin-copysign-6.c: Likewise.
* gcc.target/i386/builtin-copysign-7.c: Likewise.
* gcc.target/i386/builtin-copysign-8a.c: Likewise.
* gcc.target/i386/builtin-copysign-8b.c: Likewise.
* gcc.target/i386/builtin-fabs-1.c: Likewise.
* gcc.target/i386/builtin-fabs-2.c: Likewise.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2 days agoDaily bump.
GCC Administrator [Mon, 20 Oct 2025 00:18:13 +0000 (00:18 +0000)] 
Daily bump.

2 days agoPR modula2/122333: m2spellcheck.cc remove memset and tidyup
Gaius Mulley [Sun, 19 Oct 2025 17:48:18 +0000 (18:48 +0100)] 
PR modula2/122333: m2spellcheck.cc remove memset and tidyup

This patch removes memset from m2spellcheck_InitCandidates.
It corrects comment boilerplate and removes an unused local
variable.  Finally it frees up memory used by the candidates_array
in KillCandidates.

gcc/m2/ChangeLog:

PR modula2/122333
* gm2-compiler/M2MetaError.mod (JoinSentances): Remove
unused variable.
* gm2-gcc/m2spellcheck.cc (m2spellcheck_InitCandidates): Rewrite.
(KillCandidates): Deallocate auto_vec candidates_array.
(candidates_array_vec_t): New declaration.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2 days agoAVR: The nzb=1 patterns with IOR, XOR, AND work the same way with PLUS.
Georg-Johann Lay [Sun, 19 Oct 2025 14:42:07 +0000 (16:42 +0200)] 
AVR: The nzb=1 patterns with IOR, XOR, AND work the same way with PLUS.

gcc/
* config/avr/avr.cc (avr_nonzero_bits_lsr_operands_p): Also
handle PLUS.
* config/avr/avr.md (pixaop): New code iterator for PLUS,
IOR, XOR, AND.
(nzb=1 insns): Use pixaop instead of bitop code iterator.
Handle PLUS in outputs.

2 days agoad PR122212: Fix test case for 16-bit int targets.
Georg-Johann Lay [Sun, 19 Oct 2025 14:39:32 +0000 (16:39 +0200)] 
ad PR122212: Fix test case for 16-bit int targets.

PR testsuite/122212
PR testsuite/52641
gcc/testsuite/
* gcc.dg/torture/pr122212.c: Pass 0xffffffff instead of -1u
for all bits set in uint32_t.
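
The difference between the two spellings can be reproduced on any host
by modelling a 16-bit unsigned int with uint16_t.  This is only an
illustration with hypothetical function names, not part of the test.

```c
#include <stdint.h>

/* What -1u becomes in a uint32_t when unsigned int is 16 bits wide,
   modelled here with uint16_t so it behaves the same on every host.  */
uint32_t
all_ones_via_16bit_unsigned (void)
{
  uint16_t minus_one = (uint16_t) -1;	/* 0xffff */
  return (uint32_t) minus_one;		/* zero-extended: 0x0000ffff */
}

/* The explicit constant sets all 32 bits regardless of int width.  */
uint32_t
all_ones_explicit (void)
{
  return UINT32_C (0xffffffff);
}
```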

2 days agoad PR122016: Fix test case for 16-bit size targets.
Georg-Johann Lay [Sun, 19 Oct 2025 14:32:43 +0000 (16:32 +0200)] 
ad PR122016: Fix test case for 16-bit size targets.

PR testsuite/122016
PR testsuite/52641
gcc/testsuite/
* gcc.dg/torture/pr122016.c (strncmp): Use __SIZE_TYPE__ instead
of long as type of the size argument.

2 days agoRISC-V: Add testcase for unsigned scalar SAT_MUL form 6
Pan Li [Fri, 17 Oct 2025 07:12:10 +0000 (15:12 +0800)] 
RISC-V: Add testcase for unsigned scalar SAT_MUL form 6

Form 6 of unsigned scalar SAT_MUL has been supported since the
previous change.  Thus, add test cases to make sure it
works well.

The test suites below pass for this patch series.
 * The rv64gcv full regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat/sat_arith.h: Add test helper macros.
* gcc.target/riscv/sat/sat_u_mul-7-u16-from-u128.c: New test.
* gcc.target/riscv/sat/sat_u_mul-7-u16-from-u32.c: New test.
* gcc.target/riscv/sat/sat_u_mul-7-u16-from-u64.rv32.c: New test.
* gcc.target/riscv/sat/sat_u_mul-7-u16-from-u64.rv64.c: New test.
* gcc.target/riscv/sat/sat_u_mul-7-u32-from-u128.c: New test.
* gcc.target/riscv/sat/sat_u_mul-7-u32-from-u64.rv32.c: New test.
* gcc.target/riscv/sat/sat_u_mul-7-u32-from-u64.rv64.c: New test.
* gcc.target/riscv/sat/sat_u_mul-7-u64-from-u128.c: New test.
* gcc.target/riscv/sat/sat_u_mul-7-u8-from-u128.c: New test.
* gcc.target/riscv/sat/sat_u_mul-7-u8-from-u16.c: New test.
* gcc.target/riscv/sat/sat_u_mul-7-u8-from-u32.c: New test.
* gcc.target/riscv/sat/sat_u_mul-7-u8-from-u64.rv32.c: New test.
* gcc.target/riscv/sat/sat_u_mul-7-u8-from-u64.rv64.c: New test.
* gcc.target/riscv/sat/sat_u_mul-run-7-u16-from-u128.c: New test.
* gcc.target/riscv/sat/sat_u_mul-run-7-u16-from-u32.c: New test.
* gcc.target/riscv/sat/sat_u_mul-run-7-u16-from-u64.c: New test.
* gcc.target/riscv/sat/sat_u_mul-run-7-u32-from-u128.c: New test.
* gcc.target/riscv/sat/sat_u_mul-run-7-u32-from-u64.c: New test.
* gcc.target/riscv/sat/sat_u_mul-run-7-u64-from-u128.c: New test.
* gcc.target/riscv/sat/sat_u_mul-run-7-u8-from-u128.c: New test.
* gcc.target/riscv/sat/sat_u_mul-run-7-u8-from-u16.c: New test.
* gcc.target/riscv/sat/sat_u_mul-run-7-u8-from-u32.c: New test.
* gcc.target/riscv/sat/sat_u_mul-run-7-u8-from-u64.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
2 days agocobol: Implement ENTRY statement; finish removing ascii/ebcdic dichotomy.
Robert Dubner [Sat, 18 Oct 2025 14:35:52 +0000 (10:35 -0400)] 
cobol: Implement ENTRY statement; finish removing ascii/ebcdic dichotomy.

The prior set of changes largely eliminated the assumption that the
internal codeset was either ascii or ebcdic.  These changes remove the
last vestiges of that assumption.

These changes also implement the COBOL ENTRY statement, which allows a
program-id to have more than one externally callable entry point. Since
GCC assumes the existence of an ABI that is not, repeat *not* capable of
that, it is implemented here by creating a separate function with the
name specified by the ENTRY statement.  That function sets up global
variables which cause control to be transferred to the ENTRY point when
the parent function is called re-entrantly, and then executes that call.
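
The scheme can be pictured with a small C analogue.  This is a
hypothetical sketch: the variable and function names are invented for
illustration, and the real implementation uses __gg__entry_label and
generated GENERIC rather than source like this.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for the global that selects an entry point
   (illustrative name only).  */
static const char *entry_label = NULL;

/* The parent program, with one additional entry point "ALT".  */
int
parent_program (int arg)
{
  const char *label = entry_label;
  entry_label = NULL;	/* Consume it so ordinary calls start at the top.  */
  if (label != NULL && strcmp (label, "ALT") == 0)
    goto alt_entry;

  /* Normal entry.  */
  return arg + 1;

alt_entry:
  return arg * 10;
}

/* The separate function created for ENTRY "ALT": set up the global,
   then call the parent, which jumps to the requested point.  */
int
alt (int arg)
{
  entry_label = "ALT";
  return parent_program (arg);
}
```

Callers see two ordinary functions, each with a single entry point, so
nothing beyond the usual calling convention is required.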

gcc/cobol/ChangeLog:

* genapi.cc (move_tree): Formatting.
(parser_enter_file): Incorporate global __gg__entry_label.
(enter_program_common): Remove calls to alphabet overrides.
(parser_alphabet): Change cbl_alphabet_e handling.
(parser_alphabet_use): Likewise.
(initialize_the_data): Likewise.
(establish_using): Process passed parameters in a subroutine.
(parser_division): Remove in-line parameter processing;
call establish_using() instead. Check for __gg__entry_label.
(parser_file_add): Temporary workaround for charset encoding.
(parser_file_open): Likewise.
(create_and_call): Push/pop program state around call to external.
(parser_entry): Implement new ENTRY statement feature.
(mh_source_is_literalN): Formatting.
* genapi.h (parser_entry): New ENTRY statement.
* gengen.cc (gg_create_goto_pair): Formatting.
(gg_goto_label_decl): Remove.
* gengen.h (gg_goto_label_decl): Remove.
* genutil.cc (internal_codeset_is_ebcdic): Remove.
* genutil.h (internal_codeset_is_ebcdic): Remove.
* symbols.cc (symbols_alphabet_set): Restrict alphabet scan to
program.
* symbols.h (is_elementary): Use defined constants instead of
explicit 'A' and 'N'.

libgcobol/ChangeLog:

* charmaps.cc (__gg__set_internal_codeset): Eliminate ascii/ebcdic.
(__gg__text_conversion_override): Remove.
* charmaps.h (enum text_device_t):  Eliminate ascii/ebcdic.
(enum text_codeset_t): Remove.
(__gg__set_internal_codeset): Remove.
(__gg__text_conversion_override): Remove.
* gfileio.cc: Anticipate cbl_encoding_t fixes.
* libgcobol.cc (struct program_state): Incorporate
__gg__entry_label.
(__gg__pop_program_state): Eliminate unused defines.
(__gg__alphabet_use): Eliminate ascii/ebcdic dichotomy.
* valconv.cc (__gg__alphabet_create): Likewise.

3 days agoDaily bump.
GCC Administrator [Sun, 19 Oct 2025 00:17:28 +0000 (00:17 +0000)] 
Daily bump.

3 days agoRegenerate common.opt.urls
Mark Wielaard [Sat, 18 Oct 2025 23:48:07 +0000 (01:48 +0200)] 
Regenerate common.opt.urls

An alias for -ftree-parallelize-loops was added to common.opt, but
common.opt.urls wasn't regenerated.

Fixes: f708b83d197b ("tree-parloops: Enable runtime thread detection with -ftree-parallelize-loops")
gcc/ChangeLog:

* common.opt.urls: Regenerate.

3 days agolibstdc++: Implement P1494 and P3641 Partial program correctness [PR119060]
Iain Sandoe [Thu, 4 Sep 2025 15:21:16 +0000 (16:21 +0100)] 
libstdc++: Implement P1494 and P3641 Partial program correctness [PR119060]

This implements the library parts of P1494 as amended by P3641.  For GCC the
compiler itself treats stdio operations as equivalent to the observable
checkpoint and thus it does not appear to be necessary to add calls to those
functions (it will not alter the outcome).

This adds the facility for C++26, although there is no reason, in principle,
that it would not work back to C++11 at least.

PR c++/119060

libstdc++-v3/ChangeLog:

* include/bits/version.def: Add observable_checkpoint at present
allowed from C++26.
* include/bits/version.h: Regenerate.
* include/std/utility: Add std::observable_checkpoint().
* src/c++23/std.cc.in: Add observable_checkpoint () to utility.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
3 days agoc++: Implement P1494 and P3641 Partial program correctness [PR119060].
Iain Sandoe [Sat, 6 Sep 2025 16:11:21 +0000 (17:11 +0100)] 
c++: Implement P1494 and P3641 Partial program correctness [PR119060].

P1494 provides a mechanism that serves to demarcate epochs within the code
preventing UB-based optimisations from 'time traveling' across such
boundaries.  The additional paper, P3641, alters the name of the function
to 'observable_checkpoint' which is the name used here.

This implementation maintains the observable function call through to
expand, where it produces no code.

PR c++/119060

gcc/ChangeLog:

* builtins.cc (expand_builtin): Handle BUILT_IN_OBSERVABLE_CHKPT.
* builtins.def (BUILT_IN_OBSERVABLE_CHKPT): New.
* tree.cc (build_common_builtin_nodes): Build observable
checkpoint builtin.

gcc/cp/ChangeLog:

* cxxapi-data.csv: Add observable_checkpoint to <utility>.
* std-name-hint.gperf: Add observable_checkpoint to <utility>.
* std-name-hint.h: Regenerate.

gcc/testsuite/ChangeLog:

* g++.dg/cpp26/observable-checkpoint.C: New test.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
3 days agoc++/modules: Import purview using-directives in the same module [PR122279]
Nathaniel Shead [Wed, 15 Oct 2025 07:13:38 +0000 (18:13 +1100)] 
c++/modules: Import purview using-directives in the same module [PR122279]

[namespace.qual] p1 says that a namespace nominated by a using-directive
is searched if the using-directive precedes that point.

[basic.lookup.general] p2 says that a declaration in a different TU
within a module purview is visible if either the declaration is
exported, or the other TU is part of the same module as the point of
lookup.  This patch implements the second half of that.

PR c++/122279

gcc/cp/ChangeLog:

* module.cc (depset::hash::add_namespace_entities): Seed any
purview using-decls.
(module_state::write_using_directives): Stream if the udir was
exported or not.
(module_state::read_using_directives): Add the using-directive
if it's either exported or part of this module.

gcc/testsuite/ChangeLog:

* g++.dg/modules/namespace-13_b.C: Adjust expected results.
* g++.dg/modules/namespace-13_c.C: Test non-exported
using-directive is not used.
* g++.dg/modules/namespace-14_a.C: New test.
* g++.dg/modules/namespace-14_b.C: New test.
* g++.dg/modules/namespace-14_c.C: New test.
* g++.dg/modules/namespace-14_d.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Patrick Palka <ppalka@redhat.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
3 days agoAArch64: Implement widen_[us]sum using 2-way [US]DOT for SVE2p1 [PR122069]
Tamar Christina [Sat, 18 Oct 2025 07:22:50 +0000 (08:22 +0100)] 
AArch64: Implement widen_[us]sum using 2-way [US]DOT for SVE2p1 [PR122069]

SVE2p1 adds 2-way dotproduct which we can use when we have to do a single step
widening addition.  This is useful for instance when the value to be widened
does not come from a load.  For example for

int foo2_int(unsigned short *x, unsigned short * restrict y) {
  int sum = 0;
  for (int i = 0; i < 8000; i++)
    {
      x[i] = x[i] + y[i];
      sum += x[i];
    }
  return sum;
}

we used to generate

.L12:
        ld1h    z30.h, p7/z, [x0, x2, lsl 1]
        ld1h    z29.h, p7/z, [x1, x2, lsl 1]
        add     z30.h, z30.h, z29.h
        uaddwb  z31.s, z31.s, z30.h
        uaddwt  z31.s, z31.s, z30.h
        st1h    z30.h, p7, [x0, x2, lsl 1]
        mov     x3, x2
        inch    x2
        cmp     w2, w4
        bls     .L12
        inch    x3
        uaddv   d31, p7, z31.s

but with +sve2p1

.L12:
        ld1h    z31.h, p7/z, [x0, x2, lsl 1]
        ld1h    z29.h, p7/z, [x1, x2, lsl 1]
        add     z31.h, z31.h, z29.h
        udot    z30.s, z31.h, z28.h
        st1h    z31.h, p7, [x0, x2, lsl 1]
        mov     x3, x2
        inch    x2
        cmp     w2, w4
        bls     .L12
        inch    x3
        uaddv   d30, p7, z30.s

gcc/ChangeLog:

PR middle-end/122069
* config/aarch64/aarch64-sve2.md
(widen_ssum<mode><Vnarrow>3): Update.
(widen_usum<mode><Vnarrow>3): Update.

gcc/testsuite/ChangeLog:

PR middle-end/122069
* gcc.target/aarch64/sve2/pr122069_3.c: New test.
* gcc.target/aarch64/sve2/pr122069_4.c: New test.

3 days agoAArch64: Implement widen_[us]sum using [US]ADDW[TB] for SVE2 [PR122069]
Tamar Christina [Sat, 18 Oct 2025 07:22:18 +0000 (08:22 +0100)] 
AArch64: Implement widen_[us]sum using [US]ADDW[TB] for SVE2 [PR122069]

SVE2 adds [US]ADDW[TB] which we can use when we have to do a single step
widening addition.  This is useful for instance when the value to be widened
does not come from a load.  For example for

int foo2_int(unsigned short *x, unsigned short * restrict y) {
  int sum = 0;
  for (int i = 0; i < 8000; i++)
    {
      x[i] = x[i] + y[i];
      sum += x[i];
    }
  return sum;
}

we used to generate

.L6:
        ld1h    z1.h, p7/z, [x0, x2, lsl 1]
        ld1h    z29.h, p7/z, [x1, x2, lsl 1]
        add     z29.h, z29.h, z1.h
        punpklo p6.h, p7.b
        uunpklo z0.s, z29.h
        add     z31.s, p6/m, z31.s, z0.s
        punpkhi p6.h, p7.b
        uunpkhi z30.s, z29.h
        add     z31.s, p6/m, z31.s, z30.s
        st1h    z29.h, p7, [x0, x2, lsl 1]
        add     x2, x2, x4
        whilelo p7.h, w2, w3
        b.any   .L6
        ptrue   p7.b, all
        uaddv   d31, p7, z31.s

but with +sve2

.L12:
        ld1h    z30.h, p7/z, [x0, x2, lsl 1]
        ld1h    z29.h, p7/z, [x1, x2, lsl 1]
        add     z30.h, z30.h, z29.h
        uaddwb  z31.s, z31.s, z30.h
        uaddwt  z31.s, z31.s, z30.h
        st1h    z30.h, p7, [x0, x2, lsl 1]
        mov     x3, x2
        inch    x2
        cmp     w2, w4
        bls     .L12
        inch    x3
        uaddv   d31, p7, z31.s
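
A scalar model may help when reading the uaddwb/uaddwt pair above.  It
is a sketch of the element-wise semantics for 16-bit inputs and 32-bit
accumulators; the function names simply mirror the mnemonics and are
not GCC or ACLE interfaces.

```c
#include <stddef.h>
#include <stdint.h>

/* UADDWB: add the even-numbered ("bottom") narrow elements,
   zero-extended, to the wide accumulator lanes.  */
void
uaddwb (uint32_t *acc, const uint16_t *narrow, size_t wide_lanes)
{
  for (size_t i = 0; i < wide_lanes; i++)
    acc[i] += narrow[2 * i];
}

/* UADDWT: the same for the odd-numbered ("top") narrow elements.  */
void
uaddwt (uint32_t *acc, const uint16_t *narrow, size_t wide_lanes)
{
  for (size_t i = 0; i < wide_lanes; i++)
    acc[i] += narrow[2 * i + 1];
}

/* UADDV-style horizontal sum of the accumulator lanes.  */
uint64_t
uaddv (const uint32_t *acc, size_t wide_lanes)
{
  uint64_t sum = 0;
  for (size_t i = 0; i < wide_lanes; i++)
    sum += acc[i];
  return sum;
}
```

Together the two instructions consume every narrow element exactly once,
so no separate unpack step is needed before accumulating.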

gcc/ChangeLog:

PR middle-end/122069
* config/aarch64/aarch64-sve2.md (widen_ssum<mode><Vnarrow>3): New.
(widen_usum<mode><Vnarrow>3): New.
* config/aarch64/iterators.md (Vnarrow): New, to match VNARROW.

gcc/testsuite/ChangeLog:

PR middle-end/122069
* gcc.target/aarch64/sve2/pr122069_1.c: New test.
* gcc.target/aarch64/sve2/pr122069_2.c: New test.

3 days agoAArch64: Implement widen_[us]sum using dotproduct for SVE [PR122069]
Tamar Christina [Sat, 18 Oct 2025 07:21:56 +0000 (08:21 +0100)] 
AArch64: Implement widen_[us]sum using dotproduct for SVE [PR122069]

This patch implements support for using dotproduct to do sum reductions by
changing += a into += (a * 1).  i.e. we seed the multiplication with 1.
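
The identity behind this can be sketched in scalar C.  The helper below models
a single 32-bit lane of an unsigned dot product over four bytes; the names
model_udot_lane and widen_sum_via_dot are hypothetical and only illustrate that
a dot product against all-ones is a widening sum.

```c
#include <stdint.h>

/* Model of one 32-bit lane of an unsigned byte dot product:
   ACC plus the sum of the four products A[i] * B[i].  */
static uint32_t model_udot_lane (uint32_t acc, const uint8_t a[4],
                                 const uint8_t b[4])
{
  for (int i = 0; i < 4; i++)
    acc += (uint32_t) a[i] * b[i];
  return acc;
}

/* Widening sum expressed as a dot product seeded with 1, i.e.
   sum += a becomes sum += (a * 1).  */
static uint32_t widen_sum_via_dot (uint32_t acc, const uint8_t a[4])
{
  static const uint8_t ones[4] = {1, 1, 1, 1};
  return model_udot_lane (acc, a, ones);
}
```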

Given the example

int foo_int(unsigned char *x, unsigned char * restrict y) {
  int sum = 0;
  for (int i = 0; i < 8000; i++)
     sum += char_abs(x[i] - y[i]);
  return sum;
}

we used to generate

.L2:
        ld1b    z1.b, p7/z, [x0, x2]
        ld1b    z29.b, p7/z, [x1, x2]
        sub     z29.b, z1.b, z29.b
        uunpklo z0.h, z29.b
        uunpkhi z29.h, z29.b
        uunpklo z30.s, z0.h
        add     z31.s, p6/m, z31.s, z30.s
        uunpkhi z0.s, z0.h
        add     z31.s, p5/m, z31.s, z0.s
        uunpklo z28.s, z29.h
        add     z31.s, p4/m, z31.s, z28.s
        uunpkhi z29.s, z29.h
        add     z31.s, p3/m, z31.s, z29.s
        add     x2, x2, x7
        whilelo p7.b, w2, w3
        whilelo p3.s, w2, w6
        whilelo p4.s, w2, w5
        whilelo p5.s, w2, w4
        whilelo p6.s, w2, w3
        b.any   .L2
        ptrue   p7.b, all
        uaddv   d31, p7, z31.s

but now generates with +dotprod

.L3:
        ld1b    z30.b, p7/z, [x5, x2]
        ld1b    z29.b, p7/z, [x1, x2]
        sub     z30.b, z30.b, z29.b
        udot    z31.s, z30.b, z28.b
        mov     x3, x2
        add     x2, x2, x6
        cmp     w2, w0
        bls     .L3
        incb    x3
        uaddv   d31, p7, z31.s

gcc/ChangeLog:

PR middle-end/122069
* config/aarch64/aarch64-sve.md (widen_<sur>sum<mode><vsi2qi>3): New.

gcc/testsuite/ChangeLog:

PR middle-end/122069
* gcc.target/aarch64/sve/pr122069_1.c: New test.
* gcc.target/aarch64/sve/pr122069_2.c: New test.

3 days agors6000: convert widen_[us]sum into convert optab [PR122069]
Tamar Christina [Sat, 18 Oct 2025 07:21:30 +0000 (08:21 +0100)] 
rs6000: convert widen_[us]sum into convert optab [PR122069]

This patch is a mechanical rewrite of the widen_[us]sum optabs from a direct to
a conversion optab.  As a result, the output mode has to be added to
the existing patterns.

No change in functionality is expected.

gcc/ChangeLog:

PR middle-end/122069
* config/rs6000/altivec.md (widen_usum<mode>3): Rename ...
(widen_usumv4si<mode>3): ... to this.
(widen_ssumv16qi3): Rename ...
(widen_ssumv4siv16qi3): ... to this.
(widen_ssumv8hi3): Rename ...
(widen_ssumv4siv8hi3): ... to this.

3 days agoia64: convert widen_[us]sum into convert optab [PR122069]
Tamar Christina [Sat, 18 Oct 2025 07:21:09 +0000 (08:21 +0100)] 
ia64: convert widen_[us]sum into convert optab [PR122069]

The target does not seem to have a maintainer listed, so I've CC'ed a group of
global maintainers instead, hoping one of you could approve it.

This patch is a mechanical rewrite of the widen_[us]sum optabs from a direct to
a conversion optab.  As a result, the output mode has to be added to
the existing patterns.

No change in functionality is expected.

gcc/ChangeLog:

PR middle-end/122069
* config/ia64/vect.md (widen_usumv8qi3): Renamed ...
(widen_usumv4hiv8qi3): ... into this.
(widen_usumv4hi3): Renamed ...
(widen_usumv2siv4hi3): ... into this.
(widen_ssumv8qi3): Renamed ...
(widen_ssumv4hiv8qi3): ... into this.
(widen_ssumv4hi3): Renamed ...
(widen_ssumv2siv4hi3): ... into this.

3 days agoarm: convert widen_[us]sum into convert optab [PR122069]
Tamar Christina [Sat, 18 Oct 2025 07:20:47 +0000 (08:20 +0100)] 
arm: convert widen_[us]sum into convert optab [PR122069]

This patch is a mechanical rewrite of the widen_[us]sum optabs from a direct to
a conversion optab.  As a result, the output mode has to be added to
the existing patterns.

No change in functionality is expected.

gcc/ChangeLog:

PR middle-end/122069
* config/arm/iterators.md (v_double_width): New, matching
V_double_width.
* config/arm/neon.md (widen_ssum<mode>3): Renamed ...
(widen_ssum<v_double_width><mode>3, widen_ssum<V_widen_l><mode>3): ...
into these.
(widen_usum<mode>3): Renamed ...
(widen_usum<v_double_width><mode>3, widen_usum<V_widen_l><mode>3): ...
into these.

3 days agoAArch64: add double widen_sum optab using dotprod for Adv.SIMD [PR122069]
Tamar Christina [Sat, 18 Oct 2025 07:20:07 +0000 (08:20 +0100)] 
AArch64: add double widen_sum optab using dotprod for Adv.SIMD [PR122069]

This patch implements support for using dotproduct to do sum reductions by
changing += a into += (a * 1).  i.e. we seed the multiplication with 1.

Given the example

int foo_int(unsigned char *x, unsigned char * restrict y) {
  int sum = 0;
  for (int i = 0; i < 8000; i++)
     sum += char_abs(x[i] - y[i]);
  return sum;
}

we used to generate

.L2:
        ldr     q0, [x0, x2]
        ldr     q28, [x1, x2]
        sub     v28.16b, v0.16b, v28.16b
        zip1    v29.16b, v28.16b, v31.16b
        zip2    v28.16b, v28.16b, v31.16b
        uaddw   v30.4s, v30.4s, v29.4h
        uaddw2  v30.4s, v30.4s, v29.8h
        uaddw   v30.4s, v30.4s, v28.4h
        uaddw2  v30.4s, v30.4s, v28.8h
        add     x2, x2, 16
        cmp     x2, x3
        bne     .L2
        addv    s31, v30.4s

but now generates with +dotprod

.L2:
        ldr     q29, [x0, x2]
        ldr     q28, [x1, x2]
        sub     v28.16b, v29.16b, v28.16b
        udot    v31.4s, v28.16b, v30.16b
        add     x2, x2, 16
        cmp     x2, x3
        bne     .L2
        addv    s31, v31.4s

gcc/ChangeLog:

PR middle-end/122069
* config/aarch64/aarch64-simd.md (widen_ssum<mode><vsi2qi>3): New.
(widen_usum<mode><vsi2qi>3): New.

gcc/testsuite/ChangeLog:

PR middle-end/122069
* gcc.target/aarch64/pr122069_3.c: New test.
* gcc.target/aarch64/pr122069_4.c: New test.

3 days agoAArch64: convert widen_sum optabs to convert [PR122069]
Tamar Christina [Sat, 18 Oct 2025 07:19:28 +0000 (08:19 +0100)] 
AArch64: convert widen_sum optabs to convert [PR122069]

This patch is a mechanical rewrite of the widen_[us]sum optabs from a direct to
a conversion optab.  As a result, the output mode has to be added to
the existing patterns.

No change in functionality is expected.

gcc/ChangeLog:

PR middle-end/122069
* config/aarch64/aarch64-simd.md (widen_ssum<mode>3): Change into..
(widen_ssum<Vdblw><mode>3, widen_ssum<Vwide><mode>3): ... these.
(widen_usum<mode>3): Change into ...
(widen_usum<Vdblw><mode>3, widen_usum<Vwide><mode>3): ... these.
* config/aarch64/iterators.md (Vdblw): New.
(Vwide): Extend to match VWIDE.

gcc/testsuite/ChangeLog:

PR middle-end/122069
* gcc.target/aarch64/pr122069_1.c: New test.
* gcc.target/aarch64/pr122069_2.c: New test.

3 days agomiddle-end: refactor WIDEN_SUM_EXPR into convert optab [PR122069]
Tamar Christina [Sat, 18 Oct 2025 07:18:14 +0000 (08:18 +0100)] 
middle-end: refactor WIDEN_SUM_EXPR into convert optab [PR122069]

This patch changes the widen_[us]sum optabs into conversion optabs such that
targets can specify more than one conversion.

Following this patch are patches rewriting all targets using this change.

While working on this I noticed that the pattern misses some cases it
could handle if it made multiple attempts, e.g. if the promotion is from
qi to si and the target doesn't have this, it should try hi -> si.

But I'm leaving that for now.

gcc/ChangeLog:

PR middle-end/122069
* doc/md.texi (widen_ssum@var{n}@var{m}3, widen_usum@var{n}@var{m}3):
Update docs.
* optabs.cc (expand_widen_pattern_expr): Add WIDEN_SUM_EXPR as widening.
* optabs.def (ssum_widen_optab, usum_widen_optab): Convert from direct
to a conversion optab.
* tree-vect-patterns.cc (vect_recog_widen_sum_pattern): Change
vect_supportable_direct_optab_p into vect_supportable_conv_optab_p.

3 days agofortran: allow character in conditional expression
Yuao Ma [Thu, 16 Oct 2025 14:32:52 +0000 (22:32 +0800)] 
fortran: allow character in conditional expression

This patch allows the use of character types in conditional expressions.

gcc/fortran/ChangeLog:

* resolve.cc (resolve_conditional): Allow character in cond-expr.
* trans-const.cc (gfc_conv_constant): Handle want_pointer.
* trans-expr.cc (gfc_conv_conditional_expr): Fill se->string_length.
(gfc_conv_string_parameter): Handle COND_EXPR tree code.

gcc/testsuite/ChangeLog:

* gfortran.dg/conditional_1.f90: Test character type.
* gfortran.dg/conditional_2.f90: Test print constants.
* gfortran.dg/conditional_4.f90: Test diagnostic message.
* gfortran.dg/conditional_6.f90: Test character cond-arg.

3 days agotree-object-size.cc: Fix assert constant offset in check_for_plus_in_loops [PR122012]
Linsen Zhou [Fri, 17 Oct 2025 03:05:04 +0000 (11:05 +0800)] 
tree-object-size.cc: Fix assert constant offset in check_for_plus_in_loops [PR122012]

After commit 51b85dfeb19652bf3e0aaec08828ba7cee1e641c, when the
pointer offset is a variable in the loop, the object size of the
pointer may also need to be reexamined, which makes the gcc_assert
in check_for_plus_in_loops fail.

gcc/ChangeLog:

PR tree-optimization/122012
* tree-object-size.cc (check_for_plus_in_loops): Skip check
for the variable offset

gcc/testsuite/ChangeLog:

PR tree-optimization/122012
* gcc.dg/torture/pr122012.c: New test.

Signed-off-by: Linsen Zhou <i@lin.moe>
4 days agoDaily bump.
GCC Administrator [Sat, 18 Oct 2025 00:18:06 +0000 (00:18 +0000)] 
Daily bump.

4 days agobpf: fix memset miscompilation with larger stores [PR122139]
David Faust [Wed, 15 Oct 2025 20:36:38 +0000 (13:36 -0700)] 
bpf: fix memset miscompilation with larger stores [PR122139]

The BPF backend expansion of setmem was broken, because it could elect
to use stores of HI, SI or DI modes based on the destination alignment
when the value was QI, but fail to duplicate the byte value across to
those larger sizes.  This resulted in not all bytes of the destination
actually being set to the desired value.

Fix bpf_expand_setmem to ensure the desired byte value is really
duplicated as necessary, whether it is constant or a (sub)reg:QI.
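
The duplication step can be pictured with the following scalar C sketch.  It is
not the RTL the backend emits; splat_byte is a hypothetical helper that only
shows what value a 2-, 4- or 8-byte store must carry so that every destination
byte receives the requested value.

```c
#include <stdint.h>

/* Replicate BYTE into every byte of a store value that is
   SIZE_IN_BYTES wide (1, 2, 4 or 8).  */
static uint64_t splat_byte (uint8_t byte, unsigned size_in_bytes)
{
  uint64_t v = byte;
  /* Double the number of filled bytes on each iteration.  */
  for (unsigned filled = 1; filled < size_in_bytes; filled *= 2)
    v |= v << (8 * filled);
  return v;
}
```

Storing the unduplicated byte in a wider mode would instead write the value to
one byte and zero to the others, which is the miscompilation described above.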

PR target/122139

gcc/

* config/bpf/bpf.cc (bpf_expand_setmem): Duplicate byte value
across to new mode when using larger modes for store.

gcc/testsuite/

* gcc.target/bpf/memset-3.c: New.
* gcc.target/bpf/memset-4.c: New.

4 days agoAArch64: Extend intrinsics framework to account for merging predications without...
Tamar Christina [Fri, 17 Oct 2025 14:43:04 +0000 (15:43 +0100)] 
AArch64: Extend intrinsics framework to account for merging predications without gp [PR121604]

In PR121604 it was noted that the SVE intrinsics infrastructure currently
assumes that for any predicated operation the GP is at the first argument
position which has an svbool_t, or, for a unary merging operation, that it is
in the second position.

However there are intrinsics like pmov_lane which have an svbool_t argument
that is not a GP but instead the inactive lanes.

There are also instructions like BRKB which work only on predicates, so the
first operand is incorrectly determined to be the GP, while that's also the
inactive lanes.

However during apply_predication we do have the information about where the GP
is.  This patch re-organizes the code to record this information into the
function_instance such that folders have access to this information.

For functions that are outliers like pmov_lane we can now override the
availability of the intrinsics having a GP.

gcc/ChangeLog:

PR target/121604
* config/aarch64/aarch64-sve-builtins-shapes.cc (apply_predication):
Store gp_index.
(struct pmov_to_vector_lane_def): Mark instruction as has no GP.
* config/aarch64/aarch64-sve-builtins.h (function_instance::gp_value,
function_instance::inactive_values, function_instance::gp_index,
function_shape::has_gp_argument_p): New.
* config/aarch64/aarch64-sve-builtins.cc (gimple_folder::fold_pfalse):
Simplify code and use GP helpers.

gcc/testsuite/ChangeLog:

PR target/121604
* gcc.target/aarch64/sve/pr121604_brk.c: New test.
* gcc.target/aarch64/sve2/pr121604_pmov.c: New test.

Co-authored-by: Jennifer Schmitz <jschmitz@nvidia.com>
4 days agotree-optimization/122308 - apply LIM after unroll-and-jam
Richard Biener [Fri, 17 Oct 2025 13:12:11 +0000 (15:12 +0200)] 
tree-optimization/122308 - apply LIM after unroll-and-jam

Just like with loop interchange, unroll-and-jam can leave invariant
stmts in the inner loop from outer loop stmts in between the two
inner loop copies.  Do a per-function invariant motion when we
applied unroll-and-jam.  This avoids failed dataref analysis
and fallback to gather/scatter during vectorization.
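
A source-level picture of the situation, written by hand and not taken from the
testcase: the three hypothetical functions below show a loop nest, the same
nest after unroll-and-jam by two (N assumed even, A and B assumed not to
alias), and the result of hoisting the statement that ended up between the two
inner-loop copies.

```c
/* Reference nest: the computation of T belongs to the outer loop.  */
static void nest_ref (int n, int m, const int *b, int *a)
{
  for (int i = 0; i < n; i++)
    {
      int t = b[i] * 2;
      for (int j = 0; j < m; j++)
        a[i * m + j] += t;
    }
}

/* After unroll-and-jam by 2: the second copy of the outer-loop
   statement sits between the two inner-loop bodies although it
   does not depend on J.  */
static void nest_jammed (int n, int m, const int *b, int *a)
{
  for (int i = 0; i < n; i += 2)
    {
      int t0 = b[i] * 2;
      for (int j = 0; j < m; j++)
        {
          a[i * m + j] += t0;
          int t1 = b[i + 1] * 2;   /* invariant in the J loop */
          a[(i + 1) * m + j] += t1;
        }
    }
}

/* After invariant motion: T1 is computed once per outer iteration.  */
static void nest_jammed_lim (int n, int m, const int *b, int *a)
{
  for (int i = 0; i < n; i += 2)
    {
      int t0 = b[i] * 2;
      int t1 = b[i + 1] * 2;
      for (int j = 0; j < m; j++)
        {
          a[i * m + j] += t0;
          a[(i + 1) * m + j] += t1;
        }
    }
}
```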

PR tree-optimization/122308
* gimple-loop-jam.cc (tree_loop_unroll_and_jam): Do LIM
after applying unroll-and-jam.

* gcc.dg/vect/vect-pr122308.c: New testcase.

4 days agoipa, cgraph: Enable constant propagation to OpenMP kernels.
Josef Melcr [Thu, 16 Oct 2025 14:25:29 +0000 (16:25 +0200)] 
ipa, cgraph: Enable constant propagation to OpenMP kernels.

This patch enables constant propagation to outlined OpenMP kernels.
It does so using a new function attribute called ' callback' (note the
space).

The attribute ' callback' captures the notion of a function calling one
of its arguments with some of its parameters as arguments.  An OpenMP
example of such function is GOMP_parallel.
We implement the attribute with new callgraph edges called callback
edges. They are imaginary edges pointing from the caller of the function
with the attribute (e.g. caller of GOMP_parallel) to the body function
itself (e.g. the outlined OpenMP body).  They share their call statement
with the edge from which they are derived (direct edge caller -> GOMP_parallel
in this case).  These edges allow passes such as ipa-cp to see the hidden
call site to the body function and optimize the function accordingly.

To illustrate with an example, the body of GOMP_parallel looks something
like this:

void GOMP_parallel (void (*fn) (void *), void *data, /* ... */)
{
  /* ... */
  fn (data);
  /* ... */
}

If we extend it with the attribute ' callback(1, 2)', we express that the
function calls its first argument and passes it its second argument.
This is represented in the call graph in this manner:

             direct                         indirect
caller -----------------> GOMP_parallel ---------------> fn
  |
  ----------------------> fn
          callback

The direct edge is then the callback-carrying edge; all new edges
are the derived callback edges.
While constant propagation is the main focus of this patch, callback
edges can be useful for different passes (for example, they improve icf
for OpenMP kernels), as they allow for address redirection.
If the outlined body function gets optimized and cloned, from body_fn to
body_fn.optimized, the callback edge allows us to replace the
address in the arguments list:

GOMP_parallel (body_fn, &data_struct, /* ... */);

becomes

GOMP_parallel (body_fn.optimized, &data_struct, /* ... */);

This redirection is possible for any function with the attribute.

This callback attribute implementation is partially compatible with
clang's implementation. Its semantics, arguments and argument indexing style are
the same, but we represent an unknown argument position with 0
(precedent set by attributes such as 'format'), while clang uses -1 or '?'.
We use the index 1 for the 'this' pointer in member functions, clang
uses 0. We also allow for multiple callback attributes on the same function,
while clang only allows one.

The attribute is currently for GCC internal use only, thanks to the
space in its name.  Originally, it was supposed to be called
'callback' like its clang counterpart, but we cannot use this name, as
clang uses non-standard indexing style, leading to inconsistencies.  The
attribute will be introduced into the public API as 'gnu::callback_only'
in a future patch.

The attribute allows us to propagate constants into body functions of
OpenMP constructs. Currently, GCC won't propagate the value 'c' into the
OpenMP body in the following example:

int a[100];
void test(int c) {
#pragma omp parallel for
  for (int i = 0; i < c; i++) {
    if (!__builtin_constant_p(c)) {
      __builtin_abort();
    }
    a[i] = i;
  }
}
int main() {
  test(100);
  return a[5] - 5;
}

With this patch, the body function will get cloned and the constant 'c'
will get propagated.
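
The effect can be pictured with a hand-outlined C sketch that uses no OpenMP at
all.  Everything in it is hypothetical: run_parallel_stub stands in for a
runtime entry point that calls its first argument with its second argument,
body_fn is what an outlined body might look like, and body_fn_c100 is a
hand-written picture of a clone specialized for c == 100.

```c
int arr[100];

struct omp_data { int c; };

/* Stand-in for a function carrying callback(1, 2): it calls FN
   and passes it DATA.  */
static void run_parallel_stub (void (*fn) (void *), void *data)
{
  fn (data);
}

/* Outlined body: the loop bound arrives through DATA, so it is not
   a compile-time constant inside this function.  */
static void body_fn (void *data)
{
  int c = ((struct omp_data *) data)->c;
  for (int i = 0; i < c; i++)
    arr[i] = i;
}

/* Specialized clone: the bound is the constant 100 and DATA is unused.  */
static void body_fn_c100 (void *data)
{
  (void) data;
  for (int i = 0; i < 100; i++)
    arr[i] = i;
}

/* Call site before redirection.  */
int test_generic (void)
{
  struct omp_data d = { 100 };
  run_parallel_stub (body_fn, &d);
  return arr[5] - 5;
}

/* Call site after the function address has been redirected
   to the clone.  */
int test_specialized (void)
{
  struct omp_data d = { 100 };
  run_parallel_stub (body_fn_c100, &d);
  return arr[5] - 5;
}
```

Without an edge from the caller to body_fn, an interprocedural pass only sees
the call to the stub and has no call site at which the value 100 is visible.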

Some functions may utilize the attribute's infrastructure without being
declared with it, for example GOMP_task.  These functions are special
cases and use the special case functions found in attr-callback.h.  Special
cases use the attribute under certain circumstances, for example
GOMP_task uses it when the copy function is not required.

gcc/ChangeLog:

* Makefile.in: Add attr-callback.o to OBJS.
* builtin-attrs.def (ATTR_CALLBACK): Callback attr identifier.
(DEF_CALLBACK_ATTRIBUTE): Macro for callback attr creation.
(GOMP): Attr for libgomp functions.
(ATTR_CALLBACK_GOMP_LIST): ATTR_NOTHROW_LIST with GOMP callback
attr added.
* cgraph.cc (cgraph_add_edge_to_call_site_hash): Always hash the
callback-carrying edge.
(cgraph_node::get_edge): Always return the callback-carrying
edge.
(cgraph_edge::set_call_stmt): Add cascade for callback edges.
(symbol_table::create_edge): Allow callback edges to share call
stmts, initialize new flags.
(cgraph_edge::make_callback): New method, derives a new callback
edge.
(cgraph_edge::get_callback_carrying_edge): New method.
(cgraph_edge::first_callback_edge): Likewise.
(cgraph_edge::next_callback_edge): Likewise.
(cgraph_edge::purge_callback_edges): Likewise.
(cgraph_edge::redirect_callee): When redirecting a callback
edge, redirect its ref as well.
(cgraph_edge::redirect_call_stmt_to_callee): Add callback edge
redirection logic, set update_derived_edges to true when
redirecting the carrying edge.
(cgraph_node::remove_callers): Add cascade for callback edges.
(cgraph_edge::dump_edge_flags): Print callback flags.
(cgraph_node::verify_node): Add sanity checks for callback
edges.
* cgraph.h: Add new 1 bit flags and 16 bit callback_id to
cgraph_edge class.
* cgraphclones.cc (cgraph_edge::clone): Copy over callback data.
* cif-code.def (CALLBACK_EDGE): Add CIF_CALLBACK_EDGE code.
* ipa-cp.cc (purge_useless_callback_edges): New function,
deletes callback edges when necessary.
(ipcp_decision_stage): Call purge_useless_callback_edges.
* ipa-fnsummary.cc (ipa_call_summary_t::duplicate): Add
an exception for callback edges.
(analyze_function_body): Copy over summary from carrying to
callback edge.
* ipa-inline-analysis.cc (do_estimate_growth_1): Skip callback
edges when estimating growth.
* ipa-inline-transform.cc (inline_transform): Add redirection
cascade for callback edges.
* ipa-param-manipulation.cc
(drop_decl_attribute_if_params_changed_p): New function.
(ipa_param_adjustments::build_new_function_type): Add
args_modified out param.
(ipa_param_adjustments::adjust_decl): Drop callback attrs when
modifying args.
* ipa-param-manipulation.h: Adjust decl of
build_new_function_type.
* ipa-prop.cc (ipa_duplicate_jump_function): Add decl.
(init_callback_edge_summary): New function.
(ipa_compute_jump_functions_for_edge): Add callback edge
creation logic.
* lto-cgraph.cc (lto_output_edge): Stream out callback data.
(input_edge): Input callback data.
* omp-builtins.def (BUILT_IN_GOMP_PARALLEL_LOOP_STATIC): Use new
attr list.
(BUILT_IN_GOMP_PARALLEL_LOOP_GUIDED): Likewise.
(BUILT_IN_GOMP_PARALLEL_LOOP_NONMONOTONIC_DYNAMIC): Likewise.
(BUILT_IN_GOMP_PARALLEL_LOOP_NONMONOTONIC_RUNTIME): Likewise.
(BUILT_IN_GOMP_PARALLEL): Likewise.
(BUILT_IN_GOMP_PARALLEL_SECTIONS): Likewise.
(BUILT_IN_GOMP_TEAMS_REG): Likewise.
* tree-core.h (ECF_CB_1_2): New constant for callback(1,2).
* tree-inline.cc (copy_bb): Copy callback edges when copying the
carrying edge.
(redirect_all_calls): Redirect callback edges.
* tree.cc (set_call_expr_flags): Create callback attr according
to the ECF_CB flag.
* attr-callback.cc: New file.
* attr-callback.h: New file.

gcc/c-family/ChangeLog:

* c-attribs.cc: Define callback attr.

gcc/fortran/ChangeLog:

* f95-lang.cc (ATTR_CALLBACK_GOMP_LIST): New attr list
corresponding to the list in builtin-attrs.def.

gcc/testsuite/ChangeLog:

* gcc.dg/ipa/ipcp-cb-spec1.c: New test.
* gcc.dg/ipa/ipcp-cb-spec2.c: New test.
* gcc.dg/ipa/ipcp-cb1.c: New test.

Signed-off-by: Josef Melcr <jmelcr02@gmail.com>
4 days agoFix missing style violation report for package instantiation
Eric Botcazou [Fri, 17 Oct 2025 09:02:28 +0000 (11:02 +0200)] 
Fix missing style violation report for package instantiation

Unlike for subprogram instantiation, -gnatyr does not report style violation
for package instantiation, more precisely for the generic package's name.

Fixing it uncovered style violations in the sources of the compiler itself!

gcc/ada/
PR ada/122295
* sem_ch12.adb (Analyze_Package_Instantiation): Force Style_Check
to False only after possibly installing the parent.
* aspects.adb (UAD_Pragma_Map): Fix style violation.
* inline.adb (To_Pending_Instantiations): Likewise.
* lib.ads (Unit_Names): Likewise.
* repinfo.adb (Relevant_Entities): Likewise.
* sem_ch7.adb (Subprogram_Table): Likewise.
(Traversed_Table): Likewise.
* sem_util.adb (Interval_Sorting): Likewise.

gcc/testsuite/
* gnat.dg/specs/style1.ads: New test.

4 days agolibstdc++: Fix typo in __atomic_ref_base::_S_required_alignment.
Tomasz Kamiński [Fri, 17 Oct 2025 08:24:09 +0000 (10:24 +0200)] 
libstdc++: Fix typo in __atomic_ref_base::_S_required_alignment.

libstdc++-v3/ChangeLog:

* include/bits/atomic_base.h
(__atomic_ref_base::_S_required_alignment): Renamed from...
(__atomic_ref_base::_S_required_aligment): Renamed.

4 days agotree-optimization/122301 - fix ICE and improve vectorization of min/max reduction
Richard Biener [Fri, 17 Oct 2025 07:26:25 +0000 (09:26 +0200)] 
tree-optimization/122301 - fix ICE and improve vectorization of min/max reduction

The following fixes another issue with updating of reduc_idx in pattern
sequences.  But the testcase also shows the pattern in question is
harmful for vectorization since a reduction path may not contain
promotions/demotions.  So the already existing but ineffective check
to guard the pattern is fixed.

PR tree-optimization/122301
* tree-vect-patterns.cc (vect_recog_over_widening_pattern):
Fix reduction guard.
(vect_mark_pattern_stmts): Fix reduction def check.

* gcc.dg/vect/vect-pr122301.c: New testcase.

4 days agovect: Add pattern recognition for vectorizing {FLOOR,CEIL,ROUND}_{MOD, DIV}_EXPR
Avinash Jayakar [Fri, 17 Oct 2025 07:08:59 +0000 (12:38 +0530)] 
vect: Add pattern recognition for vectorizing {FLOOR,CEIL,ROUND}_{MOD, DIV}_EXPR

Added a new helper function "add_code_for_floorceilround_divmod" in
tree-vect-patterns.cc for adding compensating code for each of the ops
{FLOOR,ROUND,CEIL}_{DIV,MOD}_EXPR. This function checks whether the target
supports all operations required to implement these ops and generates
vectorized code for the respective operation, based on the following logic:
FLOOR_{DIV,MOD}
r = x %[fl] y;
r = x % y; if (r && (x ^ y) < 0) r += y;
r = x/[fl] y;
r = x % y; d = x/y; if (r && (x ^ y) < 0) d--;
CEIL_{DIV,MOD} (unsigned)
r = x %[cl] y;
r = x % y; if (r) r -= y;
r = x/[cl] y;
r = x % y; d = x/y; if (r) d++;
CEIL_{DIV,MOD} (signed)
r = x %[cl] y;
r = x % y; if (r && (x ^ y) >= 0) r -= y;
r = x/[cl] y;
r = x % y; d = x/y; if (r && (x ^ y) >= 0) d++;
ROUND_{DIV,MOD} (unsigned)
r = x %[rd] y;
r = x % y; if (r > ((y-1)/2)) r -= y;
r = x/[rd] y;
r = x % y; d = x/y; if (r > ((y-1)/2)) d++;
ROUND_{DIV,MOD} (signed)
r = x %[rd] y;
r = x % y; if (r > ((y-1)/2))
{if ((x ^ y) >= 0) r -= y; else r += y;}
r = x/[rd] y;
r = x % y; d = x/y; if ((r > ((y-1)/2)) && (x ^ y) >= 0)
{if ((x ^ y) >= 0) d++; else d--;}
Each of these cases is implemented in a vectorized form.
This function is then called in each of the three paths in
vect_recog_divmod_pattern, which are selected by the value of the constant
operand1:
1. == 2
2. == power of 2
3. otherwise
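
For reference, the FLOOR, signed CEIL and unsigned ROUND rules listed above can
be written as scalar C on top of the truncating / and % operators.  This is a
sketch of the compensation logic only, not the vectorized code the patch
generates; the helper names are hypothetical, and all helpers assume y != 0 and
no overflow in x / y.

```c
/* FLOOR_MOD: adjust the truncating remainder when the signs differ.  */
static int floor_mod (int x, int y)
{
  int r = x % y;
  if (r && (x ^ y) < 0)
    r += y;
  return r;
}

/* FLOOR_DIV: step the truncating quotient down when the signs differ.  */
static int floor_div (int x, int y)
{
  int r = x % y;
  int d = x / y;
  if (r && (x ^ y) < 0)
    d--;
  return d;
}

/* Signed CEIL_MOD: adjust when the signs agree.  */
static int ceil_mod (int x, int y)
{
  int r = x % y;
  if (r && (x ^ y) >= 0)
    r -= y;
  return r;
}

/* Signed CEIL_DIV: step the quotient up when the signs agree.  */
static int ceil_div (int x, int y)
{
  int r = x % y;
  int d = x / y;
  if (r && (x ^ y) >= 0)
    d++;
  return d;
}

/* Unsigned ROUND_DIV: round up once the remainder exceeds (y-1)/2.  */
static unsigned round_udiv (unsigned x, unsigned y)
{
  unsigned r = x % y;
  unsigned d = x / y;
  if (r > (y - 1) / 2)
    d++;
  return d;
}
```

The test (x ^ y) < 0 is true exactly when x and y have different signs, so each
signed helper needs only one comparison besides the remainder check.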

2025-10-17  Avinash Jayakar  <avinashd@linux.ibm.com>

gcc/ChangeLog:
PR tree-optimization/104116
* tree-vect-patterns.cc (add_code_for_floorceilround_divmod): patt recog
for {FLOOR,ROUND,CEIL}_{DIV,MOD}_EXPR.
(vect_recog_divmod_pattern): Call add_code_for_floorceilround_divmod
after computing div/mod for each control path.

gcc/testsuite/ChangeLog:
PR tree-optimization/104116
* gcc.dg/vect/pr104116-ceil-div-2.c: New test.
* gcc.dg/vect/pr104116-ceil-div-pow2.c: New test.
* gcc.dg/vect/pr104116-ceil-div.c: New test.
* gcc.dg/vect/pr104116-ceil-mod-2.c: New test.
* gcc.dg/vect/pr104116-ceil-mod-pow2.c: New test.
* gcc.dg/vect/pr104116-ceil-mod.c: New test.
* gcc.dg/vect/pr104116-ceil-udiv-2.c: New test.
* gcc.dg/vect/pr104116-ceil-udiv-pow2.c: New test.
* gcc.dg/vect/pr104116-ceil-udiv.c: New test.
* gcc.dg/vect/pr104116-ceil-umod-2.c: New test.
* gcc.dg/vect/pr104116-ceil-umod-pow2.c: New test.
* gcc.dg/vect/pr104116-ceil-umod.c: New test.
* gcc.dg/vect/pr104116-floor-div-2.c: New test.
* gcc.dg/vect/pr104116-floor-div-pow2.c: New test.
* gcc.dg/vect/pr104116-floor-div.c: New test.
* gcc.dg/vect/pr104116-floor-mod-2.c: New test.
* gcc.dg/vect/pr104116-floor-mod-pow2.c: New test.
* gcc.dg/vect/pr104116-floor-mod.c: New test.
* gcc.dg/vect/pr104116-round-div-2.c: New test.
* gcc.dg/vect/pr104116-round-div-pow2.c: New test.
* gcc.dg/vect/pr104116-round-div.c: New test.
* gcc.dg/vect/pr104116-round-mod-2.c: New test.
* gcc.dg/vect/pr104116-round-mod-pow2.c: New test.
* gcc.dg/vect/pr104116-round-mod.c: New test.
* gcc.dg/vect/pr104116-round-udiv-2.c: New test.
* gcc.dg/vect/pr104116-round-udiv-pow2.c: New test.
* gcc.dg/vect/pr104116-round-udiv.c: New test.
* gcc.dg/vect/pr104116-round-umod-2.c: New test.
* gcc.dg/vect/pr104116-round-umod-pow2.c: New test.
* gcc.dg/vect/pr104116-round-umod.c: New test.
* gcc.dg/vect/pr104116.h: New test.

4 days agomatch: Fix (a != b) | ((a|b) != 0) and (a == b) & ((a|b) == 0) match pattern [PR122296]
Andrew Pinski [Fri, 17 Oct 2025 00:02:52 +0000 (17:02 -0700)] 
match: Fix (a != b) | ((a|b) != 0) and (a == b) & ((a|b) == 0) match pattern [PR122296]

There are 2 fixes for these 2 patterns.
1) Reuse the (a|b) expression instead of recreating it
   Fixed by capturing the bit_ior expression and using that instead
   of a new expression.
2) Use the correct 0. Fixed by capturing the integer_zerop and using that
   instead of integer_zero_node.

2) could be fixed by using `build_zero_cst (TREE_TYPE (@0))`, but since
we already have the correct 0, capturing it would be faster.
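
The two patterns can be checked by brute force in plain C.  The sketch below
assumes the simplified forms are `(a|b) != 0` and `(a|b) == 0`, which is what
reusing the ior and the zero suggests; the function names are hypothetical.

```c
#include <stdbool.h>

static bool or_pattern (int a, int b)
{
  return (a != b) | ((a | b) != 0);
}

static bool or_simplified (int a, int b)
{
  return (a | b) != 0;
}

static bool and_pattern (int a, int b)
{
  return (a == b) & ((a | b) == 0);
}

static bool and_simplified (int a, int b)
{
  return (a | b) == 0;
}

/* Count the inputs in [LO, HI] x [LO, HI] on which a pattern and its
   simplified form disagree.  */
static int count_mismatches (int lo, int hi)
{
  int bad = 0;
  for (int a = lo; a <= hi; a++)
    for (int b = lo; b <= hi; b++)
      {
        if (or_pattern (a, b) != or_simplified (a, b))
          bad++;
        if (and_pattern (a, b) != and_simplified (a, b))
          bad++;
      }
  return bad;
}
```

The equivalence holds because a != b implies that at least one operand is
nonzero, and (a|b) == 0 implies a == b == 0.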

Pushed as obvious after a bootstrap/test on x86_64-linux-gnu.

PR tree-optimization/122296

gcc/ChangeLog:

* match.pd (`(a != b) | ((a|b) != 0)`): Reuse both
the ior and zero instead of recreating them.
(`(a == b) & ((a|b) == 0)`): Likewise

gcc/testsuite/ChangeLog:

* gcc.dg/torture/int-bwise-opt-1.c: New test.

Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
4 days agomatch: Fix `(a == b) | ((a|b) != 0)` pattern for vectors [PR122296]
Andrew Pinski [Thu, 16 Oct 2025 23:10:59 +0000 (16:10 -0700)] 
match: Fix `(a == b) | ((a|b) != 0)` pattern for vectors [PR122296]

The pattern `(a == b) | ((a|b) != 0)` uses build_one_cst to build boolean true
but boolean can be a signed multi-bit type. So this changes the result to
use constant_boolean_node instead.
`(a != b) & ((a|b) == 0)` has a similar issue but in that case it is less likely
to be an issue as false is almost always just 0 but this changes it to be consistent.
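
Why 1 is not always "true" can be seen with the GNU C vector extensions
(accepted by GCC and Clang), where a comparison yields -1, not 1, in the lanes
that compare equal.  The sketch below is user-level code illustrating that
convention, not the match.pd change; eq_lane is a hypothetical helper.

```c
typedef int v4si __attribute__ ((vector_size (16)));

/* Return lane LANE (0..3) of the vector comparison A == B for two
   fixed inputs that agree in lanes 0 and 2 only.  */
static int eq_lane (int lane)
{
  v4si a = {1, 2, 3, 4};
  v4si b = {1, 0, 3, 0};
  v4si eq = (a == b);   /* -1 where equal, 0 elsewhere */
  return eq[lane];
}
```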

Pushed as obvious after a bootstrap/test on x86_64-linux-gnu.

PR tree-optimization/122296

gcc/ChangeLog:

* match.pd (`(a == b) | ((a|b) != 0)`): Fix true value.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/int-bwise-opt-vect01.c: New test.

Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
4 days agox86: Cast stride to __PTRDIFF_TYPE__ for AMX-MOVRS intrinsics. [PR122119]
Hu, Lin1 [Fri, 10 Oct 2025 06:30:19 +0000 (14:30 +0800)] 
x86: Cast stride to __PTRDIFF_TYPE__ for AMX-MOVRS intrinsics. [PR122119]

On 64-bit Windows, long can't be used because it is only 32 bits wide. Use
__PTRDIFF_TYPE__ instead of long.

gcc/ChangeLog:

PR target/122119
* config/i386/amxmovrsintrin.h
(_tile_loaddrs_internal): Use __PTRDIFF_TYPE__ instead of long.
(_tile_loaddrst1_internal): Ditto.

5 days agoDaily bump.
GCC Administrator [Fri, 17 Oct 2025 00:18:48 +0000 (00:18 +0000)] 
Daily bump.

5 days agodiagnostics: generalize state graph code to use json::property instances (v2)
David Malcolm [Thu, 16 Oct 2025 21:39:03 +0000 (17:39 -0400)] 
diagnostics: generalize state graph code to use json::property instances (v2)

In r16-1631-g2334d30cd8feac I added support for capturing state
information from -fanalyzer in the form of embedded XML strings
in SARIF output.

In r16-2211-ga5d9debedd2f46 I rewrote this so the state was captured in
the form of a SARIF directed graph, using various custom types.

I want to add the ability to capture other kinds of graph in our SARIF
output (e.g. inheritance hierarchies, CFGs, etc), so the following patch
reworks the state graph handling code to minimize the use of custom types.
Instead, the patch introduces various json::property types, and
describes the state graph serialization in terms of instances of these
properties, rather than hardcoding string attribute names in readers and
writers.  The custom SARIF properties live in a new
"gcc/custom-sarif-properties/" directory.

The "experimental-html" scheme keys "show-state-diagrams-dot-src" and
"show-state-diagrams-sarif" become "show-graph-dot-src" and
"show-graph-sarif" in preparation for new kinds of graph in the output.

This is an updated version of the patch, tested to build with GCC 5
(which the previous version didn't, leading to PR bootstrap/122151).

contrib/ChangeLog:
* gcc.doxy (INPUT): Add gcc/custom-sarif-properties

gcc/ChangeLog:
* Makefile.in (OBJS-libcommon): Add
custom-sarif-properties/digraphs.o and
custom-sarif-properties/state-graphs.o.  Remove
diagnostics/state-graphs.o.
* configure: Regenerate.
* configure.ac: Add custom-sarif-properties to subdir iteration.
* custom-sarif-properties/digraphs.cc: New file.
* custom-sarif-properties/digraphs.h: New file.
* custom-sarif-properties/state-graphs.cc: New file.
* custom-sarif-properties/state-graphs.h: New file.
* diagnostics/diagnostics-selftests.cc
(run_diagnostics_selftests): Drop call of state_graphs_cc_tests.
* diagnostics/diagnostics-selftests.h (state_graphs_cc_tests):
Delete decl.
* diagnostics/digraphs.cc: Include
"custom-sarif-properties/digraphs.h".  Move include of
"selftest.h" to within CHECKING_P section.
(using digraph_object): New.
(namespace properties): New.
(diagnostics::digraphs::object::get_attr): Delete.
(diagnostics::digraphs::object::set_attr): Delete.
(diagnostics::digraphs::object::set_json_attr): Delete.
(digraph_object::get_property): New definitions, for various
property types.
(digraph_object::set_property): Likewise.
(digraph_object::maybe_get_property): New.
(digraph_object::get_property_as_tristate): New.
(digraph_object::ensure_property_bag): New.
(digraph::get_graph_kind): New.
(digraph::set_graph_kind): New.
Add include of "custom-sarif-properties/state-graphs.h".
(selftest::test_simple_graph): Rewrite to use json::property
instances rather than string attribute names.
(selftest::test_property_objects): New test.
(selftest::digraphs_cc_tests): Call it.
* diagnostics/digraphs.h: Include "tristate.h".
(object::get_attr): Delete.
(object::set_attr): Delete.
(object::get_property): New decls.
(object::set_property): New decls.
(object::maybe_get_property): New.
(object::get_property_as_tristate): New.
(object::set_json_attr): Delete.
(object::ensure_property_bag): New.
(graph::get_graph_kind): New.
(graph::set_graph_kind): New.
* diagnostics/html-sink.cc
(html_generation_options::html_generation_options): Update for
field renamings.
(html_generation_options::dump): Likewise.
(html_builder::maybe_make_state_diagram): Likewise.
(html_builder::add_graph): Show SARIF and .dot src inline, if
requested.
* diagnostics/html-sink.h
(html_generation_options::m_show_state_diagrams_sarif): Rename
to...
(html_generation_options::m_show_graph_sarif): ...this.
(html_generation_options::m_show_state_diagrams_dot_src): Rename
to...
(html_generation_options::m_show_graph_dot_src): ...this.
* diagnostics/output-spec.cc
(html_scheme_handler::maybe_handle_kv): Rename keys.
(html_scheme_handler::get_keys): Likewise.
* diagnostics/state-graphs-to-dot.cc: Reimplement throughout to
use json::property instances found within custom_sarif_properties
throughout, rather than types in diagnostics::state_graphs.
* diagnostics/state-graphs.cc: Deleted file.
* diagnostics/state-graphs.h: Delete almost all, except decl of
diagnostics::state_graphs::make_dot_graph.
* doc/invoke.texi: Update for changes to "experimental-html" sink
keys.
* json.cc (json::object::set_string): New.
(json::object::set_integer): New.
(json::object::set_bool): New.
(json::object::set_array_of_string): New.
* json.h: Include "label-text.h".
(struct json::property): New template.
(json::string_property): New.
(json::integer_property): New.
(json::bool_property): New.
(json::json_property): New.
(using json::array_of_string_property): New.
(struct json::enum_traits): New.
(json::enum_property): New.
(json::value::dyn_cast_array): New vfunc.
(json::value::dyn_cast_integer_number): New vfunc.
(json::value::set_string): New.
(json::value::set_integer): New.
(json::value::set_bool): New.
(json::value::set_array_of_string): New.
(json::value::maybe_get_enum): New.
(json::value::set_enum): New.
(json::array::dyn_cast_array): New.
(json::integer_number::dyn_cast_integer_number): New.
(object::maybe_get_enum): New.
(object::set_enum): New.

gcc/analyzer/ChangeLog:
* ana-state-to-diagnostic-state.cc: Reimplement throughout to use
json::property instances found within custom_sarif_properties
throughout, rather than types in diagnostics::state_graphs.
* ana-state-to-diagnostic-state.h: Likewise.
* checker-event.cc: Likewise.
* sm-malloc.cc: Likewise.

gcc/testsuite/ChangeLog:
* gcc.dg/plugin/diagnostic_plugin_test_graphs.cc
(report_diag_with_graphs): Port from set_attr to set_property.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
5 days agodwarf: add wiki link for DWARF GNU_annotation extensions
David Faust [Thu, 16 Oct 2025 17:39:31 +0000 (10:39 -0700)] 
dwarf: add wiki link for DWARF GNU_annotation extensions

include/

* dwarf2.def (DW_TAG_GNU_annotation): Add link to wiki page
documenting the extension.
(DW_AT_GNU_annotation): Likewise.

5 days agolibstdc++: Improve ostream output for std::stacktrace
Jonathan Wakely [Wed, 15 Oct 2025 19:10:34 +0000 (20:10 +0100)] 
libstdc++: Improve ostream output for std::stacktrace

With this change stacktrace entries always output the frame address, and
missing source file information no longer results in " at :0", e.g.

  16#  myfunc(int) at /tmp/bt.cc:48 [0x4008b7]
  17#  main at /tmp/bt.cc:61 [0x40091a]
  18#  __libc_start_call_main [0x7efc3d6d3574]
  19#  __libc_start_main@GLIBC_2.2.5 [0x7efc3d6d3627]
  20#  _start [0x400684]

This replaces the previous output:

  16# myfunc(int) at /tmp/bt.cc:48
  17# main at /tmp/bt.cc:61
  18# __libc_start_call_main at :0
  19# __libc_start_main@GLIBC_2.2.5 at :0
  20# _start at :0

A change that is not visible in the examples above is that for a
non-empty stacktrace_entry, we now print "<unknown>" for the function
name if description() returns an empty string.  For an empty (e.g.
default constructed) stacktrace_entry the entire string representation
is now "<unknown>" instead of an empty string.

Instead of printing "<unknown>" for the function name, we could set that
string in the stacktrace_entry::_Info object, so that description()
returns "<unknown>" and then operator<< wouldn't need to handle an empty
description() string. However, returning an empty string from that
function seems simpler for users to detect, rather than having to parse
"<unknown>".

We could also choose a different string for an empty stacktrace_entry,
maybe "<none>" or "<invalid>", but "<unknown>" seems good.
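
The formatting rules described above can be sketched as follows. This is a
minimal illustration only: the Entry struct and format_entry function are
hypothetical stand-ins, not the libstdc++ types or the actual operator<<
implementation, and an address of 0 is assumed to mean an empty entry.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical stand-in for std::stacktrace_entry (not the libstdc++ type).
struct Entry {
    std::uintptr_t address = 0;   // assumption: 0 marks an empty entry
    std::string description;      // function name, may be empty
    std::string source_file;      // may be empty
    unsigned source_line = 0;
};

// Format one entry following the rules stated in the commit message.
std::string format_entry(const Entry& e) {
    if (e.address == 0)
        return "<unknown>";                        // empty entry
    std::ostringstream os;
    os << (e.description.empty() ? "<unknown>" : e.description);
    if (!e.source_file.empty())                    // avoids printing " at :0"
        os << " at " << e.source_file << ':' << e.source_line;
    os << " [0x" << std::hex << e.address << ']';  // address is always printed
    return os.str();
}
```

Keeping description() empty and substituting "<unknown>" only at output time
matches the design choice argued for above: callers can test for an empty
string instead of parsing a placeholder.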

libstdc++-v3/ChangeLog:

* include/std/stacktrace
(operator<<(ostream&, const stacktrace_entry&)): Improve output
when description() or source_file() returns an empty string,
or the stacktrace_entry is invalid. Append frame address to
output.
(operator<<(ostream&, const basic_stacktrace<A>&)): Use the
size_type of the correct specialization.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Reviewed-by: Nathan Myers <nmyers@redhat.com>
5 days agoError out stack-protector unavailability on AIX
Ayappan Perumal [Mon, 1 Sep 2025 13:27:52 +0000 (08:27 -0500)] 
Error out stack-protector unavailability on AIX

stack-protector is not supported in GCC on AIX.  This patch makes
compilation fail if the -fstack-protector option is passed.

gcc/ChangeLog:

* config/rs6000/aix.h (SUBTARGET_DRIVER_SELF_SPECS):
Error out when the stack-protector option is used on AIX,
as it is not supported there.

Approved By: Segher Boessenkool <segher@kernel.crashing.org>

5 days agolibgomp.c/declare-variant-4-gfx*: Add missing archs + dg-excess-errors
Tobias Burnus [Thu, 16 Oct 2025 09:11:39 +0000 (11:11 +0200)] 
libgomp.c/declare-variant-4-gfx*: Add missing archs + dg-excess-errors

Add missing tests for gfx* context selectors; mark all but the
default-arch declare-variant-4.c with 'dg-excess-errors' to
silence libgomp not-found errors (still passing the
scan-offload-tree-dump check) - or at least causing just
UNRESOLVED errors if the error is
  "built without library support ... consider compiling for
   the associated generic architecture".

In case the multilib is configured, the result will be
an XPASS.

libgomp/ChangeLog:

* testsuite/libgomp.c/declare-variant-4-gfx10-3-generic.c: Add
dg-excess-errors to handle possible missing libgomp multi lib.
* testsuite/libgomp.c/declare-variant-4-gfx1030.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx1036.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx11-generic.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx1100.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx1103.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx9-4-generic.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx9-generic.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx900.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx906.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx908.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx90a.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx90c.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx942.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx1031.c: New test.
* testsuite/libgomp.c/declare-variant-4-gfx1032.c: New test.
* testsuite/libgomp.c/declare-variant-4-gfx1033.c: New test.
* testsuite/libgomp.c/declare-variant-4-gfx1034.c: New test.
* testsuite/libgomp.c/declare-variant-4-gfx1035.c: New test.
* testsuite/libgomp.c/declare-variant-4-gfx1101.c: New test.
* testsuite/libgomp.c/declare-variant-4-gfx1102.c: New test.
* testsuite/libgomp.c/declare-variant-4-gfx1150.c: New test.
* testsuite/libgomp.c/declare-variant-4-gfx1151.c: New test.
* testsuite/libgomp.c/declare-variant-4-gfx1152.c: New test.
* testsuite/libgomp.c/declare-variant-4-gfx1153.c: New test.
* testsuite/libgomp.c/declare-variant-4-gfx902.c: New test.
* testsuite/libgomp.c/declare-variant-4-gfx904.c: New test.
* testsuite/libgomp.c/declare-variant-4-gfx909.c: New test.
* testsuite/libgomp.c/declare-variant-4-gfx950.c: New test.

5 days agotree-optimization/122292 - fix reduction code gen issue
Richard Biener [Wed, 15 Oct 2025 14:15:50 +0000 (16:15 +0200)] 
tree-optimization/122292 - fix reduction code gen issue

The following fixes a mixup of vector types checked when looking
at a conditional reduction operation.  We want the actual data
vector input type, so look at the SLP tree's type instead and
special-case lane-reducing ops like the original code did.

PR tree-optimization/122292
* tree-vect-loop.cc (vect_transform_reduction): Compute the
input vector type the same way the analysis phase does.

6 days agoDaily bump.
GCC Administrator [Thu, 16 Oct 2025 00:21:56 +0000 (00:21 +0000)] 
Daily bump.

6 days agoRange snap bitmasks as they are set.
Andrew MacLeod [Tue, 7 Oct 2025 15:56:08 +0000 (11:56 -0400)] 
Range snap bitmasks as they are set.

Range bounds adjustments based on a bitmask were lazily set.  This led
to some inconsistencies which were causing problems.  Improve the bounds,
and do it every time the bitmask is adjusted.
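
As a rough illustration of what snapping bounds to a bitmask means, the sketch
below handles only the simplest case, where the low k bits are known to be
zero.  The function name and signature are hypothetical and this is not the
irange implementation; it merely shows the lower bound being rounded up, the
upper bound being rounded down, and overflow being reported explicitly.

```cpp
#include <cstdint>
#include <optional>
#include <utility>

// Hypothetical sketch: snap the unsigned range [lb, ub] to values whose low
// k bits are zero.  Precondition: k < 64.  Returns std::nullopt when no
// value in the range satisfies the mask, including when rounding the lower
// bound up would overflow.
std::optional<std::pair<std::uint64_t, std::uint64_t>>
snap_to_alignment(std::uint64_t lb, std::uint64_t ub, unsigned k) {
    const std::uint64_t low_bits = (std::uint64_t{1} << k) - 1;
    const std::uint64_t rounded = lb + low_bits;   // may wrap around
    if (rounded < lb)
        return std::nullopt;                       // overflow: nothing fits
    const std::uint64_t new_lb = rounded & ~low_bits;  // round up
    const std::uint64_t new_ub = ub & ~low_bits;       // round down
    if (new_lb > new_ub)
        return std::nullopt;                       // range became empty
    return std::make_pair(new_lb, new_ub);
}
```

For example, with the low two bits known to be zero, [1, 10] snaps to [4, 8],
while [5, 7] contains no multiple of 4 and becomes empty.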

PR tree-optimization/121468
PR tree-optimization/121206
PR tree-optimization/122200
gcc/
* value-range.cc (irange_bitmask::range_from_mask): New.
(irange::snap): Add explicit overflow flag.
(irange::snap_subranges): Use overflow flag.
(irange::set_range_from_bitmask): Use range_from_mask.
(test_irange_snap_bounds): Adjust for improved ranges.
* value-range.h (irange::range_from_mask): Add prototype.
(irange::snap): Adjust prototype.

gcc/testsuite/
* gcc.dg/pr121468.c: New.
* gcc.dg/pr122200.c: New.

6 days agolibstdc++: Add pretty printers for std::stacktrace
Jonathan Wakely [Wed, 15 Oct 2025 20:44:16 +0000 (21:44 +0100)] 
libstdc++: Add pretty printers for std::stacktrace

libstdc++-v3/ChangeLog:

* python/libstdcxx/v6/printers.py (StdStacktraceEntryPrinter):
New printer for std::stacktrace_entry.
(StdStacktracePrinter): New printer for std::basic_stacktrace.

6 days agolibstdc++: Remove invalid entry from the end of std::stacktrace
Jonathan Wakely [Wed, 15 Oct 2025 13:59:20 +0000 (14:59 +0100)] 
libstdc++: Remove invalid entry from the end of std::stacktrace

The backtrace_simple function seems to consistently invoke the callback
with an invalid -1UL value as the last entry, which seems to come from
_Unwind_Backtrace. The glibc backtrace(3) function has a special case to
not include that final invalid address, but libbacktrace doesn't seem to
handle it. Do so in std::stacktrace::current() instead.
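
A minimal sketch of the trimming idea, assuming the frames are held as a plain
vector of addresses.  The function name is hypothetical and this is not the
actual _M_trim member; it only shows dropping a single trailing all-ones
address.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical sketch: remove one trailing invalid address if present.
// The all-ones value has the same bit pattern as -1UL on LP64 targets.
void trim_invalid_tail(std::vector<std::uintptr_t>& frames) {
    const std::uintptr_t invalid = ~std::uintptr_t{0};
    if (!frames.empty() && frames.back() == invalid)
        frames.pop_back();
}
```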

libstdc++-v3/ChangeLog:

* include/std/stacktrace (basic_stacktrace::current): Call
_M_trim before returning.
(basic_stacktrace::_M_trim): New member function.

6 days agolibstdc++: Fix missing __to_timeout_timespec for targets using POSIX sleep [PR122293]
Jonathan Wakely [Wed, 15 Oct 2025 11:52:27 +0000 (12:52 +0100)] 
libstdc++: Fix missing __to_timeout_timespec for targets using POSIX sleep [PR122293]

The preprocessor condition for defining the new __to_timeout_timespec
function templates did not match all the conditions under which it's
needed.

std::this_thread::sleep_for is defined #if ! defined _GLIBCXX_NO_SLEEP
but it relies on __to_timeout_timespec which was only being defined for
targets that use nanosleep, or clock_gettime, or use gthreads.

For a non-gthreads target that uses POSIX sleep to implement
std::this_thread::sleep_for, the build fails with:

include/bits/this_thread_sleep.h:71:40: error: '__to_timeout_timespec' is not a member of 'std::chrono' [-Wtemplate-body]
   71 |         struct timespec __ts = chrono::__to_timeout_timespec(__rtime);
      |                                        ^~~~~~~~~~~~~~~~~~~~~

Presumably the same would happen for mingw-w64 if configured with
--disable-threads (as that would be a non-gthreads target that doesn't
use nanosleep or clock_gettime).
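
For readers unfamiliar with the helper involved, the sketch below shows the
kind of conversion such a function performs: splitting a chrono duration into
the seconds and nanoseconds of a timespec.  The name to_timespec is
hypothetical and this is not the libstdc++ __to_timeout_timespec; it omits
the overflow handling a real implementation would need.  The point of the fix
is that a helper like this must be defined under every configuration in which
one of its callers is defined.

```cpp
#include <chrono>
#include <ctime>

// Hypothetical helper (not libstdc++'s __to_timeout_timespec): convert a
// relative duration to a timespec, clamping non-positive values to zero.
template <typename Rep, typename Period>
std::timespec to_timespec(const std::chrono::duration<Rep, Period>& d) {
    using namespace std::chrono;
    std::timespec ts{};                       // zero-initialized
    if (d <= duration<Rep, Period>::zero())
        return ts;
    const auto secs = duration_cast<seconds>(d);
    const auto nanos = duration_cast<nanoseconds>(d - secs);
    ts.tv_sec = static_cast<std::time_t>(secs.count());
    ts.tv_nsec = static_cast<long>(nanos.count());
    return ts;
}
```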

libstdc++-v3/ChangeLog:

PR libstdc++/122293
* include/bits/chrono.h (__to_timeout_timespec): Fix
preprocessor condition to match the conditions under which
callers of this function are defined.
* include/bits/this_thread_sleep.h: Remove unused include.

6 days ago[PATCH] Makefile.tpl: remove an extra \; from find command
Basil Milanich [Wed, 15 Oct 2025 17:31:09 +0000 (11:31 -0600)] 
[PATCH] Makefile.tpl: remove an extra \; from find command

The extra \; parameter in the find command causes it to fail immediately and
not clean any config.cache:

$ find . -name config.cache -exec rm -f {} \; \;
find: paths must precede expression: `;'

This is benign in most cases, but binutils also uses this Makefile.tpl and,
as a result, its 'make distclean' can leave config.cache files around, which
makes subsequent attempts to configure and build it fail.

I have modified the Makefile.tpl and regenerated Makefile.in from it. For testing
I ran a config/make/make distclean loop.

* Makefile.tpl (distclean): Remove extraneous semicolon.
* Makefile.in: Rebuilt.

6 days agogcn: Add missing GFX9_4_GENERIC, OpenMP context-selector update
Tobias Burnus [Wed, 15 Oct 2025 17:15:15 +0000 (19:15 +0200)] 
gcn: Add missing GFX9_4_GENERIC, OpenMP context-selector update

The definition for gfx942 and gfx950 missed the GFX9_4_GENERIC
family flag.

For OpenMP context selectors: The t-omp-device file missed the
generic selectors.

Additionally, there is now a note in the OpenMP documentation that
there is a one-to-one match for ISA names, ignoring any compatibility.
For instance, for Nvidia GPUs 'isa("sm_70")' is only true when compiling
for 'sm_70', even though sm < 7.0 code also runs on sm_70 hardware.
And, for AMD GPUs, 'gfx9-4-generic' does not match 'gfx942'
(even though such generic code runs on gfx942), nor does the reverse
hold (although all gfx9-4-generic code runs on gfx942).
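
The exact-match behaviour can be pictured with a pair of declare variant
directives.  This is a hedged sketch: the function names are hypothetical,
and on a host compilation (or without -fopenmp, where the pragmas are
ignored) neither selector matches, so the base function is used.

```cpp
// Hypothetical variant functions; the names are illustrative only.
int variant_gfx942() { return 942; }
int variant_gfx9_4_generic() { return 94; }

// Each isa() selector matches only when compiling for exactly that name:
// code built for gfx9-4-generic would pick the second variant, never the
// first, and code built for gfx942 would pick the first, never the second.
#pragma omp declare variant(variant_gfx942) match(device = {isa("gfx942")})
#pragma omp declare variant(variant_gfx9_4_generic) match(device = {isa("gfx9-4-generic")})
int which_isa() { return 0; }   // base version: used when no selector matches
```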

gcc/ChangeLog:

* config/gcn/gcn-devices.def (gfx942, gfx950): Set generic name
to GFX9_4_GENERIC.
* config/gcn/t-omp-device: Include generic names for OpenMP's
ISA trait.

libgomp/ChangeLog:

* libgomp.texi (OpenMP Context Selectors): Add note that there is
currently an exact match between ISA and compilation, ignoring
compatibilities in both ways.
* testsuite/libgomp.c/declare-variant-4.h: Add missing variant
functions for specific and generic AMD GPUs.
* testsuite/libgomp.c/declare-variant-4-gfx10-3-generic.c: New test.
* testsuite/libgomp.c/declare-variant-4-gfx11-generic.c: New test.
* testsuite/libgomp.c/declare-variant-4-gfx9-4-generic.c: New test.
* testsuite/libgomp.c/declare-variant-4-gfx9-generic.c: New test.

6 days agodebug_tree: print out clique/base for MEM_REF/TARGET_MEM_REF
Andrew Pinski [Tue, 14 Oct 2025 17:36:03 +0000 (10:36 -0700)] 
debug_tree: print out clique/base for MEM_REF/TARGET_MEM_REF

While debugging PR 122273, I noticed that print_node was not
printing out the clique/base for MEM_REF/TARGET_MEM_REF.  This
made it harder to understand (without looking into the code) why
operand_equal_p would reject two MEM_REFs that look the same.

Changes since v1:
* v2: Don't print out clique/base if clique is 0.

Bootstrapped and tested on x86_64-linux-gnu.

gcc/ChangeLog:

* print-tree.cc (print_node): Print out clique/base
for MEM_REF and TARGET_MEM_REF.

Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
6 days agoarm: avoid unmatched insn in movhfcc [PR118460]
Richard Earnshaw [Tue, 14 Oct 2025 12:53:05 +0000 (13:53 +0100)] 
arm: avoid unmatched insn in movhfcc [PR118460]

When compiling for m-profile with the floating-point extension we have
a vsel instruction that takes a limited set of comparisons.  In most
cases we can use this with careful selection of the operand order, but
we need to expand things in the right way.  This patch is in two parts:

1) We validate that the expansion will produce correct RTL;
2) We canonicalize the comparison to increase the chances that the
above check will pass.
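
The canonicalization step can be sketched in isolation as below.  This is not
the arm.cc code: the Cmp enum and helper names are hypothetical, and the set
of directly usable conditions (EQ, GE, GT) is an assumption made purely for
illustration.  It shows only the general idea of swapping the operand order
when that yields a usable comparison.

```cpp
// Hypothetical, simplified comparison codes (not GCC's rtx codes).
enum class Cmp { EQ, NE, LT, LE, GT, GE };

// Condition that is equivalent to 'c' once the two operands are exchanged,
// e.g. "a < b" is the same test as "b > a".
constexpr Cmp swap_operands(Cmp c) {
    switch (c) {
    case Cmp::LT: return Cmp::GT;
    case Cmp::LE: return Cmp::GE;
    case Cmp::GT: return Cmp::LT;
    case Cmp::GE: return Cmp::LE;
    default:      return c;      // EQ and NE are symmetric
    }
}

// Assumption for illustration: only these conditions are directly usable.
constexpr bool directly_usable(Cmp c) {
    return c == Cmp::EQ || c == Cmp::GE || c == Cmp::GT;
}

struct Canonical {
    Cmp code;       // comparison to emit
    bool swapped;   // true if the operands must be exchanged
};

// Prefer a directly usable comparison, swapping operands when that helps.
constexpr Canonical canonicalize(Cmp c) {
    if (!directly_usable(c) && directly_usable(swap_operands(c)))
        return {swap_operands(c), true};
    return {c, false};
}
```

A comparison such as NE cannot be fixed by swapping alone, which is why a
separate validation step is still needed after canonicalization.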

gcc:

PR target/118460
* config/arm/arm.cc (arm_canonicalize_comparison): For floating-
point comparisons, swap the operand order if that will be more
likely to produce a comparison that can be used with VSEL.
(arm_validize_comparison): Make sure that HFmode comparisons
are compatible with VSEL.

gcc/testsuite:

PR target/118460
* gcc.target/arm/armv8_2-fp16-move-1.c: Adjust expected output.
* gcc.target/arm/armv8_2-fp16-move-2.c: Likewise.