git.ipfire.org Git - thirdparty/gcc.git/log
3 weeks ago ada: Improve dmsg
Viljar Indus [Fri, 20 Mar 2026 15:23:14 +0000 (17:23 +0200)] 
ada: Improve dmsg

Add missing attributes to dmsg. Additionally add support for
printing locations and fixes.

gcc/ada/ChangeLog:

* erroutc-pretty_emitter.adb (To_String): Relocated to erroutc.
(To_File_Name): Likewise.
(Line_To_String): Likewise.
(Column_To_String): Likewise.
* erroutc.adb (dedit): New function for debugging edits.
(dfix): New function for debugging fixes.
(dloc): New function for debugging locations.
(dmsg): Print missing Error_Msg_Object attributes.
(To_String): New function for printing spans.
(To_String): Relocated from erroutc-pretty_emitter.adb.
(To_File_Name): Likewise.
* erroutc.ads: Likewise.

3 weeks ago ada: Stop using gnat_envp
Ronan Desplanques [Tue, 17 Mar 2026 14:38:44 +0000 (15:38 +0100)] 
ada: Stop using gnat_envp

First, a bit of context: Ada has only had support for manipulating
environment variables in the standard library since Ada 2005 and the
introduction of Ada.Environment_Variables.

Prior to that, GNAT had introduced the implementation-specific
Ada.Command_Line.Environment, which still exists today. Until now,
Ada.Command_Line.Environment used a global variable, gnat_envp, which
must be initialized with envp, the optional third parameter to main in C.
When the main was in Ada, the binder generated the appropriate assignment.
The rest of the time, it was the responsibility of the user to write this
assignment. Failure to do so would cause null pointer dereferences when
using Ada.Command_Line.Environment. Although documented in the spec of
Ada.Command_Line, this was rather easy to miss.

Worse, the assignment caused linking failures in the rather common case
of a C GPR project with'ing an Ada GPR project and linking dynamically.

Also, Ada.Command_Line.Environment was inconsistent across platforms with
regard to how it was affected by calls to putenv.

When we added support for the standard Ada.Environment_Variables, the
gnat_envp machinery wasn't reused. Instead, another mechanism based on
the Unix global variable environ (and its close equivalents on other
platforms) was introduced.

What this patch does is switch Ada.Command_Line.Environment over to this
new environ-based mechanism. All uses of gnat_envp are removed, but the
definition itself is kept for backwards compatibility.
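
As an illustration only, the environ-based approach can be sketched in C++ on a POSIX system. The helper names env_count, env_len and has_entry are hypothetical and are not the GNAT runtime functions; the sketch merely shows that the process environment can be walked through the global environ without main receiving an envp parameter.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// POSIX global: NULL-terminated array of "NAME=value" strings.
extern "C" char **environ;

// Number of entries currently in the environment (hypothetical helper).
std::size_t env_count() {
  std::size_t n = 0;
  if (environ != nullptr)
    for (char **p = environ; *p != nullptr; ++p)
      ++n;
  return n;
}

// Length of the i-th entry, or 0 if i is out of range (hypothetical helper).
std::size_t env_len(std::size_t i) {
  return i < env_count() ? std::strlen(environ[i]) : 0;
}

// True if some entry is exactly `entry`, e.g. "NAME=value" (hypothetical helper).
bool has_entry(const std::string &entry) {
  const std::size_t n = env_count();
  for (std::size_t i = 0; i < n; ++i)
    if (entry == environ[i])
      return true;
  return false;
}
```

Because environ is maintained by the C library, entries added later with setenv or putenv are visible to these helpers, which is not guaranteed for a pointer captured once from main's third parameter.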

gcc/ada/ChangeLog:

* argv-lynxos178-raven-cert.c: Update comments.
* argv.c (gnat_envp): Add comment about it being unused.
(__gnat_env_count, __gnat_len_env, __gnat_fill_env): Use
__gnat_environ instead of gnat_envp.
* bindgen.adb (Command_Line_Used): Update comment.
(Gen_Main): Remove gnat_envp assignment generation. Remove generated
envp parameter.
(Gen_Output_File_Ada): Remove generated envp parameter.
* env.h: Make usable as C++.
* libgnat/a-colien.ads: Remove comment.
* libgnat/a-comlin.ads: Update comment.
* targparm.ads: Update comment.

3 weeks ago ada: Require compilation unit to have no indentation
Piotr Trojanek [Tue, 10 Mar 2026 15:18:13 +0000 (16:18 +0100)] 
ada: Require compilation unit to have no indentation

We had a style check requiring a compilation unit to start at a column number
that is a multiple of the indentation value. Now we require compilation units
to have no indentation.

gcc/ada/ChangeLog:

* par-ch10.adb (P_Compilation_Unit): Require no indentation.

3 weeks ago ada: Fix compiler crash on primitive completed by expression function
Eric Botcazou [Tue, 17 Mar 2026 21:44:13 +0000 (22:44 +0100)] 
ada: Fix compiler crash on primitive completed by expression function

This further restricts the special bypass for the freezing of the profile
in Analyze_Subprogram_Body_Helper to the case of wrapper functions.

gcc/ada/ChangeLog:

PR ada/93702
* exp_ch3.adb (Make_Controlling_Function_Wrappers): Do not set the
Was_Expression_Function flag on the body.
* sem_ch6.adb (Analyze_Subprogram_Body_Helper): Avoid freezing the
profile only for wrapper functions.

3 weeks ago ada: Pretty-print filter of loop parameter specification
Piotr Trojanek [Mon, 16 Mar 2026 12:42:08 +0000 (13:42 +0100)] 
ada: Pretty-print filter of loop parameter specification

Filter was only pretty-printed for iterator specification, but it can also
appear in loop parameter specification. This only affects debug output.

gcc/ada/ChangeLog:

* sprint.adb (Sprint_Node_Actual): Print filter in loop parameter
specification.

3 weeks ago ada: Suppress warning for quantified expression with filters
Piotr Trojanek [Mon, 16 Mar 2026 12:40:44 +0000 (13:40 +0100)] 
ada: Suppress warning for quantified expression with filters

If a quantified expression has a filter, it becomes less clear whether we should
warn about the quantified variable not being used.

gcc/ada/ChangeLog:

* sem_ch4.adb (Analyze_Quantified_Expression): If there is a filter,
then suppress the warning.

3 weeks ago ada: Suppress warning about unused variable in trivial quantification
Piotr Trojanek [Fri, 13 Mar 2026 11:54:06 +0000 (12:54 +0100)] 
ada: Suppress warning about unused variable in trivial quantification

When the condition of a quantified expression is written as True or False, then
the user has likely done this on purpose and there is no need for a warning.

gcc/ada/ChangeLog:

* sem_ch4.adb (Analyze_Quantified_Expression): Suppress warning for
trivial conditions.

3 weeks ago ada: Unset Comes_From_Source in inlined static functions
Viljar Indus [Tue, 17 Mar 2026 07:40:27 +0000 (09:40 +0200)] 
ada: Unset Comes_From_Source in inlined static functions

Unset Comes_From_Source in the inlined expression in
order to avoid spurious resolution errors.

gcc/ada/ChangeLog:

* inline.adb (Adjust_Node): Renamed from Adjust_Sloc and
additionally unset Comes_From_Source.

3 weeks ago ada: Missing overflow check on Integer_128 under GNATProve mode
Javier Miranda [Fri, 13 Mar 2026 19:48:26 +0000 (19:48 +0000)] 
ada: Missing overflow check on Integer_128 under GNATProve mode

Under GNATProve mode the frontend does not generate overflow
checks on type conversions of Universal Integer numbers to
128-bit integer type numbers.

gcc/ada/ChangeLog:

* checks.adb (Apply_Scalar_Range_Check): When the type of the expression
is Universal Integer we cannot statically determine if the expression
is in the range of the target type.
* sem_eval.adb (In_Subrange_Of): Do not consider T2 in the range of
Universal Integer (since theoretically they are not).
(Test_In_Range): Do not consider Universal type expressions in range
of subtype Typ.

3 weeks ago ada: Add (r)pech debug routines for entity chains and simple check
Marc Poulhiès [Thu, 12 Mar 2026 16:01:06 +0000 (17:01 +0100)] 
ada: Add (r)pech debug routines for entity chains and simple check

(r)pech (Print Entity Chain - Header) can be used to dump the entity
chains with one node header per line:

- N_Defining_Identifier "system__use_ada_main_program_name" (Entity_Id=2804) (source)
- N_Defining_Identifier "system__zcx_by_default" (Entity_Id=2808) (source)
- N_Defining_Identifier "system__standard_library" (Entity_Id=108628) (source)
- N_Defining_Identifier "system__exception_table" (Entity_Id=109523) (source)

Also add a simple consistency check to all routines that dump the
entity chain: if Prev (Next (E)) /= E (or Next (Prev (E)) /= E in the
reverse order), an extra line is printed:

- N_Defining_Identifier "system__tick" (Entity_Id=2550) (source)
 !! - Prev (Next (^^^^)) = N_Defining_Identifier  "system__default_priority" (Entity_Id=2700) (source)
- N_Defining_Identifier "system__address" (Entity_Id=2553) (source)

This example shows that the next links have 2550->2553, but the previous
links have 2700 <- 2553.
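
The consistency check can be pictured with a small C++ model. This is only a sketch: the Entity struct, the use of vector indices in place of Entity_Id links, and the function broken_forward_links are assumptions made for illustration and do not mirror the code in treepr.adb.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model of an entity in a doubly linked chain.
struct Entity {
  int id;    // stands in for Entity_Id
  int next;  // index of the next entity in the vector, or -1
  int prev;  // index of the previous entity in the vector, or -1
};

// Walk the chain forward from `first` and return the ids of every entity E
// for which Prev (Next (E)) /= E.
std::vector<int> broken_forward_links(const std::vector<Entity> &chain,
                                      int first) {
  std::vector<int> broken;
  std::size_t steps = 0;  // guards against cycles in a corrupted chain
  for (int e = first; e != -1 && steps < chain.size();
       e = chain[e].next, ++steps) {
    const int n = chain[e].next;
    if (n != -1 && chain[n].prev != e)
      broken.push_back(chain[e].id);
  }
  return broken;
}
```

With a chain in which 2550 points forward to 2553 while 2553 points back to 2700, the function reports 2550, which corresponds to the extra "!!" line in the dump above.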

gcc/ada/ChangeLog:

* treepr.ads (pech, rpech): New.
(Print_Entity_Chain): Adjust signature and comment to handle
printing only header and doing the simple check.
* treepr.adb (pech, rpech): New.
(Print_Entity_Chain): Support for printing only headers and doing
simple check.

3 weeks ago ada: Enable resolution of overloading on Last and Previous for Iterable
Claire Dross [Wed, 11 Mar 2026 09:25:16 +0000 (10:25 +0100)] 
ada: Enable resolution of overloading on Last and Previous for Iterable

The resolution of overloading for the optional Last and Previous primitives
of an Iterable aspect should be done like for other primitives.

gcc/ada/ChangeLog:

* sem_ch13.adb (Resolve_Iterable_Operation): Handle Previous and Last
like Next and First.

3 weeks ago ada: Add volatile abstract state to creation functions in Interfaces.C.Strings
Claire Dross [Wed, 11 Mar 2026 16:56:51 +0000 (17:56 +0100)] 
ada: Add volatile abstract state to creation functions in Interfaces.C.Strings

The additional volatile abstract state is necessary to model the value of the
new pointer.

gcc/ada/ChangeLog:

* libgnat/i-cstrin.ads: New C_Addresses volatile state to use as
input of the New_String and New_Char_Array.

3 weeks ago ada: Fix assertion failure on call in object notation in entry barrier
Eric Botcazou [Mon, 16 Mar 2026 07:08:05 +0000 (08:08 +0100)] 
ada: Fix assertion failure on call in object notation in entry barrier

The problem is that the Original_Record_Component field is accessed without
checking that it may be accessed.

gcc/ada/ChangeLog:

* sem_util.adb (Statically_Names_Object) <N_Selected_Component>:
Return False if the selector is neither component nor discriminant.

3 weeks ago ada: Inheritance of pragma/aspect unchecked_union
Javier Miranda [Thu, 12 Mar 2026 18:55:21 +0000 (18:55 +0000)] 
ada: Inheritance of pragma/aspect unchecked_union

Derived types inherit pragma/aspect Unchecked_Union.

gcc/ada/ChangeLog:

* sem_ch3.adb (Build_Derived_Record_Type): Record type derivations
inherit Is_Unchecked_Union and Has_Unchecked_Union flags.
(Inherit_Component): Add discriminals to the associations list.
* exp_ch3.adb (Build_Record_Init_Proc): Derivations of Unchecked_Union
types don't need an initialization procedure; they reuse the init proc
of their parent type.

3 weeks ago ada: Distribute declaration of return object into conditional expressions
Eric Botcazou [Mon, 9 Mar 2026 17:59:11 +0000 (18:59 +0100)] 
ada: Distribute declaration of return object into conditional expressions

This lifts one of the limitations of the distribution of a declaration of
an object into the dependent expressions of its initialization expression
when it is a conditional expression, namely the case of the return object
of an extended return statement.

gcc/ada/ChangeLog:

* exp_ch4.adb (Expand_N_Case_Expression): Deal with initialization
expression of return object.
(Expand_N_If_Expression): Likewise.
(Insert_Conditional_Object_Declaration): Likewise.
* exp_util.adb (Is_Distributable_Declaration): Lift limitation for
return objects, including those with a class-wide type.
* sem_ch3.adb (Analyze_Object_Declaration): Set Return_Applies_To
on artificial return objects created from within a transient scope.
Remove test on Expander_Active for better error recovery.

3 weeks ago ada: Fix reStructuredText markup
Ronan Desplanques [Thu, 12 Mar 2026 14:35:55 +0000 (15:35 +0100)] 
ada: Fix reStructuredText markup

gcc/ada/ChangeLog:

* doc/gnat_ugn/building_executable_programs_with_gnat.rst: Fix
markup.
* gnat_ugn.texi: Regenerate.

3 weeks ago ada: Suppress warning about unused quantified variables with junk names
Piotr Trojanek [Wed, 11 Mar 2026 21:00:46 +0000 (22:00 +0100)] 
ada: Suppress warning about unused quantified variables with junk names

For quantified expressions like "for all Dummy in ... => True" we don't want
to warn about unused variable when it has a junk name.

gcc/ada/ChangeLog:

* sem_ch4.adb (Analyze_Quantified_Expression): Suppress warning for
variables with junk names.

3 weeks ago ada: Refactor Inline_Static_Function_Call
Viljar Indus [Mon, 9 Mar 2026 14:33:19 +0000 (16:33 +0200)] 
ada: Refactor Inline_Static_Function_Call

gcc/ada/ChangeLog:

* inline.adb (Inline_Static_Function_Call): Reduce source code nesting.
* inline.ads (Inline_Static_Function_Call): Likewise.

3 weeks ago ada: Calculate the sloc adjustment for inlined static functions
Viljar Indus [Mon, 9 Mar 2026 12:52:33 +0000 (14:52 +0200)] 
ada: Calculate the sloc adjustment for inlined static functions

First (and last) node calculation is done by traversing the original
nodes of the given node. This is fine for expanding existing code.
However, when inlining static functions this can lead to a node that is
in a completely different location (e.g. the spec) being considered the
first node in the location of the inlined call. This means that in this
type of scenario resetting the slocs is not enough.

The correct approach to use here would be to calculate the Adjustment
in the Source File Index between the function and the inlined call. This
approach is also used in inlining regular subprograms.

Once there is an entry in the Source File Index for the inlined call, the
error message mechanism will highlight both the call and the expression
function if an error is present in the inlined call.

gcc/ada/ChangeLog:

* inline.adb (Inline_Static_Function_Call): Add a Source File Index
entry for the call and apply the necessary sloc adjustment values
for all of the inlined nodes.

3 weeks ago ada: Make __gnat_copy_attribs non-blocking on windows
Tonu Naks [Wed, 26 Nov 2025 12:24:19 +0000 (12:24 +0000)] 
ada: Make __gnat_copy_attribs non-blocking on windows

gcc/ada/ChangeLog:

* adaint.c (__gnat_copy_attribs): Use GetFileAttributesEx to
fetch attributes.

3 weeks ago c++: Fix build_value_init_noctor anon aggr handling
Jakub Jelinek [Fri, 29 May 2026 08:16:10 +0000 (10:16 +0200)] 
c++: Fix build_value_init_noctor anon aggr handling

As I've mentioned on Saturday, the CWG3130 patchset fails to bootstrap.
The problem is that we try to call build_value_init on anonymous unions
or structs, which doesn't work well when they don't have a default
constructor.
Now, if some non-trivial construction is needed, the type will already have
either a user-provided constructor or at least a non-trivial one; in that
case build_value_init_noctor isn't called at all.  So, this patch
just zero-initializes the anonymous aggregate members.

2026-05-29  Jakub Jelinek  <jakub@redhat.com>

* init.cc (build_value_init_noctor): Zero initialize anonymous
union/struct subobjects.  Formatting fix.

3 weeks ago libgfortran: Use MapViewOfFileEx instead of MapViewOfFileExNuma in caf_shmem
Peter Damianov [Fri, 29 May 2026 06:53:42 +0000 (08:53 +0200)] 
libgfortran: Use MapViewOfFileEx instead of MapViewOfFileExNuma in caf_shmem

MapViewOfFileExNuma is only present when _WIN32_WINNT >= 0x0600 (Windows Vista
or later). The code is passing NUMA_NO_PREFERRED_MODE, and that
is documented as:

No NUMA node is preferred. This is the same as calling the MapViewOfFileEx
function.

https://learn.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-mapviewoffileexnuma

So, MapViewOfFileEx will behave identically, while still allowing Windows XP
support.

libgfortran/ChangeLog:

* caf/shmem/shared_memory.c (shared_memory_init): Use
MapViewOfFileEx instead of MapViewOfFileExNuma.

3 weeks ago bb-slp-complex-mla-half-float.c: Add the missing end brace
H.J. Lu [Fri, 29 May 2026 02:16:22 +0000 (10:16 +0800)] 
bb-slp-complex-mla-half-float.c: Add the missing end brace

commit 44a31df54837adf2f7815e7966dfe8ac32eb8f3b
Author: Artemiy Volkov <artemiy.volkov@arm.com>
Date:   Mon May 18 10:21:18 2026 +0000

    aarch64: introduce partial AdvSIMD vector modes

changed gcc.dg/vect/complex/bb-slp-complex-mla-half-float.c to:

-/* { dg-final { scan-tree-dump "Found COMPLEX_FMA" "slp1"  { xfail *-*-* } } } */
+/* { dg-final { scan-tree-dump "Found COMPLEX_FMA" "slp1" { xfail arm*-*-* } } */

The end brace was missing.  Add the missing end brace to fix it.

PR testsuite/125489
* gcc.dg/vect/complex/bb-slp-complex-mla-half-float.c: Add the
missing end brace.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>

3 weeks ago Daily bump.
GCC Administrator [Fri, 29 May 2026 00:16:40 +0000 (00:16 +0000)] 
Daily bump.

3 weeks ago Fortran: f_c_string intrinsic improvements
Sandra Loosemore [Thu, 28 May 2026 22:33:33 +0000 (22:33 +0000)] 
Fortran: f_c_string intrinsic improvements

The existing implementation of f_c_string is quite inefficient, doing
either 2 or 3 allocations and copies of the input string prefix.  This
rewrite adds folding for constant string arguments and handles other
cases with a single allocation and copy.

This patch also adds the missing documentation for this intrinsic to the
gfortran manual.

gcc/fortran/ChangeLog
* intrinsic.texi (F_C_STRING): New section.
* trans-intrinsic.cc (conv_trim): Delete.
(conv_isocbinding_function): Rewrite the F_C_STRING case.

gcc/testsuite/ChangeLog
* gfortran.dg/f_c_string3.f90: New.
* gfortran.dg/f_c_string4.f90: New.
* gfortran.dg/f_c_string5.f90: New.

3 weeks ago Fortran: Add c_f_strpointer intrinsic
Sandra Loosemore [Thu, 28 May 2026 22:33:33 +0000 (22:33 +0000)] 
Fortran: Add c_f_strpointer intrinsic

This is a missing Fortran 2023 feature.

gcc/fortran/ChangeLog
* check.cc (gfc_check_c_f_strpointer): New.
* f95-lang.cc (gfc_init_builtin_functions): Add BUILT_IN_STRNLEN.
* gfortran.h (enum gfc_isym_id): Add GFC_ISYM_C_F_STRPOINTER.
* gfortran.texi (Interoperable Subroutines and Functions): Mention
f_c_string and c_f_strpointer.
* intrinsic.cc (add_subroutines): Add c_f_strpointer.  Fix nearby
whitespace errors.
(sort_actual): Handle first argument to c_f_strpointer specially.
* intrinsic.h (gfc_check_c_f_strpointer): Declare.
* intrinsic.texi (C_F_STRPOINTER): New section.  Add entry to menu
and cross-references from similar functions.
* iso-c-binding.def: Add c_f_strpointer.
* trans-intrinsic.cc (conv_isocbinding_subroutine_strpointer): New.
(gfc_conv_intrinsic_subroutine): Call it.

gcc/testsuite/ChangeLog
* gfortran.dg/c_f_strpointer-1.f90: New.
* gfortran.dg/c_f_strpointer-2.f90: New.
* gfortran.dg/c_f_strpointer-3.f90: New.
* gfortran.dg/c_f_strpointer-4.f90: New.
* gfortran.dg/c_f_strpointer-5.f90: New.
* gfortran.dg/c_f_strpointer-6.f90: New.
* gfortran.dg/c_f_strpointer-7.f90: New.
* gfortran.dg/c_f_strpointer-8.f90: New.
* gfortran.dg/c_f_strpointer-9.f90: New.
* gfortran.dg/c_f_strpointer-10.f90: New.
* gfortran.dg/pr108961.f90: Rename locally-defined c_f_strpointer.

Co-authored-by: Tobias Burnus <tburnus@baylibre.com>

3 weeks ago libcody: allow non-ASCII module names [PR120458]
Jean-Christian CÎRSTEA [Sun, 22 Mar 2026 17:57:42 +0000 (19:57 +0200)] 
libcody: allow non-ASCII module names [PR120458]

Before this commit, attempting to use non-ASCII characters in quoted
words failed, even though the protocol allows the usage of such
characters in quoted words. To fix this:

1. Remove `c >= 0x7f` comparison when parsing a quoted word.
2. Use `unsigned char` instead of `char` such that `c < 0x20` fails for
   non-ASCII characters.
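
The effect of the second point can be shown with a minimal C++ sketch. The two predicates below are hypothetical and are not libcody's S2C() code; they only model a "control character" test of the form `c < 0x20`. On targets where plain char is signed, a byte such as 0xC3 (the lead byte of a two-byte UTF-8 sequence) is negative and therefore passes the test, whereas with unsigned char it does not.

```cpp
// Hypothetical predicates modelling the control-character comparison.

// With a signed character type, bytes >= 0x80 are negative, so they are
// wrongly classified as control characters.
bool is_control_signed(signed char c) { return c < 0x20; }

// With unsigned char, only the real C0 control range 0x00..0x1f matches.
bool is_control_unsigned(unsigned char c) { return c < 0x20; }
```

A genuine control character such as a newline is classified the same way by both predicates; only bytes with the top bit set differ.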

PR c++/120458

libcody/ChangeLog:

* buffer.cc (S2C): Allow non-ASCII chars in quoted words.
* cody.hh: Use unsigned char for S2C().

gcc/testsuite/ChangeLog:

* g++.dg/README: Explain purpose of modules/ dir.
* g++.dg/modules/pr120458-1_a.C: Define non-ASCII module with
default mapper.
* g++.dg/modules/pr120458-1_b.C: Import non-ASCII module with
default mapper.
* g++.dg/modules/pr120458-2_a.C: Define non-ASCII module with
a file as mapper.
* g++.dg/modules/pr120458-2_b.C: Import non-ASCII module with
a file as mapper.
* g++.dg/modules/pr120458-2.map: Define mapping for pr120458-2
test case.

Signed-off-by: Jean-Christian CÎRSTEA <jean-christian.cirstea@tuta.com>

3 weeks ago vect-early-break-no-epilog_11.c: Require avx512f_runtime
H.J. Lu [Thu, 28 May 2026 20:30:56 +0000 (04:30 +0800)] 
vect-early-break-no-epilog_11.c: Require avx512f_runtime

Require avx512f_runtime instead of avx512f_hw to fix

ERROR: gcc.dg/vect/vect-early-break-no-epilog_11.c -flto -ffat-lto-objects: unknown effective target keyword `avx512f_hw' for " dg-require-effective-target 7 avx512f_hw { target i?86-*-* x86_64-*-* } "

* gcc.dg/vect/vect-early-break-no-epilog_11.c: Require
avx512f_runtime instead of avx512f_hw.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>

3 weeks ago tree-ssa: Loop store motion micro-optimizations.
Roger Sayle [Thu, 28 May 2026 19:56:27 +0000 (20:56 +0100)] 
tree-ssa: Loop store motion micro-optimizations.

ref_always_accessed_p is (currently) only ever called with stored_p being
true, so specializing for this case, renaming ref_always_accessed{,_p} to
ref_always_stored{,_p} saves storage and some redundant checks at run-time.

2026-05-28  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
* tree-ssa-loop-im.cc (ref_always_accessed_p): Rename to...
(ref_always_stored_p): New function specialized to determine if
REF is a store that is always executed in LOOP.
(execute_sm): Use ref_always_stored_p instead of
ref_always_accessed_p.
(class ref_always_accessed): Rename to...
(class ref_always_stored): Remove (always true) stored_p field.
(ref_always_stored::operator ()): Always check for a store.
Move hash table lookup, get_lim_data, after store test.
(can_sm_ref_p): Use ref_always_stored_p instead of
ref_always_accessed_p.

3 weeks ago x86_64 SSE: Tweak/correct STV cost of 128-bit rotate by constant.
Roger Sayle [Thu, 28 May 2026 19:54:17 +0000 (20:54 +0100)] 
x86_64 SSE: Tweak/correct STV cost of 128-bit rotate by constant.

This one line change resolves the failure of gcc.target/i386/rotate-2.c
when compiled with -march=cascadelake triggered by recent STV improvements.
https://gcc.gnu.org/pipermail/gcc-patches/2026-May/716996.html

The decision of whether to perform STV is finely balanced, and affected
by the microarchitecture's timings/costs, but in this case the underlying
issue appears to be the parameterized cost for performing a 128-bit
rotation by a constant in SSE registers.  Depending upon the number
of bits to rotate by, SSE requires either 1 or 2 shuffles, followed
by a left shift, a right shift and an any_or_plus to combine the result.
This is therefore 4 or 5 instructions, but currently returns
COSTS_N_INSNS(1) instead of COSTS_N_INSNS(4) [probably a typo].

As an aside, it might be more useful for this gain to be based on latency,
as both the shuffles and the shifts can each be performed in parallel,
so a reasonable vcost may therefore be COSTS_N_INSNS(3), but such fine
tuning might require microbenchmarking.  I mention it here just in case
using COSTS_N_INSNS(4) is bisected as a performance regression.

2026-05-28  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
* config/i386/i386-features.cc (compute_convert_gain): Tweak
the cost of a 128-bit rotation to be 4 (or 5) instructions.

3 weeks ago x86_64 SSE: Handle SUBREG conversions in TImode STV (for ptest).
Roger Sayle [Thu, 28 May 2026 19:50:11 +0000 (20:50 +0100)] 
x86_64 SSE: Handle SUBREG conversions in TImode STV (for ptest).

This patch teaches i386's STV pass how to handle SUBREG conversions,
i.e. that a TImode SUBREG can be transformed into a V1TImode SUBREG,
without worrying about other DEFs and USEs.

One example where this is useful is

typedef long long __m128i __attribute__ ((__vector_size__ (16)));
int foo (__m128i x, __m128i y) {
  return (__int128)x == (__int128)y;
}

where with -O2 -msse4 we can now scalar-to-vector transform:

(insn 7 4 8 2 (set (reg:CCZ 17 flags)
        (compare:CCZ (subreg:TI (reg/v:V2DI 86 [ x ]) 0)
            (subreg:TI (reg/v:V2DI 87 [ y ]) 0))) {*cmpti_doubleword}

into

(insn 17 4 7 2 (set (reg:V1TI 91)
        (xor:V1TI (subreg:V1TI (reg/v:V2DI 86 [ x ]) 0)
            (subreg:V1TI (reg/v:V2DI 87 [ y ]) 0)))
     (nil))
(insn 7 17 8 2 (set (reg:CCZ 17 flags)
        (unspec:CCZ [
                (reg:V1TI 91) repeated x2
            ] UNSPEC_PTEST)) {*sse4_1_ptestv1ti}
     (expr_list:REG_DEAD (reg/v:V2DI 87 [ y ])
        (expr_list:REG_DEAD (reg/v:V2DI 86 [ x ])
            (nil))))

with the dramatic effect that the assembly output before:

foo: movaps  %xmm0, -40(%rsp)
        movq    -32(%rsp), %rdx
        movq    %xmm0, %rax
        movq    %xmm1, %rsi
        movaps  %xmm1, -24(%rsp)
        movq    -16(%rsp), %rcx
        xorq    %rsi, %rax
        xorq    %rcx, %rdx
        orq     %rdx, %rax
        sete    %al
        movzbl  %al, %eax
        ret

now becomes

foo: pxor    %xmm1, %xmm0
        xorl    %eax, %eax
        ptest   %xmm0, %xmm0
        sete    %al
        ret

i.e. a 128-bit vector doesn't need to be transferred to the
scalar unit to be tested for equality.  The new test case includes
additional related examples that show similar improvements.

Previously we explicitly checked *cmpti_doubleword operands to be
either immediate constants, or a TImode REG or a TImode MEM.  By
enhancing this to allow a TImode SUBREG, we now handle everything
that would match the general_operand predicate, making this part
of STV more like other RTL passes (lra/reload).  The big change is
that unlike a regular DF USE, a SUBREG USE doesn't require us to
analyze and convert the rest of the DEF-USE chain.

2026-05-28  Roger Sayle  <roger@nextmovesoftware.com>
    Hongtao Liu  <hongtao.liu@intel.com>

gcc/ChangeLog
* config/i386/i386-features.cc (scalar_chain::add_insn): Don't
call analyze_register_chain if the USE is a SUBREG.
(scalar_chain::convert_op): Call gen_lowpart to convert
scalar (TImode) SUBREGs to vector (V1TImode) SUBREGs.
(convertible_comparison_p): We can now handle all general_operands
of *cmp<dwi>_doubleword.
(timode_remove_non_convertible_regs): We only need to check TImode
uses that aren't TImode SUBREGs of registers in other modes.

gcc/testsuite/ChangeLog
* gcc.target/i386/sse4_1-ptest-7.c: New test case.

3 weeks ago x86 SSE: Improve vector increment/decrement on x86.
Roger Sayle [Thu, 28 May 2026 19:46:04 +0000 (20:46 +0100)] 
x86 SSE: Improve vector increment/decrement on x86.

This patch improves the code generated by the i386 backend for incrementing
(adding one to) and decrementing (subtracting one from) a vector.  With SSE
materializing the vector -1 is more efficient than materializing the
vector +1, hence x + 1 (increment) is better expressed as x - (-1), and
x - 1 (decrement) is better expressed as x + (-1).  Conveniently the
relevant additions and subtractions are specified as a single pattern,
using a plusminus iterator, in the machine description.

For the four example functions:

typedef char v16sqi __attribute__ ((vector_size(16)));
typedef unsigned char v16uqi __attribute__ ((vector_size(16)));

v16sqi sadd1(v16sqi x) { return x+1; }
v16uqi uadd1(v16uqi x) { return x+1; }
v16sqi saddm1(v16sqi x) { return x-1; }
v16uqi uaddm1(v16uqi x) { return x-1; }

GCC with -O2 -mavx2 previously generated:

sadd1: vpcmpeqd        %xmm1, %xmm1, %xmm1
        vpabsb  %xmm1, %xmm1
        vpaddb  %xmm1, %xmm0, %xmm0
        ret

uadd1: vpcmpeqd        %xmm1, %xmm1, %xmm1
        vpabsb  %xmm1, %xmm1
        vpaddb  %xmm1, %xmm0, %xmm0
        ret

saddm1: vpcmpeqd        %xmm1, %xmm1, %xmm1
        vpabsb  %xmm1, %xmm1
        vpsubb  %xmm1, %xmm0, %xmm0
        ret

uaddm1: vpcmpeqd        %xmm1, %xmm1, %xmm1
        vpaddb  %xmm1, %xmm0, %xmm0
        ret

With this patch, we now consistently generate:

sadd1:  vpcmpeqd        %xmm1, %xmm1, %xmm1
        vpsubb  %xmm1, %xmm0, %xmm0
        ret

uadd1:  vpcmpeqd        %xmm1, %xmm1, %xmm1
        vpsubb  %xmm1, %xmm0, %xmm0
        ret

saddm1: vpcmpeqd        %xmm1, %xmm1, %xmm1
        vpaddb  %xmm1, %xmm0, %xmm0
        ret

uaddm1: vpcmpeqd        %xmm1, %xmm1, %xmm1
        vpaddb  %xmm1, %xmm0, %xmm0
        ret
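
The arithmetic identity behind this rewrite can be checked lane by lane in portable C++. This is a sketch only: the Bytes alias and the functions add_one and sub_minus_one are illustrative names, and plain loops stand in for the vpaddb/vpsubb instructions. The all-ones byte 0xFF plays the role of the -1 lane produced by vpcmpeqd.

```cpp
#include <array>
#include <cstdint>

using Bytes = std::array<std::uint8_t, 16>;

// x + 1 in every lane, wrapping modulo 256 like paddb.
Bytes add_one(Bytes x) {
  for (auto &b : x)
    b = static_cast<std::uint8_t>(b + 1);
  return x;
}

// x - (-1) in every lane, where -1 is the all-ones byte 0xFF,
// wrapping modulo 256 like psubb.
Bytes sub_minus_one(Bytes x) {
  const std::uint8_t minus_one = 0xFF;
  for (auto &b : x)
    b = static_cast<std::uint8_t>(b - minus_one);
  return x;
}
```

Both functions return the same result for every input, including the wrap-around lane 255, which is why the increment can be emitted as a subtraction of the cheaper all-ones constant.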

2026-05-28  Roger Sayle  <roger@nextmovesoftware.com>
    Hongtao Liu  <hongtao.liu@intel.com>
    Uros Bizjak  <ubizjak@gmail.com>

gcc/ChangeLog
* config/i386/i386.md (inv_insn): New define_code_attr.
* config/i386/sse.md (<plusminus><mode>3): Accept a CONST_VECTOR
as the second operand.  If the second operand is CONST1_RTX,
canonicalize to use CONSTM1_RTX instead.
(*add<mode>3_one): New define_insn_and_split to convert padd +1
to psub -1.
(*sub<mode>3_one): Likewise, a new define_insn_and_split to
convert psub +1 to padd -1.

gcc/testsuite/ChangeLog
* gcc.target/i386/avx512f-simd-1.c: Tweak test case.
* gcc.target/i386/sse2-paddb-2.c: New test case.
* gcc.target/i386/sse2-paddd-2.c: Likewise.
* gcc.target/i386/sse2-paddw-2.c: Likewise.
* gcc.target/i386/sse2-psubb-2.c: Likewise.
* gcc.target/i386/sse2-psubd-2.c: Likewise.
* gcc.target/i386/sse2-psubw-2.c: Likewise.

3 weeks ago c++: fix infinite looping with arr[arr] [PR125454]
Marek Polacek [Thu, 28 May 2026 17:43:58 +0000 (13:43 -0400)] 
c++: fix infinite looping with arr[arr] [PR125454]

Here r16-3466 moved the canonicalization step that transforms
idx[array] to array[idx] to the beginning of cp_build_array_ref.
When we have array[array], we'll be swapping till we blow the stack.

Previously, we'd give the !INTEGRAL_OR_UNSCOPED_ENUMERATION_TYPE_P
error so there was no problem.
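
The failure mode can be modelled with a toy C++ function. Everything here is an assumption made for illustration (the Kind enum, the RefResult struct and this build_array_ref bear no relation to the real cp_build_array_ref beyond the idea of swapping operands): if the swap were performed whenever the second operand is an array, array[array] would swap forever, so the sketch swaps only when the first operand is not itself an array.

```cpp
// Toy operand classification; hypothetical, not GCC's tree types.
enum class Kind { Array, Integer };

struct RefResult {
  bool ok;    // true if the subscript is well-formed in this toy model
  int swaps;  // number of canonicalising swaps performed
};

RefResult build_array_ref(Kind base, Kind index, int swaps = 0) {
  // Canonicalise idx[array] to array[idx], but only when the first operand
  // is not itself an array; otherwise array[array] would recurse endlessly.
  if (base != Kind::Array && index == Kind::Array)
    return build_array_ref(index, base, swaps + 1);
  const bool ok = base == Kind::Array && index == Kind::Integer;
  return RefResult{ok, swaps};
}
```

In this model array[array] falls through to the error path without any swap, while idx[array] is swapped exactly once.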

PR c++/125454

gcc/cp/ChangeLog:

* typeck.cc (cp_build_array_ref): Don't recurse for array[array].

gcc/testsuite/ChangeLog:

* g++.dg/other/array8.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>

3 weeks ago fortran: module-contained PRIVATE procedures must have global ELF linkage [PR125430]
Jerry DeLisle [Sat, 23 May 2026 04:56:34 +0000 (21:56 -0700)] 
fortran: module-contained PRIVATE procedures must have global ELF linkage [PR125430]

Assisted by: Claude Sonnet 4.6

gcc/fortran/ChangeLog:

PR fortran/125430
* trans-decl.cc (build_function_decl): Set TREE_PUBLIC for all
module-contained procedures so submodules compiled as separate
translation units can reach them via host association.  Also set
DECL_VISIBILITY to VISIBILITY_HIDDEN for PRIVATE procedures,
matching the existing treatment of module variables.

gcc/testsuite/ChangeLog:

PR fortran/125430
* gfortran.dg/module_private_2.f90: Remove scan-tree-dump-times
assertion for 'priv'; PRIVATE module procedures now have global
linkage with hidden visibility and are no longer optimized away.
* gfortran.dg/public_private_module_2.f90: Add xfail markers to
scan-assembler-not for 'two' and 'six'; update comment to mention
procedures alongside variables.
* gfortran.dg/public_private_module_7.f90: Add xfail marker to
scan-assembler-not for '__m_common_attrs_MOD_other'.
* gfortran.dg/public_private_module_8.f90: Add xfail marker to
scan-assembler-not for '__m_MOD_myotherlen'.
* gfortran.dg/submodule_private_host.f90: New test.
* gfortran.dg/submodule_private_host_aux.f90: New auxiliary file.
* gfortran.dg/warn_unused_function_2.f90: Remove 'defined but not
used' expectation for s1; PRIVATE module procedures now have
global linkage and no longer trigger the unused-function warning.

3 weeks ago [RISC-V] Fix expected testsuite output after ext-dce changes
Jeff Law [Thu, 28 May 2026 17:36:01 +0000 (11:36 -0600)] 
[RISC-V] Fix expected testsuite output after ext-dce changes

The recent changes to ext-dce can transform sign extension to zero extension in
some cases.  As a result tests which previously expected a signed load can now
see an unsigned load.  Of course on rv32 "lw" loads a full word, so this
doesn't show up there.  So instead of looking for "lw" we instead look for
"(lwu|lw)".  This fixes the "regressions" after the ext-dce changes.

gcc/testsuite
* gcc.target/riscv/amo/a-rvwmo-store-compat-seq-cst.c: Adjust expected
output.
* gcc.target/riscv/amo/a-rvwmo-store-relaxed.c: Likewise.
* gcc.target/riscv/amo/a-rvwmo-store-release.c: Likewise.
* gcc.target/riscv/amo/a-ztso-store-compat-seq-cst.c: Likewise.
* gcc.target/riscv/amo/a-ztso-store-relaxed.c: Likewise.
* gcc.target/riscv/amo/a-ztso-store-release.c: Likewise.
* gcc.target/riscv/amo/zalasr-rvwmo-store-compat-seq-cst.c: Likewise.
* gcc.target/riscv/amo/zalasr-rvwmo-store-relaxed.c: Likewise.
* gcc.target/riscv/amo/zalasr-rvwmo-store-release.c: Likewise.
* gcc.target/riscv/amo/zalasr-ztso-store-compat-seq-cst.c: Likewise.
* gcc.target/riscv/amo/zalasr-ztso-store-relaxed.c: Likewise.
* gcc.target/riscv/amo/zalasr-ztso-store-release.c: Likewise.
* gcc.target/riscv/cpymem-64-ooo.c: Likewise.
* gcc.target/riscv/cpymem-64.c: Likewise.
* gcc.target/riscv/memcpy-nonoverlapping.c: Likewise.
* gcc.target/riscv/pr67731.c: Likewise.

3 weeks agoc++: add fixed test [PR106957]
Marek Polacek [Thu, 28 May 2026 16:18:29 +0000 (12:18 -0400)] 
c++: add fixed test [PR106957]

Fixed by r16-8015:
c++: error routines re-entered with uneval lambda [PR124397]

PR c++/106957

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/lambda-uneval32.C: New test.

3 weeks agolibstdc++: Fix -fno-exceptions support in testsuite_allocator.h
Patrick Palka [Thu, 28 May 2026 14:41:33 +0000 (10:41 -0400)] 
libstdc++: Fix -fno-exceptions support in testsuite_allocator.h

This fixes the error

.../testsuite_allocator.h:402:13: error: exception handling disabled, use '-fexceptions' to enable
  402 |             catch(...)
      |             ^~~~~

seen when running some C++23 library tests with -fno-exceptions.
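
The idea behind macros such as __try/__catch can be sketched as follows, assuming they expand to real handlers only when exceptions are enabled.  The SKETCH_* macros and allocate_tracked are hypothetical names used for illustration; they are not the libstdc++ definitions.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins: real handlers when exceptions are enabled,
// plain if-statements otherwise, so the same source builds both ways.
#if defined(__cpp_exceptions)
# define SKETCH_TRY try
# define SKETCH_CATCH(X) catch (X)
# define SKETCH_RETHROW throw
#else
# define SKETCH_TRY if (true)
# define SKETCH_CATCH(X) if (false)
# define SKETCH_RETHROW
#endif

// Hypothetical allocator helper: allocate, then do bookkeeping that
// must be rolled back if it fails.
inline int* allocate_tracked(std::size_t n, std::size_t& live_blocks)
{
  int* p = new int[n];
  SKETCH_TRY
    {
      ++live_blocks;  // bookkeeping that could throw in a real allocator
    }
  SKETCH_CATCH(...)
    {
      delete[] p;     // undo the allocation before propagating
      SKETCH_RETHROW;
    }
  return p;
}
```

With -fno-exceptions the handler body becomes dead code under "if (false)", so no catch keyword ever reaches the compiler.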

libstdc++-v3/ChangeLog:

* testsuite/util/testsuite_allocator.h
(uneq_allocator::allocate): Use __try/__catch instead.
(uneq_allocator::allocate_at_least): Likewise.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
3 weeks agolibstdc++: Fix availability of flat_meow::operator=(initializer_list)
Patrick Palka [Thu, 28 May 2026 14:39:32 +0000 (10:39 -0400)] 
libstdc++: Fix availability of flat_meow::operator=(initializer_list)

This assignment operator was not being brought in from the private base
class causing assignments from {...} to be inefficiently treated as
construction + move assignment.
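
A minimal sketch of the effect, using hypothetical FlatBase and FlatSet classes rather than the real _Flat_set_base: without the using-declaration for operator=, the assignment from a braced list would have to build a temporary and move-assign it.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <vector>

// Hypothetical implementation base class; the counters record which
// path an assignment from a braced list takes.
class FlatBase
{
public:
  FlatBase() = default;
  FlatBase(std::initializer_list<int> il) : items_(il)
  { ++list_constructions; }

  FlatBase& operator=(std::initializer_list<int> il)
  {
    items_.assign(il.begin(), il.end());
    ++list_assignments;
    return *this;
  }

  std::size_t size() const { return items_.size(); }

  static inline int list_constructions = 0;
  static inline int list_assignments = 0;

private:
  std::vector<int> items_;
};

// Hypothetical container deriving privately from the base.
class FlatSet : private FlatBase
{
public:
  using FlatBase::FlatBase;   // inherit the constructors
  using FlatBase::operator=;  // make operator=(initializer_list) visible
  using FlatBase::size;
};
```

With the using-declaration in place, "s = {1, 2, 3}" selects the initializer_list overload directly, since it needs no user-defined conversion.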

libstdc++-v3/ChangeLog:

* include/std/flat_map (flat_map): Bring in operator= from
_Flat_map_base.
(flat_multimap): Likewise.
* include/std/flat_set (flat_set): Bring in operator= from
_Flat_set_base.
(flat_multiset): Likewise.
* testsuite/23_containers/flat_map/1.cc (test11): Simplify by
using = {...}.
(test12): New test.
* testsuite/23_containers/flat_multimap/1.cc (test10): Simplify
by using = {...}.
(test11): New test.
* testsuite/23_containers/flat_multiset/1.cc (test10): Simplify
by using = {...}.
(test11): New test.
* testsuite/23_containers/flat_set/1.cc (test10): Simplify by
using = {...}.
(test11): New test.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
3 weeks agolibstdc++: Implement P3567R2 flat_meow fixes
Patrick Palka [Thu, 28 May 2026 14:39:29 +0000 (10:39 -0400)] 
libstdc++: Implement P3567R2 flat_meow fixes

This implements the changes in sections 5, 6 and 8 of P3567R2; the other
changes (in sections 4 and 7) are effectively already implemented.

libstdc++-v3/ChangeLog:

* include/bits/version.def (flat_map): Bump to 202511.
(flat_set): Likewise.
* include/bits/version.h: Regenerate.
* include/std/flat_map (_Flat_map_impl): Remove
is_nothrow_swappable_v assertions.
(_Flat_map_impl::_Flat_map_impl): Explicitly default copy ctor.
Define move ctor with corrected exception handling as per
P3567R2.
(_Flat_map_impl::operator=): Likewise.
(_Flat_map_impl::insert_range): Define new __sorted_t overload
as per P3567R2.
(_Flat_map_impl::swap): Make conditionally noexcept as per
P3567R2.
* include/std/flat_set (_Flat_set_impl): Remove
is_nothrow_swappable_v assertion.
(_Flat_set_impl::_Flat_set_impl): Explicitly default copy ctor.
Define move ctor with correct invariant preserving behavior as
per P3567R2.
(_Flat_set_impl::operator=): Likewise.
(_Flat_set_impl::_M_insert_range): Factored out from
insert_range.  Add bool parameter __is_sorted defaulted to
false.
(_Flat_set_impl::insert_range): Define new __sorted_t overload
as per P3567R2.
(_Flat_set_impl::swap): Make conditionally noexcept as per
P3567R2.  Correct to use ranges::swap instead of ADL swap.
* testsuite/23_containers/flat_map/1.cc (test11, test12):
New tests.
* testsuite/23_containers/flat_multimap/1.cc (test10, test11):
New tests.
* testsuite/23_containers/flat_multiset/1.cc (test10, test11):
New tests.
* testsuite/23_containers/flat_set/1.cc (test10, test11):
New tests.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
3 weeks agolibstdc++: Fix suboptimal complexity of flat_map::_M_insert
Patrick Palka [Thu, 28 May 2026 14:39:27 +0000 (10:39 -0400)] 
libstdc++: Fix suboptimal complexity of flat_map::_M_insert

Since r16-1742, ranges::inplace_merge is correctly C++20 iterator aware,
which allows us to implement this helper idiomatically with the optimal
complexity N + M log M instead of N log N.
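
The complexity argument can be sketched on a single sorted vector.  This is a simplified illustration with a hypothetical insert_batch helper: it uses std::sort and std::inplace_merge rather than views::zip and ranges::inplace_merge, and it does not remove duplicates.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Append `batch` (M elements) to the already sorted `sorted` (N elements)
// and restore the ordering.  Sorting only the new tail costs O(M log M);
// merging the two sorted runs is O(N + M) when inplace_merge can obtain
// a temporary buffer.
inline void insert_batch(std::vector<int>& sorted,
                         const std::vector<int>& batch,
                         bool batch_is_sorted = false)
{
  const auto old_size = static_cast<std::ptrdiff_t>(sorted.size());
  sorted.insert(sorted.end(), batch.begin(), batch.end());
  const auto mid = sorted.begin() + old_size;
  if (!batch_is_sorted)
    std::sort(mid, sorted.end());  // skipped for pre-sorted input
  std::inplace_merge(sorted.begin(), mid, sorted.end());
}
```

Re-sorting the whole container after appending would instead touch all N existing elements in the sort, which is the cost the merge-based approach avoids.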

libstdc++-v3/ChangeLog:

* include/std/flat_map (_Flat_map_impl::_M_insert): New bool
parameter __is_sorted defaulted to false.  Reimplement using
views::zip and ranges::inplace_merge.
(_Flat_map_impl::insert): In the __sorted_t overload, pass
__is_sorted=true to _M_insert.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
3 weeks agocobol: Add assertion to suppress -Warray-bounds false positive [PR125404]
Jonathan Wakely [Wed, 20 May 2026 21:24:33 +0000 (22:24 +0100)] 
cobol: Add assertion to suppress -Warray-bounds false positive [PR125404]

This works around a warning from std::vector code, which seems to be
assuming that the vector is empty and therefore calling back() would be
invalid:

/home/test/src/gcc/gcc/cobol/symfind.cc:526:45: error: array subscript -1 is outside array bounds of ‘long unsigned int [1152921504606846975]’ [-Werror=array-bounds=]
  526 |                     return ancestors.back() == i01;
      |                            ~~~~~~~~~~~~~~~~~^~~~~~

Compiling with -D_GLIBCXX_ASSERTIONS also fixes the warning.

gcc/cobol/ChangeLog:

PR cobol/125404
* symfind.cc (symbol_find): Add assertion that ancestors vector
is not empty.

3 weeks agoRISC-V: Fix REGNO_REG_CLASS for FP hard registers
Jin Ma [Tue, 26 May 2026 03:25:57 +0000 (11:25 +0800)] 
RISC-V: Fix REGNO_REG_CLASS for FP hard registers

The GCC Internals Manual, section 19.8 "Register Classes", documents
REGNO_REG_CLASS as:

  REGNO_REG_CLASS (regno)                                      [Macro]
    A C expression whose value is a register class containing hard
    register regno.  In general there is more than one such class;
    choose a class which is minimal, meaning that no smaller class
    also contains the register.

riscv_regno_to_class[] currently maps every FP hard register to
RVC_FP_REGS, but RVC_FP_REGS only contains f8-f15.  The entries for
f0-f7 and f16-f31 therefore violate the "containing hard register
regno" half of the contract: the returned class does not contain the
register at all.

The mismatch corrupts IRA's cost model.  setup_allocno_cost_vector
indexes the per-hard-reg cost slot via REGNO_REG_CLASS:

  rclass = REGNO_REG_CLASS (hard_regno);
  num = cost_classes_ptr->index[rclass];
  ...
  reg_costs[j] = COSTS (costs, i)->cost[num];

After setup_regno_cost_classes_by_mode adds RVC_FP_REGS to the cost
classes, the cost for e.g. f16 is silently read from the RVC_FP_REGS
slot.

The new fp-reg-class.c testcase puts eight "cf"- and sixteen "f"-
constrained doubles live across a call.  In the buggy state IRA
places the cf pseudos outside the cf class and LRA recovers with
sixteen fmv.d to fs* registers; with the fix IRA spills those values
honestly and the IRA "+++Costs" line reports a non-zero "mem"
component.

Fix it by giving each FP hard register its minimal class: FP_REGS for
f0-f7 and f16-f31, RVC_FP_REGS for f8-f15.  As a companion change,
switch riscv_secondary_memory_needed from class-equality tests to
reg_class_subset_p so it still recognises the FP side regardless of
which subclass the table returns.
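
The mapping and the subset test can be modelled as below.  This is a sketch under simplifying assumptions: the RegClass enum only contains the two FP classes named above, and the helper names are hypothetical, not the functions in riscv.cc.

```cpp
#include <cassert>

// Hypothetical model with only the two FP classes from the text.
enum class RegClass { FP_REGS, RVC_FP_REGS };

// True if FP register f<n> belongs to `rclass` (f0-f31 are valid).
constexpr bool class_contains(RegClass rclass, unsigned n)
{
  return n <= 31
         && (rclass == RegClass::FP_REGS || (n >= 8 && n <= 15));
}

// Minimal class containing f<n>: RVC_FP_REGS for f8-f15, else FP_REGS.
constexpr RegClass minimal_fp_class(unsigned n)
{
  return class_contains(RegClass::RVC_FP_REGS, n) ? RegClass::RVC_FP_REGS
                                                  : RegClass::FP_REGS;
}

// Subset test in the spirit of reg_class_subset_p: an equality check
// against FP_REGS would miss RVC_FP_REGS, a subset check does not.
constexpr bool is_fp_side(RegClass rclass)
{
  return rclass == RegClass::FP_REGS || rclass == RegClass::RVC_FP_REGS;
}

static_assert(minimal_fp_class(16) == RegClass::FP_REGS, "f16 is not in RVC_FP_REGS");
static_assert(minimal_fp_class(8) == RegClass::RVC_FP_REGS, "f8 is in RVC_FP_REGS");
```

The invariant being restored is that class_contains (minimal_fp_class (n), n) holds for every FP register, which the old table violated for f0-f7 and f16-f31.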

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_regno_to_class): Use the minimal
class containing each FP hard register: FP_REGS for f0-f7 and
f16-f31, RVC_FP_REGS for f8-f15.
(riscv_secondary_memory_needed): Use reg_class_subset_p to
detect FP classes.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/fp-reg-class.c: New test.

3 weeks agoRISC-V: Support VLS LMUL cost scaling
Zhongyao Chen [Thu, 28 May 2026 11:27:25 +0000 (19:27 +0800)] 
RISC-V: Support VLS LMUL cost scaling

Make VLS (fixed-length) vector modes use the same LMUL cost scaling as
VLA modes. As a result, the vectorizer sometimes picks smaller LMULs.

Here is how I updated the tests that failed in regression testing:
  - dyn-lmul-conv-[1-2].c: The cost model now prefers smaller LMULs,
    so update expectations.
  - pr123414.c: This test relies on large LMULs to trigger a specific bug,
    so add -fno-vect-cost-model to keep it working.

gcc/ChangeLog:

* config/riscv/riscv-vector-costs.cc (get_lmul_cost_scaling):
Enable scaling for all vector modes (VLA and VLS).

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/dyn-lmul-conv-1.c: Update expected LMUL counts.
* gcc.target/riscv/rvv/autovec/dyn-lmul-conv-2.c: Likewise.
* gcc.target/riscv/rvv/autovec/pr123414.c: Disable vector cost model.

Signed-off-by: Zhongyao Chen <chen.zhongyao@zte.com.cn>
3 weeks agoavr.opt.urls: Add -masm-len-notes and -Wasm-len-notes.
Georg-Johann Lay [Thu, 28 May 2026 12:45:51 +0000 (14:45 +0200)] 
avr.opt.urls: Add -masm-len-notes and -Wasm-len-notes.

gcc/
* config/avr/avr.opt.urls (-masm-len-notes, -Wasm-len-notes): Add.

3 weeks agotestsuite: add AVX512 requirement to vect-early-break-no-epilog_11.c
Tamar Christina [Thu, 28 May 2026 12:04:00 +0000 (13:04 +0100)] 
testsuite: add AVX512 requirement to vect-early-break-no-epilog_11.c

This testcase on x86_64 needs AVX512 to vectorize.
My original testing used -march=native so it was on by default.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/vect-early-break-no-epilog_11.c: Add AVX512 for x86_64.

3 weeks agolibstdc++: Optimize operator<< for piecewise distributions.
Tomasz Kamiński [Mon, 25 May 2026 13:15:09 +0000 (15:15 +0200)] 
libstdc++: Optimize operator<< for piecewise distributions.

This avoids creating a temporary vector and uses the _M_int and _M_den
members of _M_param. Empty _M_int (default) values are handled by
printing values directly.

libstdc++-v3/ChangeLog:

* include/bits/random.h (piecewise_constant_distribution::param_type)
(piecewise_linear_distribution::param_type): Befriend operator<<.
* include/bits/random.tcc
(operator<<(basic_ostream&, piecewise_linear_distribution))
(operator<<(basic_ostream&, piecewise_constant_distribution)):
Use __x._M_param._M_int and __x._M_param._M_den instead of accessors.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
3 weeks agolibstdc++: Expand serialization test for piecewise distributions.
Tomasz Kamiński [Mon, 25 May 2026 12:53:43 +0000 (14:53 +0200)] 
libstdc++: Expand serialization test for piecewise distributions.

Due to the variability of the results, the tests are currently limited
to x86_64 architectures. The float/double tests are disabled for -m32
as I was getting unstable results.
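
For context, a serialization round trip of the kind such tests exercise can be sketched as follows.  The round_trips helper is hypothetical and is not taken from the new test files; the example deliberately uses weights whose densities are exactly representable to sidestep the floating-point variability mentioned above.

```cpp
#include <cassert>
#include <random>
#include <sstream>
#include <vector>

// Write a distribution to a stream, read it back, and compare.
inline bool
round_trips(const std::piecewise_constant_distribution<double>& dist)
{
  std::stringstream buffer;
  buffer << dist;      // serialize with operator<<
  std::piecewise_constant_distribution<double> restored;
  buffer >> restored;  // deserialize with operator>>
  return !buffer.fail() && restored == dist;
}
```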

libstdc++-v3/ChangeLog:

* testsuite/26_numerics/random/piecewise_constant_distribution/operators/serialize2.cc:
New test.
* testsuite/26_numerics/random/piecewise_linear_distribution/operators/serialize2.cc:
New test.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
3 weeks agoaarch64/sve: combine AdvSIMD and SVE vec_duplicates
Artemiy Volkov [Mon, 22 Dec 2025 12:46:21 +0000 (12:46 +0000)] 
aarch64/sve: combine AdvSIMD and SVE vec_duplicates

Currently, to duplicate a 64-bit or narrower value into a SVE register, we
choose to go via an intermediate 128-bit AdvSIMD register, viz.:

svfloat32_t foo(float x) {
    return svdupq_n_f32(x, x, x, x);
}

which will produce the following code:

        dup     v0.4s, v0.s[0]
        dup     z0.q, z0.q[0]
        ret

when compiled with -O2 -march=armv9-a+sve.

This can be simplified into a single dup instruction going to an SVE
register directly from a scalar (or a smaller vector) value:

        mov     z0.s, s0
        ret

To facilitate this, this patch adds a pattern that combine can use to
merge two vec_duplicate instructions (scalar -> AdvSIMD and AdvSIMD ->
SVE) into a single one (scalar -> SVE).

To demonstrate the effect of this patch, the vec-init-23.c test from
AdvSIMD was reused as a new SVE test (vec_init_5.c).

gcc/ChangeLog:

* config/aarch64/aarch64-sve.md
(*aarch64_vec_duplicate_subvector<vconsv><vconq><mode>):
New pattern.
* config/aarch64/iterators.md (VCONSV): New mode attribute.
(vconsv): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve/vec_init_5.c: New test.

3 weeks agoaarch64: implement vec_concat support for sub-64-bit types
Artemiy Volkov [Thu, 26 Feb 2026 08:45:08 +0000 (08:45 +0000)] 
aarch64: implement vec_concat support for sub-64-bit types

This patch improves handling of 2-element vec_concats in
aarch64_vector_init_fallback (); where previously the aarch64_vec_concat
insn was emitted only for pairs of vectors, we now allow scalar operands
as well.  Furthermore, if the two operands are the same, we can now emit a
vec_duplicate instead of a vec_concat, leading to better code generation.

This is backed by the new combine{z,_internal}{,_be} insn patterns, that
were each split between integral 16- and 32-bit modes (only involving GPRs
and memory), and the rest (requiring the "w" alternatives as well).

The effect of the changes is illustrated by the changes to vec-init-23.c,
introduced in the previous patch (and a handful of other vector-init
related tests).

gcc/ChangeLog:

* config/aarch64/aarch64-simd.md (*aarch64_combine_internal<mode>):
New insn pattern.
(*aarch64_combine_internal_be<mode>): Likewise.
(*aarch64_combinez<mode>): Likewise.
(*aarch64_combinez_be<mode>): Likewise.
(@aarch64_vec_concat<mode>): Support smaller vector and scalar modes.
* config/aarch64/aarch64.cc (aarch64_expand_vector_init_fallback):
Handle the case of two scalar elements.
* config/aarch64/iterators.md (SSUB64): New mode iterator.
(VSSUB64): Likewise.
(VSSUB32_I): Likewise.
(VSSUB64_F): Likewise.
(VS32_I_SUB64_F): Likewise.
(single_wx): Define attribute for sub-64-bit vector and scalar modes.
(bitsize): Likewise.
(VDBL): Likewise.
(single_dwx): New mode attribute.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve/gather_load_10.c: Adjust testcase.
* gcc.target/aarch64/sve/slp_1.c: Likewise.
* gcc.target/aarch64/vec-init-18.c: Likewise.
* gcc.target/aarch64/vec-init-23.c: Likewise.

3 weeks agoaarch64: initialize vectors from starting subsequence
Artemiy Volkov [Thu, 26 Feb 2026 09:01:30 +0000 (09:01 +0000)] 
aarch64: initialize vectors from starting subsequence

Now that we have 2- and 4-element vector modes for all the sub-word scalar
modes, we can emit more efficient code when the elements of a vector
constructor can be generated from a common starting subsequence whose
length is a power of two.  To do this, first detect the shortest possible
starting subsequence by repeatedly folding the initial constructor element
array in half, as long as the left and the right halves are equal.  Then,
after emitting the subsequence, duplicate it by generating a
vec_duplicate with the correct source mode.
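
The folding step described above can be sketched on a plain element array.  The function name is hypothetical and the sketch works on values rather than rtx operands:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Return the length of the shortest starting subsequence that, repeated
// a power-of-two number of times, reproduces `elts`.  The array is folded
// in half for as long as its left and right halves are equal.
template <typename T>
std::size_t shortest_starting_subsequence(const std::vector<T>& elts)
{
  std::size_t n = elts.size();
  while (n > 1 && n % 2 == 0
         && std::equal(elts.begin(), elts.begin() + n / 2,
                       elts.begin() + n / 2))
    n /= 2;
  return n;
}
```

For {x, y, z, w, x, y, z, w} with distinct values this yields 4, for an all-equal array it yields 1, and for an array with no repetition it returns the full length.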

On the MD side, this requires implementing the vec_duplicate optab to
duplicate an arbitrary sub-128-bit value into a full 64- or a 128-bit
AdvSIMD register, as well as the vec_set insn for the VSUB64 modes (needed
as fallback for the divide-and-conquer approach).  The latter uses a
properly scaled and shifted "bfi" for integer values, and a properly
indexed "ins" for FP elements.
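
To illustrate what a scaled and shifted bitfield insert does for integer elements, here is a small software model.  The function names are hypothetical, and the model only mirrors the 32-bit case:

```cpp
#include <cassert>
#include <cstdint>

// Model of a 32-bit bitfield insert: copy the low `width` bits of `src`
// into `dst` starting at bit `lsb`, leaving the other bits of `dst` alone.
// Precondition: width >= 1 and lsb + width <= 32.
constexpr std::uint32_t bitfield_insert(std::uint32_t dst, std::uint32_t src,
                                        unsigned lsb, unsigned width)
{
  const std::uint32_t field
    = width == 32 ? 0xffffffffu : ((std::uint32_t{1} << width) - 1u);
  const std::uint32_t mask = field << lsb;
  return (dst & ~mask) | ((src << lsb) & mask);
}

// Pack two 16-bit lanes into one 32-bit value, lane 0 in the low half.
constexpr std::uint32_t pack_lanes(std::uint16_t lane0, std::uint16_t lane1)
{
  return bitfield_insert(lane0, lane1, 16, 16);
}

static_assert(pack_lanes(0x1234, 0x5678) == 0x56781234u,
              "lane 1 lands in the upper half");
```

Once two 16-bit elements sit in one 32-bit register like this, a single duplicate of the 32-bit value fills a whole vector with the repeating pair.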

This change allows us to get rid of long chains of inserts and compile
things like:

int16x8_t f (int16_t x, int16_t y, int16_t z, int16_t w)
{
  return (int16x8_t) {x, y, z, w, x, y, z, w};
}

into:

bfi     w0, w2, 16, 16
bfi     w1, w3, 16, 16
dup     v31.2s, w0
dup     v0.2s, w1
zip1    v0.8h, v31.8h, v0.8h
ret

rather than:

dup     v31.4h, w0
dup     v0.4h, w1
ins     v31.h[1], w2
ins     v0.h[1], w3
ins     v31.h[3], w2
ins     v0.h[3], w3
zip1    v0.8h, v31.8h, v0.8h
ret

This patch also includes an extensive new test, which includes the above
case, as well as adjustments to existing codegen tests as necessary.

gcc/ChangeLog:

* config/aarch64/aarch64-simd.md (*aarch64_simd_dup_subvector<vconq><mode>):
New insn pattern.
(*aarch64_simd_dup_subvector<vcond><mode>): Likewise.
(@aarch64_simd_vec_set<mode>): Likewise.
(vec_set<mode>): Handle 16- and 32-bit vector modes in the expander.
* config/aarch64/aarch64.cc (aarch64_expand_vector_init_fallback): Add
logic to initialize vector from starting subsequence.  Make static.
(scalar_move_insn_p): Consider sub-64-bit vector moves scalar.
* config/aarch64/iterators.md (VDDUP): New iterator.
(VQDUP): Likewise.
(elem_bits): Define attribute for sub-64-bit vector modes.
(Vetype): Likewise.
(VEL): Likewise.
(single_wx): Define attribute for sub-64-bit vector and scalar modes.
(single_type): Likewise.
(VCOND): Likewise.
(VCONQ): Likewise.
(Vqduptype): New mode attribute.
(Vdduptype): Likewise.
(vcond): Likewise.
(vconq): Likewise.
(vstype): Define attribute for 64-bit vector and sub-128-bit scalar
modes.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/ldp_stp_16.c: Adjust testcase.
* gcc.target/aarch64/sve/slp_1.c: Likewise.
* gcc.target/aarch64/vec-init-18.c: Likewise.
* gcc.target/aarch64/vec-init-23.c: New test.

3 weeks agoaarch64: introduce partial AdvSIMD vector modes
Artemiy Volkov [Mon, 18 May 2026 10:21:18 +0000 (10:21 +0000)] 
aarch64: introduce partial AdvSIMD vector modes

In addition to V2HF that already exists, this patch adds 4 more partial
(16- and 32-bit) AdvSIMD vector modes: V4QI, V2QI, V2HI, and V2BF.  For
now, these are intended only for duplication into full-sized (32-, 64-,
and 128-bit) registers.  As a minimal closure required to bootstrap the
compiler, this also implements the "mov" expand and the "aarch64_simd_mov"
insn_and_split for the new modes (gathered under the VSUB64 iterator).

This patch also adds the new aarch64_advsimd_sub_dword_mode_p () helper to
facilitate detecting the new modes; that is then used (a) to disable
vec_perm_const vectorization for those modes, (b) in the "mov" expander
for those modes, and (c) to define the new "Da" constraint.

Some existing testcases were adjusted where needed.  (The _Float16
testcase in sve/slp_1.c temporarily expects GPRs to be used for V2HF,
which is corrected to FPRs by the succeeding patch; and the half-float
complex tests now recognize some of the patterns, but check that V2BF
still can't be used for vectorization.)

gcc/ChangeLog:

* config/aarch64/aarch64-modes.def (VECTOR_MODE): Remove V2HF.
(VECTOR_MODES): Define V2QI, V4QI, V2HI, V2HF, V2BF.
* config/aarch64/aarch64-protos.h
(aarch64_advsimd_sub_dword_mode_p): Declare new predicate.
* config/aarch64/aarch64-simd.md (*aarch64_simd_mov<mode>): New
define_insn_and_split pattern.
(mov<mode>): Add sub-64-bit vector modes to the VALL_F16 expander.
Forego const vector expansion for those modes.
* config/aarch64/aarch64.cc (aarch64_classify_vector_mode):
Handle 16- and 32-bit vector modes.
(aarch64_advsimd_sub_dword_mode_p): Define new predicate.
(aarch64_vectorize_vec_perm_const): Refuse for partial vector modes.
* config/aarch64/constraints.md (Da): New constraint.
* config/aarch64/iterators.md (VSUB64): New iterator.
(VALL_F16_SUB64): Likewise.
(size): Define attribute for sub-64-bit vector modes.
(VSC): New mode attribute.
(vstype): Likewise.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/complex/bb-slp-complex-add-half-float.c: Adjust testcase.
* gcc.dg/vect/complex/bb-slp-complex-mla-half-float.c: Likewise.
* gcc.dg/vect/complex/bb-slp-complex-mul-half-float.c: Likewise.
* gcc.target/aarch64/sve/slp_1.c: Likewise.

3 weeks agoi386: Refine c86-4g fdiv scheduling model
Kewen Lin [Thu, 28 May 2026 11:22:57 +0000 (11:22 +0000)] 
i386: Refine c86-4g fdiv scheduling model

Commit r17-258 introduced separate c86-4g fdiv units to avoid the
automaton explosion caused by modeling the whole divider latency on
normal FPU pipes.  But the real hardware may keep the associated FPU
pipe occupied for some cycles at both the beginning and the end of
an fdiv or sqrt operation.  Following Alexander's suggestion in [1],
this patch still keeps the long-latency part on the dedicated fdiv
unit but models only a bounded part of the FPU pipe occupancy.  The
first four cycles reserve both the selected FPU pipe and the fdiv
unit; the remaining cycles keep only the fdiv unit.

Taking r17-258 as baseline, I tried K = 1,2,3,4 for

  fpu,divider*N -> (fpu+divider)*K, divider*(N-K)

and measured the time for build/genautomata and the top 100 symbol
sizes of insn-automata.o (baseline normalized as 100) as below:

1) without any other changes:
              time     size
  baseline    100      100
  r17-203     340.0    629.3
  K1          100.3    100
  K2          105.5    112.5
  K3          112.8    129
  K4          119.4    141

2) Splitting fpu0/fpu2 and fpu1/fpu3 into paired automata:
              time     size
  baseline    100      100
  r17-203     340.0    629.3
  KS1         79.6     43.3
  KS2         79.8     43.3
  KS3         79.6     43.3
  KS4         79.4     43.3

It turns out that if we want to model the FPU occupancy for some
beginning cycles, separating the involved fpu1/fpu3 from the
original fpu looks better.  So this patch splits fpu0/fpu2 and
fpu1/fpu3 into two paired automata and this extra coupling does
not grow the main FPU automata significantly.
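
The transformation "fpu,divider*N -> (fpu+divider)*K, divider*(N-K)" can be written out mechanically.  The helper below is a hypothetical illustration that only builds the right-hand side in the notation used above; the unit names are placeholders, not the units defined in the .md files.

```cpp
#include <cassert>
#include <string>

// Build "(fpu+divider)*K,divider*(N-K)": both units are held for the
// first K cycles, then only the divider for the remaining N-K cycles.
// Precondition: 1 <= k <= n.
inline std::string split_reservation(const std::string& fpu,
                                     const std::string& divider,
                                     unsigned n, unsigned k)
{
  std::string result = "(" + fpu + "+" + divider + ")*" + std::to_string(k);
  if (n > k)
    result += "," + divider + "*" + std::to_string(n - k);
  return result;
}
```

Keeping K small bounds how long the FPU pipe stays coupled to the divider, which is what the measurements above vary.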

This patch also corrects some other modeling omissions like:

  - Fix an off-by-one-cycle latency typo in c86_4g_fp_op_idiv_load.
  - Merge the old c86_4g_m7 idiv DI/SI/HI reservations after
    aligning their latency and divider unit occupancy (with
    updated values), while keeping QI separate.
  - Adjust reservation units in templates like
    c86_4g_m7_avx_vpinsr_reg_load and c86_4g_m7_avx512_sseadd_xy
    etc.
  - Add missing reservation units and unit occupancy in templates
    like c86_4g_m7_avx512_permi2_ymm and
    c86_4g_m7_sse_sseiadd_hplus_load etc.
  - Adjust reservation units and unit occupancy in templates like
    c86_4g_m7_avx512_perm_zmm_imm, c86_4g_m7_avx512_expand and
    c86_4g_m7_avx512_ssemul etc.

It also introduces some reusable reservation aliases to simplify
the modeling.

I tested build time for i686 bootstrapping in a docker container:
  - r17-202: 2437s (before c86-4g support)
  - r17-203: 7291s (c86-4g support)
  - r17-258: 2646s (tweaking for build time)
  - this: 2358s
It looks like this patch improves build time (it is even better than
r17-202, though that small gap may be due to jitter).

The symbol sizes are improved as below:

nm -CS -t d --defined-only gcc/insn-automata.o \
    | sed 's/^[0-9]* 0*//' \
    | sort -n | tail -20

with r17-258:

  20068 r bdver1_fp_transitions
  22354 r c86_4g_m7_ieu_min_issue_delay
  26208 r slm_min_issue_delay
  26580 t internal_min_issue_delay(int, DFA_chip*)
  26869 t internal_state_transition(int, DFA_chip*)
  27244 r bdver1_fp_min_issue_delay
  28518 r glm_check
  28518 r glm_transitions
  33690 r geode_min_issue_delay
  33728 r c86_4g_fp_transitions
  45436 r znver4_fpu_min_issue_delay
  46980 r bdver3_fp_min_issue_delay
  49428 r glm_min_issue_delay
  53730 r btver2_fp_min_issue_delay
  53760 r znver1_fp_transitions
  89414 r c86_4g_m7_ieu_transitions
  93960 r bdver3_fp_transitions
  181744 r znver4_fpu_transitions
  326322 r c86_4g_m7_fpu_min_issue_delay
  1305288 r c86_4g_m7_fpu_transitions

with this:

  17872 r print_reservation(_IO_FILE*, rtx_insn*)::...
  20068 r bdver1_fp_check
  20068 r bdver1_fp_transitions
  22016 r c86_4g_m7_fpu02_transitions
  22354 r c86_4g_m7_ieu_min_issue_delay
  26208 r slm_min_issue_delay
  27244 r bdver1_fp_min_issue_delay
  28199 t internal_min_issue_delay(int, DFA_chip*)
  28362 t internal_state_transition(int, DFA_chip*)
  28518 r glm_check
  28518 r glm_transitions
  33690 r geode_min_issue_delay
  45436 r znver4_fpu_min_issue_delay
  46980 r bdver3_fp_min_issue_delay
  49428 r glm_min_issue_delay
  53730 r btver2_fp_min_issue_delay
  53760 r znver1_fp_transitions
  89414 r c86_4g_m7_ieu_transitions
  93960 r bdver3_fp_transitions
  181744 r znver4_fpu_transitions

Based on random sampling of SPEC2017 benchmarks 525.x264_r and
521.wrf_r, I verified that the new modeling introduces no
significant compilation overhead.  Testing with a single job on a
c86-4g-m7 machine revealed no impact on x264 and a tiny increase
for wrf (~0.3%).

[1] https://gcc.gnu.org/pipermail/gcc-patches/2026-May/716681.html

gcc/ChangeLog:

* config/i386/c86-4g-m7.md (c86_4g_m7_fpu): Remove automaton.
(c86_4g_m7_fpu02): New automaton.
(c86_4g_m7_fpu13): Ditto.
(c86-4g-m7-fpu0): Move to c86_4g_m7_fpu02 automaton.
(c86-4g-m7-fpu1): Move to c86_4g_m7_fpu13 automaton.
(c86-4g-m7-fpu2): Move to c86_4g_m7_fpu02 automaton.
(c86-4g-m7-fpu3): Move to c86_4g_m7_fpu13 automaton.
(c86-4g-m7-fdiv): Remove cpu unit.
(c86-4g-m7-fdiv1): New cpu unit.
(c86-4g-m7-fdiv3): Ditto.
(c86-4g-m7-fpu_0_3): New reservation.
(c86-4g-m7-fpu_1_3x2): Ditto.
(c86-4g-m7-fpu_1_3x3): Ditto.
(c86-4g-m7-fpu_1_3x6): Ditto.
(c86-4g-m7-fpux2): Ditto.
(c86-4g-m7-fpux4): Ditto.
(c86-4g-m7-fpux6): Ditto.
(c86-4g-m7-fpux8): Ditto.
(c86-4g-m7-fpux16): Ditto.
(c86-4g-m7-fp1fdiv1x4): Ditto.
(c86-4g-m7-fp3fdiv3x4): Ditto.
(c86-4g-m7-fdiv13): Ditto.
(c86-4g-m7-fp13div13): Ditto.
(c86-4g-m7-fp13div13x4): Ditto.
(c86-4g-m7-fp1div1_fp3div3_x4x8): Ditto.
(c86-4g-m7-fp1div1_fp3div3_x4x9): Ditto.
(c86-4g-m7-fp1div1_fp3div3_x4x11): Ditto.
(c86-4g-m7-fp1div1_fp3div3_x4x15): Ditto.
(c86-4g-m7-fp1div1_fp3div3_x4x18): Ditto.
(c86_4g_m7_idiv): New reservation.
(c86_4g_m7_idiv_QI): Adjust reservation latency and unit occupancy.
(c86_4g_m7_idiv_load): New reservation.
(c86_4g_m7_idiv_QI_load): Adjust reservation latency and unit
occupancy.
(c86_4g_m7_idiv_DI): Remove reservation.
(c86_4g_m7_idiv_SI): Ditto.
(c86_4g_m7_idiv_HI): Ditto.
(c86_4g_m7_idiv_DI_load): Ditto.
(c86_4g_m7_idiv_SI_load): Ditto.
(c86_4g_m7_idiv_HI_load): Ditto.
(c86_4g_m7_sse_insertimm): Adjust reservation units and unit
occupancy.
(c86_4g_m7_sse_insert): Ditto.
(c86_4g_m7_fp_sqrt): Adjust reservation.
(c86_4g_m7_fp_div): Ditto.
(c86_4g_m7_fp_div_load): Ditto.
(c86_4g_m7_fp_idiv_load): Ditto.
(c86_4g_m7_sse_pinsr_reg): Adjust reservation units and unit
occupancy.
(c86_4g_m7_sse_pinsr_reg_load): Ditto.
(c86_4g_m7_avx_vpinsr_reg): Ditto.
(c86_4g_m7_avx_vpinsr_reg_load): Ditto.
(c86_4g_m7_avx512_perm_xmm): Delete the prefix condition.
(c86_4g_m7_avx512_perm_xmm_opload): Ditto.
(c86_4g_m7_avx512_permi2_ymm): Adjust reservation units and unit
occupancy.
(c86_4g_m7_avx512_permi2_zmm): Ditto.
(c86_4g_m7_avx512_permi2_ymm_load): Ditto.
(c86_4g_m7_avx512_permi2_zmm_load): Ditto.
(c86_4g_m7_avx512_perm_zmm_imm): Ditto.
(c86_4g_m7_avx512_perm_zmm_imm_load): Ditto.
(c86_4g_m7_avx512_perm_zmm_noimm): Ditto.
(c86_4g_m7_sse_perm_zmm_noimm_load): Ditto.
(c86_4g_m7_avx_perm_ymm): Remove.
(c86_4g_m7_avx_perm_ymem): Ditto.
(c86_4g_m7_avx512_shuf_zmm): Adjust reservation units and unit
occupancy.
(c86_4g_m7_avx512_shuf_zmem): Ditto.
(c86_4g_m7_avx512_cmpestr): Ditto.
(c86_4g_m7_avx512_cmpestr_load): Ditto.
(c86_4g_m7_avx512_vdbpsadbw_zmm): Ditto.
(c86_4g_m7_avx512_vdbpsadbw_zmem): Ditto.
(c86_4g_m7_avx_ssecomi_comi): Ditto.
(c86_4g_m7_avx_ssecomi_comi_load): Ditto.
(c86_4g_m7_avx512_expand): Ditto.
(c86_4g_m7_avx512_expand_load): Ditto.
(c86_4g_m7_avx512_expand_z): Ditto.
(c86_4g_m7_avx512_expand_z_load): Ditto.
(c86_4g_m7_sse_movnt_xy): Rename to c86_4g_m7_sse_movnt.
(c86_4g_m7_avx512_sseadd_xy): Adjust reservation units.
(c86_4g_m7_avx512_sseadd_xy_load): Ditto.
(c86_4g_m7_sse_sseiadd_hplus): Adjust reservation units and unit
occupancy.
(c86_4g_m7_sse_sseiadd_hplus_load): Ditto.
(c86_4g_m7_avx512_ssemul): Adjust reservation units.
(c86_4g_m7_avx512_ssemul_load): Ditto.
(c86_4g_m7_avx512_ssediv): Remove.
(c86_4g_m7_avx512_ssediv_mem): Remove.
(c86_4g_m7_avx512_ssediv_x): New.
(c86_4g_m7_avx512_ssediv_xmem): New.
(c86_4g_m7_avx512_ssediv_y): New.
(c86_4g_m7_avx512_ssediv_ymem): New.
(c86_4g_m7_avx512_ssediv_z): Adjust reservation units.
(c86_4g_m7_avx512_ssediv_zmem): Ditto.
(c86_4g_m7_avx512_ssecmp_z): Add reservation units and unit
occupancy.
(c86_4g_m7_avx512_ssecmp_z_load): Ditto.
(c86_4g_m7_avx512_ssecmp_vp_z): New reservation.
(c86_4g_m7_avx512_ssecmp_vp_z_load): Ditto.
(c86_4g_m7_avx512_ssecmp_test_z): Remove reservation.
(c86_4g_m7_avx512_ssecmp_test_z_load): Ditto.
(c86_4g_m7_avx512_muladd): Broaden matching condition.
(c86_4g_m7_avx512_muladd_load): Ditto.
(c86_4g_m7_fma_muladd): Remove reservation.
(c86_4g_m7_fma_muladd_load): Ditto.
(c86_4g_m7_avx512_sse_conflict_x): Add reservation units and unit
occupancy.
(c86_4g_m7_avx512_sse_conflict_x_load): Ditto.
(c86_4g_m7_avx512_sse_conflict_y): Ditto.
(c86_4g_m7_avx512_sse_conflict_y_load): Ditto.
(c86_4g_m7_avx512_sse_conflict_z): Ditto.
(c86_4g_m7_avx512_sse_conflict_z_load): Ditto.
(c86_4g_m7_avx512_sse_class_z): Add reservation units and unit
occupancy.
(c86_4g_m7_avx512_sse_class_z_load): Ditto.
(c86_4g_m7_avx512_sse_sqrt): Remove.
(c86_4g_m7_avx512_sse_sqrt_load): Remove.
(c86_4g_m7_avx512_sse_sqrt_sf_x): New.
(c86_4g_m7_avx512_sse_sqrt_sf_xload): New.
(c86_4g_m7_avx512_sse_sqrt_sf_y): New.
(c86_4g_m7_avx512_sse_sqrt_sf_yload): New.
(c86_4g_m7_avx512_sse_sqrt_sf_z): New.
(c86_4g_m7_avx512_sse_sqrt_sf_zload): New.
(c86_4g_m7_avx512_sse_sqrt_df_x): New.
(c86_4g_m7_avx512_sse_sqrt_df_xload): New.
(c86_4g_m7_avx512_sse_sqrt_df_y): New.
(c86_4g_m7_avx512_sse_sqrt_df_yload): New.
(c86_4g_m7_avx512_sse_sqrt_df_z): New.
(c86_4g_m7_avx512_sse_sqrt_df_zload): New.
(c86_4g_m7_avx512_msklog_vector): Add reservation units and unit
occupancy.
(c86_4g_m7_avx512_mskmov_z_k): Ditto.
(c86_4g_m7_avx512_mskmov_k_reg): Ditto.
* config/i386/c86-4g.md (c86_4g_fp): Remove automaton.
(c86_4g_fp024): New automaton.
(c86_4g_fp1): Ditto.
(c86-4g-fp0): Move to c86_4g_fp024 automaton.
(c86-4g-fp1): Move to c86_4g_fp1 automaton.
(c86-4g-fp2): Move to c86_4g_fp024 automaton.
(c86-4g-fp3): Ditto.
(c86-4g-fp1fdivx4): New reservation.
(c86_4g_fp_sqrt): Adjust reservation.
(c86_4g_sse_sqrt_sf): Ditto.
(c86_4g_sse_sqrt_sf_mem): Ditto.
(c86_4g_sse_sqrt_df): Ditto.
(c86_4g_sse_sqrt_df_mem): Ditto.
(c86_4g_fp_op_div): Ditto.
(c86_4g_fp_op_div_load): Ditto.
(c86_4g_fp_op_idiv_load): Adjust reservation latency.
(c86_4g_ssediv_ss_ps): Adjust reservation.
(c86_4g_ssediv_ss_ps_load): Ditto.
(c86_4g_ssediv_sd_pd): Ditto.
(c86_4g_ssediv_sd_pd_load): Ditto.
(c86_4g_ssediv_avx256_ps): Ditto.
(c86_4g_ssediv_avx256_ps_load): Ditto.
(c86_4g_ssediv_avx256_pd): Ditto.
(c86_4g_ssediv_avx256_pd_load): Ditto.

Co-authored-by: Xin Liu <liulxx@hygon.cn>
Signed-off-by: Xin Liu <liulxx@hygon.cn>
Signed-off-by: Kewen Lin <linkewen@hygon.cn>
3 weeks agoRISC-V: Add RISC-V RVV main-loop overhead comparison in cost model
Zhongyao Chen [Wed, 20 May 2026 09:30:22 +0000 (17:30 +0800)] 
RISC-V: Add RISC-V RVV main-loop overhead comparison in cost model

Add an RVV-specific loop-overhead comparison in the RISC-V cost model and
use it after inside-loop cost comparison.

The RISC-V implementation prefers RVV modes that eliminate the main
loop, and otherwise compares their main-loop head overhead.
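
The preference order can be sketched as a two-level comparison.  The LoopCandidate struct, its fields, and the function name are hypothetical, and the assumption that a lower head overhead is better is mine; the actual logic lives in riscv-vector-costs.cc.

```cpp
#include <cassert>

// Hypothetical summary of one candidate vectorization.
struct LoopCandidate
{
  bool eliminates_main_loop;  // the whole iteration space fits one pass
  unsigned head_overhead;     // assumed cost of the main-loop head
};

// Returns a negative value if `a` is preferred, a positive value if `b`
// is preferred, and 0 if the two are tied on loop overhead.
constexpr int compare_loop_overhead_sketch(const LoopCandidate& a,
                                           const LoopCandidate& b)
{
  if (a.eliminates_main_loop != b.eliminates_main_loop)
    return a.eliminates_main_loop ? -1 : 1;
  if (a.head_overhead != b.head_overhead)
    return a.head_overhead < b.head_overhead ? -1 : 1;
  return 0;
}
```

As described above, this comparison only acts as a tie-breaker after the inside-loop costs have been compared.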

Local testing shows no regressions. This is likely because few testcases
have equal inside-loop cost, especially before VLS lmul cost scaling support.

I also ran regression tests with temporary VLS lmul cost scaling support.
Only 3 regressions were found:
  - dyn-lmul-conv-1.c & dyn-lmul-conv-2.c: The cost model now prefers smaller
    LMULs due to VLS LMUL scaling, which is reasonable; the expectations just
    need updating.
  - pr123414.c: This test relies on large LMULs to trigger a specific bug,
    which is also reasonable; it can be fixed by adding -fno-vect-cost-model.

The VLS LMUL cost scaling patch will be updated after this is pushed.

gcc/ChangeLog:
* config/riscv/riscv-vector-costs.cc
(estimated_loop_iters): New function.
(compare_loop_overhead): New function.
(costs::better_main_loop_than_p): Compare RVV loop overhead after
inside-loop cost.

Signed-off-by: Zhongyao Chen <chen.zhongyao@zte.com.cn>
3 weeks agoaarch64: Make more use of UINTVAL
Alex Coplan [Wed, 27 May 2026 20:26:44 +0000 (21:26 +0100)] 
aarch64: Make more use of UINTVAL

I noticed while reviewing some other code that we have existing code of
the form (unsigned HOST_WIDE_INT) INTVAL (X).  Such expressions are (by
definition of UINTVAL) equivalent to UINTVAL (x), and the latter is both
more succinct and (IMO) more readable, so this patch replaces those
instances in the aarch64 backend accordingly.

There are also many occurrences of this outside of aarch64, I see:

$ git grep -nE "\(unsigned HOST_WIDE_INT\)\s?INTVAL" | wc -l
73

with this patch applied, but this patch just fixes the aarch64 cases for
now.
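
The equivalence can be illustrated with a self-contained sketch. The MOCK_*
macros and the struct below are stand-ins invented for this example; the real
INTVAL and UINTVAL live in GCC's rtl.h and operate on rtx objects.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for HOST_WIDE_INT and a CONST_INT rtx.  These are
   hypothetical and exist only so the example compiles on its own.  */
typedef int64_t host_wide_int;
typedef uint64_t unsigned_host_wide_int;

struct mock_const_int
{
  host_wide_int value;
};

#define MOCK_INTVAL(X)  ((X)->value)
#define MOCK_UINTVAL(X) ((unsigned_host_wide_int) MOCK_INTVAL (X))

/* The spelling the patch removes: an explicit cast around INTVAL.  */
unsigned_host_wide_int
old_style (const struct mock_const_int *x)
{
  return (unsigned_host_wide_int) MOCK_INTVAL (x);
}

/* The spelling the patch uses instead.  */
unsigned_host_wide_int
new_style (const struct mock_const_int *x)
{
  return MOCK_UINTVAL (x);
}

/* Both spellings agree for any value, including negative ones.  */
void
check_same (host_wide_int v)
{
  struct mock_const_int x = { v };
  assert (old_style (&x) == new_style (&x));
}
```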

gcc/ChangeLog:

* config/aarch64/aarch64.cc (aarch64_strip_extend): Replace
(unsigned HOST_WIDE_INT) INTVAL (x) with UINTVAL (x).
* config/aarch64/predicates.md (aarch64_shift_imm_si): Likewise.
(aarch64_shift_imm_di): Likewise.
(aarch64_shift_imm64_di): Likewise.
(aarch64_imm3): Likewise.

3 weeks agoAVR: Support [[len=<words>]] notes in inline asm to specify its size.
Georg-Johann Lay [Thu, 28 May 2026 09:44:21 +0000 (11:44 +0200)] 
AVR: Support [[len=<words>]] notes in inline asm to specify its size.

This patch adds support for [[len=<words>]] in (the comments of) inline
asm constructs.  It serves several purposes:

- Cases where the expanded asm is longer than determined from the number
  of physical and logical line breaks.  Such cases can lead to errors
  when a jump that uses an overly optimistic offset crosses an asm.

- Better code generation for jumps that are crossing an asm.  The default
  length of an asm is (1 + NL) * 2 words, where NL denotes the sum of
  physical and logical line breaks.  However, almost all AVR instructions
  occupy only one 16-bit word.

The feature is implemented in ADJUST_INSN_LENGTH.  The length of
an asm is the sum over all [[len=<words>]] notes, except when an
unrecognized construct is found or an error occurred.  In the latter
case, the default insn length is used.  These <words> are supported:

<words> = [0-9]+
   Specifies a non-negative decimal integer.

<words> = %[0-9]+
<words> = %[<name>]   # Already resolved to %[0-9]+ by the middle-end.
   Refers to the respective asm operand, which must be CONST_INT.

<words> = lds
<words> = sts
   Specifies the length of a LDS or STS instruction, i.e.
   1 word if AVR_TINY, and 2 words otherwise.

<words> = %~
<words> = %~call
<words> = %~jmp
   Specifies the length of a %~call resp. %~jmp instruction, i.e.
   2 words if AVR_HAVE_JMP_CALL, and 1 word otherwise.

In order to observe the assigned lengths, see -fdump-rtl-shorten or the
";; ADDR = ..." insn addresses in the asm output with -mlog=insn_addresses.

The benefits of using magic comments are:

- The feature is backwards compatible, and the target code can use
  the same asm syntax since only asm comments have to be adjusted.
  No #ifdef feature test macros are needed.  The only case where the
  feature is not fully backwards compatible is when asm templates
  already contain invalid "[[len=" notes for some reason.  In that
  case, -mno-asm-len-notes restores the old behavior.

- Since the asm size is the sum over all notes, the final size can
  be stitched together from multiple annotations / parts of an asm
  template, and there is no need to support operations like plus.
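
The summing rule can be modelled on the host with a short C sketch. It is not
the code from avr.cc: the function names are made up for illustration, only
the decimal form of <words> is handled, only physical line breaks are counted
for the default length, and a template without any note is assumed to fall
back to the default.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Default length in words as stated above: (1 + NL) * 2.  This model
   counts physical line breaks only (an assumption of the sketch).  */
int
default_asm_words (const char *templ)
{
  assert (templ != NULL);
  int nl = 0;
  for (const char *p = templ; *p; p++)
    if (*p == '\n')
      nl++;
  return (1 + nl) * 2;
}

/* Sum all "[[len=N]]" notes with decimal N.  A malformed note, or a
   template without notes, yields the default length instead.  */
int
asm_words (const char *templ)
{
  assert (templ != NULL);
  int total = 0;
  int seen = 0;
  const char *p = templ;

  while ((p = strstr (p, "[[len=")) != NULL)
    {
      p += strlen ("[[len=");
      if (!isdigit ((unsigned char) *p))
        return default_asm_words (templ);

      int n = 0;
      while (isdigit ((unsigned char) *p))
        n = n * 10 + (*p++ - '0');

      /* The note must be closed by "]]".  */
      if (p[0] != ']' || p[1] != ']')
        return default_asm_words (templ);
      p += 2;

      total += n;
      seen = 1;
    }

  return seen ? total : default_asm_words (templ);
}
```

In this model a two-line template whose lines each carry a one-word note is
counted as 2 words, where the default estimate would be 4.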

gcc/
* config/avr/avr.cc (avr_read_number, avr_length_of_asm)
(avr_maybe_length_of_asm): New static functions.
(avr_adjust_insn_length): Call avr_maybe_length_of_asm on
unrecognized insns.
* config/avr/avr.opt (-masm-len-notes, -Wasm-len-notes): New
options.
* doc/invoke.texi (AVR Options): Add -masm-len-notes,
-Wasm-len-notes.
* doc/extend.texi (Size of an asm): Add @subsubheading
"Specifying the size of an asm on AVR".

libgcc/config/avr/libf7/
* libf7.h: Add "[[len=...]]" notes to all non-empty inline asm's.
* libf7.c: Ditto.

3 weeks agoAVR: ad target/121343 - Use hard-reg constraints in [u]divmod insns.
Georg-Johann Lay [Thu, 28 May 2026 09:30:23 +0000 (11:30 +0200)] 
AVR: ad target/121343 - Use hard-reg constraints in [u]divmod insns.

PR target/121343
gcc/
* config/avr/avr.md (divmod<mode>4, udivmod<mode>4): Use
hard-reg constraints instead of explicit hard-regs.
(*divmodqi4_call_split, *udivmodqi4_call_split): Remove.
(*divmodhi4_call_split, *udivmodhi4_call_split): Remove.
(*divmodpsi4_call_split, *udivmodpsi4_call_split): Remove.
(*divmodsi4_call_split, *udivmodsi4_call_split): Remove.

3 weeks agoi386: Fix up *add<mode>_1<nf_name> [PR125469]
Jakub Jelinek [Thu, 28 May 2026 08:28:12 +0000 (10:28 +0200)] 
i386: Fix up *add<mode>_1<nf_name> [PR125469]

The following testcase ICEs, because combine matches
(set (reg:DI 108) (plus:DI (reg:DI 104 [ s ]) (subreg:DI (reg:TI 103 [ _2 ]) 8)))
Now, because ix86_validate_address_register has:
12038         /* Don't allow SUBREGs that span more than a word.  It can
12039            lead to spill failures when the register is one word out
12040            of a two word structure.  */
12041         if (GET_MODE_SIZE (mode) > UNITS_PER_WORD)
12042           return NULL_RTX;
this isn't recognized as *leadi, but is recognized as *adddi_1_nf pattern
instead.  Now, later on the RA turns it into:
(set (reg:DI 2 cx [108]) (plus:DI (reg:DI 0 ax [orig:104 s ] [104]) (reg:DI 5 di [ _2+8 ])))
which would be valid *leadi, but given that INSN_CODE is already set to the
*adddi_1_nf and that also satisfies it, nothing re-recognizes it as *leadi.
But in that case without TARGET_APX_NDD the pattern has return "#";
That is a bug, because there is no splitter to split that
(set (reg:DI 2 cx [108]) (plus:DI (reg:DI 0 ax [orig:104 s ] [104]) (reg:DI 5 di [ _2+8 ])))
into itself so that it is re-recognized as *leadi, so it just ICEs.
I think having a splitter to split to the same thing would be just weird, so
this just outputs lea insn directly.

2026-05-28  Jakub Jelinek  <jakub@redhat.com>

PR target/125469
* config/i386/i386.md (*add<mode>_1<nf_name>): Don't return "#" for
the lea non-TARGET_APX_NDD case, instead emit a lea directly.

* gcc.target/i386/apx-nf-pr125469.c: New test.

Reviewed-by: Uros Bizjak <ubizjak@gmail.com>
3 weeks agoada: Align the alternate stack on Linux for address sanitizer
Sebastian Poeplau [Wed, 4 Mar 2026 09:06:07 +0000 (10:06 +0100)] 
ada: Align the alternate stack on Linux for address sanitizer

Address sanitizer requires the memory region configured via sigaltstack to be
8-byte aligned (see ASAN_SHADOW_GRANULARITY and ASAN_SHADOW_SCALE).
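
As a rough illustration, an alternate stack with an explicit alignment
attribute could be declared and installed as follows. The buffer name, its
size and the helper functions are assumptions made for this sketch and are not
taken from init.c.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical buffer size chosen for the example.  */
#define ALT_STACK_SIZE (64 * 1024)

/* The attribute guarantees the 8-byte alignment mentioned above; a
   plain char array would only be guaranteed 1-byte alignment.  */
static char alt_stack[ALT_STACK_SIZE] __attribute__ ((aligned (8)));

/* Return nonzero when the buffer starts on an 8-byte boundary.  */
int
alt_stack_is_aligned (void)
{
  return ((uintptr_t) alt_stack % 8) == 0;
}

/* Register the buffer as the alternate signal stack.
   Returns 0 on success, -1 on failure (as sigaltstack does).  */
int
install_alt_stack (void)
{
  stack_t ss;

  assert (alt_stack_is_aligned ());
  ss.ss_sp = alt_stack;
  ss.ss_size = sizeof alt_stack;
  ss.ss_flags = 0;
  return sigaltstack (&ss, NULL);
}
```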

gcc/ada/ChangeLog:

* init.c (__gnat_alternate_stack): add alignment attribute.

3 weeks agoada: Fix iterator for Iterable aspect rejected without subtype indication
Eric Botcazou [Tue, 10 Mar 2026 09:14:20 +0000 (10:14 +0100)] 
ada: Fix iterator for Iterable aspect rejected without subtype indication

Iterator specifications of the In form without subtype indication are parsed
as a choice list, and later turned during semantic analysis into a bona-fide
N_Iterator_Specification node when there is a single choice with an iterator
type, but the case of the GNAT Iterable aspect is overlooked in the process.

gcc/ada/ChangeLog:

* sem_aggr.adb (Resolve_Array_Aggregate): Also rewrite a choice list
with a single choice as an iterator specification when the choice's
type has the GNAT Iterable aspect specified.

3 weeks agoada: Fix handling of qualified subtype with static predicate in array aggregate
Eric Botcazou [Mon, 9 Mar 2026 19:30:09 +0000 (20:30 +0100)] 
ada: Fix handling of qualified subtype with static predicate in array aggregate

The static predicate is ignored when the choice present in the aggregate is
anything other than the direct name of the subtype.

gcc/ada/ChangeLog:

* sem_aggr.adb (Resolve_Array_Aggregate): Analyze the choice before
testing whether it is the name of a subtype with a predicate.

3 weeks agoada: Fix assertion failure on invalid String_Literal aspect
Eric Botcazou [Fri, 6 Mar 2026 13:30:23 +0000 (14:30 +0100)] 
ada: Fix assertion failure on invalid String_Literal aspect

The root cause is that a subprogram declared in the body is incorrectly
considered as a primitive operation of a type declared in a package spec.

gcc/ada/ChangeLog:

* einfo.ads (In_Package_Body): Update description.
(In_Private_Part): Likewise.
* sem_ch3.adb (Analyze_Object_Declaration): Compute In_Package_Body
along with In_Private_Part for the object if its scope is a package.
* sem_ch6.adb (Analyze_Expression_Function): Do not compute
In_Private_Part here.
(Enter_Overloaded_Entity): Compute In_Package_Body & In_Private_Part
for the entity if its scope is a package.
* sem_util.adb (Collect_Primitive_Operations): Skip the subprograms
declared in the body for types declared in a package specification.

3 weeks agoada: Fix assertion failure for improper aggregate operation
Eric Botcazou [Fri, 6 Mar 2026 11:27:20 +0000 (12:27 +0100)] 
ada: Fix assertion failure for improper aggregate operation

The compiler takes the Entity of a node without checking that it may.

gcc/ada/ChangeLog:

* sem_ch13.adb (Resolve_Aspect_Aggregate.Resolve_Operation): Add
missing guard for the presence of Entity on the node.

3 weeks agoada: Reject non-primitive operations in Finalizable aspect
Eric Botcazou [Wed, 4 Mar 2026 19:43:02 +0000 (20:43 +0100)] 
ada: Reject non-primitive operations in Finalizable aspect

The implementation does not support them and allowing them would not bring
any significant benefit.

gcc/ada/ChangeLog:

* doc/gnat_rm/gnat_language_extensions.rst
(Generalized Finalization): Document the new restriction.
* sem_ch13.adb (Resolve_Finalizable_Argument): Adjust wording of
error message.
(Resolve_Finalization_Procedure.Is_Finalizable_Primitive): Require
the procedure to be a primitive operation.
* gnat_rm.texi: Regenerate.

3 weeks agoada: Remove .EXE suffix from GNAT.Command_Line error messages
Piotr Trojanek [Wed, 4 Mar 2026 16:05:33 +0000 (17:05 +0100)] 
ada: Remove .EXE suffix from GNAT.Command_Line error messages

The .EXE suffix in GNAT.Command_Line output causes diffs in testsuite results
that run on different platforms.
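
The idea of stripping the suffix can be sketched in C. The actual routine is
written in Ada in g-comlin.adb; the helper below, its name and its
case-insensitive matching are assumptions made for illustration only.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Copy NAME into OUT (capacity OUT_SIZE) without a trailing ".exe",
   matched in any letter case.  Longer names are truncated to fit.
   Returns OUT.  */
char *
command_name_without_suffix (const char *name, char *out, size_t out_size)
{
  static const char suffix[] = ".exe";
  const size_t slen = sizeof suffix - 1;

  assert (name != NULL && out != NULL && out_size > 0);
  size_t len = strlen (name);

  /* Only strip when something remains in front of the suffix.  */
  if (len > slen)
    {
      size_t i;
      for (i = 0; i < slen; i++)
        if (tolower ((unsigned char) name[len - slen + i]) != suffix[i])
          break;
      if (i == slen)
        len -= slen;
    }

  if (len >= out_size)
    len = out_size - 1;
  memcpy (out, name, len);
  out[len] = '\0';
  return out;
}
```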

gcc/ada/ChangeLog:

* libgnat/g-comlin.adb
(Command_Name): New routine to strip platform-specific suffix.
(Display_Help, Get_Opt): Use new routine.
(Try_Help): Remove hardcoded ".exe" suffix; use new routine.

3 weeks agoada: Incorrect error message on use of 'Result with wrong prefix
Javier Miranda [Wed, 4 Mar 2026 13:32:00 +0000 (13:32 +0000)] 
ada: Incorrect error message on use of 'Result with wrong prefix

gcc/ada/ChangeLog:

* sem_util.adb (Is_Access_To_Subprogram_Wrapper): Remove useless
call to Can_Have_Formals. Found by Dismukes.

3 weeks agoada: Fix bogus visibility error for inherited operator of null extension
Eric Botcazou [Wed, 4 Mar 2026 13:36:13 +0000 (14:36 +0100)] 
ada: Fix bogus visibility error for inherited operator of null extension

This occurs when the operator has a heterogeneous profile and the extension
is declared in the same scope as the type of a non-controlling parameter of
the operator, because Find_Dispatching_Type incorrectly returns this type.

gcc/ada/ChangeLog:

* exp_ch3.adb (Make_Controlling_Function_Wrappers): Manually set the
Has_Controlling_Result flag on the wrappers.
* sem_disp.ads (Override_Dispatching_Operation): Move to...
* sem_disp.adb (Override_Dispatching_Operation): ...here.
(Find_Dispatching_Type): Return the (controlling) result type for a
controlling function wrapper.

3 weeks agoada: Fix casing of reserved word.
Vadim Godunko [Tue, 3 Mar 2026 04:49:34 +0000 (08:49 +0400)] 
ada: Fix casing of reserved word.

gcc/ada/ChangeLog:

* doc/gnat_rm/implementation_of_ada_2022_features.rst: Fix casing.
* gnat_rm.texi: Regenerate.

3 weeks agoada: Fix unresolved symbols with partial -gnatVo compilation
Eric Botcazou [Tue, 3 Mar 2026 10:35:36 +0000 (11:35 +0100)] 
ada: Fix unresolved symbols with partial -gnatVo compilation

This happens when the units of a program using the standard containers are
not uniformly compiled with the -gnatVo switch.  This is the fallout of an
internal confusion as to what validity checks must be applied to.

gcc/ada/ChangeLog:

* exp_ch4.adb (Expand_N_Op_Eq): Do not expand an array comparison
for validity checking purposes when the component type is covered
by the suppression of validity checks.

3 weeks agoada: vast: protect against a predicate failure
Marc Poulhiès [Tue, 3 Mar 2026 10:45:20 +0000 (11:45 +0100)] 
ada: vast: protect against a predicate failure

In the case where the node is not a Pragma as expected, don't try to check
its field, as doing so can raise a predicate error.

gcc/ada/ChangeLog:

* vast.adb (Do_Node_Pass_2): Only check aspect/pragma consistency for pragma nodes.

3 weeks agoada: Incorrect error message on use of 'Result with wrong prefix
Javier Miranda [Mon, 2 Mar 2026 16:24:01 +0000 (16:24 +0000)] 
ada: Incorrect error message on use of 'Result with wrong prefix

gcc/ada/ChangeLog:

* sem_util.ads (Is_Access_Subprogram_Wrapper): Renamed as
Is_Access_To_Subprogram_Wrapper.
* sem_util.adb (Is_Access_Subprogram_Wrapper): Ditto plus add
assertion.
* sem_disp.adb (Is_Access_To_Subprogram_Wrapper): Removed.
* sem_prag.adb (Find_Related_Declaration_Or_Body): Replace call to
Is_Access_Subprogram_Wrapper by call to Is_Access_To_Subprogram_Wrapper.
* exp_ch6.adb (Expand_Call): Ditto.
* sem_attr.adb (Analyze_Attribute [Attribute_Result]): For access to
subprogram wrappers, report that the expected prefix is the name of
the access type.

3 weeks agoada: Minor cleanup
Marc Poulhiès [Mon, 2 Mar 2026 14:57:17 +0000 (15:57 +0100)] 
ada: Minor cleanup

Call Decorate to set fields for aspect instead of setting them manually.

gcc/ada/ChangeLog:

* sem_ch13.adb (Make_Pragma_From_Boolean_Aspect): Use Decorate.

3 weeks agoada: Rewrite Analyze_Aspect_Specifications
Bob Duff [Sun, 1 Mar 2026 18:29:50 +0000 (13:29 -0500)] 
ada: Rewrite Analyze_Aspect_Specifications

Misc cleanup of Sem_Ch13.Analyze_Aspect_Specifications.

Split out procedures, remove gratuitous gotos, make various
things somewhat more uniform, etc.

Change type of E parameter of Analyze_Aspect_Specifications
from Entity_Id to N_Entity_Id; the latter has a predicate to
make sure we only pass entities. Modify one place in
Sem_Ch12.Analyze_Formal_Subprogram_Declaration that violates
the predicate, by skipping Analyze_Aspect_Specifications in
case of error.

Consolidate computation of Delay_Required into a single function.
Unfortunately, it is still necessary to modify Delay_Required
later, so it can't be constant.

Aspect_Invariant was set to Always_Delay, and then we did
"Delay_Required := False;" unconditionally. Better to set it
to Never_Delay in the first place. Similar for some other aspects.

Aspect_Implicit_Dereference was set to Always_Delay, but we create an
Aitem and insert it without delay and then do a "goto" to skip the
delay-related code. Better to set it to Never_Delay. Similar for some
other aspects, including ones previously set to Rep_Aspect. This is
probably wrong, but it was already wrong -- it doesn't introduce new
bugs.

Move Set_Aspect_On_Partial_View so it gets called for all
aspects when appropriate; "goto Continue;" was skipping this
call in some cases.

Make Boolean_Aspects include Library_Unit_Aspects, because all
Library_Unit_Aspects really are Boolean_Aspects. This allows
changing "Boolean_Aspects | Library_Unit_Aspects" to just
"Boolean_Aspects" in several places. There were just 3 uses
of Boolean_Aspects without Library_Unit_Aspects; the one in
Sem_Util seems harmless, and the two in Delay_Aspect have
a new assertion that makes sure we're not changing anything.

gcc/ada/ChangeLog:

* sem_ch13.adb (Analyze_Aspect_Specifications):
Major rewrite.
* sem_ch13.ads: Minor comment improvements.
* aspects.ads: Change some aspects to be Never_Delay.
Make Boolean_Aspects include Library_Unit_Aspects.
* exp_ch9.adb (Build_Corresponding_Record):
When copying aspects, set Aspect_Rep_Item to Empty,
so Asp_Copy looks like an unanalyzed tree.
* sem_ch12.adb (Analyze_Formal_Subprogram_Declaration):
Skip Analyze_Aspect_Specifications in case of error.
* sem_ch6.adb (Analyze_Expression_Function): Likewise.
* sinfo.ads: Minor comment improvement.

3 weeks agoada: Compiler hangs on a semantically incorrect program.
Steve Baird [Thu, 26 Feb 2026 23:59:07 +0000 (15:59 -0800)] 
ada: Compiler hangs on a semantically incorrect program.

A homonyms list should be acyclic. Do not introduce a cycle in an error case.
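
The hazard can be shown with a singly linked chain in C. This is a model of
the idea only: the struct and function names are hypothetical, and the real
fix lives in the Ada routine Install_Entity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an entity with a homonym link.  */
struct entity
{
  struct entity *homonym;   /* next entity with the same name, or NULL */
};

/* Push E on the front of the chain *HEAD unless it is already on it.
   Returns true when E was linked.  */
bool
install_entity (struct entity **head, struct entity *e)
{
  assert (head != NULL && e != NULL);

  /* Linking an entity that is already installed would make the chain
     point back into itself, so a later walk would never terminate.  */
  for (struct entity *p = *head; p != NULL; p = p->homonym)
    if (p == e)
      return false;

  e->homonym = *head;
  *head = e;
  return true;
}

/* Walk the chain; this only terminates because the chain is acyclic.  */
size_t
chain_length (const struct entity *head)
{
  size_t n = 0;
  for (; head != NULL; head = head->homonym)
    n++;
  return n;
}
```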

gcc/ada/ChangeLog:

* sem_ch6.adb (Install_Entity): If the entity to be installed is
already installed, assert that an error has already been flagged
and then return without introducing a cycle in the entity's
Homonyms list.

3 weeks agoada: Create the SARIF file in the current cwd
Viljar Indus [Wed, 25 Feb 2026 12:29:46 +0000 (14:29 +0200)] 
ada: Create the SARIF file in the current cwd

Previously we used to create the SARIF file next to the specified
source file e.g. "<Specified_Source_File_Path>.gnat.sarif"

Now the SARIF file is always generated in the cwd
"<Source_File_Name>.gnat.sarif" similarly to how gcc handles its sarif
files.
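
A C sketch of the naming scheme, for illustration only: the real routine is
the Ada function Osint.Strip_Directory, and this model assumes '/' is the only
directory separator and that the source file name keeps its extension.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return the part of PATH after the last '/', or PATH itself when it
   contains no directory component.  */
const char *
strip_directory (const char *path)
{
  assert (path != NULL);
  const char *slash = strrchr (path, '/');
  return slash ? slash + 1 : path;
}

/* Write "<file name>.gnat.sarif" into OUT, so that the result names a
   file in the current working directory.  Returns what snprintf does.  */
int
sarif_file_name (const char *source_path, char *out, size_t out_size)
{
  assert (out != NULL && out_size > 0);
  return snprintf (out, out_size, "%s.gnat.sarif",
                   strip_directory (source_path));
}
```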

gcc/ada/ChangeLog:

* errout.adb (Output_Messages): use the source file name without
the directory path when constructing the name of the SARIF file.
* osint.adb (Strip_Directory): New method for extracting the file name
from a given path.
* osint.ads (Strip_Directory): Likewise.

3 weeks agoada: Crash on wrong renaming of record field
Javier Miranda [Wed, 25 Feb 2026 18:08:37 +0000 (18:08 +0000)] 
ada: Crash on wrong renaming of record field

gcc/ada/ChangeLog:

* sem_ch8.adb (Find_Renamed_Entity): Protect call to First_Formal.

3 weeks agoada: Fix regression under GNATProve mode
Javier Miranda [Wed, 25 Feb 2026 14:24:02 +0000 (14:24 +0000)] 
ada: Fix regression under GNATProve mode

Improve the previous patch, since the regression also reproduces when
compiling in syntax-and-semantics-check-only mode (-gnatc).

gcc/ada/ChangeLog:

* sem_res.adb (Resolve_Declare_Expression): Do not create a
transient scope when expansion is disabled.

3 weeks agoada: Fix regression under GNATProve mode
Javier Miranda [Tue, 24 Feb 2026 19:03:38 +0000 (19:03 +0000)] 
ada: Fix regression under GNATProve mode

This patch fixes a regression recently introduced when compiling
code under GNATprove mode.

gcc/ada/ChangeLog:

* sem_res.adb (Resolve_Declare_Expression): Do not create a
transient scope under GNATprove mode.

3 weeks agoada: VAST: Explain 2-pass algorithm
Bob Duff [Tue, 24 Feb 2026 15:13:59 +0000 (10:13 -0500)] 
ada: VAST: Explain 2-pass algorithm

Minor: Add a comment.

gcc/ada/ChangeLog:

* vast.adb (Pass): Add a comment.

3 weeks agoada: Fix VAST check on aspect consistency
Marc Poulhiès [Tue, 24 Feb 2026 09:03:52 +0000 (10:03 +0100)] 
ada: Fix VAST check on aspect consistency

Currently, N_Attribute_Definition_Clause nodes don't have a
Corresponding_Aspect field. As hinted by a comment, it's something we
would like to do in the future, but adding the check was premature.

gcc/ada/ChangeLog:

* vast.adb (Do_Node_Pass_2): Adjust check for aspect consistency.

3 weeks agoada: Adjust 'Constrained for formal parameters of unchecked union types
Eric Botcazou [Mon, 23 Feb 2026 16:29:44 +0000 (17:29 +0100)] 
ada: Adjust 'Constrained for formal parameters of unchecked union types

GNAT has historically never added extra formal parameters alongside formal
parameters of unchecked union types, even when they have convention Ada,
so it cannot compute the 'Constrained attribute for In Out or Out formal
parameters. This changes the compiler to raise Program_Error in this case.

gcc/ada/ChangeLog:

* exp_attr.adb (Expand_N_Attribute_Reference) <Constrained>: If the
prefix is a non-In formal parameter of an unchecked union type, give
a warning and insert a raise statement for Program_Error.

3 weeks agoada: Fix spurious discriminant check failure for unconstrained actual parameter
Eric Botcazou [Mon, 23 Feb 2026 08:43:17 +0000 (09:43 +0100)] 
ada: Fix spurious discriminant check failure for unconstrained actual parameter

This happens when the unconstrained variable passed as actual parameter is
initialized by a conditional expression, because its declaration is wrongly
distributed into the dependent expressions of the conditional expression.

gcc/ada/ChangeLog:

* exp_util.ads (Is_Distributable_Declaration): New predicate.
* exp_util.adb (Is_Distributable_Declaration): New predicate coming
from Expand_N_Case_Expression and Expand_N_If_Expression.  Return
False for variables of an unconstrained definite nonlimited subtype.
* exp_ch4.adb (Expand_N_Case_Expression): Replace calls to local
Is_Optimizable_Declaration by calls to Is_Distributable_Declaration.
(Expand_N_If_Expression): Likewise.
* exp_ch6.adb (Expand_Ctrl_Function_Call): Likewise.

3 weeks agoada: Rename Insert_Pragma to be Insert_Aitem
Bob Duff [Sat, 21 Feb 2026 10:31:28 +0000 (05:31 -0500)] 
ada: Rename Insert_Pragma to be Insert_Aitem

...because it now supports attribute_definition_clauses.
Also rename the formal parameter.

Document the fact that it sets Aitem to Empty.

gcc/ada/ChangeLog:

* sem_ch13.adb (Insert_Pragma):
Rename to be Insert_Aitem.

3 weeks agoada: Cleanup Analyze_Aspect_Specifications
Bob Duff [Fri, 20 Feb 2026 15:07:05 +0000 (10:07 -0500)] 
ada: Cleanup Analyze_Aspect_Specifications

Comment cleanup: Change incorrect uses of "erroneous"
(which is Ada jargon) to be "illegal".
Remove long list of aspects for Insert_Pragma;
it seems useless, and might be incorrect, and is certainly
incorrect after this change.

Change Insert_Pragma to be more general, and use it more
instead of ad-hoc code. It now supports N_Attribute_Definition_Clauses,
so should be renamed (in a future change).

The previous code sometimes used Ins_Node to preserve order;
the order of pragmas is the same as the order of aspects.
But sometimes, Ins_Node was not used. (With Ins_Node,
"with Foo => ..., Bar => ..." produces pragma Foo then pragma Bar,
for example. Without Ins_Node, it produces pragma Bar then pragma Foo.)
We are trying to use Insert_Pragma for more cases (DRY).
The new code uses Ins_Node to preserve order in case of Annotate,
and not otherwise. The Compilation_Unit case also does not
preserve order. This code is marked "???" to be cleaned up later.

One goal of this change (not yet done) is to avoid having
so many "goto Continue;"s, which are confusing, especially
since <<Continue>> is misnamed (it's not at the end of a
loop body). We will probably also split out Analyze_One_Aspect
as a separate procedure. When we get to the code after the
giant case statement, if Aitem is present, we can insert it.
(Current code inserts it as we go along.)

Move code dealing with Boolean_Aspects and Library_Unit_Aspects of
library units to where other Boolean_Aspects and Library_Unit_Aspects
are handled. This seems simpler.

gcc/ada/ChangeLog:

* sem_ch13.adb (Analyze_Aspect_Specifications):
Misc cleanup.

3 weeks agoada: Implement restrictions for unchecked union in inlining for GNATprove
Claire Dross [Thu, 19 Feb 2026 16:06:09 +0000 (17:06 +0100)] 
ada: Implement restrictions for unchecked union in inlining for GNATprove

Do not inline calls when a formal parameter has an unchecked union type as
it might lead to missing checks for UU restrictions.

gcc/ada/ChangeLog:

* inline.adb (Can_Be_Inlined_In_GNATprove_Mode):
Do not inline subprograms with formals of an unchecked union type.

3 weeks agoada: Fix freezing of nested discriminated type
Marc Poulhiès [Thu, 19 Feb 2026 10:18:23 +0000 (11:18 +0100)] 
ada: Fix freezing of nested discriminated type

Simply creating the freeze node for the base type of a discriminated type
without adjusting the scope and the visible declarations leads to an
incorrect tree that crashes the compiler when unnesting the predicate
function.

gcc/ada/ChangeLog:

* sem_ch3.adb (Find_Type_Of_Object): Adjust freezing of the base
type of a discriminated type.

Co-authored-by: Eric Botcazou <botcazou@adacore.com>
3 weeks agoada: Simplify test for limited types
Ronan Desplanques [Thu, 19 Feb 2026 15:45:48 +0000 (16:45 +0100)] 
ada: Simplify test for limited types

Is_Limited_Type always returns True for types where Is_Limited_Composite
is True, therefore the disjunct this patch removes had no effect.

gcc/ada/ChangeLog:

* sem_ch3.adb (Process_Full_View): Simplify test.

3 weeks agoada: Fix oversight in latest accessibility change
Eric Botcazou [Thu, 19 Feb 2026 18:36:18 +0000 (19:36 +0100)] 
ada: Fix oversight in latest accessibility change

The oversight is that the dynamic accessibility checks should be generated
neither when accessibility checks are disabled, for example by means of the
-gnatp switch, nor when the GNAT restriction No_Dynamic_Accessibility_Checks
is enabled.

gcc/ada/ChangeLog:

* accessibility.adb
(Apply_Accessibility_Check_For_Class_Wide_Return): Do not test if
accessibility checks are suppressed here but...
(Apply_Accessibility_Check_For_Return): ...here instead.

3 weeks agoada: Apply the check for all primitive equality operations
Viljar Indus [Wed, 18 Feb 2026 09:29:23 +0000 (11:29 +0200)] 
ada: Apply the check for all primitive equality operations

gcc/ada/ChangeLog:

* sem_ch6.adb (Check_For_Primitive_Subprogram): add the
check for ghost equality functions for all branches handling
primitive subprograms.

3 weeks agoada: Clean up around Is_Immutably_Limited_Type
Ronan Desplanques [Wed, 18 Feb 2026 16:37:27 +0000 (17:37 +0100)] 
ada: Clean up around Is_Immutably_Limited_Type

This improves the documentation comments of Is_Immutably_Limited_Type and
Is_Inherently_Limited_Type and rewrites the body of
Is_Inherently_Limited_Type to leverage Is_Immutably_Limited_Type.

gcc/ada/ChangeLog:

* sem_aux.ads (Is_Immutably_Limited_Type, Is_Inherently_Limited_Type):
Improve documentation comments.
* sem_aux.adb (Is_Inherently_Limited_Type): Replace inline code with
call to Is_Immutably_Limited_Type.

3 weeks agoada: Minor cleanup related to aspects vs. pragmas
Bob Duff [Wed, 18 Feb 2026 21:52:52 +0000 (16:52 -0500)] 
ada: Minor cleanup related to aspects vs. pragmas

Initialize aspect is implementation defined, but was not in
Implementation_Defined_Aspect.

Misc minor comment fixes.

gcc/ada/ChangeLog:

* aspects.ads (Aspect_Initialize):
Add to Implementation_Defined_Aspect.
* einfo.ads (Alignment_Clause): Minor comment fix.
* sem.adb: Remove useless null statements.
* sem_ch13.ads (Rep_Item_Too_Late):
Minor comment fix (this IS Sem_Ch13).
* sem_prag.adb (Fix_Error):
Minor comment fix (aspects are not "abnormal").
* sinfo.ads: Minor comment fix.

3 weeks agoada: Crash when using address clause on declare-expression constant
Javier Miranda [Wed, 28 Jan 2026 11:19:45 +0000 (11:19 +0000)] 
ada: Crash when using address clause on declare-expression constant

gcc/ada/ChangeLog:

* gen_il-fields.ads (Scope_Link): New field.
* gen_il-gen-gen_nodes.adb (N_Expression_With_Actions): Added Scope_Link.
* sinfo.ads (N_Expression_With_Actions): Add field Scope_Link.
* sem_ch4.adb (Analyze_Expression_With_Actions): Set field Scope_Link.
* sem_ch5.ads (Has_Sec_Stack_Call): Declaration moved to the package spec.
* sem_ch5.adb (Has_Sec_Stack_Call): ditto.
* sem_res.adb (Resolve_Declare_Expression): Push/Pop internally created
scope to provide proper visibility of the declare_items.

3 weeks agoada: Fix crash evaluating class-wide preconditions with missing completion
Denis Mazzucato [Wed, 18 Feb 2026 13:35:55 +0000 (14:35 +0100)] 
ada: Fix crash evaluating class-wide preconditions with missing completion

This patch fixes a crash that occurs when evaluating the class-wide precondition
of a non-primitive subprogram for which the class-wide type of its dispatching
type cannot be accessed. The bug occurs when the type is abstract and missing its
completion; a proper error should be given instead.

gcc/ada/ChangeLog:

* sem_prag.adb (Check_References): Don't call Class_Wide_Type if the
subprogram is a non-primitive procedure as the dispatching type may be
empty.

3 weeks agoada: Remove spurious error on attribute Count with expansion disabled
Piotr Trojanek [Mon, 16 Feb 2026 14:23:41 +0000 (15:23 +0100)] 
ada: Remove spurious error on attribute Count with expansion disabled

When expansion is disabled, e.g. because of GNAT switch -gnatc or because GNAT
is operating in the GNATprove mode, then attribute Count is not expanded and
can legitimately appear in a barrier of a protected entry, even if restriction
Pure_Barriers is enabled.

gcc/ada/ChangeLog:

* exp_ch9.adb (Is_Pure_Barrier): Handle unexpanded attribute Count.

3 weeks agoada: VAST Check_Corresponding_Aspect
Bob Duff [Tue, 17 Feb 2026 02:54:51 +0000 (21:54 -0500)] 
ada: VAST Check_Corresponding_Aspect

Add Check_Corresponding_Aspect to VAST.
Improve comments.

gcc/ada/ChangeLog:

* vast.adb (Check_Corresponding_Aspect):
New checks for aspect/pragma consistency.
(Check_Enum): Add documentation of the checks.

3 weeks agoada: Fix small irregularity for Master_Id of anonymous access result type
Eric Botcazou [Tue, 17 Feb 2026 16:27:43 +0000 (17:27 +0100)] 
ada: Fix small irregularity for Master_Id of anonymous access result type

The Master_Id of access types whose designated type contains tasks is set to
a renaming of the current _Master variable by means of Build_Master_Renaming.

The exception is for anonymous access result types, whose Master_Id is set
to the current _Master variable by Check_Anonymous_Access_Return_With_Tasks.

This is fully correct because the entity of the variable is preresolved, but
is ambiguous in the .dg file because there can be several _Master variables
in the subprogram, which effectively represent distinct masters.  Therefore
this makes the case of anonymous access result types also use a renaming.

gcc/ada/ChangeLog:

* exp_ch9.ads (Build_Master_Declaration): Minor tweaks in comment.
(Build_Master_Entity): Likewise.
(Build_Master_Renaming): Likewise.
(Build_Master_Renaming_Declaration): New function declaration.
* exp_ch9.adb (Build_Master_Declaration): Move around.
(Build_Master_Renaming_Declaration): New function.
(Build_Master_Renaming): Call Build_Master_Renaming_Declaration
to build the renaming declaration.
* sem_ch6.adb (Check_Anonymous_Access_Return_With_Tasks): Remove
useless guard on Declarations (N).  Create a renaming declaration
for the current _Master variable and set it as the Master_Id of
the access result type.

3 weeks agoada: Fix for illegal deep delta array aggregate with others
Piotr Trojanek [Fri, 13 Feb 2026 09:07:28 +0000 (10:07 +0100)] 
ada: Fix for illegal deep delta array aggregate with others

Do not try to apply a scalar range check to "others" choice in deep delta array
aggregate. This choice is illegal, but we still need to handle it in expansion.

gcc/ada/ChangeLog:

* exp_spark.adb (Expand_SPARK_N_Delta_Aggregate): Special case for
"others" clause.

3 weeks agotestsuite, i386: add win64 AVX indirect alignment tests [PR54412]
oltolm [Sun, 24 May 2026 10:57:51 +0000 (12:57 +0200)] 
testsuite, i386: add win64 AVX indirect alignment tests [PR54412]

On x86_64-w64-mingw32, PR target/54412 is triggered when AVX and
AVX512 values are passed or returned indirectly and GCC materializes
under-aligned stack storage for them.

Add run tests for the original by-value AVX cases, for isolated hidden
sret allocation and callee by-reference parameter materialization, and
for a dedicated aligned(64) AVX512 case.

gcc/testsuite/ChangeLog:

PR target/54412
* gcc.target/i386/pr54412-v4d-o0-aligned-locals.c: New test.
* gcc.target/i386/pr54412-o2-by-value-cases.c: New test.
* gcc.target/i386/pr54412-sret-no-args.c: New test.
* gcc.target/i386/pr54412-callee-byref-param.c: New test.
* gcc.target/i386/pr54412-avx512-aligned64.c: New test.

Signed-off-by: oltolm <oleg.tolmatcev@gmail.com>
Signed-off-by: Jonathan Yong <10walls@gmail.com>
3 weeks ago[RISC-V] Drop compromised scan-asm test after recent vectorized loop epilogue changes
Jeff Law [Thu, 28 May 2026 02:00:34 +0000 (20:00 -0600)] 
[RISC-V] Drop compromised scan-asm test after recent vectorized loop epilogue changes

Tamar's recent change to elide vectorized loop epilogues compromised the
scan-asm part of this test for RISC-V.  Essentially the test is looking for a
specific insn that appears in the unnecessary epilogue.

The original motivation for these tests was an ICE.  So I'm just dropping the
scan-asm parts of this test so that we still verify that we're not triggering
an ICE.

gcc/testsuite
* gcc.target/riscv/rvv/base/pr115456-3.c: Drop compromised scan-asm
part of the test.

3 weeks agoFortran: [PR93727] Add EX format rounding for truncated hex mantissa
Jerry DeLisle [Sun, 24 May 2026 03:28:43 +0000 (20:28 -0700)] 
Fortran: [PR93727] Add EX format rounding for truncated hex mantissa

Implement proper rounding of the hex mantissa in write_ex when the
user specifies a d smaller than full precision.  All Fortran ROUND=
modes are supported: ROUND_NEAREST (ties-to-even), ROUND_COMPATIBLE
(ties away from zero), ROUND_UP, ROUND_DOWN, and ROUND_ZERO.
ROUND_PROCDEFINED and ROUND_UNSPECIFIED default to ROUND_NEAREST on
IEEE 754 systems, consistent with the decimal format behaviour.

Carry propagation handles the case where incrementing a string of
trailing F hex digits reaches the integer digit; if that overflows
(F → 16) the output is normalized by setting the integer digit to 8
and incrementing the binary exponent by one.
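
The carry and normalization steps can be sketched in C for the ties-to-even
case. This is a simplified model, not the code in write_ex: the digit-array
representation and the function name are assumptions of the example, and the
other ROUND= modes are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Round a hex mantissa to D fraction digits using round-to-nearest,
   ties-to-even.  *INT_DIGIT is the digit before the point, FRAC holds
   N fraction digits as values 0..15, *EXP2 is the binary exponent.
   After the call only FRAC[0..D-1] are meaningful.  */
void
round_hex_nearest_even (int *int_digit, int *frac, int n, int d, int *exp2)
{
  assert (int_digit != NULL && frac != NULL && exp2 != NULL);
  assert (n >= 0 && d >= 0);

  if (d >= n)
    return;   /* nothing is discarded */

  /* Inspect the discarded tail: first dropped digit plus a sticky flag.  */
  int first = frac[d];
  int sticky = 0;
  for (int i = d + 1; i < n; i++)
    if (frac[i] != 0)
      sticky = 1;

  int last_kept = d > 0 ? frac[d - 1] : *int_digit;
  int round_up = first > 8
                 || (first == 8 && (sticky || (last_kept & 1)));
  if (!round_up)
    return;

  /* Propagate the carry through trailing F digits.  */
  int i = d - 1;
  while (i >= 0 && frac[i] == 15)
    frac[i--] = 0;
  if (i >= 0)
    {
      frac[i]++;
      return;
    }

  /* The carry reached the integer digit; F + 1 = 16 is renormalized
     to 8 with the binary exponent increased by one.  */
  if (++*int_digit == 16)
    {
      *int_digit = 8;
      ++*exp2;
    }
}
```

For example, rounding F.FF9 to two fraction digits carries all the way up and
yields 8.00 with the exponent increased by one.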

Assisted by: Claude Sonnet 4.6

PR fortran/93727

libgfortran/ChangeLog:

* io/write.c (write_ex): Replace simple truncation with
rounding-aware logic respecting dtp round_status.  Add carry
propagation and integer-digit normalization.
* io/write_float.def: Change use of GFC_UINTEGER_8 to
long long unsigned.

gcc/testsuite/ChangeLog:

* gfortran.dg/EXformat_3.F90: New test covering rounding for
KIND=4, 8, 10, and 16: clear round-up, ties-to-even (truncate
and round-up cases), carry propagation, and normalization.
* gfortran.dg/EXrounding.F90: New test checking the various
rounding modes for all kinds.

3 weeks agoDaily bump.
GCC Administrator [Thu, 28 May 2026 00:16:27 +0000 (00:16 +0000)] 
Daily bump.