Eric Botcazou [Sat, 3 Jan 2026 10:47:35 +0000 (11:47 +0100)]
Ada: Fix infinite loop on iterated element association with iterator and key
Unlike when the key expression is not present, Resolve_Iterated_Association
analyzes instead of preanalyzes the iterator specification, which causes the
expander to be invoked on an orphaned copy of the iterator expression.
gcc/ada/
PR ada/123371
* sem_aggr.adb (Resolve_Iterated_Association): Call Preanalyze
instead of Analyze consistently, as well as Copy_Separate_Tree
instead of New_Copy_Tree.
gcc/testsuite/
* gnat.dg/specs/aggr10.ads: New test.
Martin Uecker [Sun, 21 Dec 2025 18:10:56 +0000 (19:10 +0100)]
c: Fix ICE for invalid code with variadic and old-school prototypes [PR121507]
When mixing an old-style definition without a prototype and a new C23
variadic function without named arguments, there can be an ICE
when trying to form the composite type. Avoid this by letting it
fail later due to incompatible types.
Martin Uecker [Thu, 25 Dec 2025 17:27:33 +0000 (18:27 +0100)]
c: Fix construction of composite type for atomic pointers [PR121081]
When constructing the composite type of two atomic pointer types,
we used "qualify_type" which did not copy the "atomic" qualifier.
Use c_build_type_attribute_qual_variant instead.
Paul Thomas [Sat, 3 Jan 2026 07:37:28 +0000 (07:37 +0000)]
Fortran: Invalid association with operator-result selector [PR123352]
2026-01-03 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/123352
* gfortran.h: Add prototype for gfc_resolve_symbol.
* interface.cc (matching_typebound_op): If the current
namespace has not been resolved and the derived type is use
associated, resolve the derived type with gfc_resolve_symbol.
* match.cc (match_association_list): If the associate name is
unknown type and the selector is an operator expression, copy
the selector and call gfc_extend_expr. Replace the selector if
there is a match, otherwise free the copy.
* resolve.cc (gfc_resolve_symbol): New function.
gcc/testsuite/
PR fortran/123352
* gfortran.dg/associate_78.f90: New test.
The macOS awk seems to not like having an unparenthesized conditional
expression as the last argument to printf. This commit works around
this by simply replacing the conditional expression with a conditional
statement.
Tested with gawk and mawk.
Signed-off-by: Jose E. Marchesi <jemarch@gnu.org>
libga68/ChangeLog
Jakub Jelinek [Fri, 2 Jan 2026 08:47:43 +0000 (09:47 +0100)]
Tweak update-copyright.py script
When running update-copyright.py --this-year, I've encountered various
failures; this patch works around them.
2026-01-02 Jakub Jelinek <jakub@redhat.com>
* update-copyright.py (LibPhobosFilter): Ignore also
__importc_builtins.di.
(LibGRustFilter): New filter.
(GCCCopyright): Add several external authors.
(GCCCmdLine): Use LibGRustFilter for libgrust. Add libga68
directory, both to list of directories and default directories.
Jakub Jelinek [Fri, 2 Jan 2026 08:18:02 +0000 (09:18 +0100)]
c++: Fix up is_late_template_attribute for [[maybe_unused]] [PR123277]
is_late_template_attribute wants to return false for gnu::unused, gnu::used
and maybe_unused attributes, so that -Wunused-local-typedefs sees those
attributes applied even to typedefs to dependent types.
Before my recent r16-5937 change, for maybe_unused this happened to work
through the lookup_attribute_spec returning NULL in that case because
there is no gnu::maybe_unused attribute, but when that has been fixed,
maybe_unused needs to be listed in the exceptions next to unused and used
attributes.
2026-01-02 Jakub Jelinek <jakub@redhat.com>
PR c++/123277
* decl2.cc (is_late_template_attribute): Return false also for
[[maybe_unused]] attribute on TYPE_DECLs to dependent types.
* g++.dg/warn/Wunused-local-typedefs-5.C: New test.
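A minimal sketch of the kind of code this affects, assuming a function
template with an intentionally unused local typedef to a dependent type.
The names element_count and elem_t are made up for illustration and are not
taken from the new test. With the fix, -Wunused-local-typedefs is expected
to stay quiet here because the attribute is applied at parse time:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical example, not the actual GCC testcase.
template <typename Container>
std::size_t element_count (const Container &c)
{
  // Local typedef to a dependent type that is deliberately never used.
  // [[maybe_unused]] should suppress -Wunused-local-typedefs for it.
  [[maybe_unused]] typedef typename Container::value_type elem_t;
  return c.size ();
}
```

Compiling this with -Wall (which enables -Wunused-local-typedefs) and
instantiating element_count with std::vector<int> is one way to check the
behaviour locally.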
Steven G. Kargl [Fri, 2 Jan 2026 04:52:55 +0000 (20:52 -0800)]
Fortran: Fix tab not ignored in print statement.
PR fortran/101399
gcc/fortran/ChangeLog:
* io.cc (match_io): If the -Wtabs option is used, then issue a
warning for a <tab> character following a 'print' statement;
otherwise ignore the <tab>.
gcc/testsuite/ChangeLog:
* gfortran.dg/pr101399_1.f90: New test.
* gfortran.dg/pr101399_2.f90: New test.
Jerry DeLisle [Tue, 30 Dec 2025 22:46:35 +0000 (14:46 -0800)]
Fortran: Generate a runtime error on recursive I/O
PR libfortran/119136
gcc/fortran/ChangeLog:
* libgfortran.h: Add enum for new LIBERROR_RECURSIVE_IO.
libgfortran/ChangeLog:
* io/io.h: Delete prototype for unused stash_internal_unit.
(check_for_recursive): Add prototype for this new function.
* io/transfer.c (data_transfer_init): Add call to new
check_for_recursive.
* io/unit.c (delete_unit): Fix comment.
(check_for_recursive): Add new function.
* runtime/error.c (translate_error): Add translation for
"Recursive I/O not allowed" runtime error message.
[PATCH] [AutoFDO/devirt] Fix ICE with duplicate speculative ID
This happens because the autoprofile pass makes the edge speculative via
make_speculative. Then ipa-devirt does the same with the same speculative_id,
which results in a duplicate speculative_id and an ICE.
Jose E. Marchesi [Wed, 31 Dec 2025 22:02:45 +0000 (23:02 +0100)]
a68: use a68_error specific tag in diagnostic message
a68_error and friends still use their own upper-letter based tag
format. We will be switching these to use the GCC standard %-based
tags for diagnostics, hopefully soonish, but in the meanwhile do not
pass a %s tag to a68_error because bad things happen.
Signed-off-by: Jose E. Marchesi <jemarch@gnu.org>
gcc/algol68/ChangeLog
* a68-imports.cc (a68_open_packet): Use right tag format in
a68_error.
Jeff Law [Wed, 31 Dec 2025 05:52:03 +0000 (22:52 -0700)]
[RISC-V][PR target/121485] Fix mode on Zvkned lmul extending patterns
This fixes the mode on the lmul-extending variants of various Zvkned patterns.
Essentially vsetvl insertion depends on the mode of each insn and for lmul
extending patterns, we need the larger mode, not the smaller one to get the
correct vsetvls.
Tested on riscv{32,64}-elf on the simple testcase in the PR. I also verified
the larger testcase in godbolt appears to work correctly.
Waiting on upstream CI before committing.
PR target/121485
gcc/
* config/riscv/vector-crypto.md: Fix mode attribute for the
lmul extending Zvkned patterns.
gcc/testsuite/
* gcc.target/riscv/rvv/vsetvl/pr121485.c: New test.
Andrew Pinski [Wed, 31 Dec 2025 01:23:13 +0000 (17:23 -0800)]
testsuite: Skip pr123295-1.c for non int128 targets [PR123334]
This was an oversight on my part. The testcase uses __int128 but
forgot to check if it is compiling for a target that supports that
type.
Mark the testcase as unsupported for non-int128 targets.
Pushed as obvious after running the testcase with and without -m32 on x86_64:
make check-gcc RUNTESTFLAGS="--target_board=unix/-m32 dg.exp=pr123295-1.c"
make check-gcc RUNTESTFLAGS="--target_board=unix dg.exp=pr123295-1.c"
PR testsuite/123334
gcc/testsuite/ChangeLog:
* gcc.dg/pr123295-1.c: Require int128.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
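For code outside the testsuite, the same concern (targets without __int128)
can be handled with a preprocessor guard. The sketch below assumes the
__SIZEOF_INT128__ predefined macro, which GCC defines when the type is
available; wide_t, have_int128 and wide_bits are illustrative names only:

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Pick a 128-bit type when the target provides one, else fall back.
#ifdef __SIZEOF_INT128__
typedef unsigned __int128 wide_t;
constexpr bool have_int128 = true;
#else
typedef std::uint64_t wide_t;
constexpr bool have_int128 = false;
#endif

// Width in bits of the type that was selected above.
unsigned wide_bits ()
{
  return static_cast<unsigned> (sizeof (wide_t) * CHAR_BIT);
}
```

In the testsuite itself the equivalent is a dg-require-effective-target
int128 directive, which is what "Require int128" refers to.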
a68: support for Algol 68 code in libga68 and initial transput
Most of the standard prelude is implemented in a combination of code
lowered by the front-end (standard operators, constants, etc) and
functions provided by the libga68 run-time library, to which the
former libcalls. Until now, all the support routines in libga68 were
written in C. However, many of the transput facilities are better
implemented in Algol 68.
The Revised Report includes a reference implementation (code listing)
of many of the standard routines. This implementation, however, makes
use of an "extended" program notation in order to denote certain
notions to avoid repetitive code. Therefore this commit includes
sppp, a build-time pre-processor written in awk that is only intended
to be used internally by the libga68 run-time library. This
preprocessor allows us to write code like:
proc subwhole = (Number v, int width) string:
case v in
{iter L {short short} {short} {} {long} {long long}}
{iter S {LENG LENG} {LENG} {} {SHORTEN} {SHORTEN SHORTEN}}
({L} int x):
begin string s, {L} int n := x;
while dig_char ({S} (n MOD {L} 10)) +=: s;
n %:= {L} 10; n /= {L} 0
do ~ od;
(UPB s > width | width * errorchar | s)
end
{reti {,}}
esac;
Resulting in cases for short short int, short int, int, long int and
long long int being macro-expanded in the routine's conformance
clause.
This commit also adds the necessary infrastructure for writing Algol
68 code in the libga68 library, including the ability of having
modules exported by libga68. An implementation of some of the
transput routines is also provided in standard.a68: whole, fixed,
float, string_to_L_real, char_in_string, L_int_width, L_real_width and
L_exp_width.
The build system changes include the backport of the Automake Algol 68
support, which is in a released version of Automake but not in the
version used for GCC, to libga68/m4/autoconf.m4.
Signed-off-by: Jose E. Marchesi <jemarch@gnu.org>
ChangeLog
* Makefile.def (flags_to_pass): Rename GA68, GA68FLAGS,
GA68_FOR_TARGET, GA68FLAGS_FOR_TARGET to A68, A68FLAGS,
A68_FOR_TARGET and A68FLAGS_FOR_TARGET.
* Makefile.tpl: Use A68, A68FLAGS, A68_FOR_BUILD and
A68_FOR_TARGET rather than GA68, GA68FLAGS, GA68_FOR_BUILD and
GA68_FOR_TARGET.
* Makefile.in: Regenerate.
* configure.ac: Set A68_FOR_BUILD rather than GA68_FOR_BUILD, and
invoke ACX_PROG_A68 rather than ACX_PROG_GA68.
Subst A68_FOR_BUILD rather than GA68_FOR_BUILD.
Subst A68 and A68FLAGS rather than GA68 and GA68FLAGS.
Set A68_FOR_TARGET rather than GA68_FOR_TARGET.
* configure: Regenerate.
* config-ml.in: Handle A68FLAGS and define A68 in sub-configures.
config/ChangeLog
* acx.m4: Define ACX_PROG_A68 rather than ACX_PROG_GA68.
(ACX_PROG_A68): Set A68 rather than GA68.
gcc/algol68/ChangeLog
* a68-lang.cc (a68_init_options): Add an entry to A68_MODULE_FILES
to map module Transput to the basename ga68.
gcc/testsuite/ChangeLog
* algol68/execute/char-in-string-1.a68: New test.
libga68/ChangeLog
* m4/autoconf.m4: New file.
* configure.ac: Expand AC_PROG_A68.
* configure: Regenerate.
* Makefile.am: Add rules to build Algol 68 sources and to
build the transput module.
* Makefile.in: Regenerate.
* acinclude.m4: Include m4/autoconf.m4.
* sppp.awk: New file.
* transput.a68.in: Likewise.
Eric Botcazou [Tue, 30 Dec 2025 19:13:51 +0000 (20:13 +0100)]
Ada: Fix warnings during bootstrap
The warnings are:
warning: "g-htable.adb" should be recompiled
warning: ("/usr/lib64/gcc/x86_64-suse-linux/7/adalib/g-htable.ali" is
obsolete and read-only)
warning: "g-byorma.adb" should be recompiled
warning: ("/usr/lib64/gcc/x86_64-suse-linux/7/adalib/g-byorma.ali" is
obsolete and read-only)
warning: "g-speche.adb" should be recompiled
warning: ("/usr/lib64/gcc/x86_64-suse-linux/7/adalib/g-speche.ali" is
obsolete and read-only)
warning: "g-spchge.adb" should be recompiled
warning: ("/usr/lib64/gcc/x86_64-suse-linux/7/adalib/g-spchge.ali" is
obsolete and read-only)
warning: "g-u3spch.adb" should be recompiled
warning: ("/usr/lib64/gcc/x86_64-suse-linux/7/adalib/g-u3spch.ali" is
obsolete and read-only)
but would be hard errors if a kludge were not used (passing -t to gnatbind).
This fixes the warnings and tentatively removes the kludge.
gcc/ada/
* gcc-interface/Make-lang.in (GNATBIND_FLAGS): Delete.
(GNAT_ADA_OBJS): Move g-byorma.o, g-htable.o, g-spchge.o,
g-speche.o and g-u3spch.o to STAGE1 list.
(GNATBIND_OBJS): Move g-byorma.o, g-hesora.o and g-htable.o
to STAGE1 list.
(ada/b_gnat1.adb): Do not pass GNATBIND_FLAGS to gnatbind.
(ada/b_gnatb.adb): Likewise.
(ADA_GENERATED_FILES): Add g-byorma.ad[sb], g-hesora.ad[sb],
g-htable.ad[sb], g-spchge.ad[sb], g-speche.ad[sb], g-u3spch.ad[sb]
and alphabetize.
* libgnat/g-byorma.ads: Add note to head comment.
* libgnat/g-hesora.ads: Likewise.
* libgnat/g-htable.ads: Likewise.
* libgnat/g-spchge.ads: Likewise.
* libgnat/g-speche.ads: Likewise.
* libgnat/g-u3spch.ads: Likewise.
Jeff Law [Tue, 30 Dec 2025 17:38:07 +0000 (10:38 -0700)]
[RISC-V][PR target/123318] Use a Pmode temporary for output of auipc
In the explicit-relocs path through the RISC-V backend we generate sequences
using auipc which stores its result in a GPR.
Under the right circumstances we can end up with cases where we try to use
pseudos which may not be Pmode sized or worse yet may be a floating point mode.
This patch forces those paths to generate a fresh temporary when the provided
one isn't already Pmode. That helps this bug, but I'm not 100% convinced the
explicit-relocs stuff is correct and I wouldn't be surprised to find other bugs
lurking in here.
Bootstrapped & regression tested on the Pioneer and regression tested on
riscv{32,64}-elf as well.
Will commit once pre-commit CI gives it the green light.
PR target/123318
gcc/
* config/riscv/riscv.cc (riscv_legitimize_const_move): Force
riscv_split_symbol to generate a new temporary if the provided
one isn't Pmode.
gcc/testsuite/
* gcc.target/riscv/pr123318.c: New test.
Eric Botcazou [Tue, 30 Dec 2025 10:44:54 +0000 (11:44 +0100)]
Ada: Reject formal parameter as name of subprogram renaming
This implements a minimal form of the old RM 8.5.4(6) rule, which forbids
the use of (the name of) a formal parameter of the specification in the
name of a renaming subprogram declaration; it turns out that implementing
the full rule breaks existing code that works fine otherwise.
gcc/ada/
PR ada/15605
* sem_ch8.adb (Analyze_Subprogram_Renaming): Give an error if the
name is also that of a formal parameter of the specification.
gcc/testsuite/
* gnat.dg/specs/profile1.ads: New test.
Jose E. Marchesi [Mon, 29 Dec 2025 23:11:31 +0000 (00:11 +0100)]
a68: scanner fixes for bits denotations
This commit:
1. Fixes a bug in the scanner so it now checks whether the digits in a
bits denotation are ok for its radix.
2. Disallows typographical display features between the
digits of bits denotations in SUPPER stropping when the radix is
16. This is to avoid confusing situations like the one described
in the comment below.
3. Adds a few tests.
4. Fixes an existing test that was assuming that bits denotations with
radix 10 are allowed. The report allows radixes 2, 4, 8 and 16.
Signed-off-by: Jose E. Marchesi <jemarch@gnu.org>
gcc/algol68/ChangeLog
* a68-parser-scanner.cc (get_next_token): Bits denotation parsing
fixes.
* ga68.texi (SUPPER stropping): Document special rule for bits
denotations with radix 16.
gcc/testsuite/ChangeLog
* algol68/compile/error-radix-1.a68: New test.
* algol68/compile/radix-hex-upper-1.a68: Likewise.
* algol68/compile/radix-hex-supper-1.a68: Likewise.
* algol68/compile/error-radix-4.a68: Likewise.
* algol68/compile/error-radix-3.a68: Likewise.
* algol68/compile/error-radix-2.a68: Likewise.
* algol68/execute/environment-enquiries-6.a68: Do not use radix 10
in bits denotations.
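The radix check in item 1 can be sketched roughly as follows. This is an
illustrative stand-alone helper, not the code in a68-parser-scanner.cc; the
names digit_value and valid_bits_denotation are made up, and accepting
upper-case hex digits is an assumption of the sketch:

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Return the numeric value of DIGIT, or -1 if it is not a hex digit.
static int digit_value (char digit)
{
  unsigned char c = static_cast<unsigned char> (digit);
  if (std::isdigit (c))
    return c - '0';
  if (std::isxdigit (c))
    {
      int v = std::tolower (c) - 'a' + 10;
      assert (v >= 10 && v < 16);
      return v;
    }
  return -1;
}

// Hypothetical helper: true if RADIX is one of 2, 4, 8 or 16 and every
// character of DIGITS is a valid digit in that radix.
bool valid_bits_denotation (int radix, const std::string &digits)
{
  if (radix != 2 && radix != 4 && radix != 8 && radix != 16)
    return false;
  if (digits.empty ())
    return false;
  for (char d : digits)
    {
      int v = digit_value (d);
      if (v < 0 || v >= radix)
        return false;
    }
  return true;
}
```

Under these assumptions 2r1011 and 16rff pass, while 2r102 and anything
with radix 10 are rejected.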
Andrew Pinski [Sun, 28 Dec 2025 20:50:12 +0000 (12:50 -0800)]
ifcvt: Allow non-comparisons against 0 in noce_try_cond_zero_arith
Like r16-6332-g2a84a753afcf37 but instead of just allowing any comparisons
against 0, this allows all comparisons. I mentioned this in
https://gcc.gnu.org/pipermail/gcc-patches/2025-December/704463.html.
gcc/ChangeLog:
* ifcvt.cc (noce_try_cond_zero_arith): Remove restriction on comparison
against 0.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Andrew Pinski [Fri, 26 Dec 2025 20:07:26 +0000 (12:07 -0800)]
ifcvt: Handle lowpart subregs if noce_emit_cmove fails in noce_try_cond_zero_arith [PR123308]
This fixes up a missed optimization regression for riscv in ifcvt after r16-6350-g9e61a171244110.
The problem is noce_emit_cmove will fail for QImode. This can show up when dealing with shifts
and the right hand side is `(subreg:QI (reg:DI) lowpart)`. Trying first for the subreg mode
fails, so we need to unwrap the subreg and try again in the full mode.
This fixes test_ShiftLeft_eqz in gcc.target/riscv/zicond_ifcvt_opt.c.
Bootstrapped and tested on x86_64-linux-gnu.
PR rtl-optimization/123308
gcc/ChangeLog:
* ifcvt.cc (noce_try_cond_zero_arith): If noce_emit_cmove fails
for a lowpart subreg case, then try the full reg cmove and
take the lowpart subreg afterwards.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Andrew Pinski [Sun, 28 Dec 2025 20:39:13 +0000 (12:39 -0800)]
ifcvt: cleanup if_info->cond usage in noce_try_cond_zero_arith
Since r16-6332-g2a84a753afcf37, if_info->cond is not used indirectly any more
for creating the conditional. So this patch stops doing the swap and fixes up
the case where if_info->rev_cond might be null.
gcc/ChangeLog:
* ifcvt.cc (noce_try_cond_zero_arith): Don't swap if_info->cond
but use it directly with if_info->rev_cond.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Pietro Monteiro [Mon, 29 Dec 2025 17:53:20 +0000 (12:53 -0500)]
libga68: Add symbol versions to exports
libga68/ChangeLog:
* configure.ac: New test to determine if symbol versioning is
supported.
* Makefile.am: Use result of above test to add appropriate linker
flags.
* Makefile.in: Regenerated.
* aclocal.m4: Likewise.
* configure: Likewise.
* ga68.map: New file.
* libtool-version: New file.
Signed-off-by: Pietro Monteiro <pietro@sociotechnical.xyz>
Egas Ribeiro [Thu, 25 Dec 2025 12:33:47 +0000 (12:33 +0000)]
c++: Fix ICE with requires-expression in lambda requires-clause [PR123080]
When parsing a lambda with a trailing requires-clause, calling
cp_parser_requires_clause_opt confused generic lambda parsing when
implicit template parameters were involved.
After failing to parse a type-requirement during tentative parsing and
hitting error recovery code in cp_parser_skip_to_end_of_statement that
aborted the current implicit template, attempting to finish the lambda
declaration caused ICEs.
Fix this by not aborting the current implicit template during tentative
parsing and adding cleanup for fully_implicit_function_template_p.
PR c++/123080
gcc/cp/ChangeLog:
* parser.cc (cp_parser_skip_to_end_of_statement): Don't abort
implicit templates during tentative parsing.
(cp_parser_lambda_declarator_opt): Add cleanup for
fully_implicit_function_template_p before parsing
trailing_requires_clause.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/lambda-requires6.C: New test.
* g++.dg/cpp2a/lambda-requires6a.C: New test.
Signed-off-by: Egas Ribeiro <egas.g.ribeiro@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
Nathaniel Shead [Fri, 7 Nov 2025 12:24:17 +0000 (23:24 +1100)]
c++: Fold non-ODR usages of potentially constant values early [PR120005]
[basic.link] p14.4 says that a declaration naming a TU-local entity is
an exposure, ignoring "any reference to a non-volatile const object or
reference with internal or no linkage initialized with a constant
expression that is not an odr-use".
To implement this, we cannot stream these entities but must fold them
into their underlying values beforehand. This was already done to a
degree by cp_fold, but it didn't handle all cases, and notably was not
performed on the saved body of a constexpr function, which would then
cause errors during modules streaming.
This patch implements this by supplementing cp_fold with additional
rules to fold non-ODR usages of constants. We need to do this as an
additional walk before saving the constexpr function definition, so we
also disable as much other folding during this walk as possible to
prevent removing any information that the constexpr interpreter
requires to function correctly.
With this we still will error on uses in templates. In general it's
impossible to tell within an uninstantiated template body whether a
reference is an ODR-use in the face of dependent expressions, so we
don't attempt to do anything for this case.
PR c++/119097
PR c++/120005
gcc/cp/ChangeLog:
* constexpr.cc (potential_constant_expression_1): Fall back to
location from parent expression if needed.
* cp-gimplify.cc (enum fold_flags): Add ff_only_non_odr.
(cp_fold_data::cp_fold_data): Assert invariant for flags.
(cp_fold_omp_clause_refs_r): New function.
(cp_fold_r): Specially handle OMP_CLAUSE_DECL.
(cp_fold_function_non_odr_use): New function.
(cp_fold_non_odr_use_1): New function.
(cp_fold_maybe_rvalue): Fold non-ODR uses when requested.
(cp_fold_non_odr_use): New function.
(fold_caches): Increase number of caches.
(get_fold_cache): Use a new cache for non-ODR use walks.
(cp_fold): Skip most folding for non-ODR use walks; always
fold constant-initialized references; remove dead code to
fold __builtin_source_location.
* cp-tree.h (cp_fold_function_non_odr_use): Declare.
(cp_fold_non_odr_use): Declare.
* decl.cc (finish_function): Fold non-ODR uses before saving
constexpr fundef. Invoke PLUGIN_PRE_GENERICIZE before this
folding.
* ptree.cc (cxx_print_xnode): Handle TU_LOCAL_ENTITY.
* tree.cc (bot_manip): Propagate TREE_CONSTANT.
* typeck2.cc (digest_nsdmi_init): Fold non-ODR uses in NSDMIs.
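As a rough illustration of the distinction the rule relies on (outside of
any module, and with made-up names), reading the value of a
constant-initialized const object is not an odr-use, whereas taking its
address is:

```cpp
#include <cassert>

// Internal linkage, initialized with a constant expression.
static const int limit = 42;

// Only the value of 'limit' is read here (lvalue-to-rvalue conversion on a
// constant expression), so these are not odr-uses and the references can be
// folded to the literal 42.
int clamp_to_limit (int x)
{
  if (x > limit)
    return limit;
  return x;
}

// Taking the address needs the object itself, so this is an odr-use and
// cannot be folded away.
const int *limit_address ()
{
  return &limit;
}
```

In a module interface, a function like clamp_to_limit is the case the
folding is meant to keep from being treated as an exposure; a function like
limit_address would still refer to the TU-local entity.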
Jakub Jelinek [Mon, 29 Dec 2025 13:00:02 +0000 (14:00 +0100)]
auto-profile.cc: Fix build with C++14
On Tue, Dec 23, 2025 at 11:01:36AM +0530, Dhruv Chawla wrote:
> Committed as:
> - r16-6347-g84058c3cc805f7
This broke building gcc with C++14 system compilers.
../../gcc/auto-profile.cc: In member function ‘std::pair<const char*, int> autofdo::string_table::get_original_name(const char*) const’:
../../gcc/auto-profile.cc:1129:7: warning: init-statement in selection statements only available with ‘-std=c++17’ or ‘-std=gnu++17’ [-Wc++17-extensions]
1129 | if (symtab_node *n
| ^~~~~~~~~~~
This is valid only in C++17 and later.
Fixed thusly.
2025-12-29 Jakub Jelinek <jakub@redhat.com>
* auto-profile.cc (string_table::get_original_name): Avoid using
init-statement in selection statement.
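The general shape of such a fix, sketched with hypothetical names (table,
find_entry and lookup_or_default are not from auto-profile.cc), is to hoist
the declaration out of the if so that the code is valid C++14:

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical lookup table standing in for a symbol table.
static const std::map<std::string, int> table = {{"foo", 1}, {"bar", 2}};

static const int *find_entry (const std::string &name)
{
  auto it = table.find (name);
  return it == table.end () ? nullptr : &it->second;
}

// C++17 only:
//   if (const int *e = find_entry (name); e && *e > 0)
//     return *e;
// C++14-compatible form: declare first, then test.
int lookup_or_default (const std::string &name, int dflt)
{
  assert (!name.empty ());
  const int *e = find_entry (name);
  if (e && *e > 0)
    return *e;
  return dflt;
}
```

The only cost is that the variable's scope widens to the enclosing block;
wrapping the declaration and the if in an extra pair of braces restores the
narrower scope where that matters.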
Rainer Orth [Mon, 29 Dec 2025 11:09:35 +0000 (12:09 +0100)]
build: Cherry-pick libtool.m4 support for GNU ld *_sol2 emulations
GNU ld gained separate Solaris-specific linker emulations (*_sol2) long
ago. Since their introduction, GCC has preferred them over their
non-*_sol2 counterparts but supported both forms. This has changed for
GCC 16: since all supported versions of GNU ld do support the *_sol2
emulations, GCC now uses them unconditionally.
libtool has also been updated to handle this since libtool 2.4.2 back in
2011. However, that change has only partially been backported to the
heavily patched libtool.m4 in the GCC tree: the sparcv9 part is there,
but the amd64 part is missing for some reason. This causes problems
with some recent binutils changes.
Therefore this patch cherry-picks the libtool patch to bring
Solaris/x86_64 in sync with Solaris/sparcv9 and upstream libtool.
Bootstrapped without regressions on {amd64,i386}-pc-solaris2.11 and
{sparcv9,sparc}-sun-solaris2.11.
Jose E. Marchesi [Mon, 29 Dec 2025 03:18:52 +0000 (04:18 +0100)]
a68: use LMD instead of LM for mode labels in exports
dwarf2out uses "LM" for line-info labels in text sections, using the
global counter line_info_label_num to get unique label names. The
Algol 68 exports were using the same string for the mode labels in the
.a68_exports sections using its own private counter. This led to
assemblers to not be happy whey they find duplicated labels in the
input assembly files.
This commit changes the names of mode labels in Algol 68 export
sections to use the "LMD" string instead.
Signed-off-by: Jose E. Marchesi <jemarch@gnu.org>
gcc/algol68/ChangeLog
* a68-exports.cc (a68_asm_output_mode): Use .LMDnn labels for
modes instead of .LMnn.
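The clash can be modelled with a toy generator: two independent counters
sharing one prefix produce the same names, while distinct prefixes do not.
This is only a sketch; label_gen and count_collisions are invented for the
illustration and do not correspond to functions in dwarf2out or a68-exports:

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>

// Toy label generator: a prefix plus a private counter.
struct label_gen
{
  std::string prefix;
  unsigned counter;

  explicit label_gen (std::string p) : prefix (std::move (p)), counter (0) {}

  std::string next () { return "." + prefix + std::to_string (++counter); }
};

// Emit N labels from each generator and count how many names repeat.
unsigned count_collisions (label_gen a, label_gen b, unsigned n)
{
  std::set<std::string> seen;
  unsigned collisions = 0;
  for (unsigned i = 0; i < n; i++)
    {
      if (!seen.insert (a.next ()).second)
        collisions++;
      if (!seen.insert (b.next ()).second)
        collisions++;
    }
  return collisions;
}
```

With both generators using "LM" every second label is a duplicate; with
"LM" and "LMD" the name spaces stay disjoint even though the counters
overlap.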
Andrew Pinski [Fri, 26 Dec 2025 22:30:22 +0000 (14:30 -0800)]
LRA: Fix eliminate regs into a subreg inside a debug insn [PR123295]
So the problem here is during LRA we are eliminating argp and trying to
simplify the RTL as we go but inside a debug insn, almost all subreg
are valid due to gen_lowpart_for_debug done during debug insn simplification.
So simplify_gen_subreg will fail on some subregs and return null.
This causes problems later on. The solution is create a raw SUBREG
like what is done in lra_substitute_pseudo for debug insns.
Bootstrapped and tested on x86_64-linux-gnu.
PR rtl-optimization/123295
gcc/ChangeLog:
* lra-eliminations.cc (lra_eliminate_regs_1): For a debug
insn, create a raw SUBREG if simplify_gen_subreg fails.
gcc/testsuite/ChangeLog:
* gcc.dg/pr123295-1.c: New test.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Jose E. Marchesi [Sun, 28 Dec 2025 19:05:50 +0000 (20:05 +0100)]
a68: document finding module exports in the manual
This commit contains some small updates to the ga68 user manual, on
the topic of modules, exports and the modules map.
Signed-off-by: Jose E. Marchesi <jemarch@gnu.org>
gcc/algol68/ChangeLog
* ga68.texi (Packets): Update to specify only particular programs
and prelude packets are currently supported.
(Writing modules): Few updates.
(Module activation): Likewise.
(Modules and exports): New section.
Jose E. Marchesi [Sun, 28 Dec 2025 00:35:35 +0000 (01:35 +0100)]
a68: fix deduplication of imported modes
The internal global list of modes maintained by the compiler should
not contain two modes that are equivalent. When importing module
interfaces, the modes in these interfaces should get deduplicated
before being "interned" in the compiler's modes list. This commit
fixes the deduplication to accommodate the fact that more than one
module interface may be read from a given packet (compilation unit)
and also that multiple interfaces may be imported indirectly via
publicized modules.
Signed-off-by: Jose E. Marchesi <jemarch@gnu.org>
gcc/algol68/ChangeLog
* a68-imports.cc (a68_decode_modes): Do not deduplicate imported
modes here.
(a68_open_packet): Do it here.
* a68-parser-extract.cc (extract_revelation): Recurse into
publicized modules after interning modes in the current module,
not before.
gcc/testsuite/ChangeLog
* algol68/execute/modules/module23bar.a68: New test.
* algol68/execute/modules/module23foo.a68: Likewise.
* algol68/execute/modules/program-23.a68: Likewise.
Rainer Orth [Sun, 28 Dec 2025 10:32:37 +0000 (11:32 +0100)]
Support Solaris CTF generation
The Solaris Compact C Type Format, CTF, was introduced back in Solaris
9. It is the precursor to current GNU CTF, meant primarily for tools
like the low-level debugger mdb or DTrace that would like to avoid the
overhead of full DWARF-2 debugging information. However, for a long
time creation required separate steps to convert DWARF information
(version 2 only) to CTF in the input objects and later merge this into
the final objects. The tools to do so were available, but they were
barely documented and their use restricted to the core OS because of
this difficulty.
There's recently been a massive effort to simplify this and allow for
wider adoption. The native linker has been extended to take GNU CTF
info in the input objects and convert that to Solaris CTF itself. At
the same time, the massively enhanced tools and the format itself are
fully documented.
To make this even simpler to use, this patch introduces a new -gsctf
option to hide the details from users. At compile time, it just passes
-gctf to the compiler, and at link time it invokes ld with -z ctf.
Bootstrapped without regressions on i386-pc-solaris2.11 and
sparc-sun-solaris2.11.
Rainer Orth [Sun, 28 Dec 2025 10:08:07 +0000 (11:08 +0100)]
testsuite: i386: Fix up check-function-bodies tests
Several recent tests that use check-function-bodies on x86 FAIL on Solaris:
they all lack dg-add-options check_function_bodies which is required to
handle some Solaris differences. One test also needs -fomit-frame-pointer
to deal with a different Solaris/x86 default.
Tested on i386-pc-solaris2.11 and x86_64-pc-linux-gnu.
Rainer Orth [Sun, 28 Dec 2025 09:43:04 +0000 (10:43 +0100)]
Don't check for -xbrace_comment with Solaris/x86 as
With Solaris/x86 as, GCC uses the -xbrace_comment option if supported.
Since it is present in the Solaris 11.4 FCS assembler and 11.4 is the
only supported Solaris version, this check is no longer necessary.
Bootstrapped without regressions on i386-pc-solaris2.11.
Signed-off-by: Jose E. Marchesi <jemarch@gnu.org>
gcc/algol68/ChangeLog
* a68-parser-taxes.cc (tax_module_dec): Do not handle
DEFINING_MODULE_INDICANT.
* a68-exports.cc (a68_add_module_to_moif): Do not mangle module
names in module extracts.
(add_pub_revelations_to_moif): New function.
(a68_do_exports): Simplify and call add_pub_revelations_to_moif.
* a68-imports.cc (a68_decode_moifs): Add all decoded moifs to the
global list TOP_MOIF.
* a68-parser-extract.cc (extract_revelation): Recurse to import
extracts from publicized modules.
(a68_extract_indicants): Do not add symbol table entries for
defining modules.
* a68-types.h (struct TAG_T): Remove field EXPORTED.
(EXPORTED): Remove macro.
(TOP_MOIF): Define.
* a68-parser.cc (a68_parser): Initialize global list of moifs.
(a68_new_tag): Do not initialize EXPORTED.
gcc/testsuite/ChangeLog
* algol68/execute/modules/module22bar.a68: New test.
* algol68/execute/modules/module22foo.a68: Likewise.
* algol68/execute/modules/program-22.a68: Likewise.
* algol68/compile/modules/program-11.a68: Adjust test to
publicized modules.
* algol68/compile/modules/program-error-multiple-delaration-module-1.a68:
Likewise.
Jose E. Marchesi [Sat, 27 Dec 2025 17:42:38 +0000 (18:42 +0100)]
a68: remove coalesce_public_symbols shortcut
As planned, this commit removes a crude hack (the coalescing of 'pub'
symbols right after bottom-up parsing) that I introduced during the
initial implementation of modules. The goal was to get working
separated compilation as soon as possible. Now the rest of the
parser, and also the lowerer pass, is made to know about these 'pub'
symbols.
Signed-off-by: Jose E. Marchesi <jemarch@gnu.org>
gcc/algol68/ChangeLog
* a68-parser-bottom-up.cc (a68_bottom_up_error_check): Do not
check for the absence of public-symbols.
(a68_bottom_up_coalesce_pub): Removed function.
* a68-parser.cc (a68_parser): Do not call
a68_bottom_up_coalesce_pub.
* a68-parser-extract.cc (a68_extract_indicants): Adapt to the
presence of public-symbols.
* a68-parser-modes.cc (get_mode_from_proc_variables): Likewise.
* a68-parser-taxes.cc (tax_variable_dec): Likewise.
(tax_proc_variable_dec): Likewise.
(tax_op_dec): Likewise.
(tax_prio_dec): Likewise.
* a68-low-decls.cc (a68_lower_mode_declaration): Adapt to the
presence of public-symbols.
(a68_lower_variable_declaration): Likewise.
(a68_lower_identity_declaration): Likewise.
(a68_lower_procedure_declaration): Likewise.
(a68_lower_procedure_variable_declaration): Likewise.
(a68_lower_brief_operator_declaration): Likewise.
(a68_lower_operator_declaration): Likewise.
gcc/testsuite/ChangeLog
* algol68/compile/module-2.a68: Expand test a little.
Jose E. Marchesi [Sat, 27 Dec 2025 10:09:04 +0000 (11:09 +0100)]
a68: allow joined list of revelations in access clauses
This commit adds support for having a joined list of revelations in
access clauses, like in:
access Module18a,
Module18b,
Module18c
begin assert (foo = 10);
assert (bar = 20);
assert (baz = 30)
end
Signed-off-by: Jose E. Marchesi <jemarch@gnu.org>
gcc/algol68/ChangeLog
* a68-parser-bottom-up.cc (reduce_enclosed_clauses): Reduce joined
list of revelations.
* a68-low-clauses.cc (a68_lower_revelation_ludes): New function.
(a68_lower_access_clause): Use a68_lower_revelation_ludes.
Jose E. Marchesi [Tue, 23 Dec 2025 14:53:36 +0000 (15:53 +0100)]
a68: fix support for nested access clauses
This commit fixes the support for having an access clause as the
controlled clause of another access clause.
Signed-off-by: Jose E. Marchesi <jemarch@gnu.org>
gcc/algol68/ChangeLog
* a68-parser-top-down.cc (top_down_access): An access clause may
be nested in another access clause.
* a68-parser-extract.cc (a68_extract_indicants): Coalesce 'pub'
symbols.
(a68_extract_indicants): Nested access are not allowed in module
texts.
* a68-parser-bottom-up.cc (expected_module_text): New function.
(reduce_prelude_packet): Use expected_module_text.
(a68_bottom_up_error_check): Add comment.
gcc/testsuite/ChangeLog
* algol68/compile/error-module-nested-access-1.a68: New test.
* algol68/execute/modules/program-21.a68: Likewise.
Jose E. Marchesi [Mon, 22 Dec 2025 23:52:52 +0000 (00:52 +0100)]
a68: fetch module exports from packet by name
A packet (compilation unit) may emit more than one module interface in
its exports section. This is because a module may publicize the
exports of another module. This commit makes the import infrastructure
read multiple module interfaces from exports sections and then look
for the accessed module in the data.
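The lookup described above can be pictured with a small sketch. It is only an illustration: `Moif`, `find_moif` and the `next` field are hypothetical stand-ins, not the actual MOIF_T layout or the code in a68-imports.cc.

```cpp
#include <string>

// Hypothetical stand-in for one decoded module interface (moif).
struct Moif {
  std::string name;  // name of the module this interface describes
  Moif *next;        // next moif decoded from the same exports section
};

// Walk the chain of moifs decoded from one packet and return the one
// whose name matches the accessed module, or nullptr if none does.
const Moif *find_moif(const Moif *chain, const std::string &module_name) {
  for (const Moif *m = chain; m != nullptr; m = m->next)
    if (m->name == module_name)
      return m;
  return nullptr;
}
```

A packet that re-exports other modules would yield a chain with several entries, and only the entry matching the accessed module name is used.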
Signed-off-by: Jose E. Marchesi <jemarch@gnu.org>
gcc/algol68/ChangeLog
* a68-types.h (struct MOIF_T): Add chain_next to GTY info.
* a68-imports.cc (a68_decode_modes): Mode offsets are relative to
the start of the moif, not the start of the exports.
(a68_decode_moifs): Renamed from a68_decode_moif and changed to
decode multiple moifs from the exports.
(a68_open_packet): Call a68_decode_moifs and look for the right
moif.
* a68-exports.cc (a68_moif_new): Initialize NEXT (moif).
Jakub Jelinek [Sat, 27 Dec 2025 10:45:18 +0000 (11:45 +0100)]
simplify-rtx: Fix up (ne (ior (ne x 0) y) 0) simplification [PR123114]
The following testcase ICEs on x86_64-linux since the PR52345
(ne (ior (ne x 0) y) 0) simplification was (slightly) fixed.
It wants to optimize
(set (reg/i:DI 10 a0)
(ne:DI (ior:DI (ne:DI (reg:DI 151 [ a ])
(const_int 0 [0]))
(reg:DI 152 [ b ]))
(const_int 0 [0])))
but doesn't check an important property of that, in particular
that the mode of the inner NE operand is the same as the
mode of the inner NE.
The following testcase has
(set (reg:CCZ 17 flags)
(compare:CCZ (ior:QI (ne:QI (reg/v:SI 104 [ c ])
(const_int 0 [0]))
(reg:QI 98 [ _5 ]))
(const_int 0 [0])))
where cmp_mode is QImode, but the mode of the inner NE operand
is SImode instead, and it attempts to create
(ne:CCZ (ior:QI (reg/v:SI 104 [ c ]) (reg:QI 98 [ _5 ])) (const_int 0))
which obviously crashes later on.
The following patch fixes it by checking the mode of the inner NE operand
and also by using CONST0_RTX (cmp_mode) instead of CONST0_RTX (mode)
because that is the mode of the other operand, not mode which is the
mode of the outer comparison (though, guess for most modes it will still
be const0_rtx).
I guess for mode mismatches we could arbitrarily choose some extension (zero
or sign) and extend the narrower mode to the wider mode, but I doubt that it
would ever match on any target. But even then we'd need to limit it, we
wouldn't want to deal with another mode class (say floating point
comparisons), and dunno about vector modes etc.
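As a rough illustration of the added check, here is a toy model of the rewrite. Everything in it (`Mode`, `Code`, `Rtx`, `make`, `simplify_ne_ior_ne`) is a hypothetical stand-in and not GCC's RTL API; it only shows the rewrite being refused when the inner NE operand's mode differs from cmp_mode, and the zero being built in cmp_mode.

```cpp
#include <memory>

// Toy stand-ins for machine modes and RTL codes; NOT the real GCC types.
enum class Mode { QI, SI, DI, CCZ };
enum class Code { REG, CONST0, NE, IOR };

struct Rtx {
  Code code;
  Mode mode;
  std::shared_ptr<Rtx> op0, op1;
};
using RtxP = std::shared_ptr<Rtx>;

RtxP make(Code c, Mode m, RtxP a = nullptr, RtxP b = nullptr) {
  return std::make_shared<Rtx>(Rtx{c, m, a, b});
}

// Try to rewrite (ne (ior (ne x 0) y) 0) as (ne (ior x y) 0).
// `mode` is the mode of the outer comparison, `cmp_mode` the mode of its
// operands.  Returns nullptr when the rewrite does not apply.
RtxP simplify_ne_ior_ne(Mode mode, Mode cmp_mode,
                        const RtxP &op0, const RtxP &op1) {
  if (op1->code != Code::CONST0 || op0->code != Code::IOR)
    return nullptr;
  const RtxP &inner = op0->op0;
  if (inner->code != Code::NE || inner->op1->code != Code::CONST0)
    return nullptr;
  const RtxP &x = inner->op0;
  // The guard: x must already have the mode of the IOR, otherwise the
  // result would mix modes, as in the (ior:QI (reg:SI ...) ...) case.
  if (x->mode != cmp_mode)
    return nullptr;
  RtxP ior = make(Code::IOR, cmp_mode, x, op0->op1);
  // The zero is built in cmp_mode, the mode of the other operand.
  return make(Code::NE, mode, ior, make(Code::CONST0, cmp_mode));
}
```

With all operands in DImode the rewrite goes through; with an SImode register under a QImode NE it is rejected instead of producing a mixed-mode IOR.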
2025-12-27 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/123114
* simplify-rtx.cc (simplify_context::simplify_relational_operation):
Verify XEXP (XEXP (op0, 0), 0) mode and use CONST0_RTX (cmp_mode)
instead of CONST0_RTX (mode).
Eric Botcazou [Sat, 27 Dec 2025 09:24:52 +0000 (10:24 +0100)]
Ada: Fix assertion failure for unfrozen mutably tagged type as actual
...in instance. An instantiation is a freezing point for the actuals,
so the mutably tagged type will be frozen by the instantiation, but this
happens too late in the current implementation of mutably tagged types,
because the declaration of their CW-equivalent type is not analyzed until
after the type is frozen.
gcc/ada/
PR ada/123306
* sem_ch12.adb (Analyze_One_Association): Immediately freeze the
root type of mutably tagged types used as actual type parameters.
gcc/testsuite/
* gnat.dg/specs/mutably_tagged1.ads: New test.
Jeff Law [Fri, 26 Dec 2025 22:24:56 +0000 (15:24 -0700)]
[RISC-V][PR target/123283] Wrap naked REG operands with a USE.
I was in the process of testing this patch when Andreas filed PR123283.
What's going on is we have patterns in sync.md which have naked operands:
(define_insn "subword_atomic_fetch_strong_<atomic_optab>"
[(set (match_operand:SI 0 "register_operand" "=&r") ;; old value at mem
(match_operand:SI 1 "memory_operand" "+A")) ;; mem location
(set (match_dup 1)
(unspec_volatile:SI
[(any_atomic:SI (match_dup 1)
(match_operand:SI 2 "arith_operand" "rI")) ;; value for op
(match_operand:SI 3 "const_int_operand")] ;; model
UNSPEC_SYNC_OLD_OP_SUBWORD))
(match_operand:SI 4 "arith_operand" "rI") ;; mask
(match_operand:SI 5 "arith_operand" "rI") ;; not_mask
(clobber (match_scratch:SI 6 "=&r")) ;; tmp_1
(clobber (match_scratch:SI 7 "=&r"))] ;; tmp_2
Note carefully operands #4 and #5 and the fact they are a toplevel construct as
opposed to being an operand of another RTX. That's a no-no. They need to be
wrapped with a USE.
I spot-checked sync.md and found a few more instances. Fixing the set I found
fixed the testsuite regressions I was seeing and also fixes the mis-compilation
of libgo. Bootstrapped and regression tested on my BPI and Pioneer. It's also
clean on the riscv64-elf and riscv32-elf targets in my tester.
PR target/123283
gcc/
* config/riscv/sync.md (subword_atomic_fetch_strong_nand): Add
USEs for naked operands that might be pseudos.
(subword_atomic_fetch_strong_<atomic_optab>): Likewise.
(subword_atomic_exchange_strong): Likewise.
(subword_atomic_cas_strong): Likewise.
Eric Botcazou [Fri, 26 Dec 2025 13:52:32 +0000 (14:52 +0100)]
Ada: Fix bogus error on aggregate in call with qualified type in instance
This happens with a container aggregate in the testcase, although this can
very likely happen with a record aggregate as well. The trick used in the
Save_Global_References procedure for aggregates loses the qualification of
the type of the formal for which the aggregate is the actual.
gcc/ada/
PR ada/123302
* sem_ch12.adb (Save_Global_Reference.Save_References_In_Aggregate):
Recurse on the scope of the type to find one that is visible, in the
case of an actual in a subprogram call with a local type.
gcc/testsuite/
* gnat.dg/aggr34.adb: New test.
* gnat.dg/aggr34_pkg1.ads, gnat.dg/aggr34_pkg1.adb: New helper.
* gnat.dg/aggr34_pkg2.ads, gnat.dg/aggr34_pkg2.adb: Likewise.
* gnat.dg/aggr34_pkg3.ads: Likewise.
Egas Ribeiro [Mon, 22 Dec 2025 21:41:00 +0000 (21:41 +0000)]
c-family: Fix ICE with -MD and -fdeps-format sharing output [PR121864]
When -MD, -fdeps-format=p1689r5 and -save-temps are used without
explicit output files, they default to the same stream, which is
invalid. The error message attempted to print fdeps_file, but this is
NULL in this case, causing an ICE.
Use out_fname as a fallback when fdeps_file is NULL to avoid the ICE
and provide a meaningful error message.
Fix suggested by Andrew Pinski.
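A minimal sketch of the fallback, assuming only the two names mentioned above. The helper `deps_name_for_error` is hypothetical, and returning an empty string when both pointers are null is an assumption made for the sketch, not something taken from c-opts.cc.

```cpp
#include <string>

// Pick the file name to mention in the diagnostic: prefer fdeps_file,
// fall back to out_fname when fdeps_file is NULL.
std::string deps_name_for_error(const char *fdeps_file, const char *out_fname) {
  const char *name = fdeps_file ? fdeps_file : out_fname;
  return name ? name : "";  // assumption: never hand a null pointer to the formatter
}
```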
PR c++/121864
gcc/c-family/ChangeLog:
* c-opts.cc (c_common_finish): Use out_fname as fallback when
fdeps_file is NULL in error message.
Signed-off-by: Egas Ribeiro <egas.g.ribeiro@gmail.com> Reviewed-by: Jason Merrill <jason@redhat.com>
Eric Botcazou [Fri, 26 Dec 2025 09:44:57 +0000 (10:44 +0100)]
Ada: Fix illegal Aggregate aspect not rejected
The Ada 2022 RM is adamant that the names specified in the Aggregate aspect
must denote "exactly one" subprogram, in other words that it is illegal to
use names that denote more than one subprogram in the Aggregate aspect.
gcc/ada/
PR ada/123289
* sem_ch13.adb (Resolve_Aspect_Aggregate.Resolve_Operation): Give
an error if the operation's name denotes more than one subprogram.
gcc/testsuite/
* gnat.dg/specs/aggr9.ads: New test.
Sandra Loosemore [Sun, 21 Dec 2025 02:23:43 +0000 (02:23 +0000)]
doc, riscv: Clean up RISC-V extensions documentation
This patch fixes a number of problems I observed in the RISC-V
extensions documentation, which is autogenerated from .def files:
- The formatting of the table looked terrible in the PDF output, with
overlapping text. I made the first two columns wider to fix this.
- Also the extension names in the table should have @samp{} markup.
- Many extensions were missing a full name/description. (Documenting
something as "xyzzy extension" adds nothing useful to readers when we
are already listing the extension name "xyzzy" in the table.)
- Irregular spelling and capitalization in the full names.
Sandra Loosemore [Sun, 14 Dec 2025 00:38:48 +0000 (00:38 +0000)]
doc, riscv: Clean up documentation of RISC-V options [PR122243]
gcc/ChangeLog
PR other/122243
* config/riscv/riscv.opt (mplt): Mark deprecated option Undocumented.
(msmall-data-limit=): Mark RejectNegative.
* doc/invoke.texi (Option Summary) <RISC-V Options>: Remove -mplt
documentation. Only list one form of each option. Add missing
options -mcpu, -mscalar-strict-align, -mno-vector-strict-align,
-momit-leaf-frame-pointer, -mstringop-strategy, -mrvv-vector-bits,
-mrvv-max-lmul, -madjust-lmul-cost, -mmax-vectorization, and
-mno-autovec-segment.
(RISC-V Options): Remove -mplt documentation. Add documentation for
missing options listed above. Add missing index entries for negative
forms. Correct the default for the -minline-str* options, which
has changed. Copy-edit for markup, spelling, and usage. Trivial
whitespace fixes.
Then, sincos attempts to find the type of the IFN_SIN/IFN_COS via
mathfn_built_in_type. This fails, so the compiler crashes.
For these IFNs, their input type is the same as their output type, so
we can fall back to that.
Note that, currently, GCC can't seem to handle vector sincos/cexpi
operations, so any attempt to CSE these will fail quickly after. This
patch does not fix that, only the ICE that happens in the attempt.
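The fallback can be sketched as follows. Types are modelled as plain strings and `sincos_result_type` is a hypothetical helper; the real code in execute_cse_sincos_1 works on trees.

```cpp
#include <string>

// `looked_up` models the result of the type lookup: a null pointer means
// the lookup failed (as it does for the IFN_SIN/IFN_COS calls above).
// In that case, presume the result type equals the input type.
const std::string &sincos_result_type(const std::string *looked_up,
                                      const std::string &input_type) {
  return looked_up ? *looked_up : input_type;
}
```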
gcc/ChangeLog:
* tree-ssa-math-opts.cc (execute_cse_sincos_1): If
mathfn_built_in_type fails to determine a type for our
operation, presume that it is the same as the input type.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/sincos-ice-on-ifn_sin-call.c: New test.
* gcc.target/gcn/sincos-ice-on-ifn_sin-call-1.c: New test.
Pan Li [Sun, 21 Dec 2025 12:07:43 +0000 (20:07 +0800)]
RISC-V: Combine vec_duplicate + vmsleu.vv to vmsleu.vx on GR2VR cost
This patch combines vec_duplicate + vmsleu.vv into vmsleu.vx, as in the
example below. The related pattern depends on the cost of vec_duplicate
from GR2VR: late-combine performs the combination if the GR2VR cost is
zero, and rejects it if the GR2VR cost is greater than zero.
Assume the asm code below, with a GR2VR cost of 0.
After this patch:
beq a3,zero,.L8
...
.L3:
vsetvli a5,a3,e32,m1,ta,ma
...
vmsleu.wx v1,a2,v3
...
bne a3,zero,.L3
gcc/ChangeLog:
* config/riscv/predicates.md: Add geu to the swappable
cmp operator iterator.
* config/riscv/riscv-v.cc (get_swapped_cmp_rtx_code): Take
care of the swapped rtx code correspondingly.
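To illustrate what a swapped comparison code means here, a toy sketch: when the broadcast scalar is the left operand of a geu, exchanging the operands turns the comparison into a leu with the scalar on the right. The `Cmp` enum and both helpers are hypothetical and only model unsigned scalar comparisons; they are not the riscv-v.cc implementation.

```cpp
// Toy comparison codes modelled after the unsigned rtx codes.
enum class Cmp { EQ, NE, LTU, LEU, GTU, GEU };

// Code to use after exchanging the two operands of a comparison.
constexpr Cmp swapped(Cmp c) {
  switch (c) {
    case Cmp::LTU: return Cmp::GTU;
    case Cmp::GTU: return Cmp::LTU;
    case Cmp::LEU: return Cmp::GEU;
    case Cmp::GEU: return Cmp::LEU;
    default:       return c;  // EQ and NE are symmetric
  }
}

// Evaluate an unsigned comparison on scalars.
constexpr bool eval(Cmp c, unsigned a, unsigned b) {
  switch (c) {
    case Cmp::EQ:  return a == b;
    case Cmp::NE:  return a != b;
    case Cmp::LTU: return a < b;
    case Cmp::LEU: return a <= b;
    case Cmp::GTU: return a > b;
    case Cmp::GEU: return a >= b;
  }
  return false;
}
```

For every code and every pair of values, `eval(c, a, b)` equals `eval(swapped(c), b, a)`, which is the property the swap relies on.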
aarch64: Add the ability to have three types in an sve/sme intrinsic name
The majority of sve/sme intrinsics have names which are defined by one type
(like svuint8_t svextq[_u8]) or two types (like svsub_za32[_f32]_vg1x2).
Some intrinsics now have three types (like svtmopa_lane_za32[_s8_u8]).
This change extends the number of type_suffix_indexes from two to three
to cover this case.
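A small sketch of how a name with up to three type suffixes might be assembled. `SuffixTriple` and `full_name` are hypothetical, suffixes are plain strings with an empty string meaning "unused", and group suffixes such as _vg1x2 are left out; the real function_builder::get_name works on type_suffix_index values.

```cpp
#include <array>
#include <string>

// Hypothetical model of a triple of type suffixes; "" marks an unused slot.
using SuffixTriple = std::array<std::string, 3>;

// Append each used suffix to the base name, in order.
std::string full_name(const std::string &base, const SuffixTriple &suffixes) {
  std::string name = base;
  for (const std::string &s : suffixes)
    if (!s.empty())
      name += "_" + s;
  return name;
}
```

With the base "svtmopa_lane" and the suffixes za32, s8 and u8 this yields "svtmopa_lane_za32_s8_u8"; a two-type intrinsic simply leaves the third slot empty.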
gcc/
* config/aarch64/aarch64-sve-builtins-base.cc: (svmul_impl::fold):
Replace use of type_suffix_pair with type_suffix_triple.
* config/aarch64/aarch64-sve-builtins-shapes.cc: (parse_element_type):
Handle third type suffix.
(parse_type): Handle c2 in function signature. Add the u signature with
the ability to pass a tuple with twice as many vectors as the base type.
Calculate number of vectors against the type with the maximum number of
bits rather than "the other one".
(load_contiguous_base::resolve): Add argument to resolve_to call.
(compare_scalar_def::resolve): Likewise.
(ternary_mfloat8_def::resolve): Likewise.
(ternary_mfloat8_lane_def::resolve): Likewise.
(ternary_mfloat8_opt_n_def::resolve): Likewise.
* config/aarch64/aarch64-sve-builtins.cc: (TYPES_all_pred,
TYPES_all_count, TYPES_all_pred_count, TYPES_all_float,
TYPES_all_signed, TYPES_all_float_and_signed, TYPES_all_unsigned,
TYPES_all_integer, TYPES_all_arith, TYPES_all_data, TYPES_b, TYPES_c,
TYPES_b_unsigned, TYPES_b_integer, TYPES_b_data, TYPES_bh_integer,
TYPES_bs_unsigned, TYPES_bhs_signed, TYPES_bhs_unsigned,
TYPES_bhs_integer, TYPES_bh_data, TYPES_bhs_data, TYPES_bhs_widen,
TYPES_h_bfloat, TYPES_h_float, TYPES_h_integer, TYPES_h_data,
TYPES_hs_signed, TYPES_hs_integer, TYPES_hs_float, TYPES_hs_data,
TYPES_hd_unsigned, TYPES_hsd_signed, TYPES_hsd_integer, TYPES_hsd_data,
TYPES_h_float_mf8, TYPES_s_float, TYPES_s_float_mf8,
TYPES_s_float_hsd_integer, TYPES_s_float_sd_integer, TYPES_s_signed,
TYPES_s_unsigned, TYPES_s_integer, TYPES_s_data, TYPES_sd_signed,
TYPES_sd_unsigned, TYPES_sd_integer, TYPES_sd_data,
TYPES_all_float_and_sd_integer, TYPES_d_float, TYPES_d_unsigned,
TYPES_d_integer, TYPES_d_data, TYPES_cvt, TYPES_cvt_bfloat,
TYPES_cvt_h_s_float, TYPES_cvt_f32_f16, TYPES_cvt_long,
TYPES_cvt_narrow_s, TYPES_cvt_narrow, TYPES_cvt_s_s, TYPES_cvt_mf8,
TYPES_cvtn_mf8, TYPES_cvtnx_mf8, TYPES_inc_dec_n, TYPES_qcvt_x2,
TYPES_qcvt_x4, TYPES_qrshr_x2, TYPES_qrshru_x2, TYPES_qrshr_x4,
TYPES_qrshru_x4, TYPES_reinterpret, TYPES_reinterpret_b, TYPES_while,
TYPES_while_x, TYPES_while_x_c, TYPES_s_narrow_fsu, TYPES_all_za,
TYPES_d_za, TYPES_za_bhsd_data, TYPES_za_all_data, TYPES_za_h_mf8,
TYPES_za_hs_mf8, TYPES_za_h_bfloat, TYPES_za_h_float,
TYPES_za_s_b_signed, TYPES_za_s_b_unsigned, TYPES_za_s_b_integer,
TYPES_za_s_h_integer, TYPES_za_s_h_data, TYPES_za_s_unsigned,
TYPES_za_s_integer, TYPES_za_s_mf8, TYPES_za_s_float, TYPES_za_s_data,
TYPES_za_d_h_integer, TYPES_za_d_float, TYPES_za_d_integer,
TYPES_mop_base, TYPES_mop_base_signed, TYPES_mop_base_unsigned,
TYPES_mop_i16i64, TYPES_mop_i16i64_signed, TYPES_mop_i16i64_unsigned,
TYPES_za): Extend defines to three arguments.
(DEF_VECTOR_TYPE, DEF_DOUBLE_TYPE): Likewise.
(DEF_TRIPLE_TYPE): Add new define.
(DEF_SVE_TYPES_ARRAY): Redefine all types_ arrays into arrays of
type_suffix_triple.
(types_none): Likewise.
(function_instance::hash): Add third type to hash calculation.
(function_builder::get_name): Add third type to function name.
(function_builder::add_overloaded_functions): Handle third type.
(function_resolver::lookup_form): Likewise.
(function_resolver::resolve_to): Likewise.
(function_resolver::resolve_unary): Likewise.
* config/aarch64/aarch64-sve-builtins.h: (type_suffix_triple): Replace
type_suffix_pair.
(function_group_info::types): Likewise.
(function_instance::ctor): Likewise.
(function_instance::type_suffix_ids): Likewise.
(function_resolver::lookup_form): Add third type argument.
(function_resolver::resolve_to): Likewise.
(function_instance::operator==): Add third type to equality calculation.
Karl Meakin [Wed, 24 Dec 2025 11:41:27 +0000 (11:41 +0000)]
aarch64: add 8-bit floating point dot product
This patch adds support for the following intrinsics when sme-f8f16 is enabled:
* svdot_za16[_mf8]_vg1x2_fpm
* svdot_za16[_mf8]_vg1x4_fpm
* svdot[_single]_za16[_mf8]_vg1x2_fpm
* svdot[_single]_za16[_mf8]_vg1x4_fpm
* svdot_lane_za16[_mf8]_vg1x2_fpm
* svdot_lane_za16[_mf8]_vg1x4_fpm
This patch adds support for the following intrinsics when sme-f8f32 is enabled:
* svdot_za32[_mf8]_vg1x2_fpm
* svdot_za32[_mf8]_vg1x4_fpm
* svdot[_single]_za32[_mf8]_vg1x2_fpm
* svdot[_single]_za32[_mf8]_vg1x4_fpm
* svdot_lane_za32[_mf8]_vg1x2_fpm
* svdot_lane_za32[_mf8]_vg1x4_fpm
* svvdot_lane_za32[_mf8]_vg1x2_fpm
* svvdotb_lane_za32[_mf8]_vg1x4_fpm
* svvdott_lane_za32[_mf8]_vg1x4_fpm
gcc:
* config/aarch64/aarch64-sme.md
(@aarch64_sme_<optab><SME_ZA_F8F16_32:mode><SME_ZA_FP8_x24:mode>): New insn.
(@aarch64_fvdot_half<optab>): Likewise.
(@aarch64_fvdot_half<optab>_plus): Likewise.
* config/aarch64/aarch64-sve-builtins-functions.h
(class svvdot_half_impl): New function impl.
* config/aarch64/aarch64-sve-builtins-sme.cc (FUNCTION): Likewise.
* config/aarch64/aarch64-sve-builtins-shapes.cc (struct dot_half_za_slice_lane_def):
New function shape.
* config/aarch64/aarch64-sve-builtins-shapes.h: Likewise.
* config/aarch64/aarch64-sve-builtins-sme.def (svdot): New function.
(svdot_lane): Likewise.
(svvdot_lane): Likewise.
(svvdotb_lane): Likewise.
(svvdott_lane): Likewise.
* config/aarch64/aarch64-sve-builtins-sme.h (svvdotb_lane_za): New function.
(svvdott_lane_za): Likewise.
* config/aarch64/aarch64-sve-builtins.cc (TYPES_za_s_mf8): New types array.
(TYPES_za_hs_mf8): Likewise.
(za_hs_mf8): Likewise.
* config/aarch64/iterators.md (SME_ZA_F8F16): New mode iterator.
(SME_ZA_F8F32): Likewise.
(SME_ZA_FP8_x1): Likewise.
(SME_ZA_FP8_x2): Likewise.
(SME_ZA_FP8_x4): Likewise.
(UNSPEC_SME_FDOT_FP8): New unspec.
(UNSPEC_SME_FVDOT_FP8): Likewise.
(UNSPEC_SME_FVDOTT_FP8): Likewise.
(UNSPEC_SME_FVDOTB_FP8): Likewise.
(SME_FP8_DOTPROD): New int iterator.
(SME_FP8_FVDOT): Likewise.
(SME_FP8_FVDOT_HALF): Likewise.
gcc/testsuite:
* gcc.target/aarch64/sme2/acle-asm/dot_lane_za16_mf8_vg1x2.c: New test.
* gcc.target/aarch64/sme2/acle-asm/dot_lane_za16_mf8_vg1x4.c: New test.
* gcc.target/aarch64/sme2/acle-asm/dot_lane_za32_mf8_vg1x2.c: New test.
* gcc.target/aarch64/sme2/acle-asm/dot_lane_za32_mf8_vg1x4.c: New test.
* gcc.target/aarch64/sme2/acle-asm/dot_single_za16_mf8_vg1x2.c: New test.
* gcc.target/aarch64/sme2/acle-asm/dot_single_za16_mf8_vg1x4.c: New test.
* gcc.target/aarch64/sme2/acle-asm/dot_single_za32_mf8_vg1x2.c: New test.
* gcc.target/aarch64/sme2/acle-asm/dot_single_za32_mf8_vg1x4.c: New test.
* gcc.target/aarch64/sme2/acle-asm/dot_za16_mf8_vg1x2.c: New test.
* gcc.target/aarch64/sme2/acle-asm/dot_za16_mf8_vg1x4.c: New test.
* gcc.target/aarch64/sme2/acle-asm/dot_za32_mf8_vg1x2.c: New test.
* gcc.target/aarch64/sme2/acle-asm/dot_za32_mf8_vg1x4.c: New test.
* gcc.target/aarch64/sme2/acle-asm/vdot_lane_za16_mf8_vg1x2.c: New test.
* gcc.target/aarch64/sme2/acle-asm/vdotb_lane_za32_mf8_vg1x4.c: New test.
* gcc.target/aarch64/sme2/acle-asm/vdott_lane_za32_mf8_vg1x4.c: New test.
* gcc.target/aarch64/sve/acle/general-c/dot_half_za_slice_lane_fpm.c: New test.
aarch64: add 8-bit floating-point sum of outer products and accumulate
This patch adds support for FMOPA (widening, 2-way, FP8 to FP16) when
sme-f8f16 is enabled using svmopa_za16[_mf8]_m_fpm and for FMOPA (widening,
4-way) when sme-f8f32 is enabled using svmopa_za32[_mf8]_m_fpm.
Asm tests for the new intrinsics are added, similar to those for existing
mopa_z16 intrinsics. Tests for the binary_za_m shape are added.
gcc:
* config/aarch64/aarch64-sme.md
(@aarch64_sme_<optab><SME_ZA_F8F16_32:mode><VNx16QI_ONLY:mode>): Add
new define_insn.
* config/aarch64/aarch64-sve-builtins-shapes.cc
(struct binary_za_m_base): Support fpm argument.
* config/aarch64/aarch64-sve-builtins-sme.cc (svmopa_za): Extend for
fp8.
* config/aarch64/aarch64-sve-builtins-sme.def (svmopa): Add new
DEF_SME_ZA_FUNCTION_GS_FPM entries.
aarch64: add Multi-vector 8-bit floating-point multiply-add long
This patch adds support for the following intrinsics when sme-f8f16 is enabled:
* svmla_lane_za16[_mf8]_vg2x1_fpm
* svmla_lane_za16[_mf8]_vg2x2_fpm
* svmla_lane_za16[_mf8]_vg2x4_fpm
* svmla_za16[_mf8]_vg2x1_fpm
* svmla[_single]_za16[_mf8]_vg2x2_fpm
* svmla[_single]_za16[_mf8]_vg2x4_fpm
* svmla_za16[_mf8]_vg2x2_fpm
* svmla_za16[_mf8]_vg2x4_fpm
This patch adds support for the following intrinsics when sme-f8f32 is enabled:
* svmla_lane_za32[_mf8]_vg4x1_fpm
* svmla_lane_za32[_mf8]_vg4x2_fpm
* svmla_lane_za32[_mf8]_vg4x4_fpm
* svmla_za32[_mf8]_vg4x1_fpm
* svmla[_single]_za32[_mf8]_vg4x2_fpm
* svmla[_single]_za32[_mf8]_vg4x4_fpm
* svmla_za32[_mf8]_vg4x2_fpm
* svmla_za32[_mf8]_vg4x4_fpm
Asm tests for the 32-bit versions follow the blueprint set in
mla_lane_za32_u8_vg4x1.c, mla_za32_u8_vg4x1.c and similar.
The 16-bit versions follow similar patterns modulo differences in allowed offsets.
gcc:
* config/aarch64/aarch64-sme.md
(@aarch64_sme_<optab><SME_ZA_F8F16_32:mode><SME_ZA_FP8_x24:mode>): Add
new define_insn.
(*aarch64_sme_<optab><VNx8HI_ONLY:mode><SME_ZA_FP8_x24:mode>_plus,
*aarch64_sme_<optab><VNx4SI_ONLY:mode><SME_ZA_FP8_x24:mode>_plus,
@aarch64_sme_<optab><SME_ZA_F8F16_32:mode><VNx16QI_ONLY:mode>,
*aarch64_sme_<optab><VNx8HI_ONLY:mode><VNx16QI_ONLY:mode>_plus,
*aarch64_sme_<optab><VNx4SI_ONLY:mode><VNx16QI_ONLY:mode>_plus,
@aarch64_sme_single_<optab><SME_ZA_F8F16_32:mode><SME_ZA_FP8_x24:mode>,
*aarch64_sme_single_<optab><VNx8HI_ONLY:mode><SME_ZA_FP8_x24:mode>_plus,
*aarch64_sme_single_<optab><VNx4SI_ONLY:mode><SME_ZA_FP8_x24:mode>_plus,
@aarch64_sme_lane_<optab><SME_ZA_F8F16_32:mode><SME_ZA_FP8_x124:mode>,
*aarch64_sme_lane_<optab><VNx8HI_ONLY:mode><SME_ZA_FP8_x124:mode>,
*aarch64_sme_lane_<optab><VNx4SI_ONLY:mode><SME_ZA_FP8_x124:mode>):
Likewise.
* config/aarch64/aarch64-sve-builtins-shapes.cc
(struct binary_za_slice_lane_base): Support fpm argument.
(struct binary_za_slice_opt_single_base): Likewise.
* config/aarch64/aarch64-sve-builtins-sme.cc (svmla_za): Extend for fp8.
(svmla_lane_za): Likewise.
* config/aarch64/aarch64-sve-builtins-sme.def (svmla_lane): Add new
DEF_SME_ZA_FUNCTION_GS_FPM entries.
(svmla): Likewise.
* config/aarch64/iterators.md (SME_ZA_F8F16_32): Add new mode iterator.
(SME_ZA_FP8_x24, SME_ZA_FP8_x124): Likewise.
(UNSPEC_SME_FMLAL): Add new unspec.
(za16_offset_range): Add new mode_attr.
(za16_32_long): Likewise.
(za16_32_last_offset): Likewise.
(SME_FP8_TERNARY_SLICE): Add new iterator.
(optab): Add entry for UNSPEC_SME_FMLAL.
gcc/testsuite:
* gcc.target/aarch64/sme2/acle-asm/test_sme2_acle.h: (TEST_ZA_X1,
TEST_ZA_XN, TEST_ZA_SINGLE, TEST_ZA_SINGLE_Z15, TEST_ZA_LANE,
TEST_ZA_LANE_Z15): Add fpm0 parameter.
* gcc.target/aarch64/sve/acle/general-c/binary_za_slice_lane_1.c: Add
tests for variants accepting fpm.
* gcc.target/aarch64/sve/acle/general-c/binary_za_slice_opt_single_1.c:
Likewise.
* gcc.target/aarch64/sme2/acle-asm/mla_lane_za16_mf8_vg2x1.c: New test.
* gcc.target/aarch64/sme2/acle-asm/mla_lane_za16_mf8_vg2x2.c: New test.
* gcc.target/aarch64/sme2/acle-asm/mla_lane_za16_mf8_vg2x4.c: New test.
* gcc.target/aarch64/sme2/acle-asm/mla_lane_za32_mf8_vg4x1.c: New test.
* gcc.target/aarch64/sme2/acle-asm/mla_lane_za32_mf8_vg4x2.c: New test.
* gcc.target/aarch64/sme2/acle-asm/mla_lane_za32_mf8_vg4x4.c: New test.
* gcc.target/aarch64/sme2/acle-asm/mla_za16_mf8_vg2x1.c: New test.
* gcc.target/aarch64/sme2/acle-asm/mla_za16_mf8_vg2x2.c: New test.
* gcc.target/aarch64/sme2/acle-asm/mla_za16_mf8_vg2x4.c: New test.
* gcc.target/aarch64/sme2/acle-asm/mla_za32_mf8_vg4x1.c: New test.
* gcc.target/aarch64/sme2/acle-asm/mla_za32_mf8_vg4x2.c: New test.
* gcc.target/aarch64/sme2/acle-asm/mla_za32_mf8_vg4x4.c: New test.
aarch64: add basic support for sme-f8f16 and sme-f8f32
This patch adds support for the SME_F8F16 and SME_F8F32 features as architecture
options, along with related definitions. This support is required for subsequent
intrinsics to work.
gcc/
* config/aarch64/aarch64.h:
(TARGET_STREAMING_SME_F8F16, TARGET_STREAMING_SME_F8F32): Add defines.
* config/aarch64/aarch64-c.cc:
(__ARM_FEATURE_SME_F8F16, __ARM_FEATURE_SME_F8F32): Add defines.
* config/aarch64/aarch64-option-extensions.def:
(sme-f8f16, sme-f8f32): Add arch options in command line.
* config/aarch64/aarch64-sve-builtins-functions.h:
(sme_2mode_function_t): Pass unspec_for_mfp8 parameter through ctor.
* config/aarch64/aarch64-sve-builtins-sme.def:
(DEF_SME_FUNCTION_GS, DEF_SME_FUNCTION): Redefine based on
DEF_SME_FUNCTION_GS_FPM.
(DEF_SME_ZA_FUNCTION_GS, DEF_SME_ZA_FUNCTION): Redefine based on
DEF_SME_ZA_FUNCTION_GS_FPM.
(AARCH64_FL_SME_F8F16, AARCH64_FL_SME_F8F32): Add new
REQUIRED_EXTENSIONS sections.
* config/aarch64/aarch64-sve-builtins.cc:
(TYPES_za_h_mf8): Add new types.
(TYPES_za_s_mf8): Likewise.
(sme_function_groups): Define using DEF_SME_FUNCTION_GS_FPM instead of
DEF_SME_FUNCTION_GS.
* doc/invoke.texi: (sme-f8f16, sme-f8f32): Add documentation of option.
gcc/testsuite/
* gcc.target/aarch64/pragma_cpp_predefs_4.c: Add tests checking that
sme-f8f16 and sme-f8f32 predefs are off by default, and checks for
feature dependencies.
* lib/target-supports.exp: Add check_effective_target support for
sme-f8f16 and sme-f8f32.
Test structure is based on the urshl ones that have a similar structure in how
they treat arguments.
gcc/
* config/aarch64/aarch64-sve-builtins-base.cc (svscale_impl): Added new
class for dealing with all svscale functions (including sve).
(svscale): Updated FUNCTION macro call to make use of new class.
* config/aarch64/aarch64-sve-builtins-sve2.def: (svscale):
Added new DEF_SVE_FUNCTION_GS call to enable recognition of new variant.
* config/aarch64/aarch64-sve2.md (@aarch64_sve_fscale<mode>): Added
new define_insn.
(@aarch64_sve_single_fscale<mode>): Likewise.
* config/aarch64/iterators.md: (SVE_Fx24_NOBF): Added new iterator,
similar to SVE_Fx24 but without brainfloat.
(SVE_Fx24): Updated to make use of SVE_Fx24_NOBF.
(SVSCALE_SINGLE_INTARG): Added new mode_attr.
(SVSCALE_INTARG): Likewise.
This patch adds the following intrinsics (all __arm_streaming only) along with
asm tests for them under the +sme2+fp8 flags:
- svfloat16x2_t svcvt1_f16[_mf8]_x2_fpm(svmfloat8_t zn, fpm_t fpm)
- svfloat16x2_t svcvt2_f16[_mf8]_x2_fpm(svmfloat8_t zn, fpm_t fpm)
- svbfloat16x2_t svcvt1_bf16[_mf8]_x2_fpm(svmfloat8_t zn, fpm_t fpm)
- svbfloat16x2_t svcvt2_bf16[_mf8]_x2_fpm(svmfloat8_t zn, fpm_t fpm)
- svfloat16x2_t svcvtl1_f16[_mf8]_x2_fpm(svmfloat8_t zn, fpm_t fpm)
- svfloat16x2_t svcvtl2_f16[_mf8]_x2_fpm(svmfloat8_t zn, fpm_t fpm)
- svbfloat16x2_t svcvtl1_bf16[_mf8]_x2_fpm(svmfloat8_t zn, fpm_t fpm)
- svbfloat16x2_t svcvtl2_bf16[_mf8]_x2_fpm(svmfloat8_t zn, fpm_t fpm)
gcc/
* config/aarch64/aarch64-sve-builtins-sve2.cc (svcvtl1, svcvtl2): Added
new FUNCTIONs.
* config/aarch64/aarch64-sve-builtins-sve2.def
(svcvt1, svcvt2, svcvtl1, svcvtl2): Added new DEF_SVE_FUNCTION_GS_FPM.
* config/aarch64/aarch64-sve-builtins-sve2.h (svcvtl1, svcvtl2): Added
new function_base.
* config/aarch64/aarch64-sve-builtins.cc
(function_resolver::resolve_unary): Use group_suffix_id when resolving
C overloads.
* config/aarch64/aarch64-sve2.md
(@aarch64_sve2_fp8_cvt_<fp8_cvt_uns_op><mode>): Added new define_insn.
* config/aarch64/aarch64.h (TARGET_SSME2_FP8): Added new define.
* config/aarch64/iterators.md
(UNSPEC_F1CVTL, UNSPEC_F2CVTL): Added new unspecs.
(FP8CVT_UNS): Extended int_iterator.
(fp8_cvt_uns_op): Likewise.
gcc/testsuite/
* g++.target/aarch64/sme2/aarch64-sme2-acle-asm.exp: Use tuning flag
to reduce churn in testsuites.
* gcc.target/aarch64/sme2/aarch64-sme2-acle-asm.exp: Likewise.
* gcc.target/aarch64/sme2/acle-asm/cvt_mf8_x2.c: Added test file.
* gcc.target/aarch64/sme2/acle-asm/cvtl_mf8_x2.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/test_sve_acle.h (TEST_X2_WIDE): Added
fpm0 argument for intrinsics.
In a GCC configuration with both AMD and NVIDIA GPU code offloading supported,
and the selected AMD GPU code generation not supporting USM, but a USM-capable
NVIDIA GPU available, I see all test cases that require effective-target
'omp_usm' turn UNSUPPORTED, because:
Executing on host: gcc usm_available_2778376.c [...]
[...]
In function 'main._omp_fn.0':
lto1: warning: Unified Shared Memory is required, but XNACK is disabled
lto1: note: Try -foffload-options=-mxnack=any
gcn mkoffload: warning: conflicting settings; XNACK is forced off but Unified Shared Memory is required
UNSUPPORTED: [...]
That warning is, however, not relevant in the scenario described above: we're
not going to exercise AMD GPU code offloading at run time.
With the effective-target 'omp_usm' check robustified like this, the affected
test cases are then no longer UNSUPPORTED, but of course, there's then the
corollary issue that compilation of the test case itself now emits the very
same warning, which results in the "test for excess errors" FAILing, despite
the execution test PASSing, for example:
FAIL: libgomp.c++/target-std__valarray-concurrent-usm.C (test for excess errors)
PASS: libgomp.c++/target-std__valarray-concurrent-usm.C execution test
That's clearly not ideal either (but is representative of what real-world usage
would run into), but is certainly better than the whole test case turning
UNSUPPORTED. To be continued, I guess...
Andrew Pinski [Tue, 23 Dec 2025 21:30:00 +0000 (13:30 -0800)]
ifcvt: Move noce_try_cond_zero_arith last
I noticed that on x86_64 and aarch64, noce_try_cond_zero_arith
would produce worse code than noce_try_cmove_arith.
So we should do noce_try_cond_zero_arith last instead
of before noce_try_cmove_arith.
Pushed as obvious after bootstrap/test on x86_64-linux-gnu.
Also checked to make sure riscv testcases still work.
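Why the order matters can be shown with a toy model in which transformations are tried in sequence and the first one that succeeds wins. `Strategy` and `first_success` are hypothetical; the actual control flow of noce_process_if_block is not reproduced here.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical model of one if-conversion attempt.
struct Strategy {
  std::string name;
  std::function<bool()> try_it;  // returns true when the attempt succeeds
};

// Try each strategy in order and report the first one that succeeds.
// Returns an empty string when none applies.
std::string first_success(const std::vector<Strategy> &order) {
  for (const Strategy &s : order)
    if (s.try_it())
      return s.name;
  return "";
}
```

When both attempts would succeed, placing noce_try_cmove_arith before noce_try_cond_zero_arith means the former is the one that gets used, and the latter only serves as a last resort.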
gcc/ChangeLog:
* ifcvt.cc (noce_process_if_block): Move noce_try_cond_zero_arith
last.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>