* config/loongarch/loongarch.cc
(loongarch_compare_version_priority): Returns true if DECL1
and DECL2 are versions of the same function.
(TARGET_COMPARE_VERSION_PRIORITY): Define.
* config/loongarch/genopts/gen-evolution.awk:
* config/loongarch/loongarch-evol-attr.def: Regenerate.
* config/loongarch/loongarch-protos.h
(loongarch_parse_fmv_features): Function declaration.
(get_feature_mask_for_version): Likewise.
* config/loongarch/loongarch-target-attr.cc
(enum features_prio): Define the priority of features.
(struct loongarch_attribute_info): Add members about
features.
(LARCH_ATTR_MASK): Likewise.
(LARCH_ATTR_ENUM): Likewise.
(LARCH_ATTR_BOOL): Likewise.
(loongarch_parse_fmv_features): Parse a function
multiversioning feature string STR.
* config/loongarch/loongarch.cc
(get_suffixed_assembler_name): Return an identifier for the
base assembler name of a versioned function.
(get_feature_mask_for_version): Get the mask and priority of
features.
(add_condition_to_bb): Insert the condition checks that select
between the feature-specific function versions.
(dispatch_function_versions): Generate the dispatch function for
multi-versioned functions.
(make_resolver_func): Make the resolver function decl to dispatch
the versions of a multi-versioned function.
(loongarch_generate_version_dispatcher_body): Generate the
dispatcher logic to invoke the right function version at run-time
for a given set of function versions.
(TARGET_GENERATE_VERSION_DISPATCHER_BODY): Define.
* common/config/loongarch/cpu-features.h: New file.
Implement TARGET_OPTION_VALID_VERSION_ATTRIBUTE_P for LoongArch.
This is used to determine whether the attribute ((target_version ("...")))
is valid and process it.
Define TARGET_HAS_FMV_TARGET_ATTRIBUTE to 0 to use "target_version"
for function versioning.
gcc/ChangeLog:
* config/loongarch/loongarch.cc
(loongarch_process_target_version_attr): New function.
(loongarch_option_valid_version_attribute_p): New function.
(TARGET_OPTION_VALID_VERSION_ATTRIBUTE_P): Define.
* config/loongarch/loongarch.h
(TARGET_HAS_FMV_TARGET_ATTRIBUTE): Define it to 0.
* config/loongarch/genopts/gen-evolution.awk: Output the
info needed for handling evolution features when parsing
the target pragma and attribute.
* config/loongarch/genopts/genstr.sh: Add support for
generating *.def files.
* config/loongarch/loongarch-target-attr.cc
(struct loongarch_attribute_info): Add a structure member to
record the option mask.
(LARCH_ATTR_MASK): New macro.
(LARCH_ATTR_ENUM): Likewise.
(LARCH_ATTR_BOOL): Likewise.
(loongarch_handle_option): Support for new options.
(loongarch_process_one_target_attr): Add support for
the la64v1.1 extended instruction set.
* config/loongarch/t-loongarch: Generate loongarch-evol-attr.def.
* doc/extend.texi: Add new attribute description information.
* config/loongarch/loongarch-evol-attr.def: Generate.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/pragma-la64V1_1.c: New test.
* gcc.target/loongarch/pragma-la64V1_1-2.c: New test.
Andrew Pinski [Mon, 10 Nov 2025 20:22:28 +0000 (12:22 -0800)]
ifcvt: Fix factor_out_operators for BIT_FIELD_REF and BIT_INSERT_EXPR [PR122629]
factor_out_operators factors out some expressions, but for
BIT_FIELD_REF and BIT_INSERT_EXPR this is only allowed for operand 0, as
the other operands need to be constant.
Bootstrapped and tested on x86_64-linux-gnu.
PR tree-optimization/122629
gcc/ChangeLog:
* tree-if-conv.cc (factor_out_operators): Reject
BIT_FIELD_REF and BIT_INSERT_EXPR if an operand other
than 0 differs.
gcc/testsuite/ChangeLog:
* gcc.dg/torture/pr122629-1.c: New test.
* gcc.dg/torture/pr122629-2.c: New test.
* gcc.dg/tree-ssa/pr122629-1.c: New test.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Jakub Jelinek [Tue, 11 Nov 2025 07:29:22 +0000 (08:29 +0100)]
gimplify-me: Fix regimplification of gimple-reg-type clobbers [PR122620]
Since r11-2238-ge443d8213864ac337c29092d4767224f280d2062 the C++ FE
emits clobbers like *_1 = {CLOBBER}; where *_1 MEM_REF has some scalar
type like int for -flifetime-dse={1,2} and most of the compiler manages
to cope with that.
If we are very unlucky, we trigger an ICE while trying to regimplify it
(at least during inlining), as happens with GCC 15.2 on firefox-145.0
built with LTO+PGO.
I haven't managed to reduce that to a small testcase that would ICE though,
the clobber certainly appears in code like
  template <typename T>
  struct S {
    T *p;
    union { char a; T b; };
    static S foo (T *x) { S s; s.p = x; s.b.~T (); return s; }
    ~S ();
  };
  void
  bar ()
  {
    int i = 42;
    S <int> s = S <int>::foo (&i);
  }
but convincing the inliner to set id->regimplify = true; on exactly
that stmt has been difficult.
The ICE is because we try (in two spots) to regimplify the rhs of the
gimple_clobber_p stmt if gimple-reg-type type (i.e. the TREE_CLOBBER),
because it doesn't satisfy the is_gimple_mem_rhs_or_call predicate
returned by rhs_predicate_for for the MEM_REF lhs. And regimplify it
by trying to gimplify SSA_NAME = {CLOBBER}; INIT_EXPR and in there reach
a special case which stores that freshly made SSA_NAME into memory and
loads it from memory, so uses a SSA_NAME without SSA_NAME_DEF_STMT.
Fixed thusly by saying clobbers are ok even for the gimple-reg-types.
2025-11-11 Jakub Jelinek <jakub@redhat.com>
PR lto/122620
* gimplify-me.cc (gimple_regimplify_operands): Don't try to regimplify
TREE_CLOBBER on rhs of gimple_clobber_p if it has gimple_reg_type.
Hu, Lin1 [Tue, 28 Oct 2025 08:11:47 +0000 (16:11 +0800)]
i386: Support C++ template parameters in AMX intrinsics [PR122446]
The AMX intrinsics previously used string concatenation with the '#'
operator to construct register names, which prevented their use with
C++ template non-type parameters. This patch converts all AMX intrinsics
to use inline assembly constraints with the %c format specifier.
The Intel-syntax register names also carried a % prefix; update the Intel
syntax to use plain register names without the % prefix.
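The limitation of '#' stringification can be shown without any AMX
hardware. The sketch below is only an illustration: the macro and function
names are hypothetical and are not the ones used in the intrinsics headers.
Stringification happens in the preprocessor, before templates are
instantiated, so a template parameter is turned into its spelling rather
than its value.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical macro mimicking the old approach: the register name is
// built by stringifying the macro argument with '#'.
#define TILE_NAME_STRINGIFIED(n) "tmm" #n

// With a literal argument the preprocessor sees the digit itself.
inline const char *tile_name_literal() { return TILE_NAME_STRINGIFIED(3); }

// With a template non-type parameter the preprocessor only sees the
// identifier N, so the resulting string does not depend on the value.
template <int N>
const char *tile_name_template() { return TILE_NAME_STRINGIFIED(N); }

// Small self-check of the two behaviours described above.
inline void stringify_demo() {
  assert(std::strcmp(tile_name_literal(), "tmm3") == 0);
  assert(std::strcmp(tile_name_template<3>(), "tmmN") == 0);
}
```

Passing the number as an operand of the inline assembly statement defers
the substitution to the compiler, which knows the value of N after
instantiation.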
Nathaniel Shead [Mon, 10 Nov 2025 12:41:25 +0000 (23:41 +1100)]
c++/modules: Propagate purviewness to all parent namespaces
In PR c++/100134, tsubst_friend_function was adjusted to ensure that
instantiating a friend function in an unopened namespace still correctly
marked the namespace as purview. This adjusts the fix to also apply
to nested namespaces.
gcc/cp/ChangeLog:
* pt.cc (tsubst_friend_function): Mark all parent namespaces as
purview if needed.
Store the 'rid' value in a local variable, and pass it to functions that
handle various keywords. This simplifies the code, and removes some
wrappers.
No functional change intended.
gcc/c/ChangeLog:
* c-parser.cc (c_parser_sizeof_expression): Remove function.
(c_parser_countof_expression): Remove function.
(c_parser_unary_expression): Store the 'rid', and pass it
directly to the function calls, without calling wrappers.
Suggested-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
Sam James [Sun, 9 Nov 2025 02:00:52 +0000 (02:00 +0000)]
gcc: quote some expressions in `test x...`
$gcc_cv_nm may contain a string with spaces since
r16-4178-g6051a849aa1e8e and r16-5013-gf8bb20167f8127. It was possible
for this to happen via strange user input in the past too.
`test x$gcc_cv_nm != x` therefore produces some noise like:
```
checking assembler for working .subsection -1...
/usr/m68k-unknown-linux-gnu/tmp/portage/sys-devel/gcc-16.0.9999/work/gcc-16.0.9999/gcc/configure: line 26132: test: syntax error: `--plugin' unexpected
```
Quote a bunch of such tests. I've drive-by quoted other such tests where
they're for a program and may have a similar problem, but not all other
such tests (that would be a much larger patch and is not strictly necessary).
Andrew Pinski [Sat, 8 Nov 2025 05:08:42 +0000 (21:08 -0800)]
builtins: Fix atomics expansion after build_call_nary change [PR122605]
Before r16-5089-gc66ebc3e22138, we could call build_call_nary with more
arguments than were passed as nargs. Afterwards we get an assertion failure
if the number of arguments is not exactly nargs.
In this case the code is easier to read when passing the correct number of args
in the first place.
This fixes the two places in builtins.cc where this pattern shows up.
Bootstrapped and tested on x86_64-linux-gnu (and tested the testcase with -m32 where
the failure showed up).
PR middle-end/122605
gcc/ChangeLog:
* builtins.cc (expand_ifn_atomic_bit_test_and): Split out the call to
build_call_nary into two different statements.
(expand_ifn_atomic_op_fetch_cmp_0): Likewise.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Sandra Loosemore [Thu, 30 Oct 2025 00:56:22 +0000 (00:56 +0000)]
Documentation for -fident and -Qy/-Qn options [PR122243]
I noticed that the comments for -fident in common.opt were garbled,
and its description is confusing; this is classed as a code generation
option rather than a preprocessor option, and it controls emission of all
".ident" directives in the assembly file, not just those inserted by the
"#ident" preprocessor directive. Also, the -Qy/-Qn options which have the
same effect as -fident/-fno-ident were documented as System V Options when
in fact they are available on all targets. Fixed thusly.
gcc/ChangeLog
PR other/122243
* common.opt: Clean up comments/documentation for -fident.
* doc/invoke.texi: Move -Qy/-Qn documentation from System V options
and combine with -fident/-fno-ident entry.
Sandra Loosemore [Wed, 29 Oct 2025 22:11:44 +0000 (22:11 +0000)]
Document linker options + -Q and -S [PR122243]
This patch adds documentation for several options that the GCC driver
passes to the linker via specs without further interpretation. I've
also added some comments/documentation strings to common.opt for these
and a couple other options that previously didn't have any.
GCC has long supported long-form command-line options with the same
meanings as its traditional one-character options, e.g. --output as an
alias for -o, --language for -x, and so on. However, these have never
been documented in the manual. This patch adds the missing
documentation for these options, plus some additional options that
have previously undocumented two-dash aliases with the same names as
the one-dash form (e.g., -dumpdir and --dumpdir).
Sandra Loosemore [Wed, 22 Oct 2025 01:58:01 +0000 (01:58 +0000)]
Only document -A/--assert options in cpp manual [PR122243]
Assertions are a preprocessor feature that has been declared obsolete
with strong warnings not to use them since 2001. The main GCC manual
documents the -A command-line option but doesn't include the section
that explains the purpose of the feature or that it is obsolete; that
material appears only in the preprocessor manual. It seems rather pointless
to clutter up the GCC manual with unhelpful documentation of an obsolete
feature, so I've restricted the option documentation to the
preprocessor manual too. I've also added the missing documentation
entries for the long form of the option, --assert.
gcc/ChangeLog
PR other/122243
* doc/cppopts.texi (-A): Restrict option documentation to the CPP
manual. Also document the --assert form.
* doc/invoke.texi (Option Summary): Don't list the -A option.
I noticed that several options (mostly C++ options, including those
for contracts) were documented in the manual but were not listed in
the corresponding option summary table. Besides adding the entries, I
also corrected the alphabetization in the C++ option table and some
formatting issues for option arguments.
gcc/ChangeLog
PR other/122243
* doc/invoke.texi (Option Summary): Add missing entries,
also correct alphabetization and formatting of the C++ options.
(C++ Language Options): Fix some formatting issues.
Sandra Loosemore [Tue, 28 Oct 2025 22:38:08 +0000 (22:38 +0000)]
Mark some undocumented options as such [PR122243]
We have a number of command-line options that are undocumented (either
intentionally or because they are obsolete and retained only for
compatibility), that ought to be marked as "Undocumented". I've also
added some comments to the .opt files.
gcc/c-family/ChangeLog
PR other/122243
* c.opt (fmodule-version-ignore): Mark as "Undocumented".
gcc/ChangeLog
PR other/122243
* common.opt (fhelp, fhelp=, ftarget-help, fversion): Mark as
"Undocumented".
(fbounds-check): Update comments.
(flag-graphite, fsel-sched-reschedule-pipelined): Mark as
"Undocumented".
(fstack-limit): Add comment.
Sandra Loosemore [Fri, 17 Oct 2025 15:11:47 +0000 (15:11 +0000)]
Add "RejectNegative" to some options where it doesn't make sense [PR122243]
This patch adds the "RejectNegative" property to several options where
a negative form doesn't make sense. These are either options of the form
"name=value" rather than an on/off switch, those that are already in a
"no-" form, or options that form a mutually-exclusive set.
Also, the fhelp, ftarget-help, and fversion options that do not take
arguments ignore the "no-" prefix so that even "-fno-help" (etc)
causes help to be printed instead of suppressing help output. Since that
behavior is not useful, I've added RejectNegative to those options as well.
Sandra Loosemore [Wed, 15 Oct 2025 23:30:00 +0000 (23:30 +0000)]
Add some missing @opindex entries [PR122243]
The options handled in this patch already have documentation but are
either missing an @opindex entry entirely, or index only the negative
option form.
Splitting a CONST_INT address into base and offset can be beneficial
when accessing multiple addresses in the same UBYTE region. The base
constant load can be shared among those accesses.
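As a rough illustration of the sharing argument, the sketch below splits
a constant address into a base and an unsigned-byte offset. It assumes the
region is 256 bytes wide and aligned, so the low eight bits form the
offset; the type and function names are hypothetical and do not come from
the PRU back end.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: a constant address expressed as base + offset.
struct SplitAddress {
  std::uint32_t base;    // would be loaded into a register once
  std::uint32_t offset;  // fits in an unsigned byte (0..255)
};

// Assumption: the low 8 bits are the offset, the remaining bits the base.
inline SplitAddress split_const_address(std::uint32_t addr) {
  return SplitAddress{addr & ~std::uint32_t{0xff}, addr & std::uint32_t{0xff}};
}

// Two accesses can share the base load when their bases are equal.
inline bool can_share_base(std::uint32_t a, std::uint32_t b) {
  return split_const_address(a).base == split_const_address(b).base;
}
```

Under these assumptions, accesses to 0x12345 and 0x123fe would share the
base 0x12300, while an access to 0x12400 would need a different base.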
There is no regression for single accesses per UBYTE memory region.
The transformation by TARGET_ADDR_SPACE_LEGITIMIZE_ADDRESS generates
practically equivalent code:
For PRU there is a small complication. While load/store instructions
support base+offset addressing, the call instructions do not.
But the TARGET_ADDR_SPACE_LEGITIMIZE_ADDRESS arguments do not show
which operation is using the address, so an invalid address is emitted for
call instructions to CONST_INT addresses. This is solved by fixing up
the call address operands during expansion.
PR target/122415
gcc/ChangeLog:
* config/pru/pru-protos.h (pru_fixup_jump_address_operand):
Declare.
* config/pru/pru.cc (pru_fixup_jump_address_operand): New
function.
(pru_addr_space_legitimize_address): New function.
(TARGET_ADDR_SPACE_LEGITIMIZE_ADDRESS): Declare.
* config/pru/pru.md (call): Fixup the address operand.
(call_value): Ditto.
(sibcall): Ditto.
(sibcall_value): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/pru/pr122415-1.c: New test.
* gcc.target/pru/pr122415-2.c: New test.
Tejas Belagod [Mon, 6 Jan 2025 05:53:44 +0000 (11:23 +0530)]
AArch64: Support C/C++ operations on svbool_t
Support a subset of C/C++ operations (bitwise, conditional etc.) on svbool_t.
gcc/c-family/ChangeLog:
* c-common.cc (c_build_vec_convert): Support vector boolean
types for __builtin_convertvector ().
gcc/c/ChangeLog:
* c-typeck.cc (build_binary_op): Support vector boolean types.
gcc/cp/ChangeLog:
* typeck.cc (cp_build_binary_op): Likewise.
* call.cc (build_conditional_expr): Support vector booleans.
* cvt.cc (ocp_convert): Call target hook to resolve conversion
between standard and non-standard booleans.
gcc/ChangeLog:
* config/aarch64/aarch64-sve-builtins.cc (register_builtin_types): Make
SVE vector boolean type equivalent to GNU vectors.
* config/aarch64/aarch64-sve.md (extend<vpred><mode>2,
zero_extend<vpred><mode>2, trunc<mode><vpred>2, vec_cmp<mode><mode>):
New patterns to support additional operations on predicate modes.
* config/aarch64/aarch64.cc (aarch64_valid_vector_boolean_op): New.
(aarch64_invalid_unary_op): Consider vector bool types.
(aarch64_invalid_binary_op): Likewise.
(aarch64_convert_to_type): Define target hook and handle standard to
non-standard bool conversion.
arm: Don't reject early mov?fcc patterns that we might be able to handle
The define_expand patterns for movdfcc, movsfcc and movhfcc had overly
tight constraints that could cause the compiler to reject these
patterns when re-ordering the operands could lead to a successful
match. Relax the initial predicate test and rely on the test after
arm_validize_comparison has run to determine whether this is something
we can support. This fixes some test failures which were introduced
in the fix for PR118460.
gcc/ChangeLog:
PR target/118460
* config/arm/arm.md (movhfcc): Use expandable_comparison_operator.
(movsfcc, movdfcc): Likewise.
Robin Dapp [Fri, 7 Nov 2025 14:54:52 +0000 (15:54 +0100)]
vect: Do not set range for step != 1 [PR121985].
In PR120922 we first disabled setting a range on niters_vector for
partial vectorization and later introduced a ceiling division instead.
In PR121985 we ran into this again where a bogus range caused wrong code
later. On top I saw several instances of this issue on a local branch
that enables more VLS length-controlled loops.
I believe we must not set niter_vector's range to TYPE_MAX / VF, no
matter the rounding due to the way niters_vector is used. It's not
really identical to the number of vector iterations but the actual
number the loop will iterate is niters_vector / step where step = VF
for partial vectors.
Thus, only set the range to TYPE_MAX / VF if step == 1.
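A small arithmetic sketch of why the bound is only safe for step == 1.
It models nothing but the relationship stated above (iterations =
niters_vector / step) with an assumed 8-bit iteration type and VF = 4; the
function names are hypothetical and this is not the vectorizer's code.

```cpp
#include <cassert>

// Number of times the loop body runs when the counter advances by `step`
// until it reaches niters_vector.
constexpr unsigned loop_iterations(unsigned niters_vector, unsigned step) {
  return niters_vector / step;
}

// The upper bound under discussion: TYPE_MAX / VF.
constexpr unsigned range_upper_bound(unsigned type_max, unsigned vf) {
  return type_max / vf;
}

// True if the value respects the TYPE_MAX / VF bound.
constexpr bool within_bound(unsigned niters_vector, unsigned type_max,
                            unsigned vf) {
  return niters_vector <= range_upper_bound(type_max, vf);
}
```

With TYPE_MAX = 255 and VF = 4 the bound is 63. A loop that runs 50 times
with step == 1 has niters_vector = 50, which fits. The same 50 iterations
with step == VF correspond to niters_vector = 200, which exceeds the bound
even though the iteration count itself does not.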
gcc/ChangeLog:
PR middle-end/121985
* tree-vect-loop-manip.cc (vect_gen_vector_loop_niters): Only
set niter_vector's range if step == 1.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/pr121985.c: New test.
Robin Dapp [Fri, 7 Nov 2025 09:21:36 +0000 (10:21 +0100)]
optabs: Do not pun modes smaller than QImode.
In can_vec_perm_const_p, if we cannot directly permute a vector mode, we
try to pun it with a byte mode. qimode_for_vec_perm gets the
mode size and uses that as the number of elements for the new QImode vector.
This doesn't work for RVV mask vectors, though. First, their
precision might be smaller than a byte and second, there is no
way to easily pun them. The most common way would be a vector select
from {0, 0, ...} and {1, 1, ...} vectors. Therefore this patch checks
if the perm's innermode precision is a multiple of QImode's precision.
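The check amounts to a divisibility test on bit precisions. A minimal
sketch, assuming a byte precision of 8 bits and using a hypothetical
predicate name (the real code works on machine modes, not plain integers):

```cpp
#include <cassert>

// Assumed precision of a byte-sized (QImode-like) element, in bits.
constexpr unsigned kByteBits = 8;

// Hypothetical predicate: a vector can be viewed as a vector of bytes only
// if each element's precision is a whole multiple of the byte precision.
constexpr bool can_pun_to_byte_vector(unsigned inner_precision_bits) {
  return inner_precision_bits != 0 && inner_precision_bits % kByteBits == 0;
}
```

A 32-bit element passes the test, while a mask element with a precision of
1 or 4 bits does not, so punning would be rejected for it.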
Bootstrapped and regtested on x86 and power10. Regtested on aarch64 and
riscv64.
gcc/ChangeLog:
* optabs-query.cc (qimode_for_vec_perm): Check if QImode's
precision divides the inner mode's precision.
Robin Dapp [Fri, 7 Nov 2025 16:18:02 +0000 (17:18 +0100)]
vect: Give up if there is no offset_vectype.
vect_gather_scatter_fn_p currently ICEs if offset_vectype is NULL.
This is an oversight in the patches that relax gather/scatter detection.
Catch this.
gcc/ChangeLog:
* tree-vect-data-refs.cc (vect_gather_scatter_fn_p): Bail if
offset_vectype is NULL.
Robin Dapp [Thu, 9 Oct 2025 15:25:59 +0000 (17:25 +0200)]
vect: Reduce group size of consecutive strided accesses.
Consecutive load permutations like {0, 1, 2, 3} or {4, 5, 6, 7} in a
group of 8 only read a part of the group, leaving a gap.
For strided accesses we can elide the permutation and, instead of
accessing the whole group, use the number of SLP lanes. This
effectively increases the vector size as we don't load gaps. On top we
do not need to emit the permutes at all.
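What "consecutive" means here can be sketched with a small predicate. The
function below is a hypothetical stand-in operating on a plain vector of
lane indices; it is not the vect_load_perm_consecutive_p implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true if the permutation selects adjacent lanes in increasing
// order starting at perm[0], e.g. {0, 1, 2, 3} or {4, 5, 6, 7}.
inline bool load_perm_consecutive(const std::vector<unsigned> &perm) {
  if (perm.empty())
    return false;
  for (std::size_t i = 1; i < perm.size(); ++i)
    if (perm[i] != perm[0] + i)
      return false;
  return true;
}
```

For a group of 8, both {0, 1, 2, 3} and {4, 5, 6, 7} satisfy the predicate
and cover only four lanes, whereas {0, 2, 4, 6} does not qualify.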
gcc/ChangeLog:
* tree-vect-slp.cc (vect_load_perm_consecutive_p): New function.
(vect_lower_load_permutations): Use.
(vect_optimize_slp_pass::remove_redundant_permutations): Use.
* tree-vect-stmts.cc (has_consecutive_load_permutation): New
function that uses vect_load_perm_consecutive_p.
(get_load_store_type): Use.
(vectorizable_load): Reduce group size.
* tree-vectorizer.h (struct vect_load_store_data): Add
subchain_p.
(vect_load_perm_consecutive_p): Declare.
Jakub Jelinek [Mon, 10 Nov 2025 11:52:45 +0000 (12:52 +0100)]
c++: Implement C++26 P3920R0 - Wording for NB comment resolution on trivial relocation
Trivial relocation was voted out of C++26, the following patch
removes it (note, the libstdc++ part was still waiting for patch review
and so doesn't need to be removed).
This isn't a mere revert of r16-2206; I've kept -Wc++26-compat option,
from earlier patches the non-terminal stays to be class-property-specifier,
and I had to partially revert also various follow-up changes, e.g. for
modules to handle the new flags and test them, for -Wkeyword-macro
etc. to diagnose the conditional keywords or the feature test macro
etc.
Jakub Jelinek [Mon, 10 Nov 2025 10:36:42 +0000 (11:36 +0100)]
c++: Diagnose #define/#undef indeterminate
While working on CWG3053 I've noticed I forgot to enable diagnostics
on #define indeterminate or #undef indeterminate now that it is handled
as a valid C++26 attribute.
2025-11-10 Jakub Jelinek <jakub@redhat.com>
gcc/cp/
* lex.cc (cxx_init): For C++26 call cpp_warn on "indeterminate".
gcc/testsuite/
* g++.dg/warn/Wkeyword-macro-1.C: Expect diagnostics on define/undef
of indeterminate.
* g++.dg/warn/Wkeyword-macro-2.C: Likewise.
* g++.dg/warn/Wkeyword-macro-4.C: Likewise.
* g++.dg/warn/Wkeyword-macro-5.C: Likewise.
* g++.dg/warn/Wkeyword-macro-7.C: Likewise.
* g++.dg/warn/Wkeyword-macro-8.C: Likewise.
Jakub Jelinek [Mon, 10 Nov 2025 10:34:20 +0000 (11:34 +0100)]
c++, libcpp: Implement CWG3053
The following patch implements CWG3053 approved in Kona, where it is now
valid not just to #define likely(a) or #define unlikely(a, b, c) but also
to #undef likely or #undef unlikely.
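A short sketch of the directives in question. The macro bodies and helper
functions are arbitrary illustrations; only the #define and #undef
directives matter here.

```cpp
#include <cassert>

// Function-like macros named after the attributes.
#define likely(a) (a)
inline bool is_positive(int x) { return likely(x > 0); }
// The #undef form is the part that CWG3053 additionally makes valid.
#undef likely

#define unlikely(a, b, c) ((a) + (b) + (c))
inline int sum_of_three() { return unlikely(1, 2, 3); }
#undef unlikely
```

After the #undef directives the names are ordinary identifiers again as
far as the preprocessor is concerned.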
2025-11-10 Jakub Jelinek <jakub@redhat.com>
libcpp/
* directives.cc: Implement CWG3053.
(do_undef): Don't pedwarn or warn about #undef likely or #undef
unlikely.
gcc/testsuite/
* g++.dg/warn/Wkeyword-macro-4.C: Don't diagnose for #undef likely
or #undef unlikely.
* g++.dg/warn/Wkeyword-macro-5.C: Likewise.
* g++.dg/warn/Wkeyword-macro-9.C: Likewise.
* g++.dg/warn/Wkeyword-macro-8.C: Likewise.
* g++.dg/warn/Wkeyword-macro-10.C: Likewise.
Lewis Hyatt [Wed, 30 Jul 2025 23:20:55 +0000 (19:20 -0400)]
libcpp: Improve locations for macros defined prior to PCH include [PR105608]
It is permissible to define macros prior to including a PCH, as long as
these definitions are disjoint from or identical to the macros in the
PCH. The PCH loading process replaces all libcpp data structures with those
from the PCH, so it is necessary to remember the extra macros separately and
then restore them after loading the PCH, which all is handled by
cpp_save_state() and cpp_read_state() in libcpp/pch.cc. The restoration
process consists of pushing a buffer containing the macro definition and
then lexing it from there, similar to how a command-line -D option is
processed. The current implementation does not attempt to set up the
line_map for this process, and so the locations assigned to the macros are
often not meaningful. (Similar to what happened in the past with lexing the
tokens out of a _Pragma string, lexing out of a buffer rather than a file
produces "sorta" reasonable locations that are often close enough, but not
reliably correct.)
Fix that up by remembering enough additional information (more or less, an
expanded_location for each macro definition) to produce a reasonable
location for the newly restored macros.
One issue that came up is the treatment of command-line-defined macros. From
the perspective of the generic line_map data structures, the command-line
location is not distinguishable from other locations; it's just an ordinary
location created by the front ends with a fake file name by convention. (At
the moment, it is always the string `<command-line>', subject to
translation.) Since libcpp needs to assign macros to that location, it
needs to know what location to use, so I added a new member
line_maps::cmdline_location for the front ends to set, similar to how
line_maps::builtin_location is handled.
This revealed a small issue, in c-opts.cc we have:
/* All command line defines must have the same location. */
cpp_force_token_locations (parse_in, line_table->highest_line);
But contrary to the comment, all command line defines don't actually end up
with the same location anymore. This is because libcpp/lex.cc has been
expanded (r6-4873) to include range information on the returned
locations. That logic has never been respecting the request of
cpp_force_token_locations. I believe this was not intentional, and so I have
corrected that here. Prior to this patch, the range logic has been leading
to command-line macros all having similar locations in the same line map (or
ad-hoc locations based from there for sufficiently long tokens); with this
change, they all have exactly the same location and that location is
recorded in line_maps::cmdline_location.
With that change, it works fine for pch.cc to restore macros whether
they came from the command-line or from the main file.
gcc/c-family/ChangeLog:
PR preprocessor/105608
* c-opts.cc (c_finish_options): Set new member
line_table->cmdline_location.
* c-pch.cc (c_common_read_pch): Adapt linemap usage to changes in
libcpp pch.cc; it is now possible that the linemap is in a different
file after returning from cpp_read_state().
libcpp/ChangeLog:
PR preprocessor/105608
* include/line-map.h: Add new member CMDLINE_LOCATION.
* lex.cc (get_location_for_byte_range_in_cur_line): Do not expand
the token location to include range information if token location
override was requested.
(warn_about_normalization): Likewise.
(_cpp_lex_direct): Likewise.
* pch.cc (struct saved_macro): New local struct.
(struct save_macro_data): Change DEFNS vector to hold saved_macro
rather than uchar*.
(save_macros): Adapt to remember the location information for each
saved macro in addition to the definition.
(cpp_prepare_state): Likewise.
(cpp_read_state): Use the saved location information to generate
proper locations for the restored macros.
gcc/testsuite/ChangeLog:
PR preprocessor/105608
* g++.dg/pch/line-map-3.C: Remove xfails.
* g++.dg/pch/line-map-4.C: New test.
* g++.dg/pch/line-map-4.Hs: New test.
Mark Wielaard [Sun, 9 Nov 2025 21:12:19 +0000 (22:12 +0100)]
Regenerate libgfortran Makefile.in and aclocal.m4
Commit a1fe2cfa8965 ("fortran: [PR121628]") regenerated libgfortran
Makefile.in and aclocal.m4 files with automake 1.15 instead of 1.15.1.
Run autoreconf version 2.69 with automake 1.15.1 inside libgfortran.
Eric Botcazou [Sat, 8 Nov 2025 18:15:46 +0000 (19:15 +0100)]
Ada: Fix bogus error on limited with clause and private parent package
The implementation of the 10.1.2(8/2-11/2) subclauses that establish rules
for the legality of "with" clauses of private child units is done separately
for regular "with" clauses (in Check_Private_Child_Unit) and for limited
"with" clauses (in Check_Private_Limited_Withed_Unit). The testcase, which
contains the regular and the "limited" version of the same pattern, exhibits
a disagreement between them; the former implementation is correct and the
latter is wrong in this case.
The patch fixes the problem and also cleans up the latter implementation by
aligning it with the former as much as possible.
gcc/ada/
PR ada/34374
* sem_ch10.adb (Check_Private_Limited_Withed_Unit): Use a separate
variable for the private child unit, streamline the loop locating
the nearest private ancestor, fix a too early termination of the
loop traversing the ancestor of the current unit, and use the same
privacy test as Check_Private_Child_Unit.
Philipp Tomsich [Sat, 8 Nov 2025 16:28:07 +0000 (09:28 -0700)]
[RISC-V] Add testcase for shifted truthvalue
I was doing some cleanup on our internal tree and noticed a pattern that I
didn't think was actually useful in practice. Thankfully the internal commit
included a testcase clearly targeting that pattern.
I'm upstreaming the testcase, but not the unnecessary pattern.
gcc/testsuite
* gcc.target/riscv/snez.c: New test.
Avinash Jayakar [Sat, 8 Nov 2025 04:27:59 +0000 (09:57 +0530)]
isel: Check bounds before converting VIEW_CONVERT to VEC_SET.
The function gimple_expand_vec_set_expr in the isel pass converted
VIEW_CONVERT_EXPR to VEC_SET_EXPR without checking the bounds on the index,
which caused an ICE on targets that support VEC_SET_EXPR, like x86 and powerpc.
This patch adds a bounds check on the index operand and rejects the conversion
if the index is out of bounds.
Lulu Cheng [Mon, 3 Nov 2025 09:53:52 +0000 (17:53 +0800)]
LoongArch: Fix PR122097 (2).
r16-4703 does not completely fix PR122097. Floating-point vectors
were not processed in the function loongarch_const_vector_same_bytes_p.
This patch will completely resolve this issue.
PR target/122097
gcc/ChangeLog:
* config/loongarch/loongarch.cc
(loongarch_const_vector_same_bytes_p): Add processing for
floating-point vector data.
Avinash Jayakar [Sat, 8 Nov 2025 02:53:31 +0000 (08:23 +0530)]
vect: Complete implementation for MULT_EXPR vector lowering.
Use sequences of shifts and add/sub if the hardware does not have support for
vector multiplication. In a previous patch, bare-bones vector lowering had been
implemented which only worked when the constant value was a power of 2.
In this patch, a few more cases have been added, i.e., if a constant is a uniform
vector but not a power of 2, then use choose_mult_variant, with the maximum cost
estimate set to the cost of a scalar multiplication operation times the number of
elements in the vector. This is similar to the logic used when expanding MULT_EXPR
in the expand pass or in the vector pattern recognition in tree-vect-patterns.cc.
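The kind of rewrite involved can be illustrated on scalars and then
applied lane by lane. This is a hand-written sketch for the constants 10
and 7 only; the helper names are hypothetical, and the actual sequence for
a given constant is chosen by choose_mult_variant based on cost.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// x * 10 == (x << 3) + (x << 1), modulo 2^32 for unsigned arithmetic.
inline std::uint32_t mul10(std::uint32_t x) { return (x << 3) + (x << 1); }

// x * 7 == (x << 3) - x, modulo 2^32 for unsigned arithmetic.
inline std::uint32_t mul7(std::uint32_t x) { return (x << 3) - x; }

// Apply the shift/add sequence to every lane, as a lowered vector multiply
// by the uniform constant 10 would.
template <std::size_t N>
std::array<std::uint32_t, N> mul10_lanes(std::array<std::uint32_t, N> v) {
  for (auto &lane : v)
    lane = mul10(lane);
  return v;
}
```

Because the same constant applies to every lane, one shift/add sequence
serves the whole vector, which is why the uniform-constant case is the one
that can be lowered this way.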
gcc/ChangeLog:
PR tree-optimization/122065
* tree-vect-generic.cc (target_supports_mult_synth_alg): Add helper to
check mult synth.
(expand_vector_mult): Optimize mult when const is uniform but not
power of 2.
Jerry DeLisle [Sat, 8 Nov 2025 02:46:54 +0000 (18:46 -0800)]
fortran: [PR121628]
The PR121628 deep-copy helper reused a static seen_derived_types set
across wrapper generation, so recursive allocatable arrays that appeared
multiple times in a derived type caused infinite compile-time recursion.
Save and restore the set around each wrapper build, polish follow-ups,
and add a regression test to keep the scenario covered.
gcc/fortran/ChangeLog:
PR fortran/121628
* trans-array.cc (seen_derived_types): Move to file scope and
preserve/restore around generate_element_copy_wrapper.
* trans-intrinsic.cc (conv_intrinsic_atomic_op): Reuse
gfc_trans_force_lval when forcing addressable CAF temps.
gcc/testsuite/ChangeLog:
PR fortran/121628
* gfortran.dg/alloc_comp_deep_copy_7.f90: New test.
libgfortran/ChangeLog:
PR fortran/121628
* Makefile.in: Keep continuation indentation within 80 columns.
* aclocal.m4: Regenerate.
* libgfortran.h: Drop unused forward declaration.
Signed-off-by: Christopher Albert <albert@tugraz.at>
Andrew Pinski [Fri, 7 Nov 2025 22:01:33 +0000 (14:01 -0800)]
sccp: Fix order of removal of phi (again) [PR122599]
This time we are gimplifying the expression and calling
fold_stmt during the gimplification (which is fine), but
since we removed the phi and the expression references ssa
names in the phi indirectly, things just fall over inside the ranger.
This moves the removal of the phi until after gimplification happens, as the
expression might refer back to the ssa name that the phi defines.
Pushed as obvious after bootstrap test on x86_64-linux-gnu.
PR tree-optimization/122599
gcc/ChangeLog:
* tree-scalar-evolution.cc (final_value_replacement_loop): Move
the removal of the phi until after the gimplification of the final
value expression.
gcc/testsuite/ChangeLog:
* gcc.dg/torture/pr122599-1.c: New test.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
gcc/analyzer/ChangeLog:
* checker-event.cc
(region_creation_event_allocation_size::print_desc): Fix missing
"else" leading to stray trailing "allocated here" text in events.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Andrew Pinski [Tue, 28 Oct 2025 05:22:08 +0000 (22:22 -0700)]
Move build_call_nary away from va_list
Instead of a va_list here we can create a std::initializer_list that contains the
arguments and pass that.
This is just one quick version of what was mentioned during the Reviewing refactoring
goals and acceptable abstractions.
The generated code should be similar or slightly better. Plus there is extra checking
of bounds of the std::initializer_list.
I didn't remove the n argument from build_call_nary at this stage as I didn't want to change
the calls to build_call_nary but I added a gcc_checking_assert to make sure the number passed
is the number of arguments.
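A minimal sketch of the idea, not the actual tree.h code: a variadic front end packs its trailing arguments into a std::initializer_list and forwards them to a non-variadic worker, asserting that the explicit count matches. The `node` alias and the `_sketch` function names are hypothetical stand-ins for GCC's `tree`, build_call and build_call_nary.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <vector>

// Hypothetical stand-in for GCC's `tree`; illustration only.
using node = int;

// Non-variadic worker: receives the callee and a bounds-aware argument list.
// The returned vector is [fn, arg0, arg1, ...] and only models a call node.
inline std::vector<node>
build_call_sketch (node fn, std::initializer_list<node> args)
{
  std::vector<node> call;
  call.reserve (args.size () + 1);
  call.push_back (fn);
  call.insert (call.end (), args.begin (), args.end ());
  return call;
}

// Variadic front end: keeps the explicit count `nargs` for existing callers
// but checks it against the number of arguments actually passed.
template <typename... Args>
std::vector<node>
build_call_nary_sketch (node fn, std::size_t nargs, Args... args)
{
  std::initializer_list<node> list = {args...};
  assert (list.size () == nargs);
  return build_call_sketch (fn, list);
}
```

A call such as build_call_nary_sketch (fn, 2, a, b) then needs no va_start/va_arg handling, and a mismatched count is caught by the assertion instead of reading past the arguments.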
Changes since v1:
* v2: Fix build_call's access of std::initializer_list.
gcc/ChangeLog:
* tree.cc (build_call_nary): Remove decl.
Add template definition that uses std::initializer_list<tree>
and call build_call.
(build_call): New declaration.
* tree.h (build_call_nary): Remove.
(build_call): New function.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Robin Dapp [Tue, 7 Oct 2025 15:17:22 +0000 (17:17 +0200)]
RISC-V: Remove gather scale and offset handling.
With the recent vectorizer changes upstream the vectorizer can take care
of offset extension and scaling (and its proper costing) itself.
Thus, we can remove all related handling in expand_gather_scatter and
set the predicates in the gather/scatter expanders to what our
instructions actually support.
gcc/ChangeLog:
* config/riscv/autovec.md: Use const_1_operand for scale and
extend predicates.
* config/riscv/riscv-v.cc (expand_gather_scatter): Remove scale
and extension handling.
Robin Dapp [Thu, 6 Nov 2025 08:14:35 +0000 (09:14 +0100)]
vect: Do not convert offset type in strided gather.
The gather/scatter relaxation patches introduced a bug with
vect_use_strided_gather_scatters_p. I didn't want to pass
supported_offset_vectype and supported scale all the way from
vect_truncate_gather_scatter_offset and
vect_use_strided_gather_scatters_p to get_load_store_type so
just called vect_gather_scatter_fn_p again afterwards to determine
the supported type and scale.
However, this doesn't take into account that
vect_use_strided_gather_scatters_p changes the offset type after
verifying that we can use gather/scatter.
The flow right now is
- vect_use_strided_gather_scatters_p calls vect_check_gather_scatter
with e.g. a char offset type.
- We actually need/support a short vector offset type and
vect_use_strided_gather_scatters_p fold converts the actual (scalar)
char offset to a short offset.
- We call vect_gather_scatter_fn_p with the new short offset instead of
the original char one, thinking we need an even larger offset type.
The last call is obviously not identical to the ones we used to check
gather/scatter in the first place and can fail if there is no offset
vectype.
There are several ways to fix this. The most obvious one is to bite the
bullet and just add the supported_offset_vectype and supported_scale to
all the intermediate functions. I wondered, however, if we need the
offset conversion at all. As far as I can tell we don't ever use
the scalar offset type and vect_get_strided_load_store_ops in particular
uses offset_vectype. Thus, this patch removes the conversion.
I bootstrapped and regtested this, before and after the relaxation
patches, on x86 and power10. Regtested on aarch64 and riscv.
gcc/ChangeLog:
* tree-vect-stmts.cc (vect_use_strided_gather_scatters_p):
Do not convert offset type.
Robin Dapp [Wed, 29 Oct 2025 15:02:51 +0000 (16:02 +0100)]
vect: Relax gather/scatter scale handling.
Similar to the signed/unsigned patch before, this one relaxes the
gather/scatter restrictions on scale factors. The basic idea is that a
natively unsupported scale factor can still be reached by emitting a
multiplication before the actual gather operation. As before, we need
to make sure that there is no overflow when multiplying.
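The scale relaxation can be sketched in scalar form, assuming a target whose gather only supports scale 1. The requested scale is folded into the offsets by a multiplication, and the transformation is rejected when a product does not fit the offset type. All names here are hypothetical and the 16-bit offset type is an arbitrary choice for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <optional>
#include <vector>

// Models a target gather that only supports scale 1:
// result[i] = data[offsets[i]].
inline std::vector<int>
native_gather_scale1 (const std::vector<int> &data,
                      const std::vector<std::uint16_t> &offsets)
{
  std::vector<int> result;
  result.reserve (offsets.size ());
  for (std::uint16_t off : offsets)
    result.push_back (data.at (off));
  return result;
}

// Emulates a gather with an unsupported scale by multiplying the offsets
// first.  Returns std::nullopt when a product would not fit the offset type.
inline std::optional<std::vector<int>>
gather_with_scale (const std::vector<int> &data,
                   const std::vector<std::uint16_t> &offsets,
                   std::uint16_t scale)
{
  const std::uint32_t max = std::numeric_limits<std::uint16_t>::max ();
  std::vector<std::uint16_t> scaled;
  scaled.reserve (offsets.size ());
  for (std::uint16_t off : offsets)
    {
      // Compute in a wider type so the overflow check itself cannot wrap.
      std::uint32_t product = std::uint32_t (off) * scale;
      if (product > max)
        return std::nullopt;
      scaled.push_back (static_cast<std::uint16_t> (product));
    }
  return native_gather_scale1 (data, scaled);
}
```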
Robin Dapp [Tue, 9 Sep 2025 09:41:51 +0000 (11:41 +0200)]
vect: Relax gather/scatter detection by swapping offset sign.
This patch adjusts vect_gather_scatter_fn_p to always check an offset
type with swapped signedness (vs. the original offset argument).
If the target supports the gather/scatter with the new offset type as
well as the conversion of the offset we now emit an explicit offset
conversion before the actual gather/scatter.
The relaxation is only done for the IFN path of gather/scatter and the
general idea roughly looks like:
- vect_gather_scatter_fn_p builds a list of all offset vector types
that the target supports for the current vectype. Then it goes
through that list, trying direct support first and sign-swapped
offset types next, taking precision requirements into account.
If successful it sets supported_offset_vectype to the type that actually
worked while offset_vectype_out is the type that was requested.
- vect_check_gather_scatter works as before but uses the relaxed
vect_gather_scatter_fn_p.
- get_load_store_type sets ls_data->supported_offset_vectype if the
requested type wasn't supported but another one was.
- check_load_store_for_partial_vectors uses the
supported_offset_vectype in order to validate what get_load_store_type
determined.
- vectorizable_load/store emit a conversion if
ls_data->supported_offset_vectype is nonzero and cost it.
The offset type is either of pointer size (if we started with a signed
offset) or twice the size of the original offset (when that one was
unsigned).
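The reason an unsigned offset needs a type of twice the size can be shown with a small scalar model. It assumes a target that only accepts signed 16-bit offsets and original offsets that are unsigned 8-bit. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Models a target whose gather only accepts signed 16-bit offsets.
inline std::vector<int>
native_gather_s16 (const std::vector<int> &data,
                   const std::vector<std::int16_t> &offsets)
{
  std::vector<int> result;
  result.reserve (offsets.size ());
  for (std::int16_t off : offsets)
    result.push_back (data.at (static_cast<std::size_t> (off)));
  return result;
}

// Explicit offset conversion emitted before the gather: an unsigned 8-bit
// offset is widened to a signed type of twice the size, so that every
// original value (0..255) stays representable.
inline std::vector<std::int16_t>
convert_offsets_u8_to_s16 (const std::vector<std::uint8_t> &offsets)
{
  return std::vector<std::int16_t> (offsets.begin (), offsets.end ());
}

// Gather with the requested (unsigned) offsets via the supported type.
inline std::vector<int>
gather_u8_offsets (const std::vector<int> &data,
                   const std::vector<std::uint8_t> &offsets)
{
  return native_gather_s16 (data, convert_offsets_u8_to_s16 (offsets));
}
```

An offset of 200 does not fit a signed 8-bit type, which is why a same-size sign swap would be wrong for unsigned inputs.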
gcc/ChangeLog:
* tree-vect-data-refs.cc (struct gather_scatter_config): New
struct to hold gather/scatter configurations.
(vect_gather_scatter_which_ifn): New function to determine which
IFN to use.
(vect_gather_scatter_get_configs): New function to enumerate all
target-supported configs.
(vect_gather_scatter_fn_p): Rework to use
vect_gather_scatter_get_configs and try sign-swapped offset.
(vect_check_gather_scatter): Use new supported offset vectype
argument.
* tree-vect-stmts.cc (check_load_store_for_partial_vectors):
Ditto.
(vect_truncate_gather_scatter_offset): Ditto.
(vect_use_grouped_gather): Ditto.
(get_load_store_type): Ditto.
(vectorizable_store): Convert to sign-swapped offset type if
needed.
(vectorizable_load): Ditto.
* tree-vectorizer.h (struct vect_load_store_data): Add
supported_offset_vectype.
(vect_gather_scatter_fn_p): Add argument.
Andrew Pinski [Thu, 6 Nov 2025 20:04:30 +0000 (12:04 -0800)]
forwprop: Handle already true/false branches in optimize_unreachable [PR122588]
When optimize_unreachable was moved from fab to forwprop, I missed that due to
the integrated copy prop, we might end up with an already true branch leading
to a __builtin_unreachable block. optimize_unreachable would switch around
the if, and things go downhill from there: since the other edge was already
marked as non-executable, forwprop didn't process those blocks and didn't
do copy prop into that block, and the original assignment statement was removed.
This fixes the problem by having optimize_unreachable not touch the if
statement if it was already changed to true/false.
Note I placed the testcase in gcc.c-torture/compile as gcc.dg/torture
is NOT currently testing -Og (see PR 122450 for that).
Changes since v1:
* v2: Add gimple testcase.
Bootstrapped and tested on x86_64-linux-gnu.
PR tree-optimization/122588
gcc/ChangeLog:
* tree-ssa-forwprop.cc (optimize_unreachable): Don't touch
if the condition was already true or false.
gcc/testsuite/ChangeLog:
* gcc.c-torture/compile/pr122588-1.c: New test.
* gcc.dg/tree-ssa/pr122588-1.c: New test.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Eric Botcazou [Fri, 7 Nov 2025 19:42:57 +0000 (20:42 +0100)]
Ada: Fix bogus error on inherited operation for extension of type instance
It comes from a small discrepancy between class-wide subtypes and types:
they both have unknown discriminants, but only the latter may have
discriminants, which causes Subtypes_Statically_Match to return False.
gcc/ada/
PR ada/83188
* sem_eval.adb (Subtypes_Statically_Match): Deal with class-wide
subtypes whose class-wide types have discriminants.
gcc/testsuite/
* gnat.dg/class_wide6.ads, gnat.dg/class_wide6.adb: New test.
* gnat.dg/class_wide6_pkg.ads: New helper.
David Faust [Thu, 6 Nov 2025 22:24:14 +0000 (14:24 -0800)]
bpf: improve memmove inlining [PR122140]
The BPF backend inline memmove expansion was broken for certain
constructs. This patch addresses the two underlying issues:
1. Off-by-one in the "backwards" unrolled move loop offset.
2. Poor use of temporary register for the generated move loop, which
could result in some of the loads performing the move to be optimized
away when the source and destination of the memmove are based off of
the same pointer.
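The first issue can be illustrated in plain C++. This is a sketch of a backwards byte move, not the RTL that bpf_expand_cpymem emits, and the function names are hypothetical. The last byte lives at offset n - 1, so a loop that starts at offset n is off by one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Copies `n` bytes from high addresses to low.  The first byte moved lives
// at offset n - 1; starting at offset n would be the classic off-by-one.
inline void
move_backwards (unsigned char *dst, const unsigned char *src, std::size_t n)
{
  for (std::size_t i = n; i > 0; --i)
    {
      // Complete the load into a temporary before issuing the store.
      unsigned char tmp = src[i - 1];
      dst[i - 1] = tmp;
    }
}

// Copies `n` bytes from low addresses to high.
inline void
move_forwards (unsigned char *dst, const unsigned char *src, std::size_t n)
{
  for (std::size_t i = 0; i < n; ++i)
    dst[i] = src[i];
}

// memmove-like dispatch: copy backwards only when dst overlaps the tail
// of src, otherwise a forward copy is safe.
inline void
memmove_sketch (void *dst_v, const void *src_v, std::size_t n)
{
  auto *dst = static_cast<unsigned char *> (dst_v);
  auto *src = static_cast<const unsigned char *> (src_v);
  auto d = reinterpret_cast<std::uintptr_t> (dst);
  auto s = reinterpret_cast<std::uintptr_t> (src);
  if (d > s && d - s < n)
    move_backwards (dst, src, n);
  else
    move_forwards (dst, src, n);
}
```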
gcc/
PR target/122140
* config/bpf/bpf.cc (bpf_expand_cpymem): Fix off-by-one offset
in backwards loop. Improve src and dest addrs used for the
branch condition.
(emit_move_loop): Improve emitted set insns and remove the
explicit temporary register.
Richard Biener [Thu, 6 Nov 2025 13:24:34 +0000 (14:24 +0100)]
tree-optimization/122577 - missed vectorization of conversion from bool
We are currently overly restrictive with rejecting conversions from
bit-precision entities to mode precision ones. Similar to RTL expansion
we can focus on non-bit operations producing bit-precision results
which we currently do not properly handle by masking. Such checks
should be already present. The following relaxes vectorizable_conversion.
Actual bitfield accesses are caught and rejected by vectorizer dataref
analysis and converted during if-conversion into mode-size accesses
with appropriate sign- or zero-extension.
PR tree-optimization/122577
* tree-vect-stmts.cc (vectorizable_conversion): Allow conversions
from non-mode-precision types.
Pan Li [Thu, 6 Nov 2025 05:19:20 +0000 (13:19 +0800)]
Match: Refactor bit_ior based unsigned SAT_MUL pattern by widen mul helper [NFC]
There are 3 kinds of widen_mul in the unsigned SAT_MUL pattern, namely
* widen_mul directly, like _3 w* _4
* convert and then widen_mul, like (uint64_t)_3 w* (uint64_t)_4
* convert and then mul, like (uint64_t)_3 * (uint64_t)_4
All of them are referenced by different forms of the unsigned
SAT_MUL pattern match, but we can wrap them into a helper
which presents the "widening_mul" semantics. With this helper, some
unnecessary patterns and duplicated code can be eliminated. As with the
min based pattern, this patch focuses on the bit_ior based pattern.
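As a rough illustration in source form, all three spellings compute the full-width product, so one helper can stand for them. The bit_ior form below ORs the truncated product with an all-ones mask when the upper half is nonzero. This is one plausible shape only; the exact GIMPLE forms matched in match.pd are not reproduced here, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Helper with "widening_mul" semantics: the full 64-bit product of two
// 32-bit operands, regardless of how the source spelled the widening.
inline std::uint64_t
widening_mul_u32 (std::uint32_t a, std::uint32_t b)
{
  return static_cast<std::uint64_t> (a) * static_cast<std::uint64_t> (b);
}

// One bit_ior based form of unsigned saturating multiplication.
inline std::uint32_t
sat_mul_u32 (std::uint32_t a, std::uint32_t b)
{
  std::uint64_t wide = widening_mul_u32 (a, b);
  // 1 when the product does not fit in 32 bits, else 0.
  std::uint32_t overflowed = (wide >> 32) != 0 ? 1u : 0u;
  // 0 - 1 wraps to all ones, 0 - 0 stays zero.
  std::uint32_t mask = 0u - overflowed;
  return static_cast<std::uint32_t> (wide) | mask;
}
```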
The below test suites are passed for this patch:
1. The rv64gcv full regression tests.
2. The x86 bootstrap tests.
3. The x86 full regression tests.
gcc/ChangeLog:
* match.pd: Leverage usmul_widen_mult by bit_ior based
unsigned SAT_MUL pattern.
Pan Li [Wed, 15 Oct 2025 14:16:11 +0000 (22:16 +0800)]
RISC-V: Combine vsext.vf2 and vsll.vi to vwsll.vi on ZVBB
The vwsll.vi of the zvbb extension zero-extends before the shift. But
we can still combine based on sign extension if and only if the
shift amount is an immediate and the sign-extended bits are all shifted out.
For example as below
vsetvli zero, zero, e32, m1, ta, ma
vsext.vf2 v1, v2
vsll.vi v1, v1, 16
If the shift amount is greater than or equal to the bit size of the
narrow source element (16 for e32), the sign- or zero-extended bits are
all shifted out and never pollute the final result. Then we have
vsetvli zero, zero, e32, m1, ta, ma
vwsll.vi v1, v2, 16
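The claim can be checked exhaustively on a scalar model of one element, assuming an e16 source widened to e32. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// vsext.vf2 followed by vsll.vi, modelled on one e16 -> e32 element.
// Requires shift < 32.
inline std::uint32_t
sext_then_shift (std::int16_t x, unsigned shift)
{
  // Sign-extend to 32 bits, then shift as unsigned to keep the shift
  // well defined for negative inputs.
  return static_cast<std::uint32_t> (static_cast<std::int32_t> (x)) << shift;
}

// vwsll.vi: zero-extend the narrow element, then shift.
inline std::uint32_t
zext_then_shift (std::int16_t x, unsigned shift)
{
  return static_cast<std::uint32_t> (static_cast<std::uint16_t> (x)) << shift;
}

// Exhaustively compares both forms for every 16-bit input.
inline bool
forms_agree (unsigned shift)
{
  for (std::int32_t v = INT16_MIN; v <= INT16_MAX; ++v)
    {
      auto x = static_cast<std::int16_t> (v);
      if (sext_then_shift (x, shift) != zext_then_shift (x, shift))
        return false;
    }
  return true;
}
```

With a shift of 16 or more the two forms agree for every input, while a shift of 15 leaves one sign-extension bit in the result and they differ for negative inputs.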
PR target/121959
The below test suites are passed for this patch series.
* The rv64gcv full regression test.
gcc/ChangeLog:
* config/riscv/autovec-opt.md (*vwsll_sign_extend_<mode>): Add
pattern to combine vsext.vf2 and vsll.vi to vwsll.vi.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/pr121959-1.c: New test.
* gcc.target/riscv/rvv/autovec/pr121959-2.c: New test.
* gcc.target/riscv/rvv/autovec/pr121959-3.c: New test.
* gcc.target/riscv/rvv/autovec/pr121959-4.c: New test.
* gcc.target/riscv/rvv/autovec/pr121959-5.c: New test.
* gcc.target/riscv/rvv/autovec/pr121959-run-1.c: New test.
* gcc.target/riscv/rvv/autovec/pr121959.h: New test.
Richard Biener [Fri, 7 Nov 2025 09:15:36 +0000 (10:15 +0100)]
tree-optimization/122589 - imm use iterator checking fallout
The following addresses the latent issue that gsi_replace_with_seq
causes debug info to unnecessarily degrade and in this process
break the new immediate use iterator sanity checking. In particular
gsi_remove has side-effects on debug stmts even when operating
in non-permanent operation. But as we are operating on a sequence
not in the IL here this should be avoided. Re-factoring
gsi_replace_with_seq to not rely on gsi_remove fulfills this.
I've noticed gsi_split_seq_before has misleading documentation.
Fixed thereby as well.
PR tree-optimization/122589
PR middle-end/122594
* gimple-iterator.cc (gsi_replace_with_seq): Instead of
removing the last stmt from the sequence with gsi_remove,
split it using gsi_split_seq_before.
(gsi_split_seq_before): Fix bogus documentation.
Alfie Richards [Wed, 15 Oct 2025 13:34:55 +0000 (13:34 +0000)]
aarch64: Add support for preserve_none function attribute [PR target/118328]
When applied to a function preserve_none changes the procedure call standard
such that all registers except stack pointer, frame register, and link register
are caller saved. Additionally, it changes the argument passing registers.
PR target/118328
gcc/ChangeLog:
* config/aarch64/aarch64.cc (handle_aarch64_vector_pcs_attribute):
Add handling for ARM_PCS_PRESERVE_NONE.
(aarch64_pcs_exclusions): New definition.
(aarch64_gnu_attributes): Add entry for preserve_none and add
aarch64_pcs_exclusions to aarch64_vector_pcs entry.
(aarch64_preserve_none_abi): New function.
(aarch64_fntype_abi): Add handling for preserve_none.
(aarch64_reg_save_mode): Add handling for ARM_PCS_PRESERVE_NONE.
(aarch64_hard_regno_call_part_clobbered): Add handling for
ARM_PCS_PRESERVE_NONE.
(num_pcs_arg_regs): New helper function.
(get_pcs_arg_reg): New helper function.
(aarch64_function_ok_for_sibcall): Add handling for ARM_PCS_PRESERVE_NONE.
(aarch64_layout_arg): Add preserve_none argument layout.
(function_arg_preserve_none_regno_p): New helper function.
(aarch64_function_arg): Update to handle preserve_none.
(function_arg_preserve_none_regno_p): Update logic for preserve_none.
(aarch64_expand_builtin_va_start): Add preserve_none layout.
(aarch64_setup_incoming_varargs): Add preserve_none layout.
(aarch64_is_variant_pcs): Update for case of ARM_PCS_PRESERVE_NONE.
(aarch64_comp_type_attributes): Add preserve_none.
* config/aarch64/aarch64.h (NUM_PRESERVE_NONE_ARG_REGS): New macro.
(PRESERVE_NONE_REGISTERS): New macro.
(enum arm_pcs): Add ARM_PCS_PRESERVE_NONE.
* doc/extend.texi (preserve_none): Add docs for new attribute.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/preserve_none_1.c: New test.
* gcc.target/aarch64/preserve_none_mingw_1.c: New test.
* gcc.target/aarch64/preserve_none_2.c: New test.
* gcc.target/aarch64/preserve_none_3.c: New test.
* gcc.target/aarch64/preserve_none_4.c: New test.
* gcc.target/aarch64/preserve_none_5.c: New test.
* gcc.target/aarch64/preserve_none_6.c: New test.
Pan Li [Sun, 26 Oct 2025 07:21:15 +0000 (15:21 +0800)]
RISC-V: Combine vec_duplicate + vwmaccu.vv to vwmaccu.vx on GR2VR cost
This patch combines the vec_duplicate + vwmaccu.vv into
vwmaccu.vx, as in the example code below. The related pattern depends
on the cost of vec_duplicate from GR2VR. The late-combine pass will
take action if the cost of GR2VR is zero, and reject the combination
if the GR2VR cost is greater than zero.
Assume we have asm code like below, GR2VR cost is 0.
After this patch:
beq a3,zero,.L8
...
.L3:
vsetvli a5,a3,e32,m1,ta,ma
...
vwmaccu.vx v1,a2,v3
...
bne a3,zero,.L3
Unfortunately, and similarly to vwaddu.vv, only widening from uint32_t to
uint64_t has the necessary zero-extend during combine; we lose the
extend op after expand for any other types.
gcc/ChangeLog:
* config/riscv/autovec-opt.md (*widen_mul_plus_vx_<mode>): Add
new pattern to combine the vwmaccu.vx.
* config/riscv/vector.md (*pred_widen_mul_plus_u_vx<mode>_undef):
Add undef define_insn for vwmaccu.vx emitting.
(@pred_widen_mul_plus_u_vx<mode>): Ditto.
When the mode of the destination operand selected by the condition
is SImode, explicit sign extension is applied to both selected
source operands, and the destination operand is marked as
sign-extended.
This method can eliminate some of the sign extension instructions
caused by conditional selection optimization.
gcc/ChangeLog:
* config/loongarch/loongarch.cc
(loongarch_sign_extend_if_subreg_prom_p): Determine if the
current operand is SUBREG and if the source of SUBREG is
the sign-extended value.
(loongarch_expand_conditional_move): Optimize.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/sign-extend-4.c: New test.
* gcc.target/loongarch/sign-extend-5.c: New test.
Lulu Cheng [Thu, 12 Dec 2024 08:21:38 +0000 (16:21 +0800)]
LoongArch: Implement sge and sgeu.
The original implementation of the function loongarch_extend_comparands
only prevented op1 from being loaded into the register when op1 was
const0_rtx. It has now been modified so that op1 is not loaded into
the register as long as op1 is an immediate value. This allows
slt{u}i to be generated instead of slt{u} if the conditions are met.
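As a hedged illustration of how a "set on greater or equal" can be built from a "set on less than": x >= y is the negation of x < y, so the result of slt{u} or slt{u}i can have its low bit flipped. Whether the new template emits exactly this sequence is an assumption here, and the function names are hypothetical models, not the instructions themselves.

```cpp
#include <cassert>
#include <cstdint>

// Models slt / slti: 1 if a < b as signed values, else 0.
inline std::int64_t
slt (std::int64_t a, std::int64_t b)
{
  return a < b ? 1 : 0;
}

// Models sltu / sltui: 1 if a < b as unsigned values, else 0.
inline std::int64_t
sltu (std::uint64_t a, std::uint64_t b)
{
  return a < b ? 1 : 0;
}

// sge modelled as "set less than" followed by flipping bit 0.
inline std::int64_t
sge (std::int64_t a, std::int64_t b)
{
  return slt (a, b) ^ 1;
}

// sgeu modelled the same way on the unsigned comparison.
inline std::int64_t
sgeu (std::uint64_t a, std::uint64_t b)
{
  return sltu (a, b) ^ 1;
}
```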
gcc/ChangeLog:
* config/loongarch/loongarch.cc
(loongarch_canonicalize_int_order_test): Support GT GTU LT
and LTU.
(loongarch_extend_comparands): Expand the scope of op1 from
0 to all immediate values.
* config/loongarch/loongarch.md
(*sge<u>_<X:mode><GPR:mode>): New template.
AArch64, ARM: Clean up documentation of -mbranch-protection.
While working on other things, I noticed that the documentation for
the -mbranch-protection= option was pretty garbled on both aarch64 and
arm targets, with incorrect markup, too much syntax crammed into the
option summary, and confusion about which values the "+leaf" modifier
can apply to. I rewrote it to list all the valid option values
explicitly in the option description, checking this against the
implementation.
gcc/ChangeLog
* doc/invoke.texi (AArch64 Options): Clean up description of
-mbranch-protection= argument.
(ARM Options): Likewise.
Jerry DeLisle [Thu, 6 Nov 2025 20:44:18 +0000 (12:44 -0800)]
fortran: [PR121628]
This patch fixes PR121628 by implementing proper deep copy semantics for
derived types containing recursive allocatable array components, in
compliance with Fortran 2018+ standards.
The original implementation would generate infinitely recursive code at
compile time when encountering self-referential derived types with
allocatable components (e.g., type(t) containing allocatable type(t)
arrays). This patch solves the problem by generating a runtime helper
function that performs element-wise deep copying, avoiding compile-time
recursion while maintaining correct assignment semantics.
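The approach can be sketched outside the compiler: instead of expanding the copy of a self-referential type inline (which never terminates at compile time), a helper recurses at run time, one element at a time. This is an analogy in C++ under stated assumptions, not the generated code; rec_t and both function names are hypothetical, and self-assignment is not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Stand-in for a derived type with a recursive allocatable array component,
// e.g. type(t) containing an allocatable array of type(t).
struct rec_t
{
  int value = 0;
  std::size_t count = 0;              // number of elements in `children`
  std::unique_ptr<rec_t[]> children;  // null models "not allocated"
};

inline void deep_copy_element (rec_t &dst, const rec_t &src);

// Runtime helper: allocates the destination array and copies element by
// element, recursing at run time instead of expanding the copy inline.
inline void
deep_copy_array (std::unique_ptr<rec_t[]> &dst, const rec_t *src,
                 std::size_t count)
{
  dst.reset (src != nullptr ? new rec_t[count] : nullptr);
  for (std::size_t i = 0; i < count; ++i)
    deep_copy_element (dst[i], src[i]);
}

// Copies the scalar parts, then hands the recursive component to the helper.
inline void
deep_copy_element (rec_t &dst, const rec_t &src)
{
  dst.value = src.value;
  dst.count = src.count;
  deep_copy_array (dst.children, src.children.get (), src.count);
}
```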
The trans-intrinsic.cc change enhances handling of constant values in
coarray atomic operations to ensure temporary variables are created when
needed, avoiding invalid address-of-constant expressions.
gcc/fortran/ChangeLog:
PR fortran/121628
* trans-array.cc (get_copy_helper_function_type): New function to
create function type for element copy helpers.
(get_copy_helper_pointer_type): New function to create pointer type
for element copy helpers.
(generate_element_copy_wrapper): New function to generate runtime
helper for element-wise deep copying of recursive types.
(structure_alloc_comps): Detect recursive allocatable array
components and use runtime helper instead of inline recursion.
Add includes for cgraph.h and function.h.
* trans-decl.cc (gfor_fndecl_cfi_deep_copy_array): New declaration
for runtime deep copy helper.
(gfc_build_builtin_function_decls): Initialize the runtime helper
declaration.
* trans-intrinsic.cc (conv_intrinsic_atomic_op): Enhance handling of
constant values in coarray atomic operations by detecting and
materializing address-of-constant expressions.
* trans.h (gfor_fndecl_cfi_deep_copy_array): Add external declaration.
libgfortran/ChangeLog:
PR fortran/121628
* Makefile.am: Add runtime/deep_copy.c to source files.
* Makefile.in: Regenerate.
* gfortran.map: Export _gfortran_cfi_deep_copy_array symbol.
* libgfortran.h: Add prototype for internal_deep_copy_array.
* runtime/deep_copy.c: New file implementing runtime deep copy
helper for recursive allocatable array components.
gcc/testsuite/ChangeLog:
PR fortran/121628
* gfortran.dg/alloc_comp_deep_copy_5.f90: New test for recursive
allocatable array deep copy.
* gfortran.dg/alloc_comp_deep_copy_6.f90: New test for multi-level
recursive allocatable deep copy.
* gfortran.dg/array_memcpy_2.f90: Fix test with proper allocation.
Signed-off-by: Christopher Albert <albert@tugraz.at>
Eric Botcazou [Thu, 6 Nov 2025 19:42:13 +0000 (20:42 +0100)]
Ada: Fix function call in object notation incorrectly rejected
This happens in the name of a procedure call, again when there
is an implicit dereference in this name, and the fix to apply to
Find_Selected_Component is again straightforward:
--- a/gcc/ada/sem_ch8.adb
+++ b/gcc/ada/sem_ch8.adb
@@ -8524,9 +8524,7 @@ package body Sem_Ch8 is
-- Error if the prefix is procedure or entry, as is P.X
if Ekind (P_Name) /= E_Function
- and then
- (not Is_Overloaded (P)
- or else Nkind (Parent (N)) = N_Procedure_Call_Statement)
+ and then not Is_Overloaded (P)
then
-- Prefix may mention a package that is hidden by a local
-- declaration: let the user know. Scan the full homonym
But this also changes the diagnostics in illegal cases because they are not
uniform in the procedure, so the change also factors them out so as to make
them uniform, which slightly improves them in the end.
gcc/ada/
PR ada/113352
* sem_ch4.adb (Diagnose_Call): Tweak error message.
* sem_ch8.adb (Find_Selected_Component): Remove bypass for calls
to procedures in the overloaded overloadable case. Factor out
the diagnostics code and invoke it uniformly in this case.
gcc/testsuite/
* gnat.dg/prefix3.adb: New test.
* gnat.dg/prefix3_pkg.ads: New helper.
* gnat.dg/prefix3_pkg.adb: Likewise.
Due to some quirks in crtstuff.c, attribute "retain" requires
some features that avr doesn't implement -- even though it
doesn't even use crtstuff. This patch works around that.
PR target/122516
gcc/
* config/avr/elf.h (SUPPORTS_SHF_GNU_RETAIN): Define if
HAVE_GAS_SHF_GNU_RETAIN.
In particular note insn 34, 42 and 43. Those are useless. Insns 36, 37, 38
are just a single bit extraction from a variable location (from one of the
if-converted blocks). I couldn't see a good way to fix the problem with insn
34/insn 42. The desire to make the then/else blocks independent in cmove_arith
is good, what's unclear is whether or not that code really cares about the
*destination* of the then/else blocks. But I set that aside.
We then thought that cleaning up the variable bit extraction would be the way
to go. So a pattern was constructed to match that form of variable bit extract
and the cost model was twiddled to return that it was a single fast
instruction. But even with those changes fwprop1 refused to make the
substitution. Sigh. At least combine recognizes the idiom later and cleans it
up.
Then we realized we really should just ignore the (set (reg) (const_int 0)) in
the if-converted sequence. We're going to be able to propagate that away in
nearly every case since we have a hard-wired zero register. Sure enough,
ignoring that insn was enough to tip the balance on this case and we get the
desired code.
Tested on riscv32-elf and riscv64-elf. Pioneer bootstrap is in flight, though
it won't really exercise this problem. The BPI's build hasn't started yet, so
it'll be at least 27 hours before it's done.
Waiting on pre-commit CI before moving forward.
gcc/
* config/riscv/riscv.cc (riscv_noce_conversion_profitable_p): Ignore
assignments of (const_int 0) to a register. They will get propagated
away.
gcc/testsuite
* gcc.target/riscv/czero-bext.c: New test.
Eric Botcazou [Thu, 6 Nov 2025 19:03:49 +0000 (20:03 +0100)]
Ada: Fix incorrect renaming of primitive subprogram in object notation
It is possible to declare a subprogram renaming whose name is a primitive
subprogram in object notation; in this case, the name is unconditionally
evaluated in the front-end (unlike for objects) so that, if an ad-hoc body
needs to be built for the renaming later, the name is not reevaluated for
every call to it.
This evaluation is skipped if the name contains an implicit dereference,
as reported in the first PR, and the fix is to make the dereference explicit
at the end of the processing done in Analyze_Renamed_Primitive_Operation,
as is done in the sibling procedure Analyze_Renamed_Entry. The patch also
makes a few consistency tweaks to them and also replaces a manual evaluation
of the name in Expand_N_Subprogram_Renaming_Declaration by a simple call to
Evaluate_Name, which is the procedure used for object renamings.
Analyze_Renamed_Primitive_Operation performs the resolution of the name
based on the declared profile, but it does not do that correctly in all
cases, as reported in the second PR; the fix is again straightforward.
gcc/ada/
PR ada/113350
PR ada/113551
* exp_ch2.adb (Expand_Renaming): Fix reference to Evaluate_Name.
* exp_ch8.adb (Expand_N_Subprogram_Renaming_Declaration): Call
Evaluate_Name to evaluate the name.
* sem_ch8.adb (Analyze_Renamed_Entry): Minor tweaks.
(Analyze_Renamed_Family_Member): Likewise.
(Analyze_Renamed_Primitive_Operation): Likewise.
Fix thinko in the function checking profile conformance, save the
result of the resolution and make implicit dereferences explicit.
gcc/testsuite
* gnat.dg/renaming19.adb: New test.
* gnat.dg/renaming19_pkg.ads: New helper.
* gnat.dg/renaming19_pkg.adb: Likewise.
AVR: AVR-SD: Put a valid opcode prior to gs() table in .subsection 1.
On functional safety devices (AVR-SD), each executed instruction must
be followed by a valid opcode. This is because instruction fetch and
decode for the next instruction runs while the 2-stage pipeline is
executing the current instruction.
There is only one case where avr-gcc generates code interspersed with
data, which is when a switch/case table is generated for a function
with a "section" attribute and AVR_HAVE_JMP_CALL. In that case, the
table with the gs() code label addresses is put in .subsection 1 so
that it belongs to the section as specified by the "section" attribute.
gcc/
* config/avr/avr.cc (avr_output_addr_vec): Output
a valid opcode prior to the first gs() label provided:
- The code is compiled for an arch that has AVR-SD mcus, and
- the function has a "section" attribute, and
- the function has a gs() label addresses switch/case table.
Your Name [Thu, 6 Nov 2025 16:50:22 +0000 (09:50 -0700)]
[RISC-V][PR 121136] Improve various tests which only need to examine upper bits in a GPR
So pre-commit CI flagged an issue with the initial version of this patch. In
particular the cmp-mem-const-{1,2} tests are failing.
I didn't see that in my internal testing, but that well could be an artifact of
having multiple patches touching in the same broad space that the tester is
evaluating. If I apply just this patch I can trigger the cmp-mem-const{1,2}
failures.
The code we're getting now is actually better than we were getting before, but
the new patterns avoid the path through combine that emits the message about
narrowing the load down to a byte load, hence the failure.
Given we're getting better code now than before, I'm just skipping this test on
risc-v. That's the only non-whitespace change since the original version of
this patch.
--
This addresses the first level issues seen in generating better performing code
for testcases derived from pr121136. It likely regresses code size in some
cases as in many cases it selects code sequences that should be better
performing, though larger to encode.
Improving -Os code generation should remain the primary focus of pr121136. Any
improvements in code size with this change are a nice side effect, but not the
primary goal.
--
Let's take this test (derived from the PR):
_Bool func1_0x1U (unsigned int x) { return x <= 0x1U; }
Those should produce the same output. We currently get these fragments for the
3 cases. In particular note how the second variant is a two instruction
sequence.
sltiu a0,a0,2
srliw a0,a0,1
seqz a0,a0
sltiu a0,a0,2
This patch will adjust that second sequence to match the first and third,
which are optimal.
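The underlying equivalence is easy to check in source form. The three spellings below are an illustration chosen here and are not necessarily the exact variants from the PR: each is true exactly when no bit above bit 0 is set.

```cpp
#include <cassert>
#include <cstdint>

// Direct comparison, as in func1_0x1U.
inline bool
le_1_compare (std::uint32_t x)
{
  return x <= 0x1U;
}

// Test of the upper bits against zero (the srliw + seqz shape).
inline bool
le_1_shift (std::uint32_t x)
{
  return (x >> 1) == 0;
}

// Unsigned less-than against the next constant (the single sltiu shape).
inline bool
le_1_sltiu (std::uint32_t x)
{
  return x < 2;
}

// True when all three spellings give the same answer for `x`.
inline bool
variants_agree (std::uint32_t x)
{
  return le_1_compare (x) == le_1_shift (x)
         && le_1_shift (x) == le_1_sltiu (x);
}
```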
Let's take another case. This is interesting as it's right at the simm12
border:
_Bool func1_0x7ffU (unsigned long x) { return x <= 0x7ffU; }
In this case the second sequence is pretty good. Not perfect, but clearly
better than the other two. This patch will fix the code for case #1 and case
So anyway, that's the basic motivation here. So to be 100% clear, while the
bug is focused on code size, I'm focused on the performance of the resulting
code.
This has been tested on riscv32-elf and riscv64-elf. It's also bootstrapped
and regression tested on the Pioneer. The BPI won't have results for this
patch until late tomorrow.
--
PR rtl-optimization/121136
gcc/
* config/riscv/riscv.md: Add define_insn to test the
upper bits of a register against zero using sltiu when
the bits are extracted via zero_extract or logical right shift.
Add 3->2 define_splits for gtu/leu cases testing upper bits
against zero.
gcc/testsuite
* gcc.target/riscv/pr121136.c: New test.
* gcc.dg/cmp-mem-const-1.c: Skip for risc-v.
* gcc.dg/cmp-mem-const-2.c: Likewise.
Robert Dubner [Thu, 6 Nov 2025 12:26:18 +0000 (07:26 -0500)]
cobol: Mainly extends compilation and execution in finternal-ebcdic.
We expanded our extended testing regime to execute many testcases in
EBCDIC mode as well as in ASCII. This exposed hundreds of problems in
both compilation (where conversions must be made between the ASCII
source code and the EBCDIC execution environment) and in run-time
functionality, where results from calls to system routines and internal
calculations that must be done in ASCII have to be converted to EBCDIC.
These changes also switch to using FIXED_WIDE_INT(128) instead of
REAL_VALUE_TYPE when initializing fixed-point COBOL variable types.
This provides for accurate initialization up to 37 digits, instead of
losing accuracy after 33 digits.
These changes also support the implementation of the COBOL DELETE FILE
(Format 2) statement.
These changes also introduce expanded support for specifying character
encodings, including support for locales.
co-authored-by: Robert Dubner <rdubner@symas.com>
co-authored-by: James K. Lowden <jklowden@cobolworx.com>
gcc/cobol/ChangeLog:
* Make-lang.in: Repair documentation generation.
* cdf.y: Changes to tokens.
* cobol1.cc (cobol_langhook_handle_option): Add comment.
* genapi.cc (function_pointer_from_name): Use data.original() for
function name.
(parser_initialize_programs): Likewise.
(cobol_compare): Make sure encodings of comparands are the same.
(move_tree): Change name of DEFAULT_SOURCE_ENCODING macro.
(parser_enter_program): Typo.
(psa_FldLiteralN): Break out dirty_to_binary() support routine.
(dirty_to_binary): Likewise.
(parser_alphabet): Rename 'alphabet' to 'collation_sequence'.
(parser_allocate): Change wsclear() to be uint32_t instead of char.
(parser_label_label): Formatting.
(parser_label_goto): Likewise.
(get_the_filename): Breakout get_the_filename(), which handles
encoding.
(parser_file_open): Likewise.
(set_up_delete_file_label): Implement DELETE FILE (Format 2).
(parser_file_delete_file): Likewise.
(parser_file_delete_on_exception): Likewise.
(parser_file_delete_not_exception): Likewise.
(parser_file_delete_end): Likewise.
(parser_call): Use data.original().
(parser_entry): Use data.original().
(mh_source_is_literalN): Convert from
sourceref.field->codeset.encoding.
(binary_initial_from_float128): Change to "binary_initial".
(binary_initial): Calculate in FIXED_WIDE_INT(128) instead of
REAL_VALUE_TYPE.
(digits_from_int128): New routine uses binary_initial.
(digits_from_float128): Removed. Kept as comment for reference.
(initial_from_initial): Use binary_initial.
(actually_create_the_static_field): Use correct encoding.
(parser_symbol_add): Likewise.
* genapi.h (parser_file_delete_file): Implement FILE DELETE.
(parser_file_delete_on_exception): Implement FILE DELETE.
(parser_file_delete_not_exception): Implement FILE DELETE.
(parser_file_delete_end): Implement FILE DELETE.
* genmath.cc: Include charmaps.h.
* genutil.cc (get_literal_string): Change name of
DEFAULT_SOURCE_ENCODING macro.
* parse.y: Token changes; numerous changes in support of encoding;
support for DELETE FILE.
* parse_ante.h (name_of): Use data.original().
(class prog_descr_t): Support of locales.
(current_options): Formatting.
(current_encoding): Formatting.
(current_program_index): Formatting.
(current_section): Formatting.
(current_paragraph): Formatting.
(is_integer_literal): Use correct encoding.
(value_encoding_check): Handle encoding changes.
(alphabet_add): Likewise.
(data_division_ready): Likewise.
* scan.l: Use data.original().
* show_parse.h: Use correct encoding.
* symbols.cc (elementize): Likewise.
(symbol_elem_cmp): Handle locale.
(struct symbol_elem_t): Likewise.
(symbol_locale): Likewise.
(field_str): Change DEFAULT_SOURCE_ENCODING macro name.
(symbols_alphabet_set): Formatting.
(symbols_update): Modify consistency checks.
(symbol_locale_add): Locale support.
(cbl_locale_t::cbl_locale_t): Locale support.
(cbl_alphabet_t::cbl_alphabet_t): New structure.
(cbl_alphabet_t::reencode): Formatting.
(cbl_alphabet_t::assign): Change name of collation_sequence.
(cbl_alphabet_t::also): Likewise.
(new_literal_add): Anticipate the need for four-byte characters.
(guess_encoding): Eliminate.
(cbl_field_t::internalize): Refine conversion of data.initial to
specified encoding.
* symbols.h (enum symbol_type_t): Add SymLocale.
(struct cbl_field_data_t): Incorporate data.orig.
(struct cbl_field_t): Likewise.
(struct cbl_delete_file_t): New structure.
(struct cbl_label_t): Incorporate cbl_delete_file_t.
(struct cbl_locale_t): Support for locale.
(hex_decode): Comment.
(struct cbl_alphabet_t): Incorporate locale; change variable name
to collation_sequence.
(struct symbol_elem_t): Incorporate locale.
(cbl_locale_of): Likewise.
(cbl_alphabet_of): Likewise.
(symbol_locale_add): Likewise.
(wsclear): Type is now uint32_t instead of char.
* util.cc (symbol_type_str): Incorporate locale.
(cbl_field_t::report_invalid_initial_value): Change test so that
pure PIC A() variables are limited to [a-zA-Z] and space.
(valid_move): Use DEFAULT_SOURCE_ENCODING macro.
(cobol_filename): Formatting.
Richard Biener [Mon, 3 Nov 2025 13:04:55 +0000 (14:04 +0100)]
SSA immediate use iterator checking
The following implements additional checking around
SSA immediate use iteration. Specifically this prevents
- any nesting of FOR_EACH_IMM_USE_STMT inside another iteration
via FOR_EACH_IMM_USE_STMT or FOR_EACH_IMM_USE_FAST when iterating
on the same SSA name
- modification (for now, unlinking of immediate uses) of an SSA
immediate use list when a fast iteration of the immediate uses
of the SSA name is active
- modification (for now, unlinking of immediate uses) of the immediate
use list outside of the block of uses for the currently active stmt
of an ongoing FOR_EACH_IMM_USE_STMT of the SSA name
To implement this, additional bookkeeping members are added to the
SSA name structure when ENABLE_GIMPLE_CHECKING is active. I have
kept the existing consistency checking of the fast iterator.
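The three rules above can be illustrated with a small stand-alone model.
This is only a sketch and not the GCC implementation: ToyName, FastWalk,
StmtWalk and unlink_use are hypothetical names, stmts are plain integer
ids, and the two bookkeeping fields merely borrow the member names from
the ChangeLog below; how GCC actually updates them is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <stdexcept>
#include <vector>

// Toy stand-in for an SSA name (hypothetical, not GCC's tree_ssa_name).
// Stmt ids are non-negative; -1 means "no by-stmt walk is active".
struct ToyName
{
  std::vector<int> use_stmts;     // one entry per use: id of the using stmt
  int fast_iteration_depth = 0;   // number of active read-only walks
  int active_iterated_stmt = -1;  // current stmt of a by-stmt walk
};

// Models a read-only walk: bumps the depth, lowers it on destruction.
struct FastWalk
{
  ToyName &name;
  explicit FastWalk (ToyName &n) : name (n) { ++name.fast_iteration_depth; }
  ~FastWalk ()
  {
    assert (name.fast_iteration_depth > 0);
    --name.fast_iteration_depth;
  }
  FastWalk (const FastWalk &) = delete;
  FastWalk &operator= (const FastWalk &) = delete;
};

// Models a by-stmt walk that is currently positioned on STMT.
struct StmtWalk
{
  ToyName &name;
  StmtWalk (ToyName &n, int stmt) : name (n)
  {
    // Rule 1: no by-stmt walk nested inside any other walk of the name.
    if (name.active_iterated_stmt != -1 || name.fast_iteration_depth != 0)
      throw std::logic_error ("by-stmt walk nested in another walk");
    name.active_iterated_stmt = stmt;
  }
  // Move the walk on to the next stmt.
  void advance_to (int stmt) { name.active_iterated_stmt = stmt; }
  ~StmtWalk () { name.active_iterated_stmt = -1; }
  StmtWalk (const StmtWalk &) = delete;
  StmtWalk &operator= (const StmtWalk &) = delete;
};

// Remove one use of NAME that sits on STMT, enforcing rules 2 and 3.
void
unlink_use (ToyName &name, int stmt)
{
  // Rule 2: no unlinking while a read-only walk is active.
  if (name.fast_iteration_depth > 0)
    throw std::logic_error ("unlink during a read-only walk");
  // Rule 3: during a by-stmt walk, only uses on the current stmt may go.
  if (name.active_iterated_stmt != -1 && name.active_iterated_stmt != stmt)
    throw std::logic_error ("unlink outside the current stmt");
  auto it = std::find (name.use_stmts.begin (), name.use_stmts.end (), stmt);
  if (it != name.use_stmts.end ())
    name.use_stmts.erase (it);
}
```

In this model a read-only walk nested inside a by-stmt walk is accepted,
while the reverse nesting and any out-of-place unlink throw.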
* ssa-iterators.h (imm_use_iterator::name): Add.
(delink_imm_use): When in a FOR_EACH_IMM_USE_STMT iteration,
enforce that we only remove uses from the current stmt.
(end_imm_use_stmt_traverse): Reset current stmt.
(first_imm_use_stmt): Assert no FOR_EACH_IMM_USE_STMT on
var is in progress. Set the current stmt.
(next_imm_use_stmt): Set the current stmt.
(auto_end_imm_use_fast_traverse): New. Lower the iteration
depth upon destruction.
(first_readonly_imm_use): Bump the iteration depth.
* tree-core.h (tree_ssa_name::active_iterated_stmt,
tree_ssa_name::fast_iteration_depth): New members when
ENABLE_GIMPLE_CHECKING.
* tree-ssanames.cc (make_ssa_name_fn): Initialize
immediate use verifier bookkeeping members.
Richard Biener [Fri, 31 Oct 2025 12:08:05 +0000 (13:08 +0100)]
Make FOR_EACH_IMM_USE_STMT work w/o fake imm use node
This is an attempt to fix PR122502 by making a FOR_EACH_IMM_USE_FAST
within a FOR_EACH_IMM_USE_STMT on _the same_ VAR work without
the former running into the marker use operand inserted by
FOR_EACH_IMM_USE_STMT. It does this by getting rid of the marker.
The downside is that this in principle restricts the set of operations
that can be done on the immediate use list of VAR. Previously almost
anything was OK (though what then happened to the iteration was
technically not well-defined). After this patch you may only remove
immediate uses of VAR that are on the current stmt of the
FOR_EACH_IMM_USE_STMT iteration. In particular, things will break if
you happen to remove the one immediate use of VAR on the stmt
immediately following the set of immediate uses on the current stmt.
Additional checking to combat such cases is implemented in a
followup.
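The restriction can be seen in a stand-alone model of a marker-free
walk. This is a sketch under stated assumptions, not the code in
ssa-iterators.h: ToyUse, group_stmt_uses and walk_by_stmt are
hypothetical names, the use list is a std::list rather than GCC's
intrusive list, and next_stmt_use only borrows its name from the
ChangeLog below. The walk groups the uses of the current stmt, remembers
the first use of the following stmt, and resumes from there, so the body
may erase uses on the current stmt but must leave that remembered use
alone.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <vector>

// Toy immediate use: which stmt uses the name, and in which operand slot.
struct ToyUse
{
  int stmt;
  int operand;
};

typedef std::list<ToyUse> UseList;

// Move every use on the same stmt as *FIRST so that it directly follows
// FIRST.  Return the first use past that group (or end ()).
UseList::iterator
group_stmt_uses (UseList &uses, UseList::iterator first)
{
  assert (first != uses.end ());
  UseList::iterator insert_pos = std::next (first);
  for (UseList::iterator it = insert_pos; it != uses.end ();)
    {
      UseList::iterator cur = it++;
      if (cur->stmt != first->stmt)
        continue;
      if (cur == insert_pos)
        ++insert_pos;                          // already adjacent
      else
        uses.splice (insert_pos, uses, cur);   // pull into the group
    }
  return insert_pos;
}

// Visit each using stmt once.  BODY (uses, stmt) may erase uses whose
// stmt equals STMT and nothing else; erasing the use that
// next_stmt_use points to would leave the walk with a dangling iterator.
template <typename Body>
std::vector<int>
walk_by_stmt (UseList &uses, Body body)
{
  std::vector<int> visited;
  UseList::iterator cur = uses.begin ();
  while (cur != uses.end ())
    {
      int stmt = cur->stmt;
      UseList::iterator next_stmt_use = group_stmt_uses (uses, cur);
      visited.push_back (stmt);
      body (uses, stmt);
      cur = next_stmt_use;
    }
  return visited;
}
```

Because std::list::erase only invalidates iterators to the erased
elements, erasing the current stmt's uses keeps next_stmt_use valid in
this model, which mirrors the one operation the patch still permits.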
PR tree-optimization/122502
* ssa-iterators.h (imm_use_iterator::iter_node): Remove.
(imm_use_iterator::next_stmt_use): New.
(next_readonly_imm_use): Adjust checking code.
(end_imm_use_stmt_traverse): Simplify.
(link_use_stmts_after): Likewise. Return the last use
with the same stmt.
(first_imm_use_stmt): Simplify. Set next_stmt_use.
(next_imm_use_stmt): Likewise.
(end_imm_use_on_stmt_p): Adjust.