Robin Dapp [Fri, 7 Nov 2025 16:18:02 +0000 (17:18 +0100)]
vect: Give up if there is no offset_vectype.
vect_gather_scatter_fn_p currently ICEs if offset_vectype is NULL.
This is an oversight in the patches that relax gather/scatter detection.
Catch this.
gcc/ChangeLog:
* tree-vect-data-refs.cc (vect_gather_scatter_fn_p): Bail if
offset_vectype is NULL.
Robin Dapp [Thu, 9 Oct 2025 15:25:59 +0000 (17:25 +0200)]
vect: Reduce group size of consecutive strided accesses.
Consecutive load permutations like {0, 1, 2, 3} or {4, 5, 6, 7} in a
group of 8 only read a part of the group, leaving a gap.
For strided accesses we can elide the permutation and, instead of
accessing the whole group, use the number of SLP lanes. This
effectively increases the vector size as we don't load gaps. On top of that, we
do not need to emit the permutes at all.
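As a rough illustration (a hypothetical loop shape, not the exact covered cases), think of a strided structure access where only a consecutive subset of each group's lanes is consumed and the remaining fields form a gap:
  struct rec { int a, b, c, d, e, f, g, h; };
  void f (int *dst, const rec *s, int n, int stride)
  {
    for (int i = 0; i < n; i++)
      {
        dst[4 * i + 0] = s[i * stride].a;   // lanes {0,1,2,3} of a group of 8
        dst[4 * i + 1] = s[i * stride].b;
        dst[4 * i + 2] = s[i * stride].c;
        dst[4 * i + 3] = s[i * stride].d;   // fields e..h are never loaded
      }
  }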
gcc/ChangeLog:
* tree-vect-slp.cc (vect_load_perm_consecutive_p): New function.
(vect_lower_load_permutations): Use.
(vect_optimize_slp_pass::remove_redundant_permutations): Use.
* tree-vect-stmts.cc (has_consecutive_load_permutation): New
function that uses vect_load_perm_consecutive_p.
(get_load_store_type): Use.
(vectorizable_load): Reduce group size.
* tree-vectorizer.h (struct vect_load_store_data): Add
subchain_p.
(vect_load_perm_consecutive_p): Declare.
Jakub Jelinek [Mon, 10 Nov 2025 11:52:45 +0000 (12:52 +0100)]
c++: Implement C++26 P3920R0 - Wording for NB comment resolution on trivial relocation
Trivial relocation was voted out of C++26; the following patch removes it
(note that the libstdc++ part was still waiting for patch review and so
doesn't need to be removed).
This isn't a mere revert of r16-2206: I've kept the -Wc++26-compat option,
the non-terminal from the earlier patches remains class-property-specifier,
and I also had to partially revert various follow-up changes, e.g. the
modules handling of the new flags and its tests, the -Wkeyword-macro etc.
diagnostics of the conditional keywords, the feature test macro, etc.
Jakub Jelinek [Mon, 10 Nov 2025 10:36:42 +0000 (11:36 +0100)]
c++: Diagnose #define/#undef indeterminate
While working on CWG3053 I noticed I had forgotten to enable diagnostics
on #define indeterminate or #undef indeterminate now that it is handled
as a valid C++26 attribute.
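A minimal sketch of the kind of code that is now diagnosed (e.g. with -Wkeyword-macro in C++26 mode):
  #define indeterminate 1   // now warned: identifier of a C++26 attribute
  #undef indeterminate      // likewise warned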
2025-11-10 Jakub Jelinek <jakub@redhat.com>
gcc/cp/
* lex.cc (cxx_init): For C++26 call cpp_warn on "indeterminate".
gcc/testsuite/
* g++.dg/warn/Wkeyword-macro-1.C: Expect diagnostics on define/undef
of indeterminate.
* g++.dg/warn/Wkeyword-macro-2.C: Likewise.
* g++.dg/warn/Wkeyword-macro-4.C: Likewise.
* g++.dg/warn/Wkeyword-macro-5.C: Likewise.
* g++.dg/warn/Wkeyword-macro-7.C: Likewise.
* g++.dg/warn/Wkeyword-macro-8.C: Likewise.
Jakub Jelinek [Mon, 10 Nov 2025 10:34:20 +0000 (11:34 +0100)]
c++, libcpp: Implement CWG3053
The following patch implements CWG3053, approved in Kona, which makes it
valid not just to #define likely(a) or #define unlikely(a, b, c) but also
to #undef likely or #undef unlikely.
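For illustration, a translation unit like the following is what CWG3053 makes valid (the function-like #define was already allowed):
  #define likely(x) __builtin_expect (!!(x), 1)   // valid: function-like macro
  #undef likely                                   // now also valid, no diagnostic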
2025-11-10 Jakub Jelinek <jakub@redhat.com>
libcpp/
* directives.cc: Implement CWG3053.
(do_undef): Don't pedwarn or warn about #undef likely or #undef
unlikely.
gcc/testsuite/
* g++.dg/warn/Wkeyword-macro-4.C: Don't diagnose for #undef likely
or #undef unlikely.
* g++.dg/warn/Wkeyword-macro-5.C: Likewise.
* g++.dg/warn/Wkeyword-macro-9.C: Likewise.
* g++.dg/warn/Wkeyword-macro-8.C: Likewise.
* g++.dg/warn/Wkeyword-macro-10.C: Likewise.
Lewis Hyatt [Wed, 30 Jul 2025 23:20:55 +0000 (19:20 -0400)]
libcpp: Improve locations for macros defined prior to PCH include [PR105608]
It is permissible to define macros prior to including a PCH, as long as
these definitions are disjoint from or identical to the macros in the
PCH. The PCH loading process replaces all libcpp data structures with those
from the PCH, so it is necessary to remember the extra macros separately and
then restore them after loading the PCH, all of which is handled by
cpp_save_state() and cpp_read_state() in libcpp/pch.cc. The restoration
process consists of pushing a buffer containing the macro definition and
then lexing it from there, similar to how a command-line -D option is
processed. The current implementation does not attempt to set up the
line_map for this process, and so the locations assigned to the macros are
often not meaningful. (Similar to what happened in the past with lexing the
tokens out of a _Pragma string, lexing out of a buffer rather than a file
produces "sorta" reasonable locations that are often close enough, but not
reliably correct.)
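(For reference, the setup in question is a translation unit along these lines, with hypothetical names:)
  #define EXTRA_DEBUG 1     // defined prior to the PCH include; disjoint from the PCH's macros
  #include "everything.h"   // hypothetical header compiled as a PCH
  int x = EXTRA_DEBUG;      // the re-injected macro should now carry a sensible location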
Fix that up by remembering enough additional information (more or less, an
expanded_location for each macro definition) to produce a reasonable
location for the newly restored macros.
One issue that came up is the treatment of command-line-defined macros. From
the perspective of the generic line_map data structures, the command-line
location is not distinguishable from other locations; it's just an ordinary
location created by the front ends with a fake file name by convention. (At
the moment, it is always the string `<command-line>', subject to
translation.) Since libcpp needs to assign macros to that location, it
needs to know what location to use, so I added a new member
line_maps::cmdline_location for the front ends to set, similar to how
line_maps::builtin_location is handled.
This revealed a small issue, in c-opts.cc we have:
/* All command line defines must have the same location. */
cpp_force_token_locations (parse_in, line_table->highest_line);
But contrary to the comment, all command line defines don't actually end up
with the same location anymore. This is because libcpp/lex.cc has been
expanded (r6-4873) to include range information on the returned
locations. That logic has never respected the request of
cpp_force_token_locations. I believe this was not intentional, and so I have
corrected that here. Prior to this patch, the range logic has been leading
to command-line macros all having similar locations in the same line map (or
ad-hoc locations based from there for sufficiently long tokens); with this
change, they all have exactly the same location and that location is
recorded in line_maps::cmdline_location.
With that change, then it works fine for pch.cc to restore macros whether
they came from the command-line or from the main file.
gcc/c-family/ChangeLog:
PR preprocessor/105608
* c-opts.cc (c_finish_options): Set new member
line_table->cmdline_location.
* c-pch.cc (c_common_read_pch): Adapt linemap usage to changes in
libcpp pch.cc; it is now possible that the linemap is in a different
file after returning from cpp_read_state().
libcpp/ChangeLog:
PR preprocessor/105608
* include/line-map.h: Add new member CMDLINE_LOCATION.
* lex.cc (get_location_for_byte_range_in_cur_line): Do not expand
the token location to include range information if token location
override was requested.
(warn_about_normalization): Likewise.
(_cpp_lex_direct): Likewise.
* pch.cc (struct saved_macro): New local struct.
(struct save_macro_data): Change DEFNS vector to hold saved_macro
rather than uchar*.
(save_macros): Adapt to remember the location information for each
saved macro in addition to the definition.
(cpp_prepare_state): Likewise.
(cpp_read_state): Use the saved location information to generate
proper locations for the restored macros.
gcc/testsuite/ChangeLog:
PR preprocessor/105608
* g++.dg/pch/line-map-3.C: Remove xfails.
* g++.dg/pch/line-map-4.C: New test.
* g++.dg/pch/line-map-4.Hs: New test.
Mark Wielaard [Sun, 9 Nov 2025 21:12:19 +0000 (22:12 +0100)]
Regenerate libgfortran Makefile.in and aclocal.m4
Commit a1fe2cfa8965 ("fortran: [PR121628]") regenerated libgfortran
Makefile.in and aclocal.m4 files with automake 1.15 instead of 1.15.1.
Run autoreconf version 2.69 with automake 1.15.1 inside libgfortran.
Eric Botcazou [Sat, 8 Nov 2025 18:15:46 +0000 (19:15 +0100)]
Ada: Fix bogus error on limited with clause and private parent package
The implementation of the 10.1.2(8/2-11/2) subclauses that establish rules
for the legality of "with" clauses of private child units is done separately
for regular "with" clauses (in Check_Private_Child_Unit) and for limited
"with" clauses (in Check_Private_Limited_Withed_Unit). The testcase, which
contains the regular and the "limited" version of the same pattern, exhibits
a disagreement between them; the former implementation is correct and the
latter is wrong in this case.
The patch fixes the problem and also cleans up the latter implementation by
aligning it with the former as much as possible.
gcc/ada/
PR ada/34374
* sem_ch10.adb (Check_Private_Limited_Withed_Unit): Use a separate
variable for the private child unit, streamline the loop locating
the nearest private ancestor, fix a too early termination of the
loop traversing the ancestor of the current unit, and use the same
privacy test as Check_Private_Child_Unit.
Philipp Tomsich [Sat, 8 Nov 2025 16:28:07 +0000 (09:28 -0700)]
[RISC-V] Add testcase for shifted truthvalue
I was doing some cleanup on our internal tree and noticed a pattern that I
didn't think was actually useful in practice. Thankfully the internal commit
included a testcase clearly targeting that pattern.
I'm upstreaming the testcase, but not the unnecessary pattern.
gcc/testsuite
* gcc.target/riscv/snez.c: New test.
Avinash Jayakar [Sat, 8 Nov 2025 04:27:59 +0000 (09:57 +0530)]
isel: Check bounds before converting VIEW_CONVERT to VEC_SET.
The function gimple_expand_vec_set_expr in the isel pass converted
VIEW_CONVERT_EXPR to VEC_SET_EXPR without checking the bounds on the index,
which caused an ICE on targets that support VEC_SET_EXPR, like x86 and powerpc.
This patch adds a bounds check on the index operand and rejects the conversion
if the index is out of bounds.
Lulu Cheng [Mon, 3 Nov 2025 09:53:52 +0000 (17:53 +0800)]
LoongArch: Fix PR122097 (2).
r16-4703 did not completely fix PR122097: floating-point vectors
were not processed in the function loongarch_const_vector_same_bytes_p.
This patch completely resolves the issue.
PR target/122097
gcc/ChangeLog:
* config/loongarch/loongarch.cc
(loongarch_const_vector_same_bytes_p): Add processing for
floating-point vector data.
Avinash Jayakar [Sat, 8 Nov 2025 02:53:31 +0000 (08:23 +0530)]
vect: Complete implementation for MULT_EXPR vector lowering.
Use sequences of shifts and add/sub if the hardware does not have support for
vector multiplication. In a previous patch, bare-bones vector lowering had been
implemented which only worked when the constant value was a power of 2.
In this patch, a few more cases have been added: if a constant is a uniform
vector but not a power of 2, use choose_mult_variant, with the maximum cost
estimate set to the cost of a scalar multiplication operation times the number
of elements in the vector. This is similar to the logic used when expanding
MULT_EXPR in the expand pass or in the vector pattern recognition in
tree-vect-patterns.cc.
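As a scalar sketch of what such a synthetic multiply looks like (illustrative only, not the generated GIMPLE), multiplying by the non-power-of-2 constant 10 becomes shifts plus an add:
  unsigned mul_by_10 (unsigned x)
  {
    return (x << 3) + (x << 1);   // x*8 + x*2 == x*10
  }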
gcc/ChangeLog:
PR tree-optimization/122065
* tree-vect-generic.cc (target_supports_mult_synth_alg): Add helper to
check mult synth.
(expand_vector_mult): Optimize mult when const is uniform but not
power of 2.
Jerry DeLisle [Sat, 8 Nov 2025 02:46:54 +0000 (18:46 -0800)]
fortran: [PR121628]
The PR121628 deep-copy helper reused a static seen_derived_types set
across wrapper generation, so recursive allocatable arrays that appeared
multiple times in a derived type caused infinite compile-time recursion.
Save and restore the set around each wrapper build, polish follow-ups,
and add a regression test to keep the scenario covered.
gcc/fortran/ChangeLog:
PR fortran/121628
* trans-array.cc (seen_derived_types): Move to file scope and
preserve/restore around generate_element_copy_wrapper.
* trans-intrinsic.cc (conv_intrinsic_atomic_op): Reuse
gfc_trans_force_lval when forcing addressable CAF temps.
gcc/testsuite/ChangeLog:
PR fortran/121628
* gfortran.dg/alloc_comp_deep_copy_7.f90: New test.
libgfortran/ChangeLog:
PR fortran/121628
* Makefile.in: Keep continuation indentation within 80 columns.
* aclocal.m4: Regenerate.
* libgfortran.h: Drop unused forward declaration.
Signed-off-by: Christopher Albert <albert@tugraz.at>
Andrew Pinski [Fri, 7 Nov 2025 22:01:33 +0000 (14:01 -0800)]
sccp: Fix order of removal of phi (again) [PR122599]
This time we are gimplifying the expression and calling fold_stmt during
the gimplification (which is fine), but since we removed the phi, and the
expression indirectly references SSA names from the phi, things just fall
over inside the ranger. This moves the removal of the phi until after
gimplification, as the final value expression might refer back to the
SSA name that the phi defines.
Pushed as obvious after bootstrap test on x86_64-linux-gnu.
PR tree-optimization/122599
gcc/ChangeLog:
* tree-scalar-evolution.cc (final_value_replacement_loop): Move
the removal of the phi until after the gimplification of the final
value expression.
gcc/testsuite/ChangeLog:
* gcc.dg/torture/pr122599-1.c: New test.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
gcc/analyzer/ChangeLog:
* checker-event.cc
(region_creation_event_allocation_size::print_desc): Fix missing
"else" leading to stray trailing "allocated here" text in events.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Andrew Pinski [Tue, 28 Oct 2025 05:22:08 +0000 (22:22 -0700)]
Move build_call_nary away from va_list
Instead of a va_list here we can create a std::initializer_list that contains the
arguments and pass that.
This is just one quick version of what was mentioned during the "Reviewing
refactoring goals and acceptable abstractions" discussion.
The generated code should be similar or slightly better, plus there is extra
checking of the bounds of the std::initializer_list.
I didn't remove the n argument from build_call_nary at this stage, as I didn't
want to change the calls to build_call_nary, but I added a gcc_checking_assert
to make sure the number passed is the number of arguments.
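A generic sketch of the direction, with simplified stand-in types rather than GCC's actual tree declarations:
  #include <cassert>
  #include <initializer_list>
  #include <vector>

  struct node { std::vector<node *> ops; };

  // new entry point: arguments arrive as an initializer_list
  static node *
  build_call (node *fn, std::initializer_list<node *> args)
  {
    node *call = new node;
    call->ops.push_back (fn);                         // record the callee first
    call->ops.insert (call->ops.end (), args.begin (), args.end ());
    return call;
  }

  // the old N-ary interface keeps its count, but only to sanity-check it
  template <typename... Args>
  static node *
  build_call_nary (node *fn, int n, Args... args)
  {
    assert (n == static_cast<int> (sizeof... (args)));
    return build_call (fn, { args... });
  }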
Changes since v1:
* v2: Fix build_call's access of std::initializer_list.
gcc/ChangeLog:
* tree.cc (build_call_nary): Remove decl.
Add template definition that uses std::initializer_list<tree>
and call build_call.
(build_call): New declaration.
* tree.h (build_call_nary): Remove.
(build_call): New function.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Robin Dapp [Tue, 7 Oct 2025 15:17:22 +0000 (17:17 +0200)]
RISC-V: Remove gather scale and offset handling.
With the recent vectorizer changes upstream the vectorizer can take care
of offset extension and scaling (and its proper costing) itself.
Thus, we can remove all related handling in expand_gather_scatter and
set the predicates in the gather/scatter expanders to what our
instructions actually support.
gcc/ChangeLog:
* config/riscv/autovec.md: Use const_1_operand for scale and
extend predicates.
* config/riscv/riscv-v.cc (expand_gather_scatter): Remove scale
and extension handling.
Robin Dapp [Thu, 6 Nov 2025 08:14:35 +0000 (09:14 +0100)]
vect: Do not convert offset type in strided gather.
The gather/scatter relaxation patches introduced a bug with
vect_use_strided_gather_scatters_p. I didn't want to pass
supported_offset_vectype and supported scale all the way from
vect_truncate_gather_scatter_offset and
vect_use_strided_gather_scatters_p to get_load_store_type, so I just
called vect_gather_scatter_fn_p again afterwards to determine
the supported type and scale.
However, this doesn't take into account that
vect_use_strided_gather_scatters_p changes the offset type after
verifying that we can use gather/scatter.
The flow right now is
- vect_use_strided_gather_scatters_p calls vect_check_gather_scatter
with e.g. a char offset type.
- We actually need/support a short vector offset type and
vect_use_strided_gather_scatters_p fold converts the actual (scalar)
char offset to a short offset.
- We call vect_gather_scatter_fn_p with the new short offset instead of
the original char one, thinking we need an even larger offset type.
The last call is obviously not identical to the ones we used to check
gather/scatter in the first place and can fail if there is no offset
vectype.
There are several ways to fix this. The most obvious one is to bite the
bullet and just add the supported_offset_vectype and supported_scale to
all the intermediate functions. I wondered, however, if we need the
offset conversion at all. As far as I can tell we don't ever use
the scalar offset type and vect_get_strided_load_store_ops in particular
uses offset_vectype. Thus, this patch removes the conversion.
I bootstrapped and regtested this, before and after the relaxation
patches, on x86 and power10. Regtested on aarch64 and riscv.
gcc/ChangeLog:
* tree-vect-stmts.cc (vect_use_strided_gather_scatters_p):
Do not convert offset type.
Robin Dapp [Wed, 29 Oct 2025 15:02:51 +0000 (16:02 +0100)]
vect: Relax gather/scatter scale handling.
Similar to the signed/unsigned patch before, this one relaxes the
gather/scatter restrictions on scale factors. The basic idea is that a
natively unsupported scale factor can still be reached by emitting a
multiplication before the actual gather operation. As before, we need
to make sure that there is no overflow when multiplying.
Robin Dapp [Tue, 9 Sep 2025 09:41:51 +0000 (11:41 +0200)]
vect: Relax gather/scatter detection by swapping offset sign.
This patch adjusts vect_gather_scatter_fn_p to always check an offset
type with swapped signedness (vs. the original offset argument).
If the target supports the gather/scatter with the new offset type as
well as the conversion of the offset we now emit an explicit offset
conversion before the actual gather/scatter.
The relaxation is only done for the IFN path of gather/scatter and the
general idea roughly looks like:
- vect_gather_scatter_fn_p builds a list of all offset vector types
that the target supports for the current vectype. Then it goes
through that list, trying direct support first and sign-swapped
offset types next, taking precision requirements into account.
If successful it sets supported_offset_vectype to the type that actually
worked while offset_vectype_out is the type that was requested.
- vect_check_gather_scatter works as before but uses the relaxed
vect_gather_scatter_fn_p.
- get_load_store_type sets ls_data->supported_offset_vectype if the
requested type wasn't supported but another one was.
- check_load_store_for_partial_vectors uses the
supported_offset_vectype in order to validate what get_load_store_type
determined.
- vectorizable_load/store emit a conversion if
ls_data->supported_offset_vectype is nonzero and cost it.
The offset type is either of pointer size (if we started with a signed
offset) or twice the size of the original offset (when that one was
unsigned).
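As a hypothetical example of the kind of source this helps, a gather whose natural offset vector type is not directly supported by the target can now be handled by converting the offsets first:
  void gather (double *dst, const double *base, const unsigned short *idx, int n)
  {
    for (int i = 0; i < n; i++)
      dst[i] = base[idx[i]];   // offsets may be converted to a sign-swapped/wider supported type
  }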
gcc/ChangeLog:
* tree-vect-data-refs.cc (struct gather_scatter_config): New
struct to hold gather/scatter configurations.
(vect_gather_scatter_which_ifn): New function to determine which
IFN to use.
(vect_gather_scatter_get_configs): New function to enumerate all
target-supported configs.
(vect_gather_scatter_fn_p): Rework to use
vect_gather_scatter_get_configs and try sign-swapped offset.
(vect_check_gather_scatter): Use new supported offset vectype
argument.
* tree-vect-stmts.cc (check_load_store_for_partial_vectors):
Ditto.
(vect_truncate_gather_scatter_offset): Ditto.
(vect_use_grouped_gather): Ditto.
(get_load_store_type): Ditto.
(vectorizable_store): Convert to sign-swapped offset type if
needed.
(vectorizable_load): Ditto.
* tree-vectorizer.h (struct vect_load_store_data): Add
supported_offset_vectype.
(vect_gather_scatter_fn_p): Add argument.
Andrew Pinski [Thu, 6 Nov 2025 20:04:30 +0000 (12:04 -0800)]
forwprop: Handle already true/false branches in optimize_unreachable [PR122588]
When optimize_unreachable was moved from fab to forwprop, I missed that due to
the integrated copy prop, we might end up with an already true branch leading
to a __builtin_unreachable block. optimize_unreachable would switch around
the if and things go downhill from there: since the other edge was already
marked as non-executable, forwprop didn't process those blocks, didn't
do copy prop into that block, and the original assignment statement was removed.
This fixes the problem by having optimize_unreachable not touch the if
statement if its condition was already changed to true/false.
Note I placed the testcase in gcc.c-torture/compile as gcc.dg/torture
is NOT currently testing -Og (see PR 122450 for that).
Changes since v1:
* v2: Add gimple testcase.
Bootstrapped and tested on x86_64-linux-gnu.
PR tree-optimization/122588
gcc/ChangeLog:
* tree-ssa-forwprop.cc (optimize_unreachable): Don't touch
if the condition was already true or false.
gcc/testsuite/ChangeLog:
* gcc.c-torture/compile/pr122588-1.c: New test.
* gcc.dg/tree-ssa/pr122588-1.c: New test.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Eric Botcazou [Fri, 7 Nov 2025 19:42:57 +0000 (20:42 +0100)]
Ada: Fix bogus error on inherited operation for extension of type instance
It comes from a small discrepancy between class-wide subtypes and types:
they both have unknown discriminants, but only the latter may have
discriminants, which causes Subtypes_Statically_Match to return False.
gcc/ada/
PR ada/83188
* sem_eval.adb (Subtypes_Statically_Match): Deal with class-wide
subtypes whose class-wide types have discriminants.
gcc/testsuite/
* gnat.dg/class_wide6.ads, gnat.dg/class_wide6.adb: New test.
* gnat.dg/class_wide6_pkg.ads: New helper.
David Faust [Thu, 6 Nov 2025 22:24:14 +0000 (14:24 -0800)]
bpf: improve memmove inlining [PR122140]
The BPF backend inline memmove expansion was broken for certain
constructs. This patch addresses the two underlying issues:
1. Off-by-one in the "backwards" unrolled move loop offset.
2. Poor use of temporary register for the generated move loop, which
could result in some of the loads performing the move being optimized
away when the source and destination of the memmove are based off of
the same pointer.
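A scalar sketch of the overlap-safe copy the expansion models (an assumed helper, not the BPF expander itself); the backwards direction matters when dst overlaps src from above, and its last access is at offset len - 1, which is where the off-by-one crept in:
  void move_bytes (unsigned char *dst, const unsigned char *src, unsigned long len)
  {
    if (dst > src)
      for (unsigned long i = len; i-- > 0; )   // backwards: first copied byte is at len - 1
        dst[i] = src[i];
    else
      for (unsigned long i = 0; i < len; i++)  // forwards is safe when dst <= src
        dst[i] = src[i];
  }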
gcc/
PR target/122140
* config/bpf/bpf.cc (bpf_expand_cpymem): Fix off-by-one offset
in backwards loop. Improve src and dest addrs used for the
branch condition.
(emit_move_loop): Improve emitted set insns and remove the
explicit temporary register.
Richard Biener [Thu, 6 Nov 2025 13:24:34 +0000 (14:24 +0100)]
tree-optimization/122577 - missed vectorization of conversion from bool
We are currently overly restrictive with rejecting conversions from
bit-precision entities to mode precision ones. Similar to RTL expansion
we can focus on non-bit operations producing bit-precision results
which we currently do not properly handle by masking. Such checks
should already be present. The following relaxes vectorizable_conversion.
Actual bitfield accesses are caught and rejected by vectorizer dataref
analysis and converted during if-conversion into mode-size accesses
with appropriate sign- or zero-extension.
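The shape of loop this relaxation targets could be as simple as the following (illustrative only):
  void widen_bools (int *out, const bool *in, int n)
  {
    for (int i = 0; i < n; i++)
      out[i] = in[i];   // a conversion from a bit-precision (bool) value to mode precision
  }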
PR tree-optimization/122577
* tree-vect-stmts.cc (vectorizable_conversion): Allow conversions
from non-mode-precision types.
Pan Li [Thu, 6 Nov 2025 05:19:20 +0000 (13:19 +0800)]
Match: Refactor bit_ior based unsigned SAT_MUL pattern by widen mul helper [NFC]
There are 3 kinds of widen_mul during the unsigned SAT_MUL pattern, namely:
* widen_mul directly, like _3 w* _4
* convert and then widen_mul, like (uint64_t)_3 w* (uint64_t)_4
* convert and then mul, like (uint64_t)_3 * (uint64_t)_4
All of them will be referenced during different forms of the unsigned
SAT_MUL pattern match, but we can actually wrap them into a helper
which presents the "widening_mul" semantics. With this helper, some
unnecessary patterns and duplicated code can be eliminated. Like the
min based pattern before, this patch focuses on the bit_ior based pattern.
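For reference, one bit_ior based source shape in this family might look like the following sketch (illustrative; not the exact form matched in match.pd):
  #include <cstdint>
  uint32_t sat_mul_u32 (uint32_t a, uint32_t b)
  {
    uint64_t prod = (uint64_t) a * b;                // the widening multiply
    uint32_t hi = (uint32_t) (prod >> 32);           // nonzero exactly on overflow
    return (uint32_t) prod | -(uint32_t) (hi != 0);  // bit_ior forces all-ones on overflow
  }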
The below test suites are passed for this patch:
1. The rv64gcv fully regression tests.
2. The x86 bootstrap tests.
3. The x86 fully regression tests.
gcc/ChangeLog:
* match.pd: Leverage usmul_widen_mult by bit_ior based
unsigned SAT_MUL pattern.
Pan Li [Wed, 15 Oct 2025 14:16:11 +0000 (22:16 +0800)]
RISC-V: Combine vsext.vf2 and vsll.vi to vwsll.vi on ZVBB
The vwsll.vi of the zvbb extension takes a zero extend before the ashift,
but we can still do the combine based on a sign extend if and only
if the shift amount is an immediate and the sign-extended bits are all
shifted out. For example, as below:
vsetvli zero, zero, e32, m1, ta, ma
vsext.vf2 v1, v2
vsll.vi v1, v1, 16
If the ashift amount is greater than or equal to the truncated bitsize
(aka 16 for e32), the sign or zero extended bits will be shifted out
and never pollute the final result. Then we have:
vsetvli zero, zero, e32, m1, ta, ma
vwsll.vi v1, v2, 16
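A scalar sketch of why this is safe (illustrative): once the shift amount is at least the number of extension bits, sign- and zero-extension give the same shifted result.
  #include <cstdint>
  uint32_t widen_then_shift (int16_t x)
  {
    uint32_t sext = (uint32_t) (int32_t) x << 16;    // sign-extend, then shift
    uint32_t zext = (uint32_t) (uint16_t) x << 16;   // zero-extend, then shift
    // sext == zext for every x, because the extended bits are all shifted out.
    return sext;
  }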
PR target/121959
The below test suites are passed for this patch series.
* The rv64gcv fully regression test.
gcc/ChangeLog:
* config/riscv/autovec-opt.md (*vwsll_sign_extend_<mode>): Add
pattern to combine vsext.vf2 and vsll.vi to vwsll.vi.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/pr121959-1.c: New test.
* gcc.target/riscv/rvv/autovec/pr121959-2.c: New test.
* gcc.target/riscv/rvv/autovec/pr121959-3.c: New test.
* gcc.target/riscv/rvv/autovec/pr121959-4.c: New test.
* gcc.target/riscv/rvv/autovec/pr121959-5.c: New test.
* gcc.target/riscv/rvv/autovec/pr121959-run-1.c: New test.
* gcc.target/riscv/rvv/autovec/pr121959.h: New test.
Richard Biener [Fri, 7 Nov 2025 09:15:36 +0000 (10:15 +0100)]
tree-optimization/122589 - imm use iterator checking fallout
The following addresses the latent issue that gsi_replace_with_seq
causes debug info to unnecessarily degrade and in this process
break the new immediate use iterator sanity checking. In particular
gsi_remove has side-effects on debug stmts even when operating
in non-permanent operation. But as we are operating on a sequence
not in the IL here this should be avoided. Re-factoring
gsi_replace_with_seq to not rely on gsi_remove fulfills this.
I've noticed gsi_split_seq_before has misleading documentation.
Fixed thereby as well.
PR tree-optimization/122589
PR middle-end/122594
* gimple-iterator.cc (gsi_replace_with_seq): Instead of
removing the last stmt from the sequence with gsi_remove,
split it using gsi_split_seq_before.
(gsi_split_seq_before): Fix bogus documentation.
Alfie Richards [Wed, 15 Oct 2025 13:34:55 +0000 (13:34 +0000)]
aarch64: Add support for preserve_none function attribute [PR target/118328]
When applied to a function, preserve_none changes the procedure call standard
such that all registers except stack pointer, frame register, and link register
are caller saved. Additionally, it changes the argument passing registers.
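A minimal usage sketch, assuming the GNU attribute spelling added to aarch64_gnu_attributes:
  __attribute__ ((preserve_none)) void callback (void *ctx)
  {
    (void) ctx;   // body irrelevant; the attribute changes which registers the caller must save
  }

  void run (void *ctx)
  {
    callback (ctx);   // around this call, nearly all registers are treated as clobbered
  }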
PR target/118328
gcc/ChangeLog:
* config/aarch64/aarch64.cc (handle_aarch64_vector_pcs_attribute):
Add handling for ARM_PCS_PRESERVE_NONE.
(aarch64_pcs_exclusions): New definition.
(aarch64_gnu_attributes): Add entry for preserve_none and add
aarch64_pcs_exclusions to aarch64_vector_pcs entry.
(aarch64_preserve_none_abi): New function.
(aarch64_fntype_abi): Add handling for preserve_none.
(aarch64_reg_save_mode): Add handling for ARM_PCS_PRESERVE_NONE.
(aarch64_hard_regno_call_part_clobbered): Add handling for
ARM_PCS_PRESERVE_NONE.
(num_pcs_arg_regs): New helper function.
(get_pcs_arg_reg): New helper function.
(aarch64_function_ok_for_sibcall): Add handling for ARM_PCS_PRESERVE_NONE.
(aarch64_layout_arg): Add preserve_none argument layout.
(function_arg_preserve_none_regno_p): New helper function.
(aarch64_function_arg): Update to handle preserve_none.
(function_arg_preserve_none_regno_p): Update logic for preserve_none.
(aarch64_expand_builtin_va_start): Add preserve_none layout.
(aarch64_setup_incoming_varargs): Add preserve_none layout.
(aarch64_is_variant_pcs): Update for case of ARM_PCS_PRESERVE_NONE.
(aarch64_comp_type_attributes): Add preserve_none.
* config/aarch64/aarch64.h (NUM_PRESERVE_NONE_ARG_REGS): New macro.
(PRESERVE_NONE_REGISTERS): New macro.
(enum arm_pcs): Add ARM_PCS_PRESERVE_NONE.
* doc/extend.texi (preserve_none): Add docs for new attribute.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/preserve_none_1.c: New test.
* gcc.target/aarch64/preserve_none_mingw_1.c: New test.
* gcc.target/aarch64/preserve_none_2.c: New test.
* gcc.target/aarch64/preserve_none_3.c: New test.
* gcc.target/aarch64/preserve_none_4.c: New test.
* gcc.target/aarch64/preserve_none_5.c: New test.
* gcc.target/aarch64/preserve_none_6.c: New test.
Pan Li [Sun, 26 Oct 2025 07:21:15 +0000 (15:21 +0800)]
RISC-V: Combine vec_duplicate + vwmaccu.vv to vwmaccu.vx on GR2VR cost
This patch combines vec_duplicate + vwmaccu.vv into vwmaccu.vx, as in the
example code below. The related pattern depends on the cost of a
vec_duplicate from GR2VR: late-combine will take action if the GR2VR cost
is zero, and reject the combination if the GR2VR cost is greater than zero.
Assume we have asm code like below, with GR2VR cost 0.
11 beq a3,zero,.L8
...
14 .L3:
15 vsetvli a5,a3,e32,m1,ta,ma
...
20 vwmaccu.vx v1,a2,v3
...
23 bne a3,zero,.L3
Unfortunately, similar to vwaddu.vv, only widening from uint32_t to
uint64_t has the necessary zero-extend during combine; we lose the
extend op after expand for any other types.
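The kind of loop this targets could look as follows (illustrative; uint32_t to uint64_t widening per the note above):
  #include <cstdint>
  void wmacc (uint64_t *acc, const uint32_t *a, uint32_t s, int n)
  {
    for (int i = 0; i < n; i++)
      acc[i] += (uint64_t) a[i] * s;   // the duplicated scalar s can fold into vwmaccu.vx
  }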
gcc/ChangeLog:
* config/riscv/autovec-opt.md (*widen_mul_plus_vx_<mode>): Add
new pattern to combine the vwmaccu.vx.
* config/riscv/vector.md (*pred_widen_mul_plus_u_vx<mode>_undef):
Add undef define_insn for vwmaccu.vx emitting.
(@pred_widen_mul_plus_u_vx<mode>): Ditto.
When the mode of the destination operand selected by the condition
is SImode, explicit sign extension is applied to both selected
source operands, and the destination operand is marked as
sign-extended.
This method can eliminate some of the sign extension instructions
caused by conditional selection optimization.
gcc/ChangeLog:
* config/loongarch/loongarch.cc
(loongarch_sign_extend_if_subreg_prom_p): Determine if the
current operand is SUBREG and if the source of SUBREG is
the sign-extended value.
(loongarch_expand_conditional_move): Optimize.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/sign-extend-4.c: New test.
* gcc.target/loongarch/sign-extend-5.c: New test.
Lulu Cheng [Thu, 12 Dec 2024 08:21:38 +0000 (16:21 +0800)]
LoongArch: Implement sge and sgeu.
The original implementation of the function loongarch_extend_comparands
only prevented op1 from being loaded into the register when op1 was
const0_rtx. It has now been modified so that op1 is not loaded into
the register as long as op1 is an immediate value. This allows
slt{u}i to be generated instead of slt{u} if the conditions are met.
gcc/ChangeLog:
* config/loongarch/loongarch.cc
(loongarch_canonicalize_int_order_test): Support GT GTU LT
and LTU.
(loongarch_extend_comparands): Expand the scope of op1 from
0 to all immediate values.
* config/loongarch/loongarch.md
(*sge<u>_<X:mode><GPR:mode>): New template.
AArch64, ARM: Clean up documentation of -mbranch-protection.
While working on other things, I noticed that the documentation for
the -mbranch-protection= option was pretty garbled on both aarch64 and
arm targets, with incorrect markup, too much syntax crammed into the
option summary, and confusion about which values the "+leaf" modifier
can apply to. I rewrote it to list all the valid option values
explicitly in the option description, checking this against the
implementation.
gcc/ChangeLog
* doc/invoke.texi (AArch64 Options): Clean up description of
-mbranch-protection= argument.
(ARM Options): Likewise.
Jerry DeLisle [Thu, 6 Nov 2025 20:44:18 +0000 (12:44 -0800)]
fortran: [PR121628]
This patch fixes PR121628 by implementing proper deep copy semantics for
derived types containing recursive allocatable array components, in
compliance with Fortran 2018+ standards.
The original implementation would generate infinitely recursive code at
compile time when encountering self-referential derived types with
allocatable components (e.g., type(t) containing allocatable type(t)
arrays). This patch solves the problem by generating a runtime helper
function that performs element-wise deep copying, avoiding compile-time
recursion while maintaining correct assignment semantics.
The trans-intrinsic.cc change enhances handling of constant values in
coarray atomic operations to ensure temporary variables are created when
needed, avoiding invalid address-of-constant expressions.
gcc/fortran/ChangeLog:
PR fortran/121628
* trans-array.cc (get_copy_helper_function_type): New function to
create function type for element copy helpers.
(get_copy_helper_pointer_type): New function to create pointer type
for element copy helpers.
(generate_element_copy_wrapper): New function to generate runtime
helper for element-wise deep copying of recursive types.
(structure_alloc_comps): Detect recursive allocatable array
components and use runtime helper instead of inline recursion.
Add includes for cgraph.h and function.h.
* trans-decl.cc (gfor_fndecl_cfi_deep_copy_array): New declaration
for runtime deep copy helper.
(gfc_build_builtin_function_decls): Initialize the runtime helper
declaration.
* trans-intrinsic.cc (conv_intrinsic_atomic_op): Enhance handling of
constant values in coarray atomic operations by detecting and
materializing address-of-constant expressions.
* trans.h (gfor_fndecl_cfi_deep_copy_array): Add external declaration.
libgfortran/ChangeLog:
PR fortran/121628
* Makefile.am: Add runtime/deep_copy.c to source files.
* Makefile.in: Regenerate.
* gfortran.map: Export _gfortran_cfi_deep_copy_array symbol.
* libgfortran.h: Add prototype for internal_deep_copy_array.
* runtime/deep_copy.c: New file implementing runtime deep copy
helper for recursive allocatable array components.
gcc/testsuite/ChangeLog:
PR fortran/121628
* gfortran.dg/alloc_comp_deep_copy_5.f90: New test for recursive
allocatable array deep copy.
* gfortran.dg/alloc_comp_deep_copy_6.f90: New test for multi-level
recursive allocatable deep copy.
* gfortran.dg/array_memcpy_2.f90: Fix test with proper allocation.
Signed-off-by: Christopher Albert <albert@tugraz.at>
Eric Botcazou [Thu, 6 Nov 2025 19:42:13 +0000 (20:42 +0100)]
Ada: Fix function call in object notation incorrectly rejected
This happens in the name of a procedure call, again when there
is an implicit dereference in this name, and the fix to apply to
Find_Selected_Component is again straightforward:
--- a/gcc/ada/sem_ch8.adb
+++ b/gcc/ada/sem_ch8.adb
@@ -8524,9 +8524,7 @@ package body Sem_Ch8 is
-- Error if the prefix is procedure or entry, as is P.X
if Ekind (P_Name) /= E_Function
- and then
- (not Is_Overloaded (P)
- or else Nkind (Parent (N)) = N_Procedure_Call_Statement)
+ and then not Is_Overloaded (P)
then
-- Prefix may mention a package that is hidden by a local
-- declaration: let the user know. Scan the full homonym
But this also changes the diagnostics in illegal cases because they are not
uniform in the procedure, so the change also factors them out so as to make
them uniform, which slightly improves them in the end.
gcc/ada/
PR ada/113352
* sem_ch4.adb (Diagnose_Call): Tweak error message.
* sem_ch8.adb (Find_Selected_Component): Remove bypass for calls
to procedures in the overloaded overloadable case. Factor out
the diagnostics code and invoke it uniformly in this case.
gcc/testsuite/
* gnat.dg/prefix3.adb: New test.
* gnat.dg/prefix3_pkg.ads: New helper.
* gnat.dg/prefix3_pkg.adb: Likewise.
Due to some quirks in crtstuff.c, attribute "retain" requires
some features that avr doesn't implement -- even though it
doesn't even use crtstuff. This patch works around that.
PR target/122516
gcc/
* config/avr/elf.h (SUPPORTS_SHF_GNU_RETAIN): Define if
HAVE_GAS_SHF_GNU_RETAIN.
In particular note insns 34, 42 and 43. Those are useless. Insns 36, 37, 38
are just a single bit extraction from a variable location (from one of the
if-converted blocks). I couldn't see a good way to fix the problem with insn
34/insn 42. The desire in cmove_arith to make the then/else blocks independent
is good; what's unclear is whether or not that code really cares about the
*destination* of the then/else blocks. But I set that aside.
We then thought that cleaning up the variable bit extraction would be the way
to go. So a pattern was constructed to match that form of variable bit extract
and the cost model was twiddled to return that it was a single fast
instruction. But even with those changes fwprop1 refused to make the
substitution. Sigh. At least combine recognizes the idiom later and cleans it
up.
Then we realized we really should just ignore the (set (reg) (const_int 0)) in
the if-converted sequence. We're going to be able to propagate that away in
nearly every case since we have a hard-wired zero register. Sure enough,
ignoring that insn was enough to tip the balance on this case and we get the
desired code.
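For context, the new test exercises source of roughly this flavor (a hypothetical sketch; the actual testcase is czero-bext.c), where one arm of the conditional is a plain zero and the other a variable single-bit extract:
  long cond_bit (long x, long n, long c)
  {
    return c ? ((x >> n) & 1) : 0;
  }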
Tested on riscv32-elf and riscv64-elf. Pioneer bootstrap is in flight, though
it won't really exercise this problem. The BPI's build hasn't started yet, so
it'll be at least 27 hours before it's done.
Waiting on pre-commit CI before moving forward.
gcc/
* config/riscv/riscv.cc (riscv_noce_conversion_profitable_p): Ignore
assignments of (const_int 0) to a register. They will get propagated
away.
gcc/testsuite
* gcc.target/riscv/czero-bext.c: New test.
Eric Botcazou [Thu, 6 Nov 2025 19:03:49 +0000 (20:03 +0100)]
Ada: Fix incorrect renaming of primitive subprogram in object notation
It is possible to declare a subprogram renaming whose name is a primitive
subprogram in object notation; in this case, the name is unconditionally
evaluated in the front-end (unlike for objects) so that, if an ad-hoc body
needs to be built for the renaming later, the name is not reevaluated for
every call to it.
This evaluation is skipped if the name contains an implicit dereference,
as reported in the first PR, and the fix is to make the dereference explicit
at the end of the processing done in Analyze_Renamed_Primitive_Operation,
as is done in the sibling procedure Analyze_Renamed_Entry. The patch also
makes a few consistency tweaks to them and also replaces a manual evaluation
of the name in Expand_N_Subprogram_Renaming_Declaration by a simple call to
Evaluate_Name, which is the procedure used for object renamings.
Analyze_Renamed_Primitive_Operation performs the resolution of the name
based on the declared profile, but it does not do that correctly in all
cases, as reported in the second PR; the fix is again straightforward.
gcc/ada/
PR ada/113350
PR ada/113551
* exp_ch2.adb (Expand_Renaming): Fix reference to Evaluate_Name.
* exp_ch8.adb (Expand_N_Subprogram_Renaming_Declaration): Call
Evaluate_Name to evaluate the name.
* sem_ch8.adb (Analyze_Renamed_Entry): Minor tweaks.
(Analyze_Renamed_Family_Member): Likewise.
(Analyze_Renamed_Primitive_Operation): Likewise.
Fix thinko in the function checking profile conformance, save the
result of the resolution and make implicit dereferences explicit.
gcc/testsuite
* gnat.dg/renaming19.adb: New test.
* gnat.dg/renaming19_pkg.ads: New helper.
* gnat.dg/renaming19_pkg.adb: Likewise.
AVR: AVR-SD: Put a valid opcode prior to gs() table in .subsection 1.
On functional safety devices (AVR-SD), each executed instruction must
be followed by a valid opcode. This is because instruction fetch and
decode for the next instruction runs while the 2-stage pipeline is
executing the current instruction.
There is only one case where avr-gcc generates code interspersed with
data, which is when a switch/case table is generated for a function
with a "section" attribute and AVR_HAVE_JMP_CALL. In that case, the
table with the gs() code label addresses is put in .subsection 1 so
that it belongs to the section as specified by the "section" attribute.
gcc/
* config/avr/avr.cc (avr_output_addr_vec): Output
a valid opcode prior to the first gs() label provided:
- The code is compiled for an arch that has AVR-SD mcus, and
- the function has a "section" attribute, and
- the function has a gs() label addresses switch/case table.
Your Name [Thu, 6 Nov 2025 16:50:22 +0000 (09:50 -0700)]
[RISC-V][PR 121136] Improve various tests which only need to examine upper bits in a GPR
So pre-commit CI flagged an issue with the initial version of this patch. In
particular the cmp-mem-const-{1,2} tests are failing.
I didn't see that in my internal testing, but that well could be an artifact of
having multiple patches touching in the same broad space that the tester is
evaluating. If I apply just this patch I can trigger the cmp-mem-const{1,2}
failures.
The code we're getting now is actually better than we were getting before, but
the new patterns avoid the path through combine that emits the message about
narrowing the load down to a byte load, hence the failure.
Given we're getting better code now than before, I'm just skipping this test on
risc-v. That's the only non-whitespace change since the original version of
this patch.
--
This addresses the first level issues seen in generating better performing code
for testcases derived from pr121136. It likely regresses code size in some
cases as in many cases it selects code sequences that should be better
performing, though larger to encode.
Improving -Os code generation should remain the primary focus of pr121136. Any
improvements in code size with this change are a nice side effect, but not the
primary goal.
--
Let's take this test (derived from the PR):
_Bool func1_0x1U (unsigned int x) { return x <= 0x1U; }
Those should produce the same output. We currently get these fragments for the
3 cases. In particular note how the second variant is a two instruction
sequence.
sltiu a0,a0,2
srliw a0,a0,1
seqz a0,a0
sltiu a0,a0,2
This patch will adjust that second sequence to match the first and third and is
optimal.
Let's take another case. This is interesting as it's right at the simm12
border:
_Bool func1_0x7ffU (unsigned long x) { return x <= 0x7ffU; }
In this case the second sequence is pretty good. Not perfect, but clearly
better than the other two. This patch will fix the code for case #1 and case
So anyway, that's the basic motivation here. So to be 100% clear, while the
bug is focused on code size, I'm focused on the performance of the resulting
code.
This has been tested on riscv32-elf and riscv64-elf. It's also bootstrapped
and regression tested on the Pioneer. The BPI won't have results for this
patch until late tomorrow.
--
PR rtl-optimization/121136
gcc/
* config/riscv/riscv.md: Add define_insn to test the
upper bits of a register against zero using sltiu when
the bits are extracted via zero_extract or logial right shift.
Add 3->2 define_splits for gtu/leu cases testing upper bits
against zero.
gcc/testsuite
* gcc.target/riscv/pr121136.c: New test.
* gcc.dg/cmp-mem-const-1.c: Skip for risc-v.
* gcc.dg/cmp-mem-const-2.c: Likewise.
Robert Dubner [Thu, 6 Nov 2025 12:26:18 +0000 (07:26 -0500)]
cobol: Mainly extends compilation and execution in finternal-ebcdic.
We expanded our extended testing regime to execute many testcases in
EBCDIC mode as well as in ASCII. This exposed hundreds of problems in
both compilation (where conversions must be made between the ASCII
source code and the EBCDIC execution environment) and in run-time
functionality, where results from calls to system routines and internal
calculations that must be done in ASCII have to be converted to EBCDIC.
These changes also switch to using FIXED_WIDE_INT(128) instead of
REAL_VALUE_TYPE when initializing fixed-point COBOL variable types.
This provides for accurate initialization up to 37 digits, instead of
losing accuracy after 33 digits.
These changes also support the implementation of the COBOL DELETE FILE
(Format 2) statement.
These changes also introduce expanded support for specifying character
encodings, including support for locales.
co-authored-by: Robert Dubner <rdubner@symas.com>
co-authored-by: James K. Lowden <jklowden@cobolworx.com>
gcc/cobol/ChangeLog:
* Make-lang.in: Repair documentation generation.
* cdf.y: Changes to tokens.
* cobol1.cc (cobol_langhook_handle_option): Add comment.
* genapi.cc (function_pointer_from_name): Use data.original() for
function name.
(parser_initialize_programs): Likewise.
(cobol_compare): Make sure encodings of comparands are the same.
(move_tree): Change name of DEFAULT_SOURCE_ENCODING macro.
(parser_enter_program): Typo.
(psa_FldLiteralN): Break out dirty_to_binary() support routine.
(dirty_to_binary): Likewise.
(parser_alphabet): Rename 'alphabet' to 'collation_sequence'.
(parser_allocate): Change wsclear() to be uint32_t instead of char.
(parser_label_label): Formatting.
(parser_label_goto): Likewise.
(get_the_filename): Breakout get_the_filename(), which handles
encoding.
(parser_file_open): Likewise.
(set_up_delete_file_label): Implement DELETE FILE (Format 2).
(parser_file_delete_file): Likewise.
(parser_file_delete_on_exception): Likewise.
(parser_file_delete_not_exception): Likewise.
(parser_file_delete_end): Likewise.
(parser_call): Use data.original().
(parser_entry): Use data.original().
(mh_source_is_literalN): Convert from
sourceref.field->codeset.encoding.
(binary_initial_from_float128): Change to "binary_initial".
(binary_initial): Calculate in FIXED_WIDE_INT(128) instead of
REAL_VALUE_TYPE.
(digits_from_int128): New routine uses binary_initial.
(digits_from_float128): Removed. Kept as comment for reference.
(initial_from_initial): Use binary_initial.
(actually_create_the_static_field): Use correct encoding.
(parser_symbol_add): Likewise.
* genapi.h (parser_file_delete_file): Implement FILE DELETE.
(parser_file_delete_on_exception): Implement FILE DELETE.
(parser_file_delete_not_exception): Implement FILE DELETE.
(parser_file_delete_end): Implement FILE DELETE.
* genmath.cc: Include charmaps.h.
* genutil.cc (get_literal_string): Change name of
DEFAULT_SOURCE_ENCODING macro.
* parse.y: Token changes; numerous changes in support of encoding;
support for DELETE FILE.
* parse_ante.h (name_of): Use data.original().
(class prog_descr_t): Support of locales.
(current_options): Formatting.
(current_encoding): Formatting.
(current_program_index): Formatting.
(current_section): Formatting.
(current_paragraph): Formatting.
(is_integer_literal): Use correct encoding.
(value_encoding_check): Handle encoding changes.
(alphabet_add): Likewise.
(data_division_ready): Likewise.
* scan.l: Use data.original().
* show_parse.h: Use correct encoding.
* symbols.cc (elementize): Likewise.
(symbol_elem_cmp): Handle locale.
(struct symbol_elem_t): Likewise.
(symbol_locale): Likewise.
(field_str): Change DEFAULT_SOURCE_ENCODING macro name.
(symbols_alphabet_set): Formatting.
(symbols_update): Modify consistency checks.
(symbol_locale_add): Locale support.
(cbl_locale_t::cbl_locale_t): Locale support.
(cbl_alphabet_t::cbl_alphabet_t): New structure.
(cbl_alphabet_t::reencode): Formatting.
(cbl_alphabet_t::assign): Change name of collation_sequence.
(cbl_alphabet_t::also): Likewise.
(new_literal_add): Anticipate the need for four-byte characters.
(guess_encoding): Eliminate.
(cbl_field_t::internalize): Refine conversion of data.initial to
specified encoding.
* symbols.h (enum symbol_type_t): Add SymLocale.
(struct cbl_field_data_t): Incorporate data.orig.
(struct cbl_field_t): Likewise.
(struct cbl_delete_file_t): New structure.
(struct cbl_label_t): Incorporate cbl_delete_file_t.
(struct cbl_locale_t): Support for locale.
(hex_decode): Comment.
(struct cbl_alphabet_t): Incorporate locale; change variable name
to collation_sequence.
(struct symbol_elem_t): Incorporate locale.
(cbl_locale_of): Likewise.
(cbl_alphabet_of): Likewise.
(symbol_locale_add): Likewise.
(wsclear): Type is now uint32_t instead of char.
* util.cc (symbol_type_str): Incorporate locale.
(cbl_field_t::report_invalid_initial_value): Change test so that
pure PIC A() variables are limited to [a-zA-Z] and space.
(valid_move): Use DEFAULT_SOURCE_ENCODING macro.
(cobol_filename): Formatting.
Richard Biener [Mon, 3 Nov 2025 13:04:55 +0000 (14:04 +0100)]
SSA immediate use iterator checking
The following implements additional checking around
SSA immediate use iteration. Specifically this prevents
- any nesting of FOR_EACH_IMM_USE_STMT inside another iteration
via FOR_EACH_IMM_USE_STMT or FOR_EACH_IMM_USE_FAST when iterating
on the same SSA name
- modification (for now unlinking of immediate uses) of a SSA
immediate use list when a fast iteration of the immediate uses
of the SSA name is active
- modification (for now unlinking of immediate uses) of the immediate
use list outside of the block of uses for the currently active stmt
of an ongoing FOR_EACH_IMM_USE_STMT of the SSA name
To implement this additional bookkeeping members are put into the
SSA name structure when ENABLE_GIMPLE_CHECKING is active. I have
kept the existing consistency checking of the fast iterator.
* ssa-iterators.h (imm_use_iterator::name): Add.
(delink_imm_use): When in a FOR_EACH_IMM_USE_STMT iteration
enforce we only remove uses from the current stmt.
(end_imm_use_stmt_traverse): Reset current stmt.
(first_imm_use_stmt): Assert no FOR_EACH_IMM_USE_STMT on
var is in progress. Set the current stmt.
(next_imm_use_stmt): Set the current stmt.
(auto_end_imm_use_fast_traverse): New, lower iteration
depth upon destruction.
(first_readonly_imm_use): Bump the iteration depth.
* tree-core.h (tree_ssa_name::active_iterated_stmt,
tree_ssa_name::fast_iteration_depth): New members when
ENABLE_GIMPLE_CHECKING.
* tree-ssanames.cc (make_ssa_name_fn): Initialize
immediate use verifier bookkeeping members.
Richard Biener [Fri, 31 Oct 2025 12:08:05 +0000 (13:08 +0100)]
Make FOR_EACH_IMM_USE_STMT work w/o fake imm use node
This is an attempt to fix PR122502 by making a FOR_EACH_IMM_USE_FAST
within a FOR_EACH_IMM_USE_STMT on _the same_ VAR work without
the former running into the FOR_EACH_IMM_USE_STMT inserted marker
use operand. It does this by getting rid of the marker.
The downside is that this in principle restricts the set of operations
that can be done on the immediate use list of VAR. Where previously
almost anything was OK (but technically not well-defined what happens
to the iteration) after this patch you may only remove immediate
uses of VAR on the current stmt from the FOR_EACH_IMM_USE_STMT
iteration. In particular things will break if you happen to remove
the one immediate use of VAR on the stmt immediately following
the set of immediate uses on the currrent stmt.
Additional checking to combat such cases is implemented in a
followup.
PR tree-optimization/122502
* ssa-iterators.h (imm_use_iterator::iter_node): Remove.
(imm_use_iterator::next_stmt_use): New.
(next_readonly_imm_use): Adjust checking code.
(end_imm_use_stmt_traverse): Simplify.
(link_use_stmts_after): Likewise. Return the last use
with the same stmt.
(first_imm_use_stmt): Simplify. Set next_stmt_use.
(next_imm_use_stmt): Likewise.
(end_imm_use_on_stmt_p): Adjust.
This effective-target does not need to check for arm32, but needs to
force -march=armv8-a, otherwise -mfpu=fp-armv8 has no useful meaning.
While fixing that, introduce
check_effective_target_arm_v8_vfp_ok_nocache, so that arm_v8_vfp_ok
behaves like arm_v8_neon_ok and many other effective-targets.
Without this patch, gcc.target/arm/attr-neon.c fails with a toolchain
configured with --with-mode=thumb --with-cpu=cortex-m0
--with-float=soft because arm_v8_vfp returns "" because arm32 is
false. As a result, the testcase is compiled with the options needed
for arm_neon_ok, which generates an extra ".fpu neon" directive
compared to what is expected.
The patch removes -march=armv8-a from dg-options in lceil-vcvt_1.c,
lfloor-vcvt_1.c lround-vcvt_1.c and vrinta-ce.c, because this could
override what arm_v8_vfp_ok detected (and lead to 'error: selected
architecture lacks an FPU').
With this patch, the test passes, and several others are enabled:
gcc.target/arm/lceil-vcvt_1.c
gcc.target/arm/lfloor-vcvt_1.c
gcc.target/arm/lround-vcvt_1.c
gcc.target/arm/pr69135_1.c
gcc.target/arm/vmaxnmdf.c
gcc.target/arm/vmaxnmsf.c
gcc.target/arm/vminnmdf.c
gcc.target/arm/vminnmsf.c
gcc.target/arm/vrinta-ce.c
gcc.target/arm/vrintaf32.c
gcc.target/arm/vrintaf64.c
gcc.target/arm/vrintmf32.c
gcc.target/arm/vrintmf64.c
gcc.target/arm/vrintpf32.c
gcc.target/arm/vrintpf64.c
gcc.target/arm/vrintrf32.c
gcc.target/arm/vrintrf64.c
gcc.target/arm/vrintxf32.c
gcc.target/arm/vrintxf64.c
gcc.target/arm/vrintzf32.c
gcc.target/arm/vrintzf64.c
gcc.target/arm/vseleqdf.c
gcc.target/arm/vseleqsf.c
gcc.target/arm/vselgedf.c
gcc.target/arm/vselgesf.c
gcc.target/arm/vselgtdf.c
gcc.target/arm/vselgtsf.c
gcc.target/arm/vselledf.c
gcc.target/arm/vsellesf.c
gcc.target/arm/vselltdf.c
gcc.target/arm/vselltsf.c
gcc.target/arm/vselnedf.c
gcc.target/arm/vselnesf.c
gcc.target/arm/vselvcdf.c
gcc.target/arm/vselvcsf.c
gcc.target/arm/vselvsdf.c
gcc.target/arm/vselvssf.c
gcc/testsuite/ChangeLog:
* lib/target-supports.exp
(check_effective_target_arm_v8_vfp_ok_nocache): New.
(check_effective_target_arm_v8_vfp_ok): Call the above helper, and
use global flags.
(add_options_for_arm_v8_vfp): Use et_arm_v8_vfp_flags.
* gcc.target/arm/lceil-vcvt_1.c: Remove -march=armv8-a.
* gcc.target/arm/lfloor-vcvt_1.c: Likewise.
* gcc.target/arm/lround-vcvt_1.c: Likewise.
* gcc.target/arm/vrinta-ce.c: Likewise.
Peter Damianov [Thu, 6 Nov 2025 00:14:44 +0000 (00:14 +0000)]
libiberty: Add BigObj COFF support for LTO on Windows targets [PR122472]
This patch adds support for the BigObj COFF object file format to libiberty's
simple-object-coff.c. BigObj extends regular COFF to support a 32-bit section
count.
BigObj differs from COFF in a few ways:
* A different header structure
* 32-bit section counts instead of 16-bit
* 32-bit symbol section numbers instead of 16-bit
* 20-byte symbols instead of 18-byte symbols
(due to the extended section numbers)
For a more detailed summary, read my blog post on this subject:
https://peter0x44.github.io/posts/bigobj_format_explained/
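A rough layout sketch of the symbol-size difference (assumed field names, not libiberty's structures; the on-disk records are stored without padding):
  #include <cstdint>
  struct coff_symbol           // regular COFF symbol: 18 bytes on disk
  {
    char     name[8];
    uint32_t value;
    int16_t  section_number;   // 16-bit section number
    uint16_t type;
    uint8_t  storage_class;
    uint8_t  num_aux;
  };
  struct bigobj_symbol         // BigObj symbol: 20 bytes on disk
  {
    char     name[8];
    uint32_t value;
    int32_t  section_number;   // widened to 32 bits
    uint16_t type;
    uint8_t  storage_class;
    uint8_t  num_aux;
  };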
libiberty/ChangeLog:
PR target/122472
* simple-object-coff.c (struct external_filehdr_bigobj): New
structure for BigObj file header.
(bigobj_magic): New constant for BigObj magic bytes.
(struct external_syment_bigobj): New structure for BigObj
20-byte symbol table entries.
(union external_auxent_bigobj): New union for BigObj 20-byte
auxiliary symbol entries.
(struct simple_object_coff_read): Add is_bigobj flag and make
nscns 32-bit to support both formats.
(struct simple_object_coff_attributes): Add is_bigobj flag.
(simple_object_coff_match): Add BigObj format detection.
(simple_object_coff_read_strtab): Use format-specific symbol
size when calculating string table offset.
(simple_object_coff_attributes_merge): Check is_bigobj flag.
(simple_object_coff_write_filehdr_bigobj): New function.
(simple_object_coff_write_to_file): Add logic for writing
BigObj vs regular COFF format with appropriate symbol
and auxiliary entry structures.
Signed-off-by: Peter Damianov <peter0x44@disroot.org>
Signed-off-by: Jonathan Yong <10walls@gmail.com>
Xi Ruoyao [Tue, 4 Nov 2025 13:03:18 +0000 (21:03 +0800)]
LoongArch: Switch the default code model to medium
It has turned out the normal code model isn't enough for some large
LoongArch link units in practice. Quoting WANG Rui's comment [1]:
We’ve actually been considering pushing for a change to the default
code model for LoongArch compilers (including GCC) for a while now.
In fact, this was one of the topics discussed in yesterday’s internal
compiler tool-chain meeting. The reason we haven’t moved forward with
it yet is that the medium code model generates a R_LARCH_CALL36
relocation, which had some issues with earlier versions of the linker.
We need to assess the impact on users before proceeding with the change.
In GCC we have build-time probe for linker call36 support and if the
linker does not support it, we fall back to pcalau12i + jirl or
la.{local,global} + jirl for the medium code model. I also had some
concern about a potential performance regression caused by the
conservative nature of the relaxation process, but when I tested this
patch it turned out the relaxation is powerful enough to eliminate all
the pcaddu18i instructions in cc1plus and libstdc++.so.
The Loong Arch Linux project has been using -mcmodel=medium in their
{C,CXX}FLAGS building packages for a while [2] and they've not reported
any issues with that.
The Linux kernel developers have already anticipated the change and
explicitly specified -mcmodel=normal for a while [3].
Thus to me it's safe to make GCC 16 the first release with the medium
code model as the default now. If someone must keep the normal code
model as the default for any reason, it's possible to configure GCC
using --with-cmodel=normal.
gcc/ChangeLog:
* config.gcc: Support --with-cmodel={medium,normal} and make
medium the default for LoongArch, define TARGET_DEFAULT_CMODEL
as the selected value.
* config/loongarch/loongarch-opts.cc: Use TARGET_DEFAULT_CMODEL
instead of hard coding CMODEL_NORMAL.
* doc/install.texi: Document that --with-cmodel= is supported
for LoongArch.
* doc/invoke.texi: Update the document about default code model
on LoongArch.
Nathaniel Shead [Sun, 2 Nov 2025 04:58:39 +0000 (15:58 +1100)]
c++/modules: Complain on imported GMF TU-local entities in instantiation [PR121574]
An unfortunate side effect of the previous patch is that even with
-pedantic-errors, unless the user specifies -Wtemplate-names-tu-local
when building the module interface there will be no diagnostic at all
from instantiating a template that exposes global TU-local entities,
either when building the module or its importer.
This patch solves this by recognising imported TU-local dependencies,
even if they weren't streamed as TU_LOCAL_ENTITY nodes. The warnings
here are deliberately conservative for when we can be sure this was
actually an imported TU-local entity; in particular, we bail on any
TU-local entity that originated from a header module, without attempting
to determine if the entity came via a named module first.
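A hedged sketch of the scenario (module and entity names are invented,
not taken from the actual testcase):

  // m.cppm: module interface unit
  module;
  static int gmf_helper () { return 42; }  // TU-local: internal linkage, GMF
  export module m;
  export template <typename T>
  int wrap () { return gmf_helper (); }    // only an exposure once instantiated

  // user.cpp: importer
  import m;
  int x = wrap<int> ();  // instantiating wrap exposes the imported TU-local
                         // entity, so the importer now gets a diagnostic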
PR c++/121574
gcc/cp/ChangeLog:
* cp-tree.h (instantiating_tu_local_entity): Declare.
* module.cc (is_tu_local_entity): Extract from depset::hash.
(is_tu_local_value): Likewise.
(has_tu_local_tmpl_arg): Likewise.
(depset::hash::is_tu_local_entity): Remove.
(depset::hash::has_tu_local_tmpl_arg): Remove.
(depset::hash::is_tu_local_value): Remove.
(instantiating_tu_local_entity): New function.
(depset::hash::add_binding_entity): No longer go through
depset::hash to check is_tu_local_entity.
* pt.cc (complain_about_tu_local_entity): Remove.
(tsubst): Use instantiating_tu_local_entity.
(tsubst_expr): Likewise.
gcc/testsuite/ChangeLog:
* g++.dg/modules/internal-17_b.C: Check for diagnostics when
instantiating imported TU-local entities.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
Nathaniel Shead [Thu, 30 Oct 2025 12:13:21 +0000 (23:13 +1100)]
c++/modules: Allow ignoring some TU-local exposure errors in GMF [PR121574]
A frequent issue with migrating to C++20 modules has been dealing with
third-party libraries with internal functions or data. This causes GCC
to currently refuse to build the module if any references to these
internal-linkage declarations escape into the module CMI.
This can seem needlessly hostile, however, especially since we have the
capabilities to support this (to a degree) from header units, albeit
with the inherent ODR issues associated with their use. In aid of this,
this patch demotes the error to a pedwarn in various scenarios, by
treating some declarations as not being TU-local even if they otherwise
would have been.
Effort has been made to not alter semantics of valid programs, and to
continue to diagnose cases that the standard says we must. In
particular, any code in the module purview is still a hard error, due to
the inherent issues with exposing TU-local entities, and the lack of any
migration requirements.
Because this patch is just to assist migration, we only deal with the
simplest (yet most common) cases: namespace scope functions and
variables. Types are hard to handle neatly as we risk getting thousands
of unhelpful warnings as we continue to walk the type body and find new
TU-local entities to complain about. Templates are also tricky because
it's hard to tell if an instantiation that occurred in the module
purview only refers to global module entities or if it's inadvertently
exposing a purview entity as well. Neither of these is likely to occur
frequently in third-party code; if need be, this can be relaxed later as
well.
Similarly, even in the GMF a constexpr variable with a TU-local value
will not be usable in constant expressions in the importer, and since we
cannot easily warn about this from the importer we continue to make this
an error in the module interface.
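A hedged sketch of the migration scenario being relaxed (file and
function names are invented):

  // legacy.h: unchanged third-party header
  static int legacy_impl () { return 1; }  // internal linkage

  // iface.cppm: module interface unit
  module;
  #include "legacy.h"                      // lands in the global module fragment
  export module wrapper;
  export inline int api () { return legacy_impl (); }
  // api exposes a GMF TU-local entity in the CMI; previously a hard error,
  // with this patch it is only diagnosed with a pedwarn/warning.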
PR c++/121574
gcc/c-family/ChangeLog:
* c.opt: New warning '-Wexpose-global-module-tu-local'.
* c.opt.urls: Regenerate.
gcc/cp/ChangeLog:
* module.cc (depset::disc_bits): Replace 'DB_REFS_TU_LOCAL_BIT'
and 'DB_EXPOSURE_BIT' with four new flags
'DB_{REF,EXPOSE}_{GLOBAL,PURVIEW}_BIT'.
(depset::is_tu_local): Support checking either for only purview
TU-local entities or any entity described TU-local by standard.
(depset::refs_tu_local): Likewise.
(depset::is_exposure): Likewise.
(depset::hash::make_dependency): A constant initialized to a
TU-local variable is always considered a purview exposure.
(is_exposure_of_member_type): Adjust sanity checks to handle if
we ever relax requirements for TU-local types.
(depset::hash::add_dependency): Differentiate referencing
purview or GMF TU-local entities.
(depset::hash::diagnose_bad_internal_ref): New function.
(depset::hash::diagnose_template_names_tu_local): New function.
(depset::hash::finalize_dependencies): Handle new warnings that
might be needed for GMF TU-local entities.
gcc/testsuite/ChangeLog:
* g++.dg/modules/internal-17_a.C: New test.
* g++.dg/modules/internal-17_b.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
Eric Botcazou [Wed, 5 Nov 2025 20:15:35 +0000 (21:15 +0100)]
Ada: Fix qualified name of discriminant incorrectly accepted in constraint
The RM 3.8(12/3) subclause says that a discriminant mentioned in a
constraint must appear alone as a direct name. The last part is not
consistently checked and, while the first part is, it generates a
slightly different error message depending on the form of the input.
This fixes the last part and changes the first to use a single message.
gcc/ada/
PR ada/35793
* sem_res.adb (Check_Discriminant_Use): In a constraint context,
check that the discriminant appears alone as a direct name in all
cases and give a consistent error message when it does not.
gcc/testsuite/
* gnat.dg/specs/discr8.ads: New test.
Fix the unified shared memory test
libgomp.c++/target-std__multimap-concurrent-usm.C,
added in commit r16-1010-g83ca283853f195
"libgomp: Add testcases for concurrent access to standard C++ containers
on offload targets" as one of a number of USM variants.
This test includes the actual code of target-std__multimap-concurrent.C.
The issue is that multimap::insert allocates memory, which is freed by
the destructor. However, if the memory is allocated on a device
(in 'insert'), it also needs to be freed there (in 'clear'), as in general
freeing device-allocated memory is not possible on the host.
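A minimal sketch of the pattern the fix applies (not the actual testcase):

  #include <map>

  #pragma omp requires unified_shared_memory

  int main ()
  {
    std::multimap<int, int> m;
    #pragma omp target
    {
      m.insert ({1, 2});  // node memory is allocated on the device
      m.clear ();         // so it must also be freed on the device,
    }                     // before the host-side destructor runs
    return 0;
  }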
libgomp/ChangeLog:
* testsuite/libgomp.c++/target-std__multimap-concurrent.C: Fix memory
freeing of device allocated memory with USM.
Paul Thomas [Wed, 5 Nov 2025 12:11:00 +0000 (12:11 +0000)]
Fortran: Fix PDT constructors in associate [PR122501, PR122524]
2025-11-05 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/122501
PR fortran/122524
* primary.cc (gfc_convert_to_structure_constructor): Correct
whitespace issue.
(gfc_match_rvalue): Remove the attempt to match specific procs
before filling out PDT constructor. Instead, defer this until
resolution with the condition that there not be a following
arglist and more than one procedure in the generic interface.
gcc/testsuite/
PR fortran/122501
* gfortran.dg/pdt_66.f03: New test.
PR fortran/122524
* gfortran.dg/pdt_67.f03: New test.
Artemiy Volkov [Sat, 1 Nov 2025 17:17:15 +0000 (17:17 +0000)]
forwprop: allow subvectors in simplify_vector_constructor ()
This is an attempt to fix
https://gcc.gnu.org/pipermail/gcc-patches/2025-October/697879.html in the
middle-end; the motivation in that patch was to teach gcc to compile:
foo:
dup d31, v0.d[1]
uzp1 v0.2d, v31.2d, v0.2d
ret
Instead of adding a define_insn in the backend, this patch relaxes the
precondition of tree-ssa-forwprop.cc:simplify_vector_constructor () to
accept subvectors as constructor elements. During initial argument
processing (ll. 3817-3916), subvectors are decomposed into individual
elements before populating the ELTS array; this allows the rest of the
function to remain unchanged. Special handling is also implemented for
constant and splat subvector elements of a constructor (the latter with
the use of ssa_uniform_vector_p () from tree-vect-generic.cc, which this
patch moves to tree.cc).
Add GIMPLE tests to gcc.dg/tree-ssa demonstrating the intended behavior
with various combinations of subvectors as constructor arguments,
including constant and splat subvectors; also add some aarch64-specific
tests to show that the change leads to us picking the "ext" instruction
for the resulting VEC_PERM_EXPR.
Bootstrapped and regtested on aarch64 and x86_64, regtested on aarch64_be.
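A hedged source-level illustration (assuming an aarch64 target; not
necessarily the new testcase):

  #include <arm_neon.h>

  /* vget_high_f64/vget_low_f64 produce 64-bit subvectors and vcombine_f64
     builds a 128-bit vector from them, i.e. a CONSTRUCTOR whose elements
     are subvectors.  With the relaxed precondition forwprop can fold this
     into a single VEC_PERM_EXPR, which aarch64 matches as "ext".  */
  float64x2_t
  swap_halves (float64x2_t x)
  {
    return vcombine_f64 (vget_high_f64 (x), vget_low_f64 (x));
  }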
gcc/ChangeLog:
* tree-ssa-forwprop.cc (simplify_vector_constructor): Support
vector constructor elements.
* tree-vect-generic.cc (ssa_uniform_vector_p): Make non-static and
move ...
* tree.cc (ssa_uniform_vector_p): ... here.
* tree.h (ssa_uniform_vector_p): Declare it.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/forwprop-43.c: New test.
* gcc.target/aarch64/simd/combine_ext.c: New test.
Richard Biener [Mon, 3 Nov 2025 13:43:39 +0000 (14:43 +0100)]
Use gather_imm_use_stmts instead of FOR_EACH_IMM_USE_STMT in forwprop
The following fixes forwprop using FOR_EACH_IMM_USE_STMT to iterate
over stmts and then eventually removing the active stmt, releasing
its defs. This can cause debug stmt insertion with a RHS referencing
the SSA name we iterate over, adding to its immediate use list
but also adjusting all other debug stmts referring to the released
SSA name, updating those. And those can refer to the original
iterated-over variable.
In the end the destructive behavior of update_stmt is the problem
here: it unlinks all uses of a stmt and then links in the newly
computed ones instead of leaving in place the ones that are unchanged.
The solution is to not rely on FOR_EACH_IMM_USE_STMT but to gather the
stmt uses without duplicates up front and iterate over that list.
* tree-ssa-forwprop.cc (forward_propagate_addr_expr):
Use gather_imm_use_stmts instead of FOR_EACH_IMM_USE_STMT.
Richard Biener [Tue, 4 Nov 2025 09:48:44 +0000 (10:48 +0100)]
Add gather_imm_use_stmts helper
The following adds a helper function to gather SSA use stmts without
duplicates. It steals the only padding bit in gimple to be an
"infrastructure local flag" which should be used only temporarily
and kept cleared. I did not add accessor functions for the flag
to not encourage (ab-)uses.
I have used an auto_vec<gimple *, 2> in the API to avoid heap
allocations for most cases (without doing statistics). I have
verified that GCC 7 performs NRV optimization on the copy, but I'll
note that while auto_vec<gimple *> has copy and assign deleted,
auto_vec<gimple *, N> does not. Adding them breaks the pair-fusion.cc
compile. Without using 'auto' or range-for the API use is a bit
awkward as that exposes the number of auto-allocated elements.
The helper can be used in a range-for, see the followup for an
example.
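A hedged sketch of the intended range-for use (the exact signature is
inferred from the description above, so treat it as an approximation):

  /* Gather the immediate-use stmts of NAME once, without duplicates, so
     that later stmt modifications cannot disturb an ongoing iteration.  */
  for (gimple *use_stmt : gather_imm_use_stmts (name))
    {
      /* ... inspect or rewrite use_stmt; calling update_stmt here is safe.  */
    }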
* gimple.h (gimple::pad): Rename to ...
(gimple::ilf): ... this.
* ssa-iterators.h (gather_imm_use_stmts): Declare.
* tree-ssa-operands.cc (gather_imm_use_stmts): New function.
Richard Biener [Mon, 3 Nov 2025 12:59:36 +0000 (13:59 +0100)]
Fix unsafe stmt modifications in FOR_EACH_IMM_USE_STMT
The following fixes path isolation changing the immediate use list of
an SSA name that is currently iterated over via FOR_EACH_IMM_USE_STMT.
This happens when it duplicates a BB within this iteration and creates/modifies
stmts that contain SSA uses of the name and calls update_stmt which
re-builds SSA operands, including removal of SSA uses and re-inserting
them. This is not safe as that might cause missed iterated uses but
more importantly could cause the 'next' use to be removed.
For the case in question the fix is to collect interesting uses in
a vec and do the processing outside of the FOR_EACH_IMM_USE_STMT
iteration.
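A hedged sketch of the collect-then-process pattern (the predicate and
transform are invented placeholders):

  auto_vec<gimple *> to_transform;
  imm_use_iterator iter;
  gimple *use_stmt;
  FOR_EACH_IMM_USE_STMT (use_stmt, iter, name)
    if (interesting_use_p (use_stmt))      /* placeholder predicate */
      to_transform.safe_push (use_stmt);
  /* The imm-use list of NAME is no longer being walked here, so the
     transform may create/modify stmts and call update_stmt freely.  */
  for (gimple *stmt : to_transform)
    isolate_erroneous_path (stmt);         /* placeholder transform */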
* gimple-ssa-isolate-paths.cc (check_loadstore): Set
the volatile flag on the stmt manually.
(find_implicit_erroneous_behavior): Move code transform
outside of FOR_EACH_IMM_USE_STMT iteration.
The "vrepli.b" instruction is introduced by the init-regs pass (see
PR61810 and all the issues it references). To work it around, we can
use post-reload instead of define_expand: the "f" constraint will make
the compiler automatically move the scalar between GPR and FPR, and
reload is much later than init-regs so init-regs won't get in our way.
Now the code looks like:
movgr2fr.d $f0,$r4
vpcnt.d $vr0,$vr0
movfr2gr.d $r4,$f0
jr $r1
gcc/ChangeLog:
* config/loongarch/loongarch.md (cntmap): Change to uppercase.
(popcount<GPR:mode>2): Modify to a post reload split.
Eric Botcazou [Tue, 4 Nov 2025 18:54:45 +0000 (19:54 +0100)]
Ada: Fix incorrect legality check in instantiation of child generic unit
The problem arises when the generic unit has a formal access type parameter,
because the manual resolution implemented in Find_Actual_Type does not pick
the correct entity for the designated type. The fix replaces it with a bona
fide resolution and cleans up the associated code in the callers.
gcc/ada/
PR ada/18453
* sem_ch12.adb (Find_Actual_Type): Add Typ_Ref parameter and
perform a standard resolution on it in the fallback case.
Call Get_Instance_Of if the type is declared in a formal of
the child unit.
(Instantiate_Type.Validate_Access_Type_Instance): Adjust call
to Find_Actual_Type.
(Instantiate_Type.Validate_Array_Type_Instance): Likewise and
streamline the check for matching component subtypes.
gcc/testsuite/
* gnat.dg/specs/generic_inst9.ads: New test.
* gnat.dg/specs/generic_inst9_pkg1.ads: New helper.
* gnat.dg/specs/generic_inst9_pkg2.ads: Likewise.
* gnat.dg/specs/generic_inst9_pkg2-g.ads: Likewise.
This function acts on entire parameter declaration lists and iterates
over them. Use the plural in the name to clarify that it acts on
parameters, not just on a single parameter.
Kees Cook [Tue, 26 Aug 2025 04:09:41 +0000 (21:09 -0700)]
arc: Add const attribute support for mathematical ARC builtins
The ARC builtin functions __builtin_arc_ffs and __builtin_arc_fls
perform pure mathematical operations equivalent to the standard
GCC __builtin_ffs function, which is marked with the const attribute.
However, the ARC target-specific versions were not marked as const,
preventing compiler optimizations like common subexpression elimination.
Extend the ARC builtin infrastructure to support function attributes
and mark the appropriate mathematical builtins as const:
- __builtin_arc_ffs: Find first set bit (const)
- __builtin_arc_fls: Find last set bit (const)
- __builtin_arc_norm: Count leading zeros (const)
- __builtin_arc_normw: Count leading zeros for 16-bit (const)
- __builtin_arc_swap: Endian swap (const)
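A hedged example of the optimization this enables (assuming an ARC
target; the new test checks a similar property):

  /* With __builtin_arc_ffs marked const, the second identical call is
     eliminated by common subexpression elimination.  */
  int
  use_ffs_twice (int x)
  {
    return __builtin_arc_ffs (x) + __builtin_arc_ffs (x);
  }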
gcc/ChangeLog:
* config/arc/builtins.def: Add ATTRS parameter to DEF_BUILTIN
macro calls. Mark mathematical builtins (FFS, FLS, NORM, NORMW,
SWAP) with attr_const, leave others as NULL_TREE.
* config/arc/arc.cc: Add support for builtin function attributes.
Create attr_const using tree_cons. Update DEF_BUILTIN macro to
pass ATTRS parameter to add_builtin_function.
gcc/testsuite/ChangeLog:
* gcc.target/arc/builtin_fls_const.c: New test. Verify that
const attribute enables CSE optimization for mathematical ARC
builtins by checking that duplicate calls are eliminated and
results are optimized to shift operations.
OpenMP/Fortran: Revamp handling of labels in metadirectives [PR122369,PR122508]
When a label is matched in the first statement after the end of a metadirective
body, it is bound to the associated region. However, this prevents it from being
referenced elsewhere.
This patch fixes it by rebinding such labels to the outer region. It also
ensures that labels defined in an outer region can be referenced in a
metadirective body.
PR fortran/122369
PR fortran/122508
gcc/fortran/ChangeLog:
* gfortran.h (gfc_rebind_label): Declare new function.
* parse.cc (parse_omp_metadirective_body): Rebind labels to the outer
region. Maintain a vector of metadirective regions.
(gfc_parse_file): Initialise it.
* parse.h (GFC_PARSE_H): Declare it.
* symbol.cc (gfc_get_st_label): Look for existing labels in outer
metadirective regions.
(gfc_rebind_label): Define new function.
(gfc_define_st_label): Accept duplicate labels in metadirective body.
(gfc_reference_st_label): Accept shared DO termination labels in
metadirective body.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/pr122369-1.f90: New test.
* gfortran.dg/gomp/pr122369-2.f90: New test.
* gfortran.dg/gomp/pr122369-3.f90: New test.
* gfortran.dg/gomp/pr122369-4.f90: New test.
* gfortran.dg/gomp/pr122508-1.f90: New test.
* gfortran.dg/gomp/pr122508-2.f90: New test.
Pan Li [Mon, 3 Nov 2025 11:27:24 +0000 (19:27 +0800)]
Match: Refactor min based unsigned SAT_MUL pattern by widen mul helper [NFC]
There are 3 kinds of widen_mul during the unsigned SAT_MUL pattern, aka
* widen_mul directly, like _3 w* _4
* convert and then widen_mul, like (uint64_t)_3 w* (uint64_t)_4
* convert and then mul, like (uint64_t)_3 * (uint64_t)_4
All of them will be referenced during different forms of the unsigned
SAT_MUL pattern match, but actually we can wrap them into a helper
which presents the "widening_mul" semantics. With this helper, some
unnecessary patterns and duplicated code can be eliminated.
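A hedged source-level sketch of the saturation idiom such patterns match,
written in the convert-and-then-mul form (types chosen for illustration):

  #include <stdint.h>

  uint32_t
  sat_mul_u32 (uint32_t a, uint32_t b)
  {
    uint64_t w = (uint64_t) a * (uint64_t) b;           /* convert, then mul */
    return w > UINT32_MAX ? UINT32_MAX : (uint32_t) w;  /* saturate on overflow */
  }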
The below test suites are passed for this patch:
1. The rv64gcv fully regression tests.
2. The x86 bootstrap tests.
3. The x86 fully regression tests.
gcc/ChangeLog:
* match.pd: Add usmul_widen_mult helper and referenced by
min based unsigned SAT_MUL pattern.
On i686, offsets into object archives can be 64-bit, but they are
treated inconsistently across the LTO code, which may sometimes result in
truncation of those offsets for large archives.
Use int64_t/off_t consistently across all uses of archive offsets to
make sure that they're always read and mapped correctly.
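A hedged sketch of the failure mode (values are illustrative):

  #include <stdint.h>

  /* Archive member offsets can exceed 32 bits; storing one in a narrower
     type silently truncates it and later reads map the wrong bytes.  */
  int64_t member_offset = INT64_C (5) * 1024 * 1024 * 1024;  /* 5 GiB */
  unsigned int truncated = (unsigned int) member_offset;     /* wraps around */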
gcc/lto/ChangeLog
PR lto/122515
* lto.h (lto_section_slot): Set type of START to off_t.
* lto-common.cc (lto_read_section_data): Adjust.
* lto-object.cc (lto_obj_file_open): Set type of OFFSET to
int64_t.
gcc/ChangeLog
PR lto/122515
* lto-wrapper.cc (debug_objcopy): Set type of INOFF to int64_t.
(run_gcc): Set type of FILE_OFFSET to int64_t.
gcc/testsuite/ChangeLog
PR lto/122515
* lib/lto.exp (lto-build-archive): New procedure.
(lto-execute-1): Use it.
(lto-link-and-maybe-run, lto-get-options-main): Handle ar-link.
* gcc.dg/lto/pr122515_0.c: New test case.
* gcc.dg/lto/pr122515_1.c: New file.
* gcc.dg/lto/pr122515_2.c: Likewise.
* gcc.dg/lto/pr122515_3.c: Likewise.
* gcc.dg/lto/pr122515_4.c: Likewise.
* gcc.dg/lto/pr122515_5.c: Likewise.
* gcc.dg/lto/pr122515_6.c: Likewise.
* gcc.dg/lto/pr122515_7.c: Likewise.
* gcc.dg/lto/pr122515_8.c: Likewise.
* gcc.dg/lto/pr122515_9.c: Likewise.
Nathaniel Shead [Thu, 16 Oct 2025 11:51:23 +0000 (22:51 +1100)]
c++: Don't constrain template visibility using no-linkage variables [PR122253]
When finding the minimal visibility of a template, any reference to a
dependent automatic variable will cause the instantiation to be marked
as internal linkage. However, when processing the template decl we
don't yet know whether that should actually be the case, as a given
instantiation may not require referencing the local decl in its
mangling.
This patch fixes the issue by checking for no-linkage decls first, in
which case we just constrain using the type of the entity. We can't use
a check for lk_external/lk_internal in the other cases, as
instantiations referring to internal types can still have external
linkage as determined by the language, but should still constrain the
visibility of any declarations that refer to them.
PR c++/122253
gcc/cp/ChangeLog:
* decl2.cc (min_vis_expr_r): Don't mark no-linkage declarations
as VISIBILITY_ANON.
gcc/testsuite/ChangeLog:
* g++.dg/modules/internal-16.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Patrick Palka <ppalka@redhat.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
Tobias Burnus [Tue, 4 Nov 2025 09:23:31 +0000 (10:23 +0100)]
gfortran.dg/pr122513-2.f90: New test [PR122513]
This test is from PR122513; even though the actual error message was
already added in GCC 15, there was no testcase for the diagnostic type
Index variable 'i' at (1) cannot be specified in a locality-spec
Thus, this commit adds one.
gcc/testsuite/ChangeLog:
PR fortran/122513
* gfortran.dg/pr122513-2.f90: New test.
Kishan Parmar [Tue, 4 Nov 2025 07:11:28 +0000 (12:41 +0530)]
simplify-rtx: Canonicalize SUBREG and LSHIFTRT order for AND operations
For a given rtx expression
(and (lshiftrt (subreg X) shift) mask)
the combine pass tries to simplify the RTL form to
(and (subreg (lshiftrt X shift)) mask)
where the SUBREG wraps the result of the shift. This leaves the AND
and the shift in different modes, which complicates recognition.
The preferred canonical form is
(and (lshiftrt (subreg X) shift) mask)
where the SUBREG is inside the shift and both operations share the same
mode. This form is easier to recognize across targets and enables
cleaner pattern matching.
This patch makes simplify-rtx perform this transformation when it is
safe: the SUBREG must be a lowpart, the shift amount must be valid, and
the precision of the operation must be preserved.
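A hedged source-level example of code that can produce this RTL shape
(not taken from the PR):

  /* The shift happens in DImode; the truncation to unsigned int becomes a
     lowpart SUBREG and the mask an SImode AND, roughly
     (and (subreg (lshiftrt X 8)) 0xff), which the patch canonicalizes to
     (and (lshiftrt (subreg X) 8) 0xff).  */
  unsigned int
  extract_byte (unsigned long long x)
  {
    return (unsigned int) (x >> 8) & 0xff;
  }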
Tested on powerpc64le-linux-gnu, powerpc64-linux-gnu, and
x86_64-pc-linux-gnu with no regressions. On rs6000, the change reduces
insn counts due to improved matching.
2025-11-04 Kishan Parmar <kishan@linux.ibm.com>
gcc/ChangeLog:
PR rtl-optimization/93738
* simplify-rtx.cc (simplify_binary_operation_1): Canonicalize
SUBREG(LSHIFTRT) into LSHIFTRT(SUBREG) when valid.
David Malcolm [Tue, 4 Nov 2025 02:42:59 +0000 (21:42 -0500)]
analyzer: add event kinds for special control flow [PR122544]
The SARIF 3.38.8 "kinds" property has some verbs for expressing
control flow, but is missing some of the awkward special cases.
The spec says "If none of these values are appropriate, a SARIF
producer MAY use any value."
This patch adds the following new values:
* "throw" for throwing an exception
* "catch" for catching an exception
* "unwind" for unwinding stack frame(s) during exception-handling
* "setjmp" for calls to setjmp
* "longjmp" for calls to longjmp that rewind the program counter/stack
to the location of a previous setjmp call
gcc/analyzer/ChangeLog:
PR analyzer/122544
* checker-event.cc (catch_cfg_edge_event::get_meaning): New.
(setjmp_event::get_meaning): New.
(rewind_event::get_meaning): New.
(throw_event::get_meaning): New.
(unwind_event::get_meaning): New.
* checker-event.h (catch_cfg_edge_event::get_meaning): New decl.
(setjmp_event::get_meaning): New decl.
(rewind_event::get_meaning): New decl.
(throw_event::get_meaning): New decl.
(unwind_event::get_meaning): New decl.
gcc/ChangeLog:
PR analyzer/122544
* diagnostics/paths.cc (event::meaning::maybe_get_verb_str):
Handle the new verbs.
* diagnostics/paths.h (event::meaning::verb): Add new values
for special control flow operations.
(event::meaning::meaning): Add ctor taking just a verb.
gcc/testsuite/ChangeLog:
PR analyzer/122544
* g++.dg/analyzer/exception-path-1-sarif.py: New test script.
* g++.dg/analyzer/exception-path-1.C: Add SARIF output, and use
the above to check it.
* g++.dg/analyzer/exception-path-unwind-multiple-2-sarif.py: New
test script.
* g++.dg/analyzer/exception-path-unwind-multiple-2.C: Add SARIF
output, and use the above to check it.
* gcc.dg/analyzer/setjmp-3-sarif.py: New test script.
* gcc.dg/analyzer/setjmp-3.c: Add SARIF output, and use
the above to check it.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Eric Botcazou [Mon, 3 Nov 2025 23:40:39 +0000 (00:40 +0100)]
Ada: Fix segfault for instantiation on function call returning string
The problem is that a transient scope is created during the analysis of the
actual parameters of the instantiation and this discombobulates the complex
handling of scopes in Sem_Ch12.
gcc/ada/
PR ada/78175
* sem_ch12.adb (Hide_Current_Scope): Deal with a transient scope
as current scope.
(Remove_Parent): Likewise.
gcc/testsuite/
* gnat.dg/generic_inst15.adb: New test.
* gnat.dg/generic_inst15_pkg-g.ads: New helper.
* gnat.dg/generic_inst15_pkg.ads: Likewise.