Richard Biener [Mon, 17 Nov 2025 10:29:46 +0000 (11:29 +0100)]
Improve LIM dump and some testcases
The following avoids the newline between 'Moving statement' and the
actual stmt in dumps to make specific scanning easier.
* tree-ssa-loop-im.cc (move_computations_worker): Avoid newline
between 'Moving statement' and actual statement dump in dumpfile.
* gcc.dg/vect/slp-9.c: Use noipa function attribute, drop
-fno-early-inlining option.
* c-c++-common/restrict-2.c: Explicitly look for hoisted loads.
* gfortran.dg/pr104466.f90: Adjust.
cfgloop: Modify loop_exits_{to,from}_bb_p return type to edge
Given that when finding whether the predicate in question is satisfied
or not we already do the heavy-lifting of identifying the specific
edge that matches the particular criterion, it is wasteful to throw
the edge information away, only to potentially have to recalculate it
when true is returned.
Rather, given that a valid pointer can be treated as true and, conversely,
the NULL pointer as false, we can return the edge should we wish to use it,
while keeping the function's existing calls in the code as-is.
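For illustration, a minimal self-contained sketch of that pattern (generic
C++, not the actual cfgloop interface): a predicate that used to return bool
instead returns the matching element, which still works in boolean contexts
while also exposing the match to callers that want it.

#include <vector>

const int *find_negative (const std::vector<int> &v)
{
  for (const int &x : v)
    if (x < 0)
      return &x;      /* non-NULL: acts as "true" and hands back the match */
  return nullptr;     /* NULL: acts as "false" */
}

bool has_negative (const std::vector<int> &v)
{
  return find_negative (v);   /* existing boolean callers stay unchanged */
}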
That is, rather than keeping two 128-bit loads and using the usubl2
instruction designed to operate on the upper halves of 128-bit vector
registers, we now do four 64-bit scalar loads and operate on 64-bit
values, which leads to increased register pressure.
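A minimal reconstruction of the kind of code affected (my own illustration,
not the exact snippet from the report): the vget_high_* calls lower to
BIT_FIELD_REFs of the 128-bit loads, and breaking those loads into scalar
loads loses the usubl2 form.

#include <arm_neon.h>

void
sub_high (const uint8_t *a, const uint8_t *b, uint16_t *out)
{
  uint8x16_t va = vld1q_u8 (a);
  uint8x16_t vb = vld1q_u8 (b);
  /* These lower to BIT_FIELD_REFs of the vector loads.  */
  uint8x8_t ha = vget_high_u8 (va);
  uint8x8_t hb = vget_high_u8 (vb);
  /* With the 128-bit loads kept intact this can use usubl2.  */
  vst1q_u16 (out, vsubl_u8 (ha, hb));
}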
What happens here is the aforementioned commit lowers the vget_half_* ()
intrinsics to BIT_FIELD_REFs, at which point the logic in
tree-ssa-forwprop.cc::optimize_vector_load () kicks in, breaking down
vector loads into scalar loads as long as all uses are through
BIT_FIELD_REFs. AFAICT, this function (or before it existed, the code
comprising it) handles the following scenarios:
(1) Introduced in r10-135-ga7eb97ad269b65 in response to PR88983, this
code broke down vector loads into smaller loads whenever the target
doesn't natively support wider loads, fixing code quality issues. This
should always be a win since the original loads weren't even available in
the first place.
(2) Since r12-2728-g2724d1bba6b364, it is now also handling loads that
feed into VEC_UNPACK expressions to prefer extending scalar loads to
vector loads + vector unpack, which is beneficial at least on some
microarchitectures.
This patch restricts the optimization to those scenarios explicitly, while
adding another one on top:
(3) If any of the BIT_FIELD_REFs have scalar type, prefer scalar loads to
vector loads to reduce possible traffic between scalar and vector register
files. IOW, only if all BIT_FIELD_REFs are used as subvectors, assume
there might be other instructions operating on those subvectors that do
not leave the vector register file, and do not perform the transformation.
To summarize, after this patch, if either (1), (2), or (3) holds, narrow
loads are preferred, otherwise vector loads are left intact.
Bootstrapped and regtested on aarch64 and x86_64, no regressions on
SPEC2017, the code snippet above added as an aarch64-specific test.
gcc/ChangeLog:
* tree-ssa-forwprop.cc (optimize_vector_load): Inhibit
optimization when all uses are through subvectors without
extension.
Jakub Jelinek [Mon, 17 Nov 2025 08:44:05 +0000 (09:44 +0100)]
GCC, meet C++20
I've tried to test a patch to switch the C++ default from -std=gnu++17
to -std=gnu++20 (will post momentarily), but ran into various problems
during GCC bootstraps; our codebase isn't fully C++20 ready.
The most common problems are arithmetic or bitwise operations
between enumerators of different enum types (or between an enumerator
and a floating point value in the testsuite), ambiguous overloaded
operator == because of forgotten const qualification of the argument,
and then libcody being incompatible with C++20, which introduced the
char8_t type and uses it for u8 literals.
The following patch fixes various issues I've run into, for libcody
this patch just makes sure code including cody.hh can be compiled
with -std=gnu++20, libcody itself I have a tweak in the other patch.
Nothing in this patch will make the code invalid for C++14.
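As a minimal self-contained illustration of the first class of problems
(my own example, not taken from the GCC sources):

enum color { RED = 1 };
enum size  { BIG = 2 };

/* Deprecated in C++20 (an error during bootstrap with -Werror): a bitwise
   operation between enumerators of different enumeration types.  */
int mix_bad () { return RED | BIG; }

/* The usual fix: convert explicitly before combining.  */
int mix_ok () { return (int) RED | (int) BIG; }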
2025-11-17 Jakub Jelinek <jakub@redhat.com>
gcc/
* tree-core.h (enum built_in_function): Avoid arithmetics or
bitwise operations between enumerators from different enums.
* lto-streamer.h (lto_tag_is_gimple_code_p): Likewise.
* gimple.h (gimple_omp_atomic_set_memory_order): Likewise.
* common/config/i386/i386-cpuinfo.h (M_CPU_SUBTYPE_START,
M_CPU_TYPE): Likewise.
* tree-complex.cc (expand_complex_libcall): Likewise.
* ipa-modref-tree.h (modref_access_node::operator ==): Change
argument type from modref_access_node & to
const modref_access_node &.
* ipa-modref-tree.cc (modref_access_node::operator ==): Likewise.
gcc/cobol/
* symbols.cc (symbol_table_init): Avoid arithmetics or
bitwise operations between enumerators from different enums.
gcc/fortran/
* parse.cc (gfc_parse_file): Avoid arithmetics or
bitwise operations between enumerators from different enums.
libcody/
* cody.hh (MessageBuffer::Space): For C++14 or newer use
(char) u8' ' instead of Detail::S2C(u8" ").
Jakub Jelinek [Mon, 17 Nov 2025 08:42:56 +0000 (09:42 +0100)]
OpenMP/OpenACC tests vs. C++26
OpenMP/OpenACC array sections, generally expr[expr:expr] or
expr[expr:expr:expr], can have any of the exprs between [ and ]
omitted: the first (low-bound) defaults to 0, the last (stride)
defaults to 1, and the middle one (length) for some arrays defaults to
ceil((size - lower-bound)/stride).
People have been writing this for years without spaces between [ and :
and between : and ] when that expr has been omitted, but I guess for C++26
one needs to add a space. I think [ :: ] isn't going to be parsed
the same as [ : : ] either.
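A minimal illustration of the affected spelling (my own example, not one of
the actual testsuite files): when the low-bound is omitted, the section now
needs a space after the '[' so the bracket and the colon do not lex as a
single token under C++26.

void f (int *a)
{
#pragma omp target map(tofrom: a[ :8])   /* previously commonly written a[:8] */
  a[0] = 1;
}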
Richard Biener [Thu, 6 Nov 2025 12:19:35 +0000 (13:19 +0100)]
[x86] avoid using masked vector epilogues when no scalar epilog is needed
The following arranges for avoiding masked vector epilogues when we'll
eventually arrive at a vector epilogue with VF == 1 which implies no
scalar epilog will be necessary.
This avoids regressing performance in OpenColorIO when the
avx512_masked_epilogues tuning is enabled. A testcase for one
example case is shown in PR122573.
PR tree-optimization/122573
* config/i386/i386.cc (ix86_vector_costs::finish_cost): Avoid
using masked epilogues when an SSE epilogue would have a VF of one.
* gcc.dg/vect/costmodel/x86_64/costmodel-pr122573.c: New testcase.
Andrew MacLeod [Fri, 14 Nov 2025 21:11:30 +0000 (16:11 -0500)]
Allow single PHI initial values.
There are some single PHI groups that can benefit from an initial
value. Also improve the iteration calculation by bounding each
iteration with the known global value.
PR tree-optimization/121345
gcc/
* gimple-range-phi.cc (phi_group::phi_group): Add modifier name.
(phi_group::is_modifier_p): Set modifier stmt operand name.
(phi_group::calculate_using_modifier): Bound the iteration range
by known global range.
(phi_analyzer::process_phi): Allow single PHIs if they meet certain
criteria.
* gimple-range-phi.h (m_modifier_name): New member.
(is_modifier_p): Adjust prototype.
Andrew MacLeod [Fri, 14 Nov 2025 21:06:42 +0000 (16:06 -0500)]
Turn the PHI analyzer into a simple pre-pass
Rather than having a dynamic analyzer around that is handcuffed by
only global values, invoke it as a prepass in VRP and put all values it finds
in the query's global cache via update_range_info.
gcc/
* gimple-range-fold.cc (fold_using_range::range_of_phi): Remove
the PHI analysis query.
* gimple-range-phi.cc (phi_analysis_object): Delete.
(phi_analysis_initialize): Delete.
(phi_analysis_finalize): Delete.
(phi_analysis_available_p): Delete.
(phi_analysis): Invoke a phi analyzer.
(phi_analyzer::phi_analyzer): Preprocess all phi nodes and set
global values for them in a query.
(phi_analyzer::process_phi): Use query, and export any initial
values found to the query.
* gimple-range-phi.h (m_global): Delete.
(phi_analysis_initialize): Delete.
(phi_analysis_finalize): Delete.
(phi_analysis_available_p): Delete.
(phi_analysis): Change prototype.
* tree-vrp.cc (execute_ranger_vrp): Call phi_analysis.
gcc/testsuite/
* gcc.dg/pr102983.c: Adjust final check.
Andrew MacLeod [Fri, 14 Nov 2025 20:39:18 +0000 (15:39 -0500)]
Update current query global when system global changes.
This ensures the current range_query's internal tracking of a global value
matches anything another entity sets.
* gimple-range.cc (gimple_ranger::update_range_info): New.
* gimple-range.h (update_range_info): New prototype.
* tree-ssanames.cc (set_range_info): Update the range info for
the current range query.
* value-query.h (update_range_info): New prototype.
* value-query.cc (update_range_info): New default stub.
Andrew Pinski [Sat, 15 Nov 2025 22:51:32 +0000 (14:51 -0800)]
cfgcleanup: Fix check for preheaders
I had messed up the check in r16-5258-g1d8e2d51e5c5cb for preheaders,
where we would return true and remove the forwarder preheader block even if
LOOPS_HAVE_PREHEADERS was set. I am not sure how often this happens because
most of the time the pre-header will have an incoming phi anyway, but it is
safer not to remove it in this case.
Bootstrapped and tested on x86_64-linux-gnu.
gcc/ChangeLog:
* tree-cfgcleanup.cc (tree_forwarder_block_p): Restore check on
LOOPS_HAVE_PREHEADERS.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Jonathan Wakely [Sun, 16 Nov 2025 13:57:25 +0000 (13:57 +0000)]
libstdc++: Include <mutex> in syncbuf.cc [PR122698]
For most configurations bits/std_mutex.h would already be included by
<syncstream>, but not if configured with _GLIBCXX_USE_CXX11_ABI=0 as the
default, because syncbuf is disabled in that case.
libstdc++-v3/ChangeLog:
PR libstdc++/122698
* src/c++20/syncbuf.cc (__syncbuf_get_mutex): Include <mutex>.
Fix indentation of function body.
Nathaniel Shead [Sat, 15 Nov 2025 04:11:55 +0000 (15:11 +1100)]
c++/modules: Keep tracking instantiations of static class variable templates [PR122625]
r16-4930-gfd5c057c2d01 ensured that we noted all class-scope variables.
But I also added a clause to 'read_var_def' to skip all class-scope
instantiations, under the mistaken belief that this would be handled in
read_class_def.
But as the testcase shows, read_class_def cannot (and should not)
register instantiations of member variable templates, as when reading
the class it just sees the template declaration. So this patch
re-enables tracking instantiations of class-scope variable templates.
PR c++/122625
gcc/cp/ChangeLog:
* module.cc (trees_in::read_var_def): Also track class-scope
primary template specialisations.
gcc/testsuite/ChangeLog:
* g++.dg/modules/inst-7_a.C: New test.
* g++.dg/modules/inst-7_b.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
Richard Biener [Fri, 14 Nov 2025 12:40:50 +0000 (13:40 +0100)]
Always analyze possible partial vector usage
The following makes us always start with LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P
as true and only makes vect_determine_partial_vectors_and_peeling
honor --param vect-partial-vector-usage or explicit requests from the
target for epilog vectorization. This exposes whether we could have
used partial vectors to the target at costing time as even when the
main loop is never supposed to get masked the value is useful to
determine possible epilog loop masking.
* tree-vect-loop.cc (_loop_vec_info::_loop_vec_info):
Initialize can_use_partial_vectors_p to true.
(vect_determine_partial_vectors_and_peeling): Add masked_p
parameter and honor it.
(vect_analyze_loop_2): Pass through masked_p.
(vect_analyze_loop_1): Pass down masked_p.
(vect_analyze_loop): Simplify check on possible masking of
the epilog when there's no .WHILE_ULT.
Richard Biener [Fri, 14 Nov 2025 13:07:01 +0000 (14:07 +0100)]
Decide on LOOP_VINFO_USING_SELECT_VL_P after determining partial vectors
The following makes us decide on partial vectors first so we can
use LOOP_VINFO_USING_PARTIAL_VECTORS_P to decide on a decrementing IV
and LOOP_VINFO_USING_SELECT_VL_P as followup.
* tree-vect-loop.cc (vect_determine_partial_vectors_and_peeling):
Remove resetting of LOOP_VINFO_USING_SELECT_VL_P.
(vect_analyze_loop_2): Decide on partial vectors before
deciding on decrementing IV or .SELECT_VL usage.
Richard Biener [Fri, 14 Nov 2025 13:10:24 +0000 (14:10 +0100)]
Do not call vect_determine_partial_vectors_and_peeling from transform
It gets more difficult to maintain that this doesn't make any changes late
(see followups), so kill it. We do have to retain re-setting of
LOOP_VINFO_PEELING_FOR_NITER though, since
vect_need_peeling_or_partial_vectors_p is incorrect for epilogues
when done during analysis. We should fix this of course.
* tree-vectorizer.h (vect_determine_partial_vectors_and_peeling):
Remove.
(vect_need_peeling_or_partial_vectors_p): Declare.
* tree-vect-loop.cc (vect_determine_partial_vectors_and_peeling):
Make static.
(vect_need_peeling_or_partial_vectors_p): Export.
* tree-vect-loop-manip.cc (vect_do_peeling): Do not call
vect_determine_partial_vectors_and_peeling but instead
re-compute LOOP_VINFO_PEELING_FOR_NITER using
vect_need_peeling_or_partial_vectors_p.
Richard Biener [Fri, 14 Nov 2025 12:54:20 +0000 (13:54 +0100)]
Remove LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P
This is a write-only parameter; it cannot be relied upon either.
So remove it.
* tree-vectorizer.h (_loop_vec_info::epil_using_partial_vectors_p):
Remove.
(LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P): Likewise.
* tree-vect-loop.cc (_loop_vec_info::_loop_vec_info):
Do not initialize epil_using_partial_vectors_p.
(vect_determine_partial_vectors_and_peeling): Do not set it.
Lewis Hyatt [Sun, 16 Nov 2025 04:10:52 +0000 (23:10 -0500)]
diagnostics: Fix -fdump-internal-locations for 64-bit location_t
When adding support for 64-bit location_t in GCC 15, I missed a couple
changes needed for the internal debugging tool -fdump-internal-locations to
work properly. This would previously ICE on a location_t large enough to
overflow a signed 32-bit int.
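For context, a minimal sketch of the kind of helper involved (my own
illustration, not the actual GCC implementation): num_digits needs to take a
64-bit argument so that large location_t values cannot overflow a signed
32-bit int.

#include <cstdint>

static int
num_digits (uint64_t value)
{
  int digits = 1;
  while (value >= 10)
    {
      value /= 10;
      ++digits;
    }
  return digits;
}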
gcc/ChangeLog:
* diagnostics/context.cc (num_digits): Change argument type from
`int' to `uint64_t'.
(test_num_digits): Add test for 64-bit argument.
* diagnostic.h (num_digits): Adjust prototype.
* input.cc (write_digit): Accept argument in range 0-9 instead of
an arbitrary int.
(write_digit_row): Adjust to change in write_digit().
gcc/testsuite/ChangeLog:
* gcc.dg/plugin/location-overflow-test-3.c: New test.
* gcc.dg/plugin/plugin.exp: Add the new test.
Andrew Pinski [Sun, 16 Nov 2025 04:13:45 +0000 (20:13 -0800)]
aarch64: unxfail pr117123.C
This testcase now passes for aarch64 after r16-5258-g1d8e2d51e5c5.
Keeping the loop pre-header around helped to better thread the jumps
and fix the issue at hand.
Pushed as obvious.
gcc/testsuite/ChangeLog:
* g++.dg/tree-ssa/pr117123.C: un-xfail.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Jason Xu [Sun, 3 Aug 2025 22:19:04 +0000 (18:19 -0400)]
gcc: Make aarch64-mingw32 target install wrap stdint.h
A wrapped stdint.h for AArch64 MinGW32 is useful for bare-metal PE
targets, e.g. UEFI, as those platforms do not provide a system
stdint.h. This aligns with the x86_64 mingw32 target, which provides a
wrapped stdint.h.
I have tested this by compiling an AArch64 UEFI application using gcc's
stdint.h with the -ffreestanding flag, and executing the application with
AAVMF (edk2) inside QEMU.
gcc/ChangeLog:
* config.gcc (aarch64-*-mingw*): Set use_gcc_stdint to wrap.
Jason Merrill [Sat, 15 Nov 2025 17:43:37 +0000 (23:13 +0530)]
c++/modules: explicit inst of constructor
The extern template __shared_ptr<filesystem::_Dir> in bits/fs_dir.h was
leading to an ICE in import_export_decl in 29_atomics/atomic_ref/address.cc
because we had the nonsensical combination of DECL_REALLY_EXTERN and
!DECL_INTERFACE_KNOWN. This turned out to be because mark_decl_instantiated
was exiting early if TREE_ASM_WRITTEN since way back in the pre-cgraph days,
and expand_or_defer_fn_1 sets TREE_ASM_WRITTEN on maybe-in-charge ctors.
The mark_decl_instantiated code is long-obsolete, so let's just remove it.
Jeff Law [Sat, 15 Nov 2025 16:26:25 +0000 (09:26 -0700)]
[RISC-V] Avoid most calls to gen_extend_insn
Yet more infrastructure on our way to eliminating some define_insn_and_split
constructs.
The RISC-V port is using gen_extend_insn to directly generate a SIGN or ZERO
extend insn. This is undesirable because we don't actually have a full set of
extension instructions, particularly zero extension for the base architecture.
We've gotten away with this because we've had a define_insn_and_splits which
claim to support the full set of zero/sign extensions. We very much want to
eliminate that little white lie. So we need to fix those pesky calls to
gen_extend_insn.
Similar to a patch from earlier this week, convert_modes comes to the rescue.
It'll run through the expander path allowing us to generate the desired code.
In most cases it's a trivial replacement.
One case is left in the tree. For that case the source operand is known to be
a MEM and we can always extend a load from a MEM. Converting this one would
result in infinite recursion through riscv_legitimize_move.
One case is perhaps nontrivial. convert_move will emit the code to perform the
conversion into a fresh pseudo register. In one case we need to make sure that
value is copied into the output register for an insn. So a trivial
emit_move_insn is needed.
Built and regression tested on riscv32-elf and riscv64-elf. It's also
bootstrapped on the Pioneer. Regression testing is in progress, but won't
finish for many hours. The BPI is spinning this change right now, but won't
have results until tomorrow night.
gcc/
* config/riscv/riscv.cc (riscv_legitimize_move): Use convert_modes
rather than gen_extend_insn for most cases.
* config/riscv/riscv.md (addv<mode>4): Likewise.
(uaddv<mode>4, subv<mode>4, usubv<mode>4): Likewise.
(mulv<mode>4, umulv<mode>4): Likewise.
* config/riscv/sync.md (atomic_compare_and_swap<mode>): Likewise.
Jakub Jelinek [Sat, 15 Nov 2025 15:06:05 +0000 (16:06 +0100)]
testsuite: Fix up c-c++-common/asan/asan-stack-small.c test
Here is a fix for the test I've talked about today in the libsanitizer
update mail.
The test relied on a coming before b coming before c, all with 32 byte
distances, but gcc can actually emit them in the exact opposite order
or some other one.
2025-11-15 Jakub Jelinek <jakub@redhat.com>
* c-c++-common/asan/asan-stack-small.c (pa, pb, pc): Make these
vars volatile.
(uintptr_t): New typedef.
(main): Use access of b using pa pointer with offset depending on
how exactly the 3 variables are laid out in the frame.
Jakub Jelinek [Sat, 15 Nov 2025 15:04:56 +0000 (16:04 +0100)]
cobol: Fix bootstrap [PR122691]
Andrew's recent r16-5258 change broke bootstrap on x86_64-linux with
cobol enabled, the error is
../../gcc/cobol/lexio.cc: In function ‘std::pair<std::__cxx11::list<replace_t>,
char*> parse_replace_pairs(const char*, const char*, bool)’:
../../gcc/cobol/lexio.cc:907:76: error: ‘%.*s’ directive argument is null
[-Werror=format-overflow=]
907 | dbgmsg( "%s:%d: %s: " HOST_SIZE_T_PRINT_UNSIGNED " pairs parsed from '%.*s'",
| ^~~~
The problem is that some jump threading is happening now that didn't happen
before and a dbgmsg call is duplicated, once with 0, NULL as the last two
arguments, once with some size and pointer.
The following patch makes sure we never call it with NULL pointer, even when
the size is 0, to silence the warning.
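A minimal self-contained illustration of the pattern used in the fix (my own
example, not the cobol source): never pass a NULL pointer for a %.*s
argument, even when the length is zero.

#include <stdio.h>

static void
report (size_t len, const char *p)
{
  /* Passing NULL for %.*s is flagged by -Wformat-overflow even when
     len == 0, so substitute an empty string in that case.  */
  printf ("parsed '%.*s'\n", (int) len, len ? p : "");
}

int main () { report (0, NULL); return 0; }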
2025-11-15 Jakub Jelinek <jakub@redhat.com>
PR cobol/122691
* lexio.cc (parse_replace_pairs): Replace parsed.stmt.p with
parsed.stmt.size() ? parsed.stmt.p : "" in the last argument to
dbgmsg.
Jason Merrill [Fri, 14 Nov 2025 12:22:57 +0000 (17:52 +0530)]
c++/modules: fix hash_map issue
Building std.compat.cc was crashing for me because we would first get a
pointer into imported_temploid_friends, then insert a new entry, causing the
hash_map to expand, and then dereference the pointer into the former
location of the hash table. Fixed by dereferencing the pointer before
inserting rather than after.
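The bug pattern, illustrated with std::vector for a self-contained analogy
(GCC's hash_map invalidates element pointers when the table expands, much
like a vector invalidates them on reallocation):

#include <vector>

int buggy (std::vector<int> &v)
{
  int *p = &v[0];    /* pointer into the container's storage */
  v.push_back (1);   /* may reallocate and invalidate p */
  return *p;         /* use of stale storage if it did */
}

int fixed (std::vector<int> &v)
{
  int val = v[0];    /* dereference (copy) before the insertion */
  v.push_back (1);
  return val;
}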
gcc/cp/ChangeLog:
* module.cc (transfer_defining_module): Dereference
pointer into hash_map before possible insertion.
Jason Merrill [Fri, 14 Nov 2025 17:59:38 +0000 (23:29 +0530)]
c++/modules: using builtin
Here, when we try to bring "memset" back into the global namespace, we find
the built-in, see that it's the same declaration (because the module brought
it into the other namespace with a using-declaration), and decide that we
don't need to do anything. But we still need a non-hidden overload.
Jason Merrill [Wed, 12 Nov 2025 09:33:46 +0000 (15:03 +0530)]
c++/modules: friend void foo<bar>()
23_containers/mdspan/layouts/padded.cc was failing because on load we were
wrongly treating the __get_static_stride friends as equivalent between
layout_left_padded and layout_right_padded. This happened because we were
wrongly pushing these declarations into namespace scope even though we don't
yet know what template they instantiate. Fixed by using the same
MK_local_friend mechanism as template friends.
gcc/cp/ChangeLog:
* decl.cc (grokfndecl): Set DECL_CHAIN of a friend f<>.
* module.cc (trees_out::get_merge_kind): Give it MK_local_friend.
(trees_out::decl_container): Its container is the befriender.
(trees_out::key_mergeable): Expand comment.
* cp-tree.h (decl_specialization_friend_p): New.
* friend.cc (do_friend): Use it.
* pt.cc (tsubst_friend_function): Likewise.
gcc/testsuite/ChangeLog:
* g++.dg/modules/friend-11_a.C: New test.
* g++.dg/modules/friend-11_b.C: New test.
Karl Meakin [Fri, 17 Oct 2025 13:32:59 +0000 (13:32 +0000)]
aarch64: Add `aarch64_comparison_operator_cc`
Deduplicate the checks against `ccmode` by extracting to a new
predicate.
gcc/ChangeLog:
* config/aarch64/aarch64.md (mov<ALLI_GPF:mode>cc): Use new predicate.
(mov<GPF:mode><GPI:mode>cc): Likewise.
(<neg_not_op><mode>cc): Likewise.
* config/aarch64/predicates.md (aarch64_comparison_operator_cc):
New predicate.
Karl Meakin [Tue, 30 Sep 2025 12:05:00 +0000 (12:05 +0000)]
aarch64: Fix condition accepted by mov<GPF>cc
Apply the same fix from bc11cbff9e648fdda2798bfa2d7151d5cd164b87
("aarch64: Fix condition accepted by mov<ALLI>cc") to `MOV<GPF>cc`.
Fixes ICEs when compiling code such as `cmpbr-4.c` and `cmpbr-5.c` with `+cmpbr`.
gcc/testsuite/ChangeLog:
* gcc.target/arm/mve/intrinsics/sqshl_check_shift.c: New test.
* gcc.target/arm/mve/intrinsics/srshr_check_shift.c: New test.
* gcc.target/arm/mve/intrinsics/uqshl_check_shift.c: New test.
* gcc.target/arm/mve/intrinsics/urshr_check_shift.c: New test.
gcc/testsuite/ChangeLog:
* gcc.target/arm/mve/intrinsics/sqshll_check_shift.c: New test.
* gcc.target/arm/mve/intrinsics/srshrl_check_shift.c: New test.
* gcc.target/arm/mve/intrinsics/uqshll_check_shift.c: New test.
* gcc.target/arm/mve/intrinsics/urshrl_check_shift.c: New test.
This is caused by a combination of things: the vector is
uninitialized, DAP requires a count of the number of children of a
variable, and libstdc++ printers don't implement the 'num_children'
method, so gdb tries to count children by iterating.
In this case, the vector has a nonsensical size:
(gdb) p myVector
$1 = std::vector of length -34979931, capacity -33992726
This patch adds a 'num_children' method to a subset of the
pretty-printers, in particular ones where I thought the length might
be arbitrarily large and susceptible to being garbage when the object
isn't initialized.
I've also specifically added a check to the vector printer for the
case where the length is negative.
These container printers could be further improved by adding the
'child' method, allowing random access to child objects. However I
haven't done that here.
libstdc++-v3/ChangeLog:
* python/libstdcxx/v6/printers.py (StdVectorPrinter._bounds):
New method.
(StdVectorPrinter.to_string): Use it.
(StdVectorPrinter.num_children): New method.
(StdStackOrQueuePrinter.num_children): New method.
(StdMapPrinter.num_children): New method.
(StdSetPrinter.num_children): New method.
(StdDequePrinter._size): New method.
(StdDequePrinter.to_string): Use it.
(StdDequePrinter.num_children): New method.
(Tr1UnorderedSetPrinter.num_children): New method.
(Tr1UnorderedMapPrinter.num_children): New method.
(StdSpanPrinter.num_children): New method.
Tomasz Kamiński [Fri, 14 Nov 2025 16:43:59 +0000 (17:43 +0100)]
libstdc++: Ensure that _Utf_view is always a view.
Previously, _Utf_view accepted any input_range, including reference-to-array
types like char(&)[2], and stored it as the _M_base member. In such cases,
_Utf_view was not assignable, failing the requirements of the view concept.
This patch addresses the issue by adding the ranges::view constraint to the
second template parameter of _Utf_view, and for clarity renaming it from
_Range to _View. The constructor is also adjusted to accept its argument
by value (views must be O(1) move-constructible). This prevents implicitly
generated CTAD from deducing a reference type.
This makes _Utf_view consistent with both other standard views and the
wording from P2728R8: Unicode in the Library, Part 1: UTF Transcoding [1].
The explicit CTAD from viewable_range is not defined for _Utf_view because
it depends on views::all_t, views::ref_view, and views::owning_view,
which are declared in <ranges>. Consequently, users must explicitly cast
the argument to a view or specify it as a template parameter.
* include/bits/unicode.h (_Utf_view): Rename the template parameter
from _Range to _View and constrain it with ranges::view.
(_Utf_view::_Utf_view): Accept by value instead of rvalue reference.
* include/std/format (__format::__write_padded): Replace _Utf_view
over const char32_t(&)[1] with span<const char32_t, 1>.
* testsuite/ext/unicode/view.cc: Add checks that specializations
of _Utf_view satisfy view. Wrap arrays into std::span before
constructing _Utf_view.
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
Richard Biener [Thu, 13 Nov 2025 12:40:27 +0000 (13:40 +0100)]
ipa/122663 - fix ICE with stmt removal during IPA modification
We currently remove stmts inside of a FOR_EACH_IMM_USE_STMT iteration
which can be problematic. The following adjusts purge_all_uses
to gather all stmts to remove and remove them in reverse order
afterwards which also better deals with debug stmt generation.
PR ipa/122663
* ipa-param-manipulation.cc (purge_all_uses): Collect
stmts to remove and process that list in reverse.
Tomasz Kamiński [Fri, 24 Oct 2025 08:24:26 +0000 (10:24 +0200)]
libstdc++: Use _Bind_front_t/_Bind_back_t in bind_front<f>/bind_back<f> [PR122032]
This patch changes the implementation of bind_front<f> and bind_back<f> to
return a _Bind_front_t<_Bind_fn_t<f>, ...> and _Bind_back_t<_Bind_fn_t<f>, ...>
respectively, replacing the previous lambda-based implementation. The prior use
of a lambda caused non-conforming behavior with respect to C++23 [func.require]
p8, which requires that bind_front<f>(s), bind_front<f>(move(s)), and
bind_front<f>(as_const(s)) produce the same type.
Additionally, using specialized structs reduces the size of the resulting functor
in certain scenarios (see PR).
For the zero-argument case, the function still returns a _Bind_fn_t<f>. Since this
type is already a perfect forwarding call wrapper, it yields the same result as
_Bind_front_t<_Bind_fn_t<f>>.
A consequence of this change is that the types returned by bind_front<f>(args...)
and bind_back<f>(args...) are no longer structural - they are not required to be
structural by the standard.
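A small usage sketch of the requirement being restored (my own example; it
assumes a compiler providing the NTTP overloads of bind_front): the three
calls below must all produce the same type.

#include <functional>
#include <type_traits>
#include <utility>

int add (int a, int b) { return a + b; }

void demo ()
{
  int s = 1;
  auto f1 = std::bind_front<add> (s);
  auto f2 = std::bind_front<add> (std::move (s));
  auto f3 = std::bind_front<add> (std::as_const (s));
  static_assert (std::is_same_v<decltype (f1), decltype (f2)>);
  static_assert (std::is_same_v<decltype (f1), decltype (f3)>);
}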
PR libstdc++/122032
libstdc++-v3/ChangeLog:
* include/std/functional (std::bind_front<f>, std::bind_back<f>):
Define in terms of _Bind_front_t/_Bind_back_t.
* testsuite/20_util/function_objects/bind_back/nttp.cc: New tests.
* testsuite/20_util/function_objects/bind_front/nttp.cc: New tests.
Reviewed-by: Patrick Palka <ppalka@redhat.com> Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
Richard Biener [Thu, 6 Nov 2025 10:49:31 +0000 (11:49 +0100)]
tree-optimization/122573 - enhance SLP of invariant loads
Currently SLP of invariant loads is only supported for the case of
a single load that is splat, as side-effect of supporting this case
even for non-invariant loads. The following extends this to any
set of invariant loads. The way we have load permutations for
these makes it a bit awkward, thus adjustments in that area.
PR tree-optimization/122573
* tree-vect-slp.cc (vect_build_slp_tree_1): Support
groups of invariant loads.
(vect_build_slp_tree_2): Likewise.
(vect_transform_slp_perm_load_1): Likewise.
* tree-vect-stmts.cc (vectorizable_load): Handle non-splat
SLP for invariant loads.
Richard Biener [Fri, 14 Nov 2025 07:20:56 +0000 (08:20 +0100)]
tree-optimization/122680 - avoid range query during vect transform
Range queries during analysis on the original loop might not yield
the same result as those on the epilog during transform. Separate
analysis from transform here.
PR tree-optimization/122680
* tree-vect-stmts.cc (vectorizable_conversion): Avoid range
queries during transform.
Rainer Orth [Fri, 14 Nov 2025 08:12:34 +0000 (09:12 +0100)]
build: Require binutils 2.30+ on Solaris [PR121457, PR121458]
I recently noticed that gcc/configure.ac contains quite a number of
checks for Solaris ld and GNU ld versions that can be massively
simplified. GCC trunk only supports Solaris 11.4, thus Solaris ld is at
least at version 5.11-1.3159 (the one in 11.4 FCS), and GNU ld can be
required to be at least 2.30.1, the version bundled in 11.4 FCS.
This way quite a number of special cases can simply be removed, as well
as some macros that depend on them and the code they guard.
To ensure that nobody tries to use an older self-compiled version of GNU
ld, the minimum version is checked at configure time.
This change also allowed to fix two bugs that were caused by checks for
*_sol2 among the linker emulations listed by gld -V, which are only valid
when targeting Solaris. Before, those checks were done irrespective of
target, causing checks to go wrong when a version of binutils configured
with --enable-targets=all was used. Since now all versions of GNU ld
supported on Solaris are known to support those *_sol2 emulations, the
checks can be replaced by hardcoding the emulations when targeting
Solaris.
Bootstrapped without regressions on i386-pc-solaris2.11,
sparc-sun-solaris2.11, and x86_64-pc-linux-gnu.
arm: add support for out of range shift amount in MVE asrl and lsll [PR122216]
MVE asrl and lsll instructions have two variants:
- immediate shift amount in the [1..32] range
- shift amount in a register, where negative values reverse the
direction of the shift
However, RTL assumes that the shift amount is interpreted unsigned, so
we want to make sure undesired simplifications do not take place.
For instance, if simplify_rtx optimizes
(set (reg:SI 1) (const_int -5))
(set (reg:DI 2) (ashift:DI (reg:DI 3) (reg:SI 1)))
into:
(set (reg:DI 2) (ashift:DI (reg:DI 3) (const_int -5)))
we do not want this to be interpreted as undefined behavior.
We handle this using a general pattern where:
- immediates are handled by a define_insn_and_split pattern which
directly maps immediates in [1..32] to the shift operator and splits
other cases as needed.
- non-immediates are handled by another pattern
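For reference, a small source-level example of the register-shift form in
question (my own illustration, assuming an MVE-enabled target and the
arm_mve.h scalar shift intrinsics): the shift amount lives in a register and
a negative value reverses the shift direction, so the compiler must not
treat a propagated negative constant as undefined.

#include <stdint.h>
#include <arm_mve.h>

int64_t
shift_by (int64_t x, int32_t n)
{
  /* asrl: arithmetic shift right by a register amount; a negative n
     shifts left instead.  */
  return asrl (x, n);
}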
gcc/ChangeLog:
PR target/122216
* config/arm/arm.md (ashldi3, ashrdi3): Force shift amount into
QImode.
* config/arm/constraints.md: Fix comment, Pg is valid in Thumb-2
state only.
* config/arm/mve.md (mve_asrl): Handle various shift amount ranges.
(mve_asrl_imm, mve_asrl_internal): New patterns.
(mve_lsll): Handle various shift amount ranges.
(mve_lsll_imm, mve_lsll_internal): New patterns.
gcc/testsuite/ChangeLog:
PR target/122216
* gcc.target/arm/mve/intrinsics/asrl-various-ranges.c: New test.
* gcc.target/arm/mve/intrinsics/lsll-various-ranges.c: New test.
Christophe Lyon [Wed, 27 Aug 2025 09:42:56 +0000 (09:42 +0000)]
arm: fix MVE asrl lsll lsrl patterns [PR122216]
The thumb2_asrl, thumb2_lsll and thumb2_lsrl patterns were incorrectly
using (match_dup 0) for the first argument of the shift operator.
This patch replaces that with
(match_operand:DI 1 "arm_general_register_operand" "0"), fixes the
related expanders in arm.md to use that additional argument,
and gets rid of the copy of operands[1] to operands[0].
and get rid of the copy of operands[1] to operands[0].
Finally, since these patterns are MVE-only, rename them into mve_XXX
and move them to mve.md.
gcc/ChangeLog:
PR target/122216
* config/arm/thumb2.md (thumb2_asrl, thumb2_lsll, thumb2_lsrl):
Move to ...
* config/arm/mve.md (mve_asrl, mve_lsll, mve_lsrl): ... here. Use
match_operand instead of match_dup.
* config/arm/arm.md (ashldi3, ashrdi3, lshrdi3): Remove useless
copy. Update for new prototype.
zhaozhou [Mon, 10 Nov 2025 07:20:26 +0000 (15:20 +0800)]
LoongArch: Fix predicate for symbolic_pcrel_offset_operand.
The predicate checks whether the operand is PLUS(symbol_ref, const_int), but
the match (match_operand 0/1) is not equal to XEXP(op, 0/1). It should be
adjusted to use match_test and pass XEXP(op, 0/1) into the constraint
function.
zhaozhou [Mon, 10 Nov 2025 07:38:26 +0000 (15:38 +0800)]
LoongArch: Fix issue where data marked as GTY is cleaned up by ggc.
GGC (GCC Garbage Collection) uses the gengtype tool to scan all
source files containing GTY markers and to generate gt-*.h files. GGC
traverses these files to find the GC root nodes, marks the objects that
directly or indirectly reference those roots as live, and then frees the
memory of unmarked objects.
For the loongarch-builtins.cc file, it is necessary to add it to
target_gtfiles in config.gcc so that gt-loongarch-builtins.h is generated,
and to include this header in the .cc file, which prevents the data marked
with GTY in this file from being cleaned up by GGC.
Alexandre Oliva [Thu, 13 Nov 2025 22:54:01 +0000 (19:54 -0300)]
[vxworks] wrap base/b_NULL.h to override NULL
Some versions of vxworks define NULL to __nullptr in C++, assuming
C++11, which breaks at least a number of analyzer tests that get
exercised in C++98 mode.
Wrap the header that defines NULL so that, after including it, we
override the NULL definition with the one provided by stddef.h.
That required some infrastructure to enable subdirectories in extra
headers. Since USER_H filenames appear as dependencies, that limits
the possibilities or markup, so I went for a filesystem-transparent
sequence that doesn't appear in any extra_headers whatsoever, namely
/././, to mark the beginning of the desired install name.
Co-Authored-By: Olivier Hainque <hainque@adacore.com>
for gcc/ChangeLog
* config/vxworks/base/b_NULL.h: New.
* config.gcc (extra_headers) <*-*-vxworks*>: Add it.
* Makefile.in (stmp-int-hdrs): Support /././ markers in USER_H
to mark the beginning of the install name. Document.
* doc/sourcebuild.texi (Headers): Document /././ marker.
Nathaniel Shead [Thu, 13 Nov 2025 22:11:25 +0000 (09:11 +1100)]
c++/modules: Add testcase for lookup of hidden friend [PR122646]
r16-5173-g52a24bcec9388a fixed this testcase, but I think it's
worthwhile still adding this reduced test for it to the modules.exp set
of tests so we don't need to rely on libstdc++ tests for it yet.
PR c++/122646
gcc/testsuite/ChangeLog:
* g++.dg/modules/friend-10_a.C: New test.
* g++.dg/modules/friend-10_b.C: New test.
Andrew Pinski [Tue, 11 Nov 2025 20:07:11 +0000 (12:07 -0800)]
Merge remove_forwarder_block_with_phi into remove_forwarder_block
This is the last cleanup in this area. Merges the splitting functionality
of remove_forwarder_block_with_phi into remove_forwarder_block.
Now mergephi still has the ability to split the edges when merging the forwarder
block with a phi. But this reduces the non-shared code a lot.
gcc/ChangeLog:
* tree-cfgcleanup.cc (tree_forwarder_block_p): Remove must argument.
(remove_forwarder_block): Add can_split
argument. Handle the splitting case (iff phis in bb).
(cleanup_tree_cfg_bb): Update argument to tree_forwarder_block_p.
(remove_forwarder_block_with_phi): Remove.
(pass_merge_phi::execute): Update argument to tree_forwarder_block_p
and call remove_forwarder_block instead of remove_forwarder_block_with_phi.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Andrew Pinski [Tue, 11 Nov 2025 19:29:38 +0000 (11:29 -0800)]
cfgcleanup: Support merging forwarder blocks with phis [PR122493]
This adds support for merging forwarder blocks with phis in cleanupcfg.
This patch might seem small but that is because the previous patches were
done to build up to make it easier to add this support.
There is still one more patch to merge remove_forwarder_block
and remove_forwarder_block_with_phi since remove_forwarder_block_with_phi
supports splitting an edge which is not supported as an option in remove_forwarder_block.
The splitting edge option should not be enabled for cfgcleanup but only for mergephi.
Note r8-338-ge7d70c6c3bccb2 added always creating a preheader for loops, so we should
protect them if we have a phi node, as it goes back and forth here. And both the gimple
and RTL loop code like to have this preheader in the case where the same constant
value feeds the start of the loop.
Explanation of the testcase changes:
gcc.target/i386/pr121062-1.c needed a small change because there is a basic block
which is not duplicated so only one `movq reg, -1` is there instead of 2.
uninit-pred-7_a.c is xfailed and filed as PR122660; there is already some
analysis of the difference in the PR.
uninit-pred-5.C was actually a false positive because when
m_best_candidate is non-NULL, m_best_candidate_len is always initialized.
The log message on the testcase is wrong; if you manually follow the path
you can notice that. With an extra jump threading after the merging of
some bbs, the false positive no longer happens. So change the
dg-warning to dg-bogus.
ssa-dom-thread-7.c now jump threads 12 times in thread2 instead of 8
Bootstrapped and tested on x86_64-linux-gnu.
PR tree-optimization/122493
gcc/ChangeLog:
* tree-cfgcleanup.cc (tree_forwarder_block_p): Change bool argument
to a must have phi and allow phis if it is false.
(remove_forwarder_block): Add support for merging of forwarder blocks
with phis.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr121062-1.c: Update count.
* gcc.dg/uninit-pred-7_a.c: xfail line 23.
* g++.dg/uninit-pred-5.C: Change dg-warning to dg-bogus.
* gcc.dg/tree-ssa/ssa-dom-thread-7.c: Update count of jump thread.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Andrew Pinski [Wed, 12 Nov 2025 09:30:30 +0000 (01:30 -0800)]
fix handling of mapped and their location
So when we using the newly mapped location, we should check if
it is not unknown location and if so just use the original location.
Note this is a latent bug in remove_forwarder_block_with_phi code too.
This fixes gcc.dg/uninit-pr40635.c when doing more mergephi.
gcc/ChangeLog:
* tree-cfg.cc (copy_phi_arg_into_existing_phi): Use the original location
if the mapped location is unknown.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Andrew Pinski [Tue, 11 Nov 2025 08:38:25 +0000 (00:38 -0800)]
mergephi: extend copy_phi_arg_into_existing_phi and use it for remove_forwarder_block_with_phi
copy_phi_arg_into_existing_phi was added in r14-477-g78b0eea7802698
and used in remove_forwarder_block but since
remove_forwarder_block_with_phi needed to use the redirect edge var
map, it was not moved over. This extends copy_phi_arg_into_existing_phi
to have the ability to optionally use the mapper.
This also makes remove_forwarder_block_with_phi and remove_forwarder_block closer to
one another. There are a few other changes needed to be able to do both
from the same function.
Bootstrapped and tested on x86_64-linux-gnu.
gcc/ChangeLog:
* tree-cfg.cc (copy_phi_arg_into_existing_phi): New use_map argument.
* tree-cfg.h (copy_phi_arg_into_existing_phi): Update declaration.
* tree-cfgcleanup.cc (remove_forwarder_block_with_phi): Use
copy_phi_arg_into_existing_phi instead of inlining it.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Andrew Pinski [Mon, 10 Nov 2025 01:17:49 +0000 (17:17 -0800)]
mergephi: use edge iterator in remove_forwarder_block_with_phi
It was always kind of odd that while remove_forwarder_block used
an edge iterator, remove_forwarder_block_with_phi used a while loop.
remove_forwarder_block_with_phi was added after remove_forwarder_block too.
Anyway, this changes remove_forwarder_block_with_phi to use the same
form of loop so it is easier to merge the two.
gcc/ChangeLog:
* tree-cfgcleanup.cc (remove_forwarder_block_with_phi): Use
edge iterator instead of while loop.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Andrew Pinski [Mon, 10 Nov 2025 00:13:05 +0000 (16:13 -0800)]
cfgcleanup: Remove check on available dominator information in remove_forwarder_block
Since at least r9-1005-gb401e50fed4def, dominator information is
available in remove_forwarder_block so there is no reason to have a
check on if we should update the dominator information, always do it.
This is one more step towards commoning remove_forwarder_block and remove_forwarder_block_with_phi.
Bootstrapped and tested on x86_64-linux-gnu.
gcc/ChangeLog:
* tree-cfgcleanup.cc (remove_forwarder_block): Remove check
on the available dominator information.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Andrew Pinski [Wed, 12 Nov 2025 00:47:04 +0000 (16:47 -0800)]
cfgcleanup: forwarder block, ignore bbs which merge with the predecessor
While moving mergephi's forwarder block removal over to cfgcleanup,
I noticed a few regressions where a forwarder block was (correctly) removed
but the counts were not updated; instead, let these blocks be handled by the
merge_blocks cleanup code.
gcc/ChangeLog:
* tree-cfgcleanup.cc (tree_forwarder_block_p): Reject bb which has a single
predecessor which has a single successor.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Andrew Pinski [Sun, 9 Nov 2025 23:54:43 +0000 (15:54 -0800)]
mergephi: Move checks from pass_merge_phi::execute to remove_forwarder_block_with_phi
This moves the checks that were in pass_merge_phi::execute into remove_forwarder_block_with_phi
or tree_forwarder_block_p to make it easier to merge remove_forwarder_block_with_phi with remove_forwarder_block.
This also simplifies the code slightly because we can do `return false` rather than break
in one location.
gcc/ChangeLog:
* tree-cfgcleanup.cc (pass_merge_phi::execute): Move
check for abnormal or no phis to remove_forwarder_block_with_phi
and the check on dominated to tree_forwarder_block_p.
(remove_forwarder_block_with_phi): here.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Andrew Pinski [Sun, 9 Nov 2025 22:07:15 +0000 (14:07 -0800)]
cfgcleanup: Move check for dest containing non-local label/eh landing pad to tree_forwarder_block_p
I noticed this check was in both remove_forwarder_block and remove_forwarder_block_with_phi but
they were slightly different, in that the eh landing pad was not being checked for in
remove_forwarder_block_with_phi when it definitely should be.
This folds the check into tree_forwarder_block_p instead, as it is called right beforehand anyway.
The eh landing pad check was added to the non-phi one by r0-98233-g28e5ca15b76773 but missed the phi variant;
I am not sure if it could show up there but it is better to have one common code than having two copies of
slightly different checks.
Bootstrapped and tested on x86_64-linux-gnu.
gcc/ChangeLog:
* tree-cfgcleanup.cc (remove_forwarder_block_with_phi): Remove check on non-local label.
(remove_forwarder_block): Remove check on non-local label/eh landing pad.
(tree_forwarder_block_p): Add check on label for an eh landing pad.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Andrew Pinski [Sun, 9 Nov 2025 21:56:12 +0000 (13:56 -0800)]
cfgcleanup: Remove check for infinite loop in remove_forwarder_block/remove_forwarder_block_with_phi
Since removing the worklist for both mergephi and cfgcleanup (r0-80545-g672987e82f472b), these
two functions are now called right after tree_forwarder_block_p, so there is no reason for the
extra check for an infinite loop nor for the check on current loop headers, as that is already
handled in tree_forwarder_block_p.
Bootstrapped and tested on x86_64-linux-gnu.
gcc/ChangeLog:
* tree-cfgcleanup.cc (remove_forwarder_block): Remove check for infinite loop.
(remove_forwarder_block_with_phi): Likewise. Also remove check for loop header.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Andrew Pinski [Sun, 9 Nov 2025 06:40:08 +0000 (22:40 -0800)]
mergephi: Remove worklist
Since the worklist was never added to and the analysis part can benefit
from the work part, we can combine the analysis part with the work part.
This should give a small speedup for this pass.
Looking into the history here, remove_forwarder_block used to add to the worklist
but remove_forwarder_block_with_phi never did.
This is the first step in moving part of the functionality of mergephi into
cfgcleanup.
Jeff Law [Thu, 13 Nov 2025 20:10:12 +0000 (13:10 -0700)]
Handle shift-pairs in ext-dce for targets without zero/sign extension insns
This is more prep work for revamping the zero/sign extension patterns on RISC-V
to avoid the need for define_insn_and_splits.
The core issue at hand is for the base ISA we don't have the full set of
sign/zero extensions. So what's been done so far is to pretend we do via a
define_insn_and_split, then split the extensions into shift pairs post-reload
(for the base ISA).
That has multiple undesirable properties, including inhibiting optimization in
some cases and making it harder to add new optimizations in the most natural
way in the future.
The basic approach we've been taking to these problems has been to generate the
desired code at expansion time. When we do that for RISC-V, ext-dce will no
longer see the zero/sign extension nodes when compiling for the base ISA --
instead it'll see shift pairs. And that in turn causes ext-dce to miss
elimination opportunities which is a regression relative to the trunk right
now.
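For illustration, the shift-pair idiom being recognized (a generic sketch,
assuming a 64-bit target without a native 32-to-64-bit zero-extend
instruction):

#include <stdint.h>

uint64_t
zext_lo32 (uint64_t x)
{
  x <<= 32;   /* move the low 32 bits into the top half */
  x >>= 32;   /* logical right shift back, clearing bits 63..32 */
  return x;
}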
This patch improves ext-dce to recognize the second shift (right) in such a
sequence, then try to match it up with a prior left shift (which has to be the
immediately prior real instruction). When it can pair them up it'll treat the
pair like an extension. The right shift turns into a simple copy of the source
of the left shift.
That prevents optimization regressions with the in flight code to revamp the
zero extension (and then sign extensino) code. No new tests since it's
preventing existing tests from failing to optimize after some in flight stuff
lands.
Bootstrapped and regression tested on x86_64 and tested on all the crosses in
my tester. The Pioneer and BPI will pick it up tonight for bootstrap testing
on RISC-V.
* ext-dce.cc (ext_dce_try_optimize_rshift): New function to optimize a
shift pair implementing a zero/sign extension.
(ext_dce_try_optimize_extension): Renamed from
ext_dce_try_optimize_insn.
(ext_dce_process_uses): Handle shift pairs implementing extensions.
Andrew Pinski [Thu, 13 Nov 2025 05:06:02 +0000 (21:06 -0800)]
sccp: Fix order of gimplification, removal of the phi and constant prop in sccp (3rd time) [PR122637]
This is the 3rd (and hopefully last) time fixing the order here.
The previous times were r16-5093-g77e10b47f25d05 and r16-4905-g7b9d32aa2ffcb5.
The order before these patches was:
* removal of phi
* propagate constants
* gimplification of expr
* create assignment
* rewrite to undefined
* add stmts to bb
The current order before this patch (and after the other 2):
* gimplification of expr
* removal of phi
* create assignment
* propagate constants
* rewrite to undefined
* add stmts to bb
The correct and new order with this patch is:
* gimplification of expr
* propagate constants
* removal of phi
* create the assignment
* rewrite to undefined
* add stmts to bb
This is because propagating the constant will cause a fold_stmt which requires
the statement to still be in the IR. The gimplification of the expr also calls fold_stmt.
Now with the new order the phi is not removed until right before the creation of the
new assignment, so the IR in the basic block is well defined while calling fold_stmt.
Pushed as obvious after bootstrap/test on x86_64-linux-gnu.
PR tree-optimization/122637
gcc/ChangeLog:
* tree-scalar-evolution.cc (final_value_replacement_loop): Fix order
of gimplification and constant prop.
gcc/testsuite/ChangeLog:
* gcc.dg/torture/pr122637-1.c: New test.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Artemiy Volkov [Thu, 13 Nov 2025 11:15:19 +0000 (11:15 +0000)]
gcc/testsuite: adjust tree-ssa/forwprop-43.c
Introduced in r16-5042-g470411f44f51d9, this testcase fails on
AdvSIMD-less AArch32 configurations, likely as well as on other targets
without vector support; thus, require vector support via dg-require-effective-target.
Since this testcase includes stdint.h, require that as well.
Regtested on arm-gnueabihf with
RUNTESTFLAGS=--target_board=unix/-mfpu=vfpv3-d16/-march=armv7-a.
Jeff Law [Thu, 13 Nov 2025 15:51:40 +0000 (08:51 -0700)]
[RISC-V][PR rtl-optimization/122627] Yet another fix in IRA equivalence array handling
Yup, yet another out of bounds access into the equivalence array.
In this case we had an out of bounds write, which corrupted the heap leading to
the fault.
Given this is the 3rd such issue in this space in recent history and the second
in this loop within LRA within a week or so, I looked for a solution that would
cover the whole loop rather than another spot fix.
The good news is this loop runs after elimination, so we can just expand the
equivalence array after elimination and all the right things should happen.
This also allows removal of the spot fix I did last week (which I did
backtest). I didn't have a testcase for the bug in this space I fixed a couple
months ago (and the artifacts from that build are certainly gone from my tester
by now).
Bootstrapped and regression tested on x86. Also verified the RISC-V failures
in this bz and bz122321 are fixed.
Given this is a refinement & simplification of a prior fix, I'm going to take
some slight leeway to push the fix forward now.
PR rtl-optimization/122627
gcc/
* lra-constraints.cc (update_equiv): Remove patch from last week
related to pr122321.
(lra_constraints): Expand the equivalence array after eliminations
are complete.
gcc/testsuite/
* gcc.target/riscv/rvv/autovec/pr122627.c: New test.
Eric Botcazou [Sun, 2 Nov 2025 16:11:19 +0000 (17:11 +0100)]
ada: Fix internal error on protected entry and private record
This is a freezing issue introduced by the new support for deferred extra
formals. The freezing of local types created during the expansion of the
entry construct happens in the wrong scope when the expansion is deferred,
causing reference-before-freezing in the expanded code.
gcc/ada/ChangeLog:
* exp_ch9.adb (Expand_N_Entry_Declaration): In the deferred case,
freeze immediately all the newly created entities.
Douglas B Rupp [Fri, 3 Oct 2025 16:54:47 +0000 (09:54 -0700)]
ada: Corrupted unwind info in aarch64-vx7r2 llvm kernel tests
Adjust the register restoration on aarch64 to not use register 96
on llvm. Avoids the "reg too big" warning on aarch64 when sigtramp
is called. For llvm and aarch64, the correct choice seems to be 32.
Remove the parens on REGNO_PC_OFFSET; when compiling,
they cause a silent failure due to alphanumeric register names.
Define a macro for __attribute__ ((optimize (2))) which is
empty if not available. (Despite being documented, it generates an
"unknown attribute" warning with clang.)
Define ATTRIBUTE_PRINTF_2 if not defined.
gcc/ada/ChangeLog:
* sigtramp-vxworks-target.h (REGNO_PC_OFFSET): Use 32 vice
96 with llvm/clang. (REGNO_G_REG_OFFSET): Remove parens on
operand. (REGNO_GR): Likewise.
* sigtramp-vxworks.c (__gnat_sigtramp): Define a macro for
__attribute__ optimize, which is empty if not available.
* raise-gcc.c (db): Define ATTRIBUTE_PRINTF_2 if not defined.