Paul Thomas [Wed, 7 Jan 2026 16:14:12 +0000 (16:14 +0000)]
Fortran: [PDT] Fix ICE in tree check and memory leaks [PR90218, PR123071]
2026-01-07 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/123071
* resolve.cc (resolve_typebound_function): If a generic
typebound procedure is marked as overridable and all the
specific procedures are non-overridable, it is safe to resolve
the compcall.
PR fortran/90218
* trans-array.cc (gfc_trans_array_constructor_value): PDT
structure constructor elements must be finalized.
(trans_array_constructor): Set 'finalize_required' for PDT
constructors.
* trans-decl.cc (gfc_get_symbol_decl): PDT initialization is
required in contained namespaces as long as the parent is not
a module.
(gfc_init_default_pdt): Delete the stmtblock_t argument. Assign
a variable 'value' expression using gfc_trans_assignment.
Simplify the logic around the call to gfc_init_default_dt. In
both cases return a tree expression or null tree.
(gfc_trans_deferred_vars): Only call gfc_allocate_pdt_comp if
gfc_init_default_pdt returns null tree.
* trans-expr.cc (gfc_trans_alloc_subarray_assign): Add a static
stmtblock_t pointer 'final_block'. Free 'dest' data pointer and
add to final_block.
(gfc_conv_structure): Set 'final_block' to the se's finalblock.
(gfc_trans_assignment_1): Do not deallocate PDT array ctrs.
* trans-stmt.cc (gfc_trans_allocate): Free the allocatable
components of a PDT expr3.
(gfc_trans_deallocate): Add 'tmp' to se.pre rather than block.
gcc/testsuite/
PR fortran/90218
* gfortran.dg/pdt_79.f03: Fix 'used uninitialized' warning and change
tree scan for 'mapped_tensor.j' to 'Pdttensor_t_4.2.j'.
* gfortran.dg/pdt_80.f03: New test.
Tomas Glozar [Wed, 7 Jan 2026 16:02:15 +0000 (09:02 -0700)]
[PATCH 1/2] ia64: Fix zero_call_used_regs for PRs [PR121535]
ia64 uses default_zero_call_used_regs(), which uses emit_move_insn()
to zero out registers. ia64 predicate registers use BImode, which is not
supported by emit_move_insn().
Implement ia64_zero_call_used_regs() to zero PRs by manually emitting
a CCImode move. default_zero_call_used_regs() is then called to handle
the remaining registers.
PR target/121535
gcc/ChangeLog:
* config/ia64/ia64.cc (TARGET_ZERO_CALL_USED_REGS): Override
function with target-specific one.
(struct gcc_target): Move to end of file.
(ia64_zero_call_used_regs): Add target-specific function.
Xinhui Yang [Wed, 7 Jan 2026 15:59:01 +0000 (08:59 -0700)]
[PATCH] ia64: properly include libunwind support during configuration
Depending on the `with_system_libunwind' test, libgcc can either use its
in-house implementation or reference external libunwind symbols.
However, this breaks the static libgcc.a library, as in t-linux it
references unwind-compat.c, which turns some _Unwind_* symbols into
references of the corresponding symbols in libunwind, but libunwind does
not exist in some conditions (e.g. bootstrapping a toolchain). The
linker complains about `missing version node for symbol', since it
cannot find the symbol it refers to.
The unwind-compat.c module should only exist if the system libunwind is
being used. Likewise, GCC itself should add -lunwind only if this
condition is met.
Implement better control over whether to embed the unwind implementation
into libgcc to fix this issue.
gcc/
* config.gcc: Limit -lunwind usage by testing whether the system
libunwind is being used.
libgcc/
* config.host (ia64): Include unwind-compat only if the system
libunwind is being used.
* config/ia64/t-linux-libunwind: Include the libgcc symver definition
for libgcc symbols, since it bears the same role as t-linux
(except libunwind). Include fde-glibc.c since the unwind
implementation requires _Unwind_FindTableEntry in this file.
* config/ia64/unwind-ia64.c: Protect _Unwind_FindTableEntry inside
inhibit_libc ifndefs to allow it to build with newlib or
without proper headers.
Xi Ruoyao [Wed, 31 Dec 2025 01:52:35 +0000 (09:52 +0800)]
LoongArch: guard SImode simple shift and arithmetic expansions with can_create_pseudo_p [PR 123320]
As we have hardware instructions for those operations, developers will
reasonably assume they can emit them even after reload. But on LA64 we
are expanding them using pseudos to reduce unneeded sign extensions,
breaking such an expectation and causing ICE like PR 123320.
Only create the pseudo when can_create_pseudo_p () to fix such cases.
PR target/123320
gcc
* config/loongarch/loongarch.md (<optab><mode>3): Only expand
using pseudos when can_create_pseudo_p ().
(addsi3): Likewise.
[committed] [PR target/123403] Fix base register and offsets for v850 libgcc
PR target/123403
libgcc/
* config/v850/lib1funcs.S (__return_r25_r29): Fix ! __EP__ clause to
use SP, not EP.
(__return_r2_r31): Fix offsets to match store offsets.
When basic_stringbuf::setbuf has been called, we need to copy the
contents of the buffer into _M_string before returning it.
libstdc++-v3/ChangeLog:
PR libstdc++/123100
* include/std/sstream (basic_stringbuf::str()&&): Handle the
case where _M_string is not being used for the buffer.
* testsuite/27_io/basic_stringbuf/str/char/123100.cc: New test.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Jonathan Wakely [Tue, 6 Jan 2026 14:00:09 +0000 (14:00 +0000)]
libstdc++: Override detection of flockfile support in newlib [PR123406]
As explained in the PR, flockfile and funlockfile are always declared by
newlib and there's no easy way to detect whether they're actually
defined. Ensure that ac_stdio_locking=no gets set for non-cygwin newlib
targets.
libstdc++-v3/ChangeLog:
PR libstdc++/123406
* acinclude.m4 (GLIBCXX_CHECK_STDIO_LOCKING): Override detection
of flockfile for non-cygwin newlib targets.
* configure: Regenerate.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Jonathan Wakely [Mon, 5 Jan 2026 17:29:40 +0000 (17:29 +0000)]
libstdc++: Fix memory leak in std::barrier destructor [PR123378]
When I replaced the std::unique_ptr member in r16-997-gef632273a90657 I
should have added an explicit delete[] operation to replace the effects
of the unique_ptr destructor.
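A minimal C++ sketch of this bug class (hypothetical names, not the actual libstdc++ code): when a std::unique_ptr<T[]> member is replaced by a raw pointer, the destructor must take over the delete[] the unique_ptr used to perform.

```cpp
#include <cstddef>

// Hypothetical element type that counts live instances, so a leak is visible.
struct State {
    static inline std::size_t live = 0;
    State() { ++live; }
    ~State() { --live; }
};

// Sketch of the fix: a raw-pointer member that replaced a
// std::unique_ptr<State[]>; the destructor must now call delete[]
// explicitly to replicate the unique_ptr's effect.
class barrier_like {
    State* m_state;
public:
    explicit barrier_like(std::size_t n) : m_state(new State[n]) {}
    ~barrier_like() { delete[] m_state; }
    barrier_like(const barrier_like&) = delete;
    barrier_like& operator=(const barrier_like&) = delete;
};
```

Without the explicit delete[], every constructed barrier_like would leak its array.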
Tobias Burnus [Wed, 7 Jan 2026 14:51:55 +0000 (15:51 +0100)]
OpenMP: Add early C/C++ parser support for 'groupprivate' directive
After parsing the directive, 'sorry, unimplemented' is printed.
Note that restriction checks still have to be implemented, but this
depends on parser support for the 'local' clause of 'omp declare target',
which still has to be implemented.
Andrew MacLeod [Tue, 6 Jan 2026 15:14:47 +0000 (10:14 -0500)]
Early builtin_unreachable removal must examine dependencies.
Even if all uses of a name are dominated by the unreachable branch,
recomputation of a value in the definition of a name might be reachable.
PR tree-optimization/123300
gcc/
* gimple-range-gori.cc (gori_map::exports_and_deps): New.
* gimple-range-gori.h (exports_and_deps): New prototype.
(FOR_EACH_GORI_EXPORT_AND_DEP_NAME): New macro.
* tree-vrp.cc (remove_unreachable::remove_unreachable): Initialize
m_tmp bitmap.
(remove_unreachable::~remove_unreachable): Dispose of m_tmp bitmap.
(remove_unreachable::fully_replaceable): Move from static function
and check reachability of exports and dependencies.
The initial idea of this optimization was to reduce it to "X != 0",
checking for either X being an unsigned or a truncating conversion.
Then we discussed reducing it to "(X & -X) != 0" instead. This form
would avoid the potential trapping problems (like -ftrapv) that might
happen in case X is not an unsigned type.
Then, as suggested by Roger Sayle in bugzilla, we could reduce to just
"-X != 0". Keeping the negated value in the pattern preserves any trapping
or UB to be handled by other match.pd patterns that are better able to do
the conversion to "X != 0" when applicable. This would also spare us from
a TYPE_UNSIGNED check.
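The chain of equivalences for unsigned X can be checked directly; a small C++ sketch (illustrative only, not the match.pd pattern itself):

```cpp
#include <cstdint>

// For unsigned x, each of these forms is nonzero exactly when x is
// nonzero, which is why the pattern can be reduced step by step:
// (x & -x) isolates the lowest set bit, and -x is zero only for x == 0.
bool orig_form(std::uint32_t x)    { return (x & -x) != 0; }
bool reduced_form(std::uint32_t x) { return -x != 0; }
bool final_form(std::uint32_t x)   { return x != 0; }
```

For unsigned types the three predicates agree on every input, while keeping the negation in the emitted form leaves any signed-overflow trapping concerns to later patterns, as the message explains.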
Jakub Jelinek [Wed, 7 Jan 2026 14:17:21 +0000 (15:17 +0100)]
combine: Fix up serious regression in try_combine [PR121773]
Back in April last year I've changed try_combine's condition when trying to
split two independent sets by moving one of them to i2. Previously this was
testing !modified_between_p (SET_DEST (setN), i2, i3) and I've changed it
to SET_DEST (setN) != pc_rtx && !reg_used_between_p (SET_DEST (set1), i2, i3)
on the assumption written in the r15-9131-g19ba913517b5e2a00 commit
message:
"The following patch replaces the modified_between_p
tests with reg_used_between_p, my understanding is that
modified_between_p is a subset of reg_used_between_p, so one
doesn't need both."
That assumption is wrong though, neither of these is a subset of the
other and I don't see any APIs which test both. We need to avoid moving
a set from i3 to i2 both in case where the REG (or SUBREG_REG of SUBREG or
MEM or whatever else) is set/modified between i2 and i3 exclusive, as shown
by the testcase in PR121773 (which I'm not including because my ARM neon
knowledge is limited). We have i2 insn 18 and i3 insn 7 after the current
try_combine modifications:
(insn 18 5 19 2 (set (reg:SI 104 [ _6 ])
(const_int 305419896 [0x12345678])) "include/arm_neon.h":7467:22 542 {*arm_movsi_vfp}
(expr_list:REG_EQUAL (const_int 305419896 [0x12345678])
(nil)))
(insn 19 18 21 2 (set (reg:SI 105 [ _6+4 ])
(const_int 538968064 [0x20200000])) "include/arm_neon.h":7467:22 542 {*arm_movsi_vfp}
(nil))
(insn 21 19 7 2 (set (reg:DI 101 [ _5 ])
(const_int 0 [0])) "include/arm_neon.h":607:14 -1
(nil))
(insn 7 21 8 2 (parallel [
(set (pc)
(pc))
(set (subreg:SI (reg:DI 101 [ _5 ]) 0)
(const_int 610839792 [0x2468acf0]))
]) "include/arm_neon.h":607:14 17 {addsi3_compare_op1}
(expr_list:REG_DEAD (reg:SI 104 [ _6 ])
(nil)))
The second set can't be moved to the i2 location, because (reg:DI 101)
is modified in insn 21 and so if setting half of it to 610839792 is
moved from insn 7 where it modifies what was previously 0 into a location
where it overwrites something and is later overwritten in insn 21, we get
different behavior.
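The ordering problem can be modelled in plain C++, with a uint64_t standing in for (reg:DI 101); this is a sketch of the observable effect, not the RTL semantics:

```cpp
#include <cstdint>

// Mimics insns 21 and 7 above in their original order: a full 64-bit
// clear, then a store to the low 32-bit "subreg".
std::uint64_t store_after_clear() {
    std::uint64_t r = 0;  // insn 21: whole register := 0
    r = (r & 0xffffffff00000000ull) | 0x2468acf0ull;  // insn 7: low half
    return r;
}

// What would happen if the partial store were hoisted before the clear,
// as the buggy try_combine allowed: the later full clear overwrites it.
std::uint64_t store_before_clear() {
    std::uint64_t r = (0ull & 0xffffffff00000000ull) | 0x2468acf0ull;
    r = 0;  // the full clear now comes last and wins
    return r;
}
```

The two orderings produce different final values, which is exactly why the move from i3 to i2 must be rejected here.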
And the second case is mentioned in the PR119291 commit log:
(insn 22 21 23 4 (set (reg:SI 104 [ _7 ])
(const_int 0 [0])) "pr119291.c":25:15 96 {*movsi_internal}
(nil))
(insn 23 22 24 4 (set (reg/v:SI 117 [ e ])
(reg/v:SI 116 [ e ])) 96 {*movsi_internal}
(expr_list:REG_DEAD (reg/v:SI 116 [ e ])
(nil)))
(note 24 23 25 4 NOTE_INSN_DELETED)
(insn 25 24 26 4 (parallel [
(set (pc)
(pc))
(set (reg/v:SI 116 [ e ])
(const_int 0 [0]))
]) "pr119291.c":28:13 977 {*negsi_2}
(expr_list:REG_DEAD (reg:SI 104 [ _7 ])
(nil)))
i2 is insn 22, i3 is insn 25 after in progress modifications and the
second set can't be moved to i2 location, because (reg/v:SI 116) is used
in insn 23, so with it being set to 0 around insn 22 insn 23 will see
a different value.
So, I'm afraid we need both the modified_between_p and reg_used_between_p
checks. We don't need the SET_DEST (setN) != pc_rtx checks; those were
added because modified_between_p (pc_rtx, i2, i3) returns true if start
is not the same as end, but reg_used_between_p doesn't behave like that.
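A toy model (a hypothetical simplification, not the actual RTL predicates) shows why neither query subsumes the other: an insn can write a register without reading it, and read one without writing it:

```cpp
#include <algorithm>
#include <vector>

// Toy model: each insn is reduced to the sets of register numbers it
// reads and writes.
struct toy_insn { std::vector<int> reads, writes; };

// Analogue of modified_between_p: is 'r' written anywhere in the range?
bool toy_modified_between(int r, const std::vector<toy_insn>& range) {
    for (const toy_insn& i : range)
        if (std::find(i.writes.begin(), i.writes.end(), r) != i.writes.end())
            return true;
    return false;
}

// Analogue of reg_used_between_p: is 'r' read anywhere in the range?
bool toy_used_between(int r, const std::vector<toy_insn>& range) {
    for (const toy_insn& i : range)
        if (std::find(i.reads.begin(), i.reads.end(), r) != i.reads.end())
            return true;
    return false;
}
```

A write-only insn (like insn 21 setting reg 101) trips the first query but not the second; a read-only insn (like insn 23 copying reg 116) trips the second but not the first, so both checks are needed.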
2026-01-07 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/119291
PR rtl-optimization/121773
* combine.cc (try_combine): Check that SET_DEST (setN) is neither
modified_between_p nor reg_used_between_p instead of just not
reg_used_between_p or pc_rtx.
Jakub Jelinek [Wed, 7 Jan 2026 14:00:50 +0000 (15:00 +0100)]
libstdc++: Use gnu_inline attribute on constexpr exception methods [PR123183]
As mentioned in
https://gcc.gnu.org/pipermail/gcc-patches/2026-January/704712.html
in the gnu::constexpr_only thread, gnu::gnu_inline attribute actually
seems to work for most of what we need for C++26 constexpr exceptions
(i.e. when we want out of line bodies for C++ < 26 and need to use
constexpr for C++26, yet don't want for reasons mentioned in those
two PRs the bodies of those constexpr methods to be emitted inline).
Unfortunately clang++ doesn't handle it 100% properly and requires
the redundant inline keyword to make it work (even when the methods
are constexpr and thus implicitly inline), g++ doesn't require that,
so the patch adds also the redundant inline keywords and not just
the [[__gnu__::__gnu_inline__]] attribute.
This way if something wants to inline those functions it can, but
if their address is taken, we just rely on libstdc++.{so,a} to provide
those (which it does as before because those TUs are compiled with
older -std= modes).
The earlier r16-6477-gd5743234731 commit made sure gnu::gnu_inline
constexpr virtual methods can be key methods, so vtables and rtti can
be emitted only in the TU defining non-gnu_inline versions of those.
Alfie Richards [Tue, 7 Oct 2025 14:16:16 +0000 (14:16 +0000)]
aarch64: Add support for fmv priority syntax.
Adds support for the AArch64 fmv priority syntax.
This allows users to override the default function ordering.
For example:
```c
int bar [[gnu::target_version("default")]] (int){
return 1;
}
int bar [[gnu::target_version("dotprod;priority=2")]] (int) {
return 2;
}
int bar [[gnu::target_version("sve;priority=1")]] (int) {
return 3;
}
```
gcc/ChangeLog:
* config/aarch64/aarch64.cc (aarch64_parse_fmv_features): Add parsing
for priority arguments.
(aarch64_process_target_version_attr): Update call to
aarch64_parse_fmv_features.
(get_feature_mask_for_version): Update call to
aarch64_parse_fmv_features.
(aarch64_compare_version_priority): Add logic to order by priority if present.
(aarch64_functions_b_resolvable_from_a): Update call to
aarch64_parse_fmv_features.
(aarch64_mangle_decl_assembler_name): Update call to
aarch64_parse_fmv_features.
(dispatch_function_versions): Add logic to sort by priority.
(aarch64_same_function_versions): Add diagnostic if invalid use of
priority syntax.
(aarch64_merge_decl_attributes): Add logic to make sure priority
arguments are preserved.
(aarch64_check_target_clone_version): Update call to
aarch64_parse_fmv_features.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/fmv_priority3.c: New test.
* gcc.target/aarch64/fmv_priority_error1.c: New test.
* gcc.target/aarch64/fmv_priority_error2.c: New test.
Alfie Richards [Tue, 7 Oct 2025 13:01:09 +0000 (13:01 +0000)]
targethooks: Change SAME_FUNCTION_VERSIONS hook to support checking mergeability
This changes the hook to support checking version mergeability for cases
where the version strings do imply the same version but conflict
in some other way, so the versions cannot be merged.
This is a change required for adding priority version support in aarch64.
gcc/ChangeLog:
* target.def (TARGET_OPTION_SAME_FUNCTION_VERSIONS): Update
documentation.
* tree.cc (disjoint_version_decls): Change for new NULL parameter
to same_function_versions.
(diagnose_versioned_decls): Update to pass diagnostic location to
same_function_versions.
* doc/tm.texi: Regenerate.
* config/aarch64/aarch64.cc (aarch64_same_function_versions):
Update hook impl for new arguments.
* config/riscv/riscv.cc (riscv_same_function_versions): Update
hook impl for new arguments.
* config/loongarch/loongarch.cc
(loongarch_same_function_versions): Likewise.
* hooks.cc (hook_stringslice_stringslice_unreachable): Changed
to...
(hook_stringslice_consttree_stringslice_consttree_unreachable):
...this and add extra arguments.
* hooks.h (hook_stringslice_stringslice_unreachable): Changed
to...
(hook_stringslice_consttree_stringslice_consttree_unreachable):
...this and add extra arguments.
Martin Jambor [Wed, 7 Jan 2026 10:53:15 +0000 (11:53 +0100)]
ipa-cp: Multiple sweeps over the call-graph in the decision stage
Currently, IPA-CP makes only one sweep in the decision stage over the
call-graph, meaning that some cloning, even if relatively cheap, may
not be performed because the pass runs out of the overall growth
budget before it gets to evaluating it. By making more (three by
default, but configurable with a parameter) sweeps over the call-graph
with progressively stricter cost limits, the more beneficial
candidates will have a better chance to be cloned before others.
gcc/ChangeLog:
2025-07-08 Martin Jambor <mjambor@suse.cz>
* params.opt (param_ipa_cp_sweeps): New.
* doc/invoke.texi (ipa-cp-sweeps): New.
* ipa-cp.cc (max_number_sweeps): New.
(get_max_overall_size): New parameter cur_sweep, use it and the total
number of sweeps from the NODE to calculate the result too.
(ipcp_propagate_stage): Get the maximum number of sweeps specified in
the corresponding parameter of any possibly affected node.
(good_cloning_opportunity_p): Add parameter cur_sweep, adjust the
threshold according to it.
(decide_about_value): New parameter cur_sweep, pass it to
get_max_overall_size and to good_cloning_opportunity_p.
(decide_whether_version_node): New parameter cur_sweep, pass it to
decide_about_value and get_max_overall_size. Make sure the node is
not dead.
(ipcp_decision_stage): Make multiple sweeps over the call-graph.
Martin Jambor [Wed, 7 Jan 2026 10:53:14 +0000 (11:53 +0100)]
ipa-cp: Move decision to clone for all contexts to decision stage
Currently, IPA-CP makes decisions to clone a function for all (known)
contexts in the evaluation phase, in a separate sweep over the call
graph from the decisions about cloning for values available only in
certain contexts. This patch moves it to the decision stage, which
requires slightly more computation at the decision stage but the
benefit/cost heuristic is also likely to be slightly better because
it can be calculated using the call graph edges that remain after any
cloning for special contexts. Perhaps more importantly, it also
allows us to do multiple decision sweeps over the call graph with
different "parameters."
gcc/ChangeLog:
2025-07-02 Martin Jambor <mjambor@suse.cz>
* ipa-prop.h (ipa_node_params): Remove member do_clone_for_all_contexts.
(ipa_node_params::ipa_node_params): Do not initialize
do_clone_for_all_contexts.
* ipa-cp.cc (gather_context_independent_values): Remove parameter
calculate_aggs, calculate them always.
(estimate_local_effects): Move the decision whether to clone for
all context...
(decide_whether_version_node): ...here. Fix dumps.
(decide_about_value): Adjust alignment in dumps.
Rainer Orth [Wed, 7 Jan 2026 08:52:39 +0000 (09:52 +0100)]
fixincludes: Remove unnecessary Solaris fixes
Many fixincludes fixes are no longer applied on Solaris 11.4, usually
because they have been incorporated into the system headers. Sometimes
this happened as early as Solaris 10.
A few were still applied, although unnecessarily, usually because they
had been applied to the system headers in a slightly different way.
This patch removes all such fixes or disables the unnecessary ones that
aren't Solaris-specific on Solaris only. While the solaris_math_12 fix
isn't necessary in current Solaris 11.4 SRUs, it was kept since it still
applies to Solaris 11.4 FCS.
Bootstrapped without regressions on i386-pc-solaris2.11 and
sparc-sun-solaris2.11. I've also checked that the fixes applied to the
11.4 FCS headers are identical to those before this patch, with the
exception of those that are no longer actually needed.
Richard Biener [Tue, 6 Jan 2026 13:10:38 +0000 (14:10 +0100)]
tree-optimization/123316 - avoid ICE due to lack of PHI patterns
With bools we can end up with mixed vector types in PHI nodes due
to PHIs not having pattern stmts. Avoid this when analyzing
a nested cycle, similar to how we already do when analyzing BB
vectorization PHIs.
Rainer Orth [Wed, 7 Jan 2026 05:53:23 +0000 (06:53 +0100)]
Allow disabling -gctf non-C warning [PR123259]
In mixed-language builds it may be difficult to restrict -gctf to only
C-language sources. However, the
cc1plus: note: CTF debug info requested, but not supported for ‘GNU C++17’ frontend
warning for non-C languages, which is perfectly benign, may confuse
parts of the build, so it may be useful to disable it.
This patch applies the existing -Wno-complain-wrong-lang option to
suppress it.
Bootstrapped without regressions on i386-pc-solaris2.11 and
sparc-sun-solaris2.11, also with C/C++-only bootstraps that apply
-gctf/-gsctf via STAGE[23]_CFLAGS and STAGE[23]_TFLAGS.
warn_access: Limit waccess2 to dangling pointer checks [PR 123374]
The second pass of warn_access (waccess2) was added to implement
dangling pointer checks, but it implicitly ran the early checks too,
which issued false warnings on code that had not been fully optimized.
Limit this second run to dangling pointer checks for call
statements only. This does not break any of the existing warning tests,
so the additional run did not seem to add any actual value anyway.
gcc/ChangeLog:
PR tree-optimization/123374
* gimple-ssa-warn-access.cc (pass_waccess::set_pass_param): Add
a second parameter.
(pass_waccess::check_call): Skip access checks for waccess2.
(pass_waccess::execute): Drop initialization of
m_check_dangling_p.
* passes.def: Adjust.
gcc/testsuite/ChangeLog:
PR tree-optimization/123374
* g++.dg/warn/pr123374.C: New test.
Sebastian Huber [Mon, 29 Dec 2025 23:41:38 +0000 (00:41 +0100)]
gcov: Fix counter update method selection
The counter update method selection had some issues.
For PROFILE_UPDATE_ATOMIC, if atomic updates are not supported, then
fall back to single mode, but use partial atomic updates if
available. Issue warnings.
For PROFILE_UPDATE_PREFER_ATOMIC, if atomic updates are not supported,
then fall back to single mode, but use partial atomic updates if
available. Do not issue warnings.
gcc/ChangeLog:
* tree-profile.cc (tree_profiling): Do not use atomic operations
if they are not available. Try to use at least partial atomic
updates as a fallback.
Jeff Law [Tue, 6 Jan 2026 23:16:56 +0000 (16:16 -0700)]
[PR target/123269] Adjust predcomm testcases to avoid vectorization
Thankfully this "bug" is just a case where, after Robin's change, we're
vectorizing cases we weren't before, which in turn doesn't give predcom
the opportunity to optimize the code.
Like an existing predcom test, we can restore this test's intent by
using -fno-tree-vectorize.
Tested x86_64 and the various crosses to ensure nothing regressed. Pushing to
the trunk.
This pattern is only emitted during function epilogue expansion (obviously
after register allocation), so putting reload_completed in the condition
is redundant.
This patch also changes how the return register (A0 address register)
required for normal function returns is declared: by properly defining
the EPILOGUE_USES macro, as is already done on other targets, rather
than placing a '(use (reg:SI A0_REG))' RTX.
gcc/ChangeLog:
* config/xtensa/xtensa.h (EPILOGUE_USES): New macro definition.
* config/xtensa/xtensa.md (return):
Remove '(use (reg:SI A0_REG))' from the template description, and
reload_completed from the condition.
(sibcall_epilogue): Remove emitting '(use (reg:SI A0_REG))'.
Tamar Christina [Tue, 6 Jan 2026 15:00:44 +0000 (15:00 +0000)]
vect: Add check for BUILT_IN_NORMAL to ifcvt [PR122103]
It was reported that an AVX10 test like
gcc.target/i386/avx10_2-vcvtbf162ibs-2.c ICEd with my
changes. It turns out it's due to associated_internal_fn
only supporting BUILT_IN_NORMAL calls.
This adds a check for this before calling
associated_internal_fn.
Manually tested the files since they have effective-target requirements
for hardware I don't have.
gcc/ChangeLog:
PR tree-optimization/122103
* tree-if-conv.cc (ifcvt_can_predicate): Add check for
normal builtins.
Richard Ball [Tue, 6 Jan 2026 14:26:20 +0000 (14:26 +0000)]
aarch64: Add support for __pldir intrinsic
This patch adds support for the __pldir intrinsic.
This is a new prefetch intrinsic which declares an
intent to read from an address.
This intrinsic is part of FEAT_PCDPHINT.
gcc/ChangeLog:
* config/aarch64/aarch64-builtins.cc
(enum aarch64_builtins): New builtin flag.
(aarch64_init_pcdphint_builtins): New builtin function.
(aarch64_expand_pldir_builtin): Expander for new intrinsic.
(aarch64_general_expand_builtin): Call new expander.
* config/aarch64/aarch64.md
(aarch64_pldir): New pattern for intrinsic.
* config/aarch64/arm_acle.h
(__attribute__): New call to builtin.
(__pldir): Likewise.
Richard Ball [Tue, 6 Jan 2026 14:26:20 +0000 (14:26 +0000)]
aarch64: Add support for FEAT_PCDPHINT atomic_store intrinsics.
This patch adds support for the atomic_store_with_stshh intrinsic
in aarch64. This intrinsic is part of FEAT_PCDPHINT.
gcc/ChangeLog:
* config/aarch64/aarch64-builtins.cc
(enum aarch64_builtins): Add new flags.
(aarch64_init_pcdphint_builtins): Create new Builtin functions.
(aarch64_general_init_builtins): Call init for PCDPHINT.
(aarch64_expand_stshh_builtin): Expander for new intrinsic.
(aarch64_general_expand_builtin): Call new expander.
* config/aarch64/aarch64-c.cc
(aarch64_update_cpp_builtins): New feature.
* config/aarch64/aarch64.h (TARGET_PCDPHINT): Likewise.
* config/aarch64/arm_acle.h
(__atomic_store_with_stshh): New generic function to call builtins.
* config/aarch64/atomics.md
(@aarch64_atomic_store_stshh<mode>): New pattern for intrinsic.
* config/aarch64/iterators.md: New UNSPEC.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/atomic_store_with_stshh.c: New test.
Eric Botcazou [Tue, 6 Jan 2026 14:18:34 +0000 (15:18 +0100)]
Fix gcc.c-torture/execute/pr110817-[13].c on the SPARC
As discussed in the audit trail, the TARGET_VECTORIZE_GET_MASK_MODE hook of
the SPARC back-end always returns Pmode (SImode would probably have been OK)
and this causes build_truth_vector_type_for_mode to generate questionable
types like:
<vector_type 0x7ffff6f6da80
type <boolean_type 0x7ffff6f6d9d8 public QI
size <integer_cst 0x7ffff6e04f18 constant 8>
unit-size <integer_cst 0x7ffff6e04f30 constant 1>
align:8 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type
0x7ffff6f6d9d8 precision:1 min <integer_cst 0x7ffff6f69678 -1> max
<integer_cst 0x7ffff6f7deb8 0>>
DI
size <integer_cst 0x7ffff6e04e28 type <integer_type 0x7ffff6e150a8
bitsizetype> constant 64>
unit-size <integer_cst 0x7ffff6e04e40 type <integer_type 0x7ffff6e15000
sizetype> constant 8>
align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type
0x7ffff6f6da80 nunits:1>
which then go through this trick in store_constructor:
/* Ensure no excess bits are set.
GCN needs this for nunits < 64.
x86 needs this for nunits < 8. */
auto nunits = TYPE_VECTOR_SUBPARTS (type).to_constant ();
if (maybe_ne (GET_MODE_PRECISION (mode), nunits))
tmp = expand_binop (mode, and_optab, tmp,
GEN_INT ((HOST_WIDE_INT_1U << nunits) - 1),
target, true, OPTAB_WIDEN);
if (tmp != target)
emit_move_insn (target, tmp);
break;
}
to yield code that cannot possibly work on a big-endian platform.
Coaxing build_truth_vector_type_for_mode to generate more sensible types
fixes the problem but runs afoul of the TARGET_VECTORIZE_GET_MASK_MODE
hook for some AVX512 modes, so is probably not worth the risk. Moreover,
I didn't manage to come up with a big-endian implementation of the above
trick that would make some sense for the questionable vector types, so the
fix simply disables it.
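The masking step quoted from store_constructor can be isolated into a standalone sketch (hypothetical helper, not the GCC code); note how it implicitly assumes the mask entries occupy the low-order bits of the word, which is the little-endian layout that fails on big-endian SPARC:

```cpp
#include <cstdint>

// Clears the excess bits of a vector-mask word when the integer mode has
// more precision than the vector has elements, mirroring the
// (HOST_WIDE_INT_1U << nunits) - 1 expression in store_constructor.
// Valid for nunits < 64.
std::uint64_t mask_excess_bits(std::uint64_t word, unsigned nunits) {
    return word & ((std::uint64_t{1} << nunits) - 1);
}
```

Everything above bit nunits-1 is discarded, so the trick only makes sense when the nunits mask entries live in those low-order bits.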
gcc/
PR target/121192
* expr.cc (store_constructor) <VECTOR_TYPE>: Disable the special
trick for uniform boolean vectors with integer modes and single-bit
mask entries on big-endian platforms.
Artemiy Volkov [Fri, 2 Jan 2026 11:18:19 +0000 (11:18 +0000)]
testsuite: rework some vect/complex testcases
This is the second stab at
https://gcc.gnu.org/pipermail/gcc-patches/2026-January/704823.html, which
concerns cleaning up some testcases in gcc.dg/vect/complex. The original
commit message reads:
---------------------------------------------------------------------------
Some of the testcases in the gcc.dg/vect/complex directory try to match
"stmt.*$internal_fn" in the slp1/vect logs, which leads to many false
positives; this patch changes this to "add new stmt: \[^\n\r]*$internal_fn",
making sure that the log fragments matched in this way are limited to
single lines and correspond to actual newly created GIMPLE statements.
This main change results in some fallout, necessitating the following
additional tweaks:
- For fast-math testcases, replace the "1"s in scan-tree-dump-times
directives by appropriate counts.
- XFAIL bb-slp and vect testcases featuring integral types,
since the cadd{90,270} optabs are not implemented for integral modes.
- Disable some FP16 tests for arm targets due to absence of cadd{90,270}
for V8HF.
- Replace "target { X } && ! target { Y }" selectors with the correct
"target { X && { ! Y } }" form.
- In bb-slp-complex-add-pattern-long.c, adjust the testcase header to
match other tests so that different scan-tree-dump-times directives
can be switched off selectively.
- In bb-slp-complex-add-pattern-long.c, remove an extraneous scan for
"Found COMPLEX_ADD_ROT90".
- In bb-slp-complex-add-pattern-int.c, use vect_complex_add_int instead of
vect_complex_add_byte.
---------------------------------------------------------------------------
Following Tamar's feedback, tweaks 2 and 3 above have been fixed by these
changes since v1:
- Change what dg-add-options does for arm_v8_3a{,_fp16}_complex_neon so
that the correct flags are returned regardless of configure-time values
of -mfpu.
- For integer tests, require MVE rather than AdvSIMD from the arm
backend's side, as only that ISA has cadd{90,270} for integral modes.
- Un-XFAIL testcases that gcc is currently able to vectorize, separately
for the arm and aarch64 backends.
Re-regtested on aarch64 (with and without SVE2) and arm.
Eric Botcazou [Tue, 6 Jan 2026 10:26:01 +0000 (11:26 +0100)]
Ada: Clear possible confusion in doc/install.texi
The sentence:
"If you need to build an intermediate version of GCC in order to
bootstrap current GCC, consider GCC 9.5: it can build the current Ada
and D compilers, and was also the version that declared C++17 support
stable."
is possibly confusing because it globs Ada and D together, whereas Ada
imposes no further requirement over C++ (GCC 5.4+) unlike D (GCC 9.4+).
gcc/
* doc/install.texi (Prerequisites): Remove reference to Ada in
conjunction with GCC 9.5 and adjust its GCC version requirement.
The order of evaluation of function arguments is unspecified in C++.
The function object_sizes_set_temp called object_sizes_set with two
calls to make_ssa_name() as arguments. Since make_ssa_name() has the
side effect of incrementing the global SSA version counter, different
architectures of the same compiler evaluated these calls in different
orders.
This resulted in non-deterministic SSA version numbering between
x86_64 and aarch64 hosts during cross-compilation, leading to
divergent object files.
Sequencing the calls into separate statements ensures deterministic
evaluation order.
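A reduced C++ illustration of the fix (next_version is a stand-in for make_ssa_name's version counter; all names are hypothetical):

```cpp
#include <utility>

// Stand-in for make_ssa_name's global SSA version counter.
static int counter = 0;
int next_version() { return ++counter; }

// Buggy shape: f(next_version(), next_version()) leaves it unspecified
// which call produces which argument value.
//
// Fixed shape: sequencing the calls into separate statements pins the
// evaluation order, so the numbering is deterministic on every host.
std::pair<int, int> sequenced_pair() {
    int first = next_version();   // always evaluated first
    int second = next_version();  // always evaluated second
    return {first, second};
}
```

With the sequenced form, the first result is always the smaller version number, independent of host compiler or architecture.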
2026-01-06 Jakub Jelinek <jakub@redhat.com>
Marco Falke <falke.marco@gmail.com>
PR tree-optimization/123351
* tree-object-size.cc (object_sizes_set_temp): Separate calls to
make_ssa_name to ensure deterministic execution order.
Thomas Koenig [Sun, 4 Jan 2026 19:09:39 +0000 (20:09 +0100)]
Generate a runtime error on recursive I/O, thread-safe
This patch is a version of Jerry's patch with one additional feature.
When locking a unit, the thread ID of the locking thread is also stored
in the gfc_unit structure. When the unit is found to be locked, it can
either have been locked by the same thread (bad, recursive I/O) or
by another thread (harmless).
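A sketch of that detection scheme in portable C++ (std::atomic_flag and std::thread stand in for the gfortran unit lock; this is not the libgfortran implementation):

```cpp
#include <atomic>
#include <thread>

// When acquiring the lock, remember which thread took it.  If a later
// acquisition attempt fails, compare the stored owner with the current
// thread: a match means recursive I/O (error), a mismatch means mere
// contention from another thread (harmless).
struct unit_like {
    std::atomic_flag locked = ATOMIC_FLAG_INIT;
    std::thread::id owner{};

    bool try_acquire() {
        if (!locked.test_and_set(std::memory_order_acquire)) {
            owner = std::this_thread::get_id();
            return true;
        }
        return false;  // already locked; caller checks is_recursive()
    }
    bool is_recursive() const {
        return owner == std::this_thread::get_id();
    }
    void release() {
        owner = {};
        locked.clear(std::memory_order_release);
    }
};
```

A failed try_acquire followed by a true is_recursive corresponds to the "locked by the same thread" error case described above.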
Regression-tested fully (make -j8 check in the gcc build directory) on
Linux, which links in pthreads by default. Steve checked on FreeBSD,
which does not do so.
Jerry DeLisle <jvdelisle@gcc.gnu.org>
Thomas Koenig <tkoenig@gcc.gnu.org>
PR libfortran/119136
gcc/fortran/ChangeLog:
* libgfortran.h: Add enum for new LIBERROR_RECURSIVE_IO.
libgfortran/ChangeLog:
* io/async.h (UNLOCK_UNIT): New macro.
(TRYLOCK_UNIT): New macro.
(LOCK_UNIT): New macro.
* io/io.h: Delete prototype for unused stash_internal_unit.
(check_for_recursive): Add prototype for this new function.
* io/transfer.c (data_transfer_init): Add call to new
check_for_recursive.
* io/unit.c (delete_unit): Fix comment.
(check_for_recursive): Add new function.
(init_units): Use new macros.
(close_unit_1): Likewise.
(unlock_unit): Likewise.
* io/unix.c (flush_all_units_1): Likewise.
(flush_all_units): Likewise.
* runtime/error.c (translate_error): Add translation for the
"Recursive I/O not allowed" runtime error message.
supers1ngular [Tue, 6 Jan 2026 01:09:02 +0000 (17:09 -0800)]
openmp: Improve Fortran Diagnostics for Linear Clause
This patch improves diagnostics for the linear clause,
providing a more accurate and intuitive recommendation
for remediation if the deprecated syntax is used.
It also updates the relevant test to reflect the changed
wording of the warning.
gcc/fortran/ChangeLog:
* openmp.cc (gfc_match_omp_clauses): New diagnostic logic.
libgomp/ChangeLog:
* testsuite/libgomp.fortran/pr84418-1.f90: Fix verbiage of
dg-warning to reflect updated warning.
Tamar Christina [Mon, 5 Jan 2026 20:56:03 +0000 (20:56 +0000)]
vect: teach vectorizable_call to predicate calls when they can trap [PR122103]
The following example
void f (float *__restrict c, int *__restrict d, int n)
{
for (int i = 0; i < n; i++)
{
c[i] = __builtin_sqrtf (c[i]);
}
}
compiled with -O3 -march=armv9-a -fno-math-errno -ftrapping-math needs to be
predicated on the conditional. It's invalid to execute the branch and use a
select to extract it later unless using -fno-trapping-math.
However as discussed in PR96373 while we probably shouldn't vectorize for the
cases where we can trap but don't support conditional operation there doesn't
seem to be a clear consensus on how GCC should handle trapping math.
As such similar to PR96373 I don't stop vectorization if trapping math and
the conditional operation isn't supported.
PR tree-optimization/122103
* gcc.target/aarch64/sve/pr122103_4.c: New test.
* gcc.target/aarch64/sve/pr122103_5.c: New test.
* gcc.target/aarch64/sve/pr122103_6.c: New test.
Tamar Christina [Mon, 5 Jan 2026 20:55:34 +0000 (20:55 +0000)]
vect: teach if-convert to predicate __builtin calls [PR122103]
The following testcase
void f (float *__restrict c, int *__restrict d, int n)
{
for (int i = 0; i < n; i++)
{
if (d[i] > 1000)
c[i] = __builtin_sqrtf (c[i]);
}
}
compiled with -O3 -march=armv9-a -fno-math-errno -ftrapping-math needs to be
predicated on the conditional. It's invalid to execute the branch and use a
select to extract it later unless using -fno-trapping-math.
This change in if-conversion changes what we used to generate:
PR tree-optimization/122103
* gcc.target/aarch64/sve/pr122103_1.c: New test.
* gcc.target/aarch64/sve/pr122103_2.c: New test.
* gcc.target/aarch64/sve/pr122103_3.c: New test.
Tamar Christina [Mon, 5 Jan 2026 20:55:05 +0000 (20:55 +0000)]
vect: update tests for -ftrapping-math support [PR122103]
Before going any further, this updates the existing testcases that really
require -fno-trapping-math to now use that.
It also adds three new tests for SVE. They will however fail until the last
patch but that's fine.
Notable is testcase gcc.target/aarch64/sve/unpacked_cond_frinta_2.c which
without -ftrapping-math (which it's explicitly checking for) generates worse
code because the vectorizer forces an unneeded unpack. This is however the
same issue with how the vectorizer picks VF as we've seen a number of times.
Tamar Christina [Mon, 5 Jan 2026 20:54:35 +0000 (20:54 +0000)]
middle-end: extend fma -> fms transformation to conditional optab [PR122103]
Currently in the simplifications between if-conversion and vect we rely on
match.pd to rewrite FMA into FMS if the accumulator is on a negated value.
However, if if-conversion instead produces a COND_FMA then this doesn't work,
and so the vectorizer can't generate a vector FMS or its other variants.
This extends the rules to include the COND_FMA variants. Because this happens
before the vectorization the vectorizer will take care of generating the LEN
variants and as such we don't need match.pd to know about those.
The added rules are the same as the ones directly above them just changing
FMA to COND_FMA.
gcc/ChangeLog:
PR tree-optimization/122103
* match.pd: Add COND_FMA to COND_FMS rewrite rules.
Tamar Christina [Mon, 5 Jan 2026 20:53:46 +0000 (20:53 +0000)]
middle-end: Add new conditional IFNs for existing math IFNs [PR122103]
For a few math IFNs we never declared the conditional variants. This is needed
to handle trapping math correctly. SVE already implements all of these using
the expected optabs.
This just adds the COND and COND_LEN optabs for SQRT, CEIL, FLOOR, ROUND and
RINT.
Note that we don't seem to have any documentation for the math IFNs as they look
like they're all on the optabs/original builtins. As such I only documented the
optabs as that's consistent.
which is incorrect: fsqrt can raise FE exceptions and so should be masked on p7,
as the inactive lanes can trigger incorrect FE errors, as the code in the PR
demonstrates.
In GCC 13 this was partially addressed for instructions that got lowered to IFNs
through r13-5979-gb9c78605039f839f3c79ad8fca4f60ea9a5654ed, but it never
addressed __builtin_math_fns. Assuming the direction of travel in PR96373 is
still valid, this extends the support.
While ERRNO trapping is controlled through flags, for trapping math the calls
and IFNs are not marked specifically. Instead, in gimple_could_trap_p_1 through
operation_could_trap_p, we default to assuming any floating-point operation
could trap if flag_trapping_math is set.
This extends gimple_could_trap_p_1 to do the same for __builtin_math_fns but
exclude instructions that the standard says can't raise FEs.
Jeff Law [Mon, 5 Jan 2026 16:34:28 +0000 (09:34 -0700)]
[RISC-V] Restore inline expansion of block moves on RISC-V in some cases
Edwin's patch to add a --param for a size threshold on block moves
inadvertently disabled using inline block moves for cases where the count is
unknown. This caused testsuite regressions (I don't remember which test, it
was ~6 weeks ago if not longer). I'd hoped Edwin would see the new failures,
but I suspect he's buried by transition stuff with Rivos/Meta.
This patch restores prior behavior when the count is unknown and no --param was
specified.
Bootstrapped and regression tested on both the BPI and Pioneer systems and
regression tested on riscv{32,64}-elf as well.
Pushing to the trunk after pre-commit CI does its thing.
gcc/
* config/riscv/riscv-string.cc (expand_block_move): Restore using
inlined memcpy/memmove for unknown counts if the param hasn't been
specified.
(expand_vec_setmem): Similarly for memset.
Pan Li [Mon, 5 Jan 2026 16:28:04 +0000 (09:28 -0700)]
[PATCH v1 2/2] RISC-V: Add run test case for vwadd/vwsub wx mis combine [PR123317]
From: Pan Li <pan2.li@intel.com>
Add test cases for the mistaken combine into the vwadd/vwsub wx patterns.
PR target/123317
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/pr123317-run-1.c: New test.
* gcc.target/riscv/rvv/autovec/pr123317-run-2.c: New test.
* gcc.target/riscv/rvv/autovec/pr123317-run-3.c: New test.
* gcc.target/riscv/rvv/autovec/pr123317-run-4.c: New test.
* gcc.target/riscv/rvv/autovec/pr123317-run.h: New test.
The vwaddu/vwsubu wx combine patterns take any_extend by
mistake; these operations are unsigned, so we must use zero_extend here.
This patch fixes the bug, which allowed sign_extend code patterns to
combine into vwaddu/vwsubu.wx.
PR target/123317
gcc/ChangeLog:
* config/riscv/autovec-opt.md: Take zero_extend for
both the vwaddu and vwsubu wx pattern.
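A scalar sketch of the loop shape involved (illustrative only; the real reproducers are the pr123317 tests listed above). The point is that the uint8_t scalar operand must be zero-extended, so a sign_extend form combining into vwaddu.wx would mis-compute values with the top bit set:

```c
#include <stdint.h>

/* Widening unsigned add of a scalar: with RVV autovectorization this
   is the vwaddu.wx shape.  The operands are unsigned, so 'b' must be
   zero-extended (0xFF -> 255), never sign-extended (0xFF -> -1).  */
void
widen_addu (uint16_t *restrict d, const uint16_t *restrict a,
            uint8_t b, int n)
{
  for (int i = 0; i < n; i++)
    d[i] = a[i] + b;   /* b is zero-extended to the element type.  */
}
```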
Alice Carlotti [Tue, 30 Dec 2025 10:12:45 +0000 (10:12 +0000)]
aarch64 doc: Fix incorrect function name
The documentation for aarch64's -mtrack-speculation referred to the
builtin function __builtin_speculation_safe_copy, but the actual
function name is __builtin_speculation_safe_value.
Pan Li [Sun, 28 Dec 2025 08:33:27 +0000 (16:33 +0800)]
Vect: Adjust depth_limit of vec_slp_has_scalar_use from 2 to 3
The RISC-V test case vx-6-u8.c recently started failing the vaaddu.vx asm
check when --param=gpr2vr-cost=2 is used. After some investigation, it
fails to vectorize after some middle-end changes. The depth_limit of the
function vec_slp_has_scalar_use is 2, so it returns -1 by design. The
slp_instance then grew to size 12 and we may see a log similar to the one below:
*_2 1 times vec_to_scalar costs 3 in epilogue
*_2 1 times vec_to_scalar costs 3 in epilogue
*_2 1 times vec_to_scalar costs 3 in epilogue
*_2 1 times vec_to_scalar costs 3 in epilogue
Vector cost: 18
Scalar cost: 9
Vectorization is then rejected on cost grounds.
This patch adjusts the depth_limit to 3, as suggested by Richard.
gcc/ChangeLog:
* tree-vect-slp.cc (vec_slp_has_scalar_use): Adjust the
depth_limit from 2 to 3.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/sat_add-cost-1.c: New test.
Tamar Christina [Mon, 5 Jan 2026 14:27:14 +0000 (14:27 +0000)]
AArch64: tweak inner-loop penalty when doing outer-loop vect [PR121290]
r16-3394-g28ab83367e8710a78fffa2513e6e008ebdfbee3e added a cost model adjustment
to detect invariant load and replicate cases when doing outer-loop vectorization
where the inner loop uses a value defined in the outer-loop.
In other words, it's trying to detect the cases where the inner loop would need
to do an ld1r and all inputs are then working on replicated values. The
argument is that in this case the vector loop is just the scalar loop since each
lane just works on the duplicated values.
But it had two shortcomings.
1. It's an all or nothing thing. The load and replicate may only be a small
percentage of the amount of data being processed. As such this patch now
requires the load and replicate to be at least 50% of the leafs of an SLP
tree. Ideally we'd just only increase body by VF * invariant leafs, but we
can't since the middle-end cost model applies a rather large penalty to the
scalar code (* 50) and as such the base cost ends up being too high and we
just never vectorize. The 50% is an attempt to strike a balance in this
awkward situation. Experiments show it works reasonably well and we get the
right codegen in all the test cases.
2. It does not take into account that a load + replicate whose vector value is
used in a by-index operation results in the load being decomposed back to
scalar. e.g.
ld1r {v0.4s}, x0
mul v1.4s, v2.4s, v0.4s
is transformed into
ldr s0, x0
mul v1.4s, v2.4s, v0.s[0]
and as such this case may actually be profitable because we're only doing a
scalar load of a single element, similar to the scalar loop.
This patch tries to detect (loosely) such cases and doesn't apply the penalty
for these. It's a bit hard to tell whether we end up with a by index
operation so early as the vectorizer itself is not aware of them and as such
the patch does not do an exhaustive check, but only does the most obvious
one.
gcc/ChangeLog:
PR target/121290
* config/aarch64/aarch64.cc (aarch64_possible_by_lane_insn_p): New.
(aarch64_vector_costs): Add m_num_dup_stmts and m_num_total_stmts.
(aarch64_vector_costs::add_stmt_cost): Use them.
(adjust_body_cost): Likewise.
gcc/testsuite/ChangeLog:
PR target/121290
* gcc.target/aarch64/pr121290.c: Move to...
* gcc.target/aarch64/pr121290_1.c: ...here.
* g++.target/aarch64/pr121290_1.C: New test.
* gcc.target/aarch64/pr121290_2.c: New test.
void f(const int *restrict in,
int *restrict out,
int n, int threshold)
{
for (int i = 0; i < n; ++i) {
int v = in[i];
if (v > threshold) {
int t = v * 3;
t += 7;
t ^= 0x55;
t *= 0x55;
t -= 0x5;
t &= 0xFE;
t ^= 0x55;
out[i] = t;
} else {
out[i] = v;
}
}
}
compiled at -O2
results in aggressive if-conversion which increases the number of dynamic
instructions and the latency of the loop as it has to wait for t to be
calculated now in all cases.
This has led to big performance losses in packages like zstd [1], which in turn
affects packaging and LTO speed.
The default cost model for if-conversion is overly permissive and allows if
conversions assuming that branches are very expensive.
This patch implements an if-conversion cost model for AArch64. AArch64 has a
number of conditional instructions that need to be accounted for, however this
initial version keeps things simple and is only really concerned about csel.
The issue specifically with csel is that it may have to wait for two arguments
to be evaluated before it can be executed. This means it has a direct
correlation to increases in dynamic instructions.
To fix this I add a new tuning parameter that indicates a rough estimation of
the branch misprediction cost of a branch. We then accept if-conversion while
the cost of this multiplied by the cost of branches is cheaper.
There is a basic detection of CINC and CSET because these usually are ok. We
also accept all if-conversion when not inside a loop. Because CE is not an RTL
SSA pass we can't do more extensive checks like checking if the csel is a loop
carried dependency. As such this is a best effort thing and intends to catch the
most egregious cases like the above.
This recovers the ~25% performance loss in zstd decoding and gives better
results than GCC 14 which was before the regression happened.
Additionally I've benchmarked on a number of cores all the attached examples
and checked various cases. On average the patch gives an improvement between
20-40%.
PR target/123017
* gcc.target/aarch64/pr123017_1.c: New test.
* gcc.target/aarch64/pr123017_2.c: New test.
* gcc.target/aarch64/pr123017_3.c: New test.
* gcc.target/aarch64/pr123017_4.c: New test.
* gcc.target/aarch64/pr123017_5.c: New test.
* gcc.target/aarch64/pr123017_6.c: New test.
* gcc.target/aarch64/pr123017_7.c: New test.
Paul Thomas [Mon, 5 Jan 2026 07:05:36 +0000 (07:05 +0000)]
Fortran: ICE in type-bound function with PDT result [PR 123071]
2026-01-05 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/123071
* resolve.cc (resolve_typebound_function): Make sure that the
class declared type is resolved.
(resolve_allocate_deallocate): Any kind of expr3 array ref will
need resolution not just constant size refs.
* trans-decl.cc (gfc_trans_deferred_vars): Exclude vtabs from
initialization.
(emit_not_set_warning): New function using code extracted from
gfc_generate_function_code.
(gfc_generate_function_code): PDT module procedures results
that have not been referenced must have the fake_result_decl
added to the symbol and emit_not_set_warning called. Likewise
replace explicit code with call to emit_not_set_warning.
gcc/testsuite
PR fortran/123071
* gfortran.dg/pdt_79.f03: New test.
Jeff Law [Sun, 4 Jan 2026 19:12:21 +0000 (12:12 -0700)]
Partially revert patch that made VXRM a global register on RISC-V
This is something that fell through the cracks in gcc-15. VXRM isn't heavily
used, so errors in this space could easily be going unnoticed right now.
Essentially we made VXRM a global register a while back, it was done somewhat
speculatively as we didn't have a case where it really mattered. Richard S.
then argued the patch was wrong and I agreed with him, but never got around to
reverting the hunk in question.
So that's what I'm finally doing here. Note that I kept the tests from the
patch which made VXRM a global. Those should continue to work.
Bootstrapped and regression tested on a BPI & Pioneer system and checked on
riscv{32,64}-elf as well.
gcc/
* config/riscv/riscv.cc (riscv_conditional_register_usage): Revert
patch that made VXRM a global register.
Keith Packard [Sun, 4 Jan 2026 18:56:24 +0000 (11:56 -0700)]
[PATCH] Add support for using picolibc
Selected for *-picolibc-* targets or when --with-picolibc is
passed to configure.
Add custom options for use with picolibc:
* '--oslib='. Allows targets to insert an OS library after the C
library in the LIB_PATH spec file fragment. This library maps a few
POSIX APIs used by picolibc to underlying system capabilities.
* '--crt0='. Allows targets to use an alternate crt0 in place of the
usual one as provided by Picolibc. Picolibc provides a range of
crt0 versions which this can be used to select among.
* '--printf=' and '--scanf='. Allows targets to customize the version
of printf and scanf linked from the C library.
Adds some new preprocessor variables allowing the C library to adjust
the specfile generation process without affecting target changes:
* LIBC_CPP_SPEC. A specfile fragment appended to cpp_spec. Picolibc
uses this to add preprocessor definitions when the --printf and
--scanf options are provided so that applications can detect the
available printf and scanf versions.
* LIBC_LINK_SPEC. A specfile fragment appended to link_spec. Picolibc
uses this to implement the --printf and --scanf options, passing
suitable --defsym options to the linker.
Documents the new driver options and target macros.
gcc/
* config.gcc: Add clause for picolibc.
* config/picolibc-spec.h: New file.
* config/picolibc.opt: Likewise.
* config/picolibc.opt.urls: Likewise.
* configure.ac: Add support for --with-picolibc.
* configure: Rebuilt.
* doc/invoke.texi: Document picolibc options.
* doc/tm.texi.in (LIBC_CPP_SPEC): Document.
(LIBC_LINK_SPEC): Similarly.
* doc/tm.texi: Rebuilt.
* gcc.cc (LIBC_CPP_SPEC): Provide default definition.
(LIBC_LINK_SPEC): Likewise.
(cpp_spec): Include LIBC_CPP_SPEC.
(link_spec): Similarly for LIBC_LINK_SPEC.
Richard Braun [Sun, 4 Jan 2026 18:43:35 +0000 (11:43 -0700)]
[PATCH] c6x: fix the scheduling of floating-point multiplication instructions
From: Richard Braun <richard.braun@sbg-systems.com>
Instructions have two time-related units associated with them: the
number of delay slots, and the functional unit latency. But some
floating-point multiplication instructions have a functional unit
latency that actually varies depending on the following instructions
scheduled on the same functional unit [1].
For example, the MPYDP instruction is described with a functional unit
latency of 4, but there are additional "cycle-other resource conflicts"
with a following MPYSPDP instruction.
In order to describe that, this patch introduces one pseudo functional
unit per affected instruction, and augments reservations individually
for all implemented instructions that may be affected when following.
[1] 4.3.2 .M-Unit Constraints - SPRUFE8B TMS320C674x DSP CPU and
Instruction Set Reference Guide
gcc/
* config/c6x/c6x-sched.md.in (mpydp_m_N__CROSS_,
mpyspdp_m_N__CROSS_, mpysp2dp_m_N__CROSS_): Update reservations.
* config/c6x/c6x-sched.md: Regenerated.
* config/c6x/c6x.md (m1dp, m1spdp, m2dp, m2spdp): New CPU units.
Signed-off-by: Richard Braun <richard.braun@sbg-systems.com>
Andrew Pinski [Sat, 3 Jan 2026 05:25:12 +0000 (21:25 -0800)]
testsuite: Create a variant of uninit-pred-7_a.c [PR123377]
So it turns out the xfail in uninit-pred-7_a.c didn't always trigger,
depending on the setting of logical-op-non-short-circuit.
So this creates a second copy of the testcase for the case
of `logical-op-non-short-circuit=0` without the xfail and then sets
`logical-op-non-short-circuit=1` for uninit-pred-7_a.c with still
the xfail.
Tested on x86_64-linux-gnu to make sure both pass and uninit-pred-7_a.c
xfails like it should. Also tested manually on powerpc64-linux-gnu to
see the bogus warning happen with logical-op-non-short-circuit=1.
PR testsuite/123377
gcc/testsuite/ChangeLog:
* gcc.dg/uninit-pred-7_a.c: Add
`--param logical-op-non-short-circuit=1` to the options.
* gcc.dg/uninit-pred-7_a_a.c: New test.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
[PATCH v2]: pch, target: update host hooks for NetBSD and OpenBSD
The PCH use_address hooks for NetBSD hosts have not yet been updated to allow
compiled headers to be loaded at an address different from their preferred
address.
This change updates host-netbsd.cc:netbsd_gt_pch_use_address() thus: if a
compiled header cannot be mapped at its preferred address, a region of memory
is allocated and the base address of this region is passed back to the caller
(ggc-common.cc:gt_pch_restore() I believe). Note that in this case the return
value is 0, allowing gt_pch_restore() to load the header. In this respect the
behaviour is slightly different from that of the use_address hook for other
hosts (e.g. Linux).
This change against GCC 15.2.0 builds on the work in pch/71934 (and
target/58937)
gcc/
* config/host-netbsd.cc (netbsd_gt_pch_use_address): Support PCH
loading at addresses other than its preferred address.
* config/host-openbsd.cc (openbsd_gt_pch_use_address): Likewise.
Daniel Barboza [Sun, 4 Jan 2026 17:44:51 +0000 (10:44 -0700)]
[PATCH v4] match.pd: (c?a:b) op d -> c ? (a op d):(b op d) [PR122608]
Add a pattern to handle cases where we have an OP that is
unconditionally being applied in the result of a gcond. In this case we
can apply OP to both legs of the conditional. E.g:
t = b ? 10 : 20;
t = t + 20;
becomes just:
t = b ? 30 : 40
A variant pattern was also added to handle the case where the gcond
result is used as the second operand. This was needed because most of
the ops we're handling aren't commutative.
PR tree-optimization/122608
gcc/ChangeLog:
* match.pd (`(c ? a : b) op d -> c ? (a op d) : (b op d)`): New
pattern.
(`d op (c ? a : b) -> c ? (d op a) : (d op b)`): Likewise
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr110701.c: The pattern added now folds
an XOR into the ifcond and the generated assembly no longer
emits an 'andl'. The test was turned into a runtime test
instead.
* gcc.dg/torture/pr122608.c: New test.
Signed-off-by: Daniel Barboza <daniel.barboza@oss.qualcomm.com>
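A minimal C rendering of the new fold (the function name is made up for illustration):

```c
/* Before the fold: compute the conditional result, then apply the op.
   After the fold the op is pushed into both arms, where it constant-
   folds: b ? (10 + 20) : (20 + 20), i.e. b ? 30 : 40.  */
int
cond_add (int b)
{
  int t = b ? 10 : 20;   /* (c ? a : b)          */
  t = t + 20;            /* ... op d             */
  return t;              /* folds to b ? 30 : 40 */
}
```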
Jeff Law [Sun, 4 Jan 2026 16:30:01 +0000 (09:30 -0700)]
[PR target/123010] Simplify shift of sign extracted field to a sign extending shift
In pr123010 we have a case where we should be getting a single slliw, but
instead we get a 3-insn sequence.
As noted in the BZ we had this before combine:
> (insn 6 3 13 2 (set (reg:DI 137)
> (sign_extend:DI (ashift:SI (subreg/s/u:SI (reg/v:DI 135 [ a ]) 0)
> (const_int 1 [0x1])))) "j.c":10:14 312 {ashlsi3_extend}
> (expr_list:REG_DEAD (reg/v:DI 135 [ a ])
> (nil)))
Which is exactly what we want: a single slliw instruction. Then
combine generates this:
> (insn 6 3 13 2 (parallel [
> (set (reg:DI 137)
> (sign_extract:DI (reg:DI 139 [ a ])
> (const_int 31 [0x1f])
> (const_int 0 [0])))
> (clobber (scratch:DI))
> ]) "j.c":10:14 333 {*extractdi3}
> (expr_list:REG_DEAD (reg:DI 139 [ a ])
> (nil)))
> (insn 13 6 14 2 (set (reg/i:DI 10 a0)
> (ashift:DI (reg:DI 137)
> (const_int 1 [0x1]))) "j.c":11:1 297 {ashldi3}
> (expr_list:REG_DEAD (reg:DI 137)
> (nil)))
Which is due to a define_insn_and_split mis-behaving a bit.
The first approach was to define an insn for the case where we left shift a
sign extracted bitfield where the sign bit of the bitfield gets shifted into
bit 31. Theory being this might be a reasonably common occurrence and having a
pattern for it might be useful (and there's a similar pattern one could write
for a small number of zero extended fields getting shifted left as well).
That turns out to be a problem though as the sign extension is obfuscated
making it harder to track the state of the sign bits and thus harder to
eliminate later sign extensions. Those regressions can be fixed, but doing so
requires the revamp of the sign/zero extension patterns to eliminate several of
the define_insn_and_splits. Larger change than I really want to do right now.
We could also throttle back the most problematic define_insn_and_split. It's
likely viable, though probably not the best use of time given my desire to
clean up the define_insn_and_splits, including this one.
We can also recognize this case and simplify it. Essentially when we have
(ashift (sign_extract ...)) recognize the case when we're extracting a bitfield
starting at bit 0 and the bit field is shifted such that the sign bit of the
bitfield moves into bit position 7, 15 or 31. In that case we can simplify it
to (sign_extend (ashift ...))
This patch takes the last of those three approaches. Bootstrapped and
regression tested on x86_64, and riscv (BPI and Pioneer) as well as going
through all the embedded targets without regressions. I was somewhat worried
about loongarch due to a pattern in its machine description, but the two tests
added with that pattern still pass, so it seems OK too.
PR target/123010
gcc/
* simplify-rtx.cc (simplify_binary_operation_1, case ASHIFT): Simplify
case where a left shift of the sign extracted field can be turned into
a sign extension of a left shift.
gcc/testsuite
* gcc.target/riscv/pr123010.c: New test.
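The equivalence the simplify-rtx change exploits can be checked in plain C (a hypothetical 64-bit sketch; the left shifts go through unsigned types to avoid UB on negative values, and the arithmetic right shift of a negative value is the GCC-defined behavior):

```c
#include <stdint.h>

/* Combine's form: sign_extract:DI (a, 31, 0), i.e. take bits 0..30,
   sign-extend from bit 30, then shift left by 1.  The field's sign
   bit lands in bit 31.  */
int64_t
extract_then_shift (int64_t a)
{
  int64_t field = (int64_t) ((uint64_t) a << 33) >> 33;
  return (int64_t) ((uint64_t) field << 1);
}

/* Simplified form: a 32-bit left shift followed by sign extension --
   on RISC-V a single slliw.  */
int64_t
shift_then_extend (int64_t a)
{
  return (int64_t) (int32_t) ((uint32_t) a << 1);
}
```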
testsuite: Require effective target bitint for test case
Fix for the following test error on pru-unknown-elf:
FAIL: gcc.dg/Wzero-as-null-pointer-constant-2.c (test for excess errors)
Excess errors:
.../gcc/gcc/testsuite/gcc.dg/Wzero-as-null-pointer-constant-2.c:9:19: sorry, unimplemented: '_BitInt(4)' is not supported on this target
Andrew Pinski [Sat, 3 Jan 2026 23:09:40 +0000 (15:09 -0800)]
testsuite: Add new variant of pr42196-3.c
While working on complex lowering, I noticed that the
testcase pr42196-3.c had some interesting code in it
and most likely a copy and pasto. Since this testcase
was added back in 2009, I rather add a new testcase
rather than changing the old one.
The testcase was doing:
```
if (b)
{
f1 = __real__ u.cf;
f1 = __imag__ u.cf;
}
else
{
f1 = __real__ u.ci;
f1 = __imag__ u.ci;
}
r = bar (f1, f2);
```
I suspect the second f1 in both sides of the conditional
were supposed to be f2. So the new testcase does that.
Tested on x86_64-linux-gnu and pushed as obvious.
PR tree-optimization/42196
gcc/testsuite/ChangeLog:
* gcc.c-torture/compile/pr42196-4.c: New test.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Pietro Monteiro [Sat, 3 Jan 2026 16:47:42 +0000 (11:47 -0500)]
algol68: Improve testsuite initialization
In algol68_link_flags, remove unused variables and move finding the
link spec file to algol68_init, making it multilib aware.
Set always-used compiler flags in algol68_init instead of
algol68_target_compile. This makes the log file for RUNTESTFLAGS="-v"
4 times smaller.
gcc/testsuite/ChangeLog:
* lib/algol68.exp (algol68_link_flags): Remove unused
variables and move finding the link spec file to...
(algol68_init): Here and make it multilib aware. Set always
used compiler flags here from algol68_target_compile.
Signed-off-by: Pietro Monteiro <pietro@sociotechnical.xyz>
Jakub Jelinek [Sat, 3 Jan 2026 13:27:41 +0000 (14:27 +0100)]
widening_mul: Fix up .SAT_{ADD,SUB,MUL} pattern recognition [PR123372]
The following testcase ICEs since r15-1671, because the match.pd pattern
now allows a cast and the function checks whether the ifn is supported
on a wrong type. .SAT_{ADD,SUB,MUL} are binary ifns, so they care about
the type of their first operand:
#define binary_direct { 0, 0, true }
where
/* optabs can be parameterized by one or two modes. These fields describe
how to select those modes from the types of the return value and
arguments. A value of -1 says that the mode is determined by the
return type while a value N >= 0 says that the mode is determined by
the type of argument N. A value of -2 says that this internal
function isn't directly mapped to an optab. */
but this function (unlike the function right below it for the
same ifns) checks the type of the lhs, which since that change can
actually be a different type (expansion performs the operation on the
argument types and then casts the result to the lhs type).
So, e.g. on x86_64 -m32, it checks whether the ussubsi3 insn can be used
(which it can), but then actually uses it on DImode arguments, and
ussubdi3 is TARGET_64BIT only. Similarly for -m64 it checks ussubsi3 too
instead of ussubti3 (which doesn't exist).
2026-01-03 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/123372
* tree-ssa-math-opts.cc
(build_saturation_binary_arith_call_and_replace): Pass type of op_0
rather than type of lhs as second argument to
direct_internal_fn_supported_p.
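The shape that triggers the bug can be sketched in C (hypothetical names; the widening cast on the result is what makes the lhs type differ from the operand type):

```c
#include <stdint.h>

/* Unsigned saturating subtraction on uint32_t -- an idiom GCC's
   widening_mul pass recognizes as .SAT_SUB (SImode operands).  */
static uint32_t
sat_sub (uint32_t a, uint32_t b)
{
  /* a >= b: mask is all-ones, keep a - b; a < b: mask is 0.  */
  return (a - b) & -(uint32_t) (a >= b);
}

/* Since r15-1671 the match.pd pattern also matches through a cast,
   so the lhs here is DImode while the ifn still operates on SImode;
   the support check must therefore use the operand type.  */
uint64_t
sat_sub_widened (uint32_t a, uint32_t b)
{
  return (uint64_t) sat_sub (a, b);
}
```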
Jakub Jelinek [Sat, 3 Jan 2026 11:18:53 +0000 (12:18 +0100)]
c++: Allow gnu::gnu_inline inline methods to be key methods [PR123326]
As gnu::gnu_inline inline/constexpr virtual methods are just inlined, but
don't have their out of line bodies emitted in the current TU, yet their
out of line copies are referenced in the vtable, I think it makes sense
to allow those to be key methods, as some other TU needs to provide the out
of line copy of the method and the vtable can go in that TU.
While this is in theory an ABI change, I seriously doubt anything in the
wild actually uses it, exactly because it is hard to use it correctly
(one needs to do something like libstdc++ with #ifdefs and compiling the
TU with out of line copy with different -std= flag or other special preprocessor
macros) plus constexpr virtual methods are C++20 and later anyway.
Or should this be done only for methods declared with constexpr where
it is even harder to do?
With just inline one could do
struct S {
[[gnu::gnu_inline]] inline virtual int foo () { return 42; }
};
in the header and
int S::foo () { return 42; }
in one of the TUs, but that doesn't work for constexpr, because constexpr
virtual method can't be overridden with non-constexpr one.
2026-01-03 Jakub Jelinek <jakub@redhat.com>
PR libstdc++/123326
* class.cc (determine_key_method): Allow virtual inline/constexpr
non-pure virtual methods with gnu::gnu_inline attribute to be key
methods.
Jakub Jelinek [Sat, 3 Jan 2026 11:16:51 +0000 (12:16 +0100)]
c++: Fix up check for typeid on polymorphic type before C++20 [PR123347]
The following testcase ICEs since the TYPE_POLYMORPHIC_P macro was
changed to allow being used only on RECORD_TYPE/UNION_TYPE.
This particular spot wasn't adjusted.
2026-01-03 Jakub Jelinek <jakub@redhat.com>
PR c++/123347
* constexpr.cc (potential_constant_expression_1): Check for
CLASS_TYPE_P before using TYPE_POLYMORPHIC_P on TREE_TYPE (e).
Eric Botcazou [Sat, 3 Jan 2026 10:47:35 +0000 (11:47 +0100)]
Ada: Fix infinite loop on iterated element association with iterator and key
Unlike when the key expression is not present, Resolve_Iterated_Association
analyzes instead of preanalyzes the iterator specification, which causes the
expander to be invoked on an orphaned copy of the iterator expression.
gcc/ada/
PR ada/123371
* sem_aggr.adb (Resolve_Iterated_Association): Call Preanalyze
instead of Analyze consistently, as well as Copy_Separate_Tree
instead of New_Copy_Tree.
gcc/testsuite/
* gnat.dg/specs/aggr10.ads: New test.
Martin Uecker [Sun, 21 Dec 2025 18:10:56 +0000 (19:10 +0100)]
c: Fix ICE for invalid code with variadic and old-school prototypes [PR121507]
When mixing an old-school definition without a prototype and a new C23
variadic function without named arguments, there can be an ICE
when trying to form the composite type. Avoid this by letting it
fail later due to incompatible types.
Martin Uecker [Thu, 25 Dec 2025 17:27:33 +0000 (18:27 +0100)]
c: Fix construction of composite type for atomic pointers [PR121081]
When constructing the composite type of two atomic pointer types,
we used "qualify_type" which did not copy the "atomic" qualifier.
Use c_build_type_attribute_qual_variant instead.
Paul Thomas [Sat, 3 Jan 2026 07:37:28 +0000 (07:37 +0000)]
Fortran: Invalid association with operator-result selector [PR123352]
2026-01-03 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/123352
* gfortran.h: Add prototype for gfc_resolve_symbol.
* interface.cc (matching_typebound_op): If the current
namespace has not been resolved and the derived type is use
associated, resolve the derived type with gfc_resolve_symbol.
* match.cc (match_association_list): If the associate name is
unknown type and the selector is an operator expression, copy
the selector and call gfc_extend_expr. Replace the selector if
there is a match, otherwise free the copy.
* resolve.cc (gfc_resolve_symbol): New function.
gcc/testsuite/
PR fortran/123352
* gfortran.dg/associate_78.f90: New test.
The macOS awk seems to not like having an unparenthesized conditional
expression as the last argument to printf. This commit works around
this by simply replacing the conditional expression with a conditional
statement.
Tested with gawk and mawk.
Signed-off-by: Jose E. Marchesi <jemarch@gnu.org>
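The workaround pattern, sketched with illustrative values (not the actual libga68 fragment):

```shell
# macOS awk rejects an unparenthesized conditional expression as the
# last printf argument:
#   awk 'BEGIN { x = 1; printf "%s\n", x ? "yes" : "no" }'
# Portable rewrite using a conditional statement instead:
awk 'BEGIN { x = 1; if (x) s = "yes"; else s = "no"; printf "%s\n", s }'
```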
libga68/ChangeLog