Richard Biener [Thu, 22 Jan 2026 10:11:18 +0000 (11:11 +0100)]
tree-optimization/123756 - remove now bogus assert in reduction vect
With r16-5372-gfacb92812a4ec5 I have generalized reduction operator
support to allow (masked) internal functions in more cases. The
following removes a now bogus assert given IFN_FMAX is now allowed
given IFN_COND_FMAX is available.
Andrew Pinski [Thu, 22 Jan 2026 07:53:32 +0000 (23:53 -0800)]
testsuite: don't test for shrink wrapping for arm thumb1 on pr46555.c [PR123751]
Thumb1 does not support shrink wrapping so the check for shrink
wrapping in pr46555.c needs to be disabled for that. It does work
with both thumb2 and arm modes.
Pushed after testing for arm with `-mthumb -march=armv8-m.base`,
`-marm -mcpu=cortex-a72` and `-mthumb -mcpu=cortex-a72` to make
sure the correct tests are happening and still pass.
PR testsuite/123751
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/pr46555.c: Disable for arm thumb1.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
David Malcolm [Thu, 22 Jan 2026 01:28:39 +0000 (20:28 -0500)]
sarif-replay: improve path output when source is unavailable [PR122622]
For cases where sarif-replay can't find the source, text output with
-fdiagnostics-path-format=inline-events and HTML output both lead to
the event locations and messages in replayed execution paths not being
printed at all.
Fixed thusly.
gcc/ChangeLog:
PR diagnostics/122622
* diagnostics/paths-output.cc: Include "diagnostics/file-cache.h".
(event_range::print_as_text): Generalize the fallback logic for
special locations to also cover the case where source-printing
will fail, and show the location for that case.
(event_range::print_as_html): Likewise.
(event_range::can_print_source_p): New.
gcc/testsuite/ChangeLog:
PR diagnostics/122622
* sarif-replay.dg/2.1.0-valid/missing-source-pr122622-check-html.py:
New test script.
* sarif-replay.dg/2.1.0-valid/missing-source-pr122622.sarif: New
test.
* sarif-replay.dg/2.1.0-valid/spec-example-4.sarif: Update
expected output to reflect showing event locations and text.
* sarif-replay.dg/2.1.0-valid/tutorial-example.sarif: Likewise.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
liuhongt [Mon, 19 Jan 2026 08:02:21 +0000 (00:02 -0800)]
Add u-arch tune prefer_bcst_from_integer.
/* X86_TUNE_PREFER_BCST_FROM_INTEGER: Enable broadcast from integer for
128/256/512-bit vector, if disabled, the move will be done by
broadcast/load from constant pool
broadcast from integer:
mov $0xa,%eax
vmovd %eax,%xmm0
vpbroadcastd %xmm0,%xmm0
broadcast/load from constant pool:
vpbroadcastd CST.0(%rip), %xmm0 */
The tune is on by default.
gcc/ChangeLog:
PR target/123631
* config/i386/i386-expand.cc (ix86_vector_duplicate_value):
Don't force CONST_INT to reg if !TARGET_PREFER_BCST_FROM_INTEGER,
force it to mem instead.
* config/i386/i386.h (TARGET_PREFER_BCST_FROM_INTEGER): New macro.
* config/i386/x86-tune.def
(X86_TUNE_PREFER_BCST_FROM_INTEGER): New tune.
David Malcolm [Wed, 21 Jan 2026 18:43:03 +0000 (13:43 -0500)]
analyzer: ensure that concrete binding keys don't overlap
Add assertions on the internal state of the analyzer, made
possible by r16-4334-g310a70ef6db45d, to verify that concrete binding
keys don't overlap.
Doing so uncovers an issue in the state merging where overzealous merging
of partially-overlapping maps could lead to a malformed merged state.
Fix this by rejecting such state mergers.
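The invariant being asserted can be sketched as follows. This is only a minimal illustration, not the analyzer's code: `bit_range` is a hypothetical stand-in for a concrete binding key (a start bit plus a non-zero size in bits), and `keys_overlap_p` is a name invented for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a concrete binding key: the half-open bit
// range [start, start + size) within a cluster.  Sizes are assumed to be
// non-zero.
struct bit_range
{
  std::uint64_t start;
  std::uint64_t size;
  std::uint64_t end () const { return start + size; }
};

// Return true if any two ranges share at least one bit.  Once the ranges
// are sorted by start offset, an overlap anywhere implies an overlap
// between some range and its immediate predecessor.
bool
keys_overlap_p (std::vector<bit_range> keys)
{
  std::sort (keys.begin (), keys.end (),
             [] (const bit_range &a, const bit_range &b)
             { return a.start < b.start; });
  for (std::size_t i = 1; i < keys.size (); ++i)
    if (keys[i].start < keys[i - 1].end ())
      return true;
  return false;
}
```

Under this model, a merge whose result would make `keys_overlap_p` return true is the kind of merger that gets rejected.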
gcc/analyzer/ChangeLog:
* store.cc (binding_cluster::validate): Reimplement as...
(binding_map::validate): ...this new function, using
m_concrete and m_symbolic sizes rather than iterating through
map and counting. Verify that concrete keys do not overlap.
(binding_cluster::can_merge_p): Reject cases that would lead to
overlapping concrete clusters.
* store.h (binding_cluster::validate): New decl.
(binding_map::get_concrete_bindings): New accessor.
gcc/testsuite/ChangeLog:
* c-c++-common/analyzer/flex-without-call-summaries.c: Skip on
C++98 and tweak xfails to reflect slight differences in where
we hit exploration limits.
* c-c++-common/analyzer/raw-data-cst-pr117262-1.c: Add params to
force full exploration of the loop.
* gcc.dg/analyzer/pr93355-localealias.c (read_alias_file): Drop
xfail.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
David Malcolm [Wed, 21 Jan 2026 18:42:48 +0000 (13:42 -0500)]
analyzer: different decls don't alias
While debugging an issue where a binding_map could erroneously have
overlapping concrete bindings, I noticed a couple of cases in
haproxy-2.7 arising due to decls with !tracked_p leading to eval_alias
being called by store::set_value, and erroneously returning TS_UNKNOWN,
leading to writes to those decls affecting other decls.
Fixed thusly.
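To illustrate why an "unknown" alias answer is harmful here, the toy model below tracks known values per declaration. Everything in it is hypothetical (`toy_store`, `binding`, the `distinct_decls_dont_alias` switch); it is a sketch of the effect described above, not the analyzer's store implementation.

```cpp
#include <map>
#include <string>

enum class tristate { unknown, is_true, is_false };

// Toy alias query between two named declarations.  The flag selects
// between the old answer ("unknown") and the new one ("false") for
// distinct declarations.
tristate
eval_alias (const std::string &a, const std::string &b,
            bool distinct_decls_dont_alias)
{
  if (a == b)
    return tristate::is_true;
  return distinct_decls_dont_alias ? tristate::is_false : tristate::unknown;
}

struct binding { bool known; int value; };
typedef std::map<std::string, binding> toy_store;

// Write V to DST.  Anything that might alias DST loses its known value.
void
set_value (toy_store &s, const std::string &dst, int v,
           bool distinct_decls_dont_alias)
{
  for (toy_store::iterator it = s.begin (); it != s.end (); ++it)
    if (it->first != dst
        && eval_alias (dst, it->first, distinct_decls_dont_alias)
           == tristate::unknown)
      it->second.known = false;
  s[dst] = binding{true, v};
}

// Usage: does what we know about "b" survive a write to "a"?
bool
b_survives_write_to_a (bool distinct_decls_dont_alias)
{
  toy_store s;
  s["a"] = binding{true, 1};
  s["b"] = binding{true, 2};
  set_value (s, "a", 5, distinct_decls_dont_alias);
  return s["b"].known && s["b"].value == 2;
}
```

With the "unknown" answer, the write to `a` wipes out the knowledge about `b`; with distinct declarations treated as non-aliasing, it does not.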
gcc/analyzer/ChangeLog:
* store.cc (store::eval_alias): Different decls don't alias.
gcc/testsuite/ChangeLog:
* c-c++-common/analyzer/aliasing-4.c: New test.
* c-c++-common/analyzer/aliasing-5.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Gaius Mulley [Wed, 21 Jan 2026 18:33:30 +0000 (18:33 +0000)]
[PR modula2/123739] The help option descriptions are not capitalized
All option descriptions are now capitalized and some of the help
messages have been improved. A Warning tag (after the language label)
has been added to all warning options.
gcc/m2/ChangeLog:
PR modula2/123739
* lang.opt: All option descriptions are now capitalized.
(Wcase-enum): Warning tag added.
(Wpedantic-param-names): Ditto.
(Wpedantic-cast): Ditto.
(Wverbose-unbounded): Ditto.
(Wstyle): Ditto.
(Wuninit-variable-checking): Ditto.
(Wuninit-variable-checking=): Ditto.
Marek Polacek [Tue, 20 Jan 2026 18:53:12 +0000 (13:53 -0500)]
c++: introduce maybe_get_first_fn
In reflect.cc, we have maybe_get_reflection_fndecl to maybe
extract the BASELINK / OVL_FIRST from an expression. We could
introduce a more general function that the rest of the compiler
can use as well, based on maybe_get_fns. I currently do not see
any spots that could use this new function, though.
Marek Polacek [Fri, 16 Jan 2026 19:13:16 +0000 (14:13 -0500)]
c++: tighten up is_std_substitution
During Reflection review it came up that is_std_substitution
handles NAMESPACE_DECLs accidentally: it wants either a class or a
class template, but is looking at the type of any decl. With this
patch, we return false for any _DECL except TYPE_DECL and
DECL_CLASS_TEMPLATE_P, but let's verify that we now don't return false
for something that used to yield true.
gcc/cp/ChangeLog:
* mangle.cc (is_std_substitution): Return false for any _DECL except
TYPE_DECL and DECL_CLASS_TEMPLATE_P. Verify that we don't return false
for something that used to yield true. Use NULL_TREE instead of NULL.
s390: Don't emulate vec_cmpgtuv1tiv1ti for VXE3 [PR122781]
Starting with VXE3, 128-bit integer compares are natively supported.
For older machines those compares are emulated via
*vec_cmpeq<mode><mode>_nocc_emu and *vec_cmpgt<mode><mode>_nocc_emu and
*vec_cmpgtu<mode><mode>_nocc_emu. The latter was missing !TARGET_VXE3
in the condition which resulted in emulating unsigned greater-than
compares instead of making use of the new instructions enabled by
r15-7051.
PR target/122781
gcc/ChangeLog:
* config/s390/vector.md: Don't emulate vec_cmpgtu for 128-bit
integers for VXE3.
Jakub Jelinek [Wed, 21 Jan 2026 13:32:26 +0000 (14:32 +0100)]
c++: Fix ICE in build_base_path -> resolves_to_fixed_type_p -> fixed_type_or_null [PR123692]
The move of the resolves_to_fixed_type_p call earlier in build_base_path
for constexpr virtual inheritance caused the following ICE.
What changed is that in an unevaluated context expr can be a CALL_EXPR with
class type and when it is not a ctor,
resolves_to_fixed_type_p -> fixed_type_or_null
ICEs on it:
if (CLASS_TYPE_P (TREE_TYPE (instance)))
{
/* We missed a build_cplus_new somewhere, likely due to tf_decltype
mishandling. */
gcc_checking_assert (false);
Now, the reason why that worked fine when resolves_to_fixed_type_p was
later in the function is that there was
/* This must happen before the call to save_expr. */
expr = cp_build_addr_expr (expr, complain);
in between the new and old calls to resolves_to_fixed_type_p, and for the
uneval case like that
cp_build_addr_expr -> unary_complex_lvalue -> build_cplus_new
wraps the CALL_EXPR into a TARGET_EXPR already, so the later call
was happy.
Now, none of fixed_type_p, virtual_access or nonnull values are ever
used in the build_base_path uneval path.
if (code == PLUS_EXPR
&& !want_pointer
&& !has_empty
&& !uneval
&& !virtual_access)
return build_simple_base_path (expr, binfo);
doesn't look at it,
if (!want_pointer)
{
rvalue = !lvalue_p (expr);
/* This must happen before the call to save_expr. */
expr = cp_build_addr_expr (expr, complain);
}
else
expr = mark_rvalue_use (expr);
neither, nothing soon after it either and very soon we
if (uneval)
{
expr = build_nop (ptr_target_type, expr);
goto indout;
}
and indout: doesn't use it either.
So, if only uneval expressions have problems with moving the
resolves_to_fixed_type_p call earlier, the following patch fixes
it by just not calling that function at all in the uneval case
because we will not care about the result anyway.
2026-01-21 Jakub Jelinek <jakub@redhat.com>
PR c++/123692
* class.cc (build_base_path): Don't call resolves_to_fixed_type_p if
uneval.
Jakub Jelinek [Wed, 21 Jan 2026 13:30:03 +0000 (14:30 +0100)]
c++: Fix ICE with constant evaluation of a = {CLOBBER} with ptrmemfn [PR123677]
The following testcase ICEs when we evaluate a = {CLOBBER} stmt.
The code assumes that if type is an aggregate type and *valp is
non-NULL, then it must be a CONSTRUCTOR.
That is usually the case, but there is one exception, *valp can
be a PTRMEM_CST if TYPE_PTRMEMFUNC_P (type) and in that
case CONSTRUCTOR_ELTS (*valp) obviously ICEs or misbehaves.
Now, while I could do something like
if (*valp && (!TYPE_PTRMEMFUNC_P (type) || TREE_CODE (*valp) != PTRMEM_CST))
just making sure TREE_CODE (*valp) == CONSTRUCTOR seems much easier
and more readable.
2026-01-21 Jakub Jelinek <jakub@redhat.com>
PR c++/123677
* constexpr.cc (cxx_eval_store_expression): Only clear
CONSTRUCTOR_ELTS (*valp) if *valp is CONSTRUCTOR.
Eric Botcazou [Wed, 21 Jan 2026 10:47:42 +0000 (11:47 +0100)]
Ada: Fix visibility issue on generic parent from nested generic package
The problem is that we temporarily push onto the scope stack and install
the declarations of a package that is already on the scope stack and whose
declarations are already visible so, when the temporary condition is over,
the declarations are uninstalled, thus making them definitively invisible.
It comes from the use of the idiom Scope_Within_Or_Same (Current_Scope, S)
to detect whether S is open in the current scope, but that's not robust in
the presence of transient scopes or during instantiation of generic units.
gcc/ada/
* sem_ch13.adb (Analyze_Aspects_At_Freeze_Point): Replace call to
Scope_Within_Or_Same (Current_Scope, S) with In_Open_Scopes (S) to
test whether S is open in the current scope.
* sem_util.adb (From_Nested_Package): Likewise.
If the default CPU under testing merely supports Thumb2, but doesn't
default to it, thumb2-pop-loreg.c fails because the expected pop
instruction won't be found. Enable -mthumb explicitly.
Alexandre Oliva [Wed, 21 Jan 2026 03:30:14 +0000 (00:30 -0300)]
testsuite: arm: bf16_vstn_1 vst3q_bf16 improved by late-combine
The late-combine pass removes some unnecessary register copying in
bf16_vstn_1.c, copying that was expected by vst3q_bf16. Adjust the
expectations so that they match the better code we get now.
for gcc/testsuite/ChangeLog
* gcc.target/arm/simd/bf16_vstn_1.c: Adjust expectations for
code improved by late-combine.
Alexandre Oliva [Wed, 21 Jan 2026 03:30:10 +0000 (00:30 -0300)]
testsuite: enable use of cxa_atexit on tree-ssa cxa_atexit tests
The expected cxa_atexit calls and optimizations won't take place on
targets that don't have -fuse-cxa-atexit enabled by default. Enabling
it explicitly is enough to meet the expectations of the cxa_atexit
tests.
Alexandre Oliva [Wed, 21 Jan 2026 03:30:01 +0000 (00:30 -0300)]
testsuite: silence nolto-rel warning in pr62026_0.C
On some targets, pr62026_0.C issues a warning about implicitly passing
-flinker-output=nolto-rel, which flags the test as a failure. Pass
the flag explicitly to avoid the warning.
Alexandre Oliva [Wed, 21 Jan 2026 03:29:58 +0000 (00:29 -0300)]
testsuite: enable use of cxa_atexit on abi tag18a test
The _ZZZ1fB7__test1vEN1T1gEvE1x symbol is only output with the ABI tag
when cxa_atexit is available for use. On targets that default to
-fno-use-cxa-atexit, the test fails gratuitously. Add an explicit
-fuse-cxa-atexit, along with a requirement for cxa_atexit support.
for gcc/testsuite/ChangeLog
* g++.dg/abi/abi-tag18a.C: Require and enable cxa_atexit.
Alexandre Oliva [Wed, 21 Jan 2026 03:29:54 +0000 (00:29 -0300)]
testsuite: add hostedlib requirements to multiple C++ tests
Various C++ tests added in the gcc-15 cycle require features that are
only available when libstdc++-v3 is built in hosted mode, so they fail
when this is not the case. Skip them if ! hostedlib.
Alexandre Oliva [Wed, 21 Jan 2026 03:29:48 +0000 (00:29 -0300)]
testsuite: _Static_assert is not present in C90
fp8-helpers-neon.c uses _Static_assert, but the compiler warns that
this feature is not present in C90. Annotate it as an __extension__
to silence the warning, preserving the intent of the test.
for gcc/testsuite/ChangeLog
* gcc.target/aarch64/acle/fp8-helpers-neon.c: Silence
warnings about _Static_assert.
Alexandre Oliva [Wed, 21 Jan 2026 03:29:44 +0000 (00:29 -0300)]
testsuite: require c99_runtime for ldexp optimizations
ldexpf and ldexpl are only optimized as expected when the target libc
is known to have C99 support, so don't expect those optimizations when
HAVE_C99_RUNTIME is not set by builtins-config.h.
for gcc/testsuite/ChangeLog
* gcc.dg/tree-ssa/ldexp.c: Require HAVE_C99_RUNTIME to test
ldexpf and ldexpl.
Alexandre Oliva [Wed, 21 Jan 2026 03:29:36 +0000 (00:29 -0300)]
testsuite: drop explicit run from dfp execution tests
dfp support isn't built into libgcc when fenv.h doesn't define the
expected FE_* macros, but some dfp tests go "dg-do run", overriding
the default "compile" when dfprt is not available.
Drop the overrider.
for gcc/testsuite/ChangeLog
* gcc.dg/dfp/c23-decimal64x-1.c: Drop the explicit dg-do run.
* gcc.dg/dfp/c23-decimal64x-3.c: Likewise.
Jeff Law [Tue, 20 Jan 2026 22:03:28 +0000 (15:03 -0700)]
[PR rtl-optimization/123380] Avoid creating bogus SUBREG in combine
In this issue we try to call gen_rtx_SUBREG with arguments that will trigger an
assertion failure. In particular we're trying to create a paradoxical subreg
of an HFmode object where the paradoxical is in DImode. That's obviously a
change in size. validate_subreg returns false for that case, thus triggering
the assertion.
Like other cases in combine.cc and elsewhere we can check validate_subreg
before we call gen_rtx_SUBREG and if validate_subreg returns false, we can
return a safe value. So that's all this patch does.
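The "validate first, otherwise return a safe value" pattern can be sketched with a toy model. None of the names below are GCC's: `toy_validate_subreg` only assumes a simplified rule (a subreg involving a floating-point mode may not change size), and `rtx_kind::failure` stands in for whatever safe value the real code returns.

```cpp
// Toy machine modes: a class plus a size in bits.
enum class mode_class { integer, floating };
struct machine_mode { mode_class cls; unsigned bits; };

const machine_mode HF_mode = { mode_class::floating, 16 };
const machine_mode SI_mode = { mode_class::integer, 32 };
const machine_mode DI_mode = { mode_class::integer, 64 };

struct reg { machine_mode mode; };

enum class rtx_kind { subreg, failure };
struct rtx { rtx_kind kind; machine_mode mode; };

// Simplified stand-in for the validator: reject size-changing subregs
// when either mode is a floating-point mode.
bool
toy_validate_subreg (machine_mode outer, machine_mode inner)
{
  bool any_float = (outer.cls == mode_class::floating
                    || inner.cls == mode_class::floating);
  return !(any_float && outer.bits != inner.bits);
}

// Ask the validator before building the subreg, so that the constructor's
// assertion can never fire; give up with a safe value otherwise.
rtx
toy_gen_lowpart (machine_mode outer, const reg &r)
{
  if (!toy_validate_subreg (outer, r.mode))
    return rtx{rtx_kind::failure, outer};
  return rtx{rtx_kind::subreg, outer};
}
```

In this model, requesting a DImode view of an HFmode register yields the failure value instead of tripping an assertion.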
Bootstrapped and regression tested on x86_64, also regression tested on
riscv{32,64}-elf. Pushing to the trunk.
PR rtl-optimization/123380
gcc/
* combine.cc (gen_lowpart_for_combine): Don't try to create a
paradoxical SUBREG if it's going to be rejected by validate_subreg.
gcc/testsuite/
* gcc.target/riscv/pr123380.c: New test.
Jeff Law [Tue, 20 Jan 2026 20:57:10 +0000 (13:57 -0700)]
[RISC-V][PR target/123626] Fix VXRM state after calls
This is a partial fix for a long standing issue that Richard S. raised about a
year ago.
Specifically he indicated that he believed our handling of VXRM mode switching
was wrong and could lead to incorrect code, particularly WRT handling of calls.
Without rehashing everything related to VXRM, it's sufficient to say that it has
no known value at function entry or upon returning from a call.
If we look at the main scan loop in mode-switching we have:
The way to think about this is if INSN requests a mode and it is not the same
as the last mode, then we've got a new point where we need to logically insert
a mode switch and we clear TRANSP.
A CALL_INSN in the RISC-V backend produces NO_MODE (it doesn't need VXRM
state). So we never get into the then clause of that inner if statement and
TRANSP stays on.
The fix is quite simple. We need one more state in the VXRM mode switching
that indicates we don't know VXRM's state after a call. While I could have
hacked up the various hooks to special case CALLs, it was just as easy to
adjust the attribute's generic handling so that any CALL_P is given the
VXRM_MODE_CLOBBER state.
Out of an abundance of caution we'll filter out any actual code generation
that would set VXRM to the CLOBBER state.
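The effect of the extra state can be sketched with a toy scan over a straight-line instruction sequence. This is not the mode-switching pass: the enum values other than the CLOBBER idea, the `insn` struct and `place_mode_sets` are all made up for illustration, and the `calls_clobber` flag merely toggles between the old and new behaviour.

```cpp
#include <cstddef>
#include <vector>

// Toy mode set.  VXRM_NONE means "this insn does not care".
enum vxrm_mode { VXRM_NONE, VXRM_RNU, VXRM_RDN, VXRM_CLOBBER };

struct insn { bool is_call; vxrm_mode needs; };

// Mode requested by an insn.  With CALLS_CLOBBER, a call reports the
// CLOBBER state instead of "don't care".
vxrm_mode
mode_needed (const insn &i, bool calls_clobber)
{
  if (i.is_call)
    return calls_clobber ? VXRM_CLOBBER : VXRM_NONE;
  return i.needs;
}

// Return the indices of insns before which a real VXRM write is emitted.
std::vector<std::size_t>
place_mode_sets (const std::vector<insn> &insns, bool calls_clobber)
{
  std::vector<std::size_t> sets;
  vxrm_mode last = VXRM_NONE;   // Unknown at function entry.
  for (std::size_t i = 0; i < insns.size (); ++i)
    {
      vxrm_mode m = mode_needed (insns[i], calls_clobber);
      if (m != VXRM_NONE && m != last)
        {
          // Switching "to" the clobbered state emits no code; it only
          // records that the previous value can no longer be trusted.
          if (m != VXRM_CLOBBER)
            sets.push_back (i);
          last = m;
        }
    }
  return sets;
}

// Usage: an insn needing RNU, then a call, then another insn needing RNU.
const std::vector<insn> example_insns
  = { { false, VXRM_RNU }, { true, VXRM_NONE }, { false, VXRM_RNU } };
```

Without the CLOBBER state the model emits a single VXRM write for `example_insns`, so the insn after the call runs with whatever the callee left behind; with it, a second write is placed after the call.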
This is enough to make the testcase pass for rv64. It's still failing rv32,
but likely for completely different reasons. It obviously doesn't cause any
regressions on riscv{32,64}-elf and bootstraps will fire up later today on the
Pioneer and BPI.
PR target/123626
gcc/
* config/riscv/vector.md (vxrm_mode): Handle CALL_INSNs, which set
the attribute to the new VXRM_MODE_CLOBBER state.
* config/riscv/riscv.cc (riscv_emit_mode_set): Don't emit code when
VXRM's state changes to VXRM_MODE_CLOBBER.
gcc/testsuite
* gcc.target/riscv/rvv/base/pr123626.c: New test.
Patrick Palka [Tue, 20 Jan 2026 20:25:07 +0000 (15:25 -0500)]
c++: non-dep reversed <=> returning int [PR123601]
The code path for non-dependent operator expressions rewritten from a
<=> returning a non-class type (added in r16-3727-gf2fddc4b84a843) is
also reached for a reversed such <=> expression with the form
0 @ (y <=> x) where the @ is <=>, so we need to relax the relevant
assert accordingly.
PR c++/123601
gcc/cp/ChangeLog:
* tree.cc (build_min_non_dep_op_overload): Relax
COMPARISON_CLASS_P assert to accept SPACESHIP_EXPR too.
Andrew MacLeod [Mon, 19 Jan 2026 18:44:25 +0000 (13:44 -0500)]
Do not trap on a stmt with no basic block.
When calculating ranges for statements not in the IL, avoid looking for
range on entry values.
PR tree-optimization/123314
gcc/
* gimple-range.cc (gimple_ranger::range_on_entry): Do not check
ranger cache for an SSA_NAME with no BB.
(gimple_ranger::prefill_stmt_dependencies): Stop filling
dependencies when an out-of-IL name is encountered.
PR sarif-replay/123056 notes that when using sarif-replay to generate
HTML from a .sarif file containing an embedded "sarif:/" link we get
bogus output containing SGR codes.
The links in question come from GCC's sarif output for cross-referencing
event IDs within an execution path.
These links are JSON pointers. I experimented with properly supporting
the JSON Pointer spec (RFC 6901) within GCC, and I have a partially
working implementation which parses JSON pointers here, and, where
appropriate, reconstructs the pertinent event ID.
However, that feels too invasive to be pushing in stage 4. Hence for
GCC 16, this patch simply skips the link part of "sarif:/" links in
sarif-replay, avoiding corrupt output, deferring the more ambitious
round-tripping fix to GCC 17.
gcc/ChangeLog:
PR sarif-replay/123056
* libsarifreplay.cc (struct embedded_link): Move decl earlier.
(sarif_replayer::append_embeddded_link): New.
(sarif_replayer::make_plain_text_within_result_message): Move the
link-replay logic to the above, and skip the link part of
intra-sarif links.
gcc/testsuite/ChangeLog:
PR sarif-replay/123056
* sarif-replay.dg/2.1.0-valid/3.11.6-embedded-links-pr123056.sarif: New test.
* sarif-replay.dg/2.1.0-valid/embedded-links-pr123056-check-html.py:
New test script.
* sarif-replay.dg/2.1.0-valid/embedded-links-pr123056-check-sarif-roundtrip.py:
New test script.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
David Malcolm [Tue, 20 Jan 2026 17:06:11 +0000 (12:06 -0500)]
analyzer: fix -Wmaybe-uninitialized of 'edge_sense'
gcc/analyzer/ChangeLog:
* checker-event.cc (cfg_edge_event::maybe_get_edge_sense): New.
* checker-event.h (cfg_edge_event::maybe_get_edge_sense): New decl.
* diagnostic-manager.cc
(diagnostic_manager::consolidate_conditions): Use the above to
ensure that edge_sense is initialized if used, and to simplify
the check for a run of conditions.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Jakub Jelinek [Tue, 20 Jan 2026 16:06:54 +0000 (17:06 +0100)]
c++: Fix -Wmisleading-indentation ICE on expansion stmt [PR123694]
The following testcase ICEs in the -Wmisleading-indentation warning.
In C++11 and later, an expansion statement can appear among the keywords
the warning sees. It is represented by its first keyword (an expansion
statement starts with a pair of keywords), so RID_TEMPLATE.
2026-01-20 Jakub Jelinek <jakub@redhat.com>
PR c++/123694
* c-indentation.cc (guard_tinfo_to_string): Handle RID_TEMPLATE
for C++11 or later.
Jakub Jelinek [Tue, 20 Jan 2026 14:38:24 +0000 (15:38 +0100)]
aarch64: Ignore debug stmts in aarch64_possible_by_lane_insn_p [PR123724]
Like in many other spots, when walking immediate uses for optimization decisions
we should just ignore debug stmts. aarch64_possible_by_lane_insn_p wasn't
ignoring those and wasted time on doing lookup for those and ICEd because
it wasn't a vectorizable stmt.
Fixed by ignoring debug stmts early during the immediate use walk.
Marek Polacek [Fri, 16 Jan 2026 18:00:35 +0000 (13:00 -0500)]
c++: adjust visibility of _DECLs with no linkage
This came up during Reflection review:
<https://gcc.gnu.org/pipermail/gcc-patches/2026-January/705175.html>
Certain entities, e.g. vars in public inline functions or template
functions should be exported. But entities defined within TU-local
entities should not be exported.
This patch adjusts min_vis_expr_r to that effect as per the Reflection
discussion.
gcc/cp/ChangeLog:
* decl2.cc (min_vis_expr_r): For _DECLs with no linkage refer to the
linkage of the containing entity that does have a name with linkage.
Roger Sayle [Tue, 20 Jan 2026 14:07:55 +0000 (14:07 +0000)]
PR rtl-optimization/123585: ICE in vec_select simplification on x86_64.
This patch is my proposed fix to PR rtl-optimization/123585, an ICE caused
by some incorrect logic in a gcc_assert.
2026-01-20 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
PR rtl-optimization/123585
* simplify-rtx.cc (simplify_context::simplify_binary_operation_1)
<case VEC_SELECT>: Correct gcc_assert when optimizing a vec_select
of a vec_select with differing vector lengths.
gcc/testsuite/ChangeLog
PR rtl-optimization/123585
* gcc.target/i386/pr123585.c: New test case.
Kyrylo Tkachov [Thu, 15 Jan 2026 13:22:46 +0000 (05:22 -0800)]
aarch64: Adjust predicate used for SVE2 SHA3 XAR rotate amount
While fixing the Advanced SIMD XAR patterns I looked at SVE2 and
it looks okay there but the rotate amount should use the
aarch64_simd_rshift_imm predicate rather than lshift_imm since the
instruction (unlike the Advanced SIMD version) takes values from
[1, bitwidth].
Bootstrapped and tested on aarch64-none-linux-gnu.
Kyrylo Tkachov [Thu, 15 Jan 2026 13:10:31 +0000 (05:10 -0800)]
aarch64: PR target/123584 - Fix expansion of SHA3 XAR with 0 amount
In this PR the vxarq_u64 intrinsic gets passed a rotate amount of 0
and the patterns don't handle it right. Because we adjust RTL amount
during expand to account for the canonical representation we end up
emitting a V2DImode rotate of 64, which the output instruction is not
prepared to handle. What we should be doing is leaving it as 0 in
that case, which is what this patch does.
A XAR with a rotate of 0 is really just an EOR and we could have emitted
it as such but I thought that, at least at -O0, it would be nicer to emit
the XAR-0 form as it's still a legal instruction and the user did ask for
it through the intrinsic. At -O1 and above the optimisers kick in and simplify
it to an EOR anyway.
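A scalar model of one 64-bit lane makes both points concrete. This is a sketch, not the backend code: `rotr64`, `xar64` and `rotl_amount` are illustrative names, and it assumes the canonical form adjusted to during expansion is a rotate-left by 64 minus the immediate.

```cpp
#include <cstdint>

// Rotate right, well defined for every amount including 0.  (A naive
// "x << (64 - n)" with n == 0 would shift by 64, which is undefined.)
std::uint64_t
rotr64 (std::uint64_t x, unsigned n)
{
  n &= 63;
  return n ? (x >> n) | (x << (64 - n)) : x;
}

// One lane of XAR: exclusive-OR the inputs, then rotate right by IMM.
std::uint64_t
xar64 (std::uint64_t a, std::uint64_t b, unsigned imm)
{
  return rotr64 (a ^ b, imm);
}

// Rotate-left amount equivalent to a rotate-right by IMM.  Blindly
// computing 64 - IMM yields 64 for IMM == 0, so zero is left as zero.
unsigned
rotl_amount (unsigned imm)
{
  return imm == 0 ? 0 : 64 - imm;
}
```

With an immediate of 0, `xar64 (a, b, 0)` is simply `a ^ b`, which matches the observation that XAR-0 is really just an EOR.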
Note: the SVE2 XAR instruction doesn't suffer from this problem because a
rotate amount of 0 is actually not allowed by the instruction itself and
the early intrinsic validation rejects it anyway.
Bootstrapped and tested on aarch64-none-linux-gnu.
PR target/123584
* config/aarch64/aarch64-simd.md (aarch64_xarqv2di): Leave zero
rotate amounts as zero during expansion.
(*aarch64_xarqv2di_insn): Account for zero rotate amounts. Print #
in rotate immediate.
gcc/testsuite/
PR target/123584
* gcc.target/aarch64/torture/xar-zero.c: New test.
Prachi Godbole [Tue, 20 Jan 2026 04:49:38 +0000 (20:49 -0800)]
ipa-reorder-for-locality - Introduce C++ template heuristics
This patch introduces a new heuristic for reordering functions, to be used
in the absence of profile information. This approach uses C++ template
instantiation types to group functions together. Entry functions are sorted
in the beginning, and callees are sorted as part of partition_callchain ().
Bootstrapped and tested on aarch64-none-linux-gnu.
Prachi Godbole [Tue, 20 Jan 2026 04:31:43 +0000 (20:31 -0800)]
ipa-reorder-for-locality - Address compile time issues for locality cloning pass
This patch attempts to reduce compile time for locality cloning pass by
reducing recursive calls to partition_callchain (). This is achieved by
precomputing caller callee information into locality_info. locality_info
stores all callees of a node, either directly or via inlined nodes thereby
avoiding calls to partition_callchain () for inlined nodes which are already
partitioned with their inlined_to nodes. locality_info stores precomputed
accumulated incoming edge frequencies per unique caller and avoids repeated
computation within partition_callchain (). It also stores preaccumulated and
sorted outgoing edge frequencies for unique callees.
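The kind of precomputation described here can be sketched on a flat edge list. The types and function names below are hypothetical and do not mirror the pass's `locality_info`; in particular, sorting outgoing frequencies in descending order is an assumption of this sketch.

```cpp
#include <algorithm>
#include <map>
#include <utility>
#include <vector>

// Hypothetical call-graph edge: caller id, callee id, edge frequency.
struct edge { int caller; int callee; double freq; };

// Sum the frequencies of all edges into NODE, keyed by unique caller, so
// that later queries need not walk the edge list again.
std::map<int, double>
accumulate_incoming (const std::vector<edge> &edges, int node)
{
  std::map<int, double> by_caller;
  for (const edge &e : edges)
    if (e.callee == node)
      by_caller[e.caller] += e.freq;
  return by_caller;
}

// Sum the frequencies of edges out of NODE per unique callee and return
// them sorted from the hottest callee to the coldest.
std::vector<std::pair<int, double> >
sorted_outgoing (const std::vector<edge> &edges, int node)
{
  std::map<int, double> by_callee;
  for (const edge &e : edges)
    if (e.caller == node)
      by_callee[e.callee] += e.freq;
  std::vector<std::pair<int, double> > out (by_callee.begin (),
                                            by_callee.end ());
  std::stable_sort (out.begin (), out.end (),
                    [] (const std::pair<int, double> &a,
                        const std::pair<int, double> &b)
                    { return a.second > b.second; });
  return out;
}

// Usage: node 3 is called twice by node 1 and once by node 2, and calls
// nodes 4 and 5.
const std::vector<edge> example_edges
  = { { 1, 3, 1.5 }, { 1, 3, 2.5 }, { 2, 3, 1.0 },
      { 3, 4, 0.5 }, { 3, 5, 2.0 } };
```

Computing these tables once per node trades a little memory for not recomputing the same sums on every visit.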
This patch refines is_entry_node_p () check by calling local_p () instead of
just alias check.
An approximately 45% compile-time improvement is observed for the
bootstrap-lto-locality config, which now takes 2-5% more time than
bootstrap-lto.
This patch also handles appropriate memory management of pass specific data
structures.
Bootstrapped and tested on aarch64-none-linux-gnu.
Richard Biener [Tue, 20 Jan 2026 09:57:34 +0000 (10:57 +0100)]
middle-end/123697 - fix .MASK_LOAD_LANES folding
.MASK_LOAD_LANES has an aggregate (array of vectors) return value
which is not compatible with the else value used when trying to
fold this with all lanes inactive. Instead use an empty CTOR if
the else value is zero and otherwise do not simplify.
PR middle-end/123697
* gimple-fold.cc (gimple_fold_partial_load_store): Use an
empty CTOR for a zero else value in .MASK_LOAD_LANES.
Jakub Jelinek [Tue, 20 Jan 2026 11:01:07 +0000 (12:01 +0100)]
libstdc++: Disable __cpp_lib_reflection for old CXX ABI
Reflection currently doesn't work with -D_GLIBCXX_USE_CXX11_ABI=0.
The problem is that std::meta::exception currently uses under the
hood std::string and std::u8string and those aren't constexpr in
the old ABI.
While those members are in the standard exposition-only and so
we could make it work by writing a custom class template that
just remembers const char{,8_t} * and size_t, there shouldn't be
many people trying to use C++26 features with the ABI that isn't
even compatible with C++11.
Last night I was surprised because make check help.exp reported the missing
dot at the end of cobol/lang.opt description below:
Wmove-index
Cobol Warning Var(move_index, 1) Init(1)
Warn if MOVE INDEX is used
but has not reported
fexec-national-charset=
Cobol Joined Var(cobol_national_charset) RejectNegative
Set the default execution character set for NATIONAL data items
a few lines earlier.
The problem is that help.exp verified output of
--help={common,optimizers,param,target,warnings} and just selected FEs:
--help={ada,c,c++,d,fortran,go} and no other languages.
Wmove-index above got reported because it appears in --help=warnings,
but fexec-national-charset= didn't, because it only appears in --help=cobol
So, the following patch adds 6 further languages to what help.exp tests
and fixes the reported bugs.
2026-01-20 Jakub Jelinek <jakub@redhat.com>
gcc/testsuite/
* gcc.misc-tests/help.exp: Check for descriptions without terminating
dot or semicolon also for objc, objc++, rust, modula-2, cobol and
algol68.
gcc/rust/
* lang.opt (frust-crate=, frust-extern=,
frust-incomplete-and-experimental-compiler-do-not-use,
frust-max-recursion-depth=, frust-crate-type=, frust-mangling=,
frust-cfg=, frust-edition=, frust-embed-metadata,
frust-metadata-output=, frust-compile-until=,
frust-name-resolution-2.0, frust-panic=, frust-overflow-checks): Add
dot at the end of the description.
gcc/cobol/
* lang.opt (fexec-national-charset=): Add dot at the end of the
description.
gcc/algol68/
* lang.opt (std=algol68, std=gnu68): Add dot at the end of the
description.
gcc/m2/
* lang.opt (Wpedantic-param-names, Wpedantic-cast, Wverbose-unbounded,
Wstyle, fauto-init, fbounds, fcase, fcpp, fcpp-end, fcpp-begin,
fdebug-builtins, fd, fdebug-function-line-numbers, fdef=,
fdump-system-exports, fextended-opaque, ffloatvalue,
fgen-module-list=, findex, fiso, flocation=, fm2-debug-trace=,
fm2-dump=, fm2-dump-decl=, fm2-dump-gimple=, fm2-dump-quad=,
fm2-dump-filter=, fm2-file-offset-bits=, fm2-g, fm2-lower-case,
fm2-pathname=, fm2-pathname-root=, fm2-pathname-rootI=, fm2-plugin,
fm2-prefix=, fm2-statistics, fm2-strict-type, fm2-strict-type-reason,
fm2-whole-program, fmod=, fnil, fpim, fpim2, fpim3, fpim4,
fpositive-mod-floor-div, fpthread, fq, frange, freturn,
fruntime-modules=, fscaffold-dynamic, fscaffold-c, fscaffold-c++,
fscaffold-main, fscaffold-static, fshared, fsoft-check-all, fsources,
fswig, fuse-list=, fwideset, fwholediv, fwholevalue, save-temps,
save-temps=): Add dot at the end of the description.
Richard Biener [Tue, 20 Jan 2026 09:24:20 +0000 (10:24 +0100)]
tree-optimization/123729 - fix reduction epilog flowing into abnormal edge
When we vectorize a reduction and the reduction value flows across
an abnormal edge we have to make sure to mark the final SSA properly.
The following serves as a recipe for how to avoid blindly copying
SSA_NAME_OCCURS_IN_ABNORMAL_PHI but instead set it when needed during
use replacement.
PR tree-optimization/123729
* tree-vect-loop.cc (vect_create_epilog_for_reduction): Set
SSA_NAME_OCCURS_IN_ABNORMAL_PHI if the reduction flows
across an abnormal edge.
libgomp: Ensure memory sync after performing tasks
As described in PR 122356 there is a theoretical bug around not
"publishing" user data written in a task when that task has been
executed by a thread after entry to a barrier.
Key points of the C memory model that are relevant:
1) Memory writes can be seen in a different order in different threads.
2) When one thread (A) reads a value with acquire memory ordering that
another thread (B) has written with release memory ordering, then all
data written in thread (B) before the write that set this value will
be visible to thread (A) after that read.
3) This point requires that the read and write operate on the same
value. The guarantee is one-way: It specifies that thread (A) will
see the writes that thread (B) has performed before the specified
write. It does not specify that thread (B) will see writes that
thread (A) has performed before reading this value.
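Point 2 can be illustrated with a small, self-contained sketch. It uses C++ `std::atomic` rather than libgomp's own primitives, and the function and variable names are invented for the example.

```cpp
#include <atomic>
#include <thread>

// Thread (B) writes plain data and then sets a flag with release
// ordering.  Thread (A) waits for the flag with acquire ordering and only
// then reads the data.
int
publish_and_read ()
{
  int user_data = 0;                  // Plain, non-atomic data.
  std::atomic<bool> ready (false);
  int seen = -1;

  std::thread writer ([&user_data, &ready]
    {
      user_data = 42;                                  // Ordinary write...
      ready.store (true, std::memory_order_release);   // ...then publish.
    });
  std::thread reader ([&user_data, &ready, &seen]
    {
      while (!ready.load (std::memory_order_acquire))
        std::this_thread::yield ();
      seen = user_data;   // Ordered after the writer's store to user_data.
    });

  writer.join ();
  reader.join ();
  return seen;
}
```

The guarantee is one-way, as point 3 says: nothing here lets the writer observe anything the reader did before its acquire load.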
Outline of the issue:
1) While there is a memory sync at entry to the barrier, user code can
be run after threads have all entered the barrier.
2) There are various points where a memory sync can occur after entry to
the barrier:
- One thread getting the `task_lock` mutex that another thread has
released.
- Last thread incrementing `bar->generation` with `MEMMODEL_RELEASE`
and some other thread reading it with `MEMMODEL_ACQUIRE`.
However there are code paths that can avoid these points.
3) On the code-paths that can avoid these points we could have no memory
synchronisation between a write to user data that happened in a task
executed after entry to the barrier, and some other thread running
the implicit task after the barrier. Hence that "other thread" may
read a stale value that should have been overwritten in the explicit
task.
There are two code-paths that I believe I've identified:
1) The last thread sees `task_count == 0` and increments the generation
with `MEMMODEL_RELEASE` before continuing on to the next implicit
task.
If some other thread had executed a task that wrote user data I
don't see any way in which an acquire-release ordering *from* the
thread writing user data *to* the last thread would have been formed.
2) After all threads have entered the barrier, some thread (A) is
waiting in `do_wait`. Some other thread (B) completes a task writing
user data. Thread (B) increments the generation using
`gomp_team_barrier_done` (non atomically -- hence not allowing the
formation of any acquire-release ordering with this write). Thread
(A) reads that data with `MEMMODEL_ACQUIRE`, but since the write was
not atomic that does not form an ordering.
This patch makes two changes:
1) The write of `task_count == 0` in `gomp_barrier_handle_tasks` is done
atomically while the read of `task_count` in
`gomp_team_barrier_wait_end` is also made atomic. This addresses the
first case by forming an acquire-release ordering *from* the thread
executing tasks *to* the thread that will increment the generation
and continue.
2) The write of `bar->generation` via `gomp_team_barrier_done` called
from `gomp_barrier_handle_tasks` is done atomically. This means that
it will form an acquire-release synchronisation with the existing
atomic read of `bar->generation` in the main loop of
`gomp_team_barrier_wait_end`.
Testing done:
- Bootstrap & regtest on aarch64 and x86_64.
- With & without _LIBGOMP_CHECKING_.
- Testsuite with & without OMP_WAIT_POLICY=passive
- Cross compilation & regtest on arm.
- TSAN done on this as part of all my upstream patches.
libgomp: Enforce tasks executed lexically after scheduled
In PR122314 we noticed that our implementation of a barrier could
execute tasks from the next "Task scheduling" region. This was because
of a race condition where a barrier could be "completed", and some
thread raced ahead to schedule another task on the "next" barrier all
before some other thread checks for a bit on the generation number to
tell if there is a task pending.
The solution provided here is to check whether the generation number has
"incremented" past the state that this barrier was entered with. As it
happens the `state` variable already provided to
`gomp_barrier_handle_tasks` is enough for the targets to tell whether
the current global generation has incremented from the existing one.
This requires some changes in the two loops in bar.c that are waiting on
tasks being available. These loops now need to check for "generation
has incremented" rather than "generation is identical to one increment
forward". Without such an adjustment of the check a thread that is
refusing to execute tasks because they have been scheduled for the next
barrier will not continue into the next region until some other thread
has completed the task (and removed the BAR_TASK_PENDING flag).
This problem could be seen by a hang in testcases like
task-reduction-13.c.
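The difference between the two checks can be sketched as follows. The bit
layout (`kFlagMask`, `kGenInc`) and both function names are assumptions made
for this illustration only; they are not the constants or helpers used in
bar.c.

```cpp
// Hypothetical layout for this sketch only: the low bits of the generation
// word hold flags, and each completed barrier adds kGenInc.
constexpr unsigned kFlagMask = 0x7u;
constexpr unsigned kGenInc = 0x8u;

// "Generation is identical to one increment forward": fails as soon as
// some other thread has set a flag bit for the next barrier.
inline bool exactly_one_ahead(unsigned state, unsigned current) {
  return current == (state & ~kFlagMask) + kGenInc;
}

// "Generation has incremented": ignores flag bits and only asks whether
// the counter part has moved past the value the barrier was entered with.
// Unsigned subtraction is modulo 2^N, so counter wrap-around is tolerated
// as long as the two values are less than half the range apart.
inline bool has_incremented(unsigned state, unsigned current) {
  const unsigned delta = (current & ~kFlagMask) - (state & ~kFlagMask);
  return delta != 0 && delta < (~0u / 2);
}
```

With `state == 0x10` and `current == 0x19` (next generation plus one flag
bit), the first check is false while the second is true, which is the
situation that could leave a waiting thread stuck.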
Testing done:
- Bootstrap & regtest on aarch64 and x86_64.
- With & without _LIBGOMP_CHECKING_.
- Testsuite with & without OMP_WAIT_POLICY=passive
- Cross compilation & regtest on arm.
- TSAN done on this as part of all my upstream patches.
Jakub Jelinek [Tue, 20 Jan 2026 00:18:51 +0000 (01:18 +0100)]
cobol: Fix up -Wmove-index option description
I'm seeing
FAIL: compiler driver --help=warnings option(s): "^ +-.*[^:.]\$" absent from output: " -Wmove-index Warn if MOVE INDEX is used"
That is a test which verifies all option descriptions end with a dot or colon.
Fixed thusly:
2026-01-20 Jakub Jelinek <jakub@redhat.com>
* lang.opt (Wmove-index): Add missing dot at the end of description.
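For illustration, the pattern quoted in the FAIL line can be tried out with
std::regex. The helper name `lacks_terminator` is hypothetical, and this is
not how the testsuite runs the check; it is only a sketch of what the
pattern accepts and rejects.

```cpp
#include <regex>
#include <string>

// Returns true when a --help line is flagged by the pattern from the FAIL
// message: an option line (leading spaces, then '-') whose last character
// is neither '.' nor ':'.
inline bool lacks_terminator(const std::string &help_line) {
  static const std::regex bad_line("^ +-.*[^:.]$");
  return std::regex_search(help_line, bad_line);
}
```

The unfixed description "  -Wmove-index  Warn if MOVE INDEX is used" is
flagged, while the same line with a trailing dot is not.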
Pietro Monteiro [Tue, 20 Jan 2026 00:00:44 +0000 (19:00 -0500)]
algol68: Add allocation function for leaf objects
Boehm GC has a malloc_atomic function that doesn't clear the new
allocation and doesn't scan it for pointers.
Add a wrapper for the GC malloc_atomic in the run-time library and use it to
allocate GC-collectable strings in the library.
Change the lowering of malloc on the front end to select the run-time GC malloc
function to be used based on the mode having pointers or not. Use leaf
allocations for modes that are not refs or that don't contain refs.
A boolean `has_refs' member was added to MOID_T and the computation of the
attribute is done by the parser when generating the mode list.
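The selection rule can be sketched with a much-simplified model. `Mode`,
`Kind`, `has_refs`, `Allocator` and `pick_allocator` below are hypothetical
stand-ins and do not reflect the real MOID_T layout or the front end's
lowering code; the sketch only shows "leaf allocation unless the mode is a
ref or contains one".

```cpp
#include <vector>

// Hypothetical, simplified mode descriptor (not the real MOID_T).
enum class Kind { Int, Real, Ref, Struct, Row };

struct Mode {
  Kind kind;
  std::vector<const Mode *> parts;  // fields, element mode or referent
};

// A mode "has refs" if it is a ref itself or contains one. Recursion stops
// at a ref, so self-referential modes built through refs terminate.
inline bool has_refs(const Mode &m) {
  if (m.kind == Kind::Ref) return true;
  for (const Mode *p : m.parts)
    if (p != nullptr && has_refs(*p)) return true;
  return false;
}

enum class Allocator { Scanned, Leaf };

// Leaf (pointer-free, unscanned) allocation only when no refs are present.
inline Allocator pick_allocator(const Mode &m) {
  return has_refs(m) ? Allocator::Scanned : Allocator::Leaf;
}
```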
gcc/algol68/ChangeLog:
* a68-low-clauses.cc (a68_lower_collateral_clause): Update
call to a68_lower_alloca.
* a68-low-coercions.cc (a68_lower_widening): Likewise.
* a68-low-generator.cc (allocator_t): Adjust typedef.
(fill_in_buffer): Adjust call to allocator.
(gen_mode): Likewise.
* a68-low-multiples.cc (a68_row_malloc): Change type parameter to MOID_T
from tree. Adjust call to a68_lower_malloc.
* a68-low-posix.cc (a68_posix_fgets): Adjust call to a68_row_malloc.
(a68_posix_gets): Likewise.
* a68-low-runtime.def (MALLOC_LEAF): Add definition for
_libga68_malloc_leaf.
* a68-low-strings.cc (a68_string_concat): Adjust call to
a68_lower_malloc.
(a68_string_from_char): Likewise.
* a68-low-units.cc (a68_lower_slice): Likewise.
* a68-low.cc (a68_low_dup): Adjust calls to a68_lower_malloc
and a68_lower_alloca.
(a68_lower_alloca): Change type parameter to MOID_T from tree.
(a68_lower_malloc): Likewise. Use _libga68_malloc_leaf if the MOID_T
doesn't have refs, use _libga68_malloc otherwise.
* a68-parser-modes.cc (a68_create_mode): Set has_refs on the new mode.
(is_mode_has_refs): New function.
(compute_derived_modes): Set has_refs on the chain of modes.
* a68-parser.cc (a68_new_moid): Set has_refs to false by
default.
* a68-types.h (struct MOID_T): Add member `has_refs`.
(HAS_REFS): New macro.
* a68.h (a68_row_malloc): Update prototype.
(a68_lower_alloca): Likewise.
(a68_lower_malloc): Likewise.
libga68/ChangeLog:
* ga68-alloc.c (_libga68_malloc_leaf): New function.
* ga68-posix.c (_libga68_posixfgets): Use _libga68_malloc_leaf
instead of _libga68_malloc.
* ga68-unistr.c (_libga68_u32_to_u8): Likewise.
(_libga68_u8_to_u32): Likewise.
* ga68.h (_libga68_malloc_leaf): New prototype.
* ga68.map: Add _libga68_malloc_leaf to the global map.
Signed-off-by: Pietro Monteiro <pietro@sociotechnical.xyz>
Jeff Law [Mon, 19 Jan 2026 22:53:22 +0000 (15:53 -0700)]
[RISC-V][PR rtl-optimization/121787] Work around bad cfglayout interaction with asm goto
This is a suggestion from Richi in the PR.
The RISC-V backend calls the loop initialization routines during setup for
vsetvl insertion/optimization. Right now that uses LOOPS_NORMAL which allows
various adjustments to the loop structure. The interaction between those CFG
adjustments and asm goto support is putting the CFG into an undesirable state.
There's potentially an issue in the CFG layout bits, but we can punt that out
by using AVOID_CFG_MODIFICATIONS when calling loop_optimizer_init. My review
of the vsetvl code doesn't show any direct need for clean preheaders, latches,
etc -- the biggest thing it needs is for infinite loops to be connected to the
exit block which is handled outside of loop_optimizer_init.
So this is a workaround, but enough to get the PR off the regression list.
Waiting for pre-commit CI to do its thing, though it has already passed
riscv{32,64}-elf for me. Bootstrap on the Pioneer is in flight.
PR rtl-optimization/121787
gcc/
* config/riscv/riscv-vsetvl.cc (pre_vsetvl): Adjust call to
loop_optimizer_init to avoid making CFG changes.
gcc/testsuite/
* gcc.target/riscv/pr121787-1.c: New test.
* gcc.target/riscv/pr121787-2.c: New test.
Joseph Myers [Mon, 19 Jan 2026 21:16:46 +0000 (21:16 +0000)]
testsuite: Do not restrict five tests to { target native }
Five miscellaneous tests use { target native }, while not doing
anything that actually needs some kind of special handling for cross
testing. Remove the { target native } restriction from those tests.
Tested for x86_64-pc-linux-gnu, and with cross to aarch64-linux.
* g++.old-deja/g++.mike/eh30.C, g++.old-deja/g++.mike/p4750.C,
g++.old-deja/g++.robertl/eb106.C, g++.old-deja/g++.robertl/eb83.C,
gcc.dg/20020201-1.c: Do not use { target native }.
Rainer Orth [Mon, 19 Jan 2026 20:51:28 +0000 (21:51 +0100)]
Silently ignore -pthread etc. on Solaris
gcc supports -pthread/-pthreads on Solaris to provide a way to
transparently handle the platform-specific needs of multithreaded
programs. In the past, this used to link with -lpthread. However, this
has been removed in
config: -pthread shouldn't link with -lpthread on Solaris
https://gcc.gnu.org/pipermail/gcc-patches/2023-April/615080.html
since libpthread had been folded into libc.
The only thing these options do now is to define _REENTRANT and
_PTHREADS. In Solaris 11.4, the system headers no longer reference the
former. Checking gnulib as an important source of portability
information, I find that _REENTRANT is used for two purposes:
* Ensure that strerror_r, localtime_r and gmtime_r are declared.
However, these declarations are no longer guarded by _REENTRANT, so
this is moot.
* Besides, _REENTRANT is defined on Solaris in general, but this no
longer has any effect.
There's no reference to _PTHREADS at all, so this seems to be an ancient
relic that is no longer needed.
This patch silently ignores both options, keeping them for portability's
sake.
Bootstrapped without regressions on i386-pc-solaris2.11 and
sparc-sun-solaris2.11.
Georg-Johann Lay [Mon, 19 Jan 2026 17:33:30 +0000 (18:33 +0100)]
testsuite/123175 - Use int32_t instead of int in vec-type construction.
gcc/testsuite/
PR testsuite/123175
* gcc.dg/torture/pr123175-1.c: Use int32_t instead of int in
vec-type construction.
* gcc.dg/torture/pr123175-2.c: Same.
If we're extracting an element out of a uniform vector, then any element
will do and it's conveniently returned by uniform_vector_p. So with a
simple match.pd pattern that simplifies to _26 = 0. That in turn allows
elimination of all the vector code and simplify the return value to a
constant as well, resulting in the desired code shown earlier.
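As a rough model of the idea (not the match.pd pattern itself), extracting
any in-range lane from a uniform vector can be answered without looking at
which lane was asked for. `uniform_value` and `fold_extract` are
hypothetical helpers written for this sketch.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Returns the common element if every lane holds the same value, loosely
// mirroring the role uniform_vector_p plays in the description above.
inline std::optional<int> uniform_value(const std::vector<int> &v) {
  if (v.empty()) return std::nullopt;
  for (int x : v)
    if (x != v.front()) return std::nullopt;
  return v.front();
}

// Folds "extract lane idx": for a uniform vector the result does not
// depend on idx, so idx is only used for the bounds check.
inline std::optional<int> fold_extract(const std::vector<int> &v,
                                       std::size_t idx) {
  if (idx >= v.size()) return std::nullopt;
  return uniform_value(v);
}
```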
One could easily argue that this need not be restricted to a uniform
vector and I would totally agree. But given we're in stage4, the
minimal fix for the regression seems more appropriate. But I could
certainly be convinced to handle the more general case here.
Bootstrapped and regression tested on x86 & riscv64. Tested across the
cross configurations as well with no regressions.
PR target/113666
gcc/
* fold-const-call.cc (fold_const_vec_extract): New function.
(fold_const_call, case CFN_VEC_EXTRACT): Call it.
* match.pd (IFN_VEC_EXTRACT): Handle extraction from a uniform
vector.
gcc/testsuite
* gcc.target/riscv/rvv/base/pr113666.c: New test.
Co-authored-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Richard Biener [Wed, 7 Jan 2026 09:23:22 +0000 (10:23 +0100)]
tree-optimization/123061 - invalid hoisting of division
The following fixes the computation of always-executed-in in the LIM
pass which was enhanced to handle inner loops in a better way but
in this process ended up setting inner loop always-executed-in state
based on outer loop analysis, which is wrong because an inner loop
block needs to be proven to be always executed for all inner loop
iterations as well, not only for all outer loop iterations.
The fix is to iterate over inner loops first and when processing
an outer loop only update always-executedness if a block belongs
to the very same loop or an immediately nested loop and always
executed inside that.
PR tree-optimization/123061
PR tree-optimization/123636
* tree-ssa-loop-im.cc (fill_always_executed_in_1): Change
outer-to-inner to inner-to-outer iteration. Update inner
loop state only when always executed in an immediately
nested loop.
* gcc.dg/torture/pr123061.c: New testcase.
* gcc.dg/torture/pr123636.c: Likewise.
* gcc.dg/tree-ssa/ssa-lim-26.c: Likewise.
Tomasz Kamiński [Fri, 16 Jan 2026 13:01:53 +0000 (14:01 +0100)]
libstdc++: Use overload operator<=> when provided in relational functors [PR114153]
The implementation of less<> did not consider the possibility of t < u being
rewritten from an overloaded operator<=>. This led to a situation where, for t and u that:
* provide an overloaded operator<=>, such that (t < u) is rewritten to (t <=> u) < 0,
* are convertible to pointers,
the expression std::less<>{}(t, u) would incorrectly result in a call to
std::less<void*> on the values converted to pointers, instead of t < u.
Similar issues also occurred for greater<>, less_equal<>, greater_equal<>,
their range equivalents, and in compare_three_way for heterogeneous calls.
This patch addresses the above by also checking for free-function and member
overloads of operator<=> before falling back to pointer comparison. We do
not put any constraints on the return type of the selected operator, in particular
on it being one of the standard-defined comparison categories, as the language
does not put any restriction on the returned type: if (t <=> u) is well
formed, (t op u) is interpreted as (t <=> u) op 0. If the latter expression
is ill-formed, the expression using op is ill-formed too (see included tests).
The relational operator rewrites try both orders of arguments: t < u
can be rewritten into operator<=>(t, u) < 0 or 0 < operator<=>(u, t). This
means that we need to test both operator<=>(T, U) and operator<=>(U, T)
if T and U are not the same type. This is now extracted into the
__not_overloaded_spaceship helper concept, placed in <concepts> to
avoid extending the set of includes.
The compare_three_way functor defined in <compare> already considers overloaded
operator<=>; however, it does not consider reversed candidates, leading
to a situation in which t <=> u results in 0 <=> operator<=>(u, t), while
compare_three_way{}(t, u) uses pointer comparison. This is also addressed by
using __not_overloaded_spaceship, which checks both orders of arguments.
Finally, as operator<=> was introduced in C++20, for std::less(_equal)?<> and
std::greater(_equal)?<> we provide a separate __ptr_cmp implementation
in that mode, which relies on requires expressions. We use a nested
requires clause to guarantee short-circuiting of their evaluation.
The operator() of the aforementioned functors is reworked to use if constexpr
in all standard modes (as we allow it as an extension), eliminating the need
for the _S_cmp function.
PR libstdc++/114153
libstdc++-v3/ChangeLog:
* include/bits/ranges_cmp.h (__detail::__less_builtin_ptr_cmp):
Add __not_overloaded_spaceship check.
* include/bits/stl_function.h (greater<void>::operator())
(less<void>::operator(), greater_equal<void>::operator())
(less_equal<void>::operator()): Implement using if constexpr.
(greater<void>::__S_cmp, less<void>::__S_cmp)
(greater_equal<void>::__ptr_comp, less_equal<void>::S_cmp):
Remove.
(greater<void>::__ptr_cmp, less<void>::__ptr_cmp)
(greater_equal<void>::__ptr_comp, less_equal<void>::ptr_cmp): Change
to static constexpr variable. Define in terms of requires expressions
and __not_overloaded_spaceship check.
* include/std/concepts: (__detail::__not_overloaded_spaceship):
Define.
* libsupc++/compare: (__detail::__3way_builtin_ptr_cmp): Use
__not_overloaded_spaceship concept.
* testsuite/20_util/function_objects/comparisons_pointer_spaceship.cc: New test.
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
The following fixes an omission in find_or_generate_expression to
check for SSA_NAME_OCCURS_IN_ABNORMAL_PHI as already done in
create_expression_by_pieces.
PR tree-optimization/123602
* tree-ssa-pre.cc (find_or_generate_expression): Do not
generate references to abnormal SSA names.
Tomasz Kamiński [Mon, 19 Jan 2026 09:03:08 +0000 (10:03 +0100)]
libstdc++: Fix std::erase_if for std::string with -D_GLIBCXX_USE_CXX11_ABI=0.
The __cow_string used with -D_GLIBCXX_USE_CXX11_ABI=0, does not provide
erase accepting const_iterator, so we adjust __detail::__erase_if
(introduced in r16-6889-g3287) to call __cont.erase with mutable iterators.
libstdc++-v3/ChangeLog:
* include/bits/erase_if.h (__detail::__erase_if): Pass mutable
iterators to __cont.erase.
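A minimal sketch of the approach, assuming nothing about the actual
__detail::__erase_if body: `erase_if_sketch` is a hypothetical helper that
uses the classic remove/erase idiom and only ever hands mutable iterators
to the container's erase member.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper, not the libstdc++ implementation.
// Removes every element matching pred and returns how many were removed.
template <typename Container, typename Pred>
std::size_t erase_if_sketch(Container &cont, Pred pred) {
  const std::size_t before = cont.size();
  // begin()/end() on a non-const container yield mutable iterators, so
  // erase(iterator, iterator) is selected rather than a const_iterator
  // overload that an older string implementation may lack.
  auto first_removed = std::remove_if(cont.begin(), cont.end(), pred);
  cont.erase(first_removed, cont.end());
  return before - cont.size();
}
```

For example, calling it on a std::string holding "a1b2c3" with a predicate
that matches digits leaves "abc" and returns 3.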
Support for -m31 is deprecated and will be removed in a future release.
In order to let users know, emit an error/warning during configure. An
error is thrown if --enable-multilib is given implicitly, or if
explicitly but not --enable-obsolete.
Jakub Jelinek [Mon, 19 Jan 2026 08:46:36 +0000 (09:46 +0100)]
vect-generic: Fix up expand_vector_mult [PR123656]
The alg_sub_factor handling in expand_vector_mult had the arguments
reversed.
As documented in expmed.h, the algorithms should be
These are the operations:
alg_zero total := 0;
alg_m total := multiplicand;
alg_shift total := total * coeff
alg_add_t_m2 total := total + multiplicand * coeff;
alg_sub_t_m2 total := total - multiplicand * coeff;
alg_add_factor total := total * coeff + total;
alg_sub_factor total := total * coeff - total;
alg_add_t2_m total := total * coeff + multiplicand;
alg_sub_t2_m total := total * coeff - multiplicand;
The first operand must be either alg_zero or alg_m. */
So, alg_sub_factor should be identical to alg_sub_t2_m with the
difference that one subtracts accumulator and the other subtracts
op0. I went through all the other ones and they seem to match
the description except for alg_sub_factor and tree-vect-patterns.cc
seems to be fully correct. expand_vector_mult at times has
pretty random order of PLUS_EXPR arguments, but that is a commutative
operation, so makes no difference.
Furthermore, I saw weird formatting in the alg_add_t_m2 case, so fixed
that too.
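The table of operations can be modelled on scalars to make the difference
between alg_sub_factor and alg_sub_t2_m concrete. The `Alg` enum and `step`
function are hypothetical names for this sketch, and `coeff` is assumed to
be a power of two, 1 << log; this is not the code in tree-vect-generic.cc.

```cpp
#include <cstdint>

// Hypothetical scalar model of the operations listed above.
enum class Alg {
  shift, add_t_m2, sub_t_m2, add_factor, sub_factor, add_t2_m, sub_t2_m
};

// Applies one synthesis step. `total` is the accumulator, `m` the
// multiplicand, and coeff is assumed to be 1 << log.
inline std::int64_t step(Alg op, std::int64_t total, std::int64_t m,
                         unsigned log) {
  const std::int64_t coeff = std::int64_t{1} << log;
  switch (op) {
    case Alg::shift:      return total * coeff;
    case Alg::add_t_m2:   return total + m * coeff;
    case Alg::sub_t_m2:   return total - m * coeff;
    case Alg::add_factor: return total * coeff + total;
    case Alg::sub_factor: return total * coeff - total;  // subtracts accumulator
    case Alg::add_t2_m:   return total * coeff + m;
    case Alg::sub_t2_m:   return total * coeff - m;      // subtracts multiplicand
  }
  return total;
}
```

With total = 21, m = 3 and log = 2, alg_sub_factor gives 63 while
alg_sub_t2_m gives 81; reversing the operands of the subtraction in
alg_sub_factor would give -63 instead.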
2026-01-19 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/123656
* tree-vect-generic.cc (expand_vector_mult): Fix up alg_sub_factor
handling. Fix up formatting in alg_add_t_m2 handling.
Jakub Jelinek [Mon, 19 Jan 2026 08:45:10 +0000 (09:45 +0100)]
libatomic: Change installed libatomic_asneeded.a into a symlink [PR123650]
So, apparently I've tripped over not just one linker bug with the
libatomic/libgcc_s asneeded workaround for libtool bug, but two.
One is that mold doesn't parse INPUT ( AS_NEEDED ( -latomic ) )
or INPUT ( AS_NEEDED ( -lgcc_s ) ) correctly, I think that just
should be fixed in mold.
Another one is that ld.bfd doesn't handle correctly INPUT ( libatomic.a )
when doing static linking with -flto. While that bug should be fixed too
in the linker, the reason to install a linker script for a static library
has been just my laziness, a symbolic link is more efficient, and even on
hosts without symbolic link for a very small library like libatomic.a
we can live with a cp -pR copy of it.
Furthermore, when I was checking in the last patch (i.e. r16-6736 PR123396),
git was loudly complaining about libatomic_asneeded.a being checked
into the repository when *.a is in .gitignore.
So, the following patch revamps the libatomic_asneeded* handling.
libatomic_asneeded.so is rewritten in the way that libgcc_s_asneeded.so
is done and libatomic_asneeded.a installed using $(LN_S).
2026-01-19 Jakub Jelinek <jakub@redhat.com>
PR libgcc/123650
* Makefile.am (toolexeclib_DATA): Remove.
(all-local): For LIBAT_BUILD_ASNEEDED_SOLINK instead of installing
libatomic_asneeded.{so,a} from top_srcdir cd into the destination
directory, use echo to write libatomic_asneeded.so and $(LN_S) to
symlink libatomic_asneeded.a to libatomic.a.
(install-data-am): For LIBAT_BUILD_ASNEEDED_SOLINK depend on
install-asneeded.
(install-asneeded): New goal.
* libatomic_asneeded.so: Remove.
* libatomic_asneeded.a: Remove.
* Makefile.in: Regenerate.
The following allows switching the x86 target to use the vectorizer
cost comparison mechanic to select between different vector mode
variants of vectorizations. The default is still to not do this
but this allows an opt-in.
On SPEC CPU 2017 for -Ofast -march=znver4 this shows 2463 out of
39706 vectorized loops changing mode. In 503 out of 12378 cases
we decided to not use masked epilogs. Compile-time increases by ~1% overall.
With a quick 1-run there do not seem to be any off-noise effects
for INT, given this particular optimization and target option combination
and the actual hardware run on. For FP 549.fotonik3d_r improves by 6%
(confirmed with a 2-run).
This was triggered by PR123190 and PR123603 which have cases where
comparing costs would have resulted in the faster vector size being
used. Both were reported for -O2 -march=x86-64-v3 -flto and with PGO.
The PR123603 recorded regression of 548.exchange2_r with these flags
is resolved with the flag (performance improves by 13%). I don't
have SPEC 2006 on that machine so did not verify the PR123190 433.milc
regression, but that has been improved with the two earlier patches.
The --param has no effect on the testcase in the PR.
I do expect that some of our tricks in the x86 cost model to make
larger vector sizes unprofitable will be obsolete or are
counter-productive with cost comparison turned on.
Lulu Cheng [Sat, 17 Jan 2026 07:12:46 +0000 (15:12 +0800)]
LoongArch: Fix bug117575.
In the template "vec_set<mode>", a call is made to
"lasx_xvinsve0_<lasxfmt_f>_scalar", but there is an issue due to
the different ranges of operand1 between the two templates.
The range of operand1 in the template
"lasx_xvinsve0_<lasxfmt_f>_scalar" is now set to be the same as
that in "vec_set<mode>".
PR target/117575
gcc/ChangeLog:
* config/loongarch/lasx.md: Modify the range of operand1.
François Dumont [Wed, 10 Dec 2025 18:12:58 +0000 (19:12 +0100)]
libstdc++: Fix std::erase_if behavior for std::__debug containers
Complete fix of std::erase_if/std::erase for all std::__debug containers and
__gnu_debug::basic_string. Make sure that iterators erased by this function
will be properly detected as such by the debug container and so considered as
invalid.
Doing so introduces a new std::__detail::__erase_if function dealing, similarly
to std::__detail::__erase_node_if, with non-node containers.
libstdc++-v3/ChangeLog:
* include/bits/erase_if.h (__detail::__erase_if): New.
* include/debug/deque (std::erase_if<>(__debug::deque<>&, _Pred)): Use latter.
* include/debug/inplace_vector (std::erase_if<>(__debug::inplace_vector<>&, _Pred)):
Likewise.
* include/debug/vector (std::erase_if<>(__debug::vector<>&, _Pred)): Likewise.
* include/std/deque: Include erase_if.h.
(std::erase_if<>(std::vector<>&, _Pred)): Adapt to use __detail::__erase_if.
* include/std/inplace_vector (std::erase_if<>(std::inplace_vector<>&, _Pred)):
Likewise.
* include/std/string (std::erase_if<>(std::basic_string<>&, _Pred)): Likewise.
* include/std/vector (std::erase_if<>(std::vector<>&, _Pred)): Likewise.
* include/debug/forward_list
(std::erase_if<>(__debug::forward_list<>&, _Pred)): New.
(std::erase<>(__debug::forward_list<>&, const _Up&)): New.
* include/debug/list
(std::erase_if<>(__debug::list<>&, _Pred)): New.
(std::erase<>(__debug::list<>&, const _Up&)): New.
* include/debug/map (std::erase_if<>(__debug::map<>&, _Pred)): New.
(std::erase_if<>(__debug::multimap<>&, _Pred)): New.
* include/debug/set (std::erase_if<>(__debug::set<>&, _Pred)): New.
(std::erase_if<>(__debug::multiset<>&, _Pred)): New.
* include/debug/string
(std::erase_if<>(__gnu_debug::basic_string<>&, _Pred)): New.
(std::erase<>(__gnu_debug::basic_string<>&, const _Up&)): New.
* include/debug/unordered_map
(std::erase_if<>(__debug::unordered_map<>&, _Pred)): New.
(std::erase_if<>(__debug::unordered_multimap<>&, _Pred)): New.
* include/debug/unordered_set
(std::erase_if<>(__debug::unordered_set<>&, _Pred)): New.
(std::erase_if<>(__debug::unordered_multiset<>&, _Pred)): New.
* include/std/forward_list (std::erase_if<>(std::forward_list<>&, _Pred)):
Adapt to work exclusively for normal implementation.
(std::erase<>(std::forward_list<>&, const _Up&)): Likewise.
* include/std/list (std::erase_if<>(std::list<>&, _Pred)): Likewise.
(std::erase<>(std::list<>&, const _Up&)): Likewise.
* include/std/map (std::erase_if<>(std::map<>&, _Pred)): Likewise.
(std::erase_if<>(std::multimap<>&, _Pred)): Likewise.
Guard functions using __cpp_lib_erase_if.
* include/std/set (std::erase_if<>(std::set<>&, _Pred)): Likewise.
(std::erase_if<>(std::multiset<>&, _Pred)): Likewise.
Guard functions using __cpp_lib_erase_if.
* include/std/unordered_map
(std::erase_if<>(std::unordered_map<>&, _Pred)): Likewise.
(std::erase_if<>(std::unordered_multimap<>&, _Pred)): Likewise.
Guard functions using __cpp_lib_erase_if.
* include/std/unordered_set
(std::erase_if<>(std::unordered_set<>&, _Pred)): Likewise.
(std::erase_if<>(std::unordered_multiset<>&, _Pred)): Likewise.
Guard functions using __cpp_lib_erase_if.
* testsuite/21_strings/basic_string/debug/erase.cc: New test case.
* testsuite/23_containers/forward_list/debug/erase.cc: New test case.
* testsuite/23_containers/forward_list/debug/invalidation/erase.cc: New test case.
* testsuite/23_containers/list/debug/erase.cc: New test case.
* testsuite/23_containers/list/debug/invalidation/erase.cc: New test case.
* testsuite/23_containers/map/debug/erase_if.cc: New test case.
* testsuite/23_containers/map/debug/invalidation/erase_if.cc: New test case.
* testsuite/23_containers/multimap/debug/erase_if.cc: New test case.
* testsuite/23_containers/multimap/debug/invalidation/erase_if.cc: New test case.
* testsuite/23_containers/multiset/debug/erase_if.cc: New test case.
* testsuite/23_containers/multiset/debug/invalidation/erase_if.cc: New test case.
* testsuite/23_containers/set/debug/erase_if.cc: New test case.
* testsuite/23_containers/set/debug/invalidation/erase_if.cc: New test case.
* testsuite/23_containers/unordered_map/debug/erase_if.cc: New test case.
* testsuite/23_containers/unordered_map/debug/invalidation/erase_if.cc: New test case.
* testsuite/23_containers/unordered_multimap/debug/erase_if.cc: New test case.
* testsuite/23_containers/unordered_multimap/debug/invalidation/erase_if.cc: New test case.
* testsuite/23_containers/unordered_multiset/debug/erase_if.cc: New test case.
* testsuite/23_containers/unordered_multiset/debug/invalidation/erase_if.cc: New test case.
* testsuite/23_containers/unordered_set/debug/erase_if.cc: New test case.
* testsuite/23_containers/unordered_set/debug/invalidation/erase_if.cc: New test case.
The args canonicalization code originally assumed that edges e1/e2
were the outgoing edges of the cond block, but they are the edges
coming into the join block. This rewrites the canonicalization of the arg0/arg1
args to correct that mistake, and fixes the wrong code that would
be generated in this case.
PR tree-optimization/123645
gcc/ChangeLog:
* tree-ssa-phiopt.cc (cond_removal_in_builtin_zero_pattern): Rewrite
the canonicalization of the args code based on e1/e2 being edges into
the join block.
gcc/testsuite/ChangeLog:
* gcc.dg/torture/pr123645-1.c: New test.
* gcc.dg/torture/pr123645-2.c: New test.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Jose E. Marchesi [Sat, 17 Jan 2026 22:50:31 +0000 (23:50 +0100)]
a68: do not use `^' for the pow operator
The RR mentions all of "**", "^" and "UP" as the representation of the
several pow operators for integral, real and complex operations. This
patch removes "^" from the list (and a remnant of "UP") and thus frees
that worthy character to be used for some other purpose in the future.
Signed-off-by: Jose E. Marchesi <jemarch@gnu.org>
gcc/algol68/ChangeLog
* a68-parser-prelude.cc (stand_prelude): Remove definitions for ^
operator.
* ga68.texi (Real operators): Remove entries for ^.
(Integral operators): Likewise.
gcc/testsuite/ChangeLog
* algol68/execute/pow-real-1.a68: Adapt test to use ** rather than
^ for pow operator.
Jose E. Marchesi [Sat, 17 Jan 2026 22:03:34 +0000 (23:03 +0100)]
a68: new Coding Guidelines manual for Algol 68
This commit adds a new manual containing a few coding guidelines and
recommendations for writing Algol 68 code. The primary goal of the
document is to be used in the context of GCC development to achieve a
coherent style among the code base. However, other people may want to
adopt these coding conventions as well, so we are distributing them in
their own manual rather than as part of the ga68 internals manual.
Thanks to Chris Hermansen, Mohammad-Reza Nabipoor, Pietro Monteiro and
'jpl' James for their help and nice discussions in the algol68@
mailing list.
Signed-off-by: Jose E. Marchesi <jemarch@gnu.org>
gcc/algol68/ChangeLog
Eric Botcazou [Sat, 17 Jan 2026 21:27:53 +0000 (22:27 +0100)]
Ada: Fix packed boolean array with Default_Component_Value aspect
Putting the Default_Component_Value aspect on a bit-packed array type has
never worked, so this plugs the loophole. For the sake of consistency,
the recent fix for PR ada/68179 is adjusted to use Has_Default_Aspect too.
gcc/ada/
PR ada/68179
PR ada/123589
* exp_ch3.adb (Expand_Freeze_Array_Type): Build an initialization
procedure for a bit-packed array type if Has_Default_Aspect is set
on the base type, but make sure not to build it twice. Also test
Has_Default_Aspect for a type derived from String.
gcc/testsuite/
* gnat.dg/component_value2.adb: New test.
Co-authored-by: Lisa Felidae <lisa@felidae.bam.moe>
Sandra Loosemore [Sat, 10 Jan 2026 20:27:36 +0000 (20:27 +0000)]
doc, nds32: Add missing documentation for nds32 options [PR122243]
This back end had numerous options defined that were not documented in
the manual. Descriptions were taken from the .opt file. I also did some
editorial cleanups in the .opt file text where appropriate.
doc, x86: Clean up x86 options documentation [PR122243]
Besides the usual fixes in this series to make the options summary
agree with the options listed in the detailed documentation and add
missing @opindex entries, I decided it was not very helpful to users
to have dozens of ISA extension options documented as a group spanning
multiple pages in the manual. I broke that up so each of those
options is described separately, using the documentation string from
the .opt file.
gcc/ChangeLog
PR other/122243
* config/i386/i386.opt (malign-functions): Mark undocumented/unused
option as Undocumented.
(malign-jumps): Likewise.
(malign-loops): Likewise.
(mbranch-cost, mforce-drap): Mark undocumented options likely
intended for developer use only as Undocumented.
(mstv): Correct sense of option in doc string.
(mavx512cd): Remove extra "and" from doc string.
(mavx512dq): Likewise.
(mavx512bw): Likewise.
(mavx512vl): Likewise.
(mavx512ifma): Likewise.
(mavx512vbmi): Likewise.
* doc/invoke.texi (Options Summary) <x86 Options>: Add
missing options. Correct whitespace and re-wrap long lines.
Remove -mthreads which is now classed as a MinGW option.
(Cygwin and MinGW Options): Replace existing documentation of
-mthreads with the more detailed text moved from x86 Options.
(x86 Options): Move introductory text about ISA extensions before
the individual options instead of after. Document them all
individually instead of as a group, and move immediately after
-march/-mtune documentation. Rewrap long lines. Document
interaction between SSE and AVX with -mfpmath=sse. Move -masm
documentation farther down instead of grouped with options
affecting floating-point behavior. Add missing @opindex
entries. Rewrite the -mdaz-ftz documentation. Document
-mstack-arg-probe. Copy-editing. Document -mstv. Remove
obsolete warning about -mskip-rax-setup in very old GCC versions.
Rewrite the -mapx-inline-asm-use-gpr32 documentation.
Document -mgather and -mscatter. Split -miamcu documentation
from -m32/-m64/etc. Rewrite -munroll-only-small-loops documentation.
Document -mdispatch-scheduler.