David Malcolm [Thu, 5 Feb 2026 23:36:32 +0000 (18:36 -0500)]
value-range: update comments
One of the difficulties I ran into when familiarizing myself with
value-range.{h,cc} is that the comments and classes refer to
representations of "ranges", but the implementation has grown beyond
mere ranges of values (such as with bitmasks and NaN-tracking).
Arguably "range" could refer to the mathematical definition: the set
of possible outputs of a function, but I find it much clearer to think
of these classes as efficient representations of subsets of possible
values of a type.
This patch updates various leading comments in a way that clarifies
the intent of these classes (for me, at least).
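
The "subset of possible values" view can be illustrated with a small Python sketch of a value/mask pair. It assumes the convention that a set bit in MASK marks an unknown bit and that VALUE holds the known bits; the helper names are hypothetical and are not GCC API.

```python
def bitmask_from_values(values):
    """Summarize a non-empty set of non-negative ints as a (value, mask) pair."""
    values = list(values)
    all_and = values[0]
    all_or = values[0]
    for v in values[1:]:
        all_and &= v
        all_or |= v
    mask = all_and ^ all_or   # bits that differ between members: unknown
    value = all_and           # bits that are 1 in every member: known ones
    return value, mask


def bitmask_members(value, mask, width):
    """Enumerate every WIDTH-bit number consistent with (value, mask)."""
    return {x for x in range(1 << width) if (x & ~mask) == value}


# {4, 6} differ only in bit 1, so the pair describes the set exactly.
exact = bitmask_from_values({4, 6})
print(exact, bitmask_members(*exact, width=3))

# {1, 2} share no known bits, so the pair over-approximates the set
# and also admits 0 and 3.
loose = bitmask_from_values({1, 2})
print(loose, bitmask_members(*loose, width=2))
```

The second case is the kind of imprecision the updated comments warn about: the representation is a superset of the values that can actually occur.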
gcc/ChangeLog:
* value-range.cc (irange_bitmask::irange_bitmask): Fix typo in
comment.
* value-range.h (class vrange): Update leading comment to refer
to "subsets" rather than "ranges". Allude to the available
subclasses. Warn that the classes can be over-approximations and
thus can introduce imprecision.
(class irange_bitmask): Update leading comment to refer to
knowledge about a "value", rather than a "range". Reword
description of MASK and VALUE to clarify implementation, and
add an example.
(class irange): Update leading comment to refer to a
"subset of possible values" rather than a "range", and
that subclasses have responsibility for storage.
(class nan_state): Rewrite leading comment.
(class frange final): Update leading comment to refer to
subsets of possible values, rather than ranges, and to
consistently use "Nan" when capitalizing.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Robert Dubner [Thu, 5 Feb 2026 15:45:40 +0000 (10:45 -0500)]
cobol: Use _perform_line_pairs instead of injecting encoded label names.
The gcobol front end has been communicating with GDB-COBOL by encoding
information into labels that are injected into the assembly language
with ASM_EXPR nodes. That behavior is, at best, questionable.
These changes drop the "proccall" and "procret" types of those labels
in favor of a static _perform_line_pairs table that contains the same
information and is accessible by GDB-COBOL by virtue of its known name.
That table allows GDB-COBOL to "NEXT over COBOL PERFORM" statements in a
way that is familiar to users who have used "NEXT over function call".
Eventually that information should find its way into the .debug_info
section, but at the present time I don't know how to do that on either
the compiler or debugger sides.
Most of these changes involve eliminating gg_insert_into_assembler calls
and replacing them with the perform_is_armed/perform_line_pairs logic.
Some COBOL variable initialization changes crept in here, as well.
gcc/cobol/ChangeLog:
* genapi.cc (DEFAULT_LINE_NUMBER): Remove unused #define.
(parser_statement_begin): Implement perform_is_armed logic.
(initialize_variable_internal): Handle both real and int types in
SHOW_PARSE tracing.
(section_label): Comment a renumbered insert_nop() for gdb-cobol
logic.
(paragraph_label): Likewise.
(leave_procedure): Eliminate call to gg_insert_into_assembler().
(parser_enter_section): Renumber insert_nop().
(parser_perform): Eliminate call to gg_insert_into_assembler().
(parser_perform_times): Likewise.
(internal_perform_through): Likewise.
(internal_perform_through_times): Likewise.
(parser_leave_file): Create the static _perform_line_pairs table.
(parser_sleep): Renumber insert_nop().
(parser_division): Remove calls to initialize_the_data().
(parser_perform_start): New call to insert_nop().
(parser_perform_conditional): Likewise.
(perform_outofline_before_until): Expanded comment.
(perform_outofline_after_until): Eliminate call to
gg_insert_into_assembler().
(perform_outofline_testafter_varying): Likewise.
(perform_outofline_before_varying): Likewise.
(perform_inline_testbefore_varying): New call to insert_nop().
(create_and_call): Change a comment.
* gengen.cc (gg_create_goto_pair): Change characteristics of a
label.
* parse.y: Change how data are initialized.
* parse_ante.h (field_type_update): Likewise.
* symbols.cc (cbl_field_t::set_signable): Likewise.
(cbl_field_t::encode): Likewise.
* symbols.h (struct cbl_field_t): Likewise.
* util.cc (symbol_field_type_update): Likewise.
(cbl_field_t::encode_numeric): Likewise.
Jonathan Wakely [Tue, 3 Feb 2026 15:57:47 +0000 (15:57 +0000)]
libstdc++: Fix ambiguity caused by new std::source_location constructor
The new constructor added for Contracts support was not explicit, so
caused ambiguities when arbitrary pointers were used in contexts which
could convert to std::source_location.
We don't actually need a constructor, the contract_violation::location()
function can just set the data member directly.
libstdc++-v3/ChangeLog:
* include/std/contracts (contract_violation::location): Use
source_location default constructor and then set _M_impl.
* include/std/source_location (source_location(const void*)):
Remove constructor.
* testsuite/18_support/contracts/includes.cc: Move to...
* testsuite/18_support/contracts/srcloc.cc: ...here. Test for
ambiguity caused by new constructor.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Matthieu Longo [Fri, 30 Jan 2026 15:08:41 +0000 (15:08 +0000)]
libiberty: add helper to swap doubly-linked list wrappers
This patch introduces a new helper to swap the contents of two
doubly-linked list wrappers. The new *_swap_lists operation exchanges
the first, last, and size fields, allowing two lists to be swapped
efficiently without iterating over their elements.
This helper is intended for cases where the ownership of a list must be
exchanged but swapping wrapper pointers is not possible. For simple
references to lists, when wrappers are dynamically allocated, swapping
the wrapper pointers themselves is sufficient and remains the preferred
approach.
This change adds the necessary declaration and definition macros to
doubly-linked-list.h and integrates them into the set of mutative
list operations.
The testsuite is updated accordingly to cover the new functionality.
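
As a rough illustration of the idea (not the macro-based C implementation in doubly-linked-list.h), the following Python sketch swaps two list wrappers by exchanging only their first, last, and size fields. The class and function names are hypothetical.

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.prev = None
        self.next = None


class ListWrapper:
    """Owns a chain of nodes through first/last pointers and a size."""

    def __init__(self, items=()):
        self.first = None
        self.last = None
        self.size = 0
        for item in items:
            self.append(item)

    def append(self, data):
        node = Node(data)
        node.prev = self.last
        if self.last is None:
            self.first = node
        else:
            self.last.next = node
        self.last = node
        self.size += 1

    def to_list(self):
        out = []
        node = self.first
        while node is not None:
            out.append(node.data)
            node = node.next
        return out


def swap_lists(a, b):
    """Exchange the contents of two wrappers without visiting any node."""
    a.first, b.first = b.first, a.first
    a.last, b.last = b.last, a.last
    a.size, b.size = b.size, a.size


left = ListWrapper([1, 2, 3])
right = ListWrapper(["x"])
swap_lists(left, right)
print(left.to_list(), left.size, right.to_list(), right.size)
```

Because the nodes hold no pointer back to their wrapper in this sketch, the swap costs the same regardless of list length.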
This happens in C,C++ frontends before cgraph is created where the
relevant flags are located.
We can suppress these warnings with OPT_Wunused.
C,C++ frontends and cgraphunit suppressed OPT_Wunused and
OPT_Wunused_function interchangeably, so we unify suppression to
only OPT_Wunused.
Jakub Jelinek [Thu, 5 Feb 2026 13:59:12 +0000 (14:59 +0100)]
Revert c, c++: Use c*_build_qualified_type instead of build_qualified_type from within build_type_attribute
As seen in PR123882, this broke more than it fixed, a lot of
build_type_attribute_qual_variant including build_type_attribute_variant
just pass in TYPE_QUALS (type) as the last argument and for C/C++
when the code pushes the quals to the element type, it will effectively
make those unqualified. The PR123882 ICE is then on the array_as_string
terrible hack if the FE calls get_aka_type on that, it wants to create
qualified attribute variant of that and errors on the restrict qual.
So, to fix both PR c/123882 and other unknown regressions caused by
PR c/101312 I'm reverting it now.
testsuite: lto: transform gcc-ar to include prefix
When the gcc binary is named arm-none-eabi-gcc, the gcc-ar binary
will be named arm-none-eabi-gcc-ar. The current approach works fine as
long as the binary does not contain any prefix, but if it does, the
gcc-ar binary will not be found.
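
The naming rule described above can be sketched in Python as follows. This is only an illustration of the rule, not the testsuite change itself; the function name is hypothetical, and versioned or suffixed driver names are deliberately not handled.

```python
import os.path


def gcc_ar_for(gcc_binary):
    """Derive the gcc-ar name that sits next to a (possibly prefixed) gcc."""
    directory, name = os.path.split(gcc_binary)
    if name != "gcc" and not name.endswith("-gcc"):
        raise ValueError("not a plain or prefixed gcc binary: " + name)
    # Keeping the whole name preserves any target prefix.
    return os.path.join(directory, name + "-ar")


print(gcc_ar_for("gcc"))
print(gcc_ar_for("arm-none-eabi-gcc"))
```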
Richard Biener [Thu, 5 Feb 2026 11:26:18 +0000 (12:26 +0100)]
tree-optimization/123986 - upon SLP analysis rollback, release data
The following makes sure to release any SLP kind specific data when
rolling back earlier successful analysis. This avoids crashing
when re-analyzing such node via another graph entry.
PR tree-optimization/123986
* tree-vect-slp.cc (vect_slp_analyze_operations): When
rolling back analyzed nodes, release node specific data
and reset SLP_TREE_TYPE.
compare-elim: arm: enable compare-elimination on Arm [PR123604]
The Arm port has never had the compare elimination pass enabled by
adding a definition of TARGET_FLAGS_REGNUM. But just adding this is
insufficient because the target uses COND_EXEC and compare-elim is not
yet set up to handle this.
This seems to be quite simple, since we just need to recognize
COND_EXEC in insns when scanning for uses of the condition code
register.
This is a partial mitigation for the code quality regression
reported in PR target/123604.
Jakub Jelinek [Thu, 5 Feb 2026 12:39:42 +0000 (13:39 +0100)]
ranger: Fix up WIDEN_MULT_EXPR handling in the ranger [PR123978]
In r13-6617 WIDEN_MULT_EXPR support has been added to the ranger, though
I guess until we started to use ranger during expansion in r16-1398
it wasn't really used much because vrp2 happens before widen_mul.
WIDEN_MULT_EXPR is documented to be
/* Widening multiplication.
The two arguments are of type t1 and t2, both integral types that
have the same precision, but possibly different signedness.
The result is of integral type t3, such that t3 is at least twice
the size of t1/t2. WIDEN_MULT_EXPR is equivalent to first widening
(promoting) the arguments from type t1 to type t3, and from t2 to
type t3 and then multiplying them. */
and IMHO ranger should follow that description, so not relying on
the precisions to be exactly 2x but >= 2x. More importantly, I don't
see convert_mult_to_widen actually ever testing TYPE_UNSIGNED on the
result, why would it when the actual RTL optabs don't care about that,
in RTL the signs are relevant just whether it is smul_widen, umul_widen
or usmul_widen. Though on GIMPLE whether the result is signed or unsigned
is important, for value ranges it is essential (in addition to whether the
result type is wrapping or undefined overflow). Unfortunately the ranger
doesn't tell wi_fold about the signs of the operands and wide_int can be
both signed and unsigned, all it knows is the precision of the operands,
so r13-6617 handled it by introducing two variants (alternate codes for
WIDEN_MULT_EXPR). One was assuming first operand is signed, the other
the first operand is unsigned and both were assuming that the second operand
has the same sign as the result and that result has exactly 2x precision
of the arguments. That is clearly wrong, on the following testcase
we have u w* u -> s stmt and ranger incorrectly concluded that the result
has [0, 0] range because the operands were [0, 0xffffffff] and
[0, -1] (both had actually [0, 0xffffffff] range, but as it used sign
extension rather than zero extension for the latter given the signed result,
it got it wrong). And when we see [0, 0] range for memset length argument,
we just optimize it away completely at expansion time, which is wrong for
the testcase where it can be arbitrary long long int [0, 0xffffffff]
* long long int [0, 0xffffffff], so because of signed overflow I believe
the right range is long long int [0, 0x7fffffffffffffff], as going above
that would be UB and both operands are non-negative.
The following patch fixes it by not having 2 magic ops for WIDEN_MULT_EXPR,
but 3, one roughly corresponding to smul_widen, one to umul_widen and
one to usmul_widen (though confusingly with sumul order of operands).
The first one handles s w* s -> {u,s}, the second one u w* u -> {u,s}
and the last one s w* u -> {u,s} with u w* s -> {u,s} handled by swapping
the operands as before. Also, in all cases it uses TYPE_PRECISION (type)
as the precision to extend to, because that is the precision in which
the actual multiplication is performed, the operation as described is
(type) op1 * (type) op2.
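
The arithmetic behind this can be checked with a small Python model. It is a sketch under simplifying assumptions, not the range-op.cc code: each operand is extended according to its own sign to the result precision, the four corner products are taken, and a signed result is clamped to the type bounds on the assumption that overflow is undefined. The function names are hypothetical.

```python
def extend(x, from_bits, signed):
    """Interpret the low FROM_BITS bits of X as a signed or unsigned integer."""
    x &= (1 << from_bits) - 1
    if signed and x >> (from_bits - 1):
        x -= 1 << from_bits
    return x


def widen_mult_range(lh, rh, op_bits, lh_signed, rh_signed, res_bits, res_signed):
    """Range of (type) op1 * (type) op2 for operand ranges LH and RH."""
    lh_lo, lh_hi = (extend(v, op_bits, lh_signed) for v in lh)
    rh_lo, rh_hi = (extend(v, op_bits, rh_signed) for v in rh)
    products = [a * b for a in (lh_lo, lh_hi) for b in (rh_lo, rh_hi)]
    lo, hi = min(products), max(products)
    if res_signed:
        # Assumes undefined overflow: values outside the type cannot occur.
        tmin, tmax = -(1 << (res_bits - 1)), (1 << (res_bits - 1)) - 1
        return max(lo, tmin), min(hi, tmax)
    tmax = (1 << res_bits) - 1
    if lo < 0 or hi > tmax:
        return 0, tmax  # a wrapping result could be anything
    return lo, hi


# Sign-extending an unsigned 0xffffffff bound turns it into -1, which is
# how the [0, -1] operand range mentioned above came about.
print(extend(0xFFFFFFFF, 32, signed=True), extend(0xFFFFFFFF, 32, signed=False))

# u w* u -> s with both operands in [0, 0xffffffff].
print(widen_mult_range((0, 0xFFFFFFFF), (0, 0xFFFFFFFF), 32, False, False, 64, True))
```

For the u w* u -> s case the model yields [0, 0x7fffffffffffffff], matching the range argued for in the message.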
Note, r13-6617 also added OP_WIDEN_PLUS_{SIGNED,UNSIGNED} and handlers
for that, but it doesn't seem to be wired up in any way, so I think it
is dead code:
|git grep OP_WIDEN_PLUS_ .
|ChangeLog-2023: (OP_WIDEN_PLUS_SIGNED): New.
|ChangeLog-2023: (OP_WIDEN_PLUS_UNSIGNED): New.
|range-op.cc: set (OP_WIDEN_PLUS_SIGNED, op_widen_plus_signed);
|range-op.cc: set (OP_WIDEN_PLUS_UNSIGNED, op_widen_plus_unsigned);
|range-op.h:#define OP_WIDEN_PLUS_SIGNED ((unsigned) MAX_TREE_CODES + 2)
|range-op.h:#define OP_WIDEN_PLUS_UNSIGNED ((unsigned) MAX_TREE_CODES + 3)
My understanding is that it is a misnamed attempt to implement WIDEN_SUM_EXPR
handling but one that wasn't hooked up in maybe_non_standard.
I wonder if we shouldn't keep it as is for GCC 16, rename to OP_WIDEN_SUM_*
in stage1, hook it up in maybe_non_standard (in this case 2 ops are
sufficient, the description is
/* Widening summation.
The first argument is of type t1.
The second argument is of type t2, such that t2 is at least twice
the size of t1. The type of the entire expression is also t2.
WIDEN_SUM_EXPR is equivalent to first widening (promoting)
the first argument from type t1 to type t2, and then summing it
with the second argument. */
and so we know second argument has the same type as the result, so all
we need to encode is the sign of the first argument).
And the ops should be both renamed and fixed, instead of
wi::overflow_type ov_lb, ov_ub;
signop s = TYPE_SIGN (type);
r = int_range<2> (type, new_lb, new_ub);
I'd go for
wide_int lh_wlb = wide_int::from (lh_lb, TYPE_PRECISION (type), SIGNED);
wide_int lh_wub = wide_int::from (lh_ub, TYPE_PRECISION (type), SIGNED);
return op_plus.wi_fold (r, type, lh_wlb, lh_wub, rh_lb, rh_ub);
(and similarly for the unsigned case with s/SIGNED/UNSIGNED/g).
Reasons: the precision again needs to be widened to type's precision, there
is no point to widen the second operand as it is already supposed to have
the right precision and operator_plus actually ends with
value_range_with_overflow (r, type, new_lb, new_ub, ov_lb, ov_ub);
to handle the overflows etc., r = int_range<2> (type, new_lb, new_ub);
won't do it.
2026-02-04 Jakub Jelinek <jakub@redhat.com>
PR middle-end/123978
* range-op.h (OP_WIDEN_MULT_SIGNED_UNSIGNED): Define.
(OP_WIDEN_PLUS_SIGNED, OP_WIDEN_PLUS_UNSIGNED,
RANGE_OP_TABLE_SIZE): Renumber.
* gimple-range-op.cc (gimple_range_op_handler::maybe_non_standard):
Use 3 different classes instead of 2 for WIDEN_MULT_EXPR, one
for both operands signed, one for both operands unsigned and
one for operands with different signedness. In the last case
canonicalize to first operand signed second unsigned.
* range-op.cc (operator_widen_mult_signed::wi_fold): Use
TYPE_PRECISION (type) rather than wi::get_precision (whatever) * 2,
use SIGNED for all wide_int::from calls.
(operator_widen_mult_unsigned::wi_fold): Similarly but use UNSIGNED
for all wide_int::from calls.
(class operator_widen_mult_signed_unsigned): New type.
(operator_widen_mult_signed_unsigned::wi_fold): Define.
Jonathan Wakely [Wed, 4 Feb 2026 22:57:34 +0000 (22:57 +0000)]
libstdc++: Fix std::shared_ptr pretty printer for GDB 11
This pretty printer was updated for GCC 16 to match a change to
std::atomic<shared_ptr<T>>. But the gdb.Type.is_scalar property was
added in GDB 12.1, so we get an error for older GDB versions.
This adds a workaround for older GDB versions. The gdb.Type.tag property
is None for scalar types, and should always be defined for the
std::atomic class template. Another option would be to use the
is_specialization_of function defined in printers.py, but just checking
for the tag is simpler.
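
A minimal sketch of such a fallback is shown below. It is not the code from printers.py; the helper name is hypothetical, and stand-in objects are used instead of real gdb.Type values so that the snippet runs outside GDB.

```python
from types import SimpleNamespace


def type_is_scalar(typ):
    """Use is_scalar where GDB provides it, otherwise fall back to the tag."""
    if hasattr(typ, "is_scalar"):
        return typ.is_scalar
    # Older GDB: scalar types have no tag, class types do.
    return typ.tag is None


# Stand-ins for gdb.Type objects as seen by an older GDB (no is_scalar).
old_pointer = SimpleNamespace(tag=None)
old_atomic = SimpleNamespace(tag="std::atomic<int>")
# Stand-in for a gdb.Type object as seen by a GDB that has is_scalar.
new_pointer = SimpleNamespace(tag=None, is_scalar=True)

print(type_is_scalar(old_pointer), type_is_scalar(old_atomic), type_is_scalar(new_pointer))
```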
libstdc++-v3/ChangeLog:
* python/libstdcxx/v6/printers.py (SharedPointerPrinter): Only
use gdb.Type.is_scalar if supported.
* testsuite/libstdc++-prettyprinters/compat.cc: Test printer for
old implementation of std::atomic<std::shared_ptr<T>>.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Jan Hubicka [Thu, 5 Feb 2026 11:30:50 +0000 (12:30 +0100)]
Fix speculative devirtualization ICE
This patch fixes a cgraph verifier ICE about a missing indirect call in the
speculative call sequence. It is triggered when tree-inline manages to
devirtualize a call during the folding it does and the call has multiple speculative
targets. cgraph_update_edges_for_call_stmt_node calls
cgraph_edge::resolve_speculation which resolves one edge, but it should make
the whole sequence direct which is done by cgraph_edge::make_direct.
The code was also handling the case where the call changed but is not direct.
This should be impossible (and would require similar updating) so I added unreachable.
Bootstrapped/regtested x86_64-linux and also tested with autofdo and spec2017.
gcc/ChangeLog:
PR ipa/123226
* cgraph.cc (cgraph_update_edges_for_call_stmt_node): Fix handling
of multi-target speculations resolved at clone materialization time.
xiezhiheng [Wed, 14 Jan 2026 02:07:22 +0000 (10:07 +0800)]
aarch64: Add support for Hisilicon's hip12 core (-mcpu=hip12)
This patch adds initial support for Hisilicon's hip12 core
(Kunpeng 950 processor).
For more information, see:
https://www.huawei.com/en/news/2025/9/hc-xu-keynote-speech
Bootstrapped and tested on aarch64-linux-gnu, no regression.
Robin Dapp [Wed, 4 Feb 2026 20:20:22 +0000 (21:20 +0100)]
RISC-V: Fix xtheadvector ratio attribute. [PR123870]
As reported in PR123870 we miscompile an RVV-optimized jpeg-quantsmooth
with xtheadvector. The core issue is that we forget to emit a vsetvl
before a -fschedule-insn induced spill restore. Spills are usually
handled by full-register loads and stores but xtheadvector doesn't have
those. Instead, the regular loads and stores are used which differ from
full-register loads/stores in that they don't encode the LMUL
in the instruction directly and thus require a proper SEW and LMUL in
the vtype rather than just the ratio.
This patch makes vlds have an SEW/LMUL demand instead of a "ratio only"
demand for theadvector.
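
The difference between the two kinds of demand can be shown with a short Python sketch. It is a conceptual model only, with hypothetical function names, and says nothing about how the vsetvl pass stores its demands: two vtype settings with the same SEW/LMUL ratio satisfy a "ratio only" demand even though their SEW and LMUL differ.

```python
from fractions import Fraction


def sew_lmul_ratio(sew, lmul):
    return Fraction(sew) / Fraction(lmul)


def ratio_only_demand_met(current, demanded):
    """Satisfied by any vtype whose SEW/LMUL ratio matches."""
    return sew_lmul_ratio(*current) == sew_lmul_ratio(*demanded)


def sew_lmul_demand_met(current, demanded):
    """Satisfied only when both SEW and LMUL match."""
    return current == demanded


current_vtype = (16, 1)   # SEW=16, LMUL=1 left over from earlier code
spill_restore = (32, 2)   # SEW=32, LMUL=2 needed by the reload

print(ratio_only_demand_met(current_vtype, spill_restore))
print(sew_lmul_demand_met(current_vtype, spill_restore))
```

With a ratio-only demand the existing vtype looks good enough and no vsetvl is emitted, which is harmless for a full-register load but not for a regular load that takes SEW and LMUL from the vtype.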
I didn't manage to come up with a simple test case, though.
PR123969 has a test but it won't fail without slight changes to the
16 codebase. I'm still adding it for documentation and backport
reasons.
Regtested on rv64gcv_zvl512b.
PR target/123870
PR target/123969
gcc/ChangeLog:
* config/riscv/vector.md: Add vlds to "no ratio" for
theadvector.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/xtheadvector/pr123969.c: New test.
Signed-off-by: Robin Dapp <rdapp@oss.qualcomm.com>
Robin Dapp [Wed, 4 Feb 2026 11:51:36 +0000 (12:51 +0100)]
RISC-V: Allow vector modes for xtheadvector. [PR123971]
In riscv_vector_mode_supported_any_target_p we disallow any vector mode
when TARGET_XTHEADVECTOR.
Things go wrong when we check if a permutation for a mode is supported
by just looking at the optab (e.g. in forwprop). Then later we try to
expand that permutation but cannot find a related int vector mode because
we don't allow any vector mode.
Strictly speaking, this is fallout from the simplify_vector_constructor
changes but it's still a target issue as the common code has done the
proper check and we don't live up to the promise of being able to extend
a certain mode.
This patch just allows all modes in
riscv_vector_mode_supported_any_target_p, even for theadvector.
Robin Dapp [Mon, 2 Feb 2026 20:48:05 +0000 (21:48 +0100)]
forwprop: Handle nop-conversion for maybe_ident. [PR123925]
The same handling for nop conversions we did in the !maybe_ident case is
also necessary for maybe_ident. This patch performs the necessary
preprocessing before the if and unifies the nop-conversion handling.
PR tree-optimization/123925
gcc/ChangeLog:
* tree-ssa-forwprop.cc (simplify_vector_constructor):
Add nop-conversion handling for maybe_ident.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/pr123925.c: New test.
Signed-off-by: Robin Dapp <rdapp@oss.qualcomm.com>
Robin Dapp [Mon, 2 Feb 2026 09:28:08 +0000 (10:28 +0100)]
RISC-V: Disable small memsets for xtheadvector [PR123910].
This patch disables memsets with size less than a vector for
xtheadvector. As xtheadvector does not support fractional
LMUL we need to ensure to not emit those vectors that might
use it.
a68: make SET, CLEAR and TEST bits operators zero-based
After some discussion at the working group we have decided that the
bits operators SET, CLEAR and TEST, which are a GNU extension, shall
get bit numbers which are zero-based rather than one-based.
Signed-off-by: Jose E. Marchesi <jemarch@gnu.org>
gcc/algol68/ChangeLog
* a68-low-bits.cc (a68_bits_set): Make bit number zero-based rather
than one-based.
(a68_bits_clear): Likewise.
(a68_bits_test): Likewise.
* ga68.texi (Extended bits operators): Adapt documentation
accordingly.
gcc/testsuite/ChangeLog
* algol68/execute/bits-clear-1.a68: Adapt test to new CLEAR
semantics.
* algol68/execute/bits-set-1.a68: Likewise for SET.
* algol68/execute/bits-test-1.a68: Likewise for TEST.
With commit r16-5947-ga6c50ec2c6ebcbda2b032eee0552a6a486355e12
"Add -ffuse-ops-with-volatile-access", GCC/nvptx avoids use of intermediate
registers in applicable cases (nice!). This causes one test suite regression:
PASS: gcc.target/nvptx/alloca-5.c (test for excess errors)
XFAIL: gcc.target/nvptx/alloca-5.c execution test
[-PASS:-]{+FAIL:+} gcc.target/nvptx/alloca-5.c check-function-bodies f
PASS: gcc.target/nvptx/alloca-5.c check-function-bodies g
Adjust the FAILing 'check-function-bodies' as per the improved code generation.
Richard Biener [Tue, 3 Feb 2026 11:18:56 +0000 (12:18 +0100)]
tree-optimization/121726 - TBAA bug with SRA of address-taken decls
The following fixes re-materialization of aggregates before calls
that take the address of a scalarized decl. The issue here is that
we do not know the appropriate effective type to use for the stores.
So we use ref-all accesses for the re-materialization to properly
support TBAA info modref might have recorded. The same holds
true for the re-load after the call.
PR tree-optimization/121726
* tree-sra.cc (build_ref_for_offset): Add force_ref_all
parameter and use ptr_type_node as alias pointer type in
that case.
(build_ref_for_model): Add force_ref_all parameter and
pass it through, forcing build_ref_for_offset.
(generate_subtree_copies): Likewise.
(sra_modify_call_arg): Force ref-all accesses.
Tamar Christina [Thu, 5 Feb 2026 08:07:33 +0000 (08:07 +0000)]
middle-end: use inner variable when determining deferred FMA order [PR123898]
If we defer an FMA creation the code tries to determine the order of the
operands before deferring. To do this it compares the operands against the
result expression (which should contain the multiplication expression).
However the multiply might be wrapped in a conversion. This change has us strip
one level of conversion (the most that convert_mult_to_fma supports handling)
and only then do the comparison.
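
The comparison can be sketched on a toy SSA model in Python. This is an illustration under assumptions, not the tree-ssa-math-opts.cc code: definitions are kept in a dictionary, the conversion kinds are plain strings, and the function names are hypothetical.

```python
def strip_one_conversion(name, defs):
    """Look through at most one nop or view-convert definition of NAME."""
    definition = defs.get(name)
    if definition is not None and definition[0] in ("nop", "view_convert"):
        return definition[1]
    return name


def mul_operand_index(ops, mul_result, defs):
    """Return which addition operand is the multiplication, or None."""
    for index, op in enumerate(ops):
        if strip_one_conversion(op, defs) == mul_result:
            return index
    return None


# _4 = a * b; _5 = (double) _4; _7 = c_3 + _5
defs = {"_5": ("nop", "_4")}
print(mul_operand_index(["c_3", "_5"], "_4", defs))

# Two levels of conversion are not looked through.
deep = {"_6": ("nop", "_5"), "_5": ("nop", "_4")}
print(mul_operand_index(["_6", "c_3"], "_4", deep))
```

The operands themselves are left untouched; only the comparison sees the stripped name.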
We cannot strip ops[0] and ops[1] and store them stripped since after the
deferral, if we create an FMA we need to know the original types and
convert_mult_to_fma handles the conversions during FMA creation anyway.
There's probably a similar helper to strip_nop_view_converts but I couldn't
find one, since many of the stripping helpers are recursive or don't support
stripping VIEW_CONVERTS.
gcc/ChangeLog:
PR tree-optimization/123898
* tree-ssa-math-opts.cc (strip_nop_view_converts): New.
(convert_mult_to_fma): Use it.
gcc/testsuite/ChangeLog:
PR tree-optimization/123898
* gcc.target/aarch64/sve/pr123898.c: New test.
Alexandre Oliva [Thu, 5 Feb 2026 02:53:11 +0000 (23:53 -0300)]
testsuite: aarch64: state pr123775.c requirements
The execution testcase requires sve2 and 128-bit sve hardware, but it
doesn't state those requirements. I think the latter is implied by
the former, but I'm not entirely sure, so I'm requiring both
explicitly.
for gcc/testsuite/ChangeLog
PR middle-end/123775
* gcc.target/aarch64/sve2/pr123775.c: Add sve128 and sve2 hw
requirements.
Alexandre Oliva [Thu, 5 Feb 2026 02:48:55 +0000 (23:48 -0300)]
simplify-rtx: fix riscv redundant-bitmap-2.C
The insn simplification expected by the test, to get a bset
instruction, has been prevented since r15-9239, because we get rotates
for bit clear and shifts for bit flip, and we don't know how to
simplify those.
Teach the rtl simplifier, at the spots where it had been extended to
handle these logical simplifications, to also handle these less
obvious negations.
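
The message does not spell out the RTL shapes involved. Assuming the bit-clear mask is expressed as ~1 rotated left by n and the bit-flip mask as 1 shifted left by n, the identity that makes them negations of each other can be checked in Python. The helper names are hypothetical and unrelated to negated_ops_p itself.

```python
def rotl(x, n, width):
    """Rotate the WIDTH-bit value X left by N bits."""
    mask = (1 << width) - 1
    x &= mask
    n %= width
    return ((x << n) | (x >> (width - n))) & mask


def rotate_is_negated_shift(n, width):
    """Check that rotating ~1 by N equals the complement of 1 << N."""
    mask = (1 << width) - 1
    return rotl(~1, n, width) == (~(1 << n) & mask)


print(all(rotate_is_negated_shift(n, 32) for n in range(32)))
print(all(rotate_is_negated_shift(n, 64) for n in range(64)))
```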
for gcc/ChangeLog
* simplify-rtx.cc (negated_ops_p): New.
(simplify_context::simplify_binary_operation_1): Use it.
Pan Li [Wed, 4 Feb 2026 04:55:28 +0000 (12:55 +0800)]
RISC-V: Introduce vr2fpr-cost= for customizing the cost when vr2fpr
Similar to vr2gpr-cost=, add the one for fpr as well.
PR target/123916
gcc/ChangeLog:
* config/riscv/riscv-opts.h (VR2FPR_COST_UNPROVIDED): Add new
sentinel for unprovided cost.
* config/riscv/riscv-protos.h (get_vr2fr_cost): Add new func
decl.
* config/riscv/riscv-vector-costs.cc (costs::adjust_stmt_cost):
Leverage new func to get cost of vr2fpr.
* config/riscv/riscv.cc (riscv_register_move_cost): Ditto.
(get_vr2fr_cost): Add new func to wrap access to the cost
of the vr2fpr.
* config/riscv/riscv.opt: Add new param vr2fpr-cost.
Pan Li [Wed, 4 Feb 2026 02:13:38 +0000 (10:13 +0800)]
RISC-V: Introduce vr2gpr-cost= for customizing the cost when vr2gpr
Previously the middle-end passed NULL_TREE, so the adjust_stmt_cost
step, which counts the cost of vr2gpr, was skipped.
After Richard introduced more information, such as slp_node with its
vectype, for recording the cost, adjust_stmt_cost is now hit and adds
the cost of vr2gpr.
Vectorization then fails because the cost of vr2gpr is counted.
This patch introduces another param named vr2gpr-cost
to allow the command line to provide the cost value of vr2gpr. We can then
use this param to make the failing test pass.
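
The "unprovided" sentinel pattern named in the ChangeLog below can be sketched in Python as follows. The sentinel value and the two-argument signature are assumptions made for illustration; the real get_vr2gr_cost is a C++ function in riscv.cc and may differ.

```python
COST_UNPROVIDED = -1  # assumed sentinel; the real value lives in riscv-opts.h


def get_vr2gr_cost(param_value, tuning_default):
    """Prefer a cost given on the command line over the tuning default."""
    if param_value != COST_UNPROVIDED:
        return param_value
    return tuning_default


print(get_vr2gr_cost(COST_UNPROVIDED, 2))  # no param given: tuning default
print(get_vr2gr_cost(0, 2))                # param given: it wins
```

Using a sentinel rather than a default of zero keeps an explicitly requested cost of zero distinguishable from "no value given".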
For further enhancement of the cost value customization, we would
like to defer to next stage-1, aka GCC-17.
PR target/123916
gcc/ChangeLog:
* config/riscv/riscv-opts.h (GPR2VR_COST_UNPROVIDED): Depend on
default unprovided value.
(FPR2VR_COST_UNPROVIDED): Ditto.
(VR2GPR_COST_UNPROVIDED): Ditto.
(COST_UNPROVIDED): Add new default unprovided value.
* config/riscv/riscv-protos.h (get_vr2gr_cost): Add new func
decl.
* config/riscv/riscv-vector-costs.cc (costs::adjust_stmt_cost):
Leverage new func to get cost of vr2gpr.
* config/riscv/riscv.cc (riscv_register_move_cost): Ditto.
(get_vr2gr_cost): Add new func to wrap access to the cost
of the vr2gpr.
* config/riscv/riscv.opt: Add new param vr2gpr-cost.
Remaining to do:
* Add new declarations in debug headers too.
Implement C++23 P2077R3 "Heterogeneous erasure overloads for
associative containers". Adds template overloads for members
erase and extract to address elements using an alternative key
type, such as string_view for a container of strings, without
need to construct an actual key object.
The new overloads enforce concept __heterogeneous_tree_key or
__heterogeneous_hash_key to verify the function objects provided
meet requirements, and that the key supplied is not an iterator
or the native key.
libstdc++-v3/ChangeLog:
PR libstdc++/117404
* include/bits/version.def (associative_heterogeneous_erasure):
Define.
* include/bits/version.h: Regenerate.
* include/std/map: Request new feature from version.h.
* include/std/set: Same.
* include/std/unordered_map: Same.
* include/std/unordered_set: Same.
* include/bits/stl_map.h (extract, erase): Define overloads.
* include/bits/stl_set.h: Same.
* include/bits/stl_multimap.h: Same.
* include/bits/stl_multiset.h: Same.
* include/bits/unordered_map.h: Same, 2x.
* include/bits/unordered_set.h: Same, 2x.
* include/bits/stl_function.h (concepts __not_container_iterator,
__heterogeneous_key): Define.
* include/bits/hashtable.h (_M_find_before_node, _M_locate, extract):
Delegate to more-general _tr version.
(_M_find_before_node_tr, _M_locate_tr, _M_extract_tr, _M_erase_tr):
Add new members to support a heterogeneous key argument.
(_M_erase_some): Add new helper function.
(concept __heterogeneous_hash_key): Define.
* include/bits/stl_tree.h (_M_lower_bound_tr, _M_upper_bound_tr,
_M_erase_tr, _M_extract_tr): Add new members to support a
heterogeneous key argument.
(concept __heterogeneous_tree_key): Define.
* testsuite/23_containers/map/modifiers/hetero/erase.cc: New test.
* testsuite/23_containers/multimap/modifiers/hetero/erase.cc: Same.
* testsuite/23_containers/multiset/modifiers/hetero/erase.cc: Same.
* testsuite/23_containers/set/modifiers/hetero/erase.cc: Same.
* testsuite/23_containers/unordered_map/modifiers/hetero/erase.cc: Same.
* testsuite/23_containers/unordered_multimap/modifiers/hetero/erase.cc:
Same.
* testsuite/23_containers/unordered_multiset/modifiers/hetero/erase.cc:
Same.
* testsuite/23_containers/unordered_set/modifiers/hetero/erase.cc: Same.
In rare cases LRA splits hard reg live ranges to assign regs to pseudos when
other ways to allocate regs have failed. During the pseudo
assignment LRA updates hard reg preferences of pseudos connected to
the given pseudo through copies. For this LRA uses the array
update_hard_regno_preference_check, which was not allocated during hard
reg live range splitting. The patch fixes the bug by allocating
the array during hard reg live range splitting.
gcc/ChangeLog:
PR rtl-optimization/123922
* lra-assigns.cc (lra_split_hard_reg_for): Allocate and free
update_hard_regno_preference_check. Clear non_reload_pseudos for
successful spilling too.
Andrew Pinski [Wed, 4 Feb 2026 03:20:48 +0000 (19:20 -0800)]
complex: Directly emit gimple from extract_component [PR121661]
Currently extract_component uses force_gimple_operand_gsi to emit
gimple including loads. The problem with that is that decls that have
DECL_EXPR_DECL set on them will change over to use the DECL_EXPR_DECL instead.
Normally this is ok, except for nested functions where the original decl
is a PARAM_DECL and there is a copy from the param decl to the new frame based
location.
Instead we should just create the gimple ourselves.
The only special case that needs to be handled is BIT_FIELD_REF and
a VCE of SSA_NAME. BIT_FIELD_REF was already handled specially
so we can just emit the load there. VCE of SSA_NAME on the other
hand needed some extra code.
Note the VCE of an SSA_NAME case could be optimized; I filed PR 123968
for that. That is not a regression at this point, and we are
now producing the same code as before.
Bootstrapped and tested on x86_64-linux-gnu.
PR middle-end/121661
gcc/ChangeLog:
* tree-complex.cc (extract_component): Create gimple
assign statements directly rather than call force_gimple_operand_gsi.
gcc/testsuite/ChangeLog:
* gcc.dg/torture/pr121661-1.c: New test.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
OpenMP/Fortran: Fix present modifier in map clauses for allocatables
The OpenMP 6.0 spec reads (Section 7.9.6 "map Clause"):
"Unless otherwise specified, if a list item is a referencing variable then the
effect of the map clause is applied to its referring pointer and, if a
referenced pointee exists, its referenced pointee."
In other words, the map clause (and its modifiers) applies to the array
descriptor (unconditionally), and also to the array data if it is allocated.
Without this patch, the semantics enforced in libgomp is incorrect: an
allocatable is deemed present only if it is allocated. Correct semantics: an
allocatable is in the present table as long as its descriptor is mapped, even if
no data exists.
libgomp/ChangeLog:
* target.c (gomp_present_fatal): New function.
(gomp_map_vars_internal): For a Fortran allocatable array, present
causes runtime termination only if the descriptor is not mapped.
(gomp_update): Call gomp_present_fatal.
* testsuite/libgomp.fortran/map-alloc-present-1.f90: New test.
I made a pass through the parameter documentation to fix various
issues with grammar, markup, punctuation, capitalization, inconsistent
use of terminology, and so on. I'm sure there's still room for improvement
but this is at least a step in the right direction.
As internal documentation, the parameter descriptions naturally
contain a lot of jargon and abbreviations for compiler passes. This
patch does not attempt to fix that. Probably a more critical area for
future improvement is making it more obvious what pass or part of the
compiler each parameter controls and perhaps adding some additional
subsectioning to group them.
gcc/ChangeLog
PR translation/89915
* doc/params.texi: Copy-edit text throughout the file.
* params.opt: Likewise in documentation strings.
doc: Remove references to parameters in user documentation [PR123245] [PR89915]
We shouldn't point users at specific parameters that control GCC
behavior since those are explicitly internal GCC interfaces subject to
incompatible changes or removal. This patch removes all such
references in the GCC user manual, either by replacing them with more
user-centric language, vague references to parameters generally, or
just removing the text as being unhelpful.
gcc/ChangeLog
PR target/123245
PR translation/89915
* doc/invoke.texi (Warning Options): Remove discussion of parameters
from -Winterference-size documentation.
(Static Analyzer Options): Ditto for -Wanalyzer-symbol-too-complex,
the list of things the analyzer has specific knowledge of, and
-fanalyzer-call-summaries.
(Optimize Options): Ditto for -finline-limit and -fipa-cp-clone.
(Instrumentation Options): Likewise for -fsanitize=kernel-hwaddress
and -fharden-control-flow-redundancy.
(C++ Compiled Module Interface): Likewise for discussion of limits
on number of open files.
doc: Move parameter docs to the GCC internals manual [PR123245] [PR89915]
There appears to be some consensus that parameter documentation should
be moved from the GCC user manual to the internals manual. Aside from
the explicit disclaimer that parameters are implementation details and
subject to change, many of them are documented in pure "implementor-speak"
using jargon that users are unlikely to be familiar with.
This patch moves the documentation more or less as-is. I added some
sectioning and a new index to make things easier to find, but I did
not modify the parameter descriptions to correct even obvious grammar
and markup issues. That will be addressed in a subsequent patch.
There are several places in the user manual that make dangling
references to parameters controlling the behavior of user-visible
options, in spite of the caveats elsewhere that parameters are
internal and can't be relied on. There's also a separate patch for this.
gcc/ChangeLog
PR target/123245
PR translation/89915
* Makefile.in (TEXI_GCCINT_FILES): Add params.texi.
* doc/gccint.texi (pa): New index.
(Top): Add new Parameters and Parameters Index nodes to menu.
Include params.texi.
(Parameter Index): New.
* doc/invoke.texi (Option Summary): Move --param from Optimization
Options to Developer Options.
(Optimization Options): Move parameter documentation to params.texi.
(Developer Options): Add abbreviated discussion of --param here.
(LoongArch Options): Move parameter documentation to params.texi.
(RISC-V Options): Likewise.
(RS/6000 and PowerPC Options): Likewise.
* doc/params.texi: New file.
Yangyu Chen [Wed, 4 Feb 2026 14:48:17 +0000 (07:48 -0700)]
[PATCH] RISC-V: fix nullptr dereference in parse_arch
When parsing target attributes, if an invalid architecture string is
provided, the function parse_single_ext may return nullptr. The existing
code does not check for this case, leading to a nullptr dereference when
attempting to access the returned pointer. This patch adds a check to
ensure that the returned pointer is not nullptr before dereferencing it.
If it is nullptr, an appropriate error message is generated.
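The shape of the fix can be sketched as follows. This is a minimal stand-alone illustration, not the RISC-V back end: `ext_info`, `lookup_ext` and `add_ext` are hypothetical stand-ins for the real parsing helpers, and the error text is invented.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for a parsed extension entry.
struct ext_info
{
  const char *name;
};

// Hypothetical stand-in for a parser that returns nullptr when the string
// is not a known extension.
static const ext_info *
lookup_ext (const char *s)
{
  static const ext_info known[] = { { "m" }, { "a" }, { "zicond" } };
  for (const ext_info &e : known)
    if (std::strcmp (e.name, s) == 0)
      return &e;
  return nullptr;
}

// Returns true on success.  The nullptr check mirrors the fix: report an
// error instead of dereferencing the result of a failed lookup.
bool
add_ext (std::vector<std::string> &enabled, const char *s)
{
  assert (s != nullptr);
  const ext_info *e = lookup_ext (s);
  if (e == nullptr)
    {
      std::fprintf (stderr, "error: unknown extension '%s'\n", s);
      return false;
    }
  enabled.push_back (e->name);
  return true;
}
```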
Jakub Jelinek [Wed, 4 Feb 2026 11:30:42 +0000 (12:30 +0100)]
c++: Fix up eval_parameters_of for function types [PR123913]
eval_parameters_of was trying to share 3 lines of code between the
function declaration and function type cases, but got it wrong in
multiple ways for the latter. One thing is that we should override
reflect_kind only for the function decl case to REFLECT_PARM, there
we need to differentiate function parameter reflection vs. variable
reflection, for function type it is just type. Another one is that
we iterate over PARM_DECLs through DECL_CHAIN in the function decl
case, but for types we iterate over the TREE_LIST nodes and the
type is only TREE_VALUE of that.
And last, but I am not sure about this, maybe
https://eel.is/c++draft/meta.reflection#queries-62.2 should be clarified,
I think we want to apply dealias. We have notes like
https://eel.is/c++draft/meta.reflection#queries-note-7
https://eel.is/c++draft/meta.reflection#traits-5
but those don't apply to type_of or parameters_of. And I think there was
an agreement that meta fns which return reflection of a type don't return
type aliases, but can't see it written explicitly except for the traits.
2026-02-04 Jakub Jelinek <jakub@redhat.com>
PR c++/123913
PR c++/123964
* reflect.cc (eval_parameters_of): Fix up handling of function
types.
Jakub Jelinek [Wed, 4 Feb 2026 11:29:24 +0000 (12:29 +0100)]
c++: Perform the iterating expansion stmt N evaluation in immediate context [PR123611]
For the N evaluation for iterating expansion stmts where the standard says
to evaluate:
  [] consteval {
    std::ptrdiff_t result = 0;
    for (auto i = begin; i != end; ++i) ++result;
    return result;  // distance from begin to end
  }()
right now (subject to further changes in CWG3140) I wanted to save compile
time/memory and effort to actually construct the lambda and it is evaluated
just using TARGET_EXPRs. On the following testcase it makes a difference,
when the lambda is consteval, the expressions inside of it are evaluated
in immediate context and so the testcase should be accepted, but we
currently reject it when i has consteval-only type and expansion stmt
doesn't appear in an immediate or immediate-escalating function.
The following patch fixes this by forcing in_immediate_context () to be true
around the evaluation.
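"Temporarily forcing" a predicate around one computation is usually done with a scoped override. The sketch below shows that pattern in isolation; `in_immediate`, `scoped_flag` and `count_iterations` are hypothetical names chosen for illustration and are not the front end's actual variables or helpers.

```cpp
#include <cassert>

// Hypothetical global standing in for "currently in an immediate context".
static bool in_immediate = false;

// Scoped override: sets a flag for the lifetime of the object and restores
// the previous value when the scope is left.
class scoped_flag
{
public:
  scoped_flag (bool &flag, bool value) : m_flag (flag), m_saved (flag)
  { m_flag = value; }
  ~scoped_flag () { m_flag = m_saved; }
  scoped_flag (const scoped_flag &) = delete;
  scoped_flag &operator= (const scoped_flag &) = delete;
private:
  bool &m_flag;
  bool m_saved;
};

// Hypothetical stand-in for the N computation: counts the steps from BEGIN
// to END while the flag is forced on, and records what the flag was.
long
count_iterations (int begin, int end, bool *was_immediate)
{
  assert (begin <= end);
  scoped_flag guard (in_immediate, true);
  *was_immediate = in_immediate;
  long result = 0;
  for (int i = begin; i != end; ++i)
    ++result;
  return result;
}
```

The flag is true only while the count is computed and is back to its old value as soon as `count_iterations` returns.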
2026-02-04 Jakub Jelinek <jakub@redhat.com>
PR c++/123611
* pt.cc (finish_expansion_stmt): Temporarily enable
in_immediate_context () for the iterating expansion stmt N
computation.
Jakub Jelinek [Wed, 4 Feb 2026 10:49:01 +0000 (11:49 +0100)]
toplevel: Build stage1/stage2/stageprofile libstdc++ with --disable-libstdcxx-pch when bootstrapping
The following patch saves ~ 2.4GiB of disk space in x86_64-linux
bootstrapped object directory:
find obj80 -name \*.gch -a -type f | xargs du -shc | grep total
3.7G total
find obj82 -name \*.gch -a -type f | xargs du -shc | grep total
1.3G total
and ~ 800MiB for i686-linux:
find obj81 -name \*.gch -a -type f | xargs du -shc | grep total
1.2G total
find obj83 -name \*.gch -a -type f | xargs du -shc | grep total
409M total
by disabling PCH in stage1/stage2/stageprofile builds, so only
building it in stage3/stagefeedback etc.
I think in stage1/stage2 it is a pure waste of bootstrap time and disk
space, for profiledbootstrap I'd say PCH isn't used commonly enough
in the wild that it is worth training GCC on that (but if you disagree,
I can surely take out that single line in there).
Jakub Jelinek [Wed, 4 Feb 2026 10:36:08 +0000 (11:36 +0100)]
tree: Fix up wrong-code with certain C++ default arguments [PR123818]
The following testcase is miscompiled since r0-69852-g4038c495f (at least
if one can somehow arrange in C++98 to have AGGR_INIT_EXPR or any other
FE specific trees nested in constructor elts for CONSTRUCTOR nested inside
of default argument, if not, then since C++11 support that allows that has
been implemented).
The problem is that we unfortunately store default arguments in TREE_PURPOSE
of TYPE_ARG_TYPES nodes of the FUNCTION/METHOD_TYPE and those are shared,
with type_hash_canon used to unify them. The default arguments aren't
considered in type_hash_canon_hash at all, but the equality hook considers
them in type_list_equal by calling simple_cst_equal on those default
arguments. That function is a tri-state, returns 0 for surely unequal,
1 for equal and -1 for "I don't know", usually because there are FE trees
etc. (though, admittedly it is unclear why such distinction is done, as
e.g. for VAR_DECLs etc. it sees in there it returns 0 rather than -1).
Anyway, the r0-69852-g4038c495f change changed CONSTRUCTOR_ELTS from
I think a tree list of elements to a vector and changed the simple_cst_equal
implementation of CONSTRUCTOR from just recursing on CONSTRUCTOR_ELTS (which
I think would just return -1 because I don't see simple_cst_equal having
TREE_LIST nor TREE_VEC handling) to something that thinks simple_cst_equal
returns a bool rather than tri-state, plus has a comment whether it should
handle indexes too.
So, in particular on the testcase, which has a default argument containing a
CONSTRUCTOR with AGGR_INIT_EXPR as the single element (but it
could as well be in multiple elements), the recursive call returns -1
for "I don't know" on the element and the function treats that the same
as if it returned 1, i.e. as equal.
The following patch fixes it by handling the recursive (non tail-recursive)
simple_cst_equal calls like everywhere else in the function: return
what the call returned if it returned <= 0 and otherwise continue.
Plus given the comment I've also implemented checking the index.
The special case for FIELD_DECL is so that if both indexes are FIELD_DECLs
and they are different, we don't return -1 but 0.
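The propagation rule can be sketched on a toy data structure. This is not simple_cst_equal itself: `elt`, `elt_equal` and `elts_equal` are hypothetical, with `opaque` modelling a front-end-specific tree that cannot be compared.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Tri-state result as described in the text:
// 0 = surely unequal, 1 = equal, -1 = "I don't know".
// Hypothetical element: `opaque' models a front-end-specific tree.
struct elt
{
  int index;
  int value;
  bool opaque;
};

static int
elt_equal (const elt &a, const elt &b)
{
  if (a.index != b.index)
    return 0;  // different indexes are surely unequal, not unknown
  if (a.opaque || b.opaque)
    return -1;
  return a.value == b.value ? 1 : 0;
}

// Compare two element lists, propagating 0 and -1 instead of treating any
// nonzero result as "equal".
int
elts_equal (const std::vector<elt> &a, const std::vector<elt> &b)
{
  if (a.size () != b.size ())
    return 0;
  for (std::size_t i = 0; i < a.size (); ++i)
    {
      int cmp = elt_equal (a[i], b[i]);
      if (cmp <= 0)
        return cmp;
    }
  return 1;
}
```

Writing the loop test as `if (!elt_equal (...)) return 0;` would reproduce the bug: -1 is nonzero, so "I don't know" would silently count as "equal".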
2026-02-04 Jakub Jelinek <jakub@redhat.com>
PR c++/123818
* tree.cc (simple_cst_equal) <case CONSTRUCTOR>: Return -1 if some
recursive call returns -1. Also compare indexes.
Jakub Jelinek [Wed, 4 Feb 2026 10:25:01 +0000 (11:25 +0100)]
bitint: Don't try to extend PHI in the min_prec == prec case [PR122689]
The following testcase is miscompiled on aarch64-linux, the problem is that
the PHI argument is lowered to
VIEW_CONVERT_EXPR<unsigned long[196]>(y) = VIEW_CONVERT_EXPR<unsigned long[196]>(*.LC0);
MEM <unsigned long[1]> [(unsigned _BitInt(12419) *)&y + 1568B] = {};
on the edge, where for aarch64 unsigned _BitInt(12419) the size of the type
is already 1568 bytes (aka 196 DImode limbs, 98 TImode ABI limbs), so the
first stmt copies everything and the second stmt clobbers random unrelated memory
after it.
Usually when min_prec == prec (otherwise we guarantee that min_prec is either 0,
or a single limb (which doesn't have padding bits) or something without any
padding bits (multiple of abi_limb_prec)) we take the
  if (c)
    {
      if (VAR_P (v1) && min_prec == prec)
        {
          tree v2 = build1 (VIEW_CONVERT_EXPR,
                            TREE_TYPE (v1), c);
          g = gimple_build_assign (v1, v2);
          gsi_insert_on_edge (e, g);
          edge_insertions = true;
          break;
        }
path and need nothing else, but in this case v1 is a PARM_DECL, so we need to
take the code path with VCE on the lhs side as well. But after the assignment
we fall through into the handling of the extension, but for min_prec == prec
that isn't needed and is actually harmful, we've already copied everything
and the code later on assumes there are no padding bits and uses just
TYPE_SIZE_UNIT.
So, this patch just avoids the rest after we've copied all the bits for
min_prec == prec.
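The size arithmetic behind the miscompilation can be checked with a few lines. The helper names below (`bitint_size_bytes`, `store_in_bounds`) are hypothetical; the figures (12419 bits, 128-bit ABI limbs, an 8-byte store at offset 1568) are the ones quoted above.

```cpp
#include <cassert>
#include <cstdint>

// Round PREC bits up to a whole number of limbs of LIMB_BITS bits each and
// return the resulting size in bytes.
constexpr std::uint64_t
bitint_size_bytes (std::uint64_t prec, std::uint64_t limb_bits)
{
  return (prec + limb_bits - 1) / limb_bits * (limb_bits / 8);
}

// _BitInt(12419) with 128-bit (TImode) ABI limbs: 98 limbs, 1568 bytes.
static_assert (bitint_size_bytes (12419, 128) == 1568,
               "98 TImode limbs, i.e. 196 DImode limbs");

// True if a store of LEN bytes at byte offset OFF stays inside an object
// of OBJ_SIZE bytes.
constexpr bool
store_in_bounds (std::uint64_t off, std::uint64_t len,
                 std::uint64_t obj_size)
{
  return off <= obj_size && len <= obj_size - off;
}

// The first statement copies all 1568 bytes; the second one writes one
// more 8-byte limb at offset 1568, which is past the end of the object.
static_assert (store_in_bounds (0, 1568, 1568), "full copy is fine");
static_assert (!store_in_bounds (1568, 8, 1568), "extension store is not");
```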
2026-02-04 Jakub Jelinek <jakub@redhat.com>
PR middle-end/122689
* gimple-lower-bitint.cc (gimple_lower_bitint): For the PHI handling
if min_prec == prec, break after emitting assignment from c.
Prachi Godbole [Wed, 4 Feb 2026 07:23:10 +0000 (23:23 -0800)]
ipa-reorder-for-locality - Adjust bootstrap-lto-locality and param to reduce compile time
This patch turns off -fipa-reorder-for-locality for -fprofile-generate because
it's not required and contributes to the bloated time taken by bootstrap. It
also reduces the default partition size by half; the increased number of
partitions speeds up LTRANS phase.
Bootstrapped and tested on aarch64-none-linux-gnu. OK for mainline?
David Malcolm [Tue, 3 Feb 2026 23:52:35 +0000 (18:52 -0500)]
analyzer: fix ICE on pointer offsets [PR116865]
gcc/analyzer/ChangeLog:
PR analyzer/116865
* region-model-manager.cc
(region_model_manager::get_offset_region): Use POINTER_PLUS_EXPR
rather than PLUS_EXPR for pointer offsets.
gcc/testsuite/ChangeLog:
PR analyzer/116865
* c-c++-common/analyzer/ice-pr116865.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
- Import dmd v2.112.0.
- Bitfields feature is now enabled by default.
- The compiler now accepts `-std=d2024' and `-std=d202y'.
- An error is now issued for dangling `else' statements.
- `finally' statements are no longer rewritten to a sequence if
no `Exception' was thrown.
- Some forms of `printf' calls are now treated as `@safe'.
- Implicit integer conversions in `int op= float` assignments
have been deprecated.
D runtime changes:
- Import druntime v2.112.0.
- Added `filterCaughtThrowable' in `core.thread.ThreadBase'.
Phobos changes:
- Import phobos v2.112.0.
gcc/d/ChangeLog:
* dmd/VERSION: Bump version to v2.112.0.
* dmd/MERGE: Merge upstream dmd 24a41073c2.
* d-attribs.cc (build_attributes): Update for new front-end interface.
* d-builtins.cc (build_frontend_type): Likewise.
(matches_builtin_type): Likewise.
(d_init_versions): Predefine D_Profile when compiling with profile
enabled.
* d-codegen.cc (get_array_length): Update for new front-end interface.
(lower_struct_comparison): Likewise.
(build_array_from_val): Likewise.
(get_function_type): Likewise.
(get_frameinfo): Likewise.
* d-compiler.cc (Compiler::paintAsType): Likewise.
* d-convert.cc (convert_expr): Likewise.
(convert_for_rvalue): Likewise.
(convert_for_assignment): Likewise.
(d_array_convert): Likewise.
* d-diagnostic.cc (verrorReport): Rename to ...
(vreportDiagnostic): ... this.
(verrorReportSupplemental): Rename to ...
(vsupplementalDiagnostic): ... this.
* d-lang.cc (d_handle_option): Handle -std=d2024 and -std=d202y.
(d_parse_file): Update for new front-end interface.
* d-target.cc (Target::fieldalign): Likewise.
(Target::isVectorTypeSupported): Likewise.
(Target::isVectorOpSupported): Likewise.
* decl.cc (get_symbol_decl): Likewise.
(DeclVisitor::visit): Likewise.
(DeclVisitor::visit (FuncDeclaration *)): Do NRVO on `__result' decl.
* expr.cc (needs_postblit): Remove.
(needs_dtor): Remove.
(lvalue_p): Remove.
(ExprVisitor::visit): Update for new front-end interface.
(ExprVisitor::visit (AssignExp *)): Update for front-end lowering
expression using templates.
* imports.cc (ImportVisitor::visit): Update for new front-end
interface.
* intrinsics.def (INTRINSIC_VA_ARG): Update signature.
(INTRINSIC_C_VA_ARG): Update signature.
(INTRINSIC_VASTART): Update signature.
* lang.opt: Add -std=d2024 and -std=d202y.
* toir.cc (IRVisitor::visit): Update for new front-end interface.
* typeinfo.cc (TypeInfoVisitor::visit): Likewise.
(TypeInfoVisitor::visit (TypeInfoStructDeclaration *)): Ensure
semantic is run on all TypeInfo members.
(base_vtable_offset): Update for new front-end interface.
* types.cc (TypeVisitor::visit): Likewise.
Marek Polacek [Mon, 2 Feb 2026 16:05:16 +0000 (11:05 -0500)]
libstdc++, c++/reflection: mark {,de}allocate constexpr
[allocator.members] says that allocator::{,de}allocate should be
constexpr but currently we don't mark them as such. I had to
work around that in the Reflection code, but it would be better to
clean this up. (I see no allocation_result so I'm not changing that.)
Kirill Chilikin [Sun, 25 Jan 2026 07:43:08 +0000 (14:43 +0700)]
fortran: Fix creation of reference to C_FUNLOC argument [PR117303]
The reference returned by C_FUNLOC is assigned to a variable. Without that,
no reference from the calling subprogram to the argument of C_FUNLOC
was created in the call graph, resulting in an undefined-reference error
with link-time optimization. Please see PR 117303 for more details.
PR fortran/117303
gcc/fortran/ChangeLog:
* trans-intrinsic.cc (conv_isocbinding_function):
Assign the reference returned by C_FUNLOC to a variable.
gcc/testsuite/ChangeLog:
* gfortran.dg/c_funloc_tests_7.f90:
Updated test due to changed code generation.
* gfortran.dg/c_funloc_tests_9.f90: New test.
Rainer Orth [Tue, 3 Feb 2026 19:41:40 +0000 (20:41 +0100)]
build: Only use gas_flag/gnu_ld_flag internally [PR123841]
gcc/acinclude.m4, gcc/config.gcc, and gcc/configure.ac have two
different variables that are checked to determine if GNU as and/or GNU
ld are used. config.gcc describes them like this:
  gas_flag     Either yes or no depending on whether GNU as was
               requested.
  gnu_ld_flag  Either yes or no depending on whether GNU ld was
               requested.
  gas          Set to yes or no depending on whether the target
               system normally uses GNU as.
  gnu_ld       Set to yes or no depending on whether the target
               system normally uses GNU ld.
I find this duplication highly confusing: what's the point of what a
target normally uses when it can just be determined at configure time if
the assembler/linker used is gas/gnu_ld?
There are two uses for those variables:
* gas/gnu_ld determine the setting of HAVE_GNU_AS/HAVE_GNU_LD. In this
case only, the normally part may be good enough, so this patch doesn't
touch it.
* However, there are several other places where this isn't good enough:
when the assembler/linker is invoked at configure time, it's crucial
that the right options and input syntax are used for the tool in
question.
Therefore this patch determines gas_flag/gnu_ld_flag at configure time
if they are not yet set otherwise. All tests that need to know which
tool is used now check gas_flag/gnu_ld_flag only.
Tested on {i386,amd64}-pc-solaris2.11, {i686,x86_64}-pc-linux-gnu,
{i386,x86_64}-apple-darwin, and sparc64-unknown-linux-gnu.
Paul Thomas [Tue, 3 Feb 2026 18:00:54 +0000 (18:00 +0000)]
Fortran: Fix module proc with array valued dummy procedure [PR123952]
2026-01-14 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/123952
* symbol.cc (gfc_copy_dummy_sym): Ensure that external, array
valued destination symbols have the correct interface so that
conflicts do not arise when adding attributes.
gcc/testsuite
PR fortran/123952
* gfortran.dg/pr123952.f90: New test.
Roger Sayle [Tue, 3 Feb 2026 17:43:59 +0000 (17:43 +0000)]
PR middle-end/118608: Fix lack of sign extension on MIPS64 with -Os
Here is a refreshed version of my patch from February 2025, for resolving
PR middle-end/118608, a wrong-code-on-valid regression where the middle-end
is failing to keep values suitably sign-extended (MIPS64 is a rare
targetm.mode_rep_extended target, as well as being BYTES_BIG_ENDIAN).
This fix requires three independent tweaks, one in each source file.
The first tweak is that the logic in my triggering patch for determining
whether store_field updates the most significant bit needs to be updated
to handle BYTES_BIG_ENDIAN. Of the two insertions in the bugzilla test
case, we were generating the sign extension after the wrong one.
The second tweak was that this explicit sign-extension was then being
eliminated during combine by simplify-rtx that believed the explicit
TRUNCATE wasn't required. This patch updates truncated_to_mode to
understand that on mode_rep_extended targets, TRUNCATE is used instead
of SUBREG because it isn't a no-op. Finally, the third tweak is
that the MIPS backend requires a small change to recognize (and split)
*extenddi_truncatesi when TARGET_64BIT and !ISA_HAS_EXTS.
On mips64-elf with -mabi=64 the following are now generated for prepareNeedle:
-O2
        sll     $5,$5,16
        jr      $31
        or      $2,$5,$4
-Os
        dsll    $5,$5,16
        or      $2,$4,$5
        dsll    $2,$2,32
        jr      $31
        dsra    $2,$2,32
-O2 -march=octeon2
        move    $2,$0
        ins     $2,$5,16,16
        jr      $31
        ins     $2,$4,0,16
-Os -march=octeon2
        move    $2,$0
        dins    $2,$4,0,16
        dins    $2,$5,16,16
        jr      $31
        sll     $2,$2,0
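The dsll/dsra pair in the -Os listing is the usual way to sign-extend the low 32 bits of a 64-bit register. The sketch below models that in C++, assuming two's complement conversions and arithmetic right shift of signed values (guaranteed from C++20, and what common compilers do before that). `combine_halves` is a hypothetical reconstruction of what the listings compute, not the source of prepareNeedle.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend the low 32 bits of X to 64 bits, written as the shift pair
// from the -Os listing (dsll by 32, then dsra by 32).
std::int64_t
sext32_shifts (std::uint64_t x)
{
  // Left shift on the unsigned value, then an arithmetic right shift.
  return static_cast<std::int64_t> (x << 32) >> 32;
}

// The same result via a narrowing conversion.
std::int64_t
sext32_cast (std::uint64_t x)
{
  return static_cast<std::int32_t> (static_cast<std::uint32_t> (x));
}

// Hypothetical reconstruction of the operation in the listings: put LO in
// bits 0..15 and HI in bits 16..31, then keep the 32-bit result
// sign-extended in a 64-bit register.
std::int64_t
combine_halves (std::uint16_t lo, std::uint16_t hi)
{
  std::uint64_t raw = (static_cast<std::uint64_t> (hi) << 16) | lo;
  return sext32_shifts (raw);
}
```

When bit 15 of `hi` is set, bit 31 of the combined value is set and the upper 32 bits of the result must all be ones; dropping the extension leaves them zero, which is the kind of wrong code the PR describes.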
Many thanks to Mateusz Marciniec and Jeff Law for additional testing.
gcc/ChangeLog
PR middle-end/118608
* expr.cc (store_field_updates_msb_p): New helper function that
now also handles BYTES_BIG_ENDIAN targets.
(expand_assignment): Use the above function when deciding to emit
a required sign/zero extension.
* rtlanal.cc (truncated_to_mode): Call targetm.mode_rep_extended
to check whether an explicit TRUNCATE is required (i.e. performs
an extension) on this target.
* config/mips/mips.md (*extenddi_truncate<mode>): Handle all
SUBDI modes, not just SHORT modes.
gcc/testsuite/ChangeLog
PR middle-end/118608
* gcc.target/mips/pr118608-1.c: New test case.
* gcc.target/mips/pr118608-2.c: Likewise.
* gcc.target/mips/pr118608-3.c: Likewise.
* gcc.target/mips/pr118608-4.c: Likewise.
Roger Sayle [Tue, 3 Feb 2026 17:06:25 +0000 (17:06 +0000)]
PR middle-end/123826: __builtin_pow vs. errno.
This is my proposed solution to PR middle-end/123826. Initially
I thought this would be a "one line change", adding a test for
flag_errno_math to gimple_expand_builtin_pow. Unfortunately
this revealed a second later problem, where pow (with constant
arguments) was still getting evaluated at compile-time, even when
the result is known to overflow.
It's ancient history, but shortly after I added support for pow
as a builtin, I contributed code to evaluate it at compile-time
when the exponent is an integer constant. Since then we now use
MPFR to evaluate libm math functions at compile-time. However
the vestigial call to evaluate pow via real_powi still exists,
and gets invoked after do_mpfr_arg2/mpfr_pow correctly determines
that we shouldn't evaluate pow at compile-time. This patch reorganizes
fold_const_pow paying attention to signaling NaNs (PR 61441)
and flag_errno_math. Most importantly normal cases like pow(2.0,3.0)
and pow(3.0,0.5) still get evaluated at compile-time.
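The intent can be illustrated with a small guard function. This is a simplified model and not fold_const_pow: `can_fold_pow` is a hypothetical name, it uses the host `std::pow` rather than MPFR, and it only models overflow and pole errors (a non-finite result from finite operands), not underflow or signaling NaNs.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical folding guard: decide whether pow (x, y) with constant
// arguments may be replaced by its value.  With ERRNO_MATH set, a result
// for which the library call would set errno (modelled here as a
// non-finite result from finite inputs) must not be folded.
bool
can_fold_pow (double x, double y, bool errno_math, double *result)
{
  assert (result != nullptr);
  if (std::isnan (x) || std::isnan (y))
    return false;  // conservatively leave NaN operands alone
  double r = std::pow (x, y);
  if (errno_math && !std::isfinite (r)
      && std::isfinite (x) && std::isfinite (y))
    return false;
  *result = r;
  return true;
}
```

With this guard pow(2.0,3.0) still folds to 8.0, while pow(10.0,1000.0) is left as a call when errno must be honored.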
2026-02-03 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
PR middle-end/123826
* tree-ssa-math-opts.cc (gimple_expand_builtin_pow): Add test
for flag_errno_math.
* fold-const-call.cc (fold_const_pow): Reorganize, eliminating
call to real_powi, and letting do_mpfr_arg2 do all the heavy
lifting.
gcc/testsuite/ChangeLog
PR middle-end/123826
* gcc.dg/errno-2.c: New test case.
* gcc.dg/errno-3.c: Likewise.
Christophe Lyon [Tue, 3 Feb 2026 13:07:10 +0000 (13:07 +0000)]
testsuite: arm: Make arm_neon_ok_nocache consistent with other effective-targets
A recent patch updated arm_neon_ok_nocache with an improved list of
flags to try, but was not consistent with
arm_v8_3a_complex_neon_ok_nocache, arm_v8_2a_fp16_neon_ok_nocache, and
arm_v8_3a_fp16_complex_neon_ok_nocache.
This patch inserts "-mcpu=unset -march=armv7-a+simd -mfpu=auto" before
we try to force -mfloat-abi=softfp and -mfloat-abi=hard.
Tested on aarch64-linux-gnu, arm-linux-gnueabihf and several flavors
of arm-none-eabi with non-default configurations.
Jeff Law [Tue, 3 Feb 2026 14:24:53 +0000 (07:24 -0700)]
[RISC-V][PR rtl-optimization/123322] Split general conditional moves into simpler sequences
This has gone round and round a few times... But I think we're finally at a
resolution of the regression. There's more work to do for gcc-17 though.
---
At its core this regression is caused by more accurately costing branches
during if-conversion. The more accurate cost causes the multi-set path during
the first ifcvt pass to (profitably) convert the sequence. However, it would
have been more profitable to have waited for things to be cleaned up and by the
2nd ifcvt pass we'd be able to convert using the condzero path which is *more*
profitable than the multi-set code (which is what happens in gcc-15).
One possible solution would be to disable the multi-set ifcvt code during its
first pass. That was evaluated and ultimately rejected because of clear
undesirable impacts. It may still be a good thing to do, but certainly not at
this stage of gcc-16.
Another path that was considered was recognizing the sign extension in the
if-convertable block and avoiding the multi-set path in that case on the
assumption it'd be cleaned up later and the condzero path would be used. That
felt way too target specific and optimistic about removal of the sign
extension.
Daniel looked at if-converting these cases during phiopt; that generally looks
like a good thing to do, but had notable undesirable fallout on RISC-V and I
was significantly worried it would re-introduce sequences on x86 and aarch64
that we just fixed. ie, solving one regression and creating another. So that
feels like a gcc-17 evaluation.
After much head-banging it occurred to me that we could tackle this in the
target with a tiny bit of help from combine. These conditional ops are 4
instruction sequences. The operation, a pair of czeros and an add/ior to
select between the output of the two czeros. Combine is pretty conservative in
when it chooses to try 4 insn sequences.
So step #1 was to loosen the restrictions on combine a little bit. If it sees
an IF_THEN_ELSE, then it considers that "good" which in turn allows us to try 4
insn combinations a bit more often.
That exposes the potential for 4 insn combinations and since the IOR/XOR case
is a 4->2 sequence fixing those regressions is just a good define_split. In
fact, we can use that splitter for any case where the neutral operand is zero
(IOR, XOR, PLUS, MINUS, etc). We need a second pattern to handle reversed
operands since we're using code iterators and match_dup for the matching
operand rather than something like a matching constraint.
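Why the zero-neutral case collapses from four operations to two can be seen by modelling the Zicond instructions directly. The czero semantics below follow the Zicond definition; `cond_add_4` and `cond_add_2` are hypothetical names and show PLUS only, as one instance of the ops listed above.

```cpp
#include <cassert>
#include <cstdint>

// Zicond semantics: czero.eqz yields 0 when the condition is zero,
// czero.nez yields 0 when the condition is nonzero; otherwise the value.
constexpr std::uint64_t
czero_eqz (std::uint64_t value, std::uint64_t cond)
{
  return cond == 0 ? 0 : value;
}

constexpr std::uint64_t
czero_nez (std::uint64_t value, std::uint64_t cond)
{
  return cond != 0 ? 0 : value;
}

// General 4-operation form of `cond ? a + b : a': the operation, a pair of
// czeros, and an add to select between the outputs of the two czeros.
constexpr std::uint64_t
cond_add_4 (std::uint64_t a, std::uint64_t b, std::uint64_t cond)
{
  std::uint64_t sum = a + b;
  std::uint64_t t = czero_eqz (sum, cond);  // sum if cond, else 0
  std::uint64_t f = czero_nez (a, cond);    // a if !cond, else 0
  return t + f;
}

// 2-operation form: zero is neutral for +, so zero out B and add.
constexpr std::uint64_t
cond_add_2 (std::uint64_t a, std::uint64_t b, std::uint64_t cond)
{
  return a + czero_eqz (b, cond);
}

static_assert (cond_add_4 (5, 3, 1) == cond_add_2 (5, 3, 1), "taken");
static_assert (cond_add_4 (5, 3, 0) == cond_add_2 (5, 3, 0), "not taken");
```

The same rewrite works for any operation with `a op 0 == a`, which is what the "neutral operand is zero" wording refers to.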
That still leaves conditional shifts regressing due to the same problem. For
shifts/rotates, the count is in QImode, so we need a different pattern to
handle those cases, but it has the same basic structure and behavior.
AND was *still* regressing though. That would require a 4->3 split which isn't
supported by combine. Given the issues with the other paths attempted, I
ultimately decided on a define_insn_and_split (boo!). This shouldn't be
happening a ton like mvconst_internal, so not great, but workable. This also
required recognizing the pattern and giving it a suitable cost (COSTS_N_INSNS
(3)).
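AND differs because zero is not its neutral operand (`a & 0` is 0, not `a`), so zeroing one input does not work. One way to get down to three operations is sketched below. It is an assumption made for illustration, and the sequence the patch actually emits may differ; `cond_and_3` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// Zicond semantics: czero.nez yields 0 when the condition is nonzero.
constexpr std::uint64_t
czero_nez (std::uint64_t value, std::uint64_t cond)
{
  return cond != 0 ? 0 : value;
}

// One possible 3-operation form of `cond ? a & b : a' (illustrative only):
//   t = czero.nez a, cond   -> cond ? 0 : a
//   t = t | b               -> cond ? b : (a | b)
//   r = a & t               -> cond ? (a & b) : a
constexpr std::uint64_t
cond_and_3 (std::uint64_t a, std::uint64_t b, std::uint64_t cond)
{
  std::uint64_t t = czero_nez (a, cond);
  t |= b;
  return a & t;
}

static_assert (cond_and_3 (0xF0, 0x3C, 1) == 0x30, "taken: a & b");
static_assert (cond_and_3 (0xF0, 0x3C, 0) == 0xF0, "not taken: a");
```

The not-taken case relies on the identity `a & (a | b) == a`.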
That fixes the regressions. We're still not generating ideal code for rv64
when the operands are 32bit quantities and the architecture provides an
instruction variant that sign extends from 32 to 64 bits (add, subtract, shifts
and rotates). But the code we generate for gcc-16 is better than we were
generating for gcc-15, it's just not optimal. So there's a TODO for gcc-17 as
well.
This was bootstrapped and regression tested on x86, alpha, riscv, armv7, hppa,
maybe others, but those definitely for sure. It was also tested on the various
crosses without regressions. Waiting on pre-commit testing to render a verdict
before going forward.
PR rtl-optimization/123322
gcc/
* combine.cc (try_combine): Consider an IF_THEN_ELSE "good" when
evaluating if 4 insn combinations should be tried.
* config/riscv/iterators.md (zero_is_neutral_op): New iterator.
(zero_is_neutral_op_c): Likewise.
(any_shift_rotate): Likewise.
* config/riscv/riscv.cc (riscv_rtx_costs): Recognize the conditional
AND RTL and cost it appropriately.
* config/riscv/zicond.md: Add patterns to rewrite general conditional
move sequences into simpler forms.
gcc/testsuite
* gcc.target/riscv/pr123322.c: New test.
Marek Polacek [Mon, 2 Feb 2026 23:09:08 +0000 (18:09 -0500)]
c++/reflection: refactor compare_reflections
In <https://gcc.gnu.org/pipermail/gcc-patches/2026-January/705756.html>
Jason suggested using cp_tree_equal for all exprs in compare_reflections.
This patch does so. We just have to handle comparing annotations and
types specially, then we can use cp_tree_equal for the rest. It just
had to be taught not to crash on unequal NAMESPACE_DECLs.
gcc/cp/ChangeLog:
* reflect.cc (compare_reflections): Handle comparing annotations
and types specially, use cp_tree_equal for the rest.
* tree.cc (cp_tree_equal) <case NAMESPACE_DECL>: New.
I've totally missed that the P3491R3 paper (define_static_{string,object,array})
comes with its own feature test macro, __cpp_lib_define_static 202506,
which should appear in <version> and <meta>.
The paper contains 3 parts, std::is_string_literal,
std::meta::reflect_constant_{string,array} and
std::define_static_{string,object,array}.
The first part is implementable without reflection, the third part in theory
would be also implementable without reflection but usually will be (and in
libstdc++ is) implemented using reflection, and the middle part is really
part of reflection. So dunno how useful this FTM actually is, maybe just
for cases where some implementation does implement reflection and doesn't
implement this paper for a while.
Anyway, the FTM is in C++26 draft, so this patch adds it, with the same
condition as __cpp_lib_reflection.
2026-02-03 Jakub Jelinek <jakub@redhat.com>
PR libstdc++/123921
* include/bits/version.def (define_static): New with the
same values as reflection.
* include/bits/version.h: Regenerate.
* include/std/meta: Define also __glibcxx_want_define_static before
including bits/version.h.
* g++.dg/reflect/feat2.C: Add also test for __cpp_lib_define_static.
* g++.dg/reflect/feat3.C: New test.
This patch extends omp_target_is_accessible to check the actual device status
for the memory region, on amdgcn and nvptx devices (rather than just checking
if shared memory is enabled).
In both cases, we check the status of each 4k region within the given memory
range (assuming 4k pages should be safe for all the currently supported hosts)
and return true if all of the pages report accessible.
The testcases have been modified to check that allocations marked accessible
actually are accessible (inaccessibility can't be checked without invoking
memory faults), and to understand that some parts of an array can be accessible
but other parts not (I have observed this intermittently for the stack memory
on amdgcn using the Fortran testcase, which can have the allocation span pages).
There are also new testcases for the various other memory modes, and for managed
memory.
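The per-page walk can be sketched as below. This is not the plugin code: `range_is_accessible` and the `page_ok` callback are hypothetical, with `page_ok` standing in for whatever per-device query the plugin performs. How an empty range is treated is an arbitrary choice made for the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>

// Assumed page granularity, as in the text.
constexpr std::uintptr_t page_size = 4096;

// Returns true if every 4k page overlapping [ptr, ptr + size) is reported
// accessible by PAGE_OK, which receives the page-aligned address.
bool
range_is_accessible (std::uintptr_t ptr, std::size_t size,
                     const std::function<bool (std::uintptr_t)> &page_ok)
{
  if (size == 0)
    return page_ok (ptr & ~(page_size - 1));  // sketch choice: check one page
  assert (size - 1 <= UINTPTR_MAX - ptr);     // range must not wrap around
  std::uintptr_t first = ptr & ~(page_size - 1);
  std::uintptr_t last = (ptr + size - 1) & ~(page_size - 1);
  for (std::uintptr_t page = first;; page += page_size)
    {
      if (!page_ok (page))
        return false;
      if (page == last)
        return true;
    }
}
```

An allocation that straddles a page boundary is reported inaccessible as soon as one of its pages fails the query, which matches the observation above that parts of an array can be accessible while other parts are not.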
include/ChangeLog:
* cuda/cuda.h (CUpointer_attribute): New enum.
(cuPointerGetAttribute): New prototype.
libgomp/ChangeLog:
PR libgomp/121813
PR libgomp/113213
* libgomp-plugin.h (GOMP_OFFLOAD_is_accessible_ptr): New prototype.
* libgomp.h
(struct gomp_device_descr): Add GOMP_OFFLOAD_is_accessible_ptr.
* libgomp.texi: Update omp_target_is_accessible docs.
* plugin/cuda-lib.def (cuPointerGetAttribute): New entry.
* plugin/plugin-gcn.c (struct hsa_runtime_fn_info): Add
hsa_amd_svm_attributes_get_fn and hsa_amd_pointer_info_fn.
(init_hsa_runtime_functions): Add hsa_amd_svm_attributes_get and
hsa_amd_pointer_info.
(enum accessible): New enum type.
(host_memory_is_accessible): New function.
(device_memory_is_accessible): New function.
(GOMP_OFFLOAD_is_accessible_ptr): New function.
* plugin/plugin-nvptx.c (GOMP_OFFLOAD_is_accessible_ptr): Likewise.
* target.c (omp_target_is_accessible): Call is_accessible_ptr_func.
(gomp_load_plugin_for_device): Add is_accessible_ptr.
* testsuite/libgomp.c-c++-common/target-is-accessible-1.c: Rework
to match more details of the GPU implementation.
* testsuite/libgomp.fortran/target-is-accessible-1.f90: Likewise.
* testsuite/libgomp.c-c++-common/target-is-accessible-2.c: New test.
* testsuite/libgomp.c-c++-common/target-is-accessible-3.c: New test.
* testsuite/libgomp.c-c++-common/target-is-accessible-4.c: New test.
* testsuite/libgomp.c-c++-common/target-is-accessible-5.c: New test.
afunix.h was first included in MinGW 10.0.0. For earlier versions, fall
back to internal definition.
gcc/ChangeLog:
* config.in: Regenerate.
* configure: Regenerate.
* configure.ac (gcc_cv_header_afunix_h): New test.
(HAVE_AFUNIX_H): New AC_DEFINE.
* diagnostics/sarif-sink.cc: Conditionally include afunix.h
if it's available, else fall back to internal definition.
Richard Biener [Tue, 3 Feb 2026 08:26:01 +0000 (09:26 +0100)]
ipa/123416 - fix IPA modref summary merging after inlining
There's a typo in the condition skipping load collapsing when
there's no callee modref summary. We do have to collapse loads
for the destination iff the callee performs any loads which includes
when the callee is ECF_PURE. The LTO summary part already gets
this correct.
PR ipa/123416
* ipa-modref.cc (ipa_merge_modref_summary_after_inlining):
Fix typo in condition for load merging when no callee summary.
Jakub Jelinek [Tue, 3 Feb 2026 08:19:19 +0000 (09:19 +0100)]
c++: Fix UB in eval_data_member_spec [PR123920]
We can overflow buffer in eval_data_member_spec on some initializers.
The code has 2 loops, one to figure out the needed length
of the buffer and diagnose errors
  unsigned HOST_WIDE_INT l = 0;
  bool ntmbs = false;
  FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (f), k, field, value)
    if (!tree_fits_shwi_p (value))
      goto fail;
    else if (field == NULL_TREE)
      {
        if (integer_zerop (value))
          {
            ntmbs = true;
            break;
          }
        ++l;
      }
    else if (TREE_CODE (field) == RANGE_EXPR)
      {
        tree lo = TREE_OPERAND (field, 0);
        tree hi = TREE_OPERAND (field, 1);
        if (!tree_fits_uhwi_p (lo) || !tree_fits_uhwi_p (hi))
          goto fail;
        if (integer_zerop (value))
          {
            l = tree_to_uhwi (lo);
            ntmbs = true;
            break;
          }
        l = tree_to_uhwi (hi) + 1;
      }
    else if (tree_fits_uhwi_p (field))
      {
        l = tree_to_uhwi (field);
        if (integer_zerop (value))
          {
            ntmbs = true;
            break;
          }
        ++l;
      }
    else
      goto fail;
  if (!ntmbs || l > INT_MAX - 1)
    goto fail;
which assumes that if there are designators, they need to be ascending,
so no { [10] = 1, [2] = 3, [6] = 4, [4] = 5 } like in C11, and stops on the
first '\0' seen (but remembers in l the index of that). Here we need the
3 integer_zerop calls because the exact setting of l depends on whether it
is an elt without index, or with RANGE_EXPR index, or with normal
INTEGER_CST index. And then there is a second loop which just stores the
values into the allocated buffer and relies on the checking the first loop
did. Now, for CONSTRUCTORs without indexes or with RANGE_EXPR only it
also correctly stops at '\0', but I forgot to check that in the last
case where index is INTEGER_CST:
  char *namep;
  unsigned len = l;
  if (l < 64)
    namep = XALLOCAVEC (char, l + 1);
  else
    namep = XNEWVEC (char, l + 1);
  memset (namep, 0, l + 1);
  l = 0;
  FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (f), k, field, value)
    if (field == NULL_TREE)
      {
        if (integer_zerop (value))
          break;
        namep[l] = tree_to_shwi (value);
        ++l;
      }
    else if (TREE_CODE (field) == RANGE_EXPR)
      {
        tree lo = TREE_OPERAND (field, 0);
        tree hi = TREE_OPERAND (field, 1);
        if (integer_zerop (value))
          break;
        unsigned HOST_WIDE_INT m = tree_to_uhwi (hi);
        for (l = tree_to_uhwi (lo); l <= m; ++l)
          namep[l] = tree_to_shwi (value);
      }
    else
      {
        l = tree_to_uhwi (field);
        namep[l++] = tree_to_shwi (value);
      }
  namep[len] = '\0';
So, I could add if (integer_zerop (value)) break; into the else
block, but in this loop there is no reason not to check integer_zerop
(value) first, because it is handled the same way in all 3 cases.
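The idea of the fix can be modelled outside the compiler. The following is a minimal standalone sketch, not the reflect.cc code: Elt, name_length and extract_name are hypothetical names, std::optional stands in for a missing or INTEGER_CST index, and RANGE_EXPR is left out. Unlike the real first loop, the sketch rejects descending designators explicitly so that it is safe on its own.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for one CONSTRUCTOR element: an optional explicit
// index (models the INTEGER_CST designator case) and a character value.
struct Elt {
  std::optional<std::size_t> index;  // empty models field == NULL_TREE
  char value;
};

// First pass: find the index of the first '\0'.  Returns nullopt if there
// is no terminator or if designators are not ascending.
std::optional<std::size_t> name_length(const std::vector<Elt> &elts) {
  std::size_t l = 0;
  for (const Elt &e : elts) {
    if (e.index) {
      if (*e.index < l)
        return std::nullopt;  // descending designator, rejected in this sketch
      l = *e.index;
    }
    if (e.value == '\0')
      return l;
    ++l;
  }
  return std::nullopt;
}

// Second pass: test for the terminator once, before looking at the kind of
// index, so every case stops at the first '\0'.
std::optional<std::string> extract_name(const std::vector<Elt> &elts) {
  std::optional<std::size_t> len = name_length(elts);
  if (!len)
    return std::nullopt;
  std::string name(*len, '\0');
  std::size_t l = 0;
  for (const Elt &e : elts) {
    if (e.value == '\0')
      break;  // single check shared by all cases
    if (e.index)
      l = *e.index;
    name[l++] = e.value;
  }
  return name;
}
```

With elements { [0] = 'a', [1] = 'b', [2] = '\0', [3] = 'c' }, the first pass computes a length of 2. If the break were missing in the indexed case, the second pass would go on to store 'c' at index 3, past the end of the buffer.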
2026-02-03 Jakub Jelinek <jakub@redhat.com>
PR c++/123920
* reflect.cc (eval_data_member_spec): Break out of the loop
if value is integer_zerop even in the field
&& TREE_CODE (field) != RANGE_EXPR case and use a single
test for integer_zerop (value) in the whole loop.
Jakub Jelinek [Tue, 3 Feb 2026 08:18:34 +0000 (09:18 +0100)]
c++: Don't call cpp_translate_string on NULL string [PR123918]
My P2246R1 patch caused diagnostics to be reported when running a
ubsan-instrumented compiler on cpp26/static_assert1.C - if len is 0,
we don't bother to allocate msg, so it stays NULL, and when I added the
cpp_translate_string call, that can invoke memcpy (something, NULL, 0);
in that case.
While that is no longer UB in C2Y since N3322, libsanitizer doesn't
know that yet and reports it anyway.
While we could just do
  if (len)
    {
      ...
    }
  else
    msg = "";
there is really no point in trying to translate "" and allocate memory
for that, so the following patch instead bypasses that translation for
len == 0.
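The shape of that bypass can be shown with a small standalone sketch. It is not the semantics.cc code: translate and extract_message are hypothetical names, and translate only stands in for a step that ends up calling memcpy on its source pointer.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for a translation step that copies LEN bytes
// from SRC.  Before C2Y, passing a null SRC to memcpy is undefined
// behavior even when LEN is 0.
std::string translate(const char *src, std::size_t len) {
  std::string out(len, '\0');
  std::memcpy(out.data(), src, len);
  return out;
}

// Bypass the translation entirely for an empty message, so a null MSG
// with LEN == 0 never reaches memcpy.
std::string extract_message(const char *msg, std::size_t len) {
  if (len == 0)
    return std::string();
  return translate(msg, len);
}
```

Checking the length in the caller keeps the empty case free of both the allocation and the questionable memcpy call.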
2026-02-03 Jakub Jelinek <jakub@redhat.com>
PR c++/123918
* semantics.cc (cexpr_str::extract): Bypass cpp_translate_string
for len == 0.
Eric Botcazou [Tue, 3 Feb 2026 07:45:23 +0000 (08:45 +0100)]
Ada: Fix couple of small accessibility glitches
The first glitch is that the ACATS test c3a0025 does not pass in Ada 2005
because an accessibility check preempts a null access check. The second
glitch is that there should be no differences in Ada 2012 and later for
the test, in other words there is a missing accessibility check failure.
The second glitch comes from a thinko in the new implementation of the
In_Return_Value predicate, which has incorrectly dropped the handling of
assignments to return objects.
The first glitch is fixed by swapping the order of null access checks and
accessibility checks for conversions, which requires adding a small guard
to Apply_Discriminant_Check.
gcc/ada/
* checks.adb (Apply_Discriminant_Check): Bail out for a source type
that is a class-wide type whose root type has no discriminants.
* exp_ch4.adb (Expand_N_Type_Conversion): If the target type is an
access type, emit null access checks before accessibility checks.
* sem_util.adb (In_Return_Value): Deal again with assignments to
return objects.
Xi Ruoyao [Thu, 29 Jan 2026 09:08:02 +0000 (17:08 +0800)]
LoongArch: rework copysign and xorsign implementation
The copysign and xorsign implementation had two significant bugs:
1. The GCC Internals documentation explicitly says the IOR, XOR, and AND
optabs are only for fixed-point modes, i.e. they cannot be used for
floating-point modes.
2. The handling of "%V" uses a very nasty way to pun a floating-point
const value to its integer representation, invoking undefined behavior
on 32-bit hosts by shifting a "long" left by 32 bits. In fact
lowpart_subreg handles punning of const values correctly, even though
its name contains "reg".
Fix the bugs by using lowpart_subreg to pun the modes in the expanders.
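As a host-side analogy only (this is not the RTL expander code), the same approach of punning to an integer view first and then masking can be sketched in standalone C++. The names bits_of, from_bits, copysign_bits and xorsign_bits are hypothetical, and the sketch assumes double is IEEE 754 binary64.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == sizeof(std::uint64_t),
              "sketch assumes a 64-bit double");

// Pun a double to its integer representation without undefined behavior.
std::uint64_t bits_of(double d) {
  std::uint64_t u;
  std::memcpy(&u, &d, sizeof u);
  return u;
}

// Pun an integer representation back to a double.
double from_bits(std::uint64_t u) {
  double d;
  std::memcpy(&d, &u, sizeof d);
  return d;
}

// Built from a fixed-width 64-bit type, so the shift is well defined even
// on hosts where "long" is only 32 bits wide.
constexpr std::uint64_t sign_mask = std::uint64_t{1} << 63;

// Magnitude of MAG combined with the sign of SGN, using integer AND/IOR.
double copysign_bits(double mag, double sgn) {
  return from_bits((bits_of(mag) & ~sign_mask) | (bits_of(sgn) & sign_mask));
}

// X with its sign flipped when Y is negative, using integer XOR.
double xorsign_bits(double x, double y) {
  return from_bits(bits_of(x) ^ (bits_of(y) & sign_mask));
}
```

All bitwise operations here are applied to the integer view, which mirrors the constraint that the AND, IOR and XOR patterns are defined for integer modes only.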
gcc/
* config/loongarch/predicates.md (const_vector_neg_fp_operand):
New define_predicate.
(reg_or_vector_neg_fp_operand): New define_predicate.
* config/loongarch/lasx.md (copysign<mode>3): Remove.
(xorsign<mode>3): Remove.
* config/loongarch/lsx.md (copysign<mode>3): Remove.
(@xorsign<mode>3): Remove.
* config/loongarch/simd.md (copysign<mode>3): New define_expand.
(@xorsign<mode>3): New define_expand.
(and<mode>3): Only allow IVEC instead of ALLVEC.
(ior<mode>3): Likewise.
(xor<mode>3): Likewise.
* config/loongarch/loongarch.cc (loongarch_print_operand): No
longer allow floating-point vector constants for %V.
(loongarch_const_vector_bitimm_set_p): Always return false for
floating-point vector constants.
(loongarch_build_signbit_mask): Factor out force_reg.
(loongarch_emit_swrsqrtsf): Use integer vector mode instead of
floating-point vector mode when masking zero inputs.