optabs had a local function called lowpart_subreg_maybe_copy
that is very similar to the lowpart version of force_subreg.
This patch adds a force_lowpart_subreg wrapper around
force_subreg.
The only difference between the old and new functions is that
the old one asserted success while the new one doesn't.
It's common not to assert elsewhere when taking subregs;
normally a null result is enough.
Later patches will make more use of the new function.
gcc/
* explow.h (force_lowpart_subreg): Declare.
* explow.cc (force_lowpart_subreg): New function.
While adding more uses of force_subreg, I realised that it should
be more careful to emit no instructions on failure. This kind of
failure should be very rare, so I don't think it's a case worth
optimising for.
gcc/
* explow.cc (force_subreg): Emit no instructions on failure.
Jonathan Wakely [Tue, 15 Apr 2025 13:00:23 +0000 (14:00 +0100)]
libstdc++: Do not define __cpp_lib_ranges_iota in <ranges>
In r14-7153-gadbc46942aee75 we removed a duplicate definition of
__glibcxx_want_ranges_iota from <ranges>, but __cpp_lib_ranges_iota
should not be defined in <ranges> at all.
libstdc++-v3/ChangeLog:
* include/std/ranges (__glibcxx_want_ranges_iota): Do not
define.
Michael Levine [Fri, 7 Jun 2024 08:54:38 +0000 (09:54 +0100)]
libstdc++: Fix std::ranges::iota is not included in numeric [PR108760]
Before this patch, using std::ranges::iota required including
<algorithm> when it should have been sufficient to only include
<numeric>.
For the backport to the release branch ranges::iota is defined in
<bits/ranges_algobase.h> so that it's available in both <numeric> and
<algorithm>. This avoids breaking code that compiles successfully using
existing releases where <algorithm> defines ranges::iota.
libstdc++-v3/ChangeLog:
PR libstdc++/108760
* include/bits/ranges_algo.h (ranges::out_value_result)
(ranges::iota_result, ranges::__iota_fn, ranges::iota): Move to
<bits/ranges_algobase.h>.
* include/bits/ranges_algobase.h (ranges::out_value_result)
(ranges::iota_result, ranges::__iota_fn, ranges::iota): Move to
here.
* include/std/numeric: Include <bits/ranges_algobase.h>.
* testsuite/25_algorithms/iota/1.cc: Renamed to ...
* testsuite/26_numerics/iota/2.cc: ... here.
Jonathan Wakely [Thu, 27 Feb 2025 13:27:17 +0000 (13:27 +0000)]
libstdc++: Fix ranges::move and ranges::move_backward to use iter_move [PR105609]
The ranges::move and ranges::move_backward algorithms are supposed to
use ranges::iter_move(iter) instead of std::move(*iter), which matters
for an iterator type with an iter_move overload findable by ADL.
Currently those algorithms use std::__assign_one which uses std::move,
so define a new ranges::__detail::__assign_one helper function that uses
ranges::iter_move.
libstdc++-v3/ChangeLog:
PR libstdc++/105609
* include/bits/ranges_algobase.h (__detail::__assign_one): New
helper function.
(__copy_or_move, __copy_or_move_backward): Use new function
instead of std::__assign_one.
* testsuite/25_algorithms/move/constrained.cc: Check that
ADL iter_move is used in preference to std::move.
* testsuite/25_algorithms/move_backward/constrained.cc:
Likewise.
Jonathan Wakely [Mon, 14 Oct 2024 22:34:20 +0000 (23:34 +0100)]
libstdc++: Reuse std::__assign_one in <bits/ranges_algobase.h>
Use std::__assign_one instead of ranges::__assign_one. Adjust the uses,
because std::__assign_one has the arguments in the opposite order (the
same order as an assignment expression).
libstdc++-v3/ChangeLog:
* include/bits/ranges_algobase.h (ranges::__assign_one): Remove.
(__copy_or_move, __copy_or_move_backward): Use std::__assign_one
instead of ranges::__assign_one.
Jonathan Wakely [Sun, 13 Oct 2024 18:14:04 +0000 (19:14 +0100)]
libstdc++: Fix ranges::copy_backward for a single memcpyable element [PR117121]
The result iterator needs to be decremented before writing to it.
Improve the PR 108846 tests for all of std::copy, std::copy_n,
std::copy_backward, and the std::ranges versions.
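A minimal sketch of the single-element case, assuming plain pointers as
iterators. copy_backward_one and last_slot_after_copy are hypothetical
names used only for illustration; this is not the libstdc++ code.

```cpp
// Hypothetical stand-in for the one-element path of a backward copy.
// 'result' is the END of the destination, so it must be decremented
// before it is written through.
template <typename BidirIn, typename BidirOut>
constexpr BidirOut copy_backward_one(BidirIn last, BidirOut result) {
    --last;
    --result;          // step back first ...
    *result = *last;   // ... then assign the single element
    return result;
}

// Copies one element into the last slot of a three-element buffer and
// returns that slot's value, or -1 if anything else was touched.
constexpr int last_slot_after_copy() {
    int src[1] = {7};
    int dst[3] = {0, 0, 0};
    int* out = copy_backward_one(src + 1, dst + 3);
    return (out == dst + 2 && dst[2] == 7 && dst[1] == 0 && dst[0] == 0)
               ? dst[2]
               : -1;
}
```

Writing through 'result' without the decrement would store one past the
end of the destination.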
libstdc++-v3/ChangeLog:
PR libstdc++/117121
* include/bits/ranges_algobase.h (copy_backward): Decrement
output iterator before assigning one element through it.
* testsuite/25_algorithms/copy/108846.cc: Ensure the algorithm's
effects are correct for a single memcpyable element.
* testsuite/25_algorithms/copy_backward/108846.cc: Likewise.
* testsuite/25_algorithms/copy_n/108846.cc: Likewise.
libstdc++: Do not use memmove for 1-element ranges [PR108846,PR116471]
This commit ports the fixes already applied by r13-6372-g822a11a1e642e0
to the range-based versions of copy/move algorithms.
When doing so, a further bug (PR116471) was discovered in the
implementation of the range-based algorithms: although the algorithms
are already constrained by the indirectly_copyable/movable concepts,
there was a failing static_assert in the memmove path.
This static_assert checked that iterator's value type was assignable by
using the is_copy_assignable (move) type traits. However, this is a
problem, because the traits are too strict when checking for constness;
a type like
struct S { S& operator=(S &) = default; };
is trivially copyable (and thus could benefit from the memmove path),
but it does not satisfy is_copy_assignable because the operator takes
by non-const reference.
Now, the reason for the check to be there is because a type with
a deleted assignment operator like
struct E { E& operator=(const E&) = delete; };
is still trivially copyable, but not assignable. We don't want
algorithms like std::ranges::copy to compile because they end up
selecting the memmove path, "ignoring" the fact that E isn't even
copy assignable.
But the static_assert isn't needed here any longer: as noted before,
the ranges algorithms already have the appropriate constraints; and
even if they didn't, there's now a non-discarded codepath to deal with
ranges of length 1 where there is an explicit assignment operation.
Therefore, this commit removes it. (In fact, r13-6372-g822a11a1e642e0
removed the same static_assert from the non-ranges algorithms.)
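The two example types above can be checked directly with the standard
type traits. The following is only a compile-time illustration of the
properties described in this message, not code from the patch:

```cpp
#include <type_traits>

// Copy assignment takes a non-const reference.
struct S { S& operator=(S&) = default; };

// Copy assignment is deleted.
struct E { E& operator=(const E&) = delete; };

// S could use the memmove path, yet the trait rejects it because
// is_copy_assignable tests assignment from a const lvalue.
static_assert(std::is_trivially_copyable_v<S>);
static_assert(!std::is_copy_assignable_v<S>);

// E is still trivially copyable even though it cannot be assigned at all.
static_assert(std::is_trivially_copyable_v<E>);
static_assert(!std::is_copy_assignable_v<E>);
```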
libstdc++-v3/ChangeLog:
PR libstdc++/108846
PR libstdc++/116471
* include/bits/ranges_algobase.h (__assign_one): New helper
function.
(__copy_or_move): Remove a spurious static_assert; use
__assign_one for memcpyable ranges of length 1.
(__copy_or_move_backward): Likewise.
* testsuite/25_algorithms/copy/108846.cc: Extend to range-based
algorithms, and cover both memcpyable and non-memcpyable
cases.
* testsuite/25_algorithms/copy_backward/108846.cc: Likewise.
* testsuite/25_algorithms/copy_n/108846.cc: Likewise.
* testsuite/25_algorithms/move/108846.cc: Likewise.
* testsuite/25_algorithms/move_backward/108846.cc: Likewise.
RISC-V: revert pr114194 tests on gcc-14 [PR118601]
The gcc-14 backport that split the pr114194 testcase for rv32 and rv64
would only generate the expected rv32 sequence if commit 6b315907c0353f71169a7555e653d29a981fef67 had also been backported, but
it wasn't. Without it, we get the same code as before on both rv32
and rv64, so revert to the original test.
The pr118182-2.c testcase backported from gcc-15 depended on the late
combine pass after register allocation to substitute the zero constant
into the pred_broadcast to get to the expected vmv.s.x instruction.
Without that pass, we get a vfmv.s.f instead. Expect that on gcc-14.
Andrew Pinski [Sat, 15 Mar 2025 23:37:41 +0000 (16:37 -0700)]
discriminators: Fix assigning discriminators on edge [PR113546]
The problem here is that there was a compare-debug failure, since the discriminators
would still take debug statements into account. For the edge we would look
at the first statement after the labels, and that might have been a debug statement.
So we need to skip over debug statements; otherwise we could get different
discriminators with and without -g.
Bootstrapped and tested on x86_64-linux-gnu with no regressions.
PR middle-end/113546
gcc/ChangeLog:
* tree-cfg.cc (first_non_label_stmt): Rename to ...
(first_non_label_nondebug_stmt): This and use gsi_start_nondebug_after_labels_bb.
(assign_discriminators): Update call to first_non_label_nondebug_stmt.
Patrick Palka [Mon, 14 Apr 2025 15:20:13 +0000 (11:20 -0400)]
c++: wrong targs in satisfaction diagnostic context line [PR99214]
In the three-parameter version of satisfy_declaration_constraints, when
't' isn't the most general template, then 't' won't correspond with
'args' after we augment the latter via add_outermost_template_args, and
so the instantiation context that we push via push_tinst_level isn't
quite correct: 'args' is a complete set of template arguments, but 't'
is not necessarily the most general template. This manifests as
misleading diagnostic context lines when issuing a satisfaction failure
error, e.g. for the testcase below, without this patch we emit:
In substitution of '... void A<int>::f<U>() ... [with U = int]'
and with this patch we emit:
In substitution of '... void A<int>::f<U>() ... [with U = char]'.
This patch fixes this by passing the original 'args' to push_tinst_level,
which ought to properly correspond to 't'.
PR c++/99214
gcc/cp/ChangeLog:
* constraint.cc (satisfy_declaration_constraints): Pass the
original ARGS to push_tinst_level.
Jonathan Wakely [Fri, 11 Apr 2025 10:08:34 +0000 (11:08 +0100)]
libstdc++: Document thread-safety for COW std::string [PR21334]
The gcc4-compatible copy-on-write std::string does not conform to the
C++11 requirements on data race avoidance in standard containers.
Specifically, calling non-const member functions such as begin() and
data() needs to do the "copy on write" operation and so is most
definitely a modification of the object. As such, those non-const
members must not be called concurrently with any other uses of the
string object.
libstdc++-v3/ChangeLog:
PR libstdc++/21334
* doc/xml/manual/using.xml: Document that container data race
avoidance rules do not apply to COW std::string.
* doc/html/*: Regenerate.
Andrew Pinski [Mon, 2 Dec 2024 16:35:23 +0000 (08:35 -0800)]
phiopt: Reset the number of iterations information of a loop when changing an exit from the loop [PR117243]
After r12-5300-gf98f373dd822b3, phiopt could get the following bb structure:
|
middle-bb -----|
| |
| |----| |
phi<1, 2> | |
cond | |
| | |
|--------+---|
Which was considered 2 loops. The inner loop had an estimated upper_bound of 8,
due to the original `for (b = 0; b <= 7; b++)`. The outer loop was already an
infinite one.
So phiopt would come along and change the condition to be unconditionally true,
turning the inner loop into an infinite one without resetting the estimate
on the loop; cleanup cfg then comes along and merges the two into one loop, also
without resetting the estimate. Then loop unrolling uses the stale estimate
and decides to add an unreachable there.
So the fix is to reset the estimates when phiopt changes an exit of a loop, similar to
how cleanupcfg does it when merging some basic blocks.
Andrew Pinski [Sun, 9 Mar 2025 06:43:54 +0000 (22:43 -0800)]
phiopt: Fix value_replacement for middle bb having phi nodes [PR118922]
After r12-5300-gf98f373dd822b3, value_replacement would be able to look at the
following cfg structure:
```
<bb 5> [local count: 1014686024]:
if (h_6 != 0)
goto <bb 7>; [94.50%]
else
goto <bb 6>; [5.50%]
```
value_replacement would incorrectly think the middle bb (6) was empty, and so it decided
to remove the condition in bb5 and replace it with 0, as the function thought it was `h_6 ? 0 : h_6`.
But since there is an incoming phi node to bb6 defining h_6, that is incorrect.
The fix is to check whether there are phi nodes in the middle bb and, if so, set empty_or_with_defined_p to false.
This was not needed before r12-5300-gf98f373dd822b3 because the phi would have been dead otherwise due to
other checks.
Bootstrapped and tested on x86_64-linux-gnu.
PR tree-optimization/118922
gcc/ChangeLog:
* tree-ssa-phiopt.cc (value_replacement): Set empty_or_with_defined_p
to false when there are phi nodes in the middle bb.
Eric Botcazou [Mon, 14 Apr 2025 21:35:43 +0000 (23:35 +0200)]
Revert very recent backport of changes to the type system
The backport of the change made for PR c/113688 onto the 14 branch a couple
of weeks ago has seriously broken the LTO compiler for the Ada language on
the 14 branch, because it changes the GCC type system for the sake of C in
a way that is not compatible with simple discriminated types in Ada. To be
more precise, useless_type_conversion_p now returns true for some (view-)
conversions that are needed by the rest of the compiler.
gcc/
PR lto/119792
Revert
Backported from master:
2024-12-12 Martin Uecker <uecker@tugraz.at>
Andrew Pinski [Mon, 14 Apr 2025 15:40:24 +0000 (08:40 -0700)]
testcase: Add testcase for already fixed PR [PR118476]
This testcase was fixed by r15-3052-gc7b76a076cb2c6ded but is
a testcase that failed in a different fashion and a much older
failure than the one added with r15-3052.
Andrew Pinski [Mon, 19 Aug 2024 15:06:36 +0000 (08:06 -0700)]
match: Reject non-ssa name/min invariants in gimple_extract [PR116412]
After the conversion for phiopt's conditional operand
to use maybe_push_res_to_seq, it was found that gimple_extract
will extract out from REALPART_EXPR/IMAGPART_EXPR/VCE and BIT_FIELD_REF,
a memory load. But that extraction was not needed as memory loads are not
simplified in match and simplify. So gimple_extract should return false
in those cases.
Changes since v1:
* Move the rejection to gimple_extract from factor_out_conditional_operation.
Bootstrapped and tested on x86_64-linux-gnu.
PR tree-optimization/116412
gcc/ChangeLog:
* gimple-match-exports.cc (gimple_extract): Return false if op0
was not a SSA name nor a min invariant for REALPART_EXPR/IMAGPART_EXPR/VCE
and BIT_FIELD_REF.
Andrew Pinski [Sun, 27 Oct 2024 20:16:22 +0000 (13:16 -0700)]
vec-lowering: Fix ABSU lowering [PR111285]
ABSU_EXPR lowering incorrectly used the result type for the
new expression, but in the case of ABSU the result type is an
unsigned type, and with that type the ABSU is folded away. The fix
is to use a signed type for the expression instead.
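To see why the operand has to stay signed, here is a small scalar model
of the operation. absu and absu_on_unsigned are hypothetical helper
names, and this is only a sketch of the semantics, not the vector
lowering code:

```cpp
#include <cstdint>

// Absolute value of a signed operand, produced in the unsigned type.
// Well defined even for INT32_MIN, because the negation is done in
// unsigned arithmetic.
constexpr std::uint32_t absu(std::int32_t x) {
    const auto ux = static_cast<std::uint32_t>(x);
    return x < 0 ? 0u - ux : ux;
}

// What is left if the operand is (wrongly) given the unsigned result
// type: an unsigned value is never negative, so the operation
// degenerates into the identity.
constexpr std::uint32_t absu_on_unsigned(std::uint32_t x) {
    return x;
}
```

For example, absu(-5) is 5, while reinterpreting -5 as unsigned first and
then taking the "absolute value" leaves 4294967291.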
Bootstrapped and tested on x86_64-linux-gnu.
PR middle-end/111285
gcc/ChangeLog:
* tree-vect-generic.cc (do_unop): Use a signed type for the
operand if the operation was ABSU_EXPR.
Andrew Pinski [Tue, 1 Oct 2024 21:48:19 +0000 (14:48 -0700)]
backprop: Fix deleting of a phi node [PR116922]
The problem here is remove_unused_var is called on a name that is
defined by a phi node but it deletes it like removing a normal statement.
remove_phi_node should be called rather than gsi_remove for phinodes.
Note there is a possibility of using simple_dce_from_worklist instead
but that is for another day.
Andrew Pinski [Wed, 2 Oct 2024 21:21:24 +0000 (14:21 -0700)]
aarch64: Fix early ra for -fno-delete-dead-exceptions [PR116927]
Early-RA was considering throwing instructions as being dead and removing
them even if -fno-delete-dead-exceptions was in use. This fixes that oversight.
Built and tested for aarch64-linux-gnu.
PR target/116927
gcc/ChangeLog:
* config/aarch64/aarch64-early-ra.cc (early_ra::is_dead_insn): Insns
that throw are not dead with -fno-delete-dead-exceptions.
Andrew Pinski [Tue, 1 Oct 2024 18:34:00 +0000 (18:34 +0000)]
phiopt: Fix VCE moving by rewriting it into cast [PR116098]
Phiopt match_and_simplify might move a well defined VCE assign statement
from being conditional to being unconditional; that VCE might then no longer
be defined. It will need a rewrite into a cast instead.
This adds the rewriting code to move_stmt for the VCE case.
This is enough to fix the issue at hand. It should also be using rewrite_to_defined_overflow
but first I need to move the check for whether a rewrite is needed into its own function
and that is causing issues (see https://gcc.gnu.org/pipermail/gcc-patches/2024-September/663938.html).
Plus this version is easiest to backport.
Bootstrapped and tested on x86_64-linux-gnu.
PR tree-optimization/116098
gcc/ChangeLog:
* tree-ssa-phiopt.cc (move_stmt): Rewrite VCEs from integer to integer
types to cast.
gcc/testsuite/ChangeLog:
* c-c++-common/torture/pr116098-2.c: New test.
* g++.dg/torture/pr116098-1.C: New test.
Harald Anlauf [Tue, 8 Apr 2025 20:30:15 +0000 (22:30 +0200)]
Fortran: fix issue with impure elemental subroutine and interface [PR119656]
PR fortran/119656
gcc/fortran/ChangeLog:
* interface.cc (gfc_compare_actual_formal): Fix front-end memleak
when searching for matching interfaces.
* trans-expr.cc (gfc_conv_procedure_call): If there is a formal
dummy corresponding to an absent argument, use its type, and only
fall back to inferred type otherwise.
We've been miscompiling the following since r0-51314-gd6b4ea8592e338 (I
did not go compile something that old, and identified this change via
git blame, so might be wrong)
=== cut here ===
struct Foo { int x; };
Foo& get (Foo &v) { return v; }
void bar () {
Foo v; v.x = 1;
(true ? get (v) : get (v)).*(&Foo::x) = 2;
// v.x still equals 1 here...
}
=== cut here ===
The problem lies in build_m_component_ref, that computes the address of
the COND_EXPR using build_address to build the representation of
(true ? get (v) : get (v)).*(&Foo::x);
and gets something like
&(true ? get (v) : get (v)) // #1
instead of
(true ? &get (v) : &get (v)) // #2
and the write does not go where we want it to, hence the miscompile.
This patch replaces the call to build_address by a call to
cp_build_addr_expr, which gives #2, that is properly handled.
PR c++/114525
gcc/cp/ChangeLog:
* typeck2.cc (build_m_component_ref): Call cp_build_addr_expr
instead of build_address.
Richard Biener [Wed, 9 Apr 2025 12:36:19 +0000 (14:36 +0200)]
rtl-optimization/119689 - compare-debug failure with LRA
The previous change to fix LRA rematerialization broke compare-debug
for i586 bootstrap. Fixed by using prev_nonnote_nondebug_insn
instead of prev_nonnote_insn.
PR rtl-optimization/119689
PR rtl-optimization/115568
* lra-remat.cc (create_cands): Use prev_nonnote_nondebug_insn
to check whether insn2 is directly before insn.
[PR115568][LRA]: Use more strict output reload check in rematerialization
In this PR case LRA rematerialized a value from an inheritance insn
instead of an output reload one. This resulted in considering a
rematerialization candidate value available when it was actually
not. As a consequence an insn after rematerialization used the
unexpected value and this use resulted in an fp exception. The patch
fixes this bug.
gcc/ChangeLog:
PR rtl-optimization/115568
* lra-remat.cc (create_cands): Check that output reload insn is
adjacent to given insn. Update a comment.
Jason Merrill [Thu, 10 Apr 2025 22:16:37 +0000 (18:16 -0400)]
c++: avoid ARM -Wunused-value [PR114970]
Because of the __builtin_is_constant_evaluated, maybe_constant_init in
expand_default_init fails, so the constexpr constructor isn't folded until
cp_fold, which builds a COMPOUND_EXPR in case the enclosing expression is
relying on the ARM behavior of returning 'this'.
As in other places, avoid -Wunused-value on artificial COMPOUND_EXPR.
PR c++/114970
gcc/cp/ChangeLog:
* cp-gimplify.cc (cp_fold): Suppress warnings on
return_this COMPOUND_EXPR.
Jason Merrill [Thu, 10 Apr 2025 18:34:35 +0000 (14:34 -0400)]
c++: nested lambda capture pack [PR119345]
tsubst_stmt already registers a local capture proxy as a
local_specialization of both an outer capture proxy and the captured
variable; we also need to do that in add_extra_args.
PR c++/119345
gcc/cp/ChangeLog:
* pt.cc (add_extra_args): Also register a specialization
of the captured variable.
Jason Merrill [Wed, 9 Apr 2025 17:22:56 +0000 (13:22 -0400)]
c++: lambda in constraint of lambda [PR119175]
Here when we went to mangle the constraints of from<0>, the outer lambda has
no mangling scope, but the inner one was treated as having the outer one as
its scope. And mangling the outer one means mangling its constraints, which
include the inner one. So infinite recursion.
But a lambda closure type isn't a scope that anything should have for
mangling, the inner lambda should also have no mangling scope.
PR c++/119175
gcc/cp/ChangeLog:
* mangle.cc (decl_mangling_context): Look through lambda type.
Jason Merrill [Mon, 7 Apr 2025 18:35:14 +0000 (14:35 -0400)]
c++: self-dependent alias template [PR117530]
Here, instantiating B<short> means instantiating A<short>, which means
instantiating B<short>. And then when we go to register the initial
instantiation, it conflicts with the inner one. Fixed by checking after
tsubst whether there's already something in the hash table. We already did
something much like this in tsubst_decl, but that doesn't handle this case.
PR c++/117530
gcc/cp/ChangeLog:
* pt.cc (instantiate_template): Check retrieve_specialization after
tsubst.
With inherited CTAD the set of guides may be a two-dimensional overload
set (i.e. OVERLOADs of OVERLOADs) so alias_ctad_tweaks (which also does
the inherited CTAD transformation) needs to use the 2D-aware lkp_iterator
instead of ovl_iterator, or better yet use the more idiomatic lkp_range.
PR c++/119687
gcc/cp/ChangeLog:
* pt.cc (alias_ctad_tweaks): Use lkp_range / lkp_iterator
instead of ovl_iterator.
gcc/testsuite/ChangeLog:
* g++.dg/cpp23/class-deduction-inherited8.C: New test.
Jonathan Wakely [Tue, 5 Nov 2024 17:19:06 +0000 (17:19 +0000)]
libstdc++: Fix conversions to key/value types for hash table insertion [PR115285]
The conversions to key_type and value_type that are performed when
inserting into _Hashtable need to be fixed to do any required
conversions explicitly. The current code assumes that conversions from
the parameter to the key_type or value_type can be done implicitly,
which isn't necessarily true.
Remove the _S_forward_key function which doesn't handle all cases and
either forward the parameter if it already has type cv key_type, or
explicitly construct a temporary of type key_type.
Similarly, the _ConvertToValueType specialization for maps doesn't
handle all cases either, for std::pair arguments only some value
categories are handled. Remove _ConvertToValueType and for the _M_insert
function for unique keys, either forward the argument unchanged or
explicitly construct a temporary of type value_type.
For the _M_insert overload for non-unique keys we don't need any
conversion at all, we can just forward the argument directly to where we
construct a node.
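The "forward if it is already the key type, otherwise construct
explicitly" idea can be sketched outside the library as follows. This is
an assumption-laden illustration: forwarded_key_t, key_id, Key and
Wrapped are hypothetical names, and the real _Hashtable code differs in
detail.

```cpp
#include <type_traits>
#include <utility>

// Pick the type to static_cast to: a reference when the argument is
// already a (possibly cv-qualified) Key, otherwise a Key temporary.
template <typename Key, typename Arg>
using forwarded_key_t = std::conditional_t<
    std::is_same_v<std::remove_cv_t<std::remove_reference_t<Arg>>, Key>,
    Arg&&,  // already a Key: just forward it
    Key>;   // anything else: explicitly construct a Key

struct Key { int id; };

// Only explicitly convertible to Key, so an implicit conversion fails.
struct Wrapped {
    int id;
    explicit constexpr operator Key() const { return Key{id}; }
};

// static_cast performs the explicit conversion where one is needed and
// is a plain forward otherwise.
template <typename K, typename Arg>
constexpr int key_id(Arg&& arg) {
    forwarded_key_t<K, Arg> k =
        static_cast<forwarded_key_t<K, Arg>>(std::forward<Arg>(arg));
    return k.id;
}
```

With an implicit conversion in place of the static_cast, the Wrapped case
would not compile, which is the kind of failure this change addresses.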
libstdc++-v3/ChangeLog:
PR libstdc++/115285
* include/bits/hashtable.h (_Hashtable::_S_forward_key): Remove.
(_Hashtable::_M_insert_unique_aux): Replace _S_forward_key with
a static_cast to a type defined using conditional_t.
(_Hashtable::_M_insert): Replace _ConvertToValueType with a
static_cast to a type defined using conditional_t.
* include/bits/hashtable_policy.h (_ConvertToValueType): Remove.
* testsuite/23_containers/unordered_map/insert/115285.cc: New test.
* testsuite/23_containers/unordered_set/insert/115285.cc: New test.
* testsuite/23_containers/unordered_set/96088.cc: Adjust
expected number of allocations.
François Dumont [Tue, 22 Oct 2024 17:13:34 +0000 (19:13 +0200)]
libstdc++: Always instantiate key_type to compute hash code [PR115285]
Even if it is possible to compute a hash code from the inserted arguments
we need to instantiate the key_type to guarantee hash code consistency.
Preserve the lazy instantiation of the mapped_type in the context of
associative containers.
libstdc++-v3/ChangeLog:
PR libstdc++/115285
* include/bits/hashtable.h (_S_forward_key<_Kt>): Always return a temporary
key_type instance.
* testsuite/23_containers/unordered_map/96088.cc: Adapt to additional instantiation.
Also check that mapped_type is not instantiated when there is no insertion.
* testsuite/23_containers/unordered_multimap/96088.cc: Adapt to additional
instantiation.
* testsuite/23_containers/unordered_multiset/96088.cc: Likewise.
* testsuite/23_containers/unordered_set/96088.cc: Likewise.
* testsuite/23_containers/unordered_set/pr115285.cc: New test case.
Jin Ma [Mon, 7 Apr 2025 06:21:50 +0000 (14:21 +0800)]
RISC-V: Disable unsupported vsext/vzext patterns for XTheadVector.
XTheadVector does not support the vsext/vzext instructions; however,
due to the reuse of RVV optimizations, it may generate these instructions
in certain cases. To prevent the error "Unknown opcode 'th.vsext.vf2',"
we should disable these patterns.
V2:
Change the value of dg-do in the test case from assemble to compile, and
remove the -save-temps option.
gcc/ChangeLog:
* config/riscv/vector.md: Disable vsext/vzext for XTheadVector.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/xtheadvector/vsext.c: New test.
* gcc.target/riscv/rvv/xtheadvector/vzext.c: New test.
Looking at the log for the reload pass, it is found that "Changing pseudo 209 in
operand 3 of insn 69 on equiv 0x1".
It converts the vl operand in insn from the expected register(reg:DI 209) to the
constant 1(const_int 1 [0x1]).
This conversion occurs because, although the predicate for the vl operand is
restricted by "vector_length_operand" in the pattern, the constraint is still
"rK", which allows the transformation.
The issue is that changing the "rK" constraint to "rJ" for the vl
operand in the pattern would prevent this conversion, but unfortunately this would
conflict with RVV (RISC-V Vector Extension).
Based on the review's recommendations, the best solution for now is to create
a new constraint to distinguish between RVV and XTheadVector, which is exactly
what this patch does.
Kito Cheng [Thu, 10 Apr 2025 08:58:49 +0000 (16:58 +0800)]
RISC-V: Fix the behavior for multilib-generator with --cmodel=large on rv32
Large code model is only supported on RV64, so we don't need to
generate the multilibs for RV32 with --cmodel=large. And the compact
code model is something we don't support upstream (it was
accidentally added in the past), so we need to remove it.
gcc/ChangeLog:
* config/riscv/multilib-generator: Remove the compact code model
and check large code model for RV32.
Patrick Palka [Wed, 9 Apr 2025 21:48:05 +0000 (17:48 -0400)]
libstdc++: Fix constraint recursion in basic_const_iterator operator- [PR115046]
It was proposed in PR112490 to also adjust basic_const_iterator's friend
operator-(sent, iter) overload alongside the r15-7757-g4342c50ca84ae5
adjustments to its comparison operators, but we lacked a concrete
testcase demonstrating fixable constraint recursion there. It turns out
Hewill Kang's PR115046 is such a testcase! So this patch makes the same
adjustments to that overload as well, fixing PR115046. The LWG 4218 P/R
will need to get adjusted too.
PR libstdc++/115046
PR libstdc++/112490
libstdc++-v3/ChangeLog:
* include/bits/stl_iterator.h (basic_const_iterator::operator-):
Replace non-dependent basic_const_iterator function parameter with
a dependent one of type basic_const_iterator<_It2> where _It2
matches _It.
* testsuite/std/ranges/adaptors/as_const/1.cc (test04): New test.
Patrick Palka [Wed, 9 Apr 2025 21:55:36 +0000 (17:55 -0400)]
c++: ICE with nested default targ lambdas [PR119574]
In GCC 14 we fixed PR116567 in a more conservative way that doesn't
distinguish between the two kinds of deferred substitutions, and so
for PR119574 we instead ICE from get_innermost_template_args due to
TMPL_PARMS_DEPTH of the lambda, 2, being greater than the depth of the
augmented args, 1.
This patch works around the ICE in a best effort kind of way by guarding
the get_innermost_template_args call appropriately; I don't think it's
possible to get this completely right in GCC 14 without backporting the
proper fix for PR116567.
Note that lambda-targ13b.C present in the GCC 15 version of this patch[1]
never worked in GCC 14, and still doesn't work, which is why it's not
present in this patch.
xuli [Tue, 12 Nov 2024 02:31:28 +0000 (02:31 +0000)]
RISC-V: Bugfix for max_sew_overlap_and_next_ratio_valid_for_prev_sew_p[pr117483]
This patch fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117483
If prev and next satisfy the following rules, we should forbid the case
(next.get_sew() < prev.get_sew() && (!next.get_ta() || !next.get_ma()))
in the compatible function max_sew_overlap_and_next_ratio_valid_for_prev_sew_p.
Otherwise, the tail elements of next will be polluted.
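The rejected case can be written as a standalone predicate. VsetvlInfo
and must_reject are hypothetical stand-ins for illustration; the fields
model SEW, tail-agnostic and mask-agnostic, and the real check operates
on GCC's own vsetvl info objects.

```cpp
// Hypothetical, simplified view of a vsetvl configuration.
struct VsetvlInfo {
    unsigned sew;  // selected element width in bits
    bool ta;       // tail agnostic
    bool ma;       // mask agnostic
};

// True when 'next' must not be treated as compatible with 'prev':
// next uses a narrower SEW and is not both tail- and mask-agnostic.
constexpr bool must_reject(const VsetvlInfo& prev, const VsetvlInfo& next) {
    return next.sew < prev.sew && (!next.ta || !next.ma);
}
```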
Andreas noted we were getting an uninit warning after the recent constant
synthesis changes. Essentially there's no way for the uninit analysis code to
know the first entry in the CODES array is an UNKNOWN which will set X before
its first use.
So trivial initialization with NULL_RTX is the obvious fix.
Robin Dapp [Wed, 24 Jul 2024 07:08:00 +0000 (09:08 +0200)]
RISC-V: Error early with V and no M extension.
For calculating the value of a poly_int at runtime we use a
multiplication instruction that requires the M extension.
Instead of just asserting and ICEing this patch emits an early
error at option-parsing time.
gcc/ChangeLog:
PR target/116036
* config/riscv/riscv.cc (riscv_override_options_internal): Error
with TARGET_VECTOR && !TARGET_MUL.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/arch-31.c: Add m to arch string and expect it.
* gcc.target/riscv/arch-32.c: Ditto.
* gcc.target/riscv/predef-14.c: Ditto.
* gcc.target/riscv/predef-15.c: Ditto.
* gcc.target/riscv/predef-16.c: Ditto.
* gcc.target/riscv/predef-26.c: Ditto.
* gcc.target/riscv/predef-27.c: Ditto.
* gcc.target/riscv/predef-32.c: Ditto.
* gcc.target/riscv/predef-33.c: Ditto.
* gcc.target/riscv/rvv/autovec/pr111486.c: Add m to arch string.
* gcc.target/riscv/compare-debug-1.c: Ditto.
* gcc.target/riscv/compare-debug-2.c: Ditto.
* gcc.target/riscv/rvv/base/pr116036.c: New test.
Robin Dapp [Wed, 31 Jul 2024 14:54:03 +0000 (16:54 +0200)]
RISC-V: Correct mode_idx attribute for viwalu wx variants [PR116149].
In PR116149 we choose a wrong vector length which causes wrong values in
a reduction. The problem happens in avlprop where we choose the
number of units in the instruction's mode as vector length. For the
non-scalar variants the respective operand has the correct non-widened
mode. For the scalar variants, however, the same operand has a scalar
mode which obviously only has one unit. This makes us choose VL = 1
leaving three elements undisturbed (so potentially -1). Those end up
in the reduction causing the wrong result.
This patch adjusts the mode_idx just for the scalar variants of the
affected instruction patterns.
gcc/ChangeLog:
PR target/116149
* config/riscv/vector.md: Fix mode_idx attribute of scalar
widen add/sub variants.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/pr116149.c: New test.
Jeff Law [Thu, 8 Aug 2024 13:42:26 +0000 (07:42 -0600)]
[RISC-V][PR target/116240] Ensure object is a comparison before extracting arguments
This was supposed to go out the door yesterday, but I kept getting interrupted.
The target bits for rtx costing can't assume the rtl they're given actually
matches a target pattern. It's just kind of inherent in how the costing
routines get called in various places.
In this particular case we're trying to cost a conditional move:
(set (dest) (if_then_else (cond) (true) (false)))
On the RISC-V port the backend only allows actual conditionals for COND. So
something like (eq (reg) (const_int 0)). In the costing code for if-then-else
we did something like
(XEXP (XEXP (cond, 0), 0))
Which fails miserably if COND is a terminal node like (reg) rather than (ne
(reg) (const_int 0)).
So this patch tightens up the RTL scanning to ensure that we have a comparison
before we start looking at the comparison's arguments.
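The shape of that check can be shown with a toy expression tree. This is
a model for illustration only: Code, Node, is_comparison and
comparison_lhs are hypothetical names and do not correspond to GCC's RTL
API.

```cpp
// Toy expression codes: two terminals and two comparisons.
enum class Code { Reg, ConstInt, Eq, Ne };

struct Node {
    Code code;
    const Node* op0 = nullptr;  // operands are only set for comparisons
    const Node* op1 = nullptr;
};

constexpr bool is_comparison(const Node& n) {
    return n.code == Code::Eq || n.code == Code::Ne;
}

// Look at the first operand only after confirming that 'cond' really
// is a comparison; a terminal node yields nullptr instead of garbage.
constexpr const Node* comparison_lhs(const Node& cond) {
    return is_comparison(cond) ? cond.op0 : nullptr;
}

// Sample trees: (ne (reg) (const_int 0)) and a naked (reg).
inline constexpr Node reg_node{Code::Reg};
inline constexpr Node zero_node{Code::ConstInt};
inline constexpr Node ne_node{Code::Ne, &reg_node, &zero_node};
```

Costing code written this way can fall back to a default cost when the
condition is a terminal node rather than reading operands that do not
exist.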
Run through my tester without incident, but I'll wait for the pre-commit tester
to run through a cycle before pushing to the trunk.
Jeff
ps. We probably could support a naked REG for the condition and internally convert it to (ne (reg) (const_int 0)), but I don't think it likely happens with any regularity.
PR target/116240
gcc/
* config/riscv/riscv.cc (riscv_rtx_costs): Ensure object is a
comparison before looking at its arguments.
gcc/testsuite
* gcc.target/riscv/pr116240.c: New test.
曾治金 [Wed, 14 Aug 2024 06:06:23 +0000 (14:06 +0800)]
RISC-V: Fix factor in dwarf_poly_indeterminate_value [PR116305]
This patch fixes the bug (PR116305) introduced by commit bd93ef for the
RISC-V target.
Commit bd93ef changes chunk_num from 1 to TARGET_MIN_VLEN/128 in
riscv_convert_vector_bits if TARGET_MIN_VLEN is larger than 128, which
changes the value of BYTES_PER_RISCV_VECTOR. For example, with
TARGET_MIN_VLEN set to 256, the value of BYTES_PER_RISCV_VECTOR was [8, 8]
before commit bd93ef but is now [16, 16]. As a result, the values of
riscv_bytes_per_vector_chunk and BYTES_PER_RISCV_VECTOR are no longer
equal.
The prologue uses BYTES_PER_RISCV_VECTOR.coeffs[1] to estimate the vlenb
register value in riscv_legitimize_poly_move, and dwarf2cfi also gets the
estimated vlenb register value from riscv_dwarf_poly_indeterminate_value
to calculate the number of times to multiply the vlenb register value.
So we need to change the factor from riscv_bytes_per_vector_chunk to
BYTES_PER_RISCV_VECTOR, otherwise we get incorrect DWARF information.
An example of the incorrect output follows:
```
csrr   t0,vlenb
slli   t1,t0,1
sub   sp,sp,t1
```
The sequence '0x92,0xa2,0x38,0' means the vlenb register, '0x34' means
the literal 4, and '0x1e' means the multiply operation. But in fact the
vlenb register value only needs to be multiplied by the literal 2.
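The arithmetic behind the 4-versus-2 mismatch can be sketched as follows. This is a simplified model, assuming the indeterminate is recovered as vlenb / factor - 1 and that the stack adjustment of two vectors is the poly value [32, 32] when BYTES_PER_RISCV_VECTOR is [16, 16]; the names Poly and eval_with_factor are hypothetical.

```cpp
#include <cstdint>

// A poly value c0 + c1 * x, where x is the runtime indeterminate.
struct Poly {
  std::int64_t c0;
  std::int64_t c1;
};

// Evaluate the poly value the way the unwinder would, assuming the
// indeterminate is computed as vlenb / factor - 1.
constexpr std::int64_t eval_with_factor(Poly p, std::int64_t vlenb,
                                        std::int64_t factor) {
  return p.c0 + p.c1 * (vlenb / factor - 1);
}

// Two vector registers' worth of stack with BYTES_PER_RISCV_VECTOR = [16, 16].
constexpr Poly two_vectors{32, 32};

// At VLEN=256 the vlenb register holds 32 bytes.
static_assert(eval_with_factor(two_vectors, 32, 16) == 2 * 32,
              "factor 16 (BYTES_PER_RISCV_VECTOR) gives 2 * vlenb");
static_assert(eval_with_factor(two_vectors, 32, 8) == 4 * 32,
              "factor 8 (the chunk size) gives 4 * vlenb");
```

Under these assumptions the chunk-sized factor doubles the computed frame adjustment, which matches the literal 4 seen in the bad CFI where 2 was expected.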
PR target/116305
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_dwarf_poly_indeterminate_value): Take
BYTES_PER_RISCV_VECTOR for *factor instead of riscv_bytes_per_vector_chunk.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/scalable_vector_cfi.c: New test.
Robin Dapp [Tue, 27 Aug 2024 08:25:34 +0000 (10:25 +0200)]
RISC-V: Fix subreg of VLS modes larger than a vector [PR116086].
When the source mode is potentially larger than one vector (e.g. an
LMUL2 mode for VLEN=128) we don't know which vector the subreg actually
refers to. For zvl128b and LMUL=2 the subreg in (subreg:V2DI (reg:V4DI))
could actually be a full (high) vector register of a two-register
group (at VLEN=128) or the higher part of a single register (at VLEN>128).
As the subreg is statically ambiguous we prevent such situations in
can_change_mode_class.
The culprit in PR116086 is
_12 = BIT_FIELD_REF <vect_cst__42, 128, 128>;
which can be expanded with a vector-vector extract (from V4DI to V2DI).
This patch adds a VLS-mode vector-vector extract that handles "halving"
cases like this one by sliding down the source vector, thus making sure
the correct part is used.
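To make the ambiguity concrete, here is a small sketch that computes where the high 128 bits of a 256-bit value land for different VLENs. The Location type and locate function are hypothetical helpers for illustration, not part of the backend.

```cpp
// Where a bit position of a multi-register value lives: which register of
// the group, and at which bit offset inside that register.
struct Location {
  unsigned reg_index;
  unsigned bit_offset;
};

// Map a bit position within a register group to a register and an offset,
// assuming the value is laid out contiguously across VLEN-bit registers.
constexpr Location locate(unsigned bit_pos, unsigned vlen_bits) {
  return Location{bit_pos / vlen_bits, bit_pos % vlen_bits};
}

// The high V2DI half of a V4DI value starts at bit 128.
static_assert(locate(128, 128).reg_index == 1, "VLEN=128: second register");
static_assert(locate(128, 128).bit_offset == 0, "VLEN=128: whole register");
static_assert(locate(128, 256).reg_index == 0, "VLEN=256: first register");
static_assert(locate(128, 256).bit_offset == 128, "VLEN=256: upper half");
```

Because the register index depends on a VLEN that is only known at run time, a static subreg cannot name the right location, whereas a slide-down by a fixed number of elements works for every VLEN.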
PR target/116086
gcc/ChangeLog:
* config/riscv/autovec.md (vec_extract<mode><v_half>): Add
vector-vector extract for VLS modes.
* config/riscv/riscv.cc (riscv_can_change_mode_class): Forbid
VLS modes larger than one vector.
* config/riscv/vector-iterators.md: Add vector-vector extract
iterators.
gcc/testsuite/ChangeLog:
* lib/target-supports.exp: Add effective target checks for
zvl256b and zvl512b.
* gcc.target/riscv/rvv/autovec/pr116086-2-run.c: New test.
* gcc.target/riscv/rvv/autovec/pr116086-2.c: New test.
* gcc.target/riscv/rvv/autovec/pr116086.c: New test.
RISC-V: Fix vl_used_by_non_rvv_insn logic of vsetvl pass
This patch fixes a bug in the current vsetvl pass. The current pass uses
`m_vl` to determine whether the dest operand has been used by non-RVV
instructions. However, `m_vl` may have been modified as a result of an
`update_avl` call, and thus would be no longer the dest operand of the
original instruction. This can lead to incorrect vsetvl eliminations, as is
shown in the testcase. In this patch, we create a `dest_vl` variable for
this scenario.
gcc/ChangeLog:
* config/riscv/riscv-vsetvl.cc: Use `dest_vl` for dest VL operand
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/vsetvl/vsetvl_bug-3.c: New test.
Andreas Schwab [Thu, 12 Sep 2024 11:55:09 +0000 (13:55 +0200)]
riscv: Fix duplicate assembler label in @tlsdesc<mode> insn
Use %= instead of maintaining a sequence number manually, so that it
doesn't result in a duplicate assembler label when the insn is duplicated.
PR target/116693
* config/riscv/riscv.cc (riscv_legitimize_tls_address): Don't pass
seqno to gen_tlsdesc and remove it.
* config/riscv/riscv.md (@tlsdesc<mode>): Remove operand 1. Use
%= instead of %1 in template.
Bohan Lei [Wed, 18 Sep 2024 13:20:23 +0000 (07:20 -0600)]
[PATCH] RISC-V: Allow zero operand for DI variants of vssubu.vx
The RISC-V vector machine description relies on the helper function
`sew64_scalar_helper` to emit actual insns for the DI variants of
vssub.vx and vssubu.vx. This works with vssub.vx, but can cause
problems with vssubu.vx with the scalar operand being constant zero,
because `has_vi_variant_p` returns false, and the operand will be taken
without being loaded into a reg. The attached testcases can cause an
internal compiler error as a result.
Allowing a constant zero operand in those insns seems to be a simple
solution that affects only a minimal amount of existing code.
gcc/ChangeLog:
* config/riscv/vector.md: Allow zero operand for DI variants of
vssubu.vx
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/vssubu-1.c: New test.
* gcc.target/riscv/rvv/base/vssubu-2.c: New test.
[PATCH] RISC-V: Fix th.extu operands exceeding range on rv32.
The Combine Pass may generate zero_extract instructions that are out of range.
Drawing from other architectures like AArch64, we should impose restrictions
on the "*th_extu<mode>4" pattern.
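The kind of restriction meant here can be sketched as a simple range predicate. This is only an illustration, assuming the restriction is that the extracted field must lie entirely inside the register; extract_in_range is a hypothetical name and the exact condition used in thead.md may differ.

```cpp
// Return true if a bit-field of `len` bits starting at bit `pos` fits
// entirely inside a register of `mode_bits` bits. Written so that the
// check itself cannot overflow.
constexpr bool extract_in_range(unsigned long long len, unsigned long long pos,
                                unsigned mode_bits) {
  return len >= 1 && len <= mode_bits && pos <= mode_bits - len;
}

static_assert(extract_in_range(8, 24, 32), "bits 24..31 fit on rv32");
static_assert(!extract_in_range(8, 28, 32), "bits 28..35 do not fit on rv32");
static_assert(extract_in_range(8, 28, 64), "the same field fits on rv64");
```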
gcc/
* config/riscv/thead.md (*th_extu<mode>4): Fix th.extu
operands exceeding range on rv32.
Robin Dapp [Thu, 21 Nov 2024 13:49:53 +0000 (14:49 +0100)]
RISC-V: Ensure vtype for full-register moves [PR117544].
As discussed in PR117544 the VTYPE register is not preserved across
function calls. Even though vmv1r-like instructions operate
independently of the actual vtype they still require a valid vtype. As
we cannot guarantee that the vtype is valid we must make sure to emit a
vsetvl between a function call and a vmv1r.v.
This patch makes the necessary changes by splitting the full-reg-move
insns into patterns that use the vtype register and adding vmov to the
types of instructions requiring a vset.
Pan Li [Wed, 4 Dec 2024 05:53:52 +0000 (13:53 +0800)]
RISC-V: Add assert for insn operand out of range access [PR117878][NFC]
According to the initial analysis of PR117878, the ICE comes from
an out-of-range operand access to recog_data.operand[]. Thus, add
an assert here to expose this explicitly.
PR target/117878
gcc/ChangeLog:
* config/riscv/riscv-v.cc (vlmax_avl_type_p): Add assert for
out of range access.
(nonvlmax_avl_type_p): Ditto.
Jeff Law [Mon, 30 Dec 2024 20:51:55 +0000 (13:51 -0700)]
[RISC-V][PR target/106544] Avoid ICEs due to bogus asms
This is a fix for a bug Andrew P filed a while back where essentially a poorly
crafted asm statement could trigger an ICE during assembly output. Various
cases will use INTVAL (op) without first verifying the operand is a CONST_INT
node.
The usual way to handle this is via output_operand_lossage, which this patch
implements.
I focused primarily on the CONST_INT cases. There could well be other problems
in this space; if so, they should get distinct bugs with testcases.
Tested in my tester on rv32 and rv64. Waiting for pre-commit testing before
moving forward.
PR target/106544
gcc/
* config/riscv/riscv.cc (riscv_print_operand): Issue an error for
invalid operands rather than invalidly accessing INTVAL of an
object that is not a CONST_INT. Fix one error string for 'N'.
gcc/testsuite
* gcc.target/riscv/pr106544.c: New test.
with negative step expecting wraparound semantics due to -fwrapv.
For building interleaved patterns we have an optimization that
does e.g.
{1, 209, ...} = { 1, 0, 209, 0, ...}
and
{201, 25, ...} >> 8 = { 0, 201, 0, 25, ...}
and IORs those.
The optimization only works if the lowpart bits are zero. When
overflowing e.g. with a negative step we cannot guarantee this.
This patch makes us fall back to the generic merge handling for negative
steps.
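The interleave trick and its failure mode can be modelled on the host as below. This is a sketch with 8-bit lanes packed into 16-bit elements; the function names are hypothetical, and the shift is written as a left shift of each wide element, which is what moves a value into the higher (odd-numbered) narrow lane.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// View each 16-bit element as two 8-bit lanes, low lane first.
std::vector<std::uint8_t> lanes(const std::vector<std::uint16_t> &wide) {
  std::vector<std::uint8_t> out;
  for (std::uint16_t w : wide) {
    out.push_back(static_cast<std::uint8_t>(w & 0xff));
    out.push_back(static_cast<std::uint8_t>(w >> 8));
  }
  return out;
}

// Interleave two series by OR-ing the even series with the shifted odd
// series in the wider element type. Assumes both inputs have equal length.
// A negative (or too large) even value has non-zero upper bits in the wide
// element, so it leaks into the neighbouring odd lane.
std::vector<std::uint8_t> interleave_by_ior(const std::vector<int> &even,
                                            const std::vector<int> &odd) {
  std::vector<std::uint16_t> wide;
  for (std::size_t i = 0; i < even.size(); ++i) {
    auto lo = static_cast<std::uint16_t>(even[i]);
    auto hi = static_cast<std::uint16_t>(static_cast<std::uint16_t>(odd[i]) << 8);
    wide.push_back(static_cast<std::uint16_t>(lo | hi));
  }
  return lanes(wide);
}
```

For the values from the text, interleave_by_ior({1, 209}, {201, 25}) gives the lanes 1, 201, 209, 25. Replacing 209 with -1 makes the wide element 0xFFFF, so the odd lane that should hold 25 becomes 255, which is the kind of corruption the fallback to generic merging avoids.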
I'm not 100% certain we're good even for positive steps. If the
step or the vector length is large enough we'd still overflow and
have non-zero lower bits. I haven't seen this happen during my
testing, though and the patch doesn't make things worse, so...
Regtested on rv64gcv_zvl512b. Let's see what the CI says.
Regards
Robin
PR target/117682
gcc/ChangeLog:
* config/riscv/riscv-v.cc (expand_const_vector): Fall back to
merging if either step is negative.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/pr117682.c: New test.
In PR118154 we emit strided stores but the first of those does not
always have the proper VTYPE. That's because we erroneously delete
a necessary vsetvl.
Kito Cheng [Mon, 23 Dec 2024 15:23:44 +0000 (23:23 +0800)]
RISC-V: Fix code gen for reduction with length 0 [PR118182]
`.MASK_LEN_FOLD_LEFT_PLUS` (or `mask_len_fold_left_plus_m`) expects the
return value to be the start value even if the length is 0.
However, the current code gen in the RISC-V backend does not meet that
semantic; it produces a random garbage value if the length is 0.
Take the current code gen for MASK_LEN_FOLD_LEFT_PLUS with f64 as an example:
# _148 = .MASK_LEN_FOLD_LEFT_PLUS (stmp__148.33_134, vect__70.32_138, { -1, ... }, loop_len_161, 0);
vsetvli zero,a5,e64,m1,ta,ma
vfmv.s.f v2,fa5 # insn 1
vfredosum.vs v1,v1,v2 # insn 2
vfmv.f.s fa5,v1 # insn 3
insn 1:
- vfmv.s.f won't do anything if VL=0, which means v2 will contain a garbage value.
insn 2:
- vfredosum.vs won't do anything if VL=0, and keeps vd unchanged even with TA.
(the v-spec says: `If vl=0, no operation is performed and the destination register
is not updated.`)
insn 3:
- vfmv.f.s will move the value from v1 even if VL=0, so this is safe.
So how do we fix that? We need two fixes:
1. insn 1: always execute with VL=1, so that we can guarantee it will
always work as expected.
2. insn 2: add a new pattern to force `vd` to use the same reg as `vs1` (start
value) for all reduction patterns, so that we can guarantee vd[0] will contain
the start value when vl=0.
For 1, it's just a simple change to riscv_vector::expand_reduction, but for 2,
we have to add a _VL0_SAFE variant of each reduction to force `vd` to use the
same reg as `vs1` (start value).
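The effect of the two fixes can be simulated with a toy model of the three instructions. This is a sketch only: VReg is a hypothetical 4-lane register, the functions mimic just the VL=0 behaviour described above, and zero-initialised registers stand in for unknown stale contents.

```cpp
#include <array>

// Toy 4-lane vector register; not a real RVV model.
using VReg = std::array<double, 4>;

// vfmv.s.f: writes element 0 only when vl > 0.
void vfmv_s_f(VReg &vd, double f, unsigned vl) {
  if (vl > 0)
    vd[0] = f;
}

// vfredosum.vs vd, vs2, vs1: vd[0] = vs1[0] + vs2[0] + ... in order.
// Performs no operation and leaves vd untouched when vl == 0.
void vfredosum(VReg &vd, const VReg &vs2, const VReg &vs1, unsigned vl) {
  if (vl == 0)
    return;
  double acc = vs1[0];
  for (unsigned i = 0; i < vl && i < vs2.size(); ++i)
    acc += vs2[i];
  vd[0] = acc;
}

// vfmv.f.s: reads element 0 regardless of vl.
double vfmv_f_s(const VReg &vs) { return vs[0]; }

// Original sequence: vd aliases the data register and vfmv.s.f uses vl.
double reduce_original(double start, VReg v1, unsigned vl) {
  VReg v2{};
  vfmv_s_f(v2, start, vl);
  vfredosum(v1, v1, v2, vl);
  return vfmv_f_s(v1);
}

// Fixed sequence: vfmv.s.f always runs with VL=1, and vd aliases vs1.
double reduce_vl0_safe(double start, const VReg &data, unsigned vl) {
  VReg acc{};
  vfmv_s_f(acc, start, 1);
  vfredosum(acc, data, acc, vl);
  return vfmv_f_s(acc);
}
```

In this model, with data {1, 2, 3, 4} and a start value of 10, both versions return 20 for vl=4. For vl=0 the original sequence returns 1 (whatever happened to be in v1), while the VL0-safe sequence returns the start value 10.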
Changes since V3:
- Rename _AV to _VL0_SAFE for readability.
- Use non-VL0_SAFE version if VL is const or VLMAX.
- Only force VL=1 for vfmv.s.f when VL is non-const and non-VLMAX.
- Two more testcases.
Jin Ma [Sat, 18 Jan 2025 14:43:17 +0000 (07:43 -0700)]
[PR target/118357] RISC-V: Disable fusing vsetvl instructions by VSETVL_VTYPE_CHANGE_ONLY for XTheadVector.
In RVV 1.0, the instruction "vsetvli zero,zero,*" indicates that the
available vector length (avl) does not change. However, in XTheadVector,
this same instruction signifies that the avl should take the maximum value.
Consequently, when fusing vsetvl instructions, the optimization labeled
"VSETVL_VTYPE_CHANGE_ONLY" is disabled for XTheadVector.
PR target/118357
gcc/ChangeLog:
* config/riscv/riscv-vsetvl.cc: Function change_vtype_only_p always
returns false for XTheadVector.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/xtheadvector/pr118357.c: New test.
Jeff Law [Sat, 18 Jan 2025 20:44:33 +0000 (13:44 -0700)]
[RISC-V][PR target/116308] Fix generation of initial RTL for atomics
While this wasn't originally marked as a regression, it almost certainly is
given that older versions of GCC would have used libatomic and would not have
ICE'd on this code.
Basically this is another case where we directly used simplify_gen_subreg when
we should have used gen_lowpart.
When I fixed a similar bug a while back I noted the code in question as needing
another looksie. I think at that time my brain saw the mixed modes (SI & QI)
and locked up. But the QI stuff is just the shift count, not some deeper
issue. So fixing is trivial.
We just replace the simplify_gen_subreg with a gen_lowpart and get on with our
lives.
Tested on rv64 and rv32 in my tester. Waiting on pre-commit testing for final
verdict.
PR target/116308
gcc/
* config/riscv/riscv.cc (riscv_lshift_subword): Use gen_lowpart
rather than simplify_gen_subreg.
The right shift is always going to produce 0. 0 + 1 = 1 which is a power of 2.
So exact_log2 returns 0 and we get a true result rather than a false result.
The fix is trivial. "<=". While inside we might as well fix the formatting.
Tested on rv32 and rv64 in my tester. Waiting on upstream pre-commit testing
to render a verdict.
Jonathan Wakely [Mon, 7 Apr 2025 18:52:55 +0000 (19:52 +0100)]
libstdc++: Fix use-after-free in std::format [PR119671]
When formatting floating-point values to wide strings there's a case
where we invalidate a std::wstring buffer while a std::wstring_view is
still referring to it.
libstdc++-v3/ChangeLog:
PR libstdc++/119671
* include/std/format (__formatter_fp::format): Do not invalidate
__wstr unless _M_localized returns a valid string.
* testsuite/std/format/functions/format.cc: Check wide string
formatting of floating-point types with classic locale.
Jonathan Wakely [Wed, 26 Mar 2025 11:47:05 +0000 (11:47 +0000)]
libstdc++: Replace use of __mindist in ranges::uninitialized_xxx algos [PR101587]
In r15-8980-gf4b6acfc36fb1f I introduced a new function object for
finding the smaller of two distances. In bugzilla Hewill Kang pointed
out that we still need to explicitly convert the result back to the
right difference type, because the result might be an integer-like class
type that doesn't convert to an integral type implicitly.
Rather than doing that conversion in the __mindist function object, I
think it's simpler to remove it again and just do a comparison and
assignment. We always want the result to have a specific type, so we can
just check if the value of the other type is smaller, and then convert
that to the other type if so.
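The "comparison and assignment" approach can be sketched like this. It is a simplified illustration using built-in signed types in place of integer-like class types; smaller_distance is a hypothetical name and is not the code in <bits/ranges_uninitialized.h>.

```cpp
#include <cstddef>
#include <type_traits>

// Return the smaller of two distances, always in the type of the first
// argument. The second value is converted explicitly, and only when it is
// the smaller one, so it is known to fit in Result.
template <typename Result, typename Other>
constexpr Result smaller_distance(Result d1, Other d2) {
  if (d2 < d1)
    d1 = static_cast<Result>(d2);
  return d1;
}

static_assert(std::is_same<decltype(smaller_distance(std::ptrdiff_t{10},
                                                     short{3})),
                           std::ptrdiff_t>::value,
              "the result keeps the first argument's type");
```

Unlike std::min(d1, d2), this needs neither a common type for the two arguments nor an implicit conversion of the result.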
libstdc++-v3/ChangeLog:
PR libstdc++/101587
* include/bits/ranges_uninitialized.h (__detail::__mindist):
Remove.
(ranges::uninitialized_copy, ranges::uninitialized_copy_n)
(ranges::uninitialized_move, ranges::uninitialized_move_n): Use
comparison and assignment instead of __mindist.
* testsuite/20_util/specialized_algorithms/uninitialized_copy/constrained.cc:
Check with ranges that use integer-like class type for
difference type.
* testsuite/20_util/specialized_algorithms/uninitialized_move/constrained.cc:
Likewise.
Reviewed-by: Tomasz Kaminski <tkaminsk@redhat.com>
Reviewed-by: Hewill Kang <hewillk@gmail.com>
(cherry picked from commit 03ac8886e5c1fa16da90276fd721a57fa9435f4f)
Jonathan Wakely [Wed, 26 Mar 2025 11:47:05 +0000 (11:47 +0000)]
libstdc++: Replace use of std::min in ranges::uninitialized_xxx algos [PR101587]
Because ranges can have any signed integer-like type as difference_type,
it's not valid to use std::min(diff1, diff2). Instead of calling
std::min with an explicit template argument, this adds a new __mindist
helper that determines the common type and uses that with std::min.
libstdc++-v3/ChangeLog:
PR libstdc++/101587
* include/bits/ranges_uninitialized.h (__detail::__mindist):
New function object.
(ranges::uninitialized_copy, ranges::uninitialized_copy_n)
(ranges::uninitialized_move, ranges::uninitialized_move_n): Use
__mindist instead of std::min.
* testsuite/20_util/specialized_algorithms/uninitialized_copy/constrained.cc:
Check ranges with different difference types.
* testsuite/20_util/specialized_algorithms/uninitialized_move/constrained.cc:
Likewise.
LoongArch: Add LoongArch architecture detection to __float128 support in libgfortran and libquadmath [PR119408].
In GCC 14, LoongArch added __float128 as an alias for _Float128.
Commit r15-8962 added support for q/Q suffixes for 128-bit floating point
numbers. This causes the compiler to automatically link libquadmath
when compiling Fortran programs. But on LoongArch `long double` is
IEEE quad, so there is no need to implement libquadmath, and the
automatic linking causes a link failure.
PR target/119408
libgfortran/ChangeLog:
* acinclude.m4: When checking for __float128 support, determine
whether the current architecture is LoongArch. If so, return false.
* configure: Regenerate.
libquadmath/ChangeLog:
* configure.ac: When checking for __float128 support, determine
whether the current architecture is LoongArch. If so, return false.
* configure: Regenerate.
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
Signed-off-by: Jakub Jelinek <jakub@redhat.com>
(cherry picked from commit 1534f0099c98ea14c08a401302b05edf2231f411)
Patrick Palka [Thu, 13 Mar 2025 23:55:00 +0000 (19:55 -0400)]
libstdc++: Work around C++20 tuple<tuple<any>> constraint recursion [PR116440]
The type tuple<tuple<any>> is clearly copy/move constructible, but for
reasons that are not yet completely understood checking this triggers
constraint recursion with our C++20 tuple implementation (but not the
C++17 implementation).
It turns out this recursion stems from considering the non-template
tuple(const _Elements&) constructor during the copy/move constructibility
check. Considering this constructor is ultimately redundant, since the
defaulted copy/move constructors are better matches.
GCC has a non-standard "perfect candidate" optimization[1] that causes
overload resolution to shortcut considering template candidates if we
find a (non-template) perfect candidate. So to work around this issue
(and as a general compile-time optimization) this patch turns the
problematic constructor into a template so that GCC doesn't consider it
when checking for copy/move constructibility of this tuple type.
Changing the template-ness of a constructor can affect overload
resolution (since template-ness is a tiebreaker) so there's a risk this
change could e.g. introduce overload resolution ambiguities. But the
original C++17 implementation has long defined this constructor as a
template (in order to constrain it etc), so doing the same thing in the
C++20 mode should naturally be quite safe.
The testcase still fails with Clang (in C++20 mode) since it doesn't
implement said optimization.
Jason Merrill [Mon, 7 Apr 2025 15:49:19 +0000 (11:49 -0400)]
c++: constinit and value-initialization [PR119652]
Value-initialization built an AGGR_INIT_EXPR to set AGGR_INIT_ZERO_FIRST on.
Passing that AGGR_INIT_EXPR to maybe_constant_value returned a TARGET_EXPR,
which potential_constant_expression_1 mistook for a temporary.
We shouldn't add a TARGET_EXPR to the AGGR_INIT_EXPR in this case, just like
we already avoid adding it to CONSTRUCTOR or CALL_EXPR.
PR c++/119652
gcc/cp/ChangeLog:
* constexpr.cc (cxx_eval_outermost_constant_expr): Also don't add a
TARGET_EXPR around AGGR_INIT_EXPR.
Jason Merrill [Fri, 4 Apr 2025 21:34:08 +0000 (17:34 -0400)]
c++: __FUNCTION__ in lambda return type [PR118629]
In this testcase, the use of __FUNCTION__ is within a function parameter
scope, the lambda's. And P1787 changed __func__ to live in the parameter
scope. But [basic.scope.pdecl] says that the point of declaration of
__func__ is immediately before {, so in the trailing return type it isn't in
scope yet, so this __FUNCTION__ should refer to foo().
Looking first for a block scope, then a function parameter scope, gives us
the right result.
PR c++/118629
gcc/cp/ChangeLog:
* name-lookup.cc (pushdecl_outermost_localscope): Look for an
sk_block.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/lambda/lambda-__func__3.C: New test.