Jonathan Wakely [Wed, 16 Apr 2025 17:38:03 +0000 (18:38 +0100)]
libstdc++: Add dg-options "-std=gnu++20" to backported tests
These tests were backported from gcc-14 where the testsuite
automatically adds -std=gnu++20 as needed. That doesn't happen on the
older release branches, so an explicit dg-options directive is needed to
ensure the tests are run by default. Otherwise they'll only be run when
somebody uses a custom --target_board that includes -std=gnu++20.
For 29_atomics/headers/stdatomic.h/115807.cc we need to compile with
-std=gnu++23 instead.
Jonathan Wakely [Mon, 9 Dec 2024 17:35:24 +0000 (17:35 +0000)]
libstdc++: Skip redundant assertions in std::array equality [PR106212]
As PR c++/106212 shows, the Debug Mode checks cause a compilation error
for equality comparisons involving std::array prvalues in constant
expressions. Those Debug Mode checks are redundant when
comparing two std::array objects, because we already know we have a
valid range. We can also avoid the unnecessary step of using
std::__niter_base to do __normal_iterator unwrapping, which isn't needed
because our std::array iterators are just pointers. Using
std::__equal_aux1 instead of std::equal avoids the redundant checks in
std::equal and std::__equal_aux.
libstdc++-v3/ChangeLog:
PR libstdc++/106212
* include/std/array (operator==): Use std::__equal_aux1 instead
of std::equal.
* testsuite/23_containers/array/comparison_operators/106212.cc:
New test.
Jonathan Wakely [Mon, 9 Dec 2024 17:35:24 +0000 (17:35 +0000)]
libstdc++: Skip redundant assertions in std::span construction [PR117966]
As PR c++/117966 shows, the Debug Mode checks cause a compilation error
for a global constexpr std::span. Those debug checks are redundant when
constructing from an array or a range, because we already know we have a
valid range and we know its size. Instead of delegating to the
std::span(contiguous_iterator, contiguous_iterator) constructor, just
initialize the data members directly.
libstdc++-v3/ChangeLog:
PR libstdc++/117966
* include/std/span (span(T (&)[N])): Do not delegate to
constructor that performs redundant checks.
(span(array<T, N>&), span(const array<T, N>&)): Likewise.
(span(Range&&), span(const span<T, N>&)): Likewise.
* testsuite/23_containers/span/117966.cc: New test.
This is needed to avoid errors outside the immediate context when
evaluating is_default_constructible_v<vector<T, A>> when A is not
default constructible.
To avoid diagnostic regressions for 23_containers/vector/48101_neg.cc we
need to make the std::allocator<cv T> partial specializations default
constructible, which they probably should have been anyway.
libstdc++-v3/ChangeLog:
PR libstdc++/113841
* include/bits/allocator.h (allocator<cv T>): Add default
constructor to partial specializations for cv-qualified types.
* include/bits/stl_vector.h (_Vector_impl::_Vector_impl()):
Constrain so that it's only present if the allocator is default
constructible.
* include/bits/stl_bvector.h (_Bvector_impl::_Bvector_impl()):
Likewise.
* testsuite/23_containers/vector/cons/113841.cc: New test.
Jonathan Wakely [Mon, 2 Sep 2024 10:29:13 +0000 (11:29 +0100)]
libstdc++: Specialize std::disable_sized_sentinel_for for std::move_iterator [PR116549]
LWG 3736 added a partial specialization of this variable template for
two std::move_iterator types. This is needed for the case where the
types satisfy std::sentinel_for and are subtractable, but do not model
the semantics requirements of std::sized_sentinel_for.
libstdc++-v3/ChangeLog:
PR libstdc++/116549
* include/bits/stl_iterator.h (disable_sized_sentinel_for):
Define specialization for two move_iterator types, as per LWG
3736.
* testsuite/24_iterators/move_iterator/lwg3736.cc: New test.
Jonathan Wakely [Thu, 14 Nov 2024 01:14:44 +0000 (01:14 +0000)]
libstdc++: Add missing parts of LWG 3480 for directory iterators [PR117560]
It looks like I only read half the resolution of LWG 3480 and decided we
already supported it. As well as making the non-member overloads of end
take their parameters by value, we need some specializations of the
enable_borrowed_range and enable_view variable templates.
libstdc++-v3/ChangeLog:
PR libstdc++/117560
* include/bits/fs_dir.h (enable_borrowed_range, enable_view):
Define specializations for directory iterators, as per LWG 3480.
* testsuite/27_io/filesystem/iterators/lwg3480.cc: New test.
Jonathan Wakely [Fri, 11 Apr 2025 10:08:34 +0000 (11:08 +0100)]
libstdc++: Document thread-safety for COW std::string [PR21334]
The gcc4-compatible copy-on-write std::string does not conform to the
C++11 requirements on data race avoidance in standard containers.
Specifically, calling non-const member functions such as begin() and
data() needs to do the "copy on write" operation and so is most
definitely a modification of the object. As such, those non-const
members must not be called concurrently with any other uses of the
string object.
libstdc++-v3/ChangeLog:
PR libstdc++/21334
* doc/xml/manual/using.xml: Document that container data race
avoidance rules do not apply to COW std::string.
* doc/html/*: Regenerate.
Richard Biener [Wed, 9 Apr 2025 12:36:19 +0000 (14:36 +0200)]
rtl-optimization/119689 - compare-debug failure with LRA
The previous change to fix LRA rematerialization broke compare-debug
for i586 bootstrap. Fixed by using prev_nonnote_nondebug_insn
instead of prev_nonnote_insn.
PR rtl-optimization/119689
PR rtl-optimization/115568
* lra-remat.cc (create_cands): Use prev_nonnote_nondebug_insn
to check whether insn2 is directly before insn.
[PR115568][LRA]: Use more strict output reload check in rematerialization
In this PR case LRA rematerialized a value from inheritance insn
instead of output reload one. This resulted in considering a
rematerilization candidate value available when it was actually
not. As a consequence an insn after rematerliazation used the
unexpected value and this use resulted in fp exception. The patch
fixes this bug.
gcc/ChangeLog:
PR rtl-optimization/115568
* lra-remat.cc (create_cands): Check that output reload insn is
adjacent to given insn. Update a comment.
Richard Biener [Thu, 18 Jul 2024 11:35:33 +0000 (13:35 +0200)]
middle-end/115641 - invalid address construction
fold_truth_andor_1 via make_bit_field_ref builds an address of
a CALL_EXPR which isn't valid GENERIC and later causes an ICE.
The following simply avoids the folding for f ().a != 1 || f ().b != 2
as it is a premature optimization anyway. The alternative would
have been to build a TARGET_EXPR around the call. To get this far
f () has to be const as otherwise the two calls are not semantically
equivalent for the optimization.
PR middle-end/115641
* fold-const.cc (decode_field_reference): If the inner
reference isn't something we can take the address of, fail.
Richard Biener [Sun, 13 Oct 2024 09:42:27 +0000 (11:42 +0200)]
tree-optimization/116481 - avoid building function_type[]
The following avoids building an array type with function or method
element type during diagnosing an array bound violation as this
will result in an error, rejecting a program with a not too useful
error message. Instead build such array type manually.
PR tree-optimization/116481
* pointer-query.cc (build_printable_array_type):
Build an array types with function or method element type
manually to avoid bogus diagnostic.
Richard Biener [Thu, 26 Sep 2024 13:41:59 +0000 (15:41 +0200)]
tree-optimization/116850 - corrupt post-dom info
Path isolation computes post-dominators on demand but can end up
splitting blocks after that, wrecking it. We can delay splitting
of blocks until we no longer need the post-dom info which is what
the following patch does to solve the issue.
PR tree-optimization/116850
* gimple-ssa-isolate-paths.cc (bb_split_points): New global.
(insert_trap): Delay BB splitting if post-doms are computed.
(find_explicit_erroneous_behavior): Process delayed BB
splitting after releasing post dominators.
(gimple_ssa_isolate_erroneous_paths): Do not free post-dom
info here.
Richard Biener [Fri, 15 Nov 2024 10:56:14 +0000 (11:56 +0100)]
tree-optimization/117574 - bougs niter lt-to-ne
When trying to change a IV from IV0 < IV1 to IV0' != IV1' we apply
fancy adjustments to the may_be_zero condition we compute rather
than using the obvious IV0->base >= IV1->base expression (to be
able to use > instead of >=?). This doesn't seem to go well.
PR tree-optimization/117574
* tree-ssa-loop-niter.cc (number_of_iterations_lt_to_ne):
Use the obvious may_be_zero condition.
Richard Biener [Thu, 5 Dec 2024 09:47:13 +0000 (10:47 +0100)]
tree-optimization/117912 - bogus address equivalences for __builtin_object_size
VN again is the culprit for exploiting address equivalences before
__builtin_object_size got the chance to do its job. This time
it isn't about union members but adjacent structure fields where
an address to one after the last element of an array field can
spill over to the next field.
The following protects all out-of-bound accesses on the upper bound
side (singling out TYPE_MAX_VALUE + 1 is more expensive). It
ignores other out-of-bound addresses that would invoke UB.
Zero-sized arrays are a bit awkward because the C++ represents them
with a -1U upper bound.
There's a similar issue for zero-sized components whose address can
be the same as the adjacent field in C.
PR tree-optimization/117912
* tree-ssa-sccvn.cc (copy_reference_ops_from_ref): For addresses
of zero-sized components do not set ->off if the object size pass
didn't run.
For OOB ARRAY_REF accesses in address expressions avoid setting
->off if the object size pass didn't run.
(valueize_refs_1): Likewise.
* c-c++-common/torture/pr117912-1.c: New testcase.
* c-c++-common/torture/pr117912-2.c: Likewise.
* c-c++-common/torture/pr117912-3.c: Likewise.
Richard Biener [Mon, 3 Feb 2025 08:55:50 +0000 (09:55 +0100)]
tree-optimization/118717 - store commoning vs. abnormals
When we sink common stores in cselim or the sink pass we have to
make sure to not introduce overlapping lifetimes for abnormals
used in the ref. The easiest is to avoid sinking stmts which
reference abnormals at all which is what the following does.
PR tree-optimization/118717
* tree-ssa-phiopt.cc (cond_if_else_store_replacement_1):
Do not common stores referencing abnormal SSA names.
* tree-ssa-sink.cc (sink_common_stores_to_bb): Likewise.
Eric Botcazou [Fri, 4 Apr 2025 09:45:23 +0000 (11:45 +0200)]
Ada: Fix thinko in Eigensystem for complex Hermitian matrices
The implementation solves the eigensystem for a NxN complex Hermitian matrix
by first solving it for a 2Nx2N real symmetric matrix and then interpreting
the 2Nx1 real vectors as Nx1 complex ones, but the last step does not work.
The patch fixes the last step and also performs a small cleanup throughout
the implementation, mostly in the commentary and without functional changes.
gcc/ada/
* libgnat/a-ngcoar.adb (Eigensystem): Adjust notation and fix the
layout of the real symmetric matrix in the main comment. Adjust
the layout of the associated code accordingly and correctly turn
the 2Nx1 real vectors into Nx1 complex ones.
(Eigenvalues): Minor similar tweaks.
* libgnat/a-ngrear.adb (Jacobi): Minor tweaks in the main comment.
Adjust notation and corresponding parameter names of functions.
Fix call to Unit_Matrix routine. Adjust the comment describing
the various kinds of iterations to match the implementation.
Jonathan Wakely [Fri, 31 Mar 2023 12:44:04 +0000 (13:44 +0100)]
libstdc++: Teach optimizer that empty COW strings are empty [PR107087]
The compiler doesn't know about the invariant that the _S_empty_rep()
object is immutable and so _M_length and _M_refcount are always zero.
This means that we get warnings about writing possibly-non-zero length
strings into buffers that can't hold them. If we teach the compiler that
the empty rep is always zero length, it knows it can be copied into any
buffer.
For Stage 1 we might want to also consider adding this to capacity():
if (_S_empty_rep()._M_capacity != 0)
__builtin_unreachable();
And this to _Rep::_M_is_leaked() and _Rep::_M_is_shared():
if (_S_empty_rep()._M_refcount != 0)
__builtin_unreachable();
libstdc++-v3/ChangeLog:
PR tree-optimization/107087
* include/bits/cow_string.h (basic_string::size()): Add
optimizer hint that _S_empty_rep()._M_length is always zero.
(basic_string::length()): Call size().
Jonathan Wakely [Thu, 8 Feb 2024 13:59:42 +0000 (13:59 +0000)]
libstdc++: Avoid aliasing violation in std::valarray [PR99117]
The call to __valarray_copy constructs an _Array object to refer to
this->_M_data but that means that accesses to this->_M_data are through
a restrict-qualified pointer. This leads to undefined behaviour when
copying from an _Expr object that actually aliases this->_M_data.
Replace the call to __valarray_copy with a plain loop. I think this
removes the only use of that overload of __valarray_copy, so it could
probably be removed. I haven't done that here.
libstdc++-v3/ChangeLog:
PR libstdc++/99117
* include/std/valarray (valarray::operator=(const _Expr&)):
Use loop to copy instead of __valarray_copy with _Array.
* testsuite/26_numerics/valarray/99117.cc: New test.
Iain Buclaw [Tue, 11 Mar 2025 16:56:18 +0000 (17:56 +0100)]
d: Fix regression returning from function with invariants [PR119139]
An optimization was added in GDC-12 which sets the TREE_READONLY flag on
all local variables with the storage class `const' assigned. For some
reason, const is also being added by the front-end to `__result'
variables in non-virtual functions, which ends up getting wrong code by
the gimplify pass promoting the local to static storage.
A bug has been raised upstream, as this looks like an error in the AST.
For now, turn off setting TREE_READONLY on all result variables.
PR d/119139
gcc/d/ChangeLog:
* decl.cc (get_symbol_decl): Don't set TREE_READONLY for __result
declarations.
Fix folding of BIT_NOT_EXPR for POLY_INT_CST [PR118976]
There was an embarrassing typo in the folding of BIT_NOT_EXPR for
POLY_INT_CSTs: it used - rather than ~ on the poly_int. Not sure
how that happened, but it might have been due to the way that
~x is implemented as -1 - x internally.
gcc/
PR tree-optimization/118976
* fold-const.cc (const_unop): Use ~ rather than - for BIT_NOT_EXPR.
* config/aarch64/aarch64.cc (aarch64_test_sve_folding): New function.
(aarch64_run_selftests): Run it.
aarch64: Fix folding of degenerate svwhilele case [PR117045]
The svwhilele folder mishandled the degenerate case in which
the second argument is the maximum integer. In that case,
the result is all-true regardless of the first parameter:
If the second scalar operand is equal to the maximum signed integer
value then a condition which includes an equality test can never fail
and the result will be an all-true predicate.
This is because the conceptual "increment the first operand
by 1 after each element" is done modulo the range of the operand.
The GCC code was instead treating it as infinite precision.
whilele_5.c even had a test for the incorrect behaviour.
The easiest fix seemed to be to handle that case specially before
doing constant folding. This also copes with variable first operands.
gcc/
PR target/116999
PR target/117045
* config/aarch64/aarch64-sve-builtins-base.cc
(svwhilelx_impl::fold): Check for WHILELTs of the minimum value
and WHILELEs of the maximum value. Fold them to all-false and
all-true respectively.
The testcase contains a VNx2QImode pseudo that is live across a call
and that cannot be allocated a call-preserved register. LRA quite
reasonably tried to save it before the call and restore it afterwards.
Unfortunately, the target told it to do that in SImode, even though
punning between SImode and VNx2QImode is disallowed by both
TARGET_CAN_CHANGE_MODE_CLASS and TARGET_MODES_TIEABLE_P.
The natural class to use for SImode is GENERAL_REGS, so this led
to an unsalvageable situation in which we had:
(set (subreg:VNx2QI (reg:SI A) 0) (reg:VNx2QI B))
where A needed GENERAL_REGS and B needed FP_REGS. We therefore ended
up in a reload loop.
The hooks above should ensure that this situation can never occur
for incoming subregs. It only happened here because the target
explicitly forced it.
The decision to use SImode for modes smaller than 4 bytes dates
back to the beginning of the port, before 16-bit floating-point
modes existed. I'm not sure whether promoting to SImode really
makes sense for any FPR, but that's a separate performance/QoI
discussion. For now, this patch just disallows using SImode
when it is wrong for correctness reasons, since that should be
safer to backport.
gcc/
PR testsuite/116238
* config/aarch64/aarch64.cc (aarch64_hard_regno_caller_save_mode):
Only return SImode if we can convert to and from it.
gcc/testsuite/
PR testsuite/116238
* gcc.target/aarch64/sve/pr116238.c: New test.
Christophe Lyon [Mon, 3 Mar 2025 11:12:18 +0000 (11:12 +0000)]
arm: Handle fixed PIC register in require_pic_register (PR target/115485)
Commit r9-4307-g89d7557202d25a forgot to accept a fixed PIC register
when extending the assert in require_pic_register.
arm_pic_register can be set explicitly by the user
(e.g. -mpic-register=r9) or implicitly as the default value with
-fpic/-fPIC/-fPIE and -mno-pic-data-is-text-relative -mlong-calls, and
we want to use/accept it when recording cfun->machine->pic_reg as used
to be the case.
--cut here--
/* Otherwise, if this register is used by I3, then this register
now dies here, so we must put a REG_DEAD note here unless there
is one already. */
else if (reg_referenced_p (XEXP (note, 0), PATTERN (i3))
&& ! (REG_P (XEXP (note, 0))
? find_regno_note (i3, REG_DEAD,
REGNO (XEXP (note, 0)))
: find_reg_note (i3, REG_DEAD, XEXP (note, 0))))
{
PUT_REG_NOTE_KIND (note, REG_DEAD);
place = i3;
}
--cut here--
Flags register is used in I3, but there already is a REG_DEAD note in I3.
The above condition doesn't trigger and continues in the "else" part where
REG_DEAD note is put to I2. The proposed solution corrects the above
logic to trigger every time the register is referenced in I3, avoiding the
"else" part.
PR rtl-optimization/118739
gcc/ChangeLog:
* combine.cc (distribute_notes) <case REG_UNUSED>: Correct the
logic when the register is used by I3.
Iain Buclaw [Fri, 28 Feb 2025 18:22:36 +0000 (19:22 +0100)]
d: Fix comparing uninitialized memory in dstruct.d [PR116961]
Floating-point emulation in the D front-end is done via a type named
`struct longdouble`, which in GDC is a small interface around the
real_value type. Because the D code cannot include gcc/real.h directly,
a big enough buffer is used for the data instead.
On x86_64, this buffer is actually bigger than real_value itself, so
when a new longdouble object is created with
there is uninitialized padding at the end of `r`. This was never a
problem when D was implemented in C++ (until GCC 12) as comparing two
longdouble objects with `==' would be forwarded to the relevant
operator== overload that extracted the underlying real_value.
However when the front-end was translated to D, such conditions were
instead rewritten into identity comparisons
return exp.toReal() is CTFloat.zero
The `is` operator gets lowered as a call to `memcmp() == 0', which is
where the read of uninitialized memory occurs, as seen by valgrind.
==26778== Conditional jump or move depends on uninitialised value(s)
==26778== at 0x911F41: dmd.dstruct._isZeroInit(dmd.expression.Expression) (dstruct.d:635)
==26778== by 0x9123BE: StructDeclaration::finalizeSize() (dstruct.d:373)
==26778== by 0x86747C: dmd.aggregate.AggregateDeclaration.determineSize(ref const(dmd.location.Loc)) (aggregate.d:226)
[...]
To avoid accidentally reading uninitialized data, explicitly initialize
all `longdouble` variables with an empty constructor on C++ side of the
implementation before initializing underlying real_value type it holds.
where the shift count operand does not trivially fit the scheme of
address operands. Reject those operands, especially since
strip_address_mutations() expects expressions of the form
(and ... (const_int ...)) and fails for (and ... (const_wide_int ...)).
Thus, be more strict here and accept only CONST_INT operands. Done by
replacing immediate_operand() with const_int_operand() which is enough
since the former only additionally checks for LEGITIMATE_PIC_OPERAND_P
and targetm.legitimate_constant_p which are always true for CONST_INT
operands.
While on it, fix indentation of the if block.
gcc/ChangeLog:
PR target/118835
* config/s390/s390.cc (s390_valid_shift_count): Reject shift
count operands which do not trivially fit the scheme of
address operands.
Dimitry Andric [Tue, 28 Jan 2025 17:36:16 +0000 (18:36 +0100)]
libgcc: On FreeBSD use GCC's crt objects for static linking
Add crtbeginT.o to extra_parts on FreeBSD. This ensures we use GCC's
crt objects for static linking. Otherwise it could mix crtbeginT.o
from the base system with libgcc's crtend.o, possibly leading to
segfaults.
libgcc:
PR target/118685
* config.host (*-*-freebsd*): Add crtbeginT.o to extra_parts.