Jakub Jelinek [Fri, 13 Oct 2023 07:09:32 +0000 (09:09 +0200)]
libstdc++: Fix tr1/8_c_compatibility/cstdio/functions.cc regression with recent glibc
The following testcase started FAILing recently after the
https://sourceware.org/git/?p=glibc.git;a=commit;h=64b1a44183a3094672ed304532bedb9acc707554
glibc change which marked vfscanf with nonnull (1) attribute.
While vfwscanf hasn't been marked similarly (strangely), the patch changes
that too. By using va_arg one hides the value of it from the compiler
(volatile keyword would do too, or making the FILE* stream a function
argument, but then it might need to be guarded by #if or something).
2023-10-13 Jakub Jelinek <jakub@redhat.com>
* testsuite/tr1/8_c_compatibility/cstdio/functions.cc (test01):
Initialize stream to va_arg(ap, FILE*) rather than 0.
* testsuite/tr1/8_c_compatibility/cwchar/functions.cc (test01):
Likewise.
Jakub Jelinek [Wed, 19 Jul 2023 11:48:53 +0000 (13:48 +0200)]
wide-int: Fix up wi::divmod_internal [PR110731]
As the following testcase shows, wi::divmod_internal doesn't handle
correctly signed division with precision > 64 when the dividend (and likely
divisor as well) is the type's minimum and the precision isn't divisible
by 64.
A few lines above what the patch hunk changes is:
/* Make the divisor and dividend positive and remember what we
did. */
if (sgn == SIGNED)
{
if (wi::neg_p (dividend))
{
neg_dividend = -dividend;
dividend = neg_dividend;
dividend_neg = true;
}
if (wi::neg_p (divisor))
{
neg_divisor = -divisor;
divisor = neg_divisor;
divisor_neg = true;
}
}
i.e. we negate negative dividend or divisor and remember those.
But, after we do that, when unpacking those values into b_dividend and
b_divisor we need to always treat the wide_ints as UNSIGNED,
because divmod_internal_2 performs an unsigned division only.
Now, if precision <= 64, we don't reach here at all, earlier code
handles it. If dividend or divisor aren't the most negative values,
the negation clears their most significant bit, so it doesn't really
matter if we unpack SIGNED or UNSIGNED. And if precision is multiple
of HOST_BITS_PER_WIDE_INT, there is no difference in behavior, while
-0x80000000000000000000000000000000 negates to
-0x80000000000000000000000000000000 the unpacking of it as SIGNED
or UNSIGNED works the same.
In the testcase, we have signed precision 119 and the dividend is
val = { 0, 0xffc0000000000000 }, len = 2, precision = 119
both before and after negation.
Divisor is
val = { 2 }, len = 1, precision = 119
But we really want to divide 0x400000000000000000000000000000 by 2
unsigned and then negate at the end.
If it is unsigned precision 119 division
0x400000000000000000000000000000 by 2
dividend is
val = { 0, 0xffc0000000000000 }, len = 2, precision = 119
but as we unpack it UNSIGNED, it is unpacked into
0, 0, 0, 0x00400000
The following patch fixes it by always using UNSIGNED unpacking
because we've already negated negative values at that point if
sgn == SIGNED and so most negative constants should be treated as
positive.
2023-07-19 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/110731
* wide-int.cc (wi::divmod_internal): Always unpack dividend and
divisor as UNSIGNED regardless of sgn.
The following removes a misguided attempt to allow x + x in a reduction
path, also allowing x * x which isn't valid. x + x actually never
arrives this way but instead is canonicalized to 2 * x. This makes
reduction path handling consistent with how we handle the single-stmt
reduction case.
PR tree-optimization/111764
* tree-vect-loop.c (check_reduction_path): Remove the attempt
to allow x + x via special-casing of assigns.
Richard Biener [Mon, 16 Oct 2023 10:50:46 +0000 (12:50 +0200)]
middle-end/111818 - failed DECL_NOT_GIMPLE_REG_P setting of volatile
The following addresses a missed DECL_NOT_GIMPLE_REG_P setting of
a volatile declared parameter which causes inlining to substitute
a constant parameter into a context where its address is required.
The main issue is in update_address_taken which clears
DECL_NOT_GIMPLE_REG_P from the parameter but fails to rewrite it
because is_gimple_reg returns false for volatiles. The following
changes maybe_optimize_var to make the 1:1 correspondence between
clearing DECL_NOT_GIMPLE_REG_P of a register typed decl and
actually rewriting it to SSA.
PR middle-end/111818
* tree-ssa.c (maybe_optimize_var): When clearing
DECL_NOT_GIMPLE_REG_P always rewrite into SSA.
Richard Biener [Mon, 19 Jun 2023 07:52:45 +0000 (09:52 +0200)]
tree-optimization/110298 - CFG cleanup and stale nb_iterations
When unrolling we eventually kill nb_iterations info since it may
refer to removed SSA names. But we do this only after cleaning
up the CFG which in turn can end up accessing it. Fixed by
swapping the two.
PR tree-optimization/110298
* tree-ssa-loop-ivcanon.c (tree_unroll_loops_completely):
Clear number of iterations info before cleaning up the CFG.
Richard Biener [Mon, 19 Jun 2023 07:23:16 +0000 (09:23 +0200)]
debug/110295 - mixed up early/late debug for member DIEs
When we process a scope typedef during early debug creation and
we have already created a DIE for the type when the decl is
TYPE_DECL_IS_STUB and this DIE is still in limbo we end up
just re-parenting that type DIE instead of properly creating
a DIE for the decl, eventually picking up the now completed
type and creating DIEs for the members. Instead this is currently
defered to the second time we come here, when we annotate the
DIEs with locations late where now the type DIE is no longer
in limbo and we fall through doing the job for the decl.
The following makes sure we perform the necessary early tasks
for this by continuing with the decl DIE creation after setting
a parent for the limbo type DIE.
PR debug/110295
* dwarf2out.c (process_scope_var): Continue processing
the decl after setting a parent in case the existing DIE
was in limbo.
Richard Biener [Fri, 9 Jun 2023 07:29:09 +0000 (09:29 +0200)]
middle-end/110182 - TYPE_PRECISION on VECTOR_TYPE causes wrong-code
When folding two conversions in a row we use TYPE_PRECISION but
that's invalid for VECTOR_TYPE. The following fixes this by
using element_precision instead.
* match.pd (two conversions in a row): Use element_precision
to DTRT for VECTOR_TYPE.
liuhongt [Thu, 7 Dec 2023 01:17:27 +0000 (09:17 +0800)]
Don't assume it's AVX_U128_CLEAN after call_insn whose abi.mode_clobber(V4DImode) deosn't contains all SSE_REGS.
If the function desn't clobber any sse registers or only clobber
128-bit part, then vzeroupper isn't issued before the function exit.
the status not CLEAN but ANY after the function.
Also for sibling_call, it's safe to issue an vzeroupper. Also there
could be missing vzeroupper since there's no mode_exit for
sibling_call_p.
gcc/ChangeLog:
PR target/112891
* config/i386/i386.c (ix86_avx_u128_mode_after): Return
AVX_U128_ANY if callee_abi doesn't clobber all_sse_regs to
align with ix86_avx_u128_mode_needed.
(ix86_avx_u128_mode_needed): Return AVX_U128_ClEAN for
sibling_call.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr112891.c: New test.
* gcc.target/i386/pr112891-2.c: New test.
Jonathan Wakely [Mon, 1 Nov 2021 12:27:43 +0000 (12:27 +0000)]
libstdc++: Missing constexpr for __gnu_debug::__valid_range etc
The new 25_algorithms/move/constexpr.cc test fails in debug mode,
because the debug assertions use the non-constexpr overloads in
<debug/stl_iterator.h>.
libstdc++-v3/ChangeLog:
* include/debug/stl_iterator.h (__valid_range): Add constexpr
for C++20. Qualify call to avoid ADL.
(__get_distance, __can_advance, __unsafe, __base): Likewise.
* testsuite/25_algorithms/move/constexpr.cc: Also check with
std::reverse_iterator arguments.
where the expression is ((1 << R3) + R10), which does not match a valid
machine addressing mode. Consequently `print_operand_address' chokes.
This can be reduced to the testcase included, where it triggers the same
ICE in `p'. Preincrements are required so that their results land in
registers and consequently an indexed addressing mode is tried or
otherwise doing operations piecemeal on stack-based function arguments
as direct input operands turns out more profitable in terms of RTX costs
and the ICE is avoided.
The ultimate cause has been commit c605a8bf9270 ("VAX: Accept ASHIFT in
address expressions"), where a shift of an immediate value by a register
has been mistakenly allowed as an index expression as if the shift
operation was commutative such as multiplication is. So with ASHIFT the
scaler in an index expression has to be the right-hand operand, and the
backend has to enforce that, whereas with MULT the scaler can be either
operand.
Fix this by only accepting the index scaler as the RHS operand to
ASHIFT.
gcc/
PR target/111815
* config/vax/vax.c (index_term_p): Only accept the index scaler
as the RHS operand to ASHIFT.
gcc/testsuite/
PR target/111815
* gcc.dg/torture/pr111815.c: New test.
The Xmethod for std::deque::operator[] has the same bug that I recently
fixed for the std::deque::size() Xmethod. The first node might have
unused capacity at the start, which needs to be accounted for when
indexing into the deque.
libstdc++-v3/ChangeLog:
PR libstdc++/112491
* python/libstdcxx/v6/xmethods.py (DequeWorkerBase.index):
Correctly handle unused capacity at the start of the first node.
* testsuite/libstdc++-xmethods/deque.cc: Check index operator
when elements have been removed from the front.
The Xmethod for std::deque::size() assumed that the first element would
be at the start of the first node. That's only true if elements are only
added at the back. If an element is inserted at the front, or removed
from the front (or anywhere before the middle) then the first node will
not be completely populated, and the Xmethod will give the wrong result.
libstdc++-v3/ChangeLog:
PR libstdc++/112491
* python/libstdcxx/v6/xmethods.py (DequeWorkerBase.size): Fix
calculation to use _M_start._M_cur.
* testsuite/libstdc++-xmethods/deque.cc: Check failing cases.
Jonathan Wakely [Thu, 28 Sep 2023 13:54:59 +0000 (14:54 +0100)]
libstdc++: Reformat Python code
Some of these changes were suggested by autopep8's --aggressive
option, others are for readability.
Break long lines by splitting strings across multiple lines, or
introducing local variables to hold results.
Use raw strings for regular expressions, so that backslashes don't need
to be escaped.
libstdc++-v3/ChangeLog:
* python/libstdcxx/v6/printers.py: Break long lines. Use raw
strings for regular expressions. Add whitespace around
operators.
(is_member_of_namespace): Use isinstance to check type.
(is_specialization_of): Likewise. Adjust template_name
for versioned namespace instead of duplicating the re.match
call.
(StdExpAnyPrinter._string_types): New static method.
(StdExpAnyPrinter.to_string): Use _string_types.
Iain Buclaw [Tue, 7 Nov 2023 13:04:07 +0000 (14:04 +0100)]
libphobos: Fix regression d21 loops in getCpuInfo0B in Solaris/x86 kernel zone
This function assumes that cpuid would return "invalid domain" when a
sub-leaf index greater than what's supported is requested. This turned
out not to always be the case when running on some virtual machines.
As the loop only does anything for levels 0 and 1, make that a hard
limit for number of times the loop is ran.
PR d/112408
libphobos/ChangeLog:
* libdruntime/core/cpuid.d (getCpuInfo0B): Limit number of times loop
runs.
Kewen Lin [Thu, 12 Oct 2023 05:05:03 +0000 (00:05 -0500)]
rs6000: Make 32 bit stack_protect support prefixed insn [PR111367]
As PR111367 shows, with prefixed insn supported, some of
checkings consider it's able to leverage prefixed insn
for stack protect related load/store, but since we don't
actually change the emitted assembly for 32 bit, it can
cause the assembler error as exposed.
Mike's commit r10-4547-gce6a6c007e5a98 has already handled
the 64 bit case (DImode), this patch is to treat the 32
bit case (SImode) by making use of mode iterator P and
ptrload attribute iterator, also fixes the constraints
to match the emitted operand formats.
PR target/111367
gcc/ChangeLog:
* config/rs6000/rs6000.md (stack_protect_setsi): Support prefixed
instruction emission and incorporate to stack_protect_set<mode>.
(stack_protect_setdi): Rename to ...
(stack_protect_set<mode>): ... this, adjust constraint.
(stack_protect_testsi): Support prefixed instruction emission and
incorporate to stack_protect_test<mode>.
(stack_protect_testdi): Rename to ...
(stack_protect_test<mode>): ... this, adjust constraint.
rax is used to save and restore DFmode value. In RA both GENERAL_REGS
and SSE_REGS cost zero since we didn't disparage the
alternative in movdf_internal pattern, according to register
allocation order, GENERAL_REGS is allocated. The patch add ? for
alternative (r,v) and (v,r) just like we did for movsf/hf/bf_internal
pattern, after that we get optimal RA.
Andrew Pinski [Thu, 5 Oct 2023 19:21:19 +0000 (12:21 -0700)]
MATCH: Fix infinite loop between `vec_cond(vec_cond(a,b,0), c, d)` and `a & b`
Match has a pattern which converts `vec_cond(vec_cond(a,b,0), c, d)`
into `vec_cond(a & b, c, d)` but since in this case a is a comparison
fold will change `a & b` back into `vec_cond(a,b,0)` which causes an
infinite loop.
The best way to fix this is to enable the patterns for vec_cond(*,vec_cond,*)
only for GIMPLE so we don't get an infinite loop for fold any more.
Jonathan Wakely [Wed, 4 Oct 2023 11:07:11 +0000 (12:07 +0100)]
libstdc++: Fix testsuite failures with -O0
Backport the prune.exp change from r12-4425-g1595fe44e11a96 to fix two
testsuite failures when testing with -O0:
FAIL: 20_util/uses_allocator/69293_neg.cc (test for excess errors)
FAIL: 20_util/uses_allocator/cons_neg.cc (test for excess errors)
Also force some 20_util/integer_comparisons/ xfail tests to use -O2 so
that the errors match the dg-error directives.