Jonathan Wakely [Wed, 28 Apr 2021 11:45:49 +0000 (12:45 +0100)]
libstdc++: Add missing noexcept on std::thread member function [PR 100298]
The new inline definition of std::thread::hardware_concurrency() for
non-gthreads targets is missing the noexcept-specifier that is on the
declaration.
libstdc++-v3/ChangeLog:
PR libstdc++/100298
* include/bits/std_thread.h (thread::hardware_concurrency): Add
missing noexcept to inline definition for non-gthreads targets.
Richard Earnshaw [Tue, 27 Apr 2021 11:25:30 +0000 (12:25 +0100)]
arm: fix UB when compiling thumb2 with PIC [PR100236]
arm_compute_save_core_reg_mask contains UB in that the saved PIC
register number is used to create a bit mask. However, for some target
options this register is undefined and we end up with a shift of ~0.
On native compilations this is benign since the shift will still be
large enough to move the bit outside of the range of the mask, but if
cross compiling from a system that truncates out-of-range shifts to
zero (or worse, raises a trap for such values) we'll get potentially
wrong code (or a fault).
gcc:
PR target/100236
* config/arm/arm.c (THUMB2_WORK_REGS): Check PIC_OFFSET_TABLE_REGNUM
is valid before including it in the mask.
(cherry picked from commit 01d0bda8bdf3cd804e1e00915d432ad0cdc49399)
64bit loads to/stores from x87 and SSE registers are atomic also on 32-bit
targets, so there is no need for additional atomic moves to a temporary
register.
Introduced load peephole2 patterns assume that there won't be any additional
loads from the load location outside the peepholed sequence and wrongly
removed the source location initialization.
OTOH, introduced store peephole2 patterns assume there won't be any additional
loads from the stored location outside the peepholed sequence and wrongly
removed the destination location initialization. Note that we can't use plain
x87 FST instruction to initialize destination location because FST converts
the value to the double-precision format, changing bits during move.
The patch restores removed initializations in load and store patterns.
Additionally, plain x87 FST in store peephole2 patterns is prevented by
limiting the store operand source to SSE registers.
2021-04-27 Uroš Bizjak <ubizjak@gmail.com>
gcc/
PR target/100182
* config/i386/sync.md (FILD_ATOMIC/FIST_ATOMIC FP load peephole2):
Copy operand 3 to operand 4. Use sse_reg_operand
as operand 3 predicate.
(FILD_ATOMIC/FIST_ATOMIC FP load peephole2 with mem blockage): Ditto.
(LDX_ATOMIC/STX_ATOMIC FP load peephole2): Ditto.
(LDX_ATOMIC/LDX_ATOMIC FP load peephole2 with mem blockage): Ditto.
(FILD_ATOMIC/FIST_ATOMIC FP store peephole2):
Copy operand 1 to operand 0.
(FILD_ATOMIC/FIST_ATOMIC FP store peephole2 with mem blockage): Ditto.
(LDX_ATOMIC/STX_ATOMIC FP store peephole2): Ditto.
(LDX_ATOMIC/LDX_ATOMIC FP store peephole2 with mem blockage): Ditto.
Patrick Palka [Tue, 27 Apr 2021 18:07:46 +0000 (14:07 -0400)]
libstdc++: Fix up lambda in join_view::_Iterator::operator++ [PR100290]
Currently, the return type of this lambda is decltype(auto), so the
lambda ends up returning a copy of _M_parent->_M_inner rather than a
reference to it when _S_ref_glvalue is false. This means _M_inner and
ranges::end(__inner_range) are respectively an iterator and sentinel for
different ranges, so comparing them is undefined.
libstdc++-v3/ChangeLog:
PR libstdc++/100290
* include/std/ranges (join_view::_Iterator::operator++): Correct
the return type of the lambda to avoid returning a copy of
_M_parent->_M_inner.
* testsuite/std/ranges/adaptors/join.cc (test10): New test.
Patrick Palka [Sat, 24 Apr 2021 04:14:29 +0000 (00:14 -0400)]
c++: do_class_deduction and dependent init [PR93383]
Here we're crashing during CTAD with a dependent initializer (performed
from convert_template_argument) because one of the initializer's
elements has an empty TREE_TYPE, which ends up making resolve_args
unhappy.
Besides the case where we're initializing one template placeholder
from another, which is already specifically handled earlier in
do_class_deduction, it seems we can't in general correctly resolve a
template placeholder using a dependent initializer, so this patch makes
the function just punt until instantiation time instead.
gcc/cp/ChangeLog:
PR c++/89565
PR c++/93383
PR c++/95291
PR c++/99200
PR c++/99683
* pt.c (do_class_deduction): Punt if the initializer is
type-dependent.
gcc/testsuite/ChangeLog:
PR c++/89565
PR c++/93383
PR c++/95291
PR c++/99200
PR c++/99683
* g++.dg/cpp2a/nontype-class39.C: Remove dg-ice directive.
* g++.dg/cpp2a/nontype-class45.C: New test.
* g++.dg/cpp2a/nontype-class46.C: New test.
* g++.dg/cpp2a/nontype-class47.C: New test.
* g++.dg/cpp2a/nontype-class48.C: New test.
Harald Anlauf [Sat, 24 Apr 2021 18:51:41 +0000 (20:51 +0200)]
PR fortran/100154 - ICE in gfc_conv_procedure_call, at fortran/trans-expr.c:6131
Add appropriate static checks for the character and status arguments to
the GNU Fortran intrinsic extensions fget[c], fput[c]. Extend variable
check to allow a function reference having a data pointer result.
gcc/fortran/ChangeLog:
PR fortran/100154
* check.c (variable_check): Allow function reference having a data
pointer result.
(arg_strlen_is_zero): New function.
(gfc_check_fgetputc_sub): Add static check of character and status
arguments.
(gfc_check_fgetput_sub): Likewise.
* intrinsic.c (add_subroutines): Fix argument name for the
character argument to intrinsic subroutines fget[c], fput[c].
gcc/testsuite/ChangeLog:
PR fortran/100154
* gfortran.dg/pr100154.f90: New test.
Marek Polacek [Wed, 21 Apr 2021 00:24:09 +0000 (20:24 -0400)]
c++: Prevent bogus -Wtype-limits warning with NTTP [PR100161]
Recently, we made sure that we never call value_dependent_expression_p
on an expression that isn't potential_constant_expression. That caused
this bogus warning with a non-type template parameter, something that
users don't want to see.
The problem is that in tsubst_copy_and_build/LE_EXPR 't' is "i < n",
which, due to 'i', is not p_c_e, therefore we call t_d_e_p. But the
type of 'n' isn't dependent, so we think the whole 't' expression is
not dependent. It seems we need to test both op0 and op1 separately
to suppress this warning.
gcc/cp/ChangeLog:
PR c++/100161
* pt.c (tsubst_copy_and_build) <case PLUS_EXPR>: Test op0 and
op1 separately for value- or type-dependence.
gcc/testsuite/ChangeLog:
PR c++/100161
* g++.dg/warn/Wtype-limits6.C: New test.
Marek Polacek [Tue, 20 Apr 2021 16:16:04 +0000 (12:16 -0400)]
c++: Don't allow defining types in enum-base [PR96380]
In r11-2064 I made cp_parser_enum_specifier commit to tentative parse
when seeing a '{'. That still looks like the correct thing to do, but
it caused an ICE-on-invalid as well as accepts-invalid.
When we have something sneaky like this, which is broken in multiple
ways:
template <class>
enum struct c : union enum struct c { e = b, f = a };
we parse the "enum struct c" part (that's OK) and then we see that
we have an enum-base, so we consume ':' and then parse the type-specifier
that follows the :. "union enum" is clearly invalid, but we're still
parsing tentatively and we parse everything up to the ;, and then
throw away the underlying type. We parsed everything because we were
tricked into parsing an enum-specifier in an enum-base of another
enum-specifier! Not good.
Since the grammar for enum-base doesn't allow a defining-type-specifier,
only a type-specifier, we should set type_definition_forbidden_message
which fixes all the problems in this PR.
libgomp/testsuite: Fix checks for dg-excess-errors
For the tests modified below, the effective target line has to be effective
when compiling for an offload target, except that variable-not-offloaded.c
would compile with unified-share memory and pr86416-*.c if long double/float128
is supported.
The previous check used a run-time device ability check. This new variant
now enables those dg- lines when _compiling_ for nvptx or gcn.
libgomp/ChangeLog:
* testsuite/lib/libgomp.exp (offload_target_to_openacc_device_type):
New, based on check_effective_target_offload_target_nvptx.
(check_effective_target_offload_target_nvptx): Call it.
(check_effective_target_offload_target_amdgcn): New.
* testsuite/libgomp.c-c++-common/function-not-offloaded.c:
Require target offload_target_nvptx || offload_target_amdgcn.
* testsuite/libgomp.c-c++-common/variable-not-offloaded.c: Likewise.
* testsuite/libgomp.c/pr86416-1.c: Likewise.
* testsuite/libgomp.c/pr86416-2.c: Likewise.
Michael Meissner [Tue, 27 Apr 2021 14:52:57 +0000 (10:52 -0400)]
[PATCH] Backport fix for PR target/98952
The test in the PowerPC 32-bit trampoline support is backwards. It aborts
if the trampoline size is greater than the expected size. It should abort
when the trampoline size is less than the expected size. I fixed the test
so the operands are reversed. I then folded the load immediate into the
compare instruction.
I verified this by creating a 32-bit trampoline program and manually
changing the size of the trampoline to be 48 instead of 40. The program
aborted with the larger size. I updated this code and ran the test again
and it passed.
I added a test case that runs on PowerPC 32-bit Linux systems and it calls
the __trampoline_setup function with a larger buffer size than the
compiler uses. The test is not run on 64-bit systems, since the function
__trampoline_setup is not called. I also limited the test to just Linux
systems, in case trampolines are handled differently in other systems.
libgcc/
2021-04-27 Michael Meissner <meissner@linux.ibm.com>
PR target/98952
* config/rs6000/tramp.S (__trampoline_setup, elfv1 #ifdef): Fix
trampoline size comparison in 32-bit by reversing test and
combining load immediate with compare. Fix backported from trunk
change on 4/23, 886b6c1e8af502b69e3f318b9830b73b88215878.
(__trampoline_setup, elfv2 #ifdef): Fix trampoline size comparison
in 32-bit by reversing test and combining load immediate with
compare.
gcc/testsuite/
2021-04-27 Michael Meissner <meissner@linux.ibm.com>
Jakub Jelinek [Tue, 27 Apr 2021 13:46:16 +0000 (15:46 +0200)]
aarch64: Fix UB in the compiler [PR100200]
The following patch fixes UBs in the compiler when negativing
a CONST_INT containing HOST_WIDE_INT_MIN. I've changed the spots where
there wasn't an obvious earlier condition check or predicate that
would fail for such CONST_INTs.
Jakub Jelinek [Tue, 27 Apr 2021 13:42:47 +0000 (15:42 +0200)]
veclower: Fix up vec_shl matching of VEC_PERM_EXPR [PR100239]
The following testcase ICEs at -O0, because lower_vec_perm sees the
_1 = { 0, 0, 0, 0, 0, 0, 0, 0 };
_2 = VEC_COND_EXPR <_1, { -1, -1, -1, -1, -1, -1, -1, -1 }, { 0, 0, 0, 0, 0, 0, 0, 0 }>;
_3 = { 6, 0, 0, 0, 0, 0, 0, 0 };
_4 = VEC_PERM_EXPR <{ 0, 0, 0, 0, 0, 0, 0, 0 }, _2, _3>;
and as the ISA is SSE2, there is no support for the particular permutation
nor for variable mask permutation. But, the code to match vec_shl matches
it, because the permutation has the first operand a zero vector and the
mask picks all elements randomly from that vector.
So, in the end that isn't a vec_shl, but the permutation could be in theory
optimized into the first argument. As we keep it as is, it will fail
during expansion though, because that for vec_shl correctly requires that
it actually is a shift:
unsigned firstidx = 0;
for (unsigned int i = 0; i < nelt; i++)
{
if (known_eq (sel[i], nelt))
{
if (i == 0 || firstidx)
return NULL_RTX;
firstidx = i;
}
else if (firstidx
? maybe_ne (sel[i], nelt + i - firstidx)
: maybe_ge (sel[i], nelt))
return NULL_RTX;
}
if (firstidx == 0)
return NULL_RTX;
first = firstidx;
The if (firstidx == 0) return NULL; is what is missing a counterpart
on the lower_vec_perm side.
As with optimize != 0 we fold it in other spots, I think it is not needed
to optimize this cornercase in lower_vec_perm (which would mean we'd need
to recurse on the newly created _4 = { 0, 0, 0, 0, 0, 0, 0, 0 };
whether it is supported or not).
2021-04-27 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/100239
* tree-vect-generic.c (lower_vec_perm): Don't accept constant
permutations with all indices from the first zero element as vec_shl.
Jakub Jelinek [Tue, 27 Apr 2021 13:26:24 +0000 (15:26 +0200)]
cfgcleanup: Fix -fcompare-debug issue in outgoing_edges_match [PR100254]
The following testcase fails with -fcompare-debug. The problem is that
outgoing_edges_match behaves differently between -g0 and -g, if
some load/store with REG_EH_REGION is followed by DEBUG_INSNs, the
REG_EH_REGION check is not done, while when there are no DEBUG_INSNs, it is
done.
We already compute last1 and last2 as BB_END (bb{1,2}) with skipped debug
insns and notes, so this patch just uses those.
2021-04-27 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/100254
* cfgcleanup.c (outgoing_edges_match): Check REG_EH_REGION on
last1 and last2 insns rather than BB_END (bb1) and BB_END (bb2) insns.
Jakub Jelinek [Wed, 21 Apr 2021 13:17:10 +0000 (15:17 +0200)]
testsuite: Add -fchecking to dg-ice tests
In --enable-checking=release builds (which is the default on release
branches), I'm getting various extra FAILs that don't appear in
--enable-checking=yes builds.
XPASS: g++.dg/cpp0x/constexpr-52830.C -std=c++14 (internal compiler error)
FAIL: g++.dg/cpp0x/constexpr-52830.C -std=c++14 (test for excess errors)
XPASS: g++.dg/cpp0x/constexpr-52830.C -std=c++17 (internal compiler error)
FAIL: g++.dg/cpp0x/constexpr-52830.C -std=c++17 (test for excess errors)
XPASS: g++.dg/cpp0x/constexpr-52830.C -std=c++2a (internal compiler error)
FAIL: g++.dg/cpp0x/constexpr-52830.C -std=c++2a (test for excess errors)
FAIL: g++.dg/cpp0x/vt-88982.C -std=c++14 (test for excess errors)
FAIL: g++.dg/cpp0x/vt-88982.C -std=c++17 (test for excess errors)
FAIL: g++.dg/cpp0x/vt-88982.C -std=c++2a (test for excess errors)
FAIL: g++.dg/cpp1y/auto-fn61.C -std=c++14 (test for excess errors)
FAIL: g++.dg/cpp1y/auto-fn61.C -std=c++17 (test for excess errors)
FAIL: g++.dg/cpp1y/auto-fn61.C -std=c++2a (test for excess errors)
FAIL: g++.dg/cpp1z/constexpr-lambda26.C -std=c++17 (test for excess errors)
FAIL: g++.dg/cpp1z/constexpr-lambda26.C -std=c++2a (test for excess errors)
FAIL: g++.dg/cpp2a/nontype-class39.C -std=c++2a (test for excess errors)
FAIL: c-c++-common/goacc/kernels-decompose-ice-1.c -std=c++14 (test for excess errors)
FAIL: c-c++-common/goacc/kernels-decompose-ice-1.c -std=c++17 (test for excess errors)
FAIL: c-c++-common/goacc/kernels-decompose-ice-1.c -std=c++2a (test for excess errors)
FAIL: c-c++-common/goacc/kernels-decompose-ice-1.c -std=c++98 (test for excess errors)
FAIL: c-c++-common/goacc/kernels-decompose-ice-2.c -std=c++14 (test for excess errors)
FAIL: c-c++-common/goacc/kernels-decompose-ice-2.c -std=c++17 (test for excess errors)
FAIL: c-c++-common/goacc/kernels-decompose-ice-2.c -std=c++2a (test for excess errors)
FAIL: c-c++-common/goacc/kernels-decompose-ice-2.c -std=c++98 (test for excess errors)
These are tests that have dg-ice and most of those ICEs are checking ICEs
which go away in release checking when -fno-checking is the default.
The following patch adds -fchecking option to those.
Jakub Jelinek [Wed, 21 Apr 2021 10:31:45 +0000 (12:31 +0200)]
cprop: Fix -fcompare-debug bug in constprop_register [PR100148]
The following testcase shows different behavior between -g and -g0
in constprop_register, if a flags register setter is separated
from a conditional jump using those flags with -g by a DEBUG_INSN.
As it uses just NEXT_INSN, for -g it will look at the DEBUG_INSN which is
not a conditional jump, while otherwise it would look at the conditional
jump and call cprop_jump.
2021-04-21 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/100148
* cprop.c (constprop_register): Use next_nondebug_insn instead of
NEXT_INSN.
Jonathan Wakely [Thu, 22 Apr 2021 16:26:50 +0000 (17:26 +0100)]
libstdc++: Fix semaphore to work with system_clock timeouts
The __cond_wait_until_impl function takes a steady_clock timeout, but
then sometimes tries to compare it to a time from the system_clock,
which won't compile. Additionally, that function gets called with
system_clock timeouts, which also won't compile. This makes the function
accept timeouts for either clock, and compare to the time from the right
clock.
This fixes the compilation error that was causing two tests to fail on
non-futex targets, so we can revert the r12-11 change to disable them.
libstdc++-v3/ChangeLog:
* include/bits/atomic_timed_wait.h (__cond_wait_until_impl):
Handle system_clock as well as steady_clock.
* testsuite/30_threads/semaphore/try_acquire_for.cc: Re-enable.
* testsuite/30_threads/semaphore/try_acquire_until.cc:
Re-enable.
Thomas Rodgers [Thu, 22 Apr 2021 01:12:03 +0000 (18:12 -0700)]
libstdc++: Fix "bare" notifications dropped by waiters check
For types that track whether or not there extant waiters (e.g.
semaphore) internally, the __atomic_notify_address_bare() call was
introduced to avoid the overhead of loading the atomic count of
waiters. For platforms that don't have Futex, however, there was
still a check for waiters, and seeing that there are none (because
in the bare case, the count is not incremented), the notification
is dropped. This commit addresses that case.
libstdc++-v3/ChangeLog:
* include/bits/atomic_wait.h: Always notify waiters in the
case of 'bare' address notification.
Jonathan Wakely [Thu, 22 Apr 2021 10:10:06 +0000 (11:10 +0100)]
libstdc++: Remove #error from <semaphore> implementation [PR 100179]
This removes the #error from <bits/semaphore_base.h> for the case where
neither __atomic_semaphore nor __platform_semaphore is defined.
Also rename the _GLIBCXX_REQUIRE_POSIX_SEMAPHORE macro to
_GLIBCXX_USE_POSIX_SEMAPHORE for consistency with the similar
_GLIBCXX_USE_CXX11_ABI macro that can be used to request an alternative
(ABI-changing) implementation.
libstdc++-v3/ChangeLog:
PR libstdc++/100179
* include/bits/semaphore_base.h: Remove #error.
* include/std/semaphore: Do not define anything unless one of
the implementations is available.
Jakub Jelinek [Thu, 22 Apr 2021 13:08:21 +0000 (15:08 +0200)]
libstdc++: Add workaround for ia32 floating atomics miscompilations [PR100184]
gcc on ia32 miscompiles various atomics involving floating point,
unfortunately I'm afraid it is too late to fix that for 11.1 and
as I'm quite lost on it, it might take a while for 12 too
(disabling all the 8 peephole2s would be easiest, but then we'd
run into optimization regressions).
While 1.cc just FAILs, with dejagnu 1.6.1 wait_notify.cc hangs the
make check even after the timeout fires. The following patch therefore
xfails the former and skips the latter.
Tested on x86_64-linux where
make check RUNTESTFLAGS='conformance.exp=atomic_float/*.cc'
is still
=== libstdc++ Summary ===
# of expected passes 8
and on i686-linux, where it is now
=== libstdc++ Summary ===
# of expected passes 5
# of expected failures 1
# of unsupported tests 1
2021-04-22 Jakub Jelinek <jakub@redhat.com>
PR target/100182
* testsuite/29_atomics/atomic_float/1.cc: Add dg-xfail-run-if for
ia32.
* testsuite/29_atomics/atomic_float/wait_notify.cc: Add dg-skip-if for
ia32.
pr68078.f90 tests out-of-memory handling and calls set_vm_limit to set the
soft limit. However, setrlimit was then called with hard limit RLIM_INFINITY,
which failed when the current hard limit was lower.
gcc/testsuite/
* gfortran.dg/set_vm_limit.c (set_vm_limit): Call getrlimit, use
obtained hard limit, and only call setrlimit if new softlimit is lower.
Richard Biener [Wed, 21 Apr 2021 12:54:05 +0000 (14:54 +0200)]
Avoid -latomic for amdgcn offloading
libatomic isn't built for amdgcn but reduction-16.c adds it
via -foffload=-latomic when offloading for nvptx is enabled.
The following avoids linker errors when offloading to amdgcn is enabled
as well.
2021-04-21 Richard Biener <rguenther@suse.de>
libgomp/
* testsuite/libgomp.c-c++-common/reduction-16.c: Use -latomic
only on nvptx-none.
Thomas Rodgers [Wed, 21 Apr 2021 17:01:06 +0000 (10:01 -0700)]
[libstdc++] Fix test timeout in stop_calback/destroy.cc
A change was made to __atomic_semaphore::_S_do_try_acquire() to
(ideally) let the compare_exchange reload the value of __old rather than
always reloading it twice. This causes _M_acquire to spin indefinitely
if the value of __old is already 0.
Jakub Jelinek [Wed, 21 Apr 2021 09:09:25 +0000 (11:09 +0200)]
Fix AIX libstdc++ semaphore support [PR100164]
> > The #error would not be hit if _GLIBCXX_HAVE_POSIX_SEMAPHORE were defined,
> > but it shows up in your error report.
> You now have pinpointed the problem.
> It's not that AIX doesn't have semaphore, but that the code previously
> had a fallback that hid a bug in the macros:
// Use futex if available and didn't force use of POSIX
using __fast_semaphore = __atomic_semaphore<__detail::__platform_wait_t>;
using __fast_semaphore = __platform_semaphore;
using __fast_semaphore = __atomic_semaphore<ptrdiff_t>;
> The problem is that libstdc++ configure defines
> _GLIBCXX_HAVE_POSIX_SEMAPHORE in config.h. libstdc++ uses sed to
> rewrite config.h to c++config.h and prepends _GLIBCXX_, so c++config.h
> contains
> And bits/semaphore_base.h is not testing that corrupted macro. Either
> semaphore_base.h needs to test for the corrupted macro, or libtsdc++
> configure needs to define HAVE_POSIX_SEMAPHORE without itself
> prepending _GLIBCXX_ so that the c++config.h rewriting works
> correctly and defines the correct macro for semaphore_base.h.
The include/Makefile.am sed is:
sed -e 's/HAVE_/_GLIBCXX_HAVE_/g' \
-e 's/PACKAGE/_GLIBCXX_PACKAGE/g' \
-e 's/VERSION/_GLIBCXX_VERSION/g' \
-e 's/WORDS_/_GLIBCXX_WORDS_/g' \
-e 's/_DARWIN_USE_64_BIT_INODE/_GLIBCXX_DARWIN_USE_64_BIT_INODE/g' \
-e 's/_FILE_OFFSET_BITS/_GLIBCXX_FILE_OFFSET_BITS/g' \
-e 's/_LARGE_FILES/_GLIBCXX_LARGE_FILES/g' \
-e 's/ICONV_CONST/_GLIBCXX_ICONV_CONST/g' \
-e '/[ ]_GLIBCXX_LONG_DOUBLE_COMPAT[ ]/d' \
-e '/[ ]_GLIBCXX_LONG_DOUBLE_ALT128_COMPAT[ ]/d' \
< ${CONFIG_HEADER} >> $@ ;\
so for many macros one needs _GLIBCXX_ prefixes already in configure,
as can be seen in grep AC_DEFINE.*_GLIBCXX configure.ac acinclude.m4
But _GLIBCXX_HAVE_POSIX_SEMAPHORE is the only one that shouldn't have
that prefix because the sed is adding that.
E.g. on i686-linux, I see
grep _GLIBCXX__GLIBCXX c++config.h
that proves it is the only broken one.
So this change fixes the acinclude.m4 side.
2021-04-21 Jakub Jelinek <jakub@redhat.com>
PR libstdc++/100164
* acinclude.m4: For POSIX semaphores AC_DEFINE HAVE_POSIX_SEMAPHORE
rather than _GLIBCXX_HAVE_POSIX_SEMAPHORE.
* configure: Regenerated.
* config.h.in: Regenerated.
As register names are required for darwin, but not accepted by gas
unless you use `-mregnames', they have been conditionally removed on
non-darwin targets.
To avoid duplicating large blocks of almost identical code, the inline
assembly is now statically generated.
libphobos/ChangeLog:
* libdruntime/core/thread/osthread.d (callWithStackShell): Statically
generate PPC and PPC64 asm implementations, and conditionally remove
PPC register names on non-Darwin targets.
Jonathan Wakely [Tue, 20 Apr 2021 14:11:29 +0000 (15:11 +0100)]
libstdc++: Disable tests that fail after atomic wait/notify rewrite
These tests are currently failing, but should be analyzed and
re-enabled.
libstdc++-v3/ChangeLog:
* testsuite/30_threads/semaphore/try_acquire_for.cc: Disable
test for targets not using futexes for semaphores.
* testsuite/30_threads/semaphore/try_acquire_until.cc: Likewise.
* testsuite/30_threads/stop_token/stop_callback/destroy.cc:
Disable for all targets.
Thomas Rodgers [Tue, 20 Apr 2021 10:54:27 +0000 (11:54 +0100)]
libstdc++: Refactor/cleanup of C++20 atomic wait implementation
This is a substantial rewrite of the atomic wait/notify (and timed wait
counterparts) implementation.
The previous __platform_wait looped on EINTR however this behavior is
not required by the standard. A new _GLIBCXX_HAVE_PLATFORM_WAIT macro
now controls whether wait/notify are implemented using a platform
specific primitive or with a platform agnostic mutex/condvar. This
patch only supplies a definition for linux futexes. A future update
could add support __ulock_wait/wake on Darwin, for instance.
The members of __waiters were lifted to a new base class. The members
are now arranged such that overall sizeof(__waiter_pool_base) fits in
two cache lines (on platforms with at least 64 byte cache lines). The
definition will also use destructive_interference_size for this if it is
available.
The __waiters type is now specific to untimed waits, and is renamed to
__waiter_pool. Timed waits have a corresponding __timed_waiter_pool
type. Much of the code has been moved from the previous __atomic_wait()
free function to the __waiter_base template and a __waiter derived type
is provided to implement the un-timed wait operations. A similar change
has been made to the timed wait implementation.
The __atomic_spin code has been extended to take a spin policy which is
invoked after the initial busy wait loop. The default policy is to
return from the spin. The timed wait code adds a timed backoff spinning
policy. The code from <thread> which implements this_thread::sleep_for,
sleep_until has been moved to a new <bits/std_thread_sleep.h> header
which allows the thread sleep code to be consumed without pulling in the
whole of <thread>.
The entry points into the wait/notify code have been restructured to
support either -
* Testing the current value of the atomic stored at the given address
and waiting on a notification.
* Applying a predicate to determine if the wait was satisfied.
The entry points were renamed to make it clear that the wait and wake
operations operate on addresses. The first variant takes the expected
value and a function which returns the current value that should be used
in comparison operations, these operations are named with a _v suffix
(e.g. 'value'). All atomic<_Tp> wait/notify operations use the first
variant. Barriers, latches and semaphores use the predicate variant.
This change also centralizes what it means to compare values for the
purposes of atomic<T>::wait rather than scattering through individual
predicates.
This change also centralizes the repetitive code which adjusts for
different user supplied clocks (this should be moved elsewhere
and all such adjustments should use a common implementation).
This change also removes the hashing of the pointer and uses
the pointer value directly for indexing into the waiters table.
libstdc++-v3/ChangeLog:
* include/Makefile.am: Add new <bits/this_thread_sleep.h> header.
* include/Makefile.in: Regenerate.
* include/bits/this_thread_sleep.h: New file.
* include/bits/atomic_base.h: Adjust all calls
to __atomic_wait/__atomic_notify for new call signatures.
* include/bits/atomic_timed_wait.h: Extensive rewrite.
* include/bits/atomic_wait.h: Likewise.
* include/bits/semaphore_base.h: Adjust all calls
to __atomic_wait/__atomic_notify for new call signatures.
* include/std/atomic: Likewise.
* include/std/barrier: Likewise.
* include/std/latch: Likewise.
* include/std/semaphore: Likewise.
* include/std/thread (this_thread::sleep_for)
(this_thread::sleep_until): Move to new header.
* testsuite/29_atomics/atomic/wait_notify/bool.cc: Simplify
test.
* testsuite/29_atomics/atomic/wait_notify/generic.cc: Likewise.
* testsuite/29_atomics/atomic/wait_notify/pointers.cc: Likewise.
* testsuite/29_atomics/atomic_flag/wait_notify/1.cc: Likewise.
* testsuite/29_atomics/atomic_float/wait_notify.cc: Likewise.
* testsuite/29_atomics/atomic_integral/wait_notify.cc: Likewise.
* testsuite/29_atomics/atomic_ref/wait_notify.cc: Likewise.
Martin Liska [Tue, 20 Apr 2021 13:53:09 +0000 (15:53 +0200)]
It seems we bumped LTO_major_version last time 2 years ago.
Right now, the following is seen when one links a GCC 10.2.x LTO object file:
$ gcc a.o
lto1: fatal error: bytecode stream in file 'a.o' generated with LTO version 9.2 instead of the expected 9.0
I suggest bumping LTO_major_version for releases/gcc-11 branch.
Can we please align it with a GCC release (version 11)? For the future, if e.g. GCC 12 consumes LTO
bytecode from GCC 11, we can leave LTO_major_version. Once e.g. GCC 13 needs bumping,
I would then change it to 13.
Patrick Palka [Tue, 20 Apr 2021 13:18:50 +0000 (09:18 -0400)]
libstdc++: Implement P2259R1 changes [PR95983]
This implements the wording changes of P2259R1 "Repairing input range
adaptors and counted_iterator", which resolves LWG 3283, 3289 and 3408.
The wording changes are relatively straightforward, but they require
some boilerplate to implement: the changes to make a type alias
"conditionally present" in some iterator class are realized by defining
a base class template and an appropriately constrained partial
specialization thereof that contains the type alias, and having the
iterator class derive from this base class. Sometimes the relevant
condition depend on members from the iterator class, but because a
class is incomplete when instantiating its bases, the constraints on
the partial specialization can't use anything from the iterator class.
This patch works around this by hoisting these members out to the
enclosing scope (e.g. transform_view::_Iterator::_Base is hoisted out
to transform_view::_Base so that transform_view::__iter_cat can use it).
This patch also implements the proposed resolution of LWG 3291 to rename
iota_view::iterator_category to iota_view::iterator_concept, which was
previously problematic due to LWG 3408.
libstdc++-v3/ChangeLog:
PR libstdc++/95983
* include/bits/stl_iterator.h (__detail::__move_iter_cat):
Define.
(move_iterator): Derive from the above in C++20 in order to
conditionally define iterator_category as per P2259.
(move_iterator::__base_cat): No longer used, so remove.
(move_iterator::iterator_category): Remove in C++20.
(__detail::__common_iter_use_postfix_proxy): Define.
(common_iterator::_Proxy): Rename to ...
(common_iterator:__arrow_proxy): ... this.
(common_iterator::__postfix_proxy): Define as per P2259.
(common_iterator::operator->): Adjust.
(common_iterator::operator++): Adjust as per P2259.
(iterator_traits<common_iterator>::_S_iter_cat): Define.
(iterator_traits<common_iterator>::iterator_category): Change as
per P2259.
(__detail::__counted_iter_value_type): Define.
(__detail::__counted_iter_concept): Define.
(__detail::__counted_iter_cat): Define.
(counted_iterator): Derive from the above three classes in order
to conditionally define value_type, iterator_concept and
iterator category respectively as per P2259.
(counted_iterator::operator->): Define as per P2259.
(incrementable_traits<counted_iterator>): Remove as per P2259.
(iterator_traits<counted_iterator>): Adjust as per P2259.
* include/std/ranges (__detail::__iota_view_iter_cat): Define.
(iota_view::_Iterator): Derive from the above in order to
conditionally define iterator_category as per P2259.
(iota_view::_S_iter_cat): Rename to ...
(iota_view::_S_iter_concept): ... this.
(iota_view::iterator_concept): Use it to apply LWG 3291 changes.
(iota_view::iterator_category): Remove.
(__detail::__filter_view_iter_cat): Define.
(filter_view::_Iterator): Derive from the above in order to
conditionally define iterator_category as per P2259.
(filter_view::_Iterator): Move to struct __filter_view_iter_cat.
(filter_view::_Iterator::iterator_category): Remove.
(transform_view::_Base): Define.
(transform_view::__iter_cat): Define.
(transform_view::_Iterator): Derive from the above in order to
conditionally define iterator_category as per P2259.
(transform_view::_Iterator::_Base): Just alias
transform_view::_Base.
(transform_view::_Iterator::_S_iter_cat): Move to struct
transform_view::__iter_cat.
(transform_view::_Iterator::iterator_category): Remove.
(transform_view::_Sentinel::_Base): Just alias
transform_view::_Base.
(join_view::_Base): Define.
(join_view::_Outer_iter): Define.
(join_view::_Inner_iter): Define.
(join_view::_S_ref_is_glvalue): Define.
(join_view::__iter_cat): Define.
(join_view::_Iterator): Derive from it in order to conditionally
define iterator_category as per P2259.
(join_view::_Iterator::_Base): Just alias join_view::_Base.
(join_view::_Iterator::_S_ref_is_glvalue): Just alias
join_view::_S_ref_is_glvalue.
(join_view::_Iterator::_S_iter_cat): Move to struct
transform_view::__iter_cat.
(join_view::_Iterator::_Outer_iter): Just alias
join_view::_Outer_iter.
(join_view::_Iterator::_Inner_iter): Just alias
join_view::_Inner_iter.
(join_view::_Iterator::iterator_category): Remove.
(join_view::_Sentinel::_Base): Just alias join_view::_Base.
(__detail::__split_view_outer_iter_cat): Define.
(__detail::__split_view_inner_iter_cat): Define.
(split_view::_Base): Define.
(split_view::_Outer_iter): Derive from __split_view_outer_iter_cat
in order to conditionally define iterator_category as per P2259.
(split_view::_Outer_iter::iterator_category): Remove.
(split_view::_Inner_iter): Derive from __split_view_inner_iter_cat
in order to conditionally define iterator_category as per P2259.
(split_view::_Inner_iter::_S_iter_cat): Move to
__split_view_inner_iter_cat.
(split_view::_Inner_iter::iterator_category): Remove.
(elements_view::_Base): Define.
(elements_view::__iter_cat): Define.
(elements_view::_Iterator): Derive from the above in order to
conditionall define iterator_category as per P2259.
(elements_view::_Iterator::_Base): Just alias
elements_view::_Base.
(elements_view::_Iterator::_S_iter_concept)
(elements_view::_Iterator::iterator_concept): Define as per
P2259.
(elements_view::_Iterator::iterator_category): Remove.
(elements_view::_Sentinel::_Base): Just alias
elements_view::_Base.
* testsuite/24_iterators/headers/iterator/synopsis_c++20.cc:
Adjust constraints on iterator_traits<counted_iterator>.
* testsuite/std/ranges/p2259.cc: New test.
Jakub Jelinek [Tue, 20 Apr 2021 10:48:12 +0000 (12:48 +0200)]
libstdc++: Update ppc64le baseline_symbols.txt
> Tested on powerpc64{,le}-linux now (-m32/-m64 on be) and while the first
> patch works fine, the second one unfortunately doesn't on either be or le,
> so more work is needed there.
Here are the needed changes to make it work.
For symbols with _LDBL_ substring in version name we already have code to
ignore those if no such symbols appear (but it is slightly incorrect, see
below).
So, this patch does the same thing for symbol versions with _IEEE128_
substring.
The previously incorrectly handled case is that in addition to
FUNC:_ZNKSt17__gnu_cxx_ieee1287num_getIcSt19istreambuf_iteratorIcSt11char_traitsIcEEE14_M_extract_intImEES4_S4_S4_RSt8ios_baseRSt12_Ios_IostateRT_@@GLIBCXX_IEEE128_3.4.29
or
OBJECT:12:_ZTSu9__ieee128@@CXXABI_IEEE128_1.3.13
cases we also have the
OBJECT:0:CXXABI_IEEE128_1.3.13
OBJECT:0:GLIBCXX_IEEE128_3.4.29
cases, which have empty version_name and the name is in that case the
symbol version. Those need to be ignored too.
2021-04-20 Jakub Jelinek <jakub@redhat.com>
* testsuite/util/testsuite_abi.cc (compare_symbols): If any symbol
versions with _IEEE128_ substring are found, set ieee_version_found
to true. Ignore missing symbols with _IEEE128_ in version name if
!ieee_version_found. Use i->first as version_name instead of
i->second.version_name if the latter is empty.
* config/abi/post/powerpc64-linux-gnu/baseline_symbols.txt: Update.
testsuite: Fix up gcc.target/s390/zero-scratch-regs-1.c
Depending on whether GCC is configured using --with-mode=zarch or not,
for the 31bit target instructions are generated either for ESA or
z/Architecture. For the sake of simplicity and robustness test only for
the latter by adding manually option -mzarch.
gcc/testsuite/ChangeLog:
* gcc.target/s390/zero-scratch-regs-1.c: Force test to run for
z/Architecture only.
gcc/fortran
PR fortran/100110
* trans-decl.c (gfc_get_symbol_decl): Replace test for host
association with a check that the current and symbol namespaces
are the same.
gcc/testsuite/
PR fortran/100110
* gfortran.dg/pdt_31.f03: New test.
* gfortran.dg/pdt_26.f03: Reduce 'builtin_malloc' count from 9
to 8.
libphobos: Fix SIGBUS in read_encoded_value_with_base on sparc-sun-solaris (PR98584)
Instead of unsafe pointer dereferencing, use memcpy() to read encoded
values from memory. The function `read_encoded_value' has been updated
to accept a ref parameter, this simplifies handling of the pointer to
memory needing to be read.
libphobos/ChangeLog:
PR d/98584
* libdruntime/gcc/deh.d (scanLSDA): Update calls to read_uleb128 and
read_encoded_value.
(actionTableLookup): Update calls to read_sleb128 and
read_encoded_value_with_base.
* libdruntime/gcc/unwind/pe.d (read_uleb128): Update signature.
(read_sleb128): Update signature.
(read_unaligned): New function.
(read_encoded_value_with_base): Update signature. Call read_unaligned
instead of unsafe pointer dereferencing.
(read_encoded_value): Update signature.
Andrew MacLeod [Fri, 16 Apr 2021 21:08:51 +0000 (17:08 -0400)]
tree-optimization/100081 - Limit depth of logical expression windback.
Limit how many logical expressions GORI will look back through when
evaluating outgoing edge range.
PR tree-optimization/100081
* gimple-range-cache.h (ranger_cache): Inherit from gori_compute
rather than gori_compute_cache.
* gimple-range-gori.cc (is_gimple_logical_p): Move to top of file.
(range_def_chain::m_logical_depth): New member.
(range_def_chain::range_def_chain): Initialize m_logical_depth.
(range_def_chain::get_def_chain): Don't build defchains through more
than LOGICAL_LIMIT logical expressions.
* params.opt (param_ranger_logical_depth): New.
d: Fix ICE in when formating a string with '%' or '`' characters (PR98457)
The percentage character was being confused for a format specifier in
pp_format(), whilst the backtick character was confused for the
beginning of a quoted string in expand_d_format().
Both are now properly escaped to avoid the ICE.
gcc/d/ChangeLog:
PR d/98457
* d-diagnostic.cc (expand_d_format): Handle escaped backticks.
(escape_d_format): New funtion.
(verror): Call escape_d_format on prefixing strings.
(vdeprecation): Likewise.
Richard Earnshaw [Mon, 19 Apr 2021 15:56:31 +0000 (16:56 +0100)]
arm: partial revert of r11-8168 [PR100067]
This is a partial revert of r11-8168. The overall purpose of the
commit is retained (to fix a bogus warning when -mfpu=<not-auto> is
used in combination with eg -mcpu=neoverse-v1), but it removes the
hunk that changed the subsequent feature bits for features of a
simd/fp unit that cannot be described by -mfpu. While I still think
that is the correct direction of travel, it's somewhat disruptive and
not appropriate for late stage4. I'll revisit for gcc-12.
gcc:
PR target/100067
* config/arm/arm.c (arm_configure_build_target): Do not strip
extended FPU/SIMD feature bits from the target ISA when -mfpu
is specified (partial revert of r11-8168).
Eric Botcazou [Mon, 19 Apr 2021 08:13:36 +0000 (10:13 +0200)]
Fix another -freorder-blocks-and-partition glitch with Windows SEH
Since GCC 8, the -freorder-blocks-and-partition pass can split a function
into hot and cold parts, thus generating 2 FDEs for a single function in
DWARF for exception purposes and doing an equivalent trick for Windows SEH.
Now the Windows system unwinder does not support arbitrarily large frames
and there is even a hard limit on the encoding of the CFI, which changes
the stack allocation strategy when it is topped and which must be reflected
everywhere.
gcc/
* config/i386/winnt.c (i386_pe_seh_cold_init): Properly deal with
frames larger than the SEH maximum frame size.
gcc/testsuite/
* gnat.dg/opt92.adb: New test.
testsuite: Enable zero-scratch-regs-{8,9,10,11}.c on s390*
On s390* the only missing part for the mentioned testcases was a load of
a double floating-point zero via a move (in particular for quite old
machines) which was added in commit 46c47420a5fefd4d9d02b0db347235dd74e20fb2.
Common code implementation is sufficient in order to clear volatile
GPRs, FPRs, and VRs. Access registers a0 and a1 are nonvolatile and not
cleared. Therefore, target hook TARGET_ZERO_CALL_USED_REGS is not
implemented for s390*.
Added a target specific test in order to ensure that all call clobbered
GPRs, FPRs, and VRs are zeroed and all call saved registers are kept.
gcc/testsuite/ChangeLog:
* c-c++-common/zero-scratch-regs-8.c: Enable on s390*.
* c-c++-common/zero-scratch-regs-9.c: Likewise.
* c-c++-common/zero-scratch-regs-10.c: Likewise.
* c-c++-common/zero-scratch-regs-11.c: Likewise.
* gcc.target/s390/zero-scratch-regs-1.c: New test.
Following up on the fix for PR99914, when testing on MinGW, it was found
not to support weak in the same way as on ELF or Mach-O targets.
So the linkage has been reverted back to COMDAT for that target, however
in order to properly support overriding functions and variables, all
declarations with external linkage must be put on COMDAT. For this a
new target hook has been added to control the behavior.
gcc/ChangeLog:
PR d/99914
* config/i386/winnt-d.c (TARGET_D_TEMPLATES_ALWAYS_COMDAT): Define.
* doc/tm.texi: Regenerate.
* doc/tm.texi.in (D language and ABI): Add @hook for
TARGET_D_TEMPLATES_ALWAYS_COMDAT.
gcc/d/ChangeLog:
PR d/99914
* d-target.def (d_templates_always_comdat): New hook.
* d-tree.h (mark_needed): Remove prototype.
* decl.cc: Include d-target.h.
(mark_needed): Rename to...
(d_mark_needed): ...this. Make static.
(set_linkage_for_decl): Put variables in comdat if
d_templates_always_comdat.
Following on from adding TARGET_D_REGISTER_OS_TARGET_INFO, this adds the
required handlers to implement `__traits(getTargetInfo, "objectFormat")'
for all platforms that have D support files.
Some back-ends (i386, rs6000, and pa) have some awarenes of the what
object format they are compiling for, so new getTargetInfo handlers have
been have added both to those back-ends as well as platform-specific
target files to override the default in the D front-end.
gcc/ChangeLog:
* config/darwin-d.c (darwin_d_handle_target_object_format): New
function.
(darwin_d_register_target_info): New function.
(TARGET_D_REGISTER_OS_TARGET_INFO): Define.
* config/dragonfly-d.c (dragonfly_d_handle_target_object_format): New
function.
(dragonfly_d_register_target_info): New function.
(TARGET_D_REGISTER_OS_TARGET_INFO): Define.
* config/freebsd-d.c (freebsd_d_handle_target_object_format): New
function.
(freebsd_d_register_target_info): New function.
(TARGET_D_REGISTER_OS_TARGET_INFO): Define.
* config/glibc-d.c (glibc_d_handle_target_object_format): New
function.
(glibc_d_register_target_info): New function.
(TARGET_D_REGISTER_OS_TARGET_INFO): Define.
* config/i386/i386-d.c (ix86_d_handle_target_object_format): New
function.
(ix86_d_register_target_info): Add ix86_d_handle_target_object_format
as handler for objectFormat key.
* config/i386/winnt-d.c (winnt_d_handle_target_object_format): New
function.
(winnt_d_register_target_info): New function.
(TARGET_D_REGISTER_OS_TARGET_INFO): Define.
* config/netbsd-d.c (netbsd_d_handle_target_object_format): New
function.
(netbsd_d_register_target_info): New function.
(TARGET_D_REGISTER_OS_TARGET_INFO): Define.
* config/openbsd-d.c (openbsd_d_handle_target_object_format): New
function.
(openbsd_d_register_target_info): New function.
(TARGET_D_REGISTER_OS_TARGET_INFO): Define.
* config/pa/pa-d.c (pa_d_handle_target_object_format): New function.
(pa_d_register_target_info): Add pa_d_handle_target_object_format as
handler for objectFormat key.
* config/rs6000/rs6000-d.c (rs6000_d_handle_target_object_format): New
function.
(rs6000_d_register_target_info): Add
rs6000_d_handle_target_object_format as handler for objectFormat key.
* config/sol2-d.c (solaris_d_handle_target_object_format): New
function.
(solaris_d_register_target_info): New function.
(TARGET_D_REGISTER_OS_TARGET_INFO): Define.
gcc/d/ChangeLog:
* d-target.cc (d_handle_target_object_format): New function.
(d_language_target_info): Add d_handle_target_object_format as handler
for objectFormat key.
(Target::getTargetInfo): Continue if handler returned NULL_TREE.
Jakub Jelinek [Sat, 17 Apr 2021 09:31:30 +0000 (11:31 +0200)]
libstdc++: Update some baseline_symbols.txt
As we have only one P1 left right now, I think it is the right time
to update abi list files in libstdc++.
Here is an update for x86_64/i?86/s390x/ppc64 linux (aarch64 seems
to be correct already). For ppc64le it is missing the IEEE128 symver
symbols, but those need further work on the abi checking side.
Jakub Jelinek [Sat, 17 Apr 2021 09:27:14 +0000 (11:27 +0200)]
sanitizer: Fix asan against glibc 2.34 [PR100114]
As mentioned in the PR, SIGSTKSZ is no longer a compile time constant in
glibc 2.34 and later, so
static const uptr kAltStackSize = SIGSTKSZ * 4;
needs dynamic initialization, but is used by a function called indirectly
from .preinit_array and therefore before the variable is constructed.
This results in using 0 size instead and all asan instrumented programs
die with:
==91==ERROR: AddressSanitizer failed to allocate 0x0 (0) bytes of SetAlternateSignalStack (error code: 22)
testsuite/arm: Fix scan-assembler-times in pr96770.c with movt/movw
The previous change to this testcase missed the fact that the data may
be accessed via an anchor, depending on the optimization level,
leading to false failures.
This patch restricts matching to upper16:lower16 followed by
non-spaces, followed by +4 (in f4) or +320 (in f5).
Using '.*' instead of '[^ \]' would match accross the whole assembly
file, which is not what we want, hence the limitation with spaces.
Jakub Jelinek [Fri, 16 Apr 2021 18:49:33 +0000 (20:49 +0200)]
aarch64: Don't emit -Wpsabi note when ABI was never affected [PR91710]
As the following testcase shows, we emit a -Wpsabi note about argument
passing change since GCC 9, but in reality the ABI didn't change.
The alignment is 8 bits in GCC < 9 and 32 bits in GCC >= 9 and
the aarch64_function_arg_alignment returns in that case:
return MIN (MAX (alignment, PARM_BOUNDARY), STACK_BOUNDARY);
so when both the old and new alignment are smaller or equal to PARM_BOUNDARY
(or both are larger than STACK_BOUNDARY, just in theory), even when the new
one is bigger, it doesn't change the argument passing.
So, the following patch changes aarch64_function_arg_alignment to tell the
callers the exact old alignmentm so that they can test it if needed.
The other aarch64_function_arg_alignment callers either check the
alignment for equality against 16-byte alignment (when old alignment was
smaller than that and the new one is 16-byte, we want to emit -Wpsabi
in all the cases) or the va_arg case which I think is ok now too.
2021-04-16 Jakub Jelinek <jakub@redhat.com>
PR target/91710
* config/aarch64/aarch64.c (aarch64_function_arg_alignment): Change
abi_break argument from bool * to unsigned *, store there the pre-GCC 9
alignment.
(aarch64_layout_arg, aarch64_gimplify_va_arg_expr): Adjust callers.
(aarch64_function_arg_regno_p): Likewise. Only emit -Wpsabi note if
the old and new alignment after applying MIN/MAX to it is different.
Jakub Jelinek [Fri, 16 Apr 2021 16:32:27 +0000 (18:32 +0200)]
intl: Add --enable-host-shared support [PR100096]
As mentioned in the PR, building gcc with jit enabled and
--enable-host-shared doesn't work on NetBSD/i?86, as libgccjit.so.0
has text relocations.
The r0-125846-g459260ecf8b420b029601a664cdb21c185268ecb changes
added --enable-host-shared support to various libraries, but didn't
add it to intl/ subdirectory; on Linux it isn't really needed, because
all: all-no
all-no: #nothing
but on other OSes intl/libintl.a is built.
The following patch makes sure it is built with -fPIC when
--enable-host-shared is used.
This causes CSE to incorrectly think that the two predicates are equal because
some of the significant bits get ignored due to the subreg.
The attached patch instead makes it so it always looks at all 16-bits of the
predicate, but in turn means we need to generate a TRN that matches the expected
result mode. In effect in RTL we keep the mode as VNx16BI but during codegen
re-interpret them as the mode the predicate instruction wanted:
Which needed correction to the TRN pattern. A new TRN1_CONV unspec is
introduced which allows one to keep the arguments as VNx16BI but encode the
instruction as a type of the last operand.
Jakub Jelinek [Fri, 16 Apr 2021 15:37:07 +0000 (17:37 +0200)]
c++: Fix empty base stores in cxx_eval_store_expression [PR100111]
In r11-6895 handling of empty bases has been fixed such that non-lval
stores of empty classes are not added when the type of *valp doesn't
match the type of the initializer, but as this testcase shows it is
done only when *valp is non-NULL. If it is NULL, we still shouldn't
add empty class constructors if the type of the constructor elt *valp
points to doesn't match.
2021-04-16 Jakub Jelinek <jakub@redhat.com>
PR c++/100111
* constexpr.c (cxx_eval_store_expression): Don't add CONSTRUCTORs
for empty classes into *valp when types don't match even when *valp
is NULL.
Bill Schmidt [Fri, 16 Apr 2021 15:38:11 +0000 (10:38 -0500)]
doc: Update Power builtin documentation in user's manual
The standard for many Power vector interfaces is now the recently
published Power Vector Intrinsic Programming Reference. Reference
that document for the relevant interfaces, and remove redundant
information from the GCC user's manual.
2021-04-16 Bill Schmidt <wschmidt@linux.ibm.com>
gcc/
* doc/extend.texi (PowerPC AltiVec/VSX Built-in Functions): Revise
this section and its subsections.
Marek Polacek [Wed, 14 Apr 2021 21:57:15 +0000 (17:57 -0400)]
c++: ICE with bogus late return type [PR99803]
Here we ICE when compiling this code in C++20, because we're trying to
slam a 'typename' after the ->. The cp_parser_template_id call just
before the spot I'm changing parsed A::template A<int> as a BASELINK
that contains a constructor, but make_typename_type crashes on that.
This patch makes make_typename_type more robust instead of checking
for is_overloaded_fn prior calling it.
gcc/cp/ChangeLog:
PR c++/99803
* decl.c (make_typename_type): Give an error and return when
name is is_overloaded_fn.
* parser.c (cp_parser_class_name): Don't check is_overloaded_fn
before calling make_typename_type.
Robin Dapp [Wed, 17 Mar 2021 08:08:42 +0000 (09:08 +0100)]
testsuite: Move gimplefe-4[0|1] tests.
The gimplefe-40.c and gimplefe-41.c test cases require vect_* effective
targets even though they reside in gcc.dg. By default e.g.
DEFAULT_VECTCFLAGS which is used to add target-specific machine or build
flags is only applied in the ./vect subdirectory. Move these tests
there.
gcc/testsuite/ChangeLog:
* gcc.dg/gimplefe-40.c: Moved to...
* gcc.dg/vect/gimplefe-40.c: ...here.
* gcc.dg/gimplefe-41.c: Moved to...
* gcc.dg/vect/gimplefe-41.c: ...here.
For z10 and newer inner loops are completely unrolled which means store
motion is not applied. Reverting max-completely-peeled-insns to the
default value fixes these testcases.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/pr83403-1.c: Revert
max-completely-peeled-insns to the default value on IBM Z.
* gcc.dg/tree-ssa/pr83403-2.c: Likewise.