PR106907 has few warnings spotted from cppcheck. In that addressing duplicate
expression issue here. Here the same expression is used twice in logical
AND(&&) operation which result in same result so removing that.
where M1 and M2 are of equal mode size. That is problematic for the splitter
vfp.md:no_literal_pool_df_immediate in the arm backend, which tries to pun an
lvalue DFmode pseudo into DImode and assign a constant to it with
emit_move_insn, as the new transformation simply undoes this, and we end up
splitting indefinitely.
This patch changes things around in the arm backend so that we use a
DImode temporary (instead of DFmode) and first load the DImode constant
into the pseudo, and then pun the pseudo into DFmode as an rvalue in a
reg -> reg move. I believe this should be semantically equivalent but
avoids the pathalogical behaviour seen in the PR.
gcc/ChangeLog:
PR target/109800
* config/arm/arm.md (movdf): Generate temporary pseudo in DImode
instead of DFmode.
* config/arm/vfp.md (no_literal_pool_df_immediate): Rather than punning an
lvalue DFmode pseudo into DImode, use a DImode pseudo and pun it into
DFmode as an rvalue.
gcc/testsuite/ChangeLog:
PR target/109800
* gcc.target/arm/pure-code/pr109800.c: New test.
Jonathan Wakely [Thu, 1 Jun 2023 10:16:49 +0000 (11:16 +0100)]
libstdc++: Do not use std::expected::value() in monadic ops (LWG 3938)
The monadic operations in std::expected always check has_value() so we
can avoid the execptional path in value() and the assertions in error()
by accessing _M_val and _M_unex directly. This means that the monadic
operations no longer require _M_unex to be copyable so that it can be
thrown from value(), as modified by LWG 3938.
This also fixes two incorrect uses of std::move in transform(F&&)& and
transform(F&&) const& which I found while making these changes.
Now that move-only error types are supported, it's possible to properly
test the constraints that LWG 3877 added to and_then and transform. The
lwg3877.cc test now does that.
libstdc++-v3/ChangeLog:
* include/std/expected (expected::and_then, expected::or_else)
(expected::transform_error): Use _M_val and _M_unex instead of
calling value() and error(), as per LWG 3938.
(expected::transform): Likewise. Remove incorrect std::move
calls from lvalue overloads.
(expected<void, E>::and_then, expected<void, E>::or_else)
(expected<void, E>::transform): Use _M_unex instead of calling
error().
* testsuite/20_util/expected/lwg3877.cc: Add checks for and_then
and transform, and for std::expected<void, E>.
* testsuite/20_util/expected/lwg3938.cc: New test.
Jonathan Wakely [Tue, 16 May 2023 21:40:42 +0000 (22:40 +0100)]
libstdc++: Implement LWG 3877 for std::expected monadic ops
This was approved in Issaquah 2023. As well as fixing the value
categories, this fixes the fact that we were incorrectly testing E
instead of T in the or_else constraints.
libstdc++-v3/ChangeLog:
* include/std/expected (expected::and_then, expected::or_else)
(expected::transform, expected::transform_error): Fix exception
specifications as per LWG 3877.
(expected<void, E>::and_then, expected<void, E>::transform):
Likewise.
* testsuite/20_util/expected/lwg3877.cc: New test.
Andrew Pinski [Mon, 5 Jun 2023 04:32:00 +0000 (04:32 +0000)]
Fix PR 110085: `make clean` in GCC directory on sh target causes a failure
On sh target, there is a MULTILIB_DIRNAMES (or is it MULTILIB_OPTIONS) named m2,
this conflicts with the langauge m2. So when you do a `make clean`, it will remove
the m2 directory and then a build will fail. Now since r0-78222-gfa9585134f6f58,
the multilib directories are no longer created in the gcc directory as libgcc
was moved to the toplevel. So we can remove the part of clean that removes those
directories.
Tested on x86_64-linux-gnu and a cross to sh-elf that `make clean` followed by
`make` works again.
Committed as approved.
gcc/ChangeLog:
PR bootstrap/110085
* Makefile.in (clean): Remove the removing of
MULTILIB_DIR/MULTILIB_OPTIONS directories.
Jonathan Wakely [Tue, 21 Mar 2023 12:29:08 +0000 (12:29 +0000)]
libstdc++: Make std::filesystem::copy_file work for procfs [PR108178]
The size reported by stat is always zero for some special files such as
those under /proc, which means the current copy_file implementation
thinks there is nothing to copy. Instead of trusting the stat value, try
to read a character from a streambuf and check for EOF.
For the backport, we also need to avoid trying to use sendfile when stat
reports a zero size, so that we use streambufs to copy the file.
libstdc++-v3/ChangeLog:
PR libstdc++/108178
* src/filesystem/ops-common.h (do_copy_file): Check for empty
files by trying to read a character.
* testsuite/27_io/filesystem/operations/copy_file_108178.cc:
New test.
Jonathan Wakely [Tue, 6 Jun 2023 10:38:42 +0000 (11:38 +0100)]
libstdc++: Fix ambiguous expression in std::array<T, 0>::front() [PR110139]
For 32-bit targets using -pedantic (or using Clang) makes the expression
_M_elems[0] ambiguous. The overloaded operator[] that we want to call
has a size_t parameter, but 0 is type ptrdiff_t for many ILP32 targets,
so using the implicit conversion from _M_elems to T* and then
subscripting that is also viable.
Change the 0 to (size_type)0 and also make the conversion to T*
explicit, so that's it's not viable here. The latter change requires a
static_cast in data() where we really do want to convert _M_elems to a
pointer.
libstdc++-v3/ChangeLog:
PR libstdc++/110139
* include/std/array (__array_traits<T, 0>::operator T*()): Make
conversion operator explicit.
(array::front): Use size_type as subscript operand.
(array::data): Use static_cast to make conversion explicit.
* testsuite/23_containers/array/element_access/110139.cc: New
test.
Joseph Faulls [Fri, 2 Jun 2023 15:44:48 +0000 (15:44 +0000)]
libstdc++: Do not assume existence of char8_t codecvt facet
It is not required that codecvt<char8_t, char, mbstate_t> facet be
supported by the locale, nor is it added as part of the default locale.
This can lead to dangerous behaviour when static_cast.
libstdc++-v3/ChangeLog:
* include/bits/locale_classes.tcc: Remove check for
codecvt<char8_t, char, mbstate_t> facet.
Iain Buclaw [Mon, 5 Jun 2023 16:30:12 +0000 (18:30 +0200)]
d: Warn when declared size of a special enum does not match its intrinsic type.
All special enums have declarations in the D runtime library, but the
compiler will recognize and treat them specially if declared in any
module. When the underlying base type of a special enum is a different
size to its matched intrinsic, then this can cause undefined behavior at
runtime. Detect and warn about when such a mismatch occurs.
gcc/d/ChangeLog:
* gdc.texi (Warnings): Document -Wextra and -Wmismatched-special-enum.
* implement-d.texi (Special Enums): Add reference to warning option
-Wmismatched-special-enum.
* lang.opt: Add -Wextra and -Wmismatched-special-enum.
* types.cc (TypeVisitor::visit (TypeEnum *)): Warn when declared
special enum size mismatches its intrinsic type.
Thomas Neumann [Wed, 10 May 2023 10:33:49 +0000 (12:33 +0200)]
fix radix sort on 32bit platforms [PR109670]
The radix sort uses two buffers, a1 for input and a2 for output.
After every digit the role of the two buffers is swapped.
When terminating the sort early the code made sure the output
was in a2. However, when we run out of bits, as can happen on
32bit platforms, the sorted result was in a1, as we had just
swapped a1 and a2.
This patch fixes the problem by unconditionally having a1 as
output after every loop iteration.
This bug manifested itself only on 32bit platforms and even then
only in some circumstances, as it needs frames where a swap
is required due to differences in the top-most byte, which is
affected by ASLR. The new logic was validated by exhaustive
search over 32bit input values.
Thomas Neumann [Tue, 2 May 2023 14:21:09 +0000 (16:21 +0200)]
release the sorted FDE array when deregistering a frame [PR109685]
The atomic fastpath bypasses the code that releases the sort
array which was lazily allocated during unwinding. We now
check after deregistering if there is an array to free.
libgcc/ChangeLog:
PR libgcc/109685
* unwind-dw2-fde.c: Free sort array in atomic fast path.
target/110088: Improve operation of l-reg with const after move from d-reg.
After reload, there may be sequences like
lreg = dreg
lreg = lreg <op> const
with an LD_REGS dreg, non-LD_REGS lreg, and <op> in PLUS, IOR, AND.
If dreg dies after the first insn, it is possible to use
dreg = dreg <op> const
lreg = dreg
instead which is more efficient.
gcc/
PR target/110088
* config/avr/avr.md: Add an RTL peephole to optimize operations on
non-LD_REGS after a move from LD_REGS.
(piaop): New code iterator.
Jonathan Wakely [Wed, 10 May 2023 11:20:58 +0000 (12:20 +0100)]
libstdc++: Fix std::abs(__float128) for -NaN and -0.0 [PR109758]
The current implementation of this non-standard overload of std::abs
incorrectly returns a negative value for negative NaNs and negative
zero, because x < 0 is false in both cases.
Use fabsl(long double) or fabsf128(_Float128) if those do the right
thing. Otherwise, use __builtin_signbit(x) instead of x < 0 to detect
negative inputs. This assumes that __builtin_signbit handles __float128
correctly, but that seems to be true for all of GCC, clang and icc.
libstdc++-v3/ChangeLog:
PR libstdc++/109758
* include/bits/std_abs.h (abs(__float128)): Handle negative NaN
and negative zero correctly.
* testsuite/26_numerics/headers/cmath/109758.cc: New test.
Jonathan Wakely [Fri, 12 May 2023 12:44:21 +0000 (13:44 +0100)]
libstdc++: Remove test dependency on _GLIBCXX_USE_C99_STDINT_TR1
This should have been done in r9-2028-g8ba7f29e3dd064 when
std::shared_mutex was changed to be defined without depending on
_GLIBCXX_USE_C99_STDINT_TR1.
libstdc++-v3/ChangeLog:
* testsuite/experimental/feat-cxx14.cc: Remove dependency on
_GLIBCXX_USE_C99_STDINT_TR1.
Jonathan Wakely [Fri, 12 May 2023 12:34:37 +0000 (13:34 +0100)]
libstdc++: Remove test dependencies on _GLIBCXX_USE_C99_STDINT_TR1
These #ifdef checks should have been removed in r9-2029-g612c9c702e2c9e
when the u16string_view and u32string_view aliases were changed to be
defined unconditionally.
libstdc++-v3/ChangeLog:
* testsuite/21_strings/basic_string_view/typedefs.cc: Remove
dependency on _GLIBCXX_USE_C99_STDINT_TR1.
* testsuite/experimental/string_view/typedefs.cc: Likewise.
Jonathan Wakely [Thu, 1 Jun 2023 15:49:53 +0000 (16:49 +0100)]
libstdc++: Fix PSTL test that fails in C++20
This test fails in C++20 and later due to a warning:
warning: C++20 says that these are ambiguous, even though the second is reversed:
note: candidate 1: 'bool MyClass::operator==(const MyClass&)'
note: candidate 2: 'bool MyClass::operator==(const MyClass&)' (reversed)
note: try making the operator a 'const' member function
FAIL: 26_numerics/pstl/numeric_ops/transform_reduce.cc (test for excess errors)
libstdc++-v3/ChangeLog:
* testsuite/26_numerics/pstl/numeric_ops/transform_reduce.cc:
Add const to equality operator.
Jonathan Wakely [Mon, 15 May 2023 20:41:56 +0000 (21:41 +0100)]
libstdc++: Document removal of implicit allocator rebinding extensions
Traditionally libstdc++ allowed containers to be
instantiated with allocator's that have the wrong value type, implicitly
rebinding the allocator to the container's value type. Since C++20 that
has been explicitly ill-formed, so the extension is no longer supported
in strict modes (e.g. -std=c++17) and in C++20 and later.
libstdc++-v3/ChangeLog:
* doc/xml/manual/evolution.xml: Document removal of implicit
allocator rebinding extensions in strict mode and for C++20.
* doc/html/*: Regenerate.
Jonathan Wakely [Tue, 9 May 2023 17:18:01 +0000 (18:18 +0100)]
libstdc++: Fix <chrono> pretty printers and add tests
This fixes a couple of errors in the printers for chrono types, and adds
tests to ensure they keep working.
libstdc++-v3/ChangeLog:
* python/libstdcxx/v6/printers.py (StdChronoDurationPrinter):
Print floating-point durations correctly.
(StdChronoTimePointPrinter): Support printing only the value,
not the type name. Uncomment handling for known clocks.
(StdChronoZonedTimePrinter): Remove type names from output.
(StdChronoCalendarPrinter): Fix hh_mm_ss member access.
(StdChronoTimeZonePrinter): Add equals sign to output.
* testsuite/libstdc++-prettyprinters/chrono.cc: New test.
Alexandre Oliva [Tue, 30 May 2023 21:46:26 +0000 (18:46 -0300)]
[libstdc++] [testsuite] xfail double-prec from_chars for x86_64 ldbl
When long double is wider than double, but from_chars is implemented
in terms of double, tests that involve the full precision of long
double are expected to fail. Mark them as such on x86_64-*-vxworks*.
for libstdc++-v3/ChangeLog
* testsuite/20_util/from_chars/4.cc: Skip long double test06
on x86_64-vxworks.
* testsuite/20_util/to_chars/long_double.cc: Xfail run on
x86_64-vxworks.
Christophe Lyon [Tue, 23 May 2023 14:30:53 +0000 (14:30 +0000)]
testsuite: make mve_intrinsic_type_overloads-int.c libc-agnostic
Glibc defines int32_t as 'int' while newlib defines it as 'long int'.
Although these correspond to the same size, g++ complains when using the
'wrong' version:
invalid conversion from 'long int*' to 'int32_t*' {aka 'int*'} [-fpermissive]
or
invalid conversion from 'int*' to 'int32_t*' {aka 'long int*'} [-fpermissive]
when calling vst1q(int32*, int32x4_t) with a first parameter of type
'long int *' (resp. 'int *')
To make this test pass with any type of toolchain, this patch defines
'word_type' according to which libc is in use.
PR libstdc++/109822
* include/experimental/bits/simd.h (to_native): Use int NTTP
as specified in PTS2.
(to_compatible): Likewise. Add missing tag to call mask
generator ctor.
* testsuite/experimental/simd/pr109822_cast_functions.cc: New
test.
* testsuite/experimental/simd/tests/operator_cvt.cc: Make long
double <-> (u)long conversion tests conditional on sizeof(long
double) and sizeof(long).
Kito Cheng [Fri, 12 May 2023 08:54:57 +0000 (16:54 +0800)]
RISC-V: Suppress unused parameter warning in riscv-common.cc
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc (riscv_select_multilib_by_abi):
Drop unused parameter.
(riscv_select_multilib): Ditto.
(riscv_compute_multilib): Update call site of
riscv_select_multilib_by_abi and riscv_select_multilib_by_abi.
Kito Cheng [Thu, 4 May 2023 07:12:27 +0000 (15:12 +0800)]
RISC-V: Handle multi-lib path correclty for linux
RISC-V Linux encodes the ABI into the path, so in theory, we can only use that
to select multi-lib paths, and no way to use different multi-lib paths between
`rv32i/ilp32` and `rv32ima/ilp32`, we'll mapping both to `/lib/ilp32`.
It's hard to do that with GCC's builtin multi-lib selection mechanism; builtin
mechanism did the option string compare and then enumerate all possible reuse
rules during the build time. However, it's impossible to RISC-V; we have a huge
number of combinations of `-march`, so implementing a customized multi-lib
selection becomes the only solution.
Multi-lib configuration is only used for determines which ISA should be used
when compiling the corresponding ABI variant after this patch.
During the multi-lib selection stage, only consider -mabi as the only key to
select the multi-lib path.
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc (riscv_select_multilib_by_abi): New.
(riscv_select_multilib): New.
(riscv_compute_multilib): Extract logic to riscv_select_multilib and
also handle select_by_abi.
* config/riscv/elf.h (RISCV_USE_CUSTOMISED_MULTI_LIB): Change it
to select_by_abi_arch_cmodel from 1.
* config/riscv/linux.h (RISCV_USE_CUSTOMISED_MULTI_LIB): Define.
* config/riscv/riscv-opts.h (enum riscv_multilib_select_kind): New.
Georg-Johann Lay [Tue, 23 May 2023 12:54:12 +0000 (14:54 +0200)]
target/104327: Allow more inlining between different optimization levels.
avr-common.cc introduces the following options that are set depending
on optimization level: -mgas-isr-prologues, -mmain-is-OS-task and
-fsplit-wide-types-early. The inliner thinks that different options
disallow cross-optimization inlining, so provide can_inline_p.
gcc/
PR target/104327
* config/avr/avr.cc (avr_can_inline_p): New static function.
(TARGET_CAN_INLINE_P): Define to that function.
Georg-Johann Lay [Thu, 25 May 2023 17:02:34 +0000 (19:02 +0200)]
target/82931: Make a pattern more generic to match more bit-transfers.
There is already a pattern in avr.md that matches single-bit transfers
from one register to another one, but it only handled bit 0 of 8-bit
registers. This change makes that pattern more generic so it matches
more of similar single-bit transfers.
gcc/
PR target/82931
* config/avr/avr.md (*movbitqi.0): Rename to *movbit<mode>.0-6.
Handle any bit position and use mode QISI.
* config/avr/avr.cc (avr_rtx_costs_1) [IOR]: Return a cost
of 2 insns for bit-transfer of respective style.
gcc/testsuite/
PR target/82931
* gcc.target/avr/pr82931.c: New test.
Matthias Kretz [Thu, 23 Mar 2023 08:32:58 +0000 (09:32 +0100)]
libstdc++: Add missing constexpr to simd
The constexpr API is only available with -std=gnu++XX (and proposed for
C++26). The proposal is to have the complete simd API usable in constant
expressions.
This patch resolves several issues with using simd in constant
expressions.
Issues why constant_evaluated branches are necessary:
* subscripting vector builtins is not allowed in constant expressions
* if the implementation needs/uses memcpy
* if the implementation would otherwise call SIMD intrinsics/builtins
PR libstdc++/109949
* include/experimental/bits/simd.h (__intrinsic_type): If
__ALTIVEC__ is defined, map gnu::vector_size types to their
corresponding __vector T types without losing unsignedness of
integer types. Also prefer long long over long.
* include/experimental/bits/simd_ppc.h (_S_popcount): Cast mask
object to the expected unsigned vector type.
PR libstdc++/109261
* include/experimental/bits/simd_neon.h (_S_reduce): Add
constexpr and make NEON implementation conditional on
not __builtin_is_constant_evaluated.
PR libstdc++/109261
* include/experimental/bits/simd.h (__intrinsic_type):
Specialize __intrinsic_type<double, 8> and
__intrinsic_type<double, 16> in any case, but provide the member
type only with __aarch64__.
Georg-Johann Lay [Tue, 23 May 2023 16:49:19 +0000 (18:49 +0200)]
Improve cost computation for single-bit bit insertions.
Some miscomputation of rtx_costs lead to sub-optimal code for
single-bit bit insertions. This patch implements TARGET_INSN_COST,
which has a chance to see the whole insn during insn combination;
in particular the SET_DEST of (set (zero_extract (...) ...)).
gcc/
* config/avr/avr.cc (avr_insn_cost): New static function.
(TARGET_INSN_COST): Define to that function.
Eric Botcazou [Tue, 23 May 2023 08:15:35 +0000 (10:15 +0200)]
Fix handling of non-integral bit-fields in native_encode_initializer
The encoder for CONSTRUCTORs assumes that all bit-fields (DECL_BIT_FIELD)
have integral types, but that's not the case in Ada where they may have
pretty much any type, resulting in a wrong encoding for them
gcc/
* fold-const.cc (native_encode_initializer) <CONSTRUCTOR>: Apply the
specific treatment for bit-fields only if they have an integral type
and filter out non-integral bit-fields that do not start and end on
a byte boundary.
gcc/testsuite/
* gnat.dg/opt101.adb: New test.
* gnat.dg/opt101_pkg.ads: New helper.
Jakub Jelinek [Sun, 21 May 2023 11:36:56 +0000 (13:36 +0200)]
atch.pd: Ensure (op CONSTANT_CLASS_P CONSTANT_CLASS_P) is simplified [PR109505]
On the following testcase we hang, because POLY_INT_CST is CONSTANT_CLASS_P,
but BIT_AND_EXPR with it and INTEGER_CST doesn't simplify and the
(x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2)
simplification actually relies on the (CST1 & CST2) simplification,
otherwise it is a deoptimization, trading 2 ops for 3 and furthermore
running into
/* Given a bit-wise operation CODE applied to ARG0 and ARG1, see if both
operands are another bit-wise operation with a common input. If so,
distribute the bit operations to save an operation and possibly two if
constants are involved. For example, convert
(A | B) & (A | C) into A | (B & C)
Further simplification will occur if B and C are constants. */
simplification which simplifies that
(x & CST2) | (CST1 & CST2) back to
CST2 & (x | CST1).
I went through all other places I could find where we have a simplification
with 2 CONSTANT_CLASS_P operands and perform some operation on those two,
while the other spots aren't that severe (just trade 2 operations for
another 2 if the two constants don't simplify, rather than as in the above
case trading 2 ops for 3), I still think all those spots really intend
to optimize only if the 2 constants simplify.
So, the following patch adds to those a ! modifier to ensure that,
even at GENERIC that modifier means !EXPR_P which is exactly what we want
IMHO.
2023-05-21 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/109505
* match.pd ((x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2),
Combine successive equal operations with constants,
(A +- CST1) +- CST2 -> A + CST3, (CST1 - A) +- CST2 -> CST3 - A,
CST1 - (CST2 - A) -> CST3 + A): Use ! on ops with 2 CONSTANT_CLASS_P
operands.
Kewen Lin [Wed, 17 May 2023 07:48:40 +0000 (02:48 -0500)]
vect: Don't retry if the previous analysis fails
When working on a cost tweaking patch, I found that a newly
added test case has different dumpings with stage-1 and
bootstrapped gcc. By looking into it, the apparent reason
is vect_analyze_loop_2 doesn't get slp_done_for_suggested_uf
set expectedly, the following retrying will use the garbage
slp_done_for_suggested_uf instead. In fact, the setting of
slp_done_for_suggested_uf only happens when the previous
analysis succeeds, for the mentioned test case, its previous
analysis does fail, it's unexpected to use the value of
slp_done_for_suggested_uf any more.
In function vect_analyze_loop_1, we only return success when
res is true, which is the result of 1st analysis. It means
we never try to vectorize with unroll_vinfo if the previous
analysis fails. So this patch shouldn't break anything, and
just stop some useless analysis early.
gcc/ChangeLog:
* tree-vect-loop.cc (vect_analyze_loop_1): Don't retry analysis with
suggested unroll factor once the previous analysis fails.
* config.host: Arrange to set min Darwin OS versions from
the configured host version.
* config/darwin10-unwind-find-enc-func.c: Do not use current
headers, but declare the nexessary structures locally to the
versions in use for Mac OSX 10.6.
* config/t-darwin: Amend to handle configured min OS
versions.
* config/t-darwin-min-1: New.
* config/t-darwin-min-5: New.
* config/t-darwin-min-8: New.
Triffid Hunter [Sat, 20 May 2023 05:50:00 +0000 (07:50 +0200)]
target/105753: Fix ICE in add_clobbers due to extra PARALLEL in insn.
This patch removes the superfluous parallel in [u]divmod patterns in
the AVR backend. Effect of extra parallel is that add_clobbers reaches
gcc_unreachable() because the clobbers for [u]divmod are missing.
If an insn has multiple parts like clobbers, the parallel around the
parts of the insn pattern is implicit.
gcc/
PR target/105753
Backport from 2023-05-20 https://gcc.gnu.org/r14-1016
* config/avr/avr.md (divmodpsi, udivmodpsi, divmodsi, udivmodsi):
Remove superfluous "parallel" in insn pattern.
([u]divmod<mode>4): Tidy code. Use gcc_unreachable() instead of
printing error text to assembly.
gcc/testsuite/
PR target/105753
Backport from 2023-05-20 https://gcc.gnu.org/r14-1016
* gcc.target/avr/torture/pr105753.c: New test.
Andreas Schwab [Sat, 23 Apr 2022 13:48:42 +0000 (15:48 +0200)]
riscv/linux: Don't add -latomic with -pthread
Now that we have support for inline subword atomic operations, it is no
longer necessary to link against libatomic. This also fixes testsuite
failures because the framework does not properly set up the linker flags
for finding libatomic.
The use of atomic operations is also independent of the use of libpthread.
Patrick Palka [Tue, 16 May 2023 16:39:16 +0000 (12:39 -0400)]
c++: desig init in presence of list ctor [PR109871]
add_list_candidates has logic to reject designated initialization of a
non-aggregate type, but this is inadvertently being suppressed if the type
has a list constructor due to the order of case analysis, which in the
below testcase leads to us incorrectly treating the initializer list as if
it's non-designated. This patch fixes this by making us check for invalid
designated initialization sooner.
PR c++/109871
gcc/cp/ChangeLog:
* call.cc (add_list_candidates): Check for invalid designated
initialization sooner and even for types that have a list
constructor.
Harald Anlauf [Sun, 14 May 2023 19:53:51 +0000 (21:53 +0200)]
Fortran: CLASS pointer function result in variable definition context [PR109846]
gcc/fortran/ChangeLog:
PR fortran/109846
* expr.cc (gfc_check_vardef_context): Check appropriate pointer
attribute for CLASS vs. non-CLASS function result in variable
definition context.
gcc/testsuite/ChangeLog:
PR fortran/109846
* gfortran.dg/ptr-func-5.f90: New test.
Harald Anlauf [Fri, 5 May 2023 19:22:12 +0000 (21:22 +0200)]
Fortran: overloading of intrinsic binary operators [PR109641]
Fortran allows overloading of intrinsic operators also for operands of
numeric intrinsic types. The intrinsic operator versions are used
according to the rules of F2018 table 10.2 and imply type conversion as
long as the operand ranks are conformable. Otherwise no type conversion
shall be performed to allow the resolution of a matching user-defined
operator.
gcc/fortran/ChangeLog:
PR fortran/109641
* arith.cc (eval_intrinsic): Check conformability of ranks of operands
for intrinsic binary operators before performing type conversions.
* gfortran.h (gfc_op_rank_conformable): Add prototype.
* resolve.cc (resolve_operator): Check conformability of ranks of
operands for intrinsic binary operators before performing type
conversions.
(gfc_op_rank_conformable): New helper function to compare ranks of
operands of binary operator.
gcc/testsuite/ChangeLog:
PR fortran/109641
* gfortran.dg/overload_5.f90: New test.
arm testsuite: Shifts and get_FPSCR ACLE optimisation fixes
These newly updated tests were rewritten by Andrea. Some of them
needed further manual fixing as follows:
* The #shift immediate value not in the check-function-bodies as expected
* The ACLE was specifying sub-optimal code: lsr+and instead of ubfx. In
this case the test rewritten from the ACLE had the lsr+and pattern,
but the compiler was able to optimise to ubfx. Hence I've changed the
test to now match on ubfx.
* Added a separate test to check shift on constants being optimised to
movs.
gcc/testsuite/ChangeLog:
* gcc.target/arm/mve/intrinsics/srshr.c: Update shift value.
* gcc.target/arm/mve/intrinsics/srshrl.c: Update shift value.
* gcc.target/arm/mve/intrinsics/uqshl.c: Update shift value.
* gcc.target/arm/mve/intrinsics/uqshll.c: Update shift value.
* gcc.target/arm/mve/intrinsics/urshr.c: Update shift value.
* gcc.target/arm/mve/intrinsics/urshrl.c: Update shift value.
* gcc.target/arm/mve/intrinsics/vadciq_m_s32.c: Update to ubfx.
* gcc.target/arm/mve/intrinsics/vadciq_m_u32.c: Update to ubfx.
* gcc.target/arm/mve/intrinsics/vadciq_s32.c: Update to ubfx.
* gcc.target/arm/mve/intrinsics/vadciq_u32.c: Update to ubfx.
* gcc.target/arm/mve/intrinsics/vadcq_m_s32.c: Update to ubfx.
* gcc.target/arm/mve/intrinsics/vadcq_m_u32.c: Update to ubfx.
* gcc.target/arm/mve/intrinsics/vadcq_s32.c: Update to ubfx.
* gcc.target/arm/mve/intrinsics/vadcq_u32.c: Update to ubfx.
* gcc.target/arm/mve/intrinsics/vsbciq_m_s32.c: Update to ubfx.
* gcc.target/arm/mve/intrinsics/vsbciq_m_u32.c: Update to ubfx.
* gcc.target/arm/mve/intrinsics/vsbciq_s32.c: Update to ubfx.
* gcc.target/arm/mve/intrinsics/vsbciq_u32.c: Update to ubfx.
* gcc.target/arm/mve/intrinsics/vsbcq_m_s32.c: Update to ubfx.
* gcc.target/arm/mve/intrinsics/vsbcq_m_u32.c: Update to ubfx.
* gcc.target/arm/mve/intrinsics/vsbcq_s32.c: Update to ubfx.
* gcc.target/arm/mve/intrinsics/vsbcq_u32.c: Update to ubfx.
* gcc.target/arm/mve/mve_const_shifts.c: New test.
arm testsuite: XFAIL or relax registers in some tests [PR109697]
This is a simple testsuite tidy-up patch, addressing to types of errors:
* The vcmp vector-scalar tests failing due to the compiler's preference
of vector-vector comparisons, over vector-scalar comparisons. This is
due to the lack of cost model for MVE and the compiler not knowing that
the RTL vec_duplicate is free in those instructions. For now, we simply
XFAIL these checks.
* The tests for pr108177 had strict usage of q0 and r0 registers,
meaning that they would FAIL with -mfloat-abi=softf. The register checks
have now been relaxed. A couple of these run-tests also had incosistent
use of integer MVE with floating point vectors, so I've now changed
these to use FP MVE.
Following Andrea's overhaul of the MVE testsuite, these tests are now
reduntant, as equivalent checks have been added to the each intrinsic's
<intrinsic name>.c test.
arm: Fix MVE header pointer overloads this time (and a bit more tidying)
Hi all,
Previously we had fixed the overloading of scalar arguments to intrinsics
with the introduction of a new `__ARM_mve_coerce3` _ Generic association.
This allowed users to give types other than int32_t, e.g. int, short, long,
etc., which previously would emit a nonsensical error message from the
_Generic.
Here I adjust that handling slightly and I am also doing the same thing, but
for pointer types:
(un)signed char* can be now used instead of (u)int8_t*
(un)signed short* can be now used instead of (u)int16_t*
(un)signed int* and long* can be now used instead of (u)int32_t*
(un)signed long long* can be now used instead of (u)int64_t*
__fp16* and _Float16* can be now used instead of float16_t*
float* can be now used instead of float32_t*
This required me to break down the _coerce_ generics for the specific
pointer types.
On the scalar types, the change in this patch is minor, renaming the
_coerce_ generics and passing all scalars through the `__typeof` for
consistency with each-other.
No test regressions in the GCC testsuite or CMSIS-NN.
arm: Stop vadcq, vsbcq intrinsics from overwriting the FPSCR NZ flags
Hi all,
We noticed that calls to the vadcq and vsbcq intrinsics, both of
which use __builtin_arm_set_fpscr_nzcvqc to set the Carry flag in
the FPSCR, would produce the following code:
when the MVE ACLE instead gives a different instruction sequence of:
```
< Rt is the *carry input >
VMRS Rs,FPSCR_nzcvqc
BFI Rs,Rt,#29,#1
VMSR FPSCR_nzcvqc,Rs
```
the bic + orr pair is slower and it's also wrong, because, if the
*carry input is greater than 1, then we risk overwriting the top two
bits of the FPSCR register (the N and Z flags).
This turned out to be a problem in the header file and the solution was
to simply add a `& 1x0u` to the `*carry` input: then the compiler knows
that we only care about the lowest bit and can optimise to a BFI.
Andrea Corallo [Wed, 19 Apr 2023 16:08:57 +0000 (18:08 +0200)]
arm: Mve backend + testsuite fixes 2
Hi all,
this patch improves a number of MVE tests in the testsuite for more
precise and better coverage using check-function-bodies instead of
scan-assembler checks. Also all intrusctions prescribed in the
ACLE[1] are now checked.
Also a number of simple fixes are done in the backend to fix
capitalization and spacing.
Andrea Corallo [Thu, 23 Mar 2023 14:36:37 +0000 (15:36 +0100)]
arm: Fix vstrwq* backend + testsuite
Hi all,
this patch fixes the vstrwq* MVE instrinsics failing to emit the
correct sequence of instruction due to a missing predicate. Also the
immediate range is fixed to be multiples of 2 up between [-252, 252].
Best Regards
Andrea
gcc/ChangeLog:
* config/arm/constraints.md (mve_vldrd_immediate): Move it to
predicates.md.
(Ri): Move constraint definition from predicates.md.
(Rl): Define new constraint.
* config/arm/mve.md (mve_vstrwq_scatter_base_wb_p_<supf>v4si): Add
missing constraint.
(mve_vstrwq_scatter_base_wb_p_fv4sf): Add missing Up constraint
for op 1, use mve_vstrw_immediate predicate and Rl constraint for
op 2. Fix asm output spacing.
(mve_vstrdq_scatter_base_wb_p_<supf>v2di): Add missing constraint.
* config/arm/predicates.md (Ri) Move constraint to constraints.md
(mve_vldrd_immediate): Move it from
constraints.md.
(mve_vstrw_immediate): New predicate.
Andrea Corallo [Tue, 28 Feb 2023 10:03:18 +0000 (11:03 +0100)]
arm: Mve testsuite improvements
Hello all,
this patch improves a number of MVE tests in the testsuite for more
precise and better coverage using check-function-bodies instead of
scan-assembler checks. Also all intrusctions prescribed in the ACLE[1]
are now checked.
Jakub Jelinek [Wed, 17 May 2023 19:21:23 +0000 (21:21 +0200)]
libstdc++: Fix up some <cmath> templates [PR109883]
As can be seen on the following testcase, for
std::{atan2,fmod,pow,copysign,fdim,fmax,fmin,hypot,nextafter,remainder,remquo,fma}
if one operand type is std::float{16,32,64,128}_t or std::bfloat16_t and
another one some integral type or some other floating point type which
promotes to the other operand's type, we can end up with endless recursion.
This is because of a declaration ordering problem in <cmath>, where the
float, double and long double overloads of those functions come before
the templates which use __gnu_cxx::__promote_{2,3}, but the
std::float{16,32,64,128}_t and std::bfloat16_t overloads come later in the
file. If the result of those promotions is _Float{16,32,64,128} or
__gnu_cxx::__bfloat16_t, say std::pow(_Float64, int) calls
std::pow(_Float64, _Float64) and the latter calls itself.
The following patch fixes that by moving those templates later in the file,
so that the calls from those templates see also the other overloads.
I think other templates in the file like e.g. isgreater etc. shouldn't be
a problem, because those just use __builtin_isgreater etc. in their bodies.
2023-05-17 Jakub Jelinek <jakub@redhat.com>
PR libstdc++/109883
* include/c_global/cmath (atan2, fmod, pow): Move
__gnu_cxx::__promote_2 using templates after _Float{16,32,64,128} and
__gnu_cxx::__bfloat16_t overloads.
(copysign, fdim, fmax, fmin, hypot, nextafter, remainder, remquo):
Likewise.
(fma): Move __gnu_cxx::__promote_3 using template after
_Float{16,32,64,128} and __gnu_cxx::__bfloat16_t overloads.
* testsuite/26_numerics/headers/cmath/constexpr_std_c++23.cc: New test.
Jakub Jelinek [Wed, 17 May 2023 18:59:54 +0000 (20:59 +0200)]
i386: Fix up types in __builtin_{inf,huge_val,nan{,s},fabs,copysign}q builtins [PR109884]
When _Float128 support has been added to C++ for 13.1, float128t_type_node
tree has been added - in C float128_type_node and float128t_type_node is
the same and represents both _Float128 and __float128, but in C++ they
are distinct types which have different handling in the FEs.
When doing that change, I mistakenly forgot to change FLOAT128 primitive
type, which is used for the __builtin_{inf,huge_val,nan{,s},fabs,copysign}q
builtins results and some of their arguments (and nothing else).
The following patch fixes that.
On ia64 we already use float128t_type_node for those builtins, pa while
it has __float128 that type is the same as long double and so those builtins
have long double types and on powerpc seems we don't have these builtins
but instead define macros which map them to __builtin_*f128. That will
not work properly in C++, perhaps we should change those macros to be
function-like and cast to __float128.
2023-05-17 Jakub Jelinek <jakub@redhat.com>
PR c++/109884
* config/i386/i386-builtin-types.def (FLOAT128): Use
float128t_type_node rather than float128_type_node.
Jakub Jelinek [Wed, 17 May 2023 08:15:50 +0000 (10:15 +0200)]
c++: Don't try to initialize zero width bitfields in zero initialization [PR109868]
My GCC 12 change to avoid removing zero-sized bitfields as they are
important for ABI and are needed for layout compatibility traits
apparently causes zero sized bitfields to be initialized in the IL,
which at least in 13+ results in ICEs in the ranger which is upset
about zero precision types.
I think we could even avoid initializing other unnamed bitfields, but
unfortunately !CONSTRUCTOR_NO_CLEARING doesn't mean in the middle-end
clearing of padding bits and until we have some new flag that represents
the request to clear padding bits, I think it is better to keep zeroing
non-zero sized unnamed bitfields.
In addition to skipping those fields, I have changed the logic how
UNION_TYPEs are handled, the current code was a little bit weird in that
e.g. if first non-static data member had error_mark_node type, we'd happily
zero initialize the second non-static data member, etc.
2023-05-17 Jakub Jelinek <jakub@redhat.com>
PR c++/109868
* init.cc (build_zero_init_1): Don't initialize zero-width bitfields.
For unions only initialize the first FIELD_DECL.
Marek Polacek [Tue, 16 May 2023 18:12:06 +0000 (14:12 -0400)]
c++: -Wdangling-reference not suppressed in template [PR109774]
In check_return_expr, we suppress the -Wdangling-reference warning when
we're sure it would be a false positive. It wasn't working in a
template, though, because the suppress_warning call was never reached.
PR c++/109774
gcc/cp/ChangeLog:
* typeck.cc (check_return_expr): In a template, return only after
suppressing -Wdangling-reference.
Patrick O'Neill [Tue, 18 Apr 2023 21:33:13 +0000 (14:33 -0700)]
RISCV: Inline subword atomic ops
RISC-V has no support for subword atomic operations; code currently
generates libatomic library calls.
This patch changes the default behavior to inline subword atomic calls
(using the same logic as the existing library call).
Behavior can be specified using the -minline-atomics and
-mno-inline-atomics command line flags.
gcc/libgcc/config/riscv/atomic.c has the same logic implemented in asm.
This will need to stay for backwards compatibility and the
-mno-inline-atomics flag.
2023-05-03 Patrick O'Neill <patrick@rivosinc.com>
gcc/ChangeLog:
PR target/104338
* config/riscv/riscv-protos.h: Add helper function stubs.
* config/riscv/riscv.cc: Add helper functions for subword masking.
* config/riscv/riscv.opt: Add command-line flags
-minline-atomics and -mno-inline-atomics.
* config/riscv/sync.md: Add masking logic and inline asm for
fetch_and_op, fetch_and_nand, CAS, and exchange ops.
* doc/invoke.texi: Add blurb regarding new command-line flags
-minline-atomics and -mno-inline-atomics.
libgcc/ChangeLog:
PR target/104338
* config/riscv/atomic.c: Add reference to duplicate logic.
gcc/testsuite/ChangeLog:
PR target/104338
* gcc.target/riscv/inline-atomics-1.c: New test.
* gcc.target/riscv/inline-atomics-2.c: New test.
* gcc.target/riscv/inline-atomics-3.c: New test.
* gcc.target/riscv/inline-atomics-4.c: New test.
* gcc.target/riscv/inline-atomics-5.c: New test.
* gcc.target/riscv/inline-atomics-6.c: New test.
* gcc.target/riscv/inline-atomics-7.c: New test.
* gcc.target/riscv/inline-atomics-8.c: New test.
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Tobias Burnus [Fri, 12 May 2023 14:27:40 +0000 (16:27 +0200)]
LTO: Fix writing of toplevel asm with offloading [PR109816]
When offloading was enabled, top-level 'asm' were added to the offloading
section, confusing assemblers which did not support the syntax. Additionally,
with offloading and -flto, the top-level assembler code did not end up
in the host files.
As r14-321-g9a41d2cdbcd added top-level 'asm' to one libstdc++ header file,
the issue became more apparent, causing fails with nvptx for some
C++ testcases.
PR libstdc++/109816
gcc/ChangeLog:
* lto-cgraph.cc (output_symtab): Guard lto_output_toplevel_asms by
'!lto_stream_offload_p'.
libgomp/ChangeLog:
* testsuite/libgomp.c++/target-map-class-1.C: New test.
* testsuite/libgomp.c++/target-map-class-2.C: New test.
Sören Tempel [Sun, 14 May 2023 17:30:21 +0000 (19:30 +0200)]
fix assert in __deregister_frame_info_bases
The assertion in __deregister_frame_info_bases assumes that for every
frame something was inserted into the lookup data structure by
__register_frame_info_bases. Unfortunately, this does not necessarily
hold true as the btree_insert call in __register_frame_info_bases will
not insert anything for empty ranges. Therefore, we need to explicitly
account for such empty ranges in the assertion as `ob` will be a null
pointer for such ranges, hence causing the assertion to fail.