Iain Buclaw [Mon, 5 Jun 2023 16:30:12 +0000 (18:30 +0200)]
d: Warn when declared size of a special enum does not match its intrinsic type.
All special enums have declarations in the D runtime library, but the
compiler will recognize and treat them specially if declared in any
module. When the underlying base type of a special enum is a different
size to its matched intrinsic, then this can cause undefined behavior at
runtime. Detect and warn about when such a mismatch occurs.
gcc/d/ChangeLog:
* gdc.texi (Warnings): Document -Wextra and -Wmismatched-special-enum.
* implement-d.texi (Special Enums): Add reference to warning option
-Wmismatched-special-enum.
* lang.opt: Add -Wextra and -Wmismatched-special-enum.
* types.cc (TypeVisitor::visit (TypeEnum *)): Warn when declared
special enum size mismatches its intrinsic type.
Thomas Neumann [Wed, 10 May 2023 10:33:49 +0000 (12:33 +0200)]
fix radix sort on 32bit platforms [PR109670]
The radix sort uses two buffers, a1 for input and a2 for output.
After every digit the role of the two buffers is swapped.
When terminating the sort early the code made sure the output
was in a2. However, when we run out of bits, as can happen on
32bit platforms, the sorted result was in a1, as we had just
swapped a1 and a2.
This patch fixes the problem by unconditionally having a1 as
output after every loop iteration.
This bug manifested itself only on 32bit platforms and even then
only in some circumstances, as it needs frames where a swap
is required due to differences in the top-most byte, which is
affected by ASLR. The new logic was validated by exhaustive
search over 32bit input values.
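The buffer handling described above can be sketched as follows. This is a minimal illustration on plain 32-bit integers, not the libgcc code: the name radix_sort_u32, the 8-bit digit width and the use of std::vector are assumptions made for the example. The point is that a1 is the output after every loop iteration, so both the early exit and running out of bits leave the result in a1.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper (not the libgcc function): LSD radix sort, in place.
void radix_sort_u32(std::vector<std::uint32_t>& data) {
    constexpr unsigned kBits = 8;                           // bits per digit
    constexpr unsigned kFanout = 1u << kBits;               // buckets per round
    constexpr unsigned kRounds = (32 + kBits - 1) / kBits;  // 4 rounds for 32 bits

    std::vector<std::uint32_t> scratch(data.size());
    std::uint32_t* a1 = data.data();     // input of the current round
    std::uint32_t* a2 = scratch.data();  // output of the current round
    const std::size_t n = data.size();

    for (unsigned round = 0; round != kRounds; ++round) {
        const unsigned shift = round * kBits;
        std::size_t counts[kFanout] = {};
        bool sorted = true;

        // Count elements per bucket and check whether a1 is already sorted.
        for (std::size_t i = 0; i < n; ++i) {
            ++counts[(a1[i] >> shift) & (kFanout - 1)];
            if (i > 0 && a1[i - 1] > a1[i]) sorted = false;
        }
        if (sorted) break;  // early exit: the result is in a1

        // Exclusive prefix sum turns counts into start offsets.
        std::size_t sum = 0;
        for (unsigned b = 0; b < kFanout; ++b) {
            const std::size_t c = counts[b];
            counts[b] = sum;
            sum += c;
        }

        // Stable placement into a2 by the current digit.
        for (std::size_t i = 0; i < n; ++i)
            a2[counts[(a1[i] >> shift) & (kFanout - 1)]++] = a1[i];

        std::swap(a1, a2);  // a1 is the output again after every iteration
    }

    // The data is in a1 in every case; move it in place if needed.
    if (a1 != data.data()) std::copy(a1, a1 + n, data.data());
}
```

Inputs that differ only in the top-most byte need all four rounds, which is the case the commit message calls running out of bits.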
Thomas Neumann [Tue, 2 May 2023 14:21:09 +0000 (16:21 +0200)]
release the sorted FDE array when deregistering a frame [PR109685]
The atomic fastpath bypasses the code that releases the sort
array which was lazily allocated during unwinding. We now
check after deregistering if there is an array to free.
libgcc/ChangeLog:
PR libgcc/109685
* unwind-dw2-fde.c: Free sort array in atomic fast path.
target/110088: Improve operation of l-reg with const after move from d-reg.
After reload, there may be sequences like
lreg = dreg
lreg = lreg <op> const
with an LD_REGS dreg, non-LD_REGS lreg, and <op> in PLUS, IOR, AND.
If dreg dies after the first insn, it is possible to use
dreg = dreg <op> const
lreg = dreg
instead which is more efficient.
gcc/
PR target/110088
* config/avr/avr.md: Add an RTL peephole to optimize operations on
non-LD_REGS after a move from LD_REGS.
(piaop): New code iterator.
Jonathan Wakely [Wed, 10 May 2023 11:20:58 +0000 (12:20 +0100)]
libstdc++: Fix std::abs(__float128) for -NaN and -0.0 [PR109758]
The current implementation of this non-standard overload of std::abs
incorrectly returns a negative value for negative NaNs and negative
zero, because x < 0 is false in both cases.
Use fabsl(long double) or fabsf128(_Float128) if those do the right
thing. Otherwise, use __builtin_signbit(x) instead of x < 0 to detect
negative inputs. This assumes that __builtin_signbit handles __float128
correctly, but that seems to be true for all of GCC, clang and icc.
libstdc++-v3/ChangeLog:
PR libstdc++/109758
* include/bits/std_abs.h (abs(__float128)): Handle negative NaN
and negative zero correctly.
* testsuite/26_numerics/headers/cmath/109758.cc: New test.
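The difference between the two sign tests can be sketched on double, which is used here only because it is portable; the commit itself concerns __float128. The function names are hypothetical and std::signbit stands in for __builtin_signbit.

```cpp
#include <cmath>

// Comparison form: x < 0 is false for -0.0 and for negative NaN,
// so both are returned with the sign bit still set.
inline double abs_by_compare(double x) { return x < 0 ? -x : x; }

// Sign-bit form: negate whenever the sign bit is set.
inline double abs_by_signbit(double x) { return std::signbit(x) ? -x : x; }
```

On an IEEE 754 target the second form clears the sign of negative zero and of a negative NaN, while the first leaves it in place.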
Jonathan Wakely [Fri, 12 May 2023 12:44:21 +0000 (13:44 +0100)]
libstdc++: Remove test dependency on _GLIBCXX_USE_C99_STDINT_TR1
This should have been done in r9-2028-g8ba7f29e3dd064 when
std::shared_mutex was changed to be defined without depending on
_GLIBCXX_USE_C99_STDINT_TR1.
libstdc++-v3/ChangeLog:
* testsuite/experimental/feat-cxx14.cc: Remove dependency on
_GLIBCXX_USE_C99_STDINT_TR1.
Jonathan Wakely [Fri, 12 May 2023 12:34:37 +0000 (13:34 +0100)]
libstdc++: Remove test dependencies on _GLIBCXX_USE_C99_STDINT_TR1
These #ifdef checks should have been removed in r9-2029-g612c9c702e2c9e
when the u16string_view and u32string_view aliases were changed to be
defined unconditionally.
libstdc++-v3/ChangeLog:
* testsuite/21_strings/basic_string_view/typedefs.cc: Remove
dependency on _GLIBCXX_USE_C99_STDINT_TR1.
* testsuite/experimental/string_view/typedefs.cc: Likewise.
Jonathan Wakely [Thu, 1 Jun 2023 15:49:53 +0000 (16:49 +0100)]
libstdc++: Fix PSTL test that fails in C++20
This test fails in C++20 and later due to a warning:
warning: C++20 says that these are ambiguous, even though the second is reversed:
note: candidate 1: 'bool MyClass::operator==(const MyClass&)'
note: candidate 2: 'bool MyClass::operator==(const MyClass&)' (reversed)
note: try making the operator a 'const' member function
FAIL: 26_numerics/pstl/numeric_ops/transform_reduce.cc (test for excess errors)
libstdc++-v3/ChangeLog:
* testsuite/26_numerics/pstl/numeric_ops/transform_reduce.cc:
Add const to equality operator.
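A minimal sketch of the fixed form, assuming a class shaped like the one named in the warning (the member `value` is made up for the example):

```cpp
struct MyClass {
    int value;

    // const-qualified: both operands are now converted the same way, so the
    // C++20 reversed candidate no longer competes ambiguously with this one.
    bool operator==(const MyClass& other) const { return value == other.value; }
};
```

Without the const, each of the two candidates is a better match for one of the operands, which is what the warning reports.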
Jonathan Wakely [Mon, 15 May 2023 20:41:56 +0000 (21:41 +0100)]
libstdc++: Document removal of implicit allocator rebinding extensions
Traditionally libstdc++ allowed containers to be
instantiated with allocators that have the wrong value type, implicitly
rebinding the allocator to the container's value type. Since C++20 that
has been explicitly ill-formed, so the extension is no longer supported
in strict modes (e.g. -std=c++17) and in C++20 and later.
libstdc++-v3/ChangeLog:
* doc/xml/manual/evolution.xml: Document removal of implicit
allocator rebinding extensions in strict mode and for C++20.
* doc/html/*: Regenerate.
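Code that relied on the extension can rebind the allocator explicitly. A minimal sketch, where the alias name vector_with is hypothetical and std::allocator_traits is the standard facility doing the work:

```cpp
#include <memory>
#include <type_traits>
#include <vector>

// Hypothetical alias: rebind Alloc to T before naming the container type,
// instead of relying on the container to do so implicitly.
template <typename T, typename Alloc>
using vector_with =
    std::vector<T, typename std::allocator_traits<Alloc>::template rebind_alloc<T>>;

static_assert(std::is_same_v<vector_with<int, std::allocator<char>>,
                             std::vector<int, std::allocator<int>>>,
              "allocator is rebound to the container's value type");
```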
Jonathan Wakely [Tue, 9 May 2023 17:18:01 +0000 (18:18 +0100)]
libstdc++: Fix <chrono> pretty printers and add tests
This fixes a couple of errors in the printers for chrono types, and adds
tests to ensure they keep working.
libstdc++-v3/ChangeLog:
* python/libstdcxx/v6/printers.py (StdChronoDurationPrinter):
Print floating-point durations correctly.
(StdChronoTimePointPrinter): Support printing only the value,
not the type name. Uncomment handling for known clocks.
(StdChronoZonedTimePrinter): Remove type names from output.
(StdChronoCalendarPrinter): Fix hh_mm_ss member access.
(StdChronoTimeZonePrinter): Add equals sign to output.
* testsuite/libstdc++-prettyprinters/chrono.cc: New test.
Alexandre Oliva [Tue, 30 May 2023 21:46:26 +0000 (18:46 -0300)]
[libstdc++] [testsuite] xfail double-prec from_chars for x86_64 ldbl
When long double is wider than double, but from_chars is implemented
in terms of double, tests that involve the full precision of long
double are expected to fail. Mark them as such on x86_64-*-vxworks*.
for libstdc++-v3/ChangeLog
* testsuite/20_util/from_chars/4.cc: Skip long double test06
on x86_64-vxworks.
* testsuite/20_util/to_chars/long_double.cc: Xfail run on
x86_64-vxworks.
Christophe Lyon [Tue, 23 May 2023 14:30:53 +0000 (14:30 +0000)]
testsuite: make mve_intrinsic_type_overloads-int.c libc-agnostic
Glibc defines int32_t as 'int' while newlib defines it as 'long int'.
Although these correspond to the same size, g++ complains when using the
'wrong' version:
invalid conversion from 'long int*' to 'int32_t*' {aka 'int*'} [-fpermissive]
or
invalid conversion from 'int*' to 'int32_t*' {aka 'long int*'} [-fpermissive]
when calling vst1q(int32*, int32x4_t) with a first parameter of type
'long int *' (resp. 'int *')
To make this test pass with any type of toolchain, this patch defines
'word_type' according to which libc is in use.
PR libstdc++/109822
* include/experimental/bits/simd.h (to_native): Use int NTTP
as specified in PTS2.
(to_compatible): Likewise. Add missing tag to call mask
generator ctor.
* testsuite/experimental/simd/pr109822_cast_functions.cc: New
test.
* testsuite/experimental/simd/tests/operator_cvt.cc: Make long
double <-> (u)long conversion tests conditional on sizeof(long
double) and sizeof(long).
Kito Cheng [Fri, 12 May 2023 08:54:57 +0000 (16:54 +0800)]
RISC-V: Suppress unused parameter warning in riscv-common.cc
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc (riscv_select_multilib_by_abi):
Drop unused parameter.
(riscv_select_multilib): Ditto.
(riscv_compute_multilib): Update call site of
riscv_select_multilib_by_abi and riscv_select_multilib.
Kito Cheng [Thu, 4 May 2023 07:12:27 +0000 (15:12 +0800)]
RISC-V: Handle multi-lib path correctly for linux
RISC-V Linux encodes the ABI into the path, so in theory only the ABI can be
used to select multi-lib paths. There is no way to use different multi-lib paths
for `rv32i/ilp32` and `rv32ima/ilp32`; both map to `/lib/ilp32`.
That is hard to do with GCC's builtin multi-lib selection mechanism, which
compares option strings and enumerates all possible reuse
rules at build time. That is not feasible for RISC-V, which has a huge
number of `-march` combinations, so implementing a customized multi-lib
selection is the only solution.
After this patch, the multi-lib configuration only determines which ISA is used
when compiling the corresponding ABI variant.
During the multi-lib selection stage, -mabi is the only key used to
select the multi-lib path.
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc (riscv_select_multilib_by_abi): New.
(riscv_select_multilib): New.
(riscv_compute_multilib): Extract logic to riscv_select_multilib and
also handle select_by_abi.
* config/riscv/elf.h (RISCV_USE_CUSTOMISED_MULTI_LIB): Change it
to select_by_abi_arch_cmodel from 1.
* config/riscv/linux.h (RISCV_USE_CUSTOMISED_MULTI_LIB): Define.
* config/riscv/riscv-opts.h (enum riscv_multilib_select_kind): New.
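The selection rule described above (match on -mabi only, ignore -march) can be sketched as follows. This is an illustration under stated assumptions, not the riscv-common.cc implementation: the struct, the function name pick_dir_by_abi and the table contents used with it are hypothetical.

```cpp
#include <string>
#include <string_view>
#include <vector>

// Hypothetical description of one configured multi-lib variant.
struct MultilibEntry {
    std::string march;  // ISA the variant was built with (not a selection key)
    std::string mabi;   // ABI, the only selection key in this sketch
    std::string dir;    // directory the variant lives in
};

// Returns the directory of the first entry whose ABI matches, or an empty
// string when no entry matches. The -march value is deliberately ignored.
std::string pick_dir_by_abi(const std::vector<MultilibEntry>& entries,
                            std::string_view mabi) {
    for (const MultilibEntry& e : entries)
        if (e.mabi == mabi) return e.dir;
    return {};
}
```

With such a table, a request for rv32i/ilp32 and one for rv32ima/ilp32 both resolve to the same ilp32 directory.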
Georg-Johann Lay [Tue, 23 May 2023 12:54:12 +0000 (14:54 +0200)]
target/104327: Allow more inlining between different optimization levels.
avr-common.cc introduces the following options that are set depending
on optimization level: -mgas-isr-prologues, -mmain-is-OS-task and
-fsplit-wide-types-early. The inliner thinks that different options
disallow cross-optimization inlining, so provide can_inline_p.
gcc/
PR target/104327
* config/avr/avr.cc (avr_can_inline_p): New static function.
(TARGET_CAN_INLINE_P): Define to that function.
Georg-Johann Lay [Thu, 25 May 2023 17:02:34 +0000 (19:02 +0200)]
target/82931: Make a pattern more generic to match more bit-transfers.
There is already a pattern in avr.md that matches single-bit transfers
from one register to another one, but it only handled bit 0 of 8-bit
registers. This change makes that pattern more generic so that it matches
more such single-bit transfers.
gcc/
PR target/82931
* config/avr/avr.md (*movbitqi.0): Rename to *movbit<mode>.0-6.
Handle any bit position and use mode QISI.
* config/avr/avr.cc (avr_rtx_costs_1) [IOR]: Return a cost
of 2 insns for bit-transfer of respective style.
gcc/testsuite/
PR target/82931
* gcc.target/avr/pr82931.c: New test.
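In source terms, a single-bit transfer looks roughly like the sketch below. The function name move_bit is hypothetical, and whether this exact expression shape is what the avr.md pattern matches is an assumption; it only illustrates the operation.

```cpp
#include <cstdint>

// Copies bit `src_bit` of `src` into bit `dst_bit` of `dst` and leaves all
// other bits of `dst` unchanged.
constexpr std::uint8_t move_bit(std::uint8_t dst, unsigned dst_bit,
                                std::uint8_t src, unsigned src_bit) {
    const unsigned bit = (src >> src_bit) & 1u;
    return static_cast<std::uint8_t>((dst & ~(1u << dst_bit)) | (bit << dst_bit));
}

static_assert(move_bit(0x00, 3, 0x80, 7) == 0x08, "set bit 3 from bit 7");
static_assert(move_bit(0xff, 0, 0x00, 5) == 0xfe, "clear bit 0 from bit 5");
```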
Matthias Kretz [Thu, 23 Mar 2023 08:32:58 +0000 (09:32 +0100)]
libstdc++: Add missing constexpr to simd
The constexpr API is only available with -std=gnu++XX (and proposed for
C++26). The proposal is to have the complete simd API usable in constant
expressions.
This patch resolves several issues with using simd in constant
expressions.
Issues why constant_evaluated branches are necessary:
* subscripting vector builtins is not allowed in constant expressions
* if the implementation needs/uses memcpy
* if the implementation would otherwise call SIMD intrinsics/builtins
PR libstdc++/109949
* include/experimental/bits/simd.h (__intrinsic_type): If
__ALTIVEC__ is defined, map gnu::vector_size types to their
corresponding __vector T types without losing unsignedness of
integer types. Also prefer long long over long.
* include/experimental/bits/simd_ppc.h (_S_popcount): Cast mask
object to the expected unsigned vector type.
PR libstdc++/109261
* include/experimental/bits/simd_neon.h (_S_reduce): Add
constexpr and make NEON implementation conditional on
not __builtin_is_constant_evaluated.
PR libstdc++/109261
* include/experimental/bits/simd.h (__intrinsic_type):
Specialize __intrinsic_type<double, 8> and
__intrinsic_type<double, 16> in any case, but provide the member
type only with __aarch64__.
Georg-Johann Lay [Tue, 23 May 2023 16:49:19 +0000 (18:49 +0200)]
Improve cost computation for single-bit bit insertions.
Some miscomputation of rtx_costs led to sub-optimal code for
single-bit bit insertions. This patch implements TARGET_INSN_COST,
which has a chance to see the whole insn during insn combination;
in particular the SET_DEST of (set (zero_extract (...) ...)).
gcc/
* config/avr/avr.cc (avr_insn_cost): New static function.
(TARGET_INSN_COST): Define to that function.
Eric Botcazou [Tue, 23 May 2023 08:15:35 +0000 (10:15 +0200)]
Fix handling of non-integral bit-fields in native_encode_initializer
The encoder for CONSTRUCTORs assumes that all bit-fields (DECL_BIT_FIELD)
have integral types, but that's not the case in Ada where they may have
pretty much any type, resulting in a wrong encoding for them.
gcc/
* fold-const.cc (native_encode_initializer) <CONSTRUCTOR>: Apply the
specific treatment for bit-fields only if they have an integral type
and filter out non-integral bit-fields that do not start and end on
a byte boundary.
gcc/testsuite/
* gnat.dg/opt101.adb: New test.
* gnat.dg/opt101_pkg.ads: New helper.
Jakub Jelinek [Sun, 21 May 2023 11:36:56 +0000 (13:36 +0200)]
match.pd: Ensure (op CONSTANT_CLASS_P CONSTANT_CLASS_P) is simplified [PR109505]
On the following testcase we hang, because POLY_INT_CST is CONSTANT_CLASS_P,
but BIT_AND_EXPR with it and INTEGER_CST doesn't simplify and the
(x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2)
simplification actually relies on the (CST1 & CST2) simplification,
otherwise it is a deoptimization, trading 2 ops for 3 and furthermore
running into
/* Given a bit-wise operation CODE applied to ARG0 and ARG1, see if both
operands are another bit-wise operation with a common input. If so,
distribute the bit operations to save an operation and possibly two if
constants are involved. For example, convert
(A | B) & (A | C) into A | (B & C)
Further simplification will occur if B and C are constants. */
simplification which simplifies that
(x & CST2) | (CST1 & CST2) back to
CST2 & (x | CST1).
I went through all other places I could find where we have a simplification
with 2 CONSTANT_CLASS_P operands and perform some operation on those two,
while the other spots aren't that severe (just trade 2 operations for
another 2 if the two constants don't simplify, rather than as in the above
case trading 2 ops for 3), I still think all those spots really intend
to optimize only if the 2 constants simplify.
So, the following patch adds to those a ! modifier to ensure that,
even at GENERIC that modifier means !EXPR_P which is exactly what we want
IMHO.
2023-05-21 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/109505
* match.pd ((x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2),
Combine successive equal operations with constants,
(A +- CST1) +- CST2 -> A + CST3, (CST1 - A) +- CST2 -> CST3 - A,
CST1 - (CST2 - A) -> CST3 + A): Use ! on ops with 2 CONSTANT_CLASS_P
operands.
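The identity behind the first rewrite can be checked on ordinary integers. A minimal sketch with hypothetical function names; it shows only that the two forms agree, and that the distributed form has three operations unless (CST1 & CST2) folds to a constant.

```cpp
#include <cstdint>

// Original form: two operations.
constexpr std::uint32_t or_then_and(std::uint32_t x, std::uint32_t cst1,
                                    std::uint32_t cst2) {
    return (x | cst1) & cst2;
}

// Distributed form: three operations, or two once (cst1 & cst2) is folded.
constexpr std::uint32_t distributed(std::uint32_t x, std::uint32_t cst1,
                                    std::uint32_t cst2) {
    return (x & cst2) | (cst1 & cst2);
}

static_assert(or_then_and(0xf0f0, 0x00ff, 0x0ff0) == 0x00f0, "");
static_assert(distributed(0xf0f0, 0x00ff, 0x0ff0) == 0x00f0, "");
```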
Kewen Lin [Wed, 17 May 2023 07:48:40 +0000 (02:48 -0500)]
vect: Don't retry if the previous analysis fails
When working on a cost tweaking patch, I found that a newly
added test case produces different dumps with stage-1 and
bootstrapped gcc. Looking into it, the apparent reason
is that vect_analyze_loop_2 does not set slp_done_for_suggested_uf
as expected, so the following retry uses the garbage value of
slp_done_for_suggested_uf instead. In fact,
slp_done_for_suggested_uf is only set when the previous
analysis succeeds. For the mentioned test case the previous
analysis fails, so the value of slp_done_for_suggested_uf
should not be used any more.
In function vect_analyze_loop_1, we only return success when
res is true, which is the result of 1st analysis. It means
we never try to vectorize with unroll_vinfo if the previous
analysis fails. So this patch shouldn't break anything, and
just stops some useless analysis early.
gcc/ChangeLog:
* tree-vect-loop.cc (vect_analyze_loop_1): Don't retry analysis with
suggested unroll factor once the previous analysis fails.
* config.host: Arrange to set min Darwin OS versions from
the configured host version.
* config/darwin10-unwind-find-enc-func.c: Do not use current
headers, but declare the necessary structures locally to the
versions in use for Mac OSX 10.6.
* config/t-darwin: Amend to handle configured min OS
versions.
* config/t-darwin-min-1: New.
* config/t-darwin-min-5: New.
* config/t-darwin-min-8: New.
Triffid Hunter [Sat, 20 May 2023 05:50:00 +0000 (07:50 +0200)]
target/105753: Fix ICE in add_clobbers due to extra PARALLEL in insn.
This patch removes the superfluous parallel in [u]divmod patterns in
the AVR backend. Effect of extra parallel is that add_clobbers reaches
gcc_unreachable() because the clobbers for [u]divmod are missing.
If an insn has multiple parts like clobbers, the parallel around the
parts of the insn pattern is implicit.
gcc/
PR target/105753
Backport from 2023-05-20 https://gcc.gnu.org/r14-1016
* config/avr/avr.md (divmodpsi, udivmodpsi, divmodsi, udivmodsi):
Remove superfluous "parallel" in insn pattern.
([u]divmod<mode>4): Tidy code. Use gcc_unreachable() instead of
printing error text to assembly.
gcc/testsuite/
PR target/105753
Backport from 2023-05-20 https://gcc.gnu.org/r14-1016
* gcc.target/avr/torture/pr105753.c: New test.
Andreas Schwab [Sat, 23 Apr 2022 13:48:42 +0000 (15:48 +0200)]
riscv/linux: Don't add -latomic with -pthread
Now that we have support for inline subword atomic operations, it is no
longer necessary to link against libatomic. This also fixes testsuite
failures because the framework does not properly set up the linker flags
for finding libatomic.
The use of atomic operations is also independent of the use of libpthread.
Patrick Palka [Tue, 16 May 2023 16:39:16 +0000 (12:39 -0400)]
c++: desig init in presence of list ctor [PR109871]
add_list_candidates has logic to reject designated initialization of a
non-aggregate type, but this is inadvertently being suppressed if the type
has a list constructor due to the order of case analysis, which in the
below testcase leads to us incorrectly treating the initializer list as if
it's non-designated. This patch fixes this by making us check for invalid
designated initialization sooner.
PR c++/109871
gcc/cp/ChangeLog:
* call.cc (add_list_candidates): Check for invalid designated
initialization sooner and even for types that have a list
constructor.
Harald Anlauf [Sun, 14 May 2023 19:53:51 +0000 (21:53 +0200)]
Fortran: CLASS pointer function result in variable definition context [PR109846]
gcc/fortran/ChangeLog:
PR fortran/109846
* expr.cc (gfc_check_vardef_context): Check appropriate pointer
attribute for CLASS vs. non-CLASS function result in variable
definition context.
gcc/testsuite/ChangeLog:
PR fortran/109846
* gfortran.dg/ptr-func-5.f90: New test.
Harald Anlauf [Fri, 5 May 2023 19:22:12 +0000 (21:22 +0200)]
Fortran: overloading of intrinsic binary operators [PR109641]
Fortran allows overloading of intrinsic operators also for operands of
numeric intrinsic types. The intrinsic operator versions are used
according to the rules of F2018 table 10.2 and imply type conversion as
long as the operand ranks are conformable. Otherwise no type conversion
shall be performed to allow the resolution of a matching user-defined
operator.
gcc/fortran/ChangeLog:
PR fortran/109641
* arith.cc (eval_intrinsic): Check conformability of ranks of operands
for intrinsic binary operators before performing type conversions.
* gfortran.h (gfc_op_rank_conformable): Add prototype.
* resolve.cc (resolve_operator): Check conformability of ranks of
operands for intrinsic binary operators before performing type
conversions.
(gfc_op_rank_conformable): New helper function to compare ranks of
operands of binary operator.
gcc/testsuite/ChangeLog:
PR fortran/109641
* gfortran.dg/overload_5.f90: New test.
arm testsuite: Shifts and get_FPSCR ACLE optimisation fixes
These newly updated tests were rewritten by Andrea. Some of them
needed further manual fixing as follows:
* The #shift immediate value was not in the check-function-bodies as expected
* The ACLE was specifying sub-optimal code: lsr+and instead of ubfx. In
this case the test rewritten from the ACLE had the lsr+and pattern,
but the compiler was able to optimise to ubfx. Hence I've changed the
test to now match on ubfx.
* Added a separate test to check shift on constants being optimised to
movs.
gcc/testsuite/ChangeLog:
* gcc.target/arm/mve/intrinsics/srshr.c: Update shift value.
* gcc.target/arm/mve/intrinsics/srshrl.c: Update shift value.
* gcc.target/arm/mve/intrinsics/uqshl.c: Update shift value.
* gcc.target/arm/mve/intrinsics/uqshll.c: Update shift value.
* gcc.target/arm/mve/intrinsics/urshr.c: Update shift value.
* gcc.target/arm/mve/intrinsics/urshrl.c: Update shift value.
* gcc.target/arm/mve/intrinsics/vadciq_m_s32.c: Update to ubfx.
* gcc.target/arm/mve/intrinsics/vadciq_m_u32.c: Update to ubfx.
* gcc.target/arm/mve/intrinsics/vadciq_s32.c: Update to ubfx.
* gcc.target/arm/mve/intrinsics/vadciq_u32.c: Update to ubfx.
* gcc.target/arm/mve/intrinsics/vadcq_m_s32.c: Update to ubfx.
* gcc.target/arm/mve/intrinsics/vadcq_m_u32.c: Update to ubfx.
* gcc.target/arm/mve/intrinsics/vadcq_s32.c: Update to ubfx.
* gcc.target/arm/mve/intrinsics/vadcq_u32.c: Update to ubfx.
* gcc.target/arm/mve/intrinsics/vsbciq_m_s32.c: Update to ubfx.
* gcc.target/arm/mve/intrinsics/vsbciq_m_u32.c: Update to ubfx.
* gcc.target/arm/mve/intrinsics/vsbciq_s32.c: Update to ubfx.
* gcc.target/arm/mve/intrinsics/vsbciq_u32.c: Update to ubfx.
* gcc.target/arm/mve/intrinsics/vsbcq_m_s32.c: Update to ubfx.
* gcc.target/arm/mve/intrinsics/vsbcq_m_u32.c: Update to ubfx.
* gcc.target/arm/mve/intrinsics/vsbcq_s32.c: Update to ubfx.
* gcc.target/arm/mve/intrinsics/vsbcq_u32.c: Update to ubfx.
* gcc.target/arm/mve/mve_const_shifts.c: New test.
arm testsuite: XFAIL or relax registers in some tests [PR109697]
This is a simple testsuite tidy-up patch, addressing two types of errors:
* The vcmp vector-scalar tests failing due to the compiler's preference
of vector-vector comparisons, over vector-scalar comparisons. This is
due to the lack of cost model for MVE and the compiler not knowing that
the RTL vec_duplicate is free in those instructions. For now, we simply
XFAIL these checks.
* The tests for pr108177 had strict usage of q0 and r0 registers,
meaning that they would FAIL with -mfloat-abi=softfp. The register checks
have now been relaxed. A couple of these run-tests also had inconsistent
use of integer MVE with floating point vectors, so I've now changed
these to use FP MVE.
Following Andrea's overhaul of the MVE testsuite, these tests are now
redundant, as equivalent checks have been added to each intrinsic's
<intrinsic name>.c test.
arm: Fix MVE header pointer overloads this time (and a bit more tidying)
Hi all,
Previously we had fixed the overloading of scalar arguments to intrinsics
with the introduction of a new `__ARM_mve_coerce3` `_Generic` association.
This allowed users to give types other than int32_t, e.g. int, short, long,
etc., which previously would emit a nonsensical error message from the
_Generic.
Here I adjust that handling slightly and I am also doing the same thing, but
for pointer types:
(un)signed char* can be now used instead of (u)int8_t*
(un)signed short* can be now used instead of (u)int16_t*
(un)signed int* and long* can be now used instead of (u)int32_t*
(un)signed long long* can be now used instead of (u)int64_t*
__fp16* and _Float16* can be now used instead of float16_t*
float* can be now used instead of float32_t*
This required me to break down the _coerce_ generics for the specific
pointer types.
On the scalar types, the change in this patch is minor, renaming the
_coerce_ generics and passing all scalars through the `__typeof` for
consistency with each-other.
No test regressions in the GCC testsuite or CMSIS-NN.
arm: Stop vadcq, vsbcq intrinsics from overwriting the FPSCR NZ flags
Hi all,
We noticed that calls to the vadcq and vsbcq intrinsics, both of
which use __builtin_arm_set_fpscr_nzcvqc to set the Carry flag in
the FPSCR, would produce the following code:
when the MVE ACLE instead gives a different instruction sequence of:
```
< Rt is the *carry input >
VMRS Rs,FPSCR_nzcvqc
BFI Rs,Rt,#29,#1
VMSR FPSCR_nzcvqc,Rs
```
The bic + orr pair is slower and it's also wrong, because, if the
*carry input is greater than 1, then we risk overwriting the top two
bits of the FPSCR register (the N and Z flags).
This turned out to be a problem in the header file and the solution was
to simply add a `& 0x1u` to the `*carry` input: then the compiler knows
that we only care about the lowest bit and can optimise to a BFI.
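The effect of masking the carry input can be sketched on plain integers. This is an illustration, not the arm_mve.h code: the function names are hypothetical, and bit 29 is taken from the BFI operand in the sequence above.

```cpp
#include <cstdint>

constexpr unsigned kCarryBit = 29;  // position used by BFI Rs,Rt,#29,#1

// Unmasked form: a carry value above 1 leaks into bits 30 and 31.
constexpr std::uint32_t set_carry_unmasked(std::uint32_t fpscr, std::uint32_t carry) {
    return (fpscr & ~(1u << kCarryBit)) | (carry << kCarryBit);
}

// Masked form: only the lowest bit of the carry input is inserted.
constexpr std::uint32_t set_carry_masked(std::uint32_t fpscr, std::uint32_t carry) {
    return (fpscr & ~(1u << kCarryBit)) | ((carry & 0x1u) << kCarryBit);
}

static_assert(set_carry_unmasked(0, 3) == 0x60000000u, "bit 30 is clobbered");
static_assert(set_carry_masked(0, 3) == 0x20000000u, "only bit 29 is written");
```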
Andrea Corallo [Wed, 19 Apr 2023 16:08:57 +0000 (18:08 +0200)]
arm: Mve backend + testsuite fixes 2
Hi all,
this patch improves a number of MVE tests in the testsuite for more
precise and better coverage using check-function-bodies instead of
scan-assembler checks. Also all instructions prescribed in the
ACLE[1] are now checked.
Also a number of simple fixes are done in the backend to fix
capitalization and spacing.
Andrea Corallo [Thu, 23 Mar 2023 14:36:37 +0000 (15:36 +0100)]
arm: Fix vstrwq* backend + testsuite
Hi all,
this patch fixes the vstrwq* MVE intrinsics failing to emit the
correct sequence of instructions due to a missing predicate. Also the
immediate range is fixed to be multiples of 2 within [-252, 252].
Best Regards
Andrea
gcc/ChangeLog:
* config/arm/constraints.md (mve_vldrd_immediate): Move it to
predicates.md.
(Ri): Move constraint definition from predicates.md.
(Rl): Define new constraint.
* config/arm/mve.md (mve_vstrwq_scatter_base_wb_p_<supf>v4si): Add
missing constraint.
(mve_vstrwq_scatter_base_wb_p_fv4sf): Add missing Up constraint
for op 1, use mve_vstrw_immediate predicate and Rl constraint for
op 2. Fix asm output spacing.
(mve_vstrdq_scatter_base_wb_p_<supf>v2di): Add missing constraint.
* config/arm/predicates.md (Ri): Move constraint to constraints.md.
(mve_vldrd_immediate): Move it from
constraints.md.
(mve_vstrw_immediate): New predicate.
Andrea Corallo [Tue, 28 Feb 2023 10:03:18 +0000 (11:03 +0100)]
arm: Mve testsuite improvements
Hello all,
this patch improves a number of MVE tests in the testsuite for more
precise and better coverage using check-function-bodies instead of
scan-assembler checks. Also all instructions prescribed in the ACLE[1]
are now checked.
Jakub Jelinek [Wed, 17 May 2023 19:21:23 +0000 (21:21 +0200)]
libstdc++: Fix up some <cmath> templates [PR109883]
As can be seen on the following testcase, for
std::{atan2,fmod,pow,copysign,fdim,fmax,fmin,hypot,nextafter,remainder,remquo,fma}
if one operand type is std::float{16,32,64,128}_t or std::bfloat16_t and
another one some integral type or some other floating point type which
promotes to the other operand's type, we can end up with endless recursion.
This is because of a declaration ordering problem in <cmath>, where the
float, double and long double overloads of those functions come before
the templates which use __gnu_cxx::__promote_{2,3}, but the
std::float{16,32,64,128}_t and std::bfloat16_t overloads come later in the
file. If the result of those promotions is _Float{16,32,64,128} or
__gnu_cxx::__bfloat16_t, say std::pow(_Float64, int) calls
std::pow(_Float64, _Float64) and the latter calls itself.
The following patch fixes that by moving those templates later in the file,
so that the calls from those templates see also the other overloads.
I think other templates in the file like e.g. isgreater etc. shouldn't be
a problem, because those just use __builtin_isgreater etc. in their bodies.
2023-05-17 Jakub Jelinek <jakub@redhat.com>
PR libstdc++/109883
* include/c_global/cmath (atan2, fmod, pow): Move
__gnu_cxx::__promote_2 using templates after _Float{16,32,64,128} and
__gnu_cxx::__bfloat16_t overloads.
(copysign, fdim, fmax, fmin, hypot, nextafter, remainder, remquo):
Likewise.
(fma): Move __gnu_cxx::__promote_3 using template after
_Float{16,32,64,128} and __gnu_cxx::__bfloat16_t overloads.
* testsuite/26_numerics/headers/cmath/constexpr_std_c++23.cc: New test.
Jakub Jelinek [Wed, 17 May 2023 18:59:54 +0000 (20:59 +0200)]
i386: Fix up types in __builtin_{inf,huge_val,nan{,s},fabs,copysign}q builtins [PR109884]
When _Float128 support has been added to C++ for 13.1, float128t_type_node
tree has been added - in C float128_type_node and float128t_type_node is
the same and represents both _Float128 and __float128, but in C++ they
are distinct types which have different handling in the FEs.
When doing that change, I mistakenly forgot to change FLOAT128 primitive
type, which is used for the __builtin_{inf,huge_val,nan{,s},fabs,copysign}q
builtins results and some of their arguments (and nothing else).
The following patch fixes that.
On ia64 we already use float128t_type_node for those builtins, pa while
it has __float128 that type is the same as long double and so those builtins
have long double types and on powerpc seems we don't have these builtins
but instead define macros which map them to __builtin_*f128. That will
not work properly in C++, perhaps we should change those macros to be
function-like and cast to __float128.
2023-05-17 Jakub Jelinek <jakub@redhat.com>
PR c++/109884
* config/i386/i386-builtin-types.def (FLOAT128): Use
float128t_type_node rather than float128_type_node.
Jakub Jelinek [Wed, 17 May 2023 08:15:50 +0000 (10:15 +0200)]
c++: Don't try to initialize zero width bitfields in zero initialization [PR109868]
My GCC 12 change to avoid removing zero-sized bitfields as they are
important for ABI and are needed for layout compatibility traits
apparently causes zero sized bitfields to be initialized in the IL,
which at least in 13+ results in ICEs in the ranger which is upset
about zero precision types.
I think we could even avoid initializing other unnamed bitfields, but
unfortunately !CONSTRUCTOR_NO_CLEARING doesn't mean in the middle-end
clearing of padding bits and until we have some new flag that represents
the request to clear padding bits, I think it is better to keep zeroing
non-zero sized unnamed bitfields.
In addition to skipping those fields, I have changed the logic how
UNION_TYPEs are handled, the current code was a little bit weird in that
e.g. if first non-static data member had error_mark_node type, we'd happily
zero initialize the second non-static data member, etc.
2023-05-17 Jakub Jelinek <jakub@redhat.com>
PR c++/109868
* init.cc (build_zero_init_1): Don't initialize zero-width bitfields.
For unions only initialize the first FIELD_DECL.
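For reference, a zero-width bit-field is an unnamed member that only affects layout. A minimal sketch of a type containing one, with value-initialization zeroing the named members; the type and function names are made up, and this is not claimed to reproduce the ICE from the PR.

```cpp
struct Padded {
    unsigned a : 3;
    unsigned : 0;  // unnamed zero-width bit-field: starts a new allocation unit
    unsigned b : 5;
};

// Value-initialization zero-initializes the object. There is nothing to
// store for the zero-width member itself.
inline Padded make_zeroed() { return Padded(); }
```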
Marek Polacek [Tue, 16 May 2023 18:12:06 +0000 (14:12 -0400)]
c++: -Wdangling-reference not suppressed in template [PR109774]
In check_return_expr, we suppress the -Wdangling-reference warning when
we're sure it would be a false positive. It wasn't working in a
template, though, because the suppress_warning call was never reached.
PR c++/109774
gcc/cp/ChangeLog:
* typeck.cc (check_return_expr): In a template, return only after
suppressing -Wdangling-reference.
Patrick O'Neill [Tue, 18 Apr 2023 21:33:13 +0000 (14:33 -0700)]
RISCV: Inline subword atomic ops
RISC-V has no support for subword atomic operations; code currently
generates libatomic library calls.
This patch changes the default behavior to inline subword atomic calls
(using the same logic as the existing library call).
Behavior can be specified using the -minline-atomics and
-mno-inline-atomics command line flags.
gcc/libgcc/config/riscv/atomic.c has the same logic implemented in asm.
This will need to stay for backwards compatibility and the
-mno-inline-atomics flag.
2023-05-03 Patrick O'Neill <patrick@rivosinc.com>
gcc/ChangeLog:
PR target/104338
* config/riscv/riscv-protos.h: Add helper function stubs.
* config/riscv/riscv.cc: Add helper functions for subword masking.
* config/riscv/riscv.opt: Add command-line flags
-minline-atomics and -mno-inline-atomics.
* config/riscv/sync.md: Add masking logic and inline asm for
fetch_and_op, fetch_and_nand, CAS, and exchange ops.
* doc/invoke.texi: Add blurb regarding new command-line flags
-minline-atomics and -mno-inline-atomics.
libgcc/ChangeLog:
PR target/104338
* config/riscv/atomic.c: Add reference to duplicate logic.
gcc/testsuite/ChangeLog:
PR target/104338
* gcc.target/riscv/inline-atomics-1.c: New test.
* gcc.target/riscv/inline-atomics-2.c: New test.
* gcc.target/riscv/inline-atomics-3.c: New test.
* gcc.target/riscv/inline-atomics-4.c: New test.
* gcc.target/riscv/inline-atomics-5.c: New test.
* gcc.target/riscv/inline-atomics-6.c: New test.
* gcc.target/riscv/inline-atomics-7.c: New test.
* gcc.target/riscv/inline-atomics-8.c: New test.
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
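The subword masking idea can be sketched in portable C++ as below. This is a minimal sketch, assuming a 32-bit containing word and a word-sized compare-and-swap; it is not the sequence GCC emits from sync.md, and the helper name fetch_add_u8_lane and its lane numbering are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical helper: atomically add `value` to the 8-bit lane `lane`
// (0 = least significant byte of the word's value) using only a 32-bit
// compare-and-swap. Returns the previous value of that lane.
std::uint8_t fetch_add_u8_lane(std::atomic<std::uint32_t>& word,
                               unsigned lane, std::uint8_t value)
{
    assert(lane < 4);
    const unsigned shift = lane * 8;
    const std::uint32_t mask = std::uint32_t{0xff} << shift;

    std::uint32_t old_word = word.load(std::memory_order_relaxed);
    std::uint32_t new_word;
    do {
        // Extract the lane, apply the operation, and wrap to 8 bits.
        const std::uint32_t old_lane = (old_word & mask) >> shift;
        const std::uint32_t new_lane = (old_lane + value) & 0xff;
        // Splice the updated lane back in, leaving the other bytes untouched.
        new_word = (old_word & ~mask) | (new_lane << shift);
        // On failure, old_word is refreshed with the current contents.
    } while (!word.compare_exchange_weak(old_word, new_word,
                                         std::memory_order_seq_cst,
                                         std::memory_order_relaxed));
    return static_cast<std::uint8_t>((old_word & mask) >> shift);
}
```

The retry loop plays the role that an LR/SC loop would play in an inline sequence: neighbouring bytes in the same word are preserved because they are copied from the freshly observed word on every attempt.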
Tobias Burnus [Fri, 12 May 2023 14:27:40 +0000 (16:27 +0200)]
LTO: Fix writing of toplevel asm with offloading [PR109816]
When offloading was enabled, top-level 'asm' statements were added to the
offloading section, confusing assemblers that did not support the syntax.
Additionally,
with offloading and -flto, the top-level assembler code did not end up
in the host files.
As r14-321-g9a41d2cdbcd added top-level 'asm' to one libstdc++ header file,
the issue became more apparent, causing fails with nvptx for some
C++ testcases.
PR libstdc++/109816
gcc/ChangeLog:
* lto-cgraph.cc (output_symtab): Guard lto_output_toplevel_asms by
'!lto_stream_offload_p'.
libgomp/ChangeLog:
* testsuite/libgomp.c++/target-map-class-1.C: New test.
* testsuite/libgomp.c++/target-map-class-2.C: New test.
Sören Tempel [Sun, 14 May 2023 17:30:21 +0000 (19:30 +0200)]
fix assert in __deregister_frame_info_bases
The assertion in __deregister_frame_info_bases assumes that for every
frame something was inserted into the lookup data structure by
__register_frame_info_bases. Unfortunately, this does not necessarily
hold true as the btree_insert call in __register_frame_info_bases will
not insert anything for empty ranges. Therefore, we need to explicitly
account for such empty ranges in the assertion as `ob` will be a null
pointer for such ranges, hence causing the assertion to fail.
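The shape of the problem can be sketched with a toy registry. This is an illustration under stated assumptions, not the libgcc code: the real implementation is C and uses a b-tree, while FrameRegistry, Object, register_object and deregister are hypothetical names, and std::map stands in for the lookup structure.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical stand-in for a registered unwind object.
struct Object {
    std::uintptr_t base;
    std::size_t    size;
};

class FrameRegistry {
    std::map<std::uintptr_t, Object*> by_base_;

public:
    // Mirrors the described behavior: nothing is inserted for empty ranges.
    void register_object(Object* ob) {
        if (ob->size == 0)
            return;
        by_base_[ob->base] = ob;
    }

    // Removes and returns the object registered at `base`, or nullptr.
    Object* deregister(std::uintptr_t base, std::size_t size) {
        Object* ob = nullptr;
        auto it = by_base_.find(base);
        if (it != by_base_.end()) {
            ob = it->second;
            by_base_.erase(it);
        }
        // A missing entry is only a bug when the range was non-empty.
        assert(ob != nullptr || size == 0);
        return ob;
    }
};
```

An assertion that required `ob != nullptr` unconditionally would fire for the empty range even though registration and deregistration were correctly paired.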
Patrick Palka [Fri, 12 May 2023 12:37:54 +0000 (08:37 -0400)]
c++: remove redundant testcase [PR83258]
I noticed only after the fact that the new testcase template/function2.C
(from r14-708-gc3afdb8ba8f183) is just a subset of ext/visibility/anon8.C,
so let's get rid of it.
Patrick Palka [Thu, 11 May 2023 20:31:33 +0000 (16:31 -0400)]
c++: 'mutable' subobject of constexpr variable [PR109745]
r13-2701-g7107ea6fb933f1 made us correctly accept during constexpr
evaluation 'mutable' member accesses of objects constructed during
that evaluation, while continuing to reject such accesses for constexpr
objects constructed outside of that evaluation, by considering the
CONSTRUCTOR_MUTABLE_POISON flag during cxx_eval_component_reference.
However, this flag is set only for the outermost CONSTRUCTOR of a
constexpr variable initializer, so if we're accessing a 'mutable' member
of a nested CONSTRUCTOR, the flag won't be set and we won't reject the
access. This can lead to us accepting invalid code, as in the first
testcase, or even wrong code generation due to our speculative constexpr
evaluation, as in the second and third testcase.
This patch fixes this by setting CONSTRUCTOR_MUTABLE_POISON recursively
rather than only on the outermost CONSTRUCTOR.
PR c++/109745
gcc/cp/ChangeLog:
* typeck2.cc (poison_mutable_constructors): Define.
(store_init_value): Use it instead of setting
CONSTRUCTOR_MUTABLE_POISON directly.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/constexpr-mutable4.C: New test.
* g++.dg/cpp0x/constexpr-mutable5.C: New test.
* g++.dg/cpp1y/constexpr-mutable2.C: New test.
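A small illustration of the nested case, written for this note rather than taken from the new testcases (Inner, Outer, config and bump_counter are hypothetical names): the 'mutable' member lives in a nested CONSTRUCTOR of the constexpr variable's initializer, so it may change at run time and must not be read during constant evaluation.

```cpp
#include <cassert>

struct Inner { mutable int counter; };
struct Outer { Inner inner; };

// The initializer has an outer CONSTRUCTOR for Outer and a nested one for
// Inner; the mutable member belongs to the nested one.
constexpr Outer config = {{1}};

// Ill-formed: a constant expression may not read a mutable subobject of an
// object created outside that evaluation.
// constexpr int snapshot = config.inner.counter;

// Well-formed: a mutable member may be modified at run time even though
// `config` itself is const.
int bump_counter() {
    return ++config.inner.counter;
}
```

Because the value can change at run time, folding a read of config.inner.counter to the initializer value 1 would be wrong code, which is why the access has to be rejected during constexpr evaluation.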
Patrick Palka [Thu, 11 May 2023 14:04:25 +0000 (10:04 -0400)]
c++: converted lambda as template argument [PR83258, ...]
r8-1253-g3d2e25a240c711 removed the template argument linkage requirement
in convert_nontype_argument for C++17 (which r9-3836-g4be5c72cf3ea3e later
factored out into invalid_tparm_referent_p), but we need to also remove
the one in convert_nontype_argument_function for the benefit of the first
and third testcase which we currently reject even in C++17/20 mode.
And in invalid_tparm_referent_p we're inadvertently returning false for
the address of a lambda's static op() since it's DECL_ARTIFICIAL, which
currently causes us to reject the second (C++20) testcase. But this
DECL_ARTIFICIAL check seems to be relevant only for VAR_DECL, and in fact
this code path was originally reachable only for VAR_DECL until recently
(r13-6970-gb5e38b1c166357). So this patch restricts the check to VAR_DECL.
Co-authored-by: Jonathan Wakely <jwakely@redhat.com>
PR c++/83258
PR c++/80488
PR c++/97700
gcc/cp/ChangeLog:
* pt.cc (convert_nontype_argument_function): Remove linkage
requirement for C++17 and later.
(invalid_tparm_referent_p) <case ADDR_EXPR>: Restrict
DECL_ARTIFICIAL rejection test to VAR_DECL.
gcc/testsuite/ChangeLog:
* g++.dg/ext/visibility/anon8.C: Don't expect a "no linkage"
error for the template argument &B2:fn in C++17 mode.
* g++.dg/cpp0x/lambda/lambda-conv15.C: New test.
* g++.dg/cpp2a/nontype-class56.C: New test.
* g++.dg/template/function2.C: New test.
Patrick Palka [Tue, 9 May 2023 19:06:34 +0000 (15:06 -0400)]
c++: noexcept-spec from nested class confusion [PR109761]
When late processing a noexcept-spec from a nested class after completion
of the outer class (since it's a complete-class context), we pass the wrong
class context to noexcept_override_late_checks -- the outer class type
instead of the nested class type -- which leads to bogus errors in the
below test.
This patch fixes this by making noexcept_override_late_checks obtain the
class context directly via DECL_CONTEXT instead of via an additional
parameter.
PR c++/109761
gcc/cp/ChangeLog:
* parser.cc (cp_parser_class_specifier): Don't pass a class
context to noexcept_override_late_checks.
(noexcept_override_late_checks): Remove 'type' parameter
and use DECL_CONTEXT of 'fndecl' instead.
Patrick Palka [Sun, 7 May 2023 15:54:21 +0000 (11:54 -0400)]
c++: bound ttp in lambda function type [PR109651]
After r14-11-g2245459c85a3f4 we now coerce the template arguments of a
bound ttp again after level-lowering it. Notably a level-lowered ttp
doesn't have DECL_CONTEXT set, so during this coercion we fall back to
using current_template_parms to obtain the relevant set of in-scope
parameters.
But it turns out current_template_parms isn't properly set when
substituting the function type of a generic lambda, and so if the type
contains bound ttps that need to be lowered we'll crash during their
attempted coercion. Specifically in the first testcase below,
current_template_parms during the lambda type substitution (with T=int)
is "1 U" instead of the expected "2 TT, 1 U", and we crash when level
lowering TT<int>.
Ultimately the problem is that tsubst_lambda_expr does things in the
wrong order: we ought to substitute (and install) the in-scope template
parameters _before_ substituting anything that may use those template
parameters (such as the function type of a generic lambda). This patch
corrects this substitution order.
PR c++/109651
gcc/cp/ChangeLog:
* pt.cc (coerce_template_args_for_ttp): Mention we can hit the
current_template_parms fallback when level-lowering a bound ttp.
(tsubst_template_decl): Add lambda_tparms parameter. Prefer to
use lambda_tparms instead of substituting DECL_TEMPLATE_PARMS.
(tsubst_decl) <case TEMPLATE_DECL>: Pass NULL_TREE as lambda_tparms
to tsubst_template_decl.
(tsubst_lambda_expr): For a generic lambda, substitute
DECL_TEMPLATE_PARMS and set current_template_parms to it
before substituting the function type. Pass the substituted
DECL_TEMPLATE_PARMS as lambda_tparms to tsubst_template_decl.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/lambda-generic-ttp1.C: New test.
* g++.dg/cpp2a/lambda-generic-ttp2.C: New test.
Jonathan Wakely [Wed, 10 May 2023 20:30:10 +0000 (21:30 +0100)]
libstdc++: Backport std::basic_string::_S_allocate from trunk
This is a backport of r14-739-gc62e945492afbb to keep the exported
symbol list consistent between trunk and gcc-13. The new assertions from
that commit are not part of this backport.
libstdc++-v3/ChangeLog:
* config/abi/pre/gnu.ver: Export basic_string::_S_allocate.
* include/bits/basic_string.h (basic_string::_Alloc_traits_impl):
Remove class template.
(basic_string::_S_allocate): New static member function.
(basic_string::assign): Use _S_allocate.
* include/bits/basic_string.tcc (basic_string::_M_create)
(basic_string::reserve, basic_string::_M_replace): Likewise.
The trunk patch for this PR corrected the ABI for enums that have
a defined underlying type. We shouldn't change the ABI on the branches
though, so this patch just removes the assertions that highlighted
the problem.
I think the same approach makes sense longer-term: keep the assertions
at maximum strength in trunk, and in any new branches that get cut.
Then, if the assertions trip an ABI problem, fix the problem in trunk
and remove the assertions from active branches.
The tests are the same as for the trunk version, but with all -Wpsabi
message and expected output checks removed.