Jakub Jelinek [Sun, 21 May 2023 11:36:56 +0000 (13:36 +0200)]
match.pd: Ensure (op CONSTANT_CLASS_P CONSTANT_CLASS_P) is simplified [PR109505]
On the following testcase we hang, because POLY_INT_CST is CONSTANT_CLASS_P,
but BIT_AND_EXPR with it and INTEGER_CST doesn't simplify and the
(x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2)
simplification actually relies on the (CST1 & CST2) simplification,
otherwise it is a deoptimization, trading 2 ops for 3 and furthermore
running into
/* Given a bit-wise operation CODE applied to ARG0 and ARG1, see if both
operands are another bit-wise operation with a common input. If so,
distribute the bit operations to save an operation and possibly two if
constants are involved. For example, convert
(A | B) & (A | C) into A | (B & C)
Further simplification will occur if B and C are constants. */
simplification which simplifies that
(x & CST2) | (CST1 & CST2) back to
CST2 & (x | CST1).
I went through all other places I could find where we have a simplification
with 2 CONSTANT_CLASS_P operands and perform some operation on those two,
while the other spots aren't that severe (just trade 2 operations for
another 2 if the two constants don't simplify, rather than as in the above
case trading 2 ops for 3), I still think all those spots really intend
to optimize only if the 2 constants simplify.
So, the following patch adds to those a ! modifier to ensure that,
even at GENERIC that modifier means !EXPR_P which is exactly what we want
IMHO.
2023-05-21 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/109505
* match.pd ((x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2),
Combine successive equal operations with constants,
(A +- CST1) +- CST2 -> A + CST3, (CST1 - A) +- CST2 -> CST3 - A,
CST1 - (CST2 - A) -> CST3 + A): Use ! on ops with 2 CONSTANT_CLASS_P
operands.
PR libstdc++/109822
* include/experimental/bits/simd.h (to_native): Use int NTTP
as specified in PTS2.
(to_compatible): Likewise. Add missing tag to call mask
generator ctor.
* testsuite/experimental/simd/pr109822_cast_functions.cc: New
test.
* testsuite/experimental/simd/tests/operator_cvt.cc: Make long
double <-> (u)long conversion tests conditional on sizeof(long
double) and sizeof(long).
Matthias Kretz [Thu, 23 Mar 2023 08:32:58 +0000 (09:32 +0100)]
libstdc++: Add missing constexpr to simd
The constexpr API is only available with -std=gnu++XX (and proposed for
C++26). The proposal is to have the complete simd API usable in constant
expressions.
This patch resolves several issues with using simd in constant
expressions.
Issues why constant_evaluated branches are necessary:
* subscripting vector builtins is not allowed in constant expressions
* if the implementation needs/uses memcpy
* if the implementation would otherwise call SIMD intrinsics/builtins
PR libstdc++/109949
* include/experimental/bits/simd.h (__intrinsic_type): If
__ALTIVEC__ is defined, map gnu::vector_size types to their
corresponding __vector T types without losing unsignedness of
integer types. Also prefer long long over long.
* include/experimental/bits/simd_ppc.h (_S_popcount): Cast mask
object to the expected unsigned vector type.
PR libstdc++/109261
* include/experimental/bits/simd.h (__intrinsic_type):
Specialize __intrinsic_type<double, 8> and
__intrinsic_type<double, 16> in any case, but provide the member
type only with __aarch64__.
PR libstdc++/109261
* include/experimental/bits/simd_neon.h (_S_reduce): Add
constexpr and make NEON implementation conditional on
not __builtin_is_constant_evaluated.
Matthias Kretz [Thu, 23 Feb 2023 13:55:08 +0000 (14:55 +0100)]
libstdc++: Fix simd compilation with Clang
Clang fails to compile some constant expressions involving simd.
Therefore, just disable this non-conforming extension for clang.
Fix AVX512 blend implementation for Clang. It was converting the bitmask
to bool before, which is obviously wrong. Instead use a Clang builtin to
convert the bitmask to vector-mask before using a vector blend ?:. A
similar change is required for the masked unary implementation, because
the GCC builtins do not exist on Clang.
* include/experimental/bits/simd_detail.h: Don't declare the
simd API as constexpr with Clang.
* include/experimental/bits/simd_x86.h (__movm): New.
(_S_blend_avx512): Resolve FIXME. Implement blend using __movm
and ?:.
(_SimdImplX86::_S_masked_unary): Clang does not implement the
same builtins. Implement the function using __movm, ?:, and -
operators on vector_size types instead.
Matthias Kretz [Mon, 20 Feb 2023 16:49:37 +0000 (17:49 +0100)]
libstdc++: Always-inline most of non-cmath fixed_size implementation
For simd, the inlining behavior should be similar to builtin types. (No
operator on buitin types is ever translated into a function call.)
Therefore, always_inline is the right choice (i.e. inline on -O0 as
well).
PR libstdc++/108856
* include/experimental/bits/simd_builtin.h
(_SimdImplBuiltin::_S_masked_unary): More efficient
implementation of masked inc-/decrement for integers and floats
without AVX2.
* include/experimental/bits/simd_x86.h
(_SimdImplX86::_S_masked_unary): New. Use AVX512 masked subtract
builtins for masked inc-/decrement.
Matthias Kretz [Sat, 14 Jan 2023 16:07:59 +0000 (17:07 +0100)]
libstdc++: Annotate most lambdas with always_inline
All of the annotated lambdas are simply a necessary means for
implementing these functions and should never result in an actual
function call. Many of these lambdas would go away if C++ had better
language support for packs.
Michael Meissner [Mon, 22 May 2023 15:17:01 +0000 (11:17 -0400)]
Do not generate vmaddfp and vnmsubfp
This is version 3 of the patch. This is essentially version 1 with the removal
of changes to altivec.md, and cleanup of the comments.
Version 2 generated the vmaddfp and vnmsubfp instructions if -Ofast was used,
and those changes are deleted in this patch.
The Altivec instructions vmaddfp and vnmsubfp have different rounding behaviors
than the VSX xvmaddsp and xvnmsubsp instructions. In particular, generating
these instructions seems to break Eigen on big endian systems.
I have done bootstrap builds on power9 little endian (with both IEEE long
double and IBM long double). I have also done the builds and test on a power8
big endian system (testing both 32-bit and 64-bit code generation). Chip has
verified that it fixes the problem that Eigen encountered. Can I check this
into the master GCC branch? After a burn-in period, can I check this patch
into the active GCC branches?
Thanks in advance.
2023-04-07 Michael Meissner <meissner@linux.ibm.com>
gcc/
PR target/70243
* config/rs6000/vsx.md (vsx_fmav4sf4): Do not generate vmaddfp. Back
port from master 04/10/2023.
(vsx_nfmsv4sf4): Do not generate vnmsubfp.
gcc/testsuite/
PR target/70243
* gcc.target/powerpc/pr70243.c: New test. Back port from master
04/10/2023.
Patrick Palka [Tue, 14 Mar 2023 20:44:32 +0000 (16:44 -0400)]
libstdc++: Implement P2520R0 changes to move_iterator's iterator_concept
libstdc++-v3/ChangeLog:
* include/bits/stl_iterator.h (move_iterator::_S_iter_concept):
Define.
(__cpp_lib_move_iterator_concept): Define for C++20.
(move_iterator::iterator_concept): Strengthen as per P2520R0.
* include/std/version (__cpp_lib_move_iterator_concept): Define
for C++20.
* testsuite/24_iterators/move_iterator/p2520r0.cc: New test.
Patrick Palka [Fri, 3 Mar 2023 16:37:02 +0000 (11:37 -0500)]
c++: thinko in extract_local_specs [PR108998]
In order to fix PR100295, r13-4730-g18499b9f848707 attempted to make
extract_local_specs walk the given pattern twice, ignoring unevaluated
operands the first time around so that we prefer to process a local
specialization in an evaluated context if it appears in one (we process
each local specialization once even if it appears multiple times in the
pattern).
But there's a thinko in the patch, namely that we don't actually walk
the pattern twice since we don't clear the visited set for the second
walk (to avoid processing a local specialization twice) and so the root
node (and any node leading up to an unevaluated operand) is considered
visited already. So the patch effectively made extract_local_specs
ignore unevaluated operands altogether, which this testcase demonstrates
isn't quite safe (extract_local_specs never sees 'aa' and we don't record
its local specialization, so later we try to specialize 'aa' on the spot
with the args {{int},{17}} which causes us to nonsensically substitute
its auto with 17.)
This patch fixes this by refining the second walk to start from the
trees we skipped over during the first walk.
PR c++/108998
gcc/cp/ChangeLog:
* pt.c (el_data::skipped_trees): New data member.
(extract_locals_r): Push to skipped_trees any unevaluated
contexts that we skipped over.
(extract_local_specs): For the second walk, start from each
tree in skipped_trees.
Patrick Palka [Thu, 15 Dec 2022 21:02:05 +0000 (16:02 -0500)]
c++: extract_local_specs and unevaluated contexts [PR100295]
Here during partial instantiation of the constexpr if, extra_local_specs
walks the statement looking for local specializations within to capture.
However, we're thwarted by the fact that 'ts' first appears inside an
unevaluated context, and so the calls to process_outer_var_ref for its
local specializations are a no-op. And since we walk each tree exactly
once, we end up not capturing the local specializations despite 'ts'
later occurring in an evaluated context.
This patch fixes this by making extract_local_specs walk evaluated
contexts first before walking unevaluated contexts. We could probably
get away with not walking unevaluated contexts at all, but this approach
seems more clearly safe.
PR c++/100295
PR c++/107579
gcc/cp/ChangeLog:
* pt.c (el_data::skip_unevaluated_operands): New data member.
(extract_locals_r): If skip_unevaluated_operands is true,
don't walk into unevaluated contexts.
(extract_local_specs): Walk the pattern twice, first with
skip_unevaluated_operands true followed by it set to false.
Patrick Palka [Tue, 29 Nov 2022 14:55:21 +0000 (09:55 -0500)]
c++: explicit specialization and trailing requirements [PR107864]
Here we're crashing when using the explicit specialization of the
function template g with trailing requirements ultimately because
earlier decls_match (called indirectly from register_specialization) for
for the explicit specialization returned false since the template has
trailing requirements whereas the specialization doesn't.
In r12-2230-gddd25bd1a7c8f4, we fixed a similar issue concerning template
requirements instead of trailing requirements. We could extend that fix
to ignore trailing requirement mismatches for explicit specializations
as well, but it seems cleaner to just propagate constraints from the
specialized template to the specialization when declaring an explicit
specialization so that decls_match will naturally return true in this
case. And it looks like determine_specialization already does this,
albeit inconsistently (only when specializing a non-template member
function of a class template as in cpp2a/concepts-explicit-spec4.C).
So this patch makes determine_specialization consistently propagate
constraints from the specialized template to the specialization, which
in turn lets us get rid of the function_requirements_equivalent_p special
case added by r12-2230.
PR c++/107864
gcc/cp/ChangeLog:
* decl.c (function_requirements_equivalent_p): Don't check
DECL_TEMPLATE_SPECIALIZATION.
* pt.c (determine_specialization): Propagate constraints when
specializing a function template too. Simplify by using
add_outermost_template_args.
Patrick Palka [Thu, 3 Nov 2022 19:35:18 +0000 (15:35 -0400)]
c++: requires-expr and access checking [PR107179]
Like during satisfaction, we also need to avoid deferring access checks
during substitution of a requires-expr because the outcome of an access
check can determine the value of the requires-expr. Otherwise (in
deferred access checking contexts such as within a base-clause), the
requires-expr may evaluate to the wrong result, and along the way a
failed access check may leak out from it into a non-SFINAE context and
cause a hard error (as in the below testcase).
PR c++/107179
gcc/cp/ChangeLog:
* constraint.cc (tsubst_requires_expr): Make sure we're not
deferring access checks.
Patrick Palka [Wed, 30 Mar 2022 14:13:11 +0000 (10:13 -0400)]
c++: ICE with failed __is_constructible constraint [PR100474]
Here we're crashing when diagnosing an unsatisfied __is_constructible
constraint because diagnose_trait_expr doesn't recognize this trait
(along with a bunch of other traits). Fix this by adding handling for
all remaining traits and removing the default case so that when adding a
new trait we'll get a warning that diagnose_trait_expr needs to handle it.
Patrick Palka [Sat, 12 Mar 2022 20:00:40 +0000 (15:00 -0500)]
c++: return-type-req in constraint using only outer tparms [PR104527]
Here the template context for the atomic constraint has two levels of
template parameters, but since it depends only on the innermost parameter
T we use a single-level argument vector (built by get_mapped_args) during
substitution into the atom. We eventually pass this vector to
do_auto_deduction as part of checking the return-type-requirement within
the atom, but do_auto_deduction expects outer_targs to be a full set of
arguments for sake of satisfaction.
This patch fixes this by making get_mapped_args always return an
argument vector whose depth corresponds to the template depth of the
context in which the atomic constraint expression was written, instead
of the highest parameter level that the expression happens to use.
PR c++/104527
gcc/cp/ChangeLog:
* constraint.cc (normalize_atom): Set
ATOMIC_CONSTR_EXPR_FROM_CONCEPT_P appropriately.
(get_mapped_args): Make static, adjust parameters. Always
return a vector whose depth corresponds to the template depth of
the context of the atomic constraint expression. Micro-optimize
by passing false as exact to safe_grow_cleared and by collapsing
a multi-level depth-one argument vector.
(satisfy_atom): Adjust call to get_mapped_args and
diagnose_atomic_constraint.
(diagnose_atomic_constraint): Replace map parameter with an args
parameter.
* cp-tree.h (ATOMIC_CONSTR_EXPR_FROM_CONCEPT_P): Define.
(get_mapped_args): Remove declaration.
Patrick Palka [Fri, 28 Jan 2022 20:41:15 +0000 (15:41 -0500)]
c++: bogus warning with value init of const pmf [PR92752]
Here we're emitting a -Wignored-qualifiers warning for an intermediate
compiler-generated cast of nullptr to 'method-type* const' as part of
value initialization of a const pmf. This patch suppresses the warning
by instead casting to the corresponding unqualified type.
PR c++/92752
gcc/cp/ChangeLog:
* typeck.c (build_ptrmemfunc): Cast a nullptr constant to the
unqualified pointer type not the qualified one.
* config.host: Arrange to set min Darwin OS versions from
the configured host version.
* config/darwin10-unwind-find-enc-func.c: Do not use current
headers, but declare the nexessary structures locally to the
versions in use for Mac OSX 10.6.
* config/t-darwin: Amend to handle configured min OS
versions.
* config/t-darwin-min-1: New.
* config/t-darwin-min-5: New.
* config/t-darwin-min-8: New.
Harald Anlauf [Sun, 14 May 2023 19:53:51 +0000 (21:53 +0200)]
Fortran: CLASS pointer function result in variable definition context [PR109846]
gcc/fortran/ChangeLog:
PR fortran/109846
* expr.c (gfc_check_vardef_context): Check appropriate pointer
attribute for CLASS vs. non-CLASS function result in variable
definition context.
gcc/testsuite/ChangeLog:
PR fortran/109846
* gfortran.dg/ptr-func-5.f90: New test.
c++tools, configury: Configure with C++; test checking status [PR98821].
The c++tools configure fragments need to be built with a C++ compiler.
In addition, the stand-alone server uses diagnostic mechanisms in common
with GCC, but needs to define implementations for gcc_assert and
supporting output functions.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
PR c++/98821 - modules : c++tools configures with CC but code fragments assume CXX.
PR c++/98821
c++tools/ChangeLog:
* config.h.in: Regenerate.
* configure: Regenerate.
* configure.ac: Configure using C++. Pull logic to
detect enabled checking modes; default to release
checking.
* server.cc (AI_NUMERICSERV): Define a fallback value.
(gcc_assert): New.
(gcc_unreachable): New.
(fancy_abort): Only build when checking is enabled.
c++, coroutines: Fix block nests when the function has no top-level bind.
When the function contains no local vars and also no nested scopes, there
is no top-level bind expression. Because the rewritten coroutine body will
require both local vars and contain nested scopes, we add a bind expression
to such functions. When this was done the necessary scope blocks were
omitted which leads to disconnected function content.
Fixed by adding a new block to the added bind expression.
Iain Sandoe [Thu, 30 Mar 2023 07:44:23 +0000 (13:14 +0530)]
c++,coroutines: Stabilize names of promoted slot vars [PR101118].
When we need to 'promote' a value (i.e. store it in the coroutine frame) it
is given a frame entry name. This was based on the DECL_UID for slot vars.
However, when LTO is used, the names from multiple TUs become visible at the
same time, and the DECL_UIDs usually differ between units. This leads to a
"ODR mismatch" warning for the frame type.
The fix here is to use the current promoted temporaries count to produce
the name, this is stable between TUs and computed per coroutine.
libsanitizer, darwin: Unsupport Darwin >= 22 for now.
The mechanism for location dyld has altered from Darwin22 since dyld is now
in the shared cache. The implemented mechanism for walking the cache uses
Apple Blocks which GCC does not yet support, and the fallback to the original
mechanism does not work there.
Until a suitable work-around can be found, unsupport Darwin22+.
The SDK for MacOS13 includes Apple-specific deprecations of some functions that
are not deprecated in Posix, C or C++ and widely used in GCC.
The fix makes the deprecation conditional on __APPLE_LOCAL_DEPRECATIONS so that
end users may still observe them but they are hidden from normal compilations.
* fixincl.x: Regenerate.
* inclhack.def: Add a fix for MacOS13 SDK function deprecations
in stdio.h.
* tests/base/stdio.h (__deprecated_msg): New test.
Iain Sandoe [Thu, 22 Dec 2022 17:32:06 +0000 (17:32 +0000)]
libgcc, Darwin: No early install for the compatibility libgcc_s.1.dylib.
On Darwin, GCC now uses a libgcc_s.1.1 for builtins and forwards the system
unwinder. We do, however, build a backwards compatibility libgcc_s.1.dylib.
However, this is not needed by GCC and can cause incorrect operation when
DYLD_LIBRARY_PATH is in use.
Since we do not need or use it during the build, the solution is to skip the
installation into the $build/gcc directory.
Simon Wright [Sun, 12 Jun 2022 16:01:22 +0000 (17:01 +0100)]
Darwin: Truncate kernel-provided version to OS major for Darwin >= 20.
In common with system tools, GCC uses a version obtained from the kernel as
the prevailing macOS target, when that is not overridden by command line or
environment versions (i.e. mmacosx-version-min=, MACOSX_DEPLOYMENT_TARGET).
Presently, GCC assumes that if the OS version is >= 20, the value used should
include both major and minium version identifiers. However the system tools
(for those versions) truncate the value to the major version - this leads to
link errors when combining objects built with clang and GCC for example:
ld: warning: object file (null.o) was built for newer macOS version (12.2)
than being linked (12.0)
The change here truncates the values GCC uses to the major version.
gcc/ChangeLog:
PR target/104871
* config/darwin-driver.c (darwin_find_version_from_kernel): If the OS
version is darwin20 (macOS 11) or greater, truncate the version to the
major number.
Mark Mentovai [Fri, 10 Jun 2022 14:56:42 +0000 (15:56 +0100)]
Darwin: Future-proof -mmacosx-version-min
f18cbc1ee1f4 (2021-12-18) updated various parts of gcc to not impose a
Darwin or macOS version maximum of the current known release. Different
parts of gcc accept, variously, Darwin version numbers matching
darwin2*, and macOS major version numbers up to 99. The current released
version is Darwin 21 and macOS 12, with Darwin 22 and macOS 13 expected
for public release later this year. With one major OS release per year,
this strategy is expected to provide another 8 years of headroom.
However, f18cbc1ee1f4 missed config/darwin-c.c (now .cc), which
continued to impose a maximum of macOS 12 on the -mmacosx-version-min
compiler driver argument. This was last updated from 11 to 12 in 11b967577483 (2021-10-27), but kicking the can down the road one year at
a time is not a viable strategy, and is not in line with the more recent
technique from f18cbc1ee1f4.
Prior to 556ab5125912 (2020-11-06), config/darwin-c.c did not impose a
maximum that needed annual maintenance, as at that point, all macOS
releases had used a major version of 10. The stricter approach imposed
since then was valuable for a time until the particulars of the new
versioning scheme were established and understood, but now that they
are, it's prudent to restore a more permissive approach.
gcc/ChangeLog:
* config/darwin-c.c: Make -mmacosx-version-min more future-proof.
Iain Sandoe [Sun, 29 May 2022 15:14:32 +0000 (16:14 +0100)]
Darwin: Fix empty g++ command lines [PR105599].
An empty g++ command line should produce a diagnostic that there are no
inputs. The PR is that currently Darwin produces a dignostic about missing
link items instead - this is because (errnoeously), for this driver, we are
creating a link job for empty command lines.
The problem occurs in four stages:
The g++ driver appends -shared-libgcc to the command line.
The Darwin driver_init code in the backend does not see this (it sees an
empty command line).
When the back end driver code driver sees an empty command line, it does not
add any supplementary flags (e.g. asm-macosx-version-min) - precisely to
avoid anything being claimed as an input_file and therefore triggering a link
line.
Since we do not have a value for asm-macosx-version-min when processing the
driver specs, we unconditionally inject 'multiply_defined suppress' which is
used with shared libgcc (but only intended on very old Darwin). This then
causes the generation of a link job.
The solution, for the present, is to move version-specific link params to the
LINK_SPEC so that they are only processed when a link job has already been
decided.
Iain Sandoe [Mon, 20 Dec 2021 15:19:50 +0000 (15:19 +0000)]
Darwin: Update rules for handling alignment of globals.
The current rule was too strict and has not been required since Darwin11.
This relaxes the constraint to allow up to 2^28 alignment for non-common
entities. Common is still restricted to a maximum aligment of 2^15.
When the host is an older version of Darwin ( earlier that 11 ) then the
existing constraint is still applied. Note that this is a host constraint
not a target one (so that a compilation on 10.7 targeting 10.6 is allowed
to use a greater alignment than the tools on 10.6 support). This matches
the behaviour of clang.
* config.gcc: Emit L2_MAX_OFILE_ALIGNMENT with suitable
values for the host.
* config/darwin.c (darwin_emit_common): Error for alignment
values > 32768.
* config/darwin.h (MAX_OFILE_ALIGNMENT): Rework to use the
configured L2_MAX_OFILE_ALIGNMENT.
gcc/testsuite/ChangeLog:
* gcc.dg/darwin-aligned-globals.c: New test.
* gcc.dg/darwin-comm-1.c: New test.
* gcc.dg/attr-aligned.c: Amend for new alignment values on
Darwin.
* gcc.target/i386/pr89261.c: Likewise.
Darwin: Future-proof and homogeneize detection of darwin versions
The current GCC branch will become 12.1.0, which will be the stable
version of GCC when the next macOS version is released. There are some
places in GCC that don’t handle darwin22 as a version, so we need to
future-proof it (gcc/config.gcc and gcc/config/darwin-driver.c). We
align that code with what Apple clang does, i.e. accept all potential
major macOS versions until 99.
This patch also homogenises the handling of darwin version numbers,
where the majority of places use darwin2*, but some used darwin2[0-9]*.
Since there never was a darwin2.x version, the two are equivalent, and
we prefer the simpler darwin2*
gcc/ChangeLog:
* config/darwin-driver.c: Make version code more future-proof.
* config.gcc: Homogeneize darwin versions.
* configure.ac: Homogeneize darwin versions.
* configure: Regenerate.
gcc/testsuite/ChangeLog:
* gcc.dg/darwin-minversion-link.c: Test darwin21.
* obj-c++.dg/cxx-ivars-3.mm: Homogeneize darwin versions.
* obj-c++.dg/objc-gc-3.mm: Homogeneize darwin versions.
* objc.dg/objc-gc-4.m: Homogeneize darwin versions.
Jonathan Wakely [Mon, 28 Nov 2022 13:28:53 +0000 (13:28 +0000)]
libstdc++: Fix src/c++17/memory_resource for H8 targets [PR107801]
This fixes compilation failures for H8 multilibs. For the normal
multilib (ILP16L32?), the chunk struct does not have the expected size,
because uint32_t is type long and has alignment 4 (by default). This
forces sizeof(chunk) to be 12 instead of the expected 10. We can fix
that by using bitset::size_type instead of uint32_t, so that we only use
a 16-bit size when size_t and pointers are 16-bit types.
For the IL32P16 multilibs that use -mint32, int is wider than size_t
and so arithmetic expressions involving size_t promote to int. This
means we need some explicit casts back to size_t.
libstdc++-v3/ChangeLog:
PR libstdc++/107801
* src/c++17/memory_resource.cc (chunk::_M_bytes): Change type
from uint32_t to bitset::size_type. Adjust static assertion.
(__pool_resource::_Pool::replenish): Cast to size_t after
multiplication instead of before.
(__pool_resource::_M_alloc_pools): Ensure both arguments to
std::max have type size_t.
Jonathan Wakely [Tue, 22 Nov 2022 09:53:36 +0000 (09:53 +0000)]
libstdc++: Fix pool resource build errors for H8 [PR107801]
The array of pool sizes was previously adjusted to work for msp430-elf
which has 16-bit int and either 16-bit size_t or 20-bit size_t. The
largest pool sizes were disabled unless size_t has more than 20 bits.
The H8 family has 16-bit int but 32-bit size_t, which means that the
largest sizes are enabled, but 1<<15 produces a negative number that
then cannot be narrowed to size_t.
Replace the test for 32-bit size_t with a test for 32-bit int, which
means we won't use the 4kiB to 4MiB pools for targets with 16-bit int
even if they have a wider size_t.
libstdc++-v3/ChangeLog:
PR libstdc++/107801
* src/c++17/memory_resource.cc (pool_sizes): Disable large pools
for targets with 16-bit int.
Jonathan Wakely [Fri, 23 Sep 2022 12:28:37 +0000 (13:28 +0100)]
libstdc++: Fix std::is_nothrow_invocable_r for uncopyable prvalues [PR91456]
This is the last missing piece of PR 91456.
This also removes the only use of the C++11 version of
std::is_nothrow_invocable.
libstdc++-v3/ChangeLog:
PR libstdc++/91456
* include/std/type_traits (__is_nothrow_invocable): Remove.
(__is_invocable_impl::__nothrow_type): New member type which
checks if the conversion can throw.
(__is_nt_invocable_impl): Replace class template with alias
template to __is_nt_invocable_impl::__nothrow_type.
* testsuite/20_util/is_nothrow_invocable/91456.cc: New test.
* testsuite/20_util/is_nothrow_convertible/value.cc: Remove
macro used by value_ext.cc test.
* testsuite/20_util/is_nothrow_convertible/value_ext.cc: Remove
test for non-standard __is_nothrow_invocable trait.
Jonathan Wakely [Fri, 1 Jul 2022 10:40:29 +0000 (11:40 +0100)]
libstdc++: Add nodiscard attribute to filesystem operations
Some of these are not truly "pure" because they access the file system,
e.g. exists and file_size, but they do not modify anything and are only
useful for the return value.
If you really want to use one of those functions just to check whether
an error is reported (either via an exception or an error_code&
argument) you can still do so, but you need to cast the discarded result
to void. Several tests need such a change, because they were indeed
only calling the functions to check for expected errors.
libstdc++-v3/ChangeLog:
* include/bits/fs_ops.h: Add nodiscard to all pure functions.
* include/experimental/bits/fs_ops.h: Likewise.
* testsuite/27_io/filesystem/operations/all.cc: Do not discard
results of absolute and canonical.
* testsuite/27_io/filesystem/operations/absolute.cc: Cast
discarded result to void.
* testsuite/27_io/filesystem/operations/canonical.cc: Likewise.
* testsuite/27_io/filesystem/operations/exists.cc: Likewise.
* testsuite/27_io/filesystem/operations/is_empty.cc: Likewise.
* testsuite/27_io/filesystem/operations/read_symlink.cc:
Likewise.
* testsuite/27_io/filesystem/operations/status.cc: Likewise.
* testsuite/27_io/filesystem/operations/symlink_status.cc:
Likewise.
* testsuite/27_io/filesystem/operations/temp_directory_path.cc:
Likewise.
* testsuite/experimental/filesystem/operations/canonical.cc:
Likewise.
* testsuite/experimental/filesystem/operations/exists.cc:
Likewise.
* testsuite/experimental/filesystem/operations/is_empty.cc:
Likewise.
* testsuite/experimental/filesystem/operations/read_symlink.cc:
Likewise.
* testsuite/experimental/filesystem/operations/temp_directory_path.cc:
Likewise.
Jonathan Wakely [Thu, 10 Nov 2022 14:11:27 +0000 (14:11 +0000)]
libstdc++: Fix tests with non-const operator==
These tests fail in strict -std=c++20 mode but their equality ops don't
need to be non-const, it looks like an accident.
This fixes two FAILs with -std=c++20:
FAIL: 20_util/tuple/swap.cc (test for excess errors)
FAIL: 26_numerics/valarray/87641.cc (test for excess errors)
Jonathan Wakely [Fri, 11 Mar 2022 14:52:38 +0000 (14:52 +0000)]
libstdc++: Fix reading UTF-8 characters for 16-bit targets [PR104875]
The current code in read_utf8_code_point assumes that integer promotion
will create a 32-bit int, but that's not true for 16-bit targets like
msp430 and avr. This changes the intermediate variables used for each
octet from unsigned char to char32_t, so that (c << N) works correctly
when N > 8.
libstdc++-v3/ChangeLog:
PR libstdc++/104875
* src/c++11/codecvt.cc (read_utf8_code_point): Use char32_t to
hold octets that will be left-shifted.
The standard's move-and-swap implementation generates smaller code at
all levels except -O0 and -Og, so it seems simplest to just do what the
standard says.
libstdc++-v3/ChangeLog:
PR libstdc++/108118
* include/bits/shared_ptr_base.h (weak_ptr::operator=):
Implement as move-and-swap exactly as specified in the standard.
* testsuite/20_util/weak_ptr/cons/self_move.cc: New test.
liuhongt [Wed, 10 May 2023 07:16:58 +0000 (15:16 +0800)]
x86: Add a new option -mdaz-ftz to enable FTZ and DAZ flags in MXCSR.
if (mdaz-ftz)
link crtfastmath.o
else if ((Ofast || ffast-math || funsafe-math-optimizations)
&& !mno-daz-ftz)
link crtfastmath.o
else
Don't link crtfastmath.o
gcc/ChangeLog:
* config/i386/cygwin.h (ENDFILE_SPEC): Link crtfastmath.o
whenever -mdaz-ftz is specified. Don't link crtfastmath.o
when -mno-daz-ftz is specified.
* config/i386/darwin.h (ENDFILE_SPEC): Ditto.
* config/i386/gnu-user-common.h
(GNU_USER_TARGET_MATHFILE_SPEC): Ditto.
* config/i386/mingw32.h (ENDFILE_SPEC): Ditto.
* config/i386/i386.opt (mdaz-ftz): New option.
* doc/invoke.texi (x86 options): Document mftz-daz.
Patrick Palka [Mon, 24 Apr 2023 17:39:54 +0000 (13:39 -0400)]
libstdc++: Fix __max_diff_type::operator>>= for negative values
This patch fixes sign bit propagation when right-shifting a negative
__max_diff_type value by more than one, a bug that our existing test
coverage didn't expose until r14-159-g03cebd304955a6 fixed the front
end's 'signed typedef-name' handling that the test relies on (which is
a non-standard extension to the language grammar).
libstdc++-v3/ChangeLog:
* include/bits/max_size_type.h (__max_diff_type::operator>>=):
Fix propagation of sign bit.
* testsuite/std/ranges/iota/max_size_type.cc: Avoid using the
non-standard 'signed typedef-name'. Add some compile-time tests
for right-shifting a negative __max_diff_type value by more than
one.
Jakub Jelinek [Tue, 9 May 2023 10:14:18 +0000 (12:14 +0200)]
testsuite: Add further testcase for already fixed PR [PR109778]
I came up with a testcase which reproduces all the way to r10-7469.
LTO to avoid early inlining it, so that ccp handles rotates and not
shifts before they are turned into rotates.
2023-05-09 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/109778
* gcc.dg/lto/pr109778_0.c: New test.
* gcc.dg/lto/pr109778_1.c: New file.
Jakub Jelinek [Tue, 9 May 2023 10:10:07 +0000 (12:10 +0200)]
tree-ssa-ccp, wide-int: Fix up handling of [LR]ROTATE_EXPR in bitwise ccp [PR109778]
The following testcase is miscompiled, because bitwise ccp2 handles
a rotate with a signed type incorrectly.
Seems tree-ssa-ccp.cc has the only callers of wi::[lr]rotate with 3
arguments, all other callers just rotate in the right precision and
I think work correctly. ccp works with widest_ints and so rotations
by the excessive precision certainly don't match what it wants
when it sees a rotate in some specific bitsize. Still, if it is
unsigned rotate and the widest_int is zero extended from width,
the functions perform left shift and logical right shift on the value
and then at the end zero extend the result of left shift and uselessly
also the result of logical right shift and return | of that.
On the testcase we the signed char rrotate by 4 argument is
CONSTANT -75 i.e. 0xffffffff....fffffb5 with mask 2.
The mask is correctly rotated to 0x20, but because the 8-bit constant
is sign extended to 192-bit one, the logical right shift by 4 doesn't
yield expected 0xb, but gives 0xfffffffffff....ffffb, and then
return wi::zext (left, width) | wi::zext (right, width); where left is
0xfffffff....fb50, so we return 0xfb instead of the expected
0x5b.
The following patch fixes that by doing the zero extension in case of
the right variable before doing wi::lrshift rather than after it.
Also, wi::[lr]rotate widht width < precision always zero extends
the result. I'm afraid it can't do better because it doesn't know
if it is done for an unsigned or signed type, but the caller in this
case knows that very well, so I've done the extension based on sgn
in the caller. E.g. 0x5b rotated right (or left) by 4 with width 8
previously gave 0xb5, but sgn == SIGNED in widest_int it should be
0xffffffff....fffb5 instead.
2023-05-09 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/109778
* wide-int.h (wi::lrotate, wi::rrotate): Call wi::lrshift on
wi::zext (x, width) rather than x if width != precision, rather
than using wi::zext (right, width) after the shift.
* tree-ssa-ccp.c (bit_value_binop): Call wi::ext on the results
of wi::lrotate or wi::rrotate.
When we end up with a widen-sum with an invariant smaller operand
the reduction code uses a wrong vector type for it, causing
IL checking ICEs. The following fixes that and the inefficiency
of using a widen-sum with a widenend invariant operand as well
by actually performing the check the following comment wants.
PR tree-optimization/108950
* tree-vect-patterns.c (vect_recog_widen_sum_pattern):
Check oprnd0 is defined in the loop.
* tree-vect-loop.c (vectorizable_reduction): Record all
operands vector types, compute that of invariants and
properly update their SLP nodes.