Paul E. Murphy [Thu, 22 Jun 2023 22:53:46 +0000 (17:53 -0500)]
go: Update usage of TARGET_AIX to TARGET_AIX_OS
TARGET_AIX is defined to a non-zero value on linux and maybe other
powerpc64le targets. This leads to unexpected behavior such as
dropping the .go_export section when linking a shared library
on linux/powerpc64le.
Instead, use TARGET_AIX_OS to toggle AIX specific behavior.
Fixes golang/go#60798.
2023-06-22 Paul E. Murphy <murphyp@linux.ibm.com>
gcc/go/
* go-backend.c [TARGET_AIX]: Rename and update usage to TARGET_AIX_OS.
* go-lang.c: Likewise.
But the document of mvzeroupper doesn't mention the insertion
required -O2 and above, it may confuse users when they explicitly
use -Os -mvzeroupper.
------------
mvzeroupper
Target Mask(VZEROUPPER) Save
Generate vzeroupper instruction before a transfer of control flow out of
the function.
------------
The patch moves flag_expensive_optimizations && !optimize_size to
ix86_option_override_internal. It makes -mvzeroupper independent of
optimization level, but still keeps the behavior of architecture
tuning(emit_vzeroupper) unchanged.
gcc/ChangeLog:
* config/i386/i386-features.c (pass_insert_vzeroupper:gate):
Move flag_expensive_optimizations && !optimize_size to ..
* config/i386/i386-options.c (ix86_option_override_internal):
.. this, it makes -mvzeroupper independent of optimization
level, but still keeps the behavior of architecture
tuning(emit_vzeroupper) unchanged.
(rest_of_handle_insert_vzeroupper): Remove
flag_expensive_optimizations && !optimize_size.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx-vzeroupper-29.c: New testcase.
Iain Buclaw [Mon, 26 Jun 2023 01:24:27 +0000 (03:24 +0200)]
d: Suboptimal codegen for __builtin_expect(cond, false)
Since PR96435, both boolean objects and expressions have been evaluated
in the following way.
(*(ubyte*)&obj_or_expr) & 1
It has been noted that sometimes this can cause the back-end to optimize
in non-obvious ways - in particular with __builtin_expect.
This @safe feature is now restricted to just when reading the value of a
bool field that comes from a union.
PR d/110359
gcc/d/ChangeLog:
* d-convert.cc (convert_for_rvalue): Only apply the @safe boolean
conversion to boolean fields of a union.
(convert_for_condition): Call convert_for_rvalue in the default case.
Jonathan Wakely [Mon, 11 Oct 2021 12:28:32 +0000 (13:28 +0100)]
libstdc++: Fix std::numeric_limits::lowest() test for strict modes
This test uses std::is_integral to decide whether we are testing an
integral or floating-point type. But that fails for __int128 because
is_integral<__int128> is false in strict modes. By using
numeric_limits::is_integer instead we get the right answer for all types
that have a numeric_limits specialization.
We can also simplify the test by removing the unnecessary tag
dispatching.
libstdc++-v3/ChangeLog:
* testsuite/18_support/numeric_limits/lowest.cc: Use
numeric_limits<T>::is_integer instead of is_integral<T>::value.
Jonathan Wakely [Mon, 15 May 2023 20:41:56 +0000 (21:41 +0100)]
libstdc++: Document removal of implicit allocator rebinding extensions
Traditionally libstdc++ allowed containers to be
instantiated with allocator's that have the wrong value type, implicitly
rebinding the allocator to the container's value type. Since C++20 that
has been explicitly ill-formed, so the extension is no longer supported
in strict modes (e.g. -std=c++17) and in C++20 and later.
libstdc++-v3/ChangeLog:
* doc/xml/manual/evolution.xml: Document removal of implicit
allocator rebinding extensions in strict mode and for C++20.
* doc/html/*: Regenerate.
The standard's move-and-swap implementation generates smaller code at
all levels except -O0 and -Og, so it seems simplest to just do what the
standard says.
libstdc++-v3/ChangeLog:
PR libstdc++/108118
* include/bits/shared_ptr_base.h (weak_ptr::operator=):
Implement as move-and-swap exactly as specified in the standard.
* testsuite/20_util/weak_ptr/cons/self_move.cc: New test.
Jonathan Wakely [Wed, 17 Jun 2020 21:49:06 +0000 (22:49 +0100)]
libstdc++: Avoid stack overflow in std::vector (PR 94540)
The std::__uninitialized_default_n algorithm used by std::vector creates
an initial object as a local variable then copies that into the
destination range. If the object is too large for the stack this
crashes. We should create the first object directly into the
destination and then copy it from there.
This doesn't fix the bug for C++98, because in that case the initial
value is created as a default argument of the vector constructor i.e. in
the user's code, not inside libstdc++. We can't prevent that.
PR libstdc++/94540
* include/bits/stl_uninitialized.h (__uninitialized_default_1<true>):
Construct the first value at *__first instead of on the stack.
(__uninitialized_default_n_1<true>): Likewise.
Improve comments on several of the non-standard algorithms.
* testsuite/20_util/specialized_algorithms/uninitialized_default/94540.cc:
New test.
* testsuite/20_util/specialized_algorithms/uninitialized_default_n/94540.cc:
New test.
* testsuite/20_util/specialized_algorithms/uninitialized_value_construct/94540.cc:
New test.
* testsuite/20_util/specialized_algorithms/uninitialized_value_construct_n/94540.cc:
New test.
* testsuite/23_containers/vector/cons/94540.cc: New test.
Jonathan Wakely [Mon, 22 Aug 2022 14:16:16 +0000 (15:16 +0100)]
libstdc++: Check for overflow in regex back-reference [PR106607]
Currently we fail to notice integer overflow when parsing a
back-reference expression, or when converting the parsed result from
long to int. This changes the result to be int, so no conversion is
needed, and uses the overflow-checking built-ins to detect an
out-of-range back-reference.
libstdc++-v3/ChangeLog:
PR libstdc++/106607
* include/bits/regex_compiler.tcc (_Compiler::_M_cur_int_value):
Use built-ins to check for integer overflow in back-reference
number.
* testsuite/28_regex/basic_regex/106607.cc: New test.
Jonathan Wakely [Tue, 14 Dec 2021 14:32:35 +0000 (14:32 +0000)]
libstdc++: Fix handling of invalid ranges in std::regex [PR102447]
std::regex currently allows invalid bracket ranges such as [\w-a] which
are only allowed by ECMAScript when in web browser compatibility mode.
It should be an error, because the start of the range is a character
class, not a single character. The current implementation of
_Compiler::_M_expression_term does not provide a way to reject this,
because we only remember a previous character, not whether we just
processed a character class (or collating symbol etc.)
This patch replaces the pair<bool, CharT> used to emulate
optional<CharT> with a custom class closer to pair<tribool,CharT>. That
allows us to track three states, so that we can tell when we've just
seen a character class.
With this additional state the code in _M_expression_term for processing
the _S_token_bracket_dash can be improved to correctly reject the [\w-a]
case, without regressing for valid cases such as [\w-] and [----].
libstdc++-v3/ChangeLog:
PR libstdc++/102447
* include/bits/regex_compiler.h (_Compiler::_BracketState): New
class.
(_Compiler::_BrackeyMatcher): New alias template.
(_Compiler::_M_expression_term): Change pair<bool, CharT>
parameter to _BracketState. Process first character for
ECMAScript syntax as well as POSIX.
* include/bits/regex_compiler.tcc
(_Compiler::_M_insert_bracket_matcher): Pass _BracketState.
(_Compiler::_M_expression_term): Use _BracketState to store
state between calls. Improve handling of dashes in ranges.
* testsuite/28_regex/algorithms/regex_match/cstring_bracket_01.cc:
Add more tests for ranges containing dashes. Check invalid
ranges with character class at the beginning.
Jonathan Wakely [Wed, 29 Sep 2021 12:48:15 +0000 (13:48 +0100)]
libstdc++: Check for invalid syntax_option_type values in <regex>
The standard says that it is invalid for more than one grammar element
to be set in a value of type regex_constants::syntax_option_type. This
adds a check in the regex compiler andthrows an exception if an invalid
value is used.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* include/bits/regex_compiler.h (_Compiler::_S_validate): New
function.
* include/bits/regex_compiler.tcc (_Compiler::_Compiler): Use
_S_validate to check flags.
* include/bits/regex_error.h (_S_grammar): New error code for
internal use.
* testsuite/28_regex/basic_regex/ctors/grammar.cc: New test.
Jonathan Wakely [Sun, 12 Dec 2021 21:15:17 +0000 (21:15 +0000)]
libstdc++: Fix std::regex_replace for strings with embedded null [PR103664]
The overload of std::regex_replace that takes a std::basic_string as the
fmt argument (for the replacement string) is implemented in terms of the
one taking a const C*, which uses std::char_traits to find the length.
That means it stops at a null character, even though the basic_string
might have additional characters beyond that.
Rather than duplicate the implementation of the const C* one for the
std::basic_string case, this moves that implementation to a new
__regex_replace function which takes a const C* and a length. Then both
the std::basic_string and const C* overloads can call that (with the
latter using char_traits to find the length to pass to the new
function).
libstdc++-v3/ChangeLog:
PR libstdc++/103664
* include/bits/regex.h (__regex_replace): Declare.
(regex_replace): Use it.
* include/bits/regex.tcc (__regex_replace): Replace regex_replace
definition with __regex_replace.
* testsuite/28_regex/algorithms/regex_replace/char/103664.cc: New test.
Jonathan Wakely [Wed, 29 Sep 2021 12:48:11 +0000 (13:48 +0100)]
libstdc++: std::basic_regex should treat '\0' as an ordinary char [PR84110]
When the input sequence contains a _CharT(0) character, the strchr call
in _Scanner<_CharT>::_M_scan_normal() will search for '\0' and so return
a pointer to the terminating null at the end of the string. This makes
the scanner think it's found a special character. Because it doesn't
match any of the actual special characters, we fall off the end of the
function (or assert in debug mode).
We should check for a null character explicitly and either treat it as
an ordinary character (for the ECMAScript grammar) or an error (for all
others). I'm not 100% sure that's right, but it seems consistent with
the POSIX RE rules where a '\0' means the end of the regex pattern or
the end of the sequence being matched.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
PR libstdc++/84110
* include/bits/regex_error.h (regex_constants::_S_null): New
error code for internal use.
* include/bits/regex_scanner.tcc (_Scanner::_M_scan_normal()):
Check for null character.
* testsuite/28_regex/basic_regex/84110.cc: New test.
libstdc++: include/bits/regex_error.h: Avoid warning with -fno-exceptions.
When building with -fno-exceptions, __GLIBCXX_THROW_OR_ABORT expands to
abort(), causing warnings:
unused parameter '__ecode'
unused parameter '__what'
This patch adds __attribute__((unused)) to avoid them.
Kewen Lin [Tue, 20 Jun 2023 06:40:52 +0000 (01:40 -0500)]
rs6000: Guard __builtin_{un,}pack_vector_int128 with vsx [PR109932]
As PR109932 shows, builtins __builtin_{un,}pack_vector_int128
should be guarded under vsx rather than power7, as their
corresponding bif patterns have the conditions TARGET_VSX
and VECTOR_MEM_ALTIVEC_OR_VSX_P (V1TImode). This patch is to
ensure __builtin_{un,}pack_vector_int128 only available under
vsx.
PR target/109932
gcc/ChangeLog:
* config/rs6000/rs6000-builtin.def (BU_VSX_MISC_2): New macro.
({un,}pack_vector_int128): Use BU_VSX_MISC_2 instead of
BU_P7_MISC_2.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/pr109932-1.c: New test.
* gcc.target/powerpc/pr109932-2.c: New test.
Kewen Lin [Mon, 12 Jun 2023 06:07:52 +0000 (01:07 -0500)]
rs6000: Don't use TFmode for 128 bits fp constant in toc [PR110011]
As PR110011 shows, when encoding 128 bits fp constant into
toc, we adopts REAL_VALUE_TO_TARGET_LONG_DOUBLE which is
to find the first float mode with LONG_DOUBLE_TYPE_SIZE
bits of precision, it would be TFmode here. But the 128
bits fp constant can be with mode IFmode or KFmode, which
doesn't necessarily have the same underlying float format
as the one of TFmode, like this PR exposes, with option
-mabi=ibmlongdouble TFmode has ibm_extended_format while
KFmode has ieee_quad_format, mixing up the formats (the
encoding/decoding ways) would cause unexpected results.
This patch is to make it use constant's own mode instead
of TFmode for real_to_target call.
PR target/110011
gcc/ChangeLog:
* config/rs6000/rs6000.c (output_toc): Use the mode of the 128-bit
floating constant itself for real_to_target call.
* config/rs6000/rs6000.c (darwin_rs6000_special_round_type_align):
Make sure that we do not have a cap on field alignment before altering
the struct layout based on the type alignment of the first entry.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/darwin-abi-13-0.c: New test.
* gcc.target/powerpc/darwin-abi-13-1.c: New test.
* gcc.target/powerpc/darwin-abi-13-2.c: New test.
* gcc.target/powerpc/darwin-structs-0.h: New test.
Jakub Jelinek [Fri, 9 Jun 2023 07:10:29 +0000 (09:10 +0200)]
fortran: Fix ICE on pr96024.f90 on big-endian hosts [PR96024]
The pr96024.f90 testcase ICEs on big-endian hosts. The problem is
that length->val.integer is accessed after checking
length->expr_type == EXPR_CONSTANT, but it is a CHARACTER constant
which uses length->val.character union member instead and on big-endian
we end up reading constant 0x100000000 rather than some small number
on little-endian and if target doesn't have enough memory for 4 times
that (i.e. 16GB allocation), it ICEs.
2023-06-09 Jakub Jelinek <jakub@redhat.com>
PR fortran/96024
* primary.c (gfc_convert_to_structure_constructor): Only do
constant string ctor length verification and truncation/padding
if constant length has INTEGER type.
Michael Meissner [Mon, 22 May 2023 15:26:08 +0000 (11:26 -0400)]
Do not generate vmaddfp and vnmsubfp
This is version 3 of the patch. This is essentially version 1 with the removal
of changes to altivec.md, and cleanup of the comments.
Version 2 generated the vmaddfp and vnmsubfp instructions if -Ofast was used,
and those changes are deleted in this patch.
The Altivec instructions vmaddfp and vnmsubfp have different rounding behaviors
than the VSX xvmaddsp and xvnmsubsp instructions. In particular, generating
these instructions seems to break Eigen on big endian systems.
I have done bootstrap builds on power9 little endian (with both IEEE long
double and IBM long double). I have also done the builds and test on a power8
big endian system (testing both 32-bit and 64-bit code generation). Chip has
verified that it fixes the problem that Eigen encountered. Can I check this
into the master GCC branch? After a burn-in period, can I check this patch
into the active GCC branches?
Thanks in advance.
2023-05-22 Michael Meissner <meissner@linux.ibm.com>
gcc/
PR target/70243
* config/rs6000/vsx.md (vsx_fmav4sf4): Do not generate vmaddfp. Back
port from master 04/10/2023.
(vsx_nfmsv4sf4): Do not generate vnmsubfp.
gcc/testsuite/
PR target/70243
* gcc.target/powerpc/pr70243.c: New test. Back port from master
04/10/2023.
Iain Sandoe [Sun, 21 May 2023 09:45:25 +0000 (10:45 +0100)]
libstdc++: Rename __null_terminated to avoid a collision with system headers.
r13-1073-g254e88b3d7e8ab made this change for experimental/bits/fs-path.h.
The change is needed on the 10.x branch for bits/fs-path.h (it is not required
on later branches since this header was reworked).
libsanitizer, darwin: Unsupport Darwin >= 22 for now.
The mechanism for location dyld has altered from Darwin22 since dyld is now
in the shared cache. The implemented mechanism for walking the cache uses
Apple Blocks which GCC does not yet support, and the fallback to the original
mechanism does not work there.
Until a suitable work-around can be found, unsupport Darwin22+.
* config.host: Arrange to set min Darwin OS versions from
the configured host version.
* config/darwin10-unwind-find-enc-func.c: Do not use current
headers, but declare the nexessary structures locally to the
versions in use for Mac OSX 10.6.
* config/t-darwin: Amend to handle configured min OS
versions.
* config/t-darwin-min-1: New.
* config/t-darwin-min-5: New.
* config/t-darwin-min-8: New.
The SDK for MacOS13 includes Apple-specific deprecations of some functions that
are not deprecated in Posix, C or C++ and widely used in GCC.
The fix makes the deprecation conditional on __APPLE_LOCAL_DEPRECATIONS so that
end users may still observe them but they are hidden from normal compilations.
* fixincl.x: Regenerate.
* inclhack.def: Add a fix for MacOS13 SDK function deprecations
in stdio.h.
* tests/base/stdio.h (__deprecated_msg): New test.
Iain Sandoe [Thu, 22 Dec 2022 17:32:06 +0000 (17:32 +0000)]
libgcc, Darwin: No early install for the compatibility libgcc_s.1.dylib.
On Darwin, GCC now uses a libgcc_s.1.1 for builtins and forwards the system
unwinder. We do, however, build a backwards compatibility libgcc_s.1.dylib.
However, this is not needed by GCC and can cause incorrect operation when
DYLD_LIBRARY_PATH is in use.
Since we do not need or use it during the build, the solution is to skip the
installation into the $build/gcc directory.
Iain Sandoe [Mon, 20 Dec 2021 15:19:50 +0000 (15:19 +0000)]
Darwin: Update rules for handling alignment of globals.
The current rule was too strict and has not been required since Darwin11.
This relaxes the constraint to allow up to 2^28 alignment for non-common
entities. Common is still restricted to a maximum aligment of 2^15.
When the host is an older version of Darwin ( earlier that 11 ) then the
existing constraint is still applied. Note that this is a host constraint
not a target one (so that a compilation on 10.7 targeting 10.6 is allowed
to use a greater alignment than the tools on 10.6 support). This matches
the behaviour of clang.
* config.gcc: Emit L2_MAX_OFILE_ALIGNMENT with suitable
values for the host.
* config/darwin.c (darwin_emit_common): Error for alignment
values > 32768.
* config/darwin.h (MAX_OFILE_ALIGNMENT): Rework to use the
configured L2_MAX_OFILE_ALIGNMENT.
gcc/testsuite/ChangeLog:
* gcc.dg/darwin-aligned-globals.c: New test.
* gcc.dg/darwin-comm-1.c: New test.
* gcc.dg/attr-aligned.c: Amend for new alignment values on
Darwin.
* gcc.target/i386/pr89261.c: Likewise.
Jakub Jelinek [Tue, 9 May 2023 10:14:18 +0000 (12:14 +0200)]
testsuite: Add further testcase for already fixed PR [PR109778]
I came up with a testcase which reproduces all the way to r10-7469.
LTO to avoid early inlining it, so that ccp handles rotates and not
shifts before they are turned into rotates.
2023-05-09 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/109778
* gcc.dg/lto/pr109778_0.c: New test.
* gcc.dg/lto/pr109778_1.c: New file.
Jakub Jelinek [Tue, 9 May 2023 10:10:07 +0000 (12:10 +0200)]
tree-ssa-ccp, wide-int: Fix up handling of [LR]ROTATE_EXPR in bitwise ccp [PR109778]
The following testcase is miscompiled, because bitwise ccp2 handles
a rotate with a signed type incorrectly.
Seems tree-ssa-ccp.cc has the only callers of wi::[lr]rotate with 3
arguments, all other callers just rotate in the right precision and
I think work correctly. ccp works with widest_ints and so rotations
by the excessive precision certainly don't match what it wants
when it sees a rotate in some specific bitsize. Still, if it is
unsigned rotate and the widest_int is zero extended from width,
the functions perform left shift and logical right shift on the value
and then at the end zero extend the result of left shift and uselessly
also the result of logical right shift and return | of that.
On the testcase we the signed char rrotate by 4 argument is
CONSTANT -75 i.e. 0xffffffff....fffffb5 with mask 2.
The mask is correctly rotated to 0x20, but because the 8-bit constant
is sign extended to 192-bit one, the logical right shift by 4 doesn't
yield expected 0xb, but gives 0xfffffffffff....ffffb, and then
return wi::zext (left, width) | wi::zext (right, width); where left is
0xfffffff....fb50, so we return 0xfb instead of the expected
0x5b.
The following patch fixes that by doing the zero extension in case of
the right variable before doing wi::lrshift rather than after it.
Also, wi::[lr]rotate widht width < precision always zero extends
the result. I'm afraid it can't do better because it doesn't know
if it is done for an unsigned or signed type, but the caller in this
case knows that very well, so I've done the extension based on sgn
in the caller. E.g. 0x5b rotated right (or left) by 4 with width 8
previously gave 0xb5, but sgn == SIGNED in widest_int it should be
0xffffffff....fffb5 instead.
2023-05-09 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/109778
* wide-int.h (wi::lrotate, wi::rrotate): Call wi::lrshift on
wi::zext (x, width) rather than x if width != precision, rather
than using wi::zext (right, width) after the shift.
* tree-ssa-ccp.c (bit_value_binop): Call wi::ext on the results
of wi::lrotate or wi::rrotate.
Jonathan Wakely [Wed, 3 May 2023 20:40:01 +0000 (21:40 +0100)]
libstdc++: Ensure constexpr std::lcm detects out-of-range result [PR105844]
On the gcc-10 branch, __glibcxx_assert does not unconditionally check
the condition during constant evaluation. This means we need an explicit
additional check for std::lcm results that cannot be represented in an
unsigned result type.
libstdc++-v3/ChangeLog:
PR libstdc++/105844
* include/std/numeric (lcm): Ensure out-of-range result is
detected in constant evaluation.
* testsuite/26_numerics/lcm/105844.cc: Adjust dg-error string.
Jonathan Wakely [Fri, 10 Jun 2022 13:39:13 +0000 (14:39 +0100)]
libstdc++: Make std::lcm and std::gcd detect overflow [PR105844]
When I fixed PR libstdc++/92978 I introduced a regression whereby
std::lcm(INT_MIN, 1) and std::lcm(50000, 49999) would no longer produce
errors during constant evaluation. Those calls are undefined, because
they violate the preconditions that |m| and the result can be
represented in the return type (which is int in both those cases). The
regression occurred because __absu<unsigned>(INT_MIN) is well-formed,
due to the explicit casts to unsigned in that new helper function, and
the out-of-range multiplication is well-formed, because unsigned
arithmetic wraps instead of overflowing.
To fix 92978 I made std::gcm and std::lcm calculate |m| and |n|
immediately, yielding a common unsigned type that was used to calculate
the result. That was partly correct, but there's no need to use an
unsigned type. Doing so only suppresses the overflow errors so the
compiler can't detect them. This change replaces __absu with __abs_r
that returns the common type (not its corresponding unsigned type). This
way we can detect overflow in __abs_r when required, while still
supporting the most-negative value when it can be represented in the
result type. To detect LCM results that are out of range of the result
type we still need explicit checks, because neither constant evaluation
nor UBsan will complain about unsigned wrapping for cases such as
std::lcm(500000u, 499999u). We can detect those overflows efficiently by
using __builtin_mul_overflow and asserting.
libstdc++-v3/ChangeLog:
PR libstdc++/105844
* include/experimental/numeric (experimental::gcd): Simplify
assertions. Use __abs_r instead of __absu.
(experimental::lcm): Likewise. Remove use of __detail::__lcm so
overflow can be detected.
* include/std/numeric (__detail::__absu): Rename to __abs_r and
change to allow signed result type, so overflow can be detected.
(__detail::__lcm): Remove.
(gcd): Simplify assertions. Use __abs_r instead of __absu.
(lcm): Likewise. Remove use of __detail::__lcm so overflow can
be detected.
* testsuite/26_numerics/gcd/gcd_neg.cc: Adjust dg-error lines.
* testsuite/26_numerics/lcm/lcm_neg.cc: Likewise.
* testsuite/26_numerics/gcd/105844.cc: New test.
* testsuite/26_numerics/lcm/105844.cc: New test.
Anthony Sharp [Fri, 27 Aug 2021 14:02:42 +0000 (10:02 -0400)]
call_summary: add missing template keyword
Without the 'template', this function template compares 'traverse' to 'f',
and then compares the result to 'a'. Evidently it hasn't been instantiated
yet.
Jakub Jelinek [Wed, 12 Apr 2023 14:55:15 +0000 (16:55 +0200)]
reassoc: Fix up another ICE with returns_twice call [PR109410]
The following testcase ICEs in reassoc, unlike the last case I've fixed
there here SSA_NAME_USED_IN_ABNORMAL_PHI is not the case anywhere.
build_and_add_sum places new statements after the later appearing definition
of an operand but if both operands are default defs or constants, we place
statement at the start of the function.
If the very first statement of a function is a call to returns_twice
function, this doesn't work though, because that call has to be the first
thing in its basic block, so the following patch splits the entry successor
edge such that the new statements are added into a different block from the
returns_twice call.
I think we should in stage1 reconsider such placements, I think it
unnecessarily enlarges the lifetime of the new lhs if its operand(s)
are used more than once in the function. Unless something sinks those
again. Would be nice to place it closer to the actual uses (or where
they will be placed).
2023-04-12 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/109410
* tree-ssa-reassoc.c (build_and_add_sum): Split edge from entry
block if first statement of the function is a call to returns_twice
function.
Jakub Jelinek [Sun, 2 Apr 2023 18:05:31 +0000 (20:05 +0200)]
libiberty: Make strstr.c in libiberty ANSI compliant
On Fri, Nov 13, 2020 at 11:53:43AM -0700, Jeff Law via Gcc-patches wrote:
>
> On 5/1/20 6:06 PM, Seija Kijin via Gcc-patches wrote:
> > The original code in libiberty says "FIXME" and then says it has not been
> > validated to be ANSI compliant. However, this patch changes the function to
> > match implementations that ARE compliant, and such code is in the public
> > domain.
> >
> > I ran the test results, and there are no test failures.
>
> Thanks. This seems to be the standard "simple" strstr implementation.
> There's significantly faster implementations available, but I doubt it's
> worth the effort as the version in this file only gets used if there is
> no system strstr.c.
Except that PR109306 says the new version is non-compliant and
is certainly slower than what we used to have. The only problem I see
on the old version (sure, it is not very fast version) is that for
strstr ("abcd", "") it returned "abcd"+4 rather than "abcd" because
strchr in that case changed p to point to the last character and then
strncmp returned 0.
The question reported in PR109306 is whether memcmp is required not to
access characters beyond the first difference or not.
For all of memcmp/strcmp/strncmp, C17 says:
"The sign of a nonzero value returned by the comparison functions memcmp, strcmp, and strncmp
is determined by the sign of the difference between the values of the first pair of characters (both
interpreted as unsigned char) that differ in the objects being compared."
but then in memcmp description says:
"The memcmp function compares the first n characters of the object pointed to by s1 to the first n
characters of the object pointed to by s2."
rather than something similar to strncmp wording:
"The strncmp function compares not more than n characters (characters that follow a null character
are not compared) from the array pointed to by s1 to the array pointed to by
s2."
So, while for strncmp it seems clearly well defined when there is zero
terminator before reaching the n, for memcmp it is unclear if say
int
memcmp (const void *s1, const void *s2, size_t n)
{
int ret = 0;
size_t i;
const unsigned char *p1 = (const unsigned char *) s1;
const unsigned char *p2 = (const unsigned char *) s2;
for (i = n; i; i--)
if (p1[i - 1] != p2[i - 1])
ret = p1[i - 1] < p2[i - 1] ? -1 : 1;
return ret;
}
wouldn't be valid implementation (one which always compares all characters
and just returns non-zero from the first one that differs).
So, shouldn't we just revert and handle the len == 0 case correctly?
I think almost nothing really uses it, but still, the old version
at least worked nicer with a fast strchr.
Could as well strncmp (p + 1, s2 + 1, len - 1) if that is preferred
because strchr already compared the first character.
2023-04-02 Jakub Jelinek <jakub@redhat.com>
PR other/109306
* strstr.c (strstr): Return s1 if len is 0.
Jakub Jelinek [Tue, 28 Mar 2023 08:56:44 +0000 (10:56 +0200)]
sanopt: Return TODO_cleanup_cfg if any .{UB,HWA,A}SAN_* calls were lowered [PR106190]
The following testcase ICEs, because without optimization eh lowering
decides not to duplicate finally block of try/finally and so we end up
with variable guarded cleanup. The sanopt pass creates a cfg that ought
to be cleaned up (some IFN_UBSAN_* functions are lowered in this case with
constant conditions in gcond and when not allowing recovery some bbs which
end with noreturn calls actually have successor edges), but the cfg cleanup
is actually (it is -O0) done only during the optimized pass. We notice
there that the d[1][a] = 0; statement which has an EH edge is unreachable
(because ubsan would always abort on the out of bounds d[1] access), remove
the EH landing pad and block, but because that block just sets a variable
and jumps to another one which tests that variable and that one is reachable
from normal control flow, the __builtin_eh_pointer (1) later in there is
kept in the IL and we ICE during expansion of that statement because the
EH region has been removed.
The following patch fixes it by doing the cfg cleanup already during
sanopt pass if we create something that might need it, while the EH
landing pad is then removed already during sanopt pass, there is ehcleanup
later and we don't ICE anymore.
2023-03-28 Jakub Jelinek <jakub@redhat.com>
PR middle-end/106190
* sanopt.c (pass_sanopt::execute): Return TODO_cleanup_cfg if any
of the IFN_{UB,HWA,A}SAN_* internal fns are lowered.
Jakub Jelinek [Sun, 26 Mar 2023 18:15:05 +0000 (20:15 +0200)]
predict: Don't emit -Wsuggest-attribute=cold warning for functions which already have that attribute [PR105685]
In the following testcase, we predict baz to have cold
entry regardless of the user supplied attribute (as it call
unconditionally a cold function), but still issue
a -Wsuggest-attribute=cold warning despite it having that attribute
already.
The following patch avoids that.
2023-03-26 Jakub Jelinek <jakub@redhat.com>
PR ipa/105685
* predict.c (compute_function_frequency): Don't call
warn_function_cold if function already has cold attribute.
Jakub Jelinek [Mon, 20 Mar 2023 19:29:47 +0000 (20:29 +0100)]
c++: Drop TREE_READONLY on vars (possibly) initialized by tls wrapper [PR109164]
The following two testcases are miscompiled, because we keep TREE_READONLY
on the vars even when they are (possibly) dynamically initialized by a TLS
wrapper function. Normally cp_finish_decl drops TREE_READONLY from vars
which need dynamic initialization, but for TLS we do this kind of
initialization upon every access to those variables. Keeping them
TREE_READONLY means e.g. PRE can hoist loads from those before loops
which contain the TLS wrapper calls, so we can access the TLS variables
before they are initialized.
2023-03-20 Jakub Jelinek <jakub@redhat.com>
PR c++/109164
* cp-tree.h (var_needs_tls_wrapper): Declare.
* decl2.c (var_needs_tls_wrapper): No longer static.
* decl.c (cp_finish_decl): Clear TREE_READONLY on TLS variables
for which a TLS wrapper will be needed.
* g++.dg/tls/thread_local13.C: New test.
* g++.dg/tls/thread_local13-aux.cc: New file.
* g++.dg/tls/thread_local14.C: New test.
* g++.dg/tls/thread_local14-aux.cc: New file.