This patch consolidates the glob implementation. The main changes are:
* On Linux all implementation now uses the default one at
sysdeps/unix/sysv/linux/glob{free}{64}.c with the exception
of alpha (which requires specific versioning) and s390-32 (which
different than other 32 bits ports it does not add a compat one
symbol for 2.1 version).
* The default implementation uses XSTAT_IS_XSTAT64 to define whether
both glob{free} and glob{free}64 should be different implementations.
For archictures that define XSTAT_IS_XSTAT64, glob{free} is an alias
to glob{free}64.
* Move i386 olddirent.h header to Linux default directory, since it is
the only header with this name and it is shared among different
architectures (and used on compat glob symbol as well).
Checked on x86_64-linux-gnu and on a build using build-many-glibcs.py
for all major architectures.
Current implementation of tunables does not set arena_max and arena_test
values. Any value provided by glibc.malloc.arena_max and
glibc.malloc.arena_test parameters is ignored.
These tunables have minval value set to 1 (see elf/dl-tunables.list file)
and undefined maxval value. In that case default value (which is 0. see
scripts/gen-tunables.awk) is being used to set maxval.
For instance, generated tunable_list[] entry for arena_max is:
(gdb) p *cur
$1 = {name = 0x7ffff7df6217 "glibc.malloc.arena_max",
type = {type_code = TUNABLE_TYPE_SIZE_T, min = 1, max = 0},
val = {numval = 0, strval = 0x0}, initialized = false,
security_level = TUNABLE_SECLEVEL_SXID_IGNORE,
env_alias = 0x7ffff7df622e "MALLOC_ARENA_MAX"}
As a result, any value of glibc.malloc.arena_max is ignored by
TUNABLE_SET_VAL_IF_VALID_RANGE macro
__type min = (__cur)->type.min; <- initialized to 1
__type max = (__cur)->type.max; <- initialized to 0!
if (min == max) <- false
{
min = __default_min;
max = __default_max;
}
if ((__type) (__val) >= min && (__type) (val) <= max) <- false
{
(__cur)->val.numval = val;
(__cur)->initialized = true;
}
Assigning correct min/max values at a build time fixes a problem.
Plus, a bit of optimization: Setting of default min/max values for the
given type at a run time might be eliminated.
* elf/dl-tunables.c (do_tunable_update_val): Range checking fix.
* scripts/gen-tunables.awk: Set unspecified minval and/or maxval
values to correct default value for given type.
Joseph Myers [Thu, 19 Oct 2017 17:32:20 +0000 (17:32 +0000)]
Install correct bits/long-double.h for MIPS64 (bug 22322).
Similar to bug 21987 for SPARC, MIPS64 wrongly installs the ldbl-128
version of bits/long-double.h, meaning incorrect results when using
headers installed from a 64-bit installation for a 32-bit build. (I
haven't actually seen this cause build failures before its interaction
with bits/floatn.h did so - installed headers wrongly expecting
_Float128 to be available in a 32-bit configuration.)
This patch fixes the bug by moving the MIPS header to
sysdeps/mips/ieee754, which comes before sysdeps/ieee754/ldbl-128 in
the sysdeps directory ordering. (bits/floatn.h will need a similar
fix - duplicating the ldbl-128 version for MIPS will suffice - for
headers from a 32-bit installation to be correct for 64-bit builds.)
Tested with build-many-glibcs.py (compilers build for
mips64-linux-gnu, where there was previously a libstdc++ build failure
as at
<https://sourceware.org/ml/libc-testresults/2017-q4/msg00130.html>).
Romain Naour [Mon, 16 Oct 2017 21:21:56 +0000 (23:21 +0200)]
Let signbit use the builtin in C++ mode with gcc < 6.x (bug 22296)
When using gcc < 6.x, signbit does not use the type-generic
__builtin_signbit builtin, instead it uses __MATH_TG.
However, when library support for float128 is available, __MATH_TG uses
__builtin_types_compatible_p, which is not available in C++ mode.
On the other hand, libstdc++ undefines (in cmath) many macros from
math.h, including signbit, so that it can provide its own functions.
However, during its configure tests, libstdc++ just tests for the
availability of the macros (it does not undefine them, nor does it
provide its own functions).
Finally, libstdc++ configure tests include math.h and get the definition
of signbit that uses __MATH_TG (and __builtin_types_compatible_p).
Since libstdc++ does not undefine the macros during its configure
tests, they fail.
This patch lets signbit use the builtin in C++ mode when gcc < 6.x is
used. This allows the configure test in libstdc++ to work.
Tested for x86_64.
[BZ #22296]
* math/math.h: Let signbit use the builtin in C++ mode with gcc
< 6.x
Cc: Gabriel F. T. Gomes <gftg@linux.vnet.ibm.com> Cc: Joseph Myers <joseph@codesourcery.com>
(cherry picked from commit 386e1c26ac473d6863133ab9cbe3bbda16c15816)
H.J. Lu [Sun, 22 Oct 2017 14:40:39 +0000 (07:40 -0700)]
x86-64: Use fxsave/xsave/xsavec in _dl_runtime_resolve [BZ #21265]
In _dl_runtime_resolve, use fxsave/xsave/xsavec to preserve all vector,
mask and bound registers. It simplifies _dl_runtime_resolve and supports
different calling conventions. ld.so code size is reduced by more than
1 KB. However, use fxsave/xsave/xsavec takes a little bit more cycles
than saving and restoring vector and bound registers individually.
Latency for _dl_runtime_resolve to lookup the function, foo, from one
shared library plus libc.so:
This is the worst case where portion of time spent for saving and
restoring registers is bigger than majority of cases. With smaller
_dl_runtime_resolve code size, overall performance impact is negligible.
On IvyBridge, differences in build and test time of binutils with lazy
binding GCC and binutils are noises. On Westmere, differences in
bootstrap and "makc check" time of GCC 7 with lazy binding GCC and
binutils are also noises.
[BZ #21265]
* sysdeps/x86/cpu-features-offsets.sym (XSAVE_STATE_SIZE_OFFSET):
New.
* sysdeps/x86/cpu-features.c: Include <libc-pointer-arith.h>.
(get_common_indeces): Set xsave_state_size, xsave_state_full_size
and bit_arch_XSAVEC_Usable if needed.
(init_cpu_features): Remove bit_arch_Use_dl_runtime_resolve_slow
and bit_arch_Use_dl_runtime_resolve_opt.
* sysdeps/x86/cpu-features.h (bit_arch_Use_dl_runtime_resolve_opt):
Removed.
(bit_arch_Use_dl_runtime_resolve_slow): Likewise.
(bit_arch_Prefer_No_AVX512): Updated.
(bit_arch_MathVec_Prefer_No_AVX512): Likewise.
(bit_arch_XSAVEC_Usable): New.
(STATE_SAVE_OFFSET): Likewise.
(STATE_SAVE_MASK): Likewise.
[__ASSEMBLER__]: Include <cpu-features-offsets.h>.
(cpu_features): Add xsave_state_size and xsave_state_full_size.
(index_arch_Use_dl_runtime_resolve_opt): Removed.
(index_arch_Use_dl_runtime_resolve_slow): Likewise.
(index_arch_XSAVEC_Usable): New.
* sysdeps/x86/cpu-tunables.c (TUNABLE_CALLBACK (set_hwcaps)):
Support XSAVEC_Usable. Remove Use_dl_runtime_resolve_slow.
* sysdeps/x86_64/Makefile (tst-x86_64-1-ENV): New if tunables
is enabled.
* sysdeps/x86_64/dl-machine.h (elf_machine_runtime_setup):
Replace _dl_runtime_resolve_sse, _dl_runtime_resolve_avx,
_dl_runtime_resolve_avx_slow, _dl_runtime_resolve_avx_opt,
_dl_runtime_resolve_avx512 and _dl_runtime_resolve_avx512_opt
with _dl_runtime_resolve_fxsave, _dl_runtime_resolve_xsave and
_dl_runtime_resolve_xsavec.
* sysdeps/x86_64/dl-trampoline.S (DL_RUNTIME_UNALIGNED_VEC_SIZE):
Removed.
(DL_RUNTIME_RESOLVE_REALIGN_STACK): Check STATE_SAVE_ALIGNMENT
instead of VEC_SIZE.
(REGISTER_SAVE_BND0): Removed.
(REGISTER_SAVE_BND1): Likewise.
(REGISTER_SAVE_BND3): Likewise.
(REGISTER_SAVE_RAX): Always defined to 0.
(VMOV): Removed.
(_dl_runtime_resolve_avx): Likewise.
(_dl_runtime_resolve_avx_slow): Likewise.
(_dl_runtime_resolve_avx_opt): Likewise.
(_dl_runtime_resolve_avx512): Likewise.
(_dl_runtime_resolve_avx512_opt): Likewise.
(_dl_runtime_resolve_sse): Likewise.
(_dl_runtime_resolve_sse_vex): Likewise.
(USE_FXSAVE): New.
(_dl_runtime_resolve_fxsave): Likewise.
(USE_XSAVE): Likewise.
(_dl_runtime_resolve_xsave): Likewise.
(USE_XSAVEC): Likewise.
(_dl_runtime_resolve_xsavec): Likewise.
* sysdeps/x86_64/dl-trampoline.h (_dl_runtime_resolve_avx512):
Removed.
(_dl_runtime_resolve_avx512_opt): Likewise.
(_dl_runtime_resolve_avx): Likewise.
(_dl_runtime_resolve_avx_opt): Likewise.
(_dl_runtime_resolve_sse): Likewise.
(_dl_runtime_resolve_sse_vex): Likewise.
(_dl_runtime_resolve_fxsave): New.
(_dl_runtime_resolve_xsave): Likewise.
(_dl_runtime_resolve_xsavec): Likewise.
H.J. Lu [Sun, 22 Oct 2017 11:14:54 +0000 (04:14 -0700)]
x86: Add x86_64 to x86-64 HWCAP [BZ #22093]
Before glibc 2.26, ld.so set dl_platform to "x86_64" and searched the
"x86_64" subdirectory when loading a shared library. ld.so in glibc
2.26 was changed to set dl_platform to "haswell" or "xeon_phi", based
on supported ISAs. This led to shared library loading failure for
shared libraries placed under the "x86_64" subdirectory.
This patch adds "x86_64" to x86-64 dl_hwcap so that ld.so will always
search the "x86_64" subdirectory when loading a shared library.
NB: We can't set x86-64 dl_platform to "x86-64" since ld.so will skip
the "haswell" and "xeon_phi" subdirectories on "haswell" and "xeon_phi"
machines.
This patch syncs posix/glob.c implementation with gnulib version b5ec983 (glob: simplify symlink detection). The only difference
to gnulib code is
* DT_UNKNOWN, DT_DIR, and DT_LNK definition in the case there
were not already defined. Gnulib code which uses
HAVE_STRUCT_DIRENT_D_TYPE will redefine them wrongly because
GLIBC does not define HAVE_STRUCT_DIRENT_D_TYPE. Instead
the patch check for each definition instead.
Also, the patch requires additional globfree and globfree64 files
for compatibility version on some architectures. Also the code
simplification leads to not macro simplification (not need for
NO_GLOB_PATTERN_P anymore).
Checked on x86_64-linux-gnu and on a build using build-many-glibcs.py
for all major architectures.
[BZ #1062]
* posix/Makefile (routines): Add globfree, globfree64, and
glob_pattern_p.
* posix/flexmember.h: New file.
* posix/glob_internal.h: Likewise.
* posix/glob_pattern_p.c: Likewise.
* posix/globfree.c: Likewise.
* posix/globfree64.c: Likewise.
* sysdeps/gnu/globfree64.c: Likewise.
* sysdeps/unix/sysv/linux/alpha/globfree.c: Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/n64/globfree64.c: Likewise.
* sysdeps/unix/sysv/linux/oldglob.c: Likewise.
* sysdeps/unix/sysv/linux/wordsize-64/globfree64.c: Likewise.
* sysdeps/unix/sysv/linux/x86_64/x32/globfree.c: Likewise.
* sysdeps/wordsize-64/globfree.c: Likewise.
* sysdeps/wordsize-64/globfree64.c: Likewise.
* posix/glob.c (HAVE_CONFIG_H): Use !_LIBC instead.
[NDEBUG): Remove comments.
(GLOB_ONLY_P, _AMIGA, VMS): Remove define.
(dirent_type): New type. Use uint_fast8_t not
uint8_t, as C99 does not require uint8_t.
(DT_UNKNOWN, DT_DIR, DT_LNK): New macros.
(struct readdir_result): Use dirent_type. Do not define skip_entry
unless it is needed; this saves a byte on platforms lacking d_ino.
(readdir_result_type, readdir_result_skip_entry):
New functions, replacing ...
(readdir_result_might_be_symlink, readdir_result_might_be_dir):
these functions, which were removed. This makes the callers
easier to read. All callers changed.
(D_INO_TO_RESULT): Now empty if there is no d_ino.
(size_add_wrapv, glob_use_alloca): New static functions.
(glob, glob_in_dir): Check for size_t overflow in several places,
and fix some size_t checks that were not quite right.
Remove old code using SHELL since Bash no longer
uses this.
(glob, prefix_array): Separate MS code better.
(glob_in_dir): Remove old Amiga and VMS code.
(globfree, __glob_pattern_type, __glob_pattern_p): Move to
separate files.
(glob_in_dir): Do not rely on undefined behavior in accessing
struct members beyond their bounds. Use a flexible array member
instead
(link_stat): Rename from link_exists2_p and return -1/0 instead of
0/1. Caller changed.
(glob): Fix memory leaks.
* posix/glob64 (globfree64): Move to separate file.
* sysdeps/gnu/glob64.c (NO_GLOB_PATTERN_P): Remove define.
(globfree64): Remove hidden alias.
* sysdeps/unix/sysv/linux/Makefile (sysdeps_routines): Add
oldglob.
* sysdeps/unix/sysv/linux/alpha/glob.c (__new_globfree): Move to
separate file.
* sysdeps/unix/sysv/linux/i386/glob64.c (NO_GLOB_PATTERN_P): Remove
define.
Move compat code to separate file.
* sysdeps/wordsize-64/glob.c (globfree): Move definitions to
separate file.
Florian Weimer [Fri, 20 Oct 2017 02:10:15 +0000 (04:10 +0200)]
sysconf: Fix missing definition of UIO_MAXIOV on Linux [BZ #22321]
After commit 37f802f86400684c8d13403958b2c598721d6360 (Remove
__need_IOV_MAX and __need_FOPEN_MAX), UIO_MAXIOV is no longer supplied
(indirectly) through <bits/stdio_lim.h>, so sysdeps/posix/sysconf.c no
longer sees the definition.
Remove conditional on LDBL_MANT_DIG from e_lgammal_r.c
The IEEE 754 implementation of lgammal in sysdeps/ieee754/ldbl-128/ used
to be shared by IBM's implementation in sysdeps/ieee754/ldbl-128ibm/ (by
an inclusion of the source file). In order for the algorithm to work
for IBM's implementation, a check for LDBL_MANT_DIG was required. Since
the source file is no longer shared, the requirement for the check is
gone. This patch removes the conditionals.
ldbl-128ibm: Automatic replacing of _Float128 and L()
The ldbl-128ibm implementation of j0l, j1l, lgammal_r, and cbrtl, as
well as the tables used by expl were copied from ldbl-128. However, the
original files used _Float128 for the type and L() for the literal
suffix. This patch uses the following sed command to rewrite _Float128
as long double and L(x) as xL (for e_expl.c, e_j0l.c, e_j1l.c,
e_lgammal_r.c, and t_expl.h):
sed -i <filename> \
-e "/^#define _Float128 long double/d" \
-e "/^#define L(x) x ## L/d" \
-e "/L(/s/)/L/" \
-e "/L(/s/L(//" \
-e "s/_Float128/long double/g"
For sysdeps/ieee754/ldbl-128ibm/s_cbrtl.c, this sed command incorrectly
replaces a few occurrences of L(), so the following command is used
instead:
* sysdeps/ieee754/ldbl-128ibm/e_expl.c: Remove definitions of
_Float128 and L().
* sysdeps/ieee754/ldbl-128ibm/e_j0l.c: Remove definitions of
_Float128 and L(). Replace _Float128 with long double and L(x)
with xL, throughout the file.
* sysdeps/ieee754/ldbl-128ibm/e_j1l.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_lgammal_r.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_cbrtl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/t_expl.h: Likewise.
ldbl-128ibm: Copy implementations from ldbl-128 instead of including them
Some files under sysdeps/ieee754/ldbl-128ibm/ are able to reuse the
implementation in sysdeps/ieee754/ldbl-128/ by defining _Float128 to
long double. This relied on compiler support for _Float128 being
disabled. On powerpc, such support was disabled by default, however, it
got enabled by default [1] in GCC 8.
This patch copies the implementations from ldbl-128 to ldbl-128ibm. The
uses of _Float128 and L() are kept intact in this patch and are replaced
with a script in a subsequent patch.
* sysdeps/ieee754/ldbl-128ibm/e_expl.c: Include tables from
sysdeps/ieee754/ldbl-128ibm.
* sysdeps/ieee754/ldbl-128ibm/e_j0l.c: Copy contents from the
equivalent implementation in sysdeps/ieee754/ldbl-128/ instead
of including it. Keep _Float128 and L() intact. These will be
reviewed by a separate patch.
* sysdeps/ieee754/ldbl-128ibm/e_j1l.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_lgammal_r.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_cbrtl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/t_expl.h: Likewise.
powerpc: Add redirection for finitef128, isinf128, and isnanf128
On powerpc64le, compiler support for float128 is not enabled by default
on gcc. To enable it, the flag -mfloat128 must be passed as a command
line option to the compiler. This means that only the few files that
actively have -mfloat128 passed as an argument get compiler support for
float128, whereas all other files don't.
When -mfloat128 becomes enabled by default on powerpc [1], all the files
that do not currently have compiler support for float128 enabled during
their compilation, will start to have it. This will lead to build
errors in s_finite.c, s_isinf.c, and s_isnan.c.
The errors are due to the unintended macro expansion of __finitef128 to
__redirect_finitef128 in math/bits/mathcalls-helper-functions.h. In
that header, __MATHDECL_1 takes '__finite' and 'f128' as arguments and
concatenates them. However, since '__finite' has been redefined in
s_finite.c, the function declaration becomes __redirect_finitef128:
extern int __redirect___finitef128 (_Float128 __value) __attribute__ ((__nothrow__ )) __attribute__ ((__const__));
This declaration itself is OK. The problem arises when include/math.h
creates the hidden prototype ('hidden_proto (__finitef128)'), which
expands to:
Since __finitef128 is not declared, __typeof fails. This effect was
already true for the 'float' and 'long double' versions and is now true
for float128. Likewise for isinsff128 and isnanf128.
This patch defines __finitef128 as __redirect___finitef128 in
sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite.c, similarly to what's
done for the float and long double versions of these functions, to get
rid of the build error. Likewise for isinff128 and isnanf128.
powerpc64le: Add -mfloat128 to tst-strtod-nan-locale testcase
On powerpc64le, not all files can have the flag -mfloat128 passed as an
option on the compile command, since that could conflict with other
flags, such as -mno-vsx. Each file that needs the flag, gets it through
a CFLAGS-filename variable on sysdeps/powerpc/powerpc64le/Makefile.
The test cases tst-strtod-nan-locale and tst-wcstod-nan-locale are
missing this flag.
Tested for powerpc64le.
* sysdeps/powerpc/powerpc64le/Makefile
(CFLAGS-tst-strtod-nan-locale.c): New variable.
(CFLAGS-tst-wcstod-nan-locale.c): New variable.
aarch64: Optimized implementation of memmove for Qualcomm Falkor
This is an optimized memmove implementation for the Qualcomm Falkor
processor core. Due to the way the falkor memcpy needs to be written,
code cannot be easily shared between memmove and memcpy like in case
of other aarch64 memcpy implementations due to which this routine is
separate. The underlying principle is the same as that of memcpy
where it tries to use registers with the same lower 4 bits for
fetching the same stream, thus optimizing hardware prefetcher
performance.
The memcpy copy loop copies 64 bytes at a time using the same register
pair since that's the way to train the hardware prefetcher on the
falkor core. memmove cannot quite do that since it needs to avoid
overlaps, so it does the next best thing, i.e. has a 32 byte loop with
a 32 byte end (prefetch a loop ahead to account for overlapping
locations) with register pairs that alias so that they hit the same
prefetcher. Due to this difference in loop size, they have to
currently be separate implementations but efforts are on to try and
get memmove to fall back into memcpy whenever it can without simply
duplicating all of the code.
Performance:
The routine fares around 20-25% better than the generic memmove for
most medium to large sizes (i.e. > 128 bytes) for the new walking
memmove benchmark (memmove-walk) with an unexplained regression
between 1K and 2K. The minor regression is something worth looking
into for us, but the remaining gains are significant enough that we
would like this included upstream as we looking into the cause for the
regression. Here is a snippet of the numbers as generated from the
microbenchmark by the compare_strings script. Comparisons are against
__memmove_generic:
aarch64: Optimized memcpy for Qualcomm Falkor processor
This is an optimized implementation of the memcpy routine that gives a
significant gain in performance for all sizes of copies on the
Qualcomm Falkor processor. A detailed rationale of the implementation
is written in a comment in the patch.
This implementation improves time for copies up to 128 bytes by up to
15% and for larger copies by up to 35% in the glibc
microbenchmark. The memcpy-random benchmark sees improvements in all
sizes in the range of 13%-18%.
Here are the full numbers extracted from the glibc microbenchmark
using the commands:
Carlos O'Donell [Thu, 28 Sep 2017 17:05:18 +0000 (11:05 -0600)]
malloc: Fix tcache leak after thread destruction [BZ #22111]
The malloc tcache added in 2.26 will leak all of the elements remaining
in the cache and the cache structure itself when a thread exits. The
defect is that we do not set tcache_shutting_down early enough, and the
thread simply recreates the tcache and places the elements back onto a
new tcache which is subsequently lost as the thread exits (unfreed
memory). The fix is relatively simple, move the setting of
tcache_shutting_down earlier in tcache_thread_freeres. We add a test
case which uses mallinfo and some heuristics to look for unaccounted for
memory usage between the start and end of a thread start/join loop. It
is very reliable at detecting that there is a leak given the number of
iterations. Without the fix the test will consume 122MiB of leaked
memory.
H.J. Lu [Wed, 4 Oct 2017 00:41:32 +0000 (17:41 -0700)]
test-math-iscanonical.cc: Replace bool with int
Fix GCC 7 compilation error:
test-math-iscanonical.cc: In function ‘void check_type()’:
test-math-iscanonical.cc:33:11: error: use of an operand of type ‘bool’ in ‘operator++’ is deprecated [-Werror=deprecated]
errors++;
^~
Since not all non-zero error counts are errors, return errors != 0
instead.
Add C++ versions of iscanonical for ldbl-96 and ldbl-128ibm (bug 22235)
All representations of floating-point numbers in types with IEC 60559
binary exchange format are canonical. On the other hand, types with IEC
60559 extended formats, such as those implemented under ldbl-96 and
ldbl-128ibm, contain representations that are not canonical.
TS 18661-1 introduced the type-generic macro iscanonical, which returns
whether a floating-point value is canonical or not. In Glibc, this
type-generic macro is implemented using the macro __MATH_TG, which, when
support for float128 is enabled, relies on __builtin_types_compatible_p
to select between floating-point types. However, this use of
iscanonical breaks C++ applications, because the builtin is only
available in C mode.
This patch provides a C++ implementation of iscanonical that relies on
function overloading, rather than builtins, to select between
floating-point types.
Unlike the C++ implementations for iszero and issignaling, this
implementation ignores __NO_LONG_DOUBLE_MATH. The double type always
matches IEC 60559 double format, which is always canonical. Thus, when
double and long double are the same (__NO_LONG_DOUBLE_MATH), iscanonical
always returns 1 and is not implemented with __MATH_TG.
Tested for powerpc64, powerpc64le and x86_64.
[BZ #22235]
* math/math.h: Trivial fix for unbalanced parentheses in comment.
* math/Makefile [CXX] (tests): Add test-math-iscanonical.cc.
(CFLAGS-test-math-iscanonical.cc): New variable.
* math/test-math-iscanonical.cc: New file.
* sysdeps/ieee754/ldbl-96/bits/iscanonical.h (iscanonical):
Provide a C++ implementation based on function overloading,
rather than using __MATH_TG, which uses C-only builtins.
* sysdeps/ieee754/ldbl-128ibm/bits/iscanonical.h (iscanonical):
Likewise.
* sysdeps/powerpc/powerpc64le/Makefile
(CFLAGS-test-math-iscanonical.cc): New variable.
Refactor long double information into bits/long-double.h.
resulted in sparc32 configurations installing the ldbl-opt version of
bits/long-double.h instead of the intended
sysdeps/unix/sysv/linux/sparc version.
For sparc32 by itself, this is not a problem, since the ldbl-opt
version is correct for sparc32. However, both sparc32 and sparc64 are
supposed to install sets of headers that work for both of them, so
that a single sysroot, whichever order the libraries are built and
installed in, works for both. The effect of having the wrong version
installed is that you end up with a miscompiled sparc64 libstdc++
which fails glibc's configure tests for the C++ compiler.
This patch moves the header from sysdeps/unix/sysv/linux/sparc to
separate copies of the same file for sparc32 and sparc64, to ensure it
comes before ldbl-opt in the sysdeps directory ordering.
Tested with build-many-glibcs.py for sparc64-linux-gnu and
sparcv9-linux-gnu.
[BZ #21987]
* sysdeps/unix/sysv/linux/sparc/bits/long-double.h: Remove file
and copy to ...
* sysdeps/unix/sysv/linux/sparc/sparc32/bits/long-double.h:
... here.
* sysdeps/unix/sysv/linux/sparc/sparc64/bits/long-double.h:
... and here.
Joseph Myers [Thu, 28 Sep 2017 01:59:02 +0000 (01:59 +0000)]
Fix nearbyint arithmetic moved before feholdexcept (bug 22225).
In <https://sourceware.org/ml/libc-alpha/2013-05/msg00722.html> I
remarked on the possibility of arithmetic in various nearbyint
implementations being scheduled before feholdexcept calls, resulting
in spurious "inexact" exceptions.
I'm now actually observing this occurring in glibc built for ARM with
GCC 7 (in fact, both copies of the same addition/subtraction sequence
being combined and moved out before the conditionals and
feholdexcept/fesetenv pairs), resulting in test failures.
This patch makes the nearbyint implementations with this particular
feholdexcept / arithmetic / fesetenv pattern consistently use
math_opt_barrier on the function argument when first used in
arithmetic, and also consistently use math_force_eval before fesetenv
(the latter was generally already done, but the dbl-64/wordsize-64
implementation used math_opt_barrier instead, and as
math_opt_barrier's intended effect is through its output value being
used, such a use that doesn't use the return value is suspect).
Tested for x86_64 (--disable-multi-arch so more of these
implementations get used), and for ARM in a configuration where I saw
the problem scheduling.
[BZ #22225]
* sysdeps/ieee754/dbl-64/s_nearbyint.c (__nearbyint): Use
math_opt_barrier on argument when doing arithmetic on it.
* sysdeps/ieee754/dbl-64/wordsize-64/s_nearbyint.c (__nearbyint):
Likewise. Use math_force_eval not math_opt_barrier after
arithmetic.
* sysdeps/ieee754/flt-32/s_nearbyintf.c (__nearbyintf): Use
math_opt_barrier on argument when doing arithmetic on it.
* sysdeps/ieee754/ldbl-128/s_nearbyintl.c (__nearbyintl):
Likewise.
Let fpclassify use the builtin when optimizing for size in C++ mode (bug 22146)
When optimization for size is on (-Os), fpclassify does not use the
type-generic __builtin_fpclassify builtin, instead it uses __MATH_TG.
However, when library support for float128 is available, __MATH_TG uses
__builtin_types_compatible_p, which is not available in C++ mode.
On the other hand, libstdc++ undefines (in cmath) many macros from
math.h, including fpclassify, so that it can provide its own functions.
However, during its configure tests, libstdc++ just tests for the
availability of the macros (it does not undefine them, nor does it
provide its own functions).
Finally, when libstdc++ is configured with optimization for size
enabled, its configure tests include math.h and get the definition of
fpclassify that uses __MATH_TG (and __builtin_types_compatible_p).
Since libstdc++ does not undefine the macros during its configure tests,
they fail.
This patch lets fpclassify use the builtin in C++ mode, even when
optimization for size is on. This allows the configure test in
libstdc++ to work.
Tested for powerpc64le and x86_64.
[BZ #22146]
math/math.h: Let fpclassify use the builtin in C++ mode, even
when optimazing for size.
Declare ifunc resolver to return a pointer to the same type as the target
function to help GCC detect incompatibilities between the two when it's
enhanced to do so.
builds for powerpc64le fail in the declaration of some ifunc resolvers,
because the ifunc is declared with unmatching return types. One of the
declarations comes from the __ifunc_resolver macro, which was patched by
the aforementioned commit:
/* Helper / base macros for indirect function symbols. */
#define __ifunc_resolver(type_name, name, expr, arg, init, classifier) \
classifier inhibit_stack_protector \
__typeof (type_name) *name##_ifunc (arg) \
whereas the other comes from the unpatched __ifunc macro when
HAVE_GCC_IFUNC is not defined:
This patch changes the return type of the ifunc resolver in the __ifunc
macro, so that it matches the return type of the target function,
similarly to what the aforementioned commit does.
Tested for powerpc64le and s390x with unpatched GCC.
* include/libc-symbols.h: [!defined HAVE_GCC_IFUNC] (__ifunc):
Change the return type of the ifunc resolver to match the return
type of the target function.
Martin Sebor [Tue, 22 Aug 2017 15:35:23 +0000 (09:35 -0600)]
Declare ifunc resolver to return a pointer to the same type as the target
function to help GCC detect incompatibilities between the two when it's
enhanced to do so.
Alan Modra [Thu, 3 Aug 2017 06:09:21 +0000 (15:39 +0930)]
tst-tlsopt-powerpc as a shared lib
This makes the __tls_get_addr_opt test run as a shared library, and so
actually test that DTPMOD64/DTPREL64 pairs are processed by ld.so to
support the __tls_get_adfr_opt call stub fast return. After a
2017-01-24 patch (binutils f0158f4416) ld.bfd no longer emitted
unnecessary dynamic relocations against local thread variables,
instead setting up the __tls_index GOT entries for the call stub fast
return. This meant tst-tlsopt-powerpc passed but did not check ld.so
relocation support. After a 2017-07-16 patch (binutils 676ee2b5fa)
ld.bfd no longer set up the __tls_index GOT entries for the call stub
fast return, and tst-tlsopt-powerpc failed.
Compiling mod-tlsopt-powerpc.c with -DSHARED exposed a bug in
powerpc64/tls-macros.h, which defines a __TLS_GET_ADDR macro that
clashes with one defined in dl-tls.h. The tls-macros.h version is
only used in that file, so delete it and expand.
* sysdeps/powerpc/mod-tlsopt-powerpc.c: Extract from
tst-tlsopt-powerpc.c with function name change and no test harness.
* sysdeps/powerpc/tst-tlsopt-powerpc.c: Remove body of test.
Call tls_get_addr_opt_test.
* sysdeps/powerpc/Makefile (LDFLAGS-tst-tlsopt-powerpc): Don't define.
(modules-names): Add mod-tlsopt-powerpc.
(mod-tlsopt-powerpc.so-no-z-defs): Define.
(tst-tlsopt-powerpc): Depend on .so.
* sysdeps/powerpc/powerpc64/tls-macros.h (__TLS_GET_ADDR): Don't
define. Expand use in TLS_GD and TLS_LD.
H.J. Lu [Wed, 23 Aug 2017 15:22:52 +0000 (08:22 -0700)]
string/stratcliff.c: Replace int with size_t [BZ #21982]
Fix GCC 7 errors when string/stratcliff.c is compiled with -O3:
stratcliff.c: In function ‘do_test’:
cc1: error: assuming signed overflow does not occur when assuming that (X - c) <= X is always true [-Werror=strict-overflow]
[BZ #21982]
* string/stratcliff.c (do_test): Declare size, nchars, inner,
middle and outer with size_t instead of int. Repleace %d and
%Zd with %zu in printf. Update "MAX (0, nchars - 128)" and
"MAX (outer, nchars - 64)" to support unsigned outer and
nchars. Also exit loop when outer == 0.
H.J. Lu [Thu, 31 Aug 2017 13:28:31 +0000 (06:28 -0700)]
Place $(elf-objpfx)sofini.os last [BZ #22051]
Since sofini.os terminates .eh_frame section, it should be placed last.
[BZ #22051]
* Makerules (build-module-helper-objlist): Filter out
$(elf-objpfx)sofini.os.
(build-shlib-objlist): Append $(elf-objpfx)sofini.os if it is
needed.
getaddrinfo: Fix error handling in gethosts [BZ #21915] [BZ #21922]
The old code uses errno as the primary indicator for success or
failure. This is wrong because errno is only set for specific
combinations of the status return value and the h_errno variable.
This simplifies the code because it is not necessary to propagate the
temporary h_errno value to the thread-local variable. It also increases
compatibility with NSS modules which update only one of the two places.
Joseph Myers [Tue, 22 Aug 2017 00:30:51 +0000 (00:30 +0000)]
Fix position of tests-unsupported definition in assert/Makefile.
tests-unsupported has to be defined before the inclusion of Rules in a
subdirectory Makefile; otherwise it is ineffective. This patch fixes
the ordering in assert/Makefile, where a recent test addition put
tests-unsupported too late (resulting in build failures when the C++
compiler was missing or broken, and thereby showing up the unrelated
bug 21987).
Incidentally, I don't see why these tests depend on
$(have-cxx-thread_local) rather than just a working C++ compiler.
Tested in such a configuration (broken compiler/libstdc++) with
build-many-glibcs.py.
Provide a C++ version of iszero that does not use __MATH_TG (bug 21930)
When signaling nans are enabled (with -fsignaling-nans), the C++ version
of iszero uses the fpclassify macro, which is defined with __MATH_TG.
However, when support for float128 is available, __MATH_TG uses the
builtin __builtin_types_compatible_p, which is only available in C mode.
This patch refactors the C++ version of iszero so that it uses function
overloading to select between the floating-point types, instead of
relying on fpclassify and __MATH_TG.
Tested for powerpc64le, s390x, x86_64, and with build-many-glibcs.py.
[BZ #21930]
* math/math.h [defined __cplusplus && defined __SUPPORT_SNAN__]
(iszero): New C++ implementation that does not use
fpclassify/__MATH_TG/__builtin_types_compatible_p, when
signaling nans are enabled, since __builtin_types_compatible_p
is a C-only feature.
* math/test-math-iszero.cc: When __HAVE_DISTINCT_FLOAT128 is
defined, include ieee754_float128.h for access to the union and
member ieee854_float128.ieee.
[__HAVE_DISTINCT_FLOAT128] (do_test): Call check_float128.
[__HAVE_DISTINCT_FLOAT128] (check_float128): New function.
* sysdeps/powerpc/powerpc64le/Makefile [subdir == math]
(CXXFLAGS-test-math-iszero.cc): Add -mfloat128 to the build
options of test-math-zero on powerpc64le.
Fix the C++ version of issignaling when __NO_LONG_DOUBLE_MATH is defined
When __NO_LONG_DOUBLE_MATH is defined, __issignalingl is not available,
thus issignaling with long double argument should call __issignaling,
instead.
Tested for powerpc64le.
* math/math.h [defined __cplusplus] (issignaling): In the long
double case, call __issignalingl only if __NO_LONG_DOUBLE_MATH
is not defined. Call __issignaling, otherwise.
Provide a C++ version of issignaling that does not use __MATH_TG
The macro __MATH_TG contains the logic to select between long double and
_Float128, when these types are ABI-distinct. This logic relies on
__builtin_types_compatible_p, which is not available in C++ mode.
On the other hand, C++ function overloading provides the means to
distinguish between the floating-point types. The overloading
resolution will match the correct parameter regardless of type
qualifiers, i.e.: const and volatile.
Tested for powerpc64le, s390x, and x86_64.
* math/math.h [defined __cplusplus] (issignaling): Provide a C++
definition for issignaling that does not rely on __MATH_TG,
since __MATH_TG uses __builtin_types_compatible_p, which is only
available in C mode.
(CFLAGS-test-math-issignaling.cc): New variable.
* math/Makefile [CXX] (tests): Add test-math-issignaling.
* math/test-math-issignaling.cc: New test for C++ implementation
of type-generic issignaling.
* sysdeps/powerpc/powerpc64le/Makefile [subdir == math]
(CXXFLAGS-test-math-issignaling.cc): Add -mfloat128 to the build
options of test-math-issignaling on powerpc64le.
Do not use __builtin_types_compatible_p in C++ mode (bug 21930)
The logic to define isinf for float128 depends on the availability of
__builtin_types_compatible_p, which is only available in C mode,
however, the conditionals do not check for C or C++ mode. This lead to
an error in libstdc++ configure, as reported by bug 21930.
This patch adds a conditional for C mode in the definition of isinf for
float128. No definition is provided in C++ mode, since libstdc++
headers undefine isinf.
Tested for powerpc64le (glibc test suite and libstdc++-v3 configure).
[BZ #21930]
* math/math.h (isinf): Check if in C or C++ mode before using
__builtin_types_compatible_p, since this is a C mode feature.
powerpc: Restrict xssqrtqp operands to Vector Registers (bug 21941)
POWER ISA 3.0 introduces the xssqrtqp instructions, which expects
operands to be in Vector Registers (Altivec/VMX), even though this
instruction belongs to the Vector-Scalar Instruction Set.
In GCC's Extended Assembly for POWER, the 'wq' register constraint is
provided for use with IEEE 754 128-bit floating-point values. However,
this constraint does not limit the register allocation to Vector
Registers (Altivec/VMX) and could assign a Vector-Scalar Register (VSX)
to the operands of the instruction.
This patch changes the register constraint used in sqrtf128 from 'wq' to
'v', in order to request a Vector Register (Altivec/VMX) for use with
the xssqrtqp instruction.
Tested for powerpc64le and --with-cpu=power9.
[BZ #21941]
* sysdeps/powerpc/fpu/math_private.h (__ieee754_sqrtf128): Since
xssqrtqp requires operands to be in Vector Registers
(Altivec/VMX), replace the register constraint 'wq' with 'v'.
* sysdeps/powerpc/powerpc64le/power9/fpu/e_sqrtf128.c
(__ieee754_sqrtf128): Likewise.
posix: Set p{read,write}v2 to return ENOTSUP (BZ#21780)
Different than other architectures hppa-linux-gnu define different values
for ENOTSUP and EOPNOTSUPP, where the later is a Linux specific one.
This leads to tst-preadwritev{64}v2 tests failures:
$ ./testrun.sh misc/tst-preadvwritev2
error: tst-preadvwritev2-common.c:35: preadv2 failure did not set errno to ENOTSUP (223)
error: 1 test failures
The straightforward fix is to return the POSIX defined ENOTSUP on all
p{read,write}v{64}v2 implementations instead of Linux specific one.
Checked on x86_64-linux-gnu and the tst-preadwritev{64}v2 on
hppa-linux-gnu (although due the installed kernel on my testing system
the pwritev{64}v2 with an invalid flag still fails due a known kernel
issue [1]).
since xgetbv takes much more cycles than single cycle operations like
vpord/vvpcmpeq/ptest. _dl_runtime_resolve_opt should be used only with
AVX512 where AVX512 instructions lead to lower CPU frequency on Skylake
server.
[BZ #21871]
* sysdeps/x86/cpu-features.c (init_cpu_features): Set
bit_arch_Use_dl_runtime_resolve_opt only with AVX512F.
Aurelien Jarno [Thu, 3 Aug 2017 22:35:48 +0000 (22:35 +0000)]
Fix the return type of the getentropy stub
The return type of the getentropy stub is wrongly defined as ssize_t,
while both the <sys/random.h> header and the Linux implementation
define it as int. This patch fixes that.
Carlos O'Donell [Sat, 29 Jul 2017 04:02:03 +0000 (00:02 -0400)]
mutex: Fix robust mutex lock acquire (Bug 21778)
65810f0ef05e8c9e333f17a44e77808b163ca298 fixed a robust mutex bug but
introduced BZ 21778: if the CAS used to try to acquire a lock fails, the
expected value is not updated, which breaks other cases in the loce
acquisition loop. The fix is to simply update the expected value with
the value returned by the CAS, which ensures that behavior is as if the
first case with the CAS never happened (if the CAS fails).
This is a regression introduced in the last release.
Tested on x86_64, i686, ppc64, ppc64le, s390x, aarch64, armv7hl.
microblaze: Resolve non-relocatable branch in pt-vfork.S (BZ#21779)
The relative branch directly to __libc_vfork results in an relocation
that cannot be resolved. Specifically a R_MICROBLAZE_64_PCREL relocation
is created for this branch, however for MicroBlaze R_MICROBLAZE_64_PCREL
type relocations symbols are not resolved. Additionally due to the
branch being located in the .text section the instruction cannot be
rewritten as the section is not writable, and causes a segfault at
runtime when loading libpthread.
To resolve this issue, ensure the branch is done using PLT. This removes
the need to modify the instruction and trades the R_MICROBLAZE_64_PCREL
for a more common R_MICROBLAZE_JUMP via the PLT.
[BZ #21779]
* sysdeps/unix/sysv/linux/microblaze/pt-vfork.S: Branch using PLT.
Carlos O'Donell [Fri, 28 Jul 2017 04:22:44 +0000 (00:22 -0400)]
rwlock: Fix explicit hand-over (bug 21298)
Without this fix, the rwlock can fail to execute the explicit hand-over
in certain cases (e.g., empty critical sections that switch quickly between
read and write phases). This can then lead to errors in how __wrphase_futex
is accessed, which in turn can lead to deadlocks.
Mike FABIAN [Thu, 27 Jul 2017 09:04:38 +0000 (11:04 +0200)]
Minor improvements to new az_IR locale
* locales/az_IR (LC_MESSAGES): Improve yesexpr and noexpr.
* locales/az_IR (LC_ADDRESS): Fix typo in comment and
use the individual iso-639-3 code for South Azerbaijani
"azb" in lang_term.
* locales/az_IR (LC_NAME): Improve readability of name_fmt in source.
Rical Jasan [Tue, 20 Jun 2017 13:39:27 +0000 (06:39 -0700)]
manual: Refactor documentation of CHAR_BIT.
This single-@item @table is better defined with @deftypevr, since the
CHAR_BIT macro has @standards (being declared in a header), and @items
in @tables are not considered annotatable. Using @deftypevr
automatically includes the macro in the Variable and Constant Macro
Index and ensures its inclusion the Summary of Library Facilities.
@deftypevr is used to record the type of the macro so that it also
appears in the Summary.
The description is updated to mention a later POSIX requirement that
this macro have the value 8.
* manual/lang.texi (CHAR_BIT): Convert from an @table to an
@deftypevr. Change standard from ISO to C90. Mention the
POSIX.1-2001 requirement of the value 8.