Shawn Landden [Wed, 26 Jun 2019 14:52:45 +0000 (10:52 -0400)]
PPC64: Add libmvec SIMD double-precision power function [BZ #24210]
Based off the ./sysdeps/ieee754/dbl-64/pow.c implementation,
and provides identical results.
Unlike other libmvec functions, this sets the underflow and overflow bits.
The caller can check these flags, and possibly re-run the calculations with
scalar pow to figure out what is causing the overflow or underflow.
I may have not normalized the data for benchmarking this properly,
but operating only on integers between 0-2^32 and floats between 0.5 and
1 I get the following:
Running 20 times over 32MiB
vector: mean 535.824919 (sd 0.246088)
scalar: mean 286.384220 (sd 0.027630)
Which is a very impressive speed boost.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
PPC64: Add libmvec SIMD single-precision power function [BZ #24210]
Based off the ./sysdeps/ieee754/flt-32/powf.c implementation,
and thus provides identical results.
Unlike other libmvec functions, this sets the underflow and overflow bits.
The caller can check these flags, and possibly re-run the calculations with
scalar powf to figure out what is causing the overflow or underflow.
I may have not normalized the data for benchmarking this properly,
but operating only on floats between 0.5 and 1 I get the following:
Running 20 times over 32MiB
vector: mean 307.659767 (sd 0.203217)
scalar: mean 221.837088 (sd 0.032256)
And with random data there is a decrease in performance:
vector: mean 265.366371 (sd 0.000626)
scalar: mean 279.598078 (sd 0.025592)
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
Shawn Landden [Wed, 29 May 2019 19:11:06 +0000 (16:11 -0300)]
PPC64: Add libmvec SIMD double-precision natural exponent function [BZ #24209]
Passes all tests.
Unlike other libmvec functions, this sets the underflow and overflow bits.
The caller can check these flags, and possibly re-run the calculations with
scalar expf to figure out what is causing the overflow or underflow.
The special-case path is not vectorized, and performs much worse than
the scalar code.
Normalized data: 1 to 2^32 converted to double
Running 20 times over 32MiB
vector: mean 563.807107 MiB/s (sd 0.390922)
scalar: mean 226.527824 MiB/s (sd 0.077406)
Random data:
vector: mean 80.175986 MiB/s (sd 1.110948)
scalar: mean 244.738130 MiB/s (sd 0.029561)
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
Shawn Landden [Mon, 27 May 2019 17:47:16 +0000 (12:47 -0500)]
PPC64: Add libmvec SIMD single-precision natural exponent function [BZ #24209]
Passes all tests.
Based off the ./sysdeps/ieee754/dbl-64/e_exp.c implementation,
and thus provides identical results.
Unlike other libmvec functions, this sets the underflow and overflow bits.
The caller can check these flags, and possibly re-run the calculations with
scalar expf to figure out what is causing the overflow or underflow.
Suprisingly the special-case path performs as well as the normal path.
(both of which are vectorized)
Running 20 times over 32MiB
vector: mean 432.263032 MiB/s (sd 0.486733)
scalar: mean 178.646197 MiB/s (sd 0.050013)
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
Bert Tenjy [Tue, 7 May 2019 22:04:11 +0000 (17:04 -0500)]
PPC64: Add libmvec SIMD single-precision logarithm function [BZ #24208]
Implements single-precision vector logarithm function. The algorithm is
an adaptation of the one in sysdeps/ieee754/flt-32/e_logf.c, modified for
PPC64 VSX hardware. The version of e_logf.c referenced here is from
commit #bf27d3973d.
The patch has been tested on both Little-Endian and Big-Endian. It
passes all the tests for single-precision logarithm run by make check with
max ULP of 1. Integration into the make check infrastructure is adapted from
similar x86_64 changes in commit #774488f88a.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
Bert Tenjy [Fri, 3 May 2019 17:45:45 +0000 (12:45 -0500)]
PPC64: Add libmvec SIMD double-precision logarithm function [BZ #24208]
Implements double-precision vector logarithm function. The algorithm is
an adaptation of the one in sysdeps/ieee754/dbl-64, modified to exploit
PPC64 VSX hardware. The version of ieee754/dbl-64 is commit #f41b0a43e4.
The patch has been tested on both Little-Endian and Big-Endian. It
passes all the tests for double-precision logarithm run by make check.
Integration into the make check infrastructure closely follows corres-
ponding changes done for x86_64 in commit #6af25acc7b.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
PPC64: Add libmvec SIMD single-precision sincosf function [BZ #24207]
Implements single-precision vector sincosf function. The polynomial approxima-
ting algorithm is adapted for PPC64 from x86_64 [commit #a6336cc446].
The patch has been tested on PPC64/POWER8 Little Endian and Big Endian.
Testing uses the framework created for libmvec on x86_64 which runs tests on
issuing 'make check'. Tests of the new vector sincosf function all pass.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
PPC64: Add libmvec SIMD double-precision sincos function [BZ #24207]
Implements double-precision vector sincos function. The polynomial approxima-
ting algorithm is adapted for PPC64 from x86_64 [commit #c9a8c526ac].
The patch has been tested on PPC64/POWER8 Little Endian and Big Endian.
Testing uses the framework created for libmvec on x86_64 which runs tests on
issuing 'make check'. Tests of the new vector sincos function all pass.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
Bert Tenjy [Thu, 7 Mar 2019 07:22:34 +0000 (07:22 +0000)]
PPC64: Add libmvec SIMD single-precision sine function [BZ #24206]
Implements single-precision vector sine function. The polynomial
sine-approximating algorithm is adapted for PPC64 from x86_64 [commit #2a8c2c7b33].
The patch has been tested on PPC64/POWER8 Little Endian and Big Endian.
Testing uses the framework created for libmvec on x86_64 which runs tests on
issuing 'make check'. Tests of the new vector single-precision sine function all pass.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
Bert Tenjy [Thu, 7 Mar 2019 06:36:48 +0000 (06:36 +0000)]
PPC64: Add libmvec SIMD double-precision sine function [BZ #24206]
Implements double-precision vector sine function. The polynomial
sine-approximating algorithm is adapted for PPC64 from x86_64 [commit #4b9c2b707b].
The patch has been tested on PPC64/POWER8 Little Endian and Big Endian.
Testing uses the framework created for libmvec on x86_64 which runs tests on
issuing 'make check'. Tests of the new vector sine function all pass.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
Bert Tenjy [Thu, 7 Mar 2019 05:17:47 +0000 (05:17 +0000)]
PPC64: Add libmvec SIMD single-precision cosine function [BZ #24205]
Implements single-precision cosine using VSX vector capability. The polynomial
cosine-approximating algorithm is adapted for PPC64 from x86_64 [commit #04f496d602].
The patch has been tested on PPC64/POWER8 Little Endian and Big Endian. It is
tested using the framework created for libmvec on x86_64 which runs tests on
issuing 'make check'. Tests of the new vector cosine function all pass.
Details on the ABI are found at this link:
<https://sourceware.org/glibc/wiki/
libmvec?action=AttachFile&do=view&target=VectorABI.txt>
But for adjusting the width of operands, details described for the
double-precision cosine implemented earlier apply here. See git
commit #7956c29f07 for that information.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
Bert Tenjy [Wed, 6 Mar 2019 21:52:48 +0000 (21:52 +0000)]
PPC64: Add libmvec SIMD double-precision cosine function [BZ #24205]
This is the 1st of 12 patches that will implement libmvec for PPC64 using
VSX hardware capabilities.
Implements double-precision cosine using VSX vector capability. Algorithm for
cosine is from x86_64 [commit #2193311288] adapted to PPC64.
Name-mangling exactly duplicates SSE ISA of the x86_64 ABI. The details are at
<https://sourceware.org/glibc/wiki/
libmvec?action=AttachFile&do=view&target=VectorABI.txt>
The patch has been tested on PPC64/POWER8 Little Endian and Big Endian. It is
tested using the framework created for libmvec on x86_64 which runs tests on
issuing 'make check'. Tests of the new vector cosine function all pass.
Library libmvec is built by default. To disable building it, pass flag
--disable-mathvec to the configure script.
A runtime check prevents vector tests running on systems lacking VSX hardware.
Glibc built with this patch was installed using the procedure outlined at
<https://sourceware.org/glibc/wiki/Testing/Builds>. Compiling against the new
library created a test executable which computes cosines using the vector
version of the function. The results are at most 2-ulps away from the scalar
cosine. That is expected and indicated in the comments describing the
algorithm - as obtained from x86_64 commit #2193311288.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
From the GNU C Library manual, the pkey_set can receive a combination of
PKEY_DISABLE_WRITE and PKEY_DISABLE_ACCESS. However PKEY_DISABLE_ACCESS
is more restrictive than PKEY_DISABLE_WRITE and includes its behavior.
The test expects that after setting
(PKEY_DISABLE_WRITE|PKEY_DISABLE_ACCESS) pkey_get should return the
same. This may not be true as PKEY_DISABLE_ACCESS will succeed in
describing the state of the key in this case.
The pkey behavior during signal handling is different between x86 and
POWER. This change make the test compatible with both architectures.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
Lukasz Majewski [Mon, 27 Jan 2020 14:37:07 +0000 (15:37 +0100)]
y2038: linux: Provide __gettimeofday64 implementation
In the glibc the gettimeofday can use vDSO (on power and x86 the
USE_IFUNC_GETTIMEOFDAY is defined), gettimeofday syscall or 'default'
___gettimeofday() from ./time/gettime.c (as a fallback).
In this patch the last function (___gettimeofday) has been refactored and
moved to ./sysdeps/unix/sysv/linux/gettimeofday.c to be Linux specific.
The new __gettimeofday64 explicit 64 bit function for getting 64 bit time from
the kernel (by internally calling __clock_gettime64) has been introduced.
Moreover, a 32 bit version - __gettimeofday has been refactored to internally
use __gettimeofday64.
The __gettimeofday is now supposed to be used on systems still supporting 32
bit time (__TIMESIZE != 64) - hence the necessary check for time_t potential
overflow and conversion of struct __timeval64 to 32 bit struct timespec.
The iFUNC vDSO direct call optimization has been removed from both i686 and
powerpc32 (USE_IFUNC_GETTIMEOFDAY is not defined for those architectures
anymore). The Linux kernel does not provide a y2038 safe implementation of
gettimeofday neither it plans to provide it in the future, clock_gettime64
should be used instead. Keeping support for this optimization would require
to handle another build permutation (!__ASSUME_TIME64_SYSCALLS &&
USE_IFUNC_GETTIMEOFDAY) which adds more complexity and has limited use
(since the idea is to eventually have a y2038 safe glibc build).
Run-time tests:
- Run specific tests on ARM/x86 32bit systems (qemu):
https://github.com/lmajewski/meta-y2038 and run tests:
https://github.com/lmajewski/y2038-tests/commits/master
Above tests were performed with Y2038 redirection applied as well as without
to test proper usage of both __gettimeofday64 and __gettimeofday.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
[Including some commit message improvement]
Florian Weimer [Tue, 18 Feb 2020 16:52:27 +0000 (17:52 +0100)]
Linux: Work around kernel bugs in chmod on /proc/self/fd paths [BZ #14578]
It appears that the ability to change symbolic link modes through such
paths is unintended. On several file systems, the operation fails with
EOPNOTSUPP, even though the symbolic link permissions are updated.
The expected behavior is a failure to update the permissions, without
file system changes.
Florian Weimer [Tue, 18 Feb 2020 13:42:41 +0000 (14:42 +0100)]
Introduce <elf-initfini.h> and ELF_INITFINI for all architectures
This supersedes the init_array sysdeps directory. It allows us to
check for ELF_INITFINI in both C and assembler code, and skip DT_INIT
and DT_FINI processing completely on newer architectures.
A new header file is needed because <dl-machine.h> is incompatible
with assembler code. <sysdep.h> is compatible with assembler code,
but it cannot be included in all assembler files because on some
architectures, it redefines register names, and some assembler files
conflict with that.
<elf-initfini.h> is replicated for legacy architectures which need
DT_INIT/DT_FINI support. New architectures follow the generic default
and disable it.
MIPS fallback code handle a frame where its FDE can not be obtained
(for instance a signal frame) by reading the kernel allocated signal frame
and adding '2' to the value of 'sc_pc' [1]. The added value is used to
recognize an end of an EH region on mips16 [2].
The fix adjust the obtained signal frame value and remove the libgcc added
value by checking if the previous frame is a signal frame one.
Checked with backtrace and tst-sigcontext-get_pc tests on mips-linux-gnu
and mips64-linux-gnu.
[1] libgcc/config/mips/linux-unwind.h from gcc code.
[2] gcc/config/mips/mips.h from gcc code. */
Florian Weimer [Tue, 18 Feb 2020 12:44:48 +0000 (13:44 +0100)]
Move implementation of <file_change_detection.h> into a C file
file_change_detection_for_stat partially initialize
struct file_change_detection in some cases, when the size member
alone determines the outcome of all comparisons. This results
in maybe-uninitialized compiler warnings in case of sufficiently
aggressive inlining.
Once the implementation is moved into a separate C file, this kind
of inlining is no longer possible, so the compiler warnings are gone.
Prepare redirections for IEEE long double on powerpc64le
All functions that have a format string, which can consume a long double
argument, must have one version for each long double format supported on
a platform. On powerpc64le, these functions currently have two versions
(i.e.: long double with the same format as double, and long double with
IBM Extended Precision format). Support for a third long double format
option (i.e. long double with IEEE long double format) is being prepared
and all the aforementioned functions now have a third version (not yet
exported on the master branch, but the code is in).
For these functions to get selected (during build time), references to
them in user programs (or dependent libraries) must get redirected to
the aforementioned new versions of the functions. This patch installs
the header magic required to perform such redirections.
Notice, however, that since the redirections only happen when
__LONG_DOUBLE_USES_FLOAT128 is set to 1, and no platform (including
powerpc64le) currently does it, no redirections actually happen.
Redirections and the exporting of the new functions will happen at the
same time (when powerpc64le adds ldbl-128ibm-compat to their Implies.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com> Reviewed-by: Paul E. Murphy <murphyp@linux.vnet.ibm.com>
Florian Weimer [Mon, 17 Feb 2020 10:30:17 +0000 (11:30 +0100)]
stdlib: Reduce namespace pollution in <inttypes.h>
The namespace pollution results in conform test failures if the tests
are run __USE_EXTERN_INLINES defined (e.g., when configuring with
CC="gcc -O3" CXX="g++ -O3").
Florian Weimer [Sat, 8 Feb 2020 18:58:43 +0000 (19:58 +0100)]
ld.so: Do not export free/calloc/malloc/realloc functions [BZ #25486]
Exporting functions and relying on symbol interposition from libc.so
makes the choice of implementation dependent on DT_NEEDED order, which
is not what some compiler drivers expect.
This commit replaces one magic mechanism (symbol interposition) with
another one (preprocessor-/compiler-based redirection). This makes
the hand-over from the minimal malloc to the full malloc more
explicit.
Removing the ABI symbols is backwards-compatible because libc.so is
always in scope, and the dynamic loader will find the malloc-related
symbols there since commit f0b2132b35248c1f4a80f62a2c38cddcc802aa8c
("ld.so: Support moving versioned symbols between sonames
[BZ #24741]").
riscv: Avoid clobbering register parameters in syscall
The riscv INTERNAL_SYSCALL macro might clobber the register
parameter if the argument itself might clobber any register (a function
call for instance).
This patch fixes it by using temporary variables for the expressions
between the register assignments (as indicated by GCC documentation,
6.47.5.2 Specifying Registers for Local Variables).
It is similar to the fix done for MIPS (bug 25523).
Checked with riscv64-linux-gnu-rv64imafdc-lp64d build.
microblaze: Avoid clobbering register parameters in syscall
The microblaze INTERNAL_SYSCALL macro might clobber the register
parameter if the argument itself might clobber any register (a function
call for instance).
This patch fixes it by using temporary variables for the expressions
between the register assignments (as indicated by GCC documentation,
6.47.5.2 Specifying Registers for Local Variables).
It is similar to the fix done for MIPS (bug 25523).
Checked with microblaze-linux-gnu and microblazeel-linux-gnu build.
It changes the mips INTERNAL_SYSCALL* and internal_syscall* macros
to return a negative value instead of the 'a3' register value on then
'err' macro argument.
The macro INTERNAL_SYSCALL_DECL is no longer required, and the
INTERNAL_SYSCALL_ERROR_P macro follows the other Linux kABIs.
The redefinition of INTERNAL_VSYSCALL_CALL is also no longer
required.
Checked on mips64-linux-gnu, mips64n32-linux-gnu, and mips-linux-gnu.
The mips64 Linux syscall macros only differs argument type and
the requirement of sign-extending values on n32. The headers
are consolidate by parameterizing the arguments with a new type,
__syscall_arg_t, and by defining the ARGIFY for n64.
Also, the generic unix mips64 sysdep is essentially the same,
only the load instruction need to be adjusted depending of the
ABI.
Checked on mips64-linux-gnu and mips64n32-linux-gnu.
alpha: Refactor syscall and Use Linux kABI for syscall return
It highly unlikely that alpha will be ported to anything else than
Linux, so this patch moves the generic unix syscall definition to
Linux and adapt it to Linux kernel ABI.
It changes the internal_syscall* macros to return a negative value
instead of the '$19' register value on the 'err' macro argument.
The macro INTERNAL_SYSCALL_DECL is no longer required, and the
INTERNAL_SYSCALL_ERROR_P macro follows the other Linux kABIs.
sparc: Avoid clobbering register parameters in syscall
The sparc INTERNAL_SYSCALL macro might clobber the register
parameter if the argument itself might clobber any register (a function
call for instance).
This patch fixes it by using temporary variables for the expressions
between the register assignments (as indicated by GCC documentation,
6.47.5.2 Specifying Registers for Local Variables).
It is similar to the fix done for MIPS (bug 25523).
Checked on sparc64-linux-gnu and sparcv9-linux-gnu.
It changes the sparc internal_syscall* macros to return a negative
value instead of the 'g1' register value in the 'err' macro argument.
The __SYSCALL_STRING macro is also changed to no set the 'g1'
value, since 'o1' already holds all the required information
to check if syscall has failed.
The macro INTERNAL_SYSCALL_DECL is no longer required, and the
INTERNAL_SYSCALL_ERROR_P macro follows the other Linux kABIs.
The redefinition of INTERNAL_VSYSCALL_CALL is also no longer
required.
Checked on sparc64-linux-gnu and sparcv9-linux-gnu. It fixes
the sporadic issues on sparc32 where clock_nanosleep does not
act as cancellation entrypoint.
It changes the powerpc INTERNAL_VSYSCALL_CALL and INTERNAL_SYSCALL_NCS
to return a negative value instead of the returning the CR value in
the 'err' macro argument.
The macro INTERNAL_SYSCALL_DECL is no longer required, and the
INTERNAL_SYSCALL_ERROR_P macro follows the other Linux kABIs.
Checked on powerpc64-linux-gnu, powerpc64le-linux-gnu, and
powerpc-linux-gnu-power4.
The diferences between powerpc64{le} and powerpc32 Linux sysdep.h
are:
1. On both vDSO and syscall macros the volatile registers r9, r10,
r11, and r12 are used as input operands on powerpc32 and as
clobber registers on powerpc64. However the outcome is essentially
the same, it advertise the register might be clobbered by the
kernel (although Linux won't leak register information to userland
in such case).
2. The LOADARGS* macros uses a different size to check for invalid
types.
3. The pointer mangling support guard pointer loading uses ABI
specific instruction and register.
This patch consolidates on only one sysdep by using the the powerpc64
version as default and add the adjustments required for powerpc32.
Checked on powerpc64-linux-gnu, powerpc64le-linux-gnu, and
powerpc-linux-gnu-power4.
H.J. Lu [Fri, 14 Feb 2020 22:45:34 +0000 (14:45 -0800)]
i386: Enable CET support in ucontext functions
1. getcontext and swapcontext are updated to save the caller's shadow
stack pointer and return address.
2. setcontext and swapcontext are updated to restore shadow stack and
jump to new context directly.
3. makecontext is updated to allocate a new shadow stack and set the
caller's return address to the helper code, L(exitcode).
4. Since we no longer save and restore EAX, ECX and EDX in getcontext,
setcontext and swapcontext, we can use them as scratch register slots
to enable CET in ucontext functions.
Since makecontext allocates a new shadow stack when making a new
context and kernel allocates a new shadow stack for clone/fork/vfork
syscalls, we track the current shadow stack base. In setcontext and
swapcontext, if the target shadow stack base is the same as the current
shadow stack base, we unwind the shadow stack. Otherwise it is a stack
switch and we look for a restore token.
We enable shadow stack at run-time only if program and all used shared
objects, including dlopened ones, are shadow stack enabled, which means
that they must be compiled with GCC 8 or above and glibc 2.28 or above.
We need to save and restore shadow stack only if shadow stack is enabled.
When caller of getcontext, setcontext, swapcontext and makecontext is
compiled with smaller ucontext_t, shadow stack won't be enabled at
run-time. We check if shadow stack is enabled before accessing the
extended field in ucontext_t.
Alistair Francis [Mon, 11 Nov 2019 23:07:19 +0000 (15:07 -0800)]
tst-clone3: Use __NR_futex_time64 if we don't have __NR_futex
We can't include sysdep.h in the test case (it introduces lots of
strange failures) so __NR_futex isn't redifined to __NR_futex_time64 by
64-bit time_t 32-bit archs (y2038 safe).
To allow the test to pass let's just do the __NR_futex_time64 syscall if
we don't have __NR_futex defined.
Florian Weimer [Fri, 14 Feb 2020 19:55:39 +0000 (20:55 +0100)]
powerpc64: Add memory protection key support [BZ #23202]
The 32-bit protection key behavior is somewhat unclear on 32-bit powerpc,
so this change is restricted to the 64-bit variants.
Flag translation is needed because of hardware differences between the
POWER implementation (read and write flags) and the Intel implementation
(write and read+write flags).
ldbl-128ibm-compat: Provide a scalb implementation
Reuse the template in order to provide the redirect for
scalbl to __scalbieee128, but avoid any extra aliasing
as this is intended to support long double redirects only.
This is a preparatory patch to enable building a _Float128
variant to ease reuse when building a _Float128 variant to
alias this long double only symbol.
Notably, stubs are added where missing to the native _Float128
sysdep dir to prevent building these newly templated variants
created inside the build directories.
Also noteworthy are the changes around LIBM_SVID_COMPAT. These
changes are not intuitive. The templated version is only
enabled when !LIBM_SVID_COMPAT, and the compat version is
predicated entirely on LIBM_SVID_COMPAT. Thus, exactly one is
stubbed out entirely when building. The nldbl scalb compat
files are updated to account for this.
Likewise, fixup the reuse of m68k's e_scalb{f,l}.c to include
it's override of e_scalb.c. Otherwise, the search path finds
the templated copy in the build directory. This could be
futher simplified by providing an overridden template, but I
lack the hardware to verify.
Joseph Myers [Fri, 14 Feb 2020 14:16:25 +0000 (14:16 +0000)]
Adjust thresholds in Bessel function implementations (bug 14469).
A recent discussion in bug 14469 notes that a threshold in float
Bessel function implementations, used to determine when to use a
simpler implementation approach, results in substantially inaccurate
results.
As I discussed in
<https://sourceware.org/ml/libc-alpha/2013-03/msg00345.html>, a
heuristic argument suggests 2^(S+P) as the right order of magnitude
for a suitable threshold, where S is the number of significand bits in
the floating-point type and P is the number of significant bits in the
representation of the floating-point type, and the float and ldbl-96
implementations use thresholds that are too small. Some threshold
does need using, there or elsewhere in the implementation, to avoid
spurious underflow and overflow for large arguments.
This patch sets the thresholds in the affected implementations to more
heuristically justifiable values. Results will still be inaccurate
close to zeroes of the functions (thus this patch does *not* fix any
of the bugs for Bessel function inaccuracy); fixing that would require
a different implementation approach, likely along the lines described
in <http://www.cl.cam.ac.uk/~jrh13/papers/bessel.ps.gz>.
So the justification for a change such as this would be statistical
rather than based on particular tests that had excessive errors and no
longer do so (no doubt such tests could be found, but would probably
be too fragile to add to the testsuite, as liable to give large errors
again from very small implementation changes or even from compiler
changes). See
<https://sourceware.org/ml/libc-alpha/2020-02/msg00638.html> for such
statistics of the resulting improvements for float functions.
Florian Weimer [Tue, 21 Jan 2020 16:38:15 +0000 (17:38 +0100)]
resolv: Fix ABA race in /etc/resolv.conf change detection [BZ #25420]
__resolv_conf_get_current should only record the initial file
change data if after verifying that file just read matches the
original measurement. Fixes commit aef16cc8a4c670036d45590877
("resolv: Automatically reload a changed /etc/resolv.conf file
[BZ #984]").
Florian Weimer [Tue, 21 Jan 2020 16:11:01 +0000 (17:11 +0100)]
resolv: Fix file handle leak in __resolv_conf_load [BZ #25429]
res_vinit_1 did not close the stream on errors, only on success.
This change moves closing the stream to __resolv_conf_load, for both
the success and error cases.
Joseph Myers [Thu, 13 Feb 2020 21:59:59 +0000 (21:59 +0000)]
Add STATX_ATTR_VERITY from Linux 5.5 to bits/statx-generic.h.
This patch adds the new STATX_ATTR_VERITY macro from Linux 5.5 to
glibc's bits/statx-generic.h. (This only does anything if glibc is
being used with old kernel headers.)
By undef strong_alias on alpha implementation, the
default_symbol_version macro becomes an empty macro on static build.
It fixes the issue introduced at c953219420.
Checked on alpha-linux-gnu with a 'make check run-built-tests=no'.
Joseph Myers [Wed, 12 Feb 2020 23:31:56 +0000 (23:31 +0000)]
Avoid ldbl-96 stack corruption from range reduction of pseudo-zero (bug 25487).
Bug 25487 reports stack corruption in ldbl-96 sinl on a pseudo-zero
argument (an representation where all the significand bits, including
the explicit high bit, are zero, but the exponent is not zero, which
is not a valid representation for the long double type).
Although this is not a valid long double representation, existing
practice in this area (see bug 4586, originally marked invalid but
subsequently fixed) is that we still seek to avoid invalid memory
accesses as a result, in case of programs that treat arbitrary binary
data as long double representations, although the invalid
representations of the ldbl-96 format do not need to be consistently
handled the same as any particular valid representation.
This patch makes the range reduction detect pseudo-zero and unnormal
representations that would otherwise go to __kernel_rem_pio2, and
returns a NaN for them instead of continuing with the range reduction
process. (Pseudo-zero and unnormal representations whose unbiased
exponent is less than -1 have already been safely returned from the
function before this point without going through the rest of range
reduction.) Pseudo-zero representations would previously result in
the value passed to __kernel_rem_pio2 being all-zero, which is
definitely unsafe; unnormal representations would previously result in
a value passed whose high bit is zero, which might well be unsafe
since that is not a form of input expected by __kernel_rem_pio2.
Matheus Castanho [Wed, 12 Feb 2020 16:07:32 +0000 (13:07 -0300)]
sunrpc: Properly clean up if tst-udp-timeout fails
The macro TEST_VERIFY_EXIT is used several times on
sunrpc/tst-udp-timeout to exit the test if a condition evaluates to
false. The side effect is that the code to terminate the RPC server
process is not executed when the program calls exit, so that
sub-process stays alive.
This commit registers a clean up function with atexit to kill the
server process before exiting the main program.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
David Kilroy [Wed, 12 Feb 2020 17:31:17 +0000 (14:31 -0300)]
elf: avoid stack allocation in dl_open_worker
As the sort was removed, there's no need to keep a separate map of
links. Instead, when relocating objects iterate over l_initfini
directly.
This allows us to remove the loop copying l_initfini elements into
map. We still need a loop to identify the first and last elements that
need relocation.
David Kilroy [Wed, 12 Feb 2020 17:28:15 +0000 (14:28 -0300)]
elf: Allow dlopen of filter object to work [BZ #16272]
There are two fixes that are needed to be able to dlopen filter
objects. First _dl_map_object_deps cannot assume that map will be at
the beginning of l_searchlist.r_list[], as filtees are inserted before
map. Secondly dl_open_worker needs to ensure that filtees get
relocated.
In _dl_map_object_deps:
* avoiding removing relocation dependencies of map by setting
l_reserved to 0 and otherwise processing the rest of the search
list.
* ensure that map remains at the beginning of l_initfini - the list
of things that need initialisation (and destruction). Do this by
splitting the copy up. This may not be required, but matches the
initialization order without dlopen.
Modify dl_open_worker to relocate the objects in new->l_inifini.
new->l_initfini is constructed in _dl_map_object_deps, and lists the
objects that need initialization and destruction. Originally the list
of objects in new->l_next are relocated. All of these objects should
also be included in new->l_initfini (both lists are populated with
dependencies in _dl_map_object_deps). We can't use new->l_prev to pick
up filtees, as during a recursive dlopen from an interposed malloc
call, l->prev can contain objects that are not ready for relocation.
Add tests to verify that symbols resolve to the filtee implementation
when auxiliary and filter objects are used, both as a normal link and
when dlopen'd.
Joseph Myers [Wed, 12 Feb 2020 13:37:16 +0000 (13:37 +0000)]
Rename RWF_WRITE_LIFE_NOT_SET to RWH_WRITE_LIFE_NOT_SET following Linux 5.5.
Linux 5.5 renames RWF_WRITE_LIFE_NOT_SET to RWH_WRITE_LIFE_NOT_SET,
with the old name kept as an alias. This patch makes the
corresponding change in glibc.
Stefan Liebler [Wed, 12 Feb 2020 08:10:56 +0000 (09:10 +0100)]
S390: Fix non-ascii character in fenv.h.
The comment "isn't" contained a non-ascii character which leads to
an error if compiled with -finput-charset=ascii:
error: failure to convert ascii to UTF-8
This is observable in GCC testsuite:
FAIL: 17_intro/headers/c++1998/charset.cc (test for excess errors)
FAIL: 17_intro/headers/c++2011/charset.cc (test for excess errors)
FAIL: 17_intro/headers/c++2014/charset.cc (test for excess errors)
FAIL: 17_intro/headers/c++2017/charset.cc (test for excess errors)
FAIL: 17_intro/headers/c++2020/charset.cc (test for excess errors)
The code started out with bits form resolv/resolv_conf.c, but it
was enhanced to deal with directories and FIFOs in a more predictable
manner. A test case is included as well.
This will be used to implement the /etc/resolv.conf change detection.
This currently lives in a header file only. Once there are multiple
users, the implementations should be moved into C files.
Fangrui Song [Wed, 12 Feb 2020 06:10:19 +0000 (01:10 -0500)]
elf.h: Add R_RISCV_IRELATIVE
The number has been officially assigned by
https://github.com/riscv/riscv-elf-psabi-doc/pull/131
https://github.com/riscv/riscv-elf-psabi-doc/commit/d21ca40a7f56812a15e97450b7bc1599c0d35b82
Samuel Thibault [Mon, 10 Feb 2020 22:06:33 +0000 (23:06 +0100)]
hurd: Add __pthread_spin_wait and use it
900778283ac3 ("htl: make pthread_spin_lock really spin") made
pthread_spin_lock really spin and not block, but the current users of
__pthread_spin_lock were assuming that it blocks, i.e. they use it as a
lightweight mutex fitting in just one int.
Joseph Myers [Mon, 10 Feb 2020 22:17:59 +0000 (22:17 +0000)]
Use --disable-gdbserver in build-many-glibcs.py.
Now that binutils-gdb has gdbserver at top level, an extra
--disable-gdbserver configure option is needed when configuring
binutils from a git checkout to avoid it also building gdbserver
unnecessarily (although fairly harmlessly). This patch updates the
options used in build-many-glibcs.py accordingly (although this might
end up not being needed depending on what happens regarding whether
gdbserver gets built for host != target).
Tested with a build-many-glibcs.py compilers build for
aarch64-linux-gnu using binutils-gdb master.