git.ipfire.org Git - thirdparty/glibc.git/log

PPC64: Attach SIMD attribute to cosf, sin, sinf function declarations.

These changes were mistakenly left out of the patches that added SIMD
versions of these functions to libmvec.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>

PPC64: Add libmvec SIMD double-precision power function [BZ #24210]

Based off the ./sysdeps/ieee754/dbl-64/pow.c implementation,
and provides identical results.

Unlike other libmvec functions, this sets the underflow and overflow bits.
The caller can check these flags, and possibly re-run the calculations with
scalar pow to figure out what is causing the overflow or underflow.

I may not have normalized the data for benchmarking this properly,
but operating only on integers between 0-2^32 and floats between 0.5 and
1 I get the following:

Running 20 times over 32MiB
vector: mean 535.824919 (sd 0.246088)
scalar: mean 286.384220 (sd 0.027630)

Which is a very impressive speed boost.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>

PPC64: Add libmvec SIMD single-precision power function [BZ #24210]

Based off the ./sysdeps/ieee754/flt-32/powf.c implementation,
and thus provides identical results.

Unlike other libmvec functions, this sets the underflow and overflow bits.
The caller can check these flags, and possibly re-run the calculations with
scalar powf to figure out what is causing the overflow or underflow.

I may not have normalized the data for benchmarking this properly,
but operating only on floats between 0.5 and 1 I get the following:

Running 20 times over 32MiB
vector: mean 307.659767 (sd 0.203217)
scalar: mean 221.837088 (sd 0.032256)

And with random data there is a decrease in performance:
vector: mean 265.366371 (sd 0.000626)
scalar: mean 279.598078 (sd 0.025592)

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>

powerpc64: Add support for vec_cmpne for older compilers

vec_cmpne was added to GCC 7, requiring an alternative implementation
when building glibc with GCC 6.

PPC64: Add libmvec SIMD double-precision natural exponent function [BZ #24209]

Passes all tests.

Unlike other libmvec functions, this sets the underflow and overflow bits.
The caller can check these flags, and possibly re-run the calculations with
scalar exp to figure out what is causing the overflow or underflow.

The special-case path is not vectorized, and performs much worse than
the scalar code.
Normalized data: 1 to 2^32 converted to double
Running 20 times over 32MiB
vector: mean 563.807107 MiB/s (sd 0.390922)
scalar: mean 226.527824 MiB/s (sd 0.077406)

Random data:
vector: mean 80.175986 MiB/s (sd 1.110948)
scalar: mean 244.738130 MiB/s (sd 0.029561)

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>

PPC64: Add libmvec SIMD single-precision natural exponent function [BZ #24209]

Passes all tests.

Based off the ./sysdeps/ieee754/dbl-64/e_exp.c implementation,
and thus provides identical results.

Unlike other libmvec functions, this sets the underflow and overflow bits.
The caller can check these flags, and possibly re-run the calculations with
scalar expf to figure out what is causing the overflow or underflow.

Surprisingly, the special-case path performs as well as the normal path
(both are vectorized).
Running 20 times over 32MiB
vector: mean 432.263032 MiB/s (sd 0.486733)
scalar: mean 178.646197 MiB/s (sd 0.050013)

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>

powerpc64: Fix libmvec's logf4 build on GCC < 8

The built-in vec_float was added to GCC 8.0, requiring an alternative
implementation when using older GCC versions.

PPC64: Add libmvec SIMD single-precision logarithm function [BZ #24208]

Implements single-precision vector logarithm function.  The algorithm is
an adaptation of the one in sysdeps/ieee754/flt-32/e_logf.c, modified for
PPC64 VSX hardware.  The version of e_logf.c referenced here is from
commit #bf27d3973d.

The patch has been tested on both Little-Endian and Big-Endian.  It
passes all the tests for single-precision logarithm run by make check with
max ULP of 1.  Integration into the make check infrastructure is adapted from
similar x86_64 changes in commit #774488f88a.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>

PPC64: Add libmvec SIMD double-precision logarithm function [BZ #24208]

Implements double-precision vector logarithm function.  The algorithm is
an adaptation of the one in sysdeps/ieee754/dbl-64, modified to exploit
PPC64 VSX hardware.  The version of ieee754/dbl-64 is commit #f41b0a43e4.

The patch has been tested on both Little-Endian and Big-Endian.  It
passes all the tests for double-precision logarithm run by make check.
Integration into the make check infrastructure closely follows the
corresponding changes done for x86_64 in commit #6af25acc7b.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>

powerpc64: Fix mathvec build and tests on POWER < 8

vec_d_cos2_vsx.c, vec_d_sin2_vsx.c and vec_d_sincos2_vsx.c use
vec_sl(), which is only available on POWER8 processors.

PPC64: Add libmvec SIMD single-precision sincosf function [BZ #24207]

Implements single-precision vector sincosf function. The polynomial
approximating algorithm is adapted for PPC64 from x86_64 [commit #a6336cc446].

The patch has been tested on PPC64/POWER8 Little Endian and Big Endian.
Testing uses the framework created for libmvec on x86_64 which runs tests on
issuing 'make check'. Tests of the new vector sincosf function all pass.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>

PPC64: Add libmvec SIMD double-precision sincos function [BZ #24207]

Implements double-precision vector sincos function. The polynomial
approximating algorithm is adapted for PPC64 from x86_64 [commit #c9a8c526ac].

The patch has been tested on PPC64/POWER8 Little Endian and Big Endian.
Testing uses the framework created for libmvec on x86_64 which runs tests on
issuing 'make check'. Tests of the new vector sincos function all pass.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>

PPC64: Add libmvec SIMD single-precision sine function [BZ #24206]

Implements single-precision vector sine function. The polynomial
sine-approximating algorithm is adapted for PPC64 from x86_64 [commit #2a8c2c7b33].

The patch has been tested on PPC64/POWER8 Little Endian and Big Endian.
Testing uses the framework created for libmvec on x86_64 which runs tests on
issuing 'make check'. Tests of the new vector single-precision sine function all pass.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>

PPC64: Add libmvec SIMD double-precision sine function [BZ #24206]

Implements double-precision vector sine function. The polynomial
sine-approximating algorithm is adapted for PPC64 from x86_64 [commit #4b9c2b707b].

The patch has been tested on PPC64/POWER8 Little Endian and Big Endian.
Testing uses the framework created for libmvec on x86_64 which runs tests on
issuing 'make check'. Tests of the new vector sine function all pass.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>

PPC64: Add libmvec SIMD single-precision cosine function [BZ #24205]

Implements single-precision cosine using VSX vector capability. The polynomial
cosine-approximating algorithm is adapted for PPC64 from x86_64 [commit #04f496d602].

The patch has been tested on PPC64/POWER8 Little Endian and Big Endian. It is
tested using the framework created for libmvec on x86_64 which runs tests on
issuing 'make check'. Tests of the new vector cosine function all pass.

Details on the ABI are found at this link:
<https://sourceware.org/glibc/wiki/libmvec?action=AttachFile&do=view&target=VectorABI.txt>

Apart from adjusting the width of operands, the details described for the
double-precision cosine implemented earlier apply here. See git
commit #7956c29f07 for that information.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>

PPC64: Add libmvec SIMD double-precision cosine function [BZ #24205]

This is the 1st of 12 patches that will implement libmvec for PPC64 using
VSX hardware capabilities.

Implements double-precision cosine using VSX vector capability. Algorithm for
cosine is from x86_64 [commit #2193311288] adapted to PPC64.

Name-mangling exactly duplicates SSE ISA of the x86_64 ABI. The details are at
<https://sourceware.org/glibc/wiki/libmvec?action=AttachFile&do=view&target=VectorABI.txt>

The patch has been tested on PPC64/POWER8 Little Endian and Big Endian. It is
tested using the framework created for libmvec on x86_64 which runs tests on
issuing 'make check'. Tests of the new vector cosine function all pass.

Library libmvec is built by default. To disable building it, pass flag
--disable-mathvec to the configure script.

A runtime check prevents vector tests running on systems lacking VSX hardware.

Glibc built with this patch was installed using the procedure outlined at
<https://sourceware.org/glibc/wiki/Testing/Builds>. Compiling against the new
library created a test executable which computes cosines using the vector
version of the function. The results are at most 2-ulps away from the scalar
cosine. That is expected and indicated in the comments describing the
algorithm - as obtained from x86_64 commit #2193311288.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>

Fix tst-pkey expectations on pkey_get [BZ #23202]

From the GNU C Library manual, the pkey_set can receive a combination of
PKEY_DISABLE_WRITE and PKEY_DISABLE_ACCESS.  However PKEY_DISABLE_ACCESS
is more restrictive than PKEY_DISABLE_WRITE and includes its behavior.

The test expects that after setting
(PKEY_DISABLE_WRITE|PKEY_DISABLE_ACCESS) pkey_get should return the
same.  This may not be true as PKEY_DISABLE_ACCESS will succeed in
describing the state of the key in this case.
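
A minimal sketch of the relaxed expectation, assuming the flag values 0x1 and
0x2 when the system headers do not supply them. The helper name
pkey_rights_match is hypothetical and is not the code used in tst-pkey.

```c
#include <assert.h>
#include <stdbool.h>

/* Fallback values so the sketch is self-contained; on a system with
   pkey support they come from <sys/mman.h>.  */
#ifndef PKEY_DISABLE_ACCESS
# define PKEY_DISABLE_ACCESS 0x1
#endif
#ifndef PKEY_DISABLE_WRITE
# define PKEY_DISABLE_WRITE 0x2
#endif

/* Hypothetical helper: does the value returned by pkey_get describe the
   same restrictions as the value that was passed to pkey_set?  */
bool
pkey_rights_match (int set_rights, int got_rights)
{
  assert ((set_rights & ~(PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE)) == 0);
  if (set_rights & PKEY_DISABLE_ACCESS)
    /* Access-disabled already implies write-disabled, so the write bit
       may or may not be reported back.  */
    return (got_rights & PKEY_DISABLE_ACCESS) != 0;
  return got_rights == set_rights;
}
```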

The pkey behavior during signal handling is different between x86 and
POWER.  This change makes the test compatible with both architectures.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>

y2038: linux: Provide __gettimeofday64 implementation

In glibc, gettimeofday can use the vDSO (USE_IFUNC_GETTIMEOFDAY is defined
on power and x86), the gettimeofday syscall, or the 'default'
___gettimeofday() from ./time/gettime.c (as a fallback).

In this patch the last function (___gettimeofday) has been refactored and
moved to ./sysdeps/unix/sysv/linux/gettimeofday.c to be Linux specific.

A new, explicitly 64-bit function, __gettimeofday64, has been introduced to get
64-bit time from the kernel (by internally calling __clock_gettime64).

Moreover, the 32-bit version, __gettimeofday, has been refactored to internally
use __gettimeofday64.

The __gettimeofday is now supposed to be used on systems still supporting 32
bit time (__TIMESIZE != 64) - hence the necessary check for time_t potential
overflow and conversion of struct __timeval64 to 32 bit struct timespec.
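
The overflow check and narrowing conversion can be pictured roughly as
follows. This is only a sketch: struct tv64, struct tv32 and narrow_timeval
are hypothetical stand-ins for the internal glibc types and helpers, and
reporting EOVERFLOW is an assumption of the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the 64-bit and 32-bit time structures.  */
struct tv64 { int64_t tv_sec; int64_t tv_usec; };
struct tv32 { int32_t tv_sec; int32_t tv_usec; };

/* Narrow a 64-bit time value to 32 bits.  Returns 0 on success, or -1
   with errno set to EOVERFLOW if the seconds do not fit.  */
int
narrow_timeval (const struct tv64 *in, struct tv32 *out)
{
  assert (in != NULL && out != NULL);
  if (in->tv_sec < INT32_MIN || in->tv_sec > INT32_MAX)
    {
      errno = EOVERFLOW;
      return -1;
    }
  out->tv_sec = (int32_t) in->tv_sec;
  /* Microseconds are always below one million, so they fit.  */
  out->tv_usec = (int32_t) in->tv_usec;
  return 0;
}
```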

The iFUNC vDSO direct call optimization has been removed from both i686 and
powerpc32 (USE_IFUNC_GETTIMEOFDAY is not defined for those architectures
anymore). The Linux kernel does not provide a y2038-safe implementation of
gettimeofday, nor does it plan to provide one in the future; clock_gettime64
should be used instead. Keeping support for this optimization would require
to handle another build permutation (!__ASSUME_TIME64_SYSCALLS &&
USE_IFUNC_GETTIMEOFDAY) which adds more complexity and has limited use
(since the idea is to eventually have a y2038 safe glibc build).

Build tests:
./src/scripts/build-many-glibcs.py glibcs

Run-time tests:
- Run specific tests on ARM/x86 32bit systems (qemu):
https://github.com/lmajewski/meta-y2038 and run tests:
https://github.com/lmajewski/y2038-tests/commits/master

Above tests were performed with Y2038 redirection applied as well as without
to test proper usage of both __gettimeofday64 and __gettimeofday.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
[Including some commit message improvement]

Linux: Work around kernel bugs in chmod on /proc/self/fd paths [BZ #14578]

It appears that the ability to change symbolic link modes through such
paths is unintended. On several file systems, the operation fails with
EOPNOTSUPP, even though the symbolic link permissions are updated.
The expected behavior is a failure to update the permissions, without
file system changes.

Reviewed-by: Matheus Castanho <msc@linux.ibm.com>

Introduce <elf-initfini.h> and ELF_INITFINI for all architectures

This supersedes the init_array sysdeps directory.  It allows us to
check for ELF_INITFINI in both C and assembler code, and skip DT_INIT
and DT_FINI processing completely on newer architectures.

A new header file is needed because <dl-machine.h> is incompatible
with assembler code.  <sysdep.h> is compatible with assembler code,
but it cannot be included in all assembler files because on some
architectures, it redefines register names, and some assembler files
conflict with that.

<elf-initfini.h> is replicated for legacy architectures which need
DT_INIT/DT_FINI support.  New architectures follow the generic default
and disable it.

mips: Fix backtrace result for signal frames

The MIPS fallback code handles a frame whose FDE cannot be obtained
(for instance, a signal frame) by reading the kernel-allocated signal frame
and adding '2' to the value of 'sc_pc' [1]. The added value is used to
recognize the end of an EH region on mips16 [2].

The fix adjusts the obtained signal frame value, removing the value added
by libgcc, by checking whether the previous frame is a signal frame.

Checked with backtrace and tst-sigcontext-get_pc tests on mips-linux-gnu
and mips64-linux-gnu.

[1] libgcc/config/mips/linux-unwind.h from gcc code.
[2] gcc/config/mips/mips.h from gcc code.

Move implementation of <file_change_detection.h> into a C file

file_change_detection_for_stat partially initializes
struct file_change_detection in some cases, when the size member
alone determines the outcome of all comparisons. This results
in maybe-uninitialized compiler warnings in case of sufficiently
aggressive inlining.

Once the implementation is moved into a separate C file, this kind
of inlining is no longer possible, so the compiler warnings are gone.
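
To see why the size member alone can settle a comparison, consider the
simplified sketch below. The struct layout, the meaning given to the special
size values, and the name change_record_same are assumptions made for
illustration and do not reproduce the glibc definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified record of a file's state.  Assumed convention:
   size -1 means "unknown", size 0 means "empty or missing".  */
struct change_record
{
  int64_t size;
  uint64_t ino;
  int64_t mtime_sec;
};

/* Return true if both records describe the same file contents.  */
bool
change_record_same (const struct change_record *a,
                    const struct change_record *b)
{
  assert (a != NULL && b != NULL);
  if (a->size < 0 || b->size < 0)
    return false;   /* Unknown state always counts as changed.  */
  if (a->size == 0 && b->size == 0)
    return true;    /* Nothing else needs to be looked at.  */
  return a->size == b->size && a->ino == b->ino
         && a->mtime_sec == b->mtime_sec;
}
```

In the first two branches the remaining members are never read, which is why
code that fills in only the size is correct, yet a compiler that inlines both
the producer and the comparison may still warn about the unset members.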

<fd_to_filename.h>: Add type safety and port to Hurd

The new type struct fd_to_filename makes the allocation of the
backing storage explicit.

Hurd uses /dev/fd, not /proc/self/fd.
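
The idea of caller-provided backing storage can be sketched as below. The
names struct fd_name_buf, fd_name and FD_PREFIX are hypothetical; the real
interface is the one declared in <fd_to_filename.h>.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Linux-style prefix; a Hurd build would use "/dev/fd/" instead.  */
#define FD_PREFIX "/proc/self/fd/"

/* The caller owns the storage, so its size and lifetime are explicit.
   sizeof counts the terminating NUL; 11 more bytes cover any 32-bit int.  */
struct fd_name_buf
{
  char buffer[sizeof (FD_PREFIX) + 11];
};

/* Format the path for FD into STORAGE and return a pointer to it.  */
const char *
fd_name (int fd, struct fd_name_buf *storage)
{
  assert (fd >= 0 && storage != NULL);
  snprintf (storage->buffer, sizeof (storage->buffer), FD_PREFIX "%d", fd);
  return storage->buffer;
}
```

Because the buffer lives in a named struct rather than in an anonymous char
array, passing storage of the wrong size becomes a type error.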

Co-Authored-By: Paul Eggert <eggert@cs.ucla.edu>

Prepare redirections for IEEE long double on powerpc64le

All functions that have a format string, which can consume a long double
argument, must have one version for each long double format supported on
a platform.  On powerpc64le, these functions currently have two versions
(i.e.: long double with the same format as double, and long double with
IBM Extended Precision format).  Support for a third long double format
option (i.e. long double with IEEE long double format) is being prepared
and all the aforementioned functions now have a third version (not yet
exported on the master branch, but the code is in).

For these functions to get selected (during build time), references to
them in user programs (or dependent libraries) must get redirected to
the aforementioned new versions of the functions.  This patch installs
the header magic required to perform such redirections.

Notice, however, that since the redirections only happen when
__LONG_DOUBLE_USES_FLOAT128 is set to 1, and no platform (including
powerpc64le) currently does it, no redirections actually happen.
Redirections and the exporting of the new functions will happen at the
same time (when powerpc64le adds ldbl-128ibm-compat to their Implies).

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
Reviewed-by: Paul E. Murphy <murphyp@linux.vnet.ibm.com>

conform/conformtest.py: Extend tokenizer to cover character constants

Such constants are used in __USE_EXTERN_INLINES blocks.

stdlib: Reduce namespace pollution in <inttypes.h>

The namespace pollution results in conform test failures if the tests
are run with __USE_EXTERN_INLINES defined (e.g., when configuring with
CC="gcc -O3" CXX="g++ -O3").

x86: Avoid single-argument _Static_assert in <tls.h>

Older GCC versions do not support this extension. Fixes commit f1bdee61797
("x86 tls: Use _Static_assert for TLS access size assertion").
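
For reference, the portable spelling passes a message string as the second
argument. The accessor below is a hypothetical illustration, not the TLS
access macro from <tls.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical accessor that checks the access size at compile time.  */
uint32_t
load_u32_field (const uint32_t *field)
{
  /* Two-argument form: accepted by any C11 compiler.  The one-argument
     form, _Static_assert (expr), was only standardized in C23.  */
  _Static_assert (sizeof (*field) == 4, "field must be 4 bytes wide");
  assert (field != NULL);
  return *field;
}
```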

x86 tls: Use _Static_assert for TLS access size assertion

htl: Link internal htl tests against libpthread

pthread: Fix building tst-robust8 with nptl

NPTL's pthreadP.h needs internal definitions

pthread: Move robust mutex tests from nptl to sysdeps/pthread

tst-robust8.c prints some mutex internals for nptl debugging; this
needs to be conditional on building with nptl.

htl: Remove stub warning for pthread_mutexattr_setpshared

It actually is implemented.

htl: Add missing functions and defines for robust mutexes

htl: Only check pthread_self coherency when DEBUG is set

htl has been widely tested for a long time now with this coherency
checked successfully.

hurd: Add THREAD_GET/SETMEM/_NC

Store them in the TCB, and use them for accessing _hurd_sigstate.

hurd tls: update comment about fields at the end of tcbhead

ld.so: Do not export free/calloc/malloc/realloc functions [BZ #25486]

Exporting functions and relying on symbol interposition from libc.so
makes the choice of implementation dependent on DT_NEEDED order, which
is not what some compiler drivers expect.

This commit replaces one magic mechanism (symbol interposition) with
another one (preprocessor-/compiler-based redirection). This makes
the hand-over from the minimal malloc to the full malloc more
explicit.

Removing the ABI symbols is backwards-compatible because libc.so is
always in scope, and the dynamic loader will find the malloc-related
symbols there since commit f0b2132b35248c1f4a80f62a2c38cddcc802aa8c
("ld.so: Support moving versioned symbols between sonames
[BZ #24741]").

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

Remove weak declaration of free from <inline-hashtab.h>

elf/dl-minimal.c provides a definition of free, so the function
pointer is always non-null, even before the final relocation
of the loader.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

elf: Extract _dl_sym_post, _dl_sym_find_caller_map from elf/dl-sym.c

The definitions are moved into a new file, elf/dl-sym-post.h, so that
this code can be used by the dynamic loader as well.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

elf: Introduce the rtld-stubbed-symbols makefile variable

This generalizes a mechanism used for stack-protector support, so
that it can be applied to other symbols if required.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

arm: fix use of INTERNAL_SYSCALL_CALL

Remove extra argument from INTERNAL_SYSCALL_CALL macro call. Fixes
commit bc2eb9321e ("linux: Remove INTERNAL_SYSCALL_DECL").

linux: Remove INTERNAL_SYSCALL_DECL

With all Linux ABIs using the expected Linux kABI to indicate
syscalls errors, the INTERNAL_SYSCALL_DECL is an empty declaration
on all ports.

This patch removes the 'err' argument from the INTERNAL_SYSCALL* macros
and removes the INTERNAL_SYSCALL_DECL usage.
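
Under the Linux kABI convention, a raw syscall return value in the range
-4095 to -1 encodes a negated errno, so no separate error variable is
needed. A minimal sketch of that check follows; the function names are
hypothetical, chosen to avoid clashing with the real macros.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical equivalent of an "is this an error?" test on a raw
   syscall return value.  Cast to unsigned, the error range [-4095, -1]
   becomes the 4095 largest values, so one comparison is enough.  */
bool
syscall_failed (long ret)
{
  return (unsigned long) ret > -4096UL;
}

/* Extract the errno value from a failed raw return value.  */
int
syscall_errno (long ret)
{
  assert (syscall_failed (ret));
  return (int) -ret;
}
```

With the error carried in the return value itself, a caller can write
`long r = ...; if (syscall_failed (r)) errno = syscall_errno (r);` without
declaring anything beforehand.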

Checked with a build against all affected ABIs.

nptl: Remove unused pthread-errnos.h rule

linux: Consolidate INLINE_SYSCALL

With all Linux ABIs using the expected Linux kABI to indicate
syscalls errors, there is no need to replicate the INLINE_SYSCALL.

The generic Linux sysdep.h includes errno.h even for !__ASSEMBLER__,
which is ok now and allows cleaning up some archaic code that assumes
otherwise.

Checked with a build against all affected ABIs.

s390: Consolidate Linux syscall definition

The {INTERNAL,INLINE}_SYSCALL are defined only on s390 sysdep.h.

Checked on s390x-linux-gnu and s390-linux-gnu.

riscv: Avoid clobbering register parameters in syscall

The riscv INTERNAL_SYSCALL macro might clobber the register
parameter if the argument itself might clobber any register (a function
call for instance).

This patch fixes it by using temporary variables for the expressions
between the register assignments (as indicated by GCC documentation,
6.47.5.2 Specifying Registers for Local Variables).

It is similar to the fix done for MIPS (bug 25523).

Checked with riscv64-linux-gnu-rv64imafdc-lp64d build.

microblaze: Avoid clobbering register parameters in syscall

The microblaze INTERNAL_SYSCALL macro might clobber the register
parameter if the argument itself might clobber any register (a function
call for instance).

This patch fixes it by using temporary variables for the expressions
between the register assignments (as indicated by GCC documentation,
6.47.5.2 Specifying Registers for Local Variables).

It is similar to the fix done for MIPS (bug 25523).

Checked with microblaze-linux-gnu and microblazeel-linux-gnu build.

nios2: Use Linux kABI for syscall return

It changes the nios INTERNAL_SYSCALL_RAW macro to return a negative
value instead of the 'r2' register value on the 'err' macro argument.

The macro INTERNAL_SYSCALL_DECL is no longer required, and the
INTERNAL_SYSCALL_ERROR_P macro follows the other Linux kABIs.

Checked with a build against nios2-linux-gnu.

mips: Use Linux kABI for syscall return

It changes the mips INTERNAL_SYSCALL* and internal_syscall* macros
to return a negative value instead of the 'a3' register value on the
'err' macro argument.

The macro INTERNAL_SYSCALL_DECL is no longer required, and the
INTERNAL_SYSCALL_ERROR_P macro follows the other Linux kABIs.
The redefinition of INTERNAL_VSYSCALL_CALL is also no longer
required.

Checked on mips64-linux-gnu, mips64n32-linux-gnu, and mips-linux-gnu.

mips64: Consolidate Linux sysdep.h

The mips64 Linux syscall macros differ only in argument type and
the requirement of sign-extending values on n32. The headers
are consolidated by parameterizing the arguments with a new type,
__syscall_arg_t, and by defining the ARGIFY for n64.

Also, the generic unix mips64 sysdep is essentially the same;
only the load instruction needs to be adjusted depending on the
ABI.

Checked on mips64-linux-gnu and mips64n32-linux-gnu.

ia64: Use Linux kABI for syscall return

It changes the ia64 INTERNAL_SYSCALL_NCS macro to return a negative
value instead of the 'r10' register value on the 'err' macro argument.

The macro INTERNAL_SYSCALL_DECL is no longer required, and the
INTERNAL_SYSCALL_ERROR_P macro follows the other Linux kABIs.

Checked on ia64-linux-gnu.

alpha: Refactor syscall and Use Linux kABI for syscall return

It is highly unlikely that alpha will be ported to anything other than
Linux, so this patch moves the generic unix syscall definition to
Linux and adapts it to the Linux kernel ABI.

It changes the internal_syscall* macros to return a negative value
instead of the '$19' register value on the 'err' macro argument.

The macro INTERNAL_SYSCALL_DECL is no longer required, and the
INTERNAL_SYSCALL_ERROR_P macro follows the other Linux kABIs.

Checked on alpha-linux-gnu.

sparc: Avoid clobbering register parameters in syscall

The sparc INTERNAL_SYSCALL macro might clobber the register
parameter if the argument itself might clobber any register (a function
call for instance).

This patch fixes it by using temporary variables for the expressions
between the register assignments (as indicated by GCC documentation,
6.47.5.2 Specifying Registers for Local Variables).

It is similar to the fix done for MIPS (bug 25523).

Checked on sparc64-linux-gnu and sparcv9-linux-gnu.

sparc: Use Linux kABI for syscall return

It changes the sparc internal_syscall* macros to return a negative
value instead of the 'g1' register value in the 'err' macro argument.
The __SYSCALL_STRING macro is also changed to not set the 'g1'
value, since 'o1' already holds all the required information
to check if syscall has failed.

The macro INTERNAL_SYSCALL_DECL is no longer required, and the
INTERNAL_SYSCALL_ERROR_P macro follows the other Linux kABIs.
The redefinition of INTERNAL_VSYSCALL_CALL is also no longer
required.

Checked on sparc64-linux-gnu and sparcv9-linux-gnu. It fixes
the sporadic issues on sparc32 where clock_nanosleep does not
act as cancellation entrypoint.

powerpc: Use Linux kABI for syscall return

It changes the powerpc INTERNAL_VSYSCALL_CALL and INTERNAL_SYSCALL_NCS
to return a negative value instead of returning the CR value in
the 'err' macro argument.

The macro INTERNAL_SYSCALL_DECL is no longer required, and the
INTERNAL_SYSCALL_ERROR_P macro follows the other Linux kABIs.

Checked on powerpc64-linux-gnu, powerpc64le-linux-gnu, and
powerpc-linux-gnu-power4.

powerpc: Consolidate Linux syscall definition

The differences between powerpc64{le} and powerpc32 Linux sysdep.h
are:

  1. On both vDSO and syscall macros the volatile registers r9, r10,
     r11, and r12 are used as input operands on powerpc32 and as
     clobber registers on powerpc64.  However, the outcome is essentially
     the same: it advertises that the registers might be clobbered by the
     kernel (although Linux won't leak register information to userland
     in such case).

  2. The LOADARGS* macros use a different size to check for invalid
     types.

  3. The pointer mangling support guard pointer loading uses ABI
     specific instruction and register.

This patch consolidates on only one sysdep by using the powerpc64
version as default and adding the adjustments required for powerpc32.

Checked on powerpc64-linux-gnu, powerpc64le-linux-gnu, and
powerpc-linux-gnu-power4.

i386: Enable CET support in ucontext functions

1. getcontext and swapcontext are updated to save the caller's shadow
stack pointer and return address.
2. setcontext and swapcontext are updated to restore shadow stack and
jump to new context directly.
3. makecontext is updated to allocate a new shadow stack and set the
caller's return address to the helper code, L(exitcode).
4. Since we no longer save and restore EAX, ECX and EDX in getcontext,
setcontext and swapcontext, we can use them as scratch register slots
to enable CET in ucontext functions.

Since makecontext allocates a new shadow stack when making a new
context and kernel allocates a new shadow stack for clone/fork/vfork
syscalls, we track the current shadow stack base.  In setcontext and
swapcontext, if the target shadow stack base is the same as the current
shadow stack base, we unwind the shadow stack.  Otherwise it is a stack
switch and we look for a restore token.

We enable shadow stack at run-time only if program and all used shared
objects, including dlopened ones, are shadow stack enabled, which means
that they must be compiled with GCC 8 or above and glibc 2.28 or above.
We need to save and restore shadow stack only if shadow stack is enabled.
When caller of getcontext, setcontext, swapcontext and makecontext is
compiled with smaller ucontext_t, shadow stack won't be enabled at
run-time.  We check if shadow stack is enabled before accessing the
extended field in ucontext_t.

Tested on i386 CET/non-CET machines.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

tst-clone3: Use __NR_futex_time64 if we don't have __NR_futex

We can't include sysdep.h in the test case (it introduces lots of
strange failures), so __NR_futex isn't redefined to __NR_futex_time64 on
32-bit archs with 64-bit time_t (y2038 safe).

To allow the test to pass let's just do the __NR_futex_time64 syscall if
we don't have __NR_futex defined.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

powerpc64: Add memory protection key support [BZ #23202]

The 32-bit protection key behavior is somewhat unclear on 32-bit powerpc,
so this change is restricted to the 64-bit variants.

Flag translation is needed because of hardware differences between the
POWER implementation (read and write flags) and the Intel implementation
(write and read+write flags).
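
The translation can be pictured with the sketch below. The hardware bit
values HW_READ_DISABLE and HW_WRITE_DISABLE, the fallback PKEY_* values, and
both function names are assumptions made for illustration; they are not
taken from the patch.

```c
#include <assert.h>

/* Fallback values so the sketch is self-contained.  */
#ifndef PKEY_DISABLE_ACCESS
# define PKEY_DISABLE_ACCESS 0x1
#endif
#ifndef PKEY_DISABLE_WRITE
# define PKEY_DISABLE_WRITE 0x2
#endif

/* Assumed hardware encoding: separate read-disable and write-disable
   bits for each key.  */
#define HW_READ_DISABLE  0x1UL
#define HW_WRITE_DISABLE 0x2UL

/* API flags (write, or read+write) to hardware bits (read, write).  */
unsigned long
rights_to_hw (unsigned int rights)
{
  assert ((rights & ~(unsigned int) (PKEY_DISABLE_ACCESS
                                     | PKEY_DISABLE_WRITE)) == 0);
  if (rights & PKEY_DISABLE_ACCESS)
    return HW_READ_DISABLE | HW_WRITE_DISABLE;
  if (rights & PKEY_DISABLE_WRITE)
    return HW_WRITE_DISABLE;
  return 0;
}

/* Hardware bits back to API flags.  */
unsigned int
hw_to_rights (unsigned long bits)
{
  if (bits & HW_READ_DISABLE)
    return PKEY_DISABLE_ACCESS;
  if (bits & HW_WRITE_DISABLE)
    return PKEY_DISABLE_WRITE;
  return 0;
}
```

Note that a round trip of PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE comes back
as PKEY_DISABLE_ACCESS alone in this sketch, since the combined value carries
no extra information.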

ldbl-128ibm-compat: Provide a scalb implementation

Reuse the template in order to provide the redirect for
scalbl to __scalbieee128, but avoid any extra aliasing
as this is intended to support long double redirects only.

Add a generic scalb implementation

This is a preparatory patch to ease reuse when building a _Float128
variant to alias this long double only symbol.

Notably, stubs are added where missing to the native _Float128
sysdep dir to prevent building these newly templated variants
created inside the build directories.

Also noteworthy are the changes around LIBM_SVID_COMPAT.  These
changes are not intuitive.  The templated version is only
enabled when !LIBM_SVID_COMPAT, and the compat version is
predicated entirely on LIBM_SVID_COMPAT.  Thus, exactly one is
stubbed out entirely when building.  The nldbl scalb compat
files are updated to account for this.

Likewise, fix up the reuse of m68k's e_scalb{f,l}.c to include
its override of e_scalb.c.  Otherwise, the search path finds
the templated copy in the build directory.  This could be
further simplified by providing an overridden template, but I
lack the hardware to verify.

Adjust thresholds in Bessel function implementations (bug 14469).

A recent discussion in bug 14469 notes that a threshold in float
Bessel function implementations, used to determine when to use a
simpler implementation approach, results in substantially inaccurate
results.

As I discussed in
<https://sourceware.org/ml/libc-alpha/2013-03/msg00345.html>, a
heuristic argument suggests 2^(S+P) as the right order of magnitude
for a suitable threshold, where S is the number of significand bits in
the floating-point type and P is the number of significant bits in the
representation of the floating-point type, and the float and ldbl-96
implementations use thresholds that are too small.  Some threshold
does need using, there or elsewhere in the implementation, to avoid
spurious underflow and overflow for large arguments.

This patch sets the thresholds in the affected implementations to more
heuristically justifiable values.  Results will still be inaccurate
close to zeroes of the functions (thus this patch does *not* fix any
of the bugs for Bessel function inaccuracy); fixing that would require
a different implementation approach, likely along the lines described
in <http://www.cl.cam.ac.uk/~jrh13/papers/bessel.ps.gz>.

So the justification for a change such as this would be statistical
rather than based on particular tests that had excessive errors and no
longer do so (no doubt such tests could be found, but would probably
be too fragile to add to the testsuite, as liable to give large errors
again from very small implementation changes or even from compiler
changes).  See
<https://sourceware.org/ml/libc-alpha/2020-02/msg00638.html> for such
statistics of the resulting improvements for float functions.

Tested (glibc testsuite) for x86_64.

resolv: Fix ABA race in /etc/resolv.conf change detection [BZ #25420]

__resolv_conf_get_current should only record the initial file
change data after verifying that the file just read matches the
original measurement. Fixes commit aef16cc8a4c670036d45590877
("resolv: Automatically reload a changed /etc/resolv.conf file
[BZ #984]").

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

resolv: Enhance __resolv_conf_load to capture file change data

The data is captured after reading the file. This allows callers
to check the change data against an earlier measurement.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

resolv: Fix file handle leak in __resolv_conf_load [BZ #25429]

res_vinit_1 did not close the stream on errors, only on success.
This change moves closing the stream to __resolv_conf_load, for both
the success and error cases.

Fixes commit 89f187a40fc0ad4e22838526bfe34d73f758b776 ("resolv: Use
getline for configuration file reading in res_vinit_1") and commit
3f853f22c87f0b671c0366eb290919719fa56c0e ("resolv: Lift domain search
list limits [BZ #19569] [BZ #21475]"), where memory allocation was
introduced into res_vinit_1.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

resolv: Use <file_change_detection.h> in __resolv_conf_get_current

Only minor functional changes (i.e., regarding the handling of
directories, which are now treated as empty files).

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

Add STATX_ATTR_VERITY from Linux 5.5 to bits/statx-generic.h.

This patch adds the new STATX_ATTR_VERITY macro from Linux 5.5 to
glibc's bits/statx-generic.h. (This only does anything if glibc is
being used with old kernel headers.)

Tested for x86_64.

Use gcc -finput-charset=ascii for check-installed-headers.

A non-ASCII character in the installed headers now leads to:
error: failure to convert ascii to UTF-8

Such a finding in the s390-specific fenv.h leads to failures in the
GCC testsuite.
See glibc commit 08aea89ef67c5780ae734073494df0a451bce20f.

Adding this gcc option also to our tests was proposed by Florian Weimer.

This change also found a hit in resource.h where now "microseconds" is used.
I've adjusted all the resource.h files.

I've used the following command to check for further hits in headers.
LC_ALL=C find -name "*.h" -exec grep -PHn "[\x80-\xFF]" {} \;
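
The same check as the grep pattern above (any byte in the range
0x80-0xFF) can be sketched in C. The function name is hypothetical
and this is only an illustration, not part of the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Return the 1-based number of the first line that contains a byte
   in the range 0x80-0xFF, or 0 if the buffer is pure ASCII.  */
static size_t
first_non_ascii_line (const char *buf, size_t len)
{
  assert (buf != NULL || len == 0);
  size_t line = 1;
  for (size_t i = 0; i < len; i++)
    {
      unsigned char c = (unsigned char) buf[i];
      if (c >= 0x80)
        return line;
      if (c == '\n')
        line++;
    }
  return 0;
}
```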

Tested on s390x and x86_64.

Reviewed-by: Zack Weinberg <zackw@panix.com>

math/test-sinl-pseudo: Use stack protector only if available

This fixes commit 9333498794cde1d5cca518bad ("Avoid ldbl-96 stack
corruption from range reduction of pseudo-zero (bug 25487).").

alpha: Fix static gettimeofday symbol

With strong_alias undefined in the alpha implementation, the
default_symbol_version macro becomes an empty macro in static builds.
This fixes the issue introduced in c953219420.

Checked on alpha-linux-gnu with a 'make check run-built-tests=no'.

nss_nisplus: Use NSS_DECLARE_MODULE_FUNCTIONS

Reviewed-by: DJ Delorie <dj@redhat.com>

nss_dns: Use NSS_DECLARE_MODULE_FUNCTIONS

Reviewed-by: DJ Delorie <dj@redhat.com>

nss_files: Use NSS_DECLARE_MODULE_FUNCTIONS

Reviewed-by: DJ Delorie <dj@redhat.com>

nss_db: Use NSS_DECLARE_MODULE_FUNCTIONS

Reviewed-by: DJ Delorie <dj@redhat.com>

nss_compat: Use NSS_DECLARE_MODULE_FUNCTIONS

Reviewed-by: DJ Delorie <dj@redhat.com>

nss_hesiod: Use NSS_DECLARE_MODULE_FUNCTIONS

Reviewed-by: DJ Delorie <dj@redhat.com>

nss: Add function types and NSS_DECLARE_MODULE_FUNCTIONS macro to <nss.h>

This macro allows adding type safety to the implementation of NSS
service modules.
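
The general technique can be sketched like this. The macro and type
below are hypothetical and much simpler than what <nss.h> provides;
they only show how declaring a function through a shared function
type turns a signature mismatch into a compile-time error.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical function type shared by all modules.  */
typedef int lookup_function (const char *name, char *buffer,
                             size_t buflen);

/* Hypothetical macro: declares the entry point of MODULE with the
   shared type.  */
#define DECLARE_MODULE_FUNCTIONS(module) \
  extern lookup_function module##_lookup;

DECLARE_MODULE_FUNCTIONS (demo)

/* The definition has to match the declaration above.  Changing a
   parameter type here yields a "conflicting types" error instead of
   a silent mismatch at run time.  */
int
demo_lookup (const char *name, char *buffer, size_t buflen)
{
  assert (name != NULL && buffer != NULL);
  size_t len = strlen (name);
  if (len >= buflen)
    return -1;
  memcpy (buffer, name, len + 1);
  return 0;
}
```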

Reviewed-by: DJ Delorie <dj@redhat.com>

nss_compat: Do not use nss_* names for function pointers

A future commit will use these names for types of functions
in NSS service modules.

Reviewed-by: DJ Delorie <dj@redhat.com>

Avoid ldbl-96 stack corruption from range reduction of pseudo-zero (bug 25487).

Bug 25487 reports stack corruption in ldbl-96 sinl on a pseudo-zero
argument (a representation where all the significand bits, including
the explicit high bit, are zero, but the exponent is not zero, which
is not a valid representation for the long double type).

Although this is not a valid long double representation, existing
practice in this area (see bug 4586, originally marked invalid but
subsequently fixed) is that we still seek to avoid invalid memory
accesses as a result, in case of programs that treat arbitrary binary
data as long double representations, although the invalid
representations of the ldbl-96 format do not need to be consistently
handled the same as any particular valid representation.

This patch makes the range reduction detect pseudo-zero and unnormal
representations that would otherwise go to __kernel_rem_pio2, and
returns a NaN for them instead of continuing with the range reduction
process. (Pseudo-zero and unnormal representations whose unbiased
exponent is less than -1 have already been safely returned from the
function before this point without going through the rest of range
reduction.) Pseudo-zero representations would previously result in
the value passed to __kernel_rem_pio2 being all-zero, which is
definitely unsafe; unnormal representations would previously result in
a value passed whose high bit is zero, which might well be unsafe
since that is not a form of input expected by __kernel_rem_pio2.
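
A minimal sketch of the classification involved, working on the raw
fields of the ldbl-96 format (16 bits of sign and exponent, 64 bits
of significand with an explicit high bit). The function names are
hypothetical, and the sketch omits the exponent range test that
decides whether a value would reach __kernel_rem_pio2 at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Nonzero exponent with the explicit high bit clear: pseudo-zero or
   unnormal (for the maximum exponent, likewise not a valid
   representation).  */
static bool
ldbl96_high_bit_missing (uint16_t sign_exponent, uint64_t significand)
{
  unsigned int exponent = sign_exponent & 0x7fff;
  return exponent != 0 && (significand >> 63) == 0;
}

/* Pseudo-zero: nonzero exponent, all significand bits zero.  */
static bool
ldbl96_is_pseudo_zero (uint16_t sign_exponent, uint64_t significand)
{
  return (sign_exponent & 0x7fff) != 0 && significand == 0;
}

/* True if range reduction should give up and produce a NaN rather
   than pass the value on.  */
static bool
ldbl96_needs_nan_result (uint16_t sign_exponent, uint64_t significand)
{
  bool invalid = ldbl96_high_bit_missing (sign_exponent, significand);
  /* Every pseudo-zero is covered by the high-bit test.  */
  assert (!ldbl96_is_pseudo_zero (sign_exponent, significand)
          || invalid);
  return invalid;
}
```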

Tested for x86_64.

mips: Fix argument passing for inlined syscalls on Linux [BZ #25523]

According to [gcc documentation][1], temporary variables must be used for
the desired content to not be call-clobbered.

Fix the Linux inline syscall templates by adding temporary variables,
much like what x86 did before
(commit 381a0c26d73e0f074c962e0ab53b99a6c327066d).

Tested with gcc 9.2.0, both cross-compiled and natively on Loongson
3A4000.

[1]: https://gcc.gnu.org/onlinedocs/gcc/Local-Register-Variables.html

mips: Use 'long int' and 'long long int' in linux syscall code

Style fixes only, no functional change.

alpha: Use generic gettimeofday implementation

This makes alpha no longer report information about a system-wide
time zone and moves the version logic to the alpha implementation.

Checked on a build and check-abi for alpha-linux-gnu.

Reviewed-by: Lukasz Majewski <lukma@denx.de>

sunrpc: Properly clean up if tst-udp-timeout fails

The macro TEST_VERIFY_EXIT is used several times in
sunrpc/tst-udp-timeout to exit the test if a condition evaluates to
false. The side effect is that the code to terminate the RPC server
process is not executed when the program calls exit, so that
subprocess stays alive.

This commit registers a clean up function with atexit to kill the
server process before exiting the main program.
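
The pattern can be sketched as follows. This is an illustration with
hypothetical names (start_server, kill_server) and a placeholder
child loop, not the code of the test itself.

```c
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static pid_t server_pid = -1;

/* Clean-up handler: terminate and reap the server if it is still
   running.  Safe to call more than once.  */
static void
kill_server (void)
{
  if (server_pid > 0)
    {
      kill (server_pid, SIGTERM);
      waitpid (server_pid, NULL, 0);
      server_pid = -1;
    }
}

/* Fork a placeholder server and register the clean-up handler, so
   that every later call to exit also stops the server.  */
static pid_t
start_server (void)
{
  assert (server_pid == -1);
  pid_t pid = fork ();
  if (pid == 0)
    for (;;)
      pause ();   /* Stand-in for the real server loop.  */
  if (pid > 0)
    {
      server_pid = pid;
      atexit (kill_server);
    }
  return pid;
}
```

Because the handler is registered with atexit, an early exit from a
failed check runs it just like a normal return from main.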

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>

elf: avoid stack allocation in dl_open_worker

As the sort was removed, there's no need to keep a separate map of
links. Instead, when relocating objects iterate over l_initfini
directly.

This allows us to remove the loop copying l_initfini elements into
map. We still need a loop to identify the first and last elements that
need relocation.
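
A minimal sketch of the remaining loop, using a hypothetical and much
reduced stand-in for struct link_map. It scans a NULL-terminated,
dependency-sorted list in place and reports the first and last
entries that still need relocation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for struct link_map.  */
struct object
{
  const char *name;
  bool relocated;
};

/* Store in *FIRST and *LAST the indices of the first and last
   entries of LIST that are not yet relocated.  Return false if
   every entry is already relocated.  */
static bool
relocation_range (struct object *const *list, size_t *first,
                  size_t *last)
{
  assert (list != NULL && first != NULL && last != NULL);
  bool found = false;
  for (size_t i = 0; list[i] != NULL; i++)
    if (!list[i]->relocated)
      {
        if (!found)
          {
            *first = i;
            found = true;
          }
        *last = i;
      }
  return found;
}
```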

Tested by running the testsuite on x86_64.

elf: avoid redundant sort in dlopen

l_initfini is already sorted by dependency in _dl_map_object_deps(),
so avoid sorting again in dl_open_worker().

Tested by running the testsuite on x86_64.

elf: Allow dlopen of filter object to work [BZ #16272]

Two fixes are needed to be able to dlopen filter objects. First,
_dl_map_object_deps cannot assume that map will be at the beginning
of l_searchlist.r_list[], as filtees are inserted before map. Second,
dl_open_worker needs to ensure that filtees get relocated.

In _dl_map_object_deps:

* avoid removing relocation dependencies of map by setting
  l_reserved to 0 and otherwise processing the rest of the search
  list.

* ensure that map remains at the beginning of l_initfini - the list
  of things that need initialisation (and destruction). Do this by
  splitting the copy up. This may not be required, but matches the
  initialization order without dlopen.

Modify dl_open_worker to relocate the objects in new->l_initfini.
new->l_initfini is constructed in _dl_map_object_deps, and lists the
objects that need initialization and destruction. Originally, the
objects in the new->l_next list were relocated. All of these objects
should also be included in new->l_initfini (both lists are populated
with dependencies in _dl_map_object_deps). We can't use new->l_prev
to pick up filtees, as during a recursive dlopen from an interposed
malloc call, new->l_prev can contain objects that are not ready for
relocation.

Add tests to verify that symbols resolve to the filtee implementation
when auxiliary and filter objects are used, both as a normal link and
when dlopen'd.

Tested by running the testsuite on x86_64.

Update translations

Pull in translation update from translation.org.

Rename RWF_WRITE_LIFE_NOT_SET to RWH_WRITE_LIFE_NOT_SET following Linux 5.5.

Linux 5.5 renames RWF_WRITE_LIFE_NOT_SET to RWH_WRITE_LIFE_NOT_SET,
with the old name kept as an alias. This patch makes the
corresponding change in glibc.

Tested for x86_64.

S390: Fix non-ascii character in fenv.h.

The comment "isn't" contained a non-ascii character which leads to
an error if compiled with -finput-charset=ascii:
error: failure to convert ascii to UTF-8

This is observable in GCC testsuite:
FAIL: 17_intro/headers/c++1998/charset.cc (test for excess errors)
FAIL: 17_intro/headers/c++2011/charset.cc (test for excess errors)
FAIL: 17_intro/headers/c++2014/charset.cc (test for excess errors)
FAIL: 17_intro/headers/c++2017/charset.cc (test for excess errors)
FAIL: 17_intro/headers/c++2020/charset.cc (test for excess errors)

Also rewrite the comment above.

Reported-by: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>

io: Add io/tst-lchmod covering lchmod and fchmodat

Linux: Emulate fchmodat with AT_SYMLINK_NOFOLLOW using O_PATH [BZ #14578]

/proc/self/fd files are special and chmod on O_PATH descriptors
in that directory operates on the symbolic link itself (like lchmod).
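
A minimal sketch of the approach, assuming Linux with /proc mounted.
chmod_nofollow is a hypothetical helper, not glibc's fchmodat, and
refusing symbolic links with EOPNOTSUPP is a conservative choice made
for this sketch.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: change the mode of PATH (relative to DIRFD)
   without following a symbolic link in the last path component.  */
static int
chmod_nofollow (int dirfd, const char *path, mode_t mode)
{
  assert (path != NULL);

  /* With O_PATH | O_NOFOLLOW the descriptor refers to the link
     itself if PATH names a symbolic link.  */
  int fd = openat (dirfd, path, O_PATH | O_NOFOLLOW | O_CLOEXEC);
  if (fd < 0)
    return -1;

  struct stat st;
  int ret = fstat (fd, &st);
  if (ret == 0 && S_ISLNK (st.st_mode))
    {
      /* Assumption of this sketch: refuse to touch symbolic links.  */
      errno = EOPNOTSUPP;
      ret = -1;
    }
  if (ret == 0)
    {
      /* chmod through the /proc/self/fd entry acts on the object the
         descriptor refers to.  */
      char procpath[64];
      snprintf (procpath, sizeof procpath, "/proc/self/fd/%d", fd);
      ret = chmod (procpath, mode);
    }

  int saved_errno = errno;
  close (fd);
  errno = saved_errno;
  return ret;
}
```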

io: Implement lchmod using fchmodat [BZ #14578]

Add internal <file_change_detection.h> header file

The code started out with bits from resolv/resolv_conf.c, but it
was enhanced to deal with directories and FIFOs in a more predictable
manner. A test case is included as well.

This will be used to implement the /etc/resolv.conf change detection.

This currently lives in a header file only. Once there are multiple
users, the implementations should be moved into C files.

elf.h: Add R_RISCV_IRELATIVE

The number has been officially assigned by
https://github.com/riscv/riscv-elf-psabi-doc/pull/131
https://github.com/riscv/riscv-elf-psabi-doc/commit/d21ca40a7f56812a15e97450b7bc1599c0d35b82

Fix typo in the name for Wednesday in Kurdish [BZ #9809]

debug: Add missing locale dependencies of fortify tests

The missing dependencies result in failures like this if make check
is invoked with sufficient parallelism for the debug subdirectory:

FAIL: debug/tst-chk2
FAIL: debug/tst-chk3
FAIL: debug/tst-chk4
FAIL: debug/tst-chk5
FAIL: debug/tst-chk6
FAIL: debug/tst-lfschk1
FAIL: debug/tst-lfschk2
FAIL: debug/tst-lfschk3
FAIL: debug/tst-lfschk4
FAIL: debug/tst-lfschk5
FAIL: debug/tst-lfschk6

htl C11 threads: Avoid pthread_ symbols visibility in static library

hurd: Add __pthread_spin_wait and use it

900778283ac3 ("htl: make pthread_spin_lock really spin") made
pthread_spin_lock really spin and not block, but the current users of
__pthread_spin_lock were assuming that it blocks, i.e. they use it as a
lightweight mutex fitting in just one int.

__pthread_spin_wait provides that support back.

ldbl-128ibm-compat: set PRINTF_CHK flag in {,v}sprintf_chk

This should be unconditionally set to match the common implementation,
and fixes multiple test failures related to sprintf.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>

Use --disable-gdbserver in build-many-glibcs.py.

Now that binutils-gdb has gdbserver at top level, an extra
--disable-gdbserver configure option is needed when configuring
binutils from a git checkout to avoid it also building gdbserver
unnecessarily (although fairly harmlessly). This patch updates the
options used in build-many-glibcs.py accordingly (although this might
end up not being needed depending on what happens regarding whether
gdbserver gets built for host != target).

Tested with a build-many-glibcs.py compilers build for
aarch64-linux-gnu using binutils-gdb master.