git.ipfire.org Git - thirdparty/glibc.git/log
8 months ago  malloc: send freed small chunks to smallbin
k4lizen [Fri, 29 Nov 2024 13:25:29 +0000 (13:25 +0000)] 
malloc: send freed small chunks to smallbin

Large chunks get added to the unsorted bin since
sorting them takes time.  For small chunks the
benefit of adding them to the unsorted bin is
non-existent, and it actually hurts performance.

Splitting and malloc_consolidate still add small
chunks to unsorted, but we can hint to the compiler
that this is a relatively rare occurrence.
Benchmarking shows this to be consistently good.
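
As a rough illustration of the compiler hint mentioned above (not the code from this patch), a branch can be marked as rarely taken with the GCC/Clang builtin `__builtin_expect`. The names `UNLIKELY`, `choose_bin`, `from_split` and the value of `SMALL_LIMIT` are hypothetical and chosen only for this sketch.

```c
#include <stddef.h>

/* Branch hint in the style of glibc's __glibc_unlikely; UNLIKELY is a
   local, illustrative name.  */
#define UNLIKELY(cond) __builtin_expect (!!(cond), 0)

enum bin_kind { BIN_SMALL, BIN_UNSORTED };

/* SMALL_LIMIT is a made-up threshold, not the value malloc uses.  */
#define SMALL_LIMIT 1024

/* Pick a destination bin for a freed chunk.  FROM_SPLIT models the
   rare paths (splitting, consolidation) that still use the unsorted
   bin even for small chunks.  */
enum bin_kind
choose_bin (size_t chunk_size, int from_split)
{
  if (UNLIKELY (from_split))
    return BIN_UNSORTED;
  return chunk_size < SMALL_LIMIT ? BIN_SMALL : BIN_UNSORTED;
}
```

The hint only affects code layout and branch prediction; the result is the same with or without it.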

Authored-by: k4lizen <k4lizen@proton.me>
Signed-off-by: Aleksa Siriški <sir@tmina.org>
8 months ago  AArch64: Remove zva_128 from memset
Wilco Dijkstra [Mon, 25 Nov 2024 18:43:08 +0000 (18:43 +0000)] 
AArch64: Remove zva_128 from memset

Remove ZVA 128 support from memset - the new memset no longer
guarantees count >= 256, which can result in underflow and a
crash if ZVA size is 128 ([1]).  Since only one CPU uses a ZVA
size of 128 and its memcpy implementation was removed in commit
e162ab2bf1b82c40f29e1925986582fa07568ce8, remove this special
case too.

[1] https://sourceware.org/pipermail/libc-alpha/2024-November/161626.html

Reviewed-by: Andrew Pinski <quic_apinski@quicinc.com>
8 months ago  benchtests: Add calloc test
Wangyang Guo [Fri, 29 Nov 2024 08:05:35 +0000 (16:05 +0800)] 
benchtests: Add calloc test

Add two new benchmarks related to calloc:
- bench-calloc-simple
- bench-calloc-thread
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
8 months ago  pthread_getcpuclockid: Add descriptive comment to smoke test
Siddhesh Poyarekar [Thu, 28 Nov 2024 11:30:40 +0000 (06:30 -0500)] 
pthread_getcpuclockid: Add descriptive comment to smoke test

Add a descriptive comment to the tst-pthread-cpuclockid-invalid test and
also drop pthread_getcpuclockid from the TODO-testing list since it now
has full coverage.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
8 months ago  Remove nios2-linux-gnu
Adhemerval Zanella [Tue, 26 Nov 2024 19:34:00 +0000 (16:34 -0300)] 
Remove nios2-linux-gnu

GCC 15 (e876acab6cdd84bb2b32c98fc69fb0ba29c81153) and binutils
(e7a16d9fd65098045ef5959bf98d990f12314111) both removed all Nios II
support, and the architecture has been EOL'ed by the vendor.  The
kernel still has support, but without a proper compiler there
is not much sense in keeping it in glibc.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
8 months ago  libio: make _IO_least_marker static
Siddhesh Poyarekar [Thu, 28 Nov 2024 13:27:24 +0000 (08:27 -0500)] 
libio: make _IO_least_marker static

Trivial cleanup to limit _IO_least_marker so that it's clear that it is
unused outside of genops.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
8 months ago  malloc: Avoid func call for tcache quick path in free()
Wangyang Guo [Tue, 26 Nov 2024 07:33:38 +0000 (15:33 +0800)] 
malloc: Avoid func call for tcache quick path in free()

Tcache is an important optimization to accelerate memory free(), so things
within this code path should be kept as simple as possible. This commit
tries to remove the function call when free() invokes the tcache code path by
inlining _int_free().

Result of bench-malloc-thread benchmark

Test Platform: Xeon-8380
Ratio: New / Original time_per_iteration (Lower is Better)

Threads#   | Ratio
-----------|------
1 thread   | 0.879
4 threads  | 0.874

The performance data shows it can improve the bench-malloc-thread benchmark
by ~12% in both the single-threaded and multi-threaded scenarios.
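
The ~12% figure follows from the ratios in the table above: a New / Original ratio r corresponds to a time reduction of (1 - r) * 100 percent. A minimal sketch of that arithmetic, with `reduction_percent` as an illustrative helper name that does not appear in the benchmark:

```c
/* Percentage reduction in time per iteration for a given
   New / Original ratio (lower ratio means a larger reduction).  */
double
reduction_percent (double ratio)
{
  return (1.0 - ratio) * 100.0;
}
```

With the ratios reported above, 0.879 gives roughly 12.1 and 0.874 roughly 12.6.
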
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
8 months ago  debug: Fix tst-longjmp_chk3 build failure on Hurd
Florian Weimer [Tue, 26 Nov 2024 18:26:13 +0000 (19:26 +0100)] 
debug: Fix tst-longjmp_chk3 build failure on Hurd

Explicitly include <unistd.h> for _exit and getpid.

8 months ago  math: Add internal roundeven_finite
Adhemerval Zanella [Mon, 11 Nov 2024 20:38:44 +0000 (17:38 -0300)] 
math: Add internal roundeven_finite

Some CORE-MATH routines use roundeven, and most ISAs do not have
a specific instruction for the operation.  In this case, the call
will be routed to the generic implementation.

However, if the ISA does support round() and ctz() there is a better
alternative (as used by CORE-MATH).

This patch adds this optimization and also enables it on powerpc.
On a power10 it shows the following improvement:

expm1f                      master      patched       improvement
latency                     9.8574       7.0139            28.85%
reciprocal-throughput       4.3742       2.6592            39.21%

Checked on powerpc64le-linux-gnu and aarch64-linux-gnu.

Reviewed-by: DJ Delorie <dj@redhat.com>
8 months ago  RISC-V: Use builtin for fma and fmaf
Julian Zhu [Wed, 11 Sep 2024 07:13:19 +0000 (15:13 +0800)] 
RISC-V: Use builtin for fma and fmaf

The built-in functions `__builtin_{fma, fmaf}` are sufficient to generate correct `fmadd.d`/`fmadd.s` instructions on RISC-V.

Signed-off-by: Julian Zhu <jz531210@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
8 months ago  RISC-V: Use builtin for copysign and copysignf
Julian Zhu [Wed, 11 Sep 2024 08:05:12 +0000 (16:05 +0800)] 
RISC-V: Use builtin for copysign and copysignf

The built-in functions `__builtin_{copysign, copysignf}` are sufficient to generate correct `fsgnj.d`/`fsgnj.s` instructions on RISC-V.

Signed-off-by: Julian Zhu <jz531210@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
8 months ago  Silence most -Wzero-as-null-pointer-constant diagnostics
Alejandro Colomar [Sat, 16 Nov 2024 15:51:31 +0000 (16:51 +0100)] 
Silence most -Wzero-as-null-pointer-constant diagnostics

Replace 0 by NULL and {0} by {}.

Omit a few cases that aren't so trivial to fix.

Link: <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117059>
Link: <https://software.codidact.com/posts/292718/292759#answer-292759>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
8 months ago  sysdeps: linux: Fix output of LD_SHOW_AUXV=1 for AT_RSEQ_*
Yannick Le Pennec [Mon, 25 Nov 2024 13:12:05 +0000 (14:12 +0100)] 
sysdeps: linux: Fix output of LD_SHOW_AUXV=1 for AT_RSEQ_*

The constants themselves were added to elf.h back in 8754a4133e but the
array in _dl_show_auxv wasn't modified accordingly, resulting in the
following output when running LD_SHOW_AUXV=1 /bin/true on recent Linux:

    AT_??? (0x1b): 0x1c
    AT_??? (0x1c): 0x20

With this patch:

    AT_RSEQ_FEATURE_SIZE: 28
    AT_RSEQ_ALIGN:        32

Tested on Linux 6.11 x86_64
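
For reference, the same two auxiliary-vector entries can be read from within a program using `getauxval` from `<sys/auxv.h>`. This is only a sketch: `show_rseq_auxv` is a hypothetical helper, and the fallback definitions assume the tag values 0x1b and 0x1c shown in the unpatched output above.

```c
#include <stdio.h>
#include <sys/auxv.h>

/* Fallbacks for older headers; 0x1b and 0x1c are the tags shown in
   the unpatched LD_SHOW_AUXV output.  */
#ifndef AT_RSEQ_FEATURE_SIZE
# define AT_RSEQ_FEATURE_SIZE 27
#endif
#ifndef AT_RSEQ_ALIGN
# define AT_RSEQ_ALIGN 28
#endif

/* Print both entries and return the reported alignment.  getauxval
   returns 0 when the kernel did not supply the entry.  */
unsigned long
show_rseq_auxv (void)
{
  unsigned long size = getauxval (AT_RSEQ_FEATURE_SIZE);
  unsigned long align = getauxval (AT_RSEQ_ALIGN);
  printf ("AT_RSEQ_FEATURE_SIZE: %lu\n", size);
  printf ("AT_RSEQ_ALIGN:        %lu\n", align);
  return align;
}
```

On kernels that predate these entries both values are expected to print as 0.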

Signed-off-by: Yannick Le Pennec <yannick.lepennec@live.fr>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
8 months ago  debug: Wire up tst-longjmp_chk3
Florian Weimer [Mon, 25 Nov 2024 16:32:54 +0000 (17:32 +0100)] 
debug: Wire up tst-longjmp_chk3

The test was added in commit ac8cc9e300a002228eb7e660df3e7b333d9a7414
without all the required Makefile scaffolding.  Tweak the test
so that it actually builds (including with dynamic SIGSTKSZ).

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
8 months ago  nptl: initialize cpu_id_start prior to rseq registration
Michael Jeanson [Wed, 20 Nov 2024 19:15:42 +0000 (14:15 -0500)] 
nptl: initialize cpu_id_start prior to rseq registration

When adding explicit initialization of rseq fields prior to
registration, I glossed over the fact that 'cpu_id_start' is also
documented as initialized by user-space.

While current kernels don't validate the content of this field on
registration, future ones could.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
8 months ago  math: Fix branch hint for 68d7128942
Adhemerval Zanella [Mon, 25 Nov 2024 16:37:50 +0000 (13:37 -0300)] 
math: Fix branch hint for 68d7128942

8 months ago  powerpc64le: ROP Changes for strncpy/ppc-mount
Sachin Monga [Mon, 25 Nov 2024 15:17:30 +0000 (10:17 -0500)] 
powerpc64le: ROP Changes for strncpy/ppc-mount

Add ROP protect instructions to strncpy and ppc-mount functions.
Modify FRAME_MIN_SIZE to 48 bytes for ELFv2 to reserve additional
16 bytes for ROP save slot and padding.

Signed-off-by: Sachin Monga <smonga@linux.ibm.com>
Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
8 months ago  math: Fix non-portability in the computation of signgam in lgammaf
Vincent Lefevre [Fri, 22 Nov 2024 16:54:53 +0000 (13:54 -0300)] 
math: Fix non-portability in the computation of signgam in lgammaf

The k>>31 in signgam = 1 - (((k&(k>>31))&1)<<1); is not portable:

* The ISO C standard says "If E1 has a signed type and a negative
  value, the resulting value is implementation-defined." (this is
  still in C23).
* If the int type is larger than 32 bits (e.g. a 64-bit type),
  then k = INT_MAX; line 144 will make k>>31 put 1 in bit 0
  (thus signgam will be -1) while 0 is expected.

Moreover, instead of the fx >= 0x1p31f condition, testing fx >= 0
is probably better for 2 reasons:

* The signgam expression has more or less a condition on the sign
  of fx (the goal of k>>31, which can be dropped with this new
  condition). Since fx ≥ 0 should be the most common case, one can
  get signgam directly in this case (value 1). And this simplifies
  the expression for the other case (fx < 0).
* This new condition may be easier/faster to test on the processor
  (e.g. by avoiding a load of a constant from the memory).
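
To make the idea concrete, here is a minimal sketch of a sign computation that avoids right-shifting a negative value. It is not the CORE-MATH code: `signgam_from_floor` is a hypothetical helper, and it assumes k already holds floor(x) for a finite x that is not a pole.

```c
/* Sign of gamma(x), given k = floor(x).  For k >= 0 the sign is
   always 1, so the common case needs no bit manipulation.  For
   k < 0 the sign is -1 exactly when k is odd; the remainder of a
   negative odd int is -1 in C99 and later, so the test is
   well-defined for any int width.  */
int
signgam_from_floor (int k)
{
  if (k >= 0)
    return 1;
  return (k % 2 != 0) ? -1 : 1;
}
```

For example, x = -2.5 has floor -3 (odd), which matches gamma(-2.5) being negative, while x = -1.5 has floor -2 (even), which matches gamma(-1.5) being positive.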

This is commit d41459c731865516318f813cf4c966dafa0eecbf from CORE-MATH.

Checked on x86_64-linux-gnu.

8 months ago  malloc: Split _int_free() into 3 sub functions
Wangyang Guo [Thu, 29 Aug 2024 06:27:28 +0000 (14:27 +0800)] 
malloc: Split _int_free() into 3 sub functions

Split _int_free() into 3 smaller functions for flexible combination:
* _int_free_check -- sanity check for free
* tcache_free -- free memory to tcache (quick path)
* _int_free_chunk -- free memory chunk (slow path)

8 months ago  hurd: Add MAP_NORESERVE mmap flag
Samuel Thibault [Sun, 24 Nov 2024 23:54:26 +0000 (00:54 +0100)] 
hurd: Add MAP_NORESERVE mmap flag

This is already the current default behavior, which will change once
overcommit support is added.

8 months ago  nptl: Add smoke test for pthread_getcpuclockid failure
Siddhesh Poyarekar [Thu, 21 Nov 2024 22:13:33 +0000 (17:13 -0500)] 
nptl: Add smoke test for pthread_getcpuclockid failure

Exercise the case where an exited thread will cause
pthread_getcpuclockid to fail.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
8 months ago  Add multithreaded test of sem_getvalue
Joseph Myers [Fri, 22 Nov 2024 16:58:51 +0000 (16:58 +0000)] 
Add multithreaded test of sem_getvalue

Test coverage of sem_getvalue is fairly limited.  Add a test that runs
it on threads on each CPU.  For this purpose I adapted
tst-skeleton-thread-affinity.c; it didn't seem very suitable to use
as-is or include directly in a different test doing things per-CPU,
but did seem a suitable starting point (thus sharing
tst-skeleton-affinity.c) for such testing.

Tested for x86_64.

8 months ago  math: Use tanf from CORE-MATH
Adhemerval Zanella [Fri, 8 Nov 2024 16:24:28 +0000 (13:24 -0300)] 
math: Use tanf from CORE-MATH

The CORE-MATH implementation is correctly rounded (for any rounding mode)
and shows better performance than the generic tanf.

The code was adapted to glibc style, to use the definition of
math_config.h, to remove errno handling, and to use a generic
128 bit routine for ABIs that do not support it natively.

Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (neoverse1,
gcc 13.2.1), and powerpc (POWER10, gcc 13.2.1):

latency                       master       patched  improvement
x86_64                       82.3961       54.8052       33.49%
x86_64v2                     82.3415       54.8052       33.44%
x86_64v3                     69.3661       50.4864       27.22%
i686                         219.271       45.5396       79.23%
aarch64                      29.2127       19.1951       34.29%
power10                      19.5060       16.2760       16.56%

reciprocal-throughput         master       patched  improvement
x86_64                       28.3976       19.7334       30.51%
x86_64v2                     28.4568       19.7334       30.65%
x86_64v3                     21.1815       16.1811       23.61%
i686                         105.016       15.1426       85.58%
aarch64                      18.1573       10.7681       40.70%
power10                       8.7207        8.7097        0.13%

Signed-off-by: Alexei Sibidanov <sibid@uvic.ca>
Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
8 months ago  math: Use lgammaf from CORE-MATH
Adhemerval Zanella [Wed, 30 Oct 2024 14:50:03 +0000 (11:50 -0300)] 
math: Use lgammaf from CORE-MATH

The CORE-MATH implementation is correctly rounded (for any rounding mode)
and shows better performance than the generic lgammaf.

The code was adapted to glibc style, to use the definition of
math_config.h, to remove errno handling, to use math_narrow_eval
on overflow usage, and to adapt to make it reentrant.

Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (M1,
gcc 13.2.1), and powerpc (POWER10, gcc 13.2.1):

latency                       master       patched  improvement
x86_64                       86.5609       70.3278       18.75%
x86_64v2                     78.3030       69.9709       10.64%
x86_64v3                     74.7470       59.8457       19.94%
i686                         387.355       229.761       40.68%
aarch64                      40.8341       33.7563       17.33%
power10                      26.5520       16.1672       39.11%
powerpc                      28.3145       17.0625       39.74%

reciprocal-throughput         master       patched  improvement
x86_64                       68.0461       48.3098       29.00%
x86_64v2                     55.3256       47.2476       14.60%
x86_64v3                     52.3015       38.9028       25.62%
i686                         340.848       195.707       42.58%
aarch64                      36.8000       30.5234       17.06%
power10                      20.4043       12.6268       38.12%
powerpc                      22.6588       13.8866       38.71%

Signed-off-by: Alexei Sibidanov <sibid@uvic.ca>
Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
8 months ago  math: Use erfcf from CORE-MATH
Adhemerval Zanella [Tue, 29 Oct 2024 13:02:20 +0000 (10:02 -0300)] 
math: Use erfcf from CORE-MATH

The CORE-MATH implementation is correctly rounded (for any rounding mode)
and shows better performance than the generic erfcf.

The code was adapted to glibc style and to use the definition of
math_config.h.

Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (M1,
gcc 13.2.1), and powerpc (POWER10, gcc 13.2.1):

latency                       master       patched  improvement
x86_64                       98.8796       66.2142       33.04%
x86_64v2                     98.9617       67.4221       31.87%
x86_64v3                     87.4161       53.1754       39.17%
aarch64                      33.8336       22.0781       34.75%
power10                      21.1750       13.5864       35.84%
powerpc                      21.4694       13.8149       35.65%

reciprocal-throughput         master       patched  improvement
x86_64                       48.5620       27.6731       43.01%
x86_64v2                     47.9497       28.3804       40.81%
x86_64v3                     42.0255       18.1355       56.85%
aarch64                      24.3938       13.4041       45.05%
power10                      10.4919        6.1881       41.02%
powerpc                       11.763       6.76468       42.49%

Signed-off-by: Alexei Sibidanov <sibid@uvic.ca>
Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
8 months ago  math: Use erff from CORE-MATH
Adhemerval Zanella [Mon, 28 Oct 2024 20:58:18 +0000 (17:58 -0300)] 
math: Use erff from CORE-MATH

The CORE-MATH implementation is correctly rounded (for any rounding mode)
and shows better performance than the generic erff.

The code was adapted to glibc style and to use the definition of
math_config.h.

Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (M1,
gcc 13.2.1), and powerpc (POWER10, gcc 13.2.1):

latency                       master       patched  improvement
x86_64                       85.7363       45.1372       47.35%
x86_64v2                     86.6337       38.5816       55.47%
x86_64v3                     71.3810       34.0843       52.25%
i686                         190.143       97.5014       48.72%
aarch64                      34.9091       14.9320       57.23%
power10                      38.6160        8.5188       77.94%
powerpc                      39.7446       8.45781       78.72%

reciprocal-throughput         master       patched  improvement
x86_64                       35.1739       14.7603       58.04%
x86_64v2                     34.5976       11.2283       67.55%
x86_64v3                     27.3260        9.8550       63.94%
i686                         91.0282       30.8840       66.07%
aarch64                      22.5831        6.9615       69.17%
power10                      18.0386        3.0918       82.86%
powerpc                      20.7277       3.63396       82.47%

Signed-off-by: Alexei Sibidanov <sibid@uvic.ca>
Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
8 months ago  math: Split s_erfF in erff and erfc
Adhemerval Zanella [Mon, 28 Oct 2024 20:02:01 +0000 (17:02 -0300)] 
math: Split s_erfF in erff and erfc

So we can eventually replace each implementation.

Reviewed-by: DJ Delorie <dj@redhat.com>
8 months ago  math: Use cbrtf from CORE-MATH
Adhemerval Zanella [Mon, 28 Oct 2024 15:38:50 +0000 (12:38 -0300)] 
math: Use cbrtf from CORE-MATH

The CORE-MATH implementation is correctly rounded (for any rounding mode)
and shows better performance than the generic cbrtf.

The code was adapted to glibc style and to use the definition of
math_config.h.

Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (M1,
gcc 13.2.1), and powerpc (POWER10, gcc 13.2.1):

latency                       master        patched       improvement
x86_64                       68.6348        36.8908            46.25%
x86_64v2                     67.3418        36.6968            45.51%
x86_64v3                     63.4981        32.7859            48.37%
aarch64                      29.3172        12.1496            58.56%
power10                      18.0845         8.8893            50.85%
powerpc                      18.0859        8.79527            51.37%

reciprocal-throughput         master        patched       improvement
x86_64                       36.4369        13.3565            63.34%
x86_64v2                     37.3611        13.1149            64.90%
x86_64v3                     31.6024        11.2102            64.53%
aarch64                      18.6866         7.3474            60.68%
power10                       9.4758         3.6329            61.66%
powerpc                      9.58896        3.90439            59.28%

Signed-off-by: Alexei Sibidanov <sibid@uvic.ca>
Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
8 months ago  benchtests: Add tanf benchmark
Adhemerval Zanella [Fri, 8 Nov 2024 09:13:50 +0000 (09:13 +0000)] 
benchtests: Add tanf benchmark

Random inputs in [-pi, pi].

Reviewed-by: DJ Delorie <dj@redhat.com>
8 months ago  benchtests: Add lgammaf benchmark
Adhemerval Zanella [Tue, 29 Oct 2024 16:40:29 +0000 (13:40 -0300)] 
benchtests: Add lgammaf benchmark

Random inputs in the range [-20.0,20.0].

Reviewed-by: DJ Delorie <dj@redhat.com>
8 months ago  benchtests: Add erfcf benchmark
Adhemerval Zanella [Tue, 29 Oct 2024 12:28:01 +0000 (09:28 -0300)] 
benchtests: Add erfcf benchmark

It is based on binary64 erfc-inputs, with random inputs in
[0,b=0x1.41bbf6p+3] where b is the smallest number such that
erfcf(b) rounds to 0 (to nearest).

Reviewed-by: DJ Delorie <dj@redhat.com>
8 months ago  benchtests: Add erff benchmark
Adhemerval Zanella [Mon, 28 Oct 2024 18:53:30 +0000 (15:53 -0300)] 
benchtests: Add erff benchmark

It is based on binary64 erf-inputs, with random inputs in [0,b=0x1.f5a888p+1]
where b is the smallest number such that erff(b) rounds to 1 (to nearest).

Reviewed-by: DJ Delorie <dj@redhat.com>
8 months ago  benchtests: Add cbrtf benchmark
Adhemerval Zanella [Mon, 28 Oct 2024 13:00:49 +0000 (10:00 -0300)] 
benchtests: Add cbrtf benchmark

Based on binary64 benchtests, with random inputs in [1,8].

8 months ago  elf: Handle static PIE with non-zero load address [BZ #31799]
H.J. Lu [Mon, 28 Oct 2024 22:01:14 +0000 (06:01 +0800)] 
elf: Handle static PIE with non-zero load address [BZ #31799]

For a static PIE with non-zero load address, its PT_DYNAMIC segment
entries contain the relocated values for the load address in static PIE.
Since static PIE usually doesn't have PT_PHDR segment, use p_vaddr of
the PT_LOAD segment with offset == 0 as the load address in static PIE
and adjust the entries of PT_DYNAMIC segment in static PIE by properly
setting the l_addr field for static PIE.  This fixes BZ #31799.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
8 months ago  x86/string: Use `movsl` instead of `movsd` in strncat [BZ #32344]
Siddhesh Poyarekar [Thu, 21 Nov 2024 22:05:11 +0000 (17:05 -0500)] 
x86/string: Use `movsl` instead of `movsd` in strncat [BZ #32344]

The previous patch missed strncat, so fix that.

Resolves: BZ #32344

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
8 months ago  stdlib: Make getenv thread-safe in more cases
Florian Weimer [Thu, 21 Nov 2024 20:10:52 +0000 (21:10 +0100)] 
stdlib: Make getenv thread-safe in more cases

Async-signal-safety is preserved, too.  In fact, getenv is fully
reentrant and can be called from the malloc call in setenv
(if a replacement malloc uses getenv during its initialization).

This is relatively easy to implement because even before this change,
setenv, unsetenv, clearenv, putenv do not deallocate the environment
strings themselves as they are removed from the environment.

The main changes are:

* Use release stores for environment array updates, following
  the usual pattern for safely publishing immutable data
  (in this case, the environment strings).

* Do not deallocate the environment array.  Instead, keep older
  versions around and adopt an exponential resizing policy.  This
  results in an amortized constant space leak per active environment
  variable, but there already is such a leak for the variable itself
  (and that is even length-dependent, and includes no-longer used
  values).

* Add a seqlock-like mechanism to retry getenv if a concurrent
  unsetenv is observed.  Without that, it is possible that
  getenv returns NULL for a variable that is never unset.  This
  is visible on some AArch64 implementations with the newly
  added stdlib/tst-getenv-unsetenv test case.  The mechanism
  is not a pure seqlock because it tolerates one write from
  unsetenv.  This avoids the need for a second copy of the
  environ array that getenv can read from a signal handler
  that happens to interrupt an unsetenv call.
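
The publication pattern from the first bullet can be sketched with C11 atomics. This is only an illustration, not glibc's code: `env_slot`, `publish_entry` and `read_entry` are hypothetical names, a single slot stands in for one environment array entry, and the acquire load on the reader side is an assumption of the sketch (the message above only mentions release stores).

```c
#include <stdatomic.h>
#include <stddef.h>

/* One slot standing in for an entry of the environment array.
   Static storage, so it starts out as a null pointer.  */
static _Atomic (const char *) env_slot;

/* Writer: the string must be fully initialized and never modified
   afterwards.  The release store makes its contents visible to any
   reader that observes the new pointer.  */
void
publish_entry (const char *entry)
{
  atomic_store_explicit (&env_slot, entry, memory_order_release);
}

/* Reader: pairs with the release store above.  Returns NULL if
   nothing has been published yet.  */
const char *
read_entry (void)
{
  return atomic_load_explicit (&env_slot, memory_order_acquire);
}
```

Because published strings are never freed or rewritten, a reader that obtained a pointer can keep using it even if the slot is later replaced.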

No manual updates are included with this patch because environ
usage with execve, posix_spawn, system is still not thread-safe
relative to unsetenv.  The new process may end up with an environment
that misses entries that were never unset.  This is the same issue
described above for getenv.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
8 months ago  aarch64: Remove non-temporal load/stores from oryon-1's memset
Andrew Pinski [Fri, 15 Nov 2024 03:03:20 +0000 (19:03 -0800)] 
aarch64: Remove non-temporal load/stores from oryon-1's memset

The hardware architects have a new recommendation not to use
non-temporal load/stores for memset. This patch removes this path.
I found there was no difference in the memset speed with/without
non-temporal load/stores either.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
8 months ago  aarch64: Remove non-temporal load/stores from oryon-1's memcpy
Andrew Pinski [Fri, 15 Nov 2024 03:03:19 +0000 (19:03 -0800)] 
aarch64: Remove non-temporal load/stores from oryon-1's memcpy

The hardware architects have a new recommendation not to use
non-temporal load/stores for memcpy. This patch removes this path.
I found there was no difference in the memcpy speed with/without
non-temporal load/stores either.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
8 months ago  powerpc64le: _init/_fini file changes for ROP
Sachin Monga [Wed, 20 Nov 2024 21:50:00 +0000 (16:50 -0500)] 
powerpc64le: _init/_fini file changes for ROP

The ROP instructions were added in ISA 3.1 (i.e., Power10), however they
were defined so that if executed on older cpus, they would behave as
nops.  This allows us to emit them on older cpus and they'd just be
ignored, but if run on a Power10, then the binary would be ROP protected.

Hash instructions use negative offsets so the default position
of ROP pointer is FRAME_ROP_SAVE from caller's SP.

Modified FRAME_MIN_SIZE_PARM to 112 for ELFv2 to reserve
additional 16 bytes for ROP save slot and padding.

Signed-off-by: Sachin Monga <smonga@linux.ibm.com>
Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
8 months ago  mman.h: Fix MAP_HASSEMPHORE typo
Samuel Thibault [Wed, 20 Nov 2024 18:51:08 +0000 (19:51 +0100)] 
mman.h: Fix MAP_HASSEMPHORE typo

BSD's MAP_HASSEMAPHORE is with an A. MAP_HASSEMPHORE is not used in any
Debian software for instance.

8 months ago  misc: remove extra va_end in error_tail (bug 32233)
Andreas Schwab [Wed, 20 Nov 2024 12:15:44 +0000 (13:15 +0100)] 
misc: remove extra va_end in error_tail (bug 32233)

This is an addendum to commit b7b52b9dec ("error, error_at_line: Add
missing va_end calls"), which added the va_end calls in the callers where
they belong.
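
The convention behind this fix is that va_end is called by the same function that called va_start, while a helper that receives a va_list leaves it alone. A minimal sketch under that assumption follows; `format_message` and `format_tail` are hypothetical names and not the glibc functions.

```c
#include <stdarg.h>
#include <stdio.h>

/* Helper that consumes the argument list.  It does not call va_end,
   because it did not call va_start.  */
static int
format_tail (char *buf, size_t size, const char *fmt, va_list ap)
{
  return vsnprintf (buf, size, fmt, ap);
}

/* Variadic entry point: va_start and va_end are paired here.  */
int
format_message (char *buf, size_t size, const char *fmt, ...)
{
  va_list ap;
  va_start (ap, fmt);
  int n = format_tail (buf, size, fmt, ap);
  va_end (ap);
  return n;
}
```

Calling va_end in both the helper and the caller would end the same list twice, which is the kind of duplication this commit removes.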

8 months ago  intl: avoid alloca for arbitrary sizes (bug 32380)
Andreas Schwab [Wed, 20 Nov 2024 09:01:29 +0000 (10:01 +0100)] 
intl: avoid alloca for arbitrary sizes (bug 32380)

Use malloc for the copy of the domain name and the category value, which
can both be of arbitrary size.
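
As a generic illustration of the change (not the intl code itself), copying a string of unbounded length on the heap rather than on the stack might look like the sketch below. `copy_name` is a hypothetical helper; the caller is responsible for freeing the result.

```c
#include <stdlib.h>
#include <string.h>

/* Return a heap-allocated copy of NAME, or NULL if allocation
   fails.  Unlike alloca, malloc reports failure for very large
   sizes instead of overflowing the stack.  */
char *
copy_name (const char *name)
{
  size_t len = strlen (name) + 1;
  char *copy = malloc (len);
  if (copy == NULL)
    return NULL;
  memcpy (copy, name, len);
  return copy;
}
```

The cost is an explicit free on every exit path, which alloca-based code did not need.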

8 months ago  manual: Add description of AArch64-specific pkey flags
Yury Khrustalev [Wed, 20 Nov 2024 11:20:33 +0000 (11:20 +0000)] 
manual: Add description of AArch64-specific pkey flags

Describe AArch64 specific flags PKEY_DISABLE_READ and PKEY_DISABLE_EXECUTE that
are available on AArch64 systems with enabled Stage 1 permission overlays
feature introduced in Armv8.9 / 9.4 (FEAT_S1POE).

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
8 months ago  AArch64: Add support for memory protection keys
Yury Khrustalev [Wed, 20 Nov 2024 11:16:36 +0000 (11:16 +0000)] 
AArch64: Add support for memory protection keys

This patch adds support for memory protection keys on AArch64 systems with
enabled Stage 1 permission overlays feature introduced in Armv8.9 / 9.4
(FEAT_S1POE) [1].

 1. Internal functions "pkey_read" and "pkey_write" to access data
    associated with memory protection keys.
 2. Implementation of API functions "pkey_get" and "pkey_set" for
    the AArch64 target.
 3. AArch64-specific PKEY flags for READ and EXECUTE (see below).
 4. New target-specific test that checks behaviour of pkeys on
    AArch64 targets.
 5. This patch also extends existing generic test for pkeys.
 6. HWCAP constant for Permission Overlay Extension feature.

To support more accurate mapping of underlying permissions to the
PKEY flags, we introduce additional AArch64-specific flags. The full
list of flags is:

 - PKEY_UNRESTRICTED: 0x0 (for completeness)
 - PKEY_DISABLE_ACCESS: 0x1 (existing flag)
 - PKEY_DISABLE_WRITE: 0x2 (existing flag)
 - PKEY_DISABLE_EXECUTE: 0x4 (new flag, AArch64 specific)
 - PKEY_DISABLE_READ: 0x8 (new flag, AArch64 specific)

The problem here is that PKEY_DISABLE_ACCESS has unusual semantics as
it overlaps with existing PKEY_DISABLE_WRITE and new PKEY_DISABLE_READ.
For this reason mapping between permission bits RWX and "restrictions"
bits awxr (a for disable access, etc) becomes complicated:

 - PKEY_DISABLE_ACCESS disables both R and W
 - PKEY_DISABLE_{WRITE,READ} disables W and R respectively
 - PKEY_DISABLE_EXECUTE disables X

Combinations like the one below are accepted although they are redundant:

 - PKEY_DISABLE_ACCESS | PKEY_DISABLE_READ | PKEY_DISABLE_WRITE

Reverse mapping tries to retain backward compatibility and ORs
PKEY_DISABLE_ACCESS whenever both flags PKEY_DISABLE_READ and
PKEY_DISABLE_WRITE would be present.

This will break code that compares pkey_get output with == instead
of using bitwise operations. The latter is more correct since PKEY_*
constants are essentially bit flags.

It should be noted that PKEY_DISABLE_ACCESS does not prevent execution.
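
The reverse mapping and the bitwise-check advice above can be sketched as follows. This is an illustration under stated assumptions, not the glibc implementation: the `SK_` constants mirror the flag values listed in this message but are local to the sketch, and `flags_from_perms` and `access_disabled` are hypothetical helpers.

```c
/* Flag values as listed in the commit message; the SK_ prefix
   avoids clashing with the real definitions.  */
enum
{
  SK_PKEY_DISABLE_ACCESS = 0x1,
  SK_PKEY_DISABLE_WRITE = 0x2,
  SK_PKEY_DISABLE_EXECUTE = 0x4,
  SK_PKEY_DISABLE_READ = 0x8
};

/* Reverse mapping from permission bits to restriction flags.
   DISABLE_ACCESS is ORed in whenever both read and write are
   disabled, for backward compatibility.  */
unsigned int
flags_from_perms (int can_read, int can_write, int can_exec)
{
  unsigned int flags = 0;
  if (!can_read)
    flags |= SK_PKEY_DISABLE_READ;
  if (!can_write)
    flags |= SK_PKEY_DISABLE_WRITE;
  if (!can_exec)
    flags |= SK_PKEY_DISABLE_EXECUTE;
  if (!can_read && !can_write)
    flags |= SK_PKEY_DISABLE_ACCESS;
  return flags;
}

/* Test a flag with a bitwise AND rather than ==, since other bits
   may be set alongside it.  */
int
access_disabled (unsigned int flags)
{
  return (flags & SK_PKEY_DISABLE_ACCESS) != 0;
}
```

With read and write both disabled, the sketch returns all three of READ, WRITE and ACCESS, so a comparison with `== SK_PKEY_DISABLE_ACCESS` is false while the bitwise test still succeeds.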

[1] https://developer.arm.com/documentation/ddi0487/ka/ section D8.4.1.4

Co-authored-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
8 months ago  AArch64: Remove thunderx{,2} memcpy
Andrew Pinski [Wed, 20 Nov 2024 11:08:53 +0000 (11:08 +0000)] 
AArch64: Remove thunderx{,2} memcpy

ThunderX1 and ThunderX2 have been retired for a few years now.
So let's remove the thunderx{,2} specific versions of memcpy.
The performance gain from them was for medium and large sizes,
which the generic (aarch64) memcpy handles only slightly worse.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
8 months ago  Fix femode_t conditionals for arc and or1k
Joseph Myers [Tue, 19 Nov 2024 22:25:39 +0000 (22:25 +0000)] 
Fix femode_t conditionals for arc and or1k

Two of the architecture bits/fenv.h headers define femode_t if
__GLIBC_USE (IEC_60559_BFP_EXT), instead of the correct condition
__GLIBC_USE (IEC_60559_BFP_EXT_C23) (both were added after commit
0175c9e9be5f0b2000859666b6e1ef3696f1123b, but were probably first
developed before it and then not updated to take account of its
changes).  This results in failures of the installed headers check for
fenv.h when building with GCC 15 (defaults to -std=gnu23 - we don't
yet have an installed-headers test specifically for C23 mode and don't
yet require a compiler with such a mode for building glibc) together
with a combination of options leaving C23 features enabled, since the
declarations of functions using femode_t use the correct conditions;
see
<https://sourceware.org/pipermail/libc-testresults/2024q4/013163.html>.
Fix the conditionals to get <fenv.h> to work correctly in C23 mode
again.

Tested with build-many-glibcs.py (arc-linux-gnu, arc-linux-gnuhf,
or1k-linux-gnu-hard, or1k-linux-gnu-soft).

8 months ago  powerpc64le: Optimized strcat for POWER10
Mahesh Bodapati [Tue, 19 Nov 2024 20:57:35 +0000 (15:57 -0500)] 
powerpc64le: Optimized strcat for POWER10

This patch adds an optimized strcat which makes use of the default
strcat function which calls the Power10 strcpy and strlen routines.

8 months ago  powerpc: Improve the inline asm for syscall wrappers
Peter Bergner [Tue, 5 Nov 2024 22:05:53 +0000 (16:05 -0600)] 
powerpc: Improve the inline asm for syscall wrappers

Update the inline asm syscall wrappers to match the newer register constraint
usage in INTERNAL_VSYSCALL_CALL_TYPE.  Use the faster mfocrf instruction when
available, rather than the slower mfcr microcoded instruction.

8 months ago  htl: move pthread_attr_init into libc.
gfleury [Mon, 18 Nov 2024 11:21:45 +0000 (13:21 +0200)] 
htl: move pthread_attr_init into libc.

Signed-off-by: gfleury <gfleury@disroot.org>
8 months agohtl: move pthread_attr_setguardsize into libc.
gfleury [Mon, 18 Nov 2024 11:21:44 +0000 (13:21 +0200)] 
htl: move pthread_attr_setguardsize into libc.

Signed-off-by: gfleury <gfleury@disroot.org>
8 months agohtl: move pthread_attr_setschedparam into libc.
gfleury [Mon, 18 Nov 2024 11:21:43 +0000 (13:21 +0200)] 
htl: move pthread_attr_setschedparam into libc.

Signed-off-by: gfleury <gfleury@disroot.org>
8 months agohtl: move pthread_attr_setscope into libc.
gfleury [Mon, 18 Nov 2024 11:21:42 +0000 (13:21 +0200)] 
htl: move pthread_attr_setscope into libc.

Signed-off-by: gfleury <gfleury@disroot.org>
8 months agohtl: move pthread_attr_setstackaddr into libc.
gfleury [Mon, 18 Nov 2024 11:21:41 +0000 (13:21 +0200)] 
htl: move pthread_attr_setstackaddr into libc.

Signed-off-by: gfleury <gfleury@disroot.org>
8 months agohtl: move pthread_attr_setstacksize into libc.
gfleury [Mon, 18 Nov 2024 11:21:40 +0000 (13:21 +0200)] 
htl: move pthread_attr_setstacksize into libc.

Signed-off-by: gfleury <gfleury@disroot.org>
8 months agohtl: move pthread_attr_getstack into libc.
gfleury [Mon, 18 Nov 2024 11:21:39 +0000 (13:21 +0200)] 
htl: move pthread_attr_getstack into libc.

Signed-off-by: gfleury <gfleury@disroot.org>
8 months agohtl: move pthread_attr_getstackaddr into libc.
gfleury [Mon, 18 Nov 2024 11:21:38 +0000 (13:21 +0200)] 
htl: move pthread_attr_getstackaddr into libc.

Signed-off-by: gfleury <gfleury@disroot.org>
8 months agohtl: move pthread_attr_getstacksize into libc.
gfleury [Mon, 18 Nov 2024 11:21:37 +0000 (13:21 +0200)] 
htl: move pthread_attr_getstacksize into libc.

Signed-off-by: gfleury <gfleury@disroot.org>
8 months agohtl: move pthread_attr_getscope into libc.
gfleury [Mon, 18 Nov 2024 11:21:36 +0000 (13:21 +0200)] 
htl: move pthread_attr_getscope into libc.

Signed-off-by: gfleury <gfleury@disroot.org>
8 months agohtl: move pthread_attr_getguardsize into libc.
gfleury [Mon, 18 Nov 2024 11:21:35 +0000 (13:21 +0200)] 
htl: move pthread_attr_getguardsize into libc.

Signed-off-by: gfleury <gfleury@disroot.org>
8 months agohtl: move __pthread_default_attr into libc
gfleury [Mon, 18 Nov 2024 11:21:34 +0000 (13:21 +0200)] 
htl: move __pthread_default_attr into libc

Signed-off-by: gfleury <gfleury@disroot.org>
8 months agohtl: move pthread_attr_destroy into libc.
gfleury [Mon, 18 Nov 2024 11:21:33 +0000 (13:21 +0200)] 
htl: move pthread_attr_destroy into libc.

Signed-off-by: gfleury <gfleury@disroot.org>
8 months agostdio-common: Fix C23-ism in formatted output specifier tests [BZ #32360]
Maciej W. Rozycki [Fri, 15 Nov 2024 22:43:54 +0000 (22:43 +0000)] 
stdio-common: Fix C23-ism in formatted output specifier tests [BZ #32360]

Nameless function parameters have only been added to ISO C with the C23
revision of the language standard.  Give names to the unused parameters
of the stub 'dladdr' implementation then so as to make compilation happy
with the earlier language definitions, fixing errors such as:

tst-printf-format-skeleton.c:374:9: error: parameter name omitted
  374 | dladdr (const void *, Dl_info *)

reported by older compilers.
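
As a minimal sketch of the pattern (not the skeleton's actual code), a
stub with named but unused parameters could look as follows.  The names
stub_lookup and struct stub_info are hypothetical stand-ins for dladdr
and Dl_info, chosen so the example needs no <dlfcn.h>:

```c
#include <stddef.h>

/* Hypothetical stand-in for Dl_info, for illustration only.  */
struct stub_info
{
  const char *fname;
};

/* Both parameters are named, so this also compiles in pre-C23 modes.
   The casts to void mark them as deliberately unused.  The function
   returns 0, the value dladdr uses to report failure.  */
int
stub_lookup (const void *addr, struct stub_info *info)
{
  (void) addr;
  (void) info;
  return 0;
}
```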

Reviewed-by: Florian Weimer <fweimer@redhat.com>
8 months agoelf: handle addition overflow in _dl_find_object_update_1 [BZ #32245]
Aurelien Jarno [Sun, 10 Nov 2024 09:50:34 +0000 (10:50 +0100)] 
elf: handle addition overflow in _dl_find_object_update_1 [BZ #32245]

The remaining_to_add variable can be 0 if (current_used + count) wraps.
This is caught by GCC 14+ on hppa, which determines from there that
target_seg could be NULL when remaining_to_add is zero, which in
turn causes a -Wstringop-overflow warning:

 In file included from ../include/atomic.h:49,
                  from dl-find_object.c:20:
 In function '_dlfo_update_init_seg',
     inlined from '_dl_find_object_update_1' at dl-find_object.c:689:30,
     inlined from '_dl_find_object_update' at dl-find_object.c:805:13:
 ../sysdeps/unix/sysv/linux/hppa/atomic-machine.h:44:4: error: '__atomic_store_4' writing 4 bytes into a region of size 0 overflows the destination [-Werror=stringop-overflow=]
    44 |    __atomic_store_n ((mem), (val), __ATOMIC_RELAXED);                        \
       |    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 dl-find_object.c:644:3: note: in expansion of macro 'atomic_store_relaxed'
   644 |   atomic_store_relaxed (&seg->size, new_seg_size);
       |   ^~~~~~~~~~~~~~~~~~~~
 In function '_dl_find_object_update':
 cc1: note: destination object is likely at address zero

In practice, this is not possible as it represents counts of link maps.
Link maps have sizes larger than 1 byte, so the sum of any two link map
counts will always fit within a size_t without wrapping around.

This patch therefore adds a check on remaining_to_add == 0 and tells GCC
that this cannot happen using __builtin_unreachable.
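
The shape of such a hint can be sketched as below.  This is an
illustration only, assuming GCC or Clang; remaining_after_add is a
hypothetical helper and not the code in dl-find_object.c:

```c
#include <stddef.h>

/* Hypothetical helper: returns how many entries remain to be added.
   Callers are assumed to pass counts whose sum is nonzero and does
   not wrap around.  */
size_t
remaining_after_add (size_t current_used, size_t count)
{
  size_t remaining_to_add = current_used + count;
  /* Tell the compiler that the zero case cannot happen, so that it
     does not analyze (and warn about) the impossible path.  */
  if (remaining_to_add == 0)
    __builtin_unreachable ();
  return remaining_to_add;
}
```

Reaching __builtin_unreachable at run time is undefined behavior, so the
hint is only appropriate where the condition is excluded by design.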

Thanks to Andreas Schwab for the investigation.

Closes: BZ #32245
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Tested-by: John David Anglin <dave.anglin@bell.net>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
8 months agox86/string: Use `movsl` instead of `movsd` in strncpy/strncat [BZ #32344]
Noah Goldstein [Tue, 12 Nov 2024 23:04:42 +0000 (17:04 -0600)] 
x86/string: Use `movsl` instead of `movsd` in strncpy/strncat [BZ #32344]

`ld`, starting at 2.40, emits a warning when using `movsd`. There is
no change to the actual code produced.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
8 months agomanual: Fix overeager s/int/size_t/ in memory.texi
Jonathan Wakely [Wed, 13 Nov 2024 14:43:58 +0000 (14:43 +0000)] 
manual: Fix overeager s/int/size_t/ in memory.texi

The change in e3960d1c57e57f33e0e846d615788f4ede73b945 should only have
affected 'int' not 'internally'.

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
8 months agohppa: Update libm-test-ulps
John David Anglin [Wed, 13 Nov 2024 02:32:54 +0000 (21:32 -0500)] 
hppa: Update libm-test-ulps

Update imaginary part of csin.

Signed-off-by: John David Anglin <dave.anglin@bell.net>
8 months agoRevert "hurd: Stop depending on the default_pager stubs provided by gnumach"
Samuel Thibault [Wed, 13 Nov 2024 00:33:32 +0000 (01:33 +0100)] 
Revert "hurd: Stop depending on the default_pager stubs provided by gnumach"

This reverts commit f7f7dd8009275504b211c170caf5bce50fa472ac.

default_pager is actually also used in e.g. xosview.

8 months agolinux: Add support for getrandom vDSO
Adhemerval Zanella [Wed, 18 Sep 2024 14:01:22 +0000 (16:01 +0200)] 
linux: Add support for getrandom vDSO

Linux 6.11 has getrandom() in vDSO. It operates on a thread-local opaque
state allocated with mmap using flags specified by the vDSO.

Multiple states are allocated at once, as many as fit into a page, and
these are held in an array of available states to be doled out to each
thread upon first use, and recycled when a thread terminates. As these
states run low, more are allocated.

To make this procedure async-signal-safe, a simple guard is used in the
LSB of the opaque state address, falling back to the syscall if there's
reentrancy contention.
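
A guard bit of this kind can be sketched as follows.  This is not the
glibc implementation; state_try_enter, state_leave and STATE_BUSY are
hypothetical names, and the state address is assumed to be at least
2-byte aligned so that bit 0 is free:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 0 of the slot marks the state as in use.  */
#define STATE_BUSY ((uintptr_t) 1)

/* Try to claim the state.  Returns false if the bit was already set,
   for example because a signal handler interrupted a call in progress;
   the caller would then fall back to the syscall and must not call
   state_leave.  */
bool
state_try_enter (atomic_uintptr_t *slot)
{
  uintptr_t old = atomic_fetch_or (slot, STATE_BUSY);
  return (old & STATE_BUSY) == 0;
}

/* Release the state by clearing the guard bit again.  */
void
state_leave (atomic_uintptr_t *slot)
{
  atomic_fetch_and (slot, ~STATE_BUSY);
}
```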

Also, _Fork() is handled by blocking signals on opaque state allocation
(so _Fork() always sees a consistent state even if it interrupts a
getrandom() call) and by iterating over the thread stack cache on
reclaim_stack. Each opaque state will be in the free states list
(grnd_alloc.states) or allocated to a running thread.

The cancellation is handled by always using GRND_NONBLOCK flags while
calling the vDSO, and falling back to the cancellable syscall if the
kernel returns EAGAIN (would block). Since getrandom is not defined by
POSIX and cancellation is supported as an extension, the cancellation is
handled as 'may occur' instead of 'shall occur' [1], meaning that if
vDSO does not block (the expected behavior) getrandom will not act as a
cancellation entrypoint. It avoids a pthread_testcancel call on the fast
path (different than 'shall occur' functions, like sem_wait()).
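
The fallback described above can be sketched roughly as follows.  All
function names here are hypothetical, the fast and slow paths are
modelled as callbacks returning a byte count or a negative errno value,
and the GRND_NONBLOCK fallback definition is only used if no header
provides it:

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

#ifndef GRND_NONBLOCK
# define GRND_NONBLOCK 0x0001
#endif

/* Hypothetical signature shared by the fast path (vDSO) and the slow
   path (cancellable syscall) in this sketch.  */
typedef ssize_t (*random_source) (void *buf, size_t len, unsigned int flags);

/* Always try the fast path without blocking; only if it reports that
   it would block, retry through the slow path with the caller's flags.  */
ssize_t
getrandom_with_fallback (random_source fast, random_source slow,
                         void *buf, size_t len, unsigned int flags)
{
  ssize_t ret = fast (buf, len, flags | GRND_NONBLOCK);
  if (ret == -EAGAIN)
    ret = slow (buf, len, flags);
  return ret;
}

/* Stand-in fast path that always reports "would block".  */
ssize_t
fake_fast_would_block (void *buf, size_t len, unsigned int flags)
{
  (void) buf;
  (void) len;
  (void) flags;
  return -EAGAIN;
}

/* Stand-in slow path that fills the buffer with a fixed byte.  */
ssize_t
fake_slow_fill (void *buf, size_t len, unsigned int flags)
{
  (void) flags;
  memset (buf, 0x2a, len);
  return (ssize_t) len;
}
```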

It is currently enabled for x86_64, which is available in Linux 6.11,
and aarch64, powerpc32, powerpc64, loongarch64, and s390x, which are
available in Linux 6.12.

Link: https://pubs.opengroup.org/onlinepubs/9799919799/nframe.html
Co-developed-by: Jason A. Donenfeld <Jason@zx2c4.com>
Tested-by: Jason A. Donenfeld <Jason@zx2c4.com> # x86_64
Tested-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> # x86_64, aarch64
Tested-by: Xi Ruoyao <xry111@xry111.site> # x86_64, aarch64, loongarch64
Tested-by: Stefan Liebler <stli@linux.ibm.com> # s390x
8 months agoio: Add setuid tests for faccessat
Siddhesh Poyarekar [Wed, 16 Oct 2024 19:10:15 +0000 (15:10 -0400)] 
io: Add setuid tests for faccessat

Add a new test tst-faccessat-setuid that iterates through real and
effective UID/GID combinations and tests the faccessat() interface with
both the default and AT_EACCESS flags.
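
For reference, the two call forms being exercised can be sketched as
below.  The wrapper names are hypothetical and this is not the test
itself, which additionally switches UIDs and GIDs:

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>

/* Check access using the real UID/GID (flags == 0, the default).  */
int
access_as_real (const char *path, int mode)
{
  return faccessat (AT_FDCWD, path, mode, 0);
}

/* Check access using the effective UID/GID instead.  */
int
access_as_effective (const char *path, int mode)
{
  return faccessat (AT_FDCWD, path, mode, AT_EACCESS);
}
```

In a setuid program the two results can differ, which is what the new
test iterates over.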

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
8 months agotst-faccessat.c: Port to libsupport
Siddhesh Poyarekar [Wed, 9 Oct 2024 01:09:43 +0000 (21:09 -0400)] 
tst-faccessat.c: Port to libsupport

Use libsupport convenience functions and macros instead of the old
test-skeleton.  Also add a new xdup() convenience wrapper function.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
8 months agosupport: Add xdup
Siddhesh Poyarekar [Fri, 8 Nov 2024 17:33:47 +0000 (12:33 -0500)] 
support: Add xdup

Add xdup as the error-checking version of dup for test cases.
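
An error-checking wrapper in this spirit might look like the sketch
below.  The name xdup_sketch and the use of perror/exit are assumptions
for illustration; the real support function reports failures through
the test framework:

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Duplicate FD and terminate the process if dup fails, so that test
   code never has to check the result itself.  */
int
xdup_sketch (int fd)
{
  int result = dup (fd);
  if (result < 0)
    {
      perror ("dup");
      exit (EXIT_FAILURE);
    }
  return result;
}
```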

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
8 months agoLoongArch: Update ulps
caiyinyu [Mon, 11 Nov 2024 01:56:05 +0000 (09:56 +0800)] 
LoongArch: Update ulps

Needed for test-float-cacosh, test-float-csin, test-float32-cacosh and
test-float32-csin.

Signed-off-by: caiyinyu <caiyinyu@loongson.cn>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
8 months agostat.h: Fix missing declaration of struct timespec
Samuel Thibault [Sat, 9 Nov 2024 23:45:19 +0000 (00:45 +0100)] 
stat.h: Fix missing declaration of struct timespec

When building with e.g. -std=c99 and _ATFILE_SOURCE, stat.h did not
include bits/types/struct_timespec.h, which provides the struct timespec
declaration needed by utimensat.

8 months agomach: Fix __xpg_strerror_r on in-range but undefined errors [BZ #32350]
Samuel Thibault [Sat, 9 Nov 2024 18:54:08 +0000 (19:54 +0100)] 
mach: Fix __xpg_strerror_r on in-range but undefined errors [BZ #32350]

For instance, 1073741906 leads to system 16, subsystem 0 and code 82,
which is in range (max_code is 122), but not defined. Return EINVAL in
that case, like
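
The decomposition of the example value can be reproduced with the
sketch below.  It assumes the usual Mach error layout (6 bits of
system, 12 bits of subsystem, 14 bits of code); the struct and function
names are hypothetical:

```c
#include <stdint.h>

/* Hypothetical container for the three fields of a Mach error code.  */
struct mach_err_parts
{
  unsigned int system;
  unsigned int subsystem;
  unsigned int code;
};

/* Split ERR into system (bits 26..31), subsystem (bits 14..25) and
   code (bits 0..13).  */
struct mach_err_parts
decode_mach_error (uint32_t err)
{
  struct mach_err_parts parts;
  parts.system = (err >> 26) & 0x3f;
  parts.subsystem = (err >> 14) & 0xfff;
  parts.code = err & 0x3fff;
  return parts;
}
```

With this layout, 1073741906 (0x40000052) yields system 16, subsystem 0
and code 82, matching the values quoted above.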

8 months agox86/string: Use `movsl` instead of `movsd` [BZ #32344]
Noah Goldstein [Fri, 8 Nov 2024 19:18:17 +0000 (11:18 -0800)] 
x86/string: Use `movsl` instead of `movsd` [BZ #32344]

`ld`, starting at 2.40, emits a warning when using `movsd`. There is
no change to the actual code produced.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
8 months agoRename new tst-sem17 test to tst-sem18
Joseph Myers [Fri, 8 Nov 2024 17:08:09 +0000 (17:08 +0000)] 
Rename new tst-sem17 test to tst-sem18

As noted by Adhemerval, we already have a tst-sem17 in nptl.

Tested for x86_64.

8 months agoAvoid uninitialized result in sem_open when file does not exist
Joseph Myers [Fri, 8 Nov 2024 01:53:48 +0000 (01:53 +0000)] 
Avoid uninitialized result in sem_open when file does not exist

A static analyzer apparently reported an uninitialized use of the
variable result in sem_open in the case where the file is required to
exist but does not exist.

The report appears to be correct; set result to SEM_FAILED in that
case, and add a test for it.

Note: the test passes for me even without the sem_open fix, I guess
because result happens to get value SEM_FAILED (i.e. 0) when
uninitialized.

Tested for x86_64.

8 months agonptl: initialize rseq area prior to registration
Michael Jeanson [Thu, 7 Nov 2024 21:23:49 +0000 (22:23 +0100)] 
nptl: initialize rseq area prior to registration

Per the rseq syscall documentation, 3 fields are required to be
initialized by userspace prior to registration, they are 'cpu_id',
'rseq_cs' and 'flags'. Since we have no guarantee that 'struct pthread'
is cleared on all architectures, explicitly set those 3 fields prior to
registration.
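
A rough sketch of such an initialization is shown below.  The struct is
a simplified, hypothetical stand-in for the kernel's struct rseq, and
the values written (an "uninitialized" CPU id of -1, a null rseq_cs and
zero flags) are assumptions for illustration:

```c
#include <stdint.h>

/* Simplified, hypothetical stand-in for the rseq area.  */
struct rseq_area_sketch
{
  uint32_t cpu_id_start;
  uint32_t cpu_id;
  uint64_t rseq_cs;
  uint32_t flags;
};

/* Assumed marker for "not yet registered".  */
#define CPU_ID_UNINITIALIZED_SKETCH ((uint32_t) -1)

/* Explicitly set the three fields userspace must initialize, rather
   than relying on the surrounding memory having been cleared.  */
void
rseq_area_prepare (struct rseq_area_sketch *area)
{
  area->cpu_id = CPU_ID_UNINITIALIZED_SKETCH;
  area->rseq_cs = 0;
  area->flags = 0;
}
```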

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
8 months agos390x: Update ulps
Mark Wielaard [Thu, 7 Nov 2024 18:05:56 +0000 (19:05 +0100)] 
s390x: Update ulps

Needed for test-float-cacosh, test-float-csin, test-float32-cacosh and
test-float32-csin.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
8 months agoelf: avoid jumping over a needed declaration
DJ Delorie [Thu, 7 Nov 2024 02:40:35 +0000 (21:40 -0500)] 
elf: avoid jumping over a needed declaration

The declaration of found_other_class could be jumped
over via the goto just above it, but the code jumped
to uses found_other_class.  Move the declaration
up a bit to ensure it's properly declared and initialized.
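
The fixed shape can be illustrated with the sketch below, where the
declaration precedes every goto that can reach its use.  Apart from the
variable name, everything here (scan_sketch and its arguments) is
hypothetical:

```c
#include <stdbool.h>

/* Returns 1 if WANTED is found, -1 if only other classes were seen,
   and 0 if nothing was examined.  */
int
scan_sketch (const int *classes, int n, int wanted)
{
  /* Declared and initialized before the goto below, so the code at
     the label never reads an uninitialized variable.  */
  bool found_other_class = false;

  if (n == 0)
    goto out;

  for (int i = 0; i < n; i++)
    {
      if (classes[i] == wanted)
        return 1;
      found_other_class = true;
    }

 out:
  return found_other_class ? -1 : 0;
}
```

Had the declaration been placed after the "if (n == 0) goto out;" line,
the jump would skip its initializer while the label still uses it.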

8 months agomath: Fix log10f on some ABIs
Adhemerval Zanella [Thu, 7 Nov 2024 10:51:27 +0000 (07:51 -0300)] 
math: Fix log10f on some ABIs

The commit 9247f53219 triggered some regressions on loongarch and
riscv:

math/test-float-log10
math/test-float32-log10

This is due to a wrong sync with CORE-MATH for the special 0.0/-0.0
inputs.

Checked on aarch64-linux-gnu and loongarch64-linux-gnu-lp64d.

8 months agostdio-common: Add tests for formatted vsnprintf output specifiers
Maciej W. Rozycki [Thu, 7 Nov 2024 06:14:24 +0000 (06:14 +0000)] 
stdio-common: Add tests for formatted vsnprintf output specifiers

Wire vsnprintf into test infrastructure for formatted printf output
specifiers.

Reviewed-by: DJ Delorie <dj@redhat.com>
8 months agostdio-common: Add tests for formatted vsprintf output specifiers
Maciej W. Rozycki [Thu, 7 Nov 2024 06:14:24 +0000 (06:14 +0000)] 
stdio-common: Add tests for formatted vsprintf output specifiers

Wire vsprintf into test infrastructure for formatted printf output
specifiers.

Reviewed-by: DJ Delorie <dj@redhat.com>
8 months agostdio-common: Add tests for formatted vfprintf output specifiers
Maciej W. Rozycki [Thu, 7 Nov 2024 06:14:24 +0000 (06:14 +0000)] 
stdio-common: Add tests for formatted vfprintf output specifiers

Wire vfprintf into test infrastructure for formatted printf output
specifiers.

Reviewed-by: DJ Delorie <dj@redhat.com>
8 months agostdio-common: Add tests for formatted vdprintf output specifiers
Maciej W. Rozycki [Thu, 7 Nov 2024 06:14:24 +0000 (06:14 +0000)] 
stdio-common: Add tests for formatted vdprintf output specifiers

Wire vdprintf into test infrastructure for formatted printf output
specifiers.

Reviewed-by: DJ Delorie <dj@redhat.com>
8 months agostdio-common: Add tests for formatted vasprintf output specifiers
Maciej W. Rozycki [Thu, 7 Nov 2024 06:14:24 +0000 (06:14 +0000)] 
stdio-common: Add tests for formatted vasprintf output specifiers

Wire vasprintf into test infrastructure for formatted printf output
specifiers.

Owing to mtrace logging these tests take amounts of time to complete
similar to those of corresponding asprintf tests, so set timeouts for
the tests accordingly, with a global default for all the vasprintf
tests, and then individual higher settings for double and long double
tests each.

Reviewed-by: DJ Delorie <dj@redhat.com>
8 months agostdio-common: Add tests for formatted vprintf output specifiers
Maciej W. Rozycki [Thu, 7 Nov 2024 06:14:24 +0000 (06:14 +0000)] 
stdio-common: Add tests for formatted vprintf output specifiers

Wire vprintf into test infrastructure for formatted printf output
specifiers.

Reviewed-by: DJ Delorie <dj@redhat.com>
8 months agostdio-common: Add tests for formatted snprintf output specifiers
Maciej W. Rozycki [Thu, 7 Nov 2024 06:14:24 +0000 (06:14 +0000)] 
stdio-common: Add tests for formatted snprintf output specifiers

Wire snprintf into test infrastructure for formatted printf output
specifiers.

Reviewed-by: DJ Delorie <dj@redhat.com>
8 months agostdio-common: Add tests for formatted sprintf output specifiers
Maciej W. Rozycki [Thu, 7 Nov 2024 06:14:24 +0000 (06:14 +0000)] 
stdio-common: Add tests for formatted sprintf output specifiers

Wire sprintf into test infrastructure for formatted printf output
specifiers.

Reviewed-by: DJ Delorie <dj@redhat.com>
8 months agostdio-common: Add tests for formatted fprintf output specifiers
Maciej W. Rozycki [Thu, 7 Nov 2024 06:14:24 +0000 (06:14 +0000)] 
stdio-common: Add tests for formatted fprintf output specifiers

Wire fprintf into test infrastructure for formatted printf output
specifiers.

Reviewed-by: DJ Delorie <dj@redhat.com>
8 months agostdio-common: Add tests for formatted dprintf output specifiers
Maciej W. Rozycki [Thu, 7 Nov 2024 06:14:24 +0000 (06:14 +0000)] 
stdio-common: Add tests for formatted dprintf output specifiers

Wire dprintf into test infrastructure for formatted printf output
specifiers.

Reviewed-by: DJ Delorie <dj@redhat.com>
8 months agostdio-common: Add tests for formatted asprintf output specifiers
Maciej W. Rozycki [Thu, 7 Nov 2024 06:14:24 +0000 (06:14 +0000)] 
stdio-common: Add tests for formatted asprintf output specifiers

Wire asprintf into test infrastructure for formatted printf output
specifiers.

Owing to mtrace logging of lots of memory allocation calls these tests
take a considerable amount of time to complete, except for the character
conversion, taking from 00m20s for 'tst-printf-format-as-s --direct s',
through 01m10s and 03m53s for 'tst-printf-format-as-char --direct i' and
'tst-printf-format-as-double --direct f' respectively, to 19m24s for
'tst-printf-format-as-ldouble --direct f', all in standalone execution
from NFS on a RISC-V FU740@1.2GHz system and with output redirected over
100Mbps network via SSH.  This is with the skeleton's stub implementation
of dladdr(3); execution times with regular dladdr(3) are up to more than
twice as long.

Set timeouts for the tests accordingly then, with a global default for
all the asprintf tests, and then individual higher settings for double
and long double tests each.

Reviewed-by: DJ Delorie <dj@redhat.com>
8 months agostdio-common: Add tests for formatted printf output specifiers
Maciej W. Rozycki [Thu, 7 Nov 2024 06:14:24 +0000 (06:14 +0000)] 
stdio-common: Add tests for formatted printf output specifiers

This is a collection of tests for formatted printf output specifiers
covering the d, i, o, u, x, and X integer conversions, the e, E, f, F,
g, and G floating-point conversions, the c character conversion, and the
s string conversion.  Also the hh, h, l, and ll length modifiers are
covered with the integer conversions as is the L length modifier with
the floating-point conversions.

The -, +, space, #, and 0 flags are iterated over, as permitted by the
conversion handled, in tuples of 1..5, including tuples with repetitions
of 2, and combined with field width and/or precision, again as permitted
by the conversion.  The resulting format string is then used to produce
output from respective sets of input data corresponding to the specific
conversion under test.  POSIX extensions beyond ISO C are not used.

Output is produced in the form of records which include both the format
string (and width and/or precision where given in the form of separate
arguments) and the conversion result, and is verified with GNU AWK using
the format obtained from each such record against the reference value
also supplied, relying on the fact that GNU AWK has its own independent
implementation of format processing, striving to be ISO C compatible.

In the course of implementation I have determined that in the non-bignum
mode GNU AWK uses system sprintf(3) for the floating-point conversions,
defeating the objective of doing the verification against an independent
implementation.  Additionally the bignum mode (using MPFR) is required
to correctly output wider integer and floating-point data.  Therefore
for the conversions affected the relevant shell scripts sanity-check AWK
and terminate with unsupported status if the bignum mode is unavailable
for floating-point data or where data is output incorrectly.

The f and F floating-point conversions are build-time options for GNU
AWK, depending on the environment, so they are probed for before being
used.  Similarly the a and A floating-point conversions, however they
are currently not used, see below.  Also GNU AWK does not handle the b
or B integer conversions at all at the moment, as at 5.3.0.  Support for
the a, A, b, and B conversions can however be easily added following the
approach taken for the f and F conversions.

Output produced by gawk for the a and A floating-point conversions does
not match one produced by us: insufficient precision is used where one
hasn't been explicitly given, e.g. for the negated maximum finite IEEE
754 64-bit value of -1.79769313486231570814527423731704357e+308 and "%a"
format we produce -0x1.fffffffffffffp+1023 vs gawk's -0x1.000000p+1024
and a different exponent is chosen otherwise, such as with "%.a" where
we output -0x2p+1023 vs gawk's -0x1p+1024 for the same value, or "%.20a"
where -0x1.fffffffffffff0000000p+1023 is our output, but gawk produces
-0xf.ffffffffffff80000000p+1020 instead.  Consequently I chose not to
include a and A conversions in testing at this time.

And last but not least there are numerous corner cases that GNU AWK does
not handle correctly, which are worked around by explicit handling in
the AWK script.  These are in particular:

- extraneous leading 0 produced for the alternative form with the o
  conversion, e.g. { printf "%#.2o", 1 } produces "001" rather than
  "01",

- unexpected 0 produced where no characters are expected for the input
  of 0 and the alternative form with the precision of 0 and the integer
  hexadecimal conversions, e.g. { printf "%#.x", 0 } produces "0" rather
  than "",

- missing + character in the non-bignum mode only for the input of 0
  with the + flag, precision of 0 and the signed integer conversions,
  e.g. { printf "%+.i", 0 } produces "" rather than "+",

- missing space character in the non-bignum mode only for the input of 0
  with the space flag, precision of 0 and the signed integer
  conversions, e.g. { printf "% .i", 0 } produces "" rather than " ",

- for released gawk versions of up to 4.2.1 missing - character for the
  input of -NaN with the floating-point conversions, e.g. { printf "%e",
  "-nan" } produces "nan" rather than "-nan",

- for released gawk versions from 5.0.0 onwards + character output for
  the input of -NaN with the floating-point conversions, e.g. { printf
  "%e", "-nan" } produces "+nan" rather than "-nan",

- for released gawk versions from 5.0.0 onwards + character output for
  the input of Inf or NaN in the absence of the + or space flags with
  the floating-point conversions, e.g. { printf "%e", "inf" } produces
  "+inf" rather than "inf",

- for released gawk versions of up to 4.2.1 missing + character for the
  input of Inf or NaN with the + flag and the floating-point
  conversions, e.g. { printf "%+e", "inf" } produces "inf" rather than
  "+inf",

- for released gawk versions of up to 4.2.1 missing space character for
  the input of Inf or NaN with the space flag and the floating-point
  conversions, e.g. { printf "% e", "nan" } produces "nan" rather than
  " nan",

- for released gawk versions from 5.0.0 onwards + character output for
  the input of Inf or NaN with the space flag and the floating-point
  conversions, e.g. { printf "% e", "inf" } produces "+inf" rather than
  " inf",

- for released gawk versions from 5.0.0 onwards the field width is
  ignored for the input of Inf or NaN and the floating-point
  conversions, e.g. { printf "%20e", "-inf" } produces "-inf" rather
  than "                -inf".
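
For comparison, the results expected from a conforming C library for
the integer corner cases listed above can be checked with a sketch like
the following.  The helper names are hypothetical and this is not part
of the test infrastructure:

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Format a signed VALUE with FORMAT and compare against EXPECTED.  */
bool
signed_output_is (const char *format, int value, const char *expected)
{
  char buf[64];
  snprintf (buf, sizeof buf, format, value);
  return strcmp (buf, expected) == 0;
}

/* Same for an unsigned VALUE, as used by the o and x conversions.  */
bool
unsigned_output_is (const char *format, unsigned int value,
                    const char *expected)
{
  char buf[64];
  snprintf (buf, sizeof buf, format, value);
  return strcmp (buf, expected) == 0;
}
```

For instance, unsigned_output_is ("%#.2o", 1, "01"),
unsigned_output_is ("%#.x", 0, ""), signed_output_is ("%+.i", 0, "+")
and signed_output_is ("% .i", 0, " ") correspond to the first four
items in the list.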

NB for released gawk versions of up to 4.2.1 floating-point conversion
issues apply to the bignum mode only, as in the non-bignum mode system
sprintf(3) is used.  As from version 5.0.0 specialized handling has been
added for [-]Inf and [-]NaN inputs and the issues listed apply to both
modes.  The '--posix' flag makes gawk versions from 5.0.0 onwards avoid
the issue with field width and the + character unconditionally output
for the input of Inf or NaN, but not the remaining issues, and the
'gensub' function is not supported in the POSIX mode, so I deemed this
path not worth taking.

Each test completes within a few seconds except for the long double
one.  There the F/f formats produce a large number of digits, which
appears to be computationally intensive and CPU-bound.  Standalone
execution time for 'tst-printf-format-p-ldouble --direct f' is in the
range of 00m36s for POWER9@2.166GHz and 09m52s for FU740@1.2GHz and
output redirected locally to /dev/null, and 10m11s for FU740 and output
redirected over 100Mbps network via SSH to /dev/null, so the throughput
of the network adds very little (~3.2% in this case) to the processing
time.  This is with IEEE 754 quad.

So I have scaled the timeout for 'tst-printf-format-skeleton-ldouble'
accordingly.  Regardless, following recent practice the test has been
added to the standard rather than extended set.  However, unlike most
of the remaining tests it has been split by the conversion specifier,
so as to allow better parallelization of this long-running test.  As
a side effect this lets the test report the unsupported status for the
F/f conversions where applicable, so 'tst-printf-format-p-double' has
been split for consistency as well.

Only printf itself is handled at the moment, but the infrastructure
provides for all the printf family functions to be verified, with changes
for those to be supplied separately.  The complication around having
some tests iterating over all the relevant conversion specifiers and
others verifying conversion specifiers individually, combined with
iterating over printf family functions, has hit a peculiarity in GNU
make where the use of multiple targets with a pattern rule is handled
differently from such use with an ordinary rule.  Consequently it
seems impossible to bulk-define a pattern rule using '$(foreach ...)',
where each target would simply trigger the recipe according to the
pattern and matching dependencies individually (such a rule does work,
but implies all targets to be updated with a single recipe execution).

Therefore as a compromise a single single-target pattern rule has been
defined that has listed all the conversion-specific scripts and all the
test executables as dependencies.  Consequently tests will be rerun in
the absence of changes to their actual sources or scripts whenever an
unrelated file has changed that has been listed.  Also all the formatted
printf output tests will always be built whenever any single one is to
be run.  This only affects test development and not test runs in the
field, though it does change the order of execution of the individual
steps and also acts as a Makefile barrier in parallel runs.  As the
execution time dominates the compilation time for these tests it is not
seen as a serious shortcoming.

As pointed out by Florian Weimer <fweimer@redhat.com> the malloc tracing
facility can take a substantial amount of time in calling dladdr(3) to
determine the caller's location.  This is not needed by the verification
made with these tests, so I chose to interpose the symbol with a stub
implementation that always fails in the shared skeleton.  We have total
control over the test environment, so I think it is a safe and minimal
impact approach.  If there's ever anything else added to the tests that
would actually rely on dladdr(3) returning usable results, only then
can we think of a different approach.

Reviewed-by: DJ Delorie <dj@redhat.com>
8 months agonptl: fix __builtin_thread_pointer detection on LoongArch
caiyinyu [Wed, 6 Nov 2024 02:06:21 +0000 (10:06 +0800)] 
nptl: fix __builtin_thread_pointer detection on LoongArch

Signed-off-by: caiyinyu <caiyinyu@loongson.cn>
8 months agomath: Fix incorrect results of exp10m1f with some GCC versions
Florian Weimer [Wed, 6 Nov 2024 15:09:05 +0000 (16:09 +0100)] 
math: Fix incorrect results of exp10m1f with some GCC versions

On GCC 11 (x86-64), the previous code produced test failures like
this one:

Failure: Test: exp10m1_towardzero (-0x1.1p+4)
Result:
 is:         -1.00000000e+00  -0x1.000000p+0
 should be:  -9.99999940e-01  -0x1.fffffep-1
 difference:  5.96046447e-08   0x1.000000p-24
 ulp       :  1.0000
 max.ulp   :  0.0000

Apply a similar fix to exp2m1f.

Co-authored-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
8 months agomisc: Align argument name for pkey_*() functions with the manual
Yury Khrustalev [Wed, 6 Nov 2024 13:05:57 +0000 (13:05 +0000)] 
misc: Align argument name for pkey_*() functions with the manual

Change name of the access_rights argument to access_restrictions
of the following functions:

 - pkey_alloc()
 - pkey_set()

as this argument refers to access restrictions rather than access
rights and previous name might have been misleading.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
8 months agomanual: Use more precise wording for memory protection keys
Yury Khrustalev [Wed, 6 Nov 2024 13:04:27 +0000 (13:04 +0000)] 
manual: Use more precise wording for memory protection keys

Update the name of the argument in several pkey_*() functions that refers
to access restrictions rather than access rights: change access "rights"
to access "restrictions".

Specify that the result of the pkey_get() should be checked using bitwise
operations rather than plain equals comparison.
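
A bitwise check of this kind can be sketched as follows.  The helper
name is hypothetical, and the fallback macro values are only used when
the headers in use do not already define them:

```c
#include <stdbool.h>

#ifndef PKEY_DISABLE_ACCESS
# define PKEY_DISABLE_ACCESS 0x1
#endif
#ifndef PKEY_DISABLE_WRITE
# define PKEY_DISABLE_WRITE 0x2
#endif

/* RESTRICTIONS is a value as returned by pkey_get.  Test the bit of
   interest instead of comparing the whole value, since other
   restriction bits may be set at the same time.  */
bool
pkey_write_disabled (int restrictions)
{
  return (restrictions & PKEY_DISABLE_WRITE) != 0;
}
```

A plain comparison such as "restrictions == PKEY_DISABLE_WRITE" would
miss the case where PKEY_DISABLE_ACCESS is set as well.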

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
8 months agoelf: Switch to main malloc after final ld.so self-relocation
Florian Weimer [Wed, 6 Nov 2024 09:33:44 +0000 (10:33 +0100)] 
elf: Switch to main malloc after final ld.so self-relocation

Before commit ee1ada1bdb8074de6e1bdc956ab19aef7b6a7872
("elf: Rework exception handling in the dynamic loader
[BZ #25486]"), the previous order called the main calloc
to allocate a shadow GOT/PLT array for auditing support.
This happened before libc.so.6 ELF constructors were run, so
a user malloc could run without libc.so.6 having been
initialized fully.  One observable effect was that
environ was NULL at this point.

It does not seem to be possible at present to trigger such
an allocation, but it seems more robust to delay switching
to main malloc until after ld.so self-relocation is complete.
The elf/tst-rtld-no-malloc-audit test case fails with a
2.34-era glibc that does not have this fix.

Reviewed-by: DJ Delorie <dj@redhat.com>
8 months agoelf: Introduce _dl_relocate_object_no_relro
Florian Weimer [Wed, 6 Nov 2024 09:33:44 +0000 (10:33 +0100)] 
elf: Introduce _dl_relocate_object_no_relro

And make _dl_protect_relro apply RELRO conditionally.

Reviewed-by: DJ Delorie <dj@redhat.com>
8 months agoelf: Do not define consider_profiling, consider_symbind as macros
Florian Weimer [Wed, 6 Nov 2024 09:33:44 +0000 (10:33 +0100)] 
elf: Do not define consider_profiling, consider_symbind as macros

This avoids surprises when refactoring the code if these identifiers
are re-used later in the file.

Reviewed-by: DJ Delorie <dj@redhat.com>