tst-mallocfork calls fork from a signal handler, leading to sporadic
deadlocks in multi-threaded processes, since fork is not
async-signal-safe when multiple threads are running. This commit
therefore adds tst-mallocfork to the appropriate exception list.
In permissive mode, during audit module handling, check_gcs is unaware
that it is handling audit modules rather than the binary itself. This
causes the loader to fail to load the audit module instead of loading
it and disabling GCS.
Also extends the GCS tests with 4 LD_AUDIT tests:
1. tst-gcs-audit-disabled: checks that an audit module without GCS
marking is loaded with default GCS support.
2. tst-gcs-audit-enforced: checks that an audit module without GCS
marking is not loaded when GCS is enforced.
3. tst-gcs-audit-optional: checks that an audit module without GCS
marking is loaded when GCS is optional.
4. tst-gcs-audit-override: checks that an audit module without GCS
marking is loaded when GCS is overridden.
Checked on aarch64-linux-gnu with Linux 6.18, on an emulated Apple M4
(for BTI support) and under qemu 10.1.50 (for GCS).
Sachin Monga [Mon, 12 Jan 2026 17:40:15 +0000 (12:40 -0500)]
ldbl-128ibm-compat: Add local aliases for printf family symbols
When the compiler selects the IEEE-128 long double ABI (-mabi=ieeelongdouble),
calls to printf, fprintf, sprintf and snprintf are redirected to the
__printfieee128, __fprintfieee128, __sprintfieee128 and __snprintfieee128
symbols respectively. This causes "break printf" (and others) in
GDB to fail because the original symbol names do not exist as global
symbols in libc.so.6.
Fix this by adding local symbol aliases in the ieee128 compatibility
files so that the original symbol names are present in the symbol table
again. This restores the expected GDB behavior ("break printf" works)
without requiring dynamic symbols or versioned compatibility symbols.
Commit 13cfd77bf5 broke the b5d88fa6c3 fix by removing the symbol to
__symbol redirections. Although the result works at -O2 with both gcc
and clang, at -Os the libcall might still be issued without the
redirection.
This patch reinstates the b5d88fa6c3 fix, with a modification that
allows each ifunc variant to control which trunc to issue. This is
required for clang, which defines HAVE_X86_INLINE_TRUNC to 1 (meaning
that trunc will always be lowered to the instruction on -Os).
Checked on x86_64-linux-gnu with -O2 and -Os with gcc-15 and clang-18.
The CORE-MATH c423b9a3 commit made atanh use slightly different
muldd_acc and polydd (which uses muldd_acc internally) compared
to the previous version.
The new tests were suggested by Paul Zimmermann (although I did
not see any regression).
Checked on x86_64-linux-gnu, x86_64-linux-gnu-v3, aarch64-linux-gnu,
and i686-linux-gnu.
Reviewed-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
The CORE-MATH c423b9a3 commit made atanh use slightly different
muldd_acc, mulddd, and polydd (which uses muldd_acc internally)
compared to asinh and acosh.
The new tests were suggested by Paul Zimmermann (although I did
not see any regression).
Checked on x86_64-linux-gnu, x86_64-linux-gnu-v3, aarch64-linux-gnu,
and i686-linux-gnu.
Reviewed-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
muldd was renamed to muldd_acc to avoid deviating from CORE-MATH
(there the symbol and its logic are replicated across multiple
implementations; unlike CORE-MATH, glibc consolidates it in
ddcoremath.h).
Checked on x86_64-linux-gnu, x86_64-linux-gnu-v3, aarch64-linux-gnu,
and i686-linux-gnu.
Reviewed-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Samuel Thibault [Sun, 11 Jan 2026 01:00:25 +0000 (02:00 +0100)]
hurd: Fix sigreturn clobbering some xmm registers
__sigreturn2 uses _hurd_sigstate_unlock after restoring the interrupted
xmm values, so it must not touch xmm. It makes sense to inline
sigstate_is_global_rcv and _hurd_sigstate_lock/unlock anyway. unlock
calls gsync_wake, so we need to avoid xmm there as well.
Xi Ruoyao [Thu, 8 Jan 2026 07:27:53 +0000 (15:27 +0800)]
Linux: fix copy_file_range test on Linux >= 6.18
On Linux >= 6.18, the kernel submits the new COPY_FILE_RANGE_64
operation to the fuse implementation for large files. There is a
fall-back routine to COPY_FILE_RANGE but it's only used if
COPY_FILE_RANGE_64 returns ENOSYS.
So, return ENOSYS instead of EIO for "unsupported" operations so that
the kernel does the right thing in this case, and possibly also when
new operations are added to the kernel fuse interface in the future.
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Yury Khrustalev [Wed, 10 Dec 2025 15:00:26 +0000 (15:00 +0000)]
aarch64: Fix PT_GNU_PROPERTY checks for static exe (BZ 33713)
All checks related to the PT_GNU_PROPERTY bits would be skipped
if the binary had no PT_GNU_PROPERTY note at all. This meant that
enforcing an abort when some bits are not present was not possible.
Paul Eggert [Sat, 3 Jan 2026 18:27:52 +0000 (10:27 -0800)]
Better terminology for ‘long double’ in manual
* manual/math.texi (Mathematical Constants):
Don’t say that long double is “the same as” double, as the
types remain distinct (problem reported by Keith Thompson).
Also, don’t imply that float is the “narrowest”, as floating
point types don’t have widths in Standard C. Instead, talk
about precision and exponent range.
Paul Eggert [Thu, 1 Jan 2026 21:19:24 +0000 (13:19 -0800)]
Pass glibc pre-commit checks
This is needed for the next patch which updates copyright dates.
* elf/sprof.c:
* sysdeps/unix/sysv/linux/tst-pidfd_getinfo.c:
Remove trailing white space.
* misc/tst-atomic.c: Remove trailing empty line.
Fix regression in commit 7447efa9622cb33a567094833f6c4000b3ed2e23
("malloc: remove fastbin code from malloc_info") where the closing
`sizes` tag had a typo, missing the '/'.
malloc.c:1909:8: error: converting the result of '<<' to a boolean always evaluates to true [-Werror,-Wtautological-constant-compare]
1909 | if (!DEFAULT_THP_PAGESIZE || mp_.thp_mode != malloc_thp_mode_not_supported)
| ^
../sysdeps/unix/sysv/linux/aarch64/malloc-hugepages.h:19:35: note: expanded from macro 'DEFAULT_THP_PAGESIZE'
19 | #define DEFAULT_THP_PAGESIZE (1UL << 21)
elf: Fix elf/tst-decorate-maps on aarch64 after 321e1fc73f
The intention of the call "xmalloc(256 * 1024)" in tst-decorate-maps is
to force malloc() to fall back to using mmap() since such an amount
won't be available from the main heap.
Post 321e1fc73f, on aarch64, the heap gets extended by default by at
least 2MB, thus the aforementioned call may get satisfied on the main
heap itself. Thus, increase the amount of memory requested to force the
mmap() path again.
x86: Do not use __builtin_isinf_sign for _Float64x/long double
Neither gcc [1] nor clang [2] handles pseudo-normal numbers correctly
with __builtin_isinf_sign, so disable its usage for the _Float64x and
long double types.
This only affects x86, so add a new define __FP_BUILTIN_ISINF_SIGN_DENORMAL
to gate long double and related types to the libc function instead.
It fixes the regression on test-ldouble-isinf when built with clang:
Failure: isinf (pseudo_zero): Exception "Invalid operation" set
Failure: isinf (pseudo_inf): Exception "Invalid operation" set
Failure: isinf (pseudo_qnan): Exception "Invalid operation" set
Failure: isinf (pseudo_snan): Exception "Invalid operation" set
Failure: isinf (pseudo_unnormal): Exception "Invalid operation" set
Failure: isinf_downward (pseudo_zero): Exception "Invalid operation" set
Failure: isinf_downward (pseudo_inf): Exception "Invalid operation" set
Failure: isinf_downward (pseudo_qnan): Exception "Invalid operation" set
Failure: isinf_downward (pseudo_snan): Exception "Invalid operation" set
Failure: isinf_downward (pseudo_unnormal): Exception "Invalid operation" set
Failure: isinf_towardzero (pseudo_zero): Exception "Invalid operation" set
Failure: isinf_towardzero (pseudo_inf): Exception "Invalid operation" set
Failure: isinf_towardzero (pseudo_qnan): Exception "Invalid operation" set
Failure: isinf_towardzero (pseudo_snan): Exception "Invalid operation" set
Failure: isinf_towardzero (pseudo_unnormal): Exception "Invalid operation" set
Failure: isinf_upward (pseudo_zero): Exception "Invalid operation" set
Failure: isinf_upward (pseudo_inf): Exception "Invalid operation" set
Failure: isinf_upward (pseudo_qnan): Exception "Invalid operation" set
Failure: isinf_upward (pseudo_snan): Exception "Invalid operation" set
Failure: isinf_upward (pseudo_unnormal): Exception "Invalid operation" set
Checked on x86_64-linux-gnu with gcc-15 and clang-18.
x86: Do not use __builtin_fpclassify for _Float64x/long double
Neither gcc [1] nor clang [2] handles pseudo-normal numbers correctly
with __builtin_fpclassify, so disable its usage for the _Float64x and
long double types.
This only affects x86, so add a new header, fp-builtin-denormal.h, that
defines whether the architecture requires disabling the optimization
through a new glibc define (__FP_BUILTIN_FPCLASSIFY_DENORMAL).
It fixes the regression on test-ldouble-fpclassify and
test-float64x-fpclassify when built with clang:
Failure: fpclassify (pseudo_zero): Exception "Invalid operation" set
Failure: fpclassify (pseudo_inf): Exception "Invalid operation" set
Failure: fpclassify (pseudo_qnan): Exception "Invalid operation" set
Failure: fpclassify (pseudo_snan): Exception "Invalid operation" set
Failure: fpclassify (pseudo_unnormal): Exception "Invalid operation" set
Failure: fpclassify_downward (pseudo_zero): Exception "Invalid operation" set
Failure: fpclassify_downward (pseudo_inf): Exception "Invalid operation" set
Failure: fpclassify_downward (pseudo_qnan): Exception "Invalid operation" set
Failure: fpclassify_downward (pseudo_snan): Exception "Invalid operation" set
Failure: fpclassify_downward (pseudo_unnormal): Exception "Invalid operation" set
Failure: fpclassify_towardzero (pseudo_zero): Exception "Invalid operation" set
Failure: fpclassify_towardzero (pseudo_inf): Exception "Invalid operation" set
Failure: fpclassify_towardzero (pseudo_qnan): Exception "Invalid operation" set
Failure: fpclassify_towardzero (pseudo_snan): Exception "Invalid operation" set
Failure: fpclassify_towardzero (pseudo_unnormal): Exception "Invalid operation" set
Failure: fpclassify_upward (pseudo_zero): Exception "Invalid operation" set
Failure: fpclassify_upward (pseudo_inf): Exception "Invalid operation" set
Failure: fpclassify_upward (pseudo_qnan): Exception "Invalid operation" set
Failure: fpclassify_upward (pseudo_snan): Exception "Invalid operation" set
Failure: fpclassify_upward (pseudo_unnormal): Exception "Invalid operation" set
Checked on x86_64-linux-gnu with gcc-15 and clang-18.
Sergey Kolosov [Mon, 15 Dec 2025 12:00:01 +0000 (13:00 +0100)]
resolv: Add test for NOERROR/NODATA handling [BZ #14308]
Add a test which verifies that getaddrinfo does not fail if one of the
A/AAAA responses is a NOERROR/NODATA reply with recursion unavailable
and the other response provides an address.
Yao Zihong [Fri, 19 Dec 2025 23:46:42 +0000 (17:46 -0600)]
riscv: Add RVV memset for both multiarch and non-multiarch builds
This patch adds an RVV-optimized implementation of memset for RISC-V and
enables it for both multiarch (IFUNC) and non-multiarch builds.
The implementation integrates Hau Hsu's 2023 RVV work under a unified
ifunc-based framework. A vectorized version (__memset_vector) is added
alongside the generic fallback (__memset_generic). The runtime resolver
selects the RVV variant when RISCV_HWPROBE_KEY_IMA_EXT_0 reports vector
support (RVV).
Currently, the resolver still selects the RVV variant even when the RVV
extension is disabled via prctl(). As a consequence, any process that
has RVV disabled via prctl() will receive SIGILL when calling memset().
Co-authored-by: Jerry Shih <jerry.shih@sifive.com>
Co-authored-by: Jeff Law <jeffreyalaw@gmail.com>
Signed-off-by: Yao Zihong <zihong.plct@isrc.iscas.ac.cn>
Reviewed-by: Peter Bergner <bergner@tenstorrent.com>
Will issue a __gttf2 call instead of a __unordtf2 followed by the
comparison.
Using the generic implementation fixes multiple issues with math tests,
such as:
Failure: fmax (0, qNaN): Exception "Invalid operation" set
Failure: fmax (0, -qNaN): Exception "Invalid operation" set
Failure: fmax (-0, qNaN): Exception "Invalid operation" set
Failure: fmax (-0, -qNaN): Exception "Invalid operation" set
Failure: fmax (9, qNaN): Exception "Invalid operation" set
Failure: fmax (9, -qNaN): Exception "Invalid operation" set
Failure: fmax (-9, qNaN): Exception "Invalid operation" set
Failure: fmax (-9, -qNaN): Exception "Invalid operation" set
It has a small performance overhead due to the extra isunordered check
(which could be omitted for the float and double types). Using _Generic
(similar to how __MATH_TG does) on a binary function requires a lot of
boilerplate macros.
[1] https://github.com/llvm/llvm-project/issues/172499
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
elf: Support vDSO with more than one PT_LOAD with v_addr starting at 0 (BZ 32583)
The setup_vdso code assumes that the vDSO contains only one PT_LOAD
segment and that 0 is the sentinel for the start mapping address.
Although the kernel avoids adding more than one PT_LOAD for
compatibility reasons, there is no inherent issue that prevents glibc
from supporting a vDSO with multiple PT_LOAD segments (as some wrapper
tools produce [1]).
To support multiple PT_LOAD segments, replace the sentinel with a bool
to indicate that the VMA start has already been set.
Testing is really tricky, since the bug report does not indicate which
tool was used to trigger the issue, nor a runtime that provides a vDSO
with multiple PT_LOAD. I had to modify the qemu user with a custom
script to create 2 PT_LOAD sections, remove checks that prevent the
vDSO object from being created, and remove the load bias adjustment
in load_elf_vdso. I could not come up with an easy test case to
integrate with glibc.
The Linux kernel provides vDSO with only one PT_LOAD due to
compatibility reasons. For instance
* arch/arm64/kernel/vdso/vdso.lds.S
86 /*
87  * We must supply the ELF program headers explicitly to get just one
88  * PT_LOAD segment, and set the flags explicitly to make segments read-only.
89  */
90 PHDRS
91 {
92 	text PT_LOAD FLAGS(5) FILEHDR PHDRS; /* PF_R|PF_X */
93 	dynamic PT_DYNAMIC FLAGS(4); /* PF_R */
94 	note PT_NOTE FLAGS(4); /* PF_R */
95 }
* arch/x86/entry/vdso/vdso-layout.lds.S
95 /*
96  * We must supply the ELF program headers explicitly to get just one
97  * PT_LOAD segment, and set the flags explicitly to make segments read-only.
98  */
99 PHDRS
100 {
101 	text PT_LOAD FLAGS(5) FILEHDR PHDRS; /* PF_R|PF_X */
102 	dynamic PT_DYNAMIC FLAGS(4); /* PF_R */
103 	note PT_NOTE FLAGS(4); /* PF_R */
104 	eh_frame_hdr PT_GNU_EH_FRAME;
105 }
nptl: Make pthread_{clock, timed}join{_np} act on all cancellation (BZ 33717)
pthread_join/pthread_timedjoin_np/pthread_clockjoin_np will not act on
cancellation if (1) some other thread is already waiting on the
'joinid', or (2) the thread has already exited.
In nptl/pthread_join_common.c, case (1) is due to the CAS doing an
early return:
80 else if (__glibc_unlikely (atomic_compare_exchange_weak_acquire (&pd->joinid,
81 &self,
82 NULL)))
83 /* There is already somebody waiting for the thread. */
84 return EINVAL;
Same as support_process_state_wait, but wait for the task TID
(obtained with gettid) of the current process. Since the kernel
might remove /proc/<pid>/task/<tid>/status at any time once the
thread terminates, the code needs to handle possible
fopen/getline/fclose failures due to a nonexistent file.
And use the new __pthread_descriptor_valid function that checks
for 'joinstate' to get the thread state instead of 'tid'. The
joinstate is set by the kernel when the thread exits.
nptl: Do not use pthread set_tid_address as state synchronization (BZ #19951)
The use-after-free described in BZ#19951 is due to the use of two
different PD fields, 'joinid' and 'cancelhandling', to describe the
thread state and to synchronise the calls of pthread_join,
pthread_detach, pthread_exit, and normal thread exit.
Any state change may require checking both fields atomically to handle
partial state (e.g., pthread_join() with a cancellation handler to
issue a 'joinstate' field rollback).
This patch uses a different PD member with 4 possible states (JOINABLE,
DETACHED, EXITING, and EXITED) instead of the pthread 'tid' field, with
the following logic:
1. On pthread_create, the initial state is set either to JOINABLE or
DETACHED depending on the pthread attribute used.
2. On pthread_detach, a CAS is issued on the state. If the CAS fails,
the thread is already detached (DETACHED) or being terminated (EXITING).
For the former, an EINVAL is returned; for the latter, pthread_detach
should be responsible for joining the thread (and for deallocating any
internal resources).
3. In the exit phase of the wrapper function for the thread start routine
(reached if the thread function has returned, pthread_exit has been
called, or a cancellation request has been acted upon), we issue a
CAS on the state to set it to EXITING.
If the thread is previously in DETACHED mode, the thread is responsible
for deallocating any resources; otherwise, the thread must be joined
(detached threads cannot deallocate themselves immediately).
4. The clear_tid_field on 'clone' call is changed to set the new 'state'
field on thread exit (EXITED). This state is only reached at thread
termination.
5. The pthread_join implementation is now simpler: the futex wait is done
directly on thread state, and there is no need to reset it in case of
timeout since the state is now set either by pthread_detach() or by the
kernel on process termination.
The race condition on pthread_detach is avoided with a single atomic
operation on the PD state: once the mode is set to THREAD_STATE_DETACHED, it
is up to the thread itself to deallocate its memory (done during the exit
phase at pthread_create()).
Also, the INVALID_NOT_TERMINATED_TD_P macro is removed since a negative
tid is not possible, and the macro is not used anywhere.
This change triggers an invalid C11 thread test: it creates a thread that
detaches, and after a timeout, the creating thread checks whether the join
fails. The issue is that once thrd_join() is called, the thread's lifetime
is not defined.
Checked on x86_64-linux-gnu, i686-linux-gnu, aarch64-linux-gnu,
arm-linux-gnueabihf, and powerpc64-linux-gnu.
Stefan Liebler [Fri, 19 Dec 2025 10:19:53 +0000 (11:19 +0100)]
build-many-glibcs.py: Fix s390x-linux-gnu.
The recent commit 638d437dbf9c68e40986edaa9b0d1c2e72a1ae81
"Deprecate s390-linux-gnu (31bit)"
leads to:
FAIL: compilers-s390x-linux-gnu gcc build
when it tries to build 31bit libgcc.
The build is fixed by explicitly disabling multilib.
Sunil K Pandey [Tue, 9 Dec 2025 16:57:44 +0000 (08:57 -0800)]
nptl: Optimize trylock for high cache contention workloads (BZ #33704)
Check lock availability before acquisition to reduce cache line
bouncing. Significantly improves trylock throughput on multi-core
systems under heavy contention.
Tested on x86_64.
Fixes BZ #33704.
Co-authored-by: Alex M Wells <alex.m.wells@intel.com>
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Reinstate the HAVE_64B_ATOMICS configure check that was reverted by
commit 7fec8a5de6826ef9ae440238d698f0fe5a5fb372 due to BZ #33632. That
bug was fixed by 3dd2cbfa35e0e6e0345633079bd5a83bb822c2d8, which only
allows 64-bit atomics on sem_t if its type is 8-byte aligned. Rebase
and add in the cleanups in include/atomic.h that were omitted.
Fix an issue with sparcv8-linux-gnu-leon3 forcing -mcpu=v8 for rtld.c which
overrules -mcpu=leon3 and causes __atomic_always_lock_free (4, 0) to
incorrectly return 0 and trigger asserts in atomics. Remove this as it
seems to be a workaround for an issue in 1997.
Jiayuan Chen [Mon, 17 Nov 2025 08:06:48 +0000 (16:06 +0800)]
Update struct tcp_info and the related TCP_AO_* structs in netinet/tcp.h from Linux 6.17
This patch updates struct tcp_info to include new fields from Linux 6.17:
- tcpi_pacing_rate, tcpi_max_pacing_rate
- tcpi_bytes_acked, tcpi_bytes_received
- tcpi_delivery_rate, tcpi_busy_time
- tcpi_delivered, tcpi_delivered_ce
- and many other TCP metrics
Additionally, this patch adds:
- TCP_AO_* definitions (Authentication Option)
- struct tcp_diag_md5sig for INET_DIAG_MD5SIG
- Netlink attribute types for SCM_TIMESTAMPING_OPT_STATS
All changes are synchronized from the Linux kernel's tcp.h without
functional modifications, only code style changes.
Dev Jain [Wed, 10 Dec 2025 15:03:22 +0000 (15:03 +0000)]
malloc: set default tcache fill count to 16
Now that the fastbins are gone, set the default per size class length of
tcache to 16. We observe that doing this retains the original performance
of malloc.
Dev Jain [Wed, 10 Dec 2025 15:00:18 +0000 (15:00 +0000)]
malloc: Remove do_check_remalloced_chunk
do_check_remalloced_chunk checks properties of fastbin chunks, but it is
also used to check properties of other chunks. Hence, remove it and
merge the body of the function into do_check_malloced_chunk.
Stefan Liebler [Tue, 16 Dec 2025 14:20:29 +0000 (15:20 +0100)]
Deprecate s390-linux-gnu (31bit)
The next Linux 6.19 release will remove support for compat syscalls on s390x with these commits:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0d79affa31cbee477a45642efc49957d05466307
0d79affa31cb Merge branch 'compat-removal'
4ac286c4a8d9 s390/syscalls: Switch to generic system call table generation
f4e1f1b1379d s390/syscalls: Remove system call table pointer from thread_struct
3db5cf935471 s390/uapi: Remove 31 bit support from uapi header files
8e0b986c59c6 s390: Remove compat support
169ebcbb9082 tools: Remove s390 compat support
7afb095df3e3 s390/syscalls: Add pt_regs parameter to SYSCALL_DEFINE0() syscall wrapper
b2da5f6400b4 s390/kvm: Use psw32_t instead of psw_compat_t
8c633c78c23a s390/ptrace: Rename psw_t32 to psw32_t
This patch also removes s390-linux-gnu (31bit) from build-many-glibcs.py.
Then the next update of syscall numbers for Linux 6.19 won't change
sysdeps/unix/sysv/linux/s390/s390-32/arch-syscall.h.
Reviewed-by: Andreas K. Huettel <dilfridge@gentoo.org>
posix: Fix getconf symbolic constants defined in limits.h (BZ #29147)
POSIX.1-2018 specifies that the getconf utility shall print the symbolic
constants listed under the headings Maximum Values and Minimum Values
[1]; however, the current behaviour is to print the values using
pathconf or sysconf, which reports the implementation-specific values
instead of the system-agnostic ones.
Another issue is that getconf handles such symbolic constants as a
path_var, which requires an additional pathname (and does not make sense
for constant values).
The patch fixes this by adding a new internal type, LIMITS_H, which only
prints the symbolic constant without requiring an additional path.
Only the values defined in the glibc-provided limits.h plus the GNU
extensions are handled.
configure: use TEST_CC to check for --no-error-execstack
ld.lld does not support the --no-error-execstack option, which is
required only to suppress a linker warning while building tests. A new
configure macro, LIBC_TEST_LINKER_FEATURE, is added to check for linker
features using TEST_CC instead of CC.
Checked on x86_64-linux-gnu and aarch64-linux-gnu with gcc, and with
TEST_CC set to clang-18 and clang-21.
Dev Jain [Wed, 10 Dec 2025 12:15:18 +0000 (12:15 +0000)]
malloc: Enable 2MB THP by default on Aarch64
Linux supports multi-sized Transparent Huge Pages (mTHP). For the purpose
of this patch description, we call the block size mapped by a non-last
level pagetable level, the traditional THP size (2M for 4K basepage,
512M for 64K basepage). Linux now also supports intermediate THP sizes
mapped by the last level pagetable - we call that the mTHP size.
The support for mTHP in Linux has grown to be better and stable over time -
applications can benefit from reduced page faults and reduced kernel
memory management overhead, albeit at the cost of internal fragmentation.
We have observed consistent performance boosts with mTHP with little
variance.
As a result, enable 2M THP by default on AArch64. This enables THP even
if the user hasn't passed glibc.malloc.hugetlb=1. If the user has passed
it, we avoid making the system call to check the hugepage size from
sysfs and override it with the hardcoded 2MB.
There are two additional benefits of this patch, if the transparent
hugepage sysctl is set to madvise or always:
1) The THP size is now hardcoded to 2MB for Aarch64. This avoids a
syscall for fetching the THP size from sysfs.
2) On 64K basepage size systems, the traditional THP size is 512M, which
is unusable and impractical. We can instead benefit from the mTHP size of
2M. Apart from the usual benefit of THPs/mTHPs as described above, Aarch64
systems benefit from reduced TLB pressure on this mTHP size, commonly
known as the "contpte" size. If the application takes a pagefault, and
either the THP sysctl settings is "always", or the virtual memory area
has been madvise(MADV_HUGEPAGE)'d along with sysctl being "madvise", then
Linux will fault in a 2M mTHP, mapping contiguous pages into the pagetable,
and painting the pagetable entries with the cont-bit. This bit is a hint to
the hardware that the concerned pagetable entry maps a page which is part
of a set of contiguous pages - the TLB then only remembers a single entry
for this set of 2M/64K = 32 pages, because the physical address of any
other page in this contiguous set is computable by the TLB cached physical
address via a linear offset. Hence, what was only possible with the
traditional THP size, is now possible with the mTHP size.
We see a 6.25% performance improvement on SPEC.
If the sysctl is set to never, no transparent hugepages will be created by
the kernel. But, this patch still sets thp_pagesize = 2MB. The benefit is
that on MORECORE() invocation, we extend the heap by 2MB instead of 4KB,
potentially reducing the frequency of this syscall's invocation by 512x.
Note that, there is no difference in cost between an sbrk(2M) and sbrk(4K);
the kernel only does a virtual reservation and does not touch user physical
memory.
Dev Jain [Wed, 10 Dec 2025 12:09:36 +0000 (12:09 +0000)]
malloc: Do not make out-of-bounds madvise call on non-aligned heap
Currently, if the initial program break is not aligned to the system page
size, then we align the pointer down to the page size. If there is a gap
before the heap VMA, then such an adjustment means that the madvise() range
now contains a gap. The behaviour in the upstream kernel is currently this:
madvise() will return -ENOMEM, even though the operation will still succeed
in the sense that the VM_HUGEPAGE flag will be set on the heap VMA. We
*must not* depend on this behaviour - this is an internal kernel
implementation, and earlier kernels may possibly abort the operation
altogether.
The other case is that there is no gap, and as a result we may end up
setting the VM_HUGEPAGE flag on that other VMA too, which is an
unnecessary side effect.
Let us fix this by aligning the pointer up to the page size. We should
also subtract the pointer difference from the size, because if we don't,
since the pointer is now aligned up, the size may cross the heap VMA, thus
leading to the same problem but at the other end.
There is no need to check this new size against mp_.thp_pagesize to decide
whether to make the madvise() call. The reason we make this check at the
start of madvise_thp() is to check whether the size of the VMA is enough
to map THPs into it. Since that check has passed, all that we need to
ensure now is that q + size does not cross the heap VMA.
The openat2 syscall was added in Linux 5.6 as an extension of openat.
Unlike other open-like functions, the kernel provides only the LFS
variant (so opening files larger than 4GB always succeeds, unlike other
functions when an offset is larger than off_t). Also, similar to other
open functions, the new symbol is a cancellable entrypoint.
The test case added only stress tests for some of the syscalls' provided
functionality, and it is based on an existing kernel self-test.
A fortify wrapper is added to verify that the argument size is not
larger than the currently supported open_how struct.
Gnulib added an openat2 module, which takes the open_how argument as
read-only [1]. There is no clear indication whether the kernel will
indeed use the argument as in-out, how it would do so, or for which
kind of functionality [2]. Also, adding a prototype that potentially
differs from gnulib's would only add unnecessary friction and extra
wrappers to handle it.
Checked on x86_64-linux-gnu and aarch64-linux-gnu.
Clang support is still experimental and not all test cases run
correctly. Only clang 18 and onwards is supported, and only for
x86_64-linux-gnu and aarch64-linux-gnu.
Reviewed-by: Collin Funk <collin.funk1@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Clang warns that another_external_impl always resolves to __internal_impl,
even if external_impl is a weak reference. Using the internal symbol for
both aliases resolves this warning.
This issue also occurs with certain libc_hidden_def usage:
int __internal_impl (...) {}
weak_alias (__internal_impl, __internal_alias)
libc_hidden_weak (__internal_alias)
In this case, using a strong_alias is sufficient to avoid the warning
(since the alias is internal, there is no need to use a weak alias).
Clang warns that the internal external_alias will always resolve to
__GI___internal_impl, even if a weak definition of __GI_internal_impl is
overridden. For this case, a new macro named static_weak_alias is used
to create a strong alias for SHARED, or a weak_alias otherwise.
With these changes, there is no need to check and enable the
-Wno-ignored-attributes suppression when using clang.
Checked with a build on affected ABIs, and a full check on aarch64,
armhf, i686, and x86_64.
H.J. Lu [Sun, 7 Dec 2025 03:33:33 +0000 (11:33 +0800)]
x32: Implement prctl in assembly
Since the variadic prctl function takes at most 5 integer arguments,
which are passed in the same integer registers on x32 as a function with
5 integer arguments, we can use assembly for prctl. Since the upper 32
bits of the last 4 arguments of prctl must be cleared to match the x32
prctl syscall interface, where the last 4 arguments are unsigned 64-bit
longs, implement prctl in assembly to clear the upper 32 bits of the
last 4 arguments, and add a test to verify it.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
while (getline (&line, &len, fp) != -1)
  ;
/* Process LINE.  */
After that commit, line[0] would be equal to '\0' instead of containing
the last line of the file like before that commit. A recent POSIX issue
clarified that the behavior before and after that commit are allowed,
since the contents of LINE are unspecified after -1 is returned
[1]. However, some programs rely on the previous behavior.
This patch null terminates the buffer upon getdelim/getline's initial
allocation. This is compatible with previous glibc versions, while also
protecting the caller from reading uninitialized memory if the file is
empty, as long as getline/getdelim does the initial allocation.
i386: Fix fmod/fmodf/remainder/remainderf for gcc-12
The __builtin_fmod{f} and __builtin_remainder{f} builtins were added in
gcc 13, and the minimum supported gcc is 12. This patch adds a configure
test to check whether the compiler enables inlining for fmod/remainder,
and uses inline assembly if not.
Wilco Dijkstra [Thu, 4 Dec 2025 15:17:25 +0000 (15:17 +0000)]
nptl: Check alignment of pthread structs
Report assertion failure if the alignment of external pthread structs is
lower than the internal version. This triggers on type mismatches like
in BZ #33632.
James Chesterman [Fri, 28 Nov 2025 11:18:53 +0000 (11:18 +0000)]
aarch64: Optimise AdvSIMD atanhf
Optimise AdvSIMD atanhf by vectorising the special case.
There are asymptotes at x = -1 and x = 1, so return infinity of the
matching sign for these. For values with |x| > 1, return NaN.
R.Throughput difference on V2 with GCC@15:
58-60% improvement in special cases.
No regression in fast pass.
James Chesterman [Fri, 28 Nov 2025 11:18:52 +0000 (11:18 +0000)]
aarch64: Optimise AdvSIMD asinhf
Optimise AdvSIMD asinhf by vectorising the special case.
For values greater than 0x1p64, scale the input down first.
This is because the output will overflow with inputs greater than
or equal to this value as there is a squaring operation in the
algorithm.
To scale, use:
2asinh(sqrt[(x-1)/2])
because:
2asinh(x) = +-acosh(2x^2 + 1).
Apply the opposite operations in the opposite order for x, and you get:
acosh(x) = 2asinh(sqrt[(x-1)/2]).
We found that this asinh-based form also very closely approximates
asinh(x) for large inputs x.
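The identity chain can be written out as a short derivation, starting from the standard identity acosh(2x^2 + 1) = 2 asinh(x) for x >= 0:

```latex
\begin{align*}
\operatorname{acosh}(2x^{2}+1) &= 2\operatorname{asinh}(x), \qquad x \ge 0 \\
\intertext{substituting $y = 2x^{2}+1$, i.e.\ $x = \sqrt{(y-1)/2}$:}
\operatorname{acosh}(y) &= 2\operatorname{asinh}\!\left(\sqrt{\tfrac{y-1}{2}}\right) \\
\intertext{and since $\operatorname{asinh}(y)-\operatorname{acosh}(y) \to 0$ as $y \to \infty$:}
\operatorname{asinh}(y) &\approx 2\operatorname{asinh}\!\left(\sqrt{\tfrac{y-1}{2}}\right).
\end{align*}
```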
R.Throughput difference on V2 with GCC@15:
25-58% improvement in special cases.
4% regression in fast pass.
James Chesterman [Fri, 28 Nov 2025 11:18:51 +0000 (11:18 +0000)]
aarch64: Optimise AdvSIMD acoshf
Optimise AdvSIMD acoshf by vectorising the special case.
For values greater than 0x1p64, scale the input down first.
This is because the output will overflow with inputs greater than
or equal to this value as there is a squaring operation in the
algorithm.
To scale, use:
2acosh(sqrt[(x+1)/2])
because:
acosh(x) = 1/2 acosh(2x^2 - 1) for x >= 1.
Apply the opposite operations in the opposite order for x, and you get:
acosh(x) = 2acosh(sqrt[(x+1)/2]).
R.Throughput difference on V2 with GCC@15:
30-49% improvement in special cases.
2% regression in fast pass.