git.ipfire.org Git - thirdparty/glibc.git/log

math: Add fast-path to fma

For normal numbers there is no need to issue scalbn; the fma can set
the exponent directly. Performance-wise on x86_64-linux-gnu without
multi-arch it shows a latency improvement of ~5% and throughput of ~7%
(and slightly more for ABIs with tail-call optimization).
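
As a rough illustration of what setting the exponent directly means, the
sketch below scales a normal double by a power of two by adjusting the
exponent field of its IEEE 754 binary64 encoding. The helper name
scale_normal is hypothetical and this is not the glibc code; it assumes
that both the input and the result are normal numbers.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: multiply a normal double X by 2^E by adding E to
   the biased exponent field.  Only valid when X and the result are both
   normal (the biased exponent stays within 1..2046).  */
static double
scale_normal (double x, int e)
{
  uint64_t bits;
  memcpy (&bits, &x, sizeof bits);
  /* The exponent field starts at bit 52; unsigned wraparound makes a
     negative E subtract from it.  */
  bits += (uint64_t) (int64_t) e << 52;
  memcpy (&x, &bits, sizeof bits);
  return x;
}
```

Subnormal, zero, infinite, and NaN values still need the general path,
which is why this only works as a fast path.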

Checked on x86_64-linux-gnu and i686-linux-gnu with --disable-multi-arch.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>

posix: Add POSIX aliases to some spawn functions

Both `posix_spawn_file_actions_add{,f}chdir` functions are now fully
defined by POSIX-2024. This patch adds both functions as aliases of the
already existing `posix_spawn_file_actions_add{,f}chdir_np` GNU
extensions.

This makes glibc more compliant with POSIX-2024.
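
A minimal usage sketch, assuming a Linux system with /bin/sh. It calls
the existing `posix_spawn_file_actions_addchdir_np` name so that it also
builds against older glibc; per this patch the POSIX-2024 name is an
alias with the same arguments. The helper run_in_dir is hypothetical.

```c
#define _GNU_SOURCE
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper: run "/bin/sh -c CMD" with DIR as the child's
   working directory.  Returns the child's exit status, or -1 on error.  */
static int
run_in_dir (const char *dir, const char *cmd)
{
  posix_spawn_file_actions_t fa;
  if (posix_spawn_file_actions_init (&fa) != 0)
    return -1;
  int ret = -1;
  /* Queue a chdir to be performed in the child before the exec.  */
  if (posix_spawn_file_actions_addchdir_np (&fa, dir) == 0)
    {
      char *argv[] = { "sh", "-c", (char *) cmd, NULL };
      pid_t pid;
      int status;
      if (posix_spawn (&pid, "/bin/sh", &fa, NULL, argv, environ) == 0
          && waitpid (pid, &status, 0) == pid
          && WIFEXITED (status))
        ret = WEXITSTATUS (status);
    }
  posix_spawn_file_actions_destroy (&fa);
  return ret;
}
```

For example, run_in_dir ("/", "test \"$(pwd)\" = /") should return 0
because the shell starts in the requested directory.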

Signed-off-by: Lucas Chollet <lucas.chollet@free.fr>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

manual: update recent malloc tunable defaults

manual/tunables.texi: update glibc.malloc.tcache_count to 16, in sync
with source code; fix a typo in glibc.mem.decorate_maps tunable

Signed-off-by: Rocket Ma <marocketbd@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

libio: Fix wide stream backup buffer leak on fclose [BZ #33999]

This patch fixes a memory leak when ungetwc is used on a wide oriented stream.
The backup buffer was never freed on fclose, causing a memory leak per
ungetwc/fclose call.

The leak has two causes:

In iofclose.c, for wide streams (fp->mode > 0), _IO_new_fclose never calls
_IO_free_wbackup_area. Fixed by adding the missing call.

In wgenops.c, _IO_wdefault_finish checks fp->_IO_save_base (the narrow field,
always NULL for wide streams) instead of fp->_wide_data->_IO_save_base,
and uses a bare free() that leaves _IO_save_end and _IO_backup_base dangling.
Replace the hand-rolled cleanup with _IO_have_wbackup/_IO_free_wbackup_area,
which handles backup-mode switching and clears all three pointers.
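
The call sequence that exercises this path can be sketched as follows.
This is not the tst-wbackup-leak test itself; the function name is
hypothetical, and the comment about when libio allocates the backup
buffer describes expected behavior rather than a guarantee.

```c
#include <stdio.h>
#include <wchar.h>

/* Push one wide character back onto a wide-oriented stream and close
   the stream without consuming it.  Returns what ungetwc returned, or
   WEOF if the temporary file could not be set up.  */
static wint_t
unget_then_close (void)
{
  FILE *fp = tmpfile ();
  if (fp == NULL)
    return WEOF;
  wint_t result = WEOF;
  if (fputwc (L'a', fp) != WEOF)
    {
      rewind (fp);
      if (fgetwc (fp) == L'a')
        /* Pushing back a character different from the one just read is
           expected to make libio allocate a separate backup buffer.  */
        result = ungetwc (L'x', fp);
    }
  /* On an affected glibc the backup buffer would not be freed here.  */
  fclose (fp);
  return result;
}
```

Running such a sequence under a leak checker is one way to observe the
difference before and after the fix.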

This was independently reported by Rocket Ma [1], whose patch corrects the condition
but still uses the manual free path.

Apply the same _IO_have_backup condition in genops.c for consistency.

Tested by:
make test t=libio/tst-wbackup-leak

[1] https://patchwork.sourceware.org/project/glibc/patch/20260323171742.1039768-1-marocketbd@gmail.com/

Signed-off-by: Xiang Gao <gaoxiang@kylinos.cn>

hugepages: Move THP helpers to generic hugepages abstraction

The helpers for determining the default transparent huge page size
and THP mode are currently implemented in malloc-hugepages. However,
these interfaces are not malloc-specific and are also needed by other
subsystems (e.g. the dynamic loader for segment alignment).

Introduce a new generic hugepages abstraction and move the THP mode
detection, default THP page size probing, and hugepage configuration
helpers there. The malloc code now calls into the generic helpers
instead of duplicating the sysfs parsing.
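
For context, the kind of sysfs parsing involved can be sketched as
follows. On Linux, /sys/kernel/mm/transparent_hugepage/enabled lists the
available modes with the active one in square brackets, for example
"always [madvise] never". The enum and function names are hypothetical
and do not reflect the names used by the new abstraction.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical mode enumeration for this sketch.  */
enum thp_mode { THP_UNKNOWN, THP_ALWAYS, THP_MADVISE, THP_NEVER };

/* Return the mode that appears between '[' and ']' in S, which holds
   the contents of the sysfs "enabled" file.  */
static enum thp_mode
parse_thp_mode (const char *s)
{
  const char *lb = strchr (s, '[');
  const char *rb = lb != NULL ? strchr (lb, ']') : NULL;
  if (rb == NULL)
    return THP_UNKNOWN;
  const char *word = lb + 1;
  size_t len = (size_t) (rb - word);
  if (len == 6 && strncmp (word, "always", len) == 0)
    return THP_ALWAYS;
  if (len == 7 && strncmp (word, "madvise", len) == 0)
    return THP_MADVISE;
  if (len == 5 && strncmp (word, "never", len) == 0)
    return THP_NEVER;
  return THP_UNKNOWN;
}
```

Keeping the parser separate from the file read makes it usable by any
subsystem that obtains the file contents.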

There is no functional change. This is a pure refactoring to make
the THP detection reusable outside of malloc.

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Signed-off-by: WANG Rui <wangrui@loongson.cn>

resolv: Run tst-getaddrinfo-eai-again in a network namespace

This ensures that the test does not accidentally use a name server
running on 127.0.0.1.

Tested-by: Frédéric Bérat <fberat@redhat.com>
Reviewed-by: Frédéric Bérat <fberat@redhat.com>

htl: Fix SEM_FAILED type

io: Use gnulib fts implementation (BZ 22944, BZ 20331)

This patch synchronizes the glibc fts implementation with the latest
version from gnulib (as of 2026-02-16).

The primary motivation is to address limitations in the legacy glibc
implementation, most notably BZ 22944, where fts fails with an
ENAMETOOLONG error when traversing very long paths or deeply nested
directory trees.  The gnulib implementation dynamically reallocates
path buffers and uses openat/fchdir optimizations, effectively
lifting the MAXPATHLEN limitation.

The gnulib implementation also added extra features, which are
used by different GNU projects (coreutils, diffutils):

* FTS_TIGHT_CYCLE_CHECK: used to enable a strict, immediate
   cycle-detection algorithm during a file system traversal.  This is
   done internally using a hash table: every time the traversal enters
   a directory, it records the directory's device and inode (dev/ino)
   pair in the hash table, and before entering any directory, fts
   checks the hash table.

* FTS_CWDFD: instead of actually changing the process's current
   working directory, it maintains a virtual current working directory
   using file descriptors.  The file descriptor is stored in the
   fts_cwd_fd field and all subsequent file operations are performed
   relative to this file descriptor using *at functions.

* FTS_DEFER_STAT: performance-oriented flag that instructs the file
   tree traversal engine to delay fetching file metadata.  When the
   flag is used, fts skips the immediate stat call.  Instead, it marks
   the entry with a special internal state (FTS_NSOK and
   FTS_STAT_REQUIRED).  The actual stat call is pushed down the line
   and executed by fts_read right before the application actually
   accesses the entry.

* FTS_VERBATIM: fts_open accepts and uses the path strings exactly as
   they were provided in the arguments array without slash trimming.

* FTS_MOUNT: restricts the file tree walk to a single file system.
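
The dev/ino bookkeeping described for FTS_TIGHT_CYCLE_CHECK can be
sketched as follows. This is only an illustration: it uses a small
fixed-size array with a linear scan where the text above describes a
hash table, and all names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

#define MAX_ACTIVE 64

/* Identity of a directory: device and inode number.  */
struct dir_id { dev_t dev; ino_t ino; };

/* Directories seen so far (a stand-in for the hash table).  */
struct active_set { struct dir_id ids[MAX_ACTIVE]; size_t n; };

/* Return true if (DEV, INO) was already recorded, meaning that entering
   this directory would form a cycle.  Otherwise record it (if there is
   room) and return false.  */
static bool
enter_dir (struct active_set *set, dev_t dev, ino_t ino)
{
  for (size_t i = 0; i < set->n; i++)
    if (set->ids[i].dev == dev && set->ids[i].ino == ino)
      return true;
  if (set->n < MAX_ACTIVE)
    set->ids[set->n++] = (struct dir_id) { dev, ino };
  return false;
}
```

A traversal would call enter_dir with the st_dev and st_ino of each
directory before descending into it.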

Hopefully, it would allow some GNU projects to use the glibc
implementation instead of pulling the gnulib one.

It requires some changes to keep compatibility, compared to gnulib:

* The new required fields are added at the end of the FTS structure, and
   the new FTS flags are adjusted to avoid changing FTS_NAMEONLY/FTS_STOP
   (even though they are marked as private).

* The FTSENT uses a flexible array (fts_name), so two adjustments are
   required: the two new members (fts_fts and fts_dirp) are placed
   *before* the struct and the fts_statp is now always allocated and
   accounted (the gnulib implementation uses an always allocated member).

Checked on x86_64-linux-gnu and i686-linux-gnu.

stdlib: Add internal stdc_rotate_right implementation

It follows the C2y N3367 proposed interface, along with some tests
imported from gnulib (and adapted to glibc libsupport).
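
For reference, a rotate-right on a 32-bit value can be written as below.
The name rotr32 is hypothetical and this is not the internal glibc
implementation; it reduces the count modulo the width so that neither
shift is by 32 or more.

```c
#include <stdint.h>

/* Rotate VALUE right by COUNT bits.  COUNT is reduced modulo 32 so both
   shifts stay within the defined range.  */
static uint32_t
rotr32 (uint32_t value, unsigned int count)
{
  count &= 31;
  return (value >> count) | (value << ((32 - count) & 31));
}
```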

Checked on x86_64-linux-gnu and i686-linux-gnu.

Reviewed-by: Collin Funk <collin.funk1@gmail.com>

AArch64: Remove unused MIDR entries

Remove the now unused eMAG MIDR check and unused entries from cpu_list[].

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

AArch64: Remove eMAG memset ifunc

As a cleanup remove the eMAG ifunc for memset.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

AArch64: Remove eMAG memchr ifunc

As a cleanup remove the eMAG ifunc for memchr.

Reviewed-by: JiangNing OS <jiangning@amperemail.onmicrosoft.com>

hurd: Interrupted RPC returning EINTR when server has actually changed state.

An interrupted RPC call can return EINTR whilst the RPC is still in
progress on the server. Some RPC calls have permanent consequences
(e.g. a write() to append data to a file), but a caller seeing EINTR
should expect that no state has changed. The signal thread now stores
the server's reply (which it already waited for) as the interrupted
thread's reply.
Message-ID: <20260401194948.90428-3-mike@weatherwax.co.uk>

hurd: alterations to MSG_EXAMINE interface (intr-msg.h)

MSG_EXAMINE has been broadened to allow the signal thread (for
example) to access additional arguments that are passed to
interruptible RPCs in other threads. All architecture specific
variants of intr-msg.h now comply with the revised interface and the
single user of MSG_EXAMINE (report-wait.c) adjusted accordingly.
Message-ID: <20260401194948.90428-2-mike@weatherwax.co.uk>

malloc: Show hugetlb tunable default in --list-tunables

Update the hugetlb tunable default in elf/dl-tunables.c so it is shown as 1
with /lib/ld-linux-aarch64.so.1 --list-tunables.
Move the initialization of thp_mode/thp_pagesize to do_set_hugetlb() and
avoid accessing /sys/kernel/mm if DEFAULT_THP_PAGESIZE > 0. Switch off THP if
glibc.malloc.hugetlb=0 is used - this behaves as if DEFAULT_THP_PAGESIZE==0.
Fix the --list-tunables testcase.

Reviewed-by: DJ Delorie <dj@redhat.com>

io: ftw: Use state stack instead of recursion (BZ 33882)

The current implementation of ftw relies on recursion to traverse
directories (ftw_dir calls process_entry, which calls ftw_dir).  In deep
directory trees, this could lead to a stack overflow (as demonstrated by
the new tst-nftw-bz33882.c test).

This patch refactors ftw to use an explicit, heap-allocated stack to
manage directory traversal:

  * The 'struct ftw_frame' encapsulates the state of a single directory
    level (directory stream, stat buffer, previous base offset, and
    current state).

  * The ftw_dir is rewritten to use an iterative loop instead of
    recursion, which enables immediate state transitions without
    function call overhead.

The patch also cleans up some unused definitions and assumptions (e.g.,
free-clobbering errno) and fixes a UB when handling the ftw callback.

Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: DJ Delorie <dj@redhat.com>

math: Sync sinh from CORE-MATH

CORE-MATH commit e756933f improved the error bound in the fast path for
x_0 <= x < 1/4, along with a formal proof [1].

Checked on x86_64-linux-gnu, i686-linux-gnu, aarch64-linux-gnu,
and arm-linux-gnueabihf.

[1] https://core-math.gitlabpages.inria.fr/sinh.pdf

testsuite: fix test-narrowing-trap failure on platforms where FE_INVALID is not defined

I didn't realize it can be undefined at all instead of simply
unsupported :(.

Signed-off-by: Xi Ruoyao <xry111@xry111.site>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

Document CVE-2026-4046

Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>

x86_64: Prefer EVEX512 code-path on AMD Zen5 CPUs

Introduced a synthetic architecture preference flag (Prefer_EVEX512)
and enabled it for AMD Zen5 (CPUID Family 0x1A) when AVX-512 is supported.

This flag modifies IFUNC dispatch to prefer 512-bit EVEX variants over
256-bit EVEX variants for string and memory functions on Zen5 processors,
leveraging their native 512-bit execution units for improved throughput.
When Prefer_EVEX512 is set, the dispatcher selects evex512 implementations;
otherwise, it falls back to evex (256-bit) variants.
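
The selection rule described above can be sketched as follows. The
struct, its fields, and the function are hypothetical stand-ins; the
real selectors test CPU feature bits and return function pointers, and
the "generic" fallback here is an assumption added for completeness.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical summary of the relevant CPU state.  */
struct cpu_prefs
{
  bool has_evex256;     /* 256-bit EVEX variants are usable.  */
  bool has_avx512;      /* 512-bit EVEX variants are usable.  */
  bool prefer_evex512;  /* Stand-in for the Prefer_EVEX512 flag.  */
};

/* Return the name of the variant the sketch would dispatch to.  */
static const char *
select_variant (const struct cpu_prefs *p)
{
  if (p->has_avx512 && p->prefer_evex512)
    return "evex512";
  if (p->has_evex256)
    return "evex";
  return "generic";
}
```

With the flag cleared, a CPU that supports both widths still gets the
256-bit variant, which matches the behavior described for CPUs other
than Zen5.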

The implementation updates the IFUNC selection logic in ifunc-avx2.h and
ifunc-evex.h to check for the Prefer_EVEX512 flag before dispatching to
EVEX512 implementations. This change affects six string/memory functions:

  - strchr
  - strlen
  - strnlen
  - strrchr
  - strchrnul
  - memchr

Benchmarks conducted on AMD Zen5 hardware demonstrate significant
performance improvements across all affected functions:

Function    Baseline   Patched    Avg         Avg        Avg      Max
            Variant    Variant    Baseline    Patched    Change   Improve
                                  (ns)        (ns)       %        %
------------+----------+----------+-----------+----------+--------+--------
STRCHR      evex       evex512    16.408      12.293     25.08%   37.69%
STRLEN      evex       evex512    16.862      11.436     32.18%   56.74%
STRNLEN     evex       evex512    18.493      11.762     36.40%   64.40%
STRRCHR     evex       evex512    15.154      10.874     28.24%   44.38%
STRCHRNUL   evex       evex512    16.464      12.605     23.44%   45.56%
MEMCHR      evex       evex512    9.984       8.268      17.19%   39.99%

Additionally, a tunable option (glibc.cpu.x86_cpu_features.preferred)
is provided to allow runtime control of the Prefer_EVEX512 flag for testing
and compatibility.

Reviewed-by: Ganesh Gopalasubramanian <Ganesh.Gopalasubramanian@amd.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

math: Fix lgammaf regression on i686

The new test from 19781c2221 triggers a failure on i686:

  testing float (without inline functions)
  Failure: lgamma (0x3.12be38p+120): errno set to 0, expected 34 (ERANGE)
  Failure: lgamma_upward (0x3.12be38p+120): errno set to 0, expected 34 (ERANGE)

Use math_narrow_eval on the multiplication to force the expected
precision.
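
The effect of narrowing can be sketched as follows. On targets that
evaluate float expressions in a wider format (such as x87 on i686), a
product may stay in range until it is rounded to float. The helper
below is a hypothetical stand-in that forces the rounding with a
volatile store; it is not glibc's math_narrow_eval.

```c
#include <float.h>
#include <math.h>

/* Hypothetical stand-in for the narrowing step: storing through a
   volatile float forces the value to float range and precision.  */
static float
narrow_to_float (float x)
{
  volatile float v = x;
  return v;
}

/* Multiply and narrow, so an overflow of the float result is visible
   to the caller as an infinity.  */
static float
scaled (float x, float factor)
{
  return narrow_to_float (x * factor);
}
```

Once the overflow is visible in the float result, the code that sets
errno to ERANGE can detect it.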

Checked on i686-linux-gnu.

math: Use polydd_cosh instead of polydd on cosh

This matches the original CORE-MATH code and is why the function
exists.

Checked on x86_64-linux-gnu, i686-linux-gnu, aarch64-linux-gnu,
and arm-linux-gnueabihf.

localedata: Add disclaimer to files contributed with assignment

Add the FSF's disclaimer to bi_VU, C, gbm_IN, hif_FJ, sah_RU,
sm_WS, and to_TO which were created under copyright assignment
(not DCO).

This change ensures that all 352 localedata files have either
the FSF disclaimer or the related DCO text we are using
e.g. ab_GE.

Link: https://inbox.sourceware.org/libc-alpha/80426eb7-70cd-4178-8fda-51d590aa38d4@redhat.com/
Link: https://inbox.sourceware.org/libc-alpha/20130220215701.B263F2C0A7@topped-with-meat.com/
Link: https://inbox.sourceware.org/libc-alpha/87pmtq54hs.fsf@oldenburg.str.redhat.com/
Reviewed-by: Collin Funk <collin.funk1@gmail.com>

advisories: Update GLIBC-SA-2026-0005 and GLIBC-SA-2026-0006.

Update advisories with Fix-Commit information for 2.43.9000 and 2.44.

Update NEWS with advisory entries.

resolv: Check hostname for validity (CVE-2026-4438)

The processed hostname in getanswer_ptr must be checked so that invalid
characters, including shell metacharacters, are rejected. Failing to
check the returned hostname for validity is a security issue.
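
To illustrate the kind of check involved, here is a deliberately strict
allow-list sketch. The accepted character set (ASCII letters, digits,
'-' and '.') is an assumption made for this example and is not the
exact rule glibc applies; the function name is hypothetical.

```c
#include <stdbool.h>

/* Return true if NAME is non-empty and consists only of ASCII letters,
   digits, '-' and '.'.  Anything else, including shell metacharacters,
   is rejected.  */
static bool
hostname_chars_ok (const char *name)
{
  if (*name == '\0')
    return false;
  for (const char *p = name; *p != '\0'; p++)
    {
      char c = *p;
      bool ok = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                || (c >= '0' && c <= '9') || c == '-' || c == '.';
      if (!ok)
        return false;
    }
  return true;
}
```

An allow-list is used here because it fails closed: characters nobody
thought about are rejected by default.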

A regression test is added for invalid metacharacters and other cases
of invalid or valid characters.

No regressions on x86_64-linux-gnu.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

Use #!/usr/bin/python3 in remaining Python scripts

Some distributions ban the /usr/bin/python path in their build
systems due to the ambiguity of whether it refers to Python 2 or
Python 3. Python 2 has been out of support for many years, and
glibc has required Python 3 at build time for a while. So it seems
safe to switch the remaining scripts over to /usr/bin/python3.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

LoongArch: Add new files for LA32 in sysdeps/unix/sysv/linux/loongarch/ilp32

Add implies, abilist, c++-types and syscall files.

LoongArch: Add support for LA32 in sysdeps/unix/sysv/linux/loongarch

LoongArch: Add new file for LA32 in sysdeps/loongarch/ilp32

LoongArch: Add support for LA32 in sysdeps/loongarch/fpu

Move the loongarch64 implementation to sysdeps/loongarch/lp64/fpu.

LoongArch: Add support for LA32 in sysdeps/loongarch

LoongArch: fix missing trap for enabled exceptions on narrowing operation

The libc_feupdateenv_test macro is supposed to trap when the trap for a
previously held exception is enabled. But
libc_feupdateenv_test_loongarch wasn't doing it properly: the comment
claims "setting of the cause bits" would cause "the hardware to generate
the exception" but that's simply not true for the LoongArch movgr2fcsr
instruction.

To fix the issue, we need to call __feraiseexcept in case a held exception
is enabled to trap.

Reviewed-by: caiyinyu <caiyinyu@loongson.cn>
Signed-off-by: Xi Ruoyao <xry111@xry111.site>

nptl: Fix nptl/tst-cancel31 fail sometimes

tst-cancel31 sometimes fails on la32 qemu-system with a single-core
system.

If the test and an infinite loop run on the same x86_64 core,
the test also fails sometimes:
taskset -c 0 make test t=nptl/tst-cancel31
taskset -c 0 ./a.out (a.out is an infinite loop)

After the writeopener thread opens the file, execution may switch to
the main thread, which then finds redundant files.

Call pthread_cancel and pthread_join on the writeopener thread
before support_descriptors_check.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

resolv: Count records correctly (CVE-2026-4437)

The answer section boundary was previously ignored, and the code in
getanswer_ptr would iterate past the last resource record, but not
beyond the end of the returned data.  This could lead to subsequent data
being interpreted as answer records, thus violating the DNS
specification.  Such resource records could be maliciously crafted and
hidden from other tooling, but processed by the glibc stub resolver and
acted upon by the application.  While we trust the data returned by the
configured recursive resolvers, we should not trust its format and
should validate it as required.  It is a security issue to incorrectly
process the DNS protocol.
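
The boundary in question comes from the ANCOUNT field of the DNS
header. A minimal sketch, with hypothetical function names, of reading
that count and capping the number of records processed:

```c
#include <stddef.h>

/* Read ANCOUNT from the fixed 12-byte DNS header (bytes 6 and 7, in
   network byte order).  Returns -1 if the message is too short.  */
static int
dns_answer_count (const unsigned char *msg, size_t len)
{
  if (len < 12)
    return -1;
  return (msg[6] << 8) | msg[7];
}

/* Process at most ANCOUNT records, even if more parsable data follows
   in the message.  */
static int
records_to_process (int ancount, int parsable_records)
{
  return parsable_records < ancount ? parsable_records : ancount;
}
```

Data after the last answer record belongs to the authority or
additional sections and must not be treated as an answer.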

A regression test is added for response section crossing.

No regressions on x86_64-linux-gnu.

Reviewed-by: Collin Funk <collin.funk1@gmail.com>

Add advisory text for CVE-2026-4438

Explain the security issue and set the context for the vulnerability to
help downstreams get a better understanding of the issue.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

Add advisory text for CVE-2026-4437

Explain the security issue and set the context for the vulnerability to
help downstreams get a better understanding of the issue.

Reviewed-by: Collin Funk <collin.funk1@gmail.com>

Use binutils 2.46, MPC 1.4.0 in build-many-glibcs.py

Note that MPC 1.4.0 has moved from .tar.gz to .tar.xz distribution.

Tested with build-many-glibcs.py (host-libraries, compilers and glibcs
builds).

LoongArch: feclearexcept: skip clearing CAUSE

The comment explaining the reason to clear CAUSE does not make any
sense: it says the next "CTC" instruction would raise the FP exception
of which both the CAUSE and ENABLE bits are set, but LoongArch does not
have the CTC instruction. LoongArch has the movgr2fcsr instruction but
movgr2fcsr never raises any FP exception, different from the MIPS CTC
instruction.

So we don't really need to care about CAUSE at all.

Signed-off-by: Xi Ruoyao <xry111@xry111.site>

riscv: Resolve calls to memcpy using memcpy-generic in early startup

This patch from Adhemerval sets up the ifunc redirections so that we
resolve memcpy to memcpy_generic in early startup. This avoids infinite
recursion for memcpy calls before the loader is fully initialized.

Tested-by: Jeff Law <jeffrey.law@oss.qualcomm.com>

riscv: Treat clang separately in RVV compiler checks

Detect clang explicitly and apply compiler-specific version checks for
RVV support.

Signed-off-by: Zihong Yao <zihong.plct@isrc.iscas.ac.cn>
Reviewed-by: Peter Bergner <bergner@tenstorrent.com>

math: Fix spurious overflow and missing errno for lgammaf

It syncs with CORE-MATH 9a75500ba1831 and 20d51f2ee.

Checked on aarch64-linux-gnu.

misc: Fix a few typos in comments

math: Sync lgammaf with CORE-MATH

It removes some unnecessary corner-case checks and uses a slightly
different algorithm for the hard-case database binary search.

Checked on aarch64-linux-gnu, arm-linux-gnueabihf,
powerpc64le-linux-gnu, i686-linux-gnu, and x86_64-linux-gnu.

math: Sync tgammaf with CORE-MATH

It adds a minor optimization on fast path.

Checked on aarch64-linux-gnu, arm-linux-gnueabihf,
powerpc64le-linux-gnu, i686-linux-gnu, and x86_64-linux-gnu.

Makefile: add allow-list for failures

Enable adding known failures to allowed-failures.txt and ignore failures
in case they are in the list. In case the allowed-failures.txt does not
exist, all failures lead to a failed status as before.

When the file is present, failures of listed tests are ignored and reported
on stdout. If tests not in the allowed list fail, summarize-tests exits with
status 1 and reports the failing tests.

The expected format of allowed-failures.txt file is:
<test_name> # <comment>

Reviewed-by: Florian Weimer <fweimer@redhat.com>

string: Add fallback implementation for ctz/clz

The libgcc implementations of __builtin_clzl/__builtin_ctzl may require
access to additional data that is not marked as hidden, which could
introduce additional GOT indirection and necessitate RELATIVE relocs.
And the RELATIVE reloc is an issue if the code is used during static-pie
startup before self-relocation (for instance, during an assert).

For this case, the ABI can add a string-bitops.h header that defines
HAVE_BITOPTS_WORKING to 0. A configure check for this issue is tricky
because it requires linking against the standard libraries, which
create many RELATIVE relocations and complicate filtering those that
might be created by the builtins.
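
A fallback of this kind can be written without any data tables or
builtins. The sketch below is only an illustration with hypothetical
names, not the implementation added by this patch. Like the builtins,
both functions require a nonzero argument.

```c
#include <stdint.h>

/* Count trailing zero bits of X.  X must be nonzero.  */
static int
ctz64_fallback (uint64_t x)
{
  int n = 0;
  while ((x & 1) == 0)
    {
      x >>= 1;
      n++;
    }
  return n;
}

/* Count leading zero bits of X.  X must be nonzero.  */
static int
clz64_fallback (uint64_t x)
{
  int n = 0;
  while ((x & (UINT64_C (1) << 63)) == 0)
    {
      x <<= 1;
      n++;
    }
  return n;
}
```

Whether a compiler turns such loops back into a builtin call depends on
the compiler and its flags, so a real fallback needs its generated code
checked.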

The fallback is disabled by default, so no target is affected.

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>

AArch64: Remove prefer_sve_ifuncs

Remove the prefer_sve_ifuncs CPU feature since it was intended for older
kernels. Current distros all use modern Linux kernels with improved support
for SVE save/restore, making this check redundant.

Reviewed-by: Yury Khrustalev <yury.khrustalev@arm.com>

This reverts commit 6e8f32d39a57aa1f31bf15375810aab79a0f5f4b.

First off, apologies for my misunderstanding on how madvise(MADV_HUGEPAGE)
works. I had the misconception that doing madvise(p, 1, MADV_HUGEPAGE) will set
VM_HUGEPAGE on the entire VMA - it does not, it will align the size to
PAGE_SIZE (4k) and then *split* the VMA. Only the first page-length of the
virtual space will be VM_HUGEPAGE'd, the rest of it will stay the same.

The above is the semantics for all madvise() calls - which makes sense from a
UABI perspective. madvise() should do the proposed thing to only the length
(page-aligned) which it was asked to do, doing any more than that is not
something the user is expecting.
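
In other words, the caller has to pass the whole range it wants
advised. A minimal sketch on Linux, with a hypothetical function name
(this is not the malloc code):

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Map LEN bytes of anonymous memory and advise the whole range.
   Returns 1 if the kernel accepted the advice, 0 if it refused it
   (for example when THP is not available), and -1 if mmap failed.  */
static int
advise_whole_mapping (size_t len)
{
  void *p = mmap (NULL, len, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED)
    return -1;
  /* Pass the full length: the advice applies only to the page-aligned
     range given, not to the whole VMA that contains P.  */
  int ok = madvise (p, len, MADV_HUGEPAGE) == 0;
  munmap (p, len);
  return ok;
}
```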

Commit 6e8f32d39a57 tries to optimize around the madvise() call by determining
whether the VMA got madvise'd before. This will work for most cases except
the following: if check_may_shrink_heap() is true, shrink_heap() re-maps the
shrunk portion, giving us a new VMA altogether. That VMA won't have the
VM_HUGEPAGE flag.

Reverting this commit, we will again mark the new VMA with VM_HUGEPAGE, and
the kernel will merge the two into a single VMA marked with VM_HUGEPAGE.

This may be the only case where we lose VM_HUGEPAGE, and we could micro-optimize
by extending the current if-condition with !check_may_shrink_heap. But let us
not do this - this is very difficult to reason about, and I am soon going
to propose mmap(MAP_HUGEPAGE) in Linux to do away with all these workarounds.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

elf: factor out ld.conf parsing

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Yury Khrustalev <yury.khrustalev@arm.com>

x86: Fix tanh ifunc selection

The inclusion of the generic tanh implementation without undefining
libm_alias_double (to provide the __tanh_sse2 implementation) makes
the exported tanh symbol point to the SSE2 variant.

Reviewed-by: DJ Delorie <dj@redhat.com>

x86_64: Add cosh with FMA

The cosh shows an improvement of about 35% when building for
x86_64-v3.

Reviewed-by: DJ Delorie <dj@redhat.com>

math: Consolidated common definition/data for cosh/sinh/tanh

Common data definitions are moved to e_coshsinh_data, cosh only
data is moved to e_cosh_data, sinh to e_sinh_data, and tanh to
e_tanh_data.

Reviewed-by: DJ Delorie <dj@redhat.com>

math: Use tanh from CORE-MATH

The current implementation precision shows the following accuracy, on
three ranges ([-DBL_MAX,-10], [-10,10], [10,DBL_MAX]) with 10e9 uniform
randomly generated numbers for each range (first column is the
accuracy in ULP, with '0' being correctly rounded, second is the
number of samples with the corresponding precision):

* Range [-DBL_MAX, -10]
* FE_TONEAREST
     0:      10000000000 100.00%
* FE_UPWARD
     0:      10000000000 100.00%
* FE_DOWNWARD
     0:      10000000000 100.00%
* FE_TOWARDZERO
     0:      10000000000 100.00%

* Range [-10, 10]
* FE_TONEAREST
     0:       4059325526  94.51%
     1:        231023238   5.38%
     2:          4618531   0.11%
* FE_UPWARD
     0:       2106654900  49.05%
     1:       2145413180  49.95%
     2:         40847554   0.95%
     3:          2051661   0.05%
* FE_DOWNWARD
     0:       2106618401  49.05%
     1:       2145409958  49.95%
     2:         40880992   0.95%
     3:          2057944   0.05%
* FE_TOWARDZERO
     0:       4061659952  94.57%
     1:        221006985   5.15%
     2:         12285512   0.29%
     3:            14846   0.00%

* Range [10, DBL_MAX]
* FE_TONEAREST
     0:      10000000000 100.00%
* FE_UPWARD
     0:      10000000000 100.00%
* FE_DOWNWARD
     0:      10000000000 100.00%
* FE_TOWARDZERO
     0:      10000000000 100.00%

The CORE-MATH implementation is correctly rounded for any rounding mode.
The code was adapted to glibc style and to use the definition of
math_config.h (to handle errno, overflow, and underflow).

Performance-wise, it shows:

latency                      master        patched        improvement
x86_64                     109.7420       184.5950            -68.21%
x86_64v2                   109.1230       187.1890            -71.54%
x86_64v3                    99.4471        49.1104             50.62%
aarch64                     43.0474        32.2933             24.98%
armhf-vpfv4                 41.0954        35.8473             12.77%
powerpc64le                 27.3282        22.7134             16.89%

reciprocal-throughput        master        patched        improvement
x86_64                      42.5562       158.1820           -271.70%
x86_64v2                    42.5734       159.2560           -274.07%
x86_64v3                    35.9899        24.2877             32.52%
aarch64                     24.7660        22.8466              7.75%
armhf-vpfv4                 27.0251        25.8150              4.48%
powerpc64le                 11.7350        11.2504              4.13%

* x86_64:        gcc version 15.2.1 20260112, Ryzen 9 5900X, --disable-multi-arch
* aarch64:       gcc version 15.2.1 20251105, Neoverse-N1
* armv7a-vpfv4:  gcc version 15.2.1 20251105, Neoverse-N1
* powerpc64le:   gcc version 15.2.1 20260128, POWER10

Checked on x86_64-linux-gnu, aarch64-linux-gnu, and
powerpc64le-linux-gnu.

Reviewed-by: DJ Delorie <dj@redhat.com>

math: Remove the SVID error handling from sinh

It improves throughput by 8 to 18% and latency by 1 to 10%,
depending on the ABI.

Reviewed-by: DJ Delorie <dj@redhat.com>

math: Use sinh from CORE-MATH

The current implementation precision shows the following accuracy, on
three ranges ([-DBL_MAX,-10], [-10,10], [10,DBL_MAX]) with 10e9 uniform
randomly generated numbers for each range (first column is the
accuracy in ULP, with '0' being correctly rounded, second is the
number of samples with the corresponding precision):

* Range [-DBL_MAX, -10]
* FE_TONEAREST
     0:      10000000000 100.00%
* FE_UPWARD
     0:      10000000000 100.00%
* FE_DOWNWARD
     0:      10000000000 100.00%
* FE_TOWARDZERO
     0:      10000000000 100.00%

* Range [-10, 10]
* FE_TONEAREST
     0:       3169388892  73.79%
     1:       1125270674  26.20%
     2:           307729   0.01%
* FE_UPWARD
     0:       1450068660  33.76%
     1:       2146926394  49.99%
     2:        697404986  16.24%
     3:           567255   0.01%
* FE_DOWNWARD
     0:       1449727976  33.75%
     1:       2146957381  49.99%
     2:        697719649  16.25%
     3:           562289   0.01%
* FE_TOWARDZERO
     0:       2519351889  58.66%
     1:       1773434502  41.29%
     2:          2180904   0.05%

* Range [10, DBL_MAX]
* FE_TONEAREST
     0:      10000000000 100.00%
* FE_UPWARD
     0:      10000000000 100.00%
* FE_DOWNWARD
     0:      10000000000 100.00%
* FE_TOWARDZERO
     0:      10000000000 100.00%

The CORE-MATH implementation is correctly rounded for any rounding mode.
The code was adapted to glibc style and to use the definition of
math_config.h (to handle errno, overflow, and underflow).

Performance-wise, it shows:

latency                      master        patched        improvement
x86_64                     101.0710       129.4710            -28.10%
x86_64v2                   101.1810       127.6370            -26.15%
x86_64v3                    96.0685        48.5911             49.42%
aarch64                     41.4229        22.3971             45.93%
armhf-vpfv4                 42.8620        25.6011             40.27%
powerpc64le                 29.2630        13.1450             55.08%

reciprocal-throughput        master        patched        improvement
x86_64                      42.6895       105.7150           -147.64%
x86_64v2                    42.7255       104.7480           -145.17%
x86_64v3                    39.6949        25.9087             34.73%
aarch64                     26.0104        19.2236             26.09%
armhf-vpfv4                 29.4362        23.6350             19.71%
powerpc64le                 12.9170        8.34582             35.39%

* x86_64:        gcc version 15.2.1 20260112, Ryzen 9 5900X, --disable-multi-arch
* aarch64:       gcc version 15.2.1 20251105, Neoverse-N1
* armv7a-vpfv4:  gcc version 15.2.1 20251105, Neoverse-N1
* powerpc64le:   gcc version 15.2.1 20260128, POWER10

Checked on x86_64-linux-gnu, aarch64-linux-gnu, and
powerpc64le-linux-gnu.

Reviewed-by: DJ Delorie <dj@redhat.com>

math: Remove the SVID error handling from cosh

It improves throughput by 3.5% to 9%.

Reviewed-by: DJ Delorie <dj@redhat.com>

math: Use cosh from CORE-MATH

The current implementation precision shows the following accuracy, on
three ranges ([-DBL_MAX,-10], [-10,10], [10,DBL_MAX]) with 10e9 uniform
randomly generated numbers for each range (first column is the
accuracy in ULP, with '0' being correctly rounded, second is the
number of samples with the corresponding precision):

* Range [-DBL_MAX, -10]
* FE_TONEAREST
     0:      10000000000 100.00%
* FE_UPWARD
     0:      10000000000 100.00%
* FE_DOWNWARD
     0:      10000000000 100.00%
* FE_TOWARDZERO
     0:      10000000000 100.00%

* Range [-10, 10]
* FE_TONEAREST
     0:       3291614060  76.64%
     1:       1003353235  23.36%
* FE_UPWARD
     0:       2295272497  53.44%
     1:       1999675198  46.56%
     2:            19600   0.00%
* FE_DOWNWARD
     0:       2294966533  53.43%
     1:       1999981461  46.57%
     2:            19301   0.00%
* FE_TOWARDZERO
     0:       2306015780  53.69%
     1:       1988942093  46.31%
     2:             9422   0.00%

* Range [10, DBL_MAX]
* FE_TONEAREST
     0:      10000000000 100.00%
* FE_UPWARD
     0:      10000000000 100.00%
* FE_DOWNWARD
     0:      10000000000 100.00%
* FE_TOWARDZERO
     0:      10000000000 100.00%

The CORE-MATH implementation is correctly rounded for any rounding mode.
The code was adapted to glibc style and to use the definition of
math_config.h (to handle errno, overflow, and underflow).

Performance-wise, it shows:

latency                      master        patched     improvement
x86_64                      52.1066       126.4120        -142.60%
x86_64v2                    49.5781       119.8520        -141.74%
x86_64v3                    45.0811        50.5758         -12.19%
aarch64                     19.9977        21.7814          -8.92%
armhf-vpfv4                 20.5969        27.0479         -31.32%
powerpc64le                 12.6405        13.6768          -8.20%

reciprocal-throughput        master        patched     improvement
x86_64                      18.4833        102.9120       -456.78%
x86_64v2                    17.5409        99.5179        -467.35%
x86_64v3                    18.9187        25.3662         -34.08%
aarch64                     10.9045        18.8217         -72.60%
armhf-vpfv4                 15.7430        24.0822         -52.97%
powerpc64le                  5.4275         8.1269         -49.73%

* x86_64:        gcc version 15.2.1 20260112, Ryzen 9 5900X, --disable-multi-arch
* aarch64:       gcc version 15.2.1 20251105, Neoverse-N1
* armv7a-vpfv4:  gcc version 15.2.1 20251105, Neoverse-N1
* powerpc64le:   gcc version 15.2.1 20260128, POWER10

Checked on x86_64-linux-gnu, aarch64-linux-gnu, and
powerpc64le-linux-gnu.

Reviewed-by: DJ Delorie <dj@redhat.com>

nptl/htl: Add missing AC_PROVIDES

nptl/htl: Fix confusion over PTHREAD_IN_LIBC and __PTHREAD_NPTL/HTL

The last uses of PTHREAD_IN_LIBC are where it should have been
__PTHREAD_NPTL/HTL. The latter was not conveniently available everywhere.
Defining it from config.h makes things simpler.

nptl: Drop comment about PTHREAD_IN_LIBC

nptl is now always in libc.

elf: directly call dl_init_static_tls

htl can now have it directly in ld.so

resolv: Move libanl symbols to libc on hurd too

elf: Drop librt.so from localplt-built-dso

It's always empty now.

rt: Move librt symbols to libc on hurd too

htl: Use pthread_rwlock for libc_rwlock

Like nptl does, so we really get rwlock behavior.

mach: Add __mach_rwlock_*

We cannot use pthread_rwlock for these until we have reimplemented
pthread_rwlock with gsync, so fork __libc_rwlock off for now.

configure: Remove extra ')' from b4c110022c

configure: Fix bootstrap build after 570c46d36b (BZ 33985)

Commit 570c46d36b makes libgcc_s be defined for have-cc-with-libunwind=no
(the default for gcc builds) without taking into consideration whether the
compiler can link against -lgcc_s (defined by have-libgcc_s).

Checked with build-many-glibcs.py for x86_64-linux-gnu.

linux: Fix aliasing violations and assert address in __check_pf (bug #33927)

The Linux implementation of __check_pf retrieves interface data via
make_request, which queries the kernel via netlink. The IFA_ADDRESS
received from the kernel's RTM_NEWADDR netlink message is (a)
type-punned via pointer-casting leading to strict aliasing violations,
and (b) dereferenced assuming that it is non-NULL.

This commit removes the strict-aliasing violations using memcpy, and
adds an assert that the address is indeed non-NULL before dereferencing
it.
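
As an illustration of the memcpy approach (not the actual __check_pf
code), a minimal sketch that reads a 32-bit value out of an untyped
payload without a pointer cast. The helper name load_u32 is hypothetical:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a uint32_t from an arbitrary (possibly
   unaligned) byte payload.  Casting PAYLOAD to 'const uint32_t *' and
   dereferencing it would violate strict aliasing; memcpy does not.  */
static uint32_t
load_u32 (const void *payload)
{
  uint32_t value;

  /* Make the non-NULL assumption explicit before touching the data.  */
  assert (payload != NULL);
  memcpy (&value, payload, sizeof value);
  return value;
}
```

Compilers typically turn the fixed-size memcpy into a plain load, so the
safe form should not cost anything at run time.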

Reported-by: Siteshwar Vashisht <svashisht@redhat.com>
Reviewed-by: Sam James <sam@gentoo.org>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

x86: Don't left shift negative values

GCC warns about this with -Wshift-negative-value:

    In file included from ../sysdeps/x86/cpu-features.c:24:
    ../sysdeps/x86/dl-cacheinfo.h: In function ‘get_common_cache_info’:
    ../sysdeps/x86/dl-cacheinfo.h:913:45: warning: left shift of negative value [-Wshift-negative-value]
      913 |                           count_mask = ~(-1 << (count_mask + 1));
          |                                             ^~
    ../sysdeps/x86/dl-cacheinfo.h:930:45: warning: left shift of negative value [-Wshift-negative-value]
      930 |                           count_mask = ~(-1 << (count_mask + 1));
          |                                             ^~

This is because C23 § 6.5.8 specifies that this is undefined behavior.
We can cast it to unsigned, which would be equivalent to UINT_MAX.
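
A minimal sketch of the same mask computation done on an unsigned
operand. The helper name low_bits_mask is hypothetical and is not the
code in dl-cacheinfo.h:

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: return a mask with the low NBITS bits set.
   '~(-1 << nbits)' left-shifts a negative int, which is undefined;
   shifting UINT_MAX is well defined as long as NBITS is smaller than
   the width of unsigned int.  */
static unsigned int
low_bits_mask (unsigned int nbits)
{
  assert (nbits < sizeof (unsigned int) * CHAR_BIT);
  return ~(UINT_MAX << nbits);
}
```

For example, low_bits_mask (3) yields 7.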

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

Support loading libunwind instead of libgcc_s

The 'unwind-link' facility allows glibc to support thread cancellation
and exit (pthread_cancel, pthread_exit, backtrace) by dynamically
loading the unwind library at runtime, preventing a hard dependency on
libgcc_s within libc.so.

When building with libunwind (for clang/LLVM toolchains [1]), two
assumptions in the existing code break:

  1. The runtime library is libunwind.so instead of libgcc_s.so.

  2. libgcc relies on __gcc_personality_v0 to handle unwinding mechanics.
     libunwind exposes the standard '_Unwind_*' accessors directly.

This patch adapts `unwind-link` to handle both environments based on
the HAVE_CC_WITH_LIBUNWIND configuration:

  * The UNWIND_SONAME macro now selects between LIBGCC_S_SO and
    LIBUNWIND_SO.

  * For libgcc, it continues to resolve `__gcc_personality_v0`.

  * For libunwind, it instead resolves the standard
    _Unwind_GetLanguageSpecificData, _Unwind_SetGR, _Unwind_SetIP,
    and _Unwind_GetRegionStart helpers.

  * unwind-resume.c is updated to implement wrappers for these
    accessors that forward calls to the dynamically loaded function
    pointers, effectively shimming the unwinder.

Tests and Makefiles are updated to link against `$(libunwind)` where
appropriate.

Reviewed-by: Sam James <sam@gentoo.org>
[1] https://github.com/libunwind/libunwind

configure: Repurpose have-cc-with-libunwind for clang support

The `have-cc-with-libunwind` check (and its corresponding macro
HAVE_CC_WITH_LIBUNWIND) was historically specific to IA64, intended
to supplement libgcc with libunwind.  Since this logic is unused in
current GCC configurations, this patch repurposes it to support
clang-based toolchains that utilize LLVM's libunwind instead of
libgcc_s.

The configure script now detects if the compiler natively supports
unwinding via `-lunwind`.

Additionally, when this mode is enabled, `-lclang_rt.builtins` is
explicitly added to the `libgcc_eh` definition.  This is necessary
because `links-dso-program` otherwise fails to link due to a missing
`__gcc_personality_v0` symbol.  It appears that clang does not
automatically link the builtins providing this personality routine
when `rlink-path` is actively used during the build.

Reviewed-by: Sam James <sam@gentoo.org>

configure: Parametrize runtime libraries to support compiler-rt

Historically, the build system has hardcoded references to `-lgcc` and
`-lgcc_eh`, explicitly assuming the use of the GCC runtime.  This
prevents building glibc with alternative toolchains, specifically clang
configured with `--rtlib=compiler-rt`, where these libraries are
replaced by `libclang_rt.builtins`.

This patch introduces a mechanism to dynamically detect the compiler's
underlying runtime library.

The logic works as follows:

1. It queries the compiler using `-print-libgcc-file-name`.
2. It parses the output path to determine if `libgcc` or `compiler-rt`
   is in use.
3. Based on this detection, it parametrizes the build variables for
   the static runtime and exception handling libraries (replacing
   hardcoded `-lgcc` and `-lgcc_eh`).

This ensures that the build system correctly links against the active
compiler runtime—whether it is the traditional libgcc or LLVM's
compiler-rt—without requiring manual overrides.

Reviewed-by: Sam James <sam@gentoo.org>

malloc: Remove lingering DIAG_POP_NEEDS_COMMENT

From 0ea9ebe48ad624919d579dbe651293975fb6a699.

malloc: Cleanup warnings

Clean up warnings: malloc builds with -Os and -Og without needing any
complex warning avoidance defines.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

Document CVE-2026-3904

All branches already have a fix, so this is mainly for distributions
that may have cherry-picked the SSE2 memcmp implementation.

Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>

LoongArch: Optimize float environment functions

In LoongArch, fcsr1 is an alias of the Enables field in fcsr0, and fcsr3
is an alias of the RM field in fcsr0. This patch uses the fcsr1 and fcsr3
registers to optimize the fedisableexcept, feenableexcept, fegetexcept,
fegetround, fesetround, and get_rounding_mode functions, which can save
the additional andi instruction.

nptl: Only issue __libc_unwind_link_get for SHARED

The compiler already optimizes it away for static builds.

Reviewed-by: Collin Funk <collin.funk1@gmail.com>

x86_64: Conditionally define __sfp_handle_exceptions for compiler-rt

The LLVM compiler-rt builtins library does not currently provide an
implementation for __sfp_handle_exceptions. On x86_64, this causes
unresolved symbol errors when building glibc in environments that
exclude libgcc.

This patch implements __sfp_handle_exceptions specifically for x86_64,
bridging the gap for non-GNU compiler runtimes.

The implementation is used conditionally, only if the compiler does
not already provide the symbol.

NB: the implementation is based on libgcc and raises both SSE and i387
exceptions (unlike the one from 460ee50de054396cc9791ff4).

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

test-assert-c++-variadic.cc: Disable assert_works for GCC 14.2 and 14.1

PR118629 [1] resolved the issue with using __PRETTY_FUNCTION__
(to which assert expands) inside an unevaluated context in GCC 14.3.
This affects only versions 14.1 and 14.2, as the -std=c++26 option has
been supported since 14.1.

clang supports the above snippet for all versions that support the
--std=c++26 flag (since 17.0.1).

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118629

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

libio: Properly link in function _IO_wfile_doallocate in static binaries

This patch addresses Bug 33935 - _IO_wfile_doallocate not linked correctly
when linking glibc statically.
https://sourceware.org/bugzilla/show_bug.cgi?id=33935

The function _IO_wfile_doallocate is declared with pragma weak in vtable.c,
while it is the only symbol defined in wfiledoalloc.c and is never
called directly within libio.

In static binaries, the real _IO_wfile_doallocate definition may therefore
not be linked in when a program that uses wchar functions is linked against
glibc. The weak symbol in the vtable is used instead, which causes a
segmentation fault at run time.

This patch fixes it the same way as for _IO_file_doallocate: it adds
libio_static_fn_required(_IO_wfile_doallocate) in wgenops.c so that
_IO_wfile_doallocate is always linked into static binaries.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

malloc: Improve memalign alignment

Use generic stdc_bit_width to safely adapt to input types. Move rounding up of
alignments that are not powers of 2 to __libc_memalign. Simplify alignment
handling of aligned_alloc and __posix_memalign. Add a testcase for non-power
of 2 memalign and fix malloc-debug.
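
To illustrate what rounding a non-power-of-2 alignment up means, here is
a portable sketch. It uses a plain loop as a stand-in for
stdc_bit_width, and the helper name round_up_pow2 is hypothetical; it is
not the code in __libc_memalign:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: return the smallest power of two that is
   greater than or equal to N, or 0 if that value does not fit in
   size_t.  An input of 0 is treated like 1.  */
static size_t
round_up_pow2 (size_t n)
{
  size_t p = 1;

  while (p < n)
    {
      /* Doubling P again would overflow size_t.  */
      if (p > SIZE_MAX / 2)
        return 0;
      p <<= 1;
    }
  return p;
}
```

With this helper a requested alignment of 24 becomes 32, while 16 stays
16.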

Reviewed-by: DJ Delorie <dj@redhat.com>

feat(rtld): Allow LD_DEBUG category exclusion

Adds support for excluding specific categories from `LD_DEBUG` output.

The `LD_DEBUG` environment variable now accepts category names prefixed
with a dash (`-`) to disable their debugging output. This allows users
to enable broad categories (e.g., `all`) while suppressing verbose or
irrelevant information from specific sub-categories (e.g., `-tls`).

The `process_dl_debug` function in `rtld.c` has been updated to parse
these exclusion options and unset the corresponding bits in
`GLRO(dl_debug_mask)`. The `LD_DEBUG=help` output has also been updated
to document this new functionality. A new test `tst-dl-debug-exclude.sh`
is added to verify the correct behavior of category exclusion.
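
The exclusion logic can be sketched roughly as follows. This is a
simplified illustration, not the code in `rtld.c`: the function name,
the category table, and the bit values are all hypothetical, and
options are assumed to be applied left to right.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical category bits, for illustration only.  */
enum
{
  DBG_LIBS = 1 << 0,
  DBG_RELOC = 1 << 1,
  DBG_TLS = 1 << 2,
  DBG_ALL = DBG_LIBS | DBG_RELOC | DBG_TLS
};

static const struct
{
  const char *name;
  unsigned int mask;
} categories[] =
{
  { "libs", DBG_LIBS },
  { "reloc", DBG_RELOC },
  { "tls", DBG_TLS },
  { "all", DBG_ALL },
};

/* Parse a list such as "all,-tls".  A leading dash clears the bits of
   the named category; otherwise the bits are set.  Unknown names are
   ignored.  */
static unsigned int
parse_debug_mask (const char *spec)
{
  unsigned int mask = 0;

  while (*spec != '\0')
    {
      size_t len = strcspn (spec, " ,:");
      if (len > 0)
        {
          int exclude = spec[0] == '-';
          const char *name = spec + exclude;
          size_t namelen = len - exclude;

          for (size_t i = 0; i < sizeof categories / sizeof categories[0]; i++)
            if (strlen (categories[i].name) == namelen
                && memcmp (categories[i].name, name, namelen) == 0)
              {
                if (exclude)
                  mask &= ~categories[i].mask;
                else
                  mask |= categories[i].mask;
                break;
              }
        }
      spec += len;
      if (*spec != '\0')
        spec++;   /* Skip the separator.  */
    }
  return mask;
}
```

In this sketch `parse_debug_mask ("all,-tls")` leaves only the `libs`
and `reloc` bits set.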

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

elf(tls): Add debug logging for TLS operations

This commit introduces extensive debug logging for thread-local storage
(TLS) operations within the dynamic linker. When `LD_DEBUG=tls` is
enabled, messages are printed for:
- TLS module assignment and release.
- DTV (Dynamic Thread Vector) resizing events.
- TLS block allocations and deallocations.
- `__tls_get_addr` slow path events (DTV updates, lazy allocations, and
  static TLS usage).

The log format is standardized to use a "tls: " prefix and identifies
modules using the "modid %lu" convention. To aid in debugging
multithreaded applications, thread-specific logs include the Thread
Control Block (TCB) address to identify the context of the operation.

A new test module `tst-tls-debug-mod.c` and a corresponding shell script
`tst-tls-debug-recursive.sh` have been added. Additionally, the existing
`tst-dl-debug-tid` NPTL test has been updated to verify these TLS debug
messages in a multithreaded context.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

elf: should check result of openat with -1 not 1
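
For reference, openat reports failure by returning -1, so that is the
value to compare against. A minimal sketch (the helper name can_open is
hypothetical and unrelated to the patched code):

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: return 1 if PATH can be opened read-only
   relative to the current directory, 0 otherwise.  */
static int
can_open (const char *path)
{
  int fd = openat (AT_FDCWD, path, O_RDONLY);

  /* Failure is -1.  A result of 1 is a valid descriptor.  */
  if (fd == -1)
    return 0;
  close (fd);
  return 1;
}
```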

Signed-off-by: Weixie Cui <cuiweixie@gmail.com>
Reviewed-by: Yury Khrustalev <yury.khrustalev@arm.com>

htl: Fix pthread_once memory ordering

We need to tie the fast-path read with the store, to make sure that when
fast-reading 1, we see all the effects performed by the init routine.

(and we don't need a full barrier, only an acquire/release pair is
needed)
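
The pairing can be sketched with C11 atomics. This is an illustration
only, not the htl implementation: the type and function names are
hypothetical, and a spin lock stands in for whatever lock the real slow
path uses.

```c
#include <stdatomic.h>

struct once_sketch
{
  atomic_int done;    /* 0 = not run yet, 1 = init routine finished.  */
  atomic_flag lock;   /* Serializes the slow path.  */
};
#define ONCE_SKETCH_INIT { 0, ATOMIC_FLAG_INIT }

static void
once_sketch (struct once_sketch *once, void (*init_routine) (void))
{
  /* Fast path: this acquire load pairs with the release store below,
     so a thread that reads 1 also sees everything init_routine did.  */
  if (atomic_load_explicit (&once->done, memory_order_acquire) == 1)
    return;

  while (atomic_flag_test_and_set_explicit (&once->lock,
                                            memory_order_acquire))
    ;   /* Spin until the slow path is free.  */

  if (atomic_load_explicit (&once->done, memory_order_relaxed) == 0)
    {
      init_routine ();
      atomic_store_explicit (&once->done, 1, memory_order_release);
    }
  atomic_flag_clear_explicit (&once->lock, memory_order_release);
}

/* Example init routine that counts how often it runs.  */
static int init_calls;

static void
count_init (void)
{
  init_calls++;
}
```

Calling once_sketch twice on the same object with count_init leaves
init_calls at 1.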

Reported-by: Brent Baccala <cosine@freesoft.org> 's Claude assistant

htl: Make sure the exit path of last thread sees all thread cleanups

In case e.g. some atexit() handlers expect all threads to have finished
their side effects.

Reported-by: Brent Baccala <cosine@freesoft.org> 's Claude assistant

hurd: Check for _hurdsig_preempted_set with _hurd_siglock held

Without taking _hurd_siglock, we could be missing the addition of a global
preemptor.

Reported-by: Brent Baccala <cosine@freesoft.org> 's Claude assistant

htl: Call thread-specific destructors for last thread too

As required by POSIX.

Reported-by: Brent Baccala <cosine@freesoft.org> 's Claude assistant

htl: Fix checking for mutex not being recoverable

pthread_mutex_unlock sets __owner_id to NOTRECOVERABLE_ID

Reported-by: Brent Baccala <cosine@freesoft.org> 's Claude assistant

benchtests: Adapt tanh

Random values in the range of [-4,4].

benchtests: Adapt sinh

Random values in the range of [-10,10].

benchtests: Adapt cosh

Random values in the range of [-10,10].

Fix Makefile alphabetical ordering

hurd: Fix return value for sigwait

It is supposed to return an error code, not just -1.

Reported-by: Brent Baccala <cosine@freesoft.org> 's Claude assistant

hurd: Fix cleaning on sigtimedwait timing out

sigtimedwait also needs to clean up preemptors and the blocked mask before
returning EAGAIN.

Also add some sigtimedwait testing.
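
A minimal sketch of the timeout case as seen by a caller, assuming no
SIGUSR2 is pending (the helper name is hypothetical and this is not the
test added by the commit). Unlike sigwait, sigtimedwait returns -1 and
sets errno:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <time.h>

/* Hypothetical helper: poll for a pending SIGUSR2 without waiting.
   Returns the signal number, or the negated errno value on failure.  */
static int
poll_usr2 (void)
{
  sigset_t set;
  struct timespec zero = { 0, 0 };

  sigemptyset (&set);
  sigaddset (&set, SIGUSR2);
  if (sigprocmask (SIG_BLOCK, &set, NULL) != 0)
    return -errno;

  /* With a zero timeout and nothing pending this fails with EAGAIN.  */
  int sig = sigtimedwait (&set, NULL, &zero);
  return sig == -1 ? -errno : sig;
}
```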

Linux: Only define OPEN_TREE_* macros in <sys/mount.h> if undefined (bug 33921)

There is a conditional inclusion of <linux/mount.h> earlier in the file.
If that defines the macros, do not redefine them. This addresses build
problems as the token sequence used by the UAPI macro definitions
changes between Linux versions.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

malloc: Avoid accessing /sys/kernel/mm files

On AArch64 malloc always checks /sys/kernel/mm/transparent_hugepage/enabled to
set the THP mode. However this check is quite expensive and the file may not
be accessible in containers. If DEFAULT_THP_PAGESIZE is non-zero, use
malloc_thp_mode_madvise so that we take advantage of THP in all cases. Since
madvise is a fast system call, it adds only a small overhead compared to the
cost of mmap and populating the pages.
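
The general pattern of mapping memory and then giving the kernel a THP
hint, without reading any /sys files, looks roughly like the sketch
below. It is a Linux-only illustration with a hypothetical helper name,
not the malloc code itself:

```c
#define _DEFAULT_SOURCE 1
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: map LEN bytes of anonymous memory and ask the
   kernel to back it with transparent huge pages where possible.
   Returns NULL if the mapping fails.  */
static void *
map_with_thp_hint (size_t len)
{
  void *p = mmap (NULL, len, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED)
    return NULL;

#ifdef MADV_HUGEPAGE
  /* Only a hint: if THP is unavailable the call fails and the mapping
     still works with regular pages, so the result is ignored.  */
  (void) madvise (p, len, MADV_HUGEPAGE);
#endif
  return p;
}
```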

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>