Pierre Blanchard [Wed, 15 Apr 2026 08:32:44 +0000 (08:32 +0000)]
AArch64: Implement AdvSIMD and SVE powr(f) routines
Vector variants of the new C23 powr routines.
These provide the same maximum error as pow by virtue of
relying on shared approximation techniques and sources.
Note: Benchmark inputs for powr(f) are identical to pow(f).
Performance gain over pow on V1 with GCC@15:
- SVE powr: 10-12% on subnormal x, 12-13% on x < 0.
- SVE powrf: 15% on all x < 0.
- AdvSIMD powr: for x < 0, 40% if x subnormal, 60% otherwise.
- AdvSIMD powrf: 4% on subnormal x or x < 0.
Pierre Blanchard [Wed, 15 Apr 2026 08:32:41 +0000 (08:32 +0000)]
AArch64: Improve AdvSIMD and SVE pow(f).
Optimize handling of subnormal x and/or negative x.
Some cleanup in attributes, macros and improving overall consistency.
Move core computation to header
Introduce config parameter to turn sign_bias on/off.
Performance improvement on V1 with GCC@15:
- AdvSIMD pow: 10-15% on subnormals.
- AdvSIMD powf: 30 to 70% on subnormals or x < 0, <=3% on x > 0.
- SVE pow: 10-15% on subnormals, <=3% otherwise.
- SVE powf: no significant variations in codegen/perf.
Remove wordsize-64 and arch-specific implementations, for ABIs when
off_t is the same as off64_t (__OFF_T_MATCHES_OFF64_T) the ftw64.c
will create the required aliases.
The ftw.c implementation is moved to ftw-common.c to simplify
the __OFF_T_MATCHES_OFF64_T usage.
Remove wordsize-64 and arch-specific implementations, for ABIs when
off_t is the same as off64_t (__OFF_T_MATCHES_OFF64_T) the fts64.c
will create the required aliases.
The fts.c implementation is moved to fts-common.c to simplify
the __OFF_T_MATCHES_OFF64_T usage.
Yao Zihong [Wed, 18 Feb 2026 21:12:09 +0000 (15:12 -0600)]
riscv: Add RVV strcat for both multiarch and non-multiarch builds
This patch adds an RVV-optimized implementation of strcat for RISC-V and
enables it for both multiarch (IFUNC) and non-multiarch builds.
The implementation integrates Hau Hsu's 2023 RVV work under a unified
ifunc-based framework. A vectorized version (__strcat_vector) is added
alongside the generic fallback (__strcat_generic). The runtime resolver
selects the RVV variant when RISCV_HWPROBE_KEY_IMA_EXT_0 reports vector
support (RVV).
Currently, the resolver still selects the RVV variant even when the RVV
extension is disabled via prctl(). As a consequence, any process that
has RVV disabled via prctl() will receive SIGILL when calling strcat().
Co-authored-by: Hau Hsu <hau.hsu@sifive.com>
Co-authored-by: Jerry Shih <jerry.shih@sifive.com>
Signed-off-by: Yao Zihong <zihong.plct@isrc.iscas.ac.cn>
Reviewed-by: Peter Bergner <bergner@tenstorrent.com>
After removing the files in s390-32 subfolder, we can also remove the
entries in CONTRIBUTED-BY file.
The entries for s390-64 files were adjusted to fit the new paths.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Stefan Liebler [Thu, 9 Apr 2026 08:47:09 +0000 (10:47 +0200)]
s390: Move files out of s390-64 folders
All the files in subfolders s390/s390-64 in sysdeps directory are moved
up to the s390/ ones. If necessary the files were merged with the existing
ones.
sysdeps/s390/preconfigure.ac was updated to reflect the removal of s390-64
subdirectory.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Stefan Liebler [Thu, 9 Apr 2026 08:47:08 +0000 (10:47 +0200)]
s390: Switch to common-code headers
The removal of s390-32 allows us to switch to common-code headers
instead of providing s390-64 specific headers:
from sysdeps/unix/sysv/linux/s390/bits/environments.h
to bits/environments.h
-> We now only have a 64bit environment.
from sysdeps/s390/s390-64/bits/wordsize.h
to sysdeps/wordsize-64/bits/wordsize.h
-> All macros are defined equal
from sysdeps/unix/sysv/linux/s390/bits/utmp.h
to bits/utmp.h
-> On s390-64, __WORDSIZE_TIME64_COMPAT32 is defined to 0, then the
64bit part of both headers is identical
from sysdeps/unix/sysv/linux/s390/bits/utmpx.h
to sysdeps/gnu/bits/utmpx.h
-> On s390-64, __WORDSIZE_TIME64_COMPAT32 is defined to 0, then the
64bit part of both headers is identical
from sysdeps/unix/sysv/linux/s390/bits/timesize.h
to bits/timesize.h
-> __TIMESIZE is defined to 64 in both cases
from sysdeps/unix/sysv/linux/s390/bits/procfs-id.h
to sysdeps/unix/sysv/linux/bits/procfs-id.h
-> The typedefs for __pr_uid_t and __pr_gid_t on s390-64 are equal
in both files. No need for an extra s390-specific header file anymore.
from sysdeps/unix/sysv/linux/s390/bits/procfs-extra.h
to sysdeps/unix/sysv/linux/bits/procfs-extra.h
-> Get rid of the "32-bit variants so that BFD can read 32-bit core files."
Furthermore it turned out that there is a hardcoded implementation
independent of procfs-extra.h in <binutils>/bfd/elf32-s390.c:
elf_s390_grok_prstatus(), elf_s390_grok_psinfo().
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Stefan Liebler [Thu, 9 Apr 2026 08:47:07 +0000 (10:47 +0200)]
s390: Remove s390-32 specific code in non s390-32 specific files
This patch removes s390-32 specific code in either common-code files
or shared files between s390-64 and s390-32.
Such code was guarded with preprocessor guards which check the size
of __WORDSIZE or __ELF_NATIVE_CLASS and of course the existence of
__s390x__ and __s390__ macros.
Note that if __s390x__ is defined then __s390__ is also defined.
This patch also adjusts guards for __s390__ only to __s390x__ to
make clear that those are still needed.
Furthermore the macro names for ifunc variants were adjusted from
XYZ_Z900_G5 to XYZ_Z900 as G5 is a pre 64bit machine.
On s390-32 we've used the special assembler directive to enable
zarch instructions:
.machinemode "zarch_nohighgprs"
As this is not needed on s390-64 anymore as zarch is enabled by default,
just drop those lines.
Furthermore we do not check for HWCAP_S390_ZARCH and HWCAP_S390_HIGH_GPRS
anymore. Just simplify those checks for e.g. stfle- or cuXY-instructions.
The 32/64 abi-variants and the corresponding abi-conditions are now also
removed from the s390 Makefiles and thus we now only generate a single
gnu/stubs.h and gnu/lib-names.h file instead of also having the different
ones for both abi-variants.
After removing process_elf32_file in s390 readelflib.c, ldconfig is only
recognizing 64bit ELF files for ld.so.cache.
Various comments mentioning s390 (meaning s390-32) were removed/adjusted.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Stefan Liebler [Thu, 9 Apr 2026 08:47:06 +0000 (10:47 +0200)]
s390: Remove support for s390-32.
The linux 6.19 release has removed support for compat syscalls on s390x.
Therefore s390-linux-gnu (31bit) configuration was deprecated with glibc 2.43:
commit 638d437dbf9c68e40986edaa9b0d1c2e72a1ae81
"Deprecate s390-linux-gnu (31bit)"
As part of the deprecation, s390 (31bit) was already removed from the
build-many-glibcs.py script.
Now explicitly exit with an error in sysdeps/s390/preconfigure
if somebody tries to build glibc for s390 (31bit).
Furthermore all s390-32 specific files are removed.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
WANG Rui [Mon, 20 Apr 2026 10:54:13 +0000 (10:54 +0000)]
elf: Add test for THP alignment of large load segments
Add a new test to verify that large executable PT_LOAD segments are
mapped at addresses aligned to the THP size when the glibc tunable
glibc.elf.thp=1 is enabled and the system is configured to use THP
in "always" mode.
The test loads a shared object with a sufficiently large executable
segment via dlopen and inspects /proc/self/maps to check that the
mapping address is aligned to the THP page size reported by the kernel.
The test is skipped if the THP size cannot be determined or if THP is
not enabled in "always" mode.
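The alignment check described above could be sketched roughly as follows. This is not the code of the new test: the helper names are hypothetical, and reading the THP size from the `hpage_pmd_size` sysfs file is an assumption about how the size is obtained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: read the PMD-level THP size in bytes from sysfs.
   Returns 0 when the size cannot be determined, in which case a test
   like the one described would be skipped.  */
static unsigned long
read_thp_size (void)
{
  FILE *f = fopen ("/sys/kernel/mm/transparent_hugepage/hpage_pmd_size", "r");
  if (f == NULL)
    return 0;
  unsigned long size = 0;
  if (fscanf (f, "%lu", &size) != 1)
    size = 0;
  fclose (f);
  return size;
}

/* Hypothetical helper: true if ADDR is a multiple of ALIGN, which must
   be a power of two.  */
static bool
is_thp_aligned (uintptr_t addr, unsigned long align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return (addr & (align - 1)) == 0;
}
```

A caller would parse the start address of the relevant mapping from /proc/self/maps and pass it to is_thp_aligned together with the value returned by read_thp_size.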
Signed-off-by: WANG Rui <wangrui@loongson.cn>
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Use pending character state in IBM1390, IBM1399 character sets (CVE-2026-4046)
Follow the example in iso-2022-jp-3.c and use the __count state
variable to store the pending character. This avoids restarting
the conversion if the output buffer ends between two 4-byte UCS-4
code points, so that the assert reported in the bug can no longer
happen.
Even though the fix is applied to ibm1364.c, the change is only
effective for the two HAS_COMBINED codecs for IBM1390, IBM1399.
The test case was mostly auto-generated using
claude-4.6-opus-high-thinking, and composer-2-fast shows up in the
log as well. During review, gpt-5.4-xhigh flagged that the original
version of the test case was not exercising the new character
flush logic.
This fixes bug 33980.
Assisted-by: LLM
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
WANG Rui [Tue, 14 Apr 2026 15:26:58 +0000 (15:26 +0000)]
loongarch: Enable THP-aligned load segments by default on 64-bit
On LoongArch64 Linux, aligning ELF load segments to Transparent Huge Page
(THP) boundaries provides consistent performance benefits for large
binaries by reducing TLB pressure and improving instruction fetch
efficiency.
Enable THP-based load segment alignment by default on LoongArch64 by
setting `glibc.elf.thp=1` during startup. Define the default THP
page size for load segment alignment on LoongArch64 as 32MB.
This allows the dynamic loader to apply THP-friendly alignment without
requiring the `glibc.elf.thp` tunable to be explicitly set.
Workload 1: building Cargo 1.93.0
Rustc: nightly-2026-02-26
                Without patch        With patch
instructions    3,690,358,948,176    3,690,301,774,568
cpu-cycles      4,233,025,766,760    4,035,866,635,741
itlb-misses     9,708,829,532        2,700,014,717
time elapsed    302.40 s             289.68 s
Instructions remain essentially unchanged. iTLB misses drop by about
72%, reducing CPU cycles by about 4.7% and wall time by about 4.2%.
Workload 2: building Linux kernel v7.0-rc1
LLVM: 21.1.8
                Without patch         With patch
instructions    14,163,739,876,387    14,169,418,598,675
cpu-cycles      19,231,890,317,741    16,851,494,928,181
itlb-misses     91,142,010,440        90,779,245
time elapsed    1022.09 s             893.22 s
Instructions remain roughly the same. iTLB misses drop from about 91B
to about 90M (roughly 99.9% reduction), reducing CPU cycles by about
12% and wall time by about 12.6%.
Reviewed-by: caiyinyu <caiyinyu@loongson.cn>
Signed-off-by: WANG Rui <wangrui@loongson.cn>
WANG Rui [Tue, 14 Apr 2026 15:24:39 +0000 (15:24 +0000)]
elf: Align large load segments to PMD huge page size for THP
Mapping segments that are at least the size of a PMD huge page to
huge-page-aligned addresses helps make them eligible for Transparent
Huge Pages (THP).
This patch introduces a Linux-specific helper, `_dl_map_segment_align`,
to determine an appropriate maximum alignment for ELF load segments based
on the system THP policy. The optimization is enabled only when the glibc
tunable `glibc.elf.thp=1` is set and THP is configured to be used
unconditionally.
The optimization depends on Linux kernel support for file-backed THP,
specifically:
* `CONFIG_READ_ONLY_THP_FOR_FS` (available since Linux kernel 5.4), and
* `CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS`.
When enabled, the helper queries the default THP page size and uses it
to align sufficiently large load segments that are already properly
aligned in both virtual address and file offset (e.g., zero).
For eligible segments, the alignment is bumped to the THP page size,
which improves THP eligibility, reduces TLB pressure, and improves
performance for large objects. To avoid excessive address space padding
on systems with very large THP sizes, the alignment is capped at 32MB.
The optimization is applied only to non-writable segments, matching
typical THP usage.
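The decision described above might look roughly like the sketch below. It is a stand-in, not the actual `_dl_map_segment_align` code: the function name and parameters are hypothetical, "capped" is read here as taking the minimum of the THP size and 32MB, and "properly aligned" is assumed to mean that both the virtual address and the file offset are multiples of the chosen alignment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define THP_ALIGN_CAP (32UL * 1024 * 1024)  /* 32MB cap from the text.  */

/* Hypothetical stand-in: return the alignment to use for one load
   segment.  P_ALIGN is the alignment requested by the ELF file and
   THP_SIZE is 0 when THP is unavailable or not in "always" mode.  */
static size_t
pick_segment_align (size_t p_align, size_t seg_size, size_t vaddr,
                    size_t offset, bool writable, size_t thp_size)
{
  assert (p_align != 0);

  /* Only non-writable segments are considered, and only with THP.  */
  if (thp_size == 0 || writable)
    return p_align;

  /* Assumption: "capped at 32MB" means min (thp_size, 32MB).  */
  size_t align = thp_size < THP_ALIGN_CAP ? thp_size : THP_ALIGN_CAP;

  /* The segment must be at least one huge page and must gain something.  */
  if (align <= p_align || seg_size < thp_size)
    return p_align;

  /* Address and file offset must already agree with the new alignment.  */
  if (vaddr % align != 0 || offset % align != 0)
    return p_align;

  return align;
}
```

With a 2MB THP size, a 4MB read-only segment at offset zero would be bumped to 2MB alignment in this sketch, while a 1MB segment keeps its original alignment.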
Signed-off-by: WANG Rui <wangrui@loongson.cn>
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
WANG Rui [Tue, 14 Apr 2026 15:17:46 +0000 (15:17 +0000)]
tunables: Add glibc.elf.thp tunable for THP-aware segment alignment
Introduce a new tunable, `glibc.elf.thp`, to control Transparent Huge
Page (THP) aware alignment of ELF loadable segments.
When set to `1`, the dynamic loader will attempt to align sufficiently
large `PT_LOAD` segments to the PMD huge page size when mapping them.
This increases the likelihood that the kernel backs these mappings with
Transparent Huge Pages.
The default value is `0`, which preserves the traditional page-sized
alignment and keeps existing behavior unchanged.
On systems without THP support, or when THP is disabled in the kernel,
enabling this tunable has no effect.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Signed-off-by: WANG Rui <wangrui@loongson.cn>
WANG Rui [Tue, 14 Apr 2026 15:16:07 +0000 (15:16 +0000)]
elf: Introduce _dl_map_segment_align hook for segment alignment tuning
Introduce a new helper function, _dl_map_segment_align, to allow
architecture-specific adjustment of ELF load segment alignment during
object mapping.
The generic ELF loader now calls this hook when determining the maximum
segment alignment. The generic implementation is a no-op and preserves
existing behavior.
This provides a well-defined extension point for architectures that
need to adjust segment alignment policies (for example, to improve
mapping efficiency or enable platform-specific optimizations) without
embedding such logic directly in the generic loader.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Signed-off-by: WANG Rui <wangrui@loongson.cn>
WANG Rui [Tue, 14 Apr 2026 15:14:33 +0000 (15:14 +0000)]
elf: Remove redundant _dl_map_segments declaration from dl-load.h
The function `_dl_map_segments` is defined in `<dl-map-segments.h>`,
which provides the canonical implementation (optionally overridden
by sysdeps variants). All call sites include `<dl-map-segments.h>`
directly, so declaring `_dl_map_segments` in `dl-load.h` is unnecessary.
Keeping a static prototype in `dl-load.h` can trigger
-Wunused-function errors when the header is included by translation
units that do not include `<dl-map-segments.h>` and do not reference
`_dl_map_segments`. Since glibc builds with `-Werror`, this results
in build failures [1].
Remove the redundant declaration from `dl-load.h` to avoid these
spurious warnings and keep the declaration colocated with the
definition as intended.
Michael Kelly [Wed, 15 Apr 2026 18:03:09 +0000 (19:03 +0100)]
hurd: __adjtime() to support NULL delta whilst returning olddelta.
This is required to obtain the remaining time of day adjustment
without altering the required adjustment.
Message-ID: <20260415180318.109742-4-mike@weatherwax.co.uk>
Michael Kelly [Wed, 15 Apr 2026 20:18:52 +0000 (22:18 +0200)]
hurd: __adjtime(): struct timeval and time_value_t are not identical.
'struct timeval' and 'struct time_value' have different types for the
microseconds component: int and long int. Casting one to the other
leads to negative numbers not being preserved properly within the
called code.
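A member-by-member conversion avoids the problem described above. The sketch below is illustrative only: `struct demo_time_value` is a hypothetical stand-in for the Mach type, its field types are an assumption, and the function is not taken from the Hurd sources.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical stand-in for the Mach time value type; the int fields
   are an assumption made for illustration.  */
struct demo_time_value
{
  int seconds;
  int microseconds;
};

/* Convert member by member, so that each value (including its sign)
   goes through an ordinary integer conversion instead of having the
   memory of one struct reinterpreted as the other.  */
static struct demo_time_value
timeval_to_demo (const struct timeval *tv)
{
  assert (tv != NULL);
  struct demo_time_value out;
  out.seconds = (int) tv->tv_sec;
  out.microseconds = (int) tv->tv_usec;
  return out;
}
```

A pointer cast between the two struct types would instead read the wider microseconds member through a narrower type (or the reverse), which is where negative values can get lost.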
Message-ID: <20260415180318.109742-3-mike@weatherwax.co.uk>
The locales en_GB and en_IE use a date format of "%d//%m//%y",
as this is the most common shorthand format in both countries.
However the ga_IE locale does not conform to this. The format
"%d.%m.%y" is not commonly used in either the ROI or the UK,
and the forward-slash separator is the most common in both
languages when used in both countries.
This can be verified by checking the CLDR data for Irish:
https://www.unicode.org/cldr/charts/48/verify/dates/ga.html
Signed-off-by: Charlotte Mcmenamin <altronic25@protonmail.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
For normal numbers there is no need to issue scalbn, the fma can set
the exponent directly. Performance-wise on x86_64-linux-gnu without
multi-arch it shows a latency improvement of ~5% and a throughput
improvement of 7% (and slightly more for ABIs with tail-call optimization).
Checked on x86_64-linux-gnu and i686-linux-gnu with --disable-multi-arch.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Lucas Chollet [Wed, 1 Apr 2026 12:21:39 +0000 (14:21 +0200)]
posix: Add POSIX aliases to some spawn functions
Both `posix_spawn_file_actions_add{,f}chdir` functions are now fully
defined by POSIX-2024, this patch adds both functions as aliases of the
already existing `posix_spawn_file_actions_add{,f}chdir_np` GNU
extensions.
This makes glibc more compliant in regards to POSIX-2024.
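For context, a minimal sketch of how the existing GNU extension is typically used follows. The helper `spawn_in_dir` is hypothetical. The sketch calls the `_np` name because it is available in already released glibc versions; with this patch the unsuffixed POSIX name should be usable the same way.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <spawn.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper: run ARGV[0] (looked up in PATH) with DIR as its
   working directory.  Returns the child's exit status, or -1 on failure.  */
static int
spawn_in_dir (const char *dir, char *const argv[])
{
  assert (dir != NULL && argv != NULL && argv[0] != NULL);

  posix_spawn_file_actions_t fa;
  if (posix_spawn_file_actions_init (&fa) != 0)
    return -1;

  int status = -1;
  pid_t pid;
  /* The chdir happens in the child only; the caller's working
     directory is left untouched.  */
  if (posix_spawn_file_actions_addchdir_np (&fa, dir) == 0
      && posix_spawnp (&pid, argv[0], &fa, NULL, argv, environ) == 0
      && waitpid (pid, &status, 0) == pid
      && WIFEXITED (status))
    status = WEXITSTATUS (status);
  else
    status = -1;

  posix_spawn_file_actions_destroy (&fa);
  return status;
}
```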
Signed-off-by: Lucas Chollet <lucas.chollet@free.fr>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This patch fixes a memory leak when ungetwc is used on a wide oriented stream.
The backup buffer was never freed on fclose, causing a memory leak per
ungetwc/fclose call.
The leak has two causes:
In iofclose.c, for wide streams (fp->mode > 0), _IO_new_fclose never calls
_IO_free_wbackup_area. Fixed by adding the missing call.
In wgenops.c, _IO_wdefault_finish checks fp->_IO_save_base (the narrow field,
always NULL for wide streams) instead of fp->_wide_data->_IO_save_base,
and uses a bare free() that leaves _IO_save_end and _IO_backup_base dangling.
Replace the hand-rolled cleanup with _IO_have_wbackup/_IO_free_wbackup_area,
which handles backup-mode switching and clears all three pointers.
This was independently reported by Rocket Ma [1], whose patch corrects the condition
but still uses the manual free path.
Apply the same _IO_have_backup condition in genops.c for consistency.
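The call sequence that exercises this path can be sketched as follows. The helper name is hypothetical and this is not the regression test from the patch. The sketch only performs the calls, so observing the leak itself would need a tool such as valgrind or mtrace.

```c
#include <assert.h>
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: push one wide character back onto a
   wide-oriented stream and close it.  According to the description
   above, each such ungetwc/fclose pair leaked the backup buffer
   before the fix.  Returns 0 on success, -1 on failure.  */
static int
unget_then_close (FILE *fp)
{
  assert (fp != NULL);
  if (fwide (fp, 1) <= 0)          /* Request wide orientation.  */
    {
      fclose (fp);
      return -1;
    }
  wint_t r = ungetwc (L'x', fp);   /* May allocate the wide backup area.  */
  int rc = fclose (fp);            /* Expected to release it again.  */
  return (r == WEOF || rc != 0) ? -1 : 0;
}
```

Calling it in a loop, for example with streams from tmpfile, makes the per-call growth visible in a leak checker on an unfixed library.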
WANG Rui [Tue, 7 Apr 2026 14:04:28 +0000 (14:04 +0000)]
hugepages: Move THP helpers to generic hugepages abstraction
The helpers for determining the default transparent huge page size
and THP mode are currently implemented in malloc-hugepages. However,
these interfaces are not malloc-specific and are also needed by other
subsystems (e.g. the dynamic loader for segment alignment).
Introduce a new generic hugepages abstraction and move the THP mode
detection, default THP page size probing, and hugepage configuration
helpers there. The malloc code now calls into the generic helpers
instead of duplicating the sysfs parsing.
There is no functional change. This is a pure refactoring to make
the THP detection reusable outside of malloc.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Signed-off-by: WANG Rui <wangrui@loongson.cn>
io: Use gnulib fts implementation (BZ 22944, BZ 20331)
This patch synchronizes the glibc fts implementation with the latest
version from gnulib (as of 2026-02-16).
The primary motivation is to address limitations in the legacy glibc
implementation, most notably BZ 22944, where fts fails with an
ENAMETOOLONG error when traversing very long paths or deeply nested
directory trees. The gnulib implementation dynamically reallocates
path buffers and uses openat/fchdir optimizations, effectively
lifting the MAXPATHLEN limitation.
The gnulib implementation also added extra features, which are
used by different GNU projects (coreutils, diffutils):
* FTS_TIGHT_CYCLE_CHECK: used to enable a strict, immediate
cycle-detection algorithm during a file system traversal. This is
done internally using a hash table: every time the traversal enters
a directory, it records the directory's device and inode (dev/ino)
pair in the hash table, and before entering any directory, fts
checks the hash table.
* FTS_CWDFD: instead of actually changing the process's current
working directory, it maintains a virtual current working directory
using file descriptors. The file descriptor is stored in the
fts_cwd_fd field and all subsequent file operations are performed
relative to this file descriptor using *at functions.
* FTS_DEFER_STAT: performance-oriented flag that instructs the file
tree traversal engine to delay fetching file metadata. When the
flag is used, fts skips the immediate stat call. Instead, it marks
the entry with a special internal state (FTS_NSOK and
FTS_STAT_REQUIRED). The actual stat call is pushed down the line
and executed by fts_read right before the application actually
accesses the entry.
* FTS_VERBATIM: fts_open accepts and uses the path strings exactly as
they were provided in the arguments array without slash trimming.
* FTS_MOUNT: restricts the file tree walk to a single file system.
Hopefully, it would allow some GNU projects to use the glibc
implementation instead of pulling the gnulib one.
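The dev/ino bookkeeping behind FTS_TIGHT_CYCLE_CHECK can be illustrated with a small sketch. The names are hypothetical, and a fixed-size linear array is swapped in for the hash table to keep the example short, so this is not the gnulib code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical fixed-size set of (device, inode) pairs.  A linear
   array replaces the hash table of the real implementation.  */
struct dir_id { dev_t dev; ino_t ino; };
struct dir_set { struct dir_id ids[64]; size_t count; };

/* Record that the traversal is entering directory (DEV, INO).
   Returns true if the pair was already recorded.  */
static bool
enter_dir (struct dir_set *set, dev_t dev, ino_t ino)
{
  assert (set != NULL);
  for (size_t i = 0; i < set->count; i++)
    if (set->ids[i].dev == dev && set->ids[i].ino == ino)
      return true;
  assert (set->count < sizeof set->ids / sizeof set->ids[0]);
  set->ids[set->count++] = (struct dir_id) { dev, ino };
  return false;
}
```

This sketch never removes entries, so it reports any repeated visit, which is stricter than detecting a cycle on the currently active path.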
It requires some changes to keep compatibility, compared to gnulib:
* The new required fields are added at the end of the FTS structure, and
the new FTS flags are adjusted to avoid changing FTS_NAMEONLY/FTS_STOP
(even though they are marked as private).
* The FTSENT uses a flexible array (fts_name), so two adjustments are
required: the two new members (fts_fts and fts_dirp) are placed
*before* the struct and the fts_statp is now always allocated and
accounted (the gnulib implementation uses an always allocated member).
Mike Kelly [Wed, 1 Apr 2026 19:49:33 +0000 (20:49 +0100)]
hurd: Interrupted RPC returning EINTR when server has actually changed state.
An interrupted RPC call can return EINTR whilst the RPC is still in
progress on the server. Some RPC calls have permanent consequences
(eg. a write() to append data to a file) but a caller seeing EINTR
should expect that no state has changed. The signal thread now stores
the server's reply (which it already waited for) as the interrupted
thread's reply.
Message-ID: <20260401194948.90428-3-mike@weatherwax.co.uk>
Mike Kelly [Wed, 1 Apr 2026 19:49:32 +0000 (20:49 +0100)]
hurd: alterations to MSG_EXAMINE interface (intr-msg.h)
MSG_EXAMINE has been broadened to allow the signal thread (for
example) to access additional arguments that are passed to
interruptible RPCs in other threads. All architecture specific
variants of intr-msg.h now comply with the revised interface and the
single user of MSG_EXAMINE (report-wait.c) adjusted accordingly.
Message-ID: <20260401194948.90428-2-mike@weatherwax.co.uk>
malloc: Show hugetlb tunable default in --list-tunables
Update the hugetlb tunable default in elf/dl-tunables.c so it is shown as 1
with /lib/ld-linux-aarch64.so.1 --list-tunables.
Move the initialization of thp_mode/thp_pagesize to do_set_hugetlb() and
avoid accessing /sys/kernel/mm if DEFAULT_THP_PAGESIZE > 0. Switch off THP if
glibc.malloc.hugetlb=0 is used - this behaves as if DEFAULT_THP_PAGESIZE==0.
Fix the --list-tunables testcase.
io: ftw: Use state stack instead of recursion (BZ 33882)
The current implementation of ftw relies on recursion to traverse
directories (ftw_dir calls process_entry, which calls ftw_dir). In deep
directory trees, this could lead to a stack overflow (as demonstrated by
the new tst-nftw-bz33882.c test).
This patch refactors ftw to use an explicit, heap-allocated stack to
manage directory traversal:
* The 'struct ftw_frame' encapsulates the state of a single directory
level (directory stream, stat buffer, previous base offset, and
current state).
* ftw_dir is rewritten to use an iterative loop instead of recursion,
which enables immediate state transitions without function call
overhead.
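The general technique, replacing recursion with a heap-allocated stack of frames, can be sketched on an in-memory tree. Everything here is hypothetical: `struct node` stands in for a directory tree, and `struct frame` is far simpler than the `struct ftw_frame` described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical in-memory stand-in for a directory tree.  */
struct node
{
  struct node *first_child;
  struct node *next_sibling;
};

/* Hypothetical frame: the state of one level being traversed.  */
struct frame
{
  struct node *next;   /* Next child to visit at this level.  */
};

/* Count all nodes at and below ROOT without recursion.  Returns -1 if
   the frame stack cannot be allocated or grown.  */
static long
count_nodes (struct node *root)
{
  assert (root != NULL);
  size_t cap = 8, depth = 0;
  struct frame *stack = malloc (cap * sizeof *stack);
  if (stack == NULL)
    return -1;

  long count = 1;                              /* ROOT itself.  */
  stack[depth++].next = root->first_child;
  while (depth > 0)
    {
      struct node *n = stack[depth - 1].next;
      if (n == NULL)
        {
          depth--;                             /* Level exhausted: pop.  */
          continue;
        }
      stack[depth - 1].next = n->next_sibling;
      count++;
      if (n->first_child != NULL)
        {
          if (depth == cap)                    /* Grow the stack on the heap.  */
            {
              struct frame *p = realloc (stack, 2 * cap * sizeof *stack);
              if (p == NULL)
                {
                  free (stack);
                  return -1;
                }
              stack = p;
              cap *= 2;
            }
          stack[depth++].next = n->first_child;  /* Descend: push.  */
        }
    }
  free (stack);
  return count;
}
```

The depth of the tree is then limited by available heap memory rather than by the size of the thread stack.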
The patch also cleans up some unused definitions and assumptions (e.g.,
free-clobbering errno) and fixes a UB when handling the ftw callback.
Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: DJ Delorie <dj@redhat.com>
Sajan Karumanchi [Thu, 26 Mar 2026 09:21:30 +0000 (09:21 +0000)]
x86_64: Prefer EVEX512 code-path on AMD Zen5 CPUs
Introduced a synthetic architecture preference flag (Prefer_EVEX512)
and enabled it for AMD Zen5 (CPUID Family 0x1A) when AVX-512 is supported.
This flag modifies IFUNC dispatch to prefer 512-bit EVEX variants over
256-bit EVEX variants for string and memory functions on Zen5 processors,
leveraging their native 512-bit execution units for improved throughput.
When Prefer_EVEX512 is set, the dispatcher selects evex512 implementations;
otherwise, it falls back to evex (256-bit) variants.
The implementation updates the IFUNC selection logic in ifunc-avx2.h and
ifunc-evex.h to check for the Prefer_EVEX512 flag before dispatching to
EVEX512 implementations. This change affects six string/memory functions:
Additionally, a tunable option (glibc.cpu.x86_cpu_features.preferred)
is provided to allow runtime control of the Prefer_EVEX512 flag for testing
and compatibility.
Reviewed-by: Ganesh Gopalasubramanian <Ganesh.Gopalasubramanian@amd.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Carlos O'Donell [Fri, 20 Mar 2026 21:14:33 +0000 (17:14 -0400)]
resolv: Check hostname for validity (CVE-2026-4438)
The processed hostname in getanswer_ptr should be correctly checked to
prevent invalid characters from being allowed, including shell
metacharacters. It is a security issue to fail to check the returned
hostname for validity.
A regression test is added for invalid metacharacters and other cases
of invalid or valid characters.
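To illustrate the kind of check involved, here is a deliberately strict character filter. It is a hypothetical example, not the validation glibc performs, and the accepted character set (ASCII letters, digits, '-', '_' and '.') is an assumption chosen for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical validator: accept a non-empty name consisting only of
   ASCII letters, digits, '-', '_' and '.'.  Shell metacharacters such
   as ';', '$' or '`' are rejected as a consequence.  */
static bool
hostname_chars_ok (const char *name)
{
  assert (name != NULL);
  if (*name == '\0')
    return false;
  for (const char *p = name; *p != '\0'; p++)
    {
      char c = *p;
      bool ok = ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                 || (c >= '0' && c <= '9')
                 || c == '-' || c == '_' || c == '.');
      if (!ok)
        return false;
    }
  return true;
}
```

An application that passes a reverse-lookup result to a shell would be exposed to names like "a$(id).example" without such a check.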
Florian Weimer [Wed, 25 Mar 2026 09:44:13 +0000 (10:44 +0100)]
Use #!/usr/bin/python3 in remaining Python scripts
Some distributions ban the /usr/bin/python path in their build
systems due to the ambiguity of whether it refers to Python 2 or
Python 3. Python 2 has been out of support for many years, and
glibc has required Python 3 at build time for a while. So it seems
safe to switch the remaining scripts over to /usr/bin/python3.
Xi Ruoyao [Tue, 24 Mar 2026 07:22:59 +0000 (15:22 +0800)]
LoongArch: fix missing trap for enabled exceptions on narrowing operation
The libc_feupdateenv_test macro is supposed to trap when the trap for a
previously held exception is enabled. But
libc_feupdateenv_test_loongarch wasn't doing it properly: the comment
claims "setting of the cause bits" would cause "the hardware to generate
the exception" but that's simply not true for the LoongArch movgr2fcsr
instruction.
To fix the issue, we need to call __feraiseexcept in case a held exception
is enabled to trap.
Reviewed-by: caiyinyu <caiyinyu@loongson.cn>
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
mengqinggang [Tue, 24 Mar 2026 07:48:38 +0000 (15:48 +0800)]
nptl: Fix nptl/tst-cancel31 fail sometimes
tst-cancel31 sometimes fails on la32 qemu-system with a single-core
system.
If the test and an infinite loop run on the same x86_64 core,
the test also fails sometimes:
taskset -c 0 make test t=nptl/tst-cancel31
taskset -c 0 ./a.out (a.out is an infinite loop)
After the writeopener thread opens the file, execution may switch to
the main thread, which then finds redundant files.
Call pthread_cancel and pthread_join on the writeopener thread
before support_descriptors_check.
Carlos O'Donell [Fri, 20 Mar 2026 20:43:33 +0000 (16:43 -0400)]
resolv: Count records correctly (CVE-2026-4437)
The answer section boundary was previously ignored, and the code in
getanswer_ptr would iterate past the last resource record, but not
beyond the end of the returned data. This could lead to subsequent data
being interpreted as answer records, thus violating the DNS
specification. Such resource records could be maliciously crafted and
hidden from other tooling, but processed by the glibc stub resolver and
acted upon by the application. While we trust the data returned by the
configured recursive resolvers, we should not trust its format and
should validate it as required. It is a security issue to incorrectly
process the DNS protocol.
A regression test is added for response section crossing.
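The boundary in question comes from the ANCOUNT field of the DNS header. The sketch below shows the idea with hypothetical helpers; it is not the getanswer_ptr code, and only the header layout (12 bytes, ANCOUNT in bytes 6 and 7, big-endian, per RFC 1035) is taken as given.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: read the ANCOUNT field of a DNS message.
   Returns -1 if the message is shorter than the 12-byte header.  */
static int
dns_ancount (const unsigned char *msg, size_t len)
{
  assert (msg != NULL);
  if (len < 12)
    return -1;
  return (msg[6] << 8) | msg[7];
}

/* Hypothetical loop bound: process at most ANCOUNT records, even when
   further well-formed records (authority or additional sections)
   follow in the same buffer.  */
static int
records_to_process (int ancount, int records_in_buffer)
{
  return records_in_buffer < ancount ? records_in_buffer : ancount;
}
```

Bounding the loop only by the end of the buffer, as the old code is described to have done, would treat records from later sections as answers.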
Xi Ruoyao [Thu, 19 Mar 2026 08:33:22 +0000 (16:33 +0800)]
LoongArch: feclearexcept: skip clearing CAUSE
The comment explaining the reason to clear CAUSE does not make any
sense: it says the next "CTC" instruction would raise the FP exception
of which both the CAUSE and ENABLE bits are set, but LoongArch does not
have the CTC instruction. LoongArch has the movgr2fcsr instruction but
movgr2fcsr never raises any FP exception, different from the MIPS CTC
instruction.
riscv: Resolve calls to memcpy using memcpy-generic in early startup
This patch from Adhemerval sets up the ifunc redirections so that we
resolve memcpy to memcpy_generic in early startup. This avoids infinite
recursion for memcpy calls before the loader is fully initialized.
Tested-by: Jeff Law <jeffrey.law@oss.qualcomm.com>
Martin Coufal [Thu, 19 Mar 2026 13:09:22 +0000 (14:09 +0100)]
Makefile: add allow-list for failures
Enable adding known failures to allowed-failures.txt and ignore failures
in case they are in the list. In case the allowed-failures.txt does not
exist, all failures lead to a failed status as before.
When the file is present, failures of listed tests are ignored and reported
on stdout. If tests not in the allowed list fail, summarize-tests exits with
status 1 and reports the failing tests.
The expected format of allowed-failures.txt file is:
<test_name> # <comment>
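Matching a test name against one line in that format could look like the sketch below. The helper is hypothetical and is not part of the glibc test machinery; it is written in C purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: return true if LINE, written in the
   "<test_name> # <comment>" format, names TEST.  The name is taken to
   be everything before the first blank, tab, '#' or newline.  */
static bool
line_allows (const char *line, const char *test)
{
  assert (line != NULL && test != NULL);
  size_t n = strcspn (line, " \t#\n");   /* Length of the leading name.  */
  return n > 0 && n == strlen (test) && strncmp (line, test, n) == 0;
}
```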
The libgcc implementations of __builtin_clzl/__builtin_ctzl may require
access to additional data that is not marked as hidden, which could
introduce additional GOT indirection and necessitate RELATIVE relocs.
And the RELATIVE reloc is an issue if the code is used during static-pie
startup before self-relocation (for instance, during an assert).
For this case, the ABI can add a string-bitops.h header that defines
HAVE_BITOPTS_WORKING to 0. A configure check for this issue is tricky
because it requires linking against the standard libraries, which
create many RELATIVE relocations and complicate filtering those that
might be created by the builtins.
The fallback is disabled by default, so no target is affected.
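A builtin-free count-leading-zeros routine of the kind such a fallback might use is sketched below. It is hypothetical and not the code from the patch. It relies only on shifts and compares, so it references no lookup table, although whether a given compiler rewrites the loop into a builtin call depends on the compiler and flags.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical fallback: count the leading zero bits of X, which must
   be nonzero, using shifts and compares only.  */
static int
clzl_fallback (unsigned long x)
{
  assert (x != 0);
  int n = 0;
  for (unsigned long bit = 1UL << (sizeof (unsigned long) * CHAR_BIT - 1);
       (x & bit) == 0; bit >>= 1)
    n++;
  return n;
}
```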
Wilco Dijkstra [Mon, 16 Mar 2026 14:24:32 +0000 (14:24 +0000)]
AArch64: Remove prefer_sve_ifuncs
Remove the prefer_sve_ifuncs CPU feature since it was intended for older
kernels. Current distros all use modern Linux kernels with improved support
for SVE save/restore, making this check redundant.
First off, apologies for my misunderstanding on how madvise(MADV_HUGEPAGE)
works. I had the misconception that doing madvise(p, 1, MADV_HUGEPAGE) will set
VM_HUGEPAGE on the entire VMA - it does not, it will align the size to
PAGE_SIZE (4k) and then *split* the VMA. Only the first page-length of the
virtual space will be VM_HUGEPAGE'd, the rest of it will stay the same.
The above is the semantics for all madvise() calls - which makes sense from a
UABI perspective. madvise() should do the proposed thing to only the length
(page-aligned) which it was asked to do, doing any more than that is not
something the user is expecting.
Commit 6e8f32d39a57 tries to optimize around the madvise() call by determining
whether the VMA got madvise'd before. This will work for most cases except
the following: if check_may_shrink_heap() is true, shrink_heap() re-maps the
shrunk portion, giving us a new VMA altogether. That VMA won't have the
VM_HUGEPAGE flag.
Reverting this commit, we will again mark the new VMA with VM_HUGEPAGE, and
the kernel will merge the two into a single VMA marked with VM_HUGEPAGE.
This may be the only case where we lose VM_HUGEPAGE, and we could micro-optimize
by extending the current if-condition with !check_may_shrink_heap. But let us
not do this - this is very difficult to reason about, and I am soon going
to propose mmap(MAP_HUGEPAGE) in Linux to do away with all these workarounds.
The inclusion of the generic tanh implementation without undefining
libm_alias_double (to provide the __tanh_sse2 implementation) makes
the exported tanh symbol point to the SSE2 variant.
The current implementation precision shows the following accuracy, on
three ranges ([-DBL_MAX,-10], [-10,10], [10,DBL_MAX]) with 10e9 uniform
randomly generated numbers for each range (first column is the
accuracy in ULP, with '0' being correctly rounded, second is the
number of samples with the corresponding precision):
The CORE-MATH implementation is correctly rounded for any rounding mode.
The code was adapted to glibc style and to use the definition of
math_config.h (to handle errno, overflow, and underflow).
Samuel Thibault [Mon, 16 Mar 2026 11:20:45 +0000 (12:20 +0100)]
nptl/htl: Fix confusion over PTHREAD_IN_LIBC and __PTHREAD_NPTL/HTL
The last uses of PTHREAD_IN_LIBC are in places where __PTHREAD_NPTL/HTL
should have been used. The latter was not conveniently available everywhere.
Defining it from config.h makes things simpler.