Thomas Daubney [Mon, 18 May 2026 16:20:53 +0000 (16:20 +0000)]
Vectorise special cases for SVE log1p(f)
This patch adds vectorised special cases for the SVE log functions
log1p and log1pf.
When built with GCC-15 and executed on a Neoverse V2 platform, the
following benchmarking throughput uplifts were measured:
log1pf -> 285% speed-up (4.85 ns/element to 1.26 ns/element)
log1p -> 117% speed-up (8.25 ns/element to 3.80 ns/element)
Note that the numbers here are for the special case path only and that
the fast path performance has been maintained. These changes have also
maintained the same level of accuracy as before.
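The percentages quoted above are consistent with (old / new - 1) * 100 applied to the per-element timings. A minimal sketch of that arithmetic follows; the helper name is hypothetical and is not part of the patch.

```c
/* Hypothetical helper, not part of glibc: convert a before/after timing
   pair into a speed-up percentage, rounded to the nearest integer.  */
static int
speedup_percent (double old_ns, double new_ns)
{
  double pct = (old_ns / new_ns - 1.0) * 100.0;
  return (int) (pct + 0.5);   /* Rounds to nearest for non-negative values.  */
}
```

With the log1pf timings, 4.85 / 1.26 is roughly 3.85, which gives the reported 285%.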
Rocket Ma [Thu, 21 May 2026 03:22:38 +0000 (20:22 -0700)]
stdio-common: Optimize scanf %ms series array expansion
* stdio-common/vfscanf-internal.c: If the user explicitly sets the
maximum size of the string, respect it when reading characters. Instead
of always expanding exponentially, expand the array to the exact size
the user requested when `user_size < current_size * 2`.
Signed-off-by: Rocket Ma <marocketbd@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
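The growth rule described in this commit can be sketched as follows. This is an illustration only, not the code in vfscanf-internal.c: the function name is hypothetical, user_size == 0 is assumed to mean "no explicit maximum", and overflow and the terminating byte are ignored.

```c
#include <stddef.h>

/* Hypothetical helper, not the code in vfscanf-internal.c.  Returns the
   capacity to grow to once a buffer of current_size is full.
   user_size is the explicit maximum from the format, or 0 if none
   (an assumption of this sketch).  */
static size_t
next_capacity (size_t current_size, size_t user_size)
{
  size_t doubled = current_size * 2;
  /* Stop at the requested size when doubling would overshoot it.  */
  if (user_size > current_size && user_size < doubled)
    return user_size;
  return doubled;
}
```

When the requested size is at least twice the current size, the sketch falls back to plain doubling.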
Thomas Daubney [Fri, 8 May 2026 15:58:11 +0000 (15:58 +0000)]
Vectorise special cases for SVE inverse hyperbolics
This patch adds vectorised special cases for the SVE inverse hyperbolic
functions atanh, acosh and asinh for single precision floats. It also
moves the commonly used inf and nan bit values into the sv_log1pf_inline
data struct for reuse.
When built with GCC-15 and executed on a Neoverse V2 platform, the
following benchmarking throughput uplifts were measured:
atanh -> 215% speed-up (5.51 ns/element to 1.75 ns/element)
acosh -> 152% speed-up (4.63 ns/element to 1.84 ns/element)
asinh -> 51% speed-up (5.00 ns/element to 3.31 ns/element)
Note that the numbers here are for the special case path only and that
the fast path performance has been maintained.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
zombie12138 [Wed, 6 May 2026 05:38:01 +0000 (22:38 -0700)]
x86: Fix non-temporal memset unreachable on AMD Zen 3/4/5
On AMD Zen 3/4/5 with ERMS, the non-temporal memset path is unreachable
because rep_stosb_threshold is set to SIZE_MAX (vectorized loop is faster
than ERMS on these CPUs), but the non-temporal code path is nested inside
the rep_stosb branch.
The existing rescue logic at the Avoid_STOSB check only covers the case
where the CPU lacks ERMS hardware support. It does not cover AMD Zen 3+
where ERMS is supported but deliberately unused for performance reasons.
Extend the condition to also lower rep_stosb_threshold when:
- The user has not explicitly set x86_rep_stosb_threshold (respect tunables)
- rep_stosb_threshold is higher than memset_non_temporal_threshold (NT gated)
This makes the non-temporal path reachable for large memset operations,
providing ~2x speedup on pre-faulted buffers larger than L3 cache.
Tested on AMD Ryzen 7 8745HS (Zen 4):
- Pre-faulted 64MB memset: 2.02 ms -> 0.94 ms (2.15x faster)
- First-touch 64MB memset: 19.3 ms -> 21.3 ms (11% regression, expected:
kernel clear_page cache warming bypassed by NT stores)
* sysdeps/x86/dl-cacheinfo.h (dl_init_cacheinfo): Extend
rep_stosb_threshold lowering condition to cover AMD Zen 3/4/5
where ERMS is supported but stosb is disabled via threshold.
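The reachability problem and the extended condition can be modelled in a few lines. This is a sketch under stated assumptions, not the code in dl-cacheinfo.h: the function names, the enum, and the boolean parameters are hypothetical, and lowering the threshold to exactly the non-temporal threshold is an assumption that the commit message does not spell out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum memset_path { PATH_VECTOR, PATH_REP_STOSB, PATH_NON_TEMPORAL };

/* Models the nesting described above: the non-temporal path is only
   reachable from inside the rep stosb branch.  */
static enum memset_path
choose_path (size_t size, size_t rep_stosb_threshold, size_t nt_threshold)
{
  if (size >= rep_stosb_threshold)
    {
      if (size >= nt_threshold)
        return PATH_NON_TEMPORAL;
      return PATH_REP_STOSB;
    }
  return PATH_VECTOR;
}

/* Hypothetical version of the extended condition.  Lowering to
   nt_threshold is an assumption of this sketch.  */
static size_t
adjust_rep_stosb_threshold (size_t rep_stosb_threshold, size_t nt_threshold,
                            bool avoid_stosb, bool user_set_threshold)
{
  if (avoid_stosb
      || (!user_set_threshold && rep_stosb_threshold > nt_threshold))
    return nt_threshold;
  return rep_stosb_threshold;
}
```

With rep_stosb_threshold at SIZE_MAX no size ever enters the outer branch, so the non-temporal path cannot be chosen. Once the two thresholds are equal, every size that enters the outer branch also satisfies the inner test, so large sizes take the non-temporal path and rep stosb is still never selected.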
Xiang Gao [Fri, 8 May 2026 06:04:10 +0000 (14:04 +0800)]
libio: Ignore doallocate for open_memstream and open_wmemstream [BZ #34019]
setvbuf (stream, NULL, _IOFBF, 0) takes a special path in
_IO_setvbuf: if the byte-oriented buffer base is NULL, it calls
_IO_DOALLOCATE and returns without invoking the stream setbuf hook.
For open_wmemstream, the byte-oriented buffer base is NULL although
the wide result buffer has already been initialized in _wide_data.
As a result, this path calls _IO_wdefault_doallocate, which may
replace the wide buffer managed by open_wmemstream.
Install an open_wmemstream-specific doallocate hook that leaves
the growable result buffer unchanged. Add a regression test for this
path.
Install a narrow memstream doallocate hook as well. It keeps both
memstream vtables consistent (generic stdio allocation must not
replace the growable result buffer).
Xiang Gao [Fri, 8 May 2026 06:04:09 +0000 (14:04 +0800)]
libio: Ignore setbuf for open_memstream and open_wmemstream [BZ #34019]
open_memstream and open_wmemstream manage an internal growable buffer.
The default setbuf hook can reset that buffer, breaking the assumptions
used by the string stream overflow paths.
Install setbuf hooks that leave the internal buffer unchanged, and add
regression test cases for the narrow and wide cases, based on the
reproducer in BZ #34019.
Checked on x86_64 with no regression in the libio tests.
xiejiamei [Sat, 9 May 2026 07:56:28 +0000 (07:56 +0000)]
x86: Lower non-temporal copy threshold for Hygon
Benchmarks on Hygon processors show that the default non-temporal
threshold is higher than ideal for large copy workloads. As a result,
memcpy and memmove may continue to use the temporal copy path for
longer than is beneficial, increasing cache pollution and reducing
throughput for large copies.
Lower the copy non-temporal threshold to 3/8 of the shared cache size
per thread on Hygon. This allows the non-temporal copy path to be
selected earlier while leaving the memset non-temporal threshold
unchanged.
Signed-off-by: xiejiamei <xiejiamei@hygon.cn>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
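As a rough illustration of the new threshold, the 3/8 rule amounts to the following. The helper name is hypothetical and the actual computation in glibc may be structured differently.

```c
#include <stddef.h>

/* Hypothetical helper: 3/8 of the per-thread share of the shared cache.
   Divides first so the multiplication cannot overflow; this truncates
   slightly when the input is not a multiple of 8.  */
static size_t
hygon_copy_nt_threshold (size_t shared_per_thread)
{
  return shared_per_thread / 8 * 3;
}
```

For an 8 MiB per-thread share this yields 3 MiB.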
Daan De Meyer [Mon, 18 May 2026 07:39:33 +0000 (07:39 +0000)]
elf: Don't crash in dlsym when tail-called from a constructor [BZ #34156]
If a shared library's constructor calls dlsym and discards the result,
the compiler is free to lower the call to a tail jump. The dynamic
linker then resolves the apparent caller to ld.so's own link map, which
has no l_scope, and crashes in _dl_lookup_symbol_x dereferencing the
NULL scope pointer.
Tail-call optimization is a legal C transformation and there is no way
for the dynamic linker to recover the real caller from the elided frame.
Detect the situation by its observable effect -- a link map with no
l_scope -- and fall back to the main program's link map, the same
treatment used when the caller's address is otherwise unrecognized.
The check is written against l->l_scope rather than against _dl_rtld_map
directly because dl-sym-post.h is also compiled into libc.so, where
_dl_rtld_map is not visible (it lives only in ld.so).
Add dlfcn/tst-dlsym-ctor exercising the pattern. Without the fix the
test SIGSEGVs during dlopen; with the fix dlopen returns cleanly.
Signed-off-by: Daan De Meyer <daan@amutable.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
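The fallback described in this commit can be sketched as below. The struct is a hypothetical stand-in for struct link_map carrying only the field the check needs, and the function name is invented for illustration; a NULL caller models the "address not recognized" case.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct link_map with only the field used here.  */
struct map_sketch
{
  const char *name;
  void **l_scope;   /* NULL for a map that has no lookup scope.  */
};

/* Pick the map whose scope is used for the lookup: the caller's map when
   it is usable, otherwise the main program's map.  */
static struct map_sketch *
lookup_origin (struct map_sketch *caller, struct map_sketch *main_map)
{
  if (caller == NULL || caller->l_scope == NULL)
    return main_map;
  return caller;
}
```

Testing the observable property (no scope) rather than the identity of the map keeps the check usable in code that cannot name ld.so's own map.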
Avinal Kumar [Mon, 18 May 2026 10:18:00 +0000 (15:48 +0530)]
SHARED-FILES: Update gettext sync record
Update the gettext section to reflect the 2026 sync with GNU
gettext 1.0 (through commit 2ebbdd0e2). Add intl/eval-plural.h
which was missing from the shared files list.
Avinal Kumar [Mon, 18 May 2026 10:14:52 +0000 (15:44 +0530)]
intl: Fix undefined pointer behaviour
In _nl_find_msg (dcigettext.c), outbuf was computed as
freemem + sizeof(size_t) before checking whether freemem_size is
large enough. When freemem is NULL (initial state), this is
undefined behaviour, i.e. arithmetic on a null pointer. Move the
outbuf assignment after the size check where freemem is guaranteed
to be a valid allocation.
In read_alias_file (localealias.c), after realloc the old
string_space pointer is dangling. The expression
new_pool - string_space subtracts a dangling pointer from a valid
one, which is undefined behaviour per ISO C23.
Rewrite as new_pool + (map[i].alias - string_space) so both
operands of the subtraction point into the same (old) object
before string_space is reassigned.
Based on GNU gettext commits 695429040 and 2ebbdd0e2.
Original author: Bruno Haible <bruno@clisp.org>
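The rebasing idiom from the read_alias_file fix can be illustrated in isolation. This is a sketch, not the localealias.c code: the struct and function names are hypothetical, and it uses malloc plus memcpy instead of realloc so that the old block is still live while the offsets are computed.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an alias map entry; not glibc's struct.  */
struct entry_sketch
{
  char *alias;
};

/* Grow POOL from OLD_SIZE to NEW_SIZE bytes (NEW_SIZE >= OLD_SIZE) and
   rebase the N entries of MAP.  Returns the new pool, or NULL on failure
   with POOL and MAP left untouched.  */
static char *
grow_pool (char *pool, size_t old_size, size_t new_size,
           struct entry_sketch *map, size_t n)
{
  char *new_pool = malloc (new_size);
  if (new_pool == NULL)
    return NULL;
  memcpy (new_pool, pool, old_size);
  for (size_t i = 0; i < n; i++)
    /* Both operands of the subtraction point into the old pool.  */
    map[i].alias = new_pool + (map[i].alias - pool);
  free (pool);
  return new_pool;
}
```

The key point is that the only subtraction is between two pointers into the same object; the result is a plain offset that can then be added to the new base.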
In file included from zic.c:16:
private.h:849:1: error: static declaration of ‘mempcpy’ follows non-static declaration
849 | mempcpy(void *restrict s1, void const *restrict s2, size_t n)
| ^~~~~~~
In file included from ../include/string.h:60,
from private.h:222:
../string/string.h:432:14: note: previous declaration of ‘mempcpy’ with type [...]
432 | extern void *mempcpy (void *__restrict __dest,
| ^~~~~~~
The libc-symbols.h already defined some HAVE_*, but timezone files are
built with -D_ISOMAC. Remove its usage and only define _ and N_
macros if not already defined.
Checked on x86_64-linux-gnu and with a build-many-glibcs.py build for
i686-gnu.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Paul Eggert <eggert@cs.ucla.edu>
elf: Use dl_scratch_buffer for LD_LIBRARY_PATH copy in _dl_init_paths
_dl_init_paths used strdupa to make a mutable copy of LD_LIBRARY_PATH
for fillin_rpath to tokenize. The env block is attacker-controllable
and Linux allows individual variables up to MAX_ARG_STRLEN (32 *
PAGE_SIZE = 128 KB), so the strdupa can push tens of KB onto the
loader's startup stack on top of the env block that already sits on
the initial stack. With a reduced RLIMIT_STACK the doubled copy
overflows before main () is reached.
Replace the strdupa with a dl_scratch_buffer: short paths stay in
the 256-byte inline area, longer ones spill to anonymous mmap (malloc
is not yet available during _dl_init_paths). Two follow-on changes
make the new scratch lifetime safe against _dl_signal_error:
* Count entries directly off the const LD_LIBRARY_PATH and allocate
__rtld_env_path_list.dirs *before* the scratch is live. That way
the larger of the two heap allocations the loader controls signals
its OOM with no scratch to leak.
* Convert fillin_rpath to return bool instead of calling
_dl_signal_error internally on per-entry malloc failure. Its
only caller in the LLP path now frees the scratch first and then
signals the error from a clean state. decompose_rpath, the other
caller, is updated symmetrically. This also fixes a pre-existing
leak in fillin_rpath's OOM path, where the to_free heap copy from
expand_dynamic_string_token was not released before the
_dl_signal_error.
Checked on x86_64-linux-gnu, aarch64-linux-gnu, and i686-linux-gnu.
elf: Use dl_scratch_buffer for DST expansion in _dl_map_object_deps
The expand_dst macro in _dl_map_object_deps performs an unbounded
alloca via DL_DST_REQUIRED, which scales with the link map's
l_origin length plus the count of dynamic-string tokens in the
input string. When a DT_NEEDED entry carries several DSTs and the
link map sits in a deep directory, the resulting allocation grows
to several kilobytes -- enough to overflow a PTHREAD_STACK_MIN
thread that calls dlopen.
Convert the macro to a static function that draws from a caller-
owned dl_scratch_buffer, so oversized expansions land on the heap
(or anonymous mmap during early startup) instead of the stack.
The scratch buffer is reused across DT_NEEDED, DT_AUXILIARY, and
DT_FILTER entries of the same map and freed once dependency
expansion completes.
A new regression test, tst-dst-needed-minstack, builds a wrapper
library that inherits a five-DST SONAME from a leaf module,
deploys it under a deep temporary directory, and dlopens it from
a PTHREAD_STACK_MIN thread. Without the fix the dlopen overflows
the thread stack and crashes; with the fix the dlopen returns
cleanly (with or without a successful load).
Checked on aarch64-linux-gnu, x86_64-linux-gnu, and i686-linux-gnu.
elf: Replace alloca with dl_scratch_buffer in _dl_load_cache_lookup
The alloca added by commit ccdb048d ("Fix recursive dlopen") to
snapshot the matched cache entry before __strdup runs through
interposable malloc is sized by best_len, which can reach PATH_MAX.
On PTHREAD_STACK_MIN threads that's enough to overflow the stack
mid-dlopen.
Use dl_scratch_buffer with DL_SCRATCH_NO_MALLOC: short entries stay
in the 256-byte inline area, longer ones spill to anonymous mmap
rather than to interposable malloc. The recursive-dlopen invariant
is preserved.
New container test elf/tst-dl-cache-long-path constructs a ~3.4 KB
deep directory, populates ld.so.cache with that entry, and dlopens
from a PTHREAD_STACK_MIN thread under deliberate stack pressure;
reliably SIGSEGVs against the alloca-based code and passes with the
fix.
Checked on aarch64-linux-gnu, x86_64-linux-gnu, and i686-linux-gnu.
elf: Replace alloca/VLA with dl_scratch_buffer in dl-load.c
is_trusted_path_normalize, print_search_path, and open_path used
alloca or a VLA to hold a path scratch buffer sized by user-controlled
inputs (an RPATH directory length, or
max_dirnamelen + max_capstrlen + namelen). On the worst case that
consumes up to PATH_MAX bytes of stack per call, which can overflow a
PTHREAD_STACK_MIN-sized stack mid-dlopen when combined with the
loader's other on-stack scratch (struct filebuf, etc.).
Replace those allocations with dl_scratch_buffer. As a small cleanup,
print_search_path now takes the scratch buffer from its caller
(open_path's buffer is already large enough --
max_dirnamelen + max_capstrlen + namelen with namelen >= 1 covers the
max_dirnamelen + max_capstrlen + 1 print_search_path requires), so
LD_DEBUG=libs no longer pays for an extra allocation per open_path
invocation.
A new test elf/tst-dl-path-buf exercises the relevant paths -- dlopen
via DT_RPATH, open_path failure cleanup, dlopen with an over-long
name, dlopen from a PTHREAD_STACK_MIN thread.
Checked on aarch64-linux-gnu, x86_64-linux-gnu, and i686-linux-gnu.
elf: Add dl_scratch_buffer, a loader-side scratch buffer
Several loader code paths need a short-lived scratch buffer sized
by attacker-influenced inputs (RPATH entries, ld.so.cache strings,
etc.). The available primitives are all unsuitable:
- alloca is unbounded and can overflow PTHREAD_STACK_MIN stacks.
- <scratch_buffer.h> is unaware of __minimal_malloc: a malloc'd
spill freed during early loader startup silently leaks because
__minimal_free only releases the most-recent allocation.
- A few paths cannot route through the interposable malloc at
all -- ld.so.cache lookup in particular, because an interposed
user malloc may recursively call dlopen and __munmap the cache
mapping mid-copy (commit ccdb048d, "Fix recursive dlopen").
Add a loader-side analogue of <scratch_buffer.h>: a 256-byte inline
area for the common case, with spill to malloc by default or to
anonymous mmap when __minimal_malloc is active or the caller passes
DL_SCRATCH_NO_MALLOC. Mmap spills are tagged " glibc: loader
scratch" via __set_vma_name for /proc/self/maps visibility. On OOM
dl_scratch_buffer_allocate raises a loader error via _dl_signal_error
and does not return. The one-shot contract (no second allocate
without an intervening free) is enforced by an assertion in
_dl_scratch_buffer_allocate.
No functional change in this commit; consumers are added separately.
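The general shape of such a buffer, an inline area with a spill path and a one-shot contract, can be sketched as follows. Everything here is hypothetical and is not the dl_scratch_buffer API: the names are invented, the spill uses plain malloc only (no anonymous mmap), and allocation failure returns NULL instead of raising a loader error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define SCRATCH_INLINE_SIZE 256

/* Hypothetical scratch buffer; not glibc's dl_scratch_buffer.  */
struct scratch
{
  void *data;      /* NULL while no allocation is live.  */
  bool spilled;    /* True when DATA came from malloc.  */
  union
  {
    max_align_t align;                  /* Forces suitable alignment.  */
    char bytes[SCRATCH_INLINE_SIZE];
  } inline_area;
};

static void
scratch_init (struct scratch *s)
{
  s->data = NULL;
  s->spilled = false;
}

static void *
scratch_allocate (struct scratch *s, size_t size)
{
  /* One-shot contract: no second allocate without an intervening free.  */
  assert (s->data == NULL);
  if (size <= SCRATCH_INLINE_SIZE)
    s->data = s->inline_area.bytes;
  else
    {
      s->data = malloc (size);
      s->spilled = s->data != NULL;
    }
  return s->data;
}

static void
scratch_free (struct scratch *s)
{
  if (s->spilled)
    free (s->data);
  s->data = NULL;
  s->spilled = false;
}
```

Small requests never touch the allocator, which is the property the loader relies on for the common case.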
Yury Khrustalev [Tue, 19 May 2026 13:37:13 +0000 (14:37 +0100)]
malloc: Small fix for code readability
A couple of small fixes for code readability, no functional change.
- Add missing comments for #endif statements.
- Move inclusion of string.h from malloc.c to calloc-clear-memory.h
where it is actually used.
- Re-order alias definitions for malloc functions.
Stefan Liebler [Mon, 11 May 2026 13:22:11 +0000 (15:22 +0200)]
s390: Adjust configure check for static-pie support.
With the previous approach, the configure check fails for lld versions >= 19.
While binutils and lld 18 place the R_390_IRELATIVE relocation in .rela.plt
and emit a DT_JMPREL entry pointing to it, newer lld versions put the
R_390_IRELATIVE relocation in .rela.dyn. There is then no DT_JMPREL entry, and
the configure check claims that lld does not support static-pie.
The R_390_IRELATIVE relocation is also processed correctly in .rela.dyn, so the
configure check is adjusted. It now checks that an R_390_IRELATIVE relocation
exists at all. If the relocation lands in .rela.plt, the check ensures that
there is a DT_JMPREL entry pointing to it. Otherwise there should be a
.rela.dyn section.
Weihong Ye [Mon, 18 May 2026 16:25:46 +0000 (16:25 +0000)]
AArch64: Optimize memcmp for Kunpeng 950 with SVE
Key optimizations:
- Use SVE predication for branch-free handling of short inputs and tails
- Use 4-way loop unrolling to maximize pipeline utilization
- Optimize mismatch detection with early exit logic
Benchmark (bench-memcmp, generic -> this patch):
- Small (0-128B): 15% - 50% speedup
- Medium (129-1024B): 21% - 50% speedup
- Large (2048-4096B): 28% - 50% speedup
Note: regressions may be observed in edge cases where offsets
are near 4K boundaries. These instances are rare and the overall
performance gain remains significantly positive.
Also add IFUNC support for memcmp and correct the first-line
comment in memcpy_kunpeng950.S.
Paul Eggert [Wed, 13 May 2026 18:08:35 +0000 (11:08 -0700)]
Simplify tzdb-related configuration
tzdb 2026b no longer needs -Wno-discarded-qualifiers or
-Wno-unused-variable. From a suggestion by Joseph Myers in:
https://sourceware.org/pipermail/libc-alpha/2026-May/177312.html
* configure.ac (config-cflags-wno-discarded-qualifiers): Remove.
* timezone/Makefile (CFLAGS-zic.c): Remove -Wno-unused-variable,
$(config-cflags-wno-discarded-qualifiers).
Paul Eggert [Wed, 13 May 2026 18:08:34 +0000 (11:08 -0700)]
timezone: sync to tzdb 2026b
Sync tzselect, zdump, zic to tzdb 2026b.
This fixes some buffer and integer overflows in zic,
adds new zic options -D, -m and -u inspired by FreeBSD,
and raises zic’s maximum number of abbreviation bytes
per timezone from 50 to 256.
This patch incorporates the following tzdb source code changes:
f9d30685 Output a minimal time zone designation table
37a4d178 Fix zic overflow bug with too-large offsets
4392f2dc zic now checks for signals more often
99a08a66 Fix zic buffer overflow when computing TZ
d63b9287 zic: keep needed last transition to new type
d005045d Pacify clang -Wunterminated-string-initialization
e67b08d3 Port to C23 strchr macro
3d4b4e46 Add zic.c overflow commentary
d9101b88 zic now a bit safer for overflows near 2**63
b23fa8e0 zic now allows more than 50 leap seconds
4ff518d2 Increase TZ_MAX_CHARS from 50 to 256
75d3b73b New -DTZ_RUNTIME_LEAPS=0 build-time option
87343c6e TZ_MAX_TIMES must be at least 310 now
fc8f1b68 Simplify int_fast32_t definition on C89 platforms
24581465 Remove TZDEFRULES ("posixrules") from localtime.c
fc708427 zic now warns about -p
b09a3f23 Port TWOS_COMPLEMENT to signed-magnitude hosts
56b7a24a Make sure 2**31 - 1 is signed
9068ab78 zic no longer generates utoff == -2**31
cb6f9b3b Omit unnecessary L suffixes
c37fbc32 Clarify when ‘__attribute__((pure))’ is a hack
859690a7 Fix some unsequenced/reproducible commentary
9c772ca7 Port to POSIX.1-2001 fflush
10f93018 Omit no-op transitions when Rule+Zone cancel
a0b09b52 Fix unlikely backslash bug in scripts
2cbd3a71 Allow builder to override GRANDPARENTED
c7257626 not used at → used outside
faed4bd3 Clarify <sys/auxv.h> vs getauxval
df08e6a1 Port mode_t (and gid_t, uid_t) to MS-Windows
6127d375 New zic option -u, inspired by FreeBSD
813c9ee0 New zic option -m, inspired by FreeBSD
987ea89c New zic option -D, inspired by FreeBSD
cc377b07 Simplify mkdir situation
cd994a90 Simplify !HAVE_POSIX_DECLS situation
052ddf76 Minor gettext macro improvements
d9018f1c Refactor duplicate duplicate-option code
8d65db97 Prefer fdopen to umask in zic
d7edca6e Omit “'”s from zic usage message
a09ba7a5 getopt returns -1 (not EOF) on failure
e22d410c zic now uses is_digit
f57cadda Always invoke umask at start
242a8338 Fix mode_t issues on MS-Windows
2fecd606 MKDIR_UMASK → MKDIR_PERMS refactoring
90ef088a Move static_assert to top level
41576478 Port better to platforms lacking mempcpy
90a08d3e * private.h: Include stddef.h early enough
aa8b35fe Simplify port to NetBSD struct __state
cd2fddf7 Port to -DHAVE_SYS_STAT_H=0 -DHAVE_POSIX_DECLS=0
8470e759 Pacify GCC 15 -Wunterminated-string-initialization
8817d42f Prefer mempcpy to doing it by hand
87abb113 Tighten security checks on TZ values
c87f0918 Use strnlen
07f7f31a Fix preprocessor indenting
3adf4123 Add offtime_r à la FreeBSD and NetBSD
b807a31e Don’t depend on ‘true’ for tzselect
ddffc800 * zic.c: Fix misspelled comment (thanks to Jonathan Wakely).
7063d08c Fix bug with -d RELATIVE -t ABSOLUTE
e8920e76 Rename emalloc to xmalloc.
e8e1a3d2 NetBSD defines STD_INSPIRED functions
3411494c Define _CRT_DECLARE_NONSTDC_NAMES for MS-Windows
7c909166 Define NOMINMAX for MS-Windows
24a4d97f 'zdump -' now reads from stdin
e6d6bc3e Pacify gcc -Wsuggest-attribute=format sans snprintf in zdump
99557862 TZNAME_MAXIMUM defaults to 254, not 255
fe5be99d Be more consistent about macro true/false vs 1/0
31f483a1 Remove dependency of asctime on strftime
7ef7ed06 Simplify timeoff redefinition
1bd67a4b Move MKTIME_MIGHT_OVERFLOW definition
67f7e8ab Pacify GCC 15ish -Wzero-as-null-pointer-constant
535a4e8b Pacify GCC 15ish -Wleading-whitespace=blanks
0706ef0b Move iinntt definition
ea814e99 strftime %s no longer is limited to time_t range
41e5344e Fix bug near the year 2**31 - 1 - 1900
4e1de249 Pacify gcc -Wsuggest-attribute=const
ebd2ed92 Don’t define _FILE_OFFSET_BITS if _TIME_BITS
26a649a1 Improve zdump overflow checking
9c8221d7 * private.h: Fix timeoff comment.
9db906a0 Switch from RFC 8536 to 9636 for documentation
af54a9e8 Port better to glibc when used internally there
xiejiamei [Sat, 9 May 2026 06:23:09 +0000 (14:23 +0800)]
x86: Lower non-temporal copy threshold for Hygon
Benchmarks on Hygon processors show that the default non-temporal
threshold is higher than ideal for large copy workloads. As a result,
memcpy and memmove may continue to use the temporal copy path for
longer than is beneficial, increasing cache pollution and reducing
throughput for large copies.
Lower the copy non-temporal threshold to 3/8 of the shared cache size
per thread on Hygon. This allows the non-temporal copy path to be
selected earlier while leaving the memset non-temporal threshold
unchanged.
Rocket Ma [Wed, 13 May 2026 16:12:42 +0000 (09:12 -0700)]
libio: Fix fmemopen_write on appending mode (BZ 34006)
* libio/fmemopen.c: Reference the variable pos instead of c->pos.
In the edge case, one byte should be written at the end of the buffer
instead of returning an error.
Signed-off-by: Rocket Ma <marocketbd@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
elf: Defer all IRELATIVE relocations until after PLT setup
When a shared library is built with -z lazy and its IFUNC resolver calls
a PLT function, the dynamic linker can crash. The resolver runs while
the PLT stubs still hold their raw ELF virtual addresses — l_addr has
not yet been added — so the call branches to an unmapped address.
The old code deferred IRELATIVE entries only to the end of the relocation
range currently being processed (via the r2/end2 scan-ahead mechanism in
elf_dynamic_do_Rel). This was sufficient only when both IRELATIVE and the
JMP_SLOT entries for the PLT functions it needs are in the same section.
On x86-64, aarch64, arm, i386 and most other targets, a file-scope
initialiser of the form
int (*fptr)(void) = some_ifunc;
causes the linker to place R_*_IRELATIVE in .rela.dyn, while JMP_SLOT
entries for any PLT calls made by the resolver live in .rela.plt.
Processing .rela.dyn before .rela.plt means the resolver fires before the
PLT is usable, regardless of where within .rela.dyn IRELATIVE appears.
Fix this by splitting IRELATIVE processing into a separate, explicitly
deferred pass. In elf/do-rel.h:
- Remove the r2/end2 variables and the post-loop IRELATIVE re-scan from
elf_dynamic_do_Rel. IRELATIVE entries are now always skipped in the
non-bootstrap path.
- Add a new elf_dynamic_do_Rel_irelative function that scans a
relocation range and calls elf_machine_rel/elf_machine_lazy_rel for
IRELATIVE and ifunc relocations.
In elf/dynamic-link.h, update _ELF_DYNAMIC_DO_RELOC to use a two-phase
approach for non-bootstrap builds unconditionally (regardless of whether
ranges[1].size is zero):
Phase 1+2: elf_dynamic_do_Rel over .rela.dyn then .rela.plt — processes
everything except IRELATIVE/STT_GNU_IFUNC.
Phase 3+4: elf_dynamic_do_Rel_irelative over .rela.dyn then .rela.plt —
processes only IRELATIVE, by which point all PLT stubs are
valid.
This guarantees that IRELATIVE resolvers can call PLT stubs safely
regardless of which section the linker placed R_*_IRELATIVE in.
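The ordering guarantee can be modelled with a toy relocation walker. This is a sketch only: the types and the function are hypothetical, relocations are reduced to a kind and an id, and "applying" a relocation just records its id so the order can be inspected.

```c
#include <stddef.h>

enum reloc_kind { KIND_NORMAL, KIND_IRELATIVE };

/* Hypothetical, simplified relocation record and range.  */
struct reloc_sketch { enum reloc_kind kind; int id; };
struct range_sketch { const struct reloc_sketch *start; size_t count; };

/* Walk all ranges twice: first everything except IRELATIVE, then only
   IRELATIVE.  Records the processing order in ORDER (capacity CAP) and
   returns the number of entries written.  */
static size_t
process_relocs (const struct range_sketch *ranges, size_t nranges,
                int *order, size_t cap)
{
  size_t n = 0;
  for (int pass = 0; pass < 2; pass++)
    {
      enum reloc_kind want = pass == 0 ? KIND_NORMAL : KIND_IRELATIVE;
      for (size_t r = 0; r < nranges; r++)
        for (size_t i = 0; i < ranges[r].count; i++)
          if (ranges[r].start[i].kind == want && n < cap)
            order[n++] = ranges[r].start[i].id;
    }
  return n;
}
```

Even when an IRELATIVE entry comes first in the first range, it is handled only after every ordinary entry in both ranges, which is the property the resolver depends on.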
Add ELF_MACHINE_IRELATIVE to the architectures that were missing it so
the new skip logic in elf_dynamic_do_Rel is compiled for all targets.
This patch addresses the binutils BZ 13302 [1] from the glibc side, and
also fixes the mold-reported issue [2], which shows that IFUNC relocation
placement and processing can work differently across ABIs.
I checked on all ABIs that support IFUNC (x86_64, i686, aarch64, arm,
loongarch, powerpc, riscv, s390, and sparc), some via qemu-system.
Whenever a large mmap is released the mmap and trim thresholds are updated.
As a result these thresholds grow ever larger which means huge allocations
are always served by arenas rather than mmap. The thresholds can end up as
large as an arena, which completely stops all trimming of the top block.
Remove the code completely - the default thresholds seem way too low for
modern 64-bit targets, but they can be increased separately.
Avinal Kumar [Tue, 5 May 2026 11:28:25 +0000 (16:58 +0530)]
intl: Fix memory leak in _nl_find_domain on allocation failure
When _nl_explode_name() returns -1 (out of memory) and the locale was
resolved through an alias, _nl_find_domain() returns immediately
without freeing the locale copy allocated earlier. Similarly,
when _nl_make_l10nflist() returns NULL, the 'goto out' skips the
alias_value free.
Fix by nesting the _nl_make_l10nflist() call and its result handling
inside 'if (mask != -1)' instead of returning early. Move the
normalized_codeset free inside the same block. Both failure paths
now fall through to the unconditional alias_value free at the end.
Imported from GNU gettext commit 10eafd9e5.
Original author: Bruno Haible <bruno@clisp.org>
Avinal Kumar [Tue, 5 May 2026 15:11:42 +0000 (20:41 +0530)]
intl: Remove pre-C99 fallbacks from plural-exp.c
glibc has required C11 since 2022, making pre-C99 compatibility
paths in plural-exp.c dead code:
- init_germanic_plural(): With C99+, GERMANIC_PLURAL is
initialized at compile time and this function is never called.
Remove the function and the INIT_GERMANIC_PLURAL macro.
- HAVE_STRTOUL guard: Protected strtoul() usage with a manual
digit-parsing fallback. strtoul is in C89 <stdlib.h> and glibc
provides it. Remove the guard and the fallback loop.
Imported from GNU gettext commits ab5990532 and c1d84d656.
Original author: Bruno Haible <bruno@clisp.org>
Avinal Kumar [Tue, 5 May 2026 15:11:41 +0000 (20:41 +0530)]
intl: Remove PRI_MACROS_BROKEN from loadmsgcat.c
PRI_MACROS_BROKEN was a workaround for AIX 4, where inttypes.h
did not properly define the PRI* format macros (PRId8, PRIu32, etc.).
glibc has never supported AIX, and the macro was always hardcoded to 0
under _LIBC, making it dead code.
GNU gettext removed this in commit 267f61670 ("Drop portability to
AIX 4"), since no supported system has broken PRI macros post-C99.
Based on GNU gettext commit 267f61670.
Original author: Bruno Haible <bruno@clisp.org>
Avinal Kumar [Tue, 5 May 2026 15:11:40 +0000 (20:41 +0530)]
intl: Remove IN_LIBGLOCALE dead code
Remove all IN_LIBGLOCALE conditional blocks from intl/. libglocale
was a proposed API from 2005 that was never completed or shipped.
The macro is never defined in glibc or in current GNU gettext, making
every #ifdef IN_LIBGLOCALE block dead code.
GNU gettext removed these in commits starting from 2023. Removing
them from glibc reduces noise and eases future syncs with gettext.
Imported from GNU gettext commit d6a6801c1.
Original author: Bruno Haible <bruno@clisp.org>
The failure tail of __libc_arm_za_disable only leads to
__libc_fatal, so it does not need to preserve call frame state.
Remove the PAC prologue, frame setup, saved cntd value, stack
stores, and associated CFI directives from the fatal path, leaving
only the required SME state shutdown and fatal call.
Add tst-sme-za-disable-fail to exercise the abort path by providing
a TPIDR2 block with non-zero reserved bytes and checking that the
process terminates with SIGABRT and the expected fatal message.
stdio-common: Silence clang -Wfortify-source warning in tst-vfscanf-bz34008
clang does not recognize the 'm' scanf specifier and incorrectly warns
that the buf argument may overflow. Suppress the warning with the
clang-specific DIAG_* macros.
elf: Batch program-header reads in _dl_map_segments (oversight fix)
The fix for BZ 26577 ("Fix stack overflow in _dl_map_object_from_fd
with large e_phnum") removed the alloca for the program-header table
and introduced a streaming iterator (dl_pt_load_iterator) so segments
could be walked without staging the entire table on the stack.
That patch batched reads correctly in _dl_map_object_scan_phdrs (the
first walk, which collects PT_DYNAMIC/PT_TLS/PT_GNU_* metadata), but
overlooked the second walk in _dl_map_segments:
_dl_pt_load_iterator_next issued one pread64 per program header to
find the next PT_LOAD entry. For an object with N program headers
this added N redundant per-phdr syscalls on every dlopen / loader
startup -- regardless of whether the table had already been read by
open_verify into struct filebuf.
Unify both walks behind a single batched helper,
_dl_pt_load_iterator_phdr_at:
- When the program header table fits in the bytes already read by
open_verify into fbp->buf (the common case for nearly all shared
objects), all phdr accesses are served from that buffer with no
syscall at all.
- Otherwise, up to FILEBUF_SIZE / sizeof(ElfW(Phdr)) program headers
are read into fbp->buf with a single pread64; subsequent indices
in the same window hit the buffer.
Both _dl_map_object_scan_phdrs and _dl_pt_load_iterator_next now go
through this helper, eliminating the separate batching logic in
_dl_map_object_scan_phdrs. struct filebuf moves from dl-load.c to
dl-load.h so the inline iterator in dl-map-segments.h can reach
fbp->buf.
The filebuf size is also bumped to ensure the cached fast path
triggers for all observed binaries. A survey of an Ubuntu 24.04
installation (scanning /usr) shows:
Candidate files : 465834
ELF files inspected : 11624
glibc-linked binaries : 10164
Minimum e_phnum : 5
Maximum e_phnum : 14
Average e_phnum : 11.37
Median e_phnum : 11.0
The observed e_phnum is capped at 14 (for instance in gcc's cc1, lto1, perl,
and gdb). The previous FILEBUF_SIZE of 832 on 64-bit fit only 13
program headers after the ELF header (64 + 13*56 = 792), so 64-bit
binaries with 14 phdrs missed the cached path. FILEBUF_SIZE is
bumped from 512/832 to 640/1024 (32-bit / 64-bit) -- enough for at
least 16 program headers on either ABI, leaving headroom over the
observed maximum.
For a typical shared library where open_verify's initial read covers
the program header table, this reduces _dl_map_segments from N
preads to 0. For a worst-case e_phnum that does not fit in fbp->buf,
reads drop from N to ceil(N / phdrs_per_buf) -- the same cost
_dl_map_object_scan_phdrs already pays.
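The sizing arithmetic in this message can be checked with two small helpers. The names are hypothetical; the 64-byte header and 56-byte program header are the 64-bit sizes quoted above, while 52 and 32 bytes are the standard ELF32 header and program header sizes. The sketch assumes the program header table starts immediately after the ELF header.

```c
#include <stddef.h>

/* Number of program headers that fit in a buffer after the ELF header,
   assuming the table immediately follows the header.  */
static size_t
phdrs_in_buf (size_t buf_size, size_t ehdr_size, size_t phdr_size)
{
  return (buf_size - ehdr_size) / phdr_size;
}

/* Batched reads needed for N program headers when each read covers
   PER_READ of them: ceil (N / PER_READ).  */
static size_t
reads_needed (size_t n, size_t per_read)
{
  return (n + per_read - 1) / per_read;
}
```

With the old 832-byte buffer only 13 64-bit program headers fit, so a 14-entry table misses the cached path; with 1024 bytes, 17 fit. For a table that does not fit, each read covers 1024 / 56 = 18 headers.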
No functional change. Tested on x86_64-linux-gnu, aarch64-linux-gnu,
and i686-linux-gnu.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Shamil Abdulaev [Wed, 13 May 2026 05:52:40 +0000 (07:52 +0200)]
libio: Fix race in _IO_new_file_init_internal initialization order [BZ #33785]
_IO_new_file_init_internal linked the new stream into _IO_list_all
before setting fp->_fileno to -1. A concurrent thread that walks
_IO_list_all (for example via fflush (NULL)) could observe the stream
with an uninitialized _fileno before initialization completed.
Set _fileno = -1 before _IO_link_in so the stream is fully
initialized when it becomes visible in the global list.
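The ordering fix follows the usual "initialize, then publish" pattern. The sketch below shows only the ordering and is single-threaded; the struct, the list head, and the function names are hypothetical, and the real code additionally relies on libio's list locking.

```c
#include <stddef.h>

/* Hypothetical stand-in for a FILE object on a global list.  */
struct stream_sketch
{
  int fileno;
  struct stream_sketch *next;
};

static struct stream_sketch *list_head;

/* Make FP visible to anything that walks the list.  */
static void
link_in (struct stream_sketch *fp)
{
  fp->next = list_head;
  list_head = fp;
}

static void
init_stream (struct stream_sketch *fp)
{
  fp->fileno = -1;   /* Finish initialization first ...  */
  link_in (fp);      /* ... and only then publish the stream.  */
}
```

Any walker that finds the stream through the list head then sees a closed descriptor rather than whatever value the field held before initialization.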
This is the residual concurrency defect noted at the end of commit
b657f72fa3 ("libio: Fix deadlock between freopen, fflush (NULL) and
fclose (bug 24963)").
Add libio/tst-file-init-race exercising concurrent fopen/fclose and
fflush (NULL) to detect regressions.
Add a new internal test, `tst-wcsmbs-clone-overflow`, to verify correct
gconv module reference counting. The Makefile is updated to include this
test in the `tests-internal` list and ensure it runs with generated locales.
This test specifically checks that the `__counter` for `gconv_fcts->towc`
does not leak references when `swscanf` is used with a stack-allocated
wide character stream. It ensures that `_IO_wstrfile_fclose_stack`
properly decrements the module reference counter, preventing a module
from staying loaded indefinitely due to unreleased references.
libio: Fix gconv module reference counter overflow in swscanf
The swscanf family of functions creates a wide-oriented FILE stream
on the stack. Initialization of this stream invokes `_IO_fwide`, which
clones the global locale's gconv transformation steps via
`__wcsmbs_clone_conv`. This increments the reference counter (`__counter`)
of the gconv module.
Because the FILE stream is stack-allocated, `fclose` cannot be called,
and so `__gconv_release_step` is never invoked. The counter leaks,
eventually hitting the 32-bit integer overflow limit and aborting the
process.
To resolve this, we introduce `_IO_wstrfile_fclose_stack`, a dedicated
cleanup function for stack-allocated FILE streams. This function invokes
`_IO_FINISH` and correctly releases the gconv steps via
`__gconv_release_step` without attempting to `free` the FILE pointer.
This cleanup function is then hooked into all variants of swscanf right
before they return.
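A minimal model of the leak and its fix, assuming a simple integer reference count: every initialization of a stack-allocated stream takes a reference, so every return path needs a matching release that does not free the stream itself. All names here are hypothetical and only mirror the roles described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a gconv module with a reference counter.  */
struct conv_module
{
  int counter;
};

/* Hypothetical stand-in for the stack-allocated wide stream.  */
struct stack_stream
{
  struct conv_module *conv;
};

/* Initialization takes a reference on the module.  */
static void
stack_stream_init (struct stack_stream *s, struct conv_module *m)
{
  m->counter++;
  s->conv = m;
}

/* Dedicated cleanup: drop the reference, but never free S, because S
   lives on the caller's stack.  */
static void
stack_stream_finish (struct stack_stream *s)
{
  assert (s->conv != NULL && s->conv->counter > 0);
  s->conv->counter--;
  s->conv = NULL;
}

/* Shape of a scanf-like call: init, work, cleanup right before return.  */
static int
scan_once (struct conv_module *m)
{
  struct stack_stream s;
  stack_stream_init (&s, m);
  int result = 0;               /* Parsing would happen here.  */
  stack_stream_finish (&s);
  return result;
}
```

Without the stack_stream_finish call the counter grows by one per call, which is the pattern that eventually overflows.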
elf: Eliminate alloca for program-header table in the ELF loader
The ELF loader allocates the program-header table on the stack with
alloca(e_phnum * sizeof(ElfW(Phdr))) in two places: once in
open_verify to call elf_machine_reject_phdr_p, and again in
_dl_map_object_from_fd to scan segment types. Both fall back to
alloca only when the table does not fit in the initial fbp->buf read;
for a crafted ELF with e_phnum == 0x7FFF this means up to ~1.8 MB
(32767 × 56 bytes on a 64-bit host) on the stack in each call, with
no guard against the combination exhausting the available stack space.
A latent variant of this problem exists even for ordinary shared
libraries when dlopen is called from a thread running with
PTHREAD_STACK_MIN stack (16 KB on Linux). The nptl/tst-minstack-exit
test demonstrates that glibc code paths must operate correctly under
minimum-stack conditions; loading a shared library with even a modest
number of program headers can overflow the remaining stack through the
alloca-based phdr table.
This patch eliminates both allocas by replacing them with a single
_dl_map_object_scan_phdrs function that reads program headers in
fixed-size chunks into the existing fbp->buf scratch buffer (512 B on
32-bit, 832 B on 64-bit) using pread. When all headers fit within
the bytes already captured by open_verify's initial read() call (the
common case), no extra syscall is needed. This should be the case for
most ELF objects, which therefore require no additional syscalls.
The slow path issues as many pread calls as necessary without any stack
growth proportional to e_phnum. The elf_machine_reject_phdr_p interface
is redesigned around a new struct dl_machine_phdr_info and on MIPS this
captures the PT_MIPS_ABIFLAGS entry in-flight, so the compatibility check
in elf_machine_reject_phdr_p no longer needs to re-scan the program-header
table.
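The chunked scan can be sketched as follows. This is not the glibc implementation: struct record is a hypothetical stand-in for ElfW(Phdr), the 64-byte scratch size is arbitrary, and a short read is simply treated as an error. The point is that stack use stays constant regardless of the record count.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical fixed-size record standing in for ElfW(Phdr).  */
struct record
{
  uint32_t type;
  uint32_t value;
};

enum { SCRATCH_SIZE = 64 };     /* Arbitrary size for the sketch.  */

/* Count the records of type WANTED among COUNT records stored at
   OFFSET in FD, reading them in fixed-size chunks.  Returns -1 on a
   failed or short read.  */
static long
count_records (int fd, off_t offset, size_t count, uint32_t wanted)
{
  unsigned char scratch[SCRATCH_SIZE];
  const size_t per_chunk = sizeof scratch / sizeof (struct record);
  long matches = 0;

  assert (fd >= 0);
  while (count > 0)
    {
      size_t n = count < per_chunk ? count : per_chunk;
      size_t bytes = n * sizeof (struct record);
      ssize_t got = pread (fd, scratch, bytes, offset);
      if (got < 0 || (size_t) got != bytes)
        return -1;
      for (size_t i = 0; i < n; i++)
        {
          struct record r;
          memcpy (&r, scratch + i * sizeof r, sizeof r);
          if (r.type == wanted)
            matches++;
        }
      offset += (off_t) bytes;
      count -= n;
    }
  return matches;
}

/* Usage example: write 20 records to a temporary file, every fifth
   one of type 1, then count them through the chunked scanner.  */
static long
demo_count (void)
{
  struct record recs[20];
  for (uint32_t i = 0; i < 20; i++)
    {
      recs[i].type = (i % 5 == 0) ? 1 : 0;
      recs[i].value = i;
    }
  FILE *f = tmpfile ();
  if (f == NULL)
    return -1;
  long result = -1;
  if (fwrite (recs, sizeof recs[0], 20, f) == 20 && fflush (f) == 0)
    result = count_records (fileno (f), 0, 20, 1);
  fclose (f);
  return result;
}
```

With an 8-byte record and a 64-byte buffer, the 20 records in demo_count are visited in three reads of 8, 8 and 4 records.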
Checked on aarch64-linux-gnu, x86_64-linux-gnu, and i686-linux-gnu.
NB: this patch depends on https://sourceware.org/pipermail/libc-alpha/2026-May/177239.html Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
elf: Fix stack overflow in _dl_map_object_from_fd with large e_phnum (BZ 26577)
The _dl_map_object_from_fd uses a VLA (loadcmds[l->l_phnum]) whose size
is proportional to e_phnum. A crafted ELF with e_phnum == 0x7FFF
allocates ~1.5 MB (32767 × 48 bytes on 64-bit machine) on the stack,
which adds to the previous ~1.75 MB alloca for the phdr table that
precedes it.
This patch follows Florian's suggestion [1] to replace the two-pass approach
(collect-then-map) with a single-pass struct dl_pt_load_iterator that
precomputes the metadata needed by _dl_map_segments (p_align_max,
has_holes, first/last segment bounds, nloadcmds) and then yields one
struct loadcmd at a time through _dl_pt_load_iterator_next, holding at
most one loadcmd on the stack at a time. The same iterator is
threaded through _dl_map_segments in dl-map-segments.h.
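The iterator shape described above might look roughly like this. It is a simplified sketch with hypothetical names and fields: an in-memory array stands in for the chunked program header reads, and only a few of the precomputed values are modelled. What it shows is that no per-segment array of load commands is ever built.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { SEG_NULL = 0, SEG_LOAD = 1 };    /* Same values as PT_NULL, PT_LOAD.  */

/* Hypothetical, reduced program header.  */
struct segment
{
  uint32_t type;
  uint64_t vaddr;
  uint64_t memsz;
};

/* Hypothetical iterator: summary data computed up front, then one
   loadable segment handed out per call.  */
struct load_iterator
{
  const struct segment *segs;
  size_t count;
  size_t pos;
  size_t nloads;
  uint64_t first_start;
  uint64_t last_end;
};

/* Precompute the summary the mapping code needs before it starts.  */
static void
load_iterator_init (struct load_iterator *it, const struct segment *segs,
                    size_t count)
{
  it->segs = segs;
  it->count = count;
  it->pos = 0;
  it->nloads = 0;
  it->first_start = 0;
  it->last_end = 0;
  for (size_t i = 0; i < count; i++)
    if (segs[i].type == SEG_LOAD)
      {
        if (it->nloads == 0)
          it->first_start = segs[i].vaddr;
        it->last_end = segs[i].vaddr + segs[i].memsz;
        it->nloads++;
      }
}

/* Yield the next loadable segment into *OUT; false when exhausted.  */
static bool
load_iterator_next (struct load_iterator *it, struct segment *out)
{
  assert (out != NULL);
  while (it->pos < it->count)
    {
      const struct segment *s = &it->segs[it->pos++];
      if (s->type == SEG_LOAD)
        {
          *out = *s;
          return true;
        }
    }
  return false;
}
```

A caller loops on load_iterator_next with a single struct segment on its stack, so stack use no longer depends on e_phnum.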
The most complex part is the test, which adds a Python-generated crafted
ET_DYN object with e_phnum == 0x7FFF: one PT_LOAD covering the ELF header
so the loader exercises the full iterator path, and the remaining
headers PT_NULL. The test runs two subtests under a reduced stack limit
(phdr alloca + 1 MB headroom ≈ 2.75 MB, well below the 3.25 MB the
unfixed VLA code requires).
Checked on aarch64-linux-gnu, x86_64-linux-gnu, and i686-linux-gnu.
[1] https://sourceware.org/pipermail/libc-alpha/2026-February/175136.html Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Avinal Kumar [Mon, 4 May 2026 17:22:06 +0000 (22:52 +0530)]
intl: Add tests for plural expression hardening
The first test checks for stack overflow. It uses a plural expression
nested 5000 levels deep using the !(1-(...)) pattern. The parser
accepts it (below YYMAXDEPTH=10000), but evaluation exceeds
EVAL_MAXDEPTH=100 and falls back to index 0 instead of crashing with
SIGSEGV.
The second test checks for division by zero in plural expression. The
expression (n!=1)+1/(n!=1729) triggers 1/0 for n=1729. msgfmt only
validates 0<= n <= 1000, so the .mo file is accepted. Evaluation
returns PE_INTDIV and falls back instead of raising SIGFPE.
Adaptations from gettext to glibc:
- gettext's plural-3 embeds the nested expression as a literal string.
This test uses an AWK script (plural-depth.awk) to generate the same
expression.
- gettext uses LANGUAGE= (empty) with LC_ALL=ll and its own locale
setup. glibc requires a real locale for setlocale() or else the "C"
locale override in dcigettext.c ignores LANGUAGE entirely.
The tests are derived from GNU gettext's plural-3 (commit 021348871a22)
and plural-4 (commit 429ba6c6b835), adapted to glibc's test framework.
Avinal Kumar [Mon, 4 May 2026 17:22:05 +0000 (22:52 +0530)]
intl: Import plural expression hardening from GNU gettext
The plural expression evaluator plural_eval() in eval-plural.h uses
unbounded recursion, which can cause a stack overflow crash with
deeply nested expressions in malicious .mo files. This is
particularly dangerous on threads with small stacks (musl libc
default: 128 KB, AIX 7 default: 96 KB, glibc after ulimit -s 260:
~3919 recursions max).
Additionally, division by zero in plural expressions triggers
raise(SIGFPE), which is not multithread-safe: catching SIGFPE
requires per-process signal handlers that race with other threads.
Fix both by importing the hardening from GNU gettext:
- Replace unbounded plural_eval() with depth-limited
plural_eval_recurse() (EVAL_MAXDEPTH=100), returning a
struct eval_result with status instead of a bare unsigned long.
- Return PE_INTDIV status on division by zero instead of raising
SIGFPE. Remove the architecture-specific INTDIV0_RAISES_SIGFPE
macro and the conditional #include <signal.h>.
- Update plural_lookup() in dcigettext.c to handle the new return
type, falling back to index 0 on any evaluation failure.
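The three points above can be sketched with a reduced expression tree. This is an illustration, not the imported gettext code: the names EVAL_MAXDEPTH, PE_INTDIV and struct eval_result come from the commit message, while PE_OK, PE_STACKOVF, struct expr and the function names are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

#define EVAL_MAXDEPTH 100

/* PE_INTDIV is named in the commit; the other two are assumed.  */
enum eval_status { PE_OK, PE_INTDIV, PE_STACKOVF };

struct eval_result
{
  enum eval_status status;
  unsigned long value;
};

/* Hypothetical, reduced expression tree.  */
enum op { OP_NUM, OP_VAR, OP_ADD, OP_DIV, OP_NOT };

struct expr
{
  enum op op;
  unsigned long num;
  const struct expr *lhs;
  const struct expr *rhs;
};

/* Depth-limited evaluation that reports failures through a status
   instead of crashing or raising a signal.  */
static struct eval_result
eval_recurse (const struct expr *e, unsigned long n, unsigned int depth)
{
  struct eval_result r = { PE_OK, 0 };

  assert (e != NULL);
  if (depth > EVAL_MAXDEPTH)
    {
      r.status = PE_STACKOVF;
      return r;
    }
  switch (e->op)
    {
    case OP_NUM:
      r.value = e->num;
      return r;
    case OP_VAR:
      r.value = n;
      return r;
    case OP_NOT:
      r = eval_recurse (e->lhs, n, depth + 1);
      if (r.status == PE_OK)
        r.value = !r.value;
      return r;
    default:
      break;
    }

  struct eval_result l = eval_recurse (e->lhs, n, depth + 1);
  if (l.status != PE_OK)
    return l;
  struct eval_result rr = eval_recurse (e->rhs, n, depth + 1);
  if (rr.status != PE_OK)
    return rr;
  if (e->op == OP_DIV)
    {
      if (rr.value == 0)
        {
          r.status = PE_INTDIV;   /* Report it; do not raise SIGFPE.  */
          return r;
        }
      r.value = l.value / rr.value;
    }
  else
    r.value = l.value + rr.value;
  return r;
}

/* Caller side: any evaluation failure falls back to index 0.  */
static unsigned long
plural_index (const struct expr *e, unsigned long n)
{
  struct eval_result r = eval_recurse (e, n, 0);
  return r.status == PE_OK ? r.value : 0;
}
```

For the expression 1 + 1/n, plural_index returns 2 for n = 1 and falls back to 0 for n = 0, and a chain of 200 nested negations is rejected with the depth status rather than recursing further.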
Based on GNU gettext commits ef37a1540 and 726bfb1d1.
Discussed on: https://sourceware.org/pipermail/libc-alpha/2023-October/152010.html
are declared as extern inline, but no translation unit provides their
real definitions. This can lead to a link failure if the functions are
not inlined. Fix it by declaring them as static inline instead.
Yao Zihong [Tue, 5 May 2026 21:22:29 +0000 (16:22 -0500)]
riscv: Add RVV strncmp for both multiarch and non-multiarch builds
This patch adds an RVV-optimized implementation of strncmp for RISC-V and
enables it for both multiarch (IFUNC) and non-multiarch builds.
The implementation integrates Hau Hsu's 2023 RVV work under a unified
ifunc-based framework. A vectorized version (__strncmp_vector) is added
alongside the generic fallback (__strncmp_generic). The runtime resolver
selects the RVV variant when RISCV_HWPROBE_KEY_IMA_EXT_0 reports vector
support (RVV).
Currently, the resolver still selects the RVV variant even when the RVV
extension is disabled via prctl(). As a consequence, any process that
has RVV disabled via prctl() will receive SIGILL when calling strncmp().
Co-authored-by: Hau Hsu <hau.hsu@sifive.com> Co-authored-by: Jerry Shih <jerry.shih@sifive.com> Signed-off-by: Yao Zihong <zihong.plct@isrc.iscas.ac.cn> Reviewed-by: Peter Bergner <bergner@tenstorrent.com>
Yao Zihong [Tue, 5 May 2026 21:12:37 +0000 (16:12 -0500)]
riscv: Add RVV strcmp for both multiarch and non-multiarch builds
This patch adds an RVV-optimized implementation of strcmp for RISC-V and
enables it for both multiarch (IFUNC) and non-multiarch builds.
The implementation integrates Hau Hsu's 2023 RVV work under a unified
ifunc-based framework. A vectorized version (__strcmp_vector) is added
alongside the generic fallback (__strcmp_generic). The runtime resolver
selects the RVV variant when RISCV_HWPROBE_KEY_IMA_EXT_0 reports vector
support (RVV).
Currently, the resolver still selects the RVV variant even when the RVV
extension is disabled via prctl(). As a consequence, any process that
has RVV disabled via prctl() will receive SIGILL when calling strcmp().
Co-authored-by: Hau Hsu <hau.hsu@sifive.com> Co-authored-by: Jerry Shih <jerry.shih@sifive.com> Signed-off-by: Yao Zihong <zihong.plct@isrc.iscas.ac.cn> Reviewed-by: Peter Bergner <bergner@tenstorrent.com>
Yury Khrustalev [Wed, 6 May 2026 12:29:56 +0000 (13:29 +0100)]
support: add support_ptr_after_free
Some tests use pointers after the associated memory has been freed.
On targets that support memory tagging, using such pointers even
for test purposes might be impossible. To work around this, we add
a new function that allows clearing a pointer in a target-specific
way.
We modify 3 relevant malloc tests: tst-malloc-backtrace, tst-tcfree3,
and tst-safe-linking.
Rocket Ma [Sat, 18 Apr 2026 06:48:41 +0000 (23:48 -0700)]
stdio-common: Fix buffer overflow in scanf %mc [BZ #34008]
* stdio-common/vfscanf-internal.c: When enlarging the allocated buffer for
format %mc or %mC, glibc allocated one byte too few, leading to a
user-controlled one-byte overflow. This commit fixes BZ #34008, or
CVE-2026-5450.
Reviewed-by: Carlos O'Donell <carlos@redhat.com> Signed-off-by: Rocket Ma <marocketbd@gmail.com> Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Uros Bizjak [Wed, 6 May 2026 15:07:54 +0000 (17:07 +0200)]
i386: Replace inline asm rotates in pointer_guard with stdc_rotate_{left,right}
Use the C23 <stdbit.h> rotation helpers instead of inline assembly
for pointer mangling and demangling on i386.
The PTR_MANGLE and PTR_DEMANGLE macros previously used rol/ror
inline asm with a constant rotation of 9. Replace these with
stdc_rotate_left and stdc_rotate_right operating on uintptr_t,
preserving the exact rotation count via 2 * sizeof (uintptr_t) + 1.
This change removes inline assembly, improves portability and
readability and lets the compiler select optimal code generation.
Uros Bizjak [Wed, 6 May 2026 15:05:46 +0000 (17:05 +0200)]
x86_64: Replace inline asm rotates in pointer_guard with stdc_rotate_{left,right}
Use the C23 <stdbit.h> rotation helpers instead of inline assembly
for pointer mangling and demangling on x86_64.
The PTR_MANGLE and PTR_DEMANGLE macros previously used rol/ror
inline asm with a constant rotation of 2 * LP_SIZE + 1. Replace
these with stdc_rotate_left and stdc_rotate_right operating on
uintptr_t, preserving the exact rotation count via
2 * sizeof (uintptr_t) + 1.
This change removes inline assembly, improves portability and
readability and lets the compiler select optimal code generation.
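A portable sketch of the mangling scheme, assuming the usual XOR-then-rotate order. Hand-written shifts stand in for stdc_rotate_left and stdc_rotate_right, since <stdbit.h> is not available with every toolchain; the function names are hypothetical, and the guard is passed as a plain argument here rather than read from thread-local storage.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define PTR_BITS (sizeof (uintptr_t) * CHAR_BIT)
/* Rotation count from the commit message: 17 on x86_64, 9 on i386.  */
#define GUARD_ROTATE (2 * sizeof (uintptr_t) + 1)

/* Portable stand-in for stdc_rotate_left on uintptr_t.  */
static uintptr_t
rotate_left (uintptr_t x, unsigned int n)
{
  n %= PTR_BITS;
  return n == 0 ? x : (x << n) | (x >> (PTR_BITS - n));
}

/* Portable stand-in for stdc_rotate_right on uintptr_t.  */
static uintptr_t
rotate_right (uintptr_t x, unsigned int n)
{
  n %= PTR_BITS;
  return n == 0 ? x : (x >> n) | (x << (PTR_BITS - n));
}

/* Mangle: XOR with the guard, then rotate left.  */
static uintptr_t
ptr_mangle (uintptr_t p, uintptr_t guard)
{
  return rotate_left (p ^ guard, GUARD_ROTATE);
}

/* Demangle: rotate right, then XOR with the same guard.  */
static uintptr_t
ptr_demangle (uintptr_t m, uintptr_t guard)
{
  return rotate_right (m, GUARD_ROTATE) ^ guard;
}
```

Because the two operations are exact inverses, ptr_demangle (ptr_mangle (p, g), g) yields p for any pointer value and guard.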
Carlos O'Donell [Thu, 7 May 2026 14:41:36 +0000 (10:41 -0400)]
Drop "(C) YYYY" from DCO'd contributions.
Contributions made under DCO use a generic statement to indicate that
the file has copyright, but that statement does not need to include a
year. Remove the year to avoid the work required to update that
statement to include future years as such updates are not required.
Rocket Ma [Sat, 2 May 2026 03:39:07 +0000 (20:39 -0700)]
libio: Fix ungetwc operating on byte stream [BZ #33998]
* libio/wgenops.c: When _IO_wdefault_pbackfail attempts to push back one
character, it accidentally compares the wchar to push back with the last
char from the byte stream instead of the wide stream. Under specific
encodings, an attacker may exploit this to leak information. This commit
fixes bug 33998, or CVE-2026-5928.
Signed-off-by: Rocket Ma <marocketbd@gmail.com> Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Uros Bizjak [Wed, 6 May 2026 13:15:35 +0000 (15:15 +0200)]
stdlib: add missing stdc_rotate_right_ull alias when builtin is available
When __builtin_stdc_rotate_right is supported, glibc defines type-specific
aliases for several unsigned integer types (uc, us, ui, ul), but omits the
unsigned long long variant. This leads to an inconsistency between the
builtin-backed path and the generic fallback, where unsigned long long
is handled.
Add the missing stdc_rotate_right_ull macro mapping to
stdc_rotate_right(__x, __n) to complete the set of type-specific helpers
and ensure consistent API coverage across all supported unsigned integer
types.
No functional change for existing users; this only exposes the expected
alias for unsigned long long.
Fixes: 331c7a4cd0ee ("stdbit: Fix builtin name used in __glibc_has_builtin check for rotate_right") Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Uros Bizjak [Wed, 6 May 2026 09:21:39 +0000 (11:21 +0200)]
stdbit: Fix builtin name used in __glibc_has_builtin check for rotate_right
The __glibc_has_builtin check in include/stdbit.h incorrectly refers to
___builtin_stdc_rotate_right (with three leading underscores) instead of the
correct __builtin_stdc_rotate_right (two leading underscores). As a result,
the builtin is not detected even when supported by the compiler.
Fix the spelling to use __builtin_stdc_rotate_right consistently in both the
feature test and the corresponding comment.
Yao Zihong [Thu, 30 Apr 2026 20:15:36 +0000 (15:15 -0500)]
riscv: Add RVV strlen for both multiarch and non-multiarch builds
This patch adds an RVV-optimized implementation of strlen for RISC-V and
enables it for both multiarch (IFUNC) and non-multiarch builds.
The implementation integrates Hau Hsu's 2023 RVV work under a unified
ifunc-based framework. A vectorized version (__strlen_vector) is added
alongside the generic fallback (__strlen_generic). The runtime resolver
selects the RVV variant when RISCV_HWPROBE_KEY_IMA_EXT_0 reports vector
support (RVV).
Currently, the resolver still selects the RVV variant even when the RVV
extension is disabled via prctl(). As a consequence, any process that
has RVV disabled via prctl() will receive SIGILL when calling strlen().
Co-authored-by: Hau Hsu <hau.hsu@sifive.com> Co-authored-by: Jerry Shih <jerry.shih@sifive.com> Signed-off-by: Yao Zihong <zihong.plct@isrc.iscas.ac.cn> Reviewed-by: Peter Bergner <bergner@tenstorrent.com>
Jiho Lee [Wed, 29 Apr 2026 01:00:14 +0000 (10:00 +0900)]
localedata: update LC_ADDRESS and LC_NAME for ko_KR
Update the South Korean (ko_KR) locale to reflect official standards:
- LC_ADDRESS: Follow the Large-to-Small hierarchy (Country, Postcode,
City, Road, Recipient) as per Korea Post guidelines.
(https://www.koreapost.go.kr/kpost/subIndex/135.do?pSiteIdx=125)
- LC_NAME: Follow the standard Korean order (Surname + Given Name)
without Western-style salutations or tabs.
Extend the Prefer_No_AVX512 tuning to cover Hygon model 0x8.
Benchmarks on Hygon platforms show that EVEX implementations
are often more profitable than AVX512 paths. The existing logic
already enables Prefer_No_AVX512 for model 0x7. Apply the same
preference to model 0x8 to ensure consistent IFUNC selection
behavior across newer Hygon processors.
Signed-off-by: xiejiamei <xiejiamei@hygon.cn> Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
* stdio-common/reg-modifier.c: The wchar in str can be greater than or
equal to 0 and less than or equal to UCHAR_MAX, which means we need a
buffer with UCHAR_MAX + 1 elements so that user input will not overflow
__printf_modifier_table.
Signed-off-by: Rocket Ma <marocketbd@gmail.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Petr Menšík [Thu, 16 Oct 2025 14:18:23 +0000 (16:18 +0200)]
Return different exit codes when gai_result is > -100
Make the result checkable from the command line even without verbose
mode. Keep original exit status 2 for name not found error. But report
other errors by exit status greater than 10.
For values that are too high, return 2 as before.
Signed-off-by: Petr Menšík <pemensik@redhat.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Petr Menšík [Thu, 16 Oct 2025 14:18:21 +0000 (16:18 +0200)]
nss: Add verbose flag to getent tool
Unlike the older hosts database served by gethostbyname, a getaddrinfo
call can return varying return codes. Those codes can be vital for
explaining why name resolution on the system did not return an address.
Although the getent tool is usually present even on small container
images, there is often no helpful tool to show getaddrinfo errors.
This simple change adds a verbosity flag to getent. With it, getent can
provide more details about the reason for the failure. It can help to
determine whether the queried name does not exist or exists but has no
address of the requested types.
The only databases where this will help are the ahosts* variants. I have
not found any kind of test to expand with the new verbose flag. But I think
this would be very useful on various limited systems, where bind-utils is
not installed by default. Besides, sometimes a getaddrinfo call can return
different information than the DNS protocol itself.
Example of usage:
nss/getent -v ahosts com.
This will tell you the name exists, but has no address.
Signed-off-by: Petr Menšík <pemensik@redhat.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The cve.mitre.org site is an archive that redirects users to cve.org.
The linked about page has also been removed, so this patch changes it to
reference the current equivalent on cve.org.
The URL https://www.gnu.org/software/libc/bugs.html now redirects to
a page with no bug-reporting instructions. Point to the glibc wiki
Bugzilla Procedures page instead.
Signed-off-by: Shamil Abdulaev <ashamil435@gmail.com> Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
The existing paragraph warning about _FILE_OFFSET_BITS default changes
due to Y2038 is correct but confusing, as it does not explicitly state
why time_t concerns affect _FILE_OFFSET_BITS.
Clarify that _TIME_BITS=64 (needed for Y2038 safety) requires
_FILE_OFFSET_BITS=64, so when systems migrate to 64-bit time_t by
default, _FILE_OFFSET_BITS will also need to default to 64, even for
applications that do not handle large files.
This addresses the confusion noted in the bug report while keeping the
warning in place, as the transitive dependency makes it relevant to
the _FILE_OFFSET_BITS documentation.
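A minimal sketch of what the dependency means for application code: both macros are defined together, before any system header. To my understanding, glibc rejects _TIME_BITS=64 at compile time on 32-bit targets when _FILE_OFFSET_BITS is not also 64; on 64-bit targets both definitions are harmless.

```c
/* Both macros must be defined before the first system header.  */
#define _FILE_OFFSET_BITS 64
#define _TIME_BITS 64

#include <assert.h>
#include <sys/types.h>
#include <time.h>

/* Fail the build, not the run, if either type is still 32 bits wide.  */
static_assert (sizeof (off_t) == 8, "off_t must be 64 bits wide");
static_assert (sizeof (time_t) == 8, "time_t must be 64 bits wide");
```

The same effect is commonly obtained by passing -D_FILE_OFFSET_BITS=64 -D_TIME_BITS=64 on the compiler command line instead of editing sources.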
Rocket Ma [Fri, 24 Apr 2026 17:27:59 +0000 (10:27 -0700)]
misc: Optimize getusershell.c
* misc/getusershell.c: Completely rewrite the unit. Only allocate one
big buffer to store shell names. Add a missing unit test.
The new implementation reads the whole file into one buffer and wipes out
every byte except the shell names. Later, when walking the shell names from
the first shell, it jumps to the next '\0' and then to the next '/'. This
could reduce the memory footprint and should improve performance somewhat.
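The single-buffer technique can be sketched like this. It is an illustration under assumptions, not the code in misc/getusershell.c: the function names are hypothetical, a shell name is taken to start at the first '/' on a line and end at whitespace or '#', and the buffer is assumed to end with a NUL byte.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Overwrite with '\0' every byte of BUF[0..LEN) that is not part of a
   shell name, so the names remain as NUL-separated strings in place.  */
static void
wipe_non_names (char *buf, size_t len)
{
  int in_name = 0, in_noise = 0;
  for (size_t i = 0; i < len; i++)
    {
      char c = buf[i];
      if (c == '\n')
        {
          in_name = in_noise = 0;
          buf[i] = '\0';
        }
      else if (in_noise)
        buf[i] = '\0';
      else if (in_name)
        {
          if (c == '#' || c == ' ' || c == '\t')
            {
              in_name = 0;
              in_noise = 1;     /* Ignore the rest of the line.  */
              buf[i] = '\0';
            }
        }
      else if (c == '/')
        in_name = 1;
      else
        {
          if (c == '#')
            in_noise = 1;
          buf[i] = '\0';
        }
    }
}

/* Return the name after CUR, or the first name when CUR is NULL:
   jump to the next '\0', then to the next '/'.  NULL when done.  */
static const char *
next_name (const char *buf, size_t len, const char *cur)
{
  if (cur == NULL)
    return memchr (buf, '/', len);
  size_t pos = (size_t) (cur - buf);
  assert (pos < len);
  const char *nul = memchr (cur, '\0', len - pos);
  if (nul == NULL)
    return NULL;
  return memchr (nul, '/', len - (size_t) (nul - buf));
}
```

A caller wipes the buffer once and then repeatedly calls next_name, passing the previous result, until it returns NULL; no per-name allocation is needed.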
Signed-off-by: Rocket Ma <marocketbd@gmail.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Stefan Liebler [Thu, 23 Apr 2026 12:43:33 +0000 (14:43 +0200)]
Remove EXIT_UNSUPPORTED in stdlib/test-bz22786 if path is NULL
With commit 6c3a8a9d868a8deddf0d6dcc785b6d120de90523 (2018-08-25), the test
used xmalloc instead of malloc and therefore removed the path == NULL check
as xmalloc prints an error message and exits with a failure in this case.
On s390-32 this was always a FAIL instead of UNSUPPORTED, thus the previous
behaviour was re-enabled with commit 3bad2358d67d371497079bba4f8eca9c0096f4e2
five days later on 2018-08-30. Therefore, we don't know if this also happens
on other systems.
While removing s390-32 with commit b01debcd8f5229860b3224ea135b1b8456281cee
I've adjusted the comment and Adhemerval asked whether this can also happen
on other systems with little physical memory. We've decided to remove the
EXIT_UNSUPPORTED in this extra commit instead of the large s390-32 removal one.
See libc-alpha:
https://inbox.sourceware.org/libc-alpha/20260409085102.3475867-1-stli@linux.ibm.com/T/#m28b5375bef4cfb10729b93c7e658b91a14b07b85
If this change leads to test failures somewhere, please add a comment about
the system you used and revert this commit.
Nowadays path is allocated with support_blob_repeat_allocate which returns
an empty struct in case of malloc/mmap is not able to allocate enough memory.
All other tests using support_blob_repeat_allocate
(stdlib/tst-strtod-overflow.c, support/tst-support_blob_repeat.c and
string/tst-memmove-overflow.c) are properly checking the start or size field
directly or indirectly via TEST_COMPARE_BLOB.
While the test support/tst-support_blob_repeat.c just prints a warning if
allocating the large mappings is not possible, the other tests exit with
UNSUPPORTED.
At least for the realpath-part, the commit 855a67c3cc81be4fc806c66e3e01b53e352a4e9f introduced support_accept_oom handling.
According to the discussion:
https://inbox.sourceware.org/libc-alpha/8a1fd5b2-5118-498e-babf-e46c0e6d1cdf@redhat.com/
Agreed, test-bz22786 can use a lot of memory.
OK. These convert OOM to UNSUPPORTED for the test if there isn't enough memory.
In case of not enough memory while allocating path, this change would lead to a
segmentation fault instead of UNSUPPORTED. As this is inconsistent compared to
the second realpath-part and also to the other tests using
support_blob_repeat_allocate, I would prefer keeping UNSUPPORTED if path is NULL.
Nevertheless, I've posted this patch for discussion as promised while reviewing
the s390-32 removal patch. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Many tests use Glibc tunables, and the values of the tunables are
provided via the GLIBC_TUNABLES env variable. Tests set it in
makefiles using
tst-foo-ENV = GLIBC_TUNABLES=tunable=value
This overwrites environment for this test, so if another env var is
set elsewhere, one of these changes would be lost. The correct way
should be to append to test's environment:
tst-foo-ENV += GLIBC_TUNABLES=tunable=value
However, if two or more tunables need to be set for the same test,
the 'tunable=value' part should be appended to previously defined
GLIBC_TUNABLES env variable, and it's not easy to achieve this via
existing tools available for tests.
Additionally, there are cases when it is useful to set ambient env
var GLIBC_TUNABLES in order to apply it to most of the tests except
those that require specific tunables. The existing mechanism that
relies on tst-foo-ENV would always override the ambient env var
even when it is not desirable.
To address all of this, in this commit we add support for using
constructs like
tst-foo-TUNABLES += tunable=value
Using this, the test will receive appropriate GLIBC_TUNABLES contents,
and if there is an ambient env var GLIBC_TUNABLES, its value will be
prepended to the env var used by the test. Even if the ambient env var
contains the same tunable that is used by a test, the test's value will
override the ambient value, and the test will be executed correctly.
Additionally, we support cases when tests must have specific value
of the GLIBC_TUNABLES env var (ignoring any ambient value):
tst-foo-TUNABLES-only += tunable=value
The existing mechanism that uses tst-foo-ENV will continue to work,
however if the same test uses both, the new mechanism will override
the old one.
Additional benefit is that the code in makefiles becomes shorter.
We also change tunable handling for malloc tests in this commit.
posix: fix false regex match with backrefs and $ anchor
This fixes the $ anchor being ignored in the following grep command:
$ grep -E '^(.?)(.?).?\2\1$' <<< ab
ab
However, the regular expression should only match palindromes.
This patch is mostly copied from a commit in Gnulib from Jim Meyering
[1], and a followup commit by Paul Eggert [2]. It was found by Ed Morton
in GNU sed [3].
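The same pattern can be exercised directly through the POSIX regex API. This sketch only wraps regcomp and regexec; the function name is hypothetical, and back-references in extended regular expressions are an extension that glibc supports but other C libraries may not.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Return 1 if LINE matches the palindrome pattern from the grep
   command above, 0 if it does not, and -1 if the pattern cannot be
   compiled.  */
static int
matches_short_palindrome (const char *line)
{
  regex_t re;

  assert (line != NULL);
  if (regcomp (&re, "^(.?)(.?).?\\2\\1$", REG_EXTENDED) != 0)
    return -1;
  int rc = regexec (&re, line, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}
```

A true palindrome such as "aba" matches regardless of the fix; the input "ab" from the grep example is the case whose result is expected to differ between an affected and a fixed library.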
Yao Zihong [Tue, 21 Apr 2026 19:58:10 +0000 (14:58 -0500)]
riscv: Add RVV memcpy for both multiarch and non-multiarch builds
This patch adds an RVV-optimized implementation of memcpy for RISC-V and
enables it for both multiarch (IFUNC) and non-multiarch builds.
The implementation integrates Hau Hsu's 2023 RVV work under a unified
ifunc-based framework. A vectorized version (__memcpy_vector) is added
alongside the generic fallback (__memcpy_generic). The runtime resolver
selects the RVV variant when RISCV_HWPROBE_KEY_IMA_EXT_0 reports vector
support (RVV).
Currently, the resolver still selects the RVV variant even when the RVV
extension is disabled via prctl(). As a consequence, any process that
has RVV disabled via prctl() will receive SIGILL when calling memcpy().
Co-authored-by: Hau Hsu <hau.hsu@sifive.com> Co-authored-by: Jerry Shih <jerry.shih@sifive.com> Signed-off-by: Yao Zihong <zihong.plct@isrc.iscas.ac.cn> Reviewed-by: Peter Bergner <bergner@tenstorrent.com>
Yury Khrustalev [Wed, 25 Mar 2026 10:04:19 +0000 (10:04 +0000)]
support: add support_address_diff function
Some malloc tests compare pointers meaning to compare addresses.
On AArch64, the 64-bit value of the pointer may contain metadata
along with the values of the address.
In order to correctly compare addresses, we add a new function for
the AArch64 target that uses the AArch64 SUBP (subtract pointer)
instruction when it is available. This instruction uses the 56-bit
addresses, ignoring top-byte metadata.
Best implementation is selected using ifunc resolver.
On other targets and also on AArch64 when MTE is not available this
function defaults to PTR_DIFF defined in libc-pointer-arith.h.
Three malloc tests are modified accordingly:
- tst-memalign-2.c
- tst-memalign-3.c
- tst-realloc.c
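The effect of ignoring the top byte can be approximated in portable C. This is not the support_address_diff implementation and does not use SUBP; it is a sketch that, to my understanding, mirrors what the instruction computes by sign-extending the low 56 bits before subtracting. The function names are hypothetical, and addresses are passed as 64-bit integers.

```c
#include <assert.h>
#include <stdint.h>

/* Keep the low 56 bits of V and sign-extend from bit 55, which
   discards any metadata stored in the top byte.  The final conversion
   relies on the usual two's-complement behaviour of GCC and Clang.  */
static int64_t
address_bits (uint64_t v)
{
  uint64_t low = v & ((UINT64_C (1) << 56) - 1);
  if (low & (UINT64_C (1) << 55))
    low |= UINT64_C (0xff) << 56;
  return (int64_t) low;
}

/* Difference of two addresses, ignoring top-byte metadata.  */
static int64_t
address_diff (uint64_t a, uint64_t b)
{
  assert (sizeof (uint64_t) == 8);
  return address_bits (a) - address_bits (b);
}
```

Two pointers to the same location that carry different tags therefore compare as equal, whereas a plain 64-bit subtraction would report a large difference.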
Yao Zihong [Mon, 20 Apr 2026 21:19:08 +0000 (16:19 -0500)]
riscv: Add RVV strcpy for both multiarch and non-multiarch builds
This patch adds an RVV-optimized implementation of strcpy for RISC-V and
enables it for both multiarch (IFUNC) and non-multiarch builds.
The implementation integrates Hau Hsu's 2023 RVV work under a unified
ifunc-based framework. A vectorized version (__strcpy_vector) is added
alongside the generic fallback (__strcpy_generic). The runtime resolver
selects the RVV variant when RISCV_HWPROBE_KEY_IMA_EXT_0 reports vector
support (RVV).
Currently, the resolver still selects the RVV variant even when the RVV
extension is disabled via prctl(). As a consequence, any process that
has RVV disabled via prctl() will receive SIGILL when calling strcpy().
Co-authored-by: Hau Hsu <hau.hsu@sifive.com> Co-authored-by: Jerry Shih <jerry.shih@sifive.com> Signed-off-by: Yao Zihong <zihong.plct@isrc.iscas.ac.cn> Reviewed-by: Peter Bergner <bergner@tenstorrent.com>
Stefan Liebler [Tue, 21 Apr 2026 12:50:15 +0000 (14:50 +0200)]
s390: Remove Wno-CFLAGS for rtld.c/dl-load.c/dl-reloc.c
During review of the s390-32 removal, Adhemerval asked if those CFLAGS are still
necessary:
https://inbox.sourceware.org/libc-alpha/20260409085102.3475867-1-stli@linux.ibm.com/T/#me5120906445f3941031e29c3a093f1699eae77b4
According to the git-history, the first s390-Makefile was introduced back in
2000-08-02 with those CFLAGS. The same are also included now and past in
i386-Makefile. But I haven't found a reason why those were added in the past
and if it was really necessary on s390. I assume it was with old GCCs most
likely due to inclusion of dl-machine.h.
This patch removes those CFLAGS. If needed, we have to circumvent the issues
again. At least I've used current GCCs 12.5, 13.4, 14.3, 15.2 and gcc-head
to successfully build current glibc on s390-64 with -O2, -O3 and -Os without
such warnings. Reviewed-by: Florian Weimer <fweimer@redhat.com>
Pierre Blanchard [Wed, 15 Apr 2026 08:32:44 +0000 (08:32 +0000)]
AArch64: Implement AdvSIMD and SVE powr(f) routines
Vector variants of the new C23 powr routines.
These provide the same maximum error as pow by virtue of
relying on shared approximation techniques and sources.
Note: Benchmark inputs for powr(f) are identical to pow(f).
Performance gain over pow on V1 with GCC@15:
- SVE powr: 10-12% on subnormal x, 12-13% on x < 0.
- SVE powrf: 15% on all x < 0.
- AdvSIMD powr: for x < 0, 40% if x subnormal, 60% otherwise.
- AdvSIMD powrf: 4% on x subnormals or x < 0.