Florian Weimer [Thu, 13 Nov 2025 08:46:15 +0000 (09:46 +0100)]
hppa: Consistently reference LGPL in copyright header
The file was added with a GPL reference (but LGPL statement) in
commit 0d6bed71502f053fa702ccbb7dd4fa6741b2a0ed ("hppa: Add
____longjmp_check C implementation.").
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Collin Funk <collin.funk1@gmail.com>
Joseph Myers [Thu, 13 Nov 2025 00:04:21 +0000 (00:04 +0000)]
Change fromfp functions to return floating types following C23 (bug 28327)
As discussed in bug 28327, C23 changed the fromfp functions to return
floating types instead of intmax_t / uintmax_t. (Although the
motivation in N2548 was reducing the use of intmax_t in library
interfaces, the new version does have the advantage of being able to
specify arbitrary integer widths for e.g. assigning the result to a
_BitInt, as well as being able to indicate an error case in-band with
a NaN return.)
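As a rough illustration of the signature change (sketch prototypes
only, with made-up names; the exact glibc declarations and parameter
names may differ):

  #include <stdint.h>

  /* TS 18661-1 shape (now a compat symbol, not an API):  */
  intmax_t fromfp_ts18661_style (double x, int rnd, unsigned int width);

  /* C23 shape: the result has the type of the floating argument, so a
     NaN can signal an error in-band and the caller can convert the
     result to any integer width, including a _BitInt.  */
  double fromfp_c23_style (double x, int rnd, unsigned int width);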
As with other such changes from interfaces introduced in TS 18661,
implement the new types as a replacement for the old ones, with the
old functions remaining as compat symbols but not supported as an API.
The test generator used for many of the tests is updated to handle
both versions of the functions.
Tested for x86_64 and x86, and with build-many-glibcs.py.
Also tested tgmath tests for x86_64 with GCC 7 to make sure that the
modified case for older compilers in <tgmath.h> does work.
Also tested for powerpc64le to cover the ldbl-128ibm implementation
and the other things that are handled differently for that
configuration. The new tests fail for ibm128, but all the failures
relate to incorrect signs of zero results and turn out to arise from
bugs in the underlying roundl, ceill, truncl and floorl
implementations that I've reported in bug 33623, rather than
indicating any bug in the actual new implementation of the functions
for that format. So given fixes for those functions (which shouldn't
be hard, and of course should add to the tests for those functions
rather than relying only on indirect testing via fromfp), the fromfp
tests should start passing for ibm128 as well.
Wilco Dijkstra [Tue, 11 Nov 2025 17:46:19 +0000 (17:46 +0000)]
math: Remove float_t and double_t [BZ #33563]
Remove uses of float_t and double_t. This is not useful on modern machines,
and does not help given GCC defaults to -fexcess-precision=fast.
One use of double_t remains to allow forcing the precision to double
on targets where FLT_EVAL_METHOD=2. This fixes BZ #33563 on
i486-pc-linux-gnu.
Wilco Dijkstra [Mon, 10 Nov 2025 13:52:14 +0000 (13:52 +0000)]
math: Remove ldbl-128/s_fma.c
Remove ldbl-128/s_fma.c - it makes no sense to use emulated float128
operations to emulate FMA. Benchmarking shows dbl-64/s_fma.c is about
twice as fast. Remove redundant dbl-64/s_fma.c includes in targets
that were trying to work around this issue.
It was added in Linux 6.10 (8be7258aad44b5e25977a98db136f677fa6f4370)
as a way to block operations on a pre-existing memory mapping, such as
remapping it, moving it to another location, shrinking or expanding its
size, or otherwise modifying it.
Although the system call only works on 64-bit CPUs, the entrypoint was
added for all ABIs (since the kernel might eventually implement it for
additional ones and/or the ABI can execute on a 64-bit kernel).
Checked on x86_64-linux-gnu and aarch64-linux-gnu.
Xi Ruoyao [Fri, 7 Nov 2025 15:49:22 +0000 (23:49 +0800)]
LoongArch: Call elf_ifunc_invoke for R_LARCH_IRELATIVE in elf_machine_rela
When R_LARCH_IRELATIVE is resolved by apply_irel, the ifunc resolver is
called via elf_ifunc_invoke so it can read HWCAP from the __ifunc_arg_t
argument. But when R_LARCH_IRELATIVE is resolved by elf_machine_rela
(which happens if we dlopen() a shared object containing
R_LARCH_IRELATIVE), the ifunc resolver is invoked directly with no
argument, or with a different one. This causes a segfault if the
resolver uses the __ifunc_arg_t argument.
Although the LoongArch psABI does not specify this argument, it is more
convenient to provide it, and per Hyrum's law there may already be
objects in the wild that rely on it (they just have not blown up
because they have not been dlopen()ed yet). So make elf_machine_rela
handle R_LARCH_IRELATIVE the same way as apply_irel does.
math: Optimize frexpl (binary128) with fast path for normal numbers
Add fast path optimization for frexpl (128-bit IEEE quad precision) using
a single unsigned comparison to identify normal floating-point numbers and
return immediately via arithmetic on the exponent field.
The implementation uses arithmetic operations hx = hx - (ex << 48)
to adjust the exponent in place, which is simpler and more efficient than
bit masking. For subnormals, the traditional multiply-based normalization
is retained for reliability with the split 64-bit word format.
The zero/infinity/NaN check groups these special cases together for better
branch prediction.
This optimization provides the same algorithmic improvements as the other
frexp variants while maintaining correctness for all edge cases.
Osama Abdelkader [Thu, 23 Oct 2025 15:06:29 +0000 (18:06 +0300)]
math: Optimize frexp (binary64) with fast path for normal numbers
Add fast path optimization for frexp using a single unsigned comparison
to identify normal floating-point numbers and return immediately via
arithmetic on the bit representation.
The implementation uses asuint64()/asdouble() from math_config.h and arithmetic
operations to adjust the exponent, which generates better code than bit masking
on ARM and RISC-V architectures. For subnormals, stdc_leading_zeros provides
faster normalization than the traditional multiply approach.
The zero/infinity/NaN check is simplified to (int64_t)(ix << 1) <= 0, which
is more efficient than separate comparisons.
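For illustration, a minimal sketch of such a fast path (not the actual
glibc code; asuint64/asdouble mirror the math_config.h helpers, the
function name is made up, and the subnormal path is omitted):

  #include <stdint.h>

  static inline uint64_t asuint64 (double x)
  { union { double f; uint64_t i; } u = { x }; return u.i; }

  static inline double asdouble (uint64_t i)
  { union { uint64_t i; double f; } u = { i }; return u.f; }

  double
  frexp_sketch (double x, int *eptr)
  {
    uint64_t ix = asuint64 (x);
    uint64_t ex = (ix >> 52) & 0x7ff;

    /* One unsigned comparison: true only for normal, nonzero inputs.  */
    if (ex - 1 < 0x7fe)
      {
        *eptr = (int) ex - 1022;
        /* Rewrite the exponent field so the mantissa lands in [0.5, 1);
           the unsigned wrap-around for small EX is harmless.  */
        return asdouble (ix - ((ex - 1022) << 52));
      }

    /* Zero, infinity and NaN are returned unchanged; subnormals would
       need normalization, omitted in this sketch.  */
    *eptr = 0;
    return x;
  }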
Osama Abdelkader [Thu, 23 Oct 2025 15:06:28 +0000 (18:06 +0300)]
math: Optimize frexpf (binary32) with fast path for normal numbers
Add fast path optimization for frexpf using a single unsigned comparison
to identify normal floating-point numbers and return immediately via
arithmetic on the bit representation.
The implementation uses asuint()/asfloat() from math_config.h and arithmetic
operations to adjust the exponent, which generates better code than bit masking
on ARM and RISC-V architectures. For subnormals, stdc_leading_zeros provides
faster normalization than the traditional multiply approach.
The zero/infinity/NaN check is simplified to (int32_t)(hx << 1) <= 0, which
is more efficient than separate comparisons.
Add benchmark support for frexp, frexpf, and frexpl to measure the
performance improvement of the fast path optimization.
- Created frexp-inputs, frexpf-inputs, frexpl-inputs with random test values
- Added frexp, frexpf, frexpl to bench-math list
- Added CFLAGS to disable builtins for accurate benchmarking
These benchmarks will be used to quantify the performance gains from the
fast path optimization for normal floating-point numbers.
string: Check if attribute can be declared after function declaration
Some symbols that might be auto-generated by the compiler are redefined
to an internal alias (for instance mempcpy to __mempcpy). However, if
fortify is enabled, the fortify wrapper is defined before the alias is
redefined, and clang warns that the attribute declaration must precede
the definition.
Use an asm alias instead of an attribute if the compiler does not
support this.
The count_leading_zeros macro is not used anymore, so there is no need
to provide the table for possible usage. The hppa port already provides
the compat symbol in libgcc-compat.c.
The in-use group size is increased to be large enough to trigger ERANGE
for the initial buffers and cause a retry. The actual size is
approximately twice that required to trigger the defect, though any
size larger than NSS_BUFLEN_GROUP triggers it.
Without the fix the group is not merged and the failure is detected,
but with the fix the ERANGE error is handled, buffers are enlarged
and subsequently correctly merged.
Tested with a/b testing before and after patching.
Tested on x86_64 with no regression.
Co-authored-by: Patsy Griffin <patsy@redhat.com>
Reviewed-by: DJ Delorie <dj@redhat.com>
This is disruptive when testing glibc on a system with a newer kernel,
and it seems we can improve the invalid-flag testing by setting all the
bits that are not supposed to be supported (rather than setting only
the next unsupported bit).
Florian Weimer [Thu, 6 Nov 2025 13:49:21 +0000 (14:49 +0100)]
support: Exit on consistency check failure in resolv_response_add_name
Using TEST_VERIFY (crname_target != crname) instructs some analysis
tools that crname_target == crname might hold. Under this assumption,
they report a use-after-free for crname_target->offset below, caused
by the previous free (crname).
Joe Ramsay [Thu, 6 Nov 2025 18:29:33 +0000 (18:29 +0000)]
AArch64: Fix instability in AdvSIMD sinh
Previously, the presence of special cases in one lane could affect the
results in other lanes due to the unconditional scalar fallback. The
old WANT_SIMD_EXCEPT option (which has never been enabled in libmvec)
has been removed from AOR, making it easier to spot and fix this. No
measured change in performance. This patch applies cleanly as far back
as 2.41; however, there are conflicts with 2.40, where sinh was first
introduced.
Joe Ramsay [Thu, 6 Nov 2025 18:26:54 +0000 (18:26 +0000)]
AArch64: Fix instability in AdvSIMD tan
Previously, the presence of special cases in one lane could affect the
results in other lanes due to the unconditional scalar fallback. The
old WANT_SIMD_EXCEPT option (which has never been enabled in libmvec)
has been removed from AOR, making it easier to spot and fix this. 4%
improvement in throughput with GCC 14 on Neoverse V1. This bug is
present as far back as 2.39 (where tan was first introduced).
Joe Ramsay [Thu, 6 Nov 2025 15:36:03 +0000 (15:36 +0000)]
AArch64: Optimise SVE scalar callbacks
Instead of using SVE instructions to marshal special results into the
correct lane, just write the entire vector (and the predicate) to
memory, then use cheaper scalar operations.
Geomean speedup of 16% in special intervals on Neoverse with GCC 14.
linux: Add STATX_WRITE_ATOMIC/STATX_ATTR_WRITE_ATOMIC definitions to generic statx
The commit fc650bfd71081d26c1015d299827fb58a23a6b02 added
STATX_WRITE_ATOMIC/STATX_ATTR_WRITE_ATOMIC to statx-generic.h without
updating the generic statx struct.
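A hedged usage sketch of the new constants (the struct statx field
names are assumed to follow the Linux uAPI, and the file name is just a
placeholder):

  #define _GNU_SOURCE
  #include <fcntl.h>
  #include <stdio.h>
  #include <sys/stat.h>

  int
  main (void)
  {
    struct statx stx;
    /* Ask the kernel for the atomic-write attributes of a file.  */
    if (statx (AT_FDCWD, "somefile", 0, STATX_WRITE_ATOMIC, &stx) == 0
        && (stx.stx_attributes_mask & STATX_ATTR_WRITE_ATOMIC) != 0)
      printf ("atomic write unit min: %u\n",
              (unsigned int) stx.stx_atomic_write_unit_min);
    return 0;
  }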
Linux 6.16 adds no new syscalls, while Linux 6.17 adds file_getattr
and file_setattr (commit be7efb2d20d67f334a7de2aef77ae6c69367e646).
Update syscall-names.list and regenerate the arch-syscall.h headers
with build-many-glibcs.py update-syscalls.
* PIDFD_GET_INFO and pidfd_info (along with related extra flags) to
allow getting information about the process without needing to parse
/proc (commit cdda1f26e74ba, Linux 6.13).
* PIDFD_SELF_{THREAD,THREAD_GROUP,SELF,SELF_PROCESS} to allow
pidfd_send_signal to refer to the calling thread or its thread group
without the need to allocate a file descriptor (commit f08d0c3a71114,
Linux 6.15).
* PIDFD_INFO_COREDUMP that extends PIDFD_GET_INFO to obtain coredump
information.
The Linux uAPI headers define both PIDFD_SELF_THREAD and
PIDFD_SELF_THREAD_GROUP in linux/fcntl.h (since they reserve part of
the AT_* values); however, for glibc I do not see any good reason to
add the pidfd definitions to fcntl-linux.h.
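A minimal sketch of the PIDFD_SELF_* usage described above (assuming
<sys/pidfd.h> exposes the new constant; signal 0 merely checks that the
target can be signalled):

  #define _GNU_SOURCE
  #include <signal.h>
  #include <stddef.h>
  #include <sys/pidfd.h>

  int
  main (void)
  {
    /* Refer to the calling process without allocating a pidfd.  */
    return pidfd_send_signal (PIDFD_SELF_PROCESS, 0, NULL, 0) == 0 ? 0 : 1;
  }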
The tst-pidfd.c test is extended with some PIDFD_SELF_* tests, and a
new tst-pidfd_getinfo.c test is added to check PIDFD_GET_INFO. Testing
PIDFD_INFO_COREDUMP would require very large and complex tests, and it
is already covered by kernel tests.
Checked on aarch64-linux-gnu and x86_64-linux-gnu on kernels 6.8 and
6.17.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Collin Funk [Fri, 23 May 2025 04:00:05 +0000 (21:00 -0700)]
Add feature test macros for POSIX.1-2024.
* include/features.h (_POSIX_C_SOURCE): Document the value of 202405L
for POSIX.1-2024. Set it to 202405L when _GNU_SOURCE or _DEFAULT_SOURCE
is defined.
(_XOPEN_SOURCE): Document the value of 800 for POSIX.1-2024. Set it to
800 when _GNU_SOURCE is defined.
(__USE_XOPEN2K24, __USE_XOPEN2K24XSI): New internal macros. Set them
when _POSIX_C_SOURCE is 202405L or greater and/or when _XOPEN_SOURCE is
800 or greater.
* manual/creature.texi (Feature Test Macros): Document the new values
for _POSIX_C_SOURCE and _XOPEN_SOURCE.
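As a usage illustration (a minimal sketch; either macro can also be
set on the compiler command line):

  /* Request POSIX.1-2024 interfaces.  */
  #define _POSIX_C_SOURCE 202405L
  /* Or, for the XSI option: #define _XOPEN_SOURCE 800  */
  #include <unistd.h>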
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Signed-off-by: Collin Funk <collin.funk1@gmail.com>
Joseph Myers [Tue, 4 Nov 2025 23:41:35 +0000 (23:41 +0000)]
Rename fromfp files in preparation for changing types for C23
As discussed in bug 28327, the fromfp functions changed type in C23
(compared to the version in TS 18661-1); they now return the same type
as the floating-point argument, instead of intmax_t / uintmax_t.
As with other such incompatible changes compared to the initial TS
18661 versions of interfaces (the types of totalorder functions, in
particular), it seems appropriate to support only the new version as
an API, not the old one (although many programs written for the old
API might in fact work with the new one as well). Thus, the existing
implementations should become compat symbols. They are sufficiently
different from how I'd expect to implement the new version that using
separate implementations in separate files is more convenient than
trying to share code, and directly sharing testcases would be
problematic as well.
Rename the existing fromfp implementation and test files to names
reflecting how they're intended to become compat symbols, so freeing
up the existing filenames for a subsequent implementation of the C23
versions of these functions (which is the point at which the existing
implementations would actually become compat symbols).
gen-fromfp-tests.py and gen-fromfp-tests-inputs are not renamed; I
think it will make sense to adapt the test generator to be able to
generate most tests for both versions of the functions (with extra
test inputs added that are only of interest with the C23 version).
The ldbl-opt/nldbl-* files are also not renamed; since those are for a
static only library, no compat versions are needed, and they'll just
have their contents changed when the C23 version is implemented.
Joseph Myers [Tue, 4 Nov 2025 17:12:00 +0000 (17:12 +0000)]
Add C23 long_double_t, _FloatN_t
C23 Annex H adds <math.h> typedefs long_double_t and _FloatN_t
(originally introduced in TS 18661-3), analogous to float_t and
double_t. Add these typedefs to glibc. (There are no _FloatNx_t
typedefs.)
C23 also slightly changes the rules for how such typedef names should
be defined, compared to the definition in TS 18661-3. In both cases,
<TYPE>_t corresponds to the evaluation format for <TYPE>, as specified
by FLT_EVAL_METHOD (for which <math.h> uses glibc's internal
__GLIBC_FLT_EVAL_METHOD). Specifically, each FLT_EVAL_METHOD value
corresponds to some type U (for example, 64 corresponds to U =
_Float64), and for types with exactly the same set of values as U, TS
18661-3 says expressions with those types are to be evaluated to the
range and precision of type U (so <TYPE>_t is defined to U), whereas
C23 only does that for types whose values are a strict subset of those
of type U (so <TYPE>_t is defined to <TYPE>).
As with other cases where semantics changed between TS 18661 and C23,
this patch only implements the newer version of the semantics
(including adjusting existing definitions of float_t and double_t as
needed). The new semantics are contradictory between the main
standard and Annex H for the case of FLT_EVAL_METHOD == 2 and the
choice of double_t when double and long double have the same values
(the main standard says it's defined as long double in that case,
whereas Annex H would define it as double), which I've raised on the
WG14 reflector (but I think setting FLT_EVAL_METHOD == 2 when double
and long double have the same values is a fairly theoretical
combination of features); for now glibc follows the value in the main
standard in that case.
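A small illustration of the evaluation-format typedefs discussed here
(standard C semantics rather than glibc-specific code):

  #include <float.h>
  #include <math.h>

  #if FLT_EVAL_METHOD == 2
  /* float and double expressions are evaluated to long double range
     and precision, so float_t and double_t are both long double
     (e.g. on classic x87).  */
  double_t accumulator;
  #endif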
Note that I think all existing GCC targets supported by glibc only use
values -1, 0, 1, 2 or 16 for FLT_EVAL_METHOD (so most of the header
code is somewhat theoretical, though potentially relevant with other
compilers since the choice of FLT_EVAL_METHOD is only an API choice,
not an ABI one; it can vary with compiler options, and these typedefs
should not be used in ABIs). The testcase (expanded to cover the new
typedefs) is really just repeating the same logic in a second place
(so all it really tests is that __GLIBC_FLT_EVAL_METHOD is consistent
with FLT_EVAL_METHOD).
Tested for x86_64 and x86, and with build-many-glibcs.py.
Peter Bergner [Wed, 3 Sep 2025 19:26:03 +0000 (14:26 -0500)]
riscv: Add vector registers to __SYSCALL_CLOBBERS
The Linux kernel ABI specifies that the vector registers are not preserved
across system calls, but the __SYSCALL_CLOBBERS macro doesn't mention them.
This could possibly lead to compilers trying to keep data in the vector
registers across the syscall, leading to corruption. Add the vector registers
to __SYSCALL_CLOBBERS when the vector extension is enabled. If the vector
extension is enabled, then require GCC 15 or later and RVV 1.0 or later.
Fixes: 36960f0c76 ("RISC-V: Linux Syscall Interface")
Signed-off-by: Peter Bergner <bergner@tenstorrent.com>
Collin Funk [Tue, 4 Nov 2025 02:08:37 +0000 (18:08 -0800)]
Regenerate charmap-kw.h and locfile-kw.h with gperf 3.3
In commit 970364dac00b38333e5b2d91c90d11e80141d265 we switched some
/*FALLTHROUGH*/ comments to [[fallthrough]] to avoid warnings with
Clang. However, since gperf emitted different output the buildbot
failed. The buildbot has been updated to use gperf 3.3 which will use
__attribute__ ((__fallthrough__)) where needed to avoid warnings [1].
This patch regenerates these files with the same version.
math: Remove the SVID error handling from remainder
The optimized i386 version is faster than the generic one, and
gcc implements it through the builtin. This optimization enables
us to migrate the implementation to a C version. The performance
on a Zen3 chip is similar to the SVID one.
The m68k provided an optimized version through __m81_u(remainderf)
(mathimpl.h), and gcc does not implement it through a builtin
(different than i386).
Performance improves a bit on x86_64 (Zen3, gcc 15.2.1):
math: Remove the SVID error handling from remainderf
The optimized i386 version is faster than the generic one, and gcc
implements it through the builtin. This optimization enables us to
migrate the implementation to a C version. The performance on a Zen3
chip is similar to the SVID one.
The m68k provided an optimized version through __m81_u(remainderf)
(mathimpl.h), and gcc does not implement it through a builtin (different
than i386).
Performance improves a bit on x86_64 (Zen3, gcc 15.2.1):
The only usage was for pthread_spin_lock, introduced by 12d2dd706099aa4,
as a way to optimize the code for certain architectures. Now that atomic
builtins are used by default, let the compiler use the best code sequence
for the atomic exchange.
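A hedged sketch of the pattern (not the glibc pthread_spin_lock code;
the builtin lets the compiler pick the preferred sequence per target):

  static int lock;

  void
  spin_lock_sketch (void)
  {
    /* The compiler expands the exchange to swap, ll/sc, cas, etc.,
       whichever suits the target best.  */
    while (__atomic_exchange_n (&lock, 1, __ATOMIC_ACQUIRE) != 0)
      ;
  }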
Now that atomic builtins are used by default, we can rely on the
compiler to define when to use 64-bit atomic operations.
It allows the use of 64-bit atomic operations on some 32-bit ABIs where
they were not previously enabled due to missing pre-processor handling:
hppa, mips64n32, s390, and sparcv9.
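For illustration, the decision is left to the compiler when the
builtins are used directly (a sketch, not glibc code):

  #include <stdint.h>

  uint64_t
  fetch_add_64_sketch (uint64_t *p, uint64_t v)
  {
    /* The compiler emits native 64-bit atomics where the target
       supports them and otherwise falls back to libatomic, so no
       __HAVE_64B_ATOMICS-style preprocessor gate is needed.  */
    return __atomic_fetch_add (p, v, __ATOMIC_ACQ_REL);
  }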
All ABIs, except alpha and sparc, define it to
atomic_full_barrier/__sync_synchronize, which can be mapped to
__atomic_thread_fence (__ATOMIC_RELEASE).
For alpha, it uses a 'wmb', which does not map to any of the C11
barriers.
For sparc, it uses a stronger 'membar #LoadStore | #StoreStore',
where the release barrier maps to just 'membar #StoreLoad'. The
patch keeps the sparc definition.
For PowerPC, it allows the use of lwsync for additional chips
(since _ARCH_PWR4 does not cover all chips that support it).
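The mapping described above amounts to the following (an illustrative
macro name, not the actual glibc definition):

  /* Release/write barrier expressed with the C11-style builtin; alpha
     and sparc keep their custom definitions as noted above.  */
  #define write_barrier_sketch() __atomic_thread_fence (__ATOMIC_RELEASE)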
All ABIs, except alpha, powerpc, and x86_64, define it to
atomic_full_barrier/__sync_synchronize, which can be mapped to
__atomic_thread_fence (__ATOMIC_SEQ_CST) in most cases, with the
exception of aarch64 (where the acquire fence is generated as
'dmb ishld' instead of 'dmb ish').
For s390x, it defaults to a memory barrier where __sync_synchronize
emits a 'bcr 15,0' (which the manual describes as pipeline
synchronization).
For PowerPC, it allows the use of lwsync for additional chips
(since _ARCH_PWR4 does not cover all chips that support it).
Tested on aarch64-linux-gnu, where the acquire fence produces a
different instruction than the current code.
The __HAVE_64B_ATOMICS macro can be defined based on __WORDSIZE, and
the __ARCH_ACQ_INSTR, MUTEX_HINT_*, and barrier definitions are
provided by the target CPU.
H.J. Lu [Thu, 30 Oct 2025 23:29:45 +0000 (07:29 +0800)]
Build programs in $(others-noinstall) like tests if libgcc_s is available
Build programs in $(others-noinstall) like tests only if libgcc_s is
available. Otherwise, "build-many-glibcs.py compilers" will fail to
build the initial glibc with the initial limited gcc due to the missing
libgcc_s.
Joseph Myers [Mon, 3 Nov 2025 19:56:42 +0000 (19:56 +0000)]
Support assert as a variadic macro for C23
C23 makes assert into a variadic macro to handle cases of an argument
that would be interpreted as a single function argument but more than
one macro argument (in particular, compound literals with an
unparenthesized comma in an initializer list); this change was made by
N2829. Note that this only applies to assert, not to other macros
specified in the C standard with particular numbers of arguments.
Implement this support in glibc. This change is only for C; C++ would
need a separate change to its separate assert implementations. It's
also applied only in C23 mode. It depends on support for (C99)
variadic macros, and also (in order to detect calls where more than
one expression is passed, via an unevaluated function call) a C99
boolean type. These requirements are encapsulated in the definition
of __ASSERT_VARIADIC. Tests with -std=c99 and -std=gnu99 (using the
non-variadic implementation) continue to work.
I don't think we have a way in the glibc testsuite to validate that
passing more than one expression as an argument does produce the
desired error.
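A minimal sketch of the variadic-macro idea (not glibc's actual
<assert.h> definition, and omitting the unevaluated-call detection of
multiple expressions described above):

  #include <stdio.h>
  #include <stdlib.h>

  #define my_assert(...)                                              \
    ((__VA_ARGS__)                                                    \
     ? (void) 0                                                       \
     : (fprintf (stderr, "assertion failed: %s\n", #__VA_ARGS__),     \
        abort ()))

  struct point { int x, y; };

  int
  main (void)
  {
    /* With a non-variadic assert this compound literal would be split
       into two macro arguments at the unparenthesized comma.  */
    my_assert ((struct point) { 1, 2 }.x == 1);
    return 0;
  }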
The Dynamic Linker chapter now includes a new section detailing
environment variables that influence its behavior.
This new section documents the `LD_DEBUG` environment variable,
explaining how to enable debugging output and listing its various
keywords like `libs`, `reloc`, `files`, `symbols`, `bindings`,
`versions`, `scopes`, `tls`, `all`, `statistics`, `unused`, and `help`.
It also documents `LD_DEBUG_OUTPUT`, which controls where the debug
output is written, allowing redirection to a file with the process ID
appended.
This provides users with essential information for controlling and
debugging the dynamic linker.
Introduce the `DL_DEBUG_TLS` debug mask to enable detailed logging for
Thread-Local Storage (TLS) and Thread Control Block (TCB) management.
This change integrates a new `tls` option into the `LD_DEBUG`
environment variable, allowing developers to trace:
- TCB allocation, deallocation, and reuse events in `dl-tls.c`,
`nptl/allocatestack.c`, and `nptl/nptl-stack.c`.
- Thread startup events, including the TID and TCB address, in
`nptl/pthread_create.c`.
A new test, `tst-dl-debug-tid`, has been added to validate the
functionality of this new debug logging, ensuring that relevant messages
are correctly generated for both main and worker threads.
This enhances the debugging capabilities for diagnosing issues related
to TLS allocation and thread lifecycle within the dynamic linker.
Pincheng Wang [Fri, 31 Oct 2025 20:15:26 +0000 (15:15 -0500)]
riscv: Add Zbkb optimized repeat_bytes helper
Introduce a RISC-V specific string-misc.h to provide an optimized
repeat_bytes implementation when the Zbkb extension is available.
The new version uses the packh/packw/pack instructions, reducing
instruction count and avoiding high-latency instructions. This helper
is used by several memory and string functions, and falls back to the
generic implementation when Zbkb is not present.
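For reference, a sketch of the byte-broadcast operation such a helper
provides (the generic idiom; both the actual string-misc.h code and
the Zbkb sequence differ):

  #include <stdint.h>

  typedef uintptr_t op_t;

  static inline op_t
  repeat_bytes_sketch (unsigned char c)
  {
    /* 0x0101...01 * c copies C into every byte of a word; Zbkb's
       packh/packw/pack instructions can build the same value without
       the multiply.  */
    return ((op_t) -1 / 0xff) * c;
  }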
Signed-off-by: Pincheng Wang <pincheng.plct@isrc.iscas.ac.cn>
Reviewed-by: Peter Bergner <bergner@tenstorrent.com>
Wilco Dijkstra [Fri, 24 Oct 2025 14:26:50 +0000 (14:26 +0000)]
math: Fix powf special case [BZ #33563]
Fix powf (0x1.fffffep+127, 1.0f) to return 0x1.fffffep+127 when
rounding upwards. Clean up the special-case code; performance
improves by ~1.2%. This fixes BZ #33563.
Yury Khrustalev [Wed, 29 Oct 2025 15:59:53 +0000 (15:59 +0000)]
debug: mark __libc_message_wrapper as always inline
When building with -Og to enable debugging, there is currently a
compiler error: if __libc_message_wrapper() is not inlined, the
__va_arg_pack_len macro cannot be used.
Eric Wong [Fri, 31 Oct 2025 03:03:00 +0000 (20:03 -0700)]
cdefs: allow __attribute__ on tcc
According to the tcc (tiny C compiler) Changelog, tcc supports
__attribute__ since 0.9.3. Looking at history of tcc at
<https://repo.or.cz/tinycc.git>, __attribute__ support was added
in commit 14658993425878be300aae2e879560698e0c6c4c on 2002-01-03,
which also looks like the release of 0.9.3. While I'm unable to
find release tags for tcc before 0.9.18 (2003-04-14), the next
release (0.9.28) will include __attribute__((cleanup(func))), which
I rely on.
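A small generic example of the cleanup attribute mentioned here (not
tied to tcc or glibc internals):

  #include <stdlib.h>

  static void
  free_charp (char **p)
  {
    free (*p);
  }

  int
  main (void)
  {
    __attribute__ ((cleanup (free_charp))) char *buf = malloc (16);
    /* BUF is freed automatically when it goes out of scope.  */
    return buf == NULL;
  }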