git.ipfire.org Git - thirdparty/glibc.git/log

linux: Inline __syscall_internal_cancel and __syscall_cancel

This helps some interception tools such as valgrind; however, for
multithreaded processes __syscall_cancel_arch is still called.

The resulting libc.so has a slightly larger code size:

ABI             master        patched         diff        increase
aarch64        1658673        1669121        10448           0.63%
x86_64         1976656        1985744         9088           0.46%
i686           2233622        2251130        17508           0.78%
powerpc64le    2382448        2396768        14320           0.60%

Internally, this mimics how cancellable entry points were implemented
before 89b53077d2a58f00e7debdfe58afabe953dac60d, where cancellation
handling was done inline in the syscall wrapper.

hurd: Fix tst-stack2 test build on Hurd

It requires $(shared-thread-library). Fixes 0c342594237.

Checked on an i686-gnu build.

nss: remove undefined behavior and optimize getaddrinfo

On x86-64, when compiling with -O2, stdc_leading_zeros compiles to
the bsr instruction. The fls function removed by this patch is inlined
but still loops, checking each bit individually.

* nss/getaddrinfo.c: Include <stdbit.h>.
(fls): Remove function. This function contains a left shift of 31 on an
'int' which is undefined.
(rfc3484_sort): Use stdc_leading_zeros instead of fls.
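
As a rough illustration of the fls change above (a sketch, not the code
from nss/getaddrinfo.c): the helper names are hypothetical, and
__builtin_clz is used as a stand-in for stdc_leading_zeros so the sketch
does not depend on <stdbit.h>. It assumes a 32-bit unsigned int.

```c
#include <stdint.h>

/* Hypothetical helper, loosely modeled on a bit-by-bit loop: it counts
   leading zero bits by testing one bit per iteration.  The mask is built
   from UINT32_C (1) so the shift by 31 happens in an unsigned type;
   shifting the plain int constant 1 left by 31 is undefined behavior.  */
static int
clz32_loop (uint32_t a)
{
  int n = 0;
  for (uint32_t mask = UINT32_C (1) << 31;
       mask != 0 && (a & mask) == 0;
       mask >>= 1)
    ++n;
  return n;
}

/* Stand-in for stdc_leading_zeros on a 32-bit value.  __builtin_clz is
   undefined for 0, so that case is handled explicitly.  */
static int
clz32_builtin (uint32_t a)
{
  return a == 0 ? 32 : __builtin_clz (a);
}
```

Both helpers return the same count for every input; the second one gives
the compiler the chance to emit a single instruction.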

Signed-off-by: Collin Funk <collin.funk1@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

powerpc: Remove POWER7 strncasecmp optimization

These routines are not extensively used (the gnulib documentation even
recommends using a replacement [1]), and there is already a POWER8
version that uses proper vectorized instructions.

[1] https://www.gnu.org/software/gnulib/manual/gnulib.html#C-strings

Checked with a build for some powerpc variations.
Reviewed-by: Peter Bergner <bergner@linux.ibm.com>

manual: add more pthread functions

Add stubs and partial docs for many undocumented pthreads functions.
While neither exhaustive nor complete, this gives minimal usage docs
for many functions and expands the pthreads chapters, making it
easier to continue improving this section in the future.

Reviewed-by: Collin Funk <collin.funk1@gmail.com>

S390: Add new s390 platform z17.

The glibc-hwcaps subdirectories are extended by "z17".  Libraries are loaded if
the z17 facility bits are active:
- Miscellaneous-instruction-extensions facility 4
- Vector-enhancements-facility 3
- Vector-Packed-Decimal-Enhancement Facility 3
- CPU: Concurrent-Functions Facility

tst-glibc-hwcaps.c is extended in order to test z17 via new marker6.
In case of running on a z17 with a kernel not recognizing z17 yet,
AT_PLATFORM will be z900 but vector-bit in AT_HWCAP is set.  This situation
is now recognized and this testcase does not fail.

A fatal glibc error is dumped if glibc was built with the architecture
level set for z17, but run on an older machine (see dl-hwcap-check.h).
Note, you might get a SIGILL before this check if you don't use:
configure --with-rtld-early-cflags=-march=<older-machine>

ld.so --list-diagnostics now also dumps information about s390.cpu_features.

Independently of z17, the s390x kernel won't introduce new HWCAP bits if there
is no special handling needed in the kernel itself.  For z17, we don't have new
HWCAP flags, but have to check the facility bits retrieved by the
stfle instruction.

Instead of storing all the stfle bits (currently four 64-bit values) in the
cpu_features struct, we now only store those bits which are needed within
glibc itself.  Note that we have this list twice, one with the original values
and the other one which can be filtered with GLIBC_TUNABLES=glibc.cpu.hwcaps.
Those new fields are stored in previously reserved space in the cpu_features
struct.  Thus processes started during an update of the glibc package, which
e.g. have a new ld.so and an old libc.so, won't crash.  In that case the glibc
internal ifunc-resolvers just would not select the best optimized variant.

The users of stfle-bits are also updated:
- parsing of GLIBC_TUNABLES=glibc.cpu.hwcaps
- glibc internal ifunc-resolvers
- __libc_ifunc_impl_list
- sysconf

Correct test descriptors in libm-test-pown.inc

While working on implementing compoundn, I noticed that
libm-test-pown.inc was wrongly using TEST_ff_f and AUTO_TESTS_ff_f
when the actual types involved meant fL_f should be used instead of
ff_f; fix to use the correct descriptor strings for pown. (These
strings affect how gen-libm-test.py generates a C file in some cases.
The structure type test_fL_f_data for expected results and the use of
RUN_TEST_LOOP_fL_f in the ALL_RM_TEST call were already correct.)

Tested for x86_64. The generated libm-test-pown.c was actually
unchanged, but the old descriptor strings were still logically
incorrect.

malloc: Inline tcache_try_malloc

Inline tcache_try_malloc into calloc since it is the only caller. Also fix
usize2tidx and use it in __libc_malloc, __libc_calloc and _mid_memalign.
The result is simpler, cleaner code.

Reviewed-by: DJ Delorie <dj@redhat.com>

math: Fix UB on sinpif (BZ 32925)

The left shift overflows for 'int', use uint32_t instead. It syncs
with CORE-MATH commit bbfabd99.
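
The class of bug fixed here (and in the similar math fixes in this log)
can be sketched as follows. This is not the CORE-MATH code; the helper
name is hypothetical, and the sketch assumes IEEE binary32 floats, a
32-bit int, and a normal (not zero or subnormal) argument.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns the 24-bit significand of a normal float
   (including the implicit leading bit) aligned to the top of a 32-bit
   word.  */
static uint32_t
significand_top_aligned (float x)
{
  uint32_t bits;
  memcpy (&bits, &x, sizeof bits);
  uint32_t m = (bits & UINT32_C (0x007fffff)) | UINT32_C (0x00800000);
  /* The shift moves a set bit into bit 31.  If m were a plain int, the
     result would not be representable and the shift would be undefined;
     in uint32_t it is well defined.  */
  return m << 8;
}
```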

Checked on aarch64-linux-gnu, x86_64-linux-gnu, and i686-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>

math: Fix UB on erfcf (BZ 32924)

The left shift overflows for 'int', use uint64_t instead. It syncs
with CORE-MATH commit d0a2be200cbc1344d800d9ef0ebee9ad67dd3ad8.

Checked on aarch64-linux-gnu, x86_64-linux-gnu, and i686-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>

math: Fix UB on cospif (BZ 32923)

The left shift overflows for 'int', use uint32_t instead. It syncs
with CORE-MATH commit bbfabd993a71b049c210b0febfd06d18369fadc1.

Checked on aarch64-linux-gnu, x86_64-linux-gnu, and i686-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>

math: Fix UB on cbrtf (BZ 32922)

The left shift overflows for 'int64_t', use unsigned instead. It syncs
with CORE-MATH commit f7c7408d1749ec2859ea249495af699359ae559b.

Checked on aarch64-linux-gnu, x86_64-linux-gnu, and i686-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>

math: Fix UB on sinhf (BZ 32921)

The left shift overflows for 'int', use uint64_t instead. It syncs
with CORE-MATH commit bbfabd99.

Checked on aarch64-linux-gnu, x86_64-linux-gnu, and i686-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>

math: Fix UB on logf (BZ 32920)

The left shift overflows for 'int', use a literal instead. It syncs
with OPTIMIZED-ROUTINES commit 0f87f607b976820ef41fe64d004fe67dc7af8236.

Checked on aarch64-linux-gnu, x86_64-linux-gnu, and i686-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>

math: Fix UB on coshf (BZ 32919)

The left shift overflows for 'int', use uint64_t instead. It syncs
with CORE-MATH commit 4d6192d2.

Checked on aarch64-linux-gnu, x86_64-linux-gnu, and i686-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>

math: Fix UB on atanhf (BZ 32918)

The left shift overflows for 'int', use unsigned instead. It syncs
with CORE-MATH commit 4d6192d2.

Checked on aarch64-linux-gnu, x86_64-linux-gnu, and i686-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>

nptl: Fix pthread_getattr_np when modules with execstack are allowed (BZ 32897)

The BZ 32653 fix (12a497c716f0a06be5946cabb8c3ec22a079771e) kept the
stack pointer zeroing from make_main_stack_executable in
_dl_make_stack_executable. However, previously 'stack_endp'
pointed to a temporary variable created before the call to
_dl_map_object_from_fd, while now we use __libc_stack_end
directly.

Since pthread_getattr_np relies on correct __libc_stack_end, if
_dl_make_stack_executable is called (for instance, when
glibc.rtld.execstack=2 is set) __libc_stack_end will be set to zero,
and the call will always fail.

The zeroing of __libc_stack_end was used as a hardening mitigation, but
since 52a01100ad011293197637e42b5be1a479a2f4ae it is used solely in the
pthread_getattr_np code. So there is no point in zeroing it anymore.
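
For reference, a minimal sketch of the kind of caller affected by this
bug. The helper name is hypothetical; pthread_getattr_np is a GNU
extension, so _GNU_SOURCE is needed. When called on the main thread, it
is this query that depends on a correct __libc_stack_end.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE
#endif
#include <pthread.h>
#include <stddef.h>

/* Hypothetical helper: stores the lowest address and the size of the
   calling thread's stack.  Returns 0 on success or an error number.  */
static int
current_stack_bounds (void **addr, size_t *size)
{
  pthread_attr_t attr;
  int ret = pthread_getattr_np (pthread_self (), &attr);
  if (ret != 0)
    return ret;
  ret = pthread_attr_getstack (&attr, addr, size);
  pthread_attr_destroy (&attr);
  return ret;
}
```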

Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Sam James <sam@gentoo.org>

RISC-V: Use builtin for ffs and ffsll while supported extension available

Hardware ctz instructions are available in the RISC-V Zbb and XTheadBb
extensions. With the corresponding `-march` flags, we can generate simpler
code than the generic implementation of `ffs`/`ffsll`.

Signed-off-by: Julian Zhu <julian.oerv@isrc.iscas.ac.cn>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

stdio: Remove UB on printf_fp

__printf_fp_buffer_1 calls count_leading_zeros with a 0 argument,
which might lead to a call to __builtin_ctz depending on the ABI.
Replace it with a stdbit.h function instead.

Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Paul Eggert <eggert@cs.ucla.edu>

benchtest: Correct shell script related to bench-malloc-thread

This patch changes the shell script that selects which arguments are used
for the execution of bench-malloc-thread.
The problem seems to have been introduced in commit:

  commit 2d6427a63cad8056ba6bcaaaa8df21977c8dde3d
  Author: Wangyang Guo <wangyang.guo@intel.com>
  Date:   Fri Nov 29 16:05:35 2024 +0800
  benchtests: Add calloc test

With the current condition, the error "/bin/sh: 3: [[: not found"
occurs when executing `make bench BENCHSET="malloc-thread"`, and the else
path is taken, using incorrect arguments for the bench test execution.

The error is reproducible on Debian-based distros.

Reviewed-by: Florian Weimer <fweimer@redhat.com>

linux/termio: remove <termio.h> and struct termio

The <termio.h> interface is absolutely ancient: it was obsoleted by
<termios.h> already in the first version of POSIX (1988) and thus
predates the very first version of Linux. Unfortunately, some constant
macros are used both by <termio.h> and <termios.h>; particularly
problematic are the baud rate constants, since the termio interface
*requires* that the baud rate is set via an enumeration as part of
c_cflag.

In preparation of revamping the termios interface to support the
arbitrary baud rate capability that the Linux kernel has supported
since 2008, remove <termio.h> in the hope that no one still uses this
archaic interface.

Note that there is no actual code in glibc to support termio: it is
purely an unabstracted ioctl() interface.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>

elf: tst-audit10: split AVX512F code into dedicated functions [BZ #32882]

"Recent" GCC versions (since commit fc62716fe8d1, backported to stable
branches) emit a vzeroupper instruction at the end of functions
containing AVX instructions. This causes the tst-audit10 test to fail
on CPUs lacking AVX instructions, despite the AVX512F check. The crash
occurs in the pltenter function of tst-auditmod10b.c.

Fix that by moving the code guarded by the check_avx512 function into
specific functions using the target ("avx512f") attribute. Note that
since commit 5359c3bc91cc ("x86-64: Remove compiler -mavx512f check") it
is safe to assume that the compiler has AVX512F support, thus the
__AVX512F__ checks can be dropped.

Tested on non-AVX, AVX2 and AVX512F machines.

Reviewed-by: Florian Weimer <fweimer@redhat.com>

Add NT_ARM_GCS and NT_RISCV_TAGGED_ADDR_CTRL from Linux 6.13 to elf.h

Linux 6.13 adds new ELF note types NT_ARM_GCS and
NT_RISCV_TAGGED_ADDR_CTRL. Add these to glibc's elf.h.

Tested for x86_64.

Add AT_* constants from Linux 6.12

Linux 6.12 adds AT_RENAME_* aliases for RENAME_* flags for renameat2,
and also AT_HANDLE_MNT_ID_UNIQUE. Add the first set of aliases to
stdio.h alongside the RENAME_* names, and AT_HANDLE_MNT_ID_UNIQUE to
bits/fcntl-linux.h.

Tested for x86_64.

hurd: Make symlink return EEXIST on existing target directory

8ef17919509e ("hurd: Fix EINVAL error on linking to a slash-trailing path
[BZ #32569]") made symlink return ENOTDIR, but the gnulib testsuite does
not recognize it for such a situation, and EEXIST is indeed more
comprehensible to users.

hurd: Clear FP exceptions before calling signal handler

This avoids SIGFPE handlers (or code longjmp-ed to) getting disturbed by the
exception that generated it.

Note: gcc's unwinding depends on the rpc_wait_trampoline/trampoline exact
code, so we here avoid breaking it.

hurd: Do not check for xstate level if it was not initialized

If __thread_get_state failed, there is no xstate level to check.
ok is 0 already and the memory exists, but better not read uninitialized
memory.

hurd: Do not restore xstate when it is not initialized

If the process has never used fp before getting a signal, xstate is set
(and thus the x87 state is not initialized) but xstate->initialized is still
0, and we should not restore anything.

hurd: Make *utime*s catch invalid times [BZ #32802, BZ #32803]

hurd: save xstate during signal handling

* hurd/Makefile: add new tests
* hurd/test-sig-rpc-interrupted.c: check xstate save and restore in
  the case where a signal is delivered to a thread which is waiting
  for an rpc. This test implements the rpc interruption protocol used
  by the hurd servers. It was so far passing on Debian thanks to the
  local-intr-msg-clobber.diff patch, which is now obsolete.
* hurd/test-sig-xstate.c: check xstate save and restore in the case
  where a signal is delivered to a running thread, making sure that
  the xstate is modified in the signal handler.
* hurd/test-xstate.h: add helpers to test xstate
* sysdeps/mach/hurd/i386/bits/sigcontext.h: add xstate to the
  sigcontext structure.
* sysdeps/mach/hurd/i386/sigreturn.c: restore xstate from the saved
  context
* sysdeps/mach/hurd/x86/trampoline.c: save xstate if
  supported. Otherwise we fall back to the previous behaviour of
  ignoring xstate.
* sysdeps/mach/hurd/x86_64/bits/sigcontext.h: add xstate to the
  sigcontext structure.
* sysdeps/mach/hurd/x86_64/sigreturn.c: restore xstate from the saved
  context

Signed-off-by: Luca Dariz <luca@orpolo.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Message-ID: <20250319171118.142163-1-luca@orpolo.org>

hurd: Check return value of mach_port_mod_refs() in the dup routine of fcntl()

Message-ID: <20250310084409.24177-1-zhmingluo@163.com>

malloc: move tcache_init out of hot tcache paths

This patch moves all calls to tcache_init out of the tcache hot paths.
Since there is no reason to initialize tcaches in the hot path, and since
we need to be able to check tcache != NULL in any case because of the
tcache_thread_shutdown function, moving tcache_init away from the hot path
can only be beneficial.
The patch also removes the initialization of tcaches within the
__libc_free call. It only makes sense to initialize tcaches for the
thread after it calls one of the allocation functions. Also the patch
removes the save/restore of errno from tcache_init code, as it is no
longer needed.

aarch64: Add back non-temporal load/stores from oryon-1's memset

I misunderstood the recommendation from the hardware team about non-temporal
load/stores. It is still recommended to use them in memset for large sizes.
They were only discouraged for use with device memory, and memset is already
not valid for use with device memory.

This reverts commit e6590f0c86632c36c9a784cf96075f4be2e920d2.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

aarch64: Add back non-temporal load/stores from oryon-1's memcpy

I misunderstood the recommendation from the hardware team about non-temporal
load/stores. It is still recommended to use them in memcpy for large sizes.
They were only discouraged for use with device memory, and memcpy is already
not valid for use with device memory.

This reverts commit eb5eeb47403e0a91de834868e501b4d62b8d2cb9.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

malloc: Use tailcalls in __libc_free

Use tailcalls to avoid the overhead of a frame on the free fastpath.
Move tcache initialization to _int_free_chunk(). Add malloc_printerr_tail()
which can be tailcalled without forcing a frame like no-return functions.
Change tcache_double_free_verify() to retry via __libc_free() after clearing
the key.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

malloc: Inline tcache_free

Inline tcache_free since it's only used by __libc_free. Add __glibc_likely
for the tcache checks.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

malloc: Improve free checks

The checks on size can be merged and use __builtin_add_overflow. Since
tcache only handles small sizes (and rejects sizes < MINSIZE), delay this
check until after tcache.
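
A minimal sketch of what a merged check of this kind could look like.
The function name and the alignment mask are assumptions made for
illustration, not the values or code used in malloc.c;
__builtin_add_overflow is the GCC/Clang builtin named above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed alignment mask for the sketch only.  */
#define SKETCH_ALIGN_MASK ((uintptr_t) 15)

/* Hypothetical helper: returns true if the chunk cannot be valid, i.e.
   p + size wraps around the end of the address space or p is
   misaligned.  */
static bool
chunk_looks_invalid (uintptr_t p, size_t size)
{
  uintptr_t end;
  return __builtin_add_overflow (p, size, &end)
         || (p & SKETCH_ALIGN_MASK) != 0;
}
```

The builtin reports the wrap-around directly, so no separate comparison
against a computed upper bound is needed.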

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

malloc: Inline _int_free_check

Inline _int_free_check since it is only used by __libc_free.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

malloc: Inline _int_free

Inline _int_free since it is a small function and only really used by
__libc_free.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

malloc: Move mmap code out of __libc_free hotpath

Currently __libc_free checks for a freed mmap chunk in the fast path.
Also errno is always saved and restored to preserve it.  Since mmap chunks
are larger than the largest tcache chunk, it is safe to delay this and
handle tcache, smallbin and medium bin blocks first.  Move saving of errno
to cases that actually need it.  Remove a safety check that fails on mmap
chunks and a check that mmap chunks cannot be added to tcache.

Performance of bench-malloc-thread improves by 9.2% for 1 thread and
6.9% for 32 threads on Neoverse V2.

Reviewed-by: DJ Delorie <dj@redhat.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

manual/tunables: fix a trivial typo

Fixes: 12a497c716f0 ("elf: Extend glibc.rtld.execstack tunable to force executable stack (BZ 32653)")
Reviewed-by: Florian Weimer <fweimer@redhat.com>

Fix spelling mistake "trucate" -> "truncate"

There is a spelling mistake in a test filename. Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

Fix spelling mistake "suports" -> "supports"

There are spelling mistakes in assert messages. Fix them.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

Fix spelling mistake "succsefully" -> "successfully"

There is a spelling mistake in a puts statement. Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

manual: Mention POSIX.1-2024 requires time_t to be 64 bit or wider.

* manual/time.texi (Time Types): Mention POSIX.1-2024 requires 64 bit
time_t.

Signed-off-by: Collin Funk <collin.funk1@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

manual: Update standardization of getline and getdelim [BZ #32830]

* manual/stdio.texi (Line Input): Document that getline and getdelim
were GNU extensions until standardized in POSIX.1-2008. Add restrict
to function prototypes.

Signed-off-by: Collin Funk <collin.funk1@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

libio: Add test case for fflush

Since one path uses _IO_SYNC and the other _IO_OVERFLOW, the newly added
test case verifies that `fflush (FILE)` and `fflush (NULL)` are
semantically equivalent from the FILE perspective.

Reviewed-by: Joseph Myers <josmyers@redhat.com>

libio: Synthesize ESPIPE error if lseek returns 0 after reading bytes

This is required so that fclose, when trying to seek to the right
position after filling the input buffer, does not fail with EINVAL.
This fclose code path only ignores ESPIPE errors.

Reported by Petr Pisar on
<https://bugzilla.redhat.com/show_bug.cgi?id=2358265>.

Fixes commit be6818be31e756398e45f70e2819d78be0961223 ("Make fclose
seek input file to right offset (bug 12724)").

Reviewed-by: Frédéric Bérat <fberat@redhat.com>

x86: Detect Intel Diamond Rapids

Detect Intel Diamond Rapids and tune it similar to Intel Granite Rapids.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>

x86: Handle unknown Intel processor with default tuning

Enable default tuning for unknown Intel processor.

Tested on x86, no regression.

Co-Authored-By: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

conform: Add initial support for C23.

Hi Joseph,

As we discussed, this patch just makes C23 include every check that is
performed by C11.

I tested the commit by adding the ISO23 Make and Python variables to be
the same as ISO11.  So the only difference was compiling with -DISO23
instead of -DISO11.  And changed the temporary directories to instead
use the format f'/tmp/glibc-{self.standard}-{self.header}'.  Then I used
a shell script to run 'cmp' on each file in the ISO11 and ISO23
directories for each header to make sure they were the same.

-- 8< --

Make C23 checks include every test that is performed by C11.  Done by
running the following command:

find conform -name '*.h-data' | xargs sed -i \
  -e 's| !defined ISO11| !defined ISO11 \&\& !defined ISO23|g' \
  -e 's| defined ISO11| defined ISO11 \|\| defined ISO23|g' \
  -e 's|ifdef ISO11|if defined ISO11 \|\| defined ISO23|g' \
  -e 's|ifndef ISO11|if !defined ISO11 \&\& !defined ISO23|g'

Signed-off-by: Collin Funk <collin.funk1@gmail.com>

x86: Add ARL/PTL/CWF model detection support

- Add ARROWLAKE model detection.
- Add PANTHERLAKE model detection.
- Add CLEARWATERFOREST model detection.

Intel® Architecture Instruction Set Extensions Programming Reference
https://cdrdv2.intel.com/v1/dl/getContent/671368 Section 1.2.

No regression, validated model detection on SDE.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

timezone: Enhance tst-bz28707 diagnostics

This hopefully provides additional information about why the
test failed, in case the fix in commit 62db87ab24f9ca483f97f
("timezone: Fix tst-bz28707 Makefile rule") turns out to be
insufficient.

Reviewed-by: Paul Eggert <eggert@cs.ucla.edu>

powerpc: Remove relocation cache flush code for power64

This is only needed for -mno-secure-plt, and this linkage mode
is not supported with powerpc64 and powerpc64le.

Reviewed-by: Peter Bergner <bergner@linux.ibm.com>

math: Fix up THREEp96 constant in expf128 [BZ #32411]

As mentioned by the reporter in a pull request against gcc-mirror,
the THREEp96 constant in e_expl.c is incorrect, it is actually 0x3.p+94f128
rather than 0x3.p+96f128.

The algorithm uses that to compute the t2 integer (tval2), by whose
delta it adjusts the x+xl pair and then in the result uses the precomputed
exp value for that entry.
Using 0x3.p+94f128 rather than 0x3.p+96f128 results in tval2 sometimes
being one smaller, sometimes one larger than the desired value, thus can mean
the x+xl pair after adjustment will be larger in absolute value than it
should be.
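
The effect of the wrong exponent can be seen with a double-precision
analogue (a sketch only: the real code works on _Float128, and the
function names here are made up). For a 113-bit significand,
0x3.p+96 lies in [2^97, 2^98), where neighbouring values are 2^-15
apart, while 0x3.p+94 gives a spacing of 2^-17. For the 53-bit
significand of double the matching constants are 0x3p+36 and 0x3p+34.
The sketch assumes round-to-nearest and |x| well below 2^34.

```c
/* Adding and subtracting 3 * 2^36 rounds x to a multiple of 2^-15,
   because doubles in [2^37, 2^38) are 2^-15 apart.  The volatile
   keeps the compiler from folding the two operations away.  */
static double
round_to_2m15 (double x)
{
  volatile double t = x + 0x3p+36;
  return t - 0x3p+36;
}

/* The same trick with a constant four times too small only rounds to
   a multiple of 2^-17, so the result is generally not on the 2^-15
   grid the caller expects.  */
static double
round_with_small_constant (double x)
{
  volatile double t = x + 0x3p+34;
  return t - 0x3p+34;
}
```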

DesWurstes created a test program for this
https://github.com/DesWurstes/comparefloats
and his results were
total: 1135000000 not_equal: 4322 earlier_score: 674 later_score: 3648
I've modified this with
https://sourceware.org/bugzilla/show_bug.cgi?id=32411#c3
so that it actually tests pseudo-random _Float128 values with range
(-16384.,16384) with strong bias on values larger than 0.0002 in absolute
value (so that tval1/tval2 aren't zero most of the time) and that gave
total: 10000000000 not_equal: 29861 earlier_score: 4606 later_score: 25255
So, in both runs, the change usually doesn't result in any difference,
and in the rare cases where it does, about 85% have a smaller ulp error
than without the patch.
Additionally I've tried
https://sourceware.org/bugzilla/show_bug.cgi?id=32411#c4
and in 2 billion iterations it didn't find any case where x+xl after the
adjustments without this change would be smaller in absolute value compared
to x+xl after the adjustments with this change.

Reviewed-by: Joseph Myers <josmyers@redhat.com>

elf: Extend glibc.rtld.execstack tunable to force executable stack (BZ 32653)

From the bug report [1], multiple programs still need to dlopen
shared libraries with either a missing PT_GNU_STACK or with the executable
bit set.  Although in some cases it seems to be hand-crafted assembly
source without the required .note.GNU-stack marking (so the static linker
is forced to set the stack executable if the ABI requires it), in other
cases it seems that the library uses trampolines [2].

Unfortunately, READ_IMPLIES_EXEC is not an option since on some ABIs
(x86_64), the kernel clears the bit, making it unsupported.  To avoid
reinstating the broken code that changes stack permission on dlopen
(0ca8785a28), this patch extends the glibc.rtld.execstack tunable to
allow an option to force an executable stack at the program startup.

The tunable is a security issue because it defeats the PT_GNU_STACK
hardening.  It has the slight advantage of making it explicit by the
caller, and, as for other tunables, this is disabled for setuid binaries.
A tunable also allows us to eventually remove it, but from previous
experiences, it would require some time.

Checked on aarch64-linux-gnu, x86_64-linux-gnu, and i686-linux-gnu.

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=32653
[2] https://github.com/conda-forge/ctng-compiler-activation-feedstock/issues/143
Reviewed-by: Sam James <sam@gentoo.org>

stdlib: Implement C2Y uabs, ulabs, ullabs and uimaxabs

C2Y adds unsigned versions of the abs functions (see C2Y draft N3467 and
proposal N3349).
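
A minimal sketch of the idea for the int case. The name sketch_uabs is
hypothetical, chosen so it cannot clash with a libc that already
provides uabs; this is not necessarily how glibc implements it.

```c
#include <limits.h>

/* Returns the absolute value of x as an unsigned int.  The negation is
   done after converting to unsigned, where arithmetic is modular, so
   INT_MIN maps to (unsigned int) INT_MAX + 1 without any signed
   overflow.  */
static unsigned int
sketch_uabs (int x)
{
  return x < 0 ? -(unsigned int) x : (unsigned int) x;
}
```

Unlike abs (INT_MIN), whose result is not representable in int, the
unsigned variant is defined for every input.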

Tested for x86_64.

Signed-off-by: Lenard Mollenkopf <glibc@lenardmollenkopf.de>

stdio-common: In tst-setvbuf2, close helper thread descriptor only if opened

The helper thread may get canceled before the open system
call succeeds. Then ThreadData.fd remains zero, and eventually
the xclose call in end_reader_thread fails because descriptor 0
is not open.

Instead, initialize the fd member to -1 (not a valid descriptor)
and close the descriptor only if valid. Do this in a new end_thread
helper routine.
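
The pattern can be sketched like this. The struct and function names
are hypothetical stand-ins, not the ones used in tst-setvbuf2.c.

```c
#include <unistd.h>

/* Hypothetical stand-in for the test's per-thread data.  */
struct thread_data
{
  int fd;   /* -1 until open succeeds.  */
};

static void
thread_data_init (struct thread_data *td)
{
  td->fd = -1;
}

/* Closes the descriptor only if one was opened.  Returns 0 if there was
   nothing to close or close succeeded, -1 if close failed.  */
static int
thread_data_close (struct thread_data *td)
{
  if (td->fd < 0)
    return 0;
  int ret = close (td->fd);
  td->fd = -1;
  return ret;
}
```

With 0 as the initial value there is no way to tell "never opened"
from "opened as descriptor 0"; -1 removes that ambiguity.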

Also add more error checking to close operations.

Fixes commit 95b780c1d0549678c0a244c6e2112ec97edf0839 ("stdio: Add
more setvbuf tests").

Remove duplicates from binaries-shared-tests when creating make rules

This avoids a warning from make because binaries-shared-tests contains
both $(tests-container) and $(tests-internal).

x86: Optimize xstate size calculation

Scan xstate IDs up to the maximum supported xstate ID. Remove the
separate AMX xstate calculation. Instead, exclude the AMX space from
the start of TILECFG to the end of TILEDATA in xsave_state_size.

Completed validation on SKL/SKX/SPR/SDE and compared xsave state size
with "ld.so --list-diagnostics" option, no regression.

Co-Authored-By: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>

NEWS: update for GCC 12.1 requirement [BZ #32539]

Since 27b96e069aad17cefea9437542180bff448ac3a0, the minimum GCC required
to build glibc is GCC 12.1.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

stdio: fix hurd link for tst-setvbuf2

stdlib: Fix qsort memory leak if callback throws (BZ 32058)

If the input buffer exceeds the stack auxiliary buffer, qsort will
malloc a temporary one to call mergesort. Since the C++ standard
allows the callback comparison function to throw [1], the glibc
implementation can potentially leak memory.

The fix uses pthread_cleanup_combined_push and
pthread_cleanup_combined_pop, so it works with and without
exceptions enabled. The qsort code path that calls malloc now
requires some extra setup and a call to __pthread_cleanup_push
and __pthread_cleanup_pop (which should be ok since they just
set up some buffer state).
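
The general shape of such a fix, sketched with the public
pthread_cleanup_push/pthread_cleanup_pop macros rather than the
glibc-internal combined variants named above. The helper names are
hypothetical. The public macros cover cancellation and pthread_exit;
whether the handler also runs during exception unwinding depends on how
the code is compiled, which is what the combined variants are meant to
take care of.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

static void
free_buffer (void *p)
{
  free (p);
}

/* Example callback that only touches the scratch buffer.  */
static void
zero_fill (void *buf, size_t size)
{
  memset (buf, 0, size);
}

/* Hypothetical helper: runs work (tmp, size) with a heap-allocated
   scratch buffer that is released even if work does not return
   normally.  Returns 0 on success, -1 if the allocation fails.  */
static int
with_scratch_buffer (size_t size, void (*work) (void *, size_t))
{
  void *tmp = malloc (size);
  if (tmp == NULL)
    return -1;
  pthread_cleanup_push (free_buffer, tmp);
  work (tmp, size);
  /* Nonzero argument: run free_buffer on the normal path too.  */
  pthread_cleanup_pop (1);
  return 0;
}
```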

Checked on x86_64-linux-gnu.

[1] https://timsong-cpp.github.io/cppwp/n4950/alg.c.library#4

Reviewed-by: DJ Delorie <dj@redhat.com>

sysdeps: powerpc: restore -mlong-double-128 check

We mistakenly dropped the check in 27b96e069aad17cefea9437542180bff448ac3a0;
there are some other checks which we *can* drop, but let's worry about that
later.

Fixes the build on ppc64le where GCC is configured with --with-long-double-format=ieee.

Reviewed-by: Andreas Schwab <schwab@suse.de>

stdio: Add more setvbuf tests

add ptmx support to test-container

Update syscall lists for Linux 6.14

Linux 6.14 has no new syscalls. Update the version number in
syscall-names.list to reflect that it is still current for 6.14.

Tested with build-many-glibcs.py.

x86: Link tst-gnu2-tls2-x86-noxsave{,c,xsavec} with libpthread

This fixes a test build failure on Hurd.

Fixes commit 145097dff170507fe73190e8e41194f5b5f7e6bf ("x86: Use separate
variable for TLSDESC XSAVE/XSAVEC state size (bug 32810)").

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

elf: Fix tst-origin build when toolchain defaults to --as-needed (BZ 32823)

Checked on aarch64-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>

Raise the minimum GCC version to 12.1 [BZ #32539]

For all Linux distros with glibc 2.40 which I can find, GCC 14.2 is used
to compile glibc 2.40:

OS                    GCC      URL
AOSC                  14.2.0   https://aosc.io/
Arch Linux            14.2.0   https://archlinux.org/
ArchPOWER             14.2.0   https://archlinuxpower.org/
Artix                 14.2.0   https://artixlinux.org/
Debian                14.2.0   https://www.debian.org/
Devuan                14.2.0   https://www.devuan.org/
Exherbo               14.2.0   https://www.exherbolinux.org/
Fedora                14.2.1   https://fedoraproject.org/
Gentoo                14.2.1   https://gentoo.org/
Kali Linux            14.2.0   https://www.kali.org/
KaOS                  14.2.0   https://kaosx.us/
LiGurOS               14.2.0   https://liguros.gitlab.io/
Mageia                14.2.0   https://www.mageia.org/en/
Manjaro               14.2.0   https://manjaro.org/
NixOS                 14.2.0   https://nixos.org/
openmamba             14.2.0   https://openmamba.org/
OpenMandriva          14.2.0   https://openmandriva.org/
openSUSE              14.2.0   https://www.opensuse.org/
Parabola              14.2.0   https://www.parabola.nu/
PLD Linux             14.2.0   https://pld-linux.org/
PureOS                14.2.0   https://pureos.net/
Raspbian              14.2.0   http://raspbian.org/
Slackware             14.2.0   http://www.slackware.com/
Solus                 14.2.0   https://getsol.us/
T2 SDE                14.2.0   http://t2sde.org/
Ubuntu                14.2.0   https://www.ubuntu.com/
Wikidata              14.2.0   https://wikidata.org/

Supporting older versions of GCC to build glibc 2.42 has costs:

1. Need to work around bugs in older versions of GCC.
2. Can't use the new features in newer versions of GCC, which may be
required for new features in glibc, like _Float16, which requires GCC
12.1 or above.

The main benefit of supporting older versions of GCC is easier backport
of bug fixes to the older releases of glibc, which can be mitigated by
avoiding incompatible features in newer versions of GCC for critical bug
fixes.  Require GCC 12.1 or newer to build.  Remove GCC version check for
PowerPC and s390x.

TEST_CC and TEST_CXX can be used to test the glibc build with the older
versions of GCC.

For glibc developers who are using Linux OSes which don't come with GCC
12.1 or newer, they should build and install GCC 12.1 or newer to work
on glibc.

This fixes BZ #32539.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>

Fix typo in comment

manual: tidy the longopt.c example

- Change longopt.c's backticks to single quotes
- puts() does not use format specifiers

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

manual: Document functions adopted by POSIX.1-2024.

Here is a patch updating the documentation to mention GNU and BSD
extensions that were adopted by POSIX.1-2024.

* manual/llio.texi (Memory-mapped I/O): Add that MAP_ANON and
MAP_ANONYMOUS were added by POSIX.1-2024.
* manual/memory.texi (Changing Block Size): Mention that reallocarray
was added by POSIX.1-2024.
* manual/message.texi (Message Translation): Adjust wording to match
standardization.
(Translation with gettext): Mention the gettext family of functions were
added by POSIX.1-2024.
* manual/pattern.texi (Wildcard Matching): Mention that FNM_CASEFOLD was
added by POSIX.1-2024.
* manual/process.texi (Creating a Process): Mention that _Fork and
WCOREDUMP were added by POSIX.1-2024.
* manual/signal.texi (Miscellaneous Signals): Mention that SIGWINCH was
added by POSIX.1-2024.
* manual/startup.texi (Environment Access): Mention that secure_getenv
was added by POSIX.1-2024.
* manual/string.texi (Truncating Strings): Mention that strlcpy,
strlcat, wcslcpy, and wcslcat were added by POSIX.1-2024.
(Search Functions): Document that memmem was added by POSIX.1-2024.
* manual/terminal.texi (Allocation): Mention that ptsname_r was added by
POSIX.1-2024.
* manual/threads.texi (Waiting with Explicit Clocks): Move node under
POSIX Threads. Mention pthread_cond_clockwait,
pthread_rwlock_clockrdlock, and pthread_rwlock_clockwrlock were added by
POSIX.1-2024.
(Joining Threads): New node under Non-POSIX Extensions.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Signed-off-by: Collin Funk <collin.funk1@gmail.com>

aarch64: Fix _dl_tlsdesc_dynamic unwind for pac-ret (BZ 32612)

When libgcc is built with pac-ret, it needs to authenticate the
unwinding frame based on CFI information.  The _dl_tlsdesc_dynamic
uses a custom calling convention, where it is responsible to save
and restore all registers it might use (even volatile).

The pac-ret support added by 1be3d6eb823d8b952fa54b7bbc90cbecb8981380
was added only on the slow-path, but the fast path also adds DWARF
Register Rule Instruction (cfi_adjust_cfa_offset) since it requires
to save/restore some auxiliary register.  It seems that this is not
fully supported by either libgcc or the AArch64 ABI [1].

Instead, move paciasp/autiasp to function prologue/epilogue to be
used on both fast and slow paths.

I also corrected the _dl_tlsdesc_dynamic comment description, which was
copied from the i386 implementation without any adjustment.

Checked on aarch64-linux-gnu with a toolchain built with
--enable-standard-branch-protection on a system with pac-ret
support.

[1]  https://github.com/ARM-software/abi-aa/blob/main/aadwarf64/aadwarf64.rst#id1

Reviewed-by: Yury Khrustalev <yury.khrustalev@arm.com>

x86: Use separate variable for TLSDESC XSAVE/XSAVEC state size (bug 32810)

Previously, the initialization code reused the xsave_state_full_size
member of struct cpu_features for the TLSDESC state size. However,
the tunable processing code assumes that this member has the
original XSAVE (non-compact) state size, so that it can use its
value if XSAVEC is disabled via tunable.

This change uses a separate variable and not a struct member because
the value is only needed in ld.so and the static libc, but not in
libc.so. As a result, struct cpu_features layout does not change,
helping a future backport of this change.

Fixes commit 9b7091415af47082664717210ac49d51551456ab ("x86-64:
Update _dl_tlsdesc_dynamic to preserve AMX registers").

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

x86: Skip XSAVE state size reset if ISA level requires XSAVE

If we have to use XSAVE or XSAVEC trampolines, do not adjust the size
information they need. Technically, it is an operator error to try to
run with -XSAVE,-XSAVEC on such builds, but this change here disables
some unnecessary code with higher ISA levels and simplifies testing.

Related to commit befe2d3c4dec8be2cdd01a47132e47bdb7020922
("x86-64: Don't use SSE resolvers for ISA level 3 or above").

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

malloc: Improve performance of __libc_malloc

Improve performance of __libc_malloc by splitting it into 2 parts: first handle
the tcache fastpath, then do the rest in a separate tailcalled function.
This results in significant performance gains since __libc_malloc doesn't need
to set up a frame and we delay tcache initialization and setting of errno until
later.
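
The shape of the split can be sketched as follows.  This is a toy
model, not the malloc.c code: the one-entry cache and the names
alloc_fast, alloc_slow and release are all hypothetical stand-ins.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical one-entry per-thread cache standing in for tcache.  */
static _Thread_local void *cached_block;
static _Thread_local size_t cached_size;

/* Slow path: kept out of line so the fast path stays small.  */
__attribute__ ((noinline)) static void *
alloc_slow (size_t size)
{
  return malloc (size);
}

/* Fast path: serve the request from the cache when possible,
   otherwise hand over to the slow path with a call in tail
   position.  */
static void *
alloc_fast (size_t size)
{
  if (cached_block != NULL && cached_size >= size)
    {
      void *p = cached_block;
      cached_block = NULL;
      return p;
    }
  return alloc_slow (size);
}

/* Return a block: keep it in the cache if the slot is free.  */
static void
release (void *p, size_t size)
{
  if (cached_block == NULL)
    {
      cached_block = p;
      cached_size = size;
    }
  else
    free (p);
}
```

Because the fast path makes no call other than the final tail call, a
compiler has the opportunity to emit it without a stack frame.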

On Neoverse V2, bench-malloc-simple improves by 6.7% overall (up to 8.5% for
ST case) and bench-malloc-thread improves by 20.3% for 1 thread and 14.4% for
32 threads.

Reviewed-by: DJ Delorie <dj@redhat.com>

stdio-common: Reject real data w/o exponent digits in scanf [BZ #12701]

Reject invalid formatted scanf real input data the exponent part of
which is comprised of an exponent introducing character, optionally
followed by a sign, and with no actual digits following. Such data is a
prefix of, but not a matching input sequence and it is required by ISO C
to cause a matching failure.

Currently a matching success is instead incorrectly produced along with
the conversion result according to the input significand read and the
exponent of zero, with the significand and the exponent part wholly
consumed from input.

Correct an invalid `tstscanf.c' test accordingly that expects a matching
success for input data provided in the ISO C standard as an example for
a matching failure.

Enable input data that causes test failures without this fix in place.
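
A minimal sketch of the input in question, with hypothetical helper
names.  The strtod helper is included for contrast only: strtod may
stop before the 'e', whereas the scanf family can push back at most
one character and so has to report a matching failure.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: return value of sscanf for a "%lf" directive.  */
static int
scan_real (const char *input, double *value)
{
  return sscanf (input, "%lf", value);
}

/* Hypothetical helper: number of characters strtod consumes.  */
static size_t
strtod_consumed (const char *input)
{
  char *end;
  (void) strtod (input, &end);
  return (size_t) (end - input);
}
```

scan_real ("1e5", &v) assigns 100000 to v.  For "1e", ISO C requires a
matching failure with nothing assigned, while strtod_consumed ("1e")
is 1 because strtod converts the "1" and leaves the 'e' unread.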

Reviewed-by: Joseph Myers <josmyers@redhat.com>

stdio-common: Reject significand prefixes in scanf [BZ #12701]

Reject invalid formatted scanf real input data that is comprised of a
hexadecimal prefix, optionally preceded by a sign, and with no actual
digits following owing to the field width restriction in effect. Such
data is a prefix of, but not a matching input sequence and it is
required by ISO C to cause a matching failure.

Currently a matching success is instead incorrectly produced along with
the conversion result of zero, with the prefix wholly consumed from
input. Where the end of input is marked by the end-of-file condition
rather than the field width restriction in effect a matching failure is
already correctly produced.

Enable input data that causes test failures without this fix in place.

Reviewed-by: Joseph Myers <josmyers@redhat.com>

stdio-common: Reject integer prefixes in scanf [BZ #12701]

Reject invalid formatted scanf integer input data that is comprised of a
binary or hexadecimal prefix, optionally preceded by a sign, and with no
actual digits following. Such data is a prefix of, but not a matching
input sequence and it is required by ISO C to cause a matching failure.

Currently a matching success is instead incorrectly produced along with
the conversion result of zero, with the prefix wholly consumed from
input.

Enable input data that causes test failures without this fix in place.
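
The same kind of input for integers can be sketched like this;
scan_integer is a hypothetical helper, not part of the test suite.

```c
#include <stdio.h>

/* Hypothetical helper: return value of sscanf for a "%i" directive,
   which accepts an optional sign and a base prefix.  */
static int
scan_integer (const char *input, int *value)
{
  return sscanf (input, "%i", value);
}
```

scan_integer ("0x1f", &v) assigns 31 to v.  For a bare "0x" the result
depends on the library version: without this fix a success with a
value of zero is reported, with it a matching failure.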

Reviewed-by: Joseph Myers <josmyers@redhat.com>

stdio-common: Also reject exp char w/o significand in i18n scanf [BZ #13988]

Fix the handling of real 'scanf' input such as "+.e" as per BZ #13988
for the i18n case as well, complementing commit 6ecec3b616ae ("Don't
accept exp char without preceding digits in scanf float parsing"), where
the 'e' character is incorrectly consumed from input. Add a test case
matching stdio-common/bug26.c, with bits from localedata/tst-sscanf.c.

Reviewed-by: Joseph Myers <josmyers@redhat.com>

stdio-common: Add tests for formatted vsscanf input specifiers

Wire vsscanf into test infrastructure for formatted scanf input
specifiers.

Reviewed-by: Joseph Myers <josmyers@redhat.com>

stdio-common: Add tests for formatted vfscanf input specifiers

Wire vfscanf into test infrastructure for formatted scanf input
specifiers.

Reviewed-by: Joseph Myers <josmyers@redhat.com>

stdio-common: Add tests for formatted vscanf input specifiers

Wire vscanf into test infrastructure for formatted scanf input
specifiers.

Reviewed-by: Joseph Myers <josmyers@redhat.com>

stdio-common: Add tests for formatted sscanf input specifiers

Wire sscanf into test infrastructure for formatted scanf input
specifiers.

Reviewed-by: Joseph Myers <josmyers@redhat.com>

stdio-common: Add tests for formatted fscanf input specifiers

Wire fscanf into test infrastructure for formatted scanf input
specifiers.

Reviewed-by: Joseph Myers <josmyers@redhat.com>

stdio-common: Add scanf long double data for Intel/Motorola 80-bit format

Add Makefile infrastructure, a format-specific test skeleton providing a
data comparison implementation that ignores bits of data representation
in memory that do not participate in holding floating-point data, and
`long double' real input data for targets using the Intel/Motorola
80-bit format.
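
The idea of such a comparison can be sketched as follows.  This is not
the test skeleton itself; the names are hypothetical and the layout
(value in the first 10 bytes, padding after it) is an assumption that
holds for x86 only.

```c
#include <float.h>
#include <stdbool.h>
#include <string.h>

/* Number of leading bytes that hold the value.  Assumption: on x86
   the 80-bit format occupies the first 10 bytes of the object and
   the rest is padding; elsewhere every byte is treated as
   significant.  */
#if (defined __i386__ || defined __x86_64__) && LDBL_MANT_DIG == 64
# define VALUE_BYTES 10
#else
# define VALUE_BYTES sizeof (long double)
#endif

/* Compare two long double objects in memory, ignoring padding.  */
static bool
same_long_double_bits (const void *a, const void *b)
{
  return memcmp (a, b, VALUE_BYTES) == 0;
}
```

A plain memcmp over sizeof (long double) bytes could report a mismatch
for equal values whose padding bytes happen to differ.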

Keep input data disabled and referring to BZ #12701 for entries that are
currently incorrectly accepted as valid data, such as '0e', '0e+',
'0x', '0x8p', '0x0p-', etc.

Reviewed-by: Joseph Myers <josmyers@redhat.com>

Implement C23 pown

C23 adds various <math.h> function families originally defined in TS
18661-4.  Add the pown functions, which are like pow but with an
integer exponent.  That exponent has type long long int in C23; it was
intmax_t in TS 18661-4, and as with other interfaces changed after
their initial appearance in the TS, I don't think we need to support
the original version of the interface.  The test inputs are based on
the subset of test inputs for pow that use integer exponents that fit
in long long.
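
To illustrate the semantics only, here is a naive sketch using
repeated squaring.  It is not the algorithm used by this patch (which
builds on __ieee754_pow), it makes no accuracy or rounding-mode
guarantees, and pown_sketch is a hypothetical name.

```c
/* Illustrative only: X raised to the integer power N by repeated
   squaring.  */
static double
pown_sketch (double x, long long int n)
{
  /* Take the magnitude in unsigned arithmetic so that the most
     negative exponent is handled without overflow.  */
  unsigned long long int m = (unsigned long long int) n;
  if (n < 0)
    m = -m;

  double result = 1.0;
  while (m != 0)
    {
      if (m & 1)
        result *= x;
      x *= x;
      m >>= 1;
    }
  return n < 0 ? 1.0 / result : result;
}
```

For small exact cases such as pown_sketch (2.0, 10) this gives 1024,
but the accumulated rounding of the multiplications is one reason a
real implementation cannot be this simple.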

As the first such template implementation that saves and restores the
rounding mode internally (to avoid possible issues with directed
rounding and intermediate overflows or underflows in the wrong
rounding mode), support also needed to be added for using
SET_RESTORE_ROUND* in such template function implementations.  This
required math-type-macros-float128.h to include <fenv_private.h>, so
it can tell whether SET_RESTORE_ROUNDF128 is defined.  In turn, the
include order with <fenv_private.h> included before <math_private.h>
broke loongarch builds, showing up that
sysdeps/loongarch/math_private.h is really a fenv_private.h file
(maybe implemented internally before the consistent split of those
headers in 2018?) and needed to be renamed to fenv_private.h to avoid
errors with duplicate macro definitions if <math_private.h> is
included after <fenv_private.h>.

The underlying implementation uses __ieee754_pow functions (called
more than once in some cases, where the exponent does not fit in the
floating type).  I expect a custom implementation for a given format,
that only handles integer exponents but handles larger exponents
directly, could be faster and more accurate in some cases.

I encourage searching for worst cases for ulps error for these
implementations (necessarily non-exhaustively, given the size of the
input space).

Tested for x86_64 and x86, and with build-many-glibcs.py.

support: Use unwinder in links-dso-program-c only with libgcc_s

Do not build links-dso-program-c with exception (unwinding) support
if libgcc_s is not available. In this case, the unwinder may be
part of libgcc.a or libgcc_eh.a, depending on how GCC was built.
If the unwinder is in libgcc_eh.a only, linking links-dso-program-c
failed before this change. After this change, the exception
handling landing pad is only generated if libgcc_s is available,
avoiding an undefined _Unwind_Resume (or equivalent) symbol
reference in the non-libgcc_s case.

Fixes commit ffd36cc27407003a6f9efcb9c16370e3435c5b1d ("support: Use
unwinder in links-dso-program-c only with libgcc_s") and
commit 5dfbc3c43ecc1bcfc760a032c91bb002660051bc ("support: Link
links-dso-program-c with libgcc_s only if available").

malloc: Use __always_inline for simple functions

Use __always_inline for small helper functions that are critical for
performance. This ensures inlining always happens when expected.
Performance of bench-malloc-simple improves by 0.6% on average on
Neoverse V2.
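
A minimal sketch of the pattern, assuming GCC or a compatible
compiler.  ALWAYS_INLINE is a local stand-in for the internal
__always_inline macro, and both functions are hypothetical examples
rather than code from malloc.c.

```c
#include <stddef.h>

/* Local stand-in for glibc's internal __always_inline macro.  */
#define ALWAYS_INLINE inline __attribute__ ((__always_inline__))

/* Hypothetical small helper: round a request up to a multiple of
   16 bytes.  */
static ALWAYS_INLINE size_t
round_up_16 (size_t n)
{
  return (n + 15) & ~(size_t) 15;
}

/* Hypothetical caller: the helper is expanded here rather than
   called, independent of the inliner's size heuristics.  */
static size_t
padded_request (size_t n)
{
  return round_up_16 (n) + 16;
}
```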

Reviewed-by: DJ Delorie <dj@redhat.com>

linux: Fix integer overflow warnings when including <sys/mount.h> [BZ #32708]

Using gcc -Wshift-overflow=2 -Wsystem-headers to compile a file
including <sys/mount.h> will cause a warning since 1 << 31 is undefined
behavior on platforms where int is 32 bits.
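
One common way to avoid the problem, shown as a sketch with
hypothetical macro names rather than the actual <sys/mount.h>
definitions, is to shift an unsigned constant.  The sketch assumes a
32-bit unsigned int.

```c
/* Hypothetical flag definitions, not the real <sys/mount.h> ones.  */
#define EXAMPLE_FLAG_LOW   (1U << 0)
/* With a signed operand, 1 << 31 would overflow a 32-bit int; the
   unsigned form below is well defined.  */
#define EXAMPLE_FLAG_HIGH  (1U << 31)

/* Test whether the top flag bit is set in FLAGS.  */
static int
has_high_flag (unsigned int flags)
{
  return (flags & EXAMPLE_FLAG_HIGH) != 0;
}
```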

Signed-off-by: Collin Funk <collin.funk1@gmail.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>

malloc: Use _int_free_chunk for remainders

When splitting a chunk, release the tail part by calling _int_free_chunk.
This avoids inserting random blocks into tcache that were never requested
by the user. Fragmentation will be worse if they are never used again.
Note if the tail is fairly small, we could avoid splitting it at all.
Also remove an oddly placed initialization of tcache in __libc_realloc.

Reviewed-by: DJ Delorie <dj@redhat.com>

Use MPFR 4.2.2 and Linux 6.14 in build-many-glibcs.py

Tested with build-many-glibcs.py (host-libraries, compilers and glibcs
builds).

stdio-common: Add scanf long double data for IBM 128-bit format

Add Makefile infrastructure and IBM 128-bit 'long double' real input data
for targets switching between the IEEE 754 binary128 and IBM 128-bit formats
with '-mabi=ieeelongdouble' and '-mabi=ibmlongdouble'. Reuse IEEE 754
binary128 input data but with modified output file names so as not to
clash with the names used for IBM 128-bit format tests made with common
rules for the 'long double' data type.

Keep input data disabled and referring to BZ #12701 for entries that are
currently incorrectly accepted as valid data, such as '0e', '0e+',
'0x', '0x8p', '0x0p-', etc.

Reviewed-by: Joseph Myers <josmyers@redhat.com>

stdio-common: Add scanf long double data for IEEE 754 binary64 format

Add Makefile infrastructure and 64-bit `long double' real input data for
targets switching between the IEEE 754 binary64 and IEEE 754 binary128
formats with `-mlong-double-64' and `-mlong-double-128'. Use modified
output file names for the IEEE 754 binary64 format so as not to clash
with the names used for IEEE 754 binary128 format tests made with common
rules for the 'long double' data type.

Keep input data disabled and referring to BZ #12701 for entries that are
currently incorrectly accepted as valid data, such as '0e', '0e+',
'0x', '0x8p', '0x0p-', etc.

Reviewed-by: Joseph Myers <josmyers@redhat.com>

stdio-common: Add scanf long double data for IEEE 754 binary128 format

Add Makefile infrastructure and `long double' real input data for
targets using the IEEE 754 binary128 format.

Keep input data disabled and referring to BZ #12701 for entries that are
currently incorrectly accepted as valid data, such as '0e', '0e+',
'0x', '0x8p', '0x0p-', etc.

Reviewed-by: Joseph Myers <josmyers@redhat.com>

stdio-common: Add scanf double data for IEEE 754 binary64 format

Add Makefile infrastructure and `double' real input data for targets
using the IEEE 754 binary64 format.

Keep input data disabled and referring to BZ #12701 for entries that are
currently incorrectly accepted as valid data, such as '0e', '0e+',
'0x', '0x8p', '0x0p-', etc.

Reviewed-by: Joseph Myers <josmyers@redhat.com>

stdio-common: Add scanf float data for IEEE 754 binary32 format

Add Makefile infrastructure and `float' real input data for targets
using the IEEE 754 binary32 format.

Keep input data disabled and referring to BZ #12701 for entries that are
currently incorrectly accepted as valid data, such as '0e', '0e+',
'0x', '0x8p', '0x0p-', etc.

Reviewed-by: Joseph Myers <josmyers@redhat.com>

stdio-common: Add scanf integer data for LP64 targets

Add Makefile infrastructure and `int' and `long' integer input data,
signed and unsigned, for LP64 targets.

While the size of `int' data is the same between ILP32 and LP64 targets,
resulting scanf output is different between them for out of range input
data and while ISO C and POSIX both say that the behavior is undefined
if the result of the conversion cannot be represented we want to keep
track of our output to prevent inadvertent changes. Hence the use of
distinct `int' integer input data between ILP32 and LP64 targets.

Keep input data disabled and referring to BZ #12701 for entries that are
currently incorrectly accepted as valid data, such as '0b' or '0x'.

Reviewed-by: Joseph Myers <josmyers@redhat.com>

stdio-common: Add scanf integer data for ILP32 targets

Add Makefile infrastructure and `int' and `long' integer input data,
signed and unsigned, for ILP32 targets.

While the size of `int' data is the same between ILP32 and LP64 targets,
resulting scanf output is different between them for out of range input
data and while ISO C and POSIX both say that the behavior is undefined
if the result of the conversion cannot be represented we want to keep
track of our output to prevent inadvertent changes. Hence the use of
distinct `int' integer input data between ILP32 and LP64 targets.

Keep input data disabled and referring to BZ #12701 for entries that are
currently incorrectly accepted as valid data, such as '0b' or '0x'.

Reviewed-by: Joseph Myers <josmyers@redhat.com>