git.ipfire.org Git - thirdparty/glibc.git/log

x86: fix wmemset ifunc stray '!' (bug 33542)

The ifunc selector for wmemset had a stray '!' in the
X86_ISA_CPU_FEATURES_ARCH_P(...) check:

  if (X86_ISA_CPU_FEATURE_USABLE_P (cpu_features, AVX2)
      && X86_ISA_CPU_FEATURES_ARCH_P (cpu_features,
                                      AVX_Fast_Unaligned_Load, !))

This effectively negated the predicate and caused the AVX2/AVX512
paths to be skipped, making the dispatcher fall back to the SSE2
implementation even on CPUs where AVX2/AVX512 are available. The
regression leads to noticeable throughput loss for wmemset.

Remove the stray '!' so the AVX_Fast_Unaligned_Load capability is
tested as intended and the correct AVX2/EVEX variants are selected.

Impact:
- On AVX2/AVX512-capable x86_64, wmemset no longer incorrectly
  falls back to SSE2; perf now shows __wmemset_evex/avx2 variants.

Testing:
- benchtests/bench-wmemset shows improved bandwidth across sizes.
- perf confirm the selected symbol is no longer SSE2.

Signed-off-by: xiejiamei <xiejiamei@hygon.com>
Signed-off-by: Li jing <lijing@hygon.cn>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit 4d86b6cdd8132e0410347e07262239750f86dfb4)

x86: Detect Intel Nova Lake Processor

Detect Intel Nova Lake Processor and tune it similar to Intel Panther
Lake. https://cdrdv2.intel.com/v1/dl/getContent/671368 Section 1.2.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit a114e29ddd530962d2b44aa9d89f1f6075abe7fa)

x86: Detect Intel Wildcat Lake Processor

Detect Intel Wildcat Lake Processor and tune it similar to Intel Panther
Lake. https://cdrdv2.intel.com/v1/dl/getContent/671368 Section 1.2.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit f8dd52901b72805a831d5a4cb7d971e4a3c9970b)

nss: Group merge does not react to ERANGE during merge (bug 33361)

The break statement in CHECK_MERGE is expected to exit the surrounding
while loop, not the do-while loop with in the macro. Remove the
do-while loop from the macro. It is not needed to turn the macro
expansion into a single statement due to the way CHECK_MERGE is used
(and the statement expression would cover this anyway).

Reviewed-by: Collin Funk <collin.funk1@gmail.com>
(cherry picked from commit 0fceed254559836b57ee05188deac649bc505d05)

AArch64: Fix SVE powf routine [BZ #33299]

Fix a bug in predicate logic introduced in last change.
A slight performance improvement from relying on all true
predicates during conversion from single to double.
This fixes BZ #33299.

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
(cherry picked from commit aac077645a645bba0d67f3250e82017c539d0f4b)

Optimize __libc_tsd_* thread variable access

These variables are not exported, and libc.so TLS is initial-exec
anyway. Declare these variables as hidden and use the initial-exec
TLS model.

Reviewed-by: Frédéric Bérat <fberat@redhat.com>
(cherry picked from commit a894f04d877653bea1639fc9a4adf73bd9347bf4)

i386: Add GLIBC_ABI_GNU_TLS version [BZ #33221]

On i386, programs and shared libraries with __thread usage may fail
silently at run-time against glibc without the TLS run-time fix for:

https://sourceware.org/bugzilla/show_bug.cgi?id=32996

Add GLIBC_ABI_GNU_TLS version to indicate that glibc has the working
GNU TLS run-time. Linker can add the GLIBC_ABI_GNU_TLS version to
binaries which depend on the working TLS run-time so that such programs
and shared libraries will fail to load and run at run-time against
libc.so without the GLIBC_ABI_GNU_TLS version, instead of fail silently
at random.

This fixes BZ #33221.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
(cherry picked from commit ed1b7a5a489ab555a27fad9c101ebe2e1c1ba881)

i386: Also add GLIBC_ABI_GNU2_TLS version [BZ #33129]

Since the GNU2 TLS run-time bug:

https://sourceware.org/bugzilla/show_bug.cgi?id=31372

affects both i386 and x86-64, also add GLIBC_ABI_GNU2_TLS version to i386
to indicate the working GNU2 TLS run-time. For x86-64, the additional
GNU2 TLS run-time bug fix is needed for

https://sourceware.org/bugzilla/show_bug.cgi?id=31501

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
(cherry picked from commit bd4628f3f18ac312408782eea450429c6f044860)

i386: Update ___tls_get_addr to preserve vector registers

Compiler generates the following instruction sequence for dynamic TLS
access:

leal tls_var@tlsgd(,%ebx,1), %eax
call ___tls_get_addr@PLT

CALL instruction is transparent to compiler which assumes all registers,
except for EFLAGS, AX, CX, and DX, are unchanged after CALL.  But
___tls_get_addr is a normal function which doesn't preserve any vector
registers.

1. Rename the generic __tls_get_addr function to ___tls_get_addr_internal.
2. Change ___tls_get_addr to a wrapper function with implementations for
FNSAVE, FXSAVE, XSAVE and XSAVEC to save and restore all vector registers.
3. dl-tlsdesc-dynamic.h has:

_dl_tlsdesc_dynamic:
/* Like all TLS resolvers, preserve call-clobbered registers.
   We need two scratch regs anyway.  */
subl $32, %esp
cfi_adjust_cfa_offset (32)

It is wrong to use

movl %ebx, -28(%esp)
movl %esp, %ebx
cfi_def_cfa_register(%ebx)
...
mov %ebx, %esp
cfi_def_cfa_register(%esp)
movl -28(%esp), %ebx

to preserve EBX on stack.  Fix it with:

movl %ebx, 28(%esp)
movl %esp, %ebx
cfi_def_cfa_register(%ebx)
...
mov %ebx, %esp
cfi_def_cfa_register(%esp)
movl 28(%esp), %ebx

4. Update _dl_tlsdesc_dynamic to call ___tls_get_addr_internal directly.
5. Add have-test-mtls-traditional to compile tst-tls23-mod.c with
traditional TLS variant to verify the fix.
6. Define DL_RUNTIME_RESOLVE_REALIGN_STACK in sysdeps/x86/sysdep.h.

This fixes BZ #32996.

Co-Authored-By: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit 848f0e46f03f22404ed9a8aabf3fd5ce8809a1be)

elf: Preserve _rtld_global layout for the release branch

Backporting commit 9d6577fdff801a856383e69322f15e63424ad312
("elf: Introduce _dl_debug_change_state") removed the
_ns_debug member. Keep it to preseve struct layout.

elf: Test dlopen (NULL, RTLD_LAZY) from an ELF constructor

This call must not complete initialization of all shared objects
in the global scope because the ELF constructor which makes the call
likely has not finished initialization. Calling more constructors
at this point would expose those to a partially constructed
dependency.

This completes the revert of commit 9897ced8e78db5d813166a7ccccfd5a
("elf: Run constructors on cyclic recursive dlopen (bug 31986)").

(cherry picked from commit d604f9c500570e80febfcc6a52b63a002b466f35)

elf: Fix handling of symbol versions which hash to zero (bug 29190)

This was found through code inspection. No application impact is
known.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit 46d31980943d8be2f421c1e3276b265c7552636e)

x86-64: Add GLIBC_ABI_DT_X86_64_PLT [BZ #33212]

When the linker -z mark-plt option is used to add DT_X86_64_PLT,
DT_X86_64_PLTSZ and DT_X86_64_PLTENT, the r_addend field of the
R_X86_64_JUMP_SLOT relocation stores the offset of the indirect
branch instruction.  However, glibc versions without the commit:

commit f8587a61892cbafd98ce599131bf4f103466f084
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Fri May 20 19:21:48 2022 -0700

    x86-64: Ignore r_addend for R_X86_64_GLOB_DAT/R_X86_64_JUMP_SLOT

    According to x86-64 psABI, r_addend should be ignored for R_X86_64_GLOB_DAT
    and R_X86_64_JUMP_SLOT.  Since linkers always set their r_addends to 0, we
    can ignore their r_addends.

Reviewed-by: Fangrui Song <maskray@google.com>
won't ignore the r_addend value in the R_X86_64_JUMP_SLOT relocation.
Such programs and shared libraries will fail at run-time randomly.

Add GLIBC_ABI_DT_X86_64_PLT version to indicate that glibc is compatible
with DT_X86_64_PLT.

The linker can add the glibc GLIBC_ABI_DT_X86_64_PLT version dependency
whenever -z mark-plt is passed to the linker.  The resulting programs and
shared libraries will fail to load at run-time against libc.so without the
GLIBC_ABI_DT_X86_64_PLT version, instead of fail randomly.

This fixes BZ #33212.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
(cherry picked from commit 399384e0c8193e31aea014220ccfa24300ae5938)

x86-64: Add GLIBC_ABI_GNU2_TLS version [BZ #33129]

Programs and shared libraries compiled with -mtls-dialect=gnu2 may fail
silently at run-time against glibc without the GNU2 TLS run-time fix
for:

https://sourceware.org/bugzilla/show_bug.cgi?id=31372

Add GLIBC_ABI_GNU2_TLS version to indicate that glibc has the working
GNU2 TLS run-time. Linker can add the GLIBC_ABI_GNU2_TLS version to
binaries which depend on the working GNU2 TLS run-time:

https://sourceware.org/bugzilla/show_bug.cgi?id=33130

so that such programs and shared libraries will fail to load and run at
run-time against libc.so without the GLIBC_ABI_GNU2_TLS version, instead
of fail silently at random.

This fixes BZ #33129.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
(cherry picked from commit 9df8fa397d515dc86ff5565f6c45625e672d539e)

elf: Compile _dl_debug_state separately (bug 33224)

This ensures that the compiler will not inline it, so that
debuggers which do not use the Systemtap probes can reliably
set a breakpoint on it.

Reviewed-by: Andreas K. Huettel <dilfridge@gentoo.org>
Tested-by: Andreas K. Huettel <dilfridge@gentoo.org>
(cherry picked from commit 620f0730f311635cd0e175a3ae4d0fc700c76366)

elf: Restore support for _r_debug interpositions and copy relocations

The changes in commit a93d9e03a31ec14405cb3a09aa95413b67067380
("Extend struct r_debug to support multiple namespaces [BZ #15971]")
break the dyninst dynamic instrumentation tool.  It brings its
own definition of _r_debug (rather than a declaration).

Furthermore, it turns out it is rather hard to use the proposed
handshake for accessing _r_debug via DT_DEBUG. If applications want
to access _r_debug, they can do so directly if the relevant code has
been built as PIC.  To protect against harm from accidental copy
relocations due to linker relaxations, this commit restores copy
relocation support by adjusting both copies if interposition or
copy relocations are in play.  Therefore, it is possible to
use a hidden reference in ld.so to access _r_debug.

Only perform the copy relocation initialization if libc has been
loaded.  Otherwise, the ld.so search scope can be empty, and the
lookup of the _r_debug symbol mail fail.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit ea85e7d55087075376a29261e722e4fae14ecbe7)

elf: Introduce _dl_debug_change_state

It combines updating r_state with the debugger notification.

The second change to _dl_open introduces an additional debugger
notification for dlmopen, but debuggers are expected to ignore it.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit 8329939a37f483a16013dd8af8303cbcb86d92cb)

elf: Introduce separate _r_debug_array variable

It replaces the ns_debug member of the namespaces. Previously,
the base namespace had an unused ns_debug member.

This change also fixes a concurrency issue: Now _dl_debug_initialize
only updates r_next of the previous namespace's r_debug after the new
r_debug is initialized, so that only the initialized version is
observed. (Client code accessing _r_debug will benefit from load
dependency tracking in CPUs even without explicit barriers.)

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit 7278d11f3a0cd528188c719bab75575b0aea2c6e)

Use TLS initial-exec model for __libc_tsd_CTYPE_* thread variables [BZ #33234]

Commit 10a66a8e421b ("Remove <libc-tsd.h>") removed the TLS initial-exec
(IE) model attribute from the __libc_tsd_CTYPE_* thread variable declarations
and definitions. Commit a894f04d8776 ("Optimize __libc_tsd_* thread
variable access") restored it on declarations.

Restore the TLS initial-exec model attribute on __libc_tsd_CTYPE_* thread
variable definitions.

This resolves test tst-locale1 failure on s390 32-bit, when using a
GNU linker without the fix from GNU binutils commit aefebe82dc89
("IBM zSystems: Fix offset relative to static TLS").

Reviewed-by: Florian Weimer <fweimer@redhat.com>
(cherry picked from commit e5363e6f460c2d58809bf10fc96d70fd1ef8b5b2)

ctype: Fallback initialization of TLS using relocations (bug 19341, bug 32483)

This ensures that the ctype data pointers in TLS are valid
in secondary namespaces even without initialization via
__ctype_init.

Reviewed-by: Frédéric Bérat <fberat@redhat.com>
(cherry picked from commit 2745db8dd3ec31045acd761b612516490085bc20)

Use proper extern declaration for _nl_C_LC_CTYPE_{class,toupper,tolower}

The existing initializers already contain explicit casts. Keep them
due to int/uint32_t mismatch.

Reviewed-by: Frédéric Bérat <fberat@redhat.com>
(cherry picked from commit e0c0f856f58ceb68800a964c36c15c606e7a8c4c)

Remove <libc-tsd.h>

Use __thread variables directly instead.  The macros do not save any
typing.  It seems unlikely that a future port will lack __thread
variable support.

Some of the __libc_tsd_* variables are referenced from assembler
files, so keep their names.  Previously, <libc-tls.h> included
<tls.h>, which in turn included <errno.h>, so a few direct includes
of <errno.h> are now required.

Reviewed-by: Frédéric Bérat <fberat@redhat.com>
(cherry picked from commit 10a66a8e421b09682b774c795ef1da402235dddc)

AArch64: Improve codegen SVE log1p helper

Improve codegen by packing coefficients.
4% and 2% improvement in throughput microbenchmark on Neoverse V1, for acosh
and atanh respectively.

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
(cherry picked from commit 6849c5b791edd216f2ec3fdbe4d138bc69b9b333)

AArch64: Optimise SVE FP64 Hyperbolics

Reworke SVE FP64 hyperbolics to use the SVE FEXPA
instruction.

Also update the special case handelling for large
inputs to be entirely vectorised.

Performance improvements on Neoverse V1:

cosh_sve: 19% for |x| < 709, 5x otherwise
sinh_sve: 24% for |x| < 709, 5.9x otherwise
tanh_sve: 12% for |x| < 19, 9x otherwise

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
(cherry picked from commit dee22d2a81ab59afc165fb6dcb45d723f13582a0)

AArch64: Optimize SVE exp functions

Improve performance of SVE exps by making better use
of the SVE FEXPA instruction.

Performance improvement on Neoverse V1:
exp2_sve:   21%
exp2f_sve:  24%
exp10f_sve: 23%
expm1_sve:  25%

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
(cherry picked from commit 1e3d1ddf977ecd653de8d0d10eb083d80ac21cf3)

AArch64: Improve codegen in SVE log1p

Improves memory access, reformat evaluation scheme to pack coefficients.
5% improvement in throughput microbenchmark on Neoverse V1.

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
(cherry picked from commit da196e6134ede64728006518352d75b6c3902fec)

AArch64: Optimize inverse trig functions

Improve performance of Inverse trig functions by altering how coefficients are
loaded.

Performance improvement on Neoverse V1:
SVE     acos   14%
AdvSIMD acos   6%

AdvSIMD asin   6%
SVE     asin   5%
AdvSIMD asinf  2%

AdvSIMD atanf  22%
SVE     atanf  20%
SVE     atan   11%
AdvSIMD atan   5%

SVE     atan2  7%
SVE     atan2f 4%
AdvSIMD atan2f 3%
AdvSIMD atan2  2%

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
(cherry picked from commit 1e84509e0041c0a83997aba602a585bb3b8285f0)

AArch64: Optimize algorithm in users of SVE expf helper

Polynomial order was unnecessarily high, unlocking multiple
optimizations.
Max error for new SVE expf is 0.88 +0.5ULP.
Max error for new SVE coshf is 2.56 +0.5ULP.
Performance improvement on Neoverse V1: expf (30%), coshf (26%).

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
(cherry picked from commit cf56eb28fa277d9dbb301654682ca89f71c30a48)

AArch64: Avoid memset ifunc in cpu-features.c [BZ #33112]

During early startup memcpy or memset must not be called since many targets
use ifuncs for them which won't be initialized yet. Security hardening may
use -ftrivial-auto-var-init=zero which inserts calls to memset. Redirect
memset to memset_generic by including dl-symbol-redir-ifunc.h in cpu-features.c.
This fixes BZ #33112.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit 681a24ae4d0cb8ed92de98b4da660308840b09ba)

nptl: Fix SYSCALL_CANCEL for return values larger than INT_MAX (BZ 33245)

The SYSCALL_CANCEL calls __syscall_cancel, which in turn
calls __internal_syscall_cancel with an 'int' return instead of the
expected 'long int'. This causes issues with syscalls that return
values larger than INT_MAX, such as copy_file_range [1].

Checked on x86_64-linux-gnu.

[1] https://debbugs.gnu.org/cgi/bugreport.cgi?bug=79139

Reviewed-by: Andreas K. Huettel <dilfridge@gentoo.org>
(cherry picked from commit 7107bebf19286f42dcb0a97581137a5893c16206)

elf: Handle ld.so with LOAD segment gaps in _dl_find_object (bug 31943)

Detect if ld.so not contiguous and handle that case in _dl_find_object.
Set l_find_object_processed even for initially loaded link maps,
otherwise dlopen of an initially loaded object adds it to
_dlfo_loaded_mappings (where maps are expected to be contiguous),
in addition to _dlfo_nodelete_mappings.

Test elf/tst-link-map-contiguous-ldso iterates over the loader
image, reading every word to make sure memory is actually mapped.
It only does that if the l_contiguous flag is set for the link map.
Otherwise, it finds gaps with mmap and checks that _dl_find_object
does not return the ld.so mapping for them.

The test elf/tst-link-map-contiguous-main does the same thing for
the libc.so shared object. This only works if the kernel loaded
the main program because the glibc dynamic loader may fill
the gaps with PROT_NONE mappings in some cases, making it contiguous,
but accesses to individual words may still fault.

Test elf/tst-link-map-contiguous-libc is again slightly different
because the dynamic loader always fills the gaps with PROT_NONE
mappings, so a different form of probing has to be used.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit 20681be149b9eb1b6c1f4246bf4bd801221c86cd)

elf: Extract rtld_setup_phdr function from dl_main

Remove historic binutils reference from comment and update
how this data is used by applications.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit 2cac9559e06044ba520e785c151fbbd25011865f)

elf: Do not add a copy of _dl_find_object to libc.so

This reduces code size and dependencies on ld.so internals from
libc.so.

Fixes commit f4c142bb9fe6b02c0af8cfca8a920091e2dba44b
("arm: Use _dl_find_object on __gnu_Unwind_Find_exidx (BZ 31405)").

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit 96429bcc91a14f71b177ddc5e716de3069060f2c)

stdlib: resolve a double lock init issue after fork [BZ #32994]

The __abort_fork_reset_child (introduced in
d40ac01cbbc66e6d9dbd8e3485605c63b2178251) call resets the lock after the
fork. This causes a DRD regression in valgrind
(https://bugs.kde.org/show_bug.cgi?id=503668), as it's effectively a
double initialization, despite it being actually ok in this case. As
suggested in https://sourceware.org/bugzilla/show_bug.cgi?id=32994#c2
we replace it here with a memcpy of another initialized lock instead,
which makes valgrind happy.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
(cherry picked from commit d9a348d0927c7a1aec5caf3df3fcd36956b3eb23)

iconv: iconv -o should not create executable files (bug 33164)

The mistake is that open must use 0666 to pick up the umask,
and not 0777 (which is required by mkdir).

Fixes commit 8ef3cff9d1ceafe369f982d980678d749fb93bd2
("iconv: Support in-place conversions (bug 10460, bug 32033)").

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit cdcf24ee14c27b77744ff52ab3ae852821207eb0)

posix: Fix double-free after allocation failure in regcomp (bug 33185)

If a memory allocation failure occurs during bracket expression
parsing in regcomp, a double-free error may result.

Reported-by: Anastasia Belova <abelova@astralinux.ru>
Co-authored-by: Paul Eggert <eggert@cs.ucla.edu>
Reviewed-by: Andreas K. Huettel <dilfridge@gentoo.org>
(cherry picked from commit 7ea06e994093fa0bcca0d0ee2c1db271d8d7885d)

Fix error reporting (false negatives) in SGID tests

And simplify the interface of support_capture_subprogram_self_sgid.

Use the existing framework for temporary directories (now with
mode 0700) and directory/file deletion.  Handle all execution
errors within support_capture_subprogram_self_sgid.  In particular,
this includes test failures because the invoked program did not
exit with exit status zero.  Existing tests that expect exit
status 42 are adjusted to use zero instead.

In addition, fix callers not to call exit (0) with test failures
pending (which may mask them, especially when running with --direct).

Fixes commit 35fc356fa3b4f485bd3ba3114c9f774e5df7d3c2
("elf: Fix subprocess status handling for tst-dlopen-sgid (bug 32987)").

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit 3a3fb2ed83f79100c116c824454095ecfb335ad7)

support: Pick group in support_capture_subprogram_self_sgid if UID == 0

When running as root, it is likely that we can run under any group.
Pick a harmless group from /etc/group in this case.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit 2f769cec448d84a62b7dd0d4ff56978fe22c0cd6)

sparc: Fix sparc32 Fix argument passing to __libc_start_main (BZ 32981)

Commit 404526ee2e58f3c075253943ddc9988f4bd6b80c changed _start to write
the last argument to __libc_start_main without taking into consideration
that the function did not create a full stack frame, which leads to
overwriting the argv[0].

(cherry picked from commit 8788bd77d68c6429c7f2dcbd22765525555c3cd8)

ppc64le: Revert "powerpc: Optimized strcmp for power10" (CVE-2025-5702)

This reverts commit 3367d8e180848030d1646f088759f02b8dfe0d6f

Reason for revert: Power10 strcmp clobbers non-volatile vector
registers (Bug 33056)

Tested on ppc64le without regression.

(cherry picked from commit 15808c77b35319e67ee0dc8f984a9a1a434701bc)

ppc64le: Revert "powerpc : Add optimized memchr for POWER10" (Bug 33059)

This reverts commit b9182c793caa05df5d697427c0538936e6396d4b

Reason for revert: Power10 memchr clobbers v20 vector register
(Bug 33059)

This is not a security issue, unlike CVE-2025-5745 and
CVE-2025-5702.

Tested on ppc64le without regression.

(cherry picked from commit a7877bb6685300f159fa095c9f50b22b112cddb8)

ppc64le: Revert "powerpc: Fix performance issues of strcmp power10" (CVE-2025-5702)

This reverts commit 90bcc8721ef82b7378d2b080141228660e862d56

This change is in the chain of the final revert that fixes the CVE
i.e. 3367d8e180848030d1646f088759f02b8dfe0d6f

Reason for revert: Power10 strcmp clobbers non-volatile vector
registers (Bug 33056)

Tested on ppc64le with no regressions.

(cherry picked from commit c22de63588df7a8a0edceea9bb02534064c9d201)

ppc64le: Revert "powerpc: Optimized strncmp for power10" (CVE-2025-5745)

This reverts commit 23f0d81608d0ca6379894ef81670cf30af7fd081

Reason for revert: Power10 strncmp clobbers non-volatile vector
registers (Bug 33060)

Tested on ppc64le with no regressions.

(cherry picked from commit 63c60101ce7c5eac42be90f698ba02099b41b965)

sparc: Fix argument passing to __libc_start_main (BZ 32981)

sparc start.S does not provide the final argument for
__libc_start_main, which is the highest stack address used to
update the __libc_stack_end.A

This fixes elf/tst-execstack-prog-static-tunable on sparc64.
On sparcv9 this does not happen because the kernel puts an
auxv value, which turns to point to a value in the stack itself.

Checked on sparc64-linux-gnu.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
(cherry picked from commit 404526ee2e58f3c075253943ddc9988f4bd6b80c)

elf: Fix subprocess status handling for tst-dlopen-sgid (bug 32987)

This should really move into support_capture_subprogram_self_sgid.

Reviewed-by: Sam James <sam@gentoo.org>
(cherry picked from commit 35fc356fa3b4f485bd3ba3114c9f774e5df7d3c2)

x86_64: Fix typo in ifunc-impl-list.c.

Fix wcsncpy and wcpncpy typo in ifunc-impl-list.c.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit f2aeb6ff941dccc4c777b5621e77addea6cc076c)

elf: Test case for bug 32976 (CVE-2025-4802)

Check that LD_LIBRARY_PATH is ignored for AT_SECURE statically
linked binaries, using support_capture_subprogram_self_sgid.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit d8f7a79335b0d861c12c42aec94c04cd5bb181e2)

support: Use const char * argument in support_capture_subprogram_self_sgid

The function does not modify the passed-in string, so make this clear
via the prototype.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit f0c09fe61678df6f7f18fe1ebff074e62fa5ca7a)

elf: Keep using minimal malloc after early DTV resize (bug 32412)

If an auditor loads many TLS-using modules during startup, it is
possible to trigger DTV resizing. Previously, the DTV was marked
as allocated by the main malloc afterwards, even if the minimal
malloc was still in use. With this change, _dl_resize_dtv marks
the resized DTV as allocated with the minimal malloc.

The new test reuses TLS-using modules from other auditing tests.

Reviewed-by: DJ Delorie <dj@redhat.com>
(cherry picked from commit aa3d7bd5299b33bffc118aa618b59bfa66059bcb)

hurd: Fix tst-stack2 test build on Hurd

It requires $(shared-thread-library). Fixes 0c342594237.

Checked on a i686-gnu build.

(cherry picked from commit f66cb3c9ebcac80b3200c3aff0e3aed6111547ba)

nptl: Fix pthread_getattr_np when modules with execstack are allowed (BZ 32897)

The BZ 32653 fix (12a497c716f0a06be5946cabb8c3ec22a079771e) kept the
stack pointer zeroing from make_main_stack_executable on
_dl_make_stack_executable. However, previously the 'stack_endp'
pointed to temporary variable created before the call of
_dl_map_object_from_fd; while now we use the __libc_stack_end
directly.

Since pthread_getattr_np relies on correct __libc_stack_end, if
_dl_make_stack_executable is called (for instance, when
glibc.rtld.execstack=2 is set) __libc_stack_end will be set to zero,
and the call will always fail.

The __libc_stack_end zero was used a mitigation hardening, but since
52a01100ad011293197637e42b5be1a479a2f4ae it is used solely on
pthread_getattr_np code. So there is no point in zeroing anymore.

Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Sam James <sam@gentoo.org>
(cherry picked from commit 0c3425942374e72c3bcac28b2578117d36b0f9df)

elf: tst-audit10: split AVX512F code into dedicated functions [BZ #32882]

"Recent" GCC versions (since commit fc62716fe8d1, backported to stable
branches) emit a vzeroupper instruction at the end of functions
containing AVX instructions. This causes the tst-audit10 test to fail
on CPUs lacking AVX instructions, despite the AVX512F check. The crash
occurs in the pltenter function of tst-auditmod10b.c.

Fix that by moving the code guarded by the check_avx512 function into
specific functions using the target ("avx512f") attribute. Note that
since commit 5359c3bc91cc ("x86-64: Remove compiler -mavx512f check") it
is safe to assume that the compiler has AVX512F support, thus the
__AVX512F__ checks can be dropped.

Tested on non-AVX, AVX2 and AVX512F machines.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
(cherry picked from commit e78caeb4ff812ae19d24d65f4d4d48508154277b)

x86: Detect Intel Diamond Rapids

Detect Intel Diamond Rapids and tune it similar to Intel Granite Rapids.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>
(cherry picked from commit de14f1959ee5f9b845a7cae43bee03068b8136f0)

x86: Handle unknown Intel processor with default tuning

Enable default tuning for unknown Intel processor.

Tested on x86, no regression.

Co-Authored-By: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit 9f0deff558d1d6b08c425c157f50de85013ada9c)

x86: Add ARL/PTL/CWF model detection support

- Add ARROWLAKE model detection.
- Add PANTHERLAKE model detection.
- Add CLEARWATERFOREST model detection.

Intel® Architecture Instruction Set Extensions Programming Reference
https://cdrdv2.intel.com/v1/dl/getContent/671368 Section 1.2.

No regression, validated model detection on SDE.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit e53eb952b970ac94c97d74fb447418fb327ca096)

x86: Optimize xstate size calculation

Scan xstate IDs up to the maximum supported xstate ID. Remove the
separate AMX xstate calculation. Instead, exclude the AMX space from
the start of TILECFG to the end of TILEDATA in xsave_state_size.

Completed validation on SKL/SKX/SPR/SDE and compared xsave state size
with "ld.so --list-diagnostics" option, no regression.

Co-Authored-By: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>
(cherry picked from commit 70b648855185e967e54668b101d24704c3fb869d)

elf: Fix arm-linux-gnueabihf build break from b861755a84

elf: Extend glibc.rtld.execstack tunable to force executable stack (BZ 32653)

From the bug report [1], multiple programs still require to dlopen
shared libraries with either missing PT_GNU_STACK or with the executable
bit set.  Although, in some cases, it seems to be a hard-craft assembly
source without the required .note.GNU-stack marking (so the static linker
is forced to set the stack executable if the ABI requires it), other
cases seem that the library uses trampolines [2].

Unfortunately, READ_IMPLIES_EXEC is not an option since on some ABIs
(x86_64), the kernel clears the bit, making it unsupported.  To avoid
reinstating the broken code that changes stack permission on dlopen
(0ca8785a28), this patch extends the glibc.rtld.execstack tunable to
allow an option to force an executable stack at the program startup.

The tunable is a security issue because it defeats the PT_GNU_STACK
hardening.  It has the slight advantage of making it explicit by the
caller, and, as for other tunables, this is disabled for setuid binaries.
A tunable also allows us to eventually remove it, but from previous
experiences, it would require some time.

Checked on aarch64-linux-gnu, x86_64-linux-gnu, and i686-linux-gnu.

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=32653
[2] https://github.com/conda-forge/ctng-compiler-activation-feedstock/issues/143
Reviewed-by: Sam James <sam@gentoo.org>
(cherry picked from commit 12a497c716f0a06be5946cabb8c3ec22a079771e)

x86: Link tst-gnu2-tls2-x86-noxsave{,c,xsavec} with libpthread

This fixes a test build failure on Hurd.

Fixes commit 145097dff170507fe73190e8e41194f5b5f7e6bf ("x86: Use separate
variable for TLSDESC XSAVE/XSAVEC state size (bug 32810)").

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit c6e2895695118ab59c7b17feb0fcb75a53e3478c)

x86: Use separate variable for TLSDESC XSAVE/XSAVEC state size (bug 32810)

Previously, the initialization code reused the xsave_state_full_size
member of struct cpu_features for the TLSDESC state size. However,
the tunable processing code assumes that this member has the
original XSAVE (non-compact) state size, so that it can use its
value if XSAVEC is disabled via tunable.

This change uses a separate variable and not a struct member because
the value is only needed in ld.so and the static libc, but not in
libc.so. As a result, struct cpu_features layout does not change,
helping a future backport of this change.

Fixes commit 9b7091415af47082664717210ac49d51551456ab ("x86-64:
Update _dl_tlsdesc_dynamic to preserve AMX registers").

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit 145097dff170507fe73190e8e41194f5b5f7e6bf)

x86: Skip XSAVE state size reset if ISA level requires XSAVE

If we have to use XSAVE or XSAVEC trampolines, do not adjust the size
information they need. Technically, it is an operator error to try to
run with -XSAVE,-XSAVEC on such builds, but this change here disables
some unnecessary code with higher ISA levels and simplifies testing.

Related to commit befe2d3c4dec8be2cdd01a47132e47bdb7020922
("x86-64: Don't use SSE resolvers for ISA level 3 or above").

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit 59585ddaa2d44f22af04bb4b8bd4ad1e302c4c02)

x86_64: Add atanh with FMA

On SPR, it improves atanh bench performance by:

Before After Improvement
reciprocal-throughput 15.1715 14.8628 2%
latency 57.1941 56.1883 2%

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit c7c4a5906f326f1290b1c2413a83c530564ec4b8)

x86_64: Add sinh with FMA

On SPR, it improves sinh bench performance by:

Before After Improvement
reciprocal-throughput 14.2017 11.815 17%
latency 36.4917 35.2114 4%

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit dded0d20f67ba1925ccbcb9cf28f0c75febe0dbe)

x86_64: Add tanh with FMA

On Skylake, it improves tanh bench performance by:

Before After Improvement
max 110.89 95.826 14%
min 20.966 20.157 4%
mean 30.9601 29.8431 4%

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit c6352111c72a20b3588ae304dd99b63e25dd6d85)

nptl: Check if thread is already terminated in sigcancel_handler (BZ 32782)

The SIGCANCEL signal handler should not issue __syscall_do_cancel,
which calls __do_cancel and __pthread_unwind, if the cancellation
is already in proces (and libgcc unwind is not reentrant). Any
cancellation signal received after is ignored.

Checked on x86_64-linux-gnu and aarch64-linux-gnu.

Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
(cherry picked from commit 360cce0b066f34e85e473c04cdc16e6fa426021b)

nptl: PTHREAD_COND_INITIALIZER compatibility with pre-2.41 versions (bug 32786)

The new initializer and struct layout does not initialize the
__g_signals field in the old struct layout before the change in
commit c36fc50781995e6758cae2b6927839d0157f213c ("nptl: Remove
g_refs from condition variables"). Bring back fields at the end
of struct __pthread_cond_s, so that they are again zero-initialized.

Reviewed-by: Sam James <sam@gentoo.org>
(cherry picked from commit dbc5a50d12eff4cb3f782129029d04b8a76f58e7)

nptl: clear the whole rseq area before registration

Due to the extensible nature of the rseq area we can't explictly
initialize fields that are not part of the ABI yet. It was agreed with
upstream that all new fields will be documented as zero initialized by
userspace. Future kernels configured with CONFIG_DEBUG_RSEQ will
validate the content of all fields during registration.

Replace the explicit field initialization with a memset of the whole
rseq area which will cover fields as they are added to future kernels.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
(cherry picked from commit 689a62a4217fae78b9ce0db781dc2a421f2b1ab4)

Linux: Remove attribute access from sched_getattr (bug 32781)

The GCC attribute expects an element count, not bytes.

(cherry picked from commit 74c68fa61b5ebf4c64605a3cc5e47154a66671ce)

math: Remove an extra semicolon in math function declarations

Commit 6bc301672bfbd ("math: Remove __XXX math functions from installed
math.h [BZ #32418]") left an extra semicolon after macro expansion. For
instance the ceil declaration after expansion is:

extern double ceil (double __x) __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__const__));;

This chokes very naive parsers like gauche c-wrapper. Fix that by
removing that extra semicolon in the macro.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit 443cb0b5f25129dd0f1e9f9101299d31c4700b7f)

posix: Move environ helper variables next to environ definition (bug 32541)

This helps with statically interposing getenv.

Updates commit 7a61e7f557a97ab597d6fca5e2d1f13f65685c61
("stdlib: Make getenv thread-safe in more cases").

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit 6ef0bd02dbe34aab8b956ffa2db5679341d520f5)

configure: Fix spelling of -Wl,--no-error-execstack option

BFD ld recognizes all -no-* options (with a single leading dash)
unconditionally.

Fixes commit a2bd5008a99032830add3e4005c25b61e3207112
("Pass -Wl,--no-error-execstack for tests where -Wl,-z,execstack
is used [PR32717]").

(cherry picked from commit 59dc232df277c21239c357e3519682c26e182cd7)

elf: Check if __attribute__ ((aligned (65536))) is supported

The BZ #32763 tests fail to build for MicroBlaze (which defines
MAX_OFILE_ALIGNMENT to (32768*8) in GCC, so __attribute__ ((aligned
(65536))) is unsupported). Add a configure-time check to enable BZ #32763
tests only if __attribute__ ((aligned (65536))) is supported.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
(cherry picked from commit 9b91484bee8f6f1bf1e2d26a8df461b553784528)

static-pie: Skip the empty PT_LOAD segment at offset 0 [BZ #32763]

As shown in

https://sourceware.org/bugzilla/show_bug.cgi?id=25237

linker may generate an empty PT_LOAD segments at offset 0:

Elf file type is EXEC (Executable file)
Entry point 0x4000e8
There are 3 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
                 0x00000000000000f0 0x00000000000000f0  R E    0x1000
  LOAD           0x0000000000000000 0x0000000000410000 0x0000000000410000
                 0x0000000000000000 0x0000000000b5dce8  RW     0x10000
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     0x10

Section to Segment mapping:
  Segment Sections...
   00     .text
   01     .bss
   02

Skip the empty PT_LOAD segment at offset 0 to support such binaries.
This fixes BZ #32763.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
(cherry picked from commit 596130591ae4b058a529cc1318b95e624559054c)

Pass -Wl,--no-error-execstack for tests where -Wl,-z,execstack is used [PR32717]

When GNU Binutils is configured with --enable-error-execstack=yes, a handful
of our tests which rely on -Wl,-z,execstack fail. Pass --Wl,--no-error-execstack
to override the behaviour and get a warning instead.

Bug: https://sourceware.org/PR32717
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit a2bd5008a99032830add3e4005c25b61e3207112)

AArch64: Use prefer_sve_ifuncs for SVE memset

Use prefer_sve_ifuncs for SVE memset just like memcpy.

Reviewed-by: Yury Khrustalev <yury.khrustalev@arm.com>
(cherry picked from commit 0f044be1dae5169d0e57f8d487b427863aeadab4)

AArch64: Add SVE memset

Add SVE memset based on the generic memset with predicated load for sizes < 16.
Unaligned memsets of 128-1024 are improved by ~20% on average by using aligned
stores for the last 64 bytes. Performance of random memset benchmark improves
by ~2% on Neoverse V1.

Reviewed-by: Yury Khrustalev <yury.khrustalev@arm.com>
(cherry picked from commit 163b1bbb76caba4d9673c07940c5930a1afa7548)

math: Improve layout of exp/exp10 data

GCC aligns global data to 16 bytes if their size is >= 16 bytes.  This patch
changes the exp_data struct slightly so that the fields are better aligned
and without gaps.  As a result on targets that support them, more load-pair
instructions are used in exp.  Exp10 is improved by moving invlog10_2N later
so that neglog10_2hiN and neglog10_2loN can be loaded using load-pair.

The exp benchmark improves 2.5%, "144bits" by 7.2%, "768bits" by 12.7% on
Neoverse V2.  Exp10 improves by 1.5%.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit 5afaf99edb326fd9f36eb306a828d129a3a1d7f7)

aarch64: Add GCS test with signal handler

Test that when we return from a function that enabled GCS at runtime
we get SIGSEGV. Also test that ucontext contains GCS block with the
GCS pointer.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

aarch64: Add GCS tests for dlopen

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

aarch64: Add GCS tests for transitive dependencies

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

aarch64: Add tests for Guarded Control Stack

These tests validate that GCS tunable works as expected depending
on the GCS markings in the test binaries.

Tests validate both static and dynamically linked binaries.

These new tests are AArch64 specific. Moreover, they are included only
if linker supports the "-z gcs=<value>" option. If built, these tests
will run on systems with and without HWCAP_GCS. In the latter case the
tests will be reported as UNSUPPORTED.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

aarch64: Add configure checks for GCS support

- Add check that linker supports -z gcs=...
- Add checks that main and test compiler support
-mbranch-protection=gcs

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

AArch64: Improve codegen for SVE powf

Improve memory access with indexed/unpredicated instructions.
Eliminate register spills. Speedup on Neoverse V1: 3%.

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
(cherry picked from commit 95e807209b680257a9afe81a507754f1565dbb4d)

AArch64: Improve codegen for SVE pow

Move constants to struct. Improve memory access with indexed/unpredicated
instructions. Eliminate register spills. Speedup on Neoverse V1: 24%.

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
(cherry picked from commit 0b195651db3ae793187c7dd6d78b5a7a8da9d5e6)

AArch64: Improve codegen for SVE erfcf

Reduce number of MOV/MOVPRFXs and use unpredicated FMUL.
Replace MUL with LSL. Speedup on Neoverse V1: 6%.

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
(cherry picked from commit f5ff34cb3c75ec1061c75bb9188b3c1176426947)

Aarch64: Improve codegen in SVE exp and users, and update expf_inline

Use unpredicted muls, and improve memory access.
7%, 3% and 1% improvement in throughput microbenchmark on Neoverse V1,
for exp, exp2 and cosh respectively.

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
(cherry picked from commit c0ff447edf19bd4630fe79adf5e8b896405b059f)

Aarch64: Improve codegen in SVE asinh

Use unpredicated muls, use lanewise mla's and improve memory access.
1% regression in throughput microbenchmark on Neoverse V1.

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
(cherry picked from commit 8f0e7fe61e0a2ad5ed777933703ce09053810ec4)

RISC-V: Fix IFUNC resolver cannot access gp pointer

In some cases, an IFUNC resolver may need to access the gp pointer to
access global variables. Such an object may have l_relocated == 0 at
this time. In this case, an IFUNC resolver will fail to access a global
variable and cause a SIGSEGV.

This patch fixes this issue by relaxing the check of l_relocated in
elf_machine_runtime_setup, but added a check for SHARED case to avoid
using this code in static-linked executables. Such object have already
set up the gp pointer in load_gp function and l->l_scope will be NULL if
it is a pie object. So if we use these code to set up the gp pointer
again for static-pie, it will causing a SIGSEGV in glibc as original bug
on BZ #31317.

I have also reproduced and checked BZ #31317 using the mold commit
bed5b1731b ("illumos: Treat absolute symbols specially"), this patch can
fix the issue.

Also, we used the wrong gp pointer previously because ref->st_value is
not the relocated address but just the offset from the base address of
ELF. An edge case may happen if we reference gp pointer in a IFUNC
resolver in a PIE object, but it will not happen in compiler-generated
codes since -pie will disable relax to gp. In this case, the GP will be
initialized incorrectly since the ref->st_value is not the address after
relocation. This patch fixes this issue by adding the l->l_addr to
ref->st_value to get the relocated address for the gp pointer. We don't
use SYMBOL_ADDRESS macro here because __global_pointer$ is a special
symbol that has SHN_ABS type, but it will use PC-relative addressing in
the load_gp function using lla.

Closes: BZ #32269
Fixes: 96d1b9ac23 ("RISC-V: Fix the static-PIE non-relocated object check")
Co-authored-by: Vivian Wang <dramforever@live.com>
Signed-off-by: Yangyu Chen <cyy@cyyself.name>
(cherry picked from commit 3fd2ff7685e3ee85c8cd2896f28ad62f67d7c483)

math: Add optimization barrier to ensure a1 + u.d is not reused [BZ #30664]

A number of fma tests started to fail on hppa when gcc was changed to
use Ranger rather than EVRP.  Eventually I found that the value of
a1 + u.d in this is block of code was being computed in FE_TOWARDZERO
mode and not the original rounding mode:

    if (TININESS_AFTER_ROUNDING)
      {
        w.d = a1 + u.d;
        if (w.ieee.exponent == 109)
          return w.d * 0x1p-108;
      }

This caused the exponent value to be wrong and the wrong return path
to be used.

Here we add an optimization barrier after the rounding mode is reset
to ensure that the previous value of a1 + u.d is not reused.

Signed-off-by: John David Anglin <dave.anglin@bell.net>

math: Fix `unknown type name '__float128'` for clang 3.4 to 3.8.1 (bug 32694)

When compiling a program that includes <bits/floatn.h> using a clang version
between 3.4 (included) and 3.8.1 (included), clang will fail with `unknown type
name '__float128'; did you mean '__cfloat128'?`. This changes fixes the clang
prerequirements macro call in floatn.h to check for clang 3.9 instead of 3.4,
since support for __float128 was actually enabled in 3.9 by:

commit 50f29e06a1b6a38f0bba9360cbff72c82d46cdd4
Author: Nemanja Ivanovic <nemanja.i.ibm@gmail.com>
Date: Wed Apr 13 09:49:45 2016 +0000

Enable support for __float128 in Clang

This fixes bug 32694.

Signed-off-by: koraynilay <koray.fra@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit 29803ed3ce420f01e7c567c97fc8945d5e5e5992)

x86 (__HAVE_FLOAT128): Defined to 0 for Intel SYCL compiler [BZ #32723]

Intel compiler always defines __INTEL_LLVM_COMPILER. When SYCL is
enabled by -fsycl, it also defines SYCL_LANGUAGE_VERSION. Since Intel
SYCL compiler doesn't support _Float128:

https://github.com/intel/llvm/issues/16903

define __HAVE_FLOAT128 to 0 for Intel SYCL compiler.

This fixes BZ #32723.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
(cherry picked from commit 5a4573be6f96ff49111bb6cae767676b5aafa7a8)

Fix tst-aarch64-pkey to handle ENOSPC as not supported

The syscall pkey_alloc can return ENOSPC to indicate either that all
keys are in use or that the system runs in a mode in which memory
protection keys are disabled. In such case the test should not fail and
just return unsupported.

This matches the behaviour of the generic tst-pkey.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
(cherry picked from commit 60f2d6be657aa8c663ee14bd266d343ae0f35afb)

assert: Add test for CVE-2025-0395

Use the __progname symbol to override the program name to induce the
failure that CVE-2025-0395 describes.

This is related to BZ #32582

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit cdb9ba84191ce72e86346fb8b1d906e7cd930ea2)

math: Fix tanf for some inputs (BZ 32630)

The logic was copied wrong from CORE-MATH.

(cherry picked from commit 09e7f4d594b4308fbea18e3044148d67b59757c9)

nptl: Correct stack size attribute when stack grows up [BZ #32574]

Set stack size attribute to the size of the mmap'd region only
when the size of the remaining stack space is less than the size
of the mmap'd region.

This was reversed.  As a result, the initial stack size was only
135168 bytes.  On architectures where the stack grows down, the
initial stack size is approximately 8384512 bytes with the default
rlimit settings.  The small main stack size on hppa broke
applications like ruby that check for stack overflows.

Signed-off-by: John David Anglin <dave.anglin@bell.net>

math: Fix sinhf for some inputs (BZ 32627)

The logic was copied wrong from CORE-MATH.

math: Fix log10p1f internal table value (BZ 32626)

It was copied wrong from CORE-MATH.

(cherry picked from commit c79277a16785c8ae96d821414f4d31d654a0177c)

NEWS: start new section

Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>

Remove advisories from release branch

Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>

Create ChangeLog.old/ChangeLog.30

Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>