Arjun Shankar [Mon, 26 Jan 2026 12:49:37 +0000 (13:49 +0100)]
dlfcn: Add dlinfo request type RTLD_DI_ORIGIN_PATH (bug #24298)
The existing dlinfo request type RTLD_DI_ORIGIN used for querying the
value of the '$ORIGIN' dynamic string token is prone to buffer
overflows.
This commit adds a new request type named RTLD_DI_ORIGIN_PATH that
returns a pointer to the dynamic string token (i.e. the 'l_origin' field
in the link map) instead. The dlinfo manual is updated with the new
request type, and the description of RTLD_DI_ORIGIN is updated to
recommend RTLD_DI_ORIGIN_PATH instead.
A test for the new request type is also added to tst-dlinfo.
Aurelien Jarno [Tue, 20 Jan 2026 17:25:08 +0000 (18:25 +0100)]
Fix ldbl-128ibm ceill, floorl, roundl and truncl zero-sign handling
When the result of ceill, floorl, roundl and truncl is zero, the sign of
the result must match the sign of the input. For the IBM 128-bit long
double format, the sign is determined by the high part.
Ensure the correct sign when the high part is the result of
computations, by copying the sign from the input high part to the output
high part. On POWER, this conveniently maps to the fcpsgn instruction.
In addition add test for the values provided in BZ #33623, and for the
opposite value when the result is 0.
Florian Weimer [Sat, 24 Jan 2026 09:29:39 +0000 (10:29 +0100)]
support: Reinitialize containers if /etc is present
This prevents test failures because configuration file leftovers
unexpectedly change glibc for future tests. Whether this
triggers depends on test execution order.
Adding postclean.req files manually (before this change) appears
too error-prone.
posix: Reset wordexp_t fields with WRDE_REUSE (CVE-2025-15281 / BZ 33814)
The wordexp fails to properly initialize the input wordexp_t when
WRDE_REUSE is used. The wordexp_t struct is properly freed, but
reuses the old wc_wordc value and updates the we_wordv in the
wrong position. A later wordfree will then call free with an
invalid pointer.
Xi Ruoyao [Thu, 15 Jan 2026 08:24:57 +0000 (16:24 +0800)]
Linux: fix tst-copy_file_range-large failure in 32-bit glibc build on 64-bit kernel [BZ 33790]
Reported-by: H. J. Lu <hjl.tools@gmail.com> Signed-off-by: Xi Ruoyao <xry111@xry111.site> Reviewed-by: Florian Weimer <fweimer@redhat.com> Tested-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Florian Weimer [Thu, 15 Jan 2026 21:29:46 +0000 (22:29 +0100)]
elf: Ignore LD_PROFILE if LD_PROFILE_OUTPUT is not set (bug 33797)
The previous default for LD_PROFILE_OUTPUT, /var/tmp, is insecure
because it's typically a 1777 directory, and other systems could
place malicious files there which interfere with execution.
Requiring the user to specify a profiling directory mitigates
the impact of bug 33797. Clear LD_PROFILE_OUTPUT alongside
with LD_PROFILE.
Rework the test not to use predictable file names.
Carlos O'Donell [Thu, 15 Jan 2026 20:09:38 +0000 (15:09 -0500)]
resolv: Fix NSS DNS backend for getnetbyaddr (CVE-2026-0915)
The default network value of zero for net was never tested for and
results in a DNS query constructed from uninitialized stack bytes.
The solution is to provide a default query for the case where net
is zero.
Adding a test case for this was straight forward given the existence of
tst-resolv-network and if the test is added without the fix you observe
this failure:
FAIL: resolv/tst-resolv-network
original exit status 1
error: tst-resolv-network.c:174: invalid QNAME: \146\218\129\128
error: 1 test failures
With a random QNAME resulting from the use of uninitialized stack bytes.
After the fix the test passes.
Additionally verified using wireshark before and after to ensure
on-the-wire bytes for the DNS query were as expected.
The change to cap valid sizes to PTRDIFF_MAX inadvertently dropped the
overflow check for alignment in memalign functions, _mid_memalign and
_int_memalign. Reinstate the overflow check in _int_memalign, aligned
with the PTRDIFF_MAX change since that is directly responsible for the
CVE. The missing _mid_memalign check is not relevant (and does not have
a security impact) and may need a different approach to fully resolve,
so it has been omitted.
CVE-Id: CVE-2026-0861
Vulnerable-Commit: 9bf8e29ca136094f73f69f725f15c51facc97206 Reported-by: Igor Morgenstern, Aisle Research Fixes: BZ #33796 Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com> Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>
tst-mallocfork calls fork from a signal handler, leading to sporadic
deadlocks when multi-threaded since fork is not AS-safe when
multi-threading. This commit therefore adds tst-mallocfork to the
appropriate exception list.
In permissive mode, during audit module handling, check_gcs is unaware
that it is handling audit modules rather than the binary itself. It
causes the loader to fail to load the audit module, rather than
loading it and disabling GCS.
Also extends GCS tests with 4 LD_AUDIT tests:
1. tst-gcs-audit-disabled: checks if the audit module without GCS
marking is loaded with default gcs support.
2. tst-gcs-audit-enforced: checks if the audit module without GCS
marking is not loaded when GCS is enforced.
3. tst-gcs-audit-optional: checks if the audit module without GCS
marking is loaded when GCS is optional.
4. tst-gcs-audit-override: check if the audit modules without GCS
marking is loaded when GCS is overrided.
Checked on aarch64-linux-gnu with Linux 6.18 on Apple M4 emulated (for
BTI support) and on qemu 10.1.50 simulated (for GCS).
Sachin Monga [Mon, 12 Jan 2026 17:40:15 +0000 (12:40 -0500)]
ldbl-128ibm-compat: Add local aliases for printf family symbols
When the compiler selects IEEE-128 long double ABI(-mabi=ieeelongdouble),
calls to printf, fprintf, sprintf and snprintf are redirected to the
__printfieee128, __fprintfieee128, __sprintfieee128 and __snprintfieee128
symbols respectively. This causes "break printf" (and others) in
GDB to fail because the original symbol names do not exist as global
symbols in libc.so.6.
Fix this by adding local symbol aliases in the ieee128 compatibility
files so that the original symbol names are present in the symbol table
again. This restores the expected GDB behavior ("break printf" works)
without requiring dynamic symbols or versioned compatibility symbols.
The 13cfd77bf5 change broke the b5d88fa6c3 fix by removing the symbol
to __symbol redirections. Although it works for -O2 with both gcc
and clang, with -Os without the redirection, the libcall might still
be issued.
This patch reinstates the b5d88fa6c3 fix, with a modification that
allows each ifunc variant to control which trunc to issue. This is
required for clang, which defines HAVE_X86_INLINE_TRUNC to 1 (meaning
that trunc will always be lowered to the instruction on -Os).
Checked on x86_64-linux-gnu with -O2 and -Os with gcc-15 and clang-18.
The CORE-MATH c423b9a3 commit made atanh to use a slight different
muldd_acc and polydd (which uses muldd_acc internally) compared
to previous version.
The new tests were suggested by Paul Zimmermann (although I did
not see any regression).
Checked on x86_64-linux-gnu, x86_64-linux-gnu-v3, aarch64-linux-gnu,
and i686-linux-gnu.
Reviewed-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
The CORE-MATH c423b9a3 commit made atanh to use a slight different
muldd_acc, mulddd, and polydd (which uses muldd_acc internally)
compare to asinh and acosh.
The new tests were suggested by Paul Zimmermann (although I did
not see any regression).
Checked on x86_64-linux-gnu, x86_64-linux-gnu-v3, aarch64-linux-gnu,
and i686-linux-gnu.
Reviewed-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
The muldd was renamed to muldd_acc to avoid deviate from CORE-MATH
(the symbol and logic in replicated on multiple implementation,
different than glibc we consolidate it on ddcoremath.h).
Checked on x86_64-linux-gnu, x86_64-linux-gnu-v3, aarch64-linux-gnu,
and i686-linux-gnu.
Reviewed-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Samuel Thibault [Sun, 11 Jan 2026 01:00:25 +0000 (02:00 +0100)]
hurd: Fix sigreturn clobbering some xmm registers
__sigreturn2 uses _hurd_sigstate_unlock after restoring the interrupted
xmm values, we thus need it not to touch xmm. It makes sense to inline
sigstate_is_global_rcv _hurd_sigstate_lock/unlock anyway. unlock calls
gsync_wake, so we need to avoid xmm there as well.
Xi Ruoyao [Thu, 8 Jan 2026 07:27:53 +0000 (15:27 +0800)]
Linux: fix copy_file_range test on Linux >= 6.18
On Linux >= 6.18, the kernel submits the new COPY_FILE_RANGE_64
operation to the fuse implementation for large files. There is a
fall-back routine to COPY_FILE_RANGE but it's only used if
COPY_FILE_RANGE_64 returns ENOSYS.
So, return ENOSYS instead of EIO for "unsupported" operations in order
to make the kernel do the correct thing for this case and maybe in case
that a new operation is added into the kernel fuse interface in the
future.
Signed-off-by: Xi Ruoyao <xry111@xry111.site> Reviewed-by: Florian Weimer <fweimer@redhat.com>
Yury Khrustalev [Wed, 10 Dec 2025 15:00:26 +0000 (15:00 +0000)]
aarch64: Fix PT_GNU_PROPERTY checks for static exe (BZ 33713)
All checks related to the PT_GNU_PROPERTY bits would be skipped
if the binary had no PT_GNU_PROPERTY note at all. This meant that
enforcing an abort when some bits are not present was not possible.
Paul Eggert [Sat, 3 Jan 2026 18:27:52 +0000 (10:27 -0800)]
Better terminology for ‘long double’ in manual
* manual/math.texi (Mathematical Constants):
Don’t say that long double is “the same as” double, as the
types remain distinct (problem reported by Keith Thompson).
Also, don’t imply that float is the “narrowest”, as floating
point types don’t have widths in Standard C. Instead, talk
about precision and exponent range.
Paul Eggert [Thu, 1 Jan 2026 21:19:24 +0000 (13:19 -0800)]
Pass glibc pre-commit checks
This is needed for the next patch which updates copyright dates.
* elf/sprof.c:
* sysdeps/unix/sysv/linux/tst-pidfd_getinfo.c:
Remove trailing white space.
* misc/tst-atomic.c: Remove trailing empty line.
Fix regression in commit 7447efa9622cb33a567094833f6c4000b3ed2e23
("malloc: remove fastbin code from malloc_info") where the closing
`sizes` tag had a typo, missing the '/'.
malloc.c:1909:8: error: converting the result of '<<' to a boolean always evaluates to true [-Werror,-Wtautological-constant-compare]
1909 | if (!DEFAULT_THP_PAGESIZE || mp_.thp_mode != malloc_thp_mode_not_supported)
| ^
../sysdeps/unix/sysv/linux/aarch64/malloc-hugepages.h:19:35: note: expanded from macro 'DEFAULT_THP_PAGESIZE'
19 | #define DEFAULT_THP_PAGESIZE (1UL << 21)
elf: Fix elf/tst-decorate-maps on aarch64 after 321e1fc73f
The intention of the call "xmalloc(256 * 1024)" in tst-decorate-maps is
to force malloc() to fall back to using mmap() since such an amount
won't be available from the main heap.
Post 321e1fc73f, on aarch64, the heap gets extended by default by at
least 2MB, thus the aforementioned call may get satisfied on the main
heap itself. Thus, increase the amount of memory requested to force the
mmap() path again.
x86: Do not use __builtin_isinf_sign for _Float64x/long double
Neither gcc [1] nor clang [2] handles pseudo-normal numbers correctly
with the __builtin_isinf_sign, so disable its usage for _Float64x and
long double types.
This only affects x86, so add a new define __FP_BUILTIN_ISINF_SIGN_DENORMAL
to gate long double and related types to the libc function instead.
It fixes the regression on test-ldouble-isinf when built with clang:
Failure: isinf (pseudo_zero): Exception "Invalid operation" set
Failure: isinf (pseudo_inf): Exception "Invalid operation" set
Failure: isinf (pseudo_qnan): Exception "Invalid operation" set
Failure: isinf (pseudo_snan): Exception "Invalid operation" set
Failure: isinf (pseudo_unnormal): Exception "Invalid operation" set
Failure: isinf_downward (pseudo_zero): Exception "Invalid operation" set
Failure: isinf_downward (pseudo_inf): Exception "Invalid operation" set
Failure: isinf_downward (pseudo_qnan): Exception "Invalid operation" set
Failure: isinf_downward (pseudo_snan): Exception "Invalid operation" set
Failure: isinf_downward (pseudo_unnormal): Exception "Invalid operation" set
Failure: isinf_towardzero (pseudo_zero): Exception "Invalid operation" set
Failure: isinf_towardzero (pseudo_inf): Exception "Invalid operation" set
Failure: isinf_towardzero (pseudo_qnan): Exception "Invalid operation" set
Failure: isinf_towardzero (pseudo_snan): Exception "Invalid operation" set
Failure: isinf_towardzero (pseudo_unnormal): Exception "Invalid operation" set
Failure: isinf_upward (pseudo_zero): Exception "Invalid operation" set
Failure: isinf_upward (pseudo_inf): Exception "Invalid operation" set
Failure: isinf_upward (pseudo_qnan): Exception "Invalid operation" set
Failure: isinf_upward (pseudo_snan): Exception "Invalid operation" set
Failure: isinf_upward (pseudo_unnormal): Exception "Invalid operation" set
Checked on x86_64-linux-gnu with gcc-15 and clang-18.
x86: Do not use __builtin_fpclassify for _Float64x/long double
Neither gcc [1] nor clang [2] handles pseudo-normal numbers correctly
with the __builtin_fpclassify, so disable its usage for _Float64x and
long double types.
This only affects x86, so add a new header, fp-builtin-denormal.h, that
defines whether the architecture requires disabling the optimization
through a new glibc define (__FP_BUILTIN_FPCLASSIFY_DENORMAL).
It fixes the regression on test-ldouble-fpclassify and
test-float64x-fpclassify when built with clang:
Failure: fpclassify (pseudo_zero): Exception "Invalid operation" set
Failure: fpclassify (pseudo_inf): Exception "Invalid operation" set
Failure: fpclassify (pseudo_qnan): Exception "Invalid operation" set
Failure: fpclassify (pseudo_snan): Exception "Invalid operation" set
Failure: fpclassify (pseudo_unnormal): Exception "Invalid operation" set
Failure: fpclassify_downward (pseudo_zero): Exception "Invalid operation" set
Failure: fpclassify_downward (pseudo_inf): Exception "Invalid operation" set
Failure: fpclassify_downward (pseudo_qnan): Exception "Invalid operation" set
Failure: fpclassify_downward (pseudo_snan): Exception "Invalid operation" set
Failure: fpclassify_downward (pseudo_unnormal): Exception "Invalid operation" set
Failure: fpclassify_towardzero (pseudo_zero): Exception "Invalid operation" set
Failure: fpclassify_towardzero (pseudo_inf): Exception "Invalid operation" set
Failure: fpclassify_towardzero (pseudo_qnan): Exception "Invalid operation" set
Failure: fpclassify_towardzero (pseudo_snan): Exception "Invalid operation" set
Failure: fpclassify_towardzero (pseudo_unnormal): Exception "Invalid operation" set
Failure: fpclassify_upward (pseudo_zero): Exception "Invalid operation" set
Failure: fpclassify_upward (pseudo_inf): Exception "Invalid operation" set
Failure: fpclassify_upward (pseudo_qnan): Exception "Invalid operation" set
Failure: fpclassify_upward (pseudo_snan): Exception "Invalid operation" set
Failure: fpclassify_upward (pseudo_unnormal): Exception "Invalid operation" set
Checked on x86_64-linux-gnu with gcc-15 and clang-18.
Sergey Kolosov [Mon, 15 Dec 2025 12:00:01 +0000 (13:00 +0100)]
resolv: Add test for NOERROR/NODATA handling [BZ #14308]
Add a test which verifies that getaddrinfo does not fail if one of A/AAAA
responses is NOERROR/NODATA reply with recursion unavailable and the other
response provides an address.
Yao Zihong [Fri, 19 Dec 2025 23:46:42 +0000 (17:46 -0600)]
riscv: Add RVV memset for both multiarch and non-multiarch builds
This patch adds an RVV-optimized implementation of memset for RISC-V and
enables it for both multiarch (IFUNC) and non-multiarch builds.
The implementation integrates Hau Hsu's 2023 RVV work under a unified
ifunc-based framework. A vectorized version (__memset_vector) is added
alongside the generic fallback (__memset_generic). The runtime resolver
selects the RVV variant when RISCV_HWPROBE_KEY_IMA_EXT_0 reports vector
support (RVV).
Currently, the resolver still selects the RVV variant even when the RVV
extension is disabled via prctl(). As a consequence, any process that
has RVV disabled via prctl() will receive SIGILL when calling memset().
Co-authored-by: Jerry Shih <jerry.shih@sifive.com> Co-authored-by: Jeff Law <jeffreyalaw@gmail.com> Signed-off-by: Yao Zihong <zihong.plct@isrc.iscas.ac.cn> Reviewed-by: Peter Bergner <bergner@tenstorrent.com>
Will issue a __gttf2 call instead of a __unordtf2 followed by the
comparison.
Using the generic implementation fixes multiple issues with math tests,
such as:
Failure: fmax (0, qNaN): Exception "Invalid operation" set
Failure: fmax (0, -qNaN): Exception "Invalid operation" set
Failure: fmax (-0, qNaN): Exception "Invalid operation" set
Failure: fmax (-0, -qNaN): Exception "Invalid operation" set
Failure: fmax (9, qNaN): Exception "Invalid operation" set
Failure: fmax (9, -qNaN): Exception "Invalid operation" set
Failure: fmax (-9, qNaN): Exception "Invalid operation" set
Failure: fmax (-9, -qNaN): Exception "Invalid operation" set
It has a small performance overhead due to the extra isunordered (which
could be omitted for float and double types). Using _Generic (similar to
how __MATH_TG) on a bivariadic function requires a lot of boilerplate
macros.
[1] https://github.com/llvm/llvm-project/issues/172499 Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
elf: Support vDSO with more than one PT_LOAD with v_addr starting at 0 (BZ 32583)
The setup_vdso assumes that vDSO will contain only one PT_LOAD segment
and that 0 is the sentinel for the start mapping address. Although
the kernel avoids adding more than one PT_LOAD to avoid compatibility
issues, there is no impending issue that prevents glibc from supporting
vDSO with multiple PT_LOAD (as some wrapper tools do [1]).
To support multiple PT_LOAD segments, replace the sentinel with a bool
to indicate that the VMA start has already been set.
Testing is really tricky, since the bug report does not indicate which
tool was used to trigger the issue, nor a runtime that provides a vDSO
with multiple PT_LOAD. I had to modify the qemu user with a custom
script to create 2 PT_LOAD sections, remove checks that prevent the
vDSO object from being created, and remove the load bias adjustment
in load_elf_vdso. I could not come up with an easy test case to
integrate with glibc.
The Linux kernel provides vDSO with only one PT_LOAD due to
compatibility reasons. For instance
* arch/arm64/kernel/vdso/vdso.lds.S
86 /*
87 * We must supply the ELF program headers explicitly to get just one
88 * PT_LOAD segment, and set the flags explicitly to make segments read-only.
89 /
90 PHDRS
91 {
92 text PT_LOAD FLAGS(5) FILEHDR PHDRS; / PF_R|PF_X /
93 dynamic PT_DYNAMIC FLAGS(4); / PF_R /
94 note PT_NOTE FLAGS(4); / PF_R */
95 }
* arch/x86/entry/vdso/vdso-layout.lds.S
95 /*
96 * We must supply the ELF program headers explicitly to get just one
97 * PT_LOAD segment, and set the flags explicitly to make segments read-only.
98 /
99 PHDRS
100 {
101 text PT_LOAD FLAGS(5) FILEHDR PHDRS; / PF_R|PF_X /
102 dynamic PT_DYNAMIC FLAGS(4); / PF_R /
103 note PT_NOTE FLAGS(4); / PF_R */
104 eh_frame_hdr PT_GNU_EH_FRAME;
105 }
nptl: Make pthread_{clock, timed}join{_np} act on all cancellation (BZ 33717)
The pthread_join/pthread_timedjoin_np/pthread_clockjoin_np will not act
on cancellation if 1. some other thread is already waiting on the 'joinid'
or 2. If the thread has already exited.
On nptl/pthread_join_common.c, the 1. is due to the CAS doing an early
return:
80 else if (__glibc_unlikely (atomic_compare_exchange_weak_acquire (&pd->joinid,
81 &self,
82 NULL)))
83 /* There is already somebody waiting for the thread. */
84 return EINVAL;
Same as support_process_state_wait, but wait for the task TID
(obtained with gettid) from the current process. Since the kernel
might remove the /proc/<pid>/task/<tid>/status at any time if the
thread terminates, the code needs to handle possible
fopen/getline/fclose failures due to an inexistent file.
And use the new __pthread_descriptor_valid function that checks
for 'joinstate' to get the thread state instead of 'tid'. The
joinstate is set by the kernel when the thread exits.
nptl: Do not use pthread set_tid_address as state synchronization (BZ #19951)
The use-after-free described in BZ#19951 is due to the use of two
different PD fields, 'joinid' and 'cancelhandling', to describe the
thread state and to synchronise the calls of pthread_join,
pthread_detach, pthread_exit, and normal thread exit.
Any state change may require checking both fields atomically to handle
partial state (e.g., pthread_join() with a cancellation handler to
issue a 'joinstate' field rollback).
This patch uses a different PD member with 4 possible states (JOINABLE,
DETACHED, EXITING, and EXITED) instead of the pthread 'tid' field, with
the following logic:
1. On pthread_create, the initial state is set either to JOINABLE or
DETACHED depending on the pthread attribute used.
2. On pthread_detach, a CAS is issued on the state. If the CAS fails,
the thread is already detached (DETACHED) or being terminated (EXITING).
For the former, an EINVAL is returned; for the latter, pthread_detach
should be responsible for joining the thread (and for deallocating any
internal resources).
3. In the exit phase of the wrapper function for the thread start routine
(reached either if the thread function has returned, pthread_exit has
been called, or cancellation handled has been acted upon), we issue a
CAS on state to set it to the EXITING mode.
If the thread is previously in DETACHED mode, the thread is responsible
for deallocating any resources; otherwise, the thread must be joined
(detached threads cannot deallocate themselves immediately).
4. The clear_tid_field on 'clone' call is changed to set the new 'state'
field on thread exit (EXITED). This state is only reached at thread
termination.
5. The pthread_join implementation is now simpler: the futex wait is done
directly on thread state, and there is no need to reset it in case of
timeout since the state is now set either by pthread_detach() or by the
kernel on process termination.
The race condition on pthread_detach is avoided with a single atomic
operation on the PD state: once the mode is set to THREAD_STATE_DETACHED, it
is up to the thread itself to deallocate its memory (done during the exit
phase at pthread_create()).
Also, the INVALID_NOT_TERMINATED_TD_P is removed since a negative yid is
not possible, and the macro is not used anywhere.
This change triggers an invalid C11 thread test: it creates a thread that
detaches, and after a timeout, the creating thread checks whether the join
fails. The issue is that once thrd_join() is called, the thread's lifetime
is not defined.
Checked on x86_64-linux-gnu, i686-linux-gnu, aarch64-linux-gnu,
arm-linux-gnueabihf, and powerpc64-linux-gnu.
Stefan Liebler [Fri, 19 Dec 2025 10:19:53 +0000 (11:19 +0100)]
build-many-glibcs.py: Fix s390x-linux-gnu.
The recent commit 638d437dbf9c68e40986edaa9b0d1c2e72a1ae81
"Deprecate s390-linux-gnu (31bit)"
leads to:
FAIL: compilers-s390x-linux-gnu gcc build
when it tries to build 31bit libgcc.
The build is fixed by explicitely disabling multilib.
Sunil K Pandey [Tue, 9 Dec 2025 16:57:44 +0000 (08:57 -0800)]
nptl: Optimize trylock for high cache contention workloads (BZ #33704)
Check lock availability before acquisition to reduce cache line
bouncing. Significantly improves trylock throughput on multi-core
systems under heavy contention.
Tested on x86_64.
Fixes BZ #33704.
Co-authored-by: Alex M Wells <alex.m.wells@intel.com> Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Reinstate HAVE_64B_ATOMICS configure check that was reverted by commit 7fec8a5de6826ef9ae440238d698f0fe5a5fb372 due to BZ #33632. This was
fixed by 3dd2cbfa35e0e6e0345633079bd5a83bb822c2d8 by only allowing
64-bit atomics on sem_t if its type is 8-byte aligned. Rebase and add
in cleanups in include/atomic.h that were omitted.
Fix an issue with sparcv8-linux-gnu-leon3 forcing -mcpu=v8 for rtld.c which
overrules -mcpu=leon3 and causes __atomic_always_lock_free (4, 0) to
incorrectly return 0 and trigger asserts in atomics. Remove this as it
seems to be a workaround for an issue in 1997.