H.J. Lu [Sat, 31 Jul 2021 02:07:30 +0000 (19:07 -0700)]
Use __executable_start as the lowest address for profiling [BZ #28153]
Glibc assumes that ENTRY_POINT is the lowest address for which we need
to keep profiling records and BFD linker uses a linker script to place
the input sections.
Starting from GCC 4.6, the main function is placed in .text.startup
section and starting from binutils 2.22, BFD linker with
* scripttempl/elf.sc: Group .text.exit, text.startup and .text.hot
sections.
places .text.startup section before .text section, which leave the main
function out of profiling records.
Starting from binutils 2.15, linker provides __executable_start to mark
the lowest address of the executable. Use __executable_start as the
lowest address to keep the main function in profiling records. This fixes
[BZ #28153].
Tested on Linux/x86-64, Linux/x32 and Linux/i686 as well as with
build-many-glibcs.py.
Samuel Thibault [Mon, 23 Aug 2021 17:06:49 +0000 (19:06 +0200)]
hurd: Fix errlist error mapping
On the Hurd, the errno values don't start at 0, so _sys_errlist_internal
needs index remapping. The _sys_errlist_internal definition already properly
uses ERR_MAP, but __get_errlist and __get_errname were not.
Joseph Myers [Mon, 23 Aug 2021 16:18:42 +0000 (16:18 +0000)]
Fix iconv build with GCC mainline
Current GCC mainline produces -Wstringop-overflow errors building some
iconv converters, as discussed at
<https://gcc.gnu.org/pipermail/gcc/2021-July/236943.html>. Add an
__builtin_unreachable call as suggested so that GCC can see the case
that would involve a buffer overflow is unreachable; because the
unreachability depends on valid conversion state being passed into the
function from previous conversion steps, it's not something the
compiler can reasonably deduce on its own.
Tested with build-many-glibcs.py that, together with
<https://sourceware.org/pipermail/libc-alpha/2021-August/130244.html>,
it restores the glibc build for powerpc-linux-gnu.
Andreas Schwab [Mon, 23 Aug 2021 08:19:52 +0000 (10:19 +0200)]
rtld: copy terminating null in tunables_strdup (bug 28256)
Avoid triggering a false positive from valgrind by copying the terminating
null in tunables_strdup. At this point the heap is still clean, but
valgrind is stricter here.
Record only the relative address of the caller in mtrace file. Use
LD_TRACE_PRELINKING to get the executable as well as binary vs
executable load offsets so that we may compute a base to add to the
relative address in the mtrace file. This allows us to get a valid
address to pass to addr2line in all cases.
Fixes BZ #22716.
Co-authored-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Andreas Schwab <schwab@linux-m68k.org> Reviewed-by: DJ Delorie <dj@redhat.com>
Matt Whitlock [Thu, 17 Jun 2021 03:40:47 +0000 (23:40 -0400)]
x86: fix Autoconf caching of instruction support checks [BZ #27991]
The Autoconf documentation for the AC_CACHE_CHECK macro states:
The commands-to-set-it must have no side effects except for setting
the variable cache-id, see below.
However, the tests for support of -msahf and -mmovbe were embedded in
the commands-to-set-it for lib_cv_include_x86_isa_level. This had the
consequence that libc_cv_have_x86_lahf_sahf and libc_cv_have_x86_movbe
were not defined whenever lib_cv_include_x86_isa_level was read from
cache. These variables' being undefined meant that their unquoted use
in later test expressions led to the 'test' built-in's misparsing its
arguments and emitting errors like "test: =: unexpected operator" or
"test: =: unary operator expected", depending on the particular shell.
This commit refactors the tests for LAHF/SAHF and MOVBE instruction
support into their own AC_CACHE_CHECK macro invocations to obey the
rule that the commands-to-set-it must have no side effects other than
setting the variable named by cache-id.
Signed-off-by: Matt Whitlock <sourceware@mattwhitlock.name> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Fangrui Song [Wed, 18 Aug 2021 16:15:20 +0000 (09:15 -0700)]
Remove sysdeps/*/tls-macros.h
They provide TLS_GD/TLS_LD/TLS_IE/TLS_IE macros for TLS testing. Now
that we have migrated to __thread and tls_model attributes, these macros
are unused and the tls-macros.h files can retire.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Fangrui Song [Mon, 16 Aug 2021 16:59:30 +0000 (09:59 -0700)]
elf: Drop elf/tls-macros.h in favor of __thread and tls_model attributes [BZ #28152] [BZ #28205]
elf/tls-macros.h was added for TLS testing when GCC did not support
__thread. __thread and tls_model attributes are mature now and have been
used by many newer tests.
Also delete tst-tls2.c which tests .tls_common (unused by modern GCC and
unsupported by Clang/LLD). .tls_common and .tbss definition are almost
identical after linking, so the runtime test doesn't add additional
coverage. Assembler and linker tests should be on the binutils side.
When LLD 13.0.0 is allowed in configure.ac
(https://sourceware.org/pipermail/libc-alpha/2021-August/129866.html),
`make check` result is on par with glibc built with GNU ld on aarch64
and x86_64.
As a future clean-up, TLS_GD/TLS_LD/TLS_IE/TLS_IE macros can be removed from
sysdeps/*/tls-macros.h. We can add optional -mtls-dialect={gnu2,trad}
tests to ensure coverage.
Tested on aarch64-linux-gnu, powerpc64le-linux-gnu, and x86_64-linux-gnu.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Samuel Thibault [Mon, 16 Aug 2021 09:20:38 +0000 (11:20 +0200)]
hurd: Drop fmh kludge
Gnumach's 0650a4ee30e3 implements support for high bits being set in the
mask parameter of vm_map. This allows to remove the fmh kludge that was
masking away the address range by mapping a dumb area there.
Stafford Horne [Mon, 7 Jun 2021 13:10:19 +0000 (22:10 +0900)]
time: Fix overflow itimer tests on 32-bit systems
On the port of OpenRISC I am working on and it appears the rv32 port
we have sets __TIMESIZE == 64 && __WORDSIZE == 32. This causes the
size of time_t to be 8 bytes, but the tv_sec in the kernel is still 32-bit
causing truncation.
The truncations are unavoidable on these systems so skip the
testing/failures by guarding with __KERNEL_OLD_TIMEVAL_MATCHES_TIMEVAL64.
Also, futher in the tests and in other parts of code checking for time_t
overflow does not work on 32-bit systems when time_t is 64-bit. As
suggested by Adhemerval, update the in_time_t_range function to assume
32-bits by using int32_t.
This also brings in the header for stdint.h so we can update other
usages of __int32_t to int32_t as suggested by Adhemerval.
Xi Ruoyao [Fri, 13 Aug 2021 16:01:14 +0000 (16:01 +0000)]
mips: increase stack alignment in clone to match the ABI
In "mips: align stack in clone [BZ #28223]"
(commit 1f51cd9a860ee45eee8a56fb2ba925267a2a7bfe) I made a mistake: I
misbelieved one "word" was 2-byte and "doubleword" should be 4-byte.
But in MIPS ABI one "word" is defined 32-bit (4-byte), so "doubleword" is
8-byte [1], and "quadword" is 16-byte [2].
Xi Ruoyao [Thu, 12 Aug 2021 20:31:59 +0000 (20:31 +0000)]
mips: align stack in clone [BZ #28223]
The MIPS O32 ABI requires 4 byte aligned stack, and the MIPS N64 and N32
ABI require 8 byte aligned stack. Previously if the caller passed an
unaligned stack to clone the the child misbehaved.
Nikita Popov [Thu, 12 Aug 2021 10:39:50 +0000 (16:09 +0530)]
librt: add test (bug 28213)
This test implements following logic:
1) Create POSIX message queue.
Register a notification with mq_notify (using NULL attributes).
Then immediately unregister the notification with mq_notify.
Helper thread in a vulnerable version of glibc
should cause NULL pointer dereference after these steps.
2) Once again, register the same notification.
Try to send a dummy message.
Test is considered successfulif the dummy message
is successfully received by the callback function.
Fangrui Song [Wed, 11 Aug 2021 16:00:37 +0000 (09:00 -0700)]
aarch64: Make elf_machine_{load_address,dynamic} robust [BZ #28203]
The AArch64 ABI is largely platform agnostic and does not specify
_GLOBAL_OFFSET_TABLE_[0] ([1]). glibc ld.so turns out to be probably the
only user of _GLOBAL_OFFSET_TABLE_[0] and GNU ld defines the value
to the link-time address _DYNAMIC. [2]
In 2012, __ehdr_start was implemented in GNU ld and gold in binutils
2.23. Using adrp+add / (-mcmodel=tiny) adr to access
__ehdr_start/_DYNAMIC gives us a robust way to get the load address and
the link-time address of _DYNAMIC.
[1]: From a psABI maintainer, https://bugs.llvm.org/show_bug.cgi?id=49672#c2
[2]: LLD's aarch64 port does not set _GLOBAL_OFFSET_TABLE_[0] to the
link-time address _DYNAMIC.
LLD is widely used on aarch64 Android and ChromeOS devices. Software
just works without the need for _GLOBAL_OFFSET_TABLE_[0].
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Wilco Dijkstra [Tue, 10 Aug 2021 12:42:07 +0000 (13:42 +0100)]
[3/5] AArch64: Improve A64FX memset for remaining bytes
Simplify handling of remaining bytes. Avoid lots of taken branches and complex
whilelo computations, instead unconditionally write vectors from the end.
Wilco Dijkstra [Tue, 10 Aug 2021 12:39:37 +0000 (13:39 +0100)]
[2/5] AArch64: Improve A64FX memset for large sizes
Improve performance of large memsets. Simplify alignment code. For zero memset
use DC ZVA, which almost doubles performance. For non-zero memsets use the
unroll8 loop which is about 10% faster.
Wilco Dijkstra [Tue, 10 Aug 2021 12:30:27 +0000 (13:30 +0100)]
[1/5] AArch64: Improve A64FX memset for small sizes
Improve performance of small memsets by reducing instruction counts and
improving code alignment. Bench-memset shows 35-45% performance gain for
small sizes.
Joseph Myers [Mon, 9 Aug 2021 16:51:38 +0000 (16:51 +0000)]
Add PTRACE_GET_RSEQ_CONFIGURATION from Linux 5.13 to sys/ptrace.h
Linux 5.13 adds a PTRACE_GET_RSEQ_CONFIGURATION constant, with an
associated ptrace_rseq_configuration structure.
Add this constant to the various sys/ptrace.h headers in glibc, with
the structure in bits/ptrace-shared.h (named struct
__ptrace_rseq_configuration in glibc, as with other such structures).
Nikita Popov [Mon, 9 Aug 2021 14:47:34 +0000 (20:17 +0530)]
librt: fix NULL pointer dereference (bug 28213)
Helper thread frees copied attribute on NOTIFY_REMOVED message
received from the OS kernel. Unfortunately, it fails to check whether
copied attribute actually exists (data.attr != NULL). This worked
earlier because free() checks passed pointer before actually
attempting to release corresponding memory. But
__pthread_attr_destroy assumes pointer is not NULL.
So passing NULL pointer to __pthread_attr_destroy will result in
segmentation fault. This scenario is possible if
notification->sigev_notify_attributes == NULL (which means default
thread attributes should be used).
Anton Blanchard [Tue, 27 Jul 2021 05:47:49 +0000 (15:47 +1000)]
powerpc64: Replace some PPC_FEATURE_HAS_VSX with PPC_FEATURE_ARCH_2_06
We use PPC_FEATURE_HAS_VSX to select a number of POWER7 optimised
functions. These functions don't use any VSX instructions, so
PPC_FEATURE_ARCH_2_06 seems like a better fit.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
Florian Weimer [Fri, 6 Aug 2021 07:51:38 +0000 (09:51 +0200)]
Linux: Fix fcntl, ioctl, prctl redirects for _TIME_BITS=64 (bug 28182)
__REDIRECT and __THROW are not compatible with C++ due to the ordering of the
__asm__ alias and the throw specifier. __REDIRECT_NTH has to be used
instead.
Joseph Myers [Thu, 5 Aug 2021 20:12:56 +0000 (20:12 +0000)]
Add INADDR_DUMMY from Linux 5.13 to netinet/in.h
Linux 5.13 adds an INADDR_DUMMY definition; add a corresponding
definition to glibc's netinet/in.h. (This isn't strictly a new kernel
interface, rather a value defined in RFC 7600.)
It turned that the generic implementation of brk() does not work
for sparc, since on failure kernel will just return the previous
input value without setting the conditional register.
This patches adds back a sparc32 and sparc64 implementation removed
by 720480934ab9107.
Checked on sparc64-linux-gnu and sparcv9-linux-gnu.
DJ Delorie [Tue, 3 Aug 2021 20:33:29 +0000 (16:33 -0400)]
test-dlclose-exit-race: avoid hang on pthread_create error
This test depends on the "last" function being called in a
different thread than the "first" function, as "last" posts
a semaphore that "first" is waiting on. However, if pthread_create
fails - for example, if running in an older container before
the clone3()-in-container-EPERM fixes - exit() is called in the
same thread as everything else, the semaphore never gets posted,
and first hangs.
The fix is to pre-post that semaphore before a single-threaded
exit.
Joseph Myers [Mon, 2 Aug 2021 16:33:44 +0000 (16:33 +0000)]
Fix build of nptl/tst-thread_local1.cc with GCC 12
The test nptl/tst-thread_local1.cc fails to build with GCC mainline
because of changes to what libstdc++ headers implicitly include what
other headers:
tst-thread_local1.cc: In function 'int do_test()':
tst-thread_local1.cc:177:5: error: variable 'std::array<std::pair<const char*, std::function<void(void* (*)(void*))> >, 2> do_thread_X' has initializer but incomplete type
177 | do_thread_X
| ^~~~~~~~~~~
Fix this by adding an explicit include of <array>.
Tested with build-many-glibcs.py for aarch64-linux-gnu.
Paul Zimmermann [Mon, 2 Aug 2021 13:19:21 +0000 (15:19 +0200)]
Remove obsolete comments/name from several benchtest input files.
These comments refer to slow paths that were removed in
glibc 2.34 or earlier. The corresponding "names" that yield
separate workload traces for "make bench" are thus obsolete.
We are however keeping the corresponding inputs. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
x86-64: Avoid rep movsb with short distance [BZ #27130]
introduced some regressions on Intel processors without Fast Short REP
MOV (FSRM). Add Avoid_Short_Distance_REP_MOVSB to avoid rep movsb with
short distance only on Intel processors with FSRM. bench-memmove-large
on Skylake server shows that cycles of __memmove_evex_unaligned_erms
improves for the following data size:
tests: use xmalloc to allocate implementation array
The benchmark and tests must fail in case of allocation failure in the
implementation array. Also annotate the x* allocators in support.h so
that the compiler has more information about them.
Tell the compiler that xmalloc family of allocators always return
non-NULL. xrealloc in locale/programs also always returns non-NULL,
but that conflicts with default realloc behaviour and that of xrealloc
in libsupport, so keep it as is for now and resolve the differences
later.
mcheck and malloc-check no longer work with static binaries, so drop
those tests.
Reported-by: Samuel Thibault <samuel.thibault@gnu.org> Tested-by: Samuel Thibault <samuel.thibault@gnu.org> Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
manual: Document unsupported cases for interposition
These functions call the core allocator functions (realloc and malloc
respectively) and are hence guaranteed to allocate memory using the
correct functions when multiple allocators are interposed. Having
these functions interposed in one allocator and not another may result
in confusion, hence discourage interposing them altogether.
H.J. Lu [Sat, 5 Jun 2021 13:42:20 +0000 (06:42 -0700)]
x86: Install <bits/platform/x86.h> [BZ #27958]
1. Install <bits/platform/x86.h> for <sys/platform/x86.h> which includes
<bits/platform/x86.h>.
2. Rename HAS_CPU_FEATURE to CPU_FEATURE_PRESENT which checks if the
processor has the feature.
3. Rename CPU_FEATURE_USABLE to CPU_FEATURE_ACTIVE which checks if the
feature is active. There may be other preconditions, like sufficient
stack space or further setup for AMX, which must be satisfied before the
feature can be used.
Samuel Thibault [Thu, 22 Jul 2021 18:29:57 +0000 (18:29 +0000)]
hurd: Fix glob lstat compatibility
84f7ce84474c ("posix: Add glob64 with 64-bit time_t support") replaced
GLOB_NO_LSTAT with defining GLOB_LSTAT and GLOB_LSTAT64, but the posix
and gnu versions of the change were missing in the commit.
Make malloc hooks symbols compat-only so that new applications cannot
link against them and remove the declarations from the API. Also
remove the unused malloc-hooks.h.
Finally, mark all symbols in libc_malloc_debug.so as compat so that
the library cannot be linked against.
Add a note about the deprecation in NEWS.
Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>
These deprecated functions are only safe to call from
__malloc_initialize_hook and as a result, are not useful in the
general case. Move the implementations to libc_malloc_debug so that
existing binaries that need it will now have to preload the debug DSO
to work correctly.
This also allows simplification of the core malloc implementation by
dropping all the undumping support code that was added to make
malloc_set_state work.
One known breakage is that of ancient emacs binaries that depend on
this. They will now crash when running with this libc. With
LD_BIND_NOW=1, it will terminate immediately because of not being able
to find malloc_set_state but with lazy binding it will crash in
unpredictable ways. It will need a preloaded libc_malloc_debug.so so
that its initialization hook is executed to allow its malloc
implementation to work properly.
Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>
The malloc-check debugging feature is tightly integrated into glibc
malloc, so thanks to an idea from Florian Weimer, much of the malloc
implementation has been moved into libc_malloc_debug.so to support
malloc-check. Due to this, glibc malloc and malloc-check can no
longer work together; they use altogether different (but identical)
structures for heap management. This should not make a difference
though since the malloc check hook is not disabled anywhere.
malloc_set_state does, but it does so early enough that it shouldn't
cause any problems.
The malloc check tunable is now in the debug DSO and has no effect
when the DSO is not preloaded.
Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>
Wean mtrace away from the malloc hooks and move them into the debug
DSO. Split the API away from the implementation so that we can add
the API to libc.so as well as libc_malloc_debug.so, with the libc
implementations being empty.
Update localplt data since memalign no longer has any callers after
this change.
Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>
Now that mcheck no longer needs to check __malloc_initialized (and no
other third party hook can since the symbol is not exported), make the
variable boolean and static so that it is used strictly within malloc.
Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>
Split the mcheck implementation into the debugging hooks and API so
that the API can be replicated in libc and libc_malloc_debug.so. The
libc APIs always result in failure.
The mcheck implementation has also been moved entirely into
libc_malloc_debug.so and with it, all of the hook initialization code
can now be moved into the debug library. Now the initialization can
be done independently of libc internals.
With this patch, libc_malloc_debug.so can no longer be used with older
libcs, which is not its goal anyway. tst-vfork3 breaks due to this
since it spawns shell scripts, which in turn execute using the system
glibc. Move the test to tests-container so that only the built glibc
is used.
This move also fixes bugs in the mcheck version of memalign and
realloc, thus allowing removal of the tests from tests-mcheck
exclusion list.
Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>
Remove all malloc hook uses from core malloc functions and move it
into a new library libc_malloc_debug.so. With this, the hooks now no
longer have any effect on the core library.
libc_malloc_debug.so is a malloc interposer that needs to be preloaded
to get hooks functionality back so that the debugging features that
depend on the hooks, i.e. malloc-check, mcheck and mtrace work again.
Without the preloaded DSO these debugging features will be nops.
These features will be ported away from hooks in subsequent patches.
Similarly, legacy applications that need hooks functionality need to
preload libc_malloc_debug.so.
The symbols exported by libc_malloc_debug.so are maintained at exactly
the same version as libc.so.
Finally, static binaries will no longer be able to use malloc
debugging features since they cannot preload the debugging DSO.
Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>
Make the __morecore and __default_morecore symbols compat-only and
remove their declarations from the API. Also, include morecore.c
directly into malloc.c; this should ideally get merged into malloc in
a future cleanup.
Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>
Remove __after_morecore_hook from the API and finalize the symbol so
that it can no longer be used in new applications. Old applications
using __after_morecore_hook will find that their hook is no longer
called.
Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>
Make mcheck tests conditional on GLIBC_2.23 or earlier
Targets with base versions of 2.24 or later won't have
__malloc_initialize_hook because of which the tests will essentially
be the same as the regular malloc tests. Avoid running them instead
and save time.
Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>
ARC: fp: (micro)optimize FPU_STATUS read by eliding FWE bit clearing
Any FPU_STATUS write needs setting the FWE bit (31) whcih just provides
a "control signal" to enable explicit write (vs. the side-effect of FPU
instructions). However this bit is RAZ and write-only, thus effectively
never stored in FPU_STATUS register. Thus when reading the register
there is no need to clear it. This shaves off a BCLR instruction from
the fe*exceptino family of functions and while no big deal still makes
sense to do.
This came up when debugging a race in math/test-fenv-tls [1]