conform/conformtest.pl: Escape literal braces in regular expressions
This suppresses Perl warnings like these:
Unescaped left brace in regex is deprecated here (and will be fatal in
Perl 5.32), passed through in regex; marked by <-- HERE in m/^element
*({ <-- HERE ([^}]*)}|([^{ ]*)) *({([^}]*)}|([^{ ]*)) *([A-Za-z0-9_]*)
*(.*)/ at conformtest.pl line 370.
Daniel Alvarez [Fri, 29 Jun 2018 15:23:13 +0000 (17:23 +0200)]
getifaddrs: Don't return ifa entries with NULL names [BZ #21812]
A lookup operation in map_newlink could turn into an insert because of
holes in the interface part of the map. This leads to incorrectly set
the name of the interface to NULL when the interface is not present
for the address being processed (most likely because the interface was
added between the RTM_GETLINK and RTM_GETADDR calls to the kernel).
When such changes are detected by the kernel, it'll mark the dump as
"inconsistent" by setting NLM_F_DUMP_INTR flag on the next netlink
message.
This patch checks this condition and retries the whole operation.
Hopes are that next time the interface corresponding to the address
entry is present in the list and correct name is returned.
Florian Weimer [Thu, 28 Jun 2018 11:22:34 +0000 (13:22 +0200)]
Use _STRUCT_TIMESPEC as guard in <bits/types/struct_timespec.h> [BZ #23349]
After commit d76d3703551a362b472c866b5b6089f66f8daa8e ("Fix missing
timespec definition for sys/stat.h (BZ #21371)") in combination with
kernel UAPI changes, GCC sanitizer builds start to fail due to a
conflicting definition of struct timespec in <linux/time.h>. Use
_STRUCT_TIMESPEC as the header file inclusion guard, which is already
checked in the kernel header, to support including <linux/time.h> and
<sys/stat.h> in the same translation unit.
Florian Weimer [Fri, 1 Jun 2018 09:24:58 +0000 (11:24 +0200)]
libio: Avoid _allocate_buffer, _free_buffer function pointers [BZ #23236]
These unmangled function pointers reside on the heap and could
be targeted by exploit writers, effectively bypassing libio vtable
validation. Instead, we ignore these pointers and always call
malloc or free.
In theory, this is a backwards-incompatible change, but using the
global heap instead of the user-supplied callback functions should
have little application impact. (The old libstdc++ implementation
exposed this functionality via a public, undocumented constructor
in its strstreambuf class.)
Paul Pluzhnikov [Sun, 6 May 2018 01:08:27 +0000 (18:08 -0700)]
Fix stack overflow with huge PT_NOTE segment [BZ #20419]
A PT_NOTE in a binary could be arbitratily large, so using alloca
for it may cause stack overflow. If the note is larger than
__MAX_ALLOCA_CUTOFF, use dynamically allocated memory to read it in.
2018-05-05 Paul Pluzhnikov <ppluzhnikov@google.com>
Stefan Liebler [Thu, 17 May 2018 12:05:51 +0000 (14:05 +0200)]
Fix blocking pthread_join. [BZ #23137]
On s390 (31bit) if glibc is build with -Os, pthread_join sometimes
blocks indefinitely. This is e.g. observable with
testcase intl/tst-gettext6.
pthread_join is calling lll_wait_tid(tid), which performs the futex-wait
syscall in a loop as long as tid != 0 (thread is alive).
On s390 (and build with -Os), tid is loaded from memory before
comparing against zero and then the tid is loaded a second time
in order to pass it to the futex-wait-syscall.
If the thread exits in between, then the futex-wait-syscall is
called with the value zero and it waits until a futex-wake occurs.
As the thread is already exited, there won't be a futex-wake.
In lll_wait_tid, the tid is stored to the local variable __tid,
which is then used as argument for the futex-wait-syscall.
But unfortunately the compiler is allowed to reload the value
from memory.
With this patch, the tid is loaded with atomic_load_acquire.
Then the compiler is not allowed to reload the value for __tid from memory.
ChangeLog:
[BZ #23137]
* sysdeps/nptl/lowlevellock.h (lll_wait_tid):
Use atomic_load_acquire to load __tid.
Joseph Myers [Thu, 17 May 2018 12:04:58 +0000 (14:04 +0200)]
Fix signed integer overflow in random_r (bug 17343).
Bug 17343 reports that stdlib/random_r.c has code with undefined
behavior because of signed integer overflow on int32_t. This patch
changes the code so that the possibly overflowing computations use
unsigned arithmetic instead.
Note that the bug report refers to "Most code" in that file. The
places changed in this patch are the only ones I found where I think
such overflow can occur.
Tested for x86_64 and x86.
[BZ #17343]
* stdlib/random_r.c (__random_r): Use unsigned arithmetic for
possibly overflowing computations.
This patch fixes the i386 sa_restorer field initialization for sigaction
syscall for kernel with vDSO. As described in bug report, i386 Linux
(and compat on x86_64) interprets SA_RESTORER clear with nonzero
sa_restorer as a request for stack switching if the SS segment is 'funny'.
This means that anything that tries to mix glibc's signal handling with
segmentation (for instance through modify_ldt syscall) is randomly broken
depending on what values lands in sa_restorer.
The testcase added is based on Linux test tools/testing/selftests/x86/ldt_gdt.c,
more specifically in do_multicpu_tests function. The main changes are:
- C11 atomics instead of plain access.
- Remove x86_64 support which simplifies the syscall handling and fallbacks.
- Replicate only the test required to trigger the issue.
Checked on i686-linux-gnu.
[BZ #21269]
* sysdeps/unix/sysv/linux/i386/Makefile (tests): Add tst-bz21269.
* sysdeps/unix/sysv/linux/i386/sigaction.c (SET_SA_RESTORER): Clear
sa_restorer for vDSO case.
* sysdeps/unix/sysv/linux/i386/tst-bz21269.c: New file.
DJ Delorie [Fri, 2 Mar 2018 04:20:45 +0000 (23:20 -0500)]
[BZ #22342] Fix netgroup cache keys.
Unlike other nscd caches, the netgroup cache contains two types of
records - those for "iterate through a netgroup" (i.e. setnetgrent())
and those for "is this user in this netgroup" (i.e. innetgr()),
i.e. full and partial records. The timeout code assumes these records
have the same key for the group name, so that the collection of records
that is "this netgroup" can be expired as a unit.
However, the keys are not the same, as the in-netgroup key is generated
by nscd rather than being passed to it from elsewhere, and is generated
without the trailing NUL. All other keys have the trailing NUL, and as
noted in the linked BZ, debug statements confirm that two keys for the
same netgroup are added to the cache with two different lengths.
The result of this is that as records in the cache expire, the purge
code only cleans out one of the two types of entries, resulting in
stale, possibly incorrect, and possibly inconsistent cache data.
The patch simply includes the existing NUL in the computation for the
key length ('key' points to the char after the NUL, and 'group' to the
first char of the group, so 'key-group' includes the first char to the
NUL, inclusive).
[BZ #22342]
* nscd/netgroupcache.c (addinnetgrX): Include trailing NUL in
key value.
Jesse Hathaway [Thu, 17 May 2018 11:56:33 +0000 (13:56 +0200)]
getlogin_r: return early when linux sentinel value is set
When there is no login uid Linux sets /proc/self/loginid to the sentinel
value of, (uid_t) -1. If this is set we can return early and avoid
needlessly looking up the sentinel value in any configured nss
databases.
Checked on aarch64-linux-gnu.
* sysdeps/unix/sysv/linux/getlogin_r.c (__getlogin_r_loginuid): Return
early when linux sentinel value is set.
Arjun Shankar [Thu, 18 Jan 2018 16:47:06 +0000 (16:47 +0000)]
Fix integer overflows in internal memalign and malloc [BZ #22343] [BZ #22774]
When posix_memalign is called with an alignment less than MALLOC_ALIGNMENT
and a requested size close to SIZE_MAX, it falls back to malloc code
(because the alignment of a block returned by malloc is sufficient to
satisfy the call). In this case, an integer overflow in _int_malloc leads
to posix_memalign incorrectly returning successfully.
Upon fixing this and writing a somewhat thorough regression test, it was
discovered that when posix_memalign is called with an alignment larger than
MALLOC_ALIGNMENT (so it uses _int_memalign instead) and a requested size
close to SIZE_MAX, a different integer overflow in _int_memalign leads to
posix_memalign incorrectly returning successfully.
Both integer overflows affect other memory allocation functions that use
_int_malloc (one affected malloc in x86) or _int_memalign as well.
This commit fixes both integer overflows. In addition to this, it adds a
regression test to guard against false successful allocations by the
following memory allocation functions when called with too-large allocation
sizes and, where relevant, various valid alignments:
malloc, realloc, calloc, reallocarray, memalign, posix_memalign,
aligned_alloc, valloc, and pvalloc.
powerpc: Fix syscalls during early process initialization [BZ #22685]
The tunables framework needs to execute syscall early in process
initialization, before the TCB is available for consumption. This
behavior conflicts with powerpc{|64|64le}'s lock elision code, that
checks the TCB before trying to abort transactions immediately before
executing a syscall.
This patch adds a powerpc-specific implementation of __access_noerrno
that does not abort transactions before the executing syscall.
Tested on powerpc{|64|64le}.
[BZ #22685]
* sysdeps/powerpc/powerpc32/sysdep.h (ABORT_TRANSACTION_IMPL): Renamed
from ABORT_TRANSACTION.
(ABORT_TRANSACTION): Redirect to ABORT_TRANSACTION_IMPL.
* sysdeps/powerpc/powerpc64/sysdep.h (ABORT_TRANSACTION,
ABORT_TRANSACTION_IMPL): Likewise.
* sysdeps/unix/sysv/linux/powerpc/not-errno.h: New file. Reuse
Linux code, but remove the code that aborts transactions.
Signed-off-by: Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com> Tested-by: Aurelien Jarno <aurelien@aurel32.net>
(cherry picked from commit 4612268a0ad8e3409d8ce2314dd2dd8ee0af5269)
In C++ mode, __MATH_TG cannot be used for defining iseqsig, because
__MATH_TG relies on __builtin_types_compatible_p, which is a C-only
builtin. This is true when float128 is provided as an ABI-distinct type
from long double.
Moreover, the comparison macros from ISO C take two floating-point
arguments, which need not have the same type. Choosing what underlying
function to call requires evaluating the formats of the arguments, then
selecting which is wider. The macro __MATH_EVAL_FMT2 provides this
information, however, only the type of the macro expansion is relevant
(actually evaluating the expression would be incorrect).
This patch provides a C++ version of iseqsig, in which only the type of
__MATH_EVAL_FMT2 (__typeof or decltype) is used as a template parameter
for __iseqsig_type. This function calls the appropriate underlying
function.
Tested for powerpc64le and x86_64.
[BZ #22377]
* math/Makefile [C++] (tests): Add test for iseqsig.
* math/math.h [C++] (iseqsig): New implementation, which does
not rely on __MATH_TG/__builtin_types_compatible_p.
* math/test-math-iseqsig.cc: New file.
* sysdeps/powerpc/powerpc64le/Makefile
(CFLAGS-test-math-iseqsig.cc): New variable.
Szabolcs Nagy [Wed, 18 Oct 2017 16:26:23 +0000 (17:26 +0100)]
[AARCH64] Rewrite elf_machine_load_address using _DYNAMIC symbol
This patch rewrites aarch64 elf_machine_load_address to use special _DYNAMIC
symbol instead of _dl_start.
The static address of _DYNAMIC symbol is stored in the first GOT entry.
Here is the change which makes this solution work (part of binutils 2.24):
https://sourceware.org/ml/binutils/2013-06/msg00248.html
i386, x86_64 targets use the same method to do this as well.
The original implementation relies on a trick that R_AARCH64_ABS32 relocation
being resolved at link time and the static address fits in the 32bits.
However, in LP64, normally, the address is defined to be 64 bit.
Here is the C version one which should be portable in all cases.
* sysdeps/aarch64/dl-machine.h (elf_machine_load_address): Use
_DYNAMIC symbol to calculate load address.
H.J. Lu [Fri, 19 Jan 2018 17:41:12 +0000 (09:41 -0800)]
x86-64: Properly align La_x86_64_retval to VEC_SIZE [BZ #22715]
_dl_runtime_profile calls _dl_call_pltexit, passing a pointer to
La_x86_64_retval which is allocated on stack. The lrv_vector0
field in La_x86_64_retval must be aligned to size of vector register.
When allocating stack space for La_x86_64_retval, we need to make sure
that the address of La_x86_64_retval + RV_VECTOR0_OFFSET is aligned to
VEC_SIZE. This patch checks the alignment of the lrv_vector0 field
and pads the stack space if needed.
Tested with x32 and x86-64 on SSE4, AVX and AVX512 machines. It fixed
I verified that without the guard accounting change in commit 630f4cc3aa019ede55976ea561f1a7af2f068639 (Fix stack guard size
accounting) and RTLD_NOW for libgcc_s introduced by commit f993b8754080ac7572b692870e926d8b493db16c (nptl: Open libgcc.so with
RTLD_NOW during pthread_cancel), the tst-minstack-cancel test fails on
an AVX-512F machine. tst-minstack-exit still passes, and either of
the mentioned commit by itself frees sufficient stack space to make
tst-minstack-cancel pass, too.
Szabolcs Nagy [Mon, 15 Jan 2018 15:06:31 +0000 (16:06 +0100)]
[BZ #22637] Fix stack guard size accounting
Previously if user requested S stack and G guard when creating a
thread, the total mapping was S and the actual available stack was
S - G - static_tls, which is not what the user requested.
This patch fixes the guard size accounting by pretending the user
requested S+G stack. This way all later logic works out except
when reporting the user requested stack size (pthread_getattr_np)
or when computing the minimal stack size (__pthread_get_minstack).
Normally this will increase thread stack allocations by one page.
TLS accounting is not affected, that will require a separate fix.
Florian Weimer [Mon, 8 Jan 2018 13:57:25 +0000 (14:57 +0100)]
nptl: Add test for callee-saved register restore in pthread_exit
GCC PR 83641 results in a miscompilation of libpthread, which
causes pthread_exit not to restore callee-saved registers before
running destructors for objects on the stack. This test detects
this situation:
info: unsigned int, direct pthread_exit call
tst-thread-exit-clobber.cc:80: numeric comparison failure
left: 4148288912 (0xf741dd90); from: value
right: 1600833940 (0x5f6ac994); from: magic_values.v2
info: double, direct pthread_exit call
info: unsigned int, indirect pthread_exit call
info: double, indirect pthread_exit call
error: 1 test failures
Dmitry V. Levin [Sun, 7 Jan 2018 02:03:41 +0000 (02:03 +0000)]
linux: make getcwd(3) fail if it cannot obtain an absolute path [BZ #22679]
Currently getcwd(3) can succeed without returning an absolute path
because the underlying getcwd syscall, starting with linux commit
v2.6.36-rc1~96^2~2, may succeed without returning an absolute path.
This is a conformance issue because "The getcwd() function shall
place an absolute pathname of the current working directory
in the array pointed to by buf, and return buf".
This is also a security issue because a non-absolute path returned
by getcwd(3) causes a buffer underflow in realpath(3).
Fix this by checking the path returned by getcwd syscall and falling
back to generic_getcwd if the path is not absolute, effectively making
getcwd(3) fail with ENOENT. The error code is chosen for consistency
with the case when the current directory is unlinked.
[BZ #22679]
CVE-2018-1000001
* sysdeps/unix/sysv/linux/getcwd.c (__getcwd): Fall back to
generic_getcwd if the path returned by getcwd syscall is not absolute.
* io/tst-getcwd-abspath.c: New test.
* io/Makefile (tests): Add tst-getcwd-abspath.
ia64: Fix memchr for large input sizes (BZ #22603)
Current optimized ia64 memchr uses a strategy to check for last address
by adding the input one with expected size. However it does not take
care for possible overflow.
It was triggered by 3038145ca23 where default rawmemchr now uses memchr
(p, c, (size_t)-1).
This patch fixes it by implement a satured addition where overflows
sets the maximum pointer size to UINTPTR_MAX.
Checked on ia64-linux-gnu where it fixes both stratcliff and
test-rawmemchr failures.
Adhemerval Zanella <adhemerval.zanella@linaro.org>
James Clarke <jrtc27@jrtc27.com>
[BZ #22603]
* sysdeps/ia64/memchr.S (__memchr): Avoid overflow in pointer
addition.
Aurelien Jarno [Fri, 29 Dec 2017 13:44:57 +0000 (14:44 +0100)]
tst-realloc: do not check for errno on success [BZ #22611]
POSIX explicitly says that applications should check errno only after
failure, so the errno value can be clobbered on success as long as it
is not set to zero.
Changelog:
[BZ #22611]
* malloc/tst-realloc.c (do_test): Remove the test checking that errno
is unchanged on success.
(cherry picked from commit f8aa69be445f65bb36cb3ae9291423600da7d6d2)
Aurelien Jarno [Sat, 30 Dec 2017 09:54:23 +0000 (10:54 +0100)]
elf: Check for empty tokens before dynamic string token expansion [BZ #22625]
The fillin_rpath function in elf/dl-load.c loops over each RPATH or
RUNPATH tokens and interprets empty tokens as the current directory
("./"). In practice the check for empty token is done *after* the
dynamic string token expansion. The expansion process can return an
empty string for the $ORIGIN token if __libc_enable_secure is set
or if the path of the binary can not be determined (/proc not mounted).
Fix that by moving the check for empty tokens before the dynamic string
token expansion. In addition, check for NULL pointer or empty strings
return by expand_dynamic_string_token.
The above changes highlighted a bug in decompose_rpath, an empty array
is represented by the first element being NULL at the fillin_rpath
level, but by using a -1 pointer in decompose_rpath and other functions.
Changelog:
[BZ #22625]
* elf/dl-load.c (fillin_rpath): Check for empty tokens before dynamic
string token expansion. Check for NULL pointer or empty string possibly
returned by expand_dynamic_string_token.
(decompose_rpath): Check for empty path after dynamic string
token expansion.
(cherry picked from commit 3e3c904daef69b8bf7d5cc07f793c9f07c3553ef)
Dmitry V. Levin [Sun, 17 Dec 2017 23:49:46 +0000 (23:49 +0000)]
elf: do not substitute dst in $LD_LIBRARY_PATH twice [BZ #22627]
Starting with commit glibc-2.18.90-470-g2a939a7e6d81f109d49306bc2e10b4ac9ceed8f9 that
introduced substitution of dynamic string tokens in fillin_rpath,
_dl_init_paths invokes _dl_dst_substitute for $LD_LIBRARY_PATH twice:
the first time it's called directly, the second time the result
is passed on to fillin_rpath which calls expand_dynamic_string_token
which in turn calls _dl_dst_substitute, leading to the following
behaviour:
$ mkdir -p /tmp/'$ORIGIN' && cd /tmp/'$ORIGIN' &&
echo 'int main(){}' |gcc -xc - &&
strace -qq -E LD_LIBRARY_PATH='$ORIGIN' -e /open ./a.out
open("/tmp//tmp/$ORIGIN/tls/x86_64/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/tmp//tmp/$ORIGIN/tls/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/tmp//tmp/$ORIGIN/x86_64/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/tmp//tmp/$ORIGIN/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
open("/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
Fix this by removing the direct _dl_dst_substitute invocation.
* elf/dl-load.c (_dl_init_paths): Remove _dl_dst_substitute preparatory
code and invocation.
Luke Shumaker [Wed, 15 Nov 2017 19:39:22 +0000 (20:39 +0100)]
linux ttyname{_r}: Add tests
Add a new tst-ttyname test that includes several named sub-testcases.
This patch is ordered after the patches with the fixes that it tests for (to
avoid breaking `git bisect`), but for reference, here's how each relevant change
so far affected the testcases in this commit, starting with 15e9a4f378c8607c2ae1aa465436af4321db0e23:
[1]: 15e9a4f introduced a semantic that, under certain failure
conditions, ttyname sets errno=ENODEV, where previously it didn't
set errno; it's not quite fair to hold "before 15e9a4f" ttyname to
those new semantics. This testcase actually fails, but would have
passed if we tested for the old the semantics.
Each of the failing tests before 15e9a4f are all essentially the same bug: that
it returns a PTY slave with the correct minor device number, but from the wrong
devpts filesystem instance.
15e9a4f sought to fix this, but missed several of the cases that can cause this
to happen, and also broke the case where both the erroneous PTY and the correct
PTY exist.
Luke Shumaker [Wed, 15 Nov 2017 19:36:44 +0000 (20:36 +0100)]
linux ttyname{_r}: Don't bail prematurely [BZ #22145]
Commit 15e9a4f378c8607c2ae1aa465436af4321db0e23 introduced logic for ttyname()
sending back ENODEV to signal that we can't get a name for the TTY because we
inherited it from a different mount namespace.
However, just because we inherited it from a different mount namespace and it
isn't available at its original path, doesn't mean that its name is unknowable;
we can still try to find it by allowing the normal fall back on iterating
through devices.
An example scenario where this happens is with "/dev/console" in containers.
It's a common practice among container managers to allocate a PTY master/slave
pair in the host's mount namespace (the slave having a path like "/dev/pty/$X"),
bind mount the slave to "/dev/console" in the container's mount namespace, and
send the slave FD to a process in the container. Inside of the
container, the slave-end isn't available at its original path ("/dev/pts/$X"),
since the container mount namespace has a separate devpts instance from the host
(that path may or may not exist in the container; if it does exist, it's not the
same PTY slave device). Currently ttyname{_r} sees that the file at the
original "/dev/pts/$X" path doesn't match the FD passed to it, and fails early
and gives up, even though if it kept searching it would find the TTY at
"/dev/console". Fix that; don't have the ENODEV path force an early return
inhibiting the fall-back search.
This change is based on the previous patch that adds use of is_mytty in
getttyname and getttyname_r. Without that change, this effectively reverts 15e9a4f, which made us disregard the false similarity of file pointed to by
"/proc/self/fd/$Y", because if it doesn't bail prematurely then that file
("/dev/pts/$X") will just come up again anyway in the fall-back search.
Luke Shumaker [Fri, 22 Dec 2017 14:27:57 +0000 (15:27 +0100)]
linux ttyname{_r}: Make tty checks consistent
In the ttyname and ttyname_r routines on Linux, at several points it needs to
check if a given TTY is the TTY we are looking for. It used to be that this
check was (to see if `maybe` is `mytty`):
That is, it made the st_ino and st_dev parts of the check happen even if we have
the st_rdev member. This is an important change, because the kernel allows
multiple devpts filesystem instances to be created; a device file in one devpts
instance may share the same st_rdev with a file in another devpts instance, but
they aren't the same file.
This check appears twice in each file (ttyname.c and ttyname_r.c), once (in
ttyname and __ttyname_r) to check if a candidate file found by inspecting /proc
is the desired TTY, and once (in getttyname and getttyname_r) to check if a
candidate file found by searching /dev is the desired TTY. However, 15e9a4f
only updated the checks for files found via /proc; but the concern about
collisions between devpts instances is just as valid for files found via /dev.
So, update all 4 occurrences the check to be consistent with the version of the
check introduced in 15e9a4f. Make it easy to keep all 4 occurrences of the
check consistent by pulling it in to a static inline function, is_mytty.
Luke Shumaker [Wed, 15 Nov 2017 19:31:32 +0000 (20:31 +0100)]
linux ttyname: Update a reference to kernel docs for kernel 4.10
Linux 4.10 moved many of the documentation files around.
4.10 came out between the time the patch adding the comment (commit 15e9a4f378c8607c2ae1aa465436af4321db0e23) was submitted and the time
it was applied (in February, January, and March 2017; respectively).
Luke Shumaker [Wed, 15 Nov 2017 19:28:40 +0000 (20:28 +0100)]
manual: Update to mention ENODEV for ttyname and ttyname_r
Commit 15e9a4f378c8607c2ae1aa465436af4321db0e23 introduced ENODEV as a possible
error condition for ttyname and ttyname_r. Update the manual to mention this GNU
extension.
ia64: Fix thread stack allocation permission set (BZ #21672)
This patch fixes ia64 failures on thread exit by madvise the required
area taking in consideration its disjoing stacks
(NEED_SEPARATE_REGISTER_STACK). Also the snippet that setup the
madvise call to advertise kernel the area won't be used anymore in
near future is reallocated in allocatestack.c (for consistency to
put all stack management function in one place).
Checked on x86_64-linux-gnu and i686-linux-gnu for sanity (since
it is not expected code changes for architecture that do not
define NEED_SEPARATE_REGISTER_STACK) and also got a report that
it fixes ia64-linux-gnu failures from Sergei Trofimovich
<slyfox@gentoo.org>.
[BZ #21672]
* nptl/allocatestack.c [_STACK_GROWS_DOWN] (setup_stack_prot):
Set to use !NEED_SEPARATE_REGISTER_STACK as well.
(advise_stack_range): New function.
* nptl/pthread_create.c (START_THREAD_DEFN): Move logic to mark
stack non required to advise_stack_range at allocatestack.c
Default semantic for mmap2 syscall is to take the offset in 4096-byte
units. However m68k and ia64 mmap2 implementation take in the
configured pageunit units and for both architecture it can be
different values.
This patch fixes the m68k runtime discover of mmap2 offset unit
and adds the ia64 definition to find it at runtime.
Checked the basic tst-mmap and tst-mmap-offset on m68k (the system
is configured with 4k, so current code is already passing on this
system) and a sanity check on x86_64-linux-gnu (which should not be
affected by this change). Sergei also states that ia64 loader now
work correctly with this change.
Adhemerval Zanella <adhemerval.zanella@linaro.org>
Sergei Trofimovich <slyfox@inbox.ru>
* sysdeps/unix/sysv/linux/m68k/mmap_internal.h (MMAP2_PAGE_SHIFT):
Rename to MMAP2_PAGE_UNIT.
* sysdeps/unix/sysv/linux/mmap.c: Include mmap_internal iff
__OFF_T_MATCHES_OFF64_T is not defined.
* sysdeps/unix/sysv/linux/mmap_internal.h (page_unit): Declare as
uint64_t.
(MMAP2_PAGE_UNIT) [MMAP2_PAGE_UNIT == -1]: Redefine to page_unit.
(page_unit) [MMAP2_PAGE_UNIT != -1]: Remove definition.
James Clarke [Tue, 12 Dec 2017 14:17:10 +0000 (12:17 -0200)]
ia64: Add ipc_priv.h header to set __IPC_64 to zero
When running strace, IPC_64 was set in the command, but ia64 is
an architecture where CONFIG_ARCH_WANT_IPC_PARSE_VERSION *isn't* set
in the kernel, so ipc_parse_version just returns IPC_64 without
clearing the IPC_64 bit in the command.
* sysdeps/unix/sysv/linux/ia64/ipc_priv.h: New file defining
__IPC_64 to 0 to avoid IPC_64 being set.
hooks.c: In function ‘realloc_check’:
hooks.c:352:14: error: ‘magic_p’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
*magic_p ^= 0xFF;
Arjun Shankar [Thu, 30 Nov 2017 12:31:45 +0000 (13:31 +0100)]
Fix integer overflow in malloc when tcache is enabled [BZ #22375]
When the per-thread cache is enabled, __libc_malloc uses request2size (which
does not perform an overflow check) to calculate the chunk size from the
requested allocation size. This leads to an integer overflow causing malloc
to incorrectly return the last successfully allocated block when called with
a very large size argument (close to SIZE_MAX).
This commit uses checked_request2size instead, removing the overflow.
Wilco Dijkstra [Tue, 24 Oct 2017 11:39:24 +0000 (12:39 +0100)]
Add single-threaded path to malloc/realloc/calloc/memalloc
This patch adds a single-threaded fast path to malloc, realloc,
calloc and memalloc. When we're single-threaded, we can bypass
arena_get (which always locks the arena it returns) and just use
the main arena. Also avoid retrying a different arena since
there is just the main arena.
Wilco Dijkstra [Thu, 19 Oct 2017 17:19:55 +0000 (18:19 +0100)]
Fix deadlock in _int_free consistency check
This patch fixes a deadlock in the fastbin consistency check.
If we fail the fast check due to concurrent modifications to
the next chunk or system_mem, we should not lock if we already
have the arena lock. Simplify the check to make it obviously
correct.
* malloc/malloc.c (_int_free): Fix deadlock bug in consistency check.
Linux commit ID cba6ac4869e45cc93ac5497024d1d49576e82666 reserved a new
bit for a scenario where transactional memory is available, but the
suspended state is disabled.
* sysdeps/powerpc/bits/hwcap.h (PPC_FEATURE2_HTM_NO_SUSPEND): New
macro.
Andreas Schwab [Tue, 8 Aug 2017 14:21:58 +0000 (16:21 +0200)]
Don't use IFUNC resolver for longjmp or system in libpthread (bug 21041)
Unlike the vfork forwarder and like the fork forwarder as in bug 19861,
there won't be a problem when the compiler does not turn this into a tail
call.
powerpc: Replace lxvd2x/stxvd2x with lvx/stvx in P7's memcpy/memmove
POWER9 DD2.1 and earlier has an issue where some cache inhibited
vector load traps to the kernel, causing a performance degradation. To
handle this in memcpy and memmove, lvx/stvx is used for aligned
addresses instead of lxvd2x/stxvd2x.
crypt: Use NSPR header files in addition to NSS header files [BZ #17956]
When configuring and building GNU libc using the Mozilla NSS library
for cryptography (--enable-nss-crypt option), also include the
NSPR header files along with the Mozilla NSS library header files.
Finally, when running the check-local-headers test, ignore the
Mozilla NSPR library header files (used by the Mozilla NSS library).
Currently free typically uses 2 atomic operations per call. The have_fastchunks
flag indicates whether there are recently freed blocks in the fastbins. This
is purely an optimization to avoid calling malloc_consolidate too often and
avoiding the overhead of walking all fast bins even if all are empty during a
sequence of allocations. However using catomic_or to update the flag is
completely unnecessary since it can be changed into a simple boolean and
accessed using relaxed atomics. There is no change in multi-threaded behaviour
given the flag is already approximate (it may be set when there are no blocks in
any fast bins, or it may be clear when there are free blocks that could be
consolidated).
Performance of malloc/free improves by 27% on a simple benchmark on AArch64
(both single and multithreaded). The number of load/store exclusive instructions
is reduced by 33%. Bench-malloc-thread speeds up by ~3% in all cases.
Wilco Dijkstra [Tue, 17 Oct 2017 17:25:43 +0000 (18:25 +0100)]
Inline tcache functions
The functions tcache_get and tcache_put show up in profiles as they
are a critical part of the tcache code. Inline them to give tcache
a 16% performance gain. Since this improves multi-threaded cases
as well, it helps offset any potential performance loss due to adding
single-threaded fast paths.
James Clarke [Fri, 13 Oct 2017 18:44:39 +0000 (15:44 -0300)]
Fix TLS relocations against local symbols on powerpc32, sparc32 and sparc64
Normally, TLS relocations against local symbols are optimised by the linker
to be absolute. However, gold does not do this, and so it is possible to
end up with, for example, R_SPARC_TLS_DTPMOD64 referring to a local symbol.
Since sym_map is left as null in elf_machine_rela for the special local
symbol case, the relocation handling thinks it has nothing to do, and so
the module gets left as 0. Havoc then ensues when the variable in question
is accessed.
Before this fix, the main_local_gold program would receive a SIGBUS on
sparc64, and SIGSEGV on powerpc32. With this fix applied, that test now
passes like the rest of them.
* sysdeps/powerpc/powerpc32/dl-machine.h (elf_machine_rela):
Assign sym_map to be map for local symbols, as TLS relocations
use sym_map to determine whether the symbol is defined and to
extract the TLS information.
* sysdeps/sparc/sparc32/dl-machine.h (elf_machine_rela): Likewise.
* sysdeps/sparc/sparc64/dl-machine.h (elf_machine_rela): Likewise.
This patch adds two new internal defines to set the internal
pthread_mutex_t layout required by the supported ABIS:
1. __PTHREAD_MUTEX_NUSERS_AFTER_KIND which control whether to define
__nusers fields before or after __kind. The preferred value for
is 0 for new ports and it sets __nusers before __kind.
2. __PTHREAD_MUTEX_USE_UNION which control whether internal __spins and
__list members will be place inside an union for linuxthreads
compatibility. The preferred value is 0 for ports and it sets
to not use an union to define both fields.
It fixes the wrong offsets value for __kind value on x86_64-linux-gnu-x32.
Checked with a make check run-built-tests=no on all afected ABIs.
[BZ #22298]
* nptl/allocatestack.c (allocate_stack): Check if
__PTHREAD_MUTEX_HAVE_PREV is non-zero, instead if
__PTHREAD_MUTEX_HAVE_PREV is defined.
* nptl/descr.h (pthread): Likewise.
* nptl/nptl-init.c (__pthread_initialize_minimal_internal):
Likewise.
* nptl/pthread_create.c (START_THREAD_DEFN): Likewise.
* sysdeps/nptl/fork.c (__libc_fork): Likewise.
* sysdeps/nptl/pthread.h (PTHREAD_MUTEX_INITIALIZER): Likewise.
* sysdeps/nptl/bits/thread-shared-types.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): New
defines.
(__pthread_internal_list): Check __PTHREAD_MUTEX_USE_UNION instead
of __WORDSIZE for internal layout.
(__pthread_mutex_s): Check __PTHREAD_MUTEX_NUSERS_AFTER_KIND instead
of __WORDSIZE for internal __nusers layout and __PTHREAD_MUTEX_USE_UNION
instead of __WORDSIZE whether to use an union for __spins and __list
fields.
(__PTHREAD_MUTEX_HAVE_PREV): Define also for __PTHREAD_MUTEX_USE_UNION
case.
* sysdeps/aarch64/nptl/bits/pthreadtypes-arch.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): New
defines.
* sysdeps/alpha/nptl/bits/pthreadtypes-arch.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION):
Likewise.
* sysdeps/arm/nptl/bits/pthreadtypes-arch.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION):
Likewise.
* sysdeps/hppa/nptl/bits/pthreadtypes-arch.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION):
Likewise.
* sysdeps/ia64/nptl/bits/pthreadtypes-arch.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION):
Likewise.
* sysdeps/m68k/nptl/bits/pthreadtypes-arch.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION):
Likewise.
* sysdeps/microblaze/nptl/bits/pthreadtypes-arch.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION):
Likewise.
* sysdeps/mips/nptl/bits/pthreadtypes-arch.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION):
Likewise.
* sysdeps/nios2/nptl/bits/pthreadtypes-arch.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION):
Likewise.
* sysdeps/powerpc/nptl/bits/pthreadtypes-arch.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION):
Likewise.
* sysdeps/s390/nptl/bits/pthreadtypes-arch.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION):
Likewise.
* sysdeps/sh/nptl/bits/pthreadtypes-arch.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION):
Likewise.
* sysdeps/sparc/nptl/bits/pthreadtypes-arch.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION):
Likewise.
* sysdeps/tile/nptl/bits/pthreadtypes-arch.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION):
Likewise.
* sysdeps/x86/nptl/bits/pthreadtypes-arch.h
(__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION):
Likewise.
nptl: Add tests for internal pthread_mutex_t offsets
This patch adds a new build test to check for internal fields
offsets for user visible internal field. Although currently
the only field which is statically initialized to a non zero value
is pthread_mutex_t.__data.__kind value, the tests also check the
offset of __kind, __spins, __elision (if supported), and __list
internal member. A internal header (pthread-offset.h) is added
to each major ABI with the reference value.
Checked on x86_64-linux-gnu and with a build check for all affected
ABIs (aarch64-linux-gnu, alpha-linux-gnu, arm-linux-gnueabihf,
hppa-linux-gnu, i686-linux-gnu, ia64-linux-gnu, m68k-linux-gnu,
microblaze-linux-gnu, mips64-linux-gnu, mips64-n32-linux-gnu,
mips-linux-gnu, powerpc64le-linux-gnu, powerpc-linux-gnu,
s390-linux-gnu, s390x-linux-gnu, sh4-linux-gnu, sparc64-linux-gnu,
sparcv9-linux-gnu, tilegx-linux-gnu, tilegx-linux-gnu-x32,
tilepro-linux-gnu, x86_64-linux-gnu, and x86_64-linux-x32).
posix: Do not use WNOHANG in waitpid call for Linux posix_spawn
As shown in some buildbot issues on aarch64 and powerpc, calling
clone (VFORK) and waitpid (WNOHANG) does not guarantee the child
is ready to be collected. This patch changes the call back to 0
as before fe05e1cb6d64 fix.
This change can lead to the scenario 4.3 described in the commit,
where the waitpid call can hang undefinitely on the call. However
this is also a very unlikely and also undefinied situation where
both the caller is trying to terminate a pid before posix_spawn
returns and the race pid reuse is triggered. I don't see how to
correct handle this specific situation within posix_spawn.
Checked on x86_64-linux-gnu, aarch64-linux-gnu and
powerpc64-linux-gnu.
* sysdeps/unix/sysv/linux/spawni.c (__spawnix): Use 0 instead of
WNOHANG in waitpid call.
posix: Fix improper assert in Linux posix_spawn (BZ#22273)
As noted by Florian Weimer, current Linux posix_spawn implementation
can trigger an assert if the auxiliary process is terminated before
actually setting the err member:
340 /* Child must set args.err to something non-negative - we rely on
341 the parent and child sharing VM. */
342 args.err = -1;
[...]
362 new_pid = CLONE (__spawni_child, STACK (stack, stack_size), stack_size,
363 CLONE_VM | CLONE_VFORK | SIGCHLD, &args);
364
365 if (new_pid > 0)
366 {
367 ec = args.err;
368 assert (ec >= 0);
Another possible issue is killing the child between setting the err and
actually calling execve. In this case the process will not ran, but
posix_spawn also will not report any error:
As suggested by Andreas Schwab, this patch removes the faulty assert
and also handles any signal that happens before fork and execve as the
spawn was successful (and thus relaying the handling to the caller to
figure this out). Different than Florian, I can not see why using
atomics to set err would help here, essentially the code runs
sequentially (due CLONE_VFORK) and I think it would not be legal the
compiler evaluate ec without checking for new_pid result (thus there
is no need to compiler barrier).
Summarizing the possible scenarios on posix_spawn execution, we
have:
1. For default case with a success execution, args.err will be 0, pid
will not be collected and it will be reported to caller.
2. For default failure case, args.err will be positive and the it will
be collected by the waitpid. An error will be reported to the
caller.
3. For the unlikely case where the process was terminated and not
collected by a caller signal handler, it will be reported as succeful
execution and not be collected by posix_spawn (since args.err will
be 0). The caller will need to actually handle this case.
4. For the unlikely case where the process was terminated and collected
by caller we have 3 other possible scenarios:
4.1. The auxiliary process was terminated with args.err equal to 0:
it will handled as 1. (so it does not matter if we hit the pid
reuse race since we won't possible collect an unexpected
process).
4.2. The auxiliary process was terminated after execve (due a failure
in calling it) and before setting args.err to -1: it will also
be handle as 1. but with the issue of not be able to report the
caller a possible execve failures.
4.3. The auxiliary process was terminated after args.err is set to -1:
this is the case where it will be possible to hit the pid reuse
case where we will need to collected the auxiliary pid but we
can not be sure if it will be expected one. I think for this
case we need to actually change waitpid to use WNOHANG to avoid
hanging indefinitely on the call and report an error to caller
since we can't differentiate between a default failure as 2.
and a possible pid reuse race issue.
Checked on x86_64-linux-gnu.
* sysdeps/unix/sysv/linux/spawni.c (__spawnix): Handle the case where
the auxiliary process is terminated by a signal before calling _exit
or execve.
This patch consolidates the glob implementation. The main changes are:
* On Linux all implementation now uses the default one at
sysdeps/unix/sysv/linux/glob{free}{64}.c with the exception
of alpha (which requires specific versioning) and s390-32 (which
different than other 32 bits ports it does not add a compat one
symbol for 2.1 version).
* The default implementation uses XSTAT_IS_XSTAT64 to define whether
both glob{free} and glob{free}64 should be different implementations.
For archictures that define XSTAT_IS_XSTAT64, glob{free} is an alias
to glob{free}64.
* Move i386 olddirent.h header to Linux default directory, since it is
the only header with this name and it is shared among different
architectures (and used on compat glob symbol as well).
Checked on x86_64-linux-gnu and on a build using build-many-glibcs.py
for all major architectures.
Current implementation of tunables does not set arena_max and arena_test
values. Any value provided by glibc.malloc.arena_max and
glibc.malloc.arena_test parameters is ignored.
These tunables have minval value set to 1 (see elf/dl-tunables.list file)
and undefined maxval value. In that case default value (which is 0. see
scripts/gen-tunables.awk) is being used to set maxval.
For instance, generated tunable_list[] entry for arena_max is:
(gdb) p *cur
$1 = {name = 0x7ffff7df6217 "glibc.malloc.arena_max",
type = {type_code = TUNABLE_TYPE_SIZE_T, min = 1, max = 0},
val = {numval = 0, strval = 0x0}, initialized = false,
security_level = TUNABLE_SECLEVEL_SXID_IGNORE,
env_alias = 0x7ffff7df622e "MALLOC_ARENA_MAX"}
As a result, any value of glibc.malloc.arena_max is ignored by
TUNABLE_SET_VAL_IF_VALID_RANGE macro
__type min = (__cur)->type.min; <- initialized to 1
__type max = (__cur)->type.max; <- initialized to 0!
if (min == max) <- false
{
min = __default_min;
max = __default_max;
}
if ((__type) (__val) >= min && (__type) (val) <= max) <- false
{
(__cur)->val.numval = val;
(__cur)->initialized = true;
}
Assigning correct min/max values at a build time fixes a problem.
Plus, a bit of optimization: Setting of default min/max values for the
given type at a run time might be eliminated.
* elf/dl-tunables.c (do_tunable_update_val): Range checking fix.
* scripts/gen-tunables.awk: Set unspecified minval and/or maxval
values to correct default value for given type.