git.ipfire.org Git - thirdparty/glibc.git/log

Avoid .symver on common symbols [BZ #21666]

The .symver directive on common symbol just creates a new common symbol,
not an alias and the newer assembler with the bug fix for

https://sourceware.org/bugzilla/show_bug.cgi?id=21661

will issue an error.  Before the fix, we got

$ readelf -sW libc.so | grep "loc[12s]"
  5109: 00000000003a0608     8 OBJECT  LOCAL  DEFAULT   36 loc1
  5188: 00000000003a0610     8 OBJECT  LOCAL  DEFAULT   36 loc2
  5455: 00000000003a0618     8 OBJECT  LOCAL  DEFAULT   36 locs
  6575: 00000000003a05f0     8 OBJECT  GLOBAL DEFAULT   36 locs@GLIBC_2.2.5
  7156: 00000000003a05f8     8 OBJECT  GLOBAL DEFAULT   36 loc1@GLIBC_2.2.5
  7312: 00000000003a0600     8 OBJECT  GLOBAL DEFAULT   36 loc2@GLIBC_2.2.5

in libc.so.  The versioned loc1, loc2 and locs have the wrong addresses.
After the fix, we got

$ readelf -sW libc.so | grep "loc[12s]"
  6570: 000000000039e3b8     8 OBJECT  GLOBAL DEFAULT   34 locs@GLIBC_2.2.5
  7151: 000000000039e3c8     8 OBJECT  GLOBAL DEFAULT   34 loc1@GLIBC_2.2.5
  7307: 000000000039e3c0     8 OBJECT  GLOBAL DEFAULT   34 loc2@GLIBC_2.2.5

[BZ #21666]
* misc/regexp.c (loc1): Add __attribute__ ((nocommon));
(loc2): Likewise.
(locs): Likewise.

(cherry picked from commit 388b4f1a02f3a801965028bbfcd48d905638b797)

[AArch64] Use hidden __GI__dl_argv in rtld startup code

We rely on the symbol being locally defined so using extern symbol
is not correct and the linker may complain about the relocations.

conform tests: call perl with '-I.'

Historically perl includes the current directory in the module search
path. Over the time this has been considered as a security issue and
the recent vulnerabilities [1] made people to reconsider this behaviour.
It is almost sure that this will be removed in the future [2], possibly
for the 5.26 release, although this is not yet firmly decided.

Debian has decided to backport the patches [3], so the perl binary in
unstable do not have '.' in @INC anymore.

This behaviour is used in the conform perl scripts to include the
GlibcConform module. This patch fixes that by calling perl with '-I.'.
This is not a security issue in this case as make ensures that the
current directory is $(srcdir)/conform/ when the scripts are called.
Passing the full path would do exactly the same.

[1] CVE-2016-1238 CVE-2016-6185
[2] https://rt.perl.org/Public/Bug/Display.html?id=127810
[3] https://lists.debian.org/debian-devel-announce/2016/08/msg00013.html

Changelog:
* conform/Makefile (conformtest-header-tests): Pass -I. to $(PERL).
(linknamespace-symlists-tests): Likewise.
(linknamespace-header-tests): Likewise.

(cherry picked from commit 6d5336211d2e823d4d431a01e62a80d9be4cbc9d)

x86-64: Align the stack in __tls_get_addr [BZ #21609]

This change forces realignment of the stack pointer in __tls_get_addr, so
that binaries compiled by GCCs older than GCC 4.9:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58066

continue to work even if vector instructions are used in glibc which
require the ABI stack realignment.

__tls_get_addr_slow is added to handle the slow paths in the default
implementation of__tls_get_addr in elf/dl-tls.c. The new __tls_get_addr
calls __tls_get_addr_slow after realigning the stack. Internal calls
within ld.so go directly to the default implementation of __tls_get_addr
because they do not need stack realignment.

[BZ #21609]
* sysdeps/x86_64/Makefile (sysdep-dl-routines): Add tls_get_addr.
(gen-as-const-headers): Add rtld-offsets.sym.
* sysdeps/x86_64/dl-tls.c: New file.
* sysdeps/x86_64/rtld-offsets.sym: Likwise.
* sysdeps/x86_64/tls_get_addr.S: Likewise.
* sysdeps/x86_64/dl-tls.h: Add multiple inclusion guards.
* sysdeps/x86_64/tlsdesc.sym (TI_MODULE_OFFSET): New.
(TI_OFFSET_OFFSET): Likwise.

(cherry picked from commit 031e519c95c069abe4e4c7c59e2b4b67efccdee5)

i686: Add missing IS_IN (libc) guards to vectorized strcspn

Since commit d957c4d3fa48d685ff2726c605c988127ef99395 (i386: Compile
rtld-*.os with -mno-sse -mno-mmx -mfpmath=387), vector intrinsics can
no longer be used in ld.so, even if the compiled code never makes it
into the final ld.so link. This commit adds the missing IS_IN (libc)
guard to the SSE 4.2 strcspn implementation, so that it can be used from
ld.so in the future.

(cherry picked from commit 69052a3a95da37169a08f9e59b2cc1808312753c)

Ignore and remove LD_HWCAP_MASK for AT_SECURE programs (bug #21209)

The LD_HWCAP_MASK environment variable may alter the selection of
function variants for some architectures. For AT_SECURE process it
means that if an outdated routine has a bug that would otherwise not
affect newer platforms by default, LD_HWCAP_MASK will allow that bug
to be exploited.

To be on the safe side, ignore and disable LD_HWCAP_MASK for setuid
binaries.

[BZ #21209]
* elf/rtld.c (process_envvars): Ignore LD_HWCAP_MASK for
AT_SECURE processes.
* sysdeps/generic/unsecvars.h: Add LD_HWCAP_MASK.

(cherry picked from commit 1c1243b6fc33c029488add276e56570a07803bfd)

ld.so: Reject overly long LD_AUDIT path elements

Also only process the last LD_AUDIT entry.

(cherry picked from commit 81b82fb966ffbd94353f793ad17116c6088dedd9)

ld.so: Reject overly long LD_PRELOAD path elements

(cherry picked from commit 6d0ba622891bed9d8394eef1935add53003b12e8)

CVE-2017-1000366: Ignore LD_LIBRARY_PATH for AT_SECURE=1 programs [BZ #21624]

LD_LIBRARY_PATH can only be used to reorder system search paths, which
is not useful functionality.

This makes an exploitable unbounded alloca in _dl_init_paths unreachable
for AT_SECURE=1 programs.

(cherry picked from commit f6110a8fee2ca36f8e2d2abecf3cba9fa7b8ea7d)

m68k: fix 64bit atomic ops

(cherry picked from commit 64ae9fe45662c8994b0e56ab469b01967408a154)

Correct collation rules for Malayalam.

[BZ #19922]
* locales/iso14651_t1_common: Add collation rules for U+07DA to U+07DF.

[BZ #19919]
* locales/iso14651_t1_common: Correct collation of U+0D36 and U+0D37.

fork: Remove bogus parent PID assertions [BZ #21386]

(cherry picked from commit 1d2bc2eae969543b89850e35e532f3144122d80a)

Use test-driver in sysdeps/unix/sysv/linux/tst-clone2.c

(cherry picked from commit fdc543919a3d8578631a492e1227c2cd8f5ecec7)

Remove cached PID/TID in clone

This patch remove the PID cache and usage in current GLIBC code.  Current
usage is mainly used a performance optimization to avoid the syscall,
however it adds some issues:

  - The exposed clone syscall will try to set pid/tid to make the new
    thread somewhat compatible with current GLIBC assumptions.  This cause
    a set of issue with new workloads and usecases (such as BZ#17214 and
    [1]) as well for new internal usage of clone to optimize other algorithms
    (such as clone plus CLONE_VM for posix_spawn, BZ#19957).

  - The caching complexity also added some bugs in the past [2] [3] and
    requires more effort of each port to handle such requirements (for
    both clone and vfork implementation).

  - Caching performance gain in mainly on getpid and some specific
    code paths.  The getpid performance leverage is questionable [4],
    either by the idea of getpid being a hotspot as for the getpid
    implementation itself (if it is indeed a justifiable hotspot a
    vDSO symbol could let to a much more simpler solution).

    Other usage is mainly for non usual code paths, such as pthread
    cancellation signal and handling.

For thread creation (on stack allocation) the code simplification in fact
adds some performance gain due the no need of transverse the stack cache
and invalidate each element pid.

Other thread usages will require a direct getpid syscall, such as
cancellation/setxid signal, thread cancellation, thread fail path (at
create_thread), and thread signal (pthread_kill and pthread_sigqueue).
However these are hardly usual hotspots and I think adding a syscall is
justifiable.

It also simplifies both the clone and vfork arch-specific implementation.
And by review each fork implementation there are some discrepancies that
this patch also solves:

  - microblaze clone/vfork does not set/reset the pid/tid field
  - hppa uses the default vfork implementation that fallback to fork.
    Since vfork is deprecated I do not think we should bother with it.

The patch also removes the TID caching in clone. My understanding for
such semantic is try provide some pthread usage after a user program
issue clone directly (as done by thread creation with CLONE_PARENT_SETTID
and pthread tid member).  However, as stated before in multiple discussions
threads, GLIBC provides clone syscalls without further supporting all this
semantics.

I ran a full make check on x86_64, x32, i686, armhf, aarch64, and powerpc64le.
For sparc32, sparc64, and mips I ran the basic fork and vfork tests from
posix/ folder (on a qemu system).  So it would require further testing
on alpha, hppa, ia64, m68k, nios2, s390, sh, and tile (I excluded microblaze
because it is already implementing the patch semantic regarding clone/vfork).

[1] https://codereview.chromium.org/800183004/
[2] https://sourceware.org/ml/libc-alpha/2006-07/msg00123.html
[3] https://sourceware.org/bugzilla/show_bug.cgi?id=15368
[4] http://yarchive.net/comp/linux/getpid_caching.html

* sysdeps/nptl/fork.c (__libc_fork): Remove pid cache setting.
* nptl/allocatestack.c (allocate_stack): Likewise.
(__reclaim_stacks): Likewise.
(setxid_signal_thread): Obtain pid through syscall.
* nptl/nptl-init.c (sigcancel_handler): Likewise.
(sighandle_setxid): Likewise.
* nptl/pthread_cancel.c (pthread_cancel): Likewise.
* sysdeps/unix/sysv/linux/pthread_kill.c (__pthread_kill): Likewise.
* sysdeps/unix/sysv/linux/pthread_sigqueue.c (pthread_sigqueue):
Likewise.
* sysdeps/unix/sysv/linux/createthread.c (create_thread): Likewise.
* sysdeps/unix/sysv/linux/getpid.c: Remove file.
* nptl/descr.h (struct pthread): Change comment about pid value.
* nptl/pthread_getattr_np.c (pthread_getattr_np): Remove thread
pid assert.
* sysdeps/unix/sysv/linux/pthread-pids.h (__pthread_initialize_pids):
Do not set pid value.
* nptl_db/td_ta_thr_iter.c (iterate_thread_list): Remove thread
pid cache check.
* nptl_db/td_thr_validate.c (td_thr_validate): Likewise.
* sysdeps/aarch64/nptl/tcb-offsets.sym: Remove pid offset.
* sysdeps/alpha/nptl/tcb-offsets.sym: Likewise.
* sysdeps/arm/nptl/tcb-offsets.sym: Likewise.
* sysdeps/hppa/nptl/tcb-offsets.sym: Likewise.
* sysdeps/i386/nptl/tcb-offsets.sym: Likewise.
* sysdeps/ia64/nptl/tcb-offsets.sym: Likewise.
* sysdeps/m68k/nptl/tcb-offsets.sym: Likewise.
* sysdeps/microblaze/nptl/tcb-offsets.sym: Likewise.
* sysdeps/mips/nptl/tcb-offsets.sym: Likewise.
* sysdeps/nios2/nptl/tcb-offsets.sym: Likewise.
* sysdeps/powerpc/nptl/tcb-offsets.sym: Likewise.
* sysdeps/s390/nptl/tcb-offsets.sym: Likewise.
* sysdeps/sh/nptl/tcb-offsets.sym: Likewise.
* sysdeps/sparc/nptl/tcb-offsets.sym: Likewise.
* sysdeps/tile/nptl/tcb-offsets.sym: Likewise.
* sysdeps/x86_64/nptl/tcb-offsets.sym: Likewise.
* sysdeps/unix/sysv/linux/aarch64/clone.S: Remove pid and tid caching.
* sysdeps/unix/sysv/linux/alpha/clone.S: Likewise.
* sysdeps/unix/sysv/linux/arm/clone.S: Likewise.
* sysdeps/unix/sysv/linux/hppa/clone.S: Likewise.
* sysdeps/unix/sysv/linux/i386/clone.S: Likewise.
* sysdeps/unix/sysv/linux/ia64/clone2.S: Likewise.
* sysdeps/unix/sysv/linux/mips/clone.S: Likewise.
* sysdeps/unix/sysv/linux/nios2/clone.S: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/clone.S: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-32/clone.S: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-64/clone.S: Likewise.
* sysdeps/unix/sysv/linux/sh/clone.S: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc32/clone.S: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc64/clone.S: Likewise.
* sysdeps/unix/sysv/linux/tile/clone.S: Likewise.
* sysdeps/unix/sysv/linux/x86_64/clone.S: Likewise.
* sysdeps/unix/sysv/linux/aarch64/vfork.S: Remove pid set and reset.
* sysdeps/unix/sysv/linux/alpha/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/arm/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/i386/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/ia64/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/m68k/clone.S: Likewise.
* sysdeps/unix/sysv/linux/m68k/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/mips/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/nios2/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-32/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-64/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/sh/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc32/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc64/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/tile/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/x86_64/vfork.S: Likewise.
* sysdeps/unix/sysv/linux/tst-clone2.c (f): Remove direct pthread
struct access.
(clone_test): Remove function.
(do_test): Rewrite to take in consideration pid is not cached anymore.

(cherry picked from commit c579f48edba88380635ab98cb612030e3ed8691e)

Add INTERNAL_SYSCALL_CALL

This patch adds two new macros for internal and inline syscall to use
within GLIBC: INTERNAL_SYSCALL_CALL and INLINE_SYSCALL_CALL.  They are
similar to the old INTERNAL_SYSCALL and INLINE_SYSCALL with the difference
the new macros accept a variable argument call and do not require to pass
the expected argument size.

The advantage is it is possible to use variable argument macros like
SYSCALL_LL{64} without the need to also handle the argument size.  So
for an ABI where SYSCALL_LL might split the argument in high and low
parts, instead of:

  INTERNAL_SYSCALL_DECL (err);
#if ...
  INTERNAL_SYSCALL (syscall, err, 2, SYSCALL_LL (len));
#else
  INTERNAL_SYSCALL (syscall, err, 1, SYSCALL_LL (len));
#endif

It will be just:

  INTERNAL_SYSCALL_CALL (syscall, err, SYSCALL_LL (len));

The INLINE_SYSCALL_CALL follows the same semanthic regarding the argument
and is similar to INLINE_SYSCALL regarding setting errno.

Checked with a build for x86_64, i386, aach64, armhf, powerpc64le, powerpc32,
and mips32.  No code generation changed.

* sysdeps/unix/sysdep.h (__INTERNAL_SYSCALL0): New macro.
(__INTERNAL_SYSCALL1): Likewise.
(__INTERNAL_SYSCALL2): Likewise.
(__INTERNAL_SYSCALL3): Likewise.
(__INTERNAL_SYSCALL4): Likewise.
(__INTERNAL_SYSCALL5): Likewise.
(__INTERNAL_SYSCALL6): Likewise.
(__INTERNAL_SYSCALL7): Likewise.
(__INTERNAL_SYSCALL_NARGS_X): Likewise.
(__INTERNAL_SYSCALL_NARGS): Likewise.
(__INTERNAL_SYSCALL_DISP): Likewise.
(INTERNAL_SYSCALL_CALL): Likewise.
(__SYSCALL0): Rename to __INLINE_SYSCALL0.
(__SYSCALL1): Rename to __INLINE_SYSCALL1.
(__SYSCALL2): Rename to __INLINE_SYSCALL2.
(__SYSCALL3): Rename to __INLINE_SYSCALL3.
(__SYSCALL4): Rename to __INLINE_SYSCALL4.
(__SYSCALL5): Rename to __INLINE_SYSCALL5.
(__SYSCALL6): Rename to __INLINE_SYSCALL6.
(__SYSCALL7): Rename to __INLINE_SYSCALL7.
(__SYSCALL_NARGS_X): Rename to __INLINE_SYSCALL_NARGS_X.
(__SYSCALL_NARGS): Rename to __INLINE_SYSCALL_NARGS.
(__SYSCALL_DISP): Rename to __INLINE_SYSCALL_DISP.
(__SYSCALL_CALL): Rename to INLINE_SYSCALL_CALL.
(SYSCALL_CANCEL): Replace __SYSCALL_CALL with INLINE_SYSCALL_CALL.

(cherry picked from commit e33a23fbe8c2dba04fe05678c584d3efcb6c9951)

Synchronize support/ infrastructure with master

This commit updates the support/ subdirectory to
commit 2714c5f3c95f90977167c1d21326d907fb76b419
on the master branch and modifies Makeconfig,
Rules, and extra-lib.mk accordingly.

x86: Use AVX2 memcpy/memset on Skylake server [BZ #21396]

On Skylake server, AVX512 load/store instructions in memcpy/memset may
lead to lower CPU turbo frequency in certain situations. Use of AVX2
in memcpy/memset has been observed to have improved overall performance
in many workloads due to the higher frequency.

Since AVX512ER is unique to Xeon Phi, this patch sets Prefer_No_AVX512
if AVX512ER isn't available so that AVX2 versions of memcpy/memset are
used on Skylake server.

[BZ #21396]
* sysdeps/x86/cpu-features.c (init_cpu_features): Set
Prefer_No_AVX512 if AVX512ER isn't available.
* sysdeps/x86/cpu-features.h (bit_arch_Prefer_No_AVX512): New.
(index_arch_Prefer_No_AVX512): Likewise.
* sysdeps/x86_64/multiarch/memcpy.S (__new_memcpy): Don't use
AVX512 version if Prefer_No_AVX512 is set.
* sysdeps/x86_64/multiarch/memcpy_chk.S (__memcpy_chk):
Likewise.
* sysdeps/x86_64/multiarch/memmove.S (__libc_memmove): Likewise.
* sysdeps/x86_64/multiarch/memmove_chk.S (__memmove_chk):
Likewise.
* sysdeps/x86_64/multiarch/mempcpy.S (__mempcpy): Likewise.
* sysdeps/x86_64/multiarch/mempcpy_chk.S (__mempcpy_chk):
Likewise.
* sysdeps/x86_64/multiarch/memset.S (memset): Likewise.
* sysdeps/x86_64/multiarch/memset_chk.S (__memset_chk):
Likewise.

(cherry picked from commit 4cb334c4d6249686653137ec273d081371b3672d)

x86: Set Prefer_No_VZEROUPPER if AVX512ER is available

AVX512ER won't be implemented in any Xeon processors and will be in
all Xeon Phi processors.  Don't check CPU model number when setting
Prefer_No_VZEROUPPER for Xeon Phi.  Instead, set Prefer_No_VZEROUPPER
if AVX512ER is available.  It works with current and future Xeon Phi
and non-Xeon Phi processors.

* sysdeps/x86/cpu-features.c (init_cpu_features): Set
Prefer_No_VZEROUPPER if AVX512ER is available.
* sysdeps/x86/cpu-features.h
(bit_cpu_AVX512PF): New.
(bit_cpu_AVX512ER): Likewise.
(bit_cpu_AVX512CD): Likewise.
(bit_cpu_AVX512BW): Likewise.
(bit_cpu_AVX512VL): Likewise.
(index_cpu_AVX512PF): Likewise.
(index_cpu_AVX512ER): Likewise.
(index_cpu_AVX512CD): Likewise.
(index_cpu_AVX512BW): Likewise.
(index_cpu_AVX512VL): Likewise.
(reg_AVX512PF): Likewise.
(reg_AVX512ER): Likewise.
(reg_AVX512CD): Likewise.
(reg_AVX512BW): Likewise.
(reg_AVX512VL): Likewise.

(cherry picked from commit 1c53cb49de6d82d9469ccbd5aa0c55924502bd8b)

Fix MIPS n64 readahead (bug 21026).

As noted in bug 20126, MIPS n64 uses an incorrect implementation of
readahead intended for 32-bit systems. This patch adds a
syscalls.list entry to fix this. An updated version of the
consolidation patch
<https://sourceware.org/ml/libc-alpha/2016-09/msg00527.html> could
remove this syscalls.list entry again.

Tested with compilation (only) for mips64; the nature of the syscall
doesn't allow for a glibc test to detect this issue.

[BZ #21026]
* sysdeps/unix/sysv/linux/mips/mips64/n64/syscalls.list
(readahead): New syscall entry.

(cherry picked from commit 30733525c6867c160261db1afade6326000f9f75)

x86-64: Improve branch predication in _dl_runtime_resolve_avx512_opt [BZ #21258]

On Skylake server, _dl_runtime_resolve_avx512_opt is used to preserve
the first 8 vector registers.  The code layout is

  if only %xmm0 - %xmm7 registers are used
     preserve %xmm0 - %xmm7 registers
  if only %ymm0 - %ymm7 registers are used
     preserve %ymm0 - %ymm7 registers
  preserve %zmm0 - %zmm7 registers

Branch predication always executes the fallthrough code path to preserve
%zmm0 - %zmm7 registers speculatively, even though only %xmm0 - %xmm7
registers are used.  This leads to lower CPU frequency on Skylake
server.  This patch changes the fallthrough code path to preserve
%xmm0 - %xmm7 registers instead:

  if whole %zmm0 - %zmm7 registers are used
    preserve %zmm0 - %zmm7 registers
  if only %ymm0 - %ymm7 registers are used
     preserve %ymm0 - %ymm7 registers
  preserve %xmm0 - %xmm7 registers

Tested on Skylake server.

[BZ #21258]
* sysdeps/x86_64/dl-trampoline.S (_dl_runtime_resolve_opt):
Define only if _dl_runtime_resolve is defined to
_dl_runtime_resolve_sse_vex.
* sysdeps/x86_64/dl-trampoline.h (_dl_runtime_resolve_opt):
Fallthrough to _dl_runtime_resolve_sse_vex.

(cherry picked from commit c15f8eb50cea7ad1a4ccece6e0982bf426d52c00)

posix_spawn: use a larger min stack for -fstack-check [BZ #21253]

When glibc is built with -fstack-check, trying to use posix_spawn can
lead to segfaults due to gcc internally probing stack memory too far.
The new spawn API will allocate a minimum of 1 page, but the stack
checking logic might probe a couple of pages.  When it tries to walk
them, everything falls apart.

The gcc internal docs [1] state the default interval checking is one
page.  Which means we need two pages (the current one, and the next
probed).  No target currently defines it larger.

Further, it mentions that the default minimum stack size needed to
recover from an overflow is 4/8KiB for sjlj or 8/12KiB for others.
But some Linux targets (like mips and ppc) go up to 16KiB (and some
non-Linux targets go up to 24KiB).

Let's create each child with a minimum of 32KiB slack space to support
them all, and give us future breathing room.

No test is added as existing ones crash.  Even a simple call is
enough to trigger the problem:
char *argv[] = { "/bin/ls", NULL };
posix_spawn(NULL, "/bin/ls", NULL, NULL, argv, NULL);

[1] https://gcc.gnu.org/onlinedocs/gcc-6.3.0/gccint/Stack-Checking.html

(cherry picked from commit 21f042c804835d1f7a4a8e06f2c93ca35a182042)

fts: Fix symbol redirect for fts_set [BZ #21289]

In a 32-bit environment with _FILE_OFFSET_BITS=64, the __REDIRECT macro
combined with __THROW generates an invalid C++ declaration.

(cherry picked from commit ce39613205dc47ceaeea76710d49e7a483b503ab)

posix_spawn: fix stack setup on ia64 [BZ #21275]

The ia64-specific clone2 call expects the base of the stack mapping and
the stack size as sep arguments, not an initial stack value as on other
stack-grows-down architectures. Reuse the stack-grows-up macro so we
pass in the right stack base.

Reported-by: Matt Turner <mattst88@gentoo.org>
(cherry picked from commit ddc3fb333469c2997798742dc0509dc1e3201d91)

hppa: Fix setting of __libc_stack_end

The binutils package was recently changed to fix -z relro support on hppa.
See ld/21000 for details:
https://sourceware.org/bugzilla/show_bug.cgi?id=21000

This exposed a problem with the _dl_start_user function in the RTLD_START
define. We need to set __libc_stack_end before it is made read only. For
this, we need to define DL_STACK_END. The offset of 0x160 gives the same
stack end as the code in _dl_start_user.

A build log with the attached patch is here:
https://buildd.debian.org/status/fetch.php?pkg=glibc&arch=hppa&ver=2.24-9&stamp=1487639205&raw=0

(cherry picked from commit 5d20a49aaccef5ef7adac93d5ca159f6b7ba0105)

Add VZEROUPPER to memset-vec-unaligned-erms.S [BZ #21081]

Since memset-vec-unaligned-erms.S has VDUP_TO_VEC0_AND_SET_RETURN at
function entry, memset optimized for AVX2 and AVX512 will always use
ymm/zmm register. VZEROUPPER should be placed before ret in

L(stosb):
        movq    %rdx, %rcx
        movzbl  %sil, %eax
        movq    %rdi, %rdx
        rep stosb
        movq    %rdx, %rax
        ret

since it can be reached from

L(stosb_more_2x_vec):
        cmpq    $REP_STOSB_THRESHOLD, %rdx
        ja      L(stosb)

[BZ #21081]
* sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
(L(stosb)): Add VZEROUPPER before ret.

(cherry picked from commit 02b78ff749f0c88771713368dbb2a09b1979814f)

X86_64: Don't use PLT nor GOT in static archives [BZ #20750]

There is no need to use PLT nor GOT in static archives to branch to a
function, regardless whether static archives is compiled with PIC or
not. When static archives are used to create dynamic executable,
PLT/GOT may be used. The resulting executable still works correctly.

[BZ #20750]
* sysdeps/x86_64/sysdep.h (JUMPTARGET): Check SHARED instead
of PIC.

(cherry picked from commit c9070e6305c08365c5f8b604bdca39c699d70cfa)

CVE-2015-5180: resolv: Fix crash with internal QTYPE [BZ #18784]

Also rename T_UNSPEC because an upcoming public header file
update will use that name.

(cherry picked from commit fc82b0a2dfe7dbd35671c10510a8da1043d746a5)

Drop GLIBC_TUNABLES in setxid processes

Drop the GLIBC_TUNABLES environment variable from the environment of
setxid processes to avoid passing it on to non-setxid children. This
prevents potentially insecure tunables in the GLIBC_TUNABLES envvar
from crossing over into a child that may use a libc that has tunables
support.

* sysdeps/generic/unsecvars.h: Add GLIBC_TUNABLES.

powerpc: Fix write-after-destroy in lock elision [BZ #20822]

The update of *adapt_count after the release of the lock causes a race
condition when thread A unlocks, thread B continues and destroys the
mutex, and thread A writes to *adapt_count.

(cherry picked from commit e9a96ea1aca4ebaa7c86e8b83b766f118d689d0f)
(with changes from commit eb1321f291515dae75c83a40c39e775fdd38e97a)

localedata: bs_BA: fix yesexpr/noexpr [BZ #20974]

Both regexes end with a "*." which means the previous match can be
omitted, and then the . allows them to match any input at all.

This means tools like coreutils' `rm -i` will always delete things
when prompted because the yesexpr regex matches all inputs (even
the negative ones).

(cherry picked from commit a035eb6928bc63fb798dcc1421529f933122d74f)

Bug 11941: ld.so: Improper assert map->l_init_called in dlclose

There is at least one use case where during exit a library destructor
might call dlclose() on a valid handle and have it fail with an
assertion. We must allow this case, it is a valid handle, and dlclose()
should not fail with an assert. In the future we might be able to return
an error that the dlclose() could not be completed because the opened
library has already been unloaded and destructors have run as part of
exit processing.

For more details see:
https://www.sourceware.org/ml/libc-alpha/2016-12/msg00859.html

(cherry picked from commit 57707b7fcc38855869321f8c7827bfe21d729f37)

alpha: fix trunc for big input values

The alpha specific version of trunc and truncf always add and subtract
0x1.0p23 or 0x1.0p52 even for big values. This causes this kind of
errors in the testsuite:

  Failure: Test: trunc_towardzero (0x1p107)
  Result:
   is:          1.6225927682921334e+32   0x1.fffffffffffffp+106
   should be:   1.6225927682921336e+32   0x1.0000000000000p+107
   difference:  1.8014398509481984e+16   0x1.0000000000000p+54
   ulp       :  0.5000
   max.ulp   :  0.0000

Change this by returning the input value when its absolute value is
greater than 0x1.0p23 or 0x1.0p52. NaN have to go through the add and
subtract operations to get possibly silenced.

Finally remove the code to handle inexact exception, trunc should never
generate such an exception.

Changelog:
* sysdeps/alpha/fpu/s_trunc.c (__trunc): Return the input value
when its absolute value is greater than 0x1.0p52.
[_IEEE_FP_INEXACT] Remove.
* sysdeps/alpha/fpu/s_truncf.c (__truncf): Return the input value
when its absolute value is greater than 0x1.0p23.
[_IEEE_FP_INEXACT] Remove.

(cherry picked from commit b74d259fe793499134eb743222cd8dd7c74a31ce)

alpha: fix rint on sNaN input

The alpha version of rint wrongly return sNaN for sNaN input. Fix that
by checking for NaN and by returning the input value added with itself
in that case.

Changelog:
* sysdeps/alpha/fpu/s_rint.c (__rint): Add argument with itself
when it is a NaN.
* sysdeps/alpha/fpu/s_rintf.c (__rintf): Likewise.

(cherry picked from commit cb7f9d63b921ea1a1cbb4ab377a8484fd5da9a2b)

alpha: fix floor on sNaN input

The alpha version of floor wrongly return sNaN for sNaN input. Fix that
by checking for NaN and by returning the input value added with itself
in that case.

Finally remove the code to handle inexact exception, floor should never
generate such an exception.

Changelog:
* sysdeps/alpha/fpu/s_floor.c (__floor): Add argument with itself
when it is a NaN.
[_IEEE_FP_INEXACT] Remove.
* sysdeps/alpha/fpu/s_floorf.c (__floorf): Likewise.

(cherry picked from commit 65cc568cf57156e5230db9a061645e54ff028a41)

alpha: fix ceil on sNaN input

The alpha version of ceil wrongly return sNaN for sNaN input. Fix that
by checking for NaN and by returning the input value added with itself
in that case.

Finally remove the code to handle inexact exception, ceil should never
generate such an exception.

Changelog:
* sysdeps/alpha/fpu/s_ceil.c (__ceil): Add argument with itself
when it is a NaN.
[_IEEE_FP_INEXACT] Remove.
* sysdeps/alpha/fpu/s_ceilf.c (__ceilf): Likewise.

(cherry picked from commit 062e53c195b4a87754632c7d51254867247698b4)

X86-64: Add _dl_runtime_resolve_avx[512]_{opt|slow} [BZ #20508]

There is transition penalty when SSE instructions are mixed with 256-bit
AVX or 512-bit AVX512 load instructions.  Since _dl_runtime_resolve_avx
and _dl_runtime_profile_avx512 save/restore 256-bit YMM/512-bit ZMM
registers, there is transition penalty when SSE instructions are used
with lazy binding on AVX and AVX512 processors.

To avoid SSE transition penalty, if only the lower 128 bits of the first
8 vector registers are non-zero, we can preserve %xmm0 - %xmm7 registers
with the zero upper bits.

For AVX and AVX512 processors which support XGETBV with ECX == 1, we can
use XGETBV with ECX == 1 to check if the upper 128 bits of YMM registers
or the upper 256 bits of ZMM registers are zero.  We can restore only the
non-zero portion of vector registers with AVX/AVX512 load instructions
which will zero-extend upper bits of vector registers.

This patch adds _dl_runtime_resolve_sse_vex which saves and restores
XMM registers with 128-bit AVX store/load instructions.  It is used to
preserve YMM/ZMM registers when only the lower 128 bits are non-zero.
_dl_runtime_resolve_avx_opt and _dl_runtime_resolve_avx512_opt are added
and used on AVX/AVX512 processors supporting XGETBV with ECX == 1 so
that we store and load only the non-zero portion of vector registers.
This avoids SSE transition penalty caused by _dl_runtime_resolve_avx and
_dl_runtime_profile_avx512 when only the lower 128 bits of vector
registers are used.

_dl_runtime_resolve_avx_slow is added and used for AVX processors which
don't support XGETBV with ECX == 1.  Since there is no SSE transition
penalty on AVX512 processors which don't support XGETBV with ECX == 1,
_dl_runtime_resolve_avx512_slow isn't provided.

[BZ #20495]
[BZ #20508]
* sysdeps/x86/cpu-features.c (init_cpu_features): For Intel
processors, set Use_dl_runtime_resolve_slow and set
Use_dl_runtime_resolve_opt if XGETBV suports ECX == 1.
* sysdeps/x86/cpu-features.h (bit_arch_Use_dl_runtime_resolve_opt):
New.
(bit_arch_Use_dl_runtime_resolve_slow): Likewise.
(index_arch_Use_dl_runtime_resolve_opt): Likewise.
(index_arch_Use_dl_runtime_resolve_slow): Likewise.
* sysdeps/x86_64/dl-machine.h (elf_machine_runtime_setup): Use
_dl_runtime_resolve_avx512_opt and _dl_runtime_resolve_avx_opt
if Use_dl_runtime_resolve_opt is set.  Use
_dl_runtime_resolve_slow if Use_dl_runtime_resolve_slow is set.
* sysdeps/x86_64/dl-trampoline.S: Include <cpu-features.h>.
(_dl_runtime_resolve_opt): New.  Defined for AVX and AVX512.
(_dl_runtime_resolve): Add one for _dl_runtime_resolve_sse_vex.
* sysdeps/x86_64/dl-trampoline.h (_dl_runtime_resolve_avx_slow):
New.
(_dl_runtime_resolve_opt): Likewise.
(_dl_runtime_profile): Define only if _dl_runtime_profile is
defined.

(cherry picked from commit fb0f7a6755c1bfaec38f490fbfcaa39a66ee3604)

x86_64: fix static build of __memcpy_chk for compilers defaulting to PIC/PIE

When glibc is compiled with gcc 6.2 that has been configured with
to default to PIC/PIE, the static version of __memcpy_chk is not built,
as the test is done on PIC instead of SHARED. Fix the test to check for
SHARED, like it is done for similar functions like memmove_chk.

Changelog:
* sysdeps/x86_64/memcpy_chk.S (__memcpy_chk): Check for SHARED
instead of PIC.

(cherry picked from commit 380ec16d62f459d5a28cfc25b7b20990c45e1cc9)

MIPS: Add `.insn' to ensure a text label is defined as code not data

Avoid a build error with microMIPS compilation and recent versions of
GAS which complain if a branch targets a label which is marked as data
rather than microMIPS code:

../sysdeps/mips/mips32/crti.S: Assembler messages:
../sysdeps/mips/mips32/crti.S:72: Error: branch to a symbol in another ISA mode
make[2]: *** [.../csu/crti.o] Error 1

as commit 9d862524f6ae ("MIPS: Verify the ISA mode and alignment of
branch and jump targets") closed a hole in branch processing, making
relocation calculation respect the ISA mode of the symbol referred.
This allowed diagnosing the situation where an attempt is made to pass
control from code assembled for one ISA mode to code assembled for a
different ISA mode and either relaxing the branch to a cross-mode jump
or if that is not possible, then reporting this as an error rather than
letting such code build and then fail unpredictably at the run time.

This however requires the correct annotation of branch targets as code,
because the ISA mode is not relevant for data symbols and is therefore
not recorded for them.  The `.insn' pseudo-op is used for this purpose
and has been supported by GAS since:

Wed Feb 12 14:36:29 1997  Ian Lance Taylor  <ian@cygnus.com>

* config/tc-mips.c (mips_pseudo_table): Add "insn".
(s_insn): New static function.
* doc/c-mips.texi: Document .insn.

so there has been no reason to avoid it where required.  More recently
this pseudo-op has been documented, by the microMIPS architecture
specification[1][2], as required for the correct interpretation of any
code label which is not followed by an actual instruction in an assembly
source.

Use it in our crti.S files then, to mark that the trailing label there
with no instructions following is indeed not a code bug and the branch
is legitimate.

References:

[1] "MIPS Architecture for Programmers, Volume II-B: The microMIPS32
    Instruction Set", MIPS Technologies, Inc., Document Number: MD00582,
    Revision 5.04, January 15, 2014, Section 7.1 "Assembly-Level
    Compatibility", p. 533

[2] "MIPS Architecture for Programmers, Volume II-B: The microMIPS64
    Instruction Set", MIPS Technologies, Inc., Document Number: MD00594,
    Revision 5.04, January 15, 2014, Section 8.1 "Assembly-Level
    Compatibility", p. 623

2016-11-23  Matthew Fortune  <Matthew.Fortune@imgtec.com>
            Maciej W. Rozycki  <macro@imgtec.com>

* sysdeps/mips/mips32/crti.S (_init): Add `.insn' pseudo-op at
`.Lno_weak_fn' label.
* sysdeps/mips/mips64/n32/crti.S (_init): Likewise.
* sysdeps/mips/mips64/n64/crti.S (_init): Likewise.

(cherry picked from commit cfaf1949ff1f8336b54c43796d0e2531bc8a40a2)

Fix writes past the allocated array bounds in execvpe (BZ#20847)

This patch fixes an invalid write out or stack allocated buffer in
2 places at execvpe implementation:

  1. On 'maybe_script_execute' function where it allocates the new
     argument list and it does not account that a minimum of argc
     plus 3 elements (default shell path, script name, arguments,
     and ending null pointer) should be considered.  The straightforward
     fix is just to take account of the correct list size on argument
     copy.

  2. On '__execvpe' where the executable file name lenght may not
     account for ending '\0' and thus subsequent path creation may
     write past array bounds because it requires to add the terminating
     null.  The fix is to change how to calculate the executable name
     size to add the final '\0' and adjust the rest of the code
     accordingly.

As described in GCC bug report 78433 [1], these issues were masked off by
GCC because it allocated several bytes more than necessary so that many
off-by-one bugs went unnoticed.

Checked on x86_64 with a latest GCC (7.0.0 20161121) with -O3 on CFLAGS.

[BZ #20847]
* posix/execvpe.c (maybe_script_execute): Remove write past allocated
array bounds.
(__execvpe): Likewise.

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78433

configure: accept __stack_chk_fail_local for ssp support too [BZ #20662]

When glibc is compiled with gcc 6.2 that has been configured with
--enable-default-pie and --enable-default-ssp, the configure script
fails to detect that the compiler has ssp turned on by default when
being built for i686-linux-gnu.

This is because gcc is emitting __stack_chk_fail_local but the
script is only looking for __stack_chk_fail. Support both.

Example output:
checking whether x86_64-pc-linux-gnu-gcc -m32 -Wl,-O1 -Wl,--as-needed
implicitly enables -fstack-protector... no

(cherry picked from commit c7409aded44634411a19b0b7178b7faa237835e6)

Fix linknamespace parallel test failures.

Having found that with my script to build many glibc variants I could
reproduce the linknamespace test failures in parallel builds (that
various people had previously reported but I hadn't seen myself), I
investigated those failures further. This patch adds a missing
dependency to those tests.

Tested for x86_64, including the configuration where I saw those
failures and where I don't see them with this patch.

* conform/Makefile ($(linknamespace-header-tests)): Also depend on
$(linknamespace-symlists-tests).

(cherry picked from commit 6c50bb532bb1f47084762e16fbf52af9b5a752d8)

gconv.h: fix build with GCC 7

gconv.h is using a flex array to define the __gconv_info member in an
invalid way, causing GCC 7 to issue an error:

| In file included from ../include/gconv.h:1:0,
|                  from ../sysdeps/unix/sysv/linux/_G_config.h:32,
|                  from ../libio/libio.h:31,
|                  from ../include/libio.h:4,
|                  from ../libio/stdio.h:74,
|                  from ../include/stdio.h:5,
|                  from test-math-isinff.cc:22:
| ../iconv/gconv.h:142:50: error: flexible array member '__gconv_info::__data' not at end of 'struct _IO_codecvt'
| In file included from ../include/libio.h:4:0,
|                  from ../libio/stdio.h:74,
|                  from ../include/stdio.h:5,
|                  from test-math-isinff.cc:22:
| ../libio/libio.h:211:14: note: next member '_G_iconv_t _IO_codecvt::__cd_out' declared here
| ../libio/libio.h:187:8: note: in the definition of 'struct _IO_codecvt'
| In file included from ../include/gconv.h:1:0,
|                  from ../sysdeps/unix/sysv/linux/_G_config.h:32,
|                  from ../libio/libio.h:31,
|                  from ../include/libio.h:4,
|                  from ../libio/stdio.h:74,
|                  from ../include/stdio.h:5,
|                  from test-math-isinff.cc:22:
| ../iconv/gconv.h:142:50: error: flexible array member '__gconv_info::__data' not at end of 'struct _IO_wide_data'
| In file included from ../include/libio.h:4:0,
|                  from ../libio/stdio.h:74,
|                  from ../include/stdio.h:5,
|                  from test-math-isinff.cc:22:
| ../libio/libio.h:211:14: note: next member '_G_iconv_t _IO_codecvt::__cd_out' declared here
| ../libio/libio.h:215:8: note: in the definition of 'struct _IO_wide_data'

This is basically a revert to the code from 15 years ago. More details
are available in the GCC bug:
  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78039

Changelog:
* iconv/gconv.h (__gconv_info): Define __data element using a
zero-length array.

(cherry picked from commit 0623b9e6a9c3441427cf8c421bcc81d09e48fdc1)

Fix cmpli usage in power6 memset.

Building glibc for powerpc64 with recent (2.27.51.20161012) binutils,
with multi-arch enabled, I get the error:

../sysdeps/powerpc/powerpc64/power6/memset.S: Assembler messages:
../sysdeps/powerpc/powerpc64/power6/memset.S:254: Error: operand out of range (5 is not between 0 and 1)
../sysdeps/powerpc/powerpc64/power6/memset.S:254: Error: operand out of range (128 is not between 0 and 31)
../sysdeps/powerpc/powerpc64/power6/memset.S:254: Error: missing operand

Indeed, cmpli is documented as a four-operand instruction, and looking
at nearby code it seems likely cmpldi was intended. This patch fixes
this powerpc64 code accordingly, and makes a corresponding change to
the powerpc32 code.

Tested for powerpc, powerpc64 and powerpc64le by Tulio Magno Quites
Machado Filho

* sysdeps/powerpc/powerpc32/power6/memset.S (memset): Use cmplwi
instead of cmpli.
* sysdeps/powerpc/powerpc64/power6/memset.S (memset): Use cmpldi
instead of cmpli.

(cherry picked from commit 78b7adbaea643f2f213bb113e3ec933416a769a8)

Fix Linux sh4 pread/pwrite argument passing

Although conceptually correct for p{read,write}{64} offset argument passing,
sh4 implementation does not generate the correct expected code.  The
__ALIGNMENT_ARG redefinition is incorrect for two reasons: 1. the
kernel-features.h header is included multiple times (since it contains no
guards) and 2. the value it redefines is also incorrect (should be '0, '
instead of empty definition).

This patch fixes it by adding another macro, SYSCALL_LL_PRW{64}, meant to be
used to pass the offset argument on p{read,write}64.  It is basically the
already define SYSCALL_LL{64} plus __ALIGNMENT_ARG unless __ASSUME_PRW_DUMMY_ARG
is define.  In this case an empty dummy argument is used regardless how
__ALIGNMENT_ARG is defined (sh4 case).

Checked on x86_64, i686, aarch64, armhf, and powerpc64le (basically a sanity
check).  Also, John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> and
James Clarke <jrtc27@jrtc27.com> help me check on a debian sh4 bootstrap using
2.24 plus this patch to verify it also corrected fixed the regression issue.

I also verified the generated object for a 2.24 build and master with this
patch for sh4 and both look identical.

* sysdeps/unix/sysv/linux/pread.c (__libc_pread): Use SYSCALL_LL_PRW.
* sysdeps/unix/sysv/linux/pwrite.c (__libc_pwrite): Likewise.
* sysdeps/unix/sysv/linux/pread64.c (__libc_pread64): Use
SYSCALL_LL64_PRW.
* sysdeps/unix/sysv/linux/pwrite64.c (__libc_pwrite64): Likewise.
* sysdeps/unix/sysv/linux/sh/kernel-features.h: Define
__ASSUME_PRW_DUMMY_ARG.
* sysdeps/unix/sysv/linux/sh/pread.c: Remove file.
* sysdeps/unix/sysv/linux/sh/pread64.c: Likewise.
* sysdeps/unix/sysv/linux/sh/pwrite.c: Likewise.
* sysdeps/unix/sysv/linux/sh/pwrite64.c: Likewise.
* sysdeps/unix/sysv/linux/sysdep.h: Define SYSCALL_LL_PRW and
SYSCALL_LL_PRW64 based on __ASSUME_PRW_DUMMY_ARG.

powerpc: Regenerate ULPs

* sysdeps/powerpc/fpu/libm-test-ulps: Regenerated.

posix: Correctly block/unblock all signals on Linux posix_spawn

This patch correctly block and unblocks all signals when executing
Linux posix_spawn by using the __libc_signal_{un}block_all functions
instead of default sigprocmask. The latter might remove both
SIGCANCEL and SIGSETXID from the blocked signal list.

Checked on x86_64, i686, powerpc64le, and aarch64.

* sysdeps/unix/sysv/linux/spawni.c (__spawnix): Correctly block and unblock
all signals when executing the clone vfork child.
(SIGALL_SET): Remove macro.

posix: Correctly enable/disable cancellation on Linux posix_spawn

This patch correctly enable and disable asynchronous cancellation on
Linux posix_spawn. Current code invert the logic by enabling and
disabling instead. It also adds a new test to check if posix_spawn
is not a cancellation entrypoint.

Checked on x86_64, i686, powerpc64le, and aarch64.

* nptl/Makefile (tests): Add tst-exec5.
* nptl/tst-exec5.c: New file.
* sysdeps/unix/sysv/linux/spawni.c (__spawni): Correctly enable and disable
asynchronous cancellation.

powerpc: Fix POWER9 implies

Fix multiarch build for POWER9 by correcting the order of the
directories listed at sysnames configure variable.

(cherry picked from commit 1850ce5a2ea3b908b26165e7e951cd4334129f07)

NaCl: Fix libc.abilist missing GLIBC_2.24 A.

* sysdeps/arm/nacl/libc.abilist: Add GLIBC_2.24 A.

NaCl: Fix compile error for __dup after libc_hidden_proto addition.

* sysdeps/nacl/dup.c: Add libc_hidden_def.

(cherry picked from commit 6b75ba1388bff6a81bad410d7318d385a043b3cb)

Fix generic wait3 after union wait_status removal.

* sysdeps/posix/wait3.c: Don't treat STAT_LOC as a union, since it's
not any more.

(cherry picked from commit 9a3d16ac152447567bfc822497c564a0630c79fe)

NaCl: Fix compile error in clock function.

* sysdeps/nacl/clock.c (clock): nacl_abi_clock_t -> nacl_irt_clock_t

(cherry picked from commit 307c2c2dfff76330a29a3ab69a0177b118142145)

nptl/tst-once5: Reduce time to expected failure

(cherry picked from commit 1f645571d2db9008b3cd3d5acb9ff93357864283)

argp: Do not override GCC keywords with macros [BZ #16907]

glibc provides fallback definitions already. It is not necessary to
suppress warnings for unknown attributes because GCC does this
automatically for system headers.

This commit does not sync with gnulib because gnulib has started to use
_GL_* macros in the header file, which are arguably in the gnulib
implementation space and not suitable for an installed glibc header
file.

(cherry picked from commit 2c820533c61fed175390bc6058afbbe42d2edc37)

arm: mark __startcontext as .cantunwind (bug 20435)

__startcontext marks the bottom of the call stack of the contexts created
by makecontext.

(cherry picked from commit 9e2ff6c9cc54c0b4402b8d49e4abe7000fde7617)

Also includes the NEWS update, cherry-picked from commits
056dd72af83f5459ce6d545a49dea6dba7d635dc and
4d047efdbc55b0d68947cde682e5363d16a66294.

Do not override objects in libc.a in other static libraries [BZ #20452]

With this change, we no longer add sysdep.o and similar objects which
are present in libc.a to other static libraries.

(cherry picked from commit d9067fca40b8aac156d73cfa44d6875813555a6c)

sparc: remove fdim sparc specific implementations

The fdim and fdimf functions on sparc do not fully follow the standard
and do not set errno to ERANGE when the result overflows. Since glibc
2.24 this causes the two following tests to fail:

Failure: fdim (max_value, -max_value): errno set to 0, expected 34 (ERANGE)
Failure: fdim_upward (max_value, -max_value): errno set to 0, expected 34 (ERANGE)

It happens that using GCC with the generic C code generates very similar
code to the sparc specific implementations. Therefore this patches
remove them. Note it might still worth adding a vis3 specific version of
fdim on sparc32/sparcv9, this is done in a following patch to ease
backporting.

Changelog:
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/Makefile
[$(subdir) = math && $(have-as-vis3) = yes] (libm-sysdep_routines):
Remove s_fdimf-vis3, s_fdim-vis3.
* sysdeps/sparc/sparc32/fpu/s_fdim.S: Delete file.
* sysdeps/sparc/sparc32/fpu/s_fdimf.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fdim-vis3.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fdim.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fdimf-vis3.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fdimf.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/s_fdim.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/s_fdimf.S: Likewise.
* sysdeps/sparc/sparc64/fpu/s_fdim.S: Likewise.
* sysdeps/sparc/sparc64/fpu/s_fdimf.S: Likewise.

(cherry picked from commit 8a9f4eb95894eae7e725e79721ba26fbc5b4ed06)

Fix sNaN handling in nearbyint on 32-bit sparc.

* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_nearbyint-vis3.S
(__nearbyint_vis3): Don't check for sNaN before float register is
loaded with the incoming argument.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_nearbyintf-vis3.S
(__nearbyintf_vis3): Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/s_nearbyint.S (__nearbyint):
Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/s_nearbyintf.S (__nearbyintf):
Likewise.

(cherry picked from commit 3ef3f1b93fdf20143865cc16dd75f39a7f0fac2f)

powerpc: fix ifunc-sel.h fix asm constraints and clobber list

As pointer out on the mailing list, the inline assembly code in
sysdeps/powerpc/ifunc-sel.h doesn't have a list of clobbered registers
and used wrong constraints.

This patch fixes that. I verified it doesn't introduce any change in the
generated code.

Changelog:
* sysdeps/powerpc/ifunc-sel.h (ifunc_sel): Add "11", "12", "cr0" to the
clobber list. Use "i" constraint instead of "X".
(ifunc_one): Add "12" to the clobber list. Use "i" constraint instead
of "X".

(cherry picked from commit 30f926d3b3dcb74c038155715ed341d5c4b334eb)

powerpc: fix ifunc-sel.h with GCC 6

On 32-bit PowerPC GCC 6 always saves the PIC register on the stack in
the prologue and adjust the stack in the epilogue. It is therefore not
possible anymore to just exit the function in the inline asm code,
otherwise it corrupts the stack pointer. This causes the following tests
to fail when using GCC 6:

FAIL: elf/ifuncmain1
FAIL: elf/ifuncmain1pic
FAIL: elf/ifuncmain1picstatic
FAIL: elf/ifuncmain1pie
FAIL: elf/ifuncmain1staticpic
FAIL: elf/ifuncmain1staticpie
FAIL: elf/ifuncmain1vis
FAIL: elf/ifuncmain1vispic
FAIL: elf/ifuncmain1vispie
FAIL: elf/ifuncmain2pic
FAIL: elf/ifuncmain2picstatic
FAIL: elf/ifuncmain3
FAIL: elf/ifuncmain4picstatic
FAIL: elf/ifuncmain5
FAIL: elf/ifuncmain5picstatic
FAIL: elf/ifuncmain5staticpic

The solution is to replace the beqlr instructions by a beq to the end
of the inline asm code. This fixes all the above failures.

ChangeLog:
* sysdeps/powerpc/ifunc-sel.h (ifunc_sel): Replace beqlr instructions
by beq instructions jumping to the end of the function.

(cherry picked from commit ee71e5b6dd6a21e981ad0fa74359e066f5a8b359)

Update from Translation Project.

malloc: Preserve arena free list/thread count invariant [BZ #20370]

It is necessary to preserve the invariant that if an arena is
on the free list, it has thread attach count zero.  Otherwise,
when arena_thread_freeres sees the zero attach count, it will
add it, and without the invariant, an arena could get pushed
to the list twice, resulting in a cycle.

One possible execution trace looks like this:

Thread 1 examines free list and observes it as empty.
Thread 2 exits and adds its arena to the free list,
  with attached_threads == 0).
Thread 1 selects this arena in reused_arena (not from the free list).
Thread 1 increments attached_threads and attaches itself.
  (The arena remains on the free list.)
Thread 1 exits, decrements attached_threads,
  and adds the arena to the free list.

The final step creates a cycle in the usual way (by overwriting the
next_free member with the former list head, while there is another
list item pointing to the arena structure).

tst-malloc-thread-exit exhibits this issue, but it was only visible
with a debugger because the incorrect fix in bug 19243 removed
the assert from get_free_list.

(cherry picked from commit f88aab5d508c13ae4a88124e65773d7d827cd47b)

x86: Use sysdep.o from libc.a in static libraries

Static libraries can use the sysdep.o copy in libc.a without
a performance penalty. This results in a visible difference
if libpthread.a is relinked into a single object file (which
is needed to support libraries which check for the presence
of certain symbols to enable threading support, which generally
fails with static linking unless libpthread.a is relinked).

(cherry picked from commit e67330ab57bfd0f964539576ae7dcc658c456724)

Update for glibc 2.24 release.

Update libc.pot and NEWS.

sparc: remove ceil, floor, trunc sparc specific implementations

The ceil, floor and trunc functions on sparc do not fully follow the
standard and trigger an inexact exception when presented a value which
is not an integer. Since glibc 2.24 this causes a few tests to fail,
for instance:

  testing double (without inline functions)
  Failure: ceil (lit_pi): Exception "Inexact" set
  Failure: ceil (-lit_pi): Exception "Inexact" set
  Failure: ceil (min_subnorm_value): Exception "Inexact" set
  Failure: ceil (min_value): Exception "Inexact" set
  Failure: ceil (0.1): Exception "Inexact" set
  Failure: ceil (0.25): Exception "Inexact" set
  Failure: ceil (0.625): Exception "Inexact" set
  Failure: ceil (-min_subnorm_value): Exception "Inexact" set
  Failure: ceil (-min_value): Exception "Inexact" set
  Failure: ceil (-0.1): Exception "Inexact" set
  Failure: ceil (-0.25): Exception "Inexact" set
  Failure: ceil (-0.625): Exception "Inexact" set

I tried to fix that by using the same strategy than used on other
architectures, that is by saving the FSR register at the beginning
and restoring it at the end of the function. When doing so I noticed
a comment that this operation might be very costly, so I decided to
do some benchmarks.

The benchmarks below represent the time required to run each of the
function 60 millions of times with different input value. I have done
that in the basic V9 code, the VIS2 code, and using the default C
implementation of the libc, for both sparc32 and sparc64, on a Niagara
T1 based machine and an UltraSparc IIIi. Given I don't have access to a
more recent machine), I haven't been able to test the VIS3 version. Also
it should be noted that it doesn't make sense to do this benchmark for
V8 or earlier as in that case we use the default C implementation. The
results are available in the table below, the "+ fix" version correspond
to the one saving and restoring the FSR.

  Niagara T1 / sparc32
  --------------------
              ceilf    ceil     floorf   floor    truncf   trunc
  V9          19.10    22.48    19.10    22.48    16.59    19.27
  V9 + fix    19.77    23.34    19.77    23.33    17.27    20.12
  VIS2        16.87    19.62    16.87    19.62
  VIS2 + fix  17.55    20.47    17.55    20.47
  C impl      11.39    13.80    11.40    13.80    10.88    10.84

  Niagara T1 / sparc64
  --------------------
              ceilf    ceil     floorf   floor    truncf   trunc
  V9          18.14    22.23    18.14    22.23    15.64    19.02
  V9 + fix    18.82    23.08    18.82    23.08    16.32    19.87
  VIS2        15.92    19.37    15.92    19.37
  VIS2 + fix  16.59    20.22    16.59    20.22
  C impl      11.39    13.60    11.39    15.36    10.88    12.65

  UltraSparc IIIi / sparc32
  -------------------------
              ceilf    ceil     floorf   floor    truncf   trunc
  V9           4.81     7.09     6.61    11.64     4.91     7.05
  V9 + fix     7.20    10.42     7.14    10.54     6.76     9.47
  VIS2         4.81     7.03     4.76     7.13
  VIS2 + fix   6.76     9.51     6.71     9.63
  C impl       3.88     8.62     3.90     9.45     3.57     6.62

  UltraSparc IIIi / sparc64
  -------------------------
              ceilf    ceil     floorf   floor    truncf   trunc
  V9           3.48     4.39     3.48     4.41     3.01     3.85
  V9 + fix     4.76     5.90     4.76     5.90     4.86     6.26
  VIS2         2.95     3.61     2.95     3.61
  VIS2 + fix   4.24     5.37     4.30     7.97
  C impl       3.63     4.89     3.62     6.38     3.33     4.03

The first thing that should be noted is that the C implementation is
always faster on the Niagara T1 based machine. On the UltraSparc IIIi
the float version on sparc32 is also faster.

Coming back about the fix saving and restoring the FSR, it appears
it has a big impact as expected. In that case the C implementation is
always faster than the fixed implementations.

This patch therefore removes the sparc specific implementations in
favor of the generic ones.

Changelog:
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/Makefile
[$(subdir) = math] (libm-sysdep_routines): Remove.
[$(subdir) = math && $(have-as-vis3) = yes] (libm-sysdep_routines):
Remove s_ceilf-vis3, s_ceil-vis3, s_floorf-vis3, s_floor-vis3,
s_truncf-vis3, s_trunc-vis3.
* sysdeps/sparc/sparc64/fpu/multiarch/Makefile: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_ceil-vis2.S: Delete
file.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_ceil-vis3.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_ceil.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_ceilf-vis2.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_ceilf-vis3.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_ceilf.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_floor-vis2.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_floor-vis3.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_floor.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_floorf-vis2.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_floorf-vis3.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_floorf.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_trunc-vis3.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_trunc.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_truncf-vis3.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_truncf.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/s_ceil.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/s_ceilf.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/s_floor.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/s_floorf.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/s_trunc.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/s_truncf.S: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_ceil-vis2.S: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_ceil-vis3.S: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_ceil.S: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_ceilf-vis2.S: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_ceilf-vis3.S: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_ceilf.S: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_floor-vis2.S: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_floor-vis3.S: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_floor.S: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_floorf-vis2.S: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_floorf-vis3.S: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_floorf.S: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_trunc-vis3.S: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_trunc.S: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_truncf-vis3.S: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_truncf.S: Likewise.
* sysdeps/sparc/sparc64/fpu/s_ceil.S: Likewise.
* sysdeps/sparc/sparc64/fpu/s_ceilf.S: Likewise.
* sysdeps/sparc/sparc64/fpu/s_floor.S: Likewise.
* sysdeps/sparc/sparc64/fpu/s_floorf.S: Likewise.
* sysdeps/sparc/sparc64/fpu/s_trunc.S: Likewise.
* sysdeps/sparc/sparc64/fpu/s_truncf.S: Likewise.

CVE-2016-5417 was assigned to bug 19257

Don't compile do_test with -mavx/-mavx/-mavx512

Don't compile do_test with -mavx, -mavx nor -mavx512 since they won't run
on non-AVX machines.

[BZ #20384]
* sysdeps/x86_64/fpu/Makefile (extra-test-objs): Add
test-double-libmvec-sincos-avx-main.o,
test-double-libmvec-sincos-avx2-main.o,
test-double-libmvec-sincos-main.o,
test-float-libmvec-sincosf-avx-main.o,
test-float-libmvec-sincosf-avx2-main.o and
test-float-libmvec-sincosf-main.o.
test-float-libmvec-sincosf-avx512-main.o.
($(objpfx)test-double-libmvec-sincos): Also link with
$(objpfx)test-double-libmvec-sincos-main.o.
($(objpfx)test-double-libmvec-sincos-avx): Also link with
$(objpfx)test-double-libmvec-sincos-avx-main.o.
($(objpfx)test-double-libmvec-sincos-avx2): Also link with
$(objpfx)test-double-libmvec-sincos-avx2-main.o.
($(objpfx)test-float-libmvec-sincosf): Also link with
$(objpfx)test-float-libmvec-sincosf-main.o.
($(objpfx)test-float-libmvec-sincosf-avx): Also link with
$(objpfx)test-float-libmvec-sincosf-avx2-main.o.
[$(config-cflags-avx512) == yes] (extra-test-objs): Add
test-double-libmvec-sincos-avx512-main.o and
($(objpfx)test-double-libmvec-sincos-avx512): Also link with
$(objpfx)test-double-libmvec-sincos-avx512-main.o.
($(objpfx)test-float-libmvec-sincosf-avx512): Also link with
$(objpfx)test-float-libmvec-sincosf-avx512-main.o.
(CFLAGS-test-double-libmvec-sincos.c): Removed.
(CFLAGS-test-float-libmvec-sincosf.c): Likewise.
(CFLAGS-test-double-libmvec-sincos-main.c): New.
(CFLAGS-test-double-libmvec-sincos-avx-main.c): Likewise.
(CFLAGS-test-double-libmvec-sincos-avx2-main.c): Likewise.
(CFLAGS-test-float-libmvec-sincosf-main.c): Likewise.
(CFLAGS-test-float-libmvec-sincosf-avx-main.c): Likewise.
(CFLAGS-test-float-libmvec-sincosf-avx2-main.c): Likewise.
(CFLAGS-test-float-libmvec-sincosf-avx512-main.c): Likewise.
(CFLAGS-test-double-libmvec-sincos-avx.c): Set to -DREQUIRE_AVX.
(CFLAGS-test-float-libmvec-sincosf-avx.c ): Likewise.
(CFLAGS-test-double-libmvec-sincos-avx2.c): Set to
-DREQUIRE_AVX2.
(CFLAGS-test-float-libmvec-sincosf-avx2.c ): Likewise.
(CFLAGS-test-double-libmvec-sincos-avx512.c): Set to
-DREQUIRE_AVX512F.
(CFLAGS-test-float-libmvec-sincosf-avx512.c): Likewise.
* sysdeps/x86_64/fpu/test-double-libmvec-sincos.c: Rewritten.
* sysdeps/x86_64/fpu/test-float-libmvec-sincosf.c: Likewise.
* sysdeps/x86_64/fpu/test-double-libmvec-sincos-avx-main.c: New
file.
* sysdeps/x86_64/fpu/test-double-libmvec-sincos-avx2-main.c:
Likewise.
* sysdeps/x86_64/fpu/test-double-libmvec-sincos-avx512-main.c:
Likewise.
* sysdeps/x86_64/fpu/test-double-libmvec-sincos-main.c:
Likewise.
* sysdeps/x86_64/fpu/test-float-libmvec-sincosf-avx-main.c:
Likewise.
* sysdeps/x86_64/fpu/test-float-libmvec-sincosf-avx2-main.c:
Likewise.
* sysdeps/x86_64/fpu/test-float-libmvec-sincosf-avx512-main.c:
Likewise.
* sysdeps/x86_64/fpu/test-float-libmvec-sincosf-main.c:
Likewise.

Nios II localplt.data update: remove __eqsf2

powerpc: Fix missing verb and typo in comment about AT_HWCAP entry

Fix missing verb and typo in comment about AT_HWCAP entry, in the context
of mcontext_t struct definition for PPC64 Linux kernels.

[AArch64] Update libm-test-ulps

This partly reverts commit f8238ae3c7701dbd9c04028861916de64e578114
that regenerated the ulps, to make the max ulps good for gcc-5,
gcc-6 and gcc-trunk as well.

* sysdeps/aarch64/libm-test-ulps: Updated.

S390: Do not clobber r13 with memcpy on 31bit with copies >1MB.

If the default memcpy variant is called with a length of >1MB on 31bit,
r13 is clobbered as the algorithm is switching to mvcle. The mvcle code
returns without restoring r13. All other cases are restoring r13.

If memcpy is called from outside libc the ifunc resolver will only select
this variant if running on machines older than z10.
Otherwise or if memcpy is called from inside libc, this default memcpy
variant is called.
The testcase timezone/tst-tzset is triggering this issue in some combinations
of gcc versions and optimization levels.

This bug was introduced in commit 04bb21ac93e90d7696bcaf8febe2b2dd2d83585a
and thus is a regression compared to former glibc 2.23 release.

This patch removes the usage of r13 at all. Thus it is not saved and restored.
The base address for execute-instruction is now stored in r5 which is obtained
after r5 is not needed anymore as 256byte block counter.

ChangeLog:

* sysdeps/s390/s390-32/memcpy.S (memcpy): Eliminate the usage
of r13 as it is not restored in mvcle case.

microblaze: fix variable name collision with syscall macros

If a function passes in a variable named "ret", the code will miscompile
when it declares a local ret variable. In some cases, it's even a build
failure like so:
../sysdeps/unix/sysv/linux/spawni.c: In function '__spawni_child':
../sysdeps/unix/sysv/linux/spawni.c:289:5: error: address of register variable 'ret' requested
while (write_not_cancel (p, &ret, sizeof ret) < 0)

elf/elf.h: Add missing Meta relocations

2016-07-19 Will Newton <will.newton@gmail.com>

* elf/elf.h (R_METAG_REL8, R_METAG_REL16, R_METAG_TLS_GD
R_METAG_TLS_LDM, R_METAG_TLS_LDO_HI16, R_METAG_TLS_LDO_LO16,
R_METAG_TLS_LDO, R_METAG_TLS_IE, R_METAG_TLS_IENONPIC,
R_METAG_TLS_IENONPIC_HI16, R_METAG_TLS_IENONPIC_LO16,
R_METAG_TLS_LE, R_METAG_TLS_LE_HI16, R_METAG_TLS_LE_LO16): New.

i386: Compile rtld-*.os with -mno-sse -mno-mmx -mfpmath=387

Compile i386 rtld-*.os with -mno-sse -mno-mmx -mfpmath=387 so that no
code in ld.so uses mm/xmm/ymm/zmm registers on i386 since the first 3
mm/xmm/ymm/zmm registers are used to pass vector parameters which must
be preserved.

* sysdeps/i386/Makefile (rtld-CFLAGS): New.
[subdir == elf] (CFLAGS-.os): Replace -mno-sse -mno-mmx
-mfpmath=387 with $(rtld-CFLAGS).
[subdir != elf] (CFLAGS-.os): Compile rtld-*.os with
$(rtld-CFLAGS).

elf: Define missing Meta architecture specific relocations

Fix cos computation for multiple precision fallback (bz #20357)

During the sincos consolidation I made two mistakes, one was a logical
error due to which cos(0x1.8475e5afd4481p+0) returned
sin(0x1.8475e5afd4481p+0) instead.

The second issue was an error in negating inputs for the correct
quadrants for sine. I could not find a suitable test case for this
despite running a program to search for such an input for a couple of
hours.

Following patch fixes both issues. Tested on x86_64. Thanks to Matt
Clay for identifying the issue.

[BZ #20357]
* sysdeps/ieee754/dbl-64/s_sin.c (sloww): Fix up condition
to call __mpsin/__mpcos and to negate values.
* math/auto-libm-test-in: Add test.
* math/auto-libm-test-out: Regenerate.

Don't install the internal header grp-merge.h

grp-merge.h was introduced in Stephen Gallagher's patch adding the
"group merging" feature to NSS.  It declares two functions, __copy_grp
and __merge_grp, both of which are tagged 'internal_function', which
means that nobody can even compile the contents of the header without
access to libc-symbols.h, which is not installed.  (Also, these
functions are GLIBC_PRIVATE exports from libc.so.)  Hence I believe
grp-merge.h should not be installed either.

This really needs to be in 2.24, so that no released version of the
library installs this header.

I hope that what I did to the ChangeLog diff will allow it to be
applied without hassle.

* grp/Makefile: Don't install the internal header grp-merge.h.

[AArch64] Regenerate libm-test-ulps

* sysdeps/aarch64/libm-test-ulps: Regenerated.

Fix TABDLY value

* bits/termios.h (TABDLY): Change macro to include TAB3 bit too.

Refactor Linux raise implementation (BZ#15368)

This patch changes both the nptl and libc Linux raise implementation
to avoid the issues described in BZ#15368. The strategy used is
summarized in bug report first comment:

1. Block all signals (including internal NPTL ones);
2. Get pid and tid directly from syscall (not relying on cached
values);
3. Call tgkill;
4. Restore old signal mask.

Tested on x86_64 and i686.

[BZ #15368]
* sysdeps/unix/sysv/linux/nptl-signals.h
(__nptl_clear_internal_signals): New function.
(__libc_signal_block_all): Likewise.
(__libc_signal_block_app): Likewise.
(__libc_signal_restore_set): Likewise.
* sysdeps/unix/sysv/linux/pt-raise.c (raise): Use Linux raise.c
implementation.
* sysdeps/unix/sysv/linux/raise.c (raise): Reimplement to not use
the cached pid/tid value in pthread structure.

Regenerate i686 libm-test-ulps with GCC 6.1 at -O3 [BZ #20347]

This fixes with GCC 6.1 and -O3 on i686:

Failure: Test: j0_downward (0xap+0)
Result:
is:         -2.45935813e-01  -0x1.f7ad32p-3
should be:  -2.45935768e-01  -0x1.f7ad2cp-3
difference:  4.47034835e-08   0x1.800000p-25
ulp       :  3.0000
max.ulp   :  2.0000
Maximal error of `j0_downward'
is      : 3 ulp
accepted: 2 ulp

[BZ #20347]
* sysdeps/i386/i686/fpu/multiarch/libm-test-ulps: Regenerated.

x86-64: Add p{read,write}[v]64 to syscalls.list [BZ #20348]

64-bit off_t in pread64, preadv, pwrite64 and pwritev syscalls is passed
in one 64-bit register for both x32 and x86-64. Since the inline
asm statement only passes long, which is 32-bit for x32, in registers,
64-bit off_t is truncated to 32-bit on x32. Since __ASSUME_PREADV and
__ASSUME_PWRITEV are defined unconditionally, these syscalls can be
implemented in syscalls.list to pass 64-bit off_t in one 64-bit register.

Tested on x86-64 and x32 with off_t > 4GB on pread64/pwrite64 and
preadv64/pwritev64.

[BZ #20348]
* sysdeps/unix/sysv/linux/x86_64/syscalls.list: Add pread64,
preadv64, pwrite64 and pwritev64.

Test p{read,write}64 with offset > 4GB

Test p{read,write}64 with offset > 4GB.  Since it is not an error for a
successful pread/pwrite call to transfer fewer bytes than requested, we
should check if the return value is -1.   No need to close and unlink
temporary file, which is handled by test-skeleton.c.

[BZ #20350]
* posix/tst-preadwrite.c: Renamed to ...
* posix/tst-preadwrite-common.c: This.
(PREAD): Removed.
(PWRITE): Likewise.
(STRINGIFY): Likewise.
(STRINGIFY2): Likewise.
(do_prepare): Make it static and remove function arguments.
(do_test): Likewise.
(PREPARE): Updated.
(TEST_FUNCTION): New.
(name): Make it static.
(fd): Likewise.
(do_prepare): Use create_temp_file.
(do_test): Renamed to ...
(do_test_with_offset): This.  Make it static and accept offset.
Properly check return value of PWRITE and PREAD.  Return bytes
read.  Don't close fd nor unlink name.
* posix/tst-preadwrite.c: Rewrite.
* posix/tst-preadwrite64.c: Likewise.

x86-64: Properly align stack in _dl_tlsdesc_dynamic [BZ #20309]

Since _dl_tlsdesc_dynamic is called via PLT, we need to add 8 bytes for
push in the PLT entry to align the stack.

[BZ #20309]
* configure.ac (have-mtls-dialect-gnu2): Set to yes if
-mtls-dialect=gnu2 works.
* configure: Regenerated.
* elf/Makefile [have-mtls-dialect-gnu2 = yes]
(tests): Add tst-gnu2-tls1.
(modules-names): Add tst-gnu2-tls1mod.
($(objpfx)tst-gnu2-tls1): New.
(tst-gnu2-tls1mod.so-no-z-defs): Likewise.
(CFLAGS-tst-gnu2-tls1mod.c): Likewise.
* elf/tst-gnu2-tls1.c: New file.
* elf/tst-gnu2-tls1mod.c: Likewise.
* sysdeps/x86_64/dl-tlsdesc.S (_dl_tlsdesc_dynamic): Add 8
bytes for push in the PLT entry to align the stack.

X86-64: Define LO_HI_LONG to skip pos_h [BZ #20349]

Define LO_HI_LONG to skip pos_h since it is ignored by kernel:

static inline loff_t pos_from_hilo(unsigned long high, unsigned long low)
{
#define HALF_LONG_BITS (BITS_PER_LONG / 2)
return (((loff_t)high << HALF_LONG_BITS) << HALF_LONG_BITS) | low;
}

where size of loff_t == size of long.

[BZ #20349]
* sysdeps/unix/sysv/linux/x86_64/sysdep.h (LO_HI_LONG): New.

Revert "Add pretty printers for the NPTL lock types"

This reverts commit 62ce266b0b261def2c6329be9814ffdcc11964d6.

The change is not mature enough because it needs the following fixes:

1. Redirect test output to a file like other tests

2. Eliminate the need to use a .gdbinit because distributions will
   break without it.  I should have caught that but I was in too much
   of a hurry to get the patch in :/

3. Feature checking during configure to determine things like minimum
   required gdb version, python-pexpect version, etc. to make sure
   that tests work correctly.

[AArch64] Add bits/hwcap.h for aarch64 linux

AArch64 uses HWCAP bits but they are not defined in sys/auxv.h.
This patch adds a copy of the linux v4.6 arm64 uapi asm/hwcap.h
definitions.

* sysdeps/unix/sysv/linux/aarch64/bits/hwcap.h: New.

[AArch64] Fix libc internal asm profiling code

When glibc is built with --enable-profile, the ENTRY of
asm functions includes CALL_MCOUNT for profiling.
(matters for binaries static linked against libc_p.a.)

CALL_MCOUNT did not save/restore argument registers
around the _mcount call so it clobbered them.
(it is enough to only save/restore the arguments passed
to a given asm function, but that would be too many asm
changes so it is simpler to always save all argument
registers in this macro.)

float args are not saved: mcount does not clobber the
float regs and currently no asm function takes float
arguments anyway.

[BZ #18707]
* sysdeps/aarch64/Makefile (CFLAGS-mcount.c): Add -mgeneral-regs-only.
* sysdeps/aarch64/sysdep.h (CALL_MCOUNT): Save argument registers.

Fix LO_HI_LONG definition

The p{read,write}v{64} consolidation patch [1] added a wrong guard
for LO_HI_LONG definition. It currently uses both
'__WORDSIZE == 64' and 'defined __ASSUME_WORDSIZE64_ILP32' to set
the value to be passed in one argument, otherwise it will be split
in two.

However it fails on MIPS64n32 where syscalls n32 uses the compat
implementation in the kernel meaning the off_t arguments are passed
in two separate registers.

GLIBC already defines a macro for such cases (__OFF_T_MATCHES_OFF64_T),
so this patch uses it instead.

Checked on x86_64, i686, x32, aarch64, armhf, and s390.

* sysdeps/unix/sysv/linux/sysdep.h
[__WORDSIZE == 64 || __ASSUME_WORDSIZE64_ILP32] (LO_HI_LONG): Remove
guards.
* misc/tst-preadvwritev-common.c: New file.
* misc/tst-preadvwritev.c: Use tst-preadvwritev-common.c.
* misc/tst-preadvwritev64.c: Use tst-preadwritev-common.c and add
a check for files larger than 2GB.

[1] 4751bbe2ad4d1bfa05774e29376d553ecfe563b0

Remove __ASSUME_OFF_DIFF_OFF64 definition

This patch removes the __ASSUME_OFF_DIFF_OFF64 define introduced in
p{read,write} consolidation patch.  This define was added based on
the idea 32 bits ports would continue to follow previous off{64}_t
definition where off_t size differs from off64_t one.

However, with recent AArch64/ILP32 patch submission and also with
discussion for RISCV kernel interface, 32 bits ports now may aim
to use off_t and off64_t with the same size as 64 bits.

So current assumption for both p{read,write} and p{read,write}v
are not compatible with new type definition.  This patch now makes
the syscall wrappers to only depend on __OFF_T_MATCHES_OFF64_T to
define the default and 64-suffix variant, as follow:

  <function>.c
  #ifndef __OFF_T_MATCHES_OFF64_T
  /* build <function> */
  #endif

  and

  <function>64.c

  /* build <function>64 */
  #ifdef __OFF_T_MATCHES_OFF64_T
  weak_alias (fallocate64, fallocate)
  #endif

Tested on x86_64, i686, x32, and armhf.

* sysdeps/unix/sysv/linux/mips/kernel-features.h
(__ASSUME_OFF_DIFF_OFF64): Remove define.
* sysdeps/unix/sysv/linux/pread.c
[__WORDSIZE != 64 || __ASSUME_OFF_DIFF_OFF64] (pread): Replace by
__OFF_T_MATCHES_OFF64_T.
* sysdeps/unix/sysv/linux/pread64.c
[__WORDSIZE != 64 || __ASSUME_OFF_DIFF_OFF64] (pread64): Likewise.
* sysdeps/unix/sysv/linux/preadv.c
[__WORDSIZE != 64 || __ASSUME_OFF_DIFF_OFF64] (preadv): Likewise.
* sysdeps/unix/sysv/linux/preadv64.c
[__WORDSIZE != 64 || __ASSUME_OFF_DIFF_OFF64] (preadv64): Likewise.
* sysdeps/unix/sysv/linux/pwrite.c
[__WORDSIZE != 64 || __ASSUME_OFF_DIFF_OFF64] (pwrite): Likewise.
* sysdeps/unix/sysv/linux/pwrite64.c
[__WORDSIZE != 64 || __ASSUME_OFF_DIFF_OFF64] (pwrite64): Likewise.
* sysdeps/unix/sysv/linux/pwritev.c
[__WORDSIZE != 64 || __ASSUME_OFF_DIFF_OFF64] (pwritev): Likewise.
* sysdeps/unix/sysv/linux/pwritev64.c
[__WORDSIZE != 64 || __ASSUME_OFF_DIFF_OFF64] (pwritev64): Likewise.

Add pretty printers for the NPTL lock types

This patch adds pretty printers for the following NPTL types:

- pthread_mutex_t
- pthread_mutexattr_t
- pthread_cond_t
- pthread_condattr_t
- pthread_rwlock_t
- pthread_rwlockattr_t

To load the pretty printers into your gdb session, do the following:

python
import sys
sys.path.insert(0, '/path/to/glibc/build/nptl/pretty-printers')
end

source /path/to/glibc/source/pretty-printers/nptl-printers.py

You can check which printers are registered and enabled by issuing the
'info pretty-printer' gdb command. Printers should trigger automatically when
trying to print a variable of one of the types mentioned above.

The printers are architecture-independent, and were manually tested on both
the gdb CLI and Eclipse CDT.

In order to work, the printers need to know the values of various flags that
are scattered throughout pthread.h and pthreadP.h as enums and #defines. Since
replicating these constants in the printers file itself would create a
maintenance burden, I wrote a script called gen-py-const.awk that Makerules uses
to extract the constants. This script is pretty much the same as gen-as-const.awk,
except it doesn't cast the constant values to 'long' and is thorougly documented.
The constants need only to be enumerated in a .pysym file, which is then referenced
by a Make variable called gen-py-const-headers.

As for the install directory, I discussed this with Mike Frysinger and Siddhesh
Poyarekar, and we agreed that it can be handled in a separate patch, and it shouldn't
block merging of this one.

In addition, I've written a series of test cases for the pretty printers.
Each lock type (mutex, condvar and rwlock) has two test programs, one for itself
and other for its related 'attributes' object. Each test program in turn has a
PExpect-based Python script that drives gdb and compares its output to the
expected printer's. The tests run on the glibc host, which is assumed to have
both gdb and PExpect; if either is absent the tests will fail with code 77
(UNSUPPORTED). For cross-testing you should use cross-test-ssh.sh as test-wrapper.
I've tested the printers on both a native build and a cross build using a Beaglebone
Black, with the build system's filesystem shared with the board through NFS.

Finally, I've written a README that explains all this and more.

Hopefully this should be good to go in now. Thanks.

ChangeLog:

2016-07-04 Martin Galvan <martin.galvan@tallertechnologies.com>

* Makeconfig (build-hardcoded-path-in-tests): Set to 'yes' for shared builds
if tests-need-hardcoded-path is defined.
(all-subdirs): Add pretty-printers.
* Makerules ($(py-const)): New rule.
* Rules (others): Add $(py-const), if defined.
* nptl/Makefile (gen-py-const-headers): Define.
* nptl/nptl-printers.py: New file.
* nptl/nptl_lock_constants.pysym: Likewise.
* pretty-printers/Makefile: Likewise.
* pretty-printers/README: Likewise.
* pretty-printers/test-condvar-attributes.c: Likewise.
* pretty-printers/test-condvar-attributes.p: Likewise.
* pretty-printers/test-condvar-printer.c: Likewise.
* pretty-printers/test-condvar-printer.py: Likewise.
* pretty-printers/test-mutex-attributes.c: Likewise.
* pretty-printers/test-mutex-attributes.py: Likewise.
* pretty-printers/test-mutex-printer.c: Likewise.
* pretty-printers/test-mutex-printer.py: Likewise.
* pretty-printers/test-rwlock-attributes.c: Likewise.
* pretty-printers/test-rwlock-attributes.py: Likewise.
* pretty-printers/test-rwlock-printer.c: Likewise.
* pretty-printers/test-rwlock-printer.py: Likewise.
* pretty-printers/test_common.py: Likewise.
* scripts/gen-py-const.awk: Likewise.

tile: only define __ASSUME_ALIGNED_REGISTER_PAIRS for 32-bit

The previous uses of this symbol were all in wordsize-32 code.
In commit eeddfa91cbb1 ("Consolidate off_t/off64_t syscall
argument passing") it was expanded to be used in pread/pwrite.
Accordingly, we only define it in 32-bit compilation modes now.
Both tilepro and tilegx32 follow this convention for the
kernel ABI. tilegx64 follows it for passing 128-bit values,
but there are no such ABIs in the kernel.

Define __USE_KERNEL_IPV6_DEFS macro for non-Linux kernels

Commit 1c1e7fb6 changed the __USE_KERNEL_IPV6_DEFS tests from 'ifdef'
to 'if'. As inet/netinet.in.h is a generic file, this causes a warning
on non-Linux kernels (for example Hurd). To fix that define it in the
generic bits/in.h file.

Changelog:
* bits/in.h (__USE_KERNEL_IPV6_DEFS): Define to 0.

ppc: Fix modf (sNaN) for pre-POWER5+ CPU (bug 20240).

Commit a6a4395d fixed modf implementation by compiling s_modf.c and
s_modff.c with -fsignaling-nans. However these files are also included
from the pre-POWER5+ implementation, and thus these files should also
be compiled with -fsignaling-nans.

Changelog:
[BZ #20240]
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile
(CFLAGS-s_modf-ppc32.c): New variable.
(CFLAGS-s_modff-ppc32.c): Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
(CFLAGS-s_modf-ppc64.c): Likewise.
(CFLAGS-s_modff-ppc64.c): Likewise.

localedata: fix de_LI locale

Fix the postal_fmt and country_name entries to continue on the following
line without indentation.

localedata/Changelog:
* locales/de_LI (postal_fmt): Fix indentation.
(country_name): Likewise.

Add test case for bug 20263

Fix robust mutex daedlock [BZ #20263]

In Linux/ARM environment, a robust mutex can't catch the timeout result
when it is already owned by other thread and requests to try lock with
a specific time value(pthread_mutex_timedlock). The futex already returns
the ETIMEDOUT result but there is no check the return value and it makes
a deadlock.

* nptl/lowlevelrobustlock.c: Implement ETIMEDOUT logic.

Add missing changelog part

New locale de_LI

The Principality of Liechtenstein currently does not have a corresponding
locale. Given the links with Switzerland, the best is to base the locale
on the de_CH one (German is the official language) and only change the
country related categories: LC_ADDRESS. and LC_TELEPHONE.

localedata/Changelog:
* locales/de_LI: New locale.
* SUPPORTED: Add de_LI.