git.ipfire.org Git - thirdparty/glibc.git/log

pthread_once hangs when init routine throws an exception [BZ #18435]

This is another attempt at making pthread_once handle throwing exceptions
from the init routine callback.  As the new testcases show, just switching
to the cleanup attribute based cleanup does fix the tst-once5 test, but
breaks the new tst-oncey3 test.  That is because when throwing exceptions,
only the unwind info registered cleanups (i.e. C++ destructors or cleanup
attribute), when cancelling threads and there has been unwind info from the
cancellation point up to whatever needs cleanup both unwind info registered
cleanups and THREAD_SETMEM (self, cleanup, ...) registered cleanups are
invoked, but once we hit some frame with no unwind info, only the
THREAD_SETMEM (self, cleanup, ...) registered cleanups are invoked.
So, to stay fully backwards compatible (allow init routines without
unwind info which encounter cancellation points) and handle exception throwing
we actually need to register the pthread_once cleanups in both unwind info
and in the THREAD_SETMEM (self, cleanup, ...) way.
If an exception is thrown, only the former will happen and we in that case
need to also unregister the THREAD_SETMEM (self, cleanup, ...) registered
handler, because otherwise after catching the exception the user code could
call deeper into the stack some cancellation point, get cancelled and then
a stale cleanup handler would clobber stack and probably crash.
If a thread calling init routine is cancelled and unwind info ends before
the pthread_once frame, it will be cleaned up through self->cleanup as
before.  And if unwind info is present, unwind_stop first calls the
self->cleanup registered handler for the frame, then it will call the
unwind info registered handler but that will already see __do_it == 0
and do nothing.

(cherry picked from commit f0419e6a10740a672b28e112c409ae24f5e890ab)

elf: ld.so --help calls _dl_init_paths without a main map [BZ #27577]

In this case, use the link map of the dynamic loader itself as
a replacement. This is more than just a hack: if we ever support
DT_RUNPATH/DT_RPATH for the dynamic loader, reporting it for
ld.so --help (without further command line arguments) would be the
right thing to do.

Fixes commit 332421312576bd7095e70589154af99b124dd2d1 ("elf: Always
set l in _dl_init_paths (bug 23462)").

(cherry picked from commit 4e6db99c665d3b82a70a3e218860ef087b1555b4)

elf: Always set l in _dl_init_paths (bug 23462)

After d1d5471579eb0426671bf94f2d71e61dfb204c30 ("Remove dead
DL_DST_REQ_STATIC code.") we always setup the link map l to make the
static and shared cases the same. The bug is that in elf/dl-load.c
(_dl_init_paths) we conditionally set l only in the #ifdef SHARED
case, but unconditionally use it later. The simple solution is to
remove the #ifdef SHARED conditional, because it's no longer needed,
and unconditionally setup l for both the static and shared cases. A
regression test is added to run a static binary with
LD_LIBRARY_PATH='$ORIGIN' which crashes before the fix and runs after
the fix.

Co-Authored-By: Florian Weimer <fweimer@redhat.com>
(cherry picked from commit 332421312576bd7095e70589154af99b124dd2d1)

x86: Handle _SC_LEVEL1_ICACHE_LINESIZE [BZ #27444]

commit 2d651eb9265d1366d7b9e881bfddd46db9c1ecc4
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Fri Sep 18 07:55:14 2020 -0700

x86: Move x86 processor cache info to cpu_features

missed _SC_LEVEL1_ICACHE_LINESIZE.

1. Add level1_icache_linesize to struct cpu_features.
2. Initialize level1_icache_linesize by calling handle_intel,
handle_zhaoxin and handle_amd with _SC_LEVEL1_ICACHE_LINESIZE.
3. Return level1_icache_linesize for _SC_LEVEL1_ICACHE_LINESIZE.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit f53ffc9b90cbd92fa5518686daf4091bdd1d4889)

io: Return EBAFD for negative file descriptor on fstat (BZ #27559)

Now that fstat is implemented on top fstatat we need to handle negative
inputs. The implementation now rejects AT_FDCWD, which would otherwise
be accepted by the kernel.

Checked on x86_64-linux-gnu and on i686-linux-gnu.

(cherry picked from commit 94caafa040e4b4289c968cd70d53041b1463ac4d)

nscd: Fix double free in netgroupcache [BZ #27462]

In commit 745664bd798ec8fd50438605948eea594179fba1 a use-after-free
was fixed, but this led to an occasional double-free. This patch
tracks the "live" allocation better.

Tested manually by a third party.

Related: RHBZ 1927877

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit dca565886b5e8bd7966e15f0ca42ee5cff686673)

x86: Set minimum x86-64 level marker [BZ #27318]

Since the full ISA set used in an ELF binary is unknown to compiler,
an x86-64 ISA level marker indicates the minimum, not maximum, ISA set
required to run such an ELF binary.  We never guarantee a library with
an x86-64 ISA level v3 marker doesn't contain other ISAs beyond x86-64
ISA level v3, like AVX VNNI.  We check the x86-64 ISA level marker for
the minimum ISA set.  Since -march=sandybridge enables only some ISAs
in x86-64 ISA level v3, we should set the needed ISA marker to v2.
Otherwise, libc is compiled with -march=sandybridge will fail to run on
Sandy Bridge:

$ ./elf/ld.so ./libc.so
./libc.so: (p) CPU ISA level is lower than required: needed: 7; got: 3

Set the minimum, instead of maximum, x86-64 ISA level marker should have
no impact on the glibc-hwcaps directory assignment logic in ldconfig nor
ld.so.

(cherry picked from commit 339bf918ea4830fb35614632e96f3aab3237adce)

nss: Re-enable NSS module loading after chroot [BZ #27389]

The glibc 2.33 release enabled /etc/nsswitch.conf reloading,
and to prevent potential security issues like CVE-2019-14271
the re-loading of nsswitch.conf and all mdoules was disabled
when the root filesystem changes (see bug 27077).

Unfortunately php-lpfm and openldap both require the ability
to continue to load NSS modules after chroot. The packages
do not exec after the chroot, and so do not cause the
protections to be reset. The only solution is to re-enable
only NSS module loading (not nsswitch.conf reloading) and so
get back the previous glibc behaviour.

In the future we may introduce a way to harden applications
so they do not reload NSS modules once the root filesystem
changes, or that only files/dns are available pre-loaded
(or builtin).

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit 58673149f37389495c098421085ffdb468b3f7ad)

x86: Add CPU-specific diagnostics to ld.so --list-diagnostics

(cherry picked from commit 01a5746b6c8a44dc29d33e056b63485075a6a3cc)

Adjusted to not print the rep_movsb_stop_threshold field, which
does not exist in glibc 2.33.

x86: Automate generation of PREFERRED_FEATURE_INDEX_1 bitfield

Use a .def file to define the bitfield layout, so that it is possible
to iterate over field members using the preprocessor.

(cherry picked from commit e4933c8a92ea08eecdf3ab45e7f76c95dc3d20ac)

ld.so: Implement the --list-diagnostics option

(cherry picked from commit 851f32cf7bf7067f73b991610778915edd57d7b4)

string: Work around GCC PR 98512 in rawmemchr

(cherry picked from commit 044e603b698093cf48f6e6229e0b66acf05227e4)

S390: Add new hwcap values.

The new hwcap values indicate support for arch14 architecture.

(cherry picked from commit 25251c0707fe34f30a27381a5fabc35435a96621)

tunables: Disallow negative values for some tunables

The glibc.malloc.mmap_max tunable as well as al of the INT_32 tunables
don't have use for negative values, so pin the hardcoded limits in the
non-negative range of INT. There's no real benefit in any of those
use cases for the extended range of unsigned, so I have avoided added
a new type to keep things simple.

(cherry picked from commit 228f30ab4724d4087d5f52018873fde22efea6e2)

x86: Use SIZE_MAX instead of (long int)-1 for tunable range value

The tunable types are SIZE_T, so set the ranges to the correct maximum
value, i.e. SIZE_MAX.

(cherry picked from commit a1b8b06a55c1ee581d5ef860cec214b0c27a66f0)

tunables: Simplify TUNABLE_SET interface

The TUNABLE_SET interface took a primitive C type argument, which
resulted in inconsistent type conversions internally due to incorrect
dereferencing of types, especialy on 32-bit architectures.  This
change simplifies the TUNABLE setting logic along with the interfaces.

Now all numeric tunable values are stored as signed numbers in
tunable_num_t, which is intmax_t.  All calls to set tunables cast the
input value to its primitive type and then to tunable_num_t for
storage.  This relies on gcc-specific (although I suspect other
compilers woul also do the same) unsigned to signed integer conversion
semantics, i.e. the bit pattern is conserved.  The reverse conversion
is guaranteed by the standard.

(cherry picked from commit 61117bfa1b08ca048e6512c0652c568300fedf6a)

nsswitch: return result when nss database is locked [BZ #27343]

Before the change nss_database_check_reload_and_get() did not populate
the '*result' value when it returned success in a case of chroot
detection. This caused initgroups() to use garage pointer in the
following test (extracted from unbound):

```

int main() {
    // load some NSS modules
    struct passwd * pw = getpwnam("root");

    chdir("/tmp");
    chroot("/tmp");
    chdir("/");
    // access nsswitch.conf in a chroot
    initgroups("root", 0);
}
```

Reviewed-by: DJ Delorie <dj@redhat.com>

Prepare for glibc 2.33 release

Update version.h, features.h, and ChangeLog.old/ChangeLog.22.

Update NEWS with bugs

Update translations

Add missing Serbian translation.

NEWS: Fix typo in CVE-2021-3326 entry

elf: Fix tests that rely on ld.so.cache for cross-compiling

For configurations with cross-compiling equal to 'maybe' or 'no',
ldconfig will not run and thus the ld.so.cache will not be created
on the container testroot.pristine.

This lead to failures on both tst-glibc-hwcaps-prepend-cache and
tst-ldconfig-ld_so_conf-update on environments where the same
compiler can be used to build different ABIs (powerpc and x86 for
instance).

This patch addas a new test-container hook, ldconfig.run, that
triggers a ldconfig execution prior the test execution.

Checked on x86_64-linux-gnu and i686-linux-gnu.

NEWS: Mention CVE-2021-3326 (iconv assertion with ISO-20220-JP-3)

NEWS: Add entry for glibc-hwcaps and deprecate legacy hwcaps

x86: Properly set usable CET feature bits [BZ #26625]

commit 94cd37ebb293321115a36a422b091fdb72d2fb08
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Wed Sep 16 05:27:32 2020 -0700

    x86: Use HAS_CPU_FEATURE with IBT and SHSTK [BZ #26625]

broke

GLIBC_TUNABLES=glibc.cpu.hwcaps=-IBT,-SHSTK

since it can no longer disable IBT nor SHSTK.  Handle IBT and SHSTK with:

1. Revert commit 94cd37ebb293321115a36a422b091fdb72d2fb08.
2. Clears the usable CET feature bits if kernel doesn't support CET.
3. Add GLIBC_TUNABLES tests without dlopen.
4. Add tests to verify that CPU_FEATURE_USABLE on IBT and SHSTK matches
_get_ssp.
5. Update GLIBC_TUNABLES tests with dlopen to verify that CET is disabled
with GLIBC_TUNABLES.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

Update translations

Update libc.pot for 2.33 release

Update ia64 libm-test-ulps

sh: Update libm-tests-ulps

ia64: Fix brk call on statup

brk used by statup before TCB is properly set, so we can't use
IA64_USE_NEW_STUB.

This patch fixes a regression introduced by 720480934ab910.

Checked on ia64-linux-gnu.

Update sparc libm-test-ulps

Update alpha libm-test-ulps

powerpc64: Workaround sigtramp vdso return call

A not so recent kernel change[1] changed how the trampoline
`__kernel_sigtramp_rt64` is used to call signal handlers.

This was exposed on the test misc/tst-sigcontext-get_pc

Before kernel 5.9, the kernel set LR to the trampoline address and
jumped directly to the signal handler, and at the end the signal
handler, as any other function, would `blr` to the address set.  In
other words, the trampoline was executed just at the end of the signal
handler and the only thing it did was call sigreturn.  But since
kernel 5.9 the kernel set CTRL to the signal handler and calls to the
trampoline code, the trampoline then `bctrl` to the address in CTRL,
setting the LR to the next instruction in the middle of the
trampoline, when the signal handler returns, the rest of the
trampoline code executes the same code as before.

Here is the full trampoline code as of kernel 5.11.0-rc5 for
reference:

    V_FUNCTION_BEGIN(__kernel_sigtramp_rt64)
    .Lsigrt_start:
            bctrl   /* call the handler */
            addi    r1, r1, __SIGNAL_FRAMESIZE
            li      r0,__NR_rt_sigreturn
            sc
    .Lsigrt_end:
    V_FUNCTION_END(__kernel_sigtramp_rt64)

This new behavior breaks how `backtrace()` uses to detect the
trampoline frame to correctly reconstruct the stack frame when it is
called from inside a signal handling.

This workaround rely on the fact that the trampoline code is at very
least two (maybe 3?) instructions in size (as it is in the 32 bits
version, only on `li` and `sc`), so it is safe to check the return
address be in the range __kernel_sigtramp_rt64 .. + 4.

[1] subject: powerpc/64/signal: Balance return predictor stack in signal trampoline
    commit: 0138ba5783ae0dcc799ad401a1e8ac8333790df9
    url: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0138ba5783ae0dcc799ad401a1e8ac8333790df9

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

Fix nss/tst-reload2 for systems without PATH_MAX

nsswitch: do not reload if "/" changes

https://sourceware.org/bugzilla/show_bug.cgi?id=27077

Before reloading nsswitch.conf, verify that the root directory
hasn't changed - if it has, it's likely that we've entered a
container and should not trust the nsswitch inside the container
nor load any shared objects therein.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

elf: Limit tst-prelink-cmp target archs

elf/tst-prelink-cmp was initially added for x86 (commit fe534fe898) to validate
the fix for Bug 19178, and later applied to all architectures that use GLOB_DAT
relocations (commit 89569c8bb6). However, that bug only affected targets that
handle GLOB_DAT relocations as ELF_TYPE_CLASS_EXTERN_PROTECTED_DATA, so the test
should only apply to targets defining DL_EXTERN_PROTECTED_DATA, which gates the
usage of the elf type class above. For all other targets not meeting that
criteria, the test now returns with UNSUPPORTED status.

Fixes the test on POWER10 processors, which started using R_PPC64_GLOB_DAT.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

gconv: Fix assertion failure in ISO-2022-JP-3 module (bug 27256)

The conversion loop to the internal encoding does not follow
the interface contract that __GCONV_FULL_OUTPUT is only returned
after the internal wchar_t buffer has been filled completely.  This
is enforced by the first of the two asserts in iconv/skeleton.c:

      /* We must run out of output buffer space in this
rerun.  */
      assert (outbuf == outerr);
      assert (nstatus == __GCONV_FULL_OUTPUT);

This commit solves this issue by queuing a second wide character
which cannot be written immediately in the state variable, like
other converters already do (e.g., BIG5-HKSCS or TSCII).

Reported-by: Tavis Ormandy <taviso@gmail.com>

Revert "Make libc symbols hidden in static PIE" [BZ #27237]

This reverts commit 2682695e5c7acf1e60dd3b5c3a14d4e82416262c.
Fixes bug 27237.

That commit turned out to be too intrusive affecting crt files, test
system and benchmark files. They should not be affected, but the
build system does not set the MODULE_NAME and LIBC_NONSHARED reliably.

benchtests: Do not build bench-timing-type with MODULE_NAME=libc

Since commit 2682695e5c7a, `make bench-build' with `--enable-static-pie'
fails due to bench-timing-type being incorrectly built with MODULE_NAME
set to `libc'. This commit sets MODULE_NAME to nonlib, thus fixing the
build failure.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

aarch64: Fix the list of tested IFUNC variants [BZ #26818]

Some IFUNC variants are not compatible with BTI and MTE so don't
set them as usable for testing and benchmarking on a BTI or MTE
enabled system.

As far as IFUNC selectors are concerned a system is BTI enabled if
the cpu supports it and glibc was built with BTI branch protection.

Most IFUNC variants are BTI compatible, but thunderx2 memcpy and
memmove use a jump table with indirect jump, without a BTI j.

Fixes bug 26818.

Update INSTALL with package versions that are known to work

Most packages have been tested with their latest releases, except for
Python, whose latest version is 3.9.1.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

aarch64: Move and update the definition of MTE_ENABLED

The hwcap value is now in linux 5.10 and in glibc bits/hwcap.h, so use
that definition.

Move the definition to init-arch.h so all ifunc selectors can use it
and expose an "mte" shorthand for mte enabled runtime.

For now we allow user code to enable tag checks and use PROT_MTE
mappings without libc involvment, this is not guaranteed ABI, but
can be useful for testing and debugging with MTE.

Fix misplaced const

Constify __x86_cacheinfo_p and __x86_cpu_features_p, not their pointer
target types.

Update C-SKY libm-test-ulps

manual: Correct argument order in mount examples [BZ #27207]

Reviewed-by: DJ Delorie <dj@redhat.com>

linux: mips: Fix getdents64 fallback on mips64-n32

GCC mainline shows the following error:

../sysdeps/unix/sysv/linux/mips/mips64/getdents64.c: In function '__getdents64':
../sysdeps/unix/sysv/linux/mips/mips64/getdents64.c:121:7: error: 'memcpy' forming offset [4, 7] is out of the bounds [0, 4] [-Werror=array-bounds]
  121 |       memcpy (((char *) dp + offsetof (struct dirent64, d_ino)),
      |       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  122 |               KDP_MEMBER (kdp, d_ino), sizeof ((struct dirent64){0}.d_ino));
      |               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../sysdeps/unix/sysv/linux/mips/mips64/getdents64.c:123:7: error: 'memcpy' forming offset [4, 7] is out of the bounds [0, 4] [-Werror=array-bounds]
  123 |       memcpy (((char *) dp + offsetof (struct dirent64, d_off)),
      |       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  124 |               KDP_MEMBER (kdp, d_off), sizeof ((struct dirent64){0}.d_off));
      |               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The issue is due both d_ino and d_off fields for mips64-n32
kernel_dirent are 32-bits, while this is using memcpy to copy 64 bits
from it into the glibc dirent64.

The fix is to use a temporary buffer to read the correct type
from kernel_dirent.

Checked with a build-many-glibcs.py for mips64el-linux-gnu and I
also checked the tst-getdents64 on mips64el 4.1.4 kernel with
and without fallback enabled (by manually setting the
getdents64_supported).

x86: Properly match CPU features in /proc/cpuinfo [BZ #27222]

Search " YYY " and " YYY\n", instead of "YYY", to avoid matching
"XXXYYYZZZ" with "YYY".

Update /proc/cpuinfo CPU feature names:

/proc/cpuinfo                     glibc
------------------------------------------------
avx512vbmi                        AVX512_VBMI
dts                               DS
pni                               SSE3
tsc_deadline_timer                TSC_DEADLINE

x86-64: Update tst-glibc-hwcaps-2.c for x86-64 baseline

Return EXIT_FAILURE only if the level 2 libx86-64-isa-level.so is used
on x86-64 baseline machine.

powerpc64: Select POWER9 machine for the scv instruction

It is not available with the baseline ISA.

Fixes commit 68ab82f56690ada86ac1e0c46bad06ba189a10ef
("powerpc: Runtime selection between sc and scv for syscalls").

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>

x86: Check ifunc resolver with CPU_FEATURE_USABLE [BZ #27072]

Check ifunc resolver with CPU_FEATURE_USABLE and tunables in dynamic and
static executables to verify that CPUID features are initialized early in
static PIE.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

Revert "linux: Move {f}xstat{at} to compat symbols" for static build

This reverts commit 20b39d59467b0c1d858e89ded8b0cebe55e22f60 for static
library. This avoids the need to rebuild the world for the case where
libstdc++ (and potentially other libraries) are linked to a old glibc.

To avoid requering to provide xstat symbols for newer ABIs (such as
riscv32) a new LIB_COMPAT macro is added. It is similar to SHLIB_COMPAT
but also works for static case (thus evaluating similar to SHLIB_COMPAT
for both shared and static case).

Checked with a check-abi on all affected ABIs. I also check if the
static library does contains the xstat symbols.

aarch64: revert memcpy optimze for kunpeng to avoid performance degradation

In commit 863d775c481704baaa41855fc93e5a1ca2dc6bf6, kunpeng920 is added to default memcpy version,
however, there is performance degradation when the copy size is some large bytes, eg: 100k.
This is the result, tested in glibc-2.28:
             before backport  after backport Performance improvement
memcpy_1k      0.005              0.005                 0.00%
memcpy_10k     0.032              0.029                 10.34%
memcpy_100k    0.356              0.429                 -17.02%
memcpy_1m      7.470              11.153                -33.02%

This is the demo
#include "stdio.h"
#include "string.h"
#include "stdlib.h"

char a[1024*1024] = {12};
char b[1024*1024] = {13};
int main(int argc, char *argv[])
{
    int i = atoi(argv[1]);
    int j;
    int size = atoi(argv[2]);

    for (j = 0; j < i; j++)
        memcpy(b, a, size*1024);
    return 0;
}

# gcc -g -O0 memcpy.c -o memcpy
# time taskset -c 10 ./memcpy 100000 1024

Co-authored-by: liqingqing <liqingqing3@huawei.com>

Make libc symbols hidden in static PIE

Hidden visibility can avoid indirections and RELATIVE relocs in
static PIE libc.

The check should use IS_IN_LIB instead of IS_IN(libc) since all
symbols are defined locally in static PIE and the optimization is
useful in all libraries not just libc. However the test system
links objects from libcrypt.a into dynamic linked test binaries
where hidden visibility does not work. I think mixing static and
shared libc components in the same binary should not be supported
usage, but to be safe only use hidden in libc.a.

On some targets (i386) this optimization cannot be applied because
hidden visibility PIE ifunc functions don't work, so it is gated by
NO_HIDDEN_EXTERN_FUNC_IN_PIE.

From -static-pie linked 'int main(){}' this shaves off 71 relative
relocs on aarch64 and reduces code size by about 2k.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

csu: Move static pie self relocation later [BZ #27072]

IFUNC resolvers may depend on tunables and cpu feature setup so
move static pie self relocation after those.

It is hard to guarantee that the ealy startup code does not rely
on relocations so this is a bit fragile. It would be more robust
to handle RELATIVE relocs early and only IRELATIVE relocs later,
but the current relocation processing code cannot do that.

The early startup code up to relocation processing includes

  _dl_aux_init (auxvec);
  __libc_init_secure ();
  __tunables_init (__environ);
  ARCH_INIT_CPU_FEATURES ();
  _dl_relocate_static_pie ();

These are simple enough that RELATIVE relocs can be avoided.

The following steps include

  ARCH_SETUP_IREL ();
  ARCH_SETUP_TLS ();
  ARCH_APPLY_IREL ();

On some targets IRELATIVE processing relies on TLS setup on
others TLS setup relies on IRELATIVE relocs, so the right
position for _dl_relocate_static_pie is target dependent.
For now move self relocation as early as possible on targets
that support static PIE.

Fixes bug 27072.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

Use hidden visibility for early static PIE code

Extern symbol access in position independent code usually involves GOT
indirection which needs RELATIVE reloc in a static linked PIE. (On
some targets this is avoided e.g. because the linker can relax a GOT
access to a pc-relative access, but this is not generally true.) Code
that runs before static PIE self relocation must avoid relying on
dynamic relocations which can be ensured by using hidden visibility.
However we cannot just make all symbols hidden:

On i386, all calls to IFUNC functions must go through PLT and calls to
hidden functions CANNOT go through PLT in PIE since EBX used in PIE PLT
may not be set up for local calls to hidden IFUNC functions.

This patch aims to make symbol references hidden in code that is used
before and by _dl_relocate_static_pie when building a static PIE libc.
Note: for an object that is used in the startup code, its references
and definition may not have consistent visibility: it is only forced
hidden in the startup code.

This is needed for fixing bug 27072.

Co-authored-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

csu: Avoid weak ref for __ehdr_start in static PIE

All linkers support __ehdr_start that support static PIE linking,
so there is no need to check for its presence via a weak reference.

This avoids a RELATIVE relocation in static PIE startup code on some
targets.

With non-PIE static linking the weak ref check is kept in case the
linker does not support __ehdr_start.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

configure: Check for static PIE support

Add SUPPORT_STATIC_PIE that targets can define if they support
static PIE. This requires PI_STATIC_AND_HIDDEN support and various
linker features as described in

commit 9d7a3741c9e59eba87fb3ca6b9f979befce07826
Add --enable-static-pie configure option to build static PIE [BZ #19574]

Currently defined on x86_64, i386 and aarch64 where static PIE is
known to work.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

elf: Avoid RELATIVE relocs in __tunables_init

With static pie linking pointers in the tunables list need
RELATIVE relocs since the absolute address is not known at link
time. We want to avoid relocations so the static pie self
relocation can be done after tunables are initialized.

This is a simple fix that embeds the tunable strings into the
tunable list instead of using pointers.  It is possible to have
a more compact representation of tunables with some additional
complexity in the generator and tunable parser logic.  Such
optimization will be useful if the list of tunables grows.

There is still an issue that tunables_strdup allocates and the
failure handling code path is sufficiently complex that it can
easily have RELATIVE relocations.  It is possible to avoid the
early allocation and only change environment variables in a
setuid exe after relocations are processed.  But that is a
bigger change and early failure is fatal anyway so it is not
as critical to fix right away. This is bug 27181.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

elf: Make the tunable struct definition internal only

The representation of the tunables including type information and
the tunable list structure are only used in the implementation not
in the tunables api that is exposed to usage within glibc.

This patch moves the representation related definitions into the
existing dl-tunable-types.h and uses that only for implementation.

The tunable callback and related types are moved to dl-tunables.h
because they are part of the tunables api.

This reduces the details exposed in the tunables api so the internals
are easier to change.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

<sys/platform/x86.h>: Remove the C preprocessor magic

In <sys/platform/x86.h>, define CPU features as enum instead of using
the C preprocessor magic to make it easier to wrap this functionality
in other languages.  Move the C preprocessor magic to internal header
for better GCC codegen when more than one features are checked in a
single expression as in x86-64 dl-hwcaps-subdirs.c.

1. Rename COMMON_CPUID_INDEX_XXX to CPUID_INDEX_XXX.
2. Move CPUID_INDEX_MAX to sysdeps/x86/include/cpu-features.h.
3. Remove struct cpu_features and __x86_get_cpu_features from
<sys/platform/x86.h>.
4. Add __x86_get_cpuid_feature_leaf to <sys/platform/x86.h> and put it
in libc.
5. Make __get_cpu_features() private to glibc.
6. Replace __x86_get_cpu_features(N) with __get_cpu_features().
7. Add _dl_x86_get_cpu_features to GLIBC_PRIVATE.
8. Use a single enum index for each CPU feature detection.
9. Pass the CPUID feature leaf to __x86_get_cpuid_feature_leaf.
10. Return zero struct cpuid_feature for the older glibc binary with a
smaller CPUID_INDEX_MAX [BZ #27104].
11. Inside glibc, use the C preprocessor magic so that cpu_features data
can be loaded just once leading to more compact code for glibc.

256 bits are used for each CPUID leaf.  Some leaves only contain a few
features.  We can add exceptions to such leaves.  But it will increase
code sizes and it is harder to provide backward/forward compatibilities
when new features are added to such leaves in the future.

When new leaves are added, _rtld_global_ro offsets will change which
leads to race condition during in-place updates. We may avoid in-place
updates by

1. Rename the old glibc.
2. Install the new glibc.
3. Remove the old glibc.

NB: A function, __x86_get_cpuid_feature_leaf , is used to avoid the copy
relocation issue with IFUNC resolver as shown in IFUNC resolver tests.

posix: Fix fnmatch.c on bootstrap

Only define FALLTHROUGH for _LIBC and do not check __clang_major__
value.

It partially syncs with gnulib 5c52f00c69f39fe.

Checked with build-many-glibcs.py for aarch64-linux-gnu.

stdlib: Add testcase for BZ #26241

Old implementation of realpath allocates a PATH_MAX using alloca for
each symlink in the path, leading to MAXSYMLINKS times PATH_MAX
maximum stack usage.

The test create a symlink with __eloop_threshold() loops and creates
a thread with minimum stack size (obtained through
support_small_stack_thread_attribute). The thread issues a stack
allocations that fill the thread allocated stack minus some slack
plus and the realpath usage (which assumes a bounded stack usage).
If realpath uses more than about 2 * PATH_MAX plus some slack it
triggers a stackoverflow.

Checked on x86_64-linux-gnu and i686-linux-gnu.

Reviewed-by: DJ Delorie <dj@redhat.com>

posix: Fix regex_internal.h on bootstrap

Only define FALLTHROUGH for _LIBC and do not check __clang_major__
value.

It partially syncs with gnulib 5c52f00c69f39fe.

Checked with build-many-glibcs.py for aarch64-linux-gnu,
x86_64-linux-gnu, and s390x-linux-gnu.

Use <startup.h> in __libc_init_secure

Since __libc_init_secure is called before ARCH_SETUP_TLS, it must use
"int $0x80" for system calls in i386 static PIE. Add startup_getuid,
startup_geteuid, startup_getgid and startup_getegid to <startup.h>.
Update __libc_init_secure to use them.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

elf: Avoid RELATIVE relocation for _dl_sysinfo

Set the default _dl_sysinfo in _dl_aux_init to avoid RELATIVE relocation
in static PIE.

This is needed for fixing bug 27072 on x86.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

libmvec: Add extra-test-objs to test-extras

Add extra-test-objs to test-extras so that they are compiled with
-DMODULE_NAME=testsuite instead of -DMODULE_NAME=libc.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

Hurd: Add rtld-strncpy-c.c

All IFUNC functions which are used in ld.so must have a rtld version if
the IFUNC version isn't safe to use in ld.so.

Update MIPS libm-test-ulps.

Update arm libm-test-ulps.

Update powerpc-nofpu libm-test-ulps.

Update hppa libm-test-ulps

ARC: nofpu: Regenerate ulps

ld.so: Add --list-tunables to print tunable values

Pass --list-tunables to ld.so to print tunables with min and max values.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

math/test-tgmath2: Fix fabs failure when no long double

I have been testing with GCC trunk and GLIBC master while working on the
OpenRISC port.  This test has been failing with fabs not being called,
This is caused as my architecture is configure with no long double
meaning the two calls are the same:

  TEST (fabs (Vdouble1), double, fabs);
  TEST (fabs (Vldouble1), ldouble, fabs);

Instead of the tgmath calls resolving to fabs and fabsl both calls are
fabs.  Next, do to compiler optimiations the second call is eliminated.
Fix this by invoking the failing TEST with Vldouble2.

Note, I also updated the FAIL message to more clearly show where the
failure happened, so I see:

  FAIL: math/test-tgmath2
  original exit status 1
  wrong function called, fabs (ldouble) failure on line 174

Cc: Joseph Myers <joseph@codesourcery.com>

x86: Move x86 processor cache info to cpu_features

1. Move x86 processor cache info to _dl_x86_cpu_features in ld.so.
2. Update tunable bounds with TUNABLE_SET_WITH_BOUNDS.
3. Move x86 cache info initialization to dl-cacheinfo.h and initialize
x86 cache info in init_cpu_features ().
4. Put x86 cache info for libc in cacheinfo.h, which is included in
libc-start.c in libc.a and is included in cacheinfo.c in libc.so.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

Fix x86 build with --enable-tunable=no

Checked on x86_64-linux-gnu.

ifuncmain6pie: Remove the circular IFUNC dependency [BZ #20019]

On x86, ifuncmain6pie failed with:

[hjl@gnu-cfl-2 build-i686-linux]$ ./elf/ifuncmain6pie --direct
./elf/ifuncmain6pie: IFUNC symbol 'foo' referenced in '/export/build/gnu/tools-build/glibc-32bit/build-i686-linux/elf/ifuncmod6.so' is defined in the executable and creates an unsatisfiable circular dependency.
[hjl@gnu-cfl-2 build-i686-linux]$ readelf -rW elf/ifuncmod6.so | grep foo
00003ff4  00000706 R_386_GLOB_DAT         0000400c   foo_ptr
00003ff8  00000406 R_386_GLOB_DAT         00000000   foo
0000400c  00000401 R_386_32               00000000   foo
[hjl@gnu-cfl-2 build-i686-linux]$

Remove non-JUMP_SLOT relocations against foo in ifuncmod6.so, which
trigger the circular IFUNC dependency, and build ifuncmain6pie with
-Wl,-z,lazy.

Use the right argument code in unnormal tests

Use the right argument code (j) in the unnormal tests and cast inputs
from the ieee_long_double_shape_type struct to Float64x to properly
test it.

ldconfig/x86: Store ISA level in cache and aux cache

Store ISA level in the portion of the unused upper 32 bits of the hwcaps
field in cache and the unused pad field in aux cache. ISA level is stored
and checked only for shared objects in glibc-hwcaps subdirectories. The
shared objects in the default directories aren't checked since there are
no fallbacks for these shared objects.

Tested on x86-64-v2, x86-64-v3 and x86-64-v4 machines with
--disable-hardcoded-path-in-tests and --enable-hardcoded-path-in-tests.

elf: work around a gcc bug in elf_get_dynamic_info

Since commit 2f056e8a5dd4dc0f075413f931e82cede37d1057
"aarch64: define PI_STATIC_AND_HIDDEN",
building glibc with gcc-8 on aarch64 fails with

/BLD/elf/librtld.os: in function `elf_get_dynamic_info':
/SRC/elf/get-dynamic-info.h:70:(.text+0xad8): relocation truncated to
fit: R_AARCH64_ADR_PREL_PG_HI21 against symbol `_rtld_local' defined
in .data section in /BLD/elf/librtld.os

This is a gcc bug:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98618
The bug is fixed on gcc-10 and not yet backported. gcc-9 is affected,
but the issue happens to not trigger in glibc, gcc-8 and older seems
to miscompile rtld.os.

Rewriting the affected code in elf_get_dynamic_info seems to make the
issue go away on <= gcc-9.

The change makes the logic a bit clearer too (by separating the index
computation and array update) and drops an older gcc workaround (since
gcc 4.6 is no longer supported).

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

x86: Set header.feature_1 in TCB for always-on CET [BZ #27177]

Update dl_cet_check() to set header.feature_1 in TCB when both IBT and
SHSTK are always on.

posix: consume less entropy on tempname

The first getrandom is used only for __GT_NOCREATE, which is inherently
insecure and can use the entropy as a small improvement. On the
second and later attempts it might help against DoS attacks.

It sync with gnulib commit 854fbb81d91f7a0f2b463e7ace2499dee2f380f2.

Checked on x86_64-linux-gnu.

Makerules: Do not require startup files for format.lds probe object

During statically linked bootstrap, the compiler does not have
the required startup files, so do a smaller dummy link to obtain
the output format information.

Fixes commit 87d583c6e8cd0e49f64da76636ebeec033298b4d ("install:
Replace scripts/output-format.sed with objdump -f [BZ #26559]").

install: Replace scripts/output-format.sed with objdump -f [BZ #26559]

GNU ld and gold have supported --print-output-format since 2011. glibc
requires binutils>=2.25 (2015), so if LD is GNU ld or gold, we can
assume the option is supported.

lld is by default a cross linker supporting multiple targets. It auto
detects the file format and does not need OUTPUT_FORMAT. It does not
support --print-output-format.

By parsing objdump -f, we can support all the three linkers.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

math: Add BZ#18980 fix back on dbl-64 cosh

It is regression from 9e97f239eae1f2b1 (Remove dbl-64/wordsize-64
(part 2)) where is missed to add the BZ#18980 fix (9e97f239eae1f2b1).

Checked on i686-linux-gnu.

posix: Sync tempname with gnulib [BZ #26648]

It syncs with gnulib commit b1268f22f443e8e4b9e. The try_tempname_len
now uses getrandom on each iteration to get entropy and only uses the
clock plus ASLR as source of entropy if getrandom fails.

Checked on x86_64-linux-gnu and i686-linux-gnu.

posix: Fix return value of system if shell can not be executed [BZ #27053]

POSIX states that system returned code for failure to execute the shell
shall be as if the shell had terminated using _exit(127). This
behaviour was removed with 5fb7fc96350575.

Checked on x86_64-linux-gnu.

support: Add xchmod wrapper

Checked on x86_64-linux-gnu.

Update STATX_ATTR_DAX value from Linux 5.10.

This patch updates the value of STATX_ATTR_DAX in bits/statx-generic.h
for a change made in Linux 5.10. (As with previous such changes, this
only does anything if glibc is being used with old kernel headers.)

Tested for x86_64.

riscv: Initialize $gp before resolving the IRELATIVE relocation

The $gp register may be used to access the global variable in
the PDE program, so the $gp register should be initialized before
executing the IFUNC resolver of PDE program to avoid unexpected
error occurs.

riscv: support GNU indirect function

Enable riscv glibc to support GNU indirect function

posix: Correct attribute access mode on readlinkat [BZ #27024].

Add xfchmod to libsupport

Add xchdir to libsupport.

POSIX locale: Fix typo in comment

ARC: Regenerate ulps

Reinstate pass for

FAIL: math/test-double-cosh
FAIL: math/test-double-sinh
FAIL: math/test-float32x-cosh
FAIL: math/test-float32x-sinh
FAIL: math/test-float64-cosh
FAIL: math/test-float64-sinh
FAIL: math/test-ldouble-cosh
FAIL: math/test-ldouble-sinh

mntent: Use __putc_unlocked instead of fputc_unlocked

__putc_unlocked is guaranteed to be inlined all the time as opposed to
fputc_unlocked, which does not get inlined when glibc is built with
-Os.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

aarch64: define PI_STATIC_AND_HIDDEN

AArch64 always uses pc relative access to static and hidden object
symbols, but the config setting was previously missing.

This affects ld.so start up code.

Update NEWS for CVE-2019-25013.

x86: Support GNU_PROPERTY_X86_ISA_1_V[234] marker [BZ #26717]

GCC 11 supports -march=x86-64-v[234] to enable x86 micro-architecture ISA
levels:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97250

and -mneeded to emit GNU_PROPERTY_X86_ISA_1_NEEDED property with
GNU_PROPERTY_X86_ISA_1_V[234] marker:

https://gitlab.com/x86-psABIs/x86-64-ABI/-/merge_requests/13

Binutils support for GNU_PROPERTY_X86_ISA_1_V[234] marker were added by

commit b0ab06937385e0ae25cebf1991787d64f439bf12
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Fri Oct 30 06:49:57 2020 -0700

    x86: Support GNU_PROPERTY_X86_ISA_1_BASELINE marker

and

commit 32930e4edbc06bc6f10c435dbcc63131715df678
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Fri Oct 9 05:05:57 2020 -0700

    x86: Support GNU_PROPERTY_X86_ISA_1_V[234] marker

GNU_PROPERTY_X86_ISA_1_NEEDED property in x86 ELF binaries indicate the
micro-architecture ISA level required to execute the binary.  The marker
must be added by programmers explicitly in one of 3 ways:

1. Pass -mneeded to GCC.
2. Add the marker in the linker inputs as this patch does.
3. Pass -z x86-64-v[234] to the linker.

Add GNU_PROPERTY_X86_ISA_1_BASELINE and GNU_PROPERTY_X86_ISA_1_V[234]
marker support to ld.so if binutils 2.32 or newer is used to build glibc:

1. Add GNU_PROPERTY_X86_ISA_1_BASELINE and GNU_PROPERTY_X86_ISA_1_V[234]
markers to elf.h.
2. Add GNU_PROPERTY_X86_ISA_1_BASELINE and GNU_PROPERTY_X86_ISA_1_V[234]
marker to abi-note.o based on the ISA level used to compile abi-note.o,
assuming that the same ISA level is used to compile the whole glibc.
3. Add isa_1 to cpu_features to record the supported x86 ISA level.
4. Rename _dl_process_cet_property_note to _dl_process_property_note and
add GNU_PROPERTY_X86_ISA_1_V[234] marker detection.
5. Update _rtld_main_check and _dl_open_check to check loaded objects
with the incompatible ISA level.
6. Add a testcase to verify that dlopen an x86-64-v4 shared object fails
on lesser platforms.
7. Use <get-isa-level.h> in dl-hwcaps-subdirs.c and tst-glibc-hwcaps.c.

Tested under i686, x32 and x86-64 modes on x86-64-v2, x86-64-v3 and
x86-64-v4 machines.

Marked elf/tst-isa-level-1 with x86-64-v4, ran it on x86-64-v3 machine
and got:

[hjl@gnu-cfl-2 build-x86_64-linux]$ ./elf/tst-isa-level-1
./elf/tst-isa-level-1: CPU ISA level is lower than required
[hjl@gnu-cfl-2 build-x86_64-linux]$