git.ipfire.org Git - thirdparty/glibc.git/log

elf: Always call destructors in reverse constructor order (bug 30785)

The current implementation of dlclose (and process exit) re-sorts the
link maps before calling ELF destructors.  Destructor order is not the
reverse of the constructor order as a result: The second sort takes
relocation dependencies into account, and other differences can result
from ambiguous inputs, such as cycles.  (The force_first handling in
_dl_sort_maps is not effective for dlclose.)  After the changes in
this commit, there is still a required difference due to
dlopen/dlclose ordering by the application, but the previous
discrepancies went beyond that.

A new global (namespace-spanning) list of link maps,
_dl_init_called_list, is updated right before ELF constructors are
called from _dl_init.

In dl_close_worker, the maps variable, an on-stack variable length
array, is eliminated.  (VLAs are problematic, and dlclose should not
call malloc because it cannot readily deal with malloc failure.)
Marking still-used objects uses the namespace list directly, with
next and next_idx replacing the done_index variable.

After marking, _dl_init_called_list is used to call the destructors
of now-unused maps in reverse destructor order.  These destructors
can call dlopen.  Previously, new objects do not have l_map_used set.
This had to change: There is no copy of the link map list anymore,
so processing would cover newly opened (and unmarked) mappings,
unloading them.  Now, _dl_init (indirectly) sets l_map_used, too.
(dlclose is handled by the existing reentrancy guard.)

After _dl_init_called_list traversal, two more loops follow.  The
processing order changes to the original link map order in the
namespace.  Previously, dependency order was used.  The difference
should not matter because relocation dependencies could already
reorder link maps in the old code.

The changes to _dl_fini remove the sorting step and replace it with
a traversal of _dl_init_called_list.  The l_direct_opencount
decrement outside the loader lock is removed because it appears
incorrect: the counter manipulation could race with other dynamic
loader operations.

tst-audit23 needs adjustments to the changes in LA_ACT_DELETE
notifications.  The new approach for checking la_activity should
make it clearer that la_activty calls come in pairs around namespace
updates.

The dependency sorting test cases need updates because the destructor
order is always the opposite order of constructor order, even with
relocation dependencies or cycles present.

There is a future cleanup opportunity to remove the now-constant
force_first and for_fini arguments from the _dl_sort_maps function.

Fixes commit 1df71d32fe5f5905ffd5d100e5e9ca8ad62 ("elf: Implement
force_first handling in _dl_sort_maps_dfs (bug 28937)").

Reviewed-by: DJ Delorie <dj@redhat.com>
(cherry picked from commit 6985865bc3ad5b23147ee73466583dd7fdf65892)

elf: Do not run constructors for proxy objects

Otherwise, the ld.so constructor runs for each audit namespace
and each dlmopen namespace.

(cherry picked from commit f6c8204fd7fabf0cf4162eaf10ccf23258e4d10e)

io: Fix record locking contants for powerpc64 with __USE_FILE_OFFSET64

Commit 5f828ff824e3b7cd1 ("io: Fix F_GETLK, F_SETLK, and F_SETLKW for
powerpc64") fixed an issue with the value of the lock constants on
powerpc64 when not using __USE_FILE_OFFSET64, but it ended-up also
changing the value when using __USE_FILE_OFFSET64 causing an API change.

Fix that by also checking that define, restoring the pre
4d0fe291aed3a476a commit values:

Default values:
- F_GETLK: 5
- F_SETLK: 6
- F_SETLKW: 7

With -D_FILE_OFFSET_BITS=64:
- F_GETLK: 12
- F_SETLK: 13
- F_SETLKW: 14

At the same time, it has been noticed that there was no test for io lock
with __USE_FILE_OFFSET64, so just add one.

Tested on x86_64-linux-gnu, i686-linux-gnu and
powerpc64le-unknown-linux-gnu.

Resolves: BZ #30804.
Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
(cherry picked from commit 434bf72a94de68f0cc7fbf3c44bf38c1911b70cb)

hurd: Make exception subcode a long

On EXC_BAD_ACCESS, exception subcode is used to pass the faulting memory
address, so it needs to be (at least) pointer-sized. Thus, make it into
a long. This matches the corresponding change in GNU Mach.
Message-Id: <20230319151017.531737-5-bugaevc@gmail.com>

(cherry picked from commit d8ee5d614bc485f6d1752dfa0d60524b20945a56)

x86: Fix incorrect scope of setting `shared_per_thread` [BZ# 30745]

The:

```
    if (shared_per_thread > 0 && threads > 0)
      shared_per_thread /= threads;
```

Code was accidentally moved to inside the else scope.  This doesn't
match how it was previously (before af992e7abd).

This patch fixes that by putting the division after the `else` block.

(cherry picked from commit 084fb31bc2c5f95ae0b9e6df4d3cf0ff43471ede)

x86: Use `3/4*sizeof(per-thread-L3)` as low bound for NT threshold.

On some machines we end up with incomplete cache information. This can
make the new calculation of `sizeof(total-L3)/custom-divisor` end up
lower than intended (and lower than the prior value). So reintroduce
the old bound as a lower bound to avoid potentially regressing code
where we don't have complete information to make the decision.
Reviewed-by: DJ Delorie <dj@redhat.com>
(cherry picked from commit 8b9a0af8ca012217bf90d1dc0694f85b49ae09da)

x86: Fix slight bug in `shared_per_thread` cache size calculation.

After:
```
    commit af992e7abdc9049714da76cae1e5e18bc4838fb8
    Author: Noah Goldstein <goldstein.w.n@gmail.com>
    Date:   Wed Jun 7 13:18:01 2023 -0500

        x86: Increase `non_temporal_threshold` to roughly `sizeof_L3 / 4`
```

Split `shared` (cumulative cache size) from `shared_per_thread` (cache
size per socket), the `shared_per_thread` *can* be slightly off from
the previous calculation.

Previously we added `core` even if `threads_l2` was invalid, and only
used `threads_l2` to divide `core` if it was present. The changed
version only included `core` if `threads_l2` was valid.

This change restores the old behavior if `threads_l2` is invalid by
adding the entire value of `core`.
Reviewed-by: DJ Delorie <dj@redhat.com>
(cherry picked from commit 47f747217811db35854ea06741be3685e8bbd44d)

x86: Increase `non_temporal_threshold` to roughly `sizeof_L3 / 4`

Current `non_temporal_threshold` set to roughly '3/4 * sizeof_L3 /
ncores_per_socket'. This patch updates that value to roughly
'sizeof_L3 / 4`

The original value (specifically dividing the `ncores_per_socket`) was
done to limit the amount of other threads' data a `memcpy`/`memset`
could evict.

Dividing by 'ncores_per_socket', however leads to exceedingly low
non-temporal thresholds and leads to using non-temporal stores in
cases where REP MOVSB is multiple times faster.

Furthermore, non-temporal stores are written directly to main memory
so using it at a size much smaller than L3 can place soon to be
accessed data much further away than it otherwise could be. As well,
modern machines are able to detect streaming patterns (especially if
REP MOVSB is used) and provide LRU hints to the memory subsystem. This
in affect caps the total amount of eviction at 1/cache_associativity,
far below meaningfully thrashing the entire cache.

As best I can tell, the benchmarks that lead this small threshold
where done comparing non-temporal stores versus standard cacheable
stores. A better comparison (linked below) is to be REP MOVSB which,
on the measure systems, is nearly 2x faster than non-temporal stores
at the low-end of the previous threshold, and within 10% for over
100MB copies (well past even the current threshold). In cases with a
low number of threads competing for bandwidth, REP MOVSB is ~2x faster
up to `sizeof_L3`.

The divisor of `4` is a somewhat arbitrary value. From benchmarks it
seems Skylake and Icelake both prefer a divisor of `2`, but older CPUs
such as Broadwell prefer something closer to `8`. This patch is meant
to be followed up by another one to make the divisor cpu-specific, but
in the meantime (and for easier backporting), this patch settles on
`4` as a middle-ground.

Benchmarks comparing non-temporal stores, REP MOVSB, and cacheable
stores where done using:
https://github.com/goldsteinn/memcpy-nt-benchmarks

Sheets results (also available in pdf on the github):
https://docs.google.com/spreadsheets/d/e/2PACX-1vS183r0rW_jRX6tG_E90m9qVuFiMbRIJvi5VAE8yYOvEOIEEc3aSNuEsrFbuXw5c3nGboxMmrupZD7K/pubhtml
Reviewed-by: DJ Delorie <dj@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit af992e7abdc9049714da76cae1e5e18bc4838fb8)

sparc: Fix la_symbind for bind-now (BZ 23734)

The sparc ABI has multiple cases on how to handle JMP_SLOT relocations,
(sparc_fixup_plt/sparc64_fixup_plt).  For BINDNOW, _dl_audit_symbind
will be responsible to setup the final relocation value; while for
lazy binding _dl_fixup/_dl_profile_fixup will call the audit callback
and tail cail elf_machine_fixup_plt (which will call
sparc64_fixup_plt).

This patch fixes by issuing the SPARC specific routine on bindnow and
forwarding the audit value to elf_machine_fixup_plt for lazy resolution.
It fixes the la_symbind for bind-now tests on sparc64 and sparcv9:

  elf/tst-audit24a
  elf/tst-audit24b
  elf/tst-audit24c
  elf/tst-audit24d

Checked on sparc64-linux-gnu and sparcv9-linux-gnu.
Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
(cherry picked from commit dddc88587a7f48cbb361d9929ec23d790164eef8)

nptl: Fix tst-cancel30 on sparc64

As indicated by sparc kernel-features.h, even though sparc64 defines
__NR_pause, it is not supported (ENOSYS). Always use ppoll or the
64 bit time_t variant instead.

(cherry picked from commit 370da8a121c3ba9eeb2f13da15fc0f21f4136b25)

elf: _dl_find_object may return 1 during early startup (bug 30515)

Success is reported with a 0 return value, and failure is -1.
Enhance the kitchen sink test elf/tst-audit28 to cover
_dl_find_object as well.

Fixes commit 5d28a8962dcb ("elf: Add _dl_find_object function")
and bug 30515.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit 1bcfe0f732066ae5336b252295591ebe7e51c301)

realloc: Limit chunk reuse to only growing requests [BZ #30579]

The trim_threshold is too aggressive a heuristic to decide if chunk
reuse is OK for reallocated memory; for repeated small, shrinking
allocations it leads to internal fragmentation and for repeated larger
allocations that fragmentation may blow up even worse due to the dynamic
nature of the threshold.

Limit reuse only when it is within the alignment padding, which is 2 *
size_t for heap allocations and a page size for mmapped allocations.
There's the added wrinkle of THP, but this fix ignores it for now,
pessimizing that case in favor of keeping fragmentation low.

This resolves BZ #30579.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reported-by: Nicolas Dusart <nicolas@freedelity.be>
Reported-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
(cherry picked from commit 2fb12bbd092b0c10f1f2083216e723d2406e21c4)

hppa: xfail debug/tst-ssp-1 when have-ssp is yes (gcc-12 and later)

io: Fix F_GETLK, F_SETLK, and F_SETLKW for powerpc64

Different than other 64 bit architectures, powerpc64 defines the
LFS POSIX lock constants with values similar to 32 ABI, which
are meant to be used with fcntl64 syscall. Since powerpc64 kABI
does not have fcntl, the constants are adjusted with the
FCNTL_ADJUST_CMD macro.

The 4d0fe291aed3a476a changed the logic of generic constants
LFS value are equal to the default values; which is now wrong
for powerpc64.

Fix the value by explicit define the previous glibc constants
(powerpc64 does not need to use the 32 kABI value, but it simplifies
the FCNTL_ADJUST_CMD which should be kept as compatibility).

Checked on powerpc64-linux-gnu and powerpc-linux-gnu.

(cherry picked from commit 5f828ff824e3b7cd133ef905b8ae25ab8a8f3d66)

io: Fix record locking contants on 32 bit arch with 64 bit default time_t (BZ#30477)

For architecture with default 64 bit time_t support, the kernel
does not provide LFS and non-LFS values for F_GETLK, F_GETLK, and
F_GETLK (the default value used for 64 bit architecture are used).

This is might be considered an ABI break, but the currenct exported
values is bogus anyway.

The POSIX lockf is not affected since it is aliased to lockf64,
which already uses the LFS values.

Checked on i686-linux-gnu and the new tests on a riscv32.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit 4d0fe291aed3a476a3b59c4ecfae9d35ac0f15e8)

Document BZ #20975 fix

__check_pf: Add a cancellation cleanup handler [BZ #20975]

There are reports for hang in __check_pf:

https://github.com/JoeDog/siege/issues/4

It is reproducible only under specific configurations:

1. Large number of cores (>= 64) and large number of threads (> 3X of
the number of cores) with long lived socket connection.
2. Low power (frequency) mode.
3. Power management is enabled.

While holding lock, __check_pf calls make_request which calls __sendto
and __recvmsg.  Since __sendto and __recvmsg are cancellation points,
lock held by __check_pf won't be released and can cause deadlock when
thread cancellation happens in __sendto or __recvmsg.  Add a cancellation
cleanup handler for __check_pf to unlock the lock when cancelled by
another thread.  This fixes BZ #20975 and the siege hang issue.

(cherry picked from commit a443bd3fb233186038b8b483959ecb7978d1abea)

Ignore MAP_VARIABLE in tst-mman-consts.py

Linux 6.2 removed the hppa compatibility MAP_VARIABLE define. That
means that, whether or not we remove it in glibc, it needs to be
ignored in tst-mman-consts.py (since this macro comparison
infrastructure expects that new kernel header versions only add new
macros, not remove old ones).

Tested with build-many-glibcs.py for hppa-linux-gnu (Linux 6.2
headers).

(cherry picked from commit 01e09ab0574758e0afff4333511866278ce7c84f)

gmon: Revert addition of tunables to the manual

These tunables had to be removed over GLIBC_PRIVATE ABI stability
concerns.

gmon: Revert addition of tunables to preserve GLIBC_PRIVATE ABI

Otherwise, processes are likely to crash during concurrent updates
to a new glibc version on the stable release branch.

The test gmon/tst-mcount-overflow depends on those tunables, so
it has to be removed as well.

gmon: fix memory corruption issues [BZ# 30101]

V2 of this patch fixes an issue in V1, where the state was changed to ON not
OFF at end of _mcleanup. I hadn't noticed that (counterintuitively) ON=0 and
OFF=3, hence zeroing the buffer turned it back on. So set the state to OFF
after the memset.

1. Prevent double free, and reads from unallocated memory, when
   _mcleanup is (incorrectly) called two or more times in a row,
   without an intervening call to __monstartup; with this patch, the
   second and subsequent calls effectively become no-ops instead.
   While setting tos=NULL is minimal fix, safest action is to zero the
   whole gmonparam buffer.

2. Prevent memory leak when __monstartup is (incorrectly) called two
   or more times in a row, without an intervening call to _mcleanup;
   with this patch, the second and subsequent calls effectively become
   no-ops instead.

3. After _mcleanup, treat __moncontrol(1) as __moncontrol(0) instead.
   With zeroing of gmonparam buffer in _mcleanup, this stops the
   state incorrectly being changed to GMON_PROF_ON despite profiling
   actually being off. If we'd just done the minimal fix to _mcleanup
   of setting tos=NULL, there is risk of far worse memory corruption:
   kcount would point to deallocated memory, and the __profil syscall
   would make the kernel write profiling data into that memory,
   which could have since been reallocated to something unrelated.

4. Ensure __moncontrol(0) still turns off profiling even in error
   state. Otherwise, if mcount overflows and sets state to
   GMON_PROF_ERROR, when _mcleanup calls __moncontrol(0), the __profil
   syscall to disable profiling will not be invoked. _mcleanup will
   free the buffer, but the kernel will still be writing profiling
   data into it, potentially corrupted arbitrary memory.

Also adds a test case for (1). Issues (2)-(4) are not feasible to test.

Signed-off-by: Simon Kissane <skissane@gmail.com>
Reviewed-by: DJ Delorie <dj@redhat.com>
(cherry picked from commit bde121872001d8f3224eeafa5b7effb871c3fbca)

gmon: improve mcount overflow handling [BZ# 27576]

When mcount overflows, no gmon.out file is generated, but no message is printed
to the user, leaving the user with no idea why, and thinking maybe there is
some bug - which is how BZ 27576 ended up being logged. Print a message to
stderr in this case so the user knows what is going on.

As a comment in sys/gmon.h acknowledges, the hardcoded MAXARCS value is too
small for some large applications, including the test case in that BZ. Rather
than increase it, add tunables to enable MINARCS and MAXARCS to be overridden
at runtime (glibc.gmon.minarcs and glibc.gmon.maxarcs). So if a user gets the
mcount overflow error, they can try increasing maxarcs (they might need to
increase minarcs too if the heuristic is wrong in their case.)

Note setting minarcs/maxarcs too large can cause monstartup to fail with an
out of memory error. If you set them large enough, it can cause an integer
overflow in calculating the buffer size. I haven't done anything to defend
against that - it would not generally be a security vulnerability, since these
tunables will be ignored in suid/sgid programs (due to the SXID_ERASE default),
and if you can set GLIBC_TUNABLES in the environment of a process, you can take
it over anyway (LD_PRELOAD, LD_LIBRARY_PATH, etc). I thought about modifying
the code of monstartup to defend against integer overflows, but doing so is
complicated, and I realise the existing code is susceptible to them even prior
to this change (e.g. try passing a pathologically large highpc argument to
monstartup), so I decided just to leave that possibility in-place.

Add a test case which demonstrates mcount overflow and the tunables.

Document the new tunables in the manual.

Signed-off-by: Simon Kissane <skissane@gmail.com>
Reviewed-by: DJ Delorie <dj@redhat.com>
(cherry picked from commit 31be941e4367c001b2009308839db5c67bf9dcbc)

gmon: Fix allocated buffer overflow (bug 29444)

The `__monstartup()` allocates a buffer used to store all the data
accumulated by the monitor.

The size of this buffer depends on the size of the internal structures
used and the address range for which the monitor is activated, as well
as on the maximum density of call instructions and/or callable functions
that could be potentially on a segment of executable code.

In particular a hash table of arcs is placed at the end of this buffer.
The size of this hash table is calculated in bytes as
p->fromssize = p->textsize / HASHFRACTION;

but actually should be
p->fromssize = ROUNDUP(p->textsize / HASHFRACTION, sizeof(*p->froms));

This results in writing beyond the end of the allocated buffer when an
added arc corresponds to a call near from the end of the monitored
address range, since `_mcount()` check the incoming caller address for
monitored range but not the intermediate result hash-like index that
uses to write into the table.

It should be noted that when the results are output to `gmon.out`, the
table is read to the last element calculated from the allocated size in
bytes, so the arcs stored outside the buffer boundary did not fall into
`gprof` for analysis. Thus this "feature" help me to found this bug
during working with https://sourceware.org/bugzilla/show_bug.cgi?id=29438

Just in case, I will explicitly note that the problem breaks the
`make test t=gmon/tst-gmon-dso` added for Bug 29438.
There, the arc of the `f3()` call disappears from the output, since in
the DSO case, the call to `f3` is located close to the end of the
monitored range.

Signed-off-by: Леонид Юрьев (Leonid Yuriev) <leo@yuriev.ru>
Another minor error seems a related typo in the calculation of
`kcountsize`, but since kcounts are smaller than froms, this is
actually to align the p->froms data.

Co-authored-by: DJ Delorie <dj@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit 801af9fafd4689337ebf27260aa115335a0cb2bc)

posix: Fix system blocks SIGCHLD erroneously [BZ #30163]

Fix bug that SIGCHLD is erroneously blocked forever in the following
scenario:

1. Thread A calls system but hasn't returned yet
2. Thread B calls another system but returns

SIGCHLD would be blocked forever in thread B after its system() returns,
even after the system() in thread A returns.

Although POSIX does not require, glibc system implementation aims to be
thread and cancellation safe. This bug was introduced in
5fb7fc96350575c9adb1316833e48ca11553be49 when we moved reverting signal
mask to happen when the last concurrently running system returns,
despite that signal mask is per thread. This commit reverts this logic
and adds a test.

Signed-off-by: Adam Yi <ayi@janestreet.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit 436a604b7dc741fc76b5a6704c6cd8bb178518e7)

x86_64: Fix asm constraints in feraiseexcept (bug 30305)

The divss instruction clobbers its first argument, and the constraints
need to reflect that. Fortunately, with GCC 12, generated code does
not actually change, so there is no externally visible bug.

Suggested-by: Jakub Jelinek <jakub@redhat.com>
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
(cherry picked from commit 5d1ccdda7b0c625751661d50977f3dfbc73f8eae)

gshadow: Matching sgetsgent, sgetsgent_r ERANGE handling (bug 30151)

Before this change, sgetsgent_r did not set errno to ERANGE, but
sgetsgent only check errno, not the return value from sgetsgent_r.
Consequently, sgetsgent did not detect any error, and reported
success to the caller, without initializing the struct sgrp object
whose address was returned.

This commit changes sgetsgent_r to set errno as well. This avoids
similar issues in applications which only change errno.

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
(cherry picked from commit 969e9733c7d17edf1e239a73fa172f357561f440)

stdio-common: tests: don't double-define _FORTIFY_SOURCE

Exactly the same as 35bcb08eaa953c9b8bef6ab2486dc4361e1f26c0.

If using -D_FORITFY_SOURCE=3 (in my case, I've patched GCC to add
=3 instead of =2 (we've done =2 for years in Gentoo)), building
glibc tests will fail on tst-bz11319-fortify2 like:
```
<command-line>: error: "_FORTIFY_SOURCE" redefined [-Werror]
<built-in>: note: this is the location of the previous definition
cc1: all warnings being treated as errors
```

It's just because we're always setting -D_FORTIFY_SOURCE=2
rather than unsetting it first. If F_S is already 2, it's harmless,
but if it's another value (say, 1, or 3), the compiler will bawk.

(I'm not aware of a reason this couldn't be tested with =3,
but the toolchain support is limited for that (too new), and we want
to run the tests everywhere possible.)

As Siddhesh noted previously, we could implement some fallback
logic to determine the maximal F_S value supported by the toolchain,
which is a bit easier now that autoconf-archive has been updated for F_S=3
(https://github.com/autoconf-archive/autoconf-archive/pull/269), but let's
revisit this if it continues to crop up.

Signed-off-by: Sam James <sam@gentoo.org>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
(cherry picked from commit ecf8ae6704d5034fc2d5e29e5dc88dbca981581e)

elf: Restore ldconfig libc6 implicit soname logic [BZ #30125]

While cleaning up old libc version support, the deprecated libc4 code was
accidentally kept in `implicit_soname`, instead of the libc6 code.

This causes additional symlinks to be created by `ldconfig` for libraries
without a soname, e.g. a library `libsomething.123.456.789` without a soname
will create a `libsomething.123` -> `libsomething.123.456.789` symlink.

As the libc6 version of the `implicit_soname` code is a trivial `xstrdup`,
just inline it and remove `implicit_soname` altogether.

Some further simplification looks possible (e.g. the call to `create_links`
looks like a no-op if `soname == NULL`, other than the verbose printfs), but
logic is kept as-is for now.

Fixes: BZ #30125
Fixes: 8ee878592c4a ("Assume only FLAG_ELF_LIBC6 suport")
Signed-off-by: Joan Bruguera <joanbrugueram@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit 1b0ea8c5d886fedabd611a569b5ec58a6f5153e6)

stdlib: Undo post review change to 16adc58e73f3 [BZ #27749]

Post review removal of "goto restart" from
https://sourceware.org/pipermail/libc-alpha/2021-April/125470.html
introduced a bug when some atexit handers skipped.

Signed-off-by: Vitaly Buka <vitalybuka@google.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit fd78cfa72ea2bab30fdb4e1e0672b34471426c05)

elf: Smoke-test ldconfig -p against system /etc/ld.so.cache

The test is sufficient to detect the ldconfig bug fixed in
commit 9fe6f6363886aae6b2b210cae3ed1f5921299083 ("elf: Fix 64 time_t
support for installed statically binaries").

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit 9fd63e35371b9939e9153907c6a753e6960b68ad)

NEWS: Document CVE-2023-25139.

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
(cherry picked from commit 67c37737ed474d25fd4dc535dfd822c426e6b971)

Account for grouping in printf width (bug 30068)

This is a partial fix for mishandling of grouping when formatting
integers. It properly computes the width in the presence of grouping
characters when the width is larger than the number of significant
digits. The precision related issue is documented in bug 23432.

Co-authored-by: Andreas Schwab <schwab@suse.de>
(cherry picked from commit c980549cc6a1c03c23cc2fe3e7b0fe626a0364b0)

Use 64-bit time_t interfaces in strftime and strptime (bug 30053)

Both functions use time_t only internally, so the ABI is not affected.

(cherry picked from commit 41349f6f67c83e7bafe49f985b56493d2c4c9c77)

LoongArch: Add new relocation types.

cdefs: Limit definition of fortification macros

Define the __glibc_fortify and other macros only when __FORTIFY_LEVEL >
0. This has the effect of not defining these macros on older C90
compilers that do not have support for variable length argument lists.

Also trim off the trailing backslashes from the definition of
__glibc_fortify and __glibc_fortify_n macros.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
(cherry picked from commit 2337e04e21ba6040926ec871e403533f77043c40)

Create ChangeLog.old/ChangeLog.26.

Prepare for glibc 2.37 release.

Update version.h, and include/features.h.

x86: Fix strncat-avx2.S reading past length [BZ #30065]

Occurs when `src` has no null-term.

Two cases:

1) Zero-length check is doing:
```
    test    %rdx, %rdx
    jl      L(zero_len)
```
which doesn't actually check zero (was at some point `decq` and the
flag never got updated).

The fix is just make the flag `jle` i.e:
```
    test    %rdx, %rdx
    jle     L(zero_len)
```

2) Length check in page-cross case checking if we should continue is
doing:
```
    cmpq    %r8, %rdx
    jb      L(page_cross_small)
```
which means we will continue searching for null-term if length ends at
the end of a page and there was no null-term in `src`.

The fix is to make the flag:
```
    cmpq    %r8, %rdx
    jbe     L(page_cross_small)
```

Update install.texi, and regenerate INSTALL.

Update manual/contrib.texi.

Thank Yinyu Cai for their maintainership of the LoongArch port.

Thank Vineet Gupta for their maintainership of the ARC port.

Thank Tulio Magno Quites Machado Filho for their past maintainership
of the PowerPC port.

Thank Rajalakshmi Srinivasaraghavan for their current maintainership
of the PowerPC port.

Update NEWS file with bug fixes.

Regenerate configure.

Run using vanilla upstream autoconf 2.69.

Minor whitespace change to sysdeps/loongarch/configure and
sysdeps/mach/configure, and nothing else.

Update all PO files in preparation for release.

doc: correct _FORTIFY_SOURCE doc in features.h

libio: Update number of written bytes in dprintf implementation

The __printf_buffer_flush_dprintf function needs to record that
the buffer has been written before reusing it. Without this
accounting, dprintf always returns zero.

Fixes commit 8ece45e4f586abd212d1c02d74d38ef681a45600
("libio: Convert __vdprintf_internal to buffers").

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>

Account for octal marker in %#o format

Use binutils 2.40 branch in build-many-glibcs.py

This patch makes build-many-glibcs.py use binutils 2.40 branch.

Tested with build-many-glibcs.py (host-libraries, compilers and glibcs
builds).

Use MPFR 4.2.0, MPC 1.3.1 in build-many-glibcs.py

This patch makes build-many-glibcs.py use the new MPFR 4.2.0 and MPC
1.3.1 releases.

Tested with build-many-glibcs.py (host-libraries, compilers and glibcs
builds).

stdio-common: Handle -1 buffer size in __sprintf_chk & co (bug 30039)

This shows up as an assertion failure when sprintf is called with
a specifier like "%.8g" and libquadmath is linked in:

Fatal glibc error: printf_buffer_as_file.c:31
(__printf_buffer_as_file_commit): assertion failed:
file->stream._IO_write_ptr <= file->next->write_end

Fix this by detecting pointer wraparound in __vsprintf_internal
and saturate the addition to the end of the address space instead.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>

Document '%F' format specifier

The '%F' format specifier was implemented in commit 6c46718f9f0 on
2000-08-23, but remains undocumented in the manual.
https://stackoverflow.com/questions/75157669/format-specifier-f-missing-from-glibcs-documentation

Fix that.

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>

sparc (64bit): Regenerate ulps

Linux catbus 5.15.69-gentoo #1 SMP Sat Sep 24 07:56:24 PDT 2022 sparc64 sun4v UltraSparc T5 (Niagara5) GNU/Linux
gcc (Gentoo 11.3.1_p20221209 p3) 11.3.1 20221209
GNU ld (Gentoo 2.38 p4) 2.38
Reviewed-by: Carlos O'Donell <carlos@redhat.com>

ia64: Regenerate ulps

Linux guppy 5.13.0-00002-gdecb01746d6c #368 SMP Sat Aug 14 20:10:13 UTC 2021 ia64 Dual-Core Intel(R) Itanium(R) Processor 9040 GenuineIntel GNU/Linux
gcc (Gentoo 12.2.1_p20221231 p8) 12.2.1 20221231
GNU ld (Gentoo 2.40 p1) 2.40
Reviewed-by: Carlos O'Donell <carlos@redhat.com>

Update libc.pot for 2.37 release.

x86: Cache computation for AMD architecture.

All AMD architectures cache details will be computed based on
__cpuid__ `0x8000_001D` and the reference to __cpuid__ `0x8000_0006` will be
zeroed out for future architectures.

Reviewed-by: Premachandra Mallappa <premachandra.mallappa@amd.com>

manual: Fix typo

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>

Add STATX_DIOALIGN from Linux 6.1 to bits/statx-generic.h

Linux 6.1 adds a new STATX_DIOALIGN constant. Add it to glibc's
bits/statx-generic.h.

Tested for x86_64.

Add IPPROTO_L2TP from Linux 6.1 to netinet/in.h

Linux 6.1 adds a define IPPROTO_L2TP to its include/uapi/linux/in.h
(not strictly a new constant, since it's moved from
include/uapi/linux/l2tp.h). Add this constant to glibc's
netinet/in.h.

Tested for x86_64.

AArch64: Improve strrchr

Use shrn for narrowing the mask which simplifies code and speeds up small
strings. Unroll the first search loop to improve performance on large
strings.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>

AArch64: Optimize strnlen

Optimize strnlen using the shrn instruction and improve the main loop.
Small strings are around 10% faster, large strings are 40% faster on
modern CPUs.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>

AArch64: Optimize strlen

Optimize strlen by unrolling the main loop. Large strings are 64% faster on
modern CPUs.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>

AArch64: Optimize strcpy

Unroll the main loop. Large strings are around 20% faster on modern CPUs.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>

AArch64: Improve strchrnul

Unroll the main loop, which improves performance slightly.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>

AArch64: Optimize strchr

Simplify calculation of the mask using shrn. Unroll the main loop.
Small strings are 20% faster on modern CPUs.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>

AArch64: Improve strlen_asimd

Use shrn for the mask, merge tst+bne into cbnz, and tweak code alignment.
Performance improves slightly as a result.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>

AArch64: Optimize memrchr

Optimize the main loop - large strings are 43% faster on modern CPUs.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>

AArch64: Optimize memchr

Optimize the main loop - large strings are 40% faster on modern CPUs.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>

hurd: Fix _NOFLSH value

shifting 1 (thus an integer) left 31 bit is undefined behavior. We have to
make it an unsigned integer to properly get 0x80000000 (like done in other
places).

elf: Fix GL(dl_phdr) and GL(dl_phnum) for static builds [BZ #29864]

The 73fc4e28b9464f0e refactor did not add the GL(dl_phdr) and
GL(dl_phnum) for static build, relying on the __ehdr_start symbol,
which is always added by the static linker, to get the correct values.

This is problematic in some ways:

  - The segment may see its in-memory size differ from its in-file
    size (or the binary may have holes).  The Linux has fixed is to
    provide concise values for both AT_PHDR and AT_PHNUM (commit
    0da1d5002745c - "fs/binfmt_elf: Fix AT_PHDR for unusual ELF files")

  - Some archs (alpha for instance) the hidden weak reference is not
    correctly pulled by the static linker and  __ehdr_start address
    end up being 0, which makes GL(dl_phdr) and GL(dl_phnum) have both
    invalid values (and triggering a segfault later on libc.so while
    accessing TLS variables).

The safer fix is to just restore the previous behavior to setup
GL(dl_phdr) and GL(dl_phnum) for static based on kernel auxv.  The
__ehdr_start fallback can also be simplified by not assuming weak
linkage (as for PIE).

The libc-static.c auxv init logic is moved to dl-support.c, since
the later is build without SHARED and then GLRO macro is defined
to access the variables directly.

The _dl_phdr is also assumed to be always non NULL, since an invalid
NULL values does not trigger TLS initialization (which is used in
various libc systems).

Checked on aarch64-linux-gnu, x86_64-linux-gnu, and i686-linux-gnu.

Reviewed-by: Florian Weimer <fweimer@redhat.com>

string: Suppress -Wmaybe-unitialized for wordcopy [BZ #19444]

When compiling with GCC 6+ the sparc build warns that some variables
might be used uninitialized. However it does not seem the fact, since
the variables are really initialized (and also other targets that use the
same code, like powerpc, do not warn about it).

So suppress the warning for now.

Changes from v1:
* Update patch description and the explanation for the suppresion.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>

scripts/build-many-glibcs.py: Remove unused RANLIB and STRIP option

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

configure: Move nm, objdump, and readelf to LIBC_PROG_BINUTILS

Allow the variables to be overriden or have the defaults come
from the compiler currently in use.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>

configure: Allow user override LD, AR, OBJCOPY, and GPROF

The only way to override LD, AR, OBJCOPY, and GPROF is through
--with-binutils (setting the environments variables on configure is
overridden by LIBC_PROG_BINUTILS).

The build-many-glibcs.py (bmg) glibcs option generates a working config,
but not fully concise (some tools will be set from environment variable,
while other will be set from $CC --print-prog-name). So remove the
environment variable set to always use the "$CC --print-prog-name".
Reviewed-by: Carlos O'Donell <carlos@redhat.com>

math: Suppress -O0 warnings for soft-fp fsqrt [BZ #19444]

The patch suppress the same warnings from 87c266d758d29e52bfb717f90,
that shows issues for microblaze, mips soft-fp, nios2, and or1k.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>

sunrpc: Suppress GCC -O1 warning on user2netname [BZ #19444]

The same issue described by 6128e82ebe973163d2dd614d31753c88c0c4d645
also happend with -O1.

Checked on x86_64-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>

locale: Use correct buffer size for utf8_sequence_error [BZ #19444]

The buffer used by snprintf might not be large enough for all possible
inputs, as indicated by gcc with -O1:

../locale/programs/linereader.c: In function ‘utf8_sequence_error’:
../locale/programs/linereader.c:713:58: error: ‘%02x’ directive output
may be truncated writing between 2 and 8 bytes into a region of size
between 1 and 13 [-Werror=format-truncation=]
  713 |     snprintf (buf, sizeof (buf), "0x%02x 0x%02x 0x%02x 0x%02x",
      |                                                          ^~~~
../locale/programs/linereader.c:713:34: note: directive argument in the
range [0, 2147483647]
  713 |     snprintf (buf, sizeof (buf), "0x%02x 0x%02x 0x%02x 0x%02x",
      |                                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../locale/programs/linereader.c:713:5: note: ‘snprintf’ output between
20 and 38 bytes into a destination of size 30
  713 |     snprintf (buf, sizeof (buf), "0x%02x 0x%02x 0x%02x 0x%02x",
      |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  714 |               ch1, ch2, ch3, ch4);
      |               ~~~~~~~~~~~~~~~~~~~

Checked on x86_64-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>

Add HWCAP2_SVE_EBF16 from Linux 6.1 to AArch64 bits/hwcap.h

Linux 6.1 adds a new AArch64 HWCAP2 value HWCAP2_SVE_EBF16; add it to
the corresponding bits/hwcap.h.

Tested with build-many-glibcs.py for aarch64.

Add _FORTIFY_SOURCE implementation documentation [BZ #28998]

There have been multiple requests to provide more detail on how the
_FORTIFY_SOURCE macro works, so this patch adds a new node in the
Library Maintenance section that does this. A lot of the description is
implementation detail, which is why I put this in the appendix and not
in the main documentation.

Resolves: BZ #28998.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Florian Weimer <fweimer@redhat.com>

Update copyright dates not handled by scripts/update-copyrights

I've updated copyright dates in glibc for 2023. This is the patch for
the changes not generated by scripts/update-copyrights and subsequent
build / regeneration of generated files.

Update copyright dates with scripts/update-copyrights

Remove trailing whitespace in gmp.h

Remove trailing whitespace

For some reason this causes a pre-commit check error for a copyright
date update commit, even though that commit doesn't touch anything
near the line with this whitespace.

C2x semantics for <tgmath.h>

<tgmath.h> implements semantics for integer generic arguments that
handle cases involving _FloatN / _FloatNx types as specified in TS
18661-3 plus some defect fixes.

C2x has further changes to the semantics for <tgmath.h> macros with
such types, which should also be considered defect fixes (although
handled through the integration of TS 18661-3 in C2x rather than
through an issue tracking process).  Specifically, the rules were
changed because of problems raised with using the macros with the
evaluation format types such as float_t and _Float32_t: the older
version of the rules didn't allow passing _FloatN / _FloatNx types to
the narrowing macros returning float or double, or passing float /
double / long double to the narrowing macros returning _FloatN /
_FloatNx, which was a problem with the evaluation format types which
could be either kind of type depending on the value of
FLT_EVAL_METHOD.

Thus the new rules allow cases of mixing types which were not allowed
before, and, as part of the changes, the handling of integer arguments
was also changed: if there is any _FloatNx generic argument, integer
generic arguments are treated as _Float32x (not double), while the
rule about treating integer arguments to narrowing macros returning
_FloatN or _FloatNx as _Float64 not double was removed (no longer
needed now double is a valid argument to such macros).

I've implemented the changes in GCC's __builtin_tgmath, which thus
requires updates to glibc's test expectations so that the tests
continue to build with GCC 13 (the test is also updated to test the
argument types that weren't allowed before but are now valid under C2x
rules).

Given those test changes, it's then also necessary to fix the
implementations in <tgmath.h> to have appropriate semantics with older
GCC so that the tests pass with GCC versions before GCC 13 as well.
For some cases (non-narrowing macros with two or three generic
arguments; narrowing macros returning _Float32x), the older version of
__builtin_tgmath doesn't correspond sufficiently well to C2x
semantics, so in those cases <tgmath.h> is adjusted to use the older
macro implementation instead of __builtin_tgmath.  The older macro
implementation is itself adjusted to give the desired semantics, with
GCC 7 and later.  (It's not possible to get the right semantics in all
cases for the narrowing macros with GCC 6 and before when the _FloatN
/ _FloatNx names are typedefs rather than distinct types.)

Tested as follows: with the full glibc testsuite for x86_64, GCC 6, 7,
11, 13; with execution of the math/tests for aarch64, arm, powerpc and
powerpc64le, GCC 6, 7, 12 and 13 (powerpc64le only with GCC 12 and
13); with build-many-glibcs.py with GCC 6, 7, 12 and 13.

time: Set daylight to 1 for matching DST/offset change (bug 29951)

The daylight variable is supposed to be set to 1 if DST is ever in
use for the current time zone. But __tzfile_read used to do this:

__daylight = rule_stdoff != rule_dstoff;

This check can fail to set __daylight to 1 if the DST and non-DST
offsets happen to be the same.

Fix ldbl-128 built-in function use

Fix the following issues with built-in function use in
sysdeps/ieee754/ldbl-128 and sysdeps/ieee754/float128:

* fabsl used __builtin_fabsf128 unconditionally, breaking the build
  with GCC 6 for several architectures; it should use __builtin_fabsl
  with an appropriate redirection in float128_private.h.  (I'm not
  particularly concerned with building glibc with GCC 6; rather, I
  want to be able to run the tgmath.h tests with GCC 6, which is a
  significantly different case for tgmath.h compared to GCC 7 and
  later because of the lack of _FloatN / _FloatNx support in the
  compiler, and at present running the tests with a compiler means
  building glibc with that compiler.)

* Some (conditional) uses of built-in functions had been added to
  ldbl-128 without appropriate float128_private.h remapping (there was
  remapping for the macros controlling whether the built-in functions
  are used, just not for the functions themselves).

* s_llrintl.c called __builtin_round not __builtin_llrintl, which is
  obviously wrong.

Tested with build-many-glibcs.py for aarch64-linux-gnu, GCC 6 (where
it fixes the glibc build) and GCC 12, and with the glibc testsuite for
x86_64.

x86: Check minimum/maximum of non_temporal_threshold [BZ #29953]

The minimum non_temporal_threshold is 0x4040. non_temporal_threshold may
be set to less than the minimum value when the shared cache size isn't
available (e.g., in an emulator) or by the tunable. Add checks for
minimum and maximum of non_temporal_threshold.

This fixes BZ #29953.

i686: Regenerate ulps

Reviewed-by: Florian Weimer <fweimer@redhat.com>

hurd getcwd: Fix memory leak on error

hurd fcntl: Make LOCKED macro more robust

hurd: Make dl-sysdep __sbrk check __vm_allocate call

The caller won't be able to progress, but better crash than use random
addr.

htl: Drop duplicate check in __pthread_stack_alloc

hurd hurdstartup: Initialize remaining fields of hurd_startup_data

In case we don't have a bootstrap port or __exec_startup_get_info
failed, we should avoid leaking uninitialized fields of data.

hurd _S_msg_add_auth: Initialize new arrays to 0

If make_list fails, they would be undefined, and freeup with free
uninitialized pointers.

htl: Check error returned by __getrlimit

getdelim: ensure error indicator is set on error (bug 29917)

POSIX requires that getdelim and getline set the error indicator on the
stream when an error occured, in addition to setting errno.

htl: Fix sem_wait race between read and gsync_wait

If the value changes between sem_wait's read and the gsync_wait call,
the kernel will return KERN_INVALID_ARGUMENT, which we have to interpret
as the value having already changed.

This fixes applications (e.g. libgo) seeing sem_wait erroneously return
KERN_INVALID_ARGUMENT.

Avoid use of atoi in malloc

This patch is analogous to commit
a3708cf6b0a5a68e2ed1ce3db28a03ed21d368d2.

atoi has undefined behavior on out-of-range input, which makes it
problematic to use anywhere in glibc that might be processing input
out-of-range for atoi but not specified to produce undefined behavior
for the function calling atoi. In conjunction with the C2x strtol
changes, use of atoi in libc can also result in localplt test failures
because the redirection for strtol does not interact properly with the
libc_hidden_proto call for __isoc23_strtol for the call in the inline
atoi implementation.

In malloc/arena.c, this issue shows up for atoi calls that are only
compiled for --disable-tunables (thus with the
x86_64-linux-gnu-minimal configuration of build-many-glibcs.py, for
example). Change those atoi calls to use strtol directly, as in the
previous such changes.

Tested for x86_64 (--disable-tunables).

Linux: Pass size argument of epoll_create to the kernel

The kernel actually verifies it, and a garbage value in the register
causes improper system call failures.

Fixes commit c1c0dea38833751f36a145c32 ("Linux: Remove epoll_create,
inotify_init from syscalls.list") and commit d1d23b134244d59c4d6ef2295
("Lninux: consolidate epoll_create implementation").

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

Simplify scripts/cross-test-ssh.sh configuration.

With modern ssh clients and daemons it is required to use AcceptEnv and
SendEnv configuration options to correctly support testing the DSO sort
ordering tests. This requirement is present because
scripts/dso-ordering-test.py injects GLIBC_TUNABLES to the left of the
${test_wrapper_env} and so it must both be sent by the ssh client and
accepted by the ssh daemon. This requirement is removed in this change
and the injected GLIBC_TUNABLES is placed after ${run_program_env} and
so still correctly provides the override that the test requires.
This is similar to existing tests like elf/tst-pathopt.sh,
elf/tst-rtld-load-self.sh, and locale/tst-locale-locpath.sh.

Tested that it fixes two failures when cross-testing on aarch64 with
scripts/cross-test-ssh.sh and an ssh client and daemon that do not pass
GLIBC_TUNABLES. Without this fix such a configuration will report the
following failures (since the GLIBC_TUNABLES not preserved):
FAIL: elf/tst-bz15311
FAIL: elf/tst-bz28937

Tested without regression on native x86_64 and aarch64 builds.

Reviewed-by: Florian Weimer <fweimer@redhat.com>

Define MADV_COLLAPSE from Linux 6.1

Add the MADV_COLLAPSE constant from Linux 6.1 to bits/mman-linux.h and
the hppa bits/mman.h.

Tested for x86_64.

powerpc64: Increase SIGSTKSZ and MINSIGSTKSZ

This patch increases the value of SIGSTKSZ and MINSIGSTKSZ
for powerpc64 similar to the kernel commit
2f82ec19757f58549467db568c56e7dfff8af283 to allow
further expansion of the signal stack frame size.