git.ipfire.org Git - thirdparty/glibc.git/log

AArch64: Improve codegen for SVE logs

Reduce memory access by using lanewise MLA and moving constants to struct
and reduce number of MOVPRFXs.
Update maximum ULP error for double log_sve from 1 to 2.
Speedup on Neoverse V1 for log (3%), log2 (5%), and log10 (4%).

(cherry picked from commit 32d193a372feb28f9da247bb7283d404b84429c6)

AArch64: Improve codegen in SVE tans

Improves memory access.
Tan: MOVPRFX 7 -> 2, LD1RD 12 -> 5, move MOV away from return.
Tanf: MOV 2 -> 1, MOVPRFX 6 -> 3, LD1RW 5 -> 4, move mov away from return.

(cherry picked from commit aa6609feb20ebf8653db639dabe2a6afc77b02cc)

AArch64: Improve codegen of AdvSIMD atan(2)(f)

Load the polynomial evaluation coefficients into 2 vectors and use lanewise MLAs.
8% improvement in throughput microbenchmark on Neoverse V1.

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
(cherry picked from commit 6914774b9d3460876d9ad4482782213ec01a752e)

AArch64: Improve codegen of AdvSIMD logf function family

Load the polynomial evaluation coefficients into 2 vectors and use lanewise MLAs.
8% improvement in throughput microbenchmark on Neoverse V1 for log2 and log,
and 2% for log10.

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
(cherry picked from commit d6e034f5b222a9ed1aeb5de0c0c7d0dda8b63da3)

AArch64: Improve codegen in AdvSIMD logs

Remove spurious ADRP and a few MOVs.
Reduce memory access by using more indexed MLAs in polynomial.
Align notation so that algorithms are easier to compare.
Speedup on Neoverse V1 for log10 (8%), log (8.5%), and log2 (10%).
Update error threshold in AdvSIMD log (now matches SVE log).

(cherry picked from commit 8eb5ad2ebc94cc5bedbac57c226c02ec254479c7)

AArch64: Simplify rounding-multiply pattern in several AdvSIMD routines

This operation can be simplified to use simpler multiply-round-convert
sequence, which uses fewer instructions and constants.

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
(cherry picked from commit 16a59571e4e9fd019d3fc23a2e7d73c1df8bb5cb)

aarch64: Avoid redundant MOVs in AdvSIMD F32 logs

Since the last operation is destructive, the first argument to the FMA
also has to be the first argument to the special-case in order to
avoid unnecessary MOVs. Reorder arguments and adjust special-case
bounds to facilitate this.

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
(cherry picked from commit 8b09af572b208bfde4d31c6abbae047dcc217675)

aarch64: Fix AdvSIMD libmvec routines for big-endian

Previously many routines used * to load from vector types stored
in the data table. This is emitted as ldr, which byte-swaps the
entire vector register, and causes bugs for big-endian when not
all lanes contain the same value. When a vector is to be used
this way, it has been replaced with an array and the load with an
explicit ld1 intrinsic, which byte-swaps only within lanes.

As well, many routines previously used non-standard GCC syntax
for vector operations such as indexing into vectors types with []
and assembling vectors using {}. This syntax should not be mixed
with ACLE, as the former does not respect endianness whereas the
latter does. Such examples have been replaced with, for instance,
vcombine_* and vgetq_lane* intrinsics. Helpers which only use the
GCC syntax, such as the v_call helpers, do not need changing as
they do not use intrinsics.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
(cherry picked from commit 90a6ca8b28bf34e361e577e526e1b0f4c39a32a5)

assert: Add test for CVE-2025-0395

Use the __progname symbol to override the program name to induce the
failure that CVE-2025-0395 describes.

This is related to BZ #32582

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit cdb9ba84191ce72e86346fb8b1d906e7cd930ea2)

stdlib: Test using setenv with updated environ [BZ #32588]

Add a test for setenv with updated environ. Verify that BZ #32588 is
fixed.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
(cherry picked from commit 8ab34497de14e35aff09b607222fe1309ef156da)

malloc: obscure calloc use in tst-calloc

Similar to a9944a52c967ce76a5894c30d0274b824df43c7a and
f9493a15ea9cfb63a815c00c23142369ec09d8ce, we need to hide calloc use from
the compiler to accommodate GCC's r15-6566-g804e9d55d9e54c change.

First, include tst-malloc-aux.h, but then use `volatile` variables
for size.

The test passes without the tst-malloc-aux.h change but IMO we want
it there for consistency and to avoid future problems (possibly silent).

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit c3d1dac96bdd10250aa37bb367d5ef8334a093a1)

Hide all malloc functions from compiler [BZ #32366]

Since -1 isn't a power of two, compiler may reject it, hide memalign from
Clang 19 which issues an error:

tst-memalign.c:86:31: error: requested alignment is not a power of 2 [-Werror,-Wnon-power-of-two-alignment]
   86 |   p = memalign (-1, pagesize);
      |                 ^~
tst-memalign.c:86:31: error: requested alignment must be 4294967296 bytes or smaller; maximum alignment assumed [-Werror,-Wbuiltin-assume-aligned-alignment]
   86 |   p = memalign (-1, pagesize);
      |                 ^~

Update tst-malloc-aux.h to hide all malloc functions and include it in
all malloc tests to prevent compiler from optimizing out any malloc
functions.

Tested with Clang 19.1.5 and GCC 15 20241206 for BZ #32366.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
(cherry picked from commit f9493a15ea9cfb63a815c00c23142369ec09d8ce)

Fix underallocation of abort_msg_s struct (CVE-2025-0395)

Include the space needed to store the length of the message itself, in
addition to the message string. This resolves BZ #32582.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit 68ee0f704cb81e9ad0a78c644a83e1e9cd2ee578)

x86/string: Fixup alignment of main loop in str{n}cmp-evex [BZ #32212]

The loop should be aligned to 32-bytes so that it can ideally run out
the DSB. This is particularly important on Skylake-Server where
deficiencies in it's DSB implementation make it prone to not being
able to run loops out of the DSB.

For example running strcmp-evex on 200Mb string:

32-byte aligned loop:
    - 43,399,578,766      idq.dsb_uops
not 32-byte aligned loop:
    - 6,060,139,704       idq.dsb_uops

This results in a 25% performance degradation for the non-aligned
version.

The fix is to just ensure the code layout is such that the loop is
aligned. (Which was previously the case but was accidentally dropped
in 84e7c46df).

NB: The fix was actually 64-byte alignment. This is because 64-byte
alignment generally produces more stable performance than 32-byte
aligned code (cache line crosses can affect perf), so if we are going
past 16-byte alignmnent, might as well go to 64. 64-byte alignment
also matches most other functions we over-align, so it creates a
common point of optimization.

Times are reported as ratio of Time_With_Patch /
Time_Without_Patch. Lower is better.

The values being reported is the geometric mean of the ratio across
all tests in bench-strcmp and bench-strncmp.

Note this patch is only attempting to improve the Skylake-Server
strcmp for long strings. The rest of the numbers are only to test for
regressions.

Tigerlake Results Strings <= 512:
    strcmp : 1.026
    strncmp: 0.949

Tigerlake Results Strings > 512:
    strcmp : 0.994
    strncmp: 0.998

Skylake-Server Results Strings <= 512:
    strcmp : 0.945
    strncmp: 0.943

Skylake-Server Results Strings > 512:
    strcmp : 0.778
    strncmp: 1.000

The 2.6% regression on TGL-strcmp is due to slowdowns caused by
changes in alignment of code handling small sizes (most on the
page-cross logic). These should be safe to ignore because 1) We
previously only 16-byte aligned the function so this behavior is not
new and was essentially up to chance before this patch and 2) this
type of alignment related regression on small sizes really only comes
up in tight micro-benchmark loops and is unlikely to have any affect
on realworld performance.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit 483443d3211532903d7e790211af5a1d55fdb1f3)

x86: Improve large memset perf with non-temporal stores [RHEL-29312]

Previously we use `rep stosb` for all medium/large memsets. This is
notably worse than non-temporal stores for large (above a
few MBs) memsets.
See:
https://docs.google.com/spreadsheets/d/1opzukzvum4n6-RUVHTGddV6RjAEil4P2uMjjQGLbLcU/edit?usp=sharing
For data using different stategies for large memset on ICX and SKX.

Using non-temporal stores can be up to 3x faster on ICX and 2x faster
on SKX. Historically, these numbers would not have been so good
because of the zero-over-zero writeback optimization that `rep stosb`
is able to do. But, the zero-over-zero writeback optimization has been
removed as a potential side-channel attack, so there is no longer any
good reason to only rely on `rep stosb` for large memsets. On the flip
size, non-temporal writes can avoid data in their RFO requests saving
memory bandwidth.

All of the other changes to the file are to re-organize the
code-blocks to maintain "good" alignment given the new code added in
the `L(stosb_local)` case.

The results from running the GLIBC memset benchmarks on TGL-client for
N=20 runs:

Geometric Mean across the suite New / Old EXEX256: 0.979
Geometric Mean across the suite New / Old EXEX512: 0.979
Geometric Mean across the suite New / Old AVX2   : 0.986
Geometric Mean across the suite New / Old SSE2   : 0.979

Most of the cases are essentially unchanged, this is mostly to show
that adding the non-temporal case didn't add any regressions to the
other cases.

The results on the memset-large benchmark suite on TGL-client for N=20
runs:

Geometric Mean across the suite New / Old EXEX256: 0.926
Geometric Mean across the suite New / Old EXEX512: 0.925
Geometric Mean across the suite New / Old AVX2   : 0.928
Geometric Mean across the suite New / Old SSE2   : 0.924

So roughly a 7.5% speedup. This is lower than what we see on servers
(likely because clients typically have faster single-core bandwidth so
saving bandwidth on RFOs is less impactful), but still advantageous.

Full test-suite passes on x86_64 w/ and w/o multiarch.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit 5bf0ab80573d66e4ae5d94b094659094336da90f)

x86: Avoid integer truncation with large cache sizes (bug 32470)

Some hypervisors report 1 TiB L3 cache size. This results
in some variables incorrectly getting zeroed, causing crashes
in memcpy/memmove because invariants are violated.

(cherry picked from commit 61c3450db96dce96ad2b24b4f0b548e6a46d68e5)

math: Exclude internal math symbols for tests [BZ #32414]

Since internal tests don't have access to internal symbols in libm,
exclude them for internal tests. Also make tst-strtod5 and tst-strtod5i
depend on $(libm) to support older versions of GCC which can't inline
copysign family functions. This fixes BZ #32414.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>
(cherry picked from commit 5df09b444835fca6e64b3d4b4a5beb19b3b2ba21)

malloc: add indirection for malloc(-like) functions in tests [BZ #32366]

GCC 15 introduces allocation dead code removal (DCE) for PR117370 in
r15-5255-g7828dc070510f8. This breaks various glibc tests which want
to assert various properties of the allocator without doing anything
obviously useful with the allocated memory.

Alexander Monakov rightly pointed out that we can and should do better
than passing -fno-malloc-dce to paper over the problem. Not least because
GCC 14 already does such DCE where there's no testing of malloc's return
value against NULL, and LLVM has such optimisations too.

Handle this by providing malloc (and friends) wrappers with a volatile
function pointer to obscure that we're calling malloc (et. al) from the
compiler.

Reviewed-by: Paul Eggert <eggert@cs.ucla.edu>
(cherry picked from commit a9944a52c967ce76a5894c30d0274b824df43c7a)

Pass -nostdlib -nostartfiles together with -r [BZ #31753]

Since -r in GCC 6/7/8 doesn't imply -nostdlib -nostartfiles, update the
link-static-libc.out rule to also pass -nostdlib -nostartfiles. This
fixes BZ #31753.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
(cherry picked from commit 2be3352f0b1ebaa39596393fffe1062275186669)

nptl: initialize cpu_id_start prior to rseq registration

When adding explicit initialization of rseq fields prior to
registration, I glossed over the fact that 'cpu_id_start' is also
documented as initialized by user-space.

While current kernels don't validate the content of this field on
registration, future ones could.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
(cherry picked from commit d9f40387d3305d97e30a8cf8724218c42a63680a)

nptl: initialize rseq area prior to registration

Per the rseq syscall documentation, 3 fields are required to be
initialized by userspace prior to registration, they are 'cpu_id',
'rseq_cs' and 'flags'. Since we have no guarantee that 'struct pthread'
is cleared on all architectures, explicitly set those 3 fields prior to
registration.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
(cherry picked from commit 97f60abd25628425971f07e9b0e7f8eec0741235)

elf: Change ldconfig auxcache magic number (bug 32231)

In commit c628c2296392ed3bf2cb8d8470668e64fe53389f (elf: Remove
ldconfig kernel version check), the layout of auxcache entries
changed because the osversion field was removed from
struct aux_cache_file_entry. However, AUX_CACHEMAGIC was not
changed, so existing files are still used, potentially leading
to unintended ldconfig behavior. This commit changes AUX_CACHEMAGIC,
so that the file is regenerated.

Reported-by: DJ Delorie <dj@redhat.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit 0a536f6e2f76e3ef581b3fd9af1e5cf4ddc7a5a2)

Make tst-strtod-underflow type-generic

The test tst-strtod-underflow covers various edge cases close to the
underflow threshold for strtod (especially cases where underflow on
architectures with after-rounding tininess detection depends on the
rounding mode). Make it use the type-generic machinery, with
corresponding test inputs for each supported floating-point format, so
that other functions in the strtod family are tested for underflow
edge cases as well.

Tested for x86_64.

(cherry picked from commit 94ca2c0894f0e1b62625c369cc598a2b9236622c)

Add crt1-2.0.o for glibc 2.0 compatibility tests

Starting from glibc 2.1, crt1.o contains _IO_stdin_used which is checked
by _IO_check_libio to provide binary compatibility for glibc 2.0.  Add
crt1-2.0.o for tests against glibc 2.0.  Define tests-2.0 for glibc 2.0
compatibility tests.  Add and update glibc 2.0 compatibility tests for
stderr, matherr and pthread_kill.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit 5f245f3bfbe61b2182964dafb94907e38284b806)

Add tests of more strtod special cases

There is very little test coverage of inputs to strtod-family
functions that don't contain anything that can be parsed as a number
(one test of ".y" in tst-strtod2), and none that I can see of skipping
initial whitespace. Add some tests of these things to tst-strtod2.

Tested for x86_64.

(cherry picked from commit 378039ca578c2ea93095a1e710d96f58c68a3997)

Add more tests of strtod end pointer

Although there are some tests in tst-strtod2 and tst-strtod3 for the
end pointer provided by strtod when it doesn't parse the whole string,
they aren't very thorough. Add tests of more such cases to
tst-strtod2.

Tested for x86_64.

(cherry picked from commit b5d3737b305525315e0c7c93ca49eadc868eabd5)

Make tst-strtod2 and tst-strtod5 type-generic

Some of the strtod tests use type-generic machinery in tst-strtod.h to
test the strto* functions for all floating types, while others only
test double even when the tests are in fact meaningful for all
floating types.

Convert tst-strtod2 and tst-strtod5 to use the type-generic machinery
so they test all floating types. I haven't tried to convert them to
use newer test interfaces in other ways, just made the changes
necessary to use the type-generic machinery.

Tested for x86_64.

(cherry picked from commit 8de031bcb9adfa736c0caed2c79d10947b8d8f48)

powerpc64le: Build new strtod tests with long double ABI flags (bug 32145)

This fixes several test failures:

=====FAIL: stdlib/tst-strtod1i.out=====
Locale tests
all OK
Locale tests
all OK
Locale tests
strtold("1,5") returns -6,38643e+367 and not 1,5
strtold("1.5") returns 1,5 and not 1
strtold("1.500") returns 1 and not 1500
strtold("36.893.488.147.419.103.232") returns 1500 and not 3,68935e+19
Locale tests
all OK

=====FAIL: stdlib/tst-strtod3.out=====
0: got wrong results -2.5937e+4826, expected 0

=====FAIL: stdlib/tst-strtod4.out=====
0: got wrong results -6,38643e+367, expected 0
1: got wrong results 0, expected 1e+06
2: got wrong results 1e+06, expected 10

=====FAIL: stdlib/tst-strtod5i.out=====
0: got wrong results -6,38643e+367, expected 0
2: got wrong results 0, expected -0
4: got wrong results -0, expected 0
5: got wrong results 0, expected -0
6: got wrong results -0, expected 0
7: got wrong results 0, expected -0
8: got wrong results -0, expected 0
9: got wrong results 0, expected -0
10: got wrong results -0, expected 0
11: got wrong results 0, expected -0
12: got wrong results -0, expected 0
13: got wrong results 0, expected -0
14: got wrong results -0, expected 0
15: got wrong results 0, expected -0
16: got wrong results -0, expected 0
17: got wrong results 0, expected -0
18: got wrong results -0, expected 0
20: got wrong results 0, expected -0
22: got wrong results -0, expected 0
23: got wrong results 0, expected -0
24: got wrong results -0, expected 0
25: got wrong results 0, expected -0
26: got wrong results -0, expected 0
27: got wrong results 0, expected -0

Fixes commit 3fc063dee01da4f80920a14b7db637c8501d6fd4
("Make __strtod_internal tests type-generic").

Suggested-by: Joseph Myers <josmyers@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit cc3e743fc09ee6fca45767629df9cbcbe1feba82)

Do not set errno for overflowing NaN payload in strtod/nan (bug 32045)

As reported in bug 32045, it's incorrect for strtod/nan functions to
set errno based on overflowing payload (strtod should only set errno
for overflow / underflow of its actual result, and potentially if
nothing in the string can be parsed as a number at all; nan should be
a pure function that never sets it). Save and restore errno around
the internal strtoull call and add associated test coverage.

Tested for x86_64.

(cherry picked from commit 64f62c47e9c350f353336f2df6714e1d48ec50d8)

Improve NaN payload testing

There are two separate sets of tests of NaN payloads in glibc:

* libm-test-{get,set}payload* verify that getpayload, setpayload,
  setpayloadsig and __builtin_nan functions are consistent in their
  payload handling.

* test-nan-payload verifies that strtod-family functions and the
  not-built-in nan functions are consistent in their payload handling.

Nothing, however, connects the two sets of functions (i.e., verifies
that strtod / nan are consistent with getpayload / setpayload /
__builtin_nan).

Improve test-nan-payload to check actual payload value with getpayload
rather than just verifying that the strtod and nan functions produce
the same NaN.  Also check that the NaNs produced aren't signaling and
extend the tests to cover _FloatN / _FloatNx.

Tested for x86_64.

(cherry picked from commit be77d5ae417236883c02d3d67c0716e3f669fa41)

Make __strtod_internal tests type-generic

Some of the strtod tests use type-generic machinery in tst-strtod.h to
test the strto* functions for all floating types, while others only
test double even when the tests are in fact meaningful for all
floating types.

Convert the tests of the internal __strtod_internal interface to cover
all floating types. I haven't tried to convert them to use newer test
interfaces in other ways, just made the changes necessary to use the
type-generic machinery. As an internal interface, there are no
aliases for different types with the same ABI (however,
__strtold_internal is defined even if long double has the same ABI as
double), so macros used by the type-generic testing code are redefined
as needed to avoid expecting such aliases to be present.

Tested for x86_64.

(cherry picked from commit 3fc063dee01da4f80920a14b7db637c8501d6fd4)

Fix strtod subnormal rounding (bug 30220)

As reported in bug 30220, the implementation of strtod-family
functions has a bug in the following case: the input string would,
with infinite exponent range, take one more bit to represent than is
available in the normal precision of the return type; the value
represented is in the subnormal range; and there are no nonzero bits
in the value, below those that can be represented in subnormal
precision, other than the least significant bit and possibly the
0.5ulp bit. In this case, round_and_return ends up discarding the
least significant bit.

Fix by saving that bit to merge into more_bits (it can't be merged in
at the time it's computed, because more_bits mustn't include this bit
in the case of after-rounding tininess detection checking if the
result is still subnormal when rounded to normal precision, so merging
this bit into more_bits needs to take place after that check).

Tested for x86_64.

(cherry picked from commit 457622c2fa8f9f7435822d5287a437bc8be8090d)

More thoroughly test underflow / errno in tst-strtod-round

Add tests of underflow in tst-strtod-round, and thus also test for
errno being unchanged when there is neither overflow nor underflow.
The errno setting before the function call to test for being unchanged
is adjusted to set errno to 12345 instead of 0, so that any bugs where
strtod sets errno to 0 would be detected.

This doesn't add any new test inputs for tst-strtod-round, and in
particular doesn't cover the edge cases of underflow the way
tst-strtod-underflow does (none of the existing test inputs for
tst-strtod-round actually exercise cases that have underflow with
before-rounding tininess detection but not with after-rounding
tininess detection), but at least it provides some coverage (as per
the recent discussions) that ordinary non-overflowing non-underflowing
inputs to these functions do not set errno.

Tested for x86_64.

(cherry picked from commit d73ed2601b7c3c93c3529149a3d7f7b6177900a8)

Test errno setting on strtod overflow in tst-strtod-round

We have no tests that errno is set to ERANGE on overflow of
strtod-family functions (we do have some tests for underflow, in
tst-strtod-underflow). Add such tests to tst-strtod-round.

Tested for x86_64.

(cherry picked from commit 207d64feb26279e152c50744e3c37e68491aca99)

Add tests of fread

There seem to be no glibc tests specifically for the fread function.
Add basic tests of that function.

Tested for x86_64.

(cherry picked from commit d14c977c65aac7db35bb59380ef99d6582c4f930)

stdio-common: Add new test for fdopen

This commit adds fdopen test with all modes.
Reviewed-by: DJ Delorie <dj@redhat.com>
(cherry picked from commit 1d72fa3cfa046f7293421a7e58f2a272474ea901)

libio: Attempt wide backup free only for non-legacy code

_wide_data and _mode are not available in legacy code, so do not attempt
to free the wide backup buffer in legacy code.

Resolves: BZ #32137 and BZ #27821

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
(cherry picked from commit ae4d44b1d501421ad9a3af95279b8f4d1546f1ce)

debug: Fix read error handling in pcprofiledump

The reading loops did not check for read failures. Addresses
a static analysis report.

Manually tested by compiling a program with the GCC's
-finstrument-functions option, running it with
“LD_PRELOAD=debug/libpcprofile.so PCPROFILE_OUTPUT=output-file”,
and reviewing the output of “debug/pcprofiledump output-file”.

(cherry picked from commit 89b088bf70c651c231bf27e644270d093b8f144a)

elf: Fix tst-dlopen-tlsreinit1.out test dependency

Fixes commit 5097cd344fd243fb8deb6dec96e8073753f962f9
("elf: Avoid re-initializing already allocated TLS in dlopen
(bug 31717)").

Reported-by: Patsy Griffin <patsy@redhat.com>
Reviewed-by: Patsy Griffin <patsy@redhat.com>
(cherry picked from commit e82a7cb1622bff08d8e3a144d7c5516a088f1cbc)

elf: Avoid re-initializing already allocated TLS in dlopen (bug 31717)

The old code used l_init_called as an indicator for whether TLS
initialization was complete.  However, it is possible that
TLS for an object is initialized, written to, and then dlopen
for this object is called again, and l_init_called is not true at
this point.  Previously, this resulted in TLS being initialized
twice, discarding any interim writes (technically introducing a
use-after-free bug even).

This commit introduces an explicit per-object flag, l_tls_in_slotinfo.
It indicates whether _dl_add_to_slotinfo has been called for this
object.  This flag is used to avoid double-initialization of TLS.
In update_tls_slotinfo, the first_static_tls micro-optimization
is removed because preserving the initalization flag for subsequent
use by the second loop for static TLS is a bit complicated, and
another per-object flag does not seem to be worth it.  Furthermore,
the l_init_called flag is dropped from the second loop (for static
TLS initialization) because l_need_tls_init on its own prevents
double-initialization.

The remaining l_init_called usage in resize_scopes and update_scopes
is just an optimization due to the use of scope_has_map, so it is
not changed in this commit.

The isupper check ensures that libc.so.6 is TLS is not reverted.
Such a revert happens if l_need_tls_init is not cleared in
_dl_allocate_tls_init for the main_thread case, now that
l_init_called is not checked anymore in update_tls_slotinfo
in elf/dl-open.c.

Reported-by: Jonathon Anderson <janderson@rice.edu>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit 5097cd344fd243fb8deb6dec96e8073753f962f9)

elf: Clarify and invert second argument of _dl_allocate_tls_init

Also remove an outdated comment: _dl_allocate_tls_init is
called as part of pthread_create.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit fe06fb313bddf7e4530056897d4a706606e49377)

elf: Support recursive use of dynamic TLS in interposed malloc

It turns out that quite a few applications use bundled mallocs that
have been built to use global-dynamic TLS (instead of the recommended
initial-exec TLS).  The previous workaround from
commit afe42e935b3ee97bac9a7064157587777259c60e ("elf: Avoid some
free (NULL) calls in _dl_update_slotinfo") does not fix all
encountered cases unfortunatelly.

This change avoids the TLS generation update for recursive use
of TLS from a malloc that was called during a TLS update.  This
is possible because an interposed malloc has a fixed module ID and
TLS slot.  (It cannot be unloaded.)  If an initially-loaded module ID
is encountered in __tls_get_addr and the dynamic linker is already
in the middle of a TLS update, use the outdated DTV, thus avoiding
another call into malloc.  It's still necessary to update the
DTV to the most recent generation, to get out of the slow path,
which is why the check for recursion is needed.

The bookkeeping is done using a global counter instead of per-thread
flag because TLS access in the dynamic linker is tricky.

All this will go away once the dynamic linker stops using malloc
for TLS, likely as part of a change that pre-allocates all TLS
during pthread_create/dlopen.

Fixes commit d2123d68275acc0f061e73d5f86ca504e0d5a344 ("elf: Fix slow
tls access after dlopen [BZ #19924]").

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
(cherry picked from commit 018f0fc3b818d4d1460a4e2384c24802504b1d20)

nptl: Use <support/check.h> facilities in tst-setuid3

Remove local FAIL macro in favor to FAIL_EXIT1 from <support/check.h>,
which provides equivalent reporting, with the name of the file and the
line number within of the failure site additionally included. Remove
FAIL_ERR altogether and include ": %m" explicitly with the format string
supplied to FAIL_EXIT1 as there seems little value to have a separate
macro just for this.

Reviewed-by: DJ Delorie <dj@redhat.com>
(cherry picked from commit 8c98195af6e6f1ce21743fc26c723e0f7e45bcf2)

posix: Use <support/check.h> facilities in tst-truncate and tst-truncate64

Remove local FAIL macro in favor to FAIL_RET from <support/check.h>,
which provides equivalent reporting, with the name of the file of the
failure site additionally included, for the tst-truncate-common core
shared between the tst-truncate and tst-truncate64 tests.

Reviewed-by: DJ Delorie <dj@redhat.com>
(cherry picked from commit fe47595504a55e7bb992f8928533df154b510383)

ungetc: Fix backup buffer leak on program exit [BZ #27821]

If a file descriptor is left unclosed and is cleaned up by _IO_cleanup
on exit, its backup buffer remains unfreed, registering as a leak in
valgrind.  This is not strictly an issue since (1) the program should
ideally be closing the stream once it's not in use and (2) the program
is about to exit anyway, so keeping the backup buffer around a wee bit
longer isn't a real problem.  Free it anyway to keep valgrind happy
when the streams in question are the standard ones, i.e. stdout, stdin
or stderr.

Also, the _IO_have_backup macro checks for _IO_save_base,
which is a roundabout way to check for a backup buffer instead of
directly looking for _IO_backup_base.  The roundabout check breaks when
the main get area has not been used and user pushes a char into the
backup buffer with ungetc.  Fix this to use the _IO_backup_base
directly.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit 3e1d8d1d1dca24ae90df2ea826a8916896fc7e77)

ungetc: Fix uninitialized read when putting into unused streams [BZ #27821]

When ungetc is called on an unused stream, the backup buffer is
allocated without the main get area being present. This results in
every subsequent ungetc (as the stream remains in the backup area)
checking uninitialized memory in the backup buffer when trying to put a
character back into the stream.

Avoid comparing the input character with buffer contents when in backup
to avoid this uninitialized read. The uninitialized read is harmless in
this context since the location is promptly overwritten with the input
character, thus fulfilling ungetc functionality.

Also adjust wording in the manual to drop the paragraph that says glibc
cannot do multiple ungetc back to back since with this change, ungetc
can actually do this.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit cdf0f88f97b0aaceb894cc02b21159d148d7065c)

Make tst-ungetc use libsupport

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit 3f7df7e757f4efec38e45d4068e5492efcac4856)

stdio-common: Add test for vfscanf with matches longer than INT_MAX [BZ #27650]

Complement commit b03e4d7bd25b ("stdio: fix vfscanf with matches longer
than INT_MAX (bug 27650)") and add a test case for the issue, inspired
by the reproducer provided with the bug report.

This has been verified to succeed as from the commit referred and fail
beforehand.

As the test requires 2GiB of data to be passed around its performance
has been evaluated using a choice of systems and the execution time
determined to be respectively in the range of 9s for POWER9@2.166GHz,
24s for FU740@1.2GHz, and 40s for 74Kf@950MHz. As this is on the verge
of and beyond the default timeout it has been increased by the factor of
8. Regardless, following recent practice the test has been added to the
standard rather than extended set.

Reviewed-by: DJ Delorie <dj@redhat.com>
(cherry picked from commit 89cddc8a7096f3d9225868304d2bc0a1aaf07d63)

support: Add FAIL test failure helper

Add a FAIL test failure helper analogous to FAIL_RET, that does not
cause the current function to return, providing a standardized way to
report a test failure with a message supplied while permitting the
caller to continue executing, for further reporting, cleaning up, etc.

Update existing test cases that provide a conflicting definition of FAIL
by removing the local FAIL definition and then as follows:

- tst-fortify-syslog: provide a meaningful message in addition to the
  file name already added by <support/check.h>; 'support_record_failure'
  is already called by 'support_print_failure_impl' invoked by the new
  FAIL test failure helper.

- tst-ctype: no update to FAIL calls required, with the name of the file
  and the line number within of the failure site additionally included
  by the new FAIL test failure helper, and error counting plus count
  reporting upon test program termination also already provided by
  'support_record_failure' and 'support_report_failure' respectively,
  called by 'support_print_failure_impl' and 'adjust_exit_status' also
  respectively.  However in a number of places 'printf' is called and
  the error count adjusted by hand, so update these places to make use
  of FAIL instead.  And last but not least adjust the final summary just
  to report completion, with any error count following as reported by
  the test driver.

- test-tgmath2: no update to FAIL calls required, with the name of the
  file of the failure site additionally included by the new FAIL test
  failure helper.  Also there is no need to track the return status by
  hand as any call to FAIL will eventually cause the test case to return
  an unsuccesful exit status regardless of the return status from the
  test function, via a call to 'adjust_exit_status' made by the test
  driver.

Reviewed-by: DJ Delorie <dj@redhat.com>
(cherry picked from commit 1b97a9f23bf605ca608162089c94187573fb2a9e)

string: strerror, strsignal cannot use buffer after dlmopen (bug 32026)

Secondary namespaces have a different malloc.  Allocating the
buffer in one namespace and freeing it another results in
heap corruption.  Fix this by using a static string (potentially
translated) in secondary namespaces.  It would also be possible
to use the malloc from the initial namespace to manage the
buffer, but these functions would still not be safe to use in
auditors etc. because a call to strerror could still free a
buffer while it is used by the application.  Another approach
could use proper initial-exec TLS, duplicated in secondary
namespaces, but that would need a callback interface for freeing
libc resources in namespaces on thread exit, which does not exist
today.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit 25a5eb4010df94b412c67db9e346029de316d06b)

Define __libc_initial for the static libc

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit eb0e50e9a1cf80a2ba6f33f990a08ef37a3267fb)

x86: Fix bug in strchrnul-evex512 [BZ #32078]

Issue was we were expecting not matches with CHAR before the start of
the string in the page cross case.

The check code in the page cross case:
```
    and    $0xffffffffffffffc0,%rax
    vmovdqa64 (%rax),%zmm17
    vpcmpneqb %zmm17,%zmm16,%k1
    vptestmb %zmm17,%zmm17,%k0{%k1}
    kmovq  %k0,%rax
    inc    %rax
    shr    %cl,%rax
    je     L(continue)
```

expects that all characters that neither match null nor CHAR will be
1s in `rax` prior to the `inc`. Then the `inc` will overflow all of
the 1s where no relevant match was found.

This is incorrect in the page-cross case, as the
`vmovdqa64 (%rax),%zmm17` loads from before the start of the input
string.

If there are matches with CHAR before the start of the string, `rax`
won't properly overflow.

The fix is quite simple. Just replace:

```
    inc    %rax
    shr    %cl,%rax
```
With:
```
    sar    %cl,%rax
    inc    %rax
```

The arithmetic shift will clear any matches prior to the start of the
string while maintaining the signbit so the 1s can properly overflow
to zero in the case of no matches.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit 7da08862471dfec6fdae731c2a5f351ad485c71f)

Adjust check-local-headers test for libaudit 4.0

The new version introduces /usr/include/audit_logging.h and
/usr/include/audit-records.h.

(cherry picked from commit 91eb62d63887a959e43aafb6fc022a87614dc7c9)

x32/cet: Support shadow stack during startup for Linux 6.10

Use RXX_LP in RTLD_START_ENABLE_X86_FEATURES.  Support shadow stack during
startup for Linux 6.10:

commit 2883f01ec37dd8668e7222dfdb5980c86fdfe277
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Fri Mar 15 07:04:33 2024 -0700

    x86/shstk: Enable shadow stacks for x32

    1. Add shadow stack support to x32 signal.
    2. Use the 64-bit map_shadow_stack syscall for x32.
    3. Set up shadow stack for x32.

Add the map_shadow_stack system call to <fixup-asm-unistd.h> and regenerate
arch-syscall.h.  Tested on Intel Tiger Lake with CET enabled x32.  There
are no regressions with CET enabled x86-64.  There are no changes in CET
enabled x86-64 _dl_start_user.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
(cherry picked from commit 8344c1f5514b1b5b1c8c6e48f4b802653bd23b71)

x86-64: Remove sysdeps/x86_64/x32/dl-machine.h

Remove sysdeps/x86_64/x32/dl-machine.h by folding x32 ARCH_LA_PLTENTER,
ARCH_LA_PLTEXIT and RTLD_START into sysdeps/x86_64/dl-machine.h.  There
are no regressions on x86-64 nor x32.  There are no changes in x86-64
_dl_start_user.  On x32, _dl_start_user changes are

<_dl_start_user>:
mov    %eax,%r12d
+ mov    %esp,%r13d
mov    (%rsp),%edx
mov    %edx,%esi
- mov    %esp,%r13d
and    $0xfffffff0,%esp
mov    0x0(%rip),%edi        # <_dl_start_user+0x14>
lea    0x8(%r13,%rdx,4),%ecx

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
(cherry picked from commit 652c6cf26927352fc0e37e4e60c6fc98ddf6d3b4)

support: Add options list terminator to the test driver

This avoids crashes if a test is passed unknown options.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit c2a474f4617ede7a8bf56b7257acb37dc757b2d1)

manual/stdio: Further clarify putc, putwc, getc, and getwc

This is a follow-up to 10de4a47ef3f481592e3c62eb07bcda23e9fde4d that
reworded the manual entries for putc and putwc and removed any
performance claims.

This commit further clarifies these entries and brings getc and getwc in
line with the descriptions of putc and putwc, removing any performance
claims from them as well.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
(cherry picked from commit 942670c81dc8071dd75d6213e771daa5d2084cb6)

Fix name space violation in fortify wrappers (bug 32052)

Rename the identifier sz to __sz everywhere.

Fixes: a643f60c53 ("Make sure that the fortified function conditionals are constant")
(cherry picked from commit 39ca997ab378990d5ac1aadbaa52aaf1db6d526f)
(redone from scratch because of many conflicts)

resolv: Fix tst-resolv-short-response for older GCC (bug 32042)

Previous GCC versions do not support the C23 change that
allows labels on declarations.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit ec119972cb2598c04ec7d4219e20506006836f64)

Add mremap tests

Add tests for MREMAP_MAYMOVE and MREMAP_FIXED. On Linux, also test
MREMAP_DONTUNMAP.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit ff0320bec2810192d453c579623482fab87bfa01)

mremap: Update manual entry

Update mremap manual entry:

1. Change mremap to variadic.
2. Document MREMAP_FIXED and MREMAP_DONTUNMAP.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit cb2dee4eccf46642eef588bee64f9c875c408f1c)

linux: Update the mremap C implementation [BZ #31968]

Update the mremap C implementation to support the optional argument for
MREMAP_DONTUNMAP added in Linux 5.7 since it may not always be correct
to implement a variadic function as a non-variadic function on all Linux
targets. Return MAP_FAILED and set errno to EINVAL for unknown flag bits.
This fixes BZ #31968.

Note: A test must be added when a new flag bit is introduced.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit 6c40cb0e9f893d49dc7caee580a055de53562206)

Enhanced test coverage for strncmp, wcsncmp

Add string/test-strncmp-nonarray and
wcsmbs/test-wcsncmp-nonarray.

This is the test that uncovered bug 31934. Test run time
is more than one minute on a fairly current system, so turn
these into xtests that do not run automatically.

Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
(cherry picked from commit 54252394c25ddf0062e288d4a6ab7a885f8ae009)

Enhance test coverage for strnlen, wcsnlen

This commit adds string/test-strnlen-nonarray and
wcsmbs/test-wcsnlen-nonarray.

Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
(cherry picked from commit 783d4c0b81889c39a9ddf13b60d0fde4040fb1c0)

manual: make setrlimit() description less ambiguous

The existing description for setrlimit() has some ambiguity. It could be
understood to have the semantics of getrlimit(), i.e., the limits from the
process are stored in the provided rlp pointer.

Make the description more explicit that rlp are the input values, and that
the limits of the process is changed with this function.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
(cherry picked from commit aedbf08891069fc029ed021e4dba933eb877b394)

manual/stdio: Clarify putc and putwc

The manual entry for `putc' described what "most systems" do instead of
describing the glibc implementation and its guarantees. This commit
fixes that by warning that putc may be implemented as a macro that
double-evaluates `stream', and removing the performance claim.

Even though the current `putc' implementation does not double-evaluate
`stream', offering this obscure guarantee as an extension to what
POSIX allows does not seem very useful.

The entry for `putwc' is also edited to bring it in line with `putc'.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
(cherry picked from commit 10de4a47ef3f481592e3c62eb07bcda23e9fde4d)

malloc: add multi-threaded tests for aligned_alloc/calloc/malloc

Improve aligned_alloc/calloc/malloc test coverage by adding
multi-threaded tests with random memory allocations and with/without
cross-thread memory deallocations.

Perform a number of memory allocation calls with random sizes limited
to 0xffff.

Use the existing DSO ('malloc/tst-aligned_alloc-lib.c') to randomize
allocator selection.

The multi-threaded allocation/deallocation is staged as described below:

- Stage 1: Half of the threads will be allocating memory and the
  other half will be waiting for them to finish the allocation.
- Stage 2: Half of the threads will be allocating memory and the
  other half will be deallocating memory.
- Stage 3: Half of the threads will be deallocating memory and the
  second half waiting on them to finish.

Add 'malloc/tst-aligned-alloc-random-thread.c' where each thread will
deallocate only the memory that was previously allocated by itself.

Add 'malloc/tst-aligned-alloc-random-thread-cross.c' where each thread
will deallocate memory that was previously allocated by another thread.

The intention is to be able to utilize existing malloc testing to ensure
that similar allocation APIs are also exposed to the same rigors.
Reviewed-by: Arjun Shankar <arjun@redhat.com>
(cherry picked from commit b0fbcb7d0051a68baf26b2aed51a8a31c34d68e5)

malloc: avoid global locks in tst-aligned_alloc-lib.c

Make sure the DSO used by aligned_alloc/calloc/malloc tests does not get
a global lock on multithreaded tests.
Reviewed-by: Arjun Shankar <arjun@redhat.com>
(cherry picked from commit 9a27b566b2048f599048f2f4afe1cce06c4ef43d)

resolv: Track single-request fallback via _res._flags (bug 31476)

This avoids changing _res.options, which inteferes with change
detection as part of automatic reloading of /etc/resolv.conf.

Reviewed-by: DJ Delorie <dj@redhat.com>
(cherry picked from commit 868ab8923a2ec977faafec97ecafac0c3159c1b2)

resolv: Do not wait for non-existing second DNS response after error (bug 30081)

In single-request mode, there is no second response after an error
because the second query has not been sent yet. Waiting for it
introduces an unnecessary timeout.

Reviewed-by: DJ Delorie <dj@redhat.com>
(cherry picked from commit af625987d619388a100b153520d3ee308bda9889)

resolv: Allow short error responses to match any query (bug 31890)

Reviewed-by: DJ Delorie <dj@redhat.com>
(cherry picked from commit 691a3b2e9bfaba842e46a5ccb7f5e6ea144c3ade)

elf: Fix localplt.awk for DT_RELR-enabled builds (BZ 31978)

For each input readelf output, localplt.awk parses each 'Relocation
section' entry, checks its offset against the dynamic section entry, and
saves each DT_JMPREL, DT_RELA, and DT_REL offset value it finds. After
all lines are read, the script checks if any segment offset differed
from 0, meaning at least one 'Relocation section' was matched.

However, if the shared object was built with RELR support and the static
linker could place all the relocation on DT_RELR, there would be no
DT_JMPREL, DT_RELA, and DT_REL entries; only a DT_RELR.

For the current three ABIs that support (aarch64, x86, and powerpc64),
the powerpc64 ld.so shows the behavior above. Both x86_64 and aarch64
show extra relocations on '.rela.dyn', which makes the script check to
succeed.

This patch fixes by handling DT_RELR, where the offset is checked
against the dynamic section entries and if the shared object contains an
entry it means that there are no extra PLT entries (since all
relocations are relative).

It fixes the elf/check-localplt failure on powerpc.

Checked with a build/check for aarch64-linux-gnu, x86_64-linux-gnu,
i686-linux-gnu, arm-linux-gnueabihf, s390x-linux-gnu, powerpc-linux-gnu,
powerpc64-linux-gnu, and powerpc64le-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit 4f047d9edecb1a9b796a9a904dcd42bd3cc3d3b6)

Fix usage of _STACK_GROWS_DOWN and _STACK_GROWS_UP defines [BZ 31989]

Signed-off-by: John David Anglin <dave.anglin@bell.net>
Reviewed-By: Andreas K. Hüttel <dilfridge@gentoo.org>
(cherry picked from commit 8cfa4ecff21adf226984f135aa576dd8063bbba3)

Linux: Make __rseq_size useful for feature detection (bug 31965)

The __rseq_size value is now the active area of struct rseq
(so 20 initially), not the full struct size including padding
at the end (32 initially).

Update misc/tst-rseq to print some additional diagnostics.

Reviewed-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
(cherry picked from commit 2e456ccf0c34a056e3ccafac4a0c7effef14d918)

elf: Make dl-rseq-symbols Linux only

And avoid a Hurd build failures.

Checked on x86_64-linux-gnu.

(cherry picked from commit 9fc639f654dc004736836613be703e6bed0c36a8)

nptl: fix potential merge of __rseq_* relro symbols

While working on a patch to add support for the extensible rseq ABI, we
came across an issue where a new 'const' variable would be merged with
the existing '__rseq_size' variable. We tracked this to the use of
'-fmerge-all-constants' which allows the compiler to merge identical
constant variables. This means that all 'const' variables in a compile
unit that are of the same size and are initialized to the same value can
be merged.

In this specific case, on 32 bit systems 'unsigned int' and 'ptrdiff_t'
are both 4 bytes and initialized to 0 which should trigger the merge.
However for reasons we haven't delved into when the attribute 'section
(".data.rel.ro")' is added to the mix, only variables of the same exact
types are merged. As far as we know this behavior is not specified
anywhere and could change with a new compiler version, hence this patch.

Move the definitions of these variables into an assembler file and add
hidden writable aliases for internal use. This has the added bonus of
removing the asm workaround to set the values on rseq registration.

Tested on Debian 12 with GCC 12.2.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
(cherry picked from commit 2b92982e2369d292560793bee8e730f695f48ff3)

s390x: Fix segfault in wcsncmp [BZ #31934]

The z13/vector-optimized wcsncmp implementation segfaults if n=1
and there is only one character (equal on both strings) before
the page end. Then it loads and compares one character and misses
to check n again. The following load fails.

This patch removes the extra load and compare of the first character
and just start with the loop which uses vector-load-to-block-boundary.
This code-path also checks n.

With this patch both tests are passing:
- the simplified one mentioned in the bugzilla 31934
- the full one in Florian Weimer's patch:
"manual: Document a GNU extension for strncmp/wcsncmp"
(https://patchwork.sourceware.org/project/glibc/patch/874j9eml6y.fsf@oldenburg.str.redhat.com/):
On s390x-linux-gnu (z16), the new wcsncmp test fails due to bug 31934.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit 9b7651410375ec8848a1944992d663d514db4ba7)

stdlib: fix arc4random fallback to /dev/urandom (BZ 31612)

The __getrandom_nocancel used by __arc4random_buf uses
INLINE_SYSCALL_CALL (which returns -1/errno) and the loop checks for
the return value instead of errno to fallback to /dev/urandom.

The malloc code now uses __getrandom_nocancel_nostatus, which uses
INTERNAL_SYSCALL_CALL, so there is no need to use the variant that does
not set errno (BZ#29624).

Checked on x86_64-linux-gnu.

Reviewed-by: Xi Ruoyao <xry111@xry111.site>
(cherry picked from commit 184b9e530e6326e668709826903b6d30dc6cac3f)

math: Provide missing math symbols on libc.a (BZ 31781)

The libc.a for alpha, s390, and sparcv9 does not provide
copysignf64x, copysignf128, frexpf64x, frexpf128, modff64x, and
modff128.

Checked with a static build for the affected ABIs.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit eaa8113bf0eb599025e3efdbe1bb214ee8dc645a)

math: Fix isnanf128 static build (BZ 31774)

Some static implementation of float128 routines might call __isnanf128,
which is not provided by the static object.

Checked on x86_64-linux-gnu.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit 5d4999e519ec77e75bef920e2540e8605015680a)

math: Fix i386 and m68k exp10 on static build (BZ 31775)

The commit 08ddd26814 removed the static exp10 on i386 and m68k with an
empty w_exp10.c (required for the ABIs that uses the newly
implementation). This patch fixes by adding the required symbols on the
arch-specific w_exp{f}_compat.c implementation.

Checked on i686-linux-gnu and with a build for m68k-linux-gnu.

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
(cherry picked from commit 1f09aae36aa185b8b1100dfa6b776442279bf580)

math: Fix i386 and m68k fmod/fmodf on static build (BZ 31488)

The commit 16439f419b removed the static fmod/fmodf on i386 and m68k
with and empty w_fmod.c (required for the ABIs that uses the newly
implementation). This patch fixes by adding the required symbols on
the arch-specific w_fmod{f}_compat.c implementation.

To statically build fmod fails on some ABI (alpha, s390, sparc) because
it does not export the ldexpf128, this is also fixed by this patch.

Checked on i686-linux-gnu and with a build for m68k-linux-gnu.

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
(cherry picked from commit 0b716305dfb48c2d13ed4f7d06c082b90c1d226f)

posix: Fix pidfd_spawn/pidfd_spawnp leak if execve fails (BZ 31695)

If the pidfd_spawn/pidfd_spawnp helper process succeeds, but evecve
fails for some reason (either with an invalid/non-existent, memory
allocation, etc.) the resulting pidfd is never closed, nor returned
to caller (so it can call close).

Since the process creation failed, it should be up to posix_spawn to
also, close the file descriptor in this case (similar to what it
does to reap the process).

This patch also changes the waitpid with waitid (P_PIDFD) for pidfd
case, to avoid a possible pid re-use.

Checked on x86_64-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit c90cfce849d010474e8cccf3e5bff49a2c8b141f)

Linux: Include <dl-symbol-redir-ifunc.h> in dl-sysdep.c

The _dl_sysdep_parse_arguments function contains initalization
of a large on-stack variable:

dl_parse_auxv_t auxv_values = { 0, };

This uses a non-inline version of memset on powerpc64le-linux-gnu,
so it must use the baseline memset.

(cherry picked from commit f6ea5d1291cf3f264514d03872ebae84e0293b69)

NEWS: update list of fixed CVEs in 2.39

Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>

NEWS: update list of fixed bugs in 2.39

Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>

x86: Properly set x86 minimum ISA level [BZ #31883]

Properly set libc_cv_have_x86_isa_level in shell for MINIMUM_X86_ISA_LEVEL
defined as

(__X86_ISA_V1 + __X86_ISA_V2 + __X86_ISA_V3 + __X86_ISA_V4)

Also set __X86_ISA_V2 to 1 for i386 if __GCC_HAVE_SYNC_COMPARE_AND_SWAP_8
is defined. There are no changes in config.h nor in config.make on x86-64.
On i386, -march=x86-64-v2 with GCC generates

#define MINIMUM_X86_ISA_LEVEL 2

in config.h and

have-x86-isa-level = 2

in config.make. This fixes BZ #31883.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
(cherry picked from commit 29807a271edca3e47195bda0c69ae45e245551a9)

x86: Properly set MINIMUM_X86_ISA_LEVEL for i386 [BZ #31867]

On i386, set the default minimum ISA level to 0, not 1 (baseline which
includes SSE2). There are no changes in config.h nor in config.make on
x86-64. This fixes BZ #31867.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Tested-by: Ian Jordan <immoloism@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
(cherry picked from commit 09bc68b0ac26331a0109f0578c9368e09176da18)

localedata: ssy_ER: Fix syntax error

(cherry picked from commit 07fd072caff50bca2a7e9f5737a5b38280d2ffda)

malloc: New test to check malloc alternate path using memory obstruction

The test aims to ensure that malloc uses the alternate path to
allocate memory when sbrk() or brk() fails.To achieve this,
the test first creates an obstruction at current program break,
tests that obstruction with a failing sbrk(), then checks if malloc
is still returning a valid ptr thus inferring that malloc() used
mmap() instead of brk() or sbrk() to allocate the memory.
Reviewed-by: Arjun Shankar <arjun@redhat.com>
Reviewed-by: Zack Weinberg <zack@owlfolio.org>
(cherry picked from commit 127fc56152347d73cb7c1c283e60e1cb1f15e9f9)

malloc: Improve aligned_alloc and calloc test coverage.

Add a DSO (malloc/tst-aligned_alloc-lib.so) that can be used during
testing to interpose malloc with a call that randomly uses either
aligned_alloc, __libc_malloc, or __libc_calloc in the place of malloc.
Use LD_PRELOAD with the DSO to mirror malloc/tst-malloc.c testing as an
example in malloc/tst-malloc-random.c. Add malloc/tst-aligned-alloc-random.c
as another example that does a number of malloc calls with randomly sized,
but limited to 0xffff, requests.

The intention is to be able to utilize existing malloc testing to ensure
that similar allocation APIs are also exposed to the same rigors.

Reviewed-by: DJ Delorie <dj@redhat.com>
(cherry picked from commit 3395157ff2b0657d70c36169156f67440205c8bf)

malloc/Makefile: Split and sort tests

Put each test on a separate line and sort tests.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit d49cd6a1913da9744b9a0ffbefb3f7958322382e)

x86/cet: fix shadow stack test scripts

Some shadow stack test scripts use the '==' operator with the 'test'
command to validate exit codes resulting in the following error:

sysdeps/x86_64/tst-shstk-legacy-1e.sh: 31: test: 139: unexpected operator

The '==' operator is invalid for the 'test' command, use '-eq' like the
previous call to 'test'.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit 155bb9d036646138348fee0ac045de601811e0c5)

elf: Avoid some free (NULL) calls in _dl_update_slotinfo

This has been confirmed to work around some interposed mallocs.  Here
is a discussion of the impact test ust/libc-wrapper/test_libc-wrapper
in lttng-tools:

  New TLS usage in libgcc_s.so.1, compatibility impact
  <https://inbox.sourceware.org/libc-alpha/8734v1ieke.fsf@oldenburg.str.redhat.com/>

Reportedly, this patch also papers over a similar issue when tcmalloc
2.9.1 is not compiled with -ftls-model=initial-exec.  Of course the
goal really should be to compile mallocs with the initial-exec TLS
model, but this commit appears to be a useful interim workaround.

Fixes commit d2123d68275acc0f061e73d5f86ca504e0d5a344 ("elf: Fix slow
tls access after dlopen [BZ #19924]").

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit afe42e935b3ee97bac9a7064157587777259c60e)

misc: Add support for Linux uio.h RWF_NOAPPEND flag

In Linux 6.9 a new flag is added to allow for Per-io operations to
disable append mode even if a file was opened with the flag O_APPEND.
This is done with the new RWF_NOAPPEND flag.

This caused two test failures as these tests expected the flag 0x00000020
to be unused.  Adding the flag definition now fixes these tests on Linux
6.9 (v6.9-rc1).

  FAIL: misc/tst-preadvwritev2
  FAIL: misc/tst-preadvwritev64v2

This patch adds the flag, adjusts the test and adds details to
documentation.

Link: https://lore.kernel.org/all/20200831153207.GO3265@brightrain.aerifal.cx/
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit 3db9d208dd5f30b12900989c6d2214782b8e2011)

i386: Disable Intel Xeon Phi tests for GCC 15 and above (BZ 31782)

This patch disables Intel Xeon Phi tests for GCC 15 and above.

GCC 15 removed Intel Xeon Phi ISA support.
commit e1a7e2c54d52d0ba374735e285b617af44841ace
Author: Haochen Jiang <haochen.jiang@intel.com>
Date: Mon May 20 10:43:44 2024 +0800

i386: Remove Xeon Phi ISA support

Fixes BZ 31782.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit 1b713c9a5349ef3cd1a8ccf9de017c7865713c67)

Reinstate generic features-time64.h

The a4ed0471d7 removed the generic version which is included by
features.h and used by Hurd.

Checked by building i686-gnu and x86_64-gnu with build-many-glibc.py.

(cherry picked from commit c27f8763cffbb7db9b3f1f5e09ef24d26cbb63f4)

Always define __USE_TIME_BITS64 when 64 bit time_t is used

It was raised on libc-help [1] that some Linux kernel interfaces expect
the libc to define __USE_TIME_BITS64 to indicate the time_t size for the
kABI. Different than defined by the initial y2038 design document [2],
the __USE_TIME_BITS64 is only defined for ABIs that support more than
one time_t size (by defining the _TIME_BITS for each module).

The 64 bit time_t redirects are now enabled using a different internal
define (__USE_TIME64_REDIRECTS). There is no expected change in semantic
or code generation.

Checked on x86_64-linux-gnu, i686-linux-gnu, aarch64-linux-gnu, and
arm-linux-gnueabi

[1] https://sourceware.org/pipermail/libc-help/2024-January/006557.html
[2] https://sourceware.org/glibc/wiki/Y2038ProofnessDesign

Reviewed-by: DJ Delorie <dj@redhat.com>
(cherry picked from commit a4ed0471d71739928a0d0fa3258b3ff3b158e9b9)

socket: Use may_alias on sockaddr structs (bug 19622)

This supports common coding patterns. The GCC C front end before
version 7 rejects the may_alias attribute on a struct definition
if it was not present in a previous forward declaration, so this
attribute can only be conditionally applied.

This implements the spirit of the change in Austin Group issue 1641.

Suggested-by: Marek Polacek <polacek@redhat.com>
Suggested-by: Jakub Jelinek <jakub@redhat.com>
Reviewed-by: Sam James <sam@gentoo.org>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit 8d7b6b4cb27d4dec1dd5f7960298c1699275f962)

parse_fdinfo: Don't advance pointer twice [BZ #31798]

pidfd_getpid.c has

      /* Ignore invalid large values.  */
      if (INT_MULTIPLY_WRAPV (10, n, &n)
          || INT_ADD_WRAPV (n, *l++ - '0', &n))
        return -1;

For GCC older than GCC 7, INT_ADD_WRAPV(a, b, r) is defined as

   _GL_INT_OP_WRAPV (a, b, r, +, _GL_INT_ADD_RANGE_OVERFLOW)

and *l++ - '0' is evaluated twice.  Fix BZ #31798 by moving "l++" out of
the if statement.  Tested with GCC 6.4 and GCC 14.1.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit f981bf6b9db87e0732b46bfe92fdad4d363225e8)