git.ipfire.org Git - thirdparty/glibc.git/log

elf: Fix UB on _dl_map_object_from_fd

On 32-bit architectures ubsan triggers:

UBSAN: Undefined behaviour in dl-load.c:1345:54 pointer index expression with base 0x00612508 overflowed to 0xf7c3a508

Use explicit uintptr_t arithmetic instead.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
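
As an illustration of the idea (a sketch only, not the dl-load.c code; the
helper name add_load_bias is made up for this example), the offset can be
applied in integer space, where unsigned wraparound is well defined, and
converted back to a pointer only at the point of use:

```c
#include <stdint.h>

/* Hypothetical helper: apply a load bias to an address.  Doing the
   same addition on a 'char *' is undefined once the result leaves the
   underlying object, which is what ubsan reports.  Unsigned integer
   arithmetic wraps modulo 2^N instead.  */
static inline uintptr_t
add_load_bias (uintptr_t addr, uintptr_t bias)
{
  return addr + bias;
}
```

A caller would cast the result, for example (void *) add_load_bias (a, b),
only when the final address is needed.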

argp: Fix shift bug

From gnulib commits 06094e390b0 and 88033d3779362a.
Reviewed-by: Florian Weimer <fweimer@redhat.com>

math: Remove i386 ilogb/ilogbf/llogb/llogbf

The new float and double implementations do not require an
extra function call, and error handling uses the math_err functions,
which results in better performance on i386 as well.

With gcc-14 on an AMD Ryzen 9 5900X, master shows:

$ ./benchtests/bench-ilogb
  "ilogb": {
   "subnormal": {
    "duration": 3.68863e+09,
    "iterations": 1.72228e+08,
    "max": 89.2995,
    "min": 21.016,
    "mean": 21.4171
   },
   "normal": {
    "duration": 3.68878e+09,
    "iterations": 1.72948e+08,
    "max": 78.6065,
    "min": 21.127,
    "mean": 21.3288
   }
  }
$ ./benchtests/bench-ilogbf
  "ilogbf": {
   "subnormal": {
    "duration": 3.68835e+09,
    "iterations": 1.66716e+08,
    "max": 46.953,
    "min": 21.793,
    "mean": 22.1236
   },
   "normal": {
    "duration": 3.68784e+09,
    "iterations": 1.66168e+08,
    "max": 46.9715,
    "min": 21.904,
    "mean": 22.1935
   }
  }

While with this patch:

$ ./benchtests/bench-ilogb
  "ilogb": {
   "subnormal": {
    "duration": 3.68134e+09,
    "iterations": 4.17516e+08,
    "max": 32.5045,
    "min": 8.3245,
    "mean": 8.81723
   },
   "normal": {
    "duration": 3.6677e+09,
    "iterations": 6.79468e+08,
    "max": 50.9305,
    "min": 5.3465,
    "mean": 5.3979
   }
  }
$ ./benchtests/bench-ilogbf
  "ilogbf": {
   "subnormal": {
    "duration": 3.67553e+09,
    "iterations": 5.11032e+08,
    "max": 35.927,
    "min": 7.0485,
    "mean": 7.19237
   },
   "normal": {
    "duration": 3.66877e+09,
    "iterations": 6.556e+08,
    "max": 26.3625,
    "min": 5.5315,
    "mean": 5.59605
   }
  }

Checked on i686-linux-gnu.

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>

math: Optimize float ilogb/llogb

It removes the wrapper by moving the error/EDOM handling to an
out-of-line implementation (__math_invalidf_i/__math_invalidf_li).
Also, __glibc_unlikely is used on the error cases since it helps
code generation on recent gcc.

With gcc-14 on aarch64, the code now builds to:

0000000000000000 <__ilogbf>:
   0:   1e260000        fmov    w0, s0
   4:   d3577801        ubfx    x1, x0, #23, #8
   8:   340000e1        cbz     w1, 24 <__ilogbf+0x24>
   c:   5101fc20        sub     w0, w1, #0x7f
  10:   7103fc3f        cmp     w1, #0xff
  14:   54000040        b.eq    1c <__ilogbf+0x1c>  // b.none
  18:   d65f03c0        ret
  1c:   12b00000        mov     w0, #0x7fffffff                 // #2147483647
  20:   14000000        b       0 <__math_invalidf_i>
  24:   53175800        lsl     w0, w0, #9
  28:   340000a0        cbz     w0, 3c <__ilogbf+0x3c>
  2c:   5ac01000        clz     w0, w0
  30:   12800fc1        mov     w1, #0xffffff81                 // #-127
  34:   4b000020        sub     w0, w1, w0
  38:   d65f03c0        ret
  3c:   320107e0        mov     w0, #0x80000001                 // #-2147483647
  40:   14000000        b       0 <__math_invalidf_i>

Some ABIs require additional adjustments:

  * i386 and m68k need to use the template version, since
    both provide __ieee754_ilogb implementations.

  * loongarch uses a custom implementation as well.

  * powerpc64le also has a custom implementation for POWER9, which
    is also used for the float and float128 versions.  The generic
    e_ilogb.c implementation is moved on powerpc to keep the
    current code as-is.

Checked on aarch64-linux-gnu and x86_64-linux-gnu.

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
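
A minimal sketch of the structure described above, assuming IEEE binary32
and a 32-bit unsigned int.  The names sketch_ilogbf and sketch_invalid_i
are made up, the helper only sets errno (it stands in for the
glibc-internal __math_invalidf_i and is not its implementation), and the
special return values follow the aarch64 listing:

```c
#include <errno.h>
#include <limits.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical out-of-line error helper.  Keeping it out of line lets
   the common path return without setting up a stack frame.  */
__attribute__ ((noinline, cold)) static int
sketch_invalid_i (int result)
{
  errno = EDOM;
  return result;
}

int
sketch_ilogbf (float x)
{
  uint32_t ix;
  memcpy (&ix, &x, sizeof ix);          /* Bit pattern of x.  */
  uint32_t ex = (ix >> 23) & 0xff;      /* Biased exponent.  */

  if (__builtin_expect (ex == 0, 0))
    {
      /* Zero or subnormal: drop sign and exponent bits.  */
      uint32_t frac = ix << 9;
      if (frac == 0)
        return sketch_invalid_i (-INT_MAX);
      return -127 - __builtin_clz (frac);
    }
  if (__builtin_expect (ex == 0xff, 0))
    return sketch_invalid_i (INT_MAX);  /* Infinity or NaN.  */
  return (int) ex - 127;
}
```

Both error paths end in a call whose result is returned directly, so the
compiler is free to emit them as tail calls, as in the listing above.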

math: Remove UB and optimize double ilogbf

The subnormal exponent calculation invokes UB by left shifting the
signed exponent to find the first leading bit.

The patch reimplements ilogb using the math_config.h macros and
uses the new stdbit.h function to simplify the subnormal handling.

On aarch64 it generates better code:

* master:

0000000000000000 <__ieee754_ilogbf>:
   0:   1e260000        fmov    w0, s0
   4:   12007801        and     w1, w0, #0x7fffffff
   8:   72091c1f        tst     w0, #0x7f800000
   c:   54000141        b.ne    34 <__ieee754_ilogbf+0x34>  // b.any
  10:   34000201        cbz     w1, 50 <__ieee754_ilogbf+0x50>
  14:   53185c21        lsl     w1, w1, #8
  18:   12800fa0        mov     w0, #0xffffff82                 // #-126
  1c:   d503201f        nop
  20:   531f7821        lsl     w1, w1, #1
  24:   51000400        sub     w0, w0, #0x1
  28:   7100003f        cmp     w1, #0x0
  2c:   54ffffac        b.gt    20 <__ieee754_ilogbf+0x20>
  30:   d65f03c0        ret
  34:   13177c20        asr     w0, w1, #23
  38:   12b01002        mov     w2, #0x7f7fffff                 // #2139095039
  3c:   5101fc00        sub     w0, w0, #0x7f
  40:   6b02003f        cmp     w1, w2
  44:   12b00001        mov     w1, #0x7fffffff                 // #2147483647
  48:   1a819000        csel    w0, w0, w1, ls  // ls = plast
  4c:   d65f03c0        ret
  50:   320107e0        mov     w0, #0x80000001                 // #-2147483647
  54:   d65f03c0        ret

* patch:

0000000000000000 <__ieee754_ilogbf>:
   0:   1e260001        fmov    w1, s0
   4:   d3577820        ubfx    x0, x1, #23, #8
   8:   350000e0        cbnz    w0, 24 <__ieee754_ilogbf+0x24>
   c:   53175821        lsl     w1, w1, #9
  10:   34000141        cbz     w1, 38 <__ieee754_ilogbf+0x38>
  14:   5ac01021        clz     w1, w1
  18:   12800fc0        mov     w0, #0xffffff81                 // #-127
  1c:   4b010000        sub     w0, w0, w1
  20:   d65f03c0        ret
  24:   7103fc1f        cmp     w0, #0xff
  28:   5101fc00        sub     w0, w0, #0x7f
  2c:   12b00001        mov     w1, #0x7fffffff                 // #2147483647
  30:   1a811000        csel    w0, w0, w1, ne  // ne = any
  34:   d65f03c0        ret
  38:   320107e0        mov     w0, #0x80000001                 // #-2147483647
  3c:   d65f03c0        ret

Other architectures with support for stdc_leading_zeros and/or
__builtin_clzll should have similar improvements.

Checked on aarch64-linux-gnu and x86_64-linux-gnu.

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
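
To make the subnormal handling concrete, here is a sketch comparing a
shift loop with a count-leading-zeros form.  The function names are made
up, both take the nonzero 23-bit mantissa field of a subnormal float, and
__builtin_clz is used in place of the stdbit.h function so that the
example also builds with older toolchains.  The loop is written with
unsigned arithmetic so that the shifts are well defined:

```c
#include <stdint.h>

/* Loop form: shift until the leading mantissa bit reaches bit 31.
   MANT must be nonzero and below 2^23.  */
int
subnormal_exp_loop (uint32_t mant)
{
  int e = -126;
  for (uint32_t m = mant << 8; (m & 0x80000000u) == 0; m <<= 1)
    e--;
  return e;
}

/* Count-leading-zeros form: same result without a loop.  */
int
subnormal_exp_clz (uint32_t mant)
{
  return -127 - __builtin_clz (mant << 9);
}
```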

math: Optimize double ilogb/llogb

It removes the wrapper by moving the error/EDOM handling to an
out-of-line implementation (__math_invalid_i/__math_invalid_li).
Also, __glibc_unlikely is used on the error cases since it helps
code generation on recent gcc.

With gcc-14 on aarch64, the code now builds to:

0000000000000000 <__ilogb>:
   0:   9e660000        fmov    x0, d0
   4:   d374f801        ubfx    x1, x0, #52, #11
   8:   340000e1        cbz     w1, 24 <__ilogb+0x24>
   c:   510ffc20        sub     w0, w1, #0x3ff
  10:   711ffc3f        cmp     w1, #0x7ff
  14:   54000040        b.eq    1c <__ilogb+0x1c>  // b.none
  18:   d65f03c0        ret
  1c:   12b00000        mov     w0, #0x7fffffff                 // #2147483647
  20:   14000000        b       0 <__math_invalid_i>
  24:   d374cc00        lsl     x0, x0, #12
  28:   b40000a0        cbz     x0, 3c <__ilogb+0x3c>
  2c:   dac01000        clz     x0, x0
  30:   12807fc1        mov     w1, #0xfffffc01                 // #-1023
  34:   4b000020        sub     w0, w1, w0
  38:   d65f03c0        ret
  3c:   320107e0        mov     w0, #0x80000001                 // #-2147483647
  40:   14000000        b       0 <__math_invalid_i>

Some ABIs require additional adjustments:

  * i386 and m68k need to use the template version, since
    both provide __ieee754_ilogb implementations.

  * loongarch uses a custom implementation as well.

  * powerpc64le also has a custom implementation for POWER9, which
    is also used for the float and float128 versions.  The generic
    e_ilogb.c implementation is moved on powerpc to keep the
    current code as-is.

Checked on aarch64-linux-gnu and x86_64-linux-gnu.

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>

math: Remove UB and optimize double ilogb

The subnormal exponent calculation invokes UB by left shifting the
signed exponent to find the first leading bit.  The implementation
also uses 32-bit operations, which generate suboptimal code on
64-bit architectures.

The patch reimplements ilogb using the math_config.h macros and
uses the new stdbit function to simplify the subnormal handling.

On aarch64 it generates better code:

* master:

0000000000000000 <__ieee754_ilogb>:
   0:   9e660000        fmov    x0, d0
   4:   d360fc02        lsr     x2, x0, #32
   8:   d360f801        ubfx    x1, x0, #32, #31
   c:   f26c285f        tst     x2, #0x7ff00000
  10:   540001a1        b.ne    44 <__ieee754_ilogb+0x44>  // b.any
  14:   2a000022        orr     w2, w1, w0
  18:   34000322        cbz     w2, 7c <__ieee754_ilogb+0x7c>
  1c:   35000221        cbnz    w1, 60 <__ieee754_ilogb+0x60>
  20:   2a0003e1        mov     w1, w0
  24:   7100001f        cmp     w0, #0x0
  28:   12808240        mov     w0, #0xfffffbed                 // #-1043
  2c:   540000ad        b.le    40 <__ieee754_ilogb+0x40>
  30:   531f7821        lsl     w1, w1, #1
  34:   51000400        sub     w0, w0, #0x1
  38:   7100003f        cmp     w1, #0x0
  3c:   54ffffac        b.gt    30 <__ieee754_ilogb+0x30>
  40:   d65f03c0        ret
  44:   13147c20        asr     w0, w1, #20
  48:   12b00202        mov     w2, #0x7fefffff                 // #2146435071
  4c:   510ffc00        sub     w0, w0, #0x3ff
  50:   6b02003f        cmp     w1, w2
  54:   12b00001        mov     w1, #0x7fffffff                 // #2147483647
  58:   1a819000        csel    w0, w0, w1, ls  // ls = plast
  5c:   d65f03c0        ret
  60:   53155021        lsl     w1, w1, #11
  64:   12807fa0        mov     w0, #0xfffffc02                 // #-1022
  68:   531f7821        lsl     w1, w1, #1
  6c:   51000400        sub     w0, w0, #0x1
  70:   7100003f        cmp     w1, #0x0
  74:   54ffffac        b.gt    68 <__ieee754_ilogb+0x68>
  78:   d65f03c0        ret
  7c:   320107e0        mov     w0, #0x80000001                 // #-2147483647
  80:   d65f03c0        ret

* patch:

0000000000000000 <__ieee754_ilogb>:
   0:   9e660001        fmov    x1, d0
   4:   d374f820        ubfx    x0, x1, #52, #11
   8:   350000e0        cbnz    w0, 24 <__ieee754_ilogb+0x24>
   c:   d374cc21        lsl     x1, x1, #12
  10:   b4000141        cbz     x1, 38 <__ieee754_ilogb+0x38>
  14:   dac01021        clz     x1, x1
  18:   12807fc0        mov     w0, #0xfffffc01                 // #-1023
  1c:   4b010000        sub     w0, w0, w1
  20:   d65f03c0        ret
  24:   711ffc1f        cmp     w0, #0x7ff
  28:   510ffc00        sub     w0, w0, #0x3ff
  2c:   12b00001        mov     w1, #0x7fffffff                 // #2147483647
  30:   1a811000        csel    w0, w0, w1, ne  // ne = any
  34:   d65f03c0        ret
  38:   320107e0        mov     w0, #0x80000001                 // #-2147483647
  3c:   d65f03c0        ret

Other architectures with support for stdc_leading_zeros and/or
__builtin_clzll should have similar improvements.

Checked on aarch64-linux-gnu and x86_64-linux-gnu.

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
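
The 64-bit approach can be sketched as follows, assuming IEEE binary64.
The name sketch_ilogb_bits is made up, errno handling is left out, and
the values returned for zero and for infinity/NaN follow the aarch64
listing rather than the FP_ILOGB0/FP_ILOGBNAN macros of a given port:

```c
#include <limits.h>
#include <stdint.h>
#include <string.h>

int
sketch_ilogb_bits (double x)
{
  uint64_t ix;
  memcpy (&ix, &x, sizeof ix);           /* One 64-bit load of the bits.  */
  int ex = (int) ((ix >> 52) & 0x7ff);   /* Biased exponent.  */

  if (ex == 0)
    {
      /* Zero or subnormal: drop the sign and exponent bits.  */
      uint64_t frac = ix << 12;
      if (frac == 0)
        return -INT_MAX;
      return -1023 - __builtin_clzll (frac);
    }
  if (ex == 0x7ff)
    return INT_MAX;                      /* Infinity or NaN.  */
  return ex - 1023;
}
```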

manual: Correct return value description of 'clock_nanosleep'

Commit 1a3d8f2201d4d613401ce5be9a283f4f28c43093 incorrectly described
'clock_nanosleep' as having the same return values as 'nanosleep'. Fix
this, clarifying that 'clock_nanosleep' returns a positive error number
upon failure instead of setting 'errno'. Also clarify that 'nanosleep'
returns '-1' upon error.

Fixes: 1a3d8f2201d4d613401ce5be9a283f4f28c43093
Reported-by: Mark Harris <mark.hsj@gmail.com>
Reviewed-by: Mark Harris <mark.hsj@gmail.com>
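
The difference between the two conventions can be shown with a short
sketch.  The wrapper names are made up for this example; both wrappers
report 0 on success and a positive error number on failure:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stddef.h>
#include <time.h>

/* clock_nanosleep returns the error number directly and does not
   set errno.  */
int
sleep_on_clock (clockid_t clock, long nanoseconds)
{
  struct timespec req = { .tv_sec = 0, .tv_nsec = nanoseconds };
  return clock_nanosleep (clock, 0, &req, NULL);
}

/* nanosleep returns -1 on error and stores the error number in errno.  */
int
sleep_plain (long nanoseconds)
{
  struct timespec req = { .tv_sec = 0, .tv_nsec = nanoseconds };
  if (nanosleep (&req, NULL) == -1)
    return errno;
  return 0;
}
```

Passing a tv_nsec value outside the range 0 to 999999999 makes both
functions fail with EINVAL, which is an easy way to observe the two
reporting styles.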

nss: free dynarray buffer after parsing nsswitch.conf

Resolves: swbz 31791

Reviewed-by: Collin Funk <collin.funk1@gmail.com>

manual: Document clock_nanosleep

Make minor clarifications in the documentation for 'nanosleep' and add
an entry for 'clock_nanosleep' as a generalized variant of the former
function that allows clock selection.
Reviewed-by: Maciej W. Rozycki <macro@redhat.com>

manual: Fix invalid 'illegal' usage with 'nanosleep'

The GNU Coding Standards demand that 'illegal' only be used to refer to
activities prohibited by law. Replace it with 'invalid' accordingly in
the description of the EINVAL error condition for 'nanosleep'.

manual: Fix duplicate 'consult' erratum

Remove 'consult' duplication appearing in Extensible Scheduling section.

localedata: Correct Persian collation rules description

Fix an erratum in the Persian locale claiming that the CLDR collation
rules referred to are for Ukrainian.

stdio-common: Correct 'sscanf' test feature wrapper description

Fix a typo in the description, making the wrapper correctly refer to
'sscanf' rather than 'scanf' being tested.

manual: Document error codes missing for 'inet_ntop'

Add documentation for EAFNOSUPPORT and ENOSPC error codes returned, and
the return value on failure.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
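
A small sketch of how a caller can observe these error codes.  The
function name ntop_error and the fixed loopback address are choices made
for this example only:

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>

/* Format a fixed address for family AF into a buffer of BUFLEN bytes.
   Returns 0 on success, or the errno value stored by inet_ntop when
   it fails and returns a null pointer.  */
int
ntop_error (int af, size_t buflen)
{
  const unsigned char addr[16] = { 127, 0, 0, 1 };
  char buf[INET6_ADDRSTRLEN];

  if (buflen > sizeof buf)
    buflen = sizeof buf;
  if (inet_ntop (af, addr, buf, (socklen_t) buflen) == NULL)
    return errno;
  return 0;
}
```

With this helper, a buffer too small for "127.0.0.1" yields ENOSPC, and
a family other than AF_INET or AF_INET6 yields EAFNOSUPPORT.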

manual: Document error codes missing for 'socket'

Add missing EAFNOSUPPORT, ESOCKTNOSUPPORT, EPROTOTYPE, EINVAL, EPERM,
and ENOMEM error codes, and adjust existing descriptions accordingly.

On Linux either ENOBUFS or ENOMEM is returned in the case of a memory
allocation failure, depending on the namespace requested, e.g. AF_INET
returns ENOMEM while AF_INET6 returns ENOBUFS, so document these codes
as alternatives.

Similarly EPERM is returned rather than EACCES on Linux, so document
these codes as alternatives as well. We might want to convert EPERM to
EACCES for POSIX compliance, but it is beyond the scope of this change,
and software has to expect either anyway, owing to the long-established
practice.

Finally ESOCKTNOSUPPORT is returned rather than EPROTONOSUPPORT for an
unsupported style except for the AF_QIPCRTR namespace where EPROTOTYPE
is used, so document these codes as alternatives too.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
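
Since software has to expect either code of each pair, a caller on Linux
might group the alternatives as in the following sketch.  The helper
names are made up, and the grouping simply mirrors the pairs listed
above:

```c
#include <errno.h>
#include <stdbool.h>

/* Memory allocation failure: ENOBUFS or ENOMEM depending on the
   namespace.  */
bool
socket_errno_is_no_memory (int err)
{
  return err == ENOBUFS || err == ENOMEM;
}

/* Permission failure: EACCES per POSIX, EPERM on Linux.  */
bool
socket_errno_is_permission (int err)
{
  return err == EACCES || err == EPERM;
}

/* Unsupported style: EPROTONOSUPPORT, ESOCKTNOSUPPORT, or EPROTOTYPE
   (the latter for the AF_QIPCRTR namespace).  */
bool
socket_errno_is_bad_style (int err)
{
  return (err == EPROTONOSUPPORT || err == ESOCKTNOSUPPORT
          || err == EPROTOTYPE);
}
```

A caller would pass the errno value saved right after socket returned -1.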

stdio-common: Consistently use 'num_digits_len' in 'vfscanf'

Use the 'num_digits_len' enumeration constant in the only place where 10
is referred to literally as a digit index in i18n handling for decimal
integers. No change in code produced.

Reviewed-by: Arjun Shankar <arjun@redhat.com>

Update syscall lists for Linux 6.15

Linux 6.15 adds the new syscall open_tree_attr. Update
syscall-names.list and regenerate the arch-syscall.h headers with
build-many-glibcs.py update-syscalls.

Tested with build-many-glibcs.py.

Reviewed-by: Florian Weimer <fweimer@redhat.com>

AArch64: Improve enabling of SVE for libmvec

When using a -mcpu option in CFLAGS, GCC can report errors when building libmvec.
Fix this by overriding both -mcpu and -march with a generic variant with SVE added.
Also use a tune for a modern SVE core.

Reviewed-by: Yury Khrustalev <yury.khrustalev@arm.com>

AArch64: Improve codegen in SVE log1p

Improve memory access and reformat the evaluation scheme to pack coefficients.
This gives a 5% improvement in the throughput microbenchmark on Neoverse V1.

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>

Use Linux 6.15 in build-many-glibcs.py

Tested with build-many-glibcs.py (host-libraries, compilers and glibcs
builds).

Reviewed-by: Florian Weimer <fweimer@redhat.com>

manual: mention PKEY_UNRESTRICTED macro in the manual

Also use this macro in one of the examples.

Reviewed-by: Florian Weimer <fweimer@redhat.com>

linux: use PKEY_UNRESTRICTED macro in tst-pkey

Reviewed-by: Florian Weimer <fweimer@redhat.com>

misc: add PKEY_UNRESTRICTED macro

A corresponding macro has been added to Linux UAPI headers in 6.15.

Reviewed-by: Florian Weimer <fweimer@redhat.com>

generic: Add missing parameter name to __getrandom_early_init

This is required after commit 03da41d47dc73674307e6ffc5b75e9043febc698
("Turn on -Wmissing-parameter-name by default if available").

Reviewed-by: Sam James <sam@gentoo.org>

hurd: Avoid -Wfree-labels warning in _hurd_intr_rpc_mach_msg

This is required after commit 4f4c4fcde76aedc1f5362a51d98ebb57a28fbce9
("Turn on -Wfree-labels by default if available").

Reviewed-by: Sam James <sam@gentoo.org>

Update RISC-V relocations

Update the list of RISC-V relocations from the ELF psABI as of June 2024.
It removes binutils-internal only relocations that were never part of
actual object files. The GNU_VTINHERIT and GNU_VTENTRY relocations were
never used because the corresponding GCC option -fvtable-gc was never
supported on RISC-V.

Use -std=gnu17 in build-many-glibcs.py when configuring GMP

This works around incompatibility of GMP 6.3.0 with GCC 15 (defaulting
to C23) following an approach suggested by Florian.

Tested with build-many-glibcs.py (host-libraries build only).

Reviewed-by: Florian Weimer <fweimer@redhat.com>

malloc: Fix malloc init order

__ptmalloc_init was called too early in __libc_early_init: it uses
__libc_initial which is not set yet. Fix this by moving initialization
to the end of __libc_early_init.

Reviewed-by: Florian Weimer <fweimer@redhat.com>

Move C warning flags from +gccwarn to +gccwarn-c

This avoids warnings about these options during the C++ header
inclusion tests.

Reviewed-by: Sam James <sam@gentoo.org>

doc: Add missing space in documentation of __TIMESIZE

doc: Fix typos in documentation of _TIME_BITS

Fix comment typo in libc-symbols.h

Reviewed-by: Sam James <sam@gentoo.org>

Turn on -Wmissing-parameter-name by default if available

This flags another hazard for backporting changes to earlier branches.

Reviewed-by: Sam James <sam@gentoo.org>
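
For illustration, and assuming the warning covers function definitions
that omit a parameter name (a form accepted in C23 but not by earlier
language versions), the backport-friendly spelling names the parameter
and marks it as unused.  The function below is made up for this example:

```c
/* Portable form: the unused parameter still has a name.  In C23 the
   definition could be written as 'callback_ignoring_arg (int)', which
   compilers for older language versions reject.  */
int
callback_ignoring_arg (int unused_flags)
{
  (void) unused_flags;
  return 0;
}
```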

manual: Document getopt_long_only with single letter options (bug 32980)

Signed-off-by: Tomas Volf <~@wolfsden.cz>
Reviewed-by: Florian Weimer <fweimer@redhat.com>

Turn on -Wfree-labels by default if available

This flags a hazard for backporting changes to earlier branches.

Reviewed-by: Sam James <sam@gentoo.org>
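
For illustration, and assuming the warning covers labels that are not
followed by a statement (for example at the end of a compound statement,
which C23 allows), the form that also builds with older compilers puts a
null statement after the label.  The function below is made up for this
example:

```c
/* Count the strictly positive elements of V[0..N-1].  */
int
count_positive (const int *v, int n)
{
  int count = 0;
  for (int i = 0; i < n; i++)
    {
      if (v[i] <= 0)
        goto next;
      count++;
    next:
      ;  /* Null statement: before C23 a label must label a statement.  */
    }
  return count;
}
```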

S390: Use cfi_val_offset instead of cfi_escape. 31bit part

Due to raising the minimum binutils version to version >=2.28,
the used cfi_escape for cfi_val_offset can now be omitted.

The commit 0fc76d876261ee8253fef198ffec48c832edd4ff
has already adjusted it for the 64bit part of mcount.
This patch also adjusts it for the 31bit part of mcount.

Checked with "objdump -WF" / "objdump -Wf" that the previous
cfi_escape and the new cfi_val_offset are equal.

libmvec: Add inputs for asinpi(f), acospi(f), atanpi(f) and atan2pi(f)

Add initial inputs for asinpi(f), acospi(f), atanpi(f) and atan2pi(f) based
on existing asin/acos/atan inputs.

Benchtests now work on the new libmvec functions.

Reviewed-by: Yury Khrustalev <yury.khrustalev@arm.com>

INSTALL: Regenerate with texinfo 7.2

This fixes make dist on systems with the latest texinfo installed.
GNU texinfo 7.2 changes @xrefs into proper plain text sentences instead
of pseudo info references.

Tested-By: Collin Funk <collin.funk1@gmail.com>
Reviewed-by: Andreas K. Huettel <dilfridge@gentoo.org>

Fix error reporting (false negatives) in SGID tests

And simplify the interface of support_capture_subprogram_self_sgid.

Use the existing framework for temporary directories (now with
mode 0700) and directory/file deletion.  Handle all execution
errors within support_capture_subprogram_self_sgid.  In particular,
this includes test failures because the invoked program did not
exit with exit status zero.  Existing tests that expect exit
status 42 are adjusted to use zero instead.

In addition, fix callers not to call exit (0) with test failures
pending (which may mask them, especially when running with --direct).

Fixes commit 35fc356fa3b4f485bd3ba3114c9f774e5df7d3c2
("elf: Fix subprocess status handling for tst-dlopen-sgid (bug 32987)").

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

manual: Use more inclusive language in comments.

Reviewed-by: Florian Weimer <fweimer@redhat.com>

Makerules: Use 'original' instead of 'master' in source.

Use more inclusive language in makefile source.
Reviewed-by: Florian Weimer <fweimer@redhat.com>

gen-libm-test: Use 'original source' instead of 'master' in code.

Use more inclusive language in generated sources.
Reviewed-by: Florian Weimer <fweimer@redhat.com>

nss_test1: Use 'parametrized template' instead of 'master' in comment.

Use more inclusive language in code comments.
Reviewed-by: Florian Weimer <fweimer@redhat.com>

linknamespace: Use 'ALLOWLIST' instead of 'WHITELIST' in code.

Use more inclusive language in code.
Reviewed-by: Florian Weimer <fweimer@redhat.com>

posix: Use more inclusive language in test data.

Remove Changelog entries that use 'blacklist' or 'master' in the
test data. The test data still contains enough accented characters
to serve the purposes of the posix/tst-regex.c test.
Reviewed-by: Florian Weimer <fweimer@redhat.com>

pylintrc: Remove obsolete ignore section and comments.

Remove the obsolete ignore=CVS since we use git now.

We make the code more inclusive by removing obsolete comments.
Reviewed-by: Florian Weimer <fweimer@redhat.com>

support: Pick group in support_capture_subprogram_self_sgid if UID == 0

When running as root, it is likely that we can run under any group.
Pick a harmless group from /etc/group in this case.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

ldbl-128: also disable lgammaf128_r builtin when building lgammal_r

elf: Fix subprocess status handling for tst-dlopen-sgid (bug 32987)

This should really move into support_capture_subprogram_self_sgid.

Reviewed-by: Sam James <sam@gentoo.org>

x86_64: Fix typo in ifunc-impl-list.c.

Fix wcsncpy and wcpncpy typo in ifunc-impl-list.c.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

elf: Test case for bug 32976 (CVE-2025-4802)

Check that LD_LIBRARY_PATH is ignored for AT_SECURE statically
linked binaries, using support_capture_subprogram_self_sgid.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

support: Use const char * argument in support_capture_subprogram_self_sgid

The function does not modify the passed-in string, so make this clear
via the prototype.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

AArch64: Fix typo in math-vector.h

Fix typo atanpi2->atan2pi in math-vector.h.

Fix typos in ldbl-opt makefile

The -fno-builtin options need to disable the long double builtins.

AArch64: Cleanup SVE config and defines

Now we finally support modern GCC and binutils, it's time for a cleanup.
Remove HAVE_AARCH64_SVE_ASM define and conditional compilation. Remove SVE
configure checks for SVE, ACLE and variant-PCS support.

Reviewed-by: Yury Khrustalev <yury.khrustalev@arm.com>

AArch64: Cleanup PAC and BTI

Now we finally support modern GCC and binutils, it's time for a cleanup.
Use PAC and BTI instructions unconditionally and use proper assembler syntax.
Remove the PR target/94791 strip_pac workarounds for buggy GCCs. Remove the
PAC/BTI configure checks - always emit GNU property notes on assembly files.
Change cfi_window_save to the correct cfi_negate_ra_state unwind directive.

Reviewed-by: Matthieu Longo <matthieu.longo@arm.com>

AArch64: Implement AdvSIMD and SVE atan2pi/f

Implement double and single precision variants of the C23 routine atan2pi
for both AdvSIMD and SVE.

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>

AArch64: Implement AdvSIMD and SVE atanpi/f

Implement double and single precision variants of the C23 routine atanpi
for both AdvSIMD and SVE.

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>

AArch64: Implement AdvSIMD and SVE asinpi/f

Implement double and single precision variants of the C23 routine asinpi
for both AdvSIMD and SVE.

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>

AArch64: Implement AdvSIMD and SVE acospi/f

Implement double and single precision variants of the C23 routine acospi
for both AdvSIMD and SVE.

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>

AArch64: Optimize inverse trig functions

Improve performance of inverse trig functions by altering how coefficients are
loaded.

Performance improvement on Neoverse V1:
SVE     acos   14%
AdvSIMD acos   6%

AdvSIMD asin   6%
SVE     asin   5%
AdvSIMD asinf  2%

AdvSIMD atanf  22%
SVE     atanf  20%
SVE     atan   11%
AdvSIMD atan   5%

SVE     atan2  7%
SVE     atan2f 4%
AdvSIMD atan2f 3%
AdvSIMD atan2  2%

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>

Document CVE-2025-4802.

This commit adds advisory data for the above CVE(s).

ctype: Fallback initialization of TLS using relocations (bug 19341, bug 32483)

This ensures that the ctype data pointers in TLS are valid
in secondary namespaces even without initialization via
__ctype_init.

Reviewed-by: Frédéric Bérat <fberat@redhat.com>

Use proper extern declaration for _nl_C_LC_CTYPE_{class,toupper,tolower}

The existing initializers already contain explicit casts. Keep them
due to int/uint32_t mismatch.

Reviewed-by: Frédéric Bérat <fberat@redhat.com>

Optimize __libc_tsd_* thread variable access

These variables are not exported, and libc.so TLS is initial-exec
anyway. Declare these variables as hidden and use the initial-exec
TLS model.

Reviewed-by: Frédéric Bérat <fberat@redhat.com>
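
As a sketch of what such a declaration looks like with GCC attributes
(the variable and function names are made up; the real variables are
internal to libc.so):

```c
/* Hidden visibility keeps the symbol out of the dynamic symbol table;
   the initial-exec model lets the compiler address the variable at a
   fixed offset from the thread pointer.  */
__attribute__ ((tls_model ("initial-exec"), visibility ("hidden")))
__thread int sketch_tsd_counter;

int
bump_counter (void)
{
  return ++sketch_tsd_counter;
}
```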

Remove <libc-tsd.h>

Use __thread variables directly instead.  The macros do not save any
typing.  It seems unlikely that a future port will lack __thread
variable support.

Some of the __libc_tsd_* variables are referenced from assembler
files, so keep their names.  Previously, <libc-tsd.h> included
<tls.h>, which in turn included <errno.h>, so a few direct includes
of <errno.h> are now required.

Reviewed-by: Frédéric Bérat <fberat@redhat.com>

manual: add sched_getcpu()

Reviewed-by: Collin Funk <collin.funk1@gmail.com>

manual: Clarifications for listing directories

Support for seeking is limited. Using the d_off and d_reclen members
of struct dirent is discouraged, especially with readdir. Concurrent
modification of directories during iteration may result in duplicate
or missing entries.
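
A minimal sketch of a directory listing loop in that spirit: it reads
entries sequentially, never seeks, and uses only the d_name member.  The
function name is made up for this example:

```c
#include <dirent.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Count the entries of PATH other than "." and "..".  Returns -1 with
   errno set if the directory cannot be opened or read.  */
long
count_dir_entries (const char *path)
{
  DIR *dir = opendir (path);
  if (dir == NULL)
    return -1;

  long count = 0;
  for (;;)
    {
      /* readdir returns NULL both at the end and on error; clearing
         errno first tells the two cases apart.  */
      errno = 0;
      struct dirent *e = readdir (dir);
      if (e == NULL)
        break;
      if (strcmp (e->d_name, ".") != 0 && strcmp (e->d_name, "..") != 0)
        count++;
    }

  int saved_errno = errno;
  closedir (dir);
  if (saved_errno != 0)
    {
      errno = saved_errno;
      return -1;
    }
  return count;
}
```

If the directory is modified while the loop runs, the count may include
duplicates or miss entries, as noted above.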

manual: add remaining CPU_* macros

Adds remaining CPU_* macros, including the CPU_*_S macros
for dynamic-sized cpu sets.

Reviewed-by: Collin Funk <collin.funk1@gmail.com>
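
A short sketch of the dynamically sized variants in use.  The function
name is made up; the macros are the ones from <sched.h> and require
_GNU_SOURCE:

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stddef.h>

/* Allocate a set with room for NCPUS CPUs, add CPU to it and return
   the number of CPUs in the set, or -1 if allocation fails.  */
int
count_after_set (int ncpus, int cpu)
{
  cpu_set_t *set = CPU_ALLOC (ncpus);
  if (set == NULL)
    return -1;

  size_t size = CPU_ALLOC_SIZE (ncpus);
  CPU_ZERO_S (size, set);
  CPU_SET_S (cpu, size, set);
  int count = CPU_ISSET_S (cpu, size, set) ? CPU_COUNT_S (size, set) : 0;

  CPU_FREE (set);
  return count;
}
```

The size passed to the _S macros is the byte size returned by
CPU_ALLOC_SIZE, not the number of CPUs.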

powerpc: Remove check for -mabi=ibmlongdouble

The -mabi=ibmlongdouble option was added in gcc 4.2, thus it can be
assumed to always exist.

aarch64: update tests for SME

Add a test that checks that ZA state is disabled after setjmp and sigsetjmp.
Update the existing SME test that uses setjmp.

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>

aarch64: Disable ZA state of SME in setjmp and sigsetjmp

Due to the nature of the ZA state, setjmp() should clear it in the
same manner as it is already done by longjmp.

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>

benchtest: malloc tcache hotpath benchtest

Existing benchtests for the malloc infrastructure are rather generic
and test overall malloc implementation performance.  This new benchtest
focuses on reducing any non-tcache-related side effects, making it
possible to predict the performance impact of tcache code changes more
realistically.  The test was inspired by the bench-[cm]alloc-thread code,
with severe simplifications:
- Forces single-thread execution, reducing concurrency side effects
   such as cache incoherence penalties due to simultaneous writes to the
   same cache pages.
- Focuses on allocating and deallocating a single size for the whole
   duration of the benchmark.  Since all it does is allocate and
   deallocate, it measures the tcache hotpath without any side effects.
- Allows the allocation size to be specified as an input argument.

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
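
The measured loop can be pictured with the following sketch.  It is not
the benchtest source; the function name and the timing approach are
assumptions made for this example:

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Time ITERS malloc/free pairs of SIZE bytes on the calling thread.
   Returns the elapsed time in nanoseconds, or -1 on failure.  */
int64_t
time_alloc_free (size_t size, long iters)
{
  struct timespec start, end;
  if (clock_gettime (CLOCK_MONOTONIC, &start) != 0)
    return -1;

  for (long i = 0; i < iters; i++)
    {
      void *p = malloc (size);
      if (p == NULL)
        return -1;
      /* Compiler barrier so the malloc/free pair is not optimized out.  */
      __asm__ volatile ("" : : "r" (p) : "memory");
      free (p);
    }

  if (clock_gettime (CLOCK_MONOTONIC, &end) != 0)
    return -1;
  return ((int64_t) (end.tv_sec - start.tv_sec) * INT64_C (1000000000)
          + (end.tv_nsec - start.tv_nsec));
}
```

With a single small size, every free is expected to put the chunk where
the next malloc finds it again, so the loop mostly exercises that path.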

Implement C23 rootn.

C23 adds various <math.h> function families originally defined in TS
18661-4.  Add the rootn functions, which compute the Yth root of X for
integer Y (with a domain error if Y is 0, even if X is a NaN).  The
integer exponent has type long long int in C23; it was intmax_t in TS
18661-4, and as with other interfaces changed after their initial
appearance in the TS, I don't think we need to support the original
version of the interface.

As with pown and compoundn, I strongly encourage searching for worst
cases for ulps error for these implementations (necessarily
non-exhaustively, given the size of the input space).  I also expect a
custom implementation for a given format could be much faster as well
as more accurate, although the implementation is simpler than those
for pown and compoundn.

This completes adding to glibc those TS 18661-4 functions (ignoring
DFP) that are included in C23.  See
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118592 regarding the C23
mathematical functions (not just the TS 18661-4 ones) missing built-in
functions in GCC, where such functions might usefully be added.

Tested for x86_64 and x86, and with build-many-glibcs.py.

malloc: Improve performance of __libc_calloc

Improve performance of __libc_calloc by splitting it into 2 parts: first handle
the tcache fastpath, then do the rest in a separate tailcalled function.
This results in significant performance gains since __libc_calloc doesn't need
to set up a frame.

On Neoverse V2, bench-calloc-simple improves by 5.0% overall.
Bench-calloc-thread 1 improves by 24%.

Reviewed-by: DJ Delorie <dj@redhat.com>

S390: Use cfi_val_offset instead of cfi_escape.

Due to raising the minimum binutils version to version >=2.28,
the used cfi_escape for cfi_val_offset can now be omitted.

Checked with "objdump -WF" / "objdump -Wf" that the previous
cfi_escape and the new cfi_val_offset are equal.

powerpc64le: Remove configure check for objcopy >= 2.26.

Due to raising the minimum binutils version to >= 2.26, the configure
check for testing support of --update-section is not needed anymore.
Reviewed-by: Peter Bergner <bergner@tenstorrent.com>

Raise the minimum binutils version to 2.39

The recent commit 27b96e069aad17cefea9437542180bff448ac3a0 raises the minimum
GCC version to 12.1 which was released in 2022.

The current minimum binutils version 2.25 was released at the end of 2014. This
patch now raises the minimum binutils version to 2.39 which was also released in 2022.

The hint for ARC is not needed anymore.

In sysdeps/[alpha|hppa|csky]/configure.ac, PIE is unsupported with this comment:
PIE builds fail on binutils 2.37 and earlier, see:
https://sourceware.org/bugzilla/show_bug.cgi?id=28672
This patch keeps PIE unsupported and lets the machine maintainers test and
enable it later.

In sysdeps/arm/configure.ac, there is a check whether TPOFF relocs with addends
are assembled correctly, which is known to be broken in binutils 2.24 and 2.25.
See: https://sourceware.org/bugzilla/show_bug.cgi?id=18383
This patch keeps the check as is and lets the machine maintainers check if it
is still required.

According to Florian Weimer:
Having at least binutils 2.38 will allow us to assume that this linker
bug is fixed:
Bug 28743 - -z relro creats holes in the process image on GNU/Linux
<https://sourceware.org/bugzilla/show_bug.cgi?id=28743>
Reviewed-by: Florian Weimer <fweimer@redhat.com>

added benchtest inputs for log2l

added benchtest inputs for expl

aarch64: fix unwinding in longjmp

Previously, longjmp() on aarch64 was using CFI directives around the
call to __libc_arm_za_disable() after CFA was redefined at the start
of longjmp(). This may result in unwinding issues. Move the call and
surrounding CFI directives to the beginning of longjmp().

Suggested-by: Wilco Dijkstra <wilco.dijkstra@arm.com>

added benchtest inputs for powl

changes in v2:
* fixed the missing Makefile entry in the first version
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

added benchtest inputs for fmal

manual: fix typo for sched_[sg]etattr

Originally added in 41a90f3f5f which says it's adding sched_getattr
and sched_setattr.

Reviewed-by: Collin Funk <collin.funk1@gmail.com>

malloc: Improve malloc initialization

Move malloc initialization to __libc_early_init. Use a hidden __ptmalloc_init
for initialization and a weak call to avoid pulling in the system malloc in a
static binary. All previous initialization checks can now be removed.

Reviewed-by: Florian Weimer <fweimer@redhat.com>

Document all CLOCK_* values

The manual documents CLOCK_REALTIME and CLOCK_MONOTONIC but not other
CLOCK_* values. Add documentation of the POSIX clocks
CLOCK_PROCESS_CPUTIME_ID and CLOCK_THREAD_CPUTIME_ID, along with a
reference to the Linux man pages for the semantics of the
Linux-specific clocks supported (as with some other functionality
coming direct from the Linux kernel where the man pages can be
considered the main documentation).

Note: CLOCK_MONOTONIC_RAW, CLOCK_REALTIME_COARSE and
CLOCK_MONOTONIC_COARSE are also defined in the toplevel bits/time.h,
as used for Hurd. Nevertheless, I see no sign that the Hurd code in
glibc actually has any support for those clocks, so I think it is
correct to document them as Linux-specific (and to refer only to the
Linux man pages for their semantics).
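
As an illustration only (not part of the manual change), a minimal sketch
of reading one of these clocks with clock_gettime. The helper name
read_clock_ns is hypothetical; only clock_gettime, struct timespec and the
CLOCK_* constants come from the interfaces discussed above.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: read CLOCK_ID and return its value in
   nanoseconds, or -1 if clock_gettime fails.  */
int64_t
read_clock_ns (clockid_t clock_id)
{
  struct timespec ts;
  if (clock_gettime (clock_id, &ts) != 0)
    return -1;
  return (int64_t) ts.tv_sec * 1000000000 + ts.tv_nsec;
}
```

Calling read_clock_ns (CLOCK_PROCESS_CPUTIME_ID) would report CPU time
consumed by the whole process, and read_clock_ns (CLOCK_THREAD_CPUTIME_ID)
CPU time consumed by the calling thread.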

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

malloc: Improved double free detection in the tcache

The previous double free detection did not account for an attacker using
a terminating null byte that overflows from the previous chunk to change
the size of a memory chunk, and with it the tcache bin the chunk is being
sorted into. The check in 'tcache_double_free_verify' would then pass
even though it is a double free.

Solution:
Let 'tcache_double_free_verify' iterate over all tcache entries to
detect double frees.
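
The idea can be sketched roughly as follows. This is a simplified model,
not the glibc code: the types sketch_entry and sketch_cache, the bin count
and both function names are assumptions made for illustration, and the real
tcache layout differs.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_BINS 4   /* Hypothetical bin count, not glibc's.  */

/* Simplified stand-in for a tcache entry: a singly linked list node.  */
struct sketch_entry
{
  struct sketch_entry *next;
};

struct sketch_cache
{
  struct sketch_entry *bins[SKETCH_BINS];
};

/* Look only in the bin selected by the (possibly corrupted) size.  */
bool
sketch_in_bin (const struct sketch_cache *cache, size_t bin,
               const struct sketch_entry *e)
{
  for (const struct sketch_entry *p = cache->bins[bin]; p != NULL;
       p = p->next)
    if (p == e)
      return true;
  return false;
}

/* Look in every bin, so a corrupted size cannot hide the entry.  */
bool
sketch_in_any_bin (const struct sketch_cache *cache,
                   const struct sketch_entry *e)
{
  for (size_t bin = 0; bin < SKETCH_BINS; bin++)
    if (sketch_in_bin (cache, bin, e))
      return true;
  return false;
}
```

If an entry already sits in bin 2 but an overwritten size makes it look
like it belongs in bin 1, sketch_in_bin misses it while sketch_in_any_bin
still finds it.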

This patch only protects against buffer overflows by one byte.
But I would argue that off-by-one errors are the most common
errors to be made.

Alternatives Considered:
  Store the size of a memory chunk in big endian and thus
  the chunk size would not get overwritten because entries in the
  tcache are not that big.

  Move the tcache_key before the actual memory chunk so that it
  does not have to be checked at all, this would work better in general
  but also it would increase the memory usage.

Signed-off-by: David Lau <david.lau@fau.de>
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>

Correct spelling mistake in test file

There are some spelling mistakes in the test file. Fix them.

Reviewed-by: guoce <guoce@kylinos.cn>
Signed-off-by: panzhe0328 <panzhe@kylinos.cn>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

hurd: Make rename refuse trailing slashes [BZ #32570]

As tested by Gnulib's renameatu module.

Reported by Collin Funk on
https://sourceware.org/bugzilla/show_bug.cgi?id=32570

Implement C23 compoundn

C23 adds various <math.h> function families originally defined in TS
18661-4.  Add the compoundn functions, which compute (1+X) to the
power Y for integer Y (and X at least -1).  The integer exponent has
type long long int in C23; it was intmax_t in TS 18661-4, and as with
other interfaces changed after their initial appearance in the TS, I
don't think we need to support the original version of the interface.
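
To make the definition concrete, here is a naive sketch of what the
function computes, using exponentiation by squaring. It is not the glibc
implementation: the name compoundn_sketch is hypothetical, rounding errors
from 1+X and from each multiplication accumulate, and domain and range
errors (such as X below -1) are not handled.

```c
/* Naive illustration of (1 + x) raised to the integer power n.
   Not correctly rounded, and no error handling.  */
double
compoundn_sketch (double x, long long int n)
{
  double base = 1.0 + x;
  /* Magnitude of n computed in unsigned arithmetic, so that the most
     negative value does not overflow.  */
  unsigned long long int e
    = n < 0 ? 0ULL - (unsigned long long int) n : (unsigned long long int) n;
  double result = 1.0;

  while (e != 0)
    {
      if (e & 1)
        result *= base;
      base *= base;
      e >>= 1;
    }
  return n < 0 ? 1.0 / result : result;
}
```

For example, compoundn_sketch (0.5, 2) multiplies 1.5 by itself, and a
negative exponent yields the reciprocal of the corresponding positive
power.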

Note that these functions are "compoundn" with a trailing "n", *not*
"compound" (CORE-MATH has the wrong name, for example).

As with pown, I strongly encourage searching for worst cases for ulps
error for these implementations (necessarily non-exhaustively, given
the size of the input space).  I also expect a custom implementation
for a given format could be much faster as well as more accurate (I
haven't tested or benchmarked the CORE-MATH implementation for
binary32); this is one of the more complicated and less efficient
functions to implement in a type-generic way.

As with exp2m1 and exp10m1, this showed up places where the
powerpc64le IFUNC setup is not as self-contained as one might hope (in
this case, without the changes specific to powerpc64le, there were
undefined references to __GI___expf128).

Tested for x86_64 and x86, and with build-many-glibcs.py.

hurd: Fix tst-stack2 test build on Hurd

It requires $(shared-thread-library). Fixes 0c342594237.

Checked on a i686-gnu build.

nss: remove undefined behavior and optimize getaddrinfo

On x86-64, when compiling with -O2, stdc_leading_zeros compiles to the
bsr instruction. The fls function removed by this patch is inlined
but still loops while checking each bit individually.

* nss/getaddrinfo.c: Include <stdbit.h>.
(fls): Remove function. This function contains a left shift of 31 on an
'int' which is undefined.
(rfc3484_sort): Use stdc_leading_zeros instead of fls.
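
A rough sketch of the difference, under stated assumptions: both function
names are hypothetical, the loop is written in the style described above
rather than copied from getaddrinfo.c, and the GCC/Clang builtin
__builtin_clz stands in for stdc_leading_zeros so that the example does not
need a C23 <stdbit.h>. It also assumes unsigned int is 32 bits wide.

```c
#include <stdint.h>

/* Bit-by-bit count of leading zero bits.  The mask is built from
   UINT32_C (1), so the shift by 31 is well defined; with a plain
   'int' operand, '1 << 31' is not.  */
int
loop_leading_zeros (uint32_t a)
{
  int n = 0;
  for (uint32_t mask = UINT32_C (1) << 31;
       mask != 0 && (a & mask) == 0; mask >>= 1)
    n++;
  return n;
}

/* Same result via a compiler builtin.  __builtin_clz is undefined
   for 0, so that case is handled explicitly.  */
int
builtin_leading_zeros (uint32_t a)
{
  return a == 0 ? 32 : __builtin_clz (a);
}
```

Both functions return 32 for an input of 0 and 0 when the top bit is set;
the builtin version gives the compiler the chance to use a single
instruction instead of a loop.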

Signed-off-by: Collin Funk <collin.funk1@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

powerpc: Remove POWER7 strncasecmp optimization

These routines are not extensively used (the gnulib documentation even
recommends using a replacement [1]), and there is already a POWER8
version that uses proper vectorized instructions.

[1] https://www.gnu.org/software/gnulib/manual/gnulib.html#C-strings

Checked with a build for some powerpc variations.
Reviewed-by: Peter Bergner <bergner@linux.ibm.com>

manual: add more pthread functions

Add stubs and partial docs for many undocumented pthreads functions.
While neither exhaustive nor complete, this gives minimal usage docs
for many functions and expands the pthreads chapters, making it
easier to continue improving this section in the future.

Reviewed-by: Collin Funk <collin.funk1@gmail.com>

S390: Add new s390 platform z17.

The glibc-hwcaps subdirectories are extended by "z17".  Libraries are loaded if
the z17 facility bits are active:
- Miscellaneous-instruction-extensions facility 4
- Vector-enhancements-facility 3
- Vector-Packed-Decimal-Enhancement Facility 3
- CPU: Concurrent-Functions Facility

tst-glibc-hwcaps.c is extended in order to test z17 via new marker6.
In case of running on a z17 with a kernel not recognizing z17 yet,
AT_PLATFORM will be z900 but vector-bit in AT_HWCAP is set.  This situation
is now recognized and this testcase does not fail.

A fatal glibc error is dumped if glibc was built with architecture
level set for z17, but run on an older machine (See dl-hwcap-check.h).
Note, you might get a SIGILL before this check if you don't use:
configure --with-rtld-early-cflags=-march=<older-machine>

ld.so --list-diagnostics now also dumps information about s390.cpu_features.

Independent from z17, the s390x kernel won't introduce new HWCAP-Bits if there
is no special handling needed in kernel itself.  For z17, we don't have new
HWCAP flags, but have to check the facility bits retrieved by
stfle-instruction.

Instead of storing all the stfle-bits (currently four 64bit values) in the
cpu_features struct, we now only store those bits, which are needed within
glibc itself.  Note that we have this list twice, one with original values and
the other one which can be filtered with GLIBC_TUNABLES=glibc.cpu.hwcaps.
Those new fields are stored in so far reserved space in cpu_features struct.
Thus processes started in the middle of a glibc package update, which e.g. have
a new ld.so and an old libc.so, won't crash. In that case, the glibc internal
ifunc-resolvers would not select the best optimized variant.

The users of stfle-bits are also updated:
- parsing of GLIBC_TUNABLES=glibc.cpu.hwcaps
- glibc internal ifunc-resolvers
- __libc_ifunc_impl_list
- sysconf

Correct test descriptors in libm-test-pown.inc

While working on implementing compoundn, I noticed that
libm-test-pown.inc was wrongly using TEST_ff_f and AUTO_TESTS_ff_f
when the actual types involved meant fL_f should be used instead of
ff_f; fix to use the correct descriptor strings for pown. (These
strings affect how gen-libm-test.py generates a C file in some cases.
The structure type test_fL_f_data for expected results and the use of
RUN_TEST_LOOP_fL_f in the ALL_RM_TEST call were already correct.)

Tested for x86_64. The generated libm-test-pown.c was actually
unchanged, but the old descriptor strings were still logically
incorrect.

malloc: Inline tcache_try_malloc

Inline tcache_try_malloc into calloc since it is the only caller. Also fix
usize2tidx and use it in __libc_malloc, __libc_calloc and _mid_memalign.
The result is simpler, cleaner code.

Reviewed-by: DJ Delorie <dj@redhat.com>

math: Fix UB on sinpif (BZ 32925)

The left shift overflows for 'int', use uint32_t instead. It syncs
with CORE-MATH commit bbfabd99.
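
The general pattern, sketched with a hypothetical helper (this is not the
sinpif code): perform the shift on an unsigned type, where the result is
defined modulo 2^32, instead of on a signed 'int', where shifting a bit
into or past the sign bit is undefined. The sketch assumes a 32-bit 'int'
and a shift count below 32.

```c
#include <stdint.h>

/* Hypothetical helper: left shift of a 32-bit value done on uint32_t,
   so bits shifted out are simply discarded.  SHIFT must be < 32.  */
uint32_t
shl_u32 (int32_t m, unsigned int shift)
{
  return (uint32_t) m << shift;
}
```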

Checked on aarch64-linux-gnu, x86_64-linux-gnu, and i686-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>

math: Fix UB on erfcf (BZ 32924)

The left shift overflows for 'int', use uint64_t instead. It syncs
with CORE-MATH commit d0a2be200cbc1344d800d9ef0ebee9ad67dd3ad8.

Checked on aarch64-linux-gnu, x86_64-linux-gnu, and i686-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>