Use realloc in convert_hostent_to_gaih_addrtuple and fix up pointers in
the result list so that a single block is maintained for
hostbyname3_r/hostbyname2_r and freed in gaih_inet. This result is
never merged with any other results, since the hosts database does not
permit merging.
Resolves BZ #28852.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org> Reviewed-by: DJ Delorie <dj@redhat.com>
getcwd: Set errno to ERANGE for size == 1 (CVE-2021-3999)
Cherry-picked from 23e0e8f5f1fb5ed150253d986ecccdc90c2dcd5e in main branch.
Test included with this commit is not cherry-picked because it requires more
changes.
No valid path returned by getcwd would fit into 1 byte, so reject the
size early and return NULL with errno set to ERANGE. This change is
prompted by CVE-2021-3999, which describes a single byte buffer
underflow and overflow when all of the following conditions are met:
- The buffer size (i.e. the second argument of getcwd) is 1 byte
- The current working directory is too long
- '/' is also mounted on the current working directory
Sequence of events:
- In sysdeps/unix/sysv/linux/getcwd.c, the syscall returns ENAMETOOLONG
because the linux kernel checks for name length before it checks
buffer size
- The code falls back to the generic getcwd in sysdeps/posix
- In the generic func, the buf[0] is set to '\0' on line 250
- this while loop on line 262 is bypassed:
while (!(thisdev == rootdev && thisino == rootino))
since the rootfs (/) is bind mounted onto the directory and the flow
goes on to line 449, where it puts a '/' in the byte before the
buffer.
- Finally on line 458, it moves 2 bytes (the underflowed byte and the
'\0') to the buf[0] and buf[1], resulting in a 1 byte buffer overflow.
- buf is returned on line 469 and errno is not set.
Some tests in original commit are not included because they depend on headers that
are not present in GRTEv5 branch.
The IPv4 address parser in the getaddrinfo function is changed so that
it does not ignore trailing whitespace and all characters after it.
For backwards compatibility, the getaddrinfo function still recognizes
legacy name syntax, such as 192.000.002.010 interpreted as 192.0.2.8
(octal).
This commit does not change the behavior of inet_addr and inet_aton.
gethostbyname already had additional sanity checks (but is switched
over to the new __inet_aton_exact function for completeness as well).
To avoid sending the problematic query names over DNS, commit 6ca53a2453598804a2559a548a08424fca96434a ("resolv: Do not send queries
for non-host-names in nss_dns [BZ #24112]") is needed.
The exported x86_64 fenv.h functions operate on both i387 and SSE (since
they should work on both float, double, and long double) while the
internal libc_fe* set either SSE (float, double, and float128) or
i387 (long double).
The libgcc __sfp_handle_exceptions (used on float128 implementation),
however, will set either SEE or i387 exception depending of the
exception to raise. This broke the internal assumption of float128
where only SSE operations will be used.
This patch reimplements the libgcc __sfp_handle_exceptions to use only
SSE operations and sets libgcc to use it instead of its own
implementation.
And I think we should fix libgcc in a similar manner, since checking on
config/i386/64/sfp-machine.h it already only supports SSE rounding mode
and x86_64 ABI also expectes float128 to use SSE registers [1]
(although it is not clear on how future implementation might implement
it).
Pranav Kant [Thu, 28 Dec 2023 22:46:30 +0000 (22:46 +0000)]
Get rid of WANT_FLOAT128 usage in floatn.h
This header is installed system-wide. It's not correct to introduce a new
macro WANT_FLOAT128 in this because then we are either forcing the compiler to
make it an inbuilt macro to make glibc expose all float128 functionality, or asking
our clients to -DWANT_FLOAT128 to get float128 functionality in glibc.
Given we are primarily going to have float128 enabled GRTE now, we don't need to have
guards for non-float128 cases.
Fangrui Song [Sun, 10 Oct 2021 21:38:00 +0000 (14:38 -0700)]
x86: Define __HAVE_FLOAT128 for Clang and use __builtin_*f128 code path
Clang supports __builtin_fabsf128 (despite not supporting _Float128) but
_not _builtin_fabsq.
By falling back to `typedef __float128 _Float128;`, the float128 code
will be buildable with Clang.
Joseph Myers [Mon, 2 Aug 2021 16:33:44 +0000 (16:33 +0000)]
Fix build of nptl/tst-thread_local1.cc with GCC 12
The test nptl/tst-thread_local1.cc fails to build with GCC mainline
because of changes to what libstdc++ headers implicitly include what
other headers:
tst-thread_local1.cc: In function 'int do_test()':
tst-thread_local1.cc:177:5: error: variable 'std::array<std::pair<const char*, std::function<void(void* (*)(void*))> >, 2> do_thread_X' has initializer but incomplete type
177 | do_thread_X
| ^~~~~~~~~~~
Fix this by adding an explicit include of <array>.
Tested with build-many-glibcs.py for aarch64-linux-gnu.
Kamlesh Kumar [Thu, 5 Dec 2019 15:49:00 +0000 (16:49 +0100)]
<string.h>: Define __CORRECT_ISO_CPP_STRING_H_PROTO for Clang [BZ #25232]
Without the asm redirects, strchr et al. are not const-correct.
libc++ has a wrapper header that works with and without
__CORRECT_ISO_CPP_STRING_H_PROTO (using a Clang extension). But when
Clang is used with libstdc++ or just C headers, the overloaded functions
with the correct types are not declared.
This change does not impact current GCC (with libstdc++ or libc++).
Fangrui Song [Wed, 27 Apr 2022 21:01:19 +0000 (14:01 -0700)]
Remove x86_64 specific lowlevellock/cancellation
The x86_64 specific implemention has CFI directives like
`.cfi_adjust_cfa_offset 128` which are incorrect when RBP is used as the
canonical frame address.
This follows the spirit of the following two commits by removing the
x86_64 specific implementation. The generic implementation will be used.
Fangrui Song [Mon, 25 Apr 2022 23:50:00 +0000 (16:50 -0700)]
elf: Support DT_RELR relative relocation format
Adapted from
https://sourceware.org/pipermail/libc-alpha/2022-April/138085.html
([PATCH v11 0/7] Support DT_RELR relative relocation format),
which is expected to be included in glibc 2.36.
glibc 2.35 has a fair amount of rtld changes to avoid nested functions
(https://sourceware.org/PR27220). This patch is carefully crafted to
make the minimal changes.
Notebly, this commit
* works around b/208156916 by not bumping DT_NUM. DT_RELR and DT_RELRSZ
take the l_info slots at DT_VERSYM+1 and DT_VERSYM+2.
* avoids changes to include/link.h
* removes the time travel compatibility check (error if DT_RELR is used
without GLIBC_ABI_DT_RELR version need). This needs link.h change and
the detected case cannot happen if we correctly use
-Wl,-z,pack-relative-relocs.
Fangrui Song [Mon, 25 Apr 2022 23:48:25 +0000 (16:48 -0700)]
configure: Don't check LD -v --help for LIBC_LINKER_FEATURE
When LIBC_LINKER_FEATURE is used to check a linker option with the equal
sign, it will likely fail because the LD -v --help output may look like
`-z lam-report=[none|warning|error]` while the needle is something like
`-z lam-report=warning`.
The LD -v --help filter doesn't save much time, so just remove it.
Joseph Myers [Mon, 26 Feb 2018 18:17:47 +0000 (18:17 +0000)]
Use libc_hidden_* for atoi (bug 15105).
Continuing the fixes for localplt test failures with -Os arising from
functions not being inlined in that case, this patch fixes such
failures for atoi by using libc_hidden_proto and libc_hidden_def.
Tested for x86_64 (both that it removes this particular localplt
failure for -Os, and that the testsuite continues to pass without
-Os).
[BZ #15105]
* stdlib/atoi.c (atoi): Use libc_hidden_def.
* include/stdlib.h [!_ISOMAC] (atoi): Use libc_hidden_proto.
Joseph Myers [Fri, 23 Feb 2018 13:54:53 +0000 (13:54 +0000)]
Use libc_hidden_* for tolower, toupper (bug 15105).
Continuing the fixes for localplt test failures with -Os arising from
functions not being inlined in that case, this patch fixes such
failures for tolower and toupper by using libc_hidden_proto and
libc_hidden_def.
Tested for x86_64 (both that it removes this particular localplt
failure for -Os, and that the testsuite continues to pass without
-Os).
2018-02-22 Joseph Myers <joseph@codesourcery.com>
[BZ #15105]
* ctype/ctype.c (tolower): Use libc_hidden_def.
(toupper): Likewise.
* include/ctype.h [!_ISOMAC] (tolower): Use libc_hidden_proto.
[!_ISOMAC] (toupper): Likewise.
Joseph Myers [Thu, 15 Feb 2018 21:00:02 +0000 (21:00 +0000)]
Use libc_hidden_* for argz_next, __argz_next (bug 15105).
Among other localplt test failures when building with -Os, there are
libc.so PLT references for argz_next and __argz_next. This is a
simple case of functions that are inlined for -O2 but not for -Os;
this patch adds libc_hidden_proto / libc_hidden_def for them to avoid
localplt failures even when not inlined.
Tested for x86_64 (both that it removes these particular localplt
failures for -Os - but other such failures remain so the bug can't yet
be closed - and that the testsuite continues to pass without -Os).
[BZ #15105]
* include/argz.h (argz_next): Use libc_hidden_proto.
(__argz_next): Likewise.
* string-argz-next.c (__argz_next): Use libc_hidden_def.
(argz_next): Use libc_hidden_weak.
Joseph Myers [Thu, 15 Feb 2018 20:59:12 +0000 (20:59 +0000)]
Use libc_hidden_* for __cmsg_nxthdr (bug 15105).
Among other localplt test failures when building with -Os, there are
libc.so PLT references for __cmsg_nxthdr. This is a simple case of a
function that is inlined for -O2 but not for -Os; this patch adds
libc_hidden_proto / libc_hidden_def for it to avoid a localplt failure
even when it is not inlined.
Tested for x86_64 (both that it removes this particular localplt
failure for -Os - but other such failures remain so the bug can't yet
be closed - and that the testsuite continues to pass without -Os).
[BZ #15105]
* include/sys/socket.h [!_ISOMAC] (__cmsg_nxthdr): Use
libc_hidden_proto.
* sysdeps/unix/sysv/linux/cmsg_nxthdr.c (__cmsg_nxthdr): Use
libc_hidden_def.
Joseph Myers [Thu, 15 Feb 2018 20:58:16 +0000 (20:58 +0000)]
Use libc_hidden_* for fputs (bug 15105).
Among other localplt test failures when building with -Os, there are
libc.so PLT references for fputs. fputs calls normally get redirected
to _IO_fputs by a macro in include/stdio.h (and _IO_fputs in turn uses
libc_hidden_proto), but GCC can convert an fprintf call with a
constant string argument into an fputs call, which of course is then
unaffected by the macro redirection. (I don't know why this issue
only appears with -Os.)
This patch duly adds a use of libc_hidden_proto for fputs. I see no
obvious reason why the fputs macro redirection is needed at all, but
this patch does not change it.
Tested for x86_64 (both that it removes this particular localplt
failure for -Os - but other such failures remain so the bug can't yet
be closed - and that the testsuite continues to pass without -Os).
[BZ #15105]
* include/stdio.h [!_ISOMAC && IS_IN (libc)] (fputs): Use
libc_hidden_proto.
* libio/iofputs.c (fputs): Use libc_hidden_weak.
Building with -Os produces linknamespace and localplt failures for,
among other functions, gnu_dev_major, gnu_dev_minor and
gnu_dev_makedev.
The issue is that those functions are not inlined when building with
-Os. While one could force them to be inlined in that case, it seems
more natural to fix this issue similarly to other namespace issues.
Thus, this patch makes gnu_dev_* into weak aliases for hidden symbols
__gnu_dev_*; __gnu_dev_* are then defined as inlines in the internal
include/sys/sysmacros.h, and uses of gnu_dev_* (often via the macros
major, minor and makedev) for which there are namespace issues are
changed to use __gnu_dev_*; where there are no namespace issues, use
of libc_hidden_proto serves to avoid unnecessary local PLT entry use.
Tested for x86_64, (a) without -Os, to verify the testsuite continues
to pass without problems and that the functions called under their new
names continue to be inlined as expected in that case; (b) with -Os,
to verify that the linknamespace and localplt failures in question go
away (but because of other such failures present, neither of the
relevant bugs can yet be closed).
[BZ #15105]
[BZ #19463]
* include/sys/sysmacros.h [!_ISOMAC]
(__SYSMACROS_NEED_IMPLEMENTATION): Define macro.
[!_SYS_SYSMACROS_H_WRAPPER && !_ISOMAC]
(_SYS_SYSMACROS_H_WRAPPER): Likewise.
[!_SYS_SYSMACROS_H_WRAPPER && !_ISOMAC] (gnu_dev_major): Use
libc_hidden_proto.
[!_SYS_SYSMACROS_H_WRAPPER && !_ISOMAC] (gnu_dev_minor): Likewise.
[!_SYS_SYSMACROS_H_WRAPPER && !_ISOMAC] (gnu_dev_makedev):
Likewise.
[!_SYS_SYSMACROS_H_WRAPPER && !_ISOMAC] (__SYSMACROS_DECL_TEMPL):
Undefine and redefine to add use __gnu_dev_ prefix.
[!_SYS_SYSMACROS_H_WRAPPER && !_ISOMAC] (__SYSMACROS_IMPL_TEMPL):
Likewise.
[!_SYS_SYSMACROS_H_WRAPPER && !_ISOMAC] (__gnu_dev_major): Declare
and define as hidden inline function.
[!_SYS_SYSMACROS_H_WRAPPER && !_ISOMAC] (__gnu_dev_minor):
Likewise.
[!_SYS_SYSMACROS_H_WRAPPER && !_ISOMAC] (__gnu_dev_makedev):
Likewise.
* misc/makedev.c (OUT_OF_LINE_IMPL_TEMPL): Use __gnu_dev_ prefix.
(gnu_dev_major): Use weak_alias and libc_hidden_weak.
(gnu_dev_minor): Likewise.
(gnu_dev_makedev): Likewise.
* csu/check_fds.c (check_one_fd): Use __gnu_dev_makedev instead of
makedev.
* posix/wordexp.c (exec_comm_child): Likewise.
* sysdeps/mach/hurd/xmknodat.c (__xmknodat): Use __gnu_dev_minor
instead of minor and __gnu_dev_major instead of major.
* sysdeps/unix/sysv/linux/device-nrs.h (DEV_TTY_P): Use
__gnu_dev_major instead of major.
* sysdeps/unix/sysv/linux/pathconf.c (distinguish_extX): Use
__gnu_dev_major instead of gnu_dev_major and __gnu_dev_minor
instead of gnu_dev_minor.
* sysdeps/unix/sysv/linux/ptsname.c (MASTER_P): Likewise.
(SLAVE_P): Likewise.
(__ptsname_internal): Use __gnu_dev_minor instead of minor.
* sysdeps/unix/sysv/linux/ttyname.h (is_pty): Use __gnu_dev_major
instead of major.
Fangrui Song [Mon, 11 Jan 2021 19:56:54 +0000 (11:56 -0800)]
install: Replace scripts/output-format.sed with objdump -f [BZ #26559]
GNU ld and gold have supported --print-output-format since 2011. glibc
requires binutils>=2.25 (2015), so if LD is GNU ld or gold, we can
assume the option is supported.
lld is by default a cross linker supporting multiple targets. It auto
detects the file format and does not need OUTPUT_FORMAT. It does not
support --print-output-format.
By parsing objdump -f, we can support all the three linkers.
Fangrui Song [Fri, 16 Apr 2021 18:26:39 +0000 (11:26 -0700)]
Set the retain attribute on _elf_set_element if CC supports [BZ #27492]
So that text_set_element/data_set_element/bss_set_element defined
variables will be retained by the linker.
Note: 'used' and 'retain' are orthogonal: 'used' makes sure the variable
will not be optimized out; 'retain' prevents section garbage collection
if the linker support SHF_GNU_RETAIN.
GNU ld 2.37 and LLD 13 will support -z start-stop-gc which allow C
identifier name sections to be GCed even if there are live
__start_/__stop_ references.
Without the change, there are some static linking problems, e.g.
_IO_cleanup (libio/genops.c) may be discarded by ld --gc-sections, so
stdout is not flushed on exit.
Note: GCC may warning 'retain' attribute ignored while __has_attribute(retain)
is 1 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99587).
Fangrui Song [Mon, 16 Aug 2021 16:59:30 +0000 (09:59 -0700)]
elf: Drop elf/tls-macros.h in favor of __thread and tls_model attributes [BZ #28152] [BZ #28205]
elf/tls-macros.h was added for TLS testing when GCC did not support
__thread. __thread and tls_model attributes are mature now and have been
used by many newer tests.
Also delete tst-tls2.c which tests .tls_common (unused by modern GCC and
unsupported by Clang/LLD). .tls_common and .tbss definition are almost
identical after linking, so the runtime test doesn't add additional
coverage. Assembler and linker tests should be on the binutils side.
When LLD 13.0.0 is allowed in configure.ac
(https://sourceware.org/pipermail/libc-alpha/2021-August/129866.html),
`make check` result is on par with glibc built with GNU ld on aarch64
and x86_64.
As a future clean-up, TLS_GD/TLS_LD/TLS_IE/TLS_IE macros can be removed from
sysdeps/*/tls-macros.h. We can add optional -mtls-dialect={gnu2,trad}
tests to ensure coverage.
Tested on aarch64-linux-gnu, powerpc64le-linux-gnu, and x86_64-linux-gnu.
Fangrui Song [Wed, 11 Aug 2021 16:00:37 +0000 (09:00 -0700)]
aarch64: Make elf_machine_{load_address,dynamic} robust [BZ #28203]
The AArch64 ABI is largely platform agnostic and does not specify
_GLOBAL_OFFSET_TABLE_[0] ([1]). glibc ld.so turns out to be probably the
only user of _GLOBAL_OFFSET_TABLE_[0] and GNU ld defines the value
to the link-time address _DYNAMIC. [2]
In 2012, __ehdr_start was implemented in GNU ld and gold in binutils
2.23. Using adrp+add / (-mcmodel=tiny) adr to access
__ehdr_start/_DYNAMIC gives us a robust way to get the load address and
the link-time address of _DYNAMIC.
[1]: From a psABI maintainer, https://bugs.llvm.org/show_bug.cgi?id=49672#c2
[2]: LLD's aarch64 port does not set _GLOBAL_OFFSET_TABLE_[0] to the
link-time address _DYNAMIC.
LLD is widely used on aarch64 Android and ChromeOS devices. Software
just works without the need for _GLOBAL_OFFSET_TABLE_[0].
intl: Handle translation output codesets with suffixes [BZ #26383]
Commit 91927b7c7643 (Rewrite iconv option parsing [BZ #19519]) did not
handle cases where the output codeset for translations (via the `gettext'
family of functions) might have a caller specified encoding suffix such as
TRANSLIT or IGNORE. This led to a regression where translations did not
work when the codeset had a suffix.
This commit fixes the above issue by parsing any suffixes passed to
__dcigettext and adds two new test-cases to intl/tst-codeset.c to
verify correct behaviour. The iconv-internal function __gconv_create_spec
and the static iconv-internal function gconv_destroy_spec are now visible
internally within glibc and used in intl/dcigettext.c.
This commit replaces string manipulation during `iconv_open' and iconv_prog
option parsing with a structured, flag based conversion specification. In
doing so, it alters the internal `__gconv_open' interface and accordingly
adjusts its uses.
This change fixes several hangs in the iconv program and therefore includes
a new test to exercise iconv_prog options that originally led to these hangs.
It also includes a new regression test for option handling in the iconv
function.
Arjun Shankar [Wed, 4 Nov 2020 11:19:38 +0000 (12:19 +0100)]
iconv: Accept redundant shift sequences in IBM1364 [BZ #26224]
The IBM1364, IBM1371, IBM1388, IBM1390 and IBM1399 character sets
share converter logic (iconvdata/ibm1364.c) which would reject
redundant shift sequences when processing input in these character
sets. This led to a hang in the iconv program (CVE-2020-27618).
This commit adjusts the converter to ignore redundant shift sequences
and adds test cases for iconv_prog hangs that would be triggered upon
their rejection. This brings the implementation in line with other
converters that also ignore redundant shift sequences (e.g. IBM930
etc., fixed in commit 692de4b3960d).
Previously, in UCS4 conversion routines we limit the number of
characters we examine to the minimum of the number of characters in the
input and the number of characters in the output. This is not the
correct behavior when __GCONV_IGNORE_ERRORS is set, as we do not consume
an output character when we skip a code unit. Instead, track the input
and output pointers and terminate the loop when either reaches its
limit.
This resolves assertion failures when resetting the input buffer in a step of
iconv, which assumes that the input will be fully consumed given sufficient
output space.
Joseph Myers [Wed, 12 Feb 2020 23:31:56 +0000 (23:31 +0000)]
Avoid ldbl-96 stack corruption from range reduction of pseudo-zero (bug 25487).
Bug 25487 reports stack corruption in ldbl-96 sinl on a pseudo-zero
argument (an representation where all the significand bits, including
the explicit high bit, are zero, but the exponent is not zero, which
is not a valid representation for the long double type).
Although this is not a valid long double representation, existing
practice in this area (see bug 4586, originally marked invalid but
subsequently fixed) is that we still seek to avoid invalid memory
accesses as a result, in case of programs that treat arbitrary binary
data as long double representations, although the invalid
representations of the ldbl-96 format do not need to be consistently
handled the same as any particular valid representation.
This patch makes the range reduction detect pseudo-zero and unnormal
representations that would otherwise go to __kernel_rem_pio2, and
returns a NaN for them instead of continuing with the range reduction
process. (Pseudo-zero and unnormal representations whose unbiased
exponent is less than -1 have already been safely returned from the
function before this point without going through the rest of range
reduction.) Pseudo-zero representations would previously result in
the value passed to __kernel_rem_pio2 being all-zero, which is
definitely unsafe; unnormal representations would previously result in
a value passed whose high bit is zero, which might well be unsafe
since that is not a form of input expected by __kernel_rem_pio2.
This patch syncs the regex implementation with gnulib (commit 0ee5212).
Only two changes in GLIBC regex testing are required:
1. posix/bug-regex28.c: as previously discussed [1] the change of
expected results on the pattern should be safe.
2. posix/PCRE.tests: the ERE (a)|\1 is malformed (in the sense that
the \1 doesn't mean anything) and although current GLIBC accepts
it has undefined behavior. This patch removes the specific test.
This sync contains some patches from thread 'Regex: Make libc regex
more usable outside GLIBC.' [2] which have been pushed upstream in
gnulib. This patches also fixes some regex issues (BZ #23233,
BZ #21163, BZ #18986, BZ #13762) and I did not add testcases for
both #23233 and #13762 because I couldn't think a simple way to
trigger the expected failure path to trigger them.
Andreas Schwab [Mon, 21 Dec 2020 03:26:43 +0000 (08:56 +0530)]
Fix buffer overrun in EUC-KR conversion module (bz #24973)
The byte 0xfe as input to the EUC-KR conversion denotes a user-defined
area and is not allowed. The from_euc_kr function used to skip two bytes
when told to skip over the unknown designation, potentially running over
the buffer end.
Florian Weimer [Wed, 27 Jan 2021 12:36:12 +0000 (13:36 +0100)]
gconv: Fix assertion failure in ISO-2022-JP-3 module (bug 27256)
The conversion loop to the internal encoding does not follow
the interface contract that __GCONV_FULL_OUTPUT is only returned
after the internal wchar_t buffer has been filled completely. This
is enforced by the first of the two asserts in iconv/skeleton.c:
/* We must run out of output buffer space in this
rerun. */
assert (outbuf == outerr);
assert (nstatus == __GCONV_FULL_OUTPUT);
This commit solves this issue by queuing a second wide character
which cannot be written immediately in the state variable, like
other converters already do (e.g., BIG5-HKSCS or TSCII).
Shu-Chun Weng [Mon, 3 May 2021 23:47:10 +0000 (16:47 -0700)]
Don't crash if /var/tmp doesn't exist
`xstat` is checked `stat64` crashing the program if the latter returns
failure. In this loop, we are trying to find one folder that satisfies
the condition, no reason to crash the program if one folder doesn't.
Shu-Chun Weng [Mon, 3 May 2021 23:12:44 +0000 (16:12 -0700)]
More aggressively prevent a buffer from being optimized out
The volatile global variable was first introduced in e86f9654c. I have
noticed the compiler still optimizing the buffer out on AArch64
presumably because the assignment is after all other observable
behaviors so it's still valid to eliminate it.
Fangrui Song [Thu, 8 Jul 2021 21:26:22 +0000 (14:26 -0700)]
x86_64: Remove unneeded static PIE check for undefined weak diagnostic
https://sourceware.org/bugzilla/show_bug.cgi?id=21782 dropped an ld
diagnostic for R_X86_64_PC32 referencing an undefined weak symbol in
-pie links. Arguably keeping the diagnostic like other ports is more
correct, since statically resolving movl foo(%rip), %eax to the
link-time zero address produces a corrupted output.
It turns out that --enable-static-pie builds do not depend on the ld
behavior. GCC generates GOT indirection for weak declarations for
-fPIE/-fPIC, so what ld does with the PC-relative relocation doesn't
really matter.
Refactor the sincos implementation - rather than rely on odd partial inlining
of preprocessed portions from sin and cos, explicitly write out the cases.
This makes sincos much easier to maintain and provides an additional 16-20%
speedup between 0 and 2^27. The overall speedup of sincos is 48% over this range.
Between 0 and PI it is 66% faster.
* sysdeps/ieee754/dbl-64/s_sin.c (__sin): Cleanup ifdefs.
(__cos): Likewise.
* sysdeps/ieee754/dbl-64/s_sin.c (__sincos): Refactor using the same
logic as sin and cos.
[PATCH 6/7] sin/cos slow paths: refactor duplicated code into dosin
Refactor duplicated code into do_sin. Since all calls to do_sin use copysign to
set the sign of the result, move it inside do_sin. Small inputs use a separate
polynomial, so move this into do_sin as well (the check is based on the more
conservative case when doing large range reduction, but could be relaxed).
* sysdeps/ieee754/dbl-64/s_sin.c (do_sin): Use TAYLOR_SIN for small
inputs. Return correct sign.
(do_sincos): Remove small input check before do_sin, let do_sin set
the sign.
(__sin): Likewise.
(__cos): Likewise.
[PATCH 4/7] sin/cos slow paths: remove slow paths from huge range reduction
For huge inputs use the improved do_sincos function as well. Now no cases use
the correction factor returned by do_sin, do_cos and TAYLOR_SIN, so remove it.
[PATCH 3/7] sin/cos slow paths: remove slow paths from small range reduction
This patch improves the accuracy of the range reduction. When the input is
large (2^27) and very close to a multiple of PI/2, using 110 bits of PI is not
enough. Improve range reduction accuracy to 136 bits. As a result the special
checks for results close to zero can be removed. The ULP of the polynomials is
at worst 0.55ULP, so there is no reason for the slow functions, and they can be
removed.
* sysdeps/ieee754/dbl-64/s_sin.c (reduce_sincos_1): Rename to
reduce_sincos, improve accuracy to 136 bits.
(do_sincos_1): Rename to do_sincos, remove fallbacks to slow functions.
(__sin): Use improved reduction and simplified do_sincos calculation.
(__cos): Likewise.
* sysdeps/ieee754/dbl-64/s_sincos.c (__sincos): Likewise.
[PATCH 2/7] sin/cos slow paths: remove large range reduction
This patch removes the large range reduction code and defers to the huge range
reduction code. The first level range reducer supports inputs up to 2^27,
which is way too large given that inputs for sin/cos are typically small
(< 10), and optimizing for a smaller range would give a significant speedup.
Input values above 2^27 are practically never used, so there is no reason for
supporting range reduction between 2^27 and 2^48. Removing it significantly
simplifies code and enables further speedups. There is about a 2.3x slowdown
in this range due to __branred being extremely slow (a better algorithm could
easily more than double performance).
[PATCH 1/7] sin/cos slow paths: avoid slow paths for small inputs
This series of patches removes the slow patchs from sin, cos and sincos.
Besides greatly simplifying the implementation, the new version is also much
faster for inputs up to PI (41% faster) and for large inputs needing range
reduction (27% faster).
ULP is ~0.55 with no errors found after testing 1.6 billion inputs across most
of the range with mpsin and mpcos. The number of incorrectly rounded results
(ie. ULP >0.5) is at most ~2750 per million inputs between 0.125 and 0.5,
the average is ~850 per million between 0 and PI.
Tested on AArch64 and x86_64 with no regressions.
The first patch removes the slow paths for the cases where the input is small
and doesn't require range reduction. Update ULP tables for sin, cos and sincos
on AArch64 and x86_64.
* sysdeps/aarch64/libm-test-ulps: Update ULP for sin, cos, sincos.
* sysdeps/ieee754/dbl-64/s_sin.c (__sin): Remove slow paths for small
inputs.
(__cos): Likewise.
* sysdeps/x86_64/fpu/libm-test-ulps: Update ULP for sin, cos, sincos.
Josh Kunz [Fri, 24 Jan 2020 01:37:14 +0000 (17:37 -0800)]
Additional fixes for llvm-as
Unlike GCC, llvm always uses an integrated assembler, which attempts to
recognized all `asm` statements written in the C code. glibc uses some
syntactically invalid asm statements to emit constants into assembly that
are later extracted with a sed or AWK script.
This change fixes two such invalid `asm` statements by wrapping the
output in a `.ascii` directive.. This does not break the sed/AWK (the same
special sequence is output) but it makes the statement syntactically valid.
See cf8e3f8757 for a previous fix for the same issue.
Joseph Myers [Fri, 18 May 2018 11:57:15 +0000 (11:57 +0000)]
Fix year 2039 bug for localtime with 64-bit time_t (bug 22639).
Bug 22639 reports localtime failing to handle time offset transitions
correctly in 2039 and later on platforms with 64-bit time_t.
The problem is the use of SECSPERDAY (constant 86400) in calculations
such as
t = ((year - 1970) * 365
+ /* Compute the number of leapdays between 1970 and YEAR
(exclusive). There is a leapday every 4th year ... */
+ ((year - 1) / 4 - 1970 / 4)
/* ... except every 100th year ... */
- ((year - 1) / 100 - 1970 / 100)
/* ... but still every 400th year. */
+ ((year - 1) / 400 - 1970 / 400)) * SECSPERDAY;
where t is of type time_t and year is of type int. Before my commit 92bd70fb85bce57ac47ba5d8af008736832c955a (an update from tzcode,
included in 2.26 and later releases), SECSPERDAY was obtained from a
file imported from tzcode, where the value included a cast to
int_fast32_t. On 64-bit platforms, glibc defines int_fast32_t to be
long int, so 64-bit, but my patch resulted in it changing to int.
(The bug would probably have existed even before my patch for x32,
which has 64-bit time_t but 32-bit int_fast32_t, but I haven't
verified that.)
This patch fixes the problem by including a cast to time_t in the
definition of SECSPERDAY. (64-bit time support for 32-bit systems
should move such code that isn't a public interface to using the
internal 64-bit version of time_t throughout.)
Tested for x86_64 and x86.
[BZ #22639]
* time/tzset.c (SECSPERDAY): Cast to time_t.
* time/tst-y2039.c: New file.
* time/Makefile (tests): Add tst-y2039.
These comments should make it easier to see the (small) diff introduced
in cf8e3f8757. Without these comments, the diff may get list on a future
upstream merge.
Josh Kunz [Wed, 7 Aug 2019 18:40:50 +0000 (11:40 -0700)]
Make gen-XX-const scripts work with llvm-as
The gen-as-const and gen-py-const scripts are used to generate integer constant
definitions from a list of constant C-expressions. This is achieved by
generating a C program with inline `asm` statements, that depend on
these constant expressions. During compilation, the constant expressions
are evaluated, and included in the inline asm. The build process
generates only the assembly, and then used `sed` to extract the values
from the assembly text.
This is clever. It allows the build process to extract the value of C
statements built under the target architecture. The implementation is a
bit fragile, but it is not immediately obvious to me how it could be
improved.
This change slightly modifies `gen-as-const` and `gen-py-const` to emit
valid assembly directives instead of invalid directives that were
previously emitted. Since the values are extracted via string parsing,
this has no effect on the values extracted. This is needed because the
LLVM assembler validates all statements before emitting them, whereas it
appears GCC will literally emit any `asm` directives without validation
or recognition.
According to ARMv8 architecture reference manual section C7.2.188, SIMD MOV (to
general) instruction format is
MOV <Xd>, <Vn>.D[<index>]
gas appears to accept "<Vn>.2D[<index>]" as well, but clang's assembler does
not. C.f. https://community.arm.com/developer/ip-products/processors/f/cortex-a-forum/5214/aarch64-assembly-syntax-for-armclang