This patch syncs posix/glob.c implementation with gnulib version b5ec983 (glob: simplify symlink detection). The only difference
to gnulib code is
* DT_UNKNOWN, DT_DIR, and DT_LNK definition in the case there
were not already defined. Gnulib code which uses
HAVE_STRUCT_DIRENT_D_TYPE will redefine them wrongly because
GLIBC does not define HAVE_STRUCT_DIRENT_D_TYPE. Instead
the patch check for each definition instead.
Also, the patch requires additional globfree and globfree64 files
for compatibility version on some architectures. Also the code
simplification leads to not macro simplification (not need for
NO_GLOB_PATTERN_P anymore).
Checked on x86_64-linux-gnu and on a build using build-many-glibcs.py
for all major architectures.
[BZ #1062]
* posix/Makefile (routines): Add globfree, globfree64, and
glob_pattern_p.
* posix/flexmember.h: New file.
* posix/glob_internal.h: Likewise.
* posix/glob_pattern_p.c: Likewise.
* posix/globfree.c: Likewise.
* posix/globfree64.c: Likewise.
* sysdeps/gnu/globfree64.c: Likewise.
* sysdeps/unix/sysv/linux/alpha/globfree.c: Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/n64/globfree64.c: Likewise.
* sysdeps/unix/sysv/linux/oldglob.c: Likewise.
* sysdeps/unix/sysv/linux/wordsize-64/globfree64.c: Likewise.
* sysdeps/unix/sysv/linux/x86_64/x32/globfree.c: Likewise.
* sysdeps/wordsize-64/globfree.c: Likewise.
* sysdeps/wordsize-64/globfree64.c: Likewise.
* posix/glob.c (HAVE_CONFIG_H): Use !_LIBC instead.
[NDEBUG): Remove comments.
(GLOB_ONLY_P, _AMIGA, VMS): Remove define.
(dirent_type): New type. Use uint_fast8_t not
uint8_t, as C99 does not require uint8_t.
(DT_UNKNOWN, DT_DIR, DT_LNK): New macros.
(struct readdir_result): Use dirent_type. Do not define skip_entry
unless it is needed; this saves a byte on platforms lacking d_ino.
(readdir_result_type, readdir_result_skip_entry):
New functions, replacing ...
(readdir_result_might_be_symlink, readdir_result_might_be_dir):
these functions, which were removed. This makes the callers
easier to read. All callers changed.
(D_INO_TO_RESULT): Now empty if there is no d_ino.
(size_add_wrapv, glob_use_alloca): New static functions.
(glob, glob_in_dir): Check for size_t overflow in several places,
and fix some size_t checks that were not quite right.
Remove old code using SHELL since Bash no longer
uses this.
(glob, prefix_array): Separate MS code better.
(glob_in_dir): Remove old Amiga and VMS code.
(globfree, __glob_pattern_type, __glob_pattern_p): Move to
separate files.
(glob_in_dir): Do not rely on undefined behavior in accessing
struct members beyond their bounds. Use a flexible array member
instead
(link_stat): Rename from link_exists2_p and return -1/0 instead of
0/1. Caller changed.
(glob): Fix memory leaks.
* posix/glob64 (globfree64): Move to separate file.
* sysdeps/gnu/glob64.c (NO_GLOB_PATTERN_P): Remove define.
(globfree64): Remove hidden alias.
* sysdeps/unix/sysv/linux/Makefile (sysdeps_routines): Add
oldglob.
* sysdeps/unix/sysv/linux/alpha/glob.c (__new_globfree): Move to
separate file.
* sysdeps/unix/sysv/linux/i386/glob64.c (NO_GLOB_PATTERN_P): Remove
define.
Move compat code to separate file.
* sysdeps/wordsize-64/glob.c (globfree): Move definitions to
separate file.
H.J. Lu [Sun, 22 Oct 2017 15:24:00 +0000 (08:24 -0700)]
x86-64: Use fxsave/xsave/xsavec in _dl_runtime_resolve [BZ #21265]
In _dl_runtime_resolve, use fxsave/xsave/xsavec to preserve all vector,
mask and bound registers. It simplifies _dl_runtime_resolve and supports
different calling conventions. ld.so code size is reduced by more than
1 KB. However, use fxsave/xsave/xsavec takes a little bit more cycles
than saving and restoring vector and bound registers individually.
Latency for _dl_runtime_resolve to lookup the function, foo, from one
shared library plus libc.so:
This is the worst case where portion of time spent for saving and
restoring registers is bigger than majority of cases. With smaller
_dl_runtime_resolve code size, overall performance impact is negligible.
On IvyBridge, differences in build and test time of binutils with lazy
binding GCC and binutils are noises. On Westmere, differences in
bootstrap and "makc check" time of GCC 7 with lazy binding GCC and
binutils are also noises.
H.J. Lu [Thu, 19 Oct 2017 15:52:50 +0000 (08:52 -0700)]
x86-64: Verify that _dl_runtime_resolve preserves vector registers
On x86-64, _dl_runtime_resolve must preserve the first 8 vector
registers. Add 3 _dl_runtime_resolve tests to verify that SSE,
AVX and AVX512 registers are preserved.
So the intention is clearly to return an error for NULL name.
This patch duly inverts the sense of the conditional. It fixes the
build with GCC mainline, and passes usual glibc testsuite testing for
x86_64. However, I have not tried any actual substantive nisplus
testing, do not have an environment for such testing, and do not know
whether it is possible that strlen (name) or tablename_len might be
large so that the VLA for buf is actually a security issue. However,
if it is a security issue, there are plenty of other similar instances
in the nisplus code (that haven't been hidden by a bogus comparison
with NULL) - and nis_table.c:__create_ib_request uses strdupa on the
string passed to nis_list, so a local fix in the caller wouldn't
suffice anyway (see bug 20987). (Calls to strdupa and other such
macros that use alloca must be considered equally questionable
regarding stack overflow issues as direct calls to alloca and VLA
declarations.)
[BZ #20978]
* nis/nss_nisplus/nisplus-alias.c (_nss_nisplus_getaliasbyname_r):
Compare name == NULL, not name != NULL.
Joseph Myers [Sat, 7 Oct 2017 11:42:41 +0000 (13:42 +0200)]
Fix rpcgen buffer overrun (bug 20790).
Building with GCC 7 produces an error building rpcgen:
rpc_parse.c: In function 'get_prog_declaration':
rpc_parse.c:543:25: error: may write a terminating nul past the end of the destination [-Werror=format-length=]
sprintf (name, "%s%d", ARGNAME, num); /* default name of argument */
~~~~^
rpc_parse.c:543:5: note: format output between 5 and 14 bytes into a destination of size 10
sprintf (name, "%s%d", ARGNAME, num); /* default name of argument */
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
That buffer overrun is for the case where the .x file declares a
program with a million arguments. The strcpy two lines above can
generate a buffer overrun much more simply for a long argument name.
The limit on length of line read by rpcgen (MAXLINESIZE == 1024)
provides a bound on the buffer size needed, so this patch just changes
the buffer size to MAXLINESIZE to avoid both possible buffer
overruns. A testcase is added that rpcgen does not crash with a
500-character argument name, where it previously crashed.
It would not at all surprise me if there are many other ways of
crashing rpcgen with either valid or invalid input; fuzz testing would
likely find various such bugs, though I don't think they are that
important to fix (rpcgen is not that likely to be used with untrusted
.x files as input). (As well as fuzz-findable bugs there are probably
also issues when various int variables get overflowed on very large
input.) The test infrastructure for rpcgen-not-crashing tests would
need extending if tests are to be added for cases where rpcgen should
produce an error, as opposed to cases where it should succeed.
Tested for x86_64 and x86.
[BZ #20790]
* sunrpc/rpc_parse.c (get_prog_declaration): Increase buffer size
to MAXLINESIZE.
* sunrpc/bug20790.x: New file.
* sunrpc/Makefile [$(run-built-tests) = yes] (rpcgen-tests): New
variable.
[$(run-built-tests) = yes] (tests-special): Add $(rpcgen-tests).
[$(run-built-tests) = yes] ($(rpcgen-tests)): New rule.
since xgetbv takes much more cycles than single cycle operations like
vpord/vvpcmpeq/ptest. _dl_runtime_resolve_opt should be used only with
AVX512 where AVX512 instructions lead to lower CPU frequency on Skylake
server.
[BZ #21871]
* sysdeps/x86/cpu-features.c (init_cpu_features): Set
bit_arch_Use_dl_runtime_resolve_opt only with AVX512F.
Florian Weimer [Mon, 27 Feb 2017 18:05:13 +0000 (19:05 +0100)]
sunrpc: Avoid use-after-free read access in clntudp_call [BZ #21115]
After commit bc779a1a5b3035133024b21e2f339fe4219fb11c
(CVE-2016-4429: sunrpc: Do not use alloca in clntudp_call
[BZ #20112]), ancillary data is stored on the heap,
but it is accessed after it has been freed.
The test case must be run under a heap debugger such as valgrind
to observe the invalid access. A malloc implementation which
immediately calls munmap on free would catch this bug as well.
James Clarke [Tue, 24 Jan 2017 11:20:06 +0000 (09:20 -0200)]
Bug 21053: sh: Reduce namespace pollution from sys/ucontext.h
The problem is basically that sys/ucontext.h is defining R0..R15
which happens to conflict with some packages like Firefox when
trying to build on SH.
The very same problem existed on arm back then [1] and it was fixed by
renaming R0..R15 to REG_R0..REG_R15. This patch imploy a similar
strategy for SH.
Checked on sh4-linux-gnu with run-built-tests=no and I also got reports
that it fixes Firefox build on Debian sh4.
* sysdeps/unix/sysv/linux/sh/sh3/ucontext_i.sym: Use new REG_R*
constants instead of the old R* ones.
* sysdeps/unix/sysv/linux/sh/sh4/ucontext_i.sym: Likewise.
* sysdeps/unix/sysv/linux/sh/sys/ucontext.h (NGPREG): Rename...
(NGREG): ... to this, to fit in with other architectures.
(gpregset_t): Use new NGREG macro.
[__USE_GNU]: Remove condition; all architectures other than tile
are unconditional.
(R*): Rename to REG_R*.
Historically perl includes the current directory in the module search
path. Over the time this has been considered as a security issue and
the recent vulnerabilities [1] made people to reconsider this behaviour.
It is almost sure that this will be removed in the future [2], possibly
for the 5.26 release, although this is not yet firmly decided.
Debian has decided to backport the patches [3], so the perl binary in
unstable do not have '.' in @INC anymore.
This behaviour is used in the conform perl scripts to include the
GlibcConform module. This patch fixes that by calling perl with '-I.'.
This is not a security issue in this case as make ensures that the
current directory is $(srcdir)/conform/ when the scripts are called.
Passing the full path would do exactly the same.
continue to work even if vector instructions are used in glibc which
require the ABI stack realignment.
__tls_get_addr_slow is added to handle the slow paths in the default
implementation of__tls_get_addr in elf/dl-tls.c. The new __tls_get_addr
calls __tls_get_addr_slow after realigning the stack. Internal calls
within ld.so go directly to the default implementation of __tls_get_addr
because they do not need stack realignment.
Florian Weimer [Wed, 14 Jun 2017 06:11:22 +0000 (08:11 +0200)]
i686: Add missing IS_IN (libc) guards to vectorized strcspn
Since commit d957c4d3fa48d685ff2726c605c988127ef99395 (i386: Compile
rtld-*.os with -mno-sse -mno-mmx -mfpmath=387), vector intrinsics can
no longer be used in ld.so, even if the compiled code never makes it
into the final ld.so link. This commit adds the missing IS_IN (libc)
guard to the SSE 4.2 strcspn implementation, so that it can be used from
ld.so in the future.
Ignore and remove LD_HWCAP_MASK for AT_SECURE programs (bug #21209)
The LD_HWCAP_MASK environment variable may alter the selection of
function variants for some architectures. For AT_SECURE process it
means that if an outdated routine has a bug that would otherwise not
affect newer platforms by default, LD_HWCAP_MASK will allow that bug
to be exploited.
To be on the safe side, ignore and disable LD_HWCAP_MASK for setuid
binaries.
This patch remove the PID cache and usage in current GLIBC code. Current
usage is mainly used a performance optimization to avoid the syscall,
however it adds some issues:
- The exposed clone syscall will try to set pid/tid to make the new
thread somewhat compatible with current GLIBC assumptions. This cause
a set of issue with new workloads and usecases (such as BZ#17214 and
[1]) as well for new internal usage of clone to optimize other algorithms
(such as clone plus CLONE_VM for posix_spawn, BZ#19957).
- The caching complexity also added some bugs in the past [2] [3] and
requires more effort of each port to handle such requirements (for
both clone and vfork implementation).
- Caching performance gain in mainly on getpid and some specific
code paths. The getpid performance leverage is questionable [4],
either by the idea of getpid being a hotspot as for the getpid
implementation itself (if it is indeed a justifiable hotspot a
vDSO symbol could let to a much more simpler solution).
Other usage is mainly for non usual code paths, such as pthread
cancellation signal and handling.
For thread creation (on stack allocation) the code simplification in fact
adds some performance gain due the no need of transverse the stack cache
and invalidate each element pid.
Other thread usages will require a direct getpid syscall, such as
cancellation/setxid signal, thread cancellation, thread fail path (at
create_thread), and thread signal (pthread_kill and pthread_sigqueue).
However these are hardly usual hotspots and I think adding a syscall is
justifiable.
It also simplifies both the clone and vfork arch-specific implementation.
And by review each fork implementation there are some discrepancies that
this patch also solves:
- microblaze clone/vfork does not set/reset the pid/tid field
- hppa uses the default vfork implementation that fallback to fork.
Since vfork is deprecated I do not think we should bother with it.
The patch also removes the TID caching in clone. My understanding for
such semantic is try provide some pthread usage after a user program
issue clone directly (as done by thread creation with CLONE_PARENT_SETTID
and pthread tid member). However, as stated before in multiple discussions
threads, GLIBC provides clone syscalls without further supporting all this
semantics.
I ran a full make check on x86_64, x32, i686, armhf, aarch64, and powerpc64le.
For sparc32, sparc64, and mips I ran the basic fork and vfork tests from
posix/ folder (on a qemu system). So it would require further testing
on alpha, hppa, ia64, m68k, nios2, s390, sh, and tile (I excluded microblaze
because it is already implementing the patch semantic regarding clone/vfork).
This patch adds two new macros for internal and inline syscall to use
within GLIBC: INTERNAL_SYSCALL_CALL and INLINE_SYSCALL_CALL. They are
similar to the old INTERNAL_SYSCALL and INLINE_SYSCALL with the difference
the new macros accept a variable argument call and do not require to pass
the expected argument size.
The advantage is it is possible to use variable argument macros like
SYSCALL_LL{64} without the need to also handle the argument size. So
for an ABI where SYSCALL_LL might split the argument in high and low
parts, instead of:
Arjun Shankar [Wed, 7 Jun 2017 09:46:24 +0000 (11:46 +0200)]
Synchronize support/ infrastructure with master
This commit updates the support/ subdirectory to
commit 2714c5f3c95f90977167c1d21326d907fb76b419
on the master branch and modifies Makeconfig,
Rules, and extra-lib.mk accordingly.
Existing interposed mallocs do not define the glibc-internal
fork callbacks (and they should not), so statically interposed
mallocs lead to link failures because the strong reference from
fork pulls in glibc's malloc, resulting in multiple definitions
of malloc-related symbols.
H.J. Lu [Fri, 28 Apr 2017 17:27:22 +0000 (10:27 -0700)]
x86: Use AVX2 memcpy/memset on Skylake server [BZ #21396]
On Skylake server, AVX512 load/store instructions in memcpy/memset may
lead to lower CPU turbo frequency in certain situations. Use of AVX2
in memcpy/memset has been observed to have improved overall performance
in many workloads due to the higher frequency.
Since AVX512ER is unique to Xeon Phi, this patch sets Prefer_No_AVX512
if AVX512ER isn't available so that AVX2 versions of memcpy/memset are
used on Skylake server.
[BZ #21396]
* sysdeps/x86/cpu-features.c (init_cpu_features): Set
Prefer_No_AVX512 if AVX512ER isn't available.
* sysdeps/x86/cpu-features.h (bit_arch_Prefer_No_AVX512): New.
(index_arch_Prefer_No_AVX512): Likewise.
* sysdeps/x86_64/multiarch/memcpy.S (__new_memcpy): Don't use
AVX512 version if Prefer_No_AVX512 is set.
* sysdeps/x86_64/multiarch/memcpy_chk.S (__memcpy_chk):
Likewise.
* sysdeps/x86_64/multiarch/memmove.S (__libc_memmove): Likewise.
* sysdeps/x86_64/multiarch/memmove_chk.S (__memmove_chk):
Likewise.
* sysdeps/x86_64/multiarch/mempcpy.S (__mempcpy): Likewise.
* sysdeps/x86_64/multiarch/mempcpy_chk.S (__mempcpy_chk):
Likewise.
* sysdeps/x86_64/multiarch/memset.S (memset): Likewise.
* sysdeps/x86_64/multiarch/memset_chk.S (__memset_chk):
Likewise.
H.J. Lu [Fri, 28 Apr 2017 17:26:58 +0000 (10:26 -0700)]
x86: Set Prefer_No_VZEROUPPER if AVX512ER is available
AVX512ER won't be implemented in any Xeon processors and will be in
all Xeon Phi processors. Don't check CPU model number when setting
Prefer_No_VZEROUPPER for Xeon Phi. Instead, set Prefer_No_VZEROUPPER
if AVX512ER is available. It works with current and future Xeon Phi
and non-Xeon Phi processors.
Joseph Myers [Thu, 5 Jan 2017 17:35:53 +0000 (17:35 +0000)]
Fix MIPS n64 readahead (bug 21026).
As noted in bug 20126, MIPS n64 uses an incorrect implementation of
readahead intended for 32-bit systems. This patch adds a
syscalls.list entry to fix this. An updated version of the
consolidation patch
<https://sourceware.org/ml/libc-alpha/2016-09/msg00527.html> could
remove this syscalls.list entry again.
Tested with compilation (only) for mips64; the nature of the syscall
doesn't allow for a glibc test to detect this issue.
[BZ #21026]
* sysdeps/unix/sysv/linux/mips/mips64/n64/syscalls.list
(readahead): New syscall entry.
H.J. Lu [Tue, 21 Mar 2017 17:59:31 +0000 (10:59 -0700)]
x86-64: Improve branch predication in _dl_runtime_resolve_avx512_opt [BZ #21258]
On Skylake server, _dl_runtime_resolve_avx512_opt is used to preserve
the first 8 vector registers. The code layout is
if only %xmm0 - %xmm7 registers are used
preserve %xmm0 - %xmm7 registers
if only %ymm0 - %ymm7 registers are used
preserve %ymm0 - %ymm7 registers
preserve %zmm0 - %zmm7 registers
Branch predication always executes the fallthrough code path to preserve
%zmm0 - %zmm7 registers speculatively, even though only %xmm0 - %xmm7
registers are used. This leads to lower CPU frequency on Skylake
server. This patch changes the fallthrough code path to preserve
%xmm0 - %xmm7 registers instead:
if whole %zmm0 - %zmm7 registers are used
preserve %zmm0 - %zmm7 registers
if only %ymm0 - %ymm7 registers are used
preserve %ymm0 - %ymm7 registers
preserve %xmm0 - %xmm7 registers
Tested on Skylake server.
[BZ #21258]
* sysdeps/x86_64/dl-trampoline.S (_dl_runtime_resolve_opt):
Define only if _dl_runtime_resolve is defined to
_dl_runtime_resolve_sse_vex.
* sysdeps/x86_64/dl-trampoline.h (_dl_runtime_resolve_opt):
Fallthrough to _dl_runtime_resolve_sse_vex.
Mike Frysinger [Thu, 16 Mar 2017 06:59:31 +0000 (23:59 -0700)]
posix_spawn: use a larger min stack for -fstack-check [BZ #21253]
When glibc is built with -fstack-check, trying to use posix_spawn can
lead to segfaults due to gcc internally probing stack memory too far.
The new spawn API will allocate a minimum of 1 page, but the stack
checking logic might probe a couple of pages. When it tries to walk
them, everything falls apart.
The gcc internal docs [1] state the default interval checking is one
page. Which means we need two pages (the current one, and the next
probed). No target currently defines it larger.
Further, it mentions that the default minimum stack size needed to
recover from an overflow is 4/8KiB for sjlj or 8/12KiB for others.
But some Linux targets (like mips and ppc) go up to 16KiB (and some
non-Linux targets go up to 24KiB).
Let's create each child with a minimum of 32KiB slack space to support
them all, and give us future breathing room.
No test is added as existing ones crash. Even a simple call is
enough to trigger the problem:
char *argv[] = { "/bin/ls", NULL };
posix_spawn(NULL, "/bin/ls", NULL, NULL, argv, NULL);
Mike Frysinger [Mon, 20 Mar 2017 08:47:56 +0000 (04:47 -0400)]
posix_spawn: fix stack setup on ia64 [BZ #21275]
The ia64-specific clone2 call expects the base of the stack mapping and
the stack size as sep arguments, not an initial stack value as on other
stack-grows-down architectures. Reuse the stack-grows-up macro so we
pass in the right stack base.
The binutils package was recently changed to fix -z relro support on hppa.
See ld/21000 for details:
https://sourceware.org/bugzilla/show_bug.cgi?id=21000
This exposed a problem with the _dl_start_user function in the RTLD_START
define. We need to set __libc_stack_end before it is made read only. For
this, we need to define DL_STACK_END. The offset of 0x160 gives the same
stack end as the code in _dl_start_user.
A build log with the attached patch is here:
https://buildd.debian.org/status/fetch.php?pkg=glibc&arch=hppa&ver=2.24-9&stamp=1487639205&raw=0
H.J. Lu [Mon, 30 Jan 2017 18:59:15 +0000 (10:59 -0800)]
Add VZEROUPPER to memset-vec-unaligned-erms.S [BZ #21081]
Since memset-vec-unaligned-erms.S has VDUP_TO_VEC0_AND_SET_RETURN at
function entry, memset optimized for AVX2 and AVX512 will always use
ymm/zmm register. VZEROUPPER should be placed before ret in
H.J. Lu [Mon, 28 Nov 2016 17:44:49 +0000 (09:44 -0800)]
X86_64: Don't use PLT nor GOT in static archives [BZ #20750]
There is no need to use PLT nor GOT in static archives to branch to a
function, regardless whether static archives is compiled with PIC or
not. When static archives are used to create dynamic executable,
PLT/GOT may be used. The resulting executable still works correctly.
[BZ #20750]
* sysdeps/x86_64/sysdep.h (JUMPTARGET): Check SHARED instead
of PIC.
Drop the GLIBC_TUNABLES environment variable from the environment of
setxid processes to avoid passing it on to non-setxid children. This
prevents potentially insecure tunables in the GLIBC_TUNABLES envvar
from crossing over into a child that may use a libc that has tunables
support.
powerpc: Fix write-after-destroy in lock elision [BZ #20822]
The update of *adapt_count after the release of the lock causes a race
condition when thread A unlocks, thread B continues and destroys the
mutex, and thread A writes to *adapt_count.
Mike Frysinger [Thu, 15 Dec 2016 23:34:05 +0000 (18:34 -0500)]
localedata: bs_BA: fix yesexpr/noexpr [BZ #20974]
Both regexes end with a "*." which means the previous match can be
omitted, and then the . allows them to match any input at all.
This means tools like coreutils' `rm -i` will always delete things
when prompted because the yesexpr regex matches all inputs (even
the negative ones).
Carlos O'Donell [Fri, 23 Dec 2016 18:30:22 +0000 (13:30 -0500)]
Bug 11941: ld.so: Improper assert map->l_init_called in dlclose
There is at least one use case where during exit a library destructor
might call dlclose() on a valid handle and have it fail with an
assertion. We must allow this case, it is a valid handle, and dlclose()
should not fail with an assert. In the future we might be able to return
an error that the dlclose() could not be completed because the opened
library has already been unloaded and destructors have run as part of
exit processing.
For more details see:
https://www.sourceware.org/ml/libc-alpha/2016-12/msg00859.html
Aurelien Jarno [Tue, 2 Aug 2016 07:18:59 +0000 (09:18 +0200)]
alpha: fix trunc for big input values
The alpha specific version of trunc and truncf always add and subtract
0x1.0p23 or 0x1.0p52 even for big values. This causes this kind of
errors in the testsuite:
Change this by returning the input value when its absolute value is
greater than 0x1.0p23 or 0x1.0p52. NaN have to go through the add and
subtract operations to get possibly silenced.
Finally remove the code to handle inexact exception, trunc should never
generate such an exception.
Changelog:
* sysdeps/alpha/fpu/s_trunc.c (__trunc): Return the input value
when its absolute value is greater than 0x1.0p52.
[_IEEE_FP_INEXACT] Remove.
* sysdeps/alpha/fpu/s_truncf.c (__truncf): Return the input value
when its absolute value is greater than 0x1.0p23.
[_IEEE_FP_INEXACT] Remove.
Aurelien Jarno [Tue, 2 Aug 2016 07:18:59 +0000 (09:18 +0200)]
alpha: fix rint on sNaN input
The alpha version of rint wrongly return sNaN for sNaN input. Fix that
by checking for NaN and by returning the input value added with itself
in that case.
Changelog:
* sysdeps/alpha/fpu/s_rint.c (__rint): Add argument with itself
when it is a NaN.
* sysdeps/alpha/fpu/s_rintf.c (__rintf): Likewise.
Aurelien Jarno [Tue, 2 Aug 2016 07:18:59 +0000 (09:18 +0200)]
alpha: fix floor on sNaN input
The alpha version of floor wrongly return sNaN for sNaN input. Fix that
by checking for NaN and by returning the input value added with itself
in that case.
Finally remove the code to handle inexact exception, floor should never
generate such an exception.
Changelog:
* sysdeps/alpha/fpu/s_floor.c (__floor): Add argument with itself
when it is a NaN.
[_IEEE_FP_INEXACT] Remove.
* sysdeps/alpha/fpu/s_floorf.c (__floorf): Likewise.
Aurelien Jarno [Tue, 2 Aug 2016 07:18:59 +0000 (09:18 +0200)]
alpha: fix ceil on sNaN input
The alpha version of ceil wrongly return sNaN for sNaN input. Fix that
by checking for NaN and by returning the input value added with itself
in that case.
Finally remove the code to handle inexact exception, ceil should never
generate such an exception.
Changelog:
* sysdeps/alpha/fpu/s_ceil.c (__ceil): Add argument with itself
when it is a NaN.
[_IEEE_FP_INEXACT] Remove.
* sysdeps/alpha/fpu/s_ceilf.c (__ceilf): Likewise.
There is transition penalty when SSE instructions are mixed with 256-bit
AVX or 512-bit AVX512 load instructions. Since _dl_runtime_resolve_avx
and _dl_runtime_profile_avx512 save/restore 256-bit YMM/512-bit ZMM
registers, there is transition penalty when SSE instructions are used
with lazy binding on AVX and AVX512 processors.
To avoid SSE transition penalty, if only the lower 128 bits of the first
8 vector registers are non-zero, we can preserve %xmm0 - %xmm7 registers
with the zero upper bits.
For AVX and AVX512 processors which support XGETBV with ECX == 1, we can
use XGETBV with ECX == 1 to check if the upper 128 bits of YMM registers
or the upper 256 bits of ZMM registers are zero. We can restore only the
non-zero portion of vector registers with AVX/AVX512 load instructions
which will zero-extend upper bits of vector registers.
This patch adds _dl_runtime_resolve_sse_vex which saves and restores
XMM registers with 128-bit AVX store/load instructions. It is used to
preserve YMM/ZMM registers when only the lower 128 bits are non-zero.
_dl_runtime_resolve_avx_opt and _dl_runtime_resolve_avx512_opt are added
and used on AVX/AVX512 processors supporting XGETBV with ECX == 1 so
that we store and load only the non-zero portion of vector registers.
This avoids SSE transition penalty caused by _dl_runtime_resolve_avx and
_dl_runtime_profile_avx512 when only the lower 128 bits of vector
registers are used.
_dl_runtime_resolve_avx_slow is added and used for AVX processors which
don't support XGETBV with ECX == 1. Since there is no SSE transition
penalty on AVX512 processors which don't support XGETBV with ECX == 1,
_dl_runtime_resolve_avx512_slow isn't provided.
[BZ #20495]
[BZ #20508]
* sysdeps/x86/cpu-features.c (init_cpu_features): For Intel
processors, set Use_dl_runtime_resolve_slow and set
Use_dl_runtime_resolve_opt if XGETBV suports ECX == 1.
* sysdeps/x86/cpu-features.h (bit_arch_Use_dl_runtime_resolve_opt):
New.
(bit_arch_Use_dl_runtime_resolve_slow): Likewise.
(index_arch_Use_dl_runtime_resolve_opt): Likewise.
(index_arch_Use_dl_runtime_resolve_slow): Likewise.
* sysdeps/x86_64/dl-machine.h (elf_machine_runtime_setup): Use
_dl_runtime_resolve_avx512_opt and _dl_runtime_resolve_avx_opt
if Use_dl_runtime_resolve_opt is set. Use
_dl_runtime_resolve_slow if Use_dl_runtime_resolve_slow is set.
* sysdeps/x86_64/dl-trampoline.S: Include <cpu-features.h>.
(_dl_runtime_resolve_opt): New. Defined for AVX and AVX512.
(_dl_runtime_resolve): Add one for _dl_runtime_resolve_sse_vex.
* sysdeps/x86_64/dl-trampoline.h (_dl_runtime_resolve_avx_slow):
New.
(_dl_runtime_resolve_opt): Likewise.
(_dl_runtime_profile): Define only if _dl_runtime_profile is
defined.
Aurelien Jarno [Thu, 24 Nov 2016 11:10:13 +0000 (12:10 +0100)]
x86_64: fix static build of __memcpy_chk for compilers defaulting to PIC/PIE
When glibc is compiled with gcc 6.2 that has been configured with
to default to PIC/PIE, the static version of __memcpy_chk is not built,
as the test is done on PIC instead of SHARED. Fix the test to check for
SHARED, like it is done for similar functions like memmove_chk.
Changelog:
* sysdeps/x86_64/memcpy_chk.S (__memcpy_chk): Check for SHARED
instead of PIC.
MIPS: Add `.insn' to ensure a text label is defined as code not data
Avoid a build error with microMIPS compilation and recent versions of
GAS which complain if a branch targets a label which is marked as data
rather than microMIPS code:
../sysdeps/mips/mips32/crti.S: Assembler messages:
../sysdeps/mips/mips32/crti.S:72: Error: branch to a symbol in another ISA mode
make[2]: *** [.../csu/crti.o] Error 1
as commit 9d862524f6ae ("MIPS: Verify the ISA mode and alignment of
branch and jump targets") closed a hole in branch processing, making
relocation calculation respect the ISA mode of the symbol referred.
This allowed diagnosing the situation where an attempt is made to pass
control from code assembled for one ISA mode to code assembled for a
different ISA mode and either relaxing the branch to a cross-mode jump
or if that is not possible, then reporting this as an error rather than
letting such code build and then fail unpredictably at the run time.
This however requires the correct annotation of branch targets as code,
because the ISA mode is not relevant for data symbols and is therefore
not recorded for them. The `.insn' pseudo-op is used for this purpose
and has been supported by GAS since:
Wed Feb 12 14:36:29 1997 Ian Lance Taylor <ian@cygnus.com>
so there has been no reason to avoid it where required. More recently
this pseudo-op has been documented, by the microMIPS architecture
specification[1][2], as required for the correct interpretation of any
code label which is not followed by an actual instruction in an assembly
source.
Use it in our crti.S files then, to mark that the trailing label there
with no instructions following is indeed not a code bug and the branch
is legitimate.
References:
[1] "MIPS Architecture for Programmers, Volume II-B: The microMIPS32
Instruction Set", MIPS Technologies, Inc., Document Number: MD00582,
Revision 5.04, January 15, 2014, Section 7.1 "Assembly-Level
Compatibility", p. 533
[2] "MIPS Architecture for Programmers, Volume II-B: The microMIPS64
Instruction Set", MIPS Technologies, Inc., Document Number: MD00594,
Revision 5.04, January 15, 2014, Section 8.1 "Assembly-Level
Compatibility", p. 623
2016-11-23 Matthew Fortune <Matthew.Fortune@imgtec.com>
Maciej W. Rozycki <macro@imgtec.com>
Fix writes past the allocated array bounds in execvpe (BZ#20847)
This patch fixes an invalid write out or stack allocated buffer in
2 places at execvpe implementation:
1. On 'maybe_script_execute' function where it allocates the new
argument list and it does not account that a minimum of argc
plus 3 elements (default shell path, script name, arguments,
and ending null pointer) should be considered. The straightforward
fix is just to take account of the correct list size on argument
copy.
2. On '__execvpe' where the executable file name lenght may not
account for ending '\0' and thus subsequent path creation may
write past array bounds because it requires to add the terminating
null. The fix is to change how to calculate the executable name
size to add the final '\0' and adjust the rest of the code
accordingly.
As described in GCC bug report 78433 [1], these issues were masked off by
GCC because it allocated several bytes more than necessary so that many
off-by-one bugs went unnoticed.
Checked on x86_64 with a latest GCC (7.0.0 20161121) with -O3 on CFLAGS.
Denis Kaganovich [Thu, 20 Oct 2016 20:01:39 +0000 (22:01 +0200)]
configure: accept __stack_chk_fail_local for ssp support too [BZ #20662]
When glibc is compiled with gcc 6.2 that has been configured with
--enable-default-pie and --enable-default-ssp, the configure script
fails to detect that the compiler has ssp turned on by default when
being built for i686-linux-gnu.
This is because gcc is emitting __stack_chk_fail_local but the
script is only looking for __stack_chk_fail. Support both.
Example output:
checking whether x86_64-pc-linux-gnu-gcc -m32 -Wl,-O1 -Wl,--as-needed
implicitly enables -fstack-protector... no
Joseph Myers [Thu, 3 Nov 2016 22:47:02 +0000 (22:47 +0000)]
Fix linknamespace parallel test failures.
Having found that with my script to build many glibc variants I could
reproduce the linknamespace test failures in parallel builds (that
various people had previously reported but I hadn't seen myself), I
investigated those failures further. This patch adds a missing
dependency to those tests.
Tested for x86_64, including the configuration where I saw those
failures and where I don't see them with this patch.
* conform/Makefile ($(linknamespace-header-tests)): Also depend on
$(linknamespace-symlists-tests).
Aurelien Jarno [Sun, 6 Nov 2016 20:33:10 +0000 (21:33 +0100)]
gconv.h: fix build with GCC 7
gconv.h is using a flex array to define the __gconv_info member in an
invalid way, causing GCC 7 to issue an error:
| In file included from ../include/gconv.h:1:0,
| from ../sysdeps/unix/sysv/linux/_G_config.h:32,
| from ../libio/libio.h:31,
| from ../include/libio.h:4,
| from ../libio/stdio.h:74,
| from ../include/stdio.h:5,
| from test-math-isinff.cc:22:
| ../iconv/gconv.h:142:50: error: flexible array member '__gconv_info::__data' not at end of 'struct _IO_codecvt'
| In file included from ../include/libio.h:4:0,
| from ../libio/stdio.h:74,
| from ../include/stdio.h:5,
| from test-math-isinff.cc:22:
| ../libio/libio.h:211:14: note: next member '_G_iconv_t _IO_codecvt::__cd_out' declared here
| ../libio/libio.h:187:8: note: in the definition of 'struct _IO_codecvt'
| In file included from ../include/gconv.h:1:0,
| from ../sysdeps/unix/sysv/linux/_G_config.h:32,
| from ../libio/libio.h:31,
| from ../include/libio.h:4,
| from ../libio/stdio.h:74,
| from ../include/stdio.h:5,
| from test-math-isinff.cc:22:
| ../iconv/gconv.h:142:50: error: flexible array member '__gconv_info::__data' not at end of 'struct _IO_wide_data'
| In file included from ../include/libio.h:4:0,
| from ../libio/stdio.h:74,
| from ../include/stdio.h:5,
| from test-math-isinff.cc:22:
| ../libio/libio.h:211:14: note: next member '_G_iconv_t _IO_codecvt::__cd_out' declared here
| ../libio/libio.h:215:8: note: in the definition of 'struct _IO_wide_data'
This is basically a revert to the code from 15 years ago. More details
are available in the GCC bug:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78039
Changelog:
* iconv/gconv.h (__gconv_info): Define __data element using a
zero-length array.
Joseph Myers [Tue, 25 Oct 2016 15:54:16 +0000 (15:54 +0000)]
Fix cmpli usage in power6 memset.
Building glibc for powerpc64 with recent (2.27.51.20161012) binutils,
with multi-arch enabled, I get the error:
../sysdeps/powerpc/powerpc64/power6/memset.S: Assembler messages:
../sysdeps/powerpc/powerpc64/power6/memset.S:254: Error: operand out of range (5 is not between 0 and 1)
../sysdeps/powerpc/powerpc64/power6/memset.S:254: Error: operand out of range (128 is not between 0 and 31)
../sysdeps/powerpc/powerpc64/power6/memset.S:254: Error: missing operand
Indeed, cmpli is documented as a four-operand instruction, and looking
at nearby code it seems likely cmpldi was intended. This patch fixes
this powerpc64 code accordingly, and makes a corresponding change to
the powerpc32 code.
Tested for powerpc, powerpc64 and powerpc64le by Tulio Magno Quites
Machado Filho
* sysdeps/powerpc/powerpc32/power6/memset.S (memset): Use cmplwi
instead of cmpli.
* sysdeps/powerpc/powerpc64/power6/memset.S (memset): Use cmpldi
instead of cmpli.
Although conceptually correct for p{read,write}{64} offset argument passing,
sh4 implementation does not generate the correct expected code. The
__ALIGNMENT_ARG redefinition is incorrect for two reasons: 1. the
kernel-features.h header is included multiple times (since it contains no
guards) and 2. the value it redefines is also incorrect (should be '0, '
instead of empty definition).
This patch fixes it by adding another macro, SYSCALL_LL_PRW{64}, meant to be
used to pass the offset argument on p{read,write}64. It is basically the
already define SYSCALL_LL{64} plus __ALIGNMENT_ARG unless __ASSUME_PRW_DUMMY_ARG
is define. In this case an empty dummy argument is used regardless how
__ALIGNMENT_ARG is defined (sh4 case).
Checked on x86_64, i686, aarch64, armhf, and powerpc64le (basically a sanity
check). Also, John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> and
James Clarke <jrtc27@jrtc27.com> help me check on a debian sh4 bootstrap using
2.24 plus this patch to verify it also corrected fixed the regression issue.
I also verified the generated object for a 2.24 build and master with this
patch for sh4 and both look identical.
* sysdeps/unix/sysv/linux/pread.c (__libc_pread): Use SYSCALL_LL_PRW.
* sysdeps/unix/sysv/linux/pwrite.c (__libc_pwrite): Likewise.
* sysdeps/unix/sysv/linux/pread64.c (__libc_pread64): Use
SYSCALL_LL64_PRW.
* sysdeps/unix/sysv/linux/pwrite64.c (__libc_pwrite64): Likewise.
* sysdeps/unix/sysv/linux/sh/kernel-features.h: Define
__ASSUME_PRW_DUMMY_ARG.
* sysdeps/unix/sysv/linux/sh/pread.c: Remove file.
* sysdeps/unix/sysv/linux/sh/pread64.c: Likewise.
* sysdeps/unix/sysv/linux/sh/pwrite.c: Likewise.
* sysdeps/unix/sysv/linux/sh/pwrite64.c: Likewise.
* sysdeps/unix/sysv/linux/sysdep.h: Define SYSCALL_LL_PRW and
SYSCALL_LL_PRW64 based on __ASSUME_PRW_DUMMY_ARG.
posix: Correctly block/unblock all signals on Linux posix_spawn
This patch correctly block and unblocks all signals when executing
Linux posix_spawn by using the __libc_signal_{un}block_all functions
instead of default sigprocmask. The latter might remove both
SIGCANCEL and SIGSETXID from the blocked signal list.
Checked on x86_64, i686, powerpc64le, and aarch64.
* sysdeps/unix/sysv/linux/spawni.c (__spawnix): Correctly block and unblock
all signals when executing the clone vfork child.
(SIGALL_SET): Remove macro.
posix: Correctly enable/disable cancellation on Linux posix_spawn
This patch correctly enable and disable asynchronous cancellation on
Linux posix_spawn. Current code invert the logic by enabling and
disabling instead. It also adds a new test to check if posix_spawn
is not a cancellation entrypoint.
Checked on x86_64, i686, powerpc64le, and aarch64.
* nptl/Makefile (tests): Add tst-exec5.
* nptl/tst-exec5.c: New file.
* sysdeps/unix/sysv/linux/spawni.c (__spawni): Correctly enable and disable
asynchronous cancellation.
Florian Weimer [Thu, 18 Aug 2016 09:15:42 +0000 (11:15 +0200)]
argp: Do not override GCC keywords with macros [BZ #16907]
glibc provides fallback definitions already. It is not necessary to
suppress warnings for unknown attributes because GCC does this
automatically for system headers.
This commit does not sync with gnulib because gnulib has started to use
_GL_* macros in the header file, which are arguably in the gnulib
implementation space and not suitable for an installed glibc header
file.
Aurelien Jarno [Fri, 5 Aug 2016 20:35:01 +0000 (22:35 +0200)]
sparc: remove fdim sparc specific implementations
The fdim and fdimf functions on sparc do not fully follow the standard
and do not set errno to ERANGE when the result overflows. Since glibc
2.24 this causes the two following tests to fail:
Failure: fdim (max_value, -max_value): errno set to 0, expected 34 (ERANGE)
Failure: fdim_upward (max_value, -max_value): errno set to 0, expected 34 (ERANGE)
It happens that using GCC with the generic C code generates very similar
code to the sparc specific implementations. Therefore this patches
remove them. Note it might still worth adding a vis3 specific version of
fdim on sparc32/sparcv9, this is done in a following patch to ease
backporting.
Aurelien Jarno [Tue, 2 Aug 2016 22:22:44 +0000 (00:22 +0200)]
powerpc: fix ifunc-sel.h fix asm constraints and clobber list
As pointer out on the mailing list, the inline assembly code in
sysdeps/powerpc/ifunc-sel.h doesn't have a list of clobbered registers
and used wrong constraints.
This patch fixes that. I verified it doesn't introduce any change in the
generated code.
Changelog:
* sysdeps/powerpc/ifunc-sel.h (ifunc_sel): Add "11", "12", "cr0" to the
clobber list. Use "i" constraint instead of "X".
(ifunc_one): Add "12" to the clobber list. Use "i" constraint instead
of "X".
Aurelien Jarno [Tue, 2 Aug 2016 22:22:44 +0000 (00:22 +0200)]
powerpc: fix ifunc-sel.h with GCC 6
On 32-bit PowerPC GCC 6 always saves the PIC register on the stack in
the prologue and adjust the stack in the epilogue. It is therefore not
possible anymore to just exit the function in the inline asm code,
otherwise it corrupts the stack pointer. This causes the following tests
to fail when using GCC 6:
It is necessary to preserve the invariant that if an arena is
on the free list, it has thread attach count zero. Otherwise,
when arena_thread_freeres sees the zero attach count, it will
add it, and without the invariant, an arena could get pushed
to the list twice, resulting in a cycle.
One possible execution trace looks like this:
Thread 1 examines free list and observes it as empty.
Thread 2 exits and adds its arena to the free list,
with attached_threads == 0).
Thread 1 selects this arena in reused_arena (not from the free list).
Thread 1 increments attached_threads and attaches itself.
(The arena remains on the free list.)
Thread 1 exits, decrements attached_threads,
and adds the arena to the free list.
The final step creates a cycle in the usual way (by overwriting the
next_free member with the former list head, while there is another
list item pointing to the arena structure).
tst-malloc-thread-exit exhibits this issue, but it was only visible
with a debugger because the incorrect fix in bug 19243 removed
the assert from get_free_list.
Florian Weimer [Thu, 4 Aug 2016 09:10:57 +0000 (11:10 +0200)]
x86: Use sysdep.o from libc.a in static libraries
Static libraries can use the sysdep.o copy in libc.a without
a performance penalty. This results in a visible difference
if libpthread.a is relinked into a single object file (which
is needed to support libraries which check for the presence
of certain symbols to enable threading support, which generally
fails with static linking unless libpthread.a is relinked).
sparc: remove ceil, floor, trunc sparc specific implementations
The ceil, floor and trunc functions on sparc do not fully follow the
standard and trigger an inexact exception when presented a value which
is not an integer. Since glibc 2.24 this causes a few tests to fail,
for instance:
testing double (without inline functions)
Failure: ceil (lit_pi): Exception "Inexact" set
Failure: ceil (-lit_pi): Exception "Inexact" set
Failure: ceil (min_subnorm_value): Exception "Inexact" set
Failure: ceil (min_value): Exception "Inexact" set
Failure: ceil (0.1): Exception "Inexact" set
Failure: ceil (0.25): Exception "Inexact" set
Failure: ceil (0.625): Exception "Inexact" set
Failure: ceil (-min_subnorm_value): Exception "Inexact" set
Failure: ceil (-min_value): Exception "Inexact" set
Failure: ceil (-0.1): Exception "Inexact" set
Failure: ceil (-0.25): Exception "Inexact" set
Failure: ceil (-0.625): Exception "Inexact" set
I tried to fix that by using the same strategy than used on other
architectures, that is by saving the FSR register at the beginning
and restoring it at the end of the function. When doing so I noticed
a comment that this operation might be very costly, so I decided to
do some benchmarks.
The benchmarks below represent the time required to run each of the
function 60 millions of times with different input value. I have done
that in the basic V9 code, the VIS2 code, and using the default C
implementation of the libc, for both sparc32 and sparc64, on a Niagara
T1 based machine and an UltraSparc IIIi. Given I don't have access to a
more recent machine), I haven't been able to test the VIS3 version. Also
it should be noted that it doesn't make sense to do this benchmark for
V8 or earlier as in that case we use the default C implementation. The
results are available in the table below, the "+ fix" version correspond
to the one saving and restoring the FSR.
The first thing that should be noted is that the C implementation is
always faster on the Niagara T1 based machine. On the UltraSparc IIIi
the float version on sparc32 is also faster.
Coming back about the fix saving and restoring the FSR, it appears
it has a big impact as expected. In that case the C implementation is
always faster than the fixed implementations.
This patch therefore removes the sparc specific implementations in
favor of the generic ones.
H.J. Lu [Wed, 27 Jul 2016 18:51:33 +0000 (11:51 -0700)]
Don't compile do_test with -mavx/-mavx/-mavx512
Don't compile do_test with -mavx, -mavx nor -mavx512 since they won't run
on non-AVX machines.
[BZ #20384]
* sysdeps/x86_64/fpu/Makefile (extra-test-objs): Add
test-double-libmvec-sincos-avx-main.o,
test-double-libmvec-sincos-avx2-main.o,
test-double-libmvec-sincos-main.o,
test-float-libmvec-sincosf-avx-main.o,
test-float-libmvec-sincosf-avx2-main.o and
test-float-libmvec-sincosf-main.o.
test-float-libmvec-sincosf-avx512-main.o.
($(objpfx)test-double-libmvec-sincos): Also link with
$(objpfx)test-double-libmvec-sincos-main.o.
($(objpfx)test-double-libmvec-sincos-avx): Also link with
$(objpfx)test-double-libmvec-sincos-avx-main.o.
($(objpfx)test-double-libmvec-sincos-avx2): Also link with
$(objpfx)test-double-libmvec-sincos-avx2-main.o.
($(objpfx)test-float-libmvec-sincosf): Also link with
$(objpfx)test-float-libmvec-sincosf-main.o.
($(objpfx)test-float-libmvec-sincosf-avx): Also link with
$(objpfx)test-float-libmvec-sincosf-avx2-main.o.
[$(config-cflags-avx512) == yes] (extra-test-objs): Add
test-double-libmvec-sincos-avx512-main.o and
($(objpfx)test-double-libmvec-sincos-avx512): Also link with
$(objpfx)test-double-libmvec-sincos-avx512-main.o.
($(objpfx)test-float-libmvec-sincosf-avx512): Also link with
$(objpfx)test-float-libmvec-sincosf-avx512-main.o.
(CFLAGS-test-double-libmvec-sincos.c): Removed.
(CFLAGS-test-float-libmvec-sincosf.c): Likewise.
(CFLAGS-test-double-libmvec-sincos-main.c): New.
(CFLAGS-test-double-libmvec-sincos-avx-main.c): Likewise.
(CFLAGS-test-double-libmvec-sincos-avx2-main.c): Likewise.
(CFLAGS-test-float-libmvec-sincosf-main.c): Likewise.
(CFLAGS-test-float-libmvec-sincosf-avx-main.c): Likewise.
(CFLAGS-test-float-libmvec-sincosf-avx2-main.c): Likewise.
(CFLAGS-test-float-libmvec-sincosf-avx512-main.c): Likewise.
(CFLAGS-test-double-libmvec-sincos-avx.c): Set to -DREQUIRE_AVX.
(CFLAGS-test-float-libmvec-sincosf-avx.c ): Likewise.
(CFLAGS-test-double-libmvec-sincos-avx2.c): Set to
-DREQUIRE_AVX2.
(CFLAGS-test-float-libmvec-sincosf-avx2.c ): Likewise.
(CFLAGS-test-double-libmvec-sincos-avx512.c): Set to
-DREQUIRE_AVX512F.
(CFLAGS-test-float-libmvec-sincosf-avx512.c): Likewise.
* sysdeps/x86_64/fpu/test-double-libmvec-sincos.c: Rewritten.
* sysdeps/x86_64/fpu/test-float-libmvec-sincosf.c: Likewise.
* sysdeps/x86_64/fpu/test-double-libmvec-sincos-avx-main.c: New
file.
* sysdeps/x86_64/fpu/test-double-libmvec-sincos-avx2-main.c:
Likewise.
* sysdeps/x86_64/fpu/test-double-libmvec-sincos-avx512-main.c:
Likewise.
* sysdeps/x86_64/fpu/test-double-libmvec-sincos-main.c:
Likewise.
* sysdeps/x86_64/fpu/test-float-libmvec-sincosf-avx-main.c:
Likewise.
* sysdeps/x86_64/fpu/test-float-libmvec-sincosf-avx2-main.c:
Likewise.
* sysdeps/x86_64/fpu/test-float-libmvec-sincosf-avx512-main.c:
Likewise.
* sysdeps/x86_64/fpu/test-float-libmvec-sincosf-main.c:
Likewise.