Moved sel16x4_0/1/2/3 to VEX/priv/guest_generic_helpers.h.
Moved amd64g_calculate_sse_phminposuw from guest_amd64_helpers.c
to guest_generic_helpers.h and renamed to
g_calculate_sse_phminposuw so both x86 and amd64 can use it.
Add test function to sse4-common.h and update
none/tests/x86/sse4-x86.c to test the instruction.
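For readers unfamiliar with the instruction, a minimal sketch of what PHMINPOSUW computes: the minimum of eight unsigned 16-bit lanes and the index of its first occurrence. This is not the code of g_calculate_sse_phminposuw; the names MinPos and phminposuw_sketch and the array-based interface are assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical result type: minimum word and index of its first occurrence. */
typedef struct { uint16_t min; unsigned idx; } MinPos;

static MinPos phminposuw_sketch(const uint16_t lanes[8])
{
   MinPos r = { lanes[0], 0 };
   for (unsigned i = 1; i < 8; i++) {
      if (lanes[i] < r.min) {   /* strict '<' keeps the lowest index on ties */
         r.min = lanes[i];
         r.idx = i;
      }
   }
   return r;
}
```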
Paul Floyd [Wed, 3 Jun 2026 19:39:00 +0000 (21:39 +0200)]
Warning cleanup: one more -Wmisleading-indentation for minilzo-inl.c
I think that I will give up here. The ones in minilzo-inl.c were
easy to fix. The others are all deep in macros and caused by
macros putting everything on one line. I don't think that it's worth
the effort to use pragmas or per-file compiler options or to convert
the macros to static inline functions. I'm only seeing them with
GCC 8.5 on ppc64le.
Paul Floyd [Wed, 3 Jun 2026 06:06:49 +0000 (08:06 +0200)]
Regtest warnings: fixes for GCC 16.1 C++20 volatile deprecation
We now get warnings for things like using operator++ on a volatile.
Simply removing the volatile keyword seems to solve the problem.
In two of the cases the value is now returned in order to prevent the
compiler from optimising away the variable (which was probably the
original intent).
Paul Floyd [Sun, 31 May 2026 05:53:50 +0000 (07:53 +0200)]
Massif doc: mention that Darwin x86 uses unsigned long for size_t.
This is somewhat academic: macOS 10.15 dropped x86 in 2019, and
Valgrind does not support macOS 10.14 x86 since it uses AVX.
The last version of macOS x86 that we do support is
macOS 10.13, which was released in 2017 and reached end of support in
2020.
Paul Floyd [Fri, 29 May 2026 19:16:32 +0000 (21:16 +0200)]
Darwin version macros: remove DARWIN_10_6 and DARWIN_10_7
A while back I bumped up the configure.ac oldest supported OSX to
version 10.8 (that's the oldest with an xcode supporting C11,
I suppose that we could use MacPorts gcc or clang but I don't
want to turn back now).
There were a good number of uses of conditional DARWIN_10_6
and DARWIN_10_7 in the code that this change purges.
Paul Floyd [Fri, 29 May 2026 06:31:39 +0000 (08:31 +0200)]
Warning cleanup: make sure that either macros are defined or use #ifdef rather than #if
A comment in https://bugs.kde.org/show_bug.cgi?id=519604 about correctly checking
for DARWIN_VERS led me to doing some builds with -Wundef. We still get quite a few
warnings from mips and LZO. I'll repeat the exercise on illumos, Darwin and Linux amd64.
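A minimal sketch of the pattern, assuming a hypothetical macro DEMO_VERS that is deliberately left undefined here. Guarding the numeric comparison with defined() is intended to avoid evaluating an undefined identifier in #if, which is what -Wundef complains about. DEMO_VERS, DEMO_NEW_API and demo_uses_new_api are illustrative names, not Valgrind ones.

```c
/* DEMO_VERS is a hypothetical version macro that may or may not be defined. */

#if defined(DEMO_VERS) && DEMO_VERS >= 2
#define DEMO_NEW_API 1
#else
#define DEMO_NEW_API 0
#endif

/* Returns 1 only when DEMO_VERS is defined and at least 2. */
static int demo_uses_new_api(void)
{
   return DEMO_NEW_API;
}
```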
Florian Krohm [Fri, 22 May 2026 20:08:50 +0000 (22:08 +0200)]
Remove VEX/test
The snippets there are not suitable for testsuite integration.
Some compile with warnings or not at all (x87fxam.c, x87tst.c),
fpucw.c has no meaningful output. The remaining ones test x87
FPU state save / restore by dumping the state as hex bytes. That
would result in several .exp files because FPU state contains
instruction and data addresses.
Mark Wielaard [Fri, 22 May 2026 05:39:56 +0000 (07:39 +0200)]
Use helper functions to print double +/- inf/nan for sse4 testcase
Different libc printf implementations can have different ways to print
plus/minus nan/inf values. Add a print_double wrapper that checks
whether the double value is a nan, inf and whether it has a positive
or negative signbit. Similarly add a print_float wrapper. This way
we don't need any alternative .exps or filters.
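A rough sketch of such a wrapper, assuming C99 <math.h> classification macros. It formats into a buffer rather than printing so that it is easy to check; the name format_double_sketch and the exact "+nan"/"-inf" spellings are assumptions, not the strings used by the testcase.

```c
#include <math.h>
#include <stdio.h>

/* Hypothetical helper: format v into buf with a fixed spelling for nan/inf,
   independent of how the libc printf would print them. */
static void format_double_sketch(char *buf, size_t len, double v)
{
   const char *sign = signbit(v) ? "-" : "+";
   if (isnan(v))
      snprintf(buf, len, "%snan", sign);
   else if (isinf(v))
      snprintf(buf, len, "%sinf", sign);
   else
      snprintf(buf, len, "%f", v);
}
```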
Mark Wielaard [Thu, 5 Mar 2026 17:49:32 +0000 (18:49 +0100)]
Add PACKUSDW SSE4.1 support for x86
Add handling of PACKUSDW to VEX/priv/guest_x86_toIR.c based on the
guest_amd64_toIR.c implementation. Handle Iop_QNarrowBin32Sto16Ux8
using h_generic_calc_QNarrowBin32Sto16Ux8 in VEX/priv/host_x86_isel.c.
Move test_PACKUSDW from none/tests/amd64/sse4-64.c to
none/tests/sse4-common.h and add the same test to
none/tests/x86/sse4-x86.c with new PACKUSDW output in stdout.exp.
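For reference, a sketch of the narrowing that PACKUSDW performs: each signed 32-bit lane is saturated to an unsigned 16-bit value, with the destination operand supplying the low four words and the source operand the high four. This is not h_generic_calc_QNarrowBin32Sto16Ux8 itself; the function names and the array-based interface are assumptions.

```c
#include <stdint.h>

/* Narrow one signed 32-bit lane to unsigned 16 bits with saturation. */
static uint16_t qnarrow_32s_to_16u(int32_t x)
{
   if (x < 0)      return 0;
   if (x > 0xFFFF) return 0xFFFF;
   return (uint16_t)x;
}

/* Hypothetical interface: out[0..3] come from dst, out[4..7] from src. */
static void packusdw_sketch(uint16_t out[8],
                            const int32_t dst[4], const int32_t src[4])
{
   for (int i = 0; i < 4; i++) {
      out[i]     = qnarrow_32s_to_16u(dst[i]);
      out[i + 4] = qnarrow_32s_to_16u(src[i]);
   }
}
```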
Support pextrd instruction in guest_x86_toIR.c and
host_x86_isel.c. Add test function to sse4-common.h
and update none/tests/x86/sse4-x86.c to test the
instruction.
Florian Krohm [Sat, 16 May 2026 20:31:17 +0000 (20:31 +0000)]
New test none/tests/x86/fb_test_i386.c
This is essentially VEX/test/test-i386* with various fixes.
The md5sum code in fb_test_i386.c was copied verbatim from
none/tests/amd64/fb_test_amd64.c.
Extracts system register shift and mask macros to a new VEX public
header libvex_guest_arm64_sysregs.h
These macros are used in both VEX dirty helpers and coregrind
VG_(machine_get_hwcaps).
Mostly this is just refactoring the code. It should be more robust
in the face of changes concerning future ARM CPU features. The
id_aa64pfr0_el1 dirty helper has changed a bit - the code was wrong
but the output was right (for what we currently support).
Paul Floyd [Sun, 10 May 2026 13:10:58 +0000 (15:10 +0200)]
arm64: change configure dotprod test to a run test
We were testing for dotprod with AC_COMPILE_IFELSE. That's wrong because it only
checks that the compiler accepts -march=armv8.2-a+dotprod and that the
assembler handles dotprod opcodes. Changed it to AC_RUN_IFELSE which
additionally checks that the binary runs. Valgrind should be checking
for this in VG_(machine_get_hwcaps), storing it in VexArchInfo and
checking it in guest_arm64_toIR.c (it looks like we check it but don't
store 'dp' in vai.hwcaps and in general VEX doesn't check whether it
should try to decode these extensions).
This fixes the last part of the issues - CPUs with and without
crypto support. Rather than just fix building tests this splits
the crypto part out of fp_and_simd.c into a new file, v8crypto.c
Paul Floyd [Sun, 10 May 2026 07:34:19 +0000 (09:34 +0200)]
arm64: add a configure check for crypto support
This is related to https://bugs.kde.org/show_bug.cgi?id=391311
In the patch for that bugzilla item there is a build time fix.
Without the crypto tests the test will fail.
If there are no issues with this configure test I'll split off a new
test v8crypto from fp_and_simd.
This is point 6 on the list.
1. Cache the DZP bit from mrs dczid_el0 in a new field arm64_data_zero_prohibited of VexArchInfo
2. Use the cached values from VexArchInfo for mrs dczid_el0 rather than a dirty helper
3. Add a check to dc zva that the above DZP wasn't set. If it is then print a message
and stop with a Ijk_SigILL.
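The DZP handling in the steps above can be sketched as follows, assuming the Arm architecture layout of DCZID_EL0 (bit 4 is DZP, bits 3:0 are BS). The helper names are hypothetical; this does not show how VexArchInfo or the arm64_data_zero_prohibited field are actually populated.

```c
#include <stdint.h>
#include <stdbool.h>

/* DCZID_EL0: bit 4 is DZP (DC ZVA prohibited), bits 3:0 are BS. */
static bool dczid_dzp(uint64_t dczid)
{
   return ((dczid >> 4) & 1) != 0;
}

/* BS is log2 of the block size in 4-byte words, so bytes = 4 << BS. */
static unsigned dczid_block_bytes(uint64_t dczid)
{
   return 4u << (dczid & 0xF);
}
```

With a register value of 0x4 this gives a 64-byte block with DC ZVA permitted; 0x14 is the same block size with DC ZVA prohibited.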
Paul Floyd [Fri, 8 May 2026 10:47:41 +0000 (12:47 +0200)]
Regtest: make x86 and amd64 inf and nan production cross platform, part 2
This is for FreeBSD and probably Darwin as well.
Now that we have the same nan generation on all platforms we only need two
expecteds for these tests, one where printf of a negative nan outputs
-nan and the other where it outputs nan.
So, add an expected for FreeBSD x86. Rationalise the amd64 expecteds.
There were 2 for FreeBSD only differentiated by the mkPosNan codegen.
Now just the one is needed. While I'm at it make the expected name
more consistent, exp-freebsd rather than exp.freebsd.
Paul Floyd [Fri, 8 May 2026 07:05:42 +0000 (09:05 +0200)]
Regtest: make x86 and amd64 inf and nan production cross platform
We were using 1.0/0.0 to produce a positive inf and 0.0/0.0 to
produce a positive nan. GCC and clang handle the nan case
differently. GCC generates the division which counterintuitively
results in a negative nan. clang does constant folding and
directly generates a positive nan.
This change uses __builtin_inf and __builtin_nan instead. These
were added back in GCC 3.3 so there should be no backwards
compatibility issues.
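A minimal sketch of the builtin-based approach, assuming a GCC-compatible compiler. The helper names mk_pos_inf and mk_pos_nan are illustrative and may not match the test sources.

```c
#include <math.h>

/* No division is involved, so the result does not depend on whether
   the compiler folds 0.0/0.0 at build time or emits a divide. */
static double mk_pos_inf(void) { return __builtin_inf(); }
static double mk_pos_nan(void) { return __builtin_nan(""); }
```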
Add these and adapt the ROUNDSS and ROUNDSD testcases so they work on
both amd64 and x86. Move get_mxcsr, set_mxcsr, get_sse_roundingmode
and set_sse_roundingmode from sse4-64.c to sse4-common.h (and add x86
variants). Move ROUNDSD and ROUNDSS test function and make them use
XMMREG_DST instead of xmm11 (which isn't available on x86).
Add testcase output for test_ROUNDSD_w_immediate_rounding(),
test_ROUNDSS_w_immediate_rounding(), test_ROUNDSD_w_mxcsr_rounding()
and test_ROUNDSS_w_mxcsr_rounding() to sse4-x86.stdout.exp.
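As background for the rounding-mode helpers, a sketch of the bit manipulation involved, assuming the standard MXCSR layout where the rounding control field occupies bits 13-14. It works on a plain integer and leaves out the stmxcsr/ldmxcsr inline asm that the real get_mxcsr/set_mxcsr need; the function names here are hypothetical.

```c
#include <stdint.h>

/* MXCSR rounding control, bits 13-14:
   0 = nearest, 1 = down, 2 = up, 3 = toward zero. */
#define MXCSR_RC_SHIFT 13
#define MXCSR_RC_MASK  (3u << MXCSR_RC_SHIFT)

static unsigned mxcsr_get_rounding(uint32_t mxcsr)
{
   return (mxcsr & MXCSR_RC_MASK) >> MXCSR_RC_SHIFT;
}

/* Return mxcsr with only the rounding control field replaced. */
static uint32_t mxcsr_with_rounding(uint32_t mxcsr, unsigned mode)
{
   return (mxcsr & ~MXCSR_RC_MASK) | ((mode & 3u) << MXCSR_RC_SHIFT);
}
```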
Paul Floyd [Sun, 3 May 2026 06:39:07 +0000 (08:39 +0200)]
Darwin persona syscall: use the SYS_persona macro to detect at build time
It looks like I broke the OSX 10.10 build when I added the persona wrapper.
Rather than hard code the version for persona this should work on
all platforms. In the longer term I'd like to do the same as Linux
and FreeBSD and not use a VERS macro for Darwin syscalls.
Mark Wielaard [Sat, 2 May 2026 20:59:23 +0000 (22:59 +0200)]
Change XMMREG_DST in sse4-common.h back to xmm11 for amd64
XMMREG_DST for amd64 (x86_64) was accidentally set to xmm7 instead of
xmm11 in commit a1904db1dd0ee8c046a3fd89c822463cd496d78e
("Add SSE4.1 PBLENDVB, BLENDVPS and BLENDVPD").
Change it back to xmm11 to make sure a register that isn't available
in 32bit mode is tested. No changes to any of the test results because
of this.
Paul Floyd [Fri, 1 May 2026 09:16:17 +0000 (11:16 +0200)]
Darwin fixup_macho_loadcmds: make build more flexible
Part of work for https://bugs.kde.org/show_bug.cgi?id=519604
The value of maxprot for the __UNIXSTACK load command changed
from 7 to 3 during the macOS 10.14 lifecycle. Our code used
the SDK value to decide which to check for. MacPorts buildbot
targets all macOS 10.14 versions which does not work with our
assumption that latest 10.14 SDK => maxprot 3. In order to make
this more flexible we now allow maxprot 3 or 7 on macOS 10.14.
This only affects fixup_macho_loadcmds, which is used
once for each tool during builds.
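The relaxed check can be sketched like this, assuming the usual Mach VM protection bit values (read 1, write 2, execute 4). The enum and function names are hypothetical and the real code in fixup_macho_loadcmds is structured differently.

```c
#include <stdbool.h>

/* Mach VM protection bits: read = 1, write = 2, execute = 4. */
enum { DEMO_PROT_R = 1, DEMO_PROT_W = 2, DEMO_PROT_X = 4 };

/* Hypothetical check: accept either rw- (3) or rwx (7) for __UNIXSTACK. */
static bool unixstack_maxprot_ok(int maxprot)
{
   return maxprot == (DEMO_PROT_R | DEMO_PROT_W)
       || maxprot == (DEMO_PROT_R | DEMO_PROT_W | DEMO_PROT_X);
}
```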
Martin Cermak [Fri, 24 Apr 2026 11:21:05 +0000 (13:21 +0200)]
Recognize ioctl UFFDIO_* operations
The userfaultfd* LTP testcases demonstrate how valgrind
isn't aware of the ioctl UFFDIO_* operations. Teach
Valgrind to recognize those per linux/fs/userfaultfd.c.
Mark Wielaard [Wed, 22 Apr 2026 12:15:28 +0000 (14:15 +0200)]
Use SSizeT for VG_(readlink) result in VG_(realpath)
VG_(readlink) returns a negative value if it fails, which is checked
for right after the call. But SizeT is unsigned, so that check
can never be true. Use SSizeT instead for the result variable.
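A small sketch of the bug class, using size_t and long as stand-ins for SizeT and SSizeT. fake_readlink_fails is a hypothetical stand-in for a failing readlink-style call; compilers may warn about the unsigned comparison, which is the point.

```c
#include <stddef.h>
#include <stdbool.h>

/* Hypothetical stand-in for a readlink-style call that fails with -1. */
static long fake_readlink_fails(void) { return -1; }

static bool failure_seen_unsigned(void)
{
   size_t res = (size_t)fake_readlink_fails();  /* -1 wraps to a huge value */
   return res < 0;                              /* never true for unsigned */
}

static bool failure_seen_signed(void)
{
   long res = fake_readlink_fails();
   return res < 0;                              /* failure is detected */
}
```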
Mark Wielaard [Mon, 20 Apr 2026 15:24:23 +0000 (17:24 +0200)]
Update NEWS entries
- New VALGRIND_REPLACES_MALLOC and VALGRIND_GET_TOOLNAME client requests.
- Linux lightweight guard pages support (--max-guard-pages=N).
- x86 (32bit) (partial) SSE4.1 instruction support.
- Linux Test Project (LTP) v20260130 was integrated.
Mark Wielaard [Wed, 18 Feb 2026 16:48:15 +0000 (17:48 +0100)]
Add MOVNTDQA SSE4.1 support for x86
Add handling of MOVNTDQA to VEX/priv/guest_x86_toIR.c based on the
guest_amd64_toIR.c implementation.
Move test_MOVNTDQA from none/tests/amd64/sse4-64.c to
none/tests/sse4-common.h and add the same test to
none/tests/x86/sse4-x86.c with new MOVNTDQA output in stdout.exp.
Move the MPSADBW computation helper and IR builder from
guest_amd64_helpers.c and guest_amd64_toIR.c into a new
header guest_generic_helpers.h and guest_generic_sse.h,
so the helpers can be shared between x86 and amd64
implementations.
Move MPSADBW tests into the shared sse4-common.h so they are
also exercised on x86.
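For orientation, a sketch of the MPSADBW computation as I understand it from the instruction's definition: imm8 bits 1:0 select a 4-byte block in the source, bit 2 selects a byte offset of 0 or 4 in the destination, and each of the eight result words is a sum of four absolute byte differences. This is not the shared helper itself; the name and array interface are assumptions.

```c
#include <stdint.h>
#include <stdlib.h>

static void mpsadbw_sketch(uint16_t out[8], const uint8_t dst[16],
                           const uint8_t src[16], unsigned imm8)
{
   unsigned src_off = (imm8 & 3) * 4;         /* 4-byte block in src */
   unsigned dst_off = ((imm8 >> 2) & 1) * 4;  /* start byte in dst: 0 or 4 */
   for (unsigned i = 0; i < 8; i++) {
      unsigned sum = 0;
      for (unsigned j = 0; j < 4; j++)
         sum += (unsigned)abs((int)dst[dst_off + i + j] - (int)src[src_off + j]);
      out[i] = (uint16_t)sum;
   }
}
```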
Mark Wielaard [Thu, 12 Mar 2026 00:22:32 +0000 (01:22 +0100)]
Add PCMPEQQ SSE4.1 support for x86
Add handling of PCMPEQQ to VEX/priv/guest_x86_toIR.c based on the
guest_amd64_toIR.c implementation. Handle Iop_CmpEQ64x2 using
h_generic_calc_CmpEQ64x2 in VEX/priv/host_x86_isel.c.
Move test_PCMPEQQ from none/tests/amd64/sse4-64.c to
none/tests/sse4-common.h and add the same test to
none/tests/x86/sse4-x86.c with new PCMPEQQ output in stdout.exp.
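The lane-wise comparison behind PCMPEQQ can be sketched as below. This is an illustration, not h_generic_calc_CmpEQ64x2; the function name and the two-element array representation of a 128-bit value are assumptions.

```c
#include <stdint.h>

/* Each 64-bit lane becomes all ones if the inputs are equal, else zero. */
static void cmpeq64x2_sketch(uint64_t out[2],
                             const uint64_t a[2], const uint64_t b[2])
{
   for (int i = 0; i < 2; i++)
      out[i] = (a[i] == b[i]) ? ~UINT64_C(0) : 0;
}
```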
This possibly segfaults when arg2 is not a constant.
Optimising GCC figures it can first check arg2->tag == Iex_Const
which is cheaper than disp = arg2->Iex.Const.con->Ico.U64;
Nice one.
Regtested with both default compiler flags and -g only.
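A sketch of the safe ordering described above: check the tag before touching the union member that is only valid for constants. The types here are heavily simplified, hypothetical stand-ins for the VEX IR expression types, and get_const_disp is an illustrative name.

```c
#include <stdint.h>
#include <stdbool.h>

/* Simplified, hypothetical stand-ins for a tagged expression type. */
typedef enum { EX_CONST, EX_TMP } ExTag;
typedef struct { uint64_t u64; } Con;
typedef struct {
   ExTag tag;
   union {
      struct { Con *con; } Const;
      struct { unsigned tmp; } Tmp;
   } u;
} Expr;

/* Check the tag first; only then is the Const member safe to dereference. */
static bool get_const_disp(const Expr *e, uint64_t *disp)
{
   if (e->tag != EX_CONST)
      return false;
   *disp = e->u.Const.con->u64;
   return true;
}
```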