Add fd_allowed and POST_newFd_RES to all syscalls that use or return fds
This makes sure all syscalls that take a file descriptor check
that the file descriptor is valid. Also makes sure that the
--modify-fds=high option affects all syscalls that return a file
descriptor.
Refactor code in preparation for running each testcase twice: once
with constant folding and once without.
- remove function print_opnd
- remove function complain
- factor out function get_expected_value
- checking the result moved to valgrind_execute_test
- make IRICB a static global in valgrind.c
- new_iricb now returns a pointer to it
Paul Floyd [Thu, 24 Jul 2025 20:45:01 +0000 (22:45 +0200)]
FreeBSD syscall: improve sigwait and sigwaitinfo wrapper.
Both take two pointers. We were allowing null pointers for all
of them. Only the 2nd argument of sigwaitinfo, info, is
allowed to be NULL. Update the scalar test with some NULL
arguments for these syscalls.
s390x: Fix crash when constant folding is disabled (BZ 507173)
Followup to 942a48c1d which fixed the register usage of conditional
moves for s390_insn_get_reg_usage. A similar fix is needed for
s390_insn_map_regs considering the case when the condition is
S390_CC_NEVER.
Paul Floyd [Sat, 19 Jul 2025 13:10:31 +0000 (15:10 +0200)]
Bug 505673 - Valgrind crashes with an internal error and SIGBUS when the guest tries to open its own file with O_WRONLY|O_CREAT|O_TRUNC
This is all quite messy.
It affects open() openat() and openat2() (the last of which is Linux only).
On Linux we also need to check for /proc/self/exe and /proc/PID/exe.
On Linux there are also a couple of RESOLVE flags for openat2() that
mean _don't_ check /proc magic links.
In the general case we need to have some reference to check whether
the filename matches the guest filename. So I've added that as
VG_(resolved_exename) (which I was already using on FreeBSD).
The pathname also needs to be canonicalised. It may be a
relative path, symlink or use RESOLVE_IN_ROOT. That uses
VG_(realpath) (again which was already present for FreeBSD).
On illumos the man page says that opening running binaries for
writing fails with errno set to ETXTBSY but that's not what
the open functions do - they just open the file. So I've done nothing
for illumos or Solaris. Maybe I'll open an illumos ticket.
I haven't tried on Darwin.
The Linux open functions with /proc/self/exe and /proc/PID/exe
were just calling dup on the fd that we hold for the client exe.
That means that we were ignoring any other flags. That has now changed.
If the open doesn't fail because the WRONLY/RDWR flags are set then
the syscall gets called from the PRE wrapper using VG_(resolved_exename)
instead of the /proc pathname.
I haven't tried to handle all of the Linux openat2 RESOLVE*
flags. RESOLVE_NO_MAGICLINKS is handled and I see the LTS test
openat202 now passing, so this should also fix Bug 506910.
I'm not sure that VG_(realpath) handles all forms of weird path
resolution on Linux (on FreeBSD it uses a syscall so that should
work OK).
Paul Floyd [Fri, 18 Jul 2025 11:21:26 +0000 (13:21 +0200)]
FreeBSD: fix check for mmap flags
On FreeBSD, mmap also has MAP_STACK and MAP_GUARD that can
be mapped without a backing file referred to by fd.
As a result during ld.so startup and thread creation mmap for
stacks was failing. So no guest could be loaded and executed,
with errors like
ld-elf.so.1: /home/paulf/scratch/valgrind_nightly/nightly/valgrind-new/.in_place/vgpreload_core-amd64-freebsd.so: mmap of entire address space failed: Bad file descriptor
Paul Floyd [Thu, 17 Jul 2025 18:38:54 +0000 (20:38 +0200)]
iropt regtest: use mrand48() instead of rand()
On illumos rand() has a RAND_MAX of 32k only. That's not enough to
generate 64bit values easily. So use mrand48() which generates
the full range of 32bit int values.
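As an illustration only, two mrand48() results could be combined into one
64bit value roughly as follows. The helper name random_u64 is hypothetical
and is not taken from the iropt test sources.

```c
#define _DEFAULT_SOURCE   /* expose mrand48()/srand48() under strict ISO C modes */
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: build a 64-bit value from two mrand48() calls.
   mrand48() returns a signed long uniformly distributed over
   [-2^31, 2^31), so truncating it to uint32_t covers all 32-bit values. */
static uint64_t random_u64(void)
{
   uint64_t hi = (uint32_t)mrand48();
   uint64_t lo = (uint32_t)mrand48();
   return (hi << 32) | lo;
}
```

Seeding with srand48() before the first call makes the sequence repeatable,
which helps when a failing test needs to be reproduced.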
Martin Cermak [Thu, 17 Jul 2025 07:16:53 +0000 (09:16 +0200)]
Wrap linux specific syscall 22 (ustat)
The ustat syscall comes from pre-git linux history. It is
deprecated in favor of statfs. But in some cases it may
still be used.
int ustat(dev_t dev, struct ustat *ubuf); returns information
about a mounted filesystem. dev is a device number identifying
a device containing a mounted filesystem. ubuf is a pointer to
a ustat structure.
Declare a sys_ustat wrapper in priv_syswrap-linux.h and hook
it for {amd64,arm,arm64,mips64,nanomips,ppc32,ppc64,riscv64,\
s390x,x86}-linux using LINXY with PRE and POST handler in
syswrap-linux.c
Mark Wielaard [Wed, 16 Jul 2025 00:45:39 +0000 (02:45 +0200)]
Check mmap fd is valid, if used, and fail early with EBADF if not
mmap should fail with EBADF if the given fd is bad (or used by valgrind
itself) when used (flags does not contain MAP_ANONYMOUS).
Check both with ML_(fd_allowed) (which might only warn) and fcntl
(VKI_F_GETFD) to see if the file descriptor is valid. Fail early so
the address space manager and the actual mmap call don't do
unnecessary work (and might fail with a different error code).
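The fcntl part of this check can be sketched in plain C as below. This is a
simplified model, not the wrapper code: check_mmap_fd is a hypothetical name,
and the real check additionally consults ML_(fd_allowed).

```c
#define _DEFAULT_SOURCE   /* expose MAP_ANONYMOUS on glibc under strict modes */
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>

/* Hypothetical helper: returns 0 if the mapping request may proceed,
   or EBADF if the request needs a file descriptor and fd is not open. */
static int check_mmap_fd(int flags, int fd)
{
   if (flags & MAP_ANONYMOUS)
      return 0;                /* fd is not used for anonymous mappings */
   if (fcntl(fd, F_GETFD) == -1)
      return EBADF;            /* fd does not refer to an open descriptor */
   return 0;
}
```

F_GETFD only reads the descriptor flags, so it is a cheap way to probe
validity without changing any state.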
Mark Wielaard [Tue, 15 Jul 2025 21:49:36 +0000 (23:49 +0200)]
Support mmap MAP_FIXED_NOREPLACE if defined
Define VKI_MAP_FIXED_NOREPLACE for amd64-linux, arm-linux,
arm64-linux, mips32-linux, mips64-linux, riscv64-linux and x86-linux.
If it is defined then ML_(generic_PRE_sys_mmap) will also interpret
VKI_MAP_FIXED_NOREPLACE as an MFixed hint. If the aspace manager
doesn't find a MAP_FIXED_NOREPLACE ok, then fail with EEXIST. If the
actual kernel mmap request fails and MAP_FIXED_NOREPLACE is set also
immediately fail with EEXIST without retrying.
Mark Wielaard [Mon, 14 Jul 2025 22:00:44 +0000 (00:00 +0200)]
Handle SIGSYS and SIGSTKFLT when defined
Both signals were already partially handled. But calculate_SKSS_from_SCSS
only handled SIGSYS on FreeBSD. default_action didn't handle SIGSTKFLT.
And sync_signalhandler didn't expect to have to handle SIGSYS.
Mark Wielaard [Mon, 14 Jul 2025 21:23:23 +0000 (23:23 +0200)]
Reject any attempt to set the handler for SIGKILL/STOP
Even though resetting SIGKILL or SIGSTOP to SIG_DFL would be a noop it
isn't allowed. Just always return EINVAL if an attempt is made to set
the signal handler for SIGKILL or SIGSTOP. There is an LTP test for
this, signal01.
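The native behaviour being matched can be observed with a small program
along these lines. It is a sketch of what a test might probe, not the LTP
signal01 source; try_reset_to_default is a hypothetical name.

```c
#define _DEFAULT_SOURCE   /* expose sigaction() under strict ISO C modes */
#include <errno.h>
#include <signal.h>
#include <string.h>

/* Try to install SIG_DFL for signo.
   Returns 0 on success or the errno value on failure. */
static int try_reset_to_default(int signo)
{
   struct sigaction sa;
   memset(&sa, 0, sizeof sa);
   sa.sa_handler = SIG_DFL;
   sigemptyset(&sa.sa_mask);
   if (sigaction(signo, &sa, NULL) == -1)
      return errno;
   return 0;
}
```

For SIGKILL and SIGSTOP the call is expected to fail with EINVAL even though
the requested disposition is already the default one.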
Add program to double-check VEX constant folding. BZ 506211
Using IR injection. Essentially:
- prepare input values for an IROp
- create an IRExpr for the IROp
- constant fold the expression
- make sure the result is an IRConst with the expected value
Only IROps with integer operands and result are supported.
No vector and floating point IROps. Maximum bit width is 64.
Part of fixing https://bugs.kde.org/show_bug.cgi?id=506211
Mark Wielaard [Fri, 11 Jul 2025 17:58:53 +0000 (19:58 +0200)]
linux mseal PRE wrapper should first check for overflow
According to https://docs.kernel.org/next/userspace-api/mseal.html
mseal returns -EINVAL when the address range (addr + len) overflows. The
LTP test mseal02 checks this. So do this check first before checking
for valid_client_addr (which returns -ENOMEM).
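The ordering of the two checks can be sketched as follows. This is a
simplified model: mseal_precheck and the client_lo/client_hi bounds are
hypothetical stand-ins for the PRE wrapper and for whatever
valid_client_addr consults.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pre-check: returns 0 if the request may proceed,
   or a negative errno value. */
static int mseal_precheck(uintptr_t addr, size_t len,
                          uintptr_t client_lo, uintptr_t client_hi)
{
   /* 1. Overflow of addr + len is reported first, as EINVAL. */
   if (len > UINTPTR_MAX - addr)
      return -EINVAL;
   /* 2. Only then is the range validated, reporting ENOMEM. */
   if (addr < client_lo || addr + len > client_hi)
      return -ENOMEM;
   return 0;
}
```

Testing len against UINTPTR_MAX - addr avoids computing addr + len before it
is known that the sum does not wrap.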
Mark Wielaard [Fri, 11 Jul 2025 15:18:47 +0000 (17:18 +0200)]
Check ppoll ufds array is safe to deref before checking fd members
LTP ppoll01 provides a bad fds array to ppoll as a testcase.
memcheck should warn (through PRE_MEM_READ) that this array is bad.
But it shouldn't try to dereference anything if it isn't safe.
Mark Wielaard [Thu, 10 Jul 2025 21:09:18 +0000 (23:09 +0200)]
Add fcntl14{,_64}, fcntl34{,_64} and fcntl36{,_64} to ltp-excludes.txt
These fcntl syscall tests time out and would need at least
LTP_TIMEOUT_MUL=5 when run under memcheck, which is several minutes,
so exclude them for now.
Mark Wielaard [Wed, 9 Jul 2025 16:27:17 +0000 (18:27 +0200)]
Suppress unimplemented fcntl command warning with -q
LTP tests fcntl13 and fcntl13_64 fail because even with -q valgrind
emits warnings about unknown (999) fcntl commands. Don't emit that
message with -q, just fail with EINVAL.
Fix operand / result types of Iop_DivU128[E], Iop_ModU128 and their signed counterparts
In libvex_ir.h these IROps are described to operate on Ity_I128 operands and produce a like typed result. This contradicts the specification in ir_defs.c
(function typeOfprimop) which claims Ity_V128 for operands and result.
Above IROps are used exclusively by ppc for the following opcodes:
Iop_DivU128 --> vdivuq Vector Divide Unsigned Quadword
Iop_DivS128 --> vdivsq Vector Divide Signed Quadword
Iop_DivU128E --> vdiveuq Vector Divide Extended Unsigned Quadword
Iop_DivS128E --> vdivesq Vector Divide Extended Signed Quadword
Iop_ModU128 --> vmoduq Vector Modulo Unsigned Quadword
Iop_ModS128 --> vmodsq Vector Modulo Signed Quadword
Reading the ISA document, it is clear that those opcodes perform an
integer division / modulo operation. Technically, they work on vector
registers, presumably because vector registers are the only resource
wide enough to store a quadword. Perhaps that is where the confusion
comes from.
So Ity_I128 it is.
So far there was only one application of IR injection, namely vbit-test.
Soonish there will be another.
Refactor the IRICB to separate out the structure for the application's
payload.
Paul Floyd [Tue, 8 Jul 2025 06:14:56 +0000 (08:14 +0200)]
Fix VEX/useful/Makefile-vex
This uses hard coded 'make' which may mean Solaris make or
BSD make rather than the initial invocation (e.g., gmake or some
other make that is not first in the PATH). Use ${MAKE} instead
so that the same make is used for the second invocation.
Compile errors because config.h not found. Turns out libvex_inner.h
Also missing was priv/host_generic_reg_alloc3.o causing linking to fail.
Now fixed.
Mark Wielaard [Fri, 4 Jul 2025 22:51:36 +0000 (00:51 +0200)]
Check dup2 oldfd before allowing the syscall
The dup201 LTP test fails with TFAIL: dup2(1024, 5) succeeded
That is because 1024 here is the soft file limit (so one higher than
the max number of fds). Valgrind raises the soft limit a little
internally to have a few private fds for itself. So this dup2 call
succeeds (and possibly dups an internal valgrind fd into the
newfd). We should check the oldfd before allowing the dup2 syscall,
like we already check the newfd.
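A simplified model of the check, assuming that client fds occupy
[0, soft limit) and that Valgrind's private fds sit at or above the client's
soft limit. The names client_fd_ok and dup2_precheck are hypothetical; the
real wrapper relies on ML_(fd_allowed).

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model: an fd belongs to the client only if it is below
   the soft limit the client sees. */
static bool client_fd_ok(int fd, int client_soft_limit)
{
   return fd >= 0 && fd < client_soft_limit;
}

/* Returns 0 if the dup2 call may proceed, or -EBADF otherwise.
   Both descriptors are checked, not only newfd. */
static int dup2_precheck(int oldfd, int newfd, int client_soft_limit)
{
   if (!client_fd_ok(oldfd, client_soft_limit))
      return -EBADF;
   if (!client_fd_ok(newfd, client_soft_limit))
      return -EBADF;
   return 0;
}
```

With a soft limit of 1024, dup2(1024, 5) is rejected in this model, which is
the outcome the dup201 test expects.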
Mark Wielaard [Fri, 4 Jul 2025 21:14:18 +0000 (23:14 +0200)]
Sanity check io_submit addresses before dereferencing
The LTP io_submit03 test fails under valgrind memcheck because it
tests bad struct iocb array addresses. Fix this by explicitly checking
the struct iocb pointer and each array element pointer are safe to
deref in the linux sys_io_submit PRE handler.
Define VKI_F_CREATED_QUERY in vki-linux.h.
Recognize it in PRE(sys_fcntl).
This fixes LTP test failures. When running:
make ltpchecks TESTS="fcntl40 fcntl40_64"
the tests would fail with:
fcntl40: unempty log2.filtered:
==1809471== Warning: unimplemented fcntl command: 1028
Florian Krohm [Mon, 30 Jun 2025 19:31:33 +0000 (19:31 +0000)]
s390x: Fix diagnostic for S390_DECODE_UNKNOWN_SPECIAL_INSN
When decoding fails the insn bytes (at most 6) are shown. However,
"special insns" are 10 bytes with the last 2 bytes being the interesting
ones. Print them all.
Mark Wielaard [Sat, 28 Jun 2025 16:33:29 +0000 (18:33 +0200)]
mips32: Use LINXY for statmount and listmount
commit 57152acfc6a8 "Wrap linux specific syscalls 457 (listmount) and
458 (statmount)" added LINXY wrappers for all arches, except for
mips32 where it used LINX_. This was a typo/mistake. Make sure mips32
also uses LINXY wrappers.
Martin Cermak [Fri, 27 Jun 2025 20:36:03 +0000 (22:36 +0200)]
Wrap linux specific syscalls 457 (listmount) and 458 (statmount)
The listmount syscall returns a list of mount IDs under the req.mnt_id.
This is meant to be used in conjunction with statmount(2) in order to
provide a way to iterate and discover mounted file systems.
The statmount syscall returns information about a mount, storing it in
the buffer pointed to by smbuf. The returned buffer is a struct
statmount which is of size bufsize.
Declare a sys_{lis,sta}tmount wrapper in priv_syswrap-linux.h and hook it
for {amd64,arm,arm64,mips64,nanomips,ppc32,ppc64,riscv64,s390x,x86}-linux
using LINXY with PRE and POST handler in syswrap-linux.c
Both syscalls need CAP_SYS_ADMIN to test successfully.
When --track-fds=bad is specified, do not warn about
leaked file descriptors and only warn about file descriptors
which were not opened or were already closed.
Update the documentation in docs/xml/manual-core.xml.
Add none/tests/track_bad test to test the new option.
Adjust none/tests/cmdline1 and none/tests/cmdline2 expected
outputs.
Mark Wielaard [Sat, 21 Jun 2025 21:04:04 +0000 (23:04 +0200)]
Update DW_TAG_subprogram parsing for clang
Clang doesn't give a name for some artificial subprograms. In that
case just use "<artificial>" as the name of the DW_TAG_subprogram.
Clang also sometimes generates a DW_TAG_subprogram without any
attributes. These aren't really useful for us. So just silently skip
them.
If we warn about subprograms without a name, specification or abstract
origin, also emit the index in the .debug_info section to make it
easier to look them up.
Mark Wielaard [Thu, 29 May 2025 21:41:52 +0000 (23:41 +0200)]
Rewrite DWARF inlined subroutine handling to work cross CU
The readdwarf3 parsers cannot read DIEs across CUs. An inlined
subroutine refers to a subprogram which has a name (or refers to a
declaration of a subprogram that has a name). These subprograms can be
(and often are when dwz has been used to compress the DWARF) in a
different CU. So a lot of inlined subroutines in backtraces are just
called "UnknownInlinedFun".
To work around not being able to read DIEs across CUs directly we
don't try to immediately resolve the name of the inlined subroutine by
following the abstract origin reference to the subprogram, but just
record it in the DiInlLoc. We also record all subprogram indexes while
parsing in a new DiSubprogram structure and whether the subprogram had
a name or had a reference to another subprogram (specification).
We have to look under a couple more DIEs. We normally want to skip any
DIE that doesn't have an address range when looking for inlined
subroutines, but there are various other DIEs that can contain a
subprogram (specification).
We also want to walk the DIEs from low to high (cooked DIE) index, so
we first pass over the main .debug_info, then the .debug_types, and
finally the alt .debug_info. That way we can store the DiSubprograms
in an array from low to high index and use a binary search to connect
the inlined subroutines to the subprogram that contains the name.
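The lookup step can be sketched with bsearch() over an array sorted by
index. The Subprog record below is a hypothetical, reduced stand-in for
DiSubprogram, and lookup_name is not a function from readdwarf3.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified record; the real DiSubprogram differs. */
typedef struct {
   uint64_t    index;   /* cooked DIE index, array is sorted on this */
   const char *name;
} Subprog;

/* bsearch comparator: key is a DIE index, elem is a Subprog. */
static int cmp_index(const void *key, const void *elem)
{
   uint64_t k = *(const uint64_t *)key;
   uint64_t e = ((const Subprog *)elem)->index;
   return (k > e) - (k < e);
}

/* Resolve the recorded abstract origin index to a name. */
static const char *lookup_name(const Subprog *tab, size_t n,
                               uint64_t abstract_origin)
{
   const Subprog *sp = bsearch(&abstract_origin, tab, n,
                               sizeof *tab, cmp_index);
   return sp ? sp->name : "UnknownInlinedFun";
}
```

Because the DIEs are visited from low to high index, entries can simply be
appended while parsing and the array needs no separate sort before the
search.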
The code also tracks whether the subprogram is artificial, but this
isn't used yet. It should make it possible for a followup patch to
remove artificial inlined subroutines from a backtrace.
Tested against emacs and libreoffice as packaged in Fedora where the
programs and all shared libraries used are processed with dwz. The new
code gives a name to every inlined subroutine, except when the DWARF
produced is bad and the DW_TAG_inlined_subroutine didn't contain a
DW_AT_abstract_origin and so no DW_TAG_subprogram can be found.
Martin Cermak [Tue, 17 Jun 2025 11:51:48 +0000 (13:51 +0200)]
Wrap linux specific mseal syscall
mseal takes address, size and flags. Flags are reserved for
future use. Modern CPUs support memory permissions such as RW and
NX bits. The mseal syscall takes address and size parameters to
additionally protect memory mapping against modifications.
Declare a sys_mseal wrapper in priv_syswrap-linux.h and hook it
for {amd64,arm,arm64,mips64,nanomips,ppc32,ppc64,riscv64,s390x,x86}-linux
using LINX_ with PRE handler in syswrap-linux.c
Florian Krohm [Sat, 7 Jun 2025 12:12:56 +0000 (12:12 +0000)]
s390x: Fix thinko, remove a few fixs390's
All of VEX's Iops are side effect free. They don't set the condition
code. Therefore, when emitting a VCEQ, VPK[L]S or VCH[L] the 'set cc' field
is 0. Nothing to fix here.
Florian Krohm [Sat, 7 Jun 2025 11:48:01 +0000 (11:48 +0000)]
s390x: Improve none/tests/s390x/cvb.c
The result of a CVB insn resides in an 8-byte register of which the
most significant 4 bytes ought to be unchanged by the insn. We want to
check that. So we need to use a variable of type 'long'.
Rewrite the code a bit.
In some cases the LTP tests intentionally work with SIGSEGV. This
happens e.g. with the mmap18 and select03 testcases. Valgrind
detects SIGSEGV and reports that as a failure.
Such a report can't be suppressed using the suppressions mechanism.
That's why this update comes with "output filters". Filters are
scripts that read from their stdin, and write filtered output to
their stdout. Filters reside in auxprogs/filters.
This update comes with 2 filters: For mmap18, and for select03.
They are awk scripts.
Besides the filters, this update also blacklists testcase fork13
because it is slow. It is possible to add comments prefixed with
the '#' sign (implicitly - because they don't match any testcase
name), so a comment is added too.
This update also introduces a new default valgrind switch, --vgdb=no. It
improves valgrind behavior with nftw01 nftw6401 setfsgid04 setfsgid03_16
and symlink03 testcases. These were previously complaining like this:
==22969== could not unlink /tmp/vgdb-pipe-from-vgdb-to-22969-by-root-on ...
Paul Floyd [Fri, 30 May 2025 06:47:00 +0000 (08:47 +0200)]
FreeBSD syscalls: change two more warnings
sysctl gave me a hard time when I first started working on FreeBSD
but that warning should only be emitted without -q.
Change the unhandled kenv warning to a VG_(unimplemented), which will
terminate Valgrind rather than just print a warning.
There are still a few aio warnings. Really they should be promoted
to some kind of fully fledged warning (maybe a core warning). I'm
not sure that it's worth the effort as I suspect that aio is not
much used.