Paul Floyd [Sat, 28 Sep 2024 06:20:25 +0000 (08:20 +0200)]
Compiler warning in ML_(check_elf_and_get_rw_loads)
GCC 12.2 complains that
previous_rw_a_phdr.p_vaddr + previous_rw_a_phdr.p_filesz
may be using p_filesz uninitialized
That's only possible if ML_(img_get) somehow fails to read all
of a program header such that p_memsz is greater than 0 but
p_filesz remains uninitialized. Hardly likely since p_memsz
comes after p_filesz in the structure.
Paul Floyd [Fri, 27 Sep 2024 20:18:24 +0000 (22:18 +0200)]
FreeBSD: remove code for FreeBSD 10
FreeBSD 10 was never really tested - fully working FreeBSD support
arrived around the time of FreeBSD 11.3 and 12.1. FreeBSD 10 had
already been EOL for around 2 years by then.
Paul Floyd [Sun, 15 Sep 2024 07:52:56 +0000 (09:52 +0200)]
Bug 492210 - False positive on x86/amd64 with ZF taken directly from addition
Also adds similar checks for short and char equivalents to the
original int reproducer.
Initial fix provided by
Alexander Monakov <amonakov@gmail.com>
Two versions of the testcase, one with default options and one with
--expensive-definedness-checks=yes because the byte operations
subb and addb need the flag turned on explicitly.
Paul Floyd [Sat, 14 Sep 2024 18:56:54 +0000 (20:56 +0200)]
FreeBSD: add file descriptor tracking for _umtx_op
UMTX_OP_SHM with a sub request of UMTX_SHM_CREAT creates
an anonymous shared memory object and returns a file
descriptor. This fd is now tracked when required.
Andreas Arnez [Tue, 10 Sep 2024 16:38:49 +0000 (18:38 +0200)]
s390x: Add MSA support
Handle instructions that were added to z/Architecture with the
message-security assist (MSA) facility or with one of its extensions up to
MSA extension 9:
km -- ``cipher message''
kmc -- ``cipher message with chaining''
kimd -- ``compute intermediate message digest''
klmd -- ``compute last message digest''
kmac -- ``compute message authentication code''
kmf -- ``cipher message with cipher feedback''
kmctr -- ``cipher message with counter''
kmo -- ``cipher message with output feedback''
pcc -- ``perform cryptographic computation''
kma -- ``cipher message with authentication''
kdsa -- ``compute digital signature authentication''
Each of these instructions has multiple functions. Support all functions
described by MSA levels up to extension 9. Handle the instructions as
"extensions" and essentially forward them to the instructions themselves,
as long as they are available on the host.
will not be handled by this change, since it is privileged and should not
occur in user-space programs.
The MSA facilities are typically used by cryptographic libraries like
OpenSSL or openCryptoki. So far Valgrind suppresses the facility bits
indicating any MSA support, which causes such libraries to revert to a
software implementation.
This change enables running cryptographic applications under Valgrind
without reverting to an alternate code path.
Miao Wang [Mon, 26 Aug 2024 14:08:43 +0000 (22:08 +0800)]
sys_statx: support for statx(fd, NULL, AT_EMPTY_PATH)
statx(fd, NULL, AT_EMPTY_PATH) is supported since Linux 6.11 and this
patch adds the support to valgrind, so that it won't complain when
NULL is used as |filename| and |flags| includes AT_EMPTY_PATH.
Ref: commit 0ef625bba6fb ("vfs: support statx(..., NULL, AT_EMPTY_PATH, ...)")
Signed-off-by: Miao Wang <shankerwangmiao@gmail.com>
valgrind testing: extend vg_regtest to emit automake-style .trs/.log files
Extend vg_regtest to produce automake-style log files for each vgtest
case, so that developers and testsuite archiving/analysis tools such
as bunsen can examine passing as well as non-passing test outputs in
detail. The build-tree test-suite-overall.log file holds all the key
information about tests, especially failures.
Andreas Arnez [Mon, 19 Aug 2024 13:22:40 +0000 (15:22 +0200)]
s390x: Fix PC calculations with EX/EXRL
When executing under EX or EXRL, some instructions yield wrong results
under Valgrind. This affects
* PC-relative instructions such as LARL or BRC
* instructions that set a link register, such as BASR
The issue is caused by confusions about the various instruction addresses
involved. When executing an instruction under EX or EXRL, the following
addresses are relevant:
(1) The address of the execute instruction (guest_IA_curr_instr). This is
needed when restarting the instruction or iterating over it.
(2) The address following the execute instruction (guest_IA_next_instr).
This is what a link register needs to be set to.
(3) The address of the target instruction. This is the base for relative
addressing.
The latter isn't handled at all when translating for EX/EXRL. And the
instructions that set a link register don't use guest_IA_next_instr, but
add their own instruction length to guest_IA_curr_instr. This is wrong
whenever the target instruction and the EX/EXRL instruction have different
lengths.
Fix all this and enhance the test cases accordingly. The updated test
cases fail before this patch and succeed afterwards.
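The three addresses listed above can be modelled as follows. This is only a sketch: the struct and function names are hypothetical and are not taken from the Valgrind sources. It assumes the execute instruction is 4 bytes long for EX and 6 bytes for EXRL, and that relative offsets are counted in halfwords.

```c
#include <stdint.h>

/* Hypothetical model of the addresses involved when executing under
   EX/EXRL; none of these names come from the Valgrind sources. */
typedef struct {
    uint64_t ex_addr;     /* (1) address of the EX/EXRL instruction */
    uint64_t ex_len;      /* length of EX/EXRL itself: 4 for EX, 6 for EXRL */
    uint64_t target_addr; /* (3) address of the target instruction */
} ExecuteContext;

/* (2) The value a link register must receive: the address following the
   execute instruction, independent of the target instruction's length. */
static uint64_t link_address(const ExecuteContext *c)
{
    return c->ex_addr + c->ex_len;
}

/* Destination of a PC-relative operand. The offset is given in halfwords
   and is applied to the target instruction's address, as described in
   item (3) of the commit message. */
static uint64_t relative_address(const ExecuteContext *c, int64_t halfwords)
{
    return c->target_addr + (uint64_t)(halfwords * 2);
}
```

With an EXRL at 0x1000 and a target at 0x2000, link_address() yields 0x1006 regardless of whether the target instruction is 2, 4 or 6 bytes long.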
Andreas Arnez [Tue, 13 Aug 2024 11:52:07 +0000 (13:52 +0200)]
s390x: Fix performance issue with EXRL
Valgrind can currently run into a situation where a code block containing
EXRL is re-translated over and over, potentially causing extreme
slow-down. Such a slow-down has been observed when running the following
command under Valgrind:
z/Architecture has the "execute" instructions EX and EXRL. Valgrind
handles EX by translating it at least twice. The first translation just
copies the target instruction to the variable `last_execute_target' and
triggers a "restart", invalidating the current BB and creating a new BB
that starts with EX. The second translation contains the IR for the
instruction in `last_execute_target', but first checks if this still
matches the instruction to be executed. If not, it initiates a "restart",
as above. For EXRL there is a shortcut that sets `last_execute_target'
without going through the first translation.
Now the combination of two issues in the current implementation typically
leads to an EXRL being translated every time:
(1) An EXRL can appear in the middle of a BB. If so, a "restart" will
discard everything in the BB up to this point. And when getting back
to the same instructions, everything will be re-translated again.
(2) After commit 7e9113cb7a249e0fae2a365462c6b016 (handling Bug 405403),
the shortcut in s390_irgen_EXRL() only fills 6 instead of 8 bytes into
`last_execute_target', while the check still compares this to 8 bytes
from the target location. Thus the check usually fails, triggering a
"restart" of EXRL.
The first issue does not apply to EX, because there was already logic for
terminating a BB before an EX instruction. Just extend that logic and
treat EXRL the same way.
The second issue is caused by the discrepancy of reading 6 versus 8 bytes
and comparing these two. But in fact, reading 6 bytes and reading 8 bytes
are both incorrect. Only the bytes that belong to the instruction should be read
and compared. The instruction length can be determined from the first
byte `b' at the target location (2 bytes if b < 0x40, 4 bytes if b < 0xc0,
and 6 bytes otherwise), so do this.
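The length rule above can be sketched in a few lines. The function names here are hypothetical and the comparison helper only illustrates the idea of comparing exactly the instruction bytes; it is not the code from s390_irgen_EXRL().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Length in bytes of an s390x instruction, derived from its first byte
   as described in the commit message. */
static size_t insn_length(uint8_t first_byte)
{
    if (first_byte < 0x40) return 2;
    if (first_byte < 0xc0) return 4;
    return 6;
}

/* Hypothetical check: returns 1 if the cached copy still matches the
   instruction at the target location. Only the bytes that belong to the
   instruction are compared, so whatever follows it is ignored. */
static int target_unchanged(const uint8_t *cached, const uint8_t *target)
{
    return memcmp(cached, target, insn_length(target[0])) == 0;
}
```

Comparing a fixed 8 bytes would report a mismatch whenever the bytes after a 2-, 4- or 6-byte instruction differ from the cached copy, which is the spurious "restart" described above.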
Andreas Arnez [Thu, 8 Aug 2024 12:56:50 +0000 (14:56 +0200)]
s390x: Fix disassembly of locfh/locfhr, update S390_MAX_MNEMONIC_LEN
The length of the "longest mnemonic" for the s390x disassembler is
currently defined in s390_defs.h to be 8 characters, where in fact it
should be 9. Update the constant to reflect that.
Also fix the disassembly of the instructions locfh and locfhr, changing
them from their current wrong representations `locgh' and `locghr'.
Sam James [Mon, 22 Jul 2024 11:26:39 +0000 (12:26 +0100)]
configure: drop -flto-partition=one
For me, -flto-partition=one takes ~35m to build + test, while the default
(which is 'balanced') takes ~5m.
The reason that -flto-partition=one is slower is because it disables all
of gcc's LTO parallelisation. This can produce better code, at the cost
of (far) more expensive build times. If users want that, they can still
pass it in their *FLAGS, but I don't think it's a suitable default.
Andreas Arnez [Wed, 10 Jul 2024 16:47:07 +0000 (18:47 +0200)]
s390x: Re-implement STFLE as an extension
The existing implementation of the STFLE instruction does not use the
correct operand size when tracking memory effects. Instead of respecting
the user-provided maximum number of doublewords and the returned value
from the instruction, it assumes a hard coded value (S390_NUM_FACILITY_DW)
instead.
For example, if an application passes a buffer of 3 doublewords to STFLE
while Valgrind assumes a fixed size of 4 doublewords, Valgrind may falsely
complain about an invalid write for the last doubleword.
Fix this by re-implementing STFLE via the extension mechanism.
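The size that memory tracking needs to respect can be sketched as below. This assumes that STFLE stores the smaller of the doubleword count provided by the caller and the count the machine needs; the function name is hypothetical.

```c
#include <stddef.h>

/* Number of bytes STFLE writes, assuming it stores the smaller of the
   doubleword count the caller provided and the count the machine needs.
   One doubleword is 8 bytes. */
static size_t stfle_bytes_written(size_t provided_dw, size_t needed_dw)
{
    size_t stored = provided_dw < needed_dw ? provided_dw : needed_dw;
    return stored * 8;
}
```

For the example above, a 3-doubleword buffer on a machine that reports 4 doublewords gives 24 bytes, so a tracker assuming a fixed 32 bytes would flag the last doubleword incorrectly.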
Andreas Arnez [Wed, 10 Jul 2024 16:47:07 +0000 (18:47 +0200)]
s390x: Fix PRNO for SHA-512-DRNG generate
In the implementation of PRNO, when handling the "SHA-512-DRNG generate"
operation, the updated length is written back to the wrong register.
Also, while the instruction fills the output buffer from right-to-left,
the memory tracking is done as if it were the other way around. Fix both
of these issues.
Mark Wielaard [Thu, 4 Jul 2024 13:21:39 +0000 (15:21 +0200)]
Avoid dev/inode check on btrfs with --sanity-level=3
With --sanity-level=3 or higher the aspacemgr sanity checks compare the
device/inode numbers from /proc/self/maps against the file stat
results. These don't match on btrfs. So detect when a file is on a
btrfs volume and skip the check in that case.
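One way to detect a btrfs volume on Linux is to compare the file system type reported by statfs() with BTRFS_SUPER_MAGIC. The commit message does not say how Valgrind performs the detection, so the following is only an illustration using libc, with a hypothetical function name.

```c
#include <linux/magic.h>   /* BTRFS_SUPER_MAGIC */
#include <sys/vfs.h>       /* statfs */

/* Returns 1 if path is on a btrfs volume, 0 if it is not, and -1 if
   statfs() fails. Both sides are narrowed to unsigned int so that the
   comparison does not depend on the width or signedness of f_type. */
static int is_on_btrfs(const char *path)
{
    struct statfs sfs;
    if (statfs(path, &sfs) != 0)
        return -1;
    return (unsigned int)sfs.f_type == (unsigned int)BTRFS_SUPER_MAGIC;
}
```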
gdb on Fedora will warn about not being able to load the rpm python module.
Unable to load 'rpm' module. Please install the python3-rpm package.
Filter out that message so tests don't fail.
Don't allow programs calling fcntl on valgrind's own file descriptors
Add a call to ML_(fd_allowed) in the PRE handler of fcntl and fcntl64
and block syscalls with EBADF when the file descriptor isn't allowed
to be used by the program.
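The shape of such a PRE-handler check can be sketched as follows. The reserved range and both function names are hypothetical stand-ins; the real ML_(fd_allowed) has its own criteria that are not described here.

```c
#include <errno.h>

/* Hypothetical: descriptors at or above this value are reserved for the
   tool itself and must not be touched by the client program. */
#define RESERVED_FD_START 1000

/* Hypothetical stand-in for an "is this fd usable by the client" test. */
static int fd_allowed(int fd)
{
    return fd >= 0 && fd < RESERVED_FD_START;
}

/* Returns 0 if the fcntl syscall may proceed, or -EBADF if it has to be
   failed before it ever reaches the kernel. */
static int pre_fcntl_check(int fd)
{
    return fd_allowed(fd) ? 0 : -EBADF;
}
```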
Mark Wielaard [Sun, 16 Jun 2024 22:27:12 +0000 (00:27 +0200)]
Close both internal pipe fds after VG_(fork) in parent and child
A VG_(fork) creates a pipe between parent and child to synchronize the
two processes. The parent wants to register the child pid before the
child can run. This is done in register_sigchld_ignore.
Make sure both the parent and the child close both the read and write
file descriptors so none leak.
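The pattern can be illustrated with plain POSIX calls. This is a minimal sketch, not the VG_(fork) implementation: the function name is hypothetical, and the choice of which side reads and which side writes is an assumption.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that blocks until the parent has recorded its pid.
   Both processes close both pipe ends, so no descriptor leaks.
   Returns the child's exit status (0 on success) or -1 on error. */
static int fork_with_sync(pid_t *recorded_pid)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        char c;
        close(fds[1]);                    /* child never writes */
        ssize_t n = read(fds[0], &c, 1);  /* wait for the parent's signal */
        close(fds[0]);                    /* close the read end as well */
        _exit(n == 1 ? 0 : 1);
    }

    *recorded_pid = pid;                  /* "register" the child first */
    close(fds[0]);                        /* parent never reads */
    char go = 'x';
    ssize_t w = write(fds[1], &go, 1);    /* now let the child continue */
    close(fds[1]);                        /* close the write end as well */

    int status;
    if (waitpid(pid, &status, 0) != pid || w != 1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The sketch waits for the child only so that it can report a result; a real fork wrapper would return to the caller instead.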
Mark Wielaard [Sun, 16 Jun 2024 19:23:08 +0000 (21:23 +0200)]
Don't leave fds created with --log-file, --xml-file or --log-socket open
prepare_sink_fd and prepare_sink_socket will create a new file
descriptor for the output sink. finalize_sink_fd then copies the fd
to the safe range, so it doesn't conflict with any application fds.
If we created the original fd ourselves (it was a VgLogTo_File or
VgLogTo_Socket, not a VgLogTo_Fd), finalize_sink_fd should close it.
Also close socket when connecting fails in VG_(connect_via_socket).
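The idea of moving a descriptor into a safe range and closing the original only when we opened it ourselves can be sketched with fcntl(F_DUPFD). The function name and the we_own_fd parameter are hypothetical; this is not the finalize_sink_fd code.

```c
#include <fcntl.h>
#include <unistd.h>

/* Duplicate fd onto the lowest free descriptor >= min_fd. If we opened
   fd ourselves (we_own_fd != 0), close the original so that it does not
   stay open alongside the copy. A descriptor handed in by the user is
   left alone. Returns the new descriptor, or -1 on failure. */
static int move_fd_to_safe_range(int fd, int min_fd, int we_own_fd)
{
    int newfd = fcntl(fd, F_DUPFD, min_fd);
    if (newfd < 0)
        return -1;
    if (we_own_fd)
        close(fd);
    return newfd;
}
```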
Add a testcase for --log-file and --xml-file which prints output to
/dev/stderr
Add a new filter_xml. Note the use of --child-silent-after-fork=yes
in two vgtests. Maybe this should be the default for --xml=yes?
Otherwise xml output will be "corrupted" by output from a fork.
Paul Floyd [Mon, 10 Jun 2024 05:14:40 +0000 (07:14 +0200)]
FreeBSD: fixes for version 14.1
There were several leftovers from when I split FREEBSD_14
into 14_0 and 14_1 versions.
sys_break doesn't exist on arm64
There's a really annoying conditional jump error in a static copy
of strlen in ld-elf.so.1. We can't redirect the strlen, so I've
added a suppression. But it messes up test cases that use -s
to count errors.
Finally, FreeBSD 14.1 has removed a few old FreeBSD 7 syscalls.
David Benjamin [Thu, 16 May 2024 14:12:59 +0000 (10:12 -0400)]
Extract common arm64 SIMD helpers into a single header
This was copy-pasted between two files and, with the number of
extensions in aarch64, will likely need to be in many more. As the
header file defines a bunch of static, mutable state, some functions
needed to be moved to a separate .c file, to avoid weird behaviors from
C's textual inclusion model.
This also required refreshing fp_and_simd's expected output. The
fp_and_simd and fp_and_simd_v82 copies of randV128 produced slightly
different output because fp_and_simd_v82 also checked for valid f16s.
Deduplicating the code means we now apply that across the board.
NB: The fp_and_simd expected output was synthesized from what valgrind
thought the correct output was, *not* running the executable directly.
Valgrind does not seem to actually match a real Arm machine. This
divergence already existed before this commit. The divergence is in the
fmla, fcvtxn, and fcvtxn2 instructions. Looking at the corresponding
code in guest_arm64_toIR.c, I see various comments discussing how they
don't quite round correctly, so I'm guessing this is a known bug. For
now, as before this commit, I've generated the test expectations based
on the bug.
Andreas Arnez [Wed, 15 May 2024 12:32:42 +0000 (14:32 +0200)]
s390x: Support the deflate-conversion facility (DFLTCC)
So far the DFLTCC (deflate conversion call) instruction is not supported
by Valgrind. Similar to PRNO and NNPA, it is a "complex" instruction
whose memory effects cannot be adequately expressed with a dirty helper.
Add support for the DFLTCC instruction using the new "extension" mechanism
and reflect this accordingly in the supported facilities and HWCAPs.
Andreas Arnez [Wed, 15 May 2024 12:32:42 +0000 (14:32 +0200)]
Avoid use of guest_IP_AT_SYSCALL in handle_extension()
The guest state field guest_IP_AT_SYSCALL is referenced in
handle_extension(), even though it may not be defined by all
architectures. Avoid its use altogether.
Andreas Arnez [Wed, 15 May 2024 12:32:42 +0000 (14:32 +0200)]
Fix uninitialized `err' in handle_extension()
In handle_extension(), in the case of a second return from SCHEDSETJMP the
variable `err' would be used uninitialized. Fix this by avoiding any
access to `err' in this case.
Mark Wielaard [Mon, 13 May 2024 10:30:13 +0000 (12:30 +0200)]
README_DEVELOPERS: Replace b vgPlain_do_exec with b vgPlain_do_exec_inner
When building with --enable-lto vgPlain_do_exec is optimized out.
So replace the breakpoint example with vgPlain_do_exec_inner and
add a note that this is just an example and internal symbol names
might change or get optimized out.