Mark Wielaard [Sat, 21 Jun 2025 21:04:04 +0000 (23:04 +0200)]
Update DW_TAG_subprogram parsing for clang
Clang doesn't give a name for some artificial subprograms. In that
case just use "<artificial>" as the name of the DW_TAG_subprogram.
Clang also sometimes generates a DW_TAG_subprogram without any
attributes. These aren't really useful for us. So just silently skip
them.
If we warn about subprograms without a name, specification or abstract
origin, also emit the index in the .debug_info section to make it
easier to look them up.
Mark Wielaard [Thu, 29 May 2025 21:41:52 +0000 (23:41 +0200)]
Rewrite DWARF inlined subroutine handling to work cross CU
The readdwarf3 parsers cannot read DIEs across CUs. An inlined
subroutine refers to a subprogram which has a name (or refers to a
declaration of a subprogram that has a name). These subprograms can be
(and often are when dwz has been used to compress the DWARF) in a
different CU. So a lot of inlined subroutines in backtraces are just
called "UnknownInlinedFun".
To work around not being able to read DIEs across CUs directly we
don't try to immediately resolve the name of the inlined subroutine by
following the abstract origin reference to the subprogram, but just
record it in the DiInlLoc. We also record all subprogram indexes while
parsing in a new DiSubprogram structure and whether the subprogram had
a name or had a reference to another subprogram (specification).
We have to look under a couple more DIEs. We normally want to skip any
DIE that doesn't have an address range when looking for inlined
subroutines, but there are various other DIEs that can contain a
subprogram (specification).
We also want to walk the DIEs from low to high (cooked DIE) index, so
we first pass over the main .debug_info, then the .debug_types, and
finally the alt .debug_info. That way we can store the DiSubprograms
in an array from low to high index and use a binary search to connect
the inlined subroutines to the subprogram that contains the name.
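The lookup described above can be sketched as follows. This is only an illustration under stated assumptions, not the readdwarf3 code: SubprogSketch, find_subprog and resolve_named are hypothetical names, the record layout is invented, and an index of 0 is assumed to mean "no reference".

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for a DiSubprogram record. */
typedef struct {
    unsigned long index; /* cooked DIE index of the subprogram */
    const char   *name;  /* NULL when the DIE itself has no name */
    unsigned long ref;   /* index of a referenced subprogram, 0 if none (assumption) */
} SubprogSketch;

/* bsearch callback: key points to an index, elem to a record. */
static int cmp_index(const void *key, const void *elem)
{
    unsigned long k = *(const unsigned long *)key;
    unsigned long e = ((const SubprogSketch *)elem)->index;
    return (k > e) - (k < e);
}

/* Binary search for a record; tab must be sorted from low to high index. */
static const SubprogSketch *find_subprog(const SubprogSketch *tab, size_t n,
                                         unsigned long index)
{
    assert(tab != NULL || n == 0);
    return bsearch(&index, tab, n, sizeof tab[0], cmp_index);
}

/* Follow references from an abstract origin until a named record is found.
   The hop limit keeps malformed, cyclic references from looping forever.
   Returns NULL when no named subprogram can be reached. */
static const SubprogSketch *resolve_named(const SubprogSketch *tab, size_t n,
                                          unsigned long origin)
{
    for (int hops = 0; hops < 8; hops++) {
        const SubprogSketch *sp = find_subprog(tab, n, origin);
        if (sp == NULL || sp->name != NULL)
            return sp;
        if (sp->ref == 0)
            return NULL;
        origin = sp->ref;
    }
    return NULL;
}
```

Because the records are appended in index order during the three passes, no separate sort step is needed before searching.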
The code also tracks whether the subprogram is artificial. This isn't
used yet, but should make it possible for a followup patch to
remove artificial inlined subroutines from a backtrace.
Tested against emacs and libreoffice as packaged in Fedora where the
programs and all shared libraries used are processed with dwz. The new
code gives a name to every inlined subroutine, except when the DWARF
produced is bad and the DW_TAG_inlined_subroutine didn't contain a
DW_AT_abstract_origin and so no DW_TAG_subprogram can be found.
Martin Cermak [Tue, 17 Jun 2025 11:51:48 +0000 (13:51 +0200)]
Wrap linux specific mseal syscall
mseal takes address, size and flags. Flags are reserved for
future use. Modern CPUs support memory permissions such as RW and
NX bits. The mseal syscall takes address and size parameters to
additionally protect memory mapping against modifications.
Declare a sys_mseal wrapper in priv_syswrap-linux.h and hook it
for {amd64,arm,arm64,mips64,nanomips,ppc32,ppc64,riscv64,s390x,x86}-linux
using LINX_ with a PRE handler in syswrap-linux.c.
Florian Krohm [Sat, 7 Jun 2025 12:12:56 +0000 (12:12 +0000)]
s390x: Fix thinko, remove a few fixs390's
All of VEX's Iops are side effect free. They don't set the condition
code. Therefore, when emitting a VCEQ, VPK[L]S or VCH[L] the 'set cc' field
is 0. Nothing to fix here.
Florian Krohm [Sat, 7 Jun 2025 11:48:01 +0000 (11:48 +0000)]
s390x: Improve none/tests/s390x/cvb.c
The result of a CVB insn resides in an 8-byte register of which the
most significant 4 bytes ought to be unchanged by the insn. We want to
check that. So we need to use a variable of type 'long'.
Rewrite the code a bit.
In some cases the LTP tests intentionally work with SIGSEGV. This
happens e.g. with the mmap18 and select03 testcases. Valgrind
detects SIGSEGV and reports that as a failure.
Such a report can't be suppressed using the suppressions mechanism.
That's why this update comes with "output filters". Filters are
scripts that read from their stdin, and write filtered output to
their stdout. Filters reside in auxprogs/filters.
This update comes with 2 filters: For mmap18, and for select03.
They are awk scripts.
Besides the filters, this update also blacklists testcase fork13
because it is slow. It is possible to add comments prefixed with
the '#' sign (implicitly - because they don't match any testcase
name), so a comment is added too.
This update also introduces a new default valgrind --vgdb=no switch. It
improves valgrind behavior with nftw01 nftw6401 setfsgid04 setfsgid03_16
and symlink03 testcases. These were previously complaining like this:
==22969== could not unlink /tmp/vgdb-pipe-from-vgdb-to-22969-by-root-on ...
Paul Floyd [Fri, 30 May 2025 06:47:00 +0000 (08:47 +0200)]
FreeBSD syscalls: change two more warnings
sysctl gave me a hard time when I first started working on FreeBSD,
but that warning should only be printed without -q.
Change the unhandled kenv warning to a VG_(unimplemented), which will
terminate Valgrind rather than just print a warning.
There are still a few aio warnings. Really they should be promoted
to some kind of fully fledged warning (maybe a core warning). I'm
not sure that it's worth the effort as I suspect that aio is not
much used.
Use --modify-fds=yes to restrict the option from affecting
the 0/1/2 file descriptors as they're often used for
stdin/stdout/stderr redirection.
The new possibility is named "yes" because "yes" is used
as the default in general. The default behaviour of the --modify-fds
option is then such that the highest available file descriptor is returned
except when the lowest stdin/stdout/stderr (0, 1, 2) are available.
For example, if we want to redirect stdout to stderr by closing stdout
(file descriptor 1) and then calling dup (), file descriptor 1 will be
returned and not the highest number available. This is because the
following is a common pattern to redirect stdout to stderr:
close (1);
/* stdout becomes stderr */
ret = dup (2);
Add none/tests/track_yes.vgtest and none/tests/track_high.vgtest
tests to test --modify-fds=yes/high behave as expected.
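The redirection pattern above relies on the POSIX rule that dup () returns the lowest free file descriptor. A minimal native sketch of that rule follows; dup_reuses_lowest_fd is a hypothetical helper, it assumes /dev/null exists and a single-threaded process, and it uses freshly opened descriptors instead of 0/1/2 so it does not disturb the caller's standard streams.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Open /dev/null twice, close the first descriptor, then dup the second.
   Returns 1 if dup() handed back the descriptor that was just closed,
   0 if it did not, and -1 if open() or dup() failed. */
static int dup_reuses_lowest_fd(void)
{
    int a = open("/dev/null", O_RDONLY);
    int b = open("/dev/null", O_RDONLY);
    if (a < 0 || b < 0) {
        if (a >= 0) close(a);
        if (b >= 0) close(b);
        return -1;
    }
    /* open() also returns the lowest free descriptor, so a comes first. */
    assert(a < b);

    close(a);            /* a is now the lowest free descriptor */
    int d = dup(b);      /* natively this is expected to return a */
    int reused = (d == a);

    if (d >= 0) close(d);
    close(b);
    return d < 0 ? -1 : reused;
}
```

Run natively this is expected to return 1. Going by the description above, descriptors other than 0, 1 and 2 may be handed out differently under Valgrind when --modify-fds is in effect, so the result there can differ.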
Mark Wielaard [Tue, 20 May 2025 10:09:13 +0000 (12:09 +0200)]
PRE(sys_prlimit64): Check ARG3 and ARG4 ML_(safe_to_deref) up front
The previous commit 859d267a456c "PR504341: Prevent LTP setrlimit05
syscall test from crashing valgrind" changed the checking logic of the
PRE handler by changing the if-else control flow. Do the ARG3 and ARG4
ML_(safe_to_deref) checking up front and return EFAULT early so the
later checking logic doesn't need to change.
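The idea of validating both pointer arguments before any other logic runs can be sketched like this. It is not the actual PRE handler: precheck_rlimit_args, deref_check_fn and toy_safe are hypothetical names, and the predicate merely stands in for ML_(safe_to_deref). The sketch also assumes that a NULL pointer is a legal argument and needs no check.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical predicate standing in for ML_(safe_to_deref):
   nonzero means len bytes at addr may be dereferenced. */
typedef int (*deref_check_fn)(const void *addr, size_t len);

/* Check both pointer arguments up front. Returns 0 when the rest of the
   handler may proceed, or EFAULT when a non-NULL pointer is not safe. */
static int precheck_rlimit_args(const void *new_rlim, const void *old_rlim,
                                size_t rlim_size, deref_check_fn safe)
{
    assert(safe != NULL);
    if (new_rlim != NULL && !safe(new_rlim, rlim_size))
        return EFAULT;
    if (old_rlim != NULL && !safe(old_rlim, rlim_size))
        return EFAULT;
    return 0;
}

/* Toy predicate for demonstration only: treat the first page as unmapped. */
static int toy_safe(const void *addr, size_t len)
{
    (void)len;
    return (uintptr_t)addr >= 4096;
}
```

With the early return in place, the later checks can keep their original if-else shape because they only ever see pointers that passed the test.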
Mark Wielaard [Mon, 19 May 2025 19:42:18 +0000 (21:42 +0200)]
Add fixed bug 504466 double close causes SEGV to NEWS
https://bugs.kde.org/show_bug.cgi?id=504466 was fixed by
commit 8187386962598d1393eaf6cf4e032996f5edabb3
Check whether file descriptor is inherited before printing where_opened
Martin Cermak [Mon, 19 May 2025 09:45:04 +0000 (11:45 +0200)]
PR504341: Prevent LTP setrlimit05 syscall test from crashing valgrind
Prevent ltp/testcases/kernel/syscalls/setrlimit/setrlimit05 testcase
from crashing valgrind when passing 0xffffffffffff as ARG3 and then
trying to dereference it.
Mark Wielaard [Sun, 18 May 2025 13:31:36 +0000 (15:31 +0200)]
Check whether file descriptor is inherited before printing where_opened
Inherited file descriptors don't have an ExeContext where they were
opened (by the program). So don't try to print the NULL where_opened
when reporting double close errors for such file descriptors.
Add a testcase none/tests/fdleak_doubleclose0 that crashes valgrind
before this fix.
Martin Cermak [Fri, 16 May 2025 20:12:39 +0000 (22:12 +0200)]
PR503969: make ltpchecks: flatten the log structure
Flatten the directory structure of make ltpchecks logs per PR503969#c9.
Individual syscall tests are numbered, so that no testcase naming
conflicts should show up. Demo upload:
Martin Cermak [Fri, 16 May 2025 09:46:06 +0000 (11:46 +0200)]
PR503969: Make test results of make ltpchecks compatible with bunsen
Synthesize automake-like testcase log format for bunsen compatibility.
Each testcase now produces a triplet of <testcase>.{log,trs}, and a
test-suite.log file. This way test results can be uploaded to bunsen
using the t-upload-git-push command.
The <testcase>.log file contains the actual testcase log. The .trs
file contains automake-encoded test result. The test-suite.log
file contains testcase timing information.
The log directory tree respects the one of the LTP testcases to avoid
naming collisions.
Here is how the test result was uploaded to Bunsen:
> TESTING FINISHED, logs in /home/mcermak/WORK/valgrind/valgrind/auxprogs/auxchecks/ltp-full-20250130/ltp
> make[1]: Leaving directory '/home/mcermak/WORK/valgrind/valgrind/auxprogs'
>
> real 8m27.467s
> user 44m46.416s
> sys 6m46.405s
> 41$ cd /home/mcermak/WORK/valgrind/valgrind/auxprogs/auxchecks/ltp-full-20250130/ltp
> 41$ t-upload-git-push ssh://builder.sourceware.org/git/bunsendb.git mcermak/PR503969-$(date +%s) \
> $(find . -type f -name '*.log' -o -name '*.trs' -o -name 'test-suite.log')
> 3036eba2de1cc44a1942f9681720ea234da9029e refs/tags/mcermak/PR503969-1747389329
> 41$
Florian Krohm [Thu, 15 May 2025 21:42:19 +0000 (21:42 +0000)]
Compile code that gets linked with libc by adding -fhosted.
The use of -fno-builtin interferes with -Wformat: no format warnings will
be given when both options are present. Because -fno-builtin means: when
you encounter a function that has the same name as a built-in function,
forget everything you know about that function.
That includes function attributes. And without __attribute__((format...))
there will be no -Wformat warnings.
The fix is to append -fhosted when compiling something that gets linked
with libc. As command line options are processed left to right adding
-fhosted at the end overrides any earlier -fno-builtin option.
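To illustrate why the attribute matters, here is a small sketch of a user-defined printf-style function that carries its own format attribute. format_into is a hypothetical helper, and the attribute syntax is the GCC/Clang extension. With the attribute present, -Wformat can check the arguments at each call site; without it, a mismatch would go unnoticed.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Argument 3 is a printf-style format string and the values to be
   checked against it start at argument 4. */
static int format_into(char *buf, size_t size, const char *fmt, ...)
    __attribute__((format(printf, 3, 4)));

/* Format into buf and return what vsnprintf returns: the number of
   characters the full output needs, excluding the terminating NUL. */
static int format_into(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int n;

    assert(buf != NULL && size > 0);
    va_start(ap, fmt);
    n = vsnprintf(buf, size, fmt, ap);
    va_end(ap);
    return n;
}

/* A call such as format_into(buf, sizeof buf, "%s", 42) would draw a
   -Wformat warning because of the attribute on the declaration. */
```

The built-in knowledge about printf and friends works the same way, which is why discarding it with -fno-builtin silences the warnings.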
Fix compiler warnings.
Add some ugly ifdeffery to drd/drd_intercepts.c. I could not figure out how
to rewrite the offensive printf to avoid testcase breakage.
Martin Cermak [Thu, 15 May 2025 10:03:55 +0000 (12:03 +0200)]
Wrap linux specific cachestat syscall
cachestat takes an fd, cstat_range and flags arguments and
writes out page cache statistics via the cstat struct.
Declare a sys_cachestat wrapper in priv_syswrap-linux.h and hook it
for {amd64,arm,arm64,mips64,nanomips,ppc32,ppc64,riscv64,s390x,x86}-linux
using LINXY with PRE/POST handlers in syswrap-linux.c.
Define __NR_cachestat for amd64, arm, arm64, mips32, mips64, nanomips,
ppc32, ppc64, riscv64, s390x, and x86.
Mark Wielaard [Tue, 13 May 2025 22:13:06 +0000 (00:13 +0200)]
Don't count closed inherited file descriptors
Programs which close some inherited file descriptors and are run under
valgrind with -q and --track-fds=yes would still show the FILE
DESCRIPTORS banner even if there were no non-inherited file
descriptors still open.
Fix this by not counting already closed inherited file descriptors.
Add a simple testcase to show /usr/bin/cat /dev/null doesn't produce
any output running under valgrind -q --track-fds=yes.
Martin Cermak [Tue, 13 May 2025 12:21:08 +0000 (14:21 +0200)]
auxprogs/ltp-tester.sh: Fix typo in summary log grep
This does point out a couple more failures: cases where the test results
differ when running under valgrind, but there are no other error
indicators. E.g. openat202 now reports:
Which points out we have a bug in our openat wrapper when given
/proc/self/exe with RESOLVE_NO_MAGICLINKS (which should fail with
ELOOP, but succeeds when running under valgrind, probably because we
have a "magic" /proc/self/exe wrapper).
Paul Floyd [Mon, 12 May 2025 19:38:46 +0000 (21:38 +0200)]
illumos regtest: mask a leak in memcheck/tests/realloc_size_zero_xml
There's what looks like a libc printf buffer possible leak
in this test on illumos. With xml output leak reporting is
hard coded to on. So to avoid this extra error add
--show-possibly-lost=no
Maybe this should be suppressed. I haven't managed to reproduce it
without xml and outside of perl regtest.
Mark Wielaard [Fri, 9 May 2025 11:46:44 +0000 (13:46 +0200)]
Add workaround for missing riscv_hwprobe syscall (258)
On riscv newer glibc (2.41) will probe instruction support using the
riscv_hwprobe syscall. Since Valgrind currently doesn't have a wrapper
for riscv_hwprobe that causes a Warning. Since the RISC-V Hardware
Probing Interface is non-trivial and we don't really implement
extended riscv instructions anyway work around that by "implementing"
riscv_hwprobe as sys_ni_syscall so it generates an ENOSYS and glibc
will silently fall back to not using any extended instructions.
Mark Wielaard [Thu, 8 May 2025 22:21:25 +0000 (00:21 +0200)]
mount syscall param filesystemtype may be NULL
For the Linux mount syscall, depending on the flags provided, the source,
type and data may be ignored. We already don't check data and allow
source to be NULL. Normally when type is ignored an application will
provide an empty string "". But sometimes NULL is passed (like for
source). So we now also allow type to be NULL to prevent false
positives.
Adjust the linux/scalar.c tests so the type param is still
unaddressable.
But the disasm-test parser assumed there could only be one
address including a symbol name on a given line. It stopped
comparison beyond that point.
The line
Said patch removes the resteering machinery which allowed chasing through
unconditional jumps/calls during IR generation.
There were two fixme's related to this which are now removed.
Also eliminate functions 'call_function_and_chase' and
'always_goto_and_chase' which no longer are meaningful. Use
'call_function' and 'always_goto' instead.
Florian Krohm [Sun, 4 May 2025 21:29:34 +0000 (21:29 +0000)]
s390x: Add disassembly for special insns
Surely we want to see this when tracing the frontend.
Also: new function s390_irgen_inject_ir to wrap the handling of
the special insn for IR injection.
Mark Wielaard [Sun, 4 May 2025 18:16:26 +0000 (20:16 +0200)]
ltp-excludes: Add fork14, futex_cmp_requeue and pidfd_send_signal
There are a few more linux test project syscall tests that seem to
cause some trouble for some buildbots. The fork14 test uses a lot of
memory, as do the futex_cmp_requeue tests (at least on ppc64le). And
the pidfd_send_signal tests, when run inside a container, seem to kill
the test wrapper (and the container it runs in).
Paul Floyd [Tue, 22 Apr 2025 05:22:42 +0000 (07:22 +0200)]
Regtest: clean up warning and compilation of bug290061.c
On some platforms there was a 'defined but not used' warning.
When I fixed that I got a link error from clang. Using a
_LDFLAGS option causes automake to split building the test into
separate compile and link commands and clang was optimizing away
the unused static 'meh' symbol.
Paul Floyd [Mon, 21 Apr 2025 18:44:31 +0000 (20:44 +0200)]
Illumos regtest: add an expected for none/tests/fdleak_socketpair_xml.stderr
illumos socketpair doesn't get the next two fds (3 and 4), instead it
gets 4 and 5. That looks like it's because this is done in two steps in libc.
so_socket gets called twice, returning fds 3 and 4, then so_socketpair takes
those and does some rebinding(?) resulting in fds 4 and 5.
Mark Wielaard [Fri, 18 Apr 2025 10:22:29 +0000 (12:22 +0200)]
Add auxprogs/ltp-excludes.txt
There are a couple of ltp testcases that take a very long time to run
(under valgrind). Add a file auxprogs/ltp-excludes.txt that is used to
exclude them from a make ltpchecks run, containing 10 tests:
Martin Cermak [Thu, 17 Apr 2025 14:14:19 +0000 (16:14 +0200)]
Use LTP for testing valgrind
Add a new top level make target ltpchecks which will fetch the latest
linux test project (ltp) release as defined by the LTP_VERSION and
LTP_SHA256 variables in auxprogs/Makefile.am (update those when a new
version of ltp is released). If the ltp tar.xz has already been
downloaded, or it has already been unpacked and built, the (cached)
file and build will be reused.
The actual testing is done through the auxprogs/ltp-tester.sh script.
It takes all executable tests from the ltp testcases under
kernel/syscalls and runs them 3 times. Once directly, not under
valgrind, once with -q --tool=none and once with -q
--tool=memcheck. It then checks that valgrind didn't produce any
messages with the none tool, that there were no fatal errors produced
(as defined in auxprogs/ltp-error-patterns.txt) and that the ltp
results are the same with and without valgrind.
Currently there are 1472 test binaries and running them all (serially)
takes more than three hours and detects various missing or incomplete
syscall handlers in valgrind, plus various crashers.
Paul Floyd [Thu, 17 Apr 2025 19:26:24 +0000 (21:26 +0200)]
Illumos: increase coverage of --modify-fds syscalls
It looks like Solaris/Illumos is missing some F_DUP* coverage
and we aren't handling syscalls that return 2 fds (pipe, socketpair).
Otherwise this should cover most Illumos cases at least.