Mark Wielaard [Wed, 12 Dec 2018 13:15:28 +0000 (14:15 +0100)]
Mark helper regs defined in final_tidyup before freeres_wrapper call.
In final_tidyup we setup the guest to call the freeres_wrapper, which
will (possibly) call __gnu_cxx::__freeres() and/or __libc_freeres().
In a couple of cases (ppc64be, ppc64le and mips32) this involves setting
up one or more helper registers. Since we setup these guest registers
we should make sure to mark them as fully defined. Otherwise we might
see spurious warnings about undefined value usage if the guest register
happened to not be fully defined before.
Add a --show-percs option to cg_annotate and callgrind_annotate.
Because it's very useful. As part of this, the "percentage of events
annotated" numbers at the bottom of the output is changed to "events
annotated" so that --show-percs doesn't compute a percentage of a
percentage.
Example output lines:
```
4,967,137,442 (100.0%) PROGRAM TOTALS
Mark Wielaard [Fri, 7 Dec 2018 13:01:20 +0000 (14:01 +0100)]
Fix sigkill.stderr.exp for glibc-2.28.
glibc 2.28 filters out some bad signal numbers and returns
Invalid argument instead of passing such bad signal numbers
the kernel sigaction syscall. So we won't see such bad signal
numbers and won't print "bad signal number" ourselves.
Add a new memcheck/tests/sigkill.stderr.exp-glibc-2.28 to catch
this case.
Mark Wielaard [Thu, 6 Dec 2018 19:52:22 +0000 (20:52 +0100)]
Bug 401822 Fix asm constraints for ppc64 jm-vmx jm-insns.c test.
The mfvscr and vor instructions in jm-insns.c had a "=vr" constraint.
This should have been an "=v" constraint. This resolved assembler
warnings and the testcase failing on ppc64le with gcc 8.2 and
binutils 2.30.
Andreas Arnez [Wed, 5 Dec 2018 16:07:05 +0000 (17:07 +0100)]
Add Emacs configuration files
This adds a configuration file ".dir-locals.el" for Emacs to the topmost
directory of the Valgrind source tree, and another such file to the
directory drd/tests. These files contain per-directory local Emacs
variables.
The following settings are performed:
* The base C style is set to "Linux", indentation is set to 3 columns
per level, the use of tabs for indentation is disabled, and the fill
column is set to 80.
* The source files in drd/tests use 2 instead of 3 columns per indentation
level.
Vadim Barkov [Fri, 5 Oct 2018 10:46:44 +0000 (13:46 +0300)]
Bug 385411 s390x: Tests and internals for z13 vector FP support
Add test cases for the z13 vector FP support. Bring s390-opcodes.csv
up-to-date, reflecting that the z13 vector instructions are now supported.
Also remove the non-support disclaimer for the vector facility from
README.s390.
The patch was contributed by Vadim Barkov, with some clean-up and minor
adjustments by Andreas Arnez.
Always output all leak kinds in a xtree leak result file.
- The option --xtree-leak=yes (to output leak result in xtree format)
automatically activates the option --show-leak-kinds=all,
as xtree visualisation tools such as kcachegrind can in any case
select what kind of leak to visualise.
Andreas Arnez [Thu, 26 Jul 2018 14:35:24 +0000 (16:35 +0200)]
s390x: More fixes for z13 support
This patch addresses the following:
* Fix the implementation of LOCGHI. Previously Valgrind performed 32-bit
sign extension instead of 64-bit sign extension on the immediate value.
* Advertise VXRS in HWCAP. If no VXRS are advertised, but the program
uses vector registers, this could cause problems with a glibc built with
"-march=z13".
Add mkRight{32,64} as right-travelling analogues to mkLeft{32,64}.
doCmpORD: for the cases of a signed comparison against zero, compute
definedness of the 3 result bits (lt,gt,eq) separately, and, for the lt and eq
bits, do it exactly accurately.
expensiveCountTrailingZeroes: no functional change. Re-analyse/verify and add
comments.
expensiveCountLeadingZeroes: add. Very similar to
expensiveCountTrailingZeroes.
Add some comments to mark unary ops which are self-shadowing.
Route Iop_Ctz{,Nat}{32,64} through expensiveCountTrailingZeroes.
Route Iop_Clz{,Nat}{32,64} through expensiveCountLeadingZeroes.
Add instrumentation for Iop_PopCount{32,64} and Iop_Reverse8sIn32_x1.
memcheck/tests/vbit-test/irops.c
Add dummy new entries for all new IROps, just enough to make it compile and
run.
Dont emit cnttz{w,d}. We may need them on a target which doesn't support
them. Instead we can generate a fairly reasonable alternative sequence with
cntlz{w,d} instead.
Add support for emitting popcnt{w,d}.
VEX/priv/host_ppc_isel.c
Add support for: Iop_ClzNat32 Iop_ClzNat64
Redo support for: Iop_Ctz{32,64} and their Nat equivalents, so as to not use
cnttz{w,d}, as mentioned above.
Add support for: Iop_PopCount64 Iop_PopCount32 Iop_Reverse8sIn32_x1
Julian Seward [Tue, 20 Nov 2018 09:52:33 +0000 (10:52 +0100)]
Add some new IROps to support improved Memcheck analysis of strlen etc.
This is part of the fix for bug 386945. It adds the following IROps, plus
their supporting type- and printing- fragments:
Iop_Reverse8sIn32_x1: 32-bit byteswap. A fancy name, but it is consistent
with naming for the other swapping IROps that already exist.
Iop_PopCount64, Iop_PopCount32: population count
Iop_ClzNat64, Iop_ClzNat32, Iop_CtzNat64, Iop_CtzNat32: counting leading and
trailing zeroes, with "natural" (Nat) semantics for a zero input, meaning, in
the case of zero input, return the number of bits in the word. These
functionally overlap with the existing Iop_Clz64, Iop_Clz32, Iop_Ctz64,
Iop_Ctz32. The existing operations are undefined in case of a zero input.
Adding these new variants avoids the complexity of having to change the
declared semantics of the existing operations. Instead they are deprecated
but still available for use.
Andreas Arnez [Tue, 30 Oct 2018 16:06:38 +0000 (17:06 +0100)]
Bug 400491 s390x: Sign-extend immediate operand of LOCHI and friends
The VEX implementation of each of the z/Architecture instructions LOCHI,
LOCHHI, and LOCGHI treats the immediate 16-bit operand as an unsigned
integer instead of a signed integer. This is fixed.
Andreas Arnez [Thu, 25 Oct 2018 11:47:12 +0000 (13:47 +0200)]
Bug 400490 s390x: Fix register allocation for VRs vs FPRs
On s390x, if vector registers are available, they are fed to the register
allocator as if they were separate from the floating-point registers. But
in fact the FPRs are embedded in the VRs. So for instance, if both f3 and
v3 are allocated and used at the same time, corruption will result.
This is fixed by offering only the non-overlapping VRs, v16 to v31, to the
register allocator instead.
Fix dependencies between libcoregrind*.a and *m_main.o/*m_libcsetjmp.o
The primary and secondary coregrind libraries must be updated
when m_main.c or m_libcsetjmp.c are changed.
A dependency was missing between libcoregrind*.a and libnolto_coregrind*.a,
and so tools were not relinked when m_main.c or m_libcsetjmp.c were
changed.
Fix 399301 - Use inlined frames in Massif XTree output.
Author: Nicholas Nethercote <nnethercote@mozilla.com>
Use inlined frames in Massif XTree output.
This makes Massif's output much easier to follow.
The commit also removes a -1 used on all Massif stack frame addresses.
There was a big comment questioning the presence of that -1, and with it
gone the addresses now match those produced by DHAT.
* Create the 3.15 section in the NEWS file
(the idea is that this section is maintained during the development,
i.e. document user visible changes and/or the fixed bugs, as part of
the commit).
* start the fixed bug list with 399322 Improve callgrind_annotate output
Andreas Arnez [Tue, 9 Oct 2018 09:22:27 +0000 (11:22 +0200)]
Bug 399444 s390x: Drop unnecessary check in s390_irgen_VSLDB
In s390_irgen_VSLDB there was special handling for the case that the
immediate operand i4 has the value 16, which would mean that the result v1
were a full copy of the third operand v3. However, this is impossible
because i4 can only assume values from 0 to 15; thus the special handling
can be removed.
Julian Seward [Wed, 3 Oct 2018 13:29:42 +0000 (15:29 +0200)]
sigframe construction for x86-linux: ensure that ESP is correctly aligned before entering the handler. n-i-bz.
Without this, a signal handler compiled by Clang 6, which uses movdqa to load/store
relative to ESP, segfaults because the resulting address isn't 16-aligned.
Memcheck on amd64; fix false positive associated with spec cases {Z,NZ} after {LOGICB,LOGICW}. n-i-bz.
For the spec cases {Z,NZ} after {LOGICB,LOGICW}, which are simply comparisons
of the result against zero, use Cmp{EQ,NE}32 rather than their 64-bit
counterparts. This is because Memcheck on amd64 instruments the 32 bit
versions exactly, at the default --expensive-definedness-checks=auto setting.
The alternative would have been to make Memcheck also do exact instrumentation
of the 64 bit versions, but that would also burden all other 64 bit eq/ne
comparisons with that cost for no purpose. So this is a cheaper solution.
Mark Wielaard [Thu, 27 Sep 2018 03:09:42 +0000 (05:09 +0200)]
Fix s390x_dirtyhelper_vec_op signature for non-s390x case.
The definition of s390x_dirtyhelper_vec_op in guest_s390_helpers.c
didn't match the one from guest_s390_defs.h for the non-s390x case.
Causing a compiler warning/error.
Fix 398028 Assertion `cfsi_fits` failing in simple C program
At least with libopenblas, we can have several rx mappings
with some holes between mappings.
Change the invariant (2) checking so that such holes are ok,
as long as no cfsi refers to such an hole.
Andreas Arnez [Wed, 25 Jul 2018 12:23:02 +0000 (14:23 +0200)]
s390x: Implement conditional trap instructions
This implements various z/Architecture instructions that conditionally
yield a data exception ("trap"). The condition is either based on a
comparison being true ("compare and trap") or on a loaded value being
zero ("load and trap"). These instructions haven't been widely used in
the past, but may now be emitted by newer compilers. Note that the
resulting signal for a data exception is SIGFPE, not SIGTRAP. Thus this
patch also adds a new jump kind Ijk_SigFPE.
Mark Wielaard [Wed, 19 Sep 2018 19:23:06 +0000 (21:23 +0200)]
Fix arm64-linux/scalar clone test argument check order.
When the clone syscall was refactored to work across all linux arches
the arguments were checked in a different order. Fix the arm64-linux
scalar.stderr.exp to match the same order for the (invalid) clone
arguments.
This makes memcheck/tests/arm64-linux/scalar.vgtest pass again.
Mark Wielaard [Tue, 18 Sep 2018 20:55:45 +0000 (22:55 +0200)]
Run power_ISA2_0[57] tests with -q
memcheck/tests/ppc64/power_ISA2_0[57] could spuriously fail when
some internal glibc function would allocate and free some memory.
To get the expected output run the tests with -q and clear stderr.exp.
Bug 395991 - wine's unit tests enter a signal delivery loop under valgrind on armv7l when SIGSEGV is used.
On signal handler return, restore r0 .. r15 inclusive from the sigcontext that we
gave to the handler, so that any changes the handler has made to those values
will take effect on return.
Adds test cases, that check both the 10h and 11h decodings. For some reason
the expected output diff is huge. I don't know why. It is the same as what
the hardware produces, though.
This is similar to bug #387712 (about cgijnl), but a newer gcc uses cgijl
now. So use a similar fix when cc_dep2 is zero, only check whether the
most significant bit of cc_dep1 is set to 1.
Mark Wielaard [Mon, 3 Sep 2018 09:54:38 +0000 (11:54 +0200)]
Bug 397354 utimensat should ignore tv_sec if tv_nsec is UTIME_NOW/OMIT.
When code uses utimensat with UTIME_NOW or UTIME_OMIT valgrind memcheck
would generate a warning. But as the utimensat manpage says:
If the tv_nsec field of one of the timespec structures has the special
value UTIME_NOW, then the corresponding file timestamp is set to the
current time. If the tv_nsec field of one of the timespec structures
has the special value UTIME_OMIT, then the corresponding file timestamp
is left unchanged. In both of these cases, the value of the corre‐
sponding tv_sec field is ignored.
So ignore the timespec tv_sec when tv_nsec is set to UTIME_NOW or
UTIME_OMIT.
Some applications are mapping an object ro, and then unmaps it directly.
In such a case, we have a di that contains obsolete fsm.maps (not matching
OS mappings). The di for this unmapped object is not active,
and has no dinfo (have_dinfo == False).
(more generally, fsm.maps can contain a whole bunch of obsolete mappings).
Later on, some other libs can be mapped with a mapping overlapping
this obsolete mapping.
A di that never had its debug info loaded can really be discarded,
even if CG_(clo_keep_debuginfo).
In such a case, it is normal to have to discard a not active di.
(it might be better to keep fsm.maps in sync with the real OS
mapping, but that is a much bigger change/fix).
The FSM debug tracing was static, it is now dynamic according
to debug loglevel >= 3.
The below is an extract of the trace showing what happens.
SYSCALL[4384,1](257) sys_openat ( 4294967196, 0x4244398(/usr/lib/x86_64-linux-gnu/qt5/plugins/platforms/libqeglfs.so), 524288 ) --> [async] ...
SYSCALL[4384,1](257) ... [async] --> Success(0x3)
SYSCALL[4384,1](72) sys_fcntl[ARG3=='arg'] ( 3, 2, 1 )[sync] --> Success(0x0)
SYSCALL[4384,1](5) sys_newfstat ( 3, 0x1ffefff8b0 )[sync] --> Success(0x0)
SYSCALL[4384,1](5) sys_newfstat ( 3, 0x1ffefff9c0 )[sync] --> Success(0x0)
SYSCALL[4384,1](9) sys_mmap ( 0x0, 10520, 1, 1, 3, 0 )--4384-- di_notify_mmap-0:
--4384-- di_notify_mmap-1: 0x4027000-0x4029fff r--
--4384-- di_notify_mmap-2: /usr/lib/x86_64-linux-gnu/qt5/plugins/platforms/libqeglfs.so
--4384-- di_notify_mmap-3: is_rx_map 0, is_rw_map 0, is_ro_map 1
--4384-- di_notify_mmap-4: noting details in DebugInfo* at 0x10024CEA10
--4384-- di_notify_mmap-6: no dinfo loaded /usr/lib/x86_64-linux-gnu/qt5/plugins/platforms/libqeglfs.so (no rx or no rw mapping)
--> [pre-success] Success(0x4027000)
SYSCALL[4384,1](3) sys_close ( 3 )[sync] --> Success(0x0)
SYSCALL[4384,1](11) sys_munmap ( 0x4027000, 10520 )[sync] --> Success(0x0)
^^^^ the above munmap has not cleaned up or removed anything in DebugInfo* at 0x10024CEA10
Later on, /usr/lib/x86_64-linux-gnu/qt5/plugins/platforms/libqxcb.so is mapped
overlapping the memory where libqeglfs.so was mapped ro.
Now, this cleans up the (useless) di that never had have_dinfo true, e.g.
------ start ELF OBJECT -------------------------------------------------------
------ name = /usr/lib/x86_64-linux-gnu/qt5/plugins/platforms/libqxcb.so
...
--4384-- Discarding syms at 0x0-0x0 in /usr/lib/x86_64-linux-gnu/qt5/plugins/platforms/libqeglfs.so (have_dinfo 0)
(the 0x0-0x0 in the trace is because there was never any text mapping for libqeglfs.so).
Julian Seward [Sun, 19 Aug 2018 07:18:20 +0000 (09:18 +0200)]
arm64 front end: add early-writeback handling for w (32 bit int) and q (128 bit) stores.
Clang 6.0.1 creates instructions of the form str wX, [sp,#-N] and str qX,
[sp,#-N], and the lack of early writeback causes many false positive errors
from Memcheck, plus it potentially breaks the segfault-based main-thread stack
extension mechanism. This commit adds support for them, using the same scheme
with which existing cases are handled.
Julian Seward [Fri, 17 Aug 2018 07:31:37 +0000 (09:31 +0200)]
Fix 388174 - valgrind with Wine quits with "Assertion 'cfsi_fits' failed"
In check_CFSI_related_invariants, this commit improves the check for invariant
(2), which, as noted in an existing comment, "might need to be improved".
Instead of assuming that the CFSI range fits entirely into one "rx" mapping,
check that it is covered by the union of all the "rx" mappings we have. This
is the correct check. The previous check was observed to have failed as below
for at least some Clang generated objects (possibly in conjunction with lld as
the linker.)
Julian Seward [Fri, 17 Aug 2018 07:09:21 +0000 (09:09 +0200)]
Add changes to ensure that a DebugInfo that has been archived cannot be archived again.
* discard_or_archive_marked_DebugInfos: clear the mark bit for a Debuginfo
that will be archived
* discard_DebugInfos_which_overlap_with: when selecting DebugInfos to be
discarded or archived, fix a mistake in which some mark bits wouldn't be
changed at all, meaning their "old" value was used to influence the current
operation.
These may (or may not) fix #393146; at the very least, they are somehow
related.
Add file descriptor tracking in wrappers for bpf system call
Support for the bpf system call was added in a previous commit, but
did not include tracking for file descriptors handled by the call.
Add checks and tracking for file descriptors. Check in PRE() wrapper
that all file descriptors (pointing to object such as eBPF programs or
maps, cgroups, or raw tracepoints) used by the system call are valid,
then add tracking in POST() wrapper for newly produced file descriptors.
As the file descriptors are not always processed in the same way by the
bpf call, add to the header file some additional definitions from bpf.h
that are necessary to sort out under what conditions descriptors should
be checked in the PRE() helper.
Fixes: 388786 - Support bpf syscall in amd64 Linux
Add support for bpf() Linux-specific system call on amd64 platform. The
bpf() syscall is used to handle eBPF objects (programs and maps), and
can be used for a number of operations. It takes three arguments:
- "cmd" is an integer encoding a subcommand to run. Available subcommand
include loading a new program, creating a map or updating its entries,
retrieving information about an eBPF object, and may others.
- "attr" is a pointer to an object of type union bpf_attr. This object
converts to a struct related to selected subcommand, and embeds the
various parameters used with this subcommand. Some of those parameters
are read by the kernel (example for an eBPF map lookup: the key of the
entry to lookup), others are written into (the value retrieved from
the map lookup).
- "attr_size" is the size of the object pointed by "attr".
Since the action performed by the kernel, and the way "attr" attributes
are processed depends on the subcommand in use, the PRE() and POST()
wrappers need to make the distinction as well. For each subcommand, mark
the attributes that are read or written.
For some map operations, the only way to infer the size of the memory
areas used for read or write operations seems to involve reading
from /proc/<pid>/fdinfo/<fd> in order to retrieve the size of keys
and values for this map.
The definitions of union bpf_attr and of other eBPF-related elements
required for adequately performing the checks were added to the Linux
header file.
Processing related to file descriptors is added in a follow-up patch.
Move pre_check for ASCII string out of PRE(sys_prctl)
The sys_prctl wrapper with PR_SET_NAME option reads an ASCII string passed
as its second argument. This string is supposed to be shorter than a given
limit. As the actual length of the string is unknown, the PRE() wrapper
performs a number of checks on it, including, in worst case, trying to
dereference it byte by byte.
To avoid re-implementing all this logic for other wrappers that could
need it, get the string processing out of the wrapper and move it to a
static function. Note that passing tid as an argument to the function is
required for macros PRE_MEM_RASCIIZ and PRE_MEM_READ to work properly.
Julian Seward [Tue, 14 Aug 2018 08:13:46 +0000 (10:13 +0200)]
VG_(di_notify_mmap): once we've read debuginfo for an object, ignore all further mappings. n-i-bz.
Once we've read debuginfo for an object, ignore all further mappings. If we
don't do that, applications that mmap in their own objects to inspect them for
whatever reason, will cause "irrelevant" mappings to be recorded in the
object's fsm.maps table. This can lead to serious problems later on.
This has become necessary because 64aa729bfae71561505a40c12755bd6b55bb3061 of
Thu Jul 12 2018 (the fix for bug 395682) started recording readonly segments
in the fsm.maps table, where before they were ignored.