Julian Seward [Mon, 10 Jan 2011 15:01:03 +0000 (15:01 +0000)]
Memcheck, None: update avg translation size to be more realistic.
Massif: specify avg translation size at all, so as to avoid excessive
retranslations caused by the fact that the default value is far below
reality for Massif.
When a shmat() size is passed to the tool, round it up to a page size. This
is how mmap() sizes are treated. It fixes an assertion failure in Massif
with --pages-as-heap=yes.
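The rounding described above can be sketched as follows. This is only an illustration, not the code from the commit: the 4096-byte page size and the name sketch_round_up_to_page are assumptions, and real code would ask the platform for its page size.

```c
#include <stddef.h>

/* Assumed page size, for illustration only. Must be a power of two. */
#define SKETCH_PAGE_SIZE ((size_t)4096)

/* Round len up to the next multiple of SKETCH_PAGE_SIZE.
   A length that is already page-aligned is returned unchanged. */
static size_t sketch_round_up_to_page(size_t len)
{
    return (len + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}
```

With this, a segment of 4097 bytes is reported to the tool as 8192 bytes, the same size an mmap() of that length would occupy.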
Julian Seward [Mon, 6 Dec 2010 11:40:04 +0000 (11:40 +0000)]
New command line option: --trace-children-skip-by-arg, which allows
chase/nochase decisions for child processes to be made on the basis
of their argv[] entries rather than on the name of their executables.
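A rough sketch of an argv-based skip decision is shown below. It is not Valgrind's implementation: the function name is hypothetical, POSIX fnmatch() stands in for whatever pattern matcher the core uses, and examining every argv entry (including argv[0]) is an assumption.

```c
#include <fnmatch.h>
#include <stddef.h>

/* Return 1 if any entry of the NULL-terminated argv matches any of the
   given glob patterns (meaning: do not trace into this child), else 0.
   fnmatch() is a stand-in matcher, assumed for illustration. */
static int sketch_skip_child(char *const argv[],
                             const char *const *patterns,
                             size_t n_patterns)
{
    for (size_t a = 0; argv[a] != NULL; a++)
        for (size_t p = 0; p < n_patterns; p++)
            if (fnmatch(patterns[p], argv[a], 0) == 0)
                return 1;
    return 0;
}
```

This kind of test is useful when many children share one executable name (an interpreter, say) and only their arguments tell them apart.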
Julian Seward [Mon, 6 Dec 2010 11:11:29 +0000 (11:11 +0000)]
Minor improvements to PDB reading:
* better progress messages, to make it clear that reading of a
PDB is finished, and how much stuff was read from it
* don't mmap PDB files to read them -- instead use VG_(read).
This is because CIFS filesystem mounting only works reliably on
Linux when mounted with option '-o directio', and that
disallows mmap-ing files.
Julian Seward [Mon, 6 Dec 2010 10:56:09 +0000 (10:56 +0000)]
Speedups and fixes:
* (speedup) addMemEvent: generate inline code to check whether a
memory access is within 16k of the stack pointer, and if so
don't bother to call the helper
* (speedup) find_Block_containing: cache the most recently seen 2
blocks, and check new references in them first. This gives a
worthwhile speedup.
* (fix) at the end of the run, merge stats from un-freed blocks
back into APs. This fixes misleading stats that cause un-freed
blocks to appear to not have been accessed at all.
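The two speedups above might look roughly like this in plain C. It is a sketch under stated assumptions, not the tool's code: the real check is generated inline in the instrumented code, the window is assumed here to be symmetric around the stack pointer, and all names, plus the linear-scan slow path, are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Speedup 1: accesses within 16k of the stack pointer skip the helper.
   Assumption: the window extends 16k on either side of sp. */
static int sketch_near_stack(uintptr_t addr, uintptr_t sp)
{
    uintptr_t delta = addr >= sp ? addr - sp : sp - addr;
    return delta < 16384;
}

/* Speedup 2: remember the two most recently seen blocks. */
typedef struct { uintptr_t start; size_t len; } SketchBlock;

static const SketchBlock *sketch_cache[2];   /* [0] is the most recent */
static unsigned long sketch_slow_lookups;    /* counts full searches */

static int sketch_contains(const SketchBlock *b, uintptr_t a)
{
    /* Unsigned subtraction wraps if a < start, so one compare suffices. */
    return b != NULL && a - b->start < b->len;
}

/* Stand-in for the real search structure: a linear scan. */
static const SketchBlock *sketch_slow_find(const SketchBlock *blocks,
                                           size_t n, uintptr_t a)
{
    sketch_slow_lookups++;
    for (size_t i = 0; i < n; i++)
        if (sketch_contains(&blocks[i], a))
            return &blocks[i];
    return NULL;
}

static const SketchBlock *sketch_find(const SketchBlock *blocks,
                                      size_t n, uintptr_t a)
{
    if (sketch_contains(sketch_cache[0], a))
        return sketch_cache[0];
    if (sketch_contains(sketch_cache[1], a)) {
        /* Promote the second entry to most recent. */
        const SketchBlock *hit = sketch_cache[1];
        sketch_cache[1] = sketch_cache[0];
        sketch_cache[0] = hit;
        return hit;
    }
    const SketchBlock *found = sketch_slow_find(blocks, n, a);
    if (found != NULL) {
        sketch_cache[1] = sketch_cache[0];
        sketch_cache[0] = found;
    }
    return found;
}
```

Two entries rather than one help when a program alternates between two blocks, for example when copying from one to the other.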
Tom Hughes [Fri, 12 Nov 2010 10:40:20 +0000 (10:40 +0000)]
Rework the strcasecmp stuff a little, based on Jakub Jelinek's patch
on bug #256600, because the original version turned out to be somewhat
fragile across different glibc versions.
Bart Van Assche [Mon, 25 Oct 2010 18:18:54 +0000 (18:18 +0000)]
DRD: the help text now indicates that lock contention detection is off by default; changed the default value of the exclusive mutex threshold from 1000s to off. See also #255247.
Julian Seward [Tue, 12 Oct 2010 10:14:43 +0000 (10:14 +0000)]
Fix up printing of the can't-autodetect-params message and the
filtering out thereof, so as to make Cachegrind and Callgrind
pass their regression tests on ARM-Linux.
Julian Seward [Tue, 12 Oct 2010 10:13:17 +0000 (10:13 +0000)]
Define VG_CLREQ_SZB correctly on ARM, so Cachegrind and Callgrind
don't assert in their regtests on ARM. (Value is the same in both
ARM and Thumb mode, fortunately.)
Julian Seward [Tue, 12 Oct 2010 10:09:15 +0000 (10:09 +0000)]
Add DHAT as an experimental tool. DHAT (a Dynamic Heap Analysis Tool)
is a heap profiler that is complementary to Massif. DHAT tracks heap
allocations, and records which memory accesses go to which blocks.
It can find the following information:
* total allocation and max liveness
* average block lifetime (# instructions between allocation and
freeing)
* average number of reads and writes to each byte in the block
("access ratios")
* average of longest interval of non-access to a block, also
measured in instructions
* which fields of blocks are used a lot, and which aren't
(hot-field profiling)
Using these stats it is possible to identify allocation points with
the following characteristics:
* potential process-lifetime leaks (blocks allocated by the point just
accumulate, and are freed only at the end of the run)
* excessive turnover: points which chew through a lot of heap, even if
it is not held onto for very long
* excessively transient: points which allocate very short lived blocks
* useless or underused allocations: blocks which are allocated but not
completely filled in, or are filled in but not subsequently read.
* blocks which see extended periods of inactivity. Could these
perhaps be allocated later or freed sooner?
* blocks with inefficient layout (hot fields spread out over
multiple cache lines), or with alignment holes
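As an illustration of how such per-allocation-point statistics could be derived from raw counters, here is a minimal sketch. The struct layout and all names are hypothetical and are not taken from DHAT's source; it only assumes that byte and instruction counts are accumulated per allocation point.

```c
/* Hypothetical counters accumulated for one allocation point. */
typedef struct {
    unsigned long long n_blocks;       /* blocks allocated here */
    unsigned long long total_bytes;    /* bytes allocated here */
    unsigned long long total_lifetime; /* summed instructions from alloc to free */
    unsigned long long bytes_read;     /* bytes read from these blocks */
    unsigned long long bytes_written;  /* bytes written to these blocks */
} SketchAllocPoint;

/* Average block lifetime, in instructions. */
static double sketch_avg_lifetime(const SketchAllocPoint *ap)
{
    return ap->n_blocks ? (double)ap->total_lifetime / ap->n_blocks : 0.0;
}

/* Access ratios: average reads (or writes) per allocated byte. */
static double sketch_read_ratio(const SketchAllocPoint *ap)
{
    return ap->total_bytes ? (double)ap->bytes_read / ap->total_bytes : 0.0;
}

static double sketch_write_ratio(const SketchAllocPoint *ap)
{
    return ap->total_bytes ? (double)ap->bytes_written / ap->total_bytes : 0.0;
}

/* Written but never read: a candidate "useless allocation". */
static int sketch_never_read(const SketchAllocPoint *ap)
{
    return ap->bytes_written > 0 && ap->bytes_read == 0;
}
```

A write ratio well below 1 suggests blocks that are not completely filled in, and a read ratio of 0 suggests blocks that are filled in but never used.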
Julian Seward [Tue, 12 Oct 2010 00:44:05 +0000 (00:44 +0000)]
Make the --prefix-to-strip=... command-line option added in r11312
behave more like the original proposal in #245535. This makes it
more flexible and general. Also rename it.
* new name is --fullpath-after=
* allow multiple instances of --fullpath-after=
* don't require the specified strings to be prefixes, only substrings
But retain the elegant backwards-compatibility trick in Bart's r11312
commit: if --fullpath-after= is not specified at all, then behave
exactly as before.
Fixes #245535. A mixture of patches from Bart Van Assche
(bart.vanassche@gmail.com), Alexander Potapenko (glider@google.com),
and me (integration and documentation).
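The substring matching described above can be sketched as follows. This is not the committed code. Three points are assumptions made for illustration: that the strings are tried in the order given and the first match wins, that "as before" means showing only the file name, and that a path matching none of the strings is shown in full. The function name is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Choose the part of path to display, given the strings supplied via
   --fullpath-after= (n_after == 0 means the option was never given). */
static const char *sketch_display_path(const char *path,
                                       const char *const *after,
                                       size_t n_after)
{
    if (n_after == 0) {
        /* Assumed old behaviour: file name only. */
        const char *slash = strrchr(path, '/');
        return slash != NULL ? slash + 1 : path;
    }
    for (size_t i = 0; i < n_after; i++) {
        /* Substring match, not prefix match: search anywhere in path. */
        const char *hit = strstr(path, after[i]);
        if (hit != NULL)
            return hit + strlen(after[i]);
    }
    return path; /* assumed: no match leaves the path untouched */
}
```

Note that in this sketch an empty string matches at the very start of every path, so it selects the full path.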
Julian Seward [Mon, 11 Oct 2010 18:03:13 +0000 (18:03 +0000)]
Fix bogus register constraints for ARM mode LDREX and STREX.
Derived from a patch by Rodrigo Belem <rodrigo.belem@openbossa.org>
Partially fixes #253636.
Julian Seward [Fri, 8 Oct 2010 17:43:26 +0000 (17:43 +0000)]
More unwind fixes for the amd64-linux CALL_FN_*_* macros, as per
bug 243270 comments 47 and 48:
* use __builtin_dwarf_cfa(), not __builtin_frame_address(0), to get the CFA
* use correct register specifier in VALGRIND_CFI_PROLOGUE
Bart Van Assche [Fri, 8 Oct 2010 15:54:57 +0000 (15:54 +0000)]
Only enable CFI annotations when __GCC_HAVE_DWARF2_CFI_ASM is defined. This should work for all platforms, all gcc versions and with and without -fno-dwarf2-cfi-asm / -fno-asynchronous-unwind-tables. Thanks to Jakub Jelinek for the hint.
Julian Seward [Thu, 7 Oct 2010 10:00:04 +0000 (10:00 +0000)]
Fix build breakage on Darwin resulting from r11402 (see #243270),
by disabling creation of .cfi directives on Darwin, until such time
as someone can figure out how to do this.
Julian Seward [Thu, 7 Oct 2010 09:56:19 +0000 (09:56 +0000)]
Only use VKI_O_LARGEFILE on platforms where it exists. This
fixes the build breakage on Darwin introduced in r11397, which
was a fix for #234064. The breakage was subsequently reported
in #253420 and #253452, which this commit fixes.
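The general pattern of using a flag only where the platform defines it looks something like the sketch below. It uses libc's O_LARGEFILE as a stand-in for Valgrind's internal VKI_O_LARGEFILE, and the function name is hypothetical.

```c
#include <fcntl.h>

/* Add the large-file flag to base only on platforms that define it.
   O_LARGEFILE here is a stand-in for VKI_O_LARGEFILE. */
static int sketch_open_flags(int base)
{
#if defined(O_LARGEFILE)
    return base | O_LARGEFILE;
#else
    return base;
#endif
}
```

On a platform without the flag, the function compiles to a plain pass-through, so callers need no platform conditionals of their own.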
Change Cachegrind/Callgrind to talk about the LL (last-level) cache instead
of the L2 cache. This is to accommodate machines with three levels of
cache. We still only simulate two levels, the first and the last.
Julian Seward [Wed, 6 Oct 2010 22:45:18 +0000 (22:45 +0000)]
The amd64-linux unwinder rejects stacks smaller than 512 bytes as
bogus, and produces essentially useless traces from them. With
gcc-4.4 and later, some valid thread stacks really are smaller than
this. Hence change the limit down to 256 bytes. Investigated by
Evgeniy Stepanov, eugeni.stepanov@gmail.com.
See bug 243270 comment 21.
Julian Seward [Wed, 6 Oct 2010 22:07:06 +0000 (22:07 +0000)]
amd64-linux: add suitable CFI annotations so that unwinding through
the CALL_FN_*_* macros works more reliably. This is all very fiddly
and is described in a large comment in valgrind.h. Fixes #243270.
(Evgeniy Stepanov, eugeni.stepanov@gmail.com)