Julian Seward [Wed, 23 Feb 2011 13:22:24 +0000 (13:22 +0000)]
Add a new constructor for empty XArrays, VG_(newSizedXA). This is
identical to VG_(newXA) but allows passing in a size hint. In the
case where the likely final size of the XArray is known at creation
time, this allows avoiding the repeated (implicit) resizing and
copying of the array as elements are added, which can save a vast
amount of dynamic memory allocation turnover.
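The effect of a size hint can be sketched with a toy growable array.
This is not Valgrind's XArray code: IntVec, vec_new_sized, vec_add and
the n_resizes counter are hypothetical names, and the doubling growth
policy is an assumption made only for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable int array; NOT Valgrind's XArray. */
typedef struct {
    int   *data;
    size_t used;       /* elements stored */
    size_t capacity;   /* elements allocated */
    size_t n_resizes;  /* how many times the backing store grew */
} IntVec;

/* Create an empty vector, preallocating room for size_hint elements.
   A hint of 0 behaves like a plain "new" with no preallocation. */
static IntVec vec_new_sized(size_t size_hint)
{
    IntVec v = { NULL, 0, 0, 0 };
    if (size_hint > 0) {
        v.data = malloc(size_hint * sizeof(int));
        assert(v.data != NULL);
        v.capacity = size_hint;
    }
    return v;
}

/* Append one element, doubling the backing store when it is full
   (the doubling policy is an assumption of this sketch). */
static void vec_add(IntVec *v, int x)
{
    if (v->used == v->capacity) {
        size_t new_cap = v->capacity ? 2 * v->capacity : 1;
        int *p = realloc(v->data, new_cap * sizeof(int));
        assert(p != NULL);
        v->data = p;
        v->capacity = new_cap;
        v->n_resizes++;
    }
    v->data[v->used++] = x;
}

/* Release the backing store. */
static void vec_free(IntVec *v)
{
    free(v->data);
    v->data = NULL;
    v->used = v->capacity = 0;
}
```

With a hint equal to the final element count, n_resizes stays at zero;
without one, every growth step is a reallocation and possibly a copy.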
Julian Seward [Wed, 23 Feb 2011 13:18:56 +0000 (13:18 +0000)]
Fix a scalability problem observed whilst running Helgrind on a large
workload: when scanning a freelist of a given size for a big-enough
block (to allocate), don't scan all the way around the list. Instead
give up after 100 blocks and try the freelist above. The pathological
case (as observed) is that the freelist contains tens of thousands of
blocks, but all are too small for the current request, hence they are
all visited pointlessly. If the new heuristic is used, the freelist
start point is moved along by one block, so that future searches
eventually inspect the entire freelist, just very slowly.
Also, some improvements to stats gathering, and rename of some
existing stats fields in struct Arena.
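A minimal sketch of the bounded scan described above, assuming a
circular singly linked freelist. Block, find_block and MAX_SCAN are
hypothetical names, not the allocator's actual code; only the limit of
100 blocks and the move-the-start-point-by-one rule come from the
commit message.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SCAN 100   /* give-up limit from the commit message */

/* Hypothetical free block on a circular singly linked freelist. */
typedef struct Block {
    size_t        size;
    struct Block *next;
} Block;

/* Search the list beginning at *start for a block of at least `want`
   bytes, visiting at most MAX_SCAN blocks.  Returns the block, or NULL
   if the caller should try the next-larger freelist.  When the scan
   limit is hit, *start moves along by one block, so repeated searches
   eventually inspect the whole list. */
static Block *find_block(Block **start, size_t want)
{
    assert(start != NULL);
    Block *b = *start;
    if (b == NULL)
        return NULL;
    for (int visited = 0; visited < MAX_SCAN; visited++) {
        if (b->size >= want)
            return b;
        b = b->next;
        if (b == *start)
            return NULL;          /* went all the way round: no fit */
    }
    *start = (*start)->next;      /* heuristic used: rotate start */
    return NULL;
}
```

A failed search on a long list therefore costs at most MAX_SCAN visits
instead of one visit per block on the list.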
Julian Seward [Fri, 11 Feb 2011 16:47:03 +0000 (16:47 +0000)]
Make ld.so:index redir mandatory for glibc-2.12 and later, on x86-linux.
Also, improve the failure message a bit, so as to tell people what package
they need to install, in at least some cases.
Bart Van Assche [Thu, 10 Feb 2011 21:03:47 +0000 (21:03 +0000)]
DRD: don't inline pthread intercepts because, in combination with the current fragile implementation of the CALL_FN_* macros, inlining intercepts can easily trigger stack alignment errors on Darwin.
Julian Seward [Wed, 9 Feb 2011 12:47:23 +0000 (12:47 +0000)]
_pre_mem_asciiz handlers in both tools: don't segfault if passed an
obviously invalid address. Fixes #255009. Investigation & initial
patch by Philippe Waroquiers (philippe.waroquiers@skynet.be)
When unwinding needs to be done because the stack pointer is reset
(e.g. by a longjmp), it makes no sense to interpret the control
flow change as a call; it should be seen as a return.
This indirectly fixes bug 246152. Unwinding potentially changes the
exec state, which is unique for threads, but also for signal handlers.
E.g. this is true for a longjmp out of a signal handler. Exec state
changes modify members of struct CLG_(current_state), such as
CLG_(current_state).bbcc and CLG_(current_state).jmps_passed, which
are backed in CLG_(setup_bbcc)() by last_bbcc and passed, respectively.
On an exec state change, these local vars go out of sync, and lead
to invalid data passed to CLG_(push_call_stack)() for handling a call,
which triggered data corruption, and the symptoms seen in bug 246152.
Since in this situation there is no longer a call, nothing calls
into CLG_(push_call_stack)(), and the corruption (or, since the last
commit, the failed assertion) is no longer triggered.
Better a failed assertion than silent data corruption
This is part 1 of the fix to bug 246152, and makes the bug
reproducible as a failed assertion also on Ubuntu 10.10 on 64bit
machines. However, the test needs to be compiled 32bit (-m32).
Julian Seward [Thu, 27 Jan 2011 23:56:36 +0000 (23:56 +0000)]
Somewhat reduce the amount of mempool sanity checking, so as to avoid
rendering the mempool machinery impossibly slow for pools containing
many blocks. Fixes #255966.
If Massif's --threshold value was less than 1.0, in lines like this:
->00.00% (0B) in 11 places, all below massif's threshold (00.00%)
the threshold would always be incorrectly printed as 00.00%. This was
because the percentage printing was broken for percentages less than 1.0.
This change fixes this problem, and modifies a test to check for it.
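The log does not show the actual fix. As a hedged illustration, one way
to print a percentage in the "NN.NN%" form seen above without losing
values below 1.0 is to work in whole hundredths of a percent.
format_percent is a hypothetical helper, not Massif's code.

```c
#include <assert.h>
#include <stdio.h>

/* Format a percentage as "NN.NN%" (two integer digits, two decimals).
   Converting to whole hundredths of a percent first keeps values below
   1.0 from collapsing to zero.  Hypothetical helper, not Massif's. */
static void format_percent(char *buf, size_t buflen, double pct)
{
    assert(buf != NULL && pct >= 0.0);
    /* Round to the nearest hundredth of a percent. */
    unsigned long hundredths = (unsigned long)(pct * 100.0 + 0.5);
    snprintf(buf, buflen, "%02lu.%02lu%%",
             hundredths / 100, hundredths % 100);
}
```

In this sketch a threshold of 0.5 is rendered as 00.50% instead of
00.00%.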
Julian Seward [Sun, 23 Jan 2011 20:45:53 +0000 (20:45 +0000)]
Improve error reports for addressing errors in the presence of
mempools: try and relate an invalid address to known mempool
allocated blocks, and if that fails, to malloc'd blocks that
back the mempool. See #254420.
Print a stack trace as part of the "unhandled instruction bytes" warning.
Useful if the program in question catches signals, in which case the usual
"Process terminating..." stack trace isn't shown. Requested by Jesse
Ruderman.
Julian Seward [Mon, 10 Jan 2011 15:01:03 +0000 (15:01 +0000)]
Memcheck, None: update avg translation size to be more realistic.
Massif: specify avg translation size at all, so as to avoid excessive
retranslations caused by the fact that the default value is far below
reality for Massif.
When a shmat() size is passed to the tool, round it up to a page size. This
is how mmap() sizes are treated. It fixes an assertion failure in Massif
with --pages-as-heap=yes.
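Rounding a size up to a whole number of pages can be sketched as
follows. round_up_to_page is a hypothetical helper, and the sketch
assumes the page size is a power of two.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next multiple of page_size.
   page_size must be a nonzero power of two (e.g. 4096). */
static size_t round_up_to_page(size_t n, size_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return (n + page_size - 1) & ~(page_size - 1);
}
```

With 4096-byte pages, a 1-byte request becomes 4096 bytes and a
4097-byte request becomes 8192 bytes, while exact multiples are left
unchanged.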
Julian Seward [Mon, 6 Dec 2010 11:40:04 +0000 (11:40 +0000)]
New command line option: --trace-children-skip-by-arg, which allows
chase/nochase decisions for child processes to be made on the basis
of their argv[] entries rather than on the name of their executables.
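A decision based on argv[] entries might look like the sketch below. It
assumes glob-style patterns and uses POSIX fnmatch(); the option's real
pattern syntax is not given in this log, and should_skip_child is a
hypothetical name.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if any entry of the NULL-terminated argv[] matches any
   of the n_patterns glob patterns.  Hypothetical helper; the pattern
   syntax is an assumption of this sketch. */
static bool should_skip_child(const char *const argv[],
                              const char *const patterns[],
                              size_t n_patterns)
{
    assert(argv != NULL);
    for (size_t a = 0; argv[a] != NULL; a++)
        for (size_t p = 0; p < n_patterns; p++)
            if (fnmatch(patterns[p], argv[a], 0) == 0)
                return true;
    return false;
}
```

This makes it possible to tell apart children that share an executable
name (an interpreter, say) but are started with different arguments.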
Julian Seward [Mon, 6 Dec 2010 11:11:29 +0000 (11:11 +0000)]
Minor improvements to PDB reading:
* better progress messages, to make it clear that reading of a
PDB is finished, and how much stuff was read from it
* don't mmap PDB files to read them -- instead use VG_(read).
This is because CIFS filesystem mounting only works reliably on
Linux when mounted with option '-o directio', and that
disallows mmap-ing files.
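Reading a file with read() instead of mmap-ing it can be sketched with
plain POSIX calls. This is not the VG_(read)-based code from the
commit; read_whole_file is a hypothetical helper, and for brevity it
treats any read error (including EINTR) as a failure.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Read the whole file at `path` into a malloc'd buffer using read().
   On success returns the buffer and stores its length in *len_out;
   on failure returns NULL.  The caller frees the buffer. */
static unsigned char *read_whole_file(const char *path, size_t *len_out)
{
    assert(path != NULL && len_out != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size < 0) {
        close(fd);
        return NULL;
    }

    size_t len = (size_t)st.st_size;
    unsigned char *buf = malloc(len ? len : 1);
    if (buf == NULL) {
        close(fd);
        return NULL;
    }

    /* read() may return fewer bytes than requested, so loop. */
    size_t done = 0;
    while (done < len) {
        ssize_t n = read(fd, buf + done, len - done);
        if (n <= 0) {             /* error or unexpected end of file */
            free(buf);
            close(fd);
            return NULL;
        }
        done += (size_t)n;
    }
    close(fd);
    *len_out = len;
    return buf;
}
```

The cost compared with mmap is one up-front copy of the file into
memory; the benefit is that it works on filesystems where mapping is
not allowed.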
Julian Seward [Mon, 6 Dec 2010 10:56:09 +0000 (10:56 +0000)]
Speedups and fixes:
* (speedup) addMemEvent: generate inline code to check whether a
memory access is within 16k of the stack pointer, and if so
don't bother to call the helper
* (speedup) find_Block_containing: cache the most recently seen 2
blocks, and check new references in them first. This gives a
worthwhile speedup.
* (fix) at the end of the run, merge stats from un-freed blocks
back into APs. This fixes misleading stats that cause un-freed
blocks to appear to not have been accessed at all.
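The two-entry cache in front of the block lookup can be sketched as
follows. Block, cache, n_slow and the linear slow_lookup are
hypothetical stand-ins for the tool's own data structures; a real
implementation would also have to drop a cached entry when its block
is freed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical heap block record. */
typedef struct { uintptr_t start; size_t len; } Block;

static const Block *cache[2];   /* cache[0] = most recently hit block */
static unsigned long n_slow;    /* number of slow-path lookups */

static int contains(const Block *b, uintptr_t a)
{
    return b != NULL && a >= b->start && a - b->start < b->len;
}

/* Slow path: a linear scan stands in for the real search structure. */
static const Block *slow_lookup(const Block *blocks, size_t n, uintptr_t a)
{
    n_slow++;
    for (size_t i = 0; i < n; i++)
        if (contains(&blocks[i], a))
            return &blocks[i];
    return NULL;
}

/* Check the two most recently seen blocks before searching. */
static const Block *find_block_containing(const Block *blocks, size_t n,
                                          uintptr_t a)
{
    assert(blocks != NULL || n == 0);
    if (contains(cache[0], a))
        return cache[0];
    if (contains(cache[1], a)) {
        /* Promote the second entry to most-recent. */
        const Block *t = cache[0];
        cache[0] = cache[1];
        cache[1] = t;
        return cache[0];
    }
    const Block *b = slow_lookup(blocks, n, a);
    if (b != NULL) {
        cache[1] = cache[0];
        cache[0] = b;
    }
    return b;
}
```

Runs of accesses that alternate between two blocks, such as a copy
from one buffer to another, are then served without touching the slow
path.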
Tom Hughes [Fri, 12 Nov 2010 10:40:20 +0000 (10:40 +0000)]
Rework the strcasecmp stuff a little, based on Jakub Jelinek's patch
on bug #256600, because the original version turned out to be somewhat
fragile across different glibc versions.
Bart Van Assche [Mon, 25 Oct 2010 18:18:54 +0000 (18:18 +0000)]
DRD: the help text now indicates that lock contention detection is off by default; changed the default value of the exclusive mutex threshold from 1000s to off. See also #255247.
Julian Seward [Tue, 12 Oct 2010 10:14:43 +0000 (10:14 +0000)]
Fix up printing of the can't-autodetect-params message and the
filtering out thereof, so as to make Cachegrind and Callgrind
pass their regression tests on ARM-Linux.
Julian Seward [Tue, 12 Oct 2010 10:13:17 +0000 (10:13 +0000)]
Define VG_CLREQ_SZB correctly on ARM, so Cachegrind and Callgrind
don't assert in their regtests on ARM. (Value is the same in both
ARM and Thumb mode, fortunately.)
Julian Seward [Tue, 12 Oct 2010 10:09:15 +0000 (10:09 +0000)]
Add DHAT as an experimental tool. DHAT (a Dynamic Heap Analysis Tool)
is a heap profiler that is complementary to Massif. DHAT tracks heap
allocations, and records which memory accesses go to which blocks.
It can find the following information:
* total allocation and max liveness
* average block lifetime (# instructions between allocation and
freeing)
* average number of reads and writes to each byte in the block
("access ratios")
* average of longest interval of non-access to a block, also
measured in instructions
* which fields of blocks are used a lot, and which aren't
(hot-field profiling)
Using these stats it is possible to identify allocation points with
the following characteristics:
* potential process-lifetime leaks (blocks allocated by the point just
accumulate, and are freed only at the end of the run)
* excessive turnover: points which chew through a lot of heap, even if
it is not held onto for very long
* excessively transient: points which allocate very short lived blocks
* useless or underused allocations: blocks which are allocated but not
completely filled in, or are filled in but not subsequently read.
* blocks which see extended periods of inactivity. Could these
perhaps be allocated later or freed sooner?
* blocks with inefficient layout (hot fields spread out over
multiple cache lines), or with alignment holes
Julian Seward [Tue, 12 Oct 2010 00:44:05 +0000 (00:44 +0000)]
Make the --prefix-to-strip=... command-line option added in r11312
behave more like the original proposal in #245535. This makes it
more flexible and general. Also rename it.
* new name is --fullpath-after=
* allow multiple instances of --fullpath-after=
* don't require the specified strings to be prefixes, only substrings
But retain the elegant backwards-compatibility trick in Bart's r11312
commit: if --fullpath-after= is not specified at all, then behave
exactly as before.
Fixes #245535. A mixture of patches from Bart Van Assche
(bart.vanassche@gmail.com), Alexander Potapenko (glider@google.com),
and me (integration and documentation).
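The substring rule for --fullpath-after= described above can be
sketched like this. path_after is a hypothetical helper; taking the
first listed string that occurs in the path, and returning the path
unchanged when nothing matches, are assumptions of the sketch and not
taken from the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the part of `path` that follows the first occurrence of the
   first string in subs[] found anywhere in the path (a substring, not
   necessarily a prefix).  If none occurs, return `path` unchanged
   (an assumption of this sketch). */
static const char *path_after(const char *path,
                              const char *const subs[], size_t n_subs)
{
    assert(path != NULL);
    for (size_t i = 0; i < n_subs; i++) {
        const char *hit = strstr(path, subs[i]);
        if (hit != NULL)
            return hit + strlen(subs[i]);
    }
    return path;
}
```

Given the strings "/build/" and "src/", the path
/home/u/build/proj/src/a.c would be shown as proj/src/a.c in this
sketch, and /opt/src/b.c as b.c.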