Tom Hughes [Sun, 2 Oct 2011 10:49:35 +0000 (10:49 +0000)]
Avoid using direct access to read multi-byte values from DWARF files
and use read_Type routines instead as they work rather better on strict
aligned (or semi-strict a la ARM) machines. Fixes #282527.
Change the name of the pipes for vgdb by adding username and hostname.
Those are obtained by looking at some commonly defined environment
variables.
That should help with problems where /tmp is shared or process IDs get
recycled. We had some intermittent nightly build issues because of that.
Partial fix for bugzilla #280757.
Be a bit more careful about the return type for VG_MINIMAL_SETJMP,
both the home-grown versions and the versions that feed through to
__builtin_setjmp.
x86-linux, amd64-linux: Implement VG_MINIMAL_SETJMP and
VG_MINIMAL_LONGJMP directly, rather than using __builtin_setjmp
and __builtin_longjmp, since clang-2.9 miscompiles the latter
(by completely ignoring it.)
Also, add comment about the return type for VG_MINIMAL_SETJMP.
Compile everything with -fno-builtin, so as to disable LLVM's
idiom-recognition optimisation. This identifies memset-style
loops and turns them into calls to memset, which in this case
leads to infinite recursion because it does this transformations
in VG_(memset), which is called from memset(), which we also define.
Remove hardwired /tmp directory in vgdb. Honour VG_TMPDIR
and TMPDIR which was introduced when fixing bugzilla #267020.
Factor out VG_(tmpdir). New function VG_(vgdb_path_prefix).
Partially fixes bugzilla #280757.
ML_(read_elf_debug_info): (no functional change, I hope): fix up
confusing control flow, by separating the logic for "is there a
debuginfo file to be found?" from that of "if a debuginfo file was
found, let's record certain facts (section offsets etc) about it."
This makes it possible to add arbitrary other schemes for finding
debuginfo files without further complicating the existing control flow.
Re-enable the use of loctab (line number table) trimming, for a 5% to
10% reduction in debuginfo storage requirements for large applications
on 32 bit platforms. This code had been present since the MacOSX port
was merged but had been disabled. Remove equivalent code for
shrinking the symbol tables since they are much (4 x) smaller than the
line number tables, trimming them is hardly worth the effort.
run_a_thread_NORETURN: add trashed-register annotations for the magic
bits of assembly which finally cause the thread to exit. How this
ever worked before, on any platform, beats me. The lack was causing
some Android builds to segfault at thread exit. Only the s390 version
was correct.
A previous patch (bug 250101) introduced the concept of reclaimable
superblock: a superblock that cannot be splitted in smaller blocks
and that can be munmapped.
This patch generalises the reclaimable concept : all superblocks are
now reclaimable. To reduce fragmentation, big superblocks are still
kept unsplittable.
The patch has 4 aspects:
1 The previous concept of 'reclaimable superblock' is renamed
'unsplittable superblock' (this is a mechanical change).
2 Ensure that splittable blocks can be reclaimed :
After each free, if the free results in a merged block which
completely covers the superblock, then the superblock can be reclaimed.
3 If a superblock is reclaimed and there exists some translations
for this superblock then discard the translations.
Note : I did not understand the comment speaking about
circular dependency. Just calling VG_(discard_translations) seems
to cause no problem. As m_transtab.c does not allocate client memory,
I believe no circular (dynamic) dependency can be done.
4 Activate 'unsplittable superblock' for all arenas.
Change the backtrace filtering machinery for the helgrind regression
bucket. Instead of removing what we don't want to see in a backtrace
(e.g. path segments through libc and libpthread), we simply keep what
we do want to see. That way .exp files can be generic.
We need to make sure that GCC inlining does not get in the way. So all
the ..._WRK function in hg_intercepts.c are attributed as noinline.
The backtrace filtering is done in the new filter_helgrind script.
filter_stderr is simplified quite a bit.
Fixes bug #281468. See also the comments #5 and #6 there.
Remove code duplication from the dispatchers. Keep the core loop
in common.
To accomplish that without penalizing the non-profiling dispatcher
we do the stats gathering *after* the jitted code returns to the
dispatcher. For that to work properly, we need to stash away the
instruction adddress before entering the jitted code so we can use
it later. (See also VEX r2208).
Two other tweaks are included here:
(1) For the non-profiling dispatcher it is not necessary to update
the LR in each iteration. Quite obviously the jitted code cannot
modify the LR in its iteration because it needs it at the very end
when it returns. So we move this step out of the core loop.
(2) Move loading the address of VG_(tt_fast) past testing for a changed
guest state pointer.
Add initial support for Mac OS X 10.7 (Lion). Tracked by bug #275168.
* configure.in support
* new supp file darwin11.supp
* comment out many intercepts in mc_replace_strmem.c and
vg_replace_malloc.c that are apparently unnecessary for Darwin
* add minimal handling for the following new syscalls and mach traps:
mach_port_set_context
task_get_exception_ports
getaudit_addr
psynch_mutexwait
psynch_mutexdrop
psynch_cvbroad
psynch_cvsignal
psynch_cvwait
psynch_rw_rdlock
psynch_rw_wrlock
psynch_rw_unlock
psynch_cvclrprepost
* wqthread_hijack on amd64-darwin: deal with
tst->os_state.pthread having an apparently different offset,
which caused an assertion failure
* m_debuginfo: for 32 bit processes on Lion, use the DebugInfoFSM
cleanup added in r12041/12042 to handle apparently new dyld
behaviour, which is to map text areas r-- first and only
vm_protect them later to r-x.
The following cleanups remain to be done
* remove apparently pointless, commented out wrapper macro
invokations in mc_replace_strmem.c, eg
//MEMMOVE(VG_Z_DYLD, memmove)
(or determine that they are still necessary, and uncomment)
* ditto in vg_replace_malloc.c, plus general VGO_darwin cleanups
there
* write proper syscall wrappers for
mach_port_set_context
task_get_exception_ports
getaudit_addr
psynch_mutexwait
psynch_mutexdrop
psynch_cvbroad
psynch_cvsignal
psynch_cvwait
psynch_rw_rdlock
psynch_rw_wrlock
psynch_rw_unlock
psynch_cvclrprepost
These are currently just no-ops and may be causing Memcheck to
report false undef-value errors
* figure out why it doesn't work properly unless built with gcc-4.2 on
Lion.
gcc-4.2 is the "normal" gcc (i686-apple-darwin11-gcc-4.2.1). Plain
gcc is the hybrid gcc-front-end clang-back-end thing
(i686-apple-darwin11-llvm-gcc-4.2). Whereas on Snow Leopard, plain
gcc is the normal gcc.
The symptoms of the failure are that wqthread_hijack in
syswrap-amd64-linux.c hits this /*NOTREACHED*/ vg_assert(0); right
at the end (you need a pretty complex threaded app to trigger this),
which makes me think that either ML_(wqthread_continue_NORETURN) or
call_on_new_stack_0_1 do return, which they are not expected to.
* figure out if some of the uninitialised value errors reported in
system libraries on are caused by Memcheck being confused by LLVM
generated code, as per bug #242137
A refactoring change; no functional effect. struct _DebugInfo
contains a bunch of fields which are used as a very simple state
machine that observes mmap calls and decides when to read debuginfo
for the associated file. This change moves these fields into their
own structure, struct _DebugInfoFSM, for cleanness, so as to make it
clear they have a common purpose.
Fix tc23_bogus_condwait.c testcase for s390x.
The testcase used to cause a SIGILL because the address of the bogus
mutex 1 + (char*)&mx[0] denotes a memory location that will eventually
appear in a compare-and-swap instruction. That insn does not allow
memory operands that are not word-aligned. Hence, the SIGILL.
With this fix both incarnations of this testcase (in helgrind and drd)
pass.
Add an .exp for s390x. Certain older kernels had a bug in providing
an invalid siginfo for SIGBUS. Hunted down and fixed by
Christian Borntraeger (borntraeger@de.ibm.com).
This testcase is sensitive to some sleep period. On slower
machines we need to sleep longer. See bugzilla #268623 comment #2.
So let's sleep 500ms instead of 100ms, get rid of the load
barrier and enable the testcase for s390x again.
For s390x we also need to accept a reported size of 1.
This is due to older versions of GCC who use the MVC insn for
assignments and that creates a sequence of 1-byte memory accesses.
ML_(read_elf_debug_info): if we exit from this routine via the BAD
macro, set di->soname back to NULL, so that if we later reenter with
the same 'di', we don't fall over the initial di->soname == NULL
assertion.
Make some vgdb interface to callgrind_control internal
The vgdb "status" monitor command is still available, but
used for pretty printing of status information now (acutally,
just some place holder for real information up to now: just
number of running threads). The internal interface used by
callgrind_control to provide stack traces and event counts
is using "status internal", and is not documented, as the
format is not for human consumption.
This actually was a regression from 3.6.1, but the patch
also improves on printed messages, and refactors common
code between cachegrind and callgrind.
For intercepts in libc and the dynamic linker (ld.so or dyld), split
the Linux and Darwin definitions so they are in completely separate
ifdefs -- iow, remove any definitions that are common to both. This
gives some duplication, but the upside is that it is now possible to
edit the Darwin intercepts without fear of breaking the Linux ones.
This will be important when it comes to supporting OSX 10.7.
Tom Hughes [Tue, 23 Aug 2011 10:11:02 +0000 (10:11 +0000)]
Make a copy of any environment string we are going to modify when
we are cleaning up the environment before an exec, otherwise we
will seg fault if the string is read only. Fixes #270326.
Julian Seward [Sat, 20 Aug 2011 15:55:07 +0000 (15:55 +0000)]
Make sure this gets built with -fomit-frame-pointer, even on x86-linux,
where it otherwise wouldn be. On x86-linux running Memcheck, gives a
6% instruction count reduction and a 10% reduction in memory traffic.
(Duh!)
Julian Seward [Thu, 18 Aug 2011 15:08:20 +0000 (15:08 +0000)]
Add a new simulation hint, --sim-hints=fuse-compatible, which causes
a bunch of file-related syscalls to be handled on the might-block
syscall path rather than the fast syscall path. This fixes deadlocks
when running some FUSE-specific filesystem codes. Fixes #278057.
(Mike Shal, marfey@gmail.com)
Julian Seward [Thu, 18 Aug 2011 13:09:55 +0000 (13:09 +0000)]
Extend the behavioural-equivalence-class mechanism for redirection
functions to include the ability to give a priority to each function,
as well as a tag indicating its behavioural class. Add logic in
m_redir.c to resolve conflicting redirections with the same eclass but
different priorities by preferring the redirection with the higher
priority. Use all of the above in mc_replace_strmem.c, to cause a
conflict between redirections for "memcpy" and "memcpy@GLIBC_2.2.5" to
be resolved in favour of the latter (the non-overlap-checking
version).
This is all related to the massive swamp that is #275284.
Julian Seward [Wed, 17 Aug 2011 21:25:50 +0000 (21:25 +0000)]
Redirect memcpy@@GLIBC_2.14 differently from memcpy@GLIBC_2.2.5, so as
to retain overlap checks for the former whilst skipping them for the
latter. Pertains to #275284. (Tom Hughes, tom@compton.nu)