Julian Seward [Fri, 8 Oct 2010 17:43:26 +0000 (17:43 +0000)]
More unwind fixes for the amd64-linux CALL_FN_*_* macros, as per
bug 243270 comments 47 and 48:
* use __builtin_dwarf_cfa(), not __builtin_frame_address(0), to get the CFA
* use correct register specifier in VALGRIND_CFI_PROLOGUE
Bart Van Assche [Fri, 8 Oct 2010 15:54:57 +0000 (15:54 +0000)]
Only enable CFI annotations when __GCC_HAVE_DWARF2_CFI_ASM is defined. This should work for all platforms, all gcc versions and with and without -fno-dwarf2-cfi-asm / -fno-asynchronous-unwind-tables. Thanks to Jakub Jelinek for the hint.
Julian Seward [Thu, 7 Oct 2010 10:00:04 +0000 (10:00 +0000)]
Fix build breakage on Darwin resulting from r11402 (see #243270),
by disabling creation of .cfi directives on Darwin, until such time
as someone can figure out how to do this.
Julian Seward [Thu, 7 Oct 2010 09:56:19 +0000 (09:56 +0000)]
Only use VKI_O_LARGEFILE on platforms where it exists. This
fixes the build breakage on Darwin introduced in r11397, which
was a fix for #234064. The breakage was subsequently reported
in #253420 and #253452, which this commit fixes.
Change Cachegrind/Callgrind to talk about the LL (last-level) cache instead
of the L2 cache. This is to accommodate machines with three levels of
cache. We still only simulate two levels, the first and the last.
Julian Seward [Wed, 6 Oct 2010 22:45:18 +0000 (22:45 +0000)]
The amd64-linux unwinder rejects stacks smaller than 512 bytes as
bogus, and produces essentially useless traces from them. With
gcc-4.4 and later, some valid thread stacks really are smaller than
this. Hence lower the limit to 256 bytes. Investigated by
Evgeniy Stepanov, eugeni.stepanov@gmail.com.
See bug 243270 comment 21.
Julian Seward [Wed, 6 Oct 2010 22:07:06 +0000 (22:07 +0000)]
amd64-linux: add suitable CFI annotations so that unwinding through
the CALL_FN_*_* macros works more reliably. This is all very fiddly
and is described in a large comment in valgrind.h. Fixes #243270.
(Evgeniy Stepanov, eugeni.stepanov@gmail.com)
Julian Seward [Wed, 6 Oct 2010 15:24:39 +0000 (15:24 +0000)]
Make client sys_shmat work properly on arm-linux by taking into
account rounding requirements to SHMLBA. Modified version of a patch
by Kirill Batuzov, batuzovk@ispras.ru. This fixes the main bug in
#222545. Temporarily breaks the build on all other platforms though.
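The SHMLBA rounding mentioned above can be sketched as follows. This is
a minimal illustration, not the code from the patch: the helper names
are hypothetical, and the alignment passed in is assumed to be a power
of two.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, for illustration only. 'lba' stands in for
   the platform's SHMLBA value and must be a power of two. */
static int is_pow2(uintptr_t x)
{
    return x != 0 && (x & (x - 1)) == 0;
}

/* Largest multiple of 'lba' that is <= addr. */
static uintptr_t round_down_to_lba(uintptr_t addr, uintptr_t lba)
{
    assert(is_pow2(lba));
    return addr & ~(lba - 1);
}

/* Smallest multiple of 'lba' that is >= addr. */
static uintptr_t round_up_to_lba(uintptr_t addr, uintptr_t lba)
{
    assert(is_pow2(lba));
    return (addr + lba - 1) & ~(lba - 1);
}
```

An address chosen for a shared memory mapping would be passed through
one of these so that it lands on a multiple of the alignment.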
Julian Seward [Wed, 6 Oct 2010 12:59:44 +0000 (12:59 +0000)]
get_shm_size(): pass VKI_IPC_64 to our shmctl call if it is available,
except on amd64-linux. This fixes a secondary problem discussed
in bug 222545. (Kirill Batuzov, batuzovk@ispras.ru)
Julian Seward [Wed, 6 Oct 2010 11:38:01 +0000 (11:38 +0000)]
When opening an mmaped file to see if it's an ELF file that we should
read debuginfo from, use VKI_O_LARGEFILE, so as to ensure the open
succeeds for large files on 32-bit systems. Fixes #234064.
Tom Hughes [Mon, 4 Oct 2010 20:55:21 +0000 (20:55 +0000)]
When a memory block changes from unreachable to possibly or definitely
reachable, or from possibly reachable to definitely reachable, rescan
it so that any blocks it points to are also upgraded. Fixes #206600.
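The rescan-on-upgrade idea can be sketched roughly as below. This is a
toy model under stated assumptions, not Memcheck's actual data
structures: the block table, the fixed edge arrays and the mark()
function are all hypothetical, and an interior pointer is assumed to
confer at most "possibly reachable" status.

```c
#include <assert.h>

enum reach { UNREACHED = 0, POSSIBLE = 1, REACHABLE = 2 };

#define MAX_BLOCKS 8
#define MAX_EDGES  4

/* Hypothetical block record: its current state plus the blocks it
   points to. */
struct block {
    enum reach state;
    int n_out;
    int out[MAX_EDGES];          /* index of the pointed-to block */
    int out_interior[MAX_EDGES]; /* nonzero: pointer is interior  */
};

/* Mark block 'idx' as reached with status 'via'. If that is an
   upgrade, rescan the block so that everything it points to is
   upgraded as well. A block's state only ever increases, so the
   recursion terminates. */
static void mark(struct block *b, int idx, enum reach via)
{
    assert(idx >= 0 && idx < MAX_BLOCKS);
    if (via <= b[idx].state)
        return;                  /* not an upgrade, nothing to do */
    b[idx].state = via;
    for (int i = 0; i < b[idx].n_out && i < MAX_EDGES; i++) {
        enum reach next = b[idx].out_interior[i] ? POSSIBLE : via;
        mark(b, b[idx].out[i], next);
    }
}
```

Without the rescan, a block first found via a possible pointer and
later via a definite one would be upgraded, but the blocks it points
to would be left at their old, weaker status.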
Tom Hughes [Mon, 27 Sep 2010 15:08:34 +0000 (15:08 +0000)]
Calling VG_(am_relocate_nooverlap_client) will destroy the descriptor
for the old segment so we need to save the permissions from it before
the call so that we can use them when notifying tools of the new space
afterwards, or we will notify them of the wrong permissions.
On arm-linux, add r7 to the set of registers that the CFI unwinder
knows how to unwind. This is important when unwinding Thumb code,
where the CFA is often stated as being at some offset from r7.
DW_CFA_advance_loc{,1,2,4} fail to multiply the delta by
code_alignment_factor, thereby assuming it is 1. This happens to be
OK on amd64-linux and s390x-linux because it really is 1, but on
arm-linux it is 2, and hence the boundaries between code-unwind areas
are simply wrong after any of DW_CFA_advance_loc{,1,2,4} is
processed. This patch provides the obvious fix.
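The scaling described above can be sketched as follows. This is a
minimal illustration and not the code from the patch: advance_loc() and
its parameter names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns the new code location after a
   DW_CFA_advance_loc{,1,2,4} instruction whose operand is 'delta'.
   The operand is a factored value, so it has to be scaled by the
   CIE's code_alignment_factor before it is added. */
static uint64_t advance_loc(uint64_t loc, uint32_t delta,
                            uint32_t code_alignment_factor)
{
    assert(code_alignment_factor != 0);
    return loc + (uint64_t)delta * code_alignment_factor;
}
```

With a factor of 1 the scaled and unscaled results coincide, which is
why the missing multiplication went unnoticed on targets where the
factor is 1. With a factor of 2, every advance is off by half.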
arm-linux: zero out the least significant bit of the R15 value that
we ptrace into the child, so that it is a legitimate instruction
address in both ARM and Thumb mode.
Implemented a workaround for a PowerPC-specific gcc-4.3.2-7.x86_64 bug. See also
http://sourceforge.net/mailarchive/forum.php?thread_name=201009101114.07127.jseward%40acm.org&forum_name=valgrind-developers
arm-linux: determine whether the host supports Neon by looking at our
AUXV at startup, rather than by trying to execute a Neon instruction
and seeing whether it SIGILLs. Apparently the latter is not a
reliable way to ascertain the presence of usable Neon support. Fixes
#249775.
Don't scan the entire Valgrind stack to check for impending
stack-overflow situations. This causes an immense number of L2 misses
which are completely pointless, and the recent increase of the
Valgrind per-thread stack size from 64k to 1M greatly aggravates the
situation.
Fixed an AMD64 bug reported by Evgeniy Stepanov: the order of
VALGRIND_CALL_NOREDIR_RAX and addq $128,%%rsp was wrong in CALL_FN_W_6W().
See also #243270.
Support the DCBZL instruction. Also, query the host CPU at startup
time to find out how much space DCBZL really clears, and make the
guest CPU act accordingly. (valgrind-side changes).
(Dave Goodell, goodell@mcs.anl.gov)
Make the leak tests a whole lot less flaky on ppc32/64-linux by
zeroing out caller-saved registers before the leak check. We should
really do this on all platforms, not just these.
(Maynard Johnson, maynardj@us.ibm.com)
darwin: support sys_open_extended, sys_removexattr, sys_fremovexattr.
open_extended has the same kludge as chmod_extended/fchmod_extended.
Fixes #246549.
* do indirect branch prediction simulation on calls
via function pointers
* only call into conditional branch prediction simulation
on real guest code branches (eg. not for VEX emulation of some
instructions using branches of jumpkind Ijk_EmWarn)
Improved support for VALGRIND_MALLOCLIKE_BLOCK in memcheck: error
messages printed for client-annotated blocks now include a correct
address description. Closes #237371.