--- /dev/null
+<?xml version="1.0"?> <!-- -*- sgml -*- -->
+<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
+ "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd"
+[ <!ENTITY % vg-entities SYSTEM "vg-entities.xml"> %vg-entities; ]>
+
+
+<chapter id="manual-core-adv" xreflabel="Valgrind's core: advanced topics">
+<title>Using and understanding the Valgrind core: Advanced Topics</title>
+
+<para>This chapter describes advanced aspects of the Valgrind core
+services, which are mostly of interest to power users who wish to
+customise and modify Valgrind's default behaviours in certain useful
+ways. The subjects covered are:</para>
+
+<itemizedlist>
+ <listitem><para>The "Client Request" mechanism</para></listitem>
+ <listitem><para>Function Wrapping</para></listitem>
+</itemizedlist>
+
+
+
+<sect1 id="manual-core-adv.clientreq"
+ xreflabel="The Client Request mechanism">
+<title>The Client Request mechanism</title>
+
+<para>Valgrind has a trapdoor mechanism via which the client
+program can pass all manner of requests and queries to Valgrind
+and the current tool. Internally, this is used extensively to
+make malloc, free, etc, work, although you don't see that.</para>
+
+<para>For your convenience, a subset of these so-called client
+requests is provided to allow you to tell Valgrind facts about
+the behaviour of your program, and also to make queries.
+In particular, your program can tell Valgrind about changes in
+memory range permissions that Valgrind would not otherwise know
+about, which allows clients to get Valgrind to do arbitrary
+custom checks.</para>
+
+<para>Clients need to include a header file to make this work.
+Which header file depends on which client requests you use. Some
+client requests are handled by the core, and are defined in the
+header file <filename>valgrind/valgrind.h</filename>. Tool-specific
+header files are named after the tool, e.g.
+<filename>valgrind/memcheck.h</filename>. All header files can be found
+in the <literal>include/valgrind</literal> directory of wherever Valgrind
+was installed.</para>
+
+<para>The macros in these header files have the magical property
+that they generate code in-line which Valgrind can spot.
+However, the code does nothing when not run on Valgrind, so you
+are not forced to run your program under Valgrind just because you
+use the macros in these files. Also, you are not required to link your
+program with any extra supporting libraries.</para>
+
+<para>The code added to your binary has negligible performance impact:
+on x86, amd64, ppc32 and ppc64, the overhead is 6 simple integer instructions
+and is probably undetectable except in tight loops.
+However, if you really wish to compile out the client requests, you can
+compile with <computeroutput>-DNVALGRIND</computeroutput> (analogous to
+<computeroutput>-DNDEBUG</computeroutput>'s effect on
+<computeroutput>assert()</computeroutput>).
+</para>
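+
+<para>As a minimal sketch of how a client uses these macros (the program
+itself is hypothetical, and the include path assumes the headers are
+installed under <filename>valgrind/</filename>), the following reports
+whether it is running under Valgrind, using
+<computeroutput>RUNNING_ON_VALGRIND</computeroutput> and
+<computeroutput>VALGRIND_PRINTF</computeroutput>, both of which are
+described in the list below:</para>
+
+<programlisting><![CDATA[
+#include <stdio.h>
+#include <valgrind/valgrind.h>
+
+int main ( void )
+{
+   if (RUNNING_ON_VALGRIND) {
+      /* Goes to Valgrind's log, not to the program's stdout. */
+      VALGRIND_PRINTF("hello from the client\n");
+   } else {
+      printf("not running on Valgrind\n");
+   }
+   return 0;
+}
+]]></programlisting>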
+
+<para>You are encouraged to copy the <filename>valgrind/*.h</filename> headers
+into your project's include directory, so your program doesn't have a
+compile-time dependency on Valgrind being installed. The Valgrind headers,
+unlike most of the rest of the code, are under a BSD-style license so you may
+include them without worrying about license incompatibility.</para>
+
+<para>Here is a brief description of the macros available in
+<filename>valgrind.h</filename>, which work with more than one
+tool (see the tool-specific documentation for explanations of the
+tool-specific macros).</para>
+
+ <variablelist>
+
+ <varlistentry>
+ <term><command><computeroutput>RUNNING_ON_VALGRIND</computeroutput>:</command></term>
+ <listitem>
+ <para>Returns 1 if running on Valgrind, 0 if running on the
+ real CPU. If you are running Valgrind on itself, returns the
+ number of layers of Valgrind emulation you're running on.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><command><computeroutput>VALGRIND_DISCARD_TRANSLATIONS</computeroutput>:</command></term>
+ <listitem>
+ <para>Discards translations of code in the specified address
+ range. Useful if you are debugging a JIT compiler or some other
+ dynamic code generation system. After this call, attempts to
+ execute code in the invalidated address range will cause
+ Valgrind to make new translations of that code, which is
+ probably the semantics you want. Note that code invalidations
+ are expensive because finding all the relevant translations
+ quickly is very difficult. So try not to call it often.
+ Note that you can be clever about
+ this: you only need to call it when an area which previously
+ contained code is overwritten with new code. You can choose
+ to write code into fresh memory, and just call this
+ occasionally to discard large chunks of old code all at
+ once.</para>
+ <para>
+ Alternatively, for transparent self-modifying-code support,
+ use <computeroutput>--smc-check=all</computeroutput>, or run
+ on ppc32/Linux or ppc64/Linux.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><command><computeroutput>VALGRIND_COUNT_ERRORS</computeroutput>:</command></term>
+ <listitem>
+ <para>Returns the number of errors found so far by Valgrind. Can be
+ useful in test harness code when combined with the
+ <option>--log-fd=-1</option> option; this runs Valgrind silently,
+ but the client program can detect when errors occur. Only useful
+ for tools that report errors, e.g. it's useful for Memcheck, but for
+ Cachegrind it will always return zero because Cachegrind doesn't
+ report errors.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><command><computeroutput>VALGRIND_MALLOCLIKE_BLOCK</computeroutput>:</command></term>
+ <listitem>
+ <para>If your program manages its own memory instead of using
+ the standard <computeroutput>malloc()</computeroutput> /
+ <computeroutput>new</computeroutput> /
+ <computeroutput>new[]</computeroutput>, tools that track
+ information about heap blocks will not do nearly as good a
+ job. For example, Memcheck won't detect nearly as many
+ errors, and the error messages won't be as informative. To
+ improve this situation, use this macro just after your custom
+ allocator allocates some new memory. See the comments in
+ <filename>valgrind.h</filename> for information on how to use
+ it.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><command><computeroutput>VALGRIND_FREELIKE_BLOCK</computeroutput>:</command></term>
+ <listitem>
+ <para>This should be used in conjunction with
+ <computeroutput>VALGRIND_MALLOCLIKE_BLOCK</computeroutput>.
+ Again, see the comments in <filename>valgrind.h</filename> for
+ information on how to use it.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><command><computeroutput>VALGRIND_CREATE_MEMPOOL</computeroutput>:</command></term>
+ <listitem>
+ <para>This is similar to
+ <computeroutput>VALGRIND_MALLOCLIKE_BLOCK</computeroutput>,
+ but is tailored towards code that uses memory pools. See the
+ comments in <filename>valgrind.h</filename> for information
+ on how to use it.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><command><computeroutput>VALGRIND_DESTROY_MEMPOOL</computeroutput>:</command></term>
+ <listitem>
+ <para>This should be used in conjunction with
+ <computeroutput>VALGRIND_CREATE_MEMPOOL</computeroutput>.
+ Again, see the comments in <filename>valgrind.h</filename> for
+ information on how to use it.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><command><computeroutput>VALGRIND_MEMPOOL_ALLOC</computeroutput>:</command></term>
+ <listitem>
+ <para>This should be used in conjunction with
+ <computeroutput>VALGRIND_CREATE_MEMPOOL</computeroutput>.
+ Again, see the comments in <filename>valgrind.h</filename> for
+ information on how to use it.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><command><computeroutput>VALGRIND_MEMPOOL_FREE</computeroutput>:</command></term>
+ <listitem>
+ <para>This should be used in conjunction with
+ <computeroutput>VALGRIND_CREATE_MEMPOOL</computeroutput>.
+ Again, see the comments in <filename>valgrind.h</filename> for
+ information on how to use it.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><command><computeroutput>VALGRIND_NON_SIMD_CALL[0123]</computeroutput>:</command></term>
+ <listitem>
+ <para>Executes a function of 0, 1, 2 or 3 args in the client
+ program on the <emphasis>real</emphasis> CPU, not the virtual
+ CPU that Valgrind normally runs code on. These are used in
+ various ways internally to Valgrind. They might be useful to
+ client programs.</para>
+
+ <para><command>Warning:</command> Only use these if you
+ <emphasis>really</emphasis> know what you are doing.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><command><computeroutput>VALGRIND_PRINTF(format, ...)</computeroutput>:</command></term>
+ <listitem>
+ <para>Prints a printf-style message to the log file when running
+ under Valgrind. Nothing is output if not running under Valgrind.
+ Returns the number of characters output.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><command><computeroutput>VALGRIND_PRINTF_BACKTRACE(format, ...)</computeroutput>:</command></term>
+ <listitem>
+ <para>Prints a printf-style message to the log file, along with a
+ stack backtrace, when running under Valgrind. Nothing is output if
+ not running under Valgrind. Returns the number of characters
+ output.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><command><computeroutput>VALGRIND_STACK_REGISTER(start, end)</computeroutput>:</command></term>
+ <listitem>
+ <para>Registers a new stack. Informs Valgrind that the memory range
+ between start and end is a unique stack. Returns a stack identifier
+ that can be used with other
+ <computeroutput>VALGRIND_STACK_*</computeroutput> calls.</para>
+ <para>Valgrind will use this information to determine if a change to
+ the stack pointer is an item pushed onto the stack or a change over
+ to a new stack. Use this if you're using a user-level thread package
+ and are noticing spurious errors from Valgrind about uninitialized
+ memory reads.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><command><computeroutput>VALGRIND_STACK_DEREGISTER(id)</computeroutput>:</command></term>
+ <listitem>
+ <para>Deregisters a previously registered stack. Informs
+ Valgrind that the previously registered memory range with stack id
+ <computeroutput>id</computeroutput> is no longer a stack.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><command><computeroutput>VALGRIND_STACK_CHANGE(id, start, end)</computeroutput>:</command></term>
+ <listitem>
+ <para>Changes a previously registered stack. Informs
+ Valgrind that the previously registered stack with stack id
+ <computeroutput>id</computeroutput> has changed its start and end
+ values. Use this if your user-level thread package implements
+ stack growth.</para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
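+
+<para>To illustrate the allocator-related macros, here is a sketch of a
+deliberately simple bump allocator annotated with
+<computeroutput>VALGRIND_MALLOCLIKE_BLOCK</computeroutput> and
+<computeroutput>VALGRIND_FREELIKE_BLOCK</computeroutput>. The allocator
+and the names <computeroutput>my_alloc</computeroutput> and
+<computeroutput>my_free</computeroutput> are hypothetical; alignment and
+reuse of freed space are ignored for brevity. The comments in
+<filename>valgrind.h</filename> remain the authoritative description of
+the arguments.</para>
+
+<programlisting><![CDATA[
+#include <stddef.h>
+#include <valgrind/valgrind.h>
+
+static char   pool[4096];
+static size_t used = 0;
+
+/* Hand out the next n bytes of the pool, or NULL if it is exhausted. */
+void* my_alloc ( size_t n )
+{
+   void* p;
+   if (n > sizeof(pool) - used)
+      return NULL;
+   p = pool + used;
+   used += n;
+   /* Arguments: address, size in bytes, redzone size, is_zeroed. */
+   VALGRIND_MALLOCLIKE_BLOCK(p, n, 0, 0);
+   return p;
+}
+
+/* Mark the block as freed; the space itself is never reused here. */
+void my_free ( void* p )
+{
+   /* Arguments: address, redzone size (same value as at allocation). */
+   VALGRIND_FREELIKE_BLOCK(p, 0);
+}
+]]></programlisting>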
+
+<para>Note that <filename>valgrind.h</filename> is included by
+all the tool-specific header files (such as
+<filename>memcheck.h</filename>), so you don't need to include it
+in your client if you include a tool-specific header.</para>
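+
+<para>The stack-related requests might be used along the following lines
+by a user-level thread package that allocates its own stacks. This is
+only a sketch: the <computeroutput>coro_stack</computeroutput> type, the
+function names and the stack size are assumptions made for the
+example.</para>
+
+<programlisting><![CDATA[
+#include <stdlib.h>
+#include <valgrind/valgrind.h>
+
+#define STACK_SIZE (64 * 1024)
+
+typedef struct {
+   char*    base;    /* lowest address of the stack memory */
+   unsigned vg_id;   /* identifier returned by Valgrind    */
+} coro_stack;
+
+/* Allocate a stack and tell Valgrind about it.  Returns 0 on success. */
+int coro_stack_init ( coro_stack* s )
+{
+   s->base = malloc(STACK_SIZE);
+   if (s->base == NULL)
+      return -1;
+   s->vg_id = VALGRIND_STACK_REGISTER(s->base, s->base + STACK_SIZE);
+   return 0;
+}
+
+/* Tell Valgrind the range is no longer a stack, then release it. */
+void coro_stack_destroy ( coro_stack* s )
+{
+   VALGRIND_STACK_DEREGISTER(s->vg_id);
+   free(s->base);
+}
+]]></programlisting>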
+
+</sect1>
+
+
+
+
+
+<sect1 id="manual-core-adv.wrapping" xreflabel="Function Wrapping">
+<title>Function wrapping</title>
+
+<para>
+Valgrind versions 3.2.0 and above can do function wrapping on all
+supported targets. In function wrapping, calls to some specified
+function are intercepted and rerouted to a different, user-supplied
+function. The wrapper can do whatever it likes, typically examining the
+arguments, calling onwards to the original, and possibly examining the
+result. Any number of functions may be wrapped.</para>
+
+<para>
+Function wrapping is useful for instrumenting an API in some way. For
+example, wrapping functions in the POSIX pthreads API makes it
+possible to notify Valgrind of thread status changes, and wrapping
+functions in the MPI (message-passing) API allows notifying Valgrind
+of memory status changes associated with message arrival/departure.
+Such information is usually passed to Valgrind by using client
+requests in the wrapper functions, although that is not of relevance
+here.</para>
+
+<sect2 id="manual-core-adv.wrapping.example" xreflabel="A Simple Example">
+<title>A Simple Example</title>
+
+<para>Suppose we want to wrap some function:</para>
+
+<programlisting><![CDATA[
+int foo ( int x, int y ) { return x + y; }]]></programlisting>
+
+<para>A wrapper is a function of identical type, but with a special name
+which identifies it as the wrapper for <computeroutput>foo</computeroutput>.
+Wrappers need to include
+supporting macros from <computeroutput>valgrind.h</computeroutput>.
+Here is a simple wrapper which prints the arguments and return value:</para>
+
+<programlisting><![CDATA[
+#include <stdio.h>
+#include "valgrind.h"
+int I_WRAP_SONAME_FNNAME_ZU(NONE,foo)( int x, int y )
+{
+ int result;
+ OrigFn fn;
+ VALGRIND_GET_ORIG_FN(fn);
+ printf("foo's wrapper: args %d %d\n", x, y);
+ CALL_FN_W_WW(result, fn, x,y);
+ printf("foo's wrapper: result %d\n", result);
+ return result;
+}
+]]></programlisting>
+
+<para>To become active, the wrapper merely needs to be present in a text
+section somewhere in the same process' address space as the function
+it wraps, and for its ELF symbol name to be visible to Valgrind. In
+practice, this means either compiling to a
+<computeroutput>.o</computeroutput> and linking it in, or
+compiling to a <computeroutput>.so</computeroutput> and
+<computeroutput>LD_PRELOAD</computeroutput>ing it in. The latter is more
+convenient in that it doesn't require relinking.</para>
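+
+<para>For instance, assuming the wrapper above is saved in a file called
+<filename>foowrap.c</filename> and the program under test is
+<filename>./prog</filename> (both names, and the include path, are
+hypothetical), the second route might look like this:</para>
+
+<programlisting><![CDATA[
+# build the wrapper as a shared object
+gcc -g -fPIC -shared -I/path/to/valgrind/include/valgrind \
+    -o libfoowrap.so foowrap.c
+
+# preload it into the client when running under Valgrind
+LD_PRELOAD=./libfoowrap.so valgrind ./prog
+]]></programlisting>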
+
+<para>All wrappers have approximately the above form. There are three
+crucial macros:</para>
+
+<para><computeroutput>I_WRAP_SONAME_FNNAME_ZU</computeroutput>:
+this generates the real name of the wrapper.
+This is an encoded name which Valgrind notices when reading symbol
+table information. What it says is: I am the wrapper for any function
+named <computeroutput>foo</computeroutput> which is found in
+an ELF shared object with an empty
+("<computeroutput>NONE</computeroutput>") soname field. The specification
+mechanism is powerful in
+that wildcards are allowed for both sonames and function names.
+The details are discussed below.</para>
+
+<para><computeroutput>VALGRIND_GET_ORIG_FN</computeroutput>:
+once in the wrapper, the first priority is
+to get hold of the address of the original (and any other supporting
+information needed). This is stored in a value of opaque
+type <computeroutput>OrigFn</computeroutput>.
+The information is acquired using
+<computeroutput>VALGRIND_GET_ORIG_FN</computeroutput>. It is crucial
+to make this macro call before calling any other wrapped function
+in the same thread.</para>
+
+<para><computeroutput>CALL_FN_W_WW</computeroutput>: eventually we will
+want to call the function being
+wrapped. Calling it directly does not work, since that just gets us
+back to the wrapper and tends to kill the program in short order by
+stack overflow. Instead, the result lvalue,
+<computeroutput>OrigFn</computeroutput> and arguments are
+handed to one of a family of macros of the form
+<computeroutput>CALL_FN_*</computeroutput>. These
+cause Valgrind to call the original and avoid recursion back to the
+wrapper.</para>
+</sect2>
+
+<sect2 id="manual-core-adv.wrapping.specs" xreflabel="Wrapping Specifications">
+<title>Wrapping Specifications</title>
+
+<para>This scheme has the advantage of being self-contained. A library of
+wrappers can be compiled to object code in the normal way, and does
+not rely on an external script telling Valgrind which wrappers pertain
+to which originals.</para>
+
+<para>Each wrapper has a name which, in the most general case, says: I am the
+wrapper for any function whose name matches FNPATT and whose ELF
+"soname" matches SOPATT. Both FNPATT and SOPATT may contain wildcards
+(asterisks) and other characters (spaces, dots, @, etc) which are not
+valid in C identifiers.</para>
+
+<para>This flexibility is needed to write robust wrappers for POSIX pthread
+functions, where typically we are not completely sure of either the
+function name or the soname, or alternatively we want to wrap a whole
+set of functions at once.</para>
+
+<para>For example, <computeroutput>pthread_create</computeroutput>
+in GNU libpthread is usually a
+versioned symbol - one whose name ends in, e.g.,
+<computeroutput>@GLIBC_2.3</computeroutput>. Hence we
+are not sure what its real name is. We also want to cover any soname
+of the form <computeroutput>libpthread.so*</computeroutput>.
+So the header of the wrapper will be</para>
+
+<programlisting><![CDATA[
+int I_WRAP_SONAME_FNNAME_ZZ(libpthreadZdsoZd0,pthreadZucreateZAZa)
+ ( ... formals ... )
+ { ... body ... }
+]]></programlisting>
+
+<para>In order to write unusual characters as valid C function names, a
+Z-encoding scheme is used. Names are written literally, except that
+a capital Z acts as an escape character, with the following encoding:</para>
+
+<programlisting><![CDATA[
+ Za encodes *
+ Zp +
+ Zc :
+ Zd .
+ Zu _
+ Zh -
+ Zs (space)
+ ZA @
+ ZZ Z
+ ZL ( # only in valgrind 3.3.0 and later
+ ZR ) # only in valgrind 3.3.0 and later
+]]></programlisting>
+
+<para>Hence <computeroutput>libpthreadZdsoZd0</computeroutput> is an
+encoding of the soname <computeroutput>libpthread.so.0</computeroutput>
+and <computeroutput>pthreadZucreateZAZa</computeroutput> is an encoding
+of the function name <computeroutput>pthread_create@*</computeroutput>.
+</para>
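+
+<para>A few further encodings, worked by hand from the table above (the
+names are chosen purely for illustration):</para>
+
+<programlisting><![CDATA[
+ libpthreadZdsoZa        encodes   libpthread.so*
+ libcZdsoZd6             encodes   libc.so.6
+ pthreadZumutexZulock    encodes   pthread_mutex_lock
+ fooZZbar                encodes   fooZbar
+ Za                      encodes   *
+]]></programlisting>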
+
+<para>The macro <computeroutput>I_WRAP_SONAME_FNNAME_ZZ</computeroutput>
+constructs a wrapper name in which
+both the soname (first component) and function name (second component)
+are Z-encoded. Encoding the function name can be tiresome and is
+often unnecessary, so a second macro,
+<computeroutput>I_WRAP_SONAME_FNNAME_ZU</computeroutput>, can be
+used instead. The <computeroutput>_ZU</computeroutput> variant is
+also useful for writing wrappers for
+C++ functions, in which the function name is usually already mangled
+using some other convention in which Z plays an important role. Having
+to encode a second time quickly becomes confusing.</para>
+
+<para>Since the function name field may contain wildcards, it can be
+anything, including just <computeroutput>*</computeroutput>.
+The same is true for the soname.
+However, some ELF objects - specifically, main executables - do not
+have sonames. Any object lacking a soname is treated as if its soname
+was <computeroutput>NONE</computeroutput>, which is why the original
+example above had a name
+<computeroutput>I_WRAP_SONAME_FNNAME_ZU(NONE,foo)</computeroutput>.</para>
+
+<para>Note that the soname of an ELF object is not the same as its
+file name, although it is often similar. You can find the soname of
+an object <computeroutput>libfoo.so</computeroutput> using the command
+<computeroutput>readelf -a libfoo.so | grep soname</computeroutput>.</para>
+</sect2>
+
+<sect2 id="manual-core-adv.wrapping.semantics" xreflabel="Wrapping Semantics">
+<title>Wrapping Semantics</title>
+
+<para>The ability for a wrapper to replace an infinite family of functions
+is powerful but brings complications in situations where ELF objects
+appear and disappear (are dlopen'd and dlclose'd) on the fly.
+Valgrind tries to maintain sensible behaviour in such situations.</para>
+
+<para>For example, suppose a process has dlopened (an ELF object with
+soname) <computeroutput>object1.so</computeroutput>, which contains
+<computeroutput>function1</computeroutput>. It starts to use
+<computeroutput>function1</computeroutput> immediately.</para>
+
+<para>After a while it dlopens <computeroutput>wrappers.so</computeroutput>,
+which contains a wrapper
+for <computeroutput>function1</computeroutput> in (soname)
+<computeroutput>object1.so</computeroutput>. All subsequent calls to
+<computeroutput>function1</computeroutput> are rerouted to the wrapper.</para>
+
+<para>If <computeroutput>wrappers.so</computeroutput> is
+later dlclose'd, calls to <computeroutput>function1</computeroutput> are
+naturally routed back to the original.</para>
+
+<para>Alternatively, if <computeroutput>object1.so</computeroutput>
+is dlclose'd but <computeroutput>wrappers.so</computeroutput> remains,
+then the wrapper exported by <computeroutput>wrappers.so</computeroutput>
+becomes inactive, since there
+is no way to get to it - there is no original to call any more. However,
+Valgrind remembers that the wrapper is still present. If
+<computeroutput>object1.so</computeroutput> is
+eventually dlopen'd again, the wrapper will become active again.</para>
+
+<para>In short, Valgrind inspects all code loading/unloading events to
+ensure that the set of currently active wrappers remains consistent.</para>
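+
+<para>The first scenario above can be sketched as a client program. The
+file names are assumed to match the sonames used in the text, and
+<computeroutput>function1</computeroutput> is assumed, for the sake of
+the example, to take no arguments and return nothing. The comments state
+the behaviour expected under Valgrind according to the description
+above.</para>
+
+<programlisting><![CDATA[
+#include <dlfcn.h>
+#include <stdio.h>
+
+int main ( void )
+{
+   void* obj;
+   void* wr;
+   void (*function1)(void);
+
+   obj = dlopen("./object1.so", RTLD_NOW);
+   if (obj == NULL) { fprintf(stderr, "%s\n", dlerror()); return 1; }
+
+   function1 = (void (*)(void)) dlsym(obj, "function1");
+   if (function1 == NULL) { fprintf(stderr, "%s\n", dlerror()); return 1; }
+
+   function1();   /* no wrapper loaded yet: runs the original */
+
+   wr = dlopen("./wrappers.so", RTLD_NOW);
+   if (wr == NULL) { fprintf(stderr, "%s\n", dlerror()); return 1; }
+
+   function1();   /* rerouted to the wrapper in wrappers.so */
+
+   dlclose(wr);
+   function1();   /* routed back to the original */
+
+   dlclose(obj);
+   return 0;
+}
+]]></programlisting>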
+
+<para>A second possible problem is that of conflicting wrappers. It is
+easily possible to load two or more wrappers, both of which claim
+to be wrappers for some third function. In such cases Valgrind will
+complain about conflicting wrappers when the second one appears, and
+will honour only the first one.</para>
+</sect2>
+
+<sect2 id="manual-core-adv.wrapping.debugging" xreflabel="Debugging">
+<title>Debugging</title>
+
+<para>Figuring out what's going on given the dynamic nature of wrapping
+can be difficult. The
+<computeroutput>--trace-redir=yes</computeroutput> flag makes
+this possible
+by showing the complete state of the redirection subsystem after
+every
+<computeroutput>mmap</computeroutput>/<computeroutput>munmap</computeroutput>
+event affecting code (text).</para>
+
+<para>There are two central concepts:</para>
+
+<itemizedlist>
+
+ <listitem><para>A "redirection specification" is a binding of
+ a (soname pattern, fnname pattern) pair to a code address.
+ These bindings are created by writing functions with names
+ made with the
+ <computeroutput>I_WRAP_SONAME_FNNAME_{ZZ,ZU}</computeroutput>
+ macros.</para></listitem>
+
+ <listitem><para>An "active redirection" is a code-address to
+ code-address binding currently in effect.</para></listitem>
+
+</itemizedlist>
+
+<para>The state of the wrapping-and-redirection subsystem comprises a set of
+specifications and a set of active bindings. The specifications are
+acquired/discarded by watching all
+<computeroutput>mmap</computeroutput>/<computeroutput>munmap</computeroutput>
+events on code (text)
+sections. The active binding set is (conceptually) recomputed from
+the specifications, and all known symbol names, following any change
+to the specification set.</para>
+
+<para><computeroutput>--trace-redir=yes</computeroutput> shows the contents
+of both sets following any such event.</para>
+
+<para><computeroutput>-v</computeroutput> prints a line of text each
+time an active redirection is used for the first time.</para>
+
+<para>Hence for maximum debugging effectiveness you will need to use both
+flags.</para>
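+
+<para>Combining the two flags on the command line, with
+<filename>./prog</filename> standing in for whatever program is being
+debugged:</para>
+
+<programlisting><![CDATA[
+valgrind --trace-redir=yes -v ./prog
+]]></programlisting>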
+
+<para>One final comment. The function-wrapping facility is closely
+tied to Valgrind's ability to replace (redirect) specified
+functions, for example to redirect calls to
+<computeroutput>malloc</computeroutput> to its
+own implementation. Indeed, a replacement function can be
+regarded as a wrapper function which does not call the original.
+However, to make the implementation more robust, the two kinds
+of interception (wrapping vs replacement) are treated differently.
+</para>
+
+<para><computeroutput>--trace-redir=yes</computeroutput> shows
+specifications and bindings for both
+replacement and wrapper functions. To differentiate the
+two, replacement bindings are printed using
+<computeroutput>R-></computeroutput> whereas
+wraps are printed using <computeroutput>W-></computeroutput>.
+</para>
+</sect2>
+
+
+<sect2 id="manual-core-adv.wrapping.limitations-cf"
+ xreflabel="Limitations - control flow">
+<title>Limitations - control flow</title>
+
+<para>For the most part, the function wrapping implementation is robust.
+The only important caveat is: in a wrapper, get hold of
+the <computeroutput>OrigFn</computeroutput> information using
+<computeroutput>VALGRIND_GET_ORIG_FN</computeroutput> before calling any
+other wrapped function. Once you have the
+<computeroutput>OrigFn</computeroutput>, arbitrary
+calls between, recursion between, and longjumps out of wrappers
+should work correctly. There is never any interaction between wrapped
+functions and merely replaced functions
+(e.g. <computeroutput>malloc</computeroutput>), so you can call
+<computeroutput>malloc</computeroutput> etc safely from within wrappers.
+</para>
+
+<para>The above comments are true for {x86,amd64,ppc32}-linux. On
+ppc64-linux function wrapping is more fragile due to the (arguably
+poorly designed) ppc64-linux ABI. This mandates the use of a shadow
+stack which tracks entries/exits of both wrapper and replacement
+functions. This gives two limitations: firstly, longjumping out of
+wrappers will rapidly lead to disaster, since the shadow stack will
+not get correctly cleared. Secondly, since the shadow stack has
+finite size, recursion between wrapper/replacement functions is only
+possible to a limited depth, beyond which Valgrind has to abort the
+run. This depth is currently 16 calls.</para>
+
+<para>For all platforms ({x86,amd64,ppc32,ppc64}-linux) all the above
+comments apply on a per-thread basis. In other words, wrapping is
+thread-safe: each thread must individually observe the above
+restrictions, but there is no need for any kind of inter-thread
+cooperation.</para>
+</sect2>
+
+
+<sect2 id="manual-core-adv.wrapping.limitations-sigs"
+ xreflabel="Limitations - original function signatures">
+<title>Limitations - original function signatures</title>
+
+<para>As shown in the above example, to call the original you must use a
+macro of the form <computeroutput>CALL_FN_*</computeroutput>.
+For technical reasons it is impossible
+to create a single macro to deal with all argument types and numbers,
+so a family of macros covering the most common cases is supplied. In
+what follows, 'W' denotes a machine-word-typed value (a pointer or a
+C <computeroutput>long</computeroutput>),
+and 'v' denotes C's <computeroutput>void</computeroutput> type.
+The currently available macros are:</para>
+
+<programlisting><![CDATA[
+CALL_FN_v_v -- call an original of type void fn ( void )
+CALL_FN_W_v -- call an original of type long fn ( void )
+
+CALL_FN_v_W -- void fn ( long )
+CALL_FN_W_W -- long fn ( long )
+
+CALL_FN_v_WW -- void fn ( long, long )
+CALL_FN_W_WW -- long fn ( long, long )
+
+CALL_FN_v_WWW -- void fn ( long, long, long )
+CALL_FN_W_WWW -- long fn ( long, long, long )
+
+CALL_FN_W_WWWW -- long fn ( long, long, long, long )
+CALL_FN_W_5W -- long fn ( long, long, long, long, long )
+CALL_FN_W_6W -- long fn ( long, long, long, long, long, long )
+and so on, up to
+CALL_FN_W_12W
+]]></programlisting>
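+
+<para>As an illustration of picking a macro from this family, here is a
+sketch of a wrapper for a hypothetical function
+<computeroutput>void bar ( long n )</computeroutput> in the main
+executable. Since the original returns nothing and takes one word-sized
+argument, <computeroutput>CALL_FN_v_W</computeroutput> is the matching
+macro; note that the <computeroutput>v</computeroutput> variants take no
+result lvalue.</para>
+
+<programlisting><![CDATA[
+#include <stdio.h>
+#include "valgrind.h"
+
+void I_WRAP_SONAME_FNNAME_ZU(NONE,bar)( long n )
+{
+   OrigFn fn;
+   /* Fetch the original's details before calling anything wrapped. */
+   VALGRIND_GET_ORIG_FN(fn);
+   printf("bar's wrapper: arg %ld\n", n);
+   CALL_FN_v_W(fn, n);
+   printf("bar's wrapper: original returned\n");
+}
+]]></programlisting>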
+
+<para>The set of supported types can be expanded as needed. It is
+regrettable that this limitation exists. Function wrapping has proven
+difficult to implement, with a certain apparently unavoidable level of
+ickyness. After several implementation attempts, the present
+arrangement appears to be the least-worst tradeoff. At least it works
+reliably in the presence of dynamic linking and dynamic code
+loading/unloading.</para>
+
+<para>You should not attempt to wrap a function of one type signature with a
+wrapper of a different type signature. Such trickery will surely lead
+to crashes or strange behaviour. This is not of course a limitation
+of the function wrapping implementation, merely a reflection of the
+fact that it gives you sweeping powers to shoot yourself in the foot
+if you are not careful. Imagine the instant havoc you could wreak by
+writing a wrapper which matched any function name in any soname - in
+effect, one which claimed to be a wrapper for all functions in the
+process.</para>
+</sect2>
+
+<sect2 id="manual-core-adv.wrapping.examples" xreflabel="Examples">
+<title>Examples</title>
+
+<para>In the source tree,
+<computeroutput>memcheck/tests/wrap[1-8].c</computeroutput> provide a series of
+examples, ranging from very simple to quite advanced.</para>
+
+<para><computeroutput>auxprogs/libmpiwrap.c</computeroutput> is an example
+of wrapping a big, complex API (the MPI-2 interface). This file defines
+almost 300 different wrappers.</para>
+</sect2>
+
+</sect1>
+
+
+
+
+</chapter>
<chapter id="manual-core" xreflabel="Valgrind's core">
<title>Using and understanding the Valgrind core</title>
-<para>This section describes the Valgrind core services, flags and
+<para>This chapter describes the Valgrind core services, flags and
behaviours. That means it is relevant regardless of what particular
-tool you are using. A point of terminology: most references to
-"Valgrind" in the rest of this section refer to the Valgrind
-core services.</para>
+tool you are using. The information should be sufficient for you to
+make effective day-to-day use of Valgrind. Advanced topics related to
+the Valgrind core are described in <xref linkend="manual-core-adv"/>.
+</para>
+
+<para>
+A point of terminology: most references to "Valgrind" in this chapter
+refer to the Valgrind core services. </para>
+
+
<sect1 id="manual-core.whatdoes"
xreflabel="What Valgrind does with your program">
-<sect1 id="manual-core.clientreq"
- xreflabel="The Client Request mechanism">
-<title>The Client Request mechanism</title>
-
-<para>Valgrind has a trapdoor mechanism via which the client
-program can pass all manner of requests and queries to Valgrind
-and the current tool. Internally, this is used extensively to
-make malloc, free, etc, work, although you don't see that.</para>
-
-<para>For your convenience, a subset of these so-called client
-requests is provided to allow you to tell Valgrind facts about
-the behaviour of your program, and also to make queries.
-In particular, your program can tell Valgrind about changes in
-memory range permissions that Valgrind would not otherwise know
-about, and so allows clients to get Valgrind to do arbitrary
-custom checks.</para>
-
-<para>Clients need to include a header file to make this work.
-Which header file depends on which client requests you use. Some
-client requests are handled by the core, and are defined in the
-header file <filename>valgrind/valgrind.h</filename>. Tool-specific
-header files are named after the tool, e.g.
-<filename>valgrind/memcheck.h</filename>. All header files can be found
-in the <literal>include/valgrind</literal> directory of wherever Valgrind
-was installed.</para>
-
-<para>The macros in these header files have the magical property
-that they generate code in-line which Valgrind can spot.
-However, the code does nothing when not run on Valgrind, so you
-are not forced to run your program under Valgrind just because you
-use the macros in this file. Also, you are not required to link your
-program with any extra supporting libraries.</para>
-
-<para>The code added to your binary has negligible performance impact:
-on x86, amd64, ppc32 and ppc64, the overhead is 6 simple integer instructions
-and is probably undetectable except in tight loops.
-However, if you really wish to compile out the client requests, you can
-compile with <computeroutput>-DNVALGRIND</computeroutput> (analogous to
-<computeroutput>-DNDEBUG</computeroutput>'s effect on
-<computeroutput>assert()</computeroutput>).
-</para>
-
-<para>You are encouraged to copy the <filename>valgrind/*.h</filename> headers
-into your project's include directory, so your program doesn't have a
-compile-time dependency on Valgrind being installed. The Valgrind headers,
-unlike most of the rest of the code, are under a BSD-style license so you may
-include them without worrying about license incompatibility.</para>
-
-<para>Here is a brief description of the macros available in
-<filename>valgrind.h</filename>, which work with more than one
-tool (see the tool-specific documentation for explanations of the
-tool-specific macros).</para>
-
- <variablelist>
-
- <varlistentry>
- <term><command><computeroutput>RUNNING_ON_VALGRIND</computeroutput>:</command></term>
- <listitem>
- <para>Returns 1 if running on Valgrind, 0 if running on the
- real CPU. If you are running Valgrind on itself, returns the
- number of layers of Valgrind emulation you're running on.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term><command><computeroutput>VALGRIND_DISCARD_TRANSLATIONS</computeroutput>:</command></term>
- <listitem>
- <para>Discards translations of code in the specified address
- range. Useful if you are debugging a JIT compiler or some other
- dynamic code generation system. After this call, attempts to
- execute code in the invalidated address range will cause
- Valgrind to make new translations of that code, which is
- probably the semantics you want. Note that code invalidations
- are expensive because finding all the relevant translations
- quickly is very difficult. So try not to call it often.
- Note that you can be clever about
- this: you only need to call it when an area which previously
- contained code is overwritten with new code. You can choose
- to write code into fresh memory, and just call this
- occasionally to discard large chunks of old code all at
- once.</para>
- <para>
- Alternatively, for transparent self-modifying-code support,
- use <computeroutput>--smc-check=all</computeroutput>, or run
- on ppc32/Linux or ppc64/Linux.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term><command><computeroutput>VALGRIND_COUNT_ERRORS</computeroutput>:</command></term>
- <listitem>
- <para>Returns the number of errors found so far by Valgrind. Can be
- useful in test harness code when combined with the
- <option>--log-fd=-1</option> option; this runs Valgrind silently,
- but the client program can detect when errors occur. Only useful
- for tools that report errors, e.g. it's useful for Memcheck, but for
- Cachegrind it will always return zero because Cachegrind doesn't
- report errors.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term><command><computeroutput>VALGRIND_MALLOCLIKE_BLOCK</computeroutput>:</command></term>
- <listitem>
- <para>If your program manages its own memory instead of using
- the standard <computeroutput>malloc()</computeroutput> /
- <computeroutput>new</computeroutput> /
- <computeroutput>new[]</computeroutput>, tools that track
- information about heap blocks will not do nearly as good a
- job. For example, Memcheck won't detect nearly as many
- errors, and the error messages won't be as informative. To
- improve this situation, use this macro just after your custom
- allocator allocates some new memory. See the comments in
- <filename>valgrind.h</filename> for information on how to use
- it.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term><command><computeroutput>VALGRIND_FREELIKE_BLOCK</computeroutput>:</command></term>
- <listitem>
- <para>This should be used in conjunction with
- <computeroutput>VALGRIND_MALLOCLIKE_BLOCK</computeroutput>.
- Again, see the comments in <filename>valgrind.h</filename> for
- information on how to use it.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term><command><computeroutput>VALGRIND_CREATE_MEMPOOL</computeroutput>:</command></term>
- <listitem>
- <para>This is similar to
- <computeroutput>VALGRIND_MALLOCLIKE_BLOCK</computeroutput>,
- but is tailored towards code that uses memory pools. See the
- comments in <filename>valgrind.h</filename> for information
- on how to use it.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term><command><computeroutput>VALGRIND_DESTROY_MEMPOOL</computeroutput>:</command></term>
- <listitem>
- <para>This should be used in conjunction with
- <computeroutput>VALGRIND_CREATE_MEMPOOL</computeroutput>.
- Again, see the comments in <filename>valgrind.h</filename> for
- information on how to use it.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term><command><computeroutput>VALGRIND_MEMPOOL_ALLOC</computeroutput>:</command></term>
- <listitem>
- <para>This should be used in conjunction with
- <computeroutput>VALGRIND_CREATE_MEMPOOL</computeroutput>.
- Again, see the comments in <filename>valgrind.h</filename> for
- information on how to use it.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term><command><computeroutput>VALGRIND_MEMPOOL_FREE</computeroutput>:</command></term>
- <listitem>
- <para>This should be used in conjunction with
- <computeroutput>VALGRIND_CREATE_MEMPOOL</computeroutput>.
- Again, see the comments in <filename>valgrind.h</filename> for
- information on how to use it.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term><command><computeroutput>VALGRIND_NON_SIMD_CALL[0123]</computeroutput>:</command></term>
- <listitem>
- <para>Executes a function of 0, 1, 2 or 3 args in the client
- program on the <emphasis>real</emphasis> CPU, not the virtual
- CPU that Valgrind normally runs code on. These are used in
- various ways internally to Valgrind. They might be useful to
- client programs.</para>
-
- <para><command>Warning:</command> Only use these if you
- <emphasis>really</emphasis> know what you are doing.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term><command><computeroutput>VALGRIND_PRINTF(format, ...)</computeroutput>:</command></term>
- <listitem>
- <para>Prints a printf-style message to the log file when running under
- Valgrind. Nothing is output if not running under Valgrind.
- Returns the number of characters output.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term><command><computeroutput>VALGRIND_PRINTF_BACKTRACE(format, ...)</computeroutput>:</command></term>
- <listitem>
- <para>Prints a printf-style message to the log file along with a stack
- backtrace when running under Valgrind. Nothing is output if
- not running under Valgrind. Returns the number of characters
- output.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term><command><computeroutput>VALGRIND_STACK_REGISTER(start, end)</computeroutput>:</command></term>
- <listitem>
- <para>Registers a new stack. Informs Valgrind that the memory range
- between start and end is a unique stack. Returns a stack identifier
- that can be used with other
- <computeroutput>VALGRIND_STACK_*</computeroutput> calls.</para>
- <para>Valgrind will use this information to determine if a change to
- the stack pointer is an item pushed onto the stack or a change over
- to a new stack. Use this if you're using a user-level thread package
- and are noticing spurious errors from Valgrind about uninitialized
- memory reads.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term><command><computeroutput>VALGRIND_STACK_DEREGISTER(id)</computeroutput>:</command></term>
- <listitem>
- <para>Deregisters a previously registered stack. Informs
- Valgrind that the previously registered memory range with stack id
- <computeroutput>id</computeroutput> is no longer a stack.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term><command><computeroutput>VALGRIND_STACK_CHANGE(id, start, end)</computeroutput>:</command></term>
- <listitem>
- <para>Changes a previously registered stack. Informs
- Valgrind that the previously registered stack with stack id
- <computeroutput>id</computeroutput> has changed its start and end
- values. Use this if your user-level thread package implements
- stack growth.</para>
- </listitem>
- </varlistentry>
-
- </variablelist>
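-
-<para>As a rough illustration of
-<computeroutput>VALGRIND_MALLOCLIKE_BLOCK</computeroutput> and
-<computeroutput>VALGRIND_FREELIKE_BLOCK</computeroutput>, here is a
-minimal sketch of a bump allocator that announces its blocks to the
-tool. The allocator itself (<computeroutput>arena_alloc</computeroutput>,
-<computeroutput>arena_free</computeroutput>, the arena size) is
-hypothetical, and the sketch assumes the argument order described in the
-comments in <filename>valgrind.h</filename>: address, size, redzone size
-and an is-zeroed flag for the first macro, address and redzone size for
-the second. Check the header shipped with your Valgrind version before
-relying on it.</para>
-
-<programlisting><![CDATA[
-#include <stddef.h>
-#include <valgrind/valgrind.h>
-
-/* Hypothetical fixed-size arena; blocks are carved off the front. */
-static char   arena[4096];
-static size_t used = 0;
-
-void* arena_alloc ( size_t n )
-{
-   void* p;
-   if (n == 0 || n > sizeof(arena)) return NULL;
-   n = (n + 15) & ~(size_t)15;            /* round up to 16 bytes */
-   if (n > sizeof(arena) - used) return NULL;
-   p = arena + used;
-   used += n;
-   /* Tell the tool that [p, p+n) now behaves like a malloc'd block:
-      no redzones (0), contents not guaranteed zeroed (0). */
-   VALGRIND_MALLOCLIKE_BLOCK(p, n, 0, 0);
-   return p;
-}
-
-void arena_free ( void* p )
-{
-   /* A bump allocator never reuses memory, so there is nothing to do
-      except tell the tool the block is dead.  The redzone size must
-      match the one given at allocation time. */
-   if (p != NULL)
-      VALGRIND_FREELIKE_BLOCK(p, 0);
-}
-]]></programlisting>
-
-<para>When the program is not run under Valgrind the two macros do
-nothing, so the allocator behaves exactly as it would without them.</para>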
-
-<para>Note that <filename>valgrind.h</filename> is included by
-all the tool-specific header files (such as
-<filename>memcheck.h</filename>), so you don't need to include it
-in your client if you include a tool-specific header.</para>
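-
-<para>The following sketch shows how a test harness might combine
-<computeroutput>RUNNING_ON_VALGRIND</computeroutput>,
-<computeroutput>VALGRIND_PRINTF</computeroutput> and
-<computeroutput>VALGRIND_COUNT_ERRORS</computeroutput>. It is an
-illustration only: <computeroutput>run_self_test</computeroutput> is a
-hypothetical placeholder for the code under test, and the exit-status
-convention is an assumption rather than anything Valgrind requires.</para>
-
-<programlisting><![CDATA[
-#include <stdio.h>
-#include <valgrind/valgrind.h>
-
-/* Hypothetical placeholder for the code being tested. */
-static void run_self_test ( void ) { }
-
-int main ( void )
-{
-   unsigned long before, after;
-
-   /* VALGRIND_PRINTF is silent outside Valgrind anyway; the explicit
-      check is only here to show RUNNING_ON_VALGRIND in use. */
-   if (RUNNING_ON_VALGRIND)
-      VALGRIND_PRINTF("self-test starting\n");
-
-   before = VALGRIND_COUNT_ERRORS;
-   run_self_test();
-   after = VALGRIND_COUNT_ERRORS;
-
-   if (after > before) {
-      fprintf(stderr, "Valgrind reported %lu new error(s)\n",
-              after - before);
-      return 1;
-   }
-   return 0;
-}
-]]></programlisting>
-
-<para>Run natively, both error counts are zero and the program exits
-with status 0; the count is only meaningful under a tool that reports
-errors, such as Memcheck.</para>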
-
-</sect1>
-
-
-
-
-
-<sect1 id="manual-core.wrapping" xreflabel="Function Wrapping">
-<title>Function wrapping</title>
-
-<para>
-Valgrind versions 3.2.0 and above can do function wrapping on all
-supported targets. In function wrapping, calls to some specified
-function are intercepted and rerouted to a different, user-supplied
-function. This can do whatever it likes, typically examining the
-arguments, calling onwards to the original, and possibly examining the
-result. Any number of functions may be wrapped.</para>
-
-<para>
-Function wrapping is useful for instrumenting an API in some way. For
-example, wrapping functions in the POSIX pthreads API makes it
-possible to notify Valgrind of thread status changes, and wrapping
-functions in the MPI (message-passing) API allows notifying Valgrind
-of memory status changes associated with message arrival/departure.
-Such information is usually passed to Valgrind by using client
-requests in the wrapper functions, although that is not of relevance
-here.</para>
-
-<sect2 id="manual-core.wrapping.example" xreflabel="A Simple Example">
-<title>A Simple Example</title>
-
-<para>Suppose we want to wrap some function:</para>
-
-<programlisting><![CDATA[
-int foo ( int x, int y ) { return x + y; }]]></programlisting>
-
-<para>A wrapper is a function of identical type, but with a special name
-which identifies it as the wrapper for <computeroutput>foo</computeroutput>.
-Wrappers need to include
-supporting macros from <computeroutput>valgrind.h</computeroutput>.
-Here is a simple wrapper which prints the arguments and return value:</para>
-
-<programlisting><![CDATA[
-#include <stdio.h>
-#include "valgrind.h"
-int I_WRAP_SONAME_FNNAME_ZU(NONE,foo)( int x, int y )
-{
- int result;
- OrigFn fn;
- VALGRIND_GET_ORIG_FN(fn);
- printf("foo's wrapper: args %d %d\n", x, y);
- CALL_FN_W_WW(result, fn, x,y);
- printf("foo's wrapper: result %d\n", result);
- return result;
-}
-]]></programlisting>
-
-<para>To become active, the wrapper merely needs to be present in a text
-section somewhere in the same process' address space as the function
-it wraps, and for its ELF symbol name to be visible to Valgrind. In
-practice, this means either compiling to a
-<computeroutput>.o</computeroutput> and linking it in, or
-compiling to a <computeroutput>.so</computeroutput> and
-<computeroutput>LD_PRELOAD</computeroutput>ing it in. The latter is more
-convenient in that it doesn't require relinking.</para>
-
-<para>All wrappers have approximately the above form. There are three
-crucial macros:</para>
-
-<para><computeroutput>I_WRAP_SONAME_FNNAME_ZU</computeroutput>:
-this generates the real name of the wrapper.
-This is an encoded name which Valgrind notices when reading symbol
-table information. What it says is: I am the wrapper for any function
-named <computeroutput>foo</computeroutput> which is found in
-an ELF shared object with an empty
-("<computeroutput>NONE</computeroutput>") soname field. The specification
-mechanism is powerful in
-that wildcards are allowed for both sonames and function names.
-The details are discussed below.</para>
-
-<para><computeroutput>VALGRIND_GET_ORIG_FN</computeroutput>:
-once in the wrapper, the first priority is
-to get hold of the address of the original (and any other supporting
-information needed). This is stored in a value of opaque
-type <computeroutput>OrigFn</computeroutput>.
-The information is acquired using
-<computeroutput>VALGRIND_GET_ORIG_FN</computeroutput>. It is crucial
-to make this macro call before calling any other wrapped function
-in the same thread.</para>
-
-<para><computeroutput>CALL_FN_W_WW</computeroutput>: eventually we will
-want to call the function being
-wrapped. Calling it directly does not work, since that just gets us
-back to the wrapper and tends to kill the program in short order by
-stack overflow. Instead, the result lvalue,
-<computeroutput>OrigFn</computeroutput> and arguments are
-handed to one of a family of macros of the form
-<computeroutput>CALL_FN_*</computeroutput>. These
-cause Valgrind to call the original and avoid recursion back to the
-wrapper.</para>
-</sect2>
-
-<sect2 id="manual-core.wrapping.specs" xreflabel="Wrapping Specifications">
-<title>Wrapping Specifications</title>
-
-<para>This scheme has the advantage of being self-contained. A library of
-wrappers can be compiled to object code in the normal way, and does
-not rely on an external script telling Valgrind which wrappers pertain
-to which originals.</para>
-
-<para>Each wrapper has a name which, in the most general case, says: I am the
-wrapper for any function whose name matches FNPATT and whose ELF
-"soname" matches SOPATT. Both FNPATT and SOPATT may contain wildcards
-(asterisks) and other characters (spaces, dots, @, etc) which are not
-generally regarded as valid C identifier names.</para>
-
-<para>This flexibility is needed to write robust wrappers for POSIX pthread
-functions, where typically we are not completely sure of either the
-function name or the soname, or alternatively we want to wrap a whole
-set of functions at once.</para>
-
-<para>For example, <computeroutput>pthread_create</computeroutput>
-in GNU libpthread is usually a
-versioned symbol - one whose name ends in, e.g.,
-<computeroutput>@GLIBC_2.3</computeroutput>. Hence we
-are not sure what its real name is. We would also like to cover any soname
-of the form <computeroutput>libpthread.so*</computeroutput>, although the
-example below names <computeroutput>libpthread.so.0</computeroutput> exactly.
-So the header of the wrapper will be</para>
-
-<programlisting><![CDATA[
-int I_WRAP_SONAME_FNNAME_ZZ(libpthreadZdsoZd0,pthreadZucreateZAZa)
- ( ... formals ... )
- { ... body ... }
-]]></programlisting>
-
-<para>In order to write unusual characters as valid C function names, a
-Z-encoding scheme is used. Names are written literally, except that
-a capital Z acts as an escape character, with the following encoding:</para>
-
-<programlisting><![CDATA[
- Za encodes *
- Zp +
- Zc :
- Zd .
- Zu _
- Zh -
- Zs (space)
- ZA @
- ZZ Z
- ZL ( # only in valgrind 3.3.0 and later
- ZR ) # only in valgrind 3.3.0 and later
-]]></programlisting>
-
-<para>Hence <computeroutput>libpthreadZdsoZd0</computeroutput> is an
-encoding of the soname <computeroutput>libpthread.so.0</computeroutput>
-and <computeroutput>pthreadZucreateZAZa</computeroutput> is an encoding
-of the function name <computeroutput>pthread_create@*</computeroutput>.
-</para>
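-
-<para>The encoding table above can be applied mechanically. The helper
-below is a small sketch of that process, not part of Valgrind:
-<computeroutput>z_encode</computeroutput> and its error convention are
-hypothetical, and it rejects any character that is neither alphanumeric
-nor listed in the table.</para>
-
-<programlisting><![CDATA[
-#include <stddef.h>
-#include <string.h>
-
-/* Z-encode src into dst (capacity dstlen, including the terminator).
-   Returns 0 on success, -1 if dst is too small or src contains a
-   character for which the table gives no encoding. */
-static int z_encode ( const char* src, char* dst, size_t dstlen )
-{
-   /* plain[i] is written as 'Z' followed by code[i]. */
-   static const char plain[] = "*+:._- @Z()";
-   static const char code[]  = "apcduhsAZLR";
-   size_t o = 0;
-
-   if (dstlen == 0) return -1;
-   for (; *src != '\0'; src++) {
-      const char* hit = strchr(plain, *src);
-      if (hit != NULL) {
-         if (o + 2 >= dstlen) return -1;
-         dst[o++] = 'Z';
-         dst[o++] = code[hit - plain];
-      } else if ((*src >= 'a' && *src <= 'z') ||
-                 (*src >= 'A' && *src <= 'Z') ||
-                 (*src >= '0' && *src <= '9')) {
-         if (o + 1 >= dstlen) return -1;
-         dst[o++] = *src;
-      } else {
-         return -1;
-      }
-   }
-   dst[o] = '\0';
-   return 0;
-}
-]]></programlisting>
-
-<para>Applied to <computeroutput>libpthread.so.0</computeroutput> and
-<computeroutput>pthread_create@*</computeroutput>, this produces the two
-encoded names used in the wrapper header above.</para>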
-
-<para>The macro <computeroutput>I_WRAP_SONAME_FNNAME_ZZ</computeroutput>
-constructs a wrapper name in which
-both the soname (first component) and function name (second component)
-are Z-encoded. Encoding the function name can be tiresome and is
-often unnecessary, so a second macro,
-<computeroutput>I_WRAP_SONAME_FNNAME_ZU</computeroutput>, can be
-used instead. The <computeroutput>_ZU</computeroutput> variant is
-also useful for writing wrappers for
-C++ functions, in which the function name is usually already mangled
-using some other convention in which Z plays an important role. Having
-to encode a second time quickly becomes confusing.</para>
-
-<para>Since the function name field may contain wildcards, it can be
-anything, including just <computeroutput>*</computeroutput>.
-The same is true for the soname.
-However, some ELF objects - specifically, main executables - do not
-have sonames. Any object lacking a soname is treated as if its soname
-was <computeroutput>NONE</computeroutput>, which is why the original
-example above had a name
-<computeroutput>I_WRAP_SONAME_FNNAME_ZU(NONE,foo)</computeroutput>.</para>
-
-<para>Note that the soname of an ELF object is not the same as its
-file name, although it is often similar. You can find the soname of
-an object <computeroutput>libfoo.so</computeroutput> using the command
-<computeroutput>readelf -a libfoo.so | grep soname</computeroutput>.</para>
-</sect2>
-
-<sect2 id="manual-core.wrapping.semantics" xreflabel="Wrapping Semantics">
-<title>Wrapping Semantics</title>
-
-<para>The ability for a wrapper to replace an infinite family of functions
-is powerful but brings complications in situations where ELF objects
-appear and disappear (are dlopen'd and dlclose'd) on the fly.
-Valgrind tries to maintain sensible behaviour in such situations.</para>
-
-<para>For example, suppose a process has dlopen'd (an ELF object with
-soname) <computeroutput>object1.so</computeroutput>, which contains
-<computeroutput>function1</computeroutput>. It starts to use
-<computeroutput>function1</computeroutput> immediately.</para>
-
-<para>After a while it dlopens <computeroutput>wrappers.so</computeroutput>,
-which contains a wrapper
-for <computeroutput>function1</computeroutput> in (soname)
-<computeroutput>object1.so</computeroutput>. All subsequent calls to
-<computeroutput>function1</computeroutput> are rerouted to the wrapper.</para>
-
-<para>If <computeroutput>wrappers.so</computeroutput> is
-later dlclose'd, calls to <computeroutput>function1</computeroutput> are
-naturally routed back to the original.</para>
-
-<para>Alternatively, if <computeroutput>object1.so</computeroutput>
-is dlclose'd but <computeroutput>wrappers.so</computeroutput> remains,
-then the wrapper exported by <computeroutput>wrappers.so</computeroutput>
-becomes inactive, since there
-is no way to get to it - there is no original to call any more. However,
-Valgrind remembers that the wrapper is still present. If
-<computeroutput>object1.so</computeroutput> is
-eventually dlopen'd again, the wrapper will become active again.</para>
-
-<para>In short, Valgrind inspects all code loading/unloading events to
-ensure that the set of currently active wrappers remains consistent.</para>
-
-<para>A second possible problem is that of conflicting wrappers. It is
-easily possible to load two or more wrappers, both of which claim
-to be wrappers for some third function. In such cases Valgrind will
-complain about conflicting wrappers when the second one appears, and
-will honour only the first one.</para>
-</sect2>
-
-<sect2 id="manual-core.wrapping.debugging" xreflabel="Debugging">
-<title>Debugging</title>
-
-<para>Figuring out what's going on given the dynamic nature of wrapping
-can be difficult. The
-<computeroutput>--trace-redir=yes</computeroutput> flag makes
-this possible
-by showing the complete state of the redirection subsystem after
-every
-<computeroutput>mmap</computeroutput>/<computeroutput>munmap</computeroutput>
-event affecting code (text).</para>
-
-<para>There are two central concepts:</para>
-
-<itemizedlist>
-
- <listitem><para>A "redirection specification" is a binding of
- a (soname pattern, fnname pattern) pair to a code address.
- These bindings are created by writing functions with names
- made with the
- <computeroutput>I_WRAP_SONAME_FNNAME_{ZZ,ZU}</computeroutput>
- macros.</para></listitem>
-
- <listitem><para>An "active redirection" is a code-address to
- code-address binding currently in effect.</para></listitem>
-
-</itemizedlist>
-
-<para>The state of the wrapping-and-redirection subsystem comprises a set of
-specifications and a set of active bindings. The specifications are
-acquired/discarded by watching all
-<computeroutput>mmap</computeroutput>/<computeroutput>munmap</computeroutput>
-events on code (text)
-sections. The active binding set is (conceptually) recomputed from
-the specifications, and all known symbol names, following any change
-to the specification set.</para>
-
-<para><computeroutput>--trace-redir=yes</computeroutput> shows the contents
-of both sets following any such event.</para>
-
-<para><computeroutput>-v</computeroutput> prints a line of text each
-time an active specification is used for the first time.</para>
-
-<para>Hence for maximum debugging effectiveness you will need to use both
-flags.</para>
-
-<para>One final comment. The function-wrapping facility is closely
-tied to Valgrind's ability to replace (redirect) specified
-functions, for example to redirect calls to
-<computeroutput>malloc</computeroutput> to its
-own implementation. Indeed, a replacement function can be
-regarded as a wrapper function which does not call the original.
-However, to make the implementation more robust, the two kinds
-of interception (wrapping vs replacement) are treated differently.
-</para>
-
-<para><computeroutput>--trace-redir=yes</computeroutput> shows
-specifications and bindings for both
-replacement and wrapper functions. To differentiate the
-two, replacement bindings are printed using
-<computeroutput>R-></computeroutput> whereas
-wraps are printed using <computeroutput>W-></computeroutput>.
-</para>
-</sect2>
-
-
-<sect2 id="manual-core.wrapping.limitations-cf"
- xreflabel="Limitations - control flow">
-<title>Limitations - control flow</title>
-
-<para>For the most part, the function wrapping implementation is robust.
-The only important caveat is: in a wrapper, get hold of
-the <computeroutput>OrigFn</computeroutput> information using
-<computeroutput>VALGRIND_GET_ORIG_FN</computeroutput> before calling any
-other wrapped function. Once you have the
-<computeroutput>OrigFn</computeroutput>, arbitrary
-calls between, recursion between, and longjumps out of wrappers
-should work correctly. There is never any interaction between wrapped
-functions and merely replaced functions
-(e.g. <computeroutput>malloc</computeroutput>), so you can call
-<computeroutput>malloc</computeroutput> etc safely from within wrappers.
-</para>
-
-<para>The above comments are true for {x86,amd64,ppc32}-linux. On
-ppc64-linux function wrapping is more fragile due to the (arguably
-poorly designed) ppc64-linux ABI. This mandates the use of a shadow
-stack which tracks entries/exits of both wrapper and replacement
-functions. This gives two limitations: firstly, longjumping out of
-wrappers will rapidly lead to disaster, since the shadow stack will
-not get correctly cleared. Secondly, since the shadow stack has
-finite size, recursion between wrapper/replacement functions is only
-possible to a limited depth, beyond which Valgrind has to abort the
-run. This depth is currently 16 calls.</para>
-
-<para>For all platforms ({x86,amd64,ppc32,ppc64}-linux) all the above
-comments apply on a per-thread basis. In other words, wrapping is
-thread-safe: each thread must individually observe the above
-restrictions, but there is no need for any kind of inter-thread
-cooperation.</para>
-</sect2>
-
-
-<sect2 id="manual-core.wrapping.limitations-sigs"
- xreflabel="Limitations - original function signatures">
-<title>Limitations - original function signatures</title>
-
-<para>As shown in the above example, to call the original you must use a
-macro of the form <computeroutput>CALL_FN_*</computeroutput>.
-For technical reasons it is impossible
-to create a single macro to deal with all argument types and numbers,
-so a family of macros covering the most common cases is supplied. In
-what follows, 'W' denotes a machine-word-typed value (a pointer or a
-C <computeroutput>long</computeroutput>),
-and 'v' denotes C's <computeroutput>void</computeroutput> type.
-The currently available macros are:</para>
-
-<programlisting><![CDATA[
-CALL_FN_v_v -- call an original of type void fn ( void )
-CALL_FN_W_v -- call an original of type long fn ( void )
-
-CALL_FN_v_W -- void fn ( long )
-CALL_FN_W_W -- long fn ( long )
-
-CALL_FN_v_WW -- void fn ( long, long )
-CALL_FN_W_WW -- long fn ( long, long )
-
-CALL_FN_v_WWW -- void fn ( long, long, long )
-CALL_FN_W_WWW -- long fn ( long, long, long )
-
-CALL_FN_W_WWWW -- long fn ( long, long, long, long )
-CALL_FN_W_5W -- long fn ( long, long, long, long, long )
-CALL_FN_W_6W -- long fn ( long, long, long, long, long, long )
-and so on, up to
-CALL_FN_W_12W
-]]></programlisting>
-
-<para>The set of supported types can be expanded as needed. It is
-regrettable that this limitation exists. Function wrapping has proven
-difficult to implement, with a certain apparently unavoidable level of
-ickyness. After several implementation attempts, the present
-arrangement appears to be the least-worst tradeoff. At least it works
-reliably in the presence of dynamic linking and dynamic code
-loading/unloading.</para>
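-
-<para>To illustrate choosing a macro from this family, here is a sketch
-of a wrapper for an original that takes one word-sized argument and
-returns a word-sized result, for which
-<computeroutput>CALL_FN_W_W</computeroutput> is the matching variant.
-The wrapped function <computeroutput>lookup</computeroutput> is
-hypothetical and is assumed to live in the main executable (soname
-<computeroutput>NONE</computeroutput>) and to be called with a non-NULL
-key.</para>
-
-<programlisting><![CDATA[
-#include <stdio.h>
-#include <valgrind/valgrind.h>
-
-/* Wrapper for a hypothetical function in the main executable:
-      char* lookup ( const char* key );
-   Pointers are machine-word sized, so both the argument and the
-   result count as 'W'. */
-char* I_WRAP_SONAME_FNNAME_ZU(NONE,lookup)( const char* key )
-{
-   char*  result;
-   OrigFn fn;
-   VALGRIND_GET_ORIG_FN(fn);     /* before any other wrapped call */
-   CALL_FN_W_W(result, fn, key); /* call the original lookup() */
-   printf("lookup(\"%s\") returned %p\n", key, (void*)result);
-   return result;
-}
-]]></programlisting>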
-
-<para>You should not attempt to wrap a function of one type signature with a
-wrapper of a different type signature. Such trickery will surely lead
-to crashes or strange behaviour. This is not, of course, a limitation
-of the function wrapping implementation, merely a reflection of the
-fact that it gives you sweeping powers to shoot yourself in the foot
-if you are not careful. Imagine the instant havoc you could wreak by
-writing a wrapper which matched any function name in any soname - in
-effect, one which claimed to be a wrapper for all functions in the
-process.</para>
-</sect2>
-
-<sect2 id="manual-core.wrapping.examples" xreflabel="Examples">
-<title>Examples</title>
-
-<para>In the source tree,
-<computeroutput>memcheck/tests/wrap[1-8].c</computeroutput> provide a series of
-examples, ranging from very simple to quite advanced.</para>
-
-<para><computeroutput>auxprogs/libmpiwrap.c</computeroutput> is an example
-of wrapping a big, complex API (the MPI-2 interface). This file defines
-almost 300 different wrappers.</para>
-</sect2>
-
-</sect1>
-
xmlns:xi="http://www.w3.org/2001/XInclude" />
<xi:include href="manual-core.xml" parse="xml"
xmlns:xi="http://www.w3.org/2001/XInclude" />
+ <xi:include href="manual-core-adv.xml" parse="xml"
+ xmlns:xi="http://www.w3.org/2001/XInclude" />
<xi:include href="../../memcheck/docs/mc-manual.xml" parse="xml"
xmlns:xi="http://www.w3.org/2001/XInclude" />
<xi:include href="../../cachegrind/docs/cg-manual.xml" parse="xml"