From: Julian Seward
Date: Sun, 19 Mar 2006 18:19:11 +0000 (+0000)
Subject: Yet another essay: document the MPI wrapper library.
X-Git-Tag: svn/VALGRIND_3_2_0~181

Yet another essay: document the MPI wrapper library.

git-svn-id: svn://svn.valgrind.org/valgrind/trunk@5778
---

diff --git a/docs/xml/manual-core.xml b/docs/xml/manual-core.xml
index df72e4bc11..ee5766e71b 100644
--- a/docs/xml/manual-core.xml
+++ b/docs/xml/manual-core.xml
@@ -2375,4 +2375,361 @@ shipped.


Debugging MPI Parallel Programs with Valgrind

Valgrind supports debugging of distributed-memory applications which use the MPI message passing standard. This support consists of a library of wrapper functions for the PMPI_* interface. When incorporated into the application's address space, either by direct linking or by LD_PRELOAD, the wrappers intercept calls to PMPI_Send, PMPI_Recv, etc. They then use client requests to inform Valgrind of memory state changes caused by the function being wrapped. This reduces the number of false positives that Memcheck otherwise typically reports for MPI applications.

The wrappers also take the opportunity to carefully check the size and definedness of buffers passed as arguments to MPI functions, hence detecting errors such as passing undefined data to PMPI_Send, or receiving data into a buffer which is too small.

Building and installing the wrappers

The wrapper library will be built automatically if possible. Valgrind's configure script will look for a suitable mpicc to build it with. This must be the same mpicc you use to build the MPI application you want to debug. By default, Valgrind tries mpicc, but you can specify a different one by using the configure-time flag --with-mpicc=. Currently the wrappers can only be built with mpiccs which are based on GNU gcc or Intel's icc.
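As a concrete illustration of the build steps just described, a configure-and-install run might look like the following. The prefix and mpicc paths are hypothetical examples, not values from this document; substitute the mpicc your application is built with.

```shell
# Hypothetical paths -- use the same mpicc that builds your MPI application.
./configure --prefix=$HOME/valgrind-inst --with-mpicc=/opt/openmpi/bin/mpicc
make
make install

# After installation, the wrapper library should appear under the install
# tree, in a platform-specific subdirectory:
ls $HOME/valgrind-inst/lib/valgrind/*/libmpiwrap.so
```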
Check that the configure script prints a line like this:

  checking for usable MPI2-compliant mpicc and mpi.h... yes

If it says ... no, your mpicc has failed to compile and link a test MPI2 program.

If the configure test succeeds, continue in the usual way with make and make install. The final install tree should then contain libmpiwrap.so.

Compile up a test MPI program (eg, MPI hello-world) and try this:

  LD_PRELOAD=$prefix/lib/valgrind/<platform>/libmpiwrap.so \
     mpirun [args] $prefix/bin/valgrind ./hello

You should see something similar to the following

  valgrind MPI wrappers 31901: Active for pid 31901
  valgrind MPI wrappers 31901: Try MPIWRAP_DEBUG=help for possible options

repeated for every process in the group. If you do not see these, there is a build/installation problem of some kind.

The MPI functions to be wrapped are assumed to be in an ELF shared object with soname matching libmpi.so*. This is known to be correct at least for Open MPI and Quadrics MPI, and can easily be changed if required.

Getting started

Compile your MPI application as usual, taking care to link it using the same mpicc that your Valgrind build was configured with.

Use the following basic scheme to run your application on Valgrind with the wrappers engaged:

  LD_PRELOAD=$prefix/lib/valgrind/<platform>/libmpiwrap.so \
     mpirun [mpirun-args] \
     $prefix/bin/valgrind [valgrind-args] \
     [application] [app-args]

As an alternative to LD_PRELOADing libmpiwrap.so, you can simply link it to your application if desired. This should not disturb native behaviour of your application in any way.

Controlling the wrapper library

The environment variable MPIWRAP_DEBUG is consulted at startup. The default behaviour is to print a starting banner

  valgrind MPI wrappers 31901: Active for pid 31901
  valgrind MPI wrappers 31901: Try MPIWRAP_DEBUG=help for possible options

and then be relatively quiet.

You can give a list of comma-separated options in MPIWRAP_DEBUG. These are

  verbose: show entries/exits of all wrappers. Also show extra debugging info, such as the status of outstanding MPI_Requests resulting from uncompleted MPI_Irecvs.
  quiet: opposite of verbose; only print anything when the wrappers want to report a detected programming error, or in case of catastrophic failure of the wrappers.

  warn: by default, functions which lack proper wrappers are not commented on, just silently ignored. This option causes a warning to be printed for each unwrapped function used, up to a maximum of three warnings per function.

  strict: print an error message and abort the program if a function lacking a wrapper is used.

If you want to use Valgrind's XML output facility (--xml=yes), you should pass quiet in MPIWRAP_DEBUG so as to get rid of any extraneous printing from the wrappers.

Abilities and limitations

Functions

All MPI2 functions except MPI_Wtick, MPI_Wtime and MPI_Pcontrol have wrappers. The first two are not wrapped because they return a double, and Valgrind's function-wrap mechanism cannot handle that (though it could easily enough be extended to). MPI_Pcontrol cannot be wrapped as it has variable arity: int MPI_Pcontrol(const int level, ...)

Most functions are wrapped with a default wrapper which does nothing except complain or abort if it is called, depending on the settings in MPIWRAP_DEBUG listed above. The following functions have "real", do-something-useful wrappers:

A few functions such as PMPI_Address are listed as HAS_NO_WRAPPER. They have no wrapper at all as there is nothing worth checking, and giving a no-op wrapper would reduce performance for no reason.

Note that the wrapper library can itself generate large numbers of calls to the MPI implementation, especially when walking complex types. The most common functions called are PMPI_Extent, PMPI_Type_get_envelope, PMPI_Type_get_contents, and PMPI_Type_free.

Types

MPI-1.1 structured types are supported, and walked exactly.
The currently supported combiners are MPI_COMBINER_NAMED, MPI_COMBINER_CONTIGUOUS, MPI_COMBINER_VECTOR, MPI_COMBINER_HVECTOR, MPI_COMBINER_INDEXED, MPI_COMBINER_HINDEXED and MPI_COMBINER_STRUCT. This should cover all MPI-1.1 types. The mechanism (function walk_type) should extend easily to cover MPI2 combiners.

MPI defines some named structured types (MPI_FLOAT_INT, MPI_DOUBLE_INT, MPI_LONG_INT, MPI_2INT, MPI_SHORT_INT, MPI_LONG_DOUBLE_INT) which are pairs of some basic type and a C int. Unfortunately the MPI specification makes it impossible to look inside these types and see where the fields are. Therefore these wrappers assume the types are laid out as struct { float val; int loc; } (for MPI_FLOAT_INT), etc, and act accordingly. This appears to be correct at least for Open MPI 1.0.2 and for Quadrics MPI.

If strict is an option specified in MPIWRAP_DEBUG, the application will abort if an unhandled type is encountered. Otherwise, the application will print a warning message and continue.

Some effort is made to mark/check memory ranges corresponding to arrays of values in a single pass. This is important for performance, since asking Valgrind to mark/check any range, no matter how small, carries quite a large constant cost. This optimisation is applied to arrays of primitive types (double, float, int, long, long long, short, char, and long double on platforms where sizeof(long double) == 8). For arrays of all other types, the wrappers handle each element individually, and so there can be a very large performance cost.

Writing new wrappers

For the most part the wrappers are straightforward. The only significant complexity arises with nonblocking receives.

The issue is that MPI_Irecv specifies the recv buffer and returns immediately, giving a handle (MPI_Request) for the transaction.
Later the user will have to poll for completion with MPI_Wait etc, and when the transaction completes successfully, the wrappers have to paint the recv buffer. But the recv buffer details are not presented to MPI_Wait -- only the handle is. The library therefore maintains a shadow table which associates uncompleted MPI_Requests with the corresponding buffer address/count/type. When an operation completes, the table is searched for the associated address/count/type info, and memory is marked accordingly.

Access to the table is guarded by a (POSIX pthreads) lock, so as to make the library thread-safe.

The table is allocated with malloc and never freed, so it will show up in leak checks.

Writing new wrappers should be fairly easy. The source file is auxprogs/libmpiwrap.c. If possible, find an existing wrapper for a function of similar behaviour to the one you want to wrap, and use it as a starting point. The wrappers are organised in sections in the same order as the MPI 1.1 spec, to aid navigation. When adding a wrapper, remember to comment out the definition of the default wrapper in the long list of defaults at the bottom of the file (do not remove it, just comment it out).

What to expect when using the wrappers

The wrappers should reduce Memcheck's false-error rate on MPI applications. Because the wrapping is done at the MPI interface, there will still potentially be a large number of errors reported in the MPI implementation below the interface. The best you can do is try to suppress them.

You may also find that the input-side (buffer length/definedness) checks find errors in your MPI use, for example passing too short a buffer to MPI_Recv.

Functions which are not wrapped may increase the false error rate. A possible approach is to run with MPIWRAP_DEBUG containing warn. This will show you functions which lack proper wrappers but which are nevertheless used. You can then write wrappers for them.
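To make the shadow-table scheme from "Writing new wrappers" concrete, here is a minimal C sketch of such a table: a lock-guarded store that records buffer details when a nonblocking receive is initiated and yields them back when the matching request completes. The names, fixed table size, and linear-scan lookup are illustrative assumptions only, not the actual implementation in auxprogs/libmpiwrap.c.

```c
#include <pthread.h>

/* Illustrative stand-ins (assumptions): the real code uses the MPI
   implementation's own MPI_Request and MPI_Datatype types. */
typedef int Req;

typedef struct {
   Req   key;      /* the outstanding request handle               */
   void* buf;      /* receive buffer address                       */
   int   count;    /* element count                                */
   int   type;     /* stand-in for the MPI datatype                */
   int   in_use;
} ShadowEntry;

#define TABLE_SIZE 256

static ShadowEntry     table[TABLE_SIZE];
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;

/* Record buffer details when a nonblocking receive is initiated
   (i.e. in the MPI_Irecv wrapper).  Returns 1 on success, 0 if full. */
int shadow_add ( Req key, void* buf, int count, int type )
{
   int i, ok = 0;
   pthread_mutex_lock(&table_lock);
   for (i = 0; i < TABLE_SIZE; i++) {
      if (!table[i].in_use) {
         table[i].key   = key;
         table[i].buf   = buf;
         table[i].count = count;
         table[i].type  = type;
         table[i].in_use = 1;
         ok = 1;
         break;
      }
   }
   pthread_mutex_unlock(&table_lock);
   return ok;
}

/* On completion (in the MPI_Wait wrapper), look up and remove the
   entry, handing back the buffer details so the caller can mark the
   received memory as defined.  Returns 1 if found, 0 otherwise. */
int shadow_remove ( Req key, void** buf, int* count, int* type )
{
   int i, found = 0;
   pthread_mutex_lock(&table_lock);
   for (i = 0; i < TABLE_SIZE; i++) {
      if (table[i].in_use && table[i].key == key) {
         *buf   = table[i].buf;
         *count = table[i].count;
         *type  = table[i].type;
         table[i].in_use = 0;
         found = 1;
         break;
      }
   }
   pthread_mutex_unlock(&table_lock);
   return found;
}
```

The mutex around both operations mirrors the "(POSIX pthreads) lock" mentioned above; a wrapper for any completion function (MPI_Wait, MPI_Waitall, MPI_Test, ...) would call shadow_remove for each request that completed successfully.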