From: Julian Seward
Date: Mon, 22 Dec 2008 00:39:41 +0000 (+0000)
Subject: Finish off updates to the Helgrind manual.
X-Git-Tag: svn/VALGRIND_3_4_0~37
X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=9736f313e43fcfecbdee4a5fd819ec4f175982f5;p=thirdparty%2Fvalgrind.git

Finish off updates to the Helgrind manual.

git-svn-id: svn://svn.valgrind.org/valgrind/trunk@8860
---

diff --git a/helgrind/docs/hg-manual.xml b/helgrind/docs/hg-manual.xml
index 079afbcbae..71a0d8604e 100644
--- a/helgrind/docs/hg-manual.xml
+++ b/helgrind/docs/hg-manual.xml

@@ -51,7 +51,7 @@ in detail in the next three sections:
 Data races -- accessing memory without adequate locking or
 synchronisation.
-   Note that Helgrind in Valgrind 3.4.0 and later uses a
+   Note that race detection in versions 3.4.0 and later uses a
    different algorithm than in 3.3.x. Hence, if you have been using
    Helgrind in 3.3.x, you may want to re-read this section.

@@ -320,7 +320,7 @@ address of var, happening in function main
 at line 13 in the program.
-The error message shows two other important:
+Two important parts of the message are:

@@ -337,8 +337,9 @@ program.
 one of these will be a write (since two concurrent, unsynchronised
 reads are harmless), and they will of course be from different threads.
-   By examining your program at the two locations, it should be
-   fairly clear what the root cause of the problem is.
+   By examining your program at the two locations, you should be
+   able to get at least some idea of what the root cause of the
+   problem is.
 For races which occur on global or stack variables, Helgrind

@@ -367,8 +368,8 @@ algorithm in more detail.
 Most programmers think about threaded programming in terms of
 the abstractions provided by the threading library (POSIX Pthreads):
-thread creation, thread joining, locks, condition variables and
-barriers.
+thread creation, thread joining, locks, condition variables,
+semaphores and barriers.
 The effect of using locks, barriers, etc, is to impose on a
 threaded program, constraints upon the order in which memory accesses

@@ -376,22 +377,25 @@ can happen.
 This implied ordering is generally known as the
 "happens-before relationship". Once you understand the happens-before
 relationship, it is easy to see how Helgrind finds races in your code.
 Fortunately, the happens-before relationship is itself easy to
-understand, and, additionally, is by itself a useful tool for
-reasoning about the behaviour of parallel programs. We now introduce
-it using a simple example.
+understand, and is by itself a useful tool for reasoning about the
+behaviour of parallel programs. We now introduce it using a simple
+example.
 Consider first the following buggy program:
 The parent thread creates a child. Both then write different

@@ -418,18 +422,21 @@ instructive to consider a somewhat more abstract
 solution, which is to send a message from one thread to the other:
 Now the program reliably prints "10", regardless of the speed of

@@ -464,7 +471,7 @@ a total ordering is comparison of numbers:
 for any two numbers x and y, either x is less than, equal to, or
 greater than y. A partial ordering is like a
-total ordering, but it can also express the concepts that two elements
+total ordering, but it can also express the concept that two elements
 are neither equal, less or greater, but merely unordered with respect
 to each other.

@@ -495,14 +502,14 @@ primitives are as follows:
 although with some complication so as to allow correct handling of
 reads vs writes.
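As an illustration of the lock rule sketched in the hunk above, consider a
minimal pthreads program (an invented example, not a listing from the manual):
whichever thread touches the shared variable second does so only after locking
a mutex that the other thread has already unlocked, so the unlock/lock pair
gives Helgrind a happens-before path between the two accesses and no race is
reported.

    #include <pthread.h>
    #include <stdio.h>

    static pthread_mutex_t mx = PTHREAD_MUTEX_INITIALIZER;
    static int shared = 0;

    static void *writer(void *arg)
    {
       (void)arg;
       pthread_mutex_lock(&mx);
       shared = 10;                       /* access made while holding mx */
       pthread_mutex_unlock(&mx);
       return NULL;
    }

    static void *reader(void *arg)
    {
       (void)arg;
       pthread_mutex_lock(&mx);
       printf("shared = %d\n", shared);   /* also made while holding mx */
       pthread_mutex_unlock(&mx);
       return NULL;
    }

    int main(void)
    {
       pthread_t t1, t2;
       pthread_create(&t1, NULL, writer, NULL);
       pthread_create(&t2, NULL, reader, NULL);
       pthread_join(t1, NULL);
       pthread_join(t2, NULL);
       return 0;
    }

    /* Whichever thread runs second acquires mx only after the other
       thread's unlock, so the two accesses to 'shared' are ordered by
       happens-before and Helgrind reports no race.  (The value printed
       still depends on scheduling: 0 or 10.) */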
-      When a condition variable is signed on by thread T1
-      and some other thread T2 is thereby released from a wait on the same
-      CV, then the memory accesses in T1 prior to the signalling must
-      happen-before those in T2 after it returns from the wait. If no
-      thread was waiting on the CV then there is no
+      When a condition variable (CV) is signalled on by
+      thread T1 and some other thread T2 is thereby released from a wait
+      on the same CV, then the memory accesses in T1 prior to the
+      signalling must happen-before those in T2 after it returns from the
+      wait. If no thread was waiting on the CV then there is no
       effect.
-      If instead T1 broadcasts on a CV then all of the
+      If instead T1 broadcasts on a CV, then all of the
       waiting threads, rather than just one of them, acquire a
       happens-before dependency on the broadcasting thread at the point it
       did the broadcast.

@@ -532,20 +539,20 @@ primitives are as follows:
-Helgrind intercepts the above listed events, and builds a
+In summary: Helgrind intercepts the above listed events, and builds a
 directed acyclic graph represented the collective happens-before
 dependencies. It also monitors all memory accesses.
 If a location is accessed by two different threads, but Helgrind
 cannot find any path through the happens-before graph from one access
-to the other, then it complains of a race.
+to the other, then it reports a race.
 There are a couple of caveats:
-      Helgrind doesn't check in the case where both
-      accesses are reads. That would be silly, since concurrent reads are
-      harmless.
+      Helgrind doesn't check for a race in the case where
+      both accesses are reads. That would be silly, since concurrent
+      reads are harmless.
       Two accesses are considered to be ordered by the
       happens-before dependency even through arbitrarily long chains of

@@ -627,8 +634,8 @@ location information makes Helgrind much slower at
 startup, and also requires considerable amounts of memory, for large
 programs.
-Once you have your two call stacks, how do you begin to get to
-the root problem?
+Once you have your two call stacks, how do you find the root
+cause of the race?
 The first thing to do is examine the source locations referred
 to by each call stack. They should both show an access to the same

@@ -644,14 +651,14 @@ thread-safe:
 Did you perhaps forget the locking at one or other of the accesses?
-      Alternatively, you intended to use a some other
-      scheme to make it safe, such as signalling on a condition variable.
-      In all such cases, try to find a synchronisation event (or a chain
-      thereof) which separates the earlier-observed access (as shown in the
-      second call stack) from the later-observed access (as shown in the
-      first call stack). In other words, try to find evidence that the
-      earlier access "happens-before" the later access. See the previous
-      subsection for an explanation of the happens-before
+      Alternatively, perhaps you intended to use a some
+      other scheme to make it safe, such as signalling on a condition
+      variable. In all such cases, try to find a synchronisation event
+      (or a chain thereof) which separates the earlier-observed access (as
+      shown in the second call stack) from the later-observed access (as
+      shown in the first call stack). In other words, try to find
+      evidence that the earlier access "happens-before" the later access.
+      See the previous subsection for an explanation of the happens-before
       relationship.
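The condition-variable rule above, and the advice to look for a
synchronisation event separating the two accesses, can be made concrete with
a small sketch (again an invented example, assuming POSIX pthreads): the
child's signal and the parent's wait form exactly the kind of happens-before
chain Helgrind looks for, so the parent's read of the result is race-free and
the program reliably prints 10.

    #include <pthread.h>
    #include <stdio.h>

    static pthread_mutex_t mx = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  cv = PTHREAD_COND_INITIALIZER;
    static int ready  = 0;     /* protected by mx */
    static int result = 0;     /* written by child, read by parent */

    static void *child_fn(void *arg)
    {
       (void)arg;
       result = 10;                      /* plain write, no lock held */
       pthread_mutex_lock(&mx);
       ready = 1;
       pthread_cond_signal(&cv);         /* T1 signals the CV ...          */
       pthread_mutex_unlock(&mx);
       return NULL;
    }

    int main(void)
    {
       pthread_t child;
       pthread_create(&child, NULL, child_fn, NULL);

       pthread_mutex_lock(&mx);
       while (!ready)
          pthread_cond_wait(&cv, &mx);   /* ... releasing T2 from its wait */
       pthread_mutex_unlock(&mx);

       printf("%d\n", result);           /* happens-after the child's write */
       pthread_join(child, NULL);
       return 0;
    }

The child's write to 'result' precedes its signal, and the parent's read
follows its return from the wait, so the two accesses are ordered and
Helgrind stays quiet even though 'result' itself is never accessed under a
lock.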
 The fact that Helgrind is reporting a race means it did not observe

@@ -932,59 +939,68 @@
 unlock(mx)
 unlock(mx)
-      Helgrind always regards locks as the basis for
-      inter-thread synchronisation. However, by default, before
-      reporting a race error, Helgrind will also check whether
-      certain other kinds of inter-thread synchronisation events
-      happened. It may be that if such events took place, then no
-      race really occurred, and so no error needs to be reported.
-      See above
-      for a discussion of transfers of exclusive ownership states
-      between threads.
-
-      With --happens-before=all, the
-      following events are regarded as sources of synchronisation:
-      thread creation/joinage, condition variable
-      signal/broadcast/waits, and semaphore posts/waits.
-
-      With --happens-before=threads, only
-      thread creation/joinage events are regarded as sources of
-      synchronisation.
-
-      With --happens-before=none, no events
-      (apart, of course, from locking) are regarded as sources of
-      synchronisation.
-
-      Changing this setting from the default will increase your
-      false-error rate but give little or no gain. The only advantage
-      is that --happens-before=threads and --happens-before=none
-      should make Helgrind
-      less and less sensitive to the scheduling of threads, and hence
-      the output more and more repeatable across runs.
-
+      When enabled (the default), Helgrind performs lock order
+      consistency checking. For some buggy programs, the large number
+      of lock order errors reported can become annoying, particularly
+      if you're only interested in race errors. You may therefore find
+      it helpful to disable lock order checking.

-      and
-      Requests that Helgrind produces a log of all state changes
-      to location 0xXXYYZZ. This can be helpful in tracking down
-      tricky races. --trace-level controls the
-      verbosity of the log. At the default setting (1), a one-line
-      summary of is printed for each state change. At level 2 a
-      complete stack trace is printed for each state change.
+      When enabled (the default), Helgrind collects enough
+      information about "old" accesses that it can produce two stack
+      traces in a race report -- both the stack trace for the
+      current access, and the trace for the older, conflicting
+      access.
+      Collecting such information is expensive in both speed and
+      memory. This flag disables collection of such information.
+      Helgrind will run significantly faster and use less memory,
+      but without the conflicting access stacks, it will be very
+      much more difficult to track down the root causes of
+      races. However, this option may be useful in situations where
+      you just want to check for the presence or absence of races,
+      for example, when doing regression testing of a previously
+      race-free program.
+
+      Information about "old" conflicting accesses is stored in
+      a cache of limited size, with LRU-style management. This is
+      necessary because it isn't practical to store a stack trace
+      for every single memory access made by the program.
+      Historical information on not recently accessed locations is
+      periodically discarded, to free up space in the cache.
+      This flag controls the size of the cache, in terms of the
+      number of different memory addresses for which
+      conflicting access information is stored. If you find that
+      Helgrind is showing race errors with only one stack instead of
+      the expected two stacks, try increasing this value.
+      The minimum value is 10,000 and the maximum is 10,000,000
+      (ten times the default value).
+      Increasing the value by 1
+      increases Helgrind's memory requirement by very roughly 100
+      bytes, so the maximum value will easily eat up an extra
+      gigabyte or so of memory.

@@ -1007,26 +1023,6 @@ Helgrind:
-      At exit, write to stderr a dump of the happens-before
-      graph computed by Helgrind, in a format suitable for the VCG
-      graph visualisation tool. A suitable command line is:
-      valgrind --tool=helgrind
-      --gen-vcg=yes my_app 2>&1
-      | grep xxxxxx | sed "s/xxxxxx//g"
-      | xvcg
-      With --gen-vcg=yes, the basic
-      happens-before graph is shown. With
-      --gen-vcg=yes-w-vts, the vector timestamp
-      for each node is also shown.

@@ -1054,8 +1050,6 @@ Helgrind:
 Run extensive sanity checks on Helgrind's internal data structures
 at events defined by the bitstring, as follows:
-      100000 at every query
-      to the happens-before graph
       010000 after changes to the lock order acquisition graph
       001000 after every client

@@ -1095,13 +1089,10 @@ some time.
 Document the VALGRIND_HG_CLEAN_MEMORY client request.
-      Possibly a client request to forcibly transfer
-      ownership of memory from one thread to another. Requires further
-      consideration.
-
-      Add a new client request that marks an address range
-      as being "shared-modified with empty lockset" (the error state),
-      and describe how to use it.
+      The conflicting access mechanism sometimes
+      mysteriously fails to show the conflicting access' stack, even
+      when provided with unbounded storage for conflicting access info.
+      This should be investigated.
 Document races caused by gcc's thread-unsafe code generation
 for speculative stores. In the interim see

@@ -1119,8 +1110,8 @@ some time.
 generate false lock-order errors and confuse users.
 Performance can be very poor. Slowdowns on the
-order of 100:1 are not unusual. There is quite some scope for
-performance improvements, though.
+order of 100:1 are not unusual. There is limited scope for
+performance improvements.
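For the gcc speculative-store item in the TODO list above, the pattern
usually cited looks like the following sketch (a hypothetical example;
whether a particular gcc version actually transforms the code this way
depends on the version and optimisation flags):

    int global;

    /* The programmer writes to 'global' only when 'update' is true, and
       arranges, by some external means, that no two threads ever call
       this with update != 0 at the same time. */
    void maybe_update(int update, int newval)
    {
       if (update)
          global = newval;
    }

    /* A compiler performing if-conversion may emit the equivalent of:
     *
     *    int tmp = global;
     *    global = update ? newval : tmp;   // unconditional store
     *
     * The store to 'global' then happens on every call, including calls
     * with update == 0, so two threads can genuinely race on 'global'
     * even though the source looks safe -- and Helgrind reports it. */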