Release 3.3.0 (7 December 2007)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
3.3.0 is a feature release with many significant improvements and the
usual collection of bug fixes. This release supports X86/Linux,
AMD64/Linux, PPC32/Linux and PPC64/Linux. Support for recent distros
exp-omega/docs/omega_introduction.txt.
* exp-DRD: a data race detector based on the happens-before
- relation. See exp-drd/TODO.txt.
+ relation. See exp-drd/docs/README.txt.
- Scalability improvements for very large programs, particularly those
which have a million or more malloc'd blocks in use at once. These
Data-race detection algorithm
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-- Implement glibc version detection in drd_main.c.
+- Rename drd_preloaded.c to drd_intercepts.c.
+- Propose on the Valgrind developers mailing list to add the scripts
+  "run-in-place" and "debug-in-place".
- Implement segment merging, such that the number of segments per thread
remains limited even when there is no synchronization between threads.
- Find out why a race is reported on std::string::string(std::string const&)
(stc test case 16).
- Make sure that drd supports more than 256 mutexes.
- Performance testing and tuning.
+- pthread semaphore support.
- pthread rwlock state tracking and support.
- pthread barrier state tracking and support.
- mutexes: support for pthread_mutex_timedlock() (recently added to the POSIX
Known bugs
~~~~~~~~~~
-- Gets killed by the OOM handler for some applications, e.g. knode and
- OpenOffice.
-- [AMD64] Reports "Allocation context: unknown" for BSS symbols on AMD64
- (works fine on X86). This is a bug in Valgrind's debug info reader
- -- VG_(find_seginfo)() returns NULL for BSS symbols on AMD64. Not yet in
+- Gets killed by the OOM handler for realistically sized applications,
+ e.g. knode and OpenOffice.
+- [x86_64] Reports "Allocation context: unknown" for BSS symbols on x86_64
+ (works fine on i386). This is a bug in Valgrind's debug info reader
+ -- VG_(find_seginfo)() returns NULL for BSS symbols on x86_64. Not yet in
the KDE bug tracking system.
-- False positives are reported when a signal is sent via pthread_kill() from
- one thread to another (bug 152728).
-- Crashes (cause not known): VALGRIND_LIB=$PWD/.in_place coregrind/valgrind --tool=exp-drd --trace-mem=yes /bin/ls
Known performance issues:
- According to cachegrind, VG_(OSet_Next)() consumes most of the CPU cycles.
-#EXTRA_DIST = drd-manual.xml
+EXTRA_DIST = README.txt
--- /dev/null
+DRD: a Data Race Detector
+=========================
+
+Last update: December 3, 2007 by Bart Van Assche.
+
+
+The Difficulty of Multithreaded Programming
+-------------------------------------------
+Multithreading is a programming model in which multiple concurrent activities
+run within a single process. Ever since the concept was introduced, there has
+been an ongoing debate about which is the better way to model concurrent
+activities -- multithreading or message passing. This debate exists because
+multithreaded programming is error prone: multithreaded programs can exhibit
+data races and/or deadlocks. Despite these risks, multithreaded programming is
+popular: for many applications multithreading is the more natural programming
+style, and multithreaded code often runs faster than the same application
+implemented via message passing.
+
+In the context of DRD, a data race is defined as two concurrent accesses to
+the same memory location, where at least one of the two accesses is a store
+and where these accesses are not protected by proper locking constructs.
+Data races are harmful because they may lead to unpredictable results in
+multithreaded programs. There is a general consensus that data races should
+be avoided in multithreaded programs.
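+
+As an illustration, the small test program below (hypothetical -- it is not
+part of the DRD source tree) contains a data race on the variable counter:
+both threads modify counter without holding a lock, so the final value is
+unpredictable. Running such a program under DRD should result in conflicting
+access reports on counter.
+
+  #include <pthread.h>
+  #include <stdio.h>
+
+  static int counter;            /* shared and not protected by any lock */
+
+  static void* increment(void* arg)
+  {
+    int i;
+    (void)arg;
+    for (i = 0; i < 10000; i++)
+      counter++;                 /* racy load-modify-store */
+    return NULL;
+  }
+
+  int main(void)
+  {
+    pthread_t t1, t2;
+    pthread_create(&t1, NULL, increment, NULL);
+    pthread_create(&t2, NULL, increment, NULL);
+    pthread_join(t1, NULL);
+    pthread_join(t2, NULL);
+    printf("counter = %d\n", counter);  /* result is unpredictable */
+    return 0;
+  }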
+
+
+About DRD
+---------
+The current version of DRD can perform data race detection on small
+programs -- it quickly runs out of memory for realistically sized programs.
+It runs well under Linux on x86 CPUs for multithreaded programs that use the
+POSIX threading library. Regular POSIX threads, detached threads, mutexes,
+condition variables and spinlocks are supported. POSIX semaphores, barriers
+and reader-writer locks are not yet supported.
+
+Extensive scientific research has been carried out in the area of data race
+detection. The two most important algorithms are the Eraser algorithm and the
+algorithm based on the happens-before relation, first documented by Netzer.
+The Eraser algorithm can report false positives, while the Netzer algorithm
+is guaranteed not to report any; the Netzer algorithm does, however, miss a
+certain class of data races. Both algorithms have been implemented in
+Valgrind: the helgrind tool implements the Eraser algorithm, and the DRD tool
+implements the Netzer algorithm. Although [Savage 1997] claims that the
+Netzer algorithm is harder to implement efficiently, as of version 3.3.0 drd
+runs significantly faster than helgrind on several regression tests.
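+
+DRD tracks the happens-before relation with vector clocks (see the
+VectorClock type and the vc_combine() call in the sources). The sketch below
+illustrates only the principle; the type layout and the names vc_lte and
+accesses_are_concurrent are simplified stand-ins invented for this sketch,
+not DRD's actual implementation.
+
+  #include <stdbool.h>
+
+  #define N_THREADS 4              /* illustration only */
+
+  typedef struct {
+    unsigned clock[N_THREADS];     /* one logical clock per thread */
+  } VectorClock;
+
+  /* True if every component of a is <= the corresponding component of b,
+     i.e. the access tagged with a happened before (or equals) the access
+     tagged with b. */
+  static bool vc_lte(const VectorClock* a, const VectorClock* b)
+  {
+    int i;
+    for (i = 0; i < N_THREADS; i++)
+      if (a->clock[i] > b->clock[i])
+        return false;
+    return true;
+  }
+
+  /* Two accesses are concurrent when neither vector clock is ordered before
+     the other; if at least one of the two accesses is a store, a
+     happens-before detector reports such a pair as a data race. */
+  static bool accesses_are_concurrent(const VectorClock* a,
+                                      const VectorClock* b)
+  {
+    return !vc_lte(a, b) && !vc_lte(b, a);
+  }
+
+Synchronization operations order segments by combining vector clocks (cf. the
+vc_combine() call made when one thread joins another), which is why properly
+synchronized accesses never appear concurrent.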
+
+
+How to use DRD
+--------------
+To use this tool, specify --tool=exp-drd on the Valgrind command line.
+
+
+Future DRD Versions
+-------------------
+The following may be expected in future versions of DRD:
+* More extensive documentation.
+* Drastically reduced memory consumption, such that realistic applications can
+ be analyzed with DRD.
+* Faster operation.
+* Support for semaphores, barriers and reader-writer locks.
+* Support for PowerPC CPUs.
+* A lock dependency analyzer, as an aid in deadlock prevention.
+* Support for more than 256 mutexes per process.
+
+
+See also
+--------
+* Robert H. B. Netzer and Barton P. Miller. What are race
+  conditions? Some issues and formalizations. ACM Letters
+  on Programming Languages and Systems, 1(1):74-88, March 1992.
+
+* John Ousterhout, Why Threads Are A Bad Idea (for most
+ purposes), Invited Talk at the 1996 USENIX Technical Conference (January
+ 25, 1996). http://home.pacbell.net/ouster/threads.pdf
+
+* Stefan Savage, Michael Burrows, Greg Nelson, Patrick
+ Sobalvarro and Thomas Anderson, Eraser: A Dynamic Data Race Detector for
+ Multithreaded Programs, ACM Transactions on Computer Systems,
+ 15(4):391-411, November 1997.
+
+
+
static Bool drd_trace_mem = False;
static Bool drd_trace_fork_join = False;
static Addr drd_trace_address = 0;
-#if 0
-// Note: using the information below for suppressing data races is only
-// possible when the client and the shared libraries it uses contain
-// debug information. Not every Linux distribution includes debug information
-// in shared libraries.
-static const SuppressedSymbol s_suppressed_symbols[] =
- {
- { "ld-linux.so.2", "_rtld_local" },
- { "libpthread.so.0", "__nptl_nthreads" },
- { "libpthread.so.0", "stack_cache" },
- { "libpthread.so.0", "stack_cache_actsize" },
- { "libpthread.so.0", "stack_cache_lock" },
- { "libpthread.so.0", "stack_used" },
- { "libpthread.so.0", "libgcc_s_forcedunwind" },
- { "libpthread.so.0", "libgcc_s_getcfa" },
- };
-#endif
//
{
Segment* sg;
+ thread_set_vg_running_tid(VG_(get_running_tid)());
+
if (! thread_is_recording(thread_get_running_tid()))
return;
{
Segment* sg;
+ thread_set_vg_running_tid(VG_(get_running_tid)());
+
if (! thread_is_recording(thread_get_running_tid()))
return;
const Addr a,
const SizeT size)
{
- const ThreadId running_tid = VG_(get_running_tid)();
-
- if (size == 0)
- return;
-
- if (tid != running_tid)
- {
- if (VgThreadIdToDrdThreadId(tid) != DRD_INVALID_THREADID)
- {
- drd_set_running_tid(tid);
- drd_trace_load(a, size);
- drd_set_running_tid(running_tid);
- }
- else
- {
- VG_(message)(Vg_DebugMsg,
- "drd_pre_mem_read() was called before"
- " drd_post_thread_create() for thread ID %d",
- tid);
- tl_assert(0);
- }
- }
- else
+ if (size > 0)
{
drd_trace_load(a, size);
}
const Addr a,
const SizeT size)
{
- const ThreadId running_tid = VG_(get_running_tid)();
-
- if (size == 0)
- return;
-
- if (tid != running_tid)
- {
- if (VgThreadIdToDrdThreadId(tid) != DRD_INVALID_THREADID)
- {
- drd_set_running_tid(tid);
- drd_trace_store(a, size);
- drd_set_running_tid(running_tid);
- }
- else
- {
-#if 1
- VG_(message)(Vg_DebugMsg,
- "drd_pre_mem_write() was called before"
- " drd_post_thread_create() for thread ID %d",
- tid);
- tl_assert(0);
-#endif
- }
- }
- else
+ if (size > 0)
{
drd_trace_store(a, size);
}
static void drd_start_using_mem(const Addr a1, const Addr a2)
{
+ thread_set_vg_running_tid(VG_(get_running_tid)());
+
if (a1 <= drd_trace_address && drd_trace_address < a2
&& thread_is_recording(thread_get_running_tid()))
{
/* Assumption: stacks grow downward. */
static void drd_stop_using_mem_stack(const Addr a, const SizeT len)
{
-#if 0
- VG_(message)(Vg_DebugMsg, "stop_using_mem_stack(0x%lx, %ld)", a, len);
-#endif
+ thread_set_vg_running_tid(VG_(get_running_tid)());
thread_set_stack_min(thread_get_running_tid(),
a + len - VG_STACK_REDZONE_SZB);
drd_stop_using_mem(a - VG_STACK_REDZONE_SZB,
// Hack: compensation for code missing in coregrind/m_main.c.
if (created == 1)
{
- extern ThreadId VG_(running_tid);
- tl_assert(VG_(running_tid) == VG_INVALID_THREADID);
- VG_(running_tid) = 1;
- drd_start_client_code(VG_(running_tid), 0);
- VG_(running_tid) = VG_INVALID_THREADID;
+ thread_set_running_tid(1, 1);
}
#endif
if (IsValidDrdThreadId(drd_creator))
# if defined(VGP_x86_linux) || defined(VGP_amd64_linux)
/* fine */
# else
- VG_(printf)("\nDRD currently only works on x86-linux and amd64-linux.\n");
- VG_(printf)("At the very least you need to set PTHREAD_{MUTEX,COND}_SIZE\n");
- VG_(printf)("in pthread_object_size.h to correct values. Sorry.\n\n");
- VG_(exit)(0);
+ VG_(printf)("\nWARNING: DRD has only been tested on x86-linux and amd64-linux.\n\n");
# endif
}
return bb;
}
-static void drd_set_running_tid(const ThreadId tid)
+static void drd_set_running_tid(const ThreadId vg_tid)
{
- static ThreadId s_last_tid = VG_INVALID_THREADID;
- if (tid != s_last_tid)
+ static ThreadId s_last_vg_tid = VG_INVALID_THREADID;
+ if (vg_tid != s_last_vg_tid)
{
- const DrdThreadId drd_tid = VgThreadIdToDrdThreadId(tid);
+ const DrdThreadId drd_tid = VgThreadIdToDrdThreadId(vg_tid);
tl_assert(drd_tid != DRD_INVALID_THREADID);
- s_last_tid = tid;
+ s_last_vg_tid = vg_tid;
if (drd_trace_fork_join)
{
VG_(message)(Vg_DebugMsg,
"drd_track_thread_run tid = %d / drd tid %d",
- tid, drd_tid);
+ vg_tid, drd_tid);
}
- thread_set_running_tid(drd_tid);
+ thread_set_running_tid(vg_tid, drd_tid);
}
}
static const Char drd_supp[] = "glibc-2.X-drd.supp";
const Int len = VG_(strlen)(VG_(libdir)) + 1 + sizeof(drd_supp);
Char* const buf = VG_(arena_malloc)(VG_AR_CORE, len);
- VG_(sprintf)(buf, "%s/%s", VG_(libdir), drd_supp);
+ VG_(snprintf)(buf, len, "%s/%s", VG_(libdir), drd_supp);
VG_(clo_suppressions)[VG_(clo_n_suppressions)] = buf;
VG_(clo_n_suppressions)++;
}
static ULong s_update_danger_set_count;
static ULong s_danger_set_bitmap_creation_count;
static ULong s_danger_set_bitmap2_creation_count;
-static DrdThreadId s_running_tid = DRD_INVALID_THREADID;
+static ThreadId s_vg_running_tid = VG_INVALID_THREADID;
+static DrdThreadId s_drd_running_tid = DRD_INVALID_THREADID;
static ThreadInfo s_threadinfo[DRD_N_THREADS];
static struct bitmap* s_danger_set;
fmt, arg);
s_threadinfo[tid].name[sizeof(s_threadinfo[tid].name) - 1] = 0;
}
+
DrdThreadId thread_get_running_tid(void)
{
- tl_assert(s_running_tid != DRD_INVALID_THREADID);
- return s_running_tid;
+ // HACK. To do: remove the if-statement and keep the assert.
+ if (VG_(get_running_tid)() != VG_INVALID_THREADID)
+ tl_assert(VG_(get_running_tid)() == s_vg_running_tid);
+ tl_assert(s_drd_running_tid != DRD_INVALID_THREADID);
+ return s_drd_running_tid;
}
-void thread_set_running_tid(const DrdThreadId tid)
+void thread_set_vg_running_tid(const ThreadId vg_tid)
{
- s_running_tid = tid;
- thread_update_danger_set(tid);
- s_context_switch_count++;
+ // HACK. To do: uncomment the line below.
+ // tl_assert(vg_tid != VG_INVALID_THREADID);
+
+ if (vg_tid != s_vg_running_tid)
+ {
+ thread_set_running_tid(vg_tid, VgThreadIdToDrdThreadId(vg_tid));
+ }
+
+ tl_assert(s_vg_running_tid != VG_INVALID_THREADID);
+ tl_assert(s_drd_running_tid != DRD_INVALID_THREADID);
+}
+
+void thread_set_running_tid(const ThreadId vg_tid, const DrdThreadId drd_tid)
+{
+ // HACK. To do: remove the next two lines.
+ if (vg_tid == VG_INVALID_THREADID)
+ return;
+
+ tl_assert(vg_tid != VG_INVALID_THREADID);
+ tl_assert(drd_tid != DRD_INVALID_THREADID);
+
+ if (vg_tid != s_vg_running_tid)
+ {
+ s_vg_running_tid = vg_tid;
+ s_drd_running_tid = drd_tid;
+ thread_update_danger_set(drd_tid);
+ s_context_switch_count++;
+ }
+
+ tl_assert(s_vg_running_tid != VG_INVALID_THREADID);
+ tl_assert(s_drd_running_tid != DRD_INVALID_THREADID);
}
/**
vc_combine(&s_threadinfo[joiner].last->vc, &s_threadinfo[joinee].last->vc);
thread_discard_ordered_segments();
- if (joiner == s_running_tid)
+ if (joiner == s_drd_running_tid)
{
thread_update_danger_set(joiner);
}
for (p = s_threadinfo[i].first; p; p = p->next)
{
if (other_user == DRD_INVALID_THREADID
- && i != s_running_tid
+ && i != s_drd_running_tid
&& bm_has_any_access(p->bm, a1, a2))
{
other_user = i;
tl_assert(0 <= tid && tid < DRD_N_THREADS
&& tid != DRD_INVALID_THREADID);
- tl_assert(tid == s_running_tid);
+ tl_assert(tid == s_drd_running_tid);
s_update_danger_set_count++;
s_danger_set_bitmap_creation_count -= bm_get_bitmap_creation_count();
void thread_set_name_fmt(const DrdThreadId tid, const char* const name,
const UWord arg);
DrdThreadId thread_get_running_tid(void);
-void thread_set_running_tid(const DrdThreadId tid);
+void thread_set_vg_running_tid(const ThreadId vg_tid);
+void thread_set_running_tid(const ThreadId vg_tid,
+ const DrdThreadId drd_tid);
Segment* thread_get_segment(const DrdThreadId tid);
void thread_new_segment(const DrdThreadId tid);
VectorClock* thread_get_vc(const DrdThreadId tid);
-prog: pth_create_chain 10
+prog: pth_create_chain 100
fun:exit
}
{
- dl
+ dl-2.6.*
exp-drd:ConflictingAccess
- obj:/lib64/ld-2.6.1.so
+ obj:/lib*/ld-*.so
fun:exit
}
{
- dl-2.6.1
+ dl-2.6.*
exp-drd:ConflictingAccess
- obj:/lib64/ld-2.6.1.so
- obj:/lib64/ld-2.6.1.so
- obj:/lib64/ld-2.6.1.so
+ obj:/lib*/ld-*.so
+ obj:/lib*/ld-*.so
+ obj:/lib*/ld-*.so
}
{
libc
exp-drd:ConflictingAccess
fun:__libc_enable_asynccancel
- obj:/lib/libc-*
+ obj:/lib*/libc-*
}
{
libc
exp-drd:ConflictingAccess
fun:__libc_disable_asynccancel
- obj:/lib/libc-*
+ obj:/lib*/libc-*
}
{
librt
{
pthread
exp-drd:ConflictingAccess
- obj:/lib64/libpthread-2.6.1.so
+ obj:/lib*/libpthread-*.so
fun:start_thread
fun:clone
}
{
pthread
exp-drd:ConflictingAccess
- obj:/lib64/libc-2.6.1.so
+ obj:/lib*/libc-*.so
fun:__libc_thread_freeres
fun:start_thread
fun:clone
{
pthread
exp-drd:ConflictingAccess
- obj:/lib64/libc-2.6.1.so
- obj:/lib64/libc-2.6.1.so
+ obj:/lib*/libc-*.so
+ obj:/lib*/libc-*.so
fun:__libc_thread_freeres
fun:start_thread
fun:clone
fun:pthread_join
fun:pthread_join
}
+{
+ pthread
+ exp-drd:ConflictingAccess
+ fun:free_stacks
+ fun:__deallocate_stack
+ fun:pthread_join
+ fun:pthread_join
+}
{
pthread
exp-drd:ConflictingAccess
pthread
exp-drd:ConflictingAccess
fun:sigcancel_handler
- obj:/lib/libpthread-*
+ obj:/lib*/libpthread-*
}
{
pthread-unwind
fun:_Unwind_ForcedUnwind
fun:__pthread_unwind
fun:sigcancel_handler
- obj:/lib/libpthread-*
+ obj:/lib*/libpthread-*
}
{
pthread-unwind