From: Nicholas Nethercote
Date: Mon, 17 Sep 2007 22:19:01 +0000 (+0000)
Subject: Add section on how to use Cachegrind's results.
X-Git-Tag: svn/VALGRIND_3_3_0~210
X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=5771d4fcc63f4cfa53405f6bb277e4d699cbb262;p=thirdparty%2Fvalgrind.git

Add section on how to use Cachegrind's results.

git-svn-id: svn://svn.valgrind.org/valgrind/trunk@6852
---

diff --git a/cachegrind/docs/cg-manual.xml b/cachegrind/docs/cg-manual.xml
index 9b4b0b9a27..60d5622bcb 100644
--- a/cachegrind/docs/cg-manual.xml
+++ b/cachegrind/docs/cg-manual.xml
@@ -1221,11 +1221,48 @@ fail these checks.
+
+Acting on Cachegrind's information
+
+So, you've managed to profile your program with Cachegrind. Now what?
+What's the best way to actually act on the information it provides to speed
+up your program?
+
+
+First of all, the global hit/miss rate numbers are not that useful. If you
+have multiple programs or multiple runs of a program, comparing the numbers
+might identify if any are outliers. Otherwise, they're not enough to act
+on.
+
+
+The source code annotations are much more useful. In our experience, the
+best place to start is by looking at the Ir
+numbers. They simply measure how many instructions were executed for each
+line, and don't include any cache information, but they can still be very
+useful for identifying bottlenecks.
+
+
+After that, we have found that L2 misses are typically a much bigger source
+of slow-downs than L1 misses. So it's worth looking for any snippets of
+code that cause a lot of L2 misses. If you find any, it's still not always
+easy to work out how to improve things. You need to have a reasonable
+understanding of how caches work, the principles of locality, and your
+program's data access patterns.
+
+
+In short, Cachegrind can tell you where some of the bottlenecks in your code
+are, but it can't tell you how to fix them. You have to work that out for
+yourself. But at least you have the information!
+
+
+
 Implementation details
+
 This section talks about details you don't need to know about in order to use Cachegrind, but may be of interest to some people.
+
 How Cachegrind works
@@ -1294,8 +1331,8 @@ cache simulation.
 More than one line of info can be presented for each file/fn/line number. In such cases, the counts for the named events will be accumulated.
-Counts can be "." to represent zero. This makes the files easier to
-read.
+Counts can be "." to represent zero. This makes the files easier for
+humans to read.
 The number of counts in each line and the
@@ -1303,7 +1340,8 @@ read.
 the number of events in the event_line. If the number in each line is less, cg_annotate
-treats those missing as though they were a "." entry.
+treats those missing as though they were a "." entry. This saves space.
+
 A file_line changes the current file name. A fn_line