From: Philippe Waroquiers
Date: Fri, 11 Nov 2016 15:11:49 +0000 (+0000)
Subject: Update documentation and NEWS for xtree concept.
X-Git-Tag: svn/VALGRIND_3_13_0~290
X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=3be4b5ba1c8ade569acfafc336d60d0ed7ac6a96;p=thirdparty%2Fvalgrind.git

Update documentation and NEWS for xtree concept.

Final patch of the xtree series, which provides the documentation.

The xtree concept was committed in the revisions
 16120 : Support pool of unique string in pub_tool_deduppoolalloc.h
 16121 : Implement a cache 'address -> symbol name' in m_debuginfo.c
 16122 : Add VG_(strIsMemberXA) in pub_tool_xarray.h
 16123 : Addition of the pub_tool_xtree.h and pub_tool_xtmemory.h modules, and of the --xtree-memory* options
 16124 : Addition of the options --xtree-memory and --xtree-memory-file
 16125 : Small changes in callgrind_annotate and callgrind manual
 16126 : Locally define vgPlain_scrcmp in 2 unit tests
 16127 : Support for xtree memory profiling and xtmemory gdbsrv monitor command in helgrind
 16128 : Support for xtree memory profiling and xtmemory gdbsrv monitor command in memcheck
 16129 : Update massif implementation to xtree

Some smaller follow-up patches to be expected to add some regtests,
and refine documentation.

Thanks to Ivo, Julian and Josef for the review comments.

git-svn-id: svn://svn.valgrind.org/valgrind/trunk@16131
---

diff --git a/NEWS b/NEWS
index 519af63d66..c4facd7128 100644
--- a/NEWS
+++ b/NEWS
@@ -2,10 +2,60 @@
 Release 3.13.0 (?? ????????? 201?)
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~--
 Release 3.13.0 is under development, not yet released.
+3.13.0 is a feature release with many improvements and the usual
+collection of bug fixes.
+
+This release supports X86/Linux, AMD64/Linux, ARM32/Linux,
+ARM64/Linux, PPC32/Linux, PPC64BE/Linux, PPC64LE/Linux, S390X/Linux,
+MIPS32/Linux, MIPS64/Linux, ARM/Android, ARM64/Android,
+MIPS32/Android, X86/Android, X86/Solaris, AMD64/Solaris, X86/MacOSX
+10.10 and AMD64/MacOSX 10.10. There is also preliminary support for
+X86/MacOSX 10.11/12, AMD64/MacOSX 10.11/12 and TILEGX/Linux.
+
+A significant change in 3.13 is the addition of the 'xtree' concept:
+An xtree is a tree of stacktraces with data associated with the stacktraces.
+This xtree is used by various tools (memcheck, helgrind, massif) to
+report the heap consumption of your program. The xtree reporting
+is controlled by the new options --xtree-memory=none|allocs|full and
+--xtree-memory-file=.
+A heap xtree memory profile can also be produced on demand using
+a gdbserver monitor command.
+The xtree can be output in 2 formats: 'callgrind format'
+and 'massif format'. The existing visualisers for these formats (e.g.
+callgrind_annotate, kcachegrind, ms_print) can be used to visualise
+and analyse these reports.
+For more details, read the user manual.
+
 * ================== PLATFORM CHANGES =================
 * ==================== TOOL CHANGES ====================
+* Memcheck:
+
+ - Support for --xtree-memory profiling.
+
+ - A new monitor command 'xtmemory [<filename>]' produces a
+ heap usage profile report.
+
+* Massif:
+
+ - Support for --xtree-memory profiling.
+
+ - A new monitor command 'xtmemory [<filename>]' produces a
+ heap usage profile report.
+
+ - For some workloads (typically, for big applications), Massif
+ memory consumption and CPU consumption decrease significantly.
+
+* Helgrind:
+
+ - Support for --xtree-memory profiling.
+
+ - A new monitor command 'xtmemory [<filename>]' produces a
+ heap usage profile report.
+
+
+
 * ==================== OTHER CHANGES ====================
 * ==================== FIXED BUGS ====================
@@ -23,6 +73,7 @@ where XXXXXX is the bug number as listed below.
 371412 Rename wrap_sys_shmat to sys_shmat like other wrappers
 371869 support '%' in symbol Z-encoding
+371916 execution tree xtree concept
 372120 c++ demangler demangles symbols which are not c++
diff --git a/docs/Makefile.am b/docs/Makefile.am
index 43f80b172e..e3788da80c 100644
--- a/docs/Makefile.am
+++ b/docs/Makefile.am
@@ -19,6 +19,7 @@ EXTRA_DIST = \
 images/next.png \
 images/prev.png \
 images/up.png \
+ images/kcachegrind_xtree.png \
 internals/3_0_BUGSTATUS.txt \
 internals/3_1_BUGSTATUS.txt \
 internals/3_2_BUGSTATUS.txt \
@@ -173,6 +174,7 @@ print-docs:
 export XML_CATALOG_FILES=$(XML_CATALOG_FILES) && \
 mkdir -p $(myprintdir) && \
 mkdir -p $(myprintdir)/images && \
+ cp $(myimgdir)/*.png $(myprintdir)/images && \
 $(XSLTPROC) $(XSLTPROC_FLAGS) -o $(myprintdir)/index.fo $(XSL_FO_STYLE) $(myxmldir)/index.xml && \
 (cd $(myprintdir) && \
 ( pdfxmltex index.fo && \
diff --git a/docs/images/kcachegrind_xtree.png b/docs/images/kcachegrind_xtree.png
new file mode 100644
index 0000000000..6006bbd912
Binary files /dev/null and b/docs/images/kcachegrind_xtree.png differ
diff --git a/docs/xml/manual-core-adv.xml b/docs/xml/manual-core-adv.xml
index b767825c5b..ee13ecf128 100644
--- a/docs/xml/manual-core-adv.xml
+++ b/docs/xml/manual-core-adv.xml
@@ -1390,6 +1390,13 @@ client request.
 to a huge value and continue execution.
+
+ xtmemory [<filename> default xtmemory.kcg]
+ requests the tool to produce an xtree heap memory report.
+ See for
+ a detailed explanation about execution trees.
+
+
 The following Valgrind monitor commands are useful for
diff --git a/docs/xml/manual-core.xml b/docs/xml/manual-core.xml
index 73349957bb..a86f10b837 100644
--- a/docs/xml/manual-core.xml
+++ b/docs/xml/manual-core.xml
@@ -1011,7 +1011,7 @@ that can report errors, e.g. Memcheck, but not Cachegrind.
 . Any or
 sequences appearing in the filename are expanded
 in exactly the same way as they are for .
- See the description of for details.
+ See the description of for details.
@@ -1686,6 +1686,74 @@ Massif, Helgrind, DRD), the following options apply.
+
+
+
+
+
+ Tools replacing Valgrind's malloc,
+ realloc, etc, can optionally produce an execution
+ tree detailing which piece of code is responsible for heap
+ memory usage. See
+ for a detailed explanation about execution trees.
+
+ When set to none, no memory execution
+ tree is produced.
+
+ When set to allocs, the memory
+ execution tree gives the current number of allocated bytes and
+ the current number of allocated blocks.
+
+ When set to full, the memory execution
+ tree gives 6 different measurements : the current number of
+ allocated bytes and blocks (same values as
+ for allocs), the total number of allocated
+ bytes and blocks, the total number of freed bytes and
+ blocks.
+
+ Note that the overhead in cpu and memory to produce
+ an xtree depends on the tool. The overhead in cpu is small for
+ the value allocs, as the information needed
+ to produce this report is maintained in any case by the tool.
+ For massif and helgrind, specifying full
+ implies capturing a stack trace for each free operation,
+ while normally these tools only capture an allocation stack
+ trace. For memcheck, the cpu overhead for the
+ value full is small, as this can only be
+ used in combination with
+ or
+ , which
+ already records a stack trace for each free operation. The
+ memory overhead varies between 5 and 10 words per unique
+ stacktrace in the xtree, plus the memory needed to record the
+ stack trace for the free operations, if needed specifically
+ for the xtree.
+
+
+
+
+
+
+
+
+
+ Specifies that Valgrind should produce the xtree memory
+ report in the specified file. Any or
+ sequences appearing in the filename are expanded
+ in exactly the same way as they are for .
+ See the description of
+ for details.
+ If the filename contains the extension ,
+ then the produced file format will be a massif output file format.
+ If the filename contains the extension
+ or no extension is provided or recognised,
+ then the produced file format will be a callgrind output format.
+ See
+ for a detailed explanation about execution tree formats.
+
+
+
@@ -2673,11 +2741,231 @@ will create a core dump in the usual way.
+
+Execution Trees
+
+An execution tree (xtree) is made of a set of stack traces; each
+ stack trace is associated with some resource consumptions or event
+ counts. Depending on the xtree, different event counts/resource
+ consumptions can be recorded in the xtree.
+
+ A typical usage for an xtree is to show a graphical or textual
+ representation of the heap usage of a program. The below figure is
+ a heap usage xtree graphical representation produced by
+ kcachegrind. In the kcachegrind output, you can see that main's
+ current heap usage (allocated indirectly) is 528 bytes : 388 bytes
+ allocated indirectly via a call to function f1 and 140 bytes
+ indirectly allocated via a call to function f2. f2 has allocated
+ memory by calling g2, while f1 has allocated memory by calling g11
+ and g12. g11, g12 and g2 have directly called a memory allocation
+ function (malloc), and so have a non zero 'Self' value. Note that when
+ kcachegrind shows an xtree, the 'Called' column and call nr indications in
+ the Call Graph are not significant (always set to 0 or 1, independently
+ of the real nr of calls). A future version of kcachegrind will no longer
+ show such irrelevant xtree call number information.
+
+
+
+An xtree heap memory report is produced at the end of the
+ execution when required using the
+ option . It can also be produced on
+ demand using the monitor command (see
+ ). Currently,
+ an xtree heap memory report can be produced by
+ the ,
+ and tools.
+
+ The xtrees produced by the option
+ or the
+ monitor command show the following events/resource
+ consumption describing heap usage:
+
+
+ current number of Bytes allocated.
+ The number of allocated bytes is added to the
+ value of a stack trace for each allocation. It is decreased when
+ a block allocated by this stack trace is released (by another
+ "freeing" stack trace).
+
+
+
+ current number of Blocks allocated,
+ maintained similarly to curB : +1 for each allocation, -1 when
+ the block is freed.
+
+
+
+ total allocated Bytes. This is
+ increased for each allocation with the number of allocated bytes.
+
+
+
+ total allocated Blocks, maintained similarly
+ to totB : +1 for each allocation.
+
+
+
+ total Freed Bytes, increased each time
+ a block is released by this ("freeing") stack trace :
+ nr freed bytes
+ for each free operation.
+
+
+
+ total Freed Blocks, maintained similarly
+ to totFdB : +1 for each free operation.
+
+
+Note that the last 4 counts are produced only when the
+ was given at startup.
+Xtrees can be saved in 2 file formats, the "Callgrind Format" and
+the "Massif Format".
+
+
+
+ Callgrind Format
+ An xtree file in the Callgrind Format contains a single callgraph,
+ associating each stack trace with the values recorded
+ in the xtree.
+ Different Callgrind Format file visualisers are available:
+ Valgrind distribution includes the
+ command line utility that reads in the xtree data, and prints a sorted
+ list of functions, optionally with source annotation. Note that due to
+ xtree specificities, you must give the option
+ to callgrind_annotate.
+ For graphical visualization of the data, you can use
+ KCachegrind, which is a KDE/Qt based
+ GUI that makes it easy to navigate the large amount of data that
+ an xtree can contain.
+
+
+
+ Massif Format
+ An xtree file in the Massif Format contains one detailed tree
+ callgraph data for each type of event recorded in the xtree. So,
+ for , the output file will
+ contain 2 detailed trees (for the counts
+ and ),
+ while will give a file
+ with 6 detailed trees.
+ Different Massif Format file visualisers are available.
+ Valgrind distribution includes the
+ command line utility that produces an easy to read representation of
+ a massif output file. See and
+ for more details
+ about visualising Massif Format output files.
+
+
+Note that it is recommended to use the "Callgrind Format" as it
+ is more compact than the Massif Format, and the Callgrind Format
+ visualisers are more versatile than the Massif Format
+ visualisers. kcachegrind is particularly easy to use to analyse
+ big xtree data.
+
+To clarify the xtree concept, the below gives several extracts of
+ the output produced by the following commands:
+
+
+The below extract shows that the program mfg has allocated in
+ total 770 bytes in 60 different blocks. Of these 60 blocks, 19 were
+ freed, releasing a total of 242 bytes. The heap currently contains
+ 528 bytes in 41 blocks.
+
+
+The below gives more details about which functions have
+ allocated or released memory. As an example, we see that main has
+ (directly or indirectly) allocated 770 bytes of memory and freed
+ (directly or indirectly) 242 bytes of memory. The function f1 has
+ (directly or indirectly) allocated 570 bytes of memory, and has not
+ (directly or indirectly) freed memory. Of the 570 bytes allocated
+ by function f1, 388 bytes (34 blocks) have not been
+ released.
+
+
+The below gives more detailed information about the callgraph
+ and which source lines/calls have (directly or indirectly) allocated or
+ released memory. The below shows that the 770 bytes allocated by
+ main have been indirectly allocated by calls to f1 and f2.
+ Similarly, we see that the 570 bytes allocated by f1 have been
+ indirectly allocated by calls to g11 and g12. Of the 330 bytes allocated
+ by the 30 calls to g11, 168 bytes have not been freed.
+ The function freeY (called once by main) has released in total
+ 10 blocks and 131 bytes.
+
+
+Heap memory xtrees help to understand how your (big)
+ program is using the heap.
+ A full heap memory xtree helps to pinpoint
+ some code that allocates a lot of small objects : allocating
+ such small objects might be replaced by a more efficient technique,
+ such as allocating a big block using malloc, and then dividing this
+ block into smaller blocks in order to decrease the cpu and/or memory
+ overhead of allocating a lot of small blocks. Such full xtree information
+ complements e.g. what callgrind can show: callgrind can show the number
+ of calls to a function (such as malloc) but does not indicate the volume
+ of memory allocated (or freed).
+
+A full heap memory xtree also can identify the code that allocates
+ and frees a lot of blocks : the total footprint of the program might
+ not reflect the fact that the same memory was allocated then released
+ over and over.
+
+Finally, Xtree visualisers such as kcachegrind help to
+ identify big memory consumers, in order to possibly optimise the
+ amount of memory needed by your program.
+
 Building and Installing Valgrind
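
The six heap counters documented in the Execution Trees section of this patch can be modelled in a few lines of Python. This is a toy sketch, not Valgrind's implementation: the class HeapXTree, its methods and the block identifiers are hypothetical names chosen for illustration, stack traces are simplified to tuples of function names, and the counter names curBk, totBk and totFdBk are assumed by analogy with curB, totB and totFdB. It shows that an allocation is charged to the allocating stack trace, while a free decreases the current counters of the allocating stack trace and increases the freed counters of the freeing stack trace.

```python
from collections import defaultdict

# Counter names; curBk, totBk and totFdBk are assumed by analogy with the text.
COUNTERS = ("curB", "curBk", "totB", "totBk", "totFdB", "totFdBk")


class HeapXTree:
    """Toy model: maps a stack trace (outermost function first) to six counters."""

    def __init__(self):
        self.counts = defaultdict(lambda: dict.fromkeys(COUNTERS, 0))
        self.live = {}  # block id -> (allocating stack trace, size)

    def alloc(self, block, size, stack):
        c = self.counts[stack]
        c["curB"] += size
        c["curBk"] += 1
        c["totB"] += size
        c["totBk"] += 1
        self.live[block] = (stack, size)

    def free(self, block, stack):
        alloc_stack, size = self.live.pop(block)
        # The current counters belong to the stack trace that allocated the block.
        self.counts[alloc_stack]["curB"] -= size
        self.counts[alloc_stack]["curBk"] -= 1
        # The freed counters belong to the stack trace that released it.
        self.counts[stack]["totFdB"] += size
        self.counts[stack]["totFdBk"] += 1

    def inclusive(self, func):
        """Sum the counters of every stack trace that contains func."""
        total = dict.fromkeys(COUNTERS, 0)
        for stack, c in self.counts.items():
            if func in stack:
                for name in COUNTERS:
                    total[name] += c[name]
        return total


xt = HeapXTree()
xt.alloc("a", 100, ("main", "f1", "g11"))
xt.alloc("b", 40, ("main", "f2", "g2"))
xt.free("a", ("main", "freeY"))
print(xt.inclusive("main"))
```

For the whole program, this model keeps curB equal to totB minus totFdB, which agrees with the figures quoted for mfg: 770 - 242 = 528 bytes and 60 - 19 = 41 blocks.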