New in v2:
- install address map in a single shard
- update test gdb.mi/mi-sym-info.exp to cope with the fact that
different symbols could be returned when using --max-results
When playing with the .debug_names reader, I noticed it was
significantly slower than the DWARF scanner. Using a "performance"
build of GDB (with optimization, no runtime sanitizer enabled, etc), I
measure with the following command on a rather large debug info file
(~4 GB):
$ time ./gdb -q -nx --data-directory=data-directory <binary> -iex 'maint set dwarf sync on' -batch
This measures the time it takes for GDB to build the cooked index (plus
some startup and exit overhead). I have a version of the binary without
.debug_names and a version with .debug_names added using gdb-add-index.
The results are:
- without .debug_names: 7.5 seconds
- with .debug_names: 24 seconds
This is a bit embarrassing, given that the purpose of .debug_names is to
accelerate things :). The reason is that the .debug_names processing is
not parallelized at all, while the DWARF scanner is heavily
parallelized.
The process of creating the cooked index from .debug_names is roughly in
two steps:
1. scanning of .debug_names and creation of cooked index entries (see
mapped_debug_names_reader::scan_all_names)
2. finalization of the index, name canonicalization and sorting of the
entries (see cooked_index::set_contents).
This patch grabs a low hanging fruit by creating multiple cooked index
shards instead of a single one during step one. Just doing this allows
the second step of the processing to be automatically parallelized, as
each shard is sent to a separate thread to be finalized.
With this patch, I get:
- without .debug_names: 7.5 seconds
- with .debug_names: 9.7 seconds
Not as fast as we'd like, but it's an improvement.
The process of scanning .debug_names could also be parallelized to shave
off a few seconds. My profiling shows that out of those ~10 seconds of
excecution, about 6 are inside scan_all_names. Assuming perfect
parallelization with 8 threads, it means that at best we could shave
about 5 seconds from that time, which sounds interesting. I gave it a
shot, but it's a much more intrusive change, I'm not sure if I will
finish it.
This patch caused some regressions in gdb.mi/mi-sym-info.exp with the
cc-with-debug-names board, in the test about the `--max-results` switch.
It appears at this test is relying on the specific symbols returned when
using `--max-results`. As far as I know, we don't guarantee which
specific symbols are returned, so any of the matching symbols could be
returned.
The round robin method used in this patch to assign index entries to
shards ends up somewhat randomizing which CU gets expanded first during
the symbol search, and therefore which order they appear in the
objfile's CU list, and therefore which one gets searched first.
I meditated on whether keeping compunits sorted within objfiles would
help make things more stable and predictable. It would somewhat, but it
wouldn't remove all sources of randomness. It would still possible for
a call to `expand_symtabs_matching` to stop on the first hit. Which
compunit gets expanded then would still be dependent on the specific
`quick_symbol_functions` internal details / implementation.
Commit
5b99c5718f1c ("[gdb/testsuite] Fix various issues in
gdb.mi/mi-sym-info.exp") had already started to make the test a bit more
flexible in terms of which symbols it accepts, but with this patch, I
think it's possible to get wildly varying results. I therefore modified
the test to count the number of returned symbols, but not expect any
specific symbol.
Change-Id: Ifd39deb437781f72d224ec66daf6118830042941
Approved-By: Tom Tromey <tom@tromey.com>
std::vector<gdb::unique_xmalloc_ptr<char>> m_names;
};
+using cooked_index_shard_up = std::unique_ptr<cooked_index_shard>;
+
class cutu_reader;
/* An instance of this is created when scanning DWARF to create a
#include "read.h"
#include "stringify.h"
#include "extract-store-integer.h"
+#include "gdbsupport/thread-pool.h"
/* This is just like cooked_index_functions, but overrides a single
method so the test suite can distinguish the .debug_names case from
std::unordered_map<ULONGEST, index_val> abbrev_map;
- std::unique_ptr<cooked_index_shard> shard;
+ /* Even though the scanning of .debug_names and creation of the cooked index
+ entries is done serially, we create multiple shards so that the
+ finalization step can be parallelized. The shards are filled in a round
+ robin fashion. */
+ std::vector<cooked_index_shard_up> shards;
+
+ /* Next shard to insert an entry in. */
+ int next_shard = 0;
/* Maps entry pool offsets to cooked index entries. */
gdb::unordered_map<ULONGEST, cooked_index_entry *>
/* Skip if we couldn't find a valid CU/TU index. */
if (per_cu != nullptr)
{
- *result = shard->add (die_offset, (dwarf_tag) indexval.dwarf_tag, flags,
- lang, name, nullptr, per_cu);
+ *result
+ = shards[next_shard]->add (die_offset, (dwarf_tag) indexval.dwarf_tag,
+ flags, lang, name, nullptr, per_cu);
+
+ ++next_shard;
+ if (next_shard == shards.size ())
+ next_shard = 0;
+
entry_pool_offsets_to_entries.emplace (offset_in_entry_pool, *result);
}
complaint_handler.release (),
std::move (exceptions),
parent_map ());
- std::vector<std::unique_ptr<cooked_index_shard>> indexes;
- indexes.push_back (std::move (m_map.shard));
cooked_index *table
= (gdb::checked_static_cast<cooked_index *>
(per_bfd->index_table.get ()));
+
/* Note that this code never uses IS_PARENT_DEFERRED, so it is safe
to pass nullptr here. */
- table->set_contents (std::move (indexes), &m_warnings, nullptr);
+ table->set_contents (std::move (m_map.shards), &m_warnings, nullptr);
bfd_thread_cleanup ();
}
&addrmap, &warnings);
warnings.emit ();
- map.shard = std::make_unique<cooked_index_shard> ();
- map.shard->install_addrmap (&addrmap);
+ const auto n_workers
+ = std::max<std::size_t> (gdb::thread_pool::g_thread_pool->thread_count (),
+ 1);
+
+ /* Create as many index shard as there are worker threads. */
+ for (int i = 0; i < n_workers; ++i)
+ map.shards.emplace_back (std::make_unique<cooked_index_shard> ());
+
+ /* There is a single address map for the whole index (coming from
+ .debug_aranges). We only need to install it into a single shard for it to
+ get searched by cooked_index. */
+ map.shards[0]->install_addrmap (&addrmap);
auto cidn = (std::make_unique<cooked_index_debug_names>
(per_objfile, std::move (map)));
"120\\^done,symbols=\{debug=\\\[\{filename=\"\[^\"\]*$srcfile\",fullname=\"\[^\"\]+$srcfile\",symbols=\\\[$my_int_re\\\]\},\{filename=\"\[^\"\]*$srcfile2\",fullname=\"\[^\"\]+$srcfile2\",symbols=\\\[$another_int_re\\\]\}\\\]\}" \
"List all types matching _int_"
+# Return the number of matched symbols in the last match.
+
+proc count_symbol_matches { } {
+ # `0,string`, `1,string` and `2,string` respectively contain the
+ # command + result, command and result. The symbols match is at
+ # `3,string`.
+ return [regexp -all $::fun_re $::expect_out(3,string)]
+}
+
# Test the --max-results parameter.
mi_gdb_test "121-symbol-info-functions --max-results 0" \
"121\\^done,symbols=\{\}" \
"-symbol-info-functions --max-results 0"
mi_gdb_test "122-symbol-info-functions --max-results 1 --name ^\[^_\]" \
- "122\\^done,symbols=\{debug=\\\[\{filename=\"\[^\"\]*$srcfile2\",fullname=\"\[^\"\]+$srcfile2\",symbols=\\\[(?:$f2_re|$f3_re)\\\]\}\\\]\}" \
+ "122\\^done,($debug_only_syms)" \
"-symbol-info-functions --max-results 1"
+gdb_assert {[count_symbol_matches] == 1} "-symbol-info-functions --max-results 1, result count"
mi_gdb_test "123-symbol-info-functions --max-results 2 --name ^\[^_\]" \
- "123\\^done,symbols=\{debug=\\\[\{filename=\"\[^\"\]*$srcfile2\",fullname=\"\[^\"\]+$srcfile2\",symbols=\\\[$f2_re,$f3_re\\\]\}\\\]\}" \
+ "123\\^done,($debug_only_syms)" \
"-symbol-info-functions --max-results 2"
+gdb_assert {[count_symbol_matches] == 2} "-symbol-info-functions --max-results 2, result count"
mi_gdb_test "124-symbol-info-variables --max-results 3 --name ^\[^_\]" \
- "124\\^done,symbols=\{debug=\\\[\{filename=\"\[^\"\]*$srcfile2\",fullname=\"\[^\"\]+$srcfile2\",symbols=\\\[$global_f2_re,$global_i2_re,(?:$global_i1_re|$global_f1_s2_re)\\\]\}\\\]\}" \
- "-symbol-info-types --max-results 3"
+ "124\\^done,($debug_only_syms)" \
+ "-symbol-info-variables --max-results 3"
+gdb_assert {[count_symbol_matches] == 3} "-symbol-info-variables --max-results 3, result count"
mi_gdb_test "125-symbol-info-types --max-results 4 --name another_" \
"125\\^done,symbols=\{debug=\\\[\{filename=\"\[^\"\]*$srcfile2\",fullname=\"\[^\"\]+$srcfile2\",symbols=\\\[$another_char_re,$another_float_re,$another_int_re,$another_short_re\\\]\}\\\]\}" \