From: Jan Vrany
Date: Thu, 25 Dec 2025 18:12:06 +0000 (+0000)
Subject: Revert "gdb: change blockvector::contains() to handle blockvectors with "holes""
X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;ds=inline;p=thirdparty%2Fbinutils-gdb.git

Revert "gdb: change blockvector::contains() to handle blockvectors with "holes""

This reverts commit cc1fc6af4150b19f9c4c70d0463ff498703fb637, since it
causes a number of regressions that do not seem to be easily fixable.

The problem lies in the existence of "freestanding" code, that is, code
that is part of a CU but does not have any block associated with it.
Consider the following program:

    __asm__(
      ".type foo,@function \n"
      "foo: \n"
      " mov %rdi, %rax \n"
      " ret \n"
    );

    static int foo(int i);

    int
    main(int argc, char **argv)
    {
      return foo(argc);
    }

When compiled, the foo function has no block of its own:

    Blockvector:

    no map

    block #000, object at 0x55978957b510, 1 symbols in 0x1129..0x1148
     int main(int, char **); block object 0x55978957b380, 0x112d..0x1148 section .text
      block #001, object at 0x55978957b470 under 0x55978957b510, 2 symbols in 0x1129..0x1148
       typedef int int;
       typedef char char;
        block #002, object at 0x55978957b380 under 0x55978957b470, 2 symbols in 0x112d..0x1148, function main
         int argc; computed at runtime
         char **argv; computed at runtime

In this case lookup(0x1129) returns the static block and, because of the
change in cc1fc6af4, contains(0x1129) returns false, which is wrong.

Such "freestanding" code is perhaps not common but it does exist,
especially in system code.  In fact, the regressions were at least in
part caused by such "freestanding" code in glibc (libc_sigaction.c).

The whole idea of commit cc1fc6af4 was to handle "holes" in CUs, a case
where one CU spans multiple disjoint regions, possibly interleaved
with other CUs.
Consider a somewhat extreme case with two CUs:

    /* hole-1.c */
    int give_me_zero ();

    int
    main ()
    {
      return give_me_zero ();
    }

    /* hole-2.c */
    int
    __attribute__ ((section (".text_give_me_one")))
    __attribute__((noinline))
    baz ()
    {
      return 42;
    }

    __asm__(
      ".section .text_give_me_one,\"ax\",@progbits\n"
      ".type foo,@function \n"
      "foo: \n"
      " mov %rdi, %rax \n"
      " ret \n"
      " nop \n"
      " nop \n"
      " nop \n"
    );

    int
    __attribute__ ((section (".text_give_me_one")))
    __attribute__((noinline))
    give_me_one ()
    {
      return 1;
    }

    __asm__(
      ".section .text_give_me_zero,\"ax\",@progbits\n"
      "bar: \n"
      " jmp give_me_one \n"
      " nop \n"
      " nop \n"
      " nop \n"
    );

    int
    __attribute__ ((section (".text_give_me_zero")))
    give_me_zero ()
    {
      extern int bar();
      return give_me_one() - 1;
    }

When compiled with a carefully crafted linker script that forces code
to certain positions, this creates the following layout:

    0x080000..0x080007 # "freestanding" bar from hole-2.c
    0x080008..0x080016 # give_me_zero() from hole-2.c
    0x080109..0x080114 # main from hole-1.c
    0xf00000..0xf0000b # baz() from hole-2.c
    0xf0000b..0xf00011 # "freestanding" foo from hole-2.
    0xf0000b..0xf0001c # give_me_one() from hole-2.

The block vector for hole-1.c looks like this:

    Blockvector:

    no map

    block #000, object at 0x555a5d85fb90, 1 symbols in 0x80109..0x80114
     int main(void); block object 0x555a5d85faa0, 0x80109..0x80114 section .text
      block #001, object at 0x555a5d85faf0 under 0x555a5d85fb90, 1 symbols in 0x80109..0x80114
       typedef int int;
        block #002, object at 0x555a5d85faa0 under 0x555a5d85faf0, 0 symbols in 0x80109..0x80114, function main

And for hole-2.c:

    Blockvector:

    map
      0x0 -> 0x0
      0x80008 -> 0x555a5d85ff50
      0x80016 -> 0x0
      0xf00000 -> 0x555a5d860280
      0xf0000b -> 0x0
      0xf00012 -> 0x555a5d860110
      0xf0001d -> 0x0

    block #000, object at 0x555a5d8603b0, 3 symbols in 0x80008..0xf0001d
     int give_me_zero(void); block object 0x555a5d85ff50, 0x80008..0x80016 section .text
     int give_me_one(void); block object 0x555a5d860110, 0xf00012..0xf0001d section .text
     int baz(void); block object 0x555a5d860280, 0xf00000..0xf0000b section .text
      block #001, object at 0x555a5d8602d0 under 0x555a5d8603b0, 1 symbols in 0x80008..0xf0001d
       typedef int int;
        block #002, object at 0x555a5d85ff50 under 0x555a5d8602d0, 0 symbols in 0x80008..0x80016, function give_me_zero
        block #003, object at 0x555a5d860280 under 0x555a5d8602d0, 0 symbols in 0xf00000..0xf0000b, function baz
        block #004, object at 0x555a5d860110 under 0x555a5d8602d0, 0 symbols in 0xf00012..0xf0001d, function give_me_one

Note that despite the fact that "freestanding" bar belongs to hole-2.c,
the corresponding CU's global and static blocks start at 0x80008!

Looking at the DWARF for the second program, it looks like the compiler
(GCC 15) did not record the presence of "freestanding" code:

     <0><71>: Abbrev Number: 1 (DW_TAG_compile_unit)
        <72>   DW_AT_producer    : (indirect string, offset: 0): GNU C23 15.2.0 -mtune=generic -march=x86-64 -g -fasynchronous-unwind-tables
        <76>   DW_AT_language    : 29 (C11)
        <77>   Unknown AT value: 90: 3
        <78>   Unknown AT value: 91: 0x31647
        <7c>   DW_AT_name        : (indirect line string, offset: 0x2d): hole-2.c
        <80>   DW_AT_comp_dir    : (indirect line string, offset: 0): test_programs
        <84>   DW_AT_ranges      : 0xc
        <88>   DW_AT_low_pc      : 0
        <90>   DW_AT_stmt_list   : 0x51

and the corresponding part of .debug_aranges:

      Length:                   76
      Version:                  2
      Offset into .debug_info:  0x65
      Pointer Size:             8
      Segment Size:             0

        Address            Length
        0000000000f00000 000000000000000b
        0000000000f00012 000000000000000b
        0000000000080008 000000000000000e
        0000000000000000 0000000000000000

Thiago suggested using minsymbols to tell whether a CU contains a given
address.  I do not think this would work reliably, as minsymbols do not
know to which CU they belong.  In the slightly more complicated case of
interleaved CUs it does not seem possible to tell for sure to which
one a given minsymbol belongs.

Moreover, Tom suggested that the comment in
find_compunit_symtab_for_pc_sect (which led to cc1fc6af4) may be
outdated [2].

Given all that, I'm just reverting the change.

[1]: https://sourceware.org/bugzilla/show_bug.cgi?id=33679#c13
[2]: https://inbox.sourceware.org/gdb-patches/87cy6xzd3j.fsf@tromey.com/

Approved-By: Tom Tromey
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=33679
---

diff --git a/gdb/block-selftests.c b/gdb/block-selftests.c
index f5883f18660..19e2a6d8db3 100644
--- a/gdb/block-selftests.c
+++ b/gdb/block-selftests.c
@@ -100,13 +100,18 @@ test_blockvector_lookup_contains ()
   SELF_CHECK (bv->contains (0x1500) == true);
 
   /* Test address falling into a "hole". If BV has an address map,
-     lookup () returns nullptr. If not, lookup () return static block.
-     contains() returns false in both cases. */
+     lookup () returns nullptr and contains (). returns false. If not,
+     lookup () return static block and contains() returns true. */
   if (with_map)
-    SELF_CHECK (bv->lookup (0x2500) == nullptr);
+    {
+      SELF_CHECK (bv->lookup (0x2500) == nullptr);
+      SELF_CHECK (bv->contains (0x2500) == false);
+    }
   else
-    SELF_CHECK (bv->lookup (0x2500) == bv->block (STATIC_BLOCK));
-  SELF_CHECK (bv->contains (0x2500) == false);
+    {
+      SELF_CHECK (bv->lookup (0x2500) == bv->block (STATIC_BLOCK));
+      SELF_CHECK (bv->contains (0x2500) == true);
+    }
 
   /* Test address falling into a block above the "hole". */
   SELF_CHECK (bv->lookup (0x3500) == bv->block (3));
diff --git a/gdb/block.c b/gdb/block.c
index 3d2c51cc554..e21580bcf63 100644
--- a/gdb/block.c
+++ b/gdb/block.c
@@ -864,34 +864,7 @@ blockvector::lookup (CORE_ADDR addr) const
 bool
 blockvector::contains (CORE_ADDR addr) const
 {
-  auto b = lookup (addr);
-  if (b == nullptr)
-    return false;
-
-  /* Handle the case that the blockvector has no address map but still has
-     "holes". For example, consider the following blockvector:
-
-        B0 0x1000 - 0x4000 (global block)
-         B1 0x1000 - 0x4000 (static block)
-          B3 0x1000 - 0x2000
-             (hole)
-          B4 0x3000 - 0x4000
-
-     In this case, the above blockvector does not contain address 0x2500 but
-     lookup (0x2500) would return the blockvector's static block.
-
-     So here we check if the returned block is a static block and if yes, still
-     return false. However, if the blockvector contains no blocks other than
-     the global and static blocks and ADDR falls into the static block,
-     conservatively return true.
-
-     See comment in find_compunit_symtab_for_pc_sect, symtab.c.
-
-     Also, note that if the blockvector in the above example would contain
-     an address map, then lookup (0x2500) would return NULL instead of
-     the static block.
-  */
-  return b != static_block () || num_blocks () == 2;
+  return lookup (addr) != nullptr;
 }
 
 /* See block.h. */
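
The behavioural difference between the reverted and the restored contains () can be illustrated with a small standalone model. This is only a sketch and not GDB code: toy_block, toy_blockvector, contains_before_revert, contains_after_revert and freestanding_example are hypothetical names, the model covers only a blockvector without an address map, and it assumes that later blocks in the vector are nested more deeply than earlier ones. The addresses are taken from the first "freestanding" example in the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical, simplified stand-ins for GDB's block and blockvector.
using addr_t = std::uint64_t;

struct toy_block
{
  addr_t start;  // inclusive
  addr_t end;    // exclusive
};

struct toy_blockvector
{
  // blocks[0] models the global block, blocks[1] the static block,
  // and any further entries model function blocks.
  std::vector<toy_block> blocks;

  // Return the index of the last block whose range covers ADDR, or
  // nothing if no block covers it.  No address map is modelled.
  std::optional<std::size_t> lookup (addr_t addr) const
  {
    std::optional<std::size_t> best;
    for (std::size_t i = 0; i < blocks.size (); ++i)
      if (addr >= blocks[i].start && addr < blocks[i].end)
        best = i;
    return best;
  }

  // Logic modelled on the code removed by this revert: a hit on the
  // static block only counts if there are no function blocks at all.
  bool contains_before_revert (addr_t addr) const
  {
    auto b = lookup (addr);
    if (!b.has_value ())
      return false;
    return *b != 1 || blocks.size () == 2;
  }

  // Logic modelled on the restored code: any hit counts.
  bool contains_after_revert (addr_t addr) const
  {
    return lookup (addr).has_value ();
  }
};

// Blockvector from the first example: global and static blocks span
// 0x1129..0x1148, main spans 0x112d..0x1148, and the "freestanding"
// foo at 0x1129 has no block of its own.
toy_blockvector freestanding_example ()
{
  toy_blockvector bv;
  bv.blocks = { {0x1129, 0x1148}, {0x1129, 0x1148}, {0x112d, 0x1148} };
  return bv;
}

// Small self-check of the model against the commit message.
void check_freestanding_example ()
{
  toy_blockvector bv = freestanding_example ();
  assert (bv.lookup (0x1129).value () == 1);     // static block
  assert (!bv.contains_before_revert (0x1129));  // the regression
  assert (bv.contains_after_revert (0x1129));    // restored behaviour
}
```

Calling check_freestanding_example () from a test driver exercises the case described above: the address of foo resolves to the static block, the pre-revert logic rejects it, and the restored logic accepts it.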