Simon Marchi [Wed, 9 Dec 2020 19:49:02 +0000 (14:49 -0500)]
gdb: address review comments of previous series
I forgot to include fixes for review comments I got before pushing the
previous commits (or I pushed the wrong commits). This one fixes it.
- Return {} instead of false in get_discrete_low_bound and
get_discrete_high_bound.
- Compute high bound after confirming low bound is valid in
get_discrete_bounds.
gdb/ChangeLog:
* gdbtypes.c (get_discrete_low_bound, get_discrete_high_bound):
Return {} instead of false.
(get_discrete_bounds): Compute high bound only if low bound is
valid.
Simon Marchi [Wed, 9 Dec 2020 18:52:12 +0000 (13:52 -0500)]
gdb: fix value_subscript when array upper bound is not known
Since commit 7c6f27129631 ("gdb: make get_discrete_bounds check for
non-constant range bounds"), subscripting flexible array member fails:
struct no_size
{
int n;
int items[];
};
(gdb) p *ns
$1 = {n = 3, items = 0x5555555592a4}
(gdb) p ns->items[0]
Cannot access memory at address 0xfffe555b733a0164
(gdb) p *((int *) 0x5555555592a4)
$2 = 101 <--- we would expect that
(gdb) p &ns->items[0]
$3 = (int *) 0xfffe5559ee829a24 <--- wrong address
Since the flexible array member (items) has an unspecified size, the array type
created for it in the DWARF doesn't have dimensions (this is with gcc 9.3.0,
Ubuntu 20.04):
This causes GDB to create a range type (TYPE_CODE_RANGE) with a defined
constant low bound (dynamic_prop with kind PROP_CONST) and an undefined
high bound (dynamic_prop with kind PROP_UNDEFINED).
value_subscript gets both bounds of that range using
get_discrete_bounds. Before commit 7c6f27129631, get_discrete_bounds
didn't check the kind of the dynamic_props and would just blindly read
them as if they were PROP_CONST. It would return 0 for the high bound,
because we zero-initialize the range_bounds structure. And it didn't
really matter in this case, because the returned high bound wasn't used
in the end.
Commit 7c6f27129631 changed get_discrete_bounds to return a failure if
either the low or high bound is not a constant, to make sure we don't
read a dynamic prop that isn't a PROP_CONST as a PROP_CONST. This
change made get_discrete_bounds start to return a failure for that
range, and as a result would not set *lowp and *highp. And since
value_subscript doesn't check get_discrete_bounds' return value, it just
carries on an uses an uninitialized value for the low bound. If
value_subscript did check the return value of get_discrete_bounds, we
would get an error message instead of a bogus value. But it would still
be a bug, as we wouldn't be able to print the flexible array member's
elements.
Looking at value_subscript, we see that the low bound is always needed,
but the high bound is only needed if !c_style. So, change
value_subscript to use get_discrete_low_bound and
get_discrete_high_bound separately. This fixes the case described
above, where the low bound is known but the high bound isn't (and is not
needed). This restores the original behavior without accessing a
dynamic_prop in a wrong way.
A test is added. In addition to the case described above, a case with
an array member of size 0 is added, which is a GNU C extension that
existed before flexible array members were introduced. That case
currently fails when compiled with gcc <= 8. gcc <= 8 produces DWARF
similar to the one shown above, while gcc 9 adds a DW_AT_count of 0 in
there, which makes the high bound known. A case where an array member
of size 0 is the only member of the struct is also added, as that was
how PR 28675 was originally reported, and it's an interesting corner
case that I think could trigger other funny bugs.
Question about the implementation: in value_subscript, I made it such
that if the low or high bound is unknown, we fall back to zero. That
effectively makes it the same as it was before 7c6f27129631. But should
we instead error() out?
gdb/ChangeLog:
PR 26875, PR 26901
* gdbtypes.c (get_discrete_low_bound): Make non-static.
(get_discrete_high_bound): Make non-static.
* gdbtypes.h (get_discrete_low_bound): New declaration.
(get_discrete_high_bound): New declaration.
* valarith.c (value_subscript): Only fetch high bound if
necessary.
gdb/testsuite/ChangeLog:
PR 26875, PR 26901
* gdb.base/flexible-array-member.c: New test.
* gdb.base/flexible-array-member.exp: New test.
Simon Marchi [Wed, 9 Dec 2020 18:52:03 +0000 (13:52 -0500)]
gdb: split get_discrete_bounds in two
get_discrete_bounds is not flexible for ranges (TYPE_CODE_RANGE), in the
sense that it returns true (success) only if both bounds are present and
constant values.
This is a problem for code that only needs to know the low bound and
fails unnecessarily if the high bound is unknown.
Split the function in two, get_discrete_low_bound and
get_discrete_high_bound, that both return an optional. Provide a new
implementation of get_discrete_bounds based on the two others, so the
callers don't have to be changed.
gdb/ChangeLog:
* gdbtypes.c (get_discrete_bounds): Implement with
get_discrete_low_bound and get_discrete_high_bound.
(get_discrete_low_bound): New.
(get_discrete_high_bound): New.
Simon Marchi [Wed, 9 Dec 2020 18:51:57 +0000 (13:51 -0500)]
gdb: make get_discrete_bounds return bool
get_discrete_bounds currently has three possible return values (see its
current doc for details). It appears that for all callers, it would be
sufficient to have a boolean "worked" / "didn't work" return value.
Change the return type of get_discrete_bounds to bool and adjust all
callers. Doing so simplifies the following patch.
Simon Marchi [Wed, 9 Dec 2020 18:51:45 +0000 (13:51 -0500)]
gdb: make discrete_position return optional
Instead of returning a boolean status and returning the value through a
pointer, return an optional that does both jobs. This helps in the
following patches, and I think it is an improvement in general.
Giancarlo Frix [Sun, 6 Dec 2020 10:30:53 +0000 (14:30 +0400)]
s390: Fix BC instruction breakpoint handling
This fixes a long-lived bug in the s390 port.
When trying to step over a breakpoint set on a BC (branch on condition)
instruction with displaced stepping on IBM Z, gdb would incorrectly
adjust the pc regardless of whether or not the branch was taken. Since
the branch target is an absolute address, this would cause the inferior
to jump around wildly whenever the branch was taken, either crashing it
or causing it to behave unpredictably.
It turns out that the logic to handle BC instructions correctly was in
the code, but that the enum value representing its opcode has always
been incorrect.
This patch corrects the enum value to the actual opcode, fixing the
stepping problem. The enum value is also used in the prologue analysis
code, so this also fixes a minor bug where more of the prologue would
be read than was necessary.
gdb/ChangeLog:
PR breakpoints/27009
* s390-tdep.h (op_bc): Correct BC opcode value.
Shahab Vahedi [Thu, 12 Nov 2020 11:50:33 +0000 (12:50 +0100)]
arc: Write correct "eret" value during register collection
In collect_register() function of arc-linux-tdep.c, the "eret"
(exception return) register value was not being reported correctly.
This patch fixes that.
Background:
When asked for the "pc" value, we have to update the "eret" register
with GDB's STOP_PC. The "eret" instructs the kernel code where to
jump back when an instruction has stopped due to a breakpoint. This
is how collect_register() was doing so:
--------------8<--------------
if (regnum == gdbarch_pc_regnum (gdbarch))
regnum = ARC_ERET_REGNUM;
regcache->raw_collect (regnum, buf + arc_linux_core_reg_offsets[regnum]);
-------------->8--------------
Root cause:
Although this is using the correct offset (ERET register's), it is also
changing the REGNUM itself. Therefore, raw_collect (regnum, ...) is
not reading from "pc" anymore.
v2:
- Fix a copy/paste issue as rightfully addressed by Tom [1].
Andrew Burgess [Wed, 16 Sep 2020 09:12:39 +0000 (10:12 +0100)]
opcodes/csky: return the default disassembler when there is no bfd
The disassembler function should return a valid disassembler function
even when there is no BFD present. This is implied (I believe) by the
comment in dis-asm.h which says the BFD may be NULL. Further, it
makes sense when considering that the disassembler is used in GDB, and
GDB may connect to a target and perform debugging even without a BFD
being supplied.
This commit makes the csky_get_disassembler function return the
default disassembler configuration when no bfd is supplied, this is
the same default configuration as is used when a BFD is supplied, but
the BFD has no attributes section.
Before the change configuring GDB with --enable-targets=all and
running the tests gdb.base/all-architectures-2.exp results in many
errors, but after this change there are no failures.
opcodes/ChangeLog:
* csky-dis.c (csky_get_disassembler): Don't return NULL when there
is no BFD.
The crash happens here, where htab (a dwarf2_cu::die_hash field) is
unexpectedly NULL while generating partial symbols:
#0 0x000055555fa28188 in htab_find_with_hash (htab=0x0, element=0x7fffffffbfa0, hash=27) at /home/simark/src/binutils-gdb/libiberty/hashtab.c:591
#1 0x000055555cb4eb2e in follow_die_offset (sect_off=(unknown: 27), offset_in_dwz=0, ref_cu=0x7fffffffc110) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:22951
#2 0x000055555cb4edfb in follow_die_ref (src_die=0x0, attr=0x7fffffffc130, ref_cu=0x7fffffffc110) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:22968
#3 0x000055555caa48c5 in partial_die_full_name (pdi=0x621000157e70, cu=0x615000023f80) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:8441
#4 0x000055555caa4d79 in add_partial_symbol (pdi=0x621000157e70, cu=0x615000023f80) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:8469
#5 0x000055555caa7d8c in add_partial_subprogram (pdi=0x621000157e70, lowpc=0x7fffffffc5c0, highpc=0x7fffffffc5e0, set_addrmap=1, cu=0x615000023f80) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:8737
#6 0x000055555caa265c in scan_partial_symbols (first_die=0x621000157e00, lowpc=0x7fffffffc5c0, highpc=0x7fffffffc5e0, set_addrmap=1, cu=0x615000023f80) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:8230
#7 0x000055555ca98e3f in process_psymtab_comp_unit_reader (reader=0x7fffffffc6b0, info_ptr=0x60600009650d "\003int", comp_unit_die=0x621000157d10, pretend_language=language_minimal) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:7614
#8 0x000055555ca9aa2c in process_psymtab_comp_unit (this_cu=0x621000155510, per_objfile=0x613000009f80, want_partial_unit=false, pretend_language=language_minimal) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:7712
#9 0x000055555caa051a in dwarf2_build_psymtabs_hard (per_objfile=0x613000009f80) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:8073
The special thing about this DWARF is that the subprogram at 0x1b is a
template specialization described with DW_AT_specification, and has no
DW_AT_name in itself. To compute the name of this subprogram,
partial_die_full_name needs to load the full DIE for this partial DIE.
The name is generated from the templated function name and the actual
tempalate parameter values of the specialization.
To load the full DIE, partial_die_full_name creates a dummy DWARF
attribute of form DW_FORM_ref_addr that points to our subprogram's DIE,
and calls follow_die_ref on it. This eventually causes
load_full_comp_unit to be called for the exact same CU we are currently
making partial symbols for:
#0 load_full_comp_unit (this_cu=0x621000155510, per_objfile=0x613000009f80, skip_partial=false, pretend_language=language_minimal) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:9238
#1 0x000055555cb4e943 in follow_die_offset (sect_off=(unknown: 27), offset_in_dwz=0, ref_cu=0x7fffffffc110) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:22942
#2 0x000055555cb4edfb in follow_die_ref (src_die=0x0, attr=0x7fffffffc130, ref_cu=0x7fffffffc110) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:22968
#3 0x000055555caa48c5 in partial_die_full_name (pdi=0x621000157e70, cu=0x615000023f80) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:8441
#4 0x000055555caa4d79 in add_partial_symbol (pdi=0x621000157e70, cu=0x615000023f80) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:8469
#5 0x000055555caa7d8c in add_partial_subprogram (pdi=0x621000157e70, lowpc=0x7fffffffc5c0, highpc=0x7fffffffc5e0, set_addrmap=1, cu=0x615000023f80) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:8737
#6 0x000055555caa265c in scan_partial_symbols (first_die=0x621000157e00, lowpc=0x7fffffffc5c0, highpc=0x7fffffffc5e0, set_addrmap=1, cu=0x615000023f80) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:8230
#7 0x000055555ca98e3f in process_psymtab_comp_unit_reader (reader=0x7fffffffc6b0, info_ptr=0x60600009650d "\003int", comp_unit_die=0x621000157d10, pretend_language=language_minimal) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:7614
#8 0x000055555ca9aa2c in process_psymtab_comp_unit (this_cu=0x621000155510, per_objfile=0x613000009f80, want_partial_unit=false, pretend_language=language_minimal) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:7712
#9 0x000055555caa051a in dwarf2_build_psymtabs_hard (per_objfile=0x613000009f80) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:8073
load_full_comp_unit creates a cutu_reader for the CU. Since a dwarf2_cu
object already exists for the CU, load_full_comp_unit is expected to
find it and pass it to cutu_reader, so that cutu_reader doesn't create
a new dwarf2_cu for the CU.
And this is the difference between before and after the regression.
Before commit 7188ed02d2a7, the dwarf2_per_cu_data -> dwarf2_cu link was
a simple pointer in dwarf2_per_cu_data. This pointer was set up when
starting the read the partial symbols. So it was already available at
that point where load_full_comp_unit gets called. Post-7188ed02d2a7,
this link is per-objfile, kept in the dwarf2_per_objfile::m_dwarf2_cus
hash map. The entry is only put in the hash map once the partial
symbols have been successfully read, when cutu_reader::keep is called.
Therefore, it is _not_ set at the point load_full_comp_unit is called.
As a consequence, a new dwarf2_cu object gets created and initialized by
load_full_comp_unit (including initializing that dwarf2_cu::die_hash
field). Meanwhile, the dwarf2_cu object created and used by the callers
up the stack does not get initialized for full symbol reading, and the
dwarf2_cu::die_hash field stays unexpectedly NULL.
Solution
--------
Since the caller of load_full_comp_unit knows about the existing
dwarf2_cu object for the CU we are reading (the one load_full_comp_unit
is expected to find), we can simply make it pass it down, instead of
having load_full_comp_unit look up the per-objfile map.
load_full_comp_unit therefore gets a new `existing_cu` parameter. All
other callers get updated to pass `per_objfile->get_cu (per_cu)`, so the
behavior shouldn't change for them, compared to the current HEAD.
A test is added, which is the bare minimum to reproduce the issue.
Notes
-----
The original problem was reproduced by downloading
and loading libtbb.so in GDB. This code was compiled with the Intel
C/C++ compiler. I was not able to reproduce the issue using GCC, I
think because GCC puts a DW_AT_name in the specialized subprogram, so
there's no need for partial_die_full_name to load the full DIE of the
subprogram, and the faulty code doesn't execute.
Mihails Strasuns [Wed, 14 Oct 2020 08:44:36 +0000 (10:44 +0200)]
gdb: get jiter objfile from a bound minsym
This fixes a regression introduced by the following commit:
fe053b9e853 gdb/jit: pass the jiter objfile as an argument to jit_event_handler
In the refactoring `handle_jit_event` function was changed to pass a matching
objfile pointer to the `jit_event_handler` explicitly, rather using internal
storage:
This was needed to add support for multiple jiters. However it has also
introduced a regression, because `get_frame_function (frame)` here may
return `nullptr`, resulting in a crash.
A more resilient way would be to use an approach mirroring
`jit_breakpoint_re_set` - to find a minimal symbol matching the
breakpoint location and use its object file. We know that this
breakpoint event comes from a breakpoint set by `jit_breakpoint_re_set`,
thus using the reverse approach should be reliable enough.
gdb/Changelog:
2020-10-14 Mihails Strasuns <mihails.strasuns@intel.com>
* breakpoint.c (handle_jit_event): Add an argument, change how
`jit_event_handler` is called.
Simon Marchi [Tue, 13 Oct 2020 16:01:19 +0000 (12:01 -0400)]
gdb: don't pass TARGET_WNOHANG to targets that can't async (PR 26642)
Debugging with "maintenance set target-async off" on Linux has been
broken since 5b6d1e4fa4f ("Multi-target support").
The issue is easy to reproduce:
$ ./gdb -q --data-directory=data-directory -nx ./test
Reading symbols from ./test...
(gdb) maintenance set target-async off
(gdb) start
Temporary breakpoint 1 at 0x1151: file test.c, line 5.
Starting program: /home/simark/build/binutils-gdb/gdb/test
... and it hangs there.
The difference between pre-5b6d1e4fa4f and 5b6d1e4fa4f is that
fetch_inferior_event now calls target_wait with TARGET_WNOHANG for
non-async-capable targets, whereas it didn't before.
For non-async-capable targets, this is how it's expected to work when
resuming execution:
1. we call resume
2. the infrun async handler is marked in prepare_to_wait, to immediately
wake up the event loop when we get back to it
3. fetch_inferior_event calls the target's wait method without
TARGET_WNOHANG, effectively blocking until the target has something
to report
However, since we call the target's wait method with TARGET_WNOHANG,
this happens:
1. we call resume
2. the infrun async handler is marked in prepare_to_wait, to immediately
wake up the event loop when we get back to it
3. fetch_inferior_event calls the target's wait method with
TARGET_WNOHANG, the target has nothing to report yet
4. we go back to blocking on the event loop
5. SIGCHLD finally arrives, but the event loop is not woken up, because
we are not in async mode. Normally, we should have been stuck in
waitpid the SIGCHLD would have unblocked us.
We end up in this situation because these two necessary conditions are
met:
1. GDB uses the TARGET_WNOHANG option with a target that can't do async.
I don't think this makes sense. I mean, it's technically possible,
the doc for TARGET_WNOHANG is:
/* Return immediately if there's no event already queued. If this
options is not requested, target_wait blocks waiting for an
event. */
TARGET_WNOHANG = 1,
... which isn't in itself necessarily incompatible with synchronous
targets. It could be possible for a target to support non-blocking
polls, while not having a way to asynchronously wake up the event
loop, which is also necessary to support async. But as of today,
we don't expect GDB and sync targets to work this way.
2. The linux-nat target, even in the mode where it emulates a
synchronous target (with "maintenance set target-async off") respects
TARGET_WNOHANG. Other non-async targets, such as windows_nat_target,
simply don't check / support TARGET_WNOHANG, so their wait method is
always blocking.
Fix the first issue by avoiding using TARGET_WNOHANG on non-async
targets, in do_target_wait_1. Add an assert in target_wait to verify it
doesn't happen.
The new test gdb.base/maint-target-async-off.exp is a simple test that
just tries running to main and then to the end of the program, with
"maintenance set target-async off".
gdb/ChangeLog:
PR gdb/26642
* infrun.c (do_target_wait_1): Clear TARGET_WNOHANG if the
target can't do async.
* target.c (target_wait): Assert that we don't pass
TARGET_WNOHANG to a target that can't async.
gdb/testsuite/ChangeLog:
PR gdb/26642
* gdb.base/maint-target-async-off.c: New test.
* gdb.base/maint-target-async-off.exp: New test.