gdb: ensure normal stop finishes the thread state of all threads
This patch fixes a multi-target issue where the normal_stop function
can fail to finish the thread state of threads from a non current
target, this leaves the threads marked as running in GDB core, while
the threads is actually stopped.
For testing I used this test program:
#include <unistd.h>
int
main ()
{
while (1)
sleep (1);
return 0;
}
Compile this to make '/tmp/spin', then the bug can be shown using this
command:
$ gdb -ex 'file /tmp/spin' \
-ex 'start' \
-ex 'add-inferior' \
-ex 'inferior 2' \
-ex 'set sysroot' \
-ex 'target extended-remote | gdbserver --multi --once - /tmp/spin' \
-ex 'inferior 1' \
-ex 'continue&' \
-ex 'inferior 2' \
-ex 'search sleep' \
-ex 'break $_ inferior 2' \
-ex 'continue' \
-ex 'info threads'
The interesting part of the output is:
Id Target Id Frame
1.1 process
1610445 "spin" (running)
* 2.1 Thread
1610451.
1610451 "spin" main () at spin.c:7
(gdb)
Notice that thread 1.1 is marked as running when it should be
stopped. We can see that the thread is actually stopped if we try
this:
(gdb) inferior 1
[Switching to inferior 1 [process
1610445] (/tmp/spin)]
[Switching to thread 1.1 (process
1610445)](running)
(gdb) continue
Cannot execute this command while the selected thread is running.
(gdb) interrupt
(gdb) info threads
Id Target Id Frame
* 1.1 process
1610445 "spin" (running)
2.1 Thread
1610451.
1610451 "spin" main () at spin.c:7
(gdb)
We can see the expected behaviour if both inferiors run on the same
target, like this:
$ gdb -ex 'file /tmp/spin' \
-ex 'start' \
-ex 'add-inferior' \
-ex 'inferior 2' \
-ex 'file /tmp/spin' \
-ex 'start' \
-ex 'inferior 1' \
-ex 'continue&' \
-ex 'inferior 2' \
-ex 'search sleep' \
-ex 'break $_ inferior 2' \
-ex 'continue' \
-ex 'info threads'
The 'info threads' from this series of commands looks like this:
Id Target Id Frame
1.1 process
1611589 "spin" 0x00007ffff7e951e7 in nanosleep () from /lib64/libc.so.6
* 2.1 process
1611593 "spin" main () at spin.c:7
(gdb)
Now both threads are stopped as we'd expect.
The problem is in normal_stop. The scoped_finish_thread_state uses
user_visible_resume_target to select the target(s) over which GDB will
iterate to find the threads to update.
The problem with this is that when the ptid_t is minus_one_ptid,
meaning all threads, user_visible_resume_target only returns nullptr,
meaning all targets, when sched_multi is true.
This dependency on sched_multi makes sense when _resuming_ threads.
If we are resuming all threads, then when sched_multi (the
schedule-multiple setting) is off (the default), all threads actually
means all threads in the current inferior only. When sched_multi is
true (schedule-multiple is on) then this means all threads, from all
inferiors, which means GDB needs to consider every target.
However, when stopping an inferior in all-stop mode (non_stop is
false), then GDB wants to stop all threads from all inferiors,
regardless of the sched_multi setting.
What this means is that, when 'non_stop' is false, then we should be
passing nullptr as the target selection to scoped_finish_thread_state.
My proposal is that we should stop using user_visible_resume_target in
the normal_stop function for the target selection of the
scoped_finish_thread_state, instead we should manually figure out the
correct target value and pass this in.
There is precedent for this in GDB, see run_command_1, where
'finish_target' is calculated directly within the function rather than
using user_visible_resume_target.
After this commit, when using two different targets (native and
remote) as in my first example above, both threads will be correctly
stopped.