On Windows, a ROCgdb downstream testcase
(gdb.rocm/register-watchpoint.exp, which we can't upstream yet due to
missing support for DWARF 6 features upstream) currently fails with a
timeout, like so:
  (gdb) PASS: gdb.rocm/register-watchpoint.exp: continue to breakpoint: bit_extract_kernel
  watch $s32
  Watchpoint 3: $s32
  (gdb) PASS: gdb.rocm/register-watchpoint.exp: watchpoint on a stack pointer of the first wave
  continue
  Continuing.
  FAIL: gdb.rocm/register-watchpoint.exp: continue (timeout)
Running the test manually with some extra logging, we see:
  [infrun] stop_all_threads: 6/7 waits_needed << extra
  [amd-dbgapi] wait: ptid = -1.0.0
  [windows events] get_windows_debug_event: kernel event for pid=7036 tid=0x4e8 code=EXCEPTION_DEBUG_EVENT
  [windows events] get_windows_debug_event: get_windows_debug_event - unexpected stop in suspended thread 0x4e8
  [windows events] continue_last_debug_event: ContinueDebugEvent (cpid=7036, ctid=0x4e8, DBG_REPLY_LATER)
  [windows events] wait: get_windows_debug_event returned [0.0.0 : status->kind = IGNORE, fake=0]
  [infrun] print_target_wait_results: target_wait (-1.0.0 [process -1], status) =
  [infrun] print_target_wait_results: 0.0.0 [process 0],
  [infrun] print_target_wait_results: status->kind = IGNORE
  [infrun] print_target_wait_results: from target 1 (native)
  [infrun] wait_one: about to block in interruptible_select << extra
So we're in stop_all_threads, and we've pulled the stop events for all
CPU threads already, but then we hang in interruptible_select waiting
for the last stop event, which happens to be for the GPU wave.
In wait_one, before the interruptible_select call, we poll events from
the target, via target_wait with WNOHANG, and so we get here:
  ptid_t
  amd_dbgapi_target::wait (ptid_t ptid, struct target_waitstatus *ws,
                           target_wait_flags target_options)
  {
    ...
    ptid_t event_ptid = beneath ()->wait (ptid, ws, target_options);
    if (event_ptid != minus_one_ptid)
      {
        ...
        return event_ptid;
      }
    ... handle dbgapi events ...
So above, we call the beneath target's wait first. On Windows, that
may hit the "get_windows_debug_event - unexpected stop in suspended
thread 0x4e8" path, which makes windows_nat_target::wait return
TARGET_WAITKIND_IGNORE. The Windows target pairs that status with
event_ptid == ptid_t(0,0,0) rather than minus_one_ptid, though, so the
"if" branch is taken and we return the TARGET_WAITKIND_IGNORE to the
core without looking for dbgapi events. At this point, the event for
the wave stop has already been flushed from the dbgapi library into
amd-dbgapi-target's local event queue, so wait_one ends up hanging in
interruptible_select, which results in the timeout observed.
Nothing specifies that TARGET_WAITKIND_IGNORE must be returned with
minus_one_ptid, and infrun never looks at the event ptid if the status
is TARGET_WAITKIND_IGNORE. So fix this by tweaking
amd_dbgapi_target::wait to not assume that either, and to check the
wait status kind instead of the event ptid.
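As a rough illustration (this is not GDB code), the difference between
the old and the new check can be modeled with simplified stand-in
types. Everything below (wait_kind, ptid, wait_result,
old_check_returns_early, new_check_returns_early, windows_ignore) is
hypothetical and only mirrors the names used in this message:

```cpp
// Simplified, hypothetical stand-ins for GDB's target_waitkind and ptid_t.
enum class wait_kind { stopped, exited, signalled, ignore, no_resumed };

struct ptid
{
  int pid;
  long lwp;
  long tid;

  constexpr bool operator== (const ptid &o) const
  { return pid == o.pid && lwp == o.lwp && tid == o.tid; }

  constexpr bool operator!= (const ptid &o) const
  { return !(*this == o); }
};

// Models minus_one_ptid.
constexpr ptid minus_one {-1, 0, 0};

// What the beneath target's wait reports: an event ptid plus a status kind.
struct wait_result
{
  ptid event_ptid;
  wait_kind kind;
};

// Old check: keys off the event ptid, so it assumes that "no event"
// always comes paired with minus_one.
constexpr bool
old_check_returns_early (const wait_result &r)
{
  return r.event_ptid != minus_one;
}

// New check: keys off the status kind only, whatever ptid comes with it.
constexpr bool
new_check_returns_early (const wait_result &r)
{
  return (r.kind != wait_kind::no_resumed
          && r.kind != wait_kind::ignore);
}

// The case from the log above: IGNORE paired with ptid (0, 0, 0).
constexpr wait_result windows_ignore {ptid {0, 0, 0}, wait_kind::ignore};

// The old check returns early and never reaches the dbgapi event queue.
static_assert (old_check_returns_early (windows_ignore),
               "old check treats IGNORE with ptid 0.0.0 as a real event");

// The new check falls through, so the dbgapi events get looked at.
static_assert (!new_check_returns_early (windows_ignore),
               "new check falls through on IGNORE");
```

In this model, a result that carries a real event still returns early
under the new check, and an IGNORE or NO_RESUMED result paired with
minus_one behaves the same under both checks.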
Approved-by: Lancelot Six <lancelot.six@amd.com> (amdgpu)
Change-Id: I6cbbeebdc8146e361ead72829b59f82531c90fc7
amd_dbgapi_debug_printf ("ptid = %s", ptid.to_string ().c_str ());
ptid_t event_ptid = beneath ()->wait (ptid, ws, target_options);
- if (event_ptid != minus_one_ptid)
+ if (ws->kind () != TARGET_WAITKIND_NO_RESUMED
+ && ws->kind () != TARGET_WAITKIND_IGNORE)
{
if (ws->kind () == TARGET_WAITKIND_EXITED
|| ws->kind () == TARGET_WAITKIND_SIGNALLED)
return event_ptid;
}
- gdb_assert (ws->kind () == TARGET_WAITKIND_NO_RESUMED
- || ws->kind () == TARGET_WAITKIND_IGNORE);
-
/* Flush the async handler first. */
if (target_is_async_p ())
async_event_handler_clear ();