git.ipfire.org Git - thirdparty/binutils-gdb.git/commit

infrun: Split currently_stepping, fix sw watchpoints issue

The gdb.base/watchpoint.exp on Windows with non-stop support added
(later in the series) exposed an issue with the currently_stepping
logic when tested with software watchpoints.

The issue only happens when:

- You have multiple threads.  gdb.base/watchpoint.exp exposed it on
   Windows because there the OS always spawns a few extra threads.

- Displaced stepping is not available.  The Windows non-stop work
   does not implement displaced stepping yet.  That is left as an
   optimization for later.

- The target backend is working in non-stop mode.

I've written a new test that exposes the issue on GNU/Linux as well
(on hardware single-step targets, like x86-64).  There, we see:

continue
Continuing.
../../src/gdb/infrun.c:2918: internal-error: resume_1: Assertion `!(thread_has_single_step_breakpoints_set (tp) && step)' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
----- Backtrace -----
FAIL: gdb.threads/sw-watchpoint-step-over-bp-with-threads.exp: target-non-stop=on: displaced-stepping=off: continue until exit (GDB internal error)

Currently, software watchpoints are implemented by forcing
single-stepping.  That is done by currently_stepping returning true
when we have a software watchpoint.  proceed calls resume, which calls
resume_1, which then ends up always requesting a single-step resume,
even if the higher layers wanted a continue.

Now, if you set a software watchpoint, and then continue the program,
and there's a breakpoint at the current PC, GDB needs to step over
that breakpoint first.  If displaced stepping is not available, then
GDB temporarily pauses all threads, removes the breakpoint,
single-steps the thread that needs to move past the breakpoint, and
then finally, reinserts the breakpoint, and restarts all threads
again.  That last restarting step happens in the restart_threads
infrun function.

restart_threads iterates over all threads trying to restart them one
by one.  There, we have:

      if (currently_stepping (tp))
{
  infrun_debug_printf ("restart threads: [%s] was stepping",
       tp->ptid.to_string ().c_str ());

but, what if TP is actually a new thread that hasn't yet ever set
stepping?  currently_stepping still returns true, due to the software
watchpoint, and we end up in keep_going_stepped_thread, here:

  if (tp->stop_pc () != tp->prev_pc)
    {
      ptid_t resume_ptid;

      infrun_debug_printf ("expected thread advanced also (%s -> %s)",
   paddress (current_inferior ()->arch (), tp->prev_pc),
   paddress (current_inferior ()->arch (),
     tp->stop_pc ()));

... because prev_pc was stale at that point (we had no reason to
update it earlier).  E.g. on Windows we see something like:

  [infrun] restart_threads: start: event_thread=1867996.1867996.0, inf=-1
    [infrun] restart_threads: restart threads: [1867996.1867996.0] is event thread
    [infrun] restart_threads: restart threads: [1867996.1868003.0] was stepping
    [infrun] keep_going_stepped_thread: resuming previously stepped thread
    [infrun] keep_going_stepped_thread: expected thread advanced also (0 -> 0x7ffff7ce57f8)
    [infrun] clear_step_over_info: clearing step over info
    [infrun] do_target_resume: resume_ptid=1867996.1868003.0, step=0, sig=GDB_SIGNAL_0

On GNU/Linux, we may see:

    [infrun] keep_going_stepped_thread: expected thread advanced also (0x7ffff7d2683d -> 0x7ffff7ce57f8)

there prev_pc might have been updated on an earlier proceed call,
which makes the issue harder to see, but it is stale too here.

That means we insert a single-step breakpoint at the current PC, and
continue the thread, with target_resume directly, asking for a normal
continue.

Eventually, something causes a user-visible stop.  For example, the
software watchpoint triggers.  That makes GDB stop all threads.

Now, if the user re-resumes the program, say, with "continue", we fail
this assertion in resume_1 coming from proceed:

  /* If STEP is set, it's a request to use hardware stepping
     facilities.  But in that case, we should never
     use singlestep breakpoint.  */
  gdb_assert (!(thread_has_single_step_breakpoints_set (tp) && step));

"step" is true because currently_stepping returns true since we have a
software watchpoint.  And the thread has a single-step breakpoint
installed from earlier, because of that code mentioned above, in
keep_going_stepped_thread reached from restart_threads.

This patch fixes the root cause -- the currently_stepping call in
restart_threads returned true for a thread that has never set stepping
in the first place.  This is because currently_stepping really serves
two purposes currently:

  #1 - for a thread that we are about to resume, should we set it
       stepping?

  #2 - for a thread that just stopped, was it stepping before?

The fix is thus to decouple those two aspects:

  - for #1, we simply rename currently_stepping to should_step.

  - for #2, we record whether the thread was stepping before in a new
    currently_stepping flag in thread_info.

As mentioned, there's a new testcase included.  I tested this on
x86-64 GNU/Linux, native and gdbserver, and on Windows x64 with the
non-stop series.  The assertion triggers on all of those with the fix,
and is fixed by this patch on all of those, too.

Change-Id: I7b07bc62e8570333d2e4856d2e55ae6e58f8260c

author	Pedro Alves <pedro@palves.net>
	Fri, 16 May 2025 20:05:17 +0000 (21:05 +0100)
committer	Pedro Alves <pedro@palves.net>
	Mon, 19 May 2025 13:12:37 +0000 (14:12 +0100)
commit	6f4586878c8c40731372932bdf3d124d15ecbc9c
tree	f8a4eb32ea27d3d886ae5277531b2b63c98253de	tree
parent	8e3475361e1d31fcf938baeb91ac43a32437d5ac	commit \| diff

gdb/gdbthread.h		diff \| blob \| blame \| history
gdb/infrun.c		diff \| blob \| blame \| history
gdb/testsuite/gdb.threads/sw-watchpoint-step-over-bp-with-threads.c	[new file with mode: 0644]	blob
gdb/testsuite/gdb.threads/sw-watchpoint-step-over-bp-with-threads.exp	[new file with mode: 0644]	blob