git.ipfire.org Git - thirdparty/binutils-gdb.git/commit

thread_info::executing+resumed -> thread_info::internal_state

While working on Windows non-stop support, I ran into a
very-hard-to-track-down bug.

The problem turned out to be that
infrun.c:proceed_resume_thread_checked resumed an already-executing
thread because the thread was marked as "executing=true,
resumed=false", and that function only skips resuming threads that are
marked resumed=true.  The consequence was that GDB corrupted the
registers of the Windows DLL loader threads, eventually leading to a
GDB+inferior deadlock.

Originally, the "resumed" flag was only ever set when infrun decided
is was ready to process a thread's pending wait status.  infrun has
since evolved to set the resumed flag when we set a thread's executing
flag too.  We are not always consistent throughout in guaranteeing
that a thread is marked resumed=true whenever it is marked
executing=true, though.  For instance, no target code that supports
non-stop mode (linux-nat, remote, and windows-nat with this series) is
making sure that new threads are marked resumed=true when they are
added to the thread list.  They are only marked as {state=running,
executing=true}, the "resumed" flag is not touched.

Making proceed_resume_thread_checked check thr->executing() in
addition to thr->resumed(), feels like papering over a combination of
states that shouldn't happen nowadays.

OTOH, having to have the target backends mark new threads as
resumed=true just feels like too many different states (three) to set:

  add_thread (...);
  set_running (...);
  set_executing (...);
  set_resumed (...);

Yuck.  I think we can do better.

We really have too many "state tracking" flags in a thread.
Basically:

- whether a thread is "running/stopped/exited" (from the user's
   perspective).  This is the thread_info::state field.

- whether a thread is "executing" (infrun asked the target to set the
   thread executing).  This is thread_info::executing().

- whether a thread is "resumed" (infrun wants the thread to be
   resumed, but maybe can't yet because the thread has a pending wait
   status).  This is thread_info::resumed()

"running", "executing", and "resumed" are almost synonyms, so this can
be highly confusing English-wise too.

For "running" vs "executing", in comments, we tipically need to
explain that "running/stopped/exited" is for the user/frontend
perspective, while "executing true/false" is for gdb's internal run
control.

(Also, "executing or not" can also mean something else in GDB's
codebase -- "target has execution" does not mean that threads are
actually running right now -- it's a test for whether we have a live
process vs a core dump!)

One simplification we can do that avoids this running vs executing
ambiguity is to replace the "executing" field with an "internal_state"
field, similar to the thread_info::state field, and make that new
internal_state field reuse the same enum thread_state type that is
used by thread_info::state.  Like:

  struct thread_info
  {
  ...
    /* Frontend/public/external/user view of the thread state.  */
    enum thread_state m_state = THREAD_STOPPED;

    /* The thread's internal state.  When the thread is stopped
       internally while handling an internal event, like a software
       single-step breakpoint, the internal state will be
       THREAD_STOPPED, but the external state will still be
       THREAD_RUNNING.  */
    enum thread_state m_internal_state = THREAD_STOPPED;
  };

(Assume we'd add state() and internal_state() getters.)

With that, every check for thr->executing() is replaced with a
'thr->internal_state() == THREAD_RUNNING' check, and the code is
clearer by design.  There is no confusion between "running" vs
"executing" any more, because they now mean the exact same thing.
Instead, we say e.g., 'thread has (user) state "running", and internal
state "stopped"'.  Or simpler, 'thread is running (from the user's
perspective), but internally stopped'.  That is after all what we
would way in comments today already.

That still leaves the 'resumed' flag, though.  That's the least
obvious one.  Turns out we can get rid of it, and make it a new state
tracked by thread_info::internal_state.  That is, we make
internal_state have its own enumeration type (decoupled from
thread_info::state's type), and convert the resumed true/false flag to
a new enumerator of this new enumeration.  Like so:

  enum thread_int_state
  {
    THREAD_INT_STOPPED,
    THREAD_INT_RUNNING,
+   THREAD_INT_RESUMED_PENDING_STATUS,
    THREAD_INT_EXITED,
  };

That is what this patch does.  So in summary, we go from:

thread_info::state {THREAD_STOPPED, THREAD_RUNNING, THREAD_EXITED}
thread_info::executing {false, true}
thread_info::resumed {false, true}

to:

thread_info::state {THREAD_STOPPED, THREAD_RUNNING, THREAD_EXITED}
thread_info::internal_state {THREAD_INT_STOPPED, THREAD_INT_RUNNING,
                              THREAD_INT_RESUMED_PENDING_STATUS,
      THREAD_INT_EXITED}

The patch adds getters/setters for both (user) state and
internal_state, and adds assertions around state transitions, ensuring
that internal_state doesn't get out of sync with
thread::have_pending_wait_status().

The code that adds/removes threads from the proc_target's
resumed_with_pending_wait_status list is all centralized within
thread_info::set_internal_state, when we switch to/from the
resumed-pending-status state.  With the assertions in place, it should
be impossible to end up with a THREAD_INT_RUNNING thread with a
pending status.

The thread.c:set_running, thread.c:set_executing, thread.c:set_resumed
global functions are all gone, replaced with new thread.c:set_state
and thread.c:set_internal_state functions.

Tested on x86_64-linux-gnu, native and gdbserver.

Change-Id: I4f5097d68f4694d44e1ae23fea3e9bce45fb078c
commit-id:42ba97d4

author	Pedro Alves <pedro@palves.net>
	Wed, 19 Feb 2025 14:37:39 +0000 (14:37 +0000)
committer	Pedro Alves <pedro@palves.net>
	Mon, 6 Apr 2026 17:50:19 +0000 (18:50 +0100)
commit	2954dd2b734526165b920eb575bcd0fc07e0aa24
tree	684f09e38d43b76820be3e40567870a7cc6dc1be	tree \| snapshot
parent	fdb1f0881318c83d9db836c9558c33407aca173e	commit \| diff

gdb/aarch64-tdep.c		diff \| blob \| blame \| history
gdb/amd-dbgapi-target.c		diff \| blob \| blame \| history
gdb/breakpoint.c		diff \| blob \| blame \| history
gdb/bsd-uthread.c		diff \| blob \| blame \| history
gdb/fork-child.c		diff \| blob \| blame \| history
gdb/frame.c		diff \| blob \| blame \| history
gdb/gcore.c		diff \| blob \| blame \| history
gdb/gdbthread.h		diff \| blob \| blame \| history
gdb/i386-tdep.c		diff \| blob \| blame \| history
gdb/inf-ptrace.c		diff \| blob \| blame \| history
gdb/infcall.c		diff \| blob \| blame \| history
gdb/infcmd.c		diff \| blob \| blame \| history
gdb/inferior.h		diff \| blob \| blame \| history
gdb/inflow.c		diff \| blob \| blame \| history
gdb/infrun.c		diff \| blob \| blame \| history
gdb/infrun.h		diff \| blob \| blame \| history
gdb/linux-fork.c		diff \| blob \| blame \| history
gdb/linux-nat.c		diff \| blob \| blame \| history
gdb/linux-thread-db.c		diff \| blob \| blame \| history
gdb/mi/mi-cmd-var.c		diff \| blob \| blame \| history
gdb/mi/mi-interp.c		diff \| blob \| blame \| history
gdb/mi/mi-main.c		diff \| blob \| blame \| history
gdb/process-stratum-target.c		diff \| blob \| blame \| history
gdb/python/py-infthread.c		diff \| blob \| blame \| history
gdb/record-btrace.c		diff \| blob \| blame \| history
gdb/record-full.c		diff \| blob \| blame \| history
gdb/regcache.c		diff \| blob \| blame \| history
gdb/remote.c		diff \| blob \| blame \| history
gdb/sol-thread.c		diff \| blob \| blame \| history
gdb/target.c		diff \| blob \| blame \| history
gdb/thread-iter.h		diff \| blob \| blame \| history
gdb/thread.c		diff \| blob \| blame \| history
gdb/top.c		diff \| blob \| blame \| history