]> git.ipfire.org Git - thirdparty/Python/cpython.git/commit
gh-114271: Fix race in `Thread.join()` (#114839)
authormpage <mpage@meta.com>
Sat, 16 Mar 2024 12:56:30 +0000 (05:56 -0700)
committerGitHub <noreply@github.com>
Sat, 16 Mar 2024 12:56:30 +0000 (13:56 +0100)
commit33da0e844c922b3dcded75fbb9b7be67cb013a17
tree940ddc29bb00e450f9a984e01bff7674f929543e
parent86bc40dd414bceb3f93382cc9f670936de9d68be
gh-114271: Fix race in `Thread.join()` (#114839)

There is a race between when `Thread._tstate_lock` is released[^1] in `Thread._wait_for_tstate_lock()`
and when `Thread._stop()` asserts[^2] that it is unlocked. Consider the following execution
involving threads A, B, and C:

1. A starts.
2. B joins A, blocking on its `_tstate_lock`.
3. C joins A, blocking on its `_tstate_lock`.
4. A finishes and releases its `_tstate_lock`.
5. B acquires A's `_tstate_lock` in `_wait_for_tstate_lock()`, releases it, but is swapped
   out before calling `_stop()`.
6. C is scheduled, acquires A's `_tstate_lock` in `_wait_for_tstate_lock()` but is swapped
   out before releasing it.
7. B is scheduled, calls `_stop()`, which asserts that A's `_tstate_lock` is not held.
   However, C holds it, so the assertion fails.

The race can be reproduced[^3] by inserting sleeps at the appropriate points in
the threading code. To do so, run the `repro_join_race.py` from the linked repo.

There are two main parts to this PR:

1. `_tstate_lock` is replaced with an event that is attached to `PyThreadState`.
   The event is set by the runtime prior to the thread being cleared (in the same
   place that `_tstate_lock` was released). `Thread.join()` blocks waiting for the
   event to be set.
2. `_PyInterpreterState_WaitForThreads()` provides the ability to wait for all
   non-daemon threads to exit. To do so, an `is_daemon` predicate was added to
   `PyThreadState`. This field is set each time a thread is created. `threading._shutdown()`
   now calls into `_PyInterpreterState_WaitForThreads()` instead of waiting on
   `_tstate_lock`s.

[^1]: https://github.com/python/cpython/blob/441affc9e7f419ef0b68f734505fa2f79fe653c7/Lib/threading.py#L1201
[^2]: https://github.com/python/cpython/blob/441affc9e7f419ef0b68f734505fa2f79fe653c7/Lib/threading.py#L1115
[^3]: https://github.com/mpage/cpython/commit/81946532792f938cd6f6ab4c4ff92a4edf61314f

---------

Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Co-authored-by: Antoine Pitrou <antoine@python.org>
12 files changed:
Include/cpython/pystate.h
Include/internal/pycore_lock.h
Include/internal/pycore_pythread.h
Lib/test/test_audit.py
Lib/test/test_concurrent_futures/test_process_pool.py
Lib/test/test_thread.py
Lib/test/test_threading.py
Lib/threading.py
Misc/NEWS.d/next/Library/2024-02-01-03-09-38.gh-issue-114271.raCkt5.rst [new file with mode: 0644]
Modules/_threadmodule.c
Python/lock.c
Python/pystate.c