Sam Gross [Wed, 29 May 2024 19:26:04 +0000 (15:26 -0400)]
gh-119525: Fix deadlock with `_PyType_Lookup` and the GIL (#119527)
The deadlock only affected the free-threaded build and only occurred
when the GIL was enabled at runtime. The `Py_DECREF(old_name)` call
might temporarily release the GIL while holding the type seqlock.
Another thread may spin trying to acquire the seqlock while holding the
GIL.
The deadlock occurred roughly 1 in ~1,000 runs of `pool_in_threads.py`
from `test_multiprocessing_pool_circular_import`.
If one calls pow(fractions.Fraction, x, module) with modulo not None, the error message now says that the types are incompatible rather than saying pow only takes 2 arguments. Implemented by having fractions.Fraction __pow__ accept optional modulo argument and return NotImplemented if not None. pow() then raises with appropriate message.
---------
Co-authored-by: Mark Dickinson <dickinsm@gmail.com>
Jason R. Coombs [Wed, 29 May 2024 16:43:19 +0000 (12:43 -0400)]
gh-118673: Remove shebang and executable bits from stdlib modules. (#119658)
* gh-118673: Remove shebang and executable bits from stdlib modules.
* Removed shebangs and exe bits on turtledemo scripts.
The setting was inappropriate for '__main__' and inconsistent across the other modules. The scripts can still be executed directly by invoking with the desired interpreter.
Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com> Co-authored-by: Erlend E. Aasland <erlend.aasland@protonmail.com> Co-authored-by: Victor Stinner <vstinner@python.org>
Victor Stinner [Wed, 29 May 2024 09:37:04 +0000 (11:37 +0200)]
gh-119661: Add _Py_SINGLETON() include in Argumenet Clinic (#119712)
When the _Py_SINGLETON() is used, Argument Clinic now adds an
explicit "pycore_runtime.h" include to get the macro. Previously, the
macro may or may not be included indirectly by another include.
gh-119118: Fix performance regression in tokenize module (#119615)
* gh-119118: Fix performance regression in tokenize module
- Cache line object to avoid creating a Unicode object
for all of the tokens in the same line.
- Speed up byte offset to column offset conversion by using the
smallest buffer possible to measure the difference.
Co-authored-by: Pablo Galindo <pablogsal@gmail.com>
Petr Viktorin [Tue, 28 May 2024 17:27:52 +0000 (19:27 +0200)]
gh-117398: gh-119655: datetime: Init static state once & don't free it (GH-119662)
- While datetime uses global state, only initialize it once.
- While `capi` is static, don't free it (thanks @neonene in https://github.com/python/cpython/pull/119641/files#r1616710048)
The fix in gh-119561 introduced an assertion that doesn't hold true if any of the three new test extension modules are loaded more than once. This is fine normally but breaks if the new test_check_state_first() is run more than once, which happens for refleak checking and with the regrtest --forever flag. We fix that here by clearing each of the three modules after loading them. We also tweak a check in _modules_by_index_check().
Barney Gale [Sat, 25 May 2024 20:01:36 +0000 (21:01 +0100)]
GH-82805: Fix handling of single-dot file extensions in pathlib (#118952)
pathlib now treats "`.`" as a valid file extension (suffix). This brings
it in line with `os.path.splitext()`.
In the (private) pathlib ABCs, we add a new `ParserBase.splitext()` method
that splits a path into a `(root, ext)` pair, like `os.path.splitext()`.
This method is called by `PurePathBase.stem`, `suffix`, etc. In a future
version of pathlib, we might make these base classes public, and so users
will be able to define their own `splitext()` method to control file
extension splitting.
In `pathlib.PurePath` we add optimised `stem`, `suffix` and `suffixes`
properties that don't use `splitext()`, which avoids computing the path
base name twice.
Eric Snow [Sat, 25 May 2024 19:30:48 +0000 (15:30 -0400)]
gh-119560: Drop an Invalid Assert in PyState_FindModule() (gh-119561)
The assertion was added in gh-118532 but was based on the invalid assumption that PyState_FindModule() would only be called with an already-initialized module def. I've added a test to make sure we don't make that assumption again.
Tim Peters [Sat, 25 May 2024 03:08:21 +0000 (22:08 -0500)]
gh-119105: Differ.compare is too slow [for degenerate cases] (#119492)
``_fancy_replace()`` is no longer recursive. and a single call does a worst-case linear number of ratio() computations instead of quadratic. This renders toothless a universe of pathological cases. Some inputs may produce different output, but that's rare, and I didn't find a case where the final diff appeared to be of materially worse quality. To the contrary, by refusing to even consider synching on lines "far apart", there was more easy-to-digest locality in the output.
Alyssa Coghlan [Fri, 24 May 2024 14:29:19 +0000 (00:29 +1000)]
GH-119496: accept UTF-8 BOM in .pth files (GH-119503)
`Out-File -Encoding utf8` and similar commands in Windows Powershell 5.1 emit
UTF-8 with a BOM marker, which the regular `utf-8` codec decodes incorrectly.
`utf-8-sig` accepts a BOM, but also works correctly without one.
This change also makes .pth files match the way Python source files are handled.
Fix ThreadedVSOCKSocketStreamTest: if get_cid() returns the host
address or the "any" address, use the local communication address
(loopback): VMADDR_CID_LOCAL.
On Linux 6.9, apparently, the /dev/vsock device is now available but
get_cid() returns VMADDR_CID_ANY (-1).
Brett Simmers [Thu, 23 May 2024 20:59:35 +0000 (16:59 -0400)]
gh-118727: Don't drop the GIL in `drop_gil()` unless the current thread holds it (#118745)
`drop_gil()` assumes that its caller is attached, which means that the current
thread holds the GIL if and only if the GIL is enabled, and the enabled-state
of the GIL won't change. This isn't true, though, because `detach_thread()`
calls `_PyEval_ReleaseLock()` after detaching and
`_PyThreadState_DeleteCurrent()` calls it after removing the current thread
from consideration for stop-the-world requests (effectively detaching it).
Fix this by remembering whether or not a thread acquired the GIL when it last
attached, in `PyThreadState._status.holds_gil`, and check this in `drop_gil()`
instead of `gil->enabled`.
This fixes a crash in `test_multiprocessing_pool_circular_import()`, so I've
reenabled it.
Petr Viktorin [Thu, 23 May 2024 16:01:37 +0000 (18:01 +0200)]
gh-117142: Slightly hacky fix for memory leak of StgInfo (GH-119424)
Add a funciton that inlines PyObject_GetTypeData and skips
type-checking, so it doesn't need access to the CType_Type object.
This will break if the memory layout changes, but should
be an acceptable solution to enable ctypes in subinterpreters in
Python 3.13.
Tim Peters [Wed, 22 May 2024 23:25:08 +0000 (18:25 -0500)]
gh-119105: difflib.py Differ.compare is too slow [for degenerate cases] (#119376)
Track all pairs achieving the best ratio in Differ(). This repairs the "very deep recursion and cubic time" bad cases in a way that preserves previous output.
Add unicode_decode_utf8_writer() to write directly characters into a
_PyUnicodeWriter writer: avoid the creation of a temporary string.
Optimize PyUnicode_FromFormat() by using the new
unicode_decode_utf8_writer().
Rename unicode_fromformat_write_cstr() to
unicode_fromformat_write_utf8().
Michael Vincent [Wed, 22 May 2024 17:59:47 +0000 (12:59 -0500)]
gh-117505: Run ensurepip in isolated env in Windows installer (GH-118257)
ensurepip forks a subprocess to run pip itself, but that subprocess only inherits a -I isolated mode flag (see _run_pip() in Lib/ensurepip/__init__.py), not the "-E -s" flags that the installer has been using. This means that parts of ensurepip don't actually run in an isolated environment and can make incorrect decisions based on packages installed in the user site-packages.
gh-119247: Add macros to use PySequence_Fast safely in free-threaded build (#119315)
Add `Py_BEGIN_CRITICAL_SECTION_SEQUENCE_FAST` and
`Py_END_CRITICAL_SECTION_SEQUENCE_FAST` macros and update `str.join` to use
them. Also add a regression test that would crash reliably without this
patch.
Geoffrey Thomas [Wed, 22 May 2024 16:35:18 +0000 (12:35 -0400)]
Remove almost all unpaired backticks in docstrings (#119231)
As reported in #117847 and #115366, an unpaired backtick in a docstring
tends to confuse e.g. Sphinx running on subclasses of standard library
objects, and the typographic style of using a backtick as an opening
quote is no longer in favor. Convert almost all uses of the form
The variable `foo' should do xyz
to
The variable 'foo' should do xyz
and also fix up miscellaneous other unpaired backticks (extraneous /
missing characters).
No functional change is intended here other than in human-readable
docstrings.
Eric Snow [Wed, 22 May 2024 15:57:52 +0000 (11:57 -0400)]
gh-119213: Be More Careful About _PyArg_Parser.kwtuple Across Interpreters (gh-119331)
_PyArg_Parser holds static global data generated for modules by Argument Clinic. The _PyArg_Parser.kwtuple field is a tuple object, even though it's stored within a static global. In some cases the tuple is statically allocated and thus it's okay that it gets shared by multiple interpreters. However, in other cases the tuple is set lazily, allocated from the heap using the active interprepreter at the point the tuple is needed.
This is a problem once that interpreter is destroyed since _PyArg_Parser.kwtuple becomes at dangling pointer, leading to crashes. It isn't a problem if the tuple is allocated under the main interpreter, since its lifetime is bound to the lifetime of the runtime. The solution here is to temporarily switch to the main interpreter. The alternative would be to always statically allocate the tuple.
This change also fixes a bug where only the most recent parser was added to the global linked list.