Emma Smith [Tue, 21 Oct 2025 21:48:29 +0000 (14:48 -0700)]
gh-132835: Add defensive NULL checks to MRO resolution (GH-134763)
Currently, there are a few places where tp_mro could theoretically
become NULL, but do not in practice. This commit adds defensive checks for
NULL values to ensure that any changes do not introduce a crash and that
state invariants are upheld.
The assertions added in this commit are all instances where a NULL value would get passed to something not expecting a NULL, so it is better to catch an assertion failure than crash later on.
There are a few cases where it is OK for the return of lookup_tp_mro to be NULL, such as when passed to is_subtype_with_mro, which handles this explicitly.
Jeffrey Bosboom [Tue, 21 Oct 2025 16:13:14 +0000 (09:13 -0700)]
gh-83714: Check for struct statx members in configure (#140402)
Some systems have the definitions of the mask bits without having the
corresponding members in struct statx. Add configure checks for members
added after Linux 4.11 (when statx itself was added).
David Ellis [Tue, 21 Oct 2025 15:57:43 +0000 (16:57 +0100)]
gh-138764: annotationlib: Make `call_annotate_function` fallback to using `VALUE` annotations if both the requested format and `VALUE_WITH_FAKE_GLOBALS` are not implemented (#138803)
Jeffrey Bosboom [Tue, 21 Oct 2025 15:54:00 +0000 (08:54 -0700)]
gh-140239: Check for statx on Android (#140395)
Android has Linux's statx, but MACHDEP is "android" on Android, so
configure doesn't check for statx on Android. Base the check for statx
on ac_sys_system instead, which is "Linux-android" on Android, "Linux"
on other Linux distributions, and "AIX" on AIX (which has an
incompatible function named statx).
Mark Shannon [Tue, 21 Oct 2025 14:22:15 +0000 (15:22 +0100)]
GH-139951: Fix major GC performance regression (GH-140262)
* Count number of actually tracked objects, instead of trackable objects. This ensures that untracking tuples has the desired effect of reducing GC overhead
* Do not track most untrackable tuples during creation. This prevents large numbers of small tuples causing execessive GCs.
* Support non-UTF-8 shebang and comments if non-UTF-8 encoding is specified.
* Detect decoding error for non-UTF-8 encoding.
* Detect null bytes in source code.
Kumar Aditya [Sat, 18 Oct 2025 11:06:58 +0000 (16:36 +0530)]
gh-140067: Fix memory leak in sub-interpreter creation (#140111) (#140261)
Fix memory leak in sub-interpreter creation caused by overwriting of the previously used `_malloced` field. Now the pointer is stored in the first word of the memory block to avoid it being overwritten accidentally.
Barney Gale [Fri, 17 Oct 2025 21:57:51 +0000 (22:57 +0100)]
GH-133789: Fix unpickling of pathlib objects pickled in Python 3.13 (#133831)
In Python 3.13 (but not 3.12 or 3.14), pathlib classes are defined in
`pathlib._local` rather than `pathlib`. In hindsight this was a mistake,
but it was difficult to predict how the abstract/local split would pan out.
In this patch we re-introduce `pathlib._local` as a stub module that
re-exports the classes from `pathlib`. This allows path objects pickled in
3.13 to be unpicked in 3.14+
Jeffrey Bosboom [Thu, 16 Oct 2025 11:40:47 +0000 (04:40 -0700)]
gh-83714: Check for struct statx.stx_atomic_write_unit_max_opt in configure (#140185)
stx_atomic_write_unit_max_opt was added in Linux 6.16, but is controlled
by the STATX_WRITE_ATOMIC mask bit added in Linux 6.11. That's safe at
runtime because all kernels clear the reserved space in struct statx and
zero is a valid value for stx_atomic_write_unit_max_opt, and it avoids
allocating another mask bit, which are a limited resource. But it also
means the kernel headers don't provide a way to check whether
stx_atomic_write_unit_max_opt exists, so add a configure check.
Serhiy Storchaka [Thu, 16 Oct 2025 07:54:41 +0000 (10:54 +0300)]
gh-130567: Remove optimistic allocation in locale.strxfrm() (GH-137143)
On modern systems, the result of wcsxfrm() is much larger the size of
the input string (from 4+2*n on Windows to 4+5*n on Linux for simple
ASCII strings), so optimistic allocation of the buffer of the same size
never works.
The exception is if the locale is "C" (or unset), but in that case the `wcsxfrm`
call should be fast (and calling `locale.strxfrm()` doesn't make too much
sense in the first place).
Sergey Miryanov [Wed, 15 Oct 2025 13:48:21 +0000 (18:48 +0500)]
gh-140061: Use `_PyObject_IsUniquelyReferenced()` to check if objects are uniquely referenced (gh-140062)
The previous `Py_REFCNT(x) == 1` checks can have data races in the free
threaded build. `_PyObject_IsUniquelyReferenced(x)` is a more conservative
check that is safe in the free threaded build and is identical to
`Py_REFCNT(x) == 1` in the default GIL-enabled build.
Emma Smith [Tue, 14 Oct 2025 17:03:55 +0000 (10:03 -0700)]
gh-139877: Use PyBytesWriter in pycore_blocks_output_buffer.h (#139976)
Previously, the _BlocksOutputBuffer code creates a list of bytes objects to handle the output data from compression libraries. This ends up being slow due to the output buffer code needing to copy each bytes element of the list into the final bytes object buffer at the end of compression.
The new PyBytesWriter API introduced in PEP 782 is an ergonomic and fast method of writing data into a buffer that will later turn into a bytes object. Benchmarks show that using the PyBytesWriter API is 10-30% faster for decompression across a variety of settings. The performance gains are greatest when the decompressor is very performant, such as for Zstandard (and likely zlib-ng). Otherwise the decompressor can bottleneck decompression and the gains are more modest, but still sizable (e.g. 10% faster for zlib)!
Shamil [Tue, 14 Oct 2025 14:42:17 +0000 (17:42 +0300)]
gh-140067: Fix memory leak in sub-interpreter creation (#140111)
Fix memory leak in sub-interpreter creation caused by overwriting of the previously used `_malloced` field. Now the pointer is stored in the first word of the memory block to avoid it being overwritten accidentally.