Sam Gross [Fri, 10 Jan 2025 00:59:10 +0000 (19:59 -0500)]
gh-128691: Use deferred reference counting on `_thread._local` (#128693)
This change, along with the LOAD_ATTR specializations, makes the
"thread_local_read" micro benchmark in Tools/ftscalingbench/ftscalingbench.py
scale well to multiple threads.
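A minimal sketch of what a thread-local read micro benchmark could look like, assuming a simple loop that reads one attribute from a shared `threading.local()` object in every thread. The names `read_loop` and `run` are hypothetical and this is not the code in Tools/ftscalingbench/ftscalingbench.py.

```python
import threading
import time

# One shared threading.local() object; each thread sees its own attributes.
thread_local = threading.local()


def read_loop(iterations):
    """Set a per-thread attribute once, then read it repeatedly."""
    tmp = thread_local
    tmp.x = 10
    total = 0
    for _ in range(iterations):
        total += tmp.x  # the attribute read being measured
    return total


def run(num_threads, iterations=100_000):
    """Run read_loop in num_threads threads; return (results, elapsed seconds)."""
    results = [None] * num_threads

    def worker(index):
        results[index] = read_loop(iterations)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, time.perf_counter() - start
```

Comparing the elapsed time of `run(1)` against `run(N)` gives a rough idea of how the read scales with the number of threads.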
Stephen Morton [Tue, 7 Jan 2025 10:40:41 +0000 (02:40 -0800)]
gh-128302: Fix bugs in xml.dom.xmlbuilder (GH-128284)
* Allow DOMParser.parse() to correctly handle DOMInputSource instances
that only have a systemId attribute set.
* Fix DOMEntityResolver.resolveEntity(), which was broken by the
Python 3.0 transition.
* Add Lib/test/test_xml_dom_xmlbuilder.py with a few tests.
Hood Chatham [Mon, 6 Jan 2025 22:26:35 +0000 (23:26 +0100)]
gh-127146: Strip dash from Emscripten compiler version (#128557)
`emcc -dumpversion` will sometimes say e.g., `4.0.0-git` but in this case
uname does not include `-git` in the version string. Use cut to delete
everything after the dash.
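The effect of the stripping step can be sketched in Python, assuming the version string contains at most one suffix introduced by a dash. The helper name `strip_version_suffix` is hypothetical; the actual change uses `cut` in the build scripts.

```python
def strip_version_suffix(version: str) -> str:
    """Return the part of the version string before the first dash."""
    # "4.0.0-git" -> "4.0.0"; strings without a dash are returned unchanged.
    return version.split("-", 1)[0]
```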
RanKKI [Mon, 6 Jan 2025 01:32:16 +0000 (12:32 +1100)]
gh-98188: Fix EmailMessage.get_payload to decode data when CTE value has extra text (#127547)
Up to this point message handling has been very strict with regard to content encoding values: mixed case was accepted, but trailing blanks or other text would cause decoding failure, even if the first token was a valid encoding. By Postel's Rule we should go ahead and decode as long as we can recognize that first token. We have not thought of any security or backward compatibility concerns with this fix.
This fix does introduce a new technique/pattern to the Message code: we look to see if the header has a 'cte' attribute, and if so we use that. This effectively promotes the header API exposed by HeaderRegistry to an API that any header parser "should" support. This seems like a reasonable thing to do. It is not, however, a requirement, as the string value of the header is still used if there is no cte attribute.
The full fix (ignore any trailing blanks or blank-separated trailing text) applies only to the non-compat32 API. compat32 is only fixed to the extent that it now ignores trailing spaces. Note that the HeaderRegistry parsing still records a HeaderDefect if there is extra text.
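The "recognize the first token" idea can be illustrated with a small standalone sketch. It assumes only base64 needs decoding, and the helpers `first_cte_token` and `decode_payload` are hypothetical; this is not the code in the email package.

```python
import base64


def first_cte_token(header_value: str) -> str:
    """Return the first blank-separated token of a CTE value, lowercased."""
    parts = header_value.split()
    return parts[0].lower() if parts else ""


def decode_payload(raw: bytes, cte_header: str) -> bytes:
    """Decode raw according to the first token of the CTE header value."""
    # Trailing blanks or extra text after the first token are ignored.
    if first_cte_token(cte_header) == "base64":
        return base64.b64decode(raw)
    # Other encodings are left untouched in this sketch.
    return raw
```

With this approach a value such as `Base64 extra-text` is still treated as base64, which is the lenient behavior the commit describes for the non-compat32 API.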
Damien [Sun, 5 Jan 2025 12:07:18 +0000 (20:07 +0800)]
gh-128504: Upgrade doctest to ubuntu-24.04 (#128506)
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
Barney Gale [Sat, 4 Jan 2025 15:45:24 +0000 (15:45 +0000)]
pathlib tests: create `walk()` test hierarchy without using class under test (#128338)
In the tests for `pathlib.Path.walk()`, avoid using the path class under
test (`self.cls`) in test setup. Instead we use `os` functions in
`test_pathlib`, and direct manipulation of `DummyPath` internal data in
`test_pathlib_abc`.
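A rough sketch of building a test hierarchy with `os` functions only, so that the path class under test is not involved in setup. The function name `create_walk_hierarchy` and the directory and file names are illustrative assumptions, not the layout used by the real tests.

```python
import os


def create_walk_hierarchy(base):
    """Create a small directory tree under base using only os functions."""
    sub1 = os.path.join(base, "SUB1")
    sub11 = os.path.join(sub1, "SUB11")
    os.makedirs(sub11)  # creates SUB1 and SUB1/SUB11
    files = [os.path.join(base, "tmp1"), os.path.join(sub1, "tmp2")]
    for path in files:
        with open(path, "w", encoding="utf-8") as f:
            f.write("content of " + os.path.basename(path))
    return files
```

A test would then call `walk()` on the class under test and compare what it yields against the tree created here.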
Barney Gale [Sat, 4 Jan 2025 12:53:51 +0000 (12:53 +0000)]
GH-127381: pathlib ABCs: remove `PathBase.move()` and `move_into()` (#128337)
These methods combine `_delete()` and `copy()`, but `_delete()` isn't part
of the public interface, and it's unlikely to be added until the pathlib
ABCs are made official, or perhaps even later.
Kumar Aditya [Sat, 4 Jan 2025 08:48:22 +0000 (14:18 +0530)]
gh-128002: fix many thread safety issues in asyncio (#128147)
* Makes `_asyncio.Task` and `_asyncio.Future` thread-safe by adding critical sections
* Add assertions in internal functions to check that the object is locked by a critical section
* Make `_asyncio.all_tasks` thread safe when eager tasks are used
* Add a thread safety test
Add a separate benchmark that measures the effect of
`_PyObject_LookupSpecial()` on scaling.
In the process of cleaning up the scaling benchmarks for inclusion, I
unintentionally changed the "cmodule_function" benchmark to pass an
`int` to `math.floor()` instead of a `float`, which causes it to use the
`_PyObject_LookupSpecial()` code path. `_PyObject_LookupSpecial()` has
its own scaling issues that we want to measure separately from calling a
function on a C module.
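The Python-level counterpart of that special-method lookup can be seen with a small sketch: `math.floor()` delegates to `__floor__` for objects that define it. The class `Floorable` is hypothetical and only makes the dispatch observable; it does not measure scaling.

```python
import math


class Floorable:
    """Counts how often math.floor() dispatches to __floor__."""

    def __init__(self):
        self.calls = 0

    def __floor__(self):
        self.calls += 1
        return 7


obj = Floorable()
result = math.floor(obj)       # dispatches to Floorable.__floor__
float_result = math.floor(2.5) # float argument
int_result = math.floor(3)     # int argument, the case described above
```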
Bénédikt Tran [Fri, 3 Jan 2025 12:37:02 +0000 (13:37 +0100)]
gh-127787: refactor helpers for `PyUnicodeErrorObject` internal interface (GH-127789)
- Unify `get_unicode` and `get_string` in a single function.
- Allow retrieving the underlying `object` attribute, its
size, and the adjusted 'start' and 'end', all at once.
Add a new `_PyUnicodeError_GetParams` internal function for this.
(In `exceptions.c`, it's somewhat common to not need all the attributes,
but the compiler has opportunity to inline the function and optimize
unneeded work away. Outside that file, we'll usually need all or
most of them at once.)
- Use a common implementation for the following functions:
Sam Gross [Thu, 2 Jan 2025 19:02:54 +0000 (14:02 -0500)]
gh-128212: Fix race in `_PyUnicode_CheckConsistency` (GH-128367)
There was a data race on the utf8 field between `PyUnicode_SET_UTF8` and
`_PyUnicode_CheckConsistency`. Use the `_PyUnicode_UTF8()` accessor,
which uses an atomic load internally, to avoid the data race.
Anders Kaseorg [Thu, 2 Jan 2025 16:55:33 +0000 (08:55 -0800)]
Remove asserts that confuse `enum _framestate` with `enum _frameowner` (GH-124148)
The `owner` field of `_PyInterpreterFrame` is supposed to be a member of
`enum _frameowner`, but `FRAME_CLEARED` is a member of `enum _framestate`.
At present, it happens that `FRAME_CLEARED` is not numerically equal to any
member of `enum _frameowner`, but that could change in the future. The code
that incorrectly assigned `owner = FRAME_CLEARED` was deleted in commit a53cc3f49463e50cb3e2b839b3a82e6bf7f73fee (GH-116687). Remove the incorrect
checks for `owner != FRAME_CLEARED` as well.