[3.14] gh-148821: Add more strict tests for XML encodings (GH-149765) (GH-149771)
Exclude encodings like 'utf-8-sig', 'iso2022-jp' and 'hz' from the list of
supported encodings.
(cherry picked from commit fa2afa64d9467fb7362672ed603d29d8e246d240)
Sam Gross [Tue, 12 May 2026 17:21:31 +0000 (17:21 +0000)]
[3.14] gh-145235: Make dict watcher API thread-safe for free-threaded builds (gh-145233) (#149691)
In free-threaded builds, concurrent calls to PyDict_AddWatcher, PyDict_ClearWatcher, PyDict_Watch, and PyDict_Unwatch can race on the shared callback array and the per-dict watcher tags. This change adds a mutex to serialize watcher registration and removal, atomic operations for tag updates, and atomic acquire/release synchronization for callback dispatch in _PyDict_SendEvent.
[3.14] gh-149486: tarfile.data_filter: validate written link target (GH-149487) (GH-149554)
The data filter rewrote linknames with normpath() but ran the
containment check against the un-normalised value, and computed a
symlink's directory before stripping trailing slashes. Both let a
crafted archive create links pointing outside the destination. Also
reject link members that resolve to the destination directory itself,
which could otherwise replace it with a symlink and redirect all
subsequent members.
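The two ordering problems described above can be illustrated with a purely lexical sketch. The helper name `link_target_is_contained` is hypothetical and is not the tarfile implementation; it uses `posixpath` so the behaviour is the same on every platform, and it does not resolve symlinks that already exist on disk.

```python
import posixpath


def link_target_is_contained(dest_dir, member_name, linkname):
    """Hypothetical check: does a symlink member's target stay inside dest_dir?"""
    dest = posixpath.normpath(dest_dir)
    # Strip trailing slashes BEFORE computing the directory that holds the link.
    member_dir = posixpath.dirname(member_name.rstrip("/"))
    # Normalise first, then validate the normalised value (the one that
    # would actually be written), not the raw linkname.
    target = posixpath.normpath(posixpath.join(dest, member_dir, linkname))
    if target == dest:
        # A link resolving to the destination itself is rejected as well.
        return False
    return posixpath.commonpath([dest, target]) == dest


print(link_target_is_contained("/dest", "sub/link", "../file"))     # True
print(link_target_is_contained("/dest", "sub/link/", "../../x"))    # False
print(link_target_is_contained("/dest", "sub/link", ".."))          # False
```

In the second call, skipping the `rstrip("/")` step would make the link's directory `sub/link` instead of `sub`, and the escaping target would wrongly appear to stay inside `/dest`.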
For Python macOS framework builds, update all Info.plist files to be more
compliant with current Apple guidelines. Original patch contributed by
Martinus Verburg.
Zachary Ware [Wed, 6 May 2026 15:49:11 +0000 (10:49 -0500)]
[3.14] Rewrite RTD configuration to use build.jobs rather than build.commands (GH-149432)
As part of this conversion, we now ensure that we're comparing against the
merge-base of the PR branch and the base branch when checking whether an RTD
build is worthwhile, deepening the history of the base branch by up to 500
commits if necessary. If the merge-base can't be found or there are merge
conflicts with the head of the base branch, the build is skipped since it would
give a warped perception of the actual changes anyway.
This unfortunately does nothing about RTD preview comments comparing against
the wrong base, other than skipping builds that shouldn't produce any diff at
all, thus avoiding the comment.
gh-149096: Remove 'im_*' attribute reference from inspect module docstring (GH-149108)
The im_class/func/self names were removed in 3.0. The prefix appears nowhere else in inspect.py
and nowhere in inspect.rst.
(cherry picked from commit e4444538dcd60a1b655c620b4d3bba59a7830f25)
[3.14] GH-130750: Restore quoting of choices in argparse error messag… (#149385)
[3.14] GH-130750: Restore quoting of choices in argparse error messages to match documentation and improve clarity (GH-144983)
(cherry picked from commit 53a7f76501923059188922be231db855265fe9a4)
Treat the debug offset tables read from a target process as untrusted input
and validate them before the unwinder uses any reported sizes or offsets.
Add a shared validator in debug_offsets_validation.h and run it once when
_Py_DebugOffsets is loaded and once when AsyncioDebug is loaded. The checks
cover section sizes used for fixed local buffers and every offset that is
later dereferenced against a local buffer or local object view. This keeps
the bounds checks out of the sampling hot path while rejecting malformed
tables up front.
(cherry picked from commit 289fd2c97a7e5aecb8b69f94f5e838ccfeee7e67)
[3.14] gh-138907: Support RFC 9309 in robotparser (GH-138908) (GH-149374)
* empty lines are always ignored instead of separating groups
* the "user-agent" line after a rule starts a new group
* groups matching the same user agent are now merged
* the rule with the longest match wins instead of the first matching rule
* in case of equal matches, the “Allow” rule wins over “Disallow”
* special characters “$” and “*” are now supported in rules
* prefer full match for user agent
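The rule-selection points in the list above (longest match wins, "Allow" wins ties, "$" and "*" are special) can be sketched independently of `urllib.robotparser`. The names `_pattern_to_regex` and `is_allowed` are hypothetical, rules are assumed to be `(allow, pattern)` pairs for a single already-selected group, and pattern length is used as the measure of match specificity.

```python
import re


def _pattern_to_regex(pattern):
    """Translate a robots.txt path pattern: '*' is a wildcard, trailing '$' anchors the end."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def is_allowed(rules, path):
    """Return True if path may be fetched under the given (allow, pattern) rules."""
    best_len = -1
    best_allow = True  # no matching rule means the path is allowed
    for allow, pattern in rules:
        if _pattern_to_regex(pattern).match(path) is None:
            continue
        length = len(pattern)
        # Longest pattern wins; on equal length, Allow beats Disallow.
        if length > best_len or (length == best_len and allow):
            best_len, best_allow = length, allow
    return best_allow


rules = [
    (False, "/private"),
    (True, "/private/public"),
    (False, "/*.pdf$"),
    (False, "/tie"),
    (True, "/tie"),
]
print(is_allowed(rules, "/private/x"))         # False
print(is_allowed(rules, "/private/public/a"))  # True
print(is_allowed(rules, "/docs/a.pdf"))        # False
print(is_allowed(rules, "/tie/x"))             # True
```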
Avoid the phrasing ‘starting with ::FFFF/96’, which is confusing since
it seems to mix a prefix and a range. Instead, make it clear what the
actual range is, and refer to the relevant RFC.
The kind attribute of ast.Constant was not mentioned in the
documentation. It is set to 'u' for u-prefixed string literals
and None for all other constants.
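A short illustration of the attribute described above; `constant_kind` is a hypothetical helper written only for this example.

```python
import ast


def constant_kind(source):
    """Return the ``kind`` of the ast.Constant node for a single-literal expression."""
    node = ast.parse(source, mode="eval").body
    if not isinstance(node, ast.Constant):
        raise ValueError("source is not a single constant")
    return node.kind


print(constant_kind('u"text"'))  # u
print(constant_kind('"text"'))   # None
print(constant_kind("42"))       # None
```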
[3.14] gh-148914: Fix memoization of in-band PickleBuffer in the Python implementation (GH-149052) (GH-149274)
Previously, identical PickleBuffers did not preserve identity.
Also, an empty writable PickleBuffer memoized an empty bytearray object
in place of b'', which is a singleton in CPython, so subsequent
references to b'' were unpickled as an empty bytearray object.
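For context, a minimal sketch of in-band PickleBuffer serialization with protocol 5. It runs against whichever pickle implementation is active and only shows the documented type mapping (read-only buffers come back as bytes, writable ones as bytearray); it does not exercise the memoization fix itself.

```python
import pickle

# No buffer_callback is given, so both buffers are serialized in-band.
readonly = pickle.PickleBuffer(b"abc")
writable = pickle.PickleBuffer(bytearray(b"abc"))

restored_ro = pickle.loads(pickle.dumps(readonly, protocol=5))
restored_rw = pickle.loads(pickle.dumps(writable, protocol=5))

print(type(restored_ro).__name__)  # bytes
print(type(restored_rw).__name__)  # bytearray
```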
(cherry picked from commit b89735625dff07005c31bdc86cbe7113ef1b59d0)
[3.14] gh-149117: Set `ImportError.name` on errors from `runpy.run_module`/`run_path` (gh-149159) (#149257)
`runpy.run_module()` and `runpy.run_path()` now set the `name` attribute
of the `ImportError` they raise to the requested module name, matching
the behaviour of a regular import statement (previously `name` was
always `None`, which broke introspection).
The `name=` kwarg is gated on `issubclass(error, ImportError)` because
`_get_module_details()` is also used by `_run_module_as_main()` with
a private `_Error` sentinel class. `_Error` does not subclass
ImportError, and `BaseException.__init__` rejects unknown kwargs at
the C level, so passing `name=` unconditionally would break the
`python -m foo` codepath.
(cherry picked from commit ff35fe4633cc6d3a30f6af8281dfa641783c1d07)
[3.14] gh-148518 fix index error in local part attribute (GH-148522) (#149200)
As part of fixing bpo-27931, code was added to get_bare_quoted_string
that inserted an empty Terminal if the quoted string was empty. This isn't
the best answer in terms of the parse tree; we really want the token
list to be empty in that case. But an empty token list caused
local_part to raise an IndexError. The same problem occurs if we
try to parse an address consisting of a single dquote. By fixing
local_part to not raise on an empty token list, we can have the
bare_quoted_string code correctly return an empty token list for
the empty string cases (two dquotes or a single dquote as the
entire addr-spec, at the end of a line).
(cherry picked from commit bdbb55c403d2ab6b4b0a3e994d21b623fee4a544)
Co-authored-by: R. David Murray <rdmurray@bitdance.com>
[3.14] bpo-39100: _header_value_parser: do not treat a Group as invalid-mailbox (GH-24872) (#149191)
When an address in an address-list has garbage at the end, the code will
currently:
1. change the mailbox in the last parsed address into invalid-mailbox by
overriding its token_type;
2. wrap the trailing garbage into another invalid-mailbox and append it
to the last parsed address.
However, that does not take into account that an address may
also contain a Group instead of a single mailbox. In that case,
overwriting token_type leads to undesirable results, e.g. parsing an
email with the following 'To' header:
unlisted-recipients:; (no To-header on input)
raises an AttributeError from trying to treat the Group as a Mailbox.
Moreover, it is questionable whether the previously parsed mailbox should
be treated as invalid in addition to the trailing garbage.
Address both of the above by wrapping the trailing garbage in a new
Address with a single invalid-mailbox, and append it to the AddressList
directly.
Changes the results of the
test_get_address_list_mailboxes_invalid_addresses test, where the
address list is now parsed into 4 mailboxes instead of 3 (all but the
first one are invalid).
(cherry picked from commit b413bc7a1f0946f734d9660239b4e2e8ddc48522)
[3.14] gh-149122: Fix segfault in compiler when certain builtin functions are passed a coroutine as arg (GH-149138) (#149151)
(cherry picked from commit 16f292ef4e8c56bfd115ecdb91420c7b4006249f)
[3.14] gh-148529: Minor improvements of the struct module documentation (GH-148565) (GH-149063)
* Document that 's' and 'p' accept bytes and bytearray.
* Fix some footnotes.
* Clarify that "string" is a byte string.
* Fix the module docstring.
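A brief illustration of the first and third points in the list above: the 's' and 'p' formats take byte strings (bytes or bytearray), not str.

```python
import struct

# 's' accepts a bytearray as well as bytes.
packed_s = struct.pack("5s", bytearray(b"hello"))
# 'p' is a Pascal string: the first byte stores the length.
packed_p = struct.pack("4p", b"abc")

print(packed_s)                       # b'hello'
print(packed_p)                       # b'\x03abc'
print(struct.unpack("4p", packed_p))  # (b'abc',)

try:
    struct.pack("5s", "hello")  # a str is not a byte string
except struct.error as exc:
    print("rejected:", exc)
```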
(cherry picked from commit 3e5a3cb2bd222f97f793b01bc1c0f7bb62aefc31)
Gregory P. Smith [Mon, 27 Apr 2026 16:17:30 +0000 (09:17 -0700)]
[3.14] Improve `hash()` builtin docstring with caveats. (GH-125229) (#149054)
Mention its return type and that the value can be expected to change between
processes (hash randomization).
Why? The `hash` builtin gets reached for and used by a lot of people whether it
is the right tool or not. IDEs surface docstrings and people use pydoc and
`help(hash)`.
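The per-process variation mentioned above can be observed with a small sketch. `hash_in_child` is a hypothetical helper; it pins `PYTHONHASHSEED` to two different values so the difference is reproducible, whereas by default each process picks its own random seed.

```python
import os
import subprocess
import sys


def hash_in_child(text, seed):
    """Return hash(text) as computed by a fresh interpreter with the given hash seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", f"print(hash({text!r}))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout)


first = hash_in_child("abc", 1)
second = hash_in_child("abc", 2)
# hash() returns an int, and the str hash depends on the process's hash seed.
print(type(first).__name__, first != second)
```

For identifiers that must stay stable across processes, a digest from `hashlib` is usually the better fit.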
(cherry picked from commit 665b7dfcfa240e02760f58bed5ca29ec01d028e6)
[3.14] Document that multiprocessing treats local same-user processes as trusted (GH-149001) (#149033)
Clarify in the Authentication keys section that the authkey handshake
covers Listener/Client (addressable endpoints) only, not the anonymous
pipes behind Pipe() and Queue, and that isolation between same-user
processes must be arranged at the OS level.
(cherry picked from commit f27e91e37212f148b8fe72a3656a69b242625622)
Co-authored-by: Gregory P. Smith <68491+gpshead@users.noreply.github.com>
Co-authored-by: Neil Schemenauer <nas@arctrix.com>
Co-authored-by: Sergey Miryanov <sergey.miryanov@gmail.com>
Co-authored-by: Zanie Blue <contact@zanie.dev>
Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
Co-authored-by: Neil Schemenauer <nas-github@arctrix.com>
[3.14] gh-146455: Fix O(N²) in add_const() after constant folding moved to CFG (GH-146456) (#149011)
The add_const() function in flowgraph.c uses a linear search over the
consts list to find the index of a constant. After gh-126835 moved
constant folding from the AST optimizer to the CFG optimizer, this
function is now called N times for N inner tuple elements during
fold_tuple_of_constants(), resulting in O(N²) total time.
Fix by maintaining an auxiliary _Py_hashtable_t that maps object
pointers to their indices in the consts list, providing O(1) lookup.
For a file with 100,000 constant 2-tuples:
- Before: 10.38s (add_const occupies 83.76% of CPU time)
- After: 1.48s
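The idea behind the fix can be sketched in Python, with a dict keyed by `id()` standing in for the `_Py_hashtable_t` of object pointers. Both function names are hypothetical, and identity-based lookup is an assumption taken from the "maps object pointers" wording above.

```python
def add_const_linear(consts, obj):
    """Linear search: O(len(consts)) per call, O(N^2) over N insertions."""
    for index, existing in enumerate(consts):
        if existing is obj:
            return index
    consts.append(obj)
    return len(consts) - 1


def add_const_indexed(consts, index_of, obj):
    """Auxiliary table from object identity to index: O(1) expected per call."""
    key = id(obj)  # valid as a key while consts keeps obj alive
    index = index_of.get(key)
    if index is None:
        index = len(consts)
        consts.append(obj)
        index_of[key] = index
    return index


objects = [(i, i + 1) for i in range(1000)]
consts, index_of = [], {}
indices = [add_const_indexed(consts, index_of, obj) for obj in objects]
# Adding an object a second time returns its existing index.
print(add_const_indexed(consts, index_of, objects[10]))  # 10
```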
(cherry picked from commit 5d416324c56cd6f262fa123f41b97b48631bea79)
[3.14] gh-141473: Speed up subprocess test_communicate_timeout_large_input long tail (GH-149003) (#149004)
Replace the slow reader's 30s sleep with a parent-driven wake over a
loopback socket so post-timeout communicate() doesn't block waiting
for the child to wake on its own. Worst-case runtime: ~30s -> <1s.
(cherry picked from commit e1384cfd25b4fba5e0f8f3e6b536930e2e6cf5cf)
Co-authored-by: Gregory P. Smith <68491+gpshead@users.noreply.github.com>
Sam Gross [Thu, 23 Apr 2026 19:12:19 +0000 (15:12 -0400)]
[3.14] gh-113956: Make intern_common thread-safe in free-threaded build (gh-148886) (#148927)
Avoid racing with the owning thread's refcount operations when
immortalizing an interned string: if we don't own it and its refcount
isn't merged, intern a copy we own instead. Use atomic stores in
_Py_SetImmortalUntracked so concurrent atomic reads are race-free.
[3.14] gh-119180: Document the `format` parameter in `typing.get_type_hints()` (GH-143758) (#148901)
Do not mention `__annotations__` dictionaries, as this is slightly
outdated since 3.14.
Rewrite the note about possible exceptions for clarity. Also do not
mention imported type aliases, since aliases created with the `type`
statement (3.12 and later) no longer suffer from this limitation.
(cherry picked from commit 8bf99ae3a9f12d105a70d6fda93dddde4adeee8f)
[3.14] gh-142965: Fix Concatenate documentation to reflect valid use cases (GH-143316) (#148899)
The documentation previously stated that Concatenate is only valid
when used as the first argument to Callable, but according to PEP 612,
it can also be used when instantiating user-defined generic classes
with ParamSpec parameters.
(cherry picked from commit 75ff1afcb6a1bb2b3d54899e9b222a61798fa491)
Co-authored-by: John Seong <39040639+sandole@users.noreply.github.com>