[3.13] gh-119118: Fix performance regression in tokenize module (GH-119615) (#119682)
- Cache line object to avoid creating a Unicode object
for all of the tokens in the same line.
- Speed up byte offset to column offset conversion by using the
smallest buffer possible to measure the difference.
The fix in gh-119561 introduced an assertion that doesn't hold true if any of the three new test extension modules are loaded more than once. This is fine normally but breaks if the new test_check_state_first() is run more than once, which happens for refleak checking and with the regrtest --forever flag. We fix that here by clearing each of the three modules after loading them. We also tweak a check in _modules_by_index_check().
[3.13] gh-119560: Drop an Invalid Assert in PyState_FindModule() (gh-119561) (gh-119632)
The assertion was added in gh-118532 but was based on the invalid assumption that PyState_FindModule() would only be called with an already-initialized module def. I've added a test to make sure we don't make that assumption again.
[3.13] gh-111997: Fix argument count for LINE event and clarify type of argument counts. (GH-119179) (GH-119575)
gh-111997: Fix argument count for LINE event and clarify type of argument counts. (GH-119179)
(cherry picked from commit 70b07aa4153c1a914a3d69307d5b258cf7ed16ab)
[3.13] GH-119496: accept UTF-8 BOM in .pth files (GH-119508)
`Out-File -Encoding utf8` and similar commands in Windows Powershell 5.1 emit
UTF-8 with a BOM marker, which the regular `utf-8` codec decodes incorrectly.
`utf-8-sig` accepts a BOM, but also works correctly without one.
This change also makes .pth files match the way Python source files are handled.
[3.13] gh-69214: Fix fcntl.ioctl() request type (GH-119498) (#119504)
gh-69214: Fix fcntl.ioctl() request type (GH-119498)
Use an 'unsigned long' instead of an 'unsigned int' for the request
parameter of fcntl.ioctl() to support requests larger than UINT_MAX.
(cherry picked from commit 92fab3356f4c61d4c73606e4fae705c6d8f6213b)
Co-authored-by: Victor Stinner <vstinner@python.org>
Fix ThreadedVSOCKSocketStreamTest: if get_cid() returns the host
address or the "any" address, use the local communication address
(loopback): VMADDR_CID_LOCAL.
On Linux 6.9, apparently, the /dev/vsock device is now available but
get_cid() returns VMADDR_CID_ANY (-1).
[3.13] gh-118727: Don't drop the GIL in `drop_gil()` unless the current thread holds it (GH-118745) (#119474)
`drop_gil()` assumes that its caller is attached, which means that the current
thread holds the GIL if and only if the GIL is enabled, and the enabled-state
of the GIL won't change. This isn't true, though, because `detach_thread()`
calls `_PyEval_ReleaseLock()` after detaching and
`_PyThreadState_DeleteCurrent()` calls it after removing the current thread
from consideration for stop-the-world requests (effectively detaching it).
Fix this by remembering whether or not a thread acquired the GIL when it last
attached, in `PyThreadState._status.holds_gil`, and check this in `drop_gil()`
instead of `gil->enabled`.
gh-117505: Run ensurepip in isolated env in Windows installer (GH-118257)
ensurepip forks a subprocess to run pip itself, but that subprocess only inherits a -I isolated mode flag (see _run_pip() in Lib/ensurepip/__init__.py), not the "-E -s" flags that the installer has been using. This means that parts of ensurepip don't actually run in an isolated environment and can make incorrect decisions based on packages installed in the user site-packages.
(cherry picked from commit c9073eb1a99606df1efeb8959e9f11a8ebc23ae2)
Co-authored-by: Michael Vincent <377567+Vynce@users.noreply.github.com>
[3.13] gh-119247: Add macros to use PySequence_Fast safely in free-threaded build (GH-119315) (#119419)
Add `Py_BEGIN_CRITICAL_SECTION_SEQUENCE_FAST` and
`Py_END_CRITICAL_SECTION_SEQUENCE_FAST` macros and update `str.join` to use
them. Also add a regression test that would crash reliably without this
patch.
(cherry picked from commit baf347d91643a83483bae110092750d39471e0c2)
[3.13] gh-119213: Be More Careful About _PyArg_Parser.kwtuple Across Interpreters (gh-119331) (gh-119410)
_PyArg_Parser holds static global data generated for modules by Argument Clinic. The _PyArg_Parser.kwtuple field is a tuple object, even though it's stored within a static global. In some cases the tuple is statically allocated and thus it's okay that it gets shared by multiple interpreters. However, in other cases the tuple is set lazily, allocated from the heap using the active interprepreter at the point the tuple is needed.
This is a problem once that interpreter is destroyed since _PyArg_Parser.kwtuple becomes at dangling pointer, leading to crashes. It isn't a problem if the tuple is allocated under the main interpreter, since its lifetime is bound to the lifetime of the runtime. The solution here is to temporarily switch to the main interpreter. The alternative would be to always statically allocate the tuple.
This change also fixes a bug where only the most recent parser was added to the global linked list.
[3.13] gh-118643: Fix AttributeError in the email module (GH-119099) (GH-119389)
Fix regression introduced in gh-100884: AttributeError when re-fold a long
address list.
Also fix more cases of incorrect encoding of the address separator in the
address list missed in gh-100884.
(cherry picked from commit 858b9e85fcdd495947c9e892ce6e3734652c48f2)
[3.13] GH-119292: Add job to JIT CI to build and test with --disable-gil (GH-119314)
(cherry picked from commit c4722cd0573c83aaa52b63a27022b9048a949f54) Co-authored-by: Savannah Ostrowski <savannahostrowski@gmail.com> Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
[3.13] gh-119102: Fix REPL for dumb terminal (GH-119269) (#119308)
gh-119102: Fix REPL for dumb terminal (GH-119269)
Use CAN_USE_PYREPL of _pyrepl.__main__ in the site module to decide
if _pyrepl.write_history_file() can be used.
(cherry picked from commit 73f4a58d36b65ec650e8f00b2affc4a4d3195f0c)
Co-authored-by: Victor Stinner <vstinner@python.org>
[3.13] Use `fail-fast: false` in `mypy.yml` (GH-119297) (#119304)
Use `fail-fast: false` in `mypy.yml` (GH-119297)
See docs about this setting: https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actionsGH-jobsjob_idstrategyfail-fast
(cherry picked from commit b36533290608aed757f6eb16869a679650d32e17)
[3.13] gh-74929: PEP 667 general docs update (gh-119291)
* expand on What's New entry for PEP 667 (including porting notes)
* define 'optimized scope' as a glossary term
* cover comprehensions and generator expressions in locals() docs
* review all mentions of "locals" in documentation (updating if needed)
* review all mentions of "f_locals" in documentation (updating if needed)
[3.13] gh-119050: Add XML support to libregrtest refleak checker (GH-119148) (#119270)
gh-119050: Add XML support to libregrtest refleak checker (GH-119148)
regrtest test runner: Add XML support to the refleak checker
(-R option).
* run_unittest() now stores XML elements as string, rather than
objects, in support.junit_xml_list.
* runtest_refleak() now saves/restores XML strings before/after
checking for reference leaks. Save XML into a temporary file.
(cherry picked from commit 9257731f5d3e9d4f99e314b23a14506563e167d7)
Co-authored-by: Victor Stinner <vstinner@python.org>
[3.13] gh-92081: Fix for email.generator.Generator with whitespace between encoded words. (GH-92281) (#119245)
* Fix for email.generator.Generator with whitespace between encoded words.
email.generator.Generator currently does not handle whitespace between
encoded words correctly when the encoded words span multiple lines. The
current generator will create an encoded word for each line. If the end
of the line happens to correspond with the end real word in the
plaintext, the generator will place an unencoded space at the start of
the subsequent lines to represent the whitespace between the plaintext
words.
A compliant decoder will strip all the whitespace from between two
encoded words which leads to missing spaces in the round-tripped
output.
The fix for this is to make sure that whitespace between two encoded
words ends up inside of one or the other of the encoded words. This
fix places the space inside of the second encoded word.
A second problem happens with continuation lines. A continuation line that
starts with whitespace and is followed by a non-encoded word is fine because
the newline between such continuation lines is defined as condensing to
a single space character. When the continuation line starts with whitespace
followed by an encoded word, however, the RFCs specify that the word is run
together with the encoded word on the previous line. This is because normal
words are filded on syntactic breaks by encoded words are not.
The solution to this is to add the whitespace to the start of the encoded word
on the continuation line.
[3.13] DOCS: Suggest always calling exec with a globals argument and no locals argument (GH-119235) (#119239)
DOCS: Suggest always calling exec with a globals argument and no locals argument (GH-119235)
Many users think they want a locals argument for various reasons but they do not
understand that it makes code be treated as a class definition. They do not want
their code treated as a class definition and get surprised. The reason not
to pass locals specifically is that the following code raises a `NameError`:
```py
exec("""
def f():
print("hi")
f()
def g():
f()
g()
""", {}, {})
```
The reason not to leave out globals is as follows:
[3.13] marshal docs: Remove reference to "Sun" (GH-119161) (#119167)
Nobody has been using a Sun machine for a long time. When I saw
this sentence in a lightning talk just now, I thought it was talking
about sending Python code on a spacecraft.
(cherry picked from commit 697465ff88e49d98443025474e5b534adfba2cb0)