Follow on from https://github.com/python/cpython/pull/103116.
Expand list of clean docs files from 3 to 181. These files have no Sphinx warnings, and their presence in this list means that any new warnings introduced will fail the build.
The list was created by subtracting the list of files with warnings from a list of all files.
I tested with all of those, but found that `touch`ing two clean files (https://github.com/python/cpython/blob/main/Doc/includes/wasm-notavail.rst and https://github.com/python/cpython/blob/main/Doc/whatsnew/changelog.rst) caused a cascade effect and resulted in a number of dirty files being rebuilt too, and failing the build. So those two have been omitted.
Eric Snow [Wed, 29 Mar 2023 23:15:43 +0000 (17:15 -0600)]
gh-100227: Make the Global PyModuleDef Cache Safe for Isolated Interpreters (gh-103084)
Sharing mutable (or non-immortal) objects between interpreters is generally not safe. We can work around that but not easily.
There are two restrictions that are critical for objects that break interpreter isolation.
The first is that the object's state be guarded by a global lock. For now the GIL meets this requirement, but a granular global lock is needed once we have a per-interpreter GIL.
The second restriction is that the object (and, for a container, its items) be deallocated/resized only when the interpreter in which it was allocated is the current one. This is because every interpreter has (or will have, see gh-101660) its own object allocator. Deallocating an object with a different allocator can cause crashes.
The dict for the cache of module defs is completely internal, which simplifies what we have to do to meet those requirements. To do so, we do the following:
* add a mechanism for re-using a temporary thread state tied to the main interpreter in an arbitrary thread
* add _PyRuntime.imports.extensions.main_tstate`
* add _PyThreadState_InitDetached() and _PyThreadState_ClearDetached() (pystate.c)
* add _PyThreadState_BindDetached() and _PyThreadState_UnbindDetached() (pystate.c)
* make sure the cache dict (_PyRuntime.imports.extensions.dict) and its items are all owned by the main interpreter)
* add a placeholder using for a granular global lock
Note that the cache is only used for legacy extension modules and not for multi-phase init modules.
Brett Cannon [Wed, 29 Mar 2023 20:28:08 +0000 (13:28 -0700)]
GH-102973: add a dev container (GH-102975)
On content update, builds `python` and the docs. Also adds a Dockerfile that should include everything but autoconf 2.69 that's necessary to build CPython and the entire stdlib on Fedora.
Co-authored-by: Ronald Oussoren <ronaldoussoren@mac.com> Co-authored-by: Dusty Phillips <dusty@phillips.codes>
Steve Dower [Tue, 28 Mar 2023 23:47:13 +0000 (00:47 +0100)]
gh-103097: Add workaround for Windows ARM64 compiler bug (GH-103098)
See https://developercommunity.visualstudio.com/t/Regression-in-MSVC-1433-1434-ARM64-co/10224361 for details of the issue. It only applies to version 14.34.
Eric Snow [Tue, 28 Mar 2023 18:52:28 +0000 (12:52 -0600)]
gh-100227: Move the Dict of Interned Strings to PyInterpreterState (gh-102339)
We can revisit the options for keeping it global later, if desired. For now the approach seems quite complex, so we've gone with the simpler isolation solution in the meantime.
Chenxi Mao [Tue, 28 Mar 2023 08:52:22 +0000 (16:52 +0800)]
GH-102711: Fix warnings found by clang (#102712)
There are some warnings if build python via clang:
Parser/pegen.c:812:31: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
_PyPegen_clear_memo_statistics()
^
void
Parser/pegen.c:820:29: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
_PyPegen_get_memo_statistics()
^
void
This approach to keeping the interned strings safe is turning out to be too complex for my taste (due to obmalloc isolation). For now I'm going with the simpler solution, making the dict per-interpreter. We can revisit that later if we want a sharing solution.
Shantanu [Sat, 25 Mar 2023 21:40:11 +0000 (14:40 -0700)]
gh-98886: Fix issues with dataclass fields with special underscore names (#102032)
This commit prefixes `__dataclass` to several things in the locals dict:
- Names like `_dflt_` (which cause trouble, see first test)
- Names like `_type_` (not known to be able to cause trouble)
- `_return_type` (not known to able to cause trouble)
- `_HAS_DEFAULT_FACTORY` (which causes trouble, see second test)
In addition, this removes `MISSING` from the locals dict. As far as I can tell, this wasn't needed even in the initial implementation of dataclasses.py (and tests on that version passed with it removed). This makes me wary :-)
This is basically a continuation of #96151, where fixing this was welcomed in https://github.com/python/cpython/pull/98143#issuecomment-1280306360
David Benjamin [Fri, 24 Mar 2023 13:04:30 +0000 (09:04 -0400)]
gh-100372: Use BIO_eof to detect EOF for SSL_FILETYPE_ASN1 (GH-100373)
In PEM, we need to parse until error and then suppress `PEM_R_NO_START_LINE`, because PEM allows arbitrary leading and trailing data. DER, however, does not. Parsing until error and suppressing `ASN1_R_HEADER_TOO_LONG` doesn't quite work because that error also covers some cases that should be rejected.
Instead, check `BIO_eof` early and stop the loop that way.
Eric Snow [Thu, 23 Mar 2023 00:30:04 +0000 (18:30 -0600)]
gh-100227: Make the Global Interned Dict Safe for Isolated Interpreters (gh-102925)
This is effectively two changes. The first (the bulk of the change) is where we add _Py_AddToGlobalDict() (and _PyRuntime.cached_objects.main_tstate, etc.). The second (much smaller) change is where we update PyUnicode_InternInPlace() to use _Py_AddToGlobalDict() instead of calling PyDict_SetDefault() directly.
Basically, _Py_AddToGlobalDict() is a wrapper around PyDict_SetDefault() that should be used whenever we need to add a value to a runtime-global dict object (in the few cases where we are leaving the container global rather than moving it to PyInterpreterState, e.g. the interned strings dict). _Py_AddToGlobalDict() does all the necessary work to make sure the target global dict is shared safely between isolated interpreters. This is especially important as we move the obmalloc state to each interpreter (gh-101660), as well as, potentially, the GIL (PEP 684).
gh-102780: Fix uncancel() call in asyncio timeouts (#102815)
Also use `raise TimeOut from <CancelledError instance>` so that the CancelledError is set
in the `__cause__` field rather than in the `__context__` field.
Co-authored-by: Guido van Rossum <gvanrossum@gmail.com> Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
David Benjamin [Wed, 22 Mar 2023 12:16:26 +0000 (08:16 -0400)]
GH-95494: Fix transport EOF handling in OpenSSL 3.0 (GH-95495)
GH-25309 enabled SSL_OP_IGNORE_UNEXPECTED_EOF by default, with a comment
that it restores OpenSSL 1.1.1 behavior, but this wasn't quite right.
That option causes OpenSSL to treat transport EOF as the same as
close_notify (i.e. SSL_ERROR_ZERO_RETURN), whereas Python actually has
distinct SSLEOFError and SSLZeroReturnError exceptions. (The latter is
usually mapped to a zero return from read.) In OpenSSL 1.1.1, the ssl
module would raise them for transport EOF and close_notify,
respectively. In OpenSSL 3.0, both act like close_notify.
Fix this by, instead, just detecting SSL_R_UNEXPECTED_EOF_WHILE_READING
and mapping that to the other exception type.
There doesn't seem to have been any unit test of this error, so fill in
the missing one. This had to be done with the BIO path because it's
actually slightly tricky to simulate a transport EOF with Python's fd
based APIs. (If you instruct the server to close the socket, it gets
confused, probably because the server's SSL object is still referencing
the now dead fd?)
Eric Snow [Tue, 21 Mar 2023 20:01:38 +0000 (14:01 -0600)]
gh-94673: Isolate the _io module to Each Interpreter (gh-102663)
Aside from sys and builtins, _io is the only core builtin module that hasn't been ported to multi-phase init. We may do so later (e.g. gh-101948), but in the meantime we must at least take care of the module's static types properly. (This came up while working on gh-101660.)
Eric Snow [Tue, 21 Mar 2023 18:47:55 +0000 (12:47 -0600)]
gh-98608: Fix Failure-handling in new_interpreter() (gh-102658)
The error-handling code in new_interpreter() has been broken for a while. We hadn't noticed because those code mostly doesn't fail. (I noticed while working on gh-101660.) The problem is that we try to clear/delete the newly-created thread/interpreter using itself, which just failed. The solution is to switch back to the calling thread state first.
Eric Snow [Tue, 21 Mar 2023 17:46:09 +0000 (11:46 -0600)]
gh-102304: Move the Total Refcount to PyInterpreterState (gh-102545)
Moving it valuable with a per-interpreter GIL. However, it is also useful without one, since it allows us to identify refleaks within a single interpreter or where references are escaping an interpreter. This becomes more important as we move the obmalloc state to PyInterpreterState.
Eric Snow [Tue, 21 Mar 2023 16:49:12 +0000 (10:49 -0600)]
gh-98608: Stop Treating All Errors from _Py_NewInterpreterFromConfig() as Fatal (gh-102657)
Prior to this change, errors in _Py_NewInterpreterFromConfig() were always fatal. Instead, callers should be able to handle such errors and keep going. That's what this change supports. (This was an oversight in the original implementation of _Py_NewInterpreterFromConfig().) Note that the existing [fatal] behavior of the public Py_NewInterpreter() is preserved.