Victor Stinner [Tue, 3 Nov 2020 17:07:15 +0000 (18:07 +0100)]
bpo-41796: Call _PyAST_Fini() earlier to fix a leak (GH-23131)
Call _PyAST_Fini() on all interpreters, not only on the main
interpreter. Also, call it ealier to fix a reference leak.
Python types contain a reference to themselves in in their
PyTypeObject.tp_mro member. _PyAST_Fini() must called before the last
GC collection to destroy AST types.
_PyInterpreterState_Clear() now calls _PyAST_Fini(). It now also
calls _PyWarnings_Fini() on subinterpeters, not only on the main
interpreter.
Add an assertion in AST init_types() to ensure that the _ast module
is no longer used after _PyAST_Fini() has been called.
Victor Stinner [Mon, 2 Nov 2020 22:17:46 +0000 (23:17 +0100)]
bpo-26789: Fix logging.FileHandler._open() at exit (GH-23053)
The logging.FileHandler class now keeps a reference to the builtin
open() function to be able to open or reopen the file during Python
finalization.
Fix errors like:
Exception ignored in: (...)
Traceback (most recent call last):
(...)
File ".../logging/__init__.py", line 1463, in error
File ".../logging/__init__.py", line 1577, in _log
File ".../logging/__init__.py", line 1587, in handle
File ".../logging/__init__.py", line 1649, in callHandlers
File ".../logging/__init__.py", line 948, in handle
File ".../logging/__init__.py", line 1182, in emit
File ".../logging/__init__.py", line 1171, in _open
NameError: name 'open' is not defined
bpo-42103: Improve validation of Plist files. (GH-22882)
* Prevent some possible DoS attacks via providing invalid Plist files
with extremely large number of objects or collection sizes.
* Raise InvalidFileException for too large bytes and string size instead of returning garbage.
* Raise InvalidFileException instead of ValueError for specific invalid datetime (NaN).
* Raise InvalidFileException instead of TypeError for non-hashable dict keys.
* Add more tests for invalid Plist files.
Victor Stinner [Mon, 2 Nov 2020 15:49:54 +0000 (16:49 +0100)]
bpo-42236: Enhance init and encoding documentation (GH-23109)
Enhance the documentation of the Python startup, filesystem encoding
and error handling, locale encoding. Add a new "Python UTF-8 Mode"
section.
* Add "locale encoding" and "filesystem encoding and error handler"
to the glossary
* Remove documentation from Include/cpython/initconfig.h: move it to
Doc/c-api/init_config.rst.
* Doc/c-api/init_config.rst:
* Document command line options and environment variables
* Document default values.
* Add a new "Python UTF-8 Mode" section in Doc/library/os.rst.
* Add warnings to Py_DecodeLocale() and Py_EncodeLocale() docs.
* Document how Python selects the filesystem encoding and error
handler at a single place: PyConfig.filesystem_encoding and
PyConfig.filesystem_errors.
* PyConfig: move orig_argv member at the right place.
Julien Danjou [Mon, 2 Nov 2020 14:16:25 +0000 (15:16 +0100)]
bpo-41435: Add sys._current_exceptions() function (GH-21689)
This adds a new function named sys._current_exceptions() which is equivalent ot
sys._current_frames() except that it returns the exceptions currently handled
by other threads. It is equivalent to calling sys.exc_info() for each running
thread.
Jakub Stasiak [Mon, 2 Nov 2020 10:56:35 +0000 (11:56 +0100)]
bpo-42230: Improve asyncio documentation regarding accepting sets vs iterables (GH-23073)
People call wait() and as_completed() with various non-set iterables,
a list should be the most common but there are others as well[1].
Considering typeshed also documents wait()[2] and as_completed()[3]
as accepting arbitrary iterables I think it's a good idea to document
the status quo better.
Joongi Kim [Mon, 2 Nov 2020 08:02:48 +0000 (17:02 +0900)]
bpo-41229: Update docs for explicit aclose()-required cases and add contextlib.aclosing() method (GH-21545)
This is a PR to:
* Add `contextlib.aclosing` which ia analogous to `contextlib.closing` but for async-generators with an explicit test case for [bpo-41229]()
* Update the docs to describe when we need explicit `aclose()` invocation.
which are motivated by the following issues, articles, and examples:
Particuarly regarding [PEP-533](https://www.python.org/dev/peps/pep-0533/), its acceptance (`__aiterclose__()`) would make this little addition of `contextlib.aclosing()` unnecessary for most use cases, but until then this could serve as a good counterpart and analogy to `contextlib.closing()`. The same applies for `contextlib.closing` with `__iterclose__()`.
Also, still there are other use cases, e.g., when working with non-generator objects with `aclose()` methods.
Tal Einat [Mon, 2 Nov 2020 03:59:52 +0000 (05:59 +0200)]
bpo-40511: Stop unwanted flashing of IDLE calltips (GH-20910)
They were occurring with both repeated 'force-calltip' invocations and by typing parentheses
in expressions, strings, and comments in the argument code.
Co-authored-by: Terry Jan Reedy <tjreedy@udel.edu>
bpo-37193: remove thread objects which finished process its request (GH-13893)
* bpo-37193: remove the thread which finished process request from threads list
* rename variable t to thread.
* don't remove thread from list if it is daemon.
* use lock to protect self._threads.
* use finally block in case of exception from shutdown_request().
* check "not thread.daemon" before lock to avoid holding the lock if it's unnecessary.
* fix the place of _threads_lock.
* separate code to remove a current thread into a function.
* check ValueError when removing thread.
* fix wrong code which all instance shared same lock.
* Extract thread management into a _Threads class to encapsulate atomic operations and separate concerns.
* Replace multiple references of 'block_on_close' with one, avoiding the possibility that 'block_on_close' could change during the course of processing requests. Now, there's exactly one _threads object with behavior fixed for the duration.
* Add docstrings to private classes.
* Add test to ensure that a ThreadingTCPServer can be closed without serving any requests.
* Use _NoThreads as the default value. Fixes AttributeError when server is closed without serving any requests.
* Add blurb
* Add test capturing failure.
Co-authored-by: Jason R. Coombs <jaraco@jaraco.com>
Victor Stinner [Sun, 1 Nov 2020 22:07:23 +0000 (23:07 +0100)]
bpo-42236: Use UTF-8 encoding if nl_langinfo(CODESET) fails (GH-23086)
If the nl_langinfo(CODESET) function returns an empty string, Python
now uses UTF-8 as the filesystem encoding.
In May 2010 (commit b744ba1d14c5487576c95d0311e357b707600b47), I
modified Python to log a warning and use UTF-8 as the filesystem
encoding (instead of None) if nl_langinfo(CODESET) returns an empty
string.
In August 2020 (commit 94908bbc1503df830d1d615e7b57744ae1b41079), I
modified Python startup to fail with a fatal error and a specific
error message if nl_langinfo(CODESET) returns an empty string. The
intent was to prevent guessing the encoding and also investigate user
configuration where this case happens.
In 10 years (2010 to 2020), I saw zero user report about the error
message related to nl_langinfo(CODESET) returning an empty string.
Today, UTF-8 became the defacto standard and it's safe to make the
assumption that the user expects UTF-8. For example,
nl_langinfo(CODESET) can return an empty string on macOS if the
LC_CTYPE locale is not supported, and UTF-8 is the default encoding
on macOS.
While this change is likely to not affect anyone in practice, it
should make UTF-8 lover happy ;-)
Rewrite also the documentation explaining how Python selects the
filesystem encoding and error handler.
* Rename _Py_GetLocaleEncoding() to _Py_GetLocaleEncodingObject()
* Add _Py_GetLocaleEncoding() which returns a wchar_t* string to
share code between _Py_GetLocaleEncodingObject()
and config_get_locale_encoding().
* _Py_GetLocaleEncodingObject() now decodes nl_langinfo(CODESET)
from the current locale encoding with surrogateescape,
rather than using UTF-8.
Ronald Oussoren [Sun, 1 Nov 2020 09:08:48 +0000 (10:08 +0100)]
bpo-29566: binhex.binhex now consitently writes MacOS 9 line endings. (GH-23059)
[bpo-29566]() notes that binhex.binhex uses inconsistent line endings (both Unix and MacOS9 line endings are used). This PR changes this to use the MacOS9 line endings everywhere.
Alexey Izbyshev [Sun, 1 Nov 2020 05:33:08 +0000 (08:33 +0300)]
bpo-42146: Unify cleanup in subprocess_fork_exec() (GH-22970)
* bpo-42146: Unify cleanup in subprocess_fork_exec()
Also ignore errors from _enable_gc():
* They are always suppressed by the current code due to a bug.
* _enable_gc() is only used if `preexec_fn != None`, which is unsafe.
* We don't have a good way to handle errors in case we successfully
created a child process.
Co-authored-by: Gregory P. Smith <greg@krypto.org>
bpo-42218: Correctly handle errors in left-recursive rules (GH-23065)
Left-recursive rules need to check for errors explicitly, since
even if the rule returns NULL, the parsing might continue and lead
to long-distance failures.
Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
* Add a new _locale._get_locale_encoding() function to get the
current locale encoding.
* Modify locale.getpreferredencoding() to use it.
* Remove the _bootlocale module.
Victor Stinner [Fri, 30 Oct 2020 21:51:02 +0000 (22:51 +0100)]
bpo-42208: Call GC collect earlier in PyInterpreterState_Clear() (GH-23044)
The last GC collection is now done before clearing builtins and sys
dictionaries. Add also assertions to ensure that gc.collect() is no
longer called after _PyGC_Fini().
Pass also the tstate to PyInterpreterState_Clear() to pass the
correct tstate to _PyGC_CollectNoFail() and _PyGC_Fini().
Eric W [Fri, 30 Oct 2020 04:56:28 +0000 (05:56 +0100)]
bpo-42160: tempfile: Reduce overhead of pid check. (GH-22997)
The _RandomSequence class in tempfile used to check the current pid every time its rng property was used.
This commit replaces this code with `os.register_at_fork` to reduce the overhead.
Victor Stinner [Tue, 27 Oct 2020 20:34:33 +0000 (21:34 +0100)]
bpo-42161: Remove private _PyLong_Zero and _PyLong_One (GH-23003)
Use PyLong_FromLong(0) and PyLong_FromLong(1) of the public C API
instead. For Python internals, _PyLong_GetZero() and _PyLong_GetOne()
of pycore_long.h can be used.
Removed the unicodedata.ucnhash_CAPI attribute which was an internal
PyCapsule object. The related private _PyUnicode_Name_CAPI structure
was moved to the internal C API.
Rename unicodedata.ucnhash_CAPI as unicodedata._ucnhash_CAPI.
Georges Toth [Tue, 27 Oct 2020 00:31:06 +0000 (01:31 +0100)]
bpo-30681: Support invalid date format or value in email Date header (GH-22090)
I am re-submitting an older PR which was abandoned but is still relevant, #10783 by @timb07.
The issue being solved () is still relevant. The original PR #10783 was closed as
the final request changes were not applied and since abandoned.
In this new PR I have re-used the original patch plus applied both comments from the review, by @maxking and @pganssle.
For reference, here is the original PR description:
In email.utils.parsedate_to_datetime(), a failure to parse the date, or invalid date components (such as hour outside 0..23) raises an exception. Document this behaviour, and add tests to test_email/test_utils.py to confirm this behaviour.
In email.headerregistry.DateHeader.parse(), check when parsedate_to_datetime() raises an exception and add a new defect InvalidDateDefect; preserve the invalid value as the string value of the header, but set the datetime attribute to None.
Add tests to test_email/test_headerregistry.py to confirm this behaviour; also added test to test_email/test_inversion.py to confirm emails with such defective date headers round trip successfully.
This pull request incorporates feedback gratefully received from @bitdancer, @brettcannon, @Mariatta and @warsaw, and replaces the earlier PR #2254.
bpo-42123: Run the parser two times and only enable invalid rules on the second run (GH-22111)
* Implement running the parser a second time for the errors messages
The first parser run is only responsible for detecting whether
there is a `SyntaxError` or not. If there isn't the AST gets returned.
Otherwise, the parser is run a second time with all the `invalid_*`
rules enabled so that all the customized error messages get produced.
Victor Stinner [Mon, 26 Oct 2020 15:43:47 +0000 (16:43 +0100)]
bpo-1635741: _PyUnicode_Name_CAPI moves to internal C API (GH-22713)
The private _PyUnicode_Name_CAPI structure of the PyCapsule API
unicodedata.ucnhash_CAPI moves to the internal C API. Moreover, the
structure gets a new state member which must be passed to the
getcode() and getname() functions.
* Move Include/ucnhash.h to Include/internal/pycore_ucnhash.h
* unicodedata module is now built with Py_BUILD_CORE_MODULE.
* unicodedata: move hashAPI variable into unicodedata_module_state.
Serhiy Storchaka [Mon, 26 Oct 2020 06:43:39 +0000 (08:43 +0200)]
bpo-42006: Stop using PyDict_GetItem, PyDict_GetItemString and _PyDict_GetItemId. (GH-22648)
These functions are considered not safe because they suppress all internal errors
and can return wrong result. PyDict_GetItemString and _PyDict_GetItemId can
also silence current exception in rare cases.
Remove no longer used _PyDict_GetItemId.
Add _PyDict_ContainsId and rename _PyDict_Contains into
_PyDict_Contains_KnownHash.
Alexey Izbyshev [Mon, 26 Oct 2020 00:09:32 +0000 (03:09 +0300)]
bpo-42146: Fix memory leak in subprocess.Popen() in case of uid/gid overflow (GH-22966)
Fix memory leak in subprocess.Popen() in case of uid/gid overflow
Also add a test that would catch this leak with `--huntrleaks`.
Alas, the test for `extra_groups` also exposes an inconsistency
in our error reporting: we use a custom ValueError for `extra_groups`,
but propagate OverflowError for `user` and `group`.
Gregory P. Smith [Sat, 24 Oct 2020 19:07:35 +0000 (12:07 -0700)]
bpo-35823: Allow setsid() after vfork() on Linux. (GH-22945)
It should just be a syscall updating a couple of fields in the kernel side
process info. Confirming, in glibc is appears to be a shim for the setsid
syscall (based on not finding any code implementing anything special for it)
and in uclibc (*much* easier to read) it is clearly just a setsid syscall shim.
A breadcrumb _suggesting_ that it is not allowed on Darwin/macOS comes from
a commit in emacs: https://lists.gnu.org/archive/html/bug-gnu-emacs/2017-04/msg00297.html
but I don't have a way to verify if that is true or not.
As we are not supporting vfork on macOS today I just left a note in a comment.
Alexey Izbyshev [Sat, 24 Oct 2020 17:47:38 +0000 (20:47 +0300)]
bpo-35823: subprocess: Fix handling of pthread_sigmask() errors (GH-22944)
Using POSIX_CALL() is incorrect since pthread_sigmask() returns
the error number instead of setting errno.
Also handle failure of the first call to pthread_sigmask()
in the parent process, and explain why we don't handle failure
of the second call in a comment.