Barney Gale [Tue, 6 Jun 2023 22:50:36 +0000 (23:50 +0100)]
GH-102613: Fast recursive globbing in `pathlib.Path.glob()` (GH-104512)
This commit introduces a 'walk-and-match' strategy for handling glob patterns that include a non-terminal `**` wildcard, such as `**/*.py`. For this example, the previous implementation recursively walked directories using `os.scandir()` when it expanded the `**` component, and then **scanned those same directories again** when expanded the `*.py` component. This is wasteful.
In the new implementation, any components following a `**` wildcard are used to build a `re.Pattern` object, which is used to filter the results of the recursive walk. A pattern like `**/*.py` uses half the number of `os.scandir()` calls; a pattern like `**/*/*.py` a third, etc.
This new algorithm does not apply if either:
1. The *follow_symlinks* argument is set to `None` (its default), or
2. The pattern contains `..` components.
In these cases we fall back to the old implementation.
This commit also replaces selector classes with selector functions. These generators directly yield results rather calling through to their successors. A new internal `Path._glob()` method takes care to chain these generators together, which simplifies the lazy algorithm and slightly improves performance. It should also be easier to understand and maintain.
The locale.getencoding() function now uses
sys.getfilesystemencoding() if _locale.getencoding() is missing,
instead of calling locale.getdefaultlocale().
Victor Stinner [Tue, 6 Jun 2023 14:42:49 +0000 (16:42 +0200)]
gh-105156: Update Unicode C API: remove deprecation (#105379)
_PyUnicode_ToLowercase(), _PyUnicode_ToUppercase(),
_PyUnicode_ToTitlecase() are no longer deprecated in the
documentation. It's no longer needed since they now use Py_UCS4 type,
rather than the deprecated Py_UNICODE type.
Victor Stinner [Tue, 6 Jun 2023 12:44:48 +0000 (14:44 +0200)]
gh-105268: Add _Py_FROM_GC() function to pycore_gc.h (#105362)
* gcmodule.c reuses _Py_AS_GC(op) for AS_GC()
* Move gcmodule.c FROM_GC() implementation to a new _Py_FROM_GC()
static inline function in pycore_gc.h.
* _PyObject_IS_GC(): only get the type once
* gc_is_finalized(à) and PyObject_GC_IsFinalized() use
_PyGC_FINALIZED(), instead of _PyGCHead_FINALIZED().
* Remove _Py_CAST() in pycore_gc.h: this header file is not built
with C++.
Victor Stinner [Tue, 6 Jun 2023 09:15:09 +0000 (11:15 +0200)]
gh-102304: Fix Py_INCREF() stable ABI in debug mode (#104763)
When Python is built in debug mode (if the Py_REF_DEBUG macro is
defined), the Py_INCREF() and Py_DECREF() function are now always
implemented as opaque functions to avoid leaking implementation
details like the "_Py_RefTotal" variable or the
_Py_DecRefTotal_DO_NOT_USE_THIS() function.
* Remove _Py_IncRefTotal_DO_NOT_USE_THIS() and
_Py_DecRefTotal_DO_NOT_USE_THIS() from the stable ABI.
* Remove _Py_NegativeRefcount() from limited C API.
Victor Stinner [Tue, 6 Jun 2023 08:40:32 +0000 (10:40 +0200)]
gh-102304: doc: Add links to Stable ABI and Limited C API (#105345)
* Add "limited-c-api" and "stable-api" references.
* Rename "stable-abi-list" reference to "limited-api-list".
* Makefile: Document files regenerated by "make regen-limited-abi"
* Remove first empty line in generated files:
Display the sanitizer config in the regrtest header. (#105301)
Display the sanitizers present in libregrtest.
Having this in the CI output for tests with the relevant environment
variable displayed will help make it easier to do what we need to
create an equivalent local test run.
chgnrdv [Sun, 4 Jun 2023 04:06:45 +0000 (07:06 +0300)]
gh-104690 Disallow thread creation and fork at interpreter finalization (#104826)
Disallow thread creation and fork at interpreter finalization.
in the following functions, check if interpreter is finalizing and raise `RuntimeError` with appropriate message:
* `_thread.start_new_thread` and thus `threading`
* `posix.fork`
* `posix.fork1`
* `posix.forkpty`
* `_posixsubprocess.fork_exec` when a `preexec_fn=` is supplied.
---------
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com> Co-authored-by: Gregory P. Smith <greg@krypto.org>
Eric Snow [Fri, 2 Jun 2023 22:52:33 +0000 (16:52 -0600)]
gh-101524: Only Use Public C-API in the _xxsubinterpreters Module (gh-105258)
The _xxsubinterpreters module was meant to only use public API. Some internal C-API usage snuck in over the last few years (e.g. gh-28969). This fixes that.
Eric Snow [Thu, 1 Jun 2023 22:28:31 +0000 (16:28 -0600)]
gh-104614: Make Sure ob_type is Always Set Correctly by PyType_Ready() (gh-105122)
When I added the relevant condition to type_ready_set_bases() in gh-103912, I had missed that the function also sets tp_base and ob_type (if necessary). That led to problems for third-party static types.
We fix that here, by making those extra operations distinct and by adjusting the condition to be more specific.
gh-103142: Upgrade binary builds and CI to OpenSSL 1.1.1u (#105174)
Upgrade builds to OpenSSL 1.1.1u.
This OpenSSL version addresses a pile if less-urgent CVEs since 1.1.1t.
The Mac/BuildScript/build-installer.py was already updated.
Also updates _ssl_data_111.h from OpenSSL 1.1.1u, _ssl_data_300.h from 3.0.9, and adds a new _ssl_data_31.h file from 3.1.1 along with the ssl.c code to use it.
Manual edits to the _ssl_data_300.h file prevent it from removing any existing definitions in case those exist in some peoples builds and were important (avoiding regressions during backporting).
backports of this prior to 3.12 will not include the openssl 3.1 header.
All current systems provide time.h; it need not be checked for.
Not all systems provide sys/time.h, but those that do, all allow
you to include it and time.h simultaneously.
Victor Stinner [Thu, 1 Jun 2023 07:18:09 +0000 (09:18 +0200)]
gh-105156: Cleanup usage of old Py_UNICODE type (#105158)
* refcounts.dat:
* Remove Py_UNICODE functions.
* Replace Py_UNICODE argument type with wchar_t.
* _PyUnicode_ToLowercase(), _PyUnicode_ToUppercase(),
_PyUnicode_ToTitlecase() are no longer deprecated in comments.
It's no longer needed since they now use Py_UCS4 type, rather than
the deprecated Py_UNICODE type.
* gdb: Remove unused char_width() method.
Update Doc/extending/embedding.rst and Doc/extending/extending.rst to
use the new PyConfig API.
_testembed.c:
* check_stdio_details() now sets stdio_encoding and stdio_errors
of PyConfig.
* Add definitions of functions removed from the API but kept in the
stable ABI.
* test_init_from_config() and test_init_read_set() now use
PyConfig_SetString() instead of PyConfig_SetBytesString().
Victor Stinner [Wed, 31 May 2023 11:29:10 +0000 (13:29 +0200)]
gh-105096: Reformat wave documentation (#105136)
Add ".. class::" markups in the wave documentation.
* Reformat also wave.py (minor PEP 8 changes).
* Remove redundant "import struct": it's already imported at top
level.
* Remove wave.rst from .nitignore