Eric Snow [Tue, 16 Apr 2024 02:16:37 +0000 (20:16 -0600)]
gh-76785: Fix Windows Refleak in test_interpreters (gh-117913)
gh-117662 introduced some refleaks, or, rather, exposed some existing refleaks. The leaks are coming when test.support.os_helper is imported in a "legacy" interpreter. I've updated test.test_interpreters.utils to avoid importing os_helper, which fixes the leaks. I'll address the root cause separately.
Sam Gross [Mon, 15 Apr 2024 16:54:56 +0000 (12:54 -0400)]
gh-117688: Fix deadlock in test_no_stale_references with GIL disabled (#117720)
Check `my_object_collected.wait()` in a loop to give the main thread a
chance to merge the reference count fields. Additionally, call
`my_object_collected.set()` in a background thread to avoid deadlocking
when the destructor is called asynchronously via the eval breaker
within the body of of `my_object_collected.wait()`.
gh-117657: Quiet TSAN warning about a data race between `start_the_world()` and `tstate_try_attach()` (#117828)
TSAN erroneously reports a data race between the `_Py_atomic_compare_exchange_int`
on `tstate->state` in `tstate_try_attach()` and the non-atomic load of
`tstate->state` in `start_the_world`. The `_Py_atomic_compare_exchange_int` fails,
but TSAN erroneously treats it as a store.
gh-117657: Add TSAN suppressions for the free-threaded build (#117736)
Additionally, reduce the iterations for a few weakref tests that would
otherwise take a prohibitively long amount of time (> 1 hour) when TSAN
is enabled and the GIL is disabled.
Barney Gale [Sat, 13 Apr 2024 23:08:03 +0000 (00:08 +0100)]
GH-115060: Speed up `pathlib.Path.glob()` by omitting initial `stat()` (#117831)
Since 6258844c, paths that might not exist can be fed into pathlib's
globbing implementation, which will call `os.scandir()` / `os.lstat()` only
when strictly necessary. This allows us to drop an initial `self.is_dir()`
call, which saves a `stat()`.
gh-102247: http: support rfc9110 status codes (GH-117611)
rfc9110 obsoletes the earlier rfc 7231. This document also includes some
status codes that were previously only used for WebDAV and assigns more
generic names to these status codes.
I think the choice of wording in these docs is great and doesn't
need to change. However, it could be useful to explicitly define
this term / the cost of doing so seems relatively low.
Barney Gale [Fri, 12 Apr 2024 22:02:39 +0000 (23:02 +0100)]
GH-117727: Speed up `pathlib.Path.iterdir()` by using `os.scandir()` (#117728)
Replace use of `os.listdir()` with `os.scandir()`. Forgo setting `_drv`,
`_root` and `_tail_cached`, as these usually aren't needed. Use
`os.DirEntry.path` to set `_str`.
Barney Gale [Fri, 12 Apr 2024 21:19:21 +0000 (22:19 +0100)]
GH-115060: Speed up `pathlib.Path.glob()` by not scanning literal parts (#117732)
Don't bother calling `os.scandir()` to scan for literal pattern segments,
like `foo` in `foo/*.py`. Instead, append the segment(s) as-is and call
through to the next selector with `exists=False`, which signals that the
path might not exist. Subsequent selectors will call `os.scandir()` or
`os.lstat()` to filter out missing paths as needed.
gh-117764: Add more tests for signatures of builtins (GH-117816)
Test signatures of all public builtins and methods of builtin classes
in modules builtins, types, sys, and several other modules (either
included in the list of standard builtin modules sys.builtin_module_names,
or providing a public interface for such modules).
Most builtins should have supported signatures, with few known exceptions.
When more builtins will be converted to Argument Clinic or support of
new signatures be implemented, they will be removed from the exception
lists.
This is similar to the situation with threading._DummyThread. The methods (incl. __del__()) of interpreters.Interpreter objects must be careful with interpreters not created by interpreters.create(). The simplest thing to start with is to disable any method that modifies or runs in the interpreter. As part of this, the runtime keeps track of where an interpreter was created. We also handle interpreter "refcounts" properly.
Sam Gross [Thu, 11 Apr 2024 19:00:54 +0000 (15:00 -0400)]
gh-117649: Raise ImportError for unsupported modules in free-threaded build (#117651)
The free-threaded build does not currently support the combination of
single-phase init modules and non-isolated subinterpreters. Ensure that
`check_multi_interp_extensions` is always `True` for subinterpreters in
the free-threaded build so that importing these modules raises an
`ImportError`.
gh-117233: Detect support for several hashes at hashlib build time (GH-117234)
Detect libcrypto BLAKE2, Shake, SHA3, and Truncated-SHA512 support at hashlib build time
## BLAKE2
While OpenSSL supports both "b" and "s" variants of the BLAKE2 hash
function, other cryptographic libraries may lack support for one or both
of the variants. This commit modifies `hashlib`'s C code to detect
whether or not the linked libcrypto supports each BLAKE2 variant, and
elides references to each variant's NID accordingly. In cases where the
underlying libcrypto doesn't fully support BLAKE2, CPython's
`./configure` script can be given the following flag to use CPython's
interned BLAKE2 implementation: `--with-builtin-hashlib-hashes=blake2`.
## SHA3, Shake, & truncated SHA512.
Detect BLAKE2, SHA3, Shake, & truncated SHA512 support in the
OpenSSL-ish libcrypto library at build time. This helps allow hashlib's
`_hashopenssl` to be used with libraries that do not to support every
algorithm that upstream OpenSSL does. Such as AWS-LC & BoringSSL.
Co-authored-by: Gregory P. Smith [Google LLC] <greg@krypto.org>
Bruce Merry [Thu, 11 Apr 2024 14:41:55 +0000 (16:41 +0200)]
gh-117722: Fix Stream.readuntil with non-bytes buffer objects (#117723)
gh-16429 introduced support for an iterable of separators in
Stream.readuntil. Since bytes-like types are themselves iterable, this
can introduce ambiguities in deciding whether the argument is an
iterator of separators or a singleton separator. In gh-16429, only 'bytes'
was considered a singleton, but this will break code that passes other
buffer object types.
Fix it by only supporting tuples rather than arbitrary iterables.
Victor Stinner [Thu, 11 Apr 2024 10:15:48 +0000 (12:15 +0200)]
gh-113317: Add Codegen class to Argument Clinic (#117626)
* Move ifndef_symbols, includes and add_include() from Clinic to
Codegen. Add a 'codegen' (Codegen) attribute to Clinic.
* Remove libclinic.crenderdata module: move code to libclinic.codegen.
* BlockPrinter.print_block(): remove unused 'limited_capi' argument.
Remove also 'core_includes' parameter.
* Add get_includes() methods.
* Make Codegen.ifndef_symbols private.
* Make Codegen.includes private.
* Make CConverter.includes private.
Karolina Surma [Thu, 11 Apr 2024 09:37:28 +0000 (11:37 +0200)]
gh-117711: Only check for 'test/wheeldata' when it's actually used (#117712)
It's possible to build Python with option `--with-wheel-pkg-dir`
pointing to a custom wheel directory. Don't include the directory in the test
set if the wheels are used from a different location.
Barney Gale [Thu, 11 Apr 2024 00:26:53 +0000 (01:26 +0100)]
GH-117586: Speed up `pathlib.Path.walk()` by working with strings (#117726)
Move `pathlib.Path.walk()` implementation into `glob._Globber`. The new
`glob._Globber.walk()` classmethod works with strings internally, which is
a little faster than generating `Path` objects and keeping them normalized.
The `pathlib.Path.walk()` method converts the strings back to path objects.
In the private pathlib ABCs, our existing subclass of `_Globber` ensures
that `PathBase` instances are used throughout.
Barney Gale [Wed, 10 Apr 2024 19:43:07 +0000 (20:43 +0100)]
GH-117586: Speed up `pathlib.Path.glob()` by working with strings (#117589)
Move pathlib globbing implementation into a new private class: `glob._Globber`. This class implements fast string-based globbing. It's called by `pathlib.Path.glob()`, which then converts strings back to path objects.
In the private pathlib ABCs, add a `pathlib._abc.Globber` subclass that works with `PathBase` objects rather than strings, and calls user-defined path methods like `PathBase.stat()` rather than `os.stat()`.
This sets the stage for two more improvements:
- GH-115060: Query non-wildcard segments with `lstat()`
- GH-116380: Unify `pathlib` and `glob` implementations of globbing.
No change to the implementations of `glob.glob()` and `glob.iglob()`.
Sam Gross [Wed, 10 Apr 2024 14:20:05 +0000 (10:20 -0400)]
gh-112536: Define `_Py_THREAD_SANITIZER` on GCC when TSan is enabled (#117702)
The `__has_feature(thread_sanitizer)` is a Clang-ism. Although new
versions of GCC implement `__has_feature`, the `defined(__has_feature)`
check still fails on GCC so we don't use that code path.