Илья Любавский [Fri, 29 Nov 2024 09:00:50 +0000 (12:00 +0300)]
gh-127303: Add docs for token.EXACT_TOKEN_TYPES (#127304)
---------
Co-authored-by: Terry Jan Reedy <tjreedy@udel.edu> Co-authored-by: Tomas R. <tomas.roun8@gmail.com> Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
Petr Viktorin [Thu, 28 Nov 2024 12:29:27 +0000 (13:29 +0100)]
gh-127330: Update for OpenSSL 3.4 & document+improve the update process (GH-127331)
- Add `git describe` output to headers generated by `make_ssl_data.py`
This info is more important than the date when the file was generated.
It does mean that the tool now requires a Git checkout of OpenSSL,
not for example a release tarball.
- Regenerate the older file to add the info.
To the other older file, add a note about manual edits.
Serhiy Storchaka [Wed, 27 Nov 2024 11:38:12 +0000 (13:38 +0200)]
gh-124008: Fix calculation of the number of written bytes for the Windows console (GH-124059)
Since MultiByteToWideChar()/WideCharToMultiByte() is not reversible if
the data contains invalid UTF-8 sequences, use binary search to
calculate the number of written bytes from the number of written
characters.
gh-69639: Add mixed-mode rules for complex arithmetic (C-like) (GH-124829)
"Generally, mixed-mode arithmetic combining real and complex variables should
be performed directly, not by first coercing the real to complex, lest the sign
of zero be rendered uninformative; the same goes for combinations of pure
imaginary quantities with complex variables." (c) Kahan, W: Branch cuts for
complex elementary functions.
This patch implements mixed-mode arithmetic rules, combining real and
complex variables as specified by C standards since C99 (in particular,
there is no special version for the true division with real lhs
operand). Most C compilers implementing C99+ Annex G have only these
special rules (without support for imaginary type, which is going to be
deprecated in C2y).
gh-69639: Add mixed-mode rules for complex arithmetic (C-like) (GH-124829)
"Generally, mixed-mode arithmetic combining real and complex variables should
be performed directly, not by first coercing the real to complex, lest the sign
of zero be rendered uninformative; the same goes for combinations of pure
imaginary quantities with complex variables." (c) Kahan, W: Branch cuts for
complex elementary functions.
This patch implements mixed-mode arithmetic rules, combining real and
complex variables as specified by C standards since C99 (in particular,
there is no special version for the true division with real lhs
operand). Most C compilers implementing C99+ Annex G have only these
special rules (without support for imaginary type, which is going to be
deprecated in C2y).
When handed an absolute Windows path such as `C:\foo` or `//server/share`,
the `urllib.request.pathname2url()` function returns a URL with an
authority section, such as `///C:/foo` or `//server/share` (or before
GH-126205, `////server/share`). Only the `file:` prefix is omitted.
But when handed an absolute POSIX path such as `/etc/hosts`, or a Windows
path of the same form (rooted but lacking a drive), the function returns a
URL without an authority section, such as `/etc/hosts`.
This patch corrects the discrepancy by adding a `//` prefix before
drive-less, rooted paths when generating URLs.
The interpreter now handles `_PyStackRef`s pointing to immortal objects
without the deferred bit set, so `_PyEvalFramePushAndInit_UnTagged` is
no longer necessary.
Barney Gale [Sun, 24 Nov 2024 17:33:46 +0000 (17:33 +0000)]
Improve `pathname2url()` and `url2pathname()` docs (#127125)
These functions have long sown confusion among Python developers. The
existing documentation says they deal with URL path components, but that
doesn't fit the evidence on Windows:
>>> pathname2url(r'C:\foo')
'///C:/foo'
>>> pathname2url(r'\\server\share')
'////server/share' # or '//server/share' as of quite recently
If these were URL path components, they would imply complete URLs like
`file://///C:/foo` and `file://////server/share`. Clearly this isn't right.
Yet the implementation in `nturl2path` is deliberate, and the
`url2pathname()` function correctly inverts it.
On non-Windows platforms, the behaviour until quite recently is to simply
quote/unquote the path without adding or removing any leading slashes. This
behaviour is compatible with *both* interpretations -- 1) the value is a
URL path component (existing docs), and 2) the value is everything
following `file:` (this commit)
The conclusion I draw is that these functions operate on everything after
the `file:` prefix, which may include an authority section. This is the
only explanation that fits both the Windows and non-Windows behaviour.
It's also a better match for the function names.
Barney Gale [Sun, 24 Nov 2024 02:31:00 +0000 (02:31 +0000)]
pathlib tests: move `walk()` tests into their own classes (GH-126651)
Move tests for Path.walk() into a new PathWalkTest class, and apply a similar change in tests for the ABCs. This allows us to properly tear down the walk test hierarchy in tearDown(), rather than leaving it to os_helper.rmtree().
Barney Gale [Sat, 23 Nov 2024 10:41:39 +0000 (10:41 +0000)]
GH-125866: Preserve Windows drive letter case in file URIs (#127138)
Stop converting Windows drive letters to uppercase in
`urllib.request.pathname2url()` and `url2pathname()`. This behaviour is
unnecessary and inconsistent with pathlib's file URI implementation.
gh-109746: Make _thread.start_new_thread delete state of new thread on its startup failure (GH-109761)
If Python fails to start newly created thread
due to failure of underlying PyThread_start_new_thread() call,
its state should be removed from interpreter' thread states list
to avoid its double cleanup.
This gets rid of the immortal check in `PyStackRef_FromPyObjectSteal()`.
Overall, this improves performance about 2% in the free threading
build.
This also renames `PyStackRef_Is()` to `PyStackRef_IsExactly()` because
the macro requires that the tag bits of the arguments match, which is
only true in certain special cases.
Łukasz Langa [Fri, 22 Nov 2024 17:29:18 +0000 (18:29 +0100)]
Enable aarch64 Ubuntu CI jobs (#125786)
This change enables custom GHA runners for Ubuntu-24.04 that run on Arm hardware. It also prepares for Windows runners on Arm hardware, but doesn't enable that just yet, because the Arm GHA runner images for Windows need to be updated.
gh-115999: Add free-threaded specialization for `UNPACK_SEQUENCE` (#126600)
Add free-threaded specialization for `UNPACK_SEQUENCE` opcode.
`UNPACK_SEQUENCE_TUPLE/UNPACK_SEQUENCE_TWO_TUPLE` are already thread safe since tuples are immutable.
`UNPACK_SEQUENCE_LIST` is not thread safe because of nature of lists (there is nothing preventing another thread from adding items to or removing them the list while the instruction is executing). To achieve thread safety we add a critical section to the implementation of `UNPACK_SEQUENCE_LIST`, especially around the parts where we check the size of the list and push items onto the stack.
---------
Co-authored-by: Matt Page <mpage@meta.com> Co-authored-by: mpage <mpage@cs.stanford.edu>
Andrei Bodrov [Fri, 22 Nov 2024 16:20:34 +0000 (19:20 +0300)]
gh-88110: Clear concurrent.futures.thread._threads_queues after fork to avoid joining parent process' threads (GH-126098)
Threads are gone after fork, so clear the queues too. Otherwise the
child process (here created via multiprocessing.Process) crashes on
interpreter exit.
Serhiy Storchaka [Fri, 22 Nov 2024 15:52:15 +0000 (17:52 +0200)]
gh-127001: Fix PATHEXT issues in shutil.which() on Windows (GH-127035)
* Name without a PATHEXT extension is only searched if the mode does not
include X_OK.
* Support multi-component PATHEXT extensions (e.g. ".foo.bar").
* Support files without extensions in PATHEXT contains dot-only extension
(".", "..", etc).
* Support PATHEXT extensions that end with a dot (e.g. ".foo.").
Cody Maloney [Fri, 22 Nov 2024 14:55:32 +0000 (06:55 -0800)]
gh-127076: Ignore memory mmap in FileIO testing (#127088)
`mmap`, `munmap`, and `mprotect` are used by CPython for memory
management, which may occur in the middle of the FileIO tests. The
system calls can also be used with files, so `strace` includes them
in its `%file` and `%desc` filters.
Filter out the `mmap` system calls related to memory allocation for the
file tests. Currently FileIO doesn't do `mmap` at all, so didn't add
code to track from `mmap` through `munmap` since it wouldn't be used.
For now if an `mmap` on a fd happens, the call will be included (which
may cause test to fail), and at that time support for tracking the
address throug `munmap` could be added.
Sam Gross [Fri, 22 Nov 2024 14:21:59 +0000 (14:21 +0000)]
gh-127065: Make `methodcaller` thread-safe in free threading build (#127109)
The `methodcaller` C vectorcall implementation uses an arguments array
that is shared across calls. The first argument is modified on every
invocation. This isn't thread-safe in the free threading build. I think
it's also not safe in general, but for now just disable it in the free
threading build.
Barney Gale [Fri, 22 Nov 2024 04:12:50 +0000 (04:12 +0000)]
GH-127078: `url2pathname()`: handle extra slash before UNC drive in URL path (#127132)
Decode a file URI like `file://///server/share` as a UNC path like
`\\server\share`. This form of file URI is created by software the simply
prepends `file:///` to any absolute Windows path.
Discard any 'localhost' authority from the beginning of a `file:` URI. As a
result, file URIs like `//localhost/etc/hosts` are correctly decoded as
`/etc/hosts`.
mpage [Thu, 21 Nov 2024 19:22:21 +0000 (11:22 -0800)]
gh-115999: Specialize `LOAD_GLOBAL` in free-threaded builds (#126607)
Enable specialization of LOAD_GLOBAL in free-threaded builds.
Thread-safety of specialization in free-threaded builds is provided by the following:
A critical section is held on both the globals and builtins objects during specialization. This ensures we get an atomic view of both builtins and globals during specialization.
Generation of new keys versions is made atomic in free-threaded builds.
Existing helpers are used to atomically modify the opcode.
Thread-safety of specialized instructions in free-threaded builds is provided by the following:
Relaxed atomics are used when loading and storing dict keys versions. This avoids potential data races as the dict keys versions are read without holding the dictionary's per-object lock in version guards.
Dicts keys objects are passed from keys version guards to the downstream uops. This ensures that we are loading from the correct offset in the keys object. Once a unicode key has been stored in a keys object for a combined dictionary in free-threaded builds, the offset that it is stored in will never be reused for a different key. Once the version guard passes, we know that we are reading from the correct offset.
The dictionary read fast-path is used to read values from the dictionary once we know the correct offset.
Eric Snow [Thu, 21 Nov 2024 18:08:38 +0000 (11:08 -0700)]
gh-114940: Add _Py_FOR_EACH_TSTATE_UNLOCKED(), and Friends (gh-127077)
This is a precursor to the actual fix for gh-114940, where we will change these macros to use the new lock. This change is almost entirely mechanical; the exceptions are the loops in codeobject.c and ceval.c, which now hold the "head" lock. Note that almost all of the uses of _Py_FOR_EACH_TSTATE_UNLOCKED() here will change to _Py_FOR_EACH_TSTATE_BEGIN() once we add the new per-interpreter lock.
Serhiy Storchaka [Thu, 21 Nov 2024 11:15:12 +0000 (13:15 +0200)]
gh-126997: Fix support of non-ASCII strings in pickletools (GH-127062)
* Fix support of STRING and GLOBAL opcodes with non-ASCII arguments.
* dis() now outputs non-ASCII bytes in STRING, BINSTRING and
SHORT_BINSTRING arguments as escaped (\xXX).
Cody Maloney [Thu, 21 Nov 2024 09:33:12 +0000 (01:33 -0800)]
gh-127076: Disable strace tests under LD_PRELOAD (#127086)
Distribution tooling (ex. sandbox on Gentoo and fakeroot on Debian) uses
LD_PRELOAD to intercept system calls and potentially modify them when
building. These tools can change the set of system calls, so disable
system call testing under these cases.
mpage [Wed, 20 Nov 2024 22:54:48 +0000 (14:54 -0800)]
gh-115999: Don't take a reason in unspecialize (#127030)
Don't take a reason in unspecialize
We only want to compute the reason if stats are enabled. Optimizing
compilers should optimize this away for us (gcc and clang do), but
it's better to be safe than sorry.