Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com> Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com> Co-authored-by: Ezio Melotti <ezio.melotti@gmail.com>
[3.12] Add zizmor to pre-commit and fix most findings (GH-127749) (#127788)
Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com> Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com> Co-authored-by: Ezio Melotti <ezio.melotti@gmail.com>
[3.12] gh-127552: Remove comment questioning 4-digit restriction for ‘Y’ in datetime.strptime patterns (GH-127590) (#127649)
gh-127552: Remove comment questioning 4-digit restriction for ‘Y’ in datetime.strptime patterns (GH-127590)
The code has required 4 digits for the year since its inclusion in the stdlib in 2002 (over 22 years ago as of this commit).
(cherry picked from commit 51cfa569e379f84b3418db0971a71b1ef575a42b)
Co-authored-by: Beomsoo Kim <beoms424@gmail.com> Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
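A minimal sketch of the 4-digit requirement described above. The helper name `parse_year` is hypothetical and exists only for illustration; `datetime.strptime` is the real API.

```python
from datetime import datetime


def parse_year(text):
    """Return the year parsed with %Y, or None if strptime rejects the text."""
    try:
        return datetime.strptime(text, "%Y").year
    except ValueError:
        # %Y does not accept a 2-digit year; %y is the directive for that.
        return None


print(parse_year("2024"))  # 2024
print(parse_year("24"))    # None
```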
[3.12] Give `poplib.POP3.rpop` a proper docstring (GH-127370) (#127722)
Give `poplib.POP3.rpop` a proper docstring (GH-127370)
Previously `poplib.POP3.rpop` had a "Not sure what this does" docstring; it now has a proper one.
(cherry picked from commit 27d0d2141319d82709eb09ba20065df3e1714fab)
Co-authored-by: Stan Ulbrych <89152624+StanFromIreland@users.noreply.github.com>
[3.12] gh-127655: Ensure `_SelectorSocketTransport.writelines` pauses the protocol if needed (GH-127656) (#127664)
gh-127655: Ensure `_SelectorSocketTransport.writelines` pauses the protocol if needed (GH-127656)
Ensure `_SelectorSocketTransport.writelines` pauses the protocol when the write buffer reaches the high water mark.
(cherry picked from commit e991ac8f2037d78140e417cc9a9486223eb3e786)
Co-authored-by: J. Nick Koston <nick@koston.org> Co-authored-by: Kumar Aditya <kumaraditya@python.org>
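The flow-control idea can be sketched with a toy model. This is not asyncio code: `ToyTransport` and `RecordingProtocol` are hypothetical classes, and the threshold check is an assumption about how a high water mark is typically applied. Only `pause_writing()` mirrors a real protocol callback name.

```python
class RecordingProtocol:
    """Hypothetical protocol that counts pause_writing() calls."""

    def __init__(self):
        self.pause_calls = 0

    def pause_writing(self):
        self.pause_calls += 1


class ToyTransport:
    """Hypothetical model of high-water-mark flow control (not asyncio)."""

    def __init__(self, protocol, high_water=64 * 1024):
        self._protocol = protocol
        self._high_water = high_water
        self._buffer = bytearray()
        self._paused = False

    def _maybe_pause_protocol(self):
        # Pause once, when the buffered size goes past the high water mark.
        if not self._paused and len(self._buffer) > self._high_water:
            self._paused = True
            self._protocol.pause_writing()

    def writelines(self, chunks):
        for chunk in chunks:
            self._buffer += chunk
        # The check that writelines() must not skip, per the commit message.
        self._maybe_pause_protocol()
```

Without the final check in `writelines()`, a producer that only ever calls `writelines()` would never be told to slow down.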
[3.12] gh-116510: Fix a Crash Due to Shared Immortal Interned Strings (gh-125205)
Fix a crash caused by immortal interned strings being shared between
sub-interpreters that use basic single-phase init. In that case, the string
can be used by an interpreter that outlives the interpreter that created and
interned it. For interpreters that share obmalloc state, also share the
interned dict with the main interpreter.
This is an un-revert of gh-124646 that then addresses the Py_TRACE_REFS
failures identified by gh-124785 (i.e. backporting gh-125709 too).
[3.12] gh-113841: fix possible undefined division by 0 in _Py_c_pow() (GH-127211) (GH-127216) (GH-127530)
[3.13] gh-113841: fix possible undefined division by 0 in _Py_c_pow() (GH-127211) (GH-127216)
Note that the transformed expression is not equivalent to the original one (1/exp(-x) != exp(x) in general for floating-point numbers). However, the difference seems to be ~1 ULP for good libm implementations.
It's more interesting why division was used in the first place. The closest algorithm I've found (no error checks, of course ;)) is Algorithm 190 from ACM: https://dl.acm.org/doi/10.1145/366663.366679. It uses subtraction in the exponent.
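The size of that difference can be checked from Python. This sketch only compares the two floating-point expressions and does not call `_Py_c_pow()`. The helper name `ulp_gap` and the sample inputs are arbitrary choices for illustration, and the result depends on the platform's libm.

```python
import math


def ulp_gap(x):
    """Distance between exp(x) and 1/exp(-x), measured in ULPs of exp(x)."""
    direct = math.exp(x)
    via_division = 1.0 / math.exp(-x)
    return abs(direct - via_division) / math.ulp(direct)


for x in (0.1, 1.0, 2.5, 10.0, 37.3):
    print(x, ulp_gap(x))
```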
[3.12] gh-99880: document rounding mode for new-style formatting (GH-121481) (#126335)
CPython uses _Py_dg_dtoa(), which rounds to nearest with the
half-to-even tie-breaking rule.
If that function is unavailable, PyOS_double_to_string() falls back to
the system snprintf(). Since CPython 3.12, build requirements include a C11
compiler *and* support for IEEE 754 floating point numbers (Annex F).
This means that the FE_TONEAREST macro is available and, by default,
printf-like functions should use the same rounding mode as _Py_dg_dtoa().
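A short illustration of half-to-even rounding in new-style formatting. The sample values are chosen because they are exactly representable in binary, so each one is a genuine tie. A decimal literal such as 0.15 is not, and would not show the rule.

```python
samples = [(0.125, ".2f"), (0.375, ".2f"), (2.5, ".0f"), (3.5, ".0f")]

# Each tie is resolved toward the neighbour whose last digit is even.
rounded = [format(value, spec) for value, spec in samples]

for (value, spec), text in zip(samples, rounded):
    print(f"format({value!r}, {spec!r}) -> {text}")
```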
[3.12] gh-88110: Clear concurrent.futures.thread._threads_queues after fork to avoid joining parent process' threads (GH-126098) (GH-127164)
Threads are gone after fork, so clear the queues too. Otherwise the
child process (here created via multiprocessing.Process) crashes on
interpreter exit.
Co-authored-by: Илья Любавский <100635212+lubaskinc0de@users.noreply.github.com> Co-authored-by: Terry Jan Reedy <tjreedy@udel.edu> Co-authored-by: Tomas R. <tomas.roun8@gmail.com> Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
Don't make the assumption that the 'name' argument is a string. Use
repr() to format the 'name' argument instead.
(cherry picked from commit 20657fbdb14d50ca4ec115da0cbef155871d8d33)
Co-authored-by: Victor Stinner <vstinner@python.org>
[3.12] gh-124008: Fix calculation of the number of written bytes for the Windows console (GH-124059) (GH-127326)
Since MultiByteToWideChar()/WideCharToMultiByte() is not reversible if
the data contains invalid UTF-8 sequences, use binary search to
calculate the number of written bytes from the number of written
characters.
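The binary-search idea can be modelled in pure Python. This is a sketch, not the C implementation: it does not call the Windows API, it uses Python's lenient UTF-8 decoder as a stand-in for MultiByteToWideChar(), and it assumes the decoded length never decreases as the byte prefix grows. Both helper names are hypothetical.

```python
def utf16_units(data: bytes) -> int:
    """Number of UTF-16 code units produced by decoding data leniently."""
    text = data.decode("utf-8", errors="replace")
    return len(text.encode("utf-16-le")) // 2


def bytes_for_units(data: bytes, units_written: int) -> int:
    """Largest prefix length that decodes to at most units_written units."""
    lo, hi = 0, len(data)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if utf16_units(data[:mid]) <= units_written:
            lo = mid        # this prefix fits; try a longer one
        else:
            hi = mid - 1    # too many units; shrink the prefix
    return lo


data = "aé😀b".encode("utf-8")   # 1 + 2 + 4 + 1 bytes
print(bytes_for_units(data, 2))  # 3
print(bytes_for_units(data, 4))  # 7
```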
[3.12] Improve `pathname2url()` and `url2pathname()` docs (GH-127125) (#127233)
Improve `pathname2url()` and `url2pathname()` docs (GH-127125)
These functions have long sown confusion among Python developers. The
existing documentation says they deal with URL path components, but that
doesn't fit the evidence on Windows:
>>> pathname2url(r'C:\foo')
'///C:/foo'
>>> pathname2url(r'\\server\share')
'////server/share' # or '//server/share' as of quite recently
If these were URL path components, they would imply complete URLs like
`file://///C:/foo` and `file://////server/share`. Clearly this isn't right.
Yet the implementation in `nturl2path` is deliberate, and the
`url2pathname()` function correctly inverts it.
On non-Windows platforms, the behaviour until quite recently was to simply
quote/unquote the path without adding or removing any leading slashes. This
behaviour is compatible with *both* interpretations -- 1) the value is a
URL path component (existing docs), and 2) the value is everything
following `file:` (this commit).
The conclusion I draw is that these functions operate on everything after
the `file:` prefix, which may include an authority section. This is the
only explanation that fits both the Windows and non-Windows behaviour.
It's also a better match for the function names.
(cherry picked from commit 307c63358681d669ae39e5ecd814bded4a93443a)
Co-authored-by: Barney Gale <barney.gale@gmail.com>
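Under that interpretation, a full URI is built by prepending `file:` to the result of `pathname2url()`, and `url2pathname()` takes everything after `file:`. A round-trip sketch follows. The path `example dir/file.txt` is a made-up example, and the exact number of leading slashes in the printed URI depends on platform and Python version.

```python
import os
from urllib.request import pathname2url, url2pathname

path = os.path.abspath(os.path.join("example dir", "file.txt"))

after_prefix = pathname2url(path)              # everything after "file:"
uri = "file:" + after_prefix
recovered = url2pathname(uri[len("file:"):])   # strip the prefix, then invert

print(uri)
print(recovered == path)
```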
Serhiy Storchaka [Fri, 22 Nov 2024 19:56:39 +0000 (21:56 +0200)]
[3.12] gh-109746: Make _thread.start_new_thread delete state of new thread on its startup failure (GH-109761) (GH-127173)
If Python fails to start a newly created thread
due to failure of the underlying PyThread_start_new_thread() call,
its state should be removed from the interpreter's thread states list
to avoid double cleanup.
Serhiy Storchaka [Fri, 22 Nov 2024 16:33:50 +0000 (18:33 +0200)]
[3.12] gh-127001: Fix PATHEXT issues in shutil.which() on Windows (GH-127035) (GH-127158)
* Name without a PATHEXT extension is only searched if the mode does not
include X_OK.
* Support multi-component PATHEXT extensions (e.g. ".foo.bar").
* Support files without extensions if PATHEXT contains a dot-only extension
(".", "..", etc).
* Support PATHEXT extensions that end with a dot (e.g. ".foo.").
(cherry picked from commit 8899e85de100557899da05f0b37867a371a73800)
[3.12] GH-127078: `url2pathname()`: handle extra slash before UNC drive in URL path (GH-127132) (#127136)
GH-127078: `url2pathname()`: handle extra slash before UNC drive in URL path (GH-127132)
Decode a file URI like `file://///server/share` as a UNC path like
`\\server\share`. This form of file URI is created by software that simply
prepends `file:///` to any absolute Windows path.
(cherry picked from commit 8c98ed846a7d7e50c4cf06f823d94737144dcf6a)
Co-authored-by: Barney Gale <barney.gale@gmail.com>
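How such a URI arises can be shown with plain string handling. The function `naive_file_uri` is hypothetical and stands in for the kind of software described above. It is not part of `urllib`.

```python
def naive_file_uri(windows_path: str) -> str:
    """Hypothetical helper: prepend 'file:///' to an absolute Windows path."""
    return "file:///" + windows_path.replace("\\", "/")


print(naive_file_uri(r"C:\foo"))          # file:///C:/foo
print(naive_file_uri(r"\\server\share"))  # file://///server/share
```

For a drive path the result has the usual three slashes. For a UNC path the two leading backslashes add two more, giving the five-slash form that this commit teaches `url2pathname()` to decode.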
Discard any 'localhost' authority from the beginning of a `file:` URI. As a
result, file URIs like `//localhost/etc/hosts` are correctly decoded as
`/etc/hosts`.
(cherry picked from commit ebf564a1d3e2e81b9846535114e481d6096443d2)
Co-authored-by: Barney Gale <barney.gale@gmail.com>
It now returns multiple era description segments separated by semicolons.
Previously it only returned the first segment on platforms with Glibc.
(cherry picked from commit 4803cd0244847f286641c85591fda08b513cea52)
[3.12] gh-126997: Fix support of non-ASCII strings in pickletools (GH-127062) (GH-127095)
* Fix support of STRING and GLOBAL opcodes with non-ASCII arguments.
* dis() now outputs non-ASCII bytes in STRING, BINSTRING and
SHORT_BINSTRING arguments as escaped (\xXX).
(cherry picked from commit eaf217108226633c03cc5c4c90f0b6e4587c8803)
[3.12] GH-85168: Use filesystem encoding when converting to/from `file` URIs (GH-126852) (#127040)
GH-85168: Use filesystem encoding when converting to/from `file` URIs (GH-126852)
Adjust `urllib.request.url2pathname()` and `pathname2url()` to use the
filesystem encoding when quoting and unquoting file URIs, rather than
forcing use of UTF-8.
No changes are needed in the `nturl2path` module because Windows always
uses UTF-8, per PEP 529.
(cherry picked from commit c9b399fbdb01584dcfff0d7f6ad484644ff269c3)
Co-authored-by: Barney Gale <barney.gale@gmail.com>
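The effect of the encoding choice can be seen with `urllib.parse.quote()` and `unquote()`, which accept an `encoding` argument. In this sketch the encodings are passed explicitly so that the difference is visible. Real code would presumably pass `sys.getfilesystemencoding()`. The helper name `quote_path` is hypothetical.

```python
from urllib.parse import quote, unquote


def quote_path(path: str, encoding: str) -> str:
    """Percent-encode a path using the given text encoding."""
    return quote(path, encoding=encoding, errors="strict")


print(quote_path("/tmp/café", "utf-8"))            # /tmp/caf%C3%A9
print(quote_path("/tmp/café", "latin-1"))          # /tmp/caf%E9
print(unquote("/tmp/caf%E9", encoding="latin-1"))  # /tmp/café
```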
If PyObject_SetItem() fails in the `load_build()` function of _pickle.c, the `dict` variable is never DECREF'ed.
(cherry picked from commit 29cbcbd73bbfd8c953c0b213fb33682c289934ff)
Co-authored-by: Stan U <89152624+StanFromIreland@users.noreply.github.com> Co-authored-by: Petr Viktorin <encukou@gmail.com> Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com> Co-authored-by: Terry Jan Reedy <tjreedy@udel.edu> Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
[3.12] gh-126909: Fix running xattr tests on systems with lower limits (GH-126930) (#126964)
gh-126909: Fix running xattr tests on systems with lower limits (GH-126930)
Modify the extended attribute tests to write fewer and smaller extended
attributes, in order to fit within filesystems with total xattr limit
of 1 KiB (e.g. ext4 with 1 KiB blocks). Previously, the test would
write over 2 KiB, making it fail with ENOSPC on such systems.
(cherry picked from commit 2c0a21c1aad65ab8362491acf856eb574b1257ad)
Serhiy Storchaka [Mon, 18 Nov 2024 11:24:13 +0000 (13:24 +0200)]
[3.12] gh-67877: Fix memory leaks in terminated RE matching (GH-126840) (GH-126961)
If the SRE(match) function terminates abruptly, either because of a signal
or because memory allocation fails, allocated SRE_REPEAT blocks might
never be released.
[3.12] gh-124452: Fix header mismatches when folding/unfolding with email message (GH-125919) (#126916)
gh-124452: Fix header mismatches when folding/unfolding with email message (GH-125919)
The header-folder of the new email API has a long-standing known buglet where
if the first token is longer than max_line_length, it puts that token on the next
line. It turns out there is also a *parsing* bug when parsing such a header:
the space prefixing that first, non-empty line gets preserved and tacked on to
the start of the header value, which is not the expected behavior per the RFCs.
The bug arises from the fact that the parser assumed that there would be at
least one token on the line with the header, which is going to be true for
probably every email producer other than the Python email library with its
folding buglet. Clearly, though, this is a case that needs to be handled
correctly. The fix is simple: strip the blanks off the start of the whole
value, not just the first physical line of the value.
[3.12] gh-126476: Raise IllegalMonthError for calendar.formatmonth() when the input month is not correct (GH-126484) (GH-126878)
gh-126476: Raise IllegalMonthError for calendar.formatmonth() when the input month is not correct (GH-126484)
(cherry picked from commit 3be7498d2450519d5d8f63a35ef298db3b3d935b)
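A small sketch of handling the error. The wrapper `safe_formatmonth` is hypothetical. It also catches `IndexError`, on the assumption that interpreters without this fix may raise that instead.

```python
import calendar


def safe_formatmonth(year, month):
    """Return the formatted month, or None if the month number is rejected."""
    try:
        return calendar.TextCalendar().formatmonth(year, month)
    except (calendar.IllegalMonthError, IndexError):
        # IndexError is an assumption about interpreters without the fix.
        return None


print(safe_formatmonth(2024, 11))
print(safe_formatmonth(2024, 13))  # None
```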
[3.12] Added a warning to the urljoin docs, indicating that it is not safe to use with attacker controlled URLs (GH-126659) (#126889)
Added a warning to the urljoin docs, indicating that it is not safe to use with attacker controlled URLs (GH-126659)
This was flagged to me at a party today by someone who works in red-teaming as a frequently encountered footgun. Documenting the potentially unexpected behavior seemed like a good place to start.
(cherry picked from commit d6bcc154e93a0a20ab97187d3e8b726fffb14f8f)
Co-authored-by: Alex Gaynor <alex.gaynor@gmail.com>
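The footgun is easy to reproduce. In this sketch the host names are placeholders, and `same_origin` is a hypothetical illustrative check, not a complete defence and not something the documentation change prescribes.

```python
from urllib.parse import urljoin, urlsplit

base = "https://example.com/app/"

print(urljoin(base, "docs/page"))              # https://example.com/app/docs/page
print(urljoin(base, "//evil.example/steal"))   # https://evil.example/steal
print(urljoin(base, "https://evil.example/"))  # https://evil.example/


def same_origin(base_url, candidate):
    """Hypothetical check: does the joined URL keep base's scheme and host?"""
    joined = urlsplit(urljoin(base_url, candidate))
    expected = urlsplit(base_url)
    return (joined.scheme, joined.netloc) == (expected.scheme, expected.netloc)
```

The second and third calls show that the second argument can replace the host entirely, which is why joining attacker-controlled input onto a trusted base does not keep the result on the trusted host.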
[3.12] gh-71936: Fix race condition in multiprocessing.Pool (GH-124973) (GH-126870)
Proxies of shared objects register a Finalizer in BaseProxy._incref(), and it
will call BaseProxy._decref() when it is GCed. This may cause a race condition
with Pool(maxtasksperchild=None) on Windows.
A connection would be closed and raise TypeError when a GC occurs between
_ConnectionBase._check_writable() and _ConnectionBase._send_bytes() in
_ConnectionBase.send() in the second or later task, and a new object
is allocated that shares the id() of a previously deleted one.
Instead of using the id() of the token (or the proxy), use a unique,
non-reusable number.
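The general technique can be sketched as follows. This is not the multiprocessing code: `Token` and `_serial` are hypothetical names, and `itertools.count()` is simply one way to obtain numbers that are never handed out twice within a process.

```python
import itertools

_serial = itertools.count()  # hypothetical process-wide counter


class Token:
    """Hypothetical object keyed by a serial number instead of id()."""

    def __init__(self):
        self.serial = next(_serial)


first = Token()
old_id, old_serial = id(first), first.serial
del first

second = Token()
# id(second) is allowed to equal old_id once the first object is freed,
# but the serial number is never reused.
print(second.serial != old_serial)  # True
```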
gh-123832: Adjust `socket.getaddrinfo` docs for better POSIX compliance (GH-126182)
* gh-123832: Adjust `socket.getaddrinfo` docs for better POSIX compliance
This changes nothing for CPython-supported platforms,
but hints how to deal with platforms that stick to the letter of
the spec.
It also marks `socket.getaddrinfo` as a wrapper around `getaddrinfo(3)`;
specifically, workarounds to make the function work consistently across
platforms are out of scope in its code.
Include wording similar to POSIX's “by providing options and by
limiting the returned information”, which IMO suggests that the
hints limit the resulting list compared to the defaults, *but* can
be interpreted differently. Details are added in a note.
Specifically say that this wraps the underlying C function. So, the
details are in OS docs. The “full range of results” bit goes away.
Use `AF_UNSPEC` rather than zero for the *family* default, although
I don't think a system where it's nonzero would be very usable.
Suggest setting proto and/or type (with examples, as the appropriate
values aren't obvious). Say why you probably want to do that
on all systems; mention the behavior on the “letter of the spec”
systems.
Suggest that the results should be tried in order, which is,
AFAIK, best practice -- see RFC 6724 section 2, and its predecessor
from 2003 (which are specific to IP, but indicate how people use this):
> Well-behaved applications SHOULD iterate through the list of
> addresses returned from `getaddrinfo()` until they find a working address.
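The two suggestions (set *type* and/or *proto*, then try the results in order) can be combined in a short sketch. The function name `connect_in_order` and the timeout default are hypothetical. The standard library's `socket.create_connection()` follows a similar pattern.

```python
import socket


def connect_in_order(host, port, timeout=5.0):
    """Try each getaddrinfo() result in the order returned."""
    last_error = None
    results = socket.getaddrinfo(
        host, port,
        family=socket.AF_UNSPEC,    # explicit default rather than 0
        type=socket.SOCK_STREAM,    # narrow the result list to TCP streams
        proto=socket.IPPROTO_TCP,
    )
    for family, socktype, proto, _canonname, sockaddr in results:
        sock = None
        try:
            sock = socket.socket(family, socktype, proto)
            sock.settimeout(timeout)
            sock.connect(sockaddr)
            return sock             # first working address wins
        except OSError as exc:
            last_error = exc
            if sock is not None:
                sock.close()
    raise last_error or OSError(f"no addresses found for {host!r}")
```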
Discard two leading slashes from the beginning of a `file:` URI if they
introduce an empty authority section. As a result, file URIs like
`///etc/hosts` are correctly parsed as `/etc/hosts`.
(cherry picked from commit cae9d9d20f61cdbde0765efa340b6b596c31b67f)
Co-authored-by: Barney Gale <barney.gale@gmail.com>
Co-authored-by: John Marshall <jmarshall@hey.com> Co-authored-by: Andrew Svetlov <andrew.svetlov@gmail.com> Co-authored-by: Carol Willing <carolcode@willingconsulting.com>
[3.12] GH-120423: `pathname2url()`: handle forward slashes in Windows paths (GH-126593) (#126763)
GH-120423: `pathname2url()`: handle forward slashes in Windows paths (GH-126593)
Adjust `urllib.request.pathname2url()` so that forward slashes in Windows
paths are handled identically to backward slashes.
(cherry picked from commit bf224bd7cef5d24eaff35945ebe7ffe14df7710f)
Co-authored-by: Barney Gale <barney.gale@gmail.com>