Correctly fold unknown-8bit originating from encoded words. (#142517)
author R. David Murray <rdmurray@bitdance.com>
Wed, 24 Dec 2025 14:14:39 +0000 (09:14 -0500)
committer GitHub <noreply@github.com>
Wed, 24 Dec 2025 14:14:39 +0000 (09:14 -0500)
commit 1e17ccd030a2285ad53db5952360fffa33a8a877
tree e5fd3a97a6828c01ad7f97075607159fb184e7af
parent d4dc3dd9aab6e860ea29e3bd133147a3f795cf60

The unknown-8bit trick was designed to deal with unknown bytes in an
ASCII message, and it works fine for that.  However, I also tried to
extend it to handle bytes that can't be decoded using the charset
specified in an encoded word, and there it fails because there can be
other non-ASCII characters that were *successfully* decoded.  The fix is
simple: do the unknown-8bit encoding using the utf-8 codec.  This is
especially appropriate since anyone trying to do recovery on an unknown
byte string will probably attempt utf-8 first.
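
The failure mode described above can be reproduced with plain str/bytes
codec calls. This is only an illustrative sketch: the helper names are
hypothetical and are not the functions in Lib/email/_encoded_words.py, and
the sample bytes are made up. It assumes the undecodable bytes are carried
in the decoded string as surrogate escapes.

```python
# Hypothetical payload of an encoded word declared as utf-8:
# a valid two-byte "e with acute" followed by an undecodable byte 0xff.
raw = b'caf\xc3\xa9 \xff'

# Decoding with surrogateescape keeps the bad byte as a lone surrogate
# (U+DCFF) next to the successfully decoded non-ASCII character (U+00E9).
text = raw.decode('utf-8', 'surrogateescape')


def encode_unknown_8bit_ascii(s):
    """Hypothetical helper: recover bytes via the ascii codec."""
    return s.encode('ascii', 'surrogateescape')


def encode_unknown_8bit_utf8(s):
    """Hypothetical helper: recover bytes via the utf-8 codec."""
    return s.encode('utf-8', 'surrogateescape')


# ASCII text plus unknown bytes: both codecs recover the original bytes.
ascii_only = b'abc \xff'.decode('ascii', 'surrogateescape')
assert encode_unknown_8bit_ascii(ascii_only) == b'abc \xff'
assert encode_unknown_8bit_utf8(ascii_only) == b'abc \xff'

# Mixed case: the ascii codec cannot encode U+00E9, because surrogateescape
# only handles the escaped bytes, not real non-ASCII characters.
try:
    encode_unknown_8bit_ascii(text)
except UnicodeEncodeError as exc:
    print('ascii codec failed:', exc.reason)

# The utf-8 codec round-trips both the decoded character and the bad byte.
print(encode_unknown_8bit_utf8(text))
```

With utf-8, successfully decoded characters are re-encoded normally and the
escaped bytes come back unchanged, so the original byte string is preserved
in this example.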
Lib/email/_encoded_words.py
Lib/test/test_email/test__header_value_parser.py
Misc/NEWS.d/next/Library/2025-12-10-10-00-06.gh-issue-142517.fG4hbe.rst [new file with mode: 0644]