git.ipfire.org Git - thirdparty/Python/cpython.git/commit
[3.14] gh-128110: Fix rfc2047 whitespace handling in email parser address headers...
author Miss Islington (bot) <31488909+miss-islington@users.noreply.github.com>
Wed, 13 May 2026 20:28:22 +0000 (22:28 +0200)
committer GitHub <noreply@github.com>
Wed, 13 May 2026 20:28:22 +0000 (16:28 -0400)
commit 20175bbeba9f84ce29dab4f6241b91f7dfb1ee18
tree 0a9d3e8f5050d9e8ba214a8de834917f4657bdbc
parent 7d95a1dc7382b55cba7fdd6a110336077584a4f0
[3.14] gh-128110: Fix rfc2047 whitespace handling in email parser address headers (GH-130749) (#149788)

RFC 2047 Section 6.2 requires that "any 'linear-white-space' that
separates a pair of adjacent 'encoded-word's is ignored." The modern
header value parser correctly implements that for unstructured headers,
but had missed a case in structured headers. This could cause a parsed
address header to include extraneous spaces in a display-name.
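
As a rough illustration of the RFC 2047 Section 6.2 rule quoted above (not the parser's actual code), the sketch below removes whitespace that separates two adjacent encoded-words in a raw header value. The name `drop_ws_between_encoded_words` and the simplified `ENCODED_WORD` pattern are assumptions made for this example only.

```python
import re

# Simplified (assumed) pattern for an RFC 2047 encoded-word:
#   =?charset?encoding?encoded-text?=
ENCODED_WORD = r"=\?[^?\s]+\?[bBqQ]\?[^?\s]*\?="

# Whitespace that follows one encoded-word and precedes another.
# The lookahead leaves the second encoded-word unconsumed, so it can
# start the next match in a run of three or more encoded-words.
_BETWEEN = re.compile(rf"({ENCODED_WORD})[ \t\r\n]+(?={ENCODED_WORD})")


def drop_ws_between_encoded_words(value: str) -> str:
    """Hypothetical helper: drop whitespace between adjacent encoded-words."""
    return _BETWEEN.sub(r"\1", value)


if __name__ == "__main__":
    raw = "=?utf-8?q?a?= =?utf-8?q?b?= <x@example.com>"
    print(drop_ws_between_encoded_words(raw))
```

Whitespace between an encoded-word and ordinary text is left alone, so only the separator inside a run of encoded-words disappears from the display-name.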

Switch to @bitdancer's fix from review feedback. Recharacterize whitespace
between encoded-words (ews) as folding whitespace (fws) after parsing in
get_phrase.
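
The post-parse pass described here might look roughly like the following sketch. It is a sketch only: the `Token` class, the `kind` labels, and both function names are hypothetical, and the real `get_phrase` in `Lib/email/_header_value_parser.py` works on its own token classes. The assumption is that a whitespace token sitting directly between two encoded-word tokens is re-labelled as fws and contributes no text.

```python
from dataclasses import dataclass


@dataclass
class Token:
    # Hypothetical token: kind is "ew" (a decoded encoded-word),
    # "ws" (whitespace), "fws" (ignorable whitespace), or "text".
    kind: str
    text: str


def recharacterize_ws_between_ews(tokens):
    """Return a new list in which whitespace directly between two
    encoded-word tokens is replaced by an empty "fws" token."""
    out = []
    last = len(tokens) - 1
    for i, tok in enumerate(tokens):
        between_ews = (
            tok.kind == "ws"
            and 0 < i < last
            and tokens[i - 1].kind == "ew"
            and tokens[i + 1].kind == "ew"
        )
        out.append(Token("fws", "") if between_ews else tok)
    return out


def display_text(tokens):
    """Join token text the way a display-name would be rendered."""
    return "".join(t.text for t in tokens)


if __name__ == "__main__":
    phrase = [Token("ew", "Jane"), Token("ws", " "), Token("ew", "Doe"),
              Token("ws", " "), Token("text", "Jr")]
    print(display_text(recharacterize_ws_between_ews(phrase)))
```

Doing this as a pass over the finished token list keeps the tokenizing step unchanged; only the classification of the in-between whitespace is revised.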

RDM: This fix is dependent on the fact that "subsequent" atoms will never have
leading whitespace because that's been consumed already. I don't think
it's worth adding extra code for the possibility of leading whitespace
because the parser won't produce it. It's a bit of parser fragility in the
face of code changes, but I think that's a minor concern given the
parser design (which is that it consumes whitespace greedily).
(cherry picked from commit 7a4c6dfb8839eb05fb87baf70364680e45001dd4)

Co-authored-by: Mike Edmunds <medmunds@gmail.com>
Co-authored-by: R David Murray <rdmurray@bitdance.com>
Lib/email/_header_value_parser.py
Lib/test/test_email/test__header_value_parser.py
Misc/NEWS.d/next/Library/2025-03-01-13-36-02.gh-issue-128110.9wx_G0.rst [new file with mode: 0644]