git.ipfire.org Git - thirdparty/Python/cpython.git/commit
gh-96268: Fix loading invalid UTF-8 (GH-96270)
author Miss Islington (bot) <31488909+miss-islington@users.noreply.github.com>
Wed, 7 Sep 2022 21:49:17 +0000 (14:49 -0700)
committer GitHub <noreply@github.com>
Wed, 7 Sep 2022 21:49:17 +0000 (14:49 -0700)
commit ffafa9b91da8731d21958209dd1478f48eaa2d09
tree 91145f611b810169911fa11620ebd838532f2484
parent 9fa21d050abf5ba2b39762e320cb6e6bb8b905c2

This makes tokenizer.c:valid_utf8 match stringlib/codecs.h:decode_utf8.

It also fixes an off-by-one error, introduced in 3.10, in the line number that the tokenizer reports for invalid UTF-8.
(cherry picked from commit 8bc356a7dd50cbdb46d10b8c7e457832431f5d9e)

Co-authored-by: Michael Droettboom <mdboom@gmail.com>
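
For illustration only, here is a minimal Python sketch of the kind of strict UTF-8 check the message describes. It is not the code from Parser/tokenizer.c. The function names utf8_sequence_length and first_invalid_line are hypothetical, and the byte ranges follow the well-formed UTF-8 table in the Unicode Standard (no overlong forms, no surrogates, nothing above U+10FFFF), which is assumed to be what decode_utf8 enforces.

```python
from typing import Optional


def utf8_sequence_length(data: bytes, i: int) -> int:
    """Return the length of the well-formed UTF-8 sequence at data[i], or 0 if invalid."""
    first = data[i]
    if first < 0x80:
        return 1                      # ASCII, single byte
    if first < 0xC2:
        return 0                      # stray continuation byte, or overlong lead 0xC0/0xC1
    if first < 0xE0:
        expected, lo, hi = 1, 0x80, 0xBF
    elif first < 0xF0:
        expected = 2
        lo = 0xA0 if first == 0xE0 else 0x80   # 0xE0 0x80..0x9F would be overlong
        hi = 0x9F if first == 0xED else 0xBF   # 0xED 0xA0..0xBF would be a surrogate
    elif first < 0xF5:
        expected = 3
        lo = 0x90 if first == 0xF0 else 0x80   # 0xF0 0x80..0x8F would be overlong
        hi = 0x8F if first == 0xF4 else 0xBF   # 0xF4 0x90.. would exceed U+10FFFF
    else:
        return 0                      # 0xF5..0xFF never appear in UTF-8
    if i + expected >= len(data):
        return 0                      # sequence is truncated
    if not lo <= data[i + 1] <= hi:
        return 0                      # first continuation byte has a restricted range
    for k in range(2, expected + 1):
        if not 0x80 <= data[i + k] <= 0xBF:
            return 0
    return expected + 1


def first_invalid_line(data: bytes) -> Optional[int]:
    """Return the 1-based line number of the first invalid sequence, or None if all valid."""
    line = 1
    i = 0
    while i < len(data):
        n = utf8_sequence_length(data, i)
        if n == 0:
            return line
        if data[i] == 0x0A:           # count the newline only after validating it
            line += 1
        i += n
    return None


print(first_invalid_line(b"x = 1\ny = '\xff'\n"))  # prints 2
```

The line counter starts at 1 and advances only after a newline byte has been consumed, so an invalid byte on the second line is reported as line 2. Whether the actual off-by-one in the tokenizer had this shape is not stated in the commit message.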
Lib/test/test_source_encoding.py
Misc/NEWS.d/next/Core and Builtins/2022-08-25-10-19-34.gh-issue-96268.AbYrLB.rst [new file with mode: 0644]
Parser/tokenizer.c