]> git.ipfire.org Git - thirdparty/Python/cpython.git/commit
[3.15] gh-149079: Fix O(n^2) canonical ordering in unicodedata.normalize() (GH-149080...
authorMiss Islington (bot) <31488909+miss-islington@users.noreply.github.com>
Tue, 2 Jun 2026 10:10:04 +0000 (12:10 +0200)
committerGitHub <noreply@github.com>
Tue, 2 Jun 2026 10:10:04 +0000 (10:10 +0000)
commitc5512bd7c1dc28055660565275012766941d3066
tree0132d9a66bcb57c76ad07ab2643b1e3b04b6c859
parent1835400ff5c36e01851c740e3fa82b1309a07b97
[3.15] gh-149079: Fix O(n^2) canonical ordering in unicodedata.normalize() (GH-149080) (#150776)

Replace the insertion sort used for canonical ordering of combining
characters with a hybrid approach: insertion sort for short runs (< 20)
and counting sort for longer runs, reducing worst-case complexity from
O(n^2) to O(n). This prevents denial of service via crafted Unicode
strings with many combining characters in alternating CCC order.
(cherry picked from commit 991224b1e8311c85f198f6dd8208bf8cff7fc26f)

Co-authored-by: Seth Larson <seth@python.org>
Co-authored-by: ch4n3-yoon <ch4n3.yoon@gmail.com>
Co-authored-by: Seokchan Yoon <13852925+ch4n3-yoon@users.noreply.github.com>
Co-authored-by: Stan Ulbrych <stan@python.org>
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Co-authored-by: Maurycy Pawłowski-Wieroński <maurycy@maurycy.com>
Lib/test/test_unicodedata.py
Misc/NEWS.d/next/Security/2026-04-27-16-36-11.gh-issue-149079.vKl-LM.rst [new file with mode: 0644]
Modules/unicodedata.c