git.ipfire.org Git - thirdparty/Python/cpython.git/commit
author Petr Viktorin <encukou@gmail.com>
Tue, 2 Jun 2026 16:12:42 +0000 (18:12 +0200)
committer GitHub <noreply@github.com>
Tue, 2 Jun 2026 16:12:42 +0000 (16:12 +0000)
commit ba785b88add96acbf403d65cb157fb2743a33a32
tree f70ad12f76cb6014967d6ce2dd0c98cfb68926ab
parent 13d8f452a11dc58690cb7aba0cad39bcca18f465
[3.13] gh-149079: Fix O(n^2) canonical ordering in unicodedata.normalize() (GH-149080) (#150780)

Replace the insertion sort used for canonical ordering of combining
characters with a hybrid approach: insertion sort for short runs (< 20)
and counting sort for longer runs, reducing worst-case complexity from
O(n^2) to O(n). This prevents denial of service via crafted Unicode
strings with many combining characters in alternating CCC order.
(cherry picked from commit 991224b1e8311c85f198f6dd8208bf8cff7fc26f)
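
The hybrid approach described above can be sketched in Python. This is an
illustration only, not the C code in Modules/unicodedata.c: the names
SHORT_RUN, reorder_run and canonical_order are hypothetical, and only the
threshold of 20 comes from the commit message. It assumes canonical
combining classes (CCC) fit in 0..255, as returned by
unicodedata.combining().

```python
import unicodedata

SHORT_RUN = 20  # threshold taken from the commit message; all names are illustrative


def reorder_run(run):
    """Stably sort one run of combining characters by CCC."""
    if len(run) < SHORT_RUN:
        # Insertion sort: cheap for short runs, quadratic in the worst case.
        out = list(run)
        for i in range(1, len(out)):
            ch = out[i]
            ccc = unicodedata.combining(ch)
            j = i
            # Strict '>' keeps characters with equal CCC in their original order.
            while j > 0 and unicodedata.combining(out[j - 1]) > ccc:
                out[j] = out[j - 1]
                j -= 1
            out[j] = ch
        return out
    # Counting sort: one bucket per possible CCC value, filled in input order,
    # so the result is stable and the cost is linear in the run length.
    buckets = [[] for _ in range(256)]
    for ch in run:
        buckets[unicodedata.combining(ch)].append(ch)
    return [ch for bucket in buckets for ch in bucket]


def canonical_order(text):
    """Reorder each maximal run of non-starters (CCC != 0) in text."""
    out, run = [], []
    for ch in text:
        if unicodedata.combining(ch):
            run.append(ch)
        else:
            # A starter (CCC == 0) ends the current run.
            out.extend(reorder_run(run))
            run = []
            out.append(ch)
    out.extend(reorder_run(run))
    return "".join(out)
```

A crafted input of the kind the message mentions would be a base character
followed by many marks in alternating CCC order, for example
"a" + "\u0301\u0323" * 50000 (U+0301 has CCC 230, U+0323 has CCC 220). With
insertion sort alone, each U+0323 must be shifted past every preceding
U+0301; the counting-sort branch handles the same run in a single pass.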

Co-authored-by: Seth Larson <seth@python.org>
Co-authored-by: ch4n3-yoon <ch4n3.yoon@gmail.com>
Co-authored-by: Seokchan Yoon <13852925+ch4n3-yoon@users.noreply.github.com>
Co-authored-by: Stan Ulbrych <stan@python.org>
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Co-authored-by: Maurycy Pawłowski-Wieroński <maurycy@maurycy.com>
Lib/test/test_unicodedata.py
Misc/NEWS.d/next/Security/2026-04-27-16-36-11.gh-issue-149079.vKl-LM.rst [new file with mode: 0644]
Modules/unicodedata.c