]> git.ipfire.org Git - thirdparty/Python/cpython.git/commit
[3.14] gh-149079: Fix O(n^2) canonical ordering in unicodedata.normalize() (GH-149080)
authorMiss Islington (bot) <31488909+miss-islington@users.noreply.github.com>
Tue, 2 Jun 2026 10:10:30 +0000 (12:10 +0200)
committerGitHub <noreply@github.com>
Tue, 2 Jun 2026 10:10:30 +0000 (10:10 +0000)
commit6b505d1f41f8f3ea0fe5a4786d3a8fff1875cfc0
treeea9924a6abdaf40fe9d9aa8d4c98dbc19c7b1f72
parent8ee3bcf6d8493fcc89056a624cf8ee4ecec1a985
[3.14] gh-149079: Fix O(n^2) canonical ordering in unicodedata.normalize() (GH-149080)

Replace the insertion sort used for canonical ordering of combining
characters with a hybrid approach: insertion sort for short runs (< 20)
and counting sort for longer runs, reducing worst-case complexity from
O(n^2) to O(n). This prevents denial of service via crafted Unicode
strings with many combining characters in alternating CCC order.
(cherry picked from commit 991224b1e8311c85f198f6dd8208bf8cff7fc26f)

Co-authored-by: Seth Larson <seth@python.org>
Co-authored-by: ch4n3-yoon <ch4n3.yoon@gmail.com>
Co-authored-by: Seokchan Yoon <13852925+ch4n3-yoon@users.noreply.github.com>
Co-authored-by: Stan Ulbrych <stan@python.org>
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Co-authored-by: Maurycy Pawłowski-Wieroński <maurycy@maurycy.com>
Lib/test/test_unicodedata.py
Misc/NEWS.d/next/Security/2026-04-27-16-36-11.gh-issue-149079.vKl-LM.rst [new file with mode: 0644]
Modules/unicodedata.c