git.ipfire.org Git - thirdparty/Python/cpython.git/commit
gh-74902: Add Unicode Grapheme Cluster Break algorithm (GH-143076)
author Serhiy Storchaka <storchaka@gmail.com>
Wed, 14 Jan 2026 14:37:57 +0000 (16:37 +0200)
committer GitHub <noreply@github.com>
Wed, 14 Jan 2026 14:37:57 +0000 (14:37 +0000)
commit bab1d7a561ab015dd6bb97e255fd12a8ce367edf
tree 24e58b6f5d03dda6c1751312437e31d93844ee93
parent 0e0d51cdcef903d8a990c8e264f32f2f28af0673

Add the unicodedata.iter_graphemes() function to iterate over grapheme
clusters according to rules defined in Unicode Standard Annex #29.

Add unicodedata.grapheme_cluster_break(), unicodedata.indic_conjunct_break()
and unicodedata.extended_pictographic() functions to get the character
properties that this algorithm relies on.
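
To illustrate why grapheme clusters differ from code points, here is a minimal sketch that runs on Python versions without this commit. It is a deliberately rough approximation, not the algorithm added here: it only glues combining marks and ZERO WIDTH JOINER sequences onto the preceding character, and it ignores most of the rules in Unicode Standard Annex #29 (Hangul syllables, regional indicators, Indic conjuncts and so on). The helper name naive_clusters is hypothetical and is not part of unicodedata.

```python
import unicodedata

ZWJ = "\u200d"  # ZERO WIDTH JOINER


def naive_clusters(text):
    """Rough approximation of grapheme clusters (NOT the full UAX #29 rules).

    A character is appended to the previous cluster if it is a combining
    mark (general category M*), if it is a ZWJ, or if it directly follows
    a ZWJ. Every other character starts a new cluster.
    """
    clusters = []
    for ch in text:
        extends = unicodedata.category(ch).startswith("M") or ch == ZWJ
        follows_zwj = bool(clusters) and clusters[-1].endswith(ZWJ)
        if clusters and (extends or follows_zwj):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


# 'e' + COMBINING ACUTE ACCENT, then 'a': three code points, two clusters.
sample = "e\u0301a"
print(len(sample), len(naive_clusters(sample)))
```

On a build that includes this commit, unicodedata.iter_graphemes() should be preferred, since it follows the full set of rules rather than the two shortcuts used above.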

Co-authored-by: Guillaume "Vermeille" Sanchez <guillaume.v.sanchez@gmail.com>
Doc/library/unicodedata.rst
Doc/whatsnew/3.15.rst
Lib/test/test_unicodedata.py
Misc/ACKS
Misc/NEWS.d/next/Library/2025-12-22-18-25-54.gh-issue-74902.HqrWUV.rst [new file with mode: 0644]
Modules/clinic/unicodedata.c.h
Modules/unicodedata.c
Modules/unicodedata_db.h
Tools/unicode/makeunicodedata.py