git.ipfire.org Git - thirdparty/postgresql.git/commit
Fix off-by-one with NFC recomposition for Hangul U+11A7 (TBASE) REL_18_STABLE github/REL_18_STABLE
author Michael Paquier <michael@paquier.xyz>
Thu, 4 Jun 2026 22:50:12 +0000 (07:50 +0900)
committer Michael Paquier <michael@paquier.xyz>
Thu, 4 Jun 2026 22:50:12 +0000 (07:50 +0900)
commit 273fe94852b3a7e34fd171e8abdf1481beb302fa
tree 50c03cb54489d38bc23cf7d51af404108e21b6a5
parent c5194139cb4c9cf8284a6e433418c0323a7e7650
Fix off-by-one with NFC recomposition for Hangul U+11A7 (TBASE)

The NFC recomposition treated TBASE (U+11A7) as a valid T syllable,
which is incorrect per the Unicode specification: TBASE is one below
the start of the valid range, which begins at U+11A8.

As a result, a TBASE following an LV syllable was silently swallowed
during normalization, leading to an incorrect result.
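
The range check described above can be illustrated with a minimal
sketch of Hangul pair composition. This is not the code from
src/common/unicode_norm.c: the function name hangul_compose and its
signature are hypothetical, and only the constants come from the
Unicode standard. The point of interest is the strict "code > TBASE"
comparison, which excludes U+11A7 itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hangul composition constants as defined by the Unicode standard. */
#define SBASE  0xAC00u
#define LBASE  0x1100u
#define VBASE  0x1161u
#define TBASE  0x11A7u          /* one below the first valid T (U+11A8) */
#define LCOUNT 19u
#define VCOUNT 21u
#define TCOUNT 28u
#define NCOUNT (VCOUNT * TCOUNT)
#define SCOUNT (LCOUNT * NCOUNT)

/*
 * Hypothetical helper: try to compose "start" followed by "code".
 * Returns true and stores the composed code point in *result on
 * success, false if the pair does not compose algorithmically.
 */
static bool
hangul_compose(uint32_t start, uint32_t code, uint32_t *result)
{
    /* L + V -> LV syllable */
    if (start >= LBASE && start < LBASE + LCOUNT &&
        code >= VBASE && code < VBASE + VCOUNT)
    {
        uint32_t lindex = start - LBASE;
        uint32_t vindex = code - VBASE;

        *result = SBASE + (lindex * VCOUNT + vindex) * TCOUNT;
        return true;
    }

    /*
     * LV + T -> LVT syllable.  The T range is exclusive of TBASE: with
     * "code >= TBASE" the T index would be 0, the result would equal
     * "start", and the TBASE code point would vanish from the output.
     */
    if (start >= SBASE && start < SBASE + SCOUNT &&
        (start - SBASE) % TCOUNT == 0 &&
        code > TBASE && code < TBASE + TCOUNT)
    {
        *result = start + (code - TBASE);
        return true;
    }

    return false;
}
```

With this check, U+AC00 followed by U+11A8 composes to U+AC01, while
U+AC00 followed by U+11A7 is left as two separate code points.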

A couple of regression tests are added to check more patterns with
Hangul recomposition and decomposition, on top of a test to check the
problem with TBASE.  Diego has submitted the code fix, and I have
written the tests.

Author: Diego Frias <mail@dzfrias.dev>
Co-authored-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/B92ED640-7D4A-4505-B09F-3548F58CBB16@dzfrias.dev
Backpatch-through: 14
src/common/unicode_norm.c
src/test/regress/expected/unicode.out
src/test/regress/sql/unicode.sql