From: Darrick J. Wong
Date: Mon, 24 Feb 2025 18:22:08 +0000 (-0800)
Subject: xfs_scrub: don't warn about zero width joiner control characters
X-Git-Tag: v6.14.0~35
X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=87c2a10e77d799e2c7642a2815f65a1771cf3120;p=thirdparty%2Fxfsprogs-dev.git

xfs_scrub: don't warn about zero width joiner control characters

The Unicode code point for "zero width joiner" (ZWJ, U+200D) is used to
hint to renderers that a sequence of simple code points should be
combined into a more complex rendering.  This is how compound emoji such
as "wounded heart" are composed out of "heart" and "bandaid", and how
complex glyphs are rendered in Malayalam.

Emoji in filenames are a supported use case, so stop warning about the
mere existence of ZWJ.  We already warn about ZWJ that are used to
produce confusingly rendered names in a single namespace, so we're not
losing any robustness here.

Cc: # v6.10.0
Fixes: d43362c78e3e37 ("xfs_scrub: store bad flags with the name entry")
Signed-off-by: "Darrick J. Wong"
Reviewed-by: Christoph Hellwig
---

diff --git a/scrub/unicrash.c b/scrub/unicrash.c
index 143060b5..b83bef64 100644
--- a/scrub/unicrash.c
+++ b/scrub/unicrash.c
@@ -508,8 +508,14 @@ name_entry_examine(
 		if (is_nonrendering(uchr))
 			ret |= UNICRASH_INVISIBLE;
 
-		/* control characters */
-		if (u_iscntrl(uchr))
+		/*
+		 * Warn about control characters in filenames except for zero
+		 * width joiners because those are used to construct compound
+		 * emoji and glyphs in various languages.  ZWJ is already
+		 * covered by UNICRASH_INVISIBLE, so we can detect its use in
+		 * confusing names.
+		 */
+		if (uchr != 0x200D && u_iscntrl(uchr))
 			ret |= UNICRASH_CONTROL_CHAR;
 
 		switch (u_charDirection(uchr)) {
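
For illustration only, here is a minimal standalone sketch of the check this patch changes. It is not the xfs_scrub code: the SKETCH_* flag values, sketch_iscntrl(), sketch_is_nonrendering() and sketch_examine() are all hypothetical stand-ins so the example builds without ICU, and the two predicates cover only a handful of code points rather than the full sets that u_iscntrl() and is_nonrendering() recognise.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values; the real UNICRASH_* definitions are not in this patch. */
#define SKETCH_CONTROL_CHAR	(1U << 0)
#define SKETCH_INVISIBLE	(1U << 1)

#define SKETCH_ZWJ		0x200D	/* zero width joiner */

/*
 * Simplified stand-in for ICU's u_iscntrl(): C0/C1 controls plus a few
 * format characters.  The real function covers far more code points.
 */
static bool sketch_iscntrl(uint32_t c)
{
	if (c <= 0x1F || (c >= 0x7F && c <= 0x9F))
		return true;
	/* ZWNJ, ZWJ, LRM, RLM */
	return c >= 0x200C && c <= 0x200F;
}

/* Simplified stand-in for is_nonrendering(): only ZWSP, ZWNJ and ZWJ. */
static bool sketch_is_nonrendering(uint32_t c)
{
	return c >= 0x200B && c <= 0x200D;
}

/* Scan an array of code points and collect the flags for the whole name. */
unsigned int sketch_examine(const uint32_t *cps, size_t n)
{
	unsigned int ret = 0;

	for (size_t i = 0; i < n; i++) {
		if (sketch_is_nonrendering(cps[i]))
			ret |= SKETCH_INVISIBLE;

		/* ZWJ is exempt from the control character warning. */
		if (cps[i] != SKETCH_ZWJ && sketch_iscntrl(cps[i]))
			ret |= SKETCH_CONTROL_CHAR;
	}
	return ret;
}
```

With this sketch, a name made of the code points U+2764 U+FE0F U+200D U+1FA79 (heart, variation selector, ZWJ, adhesive bandage) is flagged only as invisible, while a name containing U+0007 still gets the control character flag.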