git.ipfire.org Git - thirdparty/xfsprogs-dev.git/commitdiff
xfs_scrub: don't warn about zero width joiner control characters
author    Darrick J. Wong <djwong@kernel.org>  Mon, 24 Feb 2025 18:22:08 +0000 (10:22 -0800)
committer Darrick J. Wong <djwong@kernel.org>  Tue, 25 Feb 2025 17:16:03 +0000 (09:16 -0800)

The Unicode code point for "zero width joiner" (ZWJ, U+200D) is used to
hint to renderers that a sequence of simple code points should be
combined into a more complex rendering.  This is how compound emoji such
as "wounded heart" are composed out of "heart" and "bandaid", and how
complex glyphs are rendered in Malayalam.

Emoji in filenames are a supported use case, so stop warning about the
mere existence of ZWJ.  We already warn about ZWJ that are used to
produce confusingly rendered names in a single namespace, so we're not
losing any robustness here.

Cc: <linux-xfs@vger.kernel.org> # v6.10.0
Fixes: d43362c78e3e37 ("xfs_scrub: store bad flags with the name entry")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
scrub/unicrash.c

index 143060b569f27c0dcbe040b2db15c64bccad5f20..b83bef644b6dceb5ec6925210e29b3d4b68cd8c3 100644
@@ -508,8 +508,14 @@ name_entry_examine(
                if (is_nonrendering(uchr))
                        ret |= UNICRASH_INVISIBLE;
 
-               /* control characters */
-               if (u_iscntrl(uchr))
+               /*
+                * Warn about control characters in filenames except for zero
+                * width joiners because those are used to construct compound
+                * emoji and glyphs in various languages.  ZWJ is already
+                * covered by UNICRASH_INVISIBLE, so we can detect its use in
+                * confusing names.
+                */
+               if (uchr != 0x200D && u_iscntrl(uchr))
                        ret |= UNICRASH_CONTROL_CHAR;
 
                switch (u_charDirection(uchr)) {
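
To illustrate the effect of the changed condition, here is a minimal standalone sketch. It assumes nothing from xfsprogs or ICU: the flag values, `sketch_iscntrl()`, `sketch_is_nonrendering()`, and `sketch_examine()` are all hypothetical stand-ins. In particular, `sketch_iscntrl()` only approximates ICU's `u_iscntrl()` (which also treats format characters such as ZWJ as controls) with a few hard-coded ranges, and `sketch_is_nonrendering()` covers just three code points.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values for this sketch only. */
#define SKETCH_CONTROL_CHAR	(1U << 0)
#define SKETCH_INVISIBLE	(1U << 1)

#define ZERO_WIDTH_JOINER	0x200D

/*
 * Simplified stand-in for ICU's u_iscntrl(): C0/C1 controls plus a few
 * format characters.  The real function consults the Unicode database.
 */
static int sketch_iscntrl(uint32_t c)
{
	if (c <= 0x1F || (c >= 0x7F && c <= 0x9F))
		return 1;
	/* U+200B..U+200F (zero width and mark characters), U+202A..U+202E (bidi) */
	return (c >= 0x200B && c <= 0x200F) || (c >= 0x202A && c <= 0x202E);
}

/* Simplified stand-in for is_nonrendering(): ZWSP, ZWNJ and ZWJ only. */
static int sketch_is_nonrendering(uint32_t c)
{
	return c >= 0x200B && c <= 0x200D;
}

/* Scan the code points of a name and collect flags, as the patched loop does. */
static unsigned int sketch_examine(const uint32_t *cps, size_t n)
{
	unsigned int ret = 0;

	for (size_t i = 0; i < n; i++) {
		if (sketch_is_nonrendering(cps[i]))
			ret |= SKETCH_INVISIBLE;

		/* ZWJ is exempt from the control character warning. */
		if (cps[i] != ZERO_WIDTH_JOINER && sketch_iscntrl(cps[i]))
			ret |= SKETCH_CONTROL_CHAR;
	}
	return ret;
}
```

Under these assumptions, the sequence U+2764 U+FE0F U+200D U+1FA79 (heart, variation selector, ZWJ, adhesive bandage) yields only the invisible flag, while a name containing U+0007 or U+200C still raises the control character flag.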