From 9ecc4f4e44ef8797d1fcd01574ebb71999744d73 Mon Sep 17 00:00:00 2001
From: Paul Eggert
Date: Sat, 23 Sep 2023 00:23:26 -0700
Subject: [PATCH] doc: mention Unicode exceptions for wc

---
 doc/coreutils.texi | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 4167660a7c..ee3b1ce11a 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -3859,6 +3859,13 @@ space delimited by white space characters or by start or end of input.
 The current locale determines which characters are white space.
 GNU @command{wc} treats encoding errors as non white space.
 
+@vindex POSIXLY_CORRECT
+Unless the environment variable @env{POSIXLY_CORRECT} is set,
+GNU @command{wc} treats the following Unicode characters as white
+space even if the current locale does not: U+00A0 NO-BREAK SPACE,
+U+2007 FIGURE SPACE, U+202F NARROW NO-BREAK SPACE, and U+2060 WORD
+JOINER.
+
 @item -l
 @itemx --lines
 @opindex -l
-- 
2.47.2