From: Collin Funk
Date: Sat, 13 Dec 2025 07:11:09 +0000 (-0800)
Subject: doc: dd: document the behavior of conv flags on multibyte characters
X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;ds=sidebyside;p=thirdparty%2Fcoreutils.git

doc: dd: document the behavior of conv flags on multibyte characters

* doc/coreutils.texi (dd invocation): Document the behavior of 'dd' on
multibyte characters and some unspecified behavior that will be
documented in a future POSIX release [1].

[1] https://austingroupbugs.net/view.php?id=1959
---

diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index d37cf24714..cab87454ea 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -9280,6 +9280,17 @@ Change lowercase letters to uppercase.
 
 The @samp{lcase} and @samp{ucase} conversions are mutually exclusive.
 
+@c https://austingroupbugs.net/view.php?id=1959
+POSIX leaves the behavior of @samp{lcase} and @samp{ucase} unspecified
+on multibyte characters.  GNU @command{dd} supports only unibyte
+conversion, because multibyte characters may cross block boundaries and
+case conversion may change the length of characters.
+
+POSIX also leaves the behavior of @samp{lcase} and @samp{ucase}
+unspecified if used with @samp{ascii}, @samp{ebcdic}, or @samp{ibm}.
+GNU @command{dd} will perform the case conversion and then perform the
+character set conversion.
+
 @item sparse
 @opindex sparse
 Try to seek rather than write NUL output blocks.
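
Note on the rationale: the two hazards named in the new text (characters crossing block boundaries, and case conversion changing character length) can be reproduced outside of dd. The Python sketch below is only a model and is not taken from dd's source. The helper names are hypothetical, the ASCII-only uppercase rule is an assumption that stands in for a byte-at-a-time conversion in the C locale, and Python's cp037 codec is used as a rough stand-in for an EBCDIC table, which may differ from the table dd uses.

```python
# Illustrative model only; none of these names come from dd's source.

def ucase_unibyte(block: bytes) -> bytes:
    """Uppercase ASCII letters byte by byte; every other byte passes through."""
    return bytes(b - 0x20 if 0x61 <= b <= 0x7A else b for b in block)


def split_blocks(data: bytes, size: int) -> list[bytes]:
    """Cut data into fixed-size blocks, as a block-oriented copy would read it."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def is_valid_utf8(block: bytes) -> bool:
    """Report whether a block decodes as UTF-8 on its own."""
    try:
        block.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


data = "straße café".encode("utf-8")  # ß and é each occupy two bytes
blocks = split_blocks(data, 5)

# Hazard 1: a fixed block size can cut a multibyte character in half, so a
# per-block multibyte conversion would see incomplete sequences.
broken = [b for b in blocks if not is_valid_utf8(b)]

# Hazard 2: full Unicode case mapping can change the encoded length.
# U+0131 (dotless i) is two bytes in UTF-8; its uppercase "I" is one byte.
dotless_i = "\u0131"
shrinks = len(dotless_i.upper().encode("utf-8")) < len(dotless_i.encode("utf-8"))

# A byte-at-a-time conversion has neither problem: block boundaries do not
# matter and the output length always equals the input length.
converted = b"".join(ucase_unibyte(b) for b in blocks)

# Order of operations: case conversion first, character set conversion second.
case_then_charset = ucase_unibyte(b"abc").decode("ascii").encode("cp037")
# Reversing the order with an ASCII-only rule would leave the letters alone,
# because EBCDIC lowercase letters are not in the ASCII a-z byte range.
charset_then_case = ucase_unibyte("abc".encode("cp037"))

print(len(broken), shrinks)
print(converted.decode("utf-8"))
print(case_then_charset.hex(), charset_then_case.hex())
```

In this model the non-ASCII characters pass through unchanged, which is the price of a conversion that is safe at any block size. What an actual dd does with bytes outside ASCII depends on the locale in effect, so the sketch should not be read as a description of its output.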