From 3b809382b8dca9a13f5afe46d4f6569db06fad10 Mon Sep 17 00:00:00 2001
From: Collin Funk
Date: Fri, 12 Dec 2025 23:11:09 -0800
Subject: [PATCH] doc: dd: document the behavior of conv flags on multibyte characters

* doc/coreutils.texi (dd invocation): Document the behavior of 'dd' on
multibyte characters and some unspecified behavior that will be
documented in a future POSIX release [1].

[1] https://austingroupbugs.net/view.php?id=1959
---
 doc/coreutils.texi | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index d37cf24714..cab87454ea 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -9280,6 +9280,17 @@ Change lowercase letters to uppercase.
 
 The @samp{lcase} and @samp{ucase} conversions are mutually exclusive.
 
+@c https://austingroupbugs.net/view.php?id=1959
+POSIX leaves the behavior of @samp{lcase} and @samp{ucase} unspecified
+on multibyte characters. GNU @command{dd} supports only unibyte
+conversion, because multibyte characters may cross block boundaries and
+case conversion may change the length of characters.
+
+POSIX also leaves the behavior of @samp{lcase} and @samp{ucase}
+unspecified if used with @samp{ascii}, @samp{ebcdic}, or @samp{ibm}.
+GNU @command{dd} will perform the case conversion and then perform the
+character set conversion.
+
 @item sparse
 @opindex sparse
 Try to seek rather than write NUL output blocks.
-- 
2.47.3
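
The documented behavior can be checked from a shell. The following is a minimal sketch, assuming GNU dd and the C locale; the helper names dd_ucase and dd_ucase_ebcdic_hex are hypothetical and exist only for this illustration. It shows that ASCII letters are converted, that the bytes of a UTF-8 multibyte character are left alone, and that with conv=ucase,ebcdic the case conversion is applied before the character set conversion.

```shell
# Sketch assuming GNU dd; the helper names are hypothetical.

# Upper-case stdin in the C locale, so only ASCII letters change.
dd_ucase() {
  LC_ALL=C dd conv=ucase 2>/dev/null
}

# Upper-case, then convert to EBCDIC, and print the result as hex bytes.
dd_ucase_ebcdic_hex() {
  LC_ALL=C dd conv=ucase,ebcdic 2>/dev/null | od -An -tx1 | tr -d ' \n'
}

printf 'hello\n' | dd_ucase             # ASCII letters are converted
printf 'caf\303\251\n' | dd_ucase       # the two UTF-8 bytes of 'é' pass through
printf 'a' | dd_ucase_ebcdic_hex; echo  # 'a' becomes 'A', then EBCDIC 0xc1
```

If the character set conversion ran first, the last command would print 81 (EBCDIC 'a', which the C locale does not treat as a letter) rather than c1.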