From: Paul Eggert
Date: Wed, 25 Jun 2025 18:28:24 +0000 (-0700)
Subject: Document s/a/\n/ etc
X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=cbfc5f6cd1ca0aea5a6f5b60bb1c4e99c427947c;p=thirdparty%2Fautoconf.git

Document s/a/\n/ etc

Problem reported by Bruno Haible in:
https://lists.gnu.org/r/autoconf-patches/2025-06/msg00001.html
* doc/autoconf.texi (Limitations of Usual Tools):
Also mention escapes in replacement strings.
While we’re at it, update -e and -f concatenation doc.
---

diff --git a/doc/autoconf.texi b/doc/autoconf.texi
index 127344467..702a4ba16 100644
--- a/doc/autoconf.texi
+++ b/doc/autoconf.texi
@@ -19790,14 +19790,6 @@ $ @kbd{printf '\200\n' | LC_ALL=en_US.ISO8859-1 sed -n /./p | wc -l}
 1
 @end example
 
-Portable @command{sed} regular expressions should use @samp{\} only to escape
-characters in the string @samp{$()*.123456789[\^n@{@}}. For example,
-alternation, @samp{\|}, is common but POSIX does not require its
-support, so it should be avoided in portable scripts. Solaris
-@command{sed} does not support alternation; e.g., @samp{sed '/a\|b/d'}
-deletes only lines that contain the literal string @samp{a|b}.
-Similarly, @samp{\+} and @samp{\?} should be avoided.
-
 Anchors (@samp{^} and @samp{$}) inside groups are not portable.
 
 Nested parentheses in patterns (e.g., @samp{\(\(a*\)b*)\)}) are
@@ -19817,6 +19809,27 @@ $ @kbd{echo '1*23*4' | /usr/xpg4/bin/sed 's/\(.\)*/x/g'}
 x
 @end example
 
+Portable @command{sed} regular expressions should use @samp{\} only to escape
+characters in the string @samp{$()*.123456789[\^n@{@}}. For example,
+alternation, @samp{\|}, is common but POSIX does not require its
+support, so it should be avoided in portable scripts. Solaris
+@command{sed} does not support alternation; e.g., @samp{sed '/a\|b/d'}
+deletes only lines that contain the literal string @samp{a|b}.
+Similarly, @samp{\+} and @samp{\?} should be avoided.
+
+Portable @command{sed} replacement strings should use @samp{\} only to
+escape the delimiter character, newline, and characters in the string
+@samp{&123456789\}. For example, although in GNU @command{sed}
+the command @samp{s/AB/\n/} replaces the first @samp{AB} with a newline,
+in OpenBSD and Solaris @command{sed} it acts like @samp{s/AB/n/}.
+To be portable, escape a newline instead:
+
+@example
+# This is portable; "sed 's/AB/\n/'" is not.
+sed 's/AB/\
+/'
+@end example
+
 The @option{-e} option is mostly portable.
 However, its argument cannot be empty, as this fails on AIX 7.3.
 Some people prefer to use @samp{-e}:
@@ -19849,21 +19862,16 @@ but POSIX says that this use of a semicolon has undefined effect if
 should use semicolon only with simple scripts that do not use these
 verbs.
 
-POSIX up to the 2008 revision requires the argument of the @option{-e}
-option to be a syntactically complete script. GNU @command{sed} allows
-to pass multiple script fragments, each as argument of a separate
-@option{-e} option, that are then combined, with newlines between the
-fragments, and a future POSIX revision may allow this as well. This
-approach is not portable with script fragments ending in backslash; for
+POSIX requires each @option{-e} and @option{-f} option to specify a
+syntactically complete script. Although GNU @command{sed} also allows
+@option{-e} and @option{-f} options to specify script fragments
+that it assembles into a full script, this is not portable. For
 example, the @command{sed} programs on Solaris 10, HP-UX 11, and AIX
-don't allow splitting in this case:
+do not allow script fragments:
 
 @example
-$ @kbd{echo a | sed -n -e 'i\}
-@kbd{0'}
-0
-$ @kbd{echo a | sed -n -e 'i\' -e 0}
-Unrecognized command: 0
+$ @kbd{sed -e 'i\' -e ouch}
+Unrecognized command: ouch
 @end example
 
 @noindent