From: Bruno Haible Date: Sun, 22 Jun 2025 08:56:33 +0000 (+0200) Subject: Add support for Shell printf format strings. X-Git-Tag: v0.26~74 X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=c93b9f3976335ff3386423ea705a3228cd945e5c;p=thirdparty%2Fgettext.git Add support for Shell printf format strings. * gettext-tools/src/message.h (enum format_type): Add format_sh_printf. (NFORMATS): Increment. * gettext-tools/src/message.c (format_language, format_language_pretty): Add an entry for format_sh_printf. * gettext-tools/src/format.h (formatstring_sh_printf): New declaration. * gettext-tools/src/format.c (formatstring_parsers): Add an entry for format_sh_printf. * gettext-tools/src/format-sh-printf.c: New file, based on gettext-tools/src/format-awk.c. * gettext-tools/src/FILES: Mention it. * gettext-tools/src/x-sh.h (SCANNERS_SH): Use formatstring_sh_printf as secondary format string type. * gettext-tools/src/xgettext.c (xgettext_record_flag): Update accordingly. * gettext-tools/src/x-sh.c (init_flag_table_sh): Register gettext, ngettext with flag 'pass-sh-printf-format'. Register 'printf' with flag 'sh-printf-format'. * gettext-tools/src/Makefile.am (FORMAT_SOURCE): Add format-sh-printf.c. * gettext-tools/libgettextpo/Makefile.am (libgettextpo_la_AUXSOURCES): Likewise. * gettext-tools/doc/gettext.texi (PO Files): Mention sh-printf-format. (sh-format): Document also the sh-printf-format strings. * gettext-tools/doc/lang-sh.texi (sh): Mention the coreutils 'printf' command. * gettext-tools/tests/xgettext-sh-1: Add a test case with a printf invocation. * gettext-tools/tests/format-sh-printf-1: New file, based on gettext-tools/tests/format-awk-1. * gettext-tools/tests/format-sh-printf-2: New file, based on gettext-tools/tests/format-awk-2. * gettext-tools/tests/Makefile.am (TESTS): Add them. * NEWS: Mention the change. 
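For illustration only (not part of this commit): a minimal shell sketch of the kind of invocation that the new 'sh-printf-format' flag is meant for. It mirrors the test case added to xgettext-sh-1. The helper name show_user and the fallback definition of gettext are assumptions, made so that the sketch also runs where GNU gettext is not installed.

```shell
#!/bin/sh
# Assumption: if no 'gettext' program is available, fall back to printing
# the untranslated message, so that this sketch stays runnable.
if ! command -v gettext >/dev/null 2>&1; then
  gettext () { printf '%s' "$1"; }
fi

# show_user NAME ID  (hypothetical helper, not part of gettext)
# The translated string is used as the first argument of 'printf'. That is
# the argument position which init_flag_table_sh registers with the flag
# 'sh-printf-format'; the gettext call itself only passes the flag through.
show_user () {
  printf "$(gettext 'User name: %s\nUser ID: %u')"'\n' "$1" "$2"
}

show_user "${USER:-nobody}" "$(id -u)"
```

The format string stays inside the gettext call, so a translator sees it as one msgid; the trailing newline is kept outside because it needs no translation.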
--- diff --git a/NEWS b/NEWS index 3edfb4983..0b7218c6b 100644 --- a/NEWS +++ b/NEWS @@ -12,6 +12,8 @@ Version 0.26 - July 2025 in a context that requires a format string. You can override this heuristic by using a comment of the form /* xgettext: c-format */. * Shell: + - xgettext now recognizes format strings in the 'printf' command syntax. + They are marked as 'sh-printf-format' in POT and PO files. - xgettext now recognizes the \c, \u, and \U escape sequences in dollar- single-quoted strings $'...'. diff --git a/gettext-tools/doc/gettext.texi b/gettext-tools/doc/gettext.texi index 90e5c2f97..ce2c0f09d 100644 --- a/gettext-tools/doc/gettext.texi +++ b/gettext-tools/doc/gettext.texi @@ -1733,7 +1733,13 @@ Likewise for Ruby, see @ref{ruby-format}. @kwindex sh-format@r{ flag} @itemx no-sh-format @kwindex no-sh-format@r{ flag} -Likewise for Shell, see @ref{sh-format}. +Likewise for Shell format strings, see @ref{sh-format}. + +@item sh-printf-format +@kwindex sh-printf-format@r{ flag} +@itemx no-sh-printf-format +@kwindex no-sh-printf-format@r{ flag} +Likewise for Shell @code{printf} format strings, see @ref{sh-format}. @item awk-format @kwindex awk-format@r{ flag} @@ -10227,6 +10233,14 @@ equivalent to @code{%<@var{name}>s}. @node sh-format @subsection Shell Format Strings +There are two kinds of format strings in shell scripts: +those with dollar notation for placeholders, +called @emph{Shell format strings} +and labelled as @samp{sh-format}, +and those acceptable to the @samp{printf} command (or shell built-in command), +called @emph{Shell @code{printf} format strings} +and labelled as @samp{sh-printf-format}. + Shell format strings, as supported by GNU gettext and the @samp{envsubst} program, are strings with references to shell variables in the form @code{$@var{variable}} or @code{$@{@var{variable}@}}. References of the form @@ -10243,6 +10257,28 @@ that would be valid inside shell scripts, are not supported. 
The ASCII characters, not start with a digit and be nonempty; otherwise such a variable reference is ignored. +Shell @code{printf} format strings are the format strings supported +by the POSIX @samp{printf} command +(@url{https://pubs.opengroup.org/onlinepubs/9799919799/utilities/printf.html}), +including the floating-point conversion specifiers +@code{a}, @code{A}, @code{e}, @code{E}, @code{f}, @code{F}, @code{g}, @code{G}, +but without the obsolescent @code{b} conversion specifier. +Extensions by the GNU coreutils @samp{printf} command +(@url{https://www.gnu.org/software/coreutils/manual/html_node/printf-invocation.html}) +are not supported: +use of @samp{*} or @samp{*@var{m}$} as width or precision; +use of size specifiers @code{h}, @code{l}, @code{j}, @code{z}, @code{t} (ignored); +and the escape sequences @code{\c}, +@code{\x@var{nn}}, @code{\u@var{nnnn}}, @code{\U@var{nnnnnnnn}}. +Extensions by the GNU bash @samp{printf} built-in +(@url{https://www.gnu.org/software/bash/manual/html_node/Bash-Builtins.html#index-printf}) +are not supported either: +use of @samp{*} as width or precision; +use of size specifiers @code{h}, @code{l}, @code{j}, @code{z}, @code{t} (ignored); +the @code{%b}, @code{%q}, @code{%Q}, @code{%T}, @code{%n} directives; +and the escape sequences +@code{\x@var{nn}}, @code{\u@var{nnnn}}, @code{\U@var{nnnnnnnn}}. + @node awk-format @subsection awk Format Strings diff --git a/gettext-tools/doc/lang-sh.texi b/gettext-tools/doc/lang-sh.texi index 0da43c360..e068a8f60 100644 --- a/gettext-tools/doc/lang-sh.texi +++ b/gettext-tools/doc/lang-sh.texi @@ -1,5 +1,5 @@ @c This file is part of the GNU gettext manual. -@c Copyright (C) 1995-2024 Free Software Foundation, Inc. +@c Copyright (C) 1995-2025 Free Software Foundation, Inc. @c See the file gettext.texi for copying conditions. @node sh @@ -50,10 +50,11 @@ use @code{xgettext} @item Formatting with positions ---- -@c Not yet: It requires support in GNU coreutils, GNU bash, dash, etc. 
-@c @url{https://pubs.opengroup.org/onlinepubs/9799919799/utilities/printf.html, -@c @code{printf}} +A POSIX compliant +@url{https://pubs.opengroup.org/onlinepubs/9799919799/utilities/printf.html, + @code{printf}} +command, such as the one from GNU coreutils 9.6 or newer. +@c GNU Bash built-in? @item Portability fully portable diff --git a/gettext-tools/libgettextpo/Makefile.am b/gettext-tools/libgettextpo/Makefile.am index e4d5f6c24..0de3f9148 100644 --- a/gettext-tools/libgettextpo/Makefile.am +++ b/gettext-tools/libgettextpo/Makefile.am @@ -83,6 +83,7 @@ libgettextpo_la_AUXSOURCES = \ ../src/format-go.c \ ../src/format-ruby.c \ ../src/format-sh.c \ + ../src/format-sh-printf.c \ ../src/format-awk.c \ ../src/format-lua.c \ ../src/format-pascal.c \ diff --git a/gettext-tools/src/FILES b/gettext-tools/src/FILES index 49ae6164b..9d900a867 100644 --- a/gettext-tools/src/FILES +++ b/gettext-tools/src/FILES @@ -240,6 +240,7 @@ format-rust.c Format string handling for Rust. format-go.c Format string handling for Go. format-ruby.c Format string handling for Ruby. format-sh.c Format string handling for Shell. +format-sh-printf.c Format string handling for Shell, printf syntax. format-awk.c Format string handling for awk. format-lua.c Format string handling for Lua. format-pascal.c Format string handling for Object Pascal. diff --git a/gettext-tools/src/Makefile.am b/gettext-tools/src/Makefile.am index eeae86aeb..4ce51af0a 100644 --- a/gettext-tools/src/Makefile.am +++ b/gettext-tools/src/Makefile.am @@ -203,6 +203,7 @@ FORMAT_SOURCE += \ format-go.c \ format-ruby.c \ format-sh.c \ + format-sh-printf.c \ format-awk.c \ format-lua.c \ format-pascal.c \ diff --git a/gettext-tools/src/format-sh-printf.c b/gettext-tools/src/format-sh-printf.c new file mode 100644 index 000000000..bcaf2d188 --- /dev/null +++ b/gettext-tools/src/format-sh-printf.c @@ -0,0 +1,608 @@ +/* Shell printf format strings. + Copyright (C) 2001-2025 Free Software Foundation, Inc. 
+ Written by Bruno Haible , 2025. + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see <https://www.gnu.org/licenses/>. */ + +#ifdef HAVE_CONFIG_H +# include <config.h> +#endif + +#include <stdbool.h> +#include <stdlib.h> + +#include "format.h" +#include "c-ctype.h" +#include "xalloc.h" +#include "xvasprintf.h" +#include "format-invalid.h" +#include "gettext.h" + +#define _(str) gettext (str) + +/* Shell printf format strings are described in + * POSIX: + + + * The GNU coreutils documentation: + + * The GNU bash documentation: + + + The format string consists of + - plain text, + - directives, that start with '%', + - escape sequences, that start with a backslash and don't contain '%'. + + The set of supported directives and escape sequences is documented in gettext.texi. + + A directive + - starts with '%' or '%m$' where m is a positive integer, + - is optionally followed by any of the characters '#', '0', '-', ' ', '+', + each of which acts as a flag, + - is optionally followed by a width specification: a nonempty digit sequence, + [not in POSIX: '*' (reads an argument) or '*m$'] + - is optionally followed by '.' 
and a precision specification: an optional + nonempty digit sequence, + [not in POSIX: '*' (reads an argument) or '*m$'] + - [not in POSIX: is optionally followed by a size specifier, one of + 'hh' 'h' 'l' 'll' 'L' 'q' 'j' 'z' 't'] + - is finished by a specifier + - 'c', that needs a character argument, + - 's', that needs a string argument, + - 'i', 'd', that need a signed integer argument, + - 'u', 'o', 'x', 'X', that need an unsigned integer argument, + - [optional in POSIX, but supported here:] 'e', 'E', 'f', 'F', 'g', 'G', + 'a', 'A', that need a floating-point argument. + Additionally there is the directive '%%', which takes no argument. + Numbered ('%m$' or '*m$') and unnumbered argument specifications cannot + be used in the same string. + + The valid escape sequences are: + \\ \a \b \f \n \r \t \v + \nnn with 1 to 3 octal digits n + [not in POSIX: \c \xnn \unnnn \Unnnnnnnn] + */ + +enum format_arg_type +{ + FAT_NONE, + FAT_CHARACTER, + FAT_STRING, + FAT_INTEGER, + FAT_UNSIGNED_INTEGER, + FAT_FLOAT +}; + +struct numbered_arg +{ + unsigned int number; + enum format_arg_type type; +}; + +struct spec +{ + unsigned int directives; + /* We consider a directive as "likely intentional" if it does not contain a + space. This prevents xgettext from flagging strings like "100% complete" + as 'sh-printf-format' if they don't occur in a context that requires a + format string. */ + unsigned int likely_intentional_directives; + unsigned int numbered_arg_count; + struct numbered_arg *numbered; +}; + + +static int +numbered_arg_compare (const void *p1, const void *p2) +{ + unsigned int n1 = ((const struct numbered_arg *) p1)->number; + unsigned int n2 = ((const struct numbered_arg *) p2)->number; + + return (n1 > n2 ? 1 : n1 < n2 ? 
-1 : 0); +} + +static void * +format_parse (const char *format, bool translated, char *fdi, + char **invalid_reason) +{ + const char *const format_start = format; + struct spec spec; + unsigned int numbered_allocated; + unsigned int unnumbered_arg_count; + struct spec *result; + + spec.directives = 0; + spec.likely_intentional_directives = 0; + spec.numbered_arg_count = 0; + spec.numbered = NULL; + numbered_allocated = 0; + unnumbered_arg_count = 0; + + for (; *format != '\0';) + /* Invariant: spec.numbered_arg_count == 0 || unnumbered_arg_count == 0. */ + if (*format == '%') + { + /* A directive. */ + bool likely_intentional = true; + + FDI_SET (format, FMTDIR_START); + format++; + spec.directives++; + + if (*format != '%') + { + unsigned int number = 0; + enum format_arg_type type; + + if (c_isdigit (*format)) + { + const char *f = format; + unsigned int m = 0; + + do + { + m = 10 * m + (*f - '0'); + f++; + } + while (c_isdigit (*f)); + + if (*f == '$') + { + if (m == 0) + { + *invalid_reason = INVALID_ARGNO_0 (spec.directives); + FDI_SET (f, FMTDIR_ERROR); + goto bad_format; + } + number = m; + format = ++f; + } + } + + /* Parse flags. */ + while (*format == ' ' || *format == '+' || *format == '-' + || *format == '#' || *format == '0') + { + if (*format == ' ') + likely_intentional = false; + format++; + } + + /* Parse width. */ + if (c_isdigit (*format)) + { + do format++; while (c_isdigit (*format)); + } + + /* Parse precision. 
*/ + if (*format == '.') + { + format++; + + while (c_isdigit (*format)) + format++; + } + + switch (*format) + { + case 'c': + type = FAT_CHARACTER; + break; + case 's': + type = FAT_STRING; + break; + case 'i': case 'd': + type = FAT_INTEGER; + break; + case 'u': case 'o': case 'x': case 'X': + type = FAT_UNSIGNED_INTEGER; + break; + case 'e': case 'E': case 'f': case 'F': case 'g': case 'G': + case 'a': case 'A': + type = FAT_FLOAT; + break; + default: + if (*format == '\0') + { + *invalid_reason = INVALID_UNTERMINATED_DIRECTIVE (); + FDI_SET (format - 1, FMTDIR_ERROR); + } + else + { + *invalid_reason = + INVALID_CONVERSION_SPECIFIER (spec.directives, *format); + FDI_SET (format, FMTDIR_ERROR); + } + goto bad_format; + } + + if (number) + { + /* Numbered argument. */ + + /* Numbered and unnumbered specifications are exclusive. */ + if (unnumbered_arg_count > 0) + { + *invalid_reason = INVALID_MIXES_NUMBERED_UNNUMBERED (); + FDI_SET (format, FMTDIR_ERROR); + goto bad_format; + } + + if (numbered_allocated == spec.numbered_arg_count) + { + numbered_allocated = 2 * numbered_allocated + 1; + spec.numbered = (struct numbered_arg *) xrealloc (spec.numbered, numbered_allocated * sizeof (struct numbered_arg)); + } + spec.numbered[spec.numbered_arg_count].number = number; + spec.numbered[spec.numbered_arg_count].type = type; + spec.numbered_arg_count++; + } + else + { + /* Unnumbered argument. */ + + /* Numbered and unnumbered specifications are exclusive. 
*/ + if (spec.numbered_arg_count > 0) + { + *invalid_reason = INVALID_MIXES_NUMBERED_UNNUMBERED (); + FDI_SET (format, FMTDIR_ERROR); + goto bad_format; + } + + if (numbered_allocated == unnumbered_arg_count) + { + numbered_allocated = 2 * numbered_allocated + 1; + spec.numbered = (struct numbered_arg *) xrealloc (spec.numbered, numbered_allocated * sizeof (struct numbered_arg)); + } + spec.numbered[unnumbered_arg_count].number = unnumbered_arg_count + 1; + spec.numbered[unnumbered_arg_count].type = type; + unnumbered_arg_count++; + } + } + + if (likely_intentional) + spec.likely_intentional_directives++; + FDI_SET (format, FMTDIR_END); + + format++; + } + else if (*format == '\\') + { + /* An escape sequence. */ + FDI_SET (format, FMTDIR_START); + format++; + + switch (*format) + { + case '\\': + case 'a': + case 'b': + case 'f': + case 'n': + case 'r': + case 't': + case 'v': + format++; + break; + + case '0': case '1': case '2': case '3': case '4': case '5': case '6': + case '7': + format++; + if (*format >= '0' && *format <= '7') + { + format++; + if (*format >= '0' && *format <= '7') + format++; + } + break; + + default: + if (*format == '\0') + { + *invalid_reason = + xstrdup (_("The string ends in the middle of an escape sequence.")); + FDI_SET (format - 1, FMTDIR_ERROR); + } + else + { + *invalid_reason = + (c_isprint (*format) + ? ((*format == 'c' + || *format == 'x' + || *format == 'u' || *format == 'U') + ? xasprintf (_("The escape sequence '%c%c' is unsupported (not in POSIX)."), '\\', *format) + : xasprintf (_("The escape sequence '%c%c' is invalid."), '\\', *format)) + : xstrdup (_("This escape sequence is invalid."))); + FDI_SET (format, FMTDIR_ERROR); + } + goto bad_format; + } + FDI_SET (format - 1, FMTDIR_END); + } + else + format++; + + /* Convert the unnumbered argument array to numbered arguments. */ + if (unnumbered_arg_count > 0) + spec.numbered_arg_count = unnumbered_arg_count; + /* Sort the numbered argument array, and eliminate duplicates. 
*/ + else if (spec.numbered_arg_count > 1) + { + unsigned int i, j; + bool err; + + qsort (spec.numbered, spec.numbered_arg_count, + sizeof (struct numbered_arg), numbered_arg_compare); + + /* Remove duplicates: Copy from i to j, keeping 0 <= j <= i. */ + err = false; + for (i = j = 0; i < spec.numbered_arg_count; i++) + if (j > 0 && spec.numbered[i].number == spec.numbered[j-1].number) + { + enum format_arg_type type1 = spec.numbered[i].type; + enum format_arg_type type2 = spec.numbered[j-1].type; + enum format_arg_type type_both; + + if (type1 == type2) + type_both = type1; + else + { + /* Incompatible types. */ + type_both = FAT_NONE; + if (!err) + *invalid_reason = + INVALID_INCOMPATIBLE_ARG_TYPES (spec.numbered[i].number); + err = true; + } + + spec.numbered[j-1].type = type_both; + } + else + { + if (j < i) + { + spec.numbered[j].number = spec.numbered[i].number; + spec.numbered[j].type = spec.numbered[i].type; + } + j++; + } + spec.numbered_arg_count = j; + if (err) + /* *invalid_reason has already been set above. 
*/ + goto bad_format; + } + + result = XMALLOC (struct spec); + *result = spec; + return result; + + bad_format: + if (spec.numbered != NULL) + free (spec.numbered); + return NULL; +} + +static void +format_free (void *descr) +{ + struct spec *spec = (struct spec *) descr; + + if (spec->numbered != NULL) + free (spec->numbered); + free (spec); +} + +static int +format_get_number_of_directives (void *descr) +{ + struct spec *spec = (struct spec *) descr; + + return spec->directives; +} + +static bool +format_is_unlikely_intentional (void *descr) +{ + struct spec *spec = (struct spec *) descr; + + return spec->likely_intentional_directives == 0; +} + +static bool +format_check (void *msgid_descr, void *msgstr_descr, bool equality, + formatstring_error_logger_t error_logger, void *error_logger_data, + const char *pretty_msgid, const char *pretty_msgstr) +{ + struct spec *spec1 = (struct spec *) msgid_descr; + struct spec *spec2 = (struct spec *) msgstr_descr; + bool err = false; + + if (spec1->numbered_arg_count + spec2->numbered_arg_count > 0) + { + unsigned int i, j; + unsigned int n1 = spec1->numbered_arg_count; + unsigned int n2 = spec2->numbered_arg_count; + + /* Check that the argument numbers are the same. + Both arrays are sorted. We search for the first difference. */ + for (i = 0, j = 0; i < n1 || j < n2; ) + { + int cmp = (i >= n1 ? 1 : + j >= n2 ? -1 : + spec1->numbered[i].number > spec2->numbered[j].number ? 1 : + spec1->numbered[i].number < spec2->numbered[j].number ? 
-1 : + 0); + + if (cmp > 0) + { + if (error_logger) + error_logger (error_logger_data, + _("a format specification for argument %u, as in '%s', doesn't exist in '%s'"), + spec2->numbered[j].number, pretty_msgstr, + pretty_msgid); + err = true; + break; + } + else if (cmp < 0) + { + if (equality) + { + if (error_logger) + error_logger (error_logger_data, + _("a format specification for argument %u doesn't exist in '%s'"), + spec1->numbered[i].number, pretty_msgstr); + err = true; + break; + } + else + i++; + } + else + j++, i++; + } + /* Check the argument types are the same. */ + if (!err) + for (i = 0, j = 0; j < n2; ) + { + if (spec1->numbered[i].number == spec2->numbered[j].number) + { + if (spec1->numbered[i].type != spec2->numbered[j].type) + { + if (error_logger) + error_logger (error_logger_data, + _("format specifications in '%s' and '%s' for argument %u are not the same"), + pretty_msgid, pretty_msgstr, + spec2->numbered[j].number); + err = true; + break; + } + j++, i++; + } + else + i++; + } + } + + return err; +} + + +struct formatstring_parser formatstring_sh_printf = +{ + format_parse, + format_free, + format_get_number_of_directives, + format_is_unlikely_intentional, + format_check +}; + + +#ifdef TEST + +/* Test program: Print the argument list specification returned by + format_parse for strings read from standard input. 
*/ + +#include <stdio.h> + +static void +format_print (void *descr) +{ + struct spec *spec = (struct spec *) descr; + unsigned int last; + unsigned int i; + + if (spec == NULL) + { + printf ("INVALID"); + return; + } + + printf ("("); + last = 1; + for (i = 0; i < spec->numbered_arg_count; i++) + { + unsigned int number = spec->numbered[i].number; + + if (i > 0) + printf (" "); + if (number < last) + abort (); + for (; last < number; last++) + printf ("_ "); + switch (spec->numbered[i].type) + { + case FAT_CHARACTER: + printf ("c"); + break; + case FAT_STRING: + printf ("s"); + break; + case FAT_INTEGER: + printf ("i"); + break; + case FAT_UNSIGNED_INTEGER: + printf ("[unsigned]i"); + break; + case FAT_FLOAT: + printf ("f"); + break; + default: + abort (); + } + last = number + 1; + } + printf (")"); +} + +int +main () +{ + for (;;) + { + char *line = NULL; + size_t line_size = 0; + int line_len; + char *invalid_reason; + void *descr; + + line_len = getline (&line, &line_size, stdin); + if (line_len < 0) + break; + if (line_len > 0 && line[line_len - 1] == '\n') + line[--line_len] = '\0'; + + invalid_reason = NULL; + descr = format_parse (line, false, NULL, &invalid_reason); + + format_print (descr); + printf ("\n"); + if (descr == NULL) + printf ("%s\n", invalid_reason); + + free (invalid_reason); + free (line); + } + + return 0; +} + +/* + * For Emacs M-x compile + * Local Variables: + * compile-command: "/bin/sh ../libtool --tag=CC --mode=link gcc -o a.out -static -O -g -Wall -I.. 
-I../gnulib-lib -I../../gettext-runtime/intl -DHAVE_CONFIG_H -DTEST format-sh-printf.c ../gnulib-lib/libgettextlib.la" + * End: + */ + +#endif /* TEST */ diff --git a/gettext-tools/src/format.c b/gettext-tools/src/format.c index 73fe7da50..8df7aa1ef 100644 --- a/gettext-tools/src/format.c +++ b/gettext-tools/src/format.c @@ -51,6 +51,7 @@ struct formatstring_parser *formatstring_parsers[NFORMATS] = /* format_go */ &formatstring_go, /* format_ruby */ &formatstring_ruby, /* format_sh */ &formatstring_sh, + /* format_sh_printf */ &formatstring_sh_printf, /* format_awk */ &formatstring_awk, /* format_lua */ &formatstring_lua, /* format_pascal */ &formatstring_pascal, diff --git a/gettext-tools/src/format.h b/gettext-tools/src/format.h index a08718461..7ab8d35cc 100644 --- a/gettext-tools/src/format.h +++ b/gettext-tools/src/format.h @@ -117,6 +117,7 @@ extern LIBGETTEXTSRC_DLL_VARIABLE struct formatstring_parser formatstring_rust; extern LIBGETTEXTSRC_DLL_VARIABLE struct formatstring_parser formatstring_go; extern LIBGETTEXTSRC_DLL_VARIABLE struct formatstring_parser formatstring_ruby; extern LIBGETTEXTSRC_DLL_VARIABLE struct formatstring_parser formatstring_sh; +extern LIBGETTEXTSRC_DLL_VARIABLE struct formatstring_parser formatstring_sh_printf; extern LIBGETTEXTSRC_DLL_VARIABLE struct formatstring_parser formatstring_awk; extern LIBGETTEXTSRC_DLL_VARIABLE struct formatstring_parser formatstring_lua; extern LIBGETTEXTSRC_DLL_VARIABLE struct formatstring_parser formatstring_pascal; diff --git a/gettext-tools/src/message.c b/gettext-tools/src/message.c index 1c1808fcf..7a605b350 100644 --- a/gettext-tools/src/message.c +++ b/gettext-tools/src/message.c @@ -51,6 +51,7 @@ const char *const format_language[NFORMATS] = /* format_go */ "go", /* format_ruby */ "ruby", /* format_sh */ "sh", + /* format_sh_printf */ "sh-printf", /* format_awk */ "awk", /* format_lua */ "lua", /* format_pascal */ "object-pascal", @@ -90,6 +91,7 @@ const char *const 
format_language_pretty[NFORMATS] = /* format_go */ "Go", /* format_ruby */ "Ruby", /* format_sh */ "Shell", + /* format_sh_printf */ "Shell printf", /* format_awk */ "awk", /* format_lua */ "Lua", /* format_pascal */ "Object Pascal", diff --git a/gettext-tools/src/message.h b/gettext-tools/src/message.h index 5f242acc5..ec386f734 100644 --- a/gettext-tools/src/message.h +++ b/gettext-tools/src/message.h @@ -60,6 +60,7 @@ enum format_type format_go, format_ruby, format_sh, + format_sh_printf, format_awk, format_lua, format_pascal, @@ -79,7 +80,7 @@ enum format_type format_gfc_internal, format_ycp }; -#define NFORMATS 35 /* Number of format_type enum values. */ +#define NFORMATS 36 /* Number of format_type enum values. */ extern LIBGETTEXTSRC_DLL_VARIABLE const char *const format_language[NFORMATS]; extern LIBGETTEXTSRC_DLL_VARIABLE const char *const format_language_pretty[NFORMATS]; diff --git a/gettext-tools/src/x-sh.c b/gettext-tools/src/x-sh.c index 8156bb7df..86fc2083c 100644 --- a/gettext-tools/src/x-sh.c +++ b/gettext-tools/src/x-sh.c @@ -138,14 +138,18 @@ void init_flag_table_sh () { xgettext_record_flag ("gettext:1:pass-sh-format"); + xgettext_record_flag ("gettext:1:pass-sh-printf-format"); xgettext_record_flag ("ngettext:1:pass-sh-format"); + xgettext_record_flag ("ngettext:1:pass-sh-printf-format"); xgettext_record_flag ("ngettext:2:pass-sh-format"); + xgettext_record_flag ("ngettext:2:pass-sh-printf-format"); xgettext_record_flag ("eval_gettext:1:sh-format"); xgettext_record_flag ("eval_ngettext:1:sh-format"); xgettext_record_flag ("eval_ngettext:2:sh-format"); xgettext_record_flag ("eval_pgettext:2:sh-format"); xgettext_record_flag ("eval_npgettext:2:sh-format"); xgettext_record_flag ("eval_npgettext:3:sh-format"); + xgettext_record_flag ("printf:1:sh-printf-format"); } diff --git a/gettext-tools/src/x-sh.h b/gettext-tools/src/x-sh.h index 297480cd3..cf713afcb 100644 --- a/gettext-tools/src/x-sh.h +++ b/gettext-tools/src/x-sh.h @@ -1,5 +1,5 @@ /* 
xgettext sh backend. - Copyright (C) 2003, 2006, 2014, 2018, 2020 Free Software Foundation, Inc. + Copyright (C) 2003-2025 Free Software Foundation, Inc. Written by Bruno Haible , 2003. This program is free software: you can redistribute it and/or modify @@ -33,7 +33,8 @@ extern "C" { #define SCANNERS_SH \ { "Shell", extract_sh, NULL, \ - &flag_table_sh, &formatstring_sh, NULL }, \ + &flag_table_sh, \ + &formatstring_sh, &formatstring_sh_printf }, \ /* Scan a shell script file and add its translatable strings to mdlp. */ extern void extract_sh (FILE *fp, const char *real_filename, diff --git a/gettext-tools/src/xgettext.c b/gettext-tools/src/xgettext.c index 7a217a479..790997779 100644 --- a/gettext-tools/src/xgettext.c +++ b/gettext-tools/src/xgettext.c @@ -1753,6 +1753,11 @@ xgettext_record_flag (const char *optionstring) name_start, name_end, argnum, value, pass); break; + case format_sh_printf: + flag_context_list_table_insert (&flag_table_sh, XFORMAT_SECONDARY, + name_start, name_end, + argnum, value, pass); + break; case format_awk: flag_context_list_table_insert (&flag_table_awk, XFORMAT_PRIMARY, name_start, name_end, diff --git a/gettext-tools/tests/Makefile.am b/gettext-tools/tests/Makefile.am index e274d9926..3160ed805 100644 --- a/gettext-tools/tests/Makefile.am +++ b/gettext-tools/tests/Makefile.am @@ -227,6 +227,7 @@ TESTS = gettext-1 gettext-2 \ format-rust-1 format-rust-2 \ format-scheme-1 format-scheme-2 \ format-sh-1 format-sh-2 \ + format-sh-printf-1 format-sh-printf-2 \ format-tcl-1 format-tcl-2 format-tcl-3 \ format-ycp-1 format-ycp-2 \ plural-1 plural-2 plural-3 plural-4 \ diff --git a/gettext-tools/tests/format-sh-printf-1 b/gettext-tools/tests/format-sh-printf-1 new file mode 100755 index 000000000..22486f077 --- /dev/null +++ b/gettext-tools/tests/format-sh-printf-1 @@ -0,0 +1,178 @@ +#! /bin/sh +. "${srcdir=.}/init.sh"; path_prepend_ . ../src + +# Test recognition of Shell printf format strings. 
+ +escape_backslashes='s/\\/\\\\/g' +LC_ALL=C sed -e "$escape_backslashes" <<\EOF > f-sp-1.data +# Valid: no argument +"abc%%" +# Valid: one character argument +"abc%c" +# Valid: one string argument +"abc%s" +# Valid: one integer argument +"abc%i" +# Valid: one integer argument +"abc%d" +# Valid: one integer argument +"abc%o" +# Valid: one integer argument +"abc%u" +# Valid: one integer argument +"abc%x" +# Valid: one integer argument +"abc%X" +# Valid: one floating-point argument +"abc%e" +# Valid: one floating-point argument +"abc%E" +# Valid: one floating-point argument +"abc%f" +# Valid: one floating-point argument +"abc%F" +# Valid: one floating-point argument +"abc%g" +# Valid: one floating-point argument +"abc%G" +# Valid: one floating-point argument +"abc%a" +# Valid: one floating-point argument +"abc%A" +# Valid: one argument with flags +"abc%0#g" +# Valid: one argument with width +"abc%2g" +# Invalid: one argument with width +"abc%*g" +# Valid: one argument with precision +"abc%.4g" +# Invalid: one argument with precision +"abc%.*g" +# Valid: one argument with width and precision +"abc%14.4g" +# Invalid: one argument with width and precision +"abc%14.*g" +# Invalid: one argument with width and precision +"abc%*.4g" +# Invalid: one argument with width and precision +"abc%*.*g" +# Invalid: unterminated +"abc%" +# Invalid: unknown format specifier +"abc%y" +# Invalid: flags after width +"abc%*0g" +# Valid: null precision +"abc%.f" +# Invalid: twice precision +"abc%.4.2g" +# Valid: three arguments +"abc%d%u%u" +# Valid: a numbered argument +"abc%1$d" +# Invalid: zero +"abc%0$d" +# Valid: two-digit numbered arguments +"abc%11$def%10$dgh%9$dij%8$dkl%7$dmn%6$dop%5$dqr%4$dst%3$duv%2$dwx%1$dyz" +# Invalid: unterminated number +"abc%1" +# Invalid: flags before number +"abc%+1$d" +# Valid: three arguments, two with same number +"abc%1$4x,%2$c,%1$u" +# Invalid: argument with conflicting types +"abc%1$4x,%2$c,%1$s" +# Valid: no conflict +"abc%1$4x,%2$c,%1$u" +# 
Invalid: mixing of numbered and unnumbered arguments +"abc%d%2$x" +# Valid: numbered argument with constant precision +"abc%1$.9x" +# Invalid: mixing of numbered and unnumbered arguments +"abc%1$.*x" +# Valid: missing non-final argument +"abc%2$x%3$s" +# Valid: permutation +"abc%2$ddef%1$d" +# Valid: multiple uses of same argument +"abc%2$xdef%1$sghi%2$x" +# Invalid: one argument with width +"abc%2$#*1$g" +# Invalid: one argument with width and precision +"abc%3$*2$.*1$g" +# Invalid: zero +"abc%2$*0$.*1$g" +# Valid: escape sequence +"abc%%def\\" +# Valid: escape sequence +"abc%%def\a" +# Valid: escape sequence +"abc%%def\b" +# Valid: escape sequence +"abc%%def\f" +# Valid: escape sequence +"abc%%def\n" +# Valid: escape sequence +"abc%%def\r" +# Valid: escape sequence +"abc%%def\t" +# Valid: escape sequence +"abc%%def\v" +# Valid: escape sequence +"abc%%def\066" +# Invalid: escape sequence +"abc%%def\" +# Invalid: escape sequence +"abc%%def\"" +# Invalid: escape sequence +"abc%%def\c" +# Invalid: escape sequence +"abc%%def\x32" +# Invalid: escape sequence +"abc%%def\u20ac" +# Invalid: escape sequence +"abc%%def\U0001F41C" +# Invalid: escape sequence +"abc%%def\%d" +EOF + +: ${XGETTEXT=xgettext} +n=0 +while read comment; do + # Note: The 'read' command processes backslashes. ('read -r' is not portable.) 
+ read string + n=`expr $n + 1` + escape_backslashes='s/\\/\\\\/g' + escape_dollars='s/\$/\\\$/g' + string=`echo "$string" | LC_ALL=C sed -e "$escape_backslashes" -e "$escape_dollars"` + cat <<EOF > f-sp-1-$n.in +gettext ${string}; +EOF + ${XGETTEXT} -L Shell -o f-sp-1-$n.po f-sp-1-$n.in || Exit 1 + test -f f-sp-1-$n.po || Exit 1 + fail= + if echo "$comment" | grep 'Valid:' > /dev/null; then + if grep sh-printf-format f-sp-1-$n.po > /dev/null; then + : + else + fail=yes + fi + else + if grep sh-printf-format f-sp-1-$n.po > /dev/null; then + fail=yes + else + : + fi + fi + if test -n "$fail"; then + echo "Format string recognition error:" 1>&2 + cat f-sp-1-$n.in 1>&2 + echo "Got:" 1>&2 + cat f-sp-1-$n.po 1>&2 + Exit 1 + fi + rm -f f-sp-1-$n.in f-sp-1-$n.po +done < f-sp-1.data + +Exit 0 diff --git a/gettext-tools/tests/format-sh-printf-2 b/gettext-tools/tests/format-sh-printf-2 new file mode 100755 index 000000000..21515ad31 --- /dev/null +++ b/gettext-tools/tests/format-sh-printf-2 @@ -0,0 +1,145 @@ +#! /bin/sh +. "${srcdir=.}/init.sh"; path_prepend_ . ../src + +# Test checking of Shell printf format strings. 
+ +cat <<\EOF > f-sp-2.data +# Valid: %% doesn't count +msgid "abc%%def" +msgstr "xyz" +# Invalid: invalid msgstr +msgid "abc%%def" +msgstr "xyz%" +# Valid: same arguments +msgid "abc%s%gdef" +msgstr "xyz%s%g" +# Valid: same arguments, with different widths +msgid "abc%2sdef" +msgstr "xyz%3s" +# Valid: same arguments but in numbered syntax +msgid "abc%s%gdef" +msgstr "xyz%1$s%2$g" +# Valid: permutation +msgid "abc%s%g%cdef" +msgstr "xyz%3$c%2$g%1$s" +# Invalid: too few arguments +msgid "abc%2$udef%1$s" +msgstr "xyz%1$s" +# Invalid: too few arguments +msgid "abc%sdef%u" +msgstr "xyz%s" +# Invalid: too many arguments +msgid "abc%udef" +msgstr "xyz%uvw%c" +# Valid: same numbered arguments, with different widths +msgid "abc%2$5s%1$4s" +msgstr "xyz%2$4s%1$5s" +# Invalid: missing argument +msgid "abc%2$sdef%1$u" +msgstr "xyz%1$u" +# Invalid: missing argument +msgid "abc%1$sdef%2$u" +msgstr "xyz%2$u" +# Invalid: added argument +msgid "abc%1$udef" +msgstr "xyz%1$uvw%2$c" +# Valid: type compatibility +msgid "abc%i" +msgstr "xyz%d" +# Valid: type compatibility +msgid "abc%o" +msgstr "xyz%u" +# Valid: type compatibility +msgid "abc%u" +msgstr "xyz%x" +# Valid: type compatibility +msgid "abc%u" +msgstr "xyz%X" +# Valid: type compatibility +msgid "abc%e" +msgstr "xyz%E" +# Valid: type compatibility +msgid "abc%e" +msgstr "xyz%f" +# Valid: type compatibility +msgid "abc%e" +msgstr "xyz%F" +# Valid: type compatibility +msgid "abc%e" +msgstr "xyz%g" +# Valid: type compatibility +msgid "abc%e" +msgstr "xyz%G" +# Valid: type compatibility +msgid "abc%e" +msgstr "xyz%a" +# Valid: type compatibility +msgid "abc%e" +msgstr "xyz%A" +# Invalid: type incompatibility +msgid "abc%c" +msgstr "xyz%s" +# Invalid: type incompatibility +msgid "abc%c" +msgstr "xyz%i" +# Invalid: type incompatibility +msgid "abc%c" +msgstr "xyz%o" +# Invalid: type incompatibility +msgid "abc%c" +msgstr "xyz%e" +# Invalid: type incompatibility +msgid "abc%s" +msgstr "xyz%i" +# Invalid: type incompatibility +msgid 
"abc%s" +msgstr "xyz%o" +# Invalid: type incompatibility +msgid "abc%s" +msgstr "xyz%e" +# Invalid: type incompatibility +msgid "abc%i" +msgstr "xyz%o" +# Invalid: type incompatibility +msgid "abc%i" +msgstr "xyz%e" +# Invalid: type incompatibility +msgid "abc%u" +msgstr "xyz%e" +EOF + +: ${MSGFMT=msgfmt} +n=0 +while read comment; do + read msgid_line + read msgstr_line + n=`expr $n + 1` + cat <<EOF > f-sp-2-$n.po +#, sh-printf-format +${msgid_line} +${msgstr_line} +EOF + fail= + if echo "$comment" | grep 'Valid:' > /dev/null; then + if ${MSGFMT} --check-format -o f-sp-2-$n.mo f-sp-2-$n.po; then + : + else + fail=yes + fi + else + ${MSGFMT} --check-format -o f-sp-2-$n.mo f-sp-2-$n.po 2> /dev/null + if test $? = 1; then + : + else + fail=yes + fi + fi + if test -n "$fail"; then + echo "Format string checking error:" 1>&2 + cat f-sp-2-$n.po 1>&2 + Exit 1 + fi + rm -f f-sp-2-$n.po f-sp-2-$n.mo +done < f-sp-2.data + +Exit 0 diff --git a/gettext-tools/tests/xgettext-sh-1 b/gettext-tools/tests/xgettext-sh-1 index a98e54ae0..1429d2fd7 100755 --- a/gettext-tools/tests/xgettext-sh-1 +++ b/gettext-tools/tests/xgettext-sh-1 @@ -1,7 +1,7 @@ #!/bin/sh . "${srcdir=.}/init.sh"; path_prepend_ . ../src -# Test of Shell support: escape sequences, string concatenation, +# Test of Shell support: escape sequences, format strings, string concatenation, # strings with embedded expressions. # Note! This file contains unescaped ASCII control characters. Edit carefully! @@ -495,6 +495,10 @@ echo `echo \`gettext $'depth_2_dollar_posix_0_"ab\"cd\'ef\\gh\eij\fkl\nmn\rop\tq echo `echo \`gettext $'depth_2_dollar_posix_1_\cvab\cVcd\c[ef\c\\gh\c]ij\c?kl'\`` echo `echo \`gettext $'depth_2_dollar_bash_0_\Eab'\`` +# Test format strings. + +printf "`gettext 'User name: %s\nUser ID: %u'`"'\n' "$USER" `id -u` + # Test string concatenation. 
gettext "concat_0_""part2" @@ -1919,6 +1923,10 @@ msgstr "" msgid "depth_2_dollar_bash_0_ab" msgstr "" +#, sh-printf-format +msgid "User name: %s\\nUser ID: %u" +msgstr "" + msgid "concat_0_part2" msgstr ""
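
For illustration only (not part of this patch): the lexical rules that format-sh-printf-1 exercises can be approximated in a few lines of shell. The function name is_posix_printf_format is hypothetical, and the sketch assumes a sed that accepts -E (GNU sed and BSD sed do). It deliberately omits two checks that format_parse performs: rejecting argument number 0, and rejecting a mix of numbered and unnumbered directives.

```shell
#!/bin/sh
# is_posix_printf_format STRING  (hypothetical helper, not part of gettext)
# Succeeds if STRING consists only of plain text, '%%', directives of the
# form %[m$][flags][width][.precision]conv, and the escape sequences
# \\ \a \b \f \n \r \t \v \nnn (1 to 3 octal digits).
# Approach: delete every valid token; any '%' or backslash that is left
# over did not start a valid token, so the string is rejected.
is_posix_printf_format () {
  rest=$(printf '%s\n' "$1" | LC_ALL=C sed -E \
    -e 's/%%|%([0-9]+\$)?[-+ #0]*[0-9]*(\.[0-9]*)?[csiduoxXeEfFgGaA]|\\[\\abfnrtv]|\\[0-7]{1,3}//g')
  case $rest in
    *%* | *\\*) return 1 ;;
    *) return 0 ;;
  esac
}

is_posix_printf_format 'User name: %s\nUser ID: %u' && echo accepted
is_posix_printf_format 'width from argument: %*d' || echo rejected
```

Deleting tokens with a single left-to-right substitution keeps the sketch short; the parser in format-sh-printf.c instead walks the string character by character so that it can report the position and reason of the first error.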