+2003-06-21 Guido Flohr <guido@imperia.net>
+ Bruno Haible <bruno@clisp.org>
+
+ * gettext.texi (perl-format): Describe two kinds of format strings.
+ (Perl): Add more info.
+
2003-06-19 Bruno Haible <bruno@clisp.org>
* gettext.texi (Sources): Recommend to use <libintl.h> also for printf.
@node perl-format, php-format, tcl-format, Translators for other Languages
@subsection Perl Format Strings
-Perl format strings are described in the @file{Locale::TextDomain}
-manual page, or in
-@uref{http://let.imperia.org/howto/en/perl-i18n/}.
+There are two kinds format strings in Perl: those acceptable to the
+Perl built-in function @code{printf}, labelled as @samp{perl-format},
+and those acceptable to the @code{libintl-perl} function @code{__x},
+labelled as @samp{perl-bracket-format}.
+
+Perl @code{printf} format strings are described in the @code{sprintf}
+section of @samp{man perlfunc}.
+
+Perl bracketed format strings are described in the
+@file{Locale::TextDomain(3pm)} manual page of the CPAN package
+libintl-perl. In brief, Perl format uses placeholders put between
+brackets (@samp{[} and @samp{]}). The placeholder must have the syntax
+of simple identifiers.
@node php-format, , perl-format, Translators for other Languages
@subsection PHP Format Strings
@c on SourceForge/Freshmeat, as of February 2002. Those supported by gettext
@c are marked with a star.
@c C 3580 *
-@c Perl 1911
+@c Perl 1911 *
@c C++ 1379 *
@c Java 1200 *
@c PHP 1051 *
emulate
@item Extractor
-@code{xgettext -k__ -k\$__ -k%__ -k__x -kN__ -k}
+@code{xgettext -k__ -k\$__ -k%__ -k__x -k__n -k__nx -k__xn -kN__ -k}
@item Formatting with positions
-@code{__x("error opening '[file]': [err]", file => $file, err => $!);}
+Both kinds of format strings support formatting with positions.
+@*@code{printf "%2\$d %1\$d", ...} (requires Perl 5.8.0 or newer)
+@*@code{__expand("[new] replaces [old]", old => $oldvalue, new => $newvalue)}
@item Portability
-The @code{libintl-perl} package is platform-independent but does
-not belong to the Perl core. The programmer is responsible for
+The @code{libintl-perl} package is platform independent but is not
+part of the Perl core. The programmer is responsible for
providing a dummy implementation of the required functions if the
package is not installed on the target system.
---
@item Documentation
-In-depth documentation can be found at
-http://let.imperia.org/howto/en/perl-18n/.
+Included in @code{libintl-perl}, available on CPAN
+(http://www.cpan.org/).
@end table
the parser backends for other programming languages, just as Perl
itself differs significantly from other programming languages. The
Perl parser backend offers many more string marking facilities than
-the other backends but it also has some Perl-specific limitations, the
+the other backends but it also has some Perl specific limitations, the
worst probably being its imperfectness.
@menu
file handle. But not necessarily:
@example
-use Locale::Messages;
+use Locale::Messages qw (:libintl_h);
open gettext ">testfile" or die;
print gettext "Hello world!";
@end example
If you find that @code{xgettext} fails to extract strings from
portions of your sources, you should therefore look out for slashes
and/or question marks preceding these sections. You may have come
-across a bug in the @code{xgettext}'s Perl parser (and of course you
-should report that bug).
+across a bug in @code{xgettext}'s Perl parser (and of course you
+should report that bug). In the meantime you should consider to
+reformulate your code in a manner less challenging to @code{xgettext}.
@node Default Keywords, Special Keywords, General Problems, Perl
@subsubsection Which keywords will xgettext look for?
The exact rule is: You can omit the surrounding quotes, when the hash
key is a valid C (!) identifier, i. e. when it starts with an
-underscore or a ASCII letter and is followed by an arbitrary number
+underscore or an ASCII letter and is followed by an arbitrary number
of underscores, ASCII letters or digits. Other Unicode characters
are @emph{not} allowed, regardless of the @code{use utf8} pragma.
delimiters, including the four bracketing delimiters (round, angle,
square, curly) that nest.
-Use this feature with care! If you use it, you lose control of the
-external code executed from within your Perl script to your translator!
+The example is actually a useless use of @code{gettext}. It will
+invoke the @code{gettext} function on the output of the command
+specified with the @code{qx} operator. The feature was included
+in order to make the interface consistent (the parser will extract
+all strings and quote-like expressions).
@item @strong{here documents}
@group
argument is passed to @code{gettext()} and will therefore depend on
the name of the script which can only be determined at runtime.
Consequently, it is almost impossible that a translation can be looked
-up at runtime (except if by accident the interpolated string is found
+up at runtime (except if, by accident, the interpolated string is found
in the message catalog).
The @code{xgettext} program will therefore terminate parsing with a fatal
EOF
@end example
-If you exaggerate with constructs like the above, you will run the
-risk that the translators of your package will look out for a less
-challenging project. You should consider an alternative expression:
+The parser will extract the entire here document, and it will appear
+entirely in the resulting PO file, including the JavaScript snippet
+embedded in the HTML code. If you exaggerate with constructs like
+the above, you will run the risk that the translators of your package
+will look out for a less challenging project. You should consider an
+alternative expression here:
@example
print <<EOF;
EOF
@end example
-The resulting PO file will begrudgingly improve in terms of
-readability.
+Only the translatable portions of the code will be extracted here, and
+the resulting PO file will begrudgingly improve in terms of readability.
You can interpolate hash lookups in all strings or quote-like
expressions that are subject to interpolation (see the manual page
print gettext qq@{Welcome to $gettext->@{"the earth"@}@};
@end example
-The quoted string is recognized as an argument to @code{xgettext} in
+The @code{qq}-quoted string is recognized as an argument to @code{xgettext} in
the first place, and checked for invalid variable interpolation. The
-dollar sign will therefore terminate the parser with an ``invalid
-interpolation'' error.
+dollar sign of hash-dereferencing will therefore terminate the parser
+with an ``invalid interpolation'' error.
It is valid to interpolate hash lookups in regular expressions:
@code{xgettext}. You will only find one long message in the resulting
POT file.
+Note that the future Perl 6 will probably use the underscore
+(@samp{_}) as the string concatenation operator, and the dot
+(@samp{.}) for dereferencing. This new syntax is not yet supported by
+@code{xgettext}.
+
If embedded newline characters are not an issue, or even desired, you
may also insert newline characters inside quoted strings wherever you
feel like it:
versions, but since you can always make do without them at minimal effort,
these todos have very low priority.
+A nasty problem are bracketed format strings that already contain brackets
+as part of the normal text, for example the usage strings typically
+encountered in programs:
+
+@example
+die "usage: $0 [OPTIONS] FILENAME...\n";
+@end example
+
+If you want to internationalize this code with Perl bracketed format strings,
+you will run into a problem:
+
+@example
+die __x ("usage: [program] [OPTIONS] FILENAME...\n", program => $0);
+@end example
+
+Whereas @code{@samp{[program]}} is a placeholder, @code{@samp{[OPTIONS]}}
+is not and should probably be translated. Yet, there is no way to teach
+the Perl parser in @code{xgettext} to recognize the first one, and leave
+the other one alone.
+
+There are two possible work-arounds for this problem. If you are
+sure that your program will run under Perl 5.8.0 or newer (these
+Perl versions handle positional parameters in @code{printf()}) or
+if you are sure that the translator will not have to reorder the arguments
+in her translation -- for example if you have only one bracketed placeholder
+in your string, or if it describes a syntax, like in this one --, you can
+mark the string as @code{no-perl-bracket-format} and use @code{printf()}:
+
+@example
+# xgettext: no-perl-bracket-format
+die sprintf ("usage: %s [OPTIONS] FILENAME...\n", $0);
+@end example
+
+If you want to use the more portable Perl bracket format, you will have to do
+put placeholders in place of the literal brackets:
+
+@example
+die __x ("usage: [program] [@{]OPTIONS[@}] FILENAME...\n",
+ program => $0, '@{' => '[', '@}' => ']');
+@end example
+
+Perl bracketed format strings know no escaping mechanism. No matter how this
+escaping mechanism looked like, it would either give the programmer a
+hard time, make translating Perl bracketed format strings heavy-going, or
+result in a performance penalty at runtime, when the format directives
+get executed. Most of the time you will happily get along with
+@code{printf()} for this special case.
+
@node PHP, Pike, Perl, List of Programming Languages
@subsection PHP Hypertext Preprocessor
@cindex PHP
+2003-06-21 Bruno Haible <bruno@clisp.org>
+
+ * format-perl.c: Complete rewrite.
+ * format-perl-bracket.c: New file.
+ * message.h (format_perl_bracket): New enum value.
+ (NFORMATS): Increment.
+ * format.h (formatstring_perl_bracket): New declaration.
+ * format.c (formatstring_parsers): Add entry for perl_bracket.
+ * message.c (format_language, format_language_pretty): Likewise.
+ * x-perl.h (SCANNERS_PERL): Use formatstring_perl and
+ formatstring_perl_bracket.
+ * Makefile.am (FORMAT_SOURCE): Add format-perl-bracket.c.
+ * Makefile.msvc (OBJECTS): Add format-perl.obj,format-perl-bracket.obj.
+ (xgettext_OBJECTS): Add x-perl.obj.
+ (format-perl.obj, format-perl-bracket.obj, x-perl.obj): New rules.
+ * Makefile.vms (OBJECTS): Add format-perl.obj, format-perl-bracket.obj.
+ (xgettext_OBJECTS): Add x-perl.obj.
+ (format-perl.obj, format-perl-bracket.obj, x-perl.obj): New rules.
+
+ * x-perl.c (extract_quotelike_pass3): Don't give an invalid
+ interpolation error for a backslashed dollar sign.
+
2003-06-21 Bruno Haible <bruno@clisp.org>
* x-perl.c (extract_quotelike_pass3): Fix handling of double backslash.
FORMAT_SOURCE = format.c format-invalid.h \
format-c.c format-python.c format-lisp.c format-elisp.c format-librep.c \
format-java.c format-awk.c format-pascal.c format-ycp.c format-tcl.c \
-format-perl.c format-php.c
+format-perl.c format-perl-bracket.c format-php.c
# libgettextsrc contains all code that is needed by at least two programs.
libgettextsrc_la_SOURCES = \
format-pascal.obj \
format-ycp.obj \
format-tcl.obj \
+ format-perl.obj \
+ format-perl-bracket.obj \
format-php.obj
msgcmp_OBJECTS = msgcmp.obj
msgfmt_OBJECTS = msgfmt.obj write-mo.obj write-java.obj write-tcl.obj plural-eval.obj
msgmerge_OBJECTS = msgmerge.obj
msgunfmt_OBJECTS = msgunfmt.obj read-mo.obj read-java.obj read-tcl.obj
-xgettext_OBJECTS = xgettext.obj x-c.obj x-po.obj x-python.obj x-lisp.obj x-elisp.obj x-librep.obj x-smalltalk.obj x-java.obj x-awk.obj x-ycp.obj x-tcl.obj x-php.obj x-rst.obj x-glade.obj
+xgettext_OBJECTS = xgettext.obj x-c.obj x-po.obj x-python.obj x-lisp.obj x-elisp.obj x-librep.obj x-smalltalk.obj x-java.obj x-awk.obj x-ycp.obj x-tcl.obj x-perl.obj x-php.obj x-rst.obj x-glade.obj
msgattrib_OBJECTS = msgattrib.obj
msgcat_OBJECTS = msgcat.obj
msgcomm_OBJECTS = msgcomm.obj
format-tcl.obj : format-tcl.c
$(CC) $(INCLUDES) $(CFLAGS) $(PICFLAGS) -c format-tcl.c
+format-perl.obj : format-perl.c
+ $(CC) $(INCLUDES) $(CFLAGS) $(PICFLAGS) -c format-perl.c
+
+format-perl-bracket.obj : format-perl-bracket.c
+ $(CC) $(INCLUDES) $(CFLAGS) $(PICFLAGS) -c format-perl-bracket.c
+
format-php.obj : format-php.c
$(CC) $(INCLUDES) $(CFLAGS) $(PICFLAGS) -c format-php.c
x-tcl.obj : x-tcl.c
$(CC) $(INCLUDES) $(CFLAGS) -c x-tcl.c
+x-perl.obj : x-perl.c
+ $(CC) $(INCLUDES) $(CFLAGS) -c x-perl.c
+
x-php.obj : x-php.c
$(CC) $(INCLUDES) $(CFLAGS) -c x-php.c
format-pascal.obj, \
format-ycp.obj, \
format-tcl.obj, \
+ format-perl.obj, \
+ format-perl-bracket.obj, \
format-php.obj
msgcmp_OBJECTS = msgcmp.obj
msgfmt_OBJECTS = msgfmt.obj, write-mo.obj, write-java.obj, write-tcl.obj, plural-eval.obj
msgmerge_OBJECTS = msgmerge.obj
msgunfmt_OBJECTS = msgunfmt.obj, read-mo.obj, read-java.obj, read-tcl.obj
-xgettext_OBJECTS = xgettext.obj, x-c.obj, x-po.obj, x-python.obj, x-lisp.obj, x-elisp.obj, x-librep.obj, x-smalltalk.obj, x-java.obj, x-awk.obj, x-ycp.obj, x-tcl.obj, x-php.obj, x-rst.obj, x-glade.obj
+xgettext_OBJECTS = xgettext.obj, x-c.obj, x-po.obj, x-python.obj, x-lisp.obj, x-elisp.obj, x-librep.obj, x-smalltalk.obj, x-java.obj, x-awk.obj, x-ycp.obj, x-tcl.obj, x-perl.obj, x-php.obj, x-rst.obj, x-glade.obj
msgattrib_OBJECTS = msgattrib.obj
msgcat_OBJECTS = msgcat.obj
msgcomm_OBJECTS = msgcomm.obj
format-tcl.obj : format-tcl.c
$(CC) $(INCLUDES) $(CFLAGS) /define=($(DEFS)) format-tcl.c
+format-perl.obj : format-perl.c
+ $(CC) $(INCLUDES) $(CFLAGS) /define=($(DEFS)) format-perl.c
+
+format-perl-bracket.obj : format-perl-bracket.c
+ $(CC) $(INCLUDES) $(CFLAGS) /define=($(DEFS)) format-perl-bracket.c
+
format-php.obj : format-php.c
$(CC) $(INCLUDES) $(CFLAGS) /define=($(DEFS)) format-php.c
x-tcl.obj : x-tcl.c
$(CC) $(INCLUDES) $(CFLAGS) /define=($(DEFS)) x-tcl.c
+x-perl.obj : x-perl.c
+ $(CC) $(INCLUDES) $(CFLAGS) /define=($(DEFS)) x-perl.c
+
x-php.obj : x-php.c
$(CC) $(INCLUDES) $(CFLAGS) /define=($(DEFS)) x-php.c
--- /dev/null
+/* Perl bracketed format strings.
+ Copyright (C) 2003 Free Software Foundation, Inc.
+ Written by Bruno Haible <bruno@clisp.org>, 2003.
+
+ This program is free software; you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 2, or (at your option)
+ any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program; if not, write to the Free Software Foundation,
+ Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. */
+
+#ifdef HAVE_CONFIG_H
+# include <config.h>
+#endif
+
+#include <stdbool.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include "format.h"
+#include "xmalloc.h"
+#include "error.h"
+#include "progname.h"
+#include "gettext.h"
+
+#define _(str) gettext (str)
+
+/* Perl bracketed format strings are supported by Guido Flohr's libintl-perl
+ package, more precisely by the __expand and __x functions therein.
+ A format string directive here consists of
+ - an opening bracket '[',
+ - an identifier [_A-Za-z][_0-9A-Za-z]*,
+ - a closing bracket ']'.
+ */
+
+struct named_arg
+{
+ char *name;
+};
+
+struct spec
+{
+ unsigned int directives;
+ unsigned int named_arg_count;
+ unsigned int allocated;
+ struct named_arg *named;
+};
+
+/* Locale independent test for a decimal digit.
+ Argument can be 'char' or 'unsigned char'. (Whereas the argument of
+ <ctype.h> isdigit must be an 'unsigned char'.) */
+#undef isdigit
+#define isdigit(c) ((unsigned int) ((c) - '0') < 10)
+
+
+static int
+named_arg_compare (const void *p1, const void *p2)
+{
+ return strcmp (((const struct named_arg *) p1)->name,
+ ((const struct named_arg *) p2)->name);
+}
+
+static void *
+format_parse (const char *format, char **invalid_reason)
+{
+ struct spec spec;
+ struct spec *result;
+
+ spec.directives = 0;
+ spec.named_arg_count = 0;
+ spec.allocated = 0;
+ spec.named = NULL;
+
+ for (; *format != '\0';)
+ if (*format++ == '[')
+ {
+ const char *f = format;
+ char c;
+
+ c = *f;
+ if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || c == '_')
+ {
+ do
+ c = *++f;
+ while ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || c == '_'
+ || (c >= '0' && c <= '9'));
+ if (c == ']')
+ {
+ /* A directive. */
+ char *name;
+ const char *name_start = format;
+ const char *name_end = f;
+ size_t n = name_end - name_start;
+
+ name = (char *) xmalloc (n + 1);
+ memcpy (name, name_start, n);
+ name[n] = '\0';
+
+ spec.directives++;
+
+ if (spec.allocated == spec.named_arg_count)
+ {
+ spec.allocated = 2 * spec.allocated + 1;
+ spec.named = (struct named_arg *) xrealloc (spec.named, spec.allocated * sizeof (struct named_arg));
+ }
+ spec.named[spec.named_arg_count].name = name;
+ spec.named_arg_count++;
+
+ format = ++f;
+ }
+ }
+ }
+
+ /* Sort the named argument array, and eliminate duplicates. */
+ if (spec.named_arg_count > 1)
+ {
+ unsigned int i, j;
+
+ qsort (spec.named, spec.named_arg_count, sizeof (struct named_arg),
+ named_arg_compare);
+
+ /* Remove duplicates: Copy from i to j, keeping 0 <= j <= i. */
+ for (i = j = 0; i < spec.named_arg_count; i++)
+ if (j > 0 && strcmp (spec.named[i].name, spec.named[j-1].name) == 0)
+ free (spec.named[i].name);
+ else
+ {
+ if (j < i)
+ spec.named[j].name = spec.named[i].name;
+ j++;
+ }
+ spec.named_arg_count = j;
+ }
+
+ result = (struct spec *) xmalloc (sizeof (struct spec));
+ *result = spec;
+ return result;
+}
+
+static void
+format_free (void *descr)
+{
+ struct spec *spec = (struct spec *) descr;
+
+ if (spec->named != NULL)
+ {
+ unsigned int i;
+ for (i = 0; i < spec->named_arg_count; i++)
+ free (spec->named[i].name);
+ free (spec->named);
+ }
+ free (spec);
+}
+
+static int
+format_get_number_of_directives (void *descr)
+{
+ struct spec *spec = (struct spec *) descr;
+
+ return spec->directives;
+}
+
+static bool
+format_check (const lex_pos_ty *pos, void *msgid_descr, void *msgstr_descr,
+ bool equality, bool noisy, const char *pretty_msgstr)
+{
+ struct spec *spec1 = (struct spec *) msgid_descr;
+ struct spec *spec2 = (struct spec *) msgstr_descr;
+ bool err = false;
+
+ if (spec1->named_arg_count + spec2->named_arg_count > 0)
+ {
+ unsigned int i, j;
+ unsigned int n1 = spec1->named_arg_count;
+ unsigned int n2 = spec2->named_arg_count;
+
+ /* Check the argument names are the same.
+ Both arrays are sorted. We search for the first difference. */
+ for (i = 0, j = 0; i < n1 || j < n2; )
+ {
+ int cmp = (i >= n1 ? 1 :
+ j >= n2 ? -1 :
+ strcmp (spec1->named[i].name, spec2->named[j].name));
+
+ if (cmp > 0)
+ {
+ if (noisy)
+ {
+ error_with_progname = false;
+ error_at_line (0, 0, pos->file_name, pos->line_number,
+ _("a format specification for argument '%s', as in '%s', doesn't exist in 'msgid'"),
+ spec2->named[j].name, pretty_msgstr);
+ error_with_progname = true;
+ }
+ err = true;
+ break;
+ }
+ else if (cmp < 0)
+ {
+ if (equality)
+ {
+ if (noisy)
+ {
+ error_with_progname = false;
+ error_at_line (0, 0, pos->file_name, pos->line_number,
+ _("a format specification for argument '%s' doesn't exist in '%s'"),
+ spec1->named[i].name, pretty_msgstr);
+ error_with_progname = true;
+ }
+ err = true;
+ break;
+ }
+ else
+ i++;
+ }
+ else
+ j++, i++;
+ }
+ }
+
+ return err;
+}
+
+
+struct formatstring_parser formatstring_perl_bracket =
+{
+ format_parse,
+ format_free,
+ format_get_number_of_directives,
+ format_check
+};
+
+
+#ifdef TEST
+
+/* Test program: Print the argument list specification returned by
+ format_parse for strings read from standard input. */
+
+#include <stdio.h>
+#include "getline.h"
+
+static void
+format_print (void *descr)
+{
+ struct spec *spec = (struct spec *) descr;
+ unsigned int i;
+
+ if (spec == NULL)
+ {
+ printf ("INVALID");
+ return;
+ }
+
+ printf ("{");
+ for (i = 0; i < spec->named_arg_count; i++)
+ {
+ if (i > 0)
+ printf (", ");
+ printf ("'%s'", spec->named[i].name);
+ }
+ printf ("}");
+}
+
+int
+main ()
+{
+ for (;;)
+ {
+ char *line = NULL;
+ size_t line_size = 0;
+ int line_len;
+ char *invalid_reason;
+ void *descr;
+
+ line_len = getline (&line, &line_size, stdin);
+ if (line_len < 0)
+ break;
+ if (line_len > 0 && line[line_len - 1] == '\n')
+ line[--line_len] = '\0';
+
+ invalid_reason = NULL;
+ descr = format_parse (line, &invalid_reason);
+
+ format_print (descr);
+ printf ("\n");
+ if (descr == NULL)
+ printf ("%s\n", invalid_reason);
+
+ free (invalid_reason);
+ free (line);
+ }
+
+ return 0;
+}
+
+/*
+ * For Emacs M-x compile
+ * Local Variables:
+ * compile-command: "/bin/sh ../libtool --mode=link gcc -o a.out -static -O -g -Wall -I.. -I../lib -I../intl -DHAVE_CONFIG_H -DTEST format-perl-bracket.c ../lib/libgettextlib.la"
+ * End:
+ */
+
+#endif /* TEST */
/* Perl format strings.
- Copyright (C) 2002-2003 Free Software Foundation, Inc.
- Written by Guido Flohr <guido@imperia.net>, 2003.
+ Copyright (C) 2003 Free Software Foundation, Inc.
+ Written by Bruno Haible <bruno@clisp.org>, 2003.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
#ifdef HAVE_CONFIG_H
# include <config.h>
#endif
-#include <alloca.h>
#include <stdbool.h>
#include <stdlib.h>
-#include <string.h>
#include "format.h"
+#include "c-ctype.h"
#include "xmalloc.h"
+#include "xerror.h"
+#include "format-invalid.h"
#include "error.h"
#include "progname.h"
-#include "hash.h"
#include "gettext.h"
#define _(str) gettext (str)
-/* Perl format strings are currently quite simple. They consist of
- place-holders embedded in the string.
-
- messageFormatPattern := string ("[" messageFormatElement "]" string)*
+/* Perl format strings are implemented in function Perl_sv_vcatpvfn in
+ perl-5.8.0/sv.c.
+ A directive
+ - starts with '%' or '%m$' where m is a positive integer starting with a
+ nonzero digit,
+ - is optionally followed by any of the characters '#', '0', '-', ' ', '+',
+ each of which acts as a flag,
+ - is optionally followed by a vector specification: 'v' or '*v' (reads an
+ argument) or '*m$v' where m is a positive integer starting with a nonzero
+ digit,
+ - is optionally followed by a width specification: '*' (reads an argument)
+ or '*m$' where m is a positive integer starting with a nonzero digit or
+ a nonempty digit sequence starting with a nonzero digit,
+ - is optionally followed by '.' and a precision specification: '*' (reads
+ an argument) or '*m$' where m is a positive integer starting with a
+ nonzero digit or a digit sequence,
+ - is optionally followed by a size specifier, one of 'h' 'l' 'll' 'L' 'q'
+ 'V' 'I32' 'I64' 'I',
+ - is finished by a specifier
+ - '%', that needs no argument,
+ - 'c', that needs a small integer argument,
+ - 's', that needs a string argument,
+ - '_', that needs a scalar vector argument,
+ - 'p', that needs a pointer argument,
+ - 'i', 'd', 'D', that need an integer argument,
+ - 'u', 'U', 'b', 'o', 'O', 'x', 'X', that need an unsigned integer
+ argument,
+ - 'e', 'E', 'f', 'F', 'g', 'G', that need a floating-point argument,
+ - 'n', that needs a pointer to integer.
+ So there can be numbered argument specifications:
+ - '%m$' for the format string,
+ - '*m$v' for the vector,
+ - '*m$' for the width,
+ - '.*m$' for the precision.
+ Numbered and unnumbered argument specifications can be used in the same
+ string. The effect of '%m$' is to take argument number m, without affecting
+ the current argument number. The current argument number is incremented
+ after processing a directive with an unnumbered argument specification.
+ */
- messageFormatElement := [_A-Za-z][_0-9A-Za-z]+
+enum format_arg_type
+{
+ FAT_NONE = 0,
+ /* Basic types */
+ FAT_INTEGER = 1,
+ FAT_DOUBLE = 2,
+ FAT_CHAR = 3,
+ FAT_STRING = 4,
+ FAT_SCALAR_VECTOR = 5,
+ FAT_POINTER = 6,
+ FAT_COUNT_POINTER = 7,
+ /* Flags */
+ FAT_UNSIGNED = 1 << 3,
+ FAT_SIZE_SHORT = 1 << 4,
+ FAT_SIZE_V = 2 << 4,
+ FAT_SIZE_PTR = 3 << 4,
+ FAT_SIZE_LONG = 4 << 4,
+ FAT_SIZE_LONGLONG = 5 << 4,
+ /* Bitmasks */
+ FAT_SIZE_MASK = (FAT_SIZE_SHORT | FAT_SIZE_V | FAT_SIZE_PTR
+ | FAT_SIZE_LONG | FAT_SIZE_LONGLONG)
+};
- However, C format strings are also allowed and used. The following
- functions are therefore decorators for the C format checker, and
- will only fall back to Perl format if the C check is negative. */
+struct numbered_arg
+{
+ unsigned int number;
+ enum format_arg_type type;
+};
struct spec
{
unsigned int directives;
- hash_table hash;
- void *c_format;
+ unsigned int numbered_arg_count;
+ unsigned int allocated;
+ struct numbered_arg *numbered;
};
-static void *
-format_parse (const char *string, char **invalid_reason)
-{
- char *last_pos = (char *) string;
- char *pos;
- struct spec *spec;
- void *c_format = formatstring_c.parse (string, invalid_reason);
-
- if (c_format == NULL)
- return NULL;
-
- spec = (struct spec *) xmalloc (sizeof (struct spec));
+/* Locale independent test for a decimal digit.
+ Argument can be 'char' or 'unsigned char'. (Whereas the argument of
+ <ctype.h> isdigit must be an 'unsigned char'.) */
+#undef isdigit
+#define isdigit(c) ((unsigned int) ((c) - '0') < 10)
- spec->c_format = c_format;
+/* Locale independent test for a nonzero decimal digit. */
+#define isnonzerodigit(c) ((unsigned int) ((c) - '1') < 9)
- init_hash (&spec->hash, 13);
- while ((pos = strchr (last_pos, '[')) != NULL)
- {
- char *start = pos + 1;
-
- last_pos = start;
-
- if (*last_pos == '_'
- || (*last_pos >= 'A' && *last_pos <= 'Z')
- || (*last_pos >= 'a' && *last_pos <= 'z'))
- {
- ++last_pos;
-
- while (*last_pos == '_'
- || (*last_pos >= '0' && *last_pos <= '9')
- || (*last_pos >= 'A' && *last_pos <= 'Z')
- || (*last_pos >= 'a' && *last_pos <= 'z'))
- ++last_pos;
- if (*last_pos == ']')
- {
- size_t len = last_pos - start;
- int *hits;
-
- if (find_entry (&spec->hash, start, len, (void **) &hits) == 0)
- {
- ++(*hits);
- }
- else
- {
- hits = (int *) xmalloc (sizeof (int));
- *hits = 1;
- insert_entry (&spec->hash, start, len, hits);
- }
- ++last_pos;
- ++spec->directives;
- }
- }
- }
+static int
+numbered_arg_compare (const void *p1, const void *p2)
+{
+ unsigned int n1 = ((const struct numbered_arg *) p1)->number;
+ unsigned int n2 = ((const struct numbered_arg *) p2)->number;
- return spec;
+ return (n1 > n2 ? 1 : n1 < n2 ? -1 : 0);
}
-static void
-format_free (void *description)
+static void *
+format_parse (const char *format, char **invalid_reason)
{
- struct spec *spec = (struct spec *) description;
-
- if (spec != NULL)
+ unsigned int directives;
+ unsigned int numbered_arg_count;
+ unsigned int allocated;
+ struct numbered_arg *numbered;
+ unsigned int unnumbered_arg_count;
+ struct spec *result;
+
+ directives = 0;
+ numbered_arg_count = 0;
+ unnumbered_arg_count = 0;
+ allocated = 0;
+ numbered = NULL;
+
+ for (; *format != '\0';)
+ if (*format++ == '%')
+ {
+ /* A directive. */
+ unsigned int number = 0;
+ bool vectorize = false;
+ enum format_arg_type type;
+ enum format_arg_type size;
+
+ directives++;
+
+ if (isnonzerodigit (*format))
+ {
+ const char *f = format;
+ unsigned int m = 0;
+
+ do
+ {
+ m = 10 * m + (*f - '0');
+ f++;
+ }
+ while (isdigit (*f));
+
+ if (*f == '$')
+ {
+ number = m;
+ format = ++f;
+ }
+ }
+
+ /* Parse flags. */
+ while (*format == ' ' || *format == '+' || *format == '-'
+ || *format == '#' || *format == '0')
+ format++;
+
+ /* Parse vector. */
+ if (*format == 'v')
+ {
+ format++;
+ vectorize = true;
+ }
+ else if (*format == '*')
+ {
+ const char *f = format;
+
+ f++;
+ if (*f == 'v')
+ {
+ format = ++f;
+ vectorize = true;
+
+ /* Unnumbered argument. */
+ if (allocated == numbered_arg_count)
+ {
+ allocated = 2 * allocated + 1;
+ numbered = (struct numbered_arg *) xrealloc (numbered, allocated * sizeof (struct numbered_arg));
+ }
+ numbered[numbered_arg_count].number = ++unnumbered_arg_count;
+ numbered[numbered_arg_count].type = FAT_SCALAR_VECTOR; /* or FAT_STRING? */
+ numbered_arg_count++;
+ }
+ else if (isnonzerodigit (*f))
+ {
+ unsigned int m = 0;
+
+ do
+ {
+ m = 10 * m + (*f - '0');
+ f++;
+ }
+ while (isdigit (*f));
+
+ if (*f == '$')
+ {
+ f++;
+ if (*f == 'v')
+ {
+ unsigned int vector_number = m;
+
+ format = ++f;
+ vectorize = true;
+
+ /* Numbered argument. */
+ /* Note: As of perl-5.8.0, this is not correctly
+ implemented in perl's sv.c. */
+ if (allocated == numbered_arg_count)
+ {
+ allocated = 2 * allocated + 1;
+ numbered = (struct numbered_arg *) xrealloc (numbered, allocated * sizeof (struct numbered_arg));
+ }
+ numbered[numbered_arg_count].number = vector_number;
+ numbered[numbered_arg_count].type = FAT_SCALAR_VECTOR; /* or FAT_STRING? */
+ numbered_arg_count++;
+ }
+ }
+ }
+ }
+
+ if (vectorize)
+ {
+ /* Numbered or unnumbered argument. */
+ if (allocated == numbered_arg_count)
+ {
+ allocated = 2 * allocated + 1;
+ numbered = (struct numbered_arg *) xrealloc (numbered, allocated * sizeof (struct numbered_arg));
+ }
+ numbered[numbered_arg_count].number = (number ? number : ++unnumbered_arg_count);
+ numbered[numbered_arg_count].type = FAT_SCALAR_VECTOR;
+ numbered_arg_count++;
+ }
+
+ /* Parse width. */
+ if (*format == '*')
+ {
+ unsigned int width_number = 0;
+
+ format++;
+
+ if (isnonzerodigit (*format))
+ {
+ const char *f = format;
+ unsigned int m = 0;
+
+ do
+ {
+ m = 10 * m + (*f - '0');
+ f++;
+ }
+ while (isdigit (*f));
+
+ if (*f == '$')
+ {
+ width_number = m;
+ format = ++f;
+ }
+ }
+
+ /* Numbered or unnumbered argument. */
+ /* Note: As of perl-5.8.0, this is not correctly
+ implemented in perl's sv.c. */
+ if (allocated == numbered_arg_count)
+ {
+ allocated = 2 * allocated + 1;
+ numbered = (struct numbered_arg *) xrealloc (numbered, allocated * sizeof (struct numbered_arg));
+ }
+ numbered[numbered_arg_count].number = (width_number ? width_number : ++unnumbered_arg_count);
+ numbered[numbered_arg_count].type = FAT_INTEGER;
+ numbered_arg_count++;
+ }
+ else if (isnonzerodigit (*format))
+ {
+ do format++; while (isdigit (*format));
+ }
+
+ /* Parse precision. */
+ if (*format == '.')
+ {
+ format++;
+
+ if (*format == '*')
+ {
+ unsigned int precision_number = 0;
+
+ format++;
+
+ if (isnonzerodigit (*format))
+ {
+ const char *f = format;
+ unsigned int m = 0;
+
+ do
+ {
+ m = 10 * m + (*f - '0');
+ f++;
+ }
+ while (isdigit (*f));
+
+ if (*f == '$')
+ {
+ precision_number = m;
+ format = ++f;
+ }
+ }
+
+ /* Numbered or unnumbered argument. */
+ if (allocated == numbered_arg_count)
+ {
+ allocated = 2 * allocated + 1;
+ numbered = (struct numbered_arg *) xrealloc (numbered, allocated * sizeof (struct numbered_arg));
+ }
+ numbered[numbered_arg_count].number = (precision_number ? precision_number : ++unnumbered_arg_count);
+ numbered[numbered_arg_count].type = FAT_INTEGER;
+ numbered_arg_count++;
+ }
+ else
+ {
+ while (isdigit (*format)) format++;
+ }
+ }
+
+ /* Parse size. */
+ size = 0;
+ if (*format == 'h')
+ {
+ size = FAT_SIZE_SHORT;
+ format++;
+ }
+ else if (*format == 'l')
+ {
+ if (format[1] == 'l')
+ {
+ size = FAT_SIZE_LONGLONG;
+ format += 2;
+ }
+ else
+ {
+ size = FAT_SIZE_LONG;
+ format++;
+ }
+ }
+ else if (*format == 'L' || *format == 'q')
+ {
+ size = FAT_SIZE_LONGLONG;
+ format++;
+ }
+ else if (*format == 'V')
+ {
+ size = FAT_SIZE_V;
+ format++;
+ }
+ else if (*format == 'I')
+ {
+ if (format[1] == '6' && format[2] == '4')
+ {
+ size = FAT_SIZE_LONGLONG;
+ format += 3;
+ }
+ else if (format[1] == '3' && format[2] == '2')
+ {
+ size = 0; /* FAT_SIZE_INT */
+ format += 3;
+ }
+ else
+ {
+ size = FAT_SIZE_PTR;
+ format++;
+ }
+ }
+
+ switch (*format)
+ {
+ case '%':
+ type = FAT_NONE;
+ break;
+ case 'c':
+ type = FAT_CHAR;
+ break;
+ case 's':
+ type = FAT_STRING;
+ break;
+ case '_':
+ type = FAT_SCALAR_VECTOR;
+ break;
+ case 'D':
+ type = FAT_INTEGER | FAT_SIZE_V;
+ break;
+ case 'i': case 'd':
+ type = FAT_INTEGER | size;
+ break;
+ case 'U': case 'O':
+ type = FAT_INTEGER | FAT_UNSIGNED | FAT_SIZE_V;
+ break;
+ case 'u': case 'b': case 'o': case 'x': case 'X':
+ type = FAT_INTEGER | FAT_UNSIGNED | size;
+ break;
+ case 'e': case 'E': case 'f': case 'F': case 'g': case 'G':
+ if (size == FAT_SIZE_SHORT || size == FAT_SIZE_LONG)
+ {
+ *invalid_reason =
+ xasprintf (_("In the directive number %u, the size specifier is incompatible with the conversion specifier '%c'."), directives, *format);
+ goto bad_format;
+ }
+ type = FAT_DOUBLE | size;
+ break;
+ case 'p':
+ type = FAT_POINTER;
+ break;
+ case 'n':
+ type = FAT_COUNT_POINTER | size;
+ break;
+ default:
+ *invalid_reason =
+ (*format == '\0'
+ ? INVALID_UNTERMINATED_DIRECTIVE ()
+ : INVALID_CONVERSION_SPECIFIER (directives, *format));
+ goto bad_format;
+ }
+
+ if (type != FAT_NONE && !vectorize)
+ {
+ /* Numbered or unnumbered argument. */
+ if (allocated == numbered_arg_count)
+ {
+ allocated = 2 * allocated + 1;
+ numbered = (struct numbered_arg *) xrealloc (numbered, allocated * sizeof (struct numbered_arg));
+ }
+ numbered[numbered_arg_count].number = (number ? number : ++unnumbered_arg_count);
+ numbered[numbered_arg_count].type = type;
+ numbered_arg_count++;
+ }
+
+ format++;
+ }
+
+ /* Sort the numbered argument array, and eliminate duplicates. */
+ if (numbered_arg_count > 1)
{
- void *ptr = NULL;
- const void *key;
- size_t keylen;
- void *data;
+ unsigned int i, j;
+ bool err;
+
+ qsort (numbered, numbered_arg_count,
+ sizeof (struct numbered_arg), numbered_arg_compare);
+
+ /* Remove duplicates: Copy from i to j, keeping 0 <= j <= i. */
+ err = false;
+ for (i = j = 0; i < numbered_arg_count; i++)
+ if (j > 0 && numbered[i].number == numbered[j-1].number)
+ {
+ enum format_arg_type type1 = numbered[i].type;
+ enum format_arg_type type2 = numbered[j-1].type;
+ enum format_arg_type type_both;
+
+ if (type1 == type2)
+ type_both = type1;
+ else
+ {
+ /* Incompatible types. */
+ type_both = FAT_NONE;
+ if (!err)
+ *invalid_reason =
+ INVALID_INCOMPATIBLE_ARG_TYPES (numbered[i].number);
+ err = true;
+ }
+
+ numbered[j-1].type = type_both;
+ }
+ else
+ {
+ if (j < i)
+ {
+ numbered[j].number = numbered[i].number;
+ numbered[j].type = numbered[i].type;
+ }
+ j++;
+ }
+ numbered_arg_count = j;
+ if (err)
+ /* *invalid_reason has already been set above. */
+ goto bad_format;
+ }
- while (iterate_table (&spec->hash, &ptr, &key, &keylen, &data) == 0)
- free (data);
+ result = (struct spec *) xmalloc (sizeof (struct spec));
+ result->directives = directives;
+ result->numbered_arg_count = numbered_arg_count;
+ result->allocated = allocated;
+ result->numbered = numbered;
+ return result;
- delete_hash (&spec->hash);
+ bad_format:
+ if (numbered != NULL)
+ free (numbered);
+ return NULL;
+}
- if (spec->c_format)
- formatstring_c.free (spec->c_format);
+static void
+format_free (void *descr)
+{
+ struct spec *spec = (struct spec *) descr;
- free (spec);
- }
+ if (spec->numbered != NULL)
+ free (spec->numbered);
+ free (spec);
}
static int
-format_get_number_of_directives (void *description)
+format_get_number_of_directives (void *descr)
{
- struct spec *spec = (struct spec *) description;
- int c_directives;
-
- if (spec->c_format)
- c_directives = formatstring_c.get_number_of_directives (spec->c_format);
- else
- c_directives = 0;
+ struct spec *spec = (struct spec *) descr;
- return c_directives + spec->directives;
+ return spec->directives;
}
static bool
{
struct spec *spec1 = (struct spec *) msgid_descr;
struct spec *spec2 = (struct spec *) msgstr_descr;
- bool result = false;
- void *ptr = NULL;
- const void *key;
- size_t keylen;
- int *hits1;
- int *hits2;
-
- /* First check the perl arguments. We are probably faster than format-c. */
- if (equality)
+ bool err = false;
+
+ if (spec1->numbered_arg_count + spec2->numbered_arg_count > 0)
{
- /* Pass 1: Check that every format specification in msgid has its
- counterpart in msgstr. This is only necessary for equality. */
- while (iterate_table (&spec1->hash, &ptr, &key, &keylen,
- (void **) &hits1) == 0)
+ unsigned int i, j;
+ unsigned int n1 = spec1->numbered_arg_count;
+ unsigned int n2 = spec2->numbered_arg_count;
+
+ /* Check the argument names are the same.
+ Both arrays are sorted. We search for the first difference. */
+ for (i = 0, j = 0; i < n1 || j < n2; )
{
- if (find_entry (&spec2->hash, key, keylen, (void **) &hits2) == 0)
+ int cmp = (i >= n1 ? 1 :
+ j >= n2 ? -1 :
+ spec1->numbered[i].number > spec2->numbered[j].number ? 1 :
+ spec1->numbered[i].number < spec2->numbered[j].number ? -1 :
+ 0);
+
+ if (cmp > 0)
+ {
+ if (noisy)
+ {
+ error_with_progname = false;
+ error_at_line (0, 0, pos->file_name, pos->line_number,
+ _("a format specification for argument %u, as in '%s', doesn't exist in 'msgid'"),
+ spec2->numbered[j].number, pretty_msgstr);
+ error_with_progname = true;
+ }
+ err = true;
+ break;
+ }
+ else if (cmp < 0)
{
- if (*hits1 != *hits2)
+ if (equality)
{
- result = true;
if (noisy)
{
- char *argname = (char *) alloca (keylen + 1);
- memcpy (argname, key, keylen);
- argname[keylen] = '\0';
-
- /* The next message shows the limitations of ngettext.
- It is not smart enough to say "once", "twice",
- "thrice/%d times", "%d times", ..., and it cannot
- grok with two variable arguments. We have to
- work around the problem. */
error_with_progname = false;
error_at_line (0, 0, pos->file_name, pos->line_number,
- _("\
-appearances of named argument '[%s]' do not match \
-(%d in original string, %d in '%s')"),
- argname, *hits1, *hits2, pretty_msgstr);
+ _("a format specification for argument %u doesn't exist in '%s'"),
+ spec1->numbered[i].number, pretty_msgstr);
error_with_progname = true;
}
- else
- return true;
- }
- }
- else
- {
- result = true;
- if (noisy)
- {
- char *argname = (char *) alloca (keylen + 1);
- memcpy (argname, key, keylen);
- argname[keylen] = '\0';
-
- error_with_progname = false;
- error_at_line (0, 0, pos->file_name, pos->line_number,
- _("\
-named argument '[%s]' appears in original string but not in '%s'"),
- argname, pretty_msgstr);
- error_with_progname = true;
+ err = true;
+ break;
}
else
- return true;
- }
- }
- }
-
- /* Pass 2: Check that the number of format specifications in msgstr
- does not exceed the number of appearances in msgid. */
- ptr = NULL;
- while (iterate_table (&spec2->hash, &ptr, &key, &keylen, (void**) &hits2)
- == 0)
- {
- if (find_entry (&spec1->hash, key, keylen, (void**) &hits1) != 0)
- {
- result = true;
- if (noisy)
- {
- char *argname = (char *) alloca (keylen + 1);
- memcpy (argname, key, keylen);
- argname[keylen] = '\0';
-
- error_with_progname = false;
- error_at_line (0, 0, pos->file_name, pos->line_number,
- _("\
-named argument '[%s]' appears only in '%s' but not in the original string"),
- argname, pretty_msgstr);
- error_with_progname = true;
+ i++;
}
else
- return true;
+ j++, i++;
}
+ /* Check the argument types are the same. */
+ if (!err)
+ for (i = 0, j = 0; j < n2; )
+ {
+ if (spec1->numbered[i].number == spec2->numbered[j].number)
+ {
+ if (spec1->numbered[i].type != spec2->numbered[j].type)
+ {
+ if (noisy)
+ {
+ error_with_progname = false;
+ error_at_line (0, 0, pos->file_name, pos->line_number,
+ _("format specifications in 'msgid' and '%s' for argument %u are not the same"),
+ pretty_msgstr,
+ spec2->numbered[j].number);
+ error_with_progname = true;
+ }
+ err = true;
+ break;
+ }
+ j++, i++;
+ }
+ else
+ i++;
+ }
}
- if (spec1->c_format && spec2->c_format)
- {
- result |= formatstring_c.check (pos, spec1->c_format, spec2->c_format,
- equality, noisy, pretty_msgstr);
- }
-
- return result;
+ return err;
}
+
struct formatstring_parser formatstring_perl =
{
format_parse,
#include "getline.h"
static void
-format_print (void *descr, const char *line)
+format_print (void *descr)
{
struct spec *spec = (struct spec *) descr;
- void *ptr = NULL;
- const void *key;
- size_t keylen;
- int *data;
-
- printf ("%s=> ", line);
+ unsigned int last;
+ unsigned int i;
if (spec == NULL)
{
return;
}
- while (iterate_table (&spec->hash, &ptr, &key, &keylen, (void**) &data) == 0)
+ printf ("(");
+ last = 1;
+ for (i = 0; i < spec->numbered_arg_count; i++)
{
- char *argname = (char *) alloca (keylen + 1);
- memcpy (argname, key, keylen);
- argname[keylen] = '\0';
-
- printf (">>>[%s]<<< ", argname);
+ unsigned int number = spec->numbered[i].number;
+
+ if (i > 0)
+ printf (" ");
+ if (number < last)
+ abort ();
+ for (; last < number; last++)
+ printf ("_ ");
+ if (spec->numbered[i].type & FAT_UNSIGNED)
+ printf ("[unsigned]");
+ switch (spec->numbered[i].type & FAT_SIZE_MASK)
+ {
+ case 0:
+ break;
+ case FAT_SIZE_SHORT:
+ printf ("[short]");
+ break;
+ case FAT_SIZE_V:
+ printf ("[IV]");
+ break;
+ case FAT_SIZE_PTR:
+ printf ("[PTR]");
+ break;
+ case FAT_SIZE_LONG:
+ printf ("[long]");
+ break;
+ case FAT_SIZE_LONGLONG:
+ printf ("[long long]");
+ break;
+ default:
+ abort ();
+ }
+ switch (spec->numbered[i].type & ~(FAT_UNSIGNED | FAT_SIZE_MASK))
+ {
+ case FAT_INTEGER:
+ printf ("i");
+ break;
+ case FAT_DOUBLE:
+ printf ("f");
+ break;
+ case FAT_CHAR:
+ printf ("c");
+ break;
+ case FAT_STRING:
+ printf ("s");
+ break;
+ case FAT_SCALAR_VECTOR:
+ printf ("sv");
+ break;
+ case FAT_POINTER:
+ printf ("p");
+ break;
+ case FAT_COUNT_POINTER:
+ printf ("n");
+ break;
+ default:
+ abort ();
+ }
+ last = number + 1;
}
+ printf (")");
}
int
invalid_reason = NULL;
descr = format_parse (line, &invalid_reason);
- format_print (descr, line);
+ format_print (descr);
printf ("\n");
if (descr == NULL)
printf ("%s\n", invalid_reason);
- format_free (descr);
free (invalid_reason);
free (line);
}
/* format_ycp */ &formatstring_ycp,
/* format_tcl */ &formatstring_tcl,
/* format_perl */ &formatstring_perl,
+ /* format_perl_bracket */ &formatstring_perl_bracket,
/* format_php */ &formatstring_php
};
extern struct formatstring_parser formatstring_ycp;
extern struct formatstring_parser formatstring_tcl;
extern struct formatstring_parser formatstring_perl;
+extern struct formatstring_parser formatstring_perl_bracket;
extern struct formatstring_parser formatstring_php;
/* Table of all format string parsers. */
/* format_ycp */ "ycp",
/* format_tcl */ "tcl",
/* format_perl */ "perl",
+ /* format_perl_bracket */ "perl-bracket",
/* format_php */ "php"
};
/* format_ycp */ "YCP",
/* format_tcl */ "Tcl",
/* format_perl */ "Perl",
+ /* format_perl_bracket */ "Perl bracket",
/* format_php */ "PHP"
};
format_ycp,
format_tcl,
format_perl,
+ format_perl_bracket,
format_php
};
-#define NFORMATS 13 /* Number of format_type enum values. */
+#define NFORMATS 14 /* Number of format_type enum values. */
extern const char *const format_language[NFORMATS];
extern const char *const format_language_pretty[NFORMATS];
quotemeta = false;
while (*crs)
{
+ bool backslashed;
+
/* Ensure room for 6 bytes. */
if (bufpos + 6 > bufmax)
{
buffer[bufpos++] = *crs;
++crs;
continue;
+ default:
+ backslashed = true;
+ break;
}
}
+ else
+ backslashed = false;
- if (*crs == '$' || *crs == '@')
+ if (!backslashed && (*crs == '$' || *crs == '@'))
{
error_with_progname = false;
error (error_level, 0, _("\
{ "cgi", "perl" }, \
#define SCANNERS_PERL \
- { "perl", extract_perl, &formatstring_c, &formatstring_perl }, \
+ { "perl", extract_perl, &formatstring_perl, &formatstring_perl_bracket }, \
/* Scan a Perl file and add its translatable strings to mdlp. */
extern void extract_perl (FILE *fp, const char *real_filename,
+2003-06-21 Bruno Haible <bruno@clisp.org>
+
+ * xgettext-26: Add more test cases, some from Guido Flohr.
+ * format-perl-1: Complete rewrite, based on format-c-1.
+ * format-perl-2: Complete rewrite, based on format-c-2.
+ * format-perl-bracket-1: New file, based on format-python-1, partially
+ from Guido Flohr.
+ * format-perl-bracket-2: New file, based on format-python-2, partially
+ from Guido Flohr.
+ * format-perl-mixed-1: New file, from Guido Flohr.
+ * format-perl-mixed-2: New file, from Guido Flohr.
+ * lang-perl-1: Renamed from lang-perl.
+ * lang-perl-2: New file, from Guido Flohr, based on lang-perl.
+ * Makefile.am (TESTS): Add format-perl-bracket-1,
+ format-perl-bracket-2, format-perl-mixed-1, format-perl-mixed-2,
+ lang-perl-1, lang-perl-2. Remove lang-perl.
+
2003-06-19 Bruno Haible <bruno@clisp.org>
* Makefile.am (TESTS_ENVIRONMENT): Also set LOCALE_FR.
format-python-1 format-python-2 \
format-pascal-1 format-pascal-2 \
format-perl-1 format-perl-2 \
+ format-perl-bracket-1 format-perl-bracket-2 \
+ format-perl-mixed-1 format-perl-mixed-2 \
format-tcl-1 format-tcl-2 \
format-ycp-1 format-ycp-2 \
plural-1 plural-2 \
lang-c lang-c++ lang-objc lang-python-1 lang-python-2 lang-clisp \
lang-elisp lang-librep lang-smalltalk lang-java lang-gawk lang-pascal \
- lang-ycp lang-tcl lang-perl lang-php lang-po \
+ lang-ycp lang-tcl lang-perl-1 lang-perl-2 lang-php lang-po \
lang-rst
EXTRA_DIST += $(TESTS) \
tmpfiles="$tmpfiles f-pl-1.data"
cat <<\EOF > f-pl-1.data
# Valid: no argument
-'abc%%'
-# Valid: one character argument
-'abc%c'
+"abc%%"
+# Valid: one small integer argument
+"abc%c"
# Valid: one string argument
-'abc%s'
+"abc%s"
+# Valid: one scalar vector argument
+"abc%_"
+# Valid: one pointer argument
+"abc%p"
# Valid: one integer argument
-'abc%i'
+"abc%i"
# Valid: one integer argument
-'abc%d'
+"abc%d"
# Valid: one integer argument
-'abc%o'
+"abc%D"
# Valid: one integer argument
-'abc%u'
+"abc%u"
# Valid: one integer argument
-'abc%x'
+"abc%U"
# Valid: one integer argument
-'abc%X'
-# Valid: one floating-point argument
-'abc%e'
-# Valid: one floating-point argument
-'abc%E'
+"abc%b"
+# Valid: one integer argument
+"abc%o"
+# Valid: one integer argument
+"abc%O"
+# Valid: one integer argument
+"abc%x"
+# Valid: one integer argument
+"abc%X"
# Valid: one floating-point argument
-'abc%f'
+"abc%e"
# Valid: one floating-point argument
-'abc%F'
+"abc%E"
# Valid: one floating-point argument
-'abc%g'
+"abc%f"
# Valid: one floating-point argument
-'abc%G'
+"abc%F"
# Valid: one floating-point argument
-'abc%a'
+"abc%g"
# Valid: one floating-point argument
-'abc%A'
-# Valid: one pointer argument
-'abc%p'
+"abc%G"
# Valid: one argument with flags
-'abc%0#g'
+"abc%0#g"
# Valid: one argument with width
-'abc%2g'
+"abc%2g"
# Valid: one argument with width
-'abc%*g'
+"abc%*g"
# Valid: one argument with precision
-'abc%.4g'
+"abc%.4g"
# Valid: one argument with precision
-'abc%.*g'
+"abc%.*g"
# Valid: one argument with width and precision
-'abc%14.4g'
+"abc%14.4g"
# Valid: one argument with width and precision
-'abc%14.*g'
+"abc%14.*g"
# Valid: one argument with width and precision
-'abc%*.4g'
+"abc%*.4g"
# Valid: one argument with width and precision
-'abc%*.*g'
+"abc%*.*g"
# Valid: one argument with size specifier
-'abc%hhi'
+"abc%hi"
# Valid: one argument with size specifier
-'abc%hi'
+"abc%li"
# Valid: one argument with size specifier
-'abc%li'
+"abc%lli"
# Valid: one argument with size specifier
-'abc%lli'
+"abc%Lg"
# Valid: one argument with size specifier
-'abc%Lg'
+"abc%qi"
# Valid: one argument with size specifier
-'abc%qi'
+"abc%Vi"
# Valid: one argument with size specifier
-'abc%ji'
+"abc%I32i"
# Valid: one argument with size specifier
-'abc%zi'
+"abc%I64i"
# Valid: one argument with size specifier
-'abc%ti'
+"abc%Ii"
# Invalid: unterminated
-'abc%'
+"abc%"
# Invalid: unknown format specifier
-'abc%y'
+"abc%y"
# Invalid: flags after width
-'abc%*0g'
+"abc%*0g"
# Invalid: twice precision
-'abc%.4.2g'
+"abc%.4.2g"
# Valid: three arguments
-'abc%d%u%u'
+"abc%d%u%u"
# Valid: a numbered argument
-'abc%1$d'
+"abc%1$d"
# Invalid: zero
-'abc%0$d'
+"abc%0$d"
# Valid: two-digit numbered arguments
-'abc%11$def%10$dgh%9$dij%8$dkl%7$dmn%6$dop%5$dqr%4$dst%3$duv%2$dwx%1$dyz'
+"abc%11$def%10$dgh%9$dij%8$dkl%7$dmn%6$dop%5$dqr%4$dst%3$duv%2$dwx%1$dyz"
# Invalid: unterminated number
-'abc%1'
+"abc%1"
# Invalid: flags before number
-'abc%+1$d'
+"abc%+1$d"
# Valid: three arguments, two with same number
-'abc%1$4x,%2$c,%1$u'
+"abc%1$4x,%2$c,%1$u"
# Invalid: argument with conflicting types
-'abc%1$4x,%2$c,%1$s'
+"abc%1$4x,%2$c,%1$s"
# Valid: no conflict
-'abc%1$4x,%2$c,%1$u'
-# Invalid: mixing of numbered and unnumbered arguments
-'abc%d%2$x'
+"abc%1$4x,%2$c,%1$u"
+# Valid: mixing of numbered and unnumbered arguments
+"abc%d%2$x"
# Valid: numbered argument with constant precision
-'abc%1$.9x'
-# Invalid: mixing of numbered and unnumbered arguments
-'abc%1$.*x'
-# Invalid: missing non-final argument
-'abc%2$x%3$s'
+"abc%1$.9x"
+# Valid: mixing of numbered and unnumbered arguments
+"abc%2$.*x"
+# Valid: missing non-final argument
+"abc%2$x%3$s"
# Valid: permutation
-'abc%2$ddef%1$d'
+"abc%2$ddef%1$d"
# Valid: multiple uses of same argument
-'abc%2$xdef%1$pghi%2$x'
+"abc%2$xdef%1$pghi%2$x"
# Valid: one argument with width
-'abc%2$#*1$g'
+"abc%2$#*1$g"
# Valid: one argument with width and precision
-'abc%3$*2$.*1$g'
+"abc%3$*2$.*1$g"
# Invalid: zero
-'abc%2$*0$.*1$g'
-# Valid: named arguments
-'[foo] [bar] [baz]'
+"abc%2$*0$.*1$g"
EOF
: ${XGETTEXT=xgettext}
read string
n=`expr $n + 1`
tmpfiles="$tmpfiles f-pl-1-$n.in f-pl-1-$n.po"
- cat <<EOF > f-pl-1-$n.in
+ sed -e 's,\$,\\$,g' <<EOF > f-pl-1-$n.in
gettext(${string});
EOF
- ${XGETTEXT} -L perl -o f-pl-1-$n.po f-pl-1-$n.in || exit 1
+ ${XGETTEXT} -L Perl -o f-pl-1-$n.po f-pl-1-$n.in || exit 1
test -f f-pl-1-$n.po || exit 1
fail=
if echo "$comment" | grep 'Valid:' > /dev/null; then
cat f-pl-1-$n.po 1>&2
exit 1
fi
+ rm -f f-pl-1-$n.in f-pl-1-$n.po
done < f-pl-1.data
rm -fr $tmpfiles
msgid "abc%i"
msgstr "xyz%d"
# Valid: type compatibility
+msgid "abc%b"
+msgstr "xyz%o"
+# Valid: type compatibility
msgid "abc%o"
msgstr "xyz%u"
# Valid: type compatibility
# Valid: type compatibility
msgid "abc%u"
msgstr "xyz%X"
+# Valid: type and size compatibility
+msgid "abc%x"
+msgstr "xyz%X"
# Valid: type compatibility
msgid "abc%e"
msgstr "xyz%E"
msgstr "xyz%f"
# Valid: type compatibility
msgid "abc%e"
+msgstr "xyz%F"
+# Valid: type compatibility
+msgid "abc%e"
msgstr "xyz%g"
# Valid: type compatibility
msgid "abc%e"
msgstr "xyz%s"
# Invalid: type incompatibility
msgid "abc%c"
+msgstr "xyz%_"
+# Invalid: type incompatibility
+msgid "abc%c"
msgstr "xyz%i"
# Invalid: type incompatibility
msgid "abc%c"
-msgstr "xyz%o"
+msgstr "xyz%u"
# Invalid: type incompatibility
msgid "abc%c"
msgstr "xyz%e"
# Invalid: type incompatibility
+msgid "abc%c"
+msgstr "xyz%p"
+# Invalid: type incompatibility
+msgid "abc%c"
+msgstr "xyz%n"
+# Invalid: type incompatibility
+msgid "abc%s"
+msgstr "xyz%_"
+# Invalid: type incompatibility
msgid "abc%s"
msgstr "xyz%i"
# Invalid: type incompatibility
msgid "abc%s"
-msgstr "xyz%o"
+msgstr "xyz%u"
# Invalid: type incompatibility
msgid "abc%s"
msgstr "xyz%e"
# Invalid: type incompatibility
+msgid "abc%s"
+msgstr "xyz%p"
+# Invalid: type incompatibility
+msgid "abc%s"
+msgstr "xyz%n"
+# Invalid: type incompatibility
+msgid "abc%_"
+msgstr "xyz%i"
+# Invalid: type incompatibility
+msgid "abc%_"
+msgstr "xyz%u"
+# Invalid: type incompatibility
+msgid "abc%_"
+msgstr "xyz%e"
+# Invalid: type incompatibility
+msgid "abc%_"
+msgstr "xyz%p"
+# Invalid: type incompatibility
+msgid "abc%_"
+msgstr "xyz%n"
+# Invalid: type incompatibility
msgid "abc%i"
-msgstr "xyz%o"
+msgstr "xyz%u"
# Invalid: type incompatibility
msgid "abc%i"
msgstr "xyz%e"
# Invalid: type incompatibility
+msgid "abc%i"
+msgstr "xyz%p"
+# Invalid: type incompatibility
+msgid "abc%i"
+msgstr "xyz%n"
+# Invalid: type incompatibility
msgid "abc%u"
msgstr "xyz%e"
+# Invalid: type incompatibility
+msgid "abc%u"
+msgstr "xyz%p"
+# Invalid: type incompatibility
+msgid "abc%u"
+msgstr "xyz%n"
+# Invalid: type incompatibility
+msgid "abc%e"
+msgstr "xyz%p"
+# Invalid: type incompatibility
+msgid "abc%e"
+msgstr "xyz%n"
+# Invalid: type incompatibility
+msgid "abc%p"
+msgstr "xyz%n"
+# Invalid: size incompatibility
+msgid "abc%hd"
+msgstr "xyz%d"
+# Invalid: size incompatibility
+msgid "abc%hd"
+msgstr "xyz%ld"
+# Invalid: size incompatibility
+msgid "abc%hd"
+msgstr "xyz%Vd"
+# Invalid: size incompatibility
+msgid "abc%hd"
+msgstr "xyz%qd"
+# Invalid: size incompatibility
+msgid "abc%d"
+msgstr "xyz%ld"
+# Invalid: size incompatibility
+msgid "abc%d"
+msgstr "xyz%Vd"
+# Invalid: size incompatibility
+msgid "abc%d"
+msgstr "xyz%qd"
+# Invalid: size incompatibility
+msgid "abc%ld"
+msgstr "xyz%Vd"
+# Invalid: size incompatibility
+msgid "abc%ld"
+msgstr "xyz%qd"
+# Invalid: size incompatibility
+msgid "abc%Vd"
+msgstr "xyz%qd"
+# Invalid: size incompatibility
+msgid "abc%d"
+msgstr "xyz%D"
+# Invalid: size incompatibility
+msgid "abc%u"
+msgstr "xyz%U"
+# Invalid: size incompatibility
+msgid "abc%o"
+msgstr "xyz%O"
# Invalid: type incompatibility for width
msgid "abc%g%*g"
msgstr "xyz%*g%g"
-# Valid: named arguments in arbitrary order
-msgid "[foo] [bar] [baz]: 1"
-msgstr "[baz] [bar] [foo]: 1"
-# Invalid: missing named argument in msgstr
-msgid "[foo] [bar] [baz]: 2"
-msgstr "[baz] [foo]: 2"
-# Invalid: missing named arguments in msgid
-msgid "[foo] [baz]: 3"
-msgstr "[baz] [bar] [foo]: 3"
-# Invalid: number of named arguments in msgstr exceeds msgid
-msgid "[foo] [bar] [baz]: 4"
-msgstr "[baz] [bar] [foo] [bar]: 4"
-# Invalid: number of named arguments in msgid exceeds msgstr
-msgid "[baz] [bar] [foo] [bar]: 5"
-msgstr "[foo] [bar] [baz]: 5"
EOF
: ${MSGFMT=msgfmt}
cat f-pl-2-$n.po 1>&2
exit 1
fi
+ rm -f f-pl-2-$n.po f-pl-2-$n.mo
done < f-pl-2.data
rm -fr $tmpfiles
--- /dev/null
+#! /bin/sh
+
+# Test recognition of Perl bracket format strings.
+
+tmpfiles=""
+trap 'rm -fr $tmpfiles' 1 2 3 15
+
+tmpfiles="$tmpfiles f-pb-1.data"
+cat <<\EOF > f-pb-1.data
+# Invalid: no argument
+"abc"
+# Valid: a named argument
+"abc[value]"
+# Invalid: an empty name
+"abc[]"
+# Invalid: unterminated name
+"abc[value"
+# Valid: three arguments, two with equal names
+"abc[addr],[char],[addr]"
+# Invalid: place-holder contains a space.
+"[foo bar]"
+# Invalid: missing right angle bracket.
+"[foo bar"
+# Valid: not nested, but one single place-holder.
+"[foo[bar]baz]"
+# Valid: no nesting error, but one single place-holder.
+"[foo[bar]baz"
+# Valid: place-holder with spaces must be ignored, but still one remaining.
+"[foo bar] [baz]"
+# Invalid: percent sign not allowed.
+"[foo%bar]"
+EOF
+
+: ${XGETTEXT=xgettext}
+n=0
+while read comment; do
+ read string
+ n=`expr $n + 1`
+ tmpfiles="$tmpfiles f-pb-1-$n.in f-pb-1-$n.po"
+ cat <<EOF > f-pb-1-$n.in
+gettext(${string});
+EOF
+ ${XGETTEXT} -L Perl -o f-pb-1-$n.po f-pb-1-$n.in || exit 1
+ test -f f-pb-1-$n.po || exit 1
+ fail=
+ if echo "$comment" | grep 'Valid:' > /dev/null; then
+ if grep perl-bracket-format f-pb-1-$n.po > /dev/null; then
+ :
+ else
+ fail=yes
+ fi
+ else
+ if grep perl-bracket-format f-pb-1-$n.po > /dev/null; then
+ fail=yes
+ else
+ :
+ fi
+ fi
+ if test -n "$fail"; then
+ echo "Format string recognition error:" 1>&2
+ cat f-pb-1-$n.in 1>&2
+ echo "Got:" 1>&2
+ cat f-pb-1-$n.po 1>&2
+ exit 1
+ fi
+ rm -f f-pb-1-$n.in f-pb-1-$n.po
+done < f-pb-1.data
+
+rm -fr $tmpfiles
+
+exit 0
--- /dev/null
+#! /bin/sh
+
+# Test checking of Perl bracket format strings.
+
+tmpfiles=""
+trap 'rm -fr $tmpfiles' 1 2 3 15
+
+tmpfiles="$tmpfiles f-pb-2.data"
+cat <<\EOF > f-pb-2.data
+# Valid: same named arguments
+msgid "abc[date][time]"
+msgstr "xyz[date][time]"
+# Valid: permutation
+msgid "abc[x3][x1][x2]def"
+msgstr "xyz[x2][x1][x3]"
+# Invalid: missing argument
+msgid "abc[x2]def[x1]"
+msgstr "xyz[x1]"
+# Invalid: missing argument
+msgid "abc[x1]def[x2]"
+msgstr "xyz[x2]"
+# Invalid: added argument
+msgid "abc[foo]def"
+msgstr "xyz[foo]uvw[zoo]"
+# Valid: multiple reuse of same argument
+msgid "[foo] [bar] [baz]"
+msgstr "[baz] [bar] [foo] [bar]"
+# Valid: single reuse of same argument
+msgid "[baz] [bar] [foo] [bar]"
+msgstr "[foo] [bar] [baz]"
+EOF
+
+: ${MSGFMT=msgfmt}
+n=0
+while read comment; do
+ read msgid_line
+ read msgstr_line
+ n=`expr $n + 1`
+ tmpfiles="$tmpfiles f-pb-2-$n.po f-pb-2-$n.mo"
+ cat <<EOF > f-pb-2-$n.po
+#, perl-bracket-format
+${msgid_line}
+${msgstr_line}
+EOF
+ fail=
+ if echo "$comment" | grep 'Valid:' > /dev/null; then
+ if ${MSGFMT} --check-format -o f-pb-2-$n.mo f-pb-2-$n.po; then
+ :
+ else
+ fail=yes
+ fi
+ else
+ ${MSGFMT} --check-format -o f-pb-2-$n.mo f-pb-2-$n.po 2> /dev/null
+ if test $? = 1; then
+ :
+ else
+ fail=yes
+ fi
+ fi
+ if test -n "$fail"; then
+ echo "Format string checking error:" 1>&2
+ cat f-pb-2-$n.po 1>&2
+ exit 1
+ fi
+ rm -f f-pb-2-$n.po f-pb-2-$n.mo
+done < f-pb-2.data
+
+rm -fr $tmpfiles
+
+exit 0
--- /dev/null
+#! /bin/sh
+
+# Test recognition of Perl format strings of both kinds (printf and bracketed).
+# This test is for the combination of both kinds.
+
+tmpfiles=""
+trap 'rm -fr $tmpfiles' 1 2 3 15
+
+tmpfiles="$tmpfiles f-pm-1.data"
+
+cat <<\EOF > f-pm-1.data
+# Both formats.
+#, perl-format, perl-bracket-format
+"[foo] %c [bar] %d [baz]"
+# printf format only.
+#, perl-format
+"%c %d"
+# printf format only, because '%' is not allowed in identifier.
+#, perl-format
+"[foo%cbar]"
+# Valid bracketed format because there is still one valid identifier.
+#, perl-format, perl-bracket-format
+"[foo%cbar] [baz]"
+# Bracketed format only, because %l is not recognized in printf format.
+#, perl-bracket-format
+"[foo] %l [bar]"
+# Neither format recognized here.
+
+"[foo bar %l"
+EOF
+
+: ${XGETTEXT=xgettext}
+n=0
+while read comment; do
+ read formats
+ read string
+ n=`expr $n + 1`
+ tmpfiles="$tmpfiles f-pm-1-$n.in f-pm-1-$n.po"
+ cat <<EOF > f-pm-1-$n.in
+gettext(${string});
+EOF
+ ${XGETTEXT} -L perl --omit-header --no-location -o f-pm-1-$n.po f-pm-1-$n.in || exit 1
+ test -f f-pm-1-$n.po || exit 1
+ fail=
+ if test -n "${formats}"; then
+ # Verify that the first line contains the expected #, comment.
+ if sed 1q < f-pm-1-$n.po | grep '^'"${formats}"'$' > /dev/null; then
+ :
+ else
+ fail=yes
+ fi
+ else
+ # Verify that there is no #, comment.
+ if sed 1q < f-pm-1-$n.po | grep '^msgid' > /dev/null; then
+ :
+ else
+ fail=yes
+ fi
+ fi
+ if test -n "$fail"; then
+ echo "Format string recognition error:" 1>&2
+ cat f-pm-1-$n.in 1>&2
+ echo "Got:" 1>&2
+ cat f-pm-1-$n.po 1>&2
+ exit 1
+ fi
+done < f-pm-1.data
+
+rm -fr $tmpfiles
+
+exit 0
--- /dev/null
+#! /bin/sh
+
+# Test checking of Perl format strings.
+# This test is for the combination of printf and bracketed format strings.
+
+tmpfiles=""
+trap 'rm -fr $tmpfiles' 1 2 3 15
+
+tmpfiles="$tmpfiles f-pm-2.data"
+cat <<\EOF > f-pm-2.data
+# Valid: normal case.
+#, perl-format, perl-bracket-format
+msgid "[foo] %d [bar] %s"
+msgstr "[bar] [foo] %d %s"
+# Invalid: missing argument.
+#, perl-format, perl-bracket-format
+msgid "[foo] %d [bar] %s"
+msgstr "[bar] %d %s"
+# Valid: missing argument but checking disabled.
+#, perl-format, no-perl-bracket-format
+msgid "[foo] %d [bar] %s"
+msgstr "[bar] %d %s"
+# Invalid: printf format reordered without position specifiers %1$, %2$.
+#, perl-format, perl-bracket-format
+msgid "[foo] %d [bar] %s"
+msgstr "[bar] %s [foo] %d"
+# Valid: same thing but checking disabled.
+#, no-perl-format, perl-bracket-format
+msgid "[foo] %d [bar] %s"
+msgstr "[bar] %s [foo] %d"
+# Invalid: unnamed vs. named arguments
+#, perl-format
+msgid "abc%sdef"
+msgstr "xyz[value]"
+# Invalid: named vs. unnamed arguments
+#, perl-bracket-format
+msgid "abc[value]def"
+msgstr "xyz%s"
+EOF
+
+: ${MSGFMT=msgfmt}
+n=0
+while read comment; do
+ read formats
+ read msgid_line
+ read msgstr_line
+ n=`expr $n + 1`
+ tmpfiles="$tmpfiles f-pm-2-$n.po f-pm-2-$n.mo"
+ cat <<EOF > f-pm-2-$n.po
+${formats}
+${msgid_line}
+${msgstr_line}
+EOF
+ fail=
+ if echo "$comment" | grep 'Valid:' > /dev/null; then
+ if ${MSGFMT} --check-format -o f-pm-2-$n.mo f-pm-2-$n.po; then
+ :
+ else
+ fail=yes
+ fi
+ else
+ ${MSGFMT} --check-format -o f-pm-2-$n.mo f-pm-2-$n.po 2> /dev/null
+ if test $? = 1; then
+ :
+ else
+ fail=yes
+ fi
+ fi
+ if test -n "$fail"; then
+ echo "Format string checking error:" 1>&2
+ cat f-pm-2-$n.po 1>&2
+ exit 1
+ fi
+done < f-pm-2.data
+
+rm -fr $tmpfiles
+
+exit 0
--- /dev/null
+#! /bin/sh
+
+# Test of gettext facilities in the Perl language,
+# using printf format strings.
+# Assumes an fr_FR locale is installed.
+# Assumes the following packages are installed: perl.
+
+tmpfiles=""
+trap 'rm -fr $tmpfiles' 1 2 3 15
+
+tmpfiles="$tmpfiles program.pl"
+cat <<\EOF > program.pl
+use Locale::Messages;
+textdomain "prog";
+bindtextdomain ("./");
+my $n = 2;
+print _"'Your command, please?', asked the waiter.";
+printf ngettext ("a piece of cake", "%d pieces of cake", $n), $n;
+printf _"%s is replaced by %s.", "FF", "EUR";
+EOF
+
+tmpfiles="$tmpfiles prog.pot"
+: ${XGETTEXT=xgettext}
+${XGETTEXT} -k_ -o prog.pot --omit-header --no-location program.pl
+
+tmpfiles="$tmpfiles prog.ok"
+cat <<EOF > prog.ok
+msgid "'Your command, please?', asked the waiter."
+msgstr ""
+
+#, perl-format
+msgid "a piece of cake"
+msgid_plural "%d pieces of cake"
+msgstr[0] ""
+msgstr[1] ""
+
+#, perl-format
+msgid "%s is replaced by %s."
+msgstr ""
+EOF
+
+: ${DIFF=diff}
+${DIFF} prog.ok prog.pot || exit 1
+
+tmpfiles="$tmpfiles fr.po"
+cat <<\EOF > fr.po
+msgid ""
+msgstr ""
+"Content-Type: text/plain; charset=ISO-8859-1\n"
+"Plural-Forms: nplurals=2; plural=(n > 1);\n"
+
+msgid "'Your command, please?', asked the waiter."
+msgstr "«Votre commande, s'il vous plait», dit le garçon."
+
+# Les gateaux allemands sont les meilleurs du monde.
+#, perl-format
+msgid "a piece of cake"
+msgid_plural "%d pieces of cake"
+msgstr[0] "un morceau de gateau"
+msgstr[1] "%d morceaux de gateau"
+
+# Reverse the arguments.
+#, perl-format
+msgid "%s is replaced by %s."
+msgstr "%2$s remplace %1$s."
+EOF
+
+tmpfiles="$tmpfiles fr.po.new"
+: ${MSGMERGE=msgmerge}
+${MSGMERGE} -q -o fr.po.new fr.po prog.pot
+
+: ${DIFF=diff}
+${DIFF} fr.po fr.po.new || exit 1
+
+tmpfiles="$tmpfiles fr"
+test -d fr || mkdir fr
+test -d fr/LC_MESSAGES || mkdir fr/LC_MESSAGES
+
+: ${MSGFMT=msgfmt}
+${MSGFMT} -o fr/LC_MESSAGES/prog.mo fr.po
+
+tmpfiles="$tmpfiles prog.ok prog.out"
+: ${DIFF=diff}
+cat <<\EOF > prog.ok
+«Votre commande, s'il vous plait», dit le garçon.
+2 morceaux de gateau
+EUR remplace FF.
+EOF
+
+# Test for perl with Locale::Messages package.
+perl -e 'use Locale::Messages;' 2>/dev/null \
+ || { rm -fr $tmpfiles; exit 77; }
+
+: ${LOCALE_FR=fr_FR}
+LANGUAGE= LANG=$LOCALE_FR LC_MESSAGES= LC_ALL= perl program.pl > prog.out || exit 1
+${DIFF} prog.ok prog.out || exit 1
+
+rm -fr $tmpfiles
+
+exit 0
--- /dev/null
+#! /bin/sh
+
+# Test of gettext facilities in the Perl language,
+# using bracket format strings.
+# Assumes an fr_FR locale is installed.
+# Assumes the following packages are installed: perl, libintl-perl.
+
+tmpfiles=""
+trap 'rm -fr $tmpfiles' 1 2 3 15
+
+tmpfiles="$tmpfiles program.pl"
+cat <<\EOF > program.pl
+use Locale::TextDomain (prog => './');
+my $n = 2;
+print __"'Your command, please?', asked the waiter.";
+print "\n"
+printf __n ("a piece of cake", "%d pieces of cake", $n), $n;
+print "\n"
+printf __x ("[old] is replaced by [new].", old => "FF", new => "EUR");
+print "\n"
+EOF
+
+tmpfiles="$tmpfiles prog.pot"
+: ${XGETTEXT=xgettext}
+${XGETTEXT} -k__ -k__n:1,2 -k__x -o prog.pot --omit-header --no-location program.pl
+
+tmpfiles="$tmpfiles prog.ok"
+cat <<EOF > prog.ok
+msgid "'Your command, please?', asked the waiter."
+msgstr ""
+
+#, perl-format
+msgid "a piece of cake"
+msgid_plural "%d pieces of cake"
+msgstr[0] ""
+msgstr[1] ""
+
+#, perl-bracket-format
+msgid "[old] is replaced by [new]."
+msgstr ""
+EOF
+
+: ${DIFF=diff}
+${DIFF} prog.ok prog.pot || exit 1
+
+tmpfiles="$tmpfiles fr.po"
+cat <<\EOF > fr.po
+msgid ""
+msgstr ""
+"Content-Type: text/plain; charset=ISO-8859-1\n"
+"Plural-Forms: nplurals=2; plural=(n > 1);\n"
+
+msgid "'Your command, please?', asked the waiter."
+msgstr "«Votre commande, s'il vous plait», dit le garçon."
+
+# Les gateaux allemands sont les meilleurs du monde.
+#, perl-format
+msgid "a piece of cake"
+msgid_plural "%d pieces of cake"
+msgstr[0] "un morceau de gateau"
+msgstr[1] "%d morceaux de gateau"
+
+# Reverse the arguments.
+#, perl-bracket-format
+msgid "[old] is replaced by [new]."
+msgstr "[new] remplace [old]."
+EOF
+
+tmpfiles="$tmpfiles fr.po.new"
+: ${MSGMERGE=msgmerge}
+${MSGMERGE} -q -o fr.po.new fr.po prog.pot
+
+: ${DIFF=diff}
+${DIFF} fr.po fr.po.new || exit 1
+
+tmpfiles="$tmpfiles fr"
+test -d fr || mkdir fr
+test -d fr/LC_MESSAGES || mkdir fr/LC_MESSAGES
+
+: ${MSGFMT=msgfmt}
+${MSGFMT} -o fr/LC_MESSAGES/prog.mo fr.po
+
+tmpfiles="$tmpfiles prog.ok prog.out"
+: ${DIFF=diff}
+cat <<\EOF > prog.ok
+«Votre commande, s'il vous plait», dit le garçon.
+2 morceaux de gateau
+EUR remplace FF.
+EOF
+
+# Test for perl with libintl-perl package.
+perl -M'Locale::TextDomain' -e '' 2>/dev/null \
+ || { rm -fr $tmpfiles; exit 77; }
+
+: ${LOCALE_FR=fr_FR}
+LANGUAGE= LANG=$LOCALE_FR LC_MESSAGES= LC_ALL= perl program.pl > prog.out || exit 1
+${DIFF} prog.ok prog.out || exit 1
+
+rm -fr $tmpfiles
+
+exit 0
PERL
Second here document.
PERL
+
+# These are not invalid interpolations, because the dollar is backslashed.
+printf "%s\n", gettext "abc\$def";
+printf "%s\n", gettext "abc\\\$def";
+
+# These are not interpolations.
+printf "%s\n", gettext 'abc$defg';
+printf "%s\n", gettext 'abc\$defg';
+printf "%s\n", gettext 'abc\\$defg';
+printf "%s\n", gettext 'abc\\\$defg';
+
+# Two consecutive backslashes count as one inside single-quote strings.
+printf "%s\n", gettext 'ecs\tasy';
+printf "%s\n", gettext 'ecs\\tasy';
+printf "%s\n", gettext 'ecs\\\tasy';
+printf "%s\n", gettext 'ecs\\\\tasy';
+printf "%s\n", gettext 'ecs\\\\\tasy';
+printf "%s\n", gettext q(ecs\tasy);
+printf "%s\n", gettext q(ecs\\tasy);
+printf "%s\n", gettext q(ecs\\\tasy);
+printf "%s\n", gettext q(ecs\\\\tasy);
+printf "%s\n", gettext q(ecs\\\\\tasy);
+
+# Similarly, inside double-quote strings, two consecutive backslashes count
+# as one, but the last backslash of a sequence is combined with the following
+# character if possible.
+printf "%s\n", gettext "ecs\tasy";
+printf "%s\n", gettext "ecs\\tasy";
+printf "%s\n", gettext "ecs\\\tasy";
+printf "%s\n", gettext "ecs\\\\tasy";
+printf "%s\n", gettext "ecs\\\\\tasy";
+printf "%s\n", gettext qq(ecs\tasy);
+printf "%s\n", gettext qq(ecs\\tasy);
+printf "%s\n", gettext qq(ecs\\\tasy);
+printf "%s\n", gettext qq(ecs\\\\tasy);
+printf "%s\n", gettext qq(ecs\\\\\tasy);
+printf "%s\n", gettext "mari\huana";
+printf "%s\n", gettext "mari\\huana";
+printf "%s\n", gettext "mari\\\huana";
+printf "%s\n", gettext "mari\\\\huana";
+printf "%s\n", gettext "mari\\\\\huana";
+printf "%s\n", gettext qq(mari\huana);
+printf "%s\n", gettext qq(mari\\huana);
+printf "%s\n", gettext qq(mari\\\huana);
+printf "%s\n", gettext qq(mari\\\\huana);
+printf "%s\n", gettext qq(mari\\\\\huana);
+
+# Recognition of format strings.
+gettext "This is [only] a bracketed formatstring.";
+gettext "This is %s [mixed].";
+gettext "This is only %c.";
+gettext "This is nothing at all.";
+gettext "And this is %l also no format at all.";
+
+# xgettext: no-perl-format, perl-bracket-format
+gettext "The function '[func]' expects '%c' here.";
+
+# This is a contradictory case: The same string three times,
+# with different xgettext comments.
+# xgettext: perl-bracket-format, no-perl-format
+gettext "Left as an %exercise to [maintainer].";
+# xgettext: no-perl-bracket-format, perl-format
+gettext "Left as an %exercise to [maintainer].";
+# No xgettext comment this time.
+gettext "Left as an %exercise to [maintainer].";
__END__
gettext "Discarded!";
EOF
test $? = 0 || { rm -fr $tmpfiles; exit 1; }
tmpfiles="$tmpfiles xg-test26.ok"
-cat <<EOF > xg-test26.ok
+cat <<\EOF > xg-test26.ok
#: xg-test26.pl:9
msgid "'Your command, please?', asked the waiter."
msgstr ""
#: xg-test26.pl:52
msgid "Second here document.\n"
msgstr ""
+
+#: xg-test26.pl:56
+msgid "abc$def"
+msgstr ""
+
+#: xg-test26.pl:57
+msgid "abc\\$def"
+msgstr ""
+
+#: xg-test26.pl:60
+msgid "abc$defg"
+msgstr ""
+
+#: xg-test26.pl:61
+msgid "abc\\$defg"
+msgstr ""
+
+#: xg-test26.pl:62
+msgid "abc\\\\$defg"
+msgstr ""
+
+#: xg-test26.pl:63
+msgid "abc\\\\\\$defg"
+msgstr ""
+
+#: xg-test26.pl:66 xg-test26.pl:67 xg-test26.pl:71 xg-test26.pl:72
+#: xg-test26.pl:81 xg-test26.pl:86
+msgid "ecs\\tasy"
+msgstr ""
+
+#: xg-test26.pl:68 xg-test26.pl:69 xg-test26.pl:73 xg-test26.pl:74
+#: xg-test26.pl:83 xg-test26.pl:88
+msgid "ecs\\\\tasy"
+msgstr ""
+
+#: xg-test26.pl:70 xg-test26.pl:75
+msgid "ecs\\\\\\tasy"
+msgstr ""
+
+#: xg-test26.pl:80 xg-test26.pl:85
+msgid "ecs\tasy"
+msgstr ""
+
+#: xg-test26.pl:82 xg-test26.pl:87
+msgid "ecs\\\tasy"
+msgstr ""
+
+#: xg-test26.pl:84 xg-test26.pl:89
+msgid "ecs\\\\\tasy"
+msgstr ""
+
+#: xg-test26.pl:90 xg-test26.pl:95
+msgid "marihuana"
+msgstr ""
+
+#: xg-test26.pl:91 xg-test26.pl:92 xg-test26.pl:96 xg-test26.pl:97
+msgid "mari\\huana"
+msgstr ""
+
+#: xg-test26.pl:93 xg-test26.pl:94 xg-test26.pl:98 xg-test26.pl:99
+msgid "mari\\\\huana"
+msgstr ""
+
+#: xg-test26.pl:102
+#, perl-bracket-format
+msgid "This is [only] a bracketed formatstring."
+msgstr ""
+
+#: xg-test26.pl:103
+#, perl-format, perl-bracket-format
+msgid "This is %s [mixed]."
+msgstr ""
+
+#: xg-test26.pl:104
+#, perl-format
+msgid "This is only %c."
+msgstr ""
+
+#: xg-test26.pl:105
+msgid "This is nothing at all."
+msgstr ""
+
+#: xg-test26.pl:106
+msgid "And this is %l also no format at all."
+msgstr ""
+
+#: xg-test26.pl:109
+#, no-perl-format, perl-bracket-format
+msgid "The function '[func]' expects '%c' here."
+msgstr ""
+
+#: xg-test26.pl:114 xg-test26.pl:116 xg-test26.pl:118
+#, perl-format, no-perl-bracket-format
+msgid "Left as an %exercise to [maintainer]."
+msgstr ""
EOF
: ${DIFF=diff}