From: Bruno Haible Date: Thu, 24 Apr 2025 23:45:56 +0000 (+0200) Subject: Modula-2 support: Add Modula-2 support in the tools. X-Git-Tag: v0.25~46 X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=09203499b4d98e6b02b13f7d88ff76ebb96b2aa8;p=thirdparty%2Fgettext.git Modula-2 support: Add Modula-2 support in the tools. * gettext-tools/doc/lang-modula2.texi: New file. * gettext-tools/doc/Makefile.am (gettext_TEXINFOS): Add it. * gettext-tools/doc/gettext.texi (PO Files): Mention modula2-format. (No string concatenation): Mention string concatenation in Modula-2. (Translators for other Languages): New subsection "Modula-2 Format Strings". (List of Programming Languages): Include lang-modula2.texi. * gettext-tools/doc/xgettext.texi: Document the -L Modula-2 option. * gettext-tools/src/message.h (format_modula2): New enum value. (NFORMATS): Increment. * gettext-tools/src/message.c (format_language, format_language_pretty): Add an entry for format_modula2. * gettext-tools/src/format-modula2.c: New file. * gettext-tools/src/format.h (formatstring_modula2): New declaration. * gettext-tools/src/format.c (formatstring_parsers): Add formatstring_modula2. * gettext-tools/src/x-modula2.h: New file. * gettext-tools/src/x-modula2.c: New file. * gettext-tools/src/xgettext.c: Include x-modula2.h. (flag_table_modula2): New variable. (main): Invoke init_flag_table_modula2, x_modula2_extract_all, x_modula2_keyword. (usage): Document the -L Modula-2 option. (xgettext_record_flag): Support format_modula2. (language_to_extractor, extension_to_language): Support Modula-2. * gettext-tools/src/FILES: Mention format-modula2.c, x-modula2.h, x-modula2.c. * gettext-tools/src/Makefile.am (noinst_HEADERS): Add x-modula2.h. (FORMAT_SOURCE): Add format-modula2.c. (xgettext_SOURCES): Add x-modula2.c. * gettext-tools/libgettextpo/Makefile.am (libgettextpo_la_AUXSOURCES): Add format-modula2.c. * gettext-tools/po/POTFILES.in: Add src/format-modula2.c, src/x-modula2.c. * gettext-tools/woe32dll/gettextsrc-exports.c: Export formatstring_modula2. * gettext-tools/tests/format-modula2-1: New file. * gettext-tools/tests/format-modula2-2: New file. * gettext-tools/tests/xgettext-modula2-1: New file. * gettext-tools/tests/xgettext-modula2-2: New file. * gettext-tools/tests/Makefile.am (TESTS): Add the new tests. * NEWS: Mention the Modula-2 support. --- diff --git a/NEWS b/NEWS index 1ff57871b..2f0d95634 100644 --- a/NEWS +++ b/NEWS @@ -19,6 +19,9 @@ Version 0.25 - April 2025 * Modula-2: - A new library libintl_m2.so contains the runtime for using GNU gettext message catalogs in the Modula-2 programming language. + - xgettext now supports Modula-2. + - 'msgfmt -c' now verifies the syntax of translations of Modula-2 format + strings. Version 0.24.1 - April 2025 diff --git a/gettext-tools/doc/Makefile.am b/gettext-tools/doc/Makefile.am index 1b26b832e..f84751609 100644 --- a/gettext-tools/doc/Makefile.am +++ b/gettext-tools/doc/Makefile.am @@ -81,6 +81,7 @@ gettext_TEXINFOS = \ lang-gawk.texi \ lang-lua.texi \ lang-pascal.texi \ + lang-modula2.texi \ lang-d.texi \ lang-smalltalk.texi \ lang-vala.texi \ diff --git a/gettext-tools/doc/gettext.texi b/gettext-tools/doc/gettext.texi index 3686e18ad..2ba499e95 100644 --- a/gettext-tools/doc/gettext.texi +++ b/gettext-tools/doc/gettext.texi @@ -412,6 +412,7 @@ The Translator's View * awk-format:: awk Format Strings * lua-format:: Lua Format Strings * object-pascal-format:: Object Pascal Format Strings +* modula2-format:: Modula-2 Format Strings * d-format:: D Format Strings * smalltalk-format:: Smalltalk Format Strings * qt-format:: Qt Format Strings @@ -447,6 +448,7 @@ Individual Programming Languages * gawk:: GNU awk * Lua:: Lua * Pascal:: Pascal - Free Pascal Compiler +* Modula-2:: Modula-2 * Smalltalk:: GNU Smalltalk * Vala:: Vala * wxWidgets:: wxWidgets library @@ -1751,6 +1753,12 @@ Likewise for Lua, see @ref{lua-format}. @kwindex no-object-pascal-format@r{ flag} Likewise for Object Pascal, see @ref{object-pascal-format}. +@item modula2-format +@kwindex modula2-format@r{ flag} +@itemx no-modula2-format +@kwindex no-modula2-format@r{ flag} +Likewise for Modula-2, see @ref{modula2-format}. + @item d-format @kwindex d-format@r{ flag} @itemx no-d-format @@ -2336,6 +2344,7 @@ at runtime (or possibly at compile time, if the compiler supports that). @cindex Shell, string concatenation @cindex awk, string concatenation @cindex Lua, string concatenation +@cindex Modula-2, string concatenation @cindex D, string concatenation @cindex Smalltalk, string concatenation @cindex Vala, string concatenation @@ -2376,6 +2385,9 @@ In awk, string concatenation is denoted by mere juxtaposition of strings. In Lua, string concatenation is denoted by the @samp{..} operator. @c Reference: https://www.lua.org/pil/3.4.html @item +In Modula-2, string concatenation is denoted by the @samp{+} operator. +@c Reference: https://gcc.gnu.org/onlinedocs/gm2/EBNF.html +@item In D, string concatenation is denoted by the @samp{~} operator. @c Reference: https://dlang.org/spec/expression.html#cat_expressions @item @@ -9948,6 +9960,7 @@ strings. * awk-format:: awk Format Strings * lua-format:: Lua Format Strings * object-pascal-format:: Object Pascal Format Strings +* modula2-format:: Modula-2 Format Strings * d-format:: D Format Strings * smalltalk-format:: Smalltalk Format Strings * qt-format:: Qt Format Strings @@ -10242,6 +10255,38 @@ Object Pascal format strings are described in the documentation of the Free Pascal runtime library, section Format, @uref{https://www.freepascal.org/docs-html/rtl/sysutils/format.html}. +@node modula2-format +@subsection Modula-2 Format Strings + +Modula-2 format strings are defined as follows: +@enumerate +@item +Escape sequences are processed. +These escape sequences are understood: +@samp{\a}, @samp{\b}, @samp{\e}, @samp{\f}, @samp{\n}, @samp{\r}, +@samp{\x@var{hex-digits}}, @samp{\@var{octal-digits}}. +Other than that, a backslash is ignored. +@item +A directive consists of +@itemize @bullet +@item +a @samp{%} character, +@item +optionally a flag character @samp{-}, +@item +optionally a flag character @samp{0}, +@item +optionally a width specification (a nonnegative integer), +@item +and finally a specifier: +@samp{s} that formats a string, @samp{c} that formats a character, +@samp{d} and @samp{u}, that format a (signed/unsigned) integer in decimal, +or @samp{x}, that formats an unsigned integer in hexadecimal. +@end itemize +@noindent +There is also the directive @samp{%%}, that produces a single percent character. +@end enumerate + @node d-format @subsection D Format Strings @@ -10468,6 +10513,7 @@ that language, and to combine the resulting files using @code{msgcat}. * gawk:: GNU awk * Lua:: Lua * Pascal:: Pascal - Free Pascal Compiler +* Modula-2:: Modula-2 * D:: D * Smalltalk:: GNU Smalltalk * Vala:: Vala @@ -10499,6 +10545,7 @@ that language, and to combine the resulting files using @code{msgcat}. @include lang-gawk.texi @include lang-lua.texi @include lang-pascal.texi +@include lang-modula2.texi @include lang-d.texi @include lang-smalltalk.texi @include lang-vala.texi diff --git a/gettext-tools/doc/lang-modula2.texi b/gettext-tools/doc/lang-modula2.texi new file mode 100644 index 000000000..4b5a8fed6 --- /dev/null +++ b/gettext-tools/doc/lang-modula2.texi @@ -0,0 +1,55 @@ +@c This file is part of the GNU gettext manual. +@c Copyright (C) 1995-2025 Free Software Foundation, Inc. +@c See the file gettext.texi for copying conditions. + +@node Modula-2 +@subsection Modula-2 +@cindex Modula-2 + +@table @asis +@item RPMs +gcc-gm2, libgm2 + +@item Ubuntu packages +gm2 + +@item File extension +@code{mod}, @code{def} + +@item String syntax +@code{'abc'}, @code{"abc"} + +@item gettext shorthand +--- + +@item gettext/ngettext functions +@code{Gettext}, @code{DGettext}, @code{DCGettext}, +@code{NGettext}, @code{DNGettext}, @code{DCNGettext} + +@item textdomain +@code{TextDomain} function + +@item bindtextdomain +@code{BindTextDomain} function + +@item setlocale +Programmer must call @code{SetLocale (LC_ALL, "")} + +@item Prerequisite +@code{FROM Libintl IMPORT Gettext ...;} + +@item Use or emulate GNU gettext +Use + +@item Extractor +@code{xgettext} + +@item Formatting with positions +--- + +@item Portability +fully portable to all platforms supported by GNU Modula-2 + +@item po-mode marking +--- +@end table diff --git a/gettext-tools/doc/xgettext.texi b/gettext-tools/doc/xgettext.texi index 7e877f678..ccb1f42e8 100644 --- a/gettext-tools/doc/xgettext.texi +++ b/gettext-tools/doc/xgettext.texi @@ -90,6 +90,7 @@ Specifies the language of the input files. The supported languages are @code{Shell}, @code{awk}, @code{Lua}, +@code{Modula-2}, @code{D}, @code{Smalltalk}, @code{Vala}, @@ -263,6 +264,7 @@ Go, Shell, awk, Lua, +Modula-2, D, Vala, Tcl, @@ -327,6 +329,7 @@ Go, Shell, awk, Lua, +Modula-2, D, Vala, Tcl, @@ -491,6 +494,7 @@ Go, Shell, awk, Lua, +Modula-2, D, Vala, Tcl, diff --git a/gettext-tools/libgettextpo/Makefile.am b/gettext-tools/libgettextpo/Makefile.am index 1a85edc02..8dc18f6a1 100644 --- a/gettext-tools/libgettextpo/Makefile.am +++ b/gettext-tools/libgettextpo/Makefile.am @@ -86,6 +86,7 @@ libgettextpo_la_AUXSOURCES = \ ../src/format-awk.c \ ../src/format-lua.c \ ../src/format-pascal.c \ + ../src/format-modula2.c \ ../src/format-d.c \ ../src/format-smalltalk.c \ ../src/format-qt.c \ diff --git a/gettext-tools/po/POTFILES.in b/gettext-tools/po/POTFILES.in index bc9925313..4682ec21e 100644 --- a/gettext-tools/po/POTFILES.in +++ b/gettext-tools/po/POTFILES.in @@ -28,6 +28,7 @@ src/format-kde-kuit.c src/format-librep.c src/format-lisp.c src/format-lua.c +src/format-modula2.c src/format-pascal.c src/format-perl-brace.c src/format-perl.c @@ -107,6 +108,7 @@ src/x-javascript.c src/x-librep.c src/x-lisp.c src/x-lua.c +src/x-modula2.c src/x-perl.c src/x-php.c src/x-po.c diff --git a/gettext-tools/src/FILES b/gettext-tools/src/FILES index b32a76da2..49ae6164b 100644 --- a/gettext-tools/src/FILES +++ b/gettext-tools/src/FILES @@ -243,6 +243,7 @@ format-sh.c Format string handling for Shell. format-awk.c Format string handling for awk. format-lua.c Format string handling for Lua. format-pascal.c Format string handling for Object Pascal. +format-modula2.c Format string handling for Modula-2. format-d.c Format string handling for D. format-smalltalk.c Format string handling for Smalltalk and YCP. format-qt.c Format string handling for Qt. @@ -413,6 +414,9 @@ msgl-check.c | x-lua.h | x-lua.c | String extractor for Lua. +| x-modula2.h +| x-modula2.c +| String extractor for Modula-2. | x-d.h | x-d.c | html5-entities.h diff --git a/gettext-tools/src/Makefile.am b/gettext-tools/src/Makefile.am index 17541643b..513996cab 100644 --- a/gettext-tools/src/Makefile.am +++ b/gettext-tools/src/Makefile.am @@ -89,6 +89,7 @@ noinst_HEADERS = \ x-sh.h \ x-awk.h \ x-lua.h \ + x-modula2.h \ x-d.h html5-entities.h \ x-smalltalk.h \ x-vala.h \ @@ -205,6 +206,7 @@ FORMAT_SOURCE += \ format-awk.c \ format-lua.c \ format-pascal.c \ + format-modula2.c \ format-d.c \ format-smalltalk.c \ format-qt.c \ @@ -328,6 +330,7 @@ xgettext_SOURCES += \ x-sh.c ../../gettext-runtime/src/escapes.h \ x-awk.c \ x-lua.c \ + x-modula2.c \ x-d.c \ x-smalltalk.c \ x-vala.c \ diff --git a/gettext-tools/src/format-modula2.c b/gettext-tools/src/format-modula2.c new file mode 100644 index 000000000..c2b566499 --- /dev/null +++ b/gettext-tools/src/format-modula2.c @@ -0,0 +1,333 @@ +/* Modula-2 format strings. + Copyright (C) 2001-2025 Free Software Foundation, Inc. + Written by Bruno Haible , 2025. + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see . */ + +#ifdef HAVE_CONFIG_H +# include +#endif + +#include +#include + +#include "format.h" +#include "c-ctype.h" +#include "xalloc.h" +#include "xvasprintf.h" +#include "format-invalid.h" +#include "gettext.h" + +#define _(str) gettext (str) + +/* The GNU Modula-2 format strings are implemented in + gcc-14.2.0/gcc/m2/gm2-libs/FormatStrings.mod. + + A directive + - starts with '%', + - is optionally followed by the flag character '-', + - is optionally followed by the flag character '0', + - is optionally followed by a width specification: a nonempty digit + sequence, + - is finished by a specifier + - 's', that needs a String argument, + - 'c', that needs a CHAR argument, + - 'd', that needs an INTEGER argument and converts it to decimal, + - 'u', that needs a CARDINAL argument and converts it to decimal, + - 'x', that needs a CARDINAL argument and converts it to hexadecimal. + Additionally there is the directive '%%', which takes no argument. + + Also, escape sequences in the format string are processed, as documented in + : + \a, \b, \e, \f, \n, \r, \x[hex], \[octal], \[other character] + */ + +enum format_arg_type +{ + FAT_STRING, + FAT_CHAR, + FAT_INTEGER, + FAT_CARDINAL +}; + +struct spec +{ + unsigned int directives; + unsigned int arg_count; + enum format_arg_type *args; +}; + +static void * +format_parse (const char *format, bool translated, char *fdi, + char **invalid_reason) +{ + const char *const format_start = format; + struct spec spec; + unsigned int args_allocated; + struct spec *result; + + spec.directives = 0; + spec.arg_count = 0; + spec.args = NULL; + args_allocated = 0; + + for (; *format != '\0';) + { + if (*format == '\\') + format++; + if (*format != '\0') + { + if (*format++ == '%') + { + FDI_SET (format - 1, FMTDIR_START); + spec.directives++; + + if (*format != '%') + { + enum format_arg_type type; + + /* Parse flags. */ + if (*format == '-') + format++; + if (*format == '0') + format++; + + /* Parse width. */ + while (c_isdigit (*format)) + format++; + + switch (*format) + { + case 's': + type = FAT_STRING; + break; + case 'c': + type = FAT_CHAR; + break; + case 'd': + type = FAT_INTEGER; + break; + case 'u': case 'x': + type = FAT_CARDINAL; + break; + default: + if (*format == '\0') + { + *invalid_reason = INVALID_UNTERMINATED_DIRECTIVE (); + FDI_SET (format - 1, FMTDIR_ERROR); + } + else + { + *invalid_reason = + INVALID_CONVERSION_SPECIFIER (spec.directives, *format); + FDI_SET (format, FMTDIR_ERROR); + } + goto bad_format; + } + + if (spec.arg_count == args_allocated) + { + args_allocated = 2 * args_allocated + 10; + spec.args = + (enum format_arg_type *) + xrealloc (spec.args, args_allocated * sizeof (enum format_arg_type)); + } + spec.args[spec.arg_count++] = type; + } + FDI_SET (format, FMTDIR_END); + format++; + } + } + } + + result = XMALLOC (struct spec); + *result = spec; + return result; + + bad_format: + if (spec.args != NULL) + free (spec.args); + return NULL; +} + +static void +format_free (void *descr) +{ + struct spec *spec = (struct spec *) descr; + + if (spec->args != NULL) + free (spec->args); + free (spec); +} + +static int +format_get_number_of_directives (void *descr) +{ + struct spec *spec = (struct spec *) descr; + + return spec->directives; +} + +static bool +format_check (void *msgid_descr, void *msgstr_descr, bool equality, + formatstring_error_logger_t error_logger, void *error_logger_data, + const char *pretty_msgid, const char *pretty_msgstr) +{ + struct spec *spec1 = (struct spec *) msgid_descr; + struct spec *spec2 = (struct spec *) msgstr_descr; + bool err = false; + + if (spec1->arg_count + spec2->arg_count > 0) + { + unsigned int n1 = spec1->arg_count; + unsigned int n2 = spec2->arg_count; + + /* Check that the argument counts are the same. */ + if (n1 < n2) + { + if (error_logger) + error_logger (error_logger_data, + _("a format specification for argument %u, as in '%s', doesn't exist in '%s'"), + n1 + 1, pretty_msgstr, pretty_msgid); + err = true; + } + else if (n1 > n2 && equality) + { + if (error_logger) + error_logger (error_logger_data, + _("a format specification for argument %u doesn't exist in '%s'"), + n1 + 1, pretty_msgstr); + err = true; + } + else + { + unsigned int i; + + /* Check that the argument types are the same. */ + if (!err) + for (i = 0; i < n1; i++) + { + if (spec1->args[i] != spec2->args[i]) + { + if (error_logger) + error_logger (error_logger_data, + _("format specifications in '%s' and '%s' for argument %u are not the same"), + pretty_msgid, pretty_msgstr, i + 1); + err = true; + break; + } + } + } + } + + return err; +} + + +struct formatstring_parser formatstring_modula2 = +{ + format_parse, + format_free, + format_get_number_of_directives, + NULL, + format_check +}; + + + +#ifdef TEST + +/* Test program: Print the argument list specification returned by + format_parse for strings read from standard input. */ + +#include + +static void +format_print (void *descr) +{ + struct spec *spec = (struct spec *) descr; + unsigned int i; + + if (spec == NULL) + { + printf ("INVALID"); + return; + } + + printf ("("); + for (i = 0; i < spec->arg_count; i++) + { + if (i > 0) + printf (" "); + switch (spec->args[i]) + { + case FAT_STRING: + printf ("s"); + break; + case FAT_CHAR: + printf ("c"); + break; + case FAT_INTEGER: + printf ("i"); + break; + case FAT_CARDINAL: + printf ("u"); + break; + default: + abort (); + } + } + printf (")"); +} + +int +main () +{ + for (;;) + { + char *line = NULL; + size_t line_size = 0; + int line_len; + char *invalid_reason; + void *descr; + + line_len = getline (&line, &line_size, stdin); + if (line_len < 0) + break; + if (line_len > 0 && line[line_len - 1] == '\n') + line[--line_len] = '\0'; + + invalid_reason = NULL; + descr = format_parse (line, false, NULL, &invalid_reason); + + format_print (descr); + printf ("\n"); + if (descr == NULL) + printf ("%s\n", invalid_reason); + + free (invalid_reason); + free (line); + } + + return 0; +} + +/* + * For Emacs M-x compile + * Local Variables: + * compile-command: "/bin/sh ../libtool --tag=CC --mode=link gcc -o a.out -static -O -g -Wall -I.. -I../gnulib-lib -I../../gettext-runtime/intl -DHAVE_CONFIG_H -DTEST format-modula2.c ../gnulib-lib/libgettextlib.la" + * End: + */ + +#endif /* TEST */ diff --git a/gettext-tools/src/format.c b/gettext-tools/src/format.c index c8388ddc2..73fe7da50 100644 --- a/gettext-tools/src/format.c +++ b/gettext-tools/src/format.c @@ -54,6 +54,7 @@ struct formatstring_parser *formatstring_parsers[NFORMATS] = /* format_awk */ &formatstring_awk, /* format_lua */ &formatstring_lua, /* format_pascal */ &formatstring_pascal, + /* format_modula2 */ &formatstring_modula2, /* format_d */ &formatstring_d, /* format_smalltalk */ &formatstring_smalltalk, /* format_qt */ &formatstring_qt, diff --git a/gettext-tools/src/format.h b/gettext-tools/src/format.h index 4bf8a1856..6db78bc3b 100644 --- a/gettext-tools/src/format.h +++ b/gettext-tools/src/format.h @@ -120,6 +120,7 @@ extern DLL_VARIABLE struct formatstring_parser formatstring_sh; extern DLL_VARIABLE struct formatstring_parser formatstring_awk; extern DLL_VARIABLE struct formatstring_parser formatstring_lua; extern DLL_VARIABLE struct formatstring_parser formatstring_pascal; +extern DLL_VARIABLE struct formatstring_parser formatstring_modula2; extern DLL_VARIABLE struct formatstring_parser formatstring_d; extern DLL_VARIABLE struct formatstring_parser formatstring_smalltalk; extern DLL_VARIABLE struct formatstring_parser formatstring_qt; diff --git a/gettext-tools/src/message.c b/gettext-tools/src/message.c index 9b6193685..1c1808fcf 100644 --- a/gettext-tools/src/message.c +++ b/gettext-tools/src/message.c @@ -54,6 +54,7 @@ const char *const format_language[NFORMATS] = /* format_awk */ "awk", /* format_lua */ "lua", /* format_pascal */ "object-pascal", + /* format_modula2 */ "modula2", /* format_d */ "d", /* format_smalltalk */ "smalltalk", /* format_qt */ "qt", @@ -92,6 +93,7 @@ const char *const format_language_pretty[NFORMATS] = /* format_awk */ "awk", /* format_lua */ "Lua", /* format_pascal */ "Object Pascal", + /* format_modula2 */ "Modula-2", /* format_d */ "D", /* format_smalltalk */ "Smalltalk", /* format_qt */ "Qt", diff --git a/gettext-tools/src/message.h b/gettext-tools/src/message.h index 72b3d8602..c7f4760b4 100644 --- a/gettext-tools/src/message.h +++ b/gettext-tools/src/message.h @@ -63,6 +63,7 @@ enum format_type format_awk, format_lua, format_pascal, + format_modula2, format_d, format_smalltalk, format_qt, @@ -78,7 +79,7 @@ enum format_type format_gfc_internal, format_ycp }; -#define NFORMATS 34 /* Number of format_type enum values. */ +#define NFORMATS 35 /* Number of format_type enum values. */ extern DLL_VARIABLE const char *const format_language[NFORMATS]; extern DLL_VARIABLE const char *const format_language_pretty[NFORMATS]; diff --git a/gettext-tools/src/x-modula2.c b/gettext-tools/src/x-modula2.c new file mode 100644 index 000000000..a5b805466 --- /dev/null +++ b/gettext-tools/src/x-modula2.c @@ -0,0 +1,804 @@ +/* xgettext Modula-2 backend. + Copyright (C) 2002-2025 Free Software Foundation, Inc. + + This file was written by Bruno Haible , 2025. + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see . */ + +#ifdef HAVE_CONFIG_H +# include "config.h" +#endif + +/* Specification. */ +#include "x-modula2.h" + +#include +#include +#include +#include +#include + +#define SB_NO_APPENDF +#include +#include "attribute.h" +#include "message.h" +#include "xgettext.h" +#include "xg-pos.h" +#include "xg-arglist-context.h" +#include "xg-arglist-callshape.h" +#include "xg-arglist-parser.h" +#include "xg-message.h" +#include "if-error.h" +#include "xalloc.h" +#include "string-buffer.h" +#include "gettext.h" + +#define _(s) gettext(s) + +#define SIZEOF(a) (sizeof(a) / sizeof(a[0])) + + +/* The Modula-2 syntax is defined in the book + "The Programming Language Modula-2" by Niklaus Wirth + . + The syntax understood by GNU Modula-2 is listed in + . */ + + +/* ====================== Keyword set customization. ====================== */ + +/* If true extract all strings. */ +static bool extract_all = false; + +static hash_table keywords; +static bool default_keywords = true; + + +void +x_modula2_extract_all () +{ + extract_all = true; +} + + +void +x_modula2_keyword (const char *name) +{ + if (name == NULL) + default_keywords = false; + else + { + const char *end; + struct callshape shape; + const char *colon; + + if (keywords.table == NULL) + hash_init (&keywords, 100); + + split_keywordspec (name, &end, &shape); + + /* The characters between name and end should form a valid Modula-2 + identifier. + A colon means an invalid parse in split_keywordspec(). */ + colon = strchr (name, ':'); + if (colon == NULL || colon >= end) + insert_keyword_callshape (&keywords, name, end - name, &shape); + } +} + +/* Finish initializing the keywords hash table. + Called after argument processing, before each file is processed. */ +static void +init_keywords () +{ + if (default_keywords) + { + /* When adding new keywords here, also update the documentation in + xgettext.texi! */ + x_modula2_keyword ("Gettext"); + x_modula2_keyword ("DGettext:2"); + x_modula2_keyword ("DCGettext:2"); + x_modula2_keyword ("NGettext:1,2"); + x_modula2_keyword ("DNGettext:2,3"); + x_modula2_keyword ("DCNGettext:2,3"); + default_keywords = false; + } +} + +void +init_flag_table_modula2 () +{ + xgettext_record_flag ("Gettext:1:pass-modula2-format"); + xgettext_record_flag ("DGettext:2:pass-modula2-format"); + xgettext_record_flag ("DCGettext:2:pass-modula2-format"); + xgettext_record_flag ("NGettext:1:pass-modula2-format"); + xgettext_record_flag ("NGettext:2:pass-modula2-format"); + xgettext_record_flag ("DNGettext:2:pass-modula2-format"); + xgettext_record_flag ("DNGettext:3:pass-modula2-format"); + xgettext_record_flag ("DCNGettext:2:pass-modula2-format"); + xgettext_record_flag ("DCNGettext:3:pass-modula2-format"); + /* FormatStrings.def */ + xgettext_record_flag ("Sprintf0:1:modula2-format"); + xgettext_record_flag ("Sprintf1:1:modula2-format"); + xgettext_record_flag ("Sprintf2:1:modula2-format"); + xgettext_record_flag ("Sprintf3:1:modula2-format"); + xgettext_record_flag ("Sprintf4:1:modula2-format"); +} + + +/* ======================== Reading of characters. ======================== */ + +/* The input file stream. */ +static FILE *fp; + + +/* 1. line_number handling. */ + +static int +phase1_getc () +{ + int c = getc (fp); + + if (c == EOF) + { + if (ferror (fp)) + error (EXIT_FAILURE, errno, _("error while reading \"%s\""), + real_file_name); + return EOF; + } + + if (c == '\n') + line_number++; + + return c; +} + +/* Supports only one pushback character. */ +static void +phase1_ungetc (int c) +{ + if (c != EOF) + { + if (c == '\n') + --line_number; + + ungetc (c, fp); + } +} + + +/* These are for tracking whether comments count as immediately before + keyword. */ +static int last_comment_line; +static int last_non_comment_line; + + +/* 2. Replace each comment that is not inside a character constant or + string literal with a space character. We need to remember the + comment for later, because it may be attached to a keyword string. + Modula-2 comments are specified in + : + "Comments may be inserted between any two symbols in a program. + They are arbitrary character sequences opened by the bracket (* and + closed by *). Comments may be nested, and they do not affect the + meaning of a program." */ + +static unsigned char phase2_pushback[1]; +static int phase2_pushback_length; + +static int +phase2_getc () +{ + int c; + + if (phase2_pushback_length) + return phase2_pushback[--phase2_pushback_length]; + + c = phase1_getc (); + if (c == '(') + { + c = phase1_getc (); + if (c == '*') + { + /* A comment. */ + int lineno; + struct string_buffer buffer; + unsigned int nesting; + bool last_was_star; + bool last_was_opening_paren; + + lineno = line_number; + sb_init (&buffer); + nesting = 0; + last_was_star = false; + last_was_opening_paren = false; + for (;;) + { + c = phase1_getc (); + if (c == EOF) + { + sb_free (&buffer); + break; + } + + if (last_was_opening_paren && c == '*') + nesting++; + else if (last_was_star && c == ')') + { + if (nesting == 0) + { + --buffer.length; + while (buffer.length >= 1 + && (buffer.data[buffer.length - 1] == ' ' + || buffer.data[buffer.length - 1] == '\t')) + --buffer.length; + savable_comment_add (sb_xdupfree_c (&buffer)); + break; + } + nesting--; + } + last_was_star = (c == '*'); + last_was_opening_paren = (c == '('); + + /* We skip all leading white space, but not EOLs. */ + if (sd_length (sb_contents (&buffer)) == 0 + && (c == ' ' || c == '\t')) + continue; + sb_xappend1 (&buffer, c); + if (c == '\n') + { + --buffer.length; + while (buffer.length >= 1 + && (buffer.data[buffer.length - 1] == ' ' + || buffer.data[buffer.length - 1] == '\t')) + --buffer.length; + savable_comment_add (sb_xdupfree_c (&buffer)); + sb_init (&buffer); + lineno = line_number; + } + } + last_comment_line = lineno; + return ' '; + } + else + { + phase1_ungetc (c); + return '('; + } + } + else + return c; +} + +/* Supports only one pushback character. */ +static void +phase2_ungetc (int c) +{ + if (c != EOF) + { + if (phase2_pushback_length == SIZEOF (phase2_pushback)) + abort (); + phase2_pushback[phase2_pushback_length++] = c; + } +} + + +/* ========================== Reading of tokens. ========================== */ + + +enum token_type_ty +{ + token_type_eof, + token_type_lparen, /* ( */ + token_type_rparen, /* ) */ + token_type_comma, /* , */ + token_type_plus, /* + */ + token_type_operator, /* - * / = # < <= > >= */ + token_type_string_literal, /* "abc", 'abc' */ + token_type_symbol, /* symbol */ + token_type_other /* :=, number, other */ +}; +typedef enum token_type_ty token_type_ty; + +typedef struct token_ty token_ty; +struct token_ty +{ + token_type_ty type; + char *string; /* for token_type_string_literal, token_type_symbol */ + refcounted_string_list_ty *comment; /* for token_type_string_literal */ + int line_number; +}; + + +/* Free the memory pointed to by a 'struct token_ty'. */ +static inline void +free_token (token_ty *tp) +{ + if (tp->type == token_type_string_literal || tp->type == token_type_symbol) + free (tp->string); + if (tp->type == token_type_string_literal) + drop_reference (tp->comment); +} + + +/* Combine characters into tokens. Discard whitespace. */ + +static token_ty phase3_pushback[2]; +static int phase3_pushback_length; + +static void +phase3_get (token_ty *tp) +{ + int c; + + if (phase3_pushback_length) + { + *tp = phase3_pushback[--phase3_pushback_length]; + return; + } + for (;;) + { + tp->line_number = line_number; + c = phase2_getc (); + + switch (c) + { + case EOF: + tp->type = token_type_eof; + return; + + case '\n': + if (last_non_comment_line > last_comment_line) + savable_comment_reset (); + FALLTHROUGH; + case '\r': + case '\t': + case ' ': + /* Ignore whitespace and comments. */ + continue; + } + + last_non_comment_line = tp->line_number; + + switch (c) + { + case 'A': case 'B': case 'C': case 'D': case 'E': case 'F': + case 'G': case 'H': case 'I': case 'J': case 'K': case 'L': + case 'M': case 'N': case 'O': case 'P': case 'Q': case 'R': + case 'S': case 'T': case 'U': case 'V': case 'W': case 'X': + case 'Y': case 'Z': + case 'a': case 'b': case 'c': case 'd': case 'e': case 'f': + case 'g': case 'h': case 'i': case 'j': case 'k': case 'l': + case 'm': case 'n': case 'o': case 'p': case 'q': case 'r': + case 's': case 't': case 'u': case 'v': case 'w': case 'x': + case 'y': case 'z': + case '_': /* GNU Modula-2 treats '_' like a letter. */ + /* Symbol. */ + { + struct string_buffer buffer; + sb_init (&buffer); + for (;;) + { + sb_xappend1 (&buffer, c); + c = phase2_getc (); + switch (c) + { + case 'A': case 'B': case 'C': case 'D': case 'E': case 'F': + case 'G': case 'H': case 'I': case 'J': case 'K': case 'L': + case 'M': case 'N': case 'O': case 'P': case 'Q': case 'R': + case 'S': case 'T': case 'U': case 'V': case 'W': case 'X': + case 'Y': case 'Z': + case 'a': case 'b': case 'c': case 'd': case 'e': case 'f': + case 'g': case 'h': case 'i': case 'j': case 'k': case 'l': + case 'm': case 'n': case 'o': case 'p': case 'q': case 'r': + case 's': case 't': case 'u': case 'v': case 'w': case 'x': + case 'y': case 'z': + case '_': /* GNU Modula-2 treats '_' like a letter. */ + case '0': case '1': case '2': case '3': case '4': + case '5': case '6': case '7': case '8': case '9': + continue; + default: + phase2_ungetc (c); + break; + } + break; + } + tp->string = sb_xdupfree_c (&buffer); + /* We could carefully recognize each of the 2 and 3 character + operators (IN, DIV, MOD, etc.), but it is not necessary, as we + only need to recognize gettext invocations. Don't bother. */ + tp->type = token_type_symbol; + } + return; + + /* String syntax: + says: + "Strings are sequences of characters enclosed in quote marks. + Both double quotes and single quotes (apostrophes) may be used + as quote marks. However, the opening and closing marks must be + the same character, and this character cannot occur within the + string. A string must not extend over the end of a line." */ + case '"': case '\'': + { + int delimiter = c; + struct string_buffer buffer; + sb_init (&buffer); + for (;;) + { + c = phase1_getc (); + if (c == EOF || c == '\n') + { + if_error (IF_SEVERITY_WARNING, + logical_file_name, line_number - (c == '\n'), (size_t)(-1), + false, + _("unterminated string literal")); + break; + } + if (c == delimiter) + break; + sb_xappend1 (&buffer, c); + } + tp->string = sb_xdupfree_c (&buffer); + tp->type = token_type_string_literal; + tp->comment = add_reference (savable_comment); + } + return; + + case '(': + tp->type = token_type_lparen; + return; + + case ')': + tp->type = token_type_rparen; + return; + + case ',': + tp->type = token_type_comma; + return; + + case '+': + tp->type = token_type_plus; + return; + + case '-': + case '*': + case '/': + case '=': + case '#': + tp->type = token_type_operator; + return; + + case '<': + case '>': + c = phase1_getc (); + if (c != '=') + phase1_ungetc (c); + tp->type = token_type_operator; + return; + + case ':': + c = phase1_getc (); + if (c != '=') + phase1_ungetc (c); + tp->type = token_type_other; + return; + + default: + tp->type = token_type_other; + return; + } + } +} + +/* Supports only 2 pushback tokens. */ +static void +phase3_unget (token_ty *tp) +{ + if (tp->type != token_type_eof) + { + if (phase3_pushback_length == SIZEOF (phase3_pushback)) + abort (); + phase3_pushback[phase3_pushback_length++] = *tp; + } +} + + +/* Compile-time optimization of string literal concatenation. + Combine "string1" + ... + "stringN" to the concatenated string. */ + +/* Concatenates two strings, and frees the first argument. */ +static char * +string_concat_free1 (char *s1, const char *s2) +{ + size_t len1 = strlen (s1); + size_t len2 = strlen (s2); + size_t len = len1 + len2 + 1; + char *result = XNMALLOC (len, char); + memcpy (result, s1, len1); + memcpy (result + len1, s2, len2 + 1); + free (s1); + return result; +} + +static token_ty phase4_pushback[2]; +static int phase4_pushback_length; + +static void +phase4_get (token_ty *tp) +{ + if (phase4_pushback_length) + { + *tp = phase4_pushback[--phase4_pushback_length]; + return; + } + + phase3_get (tp); + if (tp->type == token_type_string_literal) + { + char *sum = tp->string; + + for (;;) + { + token_ty token2; + + phase3_get (&token2); + if (token2.type == token_type_plus) + { + token_ty token3; + + phase3_get (&token3); + if (token3.type == token_type_string_literal) + { + sum = string_concat_free1 (sum, token3.string); + + free_token (&token3); + free_token (&token2); + continue; + } + phase3_unget (&token3); + } + phase3_unget (&token2); + break; + } + tp->string = sum; + } +} + +/* Supports 2 tokens of pushback. */ +static void +phase4_unget (token_ty *tp) +{ + if (tp->type != token_type_eof) + { + if (phase4_pushback_length == SIZEOF (phase4_pushback)) + abort (); + phase4_pushback[phase4_pushback_length++] = *tp; + } +} + + +static void +x_modula2_lex (token_ty *tp) +{ + phase4_get (tp); +} + +/* Supports 2 tokens of pushback. */ +MAYBE_UNUSED static void +x_modula2_unlex (token_ty *tp) +{ + phase4_unget (tp); +} + + +/* ========================= Extracting strings. ========================== */ + + +/* Context lookup table. */ +static flag_context_list_table_ty *flag_context_list_table; + + +/* Maximum supported nesting depth. */ +#define MAX_NESTING_DEPTH 1000 + +/* Current nesting depth. */ +static int nesting_depth; + + +/* The file is broken into tokens. Scan the token stream, looking for + a keyword, followed by a left paren, followed by a string. When we + see this sequence, we have something to remember. We assume we are + looking at a valid Modula-2 program, and leave the complaints about + the grammar to the compiler. + + Normal handling: Look for + keyword ( ... msgid ... ) + Plural handling: Look for + keyword ( ... msgid ... msgid_plural ... ) + + We use recursion because the arguments before msgid or between msgid + and msgid_plural can contain subexpressions of the same form. */ + + +/* Extract messages until the next balanced closing parenthesis. + Extracted messages are added to MLP. + Return true upon eof, false upon closing parenthesis. */ +static bool +extract_parenthesized (message_list_ty *mlp, + flag_region_ty *outer_region, + flag_context_list_iterator_ty context_iter, + struct arglist_parser *argparser) +{ + /* Current argument number. */ + int arg = 1; + /* 0 when no keyword has been seen. 1 right after a keyword is seen. */ + int state; + /* Parameters of the keyword just seen. Defined only in state 1. */ + const struct callshapes *next_shapes = NULL; + /* Context iterator that will be used if the next token is a '('. */ + flag_context_list_iterator_ty next_context_iter = + passthrough_context_list_iterator; + /* Current region. */ + flag_region_ty *inner_region = + inheriting_region (outer_region, + flag_context_list_iterator_advance (&context_iter)); + + /* Start state is 0. */ + state = 0; + + for (;;) + { + token_ty token; + + x_modula2_lex (&token); + + switch (token.type) + { + case token_type_symbol: + { + void *keyword_value; + + if (hash_find_entry (&keywords, token.string, strlen (token.string), + &keyword_value) + == 0) + { + next_shapes = (const struct callshapes *) keyword_value; + state = 1; + } + else + state = 0; + } + next_context_iter = + flag_context_list_iterator ( + flag_context_list_table_lookup ( + flag_context_list_table, + token.string, strlen (token.string))); + free (token.string); + continue; + + case token_type_lparen: + if (++nesting_depth > MAX_NESTING_DEPTH) + if_error (IF_SEVERITY_FATAL_ERROR, + logical_file_name, line_number, (size_t)(-1), false, + _("too many open parentheses")); + if (extract_parenthesized (mlp, inner_region, next_context_iter, + arglist_parser_alloc (mlp, + state ? next_shapes : NULL))) + { + arglist_parser_done (argparser, arg); + unref_region (inner_region); + return true; + } + nesting_depth--; + next_context_iter = null_context_list_iterator; + state = 0; + continue; + + case token_type_rparen: + arglist_parser_done (argparser, arg); + unref_region (inner_region); + return false; + + case token_type_comma: + arg++; + unref_region (inner_region); + inner_region = + inheriting_region (outer_region, + flag_context_list_iterator_advance ( + &context_iter)); + next_context_iter = passthrough_context_list_iterator; + state = 0; + continue; + + case token_type_string_literal: + { + lex_pos_ty pos; + pos.file_name = logical_file_name; + pos.line_number = token.line_number; + + if (extract_all) + remember_a_message (mlp, NULL, token.string, false, false, + inner_region, &pos, + NULL, token.comment, false); + else + { + mixed_string_ty *ms = + mixed_string_alloc_simple (token.string, lc_string, + pos.file_name, pos.line_number); + free (token.string); + arglist_parser_remember (argparser, arg, ms, + inner_region, + pos.file_name, pos.line_number, + token.comment, false); + } + } + next_context_iter = null_context_list_iterator; + state = 0; + continue; + + case token_type_eof: + arglist_parser_done (argparser, arg); + unref_region (inner_region); + return true; + + case token_type_plus: + case token_type_operator: + case token_type_other: + next_context_iter = null_context_list_iterator; + state = 0; + continue; + + default: + abort (); + } + } +} + + +void +extract_modula2 (FILE *f, + const char *real_filename, const char *logical_filename, + flag_context_list_table_ty *flag_table, + msgdomain_list_ty *mdlp) +{ + message_list_ty *mlp = mdlp->item[0]->messages; + + fp = f; + real_file_name = real_filename; + logical_file_name = xstrdup (logical_filename); + line_number = 1; + + last_comment_line = -1; + last_non_comment_line = -1; + + phase2_pushback_length = 0; + phase3_pushback_length = 0; + phase4_pushback_length = 0; + + flag_context_list_table = flag_table; + nesting_depth = 0; + + init_keywords (); + + /* Eat tokens until eof is seen. When extract_parenthesized returns + due to an unbalanced closing parenthesis, just restart it. */ + while (!extract_parenthesized (mlp, null_context_region (), null_context_list_iterator, + arglist_parser_alloc (mlp, NULL))) + ; + + fp = NULL; + real_file_name = NULL; + logical_file_name = NULL; + line_number = 0; +} diff --git a/gettext-tools/src/x-modula2.h b/gettext-tools/src/x-modula2.h new file mode 100644 index 000000000..3c44100ac --- /dev/null +++ b/gettext-tools/src/x-modula2.h @@ -0,0 +1,53 @@ +/* xgettext Modula-2 backend. + Copyright (C) 2002-2025 Free Software Foundation, Inc. + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see . */ + +/* Written by Bruno Haible , 2025. */ + + +#include + +#include "message.h" +#include "xg-arglist-context.h" + + +#ifdef __cplusplus +extern "C" { +#endif + + +#define EXTENSIONS_MODULA2 \ + { "mod", "Modula-2" }, \ + { "def", "Modula-2" }, \ + +#define SCANNERS_MODULA2 \ + { "Modula-2", extract_modula2, NULL, \ + &flag_table_modula2, &formatstring_modula2, NULL }, \ + +/* Scan a Modula-2 file and add its translatable strings to mdlp. */ +extern void extract_modula2 (FILE *fp, const char *real_filename, + const char *logical_filename, + flag_context_list_table_ty *flag_table, + msgdomain_list_ty *mdlp); + +extern void x_modula2_keyword (const char *keyword); +extern void x_modula2_extract_all (void); + +extern void init_flag_table_modula2 (void); + + +#ifdef __cplusplus +} +#endif diff --git a/gettext-tools/src/xgettext.c b/gettext-tools/src/xgettext.c index 6851f114f..1f9ca50d6 100644 --- a/gettext-tools/src/xgettext.c +++ b/gettext-tools/src/xgettext.c @@ -119,6 +119,7 @@ #include "x-sh.h" #include "x-awk.h" #include "x-lua.h" +#include "x-modula2.h" #include "x-d.h" #include "x-smalltalk.h" #include "x-vala.h" @@ -212,6 +213,7 @@ static flag_context_list_table_ty flag_table_ruby; static flag_context_list_table_ty flag_table_sh; static flag_context_list_table_ty flag_table_awk; static flag_context_list_table_ty flag_table_lua; +static flag_context_list_table_ty flag_table_modula2; static flag_context_list_table_ty flag_table_d; static flag_context_list_table_ty flag_table_vala; static flag_context_list_table_ty flag_table_tcl; @@ -410,6 +412,7 @@ main (int argc, char *argv[]) init_flag_table_sh (); init_flag_table_awk (); init_flag_table_lua (); + init_flag_table_modula2 (); init_flag_table_d (); init_flag_table_vala (); init_flag_table_tcl (); @@ -444,6 +447,7 @@ main (int argc, char *argv[]) x_sh_extract_all (); x_awk_extract_all (); x_lua_extract_all (); + x_modula2_extract_all (); x_d_extract_all (); x_vala_extract_all (); x_tcl_extract_all (); @@ -529,6 +533,7 @@ main (int argc, char *argv[]) x_sh_keyword (optarg); x_awk_keyword (optarg); x_lua_keyword (optarg); + x_modula2_keyword (optarg); x_d_keyword (optarg); x_vala_keyword (optarg); x_tcl_keyword (optarg); @@ -1154,9 +1159,10 @@ Choice of input file language:\n")); (C, C++, ObjectiveC, PO, Python, Java,\n\ JavaProperties, C#, JavaScript, TypeScript, TSX,\n\ Scheme, Guile, Lisp, EmacsLisp, librep, Rust,\n\ - Go, Ruby, Shell, awk, Lua, D, Smalltalk, Vala,\n\ - Tcl, Perl, PHP, GCC-source, YCP, NXStringTable,\n\ - RST, RSJ, Glade, GSettings, Desktop)\n")); + Go, Ruby, Shell, awk, Lua, Modula-2, D,\n\ + Smalltalk, Vala, Tcl, Perl, PHP, GCC-source,\n\ + YCP, NXStringTable, RST, RSJ, Glade, GSettings,\n\ + Desktop)\n")); printf (_("\ -C, --c++ shorthand for --language=C++\n")); printf (_("\ @@ -1198,8 +1204,8 @@ Language specific options:\n")); (only languages C, C++, ObjectiveC, Python,\n\ Java, C#, JavaScript, TypeScript, TSX, Scheme,\n\ Guile, Lisp, EmacsLisp, librep, Rust, Go, Shell,\n\ - awk, Lua, D, Vala, Tcl, Perl, PHP, GCC-source,\n\ - Glade, GSettings)\n")); + awk, Lua, Modula-2, D, Vala, Tcl, Perl, PHP,\n\ + GCC-source, Glade, GSettings)\n")); printf (_("\ -kWORD, --keyword=WORD look for WORD as an additional keyword\n\ -k, --keyword do not to use default keywords\n")); @@ -1207,8 +1213,8 @@ Language specific options:\n")); (only languages C, C++, ObjectiveC, Python,\n\ Java, C#, JavaScript, TypeScript, TSX, Scheme,\n\ Guile, Lisp, EmacsLisp, librep, Rust, Go, Shell,\n\ - awk, Lua, D, Vala, Tcl, Perl, PHP, GCC-source,\n\ - Glade, GSettings, Desktop)\n")); + awk, Lua, Modula-2, D, Vala, Tcl, Perl, PHP,\n\ + GCC-source, Glade, GSettings, Desktop)\n")); printf (_("\ --flag=WORD:ARG:FLAG additional flag for strings inside the argument\n\ number ARG of keyword WORD\n")); @@ -1216,8 +1222,8 @@ Language specific options:\n")); (only languages C, C++, ObjectiveC, Python,\n\ Java, C#, JavaScript, TypeScript, TSX, Scheme,\n\ Guile, Lisp, EmacsLisp, librep, Rust, Go, Shell,\n\ - awk, Lua, D, Vala, Tcl, Perl, PHP, GCC-source,\n\ - YCP)\n")); + awk, Lua, Modula-2, D, Vala, Tcl, Perl, PHP,\n\ + GCC-source, YCP)\n")); printf (_("\ --tag=WORD:FORMAT defines the behaviour of tagged template literals\n\ with tag WORD\n")); @@ -1750,6 +1756,11 @@ xgettext_record_flag (const char *optionstring) break; case format_pascal: break; + case format_modula2: + flag_context_list_table_insert (&flag_table_modula2, XFORMAT_PRIMARY, + name_start, name_end, + argnum, value, pass); + break; case format_d: flag_context_list_table_insert (&flag_table_d, XFORMAT_SECONDARY, name_start, name_end, @@ -2364,6 +2375,7 @@ language_to_extractor (const char *name) SCANNERS_SH SCANNERS_AWK SCANNERS_LUA + SCANNERS_MODULA2 SCANNERS_D SCANNERS_SMALLTALK SCANNERS_VALA @@ -2462,6 +2474,7 @@ extension_to_language (const char *extension) EXTENSIONS_SH EXTENSIONS_AWK EXTENSIONS_LUA + EXTENSIONS_MODULA2 EXTENSIONS_D EXTENSIONS_SMALLTALK EXTENSIONS_VALA diff --git a/gettext-tools/tests/Makefile.am b/gettext-tools/tests/Makefile.am index 8e80f9d02..c52d478ed 100644 --- a/gettext-tools/tests/Makefile.am +++ b/gettext-tools/tests/Makefile.am @@ -143,6 +143,7 @@ TESTS = gettext-1 gettext-2 \ xgettext-lua-1 xgettext-lua-2 xgettext-lua-3 \ xgettext-lua-stackovfl-1 xgettext-lua-stackovfl-2 \ xgettext-lua-stackovfl-3 xgettext-lua-stackovfl-4 \ + xgettext-modula2-1 xgettext-modula2-2 \ xgettext-objc-1 xgettext-objc-2 \ xgettext-perl-1 xgettext-perl-2 xgettext-perl-3 xgettext-perl-4 \ xgettext-perl-5 xgettext-perl-6 xgettext-perl-7 xgettext-perl-8 \ @@ -211,6 +212,7 @@ TESTS = gettext-1 gettext-2 \ format-librep-1 format-librep-2 \ format-lisp-1 format-lisp-2 \ format-lua-1 format-lua-2 \ + format-modula2-1 format-modula2-2 \ format-php-1 format-php-2 \ format-python-1 format-python-2 \ format-python-brace-1 format-python-brace-2 \ diff --git a/gettext-tools/tests/format-modula2-1 b/gettext-tools/tests/format-modula2-1 new file mode 100644 index 000000000..75017488d --- /dev/null +++ b/gettext-tools/tests/format-modula2-1 @@ -0,0 +1,68 @@ +#! /bin/sh +. "${srcdir=.}/init.sh"; path_prepend_ . ../src + +# Test recognition of Modula-2 format strings. + +cat <<\EOF > f-m2-1.data +# Valid: no argument +"abc%%" +# Valid: one string argument +"abc%s" +# Valid: one character argument +"abc%c" +# Valid: one integer argument +"abc%d" +# Valid: one unsigned integer argument +"abc%u" +# Valid: one unsigned integer argument +"abc%x" +# Valid: three arguments +"abc%c%d%u" +# Valid: three arguments with width +"abc%40s%3c%9u" +# Invalid: unterminated +"abc%" +# Valid: one argument with flags +"abc%-0d" +# Invalid: flags in wrong order +"abc%0-d" +# Invalid: unknown format specifier +"abc%f" +# Invalid: precision +"abc%1.1d" +# Invalid: unterminated +"abc%3" +EOF +: ${XGETTEXT=xgettext} +n=0 +while read comment; do + read string + n=`expr $n + 1` + echo "Gettext(${string});" > f-m2-1-$n.in + ${XGETTEXT} -L Modula-2 -o f-m2-1-$n.po f-m2-1-$n.in || Exit 1 + test -f f-m2-1-$n.po || Exit 1 + fail= + if echo "$comment" | grep 'Valid:' > /dev/null; then + if grep modula2-format f-m2-1-$n.po > /dev/null; then + : + else + fail=yes + fi + else + if grep modula2-format f-m2-1-$n.po > /dev/null; then + fail=yes + else + : + fi + fi + if test -n "$fail"; then + echo "Format string recognition error:" 1>&2 + cat f-m2-1-$n.in 1>&2 + echo "Got:" 1>&2 + cat f-m2-1-$n.po 1>&2 + Exit 1 + fi + rm -f f-m2-1-$n.in f-m2-1-$n.po +done < f-m2-1.data + +Exit 0 diff --git a/gettext-tools/tests/format-modula2-2 b/gettext-tools/tests/format-modula2-2 new file mode 100644 index 000000000..cd24fe790 --- /dev/null +++ b/gettext-tools/tests/format-modula2-2 @@ -0,0 +1,85 @@ +#! /bin/sh +. "${srcdir=.}/init.sh"; path_prepend_ . ../src + +# Test checking of Modula-2 format strings. + +cat <<\EOF > f-m2-2.data +# Valid: %% doesn't count +msgid "abc%%def" +msgstr "xyz" +# Invalid: invalid msgstr +msgid "abc%%def" +msgstr "xyz%" +# Valid: same arguments, with different widths +msgid "abc%2sdef" +msgstr "xyz%3s" +# Invalid: too few arguments +msgid "abc%sdef%u" +msgstr "xyz%s" +# Invalid: too many arguments +msgid "abc%udef" +msgstr "xyz%uvw%c" +# Valid: type compatibility +msgid "abc%u" +msgstr "xyz%x" +# Invalid: type incompatibility +msgid "abc%c" +msgstr "xyz%s" +# Invalid: type incompatibility +msgid "abc%c" +msgstr "xyz%0s" +# Invalid: type incompatibility +msgid "abc%c" +msgstr "xyz%d" +# Invalid: type incompatibility +msgid "abc%c" +msgstr "xyz%u" +# Invalid: type incompatibility +msgid "abc%s" +msgstr "xyz%d" +# Invalid: type incompatibility +msgid "abc%0s" +msgstr "xyz%d" +# Invalid: type incompatibility +msgid "abc%s" +msgstr "xyz%u" +# Invalid: type incompatibility +msgid "abc%d" +msgstr "xyz%u" +EOF + +: ${MSGFMT=msgfmt} +n=0 +while read comment; do + read msgid_line + read msgstr_line + n=`expr $n + 1` + cat < f-m2-2-$n.po +#, modula2-format +${msgid_line} +${msgstr_line} +EOF + fail= + if echo "$comment" | grep 'Valid:' > /dev/null; then + if ${MSGFMT} --check-format -o f-m2-2-$n.mo f-m2-2-$n.po; then + : + else + fail=yes + fi + else + ${MSGFMT} --check-format -o f-m2-2-$n.mo f-m2-2-$n.po 2> /dev/null + if test $? = 1; then + : + else + fail=yes + fi + fi + if test -n "$fail"; then + echo "Format string checking error:" 1>&2 + cat f-m2-2-$n.po 1>&2 + Exit 1 + fi + rm -f f-m2-2-$n.po f-m2-2-$n.mo +done < f-m2-2.data + +Exit 0 diff --git a/gettext-tools/tests/xgettext-modula2-1 b/gettext-tools/tests/xgettext-modula2-1 new file mode 100755 index 000000000..af135cc90 --- /dev/null +++ b/gettext-tools/tests/xgettext-modula2-1 @@ -0,0 +1,102 @@ +#!/bin/sh +. "${srcdir=.}/init.sh"; path_prepend_ . ../src + +# Test Modula-2 support: Simple things. + +cat <<\EOF > xg-m2-1.mod +BEGIN + (* A one-line comment. *) + test1 = Gettext("Test String 1"); + (* Two one-line *) + (* comments *) + test2 = Gettext("Test String 2"); + (* A multi-line + comment. *) + test3 = Gettext("Test String 3"); + (* A (* nesting *) (* one-line *) comment. *) + test4 = Gettext("Test String 4"); + (* A (* + nesting + *) + (* + multi-line *) + comment. *) + test5 = Gettext("Test String 5"); + (* + Gettext("Not extracted"); + *) + dummy; + (* Modula-2 has string literal concatenation. *) + test6 = Gettext("Test " + + "String " + + "6"); + (* Empty string. *) + test7 = Gettext(""); + + (* Sprintf1 expects a format string. *) + Sprintf1(Gettext("weight %u"), w); + + WriteString(Gettext("Test string 10")); + WriteString(Sprintf1(NGettext("%0s piece of cake", "%s pieces of cake", n), s)); +} +EOF + +: ${XGETTEXT=xgettext} +${XGETTEXT} --omit-header --no-location -c -d xg-m2-1.tmp xg-m2-1.mod || Exit 1 +LC_ALL=C tr -d '\r' < xg-m2-1.tmp.po > xg-m2-1.po || Exit 1 + +cat <<\EOF > xg-m2-1.ok +#. A one-line comment. +msgid "Test String 1" +msgstr "" + +#. Two one-line +#. comments +msgid "Test String 2" +msgstr "" + +#. A multi-line +#. comment. +msgid "Test String 3" +msgstr "" + +#. A (* nesting *) (* one-line *) comment. +msgid "Test String 4" +msgstr "" + +#. A (* +#. nesting +#. *) +#. (* +#. multi-line *) +#. comment. +msgid "Test String 5" +msgstr "" + +#. Modula-2 has string literal concatenation. +msgid "Test String 6" +msgstr "" + +#. Empty string. +msgid "" +msgstr "" + +#. Sprintf1 expects a format string. +#, modula2-format +msgid "weight %u" +msgstr "" + +msgid "Test string 10" +msgstr "" + +#, modula2-format +msgid "%0s piece of cake" +msgid_plural "%s pieces of cake" +msgstr[0] "" +msgstr[1] "" +EOF + +: ${DIFF=diff} +${DIFF} xg-m2-1.ok xg-m2-1.po || Exit 1 + +exit 0 diff --git a/gettext-tools/tests/xgettext-modula2-2 b/gettext-tools/tests/xgettext-modula2-2 new file mode 100755 index 000000000..99c2f46ae --- /dev/null +++ b/gettext-tools/tests/xgettext-modula2-2 @@ -0,0 +1,138 @@ +#!/bin/sh +. "${srcdir=.}/init.sh"; path_prepend_ . ../src + +# Test Modula-2 support: propagation of modula2-format'. + +cat <<\EOF > xg-m2-2.mod +BEGIN + Gettext ("Hello10"); + (Gettext ("Hello11")); + ((Gettext ("Hello12"))); + Gettext (Gettext ("Hello13")); + (Gettext (Gettext ("Hello14"))); + ((Gettext (Gettext ("Hello15")))); + Gettext ((Gettext ("Hello16"))); + Gettext (((Gettext ("Hello17")))); + Gettext (foo(), Gettext ("Hello18")); + + Sprintf0 (Gettext ("Hello20")); + Sprintf0 ((Gettext ("Hello21"))); + Sprintf0 (((Gettext ("Hello22")))); + Sprintf0 (Gettext (Gettext ("Hello23"))); + Sprintf0 ((Gettext (Gettext ("Hello24")))); + Sprintf0 (((Gettext (Gettext ("Hello25"))))); + Sprintf0 (Gettext ((Gettext ("Hello26")))); + Sprintf0 (Gettext (((Gettext ("Hello27"))))); + Sprintf0 (Gettext (foo(), Gettext ("Hello28"))); + + WriteString (Gettext ("Hello30")); + WriteString ((Gettext ("Hello31"))); + WriteString (((Gettext ("Hello32")))); + WriteString (Gettext (Gettext ("Hello33"))); + WriteString ((Gettext (Gettext ("Hello34")))); + WriteString (((Gettext (Gettext ("Hello35"))))); + WriteString (Gettext ((Gettext ("Hello36")))); + WriteString (Gettext (((Gettext ("Hello37"))))); + WriteString (Gettext (foo(), Gettext ("Hello38"))); +END; +EOF + +: ${XGETTEXT=xgettext} +${XGETTEXT} --omit-header --no-location -c -d xg-m2-2.tmp xg-m2-2.mod || Exit 1 +LC_ALL=C tr -d '\r' < xg-m2-2.tmp.po > xg-m2-2.po || Exit 1 + +cat <<\EOF > xg-m2-2.ok +msgid "Hello10" +msgstr "" + +msgid "Hello11" +msgstr "" + +msgid "Hello12" +msgstr "" + +msgid "Hello13" +msgstr "" + +msgid "Hello14" +msgstr "" + +msgid "Hello15" +msgstr "" + +msgid "Hello16" +msgstr "" + +msgid "Hello17" +msgstr "" + +msgid "Hello18" +msgstr "" + +#, modula2-format +msgid "Hello20" +msgstr "" + +#, modula2-format +msgid "Hello21" +msgstr "" + +#, modula2-format +msgid "Hello22" +msgstr "" + +#, modula2-format +msgid "Hello23" +msgstr "" + +#, modula2-format +msgid "Hello24" +msgstr "" + +#, modula2-format +msgid "Hello25" +msgstr "" + +#, modula2-format +msgid "Hello26" +msgstr "" + +#, modula2-format +msgid "Hello27" +msgstr "" + +msgid "Hello28" +msgstr "" + +msgid "Hello30" +msgstr "" + +msgid "Hello31" +msgstr "" + +msgid "Hello32" +msgstr "" + +msgid "Hello33" +msgstr "" + +msgid "Hello34" +msgstr "" + +msgid "Hello35" +msgstr "" + +msgid "Hello36" +msgstr "" + +msgid "Hello37" +msgstr "" + +msgid "Hello38" +msgstr "" +EOF + +: ${DIFF=diff} +${DIFF} xg-m2-2.ok xg-m2-2.po || Exit 1 + +exit 0 diff --git a/gettext-tools/woe32dll/gettextsrc-exports.c b/gettext-tools/woe32dll/gettextsrc-exports.c index 319e16468..939ed5b0e 100644 --- a/gettext-tools/woe32dll/gettextsrc-exports.c +++ b/gettext-tools/woe32dll/gettextsrc-exports.c @@ -42,6 +42,7 @@ VARIABLE(formatstring_kde_kuit) VARIABLE(formatstring_librep) VARIABLE(formatstring_lisp) VARIABLE(formatstring_lua) +VARIABLE(formatstring_modula2) VARIABLE(formatstring_objc) VARIABLE(formatstring_parsers) VARIABLE(formatstring_pascal)