- Merge clisp specific changes.
-- Update documentation for plural features and bind_textdomain_codeset.
-
-- Treatment of plurals in pot-files: Use the following pattern:
- msgid "a piece of cake"
- msgid_plural "%d pieces of cake"
- msgstr0 "un morceau de gateau"
- msgstr1 "%d morceaux de gateau"
- or possibly:
- msgid "a piece of cake"
- msgid_plural "%d pieces of cake"
- msgstr[0] "un morceau de gateau"
- msgstr[1] "%d morceaux de gateau"
-
- Work towards integration with automake.
- Stop documenting AM_WITH_NLS. AM_GNU_GETTEXT is the right macro to use.
-- Unify intlh.inst.in and libgettext.h.
+- Unify intlh.inst.in and libgettext.h. libgettext.h depends on
+ HAVE_LC_MESSAGES and on <locale.h> being already included.
- What about gettext_noop? Kill it or maybe apply this:
+2001-01-01 Bruno Haible <haible@clisp.cons.org>
+
+ Implement plural form handling.
+ * gettext.texi: Fix menus.
+ (PO Files): Document entries for plural forms.
+ (xgettext Invocation): Extended --keyword argument syntax. More
+ default keywords.
+ (MO Files): Document format of entries for plural forms.
+ (Charset conversion): New node, mostly from glibc-2.2 manual.
+ (Plural forms): Likewise.
+ (GUI program problems): Likewise, without the GCC function macro.
+ (Optimized gettext): Remove section about dcgettext macro. All
+ caching is now done inside the *gettext functions.
+ (Comparison): Move the example about Polish to node "Plural forms".
+ Remove the print_month_info example.
+
2000-11-12 Bruno Haible <haible@clisp.cons.org>
* matrix.texi: Update.
This file provides documentation for GNU @code{gettext} utilities.
It also serves as a reference for the free Translation Project.
-Copyright (C) 1995, 1996, 1997, 1998 Free Software Foundation, Inc.
+Copyright (C) 1995, 1996, 1997, 1998, 2001 Free Software Foundation, Inc.
Permission is granted to make and distribute verbatim copies of
this manual provided the copyright notice and this permission notice
@page
@vskip 0pt plus 1filll
-Copyright @copyright{} 1995, 1996, 1997, 1998 Free Software Foundation, Inc.
+Copyright @copyright{} 1995, 1996, 1997, 1998, 2001 Free Software Foundation, Inc.
Permission is granted to make and distribute verbatim copies of
this manual provided the copyright notice and this permission notice
* Obsolete Entries:: Obsolete Entries
* Modifying Translations:: Modifying Translations
* Modifying Comments:: Modifying Comments
+* Subedit:: Mode for Editing Translations
* Auxiliary:: Consulting Auxiliary PO Files
Producing Binary MO Files
* Interface to gettext:: The interface
* Ambiguities:: Solving ambiguities
* Locating Catalogs:: Locating message catalog files
+* Charset conversion:: How to request conversion to Unicode
+* Plural forms:: Additional functions for handling plurals
+* GUI program problems:: Another technique for solving ambiguities
* Optimized gettext:: Optimization of the *gettext functions
Temporary Notes for the Programmers Chapter
@end table
+A different kind of entry is used for translations which involve
+plural forms.
+
+@example
+@var{white-space}
+# @var{translator-comments}
+#. @var{automatic-comments}
+#: @var{reference}@dots{}
+#, @var{flag}@dots{}
+msgid @var{untranslated-string-singular}
+msgid_plural @var{untranslated-string-plural}
+msgstr[0] @var{translated-string-case-0}
+...
+msgstr[N] @var{translated-string-case-n}
+@end example
+
It happens that some lines, usually whitespace or comments, follow the
very last entry of a PO file. Such lines are not part of any entry,
and PO mode is unable to take action on those lines. By using the
for strings in the first argument of each call to the function or macro
@var{id}. If @var{keywordspec} is of the form
@samp{@var{id}:@var{argnum}}, @code{xgettext} looks for strings in the
-@var{argnum}th argument of the call.
+@var{argnum}th argument of the call. If @var{keywordspec} is of the form
+@samp{@var{id}:@var{argnum1},@var{argnum2}}, @code{xgettext} looks for
+strings in the @var{argnum1}st argument and in the @var{argnum2}nd argument
+of the call, and treats them as singular/plural variants for a message
+with plural handling.
The default keyword specifications, which are always looked for if not
explicitly disabled, are @code{gettext}, @code{dgettext:2},
-@code{dcgettext:2} and @code{gettext_noop}.
+@code{dcgettext:2}, @code{ngettext:1,2}, @code{dngettext:2,3},
+@code{dcngettext:2,3}, and @code{gettext_noop}.
@item -m [@var{string}]
@itemx --msgstr-prefix[=@var{string}]
an offset which is a multiple of the alignment value. On some RISC
machines, a correct alignment will speed things up.
+Plural forms are stored by letting the plural of the original string
+follow the singular of the original string, separated by a
+@key{NUL} byte. The length which appears in the string descriptor
+includes both. However, only the singular of the original string
+takes part in the hash table lookup. The plural variants of the
+translation are all stored consecutively, separated by
+@key{NUL} bytes. Here also, the length in the string descriptor
+includes all of them.
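The storage scheme described above can be sketched as a small lookup routine. This is only an illustration, not the code used by the library: the name nth_variant is hypothetical, and it assumes that the length from the string descriptor excludes the final terminating NUL byte.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return variant number INDEX from BUF, a region of
   LEN bytes that holds strings stored back to back and separated by NUL
   bytes.  LEN is assumed to exclude the final terminating NUL.
   Returns NULL when the region holds fewer than INDEX + 1 variants.  */
static const char *
nth_variant (const char *buf, size_t len, unsigned long index)
{
  const char *p = buf;
  const char *end = buf + len;

  while (index > 0)
    {
      /* Find the separator that ends the current variant.  */
      const char *nul = memchr (p, '\0', (size_t) (end - p));
      if (nul == NULL)
        return NULL;            /* no further variant is stored */
      p = nul + 1;              /* step over the separator */
      index--;
    }
  return p;
}
```

With the region "un\0deux\0trois" (13 bytes), index 1 yields "deux" and index 3 yields NULL.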
+
Nothing prevents a MO file from having embedded @key{NUL}s in strings.
However, the program interface currently used already presumes
that strings are @key{NUL} terminated, so embedded @key{NUL}s are
somewhat useless. But MO file format is general enough so other
interfaces would be later possible, if for example, we ever want to
implement wide characters right in MO files, where @key{NUL} bytes may
-accidently appear.
+accidentally appear. (No, we don't want to have wide characters in MO
+files. They would make the file unnecessarily large, and the
+@samp{wchar_t} type being platform dependent, MO files would be
+platform dependent as well.)
This particular issue has been strongly debated in the GNU
@code{gettext} development forum, and it is expectable that MO file
* Interface to gettext:: The interface
* Ambiguities:: Solving ambiguities
* Locating Catalogs:: Locating message catalog files
+* Charset conversion:: How to request conversion to Unicode
+* Plural forms:: Additional functions for handling plurals
+* GUI program problems:: Another technique for solving ambiguities
* Optimized gettext:: Optimization of the *gettext functions
@end menu
paths should always be avoided to avoid dependencies and
unreliabilities.
-@node Locating Catalogs, Optimized gettext, Ambiguities, gettext
+@node Locating Catalogs, Charset conversion, Ambiguities, gettext
@subsection Locating Message Catalog Files
Because many different languages for many different packages have to be
@code{setlocale} its behavior in setting the locale values is simulated
by looking at the environment variables.}
-@node Optimized gettext, , Locating Catalogs, gettext
+@node Charset conversion, Plural forms, Locating Catalogs, gettext
+@subsection How to specify the output character set @code{gettext} uses
+
+@code{gettext} not only looks up a translation in a message catalog. It
+also converts the translation on the fly to the desired output character
+set. This is useful if the user is working in a different character set
+than the translator who created the message catalog, because it avoids
+distributing variants of message catalogs which differ only in the
+character set.
+
+The output character set is, by default, the value of @code{nl_langinfo
+(CODESET)}, which depends on the @code{LC_CTYPE} part of the current
+locale. But programs which store strings in a locale independent way
+(e.g. UTF-8) can request that @code{gettext} and related functions
+return the translations in that encoding, by use of the
+@code{bind_textdomain_codeset} function.
+
+Note that the @var{msgid} argument to @code{gettext} is not subject to
+character set conversion. Also, when @code{gettext} does not find a
+translation for @var{msgid}, it returns @var{msgid} unchanged --
+independently of the current output character set. It is therefore
+recommended that all @var{msgid}s be US-ASCII strings.
+
+@deftypefun {char *} bind_textdomain_codeset (const char *@var{domainname}, const char *@var{codeset})
+The @code{bind_textdomain_codeset} function can be used to specify the
+output character set for message catalogs for domain @var{domainname}.
+The @var{codeset} argument must be a valid codeset name which can be used
+for the @code{iconv_open} function, or a null pointer.
+
+If the @var{codeset} parameter is the null pointer,
+@code{bind_textdomain_codeset} returns the currently selected codeset
+for the domain with the name @var{domainname}. It returns @code{NULL} if
+no codeset has yet been selected.
+
+The @code{bind_textdomain_codeset} function can be used several times.
+If used multiple times with the same @var{domainname} argument, the
+later call overrides the settings made by the earlier one.
+
+The @code{bind_textdomain_codeset} function returns a pointer to a
+string containing the name of the selected codeset. The string is
+allocated internally in the function and must not be changed by the
+user. If the system ran out of memory during the execution of
+@code{bind_textdomain_codeset}, the return value is @code{NULL} and the
+global variable @code{errno} is set accordingly.
+@end deftypefun
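A minimal sketch of how a program might use this function follows. The helper names use_utf8_output and selected_codeset are hypothetical, and the sketch assumes a libintl implementation that behaves as described above.

```c
#include <libintl.h>
#include <stddef.h>

/* Hypothetical helper: request that translations for DOMAIN be returned
   in UTF-8.  Returns 1 if a codeset is now selected, 0 if the call
   failed (errno then describes the error).  */
static int
use_utf8_output (const char *domain)
{
  return bind_textdomain_codeset (domain, "UTF-8") != NULL;
}

/* Hypothetical helper: query the codeset currently selected for DOMAIN
   without changing it.  NULL means none has been selected yet.  */
static const char *
selected_codeset (const char *domain)
{
  return bind_textdomain_codeset (domain, NULL);
}
```

A program that keeps its strings in UTF-8 internally would call use_utf8_output once for each of its domains, typically right after bindtextdomain.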
+
+@node Plural forms, GUI program problems, Charset conversion, gettext
+@subsection Additional functions for plural forms
+
+The functions of the @code{gettext} family described so far (and all the
+@code{catgets} functions as well) have one problem in the real world
+which has been neglected completely in all existing approaches. What
+is meant here is the handling of plural forms.
+
+Looking through Unix source code before the time anybody thought about
+internationalization (and, sadly, even afterwards) one can often find
+code similar to the following:
+
+@smallexample
+ printf ("%d file%s deleted", n, n == 1 ? "" : "s");
+@end smallexample
+
+@noindent
+After the first complaints from people internationalizing the code, people
+either completely avoided formulations like this or used strings like
+@code{"file(s)"}. Both look unnatural and should be avoided. The first
+attempts to solve the problem correctly looked like this:
+
+@smallexample
+ if (n == 1)
+ printf ("%d file deleted", n);
+ else
+ printf ("%d files deleted", n);
+@end smallexample
+
+But this does not solve the problem. It helps languages where the
+plural form of a noun is not simply constructed by adding an `s', but
+that is all. Once again people fell into the trap of believing the
+rules their language is using are universal. But the handling of plural
+forms differs widely between the language families. For example,
+Rafal Maszkowski @code{<rzm@@mat.uni.torun.pl>} reports:
+
+@quotation
+In Polish we use e.g. plik (file) this way:
+@example
+1 plik
+2,3,4 pliki
+5-21 pliko'w
+22-24 pliki
+25-31 pliko'w
+@end example
+and so on (o' means 8859-2 oacute which should be rather okreska,
+similar to aogonek).
+@end quotation
+
+There are two things which can differ between languages (and even inside
+language families):
+
+@itemize @bullet
+@item
+The way plural forms are built differs. This is a problem with
+languages which have many irregularities. German, for instance, is a
+drastic case. Though English and German are part of the same language
+family (Germanic), the almost regular forming of plural noun forms
+(appending an `s') is hardly found in German.
+
+@item
+The number of plural forms differs. This is somewhat surprising for
+those who only have experience with Romanic and Germanic languages
+since here the number is the same (there are two).
+
+But other language families have only one form or many forms. More
+information on this is given later in this section.
+@end itemize
+
+The consequence of this is that application writers should not try to
+solve the problem in their code. This would be localization since it is
+only usable for certain, hardcoded language environments. Instead the
+extended @code{gettext} interface should be used.
+
+These extra functions take, instead of the one key string, two
+strings and a numerical argument. The idea behind this is that using
+the numerical argument and the first string as a key, the implementation
+can select, using rules specified by the translator, the right plural
+form. The two string arguments then will be used to provide a return
+value in case no message catalog is found (similar to the normal
+@code{gettext} behavior). In this case the rules for Germanic languages
+are used and it is assumed that the first string argument is the singular
+form, the second the plural form.
+
+This has the consequence that programs without language catalogs can
+display the correct strings only if the program itself is written using
+a Germanic language. This is a limitation but since the GNU C library
+(as well as the GNU @code{gettext} package) are written as part of the
+GNU package and the coding standards for the GNU project require programs
+to be written in English, this solution nevertheless fulfills its
+purpose.
+
+@deftypefun {char *} ngettext (const char *@var{msgid1}, const char *@var{msgid2}, unsigned long int @var{n})
+The @code{ngettext} function is similar to the @code{gettext} function
+as it finds the message catalogs in the same way. But it takes two
+extra arguments. The @var{msgid1} parameter must contain the singular
+form of the string to be converted. It is also used as the key for the
+search in the catalog. The @var{msgid2} parameter is the plural form.
+The parameter @var{n} is used to determine the plural form. If no
+message catalog is found, @var{msgid1} is returned if @code{n == 1},
+otherwise @var{msgid2}.
+
+An example for the use of this function is:
+
+@smallexample
+ printf (ngettext ("%d file removed", "%d files removed", n), n);
+@end smallexample
+
+Please note that the numeric value @var{n} has to be passed to the
+@code{printf} function as well. It is not sufficient to pass it only to
+@code{ngettext}.
+@end deftypefun
+
+@deftypefun {char *} dngettext (const char *@var{domain}, const char *@var{msgid1}, const char *@var{msgid2}, unsigned long int @var{n})
+The @code{dngettext} function is similar to the @code{dgettext} function
+in the way the message catalog is selected. The difference is that it
+takes two extra parameters to provide the correct plural form. These two
+parameters are handled in the same way @code{ngettext} handles them.
+@end deftypefun
+
+@deftypefun {char *} dcngettext (const char *@var{domain}, const char *@var{msgid1}, const char *@var{msgid2}, unsigned long int @var{n}, int @var{category})
+The @code{dcngettext} function is similar to the @code{dcgettext} function
+in the way the message catalog is selected. The difference is that it
+takes two extra parameters to provide the correct plural form. These two
+parameters are handled in the same way @code{ngettext} handles them.
+@end deftypefun
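The fallback behavior described for these functions can be sketched as follows. The helper names removed_format and report_removed are hypothetical. Because n is an unsigned long here, the sketch uses %lu rather than the %d of the example above.

```c
#include <libintl.h>
#include <stdio.h>

/* Hypothetical helper: pick the format string matching N.  When no
   message catalog is found, this yields the singular string for N == 1
   and the plural string otherwise.  */
static const char *
removed_format (unsigned long n)
{
  return ngettext ("%lu file removed", "%lu files removed", n);
}

/* Hypothetical helper: print the message.  N is passed twice, once to
   select the form and once as the value to print.  */
static void
report_removed (unsigned long n)
{
  printf (removed_format (n), n);
  putchar ('\n');
}
```

Keeping both complete sentences as literal arguments of ngettext lets xgettext extract them as one singular/plural pair.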
+
+Now, how do these functions solve the problem of the plural forms?
+Without the input of linguists (which was not available) it was not
+possible to determine whether there are only a few different forms in
+which plural forms are formed or whether the number can increase with
+every new supported language.
+
+Therefore the solution implemented is to allow the translator to specify
+the rules of how to select the plural form. Since the formula varies
+with every language this is the only viable solution except for
+hardcoding the information in the code (which still would require the
+possibility of extensions to not prevent the use of new languages). The
+details are explained in the GNU @code{gettext} manual. Here only a
+bit of information is provided.
+
+The information about the plural form selection has to be stored in the
+header entry (the one with the empty @code{msgid} string). There should
+be something like:
+
+@smallexample
+ nplurals=2; plural=n == 1 ? 0 : 1
+@end smallexample
+
+The @code{nplurals} value must be a decimal number which specifies how
+many different plural forms exist for this language. The string
+following @code{plural} is an expression which uses the C language
+syntax. Exceptions are that no negative numbers are allowed, numbers
+must be decimal, and the only variable allowed is @code{n}. This
+expression will be evaluated whenever one of the functions
+@code{ngettext}, @code{dngettext}, or @code{dcngettext} is called. The
+numeric value passed to these functions is then substituted for all uses
+of the variable @code{n} in the expression. The resulting value then
+must be greater than or equal to zero and smaller than the value of
+@code{nplurals}.
+
+@noindent
+The following rules are known at this point. The languages are listed
+with their families. But this does not necessarily mean the information
+can be generalized for the whole family (as can be easily seen in the
+table below).@footnote{Additions are welcome. Send appropriate information to
+@email{bug-glibc-manual@@gnu.org}.}
+
+@table @asis
+@item Only one form:
+Some languages only require one single form. There is no distinction
+between the singular and plural form. An appropriate header entry
+would look like this:
+
+@smallexample
+nplurals=1; plural=0
+@end smallexample
+
+@noindent
+Languages with this property include:
+
+@table @asis
+@item Finno-Ugric family
+Hungarian
+@item Asian family
+Japanese
+@item Turkic/Altaic family
+Turkish
+@end table
+
+@item Two forms, singular used for one only
+This is the form used in most existing programs since it is what English
+is using. A header entry would look like this:
+
+@smallexample
+nplurals=2; plural=n != 1
+@end smallexample
+
+(Note: this uses the feature of C expressions that boolean expressions
+have the value zero or one.)
+
+@noindent
+Languages with this property include:
+
+@table @asis
+@item Germanic family
+Danish, Dutch, English, German, Norwegian, Swedish
+@item Finno-Ugric family
+Finnish
+@item Latin/Greek family
+Greek
+@item Semitic family
+Hebrew
+@item Romanic family
+Italian, Spanish
+@item Artificial
+Esperanto
+@end table
+
+@item Two forms, singular used for zero and one
+Exceptional case in the language family. The header entry would be:
+
+@smallexample
+nplurals=2; plural=n>1
+@end smallexample
+
+@noindent
+Languages with this property include:
+
+@table @asis
+@item Romanic family
+French
+@end table
+
+@item Three forms, special cases for one and two
+The header entry would be:
+
+@smallexample
+nplurals=3; plural=n==1 ? 0 : n==2 ? 1 : 2
+@end smallexample
+
+@noindent
+Languages with this property include:
+
+@table @asis
+@item Celtic
+Gaeilge
+@end table
+
+@item Three forms, special case for one and all numbers ending in 2, 3, or 4
+The header entry would look like this:
+
+@smallexample
+nplurals=3; plural=n==1 ? 0 : n%10>=2 && n%10<=4 ? 1 : 2
+@end smallexample
+
+@noindent
+Languages with this property include:
+
+@table @asis
+@item Slavic family
+Russian
+@end table
+
+@item Three forms, special case for one and some numbers ending in 2, 3, or 4
+The header entry would look like this:
+
+@smallexample
+nplurals=3; plural=n==1 ? 0 : \
+ n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2
+@end smallexample
+
+(Continuation in the next line is possible.)
+
+@noindent
+Languages with this property include:
+
+@table @asis
+@item Slavic family
+Polish
+@end table
+
+@item Four forms, special case for one and all numbers ending in 2, 3, or 4
+The header entry would look like this:
+
+@smallexample
+nplurals=4; plural=n==1 ? 0 : n%10==2 ? 1 : n%10==3 || n%10==4 ? 2 : 3
+@end smallexample
+
+@noindent
+Languages with this property include:
+
+@table @asis
+@item Slavic family
+Slovenian
+@end table
+@end table
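Since the plural expressions use C syntax, some of the header entries above can be transcribed directly into C functions to check them against known cases. This is only a sketch for experimenting with the rules: the function names are hypothetical, and the library itself parses the expression from the header entry at run time.

```c
/* Two forms, singular used for one only (e.g. English).  */
static unsigned long
plural_english (unsigned long n)
{
  return n != 1;
}

/* Two forms, singular used for zero and one (e.g. French).  */
static unsigned long
plural_french (unsigned long n)
{
  return n > 1;
}

/* Three forms, special case for one and some numbers ending in
   2, 3, or 4 (e.g. Polish).  */
static unsigned long
plural_polish (unsigned long n)
{
  return n == 1 ? 0
         : n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 || n % 100 >= 20) ? 1
         : 2;
}

/* The computed index must be smaller than nplurals.  */
static int
plural_index_ok (unsigned long index, unsigned long nplurals)
{
  return index < nplurals;
}
```

For the Polish rule this gives index 0 for 1, index 1 for 2 to 4 and 22 to 24, and index 2 for 5 to 21 and 25 to 31, which matches the table quoted earlier.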
+
+@node GUI program problems, Optimized gettext, Plural forms, gettext
+@subsection How to use @code{gettext} in GUI programs
+
+One place where the @code{gettext} functions, if used normally, have big
+problems is within programs with graphical user interfaces (GUIs). The
+problem is that many of the strings which have to be translated are very
+short. They have to appear in pull-down menus which restricts the
+length. But strings which do not contain entire sentences or at
+least large fragments of a sentence may appear in more than one
+situation in the program but might have different translations. This is
+especially true for the one-word strings which are frequently used in
+GUI programs.
+
+As a consequence many people say that the @code{gettext} approach is
+wrong and instead @code{catgets} should be used which indeed does not
+have this problem. But there is a very simple and powerful method to
+handle this kind of problem with the @code{gettext} functions.
+
+@noindent
+As an example consider the following fictional situation. A GUI program
+has a menu bar with the following entries:
+
+@smallexample
++------------+------------+--------------------------------------+
+| File | Printer | |
++------------+------------+--------------------------------------+
+| Open | | Select |
+| New | | Open |
++----------+ | Connect |
+ +----------+
+@end smallexample
+
+To have the strings @code{File}, @code{Printer}, @code{Open},
+@code{New}, @code{Select}, and @code{Connect} translated there has to be
+at some point in the code a call to a function of the @code{gettext}
+family. But in two places the string passed into the function would be
+@code{Open}. The translations might not be the same and therefore we
+are in the dilemma described above.
+
+One solution to this problem is to artificially lengthen the strings
+to make them unambiguous. But what would the program do if no
+translation is available? The lengthened string is not what should be
+printed. So we should use a slightly modified version of the functions.
+
+To lengthen the strings a uniform method should be used. E.g., in the
+example above the strings could be chosen as
+
+@smallexample
+Menu|File
+Menu|Printer
+Menu|File|Open
+Menu|File|New
+Menu|Printer|Select
+Menu|Printer|Open
+Menu|Printer|Connect
+@end smallexample
+
+Now all the strings are different and if now instead of @code{gettext}
+the following little wrapper function is used, everything works just
+fine:
+
+@cindex sgettext
+@smallexample
+ char *
+ sgettext (const char *msgid)
+ @{
+ char *msgval = gettext (msgid);
+ if (msgval == msgid)
+ msgval = strrchr (msgid, '|') + 1;
+ return msgval;
+ @}
+@end smallexample
+
+What this little function does is to recognize the case when no
+translation is available. This can be done very efficiently by a
+pointer comparison since the return value is the input value. If there
+is no translation we know that the input string is in the format we used
+for the Menu entries and therefore contains a @code{|} character. We
+simply search for the last occurrence of this character and return a
+pointer to the character following it. That's it!
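The wrapper above relies on every argument containing a @code{|} character. If a string without the separator is passed and no translation exists, strrchr returns a null pointer and the result is undefined. A more defensive variant might look like the following sketch. The name sgettext_checked is hypothetical, and the const return type is a choice made here, not part of the original wrapper.

```c
#include <libintl.h>
#include <string.h>

/* Hypothetical variant of sgettext that tolerates strings without a
   '|' separator.  */
static const char *
sgettext_checked (const char *msgid)
{
  const char *msgval = gettext (msgid);

  /* A pointer-identical result means that no translation was found.  */
  if (msgval == msgid)
    {
      const char *sep = strrchr (msgid, '|');
      if (sep != NULL)
        msgval = sep + 1;       /* strip the disambiguating prefix */
    }
  return msgval;
}
```

Without a message catalog, sgettext_checked ("Menu|File|Open") yields "Open", and a plain string such as "Plain" is returned unchanged.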
+
+If one now consistently uses the lengthened string form and replaces
+the @code{gettext} calls with calls to @code{sgettext} (this is normally
+limited to very few places in the GUI implementation) then it is
+possible to produce a program which can be internationalized.
+
+The other @code{gettext} functions (@code{dgettext}, @code{dcgettext}
+and the @code{ngettext} equivalents) can and should have corresponding
+functions as well which look almost identical, except for the parameters
+and the call to the underlying function.
+
+Now there is of course the question why such functions do not exist in
+the GNU gettext package. There are two parts to the answer.
+
+@itemize @bullet
+@item
+They are easy to write and therefore can be provided by the project they
+are used in. This is not an answer by itself and must be seen together
+with the second part which is:
+
+@item
+There is no way the gettext package can contain a version which can work
+everywhere. The problem is the selection of the character to separate
+the prefix from the actual string in the lengthened string. The
+examples above used @code{|} which is a quite good choice because it
+resembles a notation frequently used in this context and it also is a
+character not often used in message strings.
+
+But what if the character is used in message strings? Or if the chosen
+character is not available in the character set on the machine one
+compiles on (e.g., @code{|} is not required to exist for @w{ISO C}; this is
+why the @file{iso646.h} file exists in @w{ISO C} programming environments)?
+@end itemize
+
+There is only one more comment to be made. The wrapper function above
+requires that the translation strings are not lengthened themselves.
+This is only logical. There is no need to disambiguate the strings
+(since they are never used as keys for a search) and one also saves
+quite some memory and disk space by doing this.
+
+@node Optimized gettext, , GUI program problems, gettext
@subsection Optimization of the *gettext functions
At this point of the discussion we should talk about an advantage of the
@noindent
But this solution is not usable in all situation (e.g. when the locale
-selection changes) nor is it good readable.
-
-The GNU C compiler, version 2.7 and above, provide another solution for
-this. To describe this we show here some lines of the
-@file{intl/libgettext.h} file. For an explanation of the expression
-command block see @ref{Statement Exprs, , Statements and Declarations in
-Expressions, gcc, The GNU CC Manual}.
-
-@example
-@group
-# if defined __GNUC__ && __GNUC__ == 2 && __GNUC_MINOR__ >= 7
-extern int _nl_msg_cat_cntr;
-# define dcgettext(domainname, msgid, category) \
- (__extension__ \
- (@{ \
- char *result; \
- if (__builtin_constant_p (msgid)) \
- @{ \
- static char *__translation__; \
- static int __catalog_counter__; \
- if (! __translation__ \
- || __catalog_counter__ != _nl_msg_cat_cntr) \
- @{ \
- __translation__ = \
- dcgettext__ ((domainname), (msgid), (category)); \
- __catalog_counter__ = _nl_msg_cat_cntr; \
- @} \
- result = __translation__; \
- @} \
- else \
- result = dcgettext__ ((domainname), (msgid), (category)); \
- result; \
- @}))
-# endif
-@end group
-@end example
-
-The interesting thing here is the @code{__builtin_constant_p} predicate.
-This is evaluated at compile time and so optimization can take place
-immediately. Here two cases are distinguished: the argument to
-@code{gettext} is not a constant value in which case simply the function
-@code{dcgettext__} is called, the real implementation of the
-@code{dcgettext} function.
+selection changes) nor does it lead to legible code.
-If the string argument @emph{is} constant we can reuse the once gained
-translation when the locale selection has not changed. This is exactly
-what is done here. The @code{_nl_msg_cat_cntr} variable is defined in
-the @file{loadmsgcat.c} which is available in @file{libintl.a} and is
-changed whenever a new message catalog is loaded.
+For this reason, GNU @code{gettext} caches previous translation results.
+When the same translation is requested twice, with no new message
+catalogs being loaded in between, @code{gettext} will, the second time,
+find the result through a single cache lookup.
@node Comparison, Using libintl.a, gettext, Programmers
@section Comparing the Two Interfaces
difficult one can also consider changing one of the conflicting string a
little bit. But it is not impossible to overcome.
-@c Should this be here?
-Translator note: It is perhaps appropriate here to tell those English
-speaking programmers that the plural form of a noun cannot be formed by
-appending a single `s'. Most other languages use different methods.
-Even the above form is not general enough to cope with all languages.
-Rafal Maszkowski <rzm@@mat.uni.torun.pl> reports:
-
-@quotation
-In Polish we use e.g. plik (file) this way:
-@example
-1 plik
-2,3,4 pliki
-5-21 pliko'w
-22-24 pliki
-25-31 pliko'w
-@end example
-and so on (o' means 8859-2 oacute which should be rather okreska,
-similar to aogonek).
-@end quotation
-
-A workable approach might be to consider methods like the one used for
-@code{LC_TIME} in the POSIX.2 standard. The value of the
-@code{alt_digits} field can be up to 100 strings which represent the
-numbers 1 to 100. Using this in a situation of an internationalized
-program means that an array of translatable strings should be indexed by
-the number which should represent. A small example:
-
-@example
-@group
-void
-print_month_info (int month)
-@{
- const char *month_pos[12] =
- @{ N_("first"), N_("second"), N_("third"), N_("fourth"),
- N_("fifth"), N_("sixth"), N_("seventh"), N_("eighth"),
- N_("ninth"), N_("tenth"), N_("eleventh"), N_("twelfth") @};
- printf (_("%s is the %s month\n"), nl_langinfo (MON_1 + month),
- _(month_pos[month]));
-@}
-@end group
-@end example
-
-@noindent
-It should be obvious that this method is only reasonable for small
-ranges of numbers.
-
-@c catgets allows same original entry to have different translations
+@code{catgets} allows the same original entry to have different translations,
+but @code{gettext} has another, scalable approach for solving ambiguities
+of this kind; see @ref{Ambiguities}.
@node Using libintl.a, gettext grok, Comparison, Programmers
@section Using libintl.a in own programs
-@set UPDATED 6 May 2000
+@set UPDATED 1 January 2001
@set EDITION 0.10.36
@set VERSION 0.10.36
+2001-01-01 Bruno Haible <haible@clisp.cons.org>
+
+ Finish implementation of plural form handling.
+ * dcigettext.c (known_translation_t): Rename 'domain' field to
+ 'domainname'. Remove 'plindex' field. Add 'domain' and
+ 'translation_length' fields.
+ (transcmp): Don't compare 'plindex' fields.
+ (plural_lookup): New function.
+ (DCIGETTEXT): Change cache handling in the plural case. Don't call
+ plural_eval before the translation and its catalog file have been
+ found. Remove plindex from cache key. Add 'translation_length' and
+ 'domain' to cache result.
+ (_nl_find_msg): Remove index argument, return length of translation
+ to the caller instead. Weaken comparison of string lengths, to account
+ for plural entries. Call iconv() on the entire result string, not
+ only on the portion needed so far.
+ * loadinfo.h (_nl_find_msg): Remove index argument, add lengthp
+ argument.
+ * loadmsgcat.c (_nl_load_domain): Adapt to _nl_find_msg change.
+
+ * intl-compat.c (dcngettext, dngettext, ngettext): New functions.
+ * libgettext.h (ngettext__, dngettext__, dcngettext__): New
+ declarations.
+ (ngettext, dngettext): Add missing macro argument.
+
+ * intlh.inst.in (ngettext, dngettext): Add missing macro argument.
+
2000-12-31 Bruno Haible <haible@clisp.cons.org>
* gettextP.h (ZERO): New macro.
/* Implementation of the internal dcigettext function.
- Copyright (C) 1995-1999, 2000 Free Software Foundation, Inc.
+ Copyright (C) 1995-1999, 2000, 2001 Free Software Foundation, Inc.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
struct known_translation_t
{
/* Domain in which to search. */
- char *domain;
-
- /* Plural index. */
- unsigned long int plindex;
+ char *domainname;
/* The category. */
int category;
/* State of the catalog counter at the point the string was found. */
int counter;
+ /* Catalog where the string was found. */
+ struct loaded_l10nfile *domain;
+
/* And finally the translation. */
const char *translation;
+ size_t translation_length;
/* Pointer to the string in question. */
char msgid[ZERO];
result = strcmp (s1->msgid, s2->msgid);
if (result == 0)
{
- result = strcmp (s1->domain, s2->domain);
+ result = strcmp (s1->domainname, s2->domainname);
if (result == 0)
- {
- result = s1->plindex - s2->plindex;
- if (result == 0)
- /* We compare the category last (though this is the cheapest
- operation) since it is hopefully always the same (namely
- LC_MESSAGES). */
- result = s1->category - s2->category;
- }
+ /* We compare the category last (though this is the cheapest
+ operation) since it is hopefully always the same (namely
+ LC_MESSAGES). */
+ result = s1->category - s2->category;
}
return result;
struct binding *_nl_domain_bindings;
/* Prototypes for local functions. */
-static unsigned long int plural_eval (struct expression *pexp,
- unsigned long int n) internal_function;
+static char *plural_lookup PARAMS ((struct loaded_l10nfile *domain,
+ unsigned long int n,
+ const char *translation,
+ size_t translation_len))
+ internal_function;
+static unsigned long int plural_eval PARAMS ((struct expression *pexp,
+ unsigned long int n))
+ internal_function;
static const char *category_to_name PARAMS ((int category)) internal_function;
static const char *guess_category_value PARAMS ((int category,
const char *categoryname))
char *dirname, *xdomainname;
char *single_locale;
char *retval;
+ size_t retlen;
int saved_errno;
#if defined HAVE_TSEARCH || defined _LIBC
struct known_translation_t *search;
#if defined HAVE_TSEARCH || defined _LIBC
msgid_len = strlen (msgid1) + 1;
- if (plural == 0)
+ /* Try to find the translation among those which we found at
+ some time. */
+ search = (struct known_translation_t *)
+ alloca (offsetof (struct known_translation_t, msgid) + msgid_len);
+ memcpy (search->msgid, msgid1, msgid_len);
+ search->domainname = (char *) domainname;
+ search->category = category;
+
+ foundp = (struct known_translation_t **) tfind (search, &root, transcmp);
+ if (foundp != NULL && (*foundp)->counter == _nl_msg_cat_cntr)
{
- /* Try to find the translation among those which we found at
- some time. */
- search = (struct known_translation_t *) alloca (sizeof (*search)
- + msgid_len);
- memcpy (search->msgid, msgid1, msgid_len);
- search->domain = (char *) domainname;
- search->plindex = 0;
- search->category = category;
-
- foundp = (struct known_translation_t **) tfind (search, &root, transcmp);
- if (foundp != NULL && (*foundp)->counter == _nl_msg_cat_cntr)
- {
- __libc_rwlock_unlock (_nl_state_lock);
- return (char *) (*foundp)->translation;
- }
+ /* Now deal with plural. */
+ if (plural)
+ retval = plural_lookup ((*foundp)->domain, n, (*foundp)->translation,
+ (*foundp)->translation_length);
+ else
+ retval = (char *) (*foundp)->translation;
+
+ __libc_rwlock_unlock (_nl_state_lock);
+ return retval;
}
#endif
if (domain != NULL)
{
- unsigned long int index = 0;
-
- if (plural != 0)
- {
- struct loaded_domain *domaindata =
- (struct loaded_domain *) domain->data;
- index = plural_eval (domaindata->plural, n);
- if (index >= domaindata->nplurals)
- /* This should never happen. It means the plural expression
- and the given maximum value do not match. */
- index = 0;
-
-#if defined HAVE_TSEARCH || defined _LIBC
- /* Try to find the translation among those which we
- found at some time. */
- search = (struct known_translation_t *) alloca (sizeof (*search)
- + msgid_len);
- memcpy (search->msgid, msgid1, msgid_len);
- search->domain = (char *) domainname;
- search->plindex = index;
- search->category = category;
-
- foundp = (struct known_translation_t **) tfind (search, &root,
- transcmp);
- if (foundp != NULL && (*foundp)->counter == _nl_msg_cat_cntr)
- {
- __libc_rwlock_unlock (_nl_state_lock);
- return (char *) (*foundp)->translation;
- }
-#endif
- }
-
- retval = _nl_find_msg (domain, msgid1, index);
+ retval = _nl_find_msg (domain, msgid1, &retlen);
if (retval == NULL)
{
for (cnt = 0; domain->successor[cnt] != NULL; ++cnt)
{
retval = _nl_find_msg (domain->successor[cnt], msgid1,
- index);
+ &retlen);
if (retval != NULL)
- break;
+ {
+ domain = domain->successor[cnt];
+ break;
+ }
}
}
if (retval != NULL)
{
+ /* Found the translation of MSGID1 in domain DOMAIN:
+ starting at RETVAL, RETLEN bytes. */
FREE_BLOCKS (block_list);
__set_errno (saved_errno);
#if defined HAVE_TSEARCH || defined _LIBC
+ msgid_len + domainname_len + 1);
if (newp != NULL)
{
- newp->domain = mempcpy (newp->msgid, msgid1, msgid_len);
- memcpy (newp->domain, domainname, domainname_len + 1);
- newp->plindex = index;
+ newp->domainname =
+ mempcpy (newp->msgid, msgid1, msgid_len);
+ memcpy (newp->domainname, domainname, domainname_len + 1);
newp->category = category;
newp->counter = _nl_msg_cat_cntr;
+ newp->domain = domain;
newp->translation = retval;
+ newp->translation_length = retlen;
/* Insert the entry in the search tree. */
foundp = (struct known_translation_t **)
{
/* We can update the existing entry. */
(*foundp)->counter = _nl_msg_cat_cntr;
+ (*foundp)->domain = domain;
(*foundp)->translation = retval;
+ (*foundp)->translation_length = retlen;
}
#endif
+ /* Now deal with plural. */
+ if (plural)
+ retval = plural_lookup (domain, n, retval, retlen);
+
__libc_rwlock_unlock (_nl_state_lock);
return retval;
}
char *
internal_function
-_nl_find_msg (domain_file, msgid, index)
+_nl_find_msg (domain_file, msgid, lengthp)
struct loaded_l10nfile *domain_file;
const char *msgid;
- unsigned long int index;
+ size_t *lengthp;
{
struct loaded_domain *domain;
size_t act;
char *result;
+ size_t resultlen;
if (domain_file->decided == 0)
_nl_load_domain (domain_file);
/* Hash table entry is empty. */
return NULL;
- if (W (domain->must_swap, domain->orig_tab[nstr - 1].length) == len
- && strcmp (msgid,
- domain->data + W (domain->must_swap,
- domain->orig_tab[nstr - 1].offset)) == 0)
- {
- act = nstr - 1;
- goto found;
- }
-
while (1)
{
+ /* Compare msgid with the original string at index nstr-1.
+ We compare the lengths with >=, not ==, because plural entries
+ are represented by strings with an embedded NUL. */
+ if (W (domain->must_swap, domain->orig_tab[nstr - 1].length) >= len
+ && (strcmp (msgid,
+ domain->data + W (domain->must_swap,
+ domain->orig_tab[nstr - 1].offset))
+ == 0))
+ {
+ act = nstr - 1;
+ goto found;
+ }
+
if (idx >= domain->hash_size - incr)
idx -= domain->hash_size - incr;
else
if (nstr == 0)
/* Hash table entry is empty. */
return NULL;
-
- if (W (domain->must_swap, domain->orig_tab[nstr - 1].length) == len
- && (strcmp (msgid,
- domain->data + W (domain->must_swap,
- domain->orig_tab[nstr - 1].offset))
- == 0))
- {
- act = nstr - 1;
- goto found;
- }
}
/* NOTREACHED */
}
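
The switch from == to >= in the length test follows from how plural entries are stored: the original string in the catalog is "msgid NUL msgid_plural", so its recorded length is larger than strlen (msgid), while strcmp still stops at the embedded NUL. A minimal sketch of that comparison, assuming a hypothetical helper original_matches that is not part of the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not a gettext function.  STORED points to an
   original string from the catalog and STORED_LEN is its recorded
   length, excluding the final NUL.  For a plural entry STORED holds
   "msgid\0msgid_plural", so STORED_LEN exceeds strlen (msgid).  */
static int
original_matches (const char *stored, size_t stored_len, const char *msgid)
{
  size_t len;

  assert (stored != NULL && msgid != NULL);
  len = strlen (msgid);

  /* strcmp stops at the first NUL of STORED, so only the singular
     part takes part in the comparison.  */
  return stored_len >= len && strcmp (msgid, stored) == 0;
}
```

The length test only serves as a cheap pre-filter here; the strcmp result decides the match.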
string to use a different character set, this is the time. */
result = ((char *) domain->data
+ W (domain->must_swap, domain->trans_tab[act].offset));
+ resultlen = W (domain->must_swap, domain->trans_tab[act].length) + 1;
#if defined _LIBC || HAVE_ICONV
if (
appropriate table with the same structure as the table
of translations in the file, where we can put the pointers
to the converted strings in.
- There is a slight complication with the INDEX: We don't know
- a priori which entries are plural entries. Therefore at any
- moment we can only translate the variants 0 .. INDEX. */
+ There is a slight complication with plural entries. They
+ are represented by consecutive NUL terminated strings. We
+ handle this case by converting RESULTLEN bytes, including
+ NULs. */
if (domain->conv_tab == NULL
&& ((domain->conv_tab = (char **) calloc (domain->nstrings,
/* Nothing we can do, no more memory. */
goto converted;
- if (domain->conv_tab[act] == NULL
- || *(nls_uint32 *) domain->conv_tab[act] < index)
+ if (domain->conv_tab[act] == NULL)
{
/* We haven't used this string so far, so it is not
translated yet. Do this now. */
static unsigned char *freemem;
static size_t freemem_size;
- size_t resultlen;
const unsigned char *inbuf;
unsigned char *outbuf;
int malloc_count;
transmem_block_t *transmem_list = NULL;
# endif
- /* Note that we translate (index + 1) consecutive strings at
- once, including the final NUL byte. */
- {
- unsigned long int i = index;
- char *p = result;
- do
- p += strlen (p) + 1;
- while (i-- > 0);
- resultlen = p - result;
- }
-
__libc_lock_lock (lock);
inbuf = result;
- outbuf = freemem + sizeof (nls_uint32);
+ outbuf = freemem + sizeof (size_t);
malloc_count = 0;
while (1)
size_t non_reversible;
int res;
- if (freemem_size < sizeof (nls_uint32))
+ if (freemem_size < sizeof (size_t))
goto resize_freemem;
res = __gconv (domain->conv,
&inbuf, inbuf + resultlen,
&outbuf,
- outbuf + freemem_size - sizeof (nls_uint32),
+ outbuf + freemem_size - sizeof (size_t),
&non_reversible);
if (res == __GCONV_OK || res == __GCONV_EMPTY_INPUT)
char *outptr = (char *) outbuf;
size_t outleft;
- if (freemem_size < sizeof (nls_uint32))
+ if (freemem_size < sizeof (size_t))
goto resize_freemem;
- outleft = freemem_size - sizeof (nls_uint32);
+ outleft = freemem_size - sizeof (size_t);
if (iconv (domain->conv, &inptr, &inleft, &outptr, &outleft)
!= (size_t) (-1))
{
freemem = newmem;
# endif
- outbuf = freemem + sizeof (nls_uint32);
+ outbuf = freemem + sizeof (size_t);
}
/* We have now in our buffer a converted string. Put this
into the table of conversions. */
- *(nls_uint32 *) freemem = index;
+ *(size_t *) freemem = outbuf - freemem - sizeof (size_t);
domain->conv_tab[act] = freemem;
/* Shrink freemem, but keep it aligned. */
freemem_size -= outbuf - freemem;
freemem = outbuf;
- freemem += freemem_size & (alignof (nls_uint32) - 1);
- freemem_size = freemem_size & ~ (alignof (nls_uint32) - 1);
+ freemem += freemem_size & (alignof (size_t) - 1);
+ freemem_size = freemem_size & ~ (alignof (size_t) - 1);
__libc_lock_unlock (lock);
}
- /* Now domain->conv_tab[act] contains the translation of at least
- the variants 0 .. INDEX. */
- result = domain->conv_tab[act] + sizeof (nls_uint32);
+ /* Now domain->conv_tab[act] contains the translation of all
+ the plural variants. */
+ result = domain->conv_tab[act] + sizeof (size_t);
+ resultlen = *(size_t *) domain->conv_tab[act];
}
converted:
#endif /* _LIBC || HAVE_ICONV */
- /* Now skip some strings. How much depends on the index passed in. */
+ *lengthp = resultlen;
+ return result;
+}
+
+
+/* Look up a plural variant. */
+static char *
+internal_function
+plural_lookup (domain, n, translation, translation_len)
+ struct loaded_l10nfile *domain;
+ unsigned long int n;
+ const char *translation;
+ size_t translation_len;
+{
+ struct loaded_domain *domaindata = (struct loaded_domain *) domain->data;
+ unsigned long int index;
+ const char *p;
+
+ index = plural_eval (domaindata->plural, n);
+ if (index >= domaindata->nplurals)
+ /* This should never happen. It means the plural expression and the
+ given maximum value do not match. */
+ index = 0;
+
+ /* Skip INDEX strings at TRANSLATION. */
+ p = translation;
while (index-- > 0)
{
#ifdef _LIBC
- result = __rawmemchr (result, '\0');
+ p = __rawmemchr (p, '\0');
#else
- result = strchr (result, '\0');
+ p = strchr (p, '\0');
#endif
/* And skip over the NUL byte. */
- ++result;
- }
+ p++;
- return result;
+ if (p >= translation + translation_len)
+ /* This should never happen. It means the plural expression
+ evaluated to a value larger than the number of variants
+ available for MSGID1. */
+ return (char *) translation;
+ }
+ return (char *) p;
}
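
The variant selection in plural_lookup can be tried in isolation. The following is a sketch only: nth_variant is a hypothetical name, it takes the already computed index instead of calling plural_eval, and it uses strchr on all platforms.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-alone version of the skipping loop; not the
   function from dcigettext.c.  TRANSLATION holds consecutive NUL
   terminated strings, TRANSLATION_LEN counts all bytes including
   the final NUL.  */
static const char *
nth_variant (const char *translation, size_t translation_len,
             unsigned long int index)
{
  const char *p = translation;

  assert (translation_len > 0 && translation[translation_len - 1] == '\0');

  while (index-- > 0)
    {
      /* Move past the current string and its NUL byte.  */
      p = strchr (p, '\0') + 1;
      if (p >= translation + translation_len)
        /* Fewer variants than INDEX asks for: fall back to the first.  */
        return translation;
    }
  return p;
}
```

Falling back to the first variant mirrors the patch, which treats an out-of-range index as a catalog inconsistency rather than an error.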
/* Function to evaluate the plural expression and return an index value. */
static unsigned long int
internal_function
-plural_eval (struct expression *pexp, unsigned long int n)
+plural_eval (pexp, n)
+ struct expression *pexp;
+ unsigned long int n;
{
switch (pexp->operation)
{
/* intl-compat.c - Stub functions to call gettext functions from GNU gettext
Library.
- Copyright (C) 1995, 2000 Software Foundation, Inc.
+ Copyright (C) 1995, 2000, 2001 Free Software Foundation, Inc.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
#undef gettext
#undef dgettext
#undef dcgettext
+#undef ngettext
+#undef dngettext
+#undef dcngettext
#undef textdomain
#undef bindtextdomain
#undef bind_textdomain_codeset
}
+char *
+dcngettext (domainname, msgid1, msgid2, n, category)
+ const char *domainname;
+ const char *msgid1;
+ const char *msgid2;
+ unsigned long int n;
+ int category;
+{
+ return dcngettext__ (domainname, msgid1, msgid2, n, category);
+}
+
+
+char *
+dngettext (domainname, msgid1, msgid2, n)
+ const char *domainname;
+ const char *msgid1;
+ const char *msgid2;
+ unsigned long int n;
+{
+ return dngettext__ (domainname, msgid1, msgid2, n);
+}
+
+
+char *
+ngettext (msgid1, msgid2, n)
+ const char *msgid1;
+ const char *msgid2;
+ unsigned long int n;
+{
+ return ngettext__ (msgid1, msgid2, n);
+}
+
+
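
A caller of the new entry points might look like the sketch below. It assumes a system <libintl.h> that declares ngettext with the three-argument signature shown above; format_cake_count is a hypothetical helper. When no catalog is loaded, ngettext returns msgid1 for n == 1 and msgid2 otherwise.

```c
#include <assert.h>
#include <libintl.h>
#include <stdio.h>

/* Hypothetical helper: formats a count-dependent message into BUF
   and returns what snprintf returns.  The count is passed twice:
   once to pick the plural form, once as the printf argument.  */
static int
format_cake_count (char *buf, size_t size, unsigned long int n)
{
  assert (buf != NULL && size > 0);
  return snprintf (buf, size,
                   ngettext ("a piece of cake", "%lu pieces of cake", n),
                   n);
}
```

The singular msgid contains no conversion, so the extra argument is simply ignored when n is 1.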
char *
textdomain (domainname)
const char *domainname;
/* Message catalogs for internationalization.
- Copyright (C) 1995, 1996, 1997, 2000 Free Software Foundation, Inc.
+ Copyright (C) 1995-1997, 2000, 2001 Free Software Foundation, Inc.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
# define dgettext(domainname, msgid) \
dcgettext (domainname, msgid, LC_MESSAGES)
-# define ngettext(msgid, N) \
- dngettext (NULL, msgid, N)
+# define ngettext(msgid1, Msgid2, N) \
+ dngettext (NULL, msgid1, Msgid2, N)
-# define dngettext(domainname, msgid, n) \
- dcngettext (domainname, msgid, n, LC_MESSAGES)
+# define dngettext(domainname, msgid1, Msgid2, n) \
+ dcngettext (domainname, msgid1, Msgid2, n, LC_MESSAGES)
#endif /* Optimizing. */
/* Message catalogs for internationalization.
- Copyright (C) 1995, 1996, 1997, 1998, 2000 Free Software Foundation, Inc.
+ Copyright (C) 1995-1998, 2000, 2001 Free Software Foundation, Inc.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
number N. */
extern char *ngettext PARAMS ((const char *__msgid1, const char *__msgid2,
unsigned long int __n));
+extern char *ngettext__ PARAMS ((const char *__msgid1, const char *__msgid2,
+ unsigned long int __n));
/* Similar to `dgettext' but select the plural form corresponding to the
number N. */
extern char *dngettext PARAMS ((const char *__domainname, const char *__msgid1,
const char *__msgid2, unsigned long int __n));
+extern char *dngettext__ PARAMS ((const char *__domainname,
+ const char *__msgid1, const char *__msgid2,
+ unsigned long int __n));
/* Similar to `dcgettext' but select the plural form corresponding to the
number N. */
extern char *dcngettext PARAMS ((const char *__domainname, const char *__msgid1,
const char *__msgid2, unsigned long int __n,
int __category));
+extern char *dcngettext__ PARAMS ((const char *__domainname,
+ const char *__msgid1, const char *__msgid2,
+ unsigned long int __n, int __category));
/* Set the current default message catalog to DOMAINNAME.
# define dgettext(Domainname, Msgid) \
dcgettext (Domainname, Msgid, LC_MESSAGES)
-# define ngettext(Msgid, N) \
- dngettext (NULL, Msgid, N)
+# define ngettext(Msgid1, Msgid2, N) \
+ dngettext (NULL, Msgid1, Msgid2, N)
-# define dngettext(Domainname, Msgid, N) \
- dcngettext (Domainname, Msgid, N, LC_MESSAGES)
+# define dngettext(Domainname, Msgid1, Msgid2, N) \
+ dcngettext (Domainname, Msgid1, Msgid2, N, LC_MESSAGES)
# endif
extern char *_nl_find_msg PARAMS ((struct loaded_l10nfile *domain_file,
- const char *msgid, unsigned long int index))
+ const char *msgid, size_t *lengthp))
internal_function;
#endif /* loadinfo.h */
int use_mmap = 0;
struct loaded_domain *domain;
char *nullentry;
+ size_t nullentrylen;
domain_file->decided = 1;
domain_file->data = NULL;
# endif
#endif
domain->conv_tab = NULL;
- nullentry = _nl_find_msg (domain_file, "", 0);
+ nullentry = _nl_find_msg (domain_file, "", &nullentrylen);
if (nullentry != NULL)
{
#if defined _LIBC || HAVE_ICONV
+2001-01-01 Bruno Haible <haible@clisp.cons.org>
+
+ Implement plural form handling.
+ * message.h (struct message_variant_ty): Add msgstr_len field.
+ (struct message_ty): Add msgid_plural field.
+ (message_alloc): Take additional msgid_plural argument.
+ (message_variant_append): Take additional msgstr_len argument.
+ * message.c (message_alloc): Take additional msgid_plural argument.
+ (message_free): Free msgid_plural field.
+ (message_variant_append): Take additional msgstr_len argument.
+ (message_copy): Copy msgid_plural as well. Pass msgstr_len.
+ (message_merge): Likewise.
+ (message_print): Print plural entries using a different format.
+ (message_print_obsolete): Likewise.
+ * msgunfmt.c (string32): Return the string's size as well. Verify
+ the string is NUL terminated.
+ (read_mo_file): Split the original string into msgid and msgid_plural.
+ Pass msgstr_len.
+ * po-lex.h (msgstr_def): New definition, taken from msgfmt.c.
+ * po-lex.c (keyword_p): Recognize the msgid_plural keyword.
+ (po_gram_lex): Accept brackets as single-character tokens.
+ * po.h (struct po_method_ty): Method 'directive_message' takes
+ additional arguments 'msgid_plural', 'msgstr_len'.
+ (po_callback_message): Additional arguments 'msgid_plural',
+ 'msgstr_len'.
+ * po-hash-gen.y (yyerror): Effectively rename to po_hash_error.
+ * po-gram-gen.y (yyerror): Effectively rename to po_gram_error,
+ thus enabling reporting of syntax errors.
+ (plural_counter): New variable.
+ (%token): Add MSGID_PLURAL, '[', ']' as new tokens.
+ (%union): Add new alternative of type 'struct msgstr_def'.
+ (msgid_pluralform, pluralform, pluralform_list): New productions.
+ (message): Add plural rules.
+ * po.c (po_directive_message): Additional arguments 'msgid_plural',
+ 'msgstr_len'.
+ (po_callback_message): Likewise.
+ * msgfmt.c (SIZEOF): New macro.
+ (struct id_str_pair): Add id_len, id_plural, id_plural_len, str_len
+ fields.
+ (struct hashtable_entry): Renamed from struct msgstr_def. Add
+ 'msgid_plural', 'msgstr_len' fields.
+ (format_directive_message): Additional arguments 'msgid_plural',
+ 'msgstr_len'. Verify the validity of the charset field in the header.
+ Compare msgstr using memcmp, not strcmp.
+ (check_pair): Additional arguments 'msgid_plural', 'msgstr_len'.
+ Apply the tests to msgid_plural and each msgstr[i] string.
+ (format_debrief): Change error message.
+ (write_table): Store msgid_plural and msgstr_len in msg_arr[], then
+ output the strings including embedded NULs.
+ * msgcmp.c (compare_directive_message): Additional arguments
+ 'msgid_plural', 'msgstr_len'.
+ * msgcomm.c (extract_directive_message): Additional arguments
+ 'msgid_plural', 'msgstr_len'.
+ * msgmerge.c (merge_directive_message): Additional arguments
+ 'msgid_plural', 'msgstr_len'.
+ * xget-lex.h (struct xgettext_token_ty): Replace argnum field with
+ argnum1, argnum2.
+ * xget-lex.c (xgettext_lex): Add to default keywords: "ngettext:1,2",
+ "dngettext:2,3", "dcngettext:2,3".
+ (xgettext_lex_keyword): Accept new syntax "id:argnum1,argnum2".
+ * xgettext.c (exclude_directive_message): Additional arguments
+ 'msgid_plural', 'msgstr_len'.
+ (remember_a_message): Return the new message.
+ (remember_a_message_plural): New function.
+ (scan_c_file): Extend state machine to allow remembering msgid1 and
+ msgid2 later.
+ (extract_directive_message): Additional arguments 'msgid_plural',
+ 'msgstr_len'. Compare msgstr using memcmp, not strcmp.
+ (construct_header): Update.
+
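
The extended keyword syntax "id:argnum1,argnum2" named in the xget-lex.c entry can be illustrated with a small parser. This is a sketch under assumptions, not the code from xgettext_lex_keyword: struct keyword_spec and parse_keyword_spec are hypothetical names, and a bare identifier is assumed to mean the first argument.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parsed form of a keyword specification.  */
struct keyword_spec
{
  size_t name_len;  /* length of the identifier part */
  int argnum1;      /* argument position of msgid */
  int argnum2;      /* argument position of msgid_plural, 0 if none */
};

/* Parse "id", "id:argnum1" or "id:argnum1,argnum2".
   Returns 0 on success, -1 on a malformed specification.  */
static int
parse_keyword_spec (const char *spec, struct keyword_spec *out)
{
  const char *colon;
  const char *start;
  char *end;
  long v;

  assert (spec != NULL && out != NULL);

  out->argnum1 = 1;   /* assumption: bare identifier means argument 1 */
  out->argnum2 = 0;
  colon = strchr (spec, ':');
  out->name_len = colon != NULL ? (size_t) (colon - spec) : strlen (spec);
  if (out->name_len == 0)
    return -1;
  if (colon == NULL)
    return 0;

  start = colon + 1;
  v = strtol (start, &end, 10);
  if (end == start || v <= 0 || v > INT_MAX)
    return -1;
  out->argnum1 = (int) v;

  if (*end == ',')
    {
      start = end + 1;
      v = strtol (start, &end, 10);
      if (end == start || v <= 0 || v > INT_MAX)
        return -1;
      out->argnum2 = (int) v;
    }
  return *end == '\0' ? 0 : -1;
}
```

With this reading, the new default keyword "dngettext:2,3" selects the second argument as msgid and the third as msgid_plural.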
2000-12-31 Bruno Haible <haible@clisp.cons.org>
* msgfmt.c (format_directive_message): Pass to insert_entry and
/* GNU gettext - internationalization aids
- Copyright (C) 1995, 1996, 1997, 1998, 2000 Free Software Foundation, Inc.
+ Copyright (C) 1995-1998, 2000, 2001 Free Software Foundation, Inc.
This file was written by Peter Miller <millerp@canb.auug.org.au>
message_ty *
-message_alloc (msgid)
+message_alloc (msgid, msgid_plural)
char *msgid;
+ const char *msgid_plural;
{
message_ty *mp;
mp = xmalloc (sizeof (message_ty));
mp->msgid = msgid;
+ mp->msgid_plural = (msgid_plural != NULL ? xstrdup (msgid_plural) : NULL);
mp->comment = NULL;
mp->comment_dot = NULL;
mp->filepos_count = 0;
if (mp->comment_dot != NULL)
string_list_free (mp->comment_dot);
free ((char *) mp->msgid);
+ if (mp->msgid_plural != NULL)
+ free ((char *) mp->msgid_plural);
for (j = 0; j < mp->variant_count; ++j)
free ((char *) mp->variant[j].msgstr);
if (mp->variant != NULL)
void
-message_variant_append (mp, domain, msgstr, pp)
+message_variant_append (mp, domain, msgstr, msgstr_len, pp)
message_ty *mp;
const char *domain;
const char *msgstr;
+ size_t msgstr_len;
const lex_pos_ty *pp;
{
size_t nbytes;
mvp = &mp->variant[mp->variant_count++];
mvp->domain = domain;
mvp->msgstr = msgstr;
+ mvp->msgstr_len = msgstr_len;
mvp->pos = *pp;
}
message_ty *result;
size_t j;
- result = message_alloc (xstrdup (mp->msgid));
+ result = message_alloc (xstrdup (mp->msgid), mp->msgid_plural);
for (j = 0; j < mp->variant_count; ++j)
{
message_variant_ty *mvp = &mp->variant[j];
- message_variant_append (result, mvp->domain, mvp->msgstr, &mvp->pos);
+ message_variant_append (result, mvp->domain, mvp->msgstr, mvp->msgstr_len,
+ &mvp->pos);
}
if (mp->comment)
{
/* Take the msgid from the reference. When fuzzy matches are made,
the definition will not be unique, but the reference will be -
usually because it has a typo. */
- result = message_alloc (xstrdup (ref->msgid));
+ result = message_alloc (xstrdup (ref->msgid), ref->msgid_plural);
/* If msgid is the header entry (i.e., "") we find the
POT-Creation-Date line in the reference. */
if (header_fields[UNKNOWN].string != NULL)
stpcpy (newp, header_fields[UNKNOWN].string);
- message_variant_append (result, mvp->domain, cp, &mvp->pos);
+ message_variant_append (result, mvp->domain, cp, strlen (cp) + 1,
+ &mvp->pos);
}
else
- message_variant_append (result, mvp->domain, mvp->msgstr, &mvp->pos);
+ message_variant_append (result, mvp->domain, mvp->msgstr,
+ mvp->msgstr_len, &mvp->pos);
}
/* Take the comments from the definition file. There will be none at
are as readable as possible. If there is no recorded msgstr for
this domain, emit an empty string. */
wrap (fp, NULL, "msgid", mp->msgid, mp->do_wrap);
- wrap (fp, NULL, "msgstr", mvp ? mvp->msgstr : "", mp->do_wrap);
+ if (mp->msgid_plural != NULL)
+ wrap (fp, NULL, "msgid_plural", mp->msgid_plural, mp->do_wrap);
+
+ if (mp->msgid_plural == NULL)
+ wrap (fp, NULL, "msgstr", mvp ? mvp->msgstr : "", mp->do_wrap);
+ else
+ {
+ char prefix_buf[20];
+ unsigned int i;
+
+ if (mvp)
+ {
+ const char *p;
+
+ for (p = mvp->msgstr, i = 0;
+ p < mvp->msgstr + mvp->msgstr_len;
+ p += strlen (p) + 1, i++)
+ {
+ sprintf (prefix_buf, "msgstr[%u]", i);
+ wrap (fp, NULL, prefix_buf, p, mp->do_wrap);
+ }
+ }
+ else
+ {
+ for (i = 0; i < 2; i++)
+ {
+ sprintf (prefix_buf, "msgstr[%u]", i);
+ wrap (fp, NULL, prefix_buf, "", mp->do_wrap);
+ }
+ }
+ }
}
/* Print each of the message components. Wrap them nicely so they
are as readable as possible. */
wrap (fp, "#~ ", "msgid", mp->msgid, mp->do_wrap);
- wrap (fp, "#~ ", "msgstr", mvp->msgstr, mp->do_wrap);
+ if (mp->msgid_plural != NULL)
+ wrap (fp, "#~ ", "msgid_plural", mp->msgid_plural, mp->do_wrap);
+
+ if (mp->msgid_plural == NULL)
+ wrap (fp, "#~ ", "msgstr", mvp->msgstr, mp->do_wrap);
+ else
+ {
+ char prefix_buf[20];
+ unsigned int i;
+ const char *p;
+
+ for (p = mvp->msgstr, i = 0;
+ p < mvp->msgstr + mvp->msgstr_len;
+ p += strlen (p) + 1, i++)
+ {
+ sprintf (prefix_buf, "msgstr[%u]", i);
+ wrap (fp, "#~ ", prefix_buf, p, mp->do_wrap);
+ }
+ }
}
struct message_variant_ty
{
const char *domain;
+
lex_pos_ty pos;
+
const char *msgstr;
+ size_t msgstr_len;
};
typedef struct message_ty message_ty;
/* The msgid string. */
const char *msgid;
+ /* The msgid's plural, if present. */
+ const char *msgid_plural;
+
/* The msgstr strings, one for each observed domain in the file. */
size_t variant_count;
message_variant_ty *variant;
int obsolete;
};
-message_ty *message_alloc PARAMS ((char *msgid));
+message_ty *message_alloc PARAMS ((char *msgid, const char *msgid_plural));
void message_free PARAMS ((message_ty *));
message_variant_ty *message_variant_search PARAMS ((message_ty *mp,
const char *domain));
void message_variant_append PARAMS ((message_ty *mp, const char *domain,
- const char *msgstr,
+ const char *msgstr, size_t msgstr_len,
const lex_pos_ty *pp));
void message_comment_append PARAMS ((message_ty *, const char *));
void message_comment_dot_append PARAMS ((message_ty *, const char *));
static void compare_directive_domain PARAMS ((po_ty *__that, char *__name));
static void compare_directive_message PARAMS ((po_ty *__that, char *__msgid,
lex_pos_ty *msgid_pos,
+ char *__msgid_plural,
char *__msgstr,
+ size_t __msgstr_len,
lex_pos_ty *__msgstr_pos));
static void compare_parse_debrief PARAMS ((po_ty *__that));
static void
-compare_directive_message (that, msgid, msgid_pos, msgstr, msgstr_pos)
+compare_directive_message (that, msgid, msgid_pos, msgid_plural,
+ msgstr, msgstr_len, msgstr_pos)
po_ty *that;
char *msgid;
lex_pos_ty *msgid_pos;
+ char *msgid_plural;
char *msgstr;
+ size_t msgstr_len;
lex_pos_ty *msgstr_pos;
{
compare_class_ty *this = (compare_class_ty *) that;
free (msgid);
else
{
- mp = message_alloc (msgid);
+ mp = message_alloc (msgid, msgid_plural);
message_list_append (this->mlp, mp);
}
free (msgstr);
}
else
- message_variant_append (mp, this->domain, msgstr, msgstr_pos);
+ message_variant_append (mp, this->domain, msgstr, msgstr_len, msgstr_pos);
}
static void extract_directive_domain PARAMS ((po_ty *__that, char *__name));
static void extract_directive_message PARAMS ((po_ty *__that, char *__msgid,
lex_pos_ty *__msgid_pos,
+ char *__msgid_plural,
char *__msgstr,
+ size_t __msgstr_len,
lex_pos_ty *__msgstr_pos));
static void extract_parse_brief PARAMS ((po_ty *__that));
static void extract_comment PARAMS ((po_ty *__that, const char *__s));
static void
-extract_directive_message (that, msgid, msgid_pos, msgstr, msgstr_pos)
+extract_directive_message (that, msgid, msgid_pos, msgid_plural,
+ msgstr, msgstr_len, msgstr_pos)
po_ty *that;
char *msgid;
lex_pos_ty *msgid_pos;
+ char *msgid_plural;
char *msgstr;
+ size_t msgstr_len;
lex_pos_ty *msgstr_pos;
{
extract_class_ty *this = (extract_class_ty *)that;
free (msgid);
else
{
- mp = message_alloc (msgid);
+ mp = message_alloc (msgid, msgid_plural);
message_list_append (this->mlp, mp);
}
if (mvp != NULL)
free (msgstr);
else
- message_variant_append (mp, MESSAGE_DOMAIN_DEFAULT, msgstr, msgstr_pos);
+ message_variant_append (mp, MESSAGE_DOMAIN_DEFAULT, msgstr, msgstr_len,
+ msgstr_pos);
}
extern int errno;
#endif
+#define SIZEOF(a) (sizeof(a) / sizeof(a[0]))
+
/* Define the data structure which we need to represent the data to
be written out. */
struct id_str_pair
{
char *id;
+ size_t id_len;
+ char *id_plural;
+ size_t id_plural_len;
char *str;
+ size_t str_len;
};
/* Contains information about the definition of one translation. */
-struct msgstr_def
+struct hashtable_entry
{
+ char *msgid_plural;
char *msgstr;
+ size_t msgstr_len;
lex_pos_ty pos;
};
static void format_directive_domain PARAMS ((po_ty *__pop, char *__name));
static void format_directive_message PARAMS ((po_ty *__pop, char *__msgid,
lex_pos_ty *__msgid_pos,
+ char *__msgid_plural,
char *__msgstr,
+ size_t __msgstr_len,
lex_pos_ty *__msgstr_pos));
static void format_comment_special PARAMS ((po_ty *pop, const char *s));
static void format_debrief PARAMS((po_ty *));
static int compare_id PARAMS ((const void *pval1, const void *pval2));
static void write_table PARAMS ((FILE *output_file, hash_table *tab));
static void check_pair PARAMS ((const char *msgid, const lex_pos_ty *msgid_pos,
- const char *msgstr,
+ const char *msgid_plural,
+ const char *msgstr, size_t msgstr_len,
const lex_pos_ty *msgstr_pos, int is_format));
static const char *add_mo_suffix PARAMS ((const char *));
{
msgfmt_class_ty *this = (msgfmt_class_ty *) that;
- /* If in verbose mode, test whether header entry was found. */
+ /* Test whether header entry was found.
+ FIXME: Should do this even if not in verbose mode, because the
+ consequences are not harmless. But it breaks the test suite. */
if (verbose_level > 0 && this->has_header_entry == 0)
- error (0, 0, _("%s: warning: PO file header missing, fuzzy, or invalid"),
- gram_pos.file_name);
+ error (0, 0, _("%s: warning: PO file header missing, fuzzy, or invalid\n\
+%*s warning: charset conversion will not work"),
+ gram_pos.file_name, (int) strlen (gram_pos.file_name), "");
}
/* Process `msgid'/`msgstr' pair from .po file. */
static void
-format_directive_message (that, msgid_string, msgid_pos, msgstr_string,
- msgstr_pos)
+format_directive_message (that, msgid_string, msgid_pos, msgid_plural,
+ msgstr_string, msgstr_len, msgstr_pos)
po_ty *that;
char *msgid_string;
lex_pos_ty *msgid_pos;
+ char *msgid_plural;
char *msgstr_string;
+ size_t msgstr_len;
lex_pos_ty *msgstr_pos;
{
msgfmt_class_ty *this = (msgfmt_class_ty *) that;
- struct msgstr_def *entry;
+ struct hashtable_entry *entry;
if (msgstr_string[0] == '\0' || (!include_all && this->is_fuzzy))
{
"PACKAGE VERSION", "YEAR-MO-DA", "FULL NAME", "LANGUAGE",
NULL, "text/plain; charset=CHARSET", "ENCODING"
};
- const size_t nfields = (sizeof (required_fields)
- / sizeof (required_fields[0]));
+ const size_t nfields = SIZEOF (required_fields);
int initial = -1;
int cnt;
error (0, 0, _("field `%s' still has initial default value"),
required_fields[initial]);
}
+
+ /* Verify the validity of CHARSET, even if not in verbose mode,
+ because the consequences are not harmless. */
+ {
+ const char *charsetstr = strstr (msgstr_string, "charset=");
+
+ if (charsetstr != NULL)
+ {
+ /* The list of charsets supported by glibc's iconv() and by
+ the portable iconv() across platforms. Taken from
+ intl/config.charset. */
+ static const char *standard_charsets[] =
+ {
+ "ASCII", "ANSI_X3.4-1968", "US-ASCII",
+ "ISO-8859-1", "ISO_8859-1",
+ "ISO-8859-2", "ISO_8859-2",
+ "ISO-8859-3", "ISO_8859-3",
+ "ISO-8859-4", "ISO_8859-4",
+ "ISO-8859-5", "ISO_8859-5",
+ "ISO-8859-6", "ISO_8859-6",
+ "ISO-8859-7", "ISO_8859-7",
+ "ISO-8859-8", "ISO_8859-8",
+ "ISO-8859-9", "ISO_8859-9",
+ "ISO-8859-13", "ISO_8859-13",
+ "ISO-8859-15", "ISO_8859-15",
+ "KOI8-R",
+ "KOI8-U",
+ "CP850",
+ "CP866",
+ "CP874",
+ "CP932",
+ "CP949",
+ "CP950",
+ "CP1250",
+ "CP1251",
+ "CP1252",
+ "CP1253",
+ "CP1254",
+ "CP1255",
+ "CP1256",
+ "CP1257",
+ "GB2312",
+ "EUC-JP",
+ "EUC-KR",
+ "EUC-TW",
+ "BIG5",
+ "BIG5HKSCS",
+ "GBK",
+ "GB18030",
+ "SJIS",
+ "JOHAB",
+ "TIS-620",
+ "VISCII",
+ "UTF-8"
+ };
+ size_t len;
+ char *charset;
+ size_t i;
+
+ charsetstr += strlen ("charset=");
+ len = strcspn (charsetstr, " \t\n");
+ charset = (char *) alloca (len + 1);
+ memcpy (charset, charsetstr, len);
+ charset[len] = '\0';
+
+ for (i = 0; i < SIZEOF (standard_charsets); i++)
+ if (strcasecmp (charset, standard_charsets[i]) == 0)
+ break;
+ if (i == SIZEOF (standard_charsets))
+ error (0, 0, _("\
+%s: warning: charset \"%s\" is not a portable encoding name\n\
+%*s warning: charset conversion might not work"),
+ gram_pos.file_name, charset,
+ (int) strlen (gram_pos.file_name), "");
+ }
+ else
+ error (0, 0, _("\
+%s: warning: charset missing in header\n\
+%*s warning: charset conversion will not work"),
+ gram_pos.file_name, (int) strlen (gram_pos.file_name), "");
+ }
}
else
/* We don't count the header entry in the statistic so place the
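
The charset extraction in the header check above uses strstr to find "charset=" and strcspn to find the end of the name. A stand-alone sketch of that step follows; extract_charset is a hypothetical helper, and it copies into a caller-supplied buffer instead of using alloca as the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the charset name found in HEADER into
   BUF (at most SIZE - 1 bytes plus a NUL) and returns the number of
   bytes copied, or 0 if HEADER has no "charset=" field.  */
static size_t
extract_charset (const char *header, char *buf, size_t size)
{
  const char *p;
  size_t len;

  assert (header != NULL && buf != NULL && size > 0);

  buf[0] = '\0';
  p = strstr (header, "charset=");
  if (p == NULL)
    return 0;

  p += strlen ("charset=");
  /* The name ends at the first blank, tab or newline.  */
  len = strcspn (p, " \t\n");
  if (len >= size)
    len = size - 1;
  memcpy (buf, p, len);
  buf[len] = '\0';
  return len;
}
```

The result could then be compared case-insensitively against a list of known names, as the patch does with standard_charsets.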
/* We found a valid pair of msgid/msgstr.
Construct struct to describe msgstr definition. */
- entry = (struct msgstr_def *) xmalloc (sizeof (*entry));
+ entry = (struct hashtable_entry *) xmalloc (sizeof (*entry));
+ entry->msgid_plural = msgid_plural;
entry->msgstr = msgstr_string;
+ entry->msgstr_len = msgstr_len;
entry->pos = *msgstr_pos;
/* Do some more checks on both strings. */
- check_pair (msgid_string, msgid_pos, msgstr_string, msgstr_pos,
+ check_pair (msgid_string, msgid_pos, msgid_plural,
+ msgstr_string, msgstr_len, msgstr_pos,
do_check && possible_c_format_p (this->is_c_format));
/* Check whether already a domain is specified. If not use default
definition for reference. */
find_entry (&current_domain->symbol_tab, msgid_string,
strlen (msgid_string) + 1, (void **) &entry);
- if (0 != strcmp(msgstr_string, entry->msgstr))
+ if (msgstr_len != entry->msgstr_len
+ || memcmp (msgstr_string, entry->msgstr, msgstr_len) != 0)
{
po_gram_error_at_line (msgid_pos, _("\
duplicate message definition"));
size_t cnt;
const void *id;
size_t id_len;
- struct msgstr_def *entry;
+ struct hashtable_entry *entry;
struct string_desc sd;
/* Fill the structure describing the header. */
++cnt)
{
msg_arr[cnt].id = (char *) id;
+ msg_arr[cnt].id_len = id_len;
+ msg_arr[cnt].id_plural = entry->msgid_plural;
+ msg_arr[cnt].id_plural_len =
+ (entry->msgid_plural != NULL ? strlen (entry->msgid_plural) + 1 : 0);
msg_arr[cnt].str = entry->msgstr;
+ msg_arr[cnt].str_len = entry->msgstr_len;
}
/* Sort the table according to original string. */
/* Write out length and starting offset for all original strings. */
for (cnt = 0; cnt < tab->filled; ++cnt)
{
- sd.length = strlen (msg_arr[cnt].id);
+ /* Subtract 1 because of the terminating NUL. */
+ sd.length = msg_arr[cnt].id_len + msg_arr[cnt].id_plural_len - 1;
fwrite (&sd, sizeof (sd), 1, output_file);
sd.offset += roundup (sd.length + 1, alignment);
}
/* Write out length and starting offset for all translation strings. */
for (cnt = 0; cnt < tab->filled; ++cnt)
{
- sd.length = strlen (msg_arr[cnt].str);
+ /* Subtract 1 because of the terminating NUL. */
+ sd.length = msg_arr[cnt].str_len - 1;
fwrite (&sd, sizeof (sd), 1, output_file);
sd.offset += roundup (sd.length + 1, alignment);
}
/* Now write the original strings. */
for (cnt = 0; cnt < tab->filled; ++cnt)
{
- size_t len = strlen (msg_arr[cnt].id);
+ size_t len = msg_arr[cnt].id_len + msg_arr[cnt].id_plural_len;
- fwrite (msg_arr[cnt].id, len + 1, 1, output_file);
- fwrite (&null, 1, roundup (len + 1, alignment) - (len + 1), output_file);
+ fwrite (msg_arr[cnt].id, msg_arr[cnt].id_len, 1, output_file);
+ if (msg_arr[cnt].id_plural_len > 0)
+ fwrite (msg_arr[cnt].id_plural, msg_arr[cnt].id_plural_len, 1,
+ output_file);
+ fwrite (&null, 1, roundup (len, alignment) - len, output_file);
}
/* Now write the translation strings. */
for (cnt = 0; cnt < tab->filled; ++cnt)
{
- size_t len = strlen (msg_arr[cnt].str);
+ size_t len = msg_arr[cnt].str_len;
- fwrite (msg_arr[cnt].str, len + 1, 1, output_file);
- fwrite (&null, 1, roundup (len + 1, alignment) - (len + 1), output_file);
+ fwrite (msg_arr[cnt].str, len, 1, output_file);
+ fwrite (&null, 1, roundup (len, alignment) - len, output_file);
free (msg_arr[cnt].str);
}
static void
-check_pair (msgid, msgid_pos, msgstr, msgstr_pos, is_format)
+check_pair (msgid, msgid_pos, msgid_plural, msgstr, msgstr_len, msgstr_pos,
+ is_format)
const char *msgid;
const lex_pos_ty *msgid_pos;
+ const char *msgid_plural;
const char *msgstr;
+ size_t msgstr_len;
const lex_pos_ty *msgstr_pos;
int is_format;
{
- size_t msgid_len = strlen (msgid);
- size_t msgstr_len = strlen (msgstr);
+ int has_newline;
+ unsigned int i;
+ const char *p;
size_t nidfmts, nstrfmts;
/* If the msgid string is empty we have the special entry reserved for
information about the translation. */
- if (msgid_len == 0)
+ if (msgid[0] == '\0')
return;
- /* Test 1: check whether both or none of the strings begin with a '\n'. */
- if (((msgid[0] == '\n') ^ (msgstr[0] == '\n')) != 0)
+ /* Test 1: check whether all or none of the strings begin with a '\n'. */
+ has_newline = (msgid[0] == '\n');
+#define TEST_NEWLINE(p) (p[0] == '\n')
+ if (msgid_plural != NULL)
+ {
+ if (TEST_NEWLINE(msgid_plural) != has_newline)
+ {
+ error_at_line (0, 0, msgid_pos->file_name, msgid_pos->line_number,
+ _("\
+`msgid' and `msgid_plural' entries do not both begin with '\\n'"));
+ exit_status = EXIT_FAILURE;
+ }
+ for (p = msgstr, i = 0; p < msgstr + msgstr_len; p += strlen (p) + 1, i++)
+ if (TEST_NEWLINE(p) != has_newline)
+ {
+ error_at_line (0, 0, msgid_pos->file_name, msgid_pos->line_number,
+ _("\
+`msgid' and `msgstr[%u]' entries do not both begin with '\\n'"), i);
+ exit_status = EXIT_FAILURE;
+ }
+ }
+ else
{
- error_at_line (0, 0, msgid_pos->file_name, msgid_pos->line_number, _("\
+ if (TEST_NEWLINE(msgstr) != has_newline)
+ {
+ error_at_line (0, 0, msgid_pos->file_name, msgid_pos->line_number,
+ _("\
`msgid' and `msgstr' entries do not both begin with '\\n'"));
- exit_status = EXIT_FAILURE;
+ exit_status = EXIT_FAILURE;
+ }
}
+#undef TEST_NEWLINE
- /* Test 2: check whether both or none of the strings end with a '\n'. */
- if (((msgid[msgid_len - 1] == '\n') ^ (msgstr[msgstr_len - 1] == '\n')) != 0)
+ /* Test 2: check whether all or none of the strings end with a '\n'. */
+ has_newline = (msgid[strlen (msgid) - 1] == '\n');
+#define TEST_NEWLINE(p) (p[0] != '\0' && p[strlen (p) - 1] == '\n')
+ if (msgid_plural != NULL)
{
- error_at_line (0, 0, msgid_pos->file_name, msgid_pos->line_number, _("\
+ if (TEST_NEWLINE(msgid_plural) != has_newline)
+ {
+ error_at_line (0, 0, msgid_pos->file_name, msgid_pos->line_number,
+ _("\
+`msgid' and `msgid_plural' entries do not both end with '\\n'"));
+ exit_status = EXIT_FAILURE;
+ }
+ for (p = msgstr, i = 0; p < msgstr + msgstr_len; p += strlen (p) + 1, i++)
+ if (TEST_NEWLINE(p) != has_newline)
+ {
+ error_at_line (0, 0, msgid_pos->file_name, msgid_pos->line_number,
+ _("\
+`msgid' and `msgstr[%u]' entries do not both end with '\\n'"), i);
+ exit_status = EXIT_FAILURE;
+ }
+ }
+ else
+ {
+ if (TEST_NEWLINE(msgstr) != has_newline)
+ {
+ error_at_line (0, 0, msgid_pos->file_name, msgid_pos->line_number,
+ _("\
`msgid' and `msgstr' entries do not both end with '\\n'"));
- exit_status = EXIT_FAILURE;
+ exit_status = EXIT_FAILURE;
+ }
}
+#undef TEST_NEWLINE
- if (is_format != 0)
+ if (is_format != 0 && msgid_plural == NULL)
{
/* Test 3: check whether both formats strings contain the same
number of format specifications. */
static void merge_directive_domain PARAMS ((po_ty *__that, char *__name));
static void merge_directive_message PARAMS ((po_ty *__that, char *__msgid,
lex_pos_ty *__msgid_pos,
+ char *__msgid_plural,
char *__msgstr,
+ size_t __msgstr_len,
lex_pos_ty *__msgstr_pos));
static void merge_parse_brief PARAMS ((po_ty *__that));
static void merge_parse_debrief PARAMS ((po_ty *__that));
static void
-merge_directive_message (that, msgid, msgid_pos, msgstr, msgstr_pos)
+merge_directive_message (that, msgid, msgid_pos, msgid_plural,
+ msgstr, msgstr_len, msgstr_pos)
po_ty *that;
char *msgid;
lex_pos_ty *msgid_pos;
+ char *msgid_plural;
char *msgstr;
+ size_t msgstr_len;
lex_pos_ty *msgstr_pos;
{
merge_class_ty *this = (merge_class_ty *) that;
free (msgid);
else
{
- mp = message_alloc (msgid);
+ mp = message_alloc (msgid, msgid_plural);
message_list_append (this->mlp, mp);
}
free (msgstr);
}
else
- message_variant_append (mp, this->domain, msgstr, msgstr_pos);
+ message_variant_append (mp, this->domain, msgstr, msgstr_len, msgstr_pos);
}
/* msgunfmt - converts binary .mo files to Uniforum style .po files
- Copyright (C) 1995, 1996, 1997, 1998, 2000 Free Software Foundation, Inc.
+ Copyright (C) 1995-1998, 2000, 2001 Free Software Foundation, Inc.
Written by Ulrich Drepper <drepper@gnu.ai.mit.edu>, April 1995.
This program is free software; you can redistribute it and/or modify
static void error_print PARAMS ((void));
static nls_uint32 read32 PARAMS ((FILE *__fp, const char *__fn));
static void seek32 PARAMS ((FILE *__fp, const char *__fn, long __offset));
-static char *string32 PARAMS ((FILE *__fp, const char *__fn, long __offset));
+static char *string32 PARAMS ((FILE *__fp, const char *__fn, long __offset,
+ size_t *lengthp));
static message_list_ty *read_mo_file PARAMS ((message_list_ty *__mlp,
const char *__fn));
static char *
-string32 (fp, fn, offset)
+string32 (fp, fn, offset, lengthp)
FILE *fp;
const char *fn;
long offset;
+ size_t *lengthp;
{
long length;
char *buffer;
/* Read in the string. Complain if there is an error or it comes up
short. Add the NUL ourselves. */
seek32 (fp, fn, offset);
- n = fread (buffer, 1, length, fp);
- if (n != length)
+ n = fread (buffer, 1, length + 1, fp);
+ if (n != length + 1)
{
if (ferror (fp))
error (EXIT_FAILURE, errno, _("error while reading \"%s\""), fn);
error (EXIT_FAILURE, 0, _("file \"%s\" truncated"), fn);
}
- buffer[length] = '\0';
+ if (buffer[length] != '\0')
+ {
+ error (EXIT_FAILURE, 0,
+ _("file \"%s\" contains a not NUL terminated string"), fn);
+ }
/* Return the string to the caller. */
+ *lengthp = length + 1;
return buffer;
}
static lex_pos_ty pos = { __FILE__, __LINE__ };
message_ty *mp;
char *msgid;
+ size_t msgid_len;
char *msgstr;
+ size_t msgstr_len;
/* Read the msgid. */
- msgid = string32 (fp, fn, header.orig_tab_offset + j * 8);
+ msgid = string32 (fp, fn, header.orig_tab_offset + j * 8, &msgid_len);
/* Read the msgstr. */
- msgstr = string32 (fp, fn, header.trans_tab_offset + j * 8);
-
- mp = message_alloc (msgid);
- message_variant_append (mp, MESSAGE_DOMAIN_DEFAULT, msgstr, &pos);
+ msgstr = string32 (fp, fn, header.trans_tab_offset + j * 8, &msgstr_len);
+
+ mp = message_alloc (msgid,
+ (strlen (msgid) + 1 < msgid_len
+ ? msgid + strlen (msgid) + 1
+ : NULL));
+ message_variant_append (mp, MESSAGE_DOMAIN_DEFAULT, msgstr, msgstr_len,
+ &pos);
message_list_append (mlp, mp);
}
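Note on the buffer layout used above: a plural message keeps msgid and msgid_plural in one buffer, separated by a NUL, and the msgstr buffer holds every plural form back to back, each with its own terminating NUL. The sketch below shows how such buffers can be taken apart, roughly as read_mo_file and check_pair do. The helper names plural_part and count_forms are hypothetical and do not appear in the patch; the length arguments are assumed to count all NUL bytes, including the final one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: given a msgid buffer of BUF_LEN bytes (all NULs
   included), return a pointer to the msgid_plural part, or NULL when the
   buffer holds only a singular msgid.  */
static const char *
plural_part (const char *buf, size_t buf_len)
{
  size_t singular_len;

  assert (buf != NULL && buf_len > 0 && buf[buf_len - 1] == '\0');
  singular_len = strlen (buf) + 1;   /* singular msgid plus its NUL */
  return singular_len < buf_len ? buf + singular_len : NULL;
}

/* Hypothetical helper: count the NUL-terminated plural forms packed into
   a msgstr buffer of MSGSTR_LEN bytes.  */
static unsigned int
count_forms (const char *msgstr, size_t msgstr_len)
{
  const char *p;
  unsigned int n = 0;

  /* Step from one form to the next by skipping over its NUL.  */
  for (p = msgstr; p < msgstr + msgstr_len; p += strlen (p) + 1)
    n++;
  return n;
}
```

With the cake example from the test suite, plural_part would return "%d pieces of cake" and count_forms would report two forms for the French translation.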
/* GNU gettext - internationalization aids
- Copyright (C) 1995, 1996, 1998, 2000 Free Software Foundation, Inc.
+ Copyright (C) 1995, 1996, 1998, 2000, 2001 Free Software Foundation, Inc.
This file was written by Peter Miller <pmiller@agso.gov.au>
#define yymaxdepth po_gram_maxdepth
#define yyparse po_gram_parse
#define yylex po_gram_lex
+#define yyerror po_gram_error
#define yylval po_gram_lval
#define yychar po_gram_char
#define yydebug po_gram_debug
#define yygindex po_gram_yygindex
#define yytable po_gram_yytable
#define yycheck po_gram_yycheck
+
+static long plural_counter;
%}
%token COMMENT
%token DOMAIN
%token JUNK
%token MSGID
+%token MSGID_PLURAL
%token MSGSTR
%token NAME
+%token '[' ']'
%token NUMBER
%token STRING
char *string;
long number;
lex_pos_ty pos;
+ struct msgstr_def rhs;
}
-%type <string> STRING COMMENT string_list
+%type <string> STRING COMMENT string_list msgid_pluralform
%type <number> NUMBER
%type <pos> msgid msgstr
+%type <rhs> pluralform pluralform_list
%right MSGSTR
message
: msgid string_list msgstr string_list
{
- po_callback_message ($2, &$1, $4, &$3);
+ po_callback_message ($2, &$1, NULL,
+ $4, strlen ($4) + 1, &$3);
+ }
+ | msgid string_list msgid_pluralform pluralform_list
+ {
+ po_callback_message ($2, &$1, $3,
+ $4.msgstr, $4.msgstr_len, &$4.pos);
+ }
+ | msgid string_list msgid_pluralform
+ {
+ po_gram_error_at_line (&$1, _("missing `msgstr[]' section"));
+ free ($2);
+ free ($3);
+ }
+ | msgid string_list pluralform_list
+ {
+ po_gram_error_at_line (&$1, _("missing `msgid_plural' section"));
+ free ($2);
+ free ($3.msgstr);
}
| msgid string_list
{
}
;
+msgid_pluralform
+ : MSGID_PLURAL string_list
+ {
+ plural_counter = 0;
+ $$ = $2;
+ }
+ ;
+
+pluralform_list
+ : pluralform
+ {
+ $$ = $1;
+ }
+ | pluralform_list pluralform
+ {
+ $$.msgstr = (char *) xmalloc ($1.msgstr_len + $2.msgstr_len);
+ memcpy ($$.msgstr, $1.msgstr, $1.msgstr_len);
+ memcpy ($$.msgstr + $1.msgstr_len, $2.msgstr, $2.msgstr_len);
+ $$.msgstr_len = $1.msgstr_len + $2.msgstr_len;
+ $$.pos = $1.pos;
+ free ($1.msgstr);
+ free ($2.msgstr);
+ }
+ ;
+
+pluralform
+ : msgstr '[' NUMBER ']' string_list
+ {
+ if ($3 != plural_counter)
+ {
+ if (plural_counter == 0)
+ po_gram_error_at_line (&$1, _("first plural form has nonzero index"));
+ else
+ po_gram_error_at_line (&$1, _("plural form has wrong index"));
+ }
+ plural_counter++;
+ $$.msgstr = $5;
+ $$.msgstr_len = strlen ($5) + 1;
+ $$.pos = $1;
+ }
+ ;
+
msgid
: MSGID
{
/* GNU gettext - internationalization aids
- Copyright (C) 1995, 1996, 1998 Free Software Foundation, Inc.
+ Copyright (C) 1995, 1996, 1998, 2001 Free Software Foundation, Inc.
This file was written by Peter Miller <pmiller@agso.gov.au>
#define yymaxdepth po_hash_maxdepth
#define yyparse po_hash_parse
#define yylex po_hash_lex
+#define yyerror po_hash_error
#define yylval po_hash_lval
#define yychar po_hash_char
#define yydebug po_hash_debug
/* Prototypes for local functions. */
static int lex_getc PARAMS ((void));
static void lex_ungetc PARAMS ((int __ch));
-static int keyword_p PARAMS ((char *__s));
+static int keyword_p PARAMS ((const char *__s));
static int control_sequence PARAMS ((void));
static int
keyword_p (s)
- char *s;
+ const char *s;
{
if (!strcmp (s, "domain"))
return DOMAIN;
if (!strcmp (s, "msgid"))
return MSGID;
+ if (!strcmp (s, "msgid_plural"))
+ return MSGID_PLURAL;
if (!strcmp (s, "msgstr"))
return MSGSTR;
po_gram_error (_("keyword \"%s\" unknown"), s);
po_gram_lval.number = atol (buf);
return NUMBER;
+ case '[':
+ return '[';
+
+ case ']':
+ return ']';
+
default:
/* This will cause a syntax error. */
return JUNK;
/* GNU gettext - internationalization aids
- Copyright (C) 1995, 1996, 1997, 1998, 2000 Free Software Foundation, Inc.
+ Copyright (C) 1995-1998, 2000, 2001 Free Software Foundation, Inc.
This file was written by Peter Miller <millerp@canb.auug.org.au>
#endif
+/* Contains information about the definition of one translation. */
+struct msgstr_def
+{
+ char *msgstr;
+ size_t msgstr_len;
+ lex_pos_ty pos;
+};
+
#endif
static void po_directive_domain PARAMS ((po_ty *__pop, char *__name));
static void po_directive_message PARAMS ((po_ty *__pop, char *__msgid,
lex_pos_ty *__msgid_pos,
- char *__msgstr,
+ char *__msgid_plural,
+ char *__msgstr, size_t __msgstr_len,
lex_pos_ty *__msgstr_pos));
static void po_comment PARAMS ((po_ty *__pop, const char *__s));
static void po_comment_dot PARAMS ((po_ty *__pop, const char *__s));
static void
-po_directive_message (pop, msgid, msgid_pos, msgstr, msgstr_pos)
+po_directive_message (pop, msgid, msgid_pos, msgid_plural,
+ msgstr, msgstr_len, msgstr_pos)
po_ty *pop;
char *msgid;
lex_pos_ty *msgid_pos;
+ char *msgid_plural;
char *msgstr;
+ size_t msgstr_len;
lex_pos_ty *msgstr_pos;
{
if (pop->method->directive_message)
- pop->method->directive_message (pop, msgid, msgid_pos, msgstr, msgstr_pos);
+ pop->method->directive_message (pop, msgid, msgid_pos, msgid_plural,
+ msgstr, msgstr_len, msgstr_pos);
}
void
-po_callback_message (msgid, msgid_pos, msgstr, msgstr_pos)
+po_callback_message (msgid, msgid_pos, msgid_plural,
+ msgstr, msgstr_len, msgstr_pos)
char *msgid;
lex_pos_ty *msgid_pos;
+ char *msgid_plural;
char *msgstr;
+ size_t msgstr_len;
lex_pos_ty *msgstr_pos;
{
/* assert(callback_arg); */
- po_directive_message (callback_arg, msgid, msgid_pos, msgstr, msgstr_pos);
+ po_directive_message (callback_arg, msgid, msgid_pos, msgid_plural,
+ msgstr, msgstr_len, msgstr_pos);
}
/* GNU gettext - internationalization aids
- Copyright (C) 1995, 1996, 1998 Free Software Foundation, Inc.
+ Copyright (C) 1995, 1996, 1998, 2000, 2001 Free Software Foundation, Inc.
This file was written by Peter Miller <millerp@canb.auug.org.au>
void (*directive_domain) PARAMS ((struct po_ty *__pop, char *__name));
/* what to do with a message directive */
- void (*directive_message) PARAMS ((struct po_ty *__pop, char *__msgid,
- lex_pos_ty *__msgid_pos, char *__msgstr,
+ void (*directive_message) PARAMS ((struct po_ty *__pop,
+ char *__msgid, lex_pos_ty *__msgid_pos,
+ char *__msgid_plural,
+ char *__msgstr, size_t __msgstr_len,
lex_pos_ty *__msgstr_pos));
/* This method is invoked before the parse, but after the file is
extern void po_callback_domain PARAMS ((char *__name));
extern void po_callback_message PARAMS ((char *__msgid,
lex_pos_ty *__msgid_pos,
- char *__msgstr,
+ char *__msgid_plural,
+ char *__msgstr, size_t __msgstr_len,
lex_pos_ty *__msgstr_pos));
extern void po_callback_comment PARAMS ((const char *__s));
extern void po_callback_comment_dot PARAMS ((const char *__s));
xgettext_lex_keyword ("gettext");
xgettext_lex_keyword ("dgettext:2");
xgettext_lex_keyword ("dcgettext:2");
+ xgettext_lex_keyword ("ngettext:1,2");
+ xgettext_lex_keyword ("dngettext:2,3");
+ xgettext_lex_keyword ("dcngettext:2,3");
xgettext_lex_keyword ("gettext_noop");
default_keywords = 0;
}
== 0)
{
tp->type = xgettext_token_type_keyword;
- tp->argnum = (int) (long) keyword_value;
+ tp->argnum1 = (int) (long) keyword_value & ((1 << 10) - 1);
+ tp->argnum2 = (int) (long) keyword_value >> 10;
tp->line_number = token.line_number;
tp->file_name = logical_file_name;
}
default_keywords = 0;
else
{
- int argnum;
+ int argnum1;
+ int argnum2;
size_t len;
- const char *sp;
+ char *sp;
if (keywords.table == NULL)
init_hash (&keywords, 100);
name_copy[len] = '\0';
name = name_copy;
- argnum = atoi (sp + 1);
+ sp++;
+ argnum1 = strtol (sp, &sp, 10);
+ if (*sp == ',')
+ {
+ sp++;
+ argnum2 = strtol (sp, &sp, 10);
+ }
+ else
+ argnum2 = 0;
}
else
{
len = strlen (name);
- argnum = 1;
+ argnum1 = 1;
+ argnum2 = 0;
}
- insert_entry (&keywords, name, len + 1, (void *) (long) argnum);
+ insert_entry (&keywords, name, len + 1,
+ (void *) (long) (argnum1 + (argnum2 << 10)));
}
}
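Note on the keyword encoding above: the hash table stores a single pointer-sized value per keyword, so the patch packs both argument positions into it, argnum1 in the low 10 bits and argnum2 above them. The following is a minimal standalone sketch of that scheme. The names pack_keyword_spec, unpack_argnum1, unpack_argnum2, ARGNUM_BITS and ARGNUM_MASK are hypothetical, and locating the colon with strchr is an assumption, since the patch does not show how the separator is found.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical constants: argnum1 occupies the low 10 bits.  */
#define ARGNUM_BITS 10
#define ARGNUM_MASK ((1L << ARGNUM_BITS) - 1)

/* Hypothetical helper: parse "name", "name:N" or "name:N,M" and return
   the packed argument positions.  *NAME_LEN receives the length of the
   keyword name without the ":N,M" suffix.  */
static long
pack_keyword_spec (const char *spec, size_t *name_len)
{
  const char *colon;
  char *end;
  long argnum1 = 1;   /* default: the msgid is the first argument */
  long argnum2 = 0;   /* 0 means there is no plural argument */

  assert (spec != NULL && name_len != NULL);
  colon = strchr (spec, ':');
  if (colon == NULL)
    *name_len = strlen (spec);
  else
    {
      *name_len = (size_t) (colon - spec);
      argnum1 = strtol (colon + 1, &end, 10);
      if (*end == ',')
        argnum2 = strtol (end + 1, &end, 10);
    }
  return argnum1 + (argnum2 << ARGNUM_BITS);
}

/* Recover the two positions, mirroring the lexer side of the patch.  */
static int
unpack_argnum1 (long packed)
{
  return (int) (packed & ARGNUM_MASK);
}

static int
unpack_argnum2 (long packed)
{
  return (int) (packed >> ARGNUM_BITS);
}
```

Under these assumptions "dngettext:2,3" unpacks to argnum1 == 2 and argnum2 == 3, while a bare "gettext" yields 1 and 0. The scheme only works while argnum1 stays below 1024, which is far beyond any realistic argument count.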
/* GNU gettext - internationalization aids
- Copyright (C) 1995, 1996, 1998, 2000 Free Software Foundation, Inc.
+ Copyright (C) 1995, 1996, 1998, 2000, 2001 Free Software Foundation, Inc.
This file was written by Peter Miller <millerp@canb.auug.org.au>
{
xgettext_token_type_ty type;
- /* This field is used only for xgettext_token_type_keyword. */
- int argnum;
+ /* These fields are used only for xgettext_token_type_keyword. */
+ int argnum1;
+ int argnum2;
/* This field is used only for xgettext_token_type_string_literal. */
char *string;
/* Extracts strings from C source file to Uniforum style .po file.
- Copyright (C) 1995, 1996, 1997, 1998, 2000 Free Software Foundation, Inc.
+ Copyright (C) 1995-1998, 2000, 2001 Free Software Foundation, Inc.
Written by Ulrich Drepper <drepper@gnu.ai.mit.edu>, April 1995.
This program is free software; you can redistribute it and/or modify
static void exclude_directive_domain PARAMS ((po_ty *__pop, char *__name));
static void exclude_directive_message PARAMS ((po_ty *__pop, char *__msgid,
lex_pos_ty *__msgid_pos,
+ char *__msgid_plural,
char *__msgstr,
+ size_t __msgstr_len,
lex_pos_ty *__msgstr_pos));
static void read_exclusion_file PARAMS ((char *__file_name));
-static void remember_a_message PARAMS ((message_list_ty *__mlp,
- xgettext_token_ty *__tp));
+static message_ty *remember_a_message PARAMS ((message_list_ty *__mlp,
+ xgettext_token_ty *__tp));
+static void remember_a_message_plural PARAMS ((message_ty *__mp,
+ xgettext_token_ty *__tp));
static void scan_c_file PARAMS ((const char *__file_name,
message_list_ty *__mlp));
static void extract_constructor PARAMS ((po_ty *__that));
static void extract_directive_domain PARAMS ((po_ty *__that, char *__name));
static void extract_directive_message PARAMS ((po_ty *__that, char *__msgid,
lex_pos_ty *__msgid_pos,
+ char *__msgid_plural,
char *__msgstr,
+ size_t __msgstr_len,
lex_pos_ty *__msgstr_pos));
static void extract_parse_brief PARAMS ((po_ty *__that));
static void extract_comment PARAMS ((po_ty *__that, const char *__s));
static void
-exclude_directive_message (pop, msgid, msgid_pos, msgstr, msgstr_pos)
+exclude_directive_message (pop, msgid, msgid_pos, msgid_plural,
+ msgstr, msgstr_len, msgstr_pos)
po_ty *pop;
char *msgid;
lex_pos_ty *msgid_pos;
+ char *msgid_plural;
char *msgstr;
+ size_t msgstr_len;
lex_pos_ty *msgstr_pos;
{
message_ty *mp;
free (msgid);
else
{
- mp = message_alloc (msgid);
+ mp = message_alloc (msgid, msgid_plural);
/* Do not free msgid. */
message_list_append (exclude, mp);
}
}
-static void
+static message_ty *
remember_a_message (mlp, tp)
message_list_ty *mlp;
xgettext_token_ty *tp;
message gets the correct comments. */
xgettext_lex_comment_reset ();
- return;
+ return NULL;
}
/* See if we have seen this message before. */
static lex_pos_ty pos = { __FILE__, __LINE__ };
/* Allocate a new message and append the message to the list. */
- mp = message_alloc (msgid);
+ mp = message_alloc (msgid, NULL);
/* Do not free msgid. */
message_list_append (mlp, mp);
{
msgstr = (char *) xmalloc (strlen (msgstr_prefix)
+ strlen (msgid)
- + strlen(msgstr_suffix) + 1);
+ + strlen (msgstr_suffix) + 1);
stpcpy (stpcpy (stpcpy (msgstr, msgstr_prefix), msgid),
msgstr_suffix);
}
else
msgstr = "";
- message_variant_append (mp, MESSAGE_DOMAIN_DEFAULT, msgstr, &pos);
+ message_variant_append (mp, MESSAGE_DOMAIN_DEFAULT, msgstr,
+ strlen (msgstr) + 1, &pos);
}
/* Ask the lexer for the comments it has seen. Only do this for the
/* Tell the lexer to reset its comment buffer, so that the next
message gets the correct comments. */
xgettext_lex_comment_reset ();
+
+ return mp;
+}
+
+
+static void
+remember_a_message_plural (mp, tp)
+ message_ty *mp;
+ xgettext_token_ty *tp;
+{
+ char *msgid_plural;
+ message_variant_ty *mvp;
+ char *msgstr1;
+ size_t msgstr1_len;
+ char *msgstr;
+
+ msgid_plural = tp->string;
+
+ /* See if the message is already a plural message. */
+ if (mp->msgid_plural == NULL)
+ {
+ mp->msgid_plural = msgid_plural;
+
+ /* Construct the first plural form from the prefix and suffix,
+ otherwise use the empty string. The translator will have to
+ provide additional plural forms. */
+ mvp = message_variant_search (mp, MESSAGE_DOMAIN_DEFAULT);
+ if (mvp != NULL)
+ {
+ if (msgstr_prefix)
+ {
+ msgstr1 = (char *) xmalloc (strlen (msgstr_prefix)
+ + strlen (msgid_plural)
+ + strlen (msgstr_suffix) + 1);
+ stpcpy (stpcpy (stpcpy (msgstr1, msgstr_prefix), msgid_plural),
+ msgstr_suffix);
+ }
+ else
+ msgstr1 = "";
+ msgstr1_len = strlen (msgstr1) + 1;
+ msgstr = (char *) xmalloc (mvp->msgstr_len + msgstr1_len);
+ memcpy (msgstr, mvp->msgstr, mvp->msgstr_len);
+ memcpy (msgstr + mvp->msgstr_len, msgstr1, msgstr1_len);
+ mvp->msgstr = msgstr;
+ mvp->msgstr_len = mvp->msgstr_len + msgstr1_len;
+ }
+ }
+ else
+ free (msgid_plural);
}
{
int state;
int commas_to_skip = 0; /* defined only when in states 1 and 2 */
+ int plural_commas = 0; /* defined only when in states 1 and 2 */
+ message_ty *plural_mp = NULL; /* defined only when in states 1 and 2 */
int paren_nesting = 0; /* defined only when in state 2 */
/* The file is broken into tokens. Scan the token stream, looking for
a keyword, followed by a left paren, followed by a string. When we
see this sequence, we have something to remember. We assume we are
looking at a valid C or C++ program, and leave the complaints about
- the grammar to the compiler. */
+ the grammar to the compiler.
+
+ Normal handling: Look for
+ [A] keyword [B] ( ... [C] ... msgid ... ) [E]
+ Plural handling: Look for
+ [A] keyword [B] ( ... [C] ... msgid ... [D] ... msgid_plural ... ) [E]
+ At point [A]: state == 0.
+ At point [B]: state == 1, commas_to_skip set, plural_mp == NULL.
+ At point [C]: state == 2, commas_to_skip set, plural_mp == NULL.
+ At point [D]: state == 2, commas_to_skip set again, plural_mp != NULL.
+ At point [E]: state == 0. */
+
xgettext_lex_open (filename);
/* Start state is 0. */
_("%s:%d: warning: keyword between outer keyword and its arg"),
token.file_name, token.line_number);
}
- commas_to_skip = token.argnum - 1;
+ commas_to_skip = token.argnum1 - 1;
+ plural_commas = (token.argnum2 > token.argnum1
+ ? token.argnum2 - token.argnum1 : 0);
+ plural_mp = NULL;
state = 1;
continue;
continue;
case xgettext_token_type_string_literal:
- if (extract_all || (state == 2 && commas_to_skip == 0))
+ if (extract_all)
remember_a_message (mlp, &token);
+ else if (state == 2 && commas_to_skip == 0)
+ {
+ if (plural_mp == NULL)
+ {
+ /* Seen an msgid. */
+ if (plural_commas == 0)
+ remember_a_message (mlp, &token);
+ else
+ {
+ plural_mp = remember_a_message (mlp, &token);
+ commas_to_skip = plural_commas;
+ plural_commas = 0;
+ }
+ }
+ else
+ {
+ /* Seen an msgid_plural. */
+ remember_a_message_plural (plural_mp, &token);
+ plural_mp = NULL;
+ }
+ }
else
{
free (token.string);
static void
-extract_directive_message (that, msgid, msgid_pos, msgstr, msgstr_pos)
+extract_directive_message (that, msgid, msgid_pos, msgid_plural,
+ msgstr, msgstr_len, msgstr_pos)
po_ty *that;
char *msgid;
lex_pos_ty *msgid_pos;
+ char *msgid_plural;
char *msgstr;
+ size_t msgstr_len;
lex_pos_ty *msgstr_pos;
{
extract_class_ty *this = (extract_class_ty *)that;
free (msgid);
else
{
- mp = message_alloc (msgid);
+ mp = message_alloc (msgid, msgid_plural);
message_list_append (this->mlp, mp);
}
/* See if this domain has been seen for this message ID. */
mvp = message_variant_search (mp, MESSAGE_DOMAIN_DEFAULT);
- if (mvp != NULL && strcmp (msgstr, mvp->msgstr) != 0)
+ if (mvp != NULL
+ && (msgstr_len != mvp->msgstr_len
+ || memcmp (msgstr, mvp->msgstr, msgstr_len) != 0))
{
po_gram_error_at_line (msgid_pos, _("duplicate message definition"));
po_gram_error_at_line (&mvp->pos, _("\
free (msgstr);
}
else
- message_variant_append (mp, MESSAGE_DOMAIN_DEFAULT, msgstr, msgstr_pos);
+ message_variant_append (mp, MESSAGE_DOMAIN_DEFAULT, msgstr, msgstr_len,
+ msgstr_pos);
}
char tz_sign;
long tz_min;
- mp = message_alloc ("");
+ mp = message_alloc ("", NULL);
if (foreign_user)
message_comment_append (mp, "\
if (msgstr == NULL)
error (EXIT_FAILURE, errno, _("while preparing output"));
- message_variant_append (mp, MESSAGE_DOMAIN_DEFAULT, msgstr, &pos);
+ message_variant_append (mp, MESSAGE_DOMAIN_DEFAULT, msgstr,
+ strlen (msgstr) + 1, &pos);
return mp;
}
+2001-01-01 Bruno Haible <haible@clisp.cons.org>
+
+ Implement plural form handling.
+ * plural-1: New file.
+ * plural-1-prg.c: New file.
+ * Makefile.am (TESTS): Add plural-1.
+ (INCLUDES, EXTRA_PROGRAMS, cake_SOURCES, cake_LDADD, CLEANFILES): New
+ macros.
+ (all-local): New target.
+ * xg-test1.ok.po: Regenerated.
+
1997-08-01 15:46 Ulrich Drepper <drepper@cygnus.com>
* Makefile.am (AUTOMAKE_OPTIONS): Require version 1.2.
## Makefile for the check subdirectory of the GNU NLS Utilities
-## Copyright (C) 1995, 1996, 1997 Free Software Foundation, Inc.
+## Copyright (C) 1995-1997, 2001 Free Software Foundation, Inc.
##
## This program is free software; you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
TESTS = gettext-1 gettext-2 msgcmp-1 msgcmp-2 msgfmt-1 msgfmt-2 msgfmt-3 \
msgfmt-4 msgmerge-1 msgmerge-2 msgmerge-3 msgmerge-4 msgmerge-5 \
msgunfmt-1 xgettext-1 xgettext-2 xgettext-3 xgettext-4 xgettext-5 \
- xgettext-6 xgettext-7 xgettext-8 xgettext-9
+ xgettext-6 xgettext-7 xgettext-8 xgettext-9 plural-1
EXTRA_DIST = $(TESTS) test.mo xg-test1.ok.po
MSGFMT=`echo msgfmt|sed '$(transform)'` \
MSGCMP=`echo msgcmp|sed '$(transform)'` \
MSGMERGE=`echo msgmerge|sed '$(transform)'` \
- MSGUNFMT=`echo msgunfmt|sed '$(transform)'` $(SHELL)
+ MSGUNFMT=`echo msgunfmt|sed '$(transform)'` \
+ $(SHELL)
xg-test1.ok.po: $(top_srcdir)/src/xgettext.c $(top_srcdir)/src/msgfmt.c \
$(top_srcdir)/src/gettextp.c
$(XGETTEXT) -d xg-test1.ok -p $(srcdir) -k_ --omit-header \
$(top_srcdir)/src/xgettext.c $(top_srcdir)/src/msgfmt.c \
$(top_srcdir)/src/gettextp.c
+
+# An auxiliary program used by the plural-1 test.
+INCLUDES = -I${top_srcdir}/intl
+EXTRA_PROGRAMS = cake
+cake_SOURCES = plural-1-prg.c
+cake_LDADD = ../intl/libintl.a
+all-local: cake
+CLEANFILES = cake
--- /dev/null
+#! /bin/sh
+
+tmpfiles=""
+trap 'rm -fr $tmpfiles' 1 2 3 15
+
+tmpfiles="$tmpfiles cake.pot"
+: ${XGETTEXT=xgettext}
+${XGETTEXT} -o cake.pot --omit-header ${top_srcdir}/tests/plural-1-prg.c
+
+tmpfiles="$tmpfiles cake.ok"
+cat <<EOF > cake.ok
+msgid "a piece of cake"
+msgid_plural "%d pieces of cake"
+msgstr[0] ""
+msgstr[1] ""
+EOF
+
+: ${DIFF=diff}
+${DIFF} cake.ok cake.pot || exit 1
+
+tmpfiles="$tmpfiles fr.po"
+cat <<EOF > fr.po
+# Les gateaux allemands sont les meilleurs du monde.
+msgid "a piece of cake"
+msgid_plural "%d pieces of cake"
+msgstr[0] "un morceau de gateau"
+msgstr[1] "%d morceaux de gateau"
+EOF
+
+tmpfiles="$tmpfiles fr.po.new"
+: ${MSGMERGE=msgmerge}
+${MSGMERGE} -q -o fr.po.new fr.po cake.pot
+
+: ${DIFF=diff}
+${DIFF} fr.po fr.po.new || exit 1
+
+tmpfiles="$tmpfiles fr"
+test -d fr || mkdir fr
+test -d fr/LC_MESSAGES || mkdir fr/LC_MESSAGES
+
+: ${MSGFMT=msgfmt}
+${MSGFMT} -o fr/LC_MESSAGES/cake.mo fr.po
+
+tmpfiles="$tmpfiles fr.po.tmp"
+: ${MSGUNFMT=msgunfmt}
+${MSGUNFMT} fr/LC_MESSAGES/cake.mo -o fr.po.tmp
+
+tmpfiles="$tmpfiles fr.po.strip"
+sed 1d < fr.po > fr.po.strip
+
+: ${DIFF=diff}
+${DIFF} fr.po.strip fr.po.tmp || exit 1
+
+LANGUAGE=fr
+LC_ALL=
+LC_MESSAGES=
+LANG=
+export LANGUAGE LC_ALL LC_MESSAGES LANG
+
+tmpfiles="$tmpfiles cake.ok cake.out"
+: ${DIFF=diff}
+echo 'un morceau de gateau' > cake.ok
+./cake 1 > cake.out || exit 1
+${DIFF} cake.ok cake.out || exit 1
+echo '2 morceaux de gateau' > cake.ok
+./cake 2 > cake.out || exit 1
+${DIFF} cake.ok cake.out || exit 1
+echo '10 morceaux de gateau' > cake.ok
+./cake 10 > cake.out || exit 1
+${DIFF} cake.ok cake.out || exit 1
+
+rm -fr $tmpfiles
+
+exit 0
+
+# Preserve executable bits for this shell script.
+# Thanks to Noah Friedman for this great trick.
+Local Variables:
+eval:(defun frobme () (set-file-modes buffer-file-name file-mode))
+eval:(make-local-variable 'file-mode)
+eval:(setq file-mode (file-modes (buffer-file-name)))
+eval:(make-local-variable 'after-save-hook)
+eval:(add-hook 'after-save-hook 'frobme)
+End:
--- /dev/null
+#include <stdlib.h>
+#include <stdio.h>
+
+/* Make sure we use the included libintl, not the system's one. */
+#if 0
+#include <libintl.h>
+#else
+#define ENABLE_NLS 1
+#include "libgettext.h"
+#undef textdomain
+#define textdomain textdomain__
+#undef bindtextdomain
+#define bindtextdomain bindtextdomain__
+#undef ngettext
+#define ngettext ngettext__
+#endif
+
+int main (argc, argv)
+ int argc;
+ char *argv[];
+{
+ int n = atoi (argv[1]);
+ textdomain ("cake");
+ bindtextdomain ("cake", ".");
+ printf (ngettext ("a piece of cake", "%d pieces of cake", n), n);
+ printf ("\n");
+ return 0;
+}