From: Bruno Haible
Date: Tue, 1 Oct 2024 14:42:56 +0000 (+0200)
Subject: its: Do escape handling during msgfmt merge, not during xgettext. Off by default.
X-Git-Tag: v0.23~92
X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=eaf658bed81b831f84c53f27128696aacf8356ac;p=thirdparty%2Fgettext.git

its: Do escape handling during msgfmt merge, not during xgettext. Off by default.

Reported by Samy Mahmoudi at .

* gettext-tools/src/its.c (its_localization_note_rule_constructor): Don't
do escaping while extracting a localization note.
(its_rule_list_extract_text): New local variable do_escape_during_extract.
Don't do escaping while extracting.
(starts_with_character_reference, _its_encode_special_chars_for_merge):
New functions.
(its_merge_context_merge_node): New local variables
do_escape_during_extract, do_escape_during_merge. Don't do escaping while
extracting. Conditionally do escaping while merging.
* gettext-tools/src/its-extensions.xsd: Mention that escape="no" is now
the default.
* gettext-tools/its/glade1.its: Add a comment.
* gettext-tools/its/glade2.its: Likewise.
* gettext-tools/its/gsettings.its: Likewise.
* gettext-tools/its/gtkbuilder.its: Likewise.
* gettext-tools/its/metainfo.its: Add a .
* gettext-tools/tests/xgettext-appdata-1: Add comment.
* gettext-tools/tests/xgettext-appdata-2: New file, based on
gettext-tools/tests/msgfmt-xml-1.
* gettext-tools/tests/Makefile.am (TESTS): Add it.
* gettext-tools/tests/xgettext-its-1: Update expected results.
* gettext-tools/tests/msgfmt-xml-1: Test also character references and
entity references.
* gettext-tools/tests/msgfmt-xml-2: Likewise.
* gettext-tools/doc/gettext.texi (ITS Rules): Under "Escape Special
Characters", explain that it is no longer necessary to write a rule with
escape="no". Rewrite section "Two Use-cases of Translated Strings in XML".
* NEWS: Mention the changes.
--- diff --git a/NEWS b/NEWS index 88bbdde60..f423a58e7 100644 --- a/NEWS +++ b/NEWS @@ -1,7 +1,17 @@ Version 0.23 - September 2024 * Programming languages support: - - XML: XML schemas for .its and .loc files are now provided. + - XML: + o The escaping of characters such as & < > has been changed: + - No escaping is done any more by xgettext, when creating a POT file. + - Instead, extra escaping can be requested for the msgfmt pass, when + merging into an XML file. + - The default value of 'escape' in the was "yes"; + now it is "no". + This means that existing translations of older POT files may no longer + fully apply. As a maintainer of a package that has translatable XML files, + you need to regenerate the POT file and pass it on to your translators. + o XML schemas for .its and .loc files are now provided. - Python: o xgettext now assumes source code for Python 3 rather than Python 2. This affects the interpretation of escape sequences in string literals. diff --git a/gettext-tools/doc/gettext.texi b/gettext-tools/doc/gettext.texi index e70b9e57b..106174e9c 100644 --- a/gettext-tools/doc/gettext.texi +++ b/gettext-tools/doc/gettext.texi @@ -10656,6 +10656,11 @@ appdata-tools, appstream, libappstream-glib-dev @subsection Preparing Rules for XML Internationalization @cindex preparing rules for XML translation +@c The ITS support in GNU gettext was designed so as to supersede +@c the GNOME itstool . See +@c and +@c . + @menu * ITS Rules:: Specifying ITS Rules * Locating Rules:: Specifying where to find the ITS Rules @@ -10674,6 +10679,7 @@ categories: @table @samp @item Context +@c Rationale: Glade 2. This data category associates @code{msgctxt} to the extracted text. In the global rule, the @code{contextRule} element contains the following: @@ -10692,23 +10698,8 @@ An optional @code{textPointer} attribute that contains a relative selector pointing to a node that holds the @code{msgid} value. 
@end itemize -@item Escape Special Characters - -This data category indicates whether the special XML characters -(@code{<}, @code{>}, @code{&}, @code{"}) are escaped with entity -reference. In the global rule, the @code{escapeRule} element contains -the following: - -@itemize @bullet -@item -A required @code{selector} attribute. It contains an absolute selector -that selects the nodes to which this rule applies. - -@item -A required @code{escape} attribute with the value @code{yes} or @code{no}. -@end itemize - @item Extended Preserve Space +@c Rationale: GSettings. This data category extends the standard @samp{Preserve Space} data category with the additional values @samp{trim} and @samp{paragraph}. @@ -10728,6 +10719,28 @@ A required @code{space} attribute with the value @code{default}, @code{preserve}, @code{trim}, or @code{paragraph}. @end itemize +@item Escape Special Characters + +This data category indicates whether the special XML characters +(@code{<}, @code{>}, @code{&}, @code{"}) are escaped with entity +references. In the global rule, the @code{escapeRule} element contains +the following: + +@itemize @bullet +@item +A required @code{selector} attribute. It contains an absolute selector +that selects the nodes to which this rule applies. + +@item +A required @code{escape} attribute with the value @code{yes} or @code{no}. +@end itemize + +@noindent +The default value, @code{no}, should be good for most XML file types. +A rule with @code{escape="no"}, +that was necessary with GNU gettext versions before 0.23, +is now redundant. + @end table All those extended data categories can only be expressed with global @@ -10818,19 +10831,30 @@ from the matching XML files. @subsubsection Two Use-cases of Translated Strings in XML -For XML, there are two use-cases of translated strings. One is the case -where the translated strings are directly consumed by programs, and the -other is the case where the translated strings are merged back to the -original XML document. 
In the former case, special characters in the
-extracted strings shouldn't be escaped, while they should in the latter
-case. To control whether to escape special characters, the @samp{Escape
-Special Characters} data category can be used.
-
-To merge the translations, the @samp{msgfmt} program can be used with
-the option @code{--xml}. @xref{msgfmt Invocation}, for more details
-about how one calls the @samp{msgfmt} program. @samp{msgfmt}'s
-@code{--xml} option doesn't perform character escaping, so translated
-strings can have arbitrary XML constructs, such as elements for markup.
+After strings have been extracted from an XML file to a POT file
+through @code{xgettext}
+and the translator has produced a PO file with translations,
+it can be used in two ways:
+
+@itemize @bullet
+@item
+The PO file (or the MO file generated from it) can be directly consumed
+by a program.
+
+@item
+Or the translated strings can be merged back to the original XML document.
+To do this, use the @code{msgfmt} program with the option @code{--xml}.
+@xref{msgfmt Invocation}, for more details about how one calls
+the @samp{msgfmt} program.
+
+During this merge from a PO file into an XML file, it may happen that
+more escaping of special characters for XML is needed
+than what @code{msgfmt} does by default.
+In this case, you can enforce more escaping
+either through an @code{<gt:escapeRule>} ITS rule,
+or through an attribute @code{gt:escape="yes"} on the particular XML element.
+
+@end itemize
 
 @c This is the template for new data formats.
@ignore diff --git a/gettext-tools/its/glade1.its b/gettext-tools/its/glade1.its index 874b5e981..42f73b780 100644 --- a/gettext-tools/its/glade1.its +++ b/gettext-tools/its/glade1.its @@ -1,6 +1,6 @@ diff --git a/gettext-tools/its/glade2.its b/gettext-tools/its/glade2.its index e6133ae81..48220f302 100644 --- a/gettext-tools/its/glade2.its +++ b/gettext-tools/its/glade2.its @@ -1,6 +1,6 @@ diff --git a/gettext-tools/its/gsettings.its b/gettext-tools/its/gsettings.its index 930ec4238..c69f1d2f4 100644 --- a/gettext-tools/its/gsettings.its +++ b/gettext-tools/its/gsettings.its @@ -1,6 +1,6 @@ diff --git a/gettext-tools/its/gtkbuilder.its b/gettext-tools/its/gtkbuilder.its index 8078e1d4f..a984511d5 100644 --- a/gettext-tools/its/gtkbuilder.its +++ b/gettext-tools/its/gtkbuilder.its @@ -1,6 +1,6 @@ diff --git a/gettext-tools/its/metainfo.its b/gettext-tools/its/metainfo.its index 29b31f035..466c250a3 100644 --- a/gettext-tools/its/metainfo.its +++ b/gettext-tools/its/metainfo.its @@ -1,6 +1,6 @@ + + + diff --git a/gettext-tools/src/its-extensions.xsd b/gettext-tools/src/its-extensions.xsd index 116ef18da..4cb19e552 100644 --- a/gettext-tools/src/its-extensions.xsd +++ b/gettext-tools/src/its-extensions.xsd @@ -50,7 +50,7 @@ Written by Bruno Haible <bruno@clisp.org>, 2024. + is "no". --> diff --git a/gettext-tools/src/its.c b/gettext-tools/src/its.c index 4ac5283d9..a00e9ddd7 100644 --- a/gettext-tools/src/its.c +++ b/gettext-tools/src/its.c @@ -846,7 +846,7 @@ its_localization_note_rule_constructor (struct its_rule_ty *rule, xmlNode *node) { /* FIXME: Respect space attribute. 
*/ char *content = _its_collect_text_content (n, ITS_WHITESPACE_NORMALIZE, - true); + false); its_value_list_append (&rule->values, "locNote", content); free (content); } @@ -1771,13 +1771,34 @@ its_rule_list_extract_text (its_rule_list_ty *rules, struct its_value_list_ty *values; const char *value; char *msgid = NULL, *msgctxt = NULL, *comment = NULL; - bool no_escape; + bool do_escape; + bool do_escape_during_extract; enum its_whitespace_type_ty whitespace; values = its_rule_list_eval (rules, node); value = its_value_list_get_value (values, "escape"); - no_escape = value != NULL && strcmp (value, "no") == 0; + do_escape = value != NULL && strcmp (value, "yes") == 0; + /* Consider also a locally declared 'gt:escape' attribute. */ + if (node->type == XML_ELEMENT_NODE + && xmlHasNsProp (node, BAD_CAST "escape", BAD_CAST GT_NS)) + { + char *prop = _its_get_attribute (node, "escape", GT_NS); + if (strcmp (prop, "yes") == 0 || strcmp (prop, "no") == 0) + do_escape = strcmp (prop, "yes") == 0; + free (prop); + } + + do_escape_during_extract = do_escape; + /* But no, during message extraction (i.e. what xgettext does), we do + *not* want escaping to be done. The contents of the POT file is meant + for translators, and + - the messages are not labelled as requiring XML content syntax, + - it is better for the translators if they can write various + characters such as & < > without escaping them. + Escaping needs to happen in the message merge phase (i.e. what msgfmt + does) instead. 
*/ + do_escape_during_extract = false; value = its_value_list_get_value (values, "locNote"); if (value) @@ -1787,7 +1808,7 @@ its_rule_list_extract_text (its_rule_list_ty *rules, value = its_value_list_get_value (values, "locNotePointer"); if (value) comment = _its_get_content (rules, node, value, ITS_WHITESPACE_TRIM, - !no_escape); + do_escape_during_extract); } if (comment != NULL && *comment != '\0') @@ -1841,17 +1862,18 @@ its_rule_list_extract_text (its_rule_list_ty *rules, value = its_value_list_get_value (values, "contextPointer"); if (value) msgctxt = _its_get_content (rules, node, value, ITS_WHITESPACE_PRESERVE, - !no_escape); + do_escape_during_extract); value = its_value_list_get_value (values, "textPointer"); if (value) msgid = _its_get_content (rules, node, value, ITS_WHITESPACE_PRESERVE, - !no_escape); + do_escape_during_extract); its_value_list_destroy (values); free (values); if (msgid == NULL) - msgid = _its_collect_text_content (node, whitespace, !no_escape); + msgid = _its_collect_text_content (node, whitespace, + do_escape_during_extract); if (*msgid != '\0') { lex_pos_ty pos; @@ -1939,6 +1961,82 @@ struct its_merge_context_ty struct its_node_list_ty nodes; }; +/* Returns true if S starts with a character reference. 
*/
+static bool
+starts_with_character_reference (const char *s)
+{
+  /* The XML specification defines
+       CharRef ::= '&#' [0-9]+ ';' | '&#x' [0-9a-fA-F]+ ';' */
+  if (*s == '&')
+    {
+      s++;
+      if (*s == '#')
+        {
+          s++;
+          if (*s >= '0' && *s <= '9')
+            {
+              do
+                s++;
+              while (*s >= '0' && *s <= '9');
+              return *s == ';';
+            }
+          if (*s == 'x')
+            {
+              s++;
+              if ((*s >= '0' && *s <= '9')
+                  || (*s >= 'A' && *s <= 'F')
+                  || (*s >= 'a' && *s <= 'f'))
+                {
+                  do
+                    s++;
+                  while ((*s >= '0' && *s <= '9')
+                         || (*s >= 'A' && *s <= 'F')
+                         || (*s >= 'a' && *s <= 'f'));
+                  return *s == ';';
+                }
+            }
+        }
+    }
+  return false;
+}
+
+static char *
+_its_encode_special_chars_for_merge (const char *content)
+{
+  const char *str;
+  size_t amount = 0;
+  char *result, *p;
+
+  for (str = content; *str != '\0'; str++)
+    {
+      if (*str == '&' && starts_with_character_reference (str))
+        amount += sizeof ("&amp;");
+      else if (*str == '<')
+        amount += sizeof ("&lt;");
+      else if (*str == '>')
+        amount += sizeof ("&gt;");
+      else
+        amount += 1;
+    }
+
+  result = XNMALLOC (amount + 1, char);
+  *result = '\0';
+  p = result;
+  for (str = content; *str != '\0'; str++)
+    {
+      if (*str == '&' && starts_with_character_reference (str))
+        p = stpcpy (p, "&amp;");
+      else if (*str == '<')
+        p = stpcpy (p, "&lt;");
+      else if (*str == '>')
+        p = stpcpy (p, "&gt;");
+      else
+        *p++ = *str;
+    }
+  *p = '\0';
+  return result;
+}
+
 static void
 its_merge_context_merge_node (struct its_merge_context_ty *context,
                               xmlNode *node,
@@ -1950,13 +2048,29 @@ its_merge_context_merge_node (struct its_merge_context_ty *context,
   struct its_value_list_ty *values;
   const char *value;
   char *msgid = NULL, *msgctxt = NULL;
-  bool no_escape;
+  bool do_escape;
+  bool do_escape_during_extract;
+  bool do_escape_during_merge;
   enum its_whitespace_type_ty whitespace;
 
   values = its_rule_list_eval (context->rules, node);
 
   value = its_value_list_get_value (values, "escape");
-  no_escape = value != NULL && strcmp (value, "no") == 0;
+  do_escape = value != NULL && strcmp (value, "yes") == 0;
+  /* Consider
also a locally declared 'gt:escape' attribute. */ + if (xmlHasNsProp (node, BAD_CAST "escape", BAD_CAST GT_NS)) + { + char *prop = _its_get_attribute (node, "escape", GT_NS); + if (strcmp (prop, "yes") == 0 || strcmp (prop, "no") == 0) + do_escape = strcmp (prop, "yes") == 0; + free (prop); + } + + do_escape_during_extract = do_escape; + /* Like above, in its_rule_list_extract_text. */ + do_escape_during_extract = false; + + do_escape_during_merge = do_escape; value = its_value_list_get_value (values, "space"); if (value && strcmp (value, "preserve") == 0) @@ -1971,17 +2085,20 @@ its_merge_context_merge_node (struct its_merge_context_ty *context, value = its_value_list_get_value (values, "contextPointer"); if (value) msgctxt = _its_get_content (context->rules, node, value, - ITS_WHITESPACE_PRESERVE, !no_escape); + ITS_WHITESPACE_PRESERVE, + do_escape_during_extract); value = its_value_list_get_value (values, "textPointer"); if (value) msgid = _its_get_content (context->rules, node, value, - ITS_WHITESPACE_PRESERVE, !no_escape); + ITS_WHITESPACE_PRESERVE, + do_escape_during_extract); its_value_list_destroy (values); free (values); if (msgid == NULL) - msgid = _its_collect_text_content (node, whitespace, !no_escape); + msgid = _its_collect_text_content (node, whitespace, + do_escape_during_extract); if (*msgid != '\0') { message_ty *mp; @@ -1994,7 +2111,50 @@ its_merge_context_merge_node (struct its_merge_context_ty *context, translated = xmlNewNode (node->ns, node->name); xmlSetProp (translated, BAD_CAST "xml:lang", BAD_CAST language); - xmlNodeAddContent (translated, BAD_CAST mp->msgstr); + /* libxml2 offers two functions for setting the content of an + element: xmlNodeSetContent and xmlNodeAddContent. They differ + in the amount of escaping they do: + - xmlNodeSetContent does no escaping, at the risk of creating + malformed XML. + - xmlNodeAddContent escapes all of & < >, which always produces + well-formed XML but is not the right thing for entity + references. 
+             We need a middle ground between both, that is adapted to what
+             translators will usually produce.
+
+               translated       | no escaping | middle-ground | full escaping
+                                | SetContent  |               | AddContent
+               -----------------+-------------+---------------+--------------
+               &                | &           | &             | &amp;
+               &quot;           | &quot;      | &quot;        | &amp;quot;
+               &amp;            | &amp;       | &amp;         | &amp;amp;
+               <                | <           | &lt;          | &lt;
+               >                | >           | &gt;          | &gt;
+               &lt;             | &lt;        | &lt;          | &amp;lt;
+               &gt;             | &gt;        | &gt;          | &amp;gt;
+               &#xa9;           | &#xa9;      | &amp;#xa9;    | &amp;#xa9;
+               &copy;           | &copy;      | &copy;        | &amp;copy;
+               -----------------+-------------+---------------+--------------
+
+             The function _its_encode_special_chars_for_merge implements
+             this middle-ground. But we allow full escaping to be requested
+             through a gt:escape="yes" attribute. */
+
+          if (do_escape_during_merge)
+            {
+              /* These three are equivalent:
+                 xmlNodeAddContent (translated, BAD_CAST mp->msgstr);
+                 xmlNodeSetContent (translated, xmlEncodeEntitiesReentrant (context->doc, BAD_CAST mp->msgstr));
+                 xmlNodeSetContent (translated, xmlEncodeSpecialChars (context->doc, BAD_CAST mp->msgstr)); */
+              xmlNodeAddContent (translated, BAD_CAST mp->msgstr);
+            }
+          else
+            {
+              char *middle_ground = _its_encode_special_chars_for_merge (mp->msgstr);
+              xmlNodeSetContent (translated, BAD_CAST middle_ground);
+              free (middle_ground);
+            }
+
           xmlAddNextSibling (node, translated);
         }
     }
diff --git a/gettext-tools/tests/Makefile.am b/gettext-tools/tests/Makefile.am
index 0bbd1c022..454160141 100644
--- a/gettext-tools/tests/Makefile.am
+++ b/gettext-tools/tests/Makefile.am
@@ -84,7 +84,7 @@ TESTS = gettext-1 gettext-2 \
 	xgettext-13 xgettext-14 xgettext-15 xgettext-16 xgettext-17 \
 	xgettext-18 \
 	xgettext-combine-1 xgettext-combine-2 xgettext-combine-3 \
-	xgettext-appdata-1 \
+	xgettext-appdata-1 xgettext-appdata-2 \
 	xgettext-awk-1 xgettext-awk-2 xgettext-awk-3 \
 	xgettext-awk-stackovfl-1 xgettext-awk-stackovfl-2 \
 	xgettext-c-2 xgettext-c-3 xgettext-c-4 xgettext-c-5 xgettext-c-6 \
diff --git a/gettext-tools/tests/msgfmt-xml-1 b/gettext-tools/tests/msgfmt-xml-1
index c7de103c7..856c030cf 100755
---
a/gettext-tools/tests/msgfmt-xml-1 +++ b/gettext-tools/tests/msgfmt-xml-1 @@ -5,7 +5,12 @@ cat <<\EOF > mf.appdata.xml - + + + +]> + org.gnome.Characters.desktop GNOME Characters Character map application @@ -20,6 +25,15 @@ cat <<\EOF > mf.appdata.xml You can also browse characters by categories, such as Punctuation, Pictures, etc.

+

+ Did you know that the copyright sign (©, U+00A9) can be written in HTML + as &#xa9;, + as &#169;, + or as &copy;? +

+

Written by &author1;, &author2;, and &author3;.

+

Escape gallery: operator x&y, standard XML entities & " ' & < >, character reference ©, escaped character reference &#xa9;, entity references © &author1;

+

Escape gallery: operator x&y, standard XML entities & " ' & < >, character reference ©, escaped character reference &#xa9;, entity references © &author1;

https://wiki.gnome.org/Design/Apps/CharacterMap dueno_at_src.gnome.org @@ -61,11 +75,34 @@ msgid "" msgstr "" "Vous pouvez aussi naviguer dans les caractères par catégories, comme par " "Ponctuation, Images, etc." + +msgid "" +"Did you know that the copyright sign (©, U+00A9) can be written in HTML as " +"©, as ©, or as ©?" +msgstr "" +"Saviez-vous que le signe de copyright (©, U+00A9) peut être écrit en HTML " +"comme ©, comme © ou comme © ?" + +msgid "Written by &author1;, &author2;, and &author3;." +msgstr "Écrit par &author1;, &author2;, et &author3;." + +msgid "" +"Escape gallery: operator x&y, standard XML entities & \" ' & < >, character " +"reference ©, escaped character reference ©, entity references © " +"&author1;" +msgstr "" +"Exposition d'échappements: operateur x&y, entités XML standard & \" ' & < >, " +"caractère ©, caractère échappé ©, entités © &author1;" EOF cat <<\EOF > mf.appdata.xml.ok - + + + +]> + org.gnome.Characters.desktop GNOME Characters Character map application @@ -82,6 +119,19 @@ cat <<\EOF > mf.appdata.xml.ok Punctuation, Pictures, etc.

Vous pouvez aussi naviguer dans les caractères par catégories, comme par Ponctuation, Images, etc.

+

+ Did you know that the copyright sign (©, U+00A9) can be written in HTML + as &#xa9;, + as &#169;, + or as &copy;? +

+

Saviez-vous que le signe de copyright (©, U+00A9) peut être écrit en HTML comme &#xa9;, comme &#169; ou comme &copy; ?

+

Written by &author1;, &author2;, and &author3;.

+

Écrit par &author1;, &author2;, et &author3;.

+

Escape gallery: operator x&y, standard XML entities & " ' & < >, character reference ©, escaped character reference &#xa9;, entity references © &author1;

+

Exposition d'échappements: operateur x&y, entités XML standard & " ' & < >, caractère ©, caractère échappé &#xa9;, entités © &author1;

+

Escape gallery: operator x&y, standard XML entities & " ' & < >, character reference ©, escaped character reference &#xa9;, entity references © &author1;

+

Exposition d'échappements: operateur x&y, entités XML standard & " ' & < >, caractère ©, caractère échappé &#xa9;, entités &copy; &author1;

https://wiki.gnome.org/Design/Apps/CharacterMap dueno_at_src.gnome.org diff --git a/gettext-tools/tests/msgfmt-xml-2 b/gettext-tools/tests/msgfmt-xml-2 index f8d51f164..10e136f94 100755 --- a/gettext-tools/tests/msgfmt-xml-2 +++ b/gettext-tools/tests/msgfmt-xml-2 @@ -5,7 +5,12 @@ cat <<\EOF > mf.appdata.xml - + + + +]> + org.gnome.Characters.desktop GNOME Characters Character map application @@ -20,6 +25,15 @@ cat <<\EOF > mf.appdata.xml You can also browse characters by categories, such as Punctuation, Pictures, etc.

+

+ Did you know that the copyright sign (©, U+00A9) can be written in HTML + as &#xa9;, + as &#169;, + or as &copy;? +

+

Written by &author1;, &author2;, and &author3;.

+

Escape gallery: operator x&y, standard XML entities & " ' & < >, character reference ©, escaped character reference &#xa9;, entity references © &author1;

+

Escape gallery: operator x&y, standard XML entities & " ' & < >, character reference ©, escaped character reference &#xa9;, entity references © &author1;

https://wiki.gnome.org/Design/Apps/CharacterMap dueno_at_src.gnome.org @@ -63,6 +77,24 @@ msgid "" msgstr "" "Vous pouvez aussi naviguer dans les caractères par catégories, comme par " "Ponctuation, Images, etc." + +msgid "" +"Did you know that the copyright sign (©, U+00A9) can be written in HTML as " +"©, as ©, or as ©?" +msgstr "" +"Saviez-vous que le signe de copyright (©, U+00A9) peut être écrit en HTML " +"comme ©, comme © ou comme © ?" + +msgid "Written by &author1;, &author2;, and &author3;." +msgstr "Écrit par &author1;, &author2;, et &author3;." + +msgid "" +"Escape gallery: operator x&y, standard XML entities & \" ' & < >, character " +"reference ©, escaped character reference ©, entity references © " +"&author1;" +msgstr "" +"Exposition d'échappements: operateur x&y, entités XML standard & \" ' & < >, " +"caractère ©, caractère échappé ©, entités © &author1;" EOF cat <<\EOF > po/de.po @@ -100,11 +132,35 @@ msgid "" msgstr "" "Sie können ebenfalls nach Kategorie suchen, wie z.B. nach Zeichensetzung oder " "Bildern." + +msgid "" +"Did you know that the copyright sign (©, U+00A9) can be written in HTML as " +"©, as ©, or as ©?" +msgstr "" +"Wussten Sie, dass das Copyright-Zeichen (©, U+00A9) in HTML als " +"©, als ©, oder als © " +"geschrieben werden kann?" + +msgid "Written by &author1;, &author2;, and &author3;." +msgstr "Geschrieben von &author1;, &author2; und &author3;." + +msgid "" +"Escape gallery: operator x&y, standard XML entities & \" ' & < >, character " +"reference ©, escaped character reference ©, entity references © " +"&author1;" +msgstr "" +"Escape-Beispiele: Operator x&y, Standard-XML Entitäten & \" ' & < >, Zeichen " +"©, escaptes Zeichen ©, Entitäten © &author1;" EOF cat <<\EOF > mf.appdata.xml.ok - + + + +]> + org.gnome.Characters.desktop GNOME Characters Character map application @@ -123,6 +179,23 @@ cat <<\EOF > mf.appdata.xml.ok

Vous pouvez aussi naviguer dans les caractères par catégories, comme par Ponctuation, Images, etc.

Sie können ebenfalls nach Kategorie suchen, wie z.B. nach Zeichensetzung oder Bildern.

+

+ Did you know that the copyright sign (©, U+00A9) can be written in HTML + as &#xa9;, + as &#169;, + or as &copy;? +

+

Saviez-vous que le signe de copyright (©, U+00A9) peut être écrit en HTML comme &#xa9;, comme &#169; ou comme &copy; ?

+

Wussten Sie, dass das Copyright-Zeichen (©, U+00A9) in HTML als &#xa9;, als &#169;, oder als &copy; geschrieben werden kann?

+

Written by &author1;, &author2;, and &author3;.

+

Écrit par &author1;, &author2;, et &author3;.

+

Geschrieben von &author1;, &author2; und &author3;.

+

Escape gallery: operator x&y, standard XML entities & " ' & < >, character reference ©, escaped character reference &#xa9;, entity references © &author1;

+

Exposition d'échappements: operateur x&y, entités XML standard & " ' & < >, caractère ©, caractère échappé &#xa9;, entités © &author1;

+

Escape-Beispiele: Operator x&y, Standard-XML Entitäten & " ' & < >, Zeichen ©, escaptes Zeichen &#xa9;, Entitäten © &author1;

+

Escape gallery: operator x&y, standard XML entities & " ' & < >, character reference ©, escaped character reference &#xa9;, entity references © &author1;

+

Exposition d'échappements: operateur x&y, entités XML standard & " ' & < >, caractère ©, caractère échappé &#xa9;, entités &copy; &author1;

+

Escape-Beispiele: Operator x&y, Standard-XML Entitäten & " ' & < >, Zeichen ©, escaptes Zeichen &#xa9;, Entitäten &copy; &author1;

https://wiki.gnome.org/Design/Apps/CharacterMap dueno_at_src.gnome.org @@ -131,7 +204,12 @@ EOF cat <<\EOF > mf.appdata.xml.desired.ok - + + + +]> + org.gnome.Characters.desktop GNOME Characters Character map application @@ -148,6 +226,19 @@ cat <<\EOF > mf.appdata.xml.desired.ok Punctuation, Pictures, etc.

Vous pouvez aussi naviguer dans les caractères par catégories, comme par Ponctuation, Images, etc.

+

+ Did you know that the copyright sign (©, U+00A9) can be written in HTML + as &#xa9;, + as &#169;, + or as &copy;? +

+

Saviez-vous que le signe de copyright (©, U+00A9) peut être écrit en HTML comme &#xa9;, comme &#169; ou comme &copy; ?

+

Written by &author1;, &author2;, and &author3;.

+

Écrit par &author1;, &author2;, et &author3;.

+

Escape gallery: operator x&y, standard XML entities & " ' & < >, character reference ©, escaped character reference &#xa9;, entity references © &author1;

+

Exposition d'échappements: operateur x&y, entités XML standard & " ' & < >, caractère ©, caractère échappé &#xa9;, entités © &author1;

+

Escape gallery: operator x&y, standard XML entities & " ' & < >, character reference ©, escaped character reference &#xa9;, entity references © &author1;

+

Exposition d'échappements: operateur x&y, entités XML standard & " ' & < >, caractère ©, caractère échappé &#xa9;, entités &copy; &author1;

https://wiki.gnome.org/Design/Apps/CharacterMap dueno_at_src.gnome.org diff --git a/gettext-tools/tests/xgettext-appdata-1 b/gettext-tools/tests/xgettext-appdata-1 index 7f68a5227..3c1ea5fff 100755 --- a/gettext-tools/tests/xgettext-appdata-1 +++ b/gettext-tools/tests/xgettext-appdata-1 @@ -1,7 +1,7 @@ #!/bin/sh . "${srcdir=.}/init.sh"; path_prepend_ . ../src -# Test of AppData support. +# Test of AppData support: HTML markup. cat <<\EOF > xg-gs-1-empty.appdata.xml diff --git a/gettext-tools/tests/xgettext-appdata-2 b/gettext-tools/tests/xgettext-appdata-2 new file mode 100644 index 000000000..980c4a45d --- /dev/null +++ b/gettext-tools/tests/xgettext-appdata-2 @@ -0,0 +1,121 @@ +#!/bin/sh +. "${srcdir=.}/init.sh"; path_prepend_ . ../src + +# Test of AppData support: escaping of XML entities. + +cat <<\EOF > xg-gs-2-empty.appdata.xml + + +EOF + +: ${XGETTEXT=xgettext} +${XGETTEXT} -o xg-gs-2.pot xg-gs-2-empty.appdata.xml 2>/dev/null +test $? = 0 || { + echo "Skipping test: xgettext was built without AppData support" + Exit 77 +} + +cat <<\EOF > xg-gs-2.appdata.xml + + + + +]> + + org.gnome.Characters.desktop + GNOME Characters + Character map application + CC0 + +

+ Characters is a simple utility application to find and insert + unusual characters. It allows you to quickly find the character + you are looking for by searching for keywords. +

+

+ You can also browse characters by categories, such as + Punctuation, Pictures, etc. +

+

+ Did you know that the copyright sign (©, U+00A9) can be written in HTML + as &#xa9;, + as &#169;, + or as &copy;? +

+

Written by &author1;, &author2;, and &author3;.

+

Escape gallery: operator x&y, standard XML entities & " ' & < >, character reference ©, escaped character reference &#xa9;, entity references © &author1;, escaped entity reference &copy;

+

Escape gallery: operator x&y, standard XML entities & " ' & < >, character reference ©, escaped character reference &#xa9;, entity references © &author1;, escaped entity reference &copy;

+
+ https://wiki.gnome.org/Design/Apps/CharacterMap + dueno_at_src.gnome.org +
+EOF + +: ${XGETTEXT=xgettext} +${XGETTEXT} --add-comments -o xg-gs-2.tmp xg-gs-2.appdata.xml || Exit 1 +func_filter_POT_Creation_Date xg-gs-2.tmp xg-gs-2.pot + +cat <<\EOF > xg-gs-2.ok +# SOME DESCRIPTIVE TITLE. +# Copyright (C) YEAR THE PACKAGE'S COPYRIGHT HOLDER +# This file is distributed under the same license as the PACKAGE package. +# FIRST AUTHOR , YEAR. +# +#, fuzzy +msgid "" +msgstr "" +"Project-Id-Version: PACKAGE VERSION\n" +"Report-Msgid-Bugs-To: \n" +"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n" +"Last-Translator: FULL NAME \n" +"Language-Team: LANGUAGE \n" +"Language: \n" +"MIME-Version: 1.0\n" +"Content-Type: text/plain; charset=UTF-8\n" +"Content-Transfer-Encoding: 8bit\n" + +#: xg-gs-2.appdata.xml:9 +msgid "GNOME Characters" +msgstr "" + +#: xg-gs-2.appdata.xml:10 +msgid "Character map application" +msgstr "" + +#: xg-gs-2.appdata.xml:13 +msgid "" +"Characters is a simple utility application to find and insert unusual " +"characters. It allows you to quickly find the character you are looking for " +"by searching for keywords." +msgstr "" + +#: xg-gs-2.appdata.xml:18 +msgid "" +"You can also browse characters by categories, such as Punctuation, Pictures, " +"etc." +msgstr "" + +#: xg-gs-2.appdata.xml:22 +msgid "" +"Did you know that the copyright sign (©, U+00A9) can be written in HTML as " +"©, as ©, or as ©?" +msgstr "" + +#: xg-gs-2.appdata.xml:28 +msgid "Written by &author1;, &author2;, and &author3;." +msgstr "" + +#: xg-gs-2.appdata.xml:29 xg-gs-2.appdata.xml:30 +msgid "" +"Escape gallery: operator x&y, standard XML entities & \" ' & < >, character " +"reference ©, escaped character reference ©, entity references © " +"&author1;, escaped entity reference ©" +msgstr "" +EOF + +: ${DIFF=diff} +${DIFF} xg-gs-2.ok xg-gs-2.pot +result=$? 
+ +exit $result diff --git a/gettext-tools/tests/xgettext-its-1 b/gettext-tools/tests/xgettext-its-1 index 22e9163ec..523dee490 100755 --- a/gettext-tools/tests/xgettext-its-1 +++ b/gettext-tools/tests/xgettext-its-1 @@ -201,7 +201,7 @@ EOF cat <<\EOF >messages.ok #. (itstool) path: message/p #: messages.xml:8 -msgid "This is a test message &foo;><&\"\"" +msgid "This is a test message &foo;><&\"\"" msgstr "" #. (itstool) path: message/p @@ -214,15 +214,15 @@ msgstr "" #: messages.xml:17 #, no-wrap msgid "" -" $ echo ' ' >> /dev/null\n" -" $ cat < /dev/yes\n" -" $ sleep 10 &\n" +" $ echo ' ' >> /dev/null\n" +" $ cat < /dev/yes\n" +" $ sleep 10 &\n" msgstr "" #. This is a comment #. (itstool) path: messages/message@comment #: messages.xml:22 -msgid "This is a comment <>&"" +msgid "This is a comment <>&\"" msgstr "" #. (itstool) path: message/p
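
Illustration (not part of the patch): the "middle-ground" escaping that the its.c hunk above applies during the msgfmt merge can be tried in isolation with the sketch below. It is a standalone approximation, not the code from the commit: the names starts_with_charref and encode_for_merge are hypothetical, plain malloc stands in for gnulib's XNMALLOC, and the entity strings are reconstructed on the assumption that the page extraction decoded them. The rule it follows is the one the commit describes: '<' and '>' are always escaped, '&' is escaped only when it begins a numeric character reference, and entity references such as &copy; pass through untouched.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Returns true if S starts with an XML numeric character reference:
   CharRef ::= '&#' [0-9]+ ';' | '&#x' [0-9a-fA-F]+ ';'
   (hypothetical helper name, modelled on the function in the patch).  */
static bool
starts_with_charref (const char *s)
{
  if (s[0] != '&' || s[1] != '#')
    return false;
  s += 2;
  if (*s == 'x')
    {
      s++;
      /* At least one hexadecimal digit is required.  */
      if (!((*s >= '0' && *s <= '9')
            || (*s >= 'A' && *s <= 'F')
            || (*s >= 'a' && *s <= 'f')))
        return false;
      while ((*s >= '0' && *s <= '9')
             || (*s >= 'A' && *s <= 'F')
             || (*s >= 'a' && *s <= 'f'))
        s++;
    }
  else
    {
      /* At least one decimal digit is required.  */
      if (!(*s >= '0' && *s <= '9'))
        return false;
      while (*s >= '0' && *s <= '9')
        s++;
    }
  return *s == ';';
}

/* Returns a freshly allocated copy of CONTENT with the middle-ground
   escaping applied, or NULL if memory allocation fails.
   The caller frees the result.  */
static char *
encode_for_merge (const char *content)
{
  /* Worst case: every byte expands to "&amp;", that is, 5 bytes.  */
  char *result = malloc (5 * strlen (content) + 1);
  char *p = result;

  if (result == NULL)
    return NULL;
  for (const char *s = content; *s != '\0'; s++)
    {
      const char *replacement = NULL;

      if (*s == '&' && starts_with_charref (s))
        replacement = "&amp;";
      else if (*s == '<')
        replacement = "&lt;";
      else if (*s == '>')
        replacement = "&gt;";

      if (replacement != NULL)
        {
          size_t n = strlen (replacement);
          memcpy (p, replacement, n);
          p += n;
        }
      else
        *p++ = *s;
    }
  *p = '\0';
  return result;
}
```

Under these assumptions, encode_for_merge ("&#xa9;") yields "&amp;#xa9;", while "x&y" and "&copy;" come back unchanged; this corresponds to the middle-ground column of the table in the its.c comment. Full escaping, as requested by gt:escape="yes", is left to xmlNodeAddContent in the patch and is not reproduced here.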