390955cb 1@node Locales, Message Translation, Character Set Handling, Top
7a68c94a 2@c %MENU% The country and language can affect the behavior of library functions
3@chapter Locales and Internationalization
4
5Different countries and cultures have varying conventions for how to
6communicate. These conventions range from very simple ones, such as the
7format for representing dates and times, to very complex ones, such as
8the language spoken.
9
10@cindex internationalization
11@cindex locales
12@dfn{Internationalization} of software means programming it to be able
f65fd747 13to adapt to the user's favorite conventions. In @w{ISO C},
14internationalization works by means of @dfn{locales}. Each locale
15specifies a collection of conventions, one convention for each purpose.
16The user chooses a set of conventions by specifying a locale (via
17environment variables).
18
19All programs inherit the chosen locale as part of their environment.
20Provided the programs are written to obey the choice of locale, they
21will follow the conventions preferred by the user.
22
23@menu
24* Effects of Locale:: Actions affected by the choice of
f65fd747 25 locale.
26* Choosing Locale:: How the user specifies a locale.
27* Locale Categories:: Different purposes for which you can
f65fd747 28 select a locale.
28f540f4 29* Setting the Locale:: How a program specifies the locale
f65fd747 30 with library functions.
28f540f4 31* Standard Locales:: Locale names available on all systems.
58536726 32* Locale Names:: Format of system-specific locale names.
85c165be 33* Locale Information:: How to access the information for the locale.
5e0889da 34* Formatting Numbers:: A dedicated function to format numbers.
e8ec0694 35* Yes-or-No Questions:: Check a Response against the locale.
36@end menu
37
38@node Effects of Locale, Choosing Locale, , Locales
39@section What Effects a Locale Has
40
41Each locale specifies conventions for several purposes, including the
42following:
43
44@itemize @bullet
45@item
46What multibyte character sequences are valid, and how they are
390955cb 47interpreted (@pxref{Character Set Handling}).
48
49@item
50Classification of which characters in the local character set are
51considered alphabetic, and upper- and lower-case conversion conventions
52(@pxref{Character Handling}).
53
54@item
55The collating sequence for the local language and character set
56(@pxref{Collation Functions}).
57
58@item
85c165be 59Formatting of numbers and currency amounts (@pxref{General Numeric}).
60
61@item
99a20616 62Formatting of dates and times (@pxref{Formatting Calendar Time}).
63
64@item
65What language to use for output, including error messages
66(@pxref{Message Translation}).
67
68@item
69What language to use for user answers to yes-or-no questions
70(@pxref{Yes-or-No Questions}).
71
72@item
73What language to use for more complex user input.
74(The C library doesn't yet help you implement this.)
75@end itemize
76
77Some aspects of adapting to the specified locale are handled
78automatically by the library subroutines. For example, all your program
79needs to do in order to use the collating sequence of the chosen locale
80is to use @code{strcoll} or @code{strxfrm} to compare strings.
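
As a minimal sketch of this, the following hypothetical helper sorts an
array of strings with @code{qsort}, using @code{strcoll} as the
comparison.  The names @code{compare_collated} and @code{sort_words}
are illustrative assumptions, not part of the library.  The resulting
order follows the current @code{LC_COLLATE} locale, so a program would
normally call @code{setlocale} first; in the default @samp{C} locale
the order is the same as with @code{strcmp}.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Comparison callback for qsort: compare two strings using the
   collating sequence of the current LC_COLLATE locale.  */
static int
compare_collated (const void *a, const void *b)
{
  const char *const *sa = a;
  const char *const *sb = b;
  return strcoll (*sa, *sb);
}

/* Sort WORDS, an array of COUNT strings, in locale order.
   (Hypothetical helper name, used only for this illustration.)  */
void
sort_words (const char **words, size_t count)
{
  assert (words != NULL || count == 0);
  qsort (words, count, sizeof words[0], compare_collated);
}
```

When the same strings are compared many times, transforming them once
with @code{strxfrm} and comparing the results with @code{strcmp} may be
cheaper than repeated calls to @code{strcoll}.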
81
82Other aspects of locales are beyond the comprehension of the library.
83For example, the library can't automatically translate your program's
84output messages into other languages. The only way you can support
85output in the user's favorite language is to program this more or less
86by hand. The C library provides functions to handle translations for
87multiple languages easily.
88
89This chapter discusses the mechanism by which you can modify the current
90locale. The effects of the current locale on specific library functions
91are discussed in more detail in the descriptions of those functions.
92
93@node Choosing Locale, Locale Categories, Effects of Locale, Locales
94@section Choosing a Locale
95
96The simplest way for the user to choose a locale is to set the
97environment variable @code{LANG}. This specifies a single locale to use
98for all purposes. For example, a user could specify a hypothetical
99locale named @samp{espana-castellano} to use the standard conventions of
100most of Spain.
101
The set of locales supported depends on the operating system you are
using, and so do their names, except that the standard locale called
@samp{C} or @samp{POSIX} always exists.  @xref{Locale Names}.
105
106In order to force the system to always use the default locale, the
107user can set the @code{LC_ALL} environment variable to @samp{C}.
108
109@cindex combining locales
110A user also has the option of specifying different locales for
111different purposes---in effect, choosing a mixture of multiple
112locales. @xref{Locale Categories}.
113
114For example, the user might specify the locale @samp{espana-castellano}
115for most purposes, but specify the locale @samp{usa-english} for
116currency formatting. This might make sense if the user is a
117Spanish-speaking American, working in Spanish, but representing monetary
118amounts in US dollars.
119
120Note that both locales @samp{espana-castellano} and @samp{usa-english},
121like all locales, would include conventions for all of the purposes to
122which locales apply. However, the user can choose to use each locale
123for a particular subset of those purposes.
124
125@node Locale Categories, Setting the Locale, Choosing Locale, Locales
58536726 126@section Locale Categories
127@cindex categories for locales
128@cindex locale categories
129
130The purposes that locales serve are grouped into @dfn{categories}, so
131that a user or a program can choose the locale for each category
132independently. Here is a table of categories; each name is both an
133environment variable that a user can set, and a macro name that you can
134use as the first argument to @code{setlocale}.
135
The contents of the environment variable (or the string in the second
argument to @code{setlocale}) must be a valid locale name.
@xref{Locale Names}.
28f540f4 139
85c165be 140@vtable @code
28f540f4 141@item LC_COLLATE
d08a7e4c 142@standards{ISO, locale.h}
143This category applies to collation of strings (functions @code{strcoll}
144and @code{strxfrm}); see @ref{Collation Functions}.
145
28f540f4 146@item LC_CTYPE
d08a7e4c 147@standards{ISO, locale.h}
148This category applies to classification and conversion of characters,
149and to multibyte and wide characters;
390955cb 150see @ref{Character Handling}, and @ref{Character Set Handling}.
28f540f4 151
28f540f4 152@item LC_MONETARY
d08a7e4c 153@standards{ISO, locale.h}
85c165be 154This category applies to formatting monetary values; see @ref{General Numeric}.
28f540f4 155
28f540f4 156@item LC_NUMERIC
d08a7e4c 157@standards{ISO, locale.h}
28f540f4 158This category applies to formatting numeric values that are not
85c165be 159monetary; see @ref{General Numeric}.
28f540f4 160
28f540f4 161@item LC_TIME
d08a7e4c 162@standards{ISO, locale.h}
28f540f4 163This category applies to formatting date and time values; see
99a20616 164@ref{Formatting Calendar Time}.
28f540f4 165
f65fd747 166@item LC_MESSAGES
d08a7e4c 167@standards{XOPEN, locale.h}
85c165be 168This category applies to selecting the language used in the user
8b7fb588 169interface for message translation (@pxref{The Uniforum approach};
170@pxref{Message catalogs a la X/Open}) and contains regular expressions
171for affirmative and negative responses.
28f540f4 172
28f540f4 173@item LC_ALL
d08a7e4c 174@standards{ISO, locale.h}
This is not a category; it is only a macro that you can use
with @code{setlocale} to set a single locale for all purposes.  Setting
this environment variable overrides all selections by the other
@code{LC_*} variables or @code{LANG}.
28f540f4 179
28f540f4 180@item LANG
d08a7e4c 181@standards{ISO, locale.h}
182If this environment variable is defined, its value specifies the locale
183to use for all purposes except as overridden by the variables above.
184@end vtable
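
The precedence among these variables can be sketched as follows.  This
is only an illustration of the lookup order described in the table, not
the library's implementation; @code{setlocale} performs the lookup
itself when given an empty locale string.  The helper names are
hypothetical.  The sketch assumes that an empty variable is treated as
unset and that @samp{C} is the fallback when nothing is set, and it does
not consider @code{LANGUAGE}.

```c
#include <assert.h>
#include <stdlib.h>

/* Return the value of environment variable NAME, or NULL if it is
   unset or empty.  */
static const char *
nonempty_env (const char *name)
{
  const char *value = getenv (name);
  return (value != NULL && value[0] != '\0') ? value : NULL;
}

/* Return the locale name the environment selects for the category
   whose variable is CATEGORY_VAR (for example "LC_TIME"):
   LC_ALL first, then the category variable, then LANG.
   Falling back to "C" is an assumption of this sketch.  */
const char *
locale_name_from_environment (const char *category_var)
{
  const char *name;

  assert (category_var != NULL);
  name = nonempty_env ("LC_ALL");
  if (name == NULL)
    name = nonempty_env (category_var);
  if (name == NULL)
    name = nonempty_env ("LANG");
  return name != NULL ? name : "C";
}
```

For example, with @code{LANG} set to @samp{en_US.UTF-8} and
@code{LC_TIME} set to @samp{de_DE.UTF-8}, the sketch yields the German
locale for @code{"LC_TIME"} and the U.S. locale for every other
category variable.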
185
186@vindex LANGUAGE
When the message translation functions were developed, the
functionality provided by the variables above was felt to be
insufficient.  For example, it should be possible to specify more than
one locale name.  Take a Swedish user who speaks German better than
English, and a program whose messages are output in English by default.
It should be possible to specify that the first choice of language is
Swedish, the second German, and that English is used if neither is
available.  This is possible with the variable @code{LANGUAGE}.  For
further description of this GNU extension see
@ref{Using gettextized software}.
196
197@node Setting the Locale, Standard Locales, Locale Categories, Locales
198@section How Programs Set the Locale
199
200A C program inherits its locale environment variables when it starts up.
201This happens automatically. However, these variables do not
202automatically control the locale used by the library functions, because
f65fd747 203@w{ISO C} says that all programs start by default in the standard @samp{C}
204locale. To use the locales specified by the environment, you must call
205@code{setlocale}. Call it as follows:
206
207@smallexample
208setlocale (LC_ALL, "");
209@end smallexample
210
211@noindent
212to select a locale based on the user choice of the appropriate
213environment variables.
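
Since @code{setlocale} returns a null pointer when the requested locale
is not available, it can be worth checking the result of this call.
The following is a minimal sketch under that assumption; the function
name @code{init_locale_from_environment} and the wording of the warning
are hypothetical.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>

/* Adopt the locale selected by the environment.  Return the name of
   the locale in effect afterwards.  If the environment names a locale
   that is not available, the current locale is left unchanged and its
   name is returned instead.  */
const char *
init_locale_from_environment (void)
{
  const char *name = setlocale (LC_ALL, "");

  if (name == NULL)
    {
      fputs ("warning: locale requested by the environment is not "
             "available; keeping the current locale\n", stderr);
      /* A null second argument only queries the current locale.  */
      name = setlocale (LC_ALL, NULL);
    }
  assert (name != NULL);
  return name;
}
```

The returned string may be overwritten by later calls to
@code{setlocale}, so a caller that wants to keep it should copy it.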
214
215@cindex changing the locale
216@cindex locale, changing
217You can also use @code{setlocale} to specify a particular locale, for
218general use or for a specific category.
219
220@pindex locale.h
221The symbols in this section are defined in the header file @file{locale.h}.
222
28f540f4 223@deftypefun {char *} setlocale (int @var{category}, const char *@var{locale})
d08a7e4c 224@standards{ISO, locale.h}
225@safety{@prelim{}@mtunsafe{@mtasuconst{:@mtslocale{}} @mtsenv{}}@asunsafe{@asuinit{} @asulock{} @ascuheap{} @asucorrupt{}}@acunsafe{@acuinit{} @acucorrupt{} @aculock{} @acsmem{} @acsfd{}}}
226@c Uses of the global locale object are unguarded in functions that
227@c ought to be MT-Safe, so we're ruling out the use of this function
228@c once threads are started. It takes a write lock itself, but it may
229@c return a pointer loaded from the global locale object after releasing
230@c the lock, or before taking it.
231@c setlocale @mtasuconst:@mtslocale @mtsenv @asuinit @ascuheap @asulock @asucorrupt @acucorrupt @acsmem @acsfd @aculock
232@c libc_rwlock_wrlock @asulock @aculock
233@c libc_rwlock_unlock @aculock
234@c getenv LOCPATH @mtsenv
235@c malloc @ascuheap @acsmem
236@c free @ascuheap @acsmem
237@c new_composite_name ok
238@c setdata ok
239@c setname ok
240@c _nl_find_locale @mtsenv @asuinit @ascuheap @asulock @asucorrupt @acucorrupt @acsmem @acsfd @aculock
241@c getenv LC_ALL and LANG @mtsenv
242@c _nl_load_locale_from_archive @ascuheap @acucorrupt @acsmem @acsfd
243@c sysconf _SC_PAGE_SIZE ok
244@c _nl_normalize_codeset @ascuheap @acsmem
245@c isalnum_l ok (C locale)
246@c isdigit_l ok (C locale)
247@c malloc @ascuheap @acsmem
248@c tolower_l ok (C locale)
249@c open_not_cancel_2 @acsfd
250@c fxstat64 ok
251@c close_not_cancel_no_status ok
252@c __mmap64 @acsmem
253@c calculate_head_size ok
254@c __munmap ok
255@c compute_hashval ok
256@c qsort dup @acucorrupt
257@c rangecmp ok
258@c malloc @ascuheap @acsmem
259@c strdup @ascuheap @acsmem
260@c _nl_intern_locale_data @ascuheap @acsmem
261@c malloc @ascuheap @acsmem
262@c free @ascuheap @acsmem
263@c _nl_expand_alias @ascuheap @asulock @acsmem @acsfd @aculock
264@c libc_lock_lock @asulock @aculock
265@c bsearch ok
266@c alias_compare ok
267@c strcasecmp ok
268@c read_alias_file @ascuheap @asulock @acsmem @acsfd @aculock
269@c fopen @ascuheap @asulock @acsmem @acsfd @aculock
270@c fsetlocking ok
271@c feof_unlocked ok
272@c fgets_unlocked ok
273@c isspace ok (locale mutex is locked)
274@c extend_alias_table @ascuheap @acsmem
275@c realloc @ascuheap @acsmem
276@c realloc @ascuheap @acsmem
277@c fclose @ascuheap @asulock @acsmem @acsfd @aculock
278@c qsort @ascuheap @acsmem
279@c alias_compare dup
280@c libc_lock_unlock @aculock
281@c _nl_explode_name @ascuheap @acsmem
282@c _nl_find_language ok
283@c _nl_normalize_codeset dup @ascuheap @acsmem
284@c _nl_make_l10nflist @ascuheap @acsmem
285@c malloc @ascuheap @acsmem
286@c free @ascuheap @acsmem
287@c __argz_stringify ok
288@c __argz_count ok
289@c __argz_next ok
290@c _nl_load_locale @ascuheap @acsmem @acsfd
291@c open_not_cancel_2 @acsfd
292@c __fxstat64 ok
293@c close_not_cancel_no_status ok
294@c mmap @acsmem
295@c malloc @ascuheap @acsmem
296@c read_not_cancel ok
297@c free @ascuheap @acsmem
298@c _nl_intern_locale_data dup @ascuheap @acsmem
299@c munmap ok
300@c __gconv_compare_alias @asuinit @ascuheap @asucorrupt @asulock @acsmem@acucorrupt @acsfd @aculock
301@c __gconv_read_conf @asuinit @ascuheap @asucorrupt @asulock @acsmem@acucorrupt @acsfd @aculock
302@c (libc_once-initializes gconv_cache and gconv_path_envvar; they're
303@c never modified afterwards)
304@c __gconv_load_cache @ascuheap @acsmem @acsfd
305@c getenv GCONV_PATH @mtsenv
306@c open_not_cancel @acsfd
307@c __fxstat64 ok
308@c close_not_cancel_no_status ok
309@c mmap @acsmem
310@c malloc @ascuheap @acsmem
311@c __read ok
312@c free @ascuheap @acsmem
313@c munmap ok
314@c __gconv_get_path @asulock @ascuheap @aculock @acsmem @acsfd
315@c getcwd @ascuheap @acsmem @acsfd
316@c libc_lock_lock @asulock @aculock
317@c malloc @ascuheap @acsmem
318@c strtok_r ok
319@c libc_lock_unlock @aculock
320@c read_conf_file @ascuheap @asucorrupt @asulock @acsmem @acucorrupt @acsfd @aculock
321@c fopen @ascuheap @asulock @acsmem @acsfd @aculock
322@c fsetlocking ok
323@c feof_unlocked ok
324@c getdelim @ascuheap @asucorrupt @acsmem @acucorrupt
325@c isspace_l ok (C locale)
326@c add_alias
327@c isspace_l ok (C locale)
328@c toupper_l ok (C locale)
329@c add_alias2 dup @ascuheap @acucorrupt @acsmem
330@c add_module @ascuheap @acsmem
331@c isspace_l ok (C locale)
332@c toupper_l ok (C locale)
333@c strtol ok (@mtslocale but we hold the locale lock)
334@c tfind __gconv_alias_db ok
335@c __gconv_alias_compare dup ok
336@c calloc @ascuheap @acsmem
337@c insert_module dup @ascuheap
338@c __tfind ok (because the tree is read only by then)
339@c __gconv_alias_compare dup ok
340@c insert_module @ascuheap
341@c free @ascuheap
342@c add_alias2 @ascuheap @acucorrupt @acsmem
343@c detect_conflict ok, reads __gconv_modules_db
344@c malloc @ascuheap @acsmem
345@c tsearch __gconv_alias_db @ascuheap @acucorrupt @acsmem [exclusive tree, no @mtsrace]
346@c __gconv_alias_compare ok
347@c free @ascuheap
348@c __gconv_compare_alias_cache ok
349@c find_module_idx ok
350@c do_lookup_alias ok
351@c __tfind ok (because the tree is read only by then)
352@c __gconv_alias_compare ok
353@c strndup @ascuheap @acsmem
354@c strcasecmp_l ok (C locale)
403cb8a1 355The function @code{setlocale} sets the current locale for category
58536726 356@var{category} to @var{locale}.
357
358If @var{category} is @code{LC_ALL}, this specifies the locale for all
777edcbd 359purposes. The other possible values of @var{category} specify a
6dd5b57e 360single purpose (@pxref{Locale Categories}).
361
362You can also use this function to find out the current locale by passing
363a null pointer as the @var{locale} argument. In this case,
364@code{setlocale} returns a string that is the name of the locale
365currently selected for category @var{category}.
366
367The string returned by @code{setlocale} can be overwritten by subsequent
368calls, so you should make a copy of the string (@pxref{Copying Strings
369and Arrays}) if you want to save it past any further calls to
370@code{setlocale}. (The standard library is guaranteed never to call
371@code{setlocale} itself.)
372
You should not modify the string returned by @code{setlocale}.  It might
be the same string that was passed as an argument in a previous call to
@code{setlocale}.  If you later pass a returned string back as the
@var{locale} argument, the @var{category} must be the same as in the
call that returned the string.
378
379When you read the current locale for category @code{LC_ALL}, the value
380encodes the entire combination of selected locales for all categories.
381If you specify the same ``locale name'' with @code{LC_ALL} in a
382subsequent call to @code{setlocale}, it restores the same combination
383of locale selections.
28f540f4 384
385To be sure you can use the returned string encoding the currently selected
386locale at a later time, you must make a copy of the string. It is not
387guaranteed that the returned pointer remains valid over time.
85c165be 388
28f540f4 389When the @var{locale} argument is not a null pointer, the string returned
6dd5b57e 390by @code{setlocale} reflects the newly-modified locale.
391
392If you specify an empty string for @var{locale}, this means to read the
393appropriate environment variable and use its value to select the locale
394for @var{category}.
395
396If a nonempty string is given for @var{locale}, then the locale of that
397name is used if possible.
85c165be 398
399The effective locale name (either the second argument to
400@code{setlocale}, or if the argument is an empty string, the name
777edcbd 401obtained from the process environment) must be a valid locale name.
402@xref{Locale Names}.
403
404If you specify an invalid locale name, @code{setlocale} returns a null
405pointer and leaves the current locale unchanged.
406@end deftypefun
407
408Here is an example showing how you might use @code{setlocale} to
409temporarily switch to a new locale.
410
411@smallexample
#include <stddef.h>
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void
with_other_locale (const char *new_locale,
                   void (*subroutine) (int),
                   int argument)
@{
  char *old_locale, *saved_locale;

  /* @r{Get the name of the current locale.}  */
  old_locale = setlocale (LC_ALL, NULL);

  /* @r{Copy the name so it won't be clobbered by @code{setlocale}.}  */
  saved_locale = strdup (old_locale);
  if (saved_locale == NULL)
    @{
      fputs ("Out of memory\n", stderr);
      exit (EXIT_FAILURE);
    @}

  /* @r{Now change the locale and do some stuff with it.}  */
  setlocale (LC_ALL, new_locale);
  (*subroutine) (argument);

  /* @r{Restore the original locale.}  */
  setlocale (LC_ALL, saved_locale);
  free (saved_locale);
@}
440@end smallexample
441
f65fd747 442@strong{Portability Note:} Some @w{ISO C} systems may define additional
6dd5b57e 443locale categories, and future versions of the library will do so. For
444portability, assume that any symbol beginning with @samp{LC_} might be
445defined in @file{locale.h}.
28f540f4 446
58536726 447@node Standard Locales, Locale Names, Setting the Locale, Locales
448@section Standard Locales
449
450The only locale names you can count on finding on all operating systems
451are these three standard ones:
452
453@table @code
454@item "C"
455This is the standard C locale. The attributes and behavior it provides
f65fd747 456are specified in the @w{ISO C} standard. When your program starts up, it
457initially uses this locale by default.
458
459@item "POSIX"
460This is the standard POSIX locale. Currently, it is an alias for the
461standard C locale.
462
463@item ""
464The empty name says to select a locale based on environment variables.
465@xref{Locale Categories}.
466@end table
467
468Defining and installing named locales is normally a responsibility of
469the system administrator at your site (or the person who installed
470@theglibc{}). It is also possible for the user to create private
85c165be 471locales. All this will be discussed later when describing the tool to
6dd5b57e 472do so.
85c165be 473@comment (@pxref{Building Locale Files}).
474
475If your program needs to use something other than the @samp{C} locale,
476it will be more portable if you use whatever locale the user specifies
477with the environment, rather than trying to specify some non-standard
478locale explicitly by name. Remember, different machines might have
479different sets of locales installed.
480
481@node Locale Names, Locale Information, Standard Locales, Locales
482@section Locale Names
483
484The following command prints a list of locales supported by the
485system:
486
487@pindex locale
488@smallexample
489 locale -a
490@end smallexample
491
492@strong{Portability Note:} With the notable exception of the standard
493locale names @samp{C} and @samp{POSIX}, locale names are
494system-specific.
495
496Most locale names follow XPG syntax and consist of up to four parts:
497
498@smallexample
499@var{language}[_@var{territory}[.@var{codeset}]][@@@var{modifier}]
500@end smallexample
501
Except for the first part, any of them may be omitted.  If the
fully specified locale is not found, less specific ones are looked for.
The various parts will be stripped off, in the following order:
505
506@enumerate
507@item
508codeset
509@item
510normalized codeset
511@item
512territory
513@item
514modifier
515@end enumerate
516
517For example, the locale name @samp{de_AT.iso885915@@euro} denotes a
518German-language locale for use in Austria, using the ISO-8859-15
519(Latin-9) character set, and with the Euro as the currency symbol.
520
521In addition to locale names which follow XPG syntax, systems may
522provide aliases such as @samp{german}. Both categories of names must
523not contain the slash character @samp{/}.
524
525If the locale name starts with a slash @samp{/}, it is treated as a
526path relative to the configured locale directories; see @code{LOCPATH}
527below. The specified path must not contain a component @samp{..}, or
528the name is invalid, and @code{setlocale} will fail.
529
530@strong{Portability Note:} POSIX suggests that if a locale name starts
531with a slash @samp{/}, it is resolved as an absolute path. However,
532@theglibc{} treats it as a relative path under the directories listed
533in @code{LOCPATH} (or the default locale directory if @code{LOCPATH}
534is unset).
535
536Locale names which are longer than an implementation-defined limit are
537invalid and cause @code{setlocale} to fail.
538
539As a special case, locale names used with @code{LC_ALL} can combine
540several locales, reflecting different locale settings for different
541categories. For example, you might want to use a U.S. locale with ISO
542A4 paper format, so you set @code{LANG} to @samp{en_US.UTF-8}, and
543@code{LC_PAPER} to @samp{de_DE.UTF-8}. In this case, the
544@code{LC_ALL}-style combined locale name is
545
546@smallexample
547LC_CTYPE=en_US.UTF-8;LC_TIME=en_US.UTF-8;LC_PAPER=de_DE.UTF-8;@dots{}
548@end smallexample
549
550followed by other category settings not shown here.
551
552@vindex LOCPATH
553The path used for finding locale data can be set using the
554@code{LOCPATH} environment variable. This variable lists the
555directories in which to search for locale definitions, separated by a
556colon @samp{:}.
557
558The default path for finding locale data is system specific. A typical
559value for the @code{LOCPATH} default is:
560
561@smallexample
562/usr/share/locale
563@end smallexample
564
565The value of @code{LOCPATH} is ignored by privileged programs for
566security reasons, and only the default directory is used.
567
568@node Locale Information, Formatting Numbers, Locale Names, Locales
6dd5b57e 569@section Accessing Locale Information
85c165be 570
6dd5b57e 571There are several ways to access locale information. The simplest
85c165be 572way is to let the C library itself do the work. Several of the
573functions in this library implicitly access the locale data, and use
574what information is provided by the currently selected locale. This is
575how the locale model is meant to work normally.
576
As an example take the @code{strftime} function, which is meant to nicely
format date and time information (@pxref{Formatting Calendar Time}).
Part of the standard information contained in the @code{LC_TIME}
category is the names of the weekdays and months.  Instead of requiring
the programmer to take care of providing the translations, the
@code{strftime} function does this all by itself.  @code{%A}
in the format string is replaced by the appropriate weekday
name of the locale currently selected by @code{LC_TIME}.  This is an
easy example, and wherever possible functions do things automatically
in this way.
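
A minimal sketch of the @code{%A} case follows.  The helper name
@code{weekday_name} is hypothetical.  The sketch fills in only the
@code{tm_wday} member of @code{struct tm}, on the assumption that
@code{%A} needs nothing else.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <time.h>

/* Store in BUF (of SIZE bytes) the full name of weekday WDAY
   (0 = Sunday, ..., 6 = Saturday) in the current LC_TIME locale.
   Return the length of the name, or 0 if it does not fit.  */
size_t
weekday_name (int wday, char *buf, size_t size)
{
  struct tm tm = { 0 };

  assert (wday >= 0 && wday <= 6);
  tm.tm_wday = wday;
  return strftime (buf, size, "%A", &tm);
}
```

The program itself contains no table of day names.  After a successful
call to @code{setlocale} the same function yields the name in the
user's language, while in the @samp{C} locale it yields the English
name.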
587
But there are quite often situations when there is simply no function
to perform the task, or it is simply not possible to do the work
automatically.  For these cases it is necessary to access the
information in the locale directly.  To do this the C library provides
two functions: @code{localeconv} and @code{nl_langinfo}.  The former is
part of @w{ISO C} and therefore portable, but has a brain-damaged
interface.  The second is part of the Unix interface and is portable
insofar as the system follows the Unix standards.
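
As a rough illustration of direct access with @code{nl_langinfo}, the
following sketch looks up a full month name.  The helper name
@code{full_month_name} is hypothetical.  The item constants
@code{MON_1} through @code{MON_12} come from @file{langinfo.h}.

```c
#include <assert.h>
#include <langinfo.h>
#include <locale.h>

/* Return the full name of month MONTH (0 = January, ..., 11 = December)
   in the current LC_TIME locale.  The returned string belongs to the
   library and must not be modified.  */
const char *
full_month_name (int month)
{
  /* Listing the items explicitly avoids assuming that their values
     are consecutive.  */
  static const nl_item items[12] =
    {
      MON_1, MON_2, MON_3, MON_4, MON_5, MON_6,
      MON_7, MON_8, MON_9, MON_10, MON_11, MON_12
    };

  assert (month >= 0 && month < 12);
  return nl_langinfo (items[month]);
}
```

The string returned here may be overwritten by later calls to
@code{nl_langinfo} or @code{setlocale}, so copy it if it is needed for
longer.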
28f540f4 596
597@menu
598* The Lame Way to Locale Data:: ISO C's @code{localeconv}.
599* The Elegant and Fast Way:: X/Open's @code{nl_langinfo}.
600@end menu
601
602@node The Lame Way to Locale Data, The Elegant and Fast Way, ,Locale Information
c66dbe00 603@subsection @code{localeconv}: It is portable but @dots{}
604
605Together with the @code{setlocale} function the @w{ISO C} people
6dd5b57e 606invented the @code{localeconv} function. It is a masterpiece of poor
ef48b196 607design. It is expensive to use, not extensible, and not generally
608usable as it provides access to only @code{LC_MONETARY} and
609@code{LC_NUMERIC} related information. Nevertheless, if it is
610applicable to a given situation it should be used since it is very
611portable. The function @code{strfmon} formats monetary amounts
612according to the selected locale using this information.
613@pindex locale.h
614@cindex monetary value formatting
615@cindex numeric value formatting
616
28f540f4 617@deftypefun {struct lconv *} localeconv (void)
d08a7e4c 618@standards{ISO, locale.h}
619@safety{@prelim{}@mtunsafe{@mtasurace{:localeconv} @mtslocale{}}@asunsafe{}@acsafe{}}
620@c This function reads from multiple components of the locale object,
621@c without synchronization, while writing to the static buffer it uses
622@c as the return value.
623The @code{localeconv} function returns a pointer to a structure whose
624components contain information about how numeric and monetary values
625should be formatted in the current locale.
626
85c165be 627You should not modify the structure or its contents. The structure might
628be overwritten by subsequent calls to @code{localeconv}, or by calls to
629@code{setlocale}, but no other function in the library overwrites this
630value.
631@end deftypefun
632
28f540f4 633@deftp {Data Type} {struct lconv}
d08a7e4c 634@standards{ISO, locale.h}
635@code{localeconv}'s return value is of this data type. Its elements are
636described in the following subsections.
637@end deftp
638
639If a member of the structure @code{struct lconv} has type @code{char},
640and the value is @code{CHAR_MAX}, it means that the current locale has
641no value for that parameter.
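
A minimal sketch of checking for this case follows.  The helper name
@code{local_frac_digits} and its @var{fallback} parameter are
hypothetical; the sketch reads the @code{frac_digits} member of
@code{struct lconv}.

```c
#include <assert.h>
#include <limits.h>
#include <locale.h>

/* Return the number of fractional digits the current locale uses for
   local monetary amounts, or FALLBACK if the locale does not specify
   one (the member holds CHAR_MAX).  */
int
local_frac_digits (int fallback)
{
  const struct lconv *lc = localeconv ();

  assert (lc != NULL);
  return lc->frac_digits == CHAR_MAX ? fallback : lc->frac_digits;
}
```

In the @samp{C} locale the member is @code{CHAR_MAX}, so the function
returns the fallback there.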
642
643@menu
644* General Numeric:: Parameters for formatting numbers and
645 currency amounts.
646* Currency Symbol:: How to print the symbol that identifies an
647 amount of money (e.g. @samp{$}).
648* Sign of Money Amount:: How to print the (positive or negative) sign
649 for a monetary amount, if one exists.
650@end menu
651
652@node General Numeric, Currency Symbol, , The Lame Way to Locale Data
653@subsubsection Generic Numeric Formatting Parameters
654
655These are the standard members of @code{struct lconv}; there may be
656others.
657
658@table @code
659@item char *decimal_point
660@itemx char *mon_decimal_point
661These are the decimal-point separators used in formatting non-monetary
662and monetary quantities, respectively. In the @samp{C} locale, the
663value of @code{decimal_point} is @code{"."}, and the value of
664@code{mon_decimal_point} is @code{""}.
665@cindex decimal-point separator
666
667@item char *thousands_sep
668@itemx char *mon_thousands_sep
669These are the separators used to delimit groups of digits to the left of
670the decimal point in formatting non-monetary and monetary quantities,
671respectively. In the @samp{C} locale, both members have a value of
672@code{""} (the empty string).
673
674@item char *grouping
675@itemx char *mon_grouping
676These are strings that specify how to group the digits to the left of
677the decimal point. @code{grouping} applies to non-monetary quantities
678and @code{mon_grouping} applies to monetary quantities. Use either
679@code{thousands_sep} or @code{mon_thousands_sep} to separate the digit
680groups.
681@cindex grouping of digits
682
Each member of these strings is to be interpreted as an integer value of
type @code{char}.  Successive numbers (from left to right) give the
sizes of successive groups (from right to left, starting at the decimal
point).  The last member is either @code{0}, in which case the previous
member is used over and over again for all the remaining groups, or
@code{CHAR_MAX}, in which case there is no more grouping---or, put
another way, any remaining digits form one large group without
separators.
691
692For example, if @code{grouping} is @code{"\04\03\02"}, the correct
693grouping for the number @code{123456787654321} is @samp{12}, @samp{34},
694@samp{56}, @samp{78}, @samp{765}, @samp{4321}. This uses a group of 4
695digits at the end, preceded by a group of 3 digits, preceded by groups
696of 2 digits (as many as needed). With a separator of @samp{,}, the
697number would be printed as @samp{12,34,56,78,765,4321}.
698
bcf6d602 699A value of @code{"\03"} indicates repeated groups of three digits, as
700normally used in the U.S.
701
702In the standard @samp{C} locale, both @code{grouping} and
703@code{mon_grouping} have a value of @code{""}. This value specifies no
704grouping at all.
705
706@item char int_frac_digits
707@itemx char frac_digits
708These are small integers indicating how many fractional digits (to the
709right of the decimal point) should be displayed in a monetary value in
710international and local formats, respectively. (Most often, both
711members have the same value.)
712
713In the standard @samp{C} locale, both of these members have the value
f65fd747 714@code{CHAR_MAX}, meaning ``unspecified''. The ISO standard doesn't say
6dd5b57e 715what to do when you find this value; we recommend printing no
716fractional digits. (This locale also specifies the empty string for
717@code{mon_decimal_point}, so printing any fractional digits would be
718confusing!)
719@end table
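
The grouping rules described for @code{grouping} and
@code{mon_grouping} above can be sketched in code.  This is an
illustration only, not how the library formats numbers.  The function
name @code{group_digits} and its interface are hypothetical, and the
caller is assumed to pass a string consisting only of the digits to the
left of the decimal point.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Copy DIGITS into OUT (OUTSIZE bytes), inserting SEP between digit
   groups as described by GROUPING, a string in the format of the
   `grouping' member of struct lconv.  Return 0 on success, or -1 if
   OUT is too small.  */
int
group_digits (const char *digits, const char *grouping,
              const char *sep, char *out, size_t outsize)
{
  size_t left, seplen, pos, gi = 0;
  int group;

  assert (digits != NULL && grouping != NULL && sep != NULL && out != NULL);
  if (outsize == 0)
    return -1;

  left = strlen (digits);       /* Digits not yet copied.  */
  seplen = strlen (sep);
  pos = outsize - 1;            /* OUT is filled from the end.  */
  out[pos] = '\0';
  group = grouping[0];          /* 0 or CHAR_MAX: no grouping.  */

  while (left > 0)
    {
      size_t take = left;
      if (group > 0 && group != CHAR_MAX && (size_t) group < left)
        take = (size_t) group;
      if (pos < take)
        return -1;
      pos -= take;
      left -= take;
      memcpy (out + pos, digits + left, take);
      if (left == 0)
        break;

      /* More digits remain, so a separator precedes this group.  */
      if (pos < seplen)
        return -1;
      pos -= seplen;
      memcpy (out + pos, sep, seplen);

      /* Move to the next group size; at the terminating 0 the
         previous size keeps being reused.  */
      if (grouping[gi + 1] != '\0')
        group = grouping[++gi];
    }

  /* Shift the result, including its terminator, to the start.  */
  memmove (out, out + pos, outsize - pos);
  return 0;
}
```

With the digits @code{"123456787654321"}, a grouping of
@code{"\04\03\02"} and a separator of @code{","}, this produces
@samp{12,34,56,78,765,4321}, matching the example in the table.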
720
721@node Currency Symbol, Sign of Money Amount, General Numeric, The Lame Way to Locale Data
722@subsubsection Printing the Currency Symbol
723@cindex currency symbols
724
725These members of the @code{struct lconv} structure specify how to print
726the symbol to identify a monetary value---the international analog of
727@samp{$} for US dollars.
728
729Each country has two standard currency symbols. The @dfn{local currency
730symbol} is used commonly within the country, while the
731@dfn{international currency symbol} is used internationally to refer to
732that country's currency when it is necessary to indicate the country
733unambiguously.
734
735For example, many countries use the dollar as their monetary unit, and
736when dealing with international currencies it's important to specify
737that one is dealing with (say) Canadian dollars instead of U.S. dollars
738or Australian dollars. But when the context is known to be Canada,
739there is no need to make this explicit---dollar amounts are implicitly
740assumed to be in Canadian dollars.
741
742@table @code
743@item char *currency_symbol
744The local currency symbol for the selected locale.
745
746In the standard @samp{C} locale, this member has a value of @code{""}
f65fd747 747(the empty string), meaning ``unspecified''. The ISO standard doesn't
28f540f4 748say what to do when you find this value; we recommend you simply print
749the empty string as you would print any other string pointed to by this
750variable.
751
752@item char *int_curr_symbol
753The international currency symbol for the selected locale.
754
755The value of @code{int_curr_symbol} should normally consist of a
756three-letter abbreviation determined by the international standard
757@cite{ISO 4217 Codes for the Representation of Currency and Funds},
758followed by a one-character separator (often a space).
759
760In the standard @samp{C} locale, this member has a value of @code{""}
761(the empty string), meaning ``unspecified''. We recommend you simply print
762the empty string as you would print any other string pointed to by this
763variable.
764
765@item char p_cs_precedes
766@itemx char n_cs_precedes
767@itemx char int_p_cs_precedes
768@itemx char int_n_cs_precedes
769These members are @code{1} if the @code{currency_symbol} or
770@code{int_curr_symbol} strings should precede the value of a monetary
771amount, or @code{0} if the strings should follow the value. The
772@code{p_cs_precedes} and @code{int_p_cs_precedes} members apply to
773positive amounts (or zero), and the @code{n_cs_precedes} and
774@code{int_n_cs_precedes} members apply to negative amounts.
775
776In the standard @samp{C} locale, all of these members have a value of
f65fd747 777@code{CHAR_MAX}, meaning ``unspecified''. The ISO standard doesn't say
778what to do when you find this value. We recommend printing the
779currency symbol before the amount, which is right for most countries.
780In other words, treat all nonzero values alike in these members.
781
782The members with the @code{int_} prefix apply to the
783@code{int_curr_symbol} while the other two apply to
784@code{currency_symbol}.
785
786@item char p_sep_by_space
787@itemx char n_sep_by_space
788@itemx char int_p_sep_by_space
789@itemx char int_n_sep_by_space
28f540f4 790These members are @code{1} if a space should appear between the
791@code{currency_symbol} or @code{int_curr_symbol} strings and the
792amount, or @code{0} if no space should appear. The
793@code{p_sep_by_space} and @code{int_p_sep_by_space} members apply to
794positive amounts (or zero), and the @code{n_sep_by_space} and
795@code{int_n_sep_by_space} members apply to negative amounts.
28f540f4 796
bcf6d602 797In the standard @samp{C} locale, all of these members have a value of
f65fd747 798@code{CHAR_MAX}, meaning ``unspecified''. The ISO standard doesn't say
28f540f4 799what you should do when you find this value; we suggest you treat it as
6dd5b57e 8001 (print a space). In other words, treat all nonzero values alike in
801these members.
802
803The members with the @code{int_} prefix apply to the
804@code{int_curr_symbol} while the other two apply to
@code{currency_symbol}. There is one peculiarity with
@code{int_curr_symbol}, though. Since all valid values end with a
space, you either print this space (if the currency symbol must appear
in front of the amount and be separated from it) or avoid printing it
at all (especially when the symbol ends up at the end of the string).
811@end table
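
The interplay of the @code{cs_precedes} and @code{sep_by_space} members
can be sketched as follows. This is a minimal illustration, not library
code: @code{place_symbol} is a hypothetical helper, it expects an
amount that is already formatted as a string, and it treats
@code{CHAR_MAX} like any other nonzero value, as recommended above.
Suppressing the space for an empty symbol is an additional assumption
made here for tidiness.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: combine SYMBOL and the already formatted AMOUNT
   according to the cs_precedes and sep_by_space values.  All nonzero
   values, including CHAR_MAX, are treated alike.  Returns what
   snprintf returns.  */
int
place_symbol (char *out, size_t outsize, const char *symbol,
              const char *amount, char cs_precedes, char sep_by_space)
{
  const char *space = sep_by_space != 0 ? " " : "";

  /* Assumption: with an empty symbol there is nothing to separate.  */
  if (symbol[0] == '\0')
    space = "";

  if (cs_precedes != 0)
    return snprintf (out, outsize, "%s%s%s", symbol, space, amount);
  return snprintf (out, outsize, "%s%s%s", amount, space, symbol);
}
```

The same helper serves for @code{currency_symbol} and for
@code{int_curr_symbol}; the caller passes the members with or without
the @code{int_} prefix accordingly.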
812
85c165be 813@node Sign of Money Amount, , Currency Symbol, The Lame Way to Locale Data
6dd5b57e 814@subsubsection Printing the Sign of a Monetary Amount
815
816These members of the @code{struct lconv} structure specify how to print
6dd5b57e 817the sign (if any) of a monetary value.
818
819@table @code
820@item char *positive_sign
821@itemx char *negative_sign
822These are strings used to indicate positive (or zero) and negative
6dd5b57e 823monetary quantities, respectively.
824
825In the standard @samp{C} locale, both of these members have a value of
826@code{""} (the empty string), meaning ``unspecified''.
827
f65fd747 828The ISO standard doesn't say what to do when you find this value; we
829recommend printing @code{positive_sign} as you find it, even if it is
830empty. For a negative value, print @code{negative_sign} as you find it
831unless both it and @code{positive_sign} are empty, in which case print
832@samp{-} instead. (Failing to indicate the sign at all seems rather
833unreasonable.)
834
835@item char p_sign_posn
836@itemx char n_sign_posn
837@itemx char int_p_sign_posn
838@itemx char int_n_sign_posn
6dd5b57e 839These members are small integers that indicate how to
28f540f4 840position the sign for nonnegative and negative monetary quantities,
777edcbd 841respectively. (The string used for the sign is what was specified with
842@code{positive_sign} or @code{negative_sign}.) The possible values are
843as follows:
844
845@table @code
846@item 0
847The currency symbol and quantity should be surrounded by parentheses.
848
849@item 1
850Print the sign string before the quantity and currency symbol.
851
852@item 2
853Print the sign string after the quantity and currency symbol.
854
855@item 3
856Print the sign string right before the currency symbol.
857
858@item 4
859Print the sign string right after the currency symbol.
860
861@item CHAR_MAX
``Unspecified''. All of these members have this value in the standard
@samp{C} locale.
864@end table
865
f65fd747 866The ISO standard doesn't say what you should do when the value is
867@code{CHAR_MAX}. We recommend you print the sign after the currency
868symbol.
28f540f4 869
870The members with the @code{int_} prefix apply to the
871@code{int_curr_symbol} while the other two apply to
872@code{currency_symbol}.
873@end table
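
The sign positions listed above can be sketched as a small C function.
This is an illustration only: @code{place_sign} is a hypothetical
helper that covers just the case where the currency symbol precedes the
quantity with no separating space, so the values @code{1} and @code{3}
give the same result. @code{CHAR_MAX} is handled as recommended above.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: combine SIGN, SYMBOL and QUANTITY according to
   SIGN_POSN, assuming the currency symbol precedes the quantity and no
   space separates them.  Returns what snprintf returns.  */
int
place_sign (char *out, size_t outsize, const char *sign,
            const char *symbol, const char *quantity, char sign_posn)
{
  switch (sign_posn)
    {
    case 0:   /* Parentheses around symbol and quantity.  */
      return snprintf (out, outsize, "(%s%s)", symbol, quantity);
    case 1:   /* Sign before quantity and symbol.  */
    case 3:   /* Sign right before the symbol.  */
      return snprintf (out, outsize, "%s%s%s", sign, symbol, quantity);
    case 2:   /* Sign after quantity and symbol.  */
      return snprintf (out, outsize, "%s%s%s", symbol, quantity, sign);
    case 4:   /* Sign right after the symbol.  */
    default:  /* Also CHAR_MAX: print the sign after the symbol.  */
      return snprintf (out, outsize, "%s%s%s", symbol, sign, quantity);
    }
}
```

A complete formatter would also have to consult the
@code{cs_precedes} and @code{sep_by_space} members, which this sketch
leaves out.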
874
875@node The Elegant and Fast Way, , The Lame Way to Locale Data, Locale Information
876@subsection Pinpoint Access to Locale Data
877
878When writing the X/Open Portability Guide the authors realized that the
879@code{localeconv} function is not enough to provide reasonable access to
6dd5b57e 880locale information. The information which was meant to be available
5e0889da 881in the locale (as later specified in the POSIX.1 standard) requires more
6dd5b57e 882ways to access it. Therefore the @code{nl_langinfo} function
5e0889da 883was introduced.
85c165be 884
85c165be 885@deftypefun {char *} nl_langinfo (nl_item @var{item})
d08a7e4c 886@standards{XOPEN, langinfo.h}
887@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
888@c It calls _nl_langinfo_l with the current locale, which returns a
889@c pointer into constant strings defined in locale data structures.
85c165be 890The @code{nl_langinfo} function can be used to access individual
891elements of the locale categories. Unlike the @code{localeconv}
892function, which returns all the information, @code{nl_langinfo}
893lets the caller select what information it requires. This is very
894fast and it is not a problem to call this function multiple times.
85c165be 895
896A second advantage is that in addition to the numeric and monetary
897formatting information, information from the
898@code{LC_TIME} and @code{LC_MESSAGES} categories is available.
899
b642f101 900@pindex langinfo.h
0cee1257 901The type @code{nl_item} is defined in @file{nl_types.h}. The argument
902@var{item} is a numeric value defined in the header @file{langinfo.h}.
903The X/Open standard defines the following values:
904
905@vtable @code
906@item CODESET
907@code{nl_langinfo} returns a string with the name of the coded character
908set used in the selected locale.
909
910@item ABDAY_1
911@itemx ABDAY_2
912@itemx ABDAY_3
913@itemx ABDAY_4
914@itemx ABDAY_5
915@itemx ABDAY_6
916@itemx ABDAY_7
917@code{nl_langinfo} returns the abbreviated weekday name. @code{ABDAY_1}
918corresponds to Sunday.
919@item DAY_1
920@itemx DAY_2
921@itemx DAY_3
922@itemx DAY_4
923@itemx DAY_5
924@itemx DAY_6
925@itemx DAY_7
0f5e2da1 926Similar to @code{ABDAY_1}, etc.,@: but here the return value is the
5e0889da 927unabbreviated weekday name.
928@item ABMON_1
929@itemx ABMON_2
930@itemx ABMON_3
931@itemx ABMON_4
932@itemx ABMON_5
933@itemx ABMON_6
934@itemx ABMON_7
935@itemx ABMON_8
936@itemx ABMON_9
937@itemx ABMON_10
938@itemx ABMON_11
939@itemx ABMON_12
940The return value is the abbreviated name of the month, in the
941grammatical form used when the month forms part of a complete date.
942@code{ABMON_1} corresponds to January.
943@item MON_1
944@itemx MON_2
945@itemx MON_3
946@itemx MON_4
947@itemx MON_5
948@itemx MON_6
949@itemx MON_7
950@itemx MON_8
951@itemx MON_9
952@itemx MON_10
953@itemx MON_11
954@itemx MON_12
955Similar to @code{ABMON_1}, etc.,@: but here the month names are not
956abbreviated. Here the first value @code{MON_1} also corresponds to
957January.
958@item ALTMON_1
959@itemx ALTMON_2
960@itemx ALTMON_3
961@itemx ALTMON_4
962@itemx ALTMON_5
963@itemx ALTMON_6
964@itemx ALTMON_7
965@itemx ALTMON_8
966@itemx ALTMON_9
967@itemx ALTMON_10
968@itemx ALTMON_11
969@itemx ALTMON_12
970Similar to @code{MON_1}, etc.,@: but here the month names are in the
971grammatical form used when the month is named by itself. The
972@code{strftime} functions use these month names for the conversion
973specifier @code{%OB} (@pxref{Formatting Calendar Time}).
974
975Note that not all languages need two different forms of the month names,
976so the strings returned for @code{MON_@dots{}} and @code{ALTMON_@dots{}}
977may or may not be the same, depending on the locale.
978
979@strong{NB:} @code{ABALTMON_@dots{}} constants corresponding to the
980@code{%Ob} conversion specifier are not currently provided, but are
981expected to be in a future release. In the meantime, it is possible
982to use @code{_NL_ABALTMON_@dots{}}.
983@item AM_STR
984@itemx PM_STR
985The return values are strings which can be used in the representation of time
986as an hour from 1 to 12 plus an am/pm specifier.
85c165be 987
988Note that in locales which do not use this time representation
989these strings might be empty, in which case the am/pm format
990cannot be used at all.
991@item D_T_FMT
992The return value can be used as a format string for @code{strftime} to
6dd5b57e 993represent time and date in a locale-specific way.
994@item D_FMT
995The return value can be used as a format string for @code{strftime} to
6dd5b57e 996represent a date in a locale-specific way.
997@item T_FMT
998The return value can be used as a format string for @code{strftime} to
6dd5b57e 999represent time in a locale-specific way.
1000@item T_FMT_AMPM
1001The return value can be used as a format string for @code{strftime} to
6dd5b57e 1002represent time in the am/pm format.
85c165be 1003
1004Note that if the am/pm format does not make any sense for the
1005selected locale, the return value might be the same as the one for
1006@code{T_FMT}.
1007@item ERA
1008The return value represents the era used in the current locale.
1009
1010Most locales do not define this value. An example of a locale which
1011does define this value is the Japanese one. In Japan, the traditional
1012representation of dates includes the name of the era corresponding to
1013the then-emperor's reign.
1014
1015Normally it should not be necessary to use this value directly.
1016Specifying the @code{E} modifier in their format strings causes the
1017@code{strftime} functions to use this information. The format of the
1018returned string is not specified, and therefore you should not assume
1019knowledge of it on different systems.
85c165be 1020@item ERA_YEAR
6dd5b57e 1021The return value gives the year in the relevant era of the locale.
1022As for @code{ERA} it should not be necessary to use this value directly.
1023@item ERA_D_T_FMT
1024This return value can be used as a format string for @code{strftime} to
6dd5b57e 1025represent dates and times in a locale-specific era-based way.
1026@item ERA_D_FMT
1027This return value can be used as a format string for @code{strftime} to
6dd5b57e 1028represent a date in a locale-specific era-based way.
1029@item ERA_T_FMT
1030This return value can be used as a format string for @code{strftime} to
6dd5b57e 1031represent time in a locale-specific era-based way.
1032@item ALT_DIGITS
1033The return value is a representation of up to @math{100} values used to
1034represent the values @math{0} to @math{99}. As for @code{ERA} this
1035value is not intended to be used directly, but instead indirectly
1036through the @code{strftime} function. When the modifier @code{O} is
1037used in a format which would otherwise use numerals to represent hours,
1038minutes, seconds, weekdays, months, or weeks, the appropriate value for
1039the locale is used instead.
85c165be 1040@item INT_CURR_SYMBOL
6dd5b57e 1041The same as the value returned by @code{localeconv} in the
1042@code{int_curr_symbol} element of the @code{struct lconv}.
1043@item CURRENCY_SYMBOL
1044@itemx CRNCYSTR
6dd5b57e 1045The same as the value returned by @code{localeconv} in the
1046@code{currency_symbol} element of the @code{struct lconv}.
1047
6dd5b57e 1048@code{CRNCYSTR} is a deprecated alias still required by Unix98.
85c165be 1049@item MON_DECIMAL_POINT
6dd5b57e 1050The same as the value returned by @code{localeconv} in the
1051@code{mon_decimal_point} element of the @code{struct lconv}.
1052@item MON_THOUSANDS_SEP
6dd5b57e 1053The same as the value returned by @code{localeconv} in the
1054@code{mon_thousands_sep} element of the @code{struct lconv}.
1055@item MON_GROUPING
6dd5b57e 1056The same as the value returned by @code{localeconv} in the
1057@code{mon_grouping} element of the @code{struct lconv}.
1058@item POSITIVE_SIGN
6dd5b57e 1059The same as the value returned by @code{localeconv} in the
1060@code{positive_sign} element of the @code{struct lconv}.
1061@item NEGATIVE_SIGN
6dd5b57e 1062The same as the value returned by @code{localeconv} in the
1063@code{negative_sign} element of the @code{struct lconv}.
1064@item INT_FRAC_DIGITS
6dd5b57e 1065The same as the value returned by @code{localeconv} in the
1066@code{int_frac_digits} element of the @code{struct lconv}.
1067@item FRAC_DIGITS
6dd5b57e 1068The same as the value returned by @code{localeconv} in the
1069@code{frac_digits} element of the @code{struct lconv}.
1070@item P_CS_PRECEDES
6dd5b57e 1071The same as the value returned by @code{localeconv} in the
1072@code{p_cs_precedes} element of the @code{struct lconv}.
1073@item P_SEP_BY_SPACE
6dd5b57e 1074The same as the value returned by @code{localeconv} in the
1075@code{p_sep_by_space} element of the @code{struct lconv}.
1076@item N_CS_PRECEDES
6dd5b57e 1077The same as the value returned by @code{localeconv} in the
1078@code{n_cs_precedes} element of the @code{struct lconv}.
1079@item N_SEP_BY_SPACE
6dd5b57e 1080The same as the value returned by @code{localeconv} in the
1081@code{n_sep_by_space} element of the @code{struct lconv}.
1082@item P_SIGN_POSN
6dd5b57e 1083The same as the value returned by @code{localeconv} in the
1084@code{p_sign_posn} element of the @code{struct lconv}.
1085@item N_SIGN_POSN
6dd5b57e 1086The same as the value returned by @code{localeconv} in the
85c165be 1087@code{n_sign_posn} element of the @code{struct lconv}.
1088
1089@item INT_P_CS_PRECEDES
1090The same as the value returned by @code{localeconv} in the
1091@code{int_p_cs_precedes} element of the @code{struct lconv}.
1092@item INT_P_SEP_BY_SPACE
1093The same as the value returned by @code{localeconv} in the
1094@code{int_p_sep_by_space} element of the @code{struct lconv}.
1095@item INT_N_CS_PRECEDES
1096The same as the value returned by @code{localeconv} in the
1097@code{int_n_cs_precedes} element of the @code{struct lconv}.
1098@item INT_N_SEP_BY_SPACE
1099The same as the value returned by @code{localeconv} in the
1100@code{int_n_sep_by_space} element of the @code{struct lconv}.
1101@item INT_P_SIGN_POSN
1102The same as the value returned by @code{localeconv} in the
1103@code{int_p_sign_posn} element of the @code{struct lconv}.
1104@item INT_N_SIGN_POSN
1105The same as the value returned by @code{localeconv} in the
1106@code{int_n_sign_posn} element of the @code{struct lconv}.
1107
1108@item DECIMAL_POINT
1109@itemx RADIXCHAR
6dd5b57e 1110The same as the value returned by @code{localeconv} in the
1111@code{decimal_point} element of the @code{struct lconv}.
1112
1113The name @code{RADIXCHAR} is a deprecated alias still used in Unix98.
1114@item THOUSANDS_SEP
1115@itemx THOUSEP
6dd5b57e 1116The same as the value returned by @code{localeconv} in the
1117@code{thousands_sep} element of the @code{struct lconv}.
1118
1119The name @code{THOUSEP} is a deprecated alias still used in Unix98.
1120@item GROUPING
6dd5b57e 1121The same as the value returned by @code{localeconv} in the
1122@code{grouping} element of the @code{struct lconv}.
1123@item YESEXPR
1124The return value is a regular expression which can be used with the
@code{regcomp} and @code{regexec} functions to recognize a positive response to a yes/no
1f77f049 1126question. @Theglibc{} provides the @code{rpmatch} function for
e8ec0694 1127easier handling in applications.
1128@item NOEXPR
1129The return value is a regular expression which can be used with the
@code{regcomp} and @code{regexec} functions to recognize a negative response to a yes/no
1131question.
1132@item YESSTR
6dd5b57e 1133The return value is a locale-specific translation of the positive response
1134to a yes/no question.
1135
Using this value is deprecated since it is a very special case of
message translation, and is better handled by the message
translation functions (@pxref{Message Translation}).
85c165be 1142@item NOSTR
6dd5b57e 1143The return value is a locale-specific translation of the negative response
85c165be 1144to a yes/no question. What is said for @code{YESSTR} is also true here.
1145
1146The use of this symbol is deprecated. Instead message translation
1147should be used.
1148@end vtable
1149
1150The file @file{langinfo.h} defines a lot more symbols but none of them
are official. Using them is not portable, and the format of the
return values might change. Therefore we recommend that you not use
them.
1154
777edcbd 1155Note that the return value for any valid argument can be used
1156in all situations (with the possible exception of the am/pm time formatting
1157codes). If the user has not selected any locale for the
1158appropriate category, @code{nl_langinfo} returns the information from the
1159@code{"C"} locale. It is therefore possible to use this function as
1160shown in the example below.
1161
1162If the argument @var{item} is not valid, a pointer to an empty string is
1163returned.
1164@end deftypefun
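
As an illustration of querying a single item, the following sketch
checks a response against the @code{YESEXPR} value. It assumes that the
value is a POSIX extended regular expression and compiles it with
@code{regcomp}; @code{is_affirmative} is a hypothetical name chosen for
this example, not a library function.

```c
#include <langinfo.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: return 1 if RESPONSE matches the YESEXPR value
   of the current locale, 0 if it does not, and -1 if the expression
   cannot be compiled.  */
int
is_affirmative (const char *response)
{
  regex_t re;
  int rc;

  if (regcomp (&re, nl_langinfo (YESEXPR), REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  rc = regexec (&re, response, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}
```

A program that has not selected a locale gets the @code{"C"} locale
expression here, so only the caller's use of @code{setlocale} decides
which language the answers are matched in.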
1165
1166An example of @code{nl_langinfo} usage is a function which has to
1167print a given date and time in a locale-specific way. At first one
1168might think that, since @code{strftime} internally uses the locale
1169information, writing something like the following is enough:
1170
1171@smallexample
1172size_t
1173i18n_time_n_data (char *s, size_t len, const struct tm *tp)
1174@{
1175 return strftime (s, len, "%X %D", tp);
1176@}
1177@end smallexample
1178
1179The format contains no weekday or month names and therefore is
1180internationally usable. Wrong! The output produced is something like
1181@code{"hh:mm:ss MM/DD/YY"}. This format is only recognizable in the
1182USA. Other countries use different formats. Therefore the function
1183should be rewritten like this:
1184
1185@smallexample
1186size_t
1187i18n_time_n_data (char *s, size_t len, const struct tm *tp)
1188@{
1189 return strftime (s, len, nl_langinfo (D_T_FMT), tp);
1190@}
1191@end smallexample
1192
1193Now it uses the date and time format of the locale
1194selected when the program runs. If the user selects the locale
1195correctly there should never be a misunderstanding over the time and
1196date format.
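
The same approach extends to the am/pm items. Because
@code{AM_STR}, @code{PM_STR} and @code{T_FMT_AMPM} may be empty in
locales that do not use a 12-hour clock, a caller has to be prepared to
fall back to @code{T_FMT}. The following is a minimal sketch of one way
to do that; @code{i18n_time_12h} is a hypothetical name, and the
fallback policy is an assumption of this example rather than something
the standard prescribes.

```c
#include <langinfo.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: format the time of day in the locale's am/pm
   style when the locale has one, and in its ordinary time format
   otherwise.  */
size_t
i18n_time_12h (char *s, size_t len, const struct tm *tp)
{
  /* Check the am/pm string first; the format pointer is fetched last
     because a later nl_langinfo call may invalidate earlier results
     on some systems.  */
  int has_ampm = nl_langinfo (AM_STR)[0] != '\0';
  const char *fmt = has_ampm ? nl_langinfo (T_FMT_AMPM) : "";

  if (fmt[0] == '\0')
    fmt = nl_langinfo (T_FMT);
  return strftime (s, len, fmt, tp);
}
```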
1197
e8ec0694 1198@node Formatting Numbers, Yes-or-No Questions, Locale Information, Locales
5e0889da 1199@section A dedicated function to format numbers
85c165be 1200
5e0889da 1201We have seen that the structure returned by @code{localeconv} as well as
1202the values given to @code{nl_langinfo} allow you to retrieve the various
1203pieces of locale-specific information to format numbers and monetary
1204amounts. We have also seen that the underlying rules are quite complex.
85c165be 1205
1206Therefore the X/Open standards introduce a function which uses such
1207locale information, making it easier for the user to format
1208numbers according to these rules.
1209
1210@deftypefun ssize_t strfmon (char *@var{s}, size_t @var{maxsize}, const char *@var{format}, @dots{})
f2d58726 1211@safety{@prelim{}@mtsafe{@mtslocale{}}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}}
1212@c It (and strfmon_l) both call __vstrfmon_l_internal, which, besides
1213@c accessing the locale object passed to it, accesses the active
1214@c locale through isdigit (but to_digit assumes ASCII digits only).
1215@c It may call __printf_fp (@mtslocale @ascuheap @acsmem) and
1216@c guess_grouping (safe).
85c165be 1217The @code{strfmon} function is similar to the @code{strftime} function
1218in that it takes a buffer, its size, a format string,
1219and values to write into the buffer as text in a form specified
1220by the format string. Like @code{strftime}, the function
1221also returns the number of bytes written into the buffer.
1222
1223There are two differences: @code{strfmon} can take more than one
1224argument, and, of course, the format specification is different. Like
1225@code{strftime}, the format string consists of normal text, which is
1226output as is, and format specifiers, which are indicated by a @samp{%}.
1227Immediately after the @samp{%}, you can optionally specify various flags
1228and formatting information before the main formatting character, in a
1229similar way to @code{printf}:
1230
1231@itemize @bullet
1232@item
1233Immediately following the @samp{%} there can be one or more of the
1234following flags:
1235@table @asis
1236@item @samp{=@var{f}}
1237The single byte character @var{f} is used for this field as the numeric
1238fill character. By default this character is a space character.
Filling with this character is only performed if a left precision
is specified. It is not used to pad the output to the given field width.
1241@item @samp{^}
1242The number is printed without grouping the digits according to the rules
1243of the current locale. By default grouping is enabled.
85c165be 1244@item @samp{+}, @samp{(}
1245At most one of these flags can be used. They select which format to
1246represent the sign of a currency amount. By default, and if
1247@samp{+} is given, the locale equivalent of @math{+}/@math{-} is used. If
1248@samp{(} is given, negative amounts are enclosed in parentheses. The
1249exact format is determined by the values of the @code{LC_MONETARY}
1250category of the locale selected at program runtime.
1251@item @samp{!}
1252The output will not contain the currency symbol.
1253@item @samp{-}
1254The output will be formatted left-justified instead of right-justified if
1255it does not fill the entire field width.
1256@end table
1257@end itemize
1258
777edcbd 1259The next part of the specification is an optional field width. If no
1260width is specified @math{0} is taken. During output, the function first
1261determines how much space is required. If it requires at least as many
1262characters as given by the field width, it is output using as much space
1263as necessary. Otherwise, it is extended to use the full width by
1264filling with the space character. The presence or absence of the
1265@samp{-} flag determines the side at which such padding occurs. If
1266present, the spaces are added at the right making the output
1267left-justified, and vice versa.
1268
1269So far the format looks familiar, being similar to the @code{printf} and
1270@code{strftime} formats. However, the next two optional fields
1271introduce something new. The first one is a @samp{#} character followed
1272by a decimal digit string. The value of the digit string specifies the
1273number of @emph{digit} positions to the left of the decimal point (or
1274equivalent). This does @emph{not} include the grouping character when
1275the @samp{^} flag is not given. If the space needed to print the number
1276does not fill the whole width, the field is padded at the left side with
1277the fill character, which can be selected using the @samp{=} flag and by
1278default is a space. For example, if the field width is selected as 6
1279and the number is @math{123}, the fill character is @samp{*} the result
1280will be @samp{***123}.
1281
1282The second optional field starts with a @samp{.} (period) and consists
1283of another decimal digit string. Its value describes the number of
1284characters printed after the decimal point. The default is selected
from the current locale (@code{frac_digits}, @code{int_frac_digits};
@pxref{General Numeric}). If the exact representation needs more digits
than given by this precision, the displayed value is rounded. If the
1288number of fractional digits is selected to be zero, no decimal point is
1289printed.
1290
1f77f049 1291As a GNU extension, the @code{strfmon} implementation in @theglibc{}
1292allows an optional @samp{L} next as a format modifier. If this modifier
1293is given, the argument is expected to be a @code{long double} instead of
1294a @code{double} value.
1295
1296Finally, the last component is a format specifier. There are three
1297specifiers defined:
1298
1299@table @asis
1300@item @samp{i}
6dd5b57e 1301Use the locale's rules for formatting an international currency value.
85c165be 1302@item @samp{n}
6dd5b57e 1303Use the locale's rules for formatting a national currency value.
85c165be 1304@item @samp{%}
Place a @samp{%} in the output. No flag, width specifier, or modifier
may be given; only @samp{%%} is allowed.
1307@end table
1308
6dd5b57e 1309As for @code{printf}, the function reads the format string
1310from left to right and uses the values passed to the function following
1311the format string. The values are expected to be either of type
1312@code{double} or @code{long double}, depending on the presence of the
1313modifier @samp{L}. The result is stored in the buffer pointed to by
1314@var{s}. At most @var{maxsize} characters are stored.
1315
The return value of the function is the number of characters stored in
@var{s}, not including the terminating null byte. If the number of
characters to be stored would exceed @var{maxsize}, the function returns
@math{-1} and the content of the buffer @var{s} is unspecified. In this
case @code{errno} is set to @code{E2BIG}.
1321@end deftypefun
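
Since the buffer content is unspecified after a failure, callers
should check the return value before using the result. The following
is a minimal sketch of such a check; @code{format_amount} is a
hypothetical wrapper written for this example, and clearing the buffer
on failure is a choice made here, not something @code{strfmon} does.

```c
#include <errno.h>
#include <monetary.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical wrapper: format VALUE as a national currency amount.
   Returns the strfmon result.  On failure it reports a buffer that is
   too small and leaves an empty string in BUF, because strfmon leaves
   the buffer content unspecified in that case.  */
ssize_t
format_amount (char *buf, size_t size, double value)
{
  ssize_t n = strfmon (buf, size, "%n", value);

  if (n == -1)
    {
      if (errno == E2BIG)
        fprintf (stderr, "buffer of %zu bytes is too small\n", size);
      if (size > 0)
        buf[0] = '\0';
    }
  return n;
}
```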
1322
6dd5b57e 1323A few examples should make clear how the function works. It is
85c165be 1324assumed that all the following pieces of code are executed in a program
6dd5b57e 1325which uses the USA locale (@code{en_US}). The simplest
1326form of the format is this:
1327
1328@smallexample
1329strfmon (buf, 100, "@@%n@@%n@@%n@@", 123.45, -567.89, 12345.678);
1330@end smallexample
1331
1332@noindent
1333The output produced is
1334@smallexample
655b26bb 1335"@@$123.45@@-$567.89@@$12,345.68@@"
1336@end smallexample
1337
1338We can notice several things here. First, the widths of the output
1339numbers are different. We have not specified a width in the format
1340string, and so this is no wonder. Second, the third number is printed
1341using thousands separators. The thousands separator for the
1342@code{en_US} locale is a comma. The number is also rounded.
1343@math{.678} is rounded to @math{.68} since the format does not specify a
1344precision and the default value in the locale is @math{2}. Finally,
1345note that the national currency symbol is printed since @samp{%n} was
used, not @samp{%i}. The next example shows how we can align the output.
1347
1348@smallexample
1349strfmon (buf, 100, "@@%=*11n@@%=*11n@@%=*11n@@", 123.45, -567.89, 12345.678);
1350@end smallexample
1351
1352@noindent
1353The output this time is:
1354
1355@smallexample
"@@    $123.45@@   -$567.89@@ $12,345.68@@"
1357@end smallexample
1358
6dd5b57e 1359Two things stand out. Firstly, all fields have the same width (eleven
1360characters) since this is the width given in the format and since no
1361number required more characters to be printed. The second important
1362point is that the fill character is not used. This is correct since the
1363white space was not used to achieve a precision given by a @samp{#}
1364modifier, but instead to fill to the given width. The difference
becomes obvious if we now add a left precision with the @samp{#} modifier.
1366
1367@smallexample
1368strfmon (buf, 100, "@@%=*11#5n@@%=*11#5n@@%=*11#5n@@",
1369 123.45, -567.89, 12345.678);
1370@end smallexample
1371
1372@noindent
1373The output is
1374
1375@smallexample
"@@ $***123.45@@-$***567.89@@ $12,345.68@@"
1377@end smallexample
1378
1379Here we can see that all the currency symbols are now aligned, and that
1380the space between the currency sign and the number is filled with the
selected fill character. Note that although the left precision is
@math{5} and @math{123.45} has three digits left of the decimal point,
the space is filled with three asterisks. This is correct since, as
explained above, the left precision does not include the positions used
to store thousands separators. One last example should explain the remaining
1386functionality.
85c165be
UD
1387
@smallexample
strfmon (buf, 100, "@@%=0(16#5.3i@@%=0(16#5.3i@@%=0(16#5.3i@@",
         123.45, -567.89, 12345.678);
@end smallexample

@noindent
This rather complex format string produces the following output:

@smallexample
"@@ USD 000123.450 @@(USD 000567.890)@@ USD 12,345.678 @@"
@end smallexample

The most noticeable change is the alternative way of representing
negative numbers.  In financial circles this is often done using
parentheses, and this is what the @samp{(} flag selected.  The fill
character is now @samp{0}.  Note that this @samp{0} character is not
regarded as a numeric zero, and therefore the first and second numbers
are not printed using a thousands separator.  Since we used the format
specifier @samp{i} instead of @samp{n}, the international form of the
currency symbol is used.  This is a four-character string, in this case
@code{"USD "}.  The last point is that since the precision right of the
decimal point is selected to be three, the first and second numbers are
printed with an extra zero at the end and the third number is printed
without rounding.

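The calls shown above discard the return value of @code{strfmon}.  The
following is a minimal sketch, not part of the library, of a wrapper
that checks it.  The name @code{format_money} and its interface are
assumptions made for illustration; only @code{strfmon} itself is a real
library function.  It returns the number of bytes stored, or @math{-1}
when the result does not fit in the buffer.

```c
#include <monetary.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper (not a library function): format VALUE into BUF
   using FMT, which must contain exactly one monetary conversion.
   Return the number of bytes stored, or -1 if the result does not fit
   in SIZE bytes (strfmon sets errno to E2BIG in that case).  */
ssize_t
format_money (char *buf, size_t size, const char *fmt, double value)
{
  ssize_t n = strfmon (buf, size, fmt, value);
  if (n < 0 && size > 0)
    /* The buffer contents are unspecified on failure, so hand back
       an empty string instead.  */
    buf[0] = '\0';
  return n;
}
```

A caller would typically select the locale first, for example with
@code{setlocale (LC_MONETARY, "")}, and then pass a format such as
@code{"%=*11n"}.  Because of the field width, a successful call with
that format stores at least eleven bytes.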
@node Yes-or-No Questions, , Formatting Numbers , Locales
@section Yes-or-No Questions

Some non-GUI programs ask a yes-or-no question.  If the messages
(especially the questions) are translated into foreign languages, be
sure that you localize the answers too.  It would be a very bad habit to
ask a question in one language and request the answer in another, often
English.

@Theglibc{} contains @code{rpmatch} to give applications easy
access to the corresponding locale definitions.

@deftypefun int rpmatch (const char *@var{response})
@standards{GNU, stdlib.h}
@safety{@prelim{}@mtsafe{@mtslocale{}}@asunsafe{@asucorrupt{} @ascuheap{} @asulock{} @ascudlopen{}}@acunsafe{@acucorrupt{} @aculock{} @acsmem{} @acsfd{}}}
@c Calls nl_langinfo with YESEXPR and NOEXPR, triggering @mtslocale but
@c it's regcomp and regexec that bring in all of the safety issues.
@c regfree is also called, but it doesn't introduce any further issues.
The function @code{rpmatch} checks whether the string in @var{response}
is a valid yes-or-no answer and, if so, which one.  The
check uses the @code{YESEXPR} and @code{NOEXPR} data in the
@code{LC_MESSAGES} category of the currently selected locale.  The
return value is as follows:

@table @code
@item 1
The user entered an affirmative answer.

@item 0
The user entered a negative answer.

@item -1
The answer matched neither the @code{YESEXPR} nor the @code{NOEXPR}
regular expression.
@end table

This function is not standardized, but besides @theglibc{} it is
available at least in the IBM AIX library.
@end deftypefun

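Since @code{rpmatch} is not standardized, a program that must also run
where it is missing could perform the same check itself.  The following
is a minimal sketch, assuming the system provides @code{nl_langinfo}
with the @code{YESEXPR} and @code{NOEXPR} items and the POSIX regular
expression functions.  The names @code{matches} and
@code{portable_rpmatch} are hypothetical, and this is not the library's
own implementation.

```c
#include <langinfo.h>
#include <regex.h>
#include <stddef.h>

/* Return nonzero if RESPONSE matches the extended regular expression
   PATTERN.  A pattern that fails to compile counts as "no match".  */
static int
matches (const char *pattern, const char *response)
{
  regex_t re;
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return 0;
  int found = regexec (&re, response, 0, NULL, 0) == 0;
  regfree (&re);
  return found;
}

/* Hypothetical stand-in for rpmatch: return 1 for an affirmative
   answer, 0 for a negative one, and -1 for anything else.  */
int
portable_rpmatch (const char *response)
{
  if (matches (nl_langinfo (YESEXPR), response))
    return 1;
  if (matches (nl_langinfo (NOEXPR), response))
    return 0;
  return -1;
}
```

The affirmative expression is tried first, so a response that happens
to match both expressions is treated as a yes.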
@noindent
This function would normally be used like this:

@smallexample
  @dots{}
  /* @r{Use a safe default.}  */
  _Bool doit = false;

  fputs (gettext ("Do you really want to do this? "), stdout);
  fflush (stdout);
  /* @r{Prepare the @code{getline} call.}  */
  line = NULL;
  len = 0;
  while (getline (&line, &len, stdin) >= 0)
    @{
      /* @r{Check the response.}  */
      int res = rpmatch (line);
      if (res >= 0)
        @{
          /* @r{We got a definitive answer.}  */
          if (res > 0)
            doit = true;
          break;
        @}
    @}
  /* @r{Free what @code{getline} allocated.}  */
  free (line);
@end smallexample

Note that the loop continues until a read error is detected or until a
definitive (positive or negative) answer is read.
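The same loop can be packaged as a reusable function.  The following is
a minimal sketch under the assumption that the caller has already
printed the question.  The name @code{read_yes_no} and its parameters
are hypothetical.  It reads from an arbitrary stream and returns a
caller-supplied default if the input ends before a definitive answer
is seen.

```c
#define _GNU_SOURCE   /* For getline and rpmatch.  */
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read lines from IN until rpmatch recognizes
   one as a yes-or-no answer, and return that answer.  Return
   FALLBACK if end of file or a read error comes first.  */
bool
read_yes_no (FILE *in, bool fallback)
{
  char *line = NULL;
  size_t len = 0;
  bool answer = fallback;

  while (getline (&line, &len, in) >= 0)
    {
      int res = rpmatch (line);
      if (res >= 0)
        {
          /* We got a definitive answer.  */
          answer = res > 0;
          break;
        }
    }
  /* Free what getline allocated.  */
  free (line);
  return answer;
}
```

Passing @code{false} as the fallback keeps the safe default of the
example above: the action is carried out only after an explicit
affirmative answer.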