manual/locale.texi

   1 @node Locales, Message Translation, Character Set Handling, Top
   2 @c %MENU% The country and language can affect the behavior of library functions
   3 @chapter Locales and Internationalization
   4
   5 Different countries and cultures have varying conventions for how to
   6 communicate.  These conventions range from very simple ones, such as the
   7 format for representing dates and times, to very complex ones, such as
   8 the language spoken.
   9
  10 @cindex internationalization
  11 @cindex locales
  12 @dfn{Internationalization} of software means programming it to be able
  13 to adapt to the user's favorite conventions.  In @w{ISO C},
  14 internationalization works by means of @dfn{locales}.  Each locale
  15 specifies a collection of conventions, one convention for each purpose.
  16 The user chooses a set of conventions by specifying a locale (via
  17 environment variables).
  18
  19 All programs inherit the chosen locale as part of their environment.
  20 Provided the programs are written to obey the choice of locale, they
  21 will follow the conventions preferred by the user.
  22
  23 @menu
  24 * Effects of Locale::           Actions affected by the choice of
  25                                  locale.
  26 * Choosing Locale::             How the user specifies a locale.
  27 * Locale Categories::           Different purposes for which you can
  28                                  select a locale.
  29 * Setting the Locale::          How a program specifies the locale
  30                                  with library functions.
  31 * Standard Locales::            Locale names available on all systems.
  32 * Locale Names::                Format of system-specific locale names.
  33 * Locale Information::          How to access the information for the locale.
  34 * Formatting Numbers::          A dedicated function to format numbers.
  35 * Yes-or-No Questions::         Check a Response against the locale.
  36 @end menu
  37
  38 @node Effects of Locale, Choosing Locale,  , Locales
  39 @section What Effects a Locale Has
  40
  41 Each locale specifies conventions for several purposes, including the
  42 following:
  43
  44 @itemize @bullet
  45 @item
  46 What multibyte character sequences are valid, and how they are
  47 interpreted (@pxref{Character Set Handling}).
  48
  49 @item
  50 Classification of which characters in the local character set are
  51 considered alphabetic, and upper- and lower-case conversion conventions
  52 (@pxref{Character Handling}).
  53
  54 @item
  55 The collating sequence for the local language and character set
  56 (@pxref{Collation Functions}).
  57
  58 @item
  59 Formatting of numbers and currency amounts (@pxref{General Numeric}).
  60
  61 @item
  62 Formatting of dates and times (@pxref{Formatting Calendar Time}).
  63
  64 @item
  65 What language to use for output, including error messages
  66 (@pxref{Message Translation}).
  67
  68 @item
  69 What language to use for user answers to yes-or-no questions
  70 (@pxref{Yes-or-No Questions}).
  71
  72 @item
  73 What language to use for more complex user input.
  74 (The C library doesn't yet help you implement this.)
  75 @end itemize
  76
  77 Some aspects of adapting to the specified locale are handled
  78 automatically by the library subroutines.  For example, all your program
  79 needs to do in order to use the collating sequence of the chosen locale
  80 is to use @code{strcoll} or @code{strxfrm} to compare strings.
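
As a rough sketch of the @code{strcoll} case, the following comparison
function lets @code{qsort} order an array of strings by the collating
sequence of the current locale.  The names @code{compare_by_locale} and
@code{sort_words} are illustrative only; they are not part of the library.

```c
#include <stdlib.h>
#include <string.h>

/* Compare two elements of an array of strings using the collating
   sequence of the current LC_COLLATE locale.  */
static int
compare_by_locale (const void *a, const void *b)
{
  const char *const *left = a;
  const char *const *right = b;
  return strcoll (*left, *right);
}

/* Sort COUNT strings in WORDS according to the current locale.
   (Hypothetical helper, shown for illustration.)  */
void
sort_words (const char **words, size_t count)
{
  qsort (words, count, sizeof *words, compare_by_locale);
}
```

The program still has to select a locale with @code{setlocale} first;
until it does, the comparison uses the @samp{C} locale, in which
@code{strcoll} orders strings the same way @code{strcmp} does.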
  81
  82 Other aspects of locales are beyond the comprehension of the library.
  83 For example, the library can't automatically translate your program's
  84 output messages into other languages.  The only way you can support
  85 output in the user's favorite language is to program this more or less
  86 by hand.  The C library provides functions to handle translations for
  87 multiple languages easily.
  88
  89 This chapter discusses the mechanism by which you can modify the current
  90 locale.  The effects of the current locale on specific library functions
  91 are discussed in more detail in the descriptions of those functions.
  92
  93 @node Choosing Locale, Locale Categories, Effects of Locale, Locales
  94 @section Choosing a Locale
  95
  96 The simplest way for the user to choose a locale is to set the
  97 environment variable @code{LANG}.  This specifies a single locale to use
  98 for all purposes.  For example, a user could specify a hypothetical
  99 locale named @samp{espana-castellano} to use the standard conventions of
 100 most of Spain.
 101
 102 The set of locales supported depends on the operating system you are
 103 using, and so do their names, except that the standard locale called
  104 @samp{C} or @samp{POSIX} always exists.  @xref{Locale Names}.
 105
 106 In order to force the system to always use the default locale, the
 107 user can set the @code{LC_ALL} environment variable to @samp{C}.
 108
 109 @cindex combining locales
 110 A user also has the option of specifying different locales for
 111 different purposes---in effect, choosing a mixture of multiple
 112 locales.  @xref{Locale Categories}.
 113
 114 For example, the user might specify the locale @samp{espana-castellano}
 115 for most purposes, but specify the locale @samp{usa-english} for
 116 currency formatting.  This might make sense if the user is a
 117 Spanish-speaking American, working in Spanish, but representing monetary
 118 amounts in US dollars.
 119
 120 Note that both locales @samp{espana-castellano} and @samp{usa-english},
 121 like all locales, would include conventions for all of the purposes to
 122 which locales apply.  However, the user can choose to use each locale
 123 for a particular subset of those purposes.
 124
 125 @node Locale Categories, Setting the Locale, Choosing Locale, Locales
 126 @section Locale Categories
 127 @cindex categories for locales
 128 @cindex locale categories
 129
 130 The purposes that locales serve are grouped into @dfn{categories}, so
 131 that a user or a program can choose the locale for each category
 132 independently.  Here is a table of categories; each name is both an
 133 environment variable that a user can set, and a macro name that you can
 134 use as the first argument to @code{setlocale}.
 135
  136 The value of the environment variable (or the string passed as the second
  137 argument to @code{setlocale}) must be a valid locale name.
 138 @xref{Locale Names}.
 139
 140 @vtable @code
 141 @item LC_COLLATE
 142 @standards{ISO, locale.h}
 143 This category applies to collation of strings (functions @code{strcoll}
 144 and @code{strxfrm}); see @ref{Collation Functions}.
 145
 146 @item LC_CTYPE
 147 @standards{ISO, locale.h}
 148 This category applies to classification and conversion of characters,
 149 and to multibyte and wide characters;
 150 see @ref{Character Handling}, and @ref{Character Set Handling}.
 151
 152 @item LC_MONETARY
 153 @standards{ISO, locale.h}
 154 This category applies to formatting monetary values; see @ref{General Numeric}.
 155
 156 @item LC_NUMERIC
 157 @standards{ISO, locale.h}
 158 This category applies to formatting numeric values that are not
 159 monetary; see @ref{General Numeric}.
 160
 161 @item LC_TIME
 162 @standards{ISO, locale.h}
 163 This category applies to formatting date and time values; see
 164 @ref{Formatting Calendar Time}.
 165
 166 @item LC_MESSAGES
 167 @standards{XOPEN, locale.h}
 168 This category applies to selecting the language used in the user
 169 interface for message translation (@pxref{The Uniforum approach};
 170 @pxref{Message catalogs a la X/Open})  and contains regular expressions
 171 for affirmative and negative responses.
 172
 173 @item LC_ALL
 174 @standards{ISO, locale.h}
 175 This is not a category; it is only a macro that you can use
 176 with @code{setlocale} to set a single locale for all purposes.  Setting
  177 this environment variable overrides all selections made by the other
  178 @code{LC_*} variables or by @code{LANG}.
 179
 180 @item LANG
 181 @standards{ISO, locale.h}
 182 If this environment variable is defined, its value specifies the locale
 183 to use for all purposes except as overridden by the variables above.
 184 @end vtable
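
The precedence among these variables can be sketched as follows.  This
only illustrates the rules in the table above and is not the library's
actual lookup code.  The function name @code{pick_locale_name} is
hypothetical, and treating an empty value like an unset variable is an
assumption modeled on POSIX behavior.

```c
#include <stddef.h>

/* Return the locale name that applies to one category, given the values
   of LC_ALL, of the category's own variable (for example LC_TIME), and
   of LANG.  A null pointer stands for an unset variable.
   Assumption: an empty value is treated like an unset variable.  */
const char *
pick_locale_name (const char *lc_all, const char *lc_category,
                  const char *lang)
{
  if (lc_all != NULL && lc_all[0] != '\0')
    return lc_all;              /* LC_ALL overrides everything else.  */
  if (lc_category != NULL && lc_category[0] != '\0')
    return lc_category;         /* Then the category's own variable.  */
  if (lang != NULL && lang[0] != '\0')
    return lang;                /* LANG is the fallback.  */
  return "C";                   /* Nothing set: the standard locale.  */
}
```

A caller would typically pass the results of @code{getenv}, for example
@code{pick_locale_name (getenv ("LC_ALL"), getenv ("LC_TIME"), getenv ("LANG"))}.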
 185
 186 @vindex LANGUAGE
 187 When developing the message translation functions it was felt that the
 188 functionality provided by the variables above is not sufficient.  For
 189 example, it should be possible to specify more than one locale name.
  190 Take a Swedish user who speaks German better than English, and a program
  191 whose messages are output in English by default.  It should be possible
  192 to specify that the first choice of language is Swedish, the second is
  193 German, and English is used only if both of these fail.  This is
 194 possible with the variable @code{LANGUAGE}.  For further description of
 195 this GNU extension see @ref{Using gettextized software}.
 196
 197 @node Setting the Locale, Standard Locales, Locale Categories, Locales
 198 @section How Programs Set the Locale
 199
 200 A C program inherits its locale environment variables when it starts up.
 201 This happens automatically.  However, these variables do not
 202 automatically control the locale used by the library functions, because
 203 @w{ISO C} says that all programs start by default in the standard @samp{C}
 204 locale.  To use the locales specified by the environment, you must call
 205 @code{setlocale}.  Call it as follows:
 206
 207 @smallexample
 208 setlocale (LC_ALL, "");
 209 @end smallexample
 210
 211 @noindent
  212 to select a locale based on the user's choice of the appropriate
 213 environment variables.
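
Since @code{setlocale} returns a null pointer when the requested locale
cannot be selected, a program may want to check the result of this call.
The following is a minimal sketch; the helper name
@code{use_environment_locale} and the wording of the warning are
illustrative assumptions.

```c
#include <locale.h>
#include <stdio.h>

/* Adopt the locale selected by the environment.  Return 0 on success
   and -1 if the environment names a locale that cannot be used, in
   which case the program keeps its current locale.
   (Hypothetical helper, shown for illustration.)  */
int
use_environment_locale (void)
{
  if (setlocale (LC_ALL, "") == NULL)
    {
      fputs ("warning: cannot set locale from environment\n", stderr);
      return -1;
    }
  return 0;
}
```

Calling such a helper once, early in @code{main} and before any other
threads are started, keeps the rest of the program free of locale setup.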
 214
 215 @cindex changing the locale
 216 @cindex locale, changing
 217 You can also use @code{setlocale} to specify a particular locale, for
 218 general use or for a specific category.
 219
 220 @pindex locale.h
 221 The symbols in this section are defined in the header file @file{locale.h}.
 222
 223 @deftypefun {char *} setlocale (int @var{category}, const char *@var{locale})
 224 @standards{ISO, locale.h}
 225 @safety{@prelim{}@mtunsafe{@mtasuconst{:@mtslocale{}} @mtsenv{}}@asunsafe{@asuinit{} @asulock{} @ascuheap{} @asucorrupt{}}@acunsafe{@acuinit{} @acucorrupt{} @aculock{} @acsmem{} @acsfd{}}}
 226 @c Uses of the global locale object are unguarded in functions that
 227 @c ought to be MT-Safe, so we're ruling out the use of this function
 228 @c once threads are started.  It takes a write lock itself, but it may
 229 @c return a pointer loaded from the global locale object after releasing
 230 @c the lock, or before taking it.
 231 @c setlocale @mtasuconst:@mtslocale @mtsenv @asuinit @ascuheap @asulock @asucorrupt @acucorrupt @acsmem @acsfd @aculock
 232 @c  libc_rwlock_wrlock @asulock @aculock
 233 @c  libc_rwlock_unlock @aculock
 234 @c  getenv LOCPATH @mtsenv
 235 @c  malloc @ascuheap @acsmem
 236 @c  free @ascuheap @acsmem
 237 @c  new_composite_name ok
 238 @c  setdata ok
 239 @c  setname ok
 240 @c  _nl_find_locale @mtsenv @asuinit @ascuheap @asulock @asucorrupt @acucorrupt @acsmem @acsfd @aculock
 241 @c   getenv LC_ALL and LANG @mtsenv
 242 @c   _nl_load_locale_from_archive @ascuheap @acucorrupt @acsmem @acsfd
 243 @c    sysconf _SC_PAGE_SIZE ok
 244 @c    _nl_normalize_codeset @ascuheap @acsmem
 245 @c     isalnum_l ok (C locale)
 246 @c     isdigit_l ok (C locale)
 247 @c     malloc @ascuheap @acsmem
 248 @c     tolower_l ok (C locale)
 249 @c    open_not_cancel_2 @acsfd
 250 @c    fxstat64 ok
 251 @c    close_not_cancel_no_status ok
 252 @c    __mmap64 @acsmem
 253 @c    calculate_head_size ok
 254 @c    __munmap ok
 255 @c    compute_hashval ok
 256 @c    qsort dup @acucorrupt
 257 @c     rangecmp ok
 258 @c    malloc @ascuheap @acsmem
 259 @c    strdup @ascuheap @acsmem
 260 @c    _nl_intern_locale_data @ascuheap @acsmem
 261 @c     malloc @ascuheap @acsmem
 262 @c     free @ascuheap @acsmem
 263 @c   _nl_expand_alias @ascuheap @asulock @acsmem @acsfd @aculock
 264 @c    libc_lock_lock @asulock @aculock
 265 @c    bsearch ok
 266 @c     alias_compare ok
 267 @c      strcasecmp ok
 268 @c    read_alias_file @ascuheap @asulock @acsmem @acsfd @aculock
 269 @c     fopen @ascuheap @asulock @acsmem @acsfd @aculock
 270 @c     fsetlocking ok
 271 @c     feof_unlocked ok
 272 @c     fgets_unlocked ok
 273 @c     isspace ok (locale mutex is locked)
 274 @c     extend_alias_table @ascuheap @acsmem
 275 @c      realloc @ascuheap @acsmem
 276 @c     realloc @ascuheap @acsmem
 277 @c     fclose @ascuheap @asulock @acsmem @acsfd @aculock
 278 @c     qsort @ascuheap @acsmem
 279 @c      alias_compare dup
 280 @c    libc_lock_unlock @aculock
 281 @c   _nl_explode_name @ascuheap @acsmem
 282 @c    _nl_find_language ok
 283 @c    _nl_normalize_codeset dup @ascuheap @acsmem
 284 @c   _nl_make_l10nflist @ascuheap @acsmem
 285 @c    malloc @ascuheap @acsmem
 286 @c    free @ascuheap @acsmem
 287 @c    __argz_stringify ok
 288 @c    __argz_count ok
 289 @c    __argz_next ok
 290 @c   _nl_load_locale @ascuheap @acsmem @acsfd
 291 @c    open_not_cancel_2 @acsfd
 292 @c    __fxstat64 ok
 293 @c    close_not_cancel_no_status ok
 294 @c    mmap @acsmem
 295 @c    malloc @ascuheap @acsmem
 296 @c    read_not_cancel ok
 297 @c    free @ascuheap @acsmem
 298 @c    _nl_intern_locale_data dup @ascuheap @acsmem
 299 @c    munmap ok
 300 @c   __gconv_compare_alias @asuinit @ascuheap @asucorrupt @asulock @acsmem@acucorrupt @acsfd @aculock
 301 @c    __gconv_read_conf @asuinit @ascuheap @asucorrupt @asulock @acsmem@acucorrupt @acsfd @aculock
 302 @c     (libc_once-initializes gconv_cache and gconv_path_envvar; they're
 303 @c      never modified afterwards)
 304 @c     __gconv_load_cache @ascuheap @acsmem @acsfd
 305 @c      getenv GCONV_PATH @mtsenv
 306 @c      open_not_cancel @acsfd
 307 @c      __fxstat64 ok
 308 @c      close_not_cancel_no_status ok
 309 @c      mmap @acsmem
 310 @c      malloc @ascuheap @acsmem
 311 @c      __read ok
 312 @c      free @ascuheap @acsmem
 313 @c      munmap ok
 314 @c     __gconv_get_path @asulock @ascuheap @aculock @acsmem @acsfd
 315 @c      getcwd @ascuheap @acsmem @acsfd
 316 @c      libc_lock_lock @asulock @aculock
 317 @c      malloc @ascuheap @acsmem
 318 @c      strtok_r ok
 319 @c      libc_lock_unlock @aculock
 320 @c     read_conf_file @ascuheap @asucorrupt @asulock @acsmem @acucorrupt @acsfd @aculock
 321 @c      fopen @ascuheap @asulock @acsmem @acsfd @aculock
 322 @c      fsetlocking ok
 323 @c      feof_unlocked ok
 324 @c      getdelim @ascuheap @asucorrupt @acsmem @acucorrupt
 325 @c      isspace_l ok (C locale)
 326 @c      add_alias
 327 @c       isspace_l ok (C locale)
 328 @c       toupper_l ok (C locale)
 329 @c       add_alias2 dup @ascuheap @acucorrupt @acsmem
 330 @c      add_module @ascuheap @acsmem
 331 @c       isspace_l ok (C locale)
 332 @c       toupper_l ok (C locale)
 333 @c       strtol ok (@mtslocale but we hold the locale lock)
 334 @c       tfind __gconv_alias_db ok
 335 @c        __gconv_alias_compare dup ok
 336 @c       calloc @ascuheap @acsmem
 337 @c       insert_module dup @ascuheap
 338 @c     __tfind ok (because the tree is read only by then)
 339 @c      __gconv_alias_compare dup ok
 340 @c     insert_module @ascuheap
 341 @c      free @ascuheap
 342 @c     add_alias2 @ascuheap @acucorrupt @acsmem
 343 @c      detect_conflict ok, reads __gconv_modules_db
 344 @c      malloc @ascuheap @acsmem
 345 @c      tsearch __gconv_alias_db @ascuheap @acucorrupt @acsmem [exclusive tree, no @mtsrace]
 346 @c       __gconv_alias_compare ok
 347 @c      free @ascuheap
 348 @c    __gconv_compare_alias_cache ok
 349 @c     find_module_idx ok
 350 @c    do_lookup_alias ok
 351 @c     __tfind ok (because the tree is read only by then)
 352 @c      __gconv_alias_compare ok
 353 @c   strndup @ascuheap @acsmem
 354 @c   strcasecmp_l ok (C locale)
 355 The function @code{setlocale} sets the current locale for category
 356 @var{category} to @var{locale}.
 357
 358 If @var{category} is @code{LC_ALL}, this specifies the locale for all
 359 purposes.  The other possible values of @var{category} specify a
 360 single purpose (@pxref{Locale Categories}).
 361
 362 You can also use this function to find out the current locale by passing
 363 a null pointer as the @var{locale} argument.  In this case,
 364 @code{setlocale} returns a string that is the name of the locale
 365 currently selected for category @var{category}.
 366
 367 The string returned by @code{setlocale} can be overwritten by subsequent
 368 calls, so you should make a copy of the string (@pxref{Copying Strings
 369 and Arrays}) if you want to save it past any further calls to
 370 @code{setlocale}.  (The standard library is guaranteed never to call
 371 @code{setlocale} itself.)
 372
 373 You should not modify the string returned by @code{setlocale}.  It might
 374 be the same string that was passed as an argument in a previous call to
  375 @code{setlocale}.  If you later pass a returned string back in as the
  376 @var{locale} argument, the @var{category} must be the same as in the
  377 call that returned the string.
 378
 379 When you read the current locale for category @code{LC_ALL}, the value
 380 encodes the entire combination of selected locales for all categories.
 381 If you specify the same ``locale name'' with @code{LC_ALL} in a
 382 subsequent call to @code{setlocale}, it restores the same combination
 383 of locale selections.
 384
 385 To be sure you can use the returned string encoding the currently selected
 386 locale at a later time, you must make a copy of the string.  It is not
 387 guaranteed that the returned pointer remains valid over time.
 388
 389 When the @var{locale} argument is not a null pointer, the string returned
 390 by @code{setlocale} reflects the newly-modified locale.
 391
 392 If you specify an empty string for @var{locale}, this means to read the
 393 appropriate environment variable and use its value to select the locale
 394 for @var{category}.
 395
 396 If a nonempty string is given for @var{locale}, then the locale of that
 397 name is used if possible.
 398
 399 The effective locale name (either the second argument to
 400 @code{setlocale}, or if the argument is an empty string, the name
 401 obtained from the process environment) must be a valid locale name.
 402 @xref{Locale Names}.
 403
 404 If you specify an invalid locale name, @code{setlocale} returns a null
 405 pointer and leaves the current locale unchanged.
 406 @end deftypefun
 407
 408 Here is an example showing how you might use @code{setlocale} to
 409 temporarily switch to a new locale.
 410
 411 @smallexample
 412 #include <stddef.h>
 413 #include <locale.h>
 414 #include <stdlib.h>
 415 #include <string.h>
 416
 417 void
 418 with_other_locale (char *new_locale,
 419                    void (*subroutine) (int),
 420                    int argument)
 421 @{
 422   char *old_locale, *saved_locale;
 423
 424   /* @r{Get the name of the current locale.}  */
 425   old_locale = setlocale (LC_ALL, NULL);
 426
 427   /* @r{Copy the name so it won't be clobbered by @code{setlocale}.} */
 428   saved_locale = strdup (old_locale);
 429   if (saved_locale == NULL)
  430     abort ();  /* @r{Out of memory.} */
 431
 432   /* @r{Now change the locale and do some stuff with it.} */
 433   setlocale (LC_ALL, new_locale);
 434   (*subroutine) (argument);
 435
 436   /* @r{Restore the original locale.} */
 437   setlocale (LC_ALL, saved_locale);
 438   free (saved_locale);
 439 @}
 440 @end smallexample
 441
 442 @strong{Portability Note:} Some @w{ISO C} systems may define additional
 443 locale categories, and future versions of the library will do so.  For
 444 portability, assume that any symbol beginning with @samp{LC_} might be
 445 defined in @file{locale.h}.
 446
 447 @node Standard Locales, Locale Names, Setting the Locale, Locales
 448 @section Standard Locales
 449
 450 The only locale names you can count on finding on all operating systems
 451 are these three standard ones:
 452
 453 @table @code
 454 @item "C"
 455 This is the standard C locale.  The attributes and behavior it provides
 456 are specified in the @w{ISO C} standard.  When your program starts up, it
 457 initially uses this locale by default.
 458
 459 @item "POSIX"
 460 This is the standard POSIX locale.  Currently, it is an alias for the
 461 standard C locale.
 462
 463 @item ""
 464 The empty name says to select a locale based on environment variables.
 465 @xref{Locale Categories}.
 466 @end table
 467
 468 Defining and installing named locales is normally a responsibility of
 469 the system administrator at your site (or the person who installed
 470 @theglibc{}).  It is also possible for the user to create private
 471 locales.  All this will be discussed later when describing the tool to
 472 do so.
 473 @comment (@pxref{Building Locale Files}).
 474
 475 If your program needs to use something other than the @samp{C} locale,
 476 it will be more portable if you use whatever locale the user specifies
 477 with the environment, rather than trying to specify some non-standard
 478 locale explicitly by name.  Remember, different machines might have
 479 different sets of locales installed.
 480
 481 @node Locale Names, Locale Information, Standard Locales, Locales
 482 @section Locale Names
 483
 484 The following command prints a list of locales supported by the
 485 system:
 486
 487 @pindex locale
 488 @smallexample
 489   locale -a
 490 @end smallexample
 491
 492 @strong{Portability Note:} With the notable exception of the standard
 493 locale names @samp{C} and @samp{POSIX}, locale names are
 494 system-specific.
 495
 496 Most locale names follow XPG syntax and consist of up to four parts:
 497
 498 @smallexample
 499 @var{language}[_@var{territory}[.@var{codeset}]][@@@var{modifier}]
 500 @end smallexample
 501
  502 All parts except the first may be omitted.  If the fully specified
  503 locale is not found, less specific ones are looked for.  The various
  504 parts are stripped off in the following order:
 505
 506 @enumerate
 507 @item
 508 codeset
 509 @item
 510 normalized codeset
 511 @item
 512 territory
 513 @item
 514 modifier
 515 @end enumerate
 516
 517 For example, the locale name @samp{de_AT.iso885915@@euro} denotes a
 518 German-language locale for use in Austria, using the ISO-8859-15
 519 (Latin-9) character set, and with the Euro as the currency symbol.
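
To make the XPG syntax concrete, here is a small sketch that splits a
locale name into its four parts.  It is not how the library parses names
internally; the type @code{struct locale_parts}, the function
@code{split_locale_name}, and the fixed part size are all assumptions made
for this illustration.  The sketch is also slightly more lenient than the
syntax above, in that it accepts a codeset without a territory.

```c
#include <string.h>

#define PART_MAX 32   /* Illustrative limit on the length of one part.  */

/* The four parts of an XPG-style locale name.  Missing parts are
   stored as empty strings.  */
struct locale_parts
{
  char language[PART_MAX];
  char territory[PART_MAX];
  char codeset[PART_MAX];
  char modifier[PART_MAX];
};

/* Copy LEN bytes of SRC into DST, truncating to fit PART_MAX.  */
static void
copy_part (char *dst, const char *src, size_t len)
{
  if (len > PART_MAX - 1)
    len = PART_MAX - 1;
  memcpy (dst, src, len);
  dst[len] = '\0';
}

/* Split NAME of the form language[_territory[.codeset]][@modifier].  */
void
split_locale_name (const char *name, struct locale_parts *out)
{
  size_t lang_len = strcspn (name, "_.@");
  const char *p = name + lang_len;

  memset (out, 0, sizeof *out);
  copy_part (out->language, name, lang_len);

  if (*p == '_')
    {
      size_t len = strcspn (p + 1, ".@");
      copy_part (out->territory, p + 1, len);
      p += 1 + len;
    }
  if (*p == '.')
    {
      size_t len = strcspn (p + 1, "@");
      copy_part (out->codeset, p + 1, len);
      p += 1 + len;
    }
  if (*p == '@')
    copy_part (out->modifier, p + 1, strlen (p + 1));
}
```

For the example name above, this yields the language @samp{de}, the
territory @samp{AT}, the codeset @samp{iso885915}, and the modifier
@samp{euro}.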
 520
 521 In addition to locale names which follow XPG syntax, systems may
 522 provide aliases such as @samp{german}.  Both categories of names must
 523 not contain the slash character @samp{/}.
 524
 525 If the locale name starts with a slash @samp{/}, it is treated as a
 526 path relative to the configured locale directories; see @code{LOCPATH}
 527 below.  The specified path must not contain a component @samp{..}, or
 528 the name is invalid, and @code{setlocale} will fail.
 529
 530 @strong{Portability Note:} POSIX suggests that if a locale name starts
 531 with a slash @samp{/}, it is resolved as an absolute path.  However,
 532 @theglibc{} treats it as a relative path under the directories listed
 533 in @code{LOCPATH} (or the default locale directory if @code{LOCPATH}
 534 is unset).
 535
 536 Locale names which are longer than an implementation-defined limit are
 537 invalid and cause @code{setlocale} to fail.
 538
 539 As a special case, locale names used with @code{LC_ALL} can combine
 540 several locales, reflecting different locale settings for different
 541 categories.  For example, you might want to use a U.S. locale with ISO
 542 A4 paper format, so you set @code{LANG} to @samp{en_US.UTF-8}, and
 543 @code{LC_PAPER} to @samp{de_DE.UTF-8}.  In this case, the
 544 @code{LC_ALL}-style combined locale name is
 545
 546 @smallexample
 547 LC_CTYPE=en_US.UTF-8;LC_TIME=en_US.UTF-8;LC_PAPER=de_DE.UTF-8;@dots{}
 548 @end smallexample
 549
 550 followed by other category settings not shown here.
 551
 552 @vindex LOCPATH
 553 The path used for finding locale data can be set using the
 554 @code{LOCPATH} environment variable.  This variable lists the
 555 directories in which to search for locale definitions, separated by a
 556 colon @samp{:}.
 557
 558 The default path for finding locale data is system specific.  A typical
 559 value for the @code{LOCPATH} default is:
 560
 561 @smallexample
 562 /usr/share/locale
 563 @end smallexample
 564
 565 The value of @code{LOCPATH} is ignored by privileged programs for
 566 security reasons, and only the default directory is used.
 567
 568 @node Locale Information, Formatting Numbers, Locale Names, Locales
 569 @section Accessing Locale Information
 570
 571 There are several ways to access locale information.  The simplest
 572 way is to let the C library itself do the work.  Several of the
 573 functions in this library implicitly access the locale data, and use
 574 what information is provided by the currently selected locale.  This is
 575 how the locale model is meant to work normally.
 576
 577 As an example take the @code{strftime} function, which is meant to nicely
 578 format date and time information (@pxref{Formatting Calendar Time}).
 579 Part of the standard information contained in the @code{LC_TIME}
  580 category is the names of the months and weekdays.  Instead of requiring
  581 the programmer to provide the translations, the @code{strftime}
  582 function does this all by itself.  @code{%A}
 583 in the format string is replaced by the appropriate weekday
 584 name of the locale currently selected by @code{LC_TIME}.  This is an
 585 easy example, and wherever possible functions do things automatically
 586 in this way.
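
A minimal sketch of the @code{%A} case: the helper below asks
@code{strftime} for the name of a weekday in whatever locale is currently
selected.  The function name @code{weekday_name} is an illustrative
assumption, not a library function.

```c
#include <stddef.h>
#include <time.h>

/* Store in BUF the name of weekday WDAY (0 = Sunday) as spelled in
   the current LC_TIME locale.  Return the length of the name, or 0 if
   it does not fit in SIZE bytes.  */
size_t
weekday_name (int wday, char *buf, size_t size)
{
  struct tm tm = { 0 };

  tm.tm_wday = wday;
  return strftime (buf, size, "%A", &tm);
}
```

In the @samp{C} locale, @code{weekday_name (4, buf, sizeof buf)} stores
@samp{Thursday}; after a successful @code{setlocale} call the same code
produces the name used by the selected locale.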
 587
 588 But there are quite often situations when there is simply no function
 589 to perform the task, or it is simply not possible to do the work
 590 automatically.  For these cases it is necessary to access the
 591 information in the locale directly.  To do this the C library provides
 592 two functions: @code{localeconv} and @code{nl_langinfo}.  The former is
 593 part of @w{ISO C} and therefore portable, but has a brain-damaged
  594 interface.  The latter is part of the Unix interface and is portable
  595 insofar as the system follows the Unix standards.
 596
 597 @menu
 598 * The Lame Way to Locale Data::   ISO C's @code{localeconv}.
 599 * The Elegant and Fast Way::      X/Open's @code{nl_langinfo}.
 600 @end menu
 601
 602 @node The Lame Way to Locale Data, The Elegant and Fast Way, ,Locale Information
 603 @subsection @code{localeconv}: It is portable but @dots{}
 604
 605 Together with the @code{setlocale} function the @w{ISO C} people
 606 invented the @code{localeconv} function.  It is a masterpiece of poor
 607 design.  It is expensive to use, not extensible, and not generally
 608 usable as it provides access to only @code{LC_MONETARY} and
 609 @code{LC_NUMERIC} related information.  Nevertheless, if it is
 610 applicable to a given situation it should be used since it is very
 611 portable.  The function @code{strfmon} formats monetary amounts
 612 according to the selected locale using this information.
 613 @pindex locale.h
 614 @cindex monetary value formatting
 615 @cindex numeric value formatting
 616
 617 @deftypefun {struct lconv *} localeconv (void)
 618 @standards{ISO, locale.h}
 619 @safety{@prelim{}@mtunsafe{@mtasurace{:localeconv} @mtslocale{}}@asunsafe{}@acsafe{}}
 620 @c This function reads from multiple components of the locale object,
 621 @c without synchronization, while writing to the static buffer it uses
 622 @c as the return value.
 623 The @code{localeconv} function returns a pointer to a structure whose
 624 components contain information about how numeric and monetary values
 625 should be formatted in the current locale.
 626
 627 You should not modify the structure or its contents.  The structure might
 628 be overwritten by subsequent calls to @code{localeconv}, or by calls to
 629 @code{setlocale}, but no other function in the library overwrites this
 630 value.
 631 @end deftypefun
 632
 633 @deftp {Data Type} {struct lconv}
 634 @standards{ISO, locale.h}
 635 @code{localeconv}'s return value is of this data type.  Its elements are
 636 described in the following subsections.
 637 @end deftp
 638
 639 If a member of the structure @code{struct lconv} has type @code{char},
 640 and the value is @code{CHAR_MAX}, it means that the current locale has
 641 no value for that parameter.
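
The @code{CHAR_MAX} convention can be checked as in the following sketch,
which reads the @code{frac_digits} member of @code{struct lconv}.  The
function name @code{monetary_frac_digits} and its use of @code{-1} to
mean ``unspecified'' are assumptions made for this illustration.

```c
#include <limits.h>
#include <locale.h>

/* Return the number of fractional digits the current locale uses for
   monetary values in local format, or -1 if the locale does not
   specify it.  (Hypothetical helper, shown for illustration.)  */
int
monetary_frac_digits (void)
{
  const struct lconv *lc = localeconv ();

  return lc->frac_digits == CHAR_MAX ? -1 : lc->frac_digits;
}
```

The structure is read immediately and no pointer into it is kept, because
later calls to @code{localeconv} or @code{setlocale} may overwrite it.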
 642
 643 @menu
 644 * General Numeric::             Parameters for formatting numbers and
 645                                  currency amounts.
 646 * Currency Symbol::             How to print the symbol that identifies an
 647                                  amount of money (e.g. @samp{$}).
 648 * Sign of Money Amount::        How to print the (positive or negative) sign
 649                                  for a monetary amount, if one exists.
 650 @end menu
 651
 652 @node General Numeric, Currency Symbol, , The Lame Way to Locale Data
 653 @subsubsection Generic Numeric Formatting Parameters
 654
 655 These are the standard members of @code{struct lconv}; there may be
 656 others.
 657
 658 @table @code
 659 @item char *decimal_point
 660 @itemx char *mon_decimal_point
 661 These are the decimal-point separators used in formatting non-monetary
 662 and monetary quantities, respectively.  In the @samp{C} locale, the
 663 value of @code{decimal_point} is @code{"."}, and the value of
 664 @code{mon_decimal_point} is @code{""}.
 665 @cindex decimal-point separator
 666
 667 @item char *thousands_sep
 668 @itemx char *mon_thousands_sep
 669 These are the separators used to delimit groups of digits to the left of
 670 the decimal point in formatting non-monetary and monetary quantities,
 671 respectively.  In the @samp{C} locale, both members have a value of
 672 @code{""} (the empty string).
 673
 674 @item char *grouping
 675 @itemx char *mon_grouping
 676 These are strings that specify how to group the digits to the left of
 677 the decimal point.  @code{grouping} applies to non-monetary quantities
 678 and @code{mon_grouping} applies to monetary quantities.  Use either
 679 @code{thousands_sep} or @code{mon_thousands_sep} to separate the digit
 680 groups.
 681 @cindex grouping of digits
 682
 683 Each member of these strings is to be interpreted as an integer value of
 684 type @code{char}.  Successive numbers (from left to right) give the
 685 sizes of successive groups (from right to left, starting at the decimal
  686 point).  The last member is either @code{0}, in which case the previous
 687 member is used over and over again for all the remaining groups, or
 688 @code{CHAR_MAX}, in which case there is no more grouping---or, put
 689 another way, any remaining digits form one large group without
 690 separators.
 691
 692 For example, if @code{grouping} is @code{"\04\03\02"}, the correct
 693 grouping for the number @code{123456787654321} is @samp{12}, @samp{34},
 694 @samp{56}, @samp{78}, @samp{765}, @samp{4321}.  This uses a group of 4
 695 digits at the end, preceded by a group of 3 digits, preceded by groups
 696 of 2 digits (as many as needed).  With a separator of @samp{,}, the
 697 number would be printed as @samp{12,34,56,78,765,4321}.
 698
 699 A value of @code{"\03"} indicates repeated groups of three digits, as
 700 normally used in the U.S.
 701
 702 In the standard @samp{C} locale, both @code{grouping} and
 703 @code{mon_grouping} have a value of @code{""}.  This value specifies no
 704 grouping at all.
 705
 706 @item char int_frac_digits
 707 @itemx char frac_digits
 708 These are small integers indicating how many fractional digits (to the
 709 right of the decimal point) should be displayed in a monetary value in
 710 international and local formats, respectively.  (Most often, both
 711 members have the same value.)
 712
 713 In the standard @samp{C} locale, both of these members have the value
 714 @code{CHAR_MAX}, meaning ``unspecified''.  The ISO standard doesn't say
 715 what to do when you find this value; we recommend printing no
 716 fractional digits.  (This locale also specifies the empty string for
 717 @code{mon_decimal_point}, so printing any fractional digits would be
 718 confusing!)
 719 @end table
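
The grouping rules described above can be sketched in code.  The
following is only an illustration under stated assumptions: the helper
name @code{group_digits}, its parameters, and the fixed-size scratch
buffer are hypothetical and not part of the library.  It walks the digit
string from right to left and applies the sizes found in a
@code{grouping}-style string.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a library function): copy DIGITS, a string
   of integer digits, into OUT with SEP inserted between the groups
   described by GROUPING, which uses the format of the grouping member
   of struct lconv.  Returns 0 on success or -1 if a buffer is too
   small.  */
int
group_digits (const char *digits, const char *grouping, char sep,
              char *out, size_t outsize)
{
  char tmp[128];                /* Filled from the end toward the start.  */
  size_t pos = sizeof tmp;      /* Index of the first byte in use.  */
  size_t remaining = strlen (digits);
  const char *g = grouping;
  /* A group size of 0 is used here to mean "no further grouping".  */
  size_t size = (*g == CHAR_MAX) ? 0 : (size_t) (unsigned char) *g;

  while (remaining > 0)
    {
      size_t take = (size == 0 || size > remaining) ? remaining : size;
      if (take + 1 > pos)
        return -1;
      pos -= take;
      memcpy (tmp + pos, digits + remaining - take, take);
      remaining -= take;
      if (remaining > 0)
        tmp[--pos] = sep;

      /* Select the size of the next group: CHAR_MAX ends grouping, the
         terminating 0 repeats the previous size.  */
      if (size != 0)
        {
          if (g[1] == CHAR_MAX)
            size = 0;
          else if (g[1] != '\0')
            size = (size_t) (unsigned char) *++g;
        }
    }

  size_t len = sizeof tmp - pos;
  if (len + 1 > outsize)
    return -1;
  memcpy (out, tmp + pos, len);
  out[len] = '\0';
  return 0;
}
```

Called with the digits @code{123456787654321}, the grouping string
@code{"\04\03\02"}, and the separator @samp{,}, this sketch yields
@samp{12,34,56,78,765,4321}, matching the example given earlier.  An
empty grouping string leaves the digits unchanged.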
 720
 721 @node Currency Symbol, Sign of Money Amount, General Numeric, The Lame Way to Locale Data
 722 @subsubsection Printing the Currency Symbol
 723 @cindex currency symbols
 724
 725 These members of the @code{struct lconv} structure specify how to print
 726 the symbol to identify a monetary value---the international analog of
 727 @samp{$} for US dollars.
 728
 729 Each country has two standard currency symbols.  The @dfn{local currency
 730 symbol} is used commonly within the country, while the
 731 @dfn{international currency symbol} is used internationally to refer to
 732 that country's currency when it is necessary to indicate the country
 733 unambiguously.
 734
 735 For example, many countries use the dollar as their monetary unit, and
 736 when dealing with international currencies it's important to specify
 737 that one is dealing with (say) Canadian dollars instead of U.S. dollars
 738 or Australian dollars.  But when the context is known to be Canada,
 739 there is no need to make this explicit---dollar amounts are implicitly
 740 assumed to be in Canadian dollars.
 741
 742 @table @code
 743 @item char *currency_symbol
 744 The local currency symbol for the selected locale.
 745
 746 In the standard @samp{C} locale, this member has a value of @code{""}
 747 (the empty string), meaning ``unspecified''.  The ISO standard doesn't
 748 say what to do when you find this value; we recommend you simply print
 749 the empty string as you would print any other string pointed to by this
 750 variable.
 751
 752 @item char *int_curr_symbol
 753 The international currency symbol for the selected locale.
 754
 755 The value of @code{int_curr_symbol} should normally consist of a
 756 three-letter abbreviation determined by the international standard
 757 @cite{ISO 4217 Codes for the Representation of Currency and Funds},
 758 followed by a one-character separator (often a space).
 759
 760 In the standard @samp{C} locale, this member has a value of @code{""}
 761 (the empty string), meaning ``unspecified''.  We recommend you simply print
 762 the empty string as you would print any other string pointed to by this
 763 variable.
 764
 765 @item char p_cs_precedes
 766 @itemx char n_cs_precedes
 767 @itemx char int_p_cs_precedes
 768 @itemx char int_n_cs_precedes
 769 These members are @code{1} if the @code{currency_symbol} or
 770 @code{int_curr_symbol} strings should precede the value of a monetary
 771 amount, or @code{0} if the strings should follow the value.  The
 772 @code{p_cs_precedes} and @code{int_p_cs_precedes} members apply to
 773 positive amounts (or zero), and the @code{n_cs_precedes} and
 774 @code{int_n_cs_precedes} members apply to negative amounts.
 775
 776 In the standard @samp{C} locale, all of these members have a value of
 777 @code{CHAR_MAX}, meaning ``unspecified''.  The ISO standard doesn't say
 778 what to do when you find this value.  We recommend printing the
 779 currency symbol before the amount, which is right for most countries.
 780 In other words, treat all nonzero values alike in these members.
 781
 782 The members with the @code{int_} prefix apply to the
 783 @code{int_curr_symbol} while the other two apply to
 784 @code{currency_symbol}.
 785
 786 @item char p_sep_by_space
 787 @itemx char n_sep_by_space
 788 @itemx char int_p_sep_by_space
 789 @itemx char int_n_sep_by_space
 790 These members are @code{1} if a space should appear between the
 791 @code{currency_symbol} or @code{int_curr_symbol} strings and the
 792 amount, or @code{0} if no space should appear.  The
 793 @code{p_sep_by_space} and @code{int_p_sep_by_space} members apply to
 794 positive amounts (or zero), and the @code{n_sep_by_space} and
 795 @code{int_n_sep_by_space} members apply to negative amounts.
 796
 797 In the standard @samp{C} locale, all of these members have a value of
 798 @code{CHAR_MAX}, meaning ``unspecified''.  The ISO standard doesn't say
 799 what you should do when you find this value; we suggest you treat it as
 800 1 (print a space).  In other words, treat all nonzero values alike in
 801 these members.
 802
 803 The members with the @code{int_} prefix apply to the
 804 @code{int_curr_symbol} while the other two apply to
@code{currency_symbol}.  There is one special case for
@code{int_curr_symbol}, though.  Since all legal values contain a space
at the end of the string, one either prints this space (if the currency
symbol must appear in front and must be separated) or one has to avoid
printing this character at all (especially when the symbol is at the
end of the string).
 811 @end table
 812
 813 @node Sign of Money Amount, , Currency Symbol, The Lame Way to Locale Data
 814 @subsubsection Printing the Sign of a Monetary Amount
 815
 816 These members of the @code{struct lconv} structure specify how to print
 817 the sign (if any) of a monetary value.
 818
 819 @table @code
 820 @item char *positive_sign
 821 @itemx char *negative_sign
 822 These are strings used to indicate positive (or zero) and negative
 823 monetary quantities, respectively.
 824
 825 In the standard @samp{C} locale, both of these members have a value of
 826 @code{""} (the empty string), meaning ``unspecified''.
 827
 828 The ISO standard doesn't say what to do when you find this value; we
 829 recommend printing @code{positive_sign} as you find it, even if it is
 830 empty.  For a negative value, print @code{negative_sign} as you find it
 831 unless both it and @code{positive_sign} are empty, in which case print
 832 @samp{-} instead.  (Failing to indicate the sign at all seems rather
 833 unreasonable.)
 834
 835 @item char p_sign_posn
 836 @itemx char n_sign_posn
 837 @itemx char int_p_sign_posn
 838 @itemx char int_n_sign_posn
 839 These members are small integers that indicate how to
 840 position the sign for nonnegative and negative monetary quantities,
 841 respectively.  (The string used for the sign is what was specified with
 842 @code{positive_sign} or @code{negative_sign}.)  The possible values are
 843 as follows:
 844
 845 @table @code
 846 @item 0
 847 The currency symbol and quantity should be surrounded by parentheses.
 848
 849 @item 1
 850 Print the sign string before the quantity and currency symbol.
 851
 852 @item 2
 853 Print the sign string after the quantity and currency symbol.
 854
 855 @item 3
 856 Print the sign string right before the currency symbol.
 857
 858 @item 4
 859 Print the sign string right after the currency symbol.
 860
 861 @item CHAR_MAX
``Unspecified''.  All of these members have this value in the standard
@samp{C} locale.
 864 @end table
 865
 866 The ISO standard doesn't say what you should do when the value is
 867 @code{CHAR_MAX}.  We recommend you print the sign after the currency
 868 symbol.
 869
 870 The members with the @code{int_} prefix apply to the
 871 @code{int_curr_symbol} while the other two apply to
 872 @code{currency_symbol}.
 873 @end table
 874
 875 @node The Elegant and Fast Way, , The Lame Way to Locale Data, Locale Information
 876 @subsection Pinpoint Access to Locale Data
 877
 878 When writing the X/Open Portability Guide the authors realized that the
 879 @code{localeconv} function is not enough to provide reasonable access to
 880 locale information.  The information which was meant to be available
 881 in the locale (as later specified in the POSIX.1 standard) requires more
 882 ways to access it.  Therefore the @code{nl_langinfo} function
 883 was introduced.
 884
 885 @deftypefun {char *} nl_langinfo (nl_item @var{item})
 886 @standards{XOPEN, langinfo.h}
 887 @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
 888 @c It calls _nl_langinfo_l with the current locale, which returns a
 889 @c pointer into constant strings defined in locale data structures.
 890 The @code{nl_langinfo} function can be used to access individual
 891 elements of the locale categories.  Unlike the @code{localeconv}
 892 function, which returns all the information, @code{nl_langinfo}
 893 lets the caller select what information it requires.  This is very
 894 fast and it is not a problem to call this function multiple times.
 895
 896 A second advantage is that in addition to the numeric and monetary
 897 formatting information, information from the
 898 @code{LC_TIME} and @code{LC_MESSAGES} categories is available.
 899
 900 @pindex langinfo.h
 901 The type @code{nl_item} is defined in @file{nl_types.h}.  The argument
 902 @var{item} is a numeric value defined in the header @file{langinfo.h}.
 903 The X/Open standard defines the following values:
 904
 905 @vtable @code
 906 @item CODESET
 907 @code{nl_langinfo} returns a string with the name of the coded character
 908 set used in the selected locale.
 909
 910 @item ABDAY_1
 911 @itemx ABDAY_2
 912 @itemx ABDAY_3
 913 @itemx ABDAY_4
 914 @itemx ABDAY_5
 915 @itemx ABDAY_6
 916 @itemx ABDAY_7
 917 @code{nl_langinfo} returns the abbreviated weekday name.  @code{ABDAY_1}
 918 corresponds to Sunday.
 919 @item DAY_1
 920 @itemx DAY_2
 921 @itemx DAY_3
 922 @itemx DAY_4
 923 @itemx DAY_5
 924 @itemx DAY_6
 925 @itemx DAY_7
 926 Similar to @code{ABDAY_1}, etc.,@: but here the return value is the
 927 unabbreviated weekday name.
 928 @item ABMON_1
 929 @itemx ABMON_2
 930 @itemx ABMON_3
 931 @itemx ABMON_4
 932 @itemx ABMON_5
 933 @itemx ABMON_6
 934 @itemx ABMON_7
 935 @itemx ABMON_8
 936 @itemx ABMON_9
 937 @itemx ABMON_10
 938 @itemx ABMON_11
 939 @itemx ABMON_12
 940 The return value is the abbreviated name of the month, in the
 941 grammatical form used when the month forms part of a complete date.
 942 @code{ABMON_1} corresponds to January.
 943 @item MON_1
 944 @itemx MON_2
 945 @itemx MON_3
 946 @itemx MON_4
 947 @itemx MON_5
 948 @itemx MON_6
 949 @itemx MON_7
 950 @itemx MON_8
 951 @itemx MON_9
 952 @itemx MON_10
 953 @itemx MON_11
 954 @itemx MON_12
 955 Similar to @code{ABMON_1}, etc.,@: but here the month names are not
 956 abbreviated.  Here the first value @code{MON_1} also corresponds to
 957 January.
 958 @item ALTMON_1
 959 @itemx ALTMON_2
 960 @itemx ALTMON_3
 961 @itemx ALTMON_4
 962 @itemx ALTMON_5
 963 @itemx ALTMON_6
 964 @itemx ALTMON_7
 965 @itemx ALTMON_8
 966 @itemx ALTMON_9
 967 @itemx ALTMON_10
 968 @itemx ALTMON_11
 969 @itemx ALTMON_12
 970 Similar to @code{MON_1}, etc.,@: but here the month names are in the
 971 grammatical form used when the month is named by itself.  The
 972 @code{strftime} functions use these month names for the conversion
 973 specifier @code{%OB} (@pxref{Formatting Calendar Time}).
 974
 975 Note that not all languages need two different forms of the month names,
 976 so the strings returned for @code{MON_@dots{}} and @code{ALTMON_@dots{}}
 977 may or may not be the same, depending on the locale.
 978
 979 @strong{NB:} @code{ABALTMON_@dots{}} constants corresponding to the
 980 @code{%Ob} conversion specifier are not currently provided, but are
 981 expected to be in a future release.  In the meantime, it is possible
 982 to use @code{_NL_ABALTMON_@dots{}}.
 983 @item AM_STR
 984 @itemx PM_STR
 985 The return values are strings which can be used in the representation of time
 986 as an hour from 1 to 12 plus an am/pm specifier.
 987
 988 Note that in locales which do not use this time representation
 989 these strings might be empty, in which case the am/pm format
 990 cannot be used at all.
 991 @item D_T_FMT
 992 The return value can be used as a format string for @code{strftime} to
 993 represent time and date in a locale-specific way.
 994 @item D_FMT
 995 The return value can be used as a format string for @code{strftime} to
 996 represent a date in a locale-specific way.
 997 @item T_FMT
 998 The return value can be used as a format string for @code{strftime} to
 999 represent time in a locale-specific way.
1000 @item T_FMT_AMPM
1001 The return value can be used as a format string for @code{strftime} to
1002 represent time in the am/pm format.
1003
1004 Note that if the am/pm format does not make any sense for the
1005 selected locale, the return value might be the same as the one for
1006 @code{T_FMT}.
1007 @item ERA
1008 The return value represents the era used in the current locale.
1009
1010 Most locales do not define this value.  An example of a locale which
1011 does define this value is the Japanese one.  In Japan, the traditional
1012 representation of dates includes the name of the era corresponding to
1013 the then-emperor's reign.
1014
1015 Normally it should not be necessary to use this value directly.
1016 Specifying the @code{E} modifier in their format strings causes the
1017 @code{strftime} functions to use this information.  The format of the
1018 returned string is not specified, and therefore you should not assume
1019 knowledge of it on different systems.
1020 @item ERA_YEAR
1021 The return value gives the year in the relevant era of the locale.
1022 As for @code{ERA} it should not be necessary to use this value directly.
1023 @item ERA_D_T_FMT
1024 This return value can be used as a format string for @code{strftime} to
1025 represent dates and times in a locale-specific era-based way.
1026 @item ERA_D_FMT
1027 This return value can be used as a format string for @code{strftime} to
1028 represent a date in a locale-specific era-based way.
1029 @item ERA_T_FMT
1030 This return value can be used as a format string for @code{strftime} to
1031 represent time in a locale-specific era-based way.
1032 @item ALT_DIGITS
1033 The return value is a representation of up to @math{100} values used to
1034 represent the values @math{0} to @math{99}.  As for @code{ERA} this
1035 value is not intended to be used directly, but instead indirectly
1036 through the @code{strftime} function.  When the modifier @code{O} is
1037 used in a format which would otherwise use numerals to represent hours,
1038 minutes, seconds, weekdays, months, or weeks, the appropriate value for
1039 the locale is used instead.
1040 @item INT_CURR_SYMBOL
1041 The same as the value returned by @code{localeconv} in the
1042 @code{int_curr_symbol} element of the @code{struct lconv}.
1043 @item CURRENCY_SYMBOL
1044 @itemx CRNCYSTR
1045 The same as the value returned by @code{localeconv} in the
1046 @code{currency_symbol} element of the @code{struct lconv}.
1047
1048 @code{CRNCYSTR} is a deprecated alias still required by Unix98.
1049 @item MON_DECIMAL_POINT
1050 The same as the value returned by @code{localeconv} in the
1051 @code{mon_decimal_point} element of the @code{struct lconv}.
1052 @item MON_THOUSANDS_SEP
1053 The same as the value returned by @code{localeconv} in the
1054 @code{mon_thousands_sep} element of the @code{struct lconv}.
1055 @item MON_GROUPING
1056 The same as the value returned by @code{localeconv} in the
1057 @code{mon_grouping} element of the @code{struct lconv}.
1058 @item POSITIVE_SIGN
1059 The same as the value returned by @code{localeconv} in the
1060 @code{positive_sign} element of the @code{struct lconv}.
1061 @item NEGATIVE_SIGN
1062 The same as the value returned by @code{localeconv} in the
1063 @code{negative_sign} element of the @code{struct lconv}.
1064 @item INT_FRAC_DIGITS
1065 The same as the value returned by @code{localeconv} in the
1066 @code{int_frac_digits} element of the @code{struct lconv}.
1067 @item FRAC_DIGITS
1068 The same as the value returned by @code{localeconv} in the
1069 @code{frac_digits} element of the @code{struct lconv}.
1070 @item P_CS_PRECEDES
1071 The same as the value returned by @code{localeconv} in the
1072 @code{p_cs_precedes} element of the @code{struct lconv}.
1073 @item P_SEP_BY_SPACE
1074 The same as the value returned by @code{localeconv} in the
1075 @code{p_sep_by_space} element of the @code{struct lconv}.
1076 @item N_CS_PRECEDES
1077 The same as the value returned by @code{localeconv} in the
1078 @code{n_cs_precedes} element of the @code{struct lconv}.
1079 @item N_SEP_BY_SPACE
1080 The same as the value returned by @code{localeconv} in the
1081 @code{n_sep_by_space} element of the @code{struct lconv}.
1082 @item P_SIGN_POSN
1083 The same as the value returned by @code{localeconv} in the
1084 @code{p_sign_posn} element of the @code{struct lconv}.
1085 @item N_SIGN_POSN
1086 The same as the value returned by @code{localeconv} in the
1087 @code{n_sign_posn} element of the @code{struct lconv}.
1088
1089 @item INT_P_CS_PRECEDES
1090 The same as the value returned by @code{localeconv} in the
1091 @code{int_p_cs_precedes} element of the @code{struct lconv}.
1092 @item INT_P_SEP_BY_SPACE
1093 The same as the value returned by @code{localeconv} in the
1094 @code{int_p_sep_by_space} element of the @code{struct lconv}.
1095 @item INT_N_CS_PRECEDES
1096 The same as the value returned by @code{localeconv} in the
1097 @code{int_n_cs_precedes} element of the @code{struct lconv}.
1098 @item INT_N_SEP_BY_SPACE
1099 The same as the value returned by @code{localeconv} in the
1100 @code{int_n_sep_by_space} element of the @code{struct lconv}.
1101 @item INT_P_SIGN_POSN
1102 The same as the value returned by @code{localeconv} in the
1103 @code{int_p_sign_posn} element of the @code{struct lconv}.
1104 @item INT_N_SIGN_POSN
1105 The same as the value returned by @code{localeconv} in the
1106 @code{int_n_sign_posn} element of the @code{struct lconv}.
1107
1108 @item DECIMAL_POINT
1109 @itemx RADIXCHAR
1110 The same as the value returned by @code{localeconv} in the
1111 @code{decimal_point} element of the @code{struct lconv}.
1112
1113 The name @code{RADIXCHAR} is a deprecated alias still used in Unix98.
1114 @item THOUSANDS_SEP
1115 @itemx THOUSEP
1116 The same as the value returned by @code{localeconv} in the
1117 @code{thousands_sep} element of the @code{struct lconv}.
1118
1119 The name @code{THOUSEP} is a deprecated alias still used in Unix98.
1120 @item GROUPING
1121 The same as the value returned by @code{localeconv} in the
1122 @code{grouping} element of the @code{struct lconv}.
1123 @item YESEXPR
The return value is a regular expression which can be used with the
@code{regcomp} and @code{regexec} functions to recognize a positive
response to a yes/no question.  @Theglibc{} provides the @code{rpmatch}
function for easier handling in applications.
@item NOEXPR
The return value is a regular expression which can be used with the
@code{regcomp} and @code{regexec} functions to recognize a negative
response to a yes/no question.
1132 @item YESSTR
1133 The return value is a locale-specific translation of the positive response
1134 to a yes/no question.
1135
Using this value is deprecated since it is a very special case of
message translation, and is better handled by the message
translation functions (@pxref{Message Translation}).
1142 @item NOSTR
1143 The return value is a locale-specific translation of the negative response
1144 to a yes/no question.  What is said for @code{YESSTR} is also true here.
1145
1146 The use of this symbol is deprecated.  Instead message translation
1147 should be used.
1148 @end vtable
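
To illustrate the correspondence between these items and the members of
@code{struct lconv}, the following minimal sketch compares one item with
its @code{localeconv} counterpart.  The helper name
@code{radix_matches_localeconv} is hypothetical; the sketch uses
@code{RADIXCHAR}, one of the names listed above for the
@code{decimal_point} member.

```c
#include <langinfo.h>
#include <locale.h>
#include <string.h>

/* Hypothetical helper: return nonzero if the radix character reported
   by nl_langinfo matches the decimal_point member of the structure
   returned by localeconv for the current locale.  */
int
radix_matches_localeconv (void)
{
  const struct lconv *lc = localeconv ();
  return strcmp (nl_langinfo (RADIXCHAR), lc->decimal_point) == 0;
}
```

A program that needs only this one value can call @code{nl_langinfo}
directly and skip the @code{localeconv} call.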
1149
The file @file{langinfo.h} defines many more symbols, but none of them
are official.  Using them is not portable, and the format of the
return values might change.  Therefore we recommend you not use
them.
1154
1155 Note that the return value for any valid argument can be used
1156 in all situations (with the possible exception of the am/pm time formatting
1157 codes).  If the user has not selected any locale for the
1158 appropriate category, @code{nl_langinfo} returns the information from the
1159 @code{"C"} locale.  It is therefore possible to use this function as
1160 shown in the example below.
1161
1162 If the argument @var{item} is not valid, a pointer to an empty string is
1163 returned.
1164 @end deftypefun
1165
1166 An example of @code{nl_langinfo} usage is a function which has to
1167 print a given date and time in a locale-specific way.  At first one
1168 might think that, since @code{strftime} internally uses the locale
1169 information, writing something like the following is enough:
1170
1171 @smallexample
1172 size_t
1173 i18n_time_n_data (char *s, size_t len, const struct tm *tp)
1174 @{
1175   return strftime (s, len, "%X %D", tp);
1176 @}
1177 @end smallexample
1178
1179 The format contains no weekday or month names and therefore is
1180 internationally usable.  Wrong!  The output produced is something like
1181 @code{"hh:mm:ss MM/DD/YY"}.  This format is only recognizable in the
1182 USA.  Other countries use different formats.  Therefore the function
1183 should be rewritten like this:
1184
1185 @smallexample
1186 size_t
1187 i18n_time_n_data (char *s, size_t len, const struct tm *tp)
1188 @{
1189   return strftime (s, len, nl_langinfo (D_T_FMT), tp);
1190 @}
1191 @end smallexample
1192
1193 Now it uses the date and time format of the locale
1194 selected when the program runs.  If the user selects the locale
1195 correctly there should never be a misunderstanding over the time and
1196 date format.
1197
1198 @node Formatting Numbers, Yes-or-No Questions, Locale Information, Locales
1199 @section A dedicated function to format numbers
1200
1201 We have seen that the structure returned by @code{localeconv} as well as
1202 the values given to @code{nl_langinfo} allow you to retrieve the various
1203 pieces of locale-specific information to format numbers and monetary
1204 amounts.  We have also seen that the underlying rules are quite complex.
1205
1206 Therefore the X/Open standards introduce a function which uses such
1207 locale information, making it easier for the user to format
1208 numbers according to these rules.
1209
1210 @deftypefun ssize_t strfmon (char *@var{s}, size_t @var{maxsize}, const char *@var{format}, @dots{})
1211 @safety{@prelim{}@mtsafe{@mtslocale{}}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}}
1212 @c It (and strfmon_l) both call __vstrfmon_l_internal, which, besides
1213 @c accessing the locale object passed to it, accesses the active
1214 @c locale through isdigit (but to_digit assumes ASCII digits only).
1215 @c It may call __printf_fp (@mtslocale @ascuheap @acsmem) and
1216 @c guess_grouping (safe).
1217 The @code{strfmon} function is similar to the @code{strftime} function
1218 in that it takes a buffer, its size, a format string,
1219 and values to write into the buffer as text in a form specified
1220 by the format string.  Like @code{strftime}, the function
1221 also returns the number of bytes written into the buffer.
1222
1223 There are two differences: @code{strfmon} can take more than one
1224 argument, and, of course, the format specification is different.  Like
1225 @code{strftime}, the format string consists of normal text, which is
1226 output as is, and format specifiers, which are indicated by a @samp{%}.
1227 Immediately after the @samp{%}, you can optionally specify various flags
1228 and formatting information before the main formatting character, in a
1229 similar way to @code{printf}:
1230
1231 @itemize @bullet
1232 @item
1233 Immediately following the @samp{%} there can be one or more of the
1234 following flags:
1235 @table @asis
1236 @item @samp{=@var{f}}
1237 The single byte character @var{f} is used for this field as the numeric
1238 fill character.  By default this character is a space character.
Filling with this character is only performed if a left precision
is specified; it is not used to pad the output to the given field width.
1241 @item @samp{^}
1242 The number is printed without grouping the digits according to the rules
1243 of the current locale.  By default grouping is enabled.
1244 @item @samp{+}, @samp{(}
At most one of these flags can be used.  They select the format used to
represent the sign of a currency amount.  By default, and if
1247 @samp{+} is given, the locale equivalent of @math{+}/@math{-} is used.  If
1248 @samp{(} is given, negative amounts are enclosed in parentheses.  The
1249 exact format is determined by the values of the @code{LC_MONETARY}
1250 category of the locale selected at program runtime.
1251 @item @samp{!}
1252 The output will not contain the currency symbol.
1253 @item @samp{-}
1254 The output will be formatted left-justified instead of right-justified if
1255 it does not fill the entire field width.
1256 @end table
1257 @end itemize
1258
1259 The next part of the specification is an optional field width.  If no
1260 width is specified @math{0} is taken.  During output, the function first
1261 determines how much space is required.  If it requires at least as many
1262 characters as given by the field width, it is output using as much space
1263 as necessary.  Otherwise, it is extended to use the full width by
1264 filling with the space character.  The presence or absence of the
1265 @samp{-} flag determines the side at which such padding occurs.  If
1266 present, the spaces are added at the right making the output
1267 left-justified, and vice versa.
1268
1269 So far the format looks familiar, being similar to the @code{printf} and
1270 @code{strftime} formats.  However, the next two optional fields
1271 introduce something new.  The first one is a @samp{#} character followed
1272 by a decimal digit string.  The value of the digit string specifies the
1273 number of @emph{digit} positions to the left of the decimal point (or
1274 equivalent).  This does @emph{not} include the grouping character when
1275 the @samp{^} flag is not given.  If the space needed to print the number
1276 does not fill the whole width, the field is padded at the left side with
1277 the fill character, which can be selected using the @samp{=} flag and by
default is a space.  For example, if the left precision is 6, the
number is @math{123}, and the fill character is @samp{*}, the result
will be @samp{***123}.
1281
1282 The second optional field starts with a @samp{.} (period) and consists
1283 of another decimal digit string.  Its value describes the number of
characters printed after the decimal point.  The default is selected
from the current locale (@code{frac_digits}, @code{int_frac_digits};
@pxref{General Numeric}).  If the exact representation needs more digits
than this right precision allows, the displayed value is rounded.  If the
number of fractional digits is selected to be zero, no decimal point is
printed.
1290
1291 As a GNU extension, the @code{strfmon} implementation in @theglibc{}
1292 allows an optional @samp{L} next as a format modifier.  If this modifier
1293 is given, the argument is expected to be a @code{long double} instead of
1294 a @code{double} value.
1295
1296 Finally, the last component is a format specifier.  There are three
1297 specifiers defined:
1298
1299 @table @asis
1300 @item @samp{i}
1301 Use the locale's rules for formatting an international currency value.
1302 @item @samp{n}
1303 Use the locale's rules for formatting a national currency value.
1304 @item @samp{%}
Place a @samp{%} in the output.  No flag, width specifier, or modifier
may be given; only @samp{%%} is allowed.
1307 @end table
1308
1309 As for @code{printf}, the function reads the format string
1310 from left to right and uses the values passed to the function following
1311 the format string.  The values are expected to be either of type
1312 @code{double} or @code{long double}, depending on the presence of the
1313 modifier @samp{L}.  The result is stored in the buffer pointed to by
1314 @var{s}.  At most @var{maxsize} characters are stored.
1315
The return value of the function is the number of characters stored in
@var{s}, not including the terminating null byte.  If the number of
1318 characters stored would exceed @var{maxsize}, the function returns
1319 @math{-1} and the content of the buffer @var{s} is unspecified.  In this
1320 case @code{errno} is set to @code{E2BIG}.
1321 @end deftypefun
1322
1323 A few examples should make clear how the function works.  It is
1324 assumed that all the following pieces of code are executed in a program
1325 which uses the USA locale (@code{en_US}).  The simplest
1326 form of the format is this:
1327
1328 @smallexample
1329 strfmon (buf, 100, "@@%n@@%n@@%n@@", 123.45, -567.89, 12345.678);
1330 @end smallexample
1331
1332 @noindent
1333 The output produced is
1334 @smallexample
1335 "@@$123.45@@-$567.89@@$12,345.68@@"
1336 @end smallexample
1337
1338 We can notice several things here.  First, the widths of the output
1339 numbers are different.  We have not specified a width in the format
1340 string, and so this is no wonder.  Second, the third number is printed
1341 using thousands separators.  The thousands separator for the
1342 @code{en_US} locale is a comma.  The number is also rounded.
1343 @math{.678} is rounded to @math{.68} since the format does not specify a
1344 precision and the default value in the locale is @math{2}.  Finally,
1345 note that the national currency symbol is printed since @samp{%n} was
1346 used, not @samp{i}.  The next example shows how we can align the output.
1347
1348 @smallexample
1349 strfmon (buf, 100, "@@%=*11n@@%=*11n@@%=*11n@@", 123.45, -567.89, 12345.678);
1350 @end smallexample
1351
1352 @noindent
1353 The output this time is:
1354
1355 @smallexample
1356 "@@    $123.45@@   -$567.89@@ $12,345.68@@"
1357 @end smallexample
1358
1359 Two things stand out.  Firstly, all fields have the same width (eleven
1360 characters) since this is the width given in the format and since no
1361 number required more characters to be printed.  The second important
1362 point is that the fill character is not used.  This is correct since the
1363 white space was not used to achieve a precision given by a @samp{#}
modifier, but instead to fill to the given width.  The difference
becomes obvious if we now add a left precision with @samp{#}.
1366
1367 @smallexample
1368 strfmon (buf, 100, "@@%=*11#5n@@%=*11#5n@@%=*11#5n@@",
1369          123.45, -567.89, 12345.678);
1370 @end smallexample
1371
1372 @noindent
1373 The output is
1374
1375 @smallexample
"@@ $***123.45@@-$***567.89@@ $12,345.68@@"
1377 @end smallexample
1378
1379 Here we can see that all the currency symbols are now aligned, and that
1380 the space between the currency sign and the number is filled with the
selected fill character.  Note that although the left precision is
@math{5} and @math{123.45} has three digits left of the decimal point,
the space is filled with three asterisks.  This is correct since, as
explained above, the left precision does not include the positions used
to store thousands separators.  One last example should explain the
remaining
1386 functionality.
1387
1388 @smallexample
1389 strfmon (buf, 100, "@@%=0(16#5.3i@@%=0(16#5.3i@@%=0(16#5.3i@@",
1390          123.45, -567.89, 12345.678);
1391 @end smallexample
1392
1393 @noindent
1394 This rather complex format string produces the following output:
1395
1396 @smallexample
"@@ USD 000123.450 @@(USD 000567.890)@@ USD 12,345.678 @@"
1398 @end smallexample
1399
1400 The most noticeable change is the alternative way of representing
1401 negative numbers.  In financial circles this is often done using
1402 parentheses, and this is what the @samp{(} flag selected.  The fill
1403 character is now @samp{0}.  Note that this @samp{0} character is not
1404 regarded as a numeric zero, and therefore the first and second numbers
1405 are not printed using a thousands separator.  Since we used the format
1406 specifier @samp{i} instead of @samp{n}, the international form of the
currency symbol is used.  This is a four-character string, in this case
1408 @code{"USD "}.  The last point is that since the precision right of the
1409 decimal point is selected to be three, the first and second numbers are
1410 printed with an extra zero at the end and the third number is printed
1411 without rounding.
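
The examples above ignore the return value of @code{strfmon}.  The
following is a minimal sketch, not a definitive implementation, of a
wrapper that checks it.  The name @code{format_money} is hypothetical,
and the sketch assumes that the caller has already selected the desired
locale with @code{setlocale}; the exact text produced therefore depends
on the @code{LC_MONETARY} category in effect.

@smallexample
#include <monetary.h>
#include <sys/types.h>

/* @r{Hypothetical helper, not part of the library.  Format VALUE with}
   @r{fill character @samp{*}, field width 11 and left precision 5.}
   @r{Return the number of bytes stored in BUF, or -1 if the result}
   @r{does not fit into SIZE bytes.}  */
static ssize_t
format_money (char *buf, size_t size, double value)
@{
  ssize_t n = strfmon (buf, size, "%=*11#5n", value);
  if (n < 0 && size > 0)
    /* @r{The buffer contents are unspecified after a failure.}  */
    buf[0] = '\0';
  return n;
@}
@end smallexample

@noindent
A caller would compare the result against @code{-1} before using the
contents of the buffer.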

@node Yes-or-No Questions,  , Formatting Numbers , Locales
@section Yes-or-No Questions

Some non-GUI programs ask yes-or-no questions.  If the messages
(especially the questions) are translated into foreign languages, be
sure that you localize the answers too.  It would be a very bad habit to
ask a question in one language and request the answer in another, often
English.

@Theglibc{} contains @code{rpmatch} to give applications easy
access to the corresponding locale definitions.

@deftypefun int rpmatch (const char *@var{response})
@standards{GNU, stdlib.h}
@safety{@prelim{}@mtsafe{@mtslocale{}}@asunsafe{@asucorrupt{} @ascuheap{} @asulock{} @ascudlopen{}}@acunsafe{@acucorrupt{} @aculock{} @acsmem{} @acsfd{}}}
@c Calls nl_langinfo with YESEXPR and NOEXPR, triggering @mtslocale but
@c it's regcomp and regexec that bring in all of the safety issues.
@c regfree is also called, but it doesn't introduce any further issues.
The function @code{rpmatch} checks whether the string in @var{response}
is a valid yes-or-no answer and, if so, which of the two it is.  The
check uses the @code{YESEXPR} and @code{NOEXPR} data in the
@code{LC_MESSAGES} category of the currently selected locale.  The
return value is as follows:

@table @code
@item 1
The user entered an affirmative answer.

@item 0
The user entered a negative answer.

@item -1
The answer matched neither the @code{YESEXPR} nor the @code{NOEXPR}
regular expression.
@end table
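
As a minimal sketch of how these return values relate to the locale
data, the following fragment prints the two patterns with
@code{nl_langinfo} and then classifies a response.  The helper name
@code{explain_response} is hypothetical, and the patterns printed depend
on the locale the caller has selected.

@smallexample
/* @r{Assumption: this macro makes the nonstandard @code{rpmatch} visible.}  */
#define _DEFAULT_SOURCE 1
#include <langinfo.h>
#include <stdio.h>
#include <stdlib.h>

/* @r{Hypothetical helper, not part of the library.  Print the patterns}
   @r{consulted by @code{rpmatch} and return its verdict on RESPONSE.}  */
static int
explain_response (const char *response)
@{
  printf ("YESEXPR = %s\n", nl_langinfo (YESEXPR));
  printf ("NOEXPR  = %s\n", nl_langinfo (NOEXPR));
  return rpmatch (response);
@}
@end smallexample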

The @code{rpmatch} function is not standardized, but besides
@theglibc{} it is available at least in the IBM AIX library.
@end deftypefun

@noindent
This function would normally be used like this:

@smallexample
  @dots{}
  /* @r{Use a safe default.}  */
  _Bool doit = false;

  fputs (gettext ("Do you really want to do this? "), stdout);
  fflush (stdout);
  /* @r{Prepare the @code{getline} call.}  */
  char *line = NULL;
  size_t len = 0;
  while (getline (&line, &len, stdin) >= 0)
    @{
      /* @r{Check the response.}  */
      int res = rpmatch (line);
      if (res >= 0)
        @{
          /* @r{We got a definitive answer.}  */
          if (res > 0)
            doit = true;
          break;
        @}
    @}
  /* @r{Free what @code{getline} allocated.}  */
  free (line);
@end smallexample

Note that the loop continues until end of file or a read error is
detected, or until a definitive (positive or negative) answer is read.
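
The fragment above prints the question only once.  As a variation, the
following sketch wraps the loop in a function that repeats the question
after every unrecognized answer.  The name @code{ask_yes_no} and its
parameters are hypothetical; @var{question} is assumed to be translated
already, for example with @code{gettext}.

@smallexample
/* @r{Assumption: this macro makes @code{getline} and @code{rpmatch} visible.}  */
#define _DEFAULT_SOURCE 1
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* @r{Hypothetical helper, not part of the library.  Print QUESTION on}
   @r{OUT and read lines from IN until one is a definitive answer.}
   @r{Return DEFLT if end of file or a read error comes first.}  */
static bool
ask_yes_no (FILE *in, FILE *out, const char *question, bool deflt)
@{
  char *line = NULL;
  size_t len = 0;
  bool answer = deflt;

  for (;;)
    @{
      fputs (question, out);
      fflush (out);
      if (getline (&line, &len, in) < 0)
        break;
      int res = rpmatch (line);
      if (res >= 0)
        @{
          answer = res > 0;
          break;
        @}
    @}
  /* @r{Free what @code{getline} allocated.}  */
  free (line);
  return answer;
@}
@end smallexample

@noindent
A call might look like
@code{ask_yes_no (stdin, stdout, gettext ("Proceed? "), false)}, which
keeps the safe default when no answer can be read.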