manual/startup.texi

   1 @node Program Basics, Processes, Signal Handling, Top
   2 @c %MENU% Writing the beginning and end of your program
   3 @chapter The Basic Program/System Interface
   4
   5 @cindex process
   6 @cindex program
   7 @cindex address space
   8 @cindex thread of control
   9 @dfn{Processes} are the primitive units for allocation of system
  10 resources.  Each process has its own address space and (usually) one
  11 thread of control.  A process executes a program; you can have multiple
  12 processes executing the same program, but each process has its own copy
  13 of the program within its own address space and executes it
  14 independently of the other copies.  Though it may have multiple threads
  15 of control within the same program and a program may be composed of
  16 multiple logically separate modules, a process always executes exactly
  17 one program.
  18
  19 Note that we are using a specific definition of ``program'' for the
  20 purposes of this manual, which corresponds to a common definition in the
  21 context of Unix systems.  In popular usage, ``program'' enjoys a much
  22 broader definition; it can refer for example to a system's kernel, an
  23 editor macro, a complex package of software, or a discrete section of
  24 code executing within a process.
  25
  26 Writing the program is what this manual is all about.  This chapter
  27 explains the most basic interface between your program and the system
  28 that runs, or calls, it.  This includes passing of parameters (arguments
  29 and environment) from the system, requesting basic services from the
  30 system, and telling the system the program is done.
  31
  32 A program starts another program with the @code{exec} family of system calls.
  33 This chapter looks at program startup from the execee's point of view.  To
  34 see the event from the execor's point of view, see @ref{Executing a File}.
  35
  36 @menu
  37 * Program Arguments::           Parsing your program's command-line arguments
  38 * Environment Variables::       Less direct parameters affecting your program
  39 * Auxiliary Vector::            Least direct parameters affecting your program
  40 * System Calls::                Requesting service from the system
  41 * Program Termination::         Telling the system you're done; return status
  42 @end menu
  43
  44 @node Program Arguments
  45 @section Program Arguments
  46 @cindex program arguments
  47 @cindex command line arguments
  48 @cindex arguments, to program
  49
  50 @cindex program startup
  51 @cindex startup of program
  52 @cindex invocation of program
  53 @cindex @code{main} function
  54 @findex main
  55 The system starts a C program by calling the function @code{main}.  It
  56 is up to you to write a function named @code{main}---otherwise, you
  57 won't even be able to link your program without errors.
  58
  59 In @w{ISO C} you can define @code{main} either to take no arguments, or to
  60 take two arguments that represent the command line arguments to the
  61 program, like this:
  62
  63 @smallexample
  64 int main (int @var{argc}, char *@var{argv}[])
  65 @end smallexample
  66
  67 @cindex argc (program argument count)
  68 @cindex argv (program argument vector)
  69 The command line arguments are the whitespace-separated tokens given in
  70 the shell command used to invoke the program; thus, in @samp{cat foo
  71 bar}, the arguments are @samp{foo} and @samp{bar}.  The only way a
  72 program can look at its command line arguments is via the arguments of
  73 @code{main}.  If @code{main} doesn't take arguments, then you cannot get
  74 at the command line.
  75
  76 The value of the @var{argc} argument is the number of command line
  77 arguments.  The @var{argv} argument is a vector of C strings; its
  78 elements are the individual command line argument strings.  The file
  79 name of the program being run is also included in the vector as the
  80 first element; the value of @var{argc} counts this element.  A null
  81 pointer always follows the last element: @code{@var{argv}[@var{argc}]}
  82 is this null pointer.
  83
  84 For the command @samp{cat foo bar}, @var{argc} is 3 and @var{argv} has
  85 three elements, @code{"cat"}, @code{"foo"} and @code{"bar"}.
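
As a minimal sketch of the layout just described, the following helper
walks @var{argv} up to its terminating null pointer.  The name
@code{count_arguments} is hypothetical and not part of the library; for a
well-formed argument vector its result should equal @var{argc}.

```c
#include <stddef.h>

/* Hypothetical helper: count the elements of an argument vector by
   walking it until the terminating null pointer is reached.  */
size_t
count_arguments (char *const argv[])
{
  size_t n = 0;

  while (argv[n] != NULL)
    n++;
  return n;
}
```

Called as @code{count_arguments (argv)} from @code{main}, this relies only
on the guarantee that @code{@var{argv}[@var{argc}]} is a null pointer.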
  86
  87 In Unix systems you can define @code{main} a third way, using three arguments:
  88
  89 @smallexample
  90 int main (int @var{argc}, char *@var{argv}[], char *@var{envp}[])
  91 @end smallexample
  92
  93 The first two arguments are just the same.  The third argument
  94 @var{envp} gives the program's environment; it is the same as the value
  95 of @code{environ}.  @xref{Environment Variables}.  POSIX.1 does not
  96 allow this three-argument form, so to be portable it is best to write
  97 @code{main} to take two arguments, and use the value of @code{environ}.
  98
  99 @menu
 100 * Argument Syntax::             By convention, options start with a hyphen.
 101 * Parsing Program Arguments::   Ways to parse program options and arguments.
 102 @end menu
 103
 104 @node Argument Syntax, Parsing Program Arguments, , Program Arguments
 105 @subsection Program Argument Syntax Conventions
 106 @cindex program argument syntax
 107 @cindex syntax, for program arguments
 108 @cindex command argument syntax
 109
 110 POSIX recommends these conventions for command line arguments.
 111 @code{getopt} (@pxref{Getopt}) and @code{argp_parse} (@pxref{Argp}) make
 112 it easy to implement them.
 113
 114 @itemize @bullet
 115 @item
 116 Arguments are options if they begin with a hyphen delimiter (@samp{-}).
 117
 118 @item
 119 Multiple options may follow a hyphen delimiter in a single token if
 120 the options do not take arguments.  Thus, @samp{-abc} is equivalent to
 121 @samp{-a -b -c}.
 122
 123 @item
 124 Option names are single alphanumeric characters (as for @code{isalnum};
 125 @pxref{Classification of Characters}).
 126
 127 @item
 128 Certain options require an argument.  For example, the @samp{-o} option
 129 of the @code{ld} command requires an argument---an output file name.
 130
 131 @item
 132 An option and its argument may or may not appear as separate tokens.  (In
 133 other words, the whitespace separating them is optional.)  Thus,
 134 @w{@samp{-o foo}} and @samp{-ofoo} are equivalent.
 135
 136 @item
 137 Options typically precede other non-option arguments.
 138
 139 The implementations of @code{getopt} and @code{argp_parse} in @theglibc{}
 140 normally make it appear as if all the option arguments were
 141 specified before all the non-option arguments for the purposes of
 142 parsing, even if the user of your program intermixed option and
 143 non-option arguments.  They do this by reordering the elements of the
 144 @var{argv} array.  This behavior is nonstandard; if you want to suppress
 145 it, define the @code{_POSIX_OPTION_ORDER} environment variable.
 146 @xref{Standard Environment}.
 147
 148 @item
 149 The argument @samp{--} terminates all options; any following arguments
 150 are treated as non-option arguments, even if they begin with a hyphen.
 151
 152 @item
 153 A token consisting of a single hyphen character is interpreted as an
 154 ordinary non-option argument.  By convention, it is used to specify
 155 input from or output to the standard input and output streams.
 156
 157 @item
 158 Options may be supplied in any order, or appear multiple times.  The
 159 interpretation is left up to the particular application program.
 160 @end itemize
 161
 162 @cindex long-named options
 163 GNU adds @dfn{long options} to these conventions.  Long options consist
 164 of @samp{--} followed by a name made of alphanumeric characters and
 165 dashes.  Option names are typically one to three words long, with
 166 hyphens to separate words.  Users can abbreviate the option names as
 167 long as the abbreviations are unique.
 168
 169 To specify an argument for a long option, write
 170 @samp{--@var{name}=@var{value}}.  This syntax enables a long option to
 171 accept an argument that is itself optional.
 172
 173 Eventually, @gnusystems{} will provide completion for long option names
 174 in the shell.
 175
 176 @node Parsing Program Arguments, , Argument Syntax, Program Arguments
 177 @subsection Parsing Program Arguments
 178
 179 @cindex program arguments, parsing
 180 @cindex command arguments, parsing
 181 @cindex parsing program arguments
 182 If the syntax for the command line arguments to your program is simple
 183 enough, you can simply pick the arguments off from @var{argv} by hand.
 184 But unless your program takes a fixed number of arguments, or all of the
 185 arguments are interpreted in the same way (as file names, for example),
 186 you are usually better off using @code{getopt} (@pxref{Getopt}) or
 187 @code{argp_parse} (@pxref{Argp}) to do the parsing.
 188
 189 @code{getopt} is more standard (the short-option only version of it is a
 190 part of the POSIX standard), but using @code{argp_parse} is often
 191 easier, both for very simple and very complex option structures, because
 192 it does more of the dirty work for you.
 193
 194 @menu
 195 * Getopt::                      Parsing program options using @code{getopt}.
 196 * Argp::                        Parsing program options using @code{argp_parse}.
 197 * Suboptions::                  Some programs need more detailed options.
 198 * Suboptions Example::          This shows how it could be done for @code{mount}.
 199 @end menu
 200
 201 @c Getopt and argp start at the @section level so that there's
 202 @c enough room for their internal hierarchy (mostly a problem with
 203 @c argp).         -Miles
 204
 205 @include getopt.texi
 206 @include argp.texi
 207
 208 @node Suboptions, Suboptions Example, Argp, Parsing Program Arguments
 209 @c This is a @section so that it's at the same level as getopt and argp
 210 @subsubsection Parsing of Suboptions
 211
 212 Having a single level of options is sometimes not enough.  A program
 213 might need to offer too many options, or some of its options might be
 214 closely related to each other.
 215
 216 In such cases some programs use suboptions.  One of the most prominent
 217 examples is @code{mount}(8): its @code{-o} option takes one argument,
 218 which is itself a comma-separated list of options.  To ease the
 219 programming of code like this, the function @code{getsubopt} is
 220 available.
 221
 222 @comment stdlib.h
 223 @deftypefun int getsubopt (char **@var{optionp}, char *const *@var{tokens}, char **@var{valuep})
 224 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
 225 @c getsubopt ok
 226 @c  strchrnul dup ok
 227 @c  memchr dup ok
 228 @c  strncmp dup ok
 229
 230 The @var{optionp} parameter must be a pointer to a variable containing
 231 the address of the string to process.  When the function returns, the
 232 reference is updated to point to the next suboption, or to the
 233 terminating @samp{\0} character if no more suboptions are available.
 234
 235 The @var{tokens} parameter references an array of strings containing the
 236 known suboptions.  All strings must be @samp{\0} terminated, and a null
 237 pointer must mark the end of the array.  When @code{getsubopt} finds a
 238 possible legal suboption, it compares it with all strings available in
 239 the @var{tokens} array and returns the index of the matching string in
 240 that array.
 241
 242 In case the suboption has an associated value introduced by a @samp{=}
 243 character, a pointer to the value is returned in @var{valuep}.  The
 244 string is @samp{\0} terminated.  If no argument is available
 245 @var{valuep} is set to the null pointer.  By doing this the caller can
 246 check whether a necessary value is given or whether no unexpected value
 247 is present.
 248
 249 In case the next suboption in the string is not mentioned in the
 250 @var{tokens} array the starting address of the suboption including a
 251 possible value is returned in @var{valuep} and the return value of the
 252 function is @samp{-1}.
 253 @end deftypefun
 254
 255 @node Suboptions Example, , Suboptions, Parsing Program Arguments
 256 @subsection Parsing of Suboptions Example
 257
 258 The code which might appear in the @code{mount}(8) program is a perfect
 259 example of the use of @code{getsubopt}:
 260
 261 @smallexample
 262 @include subopt.c.texi
 263 @end smallexample
 264
 265
 266 @node Environment Variables
 267 @section Environment Variables
 268
 269 @cindex environment variable
 270 When a program is executed, it receives information about the context in
 271 which it was invoked in two ways.  The first mechanism uses the
 272 @var{argv} and @var{argc} arguments to its @code{main} function, and is
 273 discussed in @ref{Program Arguments}.  The second mechanism uses
 274 @dfn{environment variables} and is discussed in this section.
 275
 276 The @var{argv} mechanism is typically used to pass command-line
 277 arguments specific to the particular program being invoked.  The
 278 environment, on the other hand, keeps track of information that is
 279 shared by many programs, changes infrequently, and that is less
 280 frequently used.
 281
 282 The environment variables discussed in this section are the same
 283 environment variables that you set using assignments and the
 284 @code{export} command in the shell.  Programs executed from the shell
 285 inherit all of the environment variables from the shell.
 286 @c !!! xref to right part of bash manual when it exists
 287
 288 @cindex environment
 289 Standard environment variables are used for information about the user's
 290 home directory, terminal type, current locale, and so on; you can define
 291 additional variables for other purposes.  The set of all environment
 292 variables that have values is collectively known as the
 293 @dfn{environment}.
 294
 295 Names of environment variables are case-sensitive and must not contain
 296 the character @samp{=}.  System-defined environment variables are
 297 invariably uppercase.
 298
 299 The values of environment variables can be anything that can be
 300 represented as a string.  A value must not contain an embedded null
 301 character, since this is assumed to terminate the string.
 302
 303
 304 @menu
 305 * Environment Access::          How to get and set the values of
 306                                  environment variables.
 307 * Standard Environment::        These environment variables have
 308                                  standard interpretations.
 309 @end menu
 310
 311 @node Environment Access
 312 @subsection Environment Access
 313 @cindex environment access
 314 @cindex environment representation
 315
 316 The value of an environment variable can be accessed with the
 317 @code{getenv} function.  This is declared in the header file
 318 @file{stdlib.h}.
 319 @pindex stdlib.h
 320
 321 Libraries should use @code{secure_getenv} instead of @code{getenv}, so
 322 that they do not accidentally use untrusted environment variables.
 323 Modifications of environment variables are not allowed in
 324 multi-threaded programs.  The @code{getenv} and @code{secure_getenv}
 325 functions can be safely used in multi-threaded programs.
 326
 327 @comment stdlib.h
 328 @comment ISO
 329 @deftypefun {char *} getenv (const char *@var{name})
 330 @safety{@prelim{}@mtsafe{@mtsenv{}}@assafe{}@acsafe{}}
 331 @c Unguarded access to __environ.
 332 This function returns a string that is the value of the environment
 333 variable @var{name}.  You must not modify this string.  In some non-Unix
 334 systems not using @theglibc{}, it might be overwritten by subsequent
 335 calls to @code{getenv} (but not by any other library function).  If the
 336 environment variable @var{name} is not defined, the value is a null
 337 pointer.
 338 @end deftypefun
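
Because @code{getenv} returns a null pointer for an undefined variable,
callers usually need a fallback.  The following is a minimal sketch; the
name @code{getenv_or_default} is hypothetical and not a library function.

```c
#include <stdlib.h>

/* Hypothetical helper: return the value of the environment variable
   NAME, or FALLBACK if NAME is not defined.  The returned string must
   not be modified by the caller.  */
const char *
getenv_or_default (const char *name, const char *fallback)
{
  const char *value = getenv (name);

  return value != NULL ? value : fallback;
}
```

Note that a variable defined with an empty value is still defined, so
this sketch returns the empty string in that case rather than
@var{fallback}.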
 339
 340 @comment stdlib.h
 341 @comment GNU
 342 @deftypefun {char *} secure_getenv (const char *@var{name})
 343 @safety{@prelim{}@mtsafe{@mtsenv{}}@assafe{}@acsafe{}}
 344 @c Calls getenv unless secure mode is enabled.
 345 This function is similar to @code{getenv}, but it returns a null
 346 pointer if the environment is untrusted.  This happens when the
 347 program file has SUID or SGID bits set.  General-purpose libraries
 348 should always prefer this function over @code{getenv} to avoid
 349 vulnerabilities if the library is referenced from a SUID/SGID program.
 350
 351 This function is a GNU extension.
 352 @end deftypefun
 353
 354
 355 @comment stdlib.h
 356 @comment SVID
 357 @deftypefun int putenv (char *@var{string})
 358 @safety{@prelim{}@mtunsafe{@mtasuconst{:@mtsenv{}}}@asunsafe{@ascuheap{} @asulock{}}@acunsafe{@acucorrupt{} @aculock{} @acsmem{}}}
 359 @c putenv @mtasuconst:@mtsenv @ascuheap @asulock @acucorrupt @aculock @acsmem
 360 @c  strchr dup ok
 361 @c  strndup dup @ascuheap @acsmem
 362 @c  add_to_environ dup @mtasuconst:@mtsenv @ascuheap @asulock @acucorrupt @aculock @acsmem
 363 @c  free dup @ascuheap @acsmem
 364 @c  unsetenv dup @mtasuconst:@mtsenv @asulock @aculock
 365 The @code{putenv} function adds or removes definitions from the environment.
 366 If the @var{string} is of the form @samp{@var{name}=@var{value}}, the
 367 definition is added to the environment.  Otherwise, the @var{string} is
 368 interpreted as the name of an environment variable, and any definition
 369 for this variable in the environment is removed.
 370
 371 If the function is successful it returns @code{0}.  Otherwise the return
 372 value is nonzero and @code{errno} is set to indicate the error.
 373
 374 The difference from the @code{setenv} function is that the exact string
 375 given as the parameter @var{string} is put into the environment.  If the
 376 caller changes the string after the @code{putenv} call, the change is
 377 reflected automatically in the environment.  This also requires that
 378 @var{string} not be an automatic variable whose scope is left before the
 379 variable is removed from the environment.  The same applies of course to
 380 dynamically allocated variables which are freed later.
 381
 382 This function is part of the extended Unix interface.  Since it was also
 383 available in old SVID libraries you should define either
 384 @var{_XOPEN_SOURCE} or @var{_SVID_SOURCE} before including any header.
 385 @end deftypefun
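
The following sketch illustrates the point that @code{putenv} stores the
caller's string itself rather than a copy, as described above for
@theglibc{}.  The variable name @code{GR_DEMO} and the function name
@code{demo_putenv_aliasing} are made up for this example.  The buffer has
static storage duration, so it outlives the environment entry.

```c
#ifndef _XOPEN_SOURCE
# define _XOPEN_SOURCE 700      /* For the declaration of putenv.  */
#endif
#include <stdlib.h>
#include <string.h>

/* Static storage: the string stays valid while it is in the
   environment.  */
static char demo_entry[] = "GR_DEMO=first";

/* Return 1 if a change made to the buffer after the putenv call is
   visible through getenv, 0 if it is not, and -1 if putenv failed.  */
int
demo_putenv_aliasing (void)
{
  const char *value;

  if (putenv (demo_entry) != 0)
    return -1;

  /* Overwrite the five characters of the value in place.  The value
     starts after the eight characters of "GR_DEMO=".  */
  memcpy (demo_entry + 8, "other", 5);

  value = getenv ("GR_DEMO");
  return value != NULL && strcmp (value, "other") == 0;
}
```

Had the entry been added with @code{setenv} instead, the environment
would hold a private copy and the later @code{memcpy} would not affect
it.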
 386
 387
 388 @comment stdlib.h
 389 @comment BSD
 390 @deftypefun int setenv (const char *@var{name}, const char *@var{value}, int @var{replace})
 391 @safety{@prelim{}@mtunsafe{@mtasuconst{:@mtsenv{}}}@asunsafe{@ascuheap{} @asulock{}}@acunsafe{@acucorrupt{} @aculock{} @acsmem{}}}
 392 @c setenv @mtasuconst:@mtsenv @ascuheap @asulock @acucorrupt @aculock @acsmem
 393 @c  add_to_environ @mtasuconst:@mtsenv @ascuheap @asulock @acucorrupt @aculock @acsmem
 394 @c   strlen dup ok
 395 @c   libc_lock_lock @asulock @aculock
 396 @c   strncmp dup ok
 397 @c   realloc dup @ascuheap @acsmem
 398 @c   libc_lock_unlock @aculock
 399 @c   malloc dup @ascuheap @acsmem
 400 @c   free dup @ascuheap @acsmem
 401 @c   mempcpy dup ok
 402 @c   memcpy dup ok
 403 @c   KNOWN_VALUE ok
 404 @c    tfind(strcmp) [no @mtsrace guarded access]
 405 @c     strcmp dup ok
 406 @c   STORE_VALUE @ascuheap @acucorrupt @acsmem
 407 @c    tsearch(strcmp) @ascuheap @acucorrupt @acsmem [no @mtsrace or @asucorrupt guarded access makes for mtsafe and @asulock]
 408 @c     strcmp dup ok
 409 The @code{setenv} function can be used to add a new definition to the
 410 environment.  The entry for @var{name} is set to
 411 @samp{@var{name}=@var{value}}.  Please note that this is also true
 412 if @var{value} is the empty string.  To do this a new string is created
 413 and the strings @var{name} and @var{value} are copied.  A null pointer
 414 for the @var{value} parameter is illegal.  If the environment already
 415 contains an entry with key @var{name}, the @var{replace} parameter
 416 controls the action.  If @var{replace} is zero, nothing happens.  Otherwise
 417 the old entry is replaced by the new one.
 418
 419 Please note that you cannot remove an entry completely using this function.
 420
 421 If the function is successful it returns @code{0}.  Otherwise the
 422 environment is unchanged and the return value is @code{-1} and
 423 @code{errno} is set.
 424
 425 This function was originally part of the BSD library but is now part of
 426 the Unix standard.
 427 @end deftypefun
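
A common use of the @var{replace} parameter is to supply a default
without overriding a value the user has already chosen.  This is a
minimal sketch; @code{set_default_value} is a hypothetical name.

```c
#ifndef _XOPEN_SOURCE
# define _XOPEN_SOURCE 700      /* For the declaration of setenv.  */
#endif
#include <stdlib.h>

/* Hypothetical helper: give NAME the value VALUE only if NAME is not
   already defined in the environment.  Returns the result of setenv:
   0 on success, -1 on failure.  */
int
set_default_value (const char *name, const char *value)
{
  /* A REPLACE argument of zero leaves an existing entry untouched.  */
  return setenv (name, value, 0);
}
```

Passing a nonzero @var{replace} argument to @code{setenv} instead would
overwrite any existing definition.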
 428
 429 @comment stdlib.h
 430 @comment BSD
 431 @deftypefun int unsetenv (const char *@var{name})
 432 @safety{@prelim{}@mtunsafe{@mtasuconst{:@mtsenv{}}}@asunsafe{@asulock{}}@acunsafe{@aculock{}}}
 433 @c unsetenv @mtasuconst:@mtsenv @asulock @aculock
 434 @c  strchr dup ok
 435 @c  strlen dup ok
 436 @c  libc_lock_lock @asulock @aculock
 437 @c  strncmp dup ok
 438 @c  libc_lock_unlock @aculock
 439 Using this function one can remove an entry completely from the
 440 environment.  If the environment contains an entry with the key
 441 @var{name} this whole entry is removed.  A call to this function is
 442 equivalent to a call to @code{putenv} with a string that contains no
 443 @samp{=} character.
 444
 445 The function returns @code{-1} if @var{name} is a null pointer, points to
 446 an empty string, or points to a string containing a @code{=} character.
 447 It returns @code{0} if the call succeeded.
 448
 449 This function was originally part of the BSD library but is now part of
 450 the Unix standard.  The BSD version had no return value, though.
 451 @end deftypefun
 452
 453 There is one more function to modify the whole environment.  This
 454 function is said to be used in POSIX.9 (POSIX bindings for Fortran
 455 77), so one would expect it to have made it into POSIX.1.  But this
 456 never happened.  We still provide this function as a GNU extension
 457 to enable writing standard compliant Fortran environments.
 458
 459 @comment stdlib.h
 460 @comment GNU
 461 @deftypefun int clearenv (void)
 462 @safety{@prelim{}@mtunsafe{@mtasuconst{:@mtsenv{}}}@asunsafe{@ascuheap{} @asulock{}}@acunsafe{@aculock{} @acsmem{}}}
 463 @c clearenv @mtasuconst:@mtsenv @ascuheap @asulock @aculock @acsmem
 464 @c  libc_lock_lock @asulock @aculock
 465 @c  free dup @ascuheap @acsmem
 466 @c  libc_lock_unlock @aculock
 467 The @code{clearenv} function removes all entries from the environment.
 468 Using @code{putenv} and @code{setenv} new entries can be added again
 469 later.
 470
 471 If the function is successful it returns @code{0}.  Otherwise the return
 472 value is nonzero.
 473 @end deftypefun
 474
 475
 476 You can deal directly with the underlying representation of environment
 477 objects to add more variables to the environment (for example, to
 478 communicate with another program you are about to execute;
 479 @pxref{Executing a File}).
 480
 481 @comment unistd.h
 482 @comment POSIX.1
 483 @deftypevar {char **} environ
 484 The environment is represented as an array of strings.  Each string is
 485 of the format @samp{@var{name}=@var{value}}.  The order in which
 486 strings appear in the environment is not significant, but the same
 487 @var{name} must not appear more than once.  The last element of the
 488 array is a null pointer.
 489
 490 This variable is declared in the header file @file{unistd.h}.
 491
 492 If you just want to get the value of an environment variable, use
 493 @code{getenv}.
 494 @end deftypevar
 495
 496 Unix systems, and @gnusystems{}, pass the initial value of
 497 @code{environ} as the third argument to @code{main}.
 498 @xref{Program Arguments}.
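
To illustrate the representation, the following sketch searches a vector
of @samp{@var{name}=@var{value}} strings by hand.  The name
@code{lookup_in_vector} is hypothetical; in real code @code{getenv} is
the appropriate interface.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: search the null-terminated vector ENVP, whose
   elements have the form NAME=VALUE, for an entry named NAME.  Return
   a pointer to the value part, or NULL if there is no such entry.  */
const char *
lookup_in_vector (char *const envp[], const char *name)
{
  size_t len = strlen (name);
  size_t i;

  for (i = 0; envp[i] != NULL; i++)
    /* The name must match completely and be followed by '='.  */
    if (strncmp (envp[i], name, len) == 0 && envp[i][len] == '=')
      return envp[i] + len + 1;
  return NULL;
}
```

After declaring @code{extern char **environ;}, a call such as
@code{lookup_in_vector (environ, "HOME")} would inspect the process
environment, and the same helper works for the @var{envp} argument of
@code{main}.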
 499
 500 @node Standard Environment
 501 @subsection Standard Environment Variables
 502 @cindex standard environment variables
 503
 504 These environment variables have standard meanings.  This doesn't mean
 505 that they are always present in the environment; but if these variables
 506 @emph{are} present, they have these meanings.  You shouldn't try to use
 507 these environment variable names for some other purpose.
 508
 509 @comment Extra blank lines make it look better.
 510 @table @code
 511 @item HOME
 512 @cindex @code{HOME} environment variable
 513 @cindex home directory
 514
 515 This is a string representing the user's @dfn{home directory}, or
 516 initial default working directory.
 517
 518 The user can set @code{HOME} to any value.
 519 If you need to make sure to obtain the proper home directory
 520 for a particular user, you should not use @code{HOME}; instead,
 521 look up the user's name in the user database (@pxref{User Database}).
 522
 523 For most purposes, it is better to use @code{HOME}, precisely because
 524 this lets the user specify the value.
 525
 526 @c !!! also USER
 527 @item LOGNAME
 528 @cindex @code{LOGNAME} environment variable
 529
 530 This is the name that the user used to log in.  Since the value in the
 531 environment can be tweaked arbitrarily, this is not a reliable way to
 532 identify the user who is running a program; a function like
 533 @code{getlogin} (@pxref{Who Logged In}) is better for that purpose.
 534
 535 For most purposes, it is better to use @code{LOGNAME}, precisely because
 536 this lets the user specify the value.
 537
 538 @item PATH
 539 @cindex @code{PATH} environment variable
 540
 541 A @dfn{path} is a sequence of directory names which is used for
 542 searching for a file.  The variable @code{PATH} holds a path used
 543 for searching for programs to be run.
 544
 545 The @code{execlp} and @code{execvp} functions (@pxref{Executing a File})
 546 use this environment variable, as do many shells and other utilities
 547 which are implemented in terms of those functions.
 548
 549 The syntax of a path is a sequence of directory names separated by
 550 colons.  An empty string instead of a directory name stands for the
 551 current directory (@pxref{Working Directory}).
 552
 553 A typical value for this environment variable might be a string like:
 554
 555 @smallexample
 556 :/bin:/etc:/usr/bin:/usr/new/X11:/usr/new:/usr/local/bin
 557 @end smallexample
 558
 559 This means that if the user tries to execute a program named @code{foo},
 560 the system will look for files named @file{foo}, @file{/bin/foo},
 561 @file{/etc/foo}, and so on.  The first of these files that exists is
 562 the one that is executed.
 563
 564 @c !!! also TERMCAP
 565 @item TERM
 566 @cindex @code{TERM} environment variable
 567
 568 This specifies the kind of terminal that is receiving program output.
 569 Some programs can make use of this information to take advantage of
 570 special escape sequences or terminal modes supported by particular kinds
 571 of terminals.  Many programs which use the termcap library
 572 (@pxref{Finding a Terminal Description,Find,,termcap,The Termcap Library
 573 Manual}) use the @code{TERM} environment variable, for example.
 574
 575 @item TZ
 576 @cindex @code{TZ} environment variable
 577
 578 This specifies the time zone.  @xref{TZ Variable}, for information about
 579 the format of this string and how it is used.
 580
 581 @item LANG
 582 @cindex @code{LANG} environment variable
 583
 584 This specifies the default locale to use for attribute categories where
 585 neither @code{LC_ALL} nor the specific environment variable for that
 586 category is set.  @xref{Locales}, for more information about
 587 locales.
 588
 589 @ignore
 590 @c I doubt this really exists
 591 @item LC_ALL
 592 @cindex @code{LC_ALL} environment variable
 593
 594 This is similar to the @code{LANG} environment variable.  However, its
 595 value takes precedence over any values provided for the individual
 596 attribute category environment variables, or for the @code{LANG}
 597 environment variable.
 598 @end ignore
 599
 600 @item LC_ALL
 601 @cindex @code{LC_ALL} environment variable
 602
 603 If this environment variable is set it overrides the selection for all
 604 the locales done using the other @code{LC_*} environment variables.  The
 605 value of the other @code{LC_*} environment variables is simply ignored
 606 in this case.
 607
 608 @item LC_COLLATE
 609 @cindex @code{LC_COLLATE} environment variable
 610
 611 This specifies what locale to use for string sorting.
 612
 613 @item LC_CTYPE
 614 @cindex @code{LC_CTYPE} environment variable
 615
 616 This specifies what locale to use for character sets and character
 617 classification.
 618
 619 @item LC_MESSAGES
 620 @cindex @code{LC_MESSAGES} environment variable
 621
 622 This specifies what locale to use for printing messages and for parsing
 623 responses.
 624
 625 @item LC_MONETARY
 626 @cindex @code{LC_MONETARY} environment variable
 627
 628 This specifies what locale to use for formatting monetary values.
 629
 630 @item LC_NUMERIC
 631 @cindex @code{LC_NUMERIC} environment variable
 632
 633 This specifies what locale to use for formatting numbers.
 634
 635 @item LC_TIME
 636 @cindex @code{LC_TIME} environment variable
 637
 638 This specifies what locale to use for formatting date/time values.
 639
 640 @item NLSPATH
 641 @cindex @code{NLSPATH} environment variable
 642
 643 This specifies the directories in which the @code{catopen} function
 644 looks for message translation catalogs.
 645
 646 @item _POSIX_OPTION_ORDER
 647 @cindex @code{_POSIX_OPTION_ORDER} environment variable
 648
 649 If this environment variable is defined, it suppresses the usual
 650 reordering of command line arguments by @code{getopt} and
 651 @code{argp_parse}.  @xref{Argument Syntax}.
 652
 653 @c !!! GNU also has COREFILE, CORESERVER, EXECSERVERS
 654 @end table
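
The colon-separated syntax described for @code{PATH} in the table above,
including the rule that an empty component stands for the current
directory, can be sketched as follows.  The function name
@code{path_component} is hypothetical, and representing the current
directory as @file{.} is a choice made for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy component number INDEX (counting from 0)
   of the colon-separated search path PATH into BUF, which has room for
   BUFLEN bytes.  An empty component is stored as ".".  Return 1 on
   success, or 0 if there is no such component or BUF is too small.  */
int
path_component (const char *path, size_t index, char *buf, size_t buflen)
{
  const char *start = path;

  for (;;)
    {
      const char *end = strchr (start, ':');
      size_t len = end != NULL ? (size_t) (end - start) : strlen (start);

      if (index == 0)
        {
          if (len == 0)
            {
              /* Empty component: the current directory.  */
              start = ".";
              len = 1;
            }
          if (len + 1 > buflen)
            return 0;
          memcpy (buf, start, len);
          buf[len] = '\0';
          return 1;
        }
      if (end == NULL)
        return 0;               /* Ran out of components.  */
      start = end + 1;
      index--;
    }
}
```

For a value that starts with a colon, such as the typical value shown in
the table, component 0 is the current directory and component 1 is the
first named directory.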
 655
 656 @node Auxiliary Vector
 657 @section Auxiliary Vector
 658 @cindex auxiliary vector
 659
 660 When a program is executed, it receives information from the operating
 661 system about the environment in which it is operating.  The form of this
 662 information is a table of key-value pairs, where the keys are from the
 663 set of @samp{AT_} values in @file{elf.h}.  Some of the data is provided
 664 by the kernel for libc consumption, and may be obtained by ordinary
 665 interfaces, such as @code{sysconf}.  However, on a platform-by-platform
 666 basis there may be information that is not available any other way.
 667
 668 @subsection Definition of @code{getauxval}
 669 @comment sys/auxv.h
 670 @deftypefun {unsigned long int} getauxval (unsigned long int @var{type})
 671 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
 672 @c Reads from hwcap or iterates over constant auxv.
 673 This function is used to inquire about the entries in the auxiliary
 674 vector.  The @var{type} argument should be one of the @samp{AT_} symbols
 675 defined in @file{elf.h}.  If a matching entry is found, the value is
 676 returned; if the entry is not found, zero is returned and @code{errno} is
 677 set to @code{ENOENT}.
 678 @end deftypefun
 679
 680 For some platforms, the key @code{AT_HWCAP} is the easiest way to inquire
 681 about any instruction set extensions available at runtime.  In this case,
 682 there will (of necessity) be a platform-specific set of @samp{HWCAP_}
 683 values combined into a bit mask that describes the capabilities of the
 684 CPU on which the program is being executed.
 685
 686 @node System Calls
 687 @section System Calls
 688
 689 @cindex system call
 690 A system call is a request for service that a program makes of the
 691 kernel.  The service is generally something that only the kernel has
 692 the privilege to do, such as doing I/O.  Programmers don't normally
 693 need to be concerned with system calls because there are functions in
 694 @theglibc{} to do virtually everything that system calls do.
 695 These functions work by making system calls themselves.  For example,
 696 there is a system call that changes the permissions of a file, but
 697 you don't need to know about it because you can just use @theglibc{}'s
 698 @code{chmod} function.
 699
 700 @cindex kernel call
 701 System calls are sometimes called kernel calls.
 702
 703 However, there are times when you want to make a system call explicitly,
 704 and for that, @theglibc{} provides the @code{syscall} function.
 705 @code{syscall} is harder to use and less portable than functions like
 706 @code{chmod}, but easier and more portable than coding the system call
 707 in assembler instructions.
 708
@code{syscall} is most useful when you are working with a system call
which is special to your system or is newer than the version of
@theglibc{} you are using.  @code{syscall} is implemented in an entirely
generic way; the function does not know anything about what a particular
system call does or even if it is valid.
 714
The description of @code{syscall} in this section assumes a certain
protocol for system calls on the various platforms on which @theglibc{}
runs.  That protocol is not defined by any strong authority, and we do
not describe it here either: anyone who is coding @code{syscall}
probably won't accept anything less than kernel and C library source
code as a specification of the interface between them anyway.
 722
 723
 724 @code{syscall} is declared in @file{unistd.h}.
 725
 726 @comment unistd.h
 727 @comment ???
 728 @deftypefun {long int} syscall (long int @var{sysno}, @dots{})
 729 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
 730
 731 @code{syscall} performs a generic system call.
 732
 733 @cindex system call number
@var{sysno} is the system call number.  Each kind of system call is
identified by a number.  Macros for all the possible system call numbers
are defined in @file{sys/syscall.h}.
 737
The remaining arguments are the arguments for the system call, in
order, and their meanings depend on the kind of system call.  Each kind
of system call has a definite number of arguments, from zero up to a
small platform-dependent maximum (six on Linux).  If you pass more
arguments than the system call takes, the extra ones to the right are
ignored.
 743
 744 The return value is the return value from the system call, unless the
 745 system call failed.  In that case, @code{syscall} returns @code{-1} and
 746 sets @code{errno} to an error code that the system call returned.  Note
 747 that system calls do not return @code{-1} when they succeed.
 748 @cindex errno
 749
If you specify an invalid @var{sysno}, @code{syscall} returns @code{-1}
and sets @code{errno} to @code{ENOSYS}.
 752
 753 Example:
 754
@smallexample
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <errno.h>

@dots{}

int rc;

/* Invoke the chmod system call directly by number.  */
rc = syscall (SYS_chmod, "/etc/passwd", 0444);

if (rc == -1)
  fprintf (stderr, "chmod failed, errno = %d\n", errno);
@end smallexample
 771
 772 This, if all the compatibility stars are aligned, is equivalent to the
 773 following preferable code:
 774
@smallexample
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <errno.h>

@dots{}

int rc;

/* Let the library wrapper make the system call.  */
rc = chmod ("/etc/passwd", 0444);
if (rc == -1)
  fprintf (stderr, "chmod failed, errno = %d\n", errno);
@end smallexample
 790
 791 @end deftypefun
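
As a further sketch, the code below uses @code{syscall} for a call that
older versions of @theglibc{} did not wrap.  It assumes a Linux system,
where @code{SYS_gettid} names the system call that returns the kernel's
ID for the calling thread; the function name @code{current_thread_id} is
hypothetical.

```c
#define _DEFAULT_SOURCE         /* make unistd.h declare syscall */
#include <sys/syscall.h>        /* SYS_ macros */
#include <unistd.h>

/* Return the kernel thread ID of the calling thread, or -1 with errno
   set if the system call fails.  SYS_gettid is Linux-specific.  */
long int
current_thread_id (void)
{
  /* This system call takes no arguments, so only the number is passed.  */
  return syscall (SYS_gettid);
}
```

On Linux, the initial thread of a process has a thread ID equal to the
process ID, so calling this function from @code{main} in a
single-threaded program should give the same value as @code{getpid}.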
 792
 793
 794 @node Program Termination
 795 @section Program Termination
 796 @cindex program termination
 797 @cindex process termination
 798
 799 @cindex exit status value
 800 The usual way for a program to terminate is simply for its @code{main}
 801 function to return.  The @dfn{exit status value} returned from the
 802 @code{main} function is used to report information back to the process's
 803 parent process or shell.
 804
 805 A program can also terminate normally by calling the @code{exit}
 806 function.
 807
 808 In addition, programs can be terminated by signals; this is discussed in
 809 more detail in @ref{Signal Handling}.  The @code{abort} function causes
 810 a signal that kills the program.
 811
 812 @menu
 813 * Normal Termination::          If a program calls @code{exit}, a
 814                                  process terminates normally.
 815 * Exit Status::                 The @code{exit status} provides information
 816                                  about why the process terminated.
 817 * Cleanups on Exit::            A process can run its own cleanup
 818                                  functions upon normal termination.
 819 * Aborting a Program::          The @code{abort} function causes
 820                                  abnormal program termination.
 821 * Termination Internals::       What happens when a process terminates.
 822 @end menu
 823
 824 @node Normal Termination
 825 @subsection Normal Termination
 826
 827 A process terminates normally when its program signals it is done by
 828 calling @code{exit}.  Returning from @code{main} is equivalent to
 829 calling @code{exit}, and the value that @code{main} returns is used as
 830 the argument to @code{exit}.
 831
 832 @comment stdlib.h
 833 @comment ISO
 834 @deftypefun void exit (int @var{status})
 835 @safety{@prelim{}@mtunsafe{@mtasurace{:exit}}@asunsafe{@asucorrupt{}}@acunsafe{@acucorrupt{} @aculock{}}}
 836 @c Access to the atexit/on_exit list, the libc_atexit hook and tls dtors
 837 @c is not guarded.  Streams must be flushed, and that triggers the usual
 838 @c AS and AC issues with streams.
 839 The @code{exit} function tells the system that the program is done, which
 840 causes it to terminate the process.
 841
@var{status} is the program's exit status, which becomes part of the
process's termination status.  This function does not return.
 844 @end deftypefun
 845
 846 Normal termination causes the following actions:
 847
 848 @enumerate
 849 @item
 850 Functions that were registered with the @code{atexit} or @code{on_exit}
 851 functions are called in the reverse order of their registration.  This
 852 mechanism allows your application to specify its own ``cleanup'' actions
to be performed at program termination.  Typically, this is used to do
things like saving program state information in a file, or unlocking
locks in shared databases.
 856
 857 @item
 858 All open streams are closed, writing out any buffered output data.  See
 859 @ref{Closing Streams}.  In addition, temporary files opened
 860 with the @code{tmpfile} function are removed; see @ref{Temporary Files}.
 861
 862 @item
 863 @code{_exit} is called, terminating the program.  @xref{Termination Internals}.
 864 @end enumerate
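
The ordering in the first step can be observed with a sketch along the
following lines.  It runs the exiting program in a child process and has
each cleanup function report over a pipe, so that the parent can see the
order in which the functions ran.  All of the function names here are
hypothetical, and the use of @code{fork} and @code{pipe} is only a
convenient way to observe the child from outside.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int report_fd = -1;      /* write end of the pipe, used by the child */

static void
note (char c)
{
  if (write (report_fd, &c, 1) != 1)
    _exit (EXIT_FAILURE);
}

static void first_registered (void)  { note ('A'); }
static void second_registered (void) { note ('B'); }

/* Run a child that registers two cleanup functions and calls exit.
   Store the order in which they ran in BUF as a string.  Return 0 on
   success, -1 on error.  */
int
exit_handler_order (char *buf, size_t size)
{
  int fds[2];
  pid_t pid;
  size_t used = 0;
  ssize_t n;

  if (size == 0 || pipe (fds) == -1)
    return -1;
  fflush (NULL);                /* keep buffered output out of the child */
  pid = fork ();
  if (pid == -1)
    {
      close (fds[0]);
      close (fds[1]);
      return -1;
    }
  if (pid == 0)
    {
      close (fds[0]);
      report_fd = fds[1];
      if (atexit (first_registered) != 0 || atexit (second_registered) != 0)
        _exit (EXIT_FAILURE);
      exit (EXIT_SUCCESS);      /* runs second_registered, then first_registered */
    }
  close (fds[1]);
  while (used < size - 1
         && (n = read (fds[0], buf + used, size - 1 - used)) > 0)
    used += (size_t) n;
  buf[used] = '\0';
  close (fds[0]);
  return waitpid (pid, NULL, 0) == pid ? 0 : -1;
}
```

Because the functions are called in the reverse order of their
registration, the parent is expected to read the letters in the order
@samp{BA}.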
 865
 866 @node Exit Status
 867 @subsection Exit Status
 868 @cindex exit status
 869
 870 When a program exits, it can return to the parent process a small
 871 amount of information about the cause of termination, using the
 872 @dfn{exit status}.  This is a value between 0 and 255 that the exiting
 873 process passes as an argument to @code{exit}.
 874
 875 Normally you should use the exit status to report very broad information
 876 about success or failure.  You can't provide a lot of detail about the
 877 reasons for the failure, and most parent processes would not want much
 878 detail anyway.
 879
 880 There are conventions for what sorts of status values certain programs
 881 should return.  The most common convention is simply 0 for success and 1
 882 for failure.  Programs that perform comparison use a different
 883 convention: they use status 1 to indicate a mismatch, and status 2 to
 884 indicate an inability to compare.  Your program should follow an
 885 existing convention if an existing convention makes sense for it.
 886
 887 A general convention reserves status values 128 and up for special
 888 purposes.  In particular, the value 128 is used to indicate failure to
 889 execute another program in a subprocess.  This convention is not
 890 universally obeyed, but it is a good idea to follow it in your programs.
 891
 892 @strong{Warning:} Don't try to use the number of errors as the exit
 893 status.  This is actually not very useful; a parent process would
 894 generally not care how many errors occurred.  Worse than that, it does
 895 not work, because the status value is truncated to eight bits.
 896 Thus, if the program tried to report 256 errors, the parent would
 897 receive a report of 0 errors---that is, success.
 898
 899 For the same reason, it does not work to use the value of @code{errno}
 900 as the exit status---these can exceed 255.
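
The truncation can be demonstrated with a small sketch that forks a
child, lets it call @code{exit} with a given value, and returns the
status that the parent observes.  The function name
@code{observed_exit_status} is hypothetical; @code{waitpid} and the
@code{WEXITSTATUS} macro are described in @ref{Process Completion}.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that calls exit (STATUS), and return the exit status
   that the parent observes through waitpid, or -1 on error.  */
int
observed_exit_status (int status)
{
  int wstatus;
  pid_t pid;

  fflush (NULL);                /* keep buffered output out of the child */
  pid = fork ();
  if (pid == -1)
    return -1;
  if (pid == 0)
    exit (status);
  if (waitpid (pid, &wstatus, 0) != pid || !WIFEXITED (wstatus))
    return -1;
  /* Only the low-order eight bits of STATUS survive.  */
  return WEXITSTATUS (wstatus);
}
```

With this helper, a child that exits with status 256 is seen by the
parent as having exited with status 0, and one that exits with 257 is
seen as having exited with status 1.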
 901
 902 @strong{Portability note:} Some non-POSIX systems use different
 903 conventions for exit status values.  For greater portability, you can
 904 use the macros @code{EXIT_SUCCESS} and @code{EXIT_FAILURE} for the
 905 conventional status value for success and failure, respectively.  They
 906 are declared in the file @file{stdlib.h}.
 907 @pindex stdlib.h
 908
 909 @comment stdlib.h
 910 @comment ISO
 911 @deftypevr Macro int EXIT_SUCCESS
 912 This macro can be used with the @code{exit} function to indicate
 913 successful program completion.
 914
 915 On POSIX systems, the value of this macro is @code{0}.  On other
 916 systems, the value might be some other (possibly non-constant) integer
 917 expression.
 918 @end deftypevr
 919
 920 @comment stdlib.h
 921 @comment ISO
 922 @deftypevr Macro int EXIT_FAILURE
 923 This macro can be used with the @code{exit} function to indicate
 924 unsuccessful program completion in a general sense.
 925
 926 On POSIX systems, the value of this macro is @code{1}.  On other
 927 systems, the value might be some other (possibly non-constant) integer
 928 expression.  Other nonzero status values also indicate failures.  Certain
 929 programs use different nonzero status values to indicate particular
kinds of ``non-success''.  For example, @code{diff} uses status value
 931 @code{1} to mean that the files are different, and @code{2} or more to
 932 mean that there was difficulty in opening the files.
 933 @end deftypevr
 934
Don't confuse a program's exit status with a process's termination status.
There are lots of ways a process can terminate besides having its program
finish.  In the event that the process termination @emph{is} caused by program
termination (i.e., @code{exit}), though, the program's exit status becomes
part of the process's termination status.
 940
 941 @node Cleanups on Exit
 942 @subsection Cleanups on Exit
 943
 944 Your program can arrange to run its own cleanup functions if normal
 945 termination happens.  If you are writing a library for use in various
 946 application programs, then it is unreliable to insist that all
 947 applications call the library's cleanup functions explicitly before
 948 exiting.  It is much more robust to make the cleanup invisible to the
 949 application, by setting up a cleanup function in the library itself
 950 using @code{atexit} or @code{on_exit}.
 951
 952 @comment stdlib.h
 953 @comment ISO
 954 @deftypefun int atexit (void (*@var{function}) (void))
 955 @safety{@prelim{}@mtsafe{}@asunsafe{@ascuheap{} @asulock{}}@acunsafe{@aculock{} @acsmem{}}}
 956 @c atexit @ascuheap @asulock @aculock @acsmem
 957 @c  cxa_atexit @ascuheap @asulock @aculock @acsmem
 958 @c   __internal_atexit @ascuheap @asulock @aculock @acsmem
 959 @c    __new_exitfn @ascuheap @asulock @aculock @acsmem
 960 @c     __libc_lock_lock @asulock @aculock
 961 @c     calloc dup @ascuheap @acsmem
 962 @c     __libc_lock_unlock @aculock
 963 @c    atomic_write_barrier dup ok
 964 The @code{atexit} function registers the function @var{function} to be
 965 called at normal program termination.  The @var{function} is called with
 966 no arguments.
 967
 968 The return value from @code{atexit} is zero on success and nonzero if
 969 the function cannot be registered.
 970 @end deftypefun
 971
 972 @comment stdlib.h
 973 @comment SunOS
 974 @deftypefun int on_exit (void (*@var{function})(int @var{status}, void *@var{arg}), void *@var{arg})
 975 @safety{@prelim{}@mtsafe{}@asunsafe{@ascuheap{} @asulock{}}@acunsafe{@aculock{} @acsmem{}}}
 976 @c on_exit @ascuheap @asulock @aculock @acsmem
 977 @c  new_exitfn dup @ascuheap @asulock @aculock @acsmem
 978 @c  atomic_write_barrier dup ok
 979 This function is a somewhat more powerful variant of @code{atexit}.  It
 980 accepts two arguments, a function @var{function} and an arbitrary
 981 pointer @var{arg}.  At normal program termination, the @var{function} is
 982 called with two arguments:  the @var{status} value passed to @code{exit},
 983 and the @var{arg}.
 984
This function is included in @theglibc{} only for compatibility
with SunOS, and may not be supported by other implementations.
 987 @end deftypefun
 988
 989 Here's a trivial program that illustrates the use of @code{exit} and
 990 @code{atexit}:
 991
 992 @smallexample
 993 @include atexit.c.texi
 994 @end smallexample
 995
 996 @noindent
 997 When this program is executed, it just prints the message and exits.
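
The @code{on_exit} variant can be sketched in a similar spirit.  In the
code below, which is only an illustration, a child process registers a
function with @code{on_exit}, passing it a pointer to a file descriptor
as @var{arg}; the function then reports the @var{status} it receives
back to the parent over a pipe.  The names @code{report_status} and
@code{status_seen_by_on_exit} are hypothetical.

```c
#define _DEFAULT_SOURCE         /* make stdlib.h declare on_exit */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Cleanup function: ARG points to a file descriptor on which to report
   the STATUS value that was passed to exit.  */
static void
report_status (int status, void *arg)
{
  int fd = *(int *) arg;

  if (write (fd, &status, sizeof status) != (ssize_t) sizeof status)
    _exit (EXIT_FAILURE);
}

/* Run a child that calls exit (STATUS) after registering report_status.
   Return the status value the cleanup function saw, or -1 on error.  */
int
status_seen_by_on_exit (int status)
{
  int fds[2];
  int seen = -1;
  pid_t pid;

  if (pipe (fds) == -1)
    return -1;
  fflush (NULL);                /* keep buffered output out of the child */
  pid = fork ();
  if (pid == -1)
    {
      close (fds[0]);
      close (fds[1]);
      return -1;
    }
  if (pid == 0)
    {
      close (fds[0]);
      if (on_exit (report_status, &fds[1]) != 0)
        _exit (EXIT_FAILURE);
      exit (status);
    }
  close (fds[1]);
  if (read (fds[0], &seen, sizeof seen) != (ssize_t) sizeof seen)
    seen = -1;
  close (fds[0]);
  waitpid (pid, NULL, 0);
  return seen;
}
```

The pointer argument is what makes @code{on_exit} more flexible than
@code{atexit}: the cleanup function needs no global variable to find
the data it works on.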
 998
 999 @node Aborting a Program
1000 @subsection Aborting a Program
1001 @cindex aborting a program
1002
1003 You can abort your program using the @code{abort} function.  The prototype
1004 for this function is in @file{stdlib.h}.
1005 @pindex stdlib.h
1006
1007 @comment stdlib.h
1008 @comment ISO
1009 @deftypefun void abort (void)
1010 @safety{@prelim{}@mtsafe{}@asunsafe{@asucorrupt{}}@acunsafe{@aculock{} @acucorrupt{}}}
1011 @c The implementation takes a recursive lock and attempts to support
1012 @c calls from signal handlers, but if we're in the middle of flushing or
1013 @c using streams, we may encounter them in inconsistent states.
1014 The @code{abort} function causes abnormal program termination.  This
1015 does not execute cleanup functions registered with @code{atexit} or
1016 @code{on_exit}.
1017
1018 This function actually terminates the process by raising a
1019 @code{SIGABRT} signal, and your program can include a handler to
1020 intercept this signal; see @ref{Signal Handling}.
1021 @end deftypefun
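
The effect of @code{abort} on the termination status can be observed
from a parent process, as in the following sketch.  The function name
@code{signal_from_abort} is hypothetical, and the call to
@code{setrlimit} is only there to keep the child from leaving a core
file behind.

```c
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that calls abort.  Return the number of the signal that
   terminated it, 0 if it exited normally instead, or -1 on error.  */
int
signal_from_abort (void)
{
  int wstatus;
  pid_t pid;

  fflush (NULL);                /* keep buffered output out of the child */
  pid = fork ();
  if (pid == -1)
    return -1;
  if (pid == 0)
    {
      struct rlimit no_core = { 0, 0 };

      setrlimit (RLIMIT_CORE, &no_core);  /* suppress the core file */
      signal (SIGABRT, SIG_DFL);          /* make sure nothing intercepts it */
      abort ();
    }
  if (waitpid (pid, &wstatus, 0) != pid)
    return -1;
  return WIFSIGNALED (wstatus) ? WTERMSIG (wstatus) : 0;
}
```

The parent sees a child that was terminated by @code{SIGABRT} rather
than one that exited with a status value.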
1022
1023 @c Put in by rms.  Don't remove.
1024 @cartouche
1025 @strong{Future Change Warning:} Proposed Federal censorship regulations
1026 may prohibit us from giving you information about the possibility of
1027 calling this function.  We would be required to say that this is not an
1028 acceptable way of terminating a program.
1029 @end cartouche
1030
1031 @node Termination Internals
1032 @subsection Termination Internals
1033
1034 The @code{_exit} function is the primitive used for process termination
1035 by @code{exit}.  It is declared in the header file @file{unistd.h}.
1036 @pindex unistd.h
1037
1038 @comment unistd.h
1039 @comment POSIX.1
1040 @deftypefun void _exit (int @var{status})
1041 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1042 @c Direct syscall (exit_group or exit); calls __task_terminate on hurd,
1043 @c and abort in the generic posix implementation.
1044 The @code{_exit} function is the primitive for causing a process to
1045 terminate with status @var{status}.  Calling this function does not
1046 execute cleanup functions registered with @code{atexit} or
1047 @code{on_exit}.
1048 @end deftypefun
1049
1050 @comment stdlib.h
1051 @comment ISO
1052 @deftypefun void _Exit (int @var{status})
1053 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1054 @c Alias for _exit.
The @code{_Exit} function is the @w{ISO C} equivalent to @code{_exit}.
The @w{ISO C} committee members were not sure whether the definitions of
@code{_exit} and @code{_Exit} were compatible, so they did not use the
POSIX name.
1059
1060 This function was introduced in @w{ISO C99} and is declared in
1061 @file{stdlib.h}.
1062 @end deftypefun
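
The difference from @code{exit} can be illustrated with a sketch in
which a child process registers a cleanup function and then leaves
through @code{_exit}.  The function names are hypothetical; the child
writes the letter @samp{X} just before terminating, and the cleanup
function would write @samp{H} if it were ever called.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int trace_fd = -1;       /* write end of the pipe, used by the child */

static void
trace (char c)
{
  if (write (trace_fd, &c, 1) != 1)
    _exit (EXIT_FAILURE);
}

/* Would write 'H' if the registered cleanup functions were run.  */
static void
handler (void)
{
  trace ('H');
}

/* Run a child that registers HANDLER and then calls _exit.  Store what
   the child wrote in BUF as a string.  Return 0 on success, -1 on error.  */
int
trace_of_underscore_exit (char *buf, size_t size)
{
  int fds[2];
  pid_t pid;
  size_t used = 0;
  ssize_t n;

  if (size == 0 || pipe (fds) == -1)
    return -1;
  fflush (NULL);                /* keep buffered output out of the child */
  pid = fork ();
  if (pid == -1)
    {
      close (fds[0]);
      close (fds[1]);
      return -1;
    }
  if (pid == 0)
    {
      close (fds[0]);
      trace_fd = fds[1];
      if (atexit (handler) != 0)
        _exit (EXIT_FAILURE);
      trace ('X');
      _exit (EXIT_SUCCESS);     /* terminates without calling handler */
    }
  close (fds[1]);
  while (used < size - 1
         && (n = read (fds[0], buf + used, size - 1 - used)) > 0)
    used += (size_t) n;
  buf[used] = '\0';
  close (fds[0]);
  return waitpid (pid, NULL, 0) == pid ? 0 : -1;
}
```

The parent is expected to read only @samp{X}, since @code{_exit} does
not run the function registered with @code{atexit}.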
1063
1064 When a process terminates for any reason---either because the program
1065 terminates, or as a result of a signal---the
1066 following things happen:
1067
1068 @itemize @bullet
1069 @item
1070 All open file descriptors in the process are closed.  @xref{Low-Level I/O}.
1071 Note that streams are not flushed automatically when the process
1072 terminates; see @ref{I/O on Streams}.
1073
1074 @item
The process's termination status is saved to be reported back to the
parent process via @code{wait} or @code{waitpid}; see
@ref{Process Completion}.  If the program exited, this status includes
the low-order 8 bits of the program's exit status.
1079
1080
1081 @item
1082 Any child processes of the process being terminated are assigned a new
1083 parent process.  (On most systems, including GNU, this is the @code{init}
1084 process, with process ID 1.)
1085
1086 @item
1087 A @code{SIGCHLD} signal is sent to the parent process.
1088
1089 @item
1090 If the process is a session leader that has a controlling terminal, then
1091 a @code{SIGHUP} signal is sent to each process in the foreground job,
1092 and the controlling terminal is disassociated from that session.
1093 @xref{Job Control}.
1094
1095 @item
1096 If termination of a process causes a process group to become orphaned,
1097 and any member of that process group is stopped, then a @code{SIGHUP}
1098 signal and a @code{SIGCONT} signal are sent to each process in the
1099 group.  @xref{Job Control}.
1100 @end itemize