manual/=limits.texinfo

   1 @node Representation Limits, System Configuration Limits, System Information, Top
   2 @chapter Representation Limits
   3
   4 This chapter contains information about constants and parameters that
   5 characterize the representation of the various integer and
   6 floating-point types supported by the GNU C library.
   7
   8 @menu
   9 * Integer Representation Limits::       Determining maximum and minimum
  10                                          representation values of
  11                                          various integer subtypes.
  12 * Floating-Point Limits::               Parameters which characterize
  13                                          supported floating-point
  14                                          representations on a particular
  15                                          system.
  16 @end menu
  17
  18 @node Integer Representation Limits, Floating-Point Limits,  , Representation Limits
  19 @section Integer Representation Limits
  20 @cindex integer representation limits
  21 @cindex representation limits, integer
  22 @cindex limits, integer representation
  23
  24 Sometimes it is necessary for programs to know about the internal
  25 representation of various integer subtypes.  For example, if you want
  26 your program to be careful not to overflow an @code{int} counter
  27 variable, you need to know the largest value that an @code{int} can
  28 represent.  These kinds of parameters can vary from
  29 compiler to compiler and machine to machine.  Another typical use of
  30 this kind of parameter is in conditionalizing data structure definitions
  31 with @samp{#if} to select the most appropriate integer subtype that
  32 can represent the required range of values.
  33
  34 Macros representing the minimum and maximum limits of the integer types
  35 are defined in the header file @file{limits.h}.  The values of these
  36 macros are all integer constant expressions.
  37 @pindex limits.h
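
Here is a minimal sketch of both uses described above: selecting an
integer subtype with a preprocessor conditional, and guarding a counter
against overflow.  The names @code{COUNTER_LIMIT}, @code{counter_t},
@code{COUNTER_MAX} and @code{counter_next} are hypothetical and are not
part of the library; the required range of 100000 is only an assumption
made for illustration.

```c
#include <limits.h>

/* Hypothetical requirement: the counter must hold values up to 100000.  */
#define COUNTER_LIMIT 100000

/* The limit macros are integer constant expressions, so they can be
   compared in a preprocessing directive.  */
#if INT_MAX >= COUNTER_LIMIT
typedef int counter_t;
#define COUNTER_MAX INT_MAX
#else
typedef long int counter_t;
#define COUNTER_MAX LONG_MAX
#endif

/* Advance the counter, saturating at COUNTER_MAX instead of overflowing.  */
counter_t
counter_next (counter_t c)
{
  return c < COUNTER_MAX ? c + 1 : c;
}
```

Saturating is only one possible policy; a program could equally report
an error when the counter reaches @code{COUNTER_MAX}.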
  38
  39 @comment limits.h
  40 @comment ANSI
  41 @deftypevr Macro int CHAR_BIT
  42 This is the number of bits in a @code{char}, usually eight.
  43 @end deftypevr
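
As an illustration, @code{CHAR_BIT} can be combined with @code{sizeof}
to find the number of bits of storage in any type, or used as a loop
bound when examining a byte bit by bit.  The names @code{BITS_IN} and
@code{count_bits} below are hypothetical helpers, not library
facilities.

```c
#include <limits.h>
#include <stddef.h>

/* Number of bits of storage occupied by an object of the given type.
   This counts any padding bits the representation may contain.  */
#define BITS_IN(type) (sizeof (type) * CHAR_BIT)

/* Count the bits that are set in one byte by testing each of its
   CHAR_BIT bit positions in turn.  */
int
count_bits (unsigned char byte)
{
  int n = 0;
  int i;

  for (i = 0; i < CHAR_BIT; i++)
    if (byte & (1u << i))
      n++;
  return n;
}
```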
  44
  45 @comment limits.h
  46 @comment ANSI
  47 @deftypevr Macro int SCHAR_MIN
  48 This is the minimum value that can be represented by a @code{signed char}.
  49 @end deftypevr
  50
  51 @comment limits.h
  52 @comment ANSI
  53 @deftypevr Macro int SCHAR_MAX
  54 This is the maximum value that can be represented by a @code{signed char}.
  55 @end deftypevr
  56
  57 @comment limits.h
  58 @comment ANSI
  59 @deftypevr Macro int UCHAR_MAX
  60 This is the maximum value that can be represented by a @code{unsigned char}.
  61 (The minimum value of an @code{unsigned char} is zero.)
  62 @end deftypevr
  63
  64 @comment limits.h
  65 @comment ANSI
  66 @deftypevr Macro int CHAR_MIN
  67 This is the minimum value that can be represented by a @code{char}.
  68 It's equal to @code{SCHAR_MIN} if @code{char} is signed, or zero
  69 otherwise.
  70 @end deftypevr
  71
  72 @comment limits.h
  73 @comment ANSI
  74 @deftypevr Macro int CHAR_MAX
  75 This is the maximum value that can be represented by a @code{char}.
  76 It's equal to @code{SCHAR_MAX} if @code{char} is signed, or
  77 @code{UCHAR_MAX} otherwise.
  78 @end deftypevr
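
Because the signedness of plain @code{char} varies between systems, a
program can test @code{CHAR_MIN} to find out which case applies.  The
sketch below uses two hypothetical helpers, @code{char_is_signed} and
@code{char_index}; the second shows the usual way to turn a @code{char}
into a table index that is valid in either case.

```c
#include <limits.h>

/* Nonzero if plain char is a signed type on this system.  */
int
char_is_signed (void)
{
  return CHAR_MIN < 0;
}

/* Map a char to an index in the range 0 to UCHAR_MAX, whether or not
   plain char is signed.  */
unsigned int
char_index (char c)
{
  return (unsigned char) c;
}
```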
  79
  80 @comment limits.h
  81 @comment ANSI
  82 @deftypevr Macro int SHRT_MIN
  83 This is the minimum value that can be represented by a @code{signed
  84 short int}.  On most machines that the GNU C library runs on,
  85 @code{short} integers are 16-bit quantities.
  86 @end deftypevr
  87
  88 @comment limits.h
  89 @comment ANSI
  90 @deftypevr Macro int SHRT_MAX
  91 This is the maximum value that can be represented by a @code{signed
  92 short int}.
  93 @end deftypevr
  94
  95 @comment limits.h
  96 @comment ANSI
  97 @deftypevr Macro int USHRT_MAX
  98 This is the maximum value that can be represented by an @code{unsigned
  99 short int}.  (The minimum value of an @code{unsigned short int} is zero.)
 100 @end deftypevr
 101
 102 @comment limits.h
 103 @comment ANSI
 104 @deftypevr Macro int INT_MIN
 105 This is the minimum value that can be represented by a @code{signed
 106 int}.  On most machines that the GNU C system runs on, an @code{int} is
 107 a 32-bit quantity.
 108 @end deftypevr
 109
 110 @comment limits.h
 111 @comment ANSI
 112 @deftypevr Macro int INT_MAX
 113 This is the maximum value that can be represented by a @code{signed
 114 int}.
 115 @end deftypevr
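
One common use of @code{INT_MIN} and @code{INT_MAX} is to check an
addition before performing it, since signed overflow is not detected
automatically.  The function @code{checked_add} below is a hypothetical
helper written for this illustration.

```c
#include <limits.h>

/* Store a + b in *sum and return 1, or return 0 without touching *sum
   if the mathematical result does not fit in an int.  The comparisons
   are arranged so that they cannot themselves overflow.  */
int
checked_add (int a, int b, int *sum)
{
  if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
    return 0;
  *sum = a + b;
  return 1;
}
```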
 116
 117 @comment limits.h
 118 @comment ANSI
 119 @deftypevr Macro {unsigned int} UINT_MAX
 120 This is the maximum value that can be represented by an @code{unsigned
 121 int}.  (The minimum value of an @code{unsigned int} is zero.)
 122 @end deftypevr
 123
 124 @comment limits.h
 125 @comment ANSI
 126 @deftypevr Macro {long int} LONG_MIN
 127 This is the minimum value that can be represented by a @code{signed long
 128 int}.  @code{long} integers are 32-bit quantities, the same size as @code{int},
 129 on 32-bit GNU systems, and 64-bit quantities on most 64-bit GNU systems.
 130 @end deftypevr
 131
 132 @comment limits.h
 133 @comment ANSI
 134 @deftypevr Macro {long int} LONG_MAX
 135 This is the maximum value that can be represented by a @code{signed long
 136 int}.
 137 @end deftypevr
 138
 139 @comment limits.h
 140 @comment ANSI
 141 @deftypevr Macro {unsigned long int} ULONG_MAX
 142 This is the maximum value that can be represented by an @code{unsigned
 143 long int}.  (The minimum value of an @code{unsigned long int} is zero.)
 144 @end deftypevr
 145
 146 @strong{Incomplete:}  There should be corresponding limits for the GNU
 147 C Compiler's @code{long long} type, too.  (But they are not now present
 148 in the header file.)
 149
 150 The header file @file{limits.h} also defines some additional constants
 151 that parameterize various operating system and file system limits.  These
 152 constants are described in @ref{System Parameters} and @ref{File System
 153 Parameters}.
 154 @pindex limits.h
 155
 156
 157 @node Floating-Point Limits,  , Integer Representation Limits, Representation Limits
 158 @section Floating-Point Limits
 159 @cindex floating-point number representation
 160 @cindex representation, floating-point number
 161 @cindex limits, floating-point representation
 162
 163 Because floating-point numbers are represented internally as approximate
 164 quantities, algorithms for manipulating floating-point data often need
 165 to be parameterized in terms of the accuracy of the representation.
 166 Some of the functions in the C library itself need this information; for
 167 example, the algorithms for printing and reading floating-point numbers
 168 (@pxref{I/O on Streams}) and for calculating trigonometric and
 169 irrational functions (@pxref{Mathematics}) use information about the
 170 underlying floating-point representation to avoid round-off error and
 171 loss of accuracy.  User programs that implement numerical analysis
 172 techniques also often need to be parameterized in this way in order to
 173 minimize or compute error bounds.
 174
 175 The specific representation of floating-point numbers varies from
 176 machine to machine.  The GNU C library defines a set of parameters which
 177 characterize each of the supported floating-point representations on a
 178 particular system.
 179
 180 @menu
 181 * Floating-Point Representation::       Definitions of terminology.
 182 * Floating-Point Parameters::           Descriptions of the library
 183                                          facilities.
 184 * IEEE Floating Point::                 An example of a common
 185                                          representation.
 186 @end menu
 187
 188 @node Floating-Point Representation, Floating-Point Parameters,  , Floating-Point Limits
 189 @subsection Floating-Point Representation
 190
 191 This section introduces the terminology used to characterize the
 192 representation of floating-point numbers.
 193
 194 You are probably already familiar with most of these concepts in terms
 195 of scientific or exponential notation for floating-point numbers.  For
 196 example, the number @code{123456.0} could be expressed in exponential
 197 notation as @code{1.23456e+05}, a shorthand notation indicating that the
 198 mantissa @code{1.23456} is multiplied by the base @code{10} raised to
 199 power @code{5}.
 200
 201 More formally, the internal representation of a floating-point number
 202 can be characterized in terms of the following parameters:
 203
 204 @itemize @bullet
 205 @item
 206 The @dfn{sign} is either @code{-1} or @code{1}.
 207 @cindex sign (of floating-point number)
 208
 209 @item
 210 The @dfn{base} or @dfn{radix} for exponentiation; an integer greater
 211 than @code{1}.  This is a constant for the particular representation.
 212 @cindex base (of floating-point number)
 213 @cindex radix (of floating-point number)
 214
 215 @item
 216 The @dfn{exponent} to which the base is raised.  The upper and lower
 217 bounds of the exponent value are constants for the particular
 218 representation.
 219 @cindex exponent (of floating-point number)
 220
 221 Sometimes, in the actual bits representing the floating-point number,
 222 the exponent is @dfn{biased} by adding a constant to it, to make it
 223 always be represented as an unsigned quantity.  This is only important
 224 if you have some reason to pick apart the bit fields making up the
 225 floating-point number by hand, which is something for which the GNU
 226 library provides no support.  So this is ignored in the discussion that
 227 follows.
 228 @cindex bias (of floating-point number exponent)
 229
 230 @item
 231 The value of the @dfn{mantissa} or @dfn{significand}, which is an
 232 unsigned integer.
 233 @cindex mantissa (of floating-point number)
 234 @cindex significand (of floating-point number)
 235
 236 @item
 237 The @dfn{precision} of the mantissa.  If the base of the representation
 238 is @var{b}, then the precision is the number of base-@var{b} digits in
 239 the mantissa.  This is a constant for the particular representation.
 240
 241 Many floating-point representations have an implicit @dfn{hidden bit} in
 242 the mantissa.  Any such hidden bits are counted in the precision.
 243 Again, the GNU library provides no facilities for dealing with such low-level
 244 aspects of the representation.
 245 @cindex precision (of floating-point number)
 246 @cindex hidden bit (of floating-point number mantissa)
 247 @end itemize
 248
 249 The mantissa of a floating-point number actually represents an implicit
 250 fraction whose denominator is the base raised to the power of the
 251 precision.  Since the largest representable mantissa is one less than
 252 this denominator, the value of the fraction is always strictly less than
 253 @code{1}.  The mathematical value of a floating-point number is then the
 254 product of this fraction, the sign, and the base raised to the exponent.
 255
 256 If the floating-point number is @dfn{normalized}, the mantissa is also
 257 greater than or equal to the base raised to the power of one less
 258 than the precision (unless the number represents a floating-point zero,
 259 in which case the mantissa is zero).  The fractional quantity is
 260 therefore greater than or equal to @code{1/@var{b}}, where @var{b} is
 261 the base.
 262 @cindex normalized floating-point number
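
The decomposition described above can be observed with the standard
function @code{frexp}, which always works in base 2 regardless of
@code{FLT_RADIX}.  The following is only a sketch: the structure
@code{fp_parts} and the function @code{fp_decompose} are hypothetical
names, and a negative zero is reported with a sign of @code{1}.

```c
#include <math.h>

/* The parts of a floating-point value in the sense of this section.  */
struct fp_parts
{
  int sign;         /* -1 or 1 */
  double fraction;  /* at least 1/2 and less than 1, or 0 for zero */
  int exponent;     /* power of 2 by which the fraction is scaled */
};

/* Split the finite value x into sign, fraction and base-2 exponent,
   so that x equals sign * fraction * 2^exponent.  */
struct fp_parts
fp_decompose (double x)
{
  struct fp_parts p;

  p.sign = x < 0 ? -1 : 1;
  p.fraction = frexp (x < 0 ? -x : x, &p.exponent);
  return p;
}
```

For example, @code{fp_decompose (123456.0)} yields a sign of @code{1},
a fraction of @code{0.94189453125} and an exponent of @code{17}.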
 263
 264 @node Floating-Point Parameters, IEEE Floating Point, Floating-Point Representation, Floating-Point Limits
 265 @subsection Floating-Point Parameters
 266
 267 @strong{Incomplete:}  This section needs some more concrete examples
 268 of what these parameters mean and how to use them in a program.
 269
 270 These macro definitions can be accessed by including the header file
 271 @file{float.h} in your program.
 272 @pindex float.h
 273
 274 Macro names starting with @samp{FLT_} refer to the @code{float} type,
 275 while names beginning with @samp{DBL_} refer to the @code{double} type
 276 and names beginning with @samp{LDBL_} refer to the @code{long double}
 277 type.  (In implementations that do not support @code{long double} as
 278 a distinct data type, the values for those constants are the same
 279 as the corresponding constants for the @code{double} type.)@refill
 280 @cindex @code{float} representation limits
 281 @cindex @code{double} representation limits
 282 @cindex @code{long double} representation limits
 283
 284 Of these macros, only @code{FLT_RADIX} is guaranteed to be a constant
 285 expression.  The other macros listed here cannot be reliably used in
 286 places that require constant expressions, such as @samp{#if}
 287 preprocessing directives or array size specifications.
 288
 289 Although the ANSI C standard specifies minimum and maximum values for
 290 most of these parameters, the GNU C implementation uses whatever
 291 floating-point representations are supported by the underlying hardware.
 292 So whether GNU C actually satisfies the ANSI C requirements depends on
 293 what machine it is running on.
 294
 295 @comment float.h
 296 @comment ANSI
 297 @deftypevr Macro int FLT_ROUNDS
 298 This value characterizes the rounding mode for floating-point addition.
 299 The following values indicate standard rounding modes:
 300
 301 @table @code
 302 @item -1
 303 The mode is indeterminable.
 304 @item 0
 305 Rounding is towards zero.
 306 @item 1
 307 Rounding is to the nearest number.
 308 @item 2
 309 Rounding is towards positive infinity.
 310 @item 3
 311 Rounding is towards negative infinity.
 312 @end table
 313
 314 @noindent
 315 Any other value represents a machine-dependent nonstandard rounding
 316 mode.
 317 @end deftypevr
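
As a small illustration, the values in the table above can be turned
into a readable description.  The function @code{rounding_mode_name} is
a hypothetical helper; a program would typically call it as
@code{rounding_mode_name (FLT_ROUNDS)}.

```c
#include <float.h>

/* Return a description of a rounding mode value as reported by
   FLT_ROUNDS.  */
const char *
rounding_mode_name (int mode)
{
  switch (mode)
    {
    case -1:
      return "indeterminable";
    case 0:
      return "toward zero";
    case 1:
      return "to nearest";
    case 2:
      return "toward positive infinity";
    case 3:
      return "toward negative infinity";
    default:
      return "machine-dependent";
    }
}
```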
 318
 319 @comment float.h
 320 @comment ANSI
 321 @deftypevr Macro int FLT_RADIX
 322 This is the value of the base, or radix, of exponent representation.
 323 This is guaranteed to be a constant expression, unlike the other macros
 324 described in this section.
 325 @end deftypevr
 326
 327 @comment float.h
 328 @comment ANSI
 329 @deftypevr Macro int FLT_MANT_DIG
 330 This is the number of base-@code{FLT_RADIX} digits in the floating-point
 331 mantissa for the @code{float} data type.
 332 @end deftypevr
 333
 334 @comment float.h
 335 @comment ANSI
 336 @deftypevr Macro int DBL_MANT_DIG
 337 This is the number of base-@code{FLT_RADIX} digits in the floating-point
 338 mantissa for the @code{double} data type.
 339 @end deftypevr
 340
 341 @comment float.h
 342 @comment ANSI
 343 @deftypevr Macro int LDBL_MANT_DIG
 344 This is the number of base-@code{FLT_RADIX} digits in the floating-point
 345 mantissa for the @code{long double} data type.
 346 @end deftypevr
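
One practical consequence of the mantissa size is a bound on the
integers that a floating-point type can hold exactly: every integer
whose magnitude does not exceed the base raised to the number of
mantissa digits is representable.  The function
@code{max_exact_integer} below is a hypothetical helper that computes
this bound for @code{double}.

```c
#include <float.h>

/* Return FLT_RADIX raised to the power DBL_MANT_DIG.  Every integer
   from zero up to this value is exactly representable in a double;
   beyond it, some integers are not.  */
double
max_exact_integer (void)
{
  double limit = 1.0;
  int i;

  for (i = 0; i < DBL_MANT_DIG; i++)
    limit *= FLT_RADIX;
  return limit;
}
```

On a system with base 2 and a 53-digit mantissa, the result is
9007199254740992.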
 347
 348 @comment float.h
 349 @comment ANSI
 350 @deftypevr Macro int FLT_DIG
 351 This is the number of decimal digits of precision for the @code{float}
 352 data type.  Technically, if @var{p} and @var{b} are the precision and
 353 base (respectively) for the representation, then the decimal precision
 354 @var{q} is the maximum number of decimal digits such that any floating
 355 point number with @var{q} base 10 digits can be rounded to a floating
 356 point number with @var{p} base @var{b} digits and back again, without
 357 change to the @var{q} decimal digits.
 358
 359 The value of this macro is guaranteed to be at least @code{6}.
 360 @end deftypevr
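
The round trip in this definition can be tried out directly.  The
sketch below assumes the C99 functions @code{strtof} and
@code{snprintf} are available and that the value lies within the
normalized range of @code{float}; the function name
@code{digits_survive_float} is hypothetical.

```c
#include <float.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return 1 if VALUE, written with FLT_DIG significant decimal digits,
   reads back unchanged after being rounded to a float.  */
int
digits_survive_float (double value)
{
  char before[64];
  char after[64];
  float f;

  /* Write VALUE with exactly FLT_DIG significant decimal digits.  */
  snprintf (before, sizeof before, "%.*e", FLT_DIG - 1, value);
  /* Round that decimal numeral to a float, then print it again.  */
  f = strtof (before, NULL);
  snprintf (after, sizeof after, "%.*e", FLT_DIG - 1, (double) f);
  return strcmp (before, after) == 0;
}
```

Using more than @code{FLT_DIG} digits in the format would make the
comparison fail for some values, which is exactly the limit that the
macro describes.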
 361
 362 @comment float.h
 363 @comment ANSI
 364 @deftypevr Macro int DBL_DIG
 365 This is similar to @code{FLT_DIG}, but is for the @code{double} data
 366 type.  The value of this macro is guaranteed to be at least @code{10}.
 367 @end deftypevr
 368
 369 @comment float.h
 370 @comment ANSI
 371 @deftypevr Macro int LDBL_DIG
 372 This is similar to @code{FLT_DIG}, but is for the @code{long double}
 373 data type.  The value of this macro is guaranteed to be at least
 374 @code{10}.
 375 @end deftypevr
 376
 377 @comment float.h
 378 @comment ANSI
 379 @deftypevr Macro int FLT_MIN_EXP
 380 This is the minimum negative integer such that the mathematical value
 381 @code{FLT_RADIX} raised to one less than this power can be represented as a
 382 normalized floating-point number of type @code{float}.  In terms of the
 383 actual implementation, this is just the smallest value that can be
 384 represented in the exponent field of the number.
 385 @end deftypevr
 386
 387 @comment float.h
 388 @comment ANSI
 389 @deftypevr Macro int DBL_MIN_EXP
 390 This is similar to @code{FLT_MIN_EXP}, but is for the @code{double} data
 391 type.
 392 @end deftypevr
 393
 394 @comment float.h
 395 @comment ANSI
 396 @deftypevr Macro int LDBL_MIN_EXP
 397 This is similar to @code{FLT_MIN_EXP}, but is for the @code{long double}
 398 data type.
 399 @end deftypevr
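
The minimum exponent determines the smallest normalized number: it is
the base raised to one less than the minimum exponent.  The sketch
below checks this relationship; @code{radix_power} is a hypothetical
helper, and the computation is exact as long as the result is a
normalized @code{double}.

```c
#include <float.h>

/* Return FLT_RADIX raised to the power n, computed by repeated
   multiplication or division.  */
double
radix_power (int n)
{
  double r = 1.0;

  for (; n > 0; n--)
    r *= FLT_RADIX;
  for (; n < 0; n++)
    r /= FLT_RADIX;
  return r;
}
```

With this helper, @code{radix_power (FLT_MIN_EXP - 1)} equals
@code{FLT_MIN} and @code{radix_power (DBL_MIN_EXP - 1)} equals
@code{DBL_MIN}.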
 400
 401 @comment float.h
 402 @comment ANSI
 403 @deftypevr Macro int FLT_MIN_10_EXP
 404 This is the minimum negative integer such that the mathematical value
 405 @code{10} raised to this power can be represented as a
 406 normalized floating-point number of type @code{float}.  This is
 407 guaranteed to be no greater than @code{-37}.
 408 @end deftypevr
 409
 410 @comment float.h
 411 @comment ANSI
 412 @deftypevr Macro int DBL_MIN_10_EXP
 413 This is similar to @code{FLT_MIN_10_EXP}, but is for the @code{double}
 414 data type.
 415 @end deftypevr
 416
 417 @comment float.h
 418 @comment ANSI
 419 @deftypevr Macro int LDBL_MIN_10_EXP
 420 This is similar to @code{FLT_MIN_10_EXP}, but is for the @code{long
 421 double} data type.
 422 @end deftypevr
 423
 424
 425
 426 @comment float.h
 427 @comment ANSI
 428 @deftypevr Macro int FLT_MAX_EXP
 429 This is the maximum integer such that the mathematical value
 430 @code{FLT_RADIX} raised to one less than this power can be represented as a
 431 floating-point number of type @code{float}.  In terms of the actual
 432 implementation, this is just the largest value that can be represented
 433 in the exponent field of the number.
 434 @end deftypevr
 435
 436 @comment float.h
 437 @comment ANSI
 438 @deftypevr Macro int DBL_MAX_EXP
 439 This is similar to @code{FLT_MAX_EXP}, but is for the @code{double} data
 440 type.
 441 @end deftypevr
 442
 443 @comment float.h
 444 @comment ANSI
 445 @deftypevr Macro int LDBL_MAX_EXP
 446 This is similar to @code{FLT_MAX_EXP}, but is for the @code{long double}
 447 data type.
 448 @end deftypevr
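
The maximum exponent can be used to predict whether scaling a number by
a power of the base would overflow.  The sketch below assumes that
@code{FLT_RADIX} is 2, because it relies on @code{frexp}, which always
reports a base-2 exponent; it also assumes a finite argument and a
scale factor small enough that the integer addition does not overflow.
The function name @code{scale_would_overflow} is hypothetical.

```c
#include <float.h>
#include <math.h>

/* Return nonzero if x times 2 to the power n is too large in
   magnitude to be a finite double.  Assumes FLT_RADIX is 2 and x is
   finite.  */
int
scale_would_overflow (double x, int n)
{
  int e;

  if (x == 0.0)
    return 0;
  /* frexp stores e such that |x| is at least 2^(e-1) and less than
     2^e.  The scaled value is finite exactly when e + n does not
     exceed DBL_MAX_EXP.  */
  frexp (x, &e);
  return e + n > DBL_MAX_EXP;
}
```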
 449
 450 @comment float.h
 451 @comment ANSI
 452 @deftypevr Macro int FLT_MAX_10_EXP
 453 This is the maximum integer such that the mathematical value
 454 @code{10} raised to this power can be represented as a
 455 normalized floating-point number of type @code{float}.  This is
 456 guaranteed to be at least @code{37}.
 457 @end deftypevr
 458
 459 @comment float.h
 460 @comment ANSI
 461 @deftypevr Macro int DBL_MAX_10_EXP
 462 This is similar to @code{FLT_MAX_10_EXP}, but is for the @code{double}
 463 data type.
 464 @end deftypevr
 465
 466 @comment float.h
 467 @comment ANSI
 468 @deftypevr Macro int LDBL_MAX_10_EXP
 469 This is similar to @code{FLT_MAX_10_EXP}, but is for the @code{long
 470 double} data type.
 471 @end deftypevr
 472
 473
 474 @comment float.h
 475 @comment ANSI
 476 @deftypevr Macro float FLT_MAX
 477 The value of this macro is the maximum representable floating-point
 478 number of type @code{float}, and is guaranteed to be at least
 479 @code{1E+37}.
 480 @end deftypevr
 481
 482 @comment float.h
 483 @comment ANSI
 484 @deftypevr Macro double DBL_MAX
 485 The value of this macro is the maximum representable floating-point
 486 number of type @code{double}, and is guaranteed to be at least
 487 @code{1E+37}.
 488 @end deftypevr
 489
 490 @comment float.h
 491 @comment ANSI
 492 @deftypevr Macro {long double} LDBL_MAX
 493 The value of this macro is the maximum representable floating-point
 494 number of type @code{long double}, and is guaranteed to be at least
 495 @code{1E+37}.
 496 @end deftypevr
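
A typical use of these maximum values is to test whether an operation
would overflow before carrying it out.  The function
@code{product_overflows} below is a hypothetical helper; because the
division it performs is itself rounded, treat the result as a close
approximation for products that land very near @code{DBL_MAX}.

```c
#include <float.h>

/* Return the magnitude of x.  */
static double
magnitude (double x)
{
  return x < 0 ? -x : x;
}

/* Return nonzero if the product a * b would exceed DBL_MAX in
   magnitude.  */
int
product_overflows (double a, double b)
{
  double ma = magnitude (a);
  double mb = magnitude (b);

  /* If mb is at most 1, the product is no larger than ma and cannot
     overflow.  Otherwise DBL_MAX / mb is a finite bound on ma.  */
  return mb > 1.0 && ma > DBL_MAX / mb;
}
```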
 497
 498
 499 @comment float.h
 500 @comment ANSI
 501 @deftypevr Macro float FLT_MIN
 502 The value of this macro is the minimum normalized positive
 503 floating-point number that is representable by type @code{float}, and is
 504 guaranteed to be no more than @code{1E-37}.
 505 @end deftypevr
 506
 507 @comment float.h
 508 @comment ANSI
 509 @deftypevr Macro double DBL_MIN
 510 The value of this macro is the minimum normalized positive
 511 floating-point number that is representable by type @code{double}, and
 512 is guaranteed to be no more than @code{1E-37}.
 513 @end deftypevr
 514
 515 @comment float.h
 516 @comment ANSI
 517 @deftypevr Macro {long double} LDBL_MIN
 518 The value of this macro is the minimum normalized positive
 519 floating-point number that is representable by type @code{long double},
 520 and is guaranteed to be no more than @code{1E-37}.
 521 @end deftypevr
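
These minimum values make it possible to detect results that have
fallen below the normalized range and so may have lost precision.  The
function @code{below_normal_range} is a hypothetical helper written for
this illustration.

```c
#include <float.h>

/* Return nonzero if x is nonzero but smaller in magnitude than the
   smallest normalized double.  */
int
below_normal_range (double x)
{
  double m = x < 0 ? -x : x;

  return m != 0.0 && m < DBL_MIN;
}
```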
 522
 523
 524 @comment float.h
 525 @comment ANSI
 526 @deftypevr Macro float FLT_EPSILON
 527 This is the difference between 1 and the smallest value of type
 528 @code{float} that is greater than 1.  It's guaranteed to
 529 be no greater than @code{1E-5}.
 530 @end deftypevr
 531
 532 @comment float.h
 533 @comment ANSI
 534 @deftypevr Macro double DBL_EPSILON
 535 This is similar to @code{FLT_EPSILON}, but is for the @code{double}
 536 type.  The maximum value is @code{1E-9}.
 537 @end deftypevr
 538
 539 @comment float.h
 540 @comment ANSI
 541 @deftypevr Macro {long double} LDBL_EPSILON
 542 This is similar to @code{FLT_EPSILON}, but is for the @code{long double}
 543 type.  The maximum value is @code{1E-9}.
 544 @end deftypevr
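
The epsilon values are commonly used to compare floating-point numbers
with a tolerance that scales with their magnitude.  The following is
one possible sketch, not the only reasonable definition; the function
name @code{nearly_equal} and its @code{tolerance} parameter, a multiple
of @code{DBL_EPSILON}, are hypothetical.

```c
#include <float.h>

/* Return nonzero if a and b differ by no more than tolerance times
   DBL_EPSILON, relative to the larger of their magnitudes.  */
int
nearly_equal (double a, double b, double tolerance)
{
  double diff = a > b ? a - b : b - a;
  double ma = a < 0 ? -a : a;
  double mb = b < 0 ? -b : b;
  double scale = ma > mb ? ma : mb;

  return diff <= tolerance * DBL_EPSILON * scale;
}
```

A purely relative test like this one treats two different values close
to zero as unequal, so programs that compare against zero usually add
an absolute tolerance as well.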
 545
 546
 547 @node IEEE Floating Point,  , Floating-Point Parameters, Floating-Point Limits
 548 @subsection IEEE Floating Point
 549 @cindex IEEE floating-point representation
 550 @cindex floating-point, IEEE
 551 @cindex IEEE Std 754
 552
 553
 554 Here is an example showing how these parameters work for a common
 555 floating point representation, specified by the @cite{IEEE Standard for
 556 Binary Floating-Point Arithmetic (ANSI/IEEE Std 754-1985)}.  Nearly
 557 all computers today use this format.
 558
 559 The IEEE single-precision float representation uses a base of 2.  There
 560 is a sign bit, a mantissa with 23 bits plus one hidden bit (so the total
 561 precision is 24 base-2 digits), and an 8-bit exponent that can represent
 562 values in the range -125 to 128, inclusive.
 563
 564 So, for an implementation that uses this representation for the
 565 @code{float} data type, appropriate values for the corresponding
 566 parameters are:
 567
 568 @example
 569 FLT_RADIX                             2
 570 FLT_MANT_DIG                         24
 571 FLT_DIG                               6
 572 FLT_MIN_EXP                        -125
 573 FLT_MIN_10_EXP                      -37
 574 FLT_MAX_EXP                         128
 575 FLT_MAX_10_EXP                       38
 576 FLT_MIN                 1.17549435E-38F
 577 FLT_MAX                 3.40282347E+38F
 578 FLT_EPSILON             1.19209290E-07F
 579 @end example
 580
 581 Here are the values for the @code{double} data type:
 582
 583 @example
 584 DBL_MANT_DIG                         53
 585 DBL_DIG                              15
 586 DBL_MIN_EXP                       -1021
 587 DBL_MIN_10_EXP                     -307
 588 DBL_MAX_EXP                        1024
 589 DBL_MAX_10_EXP                      308
 590 DBL_MAX         1.7976931348623157E+308
 591 DBL_MIN         2.2250738585072014E-308
 592 DBL_EPSILON      2.2204460492503131E-16
 593 @end example
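
The epsilon entries in both tables follow from the precision: each is
the base raised to the power of one minus the number of mantissa
digits, which gives 2 to the power -23 for @code{float} and 2 to the
power -52 for @code{double}.  The sketch below recomputes them;
@code{epsilon_for_precision} is a hypothetical helper.

```c
#include <float.h>

/* Return FLT_RADIX raised to the power 1 - digits, which is the gap
   between 1 and the next larger number in a representation with the
   given number of mantissa digits.  */
double
epsilon_for_precision (int digits)
{
  double e = 1.0;
  int i;

  for (i = 1; i < digits; i++)
    e /= FLT_RADIX;
  return e;
}
```

Here @code{epsilon_for_precision (FLT_MANT_DIG)} equals
@code{FLT_EPSILON} and @code{epsilon_for_precision (DBL_MANT_DIG)}
equals @code{DBL_EPSILON}.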