@node Arithmetic, Date and Time, Mathematics, Top
@c %MENU% Low level arithmetic functions
@chapter Arithmetic Functions

This chapter contains information about functions for doing basic
arithmetic operations, such as splitting a float into its integer and
fractional parts or retrieving the imaginary part of a complex value.
These functions are declared in the header files @file{math.h} and
@file{complex.h}.

@menu
* Integers::                    Basic integer types and concepts
* Integer Division::            Integer division with guaranteed rounding.
* Floating Point Numbers::      Basic concepts. IEEE 754.
* Floating Point Classes::      The five kinds of floating-point number.
* Floating Point Errors::       When something goes wrong in a calculation.
* Rounding::                    Controlling how results are rounded.
* Control Functions::           Saving and restoring the FPU's state.
* Arithmetic Functions::        Fundamental operations provided by the library.
* Complex Numbers::             The types. Writing complex constants.
* Operations on Complex::       Projection, conjugation, decomposition.
* Parsing of Numbers::          Converting strings to numbers.
* Printing of Floats::          Converting floating-point numbers to strings.
* System V Number Conversion::  An archaic way to convert numbers to strings.
@end menu

@node Integers
@section Integers
@cindex integer

The C language defines several integer data types: integer, short integer,
long integer, and character, all in both signed and unsigned varieties.
The GNU C compiler extends the language to contain long long integers
as well.
@cindex signedness

The C integer types were intended to allow code to be portable among
machines with different inherent data sizes (word sizes), so each type
may have different ranges on different machines. The problem with
this is that a program often needs to be written for a particular range
of integers, and sometimes must be written for a particular size of
storage, regardless of what machine the program runs on.

To address this problem, @theglibc{} contains C type definitions
you can use to declare integers that meet your exact needs. Because the
@glibcadj{} header files are customized to a specific machine, your
program source code doesn't have to be.

These @code{typedef}s are in @file{stdint.h}.
@pindex stdint.h

If you require that an integer be represented in exactly N bits, use one
of the following types, with the obvious mapping to bit size and signedness:

@itemize @bullet
@item int8_t
@item int16_t
@item int32_t
@item int64_t
@item uint8_t
@item uint16_t
@item uint32_t
@item uint64_t
@end itemize

If your C compiler and target machine do not allow integers of a certain
size, the corresponding type above does not exist.

If you don't need a specific storage size, but want the smallest data
structure with @emph{at least} N bits, use one of these:

@itemize @bullet
@item int_least8_t
@item int_least16_t
@item int_least32_t
@item int_least64_t
@item uint_least8_t
@item uint_least16_t
@item uint_least32_t
@item uint_least64_t
@end itemize

If you don't need a specific storage size, but want the data structure
that allows the fastest access while having at least N bits (and
among data structures with the same access speed, the smallest one), use
one of these:

@itemize @bullet
@item int_fast8_t
@item int_fast16_t
@item int_fast32_t
@item int_fast64_t
@item uint_fast8_t
@item uint_fast16_t
@item uint_fast32_t
@item uint_fast64_t
@end itemize

If you want an integer with the widest range possible on the platform on
which it is being used, use one of the following. If you use these,
you should write code that takes into account the variable size and range
of the integer.

@itemize @bullet
@item intmax_t
@item uintmax_t
@end itemize

@Theglibc{} also provides macros that tell you the maximum and
minimum possible values for each integer data type. The macro names
follow these examples: @code{INT32_MAX}, @code{UINT8_MAX},
@code{INT_FAST32_MIN}, @code{INT_LEAST64_MIN}, @code{UINTMAX_MAX},
@code{INTMAX_MAX}, @code{INTMAX_MIN}. Note that there are no macros for
unsigned integer minima. These are always zero. Similarly, there
are macros such as @code{INTMAX_WIDTH} for the width of these types.
Those macros for integer type widths come from TS 18661-1:2014.
@cindex maximum possible integer
@cindex minimum possible integer

There are similar macros for use with C's built-in integer types which
should come with your C compiler. These are described in @ref{Data Type
Measurements}.

Don't forget you can use the C @code{sizeof} operator with any of these
data types to get the number of bytes of storage each uses.
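
As a minimal sketch of how the exact-width types and their limit macros
fit together, the following adds two @code{int32_t} values and clamps
the result to @code{INT32_MIN} and @code{INT32_MAX} rather than letting
it overflow. The helper names @code{saturating_add_i32} and
@code{int32_storage_bytes} are hypothetical and are not part of
@theglibc{}.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical helper: add two 32-bit values, clamping at the
   limits reported by the stdint.h macros instead of overflowing.  */
int32_t
saturating_add_i32 (int32_t a, int32_t b)
{
  /* Do the arithmetic in a wider exact-width type so the
     intermediate sum cannot overflow.  */
  int64_t sum = (int64_t) a + (int64_t) b;

  if (sum > INT32_MAX)
    return INT32_MAX;
  if (sum < INT32_MIN)
    return INT32_MIN;
  return (int32_t) sum;
}

/* Hypothetical helper: storage size of int32_t in bytes.  */
size_t
int32_storage_bytes (void)
{
  return sizeof (int32_t);
}
```

On a machine with 8-bit bytes, @code{int32_storage_bytes ()} is
expected to return 4, because @code{int32_t} has exactly 32 bits.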


@node Integer Division
@section Integer Division
@cindex integer division functions

This section describes functions for performing integer division. These
functions are redundant when GNU CC is used, because in GNU C the
@samp{/} operator always rounds towards zero. But in other C
implementations, @samp{/} may round differently with negative arguments.
@code{div} and @code{ldiv} are useful because they specify how to round
the quotient: towards zero. The remainder has the same sign as the
numerator.

These functions are specified to return a result @var{r} such that the value
@code{@var{r}.quot*@var{denominator} + @var{r}.rem} equals
@var{numerator}.

@pindex stdlib.h
To use these facilities, you should include the header file
@file{stdlib.h} in your program.

@comment stdlib.h
@comment ISO
@deftp {Data Type} div_t
This is a structure type used to hold the result returned by the @code{div}
function. It has the following members:

@table @code
@item int quot
The quotient from the division.

@item int rem
The remainder from the division.
@end table
@end deftp

@comment stdlib.h
@comment ISO
@deftypefun div_t div (int @var{numerator}, int @var{denominator})
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
@c Functions in this section are pure, and thus safe.
The function @code{div} computes the quotient and remainder from
the division of @var{numerator} by @var{denominator}, returning the
result in a structure of type @code{div_t}.

If the result cannot be represented (as in a division by zero), the
behavior is undefined.

Here is an example, albeit not a very useful one.

@smallexample
div_t result;
result = div (20, -6);
@end smallexample

@noindent
Now @code{result.quot} is @code{-3} and @code{result.rem} is @code{2}.
@end deftypefun
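
The relation between quotient, remainder and operands can be checked
directly. The following is only a sketch: @code{div_identity_holds} is
a hypothetical helper, and the caller is assumed to avoid a zero
denominator and the unrepresentable case @code{INT_MIN / -1}.

```c
#include <stdlib.h>

/* Hypothetical helper: return 1 if the result of div satisfies
   quot * denominator + rem == numerator, and 0 otherwise.
   The caller must not pass a zero denominator.  */
int
div_identity_holds (int numerator, int denominator)
{
  div_t r = div (numerator, denominator);

  return r.quot * denominator + r.rem == numerator;
}
```

For example, @code{div (-20, 6)} rounds the quotient towards zero and
gives the remainder the sign of the numerator, so the quotient is
@code{-3} and the remainder is @code{-2}.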

@comment stdlib.h
@comment ISO
@deftp {Data Type} ldiv_t
This is a structure type used to hold the result returned by the @code{ldiv}
function. It has the following members:

@table @code
@item long int quot
The quotient from the division.

@item long int rem
The remainder from the division.
@end table

(This is identical to @code{div_t} except that the components are of
type @code{long int} rather than @code{int}.)
@end deftp

@comment stdlib.h
@comment ISO
@deftypefun ldiv_t ldiv (long int @var{numerator}, long int @var{denominator})
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
The @code{ldiv} function is similar to @code{div}, except that the
arguments are of type @code{long int} and the result is returned as a
structure of type @code{ldiv_t}.
@end deftypefun

@comment stdlib.h
@comment ISO
@deftp {Data Type} lldiv_t
This is a structure type used to hold the result returned by the @code{lldiv}
function. It has the following members:

@table @code
@item long long int quot
The quotient from the division.

@item long long int rem
The remainder from the division.
@end table

(This is identical to @code{div_t} except that the components are of
type @code{long long int} rather than @code{int}.)
@end deftp

@comment stdlib.h
@comment ISO
@deftypefun lldiv_t lldiv (long long int @var{numerator}, long long int @var{denominator})
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
The @code{lldiv} function is like the @code{div} function, but the
arguments are of type @code{long long int} and the result is returned as
a structure of type @code{lldiv_t}.

The @code{lldiv} function was added in @w{ISO C99}.
@end deftypefun

@comment inttypes.h
@comment ISO
@deftp {Data Type} imaxdiv_t
This is a structure type used to hold the result returned by the @code{imaxdiv}
function. It has the following members:

@table @code
@item intmax_t quot
The quotient from the division.

@item intmax_t rem
The remainder from the division.
@end table

(This is identical to @code{div_t} except that the components are of
type @code{intmax_t} rather than @code{int}.)

See @ref{Integers} for a description of the @code{intmax_t} type.

@end deftp

@comment inttypes.h
@comment ISO
@deftypefun imaxdiv_t imaxdiv (intmax_t @var{numerator}, intmax_t @var{denominator})
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
The @code{imaxdiv} function is like the @code{div} function, but the
arguments are of type @code{intmax_t} and the result is returned as
a structure of type @code{imaxdiv_t}.

See @ref{Integers} for a description of the @code{intmax_t} type.

The @code{imaxdiv} function was added in @w{ISO C99}.
@end deftypefun


@node Floating Point Numbers
@section Floating Point Numbers
@cindex floating point
@cindex IEEE 754
@cindex IEEE floating point

Most computer hardware has support for two different kinds of numbers:
integers (@math{@dots{}-3, -2, -1, 0, 1, 2, 3@dots{}}) and
floating-point numbers. Floating-point numbers have three parts: the
@dfn{mantissa}, the @dfn{exponent}, and the @dfn{sign bit}. The real
number represented by a floating-point value is given by
@tex
$(s \mathrel? -1 \mathrel: 1) \cdot 2^e \cdot M$
@end tex
@ifnottex
@math{(s ? -1 : 1) @mul{} 2^e @mul{} M}
@end ifnottex
where @math{s} is the sign bit, @math{e} the exponent, and @math{M}
the mantissa. @xref{Floating Point Concepts}, for details. (It is
possible to have a different @dfn{base} for the exponent, but all modern
hardware uses @math{2}.)

Floating-point numbers can represent a finite subset of the real
numbers. While this subset is large enough for most purposes, it is
important to remember that the only reals that can be represented
exactly are rational numbers that have a terminating binary expansion
shorter than the width of the mantissa. Even simple fractions such as
@math{1/5} can only be approximated by floating point.
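
The effect of this approximation is easy to observe. The sketch below
assumes that @code{double} is the IEEE 754 binary64 format and that the
default round-to-nearest mode is in effect; @code{sums_exactly} is a
hypothetical helper, not a library function.

```c
/* Hypothetical helper: return 1 if a + b, rounded to double,
   compares equal to expected, and 0 otherwise.  The volatile
   store forces the sum to be rounded to double precision.  */
int
sums_exactly (double a, double b, double expected)
{
  volatile double sum = a + b;

  return sum == expected;
}
```

With these assumptions, @code{sums_exactly (0.5, 0.25, 0.75)} is
nonzero because all three values have short binary expansions, while
@code{sums_exactly (0.1, 0.2, 0.3)} is zero because none of 1/10, 1/5
and 3/10 can be represented exactly.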

Mathematical operations and functions frequently need to produce values
that are not representable. Often these values can be approximated
closely enough for practical purposes, but sometimes they can't.
Historically there was no way to tell when the results of a calculation
were inaccurate. Modern computers implement the @w{IEEE 754} standard
for numerical computations, which defines a framework for indicating to
the program when the results of calculation are not trustworthy. This
framework consists of a set of @dfn{exceptions} that indicate why a
result could not be represented, and the special values @dfn{infinity}
and @dfn{not a number} (NaN).

@node Floating Point Classes
@section Floating-Point Number Classification Functions
@cindex floating-point classes
@cindex classes, floating-point
@pindex math.h

@w{ISO C99} defines macros that let you determine what sort of
floating-point number a variable holds.

@comment math.h
@comment ISO
@deftypefn {Macro} int fpclassify (@emph{float-type} @var{x})
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
This is a generic macro which works on all floating-point types and
which returns a value of type @code{int}. The possible values are:

@vtable @code
@item FP_NAN
The floating-point number @var{x} is ``Not a Number'' (@pxref{Infinity
and NaN})
@item FP_INFINITE
The value of @var{x} is either plus or minus infinity (@pxref{Infinity
and NaN})
@item FP_ZERO
The value of @var{x} is zero. In floating-point formats like @w{IEEE
754}, where zero can be signed, this value is also returned if
@var{x} is negative zero.
@item FP_SUBNORMAL
Numbers whose absolute value is too small to be represented in the
normal format are represented in an alternate, @dfn{denormalized} format
(@pxref{Floating Point Concepts}). This format is less precise but can
represent values closer to zero. @code{fpclassify} returns this value
for values of @var{x} in this alternate format.
@item FP_NORMAL
This value is returned for all other values of @var{x}. It indicates
that there is nothing special about the number.
@end vtable

@end deftypefn

@code{fpclassify} is most useful if more than one property of a number
must be tested. There are more specific macros which only test one
property at a time. Generally these macros execute faster than
@code{fpclassify}, since there is special hardware support for them.
You should therefore use the specific macros whenever possible.
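
A typical use of @code{fpclassify} is a @code{switch} over all five
classes. The following sketch maps each class to a short name;
@code{fp_class_name} and the strings it returns are hypothetical and
chosen only for illustration.

```c
#include <math.h>
#include <float.h>

/* Hypothetical helper: return a short name for the class of x.  */
const char *
fp_class_name (double x)
{
  switch (fpclassify (x))
    {
    case FP_NAN:
      return "nan";
    case FP_INFINITE:
      return "infinite";
    case FP_ZERO:
      return "zero";
    case FP_SUBNORMAL:
      return "subnormal";
    case FP_NORMAL:
      return "normal";
    default:
      /* Reached only if the implementation defines extra classes.  */
      return "unknown";
    }
}
```

Assuming IEEE 754 arithmetic with subnormal numbers enabled,
@code{fp_class_name (DBL_MIN / 2)} yields @code{"subnormal"}, and both
@code{0.0} and @code{-0.0} yield @code{"zero"}.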

@comment math.h
@comment ISO
@deftypefn {Macro} int iscanonical (@emph{float-type} @var{x})
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
In some floating-point formats, some values have canonical (preferred)
and noncanonical encodings (for IEEE interchange binary formats, all
encodings are canonical). This macro returns a nonzero value if
@var{x} has a canonical encoding. It is from TS 18661-1:2014.

Note that some formats have multiple encodings of a value which are
all equally canonical; @code{iscanonical} returns a nonzero value for
all such encodings. Also, formats may have encodings that do not
correspond to any valid value of the type. In ISO C terms these are
@dfn{trap representations}; in @theglibc{}, @code{iscanonical} returns
zero for such encodings.
@end deftypefn

@comment math.h
@comment ISO
@deftypefn {Macro} int isfinite (@emph{float-type} @var{x})
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
This macro returns a nonzero value if @var{x} is finite: not plus or
minus infinity, and not NaN. It is equivalent to

@smallexample
(fpclassify (x) != FP_NAN && fpclassify (x) != FP_INFINITE)
@end smallexample

@code{isfinite} is implemented as a macro which accepts any
floating-point type.
@end deftypefn

@comment math.h
@comment ISO
@deftypefn {Macro} int isnormal (@emph{float-type} @var{x})
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
This macro returns a nonzero value if @var{x} is finite and normalized.
It is equivalent to

@smallexample
(fpclassify (x) == FP_NORMAL)
@end smallexample
@end deftypefn

@comment math.h
@comment ISO
@deftypefn {Macro} int isnan (@emph{float-type} @var{x})
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
This macro returns a nonzero value if @var{x} is NaN. It is equivalent
to

@smallexample
(fpclassify (x) == FP_NAN)
@end smallexample
@end deftypefn

@comment math.h
@comment ISO
@deftypefn {Macro} int issignaling (@emph{float-type} @var{x})
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
This macro returns a nonzero value if @var{x} is a signaling NaN
(sNaN). It is from TS 18661-1:2014.
@end deftypefn

@comment math.h
@comment ISO
@deftypefn {Macro} int issubnormal (@emph{float-type} @var{x})
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
This macro returns a nonzero value if @var{x} is subnormal. It is
from TS 18661-1:2014.
@end deftypefn

@comment math.h
@comment ISO
@deftypefn {Macro} int iszero (@emph{float-type} @var{x})
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
This macro returns a nonzero value if @var{x} is zero. It is from TS
18661-1:2014.
@end deftypefn

Another set of floating-point classification functions was provided by
BSD. @Theglibc{} also supports these functions; however, we
recommend that you use the ISO C99 macros in new code. Those are standard
and will be available more widely. Also, since they are macros, you do
not have to worry about the type of their argument.

@comment math.h
@comment BSD
@deftypefun int isinf (double @var{x})
@comment math.h
@comment BSD
@deftypefunx int isinff (float @var{x})
@comment math.h
@comment BSD
@deftypefunx int isinfl (long double @var{x})
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
This function returns @code{-1} if @var{x} represents negative infinity,
@code{1} if @var{x} represents positive infinity, and @code{0} otherwise.
@end deftypefun

@comment math.h
@comment BSD
@deftypefun int isnan (double @var{x})
@comment math.h
@comment BSD
@deftypefunx int isnanf (float @var{x})
@comment math.h
@comment BSD
@deftypefunx int isnanl (long double @var{x})
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
This function returns a nonzero value if @var{x} is a ``not a number''
value, and zero otherwise.

@strong{NB:} The @code{isnan} macro defined by @w{ISO C99} overrides
the BSD function. This is normally not a problem, because the two
routines behave identically. However, if you really need to get the BSD
function for some reason, you can write

@smallexample
(isnan) (x)
@end smallexample
@end deftypefun

@comment math.h
@comment BSD
@deftypefun int finite (double @var{x})
@comment math.h
@comment BSD
@deftypefunx int finitef (float @var{x})
@comment math.h
@comment BSD
@deftypefunx int finitel (long double @var{x})
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
This function returns a nonzero value if @var{x} is finite, that is,
neither infinite nor a ``not a number'' value, and zero otherwise.
@end deftypefun

@strong{Portability Note:} The functions listed in this section are BSD
extensions.


@node Floating Point Errors
@section Errors in Floating-Point Calculations

@menu
* FP Exceptions::          IEEE 754 math exceptions and how to detect them.
* Infinity and NaN::       Special values returned by calculations.
* Status bit operations::  Checking for exceptions after the fact.
* Math Error Reporting::   How the math functions report errors.
@end menu

@node FP Exceptions
@subsection FP Exceptions
@cindex exception
@cindex signal
@cindex zero divide
@cindex division by zero
@cindex inexact exception
@cindex invalid exception
@cindex overflow exception
@cindex underflow exception

The @w{IEEE 754} standard defines five @dfn{exceptions} that can occur
during a calculation. Each corresponds to a particular sort of error,
such as overflow.

When exceptions occur (when exceptions are @dfn{raised}, in the language
of the standard), one of two things can happen. By default the
exception is simply noted in the floating-point @dfn{status word}, and
the program continues as if nothing had happened. The operation
produces a default value, which depends on the exception (see the table
below). Your program can check the status word to find out which
exceptions happened.

Alternatively, you can enable @dfn{traps} for exceptions. In that case,
when an exception is raised, your program will receive the @code{SIGFPE}
signal. The default action for this signal is to terminate the
program. @xref{Signal Handling}, for how you can change the effect of
the signal.

@findex matherr
In the System V math library, the user-defined function @code{matherr}
is called when certain exceptions occur inside math library functions.
However, the Unix98 standard deprecates this interface. We support it
for historical compatibility, but recommend that you do not use it in
new programs. When this interface is used, exceptions may not be
raised.

@noindent
The exceptions defined in @w{IEEE 754} are:

@table @samp
@item Invalid Operation
This exception is raised if the given operands are invalid for the
operation to be performed. Examples are
(see @w{IEEE 754}, @w{section 7}):
@enumerate
@item
Addition or subtraction: @math{@infinity{} - @infinity{}}. (But
@math{@infinity{} + @infinity{} = @infinity{}}).
@item
Multiplication: @math{0 @mul{} @infinity{}}.
@item
Division: @math{0/0} or @math{@infinity{}/@infinity{}}.
@item
Remainder: @math{x} REM @math{y}, where @math{y} is zero or @math{x} is
infinite.
@item
Square root if the operand is less than zero. More generally, any
mathematical function evaluated outside its domain produces this
exception.
@item
Conversion of a floating-point number to an integer or decimal
string, when the number cannot be represented in the target format (due
to overflow, infinity, or NaN).
@item
Conversion of an unrecognizable input string.
@item
Comparison via predicates involving @math{<} or @math{>}, when one or
other of the operands is NaN. You can prevent this exception by using
the unordered comparison functions instead; see @ref{FP Comparison Functions}.
@end enumerate

If the exception does not trap, the result of the operation is NaN.

@item Division by Zero
This exception is raised when a finite nonzero number is divided
by zero. If no trap occurs the result is either @math{+@infinity{}} or
@math{-@infinity{}}, depending on the signs of the operands.

@item Overflow
This exception is raised whenever the result cannot be represented
as a finite value in the precision format of the destination. If no trap
occurs the result depends on the sign of the intermediate result and the
current rounding mode (@w{IEEE 754}, @w{section 7.3}):
@enumerate
@item
Round to nearest carries all overflows to @math{@infinity{}}
with the sign of the intermediate result.
@item
Round toward @math{0} carries all overflows to the largest representable
finite number with the sign of the intermediate result.
@item
Round toward @math{-@infinity{}} carries positive overflows to the
largest representable finite number and negative overflows to
@math{-@infinity{}}.

@item
Round toward @math{@infinity{}} carries negative overflows to the
most negative representable finite number and positive overflows
to @math{@infinity{}}.
@end enumerate

Whenever the overflow exception is raised, the inexact exception is also
raised.

@item Underflow
The underflow exception is raised when an intermediate result is too
small to be calculated accurately, or if the operation's result rounded
to the destination precision is too small to be normalized.

When no trap is installed for the underflow exception, underflow is
signaled (via the underflow flag) only when both tininess and loss of
accuracy have been detected. If no trap handler is installed the
operation continues with an imprecise small value, or zero if the
destination precision cannot hold the small exact result.

@item Inexact
This exception is signaled if a rounded result is not exact (such as
when calculating the square root of two) or a result overflows without
an overflow trap.
@end table
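
The default results listed in the table above can be observed without
installing any trap handler. The sketch below assumes IEEE 754
arithmetic, disabled traps and the default round-to-nearest mode; the
helpers @code{divide} and @code{times} are hypothetical and use
@code{volatile} only to keep the compiler from evaluating the
operations at compile time.

```c
#include <math.h>
#include <float.h>

/* Hypothetical helper: divide x by y at run time.  */
double
divide (double x, double y)
{
  volatile double vx = x, vy = y;
  volatile double result = vx / vy;

  return result;
}

/* Hypothetical helper: multiply x by y at run time.  */
double
times (double x, double y)
{
  volatile double vx = x, vy = y;
  volatile double result = vx * vy;

  return result;
}
```

Under these assumptions, @code{divide (1.0, 0.0)} is positive infinity
and @code{divide (-1.0, 0.0)} is negative infinity (division by zero),
@code{divide (0.0, 0.0)} is NaN (invalid operation), and
@code{times (DBL_MAX, 2.0)} is positive infinity (overflow).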

@node Infinity and NaN
@subsection Infinity and NaN
@cindex infinity
@cindex not a number
@cindex NaN

@w{IEEE 754} floating point numbers can represent positive or negative
infinity, and @dfn{NaN} (not a number). These three values arise from
calculations whose result is undefined or cannot be represented
accurately. You can also deliberately set a floating-point variable to
any of them, which is sometimes useful. Some examples of calculations
that produce infinity or NaN:

@ifnottex
@smallexample
@math{1/0 = @infinity{}}
@math{log (0) = -@infinity{}}
@math{sqrt (-1) = NaN}
@end smallexample
@end ifnottex
@tex
$${1\over0} = \infty$$
$$\log 0 = -\infty$$
$$\sqrt{-1} = \hbox{NaN}$$
@end tex

When a calculation produces any of these values, an exception also
occurs; see @ref{FP Exceptions}.

The basic operations and math functions all accept infinity and NaN and
produce sensible output. Infinities propagate through calculations as
one would expect: for example, @math{2 + @infinity{} = @infinity{}},
@math{4/@infinity{} = 0}, atan @math{(@infinity{}) = @pi{}/2}. NaN, on
the other hand, infects any calculation that involves it. Unless the
calculation would produce the same result no matter what real value
replaced NaN, the result is NaN.

In comparison operations, positive infinity is larger than all values
except itself and NaN, and negative infinity is smaller than all values
except itself and NaN. NaN is @dfn{unordered}: it is not equal to,
greater than, or less than anything, @emph{including itself}. @code{x ==
x} is false if the value of @code{x} is NaN. You can use this to test
whether a value is NaN or not, but the recommended way to test for NaN
is with the @code{isnan} function (@pxref{Floating Point Classes}). In
addition, @code{<}, @code{>}, @code{<=}, and @code{>=} will raise an
exception when applied to NaNs.
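
These comparison rules can be sketched as follows. Both helpers are
hypothetical, and the example assumes the compiler is not run with
options such as @option{-ffast-math} that let it ignore NaN.

```c
#include <math.h>

/* Hypothetical helper: detect NaN by comparing a value with itself.
   Only NaN compares unequal to itself.  */
int
is_nan_by_self_compare (double x)
{
  volatile double v = x;

  return v != v;
}

/* Hypothetical helper: return 1 if positive infinity is greater
   than x.  This is 0 when x is positive infinity or NaN.  */
int
infinity_exceeds (double x)
{
  volatile double v = x;

  return INFINITY > v;
}
```

@code{is_nan_by_self_compare} agrees with @code{isnan} for every
argument, but @code{isnan} states the intent more clearly.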

@file{math.h} defines macros that allow you to explicitly set a variable
to infinity or NaN.

@comment math.h
@comment ISO
@deftypevr Macro float INFINITY
An expression representing positive infinity. It is equal to the value
produced by mathematical operations like @code{1.0 / 0.0}.
@code{-INFINITY} represents negative infinity.

You can test whether a floating-point value is infinite by comparing it
to this macro. However, this is not recommended; you should use the
@code{isfinite} macro instead. @xref{Floating Point Classes}.

This macro was introduced in the @w{ISO C99} standard.
@end deftypevr

@comment math.h
@comment GNU
@deftypevr Macro float NAN
An expression representing a value which is ``not a number''. This
macro is a GNU extension, available only on machines that support the
``not a number'' value---that is to say, on all machines that support
IEEE floating point.

You can use @samp{#ifdef NAN} to test whether the machine supports
NaN. (Of course, you must arrange for GNU extensions to be visible,
such as by defining @code{_GNU_SOURCE}, and then you must include
@file{math.h}.)
@end deftypevr

@comment math.h
@comment ISO
@deftypevr Macro float SNANF
@deftypevrx Macro double SNAN
@deftypevrx Macro {long double} SNANL
These macros, defined by TS 18661-1:2014, are constant expressions for
signaling NaNs.
@end deftypevr

@comment fenv.h
@comment ISO
@deftypevr Macro int FE_SNANS_ALWAYS_SIGNAL
This macro, defined by TS 18661-1:2014, is defined to @code{1} in
@file{fenv.h} to indicate that functions and operations with signaling
NaN inputs and floating-point results always raise the invalid
exception and return a quiet NaN, even in cases (such as @code{fmax},
@code{hypot} and @code{pow}) where a quiet NaN input can produce a
non-NaN result. Because some compiler optimizations may not handle
signaling NaNs correctly, this macro is only defined if compiler
support for signaling NaNs is enabled. That support can be enabled
with the GCC option @option{-fsignaling-nans}.
@end deftypevr

@w{IEEE 754} also allows for another unusual value: negative zero. This
value is produced when you divide a positive number by negative
infinity, or when a negative result is smaller than the limits of
representation.
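
Negative zero compares equal to positive zero, so an equality test
cannot tell them apart; the ISO C99 @code{signbit} macro can. The
following sketch assumes IEEE 754 arithmetic, and
@code{negative_zero} is a hypothetical helper.

```c
#include <math.h>

/* Hypothetical helper: produce negative zero by dividing a positive
   number by negative infinity.  The volatile operands keep the
   division from being evaluated at compile time.  */
double
negative_zero (void)
{
  volatile double one = 1.0;
  volatile double neg_inf = -INFINITY;

  return one / neg_inf;
}
```

With this helper, @code{negative_zero () == 0.0} is true, while
@code{signbit (negative_zero ())} is nonzero and @code{signbit (0.0)}
is zero.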
7a68c94a
UD
741
742@node Status bit operations
743@subsection Examining the FPU status word
744
ec751a23 745@w{ISO C99} defines functions to query and manipulate the
7a68c94a
UD
746floating-point status word. You can use these functions to check for
747untrapped exceptions when it's convenient, rather than worrying about
748them in the middle of a calculation.
749
750These constants represent the various @w{IEEE 754} exceptions. Not all
751FPUs report all the different exceptions. Each constant is defined if
752and only if the FPU you are compiling for supports that exception, so
753you can test for FPU support with @samp{#ifdef}. They are defined in
754@file{fenv.h}.
b4012b75
UD
755
756@vtable @code
7a68c94a
UD
757@comment fenv.h
758@comment ISO
759@item FE_INEXACT
760 The inexact exception.
761@comment fenv.h
762@comment ISO
763@item FE_DIVBYZERO
764 The divide by zero exception.
765@comment fenv.h
766@comment ISO
767@item FE_UNDERFLOW
768 The underflow exception.
769@comment fenv.h
770@comment ISO
771@item FE_OVERFLOW
772 The overflow exception.
773@comment fenv.h
774@comment ISO
775@item FE_INVALID
776 The invalid exception.
b4012b75
UD
777@end vtable
778
7a68c94a
UD
779The macro @code{FE_ALL_EXCEPT} is the bitwise OR of all exception macros
780which are supported by the FP implementation.
b4012b75 781
7a68c94a
UD
782These functions allow you to clear exception flags, test for exceptions,
783and save and restore the set of exceptions flagged.
b4012b75 784
7a68c94a 785@comment fenv.h
b4012b75 786@comment ISO
63ae7b63 787@deftypefun int feclearexcept (int @var{excepts})
b719dafd
AO
788@safety{@prelim{}@mtsafe{}@assafe{@assposix{}}@acsafe{@acsposix{}}}
789@c The other functions in this section that modify FP status register
790@c mostly do so with non-atomic load-modify-store sequences, but since
791@c the register is thread-specific, this should be fine, and safe for
792@c cancellation. As long as the FP environment is restored before the
793@c signal handler returns control to the interrupted thread (like any
794@c kernel should do), the functions are also safe for use in signal
795@c handlers.
7a68c94a
UD
796This function clears all of the supported exception flags indicated by
797@var{excepts}.
63ae7b63
UD
798
799The function returns zero in case the operation was successful, a
800non-zero value otherwise.
801@end deftypefun
802
803@comment fenv.h
804@comment ISO
805@deftypefun int feraiseexcept (int @var{excepts})
b719dafd 806@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
63ae7b63
UD
This function raises the supported exceptions indicated by
@var{excepts}. If more than one exception bit in @var{excepts} is set,
the order in which the exceptions are raised is undefined, except that
overflow (@code{FE_OVERFLOW}) or underflow (@code{FE_UNDERFLOW}) is
raised before inexact (@code{FE_INEXACT}). Whether raising overflow or
underflow also raises the inexact exception is implementation
dependent.
814
815The function returns zero in case the operation was successful, a
816non-zero value otherwise.
7a68c94a
UD
817@end deftypefun
818
5146356f
JM
819@comment fenv.h
820@comment ISO
821@deftypefun int fesetexcept (int @var{excepts})
822@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
823This function sets the supported exception flags indicated by
824@var{excepts}, like @code{feraiseexcept}, but without causing enabled
825traps to be taken. @code{fesetexcept} is from TS 18661-1:2014.
826
827The function returns zero in case the operation was successful, a
828non-zero value otherwise.
829@end deftypefun
830
7a68c94a
UD
831@comment fenv.h
832@comment ISO
833@deftypefun int fetestexcept (int @var{excepts})
b719dafd 834@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
7a68c94a
UD
Test whether the exception flags indicated by the parameter @var{excepts}
are currently set. If any of them are, a nonzero value is returned
which specifies which exceptions are set. Otherwise the result is zero.
838@end deftypefun
839
840To understand these functions, imagine that the status word is an
841integer variable named @var{status}. @code{feclearexcept} is then
842equivalent to @samp{status &= ~excepts} and @code{fetestexcept} is
843equivalent to @samp{(status & excepts)}. The actual implementation may
844be very different, of course.
845
846Exception flags are only cleared when the program explicitly requests it,
847by calling @code{feclearexcept}. If you want to check for exceptions
848from a set of calculations, you should clear all the flags first. Here
849is a simple example of the way to use @code{fetestexcept}:
b4012b75
UD
850
851@smallexample
7a68c94a
UD
852@{
853 double f;
854 int raised;
855 feclearexcept (FE_ALL_EXCEPT);
856 f = compute ();
857 raised = fetestexcept (FE_OVERFLOW | FE_INVALID);
95fdc6a0
UD
858 if (raised & FE_OVERFLOW) @{ /* @dots{} */ @}
859 if (raised & FE_INVALID) @{ /* @dots{} */ @}
860 /* @dots{} */
7a68c94a 861@}
b4012b75
UD
862@end smallexample
863
7a68c94a
UD
Apart from raising or setting individual exceptions with
@code{feraiseexcept} and @code{fesetexcept}, you cannot write arbitrary
bits into the status word. You can, however,
save the entire status word and restore it later. This is done with the
following functions:
b4012b75 867
7a68c94a 868@comment fenv.h
b4012b75 869@comment ISO
63ae7b63 870@deftypefun int fegetexceptflag (fexcept_t *@var{flagp}, int @var{excepts})
b719dafd 871@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
7a68c94a
UD
872This function stores in the variable pointed to by @var{flagp} an
873implementation-defined value representing the current setting of the
874exception flags indicated by @var{excepts}.
63ae7b63
UD
875
876The function returns zero in case the operation was successful, a
877non-zero value otherwise.
7a68c94a 878@end deftypefun
b4012b75 879
7a68c94a
UD
880@comment fenv.h
881@comment ISO
9251c568 882@deftypefun int fesetexceptflag (const fexcept_t *@var{flagp}, int @var{excepts})
b719dafd 883@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
7a68c94a
UD
884This function restores the flags for the exceptions indicated by
885@var{excepts} to the values stored in the variable pointed to by
886@var{flagp}.
63ae7b63
UD
887
888The function returns zero in case the operation was successful, a
889non-zero value otherwise.
7a68c94a
UD
890@end deftypefun
891
892Note that the value stored in @code{fexcept_t} bears no resemblance to
893the bit mask returned by @code{fetestexcept}. The type may not even be
894an integer. Do not attempt to modify an @code{fexcept_t} variable.
895
780257d4
JM
896@comment fenv.h
897@comment ISO
898@deftypefun int fetestexceptflag (const fexcept_t *@var{flagp}, int @var{excepts})
899@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
900Test whether the exception flags indicated by the parameter
901@var{excepts} are set in the variable pointed to by @var{flagp}. If
902any of them are, a nonzero value is returned which specifies which
903exceptions are set. Otherwise the result is zero.
904@code{fetestexceptflag} is from TS 18661-1:2014.
905@end deftypefun
906
7a68c94a
UD
907@node Math Error Reporting
908@subsection Error Reporting by Mathematical Functions
909@cindex errors, mathematical
910@cindex domain error
911@cindex range error
912
Many of the math functions are defined only over a subset of the real or
complex numbers. Even if they are mathematically defined, their result
may be larger or smaller than the range representable by their return
type without loss of accuracy. These are known as @dfn{domain errors},
@dfn{overflows}, and
@dfn{underflows}, respectively. Math functions do several things when
one of these errors occurs. In this manual we will refer to the
complete response as @dfn{signalling} a domain error, overflow, or
underflow.
922
923When a math function suffers a domain error, it raises the invalid
924exception and returns NaN. It also sets @var{errno} to @code{EDOM};
925this is for compatibility with old systems that do not support @w{IEEE
926754} exception handling. Likewise, when overflow occurs, math
c5df7609
JM
927functions raise the overflow exception and, in the default rounding
928mode, return @math{@infinity{}} or @math{-@infinity{}} as appropriate
929(in other rounding modes, the largest finite value of the appropriate
930sign is returned when appropriate for that rounding mode). They also
931set @var{errno} to @code{ERANGE} if returning @math{@infinity{}} or
932@math{-@infinity{}}; @var{errno} may or may not be set to
933@code{ERANGE} when a finite value is returned on overflow. When
934underflow occurs, the underflow exception is raised, and zero
935(appropriately signed) or a subnormal value, as appropriate for the
936mathematical result of the function and the rounding mode, is
937returned. @var{errno} may be set to @code{ERANGE}, but this is not
938guaranteed; it is intended that @theglibc{} should set it when the
939underflow is to an appropriately signed zero, but not necessarily for
940other underflows.
7a68c94a 941
3fdf1792
JM
942When a math function has an argument that is a signaling NaN,
943@theglibc{} does not consider this a domain error, so @code{errno} is
944unchanged, but the invalid exception is still raised (except for a few
945functions that are specified to handle signaling NaNs differently).
946
7a68c94a
UD
947Some of the math functions are defined mathematically to result in a
948complex value over parts of their domains. The most familiar example of
949this is taking the square root of a negative number. The complex math
950functions, such as @code{csqrt}, will return the appropriate complex value
951in this case. The real-valued functions, such as @code{sqrt}, will
952signal a domain error.
953
954Some older hardware does not support infinities. On that hardware,
955overflows instead return a particular very large number (usually the
956largest representable number). @file{math.h} defines macros you can use
957to test for overflow on both old and new hardware.
b4012b75
UD
958
959@comment math.h
960@comment ISO
7a68c94a 961@deftypevr Macro double HUGE_VAL
4260bc74
UD
962@comment math.h
963@comment ISO
7a68c94a 964@deftypevrx Macro float HUGE_VALF
4260bc74
UD
965@comment math.h
966@comment ISO
7a68c94a
UD
967@deftypevrx Macro {long double} HUGE_VALL
968An expression representing a particular very large number. On machines
969that use @w{IEEE 754} floating point format, @code{HUGE_VAL} is infinity.
970On other machines, it's typically the largest positive number that can
971be represented.
972
973Mathematical functions return the appropriately typed version of
974@code{HUGE_VAL} or @code{@minus{}HUGE_VAL} when the result is too large
975to be represented.
976@end deftypevr
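
A portable overflow test can therefore compare a result against
@code{HUGE_VAL} in both signs. The following is a minimal sketch; the
helper name @code{is_overflow_marker} is hypothetical.

```c
#include <math.h>

/* Hypothetical helper: nonzero if RESULT is the value that math
   functions return to report an overflow, in either sign.  */
int
is_overflow_marker (double result)
{
  return result == HUGE_VAL || result == -HUGE_VAL;
}
```

Note that on @w{IEEE 754} machines this cannot tell an overflow apart
from a result that is exactly infinite, so it is usually combined with
a check of @code{errno} or of the overflow exception.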
b4012b75 977
7a68c94a
UD
978@node Rounding
979@section Rounding Modes
980
981Floating-point calculations are carried out internally with extra
982precision, and then rounded to fit into the destination type. This
983ensures that results are as precise as the input data. @w{IEEE 754}
984defines four possible rounding modes:
985
986@table @asis
987@item Round to nearest.
988This is the default mode. It should be used unless there is a specific
989need for one of the others. In this mode results are rounded to the
nearest representable value. If the result is midway between two
representable values, the even representable value is chosen. @dfn{Even} here
means the lowest-order bit is zero. This rounding mode prevents
993statistical bias and guarantees numeric stability: round-off errors in a
994lengthy calculation will remain smaller than half of @code{FLT_EPSILON}.
995
996@c @item Round toward @math{+@infinity{}}
997@item Round toward plus Infinity.
All results are rounded to the smallest representable value
which is greater than or equal to the result.
1000
1001@c @item Round toward @math{-@infinity{}}
1002@item Round toward minus Infinity.
All results are rounded to the largest representable value which is less
than or equal to the result.
1005
1006@item Round toward zero.
All results are rounded to the largest representable value whose
magnitude is no greater than that of the result. In other words, if the
result is negative it is rounded up; if it is positive, it is rounded
down.
1011@end table
b4012b75 1012
7a68c94a
UD
1013@noindent
1014@file{fenv.h} defines constants which you can use to refer to the
1015various rounding modes. Each one will be defined if and only if the FPU
1016supports the corresponding rounding mode.
b4012b75 1017
7a68c94a
UD
1018@table @code
1019@comment fenv.h
1020@comment ISO
1021@vindex FE_TONEAREST
1022@item FE_TONEAREST
1023Round to nearest.
b4012b75 1024
7a68c94a
UD
1025@comment fenv.h
1026@comment ISO
1027@vindex FE_UPWARD
1028@item FE_UPWARD
1029Round toward @math{+@infinity{}}.
b4012b75 1030
7a68c94a
UD
1031@comment fenv.h
1032@comment ISO
1033@vindex FE_DOWNWARD
1034@item FE_DOWNWARD
1035Round toward @math{-@infinity{}}.
b4012b75 1036
7a68c94a
UD
1037@comment fenv.h
1038@comment ISO
1039@vindex FE_TOWARDZERO
1040@item FE_TOWARDZERO
1041Round toward zero.
1042@end table
b4012b75 1043
7a68c94a
UD
Underflow is an unusual case. Normally, @w{IEEE 754} floating point
numbers are always normalized (@pxref{Floating Point Concepts}).
Numbers smaller than @math{2^r} (where @math{r} is the minimum exponent,
@code{FLT_MIN_EXP-1} for @code{float}) cannot be represented as
normalized numbers. Rounding all such numbers to zero or @math{2^r}
would cause some algorithms to fail at 0. Therefore, they are left in
denormalized form. That produces loss of precision, since some bits of
the mantissa are stolen to indicate the decimal point.
1052
1053If a result is too small to be represented as a denormalized number, it
1054is rounded to zero. However, the sign of the result is preserved; if
1055the calculation was negative, the result is @dfn{negative zero}.
1056Negative zero can also result from some operations on infinity, such as
cd837b09 1057@math{4/-@infinity{}}.
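
Because negative zero compares equal to positive zero, recognizing it
requires looking at the sign bit. A minimal sketch, using the
@w{ISO C99} macro @code{signbit}; the helper name
@code{is_negative_zero} is hypothetical:

```c
#include <math.h>

/* Hypothetical helper: nonzero if X is zero with the sign bit set.
   The comparison alone is not enough, since -0.0 == 0.0.  */
int
is_negative_zero (double x)
{
  return x == 0.0 && signbit (x) != 0;
}
```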
7a68c94a 1058
e4fd1876 1059At any time, one of the above four rounding modes is selected. You can
7a68c94a
UD
1060find out which one with this function:
1061
1062@comment fenv.h
1063@comment ISO
1064@deftypefun int fegetround (void)
b719dafd 1065@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
7a68c94a
UD
1066Returns the currently selected rounding mode, represented by one of the
1067values of the defined rounding mode macros.
1068@end deftypefun
b4012b75 1069
7a68c94a
UD
1070@noindent
1071To change the rounding mode, use this function:
b4012b75 1072
7a68c94a
UD
1073@comment fenv.h
1074@comment ISO
1075@deftypefun int fesetround (int @var{round})
b719dafd 1076@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
7a68c94a
UD
Changes the currently selected rounding mode to @var{round}. If
@var{round} does not correspond to one of the supported rounding modes,
nothing is changed. @code{fesetround} returns zero if it changed the
rounding mode, or a nonzero value if the mode is not supported.
7a68c94a 1081@end deftypefun
b4012b75 1082
7a68c94a
UD
1083You should avoid changing the rounding mode if possible. It can be an
1084expensive operation; also, some hardware requires you to compile your
1085program differently for it to work. The resulting code may run slower.
1086See your compiler documentation for details.
1087@c This section used to claim that functions existed to round one number
1088@c in a specific fashion. I can't find any functions in the library
1089@c that do that. -zw
1090
1091@node Control Functions
1092@section Floating-Point Control Functions
1093
1094@w{IEEE 754} floating-point implementations allow the programmer to
1095decide whether traps will occur for each of the exceptions, by setting
1096bits in the @dfn{control word}. In C, traps result in the program
1097receiving the @code{SIGFPE} signal; see @ref{Signal Handling}.
1098
48b22986 1099@strong{NB:} @w{IEEE 754} says that trap handlers are given details of
7a68c94a
UD
1100the exceptional situation, and can set the result value. C signals do
1101not provide any mechanism to pass this information back and forth.
1102Trapping exceptions in C is therefore not very useful.
1103
1104It is sometimes necessary to save the state of the floating-point unit
1105while you perform some calculation. The library provides functions
1106which save and restore the exception flags, the set of exceptions that
1107generate traps, and the rounding mode. This information is known as the
1108@dfn{floating-point environment}.
1109
1110The functions to save and restore the floating-point environment all use
1111a variable of type @code{fenv_t} to store information. This type is
1112defined in @file{fenv.h}. Its size and contents are
1113implementation-defined. You should not attempt to manipulate a variable
1114of this type directly.
1115
1116To save the state of the FPU, use one of these functions:
1117
1118@comment fenv.h
b4012b75 1119@comment ISO
63ae7b63 1120@deftypefun int fegetenv (fenv_t *@var{envp})
b719dafd 1121@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
7a68c94a
UD
1122Store the floating-point environment in the variable pointed to by
1123@var{envp}.
63ae7b63
UD
1124
1125The function returns zero in case the operation was successful, a
1126non-zero value otherwise.
b4012b75
UD
1127@end deftypefun
1128
7a68c94a 1129@comment fenv.h
b4012b75 1130@comment ISO
7a68c94a 1131@deftypefun int feholdexcept (fenv_t *@var{envp})
b719dafd 1132@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
7a68c94a
UD
1133Store the current floating-point environment in the object pointed to by
1134@var{envp}. Then clear all exception flags, and set the FPU to trap no
exceptions. Not all FPUs support trapping no exceptions; if
@code{feholdexcept} cannot set this mode, it returns a nonzero value. If it
succeeds, it returns zero.
b4012b75
UD
1138@end deftypefun
1139
7a7a7ee5 1140The functions which restore the floating-point environment can take these
7a68c94a 1141kinds of arguments:
b4012b75 1142
7a68c94a
UD
1143@itemize @bullet
1144@item
1145Pointers to @code{fenv_t} objects, which were initialized previously by a
1146call to @code{fegetenv} or @code{feholdexcept}.
1147@item
1148@vindex FE_DFL_ENV
1149The special macro @code{FE_DFL_ENV} which represents the floating-point
1150environment as it was available at program start.
1151@item
7a7a7ee5
AJ
Implementation-defined macros with names starting with @code{FE_} and
having type @code{fenv_t *}.
b4012b75 1154
7a68c94a 1155@vindex FE_NOMASK_ENV
1f77f049 1156If possible, @theglibc{} defines a macro @code{FE_NOMASK_ENV}
7a68c94a
UD
1157which represents an environment where every exception raised causes a
1158trap to occur. You can test for this macro using @code{#ifdef}. It is
1159only defined if @code{_GNU_SOURCE} is defined.
1160
1161Some platforms might define other predefined environments.
1162@end itemize
1163
1164@noindent
1165To set the floating-point environment, you can use either of these
1166functions:
1167
1168@comment fenv.h
b4012b75 1169@comment ISO
63ae7b63 1170@deftypefun int fesetenv (const fenv_t *@var{envp})
b719dafd 1171@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
7a68c94a 1172Set the floating-point environment to that described by @var{envp}.
63ae7b63
UD
1173
1174The function returns zero in case the operation was successful, a
1175non-zero value otherwise.
b4012b75
UD
1176@end deftypefun
1177
7a68c94a 1178@comment fenv.h
b4012b75 1179@comment ISO
63ae7b63 1180@deftypefun int feupdateenv (const fenv_t *@var{envp})
b719dafd 1181@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
7a68c94a
UD
1182Like @code{fesetenv}, this function sets the floating-point environment
1183to that described by @var{envp}. However, if any exceptions were
1184flagged in the status word before @code{feupdateenv} was called, they
1185remain flagged after the call. In other words, after @code{feupdateenv}
1186is called, the status word is the bitwise OR of the previous status word
1187and the one saved in @var{envp}.
63ae7b63
UD
1188
1189The function returns zero in case the operation was successful, a
1190non-zero value otherwise.
b4012b75
UD
1191@end deftypefun
1192
ec94343f
JM
1193@noindent
1194TS 18661-1:2014 defines additional functions to save and restore
1195floating-point control modes (such as the rounding mode and whether
1196traps are enabled) while leaving other status (such as raised flags)
1197unchanged.
1198
1199@vindex FE_DFL_MODE
1200The special macro @code{FE_DFL_MODE} may be passed to
1201@code{fesetmode}. It represents the floating-point control modes at
1202program start.
1203
1204@comment fenv.h
1205@comment ISO
1206@deftypefun int fegetmode (femode_t *@var{modep})
1207@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1208Store the floating-point control modes in the variable pointed to by
1209@var{modep}.
1210
1211The function returns zero in case the operation was successful, a
1212non-zero value otherwise.
1213@end deftypefun
1214
1215@comment fenv.h
1216@comment ISO
1217@deftypefun int fesetmode (const femode_t *@var{modep})
1218@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1219Set the floating-point control modes to those described by
1220@var{modep}.
1221
1222The function returns zero in case the operation was successful, a
1223non-zero value otherwise.
1224@end deftypefun
1225
05ef7ce9
UD
1226@noindent
To control, for each individual exception, whether raising it causes a
trap to occur, you can use the following functions.
1229
1230@strong{Portability Note:} These functions are all GNU extensions.
1231
1232@comment fenv.h
1233@comment GNU
1234@deftypefun int feenableexcept (int @var{excepts})
b719dafd 1235@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
e4fd1876
RJ
This function enables traps for each of the exceptions indicated by
the parameter @var{excepts}. The individual exceptions are described in
@ref{Status bit operations}. Only the specified exceptions are
enabled; the status of the other exceptions is not changed.

The function returns the previously enabled exceptions if the
operation was successful, @code{-1} otherwise.
1243@end deftypefun
1244
1245@comment fenv.h
1246@comment GNU
1247@deftypefun int fedisableexcept (int @var{excepts})
b719dafd 1248@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
e4fd1876
RJ
This function disables traps for each of the exceptions indicated by
the parameter @var{excepts}. The individual exceptions are described in
@ref{Status bit operations}. Only the specified exceptions are
disabled; the status of the other exceptions is not changed.

The function returns the previously enabled exceptions if the
operation was successful, @code{-1} otherwise.
1256@end deftypefun
1257
1258@comment fenv.h
1259@comment GNU
8ded91fb 1260@deftypefun int fegetexcept (void)
b719dafd 1261@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
05ef7ce9
UD
1262The function returns a bitmask of all currently enabled exceptions. It
1263returns @code{-1} in case of failure.
6e8afc1c 1264@end deftypefun
05ef7ce9 1265
7a68c94a
UD
1266@node Arithmetic Functions
1267@section Arithmetic Functions
b4012b75 1268
7a68c94a
UD
1269The C library provides functions to do basic operations on
1270floating-point numbers. These include absolute value, maximum and minimum,
1271normalization, bit twiddling, rounding, and a few others.
b4012b75 1272
7a68c94a
UD
1273@menu
1274* Absolute Value:: Absolute values of integers and floats.
1275* Normalization Functions:: Extracting exponents and putting them back.
1276* Rounding Functions:: Rounding floats to integers.
1277* Remainder Functions:: Remainders on division, precisely defined.
1278* FP Bit Twiddling:: Sign bit adjustment. Adding epsilon.
1279* FP Comparison Functions:: Comparisons without risk of exceptions.
1280* Misc FP Arithmetic:: Max, min, positive difference, multiply-add.
1281@end menu
b4012b75 1282
28f540f4 1283@node Absolute Value
7a68c94a 1284@subsection Absolute Value
28f540f4
RM
1285@cindex absolute value functions
1286
1287These functions are provided for obtaining the @dfn{absolute value} (or
1288@dfn{magnitude}) of a number. The absolute value of a real number
2d26e9eb 1289@var{x} is @var{x} if @var{x} is positive, @minus{}@var{x} if @var{x} is
28f540f4
RM
1290negative. For a complex number @var{z}, whose real part is @var{x} and
1291whose imaginary part is @var{y}, the absolute value is @w{@code{sqrt
1292(@var{x}*@var{x} + @var{y}*@var{y})}}.
1293
1294@pindex math.h
1295@pindex stdlib.h
fe0ec73e 1296Prototypes for @code{abs}, @code{labs} and @code{llabs} are in @file{stdlib.h};
e518937a 1297@code{imaxabs} is declared in @file{inttypes.h};
7a68c94a 1298@code{fabs}, @code{fabsf} and @code{fabsl} are declared in @file{math.h}.
b4012b75 1299@code{cabs}, @code{cabsf} and @code{cabsl} are declared in @file{complex.h}.
28f540f4
RM
1300
1301@comment stdlib.h
f65fd747 1302@comment ISO
28f540f4 1303@deftypefun int abs (int @var{number})
4260bc74
UD
1304@comment stdlib.h
1305@comment ISO
7a68c94a 1306@deftypefunx {long int} labs (long int @var{number})
4260bc74
UD
1307@comment stdlib.h
1308@comment ISO
7a68c94a 1309@deftypefunx {long long int} llabs (long long int @var{number})
e518937a
UD
1310@comment inttypes.h
1311@comment ISO
1312@deftypefunx intmax_t imaxabs (intmax_t @var{number})
b719dafd 1313@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
7a68c94a 1314These functions return the absolute value of @var{number}.
28f540f4
RM
1315
1316Most computers use a two's complement integer representation, in which
1317the absolute value of @code{INT_MIN} (the smallest possible @code{int})
1318cannot be represented; thus, @w{@code{abs (INT_MIN)}} is not defined.
28f540f4 1319
@code{llabs} and @code{imaxabs} are new to @w{ISO C99}.
0e4ee106
UD
1321
1322See @ref{Integers} for a description of the @code{intmax_t} type.
1323
fe0ec73e
UD
1324@end deftypefun
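
Code that may receive @code{INT_MIN} can test for it before calling
@code{abs}. This is a minimal sketch with a hypothetical helper name,
@code{checked_abs}:

```c
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: stores the absolute value of N in *OUT and
   returns 1, or returns 0 without calling abs when the result
   would not fit in an int.  */
int
checked_abs (int n, int *out)
{
  if (n < -INT_MAX)   /* only INT_MIN on two's complement machines */
    return 0;
  *out = abs (n);
  return 1;
}
```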
1325
28f540f4 1326@comment math.h
f65fd747 1327@comment ISO
28f540f4 1328@deftypefun double fabs (double @var{number})
4260bc74
UD
1329@comment math.h
1330@comment ISO
779ae82e 1331@deftypefunx float fabsf (float @var{number})
4260bc74
UD
1332@comment math.h
1333@comment ISO
779ae82e 1334@deftypefunx {long double} fabsl (long double @var{number})
b719dafd 1335@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
28f540f4
RM
1336This function returns the absolute value of the floating-point number
1337@var{number}.
1338@end deftypefun
1339
b4012b75
UD
1340@comment complex.h
1341@comment ISO
1342@deftypefun double cabs (complex double @var{z})
4260bc74
UD
1343@comment complex.h
1344@comment ISO
779ae82e 1345@deftypefunx float cabsf (complex float @var{z})
4260bc74
UD
1346@comment complex.h
1347@comment ISO
779ae82e 1348@deftypefunx {long double} cabsl (complex long double @var{z})
b719dafd 1349@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
7a68c94a
UD
1350These functions return the absolute value of the complex number @var{z}
1351(@pxref{Complex Numbers}). The absolute value of a complex number is:
28f540f4
RM
1352
1353@smallexample
b4012b75 1354sqrt (creal (@var{z}) * creal (@var{z}) + cimag (@var{z}) * cimag (@var{z}))
28f540f4 1355@end smallexample
dfd2257a 1356
7a68c94a
UD
1357This function should always be used instead of the direct formula
1358because it takes special care to avoid losing precision. It may also
cf822e3c 1359take advantage of hardware support for this operation. See @code{hypot}
8b7fb588 1360in @ref{Exponents and Logarithms}.
28f540f4
RM
1361@end deftypefun
1362
1363@node Normalization Functions
7a68c94a 1364@subsection Normalization Functions
28f540f4
RM
1365@cindex normalization functions (floating-point)
1366
1367The functions described in this section are primarily provided as a way
1368to efficiently perform certain low-level manipulations on floating point
1369numbers that are represented internally using a binary radix;
1370see @ref{Floating Point Concepts}. These functions are required to
1371have equivalent behavior even if the representation does not use a radix
1372of 2, but of course they are unlikely to be particularly efficient in
1373those cases.
1374
1375@pindex math.h
1376All these functions are declared in @file{math.h}.
1377
1378@comment math.h
f65fd747 1379@comment ISO
28f540f4 1380@deftypefun double frexp (double @var{value}, int *@var{exponent})
4260bc74
UD
1381@comment math.h
1382@comment ISO
779ae82e 1383@deftypefunx float frexpf (float @var{value}, int *@var{exponent})
4260bc74
UD
1384@comment math.h
1385@comment ISO
779ae82e 1386@deftypefunx {long double} frexpl (long double @var{value}, int *@var{exponent})
b719dafd 1387@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
b4012b75 1388These functions are used to split the number @var{value}
28f540f4
RM
1389into a normalized fraction and an exponent.
1390
If the argument @var{value} is not zero, the return value is @var{value}
times a power of two, and its magnitude is always in the range 1/2
(inclusive) to 1 (exclusive). The corresponding exponent is stored in
@code{*@var{exponent}}; the return value multiplied by 2 raised to this
exponent equals the original number @var{value}.
1396
1397For example, @code{frexp (12.8, &exponent)} returns @code{0.8} and
1398stores @code{4} in @code{exponent}.
1399
1400If @var{value} is zero, then the return value is zero and
1401zero is stored in @code{*@var{exponent}}.
1402@end deftypefun
1403
1404@comment math.h
f65fd747 1405@comment ISO
28f540f4 1406@deftypefun double ldexp (double @var{value}, int @var{exponent})
4260bc74
UD
1407@comment math.h
1408@comment ISO
779ae82e 1409@deftypefunx float ldexpf (float @var{value}, int @var{exponent})
4260bc74
UD
1410@comment math.h
1411@comment ISO
779ae82e 1412@deftypefunx {long double} ldexpl (long double @var{value}, int @var{exponent})
b719dafd 1413@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
b4012b75 1414These functions return the result of multiplying the floating-point
28f540f4
RM
number @var{value} by 2 raised to the power @var{exponent}. (They can
be used to reassemble floating-point numbers that were taken apart
by @code{frexp}.)
1418
1419For example, @code{ldexp (0.8, 4)} returns @code{12.8}.
1420@end deftypefun
1421
7a68c94a 1422The following functions, which come from BSD, provide facilities
b7d03293
UD
1423equivalent to those of @code{ldexp} and @code{frexp}. See also the
1424@w{ISO C} function @code{logb} which originally also appeared in BSD.
7a68c94a
UD
1425
1426@comment math.h
1427@comment BSD
8ded91fb 1428@deftypefun double scalb (double @var{value}, double @var{exponent})
4260bc74
UD
1429@comment math.h
1430@comment BSD
8ded91fb 1431@deftypefunx float scalbf (float @var{value}, float @var{exponent})
4260bc74
UD
1432@comment math.h
1433@comment BSD
8ded91fb 1434@deftypefunx {long double} scalbl (long double @var{value}, long double @var{exponent})
b719dafd 1435@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
7a68c94a
UD
1436The @code{scalb} function is the BSD name for @code{ldexp}.
1437@end deftypefun
1438
1439@comment math.h
1440@comment BSD
9ad027fb 1441@deftypefun double scalbn (double @var{x}, int @var{n})
4260bc74
UD
1442@comment math.h
1443@comment BSD
9ad027fb 1444@deftypefunx float scalbnf (float @var{x}, int @var{n})
4260bc74
UD
1445@comment math.h
1446@comment BSD
9ad027fb 1447@deftypefunx {long double} scalbnl (long double @var{x}, int @var{n})
b719dafd 1448@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
7a68c94a
UD
1449@code{scalbn} is identical to @code{scalb}, except that the exponent
1450@var{n} is an @code{int} instead of a floating-point number.
1451@end deftypefun
1452
1453@comment math.h
1454@comment BSD
9ad027fb 1455@deftypefun double scalbln (double @var{x}, long int @var{n})
4260bc74
UD
1456@comment math.h
1457@comment BSD
9ad027fb 1458@deftypefunx float scalblnf (float @var{x}, long int @var{n})
4260bc74
UD
1459@comment math.h
1460@comment BSD
9ad027fb 1461@deftypefunx {long double} scalblnl (long double @var{x}, long int @var{n})
b719dafd 1462@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
7a68c94a
UD
1463@code{scalbln} is identical to @code{scalb}, except that the exponent
1464@var{n} is a @code{long int} instead of a floating-point number.
1465@end deftypefun
28f540f4 1466
7a68c94a
UD
1467@comment math.h
1468@comment BSD
8ded91fb 1469@deftypefun double significand (double @var{x})
4260bc74
UD
1470@comment math.h
1471@comment BSD
8ded91fb 1472@deftypefunx float significandf (float @var{x})
4260bc74
UD
1473@comment math.h
1474@comment BSD
8ded91fb 1475@deftypefunx {long double} significandl (long double @var{x})
b719dafd 1476@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
7a68c94a
UD
1477@code{significand} returns the mantissa of @var{x} scaled to the range
1478@math{[1, 2)}.
1479It is equivalent to @w{@code{scalb (@var{x}, (double) -ilogb (@var{x}))}}.
1480
1481This function exists mainly for use in certain standardized tests
1482of @w{IEEE 754} conformance.
28f540f4
RM
1483@end deftypefun
1484
7a68c94a
UD
1485@node Rounding Functions
1486@subsection Rounding Functions
28f540f4
RM
1487@cindex converting floats to integers
1488
1489@pindex math.h
7a68c94a 1490The functions listed here perform operations such as rounding and
cf822e3c 1491truncation of floating-point values. Some of these functions convert
7a68c94a
UD
1492floating point numbers to integer values. They are all declared in
1493@file{math.h}.
28f540f4
RM
1494
1495You can also convert floating-point numbers to integers simply by
1496casting them to @code{int}. This discards the fractional part,
1497effectively rounding towards zero. However, this only works if the
1498result can actually be represented as an @code{int}---for very large
1499numbers, this is impossible. The functions listed here return the
1500result as a @code{double} instead to get around this problem.
1501
1502@comment math.h
f65fd747 1503@comment ISO
28f540f4 1504@deftypefun double ceil (double @var{x})
4260bc74
UD
1505@comment math.h
1506@comment ISO
779ae82e 1507@deftypefunx float ceilf (float @var{x})
4260bc74
UD
1508@comment math.h
1509@comment ISO
779ae82e 1510@deftypefunx {long double} ceill (long double @var{x})
b719dafd 1511@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
b4012b75 1512These functions round @var{x} upwards to the nearest integer,
28f540f4
RM
1513returning that value as a @code{double}. Thus, @code{ceil (1.5)}
1514is @code{2.0}.
1515@end deftypefun
1516
1517@comment math.h
f65fd747 1518@comment ISO
28f540f4 1519@deftypefun double floor (double @var{x})
4260bc74
UD
1520@comment math.h
1521@comment ISO
779ae82e 1522@deftypefunx float floorf (float @var{x})
4260bc74
UD
1523@comment math.h
1524@comment ISO
779ae82e 1525@deftypefunx {long double} floorl (long double @var{x})
b719dafd 1526@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
b4012b75 1527These functions round @var{x} downwards to the nearest
28f540f4
RM
1528integer, returning that value as a @code{double}. Thus, @code{floor
1529(1.5)} is @code{1.0} and @code{floor (-1.5)} is @code{-2.0}.
1530@end deftypefun
1531
7a68c94a
UD
1532@comment math.h
1533@comment ISO
1534@deftypefun double trunc (double @var{x})
4260bc74
UD
1535@comment math.h
1536@comment ISO
7a68c94a 1537@deftypefunx float truncf (float @var{x})
4260bc74
UD
1538@comment math.h
1539@comment ISO
7a68c94a 1540@deftypefunx {long double} truncl (long double @var{x})
b719dafd 1541@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
e6e81391
UD
1542The @code{trunc} functions round @var{x} towards zero to the nearest
1543integer (returned in floating-point format). Thus, @code{trunc (1.5)}
1544is @code{1.0} and @code{trunc (-1.5)} is @code{-1.0}.
7a68c94a
UD
1545@end deftypefun
1546
28f540f4 1547@comment math.h
b4012b75 1548@comment ISO
28f540f4 1549@deftypefun double rint (double @var{x})
4260bc74
UD
1550@comment math.h
1551@comment ISO
779ae82e 1552@deftypefunx float rintf (float @var{x})
4260bc74
UD
1553@comment math.h
1554@comment ISO
779ae82e 1555@deftypefunx {long double} rintl (long double @var{x})
b719dafd 1556@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
b4012b75 1557These functions round @var{x} to an integer value according to the
28f540f4
RM
1558current rounding mode. @xref{Floating Point Parameters}, for
1559information about the various rounding modes. The default
1560rounding mode is to round to the nearest integer; some machines
1561support other modes, but round-to-nearest is always used unless
7a68c94a
UD
1562you explicitly select another.
1563
1564If @var{x} was not initially an integer, these functions raise the
1565inexact exception.
28f540f4
RM
1566@end deftypefun
1567
b4012b75
UD
1568@comment math.h
1569@comment ISO
1570@deftypefun double nearbyint (double @var{x})
4260bc74
UD
1571@comment math.h
1572@comment ISO
779ae82e 1573@deftypefunx float nearbyintf (float @var{x})
4260bc74
UD
1574@comment math.h
1575@comment ISO
779ae82e 1576@deftypefunx {long double} nearbyintl (long double @var{x})
b719dafd 1577@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
7a68c94a
UD
1578These functions return the same value as the @code{rint} functions, but
1579do not raise the inexact exception if @var{x} is not an integer.
1580@end deftypefun
1581
1582@comment math.h
1583@comment ISO
1584@deftypefun double round (double @var{x})
4260bc74
UD
1585@comment math.h
1586@comment ISO
7a68c94a 1587@deftypefunx float roundf (float @var{x})
4260bc74
UD
1588@comment math.h
1589@comment ISO
7a68c94a 1590@deftypefunx {long double} roundl (long double @var{x})
b719dafd 1591@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
7a68c94a 1592These functions are similar to @code{rint}, but they round halfway
713df3d5
RM
1593cases away from zero instead of to the nearest integer (or other
1594current rounding mode).
7a68c94a
UD
1595@end deftypefun
1596
1597@comment math.h
1598@comment ISO
1599@deftypefun {long int} lrint (double @var{x})
4260bc74
UD
1600@comment math.h
1601@comment ISO
7a68c94a 1602@deftypefunx {long int} lrintf (float @var{x})
4260bc74
UD
1603@comment math.h
1604@comment ISO
7a68c94a 1605@deftypefunx {long int} lrintl (long double @var{x})
b719dafd 1606@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
7a68c94a
UD
1607These functions are just like @code{rint}, but they return a
1608@code{long int} instead of a floating-point number.
1609@end deftypefun
1610
1611@comment math.h
1612@comment ISO
1613@deftypefun {long long int} llrint (double @var{x})
4260bc74
UD
1614@comment math.h
1615@comment ISO
7a68c94a 1616@deftypefunx {long long int} llrintf (float @var{x})
4260bc74
UD
1617@comment math.h
1618@comment ISO
7a68c94a 1619@deftypefunx {long long int} llrintl (long double @var{x})
b719dafd 1620@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
7a68c94a
UD
1621These functions are just like @code{rint}, but they return a
1622@code{long long int} instead of a floating-point number.
b4012b75
UD
1623@end deftypefun
1624
7a68c94a
UD
1625@comment math.h
1626@comment ISO
1627@deftypefun {long int} lround (double @var{x})
4260bc74
UD
1628@comment math.h
1629@comment ISO
7a68c94a 1630@deftypefunx {long int} lroundf (float @var{x})
4260bc74
UD
1631@comment math.h
1632@comment ISO
7a68c94a 1633@deftypefunx {long int} lroundl (long double @var{x})
b719dafd 1634@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
7a68c94a
UD
1635These functions are just like @code{round}, but they return a
1636@code{long int} instead of a floating-point number.
1637@end deftypefun
1638
1639@comment math.h
1640@comment ISO
1641@deftypefun {long long int} llround (double @var{x})
4260bc74
UD
1642@comment math.h
1643@comment ISO
7a68c94a 1644@deftypefunx {long long int} llroundf (float @var{x})
4260bc74
UD
1645@comment math.h
1646@comment ISO
7a68c94a 1647@deftypefunx {long long int} llroundl (long double @var{x})
b719dafd 1648@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
7a68c94a
UD
1649These functions are just like @code{round}, but they return a
1650@code{long long int} instead of a floating-point number.
1651@end deftypefun
1652
1653
28f540f4 1654@comment math.h
f65fd747 1655@comment ISO
28f540f4 1656@deftypefun double modf (double @var{value}, double *@var{integer-part})
4260bc74
UD
1657@comment math.h
1658@comment ISO
f2ea0f5b 1659@deftypefunx float modff (float @var{value}, float *@var{integer-part})
4260bc74
UD
1660@comment math.h
1661@comment ISO
779ae82e 1662@deftypefunx {long double} modfl (long double @var{value}, long double *@var{integer-part})
b719dafd 1663@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
b4012b75 1664These functions break the argument @var{value} into an integer part and a
28f540f4
RM
1665fractional part (between @code{-1} and @code{1}, exclusive). Their sum
1666equals @var{value}. Each of the parts has the same sign as @var{value},
7a68c94a 1667and the integer part is always rounded toward zero.
28f540f4
RM
1668
1669@code{modf} stores the integer part in @code{*@var{integer-part}}, and
1670returns the fractional part. For example, @code{modf (2.5, &intpart)}
1671returns @code{0.5} and stores @code{2.0} into @code{intpart}.
1672@end deftypefun
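
As a small sketch of @code{modf} in use, the following code splits a
duration given in hours into whole hours and whole minutes.  The names
@code{hours_to_hm} and @code{check_hours_to_hm} are hypothetical, and
the sketch assumes the integer part fits in an @code{int}.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical helper: split HOURS into whole hours and whole minutes.
   Both results carry the sign of HOURS, because both parts produced
   by modf do.  Seconds are discarded.  */
void
hours_to_hm (double hours, int *h, int *m)
{
  double whole;
  double frac = modf (hours, &whole);

  *h = (int) whole;
  *m = (int) (frac * 60.0);
}

void
check_hours_to_hm (void)
{
  int h, m;

  hours_to_hm (2.75, &h, &m);
  assert (h == 2 && m == 45);

  hours_to_hm (-1.5, &h, &m);
  assert (h == -1 && m == -30);
}
```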
1673
7a68c94a
UD
1674@node Remainder Functions
1675@subsection Remainder Functions
1676
1677The functions in this section compute the remainder on division of two
1678floating-point numbers. Each is a little different; pick the one that
1679suits your problem.
1680
28f540f4 1681@comment math.h
f65fd747 1682@comment ISO
28f540f4 1683@deftypefun double fmod (double @var{numerator}, double @var{denominator})
4260bc74
UD
1684@comment math.h
1685@comment ISO
779ae82e 1686@deftypefunx float fmodf (float @var{numerator}, float @var{denominator})
4260bc74
UD
1687@comment math.h
1688@comment ISO
779ae82e 1689@deftypefunx {long double} fmodl (long double @var{numerator}, long double @var{denominator})
b719dafd 1690@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
b4012b75 1691These functions compute the remainder from the division of
28f540f4
RM
1692@var{numerator} by @var{denominator}. Specifically, the return value is
1693@code{@var{numerator} - @w{@var{n} * @var{denominator}}}, where @var{n}
1694is the quotient of @var{numerator} divided by @var{denominator}, rounded
1695towards zero to an integer. Thus, @w{@code{fmod (6.5, 2.3)}} returns
1696@code{1.9}, which is @code{6.5} minus @code{4.6}.
1697
1698The result has the same sign as the @var{numerator} and has magnitude
1699less than the magnitude of the @var{denominator}.
1700
7a68c94a 1701If @var{denominator} is zero, @code{fmod} signals a domain error.
28f540f4
RM
1702@end deftypefun
1703
1704@comment math.h
1705@comment BSD
1706@deftypefun double drem (double @var{numerator}, double @var{denominator})
4260bc74
UD
1707@comment math.h
1708@comment BSD
779ae82e 1709@deftypefunx float dremf (float @var{numerator}, float @var{denominator})
4260bc74
UD
1710@comment math.h
1711@comment BSD
779ae82e 1712@deftypefunx {long double} dreml (long double @var{numerator}, long double @var{denominator})
b719dafd 1713@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
76cf9889 1714These functions are like @code{fmod} except that they round the
28f540f4
RM
1715internal quotient @var{n} to the nearest integer instead of towards zero
1716to an integer. For example, @code{drem (6.5, 2.3)} returns @code{-0.4},
1717which is @code{6.5} minus @code{6.9}.
1718
1719The absolute value of the result is less than or equal to half the
1720absolute value of the @var{denominator}. The difference between
1721@code{fmod (@var{numerator}, @var{denominator})} and @code{drem
1722(@var{numerator}, @var{denominator})} is always either
1723@var{denominator}, minus @var{denominator}, or zero.
1724
7a68c94a 1725If @var{denominator} is zero, @code{drem} signals a domain error.
28f540f4
RM
1726@end deftypefun
1727
7a68c94a
UD
1728@comment math.h
1729@comment BSD
1730@deftypefun double remainder (double @var{numerator}, double @var{denominator})
4260bc74
UD
1731@comment math.h
1732@comment BSD
7a68c94a 1733@deftypefunx float remainderf (float @var{numerator}, float @var{denominator})
4260bc74
UD
1734@comment math.h
1735@comment BSD
7a68c94a 1736@deftypefunx {long double} remainderl (long double @var{numerator}, long double @var{denominator})
b719dafd 1737@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
7a68c94a
UD
1738These functions are other names for the @code{drem} functions.
1739@end deftypefun
28f540f4 1740
7a68c94a
UD
1741@node FP Bit Twiddling
1742@subsection Setting and modifying single bits of FP values
fe0ec73e
UD
1743@cindex FP arithmetic
1744
7a68c94a 1745There are some operations that are too complicated or expensive to
ec751a23 1746perform by hand on floating-point numbers. @w{ISO C99} defines
7a68c94a
UD
1747functions to do these operations, which mostly involve changing single
1748bits.
fe0ec73e
UD
1749
1750@comment math.h
1751@comment ISO
1752@deftypefun double copysign (double @var{x}, double @var{y})
4260bc74
UD
1753@comment math.h
1754@comment ISO
fe0ec73e 1755@deftypefunx float copysignf (float @var{x}, float @var{y})
4260bc74
UD
1756@comment math.h
1757@comment ISO
fe0ec73e 1758@deftypefunx {long double} copysignl (long double @var{x}, long double @var{y})
b719dafd 1759@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
7a68c94a
UD
1760These functions return @var{x} but with the sign of @var{y}. They work
1761even if @var{x} or @var{y} are NaN or zero. Both of these can carry a
1762sign (although not all implementations support it) and this is one of
1763the few operations that can tell the difference.
fe0ec73e 1764
7a68c94a
UD
1765@code{copysign} never raises an exception.
1766@c except signalling NaNs
fe0ec73e
UD
1767
1768This function is defined in @w{IEC 559} (and the appendix with
1769recommended functions in @w{IEEE 754}/@w{IEEE 854}).
1770@end deftypefun
1771
1772@comment math.h
1773@comment ISO
1774@deftypefun int signbit (@emph{float-type} @var{x})
b719dafd 1775@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
fe0ec73e
UD
1776@code{signbit} is a generic macro which can work on all floating-point
1777types. It returns a nonzero value if the value of @var{x} has its sign
1778bit set.
1779
7a68c94a
UD
1780This is not the same as @code{x < 0.0}, because @w{IEEE 754} floating
1781point allows zero to be signed. The comparison @code{-0.0 < 0.0} is
1782false, but @code{signbit (-0.0)} will return a nonzero value.
fe0ec73e
UD
1783@end deftypefun
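
The following sketch shows @code{copysign} and @code{signbit} telling
negative zero apart from positive zero, which a comparison cannot do.
The names @code{is_negative_zero} and @code{check_signs} are
hypothetical, and the sketch assumes a floating-point format with
signed zeros, such as @w{IEEE 754}.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical helper: nonzero if X is negative zero.  */
int
is_negative_zero (double x)
{
  return x == 0.0 && signbit (x);
}

void
check_signs (void)
{
  double nz = copysign (0.0, -1.0);     /* builds negative zero */

  assert (!(nz < 0.0));                 /* comparison ignores the sign */
  assert (nz == 0.0);
  assert (is_negative_zero (nz));
  assert (!is_negative_zero (0.0));
  assert (copysign (3.0, nz) == -3.0);  /* copysign sees the sign */
}
```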
1784
1785@comment math.h
1786@comment ISO
1787@deftypefun double nextafter (double @var{x}, double @var{y})
4260bc74
UD
1788@comment math.h
1789@comment ISO
fe0ec73e 1790@deftypefunx float nextafterf (float @var{x}, float @var{y})
4260bc74
UD
1791@comment math.h
1792@comment ISO
fe0ec73e 1793@deftypefunx {long double} nextafterl (long double @var{x}, long double @var{y})
b719dafd 1794@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
fe0ec73e 1795The @code{nextafter} function returns the next representable neighbor of
7a68c94a
UD
1796@var{x} in the direction towards @var{y}. The size of the step between
1797@var{x} and the result depends on the type of the result. If
0a7fef01 1798@math{@var{x} = @var{y}} the function simply returns @var{y}. If either
7a68c94a
UD
1799value is @code{NaN}, @code{NaN} is returned. Otherwise
1800a value corresponding to the value of the least significant bit in the
1801mantissa is added or subtracted, depending on the direction.
1802@code{nextafter} will signal overflow or underflow if the result goes
1803outside of the range of normalized numbers.
fe0ec73e
UD
1804
1805This function is defined in @w{IEC 559} (and the appendix with
1806recommended functions in @w{IEEE 754}/@w{IEEE 854}).
1807@end deftypefun
1808
7a68c94a
UD
1809@comment math.h
1810@comment ISO
36fe9ac9 1811@deftypefun double nexttoward (double @var{x}, long double @var{y})
4260bc74
UD
1812@comment math.h
1813@comment ISO
36fe9ac9 1814@deftypefunx float nexttowardf (float @var{x}, long double @var{y})
4260bc74
UD
1815@comment math.h
1816@comment ISO
36fe9ac9 1817@deftypefunx {long double} nexttowardl (long double @var{x}, long double @var{y})
b719dafd 1818@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
7a68c94a
UD
1819These functions are identical to the corresponding versions of
1820@code{nextafter} except that their second argument is a @code{long
1821double}.
1822@end deftypefun
1823
41a359e2 1824@comment math.h
bf91be88 1825@comment ISO
41a359e2
RS
1826@deftypefun double nextup (double @var{x})
1827@comment math.h
bf91be88 1828@comment ISO
41a359e2
RS
1829@deftypefunx float nextupf (float @var{x})
1830@comment math.h
bf91be88 1831@comment ISO
41a359e2
RS
1832@deftypefunx {long double} nextupl (long double @var{x})
1833@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1834The @code{nextup} function returns the next representable neighbor of @var{x}
1835in the direction of positive infinity. If @var{x} is the smallest negative
1836subnormal number in the type of @var{x} the function returns @code{-0}. If
1837@math{@var{x} = @code{0}} the function returns the smallest positive subnormal
1838number in the type of @var{x}. If @var{x} is NaN, NaN is returned.
1839If @var{x} is @math{+@infinity{}}, @math{+@infinity{}} is returned.
71b48044 1840@code{nextup} is from TS 18661-1:2014.
41a359e2
RS
1841@code{nextup} never raises an exception except for signaling NaNs.
1842@end deftypefun
1843
1844@comment math.h
bf91be88 1845@comment ISO
41a359e2
RS
1846@deftypefun double nextdown (double @var{x})
1847@comment math.h
bf91be88 1848@comment ISO
41a359e2
RS
1849@deftypefunx float nextdownf (float @var{x})
1850@comment math.h
bf91be88 1851@comment ISO
41a359e2
RS
1852@deftypefunx {long double} nextdownl (long double @var{x})
1853@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1854The @code{nextdown} function returns the next representable neighbor of @var{x}
1855in the direction of negative infinity. If @var{x} is the smallest positive
1856subnormal number in the type of @var{x} the function returns @code{+0}. If
1857@math{@var{x} = @code{0}} the function returns the smallest negative subnormal
1858number in the type of @var{x}. If @var{x} is NaN, NaN is returned.
1859If @var{x} is @math{-@infinity{}}, @math{-@infinity{}} is returned.
bf91be88 1860@code{nextdown} is from TS 18661-1:2014.
41a359e2
RS
1861@code{nextdown} never raises an exception except for signaling NaNs.
1862@end deftypefun
1863
fe0ec73e
UD
1864@cindex NaN
1865@comment math.h
1866@comment ISO
1867@deftypefun double nan (const char *@var{tagp})
4260bc74
UD
1868@comment math.h
1869@comment ISO
fe0ec73e 1870@deftypefunx float nanf (const char *@var{tagp})
4260bc74
UD
1871@comment math.h
1872@comment ISO
fe0ec73e 1873@deftypefunx {long double} nanl (const char *@var{tagp})
b719dafd
AO
1874@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
1875@c The unsafe-but-ruled-safe locale use comes from strtod.
7a68c94a
UD
1876The @code{nan} function returns a representation of NaN, provided that
1877NaN is supported by the target platform.
1878@code{nan ("@var{n-char-sequence}")} is equivalent to
1879@code{strtod ("NAN(@var{n-char-sequence})", NULL)}.
1880
1881The argument @var{tagp} is used in an unspecified manner. On @w{IEEE
1882754} systems, there are many representations of NaN, and @var{tagp}
1883selects one. On other systems it may do nothing.
fe0ec73e
UD
1884@end deftypefun
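
A minimal sketch of obtaining a NaN, assuming a platform that supports
quiet NaNs; the function name @code{check_nan} is hypothetical.

```c
#include <assert.h>
#include <math.h>
#include <stdlib.h>

void
check_nan (void)
{
  double a = nan ("");              /* empty tag */
  double b = strtod ("NAN", NULL);  /* parsed from a string */

  assert (isnan (a));
  assert (isnan (b));
  assert (a != a);   /* a NaN never compares equal, even to itself */
}
```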
1885
eaf5ad0b
JM
1886@comment math.h
1887@comment ISO
1888@deftypefun int canonicalize (double *@var{cx}, const double *@var{x})
1889@comment math.h
1890@comment ISO
1891@deftypefunx int canonicalizef (float *@var{cx}, const float *@var{x})
1892@comment math.h
1893@comment ISO
1894@deftypefunx int canonicalizel (long double *@var{cx}, const long double *@var{x})
1895@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1896In some floating-point formats, some values have canonical (preferred)
1897and noncanonical encodings (for IEEE interchange binary formats, all
1898encodings are canonical). These functions, defined by TS
189918661-1:2014, attempt to produce a canonical version of the
1900floating-point value pointed to by @var{x}; if that value is a
1901signaling NaN, they raise the invalid exception and produce a quiet
1902NaN. If a canonical value is produced, it is stored in the object
1903pointed to by @var{cx}, and these functions return zero. Otherwise
1904(if a canonical value could not be produced because the object pointed
1905to by @var{x} is not a valid representation of any floating-point
1906value), the object pointed to by @var{cx} is unchanged and a nonzero
1907value is returned.
1908
1909Note that some formats have multiple encodings of a value which are
1910all equally canonical; when such an encoding is used as an input to
1911this function, any such encoding of the same value (or of the
1912corresponding quiet NaN, if that value is a signaling NaN) may be
1913produced as output.
1914@end deftypefun
1915
f8e8b8ed
JM
1916@comment math.h
1917@comment ISO
1918@deftypefun double getpayload (const double *@var{x})
1919@comment math.h
1920@comment ISO
1921@deftypefunx float getpayloadf (const float *@var{x})
1922@comment math.h
1923@comment ISO
1924@deftypefunx {long double} getpayloadl (const long double *@var{x})
1925@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1926IEEE 754 defines the @dfn{payload} of a NaN to be an integer value
1927encoded in the representation of the NaN. Payloads are typically
1928propagated from NaN inputs to the result of a floating-point
1929operation. These functions, defined by TS 18661-1:2014, return the
1930payload of the NaN pointed to by @var{x} (returned as a positive
1931integer, or positive zero, represented as a floating-point number); if
1932@var{x} is not a NaN, they return an unspecified value. They raise no
1933floating-point exceptions even for signaling NaNs.
1934@end deftypefun
1935
eb3c12c7
JM
1936@comment math.h
1937@comment ISO
1938@deftypefun int setpayload (double *@var{x}, double @var{payload})
1939@comment math.h
1940@comment ISO
1941@deftypefunx int setpayloadf (float *@var{x}, float @var{payload})
1942@comment math.h
1943@comment ISO
1944@deftypefunx int setpayloadl (long double *@var{x}, long double @var{payload})
1945@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1946These functions, defined by TS 18661-1:2014, set the object pointed to
1947by @var{x} to a quiet NaN with payload @var{payload} and a zero sign
1948bit and return zero. If @var{payload} is not a positive-signed
1949integer that is a valid payload for a quiet NaN of the given type, the
1950object pointed to by @var{x} is set to positive zero and a nonzero
1951value is returned. They raise no floating-point exceptions.
1952@end deftypefun
1953
457663a7
JM
1954@comment math.h
1955@comment ISO
1956@deftypefun int setpayloadsig (double *@var{x}, double @var{payload})
1957@comment math.h
1958@comment ISO
1959@deftypefunx int setpayloadsigf (float *@var{x}, float @var{payload})
1960@comment math.h
1961@comment ISO
1962@deftypefunx int setpayloadsigl (long double *@var{x}, long double @var{payload})
1963@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1964These functions, defined by TS 18661-1:2014, set the object pointed to
1965by @var{x} to a signaling NaN with payload @var{payload} and a zero
1966sign bit and return zero. If @var{payload} is not a positive-signed
1967integer that is a valid payload for a signaling NaN of the given type,
1968the object pointed to by @var{x} is set to positive zero and a nonzero
1969value is returned. They raise no floating-point exceptions.
1970@end deftypefun
1971
7a68c94a
UD
1972@node FP Comparison Functions
1973@subsection Floating-Point Comparison Functions
1974@cindex unordered comparison
fe0ec73e 1975
7a68c94a
UD
1976The standard C relational operators (@code{<}, @code{<=}, @code{>}, and
1977@code{>=}) raise the invalid exception when either operand is NaN.  For example,
1978
1979@smallexample
1980int v = a < 1.0;
1981@end smallexample
1982
1983@noindent
1984will raise an exception if @var{a} is NaN. (This does @emph{not}
1985happen with @code{==} and @code{!=}; those merely return false and true,
1986respectively, when NaN is examined.) Frequently this exception is
ec751a23 1987undesirable. @w{ISO C99} therefore defines comparison functions that
7a68c94a
UD
1988do not raise exceptions when NaN is examined. All of the functions are
1989implemented as macros which allow their arguments to be of any
1990floating-point type. The macros are guaranteed to evaluate their
1e7c8fcc 1991arguments only once. TS 18661-1:2014 adds such a macro for an
5e9d98a3
JM
1992equality comparison that @emph{does} raise an exception for a NaN
1993argument; it also adds functions that provide a total ordering on all
1994floating-point values, including NaNs, without raising any exceptions
1995even for signaling NaNs.
7a68c94a
UD
1996
1997@comment math.h
1998@comment ISO
1999@deftypefn Macro int isgreater (@emph{real-floating} @var{x}, @emph{real-floating} @var{y})
b719dafd 2000@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
7a68c94a
UD
2001This macro determines whether the argument @var{x} is greater than
2002@var{y}. It is equivalent to @code{(@var{x}) > (@var{y})}, but no
2003exception is raised if @var{x} or @var{y} are NaN.
2004@end deftypefn
2005
2006@comment math.h
2007@comment ISO
2008@deftypefn Macro int isgreaterequal (@emph{real-floating} @var{x}, @emph{real-floating} @var{y})
b719dafd 2009@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
7a68c94a
UD
2010This macro determines whether the argument @var{x} is greater than or
2011equal to @var{y}. It is equivalent to @code{(@var{x}) >= (@var{y})}, but no
2012exception is raised if @var{x} or @var{y} are NaN.
2013@end deftypefn
2014
2015@comment math.h
2016@comment ISO
2017@deftypefn Macro int isless (@emph{real-floating} @var{x}, @emph{real-floating} @var{y})
b719dafd 2018@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
7a68c94a
UD
2019This macro determines whether the argument @var{x} is less than @var{y}.
2020It is equivalent to @code{(@var{x}) < (@var{y})}, but no exception is
2021raised if @var{x} or @var{y} are NaN.
2022@end deftypefn
2023
2024@comment math.h
2025@comment ISO
2026@deftypefn Macro int islessequal (@emph{real-floating} @var{x}, @emph{real-floating} @var{y})
b719dafd 2027@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
7a68c94a
UD
2028This macro determines whether the argument @var{x} is less than or equal
2029to @var{y}. It is equivalent to @code{(@var{x}) <= (@var{y})}, but no
2030exception is raised if @var{x} or @var{y} are NaN.
2031@end deftypefn
2032
2033@comment math.h
2034@comment ISO
2035@deftypefn Macro int islessgreater (@emph{real-floating} @var{x}, @emph{real-floating} @var{y})
b719dafd 2036@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
7a68c94a
UD
2037This macro determines whether the argument @var{x} is less or greater
2038than @var{y}. It is equivalent to @code{(@var{x}) < (@var{y}) ||
2039(@var{x}) > (@var{y})} (although it only evaluates @var{x} and @var{y}
2040once), but no exception is raised if @var{x} or @var{y} are NaN.
2041
2042This macro is not equivalent to @code{@var{x} != @var{y}}, because that
2043expression is true if @var{x} or @var{y} are NaN.
2044@end deftypefn
2045
2046@comment math.h
2047@comment ISO
2048@deftypefn Macro int isunordered (@emph{real-floating} @var{x}, @emph{real-floating} @var{y})
b719dafd 2049@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
7a68c94a
UD
2050This macro determines whether its arguments are unordered. In other
2051words, it is true if @var{x} or @var{y} are NaN, and false otherwise.
2052@end deftypefn
2053
1e7c8fcc
JM
2054@comment math.h
2055@comment ISO
2056@deftypefn Macro int iseqsig (@emph{real-floating} @var{x}, @emph{real-floating} @var{y})
2057@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
2058This macro determines whether its arguments are equal. It is
2059equivalent to @code{(@var{x}) == (@var{y})}, but it raises the invalid
2060exception and sets @code{errno} to @code{EDOM} if either argument is a
2061NaN.
2062@end deftypefn
2063
5e9d98a3
JM
2064@comment math.h
2065@comment ISO
2066@deftypefun int totalorder (double @var{x}, double @var{y})
2067@comment ISO
2068@deftypefunx int totalorderf (float @var{x}, float @var{y})
2069@comment ISO
2070@deftypefunx int totalorderl (long double @var{x}, long double @var{y})
2071@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
2072These functions determine whether the total order relationship,
2073defined in IEEE 754-2008, is true for @var{x} and @var{y}, returning
2074nonzero if it is true and zero if it is false. No exceptions are
2075raised even for signaling NaNs. The relationship is true if they are
2076the same floating-point value (including sign for zero and NaNs, and
2077payload for NaNs), or if @var{x} comes before @var{y} in the following
2078order: negative quiet NaNs, in order of decreasing payload; negative
2079signaling NaNs, in order of decreasing payload; negative infinity;
2080finite numbers, in ascending order, with negative zero before positive
2081zero; positive infinity; positive signaling NaNs, in order of
2082increasing payload; positive quiet NaNs, in order of increasing
2083payload.
2084@end deftypefun
2085
cc6a8d74
JM
2086@comment math.h
2087@comment ISO
2088@deftypefun int totalordermag (double @var{x}, double @var{y})
2089@comment ISO
2090@deftypefunx int totalordermagf (float @var{x}, float @var{y})
2091@comment ISO
2092@deftypefunx int totalordermagl (long double @var{x}, long double @var{y})
2093@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
2094These functions determine whether the total order relationship,
2095defined in IEEE 754-2008, is true for the absolute values of @var{x}
2096and @var{y}, returning nonzero if it is true and zero if it is false.
2097No exceptions are raised even for signaling NaNs.
2098@end deftypefun
2099
7a68c94a
UD
2100Not all machines provide hardware support for these operations. On
2101machines that don't, the macros can be very slow. Therefore, you should
2102not use these functions when NaN is not a concern.
2103
48b22986 2104@strong{NB:} There are no macros @code{isequal} or @code{isunequal}.
7a68c94a
UD
2105They are unnecessary, because the @code{==} and @code{!=} operators do
2106@emph{not} raise an exception if one or both of the operands are NaN.
2107
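The @w{ISO C99} comparison macros can be combined into a three-way
comparison that stays quiet for NaN operands.  The following is only a
sketch; the names @code{quiet_compare} and @code{check_quiet_compare},
and the convention of returning 2 for unordered operands, are
assumptions made for this example.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical helper: returns -1, 0, or 1 for ordered operands and
   2 when the operands are unordered (at least one is NaN).  */
int
quiet_compare (double x, double y)
{
  if (isunordered (x, y))
    return 2;
  if (isless (x, y))
    return -1;
  if (isgreater (x, y))
    return 1;
  return 0;
}

void
check_quiet_compare (void)
{
  double q = nan ("");

  assert (quiet_compare (1.0, 2.0) == -1);
  assert (quiet_compare (2.0, 1.0) == 1);
  assert (quiet_compare (-0.0, 0.0) == 0);  /* zeros compare equal */
  assert (quiet_compare (q, 1.0) == 2);
  assert (!islessgreater (q, 1.0));  /* false for unordered operands */
  assert (q != 1.0);                 /* but != is true for NaN */
}
```
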
2108@node Misc FP Arithmetic
2109@subsection Miscellaneous FP arithmetic functions
fe0ec73e
UD
2110@cindex minimum
2111@cindex maximum
7a68c94a
UD
2112@cindex positive difference
2113@cindex multiply-add
fe0ec73e 2114
7a68c94a
UD
2115The functions in this section perform miscellaneous but common
2116operations that are awkward to express with C operators. On some
2117processors these functions can use special machine instructions to
2118perform these operations faster than the equivalent C code.
fe0ec73e
UD
2119
2120@comment math.h
2121@comment ISO
2122@deftypefun double fmin (double @var{x}, double @var{y})
4260bc74
UD
2123@comment math.h
2124@comment ISO
fe0ec73e 2125@deftypefunx float fminf (float @var{x}, float @var{y})
4260bc74
UD
2126@comment math.h
2127@comment ISO
fe0ec73e 2128@deftypefunx {long double} fminl (long double @var{x}, long double @var{y})
b719dafd 2129@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
7a68c94a
UD
2130The @code{fmin} function returns the lesser of the two values @var{x}
2131and @var{y}. It is similar to the expression
2132@smallexample
2133((x) < (y) ? (x) : (y))
2134@end smallexample
2135except that @var{x} and @var{y} are only evaluated once.
fe0ec73e 2136
7a68c94a
UD
2137If an argument is NaN, the other argument is returned. If both arguments
2138are NaN, NaN is returned.
fe0ec73e
UD
2139@end deftypefun
2140
2141@comment math.h
2142@comment ISO
2143@deftypefun double fmax (double @var{x}, double @var{y})
4260bc74
UD
2144@comment math.h
2145@comment ISO
fe0ec73e 2146@deftypefunx float fmaxf (float @var{x}, float @var{y})
4260bc74
UD
2147@comment math.h
2148@comment ISO
fe0ec73e 2149@deftypefunx {long double} fmaxl (long double @var{x}, long double @var{y})
b719dafd 2150@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
7a68c94a
UD
2151The @code{fmax} function returns the greater of the two values @var{x}
2152and @var{y}.
fe0ec73e 2153
7a68c94a
UD
2154If an argument is NaN, the other argument is returned. If both arguments
2155are NaN, NaN is returned.
fe0ec73e
UD
2156@end deftypefun
2157
2158@comment math.h
2159@comment ISO
2160@deftypefun double fdim (double @var{x}, double @var{y})
4260bc74
UD
2161@comment math.h
2162@comment ISO
fe0ec73e 2163@deftypefunx float fdimf (float @var{x}, float @var{y})
4260bc74
UD
2164@comment math.h
2165@comment ISO
fe0ec73e 2166@deftypefunx {long double} fdiml (long double @var{x}, long double @var{y})
b719dafd 2167@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
7a68c94a
UD
2168The @code{fdim} function returns the positive difference between
2169@var{x} and @var{y}. The positive difference is @math{@var{x} -
2170@var{y}} if @var{x} is greater than @var{y}, and @math{0} otherwise.
fe0ec73e 2171
7a68c94a 2172If @var{x}, @var{y}, or both are NaN, NaN is returned.
fe0ec73e
UD
2173@end deftypefun
2174
2175@comment math.h
2176@comment ISO
2177@deftypefun double fma (double @var{x}, double @var{y}, double @var{z})
4260bc74
UD
2178@comment math.h
2179@comment ISO
fe0ec73e 2180@deftypefunx float fmaf (float @var{x}, float @var{y}, float @var{z})
4260bc74
UD
2181@comment math.h
2182@comment ISO
fe0ec73e
UD
2183@deftypefunx {long double} fmal (long double @var{x}, long double @var{y}, long double @var{z})
2184@cindex butterfly
b719dafd 2185@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
7a68c94a
UD
2186The @code{fma} function performs floating-point multiply-add. This is
2187the operation @math{(@var{x} @mul{} @var{y}) + @var{z}}, but the
2188intermediate result is not rounded to the destination type. This can
2189sometimes improve the precision of a calculation.
2190
2191This function was introduced because some processors have a special
2192instruction to perform multiply-add. The C compiler cannot use it
2193directly, because the expression @samp{x*y + z} is defined to round the
2194intermediate result. @code{fma} lets you choose when you want to round
2195only once.
fe0ec73e
UD
2196
2197@vindex FP_FAST_FMA
7a68c94a
UD
2198On processors which do not implement multiply-add in hardware,
2199@code{fma} can be very slow since it must avoid intermediate rounding.
2200@file{math.h} defines the symbols @code{FP_FAST_FMA},
2201@code{FP_FAST_FMAF}, and @code{FP_FAST_FMAL} when the corresponding
2202version of @code{fma} is no slower than the expression @samp{x*y + z}.
1f77f049 2203In @theglibc{}, this always means the operation is implemented in
7a68c94a 2204hardware.
fe0ec73e
UD
2205@end deftypefun
2206
7a68c94a
UD
2207@node Complex Numbers
2208@section Complex Numbers
2209@pindex complex.h
2210@cindex complex numbers
2211
@w{ISO C99} introduces support for complex numbers in C. This is done
with a new type specifier, @code{_Complex}, for which @file{complex.h}
defines the more convenient macro @code{complex}; the name
@code{complex} is therefore available if and only if @file{complex.h}
has been included. There are three complex types, corresponding to
the three real types: @code{float complex}, @code{double complex}, and
@code{long double complex}.
2217
2218To construct complex numbers you need a way to indicate the imaginary
2219part of a number. There is no standard notation for an imaginary
2220floating point constant. Instead, @file{complex.h} defines two macros
2221that can be used to create complex numbers.
2222
2223@deftypevr Macro {const float complex} _Complex_I
2224This macro is a representation of the complex number ``@math{0+1i}''.
2225Multiplying a real floating-point value by @code{_Complex_I} gives a
2226complex number whose value is purely imaginary. You can use this to
2227construct complex constants:
2228
2229@smallexample
2230@math{3.0 + 4.0i} = @code{3.0 + 4.0 * _Complex_I}
2231@end smallexample
2232
2233Note that @code{_Complex_I * _Complex_I} has the value @code{-1}, but
2234the type of that value is @code{complex}.
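
The construction above can be wrapped in a function. This is only a sketch, and @code{make_complex} is a name chosen for this example.

```c
#include <complex.h>

/* Hypothetical helper: build the complex number re + im*i from two
   real values.  */
double complex
make_complex (double re, double im)
{
  return re + im * _Complex_I;
}
```

For finite arguments this gives the expected parts. If @var{im} may be infinite or NaN, the multiplication can disturb the real part, and the @code{CMPLX} macro from @w{ISO C11} is likely a safer choice where it is available.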
2235@end deftypevr
2236
2237@c Put this back in when gcc supports _Imaginary_I. It's too confusing.
2238@ignore
2239@noindent
Without an optimizing compiler this is more expensive than the use of
@code{_Imaginary_I}, but it is better than nothing. You can avoid all
the hassles if you use the @code{I} macro below, provided the name is
not a problem.
2244
2245@deftypevr Macro {const float imaginary} _Imaginary_I
2246This macro is a representation of the value ``@math{1i}''. I.e., it is
2247the value for which
2248
2249@smallexample
2250_Imaginary_I * _Imaginary_I = -1
2251@end smallexample
2252
2253@noindent
The result is not of type @code{float imaginary} but instead @code{float}.
One can use it to easily construct complex numbers, as in
2256
2257@smallexample
22583.0 - _Imaginary_I * 4.0
2259@end smallexample
2260
2261@noindent
which results in the complex number with a real part of 3.0 and an
imaginary part of -4.0.
2264@end deftypevr
2265@end ignore
2266
2267@noindent
2268@code{_Complex_I} is a bit of a mouthful. @file{complex.h} also defines
2269a shorter name for the same constant.
2270
2271@deftypevr Macro {const float complex} I
2272This macro has exactly the same value as @code{_Complex_I}. Most of the
2273time it is preferable. However, it causes problems if you want to use
2274the identifier @code{I} for something else. You can safely write
2275
2276@smallexample
2277#include <complex.h>
2278#undef I
2279@end smallexample
2280
2281@noindent
2282if you need @code{I} for your own purposes. (In that case we recommend
2283you also define some other short name for @code{_Complex_I}, such as
2284@code{J}.)
2285
2286@ignore
2287If the implementation does not support the @code{imaginary} types
2288@code{I} is defined as @code{_Complex_I} which is the second best
2289solution. It still can be used in the same way but requires a most
2290clever compiler to get the same results.
2291@end ignore
2292@end deftypevr
2293
2294@node Operations on Complex
2295@section Projections, Conjugates, and Decomposing of Complex Numbers
2296@cindex project complex numbers
2297@cindex conjugate complex numbers
2298@cindex decompose complex numbers
2299@pindex complex.h
2300
ec751a23 2301@w{ISO C99} also defines functions that perform basic operations on
7a68c94a
UD
2302complex numbers, such as decomposition and conjugation. The prototypes
2303for all these functions are in @file{complex.h}. All functions are
2304available in three variants, one for each of the three complex types.
2305
2306@comment complex.h
2307@comment ISO
2308@deftypefun double creal (complex double @var{z})
4260bc74
UD
2309@comment complex.h
2310@comment ISO
7a68c94a 2311@deftypefunx float crealf (complex float @var{z})
4260bc74
UD
2312@comment complex.h
2313@comment ISO
7a68c94a 2314@deftypefunx {long double} creall (complex long double @var{z})
b719dafd 2315@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
7a68c94a
UD
2316These functions return the real part of the complex number @var{z}.
2317@end deftypefun
2318
2319@comment complex.h
2320@comment ISO
2321@deftypefun double cimag (complex double @var{z})
4260bc74
UD
2322@comment complex.h
2323@comment ISO
7a68c94a 2324@deftypefunx float cimagf (complex float @var{z})
4260bc74
UD
2325@comment complex.h
2326@comment ISO
7a68c94a 2327@deftypefunx {long double} cimagl (complex long double @var{z})
b719dafd 2328@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
7a68c94a
UD
2329These functions return the imaginary part of the complex number @var{z}.
2330@end deftypefun
2331
2332@comment complex.h
2333@comment ISO
2334@deftypefun {complex double} conj (complex double @var{z})
4260bc74
UD
2335@comment complex.h
2336@comment ISO
7a68c94a 2337@deftypefunx {complex float} conjf (complex float @var{z})
4260bc74
UD
2338@comment complex.h
2339@comment ISO
7a68c94a 2340@deftypefunx {complex long double} conjl (complex long double @var{z})
b719dafd 2341@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
7a68c94a
UD
2342These functions return the conjugate value of the complex number
2343@var{z}. The conjugate of a complex number has the same real part and a
2344negated imaginary part. In other words, @samp{conj(a + bi) = a + -bi}.
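
As an example of combining these functions, the sketch below computes the squared magnitude of a complex number as the product of the number and its conjugate. The name @code{squared_magnitude} is hypothetical.

```c
#include <complex.h>

/* Hypothetical helper: squared magnitude of z, computed as z times
   its conjugate.  The product is mathematically real, so only the
   real part is kept.  */
double
squared_magnitude (double complex z)
{
  return creal (z * conj (z));
}
```

For @math{3 + 4i} the product is @math{25 + 0i}, so the function returns @code{25.0}.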
2345@end deftypefun
2346
2347@comment complex.h
2348@comment ISO
2349@deftypefun double carg (complex double @var{z})
4260bc74
UD
2350@comment complex.h
2351@comment ISO
7a68c94a 2352@deftypefunx float cargf (complex float @var{z})
4260bc74
UD
2353@comment complex.h
2354@comment ISO
7a68c94a 2355@deftypefunx {long double} cargl (complex long double @var{z})
b719dafd 2356@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
7a68c94a
UD
2357These functions return the argument of the complex number @var{z}.
2358The argument of a complex number is the angle in the complex plane
2359between the positive real axis and a line passing through zero and the
01f49f59
JT
2360number. This angle is measured in the usual fashion and ranges from
2361@math{-@pi{}} to @math{@pi{}}.
7a68c94a 2362
01f49f59 2363@code{carg} has a branch cut along the negative real axis.
7a68c94a
UD
2364@end deftypefun
2365
2366@comment complex.h
2367@comment ISO
2368@deftypefun {complex double} cproj (complex double @var{z})
4260bc74
UD
2369@comment complex.h
2370@comment ISO
7a68c94a 2371@deftypefunx {complex float} cprojf (complex float @var{z})
4260bc74
UD
2372@comment complex.h
2373@comment ISO
7a68c94a 2374@deftypefunx {complex long double} cprojl (complex long double @var{z})
b719dafd 2375@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
7a68c94a 2376These functions return the projection of the complex value @var{z} onto
9dcc8f11 2377the Riemann sphere. Values with an infinite imaginary part are projected
7a68c94a
UD
2378to positive infinity on the real axis, even if the real part is NaN. If
2379the real part is infinite, the result is equivalent to
2380
2381@smallexample
2382INFINITY + I * copysign (0.0, cimag (z))
2383@end smallexample
2384@end deftypefun
fe0ec73e 2385
28f540f4
RM
2386@node Parsing of Numbers
2387@section Parsing of Numbers
2388@cindex parsing numbers (in formatted input)
2389@cindex converting strings to numbers
2390@cindex number syntax, parsing
2391@cindex syntax, for reading numbers
2392
2393This section describes functions for ``reading'' integer and
2394floating-point numbers from a string. It may be more convenient in some
2395cases to use @code{sscanf} or one of the related functions; see
2396@ref{Formatted Input}. But often you can make a program more robust by
2397finding the tokens in the string by hand, then converting the numbers
2398one by one.
2399
2400@menu
2401* Parsing of Integers:: Functions for conversion of integer values.
2402* Parsing of Floats:: Functions for conversion of floating-point
2403 values.
2404@end menu
2405
2406@node Parsing of Integers
2407@subsection Parsing of Integers
2408
2409@pindex stdlib.h
b642f101
UD
2410@pindex wchar.h
2411The @samp{str} functions are declared in @file{stdlib.h} and those
2412beginning with @samp{wcs} are declared in @file{wchar.h}. One might
2413wonder about the use of @code{restrict} in the prototypes of the
2414functions in this section. It is seemingly useless but the @w{ISO C}
2415standard uses it (for the functions defined there) so we have to do it
2416as well.
28f540f4
RM
2417
2418@comment stdlib.h
f65fd747 2419@comment ISO
b642f101 2420@deftypefun {long int} strtol (const char *restrict @var{string}, char **restrict @var{tailptr}, int @var{base})
b719dafd
AO
2421@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
2422@c strtol uses the thread-local pointer to the locale in effect, and
2423@c strtol_l loads the LC_NUMERIC locale data from it early on and once,
2424@c but if the locale is the global locale, and another thread calls
2425@c setlocale in a way that modifies the pointer to the LC_CTYPE locale
2426@c category, the behavior of e.g. IS*, TOUPPER will vary throughout the
2427@c execution of the function, because they re-read the locale data from
2428@c the given locale pointer. We solved this by documenting setlocale as
2429@c MT-Unsafe.
28f540f4
RM
2430The @code{strtol} (``string-to-long'') function converts the initial
2431part of @var{string} to a signed integer, which is returned as a value
b8fe19fa 2432of type @code{long int}.
28f540f4
RM
2433
2434This function attempts to decompose @var{string} as follows:
2435
2436@itemize @bullet
b8fe19fa 2437@item
28f540f4
RM
2438A (possibly empty) sequence of whitespace characters. Which characters
2439are whitespace is determined by the @code{isspace} function
2440(@pxref{Classification of Characters}). These are discarded.
2441
b8fe19fa 2442@item
28f540f4
RM
2443An optional plus or minus sign (@samp{+} or @samp{-}).
2444
b8fe19fa 2445@item
28f540f4
RM
2446A nonempty sequence of digits in the radix specified by @var{base}.
2447
2448If @var{base} is zero, decimal radix is assumed unless the series of
2449digits begins with @samp{0} (specifying octal radix), or @samp{0x} or
2450@samp{0X} (specifying hexadecimal radix); in other words, the same
2451syntax used for integer constants in C.
2452
Otherwise @var{base} must have a value between @code{2} and @code{36}.
If @var{base} is @code{16}, the digits may optionally be preceded by
@samp{0x} or @samp{0X}. If @var{base} has no valid value, the value
returned is @code{0l} and the global variable @code{errno} is set to
@code{EINVAL}.
28f540f4 2457
b8fe19fa 2458@item
28f540f4
RM
2459Any remaining characters in the string. If @var{tailptr} is not a null
2460pointer, @code{strtol} stores a pointer to this tail in
2461@code{*@var{tailptr}}.
2462@end itemize
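
The effect of a @var{base} of zero can be sketched as follows. The wrapper name @code{parse_c_style} is made up for this example, and it deliberately omits error checking.

```c
#include <stdlib.h>

/* Hypothetical helper: parse text using the conventions of C integer
   constants.  A leading "0x" or "0X" selects hexadecimal, a leading
   "0" selects octal, and anything else is decimal.  Errors are not
   checked here.  */
long int
parse_c_style (const char *text)
{
  return strtol (text, NULL, 0);
}
```

With this wrapper, @code{"0x1A"} converts to 26, @code{"012"} to 10, and @code{"12"} to 12.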
2463
2464If the string is empty, contains only whitespace, or does not contain an
2465initial substring that has the expected syntax for an integer in the
2466specified @var{base}, no conversion is performed. In this case,
2467@code{strtol} returns a value of zero and the value stored in
2468@code{*@var{tailptr}} is the value of @var{string}.
2469
2470In a locale other than the standard @code{"C"} locale, this function
2471may recognize additional implementation-dependent syntax.
2472
2473If the string has valid syntax for an integer but the value is not
2474representable because of overflow, @code{strtol} returns either
2475@code{LONG_MAX} or @code{LONG_MIN} (@pxref{Range of Type}), as
2476appropriate for the sign of the value. It also sets @code{errno}
2477to @code{ERANGE} to indicate there was overflow.
2478
7a68c94a
UD
You should not check for errors by examining the return value of
@code{strtol}, because the string might be a valid representation of
@code{0l}, @code{LONG_MAX}, or @code{LONG_MIN}. Instead, check whether
@code{*@var{tailptr}} points to what you expect after the number
(e.g. @code{'\0'} if the string should end after the number). You also
need to clear @code{errno} before the call and check it afterward, in
case there was overflow.
2c6fe0bd 2486
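The checks described above might be combined as in the following sketch. The function @code{parse_long} and its interface are assumptions made for illustration; it requires the whole string to be a single decimal number.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: convert the whole of text to a long int.
   Returns true and stores the value in *result on success; returns
   false if text holds no number, has trailing characters, or the
   value is out of range.  */
bool
parse_long (const char *text, long int *result)
{
  char *tail;

  errno = 0;
  long int value = strtol (text, &tail, 10);

  if (tail == text)      /* No digits were found.  */
    return false;
  if (*tail != '\0')     /* Unexpected characters after the number.  */
    return false;
  if (errno == ERANGE)   /* Overflow; value is LONG_MAX or LONG_MIN.  */
    return false;

  *result = value;
  return true;
}
```
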
28f540f4
RM
2487There is an example at the end of this section.
2488@end deftypefun
2489
b642f101
UD
2490@comment wchar.h
2491@comment ISO
2492@deftypefun {long int} wcstol (const wchar_t *restrict @var{string}, wchar_t **restrict @var{tailptr}, int @var{base})
b719dafd 2493@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
3554743a
AJ
2494The @code{wcstol} function is equivalent to the @code{strtol} function
2495in nearly all aspects but handles wide character strings.
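
A brief sketch of the wide character variant follows. The helper @code{leading_number} is hypothetical and does no error checking.

```c
#include <wchar.h>

/* Hypothetical helper: convert the decimal number at the start of a
   wide string.  If rest is not null, *rest is set to the first
   character after the number.  */
long int
leading_number (const wchar_t *text, const wchar_t **rest)
{
  wchar_t *tail;
  long int value = wcstol (text, &tail, 10);

  if (rest != NULL)
    *rest = tail;
  return value;
}
```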
b642f101
UD
2496
2497The @code{wcstol} function was introduced in @w{Amendment 1} of @w{ISO C90}.
2498@end deftypefun
2499
28f540f4 2500@comment stdlib.h
f65fd747 2501@comment ISO
@deftypefun {unsigned long int} strtoul (const char *restrict @var{string}, char **restrict @var{tailptr}, int @var{base})
b719dafd 2503@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
28f540f4 2504The @code{strtoul} (``string-to-unsigned-long'') function is like
0e4ee106 2505@code{strtol} except it converts to an @code{unsigned long int} value.
7a68c94a 2506The syntax is the same as described above for @code{strtol}. The value
0e4ee106
UD
2507returned on overflow is @code{ULONG_MAX} (@pxref{Range of Type}).
2508
If @var{string} depicts a negative number, @code{strtoul} acts the same
as @code{strtol} but casts the result to an unsigned integer. That means,
for example, that @code{strtoul} on @code{"-1"} returns @code{ULONG_MAX}
and an input more negative than @code{LONG_MIN} returns
(@code{ULONG_MAX} + 1) / 2.
7a68c94a
UD
2514
@code{strtoul} sets @code{errno} to @code{EINVAL} if @var{base} is out of
range, or @code{ERANGE} on overflow.
2c6fe0bd
UD
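
Since a negative input silently wraps around, a program that wants to reject such input has to look for the sign itself. The sketch below shows one possible approach. The name @code{parse_unsigned} and the decision to treat a minus sign as an error are assumptions of this example, not behavior of the library.

```c
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: like strtoul in base 10, but reject input
   with a leading minus sign instead of letting the value wrap.  */
bool
parse_unsigned (const char *text, unsigned long int *result)
{
  char *tail;

  /* Skip the whitespace strtoul would skip, so the sign is visible.  */
  while (isspace ((unsigned char) *text))
    text++;
  if (*text == '-')
    return false;

  errno = 0;
  unsigned long int value = strtoul (text, &tail, 10);
  if (tail == text || errno == ERANGE)
    return false;

  *result = value;
  return true;
}
```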
2517@end deftypefun
2518
b642f101
UD
2519@comment wchar.h
2520@comment ISO
2521@deftypefun {unsigned long int} wcstoul (const wchar_t *restrict @var{string}, wchar_t **restrict @var{tailptr}, int @var{base})
b719dafd 2522@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
3554743a
AJ
2523The @code{wcstoul} function is equivalent to the @code{strtoul} function
2524in nearly all aspects but handles wide character strings.
b642f101
UD
2525
2526The @code{wcstoul} function was introduced in @w{Amendment 1} of @w{ISO C90}.
2527@end deftypefun
2528
2c6fe0bd 2529@comment stdlib.h
7a68c94a 2530@comment ISO
b642f101 2531@deftypefun {long long int} strtoll (const char *restrict @var{string}, char **restrict @var{tailptr}, int @var{base})
b719dafd 2532@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
7a68c94a
UD
2533The @code{strtoll} function is like @code{strtol} except that it returns
2534a @code{long long int} value, and accepts numbers with a correspondingly
2535larger range.
2c6fe0bd
UD
2536
2537If the string has valid syntax for an integer but the value is not
fe7bdd63 2538representable because of overflow, @code{strtoll} returns either
7bb764bc 2539@code{LLONG_MAX} or @code{LLONG_MIN} (@pxref{Range of Type}), as
2c6fe0bd
UD
2540appropriate for the sign of the value. It also sets @code{errno} to
2541@code{ERANGE} to indicate there was overflow.
2c6fe0bd 2542
ec751a23 2543The @code{strtoll} function was introduced in @w{ISO C99}.
2c6fe0bd
UD
2544@end deftypefun
2545
b642f101
UD
2546@comment wchar.h
2547@comment ISO
2548@deftypefun {long long int} wcstoll (const wchar_t *restrict @var{string}, wchar_t **restrict @var{tailptr}, int @var{base})
b719dafd 2549@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
3554743a
AJ
2550The @code{wcstoll} function is equivalent to the @code{strtoll} function
2551in nearly all aspects but handles wide character strings.
b642f101
UD
2552
2553The @code{wcstoll} function was introduced in @w{Amendment 1} of @w{ISO C90}.
2554@end deftypefun
2555
2c6fe0bd
UD
2556@comment stdlib.h
2557@comment BSD
b642f101 2558@deftypefun {long long int} strtoq (const char *restrict @var{string}, char **restrict @var{tailptr}, int @var{base})
b719dafd 2559@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
7a68c94a 2560@code{strtoq} (``string-to-quad-word'') is the BSD name for @code{strtoll}.
2c6fe0bd
UD
2561@end deftypefun
2562
b642f101
UD
2563@comment wchar.h
2564@comment GNU
2565@deftypefun {long long int} wcstoq (const wchar_t *restrict @var{string}, wchar_t **restrict @var{tailptr}, int @var{base})
b719dafd 2566@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
3554743a
AJ
2567The @code{wcstoq} function is equivalent to the @code{strtoq} function
2568in nearly all aspects but handles wide character strings.
b642f101
UD
2569
2570The @code{wcstoq} function is a GNU extension.
2571@end deftypefun
2572
2c6fe0bd 2573@comment stdlib.h
7a68c94a 2574@comment ISO
b642f101 2575@deftypefun {unsigned long long int} strtoull (const char *restrict @var{string}, char **restrict @var{tailptr}, int @var{base})
b719dafd 2576@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
0e4ee106
UD
2577The @code{strtoull} function is related to @code{strtoll} the same way
2578@code{strtoul} is related to @code{strtol}.
fe7bdd63 2579
ec751a23 2580The @code{strtoull} function was introduced in @w{ISO C99}.
fe7bdd63
UD
2581@end deftypefun
2582
b642f101
UD
2583@comment wchar.h
2584@comment ISO
2585@deftypefun {unsigned long long int} wcstoull (const wchar_t *restrict @var{string}, wchar_t **restrict @var{tailptr}, int @var{base})
b719dafd 2586@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
3554743a
AJ
2587The @code{wcstoull} function is equivalent to the @code{strtoull} function
2588in nearly all aspects but handles wide character strings.
b642f101
UD
2589
2590The @code{wcstoull} function was introduced in @w{Amendment 1} of @w{ISO C90}.
2591@end deftypefun
2592
fe7bdd63
UD
2593@comment stdlib.h
2594@comment BSD
b642f101 2595@deftypefun {unsigned long long int} strtouq (const char *restrict @var{string}, char **restrict @var{tailptr}, int @var{base})
b719dafd 2596@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
7a68c94a 2597@code{strtouq} is the BSD name for @code{strtoull}.
28f540f4
RM
2598@end deftypefun
2599
b642f101
UD
2600@comment wchar.h
2601@comment GNU
2602@deftypefun {unsigned long long int} wcstouq (const wchar_t *restrict @var{string}, wchar_t **restrict @var{tailptr}, int @var{base})
b719dafd 2603@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
3554743a
AJ
2604The @code{wcstouq} function is equivalent to the @code{strtouq} function
2605in nearly all aspects but handles wide character strings.
b642f101 2606
f5708cb0 2607The @code{wcstouq} function is a GNU extension.
b642f101
UD
2608@end deftypefun
2609
0e4ee106 2610@comment inttypes.h
b642f101
UD
2611@comment ISO
2612@deftypefun intmax_t strtoimax (const char *restrict @var{string}, char **restrict @var{tailptr}, int @var{base})
b719dafd 2613@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
0e4ee106
UD
The @code{strtoimax} function is like @code{strtol} except that it returns
an @code{intmax_t} value, and accepts numbers of a corresponding range.
2616
2617If the string has valid syntax for an integer but the value is not
2618representable because of overflow, @code{strtoimax} returns either
2619@code{INTMAX_MAX} or @code{INTMAX_MIN} (@pxref{Integers}), as
2620appropriate for the sign of the value. It also sets @code{errno} to
2621@code{ERANGE} to indicate there was overflow.
2622
b642f101
UD
2623See @ref{Integers} for a description of the @code{intmax_t} type. The
2624@code{strtoimax} function was introduced in @w{ISO C99}.
2625@end deftypefun
0e4ee106 2626
b642f101
UD
2627@comment wchar.h
2628@comment ISO
2629@deftypefun intmax_t wcstoimax (const wchar_t *restrict @var{string}, wchar_t **restrict @var{tailptr}, int @var{base})
b719dafd 2630@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
3554743a
AJ
2631The @code{wcstoimax} function is equivalent to the @code{strtoimax} function
2632in nearly all aspects but handles wide character strings.
0e4ee106 2633
b642f101 2634The @code{wcstoimax} function was introduced in @w{ISO C99}.
0e4ee106
UD
2635@end deftypefun
2636
2637@comment inttypes.h
b642f101
UD
2638@comment ISO
2639@deftypefun uintmax_t strtoumax (const char *restrict @var{string}, char **restrict @var{tailptr}, int @var{base})
b719dafd 2640@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
0e4ee106
UD
2641The @code{strtoumax} function is related to @code{strtoimax}
2642the same way that @code{strtoul} is related to @code{strtol}.
2643
b642f101
UD
See @ref{Integers} for a description of the @code{uintmax_t} type. The
2645@code{strtoumax} function was introduced in @w{ISO C99}.
2646@end deftypefun
0e4ee106 2647
b642f101
UD
2648@comment wchar.h
2649@comment ISO
2650@deftypefun uintmax_t wcstoumax (const wchar_t *restrict @var{string}, wchar_t **restrict @var{tailptr}, int @var{base})
b719dafd 2651@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
3554743a
AJ
2652The @code{wcstoumax} function is equivalent to the @code{strtoumax} function
2653in nearly all aspects but handles wide character strings.
b642f101
UD
2654
2655The @code{wcstoumax} function was introduced in @w{ISO C99}.
0e4ee106
UD
2656@end deftypefun
2657
28f540f4 2658@comment stdlib.h
f65fd747 2659@comment ISO
28f540f4 2660@deftypefun {long int} atol (const char *@var{string})
b719dafd 2661@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
28f540f4
RM
2662This function is similar to the @code{strtol} function with a @var{base}
2663argument of @code{10}, except that it need not detect overflow errors.
2664The @code{atol} function is provided mostly for compatibility with
2665existing code; using @code{strtol} is more robust.
2666@end deftypefun
2667
2668@comment stdlib.h
f65fd747 2669@comment ISO
28f540f4 2670@deftypefun int atoi (const char *@var{string})
b719dafd 2671@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
7a68c94a
UD
2672This function is like @code{atol}, except that it returns an @code{int}.
2673The @code{atoi} function is also considered obsolete; use @code{strtol}
2674instead.
28f540f4
RM
2675@end deftypefun
2676
fe7bdd63 2677@comment stdlib.h
7a68c94a 2678@comment ISO
fe7bdd63 2679@deftypefun {long long int} atoll (const char *@var{string})
b719dafd 2680@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
fe7bdd63 2681This function is similar to @code{atol}, except it returns a @code{long
7a68c94a 2682long int}.
fe7bdd63 2683
ec751a23 2684The @code{atoll} function was introduced in @w{ISO C99}. It too is
7a68c94a 2685obsolete (despite having just been added); use @code{strtoll} instead.
fe7bdd63
UD
2686@end deftypefun
2687
b642f101
UD
None of the functions mentioned in this section so far handles
alternative representations of characters as described in the locale
data. Some locales specify a thousands separator and the way it has to
be used, which can help to make large numbers more readable. To read
such numbers one has to use the @code{scanf} functions with the @samp{'}
flag.
2c6fe0bd 2694
28f540f4
RM
2695Here is a function which parses a string as a sequence of integers and
2696returns the sum of them:
2697
@smallexample
#include <ctype.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

int
sum_ints_from_string (char *string)
@{
  int sum = 0;

  while (1) @{
    char *tail;
    long int next;

    /* @r{Skip whitespace by hand, to detect the end.}  */
    while (isspace ((unsigned char) *string)) string++;
    if (*string == 0)
      break;

    /* @r{There is more nonwhitespace,}  */
    /* @r{so it ought to be another number.}  */
    errno = 0;
    /* @r{Parse it.}  */
    next = strtol (string, &tail, 0);
    /* @r{Stop if nothing was parsed, to avoid looping forever.}  */
    if (tail == string)
      break;
    /* @r{Add it in, if not overflow.}  */
    if (errno)
      printf ("Overflow\n");
    else
      sum += next;
    /* @r{Advance past it.}  */
    string = tail;
  @}

  return sum;
@}
@end smallexample
2730
2731@node Parsing of Floats
2732@subsection Parsing of Floats
2733
2734@pindex stdlib.h
b642f101
UD
2735The @samp{str} functions are declared in @file{stdlib.h} and those
2736beginning with @samp{wcs} are declared in @file{wchar.h}. One might
2737wonder about the use of @code{restrict} in the prototypes of the
2738functions in this section. It is seemingly useless but the @w{ISO C}
2739standard uses it (for the functions defined there) so we have to do it
2740as well.
28f540f4
RM
2741
2742@comment stdlib.h
f65fd747 2743@comment ISO
b642f101 2744@deftypefun double strtod (const char *restrict @var{string}, char **restrict @var{tailptr})
b719dafd
AO
2745@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
2746@c Besides the unsafe-but-ruled-safe locale uses, this uses a lot of
2747@c mpn, but it's all safe.
2748@c
2749@c round_and_return
2750@c get_rounding_mode ok
2751@c mpn_add_1 ok
2752@c mpn_rshift ok
2753@c MPN_ZERO ok
2754@c MPN2FLOAT -> mpn_construct_(float|double|long_double) ok
2755@c str_to_mpn
2756@c mpn_mul_1 -> umul_ppmm ok
2757@c mpn_add_1 ok
2758@c mpn_lshift_1 -> mpn_lshift ok
2759@c STRTOF_INTERNAL
2760@c MPN_VAR ok
2761@c SET_MANTISSA ok
2762@c STRNCASECMP ok, wide and narrow
2763@c round_and_return ok
2764@c mpn_mul ok
2765@c mpn_addmul_1 ok
2766@c ... mpn_sub
2767@c mpn_lshift ok
2768@c udiv_qrnnd ok
2769@c count_leading_zeros ok
2770@c add_ssaaaa ok
2771@c sub_ddmmss ok
2772@c umul_ppmm ok
2773@c mpn_submul_1 ok
28f540f4
RM
2774The @code{strtod} (``string-to-double'') function converts the initial
2775part of @var{string} to a floating-point number, which is returned as a
b8fe19fa 2776value of type @code{double}.
28f540f4
RM
2777
2778This function attempts to decompose @var{string} as follows:
2779
2780@itemize @bullet
b8fe19fa 2781@item
28f540f4
RM
2782A (possibly empty) sequence of whitespace characters. Which characters
2783are whitespace is determined by the @code{isspace} function
2784(@pxref{Classification of Characters}). These are discarded.
2785
2786@item
2787An optional plus or minus sign (@samp{+} or @samp{-}).
2788
0c34b1e9
UD
2789@item A floating point number in decimal or hexadecimal format. The
2790decimal format is:
2791@itemize @minus
2792
28f540f4
RM
2793@item
2794A nonempty sequence of digits optionally containing a decimal-point
2795character---normally @samp{.}, but it depends on the locale
85c165be 2796(@pxref{General Numeric}).
28f540f4
RM
2797
2798@item
2799An optional exponent part, consisting of a character @samp{e} or
2800@samp{E}, an optional sign, and a sequence of digits.
2801
0c34b1e9
UD
2802@end itemize
2803
2804The hexadecimal format is as follows:
2805@itemize @minus
2806
2807@item
2808A 0x or 0X followed by a nonempty sequence of hexadecimal digits
2809optionally containing a decimal-point character---normally @samp{.}, but
2810it depends on the locale (@pxref{General Numeric}).
2811
2812@item
2813An optional binary-exponent part, consisting of a character @samp{p} or
2814@samp{P}, an optional sign, and a sequence of digits.
2815
2816@end itemize
2817
28f540f4
RM
2818@item
2819Any remaining characters in the string. If @var{tailptr} is not a null
2820pointer, a pointer to this tail of the string is stored in
2821@code{*@var{tailptr}}.
2822@end itemize
2823
2824If the string is empty, contains only whitespace, or does not contain an
2825initial substring that has the expected syntax for a floating-point
2826number, no conversion is performed. In this case, @code{strtod} returns
2827a value of zero and the value returned in @code{*@var{tailptr}} is the
2828value of @var{string}.
2829
26761c28 2830In a locale other than the standard @code{"C"} or @code{"POSIX"} locales,
2c6fe0bd 2831this function may recognize additional locale-dependent syntax.
28f540f4
RM
2832
2833If the string has valid syntax for a floating-point number but the value
7a68c94a
UD
2834is outside the range of a @code{double}, @code{strtod} will signal
2835overflow or underflow as described in @ref{Math Error Reporting}.
2836
2837@code{strtod} recognizes four special input strings. The strings
2838@code{"inf"} and @code{"infinity"} are converted to @math{@infinity{}},
2839or to the largest representable value if the floating-point format
2840doesn't support infinities. You can prepend a @code{"+"} or @code{"-"}
2841to specify the sign. Case is ignored when scanning these strings.
2842
95fdc6a0
UD
2843The strings @code{"nan"} and @code{"nan(@var{chars@dots{}})"} are converted
2844to NaN. Again, case is ignored. If @var{chars@dots{}} are provided, they
7a68c94a
UD
2845are used in some unspecified fashion to select a particular
2846representation of NaN (there can be several).
2847
Since zero is a valid result as well as the value returned on error, you
should check for errors in the same way as for @code{strtol}, by
examining @code{errno} and @var{tailptr}.
28f540f4
RM
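A checked conversion along these lines might look like the sketch below. The helper @code{parse_double} is hypothetical; it assumes the whole string must be one number and that the decimal-point character is the one expected by the current locale.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: convert the whole of text to a double.
   Returns true and stores the value in *result on success; returns
   false if text holds no number, has trailing characters, or the
   conversion overflowed or underflowed.  */
bool
parse_double (const char *text, double *result)
{
  char *tail;

  errno = 0;
  double value = strtod (text, &tail);

  if (tail == text || *tail != '\0')   /* Nothing parsed, or junk.  */
    return false;
  if (errno == ERANGE)                 /* Overflow or underflow.  */
    return false;

  *result = value;
  return true;
}
```

Decimal input such as @code{"2.5e2"}, hexadecimal input such as @code{"0x1.8p1"}, and the special string @code{"-inf"} are all accepted by this sketch, while @code{"1e999"} is rejected as out of range for the usual IEEE 754 @code{double} format.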
2851@end deftypefun
2852
2c6fe0bd 2853@comment stdlib.h
ec751a23 2854@comment ISO
2c6fe0bd 2855@deftypefun float strtof (const char *@var{string}, char **@var{tailptr})
4260bc74 2856@comment stdlib.h
ec751a23 2857@comment ISO
7a68c94a 2858@deftypefunx {long double} strtold (const char *@var{string}, char **@var{tailptr})
b719dafd 2859@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
7a68c94a
UD
2860These functions are analogous to @code{strtod}, but return @code{float}
2861and @code{long double} values respectively. They report errors in the
2862same way as @code{strtod}. @code{strtof} can be substantially faster
2863than @code{strtod}, but has less precision; conversely, @code{strtold}
2864can be much slower but has more precision (on systems where @code{long
2865double} is a separate type).
2866
These functions were GNU extensions and are now part of @w{ISO C99}.
2c6fe0bd
UD
2868@end deftypefun
2869
b642f101
UD
2870@comment wchar.h
2871@comment ISO
2872@deftypefun double wcstod (const wchar_t *restrict @var{string}, wchar_t **restrict @var{tailptr})
@comment wchar.h
2874@comment ISO
2875@deftypefunx float wcstof (const wchar_t *@var{string}, wchar_t **@var{tailptr})
@comment wchar.h
2877@comment ISO
2878@deftypefunx {long double} wcstold (const wchar_t *@var{string}, wchar_t **@var{tailptr})
b719dafd 2879@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
b642f101
UD
The @code{wcstod}, @code{wcstof}, and @code{wcstold} functions are
equivalent in nearly all aspects to the @code{strtod}, @code{strtof}, and
@code{strtold} functions, but they handle wide character strings.
2883
2884The @code{wcstod} function was introduced in @w{Amendment 1} of @w{ISO
2885C90}. The @code{wcstof} and @code{wcstold} functions were introduced in
2886@w{ISO C99}.
2887@end deftypefun
2888
28f540f4 2889@comment stdlib.h
f65fd747 2890@comment ISO
28f540f4 2891@deftypefun double atof (const char *@var{string})
b719dafd 2892@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
28f540f4
RM
2893This function is similar to the @code{strtod} function, except that it
2894need not detect overflow and underflow errors. The @code{atof} function
2895is provided mostly for compatibility with existing code; using
2896@code{strtod} is more robust.
2897@end deftypefun
880f421f 2898
1f77f049 2899@Theglibc{} also provides @samp{_l} versions of these functions,
7a68c94a 2900which take an additional argument, the locale to use in conversion.
aa04af00
AM
2901
2902See also @ref{Parsing of Integers}.
880f421f 2903
6962682f
GG
2904@node Printing of Floats
2905@section Printing of Floats
2906
2907@pindex stdlib.h
2908The @samp{strfrom} functions are declared in @file{stdlib.h}.
2909
2910@comment stdlib.h
2911@comment ISO/IEC TS 18661-1
2912@deftypefun int strfromd (char *restrict @var{string}, size_t @var{size}, const char *restrict @var{format}, double @var{value})
2913@deftypefunx int strfromf (char *restrict @var{string}, size_t @var{size}, const char *restrict @var{format}, float @var{value})
2914@deftypefunx int strfroml (char *restrict @var{string}, size_t @var{size}, const char *restrict @var{format}, long double @var{value})
2915@safety{@prelim{}@mtsafe{@mtslocale{}}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}}
2916@comment these functions depend on __printf_fp and __printf_fphex, which are
2917@comment AS-unsafe (ascuheap) and AC-unsafe (acsmem).
2918The functions @code{strfromd} (``string-from-double''), @code{strfromf}
2919(``string-from-float''), and @code{strfroml} (``string-from-long-double'')
2920convert the floating-point number @var{value} to a string of characters and
2921store it in the area pointed to by @var{string}. The conversion
2922writes at most @var{size} characters, including the terminating null
2923character, and respects the format specified by @var{format}.
2924
2925The format string must start with the character @samp{%}. An optional
2926precision follows, which starts with a period, @samp{.}, and may be
2927followed by a decimal integer, representing the precision. If a decimal
2928integer is not specified after the period, the precision is taken to be
2929zero. The character @samp{*} is not allowed. Finally, the format string
2930ends with one of the following conversion specifiers: @samp{a}, @samp{A},
2931@samp{e}, @samp{E}, @samp{f}, @samp{F}, @samp{g} or @samp{G} (@pxref{Table
2932of Output Conversions}). Invalid format strings result in undefined
2933behavior.
2934
2935These functions return the number of characters that would have been
2936written to @var{string} had @var{size} been sufficiently large, not
2937counting the terminating null character. Thus, the null-terminated output
2938has been completely written if and only if the returned value is less than
2939@var{size}.
2940
2941These functions were introduced by ISO/IEC TS 18661-1.
2942@end deftypefun
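The return-value rule above can be turned into a truncation check.  The
following sketch assumes that defining @code{_GNU_SOURCE} before the
first include is enough to make @file{stdlib.h} declare @code{strfromd};
the helper name @code{format_fixed2} is hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE 1   /* Assumption: exposes strfromd in stdlib.h.  */
#endif
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: format VALUE with two digits after the
   decimal point.  Returns 1 if the complete null-terminated string
   fit into BUF, and 0 if it was truncated or an error occurred.  */
int
format_fixed2 (char *buf, size_t size, double value)
{
  int needed;

  assert (buf != NULL && size > 0);

  /* NEEDED counts the characters of the full result, not counting
     the terminating null character.  */
  needed = strfromd (buf, size, "%.2f", value);
  return needed >= 0 && (size_t) needed < size;
}
```

When the helper returns 0, the caller can allocate a larger buffer and
call it again.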
2943
7a68c94a
UD
2944@node System V Number Conversion
2945@section Old-fashioned System V number-to-string functions
880f421f 2946
7a68c94a 2947The old @w{System V} C library provided three functions to convert
1f77f049
JM
2948numbers to strings, with unusual and hard-to-use semantics. @Theglibc{}
2949also provides these functions and some natural extensions.
880f421f 2950
1f77f049 2951These functions are only available in @theglibc{} and on systems descended
7a68c94a
UD
2952from AT&T Unix. Therefore, unless these functions do precisely what you
2953need, it is better to use @code{sprintf}, which is standard.
880f421f 2954
7a68c94a 2955All these functions are defined in @file{stdlib.h}.
880f421f
UD
2956
2957@comment stdlib.h
2958@comment SVID, Unix98
7a68c94a 2959@deftypefun {char *} ecvt (double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg})
b719dafd 2960@safety{@prelim{}@mtunsafe{@mtasurace{:ecvt}}@asunsafe{}@acsafe{}}
880f421f 2961The function @code{ecvt} converts the floating-point number @var{value}
0ea5db4f 2962to a string with at most @var{ndigit} decimal digits. The
cf822e3c 2963returned string contains no decimal point or sign. The first digit of
0ea5db4f
UD
2964the string is non-zero (unless @var{value} is actually zero) and the
2965last digit is rounded to nearest. @code{*@var{decpt}} is set to the
7a68c94a 2966index in the string of the first digit after the decimal point.
0ea5db4f
UD
2967@code{*@var{neg}} is set to a nonzero value if @var{value} is negative,
2968zero otherwise.
880f421f 2969
67994d6f
UD
2970If @var{ndigit} decimal digits would exceed the precision of a
2971@code{double}, @var{ndigit} is reduced to a system-specific value.
2972
880f421f
UD
2973The returned string is statically allocated and overwritten by each call
2974to @code{ecvt}.
2975
0ea5db4f
UD
2976If @var{value} is zero, it is implementation defined whether
2977@code{*@var{decpt}} is @code{0} or @code{1}.
880f421f 2978
0ea5db4f
UD
2979For example: @code{ecvt (12.3, 5, &d, &n)} returns @code{"12300"}
2980and sets @var{d} to @code{2} and @var{n} to @code{0}.
880f421f
UD
2981@end deftypefun
2982
880f421f
UD
2983@comment stdlib.h
2984@comment SVID, Unix98
0ea5db4f 2985@deftypefun {char *} fcvt (double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg})
b719dafd 2986@safety{@prelim{}@mtunsafe{@mtasurace{:fcvt}}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}}
7a68c94a
UD
2987The function @code{fcvt} is like @code{ecvt}, but @var{ndigit} specifies
2988the number of digits after the decimal point. If @var{ndigit} is less
2989than zero, @var{value} is rounded to the @math{(-@var{ndigit})+1}'th place to the
2990left of the decimal point. For example, if @var{ndigit} is @code{-1},
2991@var{value} will be rounded to the nearest 10. If @var{ndigit} is
2992negative and its magnitude exceeds the number of digits to the left of the decimal
2993point in @var{value}, @var{value} will be rounded to one significant digit.
880f421f 2994
67994d6f
UD
2995If @var{ndigit} decimal digits would exceed the precision of a
2996@code{double}, @var{ndigit} is reduced to a system-specific value.
2997
880f421f
UD
2998The returned string is statically allocated and overwritten by each call
2999to @code{fcvt}.
880f421f
UD
3000@end deftypefun
3001
3002@comment stdlib.h
3003@comment SVID, Unix98
3004@deftypefun {char *} gcvt (double @var{value}, int @var{ndigit}, char *@var{buf})
b719dafd
AO
3005@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
3006@c gcvt calls sprintf, that ultimately calls vfprintf, which malloc()s
3007@c args_value if it's too large, but gcvt never exercises this path.
7a68c94a
UD
3008@code{gcvt} is functionally equivalent to @samp{sprintf(buf, "%.*g",
3009ndigit, value)}. It is provided only for compatibility's sake. It
3010returns @var{buf}.
67994d6f
UD
3011
3012If @var{ndigit} decimal digits would exceed the precision of a
3013@code{double}, @var{ndigit} is reduced to a system-specific value.
880f421f
UD
3014@end deftypefun
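The relationship between @code{gcvt} and the @samp{g} conversion can be
checked directly.  The sketch below assumes that @var{ndigit} acts as the
precision of the conversion (that is, @code{"%.*g"}) and that
@var{ndigit} stays within the precision of a @code{double}; the helper
name is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return 1 if gcvt and snprintf with "%.*g"
   produce the same string for VALUE, and 0 otherwise.  NDIGIT is
   limited to 17 here so that it is never reduced by the library.  */
int
gcvt_matches_snprintf (double value, int ndigit)
{
  char from_gcvt[64];
  char from_printf[64];

  assert (ndigit > 0 && ndigit <= 17);

  gcvt (value, ndigit, from_gcvt);
  snprintf (from_printf, sizeof from_printf, "%.*g", ndigit, value);
  return strcmp (from_gcvt, from_printf) == 0;
}
```

In new code, calling @code{snprintf} directly is preferable, because it
also takes the size of the buffer.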
3015
1f77f049 3016As extensions, @theglibc{} provides versions of these three
7a68c94a 3017functions that take @code{long double} arguments.
880f421f
UD
3018
3019@comment stdlib.h
3020@comment GNU
7a68c94a 3021@deftypefun {char *} qecvt (long double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg})
b719dafd 3022@safety{@prelim{}@mtunsafe{@mtasurace{:qecvt}}@asunsafe{}@acsafe{}}
67994d6f
UD
3023This function is equivalent to @code{ecvt} except that it takes a
3024@code{long double} for the first parameter and that @var{ndigit} is
3025restricted by the precision of a @code{long double}.
880f421f
UD
3026@end deftypefun
3027
3028@comment stdlib.h
3029@comment GNU
0ea5db4f 3030@deftypefun {char *} qfcvt (long double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg})
b719dafd 3031@safety{@prelim{}@mtunsafe{@mtasurace{:qfcvt}}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}}
7a68c94a 3032This function is equivalent to @code{fcvt} except that it
67994d6f
UD
3033takes a @code{long double} for the first parameter and that @var{ndigit} is
3034restricted by the precision of a @code{long double}.
880f421f
UD
3035@end deftypefun
3036
3037@comment stdlib.h
3038@comment GNU
3039@deftypefun {char *} qgcvt (long double @var{value}, int @var{ndigit}, char *@var{buf})
b719dafd 3040@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
67994d6f
UD
3041This function is equivalent to @code{gcvt} except that it takes a
3042@code{long double} for the first parameter and that @var{ndigit} is
3043restricted by the precision of a @code{long double}.
880f421f
UD
3044@end deftypefun
3045
3046
3047@cindex gcvt_r
7a68c94a
UD
3048The @code{ecvt} and @code{fcvt} functions, and their @code{long double}
3049equivalents, all return a string located in a static buffer which is
1f77f049 3050overwritten by the next call to the function. @Theglibc{}
7a68c94a
UD
3051provides another set of extended functions which write the converted
3052string into a user-supplied buffer. These have the conventional
3053@code{_r} suffix.
3054
3055@code{gcvt_r} is not necessary, because @code{gcvt} already uses a
3056user-supplied buffer.
880f421f
UD
3057
3058@comment stdlib.h
3059@comment GNU
5c1c368f 3060@deftypefun int ecvt_r (double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg}, char *@var{buf}, size_t @var{len})
b719dafd 3061@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
7a68c94a
UD
3062The @code{ecvt_r} function is the same as @code{ecvt}, except
3063that it places its result into the user-specified buffer pointed to by
5c1c368f
UD
3064@var{buf}, with length @var{len}. The return value is @code{-1} in
3065case of an error and zero otherwise.
880f421f 3066
7a68c94a 3067This function is a GNU extension.
880f421f
UD
3068@end deftypefun
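With a caller-supplied buffer, two results can be held at the same time
without one overwriting the other.  The following is a minimal sketch;
the helper name @code{digits_of} and its return convention are
assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: write the NDIGIT leading decimal digits of
   VALUE into BUF using ecvt_r.  Returns -1 on error; otherwise
   returns 1 if VALUE is negative and 0 if it is not.  */
int
digits_of (double value, int ndigit, char *buf, size_t len, int *decpt)
{
  int neg;

  assert (buf != NULL && decpt != NULL);

  if (ecvt_r (value, ndigit, decpt, &neg, buf, len) != 0)
    return -1;
  return neg ? 1 : 0;
}
```

Calling the helper once for @code{12.3} and once for @code{-0.5}, each
with its own buffer, leaves both digit strings intact afterwards.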
3069
3070@comment stdlib.h
3071@comment GNU
5c1c368f 3072@deftypefun int fcvt_r (double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg}, char *@var{buf}, size_t @var{len})
b719dafd 3073@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
5c1c368f
UD
3074The @code{fcvt_r} function is the same as @code{fcvt}, except that it
3075places its result into the user-specified buffer pointed to by
3076@var{buf}, with length @var{len}. The return value is @code{-1} in
3077case of an error and zero otherwise.
880f421f 3078
7a68c94a 3079This function is a GNU extension.
880f421f
UD
3080@end deftypefun
3081
3082@comment stdlib.h
3083@comment GNU
5c1c368f 3084@deftypefun int qecvt_r (long double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg}, char *@var{buf}, size_t @var{len})
b719dafd 3085@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
7a68c94a
UD
3086The @code{qecvt_r} function is the same as @code{qecvt}, except
3087that it places its result into the user-specified buffer pointed to by
5c1c368f
UD
3088@var{buf}, with length @var{len}. The return value is @code{-1} in
3089case of an error and zero otherwise.
880f421f 3090
7a68c94a 3091This function is a GNU extension.
880f421f
UD
3092@end deftypefun
3093
3094@comment stdlib.h
3095@comment GNU
5c1c368f 3096@deftypefun int qfcvt_r (long double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg}, char *@var{buf}, size_t @var{len})
b719dafd 3097@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
7a68c94a
UD
3098The @code{qfcvt_r} function is the same as @code{qfcvt}, except
3099that it places its result into the user-specified buffer pointed to by
5c1c368f
UD
3100@var{buf}, with length @var{len}. The return value is @code{-1} in
3101case of an error and zero otherwise.
880f421f 3102
7a68c94a 3103This function is a GNU extension.
880f421f 3104@end deftypefun