Commit | Line | Data |
---|---|---|
28f540f4 | 1 | @node Arithmetic, Date and Time, Mathematics, Top |
7a68c94a UD |
2 | @c %MENU% Low level arithmetic functions |
3 | @chapter Arithmetic Functions | |
28f540f4 RM |
4 | |
5 | This chapter contains information about functions for doing basic | |
6 | arithmetic operations, such as splitting a float into its integer and | |
b4012b75 UD |
7 | fractional parts or retrieving the imaginary part of a complex value. |
8 | These functions are declared in the header files @file{math.h} and | |
9 | @file{complex.h}. | |
28f540f4 RM |
10 | |
11 | @menu | |
0e4ee106 UD |
12 | * Integers:: Basic integer types and concepts |
13 | * Integer Division:: Integer division with guaranteed rounding. | |
7a68c94a UD |
14 | * Floating Point Numbers:: Basic concepts. IEEE 754. |
15 | * Floating Point Classes:: The five kinds of floating-point number. | |
16 | * Floating Point Errors:: When something goes wrong in a calculation. | |
17 | * Rounding:: Controlling how results are rounded. | |
18 | * Control Functions:: Saving and restoring the FPU's state. | |
19 | * Arithmetic Functions:: Fundamental operations provided by the library. | |
20 | * Complex Numbers:: The types. Writing complex constants. | |
21 | * Operations on Complex:: Projection, conjugation, decomposition. | |
7a68c94a | 22 | * Parsing of Numbers:: Converting strings to numbers. |
6962682f | 23 | * Printing of Floats:: Converting floating-point numbers to strings. |
7a68c94a | 24 | * System V Number Conversion:: An archaic way to convert numbers to strings. |
28f540f4 RM |
25 | @end menu |
26 | ||
0e4ee106 UD |
27 | @node Integers |
28 | @section Integers | |
29 | @cindex integer | |
30 | ||
31 | The C language defines several integer data types: integer, short integer, | |
32 | long integer, and character, all in both signed and unsigned varieties. | |
e6e81391 UD |
33 | The GNU C compiler extends the language to contain long long integers |
34 | as well. | |
0e4ee106 UD |
35 | @cindex signedness |
36 | ||
37 | The C integer types were intended to allow code to be portable among | |
38 | machines with different inherent data sizes (word sizes), so each type | |
39 | may have different ranges on different machines. The problem with | |
40 | this is that a program often needs to be written for a particular range | |
41 | of integers, and sometimes must be written for a particular size of | |
42 | storage, regardless of what machine the program runs on. | |
43 | ||
1f77f049 | 44 | To address this problem, @theglibc{} contains C type definitions |
0e4ee106 | 45 | you can use to declare integers that meet your exact needs. Because the |
1f77f049 | 46 | @glibcadj{} header files are customized to a specific machine, your |
0e4ee106 UD |
47 | program source code doesn't have to be. |
48 | ||
49 | These @code{typedef}s are in @file{stdint.h}. | |
50 | @pindex stdint.h | |
51 | ||
52 | If you require that an integer be represented in exactly N bits, use one | |
53 | of the following types, with the obvious mapping to bit size and signedness: | |
54 | ||
68979757 | 55 | @itemize @bullet |
0e4ee106 UD |
56 | @item int8_t |
57 | @item int16_t | |
58 | @item int32_t | |
59 | @item int64_t | |
60 | @item uint8_t | |
61 | @item uint16_t | |
62 | @item uint32_t | |
63 | @item uint64_t | |
64 | @end itemize | |
65 | ||
66 | If your C compiler and target machine do not allow integers of a certain | |
67 | size, the corresponding type above does not exist. | |
68 | ||
69 | If you don't need a specific storage size, but want the smallest data | |
70 | structure with @emph{at least} N bits, use one of these: | |
71 | ||
68979757 | 72 | @itemize @bullet |
150f9fb8 AJ |
73 | @item int_least8_t |
74 | @item int_least16_t | |
75 | @item int_least32_t | |
76 | @item int_least64_t | |
77 | @item uint_least8_t | |
78 | @item uint_least16_t | |
79 | @item uint_least32_t | |
80 | @item uint_least64_t | |
0e4ee106 UD |
81 | @end itemize |
82 | ||
e6e81391 | 83 | If you don't need a specific storage size, but want the data structure |
0e4ee106 UD |
84 | that allows the fastest access while having at least N bits (and |
85 | among data structures with the same access speed, the smallest one), use | |
86 | one of these: | |
87 | ||
68979757 | 88 | @itemize @bullet |
150f9fb8 AJ |
89 | @item int_fast8_t |
90 | @item int_fast16_t | |
91 | @item int_fast32_t | |
92 | @item int_fast64_t | |
93 | @item uint_fast8_t | |
94 | @item uint_fast16_t | |
95 | @item uint_fast32_t | |
96 | @item uint_fast64_t | |
0e4ee106 UD |
97 | @end itemize |
98 | ||
e6e81391 | 99 | If you want an integer with the widest range possible on the platform on |
0e4ee106 UD |
100 | which it is being used, use one of the following. If you use these, |
101 | you should write code that takes into account the variable size and range | |
102 | of the integer. | |
103 | ||
68979757 | 104 | @itemize @bullet |
0e4ee106 UD |
105 | @item intmax_t |
106 | @item uintmax_t | |
107 | @end itemize | |
108 | ||
1f77f049 | 109 | @Theglibc{} also provides macros that tell you the maximum and |
0e4ee106 UD |
110 | minimum possible values for each integer data type. The macro names |
111 | follow these examples: @code{INT32_MAX}, @code{UINT8_MAX}, | |
112 | @code{INT_FAST32_MIN}, @code{INT_LEAST64_MIN}, @code{UINTMAX_MAX}, | |
113 | @code{INTMAX_MAX}, @code{INTMAX_MIN}. Note that there are no macros for | |
5b17fd0d JM |
114 | unsigned integer minima. These are always zero. Similarly, there |
115 | are macros such as @code{INTMAX_WIDTH} for the width of these types. | |
116 | Those macros for integer type widths come from TS 18661-1:2014. | |
0e4ee106 | 117 | @cindex maximum possible integer |
0bc93a2f | 118 | @cindex minimum possible integer |
0e4ee106 UD |
119 | |
120 | There are similar macros for use with C's built-in integer types which | |
121 | should come with your C compiler. These are described in @ref{Data Type | |
122 | Measurements}. | |
123 | ||
124 | Don't forget you can use the C @code{sizeof} operator with any of these | |
125 | data types to get the number of bytes of storage each uses. | |
126 | ||
127 | ||
128 | @node Integer Division | |
129 | @section Integer Division | |
130 | @cindex integer division functions | |
131 | ||
132 | This section describes functions for performing integer division. These | |
133 | functions are redundant when GNU CC is used, because in GNU C the | |
134 | @samp{/} operator always rounds towards zero. But in other C | |
135 | implementations, @samp{/} may round differently with negative arguments. | |
136 | @code{div} and @code{ldiv} are useful because they specify how to round | |
137 | the quotient: towards zero. The remainder has the same sign as the | |
138 | numerator. | |
139 | ||
140 | These functions are specified to return a result @var{r} such that the value | |
141 | @code{@var{r}.quot*@var{denominator} + @var{r}.rem} equals | |
142 | @var{numerator}. | |
143 | ||
144 | @pindex stdlib.h | |
145 | To use these facilities, you should include the header file | |
146 | @file{stdlib.h} in your program. | |
147 | ||
0e4ee106 | 148 | @deftp {Data Type} div_t |
d08a7e4c | 149 | @standards{ISO, stdlib.h} |
0e4ee106 UD |
150 | This is a structure type used to hold the result returned by the @code{div} |
151 | function. It has the following members: | |
152 | ||
153 | @table @code | |
154 | @item int quot | |
155 | The quotient from the division. | |
156 | ||
157 | @item int rem | |
158 | The remainder from the division. | |
159 | @end table | |
160 | @end deftp | |
161 | ||
0e4ee106 | 162 | @deftypefun div_t div (int @var{numerator}, int @var{denominator}) |
d08a7e4c | 163 | @standards{ISO, stdlib.h} |
b719dafd AO |
164 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
165 | @c Functions in this section are pure, and thus safe. | |
e4fd1876 | 166 | The function @code{div} computes the quotient and remainder from |
0e4ee106 UD |
167 | the division of @var{numerator} by @var{denominator}, returning the |
168 | result in a structure of type @code{div_t}. | |
169 | ||
170 | If the result cannot be represented (as in a division by zero), the | |
171 | behavior is undefined. | |
172 | ||
173 | Here is an example, albeit not a very useful one. | |
174 | ||
175 | @smallexample | |
176 | div_t result; | |
177 | result = div (20, -6); | |
178 | @end smallexample | |
179 | ||
180 | @noindent | |
181 | Now @code{result.quot} is @code{-3} and @code{result.rem} is @code{2}. | |
182 | @end deftypefun | |
183 | ||
0e4ee106 | 184 | @deftp {Data Type} ldiv_t |
d08a7e4c | 185 | @standards{ISO, stdlib.h} |
0e4ee106 UD |
186 | This is a structure type used to hold the result returned by the @code{ldiv} |
187 | function. It has the following members: | |
188 | ||
189 | @table @code | |
190 | @item long int quot | |
191 | The quotient from the division. | |
192 | ||
193 | @item long int rem | |
194 | The remainder from the division. | |
195 | @end table | |
196 | ||
197 | (This is identical to @code{div_t} except that the components are of | |
198 | type @code{long int} rather than @code{int}.) | |
199 | @end deftp | |
200 | ||
0e4ee106 | 201 | @deftypefun ldiv_t ldiv (long int @var{numerator}, long int @var{denominator}) |
d08a7e4c | 202 | @standards{ISO, stdlib.h} |
b719dafd | 203 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
0e4ee106 UD |
204 | The @code{ldiv} function is similar to @code{div}, except that the |
205 | arguments are of type @code{long int} and the result is returned as a | |
206 | structure of type @code{ldiv_t}. | |
207 | @end deftypefun | |
208 | ||
0e4ee106 | 209 | @deftp {Data Type} lldiv_t |
d08a7e4c | 210 | @standards{ISO, stdlib.h} |
0e4ee106 UD |
211 | This is a structure type used to hold the result returned by the @code{lldiv} |
212 | function. It has the following members: | |
213 | ||
214 | @table @code | |
215 | @item long long int quot | |
216 | The quotient from the division. | |
217 | ||
218 | @item long long int rem | |
219 | The remainder from the division. | |
220 | @end table | |
221 | ||
222 | (This is identical to @code{div_t} except that the components are of | |
223 | type @code{long long int} rather than @code{int}.) | |
224 | @end deftp | |
225 | ||
0e4ee106 | 226 | @deftypefun lldiv_t lldiv (long long int @var{numerator}, long long int @var{denominator}) |
d08a7e4c | 227 | @standards{ISO, stdlib.h} |
b719dafd | 228 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
0e4ee106 UD |
229 | The @code{lldiv} function is like the @code{div} function, but the |
230 | arguments are of type @code{long long int} and the result is returned as | |
231 | a structure of type @code{lldiv_t}. | |
232 | ||
233 | The @code{lldiv} function was added in @w{ISO C99}. | |
234 | @end deftypefun | |
235 | ||
0e4ee106 | 236 | @deftp {Data Type} imaxdiv_t |
d08a7e4c | 237 | @standards{ISO, inttypes.h} |
0e4ee106 UD |
238 | This is a structure type used to hold the result returned by the @code{imaxdiv} |
239 | function. It has the following members: | |
240 | ||
241 | @table @code | |
242 | @item intmax_t quot | |
243 | The quotient from the division. | |
244 | ||
245 | @item intmax_t rem | |
246 | The remainder from the division. | |
247 | @end table | |
248 | ||
249 | (This is identical to @code{div_t} except that the components are of | |
250 | type @code{intmax_t} rather than @code{int}.) | |
251 | ||
252 | See @ref{Integers} for a description of the @code{intmax_t} type. | |
253 | ||
254 | @end deftp | |
255 | ||
0e4ee106 | 256 | @deftypefun imaxdiv_t imaxdiv (intmax_t @var{numerator}, intmax_t @var{denominator}) |
d08a7e4c | 257 | @standards{ISO, inttypes.h} |
b719dafd | 258 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
0e4ee106 UD |
259 | The @code{imaxdiv} function is like the @code{div} function, but the |
260 | arguments are of type @code{intmax_t} and the result is returned as | |
261 | a structure of type @code{imaxdiv_t}. | |
262 | ||
263 | See @ref{Integers} for a description of the @code{intmax_t} type. | |
264 | ||
265 | The @code{imaxdiv} function was added in @w{ISO C99}. | |
266 | @end deftypefun | |
267 | ||
268 | ||
7a68c94a UD |
269 | @node Floating Point Numbers |
270 | @section Floating Point Numbers | |
271 | @cindex floating point | |
272 | @cindex IEEE 754 | |
b4012b75 UD |
273 | @cindex IEEE floating point |
274 | ||
7a68c94a UD |
275 | Most computer hardware has support for two different kinds of numbers: |
276 | integers (@math{@dots{}-3, -2, -1, 0, 1, 2, 3@dots{}}) and | |
277 | floating-point numbers. Floating-point numbers have three parts: the | |
278 | @dfn{mantissa}, the @dfn{exponent}, and the @dfn{sign bit}. The real | |
279 | number represented by a floating-point value is given by | |
280 | @tex | |
281 | $(s \mathrel? -1 \mathrel: 1) \cdot 2^e \cdot M$ | |
282 | @end tex | |
283 | @ifnottex | |
284 | @math{(s ? -1 : 1) @mul{} 2^e @mul{} M} | |
285 | @end ifnottex | |
286 | where @math{s} is the sign bit, @math{e} the exponent, and @math{M} | |
287 | the mantissa. @xref{Floating Point Concepts}, for details. (It is | |
288 | possible to have a different @dfn{base} for the exponent, but all modern | |
289 | hardware uses @math{2}.) | |
290 | ||
291 | Floating-point numbers can represent a finite subset of the real | |
292 | numbers. While this subset is large enough for most purposes, it is | |
293 | important to remember that the only reals that can be represented | |
294 | exactly are rational numbers that have a terminating binary expansion | |
295 | shorter than the width of the mantissa. Even simple fractions such as | |
296 | @math{1/5} can only be approximated by floating point. | |
297 | ||
298 | Mathematical operations and functions frequently need to produce values | |
299 | that are not representable. Often these values can be approximated | |
300 | closely enough for practical purposes, but sometimes they can't. | |
301 | Historically there was no way to tell when the results of a calculation | |
302 | were inaccurate. Modern computers implement the @w{IEEE 754} standard | |
303 | for numerical computations, which defines a framework for indicating to | |
304 | the program when the results of calculation are not trustworthy. This | |
305 | framework consists of a set of @dfn{exceptions} that indicate why a | |
306 | result could not be represented, and the special values @dfn{infinity} | |
307 | and @dfn{not a number} (NaN). | |
308 | ||
309 | @node Floating Point Classes | |
310 | @section Floating-Point Number Classification Functions | |
311 | @cindex floating-point classes | |
312 | @cindex classes, floating-point | |
313 | @pindex math.h | |
b4012b75 | 314 | |
ec751a23 | 315 | @w{ISO C99} defines macros that let you determine what sort of |
7a68c94a | 316 | floating-point number a variable holds. |
b4012b75 | 317 | |
7a68c94a | 318 | @deftypefn {Macro} int fpclassify (@emph{float-type} @var{x}) |
d08a7e4c | 319 | @standards{ISO, math.h} |
b719dafd | 320 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
7a68c94a UD |
321 | This is a generic macro which works on all floating-point types and |
322 | which returns a value of type @code{int}. The possible values are: | |
28f540f4 | 323 | |
7a68c94a UD |
324 | @vtable @code |
325 | @item FP_NAN | |
1b009d5a | 326 | @standards{C99, math.h} |
7a68c94a UD |
327 | The floating-point number @var{x} is ``Not a Number'' (@pxref{Infinity |
328 | and NaN}). | |
329 | @item FP_INFINITE | |
1b009d5a | 330 | @standards{C99, math.h} |
7a68c94a UD |
331 | The value of @var{x} is either plus or minus infinity (@pxref{Infinity |
332 | and NaN}). | |
333 | @item FP_ZERO | |
1b009d5a | 334 | @standards{C99, math.h} |
7a68c94a UD |
335 | The value of @var{x} is zero. In floating-point formats like @w{IEEE |
336 | 754}, where zero can be signed, this value is also returned if | |
337 | @var{x} is negative zero. | |
338 | @item FP_SUBNORMAL | |
1b009d5a | 339 | @standards{C99, math.h} |
7a68c94a UD |
340 | Numbers whose absolute value is too small to be represented in the |
341 | normal format are represented in an alternate, @dfn{denormalized} format | |
342 | (@pxref{Floating Point Concepts}). This format is less precise but can | |
343 | represent values closer to zero. @code{fpclassify} returns this value | |
344 | for values of @var{x} in this alternate format. | |
345 | @item FP_NORMAL | |
1b009d5a | 346 | @standards{C99, math.h} |
7a68c94a UD |
347 | This value is returned for all other values of @var{x}. It indicates |
348 | that there is nothing special about the number. | |
349 | @end vtable | |
28f540f4 | 350 | |
7a68c94a | 351 | @end deftypefn |
28f540f4 | 352 | |
7a68c94a UD |
353 | @code{fpclassify} is most useful if more than one property of a number |
354 | must be tested. There are more specific macros which only test one | |
355 | property at a time. Generally these macros execute faster than | |
356 | @code{fpclassify}, since there is special hardware support for them. | |
357 | You should therefore use the specific macros whenever possible. | |
28f540f4 | 358 | |
29cb9293 | 359 | @deftypefn {Macro} int iscanonical (@emph{float-type} @var{x}) |
d08a7e4c | 360 | @standards{ISO, math.h} |
29cb9293 JM |
361 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
362 | In some floating-point formats, some values have canonical (preferred) | |
363 | and noncanonical encodings (for IEEE interchange binary formats, all | |
364 | encodings are canonical). This macro returns a nonzero value if | |
365 | @var{x} has a canonical encoding. It is from TS 18661-1:2014. | |
366 | ||
367 | Note that some formats have multiple encodings of a value which are | |
368 | all equally canonical; @code{iscanonical} returns a nonzero value for | |
369 | all such encodings. Also, formats may have encodings that do not | |
370 | correspond to any valid value of the type. In ISO C terms these are | |
371 | @dfn{trap representations}; in @theglibc{}, @code{iscanonical} returns | |
372 | zero for such encodings. | |
373 | @end deftypefn | |
374 | ||
7a68c94a | 375 | @deftypefn {Macro} int isfinite (@emph{float-type} @var{x}) |
d08a7e4c | 376 | @standards{ISO, math.h} |
b719dafd | 377 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
7a68c94a UD |
378 | This macro returns a nonzero value if @var{x} is finite: not plus or |
379 | minus infinity, and not NaN. It is equivalent to | |
fe0ec73e UD |
380 | |
381 | @smallexample | |
7a68c94a | 382 | (fpclassify (x) != FP_NAN && fpclassify (x) != FP_INFINITE) |
fe0ec73e UD |
383 | @end smallexample |
384 | ||
7a68c94a UD |
385 | @code{isfinite} is implemented as a macro which accepts any |
386 | floating-point type. | |
387 | @end deftypefn | |
fe0ec73e | 388 | |
7a68c94a | 389 | @deftypefn {Macro} int isnormal (@emph{float-type} @var{x}) |
d08a7e4c | 390 | @standards{ISO, math.h} |
b719dafd | 391 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
7a68c94a UD |
392 | This macro returns a nonzero value if @var{x} is finite and normalized. |
393 | It is equivalent to | |
b4012b75 UD |
394 | |
395 | @smallexample | |
7a68c94a | 396 | (fpclassify (x) == FP_NORMAL) |
b4012b75 | 397 | @end smallexample |
7a68c94a | 398 | @end deftypefn |
b4012b75 | 399 | |
7a68c94a | 400 | @deftypefn {Macro} int isnan (@emph{float-type} @var{x}) |
d08a7e4c | 401 | @standards{ISO, math.h} |
b719dafd | 402 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
7a68c94a UD |
403 | This macro returns a nonzero value if @var{x} is NaN. It is equivalent |
404 | to | |
b4012b75 UD |
405 | |
406 | @smallexample | |
7a68c94a | 407 | (fpclassify (x) == FP_NAN) |
b4012b75 | 408 | @end smallexample |
7a68c94a | 409 | @end deftypefn |
b4012b75 | 410 | |
57267616 | 411 | @deftypefn {Macro} int issignaling (@emph{float-type} @var{x}) |
d08a7e4c | 412 | @standards{ISO, math.h} |
b719dafd | 413 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
57267616 | 414 | This macro returns a nonzero value if @var{x} is a signaling NaN |
bf91be88 | 415 | (sNaN). It is from TS 18661-1:2014. |
57267616 TS |
416 | @end deftypefn |
417 | ||
d942e95c | 418 | @deftypefn {Macro} int issubnormal (@emph{float-type} @var{x}) |
d08a7e4c | 419 | @standards{ISO, math.h} |
d942e95c JM |
420 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
421 | This macro returns a nonzero value if @var{x} is subnormal. It is | |
422 | from TS 18661-1:2014. | |
423 | @end deftypefn | |
424 | ||
bb8081f5 | 425 | @deftypefn {Macro} int iszero (@emph{float-type} @var{x}) |
d08a7e4c | 426 | @standards{ISO, math.h} |
bb8081f5 JM |
427 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
428 | This macro returns a nonzero value if @var{x} is zero. It is from TS | |
429 | 18661-1:2014. | |
430 | @end deftypefn | |
431 | ||
7a68c94a | 432 | Another set of floating-point classification functions was provided by |
1f77f049 | 433 | BSD. @Theglibc{} also supports these functions; however, we |
ec751a23 | 434 | recommend that you use the ISO C99 macros in new code. Those are standard |
7a68c94a UD |
435 | and will be available more widely. Also, since they are macros, you do |
436 | not have to worry about the type of their argument. | |
28f540f4 | 437 | |
28f540f4 | 438 | @deftypefun int isinf (double @var{x}) |
779ae82e UD |
439 | @deftypefunx int isinff (float @var{x}) |
440 | @deftypefunx int isinfl (long double @var{x}) | |
d08a7e4c | 441 | @standards{BSD, math.h} |
b719dafd | 442 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
28f540f4 RM |
443 | This function returns @code{-1} if @var{x} represents negative infinity, |
444 | @code{1} if @var{x} represents positive infinity, and @code{0} otherwise. | |
445 | @end deftypefun | |
446 | ||
28f540f4 | 447 | @deftypefun int isnan (double @var{x}) |
779ae82e UD |
448 | @deftypefunx int isnanf (float @var{x}) |
449 | @deftypefunx int isnanl (long double @var{x}) | |
d08a7e4c | 450 | @standards{BSD, math.h} |
b719dafd | 451 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
28f540f4 | 452 | This function returns a nonzero value if @var{x} is a ``not a number'' |
7a68c94a | 453 | value, and zero otherwise. |
b9b49b44 | 454 | |
48b22986 | 455 | @strong{NB:} The @code{isnan} macro defined by @w{ISO C99} overrides |
7a68c94a UD |
456 | the BSD function. This is normally not a problem, because the two |
457 | routines behave identically. However, if you really need to get the BSD | |
458 | function for some reason, you can write | |
b9b49b44 | 459 | |
7a68c94a UD |
460 | @smallexample |
461 | (isnan) (x) | |
462 | @end smallexample | |
28f540f4 RM |
463 | @end deftypefun |
464 | ||
28f540f4 | 465 | @deftypefun int finite (double @var{x}) |
779ae82e UD |
466 | @deftypefunx int finitef (float @var{x}) |
467 | @deftypefunx int finitel (long double @var{x}) | |
d08a7e4c | 468 | @standards{BSD, math.h} |
b719dafd | 469 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
e65a5644 AJ |
470 | This function returns a nonzero value if @var{x} is neither infinite nor |
471 | a ``not a number'' value, and zero otherwise. | |
28f540f4 RM |
472 | @end deftypefun |
473 | ||
28f540f4 RM |
474 | @strong{Portability Note:} The functions listed in this section are BSD |
475 | extensions. | |
476 | ||
b4012b75 | 477 | |
7a68c94a UD |
478 | @node Floating Point Errors |
479 | @section Errors in Floating-Point Calculations | |
480 | ||
481 | @menu | |
482 | * FP Exceptions:: IEEE 754 math exceptions and how to detect them. | |
483 | * Infinity and NaN:: Special values returned by calculations. | |
484 | * Status bit operations:: Checking for exceptions after the fact. | |
485 | * Math Error Reporting:: How the math functions report errors. | |
486 | @end menu | |
487 | ||
488 | @node FP Exceptions | |
489 | @subsection FP Exceptions | |
490 | @cindex exception | |
491 | @cindex signal | |
492 | @cindex zero divide | |
493 | @cindex division by zero | |
494 | @cindex inexact exception | |
495 | @cindex invalid exception | |
496 | @cindex overflow exception | |
497 | @cindex underflow exception | |
498 | ||
499 | The @w{IEEE 754} standard defines five @dfn{exceptions} that can occur | |
500 | during a calculation. Each corresponds to a particular sort of error, | |
501 | such as overflow. | |
502 | ||
503 | When exceptions occur (when exceptions are @dfn{raised}, in the language | |
504 | of the standard), one of two things can happen. By default the | |
505 | exception is simply noted in the floating-point @dfn{status word}, and | |
506 | the program continues as if nothing had happened. The operation | |
507 | produces a default value, which depends on the exception (see the table | |
508 | below). Your program can check the status word to find out which | |
509 | exceptions happened. | |
510 | ||
511 | Alternatively, you can enable @dfn{traps} for exceptions. In that case, | |
512 | when an exception is raised, your program will receive the @code{SIGFPE} | |
513 | signal. The default action for this signal is to terminate the | |
8b7fb588 | 514 | program. @xref{Signal Handling}, for how you can change the effect of |
7a68c94a UD |
515 | the signal. |
516 | ||
7a68c94a UD |
517 | @noindent |
518 | The exceptions defined in @w{IEEE 754} are: | |
519 | ||
520 | @table @samp | |
521 | @item Invalid Operation | |
522 | This exception is raised if the given operands are invalid for the | |
523 | operation to be performed. Examples are | |
524 | (see @w{IEEE 754}, @w{section 7}): | |
525 | @enumerate | |
526 | @item | |
527 | Addition or subtraction: @math{@infinity{} - @infinity{}}. (But | |
528 | @math{@infinity{} + @infinity{} = @infinity{}}). | |
529 | @item | |
530 | Multiplication: @math{0 @mul{} @infinity{}}. | |
531 | @item | |
532 | Division: @math{0/0} or @math{@infinity{}/@infinity{}}. | |
533 | @item | |
534 | Remainder: @math{x} REM @math{y}, where @math{y} is zero or @math{x} is | |
535 | infinite. | |
536 | @item | |
e4fd1876 | 537 | Square root if the operand is less than zero. More generally, any |
7a68c94a UD |
538 | mathematical function evaluated outside its domain produces this |
539 | exception. | |
540 | @item | |
541 | Conversion of a floating-point number to an integer or decimal | |
542 | string, when the number cannot be represented in the target format (due | |
543 | to overflow, infinity, or NaN). | |
544 | @item | |
545 | Conversion of an unrecognizable input string. | |
546 | @item | |
547 | Comparison via predicates involving @math{<} or @math{>}, when one or | |
548 | other of the operands is NaN. You can prevent this exception by using | |
549 | the unordered comparison functions instead; see @ref{FP Comparison Functions}. | |
550 | @end enumerate | |
551 | ||
552 | If the exception does not trap, the result of the operation is NaN. | |
553 | ||
554 | @item Division by Zero | |
555 | This exception is raised when a finite nonzero number is divided | |
556 | by zero. If no trap occurs the result is either @math{+@infinity{}} or | |
557 | @math{-@infinity{}}, depending on the signs of the operands. | |
558 | ||
559 | @item Overflow | |
560 | This exception is raised whenever the result cannot be represented | |
561 | as a finite value in the precision format of the destination. If no trap | |
562 | occurs the result depends on the sign of the intermediate result and the | |
563 | current rounding mode (@w{IEEE 754}, @w{section 7.3}): | |
564 | @enumerate | |
565 | @item | |
566 | Round to nearest carries all overflows to @math{@infinity{}} | |
567 | with the sign of the intermediate result. | |
568 | @item | |
569 | Round toward @math{0} carries all overflows to the largest representable | |
570 | finite number with the sign of the intermediate result. | |
571 | @item | |
572 | Round toward @math{-@infinity{}} carries positive overflows to the | |
573 | largest representable finite number and negative overflows to | |
574 | @math{-@infinity{}}. | |
575 | ||
576 | @item | |
577 | Round toward @math{@infinity{}} carries negative overflows to the | |
578 | most negative representable finite number and positive overflows | |
579 | to @math{@infinity{}}. | |
580 | @end enumerate | |
581 | ||
582 | Whenever the overflow exception is raised, the inexact exception is also | |
583 | raised. | |
584 | ||
585 | @item Underflow | |
586 | The underflow exception is raised when an intermediate result is too | |
587 | small to be calculated accurately, or if the operation's result rounded | |
588 | to the destination precision is too small to be normalized. | |
589 | ||
590 | When no trap is installed for the underflow exception, underflow is | |
591 | signaled (via the underflow flag) only when both tininess and loss of | |
592 | accuracy have been detected. If no trap handler is installed the | |
593 | operation continues with an imprecise small value, or zero if the | |
594 | destination precision cannot hold the small exact result. | |
595 | ||
596 | @item Inexact | |
597 | This exception is signaled if a rounded result is not exact (such as | |
598 | when calculating the square root of two) or a result overflows without | |
599 | an overflow trap. | |
600 | @end table | |
601 | ||
602 | @node Infinity and NaN | |
603 | @subsection Infinity and NaN | |
604 | @cindex infinity | |
605 | @cindex not a number | |
606 | @cindex NaN | |
607 | ||
608 | @w{IEEE 754} floating-point numbers can represent positive or negative | |
609 | infinity, and @dfn{NaN} (not a number). These three values arise from | |
610 | calculations whose result is undefined or cannot be represented | |
611 | accurately. You can also deliberately set a floating-point variable to | |
612 | any of them, which is sometimes useful. Some examples of calculations | |
613 | that produce infinity or NaN: | |
614 | ||
615 | @ifnottex | |
616 | @smallexample | |
617 | @math{1/0 = @infinity{}} | |
618 | @math{log (0) = -@infinity{}} | |
619 | @math{sqrt (-1) = NaN} | |
620 | @end smallexample | |
621 | @end ifnottex | |
622 | @tex | |
623 | $${1\over0} = \infty$$ | |
624 | $$\log 0 = -\infty$$ | |
625 | $$\sqrt{-1} = \hbox{NaN}$$ | |
626 | @end tex | |
627 | ||
628 | When a calculation produces any of these values, an exception also | |
629 | occurs; see @ref{FP Exceptions}. | |
630 | ||
631 | The basic operations and math functions all accept infinity and NaN and | |
632 | produce sensible output. Infinities propagate through calculations as | |
633 | one would expect: for example, @math{2 + @infinity{} = @infinity{}}, | |
634 | @math{4/@infinity{} = 0}, atan @math{(@infinity{}) = @pi{}/2}. NaN, on | |
635 | the other hand, infects any calculation that involves it. Unless the | |
636 | calculation would produce the same result no matter what real value | |
637 | replaced NaN, the result is NaN. | |
638 | ||
639 | In comparison operations, positive infinity is larger than all values | |
640 | except itself and NaN, and negative infinity is smaller than all values | |
641 | except itself and NaN. NaN is @dfn{unordered}: it is not equal to, | |
642 | greater than, or less than anything, @emph{including itself}. @code{x == | |
643 | x} is false if the value of @code{x} is NaN. You can use this to test | |
644 | whether a value is NaN or not, but the recommended way to test for NaN | |
645 | is with the @code{isnan} macro (@pxref{Floating Point Classes}). In | |
646 | addition, @code{<}, @code{>}, @code{<=}, and @code{>=} will raise an | |
647 | exception when applied to NaNs. | |
648 | ||
649 | @file{math.h} defines macros that allow you to explicitly set a variable | |
650 | to infinity or NaN. | |
b4012b75 | 651 | |
7a68c94a | 652 | @deftypevr Macro float INFINITY |
d08a7e4c | 653 | @standards{ISO, math.h} |
7a68c94a UD |
654 | An expression representing positive infinity. It is equal to the value |
655 | produced by mathematical operations like @code{1.0 / 0.0}. | |
656 | @code{-INFINITY} represents negative infinity. | |
657 | ||
658 | You can test whether a floating-point value is infinite by comparing it | |
659 | to this macro. However, this is not recommended; you should use the | |
660 | @code{isfinite} macro instead. @xref{Floating Point Classes}. | |
661 | ||
ec751a23 | 662 | This macro was introduced in the @w{ISO C99} standard. |
7a68c94a UD |
663 | @end deftypevr |
664 | ||
7a68c94a | 665 | @deftypevr Macro float NAN |
d08a7e4c | 666 | @standards{GNU, math.h} |
7a68c94a UD |
667 | An expression representing a value which is ``not a number''. This |
668 | macro is a GNU extension, available only on machines that support the | |
669 | ``not a number'' value---that is to say, on all machines that support | |
670 | IEEE floating point. | |
671 | ||
672 | You can use @samp{#ifdef NAN} to test whether the machine supports | |
673 | NaN. (Of course, you must arrange for GNU extensions to be visible, | |
674 | such as by defining @code{_GNU_SOURCE}, and then you must include | |
675 | @file{math.h}.) | |
676 | @end deftypevr | |
677 | ||
f82a4bdb JM |
678 | @deftypevr Macro float SNANF |
679 | @deftypevrx Macro double SNAN | |
680 | @deftypevrx Macro {long double} SNANL | |
52a8e5cb GG |
681 | @deftypevrx Macro _FloatN SNANFN |
682 | @deftypevrx Macro _FloatNx SNANFNx | |
1b009d5a | 683 | @standards{TS 18661-1:2014, math.h} |
52a8e5cb GG |
684 | @standardsx{SNANFN, TS 18661-3:2015, math.h} |
685 | @standardsx{SNANFNx, TS 18661-3:2015, math.h} | |
686 | These macros, defined by TS 18661-1:2014 and TS 18661-3:2015, are | |
687 | constant expressions for signaling NaNs. | |
f82a4bdb JM |
688 | @end deftypevr |
689 | ||
c0b43536 | 690 | @deftypevr Macro int FE_SNANS_ALWAYS_SIGNAL |
d08a7e4c | 691 | @standards{ISO, fenv.h} |
c0b43536 JM |
692 | This macro, defined by TS 18661-1:2014, is defined to @code{1} in |
693 | @file{fenv.h} to indicate that functions and operations with signaling | |
694 | NaN inputs and floating-point results always raise the invalid | |
695 | exception and return a quiet NaN, even in cases (such as @code{fmax}, | |
696 | @code{hypot} and @code{pow}) where a quiet NaN input can produce a | |
697 | non-NaN result. Because some compiler optimizations may not handle | |
698 | signaling NaNs correctly, this macro is only defined if compiler | |
699 | support for signaling NaNs is enabled. That support can be enabled | |
700 | with the GCC option @option{-fsignaling-nans}. | |
701 | @end deftypevr | |
702 | ||
7a68c94a UD |
703 | @w{IEEE 754} also allows for another unusual value: negative zero. This |
704 | value is produced when you divide a positive number by negative | |
705 | infinity, or when a negative result is smaller than the limits of | |
cd837b09 | 706 | representation. |
7a68c94a UD |
707 | |
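Because negative zero compares equal to positive zero, an equality test alone cannot detect it. The sketch below shows one way to tell them apart with the @code{signbit} macro from @file{math.h}; the helper name @code{is_negative_zero} is an assumption made for this example.

```c
#include <math.h>

/* Hypothetical helper: true only for negative zero.  The == test
   alone cannot tell the two zeros apart, so the sign bit is checked
   as well.  */
int
is_negative_zero (double x)
{
  return x == 0.0 && signbit (x) != 0;
}
```

For example, passing the result of dividing a positive number by negative infinity should make the helper return nonzero, while passing plain @code{0.0} should not.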
708 | @node Status bit operations | |
709 | @subsection Examining the FPU status word | |
710 | ||
ec751a23 | 711 | @w{ISO C99} defines functions to query and manipulate the |
7a68c94a UD |
712 | floating-point status word. You can use these functions to check for |
713 | untrapped exceptions when it's convenient, rather than worrying about | |
714 | them in the middle of a calculation. | |
715 | ||
716 | These constants represent the various @w{IEEE 754} exceptions. Not all | |
717 | FPUs report all the different exceptions. Each constant is defined if | |
718 | and only if the FPU you are compiling for supports that exception, so | |
719 | you can test for FPU support with @samp{#ifdef}. They are defined in | |
720 | @file{fenv.h}. | |
b4012b75 UD |
721 | |
722 | @vtable @code | |
7a68c94a | 723 | @item FE_INEXACT |
d08a7e4c | 724 | @standards{ISO, fenv.h} |
7a68c94a | 725 | The inexact exception. |
7a68c94a | 726 | @item FE_DIVBYZERO |
d08a7e4c | 727 | @standards{ISO, fenv.h} |
7a68c94a | 728 | The divide by zero exception. |
7a68c94a | 729 | @item FE_UNDERFLOW |
d08a7e4c | 730 | @standards{ISO, fenv.h} |
7a68c94a | 731 | The underflow exception. |
7a68c94a | 732 | @item FE_OVERFLOW |
d08a7e4c | 733 | @standards{ISO, fenv.h} |
7a68c94a | 734 | The overflow exception. |
7a68c94a | 735 | @item FE_INVALID |
d08a7e4c | 736 | @standards{ISO, fenv.h} |
7a68c94a | 737 | The invalid exception. |
b4012b75 UD |
738 | @end vtable |
739 | ||
7a68c94a UD |
740 | The macro @code{FE_ALL_EXCEPT} is the bitwise OR of all exception macros |
741 | which are supported by the FP implementation. | |
b4012b75 | 742 | |
7a68c94a UD |
743 | These functions allow you to clear exception flags, test for exceptions, |
744 | and save and restore the set of exceptions flagged. | |
b4012b75 | 745 | |
63ae7b63 | 746 | @deftypefun int feclearexcept (int @var{excepts}) |
d08a7e4c | 747 | @standards{ISO, fenv.h} |
b719dafd AO |
748 | @safety{@prelim{}@mtsafe{}@assafe{@assposix{}}@acsafe{@acsposix{}}} |
749 | @c The other functions in this section that modify FP status register | |
750 | @c mostly do so with non-atomic load-modify-store sequences, but since | |
751 | @c the register is thread-specific, this should be fine, and safe for | |
752 | @c cancellation. As long as the FP environment is restored before the | |
753 | @c signal handler returns control to the interrupted thread (like any | |
754 | @c kernel should do), the functions are also safe for use in signal | |
755 | @c handlers. | |
7a68c94a UD |
756 | This function clears all of the supported exception flags indicated by |
757 | @var{excepts}. | |
63ae7b63 UD |
758 | |
759 | The function returns zero in case the operation was successful, a | |
760 | non-zero value otherwise. | |
761 | @end deftypefun | |
762 | ||
63ae7b63 | 763 | @deftypefun int feraiseexcept (int @var{excepts}) |
d08a7e4c | 764 | @standards{ISO, fenv.h} |
b719dafd | 765 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
63ae7b63 UD |
766 | This function raises the supported exceptions indicated by |
767 | @var{excepts}.  If more than one exception bit in @var{excepts} is set, | |
768 | the order in which the exceptions are raised is undefined, except that | |
769 | overflow (@code{FE_OVERFLOW}) or underflow (@code{FE_UNDERFLOW}) is | |
770 | raised before inexact (@code{FE_INEXACT}).  Whether the inexact | |
771 | exception is also raised for overflow or underflow is likewise | |
772 | implementation dependent. | |
773 | ||
774 | The function returns zero in case the operation was successful, a | |
775 | non-zero value otherwise. | |
7a68c94a UD |
776 | @end deftypefun |
777 | ||
5146356f | 778 | @deftypefun int fesetexcept (int @var{excepts}) |
d08a7e4c | 779 | @standards{ISO, fenv.h} |
5146356f JM |
780 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
781 | This function sets the supported exception flags indicated by | |
782 | @var{excepts}, like @code{feraiseexcept}, but without causing enabled | |
783 | traps to be taken. @code{fesetexcept} is from TS 18661-1:2014. | |
784 | ||
785 | The function returns zero in case the operation was successful, a | |
786 | non-zero value otherwise. | |
787 | @end deftypefun | |
788 | ||
7a68c94a | 789 | @deftypefun int fetestexcept (int @var{excepts}) |
d08a7e4c | 790 | @standards{ISO, fenv.h} |
b719dafd | 791 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
7a68c94a UD |
792 | Test whether the exception flags indicated by the parameter @var{excepts} |
793 | are currently set. If any of them are, a nonzero value is returned | |
794 | which specifies which exceptions are set. Otherwise the result is zero. | |
795 | @end deftypefun | |
796 | ||
797 | To understand these functions, imagine that the status word is an | |
798 | integer variable named @var{status}. @code{feclearexcept} is then | |
799 | equivalent to @samp{status &= ~excepts} and @code{fetestexcept} is | |
800 | equivalent to @samp{(status & excepts)}. The actual implementation may | |
801 | be very different, of course. | |
802 | ||
803 | Exception flags are only cleared when the program explicitly requests it, | |
804 | by calling @code{feclearexcept}. If you want to check for exceptions | |
805 | from a set of calculations, you should clear all the flags first. Here | |
806 | is a simple example of the way to use @code{fetestexcept}: | |
b4012b75 UD |
807 | |
808 | @smallexample | |
7a68c94a UD |
809 | @{ |
810 | double f; | |
811 | int raised; | |
812 | feclearexcept (FE_ALL_EXCEPT); | |
813 | f = compute (); | |
814 | raised = fetestexcept (FE_OVERFLOW | FE_INVALID); | |
95fdc6a0 UD |
815 | if (raised & FE_OVERFLOW) @{ /* @dots{} */ @} |
816 | if (raised & FE_INVALID) @{ /* @dots{} */ @} | |
817 | /* @dots{} */ | |
7a68c94a | 818 | @} |
b4012b75 UD |
819 | @end smallexample |
820 | ||
7a68c94a UD |
821 | You cannot write arbitrary values into the status word; flags can only be |
822 | raised with @code{feraiseexcept} or @code{fesetexcept}.  You can, however, | |
823 | save the exception flags and restore them later, using the following functions: | |
b4012b75 | 824 | |
63ae7b63 | 825 | @deftypefun int fegetexceptflag (fexcept_t *@var{flagp}, int @var{excepts}) |
d08a7e4c | 826 | @standards{ISO, fenv.h} |
b719dafd | 827 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
7a68c94a UD |
828 | This function stores in the variable pointed to by @var{flagp} an |
829 | implementation-defined value representing the current setting of the | |
830 | exception flags indicated by @var{excepts}. | |
63ae7b63 UD |
831 | |
832 | The function returns zero in case the operation was successful, a | |
833 | non-zero value otherwise. | |
7a68c94a | 834 | @end deftypefun |
b4012b75 | 835 | |
9251c568 | 836 | @deftypefun int fesetexceptflag (const fexcept_t *@var{flagp}, int @var{excepts}) |
d08a7e4c | 837 | @standards{ISO, fenv.h} |
b719dafd | 838 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
7a68c94a UD |
839 | This function restores the flags for the exceptions indicated by |
840 | @var{excepts} to the values stored in the variable pointed to by | |
841 | @var{flagp}. | |
63ae7b63 UD |
842 | |
843 | The function returns zero in case the operation was successful, a | |
844 | non-zero value otherwise. | |
7a68c94a UD |
845 | @end deftypefun |
846 | ||
847 | Note that the value stored in @code{fexcept_t} bears no resemblance to | |
848 | the bit mask returned by @code{fetestexcept}. The type may not even be | |
849 | an integer. Do not attempt to modify an @code{fexcept_t} variable. | |
850 | ||
780257d4 | 851 | @deftypefun int fetestexceptflag (const fexcept_t *@var{flagp}, int @var{excepts}) |
d08a7e4c | 852 | @standards{ISO, fenv.h} |
780257d4 JM |
853 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
854 | Test whether the exception flags indicated by the parameter | |
855 | @var{excepts} are set in the variable pointed to by @var{flagp}. If | |
856 | any of them are, a nonzero value is returned which specifies which | |
857 | exceptions are set. Otherwise the result is zero. | |
858 | @code{fetestexceptflag} is from TS 18661-1:2014. | |
859 | @end deftypefun | |
860 | ||
7a68c94a UD |
861 | @node Math Error Reporting |
862 | @subsection Error Reporting by Mathematical Functions | |
863 | @cindex errors, mathematical | |
864 | @cindex domain error | |
865 | @cindex range error | |
866 | ||
867 | Many of the math functions are defined only over a subset of the real or | |
868 | complex numbers. Even if they are mathematically defined, their result | |
869 | may be larger or smaller than the range representable by their return | |
c5df7609 JM |
870 | type without loss of accuracy. These are known as @dfn{domain errors}, |
871 | @dfn{overflows}, and | |
7a68c94a UD |
872 | @dfn{underflows}, respectively. Math functions do several things when |
873 | one of these errors occurs. In this manual we will refer to the | |
874 | complete response as @dfn{signalling} a domain error, overflow, or | |
875 | underflow. | |
876 | ||
877 | When a math function suffers a domain error, it raises the invalid | |
010fe231 | 878 | exception and returns NaN. It also sets @code{errno} to @code{EDOM}; |
7a68c94a UD |
879 | this is for compatibility with old systems that do not support @w{IEEE |
880 | 754} exception handling. Likewise, when overflow occurs, math | |
c5df7609 JM |
881 | functions raise the overflow exception and, in the default rounding |
882 | mode, return @math{@infinity{}} or @math{-@infinity{}} as appropriate | |
883 | (in other rounding modes, the largest finite value of the appropriate | |
884 | sign is returned when appropriate for that rounding mode). They also | |
010fe231 FW |
885 | set @code{errno} to @code{ERANGE} if returning @math{@infinity{}} or |
886 | @math{-@infinity{}}; @code{errno} may or may not be set to | |
c5df7609 JM |
887 | @code{ERANGE} when a finite value is returned on overflow. When |
888 | underflow occurs, the underflow exception is raised, and zero | |
889 | (appropriately signed) or a subnormal value, as appropriate for the | |
890 | mathematical result of the function and the rounding mode, is | |
010fe231 | 891 | returned. @code{errno} may be set to @code{ERANGE}, but this is not |
c5df7609 JM |
892 | guaranteed; it is intended that @theglibc{} should set it when the |
893 | underflow is to an appropriately signed zero, but not necessarily for | |
894 | other underflows. | |
7a68c94a | 895 | |
3fdf1792 JM |
896 | When a math function has an argument that is a signaling NaN, |
897 | @theglibc{} does not consider this a domain error, so @code{errno} is | |
898 | unchanged, but the invalid exception is still raised (except for a few | |
899 | functions that are specified to handle signaling NaNs differently). | |
900 | ||
7a68c94a UD |
901 | Some of the math functions are defined mathematically to result in a |
902 | complex value over parts of their domains. The most familiar example of | |
903 | this is taking the square root of a negative number. The complex math | |
904 | functions, such as @code{csqrt}, will return the appropriate complex value | |
905 | in this case. The real-valued functions, such as @code{sqrt}, will | |
906 | signal a domain error. | |
907 | ||
908 | Some older hardware does not support infinities. On that hardware, | |
909 | overflows instead return a particular very large number (usually the | |
910 | largest representable number). @file{math.h} defines macros you can use | |
911 | to test for overflow on both old and new hardware. | |
b4012b75 | 912 | |
7a68c94a UD |
913 | @deftypevr Macro double HUGE_VAL |
914 | @deftypevrx Macro float HUGE_VALF | |
915 | @deftypevrx Macro {long double} HUGE_VALL | |
52a8e5cb GG |
916 | @deftypevrx Macro _FloatN HUGE_VAL_FN |
917 | @deftypevrx Macro _FloatNx HUGE_VAL_FNx | |
d08a7e4c | 918 | @standards{ISO, math.h} |
52a8e5cb GG |
919 | @standardsx{HUGE_VAL_FN, TS 18661-3:2015, math.h} |
920 | @standardsx{HUGE_VAL_FNx, TS 18661-3:2015, math.h} | |
7a68c94a UD |
921 | An expression representing a particular very large number. On machines |
922 | that use @w{IEEE 754} floating point format, @code{HUGE_VAL} is infinity. | |
923 | On other machines, it's typically the largest positive number that can | |
924 | be represented. | |
925 | ||
926 | Mathematical functions return the appropriately typed version of | |
927 | @code{HUGE_VAL} or @code{@minus{}HUGE_VAL} when the result is too large | |
928 | to be represented. | |
929 | @end deftypevr | |
b4012b75 | 930 | |
7a68c94a UD |
931 | @node Rounding |
932 | @section Rounding Modes | |
933 | ||
934 | Floating-point calculations are carried out internally with extra | |
935 | precision, and then rounded to fit into the destination type. This | |
936 | ensures that results are as precise as the input data. @w{IEEE 754} | |
937 | defines four possible rounding modes: | |
938 | ||
939 | @table @asis | |
940 | @item Round to nearest. | |
941 | This is the default mode. It should be used unless there is a specific | |
942 | need for one of the others. In this mode results are rounded to the | |
943 | nearest representable value. If the result is midway between two | |
944 | representable values, the even representable is chosen. @dfn{Even} here | |
945 | means the lowest-order bit is zero. This rounding mode prevents | |
946 | statistical bias and guarantees numeric stability: round-off errors in a | |
947 | lengthy calculation will remain smaller than half of @code{FLT_EPSILON}. | |
948 | ||
949 | @c @item Round toward @math{+@infinity{}} | |
950 | @item Round toward plus Infinity. | |
951 | All results are rounded to the smallest representable value | |
952 | which is greater than the result. | |
953 | ||
954 | @c @item Round toward @math{-@infinity{}} | |
955 | @item Round toward minus Infinity. | |
956 | All results are rounded to the largest representable value which is less | |
957 | than the result. | |
958 | ||
959 | @item Round toward zero. | |
960 | All results are rounded to the largest representable value whose | |
961 | magnitude is less than that of the result. In other words, if the | |
962 | result is negative it is rounded up; if it is positive, it is rounded | |
963 | down. | |
964 | @end table | |
b4012b75 | 965 | |
7a68c94a UD |
966 | @noindent |
967 | @file{fenv.h} defines constants which you can use to refer to the | |
968 | various rounding modes. Each one will be defined if and only if the FPU | |
969 | supports the corresponding rounding mode. | |
b4012b75 | 970 | |
2fe82ca6 | 971 | @vtable @code |
7a68c94a | 972 | @item FE_TONEAREST |
d08a7e4c | 973 | @standards{ISO, fenv.h} |
7a68c94a | 974 | Round to nearest. |
b4012b75 | 975 | |
7a68c94a | 976 | @item FE_UPWARD |
d08a7e4c | 977 | @standards{ISO, fenv.h} |
7a68c94a | 978 | Round toward @math{+@infinity{}}. |
b4012b75 | 979 | |
7a68c94a | 980 | @item FE_DOWNWARD |
d08a7e4c | 981 | @standards{ISO, fenv.h} |
7a68c94a | 982 | Round toward @math{-@infinity{}}. |
b4012b75 | 983 | |
7a68c94a | 984 | @item FE_TOWARDZERO |
d08a7e4c | 985 | @standards{ISO, fenv.h} |
7a68c94a | 986 | Round toward zero. |
2fe82ca6 | 987 | @end vtable |
b4012b75 | 988 | |
7a68c94a UD |
989 | Underflow is an unusual case. Normally, @w{IEEE 754} floating point |
990 | numbers are always normalized (@pxref{Floating Point Concepts}). | |
991 | Numbers smaller than @math{2^r} (where @math{r} is the minimum exponent, | |
992 | @code{FLT_MIN_EXP-1} for @code{float}) cannot be represented as | |
993 | normalized numbers. Rounding all such numbers to zero or @math{2^r} | |
994 | would cause some algorithms to fail at 0. Therefore, they are left in | |
995 | denormalized form. That produces loss of precision, since some bits of | |
996 | the mantissa are stolen to indicate the position of the binary point. | |
997 | ||
998 | If a result is too small to be represented as a denormalized number, it | |
999 | is rounded to zero. However, the sign of the result is preserved; if | |
1000 | the calculation was negative, the result is @dfn{negative zero}. | |
1001 | Negative zero can also result from some operations on infinity, such as | |
cd837b09 | 1002 | @math{4/-@infinity{}}. |
7a68c94a | 1003 | |
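Gradual underflow can be observed with the @code{fpclassify} macro. The following sketch uses a hypothetical helper, @code{classify_product}, and assumes the default floating-point environment (round to nearest, no flush-to-zero mode enabled).

```c
#include <float.h>
#include <math.h>

/* Hypothetical helper: multiplies X by Y at run time, stores the
   product in *PRODUCT, and returns its fpclassify category.  */
int
classify_product (double x, double y, double *product)
{
  volatile double a = x, b = y;
  double p = a * b;

  *product = p;
  return fpclassify (p);
}
```

Under those assumptions, @code{DBL_MIN * 0.5} should classify as @code{FP_SUBNORMAL}, while @code{DBL_MIN * DBL_MIN} is too small even for a denormalized number and should classify as @code{FP_ZERO}, with the sign of the operands' product preserved.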
e4fd1876 | 1004 | At any time, one of the above four rounding modes is selected. You can |
7a68c94a UD |
1005 | find out which one with this function: |
1006 | ||
7a68c94a | 1007 | @deftypefun int fegetround (void) |
d08a7e4c | 1008 | @standards{ISO, fenv.h} |
b719dafd | 1009 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
7a68c94a UD |
1010 | Returns the currently selected rounding mode, represented by one of the |
1011 | values of the defined rounding mode macros. | |
1012 | @end deftypefun | |
b4012b75 | 1013 | |
7a68c94a UD |
1014 | @noindent |
1015 | To change the rounding mode, use this function: | |
b4012b75 | 1016 | |
7a68c94a | 1017 | @deftypefun int fesetround (int @var{round}) |
d08a7e4c | 1018 | @standards{ISO, fenv.h} |
b719dafd | 1019 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
7a68c94a UD |
1020 | Changes the currently selected rounding mode to @var{round}. If |
1021 | @var{round} does not correspond to one of the supported rounding modes, | |
d5655997 | 1022 | nothing is changed. @code{fesetround} returns zero if it changed the |
e4fd1876 | 1023 | rounding mode, or a nonzero value if the mode is not supported. |
7a68c94a | 1024 | @end deftypefun |
b4012b75 | 1025 | |
7a68c94a UD |
1026 | You should avoid changing the rounding mode if possible. It can be an |
1027 | expensive operation; also, some hardware requires you to compile your | |
1028 | program differently for it to work. The resulting code may run slower. | |
1029 | See your compiler documentation for details. | |
1030 | @c This section used to claim that functions existed to round one number | |
1031 | @c in a specific fashion. I can't find any functions in the library | |
1032 | @c that do that. -zw | |
1033 | ||
1034 | @node Control Functions | |
1035 | @section Floating-Point Control Functions | |
1036 | ||
1037 | @w{IEEE 754} floating-point implementations allow the programmer to | |
1038 | decide whether traps will occur for each of the exceptions, by setting | |
1039 | bits in the @dfn{control word}. In C, traps result in the program | |
1040 | receiving the @code{SIGFPE} signal; see @ref{Signal Handling}. | |
1041 | ||
48b22986 | 1042 | @strong{NB:} @w{IEEE 754} says that trap handlers are given details of |
7a68c94a UD |
1043 | the exceptional situation, and can set the result value. C signals do |
1044 | not provide any mechanism to pass this information back and forth. | |
1045 | Trapping exceptions in C is therefore not very useful. | |
1046 | ||
1047 | It is sometimes necessary to save the state of the floating-point unit | |
1048 | while you perform some calculation. The library provides functions | |
1049 | which save and restore the exception flags, the set of exceptions that | |
1050 | generate traps, and the rounding mode. This information is known as the | |
1051 | @dfn{floating-point environment}. | |
1052 | ||
1053 | The functions to save and restore the floating-point environment all use | |
1054 | a variable of type @code{fenv_t} to store information. This type is | |
1055 | defined in @file{fenv.h}. Its size and contents are | |
1056 | implementation-defined. You should not attempt to manipulate a variable | |
1057 | of this type directly. | |
1058 | ||
1059 | To save the state of the FPU, use one of these functions: | |
1060 | ||
63ae7b63 | 1061 | @deftypefun int fegetenv (fenv_t *@var{envp}) |
d08a7e4c | 1062 | @standards{ISO, fenv.h} |
b719dafd | 1063 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
7a68c94a UD |
1064 | Store the floating-point environment in the variable pointed to by |
1065 | @var{envp}. | |
63ae7b63 UD |
1066 | |
1067 | The function returns zero in case the operation was successful, a | |
1068 | non-zero value otherwise. | |
b4012b75 UD |
1069 | @end deftypefun |
1070 | ||
7a68c94a | 1071 | @deftypefun int feholdexcept (fenv_t *@var{envp}) |
d08a7e4c | 1072 | @standards{ISO, fenv.h} |
b719dafd | 1073 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
7a68c94a UD |
1074 | Store the current floating-point environment in the object pointed to by |
1075 | @var{envp}. Then clear all exception flags, and set the FPU to trap no | |
1076 | exceptions. Not all FPUs support trapping no exceptions; if | |
0f6b172f UD |
1077 | @code{feholdexcept} cannot set this mode, it returns a nonzero value.  If it |
1078 | succeeds, it returns zero. | |
b4012b75 UD |
1079 | @end deftypefun |
1080 | ||
7a7a7ee5 | 1081 | The functions which restore the floating-point environment can take these |
7a68c94a | 1082 | kinds of arguments: |
b4012b75 | 1083 | |
7a68c94a UD |
1084 | @itemize @bullet |
1085 | @item | |
1086 | Pointers to @code{fenv_t} objects, which were initialized previously by a | |
1087 | call to @code{fegetenv} or @code{feholdexcept}. | |
1088 | @item | |
1089 | @vindex FE_DFL_ENV | |
1090 | The special macro @code{FE_DFL_ENV} which represents the floating-point | |
1091 | environment as it was available at program start. | |
1092 | @item | |
7a7a7ee5 AJ |
1093 | Implementation-defined macros with names starting with @code{FE_} and |
1094 | having type @code{fenv_t *}. | |
b4012b75 | 1095 | |
7a68c94a | 1096 | @vindex FE_NOMASK_ENV |
1f77f049 | 1097 | If possible, @theglibc{} defines a macro @code{FE_NOMASK_ENV} |
7a68c94a UD |
1098 | which represents an environment where every exception raised causes a |
1099 | trap to occur. You can test for this macro using @code{#ifdef}. It is | |
1100 | only defined if @code{_GNU_SOURCE} is defined. | |
1101 | ||
1102 | Some platforms might define other predefined environments. | |
1103 | @end itemize | |
1104 | ||
1105 | @noindent | |
1106 | To set the floating-point environment, you can use either of these | |
1107 | functions: | |
1108 | ||
63ae7b63 | 1109 | @deftypefun int fesetenv (const fenv_t *@var{envp}) |
d08a7e4c | 1110 | @standards{ISO, fenv.h} |
b719dafd | 1111 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
7a68c94a | 1112 | Set the floating-point environment to that described by @var{envp}. |
63ae7b63 UD |
1113 | |
1114 | The function returns zero in case the operation was successful, a | |
1115 | non-zero value otherwise. | |
b4012b75 UD |
1116 | @end deftypefun |
1117 | ||
63ae7b63 | 1118 | @deftypefun int feupdateenv (const fenv_t *@var{envp}) |
d08a7e4c | 1119 | @standards{ISO, fenv.h} |
b719dafd | 1120 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
7a68c94a UD |
1121 | Like @code{fesetenv}, this function sets the floating-point environment |
1122 | to that described by @var{envp}. However, if any exceptions were | |
1123 | flagged in the status word before @code{feupdateenv} was called, they | |
1124 | remain flagged after the call. In other words, after @code{feupdateenv} | |
1125 | is called, the status word is the bitwise OR of the previous status word | |
1126 | and the one saved in @var{envp}. | |
63ae7b63 UD |
1127 | |
1128 | The function returns zero in case the operation was successful, a | |
1129 | non-zero value otherwise. | |
b4012b75 UD |
1130 | @end deftypefun |
1131 | ||
ec94343f JM |
1132 | @noindent |
1133 | TS 18661-1:2014 defines additional functions to save and restore | |
1134 | floating-point control modes (such as the rounding mode and whether | |
1135 | traps are enabled) while leaving other status (such as raised flags) | |
1136 | unchanged. | |
1137 | ||
1138 | @vindex FE_DFL_MODE | |
1139 | The special macro @code{FE_DFL_MODE} may be passed to | |
1140 | @code{fesetmode}. It represents the floating-point control modes at | |
1141 | program start. | |
1142 | ||
ec94343f | 1143 | @deftypefun int fegetmode (femode_t *@var{modep}) |
d08a7e4c | 1144 | @standards{ISO, fenv.h} |
ec94343f JM |
1145 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
1146 | Store the floating-point control modes in the variable pointed to by | |
1147 | @var{modep}. | |
1148 | ||
1149 | The function returns zero in case the operation was successful, a | |
1150 | non-zero value otherwise. | |
1151 | @end deftypefun | |
1152 | ||
ec94343f | 1153 | @deftypefun int fesetmode (const femode_t *@var{modep}) |
d08a7e4c | 1154 | @standards{ISO, fenv.h} |
ec94343f JM |
1155 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
1156 | Set the floating-point control modes to those described by | |
1157 | @var{modep}. | |
1158 | ||
1159 | The function returns zero in case the operation was successful, a | |
1160 | non-zero value otherwise. | |
1161 | @end deftypefun | |
1162 | ||
05ef7ce9 UD |
1163 | @noindent |
1164 | To control, for each exception individually, whether raising it causes a | |
1165 | trap, you can use the following functions. | |
1166 | ||
1167 | @strong{Portability Note:} These functions are all GNU extensions. | |
1168 | ||
05ef7ce9 | 1169 | @deftypefun int feenableexcept (int @var{excepts}) |
d08a7e4c | 1170 | @standards{GNU, fenv.h} |
b719dafd | 1171 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
e4fd1876 RJ |
1172 | This function enables traps for each of the exceptions as indicated by |
1173 | the parameter @var{excepts}. The individual exceptions are described in | |
6e8afc1c | 1174 | @ref{Status bit operations}. Only the specified exceptions are |
05ef7ce9 UD |
1175 | enabled; the status of the other exceptions is not changed. |
1176 | ||
1177 | The function returns the previously enabled exceptions if the | |
1178 | operation was successful, @code{-1} otherwise. | |
1179 | @end deftypefun | |
1180 | ||
05ef7ce9 | 1181 | @deftypefun int fedisableexcept (int @var{excepts}) |
d08a7e4c | 1182 | @standards{GNU, fenv.h} |
b719dafd | 1183 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
e4fd1876 RJ |
1184 | This function disables traps for each of the exceptions as indicated by |
1185 | the parameter @var{excepts}. The individual exceptions are described in | |
6e8afc1c | 1186 | @ref{Status bit operations}. Only the specified exceptions are |
05ef7ce9 UD |
1187 | disabled; the status of the other exceptions is not changed. |
1188 | ||
1189 | The function returns the previously enabled exceptions if the | |
1190 | operation was successful, @code{-1} otherwise. | |
1191 | @end deftypefun | |
1192 | ||
8ded91fb | 1193 | @deftypefun int fegetexcept (void) |
d08a7e4c | 1194 | @standards{GNU, fenv.h} |
b719dafd | 1195 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
05ef7ce9 UD |
1196 | The function returns a bitmask of all currently enabled exceptions. It |
1197 | returns @code{-1} in case of failure. | |
6e8afc1c | 1198 | @end deftypefun |
05ef7ce9 | 1199 | |
7a68c94a UD |
1200 | @node Arithmetic Functions |
1201 | @section Arithmetic Functions | |
b4012b75 | 1202 | |
7a68c94a UD |
1203 | The C library provides functions to do basic operations on |
1204 | floating-point numbers. These include absolute value, maximum and minimum, | |
1205 | normalization, bit twiddling, rounding, and a few others. | |
b4012b75 | 1206 | |
7a68c94a UD |
1207 | @menu |
1208 | * Absolute Value:: Absolute values of integers and floats. | |
1209 | * Normalization Functions:: Extracting exponents and putting them back. | |
1210 | * Rounding Functions:: Rounding floats to integers. | |
1211 | * Remainder Functions:: Remainders on division, precisely defined. | |
1212 | * FP Bit Twiddling:: Sign bit adjustment. Adding epsilon. | |
1213 | * FP Comparison Functions:: Comparisons without risk of exceptions. | |
1214 | * Misc FP Arithmetic:: Max, min, positive difference, multiply-add. | |
1215 | @end menu | |
b4012b75 | 1216 | |
28f540f4 | 1217 | @node Absolute Value |
7a68c94a | 1218 | @subsection Absolute Value |
28f540f4 RM |
1219 | @cindex absolute value functions |
1220 | ||
1221 | These functions are provided for obtaining the @dfn{absolute value} (or | |
1222 | @dfn{magnitude}) of a number. The absolute value of a real number | |
2d26e9eb | 1223 | @var{x} is @var{x} if @var{x} is positive, @minus{}@var{x} if @var{x} is |
28f540f4 RM |
1224 | negative. For a complex number @var{z}, whose real part is @var{x} and |
1225 | whose imaginary part is @var{y}, the absolute value is @w{@code{sqrt | |
1226 | (@var{x}*@var{x} + @var{y}*@var{y})}}. | |
1227 | ||
1228 | @pindex math.h | |
1229 | @pindex stdlib.h | |
fe0ec73e | 1230 | Prototypes for @code{abs}, @code{labs} and @code{llabs} are in @file{stdlib.h}; |
e518937a | 1231 | @code{imaxabs} is declared in @file{inttypes.h}; |
52a8e5cb GG |
1232 | the @code{fabs} functions are declared in @file{math.h}; |
1233 | the @code{cabs} functions are declared in @file{complex.h}. | |
28f540f4 | 1234 | |
28f540f4 | 1235 | @deftypefun int abs (int @var{number}) |
7a68c94a UD |
1236 | @deftypefunx {long int} labs (long int @var{number}) |
1237 | @deftypefunx {long long int} llabs (long long int @var{number}) | |
e518937a | 1238 | @deftypefunx intmax_t imaxabs (intmax_t @var{number}) |
d08a7e4c RJ |
1239 | @standards{ISO, stdlib.h} |
1240 | @standardsx{imaxabs, ISO, inttypes.h} | |
b719dafd | 1241 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
7a68c94a | 1242 | These functions return the absolute value of @var{number}. |
28f540f4 RM |
1243 | |
1244 | Most computers use a two's complement integer representation, in which | |
1245 | the absolute value of @code{INT_MIN} (the smallest possible @code{int}) | |
1246 | cannot be represented; thus, @w{@code{abs (INT_MIN)}} is not defined. | |
28f540f4 | 1247 | |
ec751a23 | 1248 | @code{llabs} and @code{imaxabs} are new to @w{ISO C99}.
0e4ee106 UD |
1249 | |
1250 | See @ref{Integers} for a description of the @code{intmax_t} type. | |
1251 | ||
fe0ec73e UD |
1252 | @end deftypefun |
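
The @code{INT_MIN} caveat above can be handled by testing the argument before calling @code{abs}. The following is a minimal sketch; the helper name @code{checked_abs} is hypothetical and not part of the library.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper, not a library function: store the absolute
   value of N in *RESULT and return true, or return false when that
   value cannot be represented as an int.  */
bool
checked_abs (int n, int *result)
{
  /* In two's complement INT_MIN is -INT_MAX - 1, so it is the only
     value below -INT_MAX and the only one whose magnitude does not
     fit.  abs must not be called on it.  */
  if (n < -INT_MAX)
    return false;
  *result = abs (n);
  return true;
}
```

A caller can then decide how to treat the unrepresentable case, for example by widening to @code{long long int} and using @code{llabs}.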
1253 | ||
28f540f4 | 1254 | @deftypefun double fabs (double @var{number}) |
779ae82e UD |
1255 | @deftypefunx float fabsf (float @var{number}) |
1256 | @deftypefunx {long double} fabsl (long double @var{number}) | |
52a8e5cb GG |
1257 | @deftypefunx _FloatN fabsfN (_Float@var{N} @var{number}) |
1258 | @deftypefunx _FloatNx fabsfNx (_Float@var{N}x @var{number}) | |
d08a7e4c | 1259 | @standards{ISO, math.h} |
52a8e5cb GG |
1260 | @standardsx{fabsfN, TS 18661-3:2015, math.h} |
1261 | @standardsx{fabsfNx, TS 18661-3:2015, math.h} | |
b719dafd | 1262 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
28f540f4 RM |
1263 | These functions return the absolute value of the floating-point number |
1264 | @var{number}. | |
1265 | @end deftypefun | |
1266 | ||
b4012b75 | 1267 | @deftypefun double cabs (complex double @var{z}) |
779ae82e UD |
1268 | @deftypefunx float cabsf (complex float @var{z}) |
1269 | @deftypefunx {long double} cabsl (complex long double @var{z}) | |
52a8e5cb GG |
1270 | @deftypefunx _FloatN cabsfN (complex _Float@var{N} @var{z}) |
1271 | @deftypefunx _FloatNx cabsfNx (complex _Float@var{N}x @var{z}) | |
d08a7e4c | 1272 | @standards{ISO, complex.h} |
52a8e5cb GG |
1273 | @standardsx{cabsfN, TS 18661-3:2015, complex.h} |
1274 | @standardsx{cabsfNx, TS 18661-3:2015, complex.h} | |
b719dafd | 1275 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
7a68c94a UD |
1276 | These functions return the absolute value of the complex number @var{z} |
1277 | (@pxref{Complex Numbers}). The absolute value of a complex number is: | |
28f540f4 RM |
1278 | |
1279 | @smallexample | |
b4012b75 | 1280 | sqrt (creal (@var{z}) * creal (@var{z}) + cimag (@var{z}) * cimag (@var{z})) |
28f540f4 | 1281 | @end smallexample |
dfd2257a | 1282 | |
7a68c94a UD |
1283 | These functions should always be used instead of the direct formula |
1284 | because they take special care to avoid losing precision. They may also |
cf822e3c | 1285 | take advantage of hardware support for this operation. See @code{hypot} |
8b7fb588 | 1286 | in @ref{Exponents and Logarithms}. |
28f540f4 RM |
1287 | @end deftypefun |
1288 | ||
1289 | @node Normalization Functions | |
7a68c94a | 1290 | @subsection Normalization Functions |
28f540f4 RM |
1291 | @cindex normalization functions (floating-point) |
1292 | ||
1293 | The functions described in this section are primarily provided as a way | |
1294 | to efficiently perform certain low-level manipulations on floating point | |
1295 | numbers that are represented internally using a binary radix; | |
1296 | see @ref{Floating Point Concepts}. These functions are required to | |
1297 | have equivalent behavior even if the representation does not use a radix | |
1298 | of 2, but of course they are unlikely to be particularly efficient in | |
1299 | those cases. | |
1300 | ||
1301 | @pindex math.h | |
1302 | All these functions are declared in @file{math.h}. | |
1303 | ||
28f540f4 | 1304 | @deftypefun double frexp (double @var{value}, int *@var{exponent}) |
779ae82e UD |
1305 | @deftypefunx float frexpf (float @var{value}, int *@var{exponent}) |
1306 | @deftypefunx {long double} frexpl (long double @var{value}, int *@var{exponent}) | |
52a8e5cb GG |
1307 | @deftypefunx _FloatN frexpfN (_Float@var{N} @var{value}, int *@var{exponent}) |
1308 | @deftypefunx _FloatNx frexpfNx (_Float@var{N}x @var{value}, int *@var{exponent}) | |
d08a7e4c | 1309 | @standards{ISO, math.h} |
52a8e5cb GG |
1310 | @standardsx{frexpfN, TS 18661-3:2015, math.h} |
1311 | @standardsx{frexpfNx, TS 18661-3:2015, math.h} | |
b719dafd | 1312 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
b4012b75 | 1313 | These functions are used to split the number @var{value} |
28f540f4 RM |
1314 | into a normalized fraction and an exponent. |
1315 | ||
1316 | If the argument @var{value} is not zero, the return value is @var{value} | |
56b672e9 BN |
1317 | times a power of two, and its magnitude is always in the range 1/2 |
1318 | (inclusive) to 1 (exclusive). The corresponding exponent is stored in | |
28f540f4 RM |
1319 | @code{*@var{exponent}}; the return value multiplied by 2 raised to this |
1320 | exponent equals the original number @var{value}. | |
1321 | ||
1322 | For example, @code{frexp (12.8, &exponent)} returns @code{0.8} and | |
1323 | stores @code{4} in @code{exponent}. | |
1324 | ||
1325 | If @var{value} is zero, then the return value is zero and | |
1326 | zero is stored in @code{*@var{exponent}}. | |
1327 | @end deftypefun | |
1328 | ||
28f540f4 | 1329 | @deftypefun double ldexp (double @var{value}, int @var{exponent}) |
779ae82e UD |
1330 | @deftypefunx float ldexpf (float @var{value}, int @var{exponent}) |
1331 | @deftypefunx {long double} ldexpl (long double @var{value}, int @var{exponent}) | |
52a8e5cb GG |
1332 | @deftypefunx _FloatN ldexpfN (_Float@var{N} @var{value}, int @var{exponent}) |
1333 | @deftypefunx _FloatNx ldexpfNx (_Float@var{N}x @var{value}, int @var{exponent}) | |
d08a7e4c | 1334 | @standards{ISO, math.h} |
52a8e5cb GG |
1335 | @standardsx{ldexpfN, TS 18661-3:2015, math.h} |
1336 | @standardsx{ldexpfNx, TS 18661-3:2015, math.h} | |
b719dafd | 1337 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
b4012b75 | 1338 | These functions return the result of multiplying the floating-point |
28f540f4 RM |
1339 | number @var{value} by 2 raised to the power @var{exponent}. (They can |
1340 | be used to reassemble floating-point numbers that were taken apart | |
1341 | by @code{frexp}.) | |
1342 | ||
1343 | For example, @code{ldexp (0.8, 4)} returns @code{12.8}. | |
1344 | @end deftypefun | |
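
The relationship between @code{frexp} and @code{ldexp} can be sketched as a round trip. The helper name @code{split_and_rebuild} is hypothetical; only @code{frexp} and @code{ldexp} come from the library.

```c
#include <math.h>

/* Hypothetical helper: split VALUE into a normalized fraction and a
   power-of-two exponent with frexp, then reassemble it with ldexp.
   The fraction and exponent are also stored for the caller.  */
double
split_and_rebuild (double value, double *fraction, int *exponent)
{
  *fraction = frexp (value, exponent);
  /* Scaling by a power of two is exact (barring overflow or
     underflow), so this returns VALUE unchanged.  */
  return ldexp (*fraction, *exponent);
}
```

With a binary radix, @code{split_and_rebuild (12.8, &f, &e)} returns @code{12.8} and leaves @code{0.8} in @code{f} and @code{4} in @code{e}, matching the examples above.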
1345 | ||
7a68c94a | 1346 | The following functions, which come from BSD, provide facilities |
b7d03293 UD |
1347 | equivalent to those of @code{ldexp} and @code{frexp}. See also the |
1348 | @w{ISO C} function @code{logb} which originally also appeared in BSD. | |
52a8e5cb GG |
1349 | The @code{_Float@var{N}} and @code{_Float@var{N}x} variants of the |
1350 | following functions come from TS 18661-3:2015. | |
7a68c94a | 1351 | |
8ded91fb | 1352 | @deftypefun double scalb (double @var{value}, double @var{exponent}) |
8ded91fb | 1353 | @deftypefunx float scalbf (float @var{value}, float @var{exponent}) |
8ded91fb | 1354 | @deftypefunx {long double} scalbl (long double @var{value}, long double @var{exponent}) |
d08a7e4c | 1355 | @standards{BSD, math.h} |
b719dafd | 1356 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
7a68c94a UD |
1357 | The @code{scalb} function is the BSD name for @code{ldexp}. |
1358 | @end deftypefun | |
1359 | ||
9ad027fb | 1360 | @deftypefun double scalbn (double @var{x}, int @var{n}) |
9ad027fb | 1361 | @deftypefunx float scalbnf (float @var{x}, int @var{n}) |
9ad027fb | 1362 | @deftypefunx {long double} scalbnl (long double @var{x}, int @var{n}) |
52a8e5cb GG |
1363 | @deftypefunx _FloatN scalbnfN (_Float@var{N} @var{x}, int @var{n}) |
1364 | @deftypefunx _FloatNx scalbnfNx (_Float@var{N}x @var{x}, int @var{n}) | |
d08a7e4c | 1365 | @standards{BSD, math.h} |
52a8e5cb GG |
1366 | @standardsx{scalbnfN, TS 18661-3:2015, math.h} |
1367 | @standardsx{scalbnfNx, TS 18661-3:2015, math.h} | |
b719dafd | 1368 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
7a68c94a UD |
1369 | @code{scalbn} is identical to @code{scalb}, except that the exponent |
1370 | @var{n} is an @code{int} instead of a floating-point number. | |
1371 | @end deftypefun | |
1372 | ||
9ad027fb | 1373 | @deftypefun double scalbln (double @var{x}, long int @var{n}) |
9ad027fb | 1374 | @deftypefunx float scalblnf (float @var{x}, long int @var{n}) |
9ad027fb | 1375 | @deftypefunx {long double} scalblnl (long double @var{x}, long int @var{n}) |
52a8e5cb GG |
1376 | @deftypefunx _FloatN scalblnfN (_Float@var{N} @var{x}, long int @var{n}) |
1377 | @deftypefunx _FloatNx scalblnfNx (_Float@var{N}x @var{x}, long int @var{n}) | |
d08a7e4c | 1378 | @standards{BSD, math.h} |
52a8e5cb GG |
1379 | @standardsx{scalblnfN, TS 18661-3:2015, math.h} |
1380 | @standardsx{scalblnfNx, TS 18661-3:2015, math.h} | |
b719dafd | 1381 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
7a68c94a UD |
1382 | @code{scalbln} is identical to @code{scalb}, except that the exponent |
1383 | @var{n} is a @code{long int} instead of a floating-point number. | |
1384 | @end deftypefun | |
28f540f4 | 1385 | |
8ded91fb | 1386 | @deftypefun double significand (double @var{x}) |
8ded91fb | 1387 | @deftypefunx float significandf (float @var{x}) |
8ded91fb | 1388 | @deftypefunx {long double} significandl (long double @var{x}) |
d08a7e4c | 1389 | @standards{BSD, math.h} |
b719dafd | 1390 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
7a68c94a UD |
1391 | @code{significand} returns the mantissa of @var{x} scaled to the range |
1392 | @math{[1, 2)}. | |
1393 | It is equivalent to @w{@code{scalb (@var{x}, (double) -ilogb (@var{x}))}}. | |
1394 | ||
1395 | This function exists mainly for use in certain standardized tests | |
1396 | of @w{IEEE 754} conformance. | |
28f540f4 RM |
1397 | @end deftypefun |
1398 | ||
7a68c94a UD |
1399 | @node Rounding Functions |
1400 | @subsection Rounding Functions | |
28f540f4 RM |
1401 | @cindex converting floats to integers |
1402 | ||
1403 | @pindex math.h | |
7a68c94a | 1404 | The functions listed here perform operations such as rounding and |
cf822e3c | 1405 | truncation of floating-point values. Some of these functions convert |
7a68c94a UD |
1406 | floating point numbers to integer values. They are all declared in |
1407 | @file{math.h}. | |
28f540f4 RM |
1408 | |
1409 | You can also convert floating-point numbers to integers simply by | |
1410 | casting them to @code{int}. This discards the fractional part, | |
1411 | effectively rounding towards zero. However, this only works if the | |
1412 | result can actually be represented as an @code{int}---for very large | |
1413 | numbers, this is impossible. The functions listed here return the | |
1414 | result as a @code{double} instead to get around this problem. | |
1415 | ||
423c2b9d JM |
1416 | The @code{fromfp} functions use the following macros, from TS |
1417 | 18661-1:2014, to specify the direction of rounding. These correspond | |
1418 | to the rounding directions defined in IEEE 754-2008. | |
1419 | ||
1420 | @vtable @code | |
423c2b9d | 1421 | @item FP_INT_UPWARD |
d08a7e4c | 1422 | @standards{ISO, math.h} |
423c2b9d JM |
1423 | Round toward @math{+@infinity{}}. |
1424 | ||
423c2b9d | 1425 | @item FP_INT_DOWNWARD |
d08a7e4c | 1426 | @standards{ISO, math.h} |
423c2b9d JM |
1427 | Round toward @math{-@infinity{}}. |
1428 | ||
423c2b9d | 1429 | @item FP_INT_TOWARDZERO |
d08a7e4c | 1430 | @standards{ISO, math.h} |
423c2b9d JM |
1431 | Round toward zero. |
1432 | ||
423c2b9d | 1433 | @item FP_INT_TONEARESTFROMZERO |
d08a7e4c | 1434 | @standards{ISO, math.h} |
423c2b9d JM |
1435 | Round to nearest, ties round away from zero. |
1436 | ||
423c2b9d | 1437 | @item FP_INT_TONEAREST |
d08a7e4c | 1438 | @standards{ISO, math.h} |
423c2b9d JM |
1439 | Round to nearest, ties round to even. |
1440 | @end vtable | |
1441 | ||
28f540f4 | 1442 | @deftypefun double ceil (double @var{x}) |
779ae82e UD |
1443 | @deftypefunx float ceilf (float @var{x}) |
1444 | @deftypefunx {long double} ceill (long double @var{x}) | |
52a8e5cb GG |
1445 | @deftypefunx _FloatN ceilfN (_Float@var{N} @var{x}) |
1446 | @deftypefunx _FloatNx ceilfNx (_Float@var{N}x @var{x}) | |
d08a7e4c | 1447 | @standards{ISO, math.h} |
52a8e5cb GG |
1448 | @standardsx{ceilfN, TS 18661-3:2015, math.h} |
1449 | @standardsx{ceilfNx, TS 18661-3:2015, math.h} | |
b719dafd | 1450 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
b4012b75 | 1451 | These functions round @var{x} upwards to the nearest integer, |
28f540f4 RM |
1452 | returning that value as a @code{double}. Thus, @code{ceil (1.5)} |
1453 | is @code{2.0}. | |
1454 | @end deftypefun | |
1455 | ||
28f540f4 | 1456 | @deftypefun double floor (double @var{x}) |
779ae82e UD |
1457 | @deftypefunx float floorf (float @var{x}) |
1458 | @deftypefunx {long double} floorl (long double @var{x}) | |
52a8e5cb GG |
1459 | @deftypefunx _FloatN floorfN (_Float@var{N} @var{x}) |
1460 | @deftypefunx _FloatNx floorfNx (_Float@var{N}x @var{x}) | |
d08a7e4c | 1461 | @standards{ISO, math.h} |
52a8e5cb GG |
1462 | @standardsx{floorfN, TS 18661-3:2015, math.h} |
1463 | @standardsx{floorfNx, TS 18661-3:2015, math.h} | |
b719dafd | 1464 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
b4012b75 | 1465 | These functions round @var{x} downwards to the nearest |
28f540f4 RM |
1466 | integer, returning that value as a @code{double}. Thus, @code{floor |
1467 | (1.5)} is @code{1.0} and @code{floor (-1.5)} is @code{-2.0}. | |
1468 | @end deftypefun | |
1469 | ||
7a68c94a UD |
1470 | @deftypefun double trunc (double @var{x}) |
1471 | @deftypefunx float truncf (float @var{x}) | |
1472 | @deftypefunx {long double} truncl (long double @var{x}) | |
52a8e5cb GG |
1473 | @deftypefunx _FloatN truncfN (_Float@var{N} @var{x}) |
1474 | @deftypefunx _FloatNx truncfNx (_Float@var{N}x @var{x}) | |
d08a7e4c | 1475 | @standards{ISO, math.h} |
52a8e5cb GG |
1476 | @standardsx{truncfN, TS 18661-3:2015, math.h} |
1477 | @standardsx{truncfNx, TS 18661-3:2015, math.h} | |
b719dafd | 1478 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
e6e81391 UD |
1479 | The @code{trunc} functions round @var{x} towards zero to the nearest |
1480 | integer (returned in floating-point format). Thus, @code{trunc (1.5)} | |
1481 | is @code{1.0} and @code{trunc (-1.5)} is @code{-1.0}. | |
7a68c94a UD |
1482 | @end deftypefun |
1483 | ||
28f540f4 | 1484 | @deftypefun double rint (double @var{x}) |
779ae82e UD |
1485 | @deftypefunx float rintf (float @var{x}) |
1486 | @deftypefunx {long double} rintl (long double @var{x}) | |
52a8e5cb GG |
1487 | @deftypefunx _FloatN rintfN (_Float@var{N} @var{x}) |
1488 | @deftypefunx _FloatNx rintfNx (_Float@var{N}x @var{x}) | |
d08a7e4c | 1489 | @standards{ISO, math.h} |
52a8e5cb GG |
1490 | @standardsx{rintfN, TS 18661-3:2015, math.h} |
1491 | @standardsx{rintfNx, TS 18661-3:2015, math.h} | |
b719dafd | 1492 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
b4012b75 | 1493 | These functions round @var{x} to an integer value according to the |
28f540f4 RM |
1494 | current rounding mode. @xref{Floating Point Parameters}, for |
1495 | information about the various rounding modes. The default | |
1496 | rounding mode is to round to the nearest integer; some machines | |
1497 | support other modes, but round-to-nearest is always used unless | |
7a68c94a UD |
1498 | you explicitly select another. |
1499 | ||
1500 | If @var{x} was not initially an integer, these functions raise the | |
1501 | inexact exception. | |
28f540f4 RM |
1502 | @end deftypefun |
1503 | ||
b4012b75 | 1504 | @deftypefun double nearbyint (double @var{x}) |
779ae82e UD |
1505 | @deftypefunx float nearbyintf (float @var{x}) |
1506 | @deftypefunx {long double} nearbyintl (long double @var{x}) | |
52a8e5cb GG |
1507 | @deftypefunx _FloatN nearbyintfN (_Float@var{N} @var{x}) |
1508 | @deftypefunx _FloatNx nearbyintfNx (_Float@var{N}x @var{x}) | |
d08a7e4c | 1509 | @standards{ISO, math.h} |
52a8e5cb GG |
1510 | @standardsx{nearbyintfN, TS 18661-3:2015, math.h} |
1511 | @standardsx{nearbyintfNx, TS 18661-3:2015, math.h} | |
b719dafd | 1512 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
7a68c94a UD |
1513 | These functions return the same value as the @code{rint} functions, but |
1514 | do not raise the inexact exception if @var{x} is not an integer. | |
1515 | @end deftypefun | |
1516 | ||
7a68c94a UD |
1517 | @deftypefun double round (double @var{x}) |
1518 | @deftypefunx float roundf (float @var{x}) | |
1519 | @deftypefunx {long double} roundl (long double @var{x}) | |
52a8e5cb GG |
1520 | @deftypefunx _FloatN roundfN (_Float@var{N} @var{x}) |
1521 | @deftypefunx _FloatNx roundfNx (_Float@var{N}x @var{x}) | |
d08a7e4c | 1522 | @standards{ISO, math.h} |
52a8e5cb GG |
1523 | @standardsx{roundfN, TS 18661-3:2015, math.h} |
1524 | @standardsx{roundfNx, TS 18661-3:2015, math.h} | |
b719dafd | 1525 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
7a68c94a | 1526 | These functions round @var{x} to the nearest integer, but unlike |
713df3d5 RM |
1527 | @code{rint} they always round halfway cases away from zero, regardless |
1528 | of the current rounding mode. |
7a68c94a UD |
1529 | @end deftypefun |
1530 | ||
41c67149 | 1531 | @deftypefun double roundeven (double @var{x}) |
41c67149 | 1532 | @deftypefunx float roundevenf (float @var{x}) |
41c67149 | 1533 | @deftypefunx {long double} roundevenl (long double @var{x}) |
52a8e5cb GG |
1534 | @deftypefunx _FloatN roundevenfN (_Float@var{N} @var{x}) |
1535 | @deftypefunx _FloatNx roundevenfNx (_Float@var{N}x @var{x}) | |
d08a7e4c | 1536 | @standards{ISO, math.h} |
52a8e5cb GG |
1537 | @standardsx{roundevenfN, TS 18661-3:2015, math.h} |
1538 | @standardsx{roundevenfNx, TS 18661-3:2015, math.h} | |
41c67149 | 1539 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
52a8e5cb GG |
1540 | These functions, from TS 18661-1:2014 and TS 18661-3:2015, are similar |
1541 | to @code{round}, but they round halfway cases to even instead of away | |
1542 | from zero. | |
41c67149 JM |
1543 | @end deftypefun |
1544 | ||
7a68c94a UD |
1545 | @deftypefun {long int} lrint (double @var{x}) |
1546 | @deftypefunx {long int} lrintf (float @var{x}) | |
1547 | @deftypefunx {long int} lrintl (long double @var{x}) | |
52a8e5cb GG |
1548 | @deftypefunx {long int} lrintfN (_Float@var{N} @var{x}) |
1549 | @deftypefunx {long int} lrintfNx (_Float@var{N}x @var{x}) | |
d08a7e4c | 1550 | @standards{ISO, math.h} |
52a8e5cb GG |
1551 | @standardsx{lrintfN, TS 18661-3:2015, math.h} |
1552 | @standardsx{lrintfNx, TS 18661-3:2015, math.h} | |
b719dafd | 1553 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
7a68c94a UD |
1554 | These functions are just like @code{rint}, but they return a |
1555 | @code{long int} instead of a floating-point number. | |
1556 | @end deftypefun | |
1557 | ||
7a68c94a UD |
1558 | @deftypefun {long long int} llrint (double @var{x}) |
1559 | @deftypefunx {long long int} llrintf (float @var{x}) | |
1560 | @deftypefunx {long long int} llrintl (long double @var{x}) | |
52a8e5cb GG |
1561 | @deftypefunx {long long int} llrintfN (_Float@var{N} @var{x}) |
1562 | @deftypefunx {long long int} llrintfNx (_Float@var{N}x @var{x}) | |
d08a7e4c | 1563 | @standards{ISO, math.h} |
52a8e5cb GG |
1564 | @standardsx{llrintfN, TS 18661-3:2015, math.h} |
1565 | @standardsx{llrintfNx, TS 18661-3:2015, math.h} | |
b719dafd | 1566 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
7a68c94a UD |
1567 | These functions are just like @code{rint}, but they return a |
1568 | @code{long long int} instead of a floating-point number. | |
b4012b75 UD |
1569 | @end deftypefun |
1570 | ||
7a68c94a UD |
1571 | @deftypefun {long int} lround (double @var{x}) |
1572 | @deftypefunx {long int} lroundf (float @var{x}) | |
1573 | @deftypefunx {long int} lroundl (long double @var{x}) | |
52a8e5cb GG |
1574 | @deftypefunx {long int} lroundfN (_Float@var{N} @var{x}) |
1575 | @deftypefunx {long int} lroundfNx (_Float@var{N}x @var{x}) | |
d08a7e4c | 1576 | @standards{ISO, math.h} |
52a8e5cb GG |
1577 | @standardsx{lroundfN, TS 18661-3:2015, math.h} |
1578 | @standardsx{lroundfNx, TS 18661-3:2015, math.h} | |
b719dafd | 1579 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
7a68c94a UD |
1580 | These functions are just like @code{round}, but they return a |
1581 | @code{long int} instead of a floating-point number. | |
1582 | @end deftypefun | |
1583 | ||
7a68c94a UD |
1584 | @deftypefun {long long int} llround (double @var{x}) |
1585 | @deftypefunx {long long int} llroundf (float @var{x}) | |
1586 | @deftypefunx {long long int} llroundl (long double @var{x}) | |
52a8e5cb GG |
1587 | @deftypefunx {long long int} llroundfN (_Float@var{N} @var{x}) |
1588 | @deftypefunx {long long int} llroundfNx (_Float@var{N}x @var{x}) | |
d08a7e4c | 1589 | @standards{ISO, math.h} |
52a8e5cb GG |
1590 | @standardsx{llroundfN, TS 18661-3:2015, math.h} |
1591 | @standardsx{llroundfNx, TS 18661-3:2015, math.h} | |
b719dafd | 1592 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
7a68c94a UD |
1593 | These functions are just like @code{round}, but they return a |
1594 | @code{long long int} instead of a floating-point number. | |
1595 | @end deftypefun | |
1596 | ||
423c2b9d | 1597 | @deftypefun intmax_t fromfp (double @var{x}, int @var{round}, unsigned int @var{width}) |
423c2b9d | 1598 | @deftypefunx intmax_t fromfpf (float @var{x}, int @var{round}, unsigned int @var{width}) |
423c2b9d | 1599 | @deftypefunx intmax_t fromfpl (long double @var{x}, int @var{round}, unsigned int @var{width}) |
52a8e5cb GG |
1600 | @deftypefunx intmax_t fromfpfN (_Float@var{N} @var{x}, int @var{round}, unsigned int @var{width}) |
1601 | @deftypefunx intmax_t fromfpfNx (_Float@var{N}x @var{x}, int @var{round}, unsigned int @var{width}) | |
423c2b9d | 1602 | @deftypefunx uintmax_t ufromfp (double @var{x}, int @var{round}, unsigned int @var{width}) |
423c2b9d | 1603 | @deftypefunx uintmax_t ufromfpf (float @var{x}, int @var{round}, unsigned int @var{width}) |
423c2b9d | 1604 | @deftypefunx uintmax_t ufromfpl (long double @var{x}, int @var{round}, unsigned int @var{width}) |
52a8e5cb GG |
1605 | @deftypefunx uintmax_t ufromfpfN (_Float@var{N} @var{x}, int @var{round}, unsigned int @var{width}) |
1606 | @deftypefunx uintmax_t ufromfpfNx (_Float@var{N}x @var{x}, int @var{round}, unsigned int @var{width}) | |
423c2b9d | 1607 | @deftypefunx intmax_t fromfpx (double @var{x}, int @var{round}, unsigned int @var{width}) |
423c2b9d | 1608 | @deftypefunx intmax_t fromfpxf (float @var{x}, int @var{round}, unsigned int @var{width}) |
423c2b9d | 1609 | @deftypefunx intmax_t fromfpxl (long double @var{x}, int @var{round}, unsigned int @var{width}) |
52a8e5cb GG |
1610 | @deftypefunx intmax_t fromfpxfN (_Float@var{N} @var{x}, int @var{round}, unsigned int @var{width}) |
1611 | @deftypefunx intmax_t fromfpxfNx (_Float@var{N}x @var{x}, int @var{round}, unsigned int @var{width}) | |
423c2b9d | 1612 | @deftypefunx uintmax_t ufromfpx (double @var{x}, int @var{round}, unsigned int @var{width}) |
423c2b9d | 1613 | @deftypefunx uintmax_t ufromfpxf (float @var{x}, int @var{round}, unsigned int @var{width}) |
423c2b9d | 1614 | @deftypefunx uintmax_t ufromfpxl (long double @var{x}, int @var{round}, unsigned int @var{width}) |
52a8e5cb GG |
1615 | @deftypefunx uintmax_t ufromfpxfN (_Float@var{N} @var{x}, int @var{round}, unsigned int @var{width}) |
1616 | @deftypefunx uintmax_t ufromfpxfNx (_Float@var{N}x @var{x}, int @var{round}, unsigned int @var{width}) | |
d08a7e4c | 1617 | @standards{ISO, math.h} |
52a8e5cb GG |
1618 | @standardsx{fromfpfN, TS 18661-3:2015, math.h} |
1619 | @standardsx{fromfpfNx, TS 18661-3:2015, math.h} | |
1620 | @standardsx{ufromfpfN, TS 18661-3:2015, math.h} | |
1621 | @standardsx{ufromfpfNx, TS 18661-3:2015, math.h} | |
1622 | @standardsx{fromfpxfN, TS 18661-3:2015, math.h} | |
1623 | @standardsx{fromfpxfNx, TS 18661-3:2015, math.h} | |
1624 | @standardsx{ufromfpxfN, TS 18661-3:2015, math.h} | |
1625 | @standardsx{ufromfpxfNx, TS 18661-3:2015, math.h} | |
1626 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | |
1627 | These functions, from TS 18661-1:2014 and TS 18661-3:2015, convert a | |
1628 | floating-point number to an integer according to the rounding direction | |
1629 | @var{round} (one of the @code{FP_INT_*} macros). If the integer is | |
1630 | outside the range of a signed or unsigned (depending on the return type | |
1631 | of the function) type of width @var{width} bits (or outside the range of | |
1632 | the return type, if @var{width} is larger), or if @var{x} is infinite or | |
1633 | NaN, or if @var{width} is zero, a domain error occurs and an unspecified | |
1634 | value is returned. The functions with an @samp{x} in their names raise | |
1635 | the inexact exception when a domain error does not occur and the | |
1636 | argument is not an integer; the other functions do not raise the inexact | |
423c2b9d JM |
1637 | exception. |
1638 | @end deftypefun | |
1639 | ||
7a68c94a | 1640 | |
28f540f4 | 1641 | @deftypefun double modf (double @var{value}, double *@var{integer-part}) |
f2ea0f5b | 1642 | @deftypefunx float modff (float @var{value}, float *@var{integer-part}) |
779ae82e | 1643 | @deftypefunx {long double} modfl (long double @var{value}, long double *@var{integer-part}) |
52a8e5cb GG |
1644 | @deftypefunx _FloatN modffN (_Float@var{N} @var{value}, _Float@var{N} *@var{integer-part}) |
1645 | @deftypefunx _FloatNx modffNx (_Float@var{N}x @var{value}, _Float@var{N}x *@var{integer-part}) | |
d08a7e4c | 1646 | @standards{ISO, math.h} |
52a8e5cb GG |
1647 | @standardsx{modffN, TS 18661-3:2015, math.h} |
1648 | @standardsx{modffNx, TS 18661-3:2015, math.h} | |
b719dafd | 1649 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
b4012b75 | 1650 | These functions break the argument @var{value} into an integer part and a |
28f540f4 RM |
1651 | fractional part (between @code{-1} and @code{1}, exclusive). Their sum |
1652 | equals @var{value}. Each of the parts has the same sign as @var{value}, | |
7a68c94a | 1653 | and the integer part is always rounded toward zero. |
28f540f4 RM |
1654 | |
1655 | @code{modf} stores the integer part in @code{*@var{integer-part}}, and | |
1656 | returns the fractional part. For example, @code{modf (2.5, &intpart)} | |
1657 | returns @code{0.5} and stores @code{2.0} into @code{intpart}. | |
1658 | @end deftypefun | |
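
As an illustration of the sign rule, the sketch below wraps @code{modf} in a hypothetical helper that returns both parts in a structure.

```c
#include <math.h>

/* Hypothetical container for the two parts produced by modf.  */
struct number_parts
{
  double integer;
  double fraction;
};

/* Split VALUE into its integer and fractional parts.  Both parts
   carry the sign of VALUE.  */
struct number_parts
split_number (double value)
{
  struct number_parts p;
  p.fraction = modf (value, &p.integer);
  return p;
}
```

For @code{-3.75} the integer part is @code{-3.0} and the fractional part is @code{-0.75}; the integer part is rounded toward zero, not down to @code{-4.0}.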
1659 | ||
7a68c94a UD |
1660 | @node Remainder Functions |
1661 | @subsection Remainder Functions | |
1662 | ||
1663 | The functions in this section compute the remainder on division of two | |
1664 | floating-point numbers. Each is a little different; pick the one that | |
1665 | suits your problem. | |
1666 | ||
28f540f4 | 1667 | @deftypefun double fmod (double @var{numerator}, double @var{denominator}) |
779ae82e UD |
1668 | @deftypefunx float fmodf (float @var{numerator}, float @var{denominator}) |
1669 | @deftypefunx {long double} fmodl (long double @var{numerator}, long double @var{denominator}) | |
52a8e5cb GG |
1670 | @deftypefunx _FloatN fmodfN (_Float@var{N} @var{numerator}, _Float@var{N} @var{denominator}) |
1671 | @deftypefunx _FloatNx fmodfNx (_Float@var{N}x @var{numerator}, _Float@var{N}x @var{denominator}) | |
d08a7e4c | 1672 | @standards{ISO, math.h} |
52a8e5cb GG |
1673 | @standardsx{fmodfN, TS 18661-3:2015, math.h} |
1674 | @standardsx{fmodfNx, TS 18661-3:2015, math.h} | |
b719dafd | 1675 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
b4012b75 | 1676 | These functions compute the remainder from the division of |
28f540f4 RM |
1677 | @var{numerator} by @var{denominator}. Specifically, the return value is |
1678 | @code{@var{numerator} - @w{@var{n} * @var{denominator}}}, where @var{n} | |
1679 | is the quotient of @var{numerator} divided by @var{denominator}, rounded | |
1680 | towards zero to an integer. Thus, @w{@code{fmod (6.5, 2.3)}} returns | |
1681 | @code{1.9}, which is @code{6.5} minus @code{4.6}. | |
1682 | ||
1683 | The result has the same sign as the @var{numerator} and has magnitude | |
1684 | less than the magnitude of the @var{denominator}. | |
1685 | ||
7a68c94a | 1686 | If @var{denominator} is zero, @code{fmod} signals a domain error. |
28f540f4 RM |
1687 | @end deftypefun |
1688 | ||
5070551c GG |
1689 | @deftypefun double remainder (double @var{numerator}, double @var{denominator}) |
1690 | @deftypefunx float remainderf (float @var{numerator}, float @var{denominator}) | |
1691 | @deftypefunx {long double} remainderl (long double @var{numerator}, long double @var{denominator}) | |
52a8e5cb GG |
1692 | @deftypefunx _FloatN remainderfN (_Float@var{N} @var{numerator}, _Float@var{N} @var{denominator}) |
1693 | @deftypefunx _FloatNx remainderfNx (_Float@var{N}x @var{numerator}, _Float@var{N}x @var{denominator}) | |
5070551c | 1694 | @standards{ISO, math.h} |
52a8e5cb GG |
1695 | @standardsx{remainderfN, TS 18661-3:2015, math.h} |
1696 | @standardsx{remainderfNx, TS 18661-3:2015, math.h} | |
b719dafd | 1697 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
76cf9889 | 1698 | These functions are like @code{fmod} except that they round the |
28f540f4 | 1699 | internal quotient @var{n} to the nearest integer instead of towards zero |
5070551c GG |
1700 | to an integer. For example, @code{remainder (6.5, 2.3)} returns |
1701 | @code{-0.4}, which is @code{6.5} minus @code{6.9}. | |
28f540f4 RM |
1702 | |
1703 | The absolute value of the result is less than or equal to half the | |
1704 | absolute value of the @var{denominator}. The difference between | |
5070551c | 1705 | @code{fmod (@var{numerator}, @var{denominator})} and @code{remainder |
28f540f4 RM |
1706 | (@var{numerator}, @var{denominator})} is always either |
1707 | @var{denominator}, minus @var{denominator}, or zero. | |
1708 | ||
5070551c | 1709 | If @var{denominator} is zero, @code{remainder} signals a domain error. |
28f540f4 RM |
1710 | @end deftypefun |
1711 | ||
5070551c GG |
1712 | @deftypefun double drem (double @var{numerator}, double @var{denominator}) |
1713 | @deftypefunx float dremf (float @var{numerator}, float @var{denominator}) | |
1714 | @deftypefunx {long double} dreml (long double @var{numerator}, long double @var{denominator}) | |
d08a7e4c | 1715 | @standards{BSD, math.h} |
b719dafd | 1716 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
5070551c | 1717 | These functions are other names for the @code{remainder} functions. |
7a68c94a | 1718 | @end deftypefun |
28f540f4 | 1719 | |
7a68c94a UD |
1720 | @node FP Bit Twiddling |
1721 | @subsection Setting and modifying single bits of FP values | |
fe0ec73e UD |
1722 | @cindex FP arithmetic |
1723 | ||
7a68c94a | 1724 | There are some operations that are too complicated or expensive to |
ec751a23 | 1725 | perform by hand on floating-point numbers. @w{ISO C99} defines |
7a68c94a UD |
1726 | functions to do these operations, which mostly involve changing single |
1727 | bits. | |
fe0ec73e | 1728 | |
fe0ec73e UD |
1729 | @deftypefun double copysign (double @var{x}, double @var{y}) |
1730 | @deftypefunx float copysignf (float @var{x}, float @var{y}) | |
1731 | @deftypefunx {long double} copysignl (long double @var{x}, long double @var{y}) | |
52a8e5cb GG |
1732 | @deftypefunx _FloatN copysignfN (_Float@var{N} @var{x}, _Float@var{N} @var{y}) |
1733 | @deftypefunx _FloatNx copysignfNx (_Float@var{N}x @var{x}, _Float@var{N}x @var{y}) | |
d08a7e4c | 1734 | @standards{ISO, math.h} |
52a8e5cb GG |
1735 | @standardsx{copysignfN, TS 18661-3:2015, math.h} |
1736 | @standardsx{copysignfNx, TS 18661-3:2015, math.h} | |
b719dafd | 1737 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
7a68c94a UD |
1738 | These functions return @var{x} but with the sign of @var{y}. They work |
1739 | even if @var{x} or @var{y} is NaN or zero. Both of these can carry a |
1740 | sign (although not all implementations support it) and this is one of | |
1741 | the few operations that can tell the difference. | |
fe0ec73e | 1742 | |
7a68c94a UD |
1743 | @code{copysign} never raises an exception. |
1744 | @c except signalling NaNs | |
fe0ec73e UD |
1745 | |
1746 | This function is defined in @w{IEC 559} (and the appendix with | |
1747 | recommended functions in @w{IEEE 754}/@w{IEEE 854}). | |
1748 | @end deftypefun | |
1749 | ||
fe0ec73e | 1750 | @deftypefun int signbit (@emph{float-type} @var{x}) |
d08a7e4c | 1751 | @standards{ISO, math.h} |
b719dafd | 1752 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
fe0ec73e UD |
1753 | @code{signbit} is a generic macro which can work on all floating-point |
1754 | types. It returns a nonzero value if the value of @var{x} has its sign | |
1755 | bit set. | |
1756 | ||
7a68c94a UD |
1757 | This is not the same as @code{x < 0.0}, because @w{IEEE 754} floating |
1758 | point allows zero to be signed. The comparison @code{-0.0 < 0.0} is | |
1759 | false, but @code{signbit (-0.0)} will return a nonzero value. | |
fe0ec73e UD |
1760 | @end deftypefun |
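
As a minimal sketch of the distinction described above, the following
helpers use @code{signbit} and @code{copysign} to detect negative zero
and to transfer a sign.  The names @code{is_negative_zero} and
@code{with_sign_of} are hypothetical and exist only for illustration;
the sketch assumes an @w{IEEE 754} target where zero carries a sign.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical helper: true only for negative zero.  The comparison
   with 0.0 accepts both zeros; signbit then separates them.  */
static bool
is_negative_zero (double x)
{
  return x == 0.0 && signbit (x) != 0;
}

/* Hypothetical helper: return MAGNITUDE with the sign of SOURCE.
   Unlike a test such as 'source < 0.0', this also honors the sign
   of a zero.  */
static double
with_sign_of (double magnitude, double source)
{
  return copysign (magnitude, source);
}
```

A test such as @code{x < 0.0} could not replace @code{signbit} in the
first helper, because that comparison is false for both zeros.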
1761 | ||
fe0ec73e UD |
1762 | @deftypefun double nextafter (double @var{x}, double @var{y}) |
1763 | @deftypefunx float nextafterf (float @var{x}, float @var{y}) | |
1764 | @deftypefunx {long double} nextafterl (long double @var{x}, long double @var{y}) | |
52a8e5cb GG |
1765 | @deftypefunx _FloatN nextafterfN (_Float@var{N} @var{x}, _Float@var{N} @var{y}) |
1766 | @deftypefunx _FloatNx nextafterfNx (_Float@var{N}x @var{x}, _Float@var{N}x @var{y}) | |
d08a7e4c | 1767 | @standards{ISO, math.h} |
52a8e5cb GG |
1768 | @standardsx{nextafterfN, TS 18661-3:2015, math.h} |
1769 | @standardsx{nextafterfNx, TS 18661-3:2015, math.h} | |
b719dafd | 1770 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
fe0ec73e | 1771 | The @code{nextafter} function returns the next representable neighbor of |
7a68c94a UD |
1772 | @var{x} in the direction towards @var{y}. The size of the step between |
1773 | @var{x} and the result depends on the type of the result. If | |
0a7fef01 | 1774 | @math{@var{x} = @var{y}} the function simply returns @var{y}. If either |
7a68c94a UD |
1775 | value is @code{NaN}, @code{NaN} is returned. Otherwise |
1776 | a value corresponding to the value of the least significant bit in the | |
1777 | mantissa is added or subtracted, depending on the direction. | |
1778 | @code{nextafter} will signal overflow or underflow if the result goes | |
1779 | outside of the range of normalized numbers. | |
fe0ec73e UD |
1780 | |
1781 | This function is defined in @w{IEC 559} (and the appendix with | |
1782 | recommended functions in @w{IEEE 754}/@w{IEEE 854}). | |
1783 | @end deftypefun | |
1784 | ||
36fe9ac9 | 1785 | @deftypefun double nexttoward (double @var{x}, long double @var{y}) |
36fe9ac9 | 1786 | @deftypefunx float nexttowardf (float @var{x}, long double @var{y}) |
36fe9ac9 | 1787 | @deftypefunx {long double} nexttowardl (long double @var{x}, long double @var{y}) |
d08a7e4c | 1788 | @standards{ISO, math.h} |
b719dafd | 1789 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
7a68c94a UD |
1790 | These functions are identical to the corresponding versions of |
1791 | @code{nextafter} except that their second argument is a @code{long | |
1792 | double}. | |
1793 | @end deftypefun | |
1794 | ||
41a359e2 | 1795 | @deftypefun double nextup (double @var{x}) |
41a359e2 | 1796 | @deftypefunx float nextupf (float @var{x}) |
41a359e2 | 1797 | @deftypefunx {long double} nextupl (long double @var{x}) |
52a8e5cb GG |
1798 | @deftypefunx _FloatN nextupfN (_Float@var{N} @var{x}) |
1799 | @deftypefunx _FloatNx nextupfNx (_Float@var{N}x @var{x}) | |
d08a7e4c | 1800 | @standards{ISO, math.h} |
52a8e5cb GG |
1801 | @standardsx{nextupfN, TS 18661-3:2015, math.h} |
1802 | @standardsx{nextupfNx, TS 18661-3:2015, math.h} | |
41a359e2 RS |
1803 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
1804 | The @code{nextup} function returns the next representable neighbor of @var{x} | |
1805 | in the direction of positive infinity.  If @var{x} is the negative subnormal |
1806 | number of smallest magnitude in the type of @var{x}, the function returns @code{-0}.  If |
1807 | @math{@var{x} = @code{0}} the function returns the smallest positive subnormal | |
1808 | number in the type of @var{x}. If @var{x} is NaN, NaN is returned. | |
1809 | If @var{x} is @math{+@infinity{}}, @math{+@infinity{}} is returned. | |
52a8e5cb | 1810 | @code{nextup} is from TS 18661-1:2014 and TS 18661-3:2015. |
41a359e2 RS |
1811 | @code{nextup} never raises an exception except for signaling NaNs. |
1812 | @end deftypefun | |
1813 | ||
41a359e2 | 1814 | @deftypefun double nextdown (double @var{x}) |
41a359e2 | 1815 | @deftypefunx float nextdownf (float @var{x}) |
41a359e2 | 1816 | @deftypefunx {long double} nextdownl (long double @var{x}) |
52a8e5cb GG |
1817 | @deftypefunx _FloatN nextdownfN (_Float@var{N} @var{x}) |
1818 | @deftypefunx _FloatNx nextdownfNx (_Float@var{N}x @var{x}) | |
d08a7e4c | 1819 | @standards{ISO, math.h} |
52a8e5cb GG |
1820 | @standardsx{nextdownfN, TS 18661-3:2015, math.h} |
1821 | @standardsx{nextdownfNx, TS 18661-3:2015, math.h} | |
41a359e2 RS |
1822 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
1823 | The @code{nextdown} function returns the next representable neighbor of @var{x} | |
1824 | in the direction of negative infinity.  If @var{x} is the positive subnormal |
1825 | number of smallest magnitude in the type of @var{x}, the function returns @code{+0}.  If |
1826 | @math{@var{x} = @code{0}} the function returns the smallest negative subnormal | |
1827 | number in the type of @var{x}. If @var{x} is NaN, NaN is returned. | |
1828 | If @var{x} is @math{-@infinity{}}, @math{-@infinity{}} is returned. | |
52a8e5cb | 1829 | @code{nextdown} is from TS 18661-1:2014 and TS 18661-3:2015. |
41a359e2 RS |
1830 | @code{nextdown} never raises an exception except for signaling NaNs. |
1831 | @end deftypefun | |
1832 | ||
fe0ec73e | 1833 | @cindex NaN |
fe0ec73e UD |
1834 | @deftypefun double nan (const char *@var{tagp}) |
1835 | @deftypefunx float nanf (const char *@var{tagp}) | |
1836 | @deftypefunx {long double} nanl (const char *@var{tagp}) | |
52a8e5cb GG |
1837 | @deftypefunx _FloatN nanfN (const char *@var{tagp}) |
1838 | @deftypefunx _FloatNx nanfNx (const char *@var{tagp}) | |
d08a7e4c | 1839 | @standards{ISO, math.h} |
52a8e5cb GG |
1840 | @standardsx{nanfN, TS 18661-3:2015, math.h} |
1841 | @standardsx{nanfNx, TS 18661-3:2015, math.h} | |
b719dafd AO |
1842 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
1843 | @c The unsafe-but-ruled-safe locale use comes from strtod. | |
7a68c94a UD |
1844 | The @code{nan} function returns a representation of NaN, provided that |
1845 | NaN is supported by the target platform. | |
1846 | @code{nan ("@var{n-char-sequence}")} is equivalent to | |
1847 | @code{strtod ("NAN(@var{n-char-sequence})", NULL)}. |
1848 | ||
1849 | The argument @var{tagp} is used in an unspecified manner. On @w{IEEE | |
1850 | 754} systems, there are many representations of NaN, and @var{tagp} | |
1851 | selects one. On other systems it may do nothing. | |
fe0ec73e UD |
1852 | @end deftypefun |
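
The following sketch illustrates the equivalence between @code{nan} and
@code{strtod} for an empty tag.  It assumes a target that supports
NaN; the helper name @code{both_yield_nan} is hypothetical.

```c
#include <math.h>
#include <stdlib.h>

/* Hypothetical helper: nonzero if both ways of requesting a NaN
   with an empty tag actually produce a NaN.  */
static int
both_yield_nan (void)
{
  double from_nan = nan ("");
  double from_strtod = strtod ("NAN()", NULL);
  return isnan (from_nan) && isnan (from_strtod);
}
```

The results are checked with @code{isnan} rather than @code{==},
because a NaN never compares equal to anything, including itself.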
1853 | ||
eaf5ad0b | 1854 | @deftypefun int canonicalize (double *@var{cx}, const double *@var{x}) |
eaf5ad0b | 1855 | @deftypefunx int canonicalizef (float *@var{cx}, const float *@var{x}) |
eaf5ad0b | 1856 | @deftypefunx int canonicalizel (long double *@var{cx}, const long double *@var{x}) |
52a8e5cb GG |
1857 | @deftypefunx int canonicalizefN (_Float@var{N} *@var{cx}, const _Float@var{N} *@var{x}) |
1858 | @deftypefunx int canonicalizefNx (_Float@var{N}x *@var{cx}, const _Float@var{N}x *@var{x}) | |
d08a7e4c | 1859 | @standards{ISO, math.h} |
52a8e5cb GG |
1860 | @standardsx{canonicalizefN, TS 18661-3:2015, math.h} |
1861 | @standardsx{canonicalizefNx, TS 18661-3:2015, math.h} | |
eaf5ad0b JM |
1862 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
1863 | In some floating-point formats, some values have canonical (preferred) | |
1864 | and noncanonical encodings (for IEEE interchange binary formats, all | |
1865 | encodings are canonical). These functions, defined by TS | |
52a8e5cb GG |
1866 | 18661-1:2014 and TS 18661-3:2015, attempt to produce a canonical version |
1867 | of the floating-point value pointed to by @var{x}; if that value is a | |
eaf5ad0b JM |
1868 | signaling NaN, they raise the invalid exception and produce a quiet |
1869 | NaN. If a canonical value is produced, it is stored in the object | |
1870 | pointed to by @var{cx}, and these functions return zero. Otherwise | |
1871 | (if a canonical value could not be produced because the object pointed | |
1872 | to by @var{x} is not a valid representation of any floating-point | |
1873 | value), the object pointed to by @var{cx} is unchanged and a nonzero | |
1874 | value is returned. | |
1875 | ||
1876 | Note that some formats have multiple encodings of a value which are | |
1877 | all equally canonical; when such an encoding is used as an input to | |
1878 | this function, any such encoding of the same value (or of the | |
1879 | corresponding quiet NaN, if that value is a signaling NaN) may be | |
1880 | produced as output. | |
1881 | @end deftypefun | |
1882 | ||
f8e8b8ed | 1883 | @deftypefun double getpayload (const double *@var{x}) |
f8e8b8ed | 1884 | @deftypefunx float getpayloadf (const float *@var{x}) |
f8e8b8ed | 1885 | @deftypefunx {long double} getpayloadl (const long double *@var{x}) |
52a8e5cb GG |
1886 | @deftypefunx _FloatN getpayloadfN (const _Float@var{N} *@var{x}) |
1887 | @deftypefunx _FloatNx getpayloadfNx (const _Float@var{N}x *@var{x}) | |
d08a7e4c | 1888 | @standards{ISO, math.h} |
52a8e5cb GG |
1889 | @standardsx{getpayloadfN, TS 18661-3:2015, math.h} |
1890 | @standardsx{getpayloadfNx, TS 18661-3:2015, math.h} | |
f8e8b8ed JM |
1891 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
1892 | IEEE 754 defines the @dfn{payload} of a NaN to be an integer value | |
1893 | encoded in the representation of the NaN. Payloads are typically | |
1894 | propagated from NaN inputs to the result of a floating-point | |
52a8e5cb GG |
1895 | operation. These functions, defined by TS 18661-1:2014 and TS |
1896 | 18661-3:2015, return the payload of the NaN pointed to by @var{x} | |
1897 | (returned as a positive integer, or positive zero, represented as a | |
6c010c5d JM |
1898 | floating-point number); if @var{x} is not a NaN, they return |
1899 | @minus{}1. They raise no floating-point exceptions even for signaling | |
1900 | NaNs. (The return value of @minus{}1 for an argument that is not a | |
1901 | NaN is specified in C2x; the value was unspecified in TS 18661.) | |
f8e8b8ed JM |
1902 | @end deftypefun |
1903 | ||
eb3c12c7 | 1904 | @deftypefun int setpayload (double *@var{x}, double @var{payload}) |
eb3c12c7 | 1905 | @deftypefunx int setpayloadf (float *@var{x}, float @var{payload}) |
eb3c12c7 | 1906 | @deftypefunx int setpayloadl (long double *@var{x}, long double @var{payload}) |
52a8e5cb GG |
1907 | @deftypefunx int setpayloadfN (_Float@var{N} *@var{x}, _Float@var{N} @var{payload}) |
1908 | @deftypefunx int setpayloadfNx (_Float@var{N}x *@var{x}, _Float@var{N}x @var{payload}) | |
d08a7e4c | 1909 | @standards{ISO, math.h} |
52a8e5cb GG |
1910 | @standardsx{setpayloadfN, TS 18661-3:2015, math.h} |
1911 | @standardsx{setpayloadfNx, TS 18661-3:2015, math.h} | |
eb3c12c7 | 1912 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
52a8e5cb GG |
1913 | These functions, defined by TS 18661-1:2014 and TS 18661-3:2015, set the |
1914 | object pointed to by @var{x} to a quiet NaN with payload @var{payload} | |
1915 | and a zero sign bit and return zero. If @var{payload} is not a | |
1916 | positive-signed integer that is a valid payload for a quiet NaN of the | |
1917 | given type, the object pointed to by @var{x} is set to positive zero and | |
1918 | a nonzero value is returned. They raise no floating-point exceptions. | |
eb3c12c7 JM |
1919 | @end deftypefun |
1920 | ||
457663a7 | 1921 | @deftypefun int setpayloadsig (double *@var{x}, double @var{payload}) |
457663a7 | 1922 | @deftypefunx int setpayloadsigf (float *@var{x}, float @var{payload}) |
457663a7 | 1923 | @deftypefunx int setpayloadsigl (long double *@var{x}, long double @var{payload}) |
52a8e5cb GG |
1924 | @deftypefunx int setpayloadsigfN (_Float@var{N} *@var{x}, _Float@var{N} @var{payload}) |
1925 | @deftypefunx int setpayloadsigfNx (_Float@var{N}x *@var{x}, _Float@var{N}x @var{payload}) | |
d08a7e4c | 1926 | @standards{ISO, math.h} |
52a8e5cb GG |
1927 | @standardsx{setpayloadsigfN, TS 18661-3:2015, math.h} |
1928 | @standardsx{setpayloadsigfNx, TS 18661-3:2015, math.h} | |
457663a7 | 1929 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
52a8e5cb GG |
1930 | These functions, defined by TS 18661-1:2014 and TS 18661-3:2015, set the |
1931 | object pointed to by @var{x} to a signaling NaN with payload | |
1932 | @var{payload} and a zero sign bit and return zero. If @var{payload} is | |
1933 | not a positive-signed integer that is a valid payload for a signaling | |
1934 | NaN of the given type, the object pointed to by @var{x} is set to | |
1935 | positive zero and a nonzero value is returned. They raise no | |
1936 | floating-point exceptions. | |
457663a7 JM |
1937 | @end deftypefun |
1938 | ||
7a68c94a UD |
1939 | @node FP Comparison Functions |
1940 | @subsection Floating-Point Comparison Functions | |
1941 | @cindex unordered comparison | |
fe0ec73e | 1942 | |
7a68c94a UD |
1943 | The standard C relational operators (@code{<}, @code{<=}, @code{>} and |
1944 | @code{>=}) raise the invalid exception when either operand is NaN.  For example, |
1945 | ||
1946 | @smallexample | |
1947 | int v = a < 1.0; | |
1948 | @end smallexample | |
1949 | ||
1950 | @noindent | |
1951 | will raise an exception if @var{a} is NaN. (This does @emph{not} | |
1952 | happen with @code{==} and @code{!=}; those merely return false and true, | |
1953 | respectively, when NaN is examined.) Frequently this exception is | |
ec751a23 | 1954 | undesirable. @w{ISO C99} therefore defines comparison functions that |
7a68c94a UD |
1955 | do not raise exceptions when NaN is examined. All of the functions are |
1956 | implemented as macros which allow their arguments to be of any | |
1957 | floating-point type. The macros are guaranteed to evaluate their | |
1e7c8fcc | 1958 | arguments only once. TS 18661-1:2014 adds such a macro for an |
5e9d98a3 JM |
1959 | equality comparison that @emph{does} raise an exception for a NaN |
1960 | argument; it also adds functions that provide a total ordering on all | |
1961 | floating-point values, including NaNs, without raising any exceptions | |
1962 | even for signaling NaNs. | |
7a68c94a | 1963 | |
7a68c94a | 1964 | @deftypefn Macro int isgreater (@emph{real-floating} @var{x}, @emph{real-floating} @var{y}) |
d08a7e4c | 1965 | @standards{ISO, math.h} |
b719dafd | 1966 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
7a68c94a UD |
1967 | This macro determines whether the argument @var{x} is greater than |
1968 | @var{y}. It is equivalent to @code{(@var{x}) > (@var{y})}, but no | |
1969 | exception is raised if @var{x} or @var{y} are NaN. | |
1970 | @end deftypefn | |
1971 | ||
7a68c94a | 1972 | @deftypefn Macro int isgreaterequal (@emph{real-floating} @var{x}, @emph{real-floating} @var{y}) |
d08a7e4c | 1973 | @standards{ISO, math.h} |
b719dafd | 1974 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
7a68c94a UD |
1975 | This macro determines whether the argument @var{x} is greater than or |
1976 | equal to @var{y}. It is equivalent to @code{(@var{x}) >= (@var{y})}, but no | |
1977 | exception is raised if @var{x} or @var{y} are NaN. | |
1978 | @end deftypefn | |
1979 | ||
7a68c94a | 1980 | @deftypefn Macro int isless (@emph{real-floating} @var{x}, @emph{real-floating} @var{y}) |
d08a7e4c | 1981 | @standards{ISO, math.h} |
b719dafd | 1982 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
7a68c94a UD |
1983 | This macro determines whether the argument @var{x} is less than @var{y}. |
1984 | It is equivalent to @code{(@var{x}) < (@var{y})}, but no exception is | |
1985 | raised if @var{x} or @var{y} are NaN. | |
1986 | @end deftypefn | |
1987 | ||
7a68c94a | 1988 | @deftypefn Macro int islessequal (@emph{real-floating} @var{x}, @emph{real-floating} @var{y}) |
d08a7e4c | 1989 | @standards{ISO, math.h} |
b719dafd | 1990 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
7a68c94a UD |
1991 | This macro determines whether the argument @var{x} is less than or equal |
1992 | to @var{y}. It is equivalent to @code{(@var{x}) <= (@var{y})}, but no | |
1993 | exception is raised if @var{x} or @var{y} are NaN. | |
1994 | @end deftypefn | |
1995 | ||
7a68c94a | 1996 | @deftypefn Macro int islessgreater (@emph{real-floating} @var{x}, @emph{real-floating} @var{y}) |
d08a7e4c | 1997 | @standards{ISO, math.h} |
b719dafd | 1998 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
7a68c94a UD |
1999 | This macro determines whether the argument @var{x} is less or greater |
2000 | than @var{y}. It is equivalent to @code{(@var{x}) < (@var{y}) || | |
2001 | (@var{x}) > (@var{y})} (although it only evaluates @var{x} and @var{y} | |
2002 | once), but no exception is raised if @var{x} or @var{y} are NaN. | |
2003 | ||
2004 | This macro is not equivalent to @code{@var{x} != @var{y}}, because that | |
2005 | expression is true if @var{x} or @var{y} are NaN. | |
2006 | @end deftypefn | |
2007 | ||
7a68c94a | 2008 | @deftypefn Macro int isunordered (@emph{real-floating} @var{x}, @emph{real-floating} @var{y}) |
d08a7e4c | 2009 | @standards{ISO, math.h} |
b719dafd | 2010 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
7a68c94a UD |
2011 | This macro determines whether its arguments are unordered. In other |
2012 | words, it is true if @var{x} or @var{y} are NaN, and false otherwise. | |
2013 | @end deftypefn | |
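
As an illustration, the macros above can be combined into a three-way
comparison that never raises an exception for a quiet NaN.  This is
only a sketch: the function name @code{compare_quiet} and its return
convention are hypothetical, not part of any standard.

```c
#include <math.h>

/* Hypothetical three-way comparison: returns -1, 0, or 1 for ordered
   operands, and 2 when either operand is NaN.  */
static int
compare_quiet (double a, double b)
{
  if (isunordered (a, b))
    return 2;
  if (isless (a, b))
    return -1;
  return isgreater (a, b) ? 1 : 0;
}
```

Checking @code{isunordered} first lets the caller tell ``equal'' apart
from ``cannot be compared'', which a plain chain of @code{<} and
@code{>} would not do without raising the invalid exception.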
2014 | ||
1e7c8fcc | 2015 | @deftypefn Macro int iseqsig (@emph{real-floating} @var{x}, @emph{real-floating} @var{y}) |
d08a7e4c | 2016 | @standards{ISO, math.h} |
1e7c8fcc JM |
2017 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
2018 | This macro determines whether its arguments are equal. It is | |
2019 | equivalent to @code{(@var{x}) == (@var{y})}, but it raises the invalid | |
c32bb03c | 2020 | exception and sets @code{errno} to @code{EDOM} if either argument is a |
1e7c8fcc JM |
2021 | NaN. |
2022 | @end deftypefn | |
2023 | ||
42760d76 JM |
2024 | @deftypefun int totalorder (const double *@var{x}, const double *@var{y}) |
2025 | @deftypefunx int totalorderf (const float *@var{x}, const float *@var{y}) | |
2026 | @deftypefunx int totalorderl (const long double *@var{x}, const long double *@var{y}) | |
2027 | @deftypefunx int totalorderfN (const _Float@var{N} *@var{x}, const _Float@var{N} *@var{y}) | |
2028 | @deftypefunx int totalorderfNx (const _Float@var{N}x *@var{x}, const _Float@var{N}x *@var{y}) | |
1b009d5a | 2029 | @standards{TS 18661-1:2014, math.h} |
52a8e5cb GG |
2030 | @standardsx{totalorderfN, TS 18661-3:2015, math.h} |
2031 | @standardsx{totalorderfNx, TS 18661-3:2015, math.h} | |
5e9d98a3 JM |
2032 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
2033 | These functions determine whether the total order relationship, | |
42760d76 JM |
2034 | defined in IEEE 754-2008, is true for @code{*@var{x}} and |
2035 | @code{*@var{y}}, returning | |
5e9d98a3 JM |
2036 | nonzero if it is true and zero if it is false. No exceptions are |
2037 | raised even for signaling NaNs. The relationship is true if they are | |
2038 | the same floating-point value (including sign for zero and NaNs, and | |
42760d76 JM |
2039 | payload for NaNs), or if @code{*@var{x}} comes before @code{*@var{y}} |
2040 | in the following | |
5e9d98a3 JM |
2041 | order: negative quiet NaNs, in order of decreasing payload; negative |
2042 | signaling NaNs, in order of decreasing payload; negative infinity; | |
2043 | finite numbers, in ascending order, with negative zero before positive | |
2044 | zero; positive infinity; positive signaling NaNs, in order of | |
2045 | increasing payload; positive quiet NaNs, in order of increasing | |
2046 | payload. | |
2047 | @end deftypefun | |
2048 | ||
42760d76 JM |
2049 | @deftypefun int totalordermag (const double *@var{x}, const double *@var{y}) |
2050 | @deftypefunx int totalordermagf (const float *@var{x}, const float *@var{y}) | |
2051 | @deftypefunx int totalordermagl (const long double *@var{x}, const long double *@var{y}) | |
2052 | @deftypefunx int totalordermagfN (const _Float@var{N} *@var{x}, const _Float@var{N} *@var{y}) | |
2053 | @deftypefunx int totalordermagfNx (const _Float@var{N}x *@var{x}, const _Float@var{N}x *@var{y}) | |
1b009d5a | 2054 | @standards{TS 18661-1:2014, math.h} |
52a8e5cb GG |
2055 | @standardsx{totalordermagfN, TS 18661-3:2015, math.h} |
2056 | @standardsx{totalordermagfNx, TS 18661-3:2015, math.h} | |
cc6a8d74 JM |
2057 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
2058 | These functions determine whether the total order relationship, | |
42760d76 JM |
2059 | defined in IEEE 754-2008, is true for the absolute values of @code{*@var{x}} |
2060 | and @code{*@var{y}}, returning nonzero if it is true and zero if it is false. | |
cc6a8d74 JM |
2061 | No exceptions are raised even for signaling NaNs. |
2062 | @end deftypefun | |
2063 | ||
7a68c94a UD |
2064 | Not all machines provide hardware support for these operations. On |
2065 | machines that don't, the macros can be very slow. Therefore, you should | |
2066 | not use these functions when NaN is not a concern. | |
2067 | ||
48b22986 | 2068 | @strong{NB:} There are no macros @code{isequal} or @code{isunequal}. |
7a68c94a UD |
2069 | They are unnecessary, because the @code{==} and @code{!=} operators do |
2070 | @emph{not} raise an exception if one or both of the operands are NaN. |
2071 | ||
2072 | @node Misc FP Arithmetic | |
2073 | @subsection Miscellaneous FP arithmetic functions | |
fe0ec73e UD |
2074 | @cindex minimum |
2075 | @cindex maximum | |
7a68c94a UD |
2076 | @cindex positive difference |
2077 | @cindex multiply-add | |
fe0ec73e | 2078 | |
7a68c94a UD |
2079 | The functions in this section perform miscellaneous but common |
2080 | operations that are awkward to express with C operators. On some | |
2081 | processors these functions can use special machine instructions to | |
2082 | perform these operations faster than the equivalent C code. | |
fe0ec73e | 2083 | |
fe0ec73e UD |
2084 | @deftypefun double fmin (double @var{x}, double @var{y}) |
2085 | @deftypefunx float fminf (float @var{x}, float @var{y}) | |
2086 | @deftypefunx {long double} fminl (long double @var{x}, long double @var{y}) | |
52a8e5cb GG |
2087 | @deftypefunx _FloatN fminfN (_Float@var{N} @var{x}, _Float@var{N} @var{y}) |
2088 | @deftypefunx _FloatNx fminfNx (_Float@var{N}x @var{x}, _Float@var{N}x @var{y}) | |
d08a7e4c | 2089 | @standards{ISO, math.h} |
52a8e5cb GG |
2090 | @standardsx{fminfN, TS 18661-3:2015, math.h} |
2091 | @standardsx{fminfNx, TS 18661-3:2015, math.h} | |
b719dafd | 2092 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
7a68c94a UD |
2093 | The @code{fmin} function returns the lesser of the two values @var{x} |
2094 | and @var{y}. It is similar to the expression | |
2095 | @smallexample | |
2096 | ((x) < (y) ? (x) : (y)) | |
2097 | @end smallexample | |
2098 | except that @var{x} and @var{y} are only evaluated once. | |
fe0ec73e | 2099 | |
90f0ac10 JM |
2100 | If an argument is a quiet NaN, the other argument is returned. If both arguments |
2101 | are NaN, or either is a signaling NaN, NaN is returned. | |
fe0ec73e UD |
2102 | @end deftypefun |
2103 | ||
fe0ec73e UD |
2104 | @deftypefun double fmax (double @var{x}, double @var{y}) |
2105 | @deftypefunx float fmaxf (float @var{x}, float @var{y}) | |
2106 | @deftypefunx {long double} fmaxl (long double @var{x}, long double @var{y}) | |
52a8e5cb GG |
2107 | @deftypefunx _FloatN fmaxfN (_Float@var{N} @var{x}, _Float@var{N} @var{y}) |
2108 | @deftypefunx _FloatNx fmaxfNx (_Float@var{N}x @var{x}, _Float@var{N}x @var{y}) | |
d08a7e4c | 2109 | @standards{ISO, math.h} |
52a8e5cb GG |
2110 | @standardsx{fmaxfN, TS 18661-3:2015, math.h} |
2111 | @standardsx{fmaxfNx, TS 18661-3:2015, math.h} | |
b719dafd | 2112 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
7a68c94a UD |
2113 | The @code{fmax} function returns the greater of the two values @var{x} |
2114 | and @var{y}. | |
fe0ec73e | 2115 | |
90f0ac10 JM |
2116 | If an argument is a quiet NaN, the other argument is returned. If both arguments |
2117 | are NaN, or either is a signaling NaN, NaN is returned. | |
2118 | @end deftypefun | |
2119 | ||
2120 | @deftypefun double fminimum (double @var{x}, double @var{y}) | |
2121 | @deftypefunx float fminimumf (float @var{x}, float @var{y}) | |
2122 | @deftypefunx {long double} fminimuml (long double @var{x}, long double @var{y}) | |
2123 | @deftypefunx _FloatN fminimumfN (_Float@var{N} @var{x}, _Float@var{N} @var{y}) | |
2124 | @deftypefunx _FloatNx fminimumfNx (_Float@var{N}x @var{x}, _Float@var{N}x @var{y}) | |
2125 | @standards{C2X, math.h} | |
2126 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | |
2127 | The @code{fminimum} function returns the lesser of the two values @var{x} | |
2128 | and @var{y}. Unlike @code{fmin}, if either argument is a NaN, NaN is returned. | |
2129 | Positive zero is treated as greater than negative zero. | |
2130 | @end deftypefun | |
2131 | ||
2132 | @deftypefun double fmaximum (double @var{x}, double @var{y}) | |
2133 | @deftypefunx float fmaximumf (float @var{x}, float @var{y}) | |
2134 | @deftypefunx {long double} fmaximuml (long double @var{x}, long double @var{y}) | |
2135 | @deftypefunx _FloatN fmaximumfN (_Float@var{N} @var{x}, _Float@var{N} @var{y}) | |
2136 | @deftypefunx _FloatNx fmaximumfNx (_Float@var{N}x @var{x}, _Float@var{N}x @var{y}) | |
2137 | @standards{C2X, math.h} | |
2138 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | |
2139 | The @code{fmaximum} function returns the greater of the two values @var{x} | |
2140 | and @var{y}. Unlike @code{fmax}, if either argument is a NaN, NaN is returned. | |
2141 | Positive zero is treated as greater than negative zero. | |
2142 | @end deftypefun | |
2143 | ||
2144 | @deftypefun double fminimum_num (double @var{x}, double @var{y}) | |
2145 | @deftypefunx float fminimum_numf (float @var{x}, float @var{y}) | |
2146 | @deftypefunx {long double} fminimum_numl (long double @var{x}, long double @var{y}) | |
2147 | @deftypefunx _FloatN fminimum_numfN (_Float@var{N} @var{x}, _Float@var{N} @var{y}) | |
2148 | @deftypefunx _FloatNx fminimum_numfNx (_Float@var{N}x @var{x}, _Float@var{N}x @var{y}) | |
2149 | @standards{C2X, math.h} | |
2150 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | |
2151 | The @code{fminimum_num} function returns the lesser of the two values | |
2152 | @var{x} and @var{y}. If one argument is a number and the other is a | |
2153 | NaN, even a signaling NaN, the number is returned. Positive zero is | |
2154 | treated as greater than negative zero. | |
2155 | @end deftypefun | |
2156 | ||
2157 | @deftypefun double fmaximum_num (double @var{x}, double @var{y}) | |
2158 | @deftypefunx float fmaximum_numf (float @var{x}, float @var{y}) | |
2159 | @deftypefunx {long double} fmaximum_numl (long double @var{x}, long double @var{y}) | |
2160 | @deftypefunx _FloatN fmaximum_numfN (_Float@var{N} @var{x}, _Float@var{N} @var{y}) | |
2161 | @deftypefunx _FloatNx fmaximum_numfNx (_Float@var{N}x @var{x}, _Float@var{N}x @var{y}) | |
2162 | @standards{C2X, math.h} | |
2163 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | |
2164 | The @code{fmaximum_num} function returns the greater of the two values | |
2165 | @var{x} and @var{y}. If one argument is a number and the other is a | |
2166 | NaN, even a signaling NaN, the number is returned. Positive zero is | |
2167 | treated as greater than negative zero. | |
fe0ec73e UD |
2168 | @end deftypefun |
2169 | ||
525f8039 | 2170 | @deftypefun double fminmag (double @var{x}, double @var{y}) |
525f8039 | 2171 | @deftypefunx float fminmagf (float @var{x}, float @var{y}) |
525f8039 | 2172 | @deftypefunx {long double} fminmagl (long double @var{x}, long double @var{y}) |
52a8e5cb GG |
2173 | @deftypefunx _FloatN fminmagfN (_Float@var{N} @var{x}, _Float@var{N} @var{y}) |
2174 | @deftypefunx _FloatNx fminmagfNx (_Float@var{N}x @var{x}, _Float@var{N}x @var{y}) | |
d08a7e4c | 2175 | @standards{ISO, math.h} |
52a8e5cb GG |
2176 | @standardsx{fminmagfN, TS 18661-3:2015, math.h} |
2177 | @standardsx{fminmagfNx, TS 18661-3:2015, math.h} | |
525f8039 | 2178 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
52a8e5cb GG |
2179 | These functions, from TS 18661-1:2014 and TS 18661-3:2015, return |
2180 | whichever of the two values @var{x} and @var{y} has the smaller absolute | |
2181 | value. If both have the same absolute value, or either is NaN, they | |
2182 | behave the same as the @code{fmin} functions. | |
525f8039 JM |
2183 | @end deftypefun |
2184 | ||
525f8039 | 2185 | @deftypefun double fmaxmag (double @var{x}, double @var{y}) |
525f8039 | 2186 | @deftypefunx float fmaxmagf (float @var{x}, float @var{y}) |
525f8039 | 2187 | @deftypefunx {long double} fmaxmagl (long double @var{x}, long double @var{y}) |
52a8e5cb GG |
2188 | @deftypefunx _FloatN fmaxmagfN (_Float@var{N} @var{x}, _Float@var{N} @var{y}) |
2189 | @deftypefunx _FloatNx fmaxmagfNx (_Float@var{N}x @var{x}, _Float@var{N}x @var{y}) | |
d08a7e4c | 2190 | @standards{ISO, math.h} |
52a8e5cb GG |
2191 | @standardsx{fmaxmagfN, TS 18661-3:2015, math.h} |
2192 | @standardsx{fmaxmagfNx, TS 18661-3:2015, math.h} | |
525f8039 JM |
2193 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
2194 | These functions, from TS 18661-1:2014 and TS 18661-3:2015, return whichever of the two |
2195 | values @var{x} and @var{y} has the greater absolute value. If both | |
2196 | have the same absolute value, or either is NaN, they behave the same | |
2197 | as the @code{fmax} functions. | |
2198 | @end deftypefun | |
2199 | ||
90f0ac10 JM |
2200 | @deftypefun double fminimum_mag (double @var{x}, double @var{y}) |
2201 | @deftypefunx float fminimum_magf (float @var{x}, float @var{y}) | |
2202 | @deftypefunx {long double} fminimum_magl (long double @var{x}, long double @var{y}) | |
2203 | @deftypefunx _FloatN fminimum_magfN (_Float@var{N} @var{x}, _Float@var{N} @var{y}) | |
2204 | @deftypefunx _FloatNx fminimum_magfNx (_Float@var{N}x @var{x}, _Float@var{N}x @var{y}) | |
2205 | @standards{C2X, math.h} | |
2206 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | |
2207 | These functions return whichever of the two values @var{x} and @var{y} | |
2208 | has the smaller absolute value. If both have the same absolute value, | |
2209 | or either is NaN, they behave the same as the @code{fminimum} | |
2210 | functions. | |
2211 | @end deftypefun | |
2212 | ||
2213 | @deftypefun double fmaximum_mag (double @var{x}, double @var{y}) | |
2214 | @deftypefunx float fmaximum_magf (float @var{x}, float @var{y}) | |
2215 | @deftypefunx {long double} fmaximum_magl (long double @var{x}, long double @var{y}) | |
2216 | @deftypefunx _FloatN fmaximum_magfN (_Float@var{N} @var{x}, _Float@var{N} @var{y}) | |
2217 | @deftypefunx _FloatNx fmaximum_magfNx (_Float@var{N}x @var{x}, _Float@var{N}x @var{y}) | |
2218 | @standards{C2X, math.h} | |
2219 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | |
2220 | These functions return whichever of the two values @var{x} and @var{y} | |
2221 | has the greater absolute value. If both have the same absolute value, | |
2222 | or either is NaN, they behave the same as the @code{fmaximum} | |
2223 | functions. | |
2224 | @end deftypefun | |
2225 | ||
2226 | @deftypefun double fminimum_mag_num (double @var{x}, double @var{y}) | |
2227 | @deftypefunx float fminimum_mag_numf (float @var{x}, float @var{y}) | |
2228 | @deftypefunx {long double} fminimum_mag_numl (long double @var{x}, long double @var{y}) | |
2229 | @deftypefunx _FloatN fminimum_mag_numfN (_Float@var{N} @var{x}, _Float@var{N} @var{y}) | |
2230 | @deftypefunx _FloatNx fminimum_mag_numfNx (_Float@var{N}x @var{x}, _Float@var{N}x @var{y}) | |
2231 | @standards{C2X, math.h} | |
2232 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | |
2233 | These functions return whichever of the two values @var{x} and @var{y} | |
2234 | has the smaller absolute value. If both have the same absolute value, | |
2235 | or either is NaN, they behave the same as the @code{fminimum_num} | |
2236 | functions. | |
2237 | @end deftypefun | |
2238 | ||
2239 | @deftypefun double fmaximum_mag_num (double @var{x}, double @var{y}) | |
2240 | @deftypefunx float fmaximum_mag_numf (float @var{x}, float @var{y}) | |
2241 | @deftypefunx {long double} fmaximum_mag_numl (long double @var{x}, long double @var{y}) | |
2242 | @deftypefunx _FloatN fmaximum_mag_numfN (_Float@var{N} @var{x}, _Float@var{N} @var{y}) | |
2243 | @deftypefunx _FloatNx fmaximum_mag_numfNx (_Float@var{N}x @var{x}, _Float@var{N}x @var{y}) | |
2244 | @standards{C2X, math.h} | |
2245 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | |
2246 | These functions return whichever of the two values @var{x} and @var{y} | |
2247 | has the greater absolute value. If both have the same absolute value, | |
2248 | or either is NaN, they behave the same as the @code{fmaximum_num} | |
2249 | functions. | |
2250 | @end deftypefun | |
2251 | ||
fe0ec73e UD |
2252 | @deftypefun double fdim (double @var{x}, double @var{y}) |
2253 | @deftypefunx float fdimf (float @var{x}, float @var{y}) | |
2254 | @deftypefunx {long double} fdiml (long double @var{x}, long double @var{y}) | |
52a8e5cb GG |
2255 | @deftypefunx _FloatN fdimfN (_Float@var{N} @var{x}, _Float@var{N} @var{y}) |
2256 | @deftypefunx _FloatNx fdimfNx (_Float@var{N}x @var{x}, _Float@var{N}x @var{y}) | |
d08a7e4c | 2257 | @standards{ISO, math.h} |
52a8e5cb GG |
2258 | @standardsx{fdimfN, TS 18661-3:2015, math.h} |
2259 | @standardsx{fdimfNx, TS 18661-3:2015, math.h} | |
b719dafd | 2260 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
7a68c94a UD |
2261 | The @code{fdim} function returns the positive difference between |
2262 | @var{x} and @var{y}. The positive difference is @math{@var{x} - | |
2263 | @var{y}} if @var{x} is greater than @var{y}, and @math{0} otherwise. | |
fe0ec73e | 2264 | |
7a68c94a | 2265 | If @var{x}, @var{y}, or both are NaN, NaN is returned. |
fe0ec73e UD |
2266 | @end deftypefun |
2267 | ||
fe0ec73e UD |
2268 | @deftypefun double fma (double @var{x}, double @var{y}, double @var{z}) |
2269 | @deftypefunx float fmaf (float @var{x}, float @var{y}, float @var{z}) | |
2270 | @deftypefunx {long double} fmal (long double @var{x}, long double @var{y}, long double @var{z}) | |
52a8e5cb GG |
2271 | @deftypefunx _FloatN fmafN (_Float@var{N} @var{x}, _Float@var{N} @var{y}, _Float@var{N} @var{z}) |
2272 | @deftypefunx _FloatNx fmafNx (_Float@var{N}x @var{x}, _Float@var{N}x @var{y}, _Float@var{N}x @var{z}) | |
d08a7e4c | 2273 | @standards{ISO, math.h} |
52a8e5cb GG |
2274 | @standardsx{fmafN, TS 18661-3:2015, math.h} |
2275 | @standardsx{fmafNx, TS 18661-3:2015, math.h} | |
fe0ec73e | 2276 | @cindex butterfly |
b719dafd | 2277 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
7a68c94a UD |
2278 | The @code{fma} function performs floating-point multiply-add. This is |
2279 | the operation @math{(@var{x} @mul{} @var{y}) + @var{z}}, but the | |
2280 | intermediate result is not rounded to the destination type. This can | |
2281 | sometimes improve the precision of a calculation. | |
2282 | ||
2283 | This function was introduced because some processors have a special | |
2284 | instruction to perform multiply-add. The C compiler cannot use it | |
2285 | directly, because the expression @samp{x*y + z} is defined to round the | |
2286 | intermediate result. @code{fma} lets you choose when you want to round | |
2287 | only once. | |
fe0ec73e UD |
2288 | |
2289 | @vindex FP_FAST_FMA | |
7a68c94a UD |
2290 | On processors which do not implement multiply-add in hardware, |
2291 | @code{fma} can be very slow since it must avoid intermediate rounding. | |
2292 | @file{math.h} defines the symbols @code{FP_FAST_FMA}, | |
2293 | @code{FP_FAST_FMAF}, and @code{FP_FAST_FMAL} when the corresponding | |
2294 | version of @code{fma} is no slower than the expression @samp{x*y + z}. | |
1f77f049 | 2295 | In @theglibc{}, this always means the operation is implemented in |
7a68c94a | 2296 | hardware. |
fe0ec73e UD |
2297 | @end deftypefun |
2298 | ||
d8742dd8 JM |
2299 | @deftypefun float fadd (double @var{x}, double @var{y}) |
2300 | @deftypefunx float faddl (long double @var{x}, long double @var{y}) | |
2301 | @deftypefunx double daddl (long double @var{x}, long double @var{y}) | |
2302 | @deftypefunx _FloatM fMaddfN (_Float@var{N} @var{x}, _Float@var{N} @var{y}) | |
2303 | @deftypefunx _FloatM fMaddfNx (_Float@var{N}x @var{x}, _Float@var{N}x @var{y}) | |
2304 | @deftypefunx _FloatMx fMxaddfN (_Float@var{N} @var{x}, _Float@var{N} @var{y}) | |
2305 | @deftypefunx _FloatMx fMxaddfNx (_Float@var{N}x @var{x}, _Float@var{N}x @var{y}) | |
2306 | @standards{TS 18661-1:2014, math.h} | |
2307 | @standardsx{fMaddfN, TS 18661-3:2015, math.h} | |
2308 | @standardsx{fMaddfNx, TS 18661-3:2015, math.h} | |
2309 | @standardsx{fMxaddfN, TS 18661-3:2015, math.h} | |
2310 | @standardsx{fMxaddfNx, TS 18661-3:2015, math.h} | |
2311 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | |
2312 | These functions, from TS 18661-1:2014 and TS 18661-3:2015, return | |
2313 | @math{@var{x} + @var{y}}, rounded once to the return type of the | |
2314 | function without any intermediate rounding to the type of the | |
2315 | arguments. | |
2316 | @end deftypefun | |
2317 | ||
8d3f9e85 JM |
2318 | @deftypefun float fsub (double @var{x}, double @var{y}) |
2319 | @deftypefunx float fsubl (long double @var{x}, long double @var{y}) | |
2320 | @deftypefunx double dsubl (long double @var{x}, long double @var{y}) | |
2321 | @deftypefunx _FloatM fMsubfN (_Float@var{N} @var{x}, _Float@var{N} @var{y}) | |
2322 | @deftypefunx _FloatM fMsubfNx (_Float@var{N}x @var{x}, _Float@var{N}x @var{y}) | |
2323 | @deftypefunx _FloatMx fMxsubfN (_Float@var{N} @var{x}, _Float@var{N} @var{y}) | |
2324 | @deftypefunx _FloatMx fMxsubfNx (_Float@var{N}x @var{x}, _Float@var{N}x @var{y}) | |
2325 | @standards{TS 18661-1:2014, math.h} | |
2326 | @standardsx{fMsubfN, TS 18661-3:2015, math.h} | |
2327 | @standardsx{fMsubfNx, TS 18661-3:2015, math.h} | |
2328 | @standardsx{fMxsubfN, TS 18661-3:2015, math.h} | |
2329 | @standardsx{fMxsubfNx, TS 18661-3:2015, math.h} | |
2330 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | |
2331 | These functions, from TS 18661-1:2014 and TS 18661-3:2015, return | |
2332 | @math{@var{x} - @var{y}}, rounded once to the return type of the | |
2333 | function without any intermediate rounding to the type of the | |
2334 | arguments. | |
2335 | @end deftypefun | |
2336 | ||
69a01461 JM |
2337 | @deftypefun float fmul (double @var{x}, double @var{y}) |
2338 | @deftypefunx float fmull (long double @var{x}, long double @var{y}) | |
2339 | @deftypefunx double dmull (long double @var{x}, long double @var{y}) | |
2340 | @deftypefunx _FloatM fMmulfN (_Float@var{N} @var{x}, _Float@var{N} @var{y}) | |
2341 | @deftypefunx _FloatM fMmulfNx (_Float@var{N}x @var{x}, _Float@var{N}x @var{y}) | |
2342 | @deftypefunx _FloatMx fMxmulfN (_Float@var{N} @var{x}, _Float@var{N} @var{y}) | |
2343 | @deftypefunx _FloatMx fMxmulfNx (_Float@var{N}x @var{x}, _Float@var{N}x @var{y}) | |
2344 | @standards{TS 18661-1:2014, math.h} | |
2345 | @standardsx{fMmulfN, TS 18661-3:2015, math.h} | |
2346 | @standardsx{fMmulfNx, TS 18661-3:2015, math.h} | |
2347 | @standardsx{fMxmulfN, TS 18661-3:2015, math.h} | |
2348 | @standardsx{fMxmulfNx, TS 18661-3:2015, math.h} | |
2349 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | |
2350 | These functions, from TS 18661-1:2014 and TS 18661-3:2015, return | |
2351 | @math{@var{x} * @var{y}}, rounded once to the return type of the | |
2352 | function without any intermediate rounding to the type of the | |
2353 | arguments. | |
2354 | @end deftypefun | |
2355 | ||
632a6cbe JM |
2356 | @deftypefun float fdiv (double @var{x}, double @var{y}) |
2357 | @deftypefunx float fdivl (long double @var{x}, long double @var{y}) | |
2358 | @deftypefunx double ddivl (long double @var{x}, long double @var{y}) | |
2359 | @deftypefunx _FloatM fMdivfN (_Float@var{N} @var{x}, _Float@var{N} @var{y}) | |
2360 | @deftypefunx _FloatM fMdivfNx (_Float@var{N}x @var{x}, _Float@var{N}x @var{y}) | |
2361 | @deftypefunx _FloatMx fMxdivfN (_Float@var{N} @var{x}, _Float@var{N} @var{y}) | |
2362 | @deftypefunx _FloatMx fMxdivfNx (_Float@var{N}x @var{x}, _Float@var{N}x @var{y}) | |
2363 | @standards{TS 18661-1:2014, math.h} | |
2364 | @standardsx{fMdivfN, TS 18661-3:2015, math.h} | |
2365 | @standardsx{fMdivfNx, TS 18661-3:2015, math.h} | |
2366 | @standardsx{fMxdivfN, TS 18661-3:2015, math.h} | |
2367 | @standardsx{fMxdivfNx, TS 18661-3:2015, math.h} | |
2368 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | |
2369 | These functions, from TS 18661-1:2014 and TS 18661-3:2015, return | |
2370 | @math{@var{x} / @var{y}}, rounded once to the return type of the | |
2371 | function without any intermediate rounding to the type of the | |
2372 | arguments. | |
2373 | @end deftypefun | |
2374 | ||
abd38358 JM |
2375 | @deftypefun float fsqrt (double @var{x}) |
2376 | @deftypefunx float fsqrtl (long double @var{x}) | |
2377 | @deftypefunx double dsqrtl (long double @var{x}) | |
2378 | @deftypefunx _FloatM fMsqrtfN (_Float@var{N} @var{x}) | |
2379 | @deftypefunx _FloatM fMsqrtfNx (_Float@var{N}x @var{x}) | |
2380 | @deftypefunx _FloatMx fMxsqrtfN (_Float@var{N} @var{x}) | |
2381 | @deftypefunx _FloatMx fMxsqrtfNx (_Float@var{N}x @var{x}) | |
2382 | @standards{TS 18661-1:2014, math.h} | |
2383 | @standardsx{fMsqrtfN, TS 18661-3:2015, math.h} | |
2384 | @standardsx{fMsqrtfNx, TS 18661-3:2015, math.h} | |
2385 | @standardsx{fMxsqrtfN, TS 18661-3:2015, math.h} | |
2386 | @standardsx{fMxsqrtfNx, TS 18661-3:2015, math.h} | |
2387 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | |
2388 | These functions, from TS 18661-1:2014 and TS 18661-3:2015, return the | |
2389 | square root of @var{x}, rounded once to the return type of the | |
2390 | function without any intermediate rounding to the type of the | |
2391 | arguments. | |
2392 | @end deftypefun | |
2393 | ||
b3f27d81 JM |
2394 | @deftypefun float ffma (double @var{x}, double @var{y}, double @var{z}) |
2395 | @deftypefunx float ffmal (long double @var{x}, long double @var{y}, long double @var{z}) | |
2396 | @deftypefunx double dfmal (long double @var{x}, long double @var{y}, long double @var{z}) | |
2397 | @deftypefunx _FloatM fMfmafN (_Float@var{N} @var{x}, _Float@var{N} @var{y}, _Float@var{N} @var{z}) | |
2398 | @deftypefunx _FloatM fMfmafNx (_Float@var{N}x @var{x}, _Float@var{N}x @var{y}, _Float@var{N}x @var{z}) | |
2399 | @deftypefunx _FloatMx fMxfmafN (_Float@var{N} @var{x}, _Float@var{N} @var{y}, _Float@var{N} @var{z}) | |
2400 | @deftypefunx _FloatMx fMxfmafNx (_Float@var{N}x @var{x}, _Float@var{N}x @var{y}, _Float@var{N}x @var{z}) | |
2401 | @standards{TS 18661-1:2014, math.h} | |
2402 | @standardsx{fMfmafN, TS 18661-3:2015, math.h} | |
2403 | @standardsx{fMfmafNx, TS 18661-3:2015, math.h} | |
2404 | @standardsx{fMxfmafN, TS 18661-3:2015, math.h} | |
2405 | @standardsx{fMxfmafNx, TS 18661-3:2015, math.h} | |
2406 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | |
2407 | ||
2408 | These functions, from TS 18661-1:2014 and TS 18661-3:2015, return | |
2409 | @math{(@var{x} @mul{} @var{y}) + @var{z}}, rounded once to the return | |
2410 | type of the function without any intermediate rounding to the type of | |
2411 | the arguments and without any intermediate rounding of the result of the | |
2412 | multiplication. | |
2413 | @end deftypefun | |
2414 | ||
7a68c94a UD |
2415 | @node Complex Numbers |
2416 | @section Complex Numbers | |
2417 | @pindex complex.h | |
2418 | @cindex complex numbers | |
2419 | ||
ec751a23 | 2420 | @w{ISO C99} introduces support for complex numbers in C. This is done |
7a68c94a UD |
2421 | with a new type qualifier, @code{complex}. It is a keyword if and only |
2422 | if @file{complex.h} has been included. There are three complex types, | |
2423 | corresponding to the three real types: @code{float complex}, | |
2424 | @code{double complex}, and @code{long double complex}. | |
2425 | ||
52a8e5cb GG |
2426 | Likewise, on machines that have support for @code{_Float@var{N}} or |
2427 | @code{_Float@var{N}x} enabled, the complex types @code{_Float@var{N} | |
2428 | complex} and @code{_Float@var{N}x complex} are also available if | |
2429 | @file{complex.h} has been included; @pxref{Mathematics}. | |
2430 | ||
7a68c94a UD |
2431 | To construct complex numbers you need a way to indicate the imaginary |
2432 | part of a number. There is no standard notation for an imaginary | |
2433 | floating point constant. Instead, @file{complex.h} defines two macros | |
2434 | that can be used to create complex numbers. | |
2435 | ||
2436 | @deftypevr Macro {const float complex} _Complex_I | |
1b009d5a | 2437 | @standards{C99, complex.h} |
7a68c94a UD |
2438 | This macro is a representation of the complex number ``@math{0+1i}''. |
2439 | Multiplying a real floating-point value by @code{_Complex_I} gives a | |
2440 | complex number whose value is purely imaginary. You can use this to | |
2441 | construct complex constants: | |
2442 | ||
2443 | @smallexample | |
2444 | @math{3.0 + 4.0i} = @code{3.0 + 4.0 * _Complex_I} | |
2445 | @end smallexample | |
2446 | ||
2447 | Note that @code{_Complex_I * _Complex_I} has the value @code{-1}, but | |
2448 | the type of that value is @code{complex}. | |
2449 | @end deftypevr | |
2450 | ||
2451 | @c Put this back in when gcc supports _Imaginary_I. It's too confusing. | |
2452 | @ignore | |
2453 | @noindent | |
2454 | Without an optimizing compiler this is more expensive than the use of | |
2455 | @code{_Imaginary_I}, but it is better than nothing. You can avoid all | |
2456 | the hassle by using the @code{I} macro below, if the name is not a | |
2457 | problem. | |
2458 | ||
2459 | @deftypevr Macro {const float imaginary} _Imaginary_I | |
2460 | This macro is a representation of the value ``@math{1i}''. I.e., it is | |
2461 | the value for which | |
2462 | ||
2463 | @smallexample | |
2464 | _Imaginary_I * _Imaginary_I = -1 | |
2465 | @end smallexample | |
2466 | ||
2467 | @noindent | |
2468 | The result is not of type @code{float imaginary} but instead @code{float}. | |
2469 | One can use it to easily construct complex numbers, as in | |
2470 | ||
2471 | @smallexample | |
2472 | 3.0 - _Imaginary_I * 4.0 | |
2473 | @end smallexample | |
2474 | ||
2475 | @noindent | |
2476 | which results in the complex number with a real part of 3.0 and an | |
2477 | imaginary part of -4.0. | |
2478 | @end deftypevr | |
2479 | @end ignore | |
2480 | ||
2481 | @noindent | |
2482 | @code{_Complex_I} is a bit of a mouthful. @file{complex.h} also defines | |
2483 | a shorter name for the same constant. | |
2484 | ||
2485 | @deftypevr Macro {const float complex} I | |
1b009d5a | 2486 | @standards{C99, complex.h} |
7a68c94a UD |
2487 | This macro has exactly the same value as @code{_Complex_I}. Most of the |
2488 | time it is preferable. However, it causes problems if you want to use | |
2489 | the identifier @code{I} for something else. You can safely write | |
2490 | ||
2491 | @smallexample | |
2492 | #include <complex.h> | |
2493 | #undef I | |
2494 | @end smallexample | |
2495 | ||
2496 | @noindent | |
2497 | if you need @code{I} for your own purposes. (In that case we recommend | |
2498 | you also define some other short name for @code{_Complex_I}, such as | |
2499 | @code{J}.) | |
2500 | ||
2501 | @ignore | |
2502 | If the implementation does not support the @code{imaginary} types | |
2503 | @code{I} is defined as @code{_Complex_I} which is the second best | |
2504 | solution. It still can be used in the same way but requires a very | |
2505 | clever compiler to get the same results. | |
2506 | @end ignore | |
2507 | @end deftypevr | |
2508 | ||
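A minimal sketch of both approaches follows.  The function names and
the alias @code{J} are hypothetical; only @code{_Complex_I} and
@code{I} come from @file{complex.h}.

```c
#include <complex.h>

/* Build the complex number RE + IM*i using the long macro name.  */
double complex
make_complex (double re, double im)
{
  return re + im * _Complex_I;
}

/* From here on, I is available as an ordinary identifier.
   J is a hypothetical short alias for the imaginary unit.  */
#undef I
#define J _Complex_I

/* Multiplying by the imaginary unit rotates Z by a quarter turn.  */
double complex
rotate_quarter_turn (double complex z)
{
  return z * J;
}
```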
2509 | @node Operations on Complex | |
2510 | @section Projections, Conjugates, and Decomposing of Complex Numbers | |
2511 | @cindex project complex numbers | |
2512 | @cindex conjugate complex numbers | |
2513 | @cindex decompose complex numbers | |
2514 | @pindex complex.h | |
2515 | ||
ec751a23 | 2516 | @w{ISO C99} also defines functions that perform basic operations on |
7a68c94a UD |
2517 | complex numbers, such as decomposition and conjugation. The prototypes |
2518 | for all these functions are in @file{complex.h}. All functions are | |
2519 | available in three variants, one for each of the three complex types. | |
2520 | ||
7a68c94a UD |
2521 | @deftypefun double creal (complex double @var{z}) |
2522 | @deftypefunx float crealf (complex float @var{z}) | |
2523 | @deftypefunx {long double} creall (complex long double @var{z}) | |
52a8e5cb GG |
2524 | @deftypefunx _FloatN crealfN (complex _Float@var{N} @var{z}) |
2525 | @deftypefunx _FloatNx crealfNx (complex _Float@var{N}x @var{z}) | |
d08a7e4c | 2526 | @standards{ISO, complex.h} |
52a8e5cb GG |
2527 | @standardsx{crealfN, TS 18661-3:2015, complex.h} |
2528 | @standardsx{crealfNx, TS 18661-3:2015, complex.h} | |
b719dafd | 2529 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
7a68c94a UD |
2530 | These functions return the real part of the complex number @var{z}. |
2531 | @end deftypefun | |
2532 | ||
7a68c94a UD |
2533 | @deftypefun double cimag (complex double @var{z}) |
2534 | @deftypefunx float cimagf (complex float @var{z}) | |
2535 | @deftypefunx {long double} cimagl (complex long double @var{z}) | |
52a8e5cb GG |
2536 | @deftypefunx _FloatN cimagfN (complex _Float@var{N} @var{z}) |
2537 | @deftypefunx _FloatNx cimagfNx (complex _Float@var{N}x @var{z}) | |
d08a7e4c | 2538 | @standards{ISO, complex.h} |
52a8e5cb GG |
2539 | @standardsx{cimagfN, TS 18661-3:2015, complex.h} |
2540 | @standardsx{cimagfNx, TS 18661-3:2015, complex.h} | |
b719dafd | 2541 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
7a68c94a UD |
2542 | These functions return the imaginary part of the complex number @var{z}. |
2543 | @end deftypefun | |
2544 | ||
7a68c94a UD |
2545 | @deftypefun {complex double} conj (complex double @var{z}) |
2546 | @deftypefunx {complex float} conjf (complex float @var{z}) | |
2547 | @deftypefunx {complex long double} conjl (complex long double @var{z}) | |
52a8e5cb GG |
2548 | @deftypefunx {complex _FloatN} conjfN (complex _Float@var{N} @var{z}) |
2549 | @deftypefunx {complex _FloatNx} conjfNx (complex _Float@var{N}x @var{z}) | |
d08a7e4c | 2550 | @standards{ISO, complex.h} |
52a8e5cb GG |
2551 | @standardsx{conjfN, TS 18661-3:2015, complex.h} |
2552 | @standardsx{conjfNx, TS 18661-3:2015, complex.h} | |
b719dafd | 2553 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
7a68c94a UD |
2554 | These functions return the conjugate value of the complex number |
2555 | @var{z}. The conjugate of a complex number has the same real part and a | |
2556 | negated imaginary part. In other words, @samp{conj(a + bi) = a + -bi}. | |
2557 | @end deftypefun | |
2558 | ||
7a68c94a UD |
2559 | @deftypefun double carg (complex double @var{z}) |
2560 | @deftypefunx float cargf (complex float @var{z}) | |
2561 | @deftypefunx {long double} cargl (complex long double @var{z}) | |
52a8e5cb GG |
2562 | @deftypefunx _FloatN cargfN (complex _Float@var{N} @var{z}) |
2563 | @deftypefunx _FloatNx cargfNx (complex _Float@var{N}x @var{z}) | |
d08a7e4c | 2564 | @standards{ISO, complex.h} |
52a8e5cb GG |
2565 | @standardsx{cargfN, TS 18661-3:2015, complex.h} |
2566 | @standardsx{cargfNx, TS 18661-3:2015, complex.h} | |
b719dafd | 2567 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
7a68c94a UD |
2568 | These functions return the argument of the complex number @var{z}. |
2569 | The argument of a complex number is the angle in the complex plane | |
2570 | between the positive real axis and a line passing through zero and the | |
01f49f59 JT |
2571 | number. This angle is measured in the usual fashion and ranges from |
2572 | @math{-@pi{}} to @math{@pi{}}. | |
7a68c94a | 2573 | |
01f49f59 | 2574 | @code{carg} has a branch cut along the negative real axis. |
7a68c94a UD |
2575 | @end deftypefun |
2576 | ||
7a68c94a UD |
2577 | @deftypefun {complex double} cproj (complex double @var{z}) |
2578 | @deftypefunx {complex float} cprojf (complex float @var{z}) | |
2579 | @deftypefunx {complex long double} cprojl (complex long double @var{z}) | |
52a8e5cb GG |
2580 | @deftypefunx {complex _FloatN} cprojfN (complex _Float@var{N} @var{z}) |
2581 | @deftypefunx {complex _FloatNx} cprojfNx (complex _Float@var{N}x @var{z}) | |
d08a7e4c | 2582 | @standards{ISO, complex.h} |
52a8e5cb GG |
2583 | @standardsx{cprojfN, TS 18661-3:2015, complex.h} |
2584 | @standardsx{cprojfNx, TS 18661-3:2015, complex.h} | |
b719dafd | 2585 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
7a68c94a | 2586 | These functions return the projection of the complex value @var{z} onto |
9dcc8f11 | 2587 | the Riemann sphere. Values with an infinite imaginary part are projected |
7a68c94a UD |
2588 | to positive infinity on the real axis, even if the real part is NaN. If |
2589 | the real part is infinite, the result is equivalent to | |
2590 | ||
2591 | @smallexample | |
2592 | INFINITY + I * copysign (0.0, cimag (z)) | |
2593 | @end smallexample | |
2594 | @end deftypefun | |
fe0ec73e | 2595 | |
28f540f4 RM |
2596 | @node Parsing of Numbers |
2597 | @section Parsing of Numbers | |
2598 | @cindex parsing numbers (in formatted input) | |
2599 | @cindex converting strings to numbers | |
2600 | @cindex number syntax, parsing | |
2601 | @cindex syntax, for reading numbers | |
2602 | ||
2603 | This section describes functions for ``reading'' integer and | |
2604 | floating-point numbers from a string. It may be more convenient in some | |
2605 | cases to use @code{sscanf} or one of the related functions; see | |
2606 | @ref{Formatted Input}. But often you can make a program more robust by | |
2607 | finding the tokens in the string by hand, then converting the numbers | |
2608 | one by one. | |
2609 | ||
2610 | @menu | |
2611 | * Parsing of Integers:: Functions for conversion of integer values. | |
2612 | * Parsing of Floats:: Functions for conversion of floating-point | |
2613 | values. | |
2614 | @end menu | |
2615 | ||
2616 | @node Parsing of Integers | |
2617 | @subsection Parsing of Integers | |
2618 | ||
2619 | @pindex stdlib.h | |
b642f101 UD |
2620 | @pindex wchar.h |
2621 | The @samp{str} functions are declared in @file{stdlib.h} and those | |
2622 | beginning with @samp{wcs} are declared in @file{wchar.h}. One might | |
2623 | wonder about the use of @code{restrict} in the prototypes of the | |
2624 | functions in this section. It is seemingly useless but the @w{ISO C} | |
2625 | standard uses it (for the functions defined there) so we have to do it | |
2626 | as well. | |
28f540f4 | 2627 | |
b642f101 | 2628 | @deftypefun {long int} strtol (const char *restrict @var{string}, char **restrict @var{tailptr}, int @var{base}) |
d08a7e4c | 2629 | @standards{ISO, stdlib.h} |
b719dafd AO |
2630 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
2631 | @c strtol uses the thread-local pointer to the locale in effect, and | |
2632 | @c strtol_l loads the LC_NUMERIC locale data from it early on and once, | |
2633 | @c but if the locale is the global locale, and another thread calls | |
2634 | @c setlocale in a way that modifies the pointer to the LC_CTYPE locale | |
2635 | @c category, the behavior of e.g. IS*, TOUPPER will vary throughout the | |
2636 | @c execution of the function, because they re-read the locale data from | |
2637 | @c the given locale pointer. We solved this by documenting setlocale as | |
2638 | @c MT-Unsafe. | |
28f540f4 RM |
2639 | The @code{strtol} (``string-to-long'') function converts the initial |
2640 | part of @var{string} to a signed integer, which is returned as a value | |
b8fe19fa | 2641 | of type @code{long int}. |
28f540f4 RM |
2642 | |
2643 | This function attempts to decompose @var{string} as follows: | |
2644 | ||
2645 | @itemize @bullet | |
b8fe19fa | 2646 | @item |
28f540f4 RM |
2647 | A (possibly empty) sequence of whitespace characters. Which characters |
2648 | are whitespace is determined by the @code{isspace} function | |
2649 | (@pxref{Classification of Characters}). These are discarded. | |
2650 | ||
b8fe19fa | 2651 | @item |
28f540f4 RM |
2652 | An optional plus or minus sign (@samp{+} or @samp{-}). |
2653 | ||
b8fe19fa | 2654 | @item |
28f540f4 RM |
2655 | A nonempty sequence of digits in the radix specified by @var{base}. |
2656 | ||
2657 | If @var{base} is zero, decimal radix is assumed unless the series of | |
2658 | digits begins with @samp{0} (specifying octal radix), or @samp{0x} or | |
64924422 JM |
2659 | @samp{0X} (specifying hexadecimal radix), or @samp{0b} or @samp{0B} |
2660 | (specifying binary radix; only supported when C2X features are | |
2661 | enabled); in other words, the same syntax used for integer constants in C. | |
28f540f4 | 2662 | |
600a7457 | 2663 | Otherwise @var{base} must have a value between @code{2} and @code{36}. |
28f540f4 | 2664 | If @var{base} is @code{16}, the digits may optionally be preceded by |
64924422 JM |
2665 | @samp{0x} or @samp{0X}. If @var{base} is @code{2}, and C2X features |
2666 | are enabled, the digits may optionally be preceded by | |
2667 | @samp{0b} or @samp{0B}. If @var{base} has no legal value, the value returned | |
2c6fe0bd | 2668 | is @code{0l} and the global variable @code{errno} is set to @code{EINVAL}. |
28f540f4 | 2669 | |
b8fe19fa | 2670 | @item |
28f540f4 RM |
2671 | Any remaining characters in the string. If @var{tailptr} is not a null |
2672 | pointer, @code{strtol} stores a pointer to this tail in | |
2673 | @code{*@var{tailptr}}. | |
2674 | @end itemize | |
2675 | ||
2676 | If the string is empty, contains only whitespace, or does not contain an | |
2677 | initial substring that has the expected syntax for an integer in the | |
2678 | specified @var{base}, no conversion is performed. In this case, | |
2679 | @code{strtol} returns a value of zero and the value stored in | |
2680 | @code{*@var{tailptr}} is the value of @var{string}. | |
2681 | ||
2682 | In a locale other than the standard @code{"C"} locale, this function | |
2683 | may recognize additional implementation-dependent syntax. | |
2684 | ||
2685 | If the string has valid syntax for an integer but the value is not | |
2686 | representable because of overflow, @code{strtol} returns either | |
2687 | @code{LONG_MAX} or @code{LONG_MIN} (@pxref{Range of Type}), as | |
2688 | appropriate for the sign of the value. It also sets @code{errno} | |
2689 | to @code{ERANGE} to indicate there was overflow. | |
2690 | ||
7a68c94a UD |
2691 | You should not check for errors by examining the return value of |
2692 | @code{strtol}, because the string might be a valid representation of | |
2693 | @code{0l}, @code{LONG_MAX}, or @code{LONG_MIN}. Instead, check whether | |
2694 | @var{tailptr} points to what you expect after the number | |
2695 | (e.g. @code{'\0'} if the string should end after the number). You also | |
010fe231 | 2696 | need to clear @code{errno} before the call and check it afterward, in |
7a68c94a | 2697 | case there was overflow. |
2c6fe0bd | 2698 | |
28f540f4 RM |
2699 | There is an example at the end of this section. |
2700 | @end deftypefun | |
2701 | ||
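The error-checking advice above can be sketched as a small wrapper.
The name @code{parse_long} and its return convention are assumptions
made for illustration; only @code{strtol}, @code{errno}, and the error
codes come from the library.  This version requires the whole string to
be consumed, which is one possible policy among several.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: convert all of TEXT to a long in BASE.
   Returns 0 and stores the value in *OUT on success, -1 otherwise.  */
int
parse_long (const char *text, int base, long *out)
{
  char *tail = (char *) text;   /* Unchanged if strtol stores nothing.  */
  long value;

  errno = 0;                    /* Do not mistake a stale error for ours.  */
  value = strtol (text, &tail, base);

  if (errno != 0)               /* ERANGE on overflow, EINVAL for a bad base.  */
    return -1;
  if (tail == text)             /* No digits were converted.  */
    return -1;
  if (*tail != '\0')            /* Unexpected characters follow the number.  */
    return -1;

  *out = value;
  return 0;
}
```

Note that the return value of @code{strtol} is never used to detect
failure, so inputs such as @code{"0"} are accepted as valid.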
b642f101 | 2702 | @deftypefun {long int} wcstol (const wchar_t *restrict @var{string}, wchar_t **restrict @var{tailptr}, int @var{base}) |
d08a7e4c | 2703 | @standards{ISO, wchar.h} |
b719dafd | 2704 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
3554743a AJ |
2705 | The @code{wcstol} function is equivalent to the @code{strtol} function |
2706 | in nearly all aspects but handles wide character strings. | |
b642f101 UD |
2707 | |
2708 | The @code{wcstol} function was introduced in @w{Amendment 1} of @w{ISO C90}. | |
2709 | @end deftypefun | |
2710 | ||
f5c558f3 | 2711 | @deftypefun {unsigned long int} strtoul (const char *restrict @var{string}, char **restrict @var{tailptr}, int @var{base}) |
d08a7e4c | 2712 | @standards{ISO, stdlib.h} |
b719dafd | 2713 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
28f540f4 | 2714 | The @code{strtoul} (``string-to-unsigned-long'') function is like |
0e4ee106 | 2715 | @code{strtol} except it converts to an @code{unsigned long int} value. |
7a68c94a | 2716 | The syntax is the same as described above for @code{strtol}. The value |
0e4ee106 UD |
2717 | returned on overflow is @code{ULONG_MAX} (@pxref{Range of Type}). |
2718 | ||
2719 | If @var{string} depicts a negative number, @code{strtoul} acts the same | |
2720 | as @code{strtol} but casts the result to an unsigned integer. That means | |
2721 | for example that @code{strtoul} on @code{"-1"} returns @code{ULONG_MAX} | |
e6e81391 | 2722 | and an input more negative than @code{LONG_MIN} returns |
0e4ee106 | 2723 | (@code{ULONG_MAX} + 1) / 2. |
7a68c94a | 2724 | |
010fe231 | 2725 | @code{strtoul} sets @code{errno} to @code{EINVAL} if @var{base} is out of |
7a68c94a | 2726 | range, or @code{ERANGE} on overflow. |
2c6fe0bd UD |
2727 | @end deftypefun |
2728 | ||
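Since @code{strtoul} accepts @code{"-1"} without reporting an error, a
caller that wants to reject negative input has to look for the sign
itself.  The sketch below shows one way to do so for decimal input; the
name @code{parse_ulong_strict} is hypothetical.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: convert all of TEXT to an unsigned long in
   base 10, rejecting a leading minus sign.  Returns 0 and stores the
   value in *OUT on success, -1 otherwise.  */
int
parse_ulong_strict (const char *text, unsigned long *out)
{
  const char *p = text;
  char *tail;
  unsigned long value;

  /* Skip the same leading whitespace that strtoul would skip.  */
  while (isspace ((unsigned char) *p))
    p++;
  if (*p == '-')
    return -1;

  tail = (char *) p;
  errno = 0;
  value = strtoul (p, &tail, 10);
  if (errno != 0 || tail == p || *tail != '\0')
    return -1;

  *out = value;
  return 0;
}
```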
b642f101 | 2729 | @deftypefun {unsigned long int} wcstoul (const wchar_t *restrict @var{string}, wchar_t **restrict @var{tailptr}, int @var{base}) |
d08a7e4c | 2730 | @standards{ISO, wchar.h} |
b719dafd | 2731 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
3554743a AJ |
2732 | The @code{wcstoul} function is equivalent to the @code{strtoul} function |
2733 | in nearly all aspects but handles wide character strings. | |
b642f101 UD |
2734 | |
2735 | The @code{wcstoul} function was introduced in @w{Amendment 1} of @w{ISO C90}. | |
2736 | @end deftypefun | |
2737 | ||
b642f101 | 2738 | @deftypefun {long long int} strtoll (const char *restrict @var{string}, char **restrict @var{tailptr}, int @var{base}) |
d08a7e4c | 2739 | @standards{ISO, stdlib.h} |
b719dafd | 2740 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
7a68c94a UD |
2741 | The @code{strtoll} function is like @code{strtol} except that it returns |
2742 | a @code{long long int} value, and accepts numbers with a correspondingly | |
2743 | larger range. | |
2c6fe0bd UD |
2744 | |
2745 | If the string has valid syntax for an integer but the value is not | |
fe7bdd63 | 2746 | representable because of overflow, @code{strtoll} returns either |
7bb764bc | 2747 | @code{LLONG_MAX} or @code{LLONG_MIN} (@pxref{Range of Type}), as |
2c6fe0bd UD |
2748 | appropriate for the sign of the value. It also sets @code{errno} to |
2749 | @code{ERANGE} to indicate there was overflow. | |
2c6fe0bd | 2750 | |
ec751a23 | 2751 | The @code{strtoll} function was introduced in @w{ISO C99}. |
2c6fe0bd UD |
2752 | @end deftypefun |
2753 | ||
b642f101 | 2754 | @deftypefun {long long int} wcstoll (const wchar_t *restrict @var{string}, wchar_t **restrict @var{tailptr}, int @var{base}) |
d08a7e4c | 2755 | @standards{ISO, wchar.h} |
b719dafd | 2756 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
3554743a AJ |
2757 | The @code{wcstoll} function is equivalent to the @code{strtoll} function |
2758 | in nearly all aspects but handles wide character strings. | |
b642f101 UD |
2759 | |
2760 | The @code{wcstoll} function was introduced in @w{Amendment 1} of @w{ISO C90}. | |
2761 | @end deftypefun | |
2762 | ||
b642f101 | 2763 | @deftypefun {long long int} strtoq (const char *restrict @var{string}, char **restrict @var{tailptr}, int @var{base}) |
d08a7e4c | 2764 | @standards{BSD, stdlib.h} |
b719dafd | 2765 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
7a68c94a | 2766 | @code{strtoq} (``string-to-quad-word'') is the BSD name for @code{strtoll}. |
2c6fe0bd UD |
2767 | @end deftypefun |
2768 | ||
b642f101 | 2769 | @deftypefun {long long int} wcstoq (const wchar_t *restrict @var{string}, wchar_t **restrict @var{tailptr}, int @var{base}) |
d08a7e4c | 2770 | @standards{GNU, wchar.h} |
b719dafd | 2771 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
3554743a AJ |
2772 | The @code{wcstoq} function is equivalent to the @code{strtoq} function |
2773 | in nearly all aspects but handles wide character strings. | |
b642f101 UD |
2774 | |
2775 | The @code{wcstoq} function is a GNU extension. | |
2776 | @end deftypefun | |
2777 | ||
b642f101 | 2778 | @deftypefun {unsigned long long int} strtoull (const char *restrict @var{string}, char **restrict @var{tailptr}, int @var{base}) |
d08a7e4c | 2779 | @standards{ISO, stdlib.h} |
b719dafd | 2780 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
0e4ee106 UD |
2781 | The @code{strtoull} function is related to @code{strtoll} the same way |
2782 | @code{strtoul} is related to @code{strtol}. | |
fe7bdd63 | 2783 | |
ec751a23 | 2784 | The @code{strtoull} function was introduced in @w{ISO C99}. |
fe7bdd63 UD |
2785 | @end deftypefun |
2786 | ||
b642f101 | 2787 | @deftypefun {unsigned long long int} wcstoull (const wchar_t *restrict @var{string}, wchar_t **restrict @var{tailptr}, int @var{base}) |
d08a7e4c | 2788 | @standards{ISO, wchar.h} |
b719dafd | 2789 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
3554743a AJ |
2790 | The @code{wcstoull} function is equivalent to the @code{strtoull} function |
2791 | in nearly all aspects but handles wide character strings. | |
b642f101 UD |
2792 | |
2793 | The @code{wcstoull} function was introduced in @w{Amendment 1} of @w{ISO C90}. | |
2794 | @end deftypefun | |
2795 | ||
b642f101 | 2796 | @deftypefun {unsigned long long int} strtouq (const char *restrict @var{string}, char **restrict @var{tailptr}, int @var{base}) |
d08a7e4c | 2797 | @standards{BSD, stdlib.h} |
b719dafd | 2798 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
7a68c94a | 2799 | @code{strtouq} is the BSD name for @code{strtoull}. |
28f540f4 RM |
2800 | @end deftypefun |
2801 | ||
b642f101 | 2802 | @deftypefun {unsigned long long int} wcstouq (const wchar_t *restrict @var{string}, wchar_t **restrict @var{tailptr}, int @var{base}) |
d08a7e4c | 2803 | @standards{GNU, wchar.h} |
b719dafd | 2804 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
3554743a AJ |
2805 | The @code{wcstouq} function is equivalent to the @code{strtouq} function |
2806 | in nearly all aspects but handles wide character strings. | |
b642f101 | 2807 | |
f5708cb0 | 2808 | The @code{wcstouq} function is a GNU extension. |
b642f101 UD |
2809 | @end deftypefun |
2810 | ||
b642f101 | 2811 | @deftypefun intmax_t strtoimax (const char *restrict @var{string}, char **restrict @var{tailptr}, int @var{base}) |
d08a7e4c | 2812 | @standards{ISO, inttypes.h} |
b719dafd | 2813 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
The @code{strtoimax} function is like @code{strtol} except that it returns
an @code{intmax_t} value, and accepts numbers of a corresponding range.
2816 | ||
2817 | If the string has valid syntax for an integer but the value is not | |
2818 | representable because of overflow, @code{strtoimax} returns either | |
2819 | @code{INTMAX_MAX} or @code{INTMAX_MIN} (@pxref{Integers}), as | |
2820 | appropriate for the sign of the value. It also sets @code{errno} to | |
2821 | @code{ERANGE} to indicate there was overflow. | |
2822 | ||
b642f101 UD |
2823 | See @ref{Integers} for a description of the @code{intmax_t} type. The |
2824 | @code{strtoimax} function was introduced in @w{ISO C99}. | |
2825 | @end deftypefun | |
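
The error checks described above can be combined into one small helper.
The following is only a sketch: the name @code{parse_intmax} and its
``whole string must be a number'' policy are assumptions made for
illustration, not part of the library.

```c
#include <errno.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: parse all of TEXT as a base-10 intmax_t.
   Returns true and stores the value in *RESULT on success.  */
bool
parse_intmax (const char *text, intmax_t *result)
{
  char *tail;
  intmax_t value;

  errno = 0;                      /* strtoimax never clears errno itself.  */
  value = strtoimax (text, &tail, 10);

  if (tail == text)               /* No digits were converted.  */
    return false;
  if (*tail != '\0')              /* Trailing characters follow the number.  */
    return false;
  if (errno == ERANGE)            /* Clamped to INTMAX_MAX or INTMAX_MIN.  */
    return false;

  *result = value;
  return true;
}
```

Resetting @code{errno} before the call matters because a successful
conversion leaves any earlier error value untouched.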
0e4ee106 | 2826 | |
b642f101 | 2827 | @deftypefun intmax_t wcstoimax (const wchar_t *restrict @var{string}, wchar_t **restrict @var{tailptr}, int @var{base}) |
d08a7e4c | 2828 | @standards{ISO, wchar.h} |
b719dafd | 2829 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
3554743a AJ |
2830 | The @code{wcstoimax} function is equivalent to the @code{strtoimax} function |
2831 | in nearly all aspects but handles wide character strings. | |
0e4ee106 | 2832 | |
b642f101 | 2833 | The @code{wcstoimax} function was introduced in @w{ISO C99}. |
0e4ee106 UD |
2834 | @end deftypefun |
2835 | ||
b642f101 | 2836 | @deftypefun uintmax_t strtoumax (const char *restrict @var{string}, char **restrict @var{tailptr}, int @var{base}) |
d08a7e4c | 2837 | @standards{ISO, inttypes.h} |
b719dafd | 2838 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
0e4ee106 UD |
2839 | The @code{strtoumax} function is related to @code{strtoimax} |
2840 | the same way that @code{strtoul} is related to @code{strtol}. | |
2841 | ||
See @ref{Integers} for a description of the @code{uintmax_t} type.  The
@code{strtoumax} function was introduced in @w{ISO C99}.
2844 | @end deftypefun | |
0e4ee106 | 2845 | |
b642f101 | 2846 | @deftypefun uintmax_t wcstoumax (const wchar_t *restrict @var{string}, wchar_t **restrict @var{tailptr}, int @var{base}) |
d08a7e4c | 2847 | @standards{ISO, wchar.h} |
b719dafd | 2848 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
3554743a AJ |
2849 | The @code{wcstoumax} function is equivalent to the @code{strtoumax} function |
2850 | in nearly all aspects but handles wide character strings. | |
b642f101 UD |
2851 | |
2852 | The @code{wcstoumax} function was introduced in @w{ISO C99}. | |
0e4ee106 UD |
2853 | @end deftypefun |
2854 | ||
28f540f4 | 2855 | @deftypefun {long int} atol (const char *@var{string}) |
d08a7e4c | 2856 | @standards{ISO, stdlib.h} |
b719dafd | 2857 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
28f540f4 RM |
2858 | This function is similar to the @code{strtol} function with a @var{base} |
2859 | argument of @code{10}, except that it need not detect overflow errors. | |
2860 | The @code{atol} function is provided mostly for compatibility with | |
2861 | existing code; using @code{strtol} is more robust. | |
2862 | @end deftypefun | |
2863 | ||
28f540f4 | 2864 | @deftypefun int atoi (const char *@var{string}) |
d08a7e4c | 2865 | @standards{ISO, stdlib.h} |
b719dafd | 2866 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
7a68c94a UD |
2867 | This function is like @code{atol}, except that it returns an @code{int}. |
2868 | The @code{atoi} function is also considered obsolete; use @code{strtol} | |
2869 | instead. | |
28f540f4 RM |
2870 | @end deftypefun |
2871 | ||
fe7bdd63 | 2872 | @deftypefun {long long int} atoll (const char *@var{string}) |
d08a7e4c | 2873 | @standards{ISO, stdlib.h} |
b719dafd | 2874 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
fe7bdd63 | 2875 | This function is similar to @code{atol}, except it returns a @code{long |
7a68c94a | 2876 | long int}. |
fe7bdd63 | 2877 | |
ec751a23 | 2878 | The @code{atoll} function was introduced in @w{ISO C99}. It too is |
7a68c94a | 2879 | obsolete (despite having just been added); use @code{strtoll} instead. |
fe7bdd63 UD |
2880 | @end deftypefun |
2881 | ||
None of the functions mentioned so far in this section handles the
alternative representations of characters described in the locale
data.  Some locales specify a thousands separator and the way it is to
be used, which can help make large numbers more readable.  To read
such numbers, use the @code{scanf} functions with the @samp{'} flag.
2c6fe0bd | 2888 | |
28f540f4 RM |
2889 | Here is a function which parses a string as a sequence of integers and |
2890 | returns the sum of them: | |
2891 | ||
@smallexample
#include <ctype.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

int
sum_ints_from_string (char *string)
@{
  int sum = 0;

  while (1) @{
    char *tail;
    long int next;

    /* @r{Skip whitespace by hand, to detect the end.}  */
    while (isspace ((unsigned char) *string)) string++;
    if (*string == 0)
      break;

    /* @r{There is more nonwhitespace,}  */
    /* @r{so it ought to be another number.}  */
    errno = 0;
    /* @r{Parse it.}  */
    next = strtol (string, &tail, 0);
    /* @r{Stop if nothing was parsed; otherwise we would loop forever.}  */
    if (tail == string)
      break;
    /* @r{Add it in, if not overflow.}  */
    if (errno)
      printf ("Overflow\n");
    else
      sum += next;
    /* @r{Advance past it.}  */
    string = tail;
  @}

  return sum;
@}
@end smallexample
2924 | ||
2925 | @node Parsing of Floats | |
2926 | @subsection Parsing of Floats | |
2927 | ||
2928 | @pindex stdlib.h | |
b642f101 UD |
2929 | The @samp{str} functions are declared in @file{stdlib.h} and those |
2930 | beginning with @samp{wcs} are declared in @file{wchar.h}. One might | |
2931 | wonder about the use of @code{restrict} in the prototypes of the | |
2932 | functions in this section. It is seemingly useless but the @w{ISO C} | |
2933 | standard uses it (for the functions defined there) so we have to do it | |
2934 | as well. | |
28f540f4 | 2935 | |
b642f101 | 2936 | @deftypefun double strtod (const char *restrict @var{string}, char **restrict @var{tailptr}) |
d08a7e4c | 2937 | @standards{ISO, stdlib.h} |
b719dafd AO |
2938 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
2939 | @c Besides the unsafe-but-ruled-safe locale uses, this uses a lot of | |
2940 | @c mpn, but it's all safe. | |
2941 | @c | |
2942 | @c round_and_return | |
2943 | @c get_rounding_mode ok | |
2944 | @c mpn_add_1 ok | |
2945 | @c mpn_rshift ok | |
2946 | @c MPN_ZERO ok | |
2947 | @c MPN2FLOAT -> mpn_construct_(float|double|long_double) ok | |
2948 | @c str_to_mpn | |
2949 | @c mpn_mul_1 -> umul_ppmm ok | |
2950 | @c mpn_add_1 ok | |
2951 | @c mpn_lshift_1 -> mpn_lshift ok | |
2952 | @c STRTOF_INTERNAL | |
2953 | @c MPN_VAR ok | |
9761bf4d | 2954 | @c SET_NAN_PAYLOAD ok |
b719dafd AO |
2955 | @c STRNCASECMP ok, wide and narrow |
2956 | @c round_and_return ok | |
2957 | @c mpn_mul ok | |
2958 | @c mpn_addmul_1 ok | |
2959 | @c ... mpn_sub | |
2960 | @c mpn_lshift ok | |
2961 | @c udiv_qrnnd ok | |
2962 | @c count_leading_zeros ok | |
2963 | @c add_ssaaaa ok | |
2964 | @c sub_ddmmss ok | |
2965 | @c umul_ppmm ok | |
2966 | @c mpn_submul_1 ok | |
28f540f4 RM |
2967 | The @code{strtod} (``string-to-double'') function converts the initial |
2968 | part of @var{string} to a floating-point number, which is returned as a | |
b8fe19fa | 2969 | value of type @code{double}. |
28f540f4 RM |
2970 | |
2971 | This function attempts to decompose @var{string} as follows: | |
2972 | ||
2973 | @itemize @bullet | |
b8fe19fa | 2974 | @item |
28f540f4 RM |
2975 | A (possibly empty) sequence of whitespace characters. Which characters |
2976 | are whitespace is determined by the @code{isspace} function | |
2977 | (@pxref{Classification of Characters}). These are discarded. | |
2978 | ||
2979 | @item | |
2980 | An optional plus or minus sign (@samp{+} or @samp{-}). | |
2981 | ||
0c34b1e9 UD |
2982 | @item A floating point number in decimal or hexadecimal format. The |
2983 | decimal format is: | |
2984 | @itemize @minus | |
2985 | ||
28f540f4 RM |
2986 | @item |
2987 | A nonempty sequence of digits optionally containing a decimal-point | |
2988 | character---normally @samp{.}, but it depends on the locale | |
85c165be | 2989 | (@pxref{General Numeric}). |
28f540f4 RM |
2990 | |
2991 | @item | |
2992 | An optional exponent part, consisting of a character @samp{e} or | |
2993 | @samp{E}, an optional sign, and a sequence of digits. | |
2994 | ||
0c34b1e9 UD |
2995 | @end itemize |
2996 | ||
2997 | The hexadecimal format is as follows: | |
2998 | @itemize @minus | |
2999 | ||
3000 | @item | |
3001 | A 0x or 0X followed by a nonempty sequence of hexadecimal digits | |
3002 | optionally containing a decimal-point character---normally @samp{.}, but | |
3003 | it depends on the locale (@pxref{General Numeric}). | |
3004 | ||
3005 | @item | |
3006 | An optional binary-exponent part, consisting of a character @samp{p} or | |
3007 | @samp{P}, an optional sign, and a sequence of digits. | |
3008 | ||
3009 | @end itemize | |
3010 | ||
28f540f4 RM |
3011 | @item |
3012 | Any remaining characters in the string. If @var{tailptr} is not a null | |
3013 | pointer, a pointer to this tail of the string is stored in | |
3014 | @code{*@var{tailptr}}. | |
3015 | @end itemize | |
3016 | ||
3017 | If the string is empty, contains only whitespace, or does not contain an | |
3018 | initial substring that has the expected syntax for a floating-point | |
3019 | number, no conversion is performed. In this case, @code{strtod} returns | |
3020 | a value of zero and the value returned in @code{*@var{tailptr}} is the | |
3021 | value of @var{string}. | |
3022 | ||
26761c28 | 3023 | In a locale other than the standard @code{"C"} or @code{"POSIX"} locales, |
2c6fe0bd | 3024 | this function may recognize additional locale-dependent syntax. |
28f540f4 RM |
3025 | |
3026 | If the string has valid syntax for a floating-point number but the value | |
7a68c94a UD |
3027 | is outside the range of a @code{double}, @code{strtod} will signal |
3028 | overflow or underflow as described in @ref{Math Error Reporting}. | |
3029 | ||
3030 | @code{strtod} recognizes four special input strings. The strings | |
3031 | @code{"inf"} and @code{"infinity"} are converted to @math{@infinity{}}, | |
3032 | or to the largest representable value if the floating-point format | |
3033 | doesn't support infinities. You can prepend a @code{"+"} or @code{"-"} | |
3034 | to specify the sign. Case is ignored when scanning these strings. | |
3035 | ||
95fdc6a0 UD |
3036 | The strings @code{"nan"} and @code{"nan(@var{chars@dots{}})"} are converted |
3037 | to NaN. Again, case is ignored. If @var{chars@dots{}} are provided, they | |
7a68c94a UD |
3038 | are used in some unspecified fashion to select a particular |
3039 | representation of NaN (there can be several). | |
3040 | ||
3041 | Since zero is a valid result as well as the value returned on error, you | |
3042 | should check for errors in the same way as for @code{strtol}, by | |
010fe231 | 3043 | examining @code{errno} and @var{tailptr}. |
28f540f4 RM |
3044 | @end deftypefun |
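
As a sketch of the checking described above, the helper below tells apart
``no number found'', ``out of range'' and success.  The name
@code{parse_double} and its return codes are hypothetical; only
@code{strtod}, @code{errno} and @code{ERANGE} come from the library.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: convert the start of TEXT to a double.
   Returns 0 on success, -1 if no number was found, and -2 if
   strtod reported a range error.  *REST receives the unparsed tail.  */
int
parse_double (const char *text, double *result, const char **rest)
{
  char *tail;
  double value;

  errno = 0;                      /* So a stale ERANGE is not misread.  */
  value = strtod (text, &tail);
  *rest = tail;

  if (tail == text)               /* No conversion was performed.  */
    return -1;

  *result = value;                /* On a range error this is the value
                                     strtod chose to return.  */
  if (errno == ERANGE)
    return -2;
  return 0;
}
```

With this helper, the hexadecimal input @code{"0x1.8p1"} yields
@code{3.0}, since the significand 1.5 is scaled by two to the power 1.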
3045 | ||
2c6fe0bd | 3046 | @deftypefun float strtof (const char *@var{string}, char **@var{tailptr}) |
7a68c94a | 3047 | @deftypefunx {long double} strtold (const char *@var{string}, char **@var{tailptr}) |
d08a7e4c | 3048 | @standards{ISO, stdlib.h} |
b719dafd | 3049 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
7d641c41 | 3050 | @comment See safety comments for strtod. |
7a68c94a UD |
3051 | These functions are analogous to @code{strtod}, but return @code{float} |
3052 | and @code{long double} values respectively. They report errors in the | |
3053 | same way as @code{strtod}. @code{strtof} can be substantially faster | |
3054 | than @code{strtod}, but has less precision; conversely, @code{strtold} | |
3055 | can be much slower but has more precision (on systems where @code{long | |
3056 | double} is a separate type). | |
3057 | ||
These functions were originally GNU extensions and became part of the
standard with @w{ISO C99}.
2c6fe0bd UD |
3059 | @end deftypefun |
3060 | ||
7d641c41 | 3061 | @deftypefun _FloatN strtofN (const char *@var{string}, char **@var{tailptr}) |
7d641c41 | 3062 | @deftypefunx _FloatNx strtofNx (const char *@var{string}, char **@var{tailptr}) |
d08a7e4c | 3063 | @standards{ISO/IEC TS 18661-3, stdlib.h} |
7d641c41 GG |
3064 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
3065 | @comment See safety comments for strtod. | |
3066 | These functions are like @code{strtod}, except for the return type. | |
3067 | ||
3068 | They were introduced in @w{ISO/IEC TS 18661-3} and are available on machines | |
3069 | that support the related types; @pxref{Mathematics}. | |
3070 | @end deftypefun | |
3071 | ||
b642f101 | 3072 | @deftypefun double wcstod (const wchar_t *restrict @var{string}, wchar_t **restrict @var{tailptr}) |
b642f101 | 3073 | @deftypefunx float wcstof (const wchar_t *@var{string}, wchar_t **@var{tailptr}) |
b642f101 | 3074 | @deftypefunx {long double} wcstold (const wchar_t *@var{string}, wchar_t **@var{tailptr}) |
7d641c41 | 3075 | @deftypefunx _FloatN wcstofN (const wchar_t *@var{string}, wchar_t **@var{tailptr}) |
7d641c41 | 3076 | @deftypefunx _FloatNx wcstofNx (const wchar_t *@var{string}, wchar_t **@var{tailptr}) |
d08a7e4c RJ |
3077 | @standards{ISO, wchar.h} |
3078 | @standardsx{wcstofN, GNU, wchar.h} | |
3079 | @standardsx{wcstofNx, GNU, wchar.h} | |
b719dafd | 3080 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
7d641c41 GG |
3081 | @comment See safety comments for strtod. |
The @code{wcstod}, @code{wcstof}, @code{wcstold}, @code{wcstof@var{N}},
and @code{wcstof@var{N}x} functions are equivalent in nearly all aspects
to the @code{strtod}, @code{strtof}, @code{strtold},
@code{strtof@var{N}}, and @code{strtof@var{N}x} functions, but they
handle wide character strings.
b642f101 UD |
3087 | |
3088 | The @code{wcstod} function was introduced in @w{Amendment 1} of @w{ISO | |
3089 | C90}. The @code{wcstof} and @code{wcstold} functions were introduced in | |
3090 | @w{ISO C99}. | |
7d641c41 GG |
3091 | |
3092 | The @code{wcstof@var{N}} and @code{wcstof@var{N}x} functions are not in | |
3093 | any standard, but are added to provide completeness for the | |
3094 | non-deprecated interface of wide character string to floating-point | |
3095 | conversion functions. They are only available on machines that support | |
3096 | the related types; @pxref{Mathematics}. | |
b642f101 UD |
3097 | @end deftypefun |
3098 | ||
28f540f4 | 3099 | @deftypefun double atof (const char *@var{string}) |
d08a7e4c | 3100 | @standards{ISO, stdlib.h} |
b719dafd | 3101 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
28f540f4 RM |
3102 | This function is similar to the @code{strtod} function, except that it |
3103 | need not detect overflow and underflow errors. The @code{atof} function | |
3104 | is provided mostly for compatibility with existing code; using | |
3105 | @code{strtod} is more robust. | |
3106 | @end deftypefun | |
880f421f | 3107 | |
1f77f049 | 3108 | @Theglibc{} also provides @samp{_l} versions of these functions, |
7a68c94a | 3109 | which take an additional argument, the locale to use in conversion. |
aa04af00 AM |
3110 | |
3111 | See also @ref{Parsing of Integers}. | |
880f421f | 3112 | |
6962682f GG |
3113 | @node Printing of Floats |
3114 | @section Printing of Floats | |
3115 | ||
3116 | @pindex stdlib.h | |
3117 | The @samp{strfrom} functions are declared in @file{stdlib.h}. | |
3118 | ||
6962682f GG |
3119 | @deftypefun int strfromd (char *restrict @var{string}, size_t @var{size}, const char *restrict @var{format}, double @var{value}) |
3120 | @deftypefunx int strfromf (char *restrict @var{string}, size_t @var{size}, const char *restrict @var{format}, float @var{value}) | |
3121 | @deftypefunx int strfroml (char *restrict @var{string}, size_t @var{size}, const char *restrict @var{format}, long double @var{value}) | |
d08a7e4c | 3122 | @standards{ISO/IEC TS 18661-1, stdlib.h} |
6962682f | 3123 | @safety{@prelim{}@mtsafe{@mtslocale{}}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}} |
7d641c41 GG |
3124 | @comment All these functions depend on both __printf_fp and __printf_fphex, |
3125 | @comment which are both AS-unsafe (ascuheap) and AC-unsafe (acsmem). | |
The functions @code{strfromd} (``string-from-double''), @code{strfromf}
(``string-from-float''), and @code{strfroml} (``string-from-long-double'')
convert the floating-point number @var{value} to a string of characters and
store it in the area pointed to by @var{string}.  The conversion
writes at most @var{size} characters (including the terminating null
character) and respects the format specified by @var{format}.
3132 | ||
3133 | The format string must start with the character @samp{%}. An optional | |
3134 | precision follows, which starts with a period, @samp{.}, and may be | |
3135 | followed by a decimal integer, representing the precision. If a decimal | |
3136 | integer is not specified after the period, the precision is taken to be | |
3137 | zero. The character @samp{*} is not allowed. Finally, the format string | |
3138 | ends with one of the following conversion specifiers: @samp{a}, @samp{A}, | |
3139 | @samp{e}, @samp{E}, @samp{f}, @samp{F}, @samp{g} or @samp{G} (@pxref{Table | |
3140 | of Output Conversions}). Invalid format strings result in undefined | |
3141 | behavior. | |
3142 | ||
3143 | These functions return the number of characters that would have been | |
3144 | written to @var{string} had @var{size} been sufficiently large, not | |
3145 | counting the terminating null character. Thus, the null-terminated output | |
3146 | has been completely written if and only if the returned value is less than | |
3147 | @var{size}. | |
3148 | ||
3149 | These functions were introduced by ISO/IEC TS 18661-1. | |
3150 | @end deftypefun | |
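
The return-value rule lends itself to a simple truncation test.  The
sketch below assumes that defining
@code{__STDC_WANT_IEC_60559_BFP_EXT__} before including
@file{stdlib.h} is enough to make the declaration visible; the helper
name @code{format_fixed2} is hypothetical.

```c
/* Assumption: this macro requests the TS 18661-1 declarations.  */
#define __STDC_WANT_IEC_60559_BFP_EXT__ 1
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: format VALUE with two digits after the
   decimal point.  Returns true only if the complete, null-terminated
   result fit into the SIZE bytes at BUF.  */
bool
format_fixed2 (char *buf, size_t size, double value)
{
  /* strfromd reports the length the full output would have had.  */
  int needed = strfromd (buf, size, "%.2f", value);

  return needed >= 0 && (size_t) needed < size;
}
```

A caller that gets @code{false} back can allocate a larger buffer and
call the function again.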
3151 | ||
7d641c41 | 3152 | @deftypefun int strfromfN (char *restrict @var{string}, size_t @var{size}, const char *restrict @var{format}, _Float@var{N} @var{value}) |
7d641c41 | 3153 | @deftypefunx int strfromfNx (char *restrict @var{string}, size_t @var{size}, const char *restrict @var{format}, _Float@var{N}x @var{value}) |
d08a7e4c | 3154 | @standards{ISO/IEC TS 18661-3, stdlib.h} |
7d641c41 GG |
3155 | @safety{@prelim{}@mtsafe{@mtslocale{}}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}} |
3156 | @comment See safety comments for strfromd. | |
3157 | These functions are like @code{strfromd}, except for the type of | |
3158 | @code{value}. | |
3159 | ||
3160 | They were introduced in @w{ISO/IEC TS 18661-3} and are available on machines | |
3161 | that support the related types; @pxref{Mathematics}. | |
3162 | @end deftypefun | |
3163 | ||
7a68c94a UD |
3164 | @node System V Number Conversion |
3165 | @section Old-fashioned System V number-to-string functions | |
880f421f | 3166 | |
7a68c94a | 3167 | The old @w{System V} C library provided three functions to convert |
1f77f049 JM |
3168 | numbers to strings, with unusual and hard-to-use semantics. @Theglibc{} |
3169 | also provides these functions and some natural extensions. | |
880f421f | 3170 | |
1f77f049 | 3171 | These functions are only available in @theglibc{} and on systems descended |
7a68c94a UD |
3172 | from AT&T Unix. Therefore, unless these functions do precisely what you |
3173 | need, it is better to use @code{sprintf}, which is standard. | |
880f421f | 3174 | |
7a68c94a | 3175 | All these functions are defined in @file{stdlib.h}. |
880f421f | 3176 | |
7a68c94a | 3177 | @deftypefun {char *} ecvt (double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg}) |
d08a7e4c RJ |
3178 | @standards{SVID, stdlib.h} |
3179 | @standards{Unix98, stdlib.h} | |
b719dafd | 3180 | @safety{@prelim{}@mtunsafe{@mtasurace{:ecvt}}@asunsafe{}@acsafe{}} |
880f421f | 3181 | The function @code{ecvt} converts the floating-point number @var{value} |
0ea5db4f | 3182 | to a string with at most @var{ndigit} decimal digits. The |
cf822e3c | 3183 | returned string contains no decimal point or sign. The first digit of |
0ea5db4f UD |
3184 | the string is non-zero (unless @var{value} is actually zero) and the |
3185 | last digit is rounded to nearest. @code{*@var{decpt}} is set to the | |
7a68c94a | 3186 | index in the string of the first digit after the decimal point. |
0ea5db4f UD |
3187 | @code{*@var{neg}} is set to a nonzero value if @var{value} is negative, |
3188 | zero otherwise. | |
880f421f | 3189 | |
67994d6f UD |
3190 | If @var{ndigit} decimal digits would exceed the precision of a |
3191 | @code{double} it is reduced to a system-specific value. | |
3192 | ||
880f421f UD |
3193 | The returned string is statically allocated and overwritten by each call |
3194 | to @code{ecvt}. | |
3195 | ||
0ea5db4f UD |
3196 | If @var{value} is zero, it is implementation defined whether |
3197 | @code{*@var{decpt}} is @code{0} or @code{1}. | |
880f421f | 3198 | |
0ea5db4f UD |
3199 | For example: @code{ecvt (12.3, 5, &d, &n)} returns @code{"12300"} |
3200 | and sets @var{d} to @code{2} and @var{n} to @code{0}. | |
880f421f UD |
3201 | @end deftypefun |
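
To illustrate how the three results fit together, here is a sketch that
reassembles them into scientific-style text of the form
@samp{0.@var{digits}e@var{decpt}}.  The helper name
@code{sci_with_ecvt} is hypothetical, and the feature-test macro is an
assumption about what is needed to see the declaration.

```c
/* Assumption: ecvt is declared when _DEFAULT_SOURCE is in effect.  */
#ifndef _DEFAULT_SOURCE
# define _DEFAULT_SOURCE
#endif
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: write VALUE into BUF as 0.DIGITSeDECPT,
   using NDIGIT significant digits.  Returns what snprintf returns.  */
int
sci_with_ecvt (double value, int ndigit, char *buf, size_t size)
{
  int decpt, neg;
  /* The result points to static storage; use it before the next call.  */
  const char *digits = ecvt (value, ndigit, &decpt, &neg);

  return snprintf (buf, size, "%s0.%se%d", neg ? "-" : "", digits, decpt);
}
```

For the call shown above, @code{ecvt (12.3, 5, &d, &n)}, this produces
@code{"0.12300e2"}.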
3202 | ||
0ea5db4f | 3203 | @deftypefun {char *} fcvt (double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg}) |
d08a7e4c RJ |
3204 | @standards{SVID, stdlib.h} |
3205 | @standards{Unix98, stdlib.h} | |
b719dafd | 3206 | @safety{@prelim{}@mtunsafe{@mtasurace{:fcvt}}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}} |
The function @code{fcvt} is like @code{ecvt}, but @var{ndigit} specifies
the number of digits after the decimal point.  If @var{ndigit} is less
than zero, @var{value} is rounded to the @math{-@var{ndigit}+1}'th place to the
left of the decimal point.  For example, if @var{ndigit} is @code{-1},
@var{value} will be rounded to the nearest 10.  If @var{ndigit} is
negative and larger in magnitude than the number of digits to the left of
the decimal point in @var{value}, @var{value} will be rounded to one
significant digit.
880f421f | 3214 | |
67994d6f UD |
3215 | If @var{ndigit} decimal digits would exceed the precision of a |
3216 | @code{double} it is reduced to a system-specific value. | |
3217 | ||
880f421f UD |
3218 | The returned string is statically allocated and overwritten by each call |
3219 | to @code{fcvt}. | |
880f421f UD |
3220 | @end deftypefun |
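
The following sketch shows one way to put the decimal point back into
the digits returned by @code{fcvt}.  It is deliberately limited: the
hypothetical helper @code{fixed_with_fcvt} only handles a nonnegative
@var{ndigit} and values whose magnitude is at least 1.

```c
/* Assumption: fcvt is declared when _DEFAULT_SOURCE is in effect.  */
#ifndef _DEFAULT_SOURCE
# define _DEFAULT_SOURCE
#endif
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: write VALUE in fixed notation with NDIGIT
   digits after the decimal point.  Returns -1 for cases this sketch
   does not handle, otherwise what snprintf returns.  */
int
fixed_with_fcvt (double value, int ndigit, char *buf, size_t size)
{
  int decpt, neg;
  const char *digits;

  if (ndigit < 0)
    return -1;
  digits = fcvt (value, ndigit, &decpt, &neg);
  if (decpt <= 0)                 /* Magnitude below 1, infinity or NaN.  */
    return -1;

  /* The first DECPT digits belong before the decimal point.  */
  return snprintf (buf, size, "%s%.*s.%s",
                   neg ? "-" : "", decpt, digits, digits + decpt);
}
```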
3221 | ||
880f421f | 3222 | @deftypefun {char *} gcvt (double @var{value}, int @var{ndigit}, char *@var{buf}) |
d08a7e4c RJ |
3223 | @standards{SVID, stdlib.h} |
3224 | @standards{Unix98, stdlib.h} | |
b719dafd AO |
3225 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
3226 | @c gcvt calls sprintf, that ultimately calls vfprintf, which malloc()s | |
3227 | @c args_value if it's too large, but gcvt never exercises this path. | |
@code{gcvt} is functionally equivalent to @samp{sprintf(buf, "%.*g",
ndigit, value)}.  It is provided only for compatibility's sake.  It
returns @var{buf}.
67994d6f UD |
3231 | |
3232 | If @var{ndigit} decimal digits would exceed the precision of a | |
3233 | @code{double} it is reduced to a system-specific value. | |
880f421f UD |
3234 | @end deftypefun |
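
A quick way to see the relationship with the @code{printf} family is to
compare the two outputs.  This sketch treats @var{ndigit} as a precision
(the @samp{%.*g} form) and assumes @var{ndigit} is small enough not to be
reduced; the helper name @code{gcvt_matches_printf} is hypothetical.

```c
/* Assumption: gcvt is declared when _DEFAULT_SOURCE is in effect.  */
#ifndef _DEFAULT_SOURCE
# define _DEFAULT_SOURCE
#endif
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical check: nonzero if gcvt and snprintf with "%.*g"
   produce the same text for VALUE with NDIGIT significant digits.  */
int
gcvt_matches_printf (double value, int ndigit)
{
  char from_gcvt[64];
  char from_printf[64];

  /* gcvt takes no size argument, so the buffer must be big enough.  */
  gcvt (value, ndigit, from_gcvt);
  snprintf (from_printf, sizeof from_printf, "%.*g", ndigit, value);

  return strcmp (from_gcvt, from_printf) == 0;
}
```

Because @code{gcvt} cannot be told the buffer size, @code{snprintf} is
usually the safer choice in new code.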
3235 | ||
1f77f049 | 3236 | As extensions, @theglibc{} provides versions of these three |
7a68c94a | 3237 | functions that take @code{long double} arguments. |
880f421f | 3238 | |
7a68c94a | 3239 | @deftypefun {char *} qecvt (long double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg}) |
d08a7e4c | 3240 | @standards{GNU, stdlib.h} |
b719dafd | 3241 | @safety{@prelim{}@mtunsafe{@mtasurace{:qecvt}}@asunsafe{}@acsafe{}} |
67994d6f UD |
3242 | This function is equivalent to @code{ecvt} except that it takes a |
3243 | @code{long double} for the first parameter and that @var{ndigit} is | |
3244 | restricted by the precision of a @code{long double}. | |
880f421f UD |
3245 | @end deftypefun |
3246 | ||
0ea5db4f | 3247 | @deftypefun {char *} qfcvt (long double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg}) |
d08a7e4c | 3248 | @standards{GNU, stdlib.h} |
b719dafd | 3249 | @safety{@prelim{}@mtunsafe{@mtasurace{:qfcvt}}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}} |
7a68c94a | 3250 | This function is equivalent to @code{fcvt} except that it |
67994d6f UD |
3251 | takes a @code{long double} for the first parameter and that @var{ndigit} is |
3252 | restricted by the precision of a @code{long double}. | |
880f421f UD |
3253 | @end deftypefun |
3254 | ||
880f421f | 3255 | @deftypefun {char *} qgcvt (long double @var{value}, int @var{ndigit}, char *@var{buf}) |
d08a7e4c | 3256 | @standards{GNU, stdlib.h} |
b719dafd | 3257 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
67994d6f UD |
3258 | This function is equivalent to @code{gcvt} except that it takes a |
3259 | @code{long double} for the first parameter and that @var{ndigit} is | |
3260 | restricted by the precision of a @code{long double}. | |
880f421f UD |
3261 | @end deftypefun |
3262 | ||
3263 | ||
3264 | @cindex gcvt_r | |
7a68c94a UD |
3265 | The @code{ecvt} and @code{fcvt} functions, and their @code{long double} |
3266 | equivalents, all return a string located in a static buffer which is | |
1f77f049 | 3267 | overwritten by the next call to the function. @Theglibc{} |
7a68c94a UD |
3268 | provides another set of extended functions which write the converted |
3269 | string into a user-supplied buffer. These have the conventional | |
3270 | @code{_r} suffix. | |
3271 | ||
3272 | @code{gcvt_r} is not necessary, because @code{gcvt} already uses a | |
3273 | user-supplied buffer. | |
880f421f | 3274 | |
5c1c368f | 3275 | @deftypefun int ecvt_r (double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg}, char *@var{buf}, size_t @var{len}) |
d08a7e4c | 3276 | @standards{GNU, stdlib.h} |
b719dafd | 3277 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
7a68c94a UD |
3278 | The @code{ecvt_r} function is the same as @code{ecvt}, except |
3279 | that it places its result into the user-specified buffer pointed to by | |
5c1c368f UD |
3280 | @var{buf}, with length @var{len}. The return value is @code{-1} in |
3281 | case of an error and zero otherwise. | |
880f421f | 3282 | |
7a68c94a | 3283 | This function is a GNU extension. |
880f421f UD |
3284 | @end deftypefun |
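
Because each call writes into its own buffer, two results can be held
at the same time, which is not possible with @code{ecvt}.  The sketch
below uses that to compare two numbers by their leading digits; the
helper name @code{same_leading_digits} and the buffer size are
assumptions made for illustration.

```c
/* Assumption: ecvt_r is declared when _DEFAULT_SOURCE is in effect.  */
#ifndef _DEFAULT_SOURCE
# define _DEFAULT_SOURCE
#endif
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: nonzero if A and B agree in sign, decimal
   exponent and their first NDIGIT significant digits.  */
int
same_leading_digits (double a, double b, int ndigit)
{
  char da[32], db[32];
  int pa, pb, na, nb;

  /* A nonzero return value means the conversion failed.  */
  if (ecvt_r (a, ndigit, &pa, &na, da, sizeof da) != 0
      || ecvt_r (b, ndigit, &pb, &nb, db, sizeof db) != 0)
    return 0;

  return pa == pb && na == nb && strcmp (da, db) == 0;
}
```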
3285 | ||
5c1c368f | 3286 | @deftypefun int fcvt_r (double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg}, char *@var{buf}, size_t @var{len}) |
d08a7e4c RJ |
3287 | @standards{SVID, stdlib.h} |
3288 | @standards{Unix98, stdlib.h} | |
b719dafd | 3289 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
5c1c368f UD |
3290 | The @code{fcvt_r} function is the same as @code{fcvt}, except that it |
3291 | places its result into the user-specified buffer pointed to by | |
3292 | @var{buf}, with length @var{len}. The return value is @code{-1} in | |
3293 | case of an error and zero otherwise. | |
880f421f | 3294 | |
7a68c94a | 3295 | This function is a GNU extension. |
880f421f UD |
3296 | @end deftypefun |
3297 | ||
5c1c368f | 3298 | @deftypefun int qecvt_r (long double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg}, char *@var{buf}, size_t @var{len}) |
d08a7e4c | 3299 | @standards{GNU, stdlib.h} |
b719dafd | 3300 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
7a68c94a UD |
3301 | The @code{qecvt_r} function is the same as @code{qecvt}, except |
3302 | that it places its result into the user-specified buffer pointed to by | |
5c1c368f UD |
3303 | @var{buf}, with length @var{len}. The return value is @code{-1} in |
3304 | case of an error and zero otherwise. | |
880f421f | 3305 | |
7a68c94a | 3306 | This function is a GNU extension. |
880f421f UD |
3307 | @end deftypefun |
3308 | ||
5c1c368f | 3309 | @deftypefun int qfcvt_r (long double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg}, char *@var{buf}, size_t @var{len}) |
d08a7e4c | 3310 | @standards{GNU, stdlib.h} |
b719dafd | 3311 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
7a68c94a UD |
3312 | The @code{qfcvt_r} function is the same as @code{qfcvt}, except |
3313 | that it places its result into the user-specified buffer pointed to by | |
5c1c368f UD |
3314 | @var{buf}, with length @var{len}. The return value is @code{-1} in |
3315 | case of an error and zero otherwise. | |
880f421f | 3316 | |
7a68c94a | 3317 | This function is a GNU extension. |
880f421f | 3318 | @end deftypefun |