.\" Copyright (c) 1990, 1991 The Regents of the University of California.
.\" All rights reserved.
.\"
.\" This code is derived from software contributed to Berkeley by
.\" Chris Torek and the American National Standards Committee X3,
.\" on Information Processing Systems.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. All advertising materials mentioning features or use of this software
.\"    must display the following acknowledgement:
.\"     This product includes software developed by the University of
.\"     California, Berkeley and its contributors.
.\" 4. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"     @(#)scanf.3     6.14 (Berkeley) 1/8/93
.\"
.\" Converted for Linux, Mon Nov 29 15:22:01 1993, faith@cs.unc.edu
.\" modified to resemble the GNU libio setup used in the Linux libc
.\" used in versions 4.x (x>4) and 5   Helmut.Geyer@iwr.uni-heidelberg.de
.\" Modified, aeb, 970121
.\" 2005-07-14, mtk, added description of %n$ form; various text
.\"     incorporated from the GNU C library documentation ((C) The
.\"     Free Software Foundation); other parts substantially rewritten.
.\"
.\" FIXME
.\" The glibc 2.7 release announcement says:
.\"     Implement 'm' modifier for scanf.  Add stricter C99/SUS compliance
.\"     by not recognizing 'a' as a modifier when those specs are requested.
.\" These changes need to be documented.
.\"
.TH SCANF 3  2007-07-26 "GNU" "Linux Programmer's Manual"
.SH NAME
scanf, fscanf, sscanf, vscanf, vsscanf, vfscanf \- input format conversion
.SH SYNOPSIS
.nf
.B #include <stdio.h>

.BI "int scanf(const char *" format ", ...);"
.BI "int fscanf(FILE *" stream ", const char *" format ", ...);"
.BI "int sscanf(const char *" str ", const char *" format ", ...);"
.sp
.B #include <stdarg.h>

.BI "int vscanf(const char *" format ", va_list " ap );
.BI "int vsscanf(const char *" str ", const char *" format ", va_list " ap );
.BI "int vfscanf(FILE *" stream ", const char *" format ", va_list " ap );
.fi
.sp
.in -4n
Feature Test Macro Requirements for glibc (see
.BR feature_test_macros (7)):
.in
.sp
.BR vscanf (),
.BR vsscanf (),
.BR vfscanf ():
_XOPEN_SOURCE\ >=\ 600 || _ISOC99_SOURCE; or
.I "cc -std=c99"
.SH DESCRIPTION
The
.BR scanf ()
family of functions scans input according to
.I format
as described below.
This format may contain
.IR "conversion specifications" ;
the results from such conversions, if any,
are stored in the locations pointed to by the
.I pointer
arguments that follow
.IR format .
Each
.I pointer
argument must be of a type that is appropriate for the value returned
by the corresponding conversion specification.

If the number of conversion specifications in
.I format
exceeds the number of
.I pointer
arguments, the results are undefined.
If the number of
.I pointer
arguments exceeds the number of conversion specifications, then the excess
.I pointer
arguments are evaluated, but are otherwise ignored.

The
.BR scanf ()
function reads input from the standard input stream
.IR stdin ,
.BR fscanf ()
reads input from the stream pointer
.IR stream ,
and
.BR sscanf ()
reads its input from the character string pointed to by
.IR str .
.PP
The
.BR vfscanf ()
function is analogous to
.BR vfprintf (3)
and reads input from the stream pointer
.I stream
using a variable argument list of pointers (see
.BR stdarg (3)).
The
.BR vscanf ()
function scans a variable argument list from the standard input and the
.BR vsscanf ()
function scans it from a string; these are analogous to the
.BR vprintf (3)
and
.BR vsprintf (3)
functions respectively.
.PP
The
.I format
string consists of a sequence of
.I directives
which describe how to process the sequence of input characters.
If processing of a directive fails, no further input is read, and
.BR scanf ()
returns.
A "failure" can be either of the following:
.IR "input failure" ,
meaning that input characters were unavailable, or
.IR "matching failure" ,
meaning that the input was inappropriate (see below).

A directive is one of the following:
.TP
\(bu
A sequence of white-space characters (space, tab, newline, etc.; see
.BR isspace (3)).
This directive matches any amount of white space,
including none, in the input.
.TP
\(bu
An ordinary character (i.e., one other than white space or '%').
This character must exactly match the next character of input.
.TP
\(bu
A conversion specification, which commences with a '%' (percent) character.
A sequence of characters from the input is converted according to
this specification, and the result is placed in the corresponding
.I pointer
argument.
If the next item of input does not match the conversion specification,
the conversion fails; this is a
.IR "matching failure" .
.PP
Each
.I conversion specification
in
.I format
begins with either the character '%' or the character sequence
"\fB%\fP\fIn\fP\fB$\fP"
(see below for the distinction) followed by:
.TP
\(bu
An optional '*' assignment-suppression character:
.BR scanf ()
reads input as directed by the conversion specification,
but discards the input.
No corresponding
.I pointer
argument is required, and this specification is not
included in the count of successful assignments returned by
.BR scanf ().
.TP
\(bu
An optional 'a' character.
This is used with string conversions, and relieves the caller of the
need to allocate a corresponding buffer to hold the input: instead,
.BR scanf ()
allocates a buffer of sufficient size,
and assigns the address of this buffer to the corresponding
.I pointer
argument, which should be a pointer to a
.I "char *"
variable (this variable does not need to be initialized before the call).
The caller should subsequently
.BR free (3)
this buffer when it is no longer required.
This is a GNU extension;
C99 employs the 'a' character as a conversion specifier (and
it can also be used as such in the GNU implementation).
.TP
\(bu
An optional decimal integer which specifies the
.IR "maximum field width" .
Reading of characters stops either when this maximum is reached or
when a non-matching character is found, whichever happens first.
Most conversions discard initial white-space characters (the exceptions
are noted below),
and these discarded characters don't count towards the maximum field width.
String input conversions store a null terminator ('\\0')
to mark the end of the input;
the maximum field width does not include this terminator.
.TP
\(bu
An optional
.IR "type modifier character" .
For example, the
.B l
type modifier is used with integer conversions such as
.B %d
to specify that the corresponding
.I pointer
argument is a pointer to a
.I "long int"
rather than a pointer to an
.IR int .
.TP
\(bu
A
.I "conversion specifier"
that specifies the type of input conversion to be performed.
.PP
The conversion specifications in
.I format
are of two forms, either beginning with '%' or beginning with
"\fB%\fP\fIn\fP\fB$\fP".
The two forms should not be mixed in the same
.I format
string, except that a string containing
"\fB%\fP\fIn\fP\fB$\fP"
specifications can include
.B %%
and
.BR %* .
If
.I format
contains '%'
specifications then these correspond in order with successive
.I pointer
arguments.
In the
"\fB%\fP\fIn\fP\fB$\fP"
form (which is specified in POSIX.1-2001, but not C99),
.I n
is a decimal integer that specifies that the converted input should
be placed in the location referred to by the
.IR n -th
.I pointer
argument following
.IR format .
.SS Conversions
The following
.I "type modifier characters"
can appear in a conversion specification:
.TP
.B h
Indicates that the conversion will be one of
.B diouxX
or
.B n
and the next pointer is a pointer to a
.I short int
or
.I unsigned short int
(rather than
.IR int ).
.TP
.B hh
As for
.BR h ,
but the next pointer is a pointer to a
.I signed char
or
.IR "unsigned char" .
.TP
.B j
As for
.BR h ,
but the next pointer is a pointer to an
.I intmax_t
or a
.IR uintmax_t .
This modifier was introduced in C99.
.TP
.B l
Indicates either that the conversion will be one of
.B diouxX
or
.B n
and the next pointer is a pointer to a
.I long int
or
.I unsigned long int
(rather than
.IR int ),
or that the conversion will be one of
.B efg
and the next pointer is a pointer to
.I double
(rather than
.IR float ).
Specifying two
.B l
characters is equivalent to
.BR L .
If used with
.B %c
or
.B %s
the corresponding parameter is considered
as a pointer to a wide character or wide-character string respectively.
.\" This use of l was introduced in Amendment 1 to ISO C90.
.TP
.B L
Indicates either that the conversion will be one of
.B efg
and the next pointer is a pointer to
.IR "long double" ,
or that the conversion will be one of
.B dioux
and the next pointer is a pointer to
.IR "long long" .
.\" MTK, Jul 05: The following is no longer true for modern
.\" ANSI C (i.e., C99):
.\" (Note that long long is not an
.\" ANSI C
.\" type. Any program using this will not be portable to all
.\" architectures).
.TP
.B q
Equivalent to
.BR L .
This modifier does not exist in ANSI C.
.TP
.B t
As for
.BR h ,
but the next pointer is a pointer to a
.IR ptrdiff_t .
This modifier was introduced in C99.
.TP
.B z
As for
.BR h ,
but the next pointer is a pointer to a
.IR size_t .
This modifier was introduced in C99.
.PP
The following
.I "conversion specifiers"
are available:
.TP
.B %
Matches a literal '%'.
That is,
.B %\&%
in the format string matches a
single input '%' character.
No conversion is done, and assignment does not
occur.
.TP
.B d
Matches an optionally signed decimal integer;
the next pointer must be a pointer to
.IR int .
.TP
.B D
Equivalent to
.IR ld ;
this exists only for backwards compatibility.
(Note: this is supported only in libc4.
In libc5 and glibc,
.B %D
is silently ignored, causing old programs to fail mysteriously.)
.TP
.B i
Matches an optionally signed integer; the next pointer must be a pointer to
.IR int .
The integer is read in base 16 if it begins with
.I 0x
or
.IR 0X ,
in base 8 if it begins with
.IR 0 ,
and in base 10 otherwise.
Only characters that correspond to the base are used.
.TP
.B o
Matches an unsigned octal integer; the next pointer must be a pointer to
.IR "unsigned int" .
.TP
.B u
Matches an unsigned decimal integer; the next pointer must be a
pointer to
.IR "unsigned int" .
.TP
.B x
Matches an unsigned hexadecimal integer; the next pointer must
be a pointer to
.IR "unsigned int" .
.TP
.B X
Equivalent to
.BR x .
.TP
.B f
Matches an optionally signed floating-point number; the next pointer must
be a pointer to
.IR float .
.TP
.B e
Equivalent to
.BR f .
.TP
.B g
Equivalent to
.BR f .
.TP
.B E
Equivalent to
.BR f .
.TP
.B a
(C99) Equivalent to
.BR f .
.TP
.B s
Matches a sequence of non-white-space characters;
the next pointer must be a pointer to a character array that is
long enough to hold the input sequence and the terminating null
character ('\\0'), which is added automatically.
The input string stops at white space or at the maximum field
width, whichever occurs first.
.TP
.B c
Matches a sequence of characters whose length is specified by the
.I maximum field width
(default 1); the next pointer must be a pointer to
.IR char ,
and there must be enough room for all the characters (no terminating
null byte
is added).
The usual skip of leading white space is suppressed.
To skip white space first, use an explicit space in the format.
.TP
.B \&[
Matches a nonempty sequence of characters from the specified set of
accepted characters; the next pointer must be a pointer to
.IR char ,
and there must be enough room for all the characters in the string, plus a
terminating null byte.
The usual skip of leading white space is suppressed.
The string is to be made up of characters in (or not in) a particular set;
the set is defined by the characters between the open bracket
.B [
character and a close bracket
.B ]
character.
The set
.I excludes
those characters if the first character after the open bracket is a
circumflex
.RB ( ^ ).
To include a close bracket in the set, make it the first character after
the open bracket or the circumflex; any other position will end the set.
The hyphen character
.B \-
is also special; when placed between two other characters, it adds all
intervening characters to the set.
To include a hyphen, make it the last
character before the final close bracket.
For instance,
.B [^]0\-9\-]
means
the set "everything except close bracket, zero through nine, and hyphen".
The string ends with the appearance of a character not in the (or, with a
circumflex, in) set or when the field width runs out.
.TP
.B p
Matches a pointer value (as printed by
.B %p
in
.BR printf (3));
the next pointer must be a pointer to a pointer to
.IR void .
.TP
.B n
Nothing is expected; instead, the number of characters consumed thus far
from the input is stored through the next pointer, which must be a pointer
to
.IR int .
This is
.I not
a conversion, although it can be suppressed with the
.B *
assignment-suppression character.
The C standard says: "Execution of a
.B %n
directive does not increment
the assignment count returned at the completion of execution"
but the Corrigendum seems to contradict this.
It is probably wise
not to make any assumptions about the effect of
.B %n
conversions on the return value.
.SH "RETURN VALUE"
These functions return the number of input items
successfully matched and assigned,
which can be fewer than provided for,
or even zero in the event of an early matching failure.

The value
.B EOF
is returned if the end of input is reached before either the first
successful conversion or a matching failure occurs.
.B EOF
is also returned if a read error occurs,
in which case the error indicator for the stream (see
.BR ferror (3))
is set, and
.I errno
is set to indicate the error.
.SH "CONFORMING TO"
The functions
.BR fscanf (),
.BR scanf (),
and
.BR sscanf ()
conform to C89 and C99.
.PP
The
.B q
specifier is the 4.4BSD notation for
.IR "long long" ,
while
.B ll
or the usage of
.B L
in integer conversions is the GNU notation.
.PP
The Linux version of these functions is based on the
.I GNU
.I libio
library.
Take a look at the
.I info
documentation of
.I GNU
.I libc (glibc-1.08)
for a more concise description.
.SH BUGS
All functions are fully C89 conformant, but provide the
additional specifiers
.B q
and
.B a
as well as an additional behavior of the
.B L
and
.B l
specifiers.
The latter may be considered to be a bug, as it changes the
behavior of specifiers defined in C89.
.PP
Some combinations of the type modifiers and conversion
specifiers defined by ANSI C do not make sense
(e.g.
.BR "%Ld" ).
While they may have a well-defined behavior on Linux, this need not
be so on other architectures.
Therefore it is usually better to use
modifiers that are not defined by ANSI C at all; that is, use
.B q
or
.B ll
instead of
.B L
in combination with
.B diouxX
conversions.
.PP
The usage of
.B q
is not the same as on 4.4BSD,
as it may be used in float conversions equivalently to
.BR L .
.SH "SEE ALSO"
.BR getc (3),
.BR printf (3),
.BR setlocale (3),
.BR strtod (3),
.BR strtol (3),
.BR strtoul (3)