   1 .\" Copyright (c) 1990, 1991 The Regents of the University of California.
   2 .\" All rights reserved.
   3 .\"
   4 .\" This code is derived from software contributed to Berkeley by
   5 .\" Chris Torek and the American National Standards Committee X3,
   6 .\" on Information Processing Systems.
   7 .\"
   8 .\" Redistribution and use in source and binary forms, with or without
   9 .\" modification, are permitted provided that the following conditions
  10 .\" are met:
  11 .\" 1. Redistributions of source code must retain the above copyright
  12 .\"    notice, this list of conditions and the following disclaimer.
  13 .\" 2. Redistributions in binary form must reproduce the above copyright
  14 .\"    notice, this list of conditions and the following disclaimer in the
  15 .\"    documentation and/or other materials provided with the distribution.
  16 .\" 3. All advertising materials mentioning features or use of this software
  17 .\"    must display the following acknowledgement:
  18 .\"     This product includes software developed by the University of
  19 .\"     California, Berkeley and its contributors.
  20 .\" 4. Neither the name of the University nor the names of its contributors
  21 .\"    may be used to endorse or promote products derived from this software
  22 .\"    without specific prior written permission.
  23 .\"
  24 .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
  25 .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  26 .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  27 .\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
  28 .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  29 .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  30 .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  31 .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  32 .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  33 .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  34 .\" SUCH DAMAGE.
  35 .\"
  36 .\"     @(#)scanf.3     6.14 (Berkeley) 1/8/93
  37 .\"
  38 .\" Converted for Linux, Mon Nov 29 15:22:01 1993, faith@cs.unc.edu
  39 .\" modified to resemble the GNU libio setup used in the Linux libc
  40 .\" used in versions 4.x (x>4) and 5   Helmut.Geyer@iwr.uni-heidelberg.de
  41 .\" Modified, aeb, 970121
  42 .\" 2005-07-14, mtk, added description of %n$ form; various text
  43 .\"     incorporated from the GNU C library documentation ((C) The
  44 .\"     Free Software Foundation); other parts substantially rewritten.
  45 .\"
  46 .TH SCANF 3  2007-07-26 "GNU" "Linux Programmer's Manual"
  47 .SH NAME
  48 scanf, fscanf, sscanf, vscanf, vsscanf, vfscanf \- input format conversion
  49 .SH SYNOPSIS
  50 .nf
  51 .B #include <stdio.h>
  52 .na
  53 .BI "int scanf(const char *" format ", ..." );
  54 .br
  55 .BI "int fscanf(FILE *" stream ", const char *" format ", ..." );
  56 .br
  57 .BI "int sscanf(const char *" str ", const char *" format ", ..." );
  58 .sp
  59 .B #include <stdarg.h>
  60 .BI "int vscanf(const char *" format ", va_list " ap );
  61 .br
  62 .BI "int vsscanf(const char *" str ", const char *" format ", va_list " ap );
  63 .br
  64 .BI "int vfscanf(FILE *" stream ", const char *" format ", va_list " ap );
  65 .ad
  66 .fi
  67 .sp
  68 .in -4n
  69 Feature Test Macro Requirements for glibc (see
  70 .BR feature_test_macros (7)):
  71 .in
  72 .sp
  73 .BR vscanf (),
  74 .BR vsscanf (),
  75 .BR vfscanf ():
  76 .br
  77 _XOPEN_SOURCE\ >=\ 600 || _ISOC99_SOURCE; or
  78 .I "cc -std=c99"
  79 .SH DESCRIPTION
  80 The
  81 .BR scanf ()
  82 family of functions scans input according to
  83 .I format
  84 as described below.
  85 This format may contain
  86 .IR "conversion specifications" ;
  87 the results from such conversions, if any,
  88 are stored in the locations pointed to by the
  89 .I pointer
  90 arguments that follow
  91 .IR format .
  92 Each
  93 .I pointer
  94 argument must be of a type that is appropriate for the value returned
  95 by the corresponding conversion specification.
  96
  97 If the number of conversion specifications in
  98 .I format
  99 exceeds the number of
 100 .I pointer
 101 arguments, the results are undefined.
 102 If the number of
 103 .I pointer
 104 arguments exceeds the number of conversion specifications, then the excess
 105 .I pointer
 106 arguments are evaluated, but are otherwise ignored.
 107
 108 The
 109 .BR scanf ()
 110 function reads input from the standard input stream
 111 .IR stdin ,
 112 .BR fscanf ()
 113 reads input from the stream pointer
 114 .IR stream ,
 115 and
 116 .BR sscanf ()
 117 reads its input from the character string pointed to by
 118 .IR str .
 119 .PP
 120 The
 121 .BR vfscanf ()
 122 function is analogous to
 123 .BR vfprintf (3)
 124 and reads input from the stream pointer
 125 .I stream
 126 using a variable argument list of pointers (see
  127 .BR stdarg (3)).
 128 The
 129 .BR vscanf ()
 130 function scans a variable argument list from the standard input and the
 131 .BR vsscanf ()
 132 function scans it from a string; these are analogous to the
 133 .BR vprintf (3)
 134 and
 135 .BR vsprintf (3)
 136 functions respectively.
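.PP
As a minimal sketch of how the
.BR v *()
variants are typically used, the wrapper below forwards its own variable
argument list to
.BR vsscanf ().
The helper name
.I scan_exactly
and its
.I expected
parameter are hypothetical and serve only as illustration.
.PP
.nf
```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper: scan str according to format and report
   whether exactly `expected` items were assigned. */
int scan_exactly(const char *str, int expected, const char *format, ...)
{
    va_list ap;
    int assigned;

    va_start(ap, format);
    assigned = vsscanf(str, format, ap);  /* pointers are taken from ap */
    va_end(ap);

    return assigned == expected;
}
```
.fi
.PP
A call such as
.I "scan_exactly(line, 2, ""%d %d"", &a, &b)"
then behaves like
.BR sscanf ()
with the result compared against the expected count.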
 137 .PP
 138 The
 139 .I format
 140 string consists of a sequence of
 141 .IR directives
 142 which describe how to process the sequence of input characters.
 143 If processing of a directive fails, no further input is read, and
 144 .BR scanf ()
 145 returns.
 146 A "failure" can be either of the following:
 147 .IR "input failure" ,
 148 meaning that input characters were unavailable, or
 149 .IR "matching failure" ,
 150 meaning that the input was inappropriate (see below).
 151
 152 A directive is one of the following:
 153 .TP
 154 \(bu
  155 A sequence of white-space characters (space, tab, newline, etc.; see
 156 .BR isspace (3)).
 157 This directive matches any amount of white space,
 158 including none, in the input.
 159 .TP
 160 \(bu
 161 An ordinary character (i.e., one other than white space or '%').
 162 This character must exactly match the next character of input.
 163 .TP
 164 \(bu
 165 A conversion specification, which commences with a '%' (percent) character.
 166 A sequence of characters from the input is converted according to
 167 this specification, and the result is placed in the corresponding
 168 .I pointer
 169 argument.
 170 If the next item of input does not match the conversion specification,
 171 the conversion fails \(em this is a
 172 .IR "matching failure" .
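.PP
The three kinds of directive can be seen together in a single format string.
The following is a sketch only; the helper name
.I parse_width
and the input layout are assumptions made for illustration.
.PP
.nf
```c
#include <stdio.h>

/* Hypothetical helper: parse a line of the form "width = N".
   Returns the sscanf() result. */
int parse_width(const char *line, int *width)
{
    /* The letters of "width" and the "=" are ordinary characters that
       must match the input exactly.  Each space in the format matches
       any amount of white space, including none.  %d is a conversion
       specification; if the next input is not a decimal integer, a
       matching failure occurs and nothing is assigned. */
    return sscanf(line, "width = %d", width);
}
```
.fi
.PP
With this format, both "width=80" and "width   =   72" yield one assignment,
whereas "height = 5" and "width = abc" stop with a matching failure and
yield zero.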
 173 .PP
 174 Each
 175 .I conversion specification
 176 in
 177 .I format
 178 begins with either the character '%' or the character sequence
 179 "\fB%\fP\fIn\fP\fB$\fP"
 180 (see below for the distinction) followed by:
 181 .TP
 182 \(bu
 183 An optional '*' assignment-suppression character:
 184 .BR scanf ()
 185 reads input as directed by the conversion specification,
 186 but discards the input.
 187 No corresponding
 188 .I pointer
 189 argument is required, and this specification is not
 190 included in the count of successful assignments returned by
 191 .BR scanf ().
 192 .TP
 193 \(bu
 194 An optional 'a' character.
 195 This is used with string conversions, and relieves the caller of the
 196 need to allocate a corresponding buffer to hold the input: instead,
 197 .BR scanf ()
 198 allocates a buffer of sufficient size,
 199 and assigns the address of this buffer to the corresponding
 200 .I pointer
 201 argument, which should be a pointer to a
 202 .I "char *"
 203 variable (this variable does not need to be initialized before the call).
 204 The caller should subsequently
 205 .BR free (3)
 206 this buffer when it is no longer required.
 207 This is a GNU extension;
 208 C99 employs the 'a' character as a conversion specifier (and
 209 it can also be used as such in the GNU implementation).
 210 .TP
 211 \(bu
 212 An optional decimal integer which specifies the
 213 .IR "maximum field width" .
 214 Reading of characters stops either when this maximum is reached or
 215 when a non-matching character is found, whichever happens first.
  216 Most conversions discard initial white space characters (the exceptions
 217 are noted below),
 218 and these discarded characters don't count towards the maximum field width.
 219 String input conversions store a null terminator ('\\0')
 220 to mark the end of the input;
 221 the maximum field width does not include this terminator.
 222 .TP
 223 \(bu
 224 An optional
 225 .IR "type modifier character" .
 226 For example, the
 227 .B l
 228 type modifier is used with integer conversions such as
 229 .I %d
 230 to specify that the corresponding
 231 .I pointer
  232 argument is a pointer to a
 233 .I "long int"
 234 rather than a pointer to an
 235 .IR int .
 236 .TP
 237 \(bu
 238 A
 239 .I "conversion specifier"
 240 that specifies the type of input conversion to be performed.
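.PP
A short sketch of the assignment-suppression character and the maximum
field width follows.
The helper names
.I parse_year_day
and
.I first_three
are hypothetical, as is the date layout assumed for the input.
.PP
.nf
```c
#include <stdio.h>

/* Hypothetical helper: from "YYYY-MM-DD", keep the year and the day. */
int parse_year_day(const char *date, int *year, int *day)
{
    /* %4d reads at most four characters for the year.  %*d reads the
       month but discards it: it takes no pointer argument and is not
       included in the returned count. */
    return sscanf(date, "%4d-%*d-%d", year, day);
}

/* Hypothetical helper: copy at most three non-white-space characters. */
int first_three(const char *s, char out[4])
{
    /* The field width of 3 does not include the terminating null
       byte, so the buffer needs room for four bytes. */
    return sscanf(s, "%3s", out);
}
```
.fi
.PP
For the input "2007-07-26" the first helper returns 2, not 3, because the
suppressed month is not counted.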
 241 .PP
 242 The conversion specifications in
 243 .I format
 244 are of two forms, either beginning with '%' or beginning with
 245 "\fB%\fP\fIn\fP\fB$\fP".
 246 The two forms should not be mixed in the same
 247 .I format
 248 string, except that a string containing
 249 "\fB%\fP\fIn\fP\fB$\fP"
 250 specifications can include
 251 .I %%
 252 and
 253 .IR %* .
 254 If
 255 .I format
 256 contains '%'
 257 specifications then these correspond in order with successive
 258 .I pointer
 259 arguments.
 260 In the
 261 "\fB%\fP\fIn\fP\fB$\fP"
 262 form (which is specified in POSIX.1-2001, but not C99),
 263 .I n
 264 is a decimal integer that specifies that the converted input should
 265 be placed in the location referred to by the
 266 .IR n -th
 267 .I pointer
 268 argument following
 269 .IR format .
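.PP
The numbered form can be sketched as follows.
This assumes a C library that implements the POSIX.1-2001 extension (as
noted above, it is not part of C99); the helper name
.I parse_swapped
is hypothetical.
.PP
.nf
```c
#include <stdio.h>

/* Hypothetical helper: read two integers, storing them in reverse
   order relative to the pointer arguments. */
int parse_swapped(const char *s, int *first, int *second)
{
    /* %2$d stores the first number read through the 2nd pointer
       argument after the format; %1$d stores the second number
       read through the 1st pointer argument. */
    return sscanf(s, "%2$d %1$d", first, second);
}
```
.fi
.PP
Given the input "1 2", the value 1 ends up in
.I *second
and the value 2 in
.IR *first .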
 270 .SS Conversions
 271 The following
 272 .IR "type modifier characters"
 273 can appear in a conversion specification:
 274 .TP
 275 .B h
 276 Indicates that the conversion will be one of
 277 .B diouxX
 278 or
 279 .B n
 280 and the next pointer is a pointer to a
 281 .I short int
 282 or
 283 .I unsigned short int
 284 (rather than
 285 .IR int ).
 286 .TP
 287 .B hh
 288 As for
 289 .BR h ,
 290 but the next pointer is a pointer to a
 291 .I signed char
 292 or
 293 .IR "unsigned char" .
 294 .TP
 295 .B j
 296 As for
 297 .BR h ,
  298 but the next pointer is a pointer to an
 299 .I intmax_t
 300 or
 301 .IR uintmax_t .
 302 This modifier was introduced in C99.
 303 .TP
 304 .B l
 305 Indicates either that the conversion will be one of
 306 .B diouxX
 307 or
 308 .B n
 309 and the next pointer is a pointer to a
 310 .I long int
 311 or
 312 .I unsigned long int
 313 (rather than
 314 .IR int ),
 315 or that the conversion will be one of
 316 .B efg
 317 and the next pointer is a pointer to
 318 .I double
 319 (rather than
 320 .IR float ).
 321 Specifying two
 322 .B l
 323 characters is equivalent to
 324 .BR L .
 325 If used with
 326 .I %c
 327 or
 328 .I %s
 329 the corresponding parameter is considered
 330 as a pointer to a wide character or wide-character string respectively.
 331 .\" This use of l was introduced in Amendment 1 to ISO C90.
 332 .TP
 333 .B L
 334 Indicates that the conversion will be either
 335 .B efg
 336 and the next pointer is a pointer to
 337 .IR "long double"
 338 or the conversion will be
 339 .B dioux
 340 and the next pointer is a pointer to
 341 .IR "long long" .
 342 .\" MTK, Jul 05: The following is no longer true for modern
 343 .\" ANSI C (i.e., C99):
 344 .\" (Note that long long is not an
 345 .\" ANSI C
 346 .\" type. Any program using this will not be portable to all
 347 .\" architectures).
 348 .TP
 349 .B q
  350 Equivalent to
 351 .BR L .
 352 This specifier does not exist in ANSI C.
 353 .TP
 354 .B t
 355 As for
 356 .BR h ,
 357 but the next pointer is a pointer to a
 358 .IR ptrdiff_t .
 359 This modifier was introduced in C99.
 360 .TP
 361 .B z
 362 As for
 363 .BR h ,
 364 but the next pointer is a pointer to a
 365 .IR size_t .
 366 This modifier was introduced in C99.
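.PP
As an illustration of how the type modifiers pair with pointer types, the
sketch below reads one value for each of several modifiers.
The structure
.I sample
and the helper
.I parse_sample
are hypothetical; the
.BR hh
and
.BR z
modifiers assume a C99 library.
.PP
.nf
```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical record: each member is read with a different modifier. */
struct sample {
    unsigned char level;   /* %hhu: pointer to unsigned char */
    short offset;          /* %hd:  pointer to short int */
    long count;            /* %ld:  pointer to long int */
    size_t size;           /* %zu:  pointer to size_t */
    double ratio;          /* %lf:  pointer to double, not float */
};

int parse_sample(const char *s, struct sample *out)
{
    return sscanf(s, "%hhu %hd %ld %zu %lf",
                  &out->level, &out->offset, &out->count,
                  &out->size, &out->ratio);
}
```
.fi
.PP
Omitting a modifier here (for example, using
.I %f
with a pointer to
.IR double )
would make the pointer type inappropriate for the conversion.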
 367 .PP
 368 The following
 369 .I "conversion specifiers"
 370 are available:
 371 .TP
 372 .B %
 373 Matches a literal '%'.
 374 That is,
 375 .I %\&%
 376 in the format string matches a
 377 single input '%' character.
 378 No conversion is done, and assignment does not
 379 occur.
 380 .TP
 381 .B d
 382 Matches an optionally signed decimal integer;
 383 the next pointer must be a pointer to
 384 .IR int .
 385 .TP
 386 .B D
 387 Equivalent to
 388 .IR ld ;
 389 this exists only for backwards compatibility.
  390 (Note: this is the case only in libc4.
 391 In libc5 and glibc the
 392 .I %D
 393 is silently ignored, causing old programs to fail mysteriously.)
 394 .TP
 395 .B i
 396 Matches an optionally signed integer; the next pointer must be a pointer to
 397 .IR int .
 398 The integer is read in base 16 if it begins with
 399 .I 0x
 400 or
 401 .IR 0X ,
 402 in base 8 if it begins with
 403 .IR 0 ,
 404 and in base 10 otherwise.
 405 Only characters that correspond to the base are used.
 406 .TP
 407 .B o
 408 Matches an unsigned octal integer; the next pointer must be a pointer to
 409 .IR "unsigned int" .
 410 .TP
 411 .B u
 412 Matches an unsigned decimal integer; the next pointer must be a
 413 pointer to
 414 .IR "unsigned int" .
 415 .TP
 416 .B x
 417 Matches an unsigned hexadecimal integer; the next pointer must
 418 be a pointer to
 419 .IR "unsigned int" .
 420 .TP
 421 .B X
 422 Equivalent to
 423 .BR x .
 424 .TP
 425 .B f
 426 Matches an optionally signed floating-point number; the next pointer must
 427 be a pointer to
 428 .IR float .
 429 .TP
 430 .B e
 431 Equivalent to
 432 .BR f .
 433 .TP
 434 .B g
 435 Equivalent to
 436 .BR f .
 437 .TP
 438 .B E
 439 Equivalent to
 440 .BR f .
 441 .TP
 442 .B a
 443 (C99) Equivalent to
 444 .BR f .
 445 .TP
 446 .B s
 447 Matches a sequence of non-white-space characters;
  448 the next pointer must be a pointer to a character array that is
 449 long enough to hold the input sequence and the terminating null
 450 character ('\\0'), which is added automatically.
 451 The input string stops at white space or at the maximum field
 452 width, whichever occurs first.
 453 .TP
 454 .B c
 455 Matches a sequence of characters whose length is specified by the
 456 .I maximum field width
 457 (default 1); the next pointer must be a pointer to
 458 .IR char ,
 459 and there must be enough room for all the characters (no terminating
 460 null byte
 461 is added).
 462 The usual skip of leading white space is suppressed.
 463 To skip white space first, use an explicit space in the format.
 464 .TP
 465 .B \&[
 466 Matches a nonempty sequence of characters from the specified set of
 467 accepted characters; the next pointer must be a pointer to
 468 .IR char ,
 469 and there must be enough room for all the characters in the string, plus a
 470 terminating null byte.
 471 The usual skip of leading white space is suppressed.
 472 The string is to be made up of characters in (or not in) a particular set;
 473 the set is defined by the characters between the open bracket
 474 .B [
 475 character and a close bracket
 476 .B ]
 477 character.
 478 The set
 479 .I excludes
 480 those characters if the first character after the open bracket is a
 481 circumflex
 482 .RB ( ^ ).
 483 To include a close bracket in the set, make it the first character after
 484 the open bracket or the circumflex; any other position will end the set.
 485 The hyphen character
 486 .B \-
 487 is also special; when placed between two other characters, it adds all
 488 intervening characters to the set.
 489 To include a hyphen, make it the last
 490 character before the final close bracket.
 491 For instance,
 492 .B [^]0\-9\-]
 493 means
 494 the set "everything except close bracket, zero through nine, and hyphen".
 495 The string ends with the appearance of a character not in the (or, with a
 496 circumflex, in) set or when the field width runs out.
 497 .TP
 498 .B p
 499 Matches a pointer value (as printed by
 500 .I %p
 501 in
  502 .BR printf (3));
 503 the next pointer must be a pointer to a pointer to
 504 .IR void .
 505 .TP
 506 .B n
 507 Nothing is expected; instead, the number of characters consumed thus far
 508 from the input is stored through the next pointer, which must be a pointer
 509 to
 510 .IR int .
 511 This is
 512 .I not
 513 a conversion, although it can be suppressed with the
 514 .B *
 515 assignment-suppression character.
 516 The C standard says: "Execution of a
 517 .I %n
 518 directive does not increment
 519 the assignment count returned at the completion of execution"
 520 but the Corrigendum seems to contradict this.
 521 Probably it is wise
 522 not to make any assumptions on the effect of
 523 .I %n
 524 conversions on the return value.
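.PP
A few of the conversions above can be sketched as follows.
The helper names
.I parse_any_base
and
.I parse_name
are hypothetical, and the colon-separated input is only an assumed layout.
In line with the caution about
.IR %n ,
the second helper is meant to be checked through the stored count rather
than through the return value.
.PP
.nf
```c
#include <stdio.h>

/* Hypothetical helper: %i selects base 16, 8 or 10 from the prefix. */
int parse_any_base(const char *s, int *value)
{
    return sscanf(s, "%i", value);
}

/* Hypothetical helper: read the text before the first colon. */
int parse_name(const char *s, char name[16], int *consumed)
{
    /* Start from zero so the caller can tell whether %n was reached. */
    *consumed = 0;

    /* %15[^:] accepts up to 15 characters that are not a colon and
       adds a terminating null byte.  The ':' must then match.  %n
       stores the number of input characters consumed so far. */
    return sscanf(s, "%15[^:]:%n", name, consumed);
}
```
.fi
.PP
If the colon is missing from the input, processing stops before
.I %n
is executed, so
.I *consumed
remains zero.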
 525 .SH "RETURN VALUE"
 526 These functions return the number of input items
 527 successfully matched and assigned,
 528 which can be fewer than provided for,
 529 or even zero in the event of an early matching failure.
 530
 531 The value
 532 .B EOF
 533 is returned if the end of input is reached before either the first
 534 successful conversion or a matching failure occurs.
 535 .B EOF
 536 is also returned if a read error occurs,
 537 in which case the error indicator for the stream (see
 538 .BR ferror (3))
 539 is set, and
 540 .I errno
  541 is set to indicate the error.
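.PP
The cases described above can be told apart as in the following sketch.
The enumeration
.I scan_status
and the helper
.I read_point
are hypothetical names used only for illustration.
.PP
.nf
```c
#include <stdio.h>

enum scan_status { SCAN_OK, SCAN_PARTIAL, SCAN_NOMATCH, SCAN_EOF };

/* Hypothetical helper: read "X,Y" and classify the outcome. */
enum scan_status read_point(const char *s, int *x, int *y)
{
    int n = sscanf(s, "%d,%d", x, y);

    if (n == EOF)
        return SCAN_EOF;      /* input ended before any conversion */
    if (n == 0)
        return SCAN_NOMATCH;  /* early matching failure */
    if (n < 2)
        return SCAN_PARTIAL;  /* fewer items than provided for */
    return SCAN_OK;
}
```
.fi
.PP
Comparing the result against the number of items expected, rather than
testing only for nonzero, catches the partial case as well.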
 542 .SH "CONFORMING TO"
 543 The functions
 544 .BR fscanf (),
 545 .BR scanf (),
 546 and
 547 .BR sscanf ()
 548 conform to C89 and C99.
 549 .PP
 550 The
 551 .B q
 552 specifier is the 4.4BSD notation for
 553 .IR "long long" ,
 554 while
 555 .B ll
 556 or the usage of
 557 .B L
 558 in integer conversions is the GNU notation.
 559 .PP
 560 The Linux version of these functions is based on the
 561 .I GNU
 562 .I libio
 563 library.
 564 Take a look at the
 565 .I info
 566 documentation of
 567 .I GNU
 568 .I libc (glibc-1.08)
 569 for a more concise description.
 570 .SH BUGS
 571 All functions are fully C89 conformant, but provide the
 572 additional specifiers
 573 .B q
 574 and
 575 .B a
 576 as well as an additional behavior of the
 577 .B L
 578 and
 579 .B l
 580 specifiers.
 581 The latter may be considered to be a bug, as it changes the
 582 behavior of specifiers defined in C89.
 583 .PP
 584 Some combinations of the type modifiers and conversion
 585 specifiers defined by ANSI C do not make sense
 586 (e.g.
 587 .BR "%Ld" ).
  588 While they may have a well-defined behavior on Linux, this need not
  589 be so on other architectures.
  590 Therefore it is usually better to use
 591 modifiers that are not defined by ANSI C at all, that is, use
 592 .B q
 593 instead of
 594 .B L
 595 in combination with
 596 .B diouxX
 597 conversions or
 598 .BR ll .
 599 .PP
 600 The usage of
 601 .B q
 602 is not the same as on 4.4BSD,
 603 as it may be used in float conversions equivalently to
 604 .BR L .
 605 .SH "SEE ALSO"
 606 .BR getc (3),
 607 .BR printf (3),
 608 .BR setlocale (3),
 609 .BR strtod (3),
 610 .BR strtol (3),
 611 .BR strtoul (3)