manual/stdio.texi

   1 @node I/O on Streams, Low-Level I/O, I/O Overview, Top
   2 @chapter Input/Output on Streams
   3
   4 This chapter describes the functions for creating streams and performing
   5 input and output operations on them.  As discussed in @ref{I/O
   6 Overview}, a stream is a fairly abstract, high-level concept
   7 representing a communications channel to a file, device, or process.
   8
   9 @menu
  10 * Streams::                     About the data type representing a stream.
  11 * Standard Streams::            Streams to the standard input and output
  12                                  devices are created for you.
  13 * Opening Streams::             How to create a stream to talk to a file.
  14 * Closing Streams::             Close a stream when you are finished with it.
  15 * Simple Output::               Unformatted output by characters and lines.
  16 * Character Input::             Unformatted input by characters and words.
  17 * Line Input::                  Reading a line or a record from a stream.
  18 * Unreading::                   Peeking ahead/pushing back input just read.
  19 * Block Input/Output::          Input and output operations on blocks of data.
  20 * Formatted Output::            @code{printf} and related functions.
  21 * Customizing Printf::          You can define new conversion specifiers for
  22                                  @code{printf} and friends.
  23 * Formatted Input::             @code{scanf} and related functions.
  24 * EOF and Errors::              How you can tell if an I/O error happens.
  25 * Binary Streams::              Some systems distinguish between text files
  26                                  and binary files.
  27 * File Positioning::            About random-access streams.
  28 * Portable Positioning::        Random access on peculiar ANSI C systems.
  29 * Stream Buffering::            How to control buffering of streams.
  30 * Other Kinds of Streams::      Streams that do not necessarily correspond
  31                                  to an open file.
  32 @end menu
  33
  34 @node Streams
  35 @section Streams
  36
  37 For historical reasons, the type of the C data structure that represents
  38 a stream is called @code{FILE} rather than ``stream''.  Since most of
  39 the library functions deal with objects of type @code{FILE *}, sometimes
  40 the term @dfn{file pointer} is also used to mean ``stream''.  This leads
  41 to unfortunate confusion over terminology in many books on C.  This
  42 manual, however, is careful to use the terms ``file'' and ``stream''
  43 only in the technical sense.
  44 @cindex file pointer
  45
  46 @pindex stdio.h
  47 The @code{FILE} type is declared in the header file @file{stdio.h}.
  48
  49 @comment stdio.h
  50 @comment ANSI
  51 @deftp {Data Type} FILE
  52 This is the data type used to represent stream objects.  A @code{FILE}
  53 object holds all of the internal state information about the connection
  54 to the associated file, including such things as the file position
  55 indicator and buffering information.  Each stream also has error and
  56 end-of-file status indicators that can be tested with the @code{ferror}
  57 and @code{feof} functions; see @ref{EOF and Errors}.
  58 @end deftp
  59
  60 @code{FILE} objects are allocated and managed internally by the
  61 input/output library functions.  Don't try to create your own objects of
  62 type @code{FILE}; let the library do it.  Your programs should
  63 deal only with pointers to these objects (that is, @code{FILE *} values)
  64 rather than the objects themselves.
   65 @c !!! should say that FILE's have "No user-serviceable parts inside."
  66
  67 @node Standard Streams
  68 @section Standard Streams
  69 @cindex standard streams
  70 @cindex streams, standard
  71
  72 When the @code{main} function of your program is invoked, it already has
  73 three predefined streams open and available for use.  These represent
  74 the ``standard'' input and output channels that have been established
  75 for the process.
  76
  77 These streams are declared in the header file @file{stdio.h}.
  78 @pindex stdio.h
  79
  80 @comment stdio.h
  81 @comment ANSI
  82 @deftypevar {FILE *} stdin
  83 The @dfn{standard input} stream, which is the normal source of input for the
  84 program.
  85 @end deftypevar
  86 @cindex standard input stream
  87
  88 @comment stdio.h
  89 @comment ANSI
  90 @deftypevar {FILE *} stdout
  91 The @dfn{standard output} stream, which is used for normal output from
  92 the program.
  93 @end deftypevar
  94 @cindex standard output stream
  95
  96 @comment stdio.h
  97 @comment ANSI
  98 @deftypevar {FILE *} stderr
  99 The @dfn{standard error} stream, which is used for error messages and
 100 diagnostics issued by the program.
 101 @end deftypevar
 102 @cindex standard error stream
 103
 104 In the GNU system, you can specify what files or processes correspond to
 105 these streams using the pipe and redirection facilities provided by the
 106 shell.  (The primitives shells use to implement these facilities are
 107 described in @ref{File System Interface}.)  Most other operating systems
 108 provide similar mechanisms, but the details of how to use them can vary.
 109
 110 In the GNU C library, @code{stdin}, @code{stdout}, and @code{stderr} are
 111 normal variables which you can set just like any others.  For example, to redirect
 112 the standard output to a file, you could do:
 113
 114 @smallexample
 115 fclose (stdout);
 116 stdout = fopen ("standard-output-file", "w");
 117 @end smallexample
 118
  119 Note, however, that in other systems @code{stdin}, @code{stdout}, and
 120 @code{stderr} are macros that you cannot assign to in the normal way.
 121 But you can use @code{freopen} to get the effect of closing one and
 122 reopening it.  @xref{Opening Streams}.
 123
 124 @node Opening Streams
 125 @section Opening Streams
 126
 127 @cindex opening a stream
 128 Opening a file with the @code{fopen} function creates a new stream and
 129 establishes a connection between the stream and a file.  This may
 130 involve creating a new file.
 131
 132 @pindex stdio.h
 133 Everything described in this section is declared in the header file
 134 @file{stdio.h}.
 135
 136 @comment stdio.h
 137 @comment ANSI
 138 @deftypefun {FILE *} fopen (const char *@var{filename}, const char *@var{opentype})
 139 The @code{fopen} function opens a stream for I/O to the file
 140 @var{filename}, and returns a pointer to the stream.
 141
 142 The @var{opentype} argument is a string that controls how the file is
 143 opened and specifies attributes of the resulting stream.  It must begin
 144 with one of the following sequences of characters:
 145
 146 @table @samp
 147 @item r
 148 Open an existing file for reading only.
 149
 150 @item w
 151 Open the file for writing only.  If the file already exists, it is
 152 truncated to zero length.  Otherwise a new file is created.
 153
 154 @item a
 155 Open a file for append access; that is, writing at the end of file only.
 156 If the file already exists, its initial contents are unchanged and
 157 output to the stream is appended to the end of the file.
 158 Otherwise, a new, empty file is created.
 159
 160 @item r+
 161 Open an existing file for both reading and writing.  The initial contents
 162 of the file are unchanged and the initial file position is at the
 163 beginning of the file.
 164
 165 @item w+
 166 Open a file for both reading and writing.  If the file already exists, it
 167 is truncated to zero length.  Otherwise, a new file is created.
 168
 169 @item a+
  170 Open or create a file for both reading and appending.  If the file exists,
 171 its initial contents are unchanged.  Otherwise, a new file is created.
 172 The initial file position for reading is at the beginning of the file,
 173 but output is always appended to the end of the file.
 174 @end table
 175
 176 As you can see, @samp{+} requests a stream that can do both input and
 177 output.  The ANSI standard says that when using such a stream, you must
 178 call @code{fflush} (@pxref{Stream Buffering}) or a file positioning
 179 function such as @code{fseek} (@pxref{File Positioning}) when switching
 180 from reading to writing or vice versa.  Otherwise, internal buffers
 181 might not be emptied properly.  The GNU C library does not have this
 182 limitation; you can do arbitrary reading and writing operations on a
 183 stream in whatever order.
 184
 185 Additional characters may appear after these to specify flags for the
 186 call.  Always put the mode (@samp{r}, @samp{w+}, etc.) first; that is
 187 the only part you are guaranteed will be understood by all systems.
 188
 189 The GNU C library defines one additional character for use in
 190 @var{opentype}: the character @samp{x} insists on creating a new
 191 file---if a file @var{filename} already exists, @code{fopen} fails
  192 rather than opening it.  If you use @samp{x} you are guaranteed that
 193 you will not clobber an existing file.  This is equivalent to the
 194 @code{O_EXCL} option to the @code{open} function (@pxref{Opening and
 195 Closing Files}).
 196
 197 The character @samp{b} in @var{opentype} has a standard meaning; it
 198 requests a binary stream rather than a text stream.  But this makes no
 199 difference in POSIX systems (including the GNU system).  If both
 200 @samp{+} and @samp{b} are specified, they can appear in either order.
 201 @xref{Binary Streams}.
 202
 203 Any other characters in @var{opentype} are simply ignored.  They may be
 204 meaningful in other systems.
 205
 206 If the open fails, @code{fopen} returns a null pointer.
 207 @end deftypefun
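
The @samp{x} flag described above can be used as in the following
minimal sketch.  The helper name @code{create_new_file} is hypothetical
and chosen only for illustration; the sketch assumes a library that
honors @samp{x} (the GNU C library does), since other systems may ignore
the flag.

```c
#include <stdio.h>

/* Hypothetical helper: create FILENAME for writing, but only if it
   does not already exist.  Returns the new stream, or a null pointer
   if the file exists or cannot be created.  */
FILE *
create_new_file (const char *filename)
{
  /* The mode comes first ("w"), then the flag ("x").  */
  return fopen (filename, "wx");
}
```

Because the existence check and the creation happen in one call, a
second call with the same name fails instead of truncating the file.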
 208
 209 You can have multiple streams (or file descriptors) pointing to the same
 210 file open at the same time.  If you do only input, this works
 211 straightforwardly, but you must be careful if any output streams are
 212 included.  @xref{Stream/Descriptor Precautions}.  This is equally true
 213 whether the streams are in one program (not usual) or in several
 214 programs (which can easily happen).  It may be advantageous to use the
 215 file locking facilities to avoid simultaneous access.  @xref{File
 216 Locks}.
 217
 218 @comment stdio.h
 219 @comment ANSI
 220 @deftypevr Macro int FOPEN_MAX
 221 The value of this macro is an integer constant expression that
 222 represents the minimum number of streams that the implementation
 223 guarantees can be open simultaneously.  You might be able to open more
 224 than this many streams, but that is not guaranteed.  The value of this
 225 constant is at least eight, which includes the three standard streams
 226 @code{stdin}, @code{stdout}, and @code{stderr}.  In POSIX.1 systems this
 227 value is determined by the @code{OPEN_MAX} parameter; @pxref{General
 228 Limits}.  In BSD and GNU, it is controlled by the @code{RLIMIT_NOFILE}
 229 resource limit; @pxref{Limits on Resources}.
 230 @end deftypevr
 231
 232 @comment stdio.h
 233 @comment ANSI
 234 @deftypefun {FILE *} freopen (const char *@var{filename}, const char *@var{opentype}, FILE *@var{stream})
 235 This function is like a combination of @code{fclose} and @code{fopen}.
 236 It first closes the stream referred to by @var{stream}, ignoring any
 237 errors that are detected in the process.  (Because errors are ignored,
 238 you should not use @code{freopen} on an output stream if you have
 239 actually done any output using the stream.)  Then the file named by
 240 @var{filename} is opened with mode @var{opentype} as for @code{fopen},
 241 and associated with the same stream object @var{stream}.
 242
 243 If the operation fails, a null pointer is returned; otherwise,
 244 @code{freopen} returns @var{stream}.
 245
 246 @code{freopen} has traditionally been used to connect a standard stream
 247 such as @code{stdin} with a file of your own choice.  This is useful in
 248 programs in which use of a standard stream for certain purposes is
 249 hard-coded.  In the GNU C library, you can simply close the standard
 250 streams and open new ones with @code{fopen}.  But other systems lack
 251 this ability, so using @code{freopen} is more portable.
 252 @end deftypefun
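
Here is a short sketch of the portable approach.  The helper name
@code{redirect_output} is hypothetical.  It flushes the stream before
calling @code{freopen}, so that a write error is reported rather than
ignored while the old file is being closed.

```c
#include <stdio.h>

/* Hypothetical helper: make STREAM write to FILENAME from now on.
   Returns 0 on success and -1 on failure.  */
int
redirect_output (FILE *stream, const char *filename)
{
  /* Flush first, because freopen ignores errors while closing.  */
  if (fflush (stream) == EOF)
    return -1;
  /* On failure the original stream is closed and must not be used.  */
  if (freopen (filename, "w", stream) == NULL)
    return -1;
  return 0;
}
```

For example, @code{redirect_output (stdout, "standard-output-file")}
sends later output on @code{stdout} to that file.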
 253
 254
 255 @node Closing Streams
 256 @section Closing Streams
 257
 258 @cindex closing a stream
 259 When a stream is closed with @code{fclose}, the connection between the
 260 stream and the file is cancelled.  After you have closed a stream, you
 261 cannot perform any additional operations on it.
 262
 263 @comment stdio.h
 264 @comment ANSI
 265 @deftypefun int fclose (FILE *@var{stream})
 266 This function causes @var{stream} to be closed and the connection to
 267 the corresponding file to be broken.  Any buffered output is written
 268 and any buffered input is discarded.  The @code{fclose} function returns
 269 a value of @code{0} if the file was closed successfully, and @code{EOF}
 270 if an error was detected.
 271
 272 It is important to check for errors when you call @code{fclose} to close
 273 an output stream, because real, everyday errors can be detected at this
 274 time.  For example, when @code{fclose} writes the remaining buffered
 275 output, it might get an error because the disk is full.  Even if you
 276 know the buffer is empty, errors can still occur when closing a file if
 277 you are using NFS.
 278
 279 The function @code{fclose} is declared in @file{stdio.h}.
 280 @end deftypefun
 281
  282 If the @code{main} function of your program returns, or if you call the
 283 @code{exit} function (@pxref{Normal Termination}), all open streams are
 284 automatically closed properly.  If your program terminates in any other
 285 manner, such as by calling the @code{abort} function (@pxref{Aborting a
 286 Program}) or from a fatal signal (@pxref{Signal Handling}), open streams
 287 might not be closed properly.  Buffered output might not be flushed and
 288 files may be incomplete.  For more information on buffering of streams,
 289 see @ref{Stream Buffering}.
 290
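The advice about checking the result of @code{fclose} can be followed
as in this sketch.  The function name @code{save_text} and its return
convention are assumptions made for the example.

```c
#include <stdio.h>

/* Hypothetical helper: write TEXT to FILENAME.  Returns 0 if every
   step succeeded, including the final close, and -1 otherwise.  */
int
save_text (const char *filename, const char *text)
{
  FILE *stream = fopen (filename, "w");
  if (stream == NULL)
    return -1;

  int failed = (fputs (text, stream) == EOF);

  /* Close the stream even if the write failed, and count a failed
     close as a failed save: buffered output may be written only now.  */
  if (fclose (stream) == EOF)
    failed = 1;

  return failed ? -1 : 0;
}
```
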
 291 @node Simple Output
 292 @section Simple Output by Characters or Lines
 293
 294 @cindex writing to a stream, by characters
 295 This section describes functions for performing character- and
 296 line-oriented output.
 297
 298 These functions are declared in the header file @file{stdio.h}.
 299 @pindex stdio.h
 300
 301 @comment stdio.h
 302 @comment ANSI
 303 @deftypefun int fputc (int @var{c}, FILE *@var{stream})
 304 The @code{fputc} function converts the character @var{c} to type
 305 @code{unsigned char}, and writes it to the stream @var{stream}.
 306 @code{EOF} is returned if a write error occurs; otherwise the
 307 character @var{c} is returned.
 308 @end deftypefun
 309
 310 @comment stdio.h
 311 @comment ANSI
 312 @deftypefun int putc (int @var{c}, FILE *@var{stream})
 313 This is just like @code{fputc}, except that most systems implement it as
 314 a macro, making it faster.  One consequence is that it may evaluate the
 315 @var{stream} argument more than once, which is an exception to the
 316 general rule for macros.  @code{putc} is usually the best function to
 317 use for writing a single character.
 318 @end deftypefun
 319
 320 @comment stdio.h
 321 @comment ANSI
 322 @deftypefun int putchar (int @var{c})
 323 The @code{putchar} function is equivalent to @code{putc} with
 324 @code{stdout} as the value of the @var{stream} argument.
 325 @end deftypefun
 326
 327 @comment stdio.h
 328 @comment ANSI
 329 @deftypefun int fputs (const char *@var{s}, FILE *@var{stream})
 330 The function @code{fputs} writes the string @var{s} to the stream
 331 @var{stream}.  The terminating null character is not written.
 332 This function does @emph{not} add a newline character, either.
 333 It outputs only the characters in the string.
 334
 335 This function returns @code{EOF} if a write error occurs, and otherwise
 336 a non-negative value.
 337
 338 For example:
 339
 340 @smallexample
 341 fputs ("Are ", stdout);
 342 fputs ("you ", stdout);
 343 fputs ("hungry?\n", stdout);
 344 @end smallexample
 345
 346 @noindent
 347 outputs the text @samp{Are you hungry?} followed by a newline.
 348 @end deftypefun
 349
 350 @comment stdio.h
 351 @comment ANSI
 352 @deftypefun int puts (const char *@var{s})
 353 The @code{puts} function writes the string @var{s} to the stream
 354 @code{stdout} followed by a newline.  The terminating null character of
 355 the string is not written.  (Note that @code{fputs} does @emph{not}
 356 write a newline as this function does.)
 357
 358 @code{puts} is the most convenient function for printing simple
 359 messages.  For example:
 360
 361 @smallexample
 362 puts ("This is a message.");
 363 @end smallexample
 364 @end deftypefun
 365
 366 @comment stdio.h
 367 @comment SVID
 368 @deftypefun int putw (int @var{w}, FILE *@var{stream})
 369 This function writes the word @var{w} (that is, an @code{int}) to
 370 @var{stream}.  It is provided for compatibility with SVID, but we
 371 recommend you use @code{fwrite} instead (@pxref{Block Input/Output}).
 372 @end deftypefun
 373
 374 @node Character Input
 375 @section Character Input
 376
 377 @cindex reading from a stream, by characters
 378 This section describes functions for performing character-oriented input.
 379 These functions are declared in the header file @file{stdio.h}.
 380 @pindex stdio.h
 381
 382 These functions return an @code{int} value that is either a character of
 383 input, or the special value @code{EOF} (usually -1).  It is important to
 384 store the result of these functions in a variable of type @code{int}
 385 instead of @code{char}, even when you plan to use it only as a
 386 character.  Storing @code{EOF} in a @code{char} variable truncates its
 387 value to the size of a character, so that it is no longer
 388 distinguishable from the valid character @samp{(char) -1}.  So always
 389 use an @code{int} for the result of @code{getc} and friends, and check
 390 for @code{EOF} after the call; once you've verified that the result is
 391 not @code{EOF}, you can be sure that it will fit in a @samp{char}
 392 variable without loss of information.
 393
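The following sketch shows the recommended pattern.  The function name
@code{count_bytes} is hypothetical.  Because the result of @code{getc}
is kept in an @code{int}, a data byte with value 255 is counted like
any other instead of being mistaken for @code{EOF}.

```c
#include <stdio.h>

/* Hypothetical helper: count the characters remaining in STREAM.  */
long
count_bytes (FILE *stream)
{
  long count = 0;
  int c;                        /* int, not char, so EOF stays distinct.  */

  while ((c = getc (stream)) != EOF)
    count++;
  return count;
}
```
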
 394 @comment stdio.h
 395 @comment ANSI
 396 @deftypefun int fgetc (FILE *@var{stream})
 397 This function reads the next character as an @code{unsigned char} from
 398 the stream @var{stream} and returns its value, converted to an
 399 @code{int}.  If an end-of-file condition or read error occurs,
 400 @code{EOF} is returned instead.
 401 @end deftypefun
 402
 403 @comment stdio.h
 404 @comment ANSI
 405 @deftypefun int getc (FILE *@var{stream})
 406 This is just like @code{fgetc}, except that it is permissible (and
 407 typical) for it to be implemented as a macro that evaluates the
 408 @var{stream} argument more than once.  @code{getc} is often highly
 409 optimized, so it is usually the best function to use to read a single
 410 character.
 411 @end deftypefun
 412
 413 @comment stdio.h
 414 @comment ANSI
 415 @deftypefun int getchar (void)
 416 The @code{getchar} function is equivalent to @code{getc} with @code{stdin}
 417 as the value of the @var{stream} argument.
 418 @end deftypefun
 419
 420 Here is an example of a function that does input using @code{fgetc}.  It
 421 would work just as well using @code{getc} instead, or using
 422 @code{getchar ()} instead of @w{@code{fgetc (stdin)}}.
 423
 424 @smallexample
 425 int
 426 y_or_n_p (const char *question)
 427 @{
 428   fputs (question, stdout);
 429   while (1)
 430     @{
 431       int c, answer;
 432       /* @r{Write a space to separate answer from question.} */
 433       fputc (' ', stdout);
 434       /* @r{Read the first character of the line.}
 435          @r{This should be the answer character, but might not be.} */
 436       c = tolower (fgetc (stdin));
 437       answer = c;
 438       /* @r{Discard rest of input line.} */
 439       while (c != '\n' && c != EOF)
 440         c = fgetc (stdin);
  441       /* @r{Obey the answer if it was valid; treat end of file as ``no''.} */
  442       if (answer == 'y')
  443         return 1;
  444       if (answer == 'n' || answer == EOF)
 445         return 0;
 446       /* @r{Answer was invalid: ask for valid answer.} */
 447       fputs ("Please answer y or n:", stdout);
 448     @}
 449 @}
 450 @end smallexample
 451
 452 @comment stdio.h
 453 @comment SVID
 454 @deftypefun int getw (FILE *@var{stream})
 455 This function reads a word (that is, an @code{int}) from @var{stream}.
 456 It's provided for compatibility with SVID.  We recommend you use
 457 @code{fread} instead (@pxref{Block Input/Output}).  Unlike @code{getc},
 458 any @code{int} value could be a valid result.  @code{getw} returns
 459 @code{EOF} when it encounters end-of-file or an error, but there is no
 460 way to distinguish this from an input word with value -1.
 461 @end deftypefun
 462
 463 @node Line Input
 464 @section Line-Oriented Input
 465
 466 Since many programs interpret input on the basis of lines, it's
 467 convenient to have functions to read a line of text from a stream.
 468
 469 Standard C has functions to do this, but they aren't very safe: null
 470 characters and even (for @code{gets}) long lines can confuse them.  So
 471 the GNU library provides the nonstandard @code{getline} function that
 472 makes it easy to read lines reliably.
 473
 474 Another GNU extension, @code{getdelim}, generalizes @code{getline}.  It
 475 reads a delimited record, defined as everything through the next
 476 occurrence of a specified delimiter character.
 477
 478 All these functions are declared in @file{stdio.h}.
 479
 480 @comment stdio.h
 481 @comment GNU
 482 @deftypefun ssize_t getline (char **@var{lineptr}, size_t *@var{n}, FILE *@var{stream})
 483 This function reads an entire line from @var{stream}, storing the text
 484 (including the newline and a terminating null character) in a buffer
 485 and storing the buffer address in @code{*@var{lineptr}}.
 486
 487 Before calling @code{getline}, you should place in @code{*@var{lineptr}}
 488 the address of a buffer @code{*@var{n}} bytes long, allocated with
 489 @code{malloc}.  If this buffer is long enough to hold the line,
 490 @code{getline} stores the line in this buffer.  Otherwise,
 491 @code{getline} makes the buffer bigger using @code{realloc}, storing the
 492 new buffer address back in @code{*@var{lineptr}} and the increased size
 493 back in @code{*@var{n}}.
 494 @xref{Unconstrained Allocation}.
 495
 496 If you set @code{*@var{lineptr}} to a null pointer, and @code{*@var{n}}
 497 to zero, before the call, then @code{getline} allocates the initial
 498 buffer for you by calling @code{malloc}.
 499
 500 In either case, when @code{getline} returns,  @code{*@var{lineptr}} is
 501 a @code{char *} which points to the text of the line.
 502
 503 When @code{getline} is successful, it returns the number of characters
 504 read (including the newline, but not including the terminating null).
 505 This value enables you to distinguish null characters that are part of
 506 the line from the null character inserted as a terminator.
 507
 508 This function is a GNU extension, but it is the recommended way to read
 509 lines from a stream.  The alternative standard functions are unreliable.
 510
 511 If an error occurs or end of file is reached, @code{getline} returns
 512 @code{-1}.
 513 @end deftypefun
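
As an illustration of the calling convention described above, here is
a minimal sketch that lets @code{getline} allocate and grow the buffer.
The function name @code{longest_line} is hypothetical, and the
@code{_GNU_SOURCE} definition is an assumption about how the
declaration is requested from the headers.

```c
#define _GNU_SOURCE             /* Ask the headers to declare getline.  */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: return the length of the longest line in
   STREAM, counting the newline, or 0 if there is no input.  */
size_t
longest_line (FILE *stream)
{
  char *line = NULL;            /* Null pointer and zero size: getline  */
  size_t size = 0;              /* allocates the initial buffer itself.  */
  size_t longest = 0;
  ssize_t length;

  while ((length = getline (&line, &size, stream)) != -1)
    if ((size_t) length > longest)
      longest = (size_t) length;

  free (line);                  /* The caller owns the buffer.  */
  return longest;
}
```

The same buffer is reused for every line, so it is freed only once,
after the loop.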
 514
 515 @comment stdio.h
 516 @comment GNU
 517 @deftypefun ssize_t getdelim (char **@var{lineptr}, size_t *@var{n}, int @var{delimiter}, FILE *@var{stream})
 518 This function is like @code{getline} except that the character which
 519 tells it to stop reading is not necessarily newline.  The argument
 520 @var{delimiter} specifies the delimiter character; @code{getdelim} keeps
 521 reading until it sees that character (or end of file).
 522
  523 The text is stored in @code{*@var{lineptr}}, including the delimiter
  524 character and a terminating null.  Like @code{getline}, @code{getdelim}
  525 makes the buffer bigger if it isn't big enough.
 526
 527 @code{getline} is in fact implemented in terms of @code{getdelim}, just
 528 like this:
 529
 530 @smallexample
 531 ssize_t
 532 getline (char **lineptr, size_t *n, FILE *stream)
 533 @{
 534   return getdelim (lineptr, n, '\n', stream);
 535 @}
 536 @end smallexample
 537 @end deftypefun
 538
 539 @comment stdio.h
 540 @comment ANSI
 541 @deftypefun {char *} fgets (char *@var{s}, int @var{count}, FILE *@var{stream})
 542 The @code{fgets} function reads characters from the stream @var{stream}
 543 up to and including a newline character and stores them in the string
 544 @var{s}, adding a null character to mark the end of the string.  You
 545 must supply @var{count} characters worth of space in @var{s}, but the
 546 number of characters read is at most @var{count} @minus{} 1.  The extra
 547 character space is used to hold the null character at the end of the
 548 string.
 549
  550 If the stream is already at end of file when you call @code{fgets}, then
 551 the contents of the array @var{s} are unchanged and a null pointer is
 552 returned.  A null pointer is also returned if a read error occurs.
 553 Otherwise, the return value is the pointer @var{s}.
 554
 555 @strong{Warning:}  If the input data has a null character, you can't tell.
 556 So don't use @code{fgets} unless you know the data cannot contain a null.
 557 Don't use it to read files edited by the user because, if the user inserts
 558 a null character, you should either handle it properly or print a clear
 559 error message.  We recommend using @code{getline} instead of @code{fgets}.
 560 @end deftypefun
 561
 562 @comment stdio.h
 563 @comment ANSI
 564 @deftypefn {Deprecated function} {char *} gets (char *@var{s})
 565 The function @code{gets} reads characters from the stream @code{stdin}
 566 up to the next newline character, and stores them in the string @var{s}.
 567 The newline character is discarded (note that this differs from the
 568 behavior of @code{fgets}, which copies the newline character into the
 569 string).  If @code{gets} encounters a read error or end-of-file, it
 570 returns a null pointer; otherwise it returns @var{s}.
 571
 572 @strong{Warning:} The @code{gets} function is @strong{very dangerous}
 573 because it provides no protection against overflowing the string
 574 @var{s}.  The GNU library includes it for compatibility only.  You
 575 should @strong{always} use @code{fgets} or @code{getline} instead.  To
 576 remind you of this, the linker (if using GNU @code{ld}) will issue a
 577 warning whenever you use @code{gets}.
 578 @end deftypefn
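
A call to @code{gets} can often be replaced along the following lines.
This is only a sketch: the name @code{read_line} is hypothetical, and
it inherits the limitation of @code{fgets} regarding null characters in
the input.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line from STREAM into S, which has
   room for COUNT characters, and drop the newline as gets does.
   Returns S, or a null pointer on end of file or error.  */
char *
read_line (char *s, int count, FILE *stream)
{
  if (fgets (s, count, stream) == NULL)
    return NULL;
  /* Remove the newline, if the whole line fit in the buffer.  */
  s[strcspn (s, "\n")] = '\0';
  return s;
}
```

A line longer than the buffer is returned in pieces by successive calls
instead of overflowing @var{s}.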
 579
 580 @node Unreading
 581 @section Unreading
 582 @cindex peeking at input
 583 @cindex unreading characters
 584 @cindex pushing input back
 585
 586 In parser programs it is often useful to examine the next character in
 587 the input stream without removing it from the stream.  This is called
 588 ``peeking ahead'' at the input because your program gets a glimpse of
 589 the input it will read next.
 590
 591 Using stream I/O, you can peek ahead at input by first reading it and
  592 then @dfn{unreading} it (also called @dfn{pushing it back} on the stream).
  593 Unreading a character makes it available to be input again from the stream,
  594 by the next call to @code{fgetc} or other input function on that stream.
 595
 596 @menu
 597 * Unreading Idea::              An explanation of unreading with pictures.
 598 * How Unread::                  How to call @code{ungetc} to do unreading.
 599 @end menu
 600
 601 @node Unreading Idea
 602 @subsection What Unreading Means
 603
 604 Here is a pictorial explanation of unreading.  Suppose you have a
 605 stream reading a file that contains just six characters, the letters
 606 @samp{foobar}.  Suppose you have read three characters so far.  The
 607 situation looks like this:
 608
 609 @smallexample
 610 f  o  o  b  a  r
 611          ^
 612 @end smallexample
 613
 614 @noindent
 615 so the next input character will be @samp{b}.
 616
 617 @c @group   Invalid outside @example
 618 If instead of reading @samp{b} you unread the letter @samp{o}, you get a
 619 situation like this:
 620
 621 @smallexample
 622 f  o  o  b  a  r
 623          |
 624       o--
 625       ^
 626 @end smallexample
 627
 628 @noindent
 629 so that the next input characters will be @samp{o} and @samp{b}.
 630 @c @end group
 631
 632 @c @group
 633 If you unread @samp{9} instead of @samp{o}, you get this situation:
 634
 635 @smallexample
 636 f  o  o  b  a  r
 637          |
 638       9--
 639       ^
 640 @end smallexample
 641
 642 @noindent
 643 so that the next input characters will be @samp{9} and @samp{b}.
 644 @c @end group
 645
 646 @node How Unread
 647 @subsection Using @code{ungetc} To Do Unreading
 648
 649 The function to unread a character is called @code{ungetc}, because it
 650 reverses the action of @code{getc}.
 651
 652 @comment stdio.h
 653 @comment ANSI
 654 @deftypefun int ungetc (int @var{c}, FILE *@var{stream})
 655 The @code{ungetc} function pushes back the character @var{c} onto the
 656 input stream @var{stream}.  So the next input from @var{stream} will
 657 read @var{c} before anything else.
 658
 659 If @var{c} is @code{EOF}, @code{ungetc} does nothing and just returns
 660 @code{EOF}.  This lets you call @code{ungetc} with the return value of
 661 @code{getc} without needing to check for an error from @code{getc}.
 662
 663 The character that you push back doesn't have to be the same as the last
 664 character that was actually read from the stream.  In fact, it isn't
 665 necessary to actually read any characters from the stream before
 666 unreading them with @code{ungetc}!  But that is a strange way to write
 667 a program; usually @code{ungetc} is used only to unread a character
 668 that was just read from the same stream.
 669
 670 The GNU C library only supports one character of pushback---in other
 671 words, it does not work to call @code{ungetc} twice without doing input
 672 in between.  Other systems might let you push back multiple characters;
  673 then reading from the stream retrieves the characters in the reverse of
  674 the order in which they were pushed.
 675
 676 Pushing back characters doesn't alter the file; only the internal
 677 buffering for the stream is affected.  If a file positioning function
 678 (such as @code{fseek} or @code{rewind}; @pxref{File Positioning}) is
 679 called, any pending pushed-back characters are discarded.
 680
 681 Unreading a character on a stream that is at end of file clears the
 682 end-of-file indicator for the stream, because it makes the character of
 683 input available.  After you read that character, trying to read again
 684 will encounter end of file.
 685 @end deftypefun
 686
 687 Here is an example showing the use of @code{getc} and @code{ungetc} to
 688 skip over whitespace characters.  When this function reaches a
 689 non-whitespace character, it unreads that character to be seen again on
 690 the next read operation on the stream.
 691
 692 @smallexample
 693 #include <stdio.h>
 694 #include <ctype.h>
 695
 696 void
 697 skip_whitespace (FILE *stream)
 698 @{
 699   int c;
 700   do
 701     /* @r{No need to check for @code{EOF} because it is not}
 702        @r{@code{isspace}, and @code{ungetc} ignores @code{EOF}.}  */
 703     c = getc (stream);
 704   while (isspace (c));
 705   ungetc (c, stream);
 706 @}
 707 @end smallexample
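
The same two calls can be combined into a ``peek'' operation, as in
this sketch.  The name @code{peek_char} is hypothetical.  It relies on
only one character of pushback, so it stays within the limit described
above.

```c
#include <stdio.h>

/* Hypothetical helper: return the next character of STREAM without
   consuming it, or EOF if there is none.  */
int
peek_char (FILE *stream)
{
  int c = getc (stream);
  /* ungetc ignores EOF, so no separate check is needed here.  */
  ungetc (c, stream);
  return c;
}
```
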
 708
 709 @node Block Input/Output
 710 @section Block Input/Output
 711
 712 This section describes how to do input and output operations on blocks
 713 of data.  You can use these functions to read and write binary data, as
 714 well as to read and write text in fixed-size blocks instead of by
 715 characters or lines.
 716 @cindex binary I/O to a stream
 717 @cindex block I/O to a stream
 718 @cindex reading from a stream, by blocks
 719 @cindex writing to a stream, by blocks
 720
 721 Binary files are typically used to read and write blocks of data in the
 722 same format as is used to represent the data in a running program.  In
 723 other words, arbitrary blocks of memory---not just character or string
 724 objects---can be written to a binary file, and meaningfully read in
 725 again by the same program.
 726
 727 Storing data in binary form is often considerably more efficient than
 728 using the formatted I/O functions.  Also, for floating-point numbers,
 729 the binary form avoids possible loss of precision in the conversion
 730 process.  On the other hand, binary files can't be examined or modified
 731 easily using many standard file utilities (such as text editors), and
 732 are not portable between different implementations of the language, or
 733 different kinds of computers.
 734
 735 These functions are declared in @file{stdio.h}.
 736 @pindex stdio.h
 737
 738 @comment stdio.h
 739 @comment ANSI
 740 @deftypefun size_t fread (void *@var{data}, size_t @var{size}, size_t @var{count}, FILE *@var{stream})
 741 This function reads up to @var{count} objects of size @var{size} into
 742 the array @var{data}, from the stream @var{stream}.  It returns the
 743 number of objects actually read, which might be less than @var{count} if
 744 a read error occurs or the end of the file is reached.  This function
 745 returns a value of zero (and doesn't read anything) if either @var{size}
 746 or @var{count} is zero.
 747
 748 If @code{fread} encounters end of file in the middle of an object, it
 749 returns the number of complete objects read, and discards the partial
 750 object.  Therefore, the stream remains at the actual end of the file.
 751 @end deftypefun
 752
 753 @comment stdio.h
 754 @comment ANSI
 755 @deftypefun size_t fwrite (const void *@var{data}, size_t @var{size}, size_t @var{count}, FILE *@var{stream})
 756 This function writes up to @var{count} objects of size @var{size} from
 757 the array @var{data}, to the stream @var{stream}.  The return value is
 758 normally @var{count}, if the call succeeds.  Any other value indicates
 759 some sort of error, such as running out of space.
 760 @end deftypefun
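
As an illustration of how the two functions fit together, here is a
sketch that writes an array of structures and reads it back.  The type
@code{struct point} and the helpers @code{save_points} and
@code{load_points} are hypothetical names chosen for this example, and
the file it produces is only meaningful to a program built for the same
kind of machine.

```c
#include <stdio.h>

/* Hypothetical record type used only for this illustration.  */
struct point
{
  int x;
  double y;
};

/* Write COUNT records to STREAM.  Return 0 on success, or -1 if
   fwrite reported fewer than COUNT objects written.  */
int
save_points (FILE *stream, const struct point *pts, size_t count)
{
  size_t written = fwrite (pts, sizeof (struct point), count, stream);
  return written == count ? 0 : -1;
}

/* Read up to MAX records from STREAM into PTS and return the number
   of complete records read.  A short count means end of file or a
   read error; the caller can use feof or ferror to tell which.  */
size_t
load_points (FILE *stream, struct point *pts, size_t max)
{
  return fread (pts, sizeof (struct point), max, stream);
}
```

Note that both helpers compare or return a count of objects, not a count
of bytes.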
 761
 762 @node Formatted Output
 763 @section Formatted Output
 764
 765 @cindex format string, for @code{printf}
 766 @cindex template, for @code{printf}
 767 @cindex formatted output to a stream
 768 @cindex writing to a stream, formatted
 769 The functions described in this section (@code{printf} and related
 770 functions) provide a convenient way to perform formatted output.  You
 771 call @code{printf} with a @dfn{format string} or @dfn{template string}
 772 that specifies how to format the values of the remaining arguments.
 773
 774 Unless your program is a filter that specifically performs line- or
 775 character-oriented processing, using @code{printf} or one of the other
 776 related functions described in this section is usually the easiest and
 777 most concise way to perform output.  These functions are especially
 778 useful for printing error messages, tables of data, and the like.
 779
 780 @menu
 781 * Formatted Output Basics::     Some examples to get you started.
 782 * Output Conversion Syntax::    General syntax of conversion
 783                                  specifications.
 784 * Table of Output Conversions:: Summary of output conversions and
 785                                  what they do.
 786 * Integer Conversions::         Details about formatting of integers.
 787 * Floating-Point Conversions::  Details about formatting of
 788                                  floating-point numbers.
 789 * Other Output Conversions::    Details about formatting of strings,
 790                                  characters, pointers, and the like.
 791 * Formatted Output Functions::  Descriptions of the actual functions.
 792 * Dynamic Output::              Functions that allocate memory for the output.
 793 * Variable Arguments Output::   @code{vprintf} and friends.
 794 * Parsing a Template String::   What kinds of args does a given template
 795                                  call for?
 796 * Example of Parsing::          Sample program using @code{parse_printf_format}.
 797 @end menu
 798
 799 @node Formatted Output Basics
 800 @subsection Formatted Output Basics
 801
 802 The @code{printf} function can be used to print any number of arguments.
 803 The template string argument you supply in a call provides
 804 information not only about the number of additional arguments, but also
 805 about their types and what style should be used for printing them.
 806
 807 Ordinary characters in the template string are simply written to the
 808 output stream as-is, while @dfn{conversion specifications} introduced by
 809 a @samp{%} character in the template cause subsequent arguments to be
 810 formatted and written to the output stream.  For example,
 811 @cindex conversion specifications (@code{printf})
 812
 813 @smallexample
 814 int pct = 37;
 815 char filename[] = "foo.txt";
 816 printf ("Processing of `%s' is %d%% finished.\nPlease be patient.\n",
 817         filename, pct);
 818 @end smallexample
 819
 820 @noindent
 821 produces output like
 822
 823 @smallexample
 824 Processing of `foo.txt' is 37% finished.
 825 Please be patient.
 826 @end smallexample
 827
 828 This example shows the use of the @samp{%d} conversion to specify that
 829 an @code{int} argument should be printed in decimal notation, the
 830 @samp{%s} conversion to specify printing of a string argument, and
 831 the @samp{%%} conversion to print a literal @samp{%} character.
 832
 833 There are also conversions for printing an integer argument as an
 834 unsigned value in octal, decimal, or hexadecimal radix (@samp{%o},
 835 @samp{%u}, or @samp{%x}, respectively); or as a character value
 836 (@samp{%c}).
 837
 838 Floating-point numbers can be printed in normal, fixed-point notation
 839 using the @samp{%f} conversion or in exponential notation using the
 840 @samp{%e} conversion.  The @samp{%g} conversion uses either @samp{%e}
 841 or @samp{%f} format, depending on what is more appropriate for the
 842 magnitude of the particular number.
 843
 844 You can control formatting more precisely by writing @dfn{modifiers}
 845 between the @samp{%} and the character that indicates which conversion
 846 to apply.  These slightly alter the ordinary behavior of the conversion.
 847 For example, most conversion specifications permit you to specify a
 848 minimum field width and a flag indicating whether you want the result
 849 left- or right-justified within the field.
 850
 851 The specific flags and modifiers that are permitted and their
 852 interpretation vary depending on the particular conversion.  They're all
 853 described in more detail in the following sections.  Don't worry if this
 854 all seems excessively complicated at first; you can almost always get
 855 reasonable free-format output without using any of the modifiers at all.
 856 The modifiers are mostly used to make the output look ``prettier'' in
 857 tables.
 858
 859 @node Output Conversion Syntax
 860 @subsection Output Conversion Syntax
 861
 862 This section provides details about the precise syntax of conversion
 863 specifications that can appear in a @code{printf} template
 864 string.
 865
 866 Characters in the template string that are not part of a
 867 conversion specification are printed as-is to the output stream.
 868 Multibyte character sequences (@pxref{Extended Characters}) are permitted in
 869 a template string.
 870
 871 The conversion specifications in a @code{printf} template string have
 872 the general form:
 873
 874 @example
 875 % @var{flags} @var{width} @r{[} . @var{precision} @r{]} @var{type} @var{conversion}
 876 @end example
 877
 878 For example, in the conversion specifier @samp{%-10.8ld}, the @samp{-}
 879 is a flag, @samp{10} specifies the field width, the precision is
 880 @samp{8}, the letter @samp{l} is a type modifier, and @samp{d} specifies
the conversion style.  (This particular conversion specification says to
 882 print a @code{long int} argument in decimal notation, with a minimum of
 883 8 digits left-justified in a field at least 10 characters wide.)
 884
 885 In more detail, output conversion specifications consist of an
 886 initial @samp{%} character followed in sequence by:
 887
 888 @itemize @bullet
 889 @item
 890 Zero or more @dfn{flag characters} that modify the normal behavior of
 891 the conversion specification.
 892 @cindex flag character (@code{printf})
 893
 894 @item
 895 An optional decimal integer specifying the @dfn{minimum field width}.
 896 If the normal conversion produces fewer characters than this, the field
 897 is padded with spaces to the specified width.  This is a @emph{minimum}
 898 value; if the normal conversion produces more characters than this, the
 899 field is @emph{not} truncated.  Normally, the output is right-justified
 900 within the field.
 901 @cindex minimum field width (@code{printf})
 902
 903 You can also specify a field width of @samp{*}.  This means that the
 904 next argument in the argument list (before the actual value to be
 905 printed) is used as the field width.  The value must be an @code{int}.
 906 If the value is negative, this means to set the @samp{-} flag (see
 907 below) and to use the absolute value as the field width.
 908
 909 @item
 910 An optional @dfn{precision} to specify the number of digits to be
 911 written for the numeric conversions.  If the precision is specified, it
 912 consists of a period (@samp{.}) followed optionally by a decimal integer
 913 (which defaults to zero if omitted).
 914 @cindex precision (@code{printf})
 915
 916 You can also specify a precision of @samp{*}.  This means that the next
 917 argument in the argument list (before the actual value to be printed) is
 918 used as the precision.  The value must be an @code{int}, and is ignored
 919 if it is negative.  If you specify @samp{*} for both the field width and
 920 precision, the field width argument precedes the precision argument.
 921 Other C library versions may not recognize this syntax.
 922
 923 @item
 924 An optional @dfn{type modifier character}, which is used to specify the
 925 data type of the corresponding argument if it differs from the default
 926 type.  (For example, the integer conversions assume a type of @code{int},
 927 but you can specify @samp{h}, @samp{l}, or @samp{L} for other integer
 928 types.)
 929 @cindex type modifier character (@code{printf})
 930
 931 @item
 932 A character that specifies the conversion to be applied.
 933 @end itemize
 934
 935 The exact options that are permitted and how they are interpreted vary
 936 between the different conversion specifiers.  See the descriptions of the
 937 individual conversions for information about the particular options that
 938 they use.
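
The @samp{*} forms of the field width and precision described above can
be sketched as follows.  The helper name @code{format_padded} is an
assumption made for this example; it uses @code{snprintf} so that the
result can be inspected in a buffer.

```c
#include <stdio.h>

/* Hypothetical helper: format VALUE into BUF (of SIZE bytes) in a
   field of WIDTH characters with at least PRECISION digits.  Both
   numbers are passed as int arguments ahead of the value, with the
   field width first.  */
int
format_padded (char *buf, size_t size, int width, int precision,
               long value)
{
  return snprintf (buf, size, "%*.*ld", width, precision, value);
}
```

Passing a negative @code{width} to this helper has the same effect as
writing the @samp{-} flag in the template.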
 939
 940 With the @samp{-Wformat} option, the GNU C compiler checks calls to
 941 @code{printf} and related functions.  It examines the format string and
 942 verifies that the correct number and types of arguments are supplied.
 943 There is also a GNU C syntax to tell the compiler that a function you
 944 write uses a @code{printf}-style format string.
 945 @xref{Function Attributes, , Declaring Attributes of Functions,
 946 gcc.info, Using GNU CC}, for more information.
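
As a sketch of that syntax, the following declares a wrapper whose
arguments the compiler can check in the same way as those of
@code{printf}.  The function name @code{log_message} is hypothetical,
and the example assumes a compiler that accepts the GNU C
@code{__attribute__} keyword.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical logging wrapper.  The attribute says that argument 1
   is a printf-style template and that the values to check against it
   start at argument 2.  */
int log_message (const char *fmt, ...)
     __attribute__ ((format (printf, 1, 2)));

int
log_message (const char *fmt, ...)
{
  va_list ap;
  int n;

  va_start (ap, fmt);
  n = vfprintf (stderr, fmt, ap);
  va_end (ap);
  return n;
}
```

With @samp{-Wformat} in effect, a call such as
@code{log_message ("%d", "text")} should then draw a warning about the
mismatched argument type.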
 947
 948 @node Table of Output Conversions
 949 @subsection Table of Output Conversions
 950 @cindex output conversions, for @code{printf}
 951
 952 Here is a table summarizing what all the different conversions do:
 953
 954 @table @asis
 955 @item @samp{%d}, @samp{%i}
 956 Print an integer as a signed decimal number.  @xref{Integer
 957 Conversions}, for details.  @samp{%d} and @samp{%i} are synonymous for
 958 output, but are different when used with @code{scanf} for input
 959 (@pxref{Table of Input Conversions}).
 960
 961 @item @samp{%o}
 962 Print an integer as an unsigned octal number.  @xref{Integer
 963 Conversions}, for details.
 964
 965 @item @samp{%u}
 966 Print an integer as an unsigned decimal number.  @xref{Integer
 967 Conversions}, for details.
 968
 969 @item @samp{%x}, @samp{%X}
 970 Print an integer as an unsigned hexadecimal number.  @samp{%x} uses
 971 lower-case letters and @samp{%X} uses upper-case.  @xref{Integer
 972 Conversions}, for details.
 973
 974 @item @samp{%f}
 975 Print a floating-point number in normal (fixed-point) notation.
 976 @xref{Floating-Point Conversions}, for details.
 977
 978 @item @samp{%e}, @samp{%E}
 979 Print a floating-point number in exponential notation.  @samp{%e} uses
 980 lower-case letters and @samp{%E} uses upper-case.  @xref{Floating-Point
 981 Conversions}, for details.
 982
 983 @item @samp{%g}, @samp{%G}
 984 Print a floating-point number in either normal or exponential notation,
 985 whichever is more appropriate for its magnitude.  @samp{%g} uses
 986 lower-case letters and @samp{%G} uses upper-case.  @xref{Floating-Point
 987 Conversions}, for details.
 988
 989 @item @samp{%c}
 990 Print a single character.  @xref{Other Output Conversions}.
 991
 992 @item @samp{%s}
 993 Print a string.  @xref{Other Output Conversions}.
 994
 995 @item @samp{%p}
 996 Print the value of a pointer.  @xref{Other Output Conversions}.
 997
 998 @item @samp{%n}
 999 Get the number of characters printed so far.  @xref{Other Output Conversions}.
1000 Note that this conversion specification never produces any output.
1001
1002 @item @samp{%m}
1003 Print the string corresponding to the value of @code{errno}.
1004 (This is a GNU extension.)
1005 @xref{Other Output Conversions}.
1006
1007 @item @samp{%%}
1008 Print a literal @samp{%} character.  @xref{Other Output Conversions}.
1009 @end table
1010
1011 If the syntax of a conversion specification is invalid, unpredictable
1012 things will happen, so don't do this.  If there aren't enough function
1013 arguments provided to supply values for all the conversion
1014 specifications in the template string, or if the arguments are not of
1015 the correct types, the results are unpredictable.  If you supply more
1016 arguments than conversion specifications, the extra argument values are
1017 simply ignored; this is sometimes useful.
1018
1019 @node Integer Conversions
1020 @subsection Integer Conversions
1021
1022 This section describes the options for the @samp{%d}, @samp{%i},
1023 @samp{%o}, @samp{%u}, @samp{%x}, and @samp{%X} conversion
1024 specifications.  These conversions print integers in various formats.
1025
1026 The @samp{%d} and @samp{%i} conversion specifications both print an
1027 @code{int} argument as a signed decimal number; while @samp{%o},
1028 @samp{%u}, and @samp{%x} print the argument as an unsigned octal,
1029 decimal, or hexadecimal number (respectively).  The @samp{%X} conversion
1030 specification is just like @samp{%x} except that it uses the characters
1031 @samp{ABCDEF} as digits instead of @samp{abcdef}.
1032
1033 The following flags are meaningful:
1034
1035 @table @asis
1036 @item @samp{-}
1037 Left-justify the result in the field (instead of the normal
1038 right-justification).
1039
1040 @item @samp{+}
1041 For the signed @samp{%d} and @samp{%i} conversions, print a
1042 plus sign if the value is positive.
1043
1044 @item @samp{ }
1045 For the signed @samp{%d} and @samp{%i} conversions, if the result
1046 doesn't start with a plus or minus sign, prefix it with a space
1047 character instead.  Since the @samp{+} flag ensures that the result
1048 includes a sign, this flag is ignored if you supply both of them.
1049
1050 @item @samp{#}
1051 For the @samp{%o} conversion, this forces the leading digit to be
@samp{0}, as if by increasing the precision.  For @samp{%x} or
@samp{%X}, this prefixes a leading @samp{0x} or @samp{0X} (respectively)
to a nonzero result.  This doesn't do anything useful for the @samp{%d},
1055 @samp{%i}, or @samp{%u} conversions.  Using this flag produces output
1056 which can be parsed by the @code{strtoul} function (@pxref{Parsing of
1057 Integers}) and @code{scanf} with the @samp{%i} conversion
1058 (@pxref{Numeric Input Conversions}).
1059
1060 @item @samp{'}
1061 Separate the digits into groups as specified by the locale specified for
1062 the @code{LC_NUMERIC} category; @pxref{General Numeric}.  This flag is a
1063 GNU extension.
1064
1065 @item @samp{0}
1066 Pad the field with zeros instead of spaces.  The zeros are placed after
1067 any indication of sign or base.  This flag is ignored if the @samp{-}
1068 flag is also specified, or if a precision is specified.
1069 @end table
1070
1071 If a precision is supplied, it specifies the minimum number of digits to
1072 appear; leading zeros are produced if necessary.  If you don't specify a
1073 precision, the number is printed with as many digits as it needs.  If
1074 you convert a value of zero with an explicit precision of zero, then no
1075 characters at all are produced.
1076
1077 Without a type modifier, the corresponding argument is treated as an
1078 @code{int} (for the signed conversions @samp{%i} and @samp{%d}) or
1079 @code{unsigned int} (for the unsigned conversions @samp{%o}, @samp{%u},
1080 @samp{%x}, and @samp{%X}).  Recall that since @code{printf} and friends
1081 are variadic, any @code{char} and @code{short} arguments are
1082 automatically converted to @code{int} by the default argument
1083 promotions.  For arguments of other integer types, you can use these
1084 modifiers:
1085
1086 @table @samp
1087 @item h
1088 Specifies that the argument is a @code{short int} or @code{unsigned
1089 short int}, as appropriate.  A @code{short} argument is converted to an
1090 @code{int} or @code{unsigned int} by the default argument promotions
1091 anyway, but the @samp{h} modifier says to convert it back to a
1092 @code{short} again.
1093
1094 @item l
1095 Specifies that the argument is a @code{long int} or @code{unsigned long
1096 int}, as appropriate.  Two @samp{l} characters is like the @samp{L}
1097 modifier, below.
1098
1099 @item L
1100 @itemx ll
1101 @itemx q
1102 Specifies that the argument is a @code{long long int}.  (This type is
1103 an extension supported by the GNU C compiler.  On systems that don't
1104 support extra-long integers, this is the same as @code{long int}.)
1105
1106 The @samp{q} modifier is another name for the same thing, which comes
1107 from 4.4 BSD; a @w{@code{long long int}} is sometimes called a ``quad''
1108 @code{int}.
1109
1110 @item Z
1111 Specifies that the argument is a @code{size_t}.  This is a GNU extension.
1112 @end table
1113
1114 Here is an example.  Using the template string:
1115
1116 @smallexample
1117 "|%5d|%-5d|%+5d|%+-5d|% 5d|%05d|%5.0d|%5.2d|%d|\n"
1118 @end smallexample
1119
1120 @noindent
1121 to print numbers using the different options for the @samp{%d}
1122 conversion gives results like:
1123
1124 @smallexample
1125 |    0|0    |   +0|+0   |    0|00000|     |   00|0|
1126 |    1|1    |   +1|+1   |    1|00001|    1|   01|1|
1127 |   -1|-1   |   -1|-1   |   -1|-0001|   -1|  -01|-1|
|100000|100000|+100000|+100000| 100000|100000|100000|100000|100000|
1129 @end smallexample
1130
1131 In particular, notice what happens in the last case where the number
1132 is too large to fit in the minimum field width specified.
1133
1134 Here are some more examples showing how unsigned integers print under
1135 various format options, using the template string:
1136
1137 @smallexample
1138 "|%5u|%5o|%5x|%5X|%#5o|%#5x|%#5X|%#10.8x|\n"
1139 @end smallexample
1140
1141 @smallexample
|    0|    0|    0|    0|    0|    0|    0|  00000000|
1143 |    1|    1|    1|    1|   01|  0x1|  0X1|0x00000001|
1144 |100000|303240|186a0|186A0|0303240|0x186a0|0X186A0|0x000186a0|
1145 @end smallexample
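
Rows like the ones in this section can be regenerated with a small
program.  The following sketch wraps each of the two template strings in
a helper; the names @code{format_signed_row} and
@code{format_unsigned_row} are invented for this illustration.

```c
#include <stdio.h>

/* Format VALUE once for each %d variation in the first template and
   store the row in BUF.  Returns what snprintf returns.  */
int
format_signed_row (char *buf, size_t size, int value)
{
  return snprintf (buf, size,
                   "|%5d|%-5d|%+5d|%+-5d|% 5d|%05d|%5.0d|%5.2d|%d|",
                   value, value, value, value, value,
                   value, value, value, value);
}

/* Likewise for the unsigned conversions in the second template.  */
int
format_unsigned_row (char *buf, size_t size, unsigned int value)
{
  return snprintf (buf, size,
                   "|%5u|%5o|%5x|%5X|%#5o|%#5x|%#5X|%#10.8x|",
                   value, value, value, value,
                   value, value, value, value);
}
```

Each conversion specification consumes one argument, which is why the
value is repeated once per field.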
1146
1147
1148 @node Floating-Point Conversions
1149 @subsection Floating-Point Conversions
1150
1151 This section discusses the conversion specifications for floating-point
1152 numbers: the @samp{%f}, @samp{%e}, @samp{%E}, @samp{%g}, and @samp{%G}
1153 conversions.
1154
1155 The @samp{%f} conversion prints its argument in fixed-point notation,
1156 producing output of the form
1157 @w{[@code{-}]@var{ddd}@code{.}@var{ddd}},
1158 where the number of digits following the decimal point is controlled
1159 by the precision you specify.
1160
1161 The @samp{%e} conversion prints its argument in exponential notation,
1162 producing output of the form
1163 @w{[@code{-}]@var{d}@code{.}@var{ddd}@code{e}[@code{+}|@code{-}]@var{dd}}.
1164 Again, the number of digits following the decimal point is controlled by
1165 the precision.  The exponent always contains at least two digits.  The
1166 @samp{%E} conversion is similar but the exponent is marked with the letter
1167 @samp{E} instead of @samp{e}.
1168
1169 The @samp{%g} and @samp{%G} conversions print the argument in the style
1170 of @samp{%e} or @samp{%E} (respectively) if the exponent would be less
1171 than -4 or greater than or equal to the precision; otherwise they use the
1172 @samp{%f} style.  Trailing zeros are removed from the fractional portion
1173 of the result and a decimal-point character appears only if it is
1174 followed by a digit.
1175
1176 The following flags can be used to modify the behavior:
1177
1178 @comment We use @asis instead of @samp so we can have ` ' as an item.
1179 @table @asis
1180 @item @samp{-}
1181 Left-justify the result in the field.  Normally the result is
1182 right-justified.
1183
1184 @item @samp{+}
1185 Always include a plus or minus sign in the result.
1186
1187 @item @samp{ }
1188 If the result doesn't start with a plus or minus sign, prefix it with a
1189 space instead.  Since the @samp{+} flag ensures that the result includes
1190 a sign, this flag is ignored if you supply both of them.
1191
1192 @item @samp{#}
1193 Specifies that the result should always include a decimal point, even
1194 if no digits follow it.  For the @samp{%g} and @samp{%G} conversions,
1195 this also forces trailing zeros after the decimal point to be left
1196 in place where they would otherwise be removed.
1197
1198 @item @samp{'}
1199 Separate the digits of the integer part of the result into groups as
1200 specified by the locale specified for the @code{LC_NUMERIC} category;
1201 @pxref{General Numeric}.  This flag is a GNU extension.
1202
1203 @item @samp{0}
1204 Pad the field with zeros instead of spaces; the zeros are placed
1205 after any sign.  This flag is ignored if the @samp{-} flag is also
1206 specified.
1207 @end table
1208
1209 The precision specifies how many digits follow the decimal-point
1210 character for the @samp{%f}, @samp{%e}, and @samp{%E} conversions.  For
1211 these conversions, the default precision is @code{6}.  If the precision
1212 is explicitly @code{0}, this suppresses the decimal point character
1213 entirely.  For the @samp{%g} and @samp{%G} conversions, the precision
1214 specifies how many significant digits to print.  Significant digits are
1215 the first digit before the decimal point, and all the digits after it.
1216 If the precision @code{0} or not specified for @samp{%g} or @samp{%G},
1217 it is treated like a value of @code{1}.  If the value being printed
1218 cannot be expressed accurately in the specified number of digits, the
1219 value is rounded to the nearest number that fits.
1220
1221 Without a type modifier, the floating-point conversions use an argument
1222 of type @code{double}.  (By the default argument promotions, any
1223 @code{float} arguments are automatically converted to @code{double}.)
1224 The following type modifier is supported:
1225
1226 @table @samp
1227 @item L
1228 An uppercase @samp{L} specifies that the argument is a @code{long
1229 double}.
1230 @end table
1231
1232 Here are some examples showing how numbers print using the various
1233 floating-point conversions.  All of the numbers were printed using
1234 this template string:
1235
1236 @smallexample
1237 "|%12.4f|%12.4e|%12.4g|\n"
1238 @end smallexample
1239
1240 Here is the output:
1241
1242 @smallexample
1243 |      0.0000|  0.0000e+00|           0|
1244 |      1.0000|  1.0000e+00|           1|
1245 |     -1.0000| -1.0000e+00|          -1|
1246 |    100.0000|  1.0000e+02|         100|
1247 |   1000.0000|  1.0000e+03|        1000|
1248 |  10000.0000|  1.0000e+04|       1e+04|
1249 |  12345.0000|  1.2345e+04|   1.234e+04|
1250 | 100000.0000|  1.0000e+05|       1e+05|
| 123456.0000|  1.2346e+05|   1.235e+05|
1252 @end smallexample
1253
1254 Notice how the @samp{%g} conversion drops trailing zeros.
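
To experiment with other values, the template can be wrapped in a helper
along these lines.  This is only a sketch, and the name
@code{format_float_row} is an assumption made for the example.

```c
#include <stdio.h>

/* Hypothetical helper: format VALUE with the %f, %e and %g
   conversions, each in a 12-character field with a precision of 4,
   and store the row in BUF.  Returns what snprintf returns.  */
int
format_float_row (char *buf, size_t size, double value)
{
  return snprintf (buf, size, "|%12.4f|%12.4e|%12.4g|",
                   value, value, value);
}
```

A @code{float} argument can be passed to this helper as well, since it
is converted to @code{double} at the call.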
1255
1256 @node Other Output Conversions
1257 @subsection Other Output Conversions
1258
1259 This section describes miscellaneous conversions for @code{printf}.
1260
1261 The @samp{%c} conversion prints a single character.  The @code{int}
1262 argument is first converted to an @code{unsigned char}.  The @samp{-}
1263 flag can be used to specify left-justification in the field, but no
1264 other flags are defined, and no precision or type modifier can be given.
1265 For example:
1266
1267 @smallexample
1268 printf ("%c%c%c%c%c", 'h', 'e', 'l', 'l', 'o');
1269 @end smallexample
1270
1271 @noindent
1272 prints @samp{hello}.
1273
1274 The @samp{%s} conversion prints a string.  The corresponding argument
1275 must be of type @code{char *} (or @code{const char *}).  A precision can
1276 be specified to indicate the maximum number of characters to write;
1277 otherwise characters in the string up to but not including the
1278 terminating null character are written to the output stream.  The
1279 @samp{-} flag can be used to specify left-justification in the field,
1280 but no other flags or type modifiers are defined for this conversion.
1281 For example:
1282
1283 @smallexample
1284 printf ("%3s%-6s", "no", "where");
1285 @end smallexample
1286
1287 @noindent
1288 prints @samp{ nowhere }.
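
The precision is useful when only the first part of a string should be
written.  Here is a minimal sketch that passes the limit at run time
using @samp{*}; the helper name @code{format_prefix} is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: store at most MAXCHARS characters of STR in
   BUF (of SIZE bytes), using the precision to limit how much of the
   string is written.  */
int
format_prefix (char *buf, size_t size, const char *str, int maxchars)
{
  return snprintf (buf, size, "%.*s", maxchars, str);
}
```

If the string is shorter than the limit, the whole string is written and
no padding is added, because the precision is a maximum rather than a
field width.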
1289
1290 If you accidentally pass a null pointer as the argument for a @samp{%s}
1291 conversion, the GNU library prints it as @samp{(null)}.  We think this
1292 is more useful than crashing.  But it's not good practice to pass a null
1293 argument intentionally.
1294
1295 The @samp{%m} conversion prints the string corresponding to the error
1296 code in @code{errno}.  @xref{Error Messages}.  Thus:
1297
1298 @smallexample
1299 fprintf (stderr, "can't open `%s': %m\n", filename);
1300 @end smallexample
1301
1302 @noindent
1303 is equivalent to:
1304
1305 @smallexample
1306 fprintf (stderr, "can't open `%s': %s\n", filename, strerror (errno));
1307 @end smallexample
1308
1309 @noindent
1310 The @samp{%m} conversion is a GNU C library extension.
1311
1312 The @samp{%p} conversion prints a pointer value.  The corresponding
1313 argument must be of type @code{void *}.  In practice, you can use any
1314 type of pointer.
1315
1316 In the GNU system, non-null pointers are printed as unsigned integers,
1317 as if a @samp{%#x} conversion were used.  Null pointers print as
1318 @samp{(nil)}.  (Pointers might print differently in other systems.)
1319
1320 For example:
1321
1322 @smallexample
1323 printf ("%p", "testing");
1324 @end smallexample
1325
1326 @noindent
1327 prints @samp{0x} followed by a hexadecimal number---the address of the
1328 string constant @code{"testing"}.  It does not print the word
1329 @samp{testing}.
1330
1331 You can supply the @samp{-} flag with the @samp{%p} conversion to
1332 specify left-justification, but no other flags, precision, or type
1333 modifiers are defined.
1334
1335 The @samp{%n} conversion is unlike any of the other output conversions.
1336 It uses an argument which must be a pointer to an @code{int}, but
1337 instead of printing anything it stores the number of characters printed
1338 so far by this call at that location.  The @samp{h} and @samp{l} type
1339 modifiers are permitted to specify that the argument is of type
1340 @code{short int *} or @code{long int *} instead of @code{int *}, but no
1341 flags, field width, or precision are permitted.
1342
1343 For example,
1344
1345 @smallexample
1346 int nchar;
1347 printf ("%d %s%n\n", 3, "bears", &nchar);
1348 @end smallexample
1349
1350 @noindent
1351 prints:
1352
1353 @smallexample
1354 3 bears
1355 @end smallexample
1356
1357 @noindent
1358 and sets @code{nchar} to @code{7}, because @samp{3 bears} is seven
1359 characters.
1360
1361
1362 The @samp{%%} conversion prints a literal @samp{%} character.  This
1363 conversion doesn't use an argument, and no flags, field width,
1364 precision, or type modifiers are permitted.
1365
1366
1367 @node Formatted Output Functions
1368 @subsection Formatted Output Functions
1369
1370 This section describes how to call @code{printf} and related functions.
1371 Prototypes for these functions are in the header file @file{stdio.h}.
1372 Because these functions take a variable number of arguments, you
1373 @emph{must} declare prototypes for them before using them.  Of course,
1374 the easiest way to make sure you have all the right prototypes is to
1375 just include @file{stdio.h}.
1376 @pindex stdio.h
1377
1378 @comment stdio.h
1379 @comment ANSI
1380 @deftypefun int printf (const char *@var{template}, @dots{})
1381 The @code{printf} function prints the optional arguments under the
1382 control of the template string @var{template} to the stream
1383 @code{stdout}.  It returns the number of characters printed, or a
1384 negative value if there was an output error.
1385 @end deftypefun
1386
1387 @comment stdio.h
1388 @comment ANSI
1389 @deftypefun int fprintf (FILE *@var{stream}, const char *@var{template}, @dots{})
1390 This function is just like @code{printf}, except that the output is
1391 written to the stream @var{stream} instead of @code{stdout}.
1392 @end deftypefun
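
Since these functions return a negative value on an output error, a
caller that cares about failures can test the result.  The following is
a sketch only; @code{write_setting} is a name invented for this example.

```c
#include <stdio.h>

/* Hypothetical helper: write a NAME=VALUE line to STREAM.  Return 0
   on success, or -1 if fprintf reported an output error.  */
int
write_setting (FILE *stream, const char *name, int value)
{
  return fprintf (stream, "%s=%d\n", name, value) < 0 ? -1 : 0;
}
```

On a buffered stream an error may not show up until the buffer is
flushed, so a careful program also checks the result of closing the
stream.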
1393
1394 @comment stdio.h
1395 @comment ANSI
1396 @deftypefun int sprintf (char *@var{s}, const char *@var{template}, @dots{})
1397 This is like @code{printf}, except that the output is stored in the character
1398 array @var{s} instead of written to a stream.  A null character is written
1399 to mark the end of the string.
1400
1401 The @code{sprintf} function returns the number of characters stored in
1402 the array @var{s}, not including the terminating null character.
1403
1404 The behavior of this function is undefined if copying takes place
1405 between objects that overlap---for example, if @var{s} is also given
1406 as an argument to be printed under control of the @samp{%s} conversion.
1407 @xref{Copying and Concatenation}.
1408
1409 @strong{Warning:} The @code{sprintf} function can be @strong{dangerous}
1410 because it can potentially output more characters than can fit in the
1411 allocation size of the string @var{s}.  Remember that the field width
1412 given in a conversion specification is only a @emph{minimum} value.
1413
1414 To avoid this problem, you can use @code{snprintf} or @code{asprintf},
1415 described below.
1416 @end deftypefun
1417
1418 @comment stdio.h
1419 @comment GNU
1420 @deftypefun int snprintf (char *@var{s}, size_t @var{size}, const char *@var{template}, @dots{})
1421 The @code{snprintf} function is similar to @code{sprintf}, except that
1422 the @var{size} argument specifies the maximum number of characters to
1423 produce.  The trailing null character is counted towards this limit, so
1424 you should allocate at least @var{size} characters for the string @var{s}.
1425
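The return value is the number of characters stored, not including the
terminating null.  If this value equals @code{@var{size} - 1}, then
there may not have been enough space in @var{s} for all the output, and
you should try again with a bigger output string.  (Implementations that
follow @w{ISO C99}, including later versions of the GNU C library,
instead return the number of characters the complete output would have
needed, so there a value of @var{size} or more signals truncation.)
Here is an example of retrying with a larger buffer: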
1431
1432 @smallexample
1433 @group
1434 /* @r{Construct a message describing the value of a variable}
1435    @r{whose name is @var{name} and whose value is @var{value}.} */
1436 char *
1437 make_message (char *name, char *value)
1438 @{
1439   /* @r{Guess we need no more than 100 chars of space.} */
1440   int size = 100;
1441   char *buffer = (char *) xmalloc (size);
1442 @end group
1443 @group
1444   while (1)
1445     @{
1446       /* @r{Try to print in the allocated space.} */
1447       int nchars = snprintf (buffer, size,
1448                              "value of %s is %s",
1449                              name, value);
1450       /* @r{If that worked, return the string.} */
1451       if (nchars < size)
1452         return buffer;
1453       /* @r{Else try again with twice as much space.} */
1454       size *= 2;
      buffer = (char *) xrealloc (buffer, size);
1456     @}
1457 @}
1458 @end group
1459 @end smallexample
1460
1461 In practice, it is often easier just to use @code{asprintf}, below.
1462 @end deftypefun
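
When a fixed-size buffer is all you have, the return value is enough to
tell whether the output was cut short.  The following is a minimal
sketch, assuming an @code{snprintf} that reports the length the full
output would have had (the ISO C99 behavior); the function name
@code{describe_value} is hypothetical and used only for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: format a message into BUFFER, which holds SIZE
   bytes.  Return 1 if the whole message fit, 0 if it was truncated or
   an error occurred.  Assumes C99 snprintf return-value semantics.  */
int
describe_value (char *buffer, size_t size,
                const char *name, const char *value)
{
  int nchars = snprintf (buffer, size, "value of %s is %s", name, value);

  /* A negative result is an error; a result of SIZE or more means the
     terminating null had to displace part of the output.  */
  return nchars >= 0 && (size_t) nchars < size;
}
```

Even when the function reports truncation, the buffer still holds a
null-terminated prefix of the message, provided @var{size} is not zero.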
1463
1464 @node Dynamic Output
1465 @subsection Dynamically Allocating Formatted Output
1466
1467 The functions in this section do formatted output and place the results
1468 in dynamically allocated memory.
1469
1470 @comment stdio.h
1471 @comment GNU
1472 @deftypefun int asprintf (char **@var{ptr}, const char *@var{template}, @dots{})
1473 This function is similar to @code{sprintf}, except that it dynamically
1474 allocates a string (as with @code{malloc}; @pxref{Unconstrained
1475 Allocation}) to hold the output, instead of putting the output in a
1476 buffer you allocate in advance.  The @var{ptr} argument should be the
1477 address of a @code{char *} object, and @code{asprintf} stores a pointer
1478 to the newly allocated string at that location.
1479
1480 Here is how to use @code{asprintf} to get the same result as the
1481 @code{snprintf} example, but more easily:
1482
1483 @smallexample
1484 /* @r{Construct a message describing the value of a variable}
1485    @r{whose name is @var{name} and whose value is @var{value}.} */
1486 char *
1487 make_message (char *name, char *value)
1488 @{
  char *result;
  /* @r{On failure the contents of @code{result} are not usable.} */
  if (asprintf (&result, "value of %s is %s", name, value) < 0)
    return NULL;
  return result;
1492 @}
1493 @end smallexample
1494 @end deftypefun
1495
1496 @comment stdio.h
1497 @comment GNU
1498 @deftypefun int obstack_printf (struct obstack *@var{obstack}, const char *@var{template}, @dots{})
1499 This function is similar to @code{asprintf}, except that it uses the
1500 obstack @var{obstack} to allocate the space.  @xref{Obstacks}.
1501
1502 The characters are written onto the end of the current object.
1503 To get at them, you must finish the object with @code{obstack_finish}
1504 (@pxref{Growing Objects}).@refill
1505 @end deftypefun
1506
1507 @node Variable Arguments Output
1508 @subsection Variable Arguments Output Functions
1509
1510 The functions @code{vprintf} and friends are provided so that you can
1511 define your own variadic @code{printf}-like functions that make use of
1512 the same internals as the built-in formatted output functions.
1513
1514 The most natural way to define such functions would be to use a language
1515 construct to say, ``Call @code{printf} and pass this template plus all
1516 of my arguments after the first five.''  But there is no way to do this
1517 in C, and it would be hard to provide a way, since at the C language
1518 level there is no way to tell how many arguments your function received.
1519
Since that method is impossible, we provide alternative functions, the
@code{vprintf} series, which let you pass a @code{va_list} to describe
``all of my arguments after the first five.''
1523
When a macro is sufficient and you do not need a real function, the GNU
C compiler provides a much easier way to do this.
For example:
1527
1528 @smallexample
#define myprintf(a, b, c, d, e, rest...) printf (mytemplate , ## rest)
1530 @end smallexample
1531
1532 @noindent
1533 @xref{Macro Varargs, , Macros with Variable Numbers of Arguments,
1534 gcc.info, Using GNU CC}, for details.  But this is limited to macros,
1535 and does not apply to real functions at all.
1536
1537 Before calling @code{vprintf} or the other functions listed in this
1538 section, you @emph{must} call @code{va_start} (@pxref{Variadic
1539 Functions}) to initialize a pointer to the variable arguments.  Then you
1540 can call @code{va_arg} to fetch the arguments that you want to handle
1541 yourself.  This advances the pointer past those arguments.
1542
1543 Once your @code{va_list} pointer is pointing at the argument of your
1544 choice, you are ready to call @code{vprintf}.  That argument and all
1545 subsequent arguments that were passed to your function are used by
1546 @code{vprintf} along with the template that you specified separately.
1547
1548 In some other systems, the @code{va_list} pointer may become invalid
1549 after the call to @code{vprintf}, so you must not use @code{va_arg}
1550 after you call @code{vprintf}.  Instead, you should call @code{va_end}
1551 to retire the pointer from service.  However, you can safely call
1552 @code{va_start} on another pointer variable and begin fetching the
1553 arguments again through that pointer.  Calling @code{vprintf} does not
1554 destroy the argument list of your function, merely the particular
1555 pointer that you passed to it.
1556
1557 GNU C does not have such restrictions.  You can safely continue to fetch
1558 arguments from a @code{va_list} pointer after passing it to
1559 @code{vprintf}, and @code{va_end} is a no-op.  (Note, however, that
1560 subsequent @code{va_arg} calls will fetch the same arguments which
1561 @code{vprintf} previously used.)
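
The sequence described above (call @code{va_start}, fetch the arguments
you want to handle yourself with @code{va_arg}, then hand the rest to a
@code{vprintf}-family function) can be sketched as follows.  This is
only an illustration: the function @code{format_with_severity} and its
convention that the first variable argument is an @code{int} severity
are assumptions, not part of the library.  It calls @code{va_end}
without touching the list again, so it does not rely on the GNU
behavior.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper: the first variable argument is an int severity
   that we consume ourselves; the remaining arguments satisfy TEMPLATE.
   Returns the number of characters stored, or -1 if BUFFER is too
   small or an error occurs.  */
int
format_with_severity (char *buffer, size_t size,
                      const char *template, ...)
{
  va_list ap;
  int severity;
  int prefix;
  int body;

  va_start (ap, template);
  /* Fetch the argument we handle ourselves; AP now points past it.  */
  severity = va_arg (ap, int);

  prefix = snprintf (buffer, size, "[%d] ", severity);
  if (prefix < 0 || (size_t) prefix >= size)
    {
      va_end (ap);
      return -1;
    }

  /* Pass the rest of the argument list along with the template.  */
  body = vsnprintf (buffer + prefix, size - (size_t) prefix,
                    template, ap);
  /* Do not use AP again after this point; just retire it.  */
  va_end (ap);

  if (body < 0 || (size_t) body >= size - (size_t) prefix)
    return -1;
  return prefix + body;
}
```

A call such as
@code{format_with_severity (buf, sizeof buf, "%s: %d", 3, "disk", 42)}
stores @samp{[3] disk: 42} in @code{buf}.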
1562
1563 Prototypes for these functions are declared in @file{stdio.h}.
1564 @pindex stdio.h
1565
1566 @comment stdio.h
1567 @comment ANSI
1568 @deftypefun int vprintf (const char *@var{template}, va_list @var{ap})
1569 This function is similar to @code{printf} except that, instead of taking
1570 a variable number of arguments directly, it takes an argument list
1571 pointer @var{ap}.
1572 @end deftypefun
1573
1574 @comment stdio.h
1575 @comment ANSI
1576 @deftypefun int vfprintf (FILE *@var{stream}, const char *@var{template}, va_list @var{ap})
1577 This is the equivalent of @code{fprintf} with the variable argument list
1578 specified directly as for @code{vprintf}.
1579 @end deftypefun
1580
1581 @comment stdio.h
1582 @comment ANSI
1583 @deftypefun int vsprintf (char *@var{s}, const char *@var{template}, va_list @var{ap})
1584 This is the equivalent of @code{sprintf} with the variable argument list
1585 specified directly as for @code{vprintf}.
1586 @end deftypefun
1587
1588 @comment stdio.h
1589 @comment GNU
1590 @deftypefun int vsnprintf (char *@var{s}, size_t @var{size}, const char *@var{template}, va_list @var{ap})
1591 This is the equivalent of @code{snprintf} with the variable argument list
1592 specified directly as for @code{vprintf}.
1593 @end deftypefun
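
Because a @code{va_list} may not be usable again once it has been passed
to one of these functions, code that needs to format the same arguments
twice should work on a copy.  Here is a sketch that measures the output
first and then allocates exactly enough space.  It assumes the ISO C99
@code{va_copy} macro and an implementation where @code{vsnprintf}
returns the length the full output would have had; the names
@code{vformat_alloc} and @code{format_alloc} are hypothetical.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: return a malloc'd string holding the formatted
   output, or NULL on failure.  Assumes C99 va_copy and C99 vsnprintf
   return-value semantics.  */
char *
vformat_alloc (const char *template, va_list ap)
{
  va_list measure;
  int needed;
  char *buffer;

  /* First pass on a copy, so that AP itself stays untouched.  */
  va_copy (measure, ap);
  needed = vsnprintf (NULL, 0, template, measure);
  va_end (measure);
  if (needed < 0)
    return NULL;

  buffer = malloc ((size_t) needed + 1);
  if (buffer == NULL)
    return NULL;

  /* Second pass consumes the caller's list.  */
  vsnprintf (buffer, (size_t) needed + 1, template, ap);
  return buffer;
}

/* Variadic front end for vformat_alloc.  */
char *
format_alloc (const char *template, ...)
{
  va_list ap;
  char *result;

  va_start (ap, template);
  result = vformat_alloc (template, ap);
  va_end (ap);
  return result;
}
```

The caller is responsible for freeing the returned string.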
1594
1595 @comment stdio.h
1596 @comment GNU
1597 @deftypefun int vasprintf (char **@var{ptr}, const char *@var{template}, va_list @var{ap})
1598 The @code{vasprintf} function is the equivalent of @code{asprintf} with the
1599 variable argument list specified directly as for @code{vprintf}.
1600 @end deftypefun
1601
1602 @comment stdio.h
1603 @comment GNU
1604 @deftypefun int obstack_vprintf (struct obstack *@var{obstack}, const char *@var{template}, va_list @var{ap})
1605 The @code{obstack_vprintf} function is the equivalent of
1606 @code{obstack_printf} with the variable argument list specified directly
1607 as for @code{vprintf}.@refill
1608 @end deftypefun
1609
1610 Here's an example showing how you might use @code{vfprintf}.  This is a
1611 function that prints error messages to the stream @code{stderr}, along
1612 with a prefix indicating the name of the program
1613 (@pxref{Error Messages}, for a description of
1614 @code{program_invocation_short_name}).
1615
1616 @smallexample
1617 @group
1618 #include <stdio.h>
1619 #include <stdarg.h>
1620
1621 void
1622 eprintf (const char *template, ...)
1623 @{
1624   va_list ap;
1625   extern char *program_invocation_short_name;
1626
1627   fprintf (stderr, "%s: ", program_invocation_short_name);
  va_start (ap, template);
1629   vfprintf (stderr, template, ap);
1630   va_end (ap);
1631 @}
1632 @end group
1633 @end smallexample
1634
1635 @noindent
1636 You could call @code{eprintf} like this:
1637
1638 @smallexample
1639 eprintf ("file `%s' does not exist\n", filename);
1640 @end smallexample
1641
1642 In GNU C, there is a special construct you can use to let the compiler
1643 know that a function uses a @code{printf}-style format string.  Then it
1644 can check the number and types of arguments in each call to the
1645 function, and warn you when they do not match the format string.
1646 For example, take this declaration of @code{eprintf}:
1647
1648 @smallexample
1649 void eprintf (const char *template, ...)
1650         __attribute__ ((format (printf, 1, 2)));
1651 @end smallexample
1652
1653 @noindent
1654 This tells the compiler that @code{eprintf} uses a format string like
1655 @code{printf} (as opposed to @code{scanf}; @pxref{Formatted Input});
1656 the format string appears as the first argument;
1657 and the arguments to satisfy the format begin with the second.
1658 @xref{Function Attributes, , Declaring Attributes of Functions,
1659 gcc.info, Using GNU CC}, for more information.
1660
1661 @node Parsing a Template String
1662 @subsection Parsing a Template String
1663 @cindex parsing a template string
1664
1665 You can use the function @code{parse_printf_format} to obtain
1666 information about the number and types of arguments that are expected by
1667 a given template string.  This function permits interpreters that
1668 provide interfaces to @code{printf} to avoid passing along invalid
1669 arguments from the user's program, which could cause a crash.
1670
1671 All the symbols described in this section are declared in the header
1672 file @file{printf.h}.
1673
1674 @comment printf.h
1675 @comment GNU
1676 @deftypefun size_t parse_printf_format (const char *@var{template}, size_t @var{n}, int *@var{argtypes})
1677 This function returns information about the number and types of
1678 arguments expected by the @code{printf} template string @var{template}.
1679 The information is stored in the array @var{argtypes}; each element of
1680 this array describes one argument.  This information is encoded using
1681 the various @samp{PA_} macros, listed below.
1682
The @var{n} argument specifies the number of elements in the array
@var{argtypes}.  This is the maximum number of elements that
@code{parse_printf_format} will try to write.
1686
1687 @code{parse_printf_format} returns the total number of arguments required
1688 by @var{template}.  If this number is greater than @var{n}, then the
1689 information returned describes only the first @var{n} arguments.  If you
1690 want information about more than that many arguments, allocate a bigger
1691 array and call @code{parse_printf_format} again.
1692 @end deftypefun
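
The ``allocate a bigger array and call again'' strategy might look like
the following sketch.  The helper name @code{template_argtypes} and the
initial guess of four elements are assumptions chosen for illustration.

```c
#include <printf.h>
#include <stdlib.h>

/* Arbitrary first guess at the number of arguments.  */
#define INITIAL_SLOTS 4

/* Hypothetical helper: return a malloc'd array describing every
   argument TEMPLATE expects and store the argument count in *COUNT.
   Returns NULL if memory runs out.  */
int *
template_argtypes (const char *template, size_t *count)
{
  size_t have = INITIAL_SLOTS;
  size_t wanted;
  int *argtypes = malloc (have * sizeof (int));

  if (argtypes == NULL)
    return NULL;

  wanted = parse_printf_format (template, have, argtypes);
  if (wanted > have)
    {
      /* Only the first HAVE entries were filled in; get a bigger
         array and ask again.  */
      int *bigger = realloc (argtypes, wanted * sizeof (int));
      if (bigger == NULL)
        {
          free (argtypes);
          return NULL;
        }
      argtypes = bigger;
      wanted = parse_printf_format (template, wanted, argtypes);
    }

  *count = wanted;
  return argtypes;
}
```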
1693
1694 The argument types are encoded as a combination of a basic type and
1695 modifier flag bits.
1696
1697 @comment printf.h
1698 @comment GNU
1699 @deftypevr Macro int PA_FLAG_MASK
1700 This macro is a bitmask for the type modifier flag bits.  You can write
1701 the expression @code{(argtypes[i] & PA_FLAG_MASK)} to extract just the
1702 flag bits for an argument, or @code{(argtypes[i] & ~PA_FLAG_MASK)} to
1703 extract just the basic type code.
1704 @end deftypevr
1705
1706 Here are symbolic constants that represent the basic types; they stand
1707 for integer values.
1708
1709 @table @code
1710 @comment printf.h
1711 @comment GNU
1712 @item PA_INT
1713 @vindex PA_INT
1714 This specifies that the base type is @code{int}.
1715
1716 @comment printf.h
1717 @comment GNU
1718 @item PA_CHAR
1719 @vindex PA_CHAR
1720 This specifies that the base type is @code{int}, cast to @code{char}.
1721
1722 @comment printf.h
1723 @comment GNU
1724 @item PA_STRING
1725 @vindex PA_STRING
1726 This specifies that the base type is @code{char *}, a null-terminated string.
1727
1728 @comment printf.h
1729 @comment GNU
1730 @item PA_POINTER
1731 @vindex PA_POINTER
1732 This specifies that the base type is @code{void *}, an arbitrary pointer.
1733
1734 @comment printf.h
1735 @comment GNU
1736 @item PA_FLOAT
1737 @vindex PA_FLOAT
1738 This specifies that the base type is @code{float}.
1739
1740 @comment printf.h
1741 @comment GNU
1742 @item PA_DOUBLE
1743 @vindex PA_DOUBLE
1744 This specifies that the base type is @code{double}.
1745
1746 @comment printf.h
1747 @comment GNU
1748 @item PA_LAST
1749 @vindex PA_LAST
1750 You can define additional base types for your own programs as offsets
1751 from @code{PA_LAST}.  For example, if you have data types @samp{foo}
1752 and @samp{bar} with their own specialized @code{printf} conversions,
1753 you could define encodings for these types as:
1754
1755 @smallexample
1756 #define PA_FOO  PA_LAST
1757 #define PA_BAR  (PA_LAST + 1)
1758 @end smallexample
1759 @end table
1760
1761 Here are the flag bits that modify a basic type.  They are combined with
1762 the code for the basic type using inclusive-or.
1763
1764 @table @code
1765 @comment printf.h
1766 @comment GNU
1767 @item PA_FLAG_PTR
1768 @vindex PA_FLAG_PTR
1769 If this bit is set, it indicates that the encoded type is a pointer to
1770 the base type, rather than an immediate value.
1771 For example, @samp{PA_INT|PA_FLAG_PTR} represents the type @samp{int *}.
1772
1773 @comment printf.h
1774 @comment GNU
1775 @item PA_FLAG_SHORT
1776 @vindex PA_FLAG_SHORT
1777 If this bit is set, it indicates that the base type is modified with
1778 @code{short}.  (This corresponds to the @samp{h} type modifier.)
1779
1780 @comment printf.h
1781 @comment GNU
1782 @item PA_FLAG_LONG
1783 @vindex PA_FLAG_LONG
1784 If this bit is set, it indicates that the base type is modified with
1785 @code{long}.  (This corresponds to the @samp{l} type modifier.)
1786
1787 @comment printf.h
1788 @comment GNU
1789 @item PA_FLAG_LONG_LONG
1790 @vindex PA_FLAG_LONG_LONG
1791 If this bit is set, it indicates that the base type is modified with
1792 @code{long long}.  (This corresponds to the @samp{L} type modifier.)
1793
1794 @comment printf.h
1795 @comment GNU
1796 @item PA_FLAG_LONG_DOUBLE
1797 @vindex PA_FLAG_LONG_DOUBLE
1798 This is a synonym for @code{PA_FLAG_LONG_LONG}, used by convention with
1799 a base type of @code{PA_DOUBLE} to indicate a type of @code{long double}.
1800 @end table
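
As a small illustration of separating the two parts of an encoded
argument type, here is a sketch of two helpers.  The names
@code{base_type_name} and @code{is_pointer_to_base} are hypothetical,
and the strings returned are arbitrary labels.

```c
#include <printf.h>

/* Hypothetical helper: return a label for the basic type encoded in
   ARGTYPE, ignoring all modifier flag bits.  */
const char *
base_type_name (int argtype)
{
  switch (argtype & ~PA_FLAG_MASK)
    {
    case PA_INT:
      return "int";
    case PA_CHAR:
      return "char";
    case PA_STRING:
      return "string";
    case PA_POINTER:
      return "pointer";
    case PA_FLOAT:
      return "float";
    case PA_DOUBLE:
      return "double";
    default:
      /* A user-defined type at or beyond PA_LAST, or a type this
         sketch does not know about.  */
      return "other";
    }
}

/* Hypothetical helper: nonzero if ARGTYPE encodes a pointer to its
   base type rather than an immediate value.  */
int
is_pointer_to_base (int argtype)
{
  return (argtype & PA_FLAG_PTR) != 0;
}
```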
1801
1802 @ifinfo
For an example of using these facilities, see @ref{Example of Parsing}.
1804 @end ifinfo
1805
1806 @node Example of Parsing
1807 @subsection Example of Parsing a Template String
1808
Here is an example of decoding argument types for a format string.  We
assume this is part of an interpreter whose objects have the types
@code{NUMBER}, @code{CHAR}, @code{STRING} and @code{STRUCTURE} (and
perhaps others which are not valid here).
1813
1814 @smallexample
1815 /* @r{Test whether the @var{nargs} specified objects}
1816    @r{in the vector @var{args} are valid}
1817    @r{for the format string @var{format}:}
1818    @r{if so, return 1.}
1819    @r{If not, return 0 after printing an error message.}  */
1820
1821 int
1822 validate_args (char *format, int nargs, OBJECT *args)
1823 @{
  int *argtypes;
  size_t nelts;
  int nwanted;
  int i;

  /* @r{Get the information about the arguments.}
     @r{Each conversion specification must be at least two characters}
     @r{long, so there cannot be more specifications than half the}
     @r{length of the string.}  */

  nelts = strlen (format) / 2;
  argtypes = (int *) alloca (nelts * sizeof (int));
  nwanted = parse_printf_format (format, nelts, argtypes);
1834
1835   /* @r{Check the number of arguments.}  */
1836   if (nwanted > nargs)
1837     @{
1838       error ("too few arguments (at least %d required)", nwanted);
1839       return 0;
1840     @}
1841
1842   /* @r{Check the C type wanted for each argument}
1843      @r{and see if the object given is suitable.}  */
1844   for (i = 0; i < nwanted; i++)
1845     @{
1846       int wanted;
1847
1848       if (argtypes[i] & PA_FLAG_PTR)
1849         wanted = STRUCTURE;
1850       else
1851         switch (argtypes[i] & ~PA_FLAG_MASK)
1852           @{
1853           case PA_INT:
1854           case PA_FLOAT:
1855           case PA_DOUBLE:
1856             wanted = NUMBER;
1857             break;
1858           case PA_CHAR:
1859             wanted = CHAR;
1860             break;
1861           case PA_STRING:
1862             wanted = STRING;
1863             break;
1864           case PA_POINTER:
1865             wanted = STRUCTURE;
1866             break;
1867           @}
1868       if (TYPE (args[i]) != wanted)
1869         @{
1870           error ("type mismatch for arg number %d", i);
1871           return 0;
1872         @}
1873     @}
1874   return 1;
1875 @}
1876 @end smallexample
1877
1878 @node Customizing Printf
1879 @section Customizing @code{printf}
1880 @cindex customizing @code{printf}
1881 @cindex defining new @code{printf} conversions
1882 @cindex extending @code{printf}
1883
1884 The GNU C library lets you define your own custom conversion specifiers
1885 for @code{printf} template strings, to teach @code{printf} clever ways
1886 to print the important data structures of your program.
1887
You do this by registering the conversion with the function
@code{register_printf_function}; see @ref{Registering New Conversions}.
One of the arguments you pass to this function is a pointer to a handler
function that produces the actual output; see @ref{Defining the Output
Handler}, for information on how to write this function.
1893
1894 You can also install a function that just returns information about the
1895 number and type of arguments expected by the conversion specifier.
1896 @xref{Parsing a Template String}, for information about this.
1897
1898 The facilities of this section are declared in the header file
1899 @file{printf.h}.
1900
1901 @menu
1902 * Registering New Conversions::         Using @code{register_printf_function}
1903                                          to register a new output conversion.
1904 * Conversion Specifier Options::        The handler must be able to get
1905                                          the options specified in the
1906                                          template when it is called.
1907 * Defining the Output Handler::         Defining the handler and arginfo
1908                                          functions that are passed as arguments
1909                                          to @code{register_printf_function}.
1910 * Printf Extension Example::            How to define a @code{printf}
1911                                          handler function.
1912 @end menu
1913
1914 @strong{Portability Note:} The ability to extend the syntax of
1915 @code{printf} template strings is a GNU extension.  ANSI standard C has
1916 nothing similar.
1917
1918 @node Registering New Conversions
1919 @subsection Registering New Conversions
1920
1921 The function to register a new output conversion is
1922 @code{register_printf_function}, declared in @file{printf.h}.
1923 @pindex printf.h
1924
1925 @comment printf.h
1926 @comment GNU
1927 @deftypefun int register_printf_function (int @var{spec}, printf_function @var{handler-function}, printf_arginfo_function @var{arginfo-function})
1928 This function defines the conversion specifier character @var{spec}.
1929 Thus, if @var{spec} is @code{'z'}, it defines the conversion @samp{%z}.
1930 You can redefine the built-in conversions like @samp{%s}, but flag
1931 characters like @samp{#} and type modifiers like @samp{l} can never be
1932 used as conversions; calling @code{register_printf_function} for those
1933 characters has no effect.
1934
1935 The @var{handler-function} is the function called by @code{printf} and
1936 friends when this conversion appears in a template string.
1937 @xref{Defining the Output Handler}, for information about how to define
1938 a function to pass as this argument.  If you specify a null pointer, any
1939 existing handler function for @var{spec} is removed.
1940
1941 The @var{arginfo-function} is the function called by
1942 @code{parse_printf_format} when this conversion appears in a
1943 template string.  @xref{Parsing a Template String}, for information
1944 about this.
1945
1946 Normally, you install both functions for a conversion at the same time,
1947 but if you are never going to call @code{parse_printf_format}, you do
1948 not need to define an arginfo function.
1949
1950 The return value is @code{0} on success, and @code{-1} on failure
1951 (which occurs if @var{spec} is out of range).
1952
1953 You can redefine the standard output conversions, but this is probably
1954 not a good idea because of the potential for confusion.  Library routines
1955 written by other people could break if you do this.
1956 @end deftypefun
1957
1958 @node Conversion Specifier Options
1959 @subsection Conversion Specifier Options
1960
If you define a meaning for @samp{%A}, what if the template contains
@samp{%+23A} or @samp{%-#A}?  To implement a sensible meaning for these,
the handler needs access, when it is called, to the options specified in
the template.
1965
Both the handler function and the arginfo function that you pass to
@code{register_printf_function} take an argument that points to a
@code{struct printf_info}, which contains information about the options
appearing in an instance of the conversion specifier.  This data type
is declared in the header file @file{printf.h}.
1971 @pindex printf.h
1972
1973 @comment printf.h
1974 @comment GNU
1975 @deftp {Type} {struct printf_info}
1976 This structure is used to pass information about the options appearing
1977 in an instance of a conversion specifier in a @code{printf} template
1978 string to the handler and arginfo functions for that specifier.  It
1979 contains the following members:
1980
1981 @table @code
1982 @item int prec
1983 This is the precision specified.  The value is @code{-1} if no precision
1984 was specified.  If the precision was given as @samp{*}, the
1985 @code{printf_info} structure passed to the handler function contains the
1986 actual value retrieved from the argument list.  But the structure passed
1987 to the arginfo function contains a value of @code{INT_MIN}, since the
1988 actual value is not known.
1989
1990 @item int width
1991 This is the minimum field width specified.  The value is @code{0} if no
1992 width was specified.  If the field width was given as @samp{*}, the
1993 @code{printf_info} structure passed to the handler function contains the
1994 actual value retrieved from the argument list.  But the structure passed
1995 to the arginfo function contains a value of @code{INT_MIN}, since the
1996 actual value is not known.
1997
1998 @item char spec
1999 This is the conversion specifier character specified.  It's stored in
2000 the structure so that you can register the same handler function for
2001 multiple characters, but still have a way to tell them apart when the
2002 handler function is called.
2003
2004 @item unsigned int is_long_double
2005 This is a boolean that is true if the @samp{L}, @samp{ll}, or @samp{q}
2006 type modifier was specified.  For integer conversions, this indicates
2007 @code{long long int}, as opposed to @code{long double} for floating
2008 point conversions.
2009
2010 @item unsigned int is_short
2011 This is a boolean that is true if the @samp{h} type modifier was specified.
2012
2013 @item unsigned int is_long
2014 This is a boolean that is true if the @samp{l} type modifier was specified.
2015
2016 @item unsigned int alt
2017 This is a boolean that is true if the @samp{#} flag was specified.
2018
2019 @item unsigned int space
2020 This is a boolean that is true if the @samp{ } flag was specified.
2021
2022 @item unsigned int left
2023 This is a boolean that is true if the @samp{-} flag was specified.
2024
2025 @item unsigned int showsign
2026 This is a boolean that is true if the @samp{+} flag was specified.
2027
2028 @item unsigned int group
2029 This is a boolean that is true if the @samp{'} flag was specified.
2030
2031 @item char pad
2032 This is the character to use for padding the output to the minimum field
2033 width.  The value is @code{'0'} if the @samp{0} flag was specified, and
2034 @code{' '} otherwise.
2035 @end table
2036 @end deftp
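
To show how a handler might consult these members, here is a sketch of
a helper that honors only the minimum field width and the @samp{-}
flag.  The name @code{print_padded} is hypothetical; the helper is the
kind of routine a handler could call once it has converted its argument
to a string, not a handler itself.

```c
#include <printf.h>
#include <stdio.h>

/* Hypothetical helper: write TEXT to STREAM, padded with spaces to the
   minimum field width in INFO and left-justified if the `-' flag was
   given.  All other options in INFO are ignored.  Returns what fprintf
   returns: the number of characters written, or a negative value on
   error.  */
int
print_padded (FILE *stream, const struct printf_info *info,
              const char *text)
{
  /* A negative width passed through `*' asks for left-justification.  */
  int width = info->left ? -info->width : info->width;

  return fprintf (stream, "%*s", width, text);
}
```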
2037
2038
2039 @node Defining the Output Handler
2040 @subsection Defining the Output Handler
2041
2042 Now let's look at how to define the handler and arginfo functions
2043 which are passed as arguments to @code{register_printf_function}.
2044
2045 You should define your handler functions with a prototype like:
2046
2047 @smallexample
2048 int @var{function} (FILE *stream, const struct printf_info *info,
2049                     va_list *ap_pointer)
2050 @end smallexample
2051
2052 The @code{stream} argument passed to the handler function is the stream to
2053 which it should write output.
2054
2055 The @code{info} argument is a pointer to a structure that contains
2056 information about the various options that were included with the
2057 conversion in the template string.  You should not modify this structure
2058 inside your handler function.  @xref{Conversion Specifier Options}, for
2059 a description of this data structure.
2060
2061 The @code{ap_pointer} argument is used to pass the tail of the variable
2062 argument list containing the values to be printed to your handler.
2063 Unlike most other functions that can be passed an explicit variable
2064 argument list, this is a @emph{pointer} to a @code{va_list}, rather than
2065 the @code{va_list} itself.  Thus, you should fetch arguments by
2066 means of @code{va_arg (*ap_pointer, @var{type})}.
2067
2068 (Passing a pointer here allows the function that calls your handler
2069 function to update its own @code{va_list} variable to account for the
2070 arguments that your handler processes.  @xref{Variadic Functions}.)
2071
2072 Your handler function should return a value just like @code{printf}
2073 does: it should return the number of characters it has written, or a
2074 negative value to indicate an error.
2075
2076 @comment printf.h
2077 @comment GNU
2078 @deftp {Data Type} printf_function
2079 This is the data type that a handler function should have.
2080 @end deftp
2081
2082 If you are going to use @w{@code{parse_printf_format}} in your
2083 application, you should also define a function to pass as the
2084 @var{arginfo-function} argument for each new conversion you install with
2085 @code{register_printf_function}.
2086
2087 You should define these functions with a prototype like:
2088
2089 @smallexample
2090 int @var{function} (const struct printf_info *info,
2091                     size_t n, int *argtypes)
2092 @end smallexample
2093
2094 The return value from the function should be the number of arguments the
2095 conversion expects.  The function should also fill in no more than
2096 @var{n} elements of the @var{argtypes} array with information about the
2097 types of each of these arguments.  This information is encoded using the
2098 various @samp{PA_} macros.  (You will notice that this is the same
2099 calling convention @code{parse_printf_format} itself uses.)
2100
2101 @comment printf.h
2102 @comment GNU
2103 @deftp {Data Type} printf_arginfo_function
2104 This type is used to describe functions that return information about
2105 the number and type of arguments used by a conversion specifier.
2106 @end deftp
2107
2108 @node Printf Extension Example
2109 @subsection @code{printf} Extension Example
2110
2111 Here is an example showing how to define a @code{printf} handler function.
2112 This program defines a data structure called a @code{Widget} and
2113 defines the @samp{%W} conversion to print information about @w{@code{Widget *}}
2114 arguments, including the pointer value and the name stored in the data
2115 structure.  The @samp{%W} conversion supports the minimum field width and
2116 left-justification options, but ignores everything else.
2117
2118 @smallexample
2119 @include rprintf.c.texi
2120 @end smallexample
2121
2122 The output produced by this program looks like:
2123
2124 @smallexample
2125 |<Widget 0xffeffb7c: mywidget>|
2126 |      <Widget 0xffeffb7c: mywidget>|
2127 |<Widget 0xffeffb7c: mywidget>      |
2128 @end smallexample
2129
2130 @node Formatted Input
2131 @section Formatted Input
2132
2133 @cindex formatted input from a stream
2134 @cindex reading from a stream, formatted
2135 @cindex format string, for @code{scanf}
2136 @cindex template, for @code{scanf}
2137 The functions described in this section (@code{scanf} and related
2138 functions) provide facilities for formatted input analogous to the
2139 formatted output facilities.  These functions provide a mechanism for
2140 reading arbitrary values under the control of a @dfn{format string} or
2141 @dfn{template string}.
2142
2143 @menu
2144 * Formatted Input Basics::      Some basics to get you started.
2145 * Input Conversion Syntax::     Syntax of conversion specifications.
2146 * Table of Input Conversions::  Summary of input conversions and what they do.
2147 * Numeric Input Conversions::   Details of conversions for reading numbers.
2148 * String Input Conversions::    Details of conversions for reading strings.
2149 * Dynamic String Input::        String conversions that @code{malloc} the buffer.
2150 * Other Input Conversions::     Details of miscellaneous other conversions.
2151 * Formatted Input Functions::   Descriptions of the actual functions.
2152 * Variable Arguments Input::    @code{vscanf} and friends.
2153 @end menu
2154
2155 @node Formatted Input Basics
2156 @subsection Formatted Input Basics
2157
2158 Calls to @code{scanf} are superficially similar to calls to
2159 @code{printf} in that arbitrary arguments are read under the control of
2160 a template string.  While the syntax of the conversion specifications in
2161 the template is very similar to that for @code{printf}, the
2162 interpretation of the template is oriented more towards free-format
2163 input and simple pattern matching, rather than fixed-field formatting.
2164 For example, most @code{scanf} conversions skip over any amount of
2165 ``white space'' (including spaces, tabs, and newlines) in the input
2166 file, and there is no concept of precision for the numeric input
2167 conversions as there is for the corresponding output conversions.
2168 Ordinarily, non-whitespace characters in the template are expected to
2169 match characters in the input stream exactly, but a matching failure is
2170 distinct from an input error on the stream.
2171 @cindex conversion specifications (@code{scanf})
2172
2173 Another area of difference between @code{scanf} and @code{printf} is
2174 that you must remember to supply pointers rather than immediate values
2175 as the optional arguments to @code{scanf}; the values that are read are
2176 stored in the objects that the pointers point to.  Even experienced
2177 programmers tend to forget this occasionally, so if your program is
2178 getting strange errors that seem to be related to @code{scanf}, you
2179 might want to double-check this.
2180
When a @dfn{matching failure} occurs, @code{scanf} returns immediately,
leaving the first non-matching character as the next character to be
read from the stream.  The normal return value from @code{scanf} is the
number of values that were assigned, so you can use this to determine if
a matching failure happened before all the expected values were read.
2186 @cindex matching failure, in @code{scanf}
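
Here is a sketch of how the return value can be used to tell a complete
match from a matching failure and from running out of input.  It uses
@code{sscanf}, which reads from a string instead of a stream, so that
the example is self-contained; the type @code{enum scan_status} and the
function @code{read_two_ints} are hypothetical.

```c
#include <stdio.h>

enum scan_status
{
  SCAN_OK,             /* Both values were assigned.  */
  SCAN_MATCH_FAILURE,  /* Input did not match the template.  */
  SCAN_NO_INPUT        /* Input ended before any conversion.  */
};

/* Hypothetical helper: read two integers from TEXT and store in
   *ASSIGNED how many values were actually assigned.  */
enum scan_status
read_two_ints (const char *text, int *x, int *y, int *assigned)
{
  int n = sscanf (text, "%d %d", x, y);

  if (n == EOF)
    {
      *assigned = 0;
      return SCAN_NO_INPUT;
    }

  *assigned = n;
  return n == 2 ? SCAN_OK : SCAN_MATCH_FAILURE;
}
```

For the input @samp{10 abc}, one value is assigned before the matching
failure, so the caller can tell that @code{*x} is valid and @code{*y}
is not.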
2187
2188 The @code{scanf} function is typically used for things like reading in
2189 the contents of tables.  For example, here is a function that uses
2190 @code{scanf} to initialize an array of @code{double}:
2191
2192 @smallexample
2193 void
2194 readarray (double *array, int n)
2195 @{
2196   int i;
  for (i = 0; i < n; i++)
2198     if (scanf (" %lf", &(array[i])) != 1)
2199       invalid_input_error ();
2200 @}
2201 @end smallexample
2202
2203 The formatted input functions are not used as frequently as the
2204 formatted output functions.  Partly, this is because it takes some care
2205 to use them properly.  Another reason is that it is difficult to recover
from a matching failure.
2207
2208 If you are trying to read input that doesn't match a single, fixed
2209 pattern, you may be better off using a tool such as Flex to generate a
2210 lexical scanner, or Bison to generate a parser, rather than using
2211 @code{scanf}.  For more information about these tools, see @ref{, , ,
2212 flex.info, Flex: The Lexical Scanner Generator}, and @ref{, , ,
2213 bison.info, The Bison Reference Manual}.
2214
2215 @node Input Conversion Syntax
2216 @subsection Input Conversion Syntax
2217
2218 A @code{scanf} template string is a string that contains ordinary
2219 multibyte characters interspersed with conversion specifications that
2220 start with @samp{%}.
2221
2222 Any whitespace character (as defined by the @code{isspace} function;
2223 @pxref{Classification of Characters}) in the template causes any number
2224 of whitespace characters in the input stream to be read and discarded.
2225 The whitespace characters that are matched need not be exactly the same
2226 whitespace characters that appear in the template string.  For example,
2227 write @samp{ , } in the template to recognize a comma with optional
2228 whitespace before and after.
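
A minimal sketch of the @samp{ , } idiom follows.  The helper name
@code{parse_pair} is an assumption made for this example, and
@code{sscanf} is used so that the sample can be tried on fixed strings.

```c
#include <stdio.h>

/* Hypothetical helper: read "X,Y" with optional whitespace around
   the comma.  Returns 1 on success, 0 otherwise.  */
int
parse_pair (const char *s, int *x, int *y)
{
  /* Each space in the template matches any amount of whitespace in
     the input, including none at all.  */
  return sscanf (s, "%d , %d", x, y) == 2;
}
```

With this template, @samp{3,4} and @samp{3 , 4} are both accepted, while
@samp{3 4} is rejected because the literal comma is missing.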
2229
2230 Other characters in the template string that are not part of conversion
2231 specifications must match characters in the input stream exactly; if
2232 this is not the case, a matching failure occurs.
2233
2234 The conversion specifications in a @code{scanf} template string
2235 have the general form:
2236
2237 @smallexample
2238 % @var{flags} @var{width} @var{type} @var{conversion}
2239 @end smallexample
2240
2241 In more detail, an input conversion specification consists of an initial
2242 @samp{%} character followed in sequence by:
2243
2244 @itemize @bullet
2245 @item
2246 An optional @dfn{flag character} @samp{*}, which says to ignore the text
2247 read for this specification.  When @code{scanf} finds a conversion
2248 specification that uses this flag, it reads input as directed by the
2249 rest of the conversion specification, but it discards this input, does
2250 not use a pointer argument, and does not increment the count of
2251 successful assignments.
2252 @cindex flag character (@code{scanf})
2253
2254 @item
2255 An optional flag character @samp{a} (valid with string conversions only)
2256 which requests allocation of a buffer long enough to store the string in.
2257 (This is a GNU extension.)
2258 @xref{Dynamic String Input}.
2259
2260 @item
2261 An optional decimal integer that specifies the @dfn{maximum field
2262 width}.  Reading of characters from the input stream stops either when
2263 this maximum is reached or when a non-matching character is found,
2264 whichever happens first.  Most conversions discard initial whitespace
2265 characters (those that don't are explicitly documented), and these
2266 discarded characters don't count towards the maximum field width.
2267 String input conversions store a null character to mark the end of the
2268 input; the maximum field width does not include this terminator.
2269 @cindex maximum field width (@code{scanf})
2270
2271 @item
2272 An optional @dfn{type modifier character}.  For example, you can
2273 specify a type modifier of @samp{l} with integer conversions such as
2274 @samp{%d} to specify that the argument is a pointer to a @code{long int}
2275 rather than a pointer to an @code{int}.
2276 @cindex type modifier character (@code{scanf})
2277
2278 @item
2279 A character that specifies the conversion to be applied.
2280 @end itemize
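
The pieces listed above can be combined in one template.  The sketch
below is only an illustration: the helper name @code{parse_year_day} and
the date layout are assumptions, not part of the library.

```c
#include <stdio.h>

/* Hypothetical helper: pull the year and day out of a date written as
   YYYY-MM-DD, skipping the month.  Returns the number of values
   assigned; the suppressed "%*2d" field is not counted.  */
int
parse_year_day (const char *s, long *year, int *day)
{
  /* "%4ld": at most four characters, stored through a long pointer.
     "%*2d": read up to two characters and discard them.
     "%2d":  at most two characters, stored through an int pointer.  */
  return sscanf (s, "%4ld-%*2d-%2d", year, day);
}
```

For the input @samp{2024-06-15} this should return 2, since only two of
the three conversions perform an assignment.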
2281
2282 The exact options that are permitted and how they are interpreted vary
2283 between the different conversion specifiers.  See the descriptions of the
2284 individual conversions for information about the particular options that
2285 they allow.
2286
2287 With the @samp{-Wformat} option, the GNU C compiler checks calls to
2288 @code{scanf} and related functions.  It examines the format string and
2289 verifies that the correct number and types of arguments are supplied.
2290 There is also a GNU C syntax to tell the compiler that a function you
2291 write uses a @code{scanf}-style format string.
2292 @xref{Function Attributes, , Declaring Attributes of Functions,
2293 gcc.info, Using GNU CC}, for more information.
2294
2295 @node Table of Input Conversions
2296 @subsection Table of Input Conversions
2297 @cindex input conversions, for @code{scanf}
2298
2299 Here is a table that summarizes the various conversion specifications:
2300
2301 @table @asis
2302 @item @samp{%d}
2303 Matches an optionally signed integer written in decimal.  @xref{Numeric
2304 Input Conversions}.
2305
2306 @item @samp{%i}
2307 Matches an optionally signed integer in any of the formats that the C
2308 language defines for specifying an integer constant.  @xref{Numeric
2309 Input Conversions}.
2310
2311 @item @samp{%o}
2312 Matches an unsigned integer written in octal radix.
2313 @xref{Numeric Input Conversions}.
2314
2315 @item @samp{%u}
2316 Matches an unsigned integer written in decimal radix.
2317 @xref{Numeric Input Conversions}.
2318
2319 @item @samp{%x}, @samp{%X}
2320 Matches an unsigned integer written in hexadecimal radix.
2321 @xref{Numeric Input Conversions}.
2322
2323 @item @samp{%e}, @samp{%f}, @samp{%g}, @samp{%E}, @samp{%G}
2324 Matches an optionally signed floating-point number.  @xref{Numeric Input
2325 Conversions}.
2326
2327 @item @samp{%s}
2328 Matches a string containing only non-whitespace characters.
2329 @xref{String Input Conversions}.
2330
2331 @item @samp{%[}
2332 Matches a string of characters that belong to a specified set.
2333 @xref{String Input Conversions}.
2334
2335 @item @samp{%c}
2336 Matches a string of one or more characters; the number of characters
2337 read is controlled by the maximum field width given for the conversion.
2338 @xref{String Input Conversions}.
2339
2340 @item @samp{%p}
2341 Matches a pointer value in the same implementation-defined format used
2342 by the @samp{%p} output conversion for @code{printf}.  @xref{Other Input
2343 Conversions}.
2344
2345 @item @samp{%n}
2346 This conversion doesn't read any characters; it records the number of
2347 characters read so far by this call.  @xref{Other Input Conversions}.
2348
2349 @item @samp{%%}
2350 This matches a literal @samp{%} character in the input stream.  No
2351 corresponding argument is used.  @xref{Other Input Conversions}.
2352 @end table
2353
2354 If the syntax of a conversion specification is invalid, the behavior is
2355 undefined.  If there aren't enough function arguments provided to supply
2356 addresses for all the conversion specifications in the template strings
2357 that perform assignments, or if the arguments are not of the correct
2358 types, the behavior is also undefined.  On the other hand, extra
2359 arguments are simply ignored.
2360
2361 @node Numeric Input Conversions
2362 @subsection Numeric Input Conversions
2363
2364 This section describes the @code{scanf} conversions for reading numeric
2365 values.
2366
2367 The @samp{%d} conversion matches an optionally signed integer in decimal
2368 radix.  The syntax that is recognized is the same as that for the
2369 @code{strtol} function (@pxref{Parsing of Integers}) with the value
2370 @code{10} for the @var{base} argument.
2371
2372 The @samp{%i} conversion matches an optionally signed integer in any of
2373 the formats that the C language defines for specifying an integer
2374 constant.  The syntax that is recognized is the same as that for the
2375 @code{strtol} function (@pxref{Parsing of Integers}) with the value
@code{0} for the @var{base} argument.  (You can print integers in this
syntax with @code{printf} by using the @samp{#} flag character with the
@samp{%x} or @samp{%o} conversion; the output of a plain @samp{%d} is
already in this syntax.  @xref{Integer Conversions}.)
2379
2380 For example, any of the strings @samp{10}, @samp{0xa}, or @samp{012}
2381 could be read in as integers under the @samp{%i} conversion.  Each of
2382 these specifies a number with decimal value @code{10}.
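
The three spellings can be checked with a short sketch.  The helper
@code{read_any_base} is a name made up for this example, and it uses
@code{sscanf} so that the inputs can be given as strings.

```c
#include <stdio.h>

/* Hypothetical helper: convert S with "%i" and return the value,
   or -1 if nothing could be converted.  */
int
read_any_base (const char *s)
{
  int value;

  return sscanf (s, "%i", &value) == 1 ? value : -1;
}
```

Called with @samp{10}, @samp{0xa}, or @samp{012}, this helper should
return 10 each time.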
2383
2384 The @samp{%o}, @samp{%u}, and @samp{%x} conversions match unsigned
2385 integers in octal, decimal, and hexadecimal radices, respectively.  The
2386 syntax that is recognized is the same as that for the @code{strtoul}
2387 function (@pxref{Parsing of Integers}) with the appropriate value
2388 (@code{8}, @code{10}, or @code{16}) for the @var{base} argument.
2389
2390 The @samp{%X} conversion is identical to the @samp{%x} conversion.  They
2391 both permit either uppercase or lowercase letters to be used as digits.
2392
2393 The default type of the corresponding argument for the @code{%d} and
2394 @code{%i} conversions is @code{int *}, and @code{unsigned int *} for the
2395 other integer conversions.  You can use the following type modifiers to
2396 specify other sizes of integer:
2397
2398 @table @samp
2399 @item h
2400 Specifies that the argument is a @code{short int *} or @code{unsigned
2401 short int *}.
2402
2403 @item l
2404 Specifies that the argument is a @code{long int *} or @code{unsigned
2405 long int *}.  Two @samp{l} characters is like the @samp{L} modifier, below.
2406
2407 @need 100
2408 @item ll
2409 @itemx L
2410 @itemx q
Specifies that the argument is a @code{long long int *} or
@code{unsigned long long int *}.  (The @code{long long} type is an
extension supported by the GNU C compiler.  For systems that don't
provide extra-long integers, this is the same as @code{long int}.)
2414
2415 The @samp{q} modifier is another name for the same thing, which comes
2416 from 4.4 BSD; a @w{@code{long long int}} is sometimes called a ``quad''
2417 @code{int}.
2418 @end table
2419
2420 All of the @samp{%e}, @samp{%f}, @samp{%g}, @samp{%E}, and @samp{%G}
2421 input conversions are interchangeable.  They all match an optionally
2422 signed floating point number, in the same syntax as for the
2423 @code{strtod} function (@pxref{Parsing of Floats}).
2424
2425 For the floating-point input conversions, the default argument type is
2426 @code{float *}.  (This is different from the corresponding output
2427 conversions, where the default type is @code{double}; remember that
2428 @code{float} arguments to @code{printf} are converted to @code{double}
2429 by the default argument promotions, but @code{float *} arguments are
2430 not promoted to @code{double *}.)  You can specify other sizes of float
2431 using these type modifiers:
2432
2433 @table @samp
2434 @item l
2435 Specifies that the argument is of type @code{double *}.
2436
2437 @item L
2438 Specifies that the argument is of type @code{long double *}.
2439 @end table
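
The following sketch pairs each type modifier with a pointer of the
matching type.  The structure @code{struct sample} and the helper
@code{parse_sample} are hypothetical and exist only for this
illustration.

```c
#include <stdio.h>

/* Hypothetical record used only for this illustration.  */
struct sample
{
  short int small;
  long int big;
  float ratio;
  double precise;
};

/* Read four whitespace-separated fields from S.  Each conversion
   carries the type modifier that matches the pointer it is given.
   Returns the number of values assigned.  */
int
parse_sample (const char *s, struct sample *out)
{
  return sscanf (s, "%hd %ld %f %lf",
                 &out->small, &out->big, &out->ratio, &out->precise);
}
```

Note that @samp{%f} is paired with a @code{float *} and @samp{%lf} with
a @code{double *}; swapping them is a common source of the strange
errors mentioned earlier.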
2440
2441 @node String Input Conversions
2442 @subsection String Input Conversions
2443
2444 This section describes the @code{scanf} input conversions for reading
2445 string and character values: @samp{%s}, @samp{%[}, and @samp{%c}.
2446
2447 You have two options for how to receive the input from these
2448 conversions:
2449
2450 @itemize @bullet
2451 @item
2452 Provide a buffer to store it in.  This is the default.  You
2453 should provide an argument of type @code{char *}.
2454
2455 @strong{Warning:} To make a robust program, you must make sure that the
2456 input (plus its terminating null) cannot possibly exceed the size of the
2457 buffer you provide.  In general, the only way to do this is to specify a
2458 maximum field width one less than the buffer size.  @strong{If you
2459 provide the buffer, always specify a maximum field width to prevent
2460 overflow.}
2461
2462 @item
2463 Ask @code{scanf} to allocate a big enough buffer, by specifying the
2464 @samp{a} flag character.  This is a GNU extension.  You should provide
2465 an argument of type @code{char **} for the buffer address to be stored
2466 in.  @xref{Dynamic String Input}.
2467 @end itemize
2468
2469 The @samp{%c} conversion is the simplest: it matches a fixed number of
characters, always.  The maximum field width says how many characters to
2471 read; if you don't specify the maximum, the default is 1.  This
2472 conversion doesn't append a null character to the end of the text it
2473 reads.  It also does not skip over initial whitespace characters.  It
2474 reads precisely the next @var{n} characters, and fails if it cannot get
2475 that many.  Since there is always a maximum field width with @samp{%c}
2476 (whether specified, or 1 by default), you can always prevent overflow by
2477 making the buffer long enough.
2478
2479 The @samp{%s} conversion matches a string of non-whitespace characters.
2480 It skips and discards initial whitespace, but stops when it encounters
2481 more whitespace after having read something.  It stores a null character
2482 at the end of the text that it reads.
2483
2484 For example, reading the input:
2485
2486 @smallexample
2487  hello, world
2488 @end smallexample
2489
2490 @noindent
2491 with the conversion @samp{%10c} produces @code{" hello, wo"}, but
2492 reading the same input with the conversion @samp{%10s} produces
2493 @code{"hello,"}.
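
The comparison can be reproduced with a small sketch.  The helper name
@code{compare_c_and_s} is an assumption made for this example, and
@code{sscanf} is used so that both conversions see the same input.

```c
#include <stdio.h>
#include <string.h>

/* Read the same input twice, once with "%10c" and once with "%10s".
   FROM_C and FROM_S must each have room for at least 11 characters.  */
void
compare_c_and_s (const char *input, char *from_c, char *from_s)
{
  /* "%10c" stores exactly ten characters and no terminator, so the
     null character is added by hand.  */
  if (sscanf (input, "%10c", from_c) == 1)
    from_c[10] = '\0';
  else
    from_c[0] = '\0';

  /* "%10s" skips leading whitespace, stops at the next whitespace,
     and appends the terminator itself.  */
  if (sscanf (input, "%10s", from_s) != 1)
    from_s[0] = '\0';
}
```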
2494
2495 @strong{Warning:} If you do not specify a field width for @samp{%s},
2496 then the number of characters read is limited only by where the next
2497 whitespace character appears.  This almost certainly means that invalid
2498 input can make your program crash---which is a bug.
2499
2500 To read in characters that belong to an arbitrary set of your choice,
2501 use the @samp{%[} conversion.  You specify the set between the @samp{[}
2502 character and a following @samp{]} character, using the same syntax used
2503 in regular expressions.  As special cases:
2504
2505 @itemize @bullet
2506 @item
2507 A literal @samp{]} character can be specified as the first character
2508 of the set.
2509
2510 @item
2511 An embedded @samp{-} character (that is, one that is not the first or
2512 last character of the set) is used to specify a range of characters.
2513
2514 @item
2515 If a caret character @samp{^} immediately follows the initial @samp{[},
then the set of allowed input characters is everything @emph{except}
2517 the characters listed.
2518 @end itemize
2519
2520 The @samp{%[} conversion does not skip over initial whitespace
2521 characters.
2522
2523 Here are some examples of @samp{%[} conversions and what they mean:
2524
2525 @table @samp
2526 @item %25[1234567890]
2527 Matches a string of up to 25 digits.
2528
2529 @item %25[][]
2530 Matches a string of up to 25 square brackets.
2531
2532 @item %25[^ \f\n\r\t\v]
2533 Matches a string up to 25 characters long that doesn't contain any of
2534 the standard whitespace characters.  This is slightly different from
2535 @samp{%s}, because if the input begins with a whitespace character,
2536 @samp{%[} reports a matching failure while @samp{%s} simply discards the
2537 initial whitespace.
2538
2539 @item %25[a-z]
2540 Matches up to 25 lowercase characters.
2541 @end table
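
As a sketch of how two of these sets might be combined, the helper below
splits a line of the form @samp{key=value}.  The name
@code{split_assignment} and the buffer sizes are assumptions chosen for
this example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: split "key=value" into two buffers of 26
   characters each.  The key is limited to lowercase letters; the
   value is everything up to the end of the line.  Returns 1 on
   success, 0 otherwise.  */
int
split_assignment (const char *s, char key[26], char value[26])
{
  /* Both widths are one less than the buffer size, which leaves
     room for the terminating null.  */
  return sscanf (s, "%25[a-z]=%25[^\n]", key, value) == 2;
}
```

Because @samp{%[} does not skip initial whitespace, an input that begins
with a space is rejected by this helper rather than silently accepted.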
2542
2543 One more reminder: the @samp{%s} and @samp{%[} conversions are
2544 @strong{dangerous} if you don't specify a maximum width or use the
2545 @samp{a} flag, because input too long would overflow whatever buffer you
2546 have provided for it.  No matter how long your buffer is, a user could
2547 supply input that is longer.  A well-written program reports invalid
2548 input with a comprehensible error message, not with a crash.
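
A minimal sketch of a bounded read follows.  The helper
@code{read_word} and the limit @code{WORD_MAX} are hypothetical names;
the point is only that the field width in the template is one less than
the size of the buffer.

```c
#include <stdio.h>
#include <string.h>

#define WORD_MAX 15

/* Hypothetical helper: copy the first word of S into WORD, which
   must have room for WORD_MAX + 1 characters.  The field width in
   the template (15) must agree with WORD_MAX, leaving room for the
   terminating null.  Returns 1 if a word was read, 0 otherwise.  */
int
read_word (const char *s, char word[WORD_MAX + 1])
{
  return sscanf (s, "%15s", word) == 1;
}
```

Input longer than the limit is cut off after 15 characters instead of
overflowing the buffer; the remaining characters stay unread.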
2549
2550 @node Dynamic String Input
2551 @subsection Dynamically Allocating String Conversions
2552
2553 A GNU extension to formatted input lets you safely read a string with no
2554 maximum size.  Using this feature, you don't supply a buffer; instead,
2555 @code{scanf} allocates a buffer big enough to hold the data and gives
2556 you its address.  To use this feature, write @samp{a} as a flag
2557 character, as in @samp{%as} or @samp{%a[0-9a-z]}.
2558
2559 The pointer argument you supply for where to store the input should have
2560 type @code{char **}.  The @code{scanf} function allocates a buffer and
2561 stores its address in the word that the argument points to.  You should
2562 free the buffer with @code{free} when you no longer need it.
2563
2564 Here is an example of using the @samp{a} flag with the @samp{%[@dots{}]}
2565 conversion specification to read a ``variable assignment'' of the form
2566 @samp{@var{variable} = @var{value}}.
2567
@smallexample
@{
  char *variable, *value;

  if (2 > scanf ("%a[a-zA-Z0-9] = %a[^\n]\n",
                 &variable, &value))
    @{
      invalid_input_error ();
      return 0;
    @}

  @dots{}
@}
@end smallexample
2582
2583 @node Other Input Conversions
2584 @subsection Other Input Conversions
2585
2586 This section describes the miscellaneous input conversions.
2587
2588 The @samp{%p} conversion is used to read a pointer value.  It recognizes
2589 the same syntax as is used by the @samp{%p} output conversion for
2590 @code{printf} (@pxref{Other Output Conversions}); that is, a hexadecimal
2591 number just as the @samp{%x} conversion accepts.  The corresponding
2592 argument should be of type @code{void **}; that is, the address of a
2593 place to store a pointer.
2594
2595 The resulting pointer value is not guaranteed to be valid if it was not
2596 originally written during the same program execution that reads it in.
2597
2598 The @samp{%n} conversion produces the number of characters read so far
2599 by this call.  The corresponding argument should be of type @code{int *}.
2600 This conversion works in the same way as the @samp{%n} conversion for
2601 @code{printf}; see @ref{Other Output Conversions}, for an example.
2602
2603 The @samp{%n} conversion is the only mechanism for determining the
2604 success of literal matches or conversions with suppressed assignments.
2605 If the @samp{%n} follows the locus of a matching failure, then no value
2606 is stored for it since @code{scanf} returns before processing the
2607 @samp{%n}.  If you store @code{-1} in that argument slot before calling
2608 @code{scanf}, the presence of @code{-1} after @code{scanf} indicates an
2609 error occurred before the @samp{%n} was reached.
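
The technique of storing @code{-1} beforehand can be sketched as
follows.  The helper name @code{parse_item_count} and the input layout
are assumptions made for this example.

```c
#include <stdio.h>

/* Hypothetical helper: return 1 if S consists of an integer followed
   by the word "items" and nothing else, storing the integer in
   *COUNT.  Returns 0 otherwise.  */
int
parse_item_count (const char *s, int *count)
{
  int end = -1;

  /* If matching fails anywhere before "%n", END keeps its initial
     value of -1.  The return value alone cannot show whether the
     literal text "items" was matched.  */
  sscanf (s, "%d items%n", count, &end);
  return end != -1 && s[end] == '\0';
}
```

Checking the character at offset @code{end} additionally rejects input
with trailing text after the expected pattern.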
2610
2611 Finally, the @samp{%%} conversion matches a literal @samp{%} character
2612 in the input stream, without using an argument.  This conversion does
2613 not permit any flags, field width, or type modifier to be specified.
2614
2615 @node Formatted Input Functions
2616 @subsection Formatted Input Functions
2617
2618 Here are the descriptions of the functions for performing formatted
2619 input.
2620 Prototypes for these functions are in the header file @file{stdio.h}.
2621 @pindex stdio.h
2622
2623 @comment stdio.h
2624 @comment ANSI
2625 @deftypefun int scanf (const char *@var{template}, @dots{})
2626 The @code{scanf} function reads formatted input from the stream
2627 @code{stdin} under the control of the template string @var{template}.
2628 The optional arguments are pointers to the places which receive the
2629 resulting values.
2630
2631 The return value is normally the number of successful assignments.  If
2632 an end-of-file condition is detected before any matches are performed
2633 (including matches against whitespace and literal characters in the
2634 template), then @code{EOF} is returned.
2635 @end deftypefun
2636
2637 @comment stdio.h
2638 @comment ANSI
2639 @deftypefun int fscanf (FILE *@var{stream}, const char *@var{template}, @dots{})
2640 This function is just like @code{scanf}, except that the input is read
2641 from the stream @var{stream} instead of @code{stdin}.
2642 @end deftypefun
2643
2644 @comment stdio.h
2645 @comment ANSI
2646 @deftypefun int sscanf (const char *@var{s}, const char *@var{template}, @dots{})
2647 This is like @code{scanf}, except that the characters are taken from the
2648 null-terminated string @var{s} instead of from a stream.  Reaching the
2649 end of the string is treated as an end-of-file condition.
2650
2651 The behavior of this function is undefined if copying takes place
2652 between objects that overlap---for example, if @var{s} is also given
2653 as an argument to receive a string read under control of the @samp{%s}
2654 conversion.
2655 @end deftypefun
2656
2657 @node Variable Arguments Input
2658 @subsection Variable Arguments Input Functions
2659
2660 The functions @code{vscanf} and friends are provided so that you can
2661 define your own variadic @code{scanf}-like functions that make use of
the same internals as the built-in formatted input functions.
2663 These functions are analogous to the @code{vprintf} series of output
2664 functions.  @xref{Variable Arguments Output}, for important
2665 information on how to use them.
2666
2667 @strong{Portability Note:} The functions listed in this section are GNU
2668 extensions.
2669
2670 @comment stdio.h
2671 @comment GNU
2672 @deftypefun int vscanf (const char *@var{template}, va_list @var{ap})
2673 This function is similar to @code{scanf} except that, instead of taking
2674 a variable number of arguments directly, it takes an argument list
2675 pointer @var{ap} of type @code{va_list} (@pxref{Variadic Functions}).
2676 @end deftypefun
2677
2678 @comment stdio.h
2679 @comment GNU
2680 @deftypefun int vfscanf (FILE *@var{stream}, const char *@var{template}, va_list @var{ap})
2681 This is the equivalent of @code{fscanf} with the variable argument list
2682 specified directly as for @code{vscanf}.
2683 @end deftypefun
2684
2685 @comment stdio.h
2686 @comment GNU
2687 @deftypefun int vsscanf (const char *@var{s}, const char *@var{template}, va_list @var{ap})
2688 This is the equivalent of @code{sscanf} with the variable argument list
2689 specified directly as for @code{vscanf}.
2690 @end deftypefun
2691
2692 In GNU C, there is a special construct you can use to let the compiler
2693 know that a function uses a @code{scanf}-style format string.  Then it
2694 can check the number and types of arguments in each call to the
2695 function, and warn you when they do not match the format string.
2696 @xref{Function Attributes, , Declaring Attributes of Functions,
2697 gcc.info, Using GNU CC}, for details.
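
As a sketch of such a function, the wrapper below forwards its arguments
to @code{vsscanf}.  The names @code{scan_after_colon} and
@code{SCANF_LIKE} are invented for this example, and the attribute is
guarded so that compilers other than GNU C see an empty macro.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Tell GNU C to check calls against the template; other compilers
   see an empty macro.  */
#ifdef __GNUC__
# define SCANF_LIKE(fmt, first) \
  __attribute__ ((format (scanf, fmt, first)))
#else
# define SCANF_LIKE(fmt, first)
#endif

int scan_after_colon (const char *line, const char *template_string, ...)
  SCANF_LIKE (2, 3);

/* Hypothetical helper: skip everything up to and including the first
   colon in LINE, then scan the rest according to TEMPLATE_STRING.
   Returns the sscanf-style result, or EOF if LINE has no colon.  */
int
scan_after_colon (const char *line, const char *template_string, ...)
{
  const char *rest = strchr (line, ':');
  va_list ap;
  int result;

  if (rest == NULL)
    return EOF;

  va_start (ap, template_string);
  result = vsscanf (rest + 1, template_string, ap);
  va_end (ap);
  return result;
}
```

A call such as @code{scan_after_colon ("pos: 3 4", "%d %d", &a, &b)}
would then be checked by the compiler in the same way as a direct call
to @code{sscanf}.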
2698
2699 @node EOF and Errors
2700 @section End-Of-File and Errors
2701
2702 @cindex end of file, on a stream
2703 Many of the functions described in this chapter return the value of the
2704 macro @code{EOF} to indicate unsuccessful completion of the operation.
Since @code{EOF} is used to report both end of file and other errors,
2706 it's often better to use the @code{feof} function to check explicitly
2707 for end of file and @code{ferror} to check for errors.  These functions
2708 check indicators that are part of the internal state of the stream
2709 object, indicators set if the appropriate condition was detected by a
2710 previous I/O operation on that stream.
2711
2712 These symbols are declared in the header file @file{stdio.h}.
2713 @pindex stdio.h
2714
2715 @comment stdio.h
2716 @comment ANSI
2717 @deftypevr Macro int EOF
2718 This macro is an integer value that is returned by a number of functions
2719 to indicate an end-of-file condition, or some other error situation.
2720 With the GNU library, @code{EOF} is @code{-1}.  In other libraries, its
2721 value may be some other negative number.
2722 @end deftypevr
2723
2724 @comment stdio.h
2725 @comment ANSI
2726 @deftypefun void clearerr (FILE *@var{stream})
2727 This function clears the end-of-file and error indicators for the
2728 stream @var{stream}.
2729
2730 The file positioning functions (@pxref{File Positioning}) also clear the
2731 end-of-file indicator for the stream.
2732 @end deftypefun
2733
2734 @comment stdio.h
2735 @comment ANSI
2736 @deftypefun int feof (FILE *@var{stream})
2737 The @code{feof} function returns nonzero if and only if the end-of-file
2738 indicator for the stream @var{stream} is set.
2739 @end deftypefun
2740
2741 @comment stdio.h
2742 @comment ANSI
2743 @deftypefun int ferror (FILE *@var{stream})
2744 The @code{ferror} function returns nonzero if and only if the error
2745 indicator for the stream @var{stream} is set, indicating that an error
2746 has occurred on a previous operation on the stream.
2747 @end deftypefun
2748
2749 In addition to setting the error indicator associated with the stream,
2750 the functions that operate on streams also set @code{errno} in the same
2751 way as the corresponding low-level functions that operate on file
2752 descriptors.  For example, all of the functions that perform output to a
2753 stream---such as @code{fputc}, @code{printf}, and @code{fflush}---are
2754 implemented in terms of @code{write}, and all of the @code{errno} error
2755 conditions defined for @code{write} are meaningful for these functions.
2756 For more information about the descriptor-level I/O functions, see
2757 @ref{Low-Level I/O}.
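
The following sketch shows one way to tell the two conditions apart
after a read loop ends.  The helper name @code{count_remaining} is an
assumption made for this example.

```c
#include <stdio.h>

/* Hypothetical helper: count the characters remaining on FP.
   Returns the count, or -1 if reading stopped because of an error
   rather than end of file.  */
long
count_remaining (FILE *fp)
{
  long n = 0;

  while (getc (fp) != EOF)
    n++;

  /* getc returns EOF in both cases; the stream indicators tell
     them apart.  */
  if (ferror (fp))
    return -1;
  return n;
}
```

After a successful call the end-of-file indicator is still set on the
stream; @code{clearerr} or a file positioning function clears it.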
2758
2759 @node Binary Streams
2760 @section Text and Binary Streams
2761
2762 The GNU system and other POSIX-compatible operating systems organize all
2763 files as uniform sequences of characters.  However, some other systems
2764 make a distinction between files containing text and files containing
2765 binary data, and the input and output facilities of ANSI C provide for
2766 this distinction.  This section tells you how to write programs portable
2767 to such systems.
2768
2769 @cindex text stream
2770 @cindex binary stream
2771 When you open a stream, you can specify either a @dfn{text stream} or a
2772 @dfn{binary stream}.  You indicate that you want a binary stream by
2773 specifying the @samp{b} modifier in the @var{opentype} argument to
2774 @code{fopen}; see @ref{Opening Streams}.  Without this
2775 option, @code{fopen} opens the file as a text stream.
2776
2777 Text and binary streams differ in several ways:
2778
2779 @itemize @bullet
2780 @item
2781 The data read from a text stream is divided into @dfn{lines} which are
2782 terminated by newline (@code{'\n'}) characters, while a binary stream is
2783 simply a long series of characters.  A text stream might on some systems
2784 fail to handle lines more than 254 characters long (including the
2785 terminating newline character).
2786 @cindex lines (in a text file)
2787
2788 @item
2789 On some systems, text files can contain only printing characters,
2790 horizontal tab characters, and newlines, and so text streams may not
2791 support other characters.  However, binary streams can handle any
2792 character value.
2793
2794 @item
2795 Space characters that are written immediately preceding a newline
2796 character in a text stream may disappear when the file is read in again.
2797
2798 @item
2799 More generally, there need not be a one-to-one mapping between
2800 characters that are read from or written to a text stream, and the
2801 characters in the actual file.
2802 @end itemize
2803
2804 Since a binary stream is always more capable and more predictable than a
2805 text stream, you might wonder what purpose text streams serve.  Why not
2806 simply always use binary streams?  The answer is that on these operating
2807 systems, text and binary streams use different file formats, and the
2808 only way to read or write ``an ordinary file of text'' that can work
2809 with other text-oriented programs is through a text stream.
2810
2811 In the GNU library, and on all POSIX systems, there is no difference
2812 between text streams and binary streams.  When you open a stream, you
2813 get the same kind of stream regardless of whether you ask for binary.
2814 This stream can handle any file content, and has none of the
2815 restrictions that text streams sometimes have.
2816
2817 @node File Positioning
2818 @section File Positioning
2819 @cindex file positioning on a stream
2820 @cindex positioning a stream
2821 @cindex seeking on a stream
2822
2823 The @dfn{file position} of a stream describes where in the file the
2824 stream is currently reading or writing.  I/O on the stream advances the
2825 file position through the file.  In the GNU system, the file position is
2826 represented as an integer, which counts the number of bytes from the
2827 beginning of the file.  @xref{File Position}.
2828
2829 During I/O to an ordinary disk file, you can change the file position
2830 whenever you wish, so as to read or write any portion of the file.  Some
2831 other kinds of files may also permit this.  Files which support changing
2832 the file position are sometimes referred to as @dfn{random-access}
2833 files.
2834
2835 You can use the functions in this section to examine or modify the file
2836 position indicator associated with a stream.  The symbols listed below
2837 are declared in the header file @file{stdio.h}.
2838 @pindex stdio.h
2839
2840 @comment stdio.h
2841 @comment ANSI
2842 @deftypefun {long int} ftell (FILE *@var{stream})
2843 This function returns the current file position of the stream
2844 @var{stream}.
2845
2846 This function can fail if the stream doesn't support file positioning,
2847 or if the file position can't be represented in a @code{long int}, and
2848 possibly for other reasons as well.  If a failure occurs, a value of
2849 @code{-1} is returned.
2850 @end deftypefun
2851
2852 @comment stdio.h
2853 @comment ANSI
2854 @deftypefun int fseek (FILE *@var{stream}, long int @var{offset}, int @var{whence})
2855 The @code{fseek} function is used to change the file position of the
2856 stream @var{stream}.  The value of @var{whence} must be one of the
2857 constants @code{SEEK_SET}, @code{SEEK_CUR}, or @code{SEEK_END}, to
2858 indicate whether the @var{offset} is relative to the beginning of the
2859 file, the current file position, or the end of the file, respectively.
2860
2861 This function returns a value of zero if the operation was successful,
2862 and a nonzero value to indicate failure.  A successful call also clears
2863 the end-of-file indicator of @var{stream} and discards any characters
2864 that were ``pushed back'' by the use of @code{ungetc}.
2865
2866 @code{fseek} either flushes any buffered output before setting the file
2867 position or else remembers it so it will be written later in its proper
2868 place in the file.
2869 @end deftypefun
2870
2871 @strong{Portability Note:} In non-POSIX systems, @code{ftell} and
2872 @code{fseek} might work reliably only on binary streams.  @xref{Binary
2873 Streams}.
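
As an illustration of @code{ftell} and @code{fseek} working together,
the sketch below measures a file and then restores the original
position.  The helper name @code{stream_size} is hypothetical, and the
code assumes a system where the file position is a byte count, as on the
GNU system.

```c
#include <stdio.h>

/* Hypothetical helper: return the size in bytes of the file behind
   FP, leaving the file position where it was.  Returns -1 if the
   stream cannot be positioned.  */
long
stream_size (FILE *fp)
{
  long saved = ftell (fp);
  long size;

  if (saved == -1 || fseek (fp, 0L, SEEK_END) != 0)
    return -1;

  /* With SEEK_END and an offset of zero, the position is the end
     of the file, so ftell reports the size.  */
  size = ftell (fp);

  if (fseek (fp, saved, SEEK_SET) != 0)
    return -1;
  return size;
}
```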
2874
2875 The following symbolic constants are defined for use as the @var{whence}
2876 argument to @code{fseek}.  They are also used with the @code{lseek}
2877 function (@pxref{I/O Primitives}) and to specify offsets for file locks
2878 (@pxref{Control Operations}).
2879
2880 @comment stdio.h
2881 @comment ANSI
2882 @deftypevr Macro int SEEK_SET
2883 This is an integer constant which, when used as the @var{whence}
2884 argument to the @code{fseek} function, specifies that the offset
2885 provided is relative to the beginning of the file.
2886 @end deftypevr
2887
2888 @comment stdio.h
2889 @comment ANSI
2890 @deftypevr Macro int SEEK_CUR
2891 This is an integer constant which, when used as the @var{whence}
2892 argument to the @code{fseek} function, specifies that the offset
2893 provided is relative to the current file position.
2894 @end deftypevr
2895
2896 @comment stdio.h
2897 @comment ANSI
2898 @deftypevr Macro int SEEK_END
2899 This is an integer constant which, when used as the @var{whence}
2900 argument to the @code{fseek} function, specifies that the offset
2901 provided is relative to the end of the file.
2902 @end deftypevr
2903
2904 @comment stdio.h
2905 @comment ANSI
2906 @deftypefun void rewind (FILE *@var{stream})
2907 The @code{rewind} function positions the stream @var{stream} at the
beginning of the file.  It is equivalent to calling @code{fseek} on the
2909 @var{stream} with an @var{offset} argument of @code{0L} and a
2910 @var{whence} argument of @code{SEEK_SET}, except that the return
2911 value is discarded and the error indicator for the stream is reset.
2912 @end deftypefun
2913
2914 These three aliases for the @samp{SEEK_@dots{}} constants exist for the
2915 sake of compatibility with older BSD systems.  They are defined in two
2916 different header files: @file{fcntl.h} and @file{sys/file.h}.
2917
2918 @table @code
2919 @comment sys/file.h
2920 @comment BSD
2921 @item L_SET
2922 @vindex L_SET
2923 An alias for @code{SEEK_SET}.
2924
2925 @comment sys/file.h
2926 @comment BSD
2927 @item L_INCR
2928 @vindex L_INCR
2929 An alias for @code{SEEK_CUR}.
2930
2931 @comment sys/file.h
2932 @comment BSD
2933 @item L_XTND
2934 @vindex L_XTND
2935 An alias for @code{SEEK_END}.
2936 @end table
2937
2938 @node Portable Positioning
2939 @section Portable File-Position Functions
2940
2941 On the GNU system, the file position is truly a character count.  You
2942 can specify any character count value as an argument to @code{fseek} and
2943 get reliable results for any random access file.  However, some ANSI C
2944 systems do not represent file positions in this way.
2945
2946 On some systems where text streams truly differ from binary streams, it
2947 is impossible to represent the file position of a text stream as a count
2948 of characters from the beginning of the file.  For example, the file
2949 position on some systems must encode both a record offset within the
2950 file, and a character offset within the record.
2951
2952 As a consequence, if you want your programs to be portable to these
2953 systems, you must observe certain rules:
2954
2955 @itemize @bullet
2956 @item
2957 The value returned from @code{ftell} on a text stream has no predictable
2958 relationship to the number of characters you have read so far.  The only
2959 thing you can rely on is that you can use it subsequently as the
2960 @var{offset} argument to @code{fseek} to move back to the same file
2961 position.
2962
2963 @item
2964 In a call to @code{fseek} on a text stream, either the @var{offset} must
2965 be zero, or @var{whence} must be @code{SEEK_SET} and the
2966 @var{offset} must be the result of an earlier call to @code{ftell} on
2967 the same stream.
2968
2969 @item
2970 The value of the file position indicator of a text stream is undefined
2971 while there are characters that have been pushed back with @code{ungetc}
2972 that haven't been read or discarded.  @xref{Unreading}.
2973 @end itemize
2974
2975 But even if you observe these rules, you may still have trouble for long
2976 files, because @code{ftell} and @code{fseek} use a @code{long int} value
2977 to represent the file position.  This type may not have room to encode
2978 all the file positions in a large file.
2979
2980 So if you do want to support systems with peculiar encodings for the
2981 file positions, it is better to use the functions @code{fgetpos} and
2982 @code{fsetpos} instead.  These functions represent the file position
2983 using the data type @code{fpos_t}, whose internal representation varies
2984 from system to system.
2985
2986 These symbols are declared in the header file @file{stdio.h}.
2987 @pindex stdio.h
2988
2989 @comment stdio.h
2990 @comment ANSI
2991 @deftp {Data Type} fpos_t
2992 This is the type of an object that can encode information about the
2993 file position of a stream, for use by the functions @code{fgetpos} and
2994 @code{fsetpos}.
2995
2996 In the GNU system, @code{fpos_t} is equivalent to @code{off_t} or
2997 @code{long int}.  In other systems, it might have a different internal
2998 representation.
2999 @end deftp
3000
3001 @comment stdio.h
3002 @comment ANSI
3003 @deftypefun int fgetpos (FILE *@var{stream}, fpos_t *@var{position})
3004 This function stores the value of the file position indicator for the
3005 stream @var{stream} in the @code{fpos_t} object pointed to by
3006 @var{position}.  If successful, @code{fgetpos} returns zero; otherwise
3007 it returns a nonzero value and stores an implementation-defined positive
3008 value in @code{errno}.
3009 @end deftypefun
3010
3011 @comment stdio.h
3012 @comment ANSI
3013 @deftypefun int fsetpos (FILE *@var{stream}, const fpos_t *@var{position})
3014 This function sets the file position indicator for the stream @var{stream}
3015 to the position stored in the object pointed to by @var{position}, which must have been set by a previous
3016 call to @code{fgetpos} on the same stream.  If successful, @code{fsetpos}
3017 clears the end-of-file indicator on the stream, discards any characters
3018 that were ``pushed back'' by the use of @code{ungetc}, and returns a value
3019 of zero.  Otherwise, @code{fsetpos} returns a nonzero value and stores
3020 an implementation-defined positive value in @code{errno}.
3021 @end deftypefun
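The following sketch shows one way to use the two functions together:
it reads the next line and then returns the stream to where it was, so
that the same line can be read again.  The helper name @code{peek_line}
is hypothetical; only @code{fgetpos}, @code{fsetpos} and @code{fgets}
come from the library.

```c
#include <stdio.h>

/* Read the next line of STREAM into BUF (at most SIZE - 1 characters)
   without consuming it.  Return BUF, or a null pointer on end of file
   or error.  */
char *
peek_line (FILE *stream, char *buf, int size)
{
  fpos_t mark;

  if (fgetpos (stream, &mark) != 0)     /* save the current position */
    return NULL;

  char *line = fgets (buf, size, stream);

  /* Go back to the saved position.  A successful call also clears the
     end-of-file indicator.  */
  if (fsetpos (stream, &mark) != 0)
    return NULL;

  return line;
}
```

Because the position is kept in an @code{fpos_t} object and never
treated as a number, this approach should also work on systems whose
file positions are not simple character counts.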
3022
3023 @node Stream Buffering
3024 @section Stream Buffering
3025
3026 @cindex buffering of streams
3027 Characters that are written to a stream are normally accumulated and
3028 transmitted asynchronously to the file in a block, instead of appearing
3029 as soon as they are output by the application program.  Similarly,
3030 streams often retrieve input from the host environment in blocks rather
3031 than on a character-by-character basis.  This is called @dfn{buffering}.
3032
3033 If you are writing programs that do interactive input and output using
3034 streams, you need to understand how buffering works when you design the
3035 user interface to your program.  Otherwise, you might find that output
3036 (such as progress or prompt messages) doesn't appear when you intended
3037 it to, or that the program behaves in some other unexpected way.
3038
3039 This section deals only with controlling when characters are transmitted
3040 between the stream and the file or device, and @emph{not} with how
3041 things like echoing, flow control, and the like are handled on specific
3042 classes of devices.  For information on common control operations on
3043 terminal devices, see @ref{Low-Level Terminal Interface}.
3044
3045 You can bypass the stream buffering facilities altogether by using the
3046 low-level input and output functions that operate on file descriptors
3047 instead.  @xref{Low-Level I/O}.
3048
3049 @menu
3050 * Buffering Concepts::          Terminology is defined here.
3051 * Flushing Buffers::            How to ensure that output buffers are flushed.
3052 * Controlling Buffering::       How to specify what kind of buffering to use.
3053 @end menu
3054
3055 @node Buffering Concepts
3056 @subsection Buffering Concepts
3057
3058 There are three different kinds of buffering strategies:
3059
3060 @itemize @bullet
3061 @item
3062 Characters written to or read from an @dfn{unbuffered} stream are
3063 transmitted individually to or from the file as soon as possible.
3064 @cindex unbuffered stream
3065
3066 @item
3067 Characters written to a @dfn{line buffered} stream are transmitted to
3068 the file in blocks when a newline character is encountered.
3069 @cindex line buffered stream
3070
3071 @item
3072 Characters written to or read from a @dfn{fully buffered} stream are
3073 transmitted to or from the file in blocks of arbitrary size.
3074 @cindex fully buffered stream
3075 @end itemize
3076
3077 Newly opened streams are normally fully buffered, with one exception: a
3078 stream connected to an interactive device such as a terminal is
3079 initially line buffered.  @xref{Controlling Buffering}, for information
3080 on how to select a different kind of buffering.  Usually the automatic
3081 selection gives you the most convenient kind of buffering for the file
3082 or device you open.
3083
3084 The use of line buffering for interactive devices implies that output
3085 messages ending in a newline will appear immediately---which is usually
3086 what you want.  Output that doesn't end in a newline might or might not
3087 show up immediately, so if you want it to appear immediately, you
3088 should flush buffered output explicitly with @code{fflush}, as described
3089 in @ref{Flushing Buffers}.
3090
3091 @node Flushing Buffers
3092 @subsection Flushing Buffers
3093
3094 @cindex flushing a stream
3095 @dfn{Flushing} output on a buffered stream means transmitting all
3096 accumulated characters to the file.  There are many circumstances when
3097 buffered output on a stream is flushed automatically:
3098
3099 @itemize @bullet
3100 @item
3101 When you try to do output and the output buffer is full.
3102
3103 @item
3104 When the stream is closed.  @xref{Closing Streams}.
3105
3106 @item
3107 When the program terminates by calling @code{exit}.
3108 @xref{Normal Termination}.
3109
3110 @item
3111 When a newline is written, if the stream is line buffered.
3112
3113 @item
3114 Whenever an input operation on @emph{any} stream actually reads data
3115 from its file.
3116 @end itemize
3117
3118 If you want to flush the buffered output at another time, call
3119 @code{fflush}, which is declared in the header file @file{stdio.h}.
3120 @pindex stdio.h
3121
3122 @comment stdio.h
3123 @comment ANSI
3124 @deftypefun int fflush (FILE *@var{stream})
3125 This function causes any buffered output on @var{stream} to be delivered
3126 to the file.  If @var{stream} is a null pointer, then
3127 @code{fflush} causes buffered output on @emph{all} open output streams
3128 to be flushed.
3129
3130 This function returns @code{EOF} if a write error occurs, or zero
3131 otherwise.
3132 @end deftypefun
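A typical use is a prompt that does not end in a newline.  The sketch
below writes the message and flushes it so that it is delivered before
the program waits for input.  The function name @code{prompt} is
hypothetical.

```c
#include <stdio.h>

/* Write MESSAGE to OUT and flush it so that it is delivered at once,
   even though it does not end in a newline.  Return zero on success
   and EOF on a write error.  */
int
prompt (FILE *out, const char *message)
{
  if (fputs (message, out) == EOF)
    return EOF;
  return fflush (out);
}
```

A caller would normally pass @code{stdout}, for example
@code{prompt (stdout, "Name: ")}, and then read the reply from
@code{stdin}.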
3133
3134 @strong{Compatibility Note:} Some brain-damaged operating systems have
3135 been known to be so thoroughly fixated on line-oriented input and output
3136 that flushing a line buffered stream causes a newline to be written!
3137 Fortunately, this ``feature'' seems to be becoming less common.  You do
3138 not need to worry about this in the GNU system.
3139
3140
3141 @node Controlling Buffering
3142 @subsection Controlling Which Kind of Buffering
3143
3144 After opening a stream (but before any other operations have been
3145 performed on it), you can explicitly specify what kind of buffering you
3146 want it to have using the @code{setvbuf} function.
3147 @cindex buffering, controlling
3148
3149 The facilities listed in this section are declared in the header
3150 file @file{stdio.h}.
3151 @pindex stdio.h
3152
3153 @comment stdio.h
3154 @comment ANSI
3155 @deftypefun int setvbuf (FILE *@var{stream}, char *@var{buf}, int @var{mode}, size_t @var{size})
3156 This function is used to specify that the stream @var{stream} should
3157 have the buffering mode @var{mode}, which can be either @code{_IOFBF}
3158 (for full buffering), @code{_IOLBF} (for line buffering), or
3159 @code{_IONBF} (for unbuffered input/output).
3160
3161 If you specify a null pointer as the @var{buf} argument, then @code{setvbuf}
3162 allocates a buffer itself using @code{malloc}.  This buffer will be freed
3163 when you close the stream.
3164
3165 Otherwise, @var{buf} should be a character array that can hold at least
3166 @var{size} characters.  You should not free the space for this array as
3167 long as the stream remains open and this array remains its buffer.  You
3168 should usually either allocate it statically, or @code{malloc}
3169 (@pxref{Unconstrained Allocation}) the buffer.  Using an automatic array
3170 is not a good idea unless you close the file before exiting the block
3171 that declares the array.
3172
3173 While the array remains a stream buffer, the stream I/O functions will
3174 use the buffer for their internal purposes.  You shouldn't try to access
3175 the values in the array directly while the stream is using it for
3176 buffering.
3177
3178 The @code{setvbuf} function returns zero on success, or a nonzero value
3179 if the value of @var{mode} is not valid or if the request could not
3180 be honored.
3181 @end deftypefun
3182
3183 @comment stdio.h
3184 @comment ANSI
3185 @deftypevr Macro int _IOFBF
3186 The value of this macro is an integer constant expression that can be
3187 used as the @var{mode} argument to the @code{setvbuf} function to
3188 specify that the stream should be fully buffered.
3189 @end deftypevr
3190
3191 @comment stdio.h
3192 @comment ANSI
3193 @deftypevr Macro int _IOLBF
3194 The value of this macro is an integer constant expression that can be
3195 used as the @var{mode} argument to the @code{setvbuf} function to
3196 specify that the stream should be line buffered.
3197 @end deftypevr
3198
3199 @comment stdio.h
3200 @comment ANSI
3201 @deftypevr Macro int _IONBF
3202 The value of this macro is an integer constant expression that can be
3203 used as the @var{mode} argument to the @code{setvbuf} function to
3204 specify that the stream should be unbuffered.
3205 @end deftypevr
3206
3207 @comment stdio.h
3208 @comment ANSI
3209 @deftypevr Macro int BUFSIZ
3210 The value of this macro is an integer constant expression that is good
3211 to use for the @var{size} argument to @code{setvbuf}.  This value is
3212 guaranteed to be at least @code{256}.
3213
3214 The value of @code{BUFSIZ} is chosen on each system so as to make stream
3215 I/O efficient.  So it is a good idea to use @code{BUFSIZ} as the size
3216 for the buffer when you call @code{setvbuf}.
3217
3218 Actually, you can get an even better value to use for the buffer size
3219 by means of the @code{fstat} system call: it is found in the
3220 @code{st_blksize} field of the file attributes.  @xref{Attribute Meanings}.
3221
3222 Sometimes people also use @code{BUFSIZ} as the allocation size of
3223 buffers used for related purposes, such as strings used to receive a
3224 line of input with @code{fgets} (@pxref{Character Input}).  There is no
3225 particular reason to use @code{BUFSIZ} for this instead of any other
3226 integer, except that it might lead to doing I/O in chunks of an
3227 efficient size.
3228 @end deftypevr
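Here is a minimal sketch of two common ways to call @code{setvbuf}
immediately after opening a stream.  The names @code{use_static_buffer},
@code{use_line_buffering} and @code{stream_buffer} are hypothetical.
Note that the static array can serve only one open stream at a time.

```c
#include <stdio.h>

/* Statically allocated, so it outlives any block that opens a stream.
   Only one open stream may use it at a time.  */
static char stream_buffer[BUFSIZ];

/* Make STREAM fully buffered, using the static array above.
   Call this before any other operation on STREAM.  */
int
use_static_buffer (FILE *stream)
{
  return setvbuf (stream, stream_buffer, _IOFBF, sizeof stream_buffer);
}

/* Make STREAM line buffered and let the library allocate the buffer.
   Call this before any other operation on STREAM.  */
int
use_line_buffering (FILE *stream)
{
  return setvbuf (stream, NULL, _IOLBF, BUFSIZ);
}
```

Both functions pass along the return value of @code{setvbuf}, so a
nonzero result means the request was not honored.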
3229
3230 @comment stdio.h
3231 @comment ANSI
3232 @deftypefun void setbuf (FILE *@var{stream}, char *@var{buf})
3233 If @var{buf} is a null pointer, the effect of this function is
3234 equivalent to calling @code{setvbuf} with a @var{mode} argument of
3235 @code{_IONBF}.  Otherwise, it is equivalent to calling @code{setvbuf}
3236 with @var{buf}, a @var{mode} of @code{_IOFBF}, and a @var{size}
3237 argument of @code{BUFSIZ}.
3238
3239 The @code{setbuf} function is provided for compatibility with old code;
3240 use @code{setvbuf} in all new programs.
3241 @end deftypefun
3242
3243 @comment stdio.h
3244 @comment BSD
3245 @deftypefun void setbuffer (FILE *@var{stream}, char *@var{buf}, size_t @var{size})
3246 If @var{buf} is a null pointer, this function makes @var{stream} unbuffered.
3247 Otherwise, it makes @var{stream} fully buffered using @var{buf} as the
3248 buffer.  The @var{size} argument specifies the length of @var{buf}.
3249
3250 This function is provided for compatibility with old BSD code.  Use
3251 @code{setvbuf} instead.
3252 @end deftypefun
3253
3254 @comment stdio.h
3255 @comment BSD
3256 @deftypefun void setlinebuf (FILE *@var{stream})
3257 This function makes @var{stream} be line buffered, and allocates the
3258 buffer for you.
3259
3260 This function is provided for compatibility with old BSD code.  Use
3261 @code{setvbuf} instead.
3262 @end deftypefun
3263
3264 @node Other Kinds of Streams
3265 @section Other Kinds of Streams
3266
3267 The GNU library provides ways for you to define additional kinds of
3268 streams that do not necessarily correspond to an open file.
3269
3270 One such type of stream takes input from or writes output to a string.
3271 These kinds of streams are used internally to implement the
3272 @code{sprintf} and @code{sscanf} functions.  You can also create such a
3273 stream explicitly, using the functions described in @ref{String Streams}.
3274
3275 More generally, you can define streams that do input/output to arbitrary
3276 objects using functions supplied by your program.  This protocol is
3277 discussed in @ref{Custom Streams}.
3278
3279 @strong{Portability Note:} The facilities described in this section are
3280 specific to GNU.  Other systems or C implementations might or might not
3281 provide equivalent functionality.
3282
3283 @menu
3284 * String Streams::              Streams that get data from or put data in
3285                                  a string or memory buffer.
3286 * Obstack Streams::             Streams that store data in an obstack.
3287 * Custom Streams::              Defining your own streams with an arbitrary
3288                                  input data source and/or output data sink.
3289 @end menu
3290
3291 @node String Streams
3292 @subsection String Streams
3293
3294 @cindex stream, for I/O to a string
3295 @cindex string stream
3296 The @code{fmemopen} and @code{open_memstream} functions allow you to do
3297 I/O to a string or memory buffer.  These facilities are declared in
3298 @file{stdio.h}.
3299 @pindex stdio.h
3300
3301 @comment stdio.h
3302 @comment GNU
3303 @deftypefun {FILE *} fmemopen (void *@var{buf}, size_t @var{size}, const char *@var{opentype})
3304 This function opens a stream that allows the access specified by the
3305 @var{opentype} argument, that reads from or writes to the buffer specified
3306 by the argument @var{buf}.  This array must be at least @var{size} bytes long.
3307
3308 If you specify a null pointer as the @var{buf} argument, @code{fmemopen}
3309 dynamically allocates (as with @code{malloc}; @pxref{Unconstrained
3310 Allocation}) an array @var{size} bytes long.  This is really only useful
3311 if you are going to write things to the buffer and then read them back
3312 in again, because you have no way of actually getting a pointer to the
3313 buffer (for this, try @code{open_memstream}, below).  The buffer is
3314 freed when the stream is closed.
3315
3316 The argument @var{opentype} is the same as in @code{fopen}
3317 (@pxref{Opening Streams}).  If the @var{opentype} specifies
3318 append mode, then the initial file position is set to the first null
3319 character in the buffer.  Otherwise the initial file position is at the
3320 beginning of the buffer.
3321
3322 When a stream open for writing is flushed or closed, a null character
3323 (zero byte) is written at the end of the buffer if it fits.  You
3324 should add an extra byte to the @var{size} argument to account for this.
3325 Attempts to write more than @var{size} bytes to the buffer result
3326 in an error.
3327
3328 For a stream open for reading, null characters (zero bytes) in the
3329 buffer do not count as ``end of file''.  Read operations indicate end of
3330 file only when the file position advances past @var{size} bytes.  So, if
3331 you want to read characters from a null-terminated string, you should
3332 supply the length of the string as the @var{size} argument.
3333 @end deftypefun
3334
3335 Here is an example of using @code{fmemopen} to create a stream for
3336 reading from a string:
3337
3338 @smallexample
3339 @include memopen.c.texi
3340 @end smallexample
3341
3342 This program produces the following output:
3343
3344 @smallexample
3345 Got f
3346 Got o
3347 Got o
3348 Got b
3349 Got a
3350 Got r
3351 @end smallexample
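A stream from @code{fmemopen} can also be used for writing.  The sketch
below formats text into a caller-supplied array and relies on the null
character that is stored when the stream is closed.  The function name
@code{format_count} and the output text are hypothetical, and the
feature-test macro at the top is an assumption about how the declaration
is made visible on your system.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* assumed to be needed for fmemopen */
#endif
#include <stdio.h>

/* Format COUNT into BUF, which is SIZE bytes long, through a memory
   stream.  SIZE should leave room for the terminating null character.
   Return 0 on success and -1 on failure.  */
int
format_count (char *buf, size_t size, int count)
{
  FILE *stream = fmemopen (buf, size, "w");
  if (stream == NULL)
    return -1;

  int ok = fprintf (stream, "count=%d", count) >= 0;

  /* Closing the stream flushes the output into BUF and adds the
     null character if it fits.  */
  if (fclose (stream) != 0)
    ok = 0;

  return ok ? 0 : -1;
}
```

After a successful call, @code{buf} can be used as an ordinary
null-terminated string.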
3352
3353 @comment stdio.h
3354 @comment GNU
3355 @deftypefun {FILE *} open_memstream (char **@var{ptr}, size_t *@var{sizeloc})
3356 This function opens a stream for writing to a buffer.  The buffer is
3357 allocated dynamically (as with @code{malloc}; @pxref{Unconstrained
3358 Allocation}) and grown as necessary.
3359
3360 When the stream is closed with @code{fclose} or flushed with
3361 @code{fflush}, the locations @var{ptr} and @var{sizeloc} are updated to
3362 contain the pointer to the buffer and its size.  The values thus stored
3363 remain valid only as long as no further output on the stream takes
3364 place.  If you do more output, you must flush the stream again to store
3365 new values before you use them again.
3366
3367 A null character is written at the end of the buffer.  This null character
3368 is @emph{not} included in the size value stored at @var{sizeloc}.
3369
3370 You can move the stream's file position with @code{fseek} (@pxref{File
3371 Positioning}).  Moving the file position past the end of the data
3372 already written fills the intervening space with zeros.
3373 @end deftypefun
3374
3375 Here is an example of using @code{open_memstream}:
3376
3377 @smallexample
3378 @include memstrm.c.texi
3379 @end smallexample
3380
3381 This program produces the following output:
3382
3383 @smallexample
3384 buf = `hello', size = 5
3385 buf = `hello, world', size = 12
3386 @end smallexample
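Because the buffer grows as needed, @code{open_memstream} is convenient
for building a string whose length is not known in advance.  The
following sketch joins several strings with a separator.  The function
name @code{join_strings} is hypothetical, and the feature-test macro is
an assumption about how the declaration is made visible on your system.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* assumed to be needed for open_memstream */
#endif
#include <stdio.h>
#include <stdlib.h>

/* Join the COUNT strings in ITEMS, separated by ", ".  Return a newly
   allocated null-terminated string that the caller must free, or a
   null pointer on failure.  */
char *
join_strings (const char *const *items, size_t count)
{
  char *result = NULL;
  size_t length = 0;
  FILE *stream = open_memstream (&result, &length);
  if (stream == NULL)
    return NULL;

  for (size_t i = 0; i < count; i++)
    fprintf (stream, "%s%s", i > 0 ? ", " : "", items[i]);

  /* Closing the stream stores the final buffer address in RESULT.  */
  if (fclose (stream) != 0)
    {
      free (result);
      return NULL;
    }
  return result;
}
```

The caller owns the returned buffer once the stream is closed and
releases it with @code{free}.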
3387
3388 @c @group  Invalid outside @example.
3389 @node Obstack Streams
3390 @subsection Obstack Streams
3391
3392 You can open an output stream that puts its data in an obstack.
3393 @xref{Obstacks}.
3394
3395 @comment stdio.h
3396 @comment GNU
3397 @deftypefun {FILE *} open_obstack_stream (struct obstack *@var{obstack})
3398 This function opens a stream for writing data into the obstack @var{obstack}.
3399 This starts an object in the obstack and makes it grow as data is
3400 written (@pxref{Growing Objects}).
3401 @c @end group  Doubly invalid because not nested right.
3402
3403 Calling @code{fflush} on this stream updates the current size of the
3404 object to match the amount of data that has been written.  After a call
3405 to @code{fflush}, you can examine the object temporarily.
3406
3407 You can move the file position of an obstack stream with @code{fseek}
3408 (@pxref{File Positioning}).  Moving the file position past the end of
3409 the data written fills the intervening space with zeros.
3410
3411 To make the object permanent, update the obstack with @code{fflush}, and
3412 then use @code{obstack_finish} to finalize the object and get its address.
3413 The following write to the stream starts a new object in the obstack,
3414 and later writes add to that object until you do another @code{fflush}
3415 and @code{obstack_finish}.
3416
3417 But how do you find out how long the object is?  You can get the length
3418 in bytes by calling @code{obstack_object_size} (@pxref{Status of an
3419 Obstack}), or you can null-terminate the object like this:
3420
3421 @smallexample
3422 obstack_1grow (@var{obstack}, 0);
3423 @end smallexample
3424
3425 Whichever one you do, you must do it @emph{before} calling
3426 @code{obstack_finish}.  (You can do both if you wish.)
3427 @end deftypefun
3428
3429 Here is a sample function that uses @code{open_obstack_stream}:
3430
3431 @smallexample
3432 char *
3433 make_message_string (const char *a, int b)
3434 @{
3435   FILE *stream = open_obstack_stream (&message_obstack);
3436   output_task (stream);
3437   fprintf (stream, ": ");
3438   fprintf (stream, a, b);
3439   fprintf (stream, "\n");
3440   fclose (stream);
3441   obstack_1grow (&message_obstack, 0);
3442   return obstack_finish (&message_obstack);
3443 @}
3444 @end smallexample
3445
3446 @node Custom Streams
3447 @subsection Programming Your Own Custom Streams
3448 @cindex custom streams
3449 @cindex programming your own streams
3450
3451 This section describes how you can make a stream that gets input from an
3452 arbitrary data source or writes output to an arbitrary data sink
3453 programmed by you.  We call these @dfn{custom streams}.
3454
3455 @c !!! this does not talk at all about the higher-level hooks
3456
3457 @menu
3458 * Streams and Cookies::         The @dfn{cookie} records where to fetch or
3459                                  store data that is read or written.
3460 * Hook Functions::              How you should define the four @dfn{hook
3461                                  functions} that a custom stream needs.
3462 @end menu
3463
3464 @node Streams and Cookies
3465 @subsubsection Custom Streams and Cookies
3466 @cindex cookie, for custom stream
3467
3468 Inside every custom stream is a special object called the @dfn{cookie}.
3469 This is an object supplied by you which records where to fetch or store
3470 the data read or written.  It is up to you to define a data type to use
3471 for the cookie.  The stream functions in the library never refer
3472 directly to its contents, and they don't even know what the type is;
3473 they record its address with type @code{void *}.
3474
3475 To implement a custom stream, you must specify @emph{how} to fetch or
3476 store the data in the specified place.  You do this by defining
3477 @dfn{hook functions} to read, write, change ``file position'', and close
3478 the stream.  All four of these functions will be passed the stream's
3479 cookie so they can tell where to fetch or store the data.  The library
3480 functions don't know what's inside the cookie, but your functions will
3481 know.
3482
3483 When you create a custom stream, you must specify the cookie pointer,
3484 and also the four hook functions stored in a structure of type
3485 @code{cookie_io_functions_t}.
3486
3487 These facilities are declared in @file{stdio.h}.
3488 @pindex stdio.h
3489
3490 @comment stdio.h
3491 @comment GNU
3492 @deftp {Data Type} {cookie_io_functions_t}
3493 This is a structure type that holds the functions that define the
3494 communications protocol between the stream and its cookie.  It has
3495 the following members:
3496
3497 @table @code
3498 @item cookie_read_function_t *read
3499 This is the function that reads data from the cookie.  If the value is a
3500 null pointer instead of a function, then read operations on ths stream
3501 always return @code{EOF}.
3502
3503 @item cookie_write_function_t *write
3504 This is the function that writes data to the cookie.  If the value is a
3505 null pointer instead of a function, then data written to the stream is
3506 discarded.
3507
3508 @item cookie_seek_function_t *seek
3509 This is the function that performs the equivalent of file positioning on
3510 the cookie.  If the value is a null pointer instead of a function, calls
3511 to @code{fseek} on this stream can only seek to locations within the
3512 buffer; any attempt to seek outside the buffer will return an
3513 @code{ESPIPE} error.
3514
3515 @item cookie_close_function_t *close
3516 This function performs any appropriate cleanup on the cookie when
3517 closing the stream.  If the value is a null pointer instead of a
3518 function, nothing special is done to close the cookie when the stream is
3519 closed.
3520 @end table
3521 @end deftp
3522
3523 @comment stdio.h
3524 @comment GNU
3525 @deftypefun {FILE *} fopencookie (void *@var{cookie}, const char *@var{opentype}, cookie_io_functions_t @var{io-functions})
3526 This function actually creates the stream for communicating with the
3527 @var{cookie} using the functions in the @var{io-functions} argument.
3528 The @var{opentype} argument is interpreted as for @code{fopen};
3529 see @ref{Opening Streams}.  (But note that the ``truncate on
3530 open'' option is ignored.)  The new stream is fully buffered.
3531
3532 The @code{fopencookie} function returns the newly created stream, or a null
3533 pointer in case of an error.
3534 @end deftypefun
3535
3536 @node Hook Functions
3537 @subsubsection Custom Stream Hook Functions
3538 @cindex hook functions (of custom streams)
3539
3540 Here are more details on how you should define the four hook functions
3541 that a custom stream needs.
3542
3543 You should define the function to read data from the cookie as:
3544
3545 @smallexample
3546 ssize_t @var{reader} (void *@var{cookie}, void *@var{buffer}, size_t @var{size})
3547 @end smallexample
3548
3549 This is very similar to the @code{read} function; see @ref{I/O
3550 Primitives}.  Your function should transfer up to @var{size} bytes into
3551 the @var{buffer}, and return the number of bytes read, or zero to
3552 indicate end-of-file.  You can return a value of @code{-1} to indicate
3553 an error.
3554
3555 You should define the function to write data to the cookie as:
3556
3557 @smallexample
3558 ssize_t @var{writer} (void *@var{cookie}, const void *@var{buffer}, size_t @var{size})
3559 @end smallexample
3560
3561 This is very similar to the @code{write} function; see @ref{I/O
3562 Primitives}.  Your function should transfer up to @var{size} bytes from
3563 the buffer, and return the number of bytes written.  You can return a
3564 value of @code{-1} to indicate an error.
3565
3566 You should define the function to perform seek operations on the cookie
3567 as:
3568
3569 @smallexample
3570 int @var{seeker} (void *@var{cookie}, fpos_t *@var{position}, int @var{whence})
3571 @end smallexample
3572
3573 For this function, the @var{position} and @var{whence} arguments are
3574 interpreted as for @code{fgetpos}; see @ref{Portable Positioning}.  In
3575 the GNU library, @code{fpos_t} is equivalent to @code{off_t} or
3576 @code{long int}, and simply represents the number of bytes from the
3577 beginning of the file.
3578
3579 After doing the seek operation, your function should store the resulting
3580 file position relative to the beginning of the file in @var{position}.
3581 Your function should return a value of @code{0} on success and @code{-1}
3582 to indicate an error.
3583
3584 You should define the function to do cleanup operations on the cookie
3585 appropriate for closing the stream as:
3586
3587 @smallexample
3588 int @var{cleaner} (void *@var{cookie})
3589 @end smallexample
3590
3591 Your function should return @code{-1} to indicate an error, and @code{0}
3592 otherwise.
3593
3594 @comment stdio.h
3595 @comment GNU
3596 @deftp {Data Type} cookie_read_function_t
3597 This is the data type that the read function for a custom stream should have.
3598 If you declare the function as shown above, this is the type it will have.
3599 @end deftp
3600
3601 @comment stdio.h
3602 @comment GNU
3603 @deftp {Data Type} cookie_write_function_t
3604 The data type of the write function for a custom stream.
3605 @end deftp
3606
3607 @comment stdio.h
3608 @comment GNU
3609 @deftp {Data Type} cookie_seek_function_t
3610 The data type of the seek function for a custom stream.
3611 @end deftp
3612
3613 @comment stdio.h
3614 @comment GNU
3615 @deftp {Data Type} cookie_close_function_t
3616 The data type of the close function for a custom stream.
3617 @end deftp
3618
3619 @ignore
3620 Roland says:
3621
3622 @quotation
3623 There is another set of functions one can give a stream, the
3624 input-room and output-room functions.  These functions must
3625 understand stdio internals.  To describe how to use these
3626 functions, you also need to document lots of how stdio works
3627 internally (which isn't relevant for other uses of stdio).
3628 Perhaps I can write an interface spec from which you can write
3629 good documentation.  But it's pretty complex and deals with lots
3630 of nitty-gritty details.  I think it might be better to let this
3631 wait until the rest of the manual is more done and polished.
3632 @end quotation
3633 @end ignore
3634
3635 @c ??? This section could use an example.
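As a minimal sketch of a custom stream, the code below defines a
write-only stream that stores nothing and merely counts the bytes and
newlines written to it.  The names @code{struct counter},
@code{counter_write} and @code{open_counter_stream} are hypothetical.
The write hook is declared with a @code{const char *} buffer, which is an
assumption based on current GNU headers; the read, seek and close hooks
are left as null pointers.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* fopencookie is a GNU extension */
#endif
#include <stdio.h>
#include <sys/types.h>

/* The cookie: running totals for everything written to the stream.  */
struct counter
{
  size_t bytes;
  size_t lines;
};

/* Write hook: count the data instead of storing it anywhere.  */
static ssize_t
counter_write (void *cookie, const char *buffer, size_t size)
{
  struct counter *c = cookie;

  for (size_t i = 0; i < size; i++)
    if (buffer[i] == '\n')
      c->lines++;
  c->bytes += size;
  return (ssize_t) size;        /* report that all bytes were accepted */
}

/* Reset C and open a write-only stream that updates it.  C must stay
   valid until the stream is closed.  */
FILE *
open_counter_stream (struct counter *c)
{
  cookie_io_functions_t hooks =
    { .read = NULL, .write = counter_write, .seek = NULL, .close = NULL };

  c->bytes = 0;
  c->lines = 0;
  return fopencookie (c, "w", hooks);
}
```

Because the new stream is fully buffered, the totals in the cookie are
only up to date after the stream has been flushed or closed.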