So far, our programs have used only formatted IO operations. The input and output operators (<< and >>) format the data they read or write according to the type being handled. The input operators ignore whitespace; the output operators apply padding, precision, and so on.
The library also provides a set of low-level operations that support unformatted IO. These operations let us deal with a stream as a sequence of uninterpreted bytes.
Several of the unformatted operations deal with a stream one byte at a time. These operations, which are described in Table 17.19, read rather than ignore whitespace. For example, we can use the unformatted IO operations get and put to read and write the characters one at a time:
char ch;
while (cin.get(ch))
    cout.put(ch);
This program preserves the whitespace in the input. Its output is identical to the input. It executes the same way as the previous program that used noskipws.
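To make the contrast with the formatted operators concrete, the following sketch copies the same text twice, once with get and put and once with >>. The function names copy_unformatted and copy_formatted are hypothetical helpers introduced only for this illustration, and string streams stand in for cin and cout so that the result can be inspected.

```cpp
#include <sstream>
#include <string>

// Copy every character, whitespace included, using unformatted get and put.
std::string copy_unformatted(const std::string& text) {
    std::istringstream in(text);
    std::ostringstream out;
    char ch;
    while (in.get(ch))   // get(ch) returns the stream, which tests false at end-of-file
        out.put(ch);
    return out.str();
}

// Copy using the formatted >> operator, which skips whitespace by default.
std::string copy_formatted(const std::string& text) {
    std::istringstream in(text);
    std::ostringstream out;
    char ch;
    while (in >> ch)
        out.put(ch);
    return out.str();
}
```

Given the input "a b\n c", copy_unformatted should return the text unchanged, whereas copy_formatted should return "abc" because the spaces and the newline are skipped.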
Sometimes we need to read a character in order to know that we aren’t ready for it. In such cases, we’d like to put the character back onto the stream. The library gives us three ways to do so, each of which has subtle differences from the others:
• peek returns a copy of the next character on the input stream but does not change the stream. The value returned by peek stays on the stream.
• unget backs up the input stream so that whatever value was last returned is still on the stream. We can call unget even if we do not know what value was last taken from the stream.
• putback is a more specialized version of unget: It puts the last value read back on the stream, but takes an argument that must be the same as the value that was last read.
In general, we are guaranteed to be able to put back at most one value before the next read. That is, we are not guaranteed to be able to call putback or unget successively without an intervening read operation.
int Return Values from Input Operations
The peek function and the version of get that takes no argument return a character from the input stream as an int. This fact can be surprising; it might seem more natural to have these functions return a char.
The reason that these functions return an int is to allow them to return an end-of-file marker. A given character set is allowed to use every value in the char range to represent an actual character. Thus, there is no extra value in that range to use to represent end-of-file.
The functions that return int convert the character they return to unsigned char and then promote that value to int. As a result, even if the character set has characters that map to negative values, the int returned from these operations will be a nonnegative value (§ 2.1.2, p. 35). The library uses a negative value to represent end-of-file, which is thus guaranteed to be distinct from any legitimate character value. Rather than requiring us to know the actual value returned, the cstdio header defines a macro named EOF that we can use to test whether the value returned from get is end-of-file. It is essential that we use an int to hold the return from these functions:
int ch; // use an int, not a char, to hold the return from get()
// loop to read and write all the data in the input
while ((ch = cin.get()) != EOF)
    cout.put(ch);
This program operates identically to the one on page 761, the only difference being the version of get that is used to read the input.
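The conversion through unsigned char can be observed directly. The sketch below records the int values returned by get for each byte of a string; read_codes is a hypothetical helper, and the expectation that the byte written as '\xFF' comes back as 255 assumes the usual 8-bit char.

```cpp
#include <cstdio>
#include <sstream>
#include <string>
#include <vector>

// Return the int value that get() reports for every byte in the input.
std::vector<int> read_codes(const std::string& bytes) {
    std::istringstream in(bytes);
    std::vector<int> codes;
    int ch;                              // int, not char, so EOF stays distinct
    while ((ch = in.get()) != EOF)
        codes.push_back(ch);             // always in the unsigned char range
    return codes;
}
```

Calling read_codes(std::string("A\xFF")) should produce two nonnegative values, the second being 255 rather than a negative number, so neither can be confused with EOF. Had ch been declared as a plain char on a platform where char is signed, the byte 0xFF could compare equal to EOF and end the loop early.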
Some unformatted IO operations deal with chunks of data at a time. These operations can be important if speed is an issue, but like other low-level operations, they are error-prone. In particular, these operations require us to allocate and manage the character arrays (§ 12.2, p. 476) used to store and retrieve data. The multi-byte operations are listed in Table 17.20.
The get and getline functions take the same parameters, and their actions are similar but not identical. In each case, sink is a char array into which the data are placed. The functions read until one of the following conditions occurs:
• size - 1 characters are read
• End-of-file is encountered
• The delimiter character is encountered
The difference between these functions is the treatment of the delimiter: get leaves the delimiter as the next character of the istream, whereas getline reads and discards the delimiter. In either case, the delimiter is not stored in sink.
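The different treatment of the delimiter can be checked by looking at what remains on the stream after each call. In this sketch, ReadResult, read_with_get, and read_with_getline are hypothetical names, the array size of 16 is arbitrary, and the input is assumed to contain at least one character before the delimiter.

```cpp
#include <sstream>
#include <string>

// What ended up in sink, and the next character still on the stream.
struct ReadResult {
    std::string stored;
    int next;
};

// get stops at the delimiter and leaves it on the stream.
ReadResult read_with_get(const std::string& text) {
    std::istringstream in(text);
    char sink[16];
    in.get(sink, sizeof sink, '\n');      // reads at most 15 characters
    return {sink, in.peek()};
}

// getline stops at the delimiter, then reads and discards it.
ReadResult read_with_getline(const std::string& text) {
    std::istringstream in(text);
    char sink[16];
    in.getline(sink, sizeof sink, '\n');  // reads at most 15 characters
    return {sink, in.peek()};
}
```

For the input "ab\ncd", both functions should store "ab" in sink. After get the next character on the stream should be the newline, whereas after getline it should be 'c'.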
Several of the read operations read an unknown number of bytes from the input. We can call gcount to determine how many characters the last unformatted input operation read. It is essential to call gcount before any intervening unformatted input operation. In particular, the single-character operations that put characters back on the stream are also unformatted input operations. If peek, unget, or putback is called before calling gcount, then the return value will be 0.
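A short sketch of why the timing of the gcount call matters. The helper names and the 32-byte buffer are assumptions made for this illustration; the requested count is clamped so that read never writes past the buffer.

```cpp
#include <algorithm>
#include <ios>
#include <sstream>
#include <string>

// Call gcount immediately after read: reports the bytes actually read.
std::streamsize count_after_read(const std::string& text, std::streamsize want) {
    std::istringstream in(text);
    char buf[32];
    want = std::min<std::streamsize>(want, static_cast<std::streamsize>(sizeof buf));
    in.read(buf, want);                  // may read fewer bytes if input runs out
    return in.gcount();
}

// Call unget between read and gcount: the count from read is lost.
std::streamsize count_after_unget(const std::string& text, std::streamsize want) {
    std::istringstream in(text);
    char buf[32];
    want = std::min<std::streamsize>(want, static_cast<std::streamsize>(sizeof buf));
    in.read(buf, want);
    in.unget();                          // itself an unformatted input operation
    return in.gcount();
}
```

With the input "hello" and a request for 3 bytes, count_after_read should return 3 and count_after_unget should return 0. Asking for 10 bytes from the two-character input "hi" should make count_after_read return 2, which is how a caller learns that read stopped short.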