17.5.2. Unformatted Input/Output Operations

So far, our programs have used only formatted IO operations. The input and output operators (>> and <<) format the data they read or write according to the type being handled. By default, the input operators skip whitespace; the output operators apply padding, precision, and so on.

The library also provides a set of low-level operations that support unformatted IO. These operations let us deal with a stream as a sequence of uninterpreted bytes.

Single-Byte Operations

Several of the unformatted operations deal with a stream one byte at a time. These operations, which are described in Table 17.19, read rather than ignore whitespace. For example, we can use the unformatted IO operations get and put to read and write the characters one at a time:

char ch;
while (cin.get(ch))
    cout.put(ch);

This program preserves the whitespace in the input. Its output is identical to the input. It executes the same way as the previous program that used noskipws.
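To see the contrast with the formatted operators, here is a minimal sketch that runs both styles over the same in-memory input. It uses string streams in place of cin and cout so that the result can be inspected; the function names copy_unformatted, copy_formatted, and demo_copy are hypothetical, chosen only for this illustration.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Copies `input` one byte at a time with the unformatted get/put pair.
std::string copy_unformatted(const std::string& input) {
    std::istringstream in(input);
    std::ostringstream out;
    char ch;
    while (in.get(ch))      // get(ch) treats whitespace like any other byte
        out.put(ch);
    return out.str();
}

// Copies `input` with the formatted operators; >> skips whitespace by default.
std::string copy_formatted(const std::string& input) {
    std::istringstream in(input);
    std::ostringstream out;
    char ch;
    while (in >> ch)
        out << ch;
    return out.str();
}

// Hypothetical self-check showing the difference on one sample line.
void demo_copy() {
    const std::string line = "a b\tc\n";
    assert(copy_unformatted(line) == line);   // whitespace preserved
    assert(copy_formatted(line) == "abc");    // whitespace dropped
}
```

Calling demo_copy() from main exercises both versions; only the unformatted copy reproduces the space, tab, and newline.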

Table 17.19. Single-Byte Low-Level IO Operations

Putting Back onto an Input Stream

Sometimes we need to read a character in order to know that we aren’t ready for it. In such cases, we’d like to put the character back onto the stream. The library gives us three ways to do so, each of which has subtle differences from the others:

• peek returns a copy of the next character on the input stream but does not change the stream. The value returned by peek stays on the stream.

• unget backs up the input stream so that whatever value was last returned is still on the stream. We can call unget even if we do not know what value was last taken from the stream.

• putback is a more specialized version of unget: It puts the last value read back onto the stream but takes an argument that must be the same as the one that was last read.

In general, we are guaranteed to be able to put back at most one value before the next read. That is, we are not guaranteed to be able to call putback or unget successively without an intervening read operation.
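As an illustration of these operations, the following sketch reads a run of digits and leaves the first nondigit on the stream, once with peek and once with get followed by unget. The helper names read_digits_peek, read_digits_unget, and demo_put_back are hypothetical, and the example assumes a stream (here a string stream) that supports backing up by one character.

```cpp
#include <cassert>
#include <cctype>
#include <istream>
#include <sstream>
#include <string>

// Look before reading: peek returns the next character as an int without
// removing it, so the first nondigit is never taken off the stream.
std::string read_digits_peek(std::istream& in) {
    std::string digits;
    for (int next = in.peek();
         next != std::char_traits<char>::eof() && std::isdigit(next);
         next = in.peek())
        digits += static_cast<char>(in.get());
    return digits;
}

// Read first, then back up: once a nondigit has been read, unget makes it
// the next character again. Only one step back is guaranteed to work.
std::string read_digits_unget(std::istream& in) {
    std::string digits;
    char ch;
    while (in.get(ch)) {
        if (!std::isdigit(static_cast<unsigned char>(ch))) {
            in.unget();   // in.putback(ch) would do the same here
            break;
        }
        digits += ch;
    }
    return digits;
}

// Hypothetical self-check: both versions leave 'a' as the next character.
void demo_put_back() {
    std::istringstream a("123abc"), b("123abc");
    assert(read_digits_peek(a) == "123");
    assert(read_digits_unget(b) == "123");
    assert(a.get() == 'a' && b.get() == 'a');
}
```

The peek version never removes the character it is not ready for; the unget version removes it and then restores it, which is why it depends on the one-character guarantee.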

int Return Values from Input Operations

The peek function and the version of get that takes no argument return a character from the input stream as an int. This fact can be surprising; it might seem more natural to have these functions return a char.

The reason that these functions return an int is to allow them to return an end-of-file marker. A given character set is allowed to use every value in the char range to represent an actual character. Thus, there is no extra value in that range to use to represent end-of-file.

The functions that return int convert the character they return to unsigned char and then promote that value to int. As a result, even if the character set has characters that map to negative values, the int returned from these operations will be a nonnegative value (§ 2.1.2, p. 35). The library uses a negative value to represent end-of-file, which is thus guaranteed to be distinct from any legitimate character value. Rather than requiring us to know the actual value returned, the cstdio header defines a macro named EOF that we can use to test whether the value returned from get is end-of-file. It is essential that we use an int to hold the return from these functions:

int ch;    // use an int, not a char, to hold the return from get()
// loop to read and write all the data in the input
while ((ch = cin.get()) != EOF)
    cout.put(ch);

This program operates identically to the earlier one that called get(ch), the only difference being the version of get that is used to read the input.
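The following sketch shows why the int matters. It feeds get a byte with the value 0xFF, which a signed char would hold as a negative number that can compare equal to EOF on implementations where EOF is -1. The function names count_bytes, first_byte_as_int, and demo_int_return are hypothetical, and the value 255 assumes 8-bit bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>     // defines EOF
#include <sstream>
#include <string>

// Counts the bytes in `input` using the no-argument get().
std::size_t count_bytes(const std::string& input) {
    std::istringstream in(input);
    std::size_t n = 0;
    int ch;                          // int, so EOF stays distinguishable
    while ((ch = in.get()) != EOF)
        ++n;
    return n;
}

// Returns the int that get() yields for the first byte of `input`,
// or EOF if `input` is empty.
int first_byte_as_int(const std::string& input) {
    std::istringstream in(input);
    return in.get();
}

// Hypothetical self-check, assuming 8-bit bytes.
void demo_int_return() {
    const std::string data("a\xFFz");
    assert(count_bytes(data) == 3);             // 0xFF does not end the loop
    assert(first_byte_as_int("\xFF") == 255);   // converted via unsigned char
    assert(first_byte_as_int("") == EOF);       // nothing to read
}
```

Had ch been declared as a char, the comparison with EOF could succeed on the 0xFF byte (where char is signed) or never succeed at all (where char is unsigned), so the loop would either stop early or never terminate.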

Multi-Byte Operations

Some unformatted IO operations deal with chunks of data at a time. These operations can be important if speed is an issue, but like other low-level operations, they are error-prone. In particular, these operations require us to allocate and manage the character arrays (§ 12.2, p. 476) used to store and retrieve data. The multi-byte operations are listed in Table 17.20.

Table 17.20. Multi-Byte Low-Level IO Operations


The get and getline functions take the same parameters, and their actions are similar but not identical. In each case, sink is a char array into which the data are placed. The functions read until one of the following conditions occurs:

• size - 1 characters are read

• End-of-file is encountered

• The delimiter character is encountered

The difference between these functions is the treatment of the delimiter: get leaves the delimiter as the next character of the istream, whereas getline reads and discards the delimiter. In either case, the delimiter is not stored in sink.


Warning

It is a common error to intend to remove the delimiter from the stream but to forget to do so.
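The difference in how the delimiter is handled can be sketched as follows. The function names and the 64-byte buffer are assumptions made for this illustration; every line in the input is assumed to be nonempty and shorter than the buffer, and the delimiter is the default newline.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// getline reads and discards the newline, so each call starts on a new line.
std::vector<std::string> lines_with_getline(const std::string& input) {
    std::istringstream in(input);
    std::vector<std::string> lines;
    char sink[64];                    // assumed large enough for each line
    while (in.getline(sink, sizeof sink))
        lines.emplace_back(sink);
    return lines;
}

// get leaves the newline on the stream. When discard_delim is false we
// reproduce the common error: the next get finds the newline at once,
// stores no characters, and fails.
std::vector<std::string> lines_with_get(const std::string& input,
                                        bool discard_delim) {
    std::istringstream in(input);
    std::vector<std::string> lines;
    char sink[64];
    while (in.get(sink, sizeof sink)) {
        lines.emplace_back(sink);
        if (discard_delim)
            in.ignore();              // remove the delimiter ourselves
    }
    return lines;
}

// Hypothetical self-check on a two-line input.
void demo_delimiter() {
    const std::string text = "one\ntwo\n";
    const std::vector<std::string> expected{"one", "two"};
    assert(lines_with_getline(text) == expected);
    assert(lines_with_get(text, true) == expected);
    assert(lines_with_get(text, false).size() == 1);   // stuck at first '\n'
}
```

In this sketch the forgotten delimiter shows up as a loop that stops after the first line rather than as an obvious error, which is part of what makes the mistake easy to miss.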


Determining How Many Characters Were Read

Several of the read operations read an unknown number of bytes from the input. We can call gcount to determine how many characters the last unformatted input operation read. It is essential to call gcount before any intervening unformatted input operation. In particular, the single-character operations that put characters back on the stream are also unformatted input operations. If peek, unget, or putback is called before gcount, then gcount returns 0.
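A short sketch of this rule, using the read operation to fetch a block of bytes: the names read_chunk, gcount_after_peek, and demo_gcount are hypothetical, and a string stream stands in for a real input source.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Reads up to `size` bytes into `buf` and reports how many actually arrived.
std::streamsize read_chunk(std::istream& in, char* buf, std::streamsize size) {
    in.read(buf, size);     // may reach end-of-file before filling buf
    return in.gcount();     // ask at once, before any other unformatted input
}

// Same read, but peek intervenes, so gcount now describes peek instead.
std::streamsize gcount_after_peek(std::istream& in, char* buf,
                                  std::streamsize size) {
    in.read(buf, size);
    in.peek();              // also an unformatted input operation
    return in.gcount();
}

// Hypothetical self-check.
void demo_gcount() {
    char buf[8];
    std::istringstream a("abc"), b("abcdef");
    assert(read_chunk(a, buf, sizeof buf) == 3);   // short read: 3 of 8 bytes
    assert(gcount_after_peek(b, buf, 4) == 0);     // count lost to peek
}
```

Saving the result of gcount in a variable immediately after the read, as read_chunk does by returning it, avoids losing the count to a later call.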
