The various stream types generally support random access to the data in their associated stream. We can reposition the stream so that it skips around, reading first the last line, then the first, and so on. The library provides a pair of functions to seek to a given location and to tell the current location in the associated stream.
Random IO is inherently system-dependent. To understand how to use these features, you must consult your system’s documentation.
Although these seek and tell functions are defined for all the stream types, whether they do anything useful depends on the device to which the stream is bound. On most systems, the streams bound to cin, cout, cerr, and clog do not support random access; after all, what would it mean to jump back ten places when we’re writing directly to cout? We can call the seek and tell functions, but these functions will fail at run time, leaving the stream in an invalid state.
Exercise 17.37: Use the unformatted version of getline to read a file a line at a time. Test your program by giving it a file that contains empty lines as well as lines that are longer than the character array that you pass to getline.
Exercise 17.38: Extend your program from the previous exercise to print each word you read onto its own line.
Because the istream and ostream types usually do not support random access, the remainder of this section should be considered applicable only to the fstream and sstream types.
To support random access, the IO types maintain a marker that determines where the next read or write will happen. They also provide two functions: One repositions the marker by seeking to a given position; the second tells us the current position of the marker. The library actually defines two pairs of seek and tell functions, which are described in Table 17.21. One pair is used by input streams, the other by output streams. The input and output versions are distinguished by a suffix that is either a g or a p. The g versions indicate that we are “getting” (reading) data, and the p functions indicate that we are “putting” (writing) data.
Logically enough, we can use only the g versions on an istream and on the types ifstream and istringstream that inherit from istream (§ 8.1, p. 311). We can use only the p versions on an ostream and on the types that inherit from it, ofstream and ostringstream. An iostream, fstream, or stringstream can both read and write the associated stream; we can use either the g or p versions on objects of these types.
The fact that the library distinguishes between the “putting” and “getting” versions of the seek and tell functions can be misleading. Even though the library makes this distinction, it maintains only a single marker in a stream—there is not a distinct read marker and write marker.
When we’re dealing with an input-only or output-only stream, the distinction isn’t even apparent. We can use only the g or only the p versions on such streams. If we attempt to call tellp on an ifstream, the compiler will complain. Similarly, it will not let us call seekg on an ostringstream.
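The compile-time restriction described above can be checked without writing code that fails to build. The following is a minimal sketch, assuming a C++11 compiler; has_tellp and has_seekg are hypothetical helper templates written for this illustration, not library facilities.

```cpp
#include <fstream>
#include <sstream>
#include <type_traits>
#include <utility>

// Hypothetical detection helpers (not part of the library):
// each is true only when the named member call compiles for T.
template <class T, class = void>
struct has_tellp : std::false_type {};

template <class T>
struct has_tellp<T, decltype(void(std::declval<T&>().tellp()))>
    : std::true_type {};

template <class T, class = void>
struct has_seekg : std::false_type {};

template <class T>
struct has_seekg<T, decltype(void(std::declval<T&>().seekg(0)))>
    : std::true_type {};

// An input-only stream has no p functions.
static_assert(!has_tellp<std::ifstream>::value,
              "ifstream should not provide tellp");
// An output-only stream has no g functions.
static_assert(!has_seekg<std::ostringstream>::value,
              "ostringstream should not provide seekg");
// A stream that reads and writes provides both families.
static_assert(has_tellp<std::fstream>::value && has_seekg<std::fstream>::value,
              "fstream should provide both tellp and seekg");
```

If any of these assumptions were wrong, the translation unit would fail to compile with the corresponding message.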
The fstream and stringstream types can read and write the same stream. In these types there is a single buffer that holds data to be read and written and a single marker denoting the current position in the buffer. The library maps both the g and p positions to this single marker.
Because there is only a single marker, we must do a seek to reposition the marker whenever we switch between reading and writing.
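As a rough illustration of seeking at every switch between reading and writing, the sketch below writes a line to a scratch file, reads part of it back, overwrites that part, and rereads the result. The function name read_then_overwrite and the file name passed to it are hypothetical; trunc and binary modes are used only so that the example is repeatable and the offsets are plain byte counts.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical helper: demonstrates a seek at each read/write switch.
std::string read_then_overwrite(const std::string& path)
{
    // trunc creates (or empties) the scratch file
    std::fstream io(path, std::fstream::in | std::fstream::out |
                          std::fstream::trunc | std::fstream::binary);
    assert(io.is_open());

    io << "hello world\n";            // start by writing

    io.seekg(0, std::fstream::beg);   // seek before switching to reading
    std::string word;
    io >> word;                       // reads "hello"

    io.seekp(0, std::fstream::beg);   // seek before switching back to writing
    io << "HELLO";                    // overwrites the first five bytes

    io.seekg(0, std::fstream::beg);   // seek again before rereading
    std::string line;
    std::getline(io, line);
    return line;
}
```

Calling read_then_overwrite("seek_demo.txt") should return the line with its first word replaced in place; the rest of the line is untouched because writing at a position overwrites bytes rather than inserting them.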
There are two versions of the seek functions: One moves to an “absolute” address within the file; the other moves to a byte offset from a given position:
// set the marker to a fixed position
seekg(new_position); // set the read marker to the given pos_type location
seekp(new_position); // set the write marker to the given pos_type location
// offset some distance ahead of or behind the given starting point
seekg(offset, from); // set the read marker offset distance from from
seekp(offset, from); // offset has type off_type
The possible values for from (beg for the beginning of the stream, cur for the current position, and end for the end of the stream) are listed in Table 17.21.
The arguments, new_position and offset, have machine-dependent types named pos_type and off_type, respectively. These types are defined in both istream and ostream. pos_type represents a file position and off_type represents an offset from that position. A value of type off_type can be positive or negative; we can seek forward or backward in the file.
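To make the offset form concrete, here is a small sketch that seeks relative to each of the three starting points on an in-memory stream. The helper char_at is a hypothetical name introduced for this example; it assumes the requested position lies inside the stream.

```cpp
#include <cassert>
#include <ios>
#include <sstream>

// Hypothetical helper: seek `offset` bytes relative to `from`,
// then return the character found at that position.
char char_at(std::istringstream& in,
             std::istringstream::off_type offset,
             std::ios_base::seekdir from)
{
    in.clear();               // drop any earlier eof or fail state
    in.seekg(offset, from);   // a negative offset moves backward
    assert(!in.fail());       // assumes the target position is valid
    return static_cast<char>(in.get());  // get() advances the marker by one
}
```

For a stream holding "abcdefghij", seeking 2 from beg yields 'c' and leaves the marker at position 3; seeking 1 from cur then yields 'e'; seeking -3 from end yields 'h'.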
The tellg or tellp functions return a pos_type value denoting the current position of the stream. The tell functions are usually used to remember a location so that we can subsequently seek back to it:
// remember the current write position in mark
ostringstream writeStr; // output stringstream
ostringstream::pos_type mark = writeStr.tellp();
// ...
if (cancelEntry)
    // return to the remembered position
    writeStr.seekp(mark);
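The fragment above can be turned into a complete, if simplified, example. This sketch assumes a hypothetical function write_with_retry that writes a fixed prefix, remembers the position, writes one entry, and then returns to the mark to write a second entry over the first.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: writes `first`, then seeks back and writes `second`
// over it, returning the resulting buffer contents.
std::string write_with_retry(const std::string& first,
                             const std::string& second)
{
    std::ostringstream writeStr;
    writeStr << "entry: ";

    // remember the current write position
    std::ostringstream::pos_type mark = writeStr.tellp();
    assert(mark != std::ostringstream::pos_type(-1));  // tellp reports failure as -1

    writeStr << first;
    writeStr.seekp(mark);   // return to the remembered position
    writeStr << second;     // overwrites; does not truncate what follows
    return writeStr.str();
}
```

Note that seeking back does not erase what was already written. When the replacement is shorter than the original entry, the tail of the original remains in the buffer.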
Let’s look at a programming example. Assume we are given a file to read. We are to write a new line at the end of the file that contains the relative position at which each line begins. For example, given the following file,
abcd
efg
hi
j
the program should produce the following modified file:
abcd
efg
hi
j
5 9 12 14
Note that our program need not write the offset for the first line—it always occurs at position 0. Also note that the offset counts must include the invisible newline character that ends each line. Finally, note that the last number in the output is the offset for the line on which our output begins. By including this offset in our output, we can distinguish our output from the file’s original contents. We can read the last number in the resulting file and seek to the corresponding offset to get to the beginning of our output.
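The arithmetic behind these offsets can be sketched on an in-memory string before any file is involved. The function next_line_offsets is a hypothetical name for this illustration, and it assumes that every line, including the last, ends with a single newline character.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: for each line in `text`, record the offset at which
// the following line begins. Assumes every line ends with '\n'.
std::vector<std::size_t> next_line_offsets(const std::string& text)
{
    std::istringstream in(text);
    std::vector<std::size_t> offsets;
    std::size_t cnt = 0;   // running byte count
    std::string line;
    while (std::getline(in, line)) {
        cnt += line.size() + 1;   // add 1 for the newline getline discarded
        offsets.push_back(cnt);
    }
    return offsets;
}
```

For the four-line sample file, the line lengths plus one newline each are 5, 4, 3, and 2, so the running totals are 5, 9, 12, and 14.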
Our program will read the file a line at a time. For each line, we’ll increment a counter, adding the size of the line we just read. That counter is the offset at which the next line starts:
#include <cstdlib>   // EXIT_FAILURE
#include <fstream>
#include <iostream>
#include <string>
using std::cerr; using std::endl;
using std::fstream; using std::string;
using std::size_t;

int main()
{
    // open for input and output and preposition file pointers to end-of-file
    // file mode argument see § 8.4 (p. 319)
    fstream inOut("copyOut",
                  fstream::ate | fstream::in | fstream::out);
    if (!inOut) {
        cerr << "Unable to open file!" << endl;
        return EXIT_FAILURE; // EXIT_FAILURE see § 6.3.2 (p. 227)
    }
    // inOut is opened in ate mode, so it starts out positioned at the end
    auto end_mark = inOut.tellg(); // remember original end-of-file position
    inOut.seekg(0, fstream::beg);  // reposition to the start of the file
    size_t cnt = 0;                // accumulator for the byte count
    string line;                   // hold each line of input
    // while we haven't hit an error and are still reading the original data
    while (inOut && inOut.tellg() != end_mark
           && getline(inOut, line)) { // and can get another line of input
        cnt += line.size() + 1;       // add 1 to account for the newline
        auto mark = inOut.tellg();    // remember the read position
        inOut.seekp(0, fstream::end); // set the write marker to the end
        inOut << cnt;                 // write the accumulated length
        // print a separator if this is not the last line
        if (mark != end_mark) inOut << " ";
        inOut.seekg(mark);            // restore the read position
    }
    inOut.seekp(0, fstream::end);     // seek to the end
    inOut << "\n";                    // write a newline at end-of-file
    return 0;
}
Our program opens its fstream using the in, out, and ate modes (§ 8.4, p. 319). The first two modes indicate that we intend to read and write the same file. Specifying ate positions the read and write markers at the end of the file. As usual, we check that the open succeeded, and exit if it did not (§ 6.3.2, p. 227).
Because our program writes to its input file, we can’t use end-of-file to signal when it’s time to stop reading. Instead, our loop must end when it reaches the point at which the original input ended. As a result, we must first remember the original end-of-file position. Because we opened the file in ate mode, inOut is already positioned at the end. We store the current (i.e., the original end) position in end_mark. Having remembered the end position, we reposition the read marker at the beginning of the file by seeking to the position 0 bytes from the beginning of the file.
The while loop has a three-part condition: We first check that the stream is valid; if so, we check whether we’ve exhausted our original input by comparing the current read position (returned by tellg) with the position we remembered in end_mark. Finally, assuming that both tests succeeded, we call getline to read the next line of input. If getline succeeds, we perform the body of the loop.
The loop body first adds the length of the line just read, plus one for its newline, to cnt. It then remembers the current position in mark. We save that position in order to return to it after writing the next relative offset. The call to seekp repositions the write marker to the end of the file. We write the counter value and then seekg back to the position we remembered in mark. Having restored the marker, we’re ready to repeat the condition in the while.
Each iteration of the loop writes the offset of the next line. Therefore, the last iteration of the loop takes care of writing the offset of the last line. However, we still need to write a newline at the end of the file. As with the other writes, we call seekp to position the file at the end before writing the newline.
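As noted earlier, the last number in the modified file is the offset at which the program's own output begins. One possible way to use it is sketched below, under several assumptions: write_sample and read_offset_line are hypothetical helpers, the file name is supplied by the caller, and the file is opened in binary mode so that offsets are plain byte counts.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical helper: creates a file with the modified contents shown earlier.
void write_sample(const std::string& path)
{
    std::ofstream out(path, std::ofstream::binary | std::ofstream::trunc);
    out << "abcd\nefg\nhi\nj\n5 9 12 14\n";
}

// Hypothetical helper: reads the last number in the file, seeks to that
// offset, and returns the line that starts there.
std::string read_offset_line(const std::string& path)
{
    std::ifstream in(path, std::ifstream::binary);
    assert(in.is_open());

    std::string token, last;
    while (in >> token)   // scan every whitespace-separated token
        last = token;     // the final one is the offset of the appended line

    in.clear();           // reading to the end left the stream in a failed state
    in.seekg(std::stoll(last), std::ifstream::beg);

    std::string line;
    std::getline(in, line);
    return line;
}
```

After write_sample creates the file, read_offset_line on the same path should return the appended line of offsets rather than any of the original text.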