EXPLORATION 10

image

Algorithms and Iterators

The previous Exploration introduced vectors and iterators using std::sort to sort a vector of integers. This Exploration examines iterators in more depth and introduces generic algorithms, which perform useful operations on iterators.

Algorithms

The std::sort function is an example of a generic algorithm, so named because these functions implement common algorithms and operate generically. That is, they work for just about anything you can express as a sequence. Most of the standard algorithms are declared in the <algorithm> header, although the <numeric> header contains a few that are numerically oriented.

The standard algorithms run the gamut of common programming activities: sorting, searching, copying, comparing, modifying, and more. Searches can be linear or binary. A number of functions, including std::sort, reorder elements within a sequence. No matter what they do, all generic algorithms share some common features.

Almost all the generic algorithms operate on iterators. (The sole exceptions are std::max, std::min, and std::minmax, which return the maximum and minimum of two values.) Earlier, I mentioned that iterators come in different flavors, each flavor having different capabilities. Although C++ has five flavors in all, you can broadly group them into two categories: read and write.

A read iterator refers to a position in a sequence of values that enables reading from the sequence. The algorithms use read iterators for input but do not modify the values. Typically, you must specify a range with a pair of read iterators: start and one past the end.

A write iterator refers to a position in a sequence where the algorithm is to write its output. Typically, you specify only the starting position of the output sequence. The algorithm does not and cannot check for overflow, so you must ensure the output sequence has sufficient room to accommodate everything the algorithm will write.

For example, the std::copy algorithm copies values from an input sequence to an output sequence. The function takes three arguments: two read iterators to specify the input range and one write iterator to specify the start of the output range. You must ensure the output has enough capacity. Call the resize member function to set the size of the output vector, as shown in Listing 10-1.

Listing 10-1.  Demonstrating the std::copy Function

#include <algorithm>
#include <cassert>
#include <vector>
 
int main()
{
  std::vector<int> input{ 10, 20, 30 };
  std::vector<int> output{};
  output.resize(input.size());
  std::copy(input.begin(), input.end(), output.begin());
  // Now output has a complete copy of input.
  assert(input == output);
}

The assert function is a quick way to verify that what you think is true actually is true. You assert a logical statement, and if you are wrong, the program terminates with a message that identifies the assertion. The assert function is declared in <cassert>; the c means the C++ library inherits this header from the C standard library. Note that assert is one of the rare exceptions to the rule that standard library members begin with std::.

If the program is correct, it runs and exits normally. But if we make a mistake, the assertion triggers, and the programs fails with a message.

Test the program in Listing 10-1. Just to see what happens when an assertion fails, comment out the call to std::copy and run it again. Write down the message you get.

_____________________________________________________________

_____________________________________________________________

Also note the initialization of input. Listing 10-1 demonstrates another application of “universal initialization” (as introduced in Exploration 4). The comma-separated values inside the curly braces are used to initialize the elements of the vector.

Member Types

Remember Listing 9-1? Go back and look at line 19. The definition in the first part of the for loop is particularly scary, even for experienced C++ programmers. In addition to member functions, a C++ library type can have member types. In this case, the parent type, std::vector<int>, has a member type named size_type. Use this type whenever you have to store a size or index for a vector.

You usually use the dot (.) operator to call a member function, but you use :: (called the scope operator) to refer to a member type.

The size_type is like an int, but not really. In particular, you cannot assign a negative value to size_type. (After all, what kind of vector has a negative size?) Or rather, the language rules let you assign a negative value, but you won’t get the result you want or expect. A good compiler warns you that you are making a mistake. Until you learn enough C++ to appreciate the subtleties of size_type, the best strategy is to use size_type only for loop control, for storing the size of a vector, and for storing indices. Don’t try to perform arithmetic with size_type values beyond simply incrementing a loop control variable.

Line 19 uses size_type to define the variables i and end, which are the loop control variables. The loop increments i from 0 up to the vector size (end), at which point it exits. This is a common idiom for looping through a vector when you need the vector indices. Think of end as the index of one past the end, just like iterators. Most programs, in fact, do not need to use the vector indices. I wrote Listing 9-1 that way only to demonstrate the at member function.

A better way to write the program in Listing 9-1 is to use iterators instead of indices and the at() member function. To define an iterator, substitute the member type iterator for size_type in the definition of the loop control variable. Initialize the loop control variable to data.begin(), and end the loop when the iterator equals data.end(). Use the ++ operator to advance the iterator and * to obtain the vector element to which the iterator refers. Put these pieces together and rewrite lines 19 and 20 of Listing 9-1 to use iterators.

_____________________________________________________________

_____________________________________________________________

Compare your response to Listing 10-2. (For your convenience, I repeat the entire program. That makes it easy for you to compile and test the program.)

Listing 10-2.  Sorting Integers by Using Iterators to Print the Results

 1 #include <algorithm>
 2 #include <iostream>
 3 #include <vector>
 4
 5 int main()
 6 {
 7   std::vector<int> data{};     // initialized to be empty
 8   int x{};
 9
10   // Read integers one at a time.
11   while (std::cin >> x)
12     // Store each integer in the vector.
13     data.push_back(x);
14
15   // Sort the vector.
16   std::sort(data.begin(), data.end());
17
18   // Print the vector, one number per line.
19   for (std::vector<int>::iterator i{data.begin()}, end{data.end()}; i != end; ++i)
20     std::cout << *i << ' ';
21 }

Using iterators instead of indices has many advantages.

  • The code works with other kinds of containers (such as linked lists), even if they don’t have the at member function.
  • The optimizer has more opportunities to produce high-performance code.
  • You have fewer chances for mistakes, such as buffer overruns.
  • The code is easier to read, especially for experienced C++ programmers.

Sure, iterators offer many advantages, but line 19 is a nightmare in terms of readability. The next section introduces a way to hide the nasty iterator syntax, while retaining the advantages of iterators.

A Simpler Loop

C++ 11 introduced an easy way to write a for loop. It uses iterators under the hood but hides them in a simple syntax.

for (int element : data)
   std::cout << element << ' ';

This loop defines a hidden iterator that traverses all the elements of data (calling its begin() and end() member functions). Each time through the loop, the iterator is dereferenced (with the * operator) and assigned to element. Thus, inside the loop body, you simply use element to obtain each value in the range. After the loop body finishes, the iterator is advanced with the ++ operator.

This style of loop is called a range-basedfor loop or a for-each loop. Because range-based for loops are new in C++, you will still see a lot of iterators in for loops, but as more people pick up C++, expect to see many more uses of range-based for loops. You can see why.

Using Iterators and Algorithms

Loops over iterator ranges are so common that many generic algorithms implement the most common actions that you may need to take in a program. With a couple of helpers, you can re-implement the program using only generic algorithms, as shown in Listing 10-3.

Listing 10-3.  Sorting Integers by Using Only Generic Algorithms and Iterator Adapters

 1 #include <algorithm>
 2 #include <iostream>
 3 #include <iterator>
 4 #include <vector>
 5
 6 int main()
 7 {
 8   std::vector<int> data{};
 9
10   // Read integers one at a time.
11   std::copy(std::istream_iterator<int>(std::cin),
12             std::istream_iterator<int>(),
13             std::back_inserter(data));
14
15   // Sort the vector.
16   std::sort(data.begin(), data.end());
17
28   // Print the vector, one number per line.
19   std::copy(data.begin(), data.end(),
20             std::ostream_iterator<int>(std::cout, " "));
21 }

A std::istream_iterator (line 11) creates a read iterator that reads from an input stream. Every time you read a value from the iterator, the istream_iterator object uses the >> operator to read a value from the stream. You must supply the type in angle brackets, so that the compiler knows what type of information you want to read from the stream, which you pass as a function argument. With no argument, std::istream_iterator<int>() (line 12) returns a special one-past-the-end iterator. When the input stream iterator equals this special one-past-the-end iterator, the program has reached the end of the stream, and no more input is available.

The std::back_inserter function (line 13) takes a vector (or any object that has a push_back function) and wraps it in a write iterator. Any time you assign a value to the iterator, the back insert iterator calls the push_back function to add the value to the object. Using back_inserter, you can guarantee that the program will not overrun the output buffer.

Finally, an ostream_iterator (line 20) is the counterpart to an istream_iterator. It takes an output stream and wraps it in a write iterator. Any value that you assign to the iterator is written to the stream using the << operator. You can pass an optional string argument, and the ostream_iterator writes that string after each value. In this case, the string contains just a newline character, so each number is written on its own line.

All these special iterators are declared in <iterator>. You don’t need this header to use an ordinary iterator, such as that returned from a vector’s begin() function, but you do need it if you use a special iterator, such as istream_iterator.

Until you are accustomed to using the generic algorithms, iterators, and special iterators, this style of programming can seem unusual. Once you are familiar with these unique C++ library members, you will find such code easier to read and write than more traditional programming styles.

It’s now time for you to practice using iterators and algorithms.

Write a program that reads integers from the standard input into a vector. Feel free to use any style of loop (indices, iterators, or range-based). If you are unsure which style to use, I recommend getting used to range-based loops. Print each value, followed by twice the value and then the value squared. Thus, the output contains one line per input value, and each line has three numbers. Align the columns, as you learned in the previous Exploration.

Test your program using the following input:

3
10
8

What output do you expect?

_____________________________________________________________

_____________________________________________________________

_____________________________________________________________

What output do you actually get?

_____________________________________________________________

_____________________________________________________________

_____________________________________________________________

Now try running the program with no input at all. What do you expect?

_____________________________________________________________

What do you get?

_____________________________________________________________

Listing 10-4 shows one way to write this program using explicit loops. Notice how the * operator means “multiplication” when used as a binary (two-operand) operator and “dereference the iterator” when used as a unary, prefix operator.

Listing 10-4.  Doubling and Squaring Input Values in a Vector by Using Iterators

#include <iomanip>
#include <iostream>
#include <vector>
 
int main()
{
   std::vector<int> data{};
   int x{};
 
   while (std::cin >> x)
      data.push_back(x);
 
   for (std::vector<int>::iterator i{data.begin()}, end{data.end()}; i != end; ++i)
      std::cout << std::setw(2) << *i <<
         std::setw(3) << *i* 2 <<
         std::setw(4) << *i * *i << ' ';
}

The expression *i * *i is hard to read. Listing 10-5 shows another way, this time using a range-based for loop. See how much easier it is to read.

Listing 10-5.  Doubling and Squaring Input Values in a Vector by Using a Range-Based Loop

#include <iomanip>
#include <iostream>
#include <vector>
 
int main()
{
   std::vector<int> data{};
   int x{};
 
   while (std::cin >> x)
      data.push_back(x);
 
   std::cout.fill(' '),
   for (int element : data)
      std::cout << std::setw(2) << element <<
         std::setw(3) << element * 2 <<
         std::setw(4) << element * element << ' ';
}

The ++ operator, which advances an iterator, also advances an integer, as shown in Listing 10-4. The next Exploration takes a closer look at this handy operator.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset