Chapter 13

Mastering STL Algorithms

WHAT’S IN THIS CHAPTER?

  • What algorithms are
  • What lambda expressions are
  • What function objects are
  • The details of the STL algorithms
  • A larger example: auditing voter registrations

As Chapter 12 shows, the STL provides an impressive collection of generic data structures. Most libraries stop there. The STL, however, contains an additional assortment of generic algorithms that can, with some exceptions, be applied to elements from any container. Using these algorithms, you can find, sort, and process elements in containers, and perform a whole host of other operations. The beauty of the algorithms is that they are independent not only of the types of the underlying elements, but of the types of the containers on which they operate. Algorithms perform their work using only the iterator interfaces.

Many of the algorithms accept callbacks, which can be a function pointer or something that behaves like a function pointer, such as an object with an overloaded operator() or a C++11 inline lambda expression. Conveniently, the STL provides a set of classes that can be used to create callback objects for the algorithms. These callback objects are called function objects, or just functors.

OVERVIEW OF ALGORITHMS

The “magic” behind the algorithms is that they work on iterator intermediaries instead of on the containers themselves. In that way, they are not tied to specific container implementations. All the STL algorithms are implemented as function templates, where the template type parameters are usually iterator types. The iterators themselves are specified as arguments to the function. Templatized functions can usually deduce the template types from the function arguments, so you can generally call the algorithms as if they were normal functions, not templates.

The iterator arguments are usually iterator ranges. As Chapter 12 explains, iterator ranges are half-open for most containers such that they include the first element in the range, but exclude the last. The end iterator is really a “past-the-end” marker.

The C++11 forward_list container only supports forward iterators. This means that algorithms requiring bidirectional or random access iterators will not work on a forward_list; for example copy_backward(), random_shuffle(), stable_sort(), etc. do not work on a forward_list.

Some algorithms require additional template type parameters and arguments, which are sometimes function callbacks. These callbacks can be function pointers, function objects or C++11 lambda expressions.

Most algorithms are defined in the <algorithm> header file, while some numerical algorithms are defined in the <numeric> header file.

The best way to understand the algorithms is to look at some examples first. After you’ve seen how a few of them work, it’s easy to pick up the others. This section describes the find(), find_if(), and accumulate() algorithms in detail. The next sections present the lambda expressions and function objects, and discusses each of the classes of algorithms with representative samples.

The find and find_if Algorithms

find() looks for a specific element in an iterator range. You can use it on elements in any container type. It returns an iterator referring to the element found, or the end iterator of the range in case the element is not found. Note that the range specified in the call to find() need not be the entire range of elements in a container; it could be a subset.

cross.gif

If find() fails to find an element, it returns an iterator equal to the end iterator specified in the function call, not the end iterator of the underlying container.

Here is an example of find(). Note that this example assumes that the user plays nice and enters valid numbers; it does not perform any error checking on the user input. Performing error checking on stream input is discussed in Chapter 15.

image
#include <algorithm>
#include <vector>
#include <iostream>
using namespace std;
int main()
{
    int num;
    vector<int> myVector;
    while (true) {
        cout << "Enter a number to add (0 to stop): ";
        cin >> num;
        if (num == 0) {
            break;
        }
        myVector.push_back(num);
    }
    while (true) {
        cout << "Enter a number to lookup (0 to stop): ";
        cin >> num;
        if (num == 0) {
            break;
        }
        auto end = myVector.end();
        auto it = find(myVector.begin(), end, num);
        if (it == end) {
            cout << "Could not find " << num << endl;
        } else {
            cout << "Found " << *it << endl;
        }
    }
    return 0;
}

Code snippet from AlgorithmOverviewFind.cpp

The call to find() is made with myVector.begin() and end as arguments, where end is defined as myVector.end() for this example, in order to search all the elements of the vector. If you want to search in a sub range, you can change the first argument to the find() method and the value of the end iterator.

Here is a sample run of the program:

Enter a number to add (0 to stop): 3
Enter a number to add (0 to stop): 4
Enter a number to add (0 to stop): 5
Enter a number to add (0 to stop): 6
Enter a number to add (0 to stop): 0
Enter a number to lookup (0 to stop): 5
Found 5
Enter a number to lookup (0 to stop): 8
Could not find 8
Enter a number to lookup (0 to stop): 4
Found 4
Enter a number to lookup (0 to stop): 2
Could not find 2
Enter a number to lookup (0 to stop): 0

Some containers, such as map and set, provide their own versions of find() as class methods.

cross.gif

If a container provides a method with the same functionality as a generic algorithm, you should use the method instead, because it’s faster. For example, the generic find() algorithm runs in linear time, even on a map iterator, while the find() method on a map runs in logarithmic time.

find_if() is similar to find(), except that it accepts a predicate function callback instead of a simple element to match. A predicate returns true or false. The find_if() algorithm calls the predicate on each element in the range until the predicate returns true, in which case find_if()returns an iterator referring to that element. The following program reads test scores from the user, then checks if any of the scores are “perfect.” A perfect score is a score of 100 or higher. The program is similar to the previous example. Only the differences are highlighted.

image
bool perfectScore(int num)
{
    return (num >= 100);
}
int main()
{
    int num;
    vector<int> myVector;
    while (true) {
        cout << "Enter a test score to add (0 to stop): ";
        cin >> num;
        if (num == 0) {
            break;
        }
        myVector.push_back(num);
    }
    auto end = myVector.end();
    auto it = find_if(myVector.begin(), end, perfectScore);
    if (it == end) {
        cout << "No perfect scores" << endl;
    } else {
        cout << "Found a "perfect" score of " << *it << endl;
    }
    return 0;
}

Code snippet from AlgorithmOverviewFindIf.cpp

This program passes a pointer to the perfectScore() function, which the find_if() algorithm then calls on each element until it returns true.

Here is the same example but using a C++11 lambda expression. It gives you an initial idea about the power of lambda expressions. Don’t worry about their syntax. They are explained in detail later in this chapter. Note the absence of the perfectScore() function.

image
int num;
vector<int> myVector;
while (true) {
    cout << "Enter a test score to add (0 to stop): ";
    cin >> num;
    if (num == 0) {
        break;
    }
    myVector.push_back(num);
}
auto end = myVector.end();
auto it = find_if(myVector.begin(), end, [](int i){ return i >= 100; });
if (it == end) {
    cout << "No perfect scores" << endl;
} else {
    cout << "Found a "perfect" score of " << *it << endl;
}

Code snippet from AlgorithmOverviewFindIfLambda.cpp

Unfortunately, the STL provides no find_all() or equivalent algorithm that returns all instances matching a predicate. Chapter 17 shows you how to write your own find_all() algorithm.

The accumulate Algorithms

It’s often useful to calculate the sum, or some other arithmetic quantity, of all the elements in a container. The accumulate() function does just that. In its most basic form, it calculates the sum of the elements in a specified range. For example, the following function calculates the arithmetic mean of a sequence of integers in a vector. The arithmetic mean is simply the sum of all the elements divided by the number of elements.

image
#include <numeric>
#include <vector>
using namespace std; 
double arithmeticMean(const vector<int>& nums)
{
    double sum = accumulate(nums.begin(), nums.end(), 0);
    return sum / nums.size();
}

Code snippet from AlgorithmOverviewAccumulate.cpp

Note that accumulate() is declared in <numeric>, not in <algorithm>. The accumulate() algorithm takes as its third parameter an initial value for the sum, which in this case should be 0 (the identity for addition) to start a fresh sum.

The second form of accumulate() allows the caller to specify an operation to perform instead of the default addition. This operation takes the form of a binary callback. Suppose that you want to calculate the geometric mean, which is the product of all the numbers in the sequence to the power of the inverse of the size. In that case, you would want to use accumulate() to calculate the product instead of the sum. You could write it like this:

image
#include <numeric>
#include <vector>
#include <cmath>
using namespace std;
int product(int num1, int num2)
{
    return num1 * num2;
}
double geometricMean(const vector<int>& nums)
{
    double mult = accumulate(nums.begin(), nums.end(), 1, product);
    return pow(mult, 1.0 / nums.size());
}

Code snippet from AlgorithmOverviewAccumulate.cpp

Note that the product() function is passed as a callback to accumulate() and that the initial value for the accumulation is 1 (the identity for multiplication) instead of 0.

To give you a second teaser about the power of C++11 lambda expressions, the geometricMean() function could be written as follows, without using the product() function:

image
double geometricMeanLambda(const vector<int>& nums)
{
    double mult = accumulate(nums.begin(), nums.end(), 1,
        [](int num1, int num2){ return num1 * num2; });
    return pow(mult, 1.0 / nums.size());
}

Code snippet from AlgorithmOverviewAccumulate.cpp

Later in this chapter you learn how to use accumulate() in the geometricMean() function without writing a function callback or lambda expression.

imageC++11 Move Semantics with Algorithms

Just like STL containers, STL algorithms are also optimized to use C++11 move semantics at appropriate times. This can greatly speed up certain algorithms, for example sort(). For this reason, it is highly recommended that you implement move semantics in your custom element classes that you want to store in containers. Move semantics can be added to any class by implementing a move constructor and a move assignment operator. Consult the “Move Semantics” section in Chapter 9 for details on how to add move semantics to your classes.

imageLAMBDA EXPRESSIONS

C++11 adds a new feature called lambda expressions. This allows you to write anonymous functions inline, removing the need to write a separate function or to write a function object, and makes code easier to understand.

Syntax

The syntax of a lambda expression is as follows:

[capture_block](parameters) mutable exception_specification -> return_type {body}

A lambda expression contains the following parts:

  • Capture block: specifies how variables from the enclosing scope are captured and made available in the body of the lambda. Explained in the next section.
  • Parameters: (optional) a list of parameters for the lambda expression. You can only omit this list if you do not need any parameters and you do not specify mutable, an exception specification and a return type. Omitting the return type is only allowed in certain cases as explained under the return_type bullet.

    For example: []{return 10;}

    The parameter list is similar to the parameter list for normal functions with the following differences:

    • Parameters cannot have default values.
    • Variable-length argument lists are not allowed.
    • Unnamed parameters are not allowed.
  • Mutable: (optional) if variables from the enclosing scope are captured by value, a copy of those variables will become available in the body of the lambda expression. Those copies are marked as const by default, meaning the lambda body cannot change the value of those copies. If the lambda expression is marked as mutable, those copies are not const and the body can modify those local copies.
  • exception_specification: (optional) can be used to specify which exceptions the body of the lambda expression can throw.
  • return_type: (optional) the type of the returned value. If the return_type part is omitted, the compiler will decide the return type as follows:
    • If the body of the lambda expression is of the following form: { return expression; } the type of expression will become the return_type of the lambda expression.
    • Otherwise the return_type is void.

The following example demonstrates that you can create a lambda expression and immediately execute it. The line defines a lambda expression without return type and without any parameters. It simply prints the string “Hello from Lambda” to the console. Note the parentheses () at the end, which causes the lambda to be executed immediately:

image
[]{cout << "Hello from Lambda" << endl;}();

Code snippet from LambdasLambdaInvocation.cpp

The output is as follows:

Hello from Lambda

The following example defines a lambda that accepts a string argument and returns a string. The result is stored in the variable result. Again notice the parentheses at the end of the lambda causing the lambda to be executed immediately:

image
string result = [](const string& str) -> string {return "Hello from "
                + str;}("second Lambda");
cout << "Result: " << result << endl;

Code snippet from LambdasLambdaInvocation.cpp

The output is as follows:

Result: Hello from second Lambda

As mentioned before, the return type can be omitted in this case:

image
string result = [](const string& str){return "Hello from "
                + str;}("second Lambda");

Code snippet from LambdasLambdaInvocation.cpp

You can also store a pointer to a lambda expression and execute the lambda through the function pointer. Using the C++11 auto keyword, this becomes very easy:

image
auto fn = [](const string& str){return "Hello from " + str;};
cout << fn("call 1") << endl;
cout << fn("call 2") << endl;

Code snippet from LambdasLambdaFunctionPointer.cpp

The preceding code results in the following output:

Hello from call 1
Hello from call 2

Capture Block

The square brackets part is called the lambda capture block. It allows you to specify how you want to capture variables from the enclosing scope. Capturing a variable means that the variable becomes available inside the body of the lambda. There are two ways to capture all variables from the enclosing scope:

  • [=] captures all variables by value
  • [&] captures all variables by reference

Specifying an empty capture block [] means that no variables from the enclosing scope are being captured. It is also possible to selectively decide which variables to capture and how, by specifying a capture list with an optional capture default. Variables prefixed with & are captured by reference. Variables without a prefix are captured by value. The capture default should be the first element in the capture list and be either & or =. For example:

  • [&x] captures only x by reference and nothing else.
  • [x] captures only x by value and nothing else.
  • [=, &x, &y] captures by value by default, except variables x and y, which are captured by reference.
  • [&, x] captures by reference by default, except variable x, which is captured by value.
  • [&x, &x] is illegal because identifiers cannot be repeated.

When you capture a variable by reference, you have to make sure that the reference is still valid at the time the lambda expression is executed. This will be demonstrated with the multiplyBy2Lambda() example in the following section.

Lambda Expressions as Return Type

std::function defined in the <functional> header file is a polymorphic function object wrapper and is similar to a function pointer. It can be bound to anything that can be called (functors, member function pointers, function pointers, and lambdas) as long as the arguments and return type are compatible with those of the wrapper. A wrapper for a function that returns a double and takes two integers as parameters can be defined as follows:

function<double(int, int)> myWrapper;

By using std::function, lambda expressions can be returned from functions. Take a look at the following definition:

image
function<int(void)> multiplyBy2Lambda(int x)
{
    return [=]()->int{return 2*x;};
}

Code snippet from LambdasmultiplyBy2Lambda.cpp

In this example, the return type and empty parameter list of the lambda expression can be omitted, so the preceding can be written as follows:

image
function<int(void)> multiplyBy2Lambda(int x)
{
    return [=]{return 2*x;};
}

Code snippet from LambdasmultiplyBy2Lambda.cpp

The body of this function creates a lambda expression that captures the variables from the enclosing scope by value and returns an integer, which is two times the value passed to multiplyBy2Lambda(). The return type of the multiplyBy2Lambda() function is function<int(void)>, which is a function accepting no arguments and returning an integer. The lambda expression defined in the body of the function exactly matches this prototype. The variable x is captured by value and thus a copy of the value of x is bound to the x in the lambda expression before the lambda is returned from the function.

The preceding function can be called as follows:

image
function<int(void)> fn = multiplyBy2Lambda(5);
cout << fn() << endl;

Code snippet from LambdasmultiplyBy2Lambda.cpp

You can use the C++11 auto keyword to make this much easier:

image
auto fn = multiplyBy2Lambda(5);
cout << fn() << endl;

Code snippet from LambdasmultiplyBy2Lambda.cpp

The output will be 10.

The multiplyBy2Lambda() example captures the variable x by value, [=]. Suppose the function is rewritten to capture the variable by reference, [&], as follows. This will not work as explained after the code:

function<int(void)> multiplyBy2Lambda(int x)
{
    return [&]{return 2*x;};
}

The lambda expression captures x by reference. However, the lambda expression will be executed later in the program, not anymore in the scope of the multiplyBy2Lambda() function at which point the reference to x is not valid anymore!

Lambda Expressions as Parameters

You can write your own functions that accept lambda expressions as parameters. This can for example be used to implement callbacks. The following code implements a testCallback() function that accepts a vector of integers and a callback function. The implementation will iterate over all the elements in the given vector and will call the callback function for each element. The callback function accepts the current element in the vector as an int argument and returns a Boolean. If the callback returns false, the iteration is stopped.

image
void testCallback(const vector<int>& vec,
                  const function<bool(int)>& callback)
{
    for (auto i : vec) {
        // Call callback. If it returns false, stop iteration.
        if (!callback(i))
            break;
        // Callback did not stop iteration, so print value
        cout << i << " ";
    }
    cout << endl;
}

Code snippet from Lambdascallback.cpp

The testCallback() function can be tested as follows. First a vector with 10 elements is created and the generate() algorithm is used to fill in those 10 elements. The generate() algorithm requires an iterator range and will replace the values in that range with the values returned from the function given as third parameter. The generate() algorithm is explained in more detail later in this chapter. The code then outputs all the values of the vector using the for_each() algorithm in combination with a lambda. The for_each() algorithm will call the function given as third parameter for each value in the given iterator range. The last line calls the testCallback() function with a small lambda expression as callback function. This lambda expression returns true for values that are less than 6.

image
vector<int> vec(10);
int index = 0;
generate(vec.begin(), vec.end(), [&index]{return ++index;});
for_each(vec.begin(), vec.end(), [](int i){cout << i << " ";});
cout << endl;
testCallback(vec, [](int i){return i<6;});

Code snippet from Lambdascallback.cpp

The output of this example is as follows:

1 2 3 4 5 6 7 8 9 10
1 2 3 4 5

Examples

This section gives a few examples that use STL algorithms in combination with lambda expressions.

count_if

The following example uses the count_if() algorithm to count the number of elements in the given vector that satisfy a certain condition. The condition is given in the form of a lambda expression, which captures the variables in its enclosing scope by value. This makes the variable value available in the body of the lambda. Had the capture clause been the empty capture clause, [], then the variable value would not be available in the body of the lambda expression. The lambda expression returns the result of performing the comparison i>value, which causes the compiler to automatically make the return type of the lambda a Boolean value. Note that the example initializes the vector using C++11 uniform initialization, discussed in Chapter 9.

image
vector<int> vec = {1,2,3,4,5,6,7,8,9};
int value = 3;
int cnt = count_if(vec.cbegin(),vec.cend(),
                   [=](int i){return i>value;});
cout << "Found " << cnt << " values > " << value << endl;

Code snippet from Lambdascount_if.cpp

The output is as follows:

Found 6 values > 3

The preceding example can be extended to demonstrate capturing variables by reference. The following lambda expression will count the number of times it was called by incrementing a variable in the enclosing scope that was captured by reference.

image
vector<int> vec = {1,2,3,4,5,6,7,8,9};
int value = 3;
int cntLambdaCalled = 0;
int cnt = count_if(vec.cbegin(),vec.cend(),
    [=, &cntLambdaCalled](int i){++cntLambdaCalled; return i>value;});
cout << "The lambda expression was called " << cntLambdaCalled
     << " times." << endl;
cout << "Found " << cnt << " values > " << value << endl;

Code snippet from Lambdascount_if_ref.cpp

The output is as follows:

The lambda expression was called 9 times.
Found 6 values > 3

generate

The generate() algorithm allows you to fill an iterator range with certain values. The following example uses the generate() algorithm together with a lambda expression to put the numbers 2, 4, 8, 16, and so on in the vector.

image
vector<int> vec(10);
int value = 1;
generate(vec.begin(), vec.end(), [&value]{value*=2; return value;});
for (auto& i : vec)
    cout << i << " ";

Code snippet from Lambdasgenerate.cpp

The output is as follows:

2 4 8 16 32 64 128 256 512 1024

for_each

The for_each() algorithm can be used to perform a specific action on all elements in the given range. A simple example is to use the for_each() algorithm in combination with a lambda expression to print values from a vector. This example defines a vector of integers. The first two arguments to for_each() specify the range in the container on which to apply the lambda expression given as third argument. For each value in the given range, the for_each() algorithm will call the lambda expression and will pass that value as argument to the lambda expression. Since the vector holds integers, the parameter for the lambda expression is an int.

image
vector<int> vec = {11,22,33,44};
int index = 0;
for_each(vec.begin(), vec.end(),
         [&index](int i){cout << "Value " << (index++)
                              << ": " << i << endl;});

Code snippet from Lambdasfor_each.cpp

The output of the preceding code is as follows:

Value 0: 11
Value 1: 22
Value 2: 33
Value 3: 44

FUNCTION OBJECTS

You can overload the function call operator in a class such that objects of the class can be used in place of function pointers. These objects are called function objects, or just functors.

Many of the STL algorithms, such as find_if() and the second form of accumulate(), require a function pointer as one of the parameters. When you use these functions, you can pass a functor instead of a lambda or function pointer. C++ provides several predefined functor classes, defined in the <functional> header file, that perform the most commonly used callback operations.

Functor classes often consist of simple one-line expressions. The clumsiness of having to create a function or functor class, give it a name that does not conflict with other names, and then use this name is considerable intellectual overhead for what is fundamentally a simple concept. In these cases, using anonymous (unnamed) functions represented by lambda expressions is a big convenience. Their syntax is easier and can make your code easier to understand. They are discussed in the previous sections. However, this section explains functors and how to use the predefined functor classes because you will likely encounter them at some point.

pen.gif

With C++11, it is recommended to use lambda expressions, if possible, instead of function objects because lambdas are easier to use and easier to understand.

Arithmetic Function Objects

C++ provides functor class templates for the five binary arithmetic operators: plus, minus, multiplies, divides, and modulus. Additionally, unary negate is supplied. These classes are templatized on the type of the operands and are wrappers for the actual operators. They take one or two parameters of the template type, perform the operation, and return the result. Here is an example using the plus class template:

image
plus<int> myPlus;
int res = myPlus(4, 5);
cout << res << endl;

Code snippet from FunctionObjectsArithmetic.cpp

This example is silly, because there’s no reason to use the plus class template when you could just use operator+ directly. The benefit of the arithmetic function objects is that you can pass them as callbacks to algorithms, which you cannot do directly with the arithmetic operators. For example, the implementation of the geometricMean() function earlier in this chapter used the accumulate() function with a function pointer to the product() callback to multiply two integers. You could rewrite it to use the predefined multiplies function object:

image
double geometricMean(const vector<int>& nums)
{
    double mult = accumulate(nums.begin(), nums.end(), 1,
        multiplies<int>());
    return pow(mult, 1.0 / nums.size());
}

Code snippet from FunctionObjectsArithmetic.cpp

The expression multiplies<int>() creates a new object of the multiplies functor class, instantiating it with the int type.

The other arithmetic function objects behave similarly.

cross.gif

The arithmetic function objects are just wrappers around the arithmetic operators. If you use the function objects as callbacks in algorithms, make sure that the objects in your container implement the appropriate operation, such as operator* or operator+.

Comparison Function Objects

In addition to the arithmetic function object classes, the C++ language provides all the standard comparisons: equal_to, not_equal_to, less, greater, less_equal, and greater_equal. You’ve already seen less in Chapter 12 as the default comparison for elements in the priority_queue and the associative containers. Now you can learn how to change that criterion. Here’s an example of a priority_queue using the default comparison operator: less.

image
priority_queue<int> myQueue;
myQueue.push(3);
myQueue.push(4);
myQueue.push(2);
myQueue.push(1);
while (!myQueue.empty()) {
    cout << myQueue.top() << " ";
    myQueue.pop();
}

Code snippet from FunctionObjectsQueueLess.cpp

The output from the program looks like this:

4 3 2 1

As you can see, the elements of the queue are removed in descending order, according to the less comparison. You can change the comparison to greater by specifying it as the comparison template argument. The priority_queue template definition looks like this:

template <class T, class Container = vector<T>, class Compare = 
    less<T> >;

Unfortunately, the Compare type parameter is last, which means that in order to specify the comparison you must also specify the container. Here is an example of the above program modified so that the priority_queue sorts elements in ascending order using greater:

image
priority_queue<int, vector<int>, greater<int> > myQueue;
myQueue.push(3);
myQueue.push(4);
myQueue.push(2);
myQueue.push(1);
while (!myQueue.empty()) {
    cout << myQueue.top() << " ";
    myQueue.pop();
}

Code snippet from FunctionObjectsQueueGreater.cpp

The output now is as follows:

1 2 3 4

Due to a syntactic restriction in pre-C++11 versions of C++, it was necessary to put a space between two angle brackets if they were not operator>>. This syntactic restriction has been removed in C++11, and spaces are no longer required. This is covered in the “Angle Brackets” section in Chapter 9. Therefore, the declaration of myQueue can be written without the formerly-required space, as follows:

priority_queue<int, vector<int>, greater<int>> myQueue;

Several algorithms that you will learn about later in this chapter require comparison callbacks, for which the predefined comparators come in handy.

Logical Function Objects

C++ provides function object classes for the three logical operations: logical_not (operator!), logical_and (operator&&), and logical_or (operator||).

pen.gif

Logical operations deal only with the values true and false. In the original STL, there was no provision for dealing with bitwise operations on integer-type values. C++11 has bitwise function objects (covered in the next section).

Logical functors can for example be used to implement an allTrue() function that checks if all the Boolean flags in a container are true. This can be implemented as follows:

image
bool allTrue(const vector<bool>& flags)
{
    return accumulate(flags.begin(), flags.end(), true,
        logical_and<bool>());
}

Code snippet from FunctionObjectsLogicalFunctors.cpp

Similarly, the logical_or functor can be used to implement an anyTrue() function that returns true if there is at least one Boolean flag in a container true:

image
bool anyTrue(const vector<bool>& flags)
{
    return accumulate(flags.begin(), flags.end(), false,
        logical_or<bool>());
}

Code snippet from FunctionObjectsLogicalFunctors.cpp

imageBitwise Function Objects

C++11 adds function objects for all the bitwise operations: bit_and (operator&), bit_or (operator|), and bit_xor (operator^). These bitwise functors can for example be used together with the transform() algorithm (discussed later in this chapter) to perform bitwise operations on all elements in a container.

Function Object Adapters

When you try to use the basic function objects provided by the standard, it often feels as if you’re trying to put a square peg into a round hole. For example, you can’t use the less function object with find_if() to find an element smaller than some value because find_if() passes only one argument to its callback each time instead of two. The function adapters attempt to rectify this problem and others. They provide a modicum of support for functional composition, or combining functions together to create the exact behavior you need.

Binders

Binders can be used to bind parameters of functions to certain values. C++11 introduces std::bind(), which is very flexible and is discussed in the following section. The section afterwards explains the pre-C++11 bind2nd() and bind1st() adapters, which you have to use if your compiler does not yet support std::bind().

imagestd::bind

std::bind() allows you to bind arguments of a function in a flexible way. You can bind function arguments to fixed values and you can even rearrange function arguments in a different order. It is best explained with an example.

Suppose you have a function called func() accepting two arguments:

image
void func(int num, const string& str)
{
    cout << "func(" << num << ", " << str << ")" << endl;
}

Code snippet from FunctionObjectsind.cpp

The following code demonstrates how you can use bind() to bind the second argument of the func() function to a fixed value, str. The result is stored in f1(). The C++11 auto keyword is used to remove the need to specify the exact return type which can become complicated. Arguments that are not bound to specific values should be specified as _1, _2, _3 and so on. These are defined in the std::placeholders namespace. In the definition of f1(), the _1 specifies where the first argument of f1() needs to go when func() is called. After this, f1() can be called with just a single integer argument as follows:

image
string str = "abc";
auto f1 = bind(func, placeholders::_1, str);
f1(16);

Code snippet from FunctionObjectsind.cpp

The output should be:

func(16, abc)

bind() can also be used to rearrange the arguments as shown in the following code. The _2 specifies where the second argument of f2() needs to go when func() is called. In other words, the f2() binding means that the first argument to f2() will become the second argument to the function func() and the second argument to f2() will become the first argument to the function func().

image
auto f2 = bind(func, placeholders::_2, placeholders::_1);
f2("Test", 32);

Code snippet from FunctionObjectsind.cpp

The output is as follows:

func(32, Test)

There is a small issue with binding parameters in combination with overloaded functions. Suppose you have the following two overloaded functions called overloaded(). One accepts an integer and the other accepts a floating point number:

image
void overloaded(int num) {}
void overloaded(float f) {}

Code snippet from FunctionObjectsind.cpp

If you want to use bind() with these overloaded functions, you need to explicitly specify which of the two overloads you want to bind. The following will not compile:

image
auto f3 = bind(overloaded, placeholders::_1); // ERROR

Code snippet from FunctionObjectsind.cpp

If you want to bind the parameters of the overloaded function accepting a floating point argument, you need the following syntax:

image
auto f4 = bind((void(*)(float))overloaded, placeholders::_1); // OK

Code snippet from FunctionObjectsind.cpp

Another example of the bind() function is to use the find_if() algorithm to find the first element in a sequence that is greater than or equal to 100. To solve this problem earlier in this chapter, you wrote a function perfectScore() and passed a function pointer to it to find_if(). Now that you know about the comparison functors, it seems as if you should be able to implement a solution using the greater_equal class template.

The problem with greater_equal is that it takes two parameters, whereas find_if() passes only one parameter to its callback predicate each time. You need the ability to specify that find_if() should use greater_equal, but should pass 100 as the second argument each time. That way, each element of the sequence will be compared against 100. This can be accomplished with bind(). The following code uses bind() to bind the second parameter of greater_equal to a fixed value of 100:

image
// Code for inputting scores into the vector omitted, similar as earlier.
auto end = myVector.end();
auto it = find_if(myVector.begin(), end,
    bind(greater_equal<int>(), placeholders::_1, 100));
if (it == end) {
    cout << "No perfect scores" << endl;
} else {
    cout << "Found a "perfect" score of " << *it << endl;
}

Code snippet from FunctionObjectsBinders.cpp

Pre-C++11 bind2nd and bind1st

If your compiler does not support bind(), you need to use a method like bind2nd(). Unlike bind(), bind2nd() only works with binary functions and only allows you to bind the second argument. The following example shows that the bind2nd() method takes a function and a second argument. Because of bind2nd(), the function is then called with one argument passed in by find_if() and one argument derived from the bind2nd().

image
// Code for inputting scores into the vector omitted, similar as earlier.
vector<int>::iterator end = myVector.end();
vector<int>::iterator it = find_if(myVector.begin(),end,
    bind2nd(greater_equal<int>(), 100));
if (it == end) {
    cout << "No perfect scores" << endl;
} else {
    cout << "Found a "perfect" score of " << *it << endl;
}

Code snippet from FunctionObjectsBinders.cpp

The bind2nd() function in this code binds the value 100 as the second parameter to greater_equal. The result is that find_if() compares each element against 100 with greater_equal.

There is also an equivalent bind1st() function that binds an argument to the first parameter of a binary function. For example, the following finds the first value less than 100 using bind1st(), because if the result of bind1st() is called with argument x, it returns the value of 100 > x:

find_if(v.begin(), v.end(), bind1st(greater<int>(), 100));

Compare this to the following, using bind2nd(), which finds the first value greater than 100, because if the result of bind2nd() is called with argument x, it returns the value of x > 100:

find_if(v.begin(), v.end(), bind2nd(greater<int>(), 100));
pen.gif

Both bind2nd() and bind1st() have been deprecated by C++11. Use lambda expressions or bind() instead.

Negators

The negators are functions similar to the binders but they simply complement the result of a predicate. For example, if you wanted to find the first element in a sequence of test scores less than 100, you could apply the not1() negator adapter to the result of greater_equal like this:

image
// Code for inputting scores into the vector omitted, similar as earlier.
auto end = myVector.end();
auto it = find_if(myVector.begin(), end,
    not1(bind2nd(greater_equal<int>(), 100)));
if (it == end) {
    cout << "All perfect scores" << endl;
} else {
    cout << "Found a "less-than-perfect" score of " << *it << endl;
}

Code snippet from FunctionObjectsNegators.cpp

The function not1() complements the result of every call to the predicate it takes as an argument. Of course, you could also just use less instead of greater_equal. There are cases, often when using nonstandard functors, that not1() comes in handy. The “1” in not1() refers to the fact that its operand must be a unary function (one that takes a single argument). If its operand is a binary function (takes two arguments), you must use not2() instead. Note that you use not1() in this case because, even though greater_equal is a binary function, bind2nd() has already converted it to a unary function, by binding the second argument always to 100.

As you can see, using functors and adapters can quickly become complicated. Our advice is to use C++11 lambda expressions and use functors sparingly. For example, the previous find_if() call using the not1() negator can be written more elegantly using a lambda expression:

image
auto it = find_if(myVector.begin(), end, [](int i){ return i < 100; });

Code snippet from FunctionObjectsNegators.cpp

Calling Member Functions

If you have a container of objects, you sometimes want to pass a pointer to a class method as the callback to an algorithm. For example, you might want to find the first empty string in a vector of strings by calling empty() on each string in the sequence. However, if you just pass a pointer to string::empty() to find_if(), the algorithm has no way to know that it received a pointer to a method instead of a normal function pointer or functor. The code to call a method pointer is different from that to call a normal function pointer, because the former must be called in the context of an object.

C++11 provides a conversion function called mem_fn() that you can call on a method pointer before passing it to an algorithm. The following example demonstrates this. Note that you have to specify the method pointer as &string::empty. The &string:: part is not optional. See Chapter 21 for details.

image
void findEmptyString(const vector<string>& strings)
{
    auto end = strings.end();
    auto it = find_if(strings.begin(), end, mem_fn(&string::empty));
    if (it == end) {
        cout << "No empty strings!" << endl;
    } else {
        cout << "Empty string at position: " << it - strings.begin() << endl;
    }
}

Code snippet from FunctionObjectsEmptyString.cpp

mem_fn() generates a function object that serves as the callback for find_if(). Each time it is called back, it calls the empty() method on its argument.

mem_fn() works exactly the same when you have a container of pointers to objects instead of objects themselves. For example:

image
void findEmptyString(const vector<string*>& strings)
{
    auto end = strings.end();
    auto it = find_if(strings.begin(), end, mem_fn(&string::empty));
    // Remaining of function omitted because it is the same as earlier
}

Code snippet from FunctionObjectsEmptyStringPtr.cpp

If your compiler does not yet support the C++11 mem_fn(), you have to use mem_fun_ref() when you have a container of objects as follows:

image
auto it = find_if(strings.begin(), end, mem_fun_ref(&string::empty));

Code snippet from FunctionObjectsEmptyString.cpp

However, if you have a container of pointers to objects and you can’t use the C++11 mem_fn(), then you have to use mem_fun() as follows:

image
auto it = find_if(strings.begin(), end, mem_fun(&string::empty));

Code snippet from FunctionObjectsEmptyStringPtr.cpp

pen.gif

mem_fun_ref() and mem_fun() work in restricted cases. If the method takes 0 arguments, the result can be used as a callback for a unary function; if it takes 1 argument, the result can be used as a callback for a binary function. C++11 mem_fn() works for n-ary functions and for both containers of pointers to objects and containers of objects themselves

mem_fn(), mem_fun_ref(), and mem_fun()are not the most intuitive ways to implement the findEmptyString() function. Using C++11 lambda expressions, it can be implemented in a much more readable and elegant way. Here is the implementation using a lambda expression working on a container of objects:

image
void findEmptyString(const vector<string>& strings)
{
    auto end = strings.end();
    auto it = find_if(strings.begin(), end,
        [](const string& str){ return str.empty(); });
    // Remaining of function omitted because it is the same as earlier
}

Code snippet from FunctionObjectsEmptyString.cpp

Similarly, the following uses a lambda expression working on a container of pointers to objects:

image
void findEmptyString(const vector<string*>& strings)
{
    auto end = strings.end();
    auto it = find_if(strings.begin(), end,
        [](const string* str){ return str->empty(); });
    // Remaining of function omitted because it is the same as earlier
}

Code snippet from FunctionObjectsEmptyStringPtr.cpp

pen.gif

Both mem_fun_ref() and mem_fun() have been deprecated by C++11. Use lambda expressions instead, or mem_fn().

Adapting Real Functions

You can’t use normal function pointers directly with the function adapters bind1st(), bind2nd(), not1(), or not2(), because these adapters require specific typedefs in the function objects they adapt. For example, not1() can only operate on function objects that include an argument_type typedef, while not2() requires a first_argument_type and second_argument_type typedef. The bind1st() and bind2nd() adapters require similar typedefs.

One last function adapter provided by the C++ standard library, ptr_fun(), allows you to wrap regular function pointers in a way that they can be used with the adapters. The ptr_fun() adapter is only explained here because you might encounter it in older codebases. If you are writing new code and are using a C++11 compiler, you should use lambda expressions.

As an example, suppose that you want to write a function isNumber() that returns true if every character in a string is a digit. The C++ string class provides an iterator. Thus, you can use the find_if() algorithm to search for the first nondigit in the string. If you find one, the string is not a number. The <cctype> header file provides a legacy C function called isdigit(), which returns true if a character is a digit, false otherwise.

cross.gif

The definition of isdigit() in <cctype> is not the same as isdigit() in the more traditional C library <ctype.h>. In <ctype.h>, the isdigit() function is defined as a macro, which makes it completely unsuitable for use in this context. In <cctype> it is defined as a function.

The problem is that you want to find the first character that is not a digit, which requires the not1() adapter. However, because isdigit() is a C function, not a function object, you need to use the ptr_fun() adapter to generate a function object that can be used with not1(). The code looks as follows:

image
bool isNumber(const string& str)
{
    auto end = str.end();
    auto it = find_if(str.begin(), end, not1(ptr_fun(::isdigit))); 
    return (it == end);
}

Code snippet from FunctionObjectsIsNumber.cpp

Note the use of the :: scope resolution operator to specify that isdigit() should be found in the global scope.

C++11 has deprecated ptr_fun() and you should use a lambda expression instead:

image
bool isNumber(const string& str)
{
    auto end = str.end();
    auto it = find_if(str.begin(), end, [](char c){return !::isdigit(c);});
    return (it == end);
}

Code snippet from FunctionObjectsIsNumber.cpp

For this example, you can also use the new C++11 find_if_not() algorithm as follows:

image
bool isNumber(const string& str)
{
    auto end = str.end();
    auto it = find_if_not(str.begin(), end, ::isdigit); 
    return (it == end);
}

Code snippet from FunctionObjectsIsNumber.cpp

pen.gif

ptr_fun() has been deprecated by C++11. Use lambda expressions instead.

Writing Your Own Function Objects

If your compiler does not yet support the C++11 lambda expressions, you can write your own function objects to perform more specific tasks than those provided by the predefined functors. If you want to be able to use the function adapters with these functors, you must supply certain typedefs. The easiest way to do that is to subclass your function object classes from either unary_function or binary_function, depending on whether they take one or two arguments. These two classes, defined in <functional>, are templatized on the parameter and return types of the “function” they provide. For example, instead of using ptr_fun() to convert isdigit(), you could write a wrapper function object like this:

image
class myIsDigit : public unary_function<char, bool>
{
    public:
        bool operator() (char c) const { return ::isdigit(c); }
};
bool isNumber(const string& str)
{
    auto end = str.end();
    auto it = find_if(str.begin(), end, not1(myIsDigit()));
    return (it == end);
}

Code snippet from FunctionObjectsWritingFunctionObject.cpp

Note that the overloaded function call operator of the myIsDigit class must be const in order to pass objects of it to find_if().

cross.gif

The algorithms are allowed to make multiple copies of function object predicates and call different ones for different elements. The function call operator needs to be const, thus, you cannot write functors such that they count on any internal state to the object being consistent between calls.

Before C++11, a class defined locally in the scope of a function could not be used as a template argument. C++11 removes this limitation. The following example is perfectly legal in C++11, but was not legal before C++11:

image
bool isNumber(const string& str)
{
    class myIsDigit : public unary_function<char, bool>
    {
        public:
            bool operator() (char c) const { return ::isdigit(c); }
    };
    auto end = str.end();
    auto it = find_if(str.begin(), end, not1(myIsDigit()));
    return (it == end);
}

Code snippet from FunctionObjectsWritingFunctionObjectLocal.cpp

pen.gif

As you can see from these examples, C++11 lambda expressions allow you to write more readable and more elegant code. We recommend to use simple lambda expressions instead of function objects, and to use function objects only when they need to do more complicated things.

ALGORITHM DETAILS

This chapter describes the general categories of algorithms, with examples of each. The Standard Library Reference resource on the website (www.wrox.com) contains a summary of all the algorithms.

There are five types of iterators: input, output, forward, bidirectional, and random-access. These are described in Chapter 12. There is no formal class hierarchy of these iterators, because the implementations for each container are not part of the standard hierarchy. However, one can deduce a hierarchy based on the functionality they are required to provide. Specifically, every random access iterator is also bidirectional, every bidirectional iterator is also forward, and every forward iterator is also input and output. Figure 13-1 shows such hierarchy. Dotted lines are used because the figure is not a real class hierarchy.

The standard way for the algorithms to specify what kind of iterators they need is to use the following names for the iterator template arguments: InputIterator, OutputIterator, ForwardIterator, BidirectionalIterator, and RandomAccessIterator. These names are just names: They don’t provide binding type checking. Therefore, you could, for example, try to call an algorithm expecting a RandomAccessIterator by passing a bidirectional iterator. The template doesn’t do type checking, so it would allow this instantiation. However, the code in the function that uses the random access iterator capabilities would fail to compile on the bidirectional iterator. Thus, the requirement is enforced, just not where you would expect. The error message can therefore be somewhat confusing. For example, attempting to use the generic sort() algorithm, which requires a random access iterator, on a list, which provides only a bidirectional iterator, gives this error in g++:

/usr/include/c++/3.2.2/bits/stl_algo.h: In function 'void
   std::sort(_RandomAccessIter, _RandomAccessIter) [with _RandomAccessIter =
   std::_List_iterator<int, int&, int*>]':
Sorting.cpp:38:   instantiated from here
/usr/include/c++/3.2.2/bits/stl_algo.h:2178: no match for '
   std::_List_iterator<int, int&, int*>& - std::_List_iterator<int, int&,
   int*>&' operator

Most of the algorithms are defined in the <algorithm> header file, but a few algorithms are located in <numeric>. They are all in the std namespace.

Utility Algorithms

The STL provides four utility algorithms implemented as function templates: min(), max(), swap() and the C++11 minmax(). The min() and max() functions compare two elements of any type using operator< or a user-supplied binary predicate, returning a const reference to the smaller or larger element, respectively. The swap() function takes two elements of any type by reference and switches their values. With C++11, the min() and max() algorithms can be used to compare more than two values, while the minmax() algorithm returns a pair containing the minimum and the maximum value of two or more elements.

These utilities do not work on sequences of elements, so they do not take iterator parameters.

The following program demonstrates the four algorithms:

image
int x = 4, y = 5;
cout << "x is " << x << " and y is " << y << endl;
cout << "Max is " << max(x, y) << endl;
cout << "Min is " << min(x, y) << endl;
swap(x, y);
cout << "x is " << x << " and y is " << y << endl;
cout << "Max is " << max(x, y) << endl;
cout << "Min is " << min(x, y) << endl;
 
// C++11: using max and min on more than two values
int x1 = 2, x2 = 9, x3 = 3, x4 = 12;
cout << "Max of 4 elements is " << max({x1,x2,x3,x4}) << endl;
cout << "Min of 4 elements is " << min({x1,x2,x3,x4}) << endl;
// C++11: using minmax
auto p2 = minmax({x1,x2,x3,x4});
cout << "Minmax of 4 elements is <"
     << p2.first << "," << p2.second << ">" << endl;

Code snippet from UtilityAlgorithmsutilities.cpp

Here is the program output:

x is 4 and y is 5
Max is 5
Min is 4
x is 5 and y is 4
Max is 5
Min is 4
Max of 4 elements is 12
Min of 4 elements is 2
Minmax of 4 elements is <2,12>
pen.gif

The C language also includes a min() and max() function; however, they are implemented as macros, and will potentially evaluate one of their arguments twice; whereas std::min() and std::max() evaluate each argument exactly once. Make sure you always use the C++ versions, std::min() and std::max(), either by explicitly specifying std:: or by using a using namespace std clause.

Non-Modifying Algorithms

The non-modifying algorithms include functions for searching elements in a range, generating numerical information about elements in a range, comparing two ranges to each other, and processing each element in a range.

Search Algorithms

You’ve already seen three examples of using search algorithms: find(), find_if(), and the C++11 find_if_not(). The STL provides several other variations of the basic find() algorithm that work on sequences of elements. The section “Search Algorithms” in Chapter 11 describes the different search algorithms that are available, including their complexity.

All the algorithms use default comparisons of operator== or operator<, but also provide overloaded versions that allow you to specify a comparison callback.

Here are examples of some of the search algorithms:

image
// The list of elements to be searched
vector<int> myVector = {5, 6, 9, 8, 8, 3}; 
auto begin = myVector.begin();
auto end = myVector.end();
 
// Find the min and max elements in the vector
auto it = min_element(begin, end);
auto it2 = max_element(begin, end);
cout << "min is " << *it << " and max is " << *it2 << endl;
 
// Find the first pair of matching consecutive elements
it = adjacent_find(begin, end);
if (it != end) {
    cout << "Found two consecutive equal elements with value " << *it << endl;
}
 
// Find the first of two values
vector<int> targets = {8, 9};
it = find_first_of(begin, end, targets.begin(), targets.end());
if (it != end) {
    cout << "Found one of 8 or 9: " << *it << endl;
}
 
// Find the first subsequence
vector<int> sub = {8, 3};
it = search(begin, end, sub.begin(), sub.end());
if (it != end) {
    cout << "Found subsequence {8,3}" << endl;
} else {
    cout << "Unable to find subsequence {8,3}" << endl;
}
 
// Find the last subsequence (which is the same as the first in this example)
it2 = find_end(begin, end, sub.begin(), sub.end());
if (it != it2) {
    cout << "Error: search and find_end found different subsequences "
         << "even though there is only one match." << endl;
}
 
// Find the first subsequence of two consecutive 8s
it = search_n(begin, end, 2, 8);
if (it != end) {
    cout << "Found two consecutive 8s" << endl;
} else {
    cout << "Unable to find two consecutive 8s" << endl;
}

Code snippet from NonModifyingAlgorithmsSearchAlgorithms.cpp

Here is the output:

min is 3 and max is 9
Found two consecutive equal elements with value 8
Found one of 8 or 9: 9
Found subsequence {8,3}
Found two consecutive 8s

There are also several search algorithms that work only on sorted sequences: binary_search(), lower_bound(), upper_bound(), and equal_range(). Examples of sorted sequences are vectors whose contents are sorted, map, multimap, set, and multiset. The binary_search() algorithm finds a matching element in logarithmic time (see Chapter 2). The other three are similar to their method equivalents on the map and set containers. See Chapter 12 for an example on how to use them.

pen.gif

Remember to use equivalent container methods when available instead of generic algorithms, because the methods are more efficient.

The section “Search Algorithms” in Chapter 11 also describes a number of new C++11 search algorithms: find_if_not(), minmax_element(), all_of(), any_of(), none_of(), and partition_point().

The following example shows these new algorithms in action. This example also demonstrates the use of cbegin() and cend() to get const iterators.

image
vector<int> vec = {0,0,0,1,0,2,0};
auto begin = vec.cbegin();
auto end = vec.cend();
 
// Find an element != 0
auto it = find_if_not(begin, end, [](int i){return i == 0;});
if (it == end)
    cout << "No element found != 0" << endl;
else
    cout << "Found element " << *it << " != 0" << endl;
 
// Find min and max with 1 algorithm
auto minmax = minmax_element(begin, end);
cout << "Min = " << *(minmax.first) << " and Max = " << *(minmax.second) << endl;
 
// all_of()
vector<int> vec2 = {1,1,1,1};
if (all_of(vec2.cbegin(), vec2.cend(), [](int i){return i == 1;}))
    cout << "All elements are == 1" << endl;
else
    cout << "Not all elements are == 1" << endl;
 
// any_of()
vector<int> vec3 = {0,0,1,0};
if (any_of(vec3.cbegin(), vec3.cend(), [](int i){return i == 1;}))
    cout << "At least one element == 1" << endl;
else
    cout << "No elements are == 1" << endl;
 
// none_of()
vector<int> vec4 = {0,0,0,0};
if (none_of(vec4.cbegin(), vec4.cend(), [](int i){return i == 1;}))
    cout << "All elements are != 1" << endl;
else
    cout << "Some elements are == 1" << endl;
 
// partition_point()
vector<int> vec5 = {1,1,0,4,5,6};
auto ppoint = partition_point(vec5.cbegin(), vec5.cend(),
    [](int i){return i == 1;});
cout << "Partition point at position " << (ppoint-vec5.cbegin()) << endl;

Code snippet from NonModifyingAlgorithmsCpp11SearchAlgorithms.cpp

The output is as follows:

Found element 1 != 0
Min = 0 and Max = 2
All elements are == 1
At least one element == 1
All elements are != 1
Partition point at position 2

Numerical Processing Algorithms

You’ve seen an example of one numerical processing algorithm already: accumulate(). In addition, the count() and count_if() algorithms are useful for counting the number of elements of a given value in a container. They function similarly to the count() method on the map and set containers. See section “Numerical Processing Algorithms” in Chapter 11 for a description of all numerical processing algorithms that are available.

As another example, the following calculates the inner product between two vectors, which is in this example (1*9)+(2*8)+(3*7)+(4*6):

image
vector<int> v1 = {1,2,3,4};
vector<int> v2 = {9,8,7,6};
cout << inner_product(v1.cbegin(), v1.cend(), v2.cbegin(), 0) << endl;

Code snippet from NonModifyingAlgorithmsinner_product.cpp

The output is 70.

Comparison Algorithms

You can compare entire ranges of elements in three different ways: equal(), mismatch(), and lexicographical_compare(). These algorithms have the advantage that you can compare ranges in different containers. For example, you can compare the contents of a vector with the contents of a list. In general, these work best with sequential containers. They work by comparing the values in corresponding positions of the two collections to each other.

  • equal() returns true if all corresponding elements are equal. It requires both containers to have the same number of elements.
  • mismatch() returns iterators, one iterator for each of the collections, to indicate where in the range the corresponding elements mismatched.
  • lexicographical_compare() deals with the situation where the two ranges may contain different numbers of elements. It returns true if all the elements in the first range are less than their corresponding elements in the second range, or, if the first range has fewer elements than the second and all elements in the first range are less than their corresponding initial subsequence in the second set.

“lexicographical_compare” gets its name because it resembles the rules for comparing strings, but extends this set of rules to deal with objects of any type.

pen.gif

If you want to compare the elements of two containers of the same type, you can use operator== or operator< instead of equal() or lexicographical_compare(). The algorithms are useful primarily for comparing sequences of elements from different container types.

Here are some examples of these algorithms:

image
// Function template to populate a container of ints.
// The container must support push_back().
template<typename Container>
void populateContainer(Container& cont)
{
    int num;
    while (true) {
        cout << "Enter a number (0 to quit): ";
        cin >> num;
        if (num == 0) {
            break;
        }
        cont.push_back(num);
    }
}
int main()
{
    vector<int> myVector;
    list<int> myList;
    cout << "Populate the vector:" << endl;
    populateContainer(myVector);
    cout << "Populate the list:" << endl;
    populateContainer(myList);
 
    if (myList.size() < myVector.size()) {
        cout << "Sorry, the list is not long enough." << endl;
        return 1;
    }
    // compare the two containers
    if (equal(myVector.begin(), myVector.end(), myList.begin())) {
        cout << "The two containers have equal elements" << endl;
    } else {
        // If the containers were not equal, find out why not
        auto miss = mismatch(myVector.begin(), myVector.end(),
            myList.begin());
        cout << "The following initial elements are "
             << "the same in the vector and the list:" << endl;
        for (auto i = myVector.begin(); i != miss.first; ++i)
            cout << *i << '	';
        cout << endl;
    }
    // Now order them.
    if (lexicographical_compare(myVector.begin(), myVector.end(),
        myList.begin(), myList.end())) {
        cout << "The vector is lexicographically first." << endl;
    } else {
        cout << "The list is lexicographically first." << endl;
    }
    return 0;
}

Code snippet from NonModifyingAlgorithmsComparisonAlgorithms.cpp

Here is a sample run of the program:

Populate the vector:
Enter a number (0 to quit): 5
Enter a number (0 to quit): 6
Enter a number (0 to quit): 7
Enter a number (0 to quit): 0
Populate the list:
Enter a number (0 to quit): 5
Enter a number (0 to quit): 6
Enter a number (0 to quit): 9
Enter a number (0 to quit): 8
Enter a number (0 to quit): 0
The following initial elements are the same in the vector and the list:
5      6
The vector is lexicographically first.

Operational Algorithms

There is only one algorithm in this category: for_each(). However, it is one of the most useful algorithms in the STL. It executes a callback on each element of the range. You can use it with simple function callbacks or lambda expressions for things like printing every element in a container. Following is an example using a lambda expression and uniform initialization, printing the elements from the map:

image
map<int, int> myMap = {{4, 40}, {5, 50}, {6, 60}};
for_each(myMap.cbegin(), myMap.cend(), [](const pair<int, int>& p)
         {cout << p.first << "->" << p.second << endl;});

Code snippet from NonModifyingAlgorithmsForEachBasicLambda.cpp

The output is as follows:

4->40
5->50
6->60

Doing the same thing in pre-C++11 requires a separate function, and a pointer to it is given to for_each():

image
void printPair(const pair<int, int>& elem)
{
    cout << elem.first << "->" << elem.second << endl;
}
int main()
{
    map<int, int> myMap;
    myMap.insert(make_pair(4, 40));
    myMap.insert(make_pair(5, 50));
    myMap.insert(make_pair(6, 60));
    for_each(myMap.begin(), myMap.end(), printPair); 
    return 0;
}

Code snippet from NonModifyingAlgorithmsForEachBasic.cpp

A functor could be useful in the context of for_each() because for_each() returns a copy of the callback object, so you can accumulate information in your functor that you can retrieve after for_each() has finished processing each element. For example, you could calculate both the sum and product of elements in one pass by writing a functor SumAndProd that tracks both at the same time:

image
// The populateContainer() function is identical to the one shown earlier
// for comparison algorithms, so it is omitted here.
class SumAndProd : public unary_function<int, void>
{
    public:
        SumAndProd() : sum(0), prod(1) {}
        void operator()(int elem);
        // make sum and prod public for easy access
        int sum;
        int prod;
};
void SumAndProd::operator()(int elem)
{
    sum += elem;
    prod *= elem;
} 
int main()
{
    vector<int> myVector;
    populateContainer(myVector);
    SumAndProd func;
    func = for_each(myVector.cbegin(), myVector.cend(), func);
    cout << "The sum is " << func.sum << endl;
    cout << "The product is " << func.prod << endl;
    return 0;
}

Code snippet from NonModifyingAlgorithmsSumAndProd.cpp

You might be tempted to ignore the return value of for_each(), yet still try to read information from func after the call. However, that doesn’t work because func is passed-by-value into for_each(), so for_each() receives a copy of func. You must capture the return value in order to ensure correct behavior.

To show the power of C++11 lambda expressions, the preceding example using a functor to calculate the sum and product of a range can be modified to use a small lambda expression as follows. Note that the lambda expression captures all variables in its enclosing scope by reference with [&], otherwise changes made to sum and prod in the lambda expression would not be visible outside the lambda:

image
// The populateContainer() function is identical to the one shown earlier
// for comparison algorithms, so it is omitted here.
vector<int> myVector;
populateContainer(myVector);
int sum = 0;
int prod = 1;
for_each(myVector.cbegin(), myVector.cend(),
    [&](int i){
        sum += i;
        prod *= i;
});
cout << "The sum is " << sum << endl;
cout << "The product is " << prod << endl;

Code snippet from NonModifyingAlgorithmsSumAndProdLambda.cpp

A final point about for_each() is that your lambda or callback is allowed to take its argument by reference and modify it. That has the effect of changing values in the actual iterator range. The voter registration example later in this chapter shows a use of this capability.

Modifying Algorithms

The STL provides a variety of modifying algorithms that perform tasks such as copying elements from one range to another, removing elements, or reversing the order of elements in a range.

The modifying algorithms all have the concept of source and destination ranges. The elements are read from the source range and added to or modified in the destination range. The source and destination ranges can often be the same, in which case the algorithm is said to operate in place.

cross.gif

Ranges from maps and multimaps cannot be used as destinations of modifying algorithms. These algorithms overwrite entire elements, which in a map consist of key/value pairs. However, maps and multimaps mark the key const, so it cannot be assigned to. Similarly, many implementations of set and multiset provide only const iteration over the elements, so you cannot generally use ranges from these containers as destinations of modifying algorithms either. Your alternative is to use an insert iterator, described in Chapter 17.

The section “Modifying Algorithms” in Chapter 11 lists all available modifying algorithms with a description. This section provides code examples for a number of those algorithms.

imageiota

C++11 adds an iota() algorithm, defined in the <numeric> header file, which generates a sequence of values in the specified range starting with the specified value and applying operator++ to generate each successive value. The following example shows how to use this new algorithm on a vector of integers, but note that it works on any element type that implements operator++:

image
vector<int> vec(10);
iota(vec.begin(), vec.end(), 5);
for (auto& i : vec) cout << i << " ";

Code snippet from ModifyingAlgorithmsiota.cpp

The output is as follows:

5 6 7 8 9 10 11 12 13 14

transform

The transform() algorithm is similar to for_each(), in that it applies a callback to each element in a range. The difference is that transform() expects the callback to generate a new element for each call, which it stores in the destination range specified. The source and destination ranges can be the same if you want transform() to replace each element in a range with the result from the call to the callback. For example, you could add 100 to each element in a vector like this:

image
// The populateContainer() function is identical to the one shown earlier
// for comparison algorithms, so it is omitted here.
vector<int> myVector;
populateContainer(myVector);
cout << "The vector contents are:" << endl;
for (auto& i : myVector) cout << i << " ";
cout << endl;
transform(myVector.begin(), myVector.end(), myVector.begin(),
    [](int i){return i + 100;});
cout << "The vector contents are:" << endl;
for (auto& i : myVector) cout << i << " ";

Code snippet from ModifyingAlgorithmsTransformLambda.cpp

Another form of transform() calls a binary function on pairs of elements in the range. The following example creates two vectors and uses transform() to calculate the sum of pairs of elements and store the result back in the first vector:

image
// The populateContainer() function is identical to the one shown earlier
// for comparison algorithms, so it is omitted here.
vector<int> vec1;
cout << "Vector1:" << endl;
populateContainer(vec1);
cout << "Vector2:" << endl;
vector<int> vec2;
populateContainer(vec2);
if (vec2.size() < vec1.size())
{
    cout << "Vector2 should be at least the same size as vector1." << endl;
    return 1;
}
// Create a lambda to print a vector
auto printVec = [](const vector<int>& vec){
    for (auto& i : vec) cout << i << " ";
    cout << endl;
};
cout << "Vector1: "; printVec(vec1);
cout << "Vector2: "; printVec(vec2);
 
transform(vec1.begin(), vec1.end(),
    vec2.begin(), vec1.begin(),
    [](int a, int b){return a + b;});
 
cout << "Vector1: "; printVec(vec1);
cout << "Vector2: "; printVec(vec2);

Code snippet from ModifyingAlgorithmsTransformLambdaBinary.cpp

The output could look as follows:

Vector1:
Enter a number (0 to quit): 1
Enter a number (0 to quit): 2
Enter a number (0 to quit): 0
Vector2:
Enter a number (0 to quit): 11
Enter a number (0 to quit): 22
Enter a number (0 to quit): 33
Enter a number (0 to quit): 0
Vector1: 1 2
Vector2: 11 22 33
Vector1: 12 24
Vector2: 11 22 33
pen.gif

transform() and the other modifying algorithms often return an iterator referring to the past-the-end value of the destination range. The examples in this book usually ignore that return value.

copy

The copy() algorithm allows you to copy elements from one range to another, starting with the first element and proceeding to the last element in the range. The source and destination ranges must be different, but they can overlap. Note that copy() doesn’t insert elements into the destination range. It just overwrites whatever elements were there already. Thus, you can’t use copy() directly to insert elements into a container, only to overwrite elements that were previously in a container.

pen.gif

Chapter 17 describes how to use iterator adapters to insert elements into a container or stream with copy().

Here is a simple example of copy() that uses the resize() method on vectors to ensure that there is enough space in the destination container. It copies all elements from vec1 to vec2:

image
// The populateContainer() function is identical to the one shown earlier
// for comparison algorithms, so it is omitted here.
vector<int> vec1, vec2;
populateContainer(vec1);
vec2.resize(vec1.size());
copy(vec1.cbegin(), vec1.cend(), vec2.begin());
for_each(vec2.cbegin(), vec2.cend(), [](int i){cout << i << " ";});

Code snippet from ModifyingAlgorithmsCopy.cpp

There is also a copy_backward() algorithm, which copies the elements from the source backward to the destination. In other words, it starts with the last element of the source and puts it in the last position in the destination range and moves backward after each copy. The preceding example can be modified to use copy_backward() instead of copy() as follows. Note that you need to specify vec2.end() as third argument instead of vec2.begin():

image
copy_backward(vec1.cbegin(), vec1.cend(), vec2.end());

Code snippet from ModifyingAlgorithmscopy_backward.cpp

This will result in exactly the same output.

imageC++11 adds a few new copy algorithms. These include conditional copy, copy of a specified number of elements (so instead of specifying an ending iterator, only a count is necessary), and a copy that can copy to two different destinations.

The first one is copy_if(). It works by having an input range specified by two iterators, an output destination specified by an iterator, and a predicate (function or lambda expression). The function or lambda expression is executed for each element that is a candidate to be copied. If the returned value is true, the element is copied and the destination iterator is incremented; if the value is false the element is not copied and the destination iterator is not incremented. Thus, the destination may hold fewer elements than the source range. For some containers, because they must have already created space to hold the maximum possible number of elements (remember, copy does not create or extend containers, merely replaces the existing elements), it might be desirable to remove the space “beyond” where the last element was copied to. To facilitate this, copy_if() returns an iterator to the one-past-the-last-copied element in the destination range. This can be used to determine how many elements should be removed from the destination container. The following example demonstrates this by copying only the even numbers to vec2:

image
// The populateContainer() function is identical to the one shown earlier
// for comparison algorithms, so it is omitted here.
vector<int> vec1, vec2;
populateContainer(vec1);
vec2.resize(vec1.size());
auto endIterator = copy_if(vec1.cbegin(), vec1.cend(),
         vec2.begin(), [](int i){return i%2==0;});
vec2.erase(endIterator, vec2.end());
for_each(vec2.cbegin(), vec2.cend(), [](int i){cout << i << " ";});

Code snippet from ModifyingAlgorithmscopy_if.cpp

Another C++11 addition is copy_n(), which copies n elements from the source to the destination. The first parameter of copy_n() is the start iterator. The second parameter of copy_n() is an integer specifying the number of elements to copy and the third parameter is the destination iterator. The copy_n() algorithm does not perform any bounds checking, so you must make sure that the start iterator, incremented by the number of elements to copy, does not exceed the end() of the collection or your program will have undefined behavior. Following is an example:

image
// The populateContainer() function is identical to the one shown earlier
// for comparison algorithms, so it is omitted here.
vector<int> vec1, vec2;
populateContainer(vec1);
size_t cnt = 0;
cout << "Enter number of elements you want to copy: ";
cin >> cnt;
cnt = min(cnt, vec1.size());
vec2.resize(cnt);
copy_n(vec1.cbegin(), cnt, vec2.begin());
for_each(vec2.cbegin(), vec2.cend(), [](int i){cout << i << " ";});

Code snippet from ModifyingAlgorithmscopy_n.cpp

The last copy-related algorithm added by C++11 is called partition_copy(), which copies elements from the source to two different destinations. The specific destination for each element is selected based on the result of a predicate, either true or false. The returned value of partition_copy() is a pair of iterators: one iterator referring to one-past-the-last-copied element in the first destination range, and one iterator referring to one-past-the-last-copied element in the second destination range. These returned iterators can be used in combination with erase() to remove excess elements from the two destination ranges, just as with the copy_if() example earlier. The following example asks the user to enter a number of integers, which are then partitioned into two destination vectors; one for the even numbers and one for the odd numbers:

image
// The populateContainer() function is identical to the one shown earlier
// for comparison algorithms, so it is omitted here.
vector<int> vec1, vecOdd, vecEven;
populateContainer(vec1);
vecOdd.resize(vec1.size());
vecEven.resize(vec1.size());
 
auto pairIters = partition_copy(vec1.cbegin(), vec1.cend(),
    vecEven.begin(), vecOdd.begin(),
    [](int i){return i%2==0;});
 
vecEven.erase(pairIters.first, vecEven.end());
vecOdd.erase(pairIters.second, vecOdd.end());
cout << "Even numbers: ";
for_each(vecEven.cbegin(), vecEven.cend(), [](int i){cout << i << " ";});
cout << endl << "Odd numbers: ";
for_each(vecOdd.cbegin(), vecOdd.cend(), [](int i){cout << i << " ";});

Code snippet from ModifyingAlgorithmspartition_copy.cpp

The output can be as follows:
Enter a number (0 to quit): 11
Enter a number (0 to quit): 22
Enter a number (0 to quit): 33
Enter a number (0 to quit): 44
Enter a number (0 to quit): 0
Even numbers: 22 44
Odd numbers: 11 33

Note that the last few examples used the for_each() algorithm to print elements of a container. As seen in other examples, you can also print the elements using a C++11 range-based for loop. For example, instead of writing the following:

for_each(vecOdd.cbegin(), vecOdd.cend(), [](int i){cout << i << " ";});

You can write:

for (auto& i : vecOdd)
    cout << i << " ";

imagemove

C++11 adds two move related algorithms: move() and move_backward(). They both use the move semantics introduced in C++11 and discussed in Chapter 9. You have to provide a move assignment operator in your element classes if you want to use these new algorithms on containers with elements of your own types, as demonstrated in the following example. Consult Chapter 9 for details on implementing move assignment operators and the use of std::move(). The MyClass example defines a move assignment operator. The main() function creates a vector with three MyClass objects and then moves those elements from vecSrc to vecDst. Note that the code includes two different uses of move(). The move() function accepting a single argument converts an lvalue into an rvalue (Chapter 9), while move() accepting three arguments is the STL move() algorithm to move elements between containers.

image
class MyClass
{
    public:
        MyClass() {}
        MyClass(const string& str) : mStr(str) {}
        // Move assignment operator
        MyClass& operator=(MyClass&& rhs) {
            if (this == &rhs)
                return *this;
            mStr = std::move(rhs.mStr);
            cout << "Move operator= (mStr=" << mStr << ")" << endl;
            return *this;
        }
        string getString() const {return mStr;}
    protected:
        string mStr;
};
int main()
{
    vector<MyClass> vecSrc = {MyClass("a"), MyClass("b"), MyClass("c")};
    vector<MyClass> vecDst(vecSrc.size());
    move(vecSrc.begin(), vecSrc.end(), vecDst.begin());
    for (auto& c : vecDst) cout << c.getString() << " ";
    return 0;
}

Code snippet from ModifyingAlgorithmsmove.cpp

The output is as follows:

Move operator= (mStr=a)
Move operator= (mStr=b)
Move operator= (mStr=c)
a b c
pen.gif

Chapter 9 explains that source objects in a move operation are reset because the target object takes ownership of the resources of the source object. For the previous example, this means that you should not use the objects from vecSrc anymore after the move operation.

move_backward() uses the same move mechanism as move() but it moves the elements from the last to the first element.

replace

The replace() and replace_if() algorithms replace elements in a range matching a value or predicate, respectively, with a new value. Take replace_if() as an example. Its first and second parameters specify the range of elements in your container. The third parameter is a function or lambda expression that returns true or false. If it returns true, the value in the container is replaced with the value given as fourth parameter; if it returns false, it leaves the original value.

For example, you might want to replace all values less than a lower limit with the value of the lower limit and replace all values greater than an upper limit with the value of the upper limit. This is called “clamping” values to a range. In audio applications, this is known as “clipping”. For example, audio signals are often limited to integers in the range of -32K to +32K. The following example demonstrates this by first calling replace_if() to replace all values less than -32K with -32K and then calling a second replace_if() to replace all values greater than 32K with 32K:

image
// The populateContainer() function is identical to the one shown earlier
// for comparison algorithms, so it is omitted here.
vector<int> vec;
populateContainer(vec);
int lowLim = numeric_limits<short>::min();  // = -32768
int upLim = numeric_limits<short>::max();   // = 32767
replace_if(vec.begin(), vec.end(), [=](int i){return i < lowLim;}, lowLim);
replace_if(vec.begin(), vec.end(), [=](int i){return i > upLim;}, upLim);
for_each(vec.cbegin(), vec.cend(), [](int i){cout << i << " ";});

Code snippet from ModifyingAlgorithmsReplace.cpp

There are also variants of replace() called replace_copy() and replace_copy_if() that copy the results to a different destination range.

remove

Suppose you have a range of elements and you want to remove elements matching a certain condition. The first solution that you might think of is to check the documentation to see if your container has an erase() method and then iterate over all the elements and call erase() for each element that matches the condition. The vector is an example of a container that has such an erase() method. However, if applied to the vector container, this solution is very inefficient as it will cause a lot of memory operations to keep the vector contiguous in memory, resulting in a quadratic complexity (see Chapter 2), which is very bad. This solution is also error-prone, because you need to be careful that you keep your iterators valid after a call to erase(). The correct solution for this problem is the so-called remove-erase-idiom, which runs in linear time and is explained in this section.

The remove() and remove_if() algorithms “remove” certain elements from a range by partitioning the collection into two sets: the elements to be kept and the elements to be removed. The elements to remove can be specified by either a specific value or with a predicate. These elements are not really removed from the underlying container, because the algorithms have access only to the iterator abstraction, not to the container. Instead, elements to be kept are moved or copied to the beginning of the range and elements to be removed are moved or copied to the end of the range. An iterator is returned that points to the first element in the range of elements to be removed. If you want to actually erase these elements from the container, you must use the remove() algorithm, then call erase() on the container to erase all the elements from the returned iterator up to the end of the range. This is the remove-erase-idiom. Here is an example of a function that removes empty strings from a vector of strings:

image
void removeEmptyStrings(vector<string>& strings)
{
    auto it = remove_if(strings.begin(), strings.end(),
        [](const string& str){return str.empty();});
    // Erase the removed elements.
    strings.erase(it, strings.end());
}
int main()
{
    vector<string> myVector = {"", "one", "", "two", "three", "four"};
    for (auto& str : myVector) cout << """ << str << ""  ";
    cout << endl;
    removeEmptyStrings(myVector);
    for (auto& str : myVector) cout << """ << str << ""  ";
    cout << endl;
    return 0;
}

Code snippet from ModifyingAlgorithmsRemove.cpp

The output is as follows:

""  "one"  ""  "two"  "three"  "four"
"one"  "two"  "three"  "four"

The remove_copy() and remove_copy_if() variations of remove() do not change the source range. Instead they copy all unremoved/retained elements to a different destination range. They are similar to copy(), in that the destination range must already be large enough to hold the new elements.

pen.gif

The remove() family of functions is stable in that it maintains the order of elements remaining in the container even while moving the retained elements toward the beginning.

unique

The unique() algorithm is a special case of remove() that removes all duplicate contiguous elements. The list container provides its own unique() method that implements the same semantics. You should generally use unique() on sorted sequences, but nothing prevents you from running it on unsorted sequences.

The basic form of unique() runs in place, but there is also a version of the algorithm called unique_copy() that copies its results to a new destination range.

Chapter 12 showed an example of the list::unique() algorithm, so we omit an example of the general form here.

reverse

The reverse() algorithm reverses the order of the elements in a range. The first element in the range is swapped with the last, the second with the second-to-last, and so on.

The basic form of reverse() runs in place and requires two arguments: a start and end iterator for the range. There is also a version of the algorithm called reverse_copy() that copies its results to a new destination range and requires three arguments: a start and end iterator for the source range and a start iterator for the destination range.

Other Modifying Algorithms

The STL includes several other modifying algorithms, including iter_swap(), swap_ranges(), fill(), fill_n(), generate_n(), rotate(), rotate_copy(), next_permutation(), and prev_permutation(). They are used less frequently, so no examples are given for them. However, if you understand how to use the algorithms explained in this section, you should have no problem using these other algorithms. Consult the section “Modifying Algorithms” in Chapter 11 to get a list of all available modifying algorithms with a brief description.

Sorting Algorithms

The STL provides several variations of sorting algorithms. A “sorting algorithm” will reorder the contents of a container such that an ordering is maintained between sequential elements of the collection. Thus, it applies only to sequential collections. Sorting is not relevant to associative containers because they already maintain elements in a sorted order. Sorting is not relevant to the unordered associative containers either because they have no concept of ordering. Some containers, such as list and forward_list provide their own sorting methods because these can be implemented more efficiently internally than a general sort mechanism. Consequently, the general sorting algorithms are most useful for vectors, deques, and arrays.

Basic Sorting and Merging

The sort() function sorts a range of elements in O(N log N) time in the general case. Following the application of sort() to a range, the elements in the range are in nondecreasing order (lowest to highest), according to operator<. If you don’t like that order, you can specify a different comparison callback such as greater.

A variant of sort(), called stable_sort(), maintains the relative order of equal elements in the range. However, because it needs to maintain relative order of equal elements in the range, it is less efficient than the sort() algorithm.

Once you have sorted the elements in a range, you can apply the binary_search() algorithm to find elements in logarithmic instead of linear time. It requires a start and end iterator specifying the range, a value to search, and optionally a comparison callback. It returns true if the value is found in the specified range, false otherwise.

The merge() function allows you to merge two sorted ranges together, while maintaining the sorted order. The result is a sorted range containing all the elements of the two source ranges. It works in linear time. The following parameters are required:

  • start and end iterator of first source range
  • start and end iterator of second source range
  • start iterator of destination range
  • optionally, a comparison callback

Without merge(), you could still achieve the same effect by concatenating the two ranges and applying sort() to the result, but that would be less efficient [O(N log N) instead of linear].

cross.gif

Always ensure that you supply a big enough destination range to store the result of the merge!

Here is an example of sorting and merging:

image
// The populateContainer() function is identical to the one shown earlier
// for comparison algorithms, so it is omitted here.
vector<int> vectorOne, vectorTwo, vectorMerged;
cout << "Enter values for first vector:" << endl;
populateContainer(vectorOne);
cout << "Enter values for second vector:" << endl;
populateContainer(vectorTwo);
// Sort both containers
sort(vectorOne.begin(), vectorOne.end());
sort(vectorTwo.begin(), vectorTwo.end());
// Make sure the destination vector is large enough to hold the values
// from both source vectors.
vectorMerged.resize(vectorOne.size() + vectorTwo.size());
merge(vectorOne.cbegin(), vectorOne.cend(), vectorTwo.cbegin(),
    vectorTwo.cend(), vectorMerged.begin());
cout << "Merged vector: ";
for_each(vectorMerged.cbegin(), vectorMerged.cend(),
    [](int i){cout << i << " ";});
cout << endl;
while (true) {
    int num;
    cout << "Enter a number to find (0 to quit): ";
    cin >> num;
    if (num == 0) {
        break;
    }
    if (binary_search(vectorMerged.cbegin(), vectorMerged.cend(), num)) {
        cout << "That number is in the vector." << endl;
    } else {
        cout << "That number is not in the vector." << endl;
    }
}

Code snippet from SortingAlgorithmsSortingAndMerging.cpp

Other Sorting Algorithms

If you need to write your own sort algorithm, which is unlikely, there are several sorting routines that can be used as building blocks, including partition(), partition_copy() (C++11), partial_sort(), partial_sort_copy() (C++11), and nth_element(). C++11 also introduces is_sorted() and is_sorted_until() algorithms; is_sorted() returns true if the given range is sorted, while is_sorted_until() returns an iterator in the given range such that everything before this iterator is sorted. However, writing your own sort algorithm is a very exotic problem domain not usually operated in, and therefore we do not discuss these further.

The Standard Library Reference resource on the website contains the details in case the need arises to use one of these algorithms.

random_shuffle

The final “sorting” algorithm is technically more of an “anti-sorting” algorithm; random_shuffle() rearranges the elements of a range in a random order with a linear complexity. It’s useful for implementing tasks like shuffling a deck of cards. There are two versions of random_shuffle(). The first version requires a start and end iterator for the range that you want to shuffle. For this version, the C++ standard does not define which random number generator the implementation has to use. Most compilers use rand() from the standard C library. Note that this requires you to seed the random number generator using srand(). The second version of random_shuffle() requires a start and end iterator for the range to shuffle and also accepts a third parameter which is a random number generator object that can be specified to adapt the randomness to suit your problem domain, for example uniform distribution, binomial distribution, and so on. Random number generators are discussed in detail in Chapter 16.

Set Algorithms

The final class of algorithms in the STL is five functions for performing set operations that work on any sorted iterator range. They do not work on unordered associative containers.

The includes() function implements standard subset determination, checking if all the elements of one sorted range are included in another sorted range, in any order.

The set_union(), set_intersection(), set_difference(), and set_symmetric_difference() algorithms implement the standard semantics of those operations. In set theory, the result of union is all the elements in either set. The result of intersection is all the elements which are in both sets. The result of difference is all the elements in the first set but not the second. The result of symmetric difference is the “exclusive or” of sets: all the elements in one, but not both, sets.

cross.gif

Make sure that your result range is large enough to hold the result of the operations. For set_union() and set_symmetric_difference(), the result is at most the sum of the sizes of the two input ranges. For set_intersection() and set_difference() it’s at most the maximum of the two sizes.

cross.gif

You can’t use iterator ranges from associative containers, including sets, to store the results because they don’t allow changes to their keys.

Here is an example of how to use these algorithms:

image
// The populateContainer() function is identical to the one shown earlier
// for comparison algorithms, so it is omitted here.
vector<int> vec1, vec2, result;
cout << "Enter elements for set 1:" << endl;
populateContainer(vec1);
cout << "Enter elements for set 2:" << endl;
populateContainer(vec2);
// set algorithms work on sorted ranges
sort(vec1.begin(), vec1.end());
sort(vec2.begin(), vec2.end());
cout << "Set 1: ";
for_each(vec1.cbegin(), vec1.cend(), [](int i){cout << i << " ";});
cout << endl;
cout << "Set 2: ";
for_each(vec2.cbegin(), vec2.cend(), [](int i){cout << i << " ";});
cout << endl;
 
if (includes(vec1.cbegin(), vec1.cend(), vec2.cbegin(), vec2.cend())) {
    cout << "The second set is a subset of the first." << endl;
}
if (includes(vec2.cbegin(), vec2.cend(), vec1.cbegin(), vec1.cend())) {
    cout << "The first set is a subset of the second" << endl;
}
 
result.resize(vec1.size() + vec2.size());
auto newEnd = set_union(vec1.cbegin(), vec1.cend(), vec2.cbegin(),
    vec2.cend(), result.begin());
cout << "The union is: ";
for_each(result.begin(), newEnd, [](int i){cout << i << " ";});
cout << endl;
 
newEnd = set_intersection(vec1.cbegin(), vec1.cend(), vec2.cbegin(),
    vec2.cend(), result.begin());
cout << "The intersection is: ";
for_each(result.begin(), newEnd, [](int i){cout << i << " ";});
cout << endl;
 
newEnd = set_difference(vec1.cbegin(), vec1.cend(), vec2.cbegin(),
    vec2.cend(), result.begin());
cout << "The difference between set 1 and set 2 is: ";
for_each(result.begin(), newEnd, [](int i){cout << i << " ";});
cout << endl;
 
newEnd = set_symmetric_difference(vec1.cbegin(), vec1.cend(),
    vec2.cbegin(), vec2.cend(), result.begin());
cout << "The symmetric difference is: ";
for_each(result.begin(), newEnd, [](int i){cout << i << " ";});
cout << endl;

Code snippet from SetAlgorithmsSets.cpp

Here is a sample run of the program:

Enter elements for set 1:
Enter a number (0 to quit): 5
Enter a number (0 to quit): 6
Enter a number (0 to quit): 7
Enter a number (0 to quit): 8
Enter a number (0 to quit): 0
Enter elements for set 2:
Enter a number (0 to quit): 8
Enter a number (0 to quit): 9
Enter a number (0 to quit): 10
Enter a number (0 to quit): 0
Set 1: 5 6 7 8
Set 2: 8 9 10
The union is: 5 6 7 8 9 10
The intersection is: 8
The difference between set 1 and set 2 is: 5 6 7
The symmetric difference is: 5 6 7 9 10

ALGORITHMS EXAMPLE: AUDITING VOTER REGISTRATIONS

Voter fraud can be a problem everywhere. People sometimes attempt to register and vote in two or more different voting districts. Additionally, some people, for example convicted felons, are ineligible to vote, but occasionally attempt to register and vote anyway. Using your newfound algorithm skills, you could write a simple voter registration auditing function that checks the voter rolls for certain anomalies.

The Voter Registration Audit Problem Statement

The voter registration audit function should audit the voters’ information. Assume that voter registrations are stored by district in a map that maps district names to a list of voters. Your audit function should take this map and a list of convicted felons as parameters, and should remove all convicted felons from the lists of voters. Additionally, the function should find all voters who are registered in more than one district and should remove those names from all districts. Voters with duplicate registrations must have all their registrations removed, and therefore become ineligible to vote. For simplicity, assume that the list of voters is simply a list of string names. A real application would obviously require more data, such as address and party affiliation.

The auditVoterRolls Function

The auditVoterRolls() function works in three steps:

1. Find all the duplicate names in all the registration lists by making a call to getDuplicates().

2. Combine the set of duplicates and the list of convicted felons.

3. Remove from every voter list all the names found in the combined set of duplicates and convicted felons. The approach taken here is to use for_each() to process each list in the map, applying a lambda expression to remove the offending names from each list.

The following typedefs are used in the code:

image
typedef map<string, list<string>> VotersMap;
typedef pair<const string, list<string>> DistrictPair;

Code snippet from AuditVoterRollsAuditVoterRolls.cpp

Here’s the implementation of auditVoterRolls():

image
// Expects a map of string/list<string> pairs keyed on district names
// and containing lists of all the registered voters in those districts.
// Removes from each list any name on the convictedFelons list and
// any name that is found on any other list.
void auditVoterRolls(VotersMap& votersByDistrict,
    const list<string>& convictedFelons)
{
    // get all the duplicate names
    set<string> toRemove = getDuplicates(votersByDistrict);
 
    // combine the duplicates and convicted felons -- we want
    // to remove names on both lists from all voter rolls
    toRemove.insert(convictedFelons.cbegin(), convictedFelons.cend());
 
    // Now remove all the names we need to remove using
    // nested lambda expressions and the remove-erase-idiom
    for_each(votersByDistrict.begin(), votersByDistrict.end(),
        [&toRemove](DistrictPair& district) {
            auto it = remove_if(district.second.begin(),
                district.second.end(), [&](const string& name) {
                    return (toRemove.count(name) > 0);
                });
            district.second.erase(it, district.second.end());
        });
}

Code snippet from AuditVoterRollsAuditVoterRolls.cpp

The getDuplicates Function

The getDuplicates() function must find any name that is on more than one voter registration list. There are several different approaches one could use to solve this problem. To demonstrate the adjacent_find() algorithm, this implementation combines the lists from each district into one big list and sorts it. At that point, any duplicate names between the different lists will be next to each other in the big list. Now getDuplicates() can use the adjacent_find() algorithm on the big, sorted, list to find all consecutive duplicates and store them in a set called duplicates. Here is the implementation:

image
// getDuplicates()
//
// Returns a set of all names that appear in more than one list in
// the map.
set<string> getDuplicates(const VotersMap& votersByDistrict)
{
    // Collect all the names from all the lists into one big list
    list<string> allNames;
    for (auto& district : votersByDistrict) {
        allNames.insert(allNames.end(), district.second.begin(),
            district.second.end());
    }
 
    // sort the list -- use the list version, not the general algorithm,
    // because the list version is faster
    allNames.sort();
 
    // Now it's sorted, all duplicate names will be next to each other.
    // Use adjacent_find() to find instances of two or more identical names
    // next to each other.
    // Loop until adjacent_find() returns the end iterator.
    set<string> duplicates;
    for (auto lit = allNames.cbegin(); lit != allNames.cend(); ++lit) {
        lit = adjacent_find(lit, allNames.cend());
        if (lit == allNames.cend()) {
            break;
        }
        duplicates.insert(*lit);
    }
    return duplicates;
}

Code snippet from AuditVoterRollsAuditVoterRolls.cpp

In this implementation, allNames is of type list<string>. That way, this example can show you how to use the sort() and adjacent_find() algorithms. Another solution is to change the type of allNames to set<string>, which results in a more compact implementation, because a set doesn’t allow duplicates. This new solution loops over all lists and tries to insert each name into allNames. When this insert fails, it means that there is already an element with that name in allNames, so the name is added to duplicates.

image
set<string> getDuplicates(const VotersMap& votersByDistrict)
{
    set<string> allNames;
    set<string> duplicates;
    for (auto& district : votersByDistrict) {
        for (auto& name : district.second) {
            if (!allNames.insert(name).second)
                duplicates.insert(name);
        }
    }
    return duplicates;
}

Code snippet from AuditVoterRollsAuditVoterRolls.cpp

Testing the auditVoterRolls Function

That’s the complete implementation of the voter roll audit functionality. Here is a small test program:

image
// Initialize map using uniform initialization
VotersMap voters = {
    {"Orange", {"Amy Aardvark", "Bob Buffalo",
                "Charles Cat", "Dwayne Dog"}},
    {"Los Angeles", {"Elizabeth Elephant", "Fred Flamingo",
                     "Amy Aardvark"}},
    {"San Diego", {"George Goose", "Heidi Hen", "Fred Flamingo"}}
};
list<string> felons = {"Bob Buffalo", "Charles Cat"};
 
// Local lambda expression to print a district
auto printDistrict = [](const DistrictPair& district) {
    cout << district.first << ":";
    for (auto& str : district.second)
        cout << " {" << str << "}";
    cout << endl;
};
 
cout << "Before Audit:" << endl;
for_each(voters.cbegin(), voters.cend(), printDistrict);
cout << endl;
 
auditVoterRolls(voters, felons);
 
cout << "After Audit:" << endl;
for_each(voters.cbegin(), voters.cend(), printDistrict);
cout << endl;

Code snippet from AuditVoterRollsAuditVoterRolls.cpp

The output of the program is:

Before Audit:

Los Angeles: {Elizabeth Elephant} {Fred Flamingo} {Amy Aardvark}
Orange: {Amy Aardvark} {Bob Buffalo} {Charles Cat} {Dwayne Dog}
San Diego: {George Goose} {Heidi Hen} {Fred Flamingo}
 

After Audit:

Los Angeles: {Elizabeth Elephant}
Orange: {Dwayne Dog}
San Diego: {George Goose} {Heidi Hen}

SUMMARY

This chapter concludes the basic STL functionality. It provided an overview of the various algorithms and function objects available for your use. It also showed you how to use the new C++11 lambda expressions, which make it often easier to understand what your code is doing. We hope that you have gained an appreciation for the usefulness of the STL containers, algorithms, and function objects. If not, think for a moment about rewriting the voter registration audit example without the STL. You would need to write your own linked-list and map classes, and your own searching, removing, iterating, and other algorithms. The program would be much longer, harder to debug, and more difficult to maintain.

If you aren’t impressed by the algorithms and function objects, or find them too complex, you obviously don’t have to use them. Feel free to pick and choose as well: If the find() algorithm fits perfectly in your program, don’t eschew it just because you aren’t using the other algorithms. Also, don’t take the STL as an all-or-nothing proposition. If you want to use only the vector container and nothing else, that’s fine too.

The next chapters discuss a couple of other aspects of the C++ Standard Library. Chapter 14 discusses strings in C++, while Chapter 15 explains the concept of streams for input and output (I/O). Chapter 16 covers a number of additional library utilities that are available for you to use, and Chapter 17 finishes the STL topic with some advanced features, including allocators, iterator adapters, and writing your own algorithms, containers, and iterators.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset