Chapter 12

Understanding Containers and Iterators

WHAT’S IN THIS CHAPTER?

  • What iterators are
  • What the different container classes are and how to use them

Chapter 11 introduced the STL, described its basic philosophy, and provided an overview of the various containers and algorithms. You should be familiar with Chapter 11 before you tackle Chapter 12.

This chapter begins a tour of the STL by covering the STL containers, including the following:

  • Containers Overview: requirements on elements, general error handling, and iterators
  • Sequential Containers: vector, deque, list, array, and forward_list
  • Container Adapters: queue, priority_queue, and stack
  • Associative Containers: the pair utility, map, multimap, set, and multiset
  • Unordered Associative Containers/Hash Tables: unordered_map, unordered_multimap, unordered_set, and unordered_multiset
  • Other Containers: standard C-style arrays, strings, streams, and bitset

A detailed list of available classes and methods can be found in the Standard Library Reference resource on the website.

The next chapters will go deeper in on topics like algorithms, strings, regular expressions, I/O and how you can customize and extend the STL.

CONTAINERS OVERVIEW

Containers in the STL are generic data structures useful for storing collections of data. You should rarely need to use a standard C-style array, write a linked list, or design a stack when you use the STL. The containers are implemented as templates, which allow you to instantiate them for any type that meets certain basic conditions outlined below. Most of the STL containers, except for the std::array and std::bitset, are flexible in size and will automatically grow or shrink to accommodate more or fewer elements. This is a huge benefit compared to the old standard C-style arrays, which had a fixed size. Because of the fixed-size nature of standard C-style arrays, they are more vulnerable to overruns, which in the simplest cases merely cause the program to crash because data has been corrupted, but in the worst cases allow certain kinds of security attacks. By using STL containers your programs will be less vulnerable to these kinds of problems.

The STL provides 17 containers, divided into five categories.

  • Sequential containers:
    • vector (dynamic array)
    • list
    • deque
    • array
    • forward_list
  • Associative containers:
    • map
    • multimap
    • set
    • multiset
  • Unordered associative containers or hash tables:
    • unordered_map
    • unordered_multimap
    • unordered_set
    • unordered_multiset
  • Container adapters:
    • queue
    • priority_queue
    • stack
  • bitset

Additionally, C++ strings, and streams can also be used as STL containers to a certain degree.

Everything in the STL is in the std namespace. The examples in this book usually use the blanket using namespace std; statement in source files (never use this in header files!), but you can be more selective in your own programs about which symbols from std to use.

Requirements on Elements

STL containers use value semantics on elements. That is, they store a copy of the element that they are given, and return copies of elements when requested. They also assign to elements with the assignment operator and destroy elements with the destructor. Thus, when you write classes that you intend to use with the STL, make sure that it’s okay to have multiple copies of an object in the program at the same time.

If you prefer reference semantics, you must implement them yourself by storing pointers to elements instead of the elements themselves. When the containers copy a pointer, the result still refers to the same element.

cross.gif

If you store pointers in containers, we recommend using reference-counted smart pointers in order to handle the memory management properly. If you use C++11, you can use the new shared_ptr reference counted smart pointer. It’s more difficult if your compiler does not yet support shared_ptr. You cannot use the C++ auto_ptr class in containers because it does not implement copying correctly (as far as the STL is concerned). See Chapter 21 for a SuperSmartPointer class that you can use in the STL containers without C++11 support.

The specific requirements on elements in containers are shown in the following table:

METHOD DESCRIPTION NOTES
Copy Constructor Creates a new element that is “equal” to the old one, but that can safely be destructed without affecting the old one Used every time you insert an element
imageMove Constructor Creates a new element by moving all content from a source element to the new element Used when the source element will be destroyed after the construction of the new element
Assignment Operator Replaces the contents of an element with a copy of the source element Used every time you modify an element
imageMove Assignment Operator Replaces the contents of an element by moving all content from a source element Used when the source element will be destroyed after the assignment operation
Destructor Cleans up an element Used every time you remove an element
Default Constructor Constructs an element without any arguments Required only for certain operations, such as the vector resize() method and the map operator[] access
operator== Compares two elements for equality Required only for certain operations, such as operator== on two containers
operator< Determines if one element is less than another Required for keys in associative containers and for certain operations, such as operator< on two containers

Chapter 7 shows you how to write these methods. C++11 move semantics is discussed in Chapter 9.

cross.gif

The STL containers call the copy constructor and assignment operator for elements often, so make those operations efficient. With C++11 it can be made much more efficient by implementing move semantics for your elements, as described in Chapter 9.

Exceptions and Error Checking

The STL containers provide limited error checking. Clients are expected to ensure that their uses are valid. However, some container methods and functions throw exceptions in certain conditions such as out-of-bounds indexing. This chapter mentions exceptions where appropriate. The Standard Library Reference resource on the website attempts to catalog the possible exceptions thrown from each method. However, it is impossible to list exhaustively the exceptions that can be thrown from these methods because they perform operations on user-specified types with unknown exception characteristics.

Iterators

The STL uses the iterator pattern to provide a generic abstraction for accessing the elements of the containers. Each container provides a container-specific iterator, which is a glorified smart pointer that knows how to iterate over the elements of that specific container. The iterators for all the different containers adhere to a specific interface defined in the C++ standard. Thus, even though the containers provide different functionality, the iterators present a common interface to code that wishes to work with elements of the containers.

You can think of an iterator as a pointer to a specific element of the container. Like pointers to elements in an array, iterators can move to the next element with operator++. Similarly, you can usually use operator* and operator-> on the iterator to access the actual element or field of the element. Some iterators allow comparison with operator== and operator!=, and support operator-- for moving to previous elements. Different containers provide iterators with slightly different capabilities. The standard defines five categories of iterators, summarized in the following table.

ITERATOR CATEGORY OPERATIONS REQUIRED COMMENTS
Read (officially called “input” iterator) operator++
operator*
operator->
copy constructor
operator=
operator==
operator!=
Provides read-only access, forward-only (no operator-- to move backward).
Iterators can be assigned and copied with assignment operator and copy constructor.
Iterators can be compared for equality.
Write (officially called “output” iterator) operator++
operator*
copy constructor
Provides write-only access, forward only.
Iterators cannot be assigned.
Iterators cannot be compared for equality.
Note the absence of operator->.
Forward operator++
operator*
operator->
copy constructor
default constructor
operator=
operator==
operator!=
Provides read/write access, forward only.
Iterators can be assigned and copied with assignment operator and copy constructor.
Iterators can be compared for equality.
Bidirectional Capabilities of forward iterators, plus:
operator--
Provides everything forward iterator provides.
Iterators can also move backward to previous element.
Random Access Bidirectional capability, plus:
operator+
operator-
operator+=
operator-=
operator<
operator>
operator<=
operator>=
operator[]
Equivalent to dumb pointers: Iterators support pointer arithmetic, array index syntax, and all forms of comparison.

The standard containers that provide iterators all furnish either random access or bidirectional iterators.

Iterators are implemented similarly to smart pointer classes in that they overload the specific desired operators. Consult Chapter 18 for details on operator overloading. See Chapter 17 for a sample iterator implementation.

The basic iterator operations are similar to those supported by dumb pointers, so a dumb pointer is a legitimate iterator for certain containers. In fact, the vector iterator is often implemented as simply a dumb pointer. However, as a client of the containers, you need not worry about the implementation details; you can simply use the iterator abstraction.

pen.gif

Iterators might not be implemented internally as pointers, so this text uses the term “refers to” instead of “points to” when discussing the elements accessible via an iterator.

Chapters 13 and 17 delve into more detail about iterators and the STL algorithms that use them. This chapter shows you the basics of using the iterators for each container.

pen.gif

Only the sequential containers, associative containers, and unordered associative containers provide iterators. The container adapters and bitset class do not support iteration over their elements.

Common Iterator typedefs and Methods

Every container class in the STL that supports iterators provides public typedefs for its iterator types called iterator and const_iterator. Containers that allow you to iterate over its elements in reverse order also provide public typedefs called reverse_iterator and const_reverse_iterator. This way, clients can use the container iterators without worrying about the actual types.

pen.gif

const_iterators and const_reverse_iterators provide read-only access to elements of the container.

The containers also provide a method begin() that returns an iterator referring to the first element in the container. The end() method returns a reference to the “past-the-end” value of the sequence of elements. That is, end() returns an iterator that is equal to the result of applying operator++ to an iterator referring to the last element in the sequence. Together begin() and end() provide a half-open range that includes the first element but not the last. The reason for this apparent complication is to support empty ranges (containers without any elements), in which case begin() is equal to end(). The half-open range bounded by iterators begin() and end() is often written mathematically like this: [begin,end).

Similarly, there are rbegin() and rend() methods for working with reverse iterators.

pen.gif

The half-open range concept also applies to iterator ranges that are passed to container methods such as insert() and erase(). See the specific container descriptions later in this chapter for details.

imageC++11 Changes

C++11 introduces several changes to the STL containers. One change is the introduction of the unordered associative containers, also called hash tables, which are discussed later in this chapter.

Other changes are performance related. All STL containers now implement move semantics by including a move constructor and move assignment operator. These use rvalue references as described in Chapter 9. A big benefit of this is that we can easily return an STL container from a function by value without performance degradation. Take a look at the following function:

image
vector<int> createVectorOfSize(size_t size)
{
    vector<int> vec(size);
    int contents = 0;
    for (auto& i : vec)
        i = contents++;
    return vec;
}

Code snippet from CreateVectorOfSizeCreateVectorOfSize.cpp

Without move semantics, the preceding function will create a local vector called vec. The return statement will then make a copy of vec and return it from the function. With the C++11 move semantics support in the STL containers, this copying of the vector is avoided. Instead, the return statement will move the vector. Moving is possible in this case because vec will go out of scope.

Similarly, push operations can also make use of move semantics to improve performance in certain situations. For example, suppose you have a vector of elements of type Element as follows:

image
class Element
{
    public:
        Element(int i, string str) : mI(i), mStr(str) {}
    protected:
        int mI;
        string mStr;
};
int main()
{
    vector<Element> vec;
    return 0;
}

Code snippet from MovePushBackMovePushBack.cpp

Adding an element to this vector can be done as follows:

image
Element myElement(12, "Twelve");
vec.push_back(myElement);

Code snippet from MovePushBackMovePushBack.cpp

However, since myElement is not a temporary object, the push_back() call will make a copy of myElement and put it in the vector. This copying can be avoided if you call the push_back() method as follows:

image
vec.push_back(Element(12, "Twelve"));

Code snippet from MovePushBackMovePushBack.cpp

The vector class defines a push_back(T&& val) which is the move equivalent of push_back(const T& val). The preceding call to vec.push_back() will trigger a call to the move version because the call to the Element constructor results in a temporary object. The push_back() method will move this temporary Element object into the vector, avoiding any copying.

Using C++11 uniform initialization (Chapter 9), the preceding can also be written as follows:

image
vec.push_back({12, "Twelve"});

Code snippet from MovePushBackMovePushBack.cpp

In addition, C++11 adds support for emplace operations on most STL containers. Emplace means “to put into place.” An example is emplace_back() on a vector object, which does not copy or move anything. Instead, it makes space in the container and constructs the object in place. Following is an example.

vec.emplace_back(12, "Twelve");

The emplace methods take a variable number of arguments as a variadic template. Variadic templates are discussed in Chapter 20. The difference in performance between emplace_back() and push_back() using move semantics depends on how your specific compiler implements these operations. In most situations you can pick the one based on the syntax that you prefer.

vec.push_back({12, "Twelve"});
// Or
vec.emplace_back(12, "Twelve");

SEQUENTIAL CONTAINERS

The vector, deque, list, array, and forward_list are called the sequential containers. The best way to learn about the sequential containers is to jump in with an example of the vector, which is the container most commonly used. The next section describes the vector in detail as an example of a sequential container, followed by briefer descriptions of the deque, list, array, and forward_list. Once you become familiar with the sequential containers, it’s trivial to switch between them.

vector

The STL vector is similar to a standard C-style array: The elements are stored in contiguous memory, each in its own “slot.” You can index into the vector, as well as add new elements to the back or insert them anywhere else. Inserting and deleting elements into and from the vector generally takes linear time, though these operations actually run in amortized constant time at the end of the vector, explained in the section “The vector Memory Allocation Scheme” later in this chapter. Random access of individual elements has a constant complexity.

vector Overview

The vector is defined in the <vector> header file as a class template with two type parameters: the element type to store and an allocator type.

template <class T, class Allocator = allocator<T> > class vector;

The Allocator parameter specifies the type for a memory allocator object that the client can set in order to use custom memory allocation. This template parameter has a default value.

pen.gif

The default value for the Allocator type parameter is sufficient for most applications. Programmers do not usually find it useful to customize allocators, but Chapter 17 provides more details in case you are interested. This chapter assumes that you always use the default allocator.

Fixed-Length vectors

The simplest way to use a vector is as a fixed-length array. The vector provides a constructor that allows you to specify the number of elements, and provides an overloaded operator[] in order to access and modify those elements. The C++ standard states that the result of operator[] is undefined when used to access an element outside the vector bounds. This means that any compiler can decide how to behave in that case. For example, Microsoft Visual C++ will give a run time error message when your program is compiled in debug mode. In release mode, those checks are disabled for performance reasons. Note that the standard does provide an at() method which will perform bounds checking and is discussed later in this chapter.

cross.gif

Like “real” array indexing, the operator[] on a vector does not provide bounds checking.

For example, here is a small program to “normalize” test scores so that the highest score is set to 100, and all other scores are adjusted accordingly. The program creates a vector of 10 doubles, initializes all elements to 0.0, reads in 10 values from the user, divides each value by the max score (times 100), and prints out the new values. For the sake of brevity, the program forsakes error checking.

image
vector<double> doubleVector(10); // Create a vector of 10 doubles.
// Initialize max to smallest number
double max = numeric_limits<double>::lowest();
for (size_t i = 0; i < 10; i++) {
    doubleVector[i] = 0.0;
}
for (size_t i = 0; i < 10; i++) {
    cout << "Enter score " << i + 1 << ": ";
    cin >> doubleVector[i];
    if (doubleVector[i] > max) {
        max = doubleVector[i];
    }
}
max /= 100.0;
for (size_t i = 0; i < 10; i++) {
    doubleVector[i] /= max;
    cout << doubleVector[i] << " ";
}

Code snippet from TestScoresTestScores.cpp

As you can see from this example, you can use a vector just as you would use a standard C-style array.

pen.gif

The operator[] on a vector normally returns a reference to the element, which can be used on the left-hand side of assignment statements. If operator[] is called on a const vector object, it returns a reference to a const element, which cannot be used as the target of an assignment. See Chapter 18 for details on how this trick is implemented.

Specifying an Initial Element Value

You can specify an initial value for the elements when you create the vector like this:

image
vector<double> doubleVector(10, 0.0); // 10 doubles of value 0.0
// Initialize max to smallest number
double max = numeric_limits<double>::lowest();
// No need to initialize each element: the constructor did it for you.
for (size_t i = 0; i < 10; i++) {
    cout << "Enter score " << i + 1 << ": ";
    cin >> doubleVector[i];
    if (doubleVector[i] > max) {
        max = doubleVector[i];
    }
}
max /= 100.0;
for (size_t i = 0; i < 10; i++) {
    doubleVector[i] /= max;
    cout << doubleVector[i] << " ";
}

Code snippet from TestScoresTestScoresInitialElem.cpp

Other vector Element Access Methods

In addition to using operator[], you can access vector elements via at(), front(), and back(). The at() method is identical to operator[], except that it performs bounds checking, and throws an out_of_range exception if the index is out of bounds. front() and back() return the references to the first and last elements of the vector, respectively.

pen.gif

All vector element accesses run in constant complexity.

Dynamic-Length vectors

The real power of the vector lies in its ability to grow dynamically. For example, consider the test score normalization program from the previous section with the additional requirement that it should handle any number of test scores. Here is the new version:

image
vector<double> doubleVector; // Create a vector with zero elements.
// Initialize max to smallest number
double max = numeric_limits<double>::lowest();
for (size_t i = 0; true; i++) {
    double temp;
    cout << "Enter score " << i + 1 << " (-1 to stop): ";
    cin >> temp;
    if (temp == -1) {
        break;
    }
    doubleVector.push_back(temp);
    if (temp > max) {
        max = temp;
    }
}
max /= 100.0;
for (size_t i = 0; i < doubleVector.size(); i++) { 
    doubleVector[i] /= max;
    cout << doubleVector[i] << " ";
}

Code snippet from TestScoresTestScoresDynamic.cpp

This version of the program uses the default constructor to create a vector with zero elements. As each score is read, it’s added to the vector with the push_back() method, which takes care of allocating space for the new element. Note that the last for loop uses the size() method on the vector to determine the number of elements in the container. size() returns an unsigned integer, so the type of i size_t.

vector Details

Now that you’ve had a taste of vectors, it’s time to delve into their details.

Constructors and Destructors

The default constructor creates a vector with 0 elements.

image
vector<int> intVector; // Creates a vector of ints with zero elements

Code snippet from VectorCtorsDefaultCtor.cpp

As you’ve already seen, you can specify a number of elements and, optionally, a value for those elements, like this:

image
vector<int> intVector(10, 100); // Creates vector of 10 ints with value 100

Code snippet from VectorCtorsInitialElements.cpp

If you omit the default value, the new objects are zero-initialized. Zero-initialization constructs objects with the default constructor and initializes primitive integer types (such as char, int, etc.) to 0 and primitive floating point types to 0.0.

You can also create vectors of built-in classes like this:

image
vector<string> stringVector(10, "hello");

Code snippet from VectorCtorsBuiltInClasses.cpp

User-defined classes can also be used as vector elements:

image
class Element
{
    public:
        Element() {}
        virtual ~Element() {}
};
int main()
{
    vector<Element> elementVector;
    return 0;
}

Code snippet from VectorCtorsUserDefinedClasses.cpp

C++11 adds a new constructor to the vector class that accepts an initializer_list that contains the initial elements for the vector. It can be used as follows:

image
vector<int> intVector({1,2,3,4,5,6});

Code snippet from VectorCtorsintializer_list.cpp

initializer_lists can also be used for so-called uniform initialization as discussed in Chapter 9. Uniform initialization works on most STL containers. For example, the following code demonstrates this for a vector:

image
vector<int> intVector ({1,2,3,4,5,6});

Code snippet from VectorCtorsUniformInitialization.cpp

The vector stores copies of the objects, and its destructor calls the destructor for each of the objects.

You can allocate vectors on the heap as well:

image
vector<Element>* elementVector = new vector<Element>(10);
delete elementVector;

Code snippet from VectorCtorsHeapVectors.cpp

Remember to call delete when you are finished with a vector that you allocated with new or better yet, use a smart pointer to automatically deallocate the vector as follows:

image
shared_ptr<vector<Element>> elementVector(new vector<Element>(10));

Code snippet from VectorCtorsHeapVectorsSmartPointer.cpp

cross.gif

Use delete, not delete[], to free vectors, because a vector is a basic type and not an array type.

Copying and Assigning vectors

The copy constructor and assignment operator for the vector class perform deep copies of all the elements in the vector. Thus, for efficiency, you should pass vectors by reference or const reference to functions and methods. Consult Chapter 19 for the details on writing functions that take template instantiations as parameters.

In addition to normal copying and assignment, vectors provide an assign() method that removes all the current elements and adds any number of new elements. This method is useful if you want to reuse a vector. Here is a trivial example. intVector is created with 10 elements with value 0. Then assign() is used to remove all 10 elements and replace them with 5 elements with value 100.

image
vector<int> intVector(10, 0);
// Other code . . .
intVector.assign(5, 100);

Code snippet from VectorCopyAssigndemo.cpp

With C++11, assign() can also accept an initializer_list as follows. intVector will now have 4 elements with the given values.

image
intVector.assign({1, 2, 3, 4});

Code snippet from VectorCopyAssigndemo.cpp

vectors also provide a swap() method that allows you to swap the contents of two vectors. Here is a simple example:

image
vector<int> vectorOne(10, 0);
vector<int> vectorTwo(5, 100);
vectorOne.swap(vectorTwo);
// vectorOne now has 5 elements with the value 100.
// vectorTwo now has 10 elements with the value 0.

Code snippet from VectorCopyAssigndemo.cpp

Comparing vectors

The STL provides the usual six overloaded comparison operators for vectors: ==, !=, <, >, <=, >=. Two vectors are equal if they have the same number of elements and all the corresponding elements in the two vectors are equal to each other. One vector is “less than” another if all elements 0 through i-1 in the first vector are equal to 0 through i-1 in the second vector, but element i in the first is less than element i in the second, where i must be in the range 0...n and n must be <= size()-1.

pen.gif

Comparing two vectors with operator== or operator!= requires the individual elements to be comparable with operator==. Comparing two vectors with operator<, operator>, operator<=, or operator>= requires the individual elements to be comparable with operator<. If you intend to store objects of a custom class in a vector, make sure to write those operators.

Here is an example of a simple program that compares vectors of ints:

image
vector<int> vectorOne(10, 0);
vector<int> vectorTwo(10, 0);
if (vectorOne == vectorTwo) {
    cout << "equal!" << endl;
} else {
    cout << "not equal!" << endl;
}
vectorOne[3] = 50;
if (vectorOne < vectorTwo) {
    cout << "vectorOne is less than vectorTwo" << endl;
} else {
    cout << "vectorOne is not less than vectorTwo" << endl;
}

Code snippet from VectorComparecompare.cpp

The output of the program is:

equal!
vectorOne is not less than vectorTwo 

vector Iterators

The section on “Iterators” at the beginning of this chapter explained the basics of container iterators. The discussion can get a bit abstract, so it’s helpful to jump in and look at a code example. Here is the test score normalization program from earlier with the for loop previously using size() replaced by a for loop using an iterator:

image
vector<double> doubleVector;
// Initialize max to smallest number
double max = numeric_limits<double>::lowest();
for (size_t i = 0; true; i++) {
    double temp;
    cout << "Enter score " << i + 1 << " (-1 to stop): ";
    cin >> temp;
    if (temp == -1) {
        break;
    }
    doubleVector.push_back(temp);
    if (temp > max) {
        max = temp;
    }
}
max /= 100.0;
for (vector<double>::iterator iter = doubleVector.begin();
    iter != doubleVector.end(); ++iter) {
    *iter /= max;
    cout << *iter << " ";
}

Code snippet from TestScoresTestScoresIterator.cpp

You see for loops like the new one in this example quite a bit in STL code. First, take a look at the for loop initialization statement:

vector<double>::iterator iter = doubleVector.begin();

Recall that every container defines a type named iterator to represent iterators for that type of container. begin() returns an iterator of that type referring to the first element in the container. Thus, the initialization statement obtains in the variable iter an iterator referring to the first element of doubleVector. Next, look at the for loop comparison:

iter != doubleVector.end();

This statement simply checks if the iterator is past the end of the sequence of elements in the vector. When it reaches that point, the loop terminates. The increment statement, ++iter, increments the iterator to refer to the next element in the vector.

pen.gif

Use preincrement instead of postincrement when possible because preincrement is at least as efficient, and usually more efficient. iter++ must return a new iterator object, while ++iter can simply return a reference to iter. See Chapter 18 for details on implementing operator++, and Chapter 17 for details on writing your own iterators.

The for loop body contains these two lines:

*iter /= max;
cout << *iter << " ";

As you can see, your code can both access and modify the elements over which it iterates. The first line uses * to dereference iter to obtain the element to which it refers, and assigns to that element. The second line dereferences iter again, but this time only to stream the element to cout.

With C++11, the syntax of the preceding for loop using iterators can be simplified by using the auto keyword introduced in Chapter 1. This is shown in the following code fragment:

image
for (auto iter = doubleVector.begin();
    iter != doubleVector.end(); ++iter) {
    *iter /= max;
    cout << *iter << " ";
}

Code snippet from TestScoresTestScoresIterator.cpp

In this example, the compiler will automatically derive the type of the variable iter based on the right-hand side of the initializer, which in this case is the result of the call to begin().

Using the range-based for loop of C++11, the loop can be simplified even further. The range-based for loop is introduced in Chapter 1. The following code does exactly the same as the previous implementations of the loop:

image
for (auto& d : doubleVector) {
    d /= max;
    cout << d << " ";
}

Code snippet from TestScoresTestScoresIterator.cpp

This looks much more elegant than the other versions of the loop.

Accessing Fields of Object Elements

If the elements of your container are objects, you can use the -> operator on iterators to call methods or access members of those objects. For example, the following small program creates a vector of 10 strings, then iterates over all of them appending a new string to the old one:

image
vector<string> stringVector(10, "hello");
for (auto it = stringVector.begin(); it != stringVector.end(); ++it) {
    it->append(" there");
}

Code snippet from VectorIteratorsAccessingFields.cpp

Or, using the range-based for loop, it can be written as follows:

image
vector<string> stringVector(10, "hello");
for (auto& str : stringVector) {
    str.append(" there");
}

Code snippet from VectorIteratorsAccessingFields.cpp

const_iterator

The normal iterator is read/write. However, if you call begin() and end() on a const object, you receive a const_iterator. The const_iterator is read-only; you cannot modify the elements. An iterator can always be converted to a const_iterator, so it’s always safe to write something like this:

vector<type>::const_iterator it = myVector.begin();

However, a const_iterator cannot be converted to an iterator. If myVector is const, the following line doesn’t compile:

vector<type>::iterator it = myVector.begin();
pen.gif

If you do not need to modify the elements of a vector, you should use a const_iterator. This rule will make it easier to guarantee correctness of your code and allows compilers to perform certain optimizations.

When using the auto keyword of C++11, using const_iterators looks a bit different. Suppose you write the following code:

vector<string> stringVector(10, "hello");
for (auto iter = stringVector.begin();
    iter != stringVector.end(); ++iter) {
    cout << *iter << endl;
}

Because of the auto keyword, the compiler will decide the type of the iter variable automatically and will make it a normal iterator, meaning that you can read and write to the iterator. If you want a read-only const_iterator in combination with using auto, then you need to use cbegin() and cend() instead of the normal begin() and end() as follows:

image
vector<string> stringVector(10, "hello");
for (auto iter = stringVector.cbegin();
    iter != stringVector.cend(); ++iter) {
    cout << *iter << endl;
}

Code snippet from VectorIteratorsConstIterator.cpp

Now the compiler will use the const_iterator as type for the variable iter because that’s what cbegin() returns.

Iterator Safety

Generally, iterators are about as safe as pointers: extremely insecure. For example, you can write code like this:

image
vector<int> intVector;
auto it = intVector.end();
*it = 10; // BUG! it doesn't refer to a valid element.

Code snippet from VectorIteratorsIteratorSafety.cpp

Recall that the iterator returned by end() is past the end of the vector. Trying to dereference it results in undefined behavior. However, the iterators themselves are not required to perform any verification.

cross.gif

Remember that end() returns an iterator past the end of the container, not the iterator referring to the last element of the container.

Another problem can occur if you use mismatched iterators. For example, the following code initializes an iterator from vectorTwo and tries to compare it to the end iterator for vectorOne. Needless to say, this loop will not do what you intended, and may never terminate. Dereferencing the iterator in the loop will likely produce undefined results.

image
vector<int> vectorOne(10);
vector<int> vectorTwo(10);
// Fill in the vectors.
// BUG! Infinite loop
for (auto it = vectorTwo.begin(); it != vectorOne.end(); ++it) {
    // Loop body
}

Code snippet from VectorIteratorsIteratorSafety.cpp

pen.gif

Some C++ runtimes, for example Microsoft Visual C++, will give an assertion error at run time for both of the preceding problems when running a debug build of your program.

Other Iterator Operations

The vector iterator is random access, which means that you can move it backward or forward, or jump around. For example, the following code eventually changes the fifth element (index 4) in the vector to the value 4:

image
vector<int> intVector(10, 0);
auto it = intVector.begin();
it += 5;
--it;
*it = 4;

Code snippet from VectorIteratorsIteratorOps.cpp

Iterators versus Indexing

Given that you can write a for loop that uses a simple index variable and the size() method to iterate over the elements of the vector, why should you bother using iterators? That’s a valid question, for which there are three main answers:

  • Iterators allow you to insert and delete elements and sequences of elements at any point in the container. See the following “Adding and Removing Elements” section.
  • Iterators allow you to use the STL algorithms, which are discussed in Chapter 13.
  • Using an iterator to access each element sequentially is often more efficient than indexing the container to retrieve each element individually. This generalization is not true for vectors, but applies to lists, maps, and sets.

Adding and Removing Elements

As you have already read, you can append an element to a vector with the push_back() method. The vector provides a parallel remove method called pop_back().

cross.gif

pop_back() does not return the element that it removed. If you want the element you must first retrieve it with back().

You can also insert elements at any point in the vector with the insert() method, which adds one or more elements to a position specified by an iterator, shifting all subsequent elements down to make room for the new ones. There are three different overloaded forms of insert(): one that inserts a single element, one that inserts n copies of a single element, and one that inserts elements from an iterator range. Recall that the iterator range is half-open, such that it includes the element referred to by the starting iterator but not the one referred to by the ending iterator. C++11 adds two more insert() overloads: one that inserts a single element by moving the given element to the vector using move semantics, and another one that inserts a list of elements into the vector where the list of elements is given as an initializer_list. The initializer_list feature is discussed in Chapter 9.

pen.gif

push_back() and insert() take const references to elements, allocate memory as needed to store the new elements, and store copies of the element arguments. C++11 introduces both a push_back() and an insert() method that use move semantics to move ownership of the object to the vector instead of copying the object.

You can remove elements from any point in the vector with erase() and you can remove all elements with clear(). There are two forms of erase(): single element and range specified by an iterator.

If you want to remove a number of elements that satisfy a certain condition, one solution would be to write a loop iterating over all the elements and erasing every element that matches the condition. However, this solution has quadratic complexity, discussed in Chapter 2, which is very bad for performance. In this case, the quadratic complexity can be avoided by using the remove-erase-idiom, which has a linear complexity. The remove-erase-idiom is discussed in Chapter 13.

Here is a small program that demonstrates the methods for adding and removing elements. It uses a helper function printVector() that prints the contents of the vector to cout, but whose implementation is not shown here because it uses algorithms covered in the next chapters. The example also includes demonstrations of the following versions of insert():

  • insert(const_iterator pos, const T& x); the value x will be inserted at position pos.
  • insert(const_iterator pos, size_type n, const T& x); the value x will be inserted n times at position pos.
  • insert(const_iterator pos, InputIterator first, InputIterator last); the elements in the range first, last are inserted at position pos.
image
vector<int> vectorOne = {1,2,3,5};
vector<int> vectorTwo;
// Oops, we forgot to add 4. Insert it in the correct place.
vectorOne.insert(vectorOne.begin() + 3, 4);
// Add elements 6 through 10 to vectorTwo.
for (int i = 6; i <= 10; i++) {
    vectorTwo.push_back(i);
}
printVector(vectorOne);
printVector(vectorTwo);
// Add all the elements from vectorTwo to the end of vectorOne.
vectorOne.insert(vectorOne.end(), vectorTwo.begin(), vectorTwo.end());
printVector(vectorOne);
// Now erase the numbers 2 through 5 in vectorOne.
vectorOne.erase(vectorOne.begin() + 1, vectorOne.begin() + 5);
printVector(vectorOne);
// Clear vectorTwo entirely.
vectorTwo.clear();
// And add 10 copies of the value 100.
vectorTwo.insert(vectorTwo.begin(), 10, 100);
// Decide we only want 9 elements.
vectorTwo.pop_back();
printVector(vectorTwo);

Code snippet from VectorAddRemoveAddRemove.cpp

The output of the program is:

1 2 3 4 5
6 7 8 9 10
1 2 3 4 5 6 7 8 9 10
1 6 7 8 9 10
100 100 100 100 100 100 100 100 100

Recall that iterator pairs represent a half-open range, and insert() adds elements before the element referred to by the iterator position. Thus, you can insert the entire contents of vectorTwo into the end of vectorOne, like this:

vectorOne.insert(vectorOne.end(), vectorTwo.begin(), vectorTwo.end());
cross.gif

Methods such as insert() and erase() that take a vector range as arguments assume that the beginning and ending iterators refer to elements in the same container, and that the end iterator refers to an element at or past the begin iterator. The methods will not work correctly if these preconditions are not met!

Algorithmic Complexity and Iterator Invalidation

Inserting or erasing elements in a vector causes all subsequent elements to shift up or down to make room for, or fill in the holes left by, the affected elements. Thus, these operations take linear complexity. Furthermore, all iterators referring to the insertion or removal point or subsequent positions are invalid following the action. The iterators are not “magically” moved to keep up with the elements that are shifted up or down in the vector.

Also keep in mind that an internal vector reallocation can cause invalidation of all iterators referring to elements in the vector, not just those referring to elements past the point of insertion or deletion. See the next section for details.

The vector Memory Allocation Scheme

The vector allocates memory automatically to store the elements that you insert. Recall that the vector requirements dictate that the elements must be in contiguous memory, like in standard C-style arrays. Because it’s impossible to request to add memory to the end of a current chunk of memory, every time the vector allocates more memory it must allocate a new, larger, chunk in a separate memory location and copy/move all the elements to the new chunk. This process is time-consuming, so vector implementations attempt to avoid it by allocating more space than needed when they have to perform a reallocation. That way, they can avoid reallocating memory every time you insert an element.

One obvious question at this point is why you, as a client of the vector, care how it manages its memory internally. You might think that the principle of abstraction should allow you to disregard the internals of the vector memory allocation scheme. Unfortunately, there are two reasons why you need to understand how it works:

1. Efficiency. The vector allocation scheme can guarantee that an element insert runs in amortized constant time: Most of the time the operation is constant, but once in a while (if it requires a reallocation), it’s linear. If you are worried about efficiency you can control when the vector performs reallocations.

2. Iterator invalidations. A reallocation invalidates all iterators referring to elements in the vector.

Thus, the vector interface allows you to query and control the vector reallocations. If you don’t control the reallocations explicitly, you should assume that all insertions cause a reallocation and thus invalidate all iterators.

Size and Capacity

The vector provides two methods for obtaining information about its size: size() and capacity(). The size() method returns the number of elements in the vector, while capacity() returns the number of elements that it can hold without a reallocation. Thus, the number of elements that you can insert without causing a reallocation is capacity() - size().

pen.gif

You can query whether a vector is empty with the empty() method. A vector can be empty but have nonzero capacity.

Reserving Capacity

If you don’t care about efficiency or iterator invalidations, there is never a need to control the vector memory allocation explicitly. However, if you want to make your program as efficient as possible, or want to guarantee that iterators will not be invalidated, you can force the vector to preallocate enough space to hold all of its elements. Of course, you need to know how many elements it will hold, which is sometimes impossible to predict.

One way to preallocate space is to call reserve(). That method allocates enough memory to hold the specified number of elements. The next section shows an example of the reserve() method in action.

cross.gif

Reserving space for elements changes the capacity, but not the size. That is, it doesn’t actually create elements. Don’t access elements past the vector size.

Another way to preallocate space is to specify, in the constructor or with the resize() method, how many elements you want the vector to store. This method actually creates a vector of that size (and probably of that capacity).

vector Example: A Round-Robin Class

A common problem in computer science is distributing requests among a finite list of resources. For example, a simple operating system could keep a list of processes and assign a time slice (for example 100ms) to each process to let the process perform some of its work. After the time slice is finished, the OS suspends the process and the next process in the list is given a time slice to perform some of its work. One of the simplest algorithmic solutions to this problem is round-robin scheduling. When the time slice of the last process is finished, the scheduler starts over again with the first process. For example, in the case of three processes, the first time slice would go to the first process, the second to the second process, the third to the third process, and the fourth back to the first process. The cycle would continue in this way indefinitely.

Suppose that you decide to write a generic round-robin scheduling class that can be used with any type of resource. The class should support adding and removing resources, and should support cycling through the resources in order to obtain the next one. You could use the STL vector directly, but it’s often helpful to write a wrapper class that provides more directly the functionality you need for your specific application. The following example shows a RoundRobin class template with comments explaining the code. This example uses basic functionality of templates. Templates are introduced in Chapter 11 with a MyArray class template example. Make sure you understand that section first before continuing with this RoundRobin example. Templates are discussed in much more detail in Chapter 19, but for this example, the brief introduction from Chapter 11 is enough to understand it. First, here is the class definition:

image
// Class template RoundRobin
// Provides simple round-robin semantics for a list of elements.
template <typename T>
class RoundRobin
{
    public:
        // Client can give a hint as to the number of expected elements for
        // increased efficiency.
        RoundRobin(int numExpected = 0);
        virtual ~RoundRobin();
        // Appends elem to the end of the list. May be called
        // between calls to getNext().
        void add(const T& elem);
        // Removes the first (and only the first) element
        // in the list that is equal (with operator==) to elem.
        // May be called between calls to getNext().
        void remove(const T& elem);
        // Returns the next element in the list, starting with the first,
        // and cycling back to the first when the end of the list is
        // reached, taking into account elements that are added or removed.
        T& getNext() throw(std::out_of_range);
    protected:
        std::vector<T> mElems;
        typename std::vector<T>::iterator mCurElem;
    private:
        // Prevent assignment and pass-by-value.
        RoundRobin(const RoundRobin& src);
        RoundRobin& operator=(const RoundRobin& rhs);
};

Code snippet from RoundRobinRoundRobin.h

As you can see, the public interface is straightforward: only three methods plus the constructor and destructor. The resources are stored in the vector called mElems. The iterator mCurElem always refers to the next element in mElems that should be used in the round-robin scheme. Note the use of the typename keyword in front of the line declaring mCurElem. So far, you’ve only seen that keyword used to specify template parameters, but there is another use for it. You must specify typename explicitly whenever you access a type based on one or more template parameters. In this case, the template parameter T is used to access the iterator type. Thus, you must specify typename. This is another example of arcane C++ syntax.

The class also prevents assignment and pass-by-value because of the mCurElem data member. To make assignment and pass-by-value work you would have to implement an assignment operator and copy constructor and make sure mCurElem is valid in the destination object. This is omitted in this example.

The implementation of the RoundRobin class follows with comments explaining the code. Note the use of reserve() in the constructor, and the extensive use of the iterator in add(), remove(), and getNext(). The trickiest aspect is keeping mCurElem valid and referring to the correct element following calls to add() or remove().

image
template <typename T>
RoundRobin<T>::RoundRobin(int numExpected)
{
    // If the client gave a guideline, reserve that much space.
    mElems.reserve(numExpected);
    // Initialize mCurElem even though it isn't used until
    // there's at least one element.
    mCurElem = mElems.begin();
}
 
template <typename T>
RoundRobin<T>::~RoundRobin()
{
    // nothing to do here -- the vector will delete all the elements
}
 
// Always add the new element at the end
template <typename T>
void RoundRobin<T>::add(const T& elem)
{
    // Even though we add the element at the end, the vector could
    // reallocate and invalidate the iterator with the push_back call.
    // Take advantage of the random access iterator features to save our
    // spot.
    int pos = mCurElem - mElems.begin();
    // add the element
    mElems.push_back(elem);
    // Reset our iterator to make sure it is valid.
    mCurElem = mElems.begin() + pos;
}
 
template <typename T>
void RoundRobin<T>::remove(const T& elem)
{
    for (auto it = mElems.begin(); it != mElems.end(); ++it) {
        if (*it == elem) {
            // Removing an element will invalidate our mCurElem iterator if
            // it refers to an element past the point of the removal.
            // Take advantage of the random access features of the iterator
            // to track the position of the current element after removal.
            int newPos;
            // If current iterator is before or at the one we're removing,
            // the new position is the same as before.
            if (mCurElem <= it) {
                newPos = mCurElem - mElems.begin();
            } else {
                // otherwise, it's one less than before
                newPos = mCurElem - mElems.begin() - 1;
            }
            // erase the element (and ignore the return value)
            mElems.erase(it);
            // Now reset our iterator to make sure it is valid.
            mCurElem = mElems.begin() + newPos;
            // If we were pointing to the last element and it was removed,
            // we need to loop back to the first.
            if (mCurElem == mElems.end()) {
                mCurElem = mElems.begin();
            }
            return;
        }
    }
}
 
template <typename T>
T& RoundRobin<T>::getNext() throw(std::out_of_range)
{
    // First, make sure there are any elements.
    if (mElems.empty()) {
        throw std::out_of_range("No elements in the list");
    }
    // retrieve a reference to return
    T& retVal = *mCurElem;
    // Increment the iterator modulo the number of elements
    ++mCurElem;
    if (mCurElem == mElems.end()) {
        mCurElem = mElems.begin();
    }
    // return the reference
    return retVal;
}

Code snippet from RoundRobinRoundRobin.h

Here’s a simple implementation of a scheduler that uses the RoundRobin class template, with comments explaining the code.

image
// Simple Process class.
class Process
{
    public:
        // Constructor accepting the name of the process.
        Process(const string& name) : mName(name) {}
        // Implementation of doWorkDuringTimeSlice would let the process
        // perform its work for the duration of a time slice.
        // Actual implementation omitted.
        void doWorkDuringTimeSlice() {
            cout << "Process " << mName
                 << " performing work during time slice." << endl;
        }
        // Needed for the RoundRobin::remove method to work.
        bool operator==(const Process& rhs) {
            return mName == rhs.mName;
        } 
    protected:
        string mName;
};
// Simple round-robin based process scheduler.
class Scheduler
{
    public:
        // Constructor takes a vector of processes.
        Scheduler(const vector<Process>& processes);
        // Selects the next process using a round-robin scheduling
        // algorithm and allows it to perform some work during
        // this time slice.
        void scheduleTimeSlice();
        // Removes the given process from the list of processes.
        void removeProcess(const Process& process);
    protected:
        RoundRobin<Process> rr;
};
 
Scheduler::Scheduler(const vector<Process>& processes)
{
    // Add the processes
    for (auto& process : processes) {
        rr.add(process);
    }
}
void Scheduler::scheduleTimeSlice()
{
    try {
        rr.getNext().doWorkDuringTimeSlice();
    } catch (const out_of_range&) {
        cerr << "No more processes to schedule." << endl;
    }
}
void Scheduler::removeProcess(const Process& process)
{
    rr.remove(process);
}
int main()
{
    vector<Process> processes = {Process("1"), Process("2"), Process("3")};
    Scheduler sched(processes);
    for (int i = 0; i < 4; ++i)
        sched.scheduleTimeSlice();
    sched.removeProcess(processes[1]);
    cout << "Removed second process" << endl;
    for (int i = 0; i < 4; ++i)
        sched.scheduleTimeSlice();
    return 0;
}

Code snippet from RoundRobinRoundRobinTest.cpp

The output should be as follows.

Process 1 performing work during time slice.
Process 2 performing work during time slice.
Process 3 performing work during time slice.
Process 1 performing work during time slice.
Removed second process
Process 3 performing work during time slice.
Process 1 performing work during time slice.
Process 3 performing work during time slice.
Process 1 performing work during time slice.

The vector<bool> Specialization

The standard requires a partial specialization of vector for bools, with the intention that it optimizes space allocation by “packing” the Boolean values. Recall that a bool is either true or false, and thus could be represented by a single bit, which can take on exactly two values. C++ does not have a native type that stores exactly one bit. Some compilers represent a Boolean value with a type the same size as a char. Some other compilers use an int. The vector<bool> specialization is supposed to store the “array of bools” in single bits, thus saving space.

pen.gif

You can think of the vector<bool> as a bit-field instead of a vector. The bitset container described later in this chapter provides a more full-featured bit-field implementation than does vector<bool>. However, the benefit of vector<bool> is that it can change size dynamically.

In a half-hearted attempt to provide some bit-field routines for the vector<bool>, there is one additional method: flip(). This method can be called on either the container, in which case it complements all the elements in the container; or on a single reference returned from operator[] or a similar method, in which case it complements that single element.

At this point, you should be wondering how you can call a method on a reference to bool. The answer is that you can’t. The vector<bool> specialization actually defines a class called reference that serves as a proxy for the underlying bool (or bit). When you call operator[], at(), or a similar method, the vector<bool> returns a reference object, which is a proxy for the real bool.

cross.gif

The fact that references returned from vector<bool> are really proxies means that you can’t take their addresses to obtain pointers to the actual elements in the container. The proxy design pattern is covered in detail in Chapter 29.

In practice, the little amount of space saved by packing bools hardly seems worth the extra effort. However, you should be familiar with this partial instantiation because of the additional flip() method, and because of the fact that references are actually proxy objects. Many C++ experts recommend avoiding vector<bool> in favor of the bitset, unless you really need a dynamically sized bit-field.

deque

The deque (abbreviation for double-ended queue) is almost identical to the vector, but is used far less frequently. It is defined in the <deque> header file. The principle differences are:

  • The implementation is not required to store elements contiguously in memory.
  • The deque supports constant-time insertion and removal of elements at both the front and the back (the vector supports amortized constant time at just the back).
  • The deque provides push_front() and pop_front(), which the vector omits.
  • The deque does not expose its memory management scheme via reserve() or capacity().

deques are rarely used, as opposed to vectors and lists. Thus, we leave the details of the deque methods to the Standard Library Reference resource on the website.

list

The STL list, defined in the <list> header file, is a standard doubly linked list. It supports constant-time insertion and deletion of elements at any point in the list, but provides slow (linear) time access to individual elements. In fact, the list does not even provide random access operations like operator[]. Only through iterators can you access individual elements.

Most of the list operations are identical to those of the vector, including the constructors, destructor, copying operations, assignment operations, and comparison operations. This section focuses on those methods that differ from those of vector. Consult the Standard Library Reference resource on the website for details on the list methods not discussed here.

lists also support the C++11 uniform initialization mechanism as shown in the following example:

image
list<string> lst = {"String 1", "String 2", "String 3"};

Code snippet from ListUniformInitListUniformInit.cpp

Accessing Elements

The only methods provided by the list to access elements are front() and back(), both of which run in constant time. These methods return a reference to the first and last element in the list. All other element access must be performed through iterators.

A list supports begin(), returning an iterator referring to the first element in the list, and end(), returning an iterator referring to the last element in the list.

cross.gif

Lists do not provide random access to elements.

Iterators

The list iterator is bidirectional, not random access like the vector iterator. That means that you cannot add and subtract list iterators from each other, or perform other pointer arithmetic on them. For example, if p is a list iterator, you can traverse through the elements of the list by doing ++p or --p, but you cannot use the addition or subtraction operator; p+n or p-n does not work.

Adding and Removing Elements

The list supports the same add element and remove element methods like the vector, including push_back(), pop_back(), the five forms of insert(), the two forms of erase(), and clear(). Like the deque, it also provides push_front() and pop_front(). The amazing thing about the list is that all these methods (except for clear()) run in constant time, once you’ve found the correct position. Thus, the list is appropriate for applications that perform many insertions and deletions from the data structure, but do not need quick index-based element access.

list Size

Like deques, and unlike vectors, lists do not expose their underlying memory model. Consequently, they support size(), empty()and resize(), but not reserve() or capacity().

Special list Operations

The list provides several special operations that exploit its quick element insertion and deletion. This section provides an overview and examples. The Standard Library Reference resource on the website gives a thorough reference for all the methods.

Splicing

The linked-list characteristics of the list class allow it to splice, or insert, an entire list at any position in another list in constant time. The simplest version of this method works as follows:

image
list<string> dictionary, bWords;
// Add the a words.
dictionary.push_back("aardvark");
dictionary.push_back("ambulance");
// Add the c words.
dictionary.push_back("canticle");
dictionary.push_back("consumerism");
// Create another list, of the b words.
bWords.push_back("bathos");
bWords.push_back("balderdash");
// splice the b words into the main dictionary.
if (bWords.size() > 0) {
    // Get an iterator to the last b word.
    auto iterLastB = --(bWords.cend());
    // Iterate up to the spot where we want to insert bs.
    list<string>::iterator it;
    for (it = dictionary.begin(); it != dictionary.end(); ++it) {
        if (*it > *iterLastB)
            break;
    }
    // Add in the bwords. This action removes the elements from bWords.
    dictionary.splice(it, bWords);
}
// print out the dictionary
for (auto it = dictionary.cbegin(); it != dictionary.cend(); ++it) {
    cout << *it << endl;
}

Code snippet from ListSpliceListSplice.cpp

The result from running this program looks like this:

aardvark
ambulance
bathos
balderdash
canticle
consumerism

There are also two other forms of splice(): one that inserts a single element from another list and one that inserts a range from another list. See the Standard Library Reference resource on the website for details.

cross.gif

Splicing is destructive to the list passed as a parameter: It removes the spliced elements from one list in order to insert them into the other.

More Efficient Versions of Algorithms

In addition to splice(), the list class provides special implementations of several of the generic STL algorithms. The generic forms are covered in Chapter 13. Here we discuss only the specific versions provided by list.

pen.gif

When you have a choice, use the list methods rather than the generic STL algorithms because the former are more efficient.

The following table summarizes the algorithms for which list provides special implementations as methods. See the Standard Library Reference resource on the website and Chapter 13 for prototypes, details on the algorithms, and their specific running time when called on list.

METHOD DESCRIPTION
remove()
remove_if()
Removes certain elements from the list.
unique() Removes duplicate consecutive elements from the list, based on operator== or a user supplied binary predicate.
merge() Merges two lists. Both lists must be sorted to start, according to operator< or a user defined comparator. Like splice(), merge() is destructive to the list passed as an argument.
sort() Performs a stable sort on elements in the list.
reverse() Reverses the order of the elements in the list.

The following section demonstrates most of these methods.

list Example: Determining Enrollment

Suppose that you are writing a computer registration system for a university. One feature you might provide is the ability to generate a complete list of enrolled students in the university from lists of the students in each class. For the sake of this example, assume that you must write only a single function that takes a vector of lists of student names (as strings), plus a list of students that have been dropped from their courses because they failed to pay tuition. This method should generate a complete list of all the students in all the courses, without any duplicates, and without those students who have been dropped. Note that students might be in more than one course.

Here is the code for this method, with comments explaining the code. With the power of STL lists, the method is practically shorter than its written description! Note that the STL allows you to “nest” containers: In this case, you can use a vector of lists.

image
// courseStudents is a vector of lists, one for each course. The lists
// contain the students enrolled in those courses. They are not sorted.
//
// droppedStudents is a list of students who failed to pay their
// tuition and so were dropped from their courses.
//
// The function returns a list of every enrolled (non-dropped) student in
// all the courses.
list<string>
getTotalEnrollment(const vector<list<string>>& courseStudents,
                   const list<string>& droppedStudents)
{
    list<string> allStudents;
    // Concatenate all the course lists onto the master list
    for (auto& lst : courseStudents) {
        allStudents.insert(allStudents.end(), lst.begin(), lst.end());
    }
    // Sort the master list
    allStudents.sort();
    // Remove duplicate student names (those who are in multiple courses).
    allStudents.unique();
    // Remove students who are on the dropped list.
    // Iterate through the drop list, calling remove on the
    // master list for each student in the dropped list.
    for (auto& str : droppedStudents) {
        allStudents.remove(str);
    }
    // done!
    return allStudents;
}

Code snippet from StudentEnrollmentEnrollment.cpp

This example is using C++11 range-based for loops. If your compiler does not yet support those, you can implement the function as follows:

image
list<string>
getTotalEnrollment(const vector<list<string> >& courseStudents,
                   const list<string>& droppedStudents)
{
    list<string> allStudents;
    // Concatenate all the course lists onto the master list
    for (vector<list<string> >::const_iterator it = courseStudents.begin();
        it != courseStudents.end(); ++it) {
        allStudents.insert(allStudents.end(), (*it).begin(), (*it).end());
    }
    // Sort the master list
    allStudents.sort();
    // Remove duplicate student names (those who are in multiple courses).
    allStudents.unique();
    // Remove students who are on the dropped list.
    // Iterate through the drop list, calling remove on the
    // master list for each student in the dropped list.
    for (list<string>::const_iterator it = droppedStudents.begin();
        it != droppedStudents.end(); ++it) {
        allStudents.remove(*it);
    }
    // done!
    return allStudents;
}

Code snippet from StudentEnrollmentEnrollment.cpp

imagearray

The C++11 std::array class, defined in the <array> header file, is similar to the vector except that it is of a fixed size; it cannot grow or shrink in size. The purpose of this is to allow an array to be allocated on the stack, rather than always demanding heap access as vector does. Just like vectors, arrays support random access iterators, and elements are stored in contiguous memory. It has support for front(), back(), at() and operator[]. It also supports a fill() method to fill the array with a specific element. Because it is fixed in size, it does not support push_back(), pop_back(), insert(), erase(), clear(), resize(), reserve() and capacity(). The following small example demonstrates how to use the array class. Note that the array declaration requires two template parameters; the first specifies the type of the elements, and the second specifies the fixed number of elements in the array.

image
// Create array of 3 integers and initialize them
// with the given initializer_list using uniform initialization.
array<int, 3> arr = {9, 8, 7};
// Output the size of the array.
cout << "Array size = " << arr.size() << endl;
// Output the contents of the array using iterators.
for (auto iter = arr.cbegin(); iter != arr.cend(); ++iter)
    cout << *iter << endl;
 
cout << "Performing arr.fill(3)..." << endl;
// Use the fill method to change the contents of the array.
arr.fill(3);
// Output the contents using the range-based for loop.
for (auto& i : arr)
    cout << i << endl;

Code snippet from std_arraystd_array.cpp

The output of the preceding code is as follows:

Array size = 3
9
8
7
Performing arr.fill(3)...
3
3
3

imageforward_list

The forward_list introduced by C++11, defined in the <forward_list> header file, is similar to the list except that the forward_list is a singly linked list while the list is a doubly linked list. This means that the forward_list supports only forward iteration and because of this, ranges need to be specified differently compared to a list. If you want to modify any list, you need access to the element before the first element of interest. Since a forward_list does not have an iterator that supports going backward, there is no easy way to get to the preceding element. For this reason, ranges that will be modified, for example ranges supplied to erase and splice, must be open at the beginning. The begin() function seen earlier returns an iterator to the first element and thus can be used only to construct a range that is closed at the beginning. The forward_list class defines a before_begin() method, which returns an iterator that points to an imaginary element before the beginning of the list. You cannot dereference this iterator as it points to invalid data. However, incrementing this iterator by one will make it the same as the iterator returned by begin(); therefore, it can be used to make a range that is open at the beginning. The following table sums up the differences between a list and a forward_list.

OPERATION list forward_list
back() x
before_begin() x
begin() x x
cbefore_begin() x
cbegin() x x
cend() x x
clear() x x
crbegin() x
crend() x
emplace() x
emplace_after() x
emplace_back() x
emplace_front() x x
empty() x x
end() x x
erase() x
erase_after() x
front() x x
insert() x
insert_after() x
iterator / const_iterator x x
merge() x x
pop_back() x
pop_front() x x
push_back() x
push_front() x x
rbegin() x
remove() x x
remove_if() x x
rend() x
resize() x x
reverse() x x
reverse_iterator / const_reverse_iterator x
size() x
sort() x x
splice() x
splice_after() x
swap() x x
unique() x x

Constructors and assignment operators are similar between a list and a forward_list. The standard says that forward_lists should try to use minimal space. That’s the reason why there is no size() method, because by not providing it, there is no need to store the size of the list. The following example demonstrates the use of forward_lists:

image
// Create 3 forward lists and use an initializer_list
// to initialize their elements (uniform initialization).
forward_list<int> lst1 = {5,6};
forward_list<int> lst2 = {1,2,3,4};
forward_list<int> lst3 = {7,8,9};
// Insert lst2 at the front of lst1 using splice.
lst1.splice_after(lst1.before_begin(), lst2);
// Add number 0 at the beginning of the lst1.
lst1.push_front(0);
// Insert lst3 at the end of lst1.
// For this, we first need an iterator to the last element.
auto iter = lst1.before_begin();
auto iterTemp = iter;
while (++iterTemp != lst1.end())
    ++iter;
lst1.insert_after(iter, lst3.begin(), lst3.end());
// Output the contents of lst1.
for (auto& i : lst1)
    cout << i << ' ';

Code snippet from ForwardListforward_list.cpp

To insert lst3, we need an iterator to the last element in the list. However, since this is a forward_list, we cannot use --(lst1.end()), so we need to iterate over the list from the beginning and stop at the last element. The output of this example is as follows:

0 1 2 3 4 5 6 7 8 9

CONTAINER ADAPTERS

In addition to the standard sequential containers, the STL provides three container adapters: queue, priority_queue, and stack. Each of these adapters is a wrapper around one of the sequential containers. The intent is to simplify the interface and to provide only those features that are appropriate for the stack or queue abstraction. For example, the adapters don’t provide iterators or the capability to insert or erase multiple elements simultaneously.

pen.gif

The container adapters’ interfaces may be too limiting for your needs. If so, you can use the sequential containers directly or write your own, more full-featured, adapters. See Chapter 29 for details on the adapter design pattern.

queue

The queue container adapter, defined in the header file <queue>, provides standard “first-in, first-out” (FIFO) semantics. As usual, it’s written as a class template, which looks like this:

template <class T, class Container = deque<T> > class queue;

The T template parameter specifies the type that you intend to store in the queue. The second template parameter allows you to stipulate the underlying container that the queue adapts. However, the queue requires the sequential container to support both push_back() and pop_front(), so you only have two built-in choices: deque and list. For most purposes, you can just stick with the default deque.

queue Operations

The queue interface is extremely simple: There are only eight methods plus the constructor and the normal comparison operators. The push()and emplace() methods add a new element to the tail of the queue, and pop() removes the element at the head of the queue. You can retrieve references to, without removing, the first and last elements with front() and back(), respectively. As usual, when called on const objects, front() and back() return const references; and when called on non-const objects they return non-const (read/write) references.

cross.gif

pop() does not return the element popped. If you want to retain a copy, you must first retrieve it with front().

The queue also supports size(), empty() and swap(). See the Standard Library Reference resource on the website for details.

queue Example: A Network Packet Buffer

When two computers communicate over a network, they send information to each other divided up into discrete chunks called packets. The networking layer of the computer’s operating system must pick up the packets and store them as they arrive. However, the computer might not have enough bandwidth to process all of them at once. Thus, the networking layer usually buffers, or stores, the packets until the higher layers have a chance to attend to them. The packets should be processed in the order they arrive, so this problem is perfect for a queue structure. The following is a small PacketBuffer class, with comments explaining the code, that stores incoming packets in a queue until they are processed. It’s a template so that different layers of the networking layer can use it for different kinds of packets, such as IP packets or TCP packets. It allows the client to specify a maximum size because operating systems usually limit the number of packets that can be stored, so as not to use too much memory. When the buffer is full, subsequently arriving packets are ignored.

image
template <typename T>
class PacketBuffer
{
    public:
        // If maxSize is 0, the size is unlimited, because creating
        // a buffer of size 0 makes little sense. Otherwise only
        // maxSize packets are allowed in the buffer at any one time.
        PacketBuffer(size_t maxSize = 0);
        // Stores a packet in the buffer.
        // Returns false if the packet has been discarded because
        // there is no more space in the buffer, true otherwise.
        bool bufferPacket(const T& packet);
        // Returns the next packet. Throws out_of_range
        // if the buffer is empty.
        T getNextPacket() throw(std::out_of_range);
    protected:
        std::queue<T> mPackets;
        int mMaxSize;
};
template <typename T>
PacketBuffer<T>::PacketBuffer(size_t maxSize/* = 0 */)
    : mMaxSize(maxSize)
{
}
template <typename T>
bool PacketBuffer<T>::bufferPacket(const T& packet)
{
    if (mMaxSize > 0 && mPackets.size() == mMaxSize) {
        // No more space. Drop the packet.
        return false;
    }
    mPackets.push(packet);
    return true;
}
template <typename T>
T PacketBuffer<T>::getNextPacket() throw(std::out_of_range)
{
    if (mPackets.empty()) {
        throw std::out_of_range("Buffer is empty");
    }
    // retrieve the head element
    T temp = mPackets.front();
    // pop the head element
    mPackets.pop();
    // return the head element
    return temp;
}

Code snippet from PacketBufferPacketBuffer.h

A practical application of this class would require multiple threads. C++11 provides synchronization classes to allow thread-safe access to shared objects. Without explicit synchronization provided by the programmer, no STL class can be used safely in multiple threads. Synchronization is discussed in Chapter 22. However, here is a quick single-threaded example of the PacketBuffer:

image
class IPPacket
{
    public:
        IPPacket(int id) : mID(id) {}
        int getID() const { return mID; }
    protected:
        int mID;
};
int main()
{
    PacketBuffer<IPPacket> ipPackets(3);
    // Add 4 packets
    for (int i = 1; i <= 4; ++i) {
        if (!ipPackets.bufferPacket(IPPacket(i)))
            cout << "Packet " << i << " dropped (queue is full)." << endl;
    }
    while (true) {
        try {
            IPPacket packet = ipPackets.getNextPacket();
            cout << "Processing packet " << packet.getID() << endl;
        } catch (const out_of_range&) {
            cout << "Queue is empty." << endl;
            break;
        }
    }
    return 0;
}

Code snippet from PacketBufferPacketBufferTest.cpp

The output of this program is as follows:

Packet 4 dropped (queue is full).
Processing packet 1
Processing packet 2
Processing packet 3
Queue is empty.

priority_queue

A priority queue is a queue that keeps its elements in sorted order. Instead of a strict FIFO ordering, the element at the head of the queue at any given time is the one with the highest priority. This element could be the oldest on the queue or the most recent. If two elements have equal priority, their relative order in the queue is FIFO.

The STL priority_queue container adapter is also defined in <queue>. Its template definition looks something like this (slightly simplified):

template <class T, class Container = vector<T>,
          class Compare = less<T> >;

It’s not as complicated as it looks. You’ve seen the first two parameters before: T is the element type stored in the priority_queue and Container is the underlying container on which the priority_queue is adapted. The priority_queue uses vector as the default, but deque works as well. list does not work because the priority_queue requires random access to its elements. The third parameter, Compare, is trickier. As you’ll learn more about in Chapter 13, less is a class template that supports comparison of two objects of type T with operator<. What this means for you is that the priority of elements in the queue is determined according to operator<. You can customize the comparison used, but that’s a topic for Chapter 13. For now, just make sure that you define operator< appropriately for the types stored in the priority_queue.

pen.gif

The head element of the priority_queue is the one with the “highest” priority, by default determined according to operator< such that elements that are “less” than other elements have lower priority.

priority_queue Operations

The priority_queue provides even fewer operations than does the queue. The push() and emplace() methods allow you to insert elements, pop() allows you to remove elements, and top() returns a const reference to the head element.

cross.gif

top() returns a const reference even when called on a non-const object, because modifying the element might change its order, which is not allowed. The priority_queue provides no mechanism to obtain the tail element.

cross.gif

pop() does not return the element popped. If you want to retain a copy, you must first retrieve it with top().

Like the queue, the priority_queue supports size(), empty() and swap(). However, it does not provide any comparison operators. Consult the Standard Library Reference resource on the website for details.

This interface is obviously limited. In particular, the priority_queue provides no iterator support, and it is impossible to merge two priority_queues.

priority_queue Example: An Error Correlator

Single failures on a system can often cause multiple errors to be generated from different components. A good error-handling system uses error correlation to process the most important errors first. You can use a priority_queue to write a very simple error correlator. Assume all error events encode their own priority. This class simply sorts error events according to their priority, so that the highest-priority errors are always processed first. Here is the class definition:

image
// Sample Error class with just a priority and a string error description.
class Error
{
    public:
        Error(int priority, const std::string& errMsg
            : mPriority(priority), mError(errMsg) {}
        int getPriority() const { return mPriority; }
        std::string getErrorString() const { return mError; }
        friend bool operator<(const Error& lhs, const Error& rhs);
        friend std::ostream& operator<<(std::ostream& os,
            const Error& err);// See Chapter 18 for details on implementing operator<<
    protected:
        int mPriority;
        std::string mError;
};
// Simple ErrorCorrelator class that returns highest priority errors first.
class ErrorCorrelator
{
    public:
        ErrorCorrelator() {}
        // Add an error to be correlated.
        void addError(const Error& error);
        // Retrieve the next error to be processed.
        Error getError() throw(std::out_of_range);
    protected:
        std::priority_queue<Error> mErrors;
};

Code snippet from ErrorCorrelatorPqueueErrorCorrelator.h

Here are the definitions of the functions and methods:

image
bool operator<(const Error& lhs, const Error& rhs)
{
    return (lhs.mPriority < rhs.mPriority);
}
ostream& operator<<(ostream& os, const Error& err)
{
    os << err.mError << " (priority " << err.mPriority << ")";
    return os;
}
void ErrorCorrelator::addError(const Error& error)
{
    mErrors.push(error);
}
Error ErrorCorrelator::getError() throw(out_of_range)
{
    // If there are no more errors, throw an exception.
    if (mErrors.empty()) {
        throw out_of_range("No elements!");
    }
    // Save the top element.
    Error top = mErrors.top();
    // Remove the top element.
    mErrors.pop();
    // Return the saved element.
    return top;
}

Code snippet from ErrorCorrelatorPqueueErrorCorrelator.cpp

Here is a simple unit test showing how to use the ErrorCorrelator. Realistic use would require multiple threads so that one thread adds errors, while another processes them. As mentioned earlier with the queue example, without explicit synchronization provided by the programmer, no STL class can be used safely in multiple threads. Synchronization is discussed in Chapter 22.

image
ErrorCorrelator ec;
ec.addError(Error(3, "Unable to read file"));
ec.addError(Error(1, "Incorrect entry from user"));
ec.addError(Error(10, "Unable to allocate memory!"));
while (true) {
    try {
        Error e = ec.getError();
        cout << e << endl;
    } catch (const out_of_range&) {
        cout << "Finished processing errors" << endl;
        break;
    }
}

Code snippet from ErrorCorrelatorPqueueErrorCorrelatorTest.cpp

The output of this program is as follows:

Unable to allocate memory! (priority 10)
Unable to read file (priority 3)
Incorrect entry from user (priority 1)
Finished processing errors

stack

The stack is almost identical to the queue, except that it provides first-in, last-out (FILO) semantics, also known as last-in, first-out (LIFO), instead of FIFO. It is defined in the <stack> header file. The template definition looks like this:

template <class T, class Container = deque<T> > class stack;

You can use vector, list, or deque as the underlying model for the stack.

stack Operations

Like the queue, the stack provides push(), emplace() and pop(). The difference is that push() adds a new element to the top of the stack, “pushing down” all elements inserted earlier, and pop() removes the element from the top of the stack, which is the most recently inserted element. The top() method returns a const reference to the top element if called on a const object and a non-const reference if called on a non-const object.

cross.gif

pop() does not return the element popped. If you want to retain a copy, you must first retrieve it with top().

The stack supports empty(), size(), swap() and the standard comparison operators. See the Standard Library Reference resource on the website for details.

stack Example: Revised Error Correlator

Suppose that you decide to rewrite the previous ErrorCorrelator class so that it gives out the most recent errors instead of those with the highest priority. You can substitute a stack for the priority_queue in the ErrorCorrelator class definition. Now, the Errors will be distributed from the class in LIFO order instead of priority order. Nothing in the method definitions needs to change because the push(), pop(), top(), and empty() methods exist on both the priority_queue and stack. There is only one change required and only in the ErrorCorrelator class.

image
// Simple ErrorCorrelator class that returns most recent errors first.
class ErrorCorrelator
{
    public:
        ErrorCorrelator() {}
        // Add an error to be correlated.
        void addError(const Error& error);
        // Retrieve the next error to be processed
        Error getError() throw(std::out_of_range);
    protected:
        std::stack<Error> mErrors;
};

Code snippet from ErrorCorrelatorStackErrorCorrelator.h

All the rest of the code remains the same. The output of this stack version is as follows:

Unable to allocate memory! (priority 10)
Incorrect entry from user (priority 1)
Unable to read file (priority 3)
Finished processing errors

ASSOCIATIVE CONTAINERS

Unlike the sequential containers, the associative containers do not store elements in a linear configuration. Instead, they provide a mapping of keys to values. They generally offer insertion, deletion, and lookup times that are equivalent to each other.

The four associative containers provided by the STL are map, multimap, set, and multiset. Each of these containers stores its elements in a sorted, treelike, data structure.

The pair Utility Class

Before learning about the associative containers, you must become familiar with the pair class, which is defined in the <utility> header file. The pair is a class template that groups together two values of possibly different types. The values are accessible through the first and second public data members. operator== and operator< are defined for pairs to compare both the first and second elements. Here are some examples:

image
// Two-argument constructor and default constructor
pair<string, int> myPair("hello", 5);
pair<string, int> myOtherPair;
// Can assign directly to first and second
myOtherPair.first = "hello";
myOtherPair.second = 6;
// Copy constructor
pair<string, int> myThirdPair(myOtherPair);
// operator<
if (myPair < myOtherPair) {
    cout << "myPair is less than myOtherPair" << endl;
} else {
    cout << "myPair is greater than or equal to myOtherPair" << endl;
}
// operator==
if (myOtherPair == myThirdPair) {
    cout << "myOtherPair is equal to myThirdPair" << endl;
} else {
    cout << "myOtherPair is not equal to myThirdPair" << endl;
}

Code snippet from PairPairTest.cpp

The output is as follows:

myPair is less than myOtherPair
myOtherPair is equal to myThirdPair

The library also provides a utility function template, make_pair(), that constructs a pair from two values. For example, you could use it like this:

image
pair<int, int> aPair = make_pair(5, 10);

Code snippet from PairPairTest.cpp

Of course, in this case you could have just used the two-argument constructor. However, make_pair() is more useful when you want to pass a pair to a function. Unlike class templates, function templates can infer types from parameters, so you can use make_pair() to construct a pair without explicitly specifying the types. You can also use make_pair() in combination with the C++11 auto keyword as follows:

image
auto aSecondPair = make_pair(5, 10);

Code snippet from PairPairTest.cpp

cross.gif

Using plain old pointer types in pairs is risky because the pair copy constructor and assignment operator perform only shallow copies and assignments of pointer types. However, you can safely store smart pointers like shared_ptr in your pair.

map

The map is one of the most useful containers, defined in the <map> header file. It stores key/value pairs instead of just a single value. Insertion, lookup, and deletion are all based on the key; the value is just “along for the ride.” The term “map” comes from the conceptual understanding that the container “maps” keys to values.

The map keeps elements in sorted order, based on the keys, so that insertion, deletion, and lookup all take logarithmic time. Because of the order, when you enumerate the elements, they come out in the ordering imposed by the type’s operator< or a user defined comparator. It is usually implemented as some form of balanced tree, such as a red-black tree. However, the tree structure is not exposed to the client.

You should use a map whenever you need to store and retrieve elements based on a “key” value and you would like to have them in a certain order.

Constructing maps

The map template takes four types: the key type, the value type, the comparison type, and the allocator type. As usual, we ignore the allocator in this chapter; see Chapter 17 for details. The comparison type is similar to the comparison type for priority_queue described earlier. It allows you to specify a different comparison class than the default. You usually shouldn’t need to change the sorting criteria. In this chapter, we use only the default less comparison. When using the default, make sure that your keys all respond to operator< appropriately.

If you’re interested in further detail, Chapter 13 explains how to write your own comparison classes.

If you ignore the comparison and allocator parameters (which we urge you to do), constructing a map is just like constructing a vector or list, except that you specify the key and value types separately in the template. For example, the following code constructs a map that uses ints as the key and stores objects of the Data class:

image
class Data
{
    public:
        Data(int val = 0) { mVal = val; }
        int getVal() const { return mVal; }
        void setVal(int val) { mVal = val; }
    protected:
        int mVal;
};
int main()
{
    map<int, Data> dataMap;
    return 0;
}

Code snippet from MapBasicsMapInsert.cpp

maps also support the C++11 uniform initialization mechanism as shown in the following example:

image
map<string, int> m = {
    {"Marc G.", 123},
    {"Zulija N.", 456},
    {"John D.", 369}
};

Code snippet from MapBasicsMapUniformInit.cpp

Inserting Elements

Inserting an element into the sequential containers such as vector and list always requires you to specify the position at which the element is to be added. The map, along with the other associative containers, is different. The map internal implementation determines the position in which to store the new element; you need only to supply the key and the value.

pen.gif

map and the other associative containers do provide a version of insert() that takes an iterator position. However, that position is only a “hint” to the container as to the correct position. The container is not required to insert the element at that position.

When inserting elements, it is important to keep in mind that maps support so-called “unique keys”: Every element in the map must have a different key. If you want to support multiple elements with the same key, you must use multimaps, which are described later.

There are two ways to insert an element into the map: one clumsy and one not so clumsy.

The insert() Method

The clumsy mechanism to add an element to a map is the insert() method, but it has the advantage of allowing you to detect if the key already exists. One problem is that you must specify the key/value pair as a pair object. The second problem is that the return value from the basic form of insert() is a pair of an iterator and a bool. The reason for the complicated return value is that insert() does not overwrite an element value if one already exists with the specified key. The bool element of the returned pair specifies whether the insert() actually inserted the new key/value pair or not. The iterator refers to the element in the map with the specified key (with a new or old value, depending on whether the insert succeeded or failed). Continuing the map example from the previous section, you can use insert() as follows:

image
map<int, Data> dataMap;
auto ret = dataMap.insert({1, Data(4)}); // Using C++11 initializer_list
if (ret.second) {
    cout << "Insert succeeded!" << endl;
} else {
    cout << "Insert failed!" << endl;
}
ret = dataMap.insert(make_pair(1, Data(6)));
if (ret.second) {
    cout << "Insert succeeded!" << endl;
} else {
    cout << "Insert failed!" << endl;
}

Code snippet from MapBasicsMapInsert.cpp

The output of the program is:

Insert succeeded!
Insert failed!

If your compiler does not support the C++11 auto keyword, you have to declare the correct type for ret yourself as follows:

pair<map<int, Data>::iterator, bool> ret;

The type of ret is a pair. The first element of the pair is a map iterator for a map with keys of type int and values of type Data. The second element of the pair is a Boolean value.

operator[]

The less clumsy way to insert an element into the map is through the overloaded operator[]. The difference is mainly in the syntax: You specify the key and value separately. Additionally, operator[] always succeeds. If no element value with the given key exists, it creates a new element with that key and value. If an element with the key exists already, operator[] replaces the element value with the newly specified value. Here is the previous example using operator[] instead of insert():

image
map<int, Data> dataMap;
dataMap[1] = Data(4);
dataMap[1] = Data(6); // Replaces the element with key 1

Code snippet from MapBasicsMapIndexOperator.cpp

There is, however, one major caveat to operator[]: It always constructs a new value object, even if it doesn’t need to use it. Thus, it requires a default constructor for your element values, and can be less efficient than insert().

The fact that operator[] creates a new element in the map if the requested element does not already exist means that this operator is not marked as const. This sounds obvious, but might sometimes look counter-intuitive. For example, suppose you have the following function:

image
void func(const map<int, int>& m)
{
    cout << m[1] << endl;  // Error
}

Code snippet from MapBasicsMapAsParameter.cpp

This will fail to compile, even though you appear to be just reading the value m[1]. It fails because the variable m is a const reference to a map, and operator[] is not marked as const. Instead, you should use the find() method as follows:

image
void func(const map<int, int>& m)
{
    auto iter = m.find(1);
    if (iter != m.end())
        cout << iter->second << endl;
}

Code snippet from MapBasicsMapAsParameter.cpp

Or, if your compiler doesn’t support the C++11 auto keyword:

image
void func(const map<int, int>& m)
{
    map<int, int>::const_iterator iter = m.find(1);
    if (iter != m.end())
        cout << iter->second << endl;
}

Code snippet from MapBasicsMapAsParameter.cpp

map Iterators

map iterators work similarly to the iterators on the sequential containers. The major difference is that the iterators refer to key/value pairs instead of just the values. In order to access the value, you must retrieve the second field of the pair object. Here is how you can iterate through the map from the previous example:

image
map<int, Data> dataMap;
dataMap[1] = Data(4);
dataMap[1] = Data(6); // Replaces the element with key 1
for (auto iter = dataMap.cbegin(); iter != dataMap.cend(); ++iter) {
    cout << iter->second.getVal() << endl;
}

Code snippet from MapBasicsMapIterators.cpp

Take another look at the expression used to access the value:

iter->second.getVal()

iter refers to a key/value pair, so you can use the -> operator to access the second field of that pair, which is a Data object. You can then call the getVal() method on that data object.

Note that the following code is functionally equivalent:

(*iter).second.getVal()

You still see a lot of code like that around because the -> operator didn’t used to be implemented for iterators.

Using the C++11 range-based for loop, the loop can be written even more elegantly as follows:

image
for (auto& p : dataMap) {
    cout << p.second.getVal() << endl;
}

Code snippet from MapBasicsMapIterators.cpp

If your compiler does not support the preceding C++11 versions, you have to write the loop as follows:

image
for (map<int, Data>::const_iterator iter = dataMap.begin();
    iter != dataMap.end(); ++iter) {
    cout << iter->second.getVal() << endl;
}

Code snippet from MapBasicsMapIterators.cpp

cross.gif

You can modify element values through non-const iterators, but the compiler will generate an error if you try to modify the key of an element, even through a non-const iterator, because it would destroy the sorted order of the elements in the map.

map iterators are bidirectional, meaning you can traverse them in both directions.

Looking Up Elements

The map provides logarithmic lookup of elements based on a supplied key. If you already know that an element with a given key is in the map, the simplest way to look it up is through operator[]. The nice thing about operator[] is that it returns a reference to the element that you can use (or modify on a non-const map) directly, without worrying about pulling the value out of a pair object. Here is an extension to the preceding example to call the setVal() method on the Data object value at key 1:

image
map<int, Data> dataMap;
dataMap[1] = Data(4);
dataMap[1] = Data(6);
dataMap[1].setVal(100);

Code snippet from MapBasicsMapLookup.cpp

However, if you don’t know whether the element exists, you may not want to use operator[], because it will insert a new element with that key if it doesn’t find one already. As an alternative, the map provides a find() method that returns an iterator referring to the element with the specified key, if it exists, or the end() iterator if it’s not in the map. Here is an example using find() to perform the same modification to the Data object with key 1:

image
map<int, Data> dataMap;
dataMap[1] = Data(4);
dataMap[1] = Data(6);
auto it = dataMap.find(1);
if (it != dataMap.end()) {
    it->second.setVal(100);
}

Code snippet from MapBasicsMapFind.cpp

As you can see, using find() is a bit clumsier, but it’s sometimes necessary. If your compiler does not support the C++11 auto keyword you need to call find() as follows:

image
map<int, Data>::iterator it = dataMap.find(1);

Code snippet from MapBasicsMapFind.cpp

If you only want to know whether or not an element with a certain key is in the map, you can use the count() member function. It returns the number of elements in the map with a given key. For maps, the result will always be 0 or 1 because there can be no elements with duplicate keys. The following section shows an example using count().

Removing Elements

The map allows you to remove an element at a specific iterator position or to remove all elements in a given iterator range, in amortized constant and logarithmic time, respectively. From the client perspective, these two erase() methods are equivalent to those in the sequential containers. A great feature of the map, however, is that it also provides a version of erase() to remove an element matching a key. Here is an example:

image
map<int, Data> dataMap;
dataMap[1] = Data(4);
cout << "There are " << dataMap.count(1) << " elements with key 1" << endl;
dataMap.erase(1);
cout << "There are " << dataMap.count(1) << " elements with key 1" << endl;

Code snippet from MapBasicsMapErase.cpp

The output should be as follows:

There are 1 elements with key 1
There are 0 elements with key 1

map Example: Bank Account

You can implement a simple bank account database using a map. A common pattern is for the key to be one field of a class or struct that is stored in the map. In this case, the key is the account number. Here are simple BankAccount and BankDB classes:

image
class BankAccount
{
    public:
        BankAccount(int acctNum, const std::string& name)
            : mAcctNum(acctNum), mClientName(name) {}
        void setAcctNum(int acctNum) { mAcctNum = acctNum; }
        int getAcctNum() const { return mAcctNum; }
        void setClientName(const std::string& name) { mClientName = name; }
        std::string getClientName() const { return mClientName; }
    protected:
        int mAcctNum;
        std::string mClientName;
};
class BankDB
{
    public:
        BankDB() {}
        // Adds acct to the bank database. If an account exists already
        // with that number, the new account is not added. Returns true
        // if the account is added, false if it's not.
        bool addAccount(const BankAccount& acct);
        // Removes the account acctNum from the database.
        void deleteAccount(int acctNum);
        // Returns a reference to the account represented
        // by its number or the client name.
        // Throws out_of_range if the account is not found.
        BankAccount& findAccount(int acctNum) throw(std::out_of_range);
        BankAccount& findAccount(const std::string& name)
            throw(std::out_of_range);
        // Adds all the accounts from db to this database.
        // Deletes all the accounts from db.
        void mergeDatabase(BankDB& db);
    protected:
        std::map<int, BankAccount> mAccounts;
};

Code snippet from BankAccountBankDB.h

Here are the implementations of the BankDB methods, with comments explaining the code:

image
bool BankDB::addAccount(const BankAccount& acct)
{
    // Do the actual insert, using the account number as the key
    auto res = mAccounts.insert(make_pair(acct.getAcctNum(), acct));
    // Return the bool field of the pair specifying success or failure
    return res.second;
}
void BankDB::deleteAccount(int acctNum)
{
    mAccounts.erase(acctNum);
}
BankAccount& BankDB::findAccount(int acctNum) throw(out_of_range)
{
    // Finding an element via its key can be done with find()
    auto it = mAccounts.find(acctNum);
    if (it == mAccounts.end()) {
        throw out_of_range("No account with that number.");
    }
    // Remember that iterators into maps refer to pairs of key/value
    return it->second;
}
BankAccount& BankDB::findAccount(const string& name) throw(out_of_range)
{
    // Finding an element by a non-key attribute requires a linear
    // search through the elements.
    for (auto& p : mAccounts) {
        if (p.second.getClientName() == name) {
            // found it!
            return p.second;
        }
    }
    throw out_of_range("No account with that name.");
}
void BankDB::mergeDatabase(BankDB& db)
{
    // Just insert copies of all the accounts in the old db
    // to the new one.
    mAccounts.insert(db.mAccounts.begin(), db.mAccounts.end());
    // Now delete all the accounts in the old one.
    db.mAccounts.clear();
}

Code snippet from BankAccountBankDB.cpp

Note that this code uses a couple of C++11 features. For example, take the following line from the addAccount() method:

auto res = mAccounts.insert(make_pair(acct.getAcctNum(), acct));

If your compiler does not support the C++11 auto keyword, this should be written as follows:

image
pair<map<int, BankAccount>::iterator, bool> res;
res = mAccounts.insert(make_pair(acct.getAcctNum(), acct));

Code snippet from BankAccountBankDB.cpp

You can test the BankDB class with the following code:

image
BankDB db;
db.addAccount(BankAccount(100, "Nicholas Solter"));
db.addAccount(BankAccount(200, "Scott Kleper"));
try {
    auto acct = db.findAccount(100);
    cout << "Found account 100" << endl;
    acct.setClientName("Nicholas A Solter");
    auto acct2 = db.findAccount("Scott Kleper");
    cout << "Found account of Scott Kelper" << endl;
    auto acct3 = db.findAccount(1000);
} catch (const out_of_range&) {
    cout << "Unable to find account" << endl;
}

Code snippet from BankAccountBankDBTest.cpp

The output should be as follows:

Found account 100
Found account of Scott Kelper
Unable to find account

multimap

The multimap is a map that allows multiple elements with the same key. Like maps, multimaps support the C++11 uniform initialization. The interface is almost identical to the map interface, with the following changes:

  • multimaps do not provide operator[]. The semantics of this operator does not make sense if there can be multiple elements with a single key.
  • Inserts on multimaps always succeed. Thus, the multimap insert() method that adds a single element returns only an iterator.
pen.gif

multimaps allow you to insert identical key/value pairs. If you want to avoid this redundancy, you must check explicitly before inserting a new element.

The trickiest aspect of multimaps is looking up elements. You can’t use operator[], because it is not provided. find() isn’t very useful because it returns an iterator referring to any one of the elements with a given key (not necessarily the first element with that key).

However, multimaps store all elements with the same key together and provide methods to obtain iterators for this subrange of elements with the same key in the container. The lower_bound() and upper_bound() methods each return a single iterator referring to the first and one-past-the-last elements matching a given key. If there are no elements matching that key, the iterators returned by lower_bound() and upper_bound() will be equal to each other.

In case you don’t want to call two separate methods to obtain the iterators bounding the elements with a given key, multimaps also provide equal_range(), which returns a pair of the two iterators that would be returned by lower_bound() and upper_bound().

The example in the next section illustrates the use of these methods.

pen.gif

The lower_bound(), upper_bound(), and equal_range() methods exist for maps as well, but their usefulness is limited because a map cannot have multiple elements with the same key.

multimap Example: Buddy Lists

Most of the numerous online chat programs allow users to have a “buddy list” or list of friends. The chat program confers special privileges on users in the buddy list, such as allowing them to send unsolicited messages to the user.

One way to implement the buddy lists for an online chat program is to store the information in a multimap. One multimap could store the buddy lists for every user. Each entry in the container stores one buddy for a user. The key is the user and the value is the buddy. For example, if Harry Potter and Ron Weasley had each other on their individual buddy lists, there would be two entries of the form “Harry Potter” maps to “Ron Weasley” and “Ron Weasley” maps to “Harry Potter.” The multimap allows multiple values for the same key, so the same user is allowed multiple buddies. Here is the BuddyList class definition:

image
using std::multimap;
using std::string;
using std::list;
class BuddyList
{
    public:
        BuddyList();
        // Adds buddy as a friend of name
        void addBuddy(const string& name, const string& buddy);
        // Removes buddy as a friend of name
        void removeBuddy(const string& name, const string& buddy);
        // Returns true if buddy is a friend of name, false otherwise.
        bool isBuddy(const string& name, const string& buddy) const;
        // Retrieves a list of all the friends of name
        list<string> getBuddies(const string& name) const;
    protected:
        multimap<string, string> mBuddies;
};

Code snippet from BuddyListBuddyList.h

Here is the implementation, with comments explaining the code. It demonstrates the use of lower_bound(), upper_bound(), and equal_range():

image
void BuddyList::addBuddy(const string& name, const string& buddy)
{
    // Make sure this buddy isn't already there. We don't want
    // to insert an identical copy of the key/value pair.
    if (!isBuddy(name, buddy)) {
        mBuddies.insert({name, buddy});// Using C++11 initializer_list
    }
}
void BuddyList::removeBuddy(const string& name, const string& buddy)
{
    // Obtain the beginning and end of the range of elements with
    // key 'name'.
    auto start = mBuddies.lower_bound(name);
    auto end = mBuddies.upper_bound(name); 
    // Iterate through the elements with key 'name' looking
    // for a value 'buddy'
    for (; start != end; ++start) {
        if (start->second == buddy) {
            // We found a match! Remove it from the map.
            mBuddies.erase(start);
            break;
        }
    }
}
bool BuddyList::isBuddy(const string& name, const string& buddy) const
{
    // Obtain the beginning and end of the range of elements with
    // key 'name'.
    auto start = mBuddies.lower_bound(name);
    auto end = mBuddies.upper_bound(name);
    // Iterate through the elements with key 'name' looking
    // for a value 'buddy'. If there are no elements with key 'name',
    // start equals end, so the loop body doesn't execute.
    for (; start != end; ++start) {
        if (start->second == buddy) {
            // We found a match!
            return true;
        }
    }
    // No matches
    return false;
}
list<string> BuddyList::getBuddies(const string& name) const
{
    // Obtain the pair of iterators marking the range containing
    // elements with key 'name'.
    auto its = mBuddies.equal_range(name);
    // Create a list with all names in the range (all buddies of name).
    list<string> buddies;
    for (; its.first != its.second; ++its.first) {
        buddies.push_back(its.first->second);
    }
    return buddies;
}

Code snippet from BuddyListBuddyList.cpp

Note that removeBuddy() can’t simply use the version of erase() that erases all elements with a given key, because it should erase only one element with the key, not all of them. Note also that getBuddies() can’t use insert() on the list to insert the elements in the range returned by equal_range(), because the elements referred to by the multimap iterators are key/value pairs, not strings. The getBuddies() method must iterate explicitly through the list extracting the string from each key/value pair and pushing it onto the new list to be returned.

Here is a simple test of the BuddyList:

image
BuddyList buddies;
buddies.addBuddy("Harry Potter", "Ron Weasley");
buddies.addBuddy("Harry Potter", "Hermione Granger");
buddies.addBuddy("Harry Potter", "Hagrid");
buddies.addBuddy("Harry Potter", "Draco Malfoy");
// That's not right! Remove Draco.
buddies.removeBuddy("Harry Potter", "Draco Malfoy");
 
buddies.addBuddy("Hagrid", "Harry Potter");
buddies.addBuddy("Hagrid", "Ron Weasley");
buddies.addBuddy("Hagrid", "Hermione Granger");
 
auto harryBuds = buddies.getBuddies("Harry Potter");
cout << "Harry's friends: " << endl;
for (auto& name : harryBuds) {
    cout << "	" << name << endl;
}

Code snippet from BuddyListBuddyListTest.cpp

The output should be as follows:

Harry's friends:
        Ron Weasley
        Hermione Granger
        Hagrid

set

The set container, defined in the <set> header file, is very similar to the map. The difference is that instead of storing key/value pairs, in sets the value itself is the key. The set containers are useful for storing information in which there is no explicit key, but which you want to have in sorted order with quick insertion, lookup, and deletion.

The interface supplied by set is almost identical to that of the map. The main difference is that the set doesn’t provide operator[].

Note that a set also supports the C++11 uniform initialization.

Although the standard doesn’t state it explicitly, most implementations make the iterator of a set identical to const_iterator, such that you can’t modify the elements of the set through the iterator. Even if your version of the STL permits you to modify set elements through an iterator, you should avoid doing so because modifying elements of the set while they are in the container would destroy the order.

set Example: Access Control List

One way to implement basic security on a computer system is through access control lists. Each entity on the system, such as a file or a device, has a list of users with permissions to access that entity. Users can generally be added to and removed from the permissions list for an entity only by users with special privileges. Internally, the set container provides a nice way to represent the access control list. You could use one set for each entity, containing all the usernames who are allowed to access the entity. Here is a class definition for a simple access control list:

image
using std::set;
using std::string;
using std::list;
using std::initializer_list;
class AccessList
{
    public:
        // Default constructor
        AccessList() {}
        // Constructor to support C++11 uniform initialization.
        AccessList(const initializer_list<string>& initlst);
        // Adds the user to the permissions list.
        void addUser(const string& user);
        // Removes the user from the permissions list.
        void removeUser(const string& user);
        // Returns true if the user is in the permissionns list.
        bool isAllowed(const string& user) const;
        // Returns a list of all the users who have permissions.
        list<string> getAllUsers() const;
    protected:
        set<string> mAllowed;
};

Code snippet from AccessControlListAccessList.h

Here are the method definitions:

image
AccessList::AccessList(const initializer_list<string>& initlst)
{
    for (auto& user : initlst) {
        addUser(user);
    }
}
void AccessList::addUser(const string& user)
{
    mAllowed.insert(user);
}
void AccessList::removeUser(const string& user)
{
    mAllowed.erase(user);
}
bool AccessList::isAllowed(const string& user) const
{
    return (mAllowed.count(user) == 1);
}
list<string> AccessList::getAllUsers() const
{
    list<string> users;
    users.insert(users.end(), mAllowed.begin(), mAllowed.end());
    return users;
}

Code snippet from AccessControlListAccessList.cpp

Finally, here is a simple test program:

image
AccessList fileX = {"nsolter", "klep", "baduser"};
fileX.removeUser("baduser");
if (fileX.isAllowed("nsolter")) {
    cout << "nsolter has permissions" << endl;
}
if (fileX.isAllowed("baduser")) {
    cout << "baduser has permissions" << endl;
}
auto users = fileX.getAllUsers();
for (auto& user : users) {
    cout << user << "  ";
}

Code snippet from AccessControlListAccessListTest.cpp

The output of this program is as follows:

nsolter has permissions
klep  nsolter

The preceding example uses a few C++11 features. One of the constructors for the AccessList class uses an initializer_list as parameter so that you can use the C++11 uniform initialization syntax, as demonstrated in the test program for initializing the variable fileX. The example also uses the C++11 range-based for loop and the auto keyword.

multiset

The multiset is to the set what the multimap is to the map. The multiset supports all the operations of the set, but it allows multiple elements that are equal to each other to be stored in the container simultaneously. We don’t show an example of the multiset because it’s so similar to set and multimap.

image UNORDERED ASSOCIATIVE CONTAINERS/ HASH TABLES

C++11 adds new kind of containers to the STL called unordered associative containers or hash tables. There are four of them: unordered_map, unordered_multimap, unordered_set, and unordered_multiset. The map, multimap, set, and multiset containers discussed earlier sort their elements while these new unordered variants do not sort their elements. If you don’t need ordering, use the new unordered associative containers because on average they have faster insertion, deletion, and lookup.

As mentioned in the container overview in Chapter 11, better names would have been hash_map, hash_set, and so on. Unfortunately, hash tables were not part of the C++ standard library before C++11, which means a lot of third-party libraries implemented hash tables themselves using names with a prefix hash like hash_map. Because of this, the C++ standard committee decided to use the prefix unordered instead of hash to avoid name clashes.

Hash Functions

The new containers are also called hash tables. That is because the implementation of these new containers makes use of so called hash functions. The implementation will usually consist of some kind of array where each element in the array is called a bucket. Each bucket has a specific numerical index like 0, 1, 2 up until the last bucket. A hash function transforms a key into a bucket index. The value associated with that key is then stored in that bucket. The result of a hash function is not always unique. The situation in which two or more keys hash to the same bucket index is called a collision. There are many approaches to handling collisions, for example quadratic re-hashing, linear chaining, etc. Those who are interested may consult one of the references in the Algorithms and Data Structures section in Appendix B. The STL standard does not specify which collision handling algorithm is required, but most current implementations have chosen to resolve collisions by linear chaining. With linear chaining, buckets do not directly contain the data values associated with the keys, but contain a pointer to a linked list. This linked list contains all the data values for that specific bucket. Figure 12-1 shows how this works.

In Figure 12-1, applying the hash function to the keys “Marc G.” and “Zulija N.” resulted in the same bucket index 128. This bucket then points to a linked list containing the keys “Marc G.” and “Zulija N.” together with their associated data values. From Figure 12-1 it is also clear how lookups based on keys work and what the complexity is. A lookup involves a single hash function call to calculate the bucket index and after that one or more equality operations to find the right key in the linked list. This shows that lookups can be much faster compared to lookups with normal maps, but it all depends on how many collisions there are.

The choice of the hash function is very important. A hash function that creates no collisions is known as a “perfect hash.” A perfect hash has a lookup time which is constant; a regular hash has a lookup time which is, on average, close to 1, independent of the number of elements. As the number of collisions increases, the lookup time increases, reducing performance. Collisions can be reduced by increasing the basic hash table size, but you need to take cache sizes into account.

Generally, the default hash function is suitable for most purposes. Creating a perfect hash is a nontrivial exercise, even when the set of keys is fixed and known. Even modifying a regular hash is not an exercise for the unwary!

unordered_map

The unordered_map is defined in the <unordered_map> header file as a class template as follows:

template <class Key,
          class T,
          class Hash = hash<Key>,
          class Pred = std::equal_to<Key>,
          class Alloc = std::allocator<std::pair<const Key, T> > >
    class unordered_map;

There are five template parameters: the key type, the value type, the hash type, the equal comparison type, and the allocator type. With the last three parameters you can define your own hash function, equal comparison function, and allocator function, respectively. These parameters can usually be ignored since they have default values. We recommend you keep those default values. For example, it’s not trivial to implement your own good hash function. The most important parameters are the first two. As with maps, C++11 uniform initialization can be used to initialize the unordered_map with elements as shown in the following example:

image
unordered_map<int, string> m = {
    {1, "Item 1"},
    {2, "Item 2"},
    {3, "Item 3"},
    {4, "Item 4"}
};
for (auto& p : m)
    cout << p.first << " = " << p.second << endl;

Code snippet from unordered_mapunordered_map.cpp

The following table summarizes the differences between a map and an unordered_map.

OPERATION map unordered_map
at() x x
begin() x x
bucket() x
bucket_count() x
bucket_size() x
cbegin() x x
cend() x x
clear() x x
count() x x
crbegin() x
crend() x
emplace() x x
emplace_hint() x x
empty() x x
end() x x
equal_range() x x
erase() x x
find() x x
insert() x x
iterator / const_iterator x x
load_factor() x
local_iterator / const_local_iterator x
lower_bound() x
max_bucket_count() x
max_load_factor() x
operator[] x x
rbegin() x
rehash() x
rend() x
reserve() x
reverse_iterator / const_reverse_iterator x
size() x x
swap() x x
upper_bound() x

As with a normal map, all keys in an unordered_map should be unique. The preceding table includes a number of hash specific methods. For example, load_factor() returns the average number of elements per bucket to give you an indication on the number of collisions. The bucket_count() method returns the number of buckets in the container. It also provides a local_iterator and const_local_iterator allowing you to iterate over the elements in a single bucket; but, these may not be used to iterate across buckets. The bucket(key) method returns the index of the bucket that contains the given key; begin(n) returns a local_iterator referring to the first element in the bucket with index n, and end(n) returns a local_iterator referring to one-past-the-last element in the bucket with index n. The example in the next section will make things more clear on how to use these methods.

unordered_map Example: Phone Book

The following example uses an unordered_map to represent a phone book. The name of a person will be the key while the phone number will be the value associated with that key.

image
template<class T>
void printMap(const T& m)
{
    for (auto& p : m) {
        cout << p.first << "(Phone: " << p.second << ")" << endl;
    }
    cout << "-------" << endl;
}
int main()
{
    // Create a hash table.
    unordered_map<string, string> um;
    um.insert({
        {"Marc G.", "123-456789"},
        {"Zulija N.", "987-654321"},
        {"Scott K.", "654-987321"} });
    printMap(um);
    // Add/remove some phone numbers.
    um.insert(make_pair("John D.", "321-987654"));
    um["Johan G."] = "963-258147";
    um["Freddy K."] = "999-256256";
    um.erase("Freddy K.");
    printMap(um);
    // Find the bucket index for a specific key.
    int bucket = um.bucket("Marc G.");
    cout << "Marc G. is in bucket " << bucket
         << " which contains the following "
         << um.bucket_size(bucket) << " elements:" << endl;
    // Get begin and end iterators for the elements in this bucket.
    // 'auto' is being used here. The compiler will derive the type
    // of both iterators as
    // unordered_map<string, string>::const_local_iterator
    auto liter = um.cbegin(bucket);
    auto literEnd = um.cend(bucket);
    while (liter != literEnd) {
        cout << "	" << liter->first << "(Phone: "
             << liter->second << ")" << endl;
        ++liter;
    }
    cout << "-------" << endl;
    // Print some statistics about the hash table
    cout << "There are " << um.bucket_count() << " buckets." << endl;
    cout << "Average number of elements in a bucket is "
         << um.load_factor() << endl;
    return 0;
}

Code snippet from PhoneBookPhoneBook.cpp

This example uses a lot of C++11 features: unordered_map, range-based for loop, auto and initializer_lists. The output of this code is as follows:

Zulija N.(Phone: 987-654321)
Marc G.(Phone: 123-456789)
Scott K.(Phone: 654-987321)
-------
John D.(Phone: 321-987654)
Zulija N.(Phone: 987-654321)
Marc G.(Phone: 123-456789)
Johan G.(Phone: 963-258147)
Scott K.(Phone: 654-987321)
-------
Marc G. is in bucket 4 which contains the following 1 elements:
    Marc G.(Phone: 123-456789)
 -------
There are 11 buckets.
Average number of elements in a bucket is 0.454545

unordered_multimap

The unordered_multimap is an unordered_map that allows multiple elements with the same key. Their interfaces are almost identical, with the following changes:

  • unordered_multimaps do not provide operator[]. The semantics of this operator does not make sense if there can be multiple elements with a single key.
  • Inserts on unordered_multimaps always succeed. Thus, the unordered_multimap insert() method that adds a single element returns only an iterator.
pen.gif

unordered_multimaps allow you to insert identical key/value pairs. If you want to avoid this redundancy, you must check explicitly before inserting a new element.

As discussed earlier with multimaps, looking up elements in unordered_multimaps cannot be done using operator[] because it is not provided. You can use find() but it will return an iterator referring to any one of the elements with a given key (not necessarily the first element with that key). Instead, it’s best to use the equal_range() method, which will return a pair of iterators: one referring to the first element matching a given key and one referring to one-past-the-last element matching a given key. The use of equal_range() is exactly the same as discussed in the context of the multimap, so you can refer to the example given for multimaps to see how it works.

unordered_set/unordered_multiset

The <unordered_set> header file defines the unordered_set and unordered_multiset which are very similar to the set and multiset respectively; except that they do not sort their keys and that they use a hash function. The differences between the unordered_set and the unordered_map are similar to the differences between the set and the map as discussed earlier in this chapter, so they are not discussed in details here. The Standard Library Reference resource on the website contains a thorough summary of the unordered_set and unordered_multiset operations.

OTHER CONTAINERS

There are several other parts of the C++ language that work with the STL to varying degrees, including standard C-style arrays, strings, streams, and the bitset.

Standard C-Style Arrays

Recall that “dumb” pointers are bona fide iterators because they support the required operators. This point is more than just a piece of trivia. It means that you can treat standard C-style arrays as STL containers by using pointers to their elements as iterators. Standard C-style arrays, of course, don’t provide methods like size(), empty(), insert(), and erase(), so they aren’t true STL containers. Nevertheless, because they do support iterators through pointers, you can use them in the algorithms described in Chapter 13 and in some of the methods described in this chapter.

For example, you could copy all the elements of a standard C-style array into a vector using the insert() method of a vector that takes an iterator range from any container. This insert() method prototype looks like this:

template <class InputIterator> iterator insert(const_iterator position,
    InputIterator first, InputIterator last);

If you want to use a standard C-style int array as the source, then the templatized type of InputIterator becomes int*. Here is the full example:

image
int arr[10];     // standard C-style array
vector<int> vec; // STL vector
// Initialize each element of the array to the value of its index.
for (int i = 0; i < 10; i++) {
    arr[i] = i;
}
// Insert the contents of the array into the end of the vector.
vec.insert(vec.end(), arr, arr + 10);
// Print the contents of the vector.
for (auto& i : vec) {
    cout << i << " ";
}

Code snippet from ArrayIteratorsArrayIterators.cpp

Note that the iterator referring to the first element of the array is the address of the first element, which is arr in this case. The name of an array alone is interpreted as the address of the first element. The iterator referring to the end must be one-past-the-last element, so it’s the address of the first element plus 10, or arr+10.

strings

You can think of a string as a sequential container of characters. Thus, it shouldn’t be surprising to learn that the C++ string is a full-fledged sequential container. It contains begin() and end() methods that return iterators into the string, insert(), push_back() and erase() methods, size(), empty(), and all the rest of the sequential container basics. It resembles a vector quite closely, even providing methods reserve() and capacity(). However, unlike vectors, strings are not required to store their elements contiguously in memory.

pen.gif

The C++ string is actually a typedef of a char instantiation of the basic_string template class. However, we refer to string for simplicity. The discussion here applies equally to wstring and other instantiations of the basic_string template.

You can use string as an STL container just as you would use vector. Here is an example:

image
string str1;
str1.insert(str1.end(), 'h'),
str1.insert(str1.end(), 'e'),
str1.insert(str1.end(), 'l'),
str1.insert(str1.end(), 'l'),
str1.insert(str1.end(), 'o'),
for (auto it = str1.cbegin(); it != str1.cend(); ++it) {
    cout << *it;
}
cout << endl;
for (auto& letter : str1) {
    cout << letter;
}
cout << endl;

Code snippet from StringContainersStringExample.cpp

In addition to the STL sequential container methods, strings provide a whole host of useful methods and friend functions. The string interface is actually quite a good example of a cluttered interface, one of the design pitfalls discussed in Chapter 4. The string class is discussed in much more detail in Chapter 14, but this section showed you how strings can be used as STL containers.

Streams

Input and output streams are not containers in the traditional sense: They do not store elements. However, they can be considered sequences of elements, and as such share some characteristics with the STL containers. C++ streams do not provide any STL-related methods directly, but the STL supplies special iterators called istream_iterator and ostream_iterator that allow you to “iterate” through input and output streams. Chapter 17 explains how to use them.

bitset

The bitset is a fixed-length abstraction of a sequence of bits. A bit can represent only two values, 1 and 0, which can be referred to as on/off, true/false, etc. The bitset also uses the terminology set and unset. You can toggle or flip a bit from one value to the other.

The bitset is not a true STL container: It’s of fixed size, it’s not templatized on an element type, and it doesn’t support iteration. However, it’s a useful utility, which is often lumped with the containers, so we provide a brief introduction here. The Standard Library Reference resource on the website contains a thorough summary of the bitset operations.

bitset Basics

The bitset, defined in the <bitset> header file, is templatized on the number of bits it stores. The default constructor initializes all fields of the bitset to 0. An alternative constructor creates the bitset from a string of 0 and 1 characters .

You can adjust the values of the individual bits with the set(), reset(), and flip() methods, and you can access and set individual fields with an overloaded operator[]. Note that operator[] on a non-const object returns a proxy object to which you can assign a Boolean value, call flip(), or complement with ~. You can also access individual fields with the test() method.

Additionally, you can stream bitsets with the normal insertion and extraction operators. The bitset is streamed as a string of 0 and 1 characters.

Here is a small example:

image
bitset<10> myBitset;
myBitset.set(3);
myBitset.set(6);
myBitset[8] = true;
myBitset[9] = myBitset[3];
if (myBitset.test(3)) {
    cout << "Bit 3 is set!"<< endl;
}
cout << myBitset << endl;

Code snippet from BitsetBasicsBitsetBasics.cpp

The output is:

Bit 3 is set!
1101001000

Note that the leftmost character in the output string is the highest numbered bit. This corresponds to our intuitions about binary number representations, where the low-order bit representing 20 = 1 is the rightmost bit in the printed representation.

Bitwise Operators

In addition to the basic bit manipulation routines, the bitset provides implementations of all the bitwise operators: &, |, ^, ~, <<, >>, &=, |=, ^=, <<=, and >>=. They behave just as they would on a “real” sequence of bits. Here is an example:

image
auto str1 = "0011001100";
auto str2 = "0000111100";
bitset<10> bitsOne(str1);
bitset<10> bitsTwo(str2);
auto bitsThree = bitsOne & bitsTwo;
cout << bitsThree << endl;
bitsThree <<= 4;
cout << bitsThree << endl;

Code snippet from BitsetBasicsBitwiseOperators.cpp

The output of the program is:

0000001100
0011000000

bitset Example: Representing Cable Channels

One possible use of bitsets is tracking channels of cable subscribers. Each subscriber could have a bitset of channels associated with his or her subscription, with set bits representing the channels to which he or she actually subscribes. This system could also support “packages” of channels, also represented as bitsets, which represent commonly subscribed combinations of channels.

The following CableCompany class is a simple example of this model. It uses two maps, each of string/bitset, storing the cable packages as well as the subscriber information.

image
using std::map;
using std::bitset;
using std::string;
using std::out_of_range;
const int kNumChannels = 10;
class CableCompany
{
    public:
        CableCompany() {}
        // Adds the package with the specified channels to the database
        void addPackage(const string& packageName,
            const bitset<kNumChannels>& channels);
        // Removes the specified package from the database
        void removePackage(const string& packageName);
        // Adds customer to database with initial channels found in package
        // Throws out_of_range if the package name is invalid.
        void newCustomer(const string& name, const string& package)
            throw(out_of_range);
        // Adds customer to database with initial channels specified
        // in channels
        void newCustomer(const string& name,
            const bitset<kNumChannels>& channels);
        // Adds the channel to the customers profile
        void addChannel(const string& name, int channel);
        // Removes the channel from the customers profile
        void removeChannel(const string& name, int channel);
        // Adds the specified package to the customers profile
        void addPackageToCustomer(const string& name,
            const string& package);
        // Removes the specified customer from the database
        void deleteCustomer(const string& name);
        // Retrieves the channels to which this customer subscribes
        // Throws out_of_range if name is not a valid customer
        bitset<kNumChannels>& getCustomerChannels(const string& name)
            throw(out_of_range);
    protected:
        typedef map<string, bitset<kNumChannels> > MapType;
        MapType mPackages, mCustomers;
};

Code snippet from CableCompanyCableCompany.h

Here are the implementations of the preceding methods, with comments explaining the code:

image
void CableCompany::addPackage(const string& packageName,
    const bitset<kNumChannels>& channels)
{
    // Just make a key/value pair and insert it into the packages map.
    mPackages.insert({packageName, channels});
}
void CableCompany::removePackage(const string& packageName)
{
    // Just erase the package from the package map.
    mPackages.erase(packageName);
}
void CableCompany::newCustomer(const string& name, const string& package)
    throw(out_of_range)
{
    // Get a reference to the specified package.
    auto it = mPackages.find(package);
    if (it == mPackages.end()) {
        // That package doesn't exist. Throw an exception.
        throw out_of_range("Invalid package");
    } else {
        // Create the account with the bitset representing that package.
        // Note that 'it' refers to a name/bitset pair. The bitset is the
        // second field.
        mCustomers.insert({name, it->second});
    }
}
void CableCompany::newCustomer(const string& name,
    const bitset<kNumChannels>& channels)
{
    // Just add the customer/channels pair to the customers map.
    mCustomers.insert({name, channels});
}
void CableCompany::addChannel(const string& name, int channel)
{
    // Find a reference to the customer.
    auto it = mCustomers.find(name);
    if (it != mCustomers.end()) {
        // We found this customer; set the channel.
        // Note that 'it' is a reference to a name/bitset pair.
        // The bitset is the second field.
        it->second.set(channel);
    }
}
void CableCompany::removeChannel(const string& name, int channel)
{
    // Find a reference to the customer.
    auto it = mCustomers.find(name);
    if (it != mCustomers.end()) {
        // We found this customer; remove the channel.
        // Note that 'it' is a reference to a name/bitset pair.
        // The bitset is the second field.
        it->second.reset(channel);
    }
}
void CableCompany::addPackageToCustomer(const string& name,
    const string& package)
{
    // Find the package.
    auto itPack = mPackages.find(package);
    // Find the customer.
    auto itCust = mCustomers.find(name);
    if (itCust != mCustomers.end() && itPack != mPackages.end()) {
        // Only if both package and customer found, can we do the update.
        // Or-in the package to the customers existing channels.
        // Note that the iterators are references to name/bitset pairs.
        // The bitset is the second field.
        itCust->second |= itPack->second;
    }
}
void CableCompany::deleteCustomer(const string& name)
{
    // Remove the customer with this name.
    mCustomers.erase(name);
}
bitset<kNumChannels>& CableCompany::getCustomerChannels(const string& name)
    throw(out_of_range)
{
    // Find the customer.
    auto it = mCustomers.find(name);
    if (it != mCustomers.end()) {
        // Found it!
        // Note that 'it' is a reference to a name/bitset pair.
        // The bitset is the second field.
        return it->second;
    }
    // Didn't find it. Throw an exception.
    throw out_of_range("No customer with that name");
}

Code snippet from CableCompanyCableCompany.cpp

Finally, here is a simple program demonstrating how to use the CableCompany class:

image
CableCompany myCC;
auto basic_pkg = "1111000000";
auto premium_pkg = "1111111111";
auto sports_pkg = "0000100111";
myCC.addPackage("basic", bitset<kNumChannels>(basic_pkg));
myCC.addPackage("premium", bitset<kNumChannels>(premium_pkg));
myCC.addPackage("sports", bitset<kNumChannels>(sports_pkg));
myCC.newCustomer("Nicholas Solter", "basic");
myCC.addPackageToCustomer("Nicholas Solter", "sports");
cout << myCC.getCustomerChannels("Nicholas Solter") << endl;

Code snippet from CableCompanyCableCompanyTest.cpp

Note that this example uses a lot of C++11 features. For example, the addPackage() method has the following line:

mPackages.insert({packageName, channels});

This line uses C++11 uniform initialization. If your compiler does not support this you need to write it as follows:

mPackages.insert(make_pair(packageName, channels));

SUMMARY

This chapter introduced the standard template library containers. It also presented sample code illustrating a variety of uses for these containers. Hopefully, you appreciate the power of the vector, deque, list, array, forward_list, stack, queue, priority_queue, map, multimap, set, multiset, unordered_map, unordered_multimap, unordered_set, unordered_multiset, string, and bitset. Even if you don’t incorporate them into your programs immediately, at least keep them in the back of your mind for future projects.

Now that you are familiar with the containers, the next chapter can illustrate the true power of the STL by discussing the generic algorithms.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset