Chapter 17

Customizing and Extending the STL

WHAT’S IN THIS CHAPTER?

  • What allocators are
  • What iterator adapters are
  • How to extend the STL
    • How to write your own algorithms
    • How to write your own containers
    • How to write your own iterators

The previous chapters show that the STL is a powerful general-purpose collection of containers and algorithms. The information covered so far should be sufficient for most applications. However, those chapters show only the basic functionality of the library. The STL can be customized and extended however you like. For example, you can apply iterators to input and output streams; write your own containers, algorithms, and iterators; and even specify your own memory allocation schemes for containers to use. This chapter provides a taste of these advanced features, primarily through the development of a new STL container: the hashmap.

cross.gif

This chapter is not for the faint of heart! The contents delve into some of the most complicated and syntactically confusing areas of the C++ language. If you’re happy with the basic STL containers and algorithms from the previous chapters, you can skip this one. However, if you really want to understand the STL, not just use it, give this chapter a chance. You should be comfortable with the operator overloading material of Chapter 18, and because this chapter uses templates extensively, you should also be comfortable with the template material in Chapter 19 before continuing!

ALLOCATORS

Every STL container takes an Allocator type as a template parameter, for which the default will usually suffice. For example, the vector template definition looks like this:

template <class T, class Allocator = allocator<T> > class vector;

The container constructors then allow you to specify an object of type Allocator. These extra parameters permit you to customize the way the containers allocate memory. Every memory allocation performed by a container is made with a call to the allocate() method of the Allocator object. Conversely, every deallocation is performed with a call to the deallocate() method of the Allocator object. The standard library provides a default Allocator class called allocator, which implements these methods as wrappers for operator new and operator delete.

If you want containers in your program to use a custom memory allocation and deallocation scheme, you can write your own Allocator class. There are several reasons for using custom allocators. For example, if the underlying allocator has unacceptable performance, there are alternatives that can be constructed. Or, if memory fragmentation is a problem (lots of different allocations and deallocations leaving unusable small holes in memory), a single “pool” of objects of one type can be created, called a memory pool. When OS specific capabilities, such as shared memory segments, must be allocated, using custom allocators allows the use of STL containers in those shared memory segments. The use of custom allocators is complex, and there are many possible problems if you are not careful, so this should not be approached lightly.

Any class that provides allocate(), deallocate(), and several other required methods and typedefs can be used in place of the default allocator class. However, in our experience, this feature is rarely used, so we have omitted the details from this book. For more details, consult one of the books on the C++ Standard Library listed in Appendix B.

ITERATOR ADAPTERS

The Standard Library provides four iterator adapters: special iterators that are built on top of other iterators. You’ll learn more about the adapter design pattern in Chapter 29. For now, just appreciate what these iterators can do for you. All four iterator adapters are declared in the <iterator> header.

pen.gif

You can also write your own iterator adapters. Consult one of the books on the Standard Library listed in Appendix B for details.

Reverse Iterators

The STL provides a reverse_iterator class that iterates through a bidirectional or random access iterator in a reverse direction. Every reversible container in the STL, which happens to be every container that’s part of the standard, supplies a typedef reverse_iterator and methods called rbegin() and rend(). The method rbegin() returns a reverse_iterator starting at the last element of the container, and rend() returns a reverse_iterator starting at the first element of the container. Applying operator++ to a reverse_iterator calls operator--on the underlying container iterator, and vice versa. For example, iterating over a collection from the beginning to the end can be done as follows:

for (auto iter = collection.begin(); iter != collection.end(); ++iter) {}

Iterating over the elements in the collection from the end to the beginning can be done using a reverse_iterator by calling rbegin() and rend(). Note that you still call ++iter:

for (auto iter = collection.rbegin(); iter != collection.rend(); ++iter) {}

The reverse_iterator is useful mostly with algorithms in the STL that have no equivalents that work in reverse order. For example, the basic find() algorithm searches for the first element in a sequence. If you want to find the last element in the sequence, you can use a reverse_iterator instead. Note that when you call an algorithm such as find() with a reverse_iterator, it returns a reverse_iterator as well. You can always obtain the underlying iterator from a reverse_iterator by calling the base() method on the reverse_iterator. However, due to the implementation details of reverse_iterator, the iterator returned from base() always refers to one element past the element referred to by the reverse_iterator on which it’s called.

Here is an example of find() with a reverse_iterator:

image
// The implementation of populateContainer() is identical to that shown in
// Chapter 13, so it is omitted here.
vector<int> myVector;
populateContainer(myVector);
int num;
cout << "Enter a number to find: ";
cin >> num;
vector<int>::iterator it1;
vector<int>::reverse_iterator it2;
it1 = find(myVector.begin(), myVector.end(), num);
it2 = find(myVector.rbegin(), myVector.rend(), num);
if (it1 != myVector.end()) {
    cout << "Found " << num << " at position " << it1 - myVector.begin()
         << " going forward." << endl;
    cout << "Found " << num << " at position "
         << it2.base() - 1 - myVector.begin() << " going backward." << endl;
} else {
    cout << "Failed to find " << num << endl;
}

Code snippet from IteratorAdaptersReverseIterators.cpp

One line in this program needs further explanation. The code to print out the position found by the reverse iterator looks like this:

    cout << "Found " << num << " at position "
         << it2.base() - 1 - myVector.begin() << " going backward." << endl;

As noted earlier, base() returns an iterator referring to one past the element referred to by the reverse_iterator. In order to get to the same element, you must subtract one.

A possible output of this program is as follows:

Enter a number (0 to quit): 11
Enter a number (0 to quit): 22
Enter a number (0 to quit): 33
Enter a number (0 to quit): 22
Enter a number (0 to quit): 11
Enter a number (0 to quit): 0
Enter a number to find: 22
Found 22 at position 1 going forward.
Found 22 at position 3 going backward.

The previous example defined the iterators as follows:

vector<int>::iterator it1;
vector<int>::reverse_iterator it2;
it1 = find(myVector.begin(), myVector.end(), num);
it2 = find(myVector.rbegin(), myVector.rend(), num);

This is done to show you the exact types of it1 and it2. Of course, using the auto keyword, these four lines can be compacted to the following two lines:

auto it1 = find(myVector.begin(), myVector.end(), num);
auto it2 = find(myVector.rbegin(), myVector.rend(), num);

Stream Iterators

The STL provides adapters that allow you to treat input and output streams as input and output iterators. Using these iterators you can adapt input and output streams so that they can serve as sources and destinations, respectively, in the various STL algorithms. The ostream_iterator class is an output stream iterator. It is a template class that takes the element type as a type parameter. Its constructor takes an output stream and a string to write to the stream following each element. The ostream_iterator class writes elements using operator<<.

For example, you can use the ostream_iterator with the copy() algorithm to print the elements of a container with only one line of code. The first parameter of copy() is the start iterator of the range to copy, the second parameter is the end iterator of the range, and the third parameter is the destination iterator:

image
vector<int> myVector;
for (int i = 0; i < 10; i++) {
    myVector.push_back(i);
}
// Print the contents of the vector.
copy(myVector.begin(), myVector.end(), ostream_iterator<int>(cout, " "));

Code snippet from IteratorAdaptersStreamIterators.cpp

Similarly, you can use the input stream iterator, istream_iterator, to read values from an input stream using the iterator abstraction. Elements are read using operator>>. An istream_iterator can be used as sources in the algorithms and container methods. Its usage is less common than that of the ostream_iterator, so we don’t show an example here.

Insert Iterators

As Chapter 13 mentions, algorithms like copy() don’t insert elements into a container; they simply replace old elements in a range with new ones. In order to make algorithms like copy() more useful, the STL provides three insert iterator adapters that actually insert elements into a container. They are templatized on a container type, and take the actual container reference in their constructor. By supplying the necessary iterator interfaces, these adapters can be used as the destination iterators of algorithms like copy(). However, instead of replacing elements in the container, they make calls on their container to actually insert new elements.

The basic insert_iterator calls insert(position, element) on the container, the back_insert_iterator calls push_back(element), and the front_insert_iterator calls push_front(element).

For example, you can use the back_insert_iterator with the remove_copy_if() algorithm to populate vectorTwo with all elements from vectorOne that are not equal to 100:

image
// The implementation of populateContainer() is identical to that shown in
// Chapter 13, so it is omitted here.
vector<int> vectorOne, vectorTwo;
populateContainer(vectorOne);
back_insert_iterator<vector<int>> inserter(vectorTwo);
remove_copy_if(vectorOne.begin(), vectorOne.end(), inserter,
    [](int i){return i==100;});
copy(vectorTwo.begin(), vectorTwo.end(), ostream_iterator<int>(cout, " "));

Code snippet from IteratorAdaptersBackInsertIterator.cpp

As you can see, when you use insert iterators, you don’t need to size the destination containers ahead of time.

You can also use the back_inserter() utility function to create a back_insert_iterator. For example, in the previous example, you can remove the line which defines the inserter variable and rewrite the remove_copy_if() call as follows. The result is exactly the same as the previous implementation:

remove_copy_if(vectorOne.begin(), vectorOne.end(),
    back_inserter(vectorTwo), [](int i){return i==100;});

The insert_iterator and front_insert_iterator work similarly, except that the insert_iterator also takes an initial iterator position in its constructor, which it passes to the first call to insert(position, element). Subsequent iterator position hints are generated based on the return value from each insert() call.

One huge benefit of insert_iterator is that it allows you to use associative containers as destinations of the modifying algorithms. Chapter 13 explains that the problem with associative containers is that you are not allowed to modify the elements over which you iterate. By using an insert_iterator, you can instead insert elements, allowing the container to sort them properly internally. Associative containers actually support a form of insert() that takes an iterator position, and are supposed to use the position as a “hint,” which they can ignore. When you use an insert_iterator on an associative container, you can pass the begin() or end() iterator of the container to use as the hint. Here is the previous example modified so that the destination container is a set instead of a vector:

image
// The implementation of populateContainer() is identical to that shown in
// Chapter 13, so it is omitted here.
vector<int> vectorOne;
set<int> setOne;
populateContainer(vectorOne);
insert_iterator<set<int>> inserter(setOne, setOne.begin());
remove_copy_if(vectorOne.begin(), vectorOne.end(), inserter,
    [](int i){return i==100;});
copy(setOne.begin(), setOne.end(), ostream_iterator<int>(cout, " "));

Code snippet from IteratorAdaptersInsertIterator.cpp

Note that the insert_iterator modifies the iterator position hint that it passes to insert() after each call to insert(), such that the position is one past the just-inserted element.

imageMove Iterators

Chapter 9 discusses a new C++11 feature called move semantics, which can be used to prevent unnecessary copying in cases where you know that the source object will be destroyed after an assignment or copy construction. C++11 also introduces a move_iterator. The dereferencing operator for a move_iterator will automatically convert the value to an rvalue reference, which means that the value can be moved to a new destination without the overhead of copying. Before you can use move semantics, you need to make sure your objects are supporting it. The following MoveableClass supports move semantics. For more details, see Chapter 9.

image
class MoveableClass
{
    public:
        MoveableClass() {
            cout << "Default constructor" << endl;
        }
        MoveableClass(const MoveableClass& src) {
            cout << "Copy constructor" << endl;
        }
        MoveableClass(MoveableClass&& src) {
            cout << "Move constructor" << endl;
        }
        MoveableClass& operator=(const MoveableClass& rhs) {
            cout << "Copy assignment operator" << endl;
            return *this;
        }
        MoveableClass& operator=(MoveableClass&& rhs) {
            cout << "Move assignment operator" << endl;
            return *this;
        }
};

Code snippet from IteratorAdaptersMoveIterators.cpp

The constructors and assignment operators are not doing anything useful here, except printing a message to make it easy to see which one is being called. Now that you have this class, you can define a vector and store a few MoveableClass instances in it as follows:

image
vector<MoveableClass> vecSource;
MoveableClass mc;
vecSource.push_back(mc);
vecSource.push_back(mc);

Code snippet from IteratorAdaptersMoveIterators.cpp

The second line of the code creates a MoveableClass instance by using the default constructor. The first push_back() call triggers the copy constructor to copy mc into the vector. After this operation, the vector will have space for one element, the first copy of mc.

The second push_back() call triggers the vector to resize itself, to allocate space for the second element. This resizing causes the move constructor to be called to move every element from the old vector to the new resized vector. After that, the copy constructor is triggered to copy mc a second time into the vector. The output should be as follows:

Default constructor
Copy constructor
Move constructor
Copy constructor

You can create a new vector called vecOne that contains a copy of the elements from vecSource as follows:

vector<MoveableClass> vecOne(vecSource.begin(), vecSource.end());

Without using move_iterators, this code triggers the copy constructor two times, once for every element in vecSource:

Copy constructor
Copy constructor

By using the make_move_iterator() function to create move_iterators, the move constructor of MoveableClass is called instead of the copy constructor:

vector<MoveableClass> vecTwo(make_move_iterator(vecSource.begin()),
                             make_move_iterator(vecSource.end()));

This generates the following output:

Move constructor
Move constructor
cross.gif

Once objects have been moved, you should not access the original objects anymore. For example, in the previous example, you should not access objects in vecSource after having moved them to vecTwo. See Chapter 9 for details on move semantics.

EXTENDING THE STL

The STL includes many useful containers, algorithms, and iterators that you can use in your applications. It is impossible, however, for any library to include all possible utilities that all potential clients might need. Thus, the best libraries are extensible: They allow clients to adapt and add to the basic capabilities to obtain exactly the functionality they require. The STL is inherently extensible because of its fundamental structure of separating data from the algorithms that operate on them. You can write your own container that can work with the STL algorithms by providing an iterator that conforms to the STL standard. Similarly, you can write a function that works with iterators from the standard containers. This section explains the rules for extending the STL and provides sample implementations of extensions.

Why Extend the STL?

If you sit down to write an algorithm or container in C++, you can either make it adhere to the STL conventions or not. For simple containers and algorithms, it might not be worth the extra effort to follow the STL guidelines. However, for substantial code that you plan to reuse, the effort pays off. First, the code will be easier for other C++ programmers to understand, because you follow well-established interface guidelines. Second, you will be able to use your container or algorithm on the other parts of the STL (algorithms or containers), without needing to provide special hacks or adapters. Finally, it will force you to employ the necessary rigor required to develop solid code.

Writing an STL Algorithm

Chapter 13 describes a useful set of algorithms which are part of the STL, but you will inevitably encounter situations in your programs for which you need new algorithms. When that happens, it is usually not difficult to write your own algorithm that works with STL iterators just like the standard algorithms.

find_all()

Suppose that you want to find all the elements matching a predicate in a given range. The find() and find_if() algorithms are the most likely candidate algorithms, but each returns an iterator referring to only one element. In fact, there is no standard algorithm to find all the elements matching a predicate, but you can write your own version of this functionality called find_all().

The first task is to define the function prototype. You can follow the model established by find_if(). It will be a templatized function on two type parameters: the iterator and the predicate. Its arguments will be start and end iterators and the predicate object. Only its return value differs from find_if(): Instead of returning a single iterator referring to the matching element, find_all() returns a vector of iterators referring to all the matching elements. Here is the prototype:

image
template <typename InputIterator, typename Predicate>
vector<InputIterator>
find_all(InputIterator first, InputIterator last, Predicate pred);

Code snippet from WritingAlgorithmsFindAll.cpp

Another option would be to return an iterator that iterates over all the matching elements in the container, but that would require you to write your own iterator class.

The next task is to write the implementation. The find_all() algorithm can be layered on top of find_if() by calling find_if() repeatedly. The first call to find_if() uses the whole supplied range from first to last. The second call uses a smaller range, from the element found with the previous call to last. The loop continues until find_if() fails to find a match. Here is the implementation:

image
template <typename InputIterator, typename Predicate>
vector<InputIterator>
find_all(InputIterator first, InputIterator last, Predicate pred)
{
    vector<InputIterator> res;
    while (true) {
        // Find the next match in the current range.
        first = find_if(first, last, pred);
        // check if find_if failed to find a match
        if (first == last) {
            break;
        }
        // Store this match.
        res.push_back(first);
        // Shorten the range to start at one past the current match
        ++first;
    }
    return res;
}

Code snippet from WritingAlgorithmsFindAll.cpp

This find_all() implementation creates a local vector to store the result. At the end of the function, this local vector is returned. In C++ prior to C++11 this would require the local vector to be copied. However, since the local vector will go out of scope at the end of the function, C++11 will use move semantics to avoid copying. This makes returning the vector by value in C++11 very efficient.

Here is some code that tests find_all(). The type of the all variable will be vector<vector<int>::iterator>. After finding iterators to all the elements, the test code counts the number of elements found, which is the number of iterators in all. Then, it iterates through the returned iterators, printing each element:

image
vector<int> vec = {3, 4, 5, 4, 5, 6, 5, 8};
auto all = find_all(vec.begin(), vec.end(), [](int i){return i==5;}); 
cout << "Found " << all.size() << " matching elements: ";
for (auto it : all) {
    cout << *it << " ";
}

Code snippet from WritingAlgorithmsFindAll.cpp

The output is as follows:

Found 3 matching elements: 5 5 5

Iterator Traits

Some algorithm implementations need additional information about their iterators. For example, they might need to know the type of the elements referred to by the iterator in order to store temporary values, or perhaps they want to know whether the iterator is bidirectional or random access.

C++ provides a class template called iterator_traits that allows you to find this info. You instantiate the iterator_traits class template with the iterator type of interest, and access one of five typedefs: value_type, difference_type, iterator_category, pointer, and reference. For example, the following template function declares a temporary variable of the type to which an iterator of type IteratorType refers. Note the use of the typename keyword in front of the iterator_traits line. Chapter 12 explains that you must specify typename explicitly whenever you access a type based on one or more template parameters. In this case, the template parameter IteratorType is used to access the value_type type:

image
#include <iterator>
template <typename IteratorType>
void iteratorTraitsTest(IteratorType it)
{
   typename std::iterator_traits<IteratorType>::value_type temp;
   temp = *it;
   cout << temp << endl;
}

Code snippet from WritingAlgorithmsIteratorTraitsTest.cpp

This function can be tested with the following test code:

vector<int> v;
v.push_back(5);
iteratorTraitsTest(v.begin());

With this test code, the variable temp in the iteratorTraitsTest() function will be of type int. The output is as follows:

5

The C++11 auto keyword can be used to simplify the previous example. The variable temp will still be of type int:

image
#include <iterator>
template <typename IteratorType>
void iteratorTraitsTest(IteratorType it)
{
   auto temp = *it;
   cout << temp << endl;
}

Code snippet from WritingAlgorithmsIteratorTraitsTest.cpp

Writing an STL Container

The C++ standard contains a list of requirements that any container must fulfill in order to qualify as an STL container.

Additionally, if you want your container to be sequential (like a vector), associative (like a map), or unordered associative (like an unordered_map) it must conform to supplementary requirements.

Our suggestion when writing a new container is to write the basic container first following the general STL rules such as making it a class template, but without worrying too much about the specific details of STL conformity. After you’ve developed the implementation, you can add the iterator and methods so that it can work with the STL framework. This section takes that approach to develop a hashmap.

A Basic Hashmap

C++11 supports unordered associative containers, also called hash tables. These are discussed in Chapter 12. However, previous versions of C++ did not include hash tables. Unlike the STL map and set, which provide logarithmic insertion, lookup, and deletion times, a hash table provides constant time insertion, deletion, and lookup in the average case, linear in the worst case. Instead of storing elements in sorted order, it hashes, or maps, each element to a particular bucket. As long as the number of elements stored isn’t significantly greater than the number of buckets, and the hash function distributes the elements uniformly between the buckets, the insertion, deletion, and lookup operations all run in constant time.

pen.gif

This section assumes that you are familiar with hashed data structures. If you are not, consult Chapter 12, which includes a discussion on hash tables, or one of the standard data structure texts listed in Appendix B.

This section provides an implementation of a simple, but fully functional, hashmap that you can take with you between platforms, in case your compiler does not yet support the standard C++11 hash tables. Like a map, a hashmap stores key/value pairs. In fact, the operations it provides are almost identical to those provided by the map, but with different performance characteristics.

This hashmap implementation uses chained hashing (also called open hashing) and does not attempt to provide advanced features like rehashing. Chapter 12 explains the concept of chained hashing in the section on C++11 unordered associative containers.

pen.gif

It is recommended to use the C++11 unordered associative containers, also called hash tables, if your compiler supports them, instead of implementing your own. These C++11 hash tables, explained in Chapter 12, are called unordered_map, unordered_multimap, unordered_set and unordered_multiset. The hashmap in this chapter is used to demonstrate writing STL containers and as a solution in case your compiler does not support the C++11 hash tables.

The Hash Function

The first choice when writing a hashmap is how to handle hash functions. Recalling the adage that a good abstraction makes the easy case easy and the hard case possible, a good hashmap interface allows clients to specify their own hash function and number of buckets in order to customize the hashing behavior for their particular workload. On the other hand, clients that do not have the desire, or ability, to write a good hash function and choose a number of buckets should still be able to use the container without doing so. One solution is to allow clients to provide a hash function and number of buckets in the hashmap constructor, but also to provide defaults values. It also makes sense to package the hash function and the number of buckets into a hashing class. Our default hash class definition looks like this:

image
// Any Hash Class must provide two methods: hash() and numBuckets().
template <typename T>
class DefaultHash
{
    public:
        // Throws invalid_argument if numBuckets is illegal
        DefaultHash(size_t numBuckets = 101) throw (invalid_argument);
        size_t hash(const T& key) const;
        size_t numBuckets() const { return mNumBuckets; }
    protected:
        size_t mNumBuckets;
};

Code snippet from HashmapBasicHashmaphashmap.h

Note that the DefaultHash class is templatized on the key type that it hashes, in order to support a templatized hashmap container. The implementation of the constructor is trivial:

image
// Throws invalid_argument if numBuckets is illegal
template <typename T>
DefaultHash<T>::DefaultHash(size_t numBuckets) throw (invalid_argument)
{
    if (numBuckets <= 0) {
        throw invalid_argument("numBuckets must be > 0");
    }
    mNumBuckets = numBuckets;
}

Code snippet from HashmapBasicHashmaphashmap.cpp

The implementation of hash() is trickier, partially because it must apply to keys of any type. It is supposed to map the key to one of the mNumBuckets buckets. The following hash() function first computes an integer-sized hash value, then limits the calculated hash value to the number of hash buckets by taking the modulo of the calculated value:

image
// Uses the division method for hashing.
// Treats the key as a sequence of bytes, sums the ASCII
// values of the bytes, and mods the total by the number
// of buckets.
template <typename T>
size_t DefaultHash<T>::hash(const T& key) const
{
    size_t bytes = sizeof(key);
    size_t res = 0;
    for (size_t i = 0; i < bytes; ++i) {
        unsigned char b = *((unsigned char*)&key + i);
        res += b;
    }
    return (res % mNumBuckets);
}

Code snippet from HashmapBasicHashmaphashmap.cpp

Unfortunately, when using the preceding method on strings, the function will calculate the hash of the entire string object, and not just of the actual text. The result is that two string objects with the same text will generate different hash values, because some fields of the string objects will be different. Therefore, it’s a good idea to provide a specialization of the DefaultHash class for strings. Template specialization is discussed in detail in Chapter 19:

image
// Specialization for strings
template <>
class DefaultHash<string>
{
    public:
        // Throws invalid_argument if numBuckets is illegal
        DefaultHash(size_t numBuckets = 101) throw (invalid_argument);
        size_t hash(const string& key) const;
        size_t numBuckets() const { return mNumBuckets; }
    protected:
        size_t mNumBuckets;
};

Code snippet from HashmapBasicHashmaphashmap.h

image
// Throws invalid_argument if numBuckets is illegal
DefaultHash<string>::DefaultHash(size_t numBuckets) throw (invalid_argument)
{
    if (numBuckets <= 0) {
        throw invalid_argument("numBuckets must be > 0");
    }
    mNumBuckets = numBuckets;
}
// Uses the division method for hashing after summing the
// ASCII values of all the characters in key.
size_t DefaultHash<string>::hash(const string& key) const
{
    size_t sum = 0;
    for (size_t i = 0; i < key.size(); i++) {
        sum += (unsigned char)key[i];
    }
    return (sum % mNumBuckets);
}

Code snippet from HashmapBasicHashmaphashmap.cpp

If the client wants to use other pointer types or objects as the key, she should write her own hash class for those types.

cross.gif

The hash functions shown in this section are examples for the basic hashmap implementation. They do not guarantee uniform hashing for all key universes. If you need more mathematically rigorous hash functions, or if you don’t know what “uniform hashing” is, consult an algorithms reference from Appendix B.

The Hashmap Interface

A hashmap supports three basic operations: insertion, deletion, and lookup. Of course, it provides a constructor, destructor, copy constructor, and copy assignment operator as well. With C++11 it should also support a move constructor and move assignment operator. Here is the public portion of the hashmap class template:

image
template <typename Key, typename T, typename Compare = std::equal_to<Key>,
    typename Hash = DefaultHash<Key>>
class hashmap
{
    public:
        typedef Key key_type;
        typedef T mapped_type;
        typedef pair<const Key, T> value_type;
        // Constructors
        // Throws invalid_argument if the hash object specifies an illegal
        // number of buckets   
        explicit hashmap(const Compare& comp = Compare(),
            const Hash& hash = Hash()) throw(invalid_argument);
        // destructor, copy constructor, move constructor,
        // copy assignment operator and move assignment operator
        ~hashmap();
        hashmap(const hashmap<Key, T, Compare, Hash>& src);
        hashmap(hashmap<Key, T, Compare, Hash>&& src);    // C++11
        hashmap<Key, T, Compare, Hash>& operator=(
            const hashmap<Key, T, Compare, Hash>& rhs);
        hashmap<Key, T, Compare, Hash>& operator=(
            hashmap<Key, T, Compare, Hash>&& rhs);        // C++11
        // Inserts the key/value pair x
        void insert(const value_type& x);
        // Removes the element with key x, if it exists
        void erase(const key_type& x);
        // find returns a pointer to the element with key x.
        // Returns nullptr if no element with that key exists.
        value_type* find(const key_type& x);
        // operator[] finds the element with key x or inserts an
        // element with that key if none exists yet. Returns a reference to
        // the value corresponding to that key.
        T& operator[] (const key_type& x);
    protected:
        // Implementation details not shown yet
};

Code snippet from HashmapBasicHashmaphashmap.h

As you can see, the key and value types are both template arguments like in the STL map. The hashmap stores pair<const Key, T> as the actual elements in the container. The insert(), erase(), find(), and operator[] methods are straightforward. However, a few aspects of this interface require further explanation.

The Template Argument Compare

Like the map, set, and other standard containers, the hashmap allows the client to specify the comparison type as a template parameter and to pass a specific comparison object of that type in the constructor. Unlike the map and set, the hashmap does not sort elements by key, but must still compare keys for equality. Thus, instead of using less as the default comparison, it uses equal_to. The comparison object is used only to detect attempts to insert duplicate keys into the container.

The Template Argument Hash

When you allow clients to define their own classes, from which they construct objects to pass in the constructor, you must figure out how to specify the type of parameter in the constructor. There are several ways to do it. The STL way, which is on the complicated end of the spectrum, takes the class type as a template parameter, and uses that templatized type as the type in the constructor. We follow that approach for the hash class, as you can see above. Thus, the hashmap template takes four template parameters: the key type, the value type, the comparison type, and the hash type.

The typedefs

The hashmap class template defines three typedefs:

        typedef Key key_type;
        typedef T mapped_type;
        typedef pair<const Key, T> value_type;

The value_type, in particular, is useful for referring to the more cumbersome pair<const Key, T>. As you will see, these typedefs are also required for STL containers by the standard.

The Implementation

After you finalize the hashmap interface, you need to choose the implementation model. The basic hash table structure generally consists of a fixed number of buckets, each of which can store one or more elements. The buckets should be accessible in constant time based on a bucket-id (the result of hashing a key). Thus, a vector is the most appropriate container for the buckets. Each bucket must store a list of elements, so the STL list can be used as the bucket type. Thus, the final structure is a vector of lists of pair<const Key, T> elements. Here are the protected members of the hashmap class:

image
    protected:
        typedef list<value_type> ListType;
        // In this first implementation, it would be easier to use a vector
        // instead of a pointer to a vector, which requires dynamic allocation.
        // However, a pointer to a vector is used so that, in the final
        // implementation, swap() can be implemented in constant time.
        vector<ListType>* mElems;
        size_t mSize;
        Compare mComp;
        Hash mHash;

Code snippet from HashmapBasicHashmaphashmap.h

Without the typedefs for value_type and ListType, the line declaring mElems would look like this:

        vector<list<pair<const Key, T>>>* mElems;

The mComp and mHash members store the comparison and hashing objects, respectively, and mSize stores the number of elements currently in the container.

The Constructor

The constructor initializes all the fields and allocates a new vector. Unfortunately, the template syntax is somewhat dense. As mentioned in the beginning of this chapter, if the syntax confuses you, consult Chapter 19 for details on writing class templates.

image
// Construct mElems with the number of buckets.
template <typename Key, typename T, typename Compare, typename Hash>
hashmap<Key, T, Compare, Hash>::hashmap(
    const Compare& comp, const Hash& hash) throw(invalid_argument) :
    mSize(0), mComp(comp), mHash(hash)
{
    if (mHash.numBuckets() <= 0) {
        throw invalid_argument("Number of buckets must be positive");
    }
    mElems = new vector<ListType>(mHash.numBuckets());
}

Code snippet from HashmapBasicHashmaphashmap.cpp

The implementation requires at least one bucket, so the constructor enforces that restriction.

Destructor, Copy Constructor, Move Constructor, Copy Assignment Operator, and Move Assignment Operator

Here are the implementations of the destructor, copy constructor, move constructor, copy assignment operator, and move assignment operator. Their implementations are straightforward. Consult Chapter 9 for more details on how move constructors and move assignment operators work.

image
template <typename Key, typename T, typename Compare, typename Hash>
hashmap<Key, T, Compare, Hash>::~hashmap()
{
    delete mElems;
    mElems = nullptr;
    mSize = 0;
}
// Copy constructor
template <typename Key, typename T, typename Compare, typename Hash>
hashmap<Key, T, Compare, Hash>::hashmap(
    const hashmap<Key, T, Compare, Hash>& src) :
    mSize(src.mSize), mComp(src.mComp), mHash(src.mHash)
{
    // Don't need to bother checking if numBuckets is positive, because
    // we know src checked. Use the vector copy constructor.
    mElems = new vector<ListType>(*(src.mElems));
}
// C++11 move constructor
template <typename Key, typename T, typename Compare, typename Hash>
hashmap<Key, T, Compare, Hash>::hashmap(hashmap<Key, T, Compare, Hash>&& src)
{
    // move ownership
    mElems = src.mElems;
    src.mElems = nullptr;
    mSize = src.mSize;
    src.mSize = 0;
    mComp = src.mComp;
    mHash = src.mHash;
}
// Assignment operator
template <typename Key, typename T, typename Compare, typename Hash>
hashmap<Key, T, Compare, Hash>& hashmap<Key, T, Compare, Hash>::operator=(
    const hashmap<Key, T, Compare, Hash>& rhs)
{
    // Check for self-assignment.
    if (this != &rhs) {
        delete mElems;
        mSize = rhs.mSize;
        mComp = rhs.mComp;
        mHash = rhs.mHash;
        // Don't need to bother checking if numBuckets is positive, because
        // we know rhs checked. Use the vector copy constructor.
        mElems = new vector<ListType>(*(rhs.mElems));
    }
    return *this;
}
// C++11 move assignment operator
template <typename Key, typename T, typename Compare, typename Hash>
hashmap<Key, T, Compare, Hash>& hashmap<Key, T, Compare, Hash>::operator=(
    hashmap<Key, T, Compare, Hash>&& rhs)
{
    // check for self-assignment
    if (this != &rhs) {
        delete mElems;
 
        // move ownership
        mElems = rhs.mElems;
        rhs.mElems = nullptr;
        mSize = rhs.mSize;
        rhs.mSize = 0;
        mComp = rhs.mComp;
        mHash = rhs.mHash;
    }
    return *this;
}

Code snippet from HashmapBasicHashmaphashmap.cpp

Note that the copy constructor and assignment operator both construct the new vector by using its copy constructor with the vector from the source hashmap as the source.

Element Lookup

Each of the three major operations (lookup, insertion, and deletion) requires code to find an element with a given key. Thus, it is helpful to have a protected helper method that performs that task. findElement() first uses the hash object to hash the key to a specific bucket. Then, it looks in that bucket for an element with a key matching the given key. The elements stored are key/value pairs, so the actual comparison must be done on the first field of the element. The comparison function object specified in the constructor is used to perform the comparison. lists require a linear search to find matching elements, but you could use the find() algorithm instead of an explicit for loop.

image
template <typename Key, typename T, typename Compare, typename Hash>
typename list<pair<const Key, T>>::iterator
hashmap<Key, T, Compare, Hash>::findElement(const key_type& x,
    size_t& bucket) const
{
    // Hash the key to get the bucket.
    bucket = mHash.hash(x);
    // Look for the key in the bucket.
    for (auto it = (*mElems)[bucket].begin();
        it != (*mElems)[bucket].end(); ++it) {
        if (mComp(it->first, x)) {
            return it;
        }
    }
    return (*mElems)[bucket].end();
}

Code snippet from HashmapBasicHashmaphashmap.cpp

Note that findElement() returns an iterator referring to an element in the list representing the bucket to which the key hashed. If the element is found, the iterator refers to that element; otherwise, it is the end iterator for that list. The bucket is returned by reference in the bucket argument.

The syntax in this method is somewhat confusing, particularly the use of the typename keyword. You must use the typename keyword whenever you are using a type that is dependent on a template parameter. Specifically, the type list<pair<const Key, T>>::iterator is dependent on both the Key and T template parameters.

Another note on the syntax: mElems is a pointer, so it must be dereferenced before you can apply operator[] to it to obtain a specific element. Hence the somewhat ugly: (*mElems)[bucket].

You can implement the find() method as a simple wrapper for findElement():

image
template <typename Key, typename T, typename Compare, typename Hash>
typename hashmap<Key, T, Compare, Hash>::value_type*
hashmap<Key, T, Compare, Hash>::find(const key_type& x)
{
    size_t bucket;
    // Use the findElement() helper.
    auto it = findElement(x, bucket);
    if (it == (*mElems)[bucket].end()) {
        // We didn't find the element--return nullptr.
        return nullptr;
    }
    // We found the element. Return a pointer to it.
    return &(*it);
}

Code snippet from HashmapBasicHashmaphashmap.cpp

The operator[] method uses the find() method, and if it does not find the element, it inserts it:

image
template <typename Key, typename T, typename Compare, typename Hash>
T& hashmap<Key, T, Compare, Hash>::operator[] (const key_type& x)
{
    // Try to find the element. If it doesn't exist, add a new element.
    value_type* found = find(x);
    if (found == nullptr) {
        insert(make_pair(x, T()));
        found = find(x);
    }
    return found->second;
}

Code snippet from HashmapBasicHashmaphashmap.cpp

This method is somewhat inefficient, because in the worst case it calls find() twice and insert() once. However, each of these operations runs in constant time with respect to the number of elements in the hashmap, so the overhead is not too significant.

Element Insert

insert() must first check if an element with that key is already in the hashmap. If not, it can add the element to the list in the appropriate bucket. Note that findElement() returns by reference the bucket to which the key hashes, even if the element with that key is not found:

image
template <typename Key, typename T, typename Compare, typename Hash>
void hashmap<Key, T, Compare, Hash>::insert(const value_type& x)
{
    size_t bucket;
    // Try to find the element.
    auto it = findElement(x.first, bucket);
    if (it != (*mElems)[bucket].end()) {
        // The element already exists.
        return;
    } else {
        // We didn't find the element, so insert a new one.
        mSize++;
        (*mElems)[bucket].insert((*mElems)[bucket].end(), x);
    }
}

Code snippet from HashmapBasicHashmaphashmap.cpp

Element Delete

erase() follows the same pattern as insert(): It first attempts to find the element by calling findElement(). If the element exists, it erases it from the list in the appropriate bucket. Otherwise, it does nothing.

image
template <typename Key, typename T, typename Compare, typename Hash>
void hashmap<Key, T, Compare, Hash>::erase(const key_type& x)
{
    size_t bucket;
    // First, try to find the element.
    auto it = findElement(x, bucket);
    if (it != (*mElems)[bucket].end()) {
        // The element exists--erase it.
        (*mElems)[bucket].erase(it);
        mSize--;
    }
}

Code snippet from HashmapBasicHashmaphashmap.cpp

Using the Basic Hashmap

Here is a small test program demonstrating the basic hashmap class template:

image
hashmap<int, int> myHash;
myHash.insert(make_pair(4, 40));
myHash.insert(make_pair(6, 60));
// x will have type hashmap<int, int>::value_type*
auto x = myHash.find(4);
if (x != nullptr) {
    cout << "4 maps to " << x->second << endl;
} else {
    cout << "cannot find 4 in map" << endl;
}
myHash.erase(4);
auto x2 = myHash.find(4);
if (x2 != nullptr) {
    cout << "4 maps to " << x2->second << endl;
} else {
    cout << "cannot find 4 in map" << endl;
}
myHash[4] = 35;
myHash[4] = 60;
auto x3 = myHash.find(4);
if (x3 != nullptr) { 
    cout << "4 maps to " << x3->second << endl;
} else {
    cout << "cannot find 4 in map" << endl;
}

Code snippet from HashmapBasicHashmapTestHashmap.cpp

The output is:

4 maps to 40
cannot find 4 in map
4 maps to 60
pen.gif

With the Microsoft Visual C++ compiler, at least through Visual Studio 2010, you may see a warning “warning C4290: C++ exception specification ignored except to indicate a function is not __declspec(nothrow),” because it doesn’t support exception throw lists. You can ignore these warnings for this example.

Making the Hashmap an STL Container

The basic hashmap shown in the previous section follows the spirit, but not the letter, of the STL. For most purposes, the preceding implementation is good enough. However, if you want to use the STL algorithms on your hashmap, you must do a bit more work. The C++ standard specifies specific methods and typedefs that a data structure must provide in order to qualify as an STL container.

typedef Container Requirements

The C++ standard specifies that every STL container must provide the following public typedefs:

TYPE NAME DESCRIPTION
value_type The element type stored in the container
reference A reference to the element type stored in the container
const_reference A reference to a const element type stored in the container
iterator The type for iterating over elements of the container
const_iterator A version of iterator for iterating over const elements of the container
size_type A type that can represent the number of elements in the container; usually just size_t (from <cstddef>)
difference_type A type that can represent the difference of two iterators for the container; usually just ptrdiff_t (from <cstddef>)

Here are the definitions in the hashmap class of all these typedefs except iterator and const_iterator. Writing an iterator is covered in detail in a subsequent section. Note that value_type (plus key_type and mapped_type, which are discussed later) was already defined in our previous version of the hashmap:

image
template <typename Key, typename T, typename Compare = std::equal_to<Key>,
    typename Hash = DefaultHash<Key>>
class hashmap
{
    public:
        typedef Key key_type;
        typedef T mapped_type;
        typedef pair<const Key, T> value_type;
        typedef pair<const Key, T>& reference;
        typedef const pair<const Key, T>& const_reference;
        typedef size_t size_type;
        typedef ptrdiff_t difference_type;
        // Remainder of class definition omitted for brevity
};

Code snippet from HashmapFinalHashmaphashmap.h

Method Container Requirements

In addition to the typedefs, every container must provide the following methods:

METHOD DESCRIPTION WORST CASE COMPLEXITY
Default Constructor Constructs an empty container Constant
Copy constructor Performs a deep copy of the container Linear
imageMove constructor Performs a C++11 move constructing operation Constant
Copy Assignment operator Performs a deep copy of the container Linear
imageMove Assignment operator Performs a C++11 move assignment operation Constant
Destructor Destroys dynamically allocated memory; calls destructor on all elements left in container Linear
iterator begin();
const_iterator
begin() const;
Returns an iterator referring to the first element in the container Constant
iterator end();
const_iterator
end() const;
Returns an iterator referring to the last+1 element in the container Constant
imageconst_iterator
cbegin() const;
Returns a const iterator referring to the first element in the container Constant
imageconst_iterator
cend() const;
Returns a const iterator referring to the last+1 element in the container Constant
operator==
operator!=
operator<
operator>
operator<=
operator>=
Comparison operators that compare two containers, element by element Linear
void swap(Container&); Swaps the contents of the container passed to the method with the object on which the method is called Constant
size_type size() const; Returns the number of elements in the container Constant
size_type max_size() const; Returns the maximum number of elements the container can hold Constant
bool empty() const; Specifies whether the container has any elements Constant
pen.gif

In this hashmap example, we omit the comparison operators. Implementing them would be a good exercise for the reader.

The following code extract shows the declarations of all the remaining methods except for begin(), end(), cbegin() and cend(). Those are covered in the next section.

image
template <typename Key, typename T, typename Compare = std::equal_to<Key>,
    typename Hash = DefaultHash<Key>>
class hashmap
{
    public:
        // typedefs omitted for brevity
        // Constructors
        explicit hashmap(const Compare& comp = Compare(),
            const Hash& hash = Hash()) throw(invalid_argument);
        // destructor, copy constructor, move constructor,
        // copy assignment operator and move assignment operator
        ~hashmap();
        hashmap(const hashmap<Key, T, Compare, Hash>& src);
        hashmap(hashmap<Key, T, Compare, Hash>&& src);      // C++11
        hashmap<Key, T, Compare, Hash>& operator=(
            const hashmap<Key, T, Compare, Hash>& rhs);
        hashmap<Key, T, Compare, Hash>& operator=(
            hashmap<Key, T, Compare, Hash>&& rhs);          // C++11
        // Size methods
        bool empty() const;
        size_type size() const;
        size_type max_size() const;
        // Other modifying utilities
        void swap(hashmap<Key, T, Compare, Hash>& hashIn);
        // Other methods omitted for brevity
};

Code snippet from HashmapFinalHashmaphashmap.h

The implementations of the constructor, destructor, copy constructor, move constructor, assignment operator, and move assignment operator are identical to those shown earlier for the basic hashmap implementation.

The implementations of size() and empty() are easy because the hashmap implementation tracks its size in the mSize data member. Note that size_type is one of the typedefs that is defined in the class. Since it is a member of the class, such a return type in the implementation must be fully qualified with typename hashmap<Key, T, Compare, Hash>.

image
template <typename Key, typename T, typename Compare, typename Hash>
bool hashmap<Key, T, Compare, Hash>::empty() const
{
    return mSize == 0;
}
template <typename Key, typename T, typename Compare, typename Hash>
typename hashmap<Key, T, Compare, Hash>::size_type
    hashmap<Key, T, Compare, Hash>::size() const
{
    return mSize;
}

Code snippet from HashmapFinalHashmaphashmap.cpp

max_size() is a little trickier. At first, you might think the maximum size of the hashmap container is the sum of the maximum size of all the lists. However, the worst-case scenario is that all the elements hash to the same bucket. Thus, the maximum size it can claim to support is the maximum size of a single list:

image
template <typename Key, typename T, typename Compare, typename Hash>
typename hashmap<Key, T, Compare, Hash>::size_type
    hashmap<Key, T, Compare, Hash>::max_size() const
{
    // In the worst case, all the elements hash to the same
    // list, so the max_size is the max_size of a single list.
    // This code assumes that all the lists have the same max_size.
    return (*mElems)[0].max_size();
}

Code snippet from HashmapFinalHashmaphashmap.cpp

Finally, the implementation of swap()uses the std::swap() utility function to swap each of the four data members. Note that the vector pointers are swapped, which is a constant-time operation:

image
// Just swap the four data members. Use the generic swap template.
template <typename Key, typename T, typename Compare, typename Hash>
void hashmap<Key, T, Compare, Hash>::swap(
    hashmap<Key, T, Compare, Hash>& hashIn)
{
    // Explicitly qualify with std:: so the compiler doesn't think
    // it's a recursive call.
    std::swap(*this, hashIn);
}

Code snippet from HashmapFinalHashmaphashmap.cpp

Writing an Iterator

The most important container requirement is the iterator. In order to work with the generic algorithms, every container must provide an iterator for accessing the elements in the container. Your iterator should generally provide overloaded operator* and operator->, plus some other operations depending on its specific behavior. As long as your iterator provides the basic iteration operations, everything should be fine.

The first decision to make about your iterator is what kind it will be: forward, bidirectional, or random access. Random access iterators don’t make much sense for associative containers, so bidirectional seems like the logical choice for the hashmap iterator. That means you must also provide operator++, operator--, operator==, and operator!=. Consult Chapter 12 for more details on the requirements for the different iterators.

The second decision is how to order the elements of your container. The hashmap is unsorted, so iterating in a sorted order is probably too difficult. Instead, your iterator can just step through the buckets, starting with the elements in the first bucket and progressing to those in the last bucket. This order will appear random to the client, but will be consistent and repeatable.

The third decision is how to represent your iterator internally. The implementation is usually quite dependent on the internal implementation of the container. The first purpose of an iterator is to refer to a single element in the container. In the case of the hashmap, each element is in an STL list, so perhaps the hashmap iterator can be a wrapper around a list iterator referring to the element in question. However, the second purpose of a bidirectional iterator is to allow the client to progress to the next or previous element from the current. In order to progress from one bucket to the next, you need to track also the current bucket and the hashmap object to which the iterator refers.

Once you’ve chosen your implementation, you must decide on a consistent representation for the end iterator. Recall that the end iterator should really be the “past-the-end” marker: the iterator that’s reached by applying ++ to an iterator referring to the final element in the container. The hashmap iterator can use as its end iterator the end iterator of the list of the final bucket in the hashmap.

The HashIterator Class

Given the decisions made in the previous section, it’s time to define the HashIterator class. The first thing to note is that each HashIterator object is an iterator for a specific instantiation of the hashmap class. In order to provide this one-to-one mapping, the HashIterator must also be a class template on the same parameters as the hashmap class.

The main question in the class definition is how to conform to the bidirectional iterator requirements. Recall that anything that behaves like an iterator is an iterator. Your class is not required to subclass another class in order to qualify as a bidirectional iterator. However, if you want your iterator to be usable in the generic algorithms functions, you must specify its traits. The discussion on writing STL algorithms earlier in this chapter explains that iterator_traits is a class template that defines five typedefs for each iterator type. It can be partially specialized for your new iterator type if you want. Alternatively, the default implementation of the iterator_traits class template just grabs the five typedefs out of the iterator class itself. Thus, you can define those typedefs directly in your iterator class. In fact, C++ makes it even easier than that. Instead of defining them yourself, you can just subclass the iterator class template, which provides the typedefs for you. That way you only need to specify the iterator type and the element type as template arguments to the iterator class template. The HashIterator is a bidirectional iterator, so you can specify bidirectional_iterator_tag as the iterator type. Other legal iterator types are input_iterator_tag, output_iterator_tag, forward_iterator_tag, and random_access_iterator_tag. For the HashIterator, the element type is pair<const Key, T>.

Basically, it all boils down to the fact that you should subclass your iterator classes from the generic iterator class template.

Here is the basic HashIterator class definition:

image
// HashIterator class definition
template<typename Key, typename T, typename Compare, typename Hash>
class HashIterator : public std::iterator<std::bidirectional_iterator_tag,
    pair<const Key, T>>
{
    public:
        HashIterator(); // Bidirectional iterators must supply default ctor
        HashIterator(size_t bucket,
            typename list<pair<const Key, T>>::iterator listIt,
            const hashmap<Key, T, Compare, Hash>* inHashmap);
        pair<const Key, T>& operator*() const;
        // Return type must be something to which -> can be applied.
        // Return a pointer to a pair<const Key, T>, to which the compiler will
        // apply -> again.
        pair<const Key, T>* operator->() const;
        HashIterator<Key, T, Compare, Hash>& operator++();
        const HashIterator<Key, T, Compare, Hash> operator++(int);
        HashIterator<Key, T, Compare, Hash>& operator--();
        const HashIterator<Key, T, Compare, Hash> operator--(int);
        // Don't need to define a copy constructor or operator= because the
        // default behavior is what we want
        // Don't need destructor because the default behavior
        // (not deleting mHashmap) is what we want.
        // The following are ok as member functions because we don't
        // support comparisons of different types to this one.
        bool operator==(const HashIterator& rhs) const;
        bool operator!=(const HashIterator& rhs) const;
    protected:
        size_t mBucket;
        typename list<pair<const Key, T>>::iterator mIt;
        const hashmap<Key, T, Compare, Hash>* mHashmap;
        // Helper methods for operator++ and operator--
        void increment();
        void decrement();
};

Code snippet from HashmapFinalHashmaphashmap.h

If the definitions and implementations (shown in the next section) of the overloaded operators confuse you, consult Chapter 18 for details on operator overloading.

The HashIterator Method Implementations

The HashIterator constructors initialize the three member variables. The default constructor exists only so that clients can declare HashIterator variables without initializing them. An iterator constructed with the default constructor does not need to refer to any value, and attempting any operations on it is allowed to have undefined results:

image
// Dereferencing or incrementing an iterator constructed with the
// default ctor is undefined, so it doesn't matter what values we give
// here.
template<typename Key, typename T, typename Compare, typename Hash>
HashIterator<Key, T, Compare, Hash>::HashIterator()
{
    mBucket = 0;
    mIt = list<pair<const Key, T>>::iterator();
    mHashmap = nullptr;
}
template<typename Key, typename T, typename Compare, typename Hash>
HashIterator<Key, T, Compare, Hash>::HashIterator(
    size_t bucket, typename list<pair<const Key, T>>::iterator listIt,
    const hashmap<Key, T, Compare, Hash>* inHashmap) :
    mBucket(bucket), mIt(listIt), mHashmap(inHashmap)
{
}

Code snippet from HashmapFinalHashmaphashmap.cpp

The implementations of the dereferencing operators are concise, but can be tricky. Chapter 18 explains that operator* and operator-> are asymmetric; operator* returns the actual underlying value, which in this case is the element to which the iterator refers, while operator-> must return something to which the arrow operator can be applied again. Thus, it returns a pointer to the element. The compiler then applies -> to the pointer, which will result in accessing a field of the element:

image
// Return the actual element
template<typename Key, typename T, typename Compare, typename Hash>
pair<const Key, T>& HashIterator<Key, T, Compare, Hash>::operator*() const
{
    return *mIt;
}
// Return the iterator, so the compiler can apply -> to it to access
// the actual desired field.
template<typename Key, typename T, typename Compare, typename Hash>
pair<const Key, T>*
    HashIterator<Key, T, Compare, Hash>::operator->() const
{
    return &(*mIt);
}

Code snippet from HashmapFinalHashmaphashmap.cpp

The increment and decrement operators are implemented as follows, which defer the actual incrementing and decrementing procedures to the increment() and decrement() helper methods:

image
// Defer the details to the increment() helper.
template<typename Key, typename T, typename Compare, typename Hash>
HashIterator<Key, T, Compare, Hash>&
    HashIterator<Key, T, Compare, Hash>::operator++()
{
    increment();
    return *this;
}
// Defer the details to the increment() helper.
template<typename Key, typename T, typename Compare, typename Hash>
const HashIterator<Key, T, Compare, Hash>
    HashIterator<Key, T, Compare, Hash>::operator++(int)
{
    auto oldIt = *this;
    increment();
    return oldIt;
}
// Defer the details to the decrement() helper.
template<typename Key, typename T, typename Compare, typename Hash>
HashIterator<Key, T, Compare, Hash>&
    HashIterator<Key, T, Compare, Hash>::operator--()
{
    decrement();
    return *this;
}
// Defer the details to the decrement() helper.
template<typename Key, typename T, typename Compare, typename Hash>
const HashIterator<Key, T, Compare, Hash>
    HashIterator<Key, T, Compare, Hash>::operator--(int)
{
    auto oldIt = *this;
    decrement();
    return oldIt;
}

Code snippet from HashmapFinalHashmaphashmap.cpp

Incrementing a HashIterator tells it to refer to the “next” element in the container. This method first increments the list iterator, then checks if it has reached the end of its bucket. If so, it finds the next non-empty bucket in the hashmap and sets the list iterator equal to the start element in that bucket. Note that it can’t simply move to the next bucket, because there might not be any elements in it. If there are no more empty buckets, mIt is set, by the convention chosen for this example, to the end iterator of the last bucket in the hashmap, which is the special “end” position of the HashIterator. Iterators are not required to be any safer than dumb pointers, so error-checking for things like incrementing an iterator already at the end is not required:

image
// Behavior is undefined if mIt already refers to the past-the-end
// element in the table, or is otherwise invalid.
template<typename Key, typename T, typename Compare, typename Hash>
void HashIterator<Key, T, Compare, Hash>::increment()
{
    // mIt is an iterator into a single bucket. Increment it.
    ++mIt;
    // If we're at the end of the current bucket,
    // find the next bucket with elements.
    if (mIt == (*mHashmap->mElems)[mBucket].end()) {
        for (size_t i = mBucket + 1; i < (*mHashmap->mElems).size(); i++) {
            if (!((*mHashmap->mElems)[i].empty())) {
                // We found a non-empty bucket.
                // Make mIt refer to the first element in it.
                mIt = (*mHashmap->mElems)[i].begin();
                mBucket = i;
                return;
            }
        }
        // No more empty buckets. Assign mIt to refer to the end
        // iterator of the last list.
        mBucket = (*mHashmap->mElems).size() - 1;
        mIt = (*mHashmap->mElems)[mBucket].end();
    }
}

Code snippet from HashmapFinalHashmaphashmap.cpp

Decrement is the inverse of increment: It makes the iterator refer to the “previous” element in the container. However, there is an asymmetry because of the asymmetry between the way the start and end positions are represented: Start is the first element, but end is “one past” the last element. The algorithm for decrement checks first if the underlying list iterator is at the start of its current bucket. If not, it can just be decremented. Otherwise, the code needs to check for the first nonempty bucket before the current one. If one is found, the list iterator must be set to refer to the last element in the bucket, which is the end iterator decremented by one. If no non-empty buckets are found, the decrement is invalid, so the code can do anything it wants (behavior is undefined):

image
// Behavior is undefined if mIt already refers to the first element
// in the table, or is otherwise invalid.
template<typename Key, typename T, typename Compare, typename Hash>
void HashIterator<Key, T, Compare, Hash>::decrement()
{
    // mIt is an iterator into a single bucket.
    // If it's at the beginning of the current bucket, don't decrement it.
    // Instead, try to find a non-empty bucket ahead of the current one.
    if (mIt == (*mHashmap->mElems)[mBucket].begin()) {
        for (size_t i = mBucket - 1; i >= 0; --i) {
            if (!((*mHashmap->mElems)[i].empty())) {
                mIt = (*mHashmap->mElems)[i].end();
                --mIt;
                mBucket = i;
                return;
            }
        }
        // No more non-empty buckets. This is an invalid decrement.
        // Assign mIt to refer to one before the start element of the first
        // list (an invalid position).
        mIt = (*mHashmap->mElems)[0].begin();
        --mIt;
        mBucket = 0;
    } else {
        // We're not at the beginning of the bucket, so just move down.
        --mIt;
    }
}

Code snippet from HashmapFinalHashmaphashmap.cpp

Note that both increment() and decrement() access protected members of the hashmap class. Thus, the hashmap class must declare HashIterator to be a friend class.

After increment() and decrement(), operator== and operator!= are positively simple. They just compare each of the three data members of the objects:

image
template<typename Key, typename T, typename Compare, typename Hash>
bool HashIterator<Key, T, Compare, Hash>::operator==(
    const HashIterator& rhs) const
{
    // All fields, including the hashmap to which the iterators refer,
    // must be equal.
    return (mHashmap == rhs.mHashmap && mBucket == rhs.mBucket &&
        mIt == rhs.mIt);
}
template<typename Key, typename T, typename Compare, typename Hash>
bool HashIterator<Key, T, Compare, Hash>::operator!=(
    const HashIterator& rhs) const
{
    return !(*this == rhs);
}

Code snippet from HashmapFinalHashmaphashmap.cpp

const Iterators

Technically, you should provide both an iterator and a const iterator for your hashmap class. The const iterator should function like the iterator, but should provide read-only access to the elements. The iterator should always be convertible to a const iterator. We omit the details of the const iterator and leave its implementation as an exercise for the reader.

Iterator typedefs and Access Methods

The final piece involved in providing iterator support for the hashmap is to supply the necessary typedefs in the hashmap class definition and to write the begin() and end() methods and the C++11 cbegin() and cend() methods on the hashmap. The typedefs and method prototypes look like this:

image
template <typename Key, typename T, typename Compare = std::equal_to<Key>,
    typename Hash = DefaultHash<Key>>
class hashmap
{
    public:
        // Other typedefs omitted for brevity
        typedef HashIterator<Key, T, Compare, Hash> iterator;
        typedef HashIterator<Key, T, Compare, Hash> const_iterator;
        // Iterator methods
        iterator begin();
        iterator end();
        const_iterator begin() const;
        const_iterator end() const;
        const_iterator cbegin() const;  // For C++11
        const_iterator cend() const;    // For C++11
        // Remainder of class definition omitted for brevity
};

Code snippet from HashmapFinalHashmaphashmap.h

The trickiest aspect of begin() is to remember to return the end iterator if there are no elements in the container:

image
template <typename Key, typename T, typename Compare, typename Hash>
typename hashmap<Key, T, Compare, Hash>::iterator
    hashmap<Key, T, Compare, Hash>::begin()
{
    if (mSize == 0) {
        // Special case: there are no elements, so return the end iterator
        return end();
    }
    // We know there is at least one element. Find the first element.
    for (size_t i = 0; i < mElems->size(); ++i) {
        if (!((*mElems)[i].empty())) {
            return HashIterator<Key, T, Compare, Hash>(i,
                (*mElems)[i].begin(), this);
        }
    }
    // Should never reach here, but if we do, return the end iterator
    return end();
}

Code snippet from HashmapFinalHashmaphashmap.cpp

end() creates a HashIterator referring to the end iterator of the last bucket:

image
template <typename Key, typename T, typename Compare, typename Hash>
typename hashmap<Key, T, Compare, Hash>::iterator
    hashmap<Key, T, Compare, Hash>::end()
{
    // The end iterator is just the end iterator of the list in last bucket.
    return HashIterator<Key, T, Compare, Hash>(mElems->size() - 1,
        (*mElems)[mElems->size() - 1].end(), this);
}

Code snippet from HashmapFinalHashmaphashmap.cpp

Because we don’t provide a const_iterator, the implementation of the const versions of begin() and end() are identical to the non-const versions of begin() and end(). The C++11 cbegin() and cend() methods forward the call to the const versions of begin() and end():

image
template <typename Key, typename T, typename Compare, typename Hash>
typename hashmap<Key, T, Compare, Hash>::const_iterator
    hashmap<Key, T, Compare, Hash>::cbegin() const
{
    return const_cast<const hashmap<Key, T, Compare, Hash>*>(this)->begin();
}
template <typename Key, typename T, typename Compare, typename Hash>
typename hashmap<Key, T, Compare, Hash>::const_iterator
    hashmap<Key, T, Compare, Hash>::cend() const
{
    return const_cast<const hashmap<Key, T, Compare, Hash>*>(this)->end();
}

Code snippet from HashmapFinalHashmaphashmap.cpp

Using the HashIterator

Now that the hashmap supports iteration, you can iterate over its elements just as you would on any STL container, and you can pass the iterators to methods and functions:

image
hashmap<string, int> myHash;
myHash.insert(make_pair("KeyOne", 100));
myHash.insert(make_pair("KeyTwo", 200));
myHash.insert(make_pair("KeyThree", 300));
for (auto it = myHash.cbegin(); it != myHash.cend(); ++it) {
    // Use both -> and * to test the operations.
    cout << it->first << " maps to " << (*it).second << endl;
}
// Print elements using C++11 range-based for loop
for (auto& p : myHash)
    cout << p.first << " maps to " << p.second << endl;
// Create a map with all the elements in the hashmap.
map<string, int> myMap(myHash.begin(), myHash.end());
for (auto it = myMap.begin(); it != myMap.end(); ++it) {
    cout << it->first << " maps to " << (*it).second << endl;
}

Code snippet from HashmapFinalHashmapTestHashmap.cpp

Note on Allocators

As described earlier in this chapter, all the STL containers allow you to specify a custom memory allocator. A “good citizen” hashmap implementation should do the same. However, we omit those details because they obscure the main points of this implementation.

Note on Reversible Containers

If your container supplies a bidirectional or random access iterator, it is considered reversible. Reversible containers are supposed to supply two additional typedefs:

TYPE NAME DESCRIPTION
reverse_iterator The type for iterating over elements of the container in reverse order.
const_reverse_iterator A version of reverse_iterator for iterating over const elements of the container in reverse order.

Additionally, the container should provide rbegin() and rend() which are symmetric with begin() and end(); and should provide the C++11 crbegin() and crend() which are symmetric with cbegin() and cend(). The usual implementations just use the reverse_iterator adapter described earlier in this chapter. We leave them as an exercise for the reader.

Making the Hashmap an Associative Container

In addition to the basic container requirements shown already, you can also make your container adhere to additional requirements for associative, unordered associative, or sequential containers. This example makes the hashmap an associative container, so it should conform to the following typedefs and methods.

Associative Container typedef Requirements

Associative containers require three additional typedefs:

TYPE NAME DESCRIPTION
key_type The key type with which the container is instantiated
key_compare The comparison class or function pointer type with which the container is instantiated
value_compare Class for comparing two value_type elements

This example implementation also throws in a mapped_type typedef, because that’s what the map does. The value_compare is implemented not as a typedef, but as a nested class definition. Alternatively, the class could be a friend class of the hashmap, but this definition follows the map definition found in the standard. The purpose of the value_compare class is to call the comparison function on the keys of two elements:

image
template <typename Key, typename T, typename Compare = std::equal_to<Key>,
    typename Hash = DefaultHash<Key>>
class hashmap
{
    public:
        typedef Key key_type;
       typedef T mapped_type;
        typedef pair<const Key, T> value_type;
       typedef Compare key_compare;
        typedef pair<const Key, T>& reference;
        typedef const pair<const Key, T>& const_reference;
        typedef HashIterator<Key, T, Compare, Hash> iterator;
        typedef HashIterator<Key, T, Compare, Hash> const_iterator;
        typedef size_t size_type;
        typedef ptrdiff_t difference_type;
        // Required class definition for associative containers
        class value_compare :
            public std::binary_function<value_type, value_type, bool>
        {
            friend class hashmap<Key, T, Compare, Hash>;
            public:
                bool operator() (const value_type& x, const value_type& y) const
                {
                    return comp(x.first, y.first);
                }
            protected:
                Compare comp;
                value_compare(Compare c) : comp(c) {}
         };
         // Remainder of hashmap class definition omitted for brevity
};

Code snippet from HashmapFinalHashmaphashmap.h

Associative Container Method Requirements

The standard prescribes quite a few additional method requirements for associative containers:

METHOD DESCRIPTION WORST CASE COMPLEXITY
Constructor taking an iterator range. Constructs the container and inserts elements in the iterator range. The iterator range need not refer into another container of the same type.
Note that all constructors of associative containers must take a comparison object of type value_compare. The constructors should provide a default constructed object as the default value.
n log n
imageConstructor taking an initializer_list<value_type> as parameter. Constructs the container and inserts the elements from the initializer list into the container. n log n
imageAssignment operator with an initializer_list<value_type> as right-hand side. Replaces all elements from the container with the elements from the initializer list n log n
key_compare key_comp()
const;
value_compare
value_comp() const;
Returns the comparison objects for comparing just keys or entire values constant
pair<iterator, bool>
insert(value_type&);
iterator insert(
iterator,
value_type&);
void insert(
InputIterator start,
InputIterator end);
Three different forms of insert.
The iterator position in the second is a hint, which can be ignored.
The range in the third need not be from a container of the same type.
Containers that allow duplicate keys return just iterator from the first form, because insert() always succeeds.
logarithmic except for the second form, which can be amortized constant if the element is inserted immediately before the given hint.
imagevoid insert(
initializer_list
<value_type>);
Inserts the elements from the initializer list into the container. logarithmic
imagepair<iterator, bool>
emplace(
value_type&&);
imageiterator emplace_hint(
iterator hint,
value_type&&);
Implements the C++11 emplace operations to construct objects in-place. In-place construction is discussed in Chapter 12. logarithmic, except for the second form, which can be amortized constant if the element is inserted immediately before the given hint.
size_type
erase(key_type&);
void erase(iterator);
void erase(
iterator start,
iterator end);
Three different forms of erase.
The first form returns the number of values erased (0 or 1, in containers that do not allow duplicate keys).
The second and third forms erase the elements at iterator position, or in the range start to end.
logarithmic, except for the second form, which should be amortized constant
void clear(); Erases all elements linear
iterator
find(key_type&);
const_iterator
find(key_type&)
const;
Finds the element with the specified key logarithmic
size_type
count(key_type&)
const;
Returns the number of elements with the specified key (0 or 1 in containers that do not allow duplicate keys) logarithmic
iterator lower_bound(
key_type&);
iterator upper_bound(
key_type&);
pair<iterator,iterator>
equal_range(
key_type&);
const_iterator
lower_bound(
key_type&) const;
const_iterator
upper_bound(
key_type&) const;
pair<const_iterator,
const_iterator>
equal_range(
key_type&) const;
Returns iterators referring to the first element of the specified key, one past the last element of the specified key, or both logarithmic

Note that the collection methods lower_bound(), upper_bound(), and equal_range() only make sense on sorted containers. Thus the hashmap class does not need to provide them.

Here is the complete hashmap class definition. Note that the prototypes for insert(), erase(), and find() need to change slightly from the previous versions shown, because those initial versions don’t have the right return types required for associative containers:

image
template <typename Key, typename T, typename Compare = std::equal_to<Key>,
    typename Hash = DefaultHash<Key>>
class hashmap
{
    public:
        typedef Key key_type;
        typedef T mapped_type;
        typedef pair<const Key, T> value_type;
        typedef Compare key_compare;
        typedef pair<const Key, T>& reference;
        typedef const pair<const Key, T>& const_reference;
        typedef HashIterator<Key, T, Compare, Hash> iterator;
        typedef HashIterator<Key, T, Compare, Hash> const_iterator;
        typedef size_t size_type;
        typedef ptrdiff_t difference_type;
        // Required class definition for associative containers
        class value_compare :
            public std::binary_function<value_type, value_type, bool>
        {
            friend class hashmap<Key, T, Compare, Hash>;
            public:
                bool operator() (const value_type& x, const value_type& y) const
                {
                    return comp(x.first, y.first);
                }
            protected:
                Compare comp;
                value_compare(Compare c) : comp(c) {}
        };
        // The iterator class needs access to protected members of hashmap
        friend class HashIterator<Key, T, Compare, Hash>;
        // Constructors
        explicit hashmap(const Compare& comp = Compare(),
            const Hash& hash = Hash()) throw(invalid_argument);
        template <class InputIterator>
        hashmap(InputIterator first, InputIterator last,
            const Compare& comp = Compare(), const Hash& hash = Hash())
            throw(invalid_argument);
        // destructor, copy constructor, move constructor,
        // copy assignment operator and move assignment operator
        ~hashmap();
        hashmap(const hashmap<Key, T, Compare, Hash>& src);
        hashmap(hashmap<Key, T, Compare, Hash>&& src);      // C++11
        hashmap<Key, T, Compare, Hash>& operator=(
            const hashmap<Key, T, Compare, Hash>& rhs);
        hashmap<Key, T, Compare, Hash>& operator=(
            hashmap<Key, T, Compare, Hash>&& rhs);          // C++11
        // C++11 initializer list constructor
        hashmap(initializer_list<value_type> il,
            const Compare& comp = Compare(),
            const Hash& hash = Hash()) throw(invalid_argument);
        // C++11 initializer list assignment operator
        hashmap<Key, T, Compare, Hash>& operator=(
            initializer_list<value_type> il);
        // Iterator methods
        iterator begin();
        iterator end();
        const_iterator begin() const;
        const_iterator end() const;
        const_iterator cbegin() const;  // For C++11
        const_iterator cend() const;    // For C++11
        // Size methods
        bool empty() const;
        size_type size() const;
        size_type max_size() const;
       // Element insert methods
       T& operator[] (const key_type& x);
       pair<iterator, bool> insert(const value_type& x);
       iterator insert(iterator position, const value_type& x);
       template <class InputIterator>
       void insert(InputIterator first, InputIterator last);
       void insert(initializer_list<value_type> il);        // C++11
       // C++11 emplace methods
       pair<iterator, bool> emplace(value_type&& x);
       iterator emplace_hint(iterator hint, value_type&& x);
       // Element delete methods
       void erase(iterator position);
       size_type erase(const key_type& x);
       void erase(iterator first, iterator last);
        // Other modifying utilities
       void swap(hashmap<Key, T, Compare, Hash>& hashIn);
       void clear();
       // Access methods for STL conformity
       key_compare key_comp() const;
       value_compare value_comp() const;
       // Lookup methods
       iterator find(const key_type& x);
       const_iterator find(const key_type& x) const;
       size_type count(const key_type& x) const;
    protected:
        typedef list<value_type> ListType;
        typename ListType::iterator findElement(
            const key_type& x, size_t& bucket) const;
        vector<ListType>* mElems;
        size_type mSize;
        Compare mComp;
        Hash mHash;
};

Code snippet from HashmapFinalHashmaphashmap.h

hashmap Constructors

The implementation of the default constructor is shown earlier. The second constructor is a method template so that it can take an iterator range from any container, not just other hashmaps. If it were not a method template, it would need to specify the InputIterator type explicitly as HashIterator, limiting it to iterators from hashmaps. Despite the syntax, the implementation is uncomplicated: It initializes all the data members, then calls insert() to actually insert all the elements in the specified range:

image
// Make a call to insert() to actually insert the elements.
template <typename Key, typename T, typename Compare, typename Hash>
template <class InputIterator>
hashmap<Key, T, Compare, Hash>::hashmap(
    InputIterator first, InputIterator last, const Compare& comp,
    const Hash& hash) throw(invalid_argument) : mSize(0), mComp(comp), mHash(hash)
{
    if (mHash.numBuckets() <= 0) {
        throw invalid_argument("Number of buckets must be positive");
    }
    mElems = new vector<ListType>(mHash.numBuckets());
    insert(first, last);
}

Code snippet from HashmapFinalHashmaphashmap.cpp

imagehashmap Initializer List Constructor

C++11 supports initializer lists, which are discussed in Chapter 9. Following is the implementation of the hashmap constructor that takes an initializer list, which is very similar to the implementation of the constructor accepting an iterator range:

image
template <typename Key, typename T, typename Compare, typename Hash>
hashmap<Key, T, Compare, Hash>::hashmap(initializer_list<value_type> il,
    const Compare& comp, const Hash& hash) throw(invalid_argument)
    : mSize(0), mComp(comp), mHash(hash)
{
    if (mHash.numBuckets() <= 0) {
        throw invalid_argument("Number of buckets must be positive");
    }
    mElems = new vector<ListType>(mHash.numBuckets());
    insert(il.begin(), il.end());
}

Code snippet from HashmapFinalHashmaphashmap.cpp

With this initializer list constructor, a hashmap can be constructed in the following way:

hashmap<string, int> myHash = {
    {"KeyOne", 100},
    {"KeyTwo", 200},
    {"KeyThree", 300}};

This is much nicer than using insert() and make_pair() calls:

myHash.insert(make_pair("KeyOne", 100));
myHash.insert(make_pair("KeyTwo", 200));
myHash.insert(make_pair("KeyThree", 300));

imagehashmap Initializer List Assignment Operator

In C++11, assignment operators can also accept an initializer list on the right-hand side. Following is an implementation of an initializer list assignment operator for the hashmap. It deletes all the elements from the hashmap and then delegates the work to the insert() method accepting an iterator range:

image
template <typename Key, typename T, typename Compare, typename Hash>
hashmap<Key, T, Compare, Hash>& hashmap<Key, T, Compare, Hash>::operator=(
    initializer_list<value_type> il)
{
    clear();
    insert(il.begin(), il.end());
    return *this;
}

Code snippet from HashmapFinalHashmaphashmap.cpp

With this assignment operator, you can write code as follows:

myHash = {
    {"KeyOne", 100},
    {"KeyTwo", 200},
    {"KeyThree", 300}};

hashmap Insertion Operations

In the basic hashmap section earlier in this chapter, a simple insert() method was given. In this version, four insert() versions are provided with additional features:

  • The simple insert() operation returns a pair<iterator, bool>, which indicates both where the item is inserted and whether or not it was newly-created.
  • The version of insert() that takes a position is useless for a hashmap, but it is provided for symmetry with other kinds of collections. The position is ignored, and it merely calls the first version.
  • The third form of insert() is a method template, so it can be used by algorithms that work on arbitrary containers.
  • The last form of insert() accepts an initializer_list<value_type>.

The first two insert() methods are implemented as follows:

image
template <typename Key, typename T, typename Compare, typename Hash>
pair<typename hashmap<Key, T, Compare, Hash>::iterator, bool>
    hashmap<Key, T, Compare, Hash>::insert(const value_type& x)
{
    size_t bucket;
    // Try to find the element.
    auto it = findElement(x.first, bucket);
    if (it != (*mElems)[bucket].end()) {
        // The element already exists.
        // Convert the list iterator into a HashIterator, which
        // also requires the bucket and a pointer to the hashmap.
        HashIterator<Key, T, Compare, Hash> newIt(bucket, it, this);
        // Some compilers don't like make_pair here.
        pair<HashIterator<Key, T, Compare, Hash>, bool> p(newIt, false);
        return p;
    } else {
        // We didn't find the element, so insert a new one.
        mSize++;
        auto endIt = (*mElems)[bucket].insert((*mElems)[bucket].end(), x);
        pair<HashIterator<Key, T, Compare, Hash>, bool> p(
            HashIterator<Key, T, Compare, Hash>(bucket, endIt, this), true);
        return p;
    }
}
template <typename Key, typename T, typename Compare, typename Hash>
typename hashmap<Key, T, Compare, Hash>::iterator
    hashmap<Key, T, Compare, Hash>::insert(typename hashmap<Key, T, Compare,
    Hash>::iterator position, const value_type& x)
{
    // Completely ignore position
    return insert(x).first;
}

Code snippet from HashmapFinalHashmaphashmap.cpp

The third form of insert() is a method template for the same reason as the constructor shown earlier: It should be able to insert elements by using iterators from containers of any type. The actual implementation uses an insert_iterator, which is described earlier in this chapter:

image
template <typename Key, typename T, typename Compare, typename Hash>
template <class InputIterator>
void hashmap<Key, T, Compare, Hash>::insert(InputIterator first,
    InputIterator last)
{
    // Copy each element in the range by using an insert_iterator
    // adapter. Give begin() as a dummy position--insert ignores it
    // anyway.
    insert_iterator<hashmap<Key, T, Compare, Hash>> inserter(*this, begin());
    copy(first, last, inserter);
}

Code snippet from HashmapFinalHashmaphashmap.cpp

imageIn C++11, you should implement an insert operation that accepts an initializer list, discussed in Chapter 9. The implementation for the hashmap is very straightforward and forwards the work to the insert() method accepting an iterator range:

image
template <typename Key, typename T, typename Compare, typename Hash>
void hashmap<Key, T, Compare, Hash>::insert(initializer_list<value_type> il)
{
    insert(il.begin(), il.end());
}

Code snippet from HashmapFinalHashmaphashmap.cpp

With this insert() method, you can write code as follows:

myHash.insert({
    {"KeyFour", 400},
    {"KeyFive", 500}});

imagehashmap Emplace Operations

Emplace operations construct objects in place using the rvalue reference feature and are new to C++11. They are discussed in Chapter 12. Their implementation is straightforward in this case because we can forward the emplace operation to the correct bucket list. The emplace() method is implemented exactly the same as the corresponding insert() method, except that it calls emplace() on the list instead of insert():

image
template <typename Key, typename T, typename Compare, typename Hash>
pair<typename hashmap<Key, T, Compare, Hash>::iterator, bool>
    hashmap<Key, T, Compare, Hash>::emplace(value_type&& x)
{
    size_t bucket;
    // Try to find the element.
    auto it = findElement(x.first, bucket);
    if (it != (*mElems)[bucket].end()) {
        // The element already exists.
        // Convert the list iterator into a HashIterator, which
        // also requires the bucket and a pointer to the hashmap.
        HashIterator<Key, T, Compare, Hash> newIt(bucket, it, this);
        // Some compilers don't like make_pair here.
        pair<HashIterator<Key, T, Compare, Hash>, bool> p(newIt, false);
        return p;
    } else {
        // We didn't find the element, so emplace a new one.
        mSize++;
        auto endIt = (*mElems)[bucket].emplace((*mElems)[bucket].end(), x);
        pair<HashIterator<Key, T, Compare, Hash>, bool> p(
            HashIterator<Key, T, Compare, Hash>(bucket, endIt, this), true);
        return p;
    }
}

Code snippet from HashmapFinalHashmaphashmap.cpp

The emplace_hint() method ignores the hint and forwards the call to the emplace() method. This implementation uses std::forward to forward x as an rvalue reference to the emplace() method. This is required, because a named variable, x, will normally be treated as an lvalue reference, but in this case, you want it to be treated as an rvalue reference:

image
template <typename Key, typename T, typename Compare, typename Hash>
typename hashmap<Key, T, Compare, Hash>::iterator
    hashmap<Key, T, Compare, Hash>::emplace_hint(
    typename hashmap<Key, T, Compare, Hash>::iterator hint, value_type&& x)
{
    // completely ignore hint
    return emplace(forward<value_type>(x)).first;
}

Code snippet from HashmapFinalHashmaphashmap.cpp

hashmap Erase Operations

The version of erase() in the earlier section “A Basic Hashmap” is not compliant with STL requirements. You need to implement the following versions:

  • A version that takes as a parameter a key_type and returns a size_type for the number of elements removed from the collection (for the hashmap, there are only two possible return values, 0 and 1)
  • A version that erases a value at a specific iterator position, as indicated by a HashIterator, and returns void
  • A version that erases a range of elements, based on two iterators, and returns void

The first version is implemented as follows:

image
template <typename Key, typename T, typename Compare, typename Hash>
typename hashmap<Key, T, Compare, Hash>::size_type
    hashmap<Key, T, Compare, Hash>::erase(const key_type& x)
{
    size_t bucket;
    // First, try to find the element.
    auto it = findElement(x, bucket);
    if (it != (*mElems)[bucket].end()) {
        // The element exists--erase it.
        (*mElems)[bucket].erase(it);
        mSize--;
        return 1;
    } else {
        return 0;
    }
}

Code snippet from HashmapFinalHashmaphashmap.cpp

The second form of erase() must remove the element at a specific iterator position. The iterator given is, of course, a HashIterator. Thus, the hashmap must have some ability to obtain the underlying bucket and list iterator from the HashIterator. The approach we take is to make the hashmap class a friend of the HashIterator (not shown in the preceding class definition).

image
template <typename Key, typename T, typename Compare, typename Hash>
void hashmap<Key, T, Compare, Hash>::erase(
    typename hashmap<Key, T, Compare, Hash>::iterator position)
{
    // Erase the element from its bucket.
    (*mElems)[position.mBucket].erase(position.mIt);
    mSize--;
}

Code snippet from HashmapFinalHashmaphashmap.cpp

The final version of erase() removes a range of elements. It iterates from first to last, calling erase() on each element, thus letting the previous version of erase() do all the work:

image
template <typename Key, typename T, typename Compare, typename Hash>
void hashmap<Key, T, Compare, Hash>::erase(
    typename hashmap<Key, T, Compare, Hash>::iterator first, 
    typename hashmap<Key, T, Compare, Hash>::iterator last)
{
    typename hashmap<Key, T, Compare, Hash>::iterator cur, next;
    // Erase all the elements in the range.
    for (next = first; next != last; ) {
        cur = next++;
        erase(cur);
    }
}

Code snippet from HashmapFinalHashmaphashmap.cpp

hashmap Clear Operation

The clear() method uses the for_each() algorithm to call clear() on the list representing each bucket:

image
template <typename Key, typename T, typename Compare, typename Hash>
void hashmap<Key, T, Compare, Hash>::clear()
{
    // Call clear on each list.
    for_each(mElems->begin(), mElems->end(),
        [](ListType& e){e.clear();});
    mSize = 0;
} 

Code snippet from HashmapFinalHashmaphashmap.cpp

hashmap Accessor Operations

The standard requires access methods for the key comparison and value comparison objects. These methods must be called key_comp() and value_comp():

image
template <typename Key, typename T, typename Compare, typename Hash>
typename hashmap<Key, T, Compare, Hash>::key_compare
    hashmap<Key, T, Compare, Hash>::key_comp() const
{
    return mComp;
}
template <typename Key, typename T, typename Compare, typename Hash>
typename hashmap<Key, T, Compare, Hash>::value_compare
    hashmap<Key, T, Compare, Hash>::value_comp() const
{
    return value_compare(mComp);
}

Code snippet from HashmapFinalHashmaphashmap.cpp

The find() method is identical to the version shown earlier for the basic hashmap, except for the return code. Instead of returning a pointer to the element, it constructs a HashIterator referring to it:

image
template <typename Key, typename T, typename Compare, typename Hash>
typename hashmap<Key, T, Compare, Hash>::iterator
    hashmap<Key, T, Compare, Hash>::find(const key_type& x)
{
    size_t bucket;
    // Use the findElement() helper.
    auto it = findElement(x, bucket);
    if (it == (*mElems)[bucket].end()) {
        // We didn't find the element--return the end iterator.
        return end();
    }
    // We found the element--convert the bucket/iterator to a HashIterator.
    return HashIterator<Key, T, Compare, Hash>(bucket, it, this);
}

Code snippet from HashmapFinalHashmaphashmap.cpp

The const version of find() is identical, so its implementation is not shown here.

The implementation of count() is a wrapper for find(), returning 1 if it finds the element, and 0 if it doesn’t. The find() method returns the end iterator if it can’t find the element. count() retrieves an end iterator by calling end() in order to compare it. Since this hashmap does not allow duplicate keys, count() always returns either 0 or 1:

image
template <typename Key, typename T, typename Compare, typename Hash>
typename hashmap<Key, T, Compare, Hash>::size_type
    hashmap<Key, T, Compare, Hash>::count(const key_type& x) const
{
    // There are either 1 or 0 elements matching key x.
    // If we can find a match, return 1, otherwise return 0.
    if (find(x) == end()) {
        return 0;
    } else {
        return 1;
    }
}

Code snippet from HashmapFinalHashmaphashmap.cpp

The final method, operator[], is not required by the standard, but is provided for convenience of the programmer, and to be symmetric with std::map. The prototype and implementations are identical to those of the operator[] in the STL map. The comments explain the potentially confusing one-line implementation:

image
template <typename Key, typename T, typename Compare, typename Hash>
T& hashmap<Key, T, Compare, Hash>::operator[] (const key_type& x)
{
    // This definition is the same as that used by map, according to
    // the standard.
    // It's a bit cryptic, but it basically attempts to insert
    // a new key/value pair of x and a new value. Regardless of whether
    // the insert succeeds or fails, insert() returns a pair of an
    // iterator/bool. The iterator refers to a key/value pair, the
    // second element of which is the value we want to return.
    return ((insert(make_pair(x, T()))).first)->second;
}

Code snippet from HashmapFinalHashmaphashmap.cpp

Note on Sequential Containers

The hashmap developed in the preceding sections is an associative container. However, you could also write a sequential container, or an unordered associative container, in which case you would need to follow a different set of requirements. Instead of listing them here, it’s easier to point out that the deque container follows the prescribed sequential container requirements almost exactly. The only difference is that it provides an extra resize() method (not required by the standard). An example of an unordered associative container is the unordered_map, on which you can model your own unordered associative containers.

SUMMARY

The final example in this chapter showed almost the complete development of a hashmap associative container and its iterator. This hashmap implementation was given here to teach you how to write your own STL containers and iterators. C++11 includes its own set of unordered associative containers or hash tables. If your compiler supports those standard C++11 hash tables, you should use them instead of your own implementation.

In the process of reading this chapter, you also hopefully gained an appreciation for the steps involved in developing containers. Even if you never write another STL algorithm or container, you understand better the STL’s mentality and capabilities, and you can put it to better use.

This chapter concludes the tour of the STL, Chapters 11 to 17. Even with all the details given in this book, there are still features omitted. If this material excited you, consult some of the resources in Appendix B for more information. On the other hand, we realize that the syntax and material in these chapters was dense. Don’t feel compelled to use all the features discussed here. Forcing them into your programs without a true need will just complicate your code. However, we encourage you to consider incorporating aspects of the STL into your programs where they make sense. Start with the containers, maybe throw in an algorithm or two, and before you know it, you’ll be a convert!

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset