EXPLORATION 63

Smart Pointers

The std::unique_ptr class template is an example of a so-called smart pointer. A smart pointer behaves much like any other pointer but with extra features and functionality. This Exploration takes a closer look at unique_ptr and other smart pointers.

Revisiting unique_ptr

Exploration 61 introduced unique_ptr as a way to manage dynamically allocated objects. The unique_ptr class template overloads the dereference (*) and member access (->) operators and lets you use a unique_ptr object the same way you would use a pointer. At the same time, it extends the behavior of an ordinary pointer, such that when the unique_ptr object is destroyed, it automatically deletes the pointer it holds. That’s why unique_ptr is called a smart pointer—it’s just like an ordinary pointer, only smarter. Using unique_ptr helps ensure that memory is properly managed, even in the face of unexpected exceptions.

The key feature of unique_ptr is that, when you use it properly, exactly one unique_ptr object owns a particular pointer. You can move unique_ptr objects, and each time you do, the target of the move becomes the new owner of the pointer, leaving the source holding a null pointer.
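The following sketch traces one pointer through two changes of ownership. The function names consume and move_demo are invented for this illustration; only std::unique_ptr and std::move come from the standard library.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical helper: takes ownership of the int and returns its value.
// The int is deleted when owner goes out of scope at the end of the call.
int consume(std::unique_ptr<int> owner)
{
  return *owner;
}

// Moves a pointer through two owners.
// Returns true if ownership ended up where the comments say it should.
bool move_demo()
{
  std::unique_ptr<int> first{new int{42}};
  int* raw{first.get()};                          // remember the address, for checking only

  std::unique_ptr<int> second{std::move(first)};  // second is now the owner
  assert(first == nullptr);                       // a moved-from unique_ptr is null
  bool same_object{second.get() == raw};          // the int itself never moved

  int value{consume(std::move(second))};          // the function parameter becomes the owner
  return same_object and second == nullptr and value == 42;
}
```

Passing a unique_ptr by value, as consume does, is one way to state in a function's signature that the function takes over ownership.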

You can also force a unique_ptr to give up ownership of its pointer by calling the release() member function. The release() function returns the raw pointer, and from then on you are responsible for deleting it, as shown in the following example:

std::unique_ptr<int> ap{new int{42}};
int* ip{ap.release()};
delete ip;

Call the reset member function to tell a unique_ptr to take over a different pointer. The unique_ptr object takes control of the new pointer and deletes its old pointer. With no argument, reset() sets the unique_ptr to a null pointer.

std::unique_ptr<int> ap{new int{42}};
ap.reset(new int{10}); // deletes the pointer to 42
ap.reset();            // deletes the pointer to 10

The get() member function retrieves the raw pointer without affecting the unique_ptr’s ownership. The overloaded dereference (*) and member access (->) operators work the way they do with ordinary pointers, and they do not affect ownership of the pointer either.

std::unique_ptr<rational> rp{new rational{420, 10}};
int n{rp->numerator()};
rational r{*rp};
rational *raw_ptr{rp.get()};

When unique_ptr holds a pointer to an array (that is, the template argument is an array type, e.g., unique_ptr<int[]>), it supports the subscript operator instead of * and ->.
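As a small illustration of the array form, the following sketch allocates an array, fills it through the subscript operator, and lets unique_ptr free it with delete[]. The function name sum_of_squares and the size limit in the assertion are choices made for this example only.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Fills a dynamically allocated array with 0, 1, 4, 9, ... and returns the sum.
int sum_of_squares(std::size_t count)
{
  assert(count <= 1000);                           // keep the sum well inside the range of int
  std::unique_ptr<int[]> squares{new int[count]};  // array form: freed with delete[]
  for (std::size_t i{0}; i != count; ++i)
    squares[i] = static_cast<int>(i * i);          // operator[] instead of * or ->

  int sum{0};
  for (std::size_t i{0}; i != count; ++i)
    sum += squares[i];
  return sum;
}                                                  // the array is deleted here
```

In most programs std::vector is the more convenient choice. The array form of unique_ptr is mainly useful when you must own an array that some other code allocated with new[].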

In order to enforce its ownership semantics, unique_ptr has a move constructor and move assignment operator but deletes its copy constructor and copy assignment operator. If you use unique_ptr for data members in a class, the compiler implicitly deletes the class’s copy constructor and copy assignment operator.

Thus, using unique_ptr may free you from thinking about your class’s destructor, but you are not excused from thinking about the constructors and assignment operators. This is a minor tweak to the guideline that if you have to deal with one, you must deal with all special members. The compiler’s default behavior is usually correct, but you might want to implement a copy constructor that performs a deep copy or other non-default behavior.
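One way to supply the missing copy operations is sketched below. The label class is hypothetical, invented only to show the pattern: the copy constructor and copy assignment operator make a deep copy, the move operations are requested explicitly, and the destructor is left to the compiler.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical class: a label that owns its text through a unique_ptr.
class label
{
public:
  explicit label(std::string text) : text_(new std::string(std::move(text))) {}

  // Copying: unique_ptr cannot be copied, so make a deep copy of the string.
  label(label const& other) : text_(clone(other)) {}
  label& operator=(label const& other)
  {
    if (this != &other)
      text_ = clone(other);          // the old string is deleted by the assignment
    return *this;
  }

  // Moving: declaring the copy operations suppresses the implicit move
  // operations, so ask for the defaults explicitly.
  label(label&&) = default;
  label& operator=(label&&) = default;

  // No destructor is needed: unique_ptr deletes the string.

  std::string const& text() const { return *text_; }
  void set_text(std::string text) { *text_ = std::move(text); }

private:
  static std::unique_ptr<std::string> clone(label const& other)
  {
    assert(other.text_ != nullptr);  // this sketch assumes no one copies a moved-from label
    return std::unique_ptr<std::string>(new std::string(*other.text_));
  }

  std::unique_ptr<std::string> text_;
};
```

After copying, the two label objects own separate strings, so changing one does not affect the other.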

Copyable Smart Pointers

Sometimes, you don’t want exclusive ownership. There are circumstances when multiple objects will share ownership of a pointer. When no objects own the pointer, the memory is automatically reclaimed. The std::shared_ptr smart-pointer type implements shared ownership.

Once you deliver a pointer to a shared_ptr, the shared_ptr object owns that pointer. When the shared_ptr object is destroyed, it will delete the pointer. The difference between shared_ptr and unique_ptr is that you can freely copy and assign shared_ptr objects with normal semantics. Unlike unique_ptr, shared_ptr has a copy constructor and copy assignment operator. The shared_ptr object keeps a reference count, so assignment merely increments the reference count, without having to transfer ownership. When a shared_ptr object is destroyed, it decrements the reference count. When the count reaches zero, the pointer is deleted. Thus, you can make as many copies as you like, store shared_ptr objects in a container, pass them to functions, return them from functions, copy them, move them, assign them, and carry on to your heart’s content. It’s that simple. Listing 63-1 shows that copying shared_ptr works in ways that don’t work with unique_ptr.

Listing 63-1.  Working with shared_ptr

#include <iostream>
#include <memory>
#include <vector>
 
class see_me
{
public:
  see_me(int x) : x_{x} { std::cout <<  "see_me(" << x_ << ") "; }
  ~see_me()             { std::cout << "~see_me(" << x_ << ") "; }
  int value() const     { return x_; }
private:
  int x_;
};
 
std::shared_ptr<see_me> does_this_work(std::shared_ptr<see_me> x)
{
  std::shared_ptr<see_me> y{x};
  return y;
}
 
int main()
{
  std::shared_ptr<see_me> a{}, b{};
  a = std::make_shared<see_me>(42);
  b = does_this_work(a);
  std::vector<std::shared_ptr<see_me>> v{};
  v.push_back(a);
  v.push_back(b);
}

The best way to create a shared_ptr is to call make_shared. The template argument is the type you want to create, and the function arguments are passed directly to the constructor. Due to implementation details, constructing a new shared_ptr instance any other way is slightly less efficient in space and time.
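To watch the reference count change, you can call the use_count() member function. The following sketch records the count at each step; count_owners is a name made up for this example, and use_count() is meant here as a learning aid rather than something to build program logic on.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Returns the reference count observed at each step.
std::vector<long> count_owners()
{
  std::vector<long> counts{};

  std::shared_ptr<int> a{std::make_shared<int>(42)};
  counts.push_back(a.use_count());        // a is the only owner

  {
    std::shared_ptr<int> b{a};            // copying adds an owner
    counts.push_back(a.use_count());      // a and b share ownership
    assert(a.get() == b.get());           // both point to the same int
  }                                       // b is destroyed here

  counts.push_back(a.use_count());        // back to a alone

  std::shared_ptr<int> c{std::move(a)};   // moving transfers ownership
  counts.push_back(c.use_count());        // c replaced a; the count did not grow
  return counts;
}
```

Copying raises the count and destroying a copy lowers it, whereas moving hands over the existing share of ownership without touching the count.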

Using shared_ptr, you can reimplement the program from Listing 58-5. The old program used the artifact map to manage the lifetime of all artifacts. Although convenient, there is no reason to tie artifacts to this map, because the map is used only for parsing. In a real program, most of its work lies in the actual building of targets, not parsing the input. All the parsing objects should be freed and long gone by the time the program is building targets.

Rewrite the artifact-lookup portion of Listing 58-5 to allocate artifact objects dynamically, using shared_ptr throughout to refer to artifact pointers. See Listing 63-2 for my solution.

Listing 63-2.  Using Smart Pointers to Manage Artifacts

std::map<std::string, std::shared_ptr<artifact>> artifacts;
 
std::shared_ptr<artifact>
lookup_artifact(std::string const& name)
{
  std::shared_ptr<artifact> a{artifacts[name]};
  if (a.get() == nullptr)
  {
    a = std::make_shared<artifact>(name);
    artifacts[name] = a;
  }
  return a;
}

With a little more care, you could use unique_ptr instead of shared_ptr, but that results in greater changes to the rest of the code. Prefer unique_ptr to shared_ptr whenever exclusive ownership is enough, because shared_ptr carries the overhead of maintaining the reference count. But if you require shared ownership, shared_ptr is your choice. In either case, there is no reason to manage dynamically allocated memory with a raw pointer instead of a smart pointer.
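For comparison, here is a rough sketch of what a unique_ptr version of the lookup might look like. It assumes the map is the sole owner and that callers are content with a reference. The artifact class below is a stand-in reduced to a name, not the real class from the earlier listings.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Stand-in for the real artifact class, reduced to a name for this sketch.
class artifact
{
public:
  explicit artifact(std::string name) : name_(std::move(name)) {}
  std::string const& name() const { return name_; }
private:
  std::string name_;
};

std::map<std::string, std::unique_ptr<artifact>> artifacts;

// The map owns every artifact. Callers get a reference, which is valid
// only as long as the map entry exists.
artifact& lookup_artifact(std::string const& name)
{
  std::unique_ptr<artifact>& owner(artifacts[name]);  // null if name is new
  if (owner == nullptr)
    owner.reset(new artifact(name));
  assert(owner->name() == name);
  return *owner;
}
```

The cost of this design is that every user of lookup_artifact must be sure the map outlives the references, which is exactly the coupling that shared_ptr removes.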

Smart Arrays

Recall from Exploration 61 that allocating a single object is completely different from allocating an array of objects. Thus, smart pointers must also distinguish between a smart pointer to a single object and a smart pointer to an array of objects. In the C++ standard, the distinction is well-defined: unique_ptr has separate specializations for scalars and arrays. On the other hand, in C++ 11, shared_ptr works only with single objects by default. (C++ 17 later added shared_ptr<T[]>, which uses delete[] automatically.) To work with arrays in C++ 11, you have to provide a second argument to the constructor: std::default_delete<T[]>(). The second argument tells the shared_ptr how to delete its pointer. The standard library provides std::default_delete<T[]> to delete a pointer using delete[]. For example:

std::shared_ptr<int> array_ptr{ new int[10], std::default_delete<int[]>{} };
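Because a shared_ptr<int> created this way has no subscript operator, you reach the elements through get(). The following sketch wraps that in two helper functions; make_filled_array and sum_array are names invented for this example, and the caller must keep track of the array size separately.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <numeric>

// Allocates count ints, sets each to value, and returns a shared owner.
std::shared_ptr<int> make_filled_array(std::size_t count, int value)
{
  std::shared_ptr<int> array_ptr{ new int[count], std::default_delete<int[]>{} };
  std::fill(array_ptr.get(), array_ptr.get() + count, value);
  return array_ptr;
}

// Adds up the first count elements of the shared array.
int sum_array(std::shared_ptr<int> const& array_ptr, std::size_t count)
{
  assert(array_ptr != nullptr);
  int const* data{array_ptr.get()};   // index through the raw pointer
  return std::accumulate(data, data + count, 0);
}
```

The deleter travels with the shared_ptr, so every copy knows to use delete[] when the last owner goes away.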

Pimpls

No, that’s not a spelling error. Although programmers have spoken for years about pimples and warts in their programs, often referring to unsightly but unavoidable bits of code, Herb Sutter associated the phrase pointer-to-implementation with these pimples to come up with the pimpl idiom.

In short, a pimpl is a class that hides implementation details in an implementation class, and the public interface object holds only a pointer to that implementation object. Instead of forcing the user of your class to allocate and de-allocate objects, manage pointers, and keep track of object lifetimes, you can expose a class that is easier to use. Specifically, the user can treat instances of the class as values, in the manner of int and other built-in types.

The pimpl wrapper manages the lifetime of the pimpl object. It typically implements the special member functions: copy and move constructors, copy and move assignment operators, and destructor. It delegates most of its other member functions to the pimpl object. The user of the wrapper never has to be concerned with any of this.

Thus, we will rewrite the artifact class so that it wraps a pimpl—that is, a pointer to an artifact_impl class. The artifact_impl class will do the real work, and artifact will merely forward all functions through its pimpl. The language feature that makes pimpls possible is declaring a class name without providing a definition of the class, as illustrated by the following:

class artifact_impl;

This class declaration, often called a forward declaration, informs the compiler that artifact_impl is the name of a class. The declaration doesn’t provide the compiler with anything more about the class, so the class type is incomplete. You face a number of restrictions on what you can do with an incomplete type. In particular, you cannot define any objects or data members of that type, nor can you define or call a function that takes or returns the incomplete class by value. You cannot refer to any members of an incomplete class. But you can use pointers or references to the type when you define objects, data members, function parameters, and return types. In particular, you can use a pointer to artifact_impl in the artifact class.
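The following sketch shows what an incomplete type permits and forbids. The gadget_impl class and its helper functions are hypothetical. Raw pointers appear here only to keep the focus on incomplete types.

```cpp
#include <cassert>

class gadget_impl;                       // forward declaration: incomplete type

// Allowed with an incomplete type: pointers and references.
gadget_impl* create_gadget(int size);    // return a pointer
int gadget_size(gadget_impl const& g);   // take a reference
void destroy_gadget(gadget_impl* g);     // take a pointer

// Not allowed while gadget_impl is incomplete:
//   gadget_impl g;                      // error: cannot define an object
//   int n = sizeof(gadget_impl);        // error: the size is unknown
//   pointer->size();                    // error: the members are unknown

// The full definition completes the type. In a real project, it would
// live in a separate file.
class gadget_impl
{
public:
  explicit gadget_impl(int size) : size_{size} {}
  int size() const { return size_; }
private:
  int size_;
};

gadget_impl* create_gadget(int size)
{
  assert(size >= 0);
  return new gadget_impl{size};
}
int gadget_size(gadget_impl const& g) { return g.size(); }
void destroy_gadget(gadget_impl* g)   { delete g; }
```

Code that sees only the first half of this sketch can pass gadget_impl pointers around freely, but it cannot look inside them.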

A normal class definition is a complete type definition. You can mix forward declarations with a class definition of the same class name. A common pattern is for a header, such as artifact.hpp, to contain only a forward declaration; a separate file then fills in the complete class definition.

The definition of the artifact class, therefore, can have a data member that is a pointer to the artifact_impl class, or even a smart pointer to artifact_impl, even though the compiler knows only that artifact_impl is a class but doesn’t know any details about it. This means the artifact.hpp header file is independent of the implementation of artifact_impl. The implementation details are tucked away in a separate file, and the rest of your program can make use of the artifact class completely insulated from artifact_impl. In large projects, this kind of barrier is tremendously important.
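Before tackling artifact, it may help to see the whole idiom in miniature. The sketch below squeezes the header part and the source-file part into one block; note and note_impl are hypothetical names, and the use of shared_ptr is one design choice among several.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// ---- What would go in the header ----
class note_impl;                        // an incomplete type is enough here

class note
{
public:
  explicit note(std::string text);
  std::string const& text() const;
  void append(std::string const& more);
private:
  std::shared_ptr<note_impl> pimpl_;    // the only data member
};

// ---- What would go in the source file ----
class note_impl
{
public:
  explicit note_impl(std::string text) : text_(std::move(text)) {}
  std::string const& text() const { return text_; }
  void append(std::string const& more) { text_ += more; }
private:
  std::string text_;
};

note::note(std::string text)
: pimpl_{std::make_shared<note_impl>(std::move(text))}
{}

std::string const& note::text() const { return pimpl_->text(); }

void note::append(std::string const& more)
{
  assert(pimpl_ != nullptr);            // a moved-from note has no implementation
  pimpl_->append(more);
}
```

Because pimpl_ is a shared_ptr, copying a note copies only the pointer, so the copies share one note_impl, and a change made through one copy is visible through the others. If you need independent copies, the wrapper has to implement a deep copy instead.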

Writing the artifact.hpp header is not difficult. Start with a forward declaration of artifact_impl. In the artifact class, the declarations of the member functions are the same as in the original class. Change the data members to a single pointer to artifact_impl. Finally, overload operator< for two artifact objects. Implement the comparison by comparing names. Read Listing 63-3 to see one possible implementation of this class.

Listing 63-3.  Defining an artifact Pimpl Wrapper Class

#ifndef ARTIFACT_HPP_
#define ARTIFACT_HPP_
 
#include <chrono>
#include <memory>
#include <string>
 
class artifact_impl;
 
class artifact
{
public:
  typedef std::chrono::system_clock clock;
  artifact();
  artifact(std::string const& name);
  artifact(artifact const&) = default;
  ~artifact() = default;
  artifact& operator=(artifact const&) = default;
 
  std::string const& name()     const;
  clock::time_point  mod_time() const;
  std::string        expand(std::string str) const;
 
  void build();
  clock::time_point get_mod_time();
 
  void store_variable(std::string const& name, std::string const& value);
 
private:
  std::shared_ptr<artifact_impl> pimpl_;
};
 
inline bool operator<(artifact const& a, artifact const& b)
{
  return a.name() < b.name();
}
 
#endif // ARTIFACT_HPP_

The header defines the artifact class without any mention of artifact_impl, except for the forward declaration and the pimpl_ data member.

The next step is to write the source file, artifact.cpp. This is where the compiler needs the full definition of the artifact_impl class, thus making artifact_impl a complete class, so include the artifact_impl.hpp header. The artifact class doesn’t do much on its own. Instead, it just delegates every action to the artifact_impl class. See the details in Listing 63-4.

Listing 63-4.  Implementing the artifact Class

#include "artifact.hpp"
#include "artifact_impl.hpp"
 
artifact::artifact() : pimpl_{std::make_shared<artifact_impl>()} {}
 
artifact::artifact(std::string const& name)
: pimpl_(std::make_shared<artifact_impl>(name))
{}
 
std::string const& artifact::name()
const
{
   return pimpl_->name();
}
 
artifact::clock::time_point artifact::mod_time()
const
{
   return pimpl_->mod_time();
}
 
std::string artifact::expand(std::string str)
const
{
   return pimpl_->expand(str);
}
 
void artifact::build()
{
   pimpl_->build();
}
 
artifact::clock::time_point artifact::get_mod_time()
{
   return pimpl_->get_mod_time();
}
 
void artifact::store_variable(std::string const& name, std::string const& value)
{
    pimpl_->store_variable(name, value);
}

You define the artifact_impl class in the artifact_impl.hpp header. This class looks nearly identical to the original artifact class. Listing 63-5 shows the artifact_impl class definition.

Listing 63-5.  Defining the Artifact Implementation Class

#ifndef ARTIFACT_IMPL_HPP_
#define ARTIFACT_IMPL_HPP_
 
#include <cstdlib>
#include <chrono>
#include <memory>
#include <string>
#include "variables.hpp"
 
class artifact_impl
{
public:
  typedef std::chrono::system_clock clock;
  artifact_impl();
  artifact_impl(std::string const& name);
  artifact_impl(artifact_impl&&) = default;
  artifact_impl(artifact_impl const&) = delete;
  ~artifact_impl() = default;
  artifact_impl& operator=(artifact_impl&&) = default;
  artifact_impl& operator=(artifact_impl const&) = delete;
 
  std::string const& name()     const { return name_; }
  clock::time_point  mod_time() const { return mod_time_; }
 
  std::string        expand(std::string str) const;
  void               build();
  clock::time_point  get_mod_time();
  void store_variable(std::string const& name, std::string const& value);
private:
  std::string name_;
  clock::time_point mod_time_;
  std::unique_ptr<variable_map> variables_;
};
 
#endif // ARTIFACT_IMPL_HPP_

The artifact_impl class is unsurprising. The implementation is just like the old artifact implementation, except the variables_ data member is now managed by unique_ptr instead of explicit code. That means the compiler writes the move constructor, move assignment operator, and destructor for you.

Now it’s time to rewrite the lookup_artifact function yet again. Rewrite Listings 59-4, 59-8, and 63-4 to use the new artifact class. This time, the artifacts map stores artifact objects directly. The dependency_graph class will also have to use artifact instead of artifact*. See Listing 63-6 for one way to rewrite the program.

Listing 63-6.  Rewriting the Program to Use the New artifact Value Class

#include <chrono>
#include <cstdlib>
#include <iostream>
#include <iterator>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>
 
#include "artifact.hpp"
#include "depgraph.hpp"  // Listing 58-5
#include "variables.hpp" // Listing 59-6
 
void parse_graph(std::istream& in, dependency_graph& graph)
{
  std::map<std::string, artifact> artifacts{};
  std::string line{};
  while (std::getline(in, line))
  {
    std::string target_name{}, dependency_name{};
    std::istringstream stream{line};
    if (stream >> target_name >> dependency_name)
    {
      artifact target{artifacts[expand(target_name, 0)]};
      std::string::size_type equal{dependency_name.find('=')};
      if (equal == std::string::npos)
      {
        // It's a dependency specification
        artifact dependency{artifacts[target.expand(dependency_name)]};
        graph.store_dependency(target, dependency);
      }
      else
        // It's a target-specific variable
        target.store_variable(dependency_name.substr(0, equal),
                              dependency_name.substr(equal+1));
    }
    else if (not target_name.empty())
    {
      std::string::size_type equal{target_name.find('=')};
      if (equal == std::string::npos)
        // Input line has a target with no dependency,
        // so report an error.
        std::cerr << "malformed input: target, " << target_name <<
                     ", must be followed by a dependency name\n";
      else
        global_variables[target_name.substr(0, equal)] =
                                          target_name.substr(equal+1);
    }
    // else ignore blank lines
  }
}
 
int main()
{
  dependency_graph graph{};
 
  parse_graph(std::cin, graph);
 
  try {
    // Get the sorted artifacts in reverse order.
    std::vector<artifact> sorted{};
    graph.sort(std::back_inserter(sorted));
 
    // Then print the artifacts in the correct order.
    for (auto it(sorted.rbegin()), end(sorted.rend());
         it != end;
         ++it)
    {
      std::cout << it->name() << ' ';
    }
  } catch (std::runtime_error const& ex) {
    std::cerr << ex.what() << '\n';
    return EXIT_FAILURE;
  }
}

As you can see, the code that uses artifact objects is simpler and easier to read. The complexity of managing pointers is pushed down into the artifact and artifact_impl classes. In this manner, the complexity is kept contained in one place and not spread throughout the application. Because the code that uses artifact is now simpler, it is less likely to contain errors. Because the complexity is localized, it is easier to review and test thoroughly. The cost is a little more development time, to write two classes instead of one, and a little more maintenance effort, because anytime a new function is needed in the artifact public interface, that function must also be added to artifact_impl. In many, many situations, the benefits far outweigh the costs, which is why this idiom is so popular.

The new artifact class is easy to use, because you can use it the same way you use an int. That is, you can copy it, assign it, store it in a container, etc., without concern about the size of an artifact object or the cost of copying it. Instead of treating an artifact as a big, fat object, or as a dangerous pointer, you can treat it as a value. Defining a class with value semantics makes it easy to use. Although it was more work to implement, the value artifact is the easiest incarnation to use for writing the application.

Iterators

Perhaps you’ve noticed the similarity between iterator syntax and pointer syntax. The C++ committee deliberately designed iterators to mimic pointers. Indeed, a pointer meets all the requirements of a random-access iterator, so you can use all the standard algorithms with a C-style array, as follows:

int data[4];
std::fill(data, data + 4, 42);

Thus, iterators are a form of smart pointer. Iterators are especially smart, because they come in five distinct flavors (see Exploration 44 for a reminder). Random-access iterators are just like pointers; other kinds of iterators have less functionality, so they are smart by being dumb.
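To see the equivalence in action, the following sketch calls the same two standard algorithms once with plain pointers and once with vector iterators. The function name sort_and_sum is invented for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Sorts an array in place and returns the sum of its elements, using
// plain pointers wherever the algorithms expect iterators.
int sort_and_sum(int* data, std::size_t size)
{
  assert(data != nullptr or size == 0);
  int* first{data};
  int* last{data + size};               // one past the end, like end()
  std::sort(first, last);               // std::sort needs random-access iterators
  return std::accumulate(first, last, 0);
}

// The same calls work unchanged with real iterators.
int sort_and_sum(std::vector<int>& data)
{
  std::sort(data.begin(), data.end());
  return std::accumulate(data.begin(), data.end(), 0);
}
```

The algorithms neither know nor care which kind of iterator they receive, as long as it supports the operations they need.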

Iterators can be just as dangerous as pointers. In their pure form, iterators are nearly as unchecked, wild, and raw as pointers. After all, iterators do not prevent you from advancing too far, from dereferencing an uninitialized iterator, from comparing iterators that point to different containers, etc. The list of unsafe practices with iterators is quite extensive.

Because these errors result in undefined behavior, a library implementer is free to choose any result for each kind of error. In the interest of performance, most libraries do not implement additional safety checks and push that back on the programmer, who can decide on his or her preference for a safety/performance trade-off.

If the programmer prefers safety to performance, some library implementations offer a debugging version that implements a number of safety checks. The debugging version of the standard library can check that iterators refer to the same container when comparing the iterators and throw an exception if they do not. An iterator is allowed to check that it is valid before honoring the dereference (*) operator. An iterator can ensure that it does not advance past the end of a container.
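To get a feel for what such checks involve, consider the following toy iterator for std::vector<int>. It is a sketch written for this discussion, not the way any particular library implements its debugging mode, and the checked_iterator name and its choice of exceptions are assumptions.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical checked iterator that remembers its container.
class checked_iterator
{
public:
  checked_iterator(std::vector<int>& container, std::vector<int>::size_type index)
  : container_{&container}, index_{index}
  {
    assert(index <= container.size());  // may start at the end, but not beyond it
  }

  // Check validity before honoring the dereference operator.
  int& operator*() const
  {
    if (index_ >= container_->size())
      throw std::out_of_range("dereferencing past the end");
    return (*container_)[index_];
  }

  // Refuse to advance past the end of the container.
  checked_iterator& operator++()
  {
    if (index_ >= container_->size())
      throw std::out_of_range("advancing past the end");
    ++index_;
    return *this;
  }

  // Refuse to compare iterators that refer to different containers.
  bool operator==(checked_iterator const& other) const
  {
    if (container_ != other.container_)
      throw std::logic_error("comparing iterators from different containers");
    return index_ == other.index_;
  }
  bool operator!=(checked_iterator const& other) const
  {
    return not (*this == other);
  }

private:
  std::vector<int>* container_;
  std::vector<int>::size_type index_;
};
```

Every operation pays for an extra comparison, and every iterator carries an extra pointer, which is the safety/performance trade-off in miniature.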

Thus, iterators are smart pointers, because they can be really, really smart. I highly recommend that you take full advantage of all safety features that your standard library offers. Remove checks one by one only after you have measured the performance of your program and found that one particular check degrades performance significantly, and you have the reviews and tests in place to give you confidence in the less safe code.

This completes your tour of pointers and memory. The next topic gets down into the bits and bytes of C++.
