Chapter 30. The Big Three

FAQ 30.01 What is the purpose of this chapter?

The purpose of this chapter is to show you how to eliminate a nasty category of bugs from your software. The bugs discussed in this chapter are quite subtle—the compiler normally does not give any warning or error messages—and disastrous, often causing the application to crash or behave chaotically.

The specific details involve three infrastructure routines that the C++ compiler automatically defines when the developer leaves them undefined. A guideline is provided so readers can tell when those automatic definitions will cause problems and when they won't cause problems.

It is essential that every C++ programmer understand the material in this chapter.

FAQ 30.02 What are the Big Three?

Destructor, copy constructor, and assignment operator.

These infrastructure routines provide the death and copy semantics for objects of the class. Here is some sample syntax:

image
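A minimal sketch of the three declarations, assuming a hypothetical class Fred with a single int member (the member, value(), setValue(), and copyAndAssign() are illustrative names, not taken from the original listing):

```cpp
#include <cassert>

// Hypothetical class used only to show the declaration syntax.
class Fred {
public:
    Fred() : value_(0) { }                  // ordinary constructor (not one of the Big Three)
    ~Fred() { }                             // destructor
    Fred(const Fred& other)                 // copy constructor
        : value_(other.value_) { }
    Fred& operator=(const Fred& other)      // assignment operator
    {
        value_ = other.value_;
        return *this;
    }
    int value() const { return value_; }
    void setValue(int v) { value_ = v; }
private:
    int value_;
};

// Exercises each of the three routines once.
int copyAndAssign()
{
    Fred a;
    a.setValue(42);
    Fred b(a);                // copy constructor
    assert(b.value() == 42);
    Fred c;
    c = b;                    // assignment operator
    return c.value();
}                             // destructors run here for c, b, and a
```

For a class this simple the explicit versions do exactly what the compiler would have synthesized; they are written out here only to show the syntax.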

FAQ 30.03 What happens when an object is destroyed that doesn't have an explicit destructor?

The compiler synthesizes a destructor for the object's class.

For example, if an object of class Fred is destroyed and class Fred doesn't provide an explicit destructor, the compiler synthesizes a destructor that destroys all the Fred object's member objects and base class subobjects. This is called memberwise destruction. Thus, if class Fred doesn't have an explicit destructor and an object of class Fred contains an object of class Member that has an explicit destructor, then the compiler's synthesized Fred::~Fred() invokes Member's destructor.

The built-in types (int, float, void*, and so on) can be regarded as having destructors that do nothing.

image

The compiler's synthesized Fred::~Fred() calls Member::~Member() automatically, so the output is

before destructing a Fred
destructing a Member object
after destructing a Fred
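A program along the following lines would produce that sequence of messages. This is a sketch only: it assumes that Fred holds a single Member, and it collects the messages in a string named trace (an illustrative device) instead of printing them, so the order can be checked.

```cpp
#include <cassert>
#include <string>

std::string trace;   // collects messages so the order of events can be inspected

class Member {
public:
    ~Member() { trace += "destructing a Member object\n"; }
};

class Fred {
    // No explicit destructor: the compiler synthesizes Fred::~Fred(),
    // which destroys member_.
    Member member_;
};

std::string destroyFred()
{
    trace.clear();
    Fred* p = new Fred();
    trace += "before destructing a Fred\n";
    delete p;                               // synthesized Fred::~Fred() runs here
    trace += "after destructing a Fred\n";
    return trace;
}
```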

FAQ 30.04 What happens if an object is copied but doesn't have an explicit copy constructor?

The compiler synthesizes a copy constructor for the object's class.

For example, if an object of class Fred is copied and class Fred doesn't provide an explicit copy constructor, the compiler synthesizes a copy constructor that copy constructs all the Fred object's member objects and base class subobjects. This is called memberwise copy construction. Thus, if class Fred doesn't have an explicit copy constructor, and an object of class Fred contains an object of class Member that has an explicit copy constructor, then the compiler's synthesized Fred::Fred(const Fred&) invokes Member's copy constructor.

Built-in types (int, float, void*, and so on) can be viewed as having copy constructors that do a bitwise copy.

image

The compiler's synthesized Fred::Fred(const Fred&) calls Member::Member(const Member&) automatically, so the output is

constructing a Member
copying a Member
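As a sketch of how that output can arise, assume again that Fred holds one Member and that the messages are collected in a string named trace (both are assumptions for illustration):

```cpp
#include <cassert>
#include <string>

std::string trace;   // collects messages so the order of events can be inspected

class Member {
public:
    Member() { trace += "constructing a Member\n"; }
    Member(const Member&) { trace += "copying a Member\n"; }
};

class Fred {
    // No explicit copy constructor: the compiler synthesizes
    // Fred::Fred(const Fred&), which copy constructs member_.
    Member member_;
};

std::string copyFred()
{
    trace.clear();
    Fred a;        // calls Member::Member()
    Fred b = a;    // synthesized copy constructor calls Member::Member(const Member&)
    (void)b;       // b is otherwise unused
    return trace;
}
```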

FAQ 30.05 What happens when an object that doesn't have an explicit assignment operator is assigned?

The compiler synthesizes an assignment operator for the object's class.

For example, if an object of class Fred is assigned and class Fred doesn't provide an explicit assignment operator, the compiler synthesizes an assignment operator that assigns all the Fred object's member objects and base class subobjects. This is called memberwise assignment. Thus, if class Fred doesn't have an explicit assignment operator and an object of class Fred contains an object of class Member that has an explicit assignment operator, then the compiler's synthesized Fred::operator= (const Fred&) invokes Member's assignment operator.

Built-in types (int, float, void*, and so on) can be viewed as having assignment operators that do a bitwise copy.

image

The compiler's synthesized Fred::operator= (const Fred&) calls Member::operator= (const Member&), so the output is

constructing a Member
constructing a Member
assigning a Member
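A sketch that behaves this way, under the same assumptions as before (Fred holds one Member; messages go into an illustrative string named trace):

```cpp
#include <cassert>
#include <string>

std::string trace;   // collects messages so the order of events can be inspected

class Member {
public:
    Member() { trace += "constructing a Member\n"; }
    Member& operator=(const Member&)
    {
        trace += "assigning a Member\n";
        return *this;
    }
};

class Fred {
    // No explicit assignment operator: the compiler synthesizes
    // Fred::operator=(const Fred&), which assigns member_.
    Member member_;
};

std::string assignFred()
{
    trace.clear();
    Fred a;     // constructs a's Member
    Fred b;     // constructs b's Member
    b = a;      // synthesized assignment calls Member::operator=
    return trace;
}
```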

FAQ 30.06 What is the Law of the Big Three?

If a class needs any of the Big Three, it needs them all.

This doesn't mean that every class should have all three of the Big Three. On the contrary, the Big Three are needed only in a relatively small percentage of classes. That is one of the reasons this is such an insidious error. Programmers see these infrastructure routines in only some of their classes, so they don't remember the critical Law of the Big Three.

This law first appeared in 1991 in the comp.lang.c++ FAQ, and it seems to be rediscovered every six months or so. Violations almost always lead to incorrect behavior and often lead to disasters.

In particular, violations of the Law of the Big Three often corrupt the heap. This usually means that the program does not crash until much later in the program's execution (and simple test programs may not crash at all). By the time the programmer goes in with a debugger, the root cause is hard to identify and the heap has so many things wrong with it that it's difficult to trace what's going wrong.

FAQ 30.07 Which of the Big Three usually shows up first?

An explicit destructor.

Developers typically discover the need to do something special during a normal constructor, which frequently necessitates undoing the special action in the destructor. In almost all cases, the class needs a copy constructor so that the special thing will be done during copying, and the class also needs an assignment operator so that the special thing will be done during assignment.

The destructor is the signal for applying the Law. Pretend that the keyboard's ~ (tilde) key is painted bright red and is wired up to a siren.

In the following example, the constructor of class MyString allocates memory, so its destructor deletes the memory. Typing the ~ of ~MyString() should sound a siren for the Law of the Big Three.

image
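One possible shape for such a class is sketched below, with all of the Big Three filled in. The member names (len_, data_), the accessors, and copiesAreIndependent() are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstring>

class MyString {
public:
    MyString(const char* s = "")
        : len_(std::strlen(s)), data_(new char[len_ + 1])
    { std::memcpy(data_, s, len_ + 1); }

    ~MyString() { delete[] data_; }                     // the siren goes off here

    MyString(const MyString& other)                     // deep copy
        : len_(other.len_), data_(new char[other.len_ + 1])
    { std::memcpy(data_, other.data_, len_ + 1); }

    MyString& operator=(const MyString& other)          // deep assignment
    {
        if (this != &other) {
            char* fresh = new char[other.len_ + 1];     // allocate before releasing
            std::memcpy(fresh, other.data_, other.len_ + 1);
            delete[] data_;
            data_ = fresh;
            len_ = other.len_;
        }
        return *this;
    }

    const char* c_str() const { return data_; }
    std::size_t size() const { return len_; }
private:
    std::size_t len_;   // declared before data_ so it is initialized first
    char* data_;        // owned array, including the terminating '\0'
};

// True if copies hold the same text in separately allocated arrays.
bool copiesAreIndependent()
{
    MyString a("hello");
    MyString b(a);      // copy constructor
    MyString c;
    c = a;              // assignment operator
    return a.c_str() != b.c_str()
        && a.c_str() != c.c_str()
        && std::strcmp(b.c_str(), "hello") == 0
        && std::strcmp(c.c_str(), "hello") == 0;
}
```

Allocating the new array before deleting the old one in operator= is a design choice of this sketch: if new throws, the object is left unchanged.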

Classes that own allocated memory (hash tables, linked lists, and so forth) generally need the Big Three (see FAQ 30.08).

FAQ 30.08 What is remote ownership?

Remote ownership is the responsibility that comes with being the owner of something allocated from the heap.

When an object is the logical owner of something allocated from the heap (known as the referent), the object is said to have remote ownership. That is, the object owns the referent. When an object has remote ownership, it usually means that the object is responsible for deleting the referent.

Any time a pointer is added to an object's member data, the class's author should immediately determine whether the object owns the referent (that is, whether the object has remote ownership). If this determination is delayed, the class's implementation can become schizophrenic—some of the object's member functions assume that the object owns the referent, others assume that someone else owns the referent. This is usually a mess and sometimes a disaster.

FAQ 30.09 How is remote ownership special?

It requires a deep copy, not a shallow copy.

When an object has remote ownership, the object needs the Big Three (destructor, copy constructor, and assignment operator). These routines are responsible for destroying the referent, creating a copy of the referent, and assigning the referent, respectively.

The copy semantics for remote ownership require the referent to be copied (a.k.a. deep copy) rather than just the pointer (a.k.a. shallow copy). For example, if class MyString has a pointer to an array of characters, copying the MyString object should copy the array. It is not sufficient to simply copy the pointer to the array, since that would result in two objects that both think they are responsible for deleting the same array.

When an object contains pointers for which it does not have remote ownership, the copy semantics are usually straightforward: the copy operation merely copies the pointer. For example, an iterator object might have a pointer to a node of a linked list, but the node is owned by the list rather than by the iterator, so copying an iterator merely needs to copy the pointer; the data in the node is not copied to the new iterator.

When an object doesn't contain pointers, the copy semantics are usually straightforward: the corresponding copy operation is called on each member object. This is what the compiler does automatically if the class doesn't have any copy operations (see FAQs 30.04, 30.05), which is why the Big Three are not usually needed in these cases.
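The iterator case can be sketched as follows. Node, ListIterator, sumFrom(), and sumTwice() are hypothetical names; the point is only that a pointer without remote ownership is handled correctly by the compiler-synthesized copy operations.

```cpp
#include <cassert>

struct Node {
    int value;
    Node* next;
};

// Hypothetical iterator: it points at a node but does not own it, so the
// compiler-synthesized copy operations (which copy the pointer) are correct.
class ListIterator {
public:
    explicit ListIterator(Node* node) : node_(node) { }
    int& operator*() const { return node_->value; }
    ListIterator& operator++() { node_ = node_->next; return *this; }
    bool atEnd() const { return node_ == nullptr; }
private:
    Node* node_;   // not owned: none of the Big Three is needed
};

int sumFrom(ListIterator it)   // pass by value copies only the pointer
{
    int total = 0;
    for (; !it.atEnd(); ++it)
        total += *it;
    return total;
}

int sumTwice()
{
    Node third  = { 3, nullptr };
    Node second = { 2, &third };
    Node first  = { 1, &second };
    ListIterator begin(&first);
    int once  = sumFrom(begin);
    int again = sumFrom(begin);   // the original iterator was not disturbed
    assert(once == again);
    return once + again;
}
```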

FAQ 30.10 What if a class owns a referent and doesn't have all of the Big Three?

Trouble is brewing.

The following EvilString class doesn't have an explicit copy constructor or assignment operator, so the compiler synthesizes a copy constructor and/or assignment operator when it sees an EvilString being copy initialized and/or assigned, respectively. Unfortunately, the compiler-synthesized copy constructor and assignment operator copy only the pointer (shallow copy) rather than the referent (deep copy).

image

If an EvilString is copied (passed by value, for example; see FAQ 20.07), then the copy points to the same string data as the original. When the copy dies, the data they are sharing is deleted, leaving the original EvilString object with a dangling reference. Any use of the original object, including the implicit destruction when the original dies, will probably corrupt the heap, which will eventually crash the program.

image

Note that the problem is not with pass-by-value. The problem is that the copy constructor for class EvilString is broken. Similar comments can be made regarding the assignment operator.
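The shallow copy can be observed safely with a variation on the idea. In the sketch below, the hypothetical class ShallowString deliberately has no destructor, so the shared array is deleted exactly once by hand; adding a destructor that calls delete[], as an EvilString-style class would, turns the same copy into a double deletion.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical class with a pointer member and no explicit copy operations.
// It has no destructor, so this sketch can show the aliasing without
// deleting the array twice.
class ShallowString {
public:
    explicit ShallowString(const char* s)
        : data_(new char[std::strlen(s) + 1])
    { std::strcpy(data_, s); }
    char* data() { return data_; }
    void release() { delete[] data_; data_ = nullptr; }   // manual cleanup for the demo
private:
    char* data_;
};

bool copySharesTheArray()
{
    ShallowString a("abc");
    ShallowString b = a;      // synthesized copy constructor copies the pointer only
    b.data()[0] = 'X';        // writing through b ...
    bool aliased = (a.data() == b.data())
                && (a.data()[0] == 'X');   // ... is visible through a
    a.release();              // delete exactly once; b's pointer now dangles
    return aliased;           // b must not be used after this point
}
```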

FAQ 30.11 Are there any C++ classes that help manage remote ownership?

Yes, auto_ptr.

The standard template class auto_ptr<T> is a partial solution to managing remote ownership. auto_ptr<Fred> acts like a Fred*, except the referent (the Fred object) is automatically deleted when the auto_ptr dies. auto_ptr<T> is known as a managed pointer.

Managed pointers are useful whenever a referent is allocated by new and when the owner of the pointer owns the referent. In other words, auto_ptr<T> is useful for managing remote ownership.

The most important issue isn't that auto_ptr<T> saves the one line of delete code. The most important issue is that auto_ptr<T> handles exceptions properly: the referent is automagically deleted when an exception causes the auto_ptr<T> object to be destructed. In the following example, class Noisy throws exceptions randomly to simulate the fact that we can't always predict when an exception is going to be thrown (hopefully your classes don't have this property).

Here is a function that randomly returns true and false with 50–50 probability:

image

Here is a class that prints messages and possibly throws exceptions in its functions.

image

Here is a function that wisely chooses to use the managed pointer auto_ptr<Noisy>.

image

Here is the same function, but this time using a raw Noisy* pointer. Note how much more complex this code is, even though it is doing the same thing. A significant portion of this code exists solely to ensure that the referent is deleted properly, whereas in the previous example the managed pointer enabled most of this scaffolding to disappear.

image

Here is main() that repeatedly calls the foregoing routines.

image
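The contrast between the two functions can be sketched as shown below. Two things differ from the text and should be read as assumptions: the sketch uses std::unique_ptr rather than auto_ptr (auto_ptr was deprecated in C++11 and removed in C++17, so it may not compile with a current compiler), and Noisy throws on request rather than randomly so that the behavior is repeatable. The names liveCount, mayThrow(), useManaged(), useRaw(), and leakedAfterFailure() are illustrative.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

int liveCount = 0;   // number of Noisy objects currently alive

class Noisy {
public:
    Noisy()  { ++liveCount; }
    ~Noisy() { --liveCount; }
    void mayThrow(bool fail)
    {
        if (fail)
            throw std::runtime_error("Noisy failed");
    }
};

// Managed pointer: the referent is deleted whether or not an exception escapes.
void useManaged(bool fail)
{
    std::unique_ptr<Noisy> p(new Noisy());
    p->mayThrow(fail);
}

// Raw pointer: the same guarantee needs explicit try/catch scaffolding.
void useRaw(bool fail)
{
    Noisy* p = new Noisy();
    try {
        p->mayThrow(fail);
    } catch (...) {
        delete p;
        throw;
    }
    delete p;
}

// Returns the number of Noisy objects left alive after a failing call.
int leakedAfterFailure(void (*f)(bool))
{
    liveCount = 0;
    try {
        f(true);
    } catch (const std::runtime_error&) {
        // swallowed: only the cleanup behavior is of interest here
    }
    return liveCount;
}
```

Both functions avoid the leak, but only the raw-pointer version has to spell out the cleanup on every exit path.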

FAQ 30.12 Does auto_ptr enforce the Law of the Big Three and solve the problems associated with remote ownership?

No. auto_ptr<T> plugs leaks, but it doesn't enforce the Law of the Big Three.

When a class uses a plain T* to implement remote ownership, forgetting any of the Big Three causes the compiler to silently generate wrong code. The result is often a disaster at runtime.

Unfortunately, replacing the T* with a managed pointer such as auto_ptr<T> does not correct the problem. The root of the problem is that when an auto_ptr<T> is copied, ownership of the referent is transferred to the copy and the original object's auto_ptr<T> becomes NULL. This is often undesirable. What is needed instead is for the referent to be copied or for a compile-time error to be generated that flags the problem.

The safest solution is to define and use a strict auto_ptr<T>. For example, the following could go into file strict_auto_ptr.h and could be reused whenever anyone wanted a strict auto_ptr<T>. Note that the copy constructor and assignment operator are private: and are undefined, thus making it impossible to copy a strict_auto_ptr<T> object.

image

When strict_auto_ptr<T> is used, the compiler either synthesizes the Big Three correctly or causes specific, compile-time errors; it does not allow run-time disasters.

The following example shows a class that implements remote ownership by the managed pointer strict_auto_ptr<Noisy> rather than the plain pointer Noisy*.

image

Because strict_auto_ptr<Noisy>'s destructor deletes the referent, Fred doesn't need an explicit destructor. The Fred::~Fred() synthesized by the compiler is correct.

Because strict_auto_ptr<Noisy>'s copy constructor and assignment operator are private:, the compiler is prevented from synthesizing either the copy constructor or the assignment operator for class Fred. Copying or assigning a Fred produces a specific, compile-time error message. Compare this to using a Noisy*, in which case the compiler silently synthesizes the wrong code, producing disastrous results.

For example, when the GENERATE_ERROR symbol is #defined in the following function, the compiler gives an error message rather than silently doing the wrong thing.

image

strict_auto_ptr<T> effectively automates the proper delete and prevents the compiler from synthesizing improper copy operations. It plugs leaks and enforces the Law of the Big Three.
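The effect can be sketched without auto_ptr at all. StrictPtr below is a hypothetical stand-in for strict_auto_ptr<T>, written from scratch so that it compiles with current compilers; the C++11 type traits are used only to confirm, at compile time, that Fred cannot be copied or assigned.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for strict_auto_ptr<T>: owns the referent and
// cannot be copied or assigned.
template <typename T>
class StrictPtr {
public:
    explicit StrictPtr(T* p = nullptr) : p_(p) { }
    ~StrictPtr() { delete p_; }
    T* operator->() const { return p_; }
    T& operator*() const { return *p_; }
private:
    StrictPtr(const StrictPtr&);              // declared private, never defined
    StrictPtr& operator=(const StrictPtr&);   // declared private, never defined
    T* p_;
};

struct Noisy {
    int value;
    Noisy() : value(7) { }
};

class Fred {
public:
    Fred() : ptr_(new Noisy()) { }
    int value() const { return ptr_->value; }
    // No explicit destructor: the synthesized one destroys ptr_,
    // which deletes the Noisy object.
private:
    StrictPtr<Noisy> ptr_;
};

// Because ptr_ cannot be copied, the compiler cannot synthesize copy
// operations for Fred; an attempted copy is a compile-time error.
static_assert(!std::is_copy_constructible<Fred>::value, "Fred must not be copyable");
static_assert(!std::is_copy_assignable<Fred>::value, "Fred must not be assignable");

int fredValue()
{
    Fred f;
    return f.value();
}   // f's Noisy object is deleted here
```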

FAQ 30.13 Are there cases where one or two of the Big Three may be needed but not all three?

Yes, but define them all anyway.

There are cases where one or two of the Big Three may be needed but not all three. All three should usually be defined anyway so that people don't have to think so hard during code reviews and maintenance activities. Here are four common times when this happens: virtual destructors, protected: assignment operators, recording creation or destruction, and unnecessary or illogical copy operations.

Virtual destructors: A base class often has a virtual destructor to ensure that the right destructor is called during delete basePointer (see FAQ 21.05). If this explicit destructor exists solely to be made virtual (for example, if it does what the synthesized destructor would have done, namely { }), the class may not need an explicit copy constructor or assignment operator.

Protected assignment operators: An ABC often has a protected: assignment operator to prevent users from performing assignment using a reference to an ABC (see FAQ 24.05). If this explicit assignment operator exists solely to be made protected: (for example, if it does what the synthesized assignment operator would have done, namely memberwise assignment), the class may not need an explicit copy constructor or destructor.

Recording creation or destruction: A class sometimes has an explicit destructor and copy constructor solely to record the birth and death of its objects. For example, the class might print a message to a log file or count the number of existing objects. If the explicit destructor or copy constructor exists solely to perform this information recording (for example, if these operations do what the synthesized versions would have done), the class may not need an explicit assignment operator, since assignment doesn't change the number of instances of a class.

Unnecessary or illogical copy operations: There are cases where a class simply doesn't need one or both copy operations. Sometimes the copy operations don't even make logical sense. For example, the semantics of class File may mean that it is nonsensical to copy File objects; similarly for objects of class Semaphore. In these cases, the unnecessary copy operations are normally declared in the private: section of the class and are never defined. This prevents the compiler from synthesizing these operations in the class's public: section and causes compile-time error messages whenever a user accidentally calls one of these member functions. In this case, it is not strictly necessary to define the other members of the Big Three just because one or both copy operations are declared in the private: section of the class.
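Three of these cases are sketched below under illustrative names (Shape, Square, and assignSquares() are hypothetical; Semaphore is borrowed from the text). Shape has an explicit destructor only to make it virtual and an explicit assignment operator only to make it protected:, and Semaphore declares its copy operations private: without defining them.

```cpp
#include <cassert>
#include <type_traits>

// Abstract base class: the destructor exists only to be virtual, and the
// assignment operator exists only to be protected.
class Shape {
public:
    virtual ~Shape() { }
    virtual double area() const = 0;
protected:
    Shape& operator=(const Shape&) { return *this; }   // no members to assign
};

class Square : public Shape {
public:
    explicit Square(double side) : side_(side) { }
    double area() const override { return side_ * side_; }
private:
    double side_;
};

// A class for which copying makes no logical sense.
class Semaphore {
public:
    Semaphore() : count_(0) { }
    void signal() { ++count_; }
    int count() const { return count_; }
private:
    Semaphore(const Semaphore&);              // declared, never defined
    Semaphore& operator=(const Semaphore&);   // declared, never defined
    int count_;
};

// Users cannot assign through a Shape reference or copy a Semaphore.
static_assert(!std::is_assignable<Shape&, const Shape&>::value,
              "assignment through Shape& is blocked");
static_assert(!std::is_copy_constructible<Semaphore>::value,
              "Semaphore cannot be copied");

// Assignment between objects of the same concrete class still works.
double assignSquares()
{
    Square a(2.0);
    Square b(3.0);
    b = a;              // synthesized Square::operator= calls Shape::operator=
    return b.area();
}
```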

FAQ 30.14 Are there any other circumstances that might explicitly warrant the Big Three?

Yes, when the Big Three need to be non-inline.

When the compiler synthesizes the Big Three, it makes them inline. If the application's classes are exposed to customers (for example, if customers #include the application's header files rather than merely using an executable), the application's inline code is copied into their executables. If you want to maintain binary compatibility between releases of your library, you must not change any visible inline functions, including the versions of the Big Three that are synthesized by the compiler. Therefore, explicit, non-inline versions of the Big Three should be used.
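A sketch of the arrangement, shown as one block for brevity: the first part would live in a header and the second part in a source file compiled into the library. The file names and the class contents are assumptions for illustration.

```cpp
#include <cassert>

// --- fred.h (hypothetical header that customers #include) --------------
class Fred {
public:
    Fred();
    ~Fred();                              // declared here, defined in fred.cpp
    Fred(const Fred& other);
    Fred& operator=(const Fred& other);
    int value() const;
    void setValue(int v);
private:
    int value_;
};

// --- fred.cpp (hypothetical source file compiled into the library) -----
Fred::Fred() : value_(0) { }
Fred::~Fred() { }
Fred::Fred(const Fred& other) : value_(other.value_) { }
Fred& Fred::operator=(const Fred& other)
{
    value_ = other.value_;
    return *this;
}
int Fred::value() const { return value_; }
void Fred::setValue(int v) { value_ = v; }
```

Because the bodies are not visible in the header, customers' executables call into the library instead of carrying their own inline copies of these routines.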

FAQ 30.15 Why does copying an object using memcpy() cause a program crash?

Because bitwise copying is evil.

A class's copy operations (copy constructor and assignment operator) are supposed to copy the logical state of an object. In some cases, the logical state of an object can be copied using a bitwise copy (e.g., using memcpy()). However, a bitwise copy doesn't make sense for many objects; it may even put the copy in an incoherent state.

If a class X has a nontrivial copy constructor or assignment operator, bitwise copying an X object often creates wild pointers. One common case where bitwise copying of an object creates wild pointers is when the object owns a referent (that is, it has remote ownership). The wild pointers are a result of the bitwise copy operation, not some failure on the part of the class designer.

For example, consider a class that has remote ownership, such as a string class that allocates an array of char from the heap. If string object a is bitwise copied into string b, then the two objects both point to the same allocated array. One of these strings will die first, which will delete the allocated array owned by both of them. BOOM!

image

Note that a bitwise copy is safe if the object's exact class is known and the object is (and will always remain!) bitwise copyable. For example, class string might use memcpy() to copy its string data because char is and will always remain bitwise copyable (assuming that the string data is a simple array of char).
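With C++11 or later, the "bitwise copyable" condition can be checked by the compiler. The following sketch assumes a hypothetical wrapper named bitwiseCopy() and a plain data type Point; std::is_trivially_copyable is the standard trait that corresponds most closely to the property described here.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <type_traits>

struct Point {   // hypothetical plain data type
    int x;
    int y;
};

static_assert(std::is_trivially_copyable<Point>::value,
              "Point may be copied with memcpy()");
static_assert(!std::is_trivially_copyable<std::string>::value,
              "std::string must not be copied with memcpy()");

// Refuses at compile time to memcpy() a type that is not bitwise copyable.
template <typename T>
void bitwiseCopy(T& dest, const T& src)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "T must be bitwise copyable");
    std::memcpy(&dest, &src, sizeof(T));
}

Point copyPoint()
{
    Point a = { 3, 4 };
    Point b = { 0, 0 };
    bitwiseCopy(b, a);
    return b;
}
```

Calling bitwiseCopy() on two std::string objects would fail to compile rather than crash at run time.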

FAQ 30.16 Why do programs with variable-length argument lists crash?

Because variable-length argument lists use bitwise copy, which is dangerous in many cases. There are times when variable-length argument lists don't cause a problem (printf comes to mind). But it is wise to avoid using them unless there is some compelling reason.

Objects passed into ellipses (...) are passed via bitwise copy. The parameter objects are bitwise copied onto the stack, but the va_arg macro uses the copy constructor to copy the pile of bits from the stack. The technical term for this asymmetry is ouch.

image

“Ladies and gentlemen, this is your pilot speaking; please fasten your seat belts in preparation for the air turbulence ahead.”

main()'s three Fred objects are constructed via Fred::Fred(). The call to f(int,Fred...) passes these Freds using bitwise copy. The bitwise copies may not be properly initialized Fred objects and are not logical copies of a, b, and c. Inside f(int,Fred...), the va_arg macro uses a pointer cast (shudder) to create a Fred*, but this Fred* doesn't point to a valid Fred object because it points to a bitwise copy of a Fred object. The va_arg macro then dereferences this (invalid) pointer and copies the pile of incoherent bits (via Fred's copy constructor) into the local variable, x.

If Fred has nontrivial copy semantics, the chance that the bitwise copy is the same as a logical copy is remote at best.

Variable-length argument lists are evil.
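If an ellipsis cannot be avoided, one workaround is to pass pointers to the objects, since pointers are bitwise copyable. This is a sketch of that idea only, not something the text prescribes; joinNames() and joinThree() are hypothetical, and Fred is given a std::string member purely so that it has nontrivial copy semantics.

```cpp
#include <cassert>
#include <cstdarg>
#include <string>

class Fred {
public:
    explicit Fred(const std::string& name) : name_(name) { }
    const std::string& name() const { return name_; }
private:
    std::string name_;   // gives Fred nontrivial copy semantics
};

// Expects 'count' further arguments, each of type const Fred*.
// Only the pointers travel through the ellipsis; no Fred is bitwise copied.
std::string joinNames(int count, ...)
{
    std::va_list args;
    va_start(args, count);
    std::string result;
    for (int i = 0; i < count; ++i) {
        const Fred* p = va_arg(args, const Fred*);
        result += p->name();
    }
    va_end(args);
    return result;
}

std::string joinThree()
{
    const Fred a("a"), b("b"), c("c");
    return joinNames(3, &a, &b, &c);   // each argument has type const Fred*
}
```

The compiler still cannot check the number or types of the arguments, so this removes the bitwise-copy hazard but not the other risks of variable-length argument lists.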

FAQ 30.17 Why do programs that use realloc() to reallocate an array of objects crash?

When realloc() needs to move the storage that is being reallocated, it uses bitwise copy rather than invoking the appropriate constructor for the newly allocated objects.

Use realloc() only for objects guaranteed to be bitwise copyable.
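That rule can be enforced at compile time in C++11 and later. The sketch below assumes a hypothetical helper named growArray() that wraps std::realloc() and refuses to compile for element types that are not trivially copyable.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>
#include <type_traits>

// Grows a malloc()-allocated array. Compiles only for element types that
// are bitwise copyable.
template <typename T>
T* growArray(T* old, std::size_t newCount)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "realloc() is safe only for bitwise-copyable types");
    void* p = std::realloc(old, newCount * sizeof(T));
    if (p == nullptr)
        throw std::bad_alloc();   // the old block is still valid in this case
    return static_cast<T*>(p);
}

int growAndSum()
{
    int* a = static_cast<int*>(std::malloc(2 * sizeof(int)));
    if (a == nullptr)
        throw std::bad_alloc();
    a[0] = 10;
    a[1] = 20;
    a = growArray(a, 4);          // existing elements are preserved bitwise
    a[2] = 30;
    a[3] = 40;
    int total = a[0] + a[1] + a[2] + a[3];
    std::free(a);
    return total;
}
```

For element types with nontrivial copy semantics, a container such as std::vector grows by constructing the new elements properly instead of copying bits.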
