A wild pointer is a pointer that refers to garbage.
There are three ways to get a wild pointer.
In C, the classic example of a dangling reference (3) occurs when a function returns a pointer to a local variable or when someone uses a pointer that has already been passed to free. Both situations can occur in C++, too.
Wild pointers are bad news no matter how they are created. Bad enough that we devote this entire chapter to the subject.
A wild pointer is to software what a car bomb is to a busy street: both cause indiscriminate pain and suffering.
After a program spawns its first wild pointer, an awesome chain reaction begins. The first wild pointer scribbles on a random memory location, which probably corrupts the object at that location, creating other wild pointers. Eventually—almost mercifully—one of these wild pointers attempts to scribble on something protected by the operating system or the hardware, and the program crashes.
By the time that happens, finding the root cause of the error with a debugger is nearly hopeless; what was once a cohesive system of objects is now a pile of rubble. The system has literally blown itself to bits.
Wild pointers create unstable systems. Arbitrarily small changes, such as inserting an extra semicolon, running the program on a different day of the week, or changing the way you smile as you press the Enter key can cause arbitrarily large changes in how the system behaves (or misbehaves). Sometimes the program deletes user files, sometimes it just gives the wrong answer, sometimes it actually works!
Wild pointers are a problem worth avoiding.
It means “Pay attention to me or you'll regret it!”
A local (auto) object is an object local to a routine (and it is usually allocated on the stack). Never return a reference or a pointer to a local (auto) object. As soon as the function returns, the local object is destructed, and the reference or pointer refers to garbage. A program working with garbage eventually gets very, very sick.
Note that returning a copy of a local object (returning “by value”) is fine.
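A minimal sketch of the safe alternative, assuming a hypothetical function makeGreeting() that does not appear in the original text. The unsafe variants are shown only as comments, because calling them would be undefined behavior.

```cpp
#include <string>

// Unsafe (kept as comments on purpose): each of these would hand the
// caller the address of an object that is destroyed on return.
//
//   const std::string& badRef() { std::string s = "hello"; return s; }
//   const std::string* badPtr() { std::string s = "hello"; return &s; }

// Safe: the caller receives its own copy of the local object.
// makeGreeting() is a hypothetical name used only for illustration.
std::string makeGreeting()
{
    std::string s = "hello";   // local (auto) object
    return s;                  // returned by value
}
```

A caller can store the result in its own object, for example std::string g = makeGreeting();, and use it for as long as g lives.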
Avoid storing the address of a local object created in an inner scope in a pointer in an outer scope. Here's an example.
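A minimal sketch of the situation, assuming a trivial class Fred with a hypothetical value() member and a hypothetical enclosing function innerScopeDemo(). The names a and p follow the discussion; everything else is illustrative. The sketch deliberately never touches p after the inner block ends.

```cpp
#include <cstddef>

// Hypothetical class used only for illustration.
class Fred {
public:
    int value() const { return 42; }
};

int innerScopeDemo()
{
    Fred* p = NULL;            // pointer lives in the outer scope
    int result = 0;
    {
        Fred a;                // object lives in the inner scope
        p = &a;                // outer pointer refers to the inner object
        result = p->value();   // fine: a is still alive here
    }
    // a has been destroyed, but p still holds its old address.
    // Dereferencing p from here on would be undefined behavior,
    // so this sketch does not use it again.
    return result;
}
```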
When control flow leaves the inner block, a will be destroyed and p will be pointing at garbage. Because control can leave the inner scope a number of different ways (including an uncaught exception), setting the outer scope's pointer to point to an inner scope's object can lead to subtle errors and should be avoided on principle.
If the address of an inner scope's object has to be stored in an outer scope's pointer, then the outer scope's pointer should be changed to NULL (or some other safe value) before leaving the inner scope. Generally speaking, you should guarantee that the pointer is set to NULL by creating a pointer-like class whose destructor sets the pointer to NULL, then replace the Fred* local variable with a local object of that class.
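One possible reading of this advice can be sketched as a small guard object whose destructor resets the outer pointer. The class FredPtrGuard, the function guardDemo(), and the value() member of Fred are all hypothetical names introduced for this sketch; the text does not prescribe this exact design.

```cpp
#include <cstddef>

// Hypothetical class used only for illustration.
class Fred {
public:
    int value() const { return 42; }
};

// Hypothetical helper: points an outer Fred* at a target and resets
// that pointer to NULL when the guard itself is destroyed.
class FredPtrGuard {
public:
    FredPtrGuard(Fred*& outer, Fred* target) : outer_(outer) { outer_ = target; }
    ~FredPtrGuard() { outer_ = NULL; }
private:
    Fred*& outer_;
    FredPtrGuard(const FredPtrGuard&);             // not copyable
    FredPtrGuard& operator=(const FredPtrGuard&);  // not assignable
};

bool guardDemo()
{
    Fred* p = NULL;                  // outer scope's pointer
    {
        Fred a;                      // inner scope's object
        FredPtrGuard guard(p, &a);   // p == &a while guard is alive
        if (p->value() != 42) return false;
    }   // guard is destroyed first (p becomes NULL), then a
    return p == NULL;
}
```

Because the guard is declared after a, it is destroyed before a, so p is reset on every exit path from the inner block, including an exception.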
Note that the problem addressed by this FAQ can occur only with pointers, not with references. This is because a reference is permanently bound to its referent at the moment it is initialized. This is yet another reason to prefer references to pointers (see FAQ 11.09).
No, there is very little relationship between these issues.
Occasionally, the claim is made that if an object is allocated via new then it should be passed via pointer; otherwise it should be passed by reference. This is not correct. There are two separate questions: when to delete the object and how to pass it.
First consider the issue of deleting the object. If an object is allocated from the heap (e.g., p = new Fred();), then some routine has to be responsible for deleting it (e.g., delete p;), and the routine must have a pointer (e.g., p) to it. There are three common situations.
auto_ptr is the easy solution: e.g., auto_ptr<Fred> p(new Fred());
auto_ptr in the this object and define a copy constructor and assignment operator that allocate a copy of the object from the heap (see FAQ 30.12).
delete, but the newed object should be deleted when there are no pointers to it. In this case, use reference counting and avoid passing raw pointers to the object. (See FAQ 31.09.)
Now consider how the object should be passed. Assume that the routine f() takes a Fred object. Which is better, f(Fred* p) or f(Fred& r)? The key criterion is this: Does f() want to handle the case when it gets passed a nonobject (that is, the NULL pointer)? If it does, then the pointer form is indicated because it can use NULL to indicate the nonobject case. If f() always needs an actual Fred object, then the best way to signal this is to use a reference, which guarantees that it can't be passed a NULL since a reference can't legally be NULL.
Notice that the issues of deletion and passing are almost completely independent. Obviously, if reference counting is used to handle the deletion problem, then pointer-like objects are typical. But otherwise the questions aren't related. References can be used even if the object was allocated off the heap, and pointers can be used even if the object was not allocated from the heap, since it is always possible to have a pointer to a local or global object (so long as the object outlives the pointer to it).
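The independence of the two questions can be sketched as follows. The functions valueOrDefault(), valueOf(), and passingDemo(), and the value() member of Fred, are hypothetical names chosen for this sketch; the return value -1 for the nonobject case is likewise an arbitrary assumption.

```cpp
#include <cstddef>

// Hypothetical class used only for illustration.
class Fred {
public:
    int value() const { return 42; }
};

// Pointer form: the routine is prepared to be passed "no object".
int valueOrDefault(const Fred* p)
{
    if (p == NULL) return -1;   // nonobject case handled explicitly
    return p->value();
}

// Reference form: the routine always needs an actual Fred.
int valueOf(const Fred& r)
{
    return r.value();
}

int passingDemo()
{
    Fred local;                 // not allocated from the heap
    Fred* heap = new Fred();    // allocated from the heap

    // A pointer to a local object and a reference to a heap object:
    // how the object was allocated does not dictate how it is passed.
    int sum = valueOrDefault(&local) + valueOf(*heap);

    delete heap;                // deletion is a separate responsibility
    return sum;
}
```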
Rarely, probably only when interfacing with other languages. Any casting that must be done should use the C++ facilities for type-safe casting.
C-style pointer casts are the goto of OO programming. A goto complicates the control flow, making it difficult to statically reason about the flow of control. To determine the code's behavior, the dynamic flow of control has to be simulated. A pointer cast complicates the type flow, making it difficult to statically reason about the type of an object. To determine the code's behavior, the dynamic flow of types must be simulated. Use a C-style pointer cast as often as you would use a goto.
C-style pointer casts are also error prone. The basic problem is that the compiler meekly accepts C-style pointer casts without using runtime checks to see if they are correct. This can create wild pointers. Shudder.
Developers with a background in untyped (a.k.a. dynamically typed) languages tend to produce designs whose implementations employ an excessive number of pointer casts. These old habits must be terminated without prejudice. The lowest levels of memory management are among the few places where pointer casts are necessary.
Reference casts are just like pointer casts and are equally dangerous.
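As a rough illustration of the type-safe facilities mentioned above, the sketch below uses dynamic_cast, which checks the object's actual type at run time. The classes Base, Derived, and Other and the function tagOf() are hypothetical and exist only for this example.

```cpp
#include <cstddef>

// Hypothetical hierarchy used only for illustration.
class Base {
public:
    virtual ~Base() {}
};

class Derived : public Base {
public:
    int tag() const { return 7; }
};

class Other : public Base {};

// Returns the tag if b really refers to a Derived, otherwise -1.
int tagOf(Base* b)
{
    // A C-style cast, (Derived*)b, would compile without any check
    // and would yield a wild pointer if b referred to an Other.
    Derived* d = dynamic_cast<Derived*>(b);   // checked at run time
    return (d != NULL) ? d->tag() : -1;
}
```

Passing the address of a Derived yields 7, while passing the address of an Other yields -1 instead of silently misbehaving.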
Yes, as long as that reference isn't copied into another reference or pointer.
In the following example, an unnamed temporary string object is created at line 1. A (const) reference (main()'s x) is bound to this temporary. The language guarantees that the unnamed temporary will live until the reference x dies, which in this case is at the end of main(). Therefore, line 2 is safe: the compiler isn't allowed to destruct the unnamed temporary string object until line 3.
There is a caveat: don't copy reference x into a pointer variable that's out of the scope in which the temporary was created. For a subtle example of this, see the next FAQ.
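A minimal sketch matching the line 1, line 2, and line 3 markers in the discussion. The function createTemp() is a hypothetical name, and lifetimeDemo() stands in for main() so that the sketch can be called from other code.

```cpp
#include <cstddef>
#include <string>

// Hypothetical function that returns a string by value.
std::string createTemp() { return "fred"; }

std::size_t lifetimeDemo()                   // stands in for main()
{
    const std::string& x = createTemp();     // line 1: temporary bound to x
    std::size_t n = x.size();                // line 2: safe, temporary is alive
    return n;
}                                            // line 3: x and the temporary die
```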
const reference be returned by const reference?
No; it might create a dangling reference, which could destroy the world.
Returning an object by reference is not dangerous in and of itself, provided that the lifetime of the referent exceeds the lifetime of the returned reference. This cannot be guaranteed when a const reference parameter is returned by const reference, because the original argument might have been an unnamed temporary.
In the following example, an unnamed temporary string object is created at line 1. Parameter x from function unsafe() is bound to this temporary, but that is not an explicit, local reference in the scope of main(), so the temporary's lifetime is governed by the usual rules: the temporary dies at the ; of line 1. Unfortunately, function unsafe() returns the reference x, which means main()'s y ends up referring to the temporary, even though the temporary is now dead. This means that line 2 is unsafe: it uses y, which refers to an object that has already been destructed, that is, a dangling reference.
Note that if a function accepts a parameter by non-const reference (for example, f(string& s)), returning a copy of this reference parameter is safe because a temporary cannot be passed by non-const reference.
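The hazard can be sketched as follows. The name unsafe() comes from the discussion; safe() and danglingDemo() are hypothetical additions, with danglingDemo() standing in for main(). The dangerous lines are left as comments because executing them would be undefined behavior.

```cpp
#include <cstddef>
#include <string>

// Returns its const reference parameter by const reference.
const std::string& unsafe(const std::string& x) { return x; }

// Hypothetical alternative: returning by value gives the caller a copy.
std::string safe(const std::string& x) { return x; }

std::size_t danglingDemo()                   // stands in for main()
{
    // const std::string& y = unsafe(std::string("fred"));  // line 1: temporary dies at the ;
    // y.size();                                            // line 2: would use a dead object

    std::string named = "fred";
    const std::string& ok = unsafe(named);          // fine: named outlives ok
    std::string copy = safe(std::string("fred"));   // fine: copy owns its data
    return ok.size() + copy.size();
}
```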
min(x,y) or abs(x) return a const reference?
No!
When the following example is compiled and the symbol UNSAFE is defined, min(x,y) avoids an extra copy operation by returning a const reference parameter by const reference. As discussed in the previous FAQ, this can create a dangling reference, which can destroy the world.
Returning a const reference to a const reference parameter is normally done as an optimization to avoid an extra copy operation. If you're willing to sacrifice correctness, you can make your software very fast!
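A minimal sketch of such a function, switched by the UNSAFE symbol. The template is named minOf() here (a hypothetical name, chosen to avoid clashing with the standard library), and smallerName() is a hypothetical caller.

```cpp
#include <string>

template <typename T>
#ifdef UNSAFE
const T& minOf(const T& x, const T& y)   // no extra copy, but may dangle
#else
T minOf(const T& x, const T& y)          // one extra copy, always safe
#endif
{
    return (x < y) ? x : y;
}

std::string smallerName()
{
    // With UNSAFE defined, binding the result to a const reference
    // would leave that reference dangling once the temporaries die:
    //
    //   const std::string& r = minOf(std::string("pear"), std::string("apple"));
    //
    // Storing the result in an object copies it while the arguments
    // are still alive, which is safe in either configuration.
    std::string result = minOf(std::string("pear"), std::string("apple"));
    return result;
}
```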
When dealing with pointers.
When the token 0 appears in the source code at a place where a pointer should be, the compiler interprets the token 0 as the NULL pointer. However, the bit pattern for the NULL pointer is not guaranteed to be all zeros. More specifically, setting a pointer to NULL may set some of the bits of that pointer to 1.
Depending on the hardware, the operating system, or the compiler, a pointer whose bits are all zeros may not be the same as the NULL pointer. For example, using memset() to set all the bits of a pointer to zero may not make that pointer equal to NULL.
In the following program, all conforming C++ compilers produce code that prints 0 is NULL, then NULL is NULL, but some may produce code that prints memsetPtr is not NULL.
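A sketch of such a program, written as a function that builds the three messages instead of printing them directly. The function name describeNullChecks() and the variable names zeroPtr and nullPtr are assumptions; memsetPtr follows the discussion. Only the first two lines of the result are guaranteed, and the third depends on the platform.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

std::string describeNullChecks()
{
    std::string out;

    char* zeroPtr = 0;         // the token 0 in a pointer context
    out += (zeroPtr == NULL) ? "0 is NULL\n" : "0 is not NULL\n";

    char* nullPtr = NULL;
    out += (nullPtr == NULL) ? "NULL is NULL\n" : "NULL is not NULL\n";

    // Zeroing the bits is not the same as assigning NULL.
    char* memsetPtr;
    std::memset(&memsetPtr, 0, sizeof memsetPtr);
    out += (memsetPtr == NULL) ? "memsetPtr is NULL\n"
                               : "memsetPtr is not NULL\n";
    return out;
}
```

Writing the result to std::cout reproduces the output described above on a given platform.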
Another common and equally dangerous way to generate a pointer whose bits are all zero is with unions. For example, the following is wrong on two levels. First, it accesses the char* member of the union even though it was the unsigned long member that was set most recently. Second, it assumes that a pointer whose bits are all zero is the same as a NULL pointer; the output may be unionPtr is not NULL on some machines.
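The pattern described above might look like the commented-out lines in the sketch below; the exact layout of the union is an assumption, with only the member types and the name unionPtr taken from the discussion. The active code shows the portable alternative, and portableNull() is a hypothetical name.

```cpp
#include <cstddef>

// Wrong on both counts (kept as comments on purpose):
//
//   union { unsigned long n; char* unionPtr; } u;
//   u.n = 0;                      // sets the unsigned long member
//   if (u.unionPtr == NULL) ...   // reads the char* member instead
//
// Portable alternative: ask the compiler for a null pointer directly.
bool portableNull()
{
    char* p = NULL;      // the compiler picks the right bit pattern
    return p == NULL;    // true on every conforming implementation
}
```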