10. Memory Management

No amount of genius can overcome obsession with detail.

traditional

The need for fine-grain control of allocation and deallocation — separating allocation and initialization — array allocation — placement — deallocation problems — memory exhaustion — handling memory exhaustion — automatic garbage collection.

10.1 Introduction

C++ provides the operator new to allocate memory on the free store and the operator delete to release store allocated this way (§2.11.2). Occasionally, a user needs a finer-grained control of allocation and deallocation.

An important case is a per-class allocator for a frequently used class (see [2nd,pg 177]). Many programs create and delete large numbers of small objects of a few important classes such as tree nodes, linked lists links, points, lines, messages, etc. The allocation and deallocation of such objects with a general-purpose allocator can easily dominate the run time and sometimes also the storage requirements of the programs. Two factors are at play: the simple run-time and space overhead of a general-purpose allocation operation and the fragmentation of the free store caused by a mix of object sizes. I found that the introduction of a per-class allocator typically doubles the speed of a simulator, compiler, or similar program that hasn’t previously had its memory management tuned. I have seen factors of ten improvements where fragmentation problems were severe. Inserting a per-class allocator (either handwritten or from a standard library) became a five-minute operation with the 2.0 features.

Another example of a need for fine-grain control was programs that had to run without interruption for a long time with very limited resources. Hard real-time systems often need guaranteed and predictable memory acquisition with minimal overhead that leads to similar requirements. Traditionally, such programs have avoided dynamic allocation altogether. A special-purpose allocator can be used to manage these limited resources.

Finally, I encountered several cases where an object had to be placed at a specific location or in a specific memory area because of hardware or system requirements.

The revision of C++’s memory management mechanisms (§2.11.2) for Release 2.0 was a response to such demands. The improvements consist primarily of mechanisms for control of allocation and rely on the programmer’s understanding of the issues involved. They were intended to be used together with other language features and techniques to encapsulate the areas where control is exercised in delicate ways. These mechanisms were completed in 1992 with the introduction of operator new [] and operator delete [] to deal with arrays.

On several occasions, suggestions came from friends at Mentor Graphics where a very large and complex CAD/CAM system was being built in C++. In this system, most of the known programming problems had to be faced on the scale of hundreds of programmers, millions of lines of code, under severe performance requirements, with resource limitations, and market deadlines. In particular, Archie Lachner from Mentor provided insights on memory management issues that became significant in the 2.0 overhaul of C++.

10.2 Separating Allocation and Initialization

The pre-2.0 way of controlling allocation and deallocation on a per-class basis, using assignment to this3.9), proved error-prone and was declared obsolete. Release 2.0 allowed separate specification of allocation and initialization as an alternative. In principle, initialization is done by the constructor after allocation has been done by some independent mechanism. This allows a variety of allocation mechanisms -some user-provided – to be used. Static objects are allocated at link time, local objects on the stack, and objects created by the new operator by an appropriate operator new(). Deallocation is handled similarly. For example:

class X {
     // . . .
public:
    void* operator new(size_t sz); // allocate sz bytes
    void operator delete(void* p); // free p

    X();        // initialize
    X(int i); // initialize

    ~X();      // cleanup
     // . . .
};

The type size_t is an implementation-defined integral type used to hold object sizes; it is borrowed from the ANSI C standard.

It is the new operator’s job to ensure that the separately specified allocation and initialization are correctly used together. For example, it is the compiler’s job to generate a call of the allocator X::operator new() and a call of an X constructor from a use of new for X. Logically, X::operator new() is called before the constructor. It must therefore return a void* rather than an X*. The constructor makes an X object out of the memory allocated for it.

Conversely, the destructor “deconstructs” an object leaving raw memory only for operator delete() to free. Therefore, X::operator delete() takes a void* argument, rather than an X*.

The usual rules for inheritance apply, so objects of a derived class will be allocated using a base class’ operator new():

class Y : public X {  // objects of class Y are also
                      // allocated using X::operator new
    // ...
};

For this, X::operator new() needs an argument specifying the amount of store to be allocated: sizeof (Y) is typically different from sizeof (X). Unfortunately, novice users often get confused when they have to declare that argument, but don’t have to supply it explicitly in calls. The notion of a user-declared function with an argument that is “magically” supplied by “the system” seems hard to grasp for some. In exchange for this added complexity, however, we get the ability to have a base class provide allocation and deallocation services for a set of derived classes -and more regular inheritance rules.

10.3 Array Allocation

A class specific X::operator new() is used for individual objects of class X only (including objects of classes derived from class X that do not have their own operator new()). It follows that

X* p = new X[10];

does not involve X::operator new() because X[10] is an array rather than an object of type X.

This caused some complaints because it didn’t allow users to take control of allocations of arrays of X. However, I was adamant that an “array of X” wasn’t an X and therefore the X allocator couldn’t be used. If used for arrays, the writer of X::operator new() would have to deal with the problems of array allocation “just in case,” thus complicating the critical common case. If that case wasn’t critical, why bother with a special allocator? Also, I pointed out, controlling the allocation of single-dimension arrays such as X[d] isn’t sufficient: what about multiple-dimension arrays such as X[dl][d2]?

However, the lack of a mechanism for controlling array allocation caused a certain amount of grief in real cases and eventually the standards committee provided a solution. The most critical problem was that there was no way to prevent users from allocating arrays on the free store, yet no way of controlling such allocation. In systems relying on logically different storage management schemes, this can cause serious problems as users naively place large dynamic arrays in the default allocation area. I had not fully appreciated the implications of this.

The solution adopted is simply to provide a pair of functions specifically for array allocation/deallocation:

class X {
    // . . .
    void* operator new(size_t sz); // allocate objects
    void operator delete(void* p);

    void* operator new[](size_t sz); // allocate arrays
    void operator delete[](void* p);
};

The array allocator is used to obtain space for arrays of any dimension. As for all allocators, the job of operator new [] is to provide the number of bytes asked for; it does not concern itself about how that memory is used. In particular, it does not need to know the dimensions of the array or its number of elements. Laura Yaker from Mentor Graphics was the prime mover in the introduction of the array allocation and deallocation operators.

10.4 Placement

Two related problems were solved by a common mechanism:

[1] We needed a mechanism for placing an object at a specific address, for example, placing an object representing a process at the address required by special-purpose hardware.

[2] We needed a mechanism for allocating objects from a specific arena, for example, for allocating an object in the shared memory of a multi-processor or from an arena controlled by a persistent object manager.

The solution was to allow overloading of operator new() and to provide a syntax for supplying extra arguments to the new operator. For example, an operator new() that places an object at a particular address can be defined like this:

void* operator new(size_t, void* p)
{
    return p;  // place object at 'p'
}

and invoked like this:

void* buf = (void*)0xF00F;   // significant address

X* p2 = new(buf)X;  // construct an X at 'buf'
                    // invokes: operator new(sizeof(X),buf)

Because of this usage, the “new (buf) X” syntax for supplying extra arguments to operator new() is known as the placement syntax. Note that every operator new() takes a size as its first argument and that the size of the object allocated is implicitly supplied.

If anything, I underestimated the importance of placement at the time. With placement, operator new ceases to be simply a memory allocation mechanism. Because one can associate all kinds of logical properties with specific memory locations, new takes on aspects of general resource management.

An operator new() for a specific allocation arena might be defined like this:

void* operator new(size_t s, fast_arena& a)
{
    return a.alloc(s);
}

and used like this:

void f(fast_arena& arena)
{
    X* p = new(arena)X;  // allocate X in arena
    // ...
}

Here, a fast_arena is assumed to be a class with a member function alloc() that can be used to obtain memory. For example:

class fast_arena {
    // ... char* maxp;
    char* freep;
    char* expand(size_t s); // get more memory from
                           // general purpose allocator
public:
    void* alloc(size_t s) {
        char* p = freep;
        return ((freep+=s)<maxp) ? p : expand(s);
   }
   void free(void*) {} // ignore
   clear(); // free all allocated memory
};

This would be an arena specialized for fast allocation and almost instant freeing. One important use of arenas is to provide specialized memory management semantics.

10.5 Deallocation Problems

There is an obvious and deliberate asymmetry between operator new() and operator delete(). The former can be overloaded, the latter can’t. This matches the similar asymmetry between constructors and destructors. Consequently, you may be able to choose between four allocators and five constructors when creating an object, but when it comes time to destroy it, there is basically only one choice:

delete p;

The reason is that in principle you know everything at the point where you create an object, but when it comes to deleting it, all you have left is a pointer that may or may not be of the exact type of the object.

The use of a virtual destructor is crucial for getting destruction right in cases in which a user deletes an object of a derived class through a pointer to the base class:

class X {
    // ...
    virtual ~X();
};

class Y : public X {
    // ...
    ~Y ();
};

void f(X* p1)
{
    X* p2 = new Y;
    delete p2;     // Y::~Y correctly invoked
    delete p1;     // correct destructor
                   // (whichever that may be) invoked
}

This will also ensure that if there are local operator delete() functions in the hierarchy, the right one will be called. Had a virtual destructor not been used, the cleanup specified in Y’s destructor would not have been performed.

However, there is no language feature for selecting between deallocation functions to match the mechanism for selecting between allocation functions:

class X {
    // ...
    void* operator new(size_t); // ordinary allocation
    void* operator new(size_t, Arena&); // in Arena

    void operator delete(void*);
    // can't define void operator delete(void*, Arena&);
};

The reason is again that at the point of deletion, the user can’t be expected to know how the object was allocated. Ideally, of course, a user should not have to deallocate an object at all. That is one use of special arenas. An arena can be defined to be deallocated as a unit at some well-defined point in a program, or one can write a special-purpose garbage collector for an arena. The former is quite common, the latter isn’t and needs to be done very well to be able to compete with a standard conservative plug-in garbage collector [Boehm, 1993].

More frequently, the operator new() functions are programmed to leave an indicator of how they want to be deallocated for operator delete() to find. Note that this is memory management and therefore at a conceptual level below that of the objects created by the constructors and destroyed by the destructors. Consequently the memory containing this information is not in the object as such but somewhere related to it. For example, an operator new() may place memory management information in the word ahead of the one pointed to by its return value. Alternatively, an operator new() can leave information in a place where constructors or other functions can find them to determine whether an object is allocated on a free store.

Was it a mistake not to allow users to overload delete? If so, it would be a misguided attempt to protect people against themselves. I’m undecided, but I’m pretty certain that this is one of the nasty cases where either solution would cause problems.

The possibility of calling a destructor explicitly was introduced in 2.0 to cater to rare cases where allocation and deallocation have been completely separated from each other. An example is a container that does all memory management for its contained objects.

10.5.1 Deallocating Arrays

From C, C++ inherited the problem that a pointer points to an individual object but that object may actually be the initial element of an array. In general, the compiler cannot tell. An object pointing to the first element of an array is typically said to point to the array and allocation and deallocation of arrays is handled through such pointers. For example:

void f(X* p1)  //p1 may point to an individual object
               // or to an array
{
    X* p2 = new X[10]; // p2 points to the array
    // ...
}

How do we ensure that an array is correctly deleted? In particular, how do we ensure that the destructor is called for all elements of an array? Release 1.0 didn’t have a satisfactory answer. Release 2.0 introduced an explicit delete-array operator delete[]:

void f(X* p1)  // p1 may point to an individual object
               // or to an array
{

    X* p2 = new X[10];    // p2 points to the array
    // ...
    delete p2;       // error: p2 points to an array
    delete[] p2;     // ok
    delete p1;       // maybe ok, trust the programmer
    delete [] p1;    // maybe ok, trust the programmer
}

Plain delete isn’t required to handle both individual objects and arrays. This avoids complicating the common case of allocating and deallocating individual objects. It also avoids encumbering individual objects with information necessary for array deallocation.

An intermediate version of delete [] required the programmer to specify the number of elements of the array. For example:

delete[10] p2;

That proved too error-prone, so the burden of keeping track of the number of elements was placed on the implementation instead.

10.6 Memory Exhaustion

Finding that a requested resource cannot be obtained is a general and nasty problem. I had decided (pre-2.0) that exception handling was the direction in which to look for general solutions to this kind of problem (§3.15). However, exception handling (§16) was then still far in the future, and the particular problem of free store exhaustion couldn’t wait. Some solution, however ugly, was needed for an interim period of several years.

Two problems needed immediate solutions:

[1] It must be possible for a user to gain control in all cases in which a library call fails due to memory exhaustion (more generally, in all cases in which a library call fails). This was an absolute requirement from important internal AT&T users.

[2] The average user mustn’t be required to test for memory exhaustion after each allocation operation. In any case, experience from C shows that users don’t test consistently even when they are supposed to.

The first requirement was met by specifying that a constructor isn’t executed if operator new() returns 0. In that case, the new expression also yields 0. This enables critical software to defend itself against allocation problems. For example:

void f()
{
    X* p = new X;
    if (P == 0) {
        // handle allocation error
        // constructor not called
    }
    // use p
}

The second requirement was met by what was known as a new_handler, that is, a user-supplied function guaranteed to be called if memory can’t be found by operator new. For example:

void my_handler() { /* ... */ }

void f()
{
   set_new_handler(&my_handler);  // my_handler used for
                                  // memory exhaustion
                                  // from here on
   // ...
}

This technique was presented in [Stroustrup,1986] and is a general pattern for dealing with resource acquisition that occasionally fails. Basically, a new_handler can:

– find more resources (that is, find free memory to allocate), or

– produce an error message and exit (somehow).

With exception handling, “exit” can be less drastic than terminating the program (§16.5).

10.7 Automatic Garbage Collection

I deliberately designed C++ not to rely on automatic garbage collection (usually just called garbage collection). I feared the very significant space and time overheads I had experienced with garbage collectors. I feared the added complexity of implementation and porting that a garbage collector would impose. Also, garbage collection would make C++ unsuitable for many of the low-level tasks for which it was intended. I like the idea of garbage collection as a mechanism that simplifies design and eliminates a source of errors. However, I am fully convinced that had garbage collection been an integral part of C++ originally, C++ would have been stillborn.

My opinion was that if you needed garbage collection, you could either implement some automated memory management scheme yourself or use a language that supported it directly, say, my old favorite Simula. Today, the issue is not so clear cut. More resources are available for implementation and porting. Much C++ software exists that can’t just be rewritten in other languages. Garbage collectors have improved, and many of the techniques for “home brew” garbage collection that I had envisioned or used don’t scale up from individual projects to general-purpose libraries. Most importantly, more ambitious projects are now done in C++. Some of these could benefit from garbage collection and could afford it.

10.7.1 Optional Garbage Collection

Optional garbage collection is, I think, the right approach for C++. Exactly how that can best be done is not yet known, but we are going to get the option in several forms over the next couple of years. Implementations already exist, so it is just a matter of time before they make it out of research and into production code.

The fundamental reasons why garbage collection is desirable are easily stated:

[1] Garbage collection is the easiest for the user. In particular, it simplifies the building and use of some libraries.

[2] Garbage collection is more reliable than user-supplied memory management schemes for some applications.

The reasons against are more numerous, but less fundamental in that they are implementation and efficiency issues:

[1] Garbage collection causes run-time and space overheads that are not affordable for many current C++ applications running on current hardware.

[2] Many garbage collection techniques imply service interruptions that are not acceptable for important classes of applications, such as hard real-time applications, device drivers, control applications, human interface code on slow hardware, and operating system kernels.

[3] Some applications do not have the hardware resources of a traditional general-purpose computer.

[4] Some garbage collection schemes require banning several basic C facilities such as pointer arithmetic, unchecked arrays, and unchecked function arguments as used by printf().

[5] Some garbage collection schemes impose constraints on object layout or object creation that complicates interfacing with other languages.

I know that there are more reasons for and against, but no further reasons are needed. These are sufficient arguments against the view that every application would be better done with garbage collection. Similarly, these are sufficient arguments against the view that no application would be better done with garbage collection.

When comparing garbage collection and non-garbage-collection systems, remember that not every program needs to run forever, not every piece of code is a foundation library, memory leaks are quite acceptable in many applications, and many applications can manage their memory without garbage collection or related techniques such as reference counting. C++ does not need garbage collection the way a language without genuine local variables (§2.3) needs garbage collection. When memory management is well-enough behaved to be handled by less general methods (for example, special purpose allocators (§10.2, §10.4, §15.3.1 [2nd,§5.5.6,§13.10.3]) automatic and static store (§2.4)), very significant speed and space advantages can be obtained compared with both manual and automatic general garbage collection. For many applications, those advantages are critical and the benefits of automatic garbage collection to other applications irrelevant. In an ideal implementation, this advantage would not be compromised by the presence of a garbage collector; that collector would simply not be invoked, or invoked infrequently enough to be unimportant to the overall efficiency of most applications.

My conclusion is that garbage collection is desirable in principle and feasible, but for current users, uses, and hardware we can’t afford to have the semantics of C++ and of its most basic standard libraries depend on garbage collection.

The real problem is therefore whether optional garbage collection is a viable option for C++. When (not if) garbage collection becomes available, we will have two ways of writing C++ programs. This, in principle, is no more difficult than managing with several different libraries, several different application platforms, etc., that we already handle. Having to make such choices is a simple consequence of having a widely used general-purpose programming language. It would not make sense to require that the execution environment of a C++ program should be the same whether it is running in a missile head, a SLR camera, a PC, a telephone switch, a UNIX clone, an IBM mainframe, a Mac, a supercomputer, or something else. If provided in a reasonable way, garbage collection will simply become another option for someone choosing a run-time environment for an application.

Can optional garbage collection be legal and useful if it is not specified as part of the C++ standard? I think so, and anyway, we don’t have the option of specifying garbage collection in the standard because we don’t have a scheme that is anywhere near ready for standardization. An experimental scheme must be demonstrated to be good enough for a wide range of real applications. In addition, it must not have unavoidable drawbacks that would make C++ an unacceptable choice for significant applications. Given one such successful experiment, implementers will scramble to provide the best implementations. We can only hope that they don’t choose mutually incompatible schemes.

10.7.2 What should optional garbage collection look like?

There are several options because there are several solutions to the basic problems. An ideal is to maximize the number of programs that can run in both garbage collection and non-garbage collection environments. This is an important and elusive goal for implementers, library designers, and application programmers.

Ideally, a garbage collection implementation would simply be as good as a nongarbage-collection implementation in time and space if you didn’t use garbage collection. This is easy to achieve if the programmer has to say “no part of this program uses garbage collection,” but very hard to do if the implementation is obliged to do garbage collection and tries to achieve the performance of a non-garbage-collection implementation through adaptive behavior.

Conversely, a garbage-collection implementer might need some “hints” from a user to make performance acceptable. For example, a scheme might require the user to state which objects require garbage collection and which do not (for example, because they originate in non-garbage-collected C or Fortran libraries). If at all possible, a non-garbage-collection implementation should be able to ignore such hints. Alternatively they should be trivial to remove from the source text.

Some C++ operations, such as ill-behaved casts, unions of pointers and nonpointers, pointer arithmetic, etc., are seriously detrimental to garbage collectors. These operations are generally infrequent in well-written C++ code, so it is tempting to ban them. Thus, two ideals clash:

[1] Ban any unsafe operation: that makes programming safer and garbage collection more efficient.

[2] Don’t ban any currently legal C++ program.

I think a compromise can be reached. I suspect that a garbage collection scheme can be concocted that will work with (almost) every legal C++ program, but work even better when no unsafe operations are used.

When implementing a garbage collection scheme one must decide whether to invoke the destructor for a collected object or not. Deciding which is the right thing to do is not easy. In [2nd], I wrote:

“Garbage collection can be seen as a way of simulating an infinite memory in a limited memory. With this in mind, we can answer a common question: Should a garbage collector call the destructor for an object it recycles? The answer is no, because an object placed on free store and never deleted is never destroyed. Seen in this light, using delete is simply a way of requesting the destructor to be called (together with a notification to the system that the object’s memory may be recycled). But what if we actually do want an action performed for an object allocated on the free store but never deleted? Note that this problem does not arise for static and automatic objects; their destructors are always called implicitly. Note also that actions performed “at garbage-collection time” are unpredictable because they may happen at essentially any time between the last use of the object and “the end of the program.” This implies that the state of the program at the time of their execution is unknown. This again makes such actions hard to program correctly and less useful than is sometimes imagined.

Where such actions are needed, the problem of performing an action at some unspecified “destruction time” can be solved by providing a registration server. An object that needs a service performed “at the end of the program” places its address and a pointer to a “cleanup” function in a global associative array.”

I am now less certain. This model would certainly work, but maybe having the garbage collector call the destructors would be sufficiently simple to use to be worthwhile. That depends on exactly what objects are collected and what actions their destructors perform. This is a question that can’t be decided by armchair philosophy, and there doesn’t seem to be much relevant experience from other languages. Unfortunately, it is also a problem for which it is hard to conduct real experiments.

I am under no illusion that building an acceptable garbage collection mechanism for C++ will be easy – I just don’t think it is impossible. Consequently, given the number of people looking at the problem several solutions will soon emerge.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset