27.3BlockwiseRemoteHeap 431
address in user-memory. Since we cannot use the address to quickly find metada-
ta about allocations, the naive implementation leaves us with a linked list of
tracking blocks, ordered by address. A naive implementation forces all alloca-
tions to scan the heap’s tracking linked list before allocating or freeing any
memory. A naive implementation has a slow
On running time, where n is the
number of allocations in the heap.
The implementation on the website does not attempt to make
Free() as fast
as possible but, rather, does a simple linked list search. This could be improved
by creating a second data structure that keeps the tracking block pointer associat-
ed with each allocated address, perhaps in a hash table or balanced tree, but that
is an exercise left to the reader. The memory requirements for such a structure
are substantial and may be entirely unnecessary for applications where
Free() is
rarely or never called, but rather, the entire heap is dropped all at once.
However, since fast allocation is actually a complicated problem, we present
a solution that is nearly constant-time for the majority of allocations and is linear
for the rest. We do this by maintaining a bookmark table that essentially points to
the first free block of each power-of-two–sized block of free memory. It remem-
bers the last place the code found a free block of each size. Once we allocate a
block, that entry in the table may no longer be accurate. Updating the entry may
require a full traversal of the heap, a very slow operation, so we allow it to be-
come stale. Instead, during allocation and free calls, we store any blocks we
come across in the table at the appropriate (rounded down to the next) power of
two. While there is no guarantee that the table is updated, in practice it tends to
be close, at no additional performance cost. During an allocation, if a table entry
appears invalid, we can always check the next-higher power of two, or the next,
until one is found to be valid. In most cases, this works very well. In empirical
tests, 65 percent of the allocations have positive hits on the cached references in
the bookmark table, which means 65 percent of the long searches for free
memory were avoided.
The tracking data for allocations is stored in a pool with a threaded free-list,
making the location of a valid metadata block during allocation and deletion an
1O operation. Threaded free-lists act like a stack, since we simply want a blank
node to write to when allocating (pop) and want to return a node to the unused
pool when freeing (push). As in any other standard pool implementation, the used
and unused structures occupy the same physical memory at different times, and
we just cast to one or the other, depending on the current state of that block. To
aid in debugging, as well as to facilitate lazy bookmarking, unused metadata
nodes are marked with a sentinel value that cannot appear in an allocation, so we
can repair the bookmark table when it gets out of date.