CHAPTER 5
Richer Considerations
Until now we have focused narrowly on the problem of cache replacement, both in terms of
metrics—in which we have focused on cache misses—and in terms of context—in which we have
focused on the cache as an isolated abstract entity. But, of course, cache misses do not translate
directly to performance loss, and cache replacement policies do not operate in isolation. This
chapter broadens our discussion in both dimensions, first exploring replacement policies that
consider the variable cost of cache misses and then exploring policies that consider the impact
of prefetchers, the impact of the cache architecture, and finally the impact of new memory
technologies.
5.1 COST-AWARE CACHE REPLACEMENT
Most replacement policies seek to minimize overall cache miss rate, assuming that all cache
misses are equally costly. In reality, different cache misses can have widely different impact on
overall performance. For example, misses that are isolated (low memory-level parallelism) tend
to be more expensive than misses that are clustered (high memory-level parallelism), because
with greater parallelism there is more ability to hide the latency of a single cache miss. As an-
other example, misses that are on the program's critical path are more important to program
performance than those that are not. A smart replacement policy that considers these costs can
preferentially cache high-cost misses (at the cost of a lower hit rate) to achieve better program
performance.
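To make the idea concrete, the following sketch shows one simple way a cost-aware policy could bias its eviction decisions. This is an illustrative toy, not a published policy: the cache tracks an estimated miss cost per block (how the cost estimate is obtained is left abstract) and, on eviction, picks the victim that would be cheapest to refetch, using LRU order only to break ties.

```python
# Illustrative cost-aware eviction sketch (hypothetical, not a policy
# from the literature): each cached block carries an estimated miss
# cost; the eviction victim is the block with the lowest cost, with
# ties broken in favor of the least recently used block.
from collections import OrderedDict

class CostAwareCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()          # block id -> estimated miss cost

    def access(self, block, miss_cost):
        """Simulate one access; return True on a hit, False on a miss."""
        if block in self.blocks:
            self.blocks.move_to_end(block)   # refresh recency on a hit
            return True
        if len(self.blocks) >= self.capacity:
            # Victim = cheapest-to-refetch block; since OrderedDict
            # iterates in recency order, min() breaks ties toward LRU.
            victim = min(self.blocks, key=lambda b: self.blocks[b])
            del self.blocks[victim]
        self.blocks[block] = miss_cost
        return False
```

Note that such a greedy policy may keep a high-cost block at the expense of overall hit rate, which is exactly the trade-off described above: fewer hits can still mean better performance if the avoided misses are the expensive ones.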
The optimal cost-aware replacement policy (CSOPT) [Jeong and Dubois, 2006] mini-
mizes the overall cost of all misses, rather than minimizing the number of misses. The basic
idea is to follow Belady's MIN policy except when MIN would evict a block whose miss cost is
greater than those of other blocks in the cache. On encountering such a block, CSOPT explores
multiple sequences of replacement decisions until it can decide on the replacement sequence that
minimizes total cost. A naive realization of CSOPT expands all possible replacement sequences
in the form of a search tree (see Figure 5.1); the depth of the search tree is equal to the number
of cache accesses, because a new level is added for every new cache access, and the width is equal
to the cache size s, because the search considers up to s replacement choices on every cache
eviction.
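The exhaustive search just described can be sketched as a short memoized recursion. This is a brute-force illustration in the spirit of CSOPT, not the authors' implementation: at each miss with a full cache, it branches on every possible victim (the width of the tree, bounded by the cache size) and recurses on the next access (the depth of the tree), returning the minimum achievable total miss cost.

```python
# Brute-force sketch of offline cost-aware optimal replacement:
# exhaustively explore every eviction sequence and return the minimum
# total miss cost. Exponential in the worst case, as the search-tree
# description above implies; memoization merges repeated states.
from functools import lru_cache

def min_total_cost(trace, cost, capacity):
    """trace: tuple of block ids; cost[b]: miss cost of block b."""
    @lru_cache(maxsize=None)
    def search(i, cache):
        if i == len(trace):
            return 0
        b = trace[i]
        if b in cache:                       # hit: no cost, no decision
            return search(i + 1, cache)
        if len(cache) < capacity:            # room left: just insert
            return cost[b] + search(i + 1, cache | {b})
        # Miss with a full cache: branch on every candidate victim.
        return cost[b] + min(search(i + 1, (cache - {v}) | {b})
                             for v in cache)
    return search(0, frozenset())
```

For example, on the trace A, B, C, B, A with a 2-block cache where A costs 10 and B and C each cost 1, MIN would evict A at the access to C (A's reuse is farthest away), paying A's expensive miss again, whereas the cost-optimal sequence evicts B instead and re-pays only B's cheap miss.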
Note that a search tree is required because greedily caching a high-cost block at any node
in the tree might not translate to the best replacement sequence in the long run. In particular,
greedily caching a high-cost block might preclude the caching of another block that has an even
higher cost.