5. RICHER CONSIDERATIONS
                Baseline DRRIP | PACMan-M on DRRIP  | PACMan-H on DRRIP  | PACMan-HM on DRRIP
                All            | Demand   Prefetch  | Demand   Prefetch  | Demand   Prefetch
SRRIP
  Insertion        2           |   2        3       |   2        2       |   2        3
  Re-Reference     0           |   0        0       |   0      No Update |   0      No Update
BRRIP
  Insertion     Mostly 3       | Mostly 3 Mostly 3  | Mostly 3 Mostly 3  | Mostly 3 Mostly 3
  Re-Reference     0           |   0        0       |   0      No Update |   0      No Update

Figure 5.6: PACMan's RRIP policies; based on [Wu et al., 2011b].
Upon closer inspection of PACMan's three constituent policies, we see that PACMan-M
focuses on avoiding cache pollution by inserting prefetches in the LRU position, while
PACMan-H deprioritizes prefetchable lines by not promoting prefetch requests on cache
hits. Thus, PACMan-H is the first instance of a replacement policy that attempts to retain hard-
to-prefetch lines (our second goal), but as we show in the next section, a much richer space of
solutions exists when distinguishing between prefetchable and hard-to-prefetch lines.
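The rules in Figure 5.6 can be captured in a few lines. The following is a minimal sketch (not the authors' code) of how PACMan's variants adjust RRIP insertion and promotion for prefetch vs. demand requests, using 2-bit re-reference prediction values (RRPVs 0–3); the function names and the simplified treatment of BRRIP's probabilistic insertion are our own:

```python
def insertion_rrpv(policy, is_prefetch, srrip_mode=True):
    """RRPV assigned to a newly inserted line.

    In SRRIP mode the baseline inserts at 2; in BRRIP mode it inserts at 3
    most of the time (the 'mostly 3' probabilistic insertion is omitted
    here for brevity).
    """
    base = 2 if srrip_mode else 3
    if is_prefetch and policy in ("PACMan-M", "PACMan-HM"):
        return 3  # insert prefetches at distant re-reference (near-LRU)
    return base

def rrpv_on_hit(policy, is_prefetch, current_rrpv):
    """RRPV after a cache hit on a line."""
    if is_prefetch and policy in ("PACMan-H", "PACMan-HM"):
        return current_rrpv  # no update: don't promote on prefetch hits
    return 0  # promote to near-immediate re-reference

# A prefetch hit under PACMan-H leaves the line's RRPV unchanged,
# while PACMan-M still promotes it to 0.
assert rrpv_on_hit("PACMan-H", is_prefetch=True, current_rrpv=2) == 2
assert rrpv_on_hit("PACMan-M", is_prefetch=True, current_rrpv=2) == 0
```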
5.4.2 DEPRIORITIZING PREFETCHABLE LINES
Belady's MIN is incomplete in the presence of prefetches because it does not distinguish between
prefetchable and hard-to-prefetch lines.¹ In particular, MIN is equally inclined to cache lines
whose next reuse is due to a prefetch request (prefetchable lines) and lines whose next reuse is
due to a demand request (hard-to-prefetch lines). For example, in Figure 5.7, MIN might cache
X at both t = 0 and t = 1, even though the demand request at t = 2 can be serviced by only
caching X at t = 1. As a result, MIN minimizes the total number of cache misses, including
those for prefetched lines (such as the request to X at t = 1), but it does not minimize the
number of demand misses [Jain and Lin, 2018].
[Timeline: a demand Load X at t=0, a Prefetch X at t=1, and a demand Load X at t=2; the window between t=0 and t=1 is the opportunity.]

Figure 5.7: Opportunity to improve upon MIN; used with permission [Jain and Lin, 2018].
¹MIN correctly handles cache pollution, as inaccurate prefetches are always reused furthest in the future.
Demand-MIN To address this limitation, Jain and Lin propose a variant of Belady's MIN,
called Demand-MIN, that minimizes demand misses in the presence of prefetches. Unlike
MIN, which evicts the line that is reused furthest in the future, Demand-MIN evicts the line
that is prefetched furthest in the future. More precisely, Demand-MIN states:

Evict the line that will be prefetched furthest in the future, and if no such line exists, evict
the line that will see a demand request furthest in the future.

Thus, by preferentially evicting lines that can be prefetched in the future, Demand-MIN
accommodates lines that cannot be prefetched. For example, in Figure 5.7, Demand-MIN rec-
ognizes that because line X will be prefetched at time t = 1, line X can be evicted at t = 0,
thereby freeing up cache space in the time interval between t = 0 and t = 1, which can be uti-
lized to cache other demand loads. The reduction in demand miss rate can be significant: on
a mix of SPEC 2006 benchmarks running on 4 cores, LRU yields an average MPKI of 29.8,
MIN an average of 21.7, and Demand-MIN an average of 16.9.
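The difference between the two eviction rules is easy to see in simulation. Below is a minimal sketch (our own illustration, not the paper's implementation) that replays a trace of (address, is_prefetch) references through a small fully associative cache under either rule and counts demand misses; the trace format and function name are assumptions for this example:

```python
def simulate(trace, capacity, policy):
    """Count demand misses for 'MIN' or 'Demand-MIN' on a trace of
    (address, is_prefetch) pairs, using perfect future knowledge."""
    INF = float("inf")
    n = len(trace)
    # next_use[i] = index of the next reference to trace[i]'s address after i
    next_use = [INF] * n
    last = {}
    for i in range(n - 1, -1, -1):
        next_use[i] = last.get(trace[i][0], INF)
        last[trace[i][0]] = i

    cache = {}  # addr -> (next reference index, next reference is a prefetch)
    demand_misses = 0
    for i, (addr, is_prefetch) in enumerate(trace):
        if addr not in cache:
            if not is_prefetch:
                demand_misses += 1
            if len(cache) >= capacity:
                if policy == "MIN":
                    # evict the line reused furthest in the future
                    victim = max(cache, key=lambda a: cache[a][0])
                else:  # Demand-MIN
                    # prefer the line prefetched furthest in the future
                    pref = [a for a in cache if cache[a][1]]
                    pool = pref if pref else list(cache)
                    victim = max(pool, key=lambda a: cache[a][0])
                del cache[victim]
        nu = next_use[i]
        cache[addr] = (nu, trace[nu][1] if nu != INF else False)
    return demand_misses

# X is prefetchable (its second reference is a prefetch), Y and Z are not.
trace = [("X", False), ("Y", False), ("Z", False),
         ("X", True), ("Y", False), ("X", False)]
assert simulate(trace, 2, "MIN") == 4
assert simulate(trace, 2, "Demand-MIN") == 3
```

On this toy trace, Demand-MIN saves a demand miss by evicting X early (it will be prefetched back anyway), at the cost of one extra prefetch miss, mirroring the MIN/Demand-MIN tradeoff described above.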
Unfortunately, Demand-MIN's increase in demand hit rate comes at the expense of a
larger number of prefetch misses,² which results in extra prefetch traffic. Thus, MIN and Demand-
MIN define the extreme points of a design space, with MIN minimizing overall traffic on one
extreme and Demand-MIN minimizing demand misses on the other.
Design Space Figure 5.8 shows the tradeoff between demand hit rate (x-axis) and overall
traffic (y-axis) for several SPEC benchmarks [Jain and Lin, 2018]. We see that different bench-
marks will prefer different points in this design space. Benchmarks such as astar (blue) and
sphinx (orange) have lines that are close to horizontal, so they can enjoy the increase in de-
mand hit rate that Demand-MIN provides while incurring little increase in memory traffic.
By contrast, benchmarks such as tonto (light blue) and calculix (purple) have vertical lines, so
Demand-MIN increases traffic but provides no improvement in demand hit rate. Finally, the
remaining benchmarks (bwaves and cactus) present less obvious tradeoffs.
To navigate this design space, Flex-MIN picks a point between MIN and Demand-MIN,
such that the chosen point has a good tradeoff between demand hit rate and traffic. In particular,
Flex-MIN is built on the notion of a protected line, which is a cache line that would be evicted
by Demand-MIN but not by Flex-MIN because it would generate traffic without providing a
significant improvement in hit rate. Thus, Flex-MIN is defined as follows:
Evict the line that will be prefetched furthest in the future and is not protected. If no such
line exists, default to MIN.
Jain and Lin [2018] define a simple heuristic to identify protected lines. Of course, unlike
MIN and Demand-MIN, Flex-MIN is not optimal in any theoretical sense, since it is built on a
heuristic.
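Flex-MIN's victim selection can be sketched as a small variation on Demand-MIN. The following is our own illustration, assuming the same cache bookkeeping as before (each line maps to its next reference index and whether that reference is a prefetch) and taking the protected set as given, since the heuristic that populates it is described by Jain and Lin [2018]:

```python
def flex_min_victim(cache, protected):
    """Pick an eviction victim under Flex-MIN.

    `cache` maps address -> (next reference index, next reference is a
    prefetch); `protected` is the set of addresses shielded from
    Demand-MIN-style eviction.
    """
    # Demand-MIN-style choice, restricted to unprotected prefetchable lines
    prefetchable = [a for a in cache if cache[a][1] and a not in protected]
    if prefetchable:
        return max(prefetchable, key=lambda a: cache[a][0])
    # No such line exists: default to MIN (furthest next reference of any kind)
    return max(cache, key=lambda a: cache[a][0])

cache = {"A": (10, True), "B": (20, True), "C": (30, False)}
assert flex_min_victim(cache, {"B"}) == "A"        # B is protected
assert flex_min_victim(cache, {"A", "B"}) == "C"   # fall back to MIN
```

With an empty protected set this degenerates to Demand-MIN, and with all prefetchable lines protected it degenerates to MIN, which is exactly the design-space knob Flex-MIN is meant to expose.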
²Prefetch misses are prefetch requests that miss in the cache.