5. RICHER CONSIDERATIONS
                Baseline DRRIP | PACMan-M on DRRIP  | PACMan-H on DRRIP  | PACMan-HM on DRRIP
                All            | Demand   Prefetch  | Demand   Prefetch  | Demand   Prefetch
SRRIP
  Insertion        2           |   2        3       |   2        2       |   2        3
  Re-Reference     0           |   0        0       |   0      No Update |   0      No Update
BRRIP
  Insertion     Mostly 3       | Mostly 3 Mostly 3  | Mostly 3 Mostly 3  | Mostly 3 Mostly 3
  Re-Reference     0           |   0        0       |   0      No Update |   0      No Update

Figure 5.6: PACMan's RRIP policies; based on [Wu et al., 2011b].
Upon closer inspection of PACMan's three constituent policies, we see that PACMan-M
focuses on avoiding cache pollution by inserting prefetches in the LRU position, while
PACMan-H deprioritizes prefetchable lines by not promoting prefetch requests on cache
hits. Thus, PACMan-H is the first instance of a replacement policy that attempts to retain hard-
to-prefetch lines (our second goal), but as we show in the next section, a much richer space of
solutions exists when distinguishing between prefetchable and hard-to-prefetch lines.
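The rules in Figure 5.6 can be captured in a few lines. The following is a minimal sketch (not the authors' code) of how PACMan's variants adjust RRIP insertion and promotion for prefetch vs. demand requests, using 2-bit re-reference prediction values (RRPVs 0–3); the function names and the simplified treatment of BRRIP's probabilistic insertion are our own:

```python
def insertion_rrpv(policy, is_prefetch, srrip_mode=True):
    """RRPV assigned to a newly inserted line.

    In SRRIP mode the baseline inserts at 2; in BRRIP mode it inserts at 3
    most of the time (the 'mostly 3' probabilistic insertion is omitted
    here for brevity).
    """
    base = 2 if srrip_mode else 3
    if is_prefetch and policy in ("PACMan-M", "PACMan-HM"):
        return 3  # insert prefetches at distant re-reference (near-LRU)
    return base

def rrpv_on_hit(policy, is_prefetch, current_rrpv):
    """RRPV after a cache hit on a line."""
    if is_prefetch and policy in ("PACMan-H", "PACMan-HM"):
        return current_rrpv  # no update: don't promote on prefetch hits
    return 0  # promote to near-immediate re-reference

# A prefetch hit under PACMan-H leaves the line's RRPV unchanged,
# while PACMan-M still promotes it to 0.
assert rrpv_on_hit("PACMan-H", is_prefetch=True, current_rrpv=2) == 2
assert rrpv_on_hit("PACMan-M", is_prefetch=True, current_rrpv=2) == 0
```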
5.4.2 DEPRIORITIZING PREFETCHABLE LINES
Belady's MIN is incomplete in the presence of prefetches because it does not distinguish between
prefetchable and hard-to-prefetch lines.¹ In particular, MIN is equally inclined to cache lines
whose next reuse is due to a prefetch request (prefetchable lines) and lines whose next reuse is
due to a demand request (hard-to-prefetch lines). For example, in Figure 5.7, MIN might cache
X at both t = 0 and t = 1, even though the demand request at t = 2 can be serviced by only
caching X at t = 1. As a result, MIN minimizes the total number of cache misses, including
those for prefetched lines (such as the request to X at t = 1), but it does not minimize the
number of demand misses [Jain and Lin, 2018].
[Timeline: a demand Load X at t=0, a Prefetch X at t=1, and a demand Load X at t=2; the window between t=0 and t=1 is the opportunity.]

Figure 5.7: Opportunity to improve upon MIN; used with permission [Jain and Lin, 2018].
¹MIN correctly handles cache pollution, as inaccurate prefetches are always reused furthest in the future.
Demand-MIN To address this limitation, Jain and Lin propose a variant of Belady's MIN,
called Demand-MIN, that minimizes demand misses in the presence of prefetches. Unlike
MIN, which evicts the line that is reused furthest in the future, Demand-MIN evicts the line
that is prefetched furthest in the future. More precisely, Demand-MIN states:

Evict the line that will be prefetched furthest in the future, and if no such line exists, evict
the line that will see a demand request furthest in the future.

Thus, by preferentially evicting lines that can be prefetched in the future, Demand-MIN
accommodates lines that cannot be prefetched. For example, in Figure 5.7, Demand-MIN rec-
ognizes that because line X will be prefetched at time t = 1, line X can be evicted at t = 0,
thereby freeing up cache space in the time interval between t = 0 and t = 1, which can be uti-
lized to cache other demand loads. The reduction in demand miss rate can be significant: on
a mix of SPEC 2006 benchmarks running on 4 cores, LRU yields an average MPKI of 29.8,
MIN an average of 21.7, and Demand-MIN an average of 16.9.
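The difference between the two eviction rules is easy to see in simulation. Below is a minimal sketch (our own illustration, not the paper's implementation) that replays a trace of (address, is_prefetch) references through a small fully associative cache under either rule and counts demand misses; the trace format and function name are assumptions for this example:

```python
def simulate(trace, capacity, policy):
    """Count demand misses for 'MIN' or 'Demand-MIN' on a trace of
    (address, is_prefetch) pairs, using perfect future knowledge."""
    INF = float("inf")
    n = len(trace)
    # next_use[i] = index of the next reference to trace[i]'s address after i
    next_use = [INF] * n
    last = {}
    for i in range(n - 1, -1, -1):
        next_use[i] = last.get(trace[i][0], INF)
        last[trace[i][0]] = i

    cache = {}  # addr -> (next reference index, next reference is a prefetch)
    demand_misses = 0
    for i, (addr, is_prefetch) in enumerate(trace):
        if addr not in cache:
            if not is_prefetch:
                demand_misses += 1
            if len(cache) >= capacity:
                if policy == "MIN":
                    # evict the line reused furthest in the future
                    victim = max(cache, key=lambda a: cache[a][0])
                else:  # Demand-MIN
                    # prefer the line prefetched furthest in the future
                    pref = [a for a in cache if cache[a][1]]
                    pool = pref if pref else list(cache)
                    victim = max(pool, key=lambda a: cache[a][0])
                del cache[victim]
        nu = next_use[i]
        cache[addr] = (nu, trace[nu][1] if nu != INF else False)
    return demand_misses

# X is prefetchable (its second reference is a prefetch), Y and Z are not.
trace = [("X", False), ("Y", False), ("Z", False),
         ("X", True), ("Y", False), ("X", False)]
assert simulate(trace, 2, "MIN") == 4
assert simulate(trace, 2, "Demand-MIN") == 3
```

On this toy trace, Demand-MIN saves a demand miss by evicting X early (it will be prefetched back anyway), at the cost of one extra prefetch miss, mirroring the MIN/Demand-MIN tradeoff described above.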
Unfortunately, Demand-MIN's increase in demand hit rate comes at the expense of a
larger number of prefetch misses,² which results in extra prefetch traffic. Thus, MIN and Demand-
MIN define the extreme points of a design space, with MIN minimizing overall traffic on one
extreme and Demand-MIN minimizing demand misses on the other.
Design Space Figure 5.8 shows the tradeoff between demand hit rate (x-axis) and overall
traffic (y-axis) for several SPEC benchmarks [Jain and Lin, 2018]. We see that different bench-
marks will prefer different points in this design space. Benchmarks such as astar (blue) and
sphinx (orange) have lines that are close to horizontal, so they can enjoy the increase in de-
mand hit rate that Demand-MIN provides while incurring little increase in memory traffic.
By contrast, benchmarks such as tonto (light blue) and calculix (purple) have vertical lines, so
Demand-MIN increases traffic but provides no improvement in demand hit rate. Finally, the
remaining benchmarks (bwaves and cactus) present less obvious tradeoffs.
To navigate this design space, Flex-MIN picks a point between MIN and Demand-MIN,
such that the chosen point has a good tradeoff between demand hit rate and traffic. In particular,
Flex-MIN is built on the notion of a protected line, which is a cache line that would be evicted
by Demand-MIN but not by Flex-MIN because it would generate traffic without providing a
significant improvement in hit rate. Thus, Flex-MIN is defined as follows:
Evict the line that will be prefetched furthest in the future and is not protected. If no such
line exists, default to MIN.
Jain and Lin [2018] define a simple heuristic to identify protected lines. Of course, unlike
MIN and Demand-MIN, Flex-MIN is not optimal in any theoretical sense, since it is built on a
heuristic.
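Flex-MIN's victim selection can be sketched as a small variation on Demand-MIN. The following is our own illustration, assuming the same cache bookkeeping as before (each line maps to its next reference index and whether that reference is a prefetch) and taking the protected set as given, since the heuristic that populates it is described by Jain and Lin [2018]:

```python
def flex_min_victim(cache, protected):
    """Pick an eviction victim under Flex-MIN.

    `cache` maps address -> (next reference index, next reference is a
    prefetch); `protected` is the set of addresses shielded from
    Demand-MIN-style eviction.
    """
    # Demand-MIN-style choice, restricted to unprotected prefetchable lines
    prefetchable = [a for a in cache if cache[a][1] and a not in protected]
    if prefetchable:
        return max(prefetchable, key=lambda a: cache[a][0])
    # No such line exists: default to MIN (furthest next reference of any kind)
    return max(cache, key=lambda a: cache[a][0])

cache = {"A": (10, True), "B": (20, True), "C": (30, False)}
assert flex_min_victim(cache, {"B"}) == "A"        # B is protected
assert flex_min_victim(cache, {"A", "B"}) == "C"   # fall back to MIN
```

With an empty protected set this degenerates to Demand-MIN, and with all prefetchable lines protected it degenerates to MIN, which is exactly the design-space knob Flex-MIN is meant to expose.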
²Prefetch misses are prefetch requests that miss in the cache.