5. RICHER CONSIDERATIONS
second approach, called TADIP-Feedback (TADIP-F), accounts for interaction among applications by learning the insertion policy for each application under the assumption that all other applications use the insertion policy that currently performs best for them.
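As a rough illustration of the underlying set-dueling mechanism, the sketch below models a per-application policy-selection counter. The class name, counter width, and method names are invented for this example; they are not from the TADIP paper.

```python
class AppDuel:
    """Per-application set-dueling monitor: a hypothetical sketch of the
    feedback idea in TADIP-F (names and widths are illustrative).

    Each application owns two groups of sampled "leader" sets: one that
    always inserts at the LRU position and one that always uses bimodal
    (BIP) insertion. A saturating counter PSEL tracks which group misses
    less, and the application's remaining "follower" sets adopt the
    winner. The feedback part: while one application's policy is being
    learned, the leader sets of every other application keep using that
    application's current winning policy.
    """

    def __init__(self, bits=10):
        self.max = (1 << bits) - 1
        self.psel = self.max // 2       # start near the midpoint

    def miss_in_lru_leader(self):       # evidence against LRU insertion
        self.psel = min(self.max, self.psel + 1)

    def miss_in_bip_leader(self):       # evidence against BIP insertion
        self.psel = max(0, self.psel - 1)

    def best(self):
        # High PSEL means the LRU leaders missed more, so BIP wins.
        return 'BIP' if self.psel > self.max // 2 else 'LRU'
```

In a full implementation there would be one such counter per application, and the hardware would consult `best()` on every insertion into a follower set.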
Promotion/Insertion Pseudo-Partitioning of Multi-Core Shared Caches (PIPP) Xie and Loh [2009] build on Utility-Based Cache Partitioning (UCP) [Qureshi and Patt, 2006], but instead of strictly enforcing UCP's partitions, they design insertion and promotion policies that enforce the partitions loosely. The main insight behind their PIPP policy is that strict partitions under-utilize cache resources because a core might not use its entire partition. For example, if the cache is way-partitioned and core_i does not access a given set, the ways allocated to core_i in that set go to waste. PIPP allows other applications to steal these unused ways.
In particular, PIPP inserts each line with a priority that is determined by its core's partition allocation. Lines from cores that have been allocated large partitions are inserted with high priority (proportional to the size of the partition), and lines from cores that have been allocated small partitions are inserted with low priority. On a cache hit, PIPP's promotion policy promotes the line by a single priority position with probability p_prom, and the priority is unchanged with probability 1 − p_prom. When a line must be evicted, the line with the lowest priority is chosen.
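The insertion and promotion rules above can be sketched as a toy model of a single cache set. This is a simplified illustration under assumed conventions (index 0 is the eviction end of the priority stack, and a core's insertion position equals its allocation), not the paper's exact hardware mechanism.

```python
import random

class PIPPSet:
    """Toy model of one set under PIPP-style insertion/promotion.

    alloc[i] is the number of ways a UCP-style partitioner assigned to
    core i. Insertions land at a priority position equal to the core's
    allocation (clamped to the current occupancy), hits promote a line
    by one position with probability p_prom, and the line at the lowest
    priority position (index 0) is the eviction victim.
    """

    def __init__(self, num_ways, alloc, p_prom=0.75, seed=0):
        self.ways = num_ways
        self.alloc = alloc            # partition size per core
        self.p_prom = p_prom
        self.stack = []               # index 0 = lowest priority
        self.rng = random.Random(seed)

    def access(self, core, addr):
        tag = (core, addr)
        if tag in self.stack:         # hit: promote one slot w.p. p_prom
            i = self.stack.index(tag)
            if i + 1 < len(self.stack) and self.rng.random() < self.p_prom:
                self.stack[i], self.stack[i + 1] = self.stack[i + 1], self.stack[i]
            return True
        if len(self.stack) == self.ways:   # miss in a full set: evict
            self.stack.pop(0)              # lowest-priority line goes
        # insert at a priority proportional to the core's allocation
        pos = min(self.alloc[core], len(self.stack))
        self.stack.insert(pos, tag)
        return False
```

Note how the model captures way stealing: a core with a small allocation still occupies extra ways if cores with larger allocations leave them unused, because nothing in the insertion rule reserves ways per core.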
RADAR Our discussion so far has focused on multiple programs sharing the last-level cache. Manivannan et al. [2016] instead look at the problem of last-level cache replacement for task-parallel programs running on a multi-core system. Their policy, called RADAR, combines static and dynamic program information to predict dead blocks for task-parallel programs. In particular, RADAR builds on task data-flow programming models, such as OpenMP, where programmer annotations explicitly specify (1) dependences between tasks and (2) the address regions that each task will access. The runtime system uses this information in conjunction with dynamic program behavior to predict regions that are likely to be dead. Blocks that belong to dead regions are demoted and preferentially evicted from the cache.
More concretely, RADAR has three variants that combine information from the pro-
gramming model and the architecture in different ways. First, the Look-ahead scheme uses the
task data-flow graph to peek into the window of tasks that are going to be executed soon, and it
uses this information to identify regions that are likely to be accessed in the future and regions
that are likely to be dead. Second, the Look-back scheme tracks per-region access history to
predict when the next region access is likely to occur. Finally, the combined scheme exploits
knowledge of future region accesses and past region accesses to make more accurate predictions.
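The combined scheme's decision can be summarized as a simple predicate: a region is predicted dead only when both sources of evidence agree. The function below is a hypothetical sketch of that logic; the argument names, the history representation (last access time plus a predicted re-access interval), and the `horizon` threshold are all invented for illustration.

```python
def region_predicted_dead(region, lookahead_regions, history,
                          current_time, horizon):
    """Combined Look-ahead + Look-back sketch (hypothetical interface).

    lookahead_regions: regions touched by tasks in the look-ahead
        window of the task data-flow graph (Look-ahead evidence).
    history: region -> (last_access_time, predicted_interval), used to
        estimate when the next access will occur (Look-back evidence).
    A region is predicted dead only if no upcoming task accesses it AND
    its predicted next access lies beyond `horizon` time units.
    """
    in_window = region in lookahead_regions
    last, interval = history.get(region, (None, None))
    if last is None:
        accessed_soon = False          # no history: rely on look-ahead
    else:
        accessed_soon = (last + interval) - current_time <= horizon
    return (not in_window) and (not accessed_soon)
```

Requiring agreement between the two schemes is what makes the combined predictor more accurate: the look-ahead window can miss accesses beyond its horizon, while the history can be stale, so each source vetoes the other's false positives.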
5.4 PREFETCH-AWARE CACHE REPLACEMENT
In addition to caches, modern processors use prefetchers to hide the long latency of accessing
DRAM, and it is essential that these mechanisms work well together. Prefetched data typically