the resulting compressed cache loses the performance gains from state-of-the-art replacement
policies.
To avoid negative interactions with replacement policies, Gaur et al. introduce a cache design that guarantees that all lines that would have existed in an uncompressed cache are also present in the compressed cache. Their design keeps the data array unmodified but modifies the tag array to accommodate compression: the tag array is augmented to associate two tags with each physical way. Logically, the cache is partitioned into a Baseline cache, which is managed just like an uncompressed cache, and a Victim cache, which opportunistically caches victims from the Baseline cache if they can be compressed. This design guarantees a hit rate at least as high as that of an uncompressed cache, so it enjoys the benefits of advanced replacement policies. Furthermore, it can leverage the benefits of compression with simple modifications to the tag array.
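To make the two-tag organization concrete, the C++ sketch below models the tag array of one set, assuming (purely for illustration) that a physical way can hold its baseline line plus one compressed victim. The names and the compressibility check are assumptions of this sketch, not details of Gaur et al.'s actual design.

// One set of a compressed cache with two tags per physical way. A hit
// in either the Baseline or the Victim tag counts as a cache hit, so
// the hit rate can never fall below that of the uncompressed baseline.
#include <array>
#include <cstddef>
#include <cstdint>
#include <optional>

struct WayTags {
    std::optional<uint64_t> baseline; // managed like an uncompressed cache
    std::optional<uint64_t> victim;   // compressed victim, if one fits
};

template <std::size_t Assoc>
struct CompressedSet {
    std::array<WayTags, Assoc> ways;

    bool lookup(uint64_t tag) const {
        for (const auto& w : ways)
            if (w.baseline == tag || w.victim == tag)
                return true;          // hit in either logical cache
        return false;
    }

    // When the baseline replacement policy evicts a line, the Victim
    // cache opportunistically retains it, but only if it compresses
    // (the `compressible` flag stands in for a real compression check).
    void onBaselineEvict(uint64_t victimTag, bool compressible) {
        if (!compressible) return;    // victim is dropped as usual
        for (auto& w : ways) {
            if (!w.victim) {          // a way with spare compressed space
                w.victim = victimTag;
                return;
            }
        }
    }
};

Because the baseline tags are managed exactly as in an uncompressed cache, any state-of-the-art replacement policy can operate on them unchanged; the victim tags only ever add hits.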
5.6 NEW TECHNOLOGY CONSIDERATIONS
For many decades now, caches have been built using SRAM technology, but newer memory
technologies promise change as they have been shown to address many limitations of conven-
tional SRAM caches [Fujita et al., 2017, Wong et al., 2016]. An in-depth analysis of these
technologies is beyond the scope of this book, but we now briefly discuss design tradeoffs that
emerging memory technologies will introduce for cache replacement.
5.6.1 NVM CACHES
Korgaonkar et al. show that last-level caches based on Non-Volatile Memories (NVM) promise
high capacity and low power but suffer from performance degradation due to their high write
latency [Korgaonkar et al., 2018]. In particular, the high latency of writes puts pressure on the NVM cache’s request queues, which in turn exerts backpressure on the CPU and interferes with performance-critical read requests.
To mitigate these issues, Korgaonkar et al. propose two cache replacement strategies. First,
they introduce a write congestion aware bypass (WCAB) policy that eliminates a large fraction
of writes to the NVM cache, while avoiding large reductions in the cache’s hit rate. Second,
they establish a virtual hybrid cache that absorbs and eliminates redundant writes that would
otherwise result in slow NVM writes.
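One plausible reading of the second strategy, sketched below in C++, is a small fast buffer that coalesces repeated writes to the same line, so that only the final version of a line is ever written to slow NVM. The buffer organization and eviction choice here are illustrative assumptions, not the exact mechanism of Korgaonkar et al.

// Hypothetical write-absorbing buffer in front of an NVM cache.
// Redundant writes to a buffered line update the fast copy in place;
// the slow NVM write happens at most once per buffered line, on eviction.
#include <cstddef>
#include <cstdint>
#include <unordered_map>

class WriteAbsorber {
    std::unordered_map<uint64_t, uint64_t> buffer; // line address -> data
    std::size_t capacity;
    uint64_t nvmWrites = 0;                        // slow writes actually issued

public:
    explicit WriteAbsorber(std::size_t cap) : capacity(cap) {}

    void write(uint64_t line, uint64_t data) {
        auto it = buffer.find(line);
        if (it != buffer.end()) {     // redundant write: absorbed in fast memory
            it->second = data;
            return;
        }
        if (buffer.size() == capacity)
            evictOne();               // frees a slot at the cost of one NVM write
        buffer.emplace(line, data);
    }

    uint64_t slowWritesIssued() const { return nvmWrites; }

private:
    void evictOne() {
        ++nvmWrites;                  // the only point where NVM is written
        buffer.erase(buffer.begin()); // arbitrary victim, for brevity
    }
};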
WCAB builds on the observation that traditional bypassing policies [Khan et al., 2010]
perform a limited number of write bypasses because they optimize for hit rates instead of write
intensity. Unfortunately, naively increasing the intensity of write bypassing adversely affects the
cache’s hit rate, negating the capacity benefits of NVM caches. Thus, there is a tradeoff between cache hit rate and write intensity. Korgaonkar et al. manage this tradeoff by dynamically estimating write congestion and liveness. If write congestion is high, WCAB sets a high target live score, which means that the liveness score of a line must be extremely high for it not to be bypassed. Alternatively, if write congestion is low, WCAB performs conservative bypassing by setting a low target live score.