Chapter 3. Intentional Cache Rules

In this chapter we will look at the different types of content we can cache, but first let’s quickly cover two concepts that apply to everything we will be talking about:

  • Hot, warm, and cold cache
  • Cache freshness

Hot, Warm, and Cold Cache

There are several states a cache can be in:

Cold cache

A cold cache is empty and results in mostly cache misses.

Warm cache

The cache has started receiving requests and has begun retrieving objects and filling itself up.

Hot cache

All cacheable objects are retrieved, stored, and up to date.

A cache starts cold, either with no objects stored or with only stale ones. Then, over time and as requests come in, the server retrieves the objects and fills the cache. A hot cache will cool and eventually turn cold again as time goes by and the content starts to expire.

Cache Freshness

A cold cache can also be the result of expired content, based on the max-age directive specified. A cached object is considered fresh if its age (the time since it was stored) is within the max-age. That window is its time to live (sometimes called TTL). See Figure 3-1 for a pattern that describes the warming and cooling of a cache.

Figure 3-1. Cold cache warming and cooling over time and requests
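
To make the freshness window concrete, here is a minimal JavaScript sketch of the check a cache performs. The function name and inputs are illustrative only; they are not any particular cache’s API:

    // A cached object is fresh while its age is within max-age.
    // storedAtMs: when the object was stored; maxAgeSeconds: from the
    // response's Cache-Control: max-age directive.
    function isFresh(storedAtMs, maxAgeSeconds, nowMs = Date.now()) {
      const ageSeconds = (nowMs - storedAtMs) / 1000;
      return ageSeconds < maxAgeSeconds; // still inside its TTL
    }

    // An object stored 4 minutes ago with max-age=300 is still fresh.
    isFresh(Date.now() - 4 * 60 * 1000, 300); // => true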

Now that we understand the concept of caching, the types of caches, and the benefits we can get from using them, let’s talk about the types of content we may have and how we can think about applying cache rules to them.

Static Content

The most obvious thing to cache is static content, because it is shared across users and doesn’t change often. Things like fonts, images, and CSS and JavaScript files that are shared and are not going to be updated frequently are your low-hanging fruit. Go through and adjust their cache-control rules for an immediate and noticeable bang for your buck.

Shared static content is especially effective because the first person to access that content warms the cache, and every subsequent request, from every user, is going to be a cache hit.
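
Here is what that adjustment might look like in an Express application. Express itself is an assumption on my part (any server that sets Cache-Control headers works the same way), and the paths are hypothetical:

    const express = require('express');
    const app = express();

    // Shared static assets (fonts, images, CSS, JavaScript) get a long TTL.
    // "immutable" tells browsers not to revalidate the file while it is fresh.
    app.use('/static', express.static('public', {
      maxAge: '365d', // one year
      immutable: true,
    }));

    app.listen(3000);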

The most important thing to think about when caching your content is how frequently that content is updated. For example, one of the sites that I’ve been overseeing for a long while now is a web portal where editorial staff, via a CMS, update content several times throughout the day. The content is stored as flat files that get fingerprinted and stored on a content server.

The content flat files are cached heavily, from 30 days up to a year. This is because every new file gets a unique URL; even if the editors need to make a change to an existing story, they output a new file with its own cache rules. The base page that reads in the flat files, on the other hand, is cached for only five minutes (see Figure 3-2).

Figure 3-2. An editor creates content that has a different TTL than the page that reads it in, allowing new content to be created and loaded onto the page quickly if need be, or to live long in cache if there are no updates
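
Sketched as Express-style routes (again an assumption on my part; the route paths and file locations are hypothetical), the two rules look like this:

    const express = require('express');
    const path = require('path');
    const app = express();

    // Hypothetical directory holding the fingerprinted flat files.
    const contentRoot = path.join(__dirname, 'content');

    // Fingerprinted flat files: every edit produces a new file at a new
    // URL, so these can be cached aggressively, 30 days up to a year.
    app.get('/content/:file', (req, res) => {
      res.set('Cache-Control', 'public, max-age=31536000'); // one year
      res.sendFile(req.params.file, { root: contentRoot });
    });

    // The base page that reads the flat files in is cached for only five
    // minutes, so newly published content shows up quickly.
    app.get('/', (req, res) => {
      res.set('Cache-Control', 'public, max-age=300'); // five minutes
      res.send('<html><!-- base page markup --></html>');
    });

    app.listen(3000);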

Personalized Content

The more challenging scenario is how to apply the concepts of client-side scaling to sites that are primarily made up of personalized content. By definition, personalized content is created for the specific user who is logged in. Think of Amazon.com’s Your Orders page, which shows all of your recent orders and their real-time delivery status (see Figure 3-3).

Figure 3-3. The Your Orders page on Amazon.com, filled with data that is unique and personalized to my order history, but also with content that is made up of publicly cached assets

In the past this has been accomplished by gathering all of the personalized content on the server side and, only once everything is assembled, returning and presenting the page to the end user. There may be reasons for this approach, but it results in a worse perception of speed for the end user.

When my team and I were presented with this challenge, our solution was instead to stay true to our philosophy of scaling at the frontend. To do this we created a design pattern that splits the frontend from the middleware, each being its own application with its own build and deploy process and its own node cluster.

In the frontend application, we store all of the mostly unchanging shared static content—think JavaScript libraries, images, and CSS files—and we set very long TTLs for it. The frontend then calls the middleware via XHR requests to populate the current page, and proactively retrieves the data for subsequent pages.
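
A minimal browser-side sketch of that pattern, using fetch rather than raw XHR (the endpoint URLs and the render function are hypothetical):

    // Hypothetical render step; a real app would update the DOM here.
    function renderOrders(orders) {
      console.log('rendering', orders.length, 'orders');
    }

    // Populate the current page from the middleware.
    async function loadOrders() {
      const res = await fetch('/api/orders');
      renderOrders(await res.json());
    }

    // Proactively retrieve data for the pages a user is likely to visit
    // next, so navigation feels instant; prefetch failures are non-fatal.
    function prefetchNextPages() {
      ['/api/account', '/api/recommendations'].forEach((url) => {
        fetch(url).catch(() => {});
      });
    }

    loadOrders().then(prefetchNextPages);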

The middleware, on the other hand, required us to be very careful about the cache rules. This is where users take action on their account and expect to see the results immediately.

In my example we had the luxury of a previous version of the site that had been live for years, so we could look at past usage data and see things like how long sessions were, how frequently users returned to a page, and how long they stayed on pages.

If you don’t have this luxury, you can conduct a user study to see how your test group uses your application. If you aren’t familiar with user studies, Travis Lowdermilk has written extensively about them in his book User-Centered Design (O’Reilly). If all else fails, just take educated guesses. Either way, you will be analyzing your usage data once you are live, and you can adjust and course-correct as necessary.

So, based on our established usage data, we cached some API calls for one minute, some for five minutes, and some even longer.
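
Expressed as Express-style middleware (a sketch, not our production code; the endpoints and payloads are hypothetical, while the one- and five-minute TTLs are the ones named above):

    const express = require('express');
    const app = express();

    // Small helper that sets a per-route TTL via the Cache-Control header.
    // "private" keeps personalized responses out of shared caches.
    const cacheFor = (seconds) => (req, res, next) => {
      res.set('Cache-Control', `private, max-age=${seconds}`);
      next();
    };

    // TTLs chosen from usage data: fast-changing data gets one minute...
    app.get('/api/orders', cacheFor(60), (req, res) => {
      res.json({ orders: [] }); // hypothetical payload
    });

    // ...slower-changing data gets five minutes or longer.
    app.get('/api/profile', cacheFor(300), (req, res) => {
      res.json({ profile: {} }); // hypothetical payload
    });

    app.listen(3000);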

Summary

Caching is not just for static content that rarely gets updated. All sorts of content can benefit from caching, even if the TTL is small. It is worth analyzing how frequently your content is used and updated, and setting cache rules based on that, so that even personalized data can get the performance boost of caching.

But no matter what a piece of content’s TTL is, its cache will still likely experience a sine wave of warming and cooling, driven by usage and by how long the content stays fresh. On creation, the content’s cache starts out cold, warms up as users begin requesting it, then cools down as requests slow and time passes.
