Chapter 8. Lazy Loading

At the beginning of the book, we discussed the large percentage of requests and bytes that images account for. Much of that is due to the sheer amount of data needed to communicate a high-resolution visual. However, another significant portion is usually wasteful. A huge number of images are in fact never seen by the user, and do nothing but waste bandwidth and resources.

The one to blame for this waste is the scroll bar. We’re all very familiar with scrolling down on pages, and today very few pages fully fit on a screen. Only 38% of an average web page is immediately visible on a typical desktop screen (see Figure 8-1). Over 80% of image requests deliver images that are not visible when the page is loaded.

This pattern is even more noticeable on mobile devices, with their smaller screens. The smaller visible area can hold less content (and fewer images), and yet website owners often try to serve the same content regardless of viewport. They usually do so while avoiding horizontal scrolling, which makes for a subpar user experience. Such mobile pages compensate for the lack of horizontal space with vertical space. In other words, if they can’t make the page wider, they’ll make it longer… which increases the portion of images not immediately visible during load.

While long pages are often the right design and UX decision, images that aren’t immediately visible do have a performance cost. They compete with visible content for bandwidth and CPU, occupy TCP connections that visible resources may need, and delay the documentComplete (aka onload) event and any interaction-related event handlers that await it. Note that the firing of the onload event also stops the browser’s progress indicators, such as a progress bar or spinning icon. As a result, a slow-loading invisible image can substantially delay the user being told the page is ready for use.

Figure 8-1. Sample pages, area below screen marked as sepia

The Digital Fold

The immediately visible area of a page is often referred to as being “above the fold,” adopting a term from the physical newspaper world. Physical newspapers are usually large, and thus folded in two for easy stacking and carrying. The upper half of the page, the part “above the fold,” is immediately visible when someone glances at a stack of newspapers, while the rest of the page requires an action—unfolding.

Web pages clearly don’t have an actual fold, and browser window sizes differ greatly. Still, both web and newspaper pages have an area that is immediately visible, and a part that requires action—be it unfolding or scrolling. As a result, the parts of a web page that do and do not fit on the screen right away are often referred to as above or below the fold, respectively.

This analogy doesn’t end with user action, but rather continues into the content itself. In physical newspapers, the most important stories are featured above the fold, hoping to grab consumers’ attention and entice them to buy the paper. On websites, similarly, the immediately visible area often holds the content most likely to trigger an action. Be it the hottest news story, a featured product, or a corporation’s key message, the “above the fold” area attempts to make the user take action.

Tip

The term digital fold is a hot conversation topic among web designers, with strong arguments in favor of and against using it. For convenience, if nothing else, we will use the term fold in this book.

Wasteful Image Downloads

In most cases, the desired user action involves navigating away from the current page. Since we’re putting the most important content at the top, it becomes quite likely that users will click away without ever scrolling down. In fact, we may consider that a success, and strive to make it happen more often! In addition, since this content prioritization is so common, users have grown to expect it, and are conditioned not to bother scrolling all the way down. These two traits create a virtuous/vicious cycle, effectively encouraging people not to scroll.

Users who don’t scroll turn these “below the fold” images from a performance hindrance into complete waste. Roughly 50% of users either don’t scroll or barely scroll, especially on a home page. Combining these numbers with the previous stats about visible images, we see that over 40% of image downloads on web pages are wasteful!

Why Aren’t Browsers Dealing with This?

This excessive downloading of images is directly due to the way HTML, and specifically the <img> tag, is defined. Once a browser sees an <img> tag, it must download the image file it references. In addition, a part of the onload event definition is that all resources on the page, including all images, have been loaded. Therefore, browsers cannot avoid downloading an image the user may not see.

That said, browsers can control the priority and order of the downloaded resources. Browsers often use this prerogative, for instance, to prioritize downloading JS and CSS files over images. Among image downloads, browsers have historically not done much prioritization, treating them all equally. However, as we mentioned in the preloader conversation, browser prioritization is becoming increasingly dynamic, and some browsers are starting to give visible images a higher priority where possible. This is especially impactful when used in combination with HTTP/2 or SPDY.

Even with such improved prioritization, browsers are still mandated to download all images on the page and delay the onload event until that process is complete. Several attempts have been made to provide a standard way to indicate that an image’s loading should be deferred, most notably the <img defer> attribute and the lazyload attribute in the abandoned Resource Priorities specification. However, neither has actually made it through standardization so far. If we want to avoid this waste, the only option we have is to take the loading of images into our own hands—and that means using JavaScript.

Loading Images with JavaScript

There are several ways to load images with JavaScript, all fairly straightforward. Let’s start with a very simple example (see Example 8-1).

Example 8-1. Loading an image with JavaScript—simple case
<img id="the-book" alt="A Book" height="200" width="50">
<script>
document.getElementById("the-book").src = "book.jpg";
</script>

Note that the <img> tag in this example has no src attribute. The <img> will still be parsed and placed in the DOM, and the layout will still reserve the specified space for it, but without a src attribute the browser will have no URL to download. Later on, a script looks up this specific tag and sets its src attribute. Only then does the browser download the image and render it in the allotted space.

This example shows the only true requirements for loading images with JS: omitting the src attribute, and setting it with a script. However, it will be hard to maintain this technique for many images, as it splits the image into two separate parts: the <img> element and the script. To avoid this problem, we can keep the URL on the <img> tag itself, but use a data-src attribute instead (see Example 8-2).

Example 8-2. Loading multiple images with JavaScript
<img data-src="book.jpg" alt="A Book" height="200" width="50">
<img data-src="pen.jpg" alt="A Pen" height="200" width="50">
<img data-src="cat.jpg" alt="A Cat" height="200" width="50">
<script>
var images = document.querySelectorAll("img");
for (var i = 0; i < images.length; ++i) {
    var img = images[i];
    // Copy the data-src attribute to the src attribute
    if (!img.src && img.getAttribute("data-src"))
        img.src = img.getAttribute("data-src");
}
</script>

The data- prefix is a standard way in HTML5 to provide metadata in an element, most often to be consumed by JavaScript. By using it, we again have all the image information in the <img> tag, and can use a generic script to load all the images.
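Incidentally, HTML5 also exposes data- attributes through the element’s dataset property, so the copy inside the loop above could equivalently be written as follows (a minor variation, supported in all modern browsers):

// Equivalent copy using the dataset property instead of getAttribute
if (!img.src && img.dataset.src)
    img.src = img.dataset.src;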

Deferred Loading

Of course, this script is not very useful. We moved from native loading of images to JS-based loading, but we’re still loading all the images! To improve on that, let’s refine the logic to load “above the fold” images first (see Example 8-3).

Example 8-3. Load images with JS, visible images first
// Test if an image is positioned inside the initial viewport
function isAboveTheFold(img) {
    var elemOffset = function(elem) {
        var offset = elem.offsetTop;
        while (elem = elem.offsetParent) {
            offset += elem.offsetTop;
        }
        return offset;
    };
    var viewportHeight = window.innerHeight || document.documentElement.clientHeight;
    var imgOffset = elemOffset(img);
    return ((imgOffset >= 0) && (imgOffset <= viewportHeight));
}

// Load either all or only "above the fold" images
function loadImages(policy) {
    // Iterate over all image elements on the page
    var images = document.querySelectorAll("img");
    for (var i = 0; i < images.length; ++i) {
        var img = images[i];
        // Skip below-the-fold images unless we're loading all
        if (!policy.loadAll && !isAboveTheFold(img))
            continue;
        // Copy the data-src attribute to the src attribute
        if (!img.src && img.getAttribute("data-src"))
            img.src = img.getAttribute("data-src");
    }
}

// Load above the fold images
loadImages({loadAll: false});

// At the load event, load all images
window.addEventListener("load",function() {
    loadImages({loadAll: true});
});

Let’s review the additional code changes we’ve made:

  1. We added the isAboveTheFold function to test if an image is above the fold.

  2. We wrapped the image loading in the loadImages function, and added an option to load images only if they’re above the fold.

  3. We use loadImages to load images above the fold immediately.

  4. At onload, we load all images.

The first three steps create the prioritization we’re looking for, only loading above the fold images, and keeping lower images from interfering. Once the page is loaded, the last step triggers and loads the remaining images for those users who do scroll down. Such loading is called deferred loading, and is a good way to accelerate the more important content.

Lazy Loading/Images On Demand

While deferred loading accelerates pages, it doesn’t prevent waste. As we mentioned before, many users don’t scroll all the way (or at all), and thus many of the images are never seen. Deferring those image downloads still incurs the wasted bandwidth and battery drain of fetching them.

To avoid this waste, we need to change our image loading to be “on demand,” only loading an image when it comes into view. This technique is often called lazy loading, as we only do “the work” (downloading the image) when we absolutely must. Other common names are images on demand or just-in-time images.

Pure lazy loading will start the image download only when the image comes into view. However, doing so is likely to impact the user experience, as the user will be looking at a blank space while the image is actually downloaded and rendered. To mitigate that, we can try to anticipate user actions and download the image ahead of time. For instance, we can load images that are fewer than 200 pixels below the current visible area, trying to stay ahead of slow user scrolling. A more aggressive prefetch can improve the user experience, but will also increase the amount of wasted downloads.

In code, lazy loading requires listening to a variety of events that may change the content in view, such as scrolling, resizing, changing orientation, and more. It may also need to track application actions that impact what’s in view—for instance, collapsing a page section. Each time an event fires, we need to re-examine all undisplayed images and choose which ones to load.
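To make this concrete, here is a minimal sketch of such event-driven lazy loading. It reuses the data-src convention from the earlier examples, checks positions with getBoundingClientRect, assumes a 200-pixel lookahead below the viewport (an arbitrary choice), and coalesces rapid event bursts with requestAnimationFrame:

// Test if an image is in view, or within 200 pixels below the viewport
function isNearViewport(img) {
    var viewportHeight = window.innerHeight || document.documentElement.clientHeight;
    var rect = img.getBoundingClientRect();
    return (rect.bottom >= 0) && (rect.top <= viewportHeight + 200);
}

// Re-examine all images that still carry a data-src attribute
function loadImagesInView() {
    var images = document.querySelectorAll("img[data-src]");
    for (var i = 0; i < images.length; ++i) {
        var img = images[i];
        if (!img.src && isNearViewport(img))
            img.src = img.getAttribute("data-src");
    }
}

// Coalesce rapid event bursts into at most one check per frame
var checkPending = false;
function onViewChange() {
    if (checkPending)
        return;
    checkPending = true;
    requestAnimationFrame(function() {
        checkPending = false;
        loadImagesInView();
    });
}

window.addEventListener("scroll", onViewChange);
window.addEventListener("resize", onViewChange);
window.addEventListener("orientationchange", onViewChange);

// Load whatever is already in view on startup
loadImagesInView();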

Lazy loading is a fairly simple concept, and the sketch above captures its essence, but it’s hard to do well. It’s easy to miss a change in the visual area, as there are many events to listen for, and browsers implement them in subtly different ways. Even when you capture a change event, traversing all images to determine which is now visible is hard to do efficiently, especially when the handler may be called many times in rapid succession.

When considering lazy loading, first confirm whether deferred loading would satisfy your needs. It’s much easier to implement, and is less error prone. If you still want to do lazy loading, it’s recommended that you use an existing JavaScript library. A prominent example is the lazySizes library, which lazy-loads images while playing well with the various responsive images solutions (more on that in Chapter 11). There are also automated services that can help you get lazy loading working in an optimal way with minimal effort.
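For reference, basic usage of lazySizes looks roughly like this (markup based on the library’s documented conventions; details may vary between versions):

<script src="lazysizes.min.js" async></script>
<img data-src="book.jpg" class="lazyload" alt="A Book">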

If you still insist on implementing it yourself, remember to err in favor of loading the image—for instance, loading any image whose location you can’t easily determine—and consider a background “cleanup” loop that runs every second or so to confirm you haven’t missed any images.
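Such a cleanup loop can be as simple as the following, reusing the loadImagesInView function from the earlier sketch:

// Backstop: once a second, catch any image the event handlers missed
setInterval(loadImagesInView, 1000);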

IntersectionObserver

Traditionally, lazy loading libraries relied on the browser’s scroll events to know when the user scrolled the page, and inferred from that when certain images were about to enter the viewport and should therefore be loaded.

However, scroll event handling is very easy to get wrong, resulting in janky scrolling, which frustrates users. The fact that many different libraries on the page were registering scroll events in order to figure out element visibility (resulting in abysmal scroll performance) prompted browser vendors to create dedicated, highly performant primitives for that purpose.

The result of that effort is the IntersectionObserver API, which permits you to “observe” the intersection of a certain element with another element or with the viewport, and get dedicated callbacks when an element is about to enter the viewport.

You can also define custom distances for “intersections,” which permits you to tell the browser things like “let me know when this element is 75% viewport height away from the current viewport.”
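Here is a minimal sketch of lazy loading on top of this API, again using the data-src convention; the rootMargin option asks the browser to report images 200 pixels before they enter the viewport (the distance is an arbitrary choice):

// Load each observed image as it approaches the viewport
var observer = new IntersectionObserver(function(entries) {
    for (var i = 0; i < entries.length; ++i) {
        if (!entries[i].isIntersecting)
            continue;
        var img = entries[i].target;
        img.src = img.getAttribute("data-src");
        observer.unobserve(img);  // a loaded image needs no further watching
    }
}, { rootMargin: "0px 0px 200px 0px" });

var lazyImages = document.querySelectorAll("img[data-src]");
for (var i = 0; i < lazyImages.length; ++i) {
    observer.observe(lazyImages[i]);
}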

As of this writing, the API is only shipped in Chrome, but as more browsers adopt it, lazy loading libraries are bound to move to this dedicated, jank-free API.

When Are Images Loaded?

Looking at the loadImages function from before, you’ll notice it queries for all the images in the DOM. We would therefore want to call it only after the DOM is fully constructed, that is, after all HTML and synchronous JavaScript have been delivered and processed. Since no image will be downloaded until this function is called, this approach can lead to a substantial delay in when the images are loaded. To mitigate this effect, we can call the function multiple times at various points in the page, though that in turn carries a computational cost. Achieving an optimal balance is doable, but hard.
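For instance, the page could repeat the call after each major block of content, picking up the images parsed so far (a sketch; the markup and the image-loader.js file that defines loadImages are illustrative):

<script src="image-loader.js"></script>  <!-- defines loadImages, as in Example 8-3 -->
<div class="top-stories">
    <!-- ... images using data-src ... -->
</div>
<script>loadImages({loadAll: false});</script>
<div class="more-stories">
    <!-- ... more images using data-src ... -->
</div>
<script>loadImages({loadAll: false});</script>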

Another approach would be to replace the function call with an event-driven load. Consider Example 8-4.

Example 8-4. Load visible images using image onload event
<script>
// Load the image if it's above the fold
function loadImage(img) {
    // Check if the image has a data-src attribute
    var dataSrc = img.getAttribute("data-src");

    // If the image is above the fold - load it
    if (dataSrc && isAboveTheFold(img)) {
        // Remove the onload handler, so it won't be called again
        img.onload = null;
        // Load the real image
        img.src = dataSrc;
    }
}
</script>
<div class="book-image-container">
    <img src="1px.gif" data-src="book.jpg" alt="A Book"
         onload="loadImage(this)">
</div>

At the bottom, you can see a modified <img> tag. Instead of omitting the src attribute, we point it at a tiny placeholder image file. Once that placeholder is loaded, the loadImage function in the onload attribute is called; it checks whether the image is above the fold, and loads it if so. Since loading the real image would unnecessarily trigger the onload event again, we remove the handler before updating the src attribute.

This event-based loading is a bit more verbose, requiring us to set the onload attribute on every <img> tag, but it solves the previously mentioned delay. The browser will load the placeholder image as soon as it can, and fire the load event immediately after.

While it helps accelerate the initial load, event-driven image loading doesn’t completely eliminate the need to iterate over the images. You’ll still need to listen to the many events that change what’s in view, such as scrolling and resizing, and then iterate images to determine if they’re now in view. In addition, any type of JS-based image loading, including this one, will interfere with the preloader—which we will talk about next.

The Preloader and Images

As we mentioned in the previous chapter, browsers use the preloader to accelerate pages. The preloader parses the page ahead of the DOM builder, primarily to identify and start downloading external resources.

Not surprisingly, many of the resources the preloader finds are images. While it depends on their prioritization logic, browsers will often start downloading these images while still busy downloading and processing JS and CSS files. Even images that are not immediately fetched may be accelerated through early DNS resolution of their hostnames, pre-establishing TCP connections to those hosts, and more.

When we use JavaScript to load our images, we effectively disable the preloader. Our JS code, regardless of whether it’s written as an onload handler or a loop, will not run until the DOM builder has actually reached the element we’re handling. As a result, JS-loaded images are likely to start downloading later than native ones.

While this delay is important to consider, it’s not easy to define just how impactful it will be. Different browsers implement different prioritization schemes, and many will delay image downloads until JS and CSS files have been processed anyway. As a result, an image may be delayed due to prioritization just as much as due to being hidden from the preloader, making this whole conversation moot.

To help visualize this, let’s look at the waterfall chart of two simple pages, created using Steve Souders’s Cuzillion. Both pages hold one JavaScript file and two images, but in Page 1 the images are loaded natively (an <img> tag), while in Page 2 they are loaded using JavaScript. To better visualize the effect in the waterfall charts, subresources take 2 seconds to respond. Let’s first look at the loading of the two pages in IE 11, shown in Figures 8-2 and 8-3.

Figure 8-2. Page 1 (native images) in IE
Figure 8-3. Page 2 (JS images) in IE

As is plain to see, the images created using JavaScript start their download only after the external script has completed its own download, dramatically delaying their rendering and the entire page load. In this case, the cost of loading images using JavaScript is very clear.

Now let’s look at the two pages on Firefox (Figures 8-4 and 8-5).

Figure 8-4. Page 1 (Native images) on Firefox
Figure 8-5. Page 2 (JS images) on Firefox

While the pages are the same as before, in this case there is practically no difference in the load time or order between the JS and native image loading. This is due to Firefox’s prioritization logic, which defers all image downloads until all JS and CSS files are fully processed.

Lastly, let’s take a look at how Chrome handles this page (Figures 8-6 and 8-7).

Figure 8-6. Page 1 (native images) on Chrome
Figure 8-7. Page 2 (JS images) on Chrome

Chrome uses a more nuanced logic, wherein only one connection is allowed to download images as long as there are still JS and CSS files to fetch. As a result, the first image on this page is downloaded alongside the JS file, but the second image has to wait, resulting in slightly improved visuals but a similar total page load time.

While this is a simple page, the same behaviors take place when loading a real-world website. The key lessons we can learn are:

  • The preloader makes page loads faster, and hiding images from it (by loading them with JavaScript) can delay image downloads and slow pages down. This is most clearly shown in the IE 11 example.

  • Image downloads are often delayed due to prioritization anyway, reducing the impact of hiding images from the preloader. This was most clearly shown in the Firefox example.

  • Browsers handle image download prioritization very differently, at least in HTTP/1.1. The only way to really know how browsers will behave is to test your page with performance tools. As Paul Lewis often says, “Tools, Not Rules”.

Lazy Loading Variations

Lazy loading is a tradeoff between the savings it offers and the preloader interference it causes. Each website is different, and it’s up to you to decide whether it’s right for your site. In the next sections we’ll discuss several other implications and variations of lazy loading that can help you make this decision.

Browsers Without JS

Loading images with JavaScript requires, obviously, a browser that supports JavaScript. Browsers without JS support, or ones where JS has been disabled, will not run these scripts, and thus will not load the images.

It’s hard to know exactly what portion of users falls into this group. A 2010 study by Yahoo indicated that 1.3% of users used browsers without JS support or with JS turned off. The study was repeated in 2013 by the GOV.UK team, which found that only 0.2% of visitors actively disabled JS, while for 0.9% of visitors JS was enabled but the scripts nevertheless did not run. A 2014 study by WebAIM showed that only 2.4% of screen reader users had JS turned off (mostly on Firefox, presumably using the NoScript extension or another script-blocking extension).

The exact stats vary greatly by the specific audience your site caters to. To find your own number, you can repeat the Yahoo study on your own site or find that number in a different way—for instance, using Simo Ahava’s guide for using Google Analytics for this purpose. If you deem the audience big enough to care, you can still partially support them using the <noscript> element.
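One simple way to take such a measurement is a pair of one-pixel beacons, one in plain HTML inside <noscript> (discussed next) and one requested via script; comparing hit counts on the two URLs approximates your no-JS audience. A sketch, with a hypothetical /beacon endpoint:

<noscript><img src="/beacon?js=0" alt="" width="1" height="1"></noscript>
<script>
    new Image().src = "/beacon?js=1";
</script>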

As you may know, the <noscript> tag holds content that will only be processed by the browser if JavaScript is disabled. We can therefore reference the image a second time inside a <noscript> tag, this time using a simple <img> tag. Example 8-5 does just that.

Example 8-5. Lazy loading with support for no-JS browsers
<img src="1px.gif" data-src="book.jpg" alt="A Book"
     onload="loadImage(this)">
<noscript><img src="book.jpg" alt="A Book"></noscript>

Using <noscript> is simple and has no real downsides, except for the repetition in your HTML (and the maintenance costs that may come with it). Since the increase in payload size is likely minor (after compression), and since most web pages are generated using templates or code anyway (making it easy to add the <noscript> portion), I would recommend doing so.

Unfortunately, the <noscript> mitigation does not work for users who have JavaScript enabled, but for whom, due to corporate or government firewalls, antivirus software, a poor network, or other reasons, the scripts never fully download and run. This scenario cannot currently be fully addressed. Hopefully in the future there will be a standard way to define a fallback for this use case.

Low-Quality Image Placeholders

As you learned in Part I of the book, certain image files, most notably JPEG and WebP, can be made substantially smaller if we reduce their quality rating. Since such compression drops the least significant visuals first, the savings in file size is not linear to the loss in quality, and you can often cut file sizes by half while slightly degrading visual quality.

If we get even more aggressive, we can often cut our image payload by a factor of 4 or more, while suffering only a 20% visual degradation. Such degradation will be noticed by most users, but it should still be clear what the image shows (see Figures 8-8 through 8-11).

Figure 8-8. JPEG quality 90, file size 66 KB
Figure 8-9. JPEG quality 75, file size 37 KB
Figure 8-10. JPEG quality 40, file size 21 KB
Figure 8-11. JPEG quality 25, file size 16 KB
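Producing such variants in a build step takes only a few lines. The following sketch uses the Node.js sharp library (one option among many; the file names and quality value are illustrative):

// Generate a low-quality placeholder variant of an image
var sharp = require("sharp");

sharp("book-high-res.jpg")
    .jpeg({ quality: 25 })
    .toFile("book-low-res.jpg");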

If we make our images that small, the performance impact of downloading “below the fold” images without seeing them won’t be as big. In fact, it may be small enough that we’d prefer to use native image loading and the preloader benefits it carries. Once those low-quality images are loaded, we can use JavaScript to swap some of them with the original high-quality images.

This approach is called low-quality image placeholders (LQIP), as the low-quality images are only seen as placeholders. It consistently makes the page usable faster, and minimizes the need for lazy loading for all but the longest pages (where the number of images below the fold is especially high).

Implementing LQIP is very similar to the implementation of lazy loading, except the 1-pixel placeholders are replaced with the low-quality image variant. In addition, since we don’t want the high-resolution images to interfere with the download of other page assets, we delay their download until after the page is loaded (we can also choose to lazy-load them instead). Example 8-6 shows an LQIP implementation.

Example 8-6. Low-quality image placeholders
<script>
// Replace a placeholder with its real image
function loadImage(img) {
    // Copy the data-src attribute to the src attribute
    var dataSrc = img.getAttribute("data-src");
    if (dataSrc)
        img.src = dataSrc;
}

// Keep a registry of all image elements that need loading
var placeholderImages = [];
function registerPlaceholder(img) {
    // Remove the onload handler, so it won't be called again
    img.onload = null;

    if (isAboveTheFold(img)) {
        // If the image is above the fold, load it right away
        loadImage(img);
    } else {
        // Register below-the-fold placeholders for deferred loading
        placeholderImages.push(img);
    }
}

// Replace all placeholder images
function replacePlaceholders() {
    // Load all placeholder images (can be replaced with lazy loading)
    for (var i = 0; i < placeholderImages.length; ++i) {
        loadImage(placeholderImages[i]);
    }
    placeholderImages.length = 0;
}

// At the load event, replace placeholders with real images
window.addEventListener("load", replacePlaceholders);
</script>
<img src="book-low-res.jpg" data-src="book-high-res.jpg" alt="A Book"
     onload="registerPlaceholder(this)">

Note that LQIP is a tradeoff, as it does include showing users a low-quality image at first. On a fast connection, the high-resolution image will quickly take its place. On a slow connection, the low-quality visual may linger, but at least the page will be usable quickly. In my opinion, it’s a good way to get both speed, gained by the low-quality images, and eventual visual perfection.

Critical Images

As you’ve probably noticed, lazy loading is mostly a means to give visible images priority over ones outside the current viewport. The techniques we’ve described so far were all client-side techniques, which helps make them work well across different pages and viewport sizes. However, we can also try to guess what will be visible on the server side, and tune the page accordingly.

Guessing which images will be visible can be done in two ways: logical and technical. The logical path leverages your knowledge of the application. Does your application have a big “hero image” at the top of the page? Does a product image always show up on the top-left side? Is your logo always in the top-right corner? In many cases, we can (relatively easily) use the design guidelines to guess rather accurately which images will initially be in view.

The technical path involves loading the page in a browser and seeing which images are within view. The most direct way to do so is using a headless browser, such as PhantomJS, in which we can load the page and check which images fall inside the viewport. The generic nature of this path allows it to run on any type of page, but doing it well requires a fair bit of R&D investment. It also assumes the page’s layout is pretty straightforward, and content images are displayed in their HTML order (which is usually the case).

My advice would be not to try to implement the technical path yourself, but instead rely on existing commercial or (future) open-source solutions that would do that for you.
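That said, the core of the technical path is small enough to illustrate. The following rough PhantomJS sketch loads a page at a fixed viewport size and prints the URLs of the images that start within it (the viewport dimensions and file name are arbitrary choices):

// detect-critical.js — run with: phantomjs detect-critical.js http://example.com
var page = require("webpage").create();
var url = require("system").args[1];
page.viewportSize = { width: 1366, height: 768 };
page.open(url, function(status) {
    var critical = page.evaluate(function() {
        var urls = [];
        var images = document.querySelectorAll("img");
        for (var i = 0; i < images.length; ++i) {
            var rect = images[i].getBoundingClientRect();
            if (rect.top < window.innerHeight && rect.bottom > 0)
                urls.push(images[i].src);
        }
        return urls;
    });
    console.log(JSON.stringify(critical));  // the "critical" image URLs
    phantom.exit();
});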

When we estimate that an image will be immediately visible, we can change the HTML to load it using a simple (and fast) native tag, while loading the others with JS. The native images will load quickly, thanks to the preloader and the lower bandwidth contention, while the remaining images will be loaded only if/when they’re needed.
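For example, a page might mark up its hero image natively while keeping the rest on the JS path from Example 8-4 (a sketch; the file names are illustrative):

<!-- Likely visible: native tag, discoverable by the preloader -->
<img src="hero.jpg" alt="Featured product" height="400" width="800">

<!-- Likely below the fold: loaded with JavaScript -->
<img src="1px.gif" data-src="related.jpg" alt="A related product"
     onload="loadImage(this)">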

Note that while we’re affecting image download priority, we’re not impacting functionality. If we thought an image was visible and it wasn’t, we simply downloaded it prematurely. If we incorrectly thought it was hidden, it’ll still be loaded with JS shortly after. As a result, don’t try to get it perfectly right from day one. Start by prioritizing (natively loading) the obviously important images (e.g., hero images, product images), and gradually tune over time.

Summary

There’s little doubt that many web images today are needlessly downloaded, introducing unnecessary delay to web pages and wasteful load on servers. Lazy loading can help tune those downloads. However, due to the lack of native browser support, it requires loading images with JavaScript, which in turn carries other performance implications. Consider whether lazy loading is worth the tradeoff for you. The longer and more visually rich your web pages, the more likely it will be worthwhile.

If you’ve decided to implement lazy loading, find the images most likely to always be visible, and load them natively. For JS image loading, choose between lazy loading, which will conserve the most bandwidth, and deferred loading, which will provide a smoother scrolling experience. Lastly, consider using low-quality image placeholders across the board, making the page usable faster without compromising the eventual look.
