Chapter 9
IN THIS CHAPTER
Laying out the basic types of data visualizations
Choosing the perfect data visualization type for the needs of your audience
Picking the perfect design style
Adding context
Crafting clear and powerful visual messages with the right data graphic
Any standard definition of data science will specify that its purpose is to help you extract meaning and value from raw data. Finding and deriving insights from raw data is at the crux of data science, but these insights mean nothing if you don’t know how to communicate your findings to others. Data visualization is an excellent means by which you can visually communicate your data’s meaning. To design visualizations well, however, you must know and truly understand the target audience and the core purpose for which you’re designing. You must also understand the main types of data graphics that are available to you, as well as the significant benefits and drawbacks of each. In this chapter, I present you with the core principles in data visualization design.
A data visualization is a visual representation that’s designed for the purpose of conveying the meaning and significance of data and data insights. Since data visualizations are designed for a whole spectrum of different audiences, different purposes, and different skill levels, the first step to designing a great data visualization is to know your audience. Audiences come in all shapes, forms, and sizes. You could be designing for the young and edgy readers of Rolling Stone magazine or to convey scientific findings to a research group. Your audience might consist of board members and organizational decision makers or a local grassroots organization.
Every audience is composed of a unique class of consumers, each with unique data visualization needs, so you have to clarify for whom you’re designing. I first introduce the three main types of data visualizations, and then I explain how to pick the one that best meets the needs of your audience.
Sometimes you have to design data visualizations for a less technical-minded audience, perhaps in order to help members of this audience make better-informed business decisions. The purpose of this type of visualization is to tell your audience the story behind the data. In data storytelling, the audience depends on you to make sense of the data behind the visualization and then turn useful insights into visual stories that they can understand.
With data storytelling, your goal should be to create a clutter-free, highly focused visualization so that members of your audience can quickly extract meaning without having to make much effort. These visualizations are best delivered in the form of static images, but more adept decision makers may prefer to have an interactive dashboard that they can use to do a bit of exploration and what-if modeling.
If you’re designing for a crowd of logical, calculating analysts, you can create data visualizations that are rather open-ended. The purpose of this type of visualization is to help audience members visually explore the data and draw their own conclusions.
When using data showcasing techniques, your goal should be to display a lot of contextual information that supports audience members in making their own interpretations. These visualizations should include more contextual data and less conclusive focus so that people can get in and analyze the data for themselves, and then draw their own conclusions. These visualizations are best delivered as static images or dynamic, interactive dashboards.
You could be designing for an audience of idealists, dreamers, and change-makers. When designing for this audience, you want your data visualization to make a point! You can assume that typical audience members aren’t overly analytical. What they lack in math skills, however, they more than compensate for in solid convictions.
These people look to your data visualization as a vehicle by which to make a statement. When designing for this audience, data art is the way to go. The main goal in using data art is to entertain, to provoke, to annoy, or to do whatever it takes to make a loud, clear, attention-demanding statement. Data art has little to no narrative and offers no room for viewers to form their own interpretations.
To make a functional data visualization, you must get to know your target audience and then design precisely for their needs. But to make every design decision with your target audience in mind, you need to take a few steps to make sure that you truly understand your data visualization’s target consumers.
To gain the insights you need about your audience and your purpose, follow this process:
Brainstorm.
Think about a specific member of your visualization’s audience, and make as many educated guesses as you can about that person’s motivations.
Give this (imaginary) audience member a name and a few other identifying characteristics. I always imagine a 45-year-old divorced mother of two named Brenda.
Define the purpose of your visualization.
Narrow the purpose of the visualization by deciding exactly what action you want audience members to take, or what outcome you want them to reach, as a result of the visualization.
Choose a functional design.
Review the three main data visualization types (discussed earlier in this chapter) and decide which type can best help you achieve your intended outcome.
The following sections spell out this process in detail.
To brainstorm properly, pull out a sheet of paper and think about your imaginary audience member (Brenda) so that you can create a more functional and effective data visualization. Answer the following questions to help you better understand her, and thus better understand and design for your target audience.
Form a picture of what Brenda’s average day looks like — what she does when she gets out of bed in the morning, what she does over her lunch hour, and what her workplace is like. Also consider how Brenda will use your visualization.
To form a comprehensive view of who Brenda is and how you can best meet her needs, ask these questions:
Say that Brenda is the manager of the zoning department in Irvine County. She is 45 years old and a single divorcee with two children who are about to start college. She is deeply interested in local politics and eventually wants to be on the county’s board of commissioners. To achieve that position, she has to get some major “oomph” on her county management résumé. Brenda derives most of her feelings of self-worth from her job and her keen ability to make good management decisions for her department.
Until now, Brenda has been forced to manage her department according to her gut instinct, backed by a few disparate business systems reports. She is not extraordinarily analytical, but she knows enough to understand what she sees. The problem is that Brenda hasn’t had the visualization tools that are necessary to display all the relevant data she should be considering. Because she has neither the time nor the skill to code something herself, she’s been left in the lurch. Brenda is excited that you’ll be attending next Monday’s staff meeting to present the data visualization alternatives available to help her get under way in making data-driven management decisions.
After you brainstorm about the typical audience member (see the preceding section), you can much more easily pinpoint exactly what you’re trying to achieve with the data visualization. Are you attempting to get consumers to feel a certain way about themselves or the world around them? Are you trying to make a statement? Are you seeking to influence organizational decision makers to make good business decisions? Or do you simply want to lay all the data out there, for all viewers to make sense of, and deduce from what they will?
Return to the hypothetical Brenda: What decisions are you trying to help her make, and what processes are you trying to help her improve? Well, you need to make sense of her data, and then you need to present it to her in a way that she can clearly understand. What’s happening within the inner mechanics of her department? Using your visualization, you seek to guide Brenda into making the most prudent and effective management choices.
Keep in mind that you have three main types of visualization from which to choose: data storytelling, data art, and data showcasing. If you’re designing for organizational decision makers, you’ll most likely use data storytelling to directly tell your audience what their data means with respect to their line of business. If you’re designing for a social justice organization or a political campaign, data art can best make a dramatic and effective statement with your data. Lastly, if you’re designing for engineers, scientists, or statisticians, stick with data showcasing so that these analytical types have plenty of room to figure things out on their own.
Back to Brenda — because she’s not extraordinarily analytical and because she’s depending on you to help her make excellent data-driven decisions, you need to employ data storytelling techniques. Create either a static or interactive data visualization with some, but not too much, context. The visual elements of the design should tell a clear story so that Brenda doesn’t have to work through tons of complexities to get the point of what you’re trying to tell her about her data and her line of business.
Analytical types might say that the only purpose of a data visualization is to convey numbers and facts via charts and graphs — no beauty or design is needed. But more artistic-minded folks may insist that they have to feel something in order to truly understand it. Truth be told, a good data visualization is neither artless and dry nor completely abstract in its artistry. Rather, its beauty and design lie somewhere on the spectrum between these two extremes.
To choose the most appropriate design style, you must first consider your audience (discussed earlier in this chapter) and then decide how you want them to respond to your visualization. If you’re looking to entice the audience into taking a deeper, more analytical dive into the visualization, employ a design style that induces a calculating and exacting response in its viewers. But if you want your data visualization to fuel your audience’s passion, use an emotionally compelling design style instead.
If you’re designing a data visualization for corporate types, engineers, scientists, or organizational decision makers, keep the design simple and sleek, using the data showcasing or data storytelling visualization. To induce a logical, calculating feel in your audience, include a lot of bar charts, scatter plots, and line charts. Color choices here should be rather traditional and conservative. The look and feel should scream “corporate chic.” (See Figure 9-1.) Visualizations of this style are meant to quickly and clearly communicate what’s happening in the data — direct, concise, and to the point. The best data visualizations of this style convey an elegant look and feel.
If you’re designing a data visualization to influence or persuade people, incorporate design artistry that evokes an emotional response in your target audience. These visualizations usually fall under the data art category, but an extremely creative data storytelling piece could also inspire this sort of strong emotional response. Emotionally provocative data visualizations often support the stance of one side of a social, political, or environmental issue. These data visualizations include fluid, artistic design elements that flow and meander, as shown in Figure 9-2. Additionally, rich, dramatic color choices can influence the emotions of the viewer. This style of data visualization leaves a lot of room for artistic creativity and experimentation.
Adding context helps people understand the value and relative significance of the information your data visualization conveys. Adding context to calculating, exacting data visualization styles helps to create a sense of relative perspective. In pure data art, you should omit context because, with data art, you’re only trying to make a single point and you don’t want to add information that would distract from that point.
In data showcasing, you should include relevant contextual data for the key metrics shown in your data visualization. For example, suppose you’re creating a data visualization that describes conversion rates for e-commerce sales. The key metric would be the percentage of users who convert to customers by making a purchase. Contextual data that’s relevant to this metric might include shopping cart abandonment rates, average number of sessions before a user makes a purchase, average number of pages visited before making a purchase, or specific pages that are visited before a customer decides to convert. This sort of contextual information helps viewers understand the “why and how” behind sales conversions.
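To make the e-commerce example a little more concrete, here is a minimal Python sketch of how a key metric and one piece of contextual data might be computed before you chart them. The visitor records, field names, and function names are all hypothetical and invented for illustration; your own data will almost certainly be shaped differently.

```python
# Hypothetical visitor records for an online store (invented values).
visitors = [
    {"user": "a", "added_to_cart": True,  "purchased": True},
    {"user": "b", "added_to_cart": True,  "purchased": False},
    {"user": "c", "added_to_cart": False, "purchased": False},
    {"user": "d", "added_to_cart": True,  "purchased": True},
    {"user": "e", "added_to_cart": True,  "purchased": False},
]


def conversion_rate(records):
    """Key metric: share of visitors who make a purchase."""
    if not records:
        return 0.0
    return sum(r["purchased"] for r in records) / len(records)


def cart_abandonment_rate(records):
    """Contextual metric: share of carts that never become a purchase."""
    carts = [r for r in records if r["added_to_cart"]]
    if not carts:
        return 0.0
    return sum(not r["purchased"] for r in carts) / len(carts)


key_metric = conversion_rate(visitors)
context_metric = cart_abandonment_rate(visitors)
print(f"Conversion rate: {key_metric:.0%}")
print(f"Cart abandonment rate: {context_metric:.0%}")
```

Shown side by side, the two numbers give an analytical viewer a starting point for asking why carts are being abandoned, which is the kind of "why and how" that context is meant to support.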
Adding contextual data tends to decentralize the focus of a data visualization, so add this data only in visualizations that are intended for an analytical audience. These folks are in a better position to assimilate the extra information and use it to draw their own conclusions; with other types of audiences, context is only a distraction.
Sometimes you can more appropriately create context by including annotations that provide a header and a small description of the context of the data that’s shown. (See Figure 9-3.) This method of creating context is most appropriate for data storytelling or data showcasing. Good annotation is helpful to analytical and non-analytical audiences alike.
Another effective way to create context in a data visualization is to include graphical elements that convey the relative significance of the data. Such graphical elements include moving average trend lines, single-value alerts, target trend lines (as shown in Figure 9-4), or predictive benchmarks.
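As a rough illustration of two of these graphical elements, the following Python sketch computes a simple moving average trend line and flags the points that fall below a target value. The weekly sales figures, the target, and the three-week window are assumptions I made up for the example; they don't come from any real dataset.

```python
def moving_average(values, window):
    """Return the simple moving average for each full window of values."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


# Invented weekly sales figures and an invented target, for illustration only.
weekly_sales = [10, 12, 9, 15, 18, 14, 20]
target = 15

# Trend line: one smoothed value per full three-week window.
trend = moving_average(weekly_sales, window=3)

# Weeks (by position) where sales fell short of the target line.
below_target = [i for i, v in enumerate(weekly_sales) if v < target]

print("Trend:", [round(t, 1) for t in trend])
print("Weeks below target:", below_target)
```

Plotted over the raw values, the trend list smooths out week-to-week noise, and the target gives viewers a fixed reference against which to judge each point.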
Your choice of data graphic type can make or break a data visualization. Because you probably need to represent many different facets of your data, you can mix and match among the different graphical classes and types. Even among the same class, certain graphic types perform better than others; therefore, create test representations to see which graphic type conveys the clearest and most obvious message.
Among the most useful types of data graphics are standard chart graphics, comparative graphics, statistical plots, topology structures, and spatial plots and maps. The next few sections take a look at each type in turn.
When making data visualizations for an audience of non-analytical people, stick to standard chart graphics. The more foreign and complex your graphics, the harder it is for non-analytical people to understand them. And not all standard chart types are boring — you have quite a variety to choose from, as the following list makes clear:
A comparative graphic displays the relative value of multiple parameters in a shared category or the relatedness of parameters within multiple shared categories. The core difference between comparative graphics and standard graphics is that comparative graphics offer you a way to simultaneously compare more than one parameter and category. Standard graphics, on the other hand, let you view and compare only one parameter within a single category at a time. Comparative graphics are geared toward an audience that’s at least slightly analytical, so you can easily use these graphics in either data storytelling or data showcasing. Visually speaking, comparative graphics are more complex than standard graphics.
This list shows a few different types of popular comparative graphics:
Gantt charts (see Figure 9-11) are bar charts that use horizontal bars to visualize scheduling requirements for project management purposes. This type of chart is useful when you’re developing a plan for project delivery. It’s also helpful in determining the sequence in which tasks must be completed in order to meet delivery timelines.
Choose Gantt charts for project management and scheduling.
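To show how task sequencing feeds a Gantt chart, here is a small Python sketch that works out start and end days from task durations and prerequisites, and then prints a crude text version of the bars. The task names, durations, and the schedule function are hypothetical, and the sketch assumes that the prerequisites contain no circular dependencies.

```python
# Hypothetical tasks: name -> (duration in days, prerequisite task names).
tasks = {
    "collect data": (3, []),
    "clean data": (2, ["collect data"]),
    "build charts": (4, ["clean data"]),
    "write summary": (2, ["clean data"]),
    "review": (1, ["build charts", "write summary"]),
}


def schedule(tasks):
    """Return {task: (start_day, end_day)}, starting each task as early as
    its prerequisites allow. Assumes the prerequisites form no cycles."""
    plan = {}

    def finish(name):
        if name not in plan:
            duration, prereqs = tasks[name]
            # A task starts when its last prerequisite ends (day 0 if none).
            start = max((finish(p) for p in prereqs), default=0)
            plan[name] = (start, start + duration)
        return plan[name][1]

    for name in tasks:
        finish(name)
    return plan


plan = schedule(tasks)

# Text-only Gantt bars: one "=" per day, indented by the start day.
for name, (start, end) in plan.items():
    print(f"{name:<14}{' ' * start}{'=' * (end - start)}")
```

In a real project, you would hand the resulting start and end days to your charting tool of choice to draw the horizontal bars.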
Statistical plots, which show the results of statistical analyses, are usually useful only to a deeply analytical audience (and aren’t useful for making data art). Your statistical-plot choices are described in this list:
Histogram: A diagram that plots a variable’s frequency and distribution as rectangles on a chart, a histogram (see Figure 9-15) can help you quickly get a handle on the distribution and frequency of data in a dataset.
Get comfortable with histograms. You’ll see a lot of them in the course of making statistical analyses.
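If you want to see what a histogram is actually counting, the following Python sketch groups values into fixed-width bins and prints one text bar per bin. The list of ages, the bin width of 10, and the function name are assumptions chosen for illustration.

```python
def histogram_counts(values, bin_width, start):
    """Count how many values fall into each fixed-width bin.

    Returns a dict that maps each bin's lower edge to its count.
    """
    counts = {}
    for v in values:
        index = (v - start) // bin_width
        lower = start + index * bin_width
        counts[lower] = counts.get(lower, 0) + 1
    return dict(sorted(counts.items()))


# Invented sample of ages, for illustration only.
ages = [22, 25, 31, 34, 35, 38, 41, 47, 52, 58]
counts = histogram_counts(ages, bin_width=10, start=20)

# One row per bin; the length of each bar is the frequency in that bin.
for lower, n in counts.items():
    print(f"{lower}-{lower + 9}: {'#' * n}")
```

Each bar's length is the frequency for that range, which is exactly what the rectangles in a histogram represent.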
Topology is the practice of using geometric structures to describe and model the relationships and connectedness between entities and variables in a dataset. You need to understand basic topology structures so that you can accurately structure your visual display to match the fundamental underlying structure of the concepts you’re representing.
The following list describes a series of topological structures that are popular in data science:
Graph models: These kinds of models underlie group communication networks and traffic flow patterns. You can use graph topology to represent many-to-many relationships (see Figure 9-19), like those that form the basis of social media platforms.
In a many-to-many relationship structure, each variable or entity has more than one link to the other variables or entities in that same dataset.
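One common way to hold a many-to-many structure in code is an adjacency mapping, where each entity points to the set of entities it's linked to. The Python sketch below builds one from a handful of made-up connections and checks that every entity has more than one link. The names and links are hypothetical.

```python
from collections import defaultdict

# Invented connections between members of a small social network.
links = [
    ("ana", "ben"),
    ("ana", "cy"),
    ("ben", "cy"),
    ("cy", "dee"),
    ("ben", "dee"),
]


def build_graph(edges):
    """Build an undirected graph as {entity: set of linked entities}."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


graph = build_graph(links)

# In a many-to-many structure, every entity has more than one link.
many_to_many = all(len(neighbors) > 1 for neighbors in graph.values())

for entity, neighbors in sorted(graph.items()):
    print(entity, "->", sorted(neighbors))
print("Many-to-many:", many_to_many)
```

A network diagram drawn from this mapping places one node per entity and one line per link, so the visual structure mirrors the underlying data structure.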
Spatial plots and maps are two different ways of visualizing spatial data. A map is just a plain figure that represents the location, shape, and size of features on the face of the earth. A spatial plot, which is visually more complex than a map, shows the values for, and location distribution of, a spatial feature’s attributes.
The following list describes a few types of spatial plots and maps that are commonly used in data visualization:
When you want to craft clear and powerful visual messages with the appropriate data graphic, follow the three steps in this section to experiment and determine whether the one you choose can effectively communicate the meaning of the data:
Ask the questions that your data visualization should answer, and then examine the visualization to determine whether the answers to those questions jump out at you.
Before thinking about what graphics to use, first consider the questions you want to answer for your audience. In a marketing setting, the audience may want to know why their overall conversion rates are low. Or, if you’re designing for business managers, they may want to know why service times are slower in certain customer service areas than in others.
Though many data graphic types can fulfill the same purpose, whatever you choose, ensure that your choices clearly answer the exact and intended questions.
Consider users and media in determining where the data visualization will be used.
Ask who will consume your data visualization, and using which medium, and then determine whether your choice of data graphics makes sense in that context. Will an audience of scientists consume it, or will you use it for content marketing to generate Internet traffic? Do you want to use it to prove a point in a boardroom? Or do you want to support a story in an upcoming newspaper publication? Pick data graphic types that are appropriate for the intended consumers and for the medium through which they’ll consume the visualization.
Examine the data visualization a final time to ensure that its message is clearly conveyed using only the data graphic.
If viewers have to stretch their minds to make a visual comparison of data trends, you probably need to use a different graphic type. If they have to read numbers or annotations to get the gist of what’s happening, that’s not good enough. Try out some other graphic forms to see whether you can convey the visual message more effectively.
Just close your eyes and ask yourself the questions that you seek to answer through your data visualization. Then open your eyes and look at your visualization again. Do the answers jump out at you? If not, try another graphic type.