Chapter 33. Creating Level of Detail Calculations

As you’ve seen, joining data sets together is a very common task in data preparation. But there is one use of joins that needs highlighting due to its slightly different use—a technique called appending. Within Tableau’s suite of products, appending is often used in Level of Detail (LOD) calculations, even if you don’t realize you are doing it.

This chapter will use LOD calculations to show the benefits of appending values to your data set for easier analysis. LOD calculations can be tricky to learn, so this chapter will walk you through what the LOD calculation is actually doing to the data set to cement your understanding.

What Is Appending?

Appending is the addition of a data field, or fields, onto an existing data set. This sounds very similar to joining, but we use the term append when adding a field that contains a constant, or set of constants, rather than the more row-to-row relationship that is found within joins.

Exploring Appending Through LOD Calculations

As Prep Builder is designed to prepare data for Tableau Desktop and Server, many of you likely will be familiar with Level of Detail calculations from using them within Desktop. If not, let’s have a quick recap.

When to Use an LOD Calculation

Tableau sets a measure’s level of aggregation by the granular dimension, or combination of dimensions, used within the view of Tableau Desktop. In the example in Figure 33-1, using data from Tableau’s sample data set Superstore, we’ll use an LOD calculation to set the total sales for Category—no matter the view makeup in Desktop—so the percentage of sales within each category can be calculated for a subcategory.

Superstore data set
Figure 33-1. Superstore data set

What is being calculated is the percentage of total sales across all categories. We could determine the subcategory percentage of categorical sales with a Table Calculation as well, but it’s not necessary for the data set we’re trying to return (Figure 33-2).

Resulting data set to be formed
Figure 33-2. Resulting data set to be formed

How to Write an LOD Calculation in Prep Builder

As of version 2020.1.3, you can create LOD calculations in Prep Builder. As mentioned in Chapter 18, an LOD calculation sets a hierarchical level at which an aggregation will be processed. This is applied to just one measure per calculation, compared to the Aggregate step, where the whole data set’s granularity is changed to the specified hierarchical level.

You have two different options in Prep Builder to create an LOD calculation: Custom Calculation or Fixed LOD. Both are accessed from a data field’s ellipsis menu in the Clean step through the Create Calculated Field submenu (Figure 33-3).

Options for creating Level of Detail calculations
Figure 33-3. Options for creating Level of Detail calculations

Selecting Fixed LOD opens the Visual Calculation Editor, which was shown in Chapter 18. So, this chapter will focus instead on the other option: creating an LOD calculation in the Custom Calculation Editor. The syntax for LOD calculations is:

{ fixed [optional categorical field(s)] : aggregation ( measure ) }

To build the LOD calculation needed to output the data set in Figure 33-2, first we’ll create the field Categorical Sales (Figure 33-4).

Categorical Sales LOD calculation
Figure 33-4. Categorical Sales LOD calculation

This calculation has set a constant sales total for each category. The Profile pane is a good place to check for the effects of the calculation (Figure 33-5).

Results from the Categorical Sales LOD calculation
Figure 33-5. Results from the Categorical Sales LOD calculation

By selecting a category in the Profile pane, you can see the values in other columns containing that category. In this case, Furniture has a constant Categorical Sales value, but there are still a range of Sales values within the selected rows. Prep Builder is appending the Categorical Sales value to the existing data set, as you can see on a per-row level in the Data pane (Figure 33-6).

Results from the Categorical Sales LOD calculation in the Data pane
Figure 33-6. Results from the Categorical Sales LOD calculation in the Data pane

Due to the hierarchy of the Superstore data set, a product can belong to only one category. This data set is ready for use within Tableau Desktop. If you want a quick answer for your analysis, you can select Preview in Desktop by right-clicking on any step, and Tableau Desktop will launch with a Hyper file of that data set already connected (Figure 33-7).

Preview in Tableau Desktop option in Prep Builder
Figure 33-7. Preview in Tableau Desktop option in Prep Builder

To create the set shown earlier in Figure 33-2, divide the Sales value by Categorical Sales data in Desktop (Figure 33-8).

Calculating % of Categorical Sales in Desktop
Figure 33-8. Calculating % of Categorical Sales in Desktop

This calculation determines the percentage contribution by each row of data, which is then summed up by the dimension(s) used in the view in Desktop; in this case, that’s Sub-Category, although other members of the product hierarchy could be used instead (Figure 33-9).

The % of Categorical Sales analysis in Desktop
Figure 33-9. The % of Categorical Sales analysis in Desktop

By appending the Categorical Sales value to the data set, you save your end users from having to create complex calculations themselves. However, these calculations (as well as any data fields) must be used correctly to avoid incorrect or misleading analysis.

What a Level of Detail Calculation Is Doing

To help describe what the LOD calculation is doing, let’s walk through appending a value to a data set. To make it easier to see what is happening within Prep Builder, we’ll use the same calculation as before, just with the data set aggregated to the Sub-Category level (Figure 33-10).

Superstore data set aggregated to the Sub-Category level
Figure 33-10. Superstore data set aggregated to the Sub-Category level

Step 1: Calculate the categorical sales

Using an Aggregate step, group by Category and sum the Sales values (Figure 33-11). This will result in each category having a single Categorical Sales value. Using an Aggregate step removes the other data in the set.

Using an Aggregate step to calculate Categorical Sales
Figure 33-11. Using an Aggregate step to calculate Categorical Sales

Step 2: Join the aggregated results back to the original data set

Completing a self-join—joining an earlier stage of the data preparation flow to the output of the Aggregate step—adds the original data at a Sub-Category level back into the flow (Figure 33-12).

Self-join to add the appended data to the Aggregate step output
Figure 33-12. Self-join to add the appended data to the Aggregate step output

Using an inner join with a join condition of Category = Category then adds Categorical Sales back to the original data set (Figure 33-13). With an inner join, you should delete the matching field, now called Category-1, as Prep doesn’t allow duplicate field names.

Configuring the self-join
Figure 33-13. Configuring the self-join

Summary

Level of Detail (LOD) calculations are very useful when analyzing data but can be difficult for end users of the data set to form themselves. LOD calculations work by aggregating a measure and then appending the result to the data set. The LOD calculations formed in Prep Builder can be used in further calculations or exported for use within Tableau Desktop to simplify the analytical process for users.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset