Chapter 18. Calculations

Your data will hardly ever be perfect for your analysis straight from the source. Calculations are one of the key instruments you have to make the data suitable for answering the questions you and your peers have. Many people fear calculations, but this chapter offers some simple tips that will make them less intimidating and help you unlock their power and flexibility.

What Do Calculations Do in Data Preparation?

At the simplest level, calculations create a new data field or overwrite an existing one. Most calculations use an existing data field as the base to form the new data field. The new or altered field is referred to as a Calculated Field, indicating that it did not originate in the initial input data for your preparation flow. New data fields are created as a result of:

  • Cleaning (to remove unwanted characters, split names, etc.)

  • Arithmetic (sum, average, or multiplication results)

  • Merging multiple fields (from dates, email addresses, etc.)

The calculated data fields resulting from these operations become available for analysis or further preparation. Calculations create new fields in Prep unless the calculation is named identically to an existing data field, in which case the original data field is overwritten.

Creating a Calculated Field

There are two main ways to create Calculated Fields in Tableau Prep. First, underneath the Flow pane where either the Profile or Data pane is shown, you’ll see the option Create Calculated Field (Figure 18-1).

Dynamic menu option to create a Calculated Field
Figure 18-1. Dynamic menu option to create a Calculated Field

Clicking this opens up a separate window where you can create the calculation; the dialog is blank until you specify a calculation to include (Figure 18-2).

Blank Add Field dialog
Figure 18-2. Blank Add Field dialog

The second way to create a Calculated Field is by selecting a specific data field. Open the data field’s menu (designated by an ellipsis), and you should see the same Add Field dialog as in Figure 18-2, but now with the selected data field listed in the white box (Figure 18-3).

Creating a Calculated Field from a specific data field
Figure 18-3. Creating a Calculated Field from a specific data field

This is equivalent to clicking Create Calculated Field and then Custom Calculation from the Profile pane menu (Figure 18-4).

Creating a custom calculation from the Profile pane menu
Figure 18-4. Creating a custom calculation from the Profile pane menu

Fundamentals of Calculations

The calculation window shown in Figure 18-2 and Figure 18-3 contains a couple of elements that can really help you learn and eventually master calculations.

The Reference List

The first element is the drop-down Reference list on the right. If you are new to calculations in data tools, this list is a massive help in finding the best function to get the answers you require (Figure 18-5). By entering a function, you are telling Prep Builder what operation to carry out on the data field.

Grouping calculation functions to make searching easier
Figure 18-5. Grouping calculation functions to make searching easier

To the right of the Reference list is the second element of support in the calculation window: the syntax, a description, and an example of the function you have selected in the Reference list. Figure 18-6 shows this information for the DATEPART() function, which is used to pick out parts of a given date.

Calculation function details
Figure 18-6. Calculation function details

Let’s look at each piece of information in more detail.

Syntax

Even as an experienced Tableau or data software user, you will frequently need to check the syntax of the function as well as the data types needed for the function to work as intended. Here’s how to read the syntax:

FUNCTION(required input, [optional input])

When building functions, keep in mind that you must match all the aspects of the syntax that are not in square brackets—they are not optional. These requirements might include the necessary data types or whether or not the data field is aggregated. Square brackets in the function indicate optional content. Note that when you make your calculation, your data fields will appear within square brackets as well, but this isn’t the same thing. Also note where commas appear between the different elements of the function, as you’ll need to include them in the calculation you build.

To make writing calculations easier in Prep Builder, you can use spaces, tabs, and even newlines between the different elements. This can make it easier for others viewing your work to understand the calculations.

Description

The description is where most new users will look to understand what the selected function actually does. The names of the functions are often a good clue to their purpose, but the description provides more detail on exactly what task that function is intended for.

Example

The example can be helpful for users at all levels in demonstrating what date_part means or how to encode the start_of_week. Some of the examples are not the easiest to follow, so searching online is a great way to boost your understanding. Just enter “Tableau” and the function name into a search engine, and you’ll find more examples and often a blog post or two.

Building the Calculation

With all this support from the Prep tool, building calculations should be easy, right? Well, not quite. Calculations can be really complex, and a simple typo can break them. Let’s look at some examples of both well-formed calculations and potential pitfalls.

When Calculations Go Well

Calculations go well when you use a function, or functions, that Prep considers to be correctly formatted. At the bottom-left corner of the calculation in Figure 18-7, Prep Builder, like Tableau Desktop, indicates that it is valid.

Validation does not mean that the calculation is correct in terms of your intended goal. Validating that the data is correct is up to you, as only you know what you were trying to do.

A valid calculation
Figure 18-7. A valid calculation

One of the key elements of this screen is the color coding used in the Calculated Field:

  • Blue represents valid functions.

  • Orange represents the data fields in your data set. If they appear in black you are either missing the square brackets or have misspelled the data field name. Remember, to call the field, you have to match the capitalization and punctuation (e.g., spaces) in its name.

  • Gray represents comments, which are a great way to share your logic for others using your data and as a reminder for yourself when you return to the data in the future. You can set comments by adding two forward slashes (//) to the start of a line in the Calculation Editor.

When Calculations Go Poorly

When calculations contain syntax errors, everything appears in red (Figure 18-8).

Error showing in the Calculation Editor
Figure 18-8. Error showing in the Calculation Editor

This can be frustrating because you’ll have to find and fix the error to see the helpful color coding just discussed. Therefore, when writing long, logical calculations like IF or CASE statements, it’s a good idea to add an END to the calculation periodically in order to spot any errors sooner. If you miss an error in a calculation and click Save or you need to edit the calculation (as discussed next), Prep Builder will clearly indicate when there is an error in a Calculated Field (Figure 18-9).

Error icons in the Changes and Profile Panes
Figure 18-9. Error icons in the Changes and Profile panes

Editing Calculated Fields

If you need to edit a calculation, you can do so from the Changes pane. Clicking the pencil icon (Figure 18-10) opens the Calculation Editor, where you can make the required changes.

Clicking the pencil icon allows you to edit Calculated Fields
Figure 18-10. Clicking the pencil icon allows you to edit Calculated Fields

Recommendations

One other Prep Builder icon you might have noticed is the light bulb, which is Prep Builder’s way of showing recommendations for cleaning processes. The icon appears at the top of the Profile pane for recommendations for the entire data set, or at the top of the data field for recommendations just for that specific item, as shown in Figure 18-11.

The light bulb icon in the Profile pane
Figure 18-11. The light bulb icon in the Profile pane

Depending on the recommendation, Prep Builder will either open screens for you to implement the change or apply the change automatically when you click Apply. If you click the light bulb icon, you can either select a change if there are multiple recommendations as in Figure 18-11, or simply click Apply to follow the instruction given as in Figure 18-12.

A recommendation by Prep Builder
Figure 18-12. A recommendation by Prep Builder

Types of Calculations

So you’ve seen how to navigate all the aspects of creating a Calculated Field in Tableau, but what are you likely to do with these calculations? In this section we’ll look at some common types of calculations to help frame various use cases you may come across.

Numerical Calculations

As Chapter 7 covered, sums, averages, and all other basic arithmetic operations are covered by numerical functions. The main thing to note is whether you are using integers or floats (numbers with decimal points), as Prep Builder will occasionally throw an error if it is expecting the other data type.

String Calculations

Chapter 9 noted that cleaning and manipulating string data fields will be a fundamental part of your data preparation process. Whether it is splitting strings, changing case, or removing unnecessary characters, many string functions will quickly become very familiar to you.

Date Calculations

The way Tableau handles dates is fantastic. From using its own internal calendar to assess what weekday a certain date is to being able to pick apart dates with ease, Tableau is a joy to use compared to SQL or other coding languages where you have to code each element. You’ll turn to these functions often to add date intervals or work out the difference between dates. Refer back to Chapter 8 for more functions that are useful for working with dates.

Conditional Calculations with a Boolean Output

Boolean calculations perform very well, as they simply return True or False. Revisit Chapter 10 for more on using certain functions from the other types of data calculations to return Boolean results.

Logical Calculations

IF and CASE statements can take a lot of getting used to if you haven’t used them before in other tools. Adding logical calculations is important when you’re attempting to solve more complex problems. Rest assured that the time you spend practicing will pay you back tenfold.

Type Conversions

Prep Builder expects certain data types for certain calculations. Therefore, you can save a lot of time and reduce calculation complexity by using functions like Int() and Str() to convert data fields to the correct type (integer and string, respectively, in this example).

Level of Detail and Ranking Calculations

With Prep Builder version 2020.1.3 came the ability to form Level of Detail calculations and the introduction of ranking functionality to the Clean step. Previously, you had to use a number of workarounds to calculate aggregations at a specific level and rank categories based on a measure in the data set.

Level of Detail (LOD) calculations are used to derive aggregations for a metric, for each value in a categorical field, or across the whole data set. The concept for these calculations originated in Tableau Desktop. Normally within Tableau Desktop, the measure is aggregated to the most granular level of dimension(s) in the view, but frequently users wanted the freedom to specify that level of aggregation where it differed from the autogenerated one. As in Tableau Desktop, in Prep Builder LOD calculations are used to set the level of aggregation of a metric. This differs from an Aggregate step in Prep Builder, as it doesn’t change the granularity of the entire data set but just one measure. LOD calculations are covered in more depth in Chapter 33.

Ranking calculations allow the user to rank data field(s) based on a measure in the data set. This functionality is also available in Tableau Desktop (as Table Calculations) and is again driven by dimensions within the view. As there is no view within Prep Builder, you may need to aggregate the data set to the granularity you want to perform the ranking on. Ranking is described in more detail in Chapter 34.

Both LOD and ranking calculations can be created through either the Custom Calculation Editor or the Visual Calculation Editor. Figure 18-13 shows how to create an LOD calculation using the Visual Calculation Editor.

Creating an LOD calculation in the Visual Calculation Editor
Figure 18-13. Creating an LOD calculation in the Visual Calculation Editor

Within the Visual Calculation Editor, first you’ll need to choose the type of calculation you want to perform. You set this in the drop-down menu at the top of the editor.

When you select Fixed LOD, you will be presented with these options:

Compute using
Select the measure to aggregate and how you want to aggregate it
Group by
Set the measure to group by, if any

The values you will generate are shown on the left-hand side of the Visual Calculation Editor. The box plot on the right-hand side demonstrates the range of values found in each field in the “Group by” list. Once you have the values you need, click Done, and you’ll see a new data field in your data set containing these values.

Summary

For some, calculations are the bane of data preparation, and for others, they are fun logic puzzles. Whichever side of that argument you sit on, calculations are something you will have to use and understand in order to be successful at preparing data. Calculations are vital for cleaning data, adding columns that are required for analysis, and converting fields’ data types. The Custom Calculation Editor in Prep Builder, which should be very familiar to Tableau Desktop users, is a tool you will be using a lot. For the more conceptually complex calculations like Level of Detail and Ranking calculations, the Visual Editor can help you better understand the results they will return.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset