Chapter 2. Getting Started with Tableau Prep Builder

So you have a data set that you would like to prepare for analysis, and you want to learn Tableau Prep Builder. This chapter will take you through the basics of Tableau Prep Builder, including downloading the software, familiarizing yourself with Prep terminology, and using the software to produce a data set for the first time.

Where to Get Tableau Prep Builder

Tableau’s products are available for download through the company’s website. Figure 2-1 shows Tableau’s download page for Prep Builder, where you can select the version of the program you want to use.

Figure 2-1. Tableau’s download page

Click the version number you want to use, select Download Tableau Prep <version number> at the top of the screen, and choose whether you want to download the Windows or Mac version of the software. After you make your selection, the file will download to your machine. Once the download is complete, you can run the file and follow the instructions given.

After installing Prep Builder, you’ll see a new folder in your Documents folder called My Tableau Prep Repository, which contains files and subfolders for storing the flows and data sets you’ll create within Prep. Figure 2-2 shows this folder inside my Documents folder.

Figure 2-2. My Tableau Prep Repository in Windows File Explorer
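
If you would like to confirm what the installer created, a few lines of Python can list the repository’s contents. This is a minimal sketch: it assumes the folder sits directly under Documents in your home directory, as described above, and list_repository is a hypothetical helper name, not part of Tableau.

```python
from pathlib import Path


def list_repository(documents_dir: Path) -> list[str]:
    """Return the sorted names of items inside My Tableau Prep Repository.

    Assumes the repository is a direct child of documents_dir; returns an
    empty list if the folder does not exist (for example, before installing).
    """
    repo = documents_dir / "My Tableau Prep Repository"
    if not repo.is_dir():
        return []
    return sorted(item.name for item in repo.iterdir())


if __name__ == "__main__":
    # Assumed default location; adjust if your Documents folder lives elsewhere.
    for name in list_repository(Path.home() / "Documents"):
        print(name)
```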

How to Get a License for Prep Builder

For most users, Tableau Prep Builder is not free. The main way to get Prep is to purchase a Tableau Creator license, which is a monthly paid subscription that packages together Tableau Prep Builder, Tableau Desktop, and one user’s access to Tableau Server or Tableau Online (where Tableau hosts the Server instance). Tableau Prep Conductor is part of the Data Management add-on for Tableau Server and Tableau Online, which can be purchased separately. All of these licensing options apply on a per-user basis.

There are 14-day trials of all of the full Tableau tools available from the main Tableau website. Educators and currently enrolled students can also get Tableau for free after a simple verification process.

After downloading and installing the application, you will be prompted to enter your licensing information or sign up for a trial (Figure 2-3).

Figure 2-3. Prep Builder registration screen

The Tableau Prep Builder Screen

When you load Prep Builder for the first time, you will be presented with the screen shown in Figure 2-4.

Figure 2-4. The Prep Builder initial screen

Let’s walk through this screen:

Connections pane (left-hand blue pane)
To see the list of available connections, click the plus sign (+) at the top right of this pane. The list will expand, showing the File, Server, and ODBC/JDBC connections available. ODBC (Open Database Connectivity) and JDBC (Java Database Connectivity) connections are useful where Tableau doesn’t have a bespoke connector for the data source.
Recent Flows (top center)
This area contains the latest flows that you have been working on. If this is the first time you have used Prep, this space will be blank.
Sample Flows (bottom center)
This is where you’ll find a couple of sample flows from the Tableau team to let you explore and experiment with an established, complete flow.
Discover pane (right-hand gray pane)
This pane has links to videos, blog posts, and articles to help you learn the basics of using Prep.

Once you are connected to your data set, you will be taken to the screen shown in Figure 2-5.

Note

Connecting to data files and databases is covered in Chapters 5 and 6, respectively.

Figure 2-5. The Prep Builder main screen

This screen is split into two parts:

Connections pane (left-hand blue pane)
As in the first screen, this is where you’ll be able to edit the data connection, add another connection, or select tables to use as the input into your Prep flow. In Figure 2-6, I have chosen to load an Excel connection, so I also have the option to use the Data Interpreter to find the table(s) of data within a formatted Excel worksheet.
Canvas (center)
This area will change considerably depending on the steps you take in your data preparation process.

Basic Steps of Data Preparation

In this section, we’ll dig into a few key steps of data preparation.

Input Step

After connecting to a data source, select the input by dragging the data source from the Connections pane onto the canvas. By default, Prep Builder samples the input data set to speed up the process of building your flow. When you run the flow after setting up an output, Prep Builder automatically processes all of the data, not just the sample. The Input step is shown in Figure 2-6.

Figure 2-6. Prep Builder’s Input step screen
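
The idea of building against a sample and then running against everything can be sketched in plain Python. This is only a conceptual illustration, not how Prep Builder is implemented: the clean function, the sample size, and the example rows are all made up here.

```python
from itertools import islice


def clean(row: dict) -> dict:
    """Stand-in for the flow's logic: trim whitespace from every text value."""
    return {k: v.strip() if isinstance(v, str) else v for k, v in row.items()}


def preview(rows, sample_size: int) -> list[dict]:
    """While designing, apply the logic to only the first sample_size rows."""
    return [clean(row) for row in islice(rows, sample_size)]


def run(rows) -> list[dict]:
    """When the flow runs, apply the same logic to every row."""
    return [clean(row) for row in rows]


rows = [{"city": f" City {i} "} for i in range(10)]
print(len(preview(rows, 3)))  # number of rows processed while building
print(len(run(rows)))         # number of rows processed on a full run
```

The same logic is applied in both cases; only the number of rows differs, which is why designing against a sample is faster.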

The three key parts of this screen (excluding the Connections pane) are:

Flow pane (top)
This area reflects the current step of the data preparation process.
Input pane (gray area)
This is where you set up the input. You can use multiple files and choose how the data is sampled as you are building the flow. Prep keeps a log of changes you make during this step.
Data fields (white table)
This is where you select the specific data categories to bring into the data preparation flow. You can change the names of the data fields (columns) or alter the data types. Prep displays a small subset of sample values to help you understand what each column contains.
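
Conceptually, the choices you make in the data fields table amount to selecting columns, renaming them, and converting their types. A rough Python sketch under those assumptions follows; the field names and the apply_input_settings helper are hypothetical, and Prep Builder does all of this through its interface rather than code.

```python
def apply_input_settings(rows, keep, rename, types):
    """Select, rename, and retype fields for each row.

    keep   -- field names to bring into the flow
    rename -- mapping of original name -> new name
    types  -- mapping of new name -> conversion function (e.g. int, float)
    """
    result = []
    for row in rows:
        new_row = {}
        for field in keep:
            name = rename.get(field, field)
            convert = types.get(name, lambda value: value)
            new_row[name] = convert(row[field])
        result.append(new_row)
    return result


raw = [{"ord_id": "1001", "amt": "25.50", "notes": "n/a"}]
prepared = apply_input_settings(
    raw,
    keep=["ord_id", "amt"],  # the "notes" field is left out of the flow
    rename={"ord_id": "Order ID", "amt": "Amount"},
    types={"Order ID": int, "Amount": float},
)
print(prepared)
```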

Clean Step

The Clean step is where the majority of the work takes place, so getting familiar with it is key to your overall understanding of Prep Builder. To begin the Clean step, click the plus sign to the right of the selected input data, as shown in Figure 2-7.

Figure 2-7. Prep Builder’s Clean step screen

There are four key parts to this screen, including the Connections and Flow panes. The other two are:

Profile pane (center)
Prep takes the sample of the data you set up during the Input step and shows how its values are distributed within each data field of your data set. You can complete a number of preparation tasks within the Profile pane by selecting certain values. For example, to examine the relationships between values, select one value in a data field and see which values in the other fields appear in the same rows.
Data grid (bottom)
Here you can see the records (rows) of your data set. Three icons at the top right of the Profile pane let you change how your data is displayed (Figure 2-8): the Profile pane and Data grid together; just the Data grid; or the List view, which shows the metadata of the data set.
Figure 2-8. The data view options (left to right: the Profile pane and Data grid, Data grid only, and List view)
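
To make the Profile pane’s behavior concrete, here is a small Python sketch of the two ideas described above: counting how often each value occurs in a field, and finding which values in another field share a row with a selected value. The sample rows and function names are invented for illustration and say nothing about how Prep Builder works internally.

```python
from collections import Counter


def profile(rows, field):
    """Count how often each value appears in one field."""
    return Counter(row[field] for row in rows)


def related_values(rows, field, value, other_field):
    """Values of other_field found in rows where field equals value."""
    return {row[other_field] for row in rows if row[field] == value}


rows = [
    {"Region": "North", "Product": "Soap"},
    {"Region": "North", "Product": "Candle"},
    {"Region": "South", "Product": "Soap"},
]
print(profile(rows, "Region"))
print(related_values(rows, "Region", "North", "Product"))
```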

Output Step

Once you have cleaned the data, you can output the data set to make it available for analysis in other tools (Figure 2-9).

Figure 2-9. Adding an Output step

The Output step gives you a number of choices about how to output your data. Part IV of this book is dedicated to outputs, with a focus on outputting to files in Chapter 19 and to databases in Chapter 20. The default option for the output is to save to a file in your My Tableau Prep Repository folder.

Saving a Flow

Whether you have completed your flow or just need to pause your work to continue later, saving the flow is key. To do so, select Save or Save As from the menu at the top of the screen (Figure 2-10).

Note

The Save option is the same as Save As if the file hasn’t been saved before. If the file has been saved previously, use Save As to change the filename.

Figure 2-10. Saving a flow file

When saving the file, you’ll see a screen where you have two choices for the file type (Figure 2-11):

  • Tableau flow file (.tfl) saves the logic of the flow as well as the input and output file locations. Therefore, you’ll need access to the input and output locations to make use of this file format.

  • Packaged Tableau flow file (.tflx) saves not just the logic but also the input and output files.

Figure 2-11. Saving to a file in Prep
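
Because a .tfl file stores only the locations of its inputs and outputs, it can be worth checking that those locations are reachable before sharing or reopening one. The sketch below assumes you already know the file paths your flow uses (it does not read them from the flow file), and missing_locations is a hypothetical helper.

```python
from pathlib import Path


def missing_locations(paths):
    """Return the paths that do not exist on this machine."""
    return [str(p) for p in map(Path, paths) if not p.exists()]


if __name__ == "__main__":
    # Hypothetical input files referenced by a flow; replace with your own.
    flow_inputs = ["sales_2020.xlsx", "targets.csv"]
    missing = missing_locations(flow_inputs)
    if missing:
        print("Not reachable; consider saving as .tflx:", missing)
    else:
        print("All inputs reachable; a .tfl file will work here.")
```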

Once you’ve selected a save option, you can process the flow by clicking the run icon (▷) either at the top of the Flow pane (to run every output) or on the output step icon (to run a single output), as shown in Figure 2-12.

Figure 2-12. Processing a flow with the run icon
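
Flows saved as files can also be run without opening the interface, using the command-line script that ships with Prep Builder. The sketch below only assembles and prints such a command. The script’s location depends on your operating system and installed version, so both paths here are placeholders, and build_run_command and run_flow are hypothetical helpers. The -t option names the flow file; check Tableau’s documentation for the options your version supports.

```python
import subprocess


def build_run_command(cli_script: str, flow_file: str) -> list[str]:
    """Assemble the command that runs one flow file via the Prep CLI script."""
    return [cli_script, "-t", flow_file]


def run_flow(cli_script: str, flow_file: str) -> None:
    """Run the flow and raise an error if the command exits unsuccessfully."""
    subprocess.run(build_run_command(cli_script, flow_file), check=True)


if __name__ == "__main__":
    # Placeholder paths: point these at your own installation and flow file.
    command = build_run_command("tableau-prep-cli", "my_flow.tfl")
    print(" ".join(command))
    # Call run_flow(...) with real paths once they are correct for your machine.
```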

Summary

This chapter has given you the basics to start working with a simple flow in Tableau Prep Builder. Subsequent chapters will build upon this knowledge. The steps and other screen layouts you’ll need to be familiar with are covered in the chapters on specific techniques.

Now that you’ve gained some familiarity with the tool, you’re better equipped to navigate the terminology in the other chapters and throughout your use of Prep Builder.
