When Tableau Prep Builder was first released, many people finally had the chance to build data preparation flows that removed the tedious, repetitive work of cleaning and merging data sets before any valuable analysis could begin. Once a flow is created, it runs again each time a user clicks the run icon in Prep Builder. For a single flow this is simple, but with many varied data sources you often end up with multiple preparation jobs to run by hand. This is where Prep Conductor comes in: you build a flow in Prep Builder, then schedule it to run on Prep Conductor when needed.
This chapter looks at when you might need Prep Conductor, the capabilities it offers, and, finally, how to use it.
Prep Conductor is primarily used to run an uploaded flow on a set schedule. This has many benefits:
The flow doesn’t need to be manually opened and run each day.
Errors are logged on the server rather than just on an individual’s computer, so you can troubleshoot more easily if a flow fails.
Prep Conductor likely runs on a computer that has more processing resources than the data author’s computer, so preparation flows will run faster.
Prep Conductor acts as a central repository for all flows, promoting reuse.
The flow can be downloaded and maintained by any person with proper permissions.
These benefits help build more robust processes around your organization’s data preparation work. The time savings they bring enable you to collate more data, spend more time on analysis, and work more closely with the organization to develop further and deeper data solutions.
Prep Conductor is part of the Tableau Server product, whether Server is hosted by your organization or on Tableau Online. At the time of writing, Tableau Server/Online is part of the Creator subscription package, which also includes Prep Builder. To use Prep Conductor, you will need to pay for the Data Management add-on, apply the new license key, and restart the server. There is a minimum purchase of 100 users for Tableau Server or 25 users for Tableau Online. From there, a Prep Conductor process will be present on any “node” of the server that has a Backgrounder process running on it. The Backgrounder is the workhorse of Tableau Server, refreshing data extracts and running Prep flows, so ensuring there are enough Backgrounder processes for the volume of tasks is a key part of Tableau Server administration.
The server administrator can turn Prep Conductor off on the server. You’ll find this option on the General tab in the Settings menu (Figure 21-1).
Here is the process to load a flow onto Tableau Server:
Connect to Tableau Server in Prep Builder (Figure 21-2).
Enter the server connection details. This will be the web address of Tableau Server/Online. Then click Connect (Figure 21-3).
Enter your server credentials and click Sign In (Figure 21-4).
Select the item in Tableau Server/Online you want to publish the flow to (Figure 21-5).
Once the flow is ready for publishing, go back to the Server/Online menu at the top of the Prep Builder screen and select Publish Flow (Figure 21-6).
Pick the project on the server to publish the flow to, name the flow, and add any other description that may assist the users of the server (Figure 21-7).
Once the flow is published to Tableau Server/Online, the next stage of the process is to set up a schedule for the flow. After clicking New Task in the Scheduled Tasks tab in Prep Conductor, you are given the option to set up a schedule for each output of the flow, or for all of them (Figure 21-8).
The schedules available are managed by the server administrator. The administrator can add more schedules, but the list is controlled so that flows run mainly at times when fewer other processes and users are accessing the server (Figure 21-9).
Once you’ve set up the schedule, you can edit it in the Scheduled Tasks tab of the flow in Prep Conductor by clicking the ellipsis icon next to the schedule name (Figure 21-10).
If you no longer need the scheduled task, you can also delete it by clicking the ellipsis icon in the Scheduled Tasks tab (Figure 21-11).
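Scheduled tasks cover the routine case. As a rough sketch of how a published flow might also be triggered on demand, the Tableau Server REST API includes a Run Flow Now method that is called with a POST to the flow’s `run` endpoint. The helper below only builds the request and doesn’t send it. The function name, server address, API version, site ID, flow ID, and token are all placeholders rather than values from this chapter, and the empty `tsRequest` body is an assumption; check the REST API reference for your server version before relying on it.

```python
from urllib.request import Request


def build_run_flow_request(server_url, api_version, site_id, flow_id, auth_token):
    """Build (but do not send) a POST request asking the server to run a flow now.

    All arguments are placeholders supplied by the caller. The auth token is the
    one returned by a prior REST API sign-in call, which is not shown here.
    """
    url = (
        f"{server_url.rstrip('/')}/api/{api_version}"
        f"/sites/{site_id}/flows/{flow_id}/run"
    )
    # Assumption: an empty tsRequest element is sufficient as the request body.
    body = b"<tsRequest></tsRequest>"
    return Request(
        url,
        data=body,
        method="POST",
        headers={
            "X-Tableau-Auth": auth_token,
            "Content-Type": "application/xml",
        },
    )


# Hypothetical values, for illustration only.
request = build_run_flow_request(
    "https://tableau.example.com", "3.19", "site-123", "flow-456", "token-abc"
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it. Keeping construction separate from sending makes the URL easy to inspect before anything runs on the server.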
Once the process runs, the Status column on the Overview tab shows whether or not the scheduled flow has run successfully. In the instance shown in Figure 21-12, it has.
If you have multiple runs to review, then you can check whether or not the runs were successful in the Status column of the Run History tab (Figure 21-13).
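If you keep your own record of run results, the same check can be expressed in a few lines. This is a minimal sketch, assuming each run is stored as a dictionary with `started` and `status` fields; the field names, the status labels, and the sample records are hypothetical and are not an export format provided by Prep Conductor.

```python
from collections import Counter


def summarize_runs(runs):
    """Count runs by status and collect the runs that did not succeed."""
    counts = Counter(run["status"] for run in runs)
    failures = [run for run in runs if run["status"] != "Success"]
    return counts, failures


# Hypothetical run records, for illustration only.
runs = [
    {"started": "2021-03-01T06:00", "status": "Success"},
    {"started": "2021-03-02T06:00", "status": "Failed"},
    {"started": "2021-03-03T06:00", "status": "Success"},
]

counts, failures = summarize_runs(runs)
for run in failures:
    print("Needs attention:", run["started"], run["status"])
```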
Prep Conductor makes it very easy to manage the main tasks associated with your flow by identifying where there are issues, and why, with clear error messaging.
If you are using a data source that requires a login, you may wish to embed the access credentials for it. This means that each time the flow is run, it will use these details. Some organizations provide what is commonly known as a service account for this purpose, so that access is not tied to a specific person who must keep their password up to date. To embed the credentials, open the flow in Prep Conductor, click the ellipsis menu on the Connections tab, and select Edit Connection (Figure 21-14).
You’ll be presented with a screen to change the server for the data connection as well as the login details for that connection (Figure 21-15). This is where you can embed the credentials so they are saved for use in each future run.
Click Save to store the details for the next run of the flow. If you want to check that the details are properly configured, you can click Test Connection.
Prep Conductor can benefit you in ways other than scheduling flows and recording whether or not they ran successfully. For example, the Connections tab shows both the inputs and outputs of the flow (Figure 21-16).
Here you can see the type of file as well as whether authentication is required to use the input.
Finally, the Lineage tab indicates where the output is used within Tableau Server. It shows you which workbooks rely on the output as well as the sources that feed it (Figure 21-17).
In other words, the Lineage tab gives you an overview not only of what relies on the flow but also what it relies on. Tying these aspects together can make data administration much easier. If edits need to be made, you can see exactly what is and isn’t going to be affected.
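The idea behind lineage can be sketched as a small dependency graph: each item points to the items that rely on it, and walking the graph in either direction answers “what will this change affect?” or “what does this depend on?”. The class and the item names below are hypothetical and only illustrate the concept; they say nothing about how Tableau Server stores lineage internally.

```python
from collections import defaultdict


class Lineage:
    """A toy dependency graph of sources, flows, outputs, and workbooks."""

    def __init__(self):
        self.downstream = defaultdict(set)  # item -> items that rely on it
        self.upstream = defaultdict(set)    # item -> items it relies on

    def add_dependency(self, source, dependent):
        """Record that `dependent` relies on `source`."""
        self.downstream[source].add(dependent)
        self.upstream[dependent].add(source)

    @staticmethod
    def _walk(start, edges):
        """Collect every item reachable from `start` by following `edges`."""
        seen = set()
        stack = [start]
        while stack:
            node = stack.pop()
            for neighbor in edges.get(node, ()):
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        return seen

    def affected_by(self, item):
        """Everything that directly or indirectly relies on `item`."""
        return self._walk(item, self.downstream)

    def relies_on(self, item):
        """Everything that `item` directly or indirectly relies on."""
        return self._walk(item, self.upstream)


# Hypothetical items, for illustration only.
lineage = Lineage()
lineage.add_dependency("sales.csv", "Sales Flow")
lineage.add_dependency("Sales Flow", "Sales Output")
lineage.add_dependency("Sales Output", "Sales Dashboard")

print(sorted(lineage.affected_by("Sales Flow")))
print(sorted(lineage.relies_on("Sales Flow")))
```

Here an edit to the hypothetical Sales Flow would reach the output and the dashboard built on it, while the flow itself depends only on the source file.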
Prep Conductor is a data management tool that takes the flows you build in Prep Builder and automates their output. Using Prep Conductor will save you even more time than using Prep Builder alone. Efficiency isn’t the only benefit: Prep Conductor also gives you clarity on inputs, outputs, and errors that will help you manage data complexity.