Tableau Ask Data and Explain Data are powerful machine learning features that put analysis in the hands of casual users – those who would struggle to create their own analysis through the drag-and-drop features of Tableau Desktop and web authoring. For these casual users to get answers to their questions, the data models and available fields supporting them must be well thought-out; otherwise, the users may end up frustrated with answers that don’t make sense. This chapter explores these considerations.
In this chapter, we’re going to cover the following topics:
To view the complete list of requirements needed to run the practical examples in this chapter, please see the Technical requirements section in Chapter 1.
In this chapter, we will be using the Tableau Desktop client software and the web interface of Tableau Server or Cloud. We are going to be working with the published data source we created in Chapter 6.
The screenshots and descriptions in this chapter are based on Tableau Cloud version 2022.3. If you are working with a different version of Tableau Server or Cloud, what you see might differ.
The published data source we will be using is based on the Superstore data, the sample data that Tableau uses in its products. This published data source contains sales from the US, Canada, Colombia, Chile, and Argentina, which we joined to a product database.
The name suggested for the published data source was Product Sales. If you did not publish the data source or no longer have access to it, you can rebuild the published data source from the following file in our GitHub repository:
The files used in the exercises in this chapter can be found at https://github.com/PacktPublishing/Data-Modeling-with-Tableau/.
Ask Data is a natural language search interface in Tableau Server and Cloud. It allows users to perform visual analysis without the technical skills associated with traditional business intelligence products. If a person can type out a business question in natural language, they can get an answer, provided, of course, that the answer is in their data.
Data modeling strategies are essential for a seamless experience with Ask Data.
First, Ask Data only works with published data sources. That is, data models that are embedded in workbooks are not available for Ask Data.
Second, at least one lens needs to be created on your published data source to use Ask Data. Lenses allow data modelers to further refine published data sources by hiding fields, renaming fields, adding synonyms, and creating view recommendations. Each of these options makes natural language questions more likely to return answers that make sense.
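To make the idea of a lens more concrete, the following is a minimal Python sketch of a lens as a thin layer over a published data source. It is not how Tableau stores or evaluates lenses; the `LensField` and `Lens` classes, the `resolve` method, the synonyms, the Shipping Method rename, and the Row ID field are all hypothetical names chosen for illustration. Only the data source name, Sales, and Ship Mode come from this chapter.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LensField:
    """One field as exposed through a (hypothetical) lens."""
    source_name: str                      # name in the published data source
    display_name: str = ""                # optional rename shown to casual users
    synonyms: List[str] = field(default_factory=list)
    hidden: bool = False                  # hidden fields are never matched

    def matches(self, term: str) -> bool:
        # Compare case-insensitively against the visible name and any synonyms.
        term = term.strip().lower()
        names = [self.display_name or self.source_name, *self.synonyms]
        return any(term == name.lower() for name in names)


@dataclass
class Lens:
    """A curated view of a published data source (illustrative only)."""
    data_source: str
    fields: List[LensField]

    def resolve(self, term: str) -> Optional[str]:
        # Return the underlying field name for a typed term, or None.
        for lens_field in self.fields:
            if not lens_field.hidden and lens_field.matches(term):
                return lens_field.source_name
        return None


# Assumed example lens: the synonyms, rename, and Row ID field are made up.
lens = Lens(
    data_source="Product Sales",
    fields=[
        LensField("Sales", synonyms=["revenue", "turnover"]),
        LensField("Ship Mode", display_name="Shipping Method"),
        LensField("Row ID", hidden=True),
    ],
)

print(lens.resolve("Revenue"))          # matched through a synonym
print(lens.resolve("shipping method"))  # matched through the rename
print(lens.resolve("row id"))           # hidden, so not matched
```

The point of the sketch is that the published data source itself is untouched: the lens only changes which names a casual user can type and which fields are reachable.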
In the next section, we will look at building our first lens to enable Ask Data.
To create our first lens, we are going to access the Product Sales published data source that we published to our Tableau Server or Cloud in Chapter 6. If you did not publish the data source at that step, or it is no longer on your Tableau Server or Cloud, download the Product Sales.csv file from GitHub, connect to it from Tableau Desktop or the Tableau web client, and publish it to your Tableau Server or Cloud. Take note of the project where you published the data source:
Figure 12.1 – Product Sales data source page
After de-selecting these fields, click Submit:
Figure 12.2 – Selecting fields for the lens
Figure 12.3 – Select Fields
Figure 12.4 – Bringing up field details
Figure 12.5 – Edit field dialog
Figure 12.6 – Adding a synonym to a data element
Figure 12.7 – Mapping the synonyms in the search area
Figure 12.8 – Result of our search as a bar chart
Figure 12.9 – Search results as a treemap
Figure 12.10 – Recommended visualizations
Figure 12.11 – Saving the recommended visualization
In this section, we learned about Tableau’s natural language query interface, Ask Data. We also learned that Ask Data needs a lens to be added to an existing published data source. A lens makes a published data source easier for casual users to query using natural language. A lens allows us to remove fields from our model, rename fields, add synonyms, and create recommended visualizations without disrupting our data model for analysts using Tableau Desktop or web edit to create dashboards and other analyses.
Tableau also provides reports on how people are using Ask Data. These reports can be invaluable for refining our data model. If people are asking questions using terms the lens does not recognize, we can update our synonyms; if they are asking for information that’s not in our data model, we can go back to our source data and add it to our published data source.
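As a rough illustration of this feedback loop, the sketch below tallies the words in a set of questions that do not correspond to any known field name or synonym. It does not read Tableau's usage reports; the `unmatched_terms` function, the sample questions, the stop-word list, and the Country field name are assumptions made for the example.

```python
from collections import Counter

# Hypothetical filler words to ignore when scanning questions.
STOP_WORDS = frozenset({"by", "total", "the", "of", "for", "in", "and"})


def unmatched_terms(queries, known_terms, stop_words=STOP_WORDS):
    """Count words in queries that match no known field name or synonym."""
    # Split multi-word names such as "Ship Mode" into individual words.
    known_words = {
        word for term in known_terms for word in term.lower().split()
    }
    counts = Counter()
    for query in queries:
        for word in query.lower().split():
            word = word.strip("?.,!")
            if word and word not in known_words and word not in stop_words:
                counts[word] += 1
    return counts


# Made-up questions, as if collected by hand from casual users.
sample_queries = [
    "revenue by ship mode",
    "total revenue by country",
    "sales by ship mode",
]
known = ["Sales", "Ship Mode", "Country"]

candidates = unmatched_terms(sample_queries, known)
print(candidates.most_common())
```

A word that shows up repeatedly in this tally, such as "revenue" in the sample questions, is a candidate for a synonym on an existing field; a word with no sensible field to attach it to may point to data that is missing from the model.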
In the next section, we will look at Explain Data and the implications it has on data modeling.
Explain Data is a feature in Tableau that helps users investigate outliers in their data by automatically generating potential explanations and presenting them visually. When you use Explain Data on an individual mark in a Tableau visualization, Tableau builds these potential explanations using statistical models that include data from the data source that isn’t in the current view.
Unless you instruct it otherwise, Explain Data will use all the columns in your data source to try to find explanations. Unlike Ask Data, the additional data modeling for Explain Data occurs in the workbook. This means Explain Data works with both published and embedded data sources.
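Because every column is a candidate by default, it can help to review the data source beforehand for fields that are unlikely to make meaningful explanations. The sketch below uses one simple heuristic, flagging dimensions whose values are almost all distinct (typically identifiers). This heuristic is an assumption of this example and is not part of Explain Data; the `likely_identifier_fields` function, the 0.95 threshold, the Order ID field, and the sample rows are all hypothetical.

```python
def likely_identifier_fields(rows, dimensions, threshold=0.95):
    """Return dimensions whose share of distinct values is >= threshold."""
    if not rows:
        return []
    flagged = []
    for column in dimensions:
        values = [row[column] for row in rows]
        distinct_ratio = len(set(values)) / len(values)
        if distinct_ratio >= threshold:
            flagged.append(column)
    return flagged


# Made-up rows standing in for a small extract of the data source.
sample_rows = [
    {"Order ID": "A-1", "Ship Mode": "Same Day", "Sales": 120.0},
    {"Order ID": "A-2", "Ship Mode": "Standard", "Sales": 80.0},
    {"Order ID": "A-3", "Ship Mode": "Standard", "Sales": 45.5},
    {"Order ID": "A-4", "Ship Mode": "Same Day", "Sales": 80.0},
]

to_review = likely_identifier_fields(sample_rows, ["Order ID", "Ship Mode"])
print(to_review)
```

Fields flagged this way are only suggestions to review; whether a field has business value in an explanation remains a judgment for the data modeler.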
In the next section, we will see Explain Data in action and control how it works by deciding which columns (fields) we want to exclude from the analysis.
Let’s look at Explain Data in action. We will begin by connecting to our Product Sales published data source. We can do this in Tableau Desktop or by creating a workbook from the web client of Tableau Server or Cloud. Because we used Tableau Desktop in Chapter 8, Chapter 9, and Chapter 10, the examples in this section will use the web client. If you prefer, you can follow along using Tableau Desktop:
Figure 12.12 – New workbook
Figure 12.13 – Sales by Ship Mode
Figure 12.14 – Explain Data
Figure 12.15 – Explanation for SUM(Sales) of Same Day
Figure 12.16 – Number of records explanation
Figure 12.17 – Potential dimensions
Figure 12.18 – The Explain Data Settings dialog
In this section, we learned about Explain Data, Tableau’s feature that uses statistical models to uncover potential reasons for outliers in our data. We also learned how to remove dimensions from consideration in the statistical models. Removing fields that have no business value in the analysis is key; otherwise, business users are unlikely to trust the suggestions from Explain Data.
In this chapter, we looked at Ask Data and Explain Data. These machine learning features put analysis in the hands of casual users if the data is modeled properly for each feature.
Ask Data requires us to first create a published data source. Next, we must create a lens on our published data source. A lens allows us to hide fields, rename fields, add synonyms, and create view recommendations. The better the lens, the better the answers casual users will get from their natural language questions.
By default, Explain Data runs statistical models that evaluate all the dimensions in our data model. We often know that some of these dimensions might surface as explanations for outliers but have no business value in the analysis. In these cases, we can remove those dimensions from the analysis Tableau performs, increasing trust in the results of Explain Data.
In the next chapter, we will be looking at the role Tableau Prep Conductor plays in data modeling in the Tableau platform and exploring scheduling for extract refreshes.