15

Data Modeling Strategies by Audience and Use Case

In the first 14 chapters, we discussed the data modeling components of the Tableau platform. We looked at Tableau Desktop, Tableau Prep Builder, virtual connections, published data sources, embedded data sources, lenses, schedules, Tableau Prep Conductor, securing content with projects, securing data with row-level security, and more.

This is our final chapter, in which we will put it all together by exploring which components of the platform to use based on the audience and use case. Before turning to the use cases themselves, we will discuss the general strengths of Tableau Desktop versus Tableau Prep Builder.

In this chapter, we’re going to cover the following topics:

  • When to use Tableau Prep Builder versus Tableau Desktop
  • Use case 1 – finance user with quarterly financial reporting
  • Use case 2 – sales performance management dashboards
  • Use case 3 – information systems analytics of internal employee intranet site visits
  • Use case 4 – marketing analytics of social media campaigns

When to use Tableau Prep Builder versus Tableau Desktop for creating our data models

Tableau Prep Builder and Tableau Desktop are the two foundational components for data modeling on the Tableau platform. Each has its place, and they overlap in functionality in many areas.

Tableau Prep Builder is the go-to product when you need to transform your data. It is uniquely suited to cleaning data, particularly messy string fields, and includes machine learning algorithms to fix dirty data: for example, a customer name field where people can type in free-form values. It also has features for pivoting rows to columns, adding new rows, and outputting models to destinations beyond Tableau.
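Prep Builder's grouping algorithms are built into the product, but the underlying idea of grouping dirty string values can be sketched in plain Python. This is only an illustration with made-up customer names, using a crude normalization key rather than Prep Builder's actual fuzzy-matching logic:

```python
import re
from collections import defaultdict

def normalize(name: str) -> str:
    """Crude grouping key: lowercase, strip punctuation and common legal suffixes."""
    key = re.sub(r"[^a-z0-9 ]", "", name.lower())
    key = re.sub(r"\b(inc|llc|ltd|co|corp)\b", "", key)
    return " ".join(key.split())

def group_names(names):
    """Bucket raw values that share the same normalized key."""
    groups = defaultdict(list)
    for n in names:
        groups[normalize(n)].append(n)
    return dict(groups)

# Hypothetical free-form entries for the same two customers
raw = ["Acme Inc.", "ACME, Inc", "acme inc", "Globex Corp", "Globex"]
grouped = group_names(raw)
# All three "Acme" spellings collapse to one group, both "Globex" entries to another
```

In Prep Builder, the equivalent step is a grouped-values clean step; a data steward would then pick the canonical spelling for each group.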

Tableau Desktop is the go-to product when you need live connections to your data and when you are combining data that is at different levels of aggregation. Tableau Desktop is also the only place, along with the corresponding web interface, where we can build hierarchies and folders to organize our data.

There are also cases where either product can be used based on the use case. These are joins, aggregations, unions, pivoting columns to rows, filtering data, and integration with data science models. Both products have these capabilities.

A summary of the ideal scenarios for each product and where either can be used can be seen in Table 15.1:

Ideal for Tableau Prep Builder:

  • Cleaning data
  • Pivot data – rows to columns
  • Data scaffolding (adding rows)
  • Output to something other than Tableau
  • String calculations

Use case dependent (either product):

  • Joins
  • Aggregations
  • Unions
  • Pivot – columns to rows
  • Filtering data
  • Integration with data science models

Ideal for Tableau Desktop:

  • Live connections
  • Flexible level of detail joins
  • Building hierarchies
  • Creating folders
  • Conditional calculations (for example, moving averages, running totals, year-over-year/month-over-month, percent of total)

Table 15.1 – The case for Tableau Prep Builder and Tableau Desktop
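The conditional calculations listed for Tableau Desktop correspond to table calculations such as `RUNNING_SUM`, `WINDOW_AVG`, and `TOTAL`. As a rough sketch of what those functions compute, here is a plain-Python version using hypothetical quarterly sales figures:

```python
sales = [100, 120, 90, 150]  # hypothetical quarterly sales values

# Running total, like RUNNING_SUM(SUM([Sales])) in Tableau
running_total = []
total = 0
for v in sales:
    total += v
    running_total.append(total)

def moving_avg(vals, window):
    """Trailing moving average, like WINDOW_AVG(SUM([Sales]), -(window-1), 0)."""
    out = []
    for i in range(len(vals)):
        lo = max(0, i - window + 1)
        out.append(sum(vals[lo:i + 1]) / (i + 1 - lo))
    return out

# Percent of total, like SUM([Sales]) / TOTAL(SUM([Sales]))
pct_of_total = [v / sum(sales) for v in sales]
```

In Desktop these are written once as calculated fields and recompute automatically as the data or the level of detail in the view changes.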

In the following sections, we will be looking at use cases through the eyes of different audiences. The use case and audience will determine which parts of the Tableau platform we will use to create and share the data model.

Note

The asterisks in the tables in the next section represent the components that are only available with the Data Management licenses.

Use case 1 – finance user with quarterly financial reporting

In this and the next three use cases, we will look at the scenario and then the Tableau modeling steps.

Scenario:

We have an analyst in finance who must produce quarterly financial reporting that is available to the finance department and the executive team. The reports contain profit and loss, balance sheet, and cash flow information. This is combined with data from the company's main competitors, which are publicly listed on stock exchanges.

The data comes from both internal and external sources. All internal data is sourced from internal systems and exported to Microsoft Excel. External sources come from PDF documents that competitors need to make available as part of their regulatory requirements and are available on the investor relations pages of their websites.

The company uses Tableau Cloud and has a secure project set up for these reports that limits access to the finance department and the executive team.

Tableau modeling steps:

The finance analyst uses Tableau Desktop to pivot rows to columns, join, and potentially union the internal data by connecting to the Excel files. They then create a relationship with the competitors' data using the Tableau PDF connector. On the first build of the report, they use Tableau Desktop to create their tables and visuals and then publish the workbook with the data source embedded. In future quarters, they add new files to the same directory and use unions to bring in the historical data. Once the new files are dropped in, they only need to refresh and republish each quarter.
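Outside of Tableau, the union-then-pivot reshaping described above can be sketched in plain Python. The quarterly file contents here are hypothetical stand-ins for the Excel exports:

```python
# Hypothetical export shape: one row per (quarter, account, amount)
q1 = [("2024-Q1", "Revenue", 500), ("2024-Q1", "COGS", 300)]
q2 = [("2024-Q2", "Revenue", 550), ("2024-Q2", "COGS", 320)]

# Union: stack the rows from each quarterly file into one table
unioned = q1 + q2

def pivot(rows):
    """Rows-to-columns pivot: one output row per quarter, one column per account."""
    out = {}
    for quarter, account, amount in rows:
        out.setdefault(quarter, {})[account] = amount
    return out

wide = pivot(unioned)
# wide maps each quarter to its account columns, ready for a P&L-style layout
```

In Tableau, the union is configured once against the directory of files, so new quarterly drops are picked up on refresh without rebuilding the model.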

We can see what is and is not needed from the various data modeling components of the Tableau platform for the finance use case, along with the reason, in Table 15.2:

  • Virtual connection * – No: Virtual connections are used for sharing data models at scale and for simplifying columns and tables. This complexity is not needed in our use case.

  • Prep Builder – No: Tableau Desktop is sufficient for creating the data model, and since we need Desktop for building the financial report anyway, it is easier to keep everything in a single client.

  • Desktop – Yes: Desktop can handle all the modeling requirements.

  • Published or embedded? – Embedded: The data model will only be used in this financial report. It will not be shared for other uses.

  • Tableau Bridge – No: The data is very sensitive and produced only quarterly. The financial analyst will refresh the new data manually each quarter and verify that it loads properly (for example, that field names didn't change in the competitor PDFs). Because the data is checked first and refreshed only quarterly, a manual republish makes sense. If similar data were refreshed more frequently, such as daily or weekly, Tableau Bridge would likely be the right choice, as it connects on-premises files to Tableau Cloud and would be needed to automate the refresh.

  • Scheduler or Prep Conductor? * – No: The financial analyst will publish manually; the cost of automation outweighs a manual refresh once a quarter.

  • Lens – No: The report is static and complete. There is no additional data in this use case to enable with a free-text search.

  • Enable Explain Data – No: Explain Data is not needed and should be disabled in the workbook for two reasons. First, the analyst will naturally look for outliers as part of the quarterly publishing. Second, it is unlikely that this data contains columns that would explain any outliers; this is a financial report, not an analysis of financial drivers.

  • Descriptions for Catalog * – Potentially: These audiences likely don't need descriptions to explain the fields in financial reports, but descriptions could be useful for new executive team members.
Table 15.2 – Finance use case

This use case demonstrates that there are times when data sources are relatively small and are updated infrequently. In this scenario, we do not have to overcomplicate our data modeling strategy by using components of the Tableau platform that aren’t needed.

Use case 2 – sales performance management dashboards

Scenario:

We have a request from sales to create a data model for their analyst team to build interactive dashboards for the worldwide sales organization. The dashboards contain sales of all products and services for every customer, in every channel, and in every region. The users range from sophisticated sales analysts to sales representatives who aren't always tech-savvy.

The data comes from many sources, but the data engineering team has consolidated all sources into a Snowflake database. The data is messy because the company uses different customer relationship management software in different regions of the world. The rules for data entry validation vary wildly, resulting in messy data in some regions and cleaner data in others. The data engineering team does not have the mandate, resources, or time to clean the data; they will leave this cleaning task to a data steward who works with the business. Finally, regional sales individuals and management should only see the data for which they are responsible, but global sales management should be able to see all regions.

The company uses Tableau Cloud and has groups set up in Azure Active Directory that are synced to Tableau Cloud.

Tableau modeling steps:

The data steward for the sales organization first uses Tableau Prep Builder to clean the data, join the necessary tables, and union the tables from different regions. They export to a Snowflake table and set up a schedule in Tableau Prep Conductor to keep the table fresh daily.

The data steward then creates a virtual connection with data policies to ensure that each sales group only has access to the data it should see. They set the virtual connection to live to leverage Snowflake for query performance. They then create a published data source from the virtual connection, add field descriptions for the Catalog, and certify the data source. Finally, a lens is created for more casual users by eliminating unnecessary and confusing fields from search and adding synonyms that are used in different regions.
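A data policy on a virtual connection typically joins an entitlement table to the fact rows so that each user sees only their permitted regions. The filtering logic it expresses can be sketched in plain Python; the usernames, regions, and amounts below are hypothetical:

```python
# Hypothetical entitlement table: which regions each user may see
entitlements = {
    "ana":    {"EMEA"},
    "badri":  {"AMER", "APAC"},
    "gloria": {"AMER", "APAC", "EMEA"},  # global sales management sees all regions
}

sales_rows = [
    {"region": "EMEA", "amount": 100},
    {"region": "AMER", "amount": 200},
    {"region": "APAC", "amount": 300},
]

def rows_for(user):
    """Return only the sales rows the user is entitled to, like a data policy filter."""
    allowed = entitlements.get(user, set())
    return [r for r in sales_rows if r["region"] in allowed]
```

In the actual data policy, this condition is evaluated server-side for the signed-in user, so every workbook built on the virtual connection inherits the same row-level security.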

The sales analysts and dashboard developers then connect to the published data source to create interactive dashboards. Before publishing, the analysts ensure the proper columns are made available to Explain Data so that consumers in the sales organization are free to explore outliers in their dashboards.

We can see what is and is not needed from the various data modeling components of the Tableau platform for the sales use case, along with the reasons, in Table 15.3:

  • Virtual connection * – Yes: A virtual connection is used to achieve the row-level security requirements.

  • Prep Builder – Yes: Prep Builder is used to clean up messy fields, join tables, and union the table sources from the different customer relationship management software used in different regions.

  • Desktop – No: We created our published data source in Prep Builder.

  • Published or embedded? – Published: This data source needs to be published, as it is used in multiple workbooks and Ask Data, and needs to be certified.

  • Tableau Bridge – No: As the data comes from a cloud database, it can be accessed from Tableau Cloud without the Tableau Bridge client.

  • Scheduler or Prep Conductor? * – Prep Conductor: Prep Conductor is used to extract and transform the source data from Snowflake and then load it back into a new Snowflake table after it has been cleaned.

  • Lens – Yes: A lens is required, as this is a great use case for Ask Data.

  • Enable Explain Data – Yes: Enabling users to find their outliers is a key requirement.

  • Descriptions for Catalog * – Yes: Data definitions are important for the wide use of this data source across the entire global sales organization.
Table 15.3 – Sales use case

This sales data scenario was very different from the finance use case. This scenario had a wide and diverse audience, data security requirements, and big data that changes frequently. In scenarios like this, it makes sense to take advantage of the majority of the Tableau data modeling components.

Use case 3 – information systems analytics of internal employee intranet site visits

Scenario:

We have an information systems analyst who must produce dashboards showing how employees visit and interact with the organization's internal web properties. The dashboards should be available to the entire organization, as executive management has a goal of driving most employees to web-first interactions.

The data comes from a database that captures all site interactions. This table is then joined to internal employee and internal systems tables for context.
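The joins described here are simple lookups that enrich each visit row with employee and system context. A minimal plain-Python sketch of that left join, using hypothetical IDs and tables:

```python
# Hypothetical dimension tables for context
employees = {"e1": "Engineering", "e2": "Sales"}   # employee_id -> department
systems = {"s1": "HR Portal", "s2": "Wiki"}        # system_id -> site name

# Hypothetical fact rows: one row per site interaction
visits = [
    {"employee_id": "e1", "system_id": "s2"},
    {"employee_id": "e2", "system_id": "s1"},
]

# Left join: enrich each visit with department and site name;
# .get() keeps the visit row even if a lookup key is missing
enriched = [
    {**v,
     "department": employees.get(v["employee_id"]),
     "site": systems.get(v["system_id"])}
    for v in visits
]
```

In Tableau, the same enrichment is modeled as relationships or joins between the interactions table and the two dimension tables.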

The company uses Tableau Server and has a secure project set up with all the groups who can see and interact with these dashboards.

Tableau modeling steps:

A data steward creates a published data source for the information systems analysts and dashboard developers that is extracted from the underlying tables at the end of each day to improve query performance. The data steward also adds a lens and field descriptions and then certifies the data source after ensuring the data is accurate. The information systems analysts and dashboard developers create interactive dashboards and select appropriate fields for Explain Data.

We can see what is and is not needed from the various data modeling components of the Tableau platform, along with the reason, in Table 15.4:

  • Virtual connection * – No: The data is not complex enough, nor does it need a data policy for row-level security.

  • Prep Builder – No: The data is clean and does not need the transformation features of Tableau Prep Builder.

  • Desktop – Yes: Desktop is used to create a published data source that is extracted.

  • Published or embedded? – Published: The data source is published so it can be used with Ask Data and can be certified. If the use case did not require Ask Data, embedding it in the workbook would be fine, since it is only used in a single workbook.

  • Tableau Bridge – No: The company uses Tableau Server, which does not need Tableau Bridge to keep the data fresh.

  • Scheduler or Prep Conductor? * – Scheduler: The published data source is configured with a schedule to refresh the extract daily.

  • Lens – Yes: To enable broad use, Ask Data is enabled.

  • Enable Explain Data – Yes: To let consumers explore outliers, Explain Data is enabled.

  • Descriptions for Catalog * – Yes: As the workbook will be distributed broadly, data definitions are key to ensuring that decisions are based on a solid understanding of the data.
Table 15.4 – Information systems use case

In this third use case, we have a scenario that falls somewhere between the finance and sales use cases. Like the sales example, we have broad distribution. Unlike the sales example, we don’t have the same security concerns about which users can see which data. For this reason, we end up using many of the data modeling components of the Tableau platform without overcomplicating it with unneeded components, such as virtual connections.

Use case 4 – marketing analytics of social media campaigns

Scenario:

We have a request from marketing to create a data model for their analyst team to create interactive dashboards for the worldwide marketing organization. The dashboards contain information on social media click-through campaigns.

The data comes from the social media companies where our campaigns are running, but the data engineering team has consolidated all sources into Google BigQuery. The data is clean and stored at the level of granularity of the day – that is, the data is rolled up to the day and broken down by social media site and campaign name. Sometimes, there are days with no data because the campaigns don’t always generate a click-through each day. Finally, everyone in marketing can see all campaign information, regardless of where they are located.

The company uses Tableau Cloud and has groups set up in Azure Active Directory that are synced to Tableau Cloud.

Tableau modeling steps:

A data steward uses Tableau Prep Builder to create a published data source for the marketing analysts, since Prep Builder is needed to add new rows with zero values for the days with no campaign click-throughs. The data steward then uses Tableau Prep Conductor to schedule daily updates.
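The scaffolding step above (adding zero-valued rows for missing days) can be sketched in plain Python. The dates and click counts below are hypothetical:

```python
from datetime import date, timedelta

# Hypothetical daily click-through rollup with gaps on zero-click days
clicks = {date(2024, 5, 1): 12, date(2024, 5, 4): 7}

def scaffold(data, start, end):
    """Add a zero row for every calendar day that has no source row."""
    out, d = {}, start
    while d <= end:
        out[d] = data.get(d, 0)
        d += timedelta(days=1)
    return out

filled = scaffold(clicks, date(2024, 5, 1), date(2024, 5, 4))
# filled now has one entry per day, with 0 for May 2 and May 3
```

Without this step, charts and moving averages would silently skip the zero-click days, overstating campaign performance; scaffolding makes the gaps explicit.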

The marketing analyst and dashboard development team then create the campaign dashboards and publish them to a marketing project on Tableau Cloud for the entire marketing team to use.

We can see what is and is not needed from the various data modeling components of the Tableau platform, along with the reason, in Table 15.5:

  • Virtual connection * – No: The data is not complex enough, nor does it need a data policy for row-level security.

  • Prep Builder – Yes: The data is missing rows for days when there were no click-throughs for campaigns. Prep Builder is used to add the new rows.

  • Desktop – No: The dashboard developers will use Desktop, but it is not needed for the data modeling step.

  • Published or embedded? – Published: The data source is published from Prep Builder so it can be used by the marketing dashboard development team.

  • Tableau Bridge – No: The source data comes from Google BigQuery, which can be accessed directly by Tableau Cloud.

  • Scheduler or Prep Conductor? * – Prep Conductor: The published data source is scheduled through Prep Conductor.

  • Lens – No: Campaign information is consumed through the campaign dashboards.

  • Enable Explain Data – Yes: To let the marketing team explore campaign result outliers, Explain Data is enabled.

  • Descriptions for Catalog * – Yes: As the workbook will be distributed broadly, data definitions are key to ensuring that decisions are based on a solid understanding of the data.
Table 15.5 – Marketing analytics use case

In this fourth use case, we see a scenario that requires Tableau Prep Builder, as the source data is missing rows for days when campaigns did not generate click-throughs. Like the information systems example, we have broad distribution, but we do not have concerns about row-level security. Security and access can be handled at the project level on Tableau Cloud.

Summary

In this chapter, we discussed the ideal situations for Tableau Prep Builder and Tableau Desktop based on the strengths and features of each. We also looked at cases where both can be the right fit for the task.

We then looked at four audiences and their use cases to bring together how and when to use the various data modeling components of the Tableau platform. First, we explored a financial analyst who creates quarterly financial reports for a limited audience. This use case does not need many data modeling capabilities as it requires infrequent updates, has a limited audience, and has relatively simple data sources.

Next, we looked at a use case of sales performance data for the entire sales organization. This use case uses most of the data modeling capabilities of the Tableau platform due to a broad and varied audience, complex security requirements, and messy data sources.

Next, we looked at a use case focused on information systems analytics of internal intranet site visits, which sits between the finance and sales use cases in terms of complexity and audience. For this reason, it used many of the data modeling features and components we learned about in this book, but not as many as the sales use case.

In our fourth and final use case, we explored a case of marketing campaigns where the campaigns don't yield results every day. We used Prep Builder to add rows for these missing days and used Prep Conductor to schedule our flows.

Now that we have experienced all the data modeling components of the Tableau platform through hands-on exercises and explored which to use in different use cases, we are ready to take our new skills and apply them to real-world scenarios.
