
User Data Integration

This chapter was contributed by Kristoffer Olofsson, Partner at Precis Digital, an online intelligence and marketing agency. Kristoffer has a background as an implementation specialist in the Google Analytics Premium team. He has been working with data since his early university years and often serves as a link between digital managers and developers by speaking the language of both.

Universal Analytics, the latest version of Google Analytics, enables a range of new capabilities that were difficult (if not impossible) to achieve in the past. Perhaps most excitingly, Universal Analytics provides you with the collection methodology necessary to measure journeys across multiple devices, and stitches it all together for you in an intuitive reporting interface that would otherwise require many development hours to produce.

In this chapter, you learn the concepts of leveraging Universal Analytics to this end: how to connect behavior data across devices using a common key, the User ID, and how you can piggyback on that connection to surface relevant business data from virtually any dataset alongside your online data.

The Siloed Dataset

Businesses can have any number of datasets containing different types of information that are being evaluated on an ongoing basis. This could be financial data, Customer Relationship Management (CRM) system data, or any type of organized statistics and facts that are saved and stored over a given period of time. If such datasets are unique in terms of what dimensions or fields they contain, they inevitably exist in silos. This means they cannot be brought together to present one consolidated view.

In order to merge data, datasets must share a common key that joins them. The most prevalent of such keys is also part of the fundamental structure of the universe: time. Businesses constantly use it to join datasets: when looking at quarterly revenue by number of customers, when calculating which holiday leads to higher ticket sales, when comparing costs to earnings, and so on. The trick to putting such metrics side by side is that datasets share time dimensions, like dates, which enables side-by-side comparison.

However, a limitation of time as a dimension is that it is not granular enough to show how datasets fit together on an individual level; that is, for the users or customers who generated the data in the first place. For example, if you want to find out which users moved from one platform to another, a date dimension will not help if your data is an aggregation of many different users with nothing to distinguish them as individuals. In this case, your statistical analyses will be limited to finding correlations and relationships between platform behaviors as a whole.

When dealing with Google Analytics data, a typical example of this challenge is when users move across different platforms and browsers. Their paths are nonlinear and unpredictable, and you cannot make sense of their full journeys. As a result, you probably miss important insights about your audiences. Without a key that combines the data from each platform and browser for each user, interactions across these will be siloed, and you will be unable to connect the dots in a meaningful way. You need something more than time—a more granular key. In a CRM database, this key is often a unique customer ID, which allows businesses to monitor customer behavior over time.
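To make the idea of a more granular key concrete, here is a minimal sketch in JavaScript that joins two datasets on a customer ID rather than on a date. The rows, the field names (customerId, device, amount), and the joinByKey helper are all hypothetical and only illustrate the principle.

```javascript
// Hypothetical rows: web sessions and in-store purchases that both carry a customer ID.
const webSessions = [
  { customerId: 'c1', date: '2014-05-01', device: 'mobile' },
  { customerId: 'c2', date: '2014-05-01', device: 'desktop' },
];
const storePurchases = [
  { customerId: 'c1', date: '2014-05-03', amount: 40 },
];

// Index one dataset by the shared key, then look up each row of the other in it.
function joinByKey(left, right, key) {
  const index = new Map();
  for (const row of right) {
    if (!index.has(row[key])) index.set(row[key], []);
    index.get(row[key]).push(row);
  }
  return left.map((row) => ({ ...row, matches: index.get(row[key]) || [] }));
}

const joined = joinByKey(webSessions, storePurchases, 'customerId');
```

Joining on the date alone would only tell you that sessions and purchases happened in the same week. Joining on the customer ID tells you that the person who browsed on mobile is the one who later bought in the store.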

To leverage such an ID, it needs to be consistently present in all the datasets you want to combine. Unfortunately, the challenge doesn't stop here. Even if you have an ID stored somewhere for each of your customers as they identify themselves, you might not have the capacity to make sense of the data. Your users may log in online, purchase items in offline stores, or sign up for notifications in your app. But what if you don't have the necessary tools to stitch it all together, despite setting the same ID across all instances? This is where Google Analytics, as a data collection and reporting tool, is in a perfect position to address your problem.

The User ID

Before learning more about cross-device tracking, it is important you understand the technicalities of how the Google Analytics tracking works. For each interaction tracked on Google Analytics, be it an event, a pageview, or a social interaction, an HTTP request (also known as a “hit”) is sent to the Google Analytics servers. Each request becomes a row in a table stored on the servers, and then is tied together in sessions through a field called the Client ID. A hit is sent along with all the parameters necessary to create meaningful reports in Google Analytics. Among other things, this is what enables you to see how users move across your website or application in coherent sessions.

However, the Client ID is precisely that, a client identifier. A client in this case refers to a browser or a specific application. Your users may have several Client IDs, one for each client they use to access your platform or website. To be able to stitch sessions together across clients, you need a common key that all HTTP requests share, independent of the client. Enter the golden key: the User ID. A User ID does not identify the client; that job is already taken care of by the Client ID. The User ID identifies the individual using the client. See the relationship between the User and Client IDs in Figure 8-1.

A common misconception is that you can somehow use Google Analytics to surface cross-device user behavior without setting a common key like the User ID. This is not true; Google Analytics will always require you to set User IDs and pass them through to the servers in HTTP requests to be able to stitch data together across platforms. The only way to get consistent IDs in this manner is when your users identify themselves in one way or another (through a login, a purchase, or anything that says “this is me”).

images

Figure 8-1: The relationship between Client and User IDs.
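The relationship in Figure 8-1 can be sketched with a few hypothetical hits. The field names cid and uid mirror the request parameters discussed later in this chapter; the values and the groupBy helper are made up for illustration.

```javascript
// Hypothetical hits: one person on two clients (two Client IDs, one User ID),
// plus an anonymous visitor who never identified themselves.
const hits = [
  { cid: '111.aaa', uid: 'U42', page: '/home' },     // laptop browser
  { cid: '222.bbb', uid: 'U42', page: '/checkout' }, // phone app
  { cid: '333.ccc', uid: undefined, page: '/home' }, // anonymous
];

// Group rows by one field, skipping rows where that field is missing.
function groupBy(rows, field) {
  const groups = new Map();
  for (const row of rows) {
    const key = row[field];
    if (key === undefined) continue; // no key, nothing to stitch on
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(row);
  }
  return groups;
}

const byClient = groupBy(hits, 'cid'); // three separate clients
const byUser = groupBy(hits, 'uid');   // one identified individual
```

Grouped by Client ID, the laptop and the phone look like two unrelated visitors. Grouped by User ID, they collapse into one person, and the anonymous client drops out because there is no key to stitch on.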

WARNING Please note that according to the Google Analytics Terms of Service, it is forbidden to send personally identifiable information to your Google Analytics account. You may use a unique identifier, but not a name, phone number, email address, social security number, or any other personally identifiable information. To learn more about it, read the full Terms of Service at http://goo.gl/t03xWG.

In the following sections you learn the steps required to start gathering and reporting user-level data with the User ID feature. The steps are:

  1. Create a User ID View: This step shows how to create a User ID View and the extra reports you will get access to by using this feature.
  2. Set the User ID: This step shows how to set the User ID when sending hits to Google Analytics.
  3. Store the User ID: This step shows how to store the User ID in order to populate it as the value of a variable used by Google Analytics during collection.

Following those steps, you will learn how to import additional information into Google Analytics, such as CRM data, using the Measurement Protocol (which was briefly discussed in Chapter 7, “Custom Data Integration”).

Creating a User ID View

A User ID View reports data exclusively from sessions in which you send hits to Google Analytics including a User ID. This means that any session coming from an anonymous user would not be shown in the User ID View; however, your standard reports will still show your total number of users.

In order to create the User ID View, log in to your Google Analytics account and click on Admin at the top of your screen. Choose the Property you would like to use to collect the User ID sessions, and then click on the Tracking Info menu below the Property selector drop-down; choose User-ID. The process of creating a User ID View is composed of three different screens:

  1. In the first screen you will be asked to review and agree to the User ID policy.
  2. In the second screen you will be given explanations on how to implement the User ID, which we discuss below. You will also have the choice to change the Session Unification setting, which allows hits collected before the User ID is assigned to be associated with that ID, so long as the hits are from the session in which an ID value is assigned for the first time. When set to OFF, only data with User IDs explicitly assigned can be associated. Learn more about session unification at http://goo.gl/pWq1kX.
  3. In the third screen you will see a quick summary of what exactly a User ID View is.
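The effect of the Session Unification setting can be illustrated with a simplified sketch. This is not how Google Analytics processes data internally; it only models the rule described above for a single session, and the unifySession function and hit fields are hypothetical.

```javascript
// hits: the ordered hits of ONE session; uid is undefined until the user logs in.
// Simplification: the sketch does not check that this is the first session
// in which the ID was ever assigned.
function unifySession(hits, sessionUnification) {
  const firstIdentified = hits.find((h) => h.uid !== undefined);
  if (!firstIdentified) return []; // never identified: not part of the User ID View
  return hits
    .filter((h) => sessionUnification || h.uid !== undefined)
    .map((h) => ({ ...h, uid: h.uid !== undefined ? h.uid : firstIdentified.uid }));
}

const sessionHits = [
  { page: '/' },                    // before login
  { page: '/login', uid: 'U1' },    // user identifies
  { page: '/account', uid: 'U1' },
];

const unifiedOn = unifySession(sessionHits, true);   // all three hits, all with uid U1
const unifiedOff = unifySession(sessionHits, false); // only the two hits sent with a uid
```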

After creating a User ID View, you will have access to four additional reports—three in the User ID View and one in the standard View: Device Overlap, Device Paths, and Acquisition Device in the User ID View, and User ID Coverage in the standard View. The following sections describe what each report will show you.

The Device Overlap Report

This report, and the following two reports, can be found on the User ID View under the left-side menu named Audience, under a section named Cross Device.

This report shows a squared Venn diagram visualizing how users interact with the website, where the area of the rectangles represents the Users in each combination. There are several combinations, such as only through a mobile device, only through a tablet device, only through a desktop computer, and all possible variations.

You also have the option to view the same diagram with the area representing the revenue contributed to your business by the different device variations. This can provide insights into how each device (or group of devices) is performing. Apart from the diagram, as shown in Figure 8-2, you will also see a table below the visualization summarizing the data.
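If you want to reason about what the diagram counts, the following sketch computes the same kind of segmentation from raw session rows. It is an illustration under assumed field names (uid, device), not a description of how the report is built.

```javascript
// Count users per combination of device categories they were seen on.
function deviceOverlap(sessions) {
  const devicesByUser = new Map();
  for (const { uid, device } of sessions) {
    if (!devicesByUser.has(uid)) devicesByUser.set(uid, new Set());
    devicesByUser.get(uid).add(device);
  }
  const counts = {};
  for (const devices of devicesByUser.values()) {
    const combination = [...devices].sort().join(' + ');
    counts[combination] = (counts[combination] || 0) + 1;
  }
  return counts;
}

const overlap = deviceOverlap([
  { uid: 'U1', device: 'desktop' },
  { uid: 'U1', device: 'mobile' },
  { uid: 'U2', device: 'mobile' },
  { uid: 'U3', device: 'mobile' },
  { uid: 'U3', device: 'mobile' },
]);
// overlap is { 'desktop + mobile': 1, 'mobile': 2 }
```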

The Device Paths Report

This report, shown in Figure 8-3, provides insight into the order in which users moved between devices when visiting your website or app, and how they performed based on that path. You can use both site usage and conversion-related metrics to analyze each of the paths.

images

Figure 8-2: Device Overlap report on User ID View.

images

Figure 8-3: Device Paths report on User ID View.

Using the Device Paths report, you might find out that specific devices are successful in different parts of the buying cycle. This can help you drive your ad strategy by advertising upper-funnel keywords on tablets and branded keywords on desktops, or vice versa if it makes sense based on your data. For some businesses, users might use mobile for research and desktop for purchasing, and for other businesses it might be the opposite.
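A device path is simply each user's sessions ordered in time and reduced to device categories. The sketch below derives such paths from hypothetical session rows; collapsing consecutive sessions on the same device into one step is a choice made for this example, and the field names (uid, device, time) are assumptions.

```javascript
// Build one device path per user from sessions ordered by time.
function devicePaths(sessions) {
  const ordered = [...sessions].sort((a, b) => a.time - b.time);
  const paths = new Map();
  for (const { uid, device } of ordered) {
    const path = paths.get(uid) || [];
    // Only add a step when the device changes.
    if (path[path.length - 1] !== device) path.push(device);
    paths.set(uid, path);
  }
  return paths;
}

const paths = devicePaths([
  { uid: 'U1', device: 'desktop', time: 3 },
  { uid: 'U1', device: 'mobile', time: 1 },
  { uid: 'U1', device: 'mobile', time: 2 },
  { uid: 'U2', device: 'tablet', time: 1 },
]);
const pathU1 = paths.get('U1').join(' > '); // 'mobile > desktop'
```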

An interesting capability provided in the Device Paths report is changing the path definitions. As you can see in Figure 8-4, you can adjust path options and the minimum steps to show in a path. You can either see the whole path or adjust the path to show only steps before and after the following:

images

Figure 8-4: Paths options for Device Paths report.

  • Any Goal Completion
  • Any Transaction
  • Event Action
  • Event Category
  • Event Label
  • Page
  • Goal

There are several use cases for each of them, but looking at all the devices used before a transaction would be especially interesting, as it would provide insights on how to optimize your device advertisement and functionality based on where in the funnel your most valuable users are.

The Acquisition Device Report

This report, shown in Figure 8-5, provides insight into which devices are successful in acquiring new users. The metrics show the revenue generated on the originating device, as opposed to revenue generated on subsequent devices used in the purchase cycle.

images

Figure 8-5: Acquisition Device report on User ID View.
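The split between originating and subsequent devices can be sketched as follows. The session rows and the acquisitionRevenue function are hypothetical, and the sketch treats the device of a user's earliest session as the originating device.

```javascript
// Attribute each user's revenue to the device of their first session,
// split by whether the revenue occurred on that device or on another one.
function acquisitionRevenue(sessions) {
  const ordered = [...sessions].sort((a, b) => a.time - b.time);
  const firstDevice = new Map();
  const report = {};
  for (const { uid, device, revenue } of ordered) {
    if (!firstDevice.has(uid)) firstDevice.set(uid, device);
    const origin = firstDevice.get(uid);
    if (!report[origin]) report[origin] = { originating: 0, other: 0 };
    if (device === origin) report[origin].originating += revenue;
    else report[origin].other += revenue;
  }
  return report;
}

const acquisition = acquisitionRevenue([
  { uid: 'U1', device: 'mobile', time: 1, revenue: 0 },
  { uid: 'U1', device: 'desktop', time: 2, revenue: 100 },
  { uid: 'U2', device: 'desktop', time: 1, revenue: 50 },
]);
// mobile acquired U1, whose 100 in revenue happened later on desktop.
```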

The User-ID Coverage Report

Once you create the User ID View and start collecting data from User ID sessions, you will also have access to a new report on your standard Views, where you can monitor site usage and conversions for sessions including and excluding a User ID, as shown in Figure 8-6. You can find this report under the left-side menu named Audience, under a section named Behavior.

images

Figure 8-6: User ID Coverage Report on standard views.

Setting the User ID

For Google Analytics to be able to stitch sessions together for any given individual user, each HTTP request must contain the User ID parameter uid. In the following code, you can see what a typical HTTP request to the Google Analytics servers looks like. Note that these requests follow the same structure on all platforms where a collection library built on top of the Measurement Protocol is implemented, including Universal Analytics. Learn more about it at http://goo.gl/leiE2q.

http://www.google-analytics.com/collect?
v=1&
tid=UA-12345-1&
cid=12345.6789&
t=pageview&
dl=exampleLocation&
uid=1234ABC

In this request, you can see the following parameters:

  • Protocol version: v
  • Tracking ID: tid
  • Client ID: cid
  • Hit type: t
  • Document location: dl (this parameter can be replaced with document hostname, dh, plus document path, dp)
  • User ID: uid
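If you build such requests yourself, a small helper can assemble the query string from the parameters listed above. The following is a minimal sketch; buildHitUrl is a hypothetical name, and the values are the sample values from the request shown earlier.

```javascript
// Assemble a collection URL from hit parameters.
// URLSearchParams takes care of encoding the values.
function buildHitUrl(fields) {
  const params = new URLSearchParams(fields);
  return 'http://www.google-analytics.com/collect?' + params.toString();
}

const hitUrl = buildHitUrl({
  v: '1',               // protocol version
  tid: 'UA-12345-1',    // tracking ID
  cid: '12345.6789',    // client ID
  t: 'pageview',        // hit type
  dl: 'exampleLocation', // document location
  uid: '1234ABC',       // user ID
});
```

The helper only builds the URL. Sending it is left to whatever HTTP client your platform provides.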

This type of request is generated automatically if you use one of the tracking libraries, such as analytics.js for websites or the GoogleAnalyticsServices-library for Android or iOS, and set a value for the userId field, which is sent through the uid parameter. Here is an example using analytics.js; the line you would need to add is the second one, the set call:

ga('create', 'UA-12345-1');
ga('set', 'userId', '1234ABC');
ga('send', 'pageview');

If you are using Google Tag Manager, you could first define the User ID value in the dataLayer above the GTM script, as follows:

<script>
   dataLayer = [{
       'userId': '1234ABC'
   }];
</script>

Then you set this as a field value in your tag, as shown in Figure 8-7.

images

Figure 8-7: User ID tag on Google Tag Manager.

The Measurement Protocol is flexible enough to allow you to build requests on virtually any platform that can connect to the Internet. You just need to structure the requests in the same way as the tracking libraries, and data will be collected and reported.

Storing the User ID

Since the unique ID for each user should be independent of the client being used, it cannot be stored, for example, as a cookie in the browser (which is the case with the Client ID value when using analytics.js for tracking websites).

For a business that already has unique IDs for each customer stored server-side, the most straightforward way is to populate this ID as the value of a variable subsequently used by Universal Analytics during collection. Naturally, you need to take the appropriate actions to ensure this value is assigned before a tracking method executes. Figure 8-8 shows the order in which the ID should be assigned to a user.

images

Figure 8-8: Storing the User ID to a website user.
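One way to guarantee the order shown in Figure 8-8 is to let the server write the ID into the tracking snippet before the page is delivered. The sketch below assumes a hypothetical server-side user object with an id property that is empty for anonymous visitors; renderTrackingSnippet is an illustrative name, not part of any library.

```javascript
// Render the analytics.js calls server-side so the User ID is set
// before the pageview is sent. Anonymous visitors get no userId line.
function renderTrackingSnippet(trackingId, user) {
  const lines = [`ga('create', ${JSON.stringify(trackingId)});`];
  if (user && user.id) {
    // JSON.stringify produces a safely quoted JavaScript string literal.
    lines.push(`ga('set', 'userId', ${JSON.stringify(String(user.id))});`);
  }
  lines.push("ga('send', 'pageview');");
  return lines.join('\n');
}

const loggedInSnippet = renderTrackingSnippet('UA-12345-1', { id: '1234ABC' });
const anonymousSnippet = renderTrackingSnippet('UA-12345-1', null);
```

Because the set call is emitted between create and send, the pageview hit already carries the ID, and anonymous visitors are tracked exactly as before.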

For apps, you can leverage local data storage (such as the SharedPreferences class in Android) to store the same value when the user identifies herself. For all subsequent requests to the Google Analytics servers, the same value should be included in the User ID parameter (&uid=). Figure 8-9 shows a schematic description of how it would work on a mobile application.

images

Figure 8-9: Storing the User ID to a mobile application user.

Note that all requests must include the User ID parameter for the request data to be included in reporting. It is not enough to set it once and hope for automatic session stitching; the ID will not be associated with subsequent requests if you leave it out. Only when users have identified themselves, for example through a login, should their sessions and interactions be stitched together independent of platform. As previously discussed in the section showing how to create a User ID View, all hits during the same session prior to when the user logged in will have the User ID associated with them, provided that you have Session Unification enabled.
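A simple way to make sure no request is sent without the ID is to route every hit through one function that reads the stored value. The sketch below is an illustration only: createTracker is a hypothetical helper, and the Map stands in for whatever local storage your platform offers (for example, SharedPreferences on Android).

```javascript
// storage: any object with get(key) and set(key, value); a Map is used here
// as a stand-in for the platform's persistent local storage.
function createTracker(baseFields, storage) {
  return {
    // Call this when the user identifies, for example after a login.
    identify(id) {
      storage.set('userId', String(id));
    },
    // Every hit is built here, so the uid can never be forgotten.
    buildHit(fields) {
      const hit = { ...baseFields, ...fields };
      const userId = storage.get('userId');
      if (userId !== undefined) hit.uid = userId;
      return hit;
    },
  };
}

const storage = new Map();
const tracker = createTracker({ v: '1', tid: 'UA-12345-1', cid: '12345.6789' }, storage);

const hitBeforeLogin = tracker.buildHit({ t: 'pageview', dl: '/home' });
tracker.identify('1234ABC');
const hitAfterLogin = tracker.buildHit({ t: 'pageview', dl: '/account' });
```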

Importing Additional Data

Once you have figured out the best ways to identify users across platforms, you are in a golden position to import all types of data in your requests. Since you already have the dimension that will join it all together in reporting, and as long as you set this value consistently across platforms, you can simply attach additional values to requests from your internal data warehouse, website, apps, or virtually any system that can connect to the Internet. The Measurement Protocol is very flexible.

A good example of this is to leverage Custom Dimensions to populate business-specific data to your requests. Imagine you have a table in your CRM that looks like the one shown in Figure 8-10.

images

Figure 8-10: Sample table from a CRM.

As you send requests using the Measurement Protocol, for example after an offline purchase by a member, you should include all relevant data in your requests. As long as the hit also includes the User ID (or in its absence, at least the Client ID), it will stitch together nicely with all the other data within Google Analytics for that user or client:

http://www.google-analytics.com/collect?
v=1&
tid=UA-12345-1&
cid=12345.6789&
t=pageview&
dl=exampleLocation&
uid=g37iohg39h3&
cd1=NY&
cd2=Bronze&
cd3=04/06/2013&
cd4=4

In order to have this shown in your Google Analytics reports, you also need to change the Custom Dimension settings in your Admin interface. In order to do that, log in to your Google Analytics account and click on Admin at the top of your screen. Then, under Property, choose Custom Definitions and then Custom Dimensions. You will find a table similar to Figure 8-11. Click on + New Custom Dimension and choose User for the scope.

images

Figure 8-11: Custom dimensions table on the Google Analytics Admin section.

Once you create the Custom Dimension, you will notice that it was assigned an index number (see the second column in Figure 8-11). The index of each Custom Dimension you create should be used in the code shown previously, where cd1 indicates index 1, cd2 indicates index 2, and so on.
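The mapping between CRM fields and Custom Dimension indexes is easy to get wrong by hand, so it can help to keep it in one place. The sketch below assumes hypothetical CRM field names (region, tier, memberSince, purchases) mapped to indexes 1 through 4; replace both with the fields and indexes from your own setup.

```javascript
// Hypothetical mapping from CRM field names to Custom Dimension indexes.
const dimensionIndex = { region: 1, tier: 2, memberSince: 3, purchases: 4 };

// Turn one CRM row into the query string of a hit.
function crmRowToQuery(baseFields, row) {
  const hit = { ...baseFields, uid: row.userId };
  for (const [field, index] of Object.entries(dimensionIndex)) {
    if (row[field] !== undefined) hit['cd' + index] = String(row[field]);
  }
  return new URLSearchParams(hit).toString();
}

const crmQuery = crmRowToQuery(
  { v: '1', tid: 'UA-12345-1', cid: '12345.6789', t: 'pageview', dl: 'exampleLocation' },
  { userId: 'g37iohg39h3', region: 'NY', tier: 'Bronze', memberSince: '04/06/2013', purchases: 4 }
);
```

Note that URLSearchParams percent-encodes the slashes in the date, so cd3 is sent as 04%2F06%2F2013.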

Once you send the Custom Dimensions through the Measurement Protocol and enable them in the interface, you will have access to a more comprehensive picture of your customers, and your reports can be tailored to fit your business requirements. Figure 8-12 shows a sample table that you would now be able to build.

images

Figure 8-12: Tailored report on Google Analytics interface.

If the User ID is in place, you can combine your custom data with cross-device reporting as well. Figure 8-13 shows a sample report from Google Analytics in which information from a CRM was passed to Google Analytics through a Custom Dimension (Silver members) in order to show how those users move between device categories.

images

Figure 8-13: Cross-device path based on CRM data.

An alternative to including additional Custom Dimensions in the actual requests is to leverage Data Import (discussed in the previous chapter), using the User ID as the key to import additional metadata about your customers. A benefit of this approach is that your requests will be smaller. However, you would need to perform the upload regularly to continuously include new customers as time goes by.
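If you choose the upload route, the file is typically a CSV with one row per customer. The sketch below builds such a file from CRM rows. The column headers used in the example (ga:userId, ga:dimension2) are assumptions; use the exact headers from the schema of the Data Set you create in the Admin interface. The toImportCsv helper and the CRM field names are hypothetical.

```javascript
// columns: list of { column, field } pairs, where column is the CSV header
// and field is the property to read from each CRM row.
function toImportCsv(columns, rows) {
  // Quote a value only when it contains a comma, quote, or line break.
  const escapeValue = (value) => {
    const text = String(value);
    return /[",\n]/.test(text) ? '"' + text.replace(/"/g, '""') + '"' : text;
  };
  const lines = [columns.map((c) => c.column).join(',')];
  for (const row of rows) {
    lines.push(columns.map((c) => escapeValue(row[c.field])).join(','));
  }
  return lines.join('\n');
}

const importCsv = toImportCsv(
  [
    { column: 'ga:userId', field: 'userId' },   // assumed header for the key
    { column: 'ga:dimension2', field: 'tier' }, // assumed header for Custom Dimension 2
  ],
  [{ userId: 'g37iohg39h3', tier: 'Bronze' }]
);
```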

Summary

In this chapter you learned about the process used to integrate different datasets into Google Analytics. The most important business case for such integration is the power to understand how your users behave across multiple devices and platforms. In order to create a robust cross-device measurement, you learned how to perform the following steps:

  1. Create a User ID View.
  2. Set the User ID.
  3. Store the User ID.

Following those steps, you also learned how to import additional information into Google Analytics, such as CRM data, using the Measurement Protocol. This empowers you to report user behavior and website performance across devices, which can also include virtually any data you choose to send.

The following additional reports are available to you after you've completed the integration discussed in this chapter:

  • The User ID View reports (Device Overlap, Device Paths, and Acquisition Device) show how users move between platforms where you have a presence and provide a holistic view of true audience behavior.
  • User ID Coverage in the standard View provides an overall picture of what percentage of your users were assigned a User ID and the difference in behavior between “Assigned” and “Unassigned.”
  • Custom reporting enables you to bring in additional information from any dataset, as long as you can connect to the Internet and send data through requests to the Google Analytics servers.