This section presents examples of potential real-life scenarios in which Watson Analytics might be used to provide actionable results and insights. As you read through these use cases, you should gain a better understanding of how IBM Watson Analytics can be used.
Our first example is somewhat traditional in that we are exploring sales data and creating visualizations that will, hopefully, provide some insights.
The owner or manager of a stadium where an NFL team plays its home games is trying to understand how its on-site merchandise sales are doing. The relevant data has been collected; it includes stadium sales by product for the last 13 home games. Let's use Watson Analytics to see what the data may tell us.
From the Welcome screen, click on Add and then click on Drop file or browse. From there, I browse to my file (2015_StadiumSales.csv) and select it. IBM Watson Analytics adds the file to our environment:
If we click on our file, we are ready to ask a question (or explore the starting points that Watson has already prepared for us).
Rather than start with our own question, we can scroll through Watson Analytics' starting points by clicking on the > icon. I've selected an interesting one—What is the contribution of Quantity over Week by Stand Location? (Check out the following screenshot):
After we have clicked on this starting point, Watson Analytics shows us a visualization answering that question, as follows:
Watson Analytics has selected a recommended visualization (an area chart), but you can change it by clicking on the Visualization type's icon in the bottom-right corner, as shown in this screenshot:
Watson Analytics also gives us a number of insights about our data across the top of the page, like this:
You can scroll through to explore them, click to highlight, or, if you like, send to a new page. I'm interested in the weather's effect on the sales, so I click on How do the values of Quantity compare by Weather:
It seems like the fans buy more when it's sunny.
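To make the comparison concrete, here is a minimal Python sketch of the aggregation behind a "Quantity by Weather" question. It is not how Watson Analytics computes its chart; the column names (Week, Weather, Quantity) and the sample rows are assumptions made up for illustration, not the contents of 2015_StadiumSales.csv.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows; the column names are assumptions about the file layout.
SAMPLE_CSV = """Week,Weather,Quantity
1,Sunny,120
1,Rain,45
2,Sunny,150
2,Cloudy,80
3,Rain,60
"""


def total_quantity_by(rows, column):
    """Sum the Quantity field for each distinct value of `column`."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[column]] += int(row["Quantity"])
    return dict(totals)


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
quantity_by_weather = total_quantity_by(rows, "Weather")
print(quantity_by_weather)
```

Passing "Week" instead of "Weather" to the same function gives the totals behind a "Quantity by Week" comparison.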
Upon selecting another visualization, How do the values of Quantity compare by Week?, we get this:
Notice that across the bottom, Watson Analytics displays our list of columns (fields) from our file. If we want to, we can select a column (let's try Payment) and drop it onto our visualization. Watson Analytics instantly resets our visualization to include Payment, as follows:
Let's try adding a filter. You can click on the filter icon (in the bottom-left corner of the page) and then click on Add a filter:
From there, select the Weather column and then check Sunny. Then click on Done, as shown here:
Again, Watson Analytics resets the visualization (with the filter applied), like this:
But enough of Watson Analytics' starting points. Let's create some questions of our own. Let's click on the > (next page) icon to the right of the current visualization:
IBM Watson Analytics will ask us, What do you want to explore next? (Check out the next screenshot):
So let's ask some questions.
I am convinced that the weather is affecting my sales. I have seen that when it's sunny, fans purchase more merchandise. Let's go further with that idea. Does the weather affect the type of product purchased? I type my question, what is the comparison of quantity by weather over product? With that, Watson Analytics rephrases my question a bit and then provides a visualization:
And when we click on the visualization, we get this:
Let's try once more. You can click on the + New icon to open a new page:
How about we try something trending? If I type What is the average quantity by week by stand location, Watson Analytics again rephrases my question and runs another visualization for me:
And now, let's look at the visualization!
This gives us a pretty good idea about the average sales per week by stand location. Next is the same visualization as an area chart, filtered to show a single stand location (North) and with Payment type included as an additional column:
As you can see, there are still plenty of options to explore with this data, but for now, we'll move on to our next example use case.
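For readers who want to see what the "average quantity by week by stand location" question amounts to, the following is a rough sketch in plain Python. The field names and the sample records are hypothetical, and the function only illustrates the grouping; it is not Watson Analytics' implementation.

```python
from collections import defaultdict
from statistics import mean

# Made-up sample records; the field names are assumed for illustration.
sales = [
    {"Week": 1, "Stand Location": "North", "Quantity": 10},
    {"Week": 1, "Stand Location": "North", "Quantity": 20},
    {"Week": 1, "Stand Location": "South", "Quantity": 30},
    {"Week": 2, "Stand Location": "North", "Quantity": 40},
]


def average_quantity(records, keys):
    """Average Quantity for each combination of the given key fields."""
    groups = defaultdict(list)
    for record in records:
        groups[tuple(record[k] for k in keys)].append(record["Quantity"])
    return {group: mean(values) for group, values in groups.items()}


avg = average_quantity(sales, ["Week", "Stand Location"])

# Equivalent of filtering the chart to a single stand location.
north_only = {group: value for group, value in avg.items() if group[1] == "North"}
print(north_only)
```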
Another use case that we can consider to further explore Watson Analytics is of a gaming company that deploys a variety of slot machines in a multitude of ways. The company has a file containing the actual results by machine over a period of time, categorized by months and the days of the week. In the file are the particulars of each machine (type, theme, number of years the machine has been in service, and so on). Without diving into too much information about the file, let's do some exploration to see what insights Watson Analytics can give us about the gaming industry.
Using the same method as we did earlier (from the welcome page, click on Add and then on Upload data), we can add our slots' results (CSV) file to the Watson Analytics environment:
Before we move on, notice that this file is labeled HIGH QUALITY. What does this mean? It means that Watson Analytics does a better job of providing predictions and explorations if the quality of your data is high. The lower the quality of your data, the lower the accuracy of the analyses in your explorations and predictions (this is true for any tool).
In addition, Watson Analytics scores your data's quality (that is, the number to the left of HIGH QUALITY). The lower your data's score, the higher the number of outliers, missing values, and so on found in your data.
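If you want a rough idea of your data's condition before uploading it, you can count missing values and outliers yourself. The sketch below is only an approximation under stated assumptions: it uses the common 1.5 x IQR rule for outliers, which is my choice and not the formula Watson Analytics uses for its quality score, and the coin_in values are made up.

```python
from statistics import quantiles


def column_quality(values):
    """Count missing entries and IQR-based outliers in one numeric column."""
    present = [float(v) for v in values if v not in ("", None)]
    missing = len(values) - len(present)
    # Quartiles of the non-missing values (Python 3.8+).
    q1, _, q3 = quantiles(present, n=4)
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    outliers = sum(1 for v in present if v < low or v > high)
    return {"missing": missing, "outliers": outliers}


# Made-up column with one blank cell and one suspiciously large value.
coin_in = ["100", "110", "105", "", "95", "102", "98", "1000"]
report = column_quality(coin_in)
print(report)
```

A column with many missing entries or outliers is a good candidate for cleanup before you import the file.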
Always try to improve the quality of your data before you import it by:
To refine your data, click on the Refine icon (shown here):
You can utilize Watson Analytics to improve the quality of your data. Once you upload your file, rather than jumping into the explorations, you can click on Or shape data - Refine (shown in the preceding screenshot):
When you click on Refine, IBM Watson Analytics displays your data on the Refinement page (shown in the preceding screenshot). From there, you can review your data and, if needed, adjust it to your needs.
There are different ways by which you can refine (and perhaps better prepare) your data. You can:
So let's experiment with some of these Refine or Preparation actions. To begin, in the top-left corner, you can click on the Actions icon, as shown here:
Clicking here opens the Watson Analytics action center (again, shown in the next screenshot), where you can see your columns and rows listed, along with various other icons:
Here is a list of various actions:
To change a field name, you must first click on Change Name and then type your new field name. To filter by the selected field, click on the Select value icon. Changing field names becomes particularly important if your field headings are non-descriptive, as they can be dependent on the source. Using non-descriptive field names lessens the impact of the visualizations created by Watson Analytics, since it's more difficult to relate to a query such as how does the number of rows compare to 123 than something more readable, such as how does the number of rows compare to Location.
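You can also give fields descriptive names before the file ever reaches Watson Analytics. Here is a minimal sketch of that idea; the source headings ("123", "F2") and their replacements are hypothetical, and the sample row is made up.

```python
# Hypothetical mapping from non-descriptive source headings to readable names.
RENAMES = {"123": "Location", "F2": "Coin-in"}


def rename_fields(rows, renames):
    """Return copies of the rows with headings replaced where a mapping exists."""
    return [
        {renames.get(name, name): value for name, value in row.items()}
        for row in rows
    ]


rows = [{"123": "Floor A", "F2": 2500, "Month": "Jan"}]
renamed = rename_fields(rows, RENAMES)
print(renamed)
```

Headings without an entry in the mapping (Month, in this sample) are left unchanged.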
If you click on the Data Metrics icon (circled in the following screenshot) in the top-left corner of the refine page, Watson Analytics provides real-time metrics on the fields in your data:
Metric information provided for each column includes:
Additional refinement actions include the following:
Let's get back to exploring our slot machine. As already covered, IBM Watson Analytics reads your data and, based on its contents, provides a variety of questions (or starting points). In our slots example, Watson Analytics provides questions such as What is the trend of Coin-in over Month by Location.
In this example, the words Coin-in, Month, and Location are in bold. These indicate field names (columns) within our data. In addition, the word trend is used. Trend is an IBM Watson Analytics visualization keyword.
Keywords are used by Watson Analytics to format a visualization, and each has a different influence on how your data is retrieved and how the resulting visualization is created. IBM Watson Analytics provides the following keywords: compare, trend, contribution, correlation, relationship, breakdown, grouping, where, when, how long, average, total, maximum, minimum, top, bottom, best, worst, highest, lowest, most, least, rows, how many, and count.
Creating questions can be simple or challenging based on your needs. For example, if I were thinking in SQL terms, I might want to create a group by query to get the number of rows by location. In Watson Analytics, if I ask, What is the number of rows by location, Watson Analytics reformats my query as follows:
How does the number of rows compare by location?
Then it provides the following visualization:
You can see the somewhat different mindset for formatting the query. Something else that's fun is Watson Analytics' ability to think forward as you start typing a question. For example, I typed what is the average and paused. Watson Analytics read my (partial) query and made this suggestion (as well as several others):
What is the trend of Coin-in over Month by Weather? The presentation given for this suggestion is as follows:
Remember that from the visualization, you can delete highlighted keywords or change them simply by clicking on them. The following is what I got when I clicked on the keyword Weather and selected Type from the field list:
Another interesting question made using our slots data might be what slot theme generated the highest coin-in value? Here is a visualization for this question:
From the preceding data, it looks as if slot machines that offer a movie or food theme may be the most profitable. You can see that plenty of questions or starting points can be visualized based on the use of Watson Analytics' keywords. It's recommended that you take your time and experiment with each of the keywords to get a feel for how they can be used and what results to expect.
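If you do think in SQL terms, the two questions asked in this use case (the number of rows by location, and the theme with the highest coin-in) map onto ordinary GROUP BY queries. The sketch below uses Python's built-in sqlite3 module with an in-memory table; the table name, column names, and rows are all made up for illustration and do not come from the slots file.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE slots (location TEXT, theme TEXT, coin_in REAL)")
conn.executemany(
    "INSERT INTO slots VALUES (?, ?, ?)",
    [
        # Made-up sample rows.
        ("Lobby", "Movie", 500.0),
        ("Lobby", "Food", 300.0),
        ("Bar", "Movie", 450.0),
        ("Bar", "Sports", 200.0),
        ("Lobby", "Sports", 100.0),
    ],
)

# "How does the number of rows compare by location?"
rows_by_location = conn.execute(
    "SELECT location, COUNT(*) FROM slots GROUP BY location ORDER BY location"
).fetchall()

# "What slot theme generated the highest coin-in value?"
top_theme = conn.execute(
    "SELECT theme, SUM(coin_in) AS total FROM slots "
    "GROUP BY theme ORDER BY total DESC LIMIT 1"
).fetchone()
conn.close()

print(rows_by_location, top_theme)
```

The keywords play a role similar to the aggregate functions here: rows corresponds to COUNT(*), and highest corresponds to ordering by the total and keeping the first result.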
Let's move on to one more use case example before we wrap up this chapter.
Another use case might involve a file of criminal offenses or crimes reported over a period of time within a particular city's limits. The file lists a description of the crime, the date the crime took place, a city district, beat, city grid, a universal NIC code, as well as GPS information (latitude and longitude).
Here is a section of the file:
It might be interesting to use Watson Analytics to do some exploration of this information!
After we've added it to the Watson Analytics environment (in the previously described method of browse and upload), let's see how we can visualize what type of crime wave we may have on our hands.
First, I want to know the total number of crimes reported (the number of rows or records in the file would indicate this), so I ask Watson Analytics:
What is the value of the number of Rows?
This is what I get:
Notice that if you move your mouse over the visualization, you see an exact row count, that is, 7584.
Now I want to add more content to my visualization, so I click on Columns in the top-right corner of the page, as shown in the following screenshot, and then click on Add a column:
From there, I select the crimedescr field column (highlighted in the next image). Then I click on Done:
Now Watson Analytics adds the breakdown by (crime) description:
This is a bit more interesting, but let's see a breakdown of only those crimes considered petty. To do that, click on Columns again and then click on crimedescr. You'll see that everything is selected—meaning all crimes are included in the visualization:
Let's click on Set a condition and then type the word petty under Contains, as shown here:
Finally, let's click on Apply. IBM Watson Analytics now rebuilds our visualization, showing only crimes with the word petty in their description:
Keep in mind that sometimes Watson Analytics needs help. For example, Watson Analytics provided us with the following visualization: How do the values of grid compare by beat?
It's a nice-looking visualization, but the problem is that it is meaningless, since Watson Analytics is interpreting the value of the grid field as an amount or total when I know that it's really a numeric identifier (of a unique grid).
A more reasonable visualization may be generated (using this visualization as a starting point) if we click on the word grid and select a different field (perhaps Rows), as shown in this screenshot:
Now, Watson Analytics shows us a visualization of Rows by beat (meaning the number of crimes occurring in each city beat):
The key is to use visualizations as starting points for the development of insights into your data—question everything!
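The grid example is easy to reproduce outside the tool. The following sketch, written under assumed field values (the records are made up; only the field names crimedescr, district, beat, and grid come from the file described above), treats grid as a label rather than a number and counts rows per beat instead. It also applies the same "contains petty" condition used earlier.

```python
from collections import Counter

# Made-up sample records for illustration only.
crimes = [
    {"crimedescr": "PETTY THEFT - SHOPLIFTING", "district": 3, "beat": "3C", "grid": 1015},
    {"crimedescr": "BURGLARY", "district": 3, "beat": "3C", "grid": 1020},
    {"crimedescr": "PETTY THEFT", "district": 5, "beat": "5A", "grid": 1500},
    {"crimedescr": "VANDALISM", "district": 5, "beat": "5A", "grid": 1500},
    {"crimedescr": "ROBBERY", "district": 5, "beat": "5B", "grid": 1610},
]

# Summing grid would add up identifiers, which means nothing.
# Counting rows per beat gives the number of crimes in each beat.
crimes_per_beat = Counter(crime["beat"] for crime in crimes)

# Keep only crimes whose description contains the word "petty".
petty_by_description = Counter(
    crime["crimedescr"]
    for crime in crimes
    if "petty" in crime["crimedescr"].lower()
)

print(crimes_per_beat)
print(petty_by_description)
```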
Now might be a good time to introduce a nice feature of Watson Analytics—sharing.
Let's say your continued exploration of the crime file has yielded the visualization shown in the next screenshot—the total number of crimes by grid: How do the values of grid compare by district? Excited about what you see, you want to share with others how the districts are doing:
In the top-left corner of the Watson Analytics page, notice the share icon (shown circled here):
Clicking on the share icon opens the Share dialog:
From this dialog, you can choose your method of communication: Email (based on your Watson Analytics account's e-mail), Download (as an image, PDF, or PowerPoint presentation), through Social Media (Twitter, Facebook, or LinkedIn) or as a Link (which you can forward to other Watson Analytics users). This is an easy way to share your insights or solicit additional input from your associates.
Of course, once you have found something you want to keep, you should save it.
Typically, there are two options: Save and Save As. Watson Analytics supports both. In this screenshot, you can see (circled and from left to right) the Save As icon and the Save icon:
Predictions (discussed in a later chapter of this book) are automatically saved when you create or change them. Explorations, views, and refined datasets (for all of these, we have shown many examples) must be manually saved by using Save or Save As.
Once you've begun saving your explorations, you can reopen them by clicking on the Collection icon, as shown in the following screenshot, and selecting from the list that appears: