What to ask of your data

The insights you derive through any learning process depend on the questions you ask of your data and on the context of that data within your particular use case. For example, suppose a use case involves the sales of team merchandise at the gates of a National Football League (NFL) stadium, and you are given data showing weekly sales from the last few seasons' home games. What insights can this data provide for the team?

Rather than starting by sorting, filtering, and pivoting the data, perhaps with a programming language such as Perl or Python, wouldn't it be better to ask questions of the data in the language and keywords of your business, and have those questions explored and visualized as answers?

IBM Watson Analytics does just this, and it even uses your questions to generate a list of starting points, each of which opens a specific visualization.

The Watson Analytics interface offers three ways to start questioning your data. You can do any of these:

  • From the welcome or main page, click on Explore, select a dataset, and enter a question
  • From the welcome page, click on Add, then click on Exploration, select a dataset, and enter a question
  • In Explore, click on New and enter a question

Building questions

To build a question to ask of your data in Watson Analytics, you combine keywords, names of columns (or fields) in your data, and data values:

  • Keywords: These are used to format the visualization. Keywords are typically placed near the beginning of the question and are chosen from a list provided by Watson Analytics.
  • Names of columns: These are also referred to as column titles. You can use one, two, or three column titles in a question. You can check your data file or use the Watson Analytics data tray to quickly view the available column titles and data values in your dataset.
  • Data values: These are actual values from your dataset that focus the question on a specific piece of information, for example, a specific product name from a Product column or a specific year from a Year column. Data values are typically placed at the end of the question.

So for instance, in our stadium example, we can start by asking the question: what is the breakdown of sales by gate number for the team hat?

In the preceding question, notice that I have used the keyword breakdown, that the fields in my file (column titles) are sales and gate number, and that the data value I'm interested in is a particular product: team hat.
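To see what this question amounts to when done by hand, here is a minimal Python sketch of the filter-and-aggregate work it stands for. The sample rows, the column names (Product, Gate Number, Sales), and the breakdown helper are all invented for illustration; they are not taken from a real dataset or from Watson Analytics itself.

```python
from collections import defaultdict

# Hypothetical sample rows; column names and values are invented for illustration.
sales_rows = [
    {"Product": "team hat", "Gate Number": "A", "Sales": 120.0},
    {"Product": "team hat", "Gate Number": "B", "Sales": 80.0},
    {"Product": "team jersey", "Gate Number": "A", "Sales": 300.0},
    {"Product": "team hat", "Gate Number": "A", "Sales": 45.5},
]


def breakdown(rows, measure, by, where_column, where_value):
    """Sum `measure` per distinct value of `by`, keeping only the rows
    in which `where_column` equals `where_value`."""
    totals = defaultdict(float)
    for row in rows:
        if row[where_column] == where_value:
            totals[row[by]] += row[measure]
    return dict(totals)


# "What is the breakdown of sales by gate number for the team hat?"
hat_sales_by_gate = breakdown(
    sales_rows, measure="Sales", by="Gate Number",
    where_column="Product", where_value="team hat",
)
print(hat_sales_by_gate)
```

The single natural-language question replaces the filter on the product, the grouping by gate, and the summing of sales that you would otherwise code yourself.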

IBM Watson Analytics processes your question in the following way:

  1. Watson Analytics matches the words in your question to the column titles in your dataset.
  2. The remaining words in your question are matched to the actual data values in your dataset.
  3. Keywords are used to select and format the visualization.
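The three steps above can be pictured with a toy Python sketch. This is only an illustration of the matching order described here, not Watson Analytics' actual implementation: the keyword list, column titles, data values, and function names are all assumptions made up for the example.

```python
import re

# Hypothetical keyword list and dataset metadata (assumptions for illustration;
# Watson Analytics supplies its own keyword list).
KEYWORDS = {"breakdown", "trend", "compare"}
COLUMNS = ["Sales", "Gate Number", "Product"]
DATA_VALUES = {"Product": ["team hat", "team jersey"]}


def normalize(text):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9 ]", " ", text.lower()).split())


def take_phrase(text, phrase):
    """Remove the first whole-word occurrence of `phrase` from `text`.
    Returns (found, remaining_text)."""
    pattern = r"\b" + re.escape(normalize(phrase)) + r"\b"
    new_text, count = re.subn(pattern, " ", text, count=1)
    return count > 0, " ".join(new_text.split())


def parse_question(question):
    remaining = normalize(question)

    # Step 1: match words in the question to column titles.
    columns = []
    for column in COLUMNS:
        found, remaining = take_phrase(remaining, column)
        if found:
            columns.append(column)

    # Step 2: match the remaining words to actual data values.
    values = []
    for column, candidates in DATA_VALUES.items():
        for value in candidates:
            found, remaining = take_phrase(remaining, value)
            if found:
                values.append((column, value))

    # Step 3: any recognized keywords pick the visualization.
    keywords = [word for word in remaining.split() if word in KEYWORDS]
    return {"keywords": keywords, "columns": columns, "values": values}


parsed = parse_question(
    "What is the breakdown of sales by gate number for the team hat?"
)
print(parsed)
```

For the stadium question, this sketch identifies sales and gate number as columns, team hat as a value of the Product column, and breakdown as the keyword.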

We'll provide more details on building questions in the use case example section of this chapter.
