Using smart data load profiling

As you know from the earlier chapters on accessing data, Qlik Sense makes associations between tables using similar field names. As of Qlik Sense Version 2.0, there is a data profiling tool that can be used to help you make the correct table associations.

Getting ready

  1. Create a folder on your local drive called TestData. For this example, we will use the C: drive: C:TestData.
  2. Create a folder library connection to the directory above in the data load editor.
  3. Load the following script into the application:
    Transactions:
    LOAD DATE(Date#( TransactionDate,'DD/MM/YYYY')) as TransactionDate ,Sales INLINE [
        TransactionDate, Sales
        01/01/2013, 1000
        02/01/2013, 3000
        03/01/2013, 500
        04/01/2013, 4000
        05/01/2013, 2000
    ];
    
    Calendar:
    LOAD DATE(Date#( TransactionDate,'DD/MM/YYYY')) as Date, Month,Year INLINE [
        Date, Month, Year
        01/01/2013, Jan, 2013
        02/01/2013, Jan, 2013
        03/01/2013, Jan, 2013
        04/01/2013, Jan, 2013
        05/01/2013, Jan, 2013
    ];
    
    
    STORE Transactions INTO [lib://TestData/Transactions.txt](txt);
    Drop Table Transactions;
    
    STORE Calendar INTO [lib://TestData/Calendar.txt](txt);
    Drop Table Calendar;

How to do it…

  1. Click the Navigation dropdown button on the top-left and select the Data manager, as shown here:
    How to do it…
  2. Select How to do it… from the menu bar on the left-hand side.
  3. Click on the TestData library connection we established earlier.
  4. Select the Transaction.txt file as shown here, and click on the next arrow at the bottom of the page:
    How to do it…
  5. You will see a preview of the data to be loaded. Click on Load and finish at the bottom of the page.
  6. Once loaded, you will return to the data manager and the transactions table will be listed on the left. Let's load the second table. Repeat steps 2, 3, and 4 only. Click on the Calendar.txt data this time.
  7. Instead of clicking on the Load and finish button, click on the How to do it… button.
  8. Click on the How to do it… option.
  9. Under the Rename fields option, click on the button labeled TransactionDate.
    How to do it…
  10. Click on Load and finish in the bottom-right hand corner.
  11. Close the execution window and click on Save.
  12. If you open the Data model viewer from the main hub menu now, the two tables should now be joined, as shown here:
    How to do it…

How it works…

The script we loaded at the beginning of the recipe simply generates the data we use in the data load editor. The profiler looks at these data files to make recommendations on fields you should use to join the two datasets together. In this example, there is only one suggestion made, which is for the TransactionDate field and the Date field.

If there are more, you can step through the various suggestions using the arrows in the load editor page. This is shown in the following screenshot:

How it works…

Our example identified that there is a 100 percent match on the values contained between both of our data sets but the name of those fields are different. As such, one or both of the fields need to be renamed in order to associate the tables. The fields were renamed automatically when we clicked on the button labeled TransactionDate. We could have entered a new name for both fields by typing in the Rename both fields to box.

There are several other warnings the data profiler will make you aware of, depending on the data you load. One example is if there were fields with the same name but data that didn't match. Another example is if there are multiple possible connections you could make; in this example, the profiler will recommend you to either keep one of the fields as a key then rename the others, or break the table association.

There's More….

The Data manager allows you to bring in data and make associations from a number of sources without ever seeing a line of code. The code is still present, but it is generated automatically and saved in a system generated script tab. After completing this recipe, open the data editor to see the new tab, as show in the following image:

There's More….

It is also worth making clear that the data profiler works without finding 100 percent matches on shared field values between tables. To test this, repeat the recipe with the following line of code added to the end of the transactions table in the Getting ready section of the recipe:

6, 06/01/2013, 100

The profiler works in exactly the same way as recommending the correct field link, but now it will inform you there is a 91% match between the values contained in both tables:

There's More….
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset