Using a For Each loop to load data from multiple files

Often in a Qlik Sense application we need to load data from a directory which contains an identical set of data files. For example; sales for each country come in different files for each month. In such a case, we use a wildcard load, in order to fetch the data for our application. The following recipe discusses the data modeling issues encountered when using the wildcard load and how we make use of the For each loop structure in the script to overcome this issue.

Getting ready

For this exercise we will make use of two sample XLSX files, namely, Apr2015.xlsx and May2015.xlsx that contain mock sales data for six countries. These files can be downloaded from the Packt Publishing website.

How to do it…

  1. Once the source files are downloaded, store them in a folder called ForEachLoadData.
  2. Create a folder connection as explained in Chapter 1, Getting Started with the Data that points to the ForEachLoadData folder. Name the connection as QlikSenseCookBookForEachLoadData.
  3. Select any file from the folder and extract its contents in the Qlik Sense application.
  4. Next modify the script as the following example, to get the data from all files that reside in the ForEachLoadData folder. Note that we are using a wildcard * in place of the filename in the from statement. The Filebasename() function gets the filename so that we can identify the origin of the data:
    CountrySales:
      Load
      Filebasename () AS Source,Country, Sales
    FROM [lib://QlikSenseCookBookForEachLoadData/*.xlsx]
    (ooxml, embedded labels, table is Sheet1);
  5. Add the preceding load to the script. (the preceding Load is placed directly above the Load statement of the CountrySales table.):
      LOAD*,
        Left(Source,3) as Month;
  6. Upon loading, we observe that a synthetic key has been created in the data model.
  7. In order to avoid the synthetic key, we will make use of the For each loop along with the wildcard load.
  8. Modify the block of code to start with a For Each loop statement and end with a Next, as shown in the following code.
    For each vFile in FileList ('lib://QlikSenseCookBookForEachLoadData/*.xlsx')
       CountrySales:
       LOAD *,Left(Source,3) as Month;
    
       Load 
       Country,
       Sales,
       Filebasename() as Source
       from [$(vFile)]
       (ooxml, embedded labels, table is Sheet1);
    Next vFile
  9. Once the script is in place, save and reload the application again.
  10. We observe that all the files from the folder have been reloaded properly and there is no synthetic key in the data model.

How it works…

The * wildcard character loads all the files from the ForEachLoadData folder into the Qlik Sense application. When we use a preceding load statement to generate the Month field, the load is only applied to the first file loaded from the folder; hence, the Month field is created only for the first file. This is the reason why a synthetic key is created between the two tables.

When we use the For loop, every file is sequentially loaded from the source folder and then a preceding load is applied; thus, creating a month field in each created table. The two tables are then auto-concatenated, as they contain the same number of fields with the same name. As a result, a synthetic key is avoided and we get a clean data model.

There's more…

We used iteration or the For Each loop in the above recipe outside the Load statement. We can also have iterations inside the Load statement using the Where clause or the Subfield function. Iterations are also possible using the Peek() function. A useful article from Henric Cronstrom on Iterations can be accessed using the following URL:

https://community.qlik.com/blogs/qlikviewdesignblog/2013/09/02/loops-in-the-script

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset