Dimensional modeling

The dimensional modeling technique uses facts and dimensions to build the data model. This modeling technique was developed by Ralf Kimball. Unlike ER modeling, which uses normalization to build the model, this technique uses the denormalization of data to build the model.

Facts, in this context, are tables that store the most granular transactional details. They mainly store the performance measurement metrics, which are the outcome of the business process. Fact tables are huge in size, because they store the transactional records.

For example, let's say that sales data is captured at a retail store. The fact table for such data would look like the following:

A fact table has the following characteristics:

  • It contains the measures, which are mostly numeric in nature
  • It stores the foreign key, which refers to the dimension tables
  • It stores large numbers of records
  • Mostly, it does not contain descriptive data

The dimension table stores the descriptive data, describing the who, what, which, when, how, where, and why associated with the transaction. It has the maximum number of columns, but the records are generally fewer than fact tables. Dimension tables are also referred to as companions of the fact table. They store textual, and sometimes numerical, values. For example, a PIN code is numeric in nature, but they are not the measures and thus they get stored in the dimension table.

In the previous sales example that we discussed, the customer, product, time, and salesperson are the dimension tables. The following diagram shows a sample dimension table:

The following are the characteristics of the dimension table:

  • It stores descriptive data, which describes the attributes of the transaction
  • It contains many columns and fewer records compared to the fact table
  • It also contains numeric data, which is descriptive in nature

There are two types of dimensional modeling techniques that are widely used:

  • Star schema: This schema model has one fact table that is linked with multiple dimension tables. The name star is given because once the model is ready, it looks like a star.

The advantages of the star schema model include the following:

    • Better query performance
    • Simple to understand

The following diagram shows an example of the star schema model:

  • Snowflake schema: This schema model is similar to the star schema, but in this model, the dimensional tables are normalized further.

The advantages of the snowflake schema model include the following:

    • It provides better referential integrity
    • It requires less space as data is normalized

The following diagram shows an example of the snowflake schema model:

When it comes to data modeling in Qlik Sense, the best option is to use the star schema model for better performance. Qlik Sense works very well when the data is loaded in a denormalized form, thus the star schema is suitable for Qlik Sense development. The following diagram shows the performance impact of different data models on Qlik Sense:

Now that we know what data modeling is and which technique is most appropriate for Qlik Sense data modeling, let's look at some other fundamentals of handling data.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset