Creating a date dimension

In this recipe, we guide you to start creating a Mondrian 3.x schema with the Schema Workbench, using the Thin Kettle JDBC Driver as the connection. We'll first create a shared dimension, date. A shared dimension can be referenced in different cubes. However, in this particular case, it is not necessary to have a shared dimension because we'll have just one cube.

Getting ready

Before you start this recipe, you need to make sure that the MongoDB server is running with the databases created in the previous chapters, and that the Data Integration server is running.

How to do it…

Proceed with the following steps:

  1. Open the Schema Workbench application. On Windows, you can find it in the Start menu. From there, go to Pentaho Enterprise Edition | Design Tools | Schema Workbench. On Linux, you need to run a command like this:
    sh <pentaho-installation-path>/design-tools/schema-workbench/workbench.sh
  2. With the Schema Workbench opened, let's configure the database connection. In this case, it will be the Data Integration server:
    1. In the main menu, select Options and then Connection....
    2. In the Database Connection popup, you have to define a connection name as Pentaho MongoDB Cookbook. Select the Kettle thin JDBC driver option for Connection Type. The Host Name parameter is localhost, Database Name is kettle?webappname=pentaho-di, Port Number is 9080, User Name is admin, and Password is password. Your setup should look similar to what is shown in this screenshot:
    How to do it…

Click on the Test button and you should get a success message box. Then click on the OK button.

Let's start creating the OLAP schema properly:

  1. In the main menu, go to File | New | Schema, and you should get a subwindow for creating the new schema, as you can see here:
    How to do it…
  2. Select the Schema object and set Orders as the field name.
  3. Right-click on the Schema object and select Add Dimension, as you can see in the following screenshot:
    How to do it…
  4. Add a table to the hierarchy by right-clicking and selecting Add Table, as you can see in this screenshot:
    How to do it…
  5. After selecting the table object added, select the Kettle->date option for the name field, as shown here:
    How to do it…
  6. In the default hierarchy (New Hierarchy 0), right-click and select the Add Level option, as you can see in this screenshot:
    How to do it…
  7. Let's define the year for this new level. Having selected the level object, set the name field to year. Select the option year from the column field and String in the type field. In the levelType field, select the TimeYears option, and in hideMemberIf, select the Never option. Finally, for the caption field, set Year. You can see all of these in the following screenshot:
    How to do it…
  8. Add a new level and define the month, as shown here:
    How to do it…
  9. Next, add a new level and define the day, as you can see in the following screenshot:
    How to do it…
  10. Select the hierarchy object, remove the default name (New Hierarchy 0), and select date for the primaryKey field.
  11. Next, select the dimension object, and for the name field, set date. In the type field, select TimeDimension, as you can see in the next screenshot. Finally, in the caption field, set Date.
    How to do it…
  12. Select other objects that you want and you shouldn't see any more red alerts. Our date dimension is defined.

How it works…

We started with the Schema Workbench. It is an application that helps create a Mondrian schema. First, we created a database connection, in this case using the Kettle Thin JDBC Driver. This connection will help us during schema creation with red alert messages at the bottom of the screen.

Then, we created a simple date dimension in the schema with one hierarchy that contains three levels of granularity. With these three levels, when exploring the cube, we can aggregate our measures by year, month, or day. Basically, with some plugins, such as Analysis Report or Saiku Analytics, it's possible to explore—using a good drag-and-drop interface—the future Orders cube while aggregating the data by year, month, or day. In this way, you will get answers for questions such as "What is the total price for each year?" or "What is the quantity of products ordered in January 2013?".

There's more…

Here are some definitions for the main keywords related to the Mondrian schema:

  • Schema: This defines a multidimensional logical model consisting of cubes, hierarchies, and members that help map the model onto the physical model
  • Cube: This is a data structure that allows fast analysis of data according to the multiple dimensions that define a business problem
  • Dimension: This is a set of hierarchies that provide information to otherwise unordered numeric measures
  • Hierarchy: This is a logical tree structure that defines parent-child relationships in a dimension
  • Level: This is a collection of members that have the same distance from the parent of the hierarchy
  • Member: This is a point within a dimension determined by a particular set of attribute values
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset