Cube model dimensional concepts (1/2)

Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

112 Solving Operational Business Intelligence with InfoSphere Warehouse Advanced Edition

multidimensional expressions (MDXs) as the query language. However, the data

is not copied or persisted in another storage mechanism, but remains in the DB2

relational database.

Cubing Services thus is what we refer to as a “multidimensional OLAP hot

cache.” That is, InfoSphere Warehouse Cubing Services loads data into a

multidimensional structure in memory.

That partial MOLAP structure can then deliver fast response times for the data

that is in memory cache. If a user of a query tool requests data that is not in the

cache, the Cubing Services engine generates queries to the relational database

to retrieve the data and builds a partial in-memory cube-like structure for only

that data. Although the first request for that data sees a longer response time

while the in-memory structures are being built, subsequent requests for that data

are satisfied from the in-memory structures and experience MOLAP-style

response times while that data is maintained in memory.

Using Cubing Services as the OLAP engine results in a MOLAP-style response

in a large percentage of queries without the need to copy the data into another

storage mechanism. To improve the relational access, there is an optimization

wizard that recommends the appropriate MQTs that can be implemented to

improve the response of aggregated data when it is needed from the relational

database.

4.4.1 Cube model dimensional concepts

Before creating any cube models, it is important o understand the components

that make up such a model, and to understand their association to the fact and

dimensional tables that make up the dimensional model based on a star schema.

Cube models have four main core ingredients, which themselves have a number

of additional elements. These four ingredients are listed here.

???? Fact: derived from the fact table within the dimensional model, and which

contain:

– Measures: derived from the columns or calculated.

???? Dimension: derived from the dimension table within the dimensional model,

and which contain:

– Hierarchies: aggregates of data.

– Levels: the granularity of the data with uniquely identifiable attributes.

– Attributes: identifiable entities.

???? Joins: derived from the relationship between tables:

– Joins between fact and dimension tables, in a star schema.

Chapter 4. Data modeling: End to end 113

– Joins between multiple dimension tables, in a snow flake schema.

???? Cube: an instance of parts or a complete cube model implemented as a cube

service.

Figure 4-16 illustrates the components of a cube model and their relationship to a

the tables in a warehouse database.

Figure 4-16 Relationship between cube model and tables in a relational database

Fact element

In a dimensional model, the fact table holds the central position within the star

schema, with all the dimensional tables linked to this one table. In a cube model,

the

fact element maintains a similar central position, with the dimensional

elements connected, this time as a defined join. This means, within the definition

of the cube model, you have to define a set of SQL queries that join the fact table

to the individual dimensional tables within the relational database.

A fact object also contains a set of measures or metrics that best describes how

to aggregate data from the fact table, across dimensions. These measures or

metrics can be drawn directly from fact table columns, or derived through

Attribute

Relationship

Cube

Cube Dimension

Cube Hierarchy

Cube Facts

Cube Model

JoinMeasureMeasure

Facts

Cube

Metadata

Fact tables

Dimension tables

Relational tables in DB2

AttributeAttribute

Join

Hierarchy

Dimension

114 Solving Operational Business Intelligence with InfoSphere Warehouse Advanced Edition

calculations. It is these measures or metrics that form the basis of what is to be

analyzed within the cube model. As an example, they might be sales amounts,

counts of product failures, or the number of admitted patients.

Within the cube model, understand that measures in themselves only become

meaningful when used in the context of a set of dimensions. For instance, when

used on its own, a sale of 300 units is not meaningful. However, when combined

with dimensions for location and product line, the resulting measure takes on

more meaning, that is, “A clothing line in the eastern sales region sold 300 units.”

It is these measures or metrics in connection with the dimension objects, such as

time or location, that allow you to manage your business, analyze options, and

ask your BI questions.

The following types of fact measures can be implemented:

Calculated facts These are calculations that are based on existing

measures or column values in the relational database.

For example, the sales amount might be calculated from

the unit price of a item against the number of items sold.

These calculations can be performed by two methods

within a cube service in InfoSphere Warehouse: SQL and

MDX. For SQL calculations, the work is performed by

DB2 and SQL functions. For MDX calculations, the work

is performed within the Java application that is the cube

service and which is better suited to complex calculations.

Additive facts This refers to where a measure is rolled up across all

dimensions and levels using a consistent aggregation

function. An example of an additive fact is the sum of

sales or cost of goods sold. Cubing Services also support

average, count, standard deviation, max, min, and

variance as aggregation types.

Non-additive facts Non-additive facts cannot be rolled up through any of the

dimensions in the cube model, which is typically the case

for any measure calculations that result in ratios,

averages, or variance percentages, for example. Cubing

Services supports non-additive facts for calculated

measures by setting the aggregation function to None,

which results in that calculation being calculated after the

values are aggregated

Dimension element

In a star schema-based dimensional model, each dimension table contains a

specific set of related columns, biased towards a particular topic such as time,

location, or product range. Because a cube model is an abstraction of the star

Chapter 4. Data modeling: End to end 115

schema, it too has dimension elements. And these in turn are defined as a set of

related attributes that together describe one partition or aspect of the model,

along the lines of its metrics.

Similar to the dimensions within the star schema, these related dimensions

within a cube model organize the data in the fact object according to common

logical entities, such as by time, by location, by person. These dimensions are

then subdivided into additional components that are known as attributes, levels,

and hierarchies, as described here:

???? Hierarchies

Generally, for each dimension defined within the cube model, one or multiple

hierarchies are defined. These hierarchies describe the relationship and

structure of the referenced attributes, and provide a navigational and

computational way to traverse the dimension. Individual hierarchies are

divided into an ordered list of levels that represent the scope of dimension

granularity from the lowest to highest.

During the creation of the hierarchy, you must also specify the hierarchy type.

The following are hierarchy types that can be implemented within a

dimension:

– Balanced hierarchy

All the branches in a balanced hierarchy descend to the same level and

have the same corresponding number of levels. A good example of a

balanced hierarchy is time, where the meaning and depth of each level,

such as year, month, and week, are constant.

– Unbalanced hierarchy

In this context, although all the branches within an unbalanced hierarchy

have a parent-child relationship, they have an inconsistent number of

levels and might not all reach to the same depth. An example of this type

of hierarchy might be that of an organizational chart where a large

department has a sub-manager, but a smaller department does not.

– Ragged hierarchy

Similar to an unbalanced hierarchy, the branches of a ragged hierarchy

can descend to different levels. However, an hierarchy becomes ragged

when the parent of a member is not in the level immediately above that

member and thus, one branch does not have an entity at all the levels. An

good example might be a set of countries in which a larger country might

be divided into states, then a city level. However, a smaller country just

has a city level.

116 Solving Operational Business Intelligence with InfoSphere Warehouse Advanced Edition

– Network hierarchy

In this situation, the levels do not have an inherent parent-child

relationship, so their order is not important. But the network hierarchy still

has attributes such as color, size, or product type.

In a relational database, the different levels of a hierarchy can be stored in a

single table (as in a star schema) or in separate tables (as in a snowflake

schema).

A hierarchy can have an “All” level as the top-most level in the hierarchy,

which is a single member that represents the aggregation of all of the

members in the levels below in the hierarchy.

???? Attributes

These are the basic building blocks of the dimension element and are the

identifiable aspects in the model with which to describe, organize, group and

ask questions about entities with the cube model. In the simplest terms,

attributes represent the basic abstraction of a database column. They can be

expressed as a simple SQL expression that maps to a table column within a

dimensional table, or as a more complex expression that can combines

multiple columns within one or more tables.

Attributes often serve as headings for the columns of a report, so make them

descriptive and easy to understand.

???? Levels

Levels define the specific unique “slices” of each hierarchy or the specific

granular layers within the hierarchy for each dimension definition. If a

hierarchy contains more than one level, then these levels have a defined

order within hierarchy structure. As an example, the natural levels of a time

dimension are days, months, years. They are placed in a specific levels order

to form a Calendar Time hierarchy; that is, the highest level will be years,

followed by months, and then at the lowest level by days. Each member, a

specific day or month, of the level is uniquely identifiable. In the context of a

dimension table in a logical model or within the database, this property maps

to the key column or a group of columns that uniquely identifies every row in

the table.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for Cube model dimensional concepts (1/2)

Create new playlist

Sign In

Sign Up

Table of Contents for
Cube model dimensional concepts (1/2)