Using multi-temperatures features in a sample scenario (1/2)

Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

Chapter 9. Managing data lifecyle with InfoSphere Warehouse 339

???? DB2 Workload Manager

DB2 Workload Manager (DB2 WLM) can be configured and used in

conjunction with storage group data tags to manage the resources assigned

to workloads that access data in a multi-temperature database.

For example, by creating a tag on a storage group, DB2 WLM can be

configured to assign a workload to a certain service subclass depending on

the storage group accessed, and then effect the resources and concurrency

rules applied to that workload. Using this approach you can change the

workload priority of a query based on the storage groups it touches. This

gives the administrator more control over queries that might access both hot

and cold data.

???? Temporal data

Creating a temporal table allows you to pass the processing task of

maintaining dimensional history over to the database engine. Deploying

temporal features for dimension tables further enhances your ability to

manage the data lifecycle, because temporal history data can be managed

separately from active data.

A change to a data row in a temporal table generates data in the associated

temporal history table. The temporal history table can be physically

implemented independently of the parent table. This means that it can be

partitioned, stored, and maintained (pruned) separately based on your

requirements.

For example, although a parent table might be range partitioned, a history

table with a smaller volume of rows does not have to be range partitioned,

correlating the maintenance effort with the table volume. Similarly, data can

be archived from the history table independently of the parent table.

???? Federated database

In situations where you need to archive aged or inactive data out of the

primary production database but retain data availability, you can migrate the

data to a federated database from where it can be referenced.

9.2.1 Using multi-temperatures features in a sample scenario

This section describes how to implement the concepts discussed to effectively

manage the cost of storage. The example uses a standard FACT table, which is

typically the largest table in a data warehouse.

The target environment has three distinct types of storage available to the

database which represent distinct cost groups for storage. The object of the

exercise as discussed in this chapter is to place data for the fact table on the

appropriate storage based on age. Figure 9-1 on page 340 shows the target

340 Solving Operational Business Intelligence with InfoSphere Warehouse Advanced Edition

environment where SSD, SAS, and SATA storage is utilized in placing data for a

single fact table across multiple storage tiers.

The approach to the data lifecycle is through the implementation of storage

group, table partitioning, and table space objects for a sales transactions fact

table. Each data partition and table space is designed to hold data for a calendar

quarter. Data for the current quarter is always maintained on the fastest storage.

As data ages, the table space associated with the data partition is moved to

warm storage and then cold storage.

Figure 9-1 Physical implementation of a fact table to support multi-temperature storage through the data

lifecycle

The following steps show how this environment was created, using the example

of the sales transactions table CUSTOMER_TXN as shown in Figure 9-1. Each

row in the table will be placed in a data partition, table space and storage group

based on the transaction date.

1. Create a storage group.

A storage group is characterized by having one or more storage paths and

performance statistics for the DB2 optimizer to reference. Using IBM Data

sg_coldsg_hot

SSD RAID Array FC/SAS RAID Array SATA RAID Array

2011Q1 2010Q4 2010Q3 2010Q2 2010Q1 2009Q4 2007Q3

sg_warm

TbSpc14

TbSpc13 TbSpc12 TbSpc11 TbSpc10 TbSpc9 TbSpc1

...

Data partition

Table: Sales

Range

Partitions

Table

Spaces

Storage

Groups

Legend

...

Chapter 9. Managing data lifecyle with InfoSphere Warehouse 341

Studio, a separate storage group for hot, warm, and cold storage was created

where the storage paths were already created on the target storage

subsystems.

Each storage group is assigned an OVERHEAD and DEVICE READ RATE

value. These values help the DB2 optimizer understand the speed of the

underlying storage when compiling an access plan. The values for

OVERHEAD and DEVICE_READ_RATE can reflect the actual storage

capability, or you can alter these to reflect the temperature of data that you

intend to place on that storage.

For example, in an environment where all storage devices are the same but

where you still want DB2 to compile different access plans based on the

temperature of data, you can create different storage groups and assign

different parameter values to reflect temperature you want to assign to the

data on that storage device. Refer to the DB2 Information Center for a

complete description:

http://pic.dhe.ibm.com/infocenter/db2luw/v10r1/index.jsp?topic=%2Fcom.ibm.d

b2.luw.admin.perf.doc%2Fdoc%2Fc0005051.html

The hot storage group is set as the default storage group for the database;

any new table space created is assigned to the default hot storage group.

Figure 10-2 shows hot, warm, and cold storage groups as created in our test

environment.

Figure 9-2 Storage groups represented in IBM Data Studio

When you deploy these changes, click Advanced Options and include the

tasks of flushing the package cache and issuing the RUNSTATS command

342 Solving Operational Business Intelligence with InfoSphere Warehouse Advanced Edition

so that the optimizer can build a new access plan for queries that access

these objects. Figure 9-3 shows an example.

Figure 9-3 Advanced options available when generating DDL in IBM Data Studio

2. Create table spaces for each data partition to be created.

Using Data Studio, create a table space for each data partition to be attached

to the table. Figure 9-4 shows an excerpt of table spaces created for each

quarter of 2010 and Q1 2011.

Figure 9-4 Table space list including FACTDATA table spaces for partitioned table

Chapter 9. Managing data lifecyle with InfoSphere Warehouse 343

When creating each table space, click Storage Options to assign the table

space to a storage group. Click I/O Settings to inherit Overhead, Transfer

rate, and Data tag values from the storage group (Figure 9-5).

Figure 9-5 Set table spaces for partitioned table to Inherit from storage group

At this point, the storage group and table spaces have been created. The

range partitioned table can now be created.

3. Create data partitions for the fact table.

In this example, a partitioned table is created to hold the sales transactions. In

our scenario, the table already exists as a regular table and has to be

migrated to a range partitioned table. We achieve this by doing the following:

– Generate the DDL for the original table.

– Modify the DDL to incorporate the data ranges you need and table spaces

to be used. Use a new table name.

– Create the new table with required indexes.

– Use the INSERT with subselect statement to copy the data from the old to

the new table.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for Using multi-temperatures features in a sample scenario (1/2)

Create new playlist

Sign In

Sign Up

Table of Contents for
Using multi-temperatures features in a sample scenario (1/2)