9. Working with and Analyzing OLAP Data

What Is OLAP?

Online Analytical Processing (OLAP) is a category of data warehousing that enables you to mine and analyze vast amounts data with ease and efficiency. Unlike other types of databases, OLAP databases are designed specifically for reporting and data mining. In fact, there are several key differences between your standard transactional databases, such as Access or SQL Server, and OLAP databases.

Records within a transactional database are routinely added, deleted, or updated. OLAP databases, on the other hand, contain only snapshots of data. The data in an OLAP database is typically archived data, stored solely for reporting purposes. Although new data might be appended on a regular basis, existing data is rarely edited or deleted.

Another difference between transactional databases and OLAP databases is structure. Transactional databases typically contain many tables; each table usually contains multiple relationships with other tables. Indeed, some transactional databases contain so many tables that it can be difficult to determine how each table relates to another. In an OLAP database, however, all the relationships between the various data points have been predefined and stored in OLAP cubes. These cubes already contain the relationships and hierarchies you need to easily navigate the data within. Consequently, you can build reports without the need to know how the data tables relate to one another.

The biggest difference between OLAP and transactional databases is the way the data is stored. The data in an OLAP cube is rarely stored in raw form. OLAP cubes typically store data in preorganized and preaggregated views. That is, grouping, sorting, and aggregations are all predefined and ready to use. This makes querying and browsing for data far more efficient than in a transactional database, where you would have to group, aggregate, and sort records on the fly.

Note

An OLAP database is typically set up and maintained by the database administrator in your IT department. If your organization does not utilize OLAP databases, you might want to speak with your database administrator to discuss the possibility of some OLAP reporting solutions.

Connecting to an OLAP Cube

Before you can browse OLAP data, you must first establish a connection to an OLAP cube. Start on the Data tab and select From Other Sources to see the drop-down menu shown in Figure 9.1. Then, select the From Analysis Services option.

Figure 9.1 Select From Analysis Services.

image

Selecting this option activates the Data Connection Wizard, as shown in Figure 9.2. Follow these steps to configure your connection settings so Excel can establish a link to the server:

Note

The examples you see in this chapter have been created using the Analysis Services Tutorial cube that comes with SQL Server Analysis Services 2005. The actions you take to connect to and work with your OLAP database are the same as demonstrated here because the concepts are applicable to any OLAP cube you might be using.

1. The first step in this endeavor is to provide Excel with some authentication information. Enter the name of your server, your username, and your password, as demonstrated in Figure 9.2. Then, click Next.

Note

If you are typically authenticated via Windows authentication, you simply select the Use Windows Authentication option.

2. Next, you select the OLAP database with which you are working from the drop-down box. In Figure 9.3, the database called OLAP_Database is selected for this scenario. Selecting this database causes all the available cubes to be exposed in the list of objects below the drop-down menu. In this case, there is one cube available. However, the idea is that you would choose the cube you want to analyze, and then click Next.

3. The next screen, shown in Figure 9.4, enables you to enter some descriptive information about the connection you’ve just created.

Note

All the fields shown in Figure 9.4 are optional edits only. That is, you can bypass this screen with-out editing anything, and your connection works fine.

4. Click Finish to finalize your connection settings. You immediately see the Import Data dialog box, as shown in Figure 9.5. From here, you select PivotTable Report, and then click OK to start building your pivot table.

Figure 9.2 Enter your authentication information and click Next.

image

Figure 9.3 Specify your database and then choose the OLAP cube you want to analyze.

image

Figure 9.4 Edit descriptive information for your connection.

image

Figure 9.5 When your connection is finalized, you can start building your pivot table.

image

Understanding the Structure of an OLAP Cube

When your pivot table is created, you might notice that the PivotTable Field List looks somewhat different from that of a standard pivot table. The reason is that the PivotTable Field List for an OLAP pivot table is arranged to represent the structure of the OLAP cube you are connected to.

To effectively browse an OLAP cube, you need to understand the component parts of OLAP cubes and the way they interact with one another. Figure 9.6 illustrates the basic structure of a typical OLAP cube.

Figure 9.6 The basic structure of an OLAP cube.

image

As you can see, the main components of an OLAP cube are dimensions, hierarchies, levels, members, and measures:

Dimensions are major classifications of data that contain the data items that are analyzed. Some common examples of dimensions are Products dimension, Customer dimension, and Employee dimension. In Figure 9.6, the structure you see is that of the Products dimension.

Hierarchies are predefined aggregations of levels within a particular dimension. A hierarchy enables you to pivot and analyze multiple levels at one time without any previous knowledge of the relationships between the levels. In the example in Figure 9.6, the Products dimension has three levels that are aggregated into one hierarchy called Product Categories.

Levels are categories of data that are aggregated within a hierarchy. You can think of Levels as data fields that can be queried and analyzed individually. In Figure 9.6, note that there are three levels: Category, SubCategory, and Product Name.

Members are the individual data items within a dimension. Members are typically accessed via the OLAP structure of dimension, hierarchy, level, and member. In the example shown in Figure 9.6, the members you see belong to the Product Name level. The other levels have their own members and are not shown here.

Measures are the actual data values within the OLAP cube. Measures are stored within their own dimension appropriately called the Measures Dimension. The idea is that you can use any combination of dimension, hierarchy, level, and member to query the measures. This is called slicing the measures.

Now that you understand how the data in an OLAP cube is structured, take a look at the PivotTable Field List again. The arrangement of the available fields starts to make sense. Figure 9.7 illustrates what the PivotTable Field List for an OLAP pivot table might look like.

Figure 9.7 The PivotTable Field List for an OLAP pivot table.

image

As you can see, the measures are listed first under the Sigma icon. These are the only items you can drop into the values area of your pivot table. Next, you see dimensions represented next to the table icon. In this example, you see the Product dimension. Under the Product dimension, you see the Product Categories hierarchy that can be drilled into. Drilling into the Product Categories hierarchy enables you to see the individual levels.

The cool thing is that you are able to browse the entire cube structure by simply navigating through your PivotTable Field List! From here, you can build your OLAP pivot table report just as you would build a standard pivot table.

Understanding Limitations of OLAP Pivot Tables

When working with OLAP pivot tables, you must remember that the source data is maintained and controlled in the Analysis Services OLAP environment. This means that every aspect of the cube’s behavior, from the dimensions and measures included in the cube to the capability to drill into the details of a dimension, is controlled via Analysis Services. This reality translates into some limitations to the actions you can take with your OLAP pivot tables.

When your pivot table report is based on an OLAP data source:

• You cannot place any field other than measures into the values area of the pivot table.

• You cannot change the function used to summarize a data field.

• You cannot create a calculated field or a calculated item.

• Any changes you make to field names are lost when you remove the field from the pivot table.

• The page field settings are not available.

• The Show Pages command is disabled.

• The Show Items with No Data option is disabled.

• The Subtotal Hidden Page Items setting is disabled.

• The Background Query option is not available.

• The Optimize Memory check box in the PivotTable Options dialog box is disabled.

Creating Offline Cubes

With a standard pivot table, the source data is typically stored on your local drive. This way, you can work with and analyze your data while disconnected from the network. However, this is not the case with OLAP pivot tables. With an OLAP pivot table, the pivot cache is never brought to your local drive. This means that while you are disconnected from the network, your pivot table is out of commission. You can’t even move a field while disconnected.

If you need to analyze your OLAP data while disconnected from your network, you need to create an offline cube. An offline cube is essentially a file that acts as a pivot cache, locally storing OLAP data so that you can browse that data while disconnected from the network.

To create an offline cube, start with an OLAP-based pivot table. Place your cursor anywhere inside the pivot table and click the OLAP Tools drop-down menu button on the PivotTable Tools Options tab. Then, select Offline OLAP, as demonstrated in Figure 9.8.

Figure 9.8 Select Offline OLAP to start the creation of an offline cube.

image

Selecting this option activates the Offline OLAP Settings dialog box, where you click the Create Offline Data File button.

The Create Cube File Wizard activates, as shown in Figure 9.9. Click Next to start the process.

Figure 9.9 Start the Create Cube File Wizard.

image

As you can see in Figure 9.10, you first select the dimensions and levels you want included in your offline cube. This dialog box tells Excel which data you want imported from the OLAP database. The idea is to select only the dimensions that you need available to you while disconnected from the server. The more dimensions you select, the more disk space your offline cube file takes up.

Figure 9.10 Select the dimensions and levels you want included in your offline cube.

image

Clicking Next moves you to the next dialog box, as shown in Figure 9.11. Here, you are given the opportunity to filter out any members or data items that you do not want included. For instance, the Internet Sales-Extended Amount measure is not needed, so this dialog box is cleared. Clearing this box ensures that this measure is not imported and does not take up unnecessary disk space.

Figure 9.11 Clear any members you do not need to see offline.

image

The final step is to specify a name and location for your cube file. In Figure 9.12, the cube file is named MyOfflineCube.cub, and it is placed in a directory called MyDirectory.

Figure 9.12 Specify a name and location for your cube file.

image

After a few moments of crunching, Excel outputs your offline cube file to your chosen directory. To test it, simply double-click the file to automatically generate an Excel workbook that is linked to the offline cube via a pivot table.

Note

The file extension for all offline cubes is .cub.

After your offline cube file has been created, you can distribute it to others and use it while disconnected from the network.

Breaking Out of the Pivot Table Mold with Cube Functions

Cube functions are Excel functions that can be used to access OLAP data outside a pivot table object. In legacy versions of Excel, you could find cube functions only if you installed the Analysis Services Add-In. In Excel 2010, cube functions have been brought into the native Excel environment. To fully understand the benefit of cube functions, take a moment to walk through an example.

One of the easiest ways to start exploring cube functions is to e Excel to convert your OLAP-based pivot table into cube formulas. Converting a pivot table to cube formulas is a delightfully easy way to create a few cube formulas without doing any of the work yourself. The idea is to tell Excel to replace all cells in the pivot table with a formula that connects back to the OLAP database. Figure 9.13 shows a pivot table connected to an OLAP database.

Figure 9.13 A normal OLAP pivot table.

image

With just a few clicks, you can convert any OLAP pivot table into a series of cube formulas. Place the cursor anywhere inside the pivot table and click the OLAP Tools drop-down menu button on the PivotTable Tools Options tab. Then, select Convert to Formulas, as demonstrated in Figure 9.14.

Figure 9.14 Select Convert to Formulas to convert your pivot table to cube formulas.

image

If your pivot table contains a report filter field, the dialog box shown in Figure 9.15 activates. This dialog box gives you the option of converting your filter drop-down selectors to cube formulas. If you select this option, the drop-down selectors are removed, leaving a static formula. If you need to have your filter drop-down selectors intact so that you can continue to interactively change the selections in the filter field, leave the Convert Report Filters option cleared.

Figure 9.15 Excel gives you the option of converting your report filter fields.

image

Note

If you are working with a pivot table in compatibility mode, Excel automatically converts the filter fields to formulas.

After a second or two, the cells that used to house a pivot table are now homes for cube formulas. Note that, as in Figure 9.16, any styles that you might have applied are removed.

Figure 9.16 Note in the formula bar, these cells are now a series of cube formulas!

image

So why is this capability useful? Now that the values you see are no longer part of a pivot table object, you can insert rows and columns; you can add your own calculations; you can combine the data with other external data; and you can modify the report in all sorts of ways by simply moving the formulas around. For instance, Figure 9.17 illustrates a report in which specific data items from Accessories and Bikes have been combined and formatted to create a new Bike Products report. Again, note that all cells displaying values for each product are formulas that read directly from the OLAP cube via the connection you created.

Figure 9.17 A Bike Products report built using cube formulas from a converted OLAP pivot table.

image

Next Steps

In Chapter 10, “Mashing Up Data with PowerPivot,” you learn how to use the PowerPivot add-in for Excel 2010 to mash up millions of rows of data into a single pivot table.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset