Chapter 7. Summary Management

Summary Tables

A common technique used in data warehouses is to precompute and store results of frequent queries. This is especially beneficial when the queries involve aggregation, because the result is usually much smaller than the detailed data used to produce the result. An example of such a query is a monthly sales report for a business. Since multiple users are interested in the total sales of each product for each month, the data would be selected, joined, sorted, and aggregated over and over again for each user. Rather than wasting resources reexecuting the same query repeatedly the result could be precomputed and saved in a table. Such precomputed results are often called summaries or summary tables.

Figure 7.1 shows a summary containing the total number of items sold for each month of the year. Summary tables are usually much smaller than the tables containing the detail data. In this example, 17 rows of detailed sales transaction data are summarized into 10 rows. Depending on the data, the reduction in size and therefore the improvement in query performance can be quite significant.

Summary Table

Figure 7.1. Summary Table

Oracle 8i, with its summary management feature, made a huge advance in the way people used summary tables by letting the database manage and use summaries transparently. Summary tables in Oracle are called Materialized Views.

Why Do You Need Summary Management?

To understand why summary management is needed, let us first look at some of the tasks involved in managing summary tables.

A summary is the precomputed result of a SQL query. Once a summary has been created, the result of the query can be obtained from the summary. A summary can also be used to answer other related queries. For instance, if we were interested in the total sales of each product for each year, the result can be obtained by adding the months for that year together, since months roll up into years. To use a summary, the query has to be modified somehow to reference that summary. In some situations this would mean that application SQL must be modified. Alternatively, users must be informed of the existence of summarized data and trained on which summary tables to use for each particular query. Thus, while summary tables improve query performance, managing these summaries can be quite a task.

As new detail data is loaded into the warehouse, the data in the summary is no longer synchronized with the detail tables. When this happens, the summary is said to be stale. Figure 7.2 shows a stale summary. In order to bring it up-to-date with the detail data, the stale summary must be refreshed. A summary can be rebuilt when new data is loaded into the warehouse. This is known as complete refresh. In some cases, it is possible to incrementally refresh the summary with only the new or changed data. Before summary management with Oracle 8i, refreshing summaries involved complex, custom-built procedures.

Stale Summary

Figure 7.2. Stale Summary

Choosing which summaries to create requires an understanding of the workload—which types of questions users are asking and how often the same information is being requested. The number of possible summary tables that could be created is very large. Since summaries consume disk space and take time to refresh, it is important to select few summaries that produce the most performance benefits.

In other words, before using summaries, the following questions must be addressed:

  1. What is the best set of summaries to create?

  2. Do users have to be aware of summaries? If so, how will the users know which summaries exist and when to use them? If summaries are later determined not to be that useful and are dropped, users also need to know about this.

  3. As the detail data changes, how will the summaries be kept up-to-date?

Oracle’s Summary Management has the answer to all these questions.

Summary Management with Oracle

Summary management provides a complete environment to manage and use summary tables. With this feature, summaries can be created directly in the Oracle database. Mechanisms are provided to keep your summaries up-to-date with changes in underlying data. Further, Oracle will transparently rewrite your queries to use these summaries, so users do not have to be aware of the summaries. Summary management in Oracle includes the following components:

  • A database object, known as a Materialized View, which stores precomputed results, such as a summary table.

  • A mechanism to refresh the materialized views using either Complete or Incremental Refresh.

  • Query Rewrite, which transparently rewrites SQL queries to use materialized views.

  • A database object called Dimension, which provides an ability to declare hierarchical relationships, such as rollups in the data, to assist query rewrite.

  • The SQL Access Advisor, a tool that can recommend which materialized views to create.

Let us see how summary management answers the questions we had posed earlier:

  1. What is the best set of summaries to create? Using the SQL Access Advisor you can easily determine the materialized views to create for a given set of queries to fit a specified amount of space.

  2. Do users have to be aware of summaries? Once the materialized views have been created and enabled for query rewrite, queries will automatically use these materialized views. A significant benefit of this is that the end users and database applications no longer need to be aware of the existence of the summary tables. Query rewrite in Oracle is not limited to star schemas or queries with aggregation. Any client tool can take advantage of this feature in the database server to provide aggregate navigation capabilities. For instance, Oracle Discoverer uses the summary management features in the Oracle database to improve query response time.

  3. How are summaries kept up-to-date? Summary management provides procedures for complete and fast refresh, so that the materialized views can be updated when new detail data is loaded into the warehouse. This eliminates the need to write customized refresh programs.

In this chapter, we will discuss how to create and refresh materialized views. The dimension object is described in Chapter 8, query rewrite is discussed in Chapter 9, and the SQL Access Advisor is described in Chapter 10.

Creating a Materialized View

A Materialized View (MV) is a database object that precomputes and stores the result of a SQL query, akin to a summary table. Note that a summary is typically an aggregate query; however, materialized views can be created for any query—for example, one just involving a join. In this respect, a materialized view is quite similar to a conventional view. The difference is that by materializing the view, you save the results in the database. In the remainder of this chapter, we will use the term materialized view instead of summary.

Hint

Materialized views can also be used for replication of data. Before Oracle 8i, such materialized views were called snapshots. Materialized views include all the features that were available with snapshots and more. In this book, we will only focus on the application of materialized views to data warehousing.

A materialized view is created using the CREATE MATERIALIZED VIEW SQL statement. The following example creates a materialized view named MONTHLY_SALES_MV, which contains the total sales of each product for each month.

CREATE MATERIALIZED VIEW MONTHLY_SALES_MV
PCTFREE 0 TABLESPACE summary
STORAGE (initial 64k
         next 64k pctincrease 0)     <- storage parameters
BUILD IMMEDIATE                      <- when to populate it
REFRESH FORCE                        <- how to refresh it
ON DEMAND                            <- when to refresh it
ENABLE QUERY REWRITE                 <- use in query rewrite or not
AS                                   <- query result it contains
SELECT t.month, t.year, p.product_id,
       SUM (f.purchase_price) as sum_of_sales,
       COUNT (f.purchase_price) as total_sales,
       COUNT(*) as cstar
FROM time t, product p, purchases f
WHERE t.time_key = f.time_key AND
      f.product_id = p.product_id
GROUP BY t.month, t.year, p.product_id;

The materialized view definition specifies the following properties, highlighted with arrows in the example:

  • How to physically store the materialized view (i.e., the storage clause).

  • When to populate it—immediately upon creation or later. Or, if it is an existing table that must be registered as a materialized view.

  • How to refresh it when data in the underlying detail tables has changed.

  • When to refresh it—at the end of each transaction or when explicitly requested.

  • Whether to use it for query rewrite or not.

  • A SELECT statement, which describes the contents of the materialized view.

We will now explain each clause of the materialized view creation statement in detail.

Naming the Materialized View

As with any database object, a materialized view has a name and a schema that owns it. It is good practice to follow a naming convention for materialized views, so that you can easily identify them later and differentiate them from tables and views. For example, you could use the suffix _MV when naming materialized views.

The Physical Storage for the Materialized View

The materialized view has a storage specification where you can specify the tablespace to store the data in, the initial allocation size and the size, of its extents. The syntax and semantics of the storage specification are the same as for any other object in the database

In the MONTHLY_SALES_MV example, the materialized view is being placed in the tablespace called MVIEW, its first extent will be 64K, and all subsequent extents will be 64K.

When Should the Materialized View Be Populated with Data?

The materialized view definition describes when you would like the materialized view to be populated with data. If you specify BUILD IMMEDIATE (the default), as in the example, the materialized view is populated immediately upon creation. If you specify BUILD DEFERRED, then the materialized view will be populated when you perform the refresh operation. If you have an existing summary table that you would like to manage using Oracle’s summary management, you can use the ON PREBUILT TABLE clause. This indicates to the database that the existing table should be treated like a materialized view and makes it available to query rewrite, if desired.

Hint

A materialized view is not considered by query rewrite until it has been populated with data

How Should the Materialized View Be Refreshed?

As new detail data is periodically loaded into the data warehouse, the materialized views have to be refreshed to reflect the changes. Four refresh options are available:

  • Complete

  • Fast

  • Force

  • Never

A materialized view can be completely rebuilt by specifying REFRESH COMPLETE. Or, it can be incrementally updated by specifying the REFRESH FAST option. It is usually faster to perform a fast refresh than a complete refresh; however, if there are a lot of changes, it may be faster to perform a complete refresh. The REFRESH FORCE option (the default), means that Oracle will perform a fast refresh if possible and only do a COMPLETE refresh if necessary. Materialized views that use the NEVER REFRESH option will never be refreshed by any of the procedures supplied by Oracle. This option can be useful if you have custom-built refresh procedures or would like to store some historical data that must not be updated. The materialized view MONTHLY_SALES_MV, in our example, is created using the REFRESH FORCE clause, which means that Oracle will decide whether to perform fast or complete refresh.

When should the Materialized View be refreshed?

As the underlying detail data changes, the materialized views that are based on those detail tables become stale and no longer reflect the results of summarizing all the detail data. The most common ways of refresh are ON COMMIT or ON DEMAND (the default).

If your business is such that it requires the materialized view be kept up-to-date with the detail data at the transaction level, the materialized view can be refreshed at the end of each transaction by specifying the ON COMMIT option. This is rarely used in a data warehouse.

Hint

Adding materialized views with the ON COMMIT option will lengthen the time required to commit a transaction, because the materialized views must be refreshed as part of the commit processing.

On the other hand, with ON DEMAND refresh you must manually request that the materialized view be refreshed using procedures in the DBMS_MVIEW PL/SQL package. ON DEMAND refresh allows you to control when the materialized view will be refreshed. This is useful in a warehouse, where new data is loaded in a batch after which the materialized views must be refreshed.

You can also specify that a materialized view be refreshed on a periodic schedule by specifying a start date and subsequent refresh intervals using the START WITH and NEXT clauses.

The facilities for refreshing materialized views will be discussed later in the chapter.

Enabling the Materialized View for Query Rewrite

One of the major benefits of using Oracle’s summary management is query rewrite. Once a materialized view has been registered as being eligible for use by query rewrite, Oracle will transparently rewrite user queries to access the data in the materialized view, without the need for any application changes. If you would like the materialized view to be considered for query rewrite, then it must be defined with the clause ENABLE QUERY REWRITE, as shown in the example. If you don’t want a materialized view to be used for query rewrite, use the DISABLE QUERY REWRITE clause (the default).

Specifying the Contents of the Materialized View

The materialized view definition includes a SELECT statement, which describes its contents. The tables referenced in a materialized view’s query are referred to as detail tables or base tables of the materialized view.

A typical query for a materialized view used in a data warehouse includes the following:

  • A WHERE clause, which joins the fact table and one or more dimension tables. In this example, the fact table, PURCHASES, is joined with the dimension tables, TIME and PRODUCTS. The WHERE clause may also contain selection criteria to restrict the data in the materialized views.

  • One or more aggregate operators. In our example, COUNT is used to obtain the total sales. You can include any of the built-in aggregate operators in Oracle, such as SUM, MIN, MAX, AVG, COUNT(*), COUNT, COUNT(DISTINCT x), VARIANCE, and STDDEV. You can also use analytical functions, moving window aggregates, and user-defined aggregates, which were discussed in Chapter 6.

  • A GROUP BY clause. In the example, we are counting the total items sold by the columns YEAR, MONTH, and PRODUCT_ID. You can also use the CUBE, ROLLUP, or GROUPING SETS features described in Chapter 6.

The materialized view, MONTHLY_SALES_MV, discussed in the preceding example was a materialized view involving aggregation. A materialized view can also be used to join two or more tables without any aggregation. Such a materialized view is used to precompute expensive joins and can be used in lieu of a bitmapped join index (see Chapter 4). A bitmapped join index is useful for star queries and will work better than materialized views if you want to compute combinations of selections against the index keys. On the other hand, a materialized view can be used to answer a wider class of queries using query rewrite.

The next example shows a materialized view that is used to compute the join between two tables, PURCHASES and CUSTOMER, without any aggregation.

CREATE MATERIALIZED VIEW customer_purchases_mv
BUILD IMMEDIATE
REFRESH COMPLETE
AS
SELECT c.gender, c.occupation, f.purchase_price
FROM purchases f, customer c
WHERE f.customer_id = c.customer_id;

In general a materialized view can be defined using an arbitrarily complex SQL query; however, the fast refresh and query rewrite capabilities would be limited. For example, the following materialized view, which stores the sales for customers who spent the most money in our store in the year 2003, contains a HAVING clause with a subquery.

CREATE MATERIALIZED VIEW customers_maxsales2003_mv
ENABLE QUERY REWRITE
AS
SELECT c.customer_id, SUM(f.purchase_price) AS dollar_sales
FROM purchases f, customer c
WHERE f.customer_id = c.customer_id
GROUP BY c.customer_id
HAVING SUM(f.purchase_price)
    IN (SELECT max(f.purchase_price) dollar_sales
        FROM purchases f, time t
        WHERE f.time_key = t.time_key
          AND t.year = 2003
        GROUP BY f.customer_id);

However, as we will see later, this materialized view cannot be incrementally refreshed.

Creating a Materialized View in Enterprise Manager

You can use Oracle Enterprise Manager to create and edit materialized views by following the Materialized Views link from the Warehouse section of the Administration page (see Figure 2.16 in Chapter 2). When you get to the Materialized Views screen, click the Create button to bring up the screen shown in Figure 7.3. Here, we are creating a materialized view named MONTHLY_SALES_MV.

Creating a Materialized View in Oracle Enterprise Manager

Figure 7.3. Creating a Materialized View in Oracle Enterprise Manager

All the options for a CREATE MATERIALIZED VIEW statement may be specified via this graphical interface. It is also possible to collect statistics on the materialized view upon creation.

Using Summary Management with Existing Summary Tables

If you already have a data warehouse with summaries stored in regular database tables and do not want to completely recreate them, summary management can still be used to manage them and to perform query rewrite.

To register your summary table with summary management, create a materialized view using the ON PREBUILT TABLE clause. If you specify the ENABLE QUERY REWRITE clause, the materialized view can be used for query rewrite.

For example, suppose you have a table named MONTHLY_CUSTOMER_SALES, which summarizes the amount each customer spent each month, as represented by the following query:

SELECT t.year, t.month, c.customer_id,
       SUM(f.purchase_price) as dollar_sales
FROM time t, purchases f, customer c
WHERE f.time_key = t.time_key AND
      f.customer_id = c.customer_id
GROUP BY t.year, t.month, c.customer_id;

The MONTHLY_CUSTOMER_SALES table is described as follows:

describe monthly_customer_sales;

Name                                      Null?    Type
----------------------------------------- -------- ------------
YEAR                                               NUMBER(4)
MONTH                                              NUMBER(2)
CUSTOMER_ID                                        VARCHAR2(10)
DOLLAR_SALES                                       NUMBER

If you would like to use this table as a materialized view and make it eligible for query rewrite, you can issue the following statement:

CREATE MATERIALIZED VIEW monthly_customer_sales
ON PREBUILT TABLE
ENABLE QUERY REWRITE
AS
SELECT t.year, t.month, c.customer_id,
       SUM(f.purchase_price) AS dollar_sales
FROM time t, purchases f, customer c
WHERE f.time_key = t.time_key AND
      f.customer_id = c.customer_id
GROUP BY t.year, t.month, c.customer_id;

The ON PREBUILT TABLE clause creates the necessary metadata for the data in the existing table, which allows it to be used by query rewrite and to be refreshed using Oracle’s supplied refresh procedures.

Hint

When creating the materialized view using an existing table, the materialized view must have the same name as the table.

A word of caution when creating materialized views with the PREBUILT clause—Oracle will not verify that the data in the existing table corresponds to the result of the query used to define the materialized view. You must ensure that the contents of the existing table correctly represent the summarization of the detail data. Otherwise, you may see incorrect results with query rewrite.

Hint

For query rewrite to use a materialized view with the PREBUILT option, you must set the initialization parameter QUERY_REWRITE_INTEGRITY level to TRUSTED or STALE_TOLERATED, as explained in Chapter 9.

Partitioning the Materialized View

You can partition a materialized view as you would a table, using any of the partitioning methods described in Chapter 4. In a warehouse, it is common to partition materialized views in the same way as the fact table. Range-List partitioning may also be a good way to partition a materialized view when it contains multiple dimensions, such as time (range) and geography (list).

In the following example, the materialized view is partitioned using range partitioning:

CREATE MATERIALIZED VIEW Q12003_SALES_MV
PARTITION by RANGE(time_key)
(
    partition purchases_jan2003
        values less than (TO_DATE('01-FEB-2003', 'DD-MON-YYYY'))
        tablespace purchases_jan2003,
    partition purchases_feb2003
        values less than (TO_DATE('01-MAR-2003', 'DD-MON-YYYY'))
        tablespace purchases_feb2003,
    partition purchases_mar2003
       values less than (TO_DATE('01-APR-2003', 'DD-MON-YYYY'))
       tablespace purchases_mar2003
)
BUILD IMMEDIATE
REFRESH FORCE
ON DEMAND
ENABLE QUERY REWRITE
AS
SELECT t.time_key, p.product_id,
       SUM(f.purchase_price) as sum_of_sales
FROM time t, product p, purchases f
WHERE t.time_key = f.time_key AND
      f.product_id = p.product_id AND
      t.time_key BETWEEN TO_DATE('01-JAN-2003', 'DD-MON-YYYY') AND
                         TO_DATE('31-MAR-2003', 'DD-MON-YYYY')
GROUP BY t.time_key, p.product_id;

Partitioning a materialized view can speed up refresh, because refresh can now use parallel DML. We will see later how some partitioning techniques may allow a materialized view to take advantage of optimizations during refresh. Partitioning also enables queries that use the materialized view to take advantage of optimizations such as partition pruning.

Indexing the Materialized View

You can also build indexes on your materialized view to improve the performance of your queries. The techniques used to index materialized views are similar to those for any table.

In the following example, a concatenated index is created on the grouping columns of the MONTHLY_SALES_MV materialized view, shown earlier in this chapter. Also, bitmapped indexes are created on the grouping columns. These indexes will improve performance of queries that get rewritten to use the materialized view.

CREATE INDEX easydw.products_by_month_concat_index
ON MONTHLY_SALES_MV (month, year, product_id)
pctfree 5
tablespace indx
storage (initial 64k next 64k pctincrease 0) ;

CREATE BITMAP INDEX easydw.total_products_by_month_index
ON monthly_sales_mv (month, year);

CREATE BITMAP INDEX easydw.total_products_by_id_index
ON monthly_sales_mv (product_id);

Oracle automatically creates an index on materialized views with aggregates to speed up fast refresh. To suppress creation of this index, use the NO INDEX clause in the CREATE MATERIALIZED VIEW statement.

Hint

You could use the SQL Access Advisor tool, discussed in Chapter 10, to determine the best materialized views and indexes on materialized views to create for your application.

Security of Materialized Views

Some information in the data warehouse may have restricted access, and it is important to ensure that the appropriate security policies are implemented regarding access to materialized views. You may want to allow users access to a materialized view, but not allow them to see the underlying detail data. For example, you may allow a user to see the average salary by department, but not to see an individual employee’s salary. On the other hand, you want to ensure that materialized views and query rewrite are not used as a mechanism for bypassing security. As with all database objects, Oracle provides a privilege model for creating and altering materialized views.

To create a materialized view in a user’s own schema, the user must have the CREATE MATERIALIZED VIEW privilege. To create a materialized view in another user’s schema, the creator must have the CREATE ANY MATERIALIZED VIEW privilege. The owner of the schema where the materialized view will be placed must have the CREATE TABLE and CREATE INDEX privilege and must be able to execute the materialized view’s defining query.

Hint

It is helpful to remember that most privilege checks for a materialized view are applied to the owner of the schema in which the materialized view is placed. The creator only needs the CREATE/ALTER (or CREATE/ALTER ANY) MATERIALIZED VIEW privilege. Also, the privilege checks are applied during creation and altering and not during query rewrite.

To enable a materialized view for query rewrite, the owner of the materialized view must have the QUERY REWRITE object privilege on any tables (referenced in the materialized view) that are outside the owner’s schema. Alternatively, the owner must have the GLOBAL QUERY REWRITE system privilege.

To refresh a materialized view in another schema, the user issuing the refresh procedure must have the ALTER ANY MATERIALIZED VIEW privilege.

To create a materialized view with REFRESH ON COMMIT option, the owner must have the ON COMMIT object privileges on all tables (referenced in the materialized view) outside the schema or have the ON COMMIT system privilege.

In the following example, the user EASYDW is granted the following privileges to create materialized views and to allow these to be used for query rewrite.

-- Add privileges
GRANT select any table to easydw;
GRANT execute any procedure to easydw;


-- Add privileges for materialized views and query rewrite
GRANT create materialized view to easydw;
GRANT drop materialized view to easydw;
GRANT alter materialized view to easydw;
GRANT global query rewrite to easydw;

Refresh

As new data is loaded into the warehouse, any materialized view based on that data must be updated. This operation is known as refresh.

In section 7.2, we discussed the two refresh policies—ON COMMIT and ON DEMAND in the context of creating a materialized view. To recap, refresh can be performed ON COMMIT or ON DEMAND. A materialized view with the refresh ON COMMIT option is automatically refreshed at the end of every transaction that updates the detail tables. A materialized view with the ON DEMAND refresh option must be explicitly refreshed using procedures in the DBMS_MVIEW package.

The frequency of refresh determines how recent the data in the materialized view is with respect to the detail data. The refresh operation can be very time consuming depending on the amount of data involved. You must carefully determine your refresh policy based on available resources and your application’s need for current data. If your application needs the materialized view to be synchronized with the detail data at all times, choose the ON COMMIT option. With this option, Oracle will automatically refresh the materialized view when the changes to detail data get committed. If your warehouse gets new data only once a day during a specified maintenance window, deferring the refresh until then would be the right choice.

The refresh policy must be chosen when creating the materialized view; however, it can be altered at a later time by issuing an ALTER MATERIALIZED VIEW statement.

Using the DBMS_MVIEW Refresh Procedures

Oracle provides three procedures for ON DEMAND refresh in the DBMS_MVIEW–supplied PL/SQL package:

  • DBMS_MVIEW.REFRESH

  • DBMS_MVIEW.REFRESH_DEPENDENT

  • DBMS_MVIEW.REFRESH_ALL_MVIEWS

If you want to refresh all your materialized views, typically after bulk loading new detail data into the warehouse, you can use the DBMS_MVIEW.REFRESH_ALL_MVIEWS procedure. If you have materialized views that are refreshed at different times—for example, some weekly and others monthly—you can specify a list of materialized views to refresh using the DBMS_MVIEW.REFRESH procedure. If you want to refresh all materialized views that are based on a particular detail table that has changed, use the DBMS_MVIEW.REFRESH_DEPENDENT procedure, specifying the detail table.

Before we look at some examples of using these procedures, let us briefly look at some of the parameters that can be specified to control the behavior of refresh:

  • method: Recall that at the time of creating the materialized view, you specify a default refresh method of COMPLETE, FAST, or FORCE. You may override this refresh method when issuing the refresh command by specifying the method parameter as C (complete), F (fast), or ? (force). If you don’t specify it, the refresh method given at the time of creating the materialized view will be used. In Oracle Database 10g, you can also specify a special method, P, which requests that refresh should be done using Partition Change Tracking, which is discussed in detail in section 7.3.4. Note that at the time of writing, there is no syntax to choose Partition Change Tracking as a refresh method when creating the materialized view.

  • atomic_refresh: If atomic_refresh is specified (default), all operations during refresh are performed within one transaction, which means that any error will rollback the entire refresh operation. Otherwise, Oracle may choose to partially commit some of the work done during refresh.

    The setting of this parameter can have some effect on refresh performance, as we will discuss in section 7.3.5.

  • refresh_after_errors: When refreshing multiple materialized views, this parameter indicates whether or not refresh should stop if it encounters an error, or if it should continue on to the next materialized view. The default behavior is to stop when the first error is encountered. Note that if the atomic_refresh parameter is set to TRUE, then any errors will roll back the refresh, regardless of the setting of the refresh_after_errors parameter.

  • nested: This parameter is used to control the refresh of nested materialized views and will be described in section 7.3.6.

Hint

There are several other parameters, such as rollback_segment, push_deferred_rpc, heap_size, purge_option, and parallelism, that are specific to replication and should not be used in a data warehouse.

The first example shows the use of the REFRESH_ALL_MVIEWS procedure. Here, the refresh method is not specified, which means each materialized view will be refreshed using the method specified when creating it. The bind variable, :failures will return the number of failed refreshes. (You can identify the materialized views that failed to refresh using the last_refresh_date field in the dictionary view, USER_MVIEWS, or from the error messages in the Oracle alert.log file.)

VARIABLE failures number;
EXECUTE DBMS_MVIEW.REFRESH_ALL_MVIEWS(:failures);

In the following example, the materialized view MONTHLY_SALES_MV, defined earlier as REFRESH FORCE, is being refreshed using COMPLETE refresh:

EXECUTE DBMS_MVIEW.REFRESH('MONTHLY_SALES_MV', 'C'),

In the following example, the REFRESH procedure is used to refresh two materialized views, MONTHLY_SALES_MV and Q12003_SALES_MV, using atomic_refresh.

EXECUTE DBMS_MVIEW.REFRESH('MONTHLY_SALES_MV, Q12003_SALES_MV',
                           atomic_refresh=>TRUE);

The following statement can be used to refresh FORCE (as indicated by refresh method of ?) all materialized views that have changed when the CUSTOMER table is updated. We have requested that the refresh should continue even after errors. On completion of this procedure, the :failures bind variable will indicate the number of failed refreshes.

EXECUTE DBMS_MVIEW.REFRESH_DEPENDENT(:failures, 'customer', '?',
                                     refresh_after_errors=>true);

Using Enterprise Manager for Refresh

Alternatively, you can use Oracle Enterprise Manager to refresh your materialized views. From the Administration page (see Chapter 2, Figure 2.16), click on the Materialized Views link and then search for the materialized view you want to refresh. Once you find the materialized view, click on the link for its name and you will get the screen shown in Figure 7.4.

Refreshing a Materialized View in Oracle Enterprise Manager

Figure 7.4. Refreshing a Materialized View in Oracle Enterprise Manager

In this page, you can check the refresh status of the materialized view by looking at the Refresh State field. If the materialized view is STALE, you can refresh it by pressing the Refresh button.

To get an accurate picture of the refresh state, the Compile State field must say VALID. If it says NEEDS_COMPILE, it means that some internal metadata needs to be updated, and you must press the Compile button to validate the materialized view first.

Pressing the Refresh button will bring up a refresh options screen (as shown in Figure 7.5) from which you can choose the refresh type, such as fast or complete, and choose to optionally update the statistics on the materialized view after refresh. Pressing the OK button will refresh the materialized view.

Refreshing a Materialized View in Oracle Enterprise Manager—Options

Figure 7.5. Refreshing a Materialized View in Oracle Enterprise Manager—Options

Once the refresh has completed, you can once again check the status of the materialized view in the Refresh State field, shown in Figure 7.4.

Fast Refresh

As the volume of data in a warehouse increases, rebuilding the entire materialized view after each new data load can get prohibitive. Oracle provides the capability to refresh materialized views without a complete rebuild. This is known as fast refresh. There are two mechanisms for fast refresh—either using materialized view logs, described in the next section, or using Partition Change Tracking, which is described in section 7.3.4.

Fast Refresh Using Materialized View Logs

One way to fast refresh a materialized view is by capturing the individual changes done to the detail tables by DML statements (INSERT, UPDATE, DELETE, and MERGE statements) and then applying these changes to the materialized view. A materialized view log is a mechanism used to capture these changes and must be created on each of the detail tables of the materialized view. During fast refresh, Oracle will use the materialized view log to identify the changes that have occurred since the last refresh and apply them to the materialized view. Once all materialized views have been refreshed to include the changes, the relevant rows will be automatically purged from the materialized view log. You must never modify the materialized view log manually, as you will be forced to completely rebuild the materialized view.

You may be concerned about the overhead of materialized view logs on the transaction issuing the DML statements. While this may indeed be a problem for a transaction-processing system, in a data warehouse this is not a major issue, because DML statements are not heavily used. The more typical method used to load new data is using a bulk load using SQL*Loader or using an INSERT /*+APPEND */ statement. Fortunately, the individual changes made by these methods are not recorded in the materialized view log but are tracked internally in a compressed fashion.

Materialized View Log Options

A materialized view log can be created with several options; depending on the materialized views you need to be fast refreshed, you will need one or more of these options. Some of these options include:

  • ROWID, which will log the rowids of the rows changed by DML statements.

  • INCLUDING NEW VALUES clause, which indicates that both the old and new version of a row changed by an update statement should be logged.

  • SEQUENCE, which causes a sequence number to be logged for each change.

Hint

You must create the materialized view logs before creating the materialized views; otherwise, you will have to completely refresh the materialized views once before fast refresh can be performed.

The following example shows a fast refreshable materialized view along with the materialized view logs on its detail tables. Note that for a materialized view with aggregates, the options ROWID, SEQUENCE, and INCLUDING NEW VALUES must be specified for the materialized view logs. If you omit the SEQUENCE option, you can still get fast refresh but only under specific conditions (i.e., either when only one detail table has been changed or when only inserts have been performed). Also, all detail table columns referenced in the materialized view’s query must be included in the materialized view log.

CREATE MATERIALIZED VIEW LOG on time
      WITH ROWID, SEQUENCE (time_key, month, year)
      INCLUDING NEW VALUES;

CREATE MATERIALIZED VIEW LOG on purchases
      WITH ROWID, SEQUENCE (time_key, product_id,
                            purchase_price)
      INCLUDING NEW VALUES;

CREATE MATERIALIZED VIEW LOG on product
      WITH ROWID, SEQUENCE (product_id)
      INCLUDING NEW VALUES;

CREATE MATERIALIZED VIEW monthly_sales_mv
 PCTFREE 0 TABLESPACE summary
 STORAGE (initial 64k next 64k pctincrease 0)
 BUILD IMMEDIATE
 REFRESH FAST
 ON DEMAND
 ENABLE QUERY REWRITE
 AS
 SELECT t.month, t.year, p.product_id,
        SUM (ps.purchase_price) as sum_of_sales,
        COUNT (ps.purchase_price) as total_sales, COUNT(*)
 FROM time t, product p, purchases ps
 WHERE t.time_key = ps.time_key AND
       ps.product_id = p.product_id
 GROUP BY t.month, t.year, p.product_id;

The next example shows a materialized view without aggregation that is fast refreshable with the ON COMMIT option. To make this type of a materialized view fast refreshable, the materialized view logs only need to have the ROWID option. However, you must include a ROWID column for each table in the SELECT clause of the materialized view. In this example, we are assuming that the materialized view logs for PRODUCT and PURCHASES, created earlier for the MONTHLY_SALES_MV, are available.

CREATE MATERIALIZED VIEW LOG on customer WITH ROWID;

CREATE MATERIALIZED VIEW product_customer_mv
BUILD IMMEDIATE
REFRESH FAST ON COMMIT
AS
SELECT c.rowid r1, c.gender, p.rowid r2, p.product_id,
       f.rowid r3, f.purchase_price
FROM purchases f, product p, customer c
WHERE f.customer_id = c.customer_id
  AND f.product_id = p.product_id;

Hint

There is only one materialized view log on a given table, and this materialized view log is used for all materialized views that reference that table. Therefore, be sure to specify options to satisfy all the materialized views. If you use the SQL Access Advisor to determine which materialized views to create, the appropriate materialized view logs will also be recommended for you.

Using Oracle Enterprise Manager to Define Materialized View Logs

You can use Oracle Enterprise Manager to create and edit your materialized view logs, as shown in Figure 7.6. You get to this screen by clicking on the Materialized View Logs link on the Administration page and then clicking on the Create button. The first step is to enter the table name, such as EASYDW.CUSTOMER. Note that you can only specify the name of the table on which the materialized view log is being defined—the name of the materialized view log itself is automatically determined by Oracle and cannot be modified by the user. Pressing the Go button will give you a list of available columns. From this list, you can choose the columns you would like to add to the materialized view log. Once you have made your choices, you can click the Show SQL button to see the SQL command or click OK to create the materialized view log.

Creating a Materialized View Log in Oracle Enterprise Manager

Figure 7.6. Creating a Materialized View Log in Oracle Enterprise Manager

Other Requirements for Fast Refresh

In addition to requiring materialized view logs on the detail tables, the defining query of a materialized view to be fast refreshed must adhere to certain rules. For example, if the materialized view contains an aggregate operator, it must also include some additional supporting aggregates—if the materialized view had SUM(x) or AVG(x), it must also have COUNT(x); if it had STDDEV(x), it must also include SUM(x), COUNT(x), and SUM(x * x). In the previous section, we saw an example of the PRODUCT_CUSTOMER_MV, where we needed to include ROWID columns for the detail tables in the materialized views.

Rather than trying to remember such detailed rules for when a particular materialized view is fast refreshable, we would recommend that you simply use the EXPLAIN_MVIEW and TUNE_MVIEW tools provided for this purpose. The EXPLAIN_MVIEW tool will explain which operations (such as fast refresh or query rewrite) the materialized view can or cannot support and the reasons why not. The TUNE_MVIEW tool will suggest modifications to the materialized view to fix any problems reported by EXPLAIN_MVIEW. These tools will be discussed later in this chapter.

Partition Change Tracking

Partitioned tables are a common feature in a warehouse. In Chapter 5, we saw the use of partitioning to load data into the warehouse. In Chapter 6, we discussed how partitioning can help improve query performance using partition pruning.

Oracle 9i introduced a new feature known as Partition Change Tracking (PCT). With this feature, whenever any DML or partition maintenance operation occurs to a partitioned detail table referenced by a materialized view, Oracle keeps track of the updated partitions. Then, during refresh, Oracle can identify which portion of the materialized view corresponds to the updated partitions in the detail tables, and recompute only that portion of the materialized view. This is referred to as PCT refresh.

Figure 7.7 shows a conceptual picture of how data changes are tracked in Partition Change Tracking. In this example, the PURCHASES table is partitioned by month into Jan2002, Feb2002, and Mar2002 partitions.

Partition Change Tracking

Figure 7.7. Partition Change Tracking

The materialized view MONTHLY_SALES_MV stores the total sales for each month. The materialized view is defined in such a way that there is a correspondence between the partitions of the PURCHASES table and the data in the materialized view, as shown by the dotted lines. (We will discuss how to create such a materialized view shortly.) If the Feb2002 partition is updated, only the corresponding portion of the materialized view is stale. During refresh, only the portion of the materialized view that refers to the stale partition needs to be refreshed and hence the refresh can be much faster.

When and How to Perform PCT Refresh

You may be wondering why you should learn about Partition Change Tracking when you can refresh using materialized view logs. There are three main reasons why you may consider using PCT refresh:

  1. If you have performed any partition maintenance operations, such as adding a new partition or dropping an old one, then fast refresh using materialized view logs is not possible. In this situation, which is quite common when loading data into the fact table, you will be forced to perform a complete refresh—unless your materialized view supports PCT.

  2. Some materialized views do not allow fast refresh with materialized view logs—for example, if the materialized view included analytical functions such as RANK, discussed in Chapter 6. However, the materialized view may still support PCT refresh.

  3. If a lot of changes have been done to the partitioned detail table, PCT refresh may perform much faster than fast refresh using materialized view logs.

If you specified refresh FAST or FORCE either when creating the materialized view or using the DBMS_MVIEW procedures, Oracle Database 10g will automatically determine the best refresh method to use. However, if your materialized view supports PCT, you can explicitly request PCT refresh using the DBMS_MVIEW procedure, as follows:

EXECUTE DBMS_MVIEW.REFRESH('monthly_sales_mv', 'P'),

Next, we will discuss how to create a materialized view that can support PCT refresh.

Creating a Materialized View That Supports PCT Refresh

One of the advantages of PCT is that the materialized view itself does not have to be partitioned at all! To allow PCT, the materialized view must have at least one partitioned detail table, which may be partitioned by Range, List, Range-Hash or Range-List composite partitioning. The partition key must consist of a single column.

There are three techniques that can be used to define a materialized view so that it can be refreshed using PCT refresh. Each of these techniques provides a mechanism to associate a row in the materialized view with a specific table partition.

  • Join dependency expression: The materialized view includes a join between the partitioned table and another table using the partitioning key, and a column from the latter table is included in the SELECT clause of the materialized view. This technique is new in Oracle Database 10g, and several materialized views may automatically satisfy this condition.

  • Partition key: The partition key of the partitioned detail table is included in the SELECT clause of the materialized view.

  • Partition marker: A special column known as the partition marker, using the DBMS_MVIEW.PMARKER() function, is included in the SELECT clause of the materialized view.

We will illustrate how each of these techniques can be used.

Join Dependency Expression

Consider the MONTHLY_SALES_MV materialized view used in several examples in this chapter. This materialized view stores the total sales for each product by month. Suppose we used partition maintenance operations to load new data into the PURCHASES table at the end of each month and hence would like to refresh this materialized view using PCT refresh after data loads.

CREATE MATERIALIZED VIEW monthly_sales_mv
AS
SELECT t.month, t.year, p.product_id,
       SUM (ps.purchase_price) as sum_of_sales,
       COUNT (ps.purchase_price) as total_sales, COUNT(*)
FROM time t, product p, purchases ps
WHERE t.time_key = ps.time_key AND
      ps.product_id = p.product_id
GROUP BY t.month, t.year, p.product_id;

It so happens that this materialized view automatically supports PCT using a join dependency expression. To see why, recall that in the EASYDW schema, the PURCHASES table is partitioned by TIME_KEY. In this materialized view, the partitioned table, PURCHASES, is joined to the table, TIME, using the partitioning-key column, TIME_KEY. As long as the materialized view’s SELECT list now includes a column or expression from table, TIME, PCT will be supported on this materialized view. Here, the table TIME is said to be a join-dependent table and the column or expression from TIME, which is included in the materialized view, is known as a join dependency expression. The best way to determine if your materialized view supports PCT in this manner is to use the EXPLAIN_MVIEW procedure, discussed in section 7.4.

Now, suppose we had another materialized view, REGIONAL_SALES_MV, which had the total sales by region.

CREATE MATERIALIZED VIEW regional_sales_mv
AS
SELECT c.region, SUM(ps.purchase_price) as sum_of_sales,
       COUNT (ps.purchase_price) as total_sales
FROM customer c, purchases ps
WHERE c.customer_id = ps.customer_id
GROUP BY c.region;

In this materialized view, join dependency is not possible, because the joining keys (CUSTOMER_ID and PRODUCT_ID) are different from the partitioning key (TIME_KEY). In this case, you can use either the partition key or partition marker methods, described next.

Partition Key

The simplest way to modify the REGIONAL_SALES_MV to allow PCT refresh is to include in it the partition key of the PURCHASES table. The resulting materialized view, REGIONAL_SALES_PARTKEY_MV, will appear as follows:

CREATE MATERIALIZED VIEW regional_sales_partkey_mv
AS
SELECT c.region, ps.time_key,
       SUM (ps.purchase_price) as sum_of_sales,
       COUNT (ps.purchase_price) as total_sales
FROM customer c, purchases ps
WHERE c.customer_id = ps.customer_id
GROUP BY c.region, ps.time_key;

Note that this materialized view has the TIME_KEY column (shown underlined), in the SELECT and the GROUP BY clauses.

Hint

If the materialized view has a GROUP BY clause, the partition key must be included in both the SELECT and the GROUP BY clauses; otherwise, it is not a legal SQL query.

Let us compare the contents of this modified materialized view with the original one, by selecting from the two materialized views, as follows:

SELECT * FROM regional_sales_mv;

REGION          SUM_OF_SALES TOTAL_SALES
--------------- ------------ -----------
AmerNorthEast   1314778.48          5658
AmerMidWest     2731175.27         10849
AmerNorthEast   3752117.44         16370
...
7 rows selected.

SELECT * FROM regional_sales_partkey_mv;


REGION          TIME_KEY  SUM_OF_SALES TOTAL_SALES
--------------- --------- ------------ -----------
AMerNorthEast   04-JAN-03      4536.39         17
AMerNorthEast   05-JAN-03      2950.63         11
AMerNorthEast   06-JAN-03      4412.16         16
...
810 rows selected.

You will notice that for each region there is a row for every value of TIME_KEY. Thus, even though this materialized view is supposed to have total sales for each region, the inclusion of the TIME_KEY column has resulted in it having daily sales for each region. In other words, including the partition key has significantly increased the size of the materialized view from 7 rows to 810 rows!

Including a partition key is suitable provided the partition key has only a few distinct values, which means that including it will not have a significant impact on the size of the resulting materialized view. In our case, the partition key is TIME_KEY and so a more appropriate alternative may be to use a partition marker, discussed next.

Partition Marker

A partition marker is an alternative to using the partition key and results in a much smaller materialized view than one using a partition key. The following example shows the regional sales materialized view in the previous example, except that instead of the TIME_KEY, it includes a special function, DBMS_MVIEW.PMARKER.

CREATE MATERIALIZED VIEW regional_sales_marker_mv
REFRESH FORCE
AS
SELECT c.region, DBMS_MVIEW.PMARKER(ps.rowid) as pmark,
       SUM(ps.purchase_price) as sum_of_sales,
       COUNT (ps.purchase_price) as total_sales
FROM customer c, purchases ps
WHERE c.customer_id = ps.customer_id
GROUP BY c.region, DBMS_MVIEW.PMARKER(ps.rowid);

Let us now select the data from this materialized view as follows:

select * from regional_sales_marker_mv;
REGION               PMARK SUM_OF_SALES TOTAL_SALES
--------------- ---------- ------------ -----------
AmerWest             50539    101431.57         431
AmerWest             50540     80895.72         414
AmerWest             50541    121121.93         465
...
168 rows selected.

The partition marker function produces a different value for each partition, but all rows in the same partition get the same value. Thus, by including the partition marker in the SELECT (and GROUP BY clause), for each region you will get at most as many rows as the number of partitions. In the EASYDW data warehouse, we have two years of data, partitioned by month, and so we have 24 partitions. Thus, for each region, we will get 24 rows. Therefore, while the original REGIONAL_SALES_MV had seven rows (for seven regions), the modified REGIONAL_SALES_MARKER_MV will now have 7 * 24 = 168 rows. This is still preferable to the 810 rows with the partition key.

To summarize, first check if your materialized view already supports PCT using a join dependency expression. If this is not the case, include the partition key if there are few distinct values in each partition. If there are many distinct values for the partition key, such as time_key, use a partition marker instead, to keep the materialized view to a reasonable size.

PCT can only track changes done to a partitioned table if the materialized view satisfies the conditions (using one of the previous techniques) for that specific table. In other words, if the materialized view had multiple partitioned tables, for each of those tables you must have the partition key, partition marker, or a join dependency expression in the materialized view. You could use different techniques for each table—for example, if you had two partitioned tables, CUSTOMER and PURCHASES, you could use partition marker for one and partition key for the other, as appropriate. Note that if changes have occurred to any nonpartitioned tables or partitioned tables not enabled for PCT, you will not be able to perform a PCT refresh.

Refresh Performance

Good performance of materialized view refresh is extremely crucial because of the limited maintenance windows available in data warehouses today. Here are some tips for getting the best performance out of your refresh procedures.

Optimizer Statistics

In order to get good refresh performance, it is important to have accurate statistics on the data. The package DBMS_STATS should be used to gather optimizer statistics on the detail tables after each data load and prior to creating or refreshing the materialized views. It is also important to gather statistics on the materialized views themselves—both upon creation and after refresh. This will ensure that the optimizer has accurate statistics during refresh and when optimizing queries that are rewritten to use materialized views.

Set the Parameter atomic_refresh to FALSE

We mentioned that DBMS_MVIEW refresh procedures have a parameter known as atomic_refresh. If this parameter is set to TRUE (which is the default), all operations during refresh are performed as a single atomic transaction. This means that in case of a failure, the entire refresh operation will be rolled back. However, to achieve this, the refresh operation has to disable use of some operations, such as parallel execution and bulk inserts, which are critical to refresh performance in a data warehouse. Therefore, we recommend you use the atomic_refresh setting of FALSE. In the worst case of a failure, you may have to refresh the particular materialized view with a complete refresh; however, refresh performance will be much better in general.

Another benefit of setting atomic_refresh to FALSE is that when doing a complete refresh, Oracle Database 10g will automatically disable all indexes on the materialized view prior to refresh and rebuild them after the refresh. This is much faster than maintaining the indexes during refresh.

Use PCT Refresh and Partitioned Materialized Views

It can be advantageous to enable PCT on your materialized views so that you can avoid a complete refresh in many situations. Although PCT does not require that the materialized view be partitioned, if the materialized view is partitioned on the partition key or the join dependency expression used to enable PCT, this can further improve PCT refresh performance. This is because Oracle can use some optimizations during refresh. Note that to take advantage of these optimizations, the atomic_refresh parameter must be set to FALSE.

Use Parallel Execution

Using parallel execution during refresh can significantly improve performance. To do so, you must include the PARALLEL clause when creating the materialized view and enable parallel DML (as discussed in Chapter 6) in the session performing refresh.

Build and Refresh Multiple Materialized Views Using REFRESH_ALL

In Oracle Database 10g, significant enhancements have been made to allow refresh of multiple materialized views simultaneously. If you specify multiple materialized views in the DBMS_MVIEW procedures and use the atomic_refresh parameter setting of FALSE, Oracle can refresh multiple materialized views concurrently. Note that to do so, the initialization parameter JOB_ QUEUE_PROCESSES must be set to a nonzero value, indicating how many refreshes can proceed simultaneously. When refreshing multiple materialized views, Oracle Database 10g will automatically determine the best order to refresh the materialized views. It will take into account dependencies among the materialized views, the degree of parallelism needed for each refresh, and the number of available job processes, so as to provide the best refresh performance.

To take advantage of this feature, when you first create materialized views, use the BUILD DEFERRED option rather than BUILD IMMEDIATE. You can then use REFRESH_ALL to concurrently populate all the materialized views. Similarly, when refreshing materialized views in the data warehouse, use the DBMS_MVIEW.REFRESH_ALL or DBMS_MVIEW.REFRESH_DEPENDENT procedures.

When refreshing multiple materialized views, you can see the status of jobs using the DBA_JOBS dictionary view, as shown in the following example:

SELECT what job_description, this_date, total_time
FROM dba_jobs
WHERE what like '%REFRESH%';

JOB_DESCRIPTION                THIS_DATE TOTAL_TIME
------------------------------ --------- ----------
BEGIN DBMS_SNAPSHOT.REFRESH_MV 12-JUN-04       1725
 ('MV_RF$JPIPE_007F01880001',2
,'EASYDW','CUSTOMER_PURCHASES_
MV2','?','',0,6,1); END;

Enable Query Rewrite during Refresh

Performance of complete refreshes and PCT refreshes can be significantly improved by using query rewrite during refresh. In Chapter 9, we will discuss query rewrite and also see how you can use it to speed up refresh.

Nested Materialized Views

All the materialized views we have seen so far are based on queries involving detail tables; however, it is also possible to create materialized views using queries involving other materialized views. Recall that a materialized view is a stored result and hence can be used in lieu of a table. A materialized view that is based on another materialized view is known as a nested materialized view. In the following example, the materialized view QUARTERLY_SALES_MV is created using the MONTHLY_CUST_SALES_MV.

CREATE MATERIALIZED VIEW monthly_cust_sales_mv
ENABLE QUERY REWRITE
AS
SELECT t.year, t.quarter, t.month, c.customer_id,
       SUM(f.purchase_price) AS dollar_sales
FROM time t, purchases f, customer c
WHERE f.time_key = t.time_key AND
      f.customer_id = c.customer_id
GROUP BY t.year, t.quarter, t.month, c.customer_id;

CREATE MATERIALIZED VIEW quarterly_sales_mv
ENABLE QUERY REWRITE
AS
SELECT m.year, m.quarter, SUM(m.dollar_sales) AS dollar_sales
FROM monthly_cust_sales_mv m
GROUP BY m.year, m.quarter;

Why Use Nested Materialized Views?

You can use nested materialized views as shown in our previous example to compute the aggregates at a higher level in a hierarchy from a lower level—for example, from month to quarter. Similarly, if several of your materialized views need to use the same join in their defining queries but different aggregates, you could separate the join into one materialized view and then build different aggregate materialized views using this common materialized view. This may provide you better performance for materialized view creation and complete refresh, because you only need to compute the join once.

Another reason to use nested materialized views is to improve your fast refresh capabilities. Often, there is some complex clause in the materialized view’s defining query that may make it not fast refreshable. In such cases, it may be possible to split the materialized view into two or more simpler materialized views, some of which are fast refreshable. The refresh performance of this partially fast refreshable group of materialized views may be much better than a complete refresh. For example, consider the following materialized view, SALES_MV1, that uses a subquery to find the customers who spent the most money in our store in January 2003.

CREATE MATERIALIZED VIEW sales_mv1
ENABLE QUERY REWRITE
AS
SELECT c.customer_id, SUM(f.purchase_price) AS dollar_sales
FROM purchases f, customer c
WHERE f.customer_id = c.customer_id
GROUP BY c.customer_id
HAVING SUM(f.purchase_price)
    IN (SELECT max(f.purchase_price) dollar_sales
        FROM purchases f, time t
        WHERE f.time_key = t.time_key
          AND t.month_name = 'January'
          AND t.year = 2003
        GROUP BY f.customer_id);

Ordinarily, this materialized view is not fast refreshable using either materialized view logs or using PCT, because of the subquery in the HAVING clause. Now, let us split this materialized view as follows into two fast refreshable materialized views.

CREATE MATERIALIZED VIEW sales_submv1
REFRESH FAST
ENABLE QUERY REWRITE
AS
SELECT c.customer_id, SUM(f.purchase_price) AS dollar_sales,
COUNT(f.purchase_price) as cnt_sales, COUNT(*) cstar
FROM purchases f, customer c
WHERE f.customer_id = c.customer_id
GROUP BY c.customer_id;

CREATE MATERIALIZED VIEW sales_submv2
REFRESH FAST
ENABLE QUERY REWRITE
AS
SELECT f.customer_id, max(f.purchase_price) max_sales
FROM purchases f, time t
WHERE f.time_key = t.time_key
  AND t.month_name = 'January' and t.year = 2003
GROUP BY f.customer_id;

The first, SALES_SUBMV1, computes the result of SALES_MV1 without including the subquery—in other words, the sales by customer. The second one, SALES_SUBMV2, computes the result of the subquery itself—in other words, the maximum amount spent by any customer in January 2003. These two materialized views are much simpler and are fast refreshable. We can now write SALES_MV1 in terms of SALES_SUBMV1 and SALES_SUBMV2, as follows:

CREATE MATERIALIZED VIEW sales_mv1
REFRESH COMPLETE
ENABLE QUERY REWRITE
AS
SELECT customer_id, dollar_sales
FROM sales_submv1
WHERE dollar_sales IN (SELECT max_sales FROM sales_submv2);

Note that the resulting materialized view still includes a subquery and cannot be fast refreshed directly; however, its complete refresh would be much faster.

This may sound very complicated, but there are two simple tools that will assist you in modifying your materialized view! The EXPLAIN_MVIEW utility, discussed section 7.4 will tell you if your materialized view is not fast refreshable for some reason. The TUNE_MVIEW utility, discussed in section 7.5 will automatically split your materialized view, if needed, into nested materialized views to maximize its fast refresh capabilities. Before we look at these tools, let us briefly discuss how to refresh nested materialized views.

Refreshing Nested Materialized Views

The DBMS_MVIEW refresh procedures can be used to refresh nested materialized views. When using the REFRESH procedure, if you set the parameter nested to TRUE, then any underlying materialized view is refreshed first before refreshing the nested materialized view. This parameter is new in Oracle Database 10g.

For instance, in the following example, referring to the nested materialized views created in the preceding section, Oracle will first refresh SALES_SUBMV1 and SALES_SUBMV2 before refreshing the nested materialized view, SALES_MV1.

execute dbms_mview.refresh('sales_mv1', nested=>TRUE);

Similarly, when using REFRESH_DEPENDENT, if nested is set to TRUE, Oracle will refresh all nested materialized views dependent on the table. For instance, if REFRESH_DEPENDENT is issued on the table TIME, as follows, Oracle will refresh SALES_SUBMV1, SALES_SUBMV2, and SALES_MV1. If nested had been set to FALSE, only SALES_SUBMV2 would be refreshed, since only this materialized view refers directly to the TIME table.

execute dbms_mview.refresh_dependent(:failures,'time',
nested=>TRUE);

If the procedure REFRESH_ALL is used, Oracle will automatically determine the dependencies among all the materialized views and refresh them in the correct order.

EXPLAIN_MVIEW Utility

As we have seen so far, in order to get the most benefit out of a materialized view, the materialized view must conform to some rules. For instance, you may need to create materialized view logs to make fast refresh possible or add a partition marker to allow Partition Change Tracking. The EXPLAIN_MVIEW interface is designed to help you determine what these rules are. Before creating the materialized view, you can run its defining query through the EXPLAIN_MVIEW utility. If a capability such as fast refresh is not possible, EXPLAIN_MVIEW will point out the offending construct or the missing columns. You now have the opportunity to fix these problems before expending precious resources to create the materialized view. EXPLAIN_MVIEW is invoked via the PL/SQL procedure DBMS_MVIEW.EXPLAIN_MVIEW or by using Oracle Enterprise Manager.

The DBMS_MVIEW.EXPLAIN_MVIEW PL/SQL procedure comes in two flavors:

  • You can provide the procedure, the name of an existing materialized view, and it will display the refresh and query rewrite capabilities of the materialized view.

  • You can provide the procedure with the SELECT statement that defines the materialized view and it will project its capabilities, if it has been created.

Prior to using this procedure, you must create the MV_CAPABILITIES_TABLE table in the current schema by running the utlxmv.sql script found in the ORACLE_HOME/rdbms/admin directory. The results of EXPLAIN_MVIEW will be placed in this table. EXPLAIN_MVIEW also has interfaces to return the results in a PL/SQL varray rather than in a table.

The output of EXPLAIN_MVIEW first lists all the capabilities of a materialized view and whether each one is possible (Y) or not (N). It then presents detailed information regarding each capability and the reason, if any, why it is not possible, as we will demonstrate in the next section.

Running EXPLAIN_MVIEW procedure

In the following example, we illustrate the first flavor of the EXPLAIN_MVIEW utility, which uses the SELECT statement of the materialized view, yet to be created. To illustrate how EXPLAIN_MVIEW can be used, we have deliberately introduced two problems with the materialized view logs—first, we have dropped the materialized view log on the PRODUCT table, and, second, we have removed the INCLUDING NEW VALUES clause from the materialized view log on the TIME table.

DROP MATERIALIZED VIEW LOG on PRODUCT;

ALTER MATERIALIZED VIEW LOG on TIME EXCLUDING NEW VALUES;
BEGIN
  dbms_mview.explain_mview (
   'SELECT t.month, t.year, p.product_id,
           SUM (f.purchase_price) as sum_of_sales,
           COUNT (f.purchase_price) as total_sales,
           COUNT(*) as cstar
   FROM time t, product p, purchases f
   WHERE t.time_key = f.time_key AND
         f.product_id = p.product_id
   GROUP BY t.month, t.year, p.product_id'),
END;
/

The following query shows the output of the procedure:

SELECT capability_name, possible p, related_text obj,
       msgtxt explanation
FROM MV_CAPABILITIES_TABLE;


CAPABILITY_NAME              P OBJ             EXPLANATION
------------------------------ --------------- -------------------
PCT                          Y
REFRESH_COMPLETE             Y
REFRESH_FAST                 Y
REWRITE                      Y

PCT_TABLE                    N TIME            relation is not a
                                               partitioned table

PCT_TABLE                    N PRODUCT         relation is not a
                                               partitioned table

PCT_TABLE                    Y PURCHASES

REFRESH_FAST_AFTER_INSERT    N EASYDW.PRODUCT  the detail table
                                               does not have a
                                               materialized view
                                               log

REFRESH_FAST_AFTER_INSERT    N EASYDW.TIME     mv log must have
                                               new values

REFRESH_FAST_AFTER_                            see the reason why
              ONETAB_DML     N                 REFRESH_FAST_
                                               AFTER_INSERT is
                                               disabled

REFRESH_FAST_AFTER_ANY_DML   N                 see the reason why
                                               REFRESH_FAST_
                                               AFTER_INSERT is
                                               disabled
REFRESH_FAST_PCT             Y
REWRITE_FULL_TEXT_MATCH      Y
REWRITE_PARTIAL_TEXT_MATCH   Y
REWRITE_GENERAL              Y
REWRITE_PCT                  N                 general rewrite is
                                               not possible or PCT
                                               is not possible on
                                               any of the detail
                                               tables

PCT_TABLE_REWRITE            N TIME            relation is not a
                                               partitioned table

PCT_TABLE_REWRITE            N PRODUCT         relation is not a
                                               partitioned table

PCT_TABLE_REWRITE            N PURCHASES       PCT is enabled
                                               through a join
                                               dependency

From this output, we can see that the PCT_TABLE capability for the PURCHASES table is possible because of a join dependency (because of the column month from the TIME table, which is joined to PURCHASES using the partition key). The TIME table is not partitioned, and hence PCT refresh is not possible if this table has been updated.

Fast refresh may be possible in some situations and not others. Hence, the fast refresh capabilities are presented at three levels, so if the first one is not possible, the remaining ones are also not possible and so on:

  • REFRESH_AFTER_INSERT (only inserts were done to the tables)

  • REFRESH_AFTER_ONETABDML (only one table is modified at a time)

  • REFRESH_AFTER_ANY_DML (there is no restriction on the type of DML)

The output of EXPLAIN_MVIEW indicates that the materialized view log on the TIME table was missing the INCLUDING NEW VALUES clause, and hence fast refresh using materialized view logs is not possible.

EXPLAIN_MVIEW also indicates if query rewrite has been enabled and if it is possible in general or only using text match mode.

Hint

To find out the detailed reasons why a particular query did not rewrite using a certain materialized view, use the EXPLAIN_REWRITE utility, which is discussed in Chapter 9.

Let us now correct the problems with the materialized view logs and rerun the utility. You can see that fast refresh is now possible, as indicated by REFRESH_FAST_AFTER_INSERT Y.

CREATE MATERIALIZED VIEW LOG on product
       WITH ROWID, SEQUENCE (product_id)
INCLUDING NEW VALUES;

ALTER MATERIALIZED VIEW LOG on TIME including new values;

BEGIN
  dbms_mview.explain_mview (
   'SELECT t.month, t.year, p.product_id,
           SUM (f.purchase_price) as sum_of_sales,
           COUNT (f.purchase_price) as total_sales,
           COUNT(*) as cstar
   FROM time t, product p, purchases f
   WHERE t.time_key = f.time_key AND
         f.product_id = p.product_id
   GROUP BY t.month, t.year, p.product_id'),
END;
/

SELECT capability_name, possible
FROM MV_CAPABILITIES_TABLE
WHERE capability_name LIKE 'REFRESH%';

CAPABILITY_                    P
------------------------------ -
REFRESH_COMPLETE               Y
REFRESH_FAST                   Y
REFRESH_FAST_AFTER_INSERT      Y
REFRESH_FAST_AFTER_ONETAB_DML  Y
REFRESH_FAST_AFTER_ANY_DML     Y
REFRESH_FAST_PCT               Y

If you wanted to look at the capabilities of an existing materialized view—for example, MONTHLY_SALES_MV—you would use EXPLAIN_MVIEW, as follows:

EXECUTE dbms_mview.explain_mview('EASYDW.MONTHLY_SALES_MV'),

The output will be similar to that shown in the previous example.

Using Oracle Enterprise Manager to run EXPLAIN_MVIEW

The CREATE MATERIALIZED VIEW screen in Oracle Enterprise Manager (Figure 7.3) has an Explain button, which can be clicked to run EXPLAIN_MVIEW. This allows you to immediately identify and correct any problems with the materialized view. Figure 7.8 is the screen showing the output of EXPLAIN_MVIEW.

EXPLAIN_MVIEW in Oracle Enterprise Manager

Figure 7.8. EXPLAIN_MVIEW in Oracle Enterprise Manager

You can also issue EXPLAIN_MVIEW on an existing materialized view by using the Explain button shown in Figure 7.4.

The EXPLAIN_MVIEW utility will identify the capabilities of a materialized view and point out which ones are missing. The TUNE_MVIEW tool, which is discussed next, can then be used to fix the materialized view to enhance its capabilities.

TUNE_MVIEW Utility

The TUNE_MVIEW utility is new in Oracle Database 10g and is available using the DBMS_ADVISOR.TUNE_MVIEW procedure. (There is no graphical interface for this utility at this time.) You supply to it the SQL for the materialized view to be created and it will fix the defining query of the materialized view so that it is fast refreshable and supports as many types of query rewrite as possible. In order to achieve fast refresh, the materialized view may be modified to include additional columns or decomposed into several simpler fast refreshable nested materialized views (see section 7.3.6). Additionally, TUNE_MVIEW will also recommend materialized view logs if they are necessary for fast refresh.

To illustrate this utility, consider the following materialized view statement:

CREATE MATERIALIZED VIEW monthly_cat_sales_mv
REFRESH FAST ON DEMAND
ENABLE QUERY REWRITE
AS
SELECT distinct p.category, t.month
FROM product p, purchases ps, time t
WHERE ps.product_id = p.product_id
  AND ps.time_key = t.time_key;

If we run EXPLAIN_MVIEW on it, we will see that the DISTINCT keyword has made this materialized view not fast refreshable. It may not be obvious to you how to modify this materialized view to make it fast refreshable.

BEGIN
  dbms_mview.explain_mview (
   SELECT distinct p.category, t.month
   FROM product p, purchases ps, time t
   WHERE ps.product_id = p.product_id
     AND ps.time_key = t.time_key;
   );
END;
/

SELECT capability_name, possible p, msgtxt explanation
FROM MV_CAPABILITIES_TABLE
WHERE CAPABILITY_NAME LIKE 'REFRESH%';

CAPABILITY_NAME                P   EXPLANATION
------------------------------ -   -------------------------
REFRESH_COMPLETE               Y
REFRESH_FAST                   N
REFRESH_FAST_AFTER_INSERT      N   DISTINCT clause in select
                                   list in mv

REFRESH_FAST_AFTER_INSERT      N   DISTINCT clause in select
                                   list in mv

REFRESH_FAST_AFTER_INSERT      N   one or more joins present
                                   in mv

REFRESH_FAST_AFTER_ONETAB_DML  N   see the reason why REFRES
                                   H_FAST_AFTER_INSERT is
                                   disabled

REFRESH_FAST_AFTER_ANY_DML     N   see the reason why REFRES
                                   H_FAST_AFTER_ONETAB_DML
                                   is disabled

REFRESH_FAST_PCT               N   see the reason why REFRES
                                     H_FAST_AFTER_INSERT is
                                     disabled

Now, let us pass this materialized view to TUNE_MVIEW. To indicate to TUNE_MVIEW that we would like to optimize the materialized view for fast refresh and/or query rewrite, the CREATE MATERIALIZED VIEW statement must include the REFRESH FAST and/or ENABLE QUERY REWRITE clauses, respectively. The results of the analysis are saved under a user-specified taskname identifier and will consist of the SQL for the modified materialized view(s) and any required materialized view logs. If you do not specify REFRESH FAST, then TUNE_MVIEW will not recommend materialized view logs.

DECLARE
  taskname varchar2(20);
BEGIN
  taskname := 'MY_TUNE_MVIEW_TASK';

 dbms_advisor.tune_mview(taskname,
   'CREATE MATERIALIZED VIEW monthly_sales_mv
    REFRESH FAST ON DEMAND
    ENABLE QUERY REWRITE
    AS
    SELECT distinct p.category, t.month
    FROM product p, purchases ps, time t
    WHERE ps.product_id = p.product_id
      AND ps.time_key = t.time_key'),
END;
/

We can now generate a SQL script for this analysis, which can later be executed to create the materialized view. To generate this script, you must specify a database DIRECTORY object where the script will be placed and you must have been granted write privileges to that directory, as shown in the following example:

CREATE DIRECTORY TUNE_RESULTS AS '/oracle/scripts';

BEGIN
 DBMS_ADVISOR.CREATE_FILE
 (DBMS_ADVISOR.GET_TASK_SCRIPT('MY_TUNE_MVIEW_TASK'),
                               'TUNE_RESULTS',
                               'mv_create.sql'),
END;
/

An excerpt from the script is shown below:

Rem SQL Access Advisor: Version 10.1.0.1 - Production
Rem
Rem Username:        EASYDW
Rem Task:            MY_TUNE_MVIEW_TASK
Rem Execution date:
Rem

set feedback 1
set linesize 80
set trimspool on
set tab off
set pagesize 60

whenever sqlerror CONTINUE

ALTER MATERIALIZED VIEW LOG FORCE ON
    "EASYDW"."PRODUCT"
    ADD ROWID, SEQUENCE("PRODUCT_ID","CATEGORY")
    INCLUDING NEW VALUES;

CREATE MATERIALIZED VIEW EASYDW.MONTHLY_SALES_MV
REFRESH FAST WITH ROWID
ENABLE QUERY REWRITE
AS
SELECT EASYDW.PRODUCT.CATEGORY C1, EASYDW.TIME.MONTH C2,
       COUNT(*) M1 FROM EASYDW.PURCHASES,
       EASYDW.PRODUCT, EASYDW.TIME
WHERE EASYDW.PRODUCT.PRODUCT_ID = EASYDW.PURCHASES.PRODUCT_ID
  AND EASYDW.TIME.TIME_KEY = EASYDW.PURCHASES.TIME_KEY
GROUP BY EASYDW.PRODUCT.CATEGORY, EASYDW.TIME.MONTH;
...

We can see that the utility has modified the materialized view definition to convert the DISTINCT keyword to an equivalent GROUP BY clause (underlined) and has added a COUNT(*) aggregate, which makes it fast refreshable. It has also recommended the necessary changes to the materialized view logs.

Because TUNE_MVIEW may create several materialized views to achieve fast refresh of the given materialized view, it is advisable to also generate an UNDO script from TUNE_MVIEW, which can be used to remove the materialized views as a group, when no longer needed.

BEGIN
 DBMS_ADVISOR.CREATE_FILE
 (DBMS_ADVISOR.GET_TASK_SCRIPT('MY_TUNE_MVIEW_TASK', 'UNDO'),
                               'TUNE_RESULTS',
                               'mv_undo.sql'),
END;
/

With TUNE_MVIEW, a novice materialized view user can now very easily take advantage of all the powerful and even the most advanced features of materialized views without having to deal with the complexity.

Summary

In this chapter, we discussed how materialized views could be used to precompute the results of frequently asked queries. Summary management in Oracle provides a complete framework to manage materialized views. Fast refresh and Partition Change Tracking allows materialized views to be kept up-to-date when your warehouse gets loaded. Tools such as EXPLAIN_MVIEW and TUNE_MVIEW make it extremely simple to create and use materialized views.

In the next two chapters, we will discuss two more key components of summary management—dimensions and query rewrite.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset