Matching permitted (2/3)

Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

50 High-Function Business Intelligence in e-business

Example 2-18 Nullable columns or expressions in GROUP BY

CREATE TABLE S2 AS

(SELECT Period.year,

Product.id,

Fact.DeliveryCode,

GROUPING(Period.year) AS gpyear,

GROUPING(Product.id) AS gpprodid,

GROUPING(Fact.DeliveryCode) AS gpfactDC,

SUM(Fact.Quantity) AS Quantity,

SUM(Fact.Amount) AS Amount,

SUM(Fact.Quantity * Product.Price) AS QP_Amount

FROM Fact,

Product,

Period

WHERE Fact.prod_id = Product.id and

Fact.period_id = Period.id

GROUP BY rollup(Period.year,

Product.id,

Fact.DeliveryCode ))

DATA INITIALLY DEFERRED REFRESH DEFERRED

In this example, columns Period.Year, Product.id and Fact.DeliveryCode in

the GROUP BY are nullable. Therefore, the materialized view will have rows

with NULL values that may either be due to a rolled up aggregation

(sub-total), or due to NULL values from the base table itself. The GROUPING

identifier enables you to differentiate between these two cases. A value of one

in this column indicates that the row was the result of a sub-total from the

GROUP BY function, while a value of zero indicates otherwise. An example of

the results of using the GROUPING identifier is shown in “GROUPING,

GROUP BY and CUBE example” on page 138.

Attention: The assumption in the above scenarios is that the columns or

expressions in the GROUP BY list are defined as NOT NULL. In the event that

this is not true, that is, the GROUP BY list items are nullable, then the

materialized view

must have GROUPING identifiers defined as highlighted in

Example 2-18, in order for the DB2 optimizer to consider matching the user

query to the materialized view. The reason for this requirement is that the DB2

optimizer needs to distinguish between NULL values within a materialized

view that could either be due to a rolled up aggregation involving missing rows,

or due to actual NULL values in the base tables itself.

The GROUPING function

only needs to be defined for all nullable columns

when super aggregates (ROLLUP, CUBE and grouping sets) are involved.

Chapter 2. DB2 UDB’s materialized views 51

The query in Example 2-19 has a ROLLUP on fewer columns (only

Period.Year, and Product.id) than in the materialized view.

Example 2-19 Nullable columns or expressions in GROUP BY — user query

SELECT Period.year,

Product.id,

SUM(Fact.Quantity) AS Quantity,

SUM(Fact.Amount) AS Amount,

SUM(Fact.Quantity * Product.Price) AS QP_Amount

FROM Fact,

Product,

Period

WHERE Fact.prod_id = Product.id and

Fact.period_id = Period.id

GROUP BY rollup(Period.year,

Product.id)

Assuming this query matches the materialized view in Example 2-18 on

page 50, then the query would be rewritten as shown here in Example 2-20,

to filter out the unwanted rows.

Example 2-20 Rewritten query

SELECT year,

id,

Quantity,

Amount,

QP_Amount

FROM S2

WHERE gpfactDC = 1

The ‘gpfactDC = 1’ predicate in the rewritten query only selects those rows in the

materialized view returned by the GROUP BY function, and ignores those

relating to NULLs in the base table itself.

52 High-Function Business Intelligence in e-business

Figure 2-9 provides additional examples of matching conditions. It shows that the

GROUP BY expressions

must be derivable from the materialized view.

Figure 2-9 Matching GROUP BY and aggregate functions

3. Extra tables in the query:

DB2 is able to match user queries that contain more tables than those defined

in the materialized view, when the join predicates to the base tables can be

replaced by join predicates between the materialized view and the additional

tables.

Consider the materialized view shown in Example 2-21:

Example 2-21 Extra tables in the query materialized view

CREATE SUMMARY TABLE dba.trans_agg AS

(

SELECT ti.pgid, t.locid, t.acctid, t.status, YEAR(pdate) AS year,

MONTH(pdate) AS month, SUM(ti.amount) AS amount, COUNT(*) AS count

FROM transitem AS ti, trans AS t

WHERE ti.transid = t.transid

GROUP BY YEAR(pdate), MONTH(pdate), ti.pgid, t.locid, t.acctid, t.status

)

DATA INITIALLY DEFERRED REFRESH IMMEDIATE

A user query that looks like the one shown in Example 2-22 can be

considered matching the above materialized view even though it has an

additional location table

loc

included.

Example 2-22 Extra tables in the query — matching query

SELECT YEAR(pdate) AS year, loc.country, SUM(ti.amount) AS amount, COUNT(*)

AS count

FROM transitem AS ti, trans AS t, loc AS loc

WHERE ti.transid = t.transid AND t.locid = loc.locid

AND YEAR(pdate) BETWEEN 1990 and 1999

GROUP BY YEAR(pdate), loc.country

QUERY Materialized View

Valid ?

... GROUP BY store_id ... GROUP BY cust_id, store_id

... GROUP BY ((cust_age + 5) / 10) ... GROUP BY cust_age / 10

... CUBE(cust_id, store_id) ... GROUPING SETS (cust_id, store_id)

... GROUPING SETS (cust_id, store_id) ... CUBE(cust_id, store_id)

... AVG(sales) ... ... SUM(sales), COUNT(*) ...

... COUNT(DISTINCT cust_id) ... ... cust_id, COUNT(*) ....

Yes

Chapter 2. DB2 UDB’s materialized views 53

The

loc

is joined on the locid column to

trans

. The locid column is one of the

GROUP BY columns of the materialized view. DB2 can use this column to join

the relevant rows of the materialized view

after applying the YEAR predicate

with the

loc

table. The aggregated results can then be further consolidated by

grouping on the YEAR(pdate) and country.

4. Extra tables in the materialized view:

DB2 is able to match user queries against materialized views that have more

tables than defined in the query, in certain cases where referential integrity is

known to exist.

Consider the materialized view shown in Example 2-23.

Example 2-23 Extra tables in the materialized view

CREATE TABLE dba.PG_SALESSUM AS

(

SELECT l.lineid AS prodline, pg.pgid AS pgroup, loc.country, loc.state,

YEAR(pdate) AS year, MONTH(pdate) AS month, SUM(ti.amount) AS amount,

COUNT(*) AS count

FROM transitem AS ti, trans AS t, loc AS loc,

pgroup AS pg, prodline AS l

WHERE ti.transid = t.transid AND ti.pgid = pg.pgid

AND pg.lineid =l.lineid AND t.locid = loc.locid

GROUP BY loc.country, loc.state, year(pdate),month(pdate), l.lineid,

pg.pgid

)

DATA INITIALLY DEFERRED REFRESH IMMEDIATE

A user query as shown in Example 2-24 can be considered as matching the

above materialized view even though the materialized view has two more

tables

pgroup

and

prodline

than in the user query:

Example 2-24 Extra tables in the materialized view — matching query

SELECT YEAR(pdate) AS year, loc.country,

SUM(ti.amount) AS amount, COUNT(*) AS count

FROM transitem AS ti, trans AS t, loc AS loc

WHERE ti.transid = t.transid AND t.locid = loc.locid

AND YEAR(pdate) BETWEEN 1990 and 1999

GROUP BY YEAR(pdate), loc.country

54 High-Function Business Intelligence in e-business

The query in Example 2-24 references three tables, while the materialized

view in Example 2-23 has five. It would appear at first glance that these

additional tables

pgroup

and

prodline

would affect the result of the above

query, if that materialized view were used in the query rewrite. This would be

true unless DB2 was aware of referential integrity relationships being

involved. For instance, if the

pgroup

and

prodline

tables were related to the

other tables through referential integrity, it will not affect the number of rows in

the result. They could be considered as look-up tables that are merely adding

columns to the output.

Example 2-25 Informational and system-maintained referential integrity constraints

-- INFORMATIONAL REFERENTIAL INTEGRITY CONSTRAINT

CREATE TABLE transitem (

.....

pgid INT,

....

CONSTRAINT fk_pgid FOREIGN KEY (pgid) REFERENCES pgroup

ON DELETE CASCADE NOT ENFORCED

.....

-- SYSTEM-MAINTAINED REFERENTIAL INTEGRITY CONSTRAINT

CREATE TABLE pgroup (

.....

lineid INT,

....

CONSTRAINT fk_lineid FOREIGN KEY (lineid) REFERENCES prodline

ON DELETE CASCADE ENFORCED

System-maintained referential integrity (where the constraint attribute is

ENFORCED), as well as informational referential integrity constraints let the

query rewrite component know of the existence of referential integrity. In such

cases, the additional tables in the materialized view are guaranteed not to add

or remove rows in the result, and the query rewrite engine can proceed with

the materialized view matching optimization, and ignore these tables.

Important: Referential integrity may either be system maintained, or

provided as informational constraints (see the NOT ENFORCED constraint

attribute in Example 2-25). With informational referential integrity

constraints, the onus is on the DBA to guarantee the integrity of reference,

since DB2 makes no attempt to enforce referential integrity. Informational

referential integrity constraints help the DB2 optimizer make superior

decisions about matching user queries that have fewer tables than those

defined in the materialized view.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for Matching permitted (2/3)

Create new playlist

Sign In

Sign Up

Table of Contents for
Matching permitted (2/3)