The Nature of the Beast

So just how do you decide if you're working on a true data warehouse? First, examine the intended nature of your database and the application it supports. For each subject area in your data warehouse, simply ask your sponsoring business user to provide the following eight items:

  • Mission statement

  • Number of ad-hoc query users

  • Number ad-hoc queries per day per ad-hoc user

  • Number of pre-canned report users

  • Number of pre-canned reports per day per pre-canned user

  • Number of pre-canned reports

  • Amount of history to keep in months, quarters, or years

  • Typical daily, weekly, or monthly volume of data to record

These answers should help you categorize your database application into one of the following choices:

  • Online transaction processing (OLTP)

  • Operational data store (ODS)

  • Online analytical processing (OLAP)

  • Data mart/data warehouse (DM/DW)

Use the criteria outlined in Table 1-1 to make your distinction.

Table 1-1. General Database Application Categorizations
 OLTPODSOLAPDM / DW
Business FocusOperationalOperational / TacticalTacticalTactical / Strategic
End User ToolsClient/Server or WebClient/Server or WebClient/ServerClient/Server or Web
DB TechnologyRelationalRelationalCubicRelational
Transaction CountLargeMediumSmallSmall
Transaction SizeSmallMediumMediumLarge
Transaction TimeShortMediumMediumLong
DB Size in GB10–400100–800100–800800—80,000
Data ModelingTraditional ERDTraditional ERDN/ADimensional
Normalization3–5 NF[1]3 NFN/A0 NF

[1] Normal Form

For example, suppose your answers are as follows:

  • “The point of sale (POS) subject area of the data warehouse should enable executives and senior sales managers to perform predictive, “what-if” sales analysis and historical analysis of:

    • A sales campaign's effectiveness

    • Geographic sales patterns

    • Calendar sales patterns

    • The effects of weather on sales

  • 20 ad-hoc query users

  • 10–20 ad-hoc queries a day per ad-hoc user

  • 40 pre-canned report users

  • 1–4 pre-canned reports a day per pre-canned user

  • 60 months of history

  • 40 million sales transactions per day

From this example, we can discern that we genuinely have a candidate for a data mart or data warehouse. First, the mission statement clearly indicates that our users' requirements are of a more tactical or strategic nature. Second, the majority of our report executions will clearly be ad-hoc (200–400 ad-hoc versus a maximum of 160 pre-canned). Third, we have significant historical data requirements and large amounts of raw data—and thus a potentially very large database (especially once we consider aggregates as well).

While it may seem like I've painted an example tailored to the conclusion, I've actually found the process to be this straightforward and easy in most cases. Unfortunately, these days, people tend to call any reporting database a data warehouse. It's okay for people to call their projects whatever they like, but as I pointed out, the techniques in this book only apply to the DM/DW column of Table 1-1.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset