Chapter 61

Fact Table Load

A fact table is generally much easier to load than a dimension table. Typically, you only insert into the table and don’t delete or update rows. Additionally, the components you use for a fact table load are much simpler than the Slowly Changing Dimension Transform you used in the previous lesson.

In a fact table load, the source data coming in contains the natural keys (also known as alternate or business keys) for each of the dimension attributes associated with the fact. You want to replace the business key with the key used in the dimension table. You look up the business key in the dimension table and retrieve the surrogate key (the dimension table’s primary key). Then the fact is stored with its dimension keys.

You may want to add additional, derived columns to the fact table. For example, you may want to provide consumers with a Profit column in the fact table, but your source data only has Cost and SellPrice columns, which you will bring into the fact table. These two columns, Cost and SellPrice are enough to determine profit. In the Data Flow Task you would create a Derived Column Transform that applies a formula in the expression, creating the new Profit column.

Another common task is summarizing fact data. Perhaps you have a requirement for a fact that contains ProductID, Date, and SaleAmount. Your source data for this fact contains an additional column—CustomerID. You will need to add up all of the SaleAmounts for each product, for each date, and for all customers. You can do this using an Aggregate Transform. You would do a Group By ProductID and set the operation on Date to Max and SaleAmount to Sum. You could also satisfy this requirement by doing the grouping in your SQL Select statement that reads the source.

Try It

Now that you know the components that are involved in a fact table load, in this Try It you load one. After you complete this lesson, you’ll have a better understanding of how SSIS can help you load a fact table in your own data warehouse.

To load this warehouse fact table, you’ll need to retrieve the surrogate keys from the business key. You can download the completed Lesson61.dtsx and sample files for the lesson from www.wrox.com.

Lesson Requirements

Load a fact table called Lesson61FactFinance (Lesson61CreateTable.sql creates the table) in the AdventureWorksDW2012 database. The source data is a flat file called Lesson61Data.txt. As previously noted, you can download both Lesson61CreateTable.sql and Lesson61Data.txt, as well as a completed package of this lesson (Lesson61.dtsx) from the book’s website at www.wrox.com.

Hints

  • The source file is a tab-separated file with business keys and money values.
  • Keep in mind that a fact table package is a series of surrogate key lookups. You will have a series of five Lookup Transforms, where you look up the business key in the dimension and return the surrogate key value.
  • Use a Lookup Transform against the DimOrganization, DimScenario, DimDate, DimAccount, and DimDepartmentGroup dimension tables.

Step-by-Step

1. Run the Lesson61CreateTable.sql script to create the necessary table.
2. Create a new package in SSDT called Lesson61.dtsx.
3. Create a connection manager to AdventureWorksDW2012. Name it AdventureWorksDW2012.

NOTE Creating connection managers is first discussed in Lesson 6.

4. Create a Flat File Connection to the Lesson61Data.txt file that you downloaded from www.wrox.com. In the General page, set the name to Finance Extract and select the Column names in the first data row option. In the Advanced page, set the FullDateAlternateKey column’s data type to a database date. Set the OrganizationName, DepartmentGroupName, and ScenarioName columns to Unicode string. Set AccountCodeAlternateKey and ParentAccountCodeAlternateKey to four-byte signed integer. Set the Amount column to currency. Click OK to exit.
5. Create a Data Flow Task. In the task, drag a Flat File Source onto the design pane, and link it to the Flat File Connection Manager you just created.

NOTE Working with Data Flow Tasks and using sources are covered in Lessons 17 and 18, respectively.

6. Drag a new Lookup Transform onto the Data Flow design pane and link it to the Flat File Source. Name the Lookup Transform Organization. In the transform, select DimOrganization as your reference table in the AdventureWorksDW2012 database in the Connection page. In the Columns page, connect OrganizationName from the source to OrganizationName on the DimOrganization table. Check OrganizationKey, as shown in Figure 61-1.

NOTE Lookup Transforms are covered in detail in Lesson 24.

7. Now, repeat the same steps for the DimScenario, DimDate, DimAccount, and DimDepartmentGroup tables. For the DimScenario table, match ScenarioName columns and retrieve the ScenarioKey. For the DimDate table, match FullDateAlternateKey and retrieve DateKey. For DimDepartmentGroup, match DepartmentGroupName and retrieve DepartmentGroupKey. Finally, for DimAccount, you will look up based on two columns. Map AccountCodeAlternateKey from input to lookup and ParentAccountCodeAlternateKey from input to lookup and return AccountKey. Connect the lookup match output from each of the Lookup Transforms together in any order.
8. Connect the final Lookup Transform into a newly created OLE DB Destination. Configure the destination to load the Lesson61FactFinance table.

NOTE Loading information into destinations is discussed in Lesson 19.

9. Save and Run the package. The final result should look like Figure 61-2.

Please select Lesson 61 on the DVD, or online at www.wrox.com/go/ssis2012video, to view the video that accompanies this lesson.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset