Chapter 40

Creating a Master Package

The master package uses the Execute Package Task to run child packages. This process is covered in Lesson 46. Using a master package offers several advantages.

You must execute child packages in the proper order because some packages can depend on the success, failure, or completion of other packages. To do this, you can use the precedence constraints, invoking functions if you want (precedence constraints are covered in Lesson 9). You can use the master package to encapsulate multiple child packages for transactional consistency. Parameters and variables from the master package can be controlled globally and shared with the child packages. As you move the batch of packages from one environment to another, you can restrict your parameter and variable issues to the master package. (Environments are covered in Lesson 54 and deployment is covered in Lesson 53.) Master packages also enable you to control package parallelism, reducing processing time in the batch window. You can do this by setting the MaxConcurrentExecutables property of the parent package. The default value is –1, which means the number of logical or physical processors +2.

As you can see from the number of cross-references to other lessons you’ve already encountered in this lesson, when you are working on a master package, you will use many of the skills you have learned in those other lessons. However, the concept of a master package is quite simple. Think about a data warehouse example. You will have packages that load each dimension, a package to load each fact, and another package that processes each cube in Analysis Services. Part of designing a master package is an exercise is parallelism and dependency.

In the example here and in the Try It that follows later, we have two cubes, each with a single fact and two dimensions. The following bullets indicate which dimensions and facts are used by each cube:

  • Sales Cube
  • Product Dim
  • Customer Dim
  • Sales Fact
  • Customer Support Cube
  • Employee Dim
  • Customer Dim
  • Employee Customer Calls Fact

Take a look at Figure 40-1. It contains three dimension loads, two fact loads, and two cube builds in the master package. In this master package, each item is processed, one after the other.

It does implement some required ordering, as described in the following list:

1. Load dimensions
2. Load facts
3. Process cubes

However, this design can fall short in a couple of ways.

  • Even though the tasks contained with each child package can run in parallel, only one package will run at a time. There is no parallelism at the package level. This can mean that the entire batch will run unnecessarily long, even when hardware resources like memory, disk throughput, and processor are available.
  • The second issue relates to unnecessary dependencies. If the Load Product Dim Task fails, neither cube will be processed. The Customer Support Cube will not be processed, even though it does not use the Product Dim.

However, a smarter design will take care of both issues.

Now take a look at Figure 40-2. It contains the same Execute Package Tasks. When the Product and Customer Dim are loaded successfully, the Product Sales Fact can be loaded. When the Product Sales Fact is loaded successfully, the Process Sales Cube can begin. Customer and Employee Dim must load prior to loading the Customer Calls Fact. Once the Customer Calls Fact loads successfully, the Process Customer Support Cube will begin. In this design, if the Product Dim load fails, the Customer Support Cube path can continue and process successfully. Parallelism is improved also, because all three dimensions can be processed in parallel.

The way you should think about this is to allow as much as possible to complete, even when things are failing. Only put dependencies in the master package when actual data dependencies exist for the objects contained within.

Try It

In this Try It, you create a master package like the one mentioned earlier in this lesson. You replace the Process Cube Tasks with an Execute Package Task that does not require SQL Server Analysis Services. Each one of the packages you call from your master package does nothing but display a message box with the package name. The child packages will not complete until you OK the message boxes they present. This enables you to control when packages complete and offers you the opportunity to closely watch how this works. The purpose of this Try It is to walk you through creating a master package and to give you an easy view of parallelism and package failure results. After you complete this, you will be able to create master packages and control ordering and parallelism of the child packages.

You can download the source packages for this lesson from www.wrox.com.

Lesson Requirements

The only requirement for this lesson is that you download all of the source packages for this lesson, which are available on the book’s website at www.wrox.com.

Hints

  • Use only the Execute Package Task.
  • Run and observe the package behavior.
  • Limit parallelism by setting the MaxConcurrentExecutables property.
  • Force a package failure and observe the effect of dependency constraints on the flow.
  • As each package runs, it stops and displays a message box with the package name. You must click OK in the message box for the package to continue.

Step-by-Step

1. Create a new SSIS package called Lesson40MasterParallel.dtsx in a new solution. Add the following packages that you downloaded to your new solution. There is no need to re-create these packages. Each package simply displays a message box.
  • Lesson40LoadProductDim.dtsx
  • Lesson40LoadCustomerDim.dtsx
  • Lesson40LoadEmployeeDim.dtsx
  • Lesson40LoadProductSalesFact.dtsx
  • Lesson40LoadEmployeeCustomerCallsFact.dtsx
  • Lesson40ProcessSalesCube.dtsx
  • Lesson40ProcessCustomerSupportCube.dtsx
2. Drag three Execute Package Tasks onto the Control Flow tab.
3. For the first Execute Package Task, on the General tab type Load Product Dim in the Name text box. Choose Package from the tree on the left and select Lesson40LoadProductDim.dtsx in the PackageNameFromProjectReference drop-down list.
4. For the second Execute Package Task, on the General tab, type Load Customer Dim in the Name text box. Choose Package from the tree on the left and select Lesson40LoadCustomerDim.dtsx in the PackageNameFromProjectReference drop-down list.
5. For the third Execute Package Task, on the General tab, type Load Employee Dim in the Name text box. Choose Package from the tree on the left and select Lesson40LoadEmployeeDim.dtsx in the PackageNameFromProjectReference drop-down list.
When complete, your Control Flow should look like Figure 40-3.
6. Drag two more Execute Package Tasks onto the Control Flow tab to represent the fact loads.
7. For the first fact Execute Package Task, on the General tab, type Load Sales Fact in the Name text box. Choose Package from the tree on the left and select Lesson40LoadSalesFact.dtsx in the PackageNameFromProjectReference drop-down list.
8. For the second fact Execute Package Task, on the General tab, type Load Employee Customer Calls Fact in the Name text box. Choose Package from the tree on the left and select Lesson40LoadEmployeeCustomerCallsFact.dtsx in the PackageNameFromProjectReference drop-down list.
9. Drag two more Execute Package Tasks onto the Control Flow tab to represent the cube processing.
10. For the first cube Execute Package Task, on the General tab, type Process Sales Cube in the Name text box. Choose Package from the tree on the left and select Lesson40ProcessSalesCube.dtsx in the PackageNameFromProjectReference drop-down list.
11. For the second cube Execute Package Task, on the General tab, type Process Customer Support Cube in the Name text box. Choose Package from the tree on the left and select Lesson40ProcessCustomerSupportCube.dtsx in the PackageNameFromProjectReference drop-down list.
12. Now set the precedence constraints:
  • Connect Load Product Dim to Load Sales Fact
  • Connect Load Customer Dim to Load Sales Fact
  • Connect Load Customer Dim to Load Employee Customer Calls Fact
  • Connect Load Employee Dim to Load Employee Customer Calls Fact
  • Connect Load Sales Fact to Process Sales Cube
  • Connect Load Employee Customer Calls Fact to Process Customer Support Cube
Your completed work flow should look like Figure 40-4.
13. Save and run the package. You will see all of the Load Dim packages run. This is because there is no prior dependency and MaxConcurrentExecutables is set to –1. Three message boxes will also be up, each with the name of the task to which it belongs. When you click OK in a message box, the associated task will complete and you can follow through the execution of items in the master package. You may have to Alt+Tab to bring the message boxes to the front. You may also have to move back to your master package to see it. Your initial view should look like Figure 40-5.
14. Click OK on the Load Product Dim and Load Employee Dim message boxes. Now go back to your master package. Product and Employee Dim should show a green check for successful completion, but neither of the fact loads have begun. This is because they both depend on success of the Load Customer Dim.
15. Click OK on the Load Customer Dim message box, and review the status of the master package. All load dims will be green and both load facts will be in-process.
16. Click OK on the Load Sales Fact message box. The Process Sales Cube will begin because all of its prior dependencies completed successfully. The Process Customer Support Cube is still waiting for the Load Employee Customer Calls Fact to complete.
17. Click OK on the Load Employee Customer Calls Fact message box, and the Process Customer Support Cube will begin. Now click each of the two Process Cube message boxes and the master package will complete successfully. Click at the bottom of the master package to return to design mode.
18. Next, you look at changing the parallelism of the master package. Click in a blank area of the Control Flow tab, and go to the Properties page of the Lesson40MasterParallel package. Set MaxConcurrentExecutables to 2.
19. Save and run the package. Now view the master package. Unlike the first run, you will see only two child packages running, as in Figure 40-6.
20. Now click through the message boxes. No more than two child packages will run at a time. It is very easy to change the parallelism of your entire batch in this manner. Then return to design mode by clicking at the bottom of the master package.
21. Set the MaxConcurrentExecutables property of the master package back to –1. Save the package.

NOTE Because your production environment is likely to be more robust than your development environment, you may want to change parallelism. The –1 value will fire off the number of logicalphysical processors +2. However, you can have multiple master packages running in parallel. If this is the case, it is a good idea to parameterize the MaxConcurrentExecutions property of the master package. This will enable you to tune parallelism quickly and easily.

22. Next, you take a look at failure issues. In the master package, select Load Product Dim and go to the Properties window. Set the ForceExecutionResult property to Failure. This causes the Load Product Dim Task to fail.
23. Save and run the master package. All three dimension loads will begin to run. Click OK on the message boxes for all three dimension loads. Load Product Dim will show failure and the other two will show success. Your results should look like Figure 40-7.
24. Click through all the message boxes and you’re done.

Keeping and using a master package ensures that other packages are run in the correct order. A properly configured master package allows as many tasks as possible to complete, even when some packages fail.


Please select Lesson 40 on the DVD, or online at www.wrox.com/go/ssis2012video, to view the video that accompanies this lesson.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset