Appendix A

SSIS Component Crib Notes

In this appendix, you find a list of the most commonly used tasks and transforms in SSIS with a description of when to use them. Reference these tables when you have a package to build in SSIS and you are not sure which SSIS component to use to perform the needed actions.

When to Use Control Flow Tasks

TasksWhen to Use
CDC Control TaskUse this when SQL Server’s Change Data Capture provides the input data. This task manages the date and Log Serial Number (LSN) range used to identify incoming rows. LSNs are just row numbers used to identify rows in the transaction log.
Data Flow TaskUse this task when you need to pass data from a source to a destination. The source and destination can be a flat file, an OLE DB Connection, or any other connections supported in the connection manager.
Execute Package TaskUse this task when you need to call another package from within a package. The package performing the call is the parent package. The called package is the child package. You can pass information from the parent package to the child package with configurations and parameters.
Execute Process TaskUse this task to call an executable. The executable can be a batch file or an application. This task can call applications to perform functions on the files in SSIS, such as compressing a file. This task is commonly used to call third-party programs like compression or FTP tools.
Execute SQL TaskUse this task to perform any T-SQL operation. The SQL can be saved directly in the task, in a file, or in a variable. This task is commonly used to call stored procedures.
File System TaskUse this task to manipulate files. This task can move, rename, copy, and delete files and directories. You can also change the attributes of a file. A common use is archiving files after loading them.
FTP TaskUse this task to send or receive a file via the FTP protocol. You must have a valid FTP connection to perform this task. This task is commonly used to receive files from an FTP host for loading in a database.
Message Queue TaskUse this task to send or receive messages to a message queue. You must have a valid MSMQ connection to perform this task.
Script TaskUse this task to perform complex tasks that are not available in SSIS. This task enables you to leverage the .NET Framework to perform just about any task. Checking for the existence of a file is common use of this task. Script Tasks can be coded in VB or C#.
Send Mail TaskUse this task to send e-mail via SMTP. You must have a valid SMTP server connection to use this task. You can use this task to send notification of the package information to recipients. You can also send files via the attachments on the e-mail.
Web Service TaskUse this task to call a web service. You need a valid web service URL to perform this task.
XML TaskUse this task to perform XML functions. This task can perform common XML tasks such as Diff, used to compare two XML files and find the differences.

When to Use Data Flow Transforms

TransformsWhen to Use
AggregateUse this transform to perform grouping and summing of data. This is similar to the “Group By” function in T-SQL.
AuditUse this transform to add a column to a Data Flow with package information. You can add items like the package name and username as a new column in the Data Flow.
CDC SourceUse this transform when using Change Data Capture to load data. It reads rows from a CDC change table. Rows read are identified by the CDC Control Task.
CDC SplitterUse this transform to divide a data stream from the CDC Source Task into streams for insert, update, and delete. This is similar to a conditional split, but works specifically with the CDC Source.
Conditional SplitUse this transform to divide data into different paths based on a boolean expression. You can use all the paths from the split or ignore some outputs.
Copy ColumnUse this transform to create a new column in the Data Flow that is an exact copy of another column.
Data ConversionUse this transform to convert data from one data type to another. For example, you can change Unicode to non-Unicode or change a string to an integer.
Derived ColumnUse this transform to create or replace a column in the Data Flow with a column created by an expression. You can combine columns or use functions like getdate() to create new data.
DQS CleansingUse this transform to run DQS cleansing projects in batch.
Export ColumnUse this transform to send a column in a Data Flow to a file. The data types can be DT_TEXT, DT_NTEXT, and DT_IMAGE.
Fuzzy GroupingUse this transform to group data together based on a percentage match. In this transform, the data does not have to be an exact match to be grouped together. You can control the percentage of matching needed to group the data.
Fuzzy LookupUse this transform to find matching data in a table. The data does not have to match exactly. You can control the percentage of matching needed to group the data.
Import ColumnUse this transform to import data from files into rows in a data set.
LookupUse this transform to compare data in a Data Flow to a table. This will find exact matches in the date and give you a match and no-match output from the transform.
MergeUse this transform to combine two sets of data similar to a Union All. This transform requires both inputs to be sorted.
Merge JoinUse this transform to combine two sets of data similar to a left outer join. This transform requires both inputs to be sorted.
MulticastUse this transform to clone the data set and send it to different locations. This transform does not alter the data.
OLE DB CommandUse this transform to send T-SQL commands to a database. You can use this to insert data into a table using the T-SQL Insert command.
Percentage SamplingUse this transform to select a percentage of the rows in a Data Flow. The rows are randomly selected. You can set a seed to select the same rows on every execution of the transform. The unselected rows will follow a different path in the Data Flow.
PivotUse this transform to convert normalized data to denormalized data. This transform changes the rows into columns.
Row CountUse this transform to write the row count in a Data Flow to a variable.
Row SamplingUse this transform to select a number of rows in the Data Flow. The number of rows is set in the transform. The unselected rows will follow a different path in the Data Flow.
Script ComponentUse this transform to perform complex transforms that are not available in SSIS. This transform enables you to leverage the .NET Framework to perform just about any transform.
Slowly Changing DimensionUse this transform to create a dimension load for a data warehouse. This is a wizard that will walk you through the decision-making process while setting up a dimensional load.
SortUse this transform to order the data by a column or more than one column. This is similar to an “order by” command in T-SQL.
Term ExtractionUse this transform to find words in a Data Flow and create an output with the words listed and a score.
Term LookupUse this transform to compare to data in a Data Flow and determine if a word exists in the data.
Union AllUse this transform to combine two sets of data on top of each other. This is similar to the “Union” command in T-SQL.
UnpivotUse this transform to convert denormalized data to normalized data. This transform changes the columns into rows.
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset