Task 1 – copy/import data from SQL Server to a blob storage file using data factory

Let's create a data factory pipeline with a task that copies the data from SQL Server to a blob storage file. From the Dashboard, click on the data factory:

ADFV2Book data factory

After clicking on the data factory, the following window opens. Click on the + to create a new factory resource and choose Pipeline:

Configure the new pipeline (name and description):

Expand the Dataflow section:

Drag and drop the Copy task into the window, and change its name:
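Behind the scenes, the pipeline and its activities are stored as JSON definitions. As a rough sketch (not taken from this walkthrough's screens), a pipeline with a single Copy activity looks something like the following; the pipeline name and the two dataset reference names are placeholders that we will fill in over the next steps:

    {
        "name": "CopyPurchaseDataPipeline",
        "properties": {
            "description": "Copy purchase data from SQL Server to blob storage",
            "activities": [
                {
                    "name": "CopyPurchaseData",
                    "type": "Copy",
                    "inputs": [
                        { "referenceName": "SQLServerPurchaseView", "type": "DatasetReference" }
                    ],
                    "outputs": [
                        { "referenceName": "PurchaseBlobDataset", "type": "DatasetReference" }
                    ],
                    "typeProperties": {
                        "source": { "type": "SqlSource" },
                        "sink": { "type": "BlobSink" }
                    }
                }
            ]
        }
    }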

In the Source tab, click on New:

In Select a Data Store, type sql in the search box, choose your source type, and click Finish.

 We are using an Azure SQL Server instance in this example.

In the Connection tab, select the relevant SQL Database connection in Linked service (note that the connection has already been defined in the Connections section previously):
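For reference, that linked service is itself stored as a JSON definition. A minimal sketch for an Azure SQL Database connection, with a placeholder name and placeholder connection details (the exact property layout can differ slightly between Data Factory versions):

    {
        "name": "AzureSqlDatabaseLinkedService",
        "properties": {
            "type": "AzureSqlDatabase",
            "typeProperties": {
                "connectionString": {
                    "type": "SecureString",
                    "value": "Server=tcp:<server>.database.windows.net,1433;Database=<database>;User ID=<user>;Password=<password>;Encrypt=True"
                }
            }
        }
    }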

Then select the table or view (in our case, we select the ADFV2Book.Purchase view):

You can preview the data to ensure all is good by clicking Preview data. Note that a new SQLServerPurchaseView dataset has been created:
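That SQLServerPurchaseView dataset is also just JSON under the hood. A sketch of roughly what it contains, assuming the placeholder linked service name from the previous sketch:

    {
        "name": "SQLServerPurchaseView",
        "properties": {
            "linkedServiceName": {
                "referenceName": "AzureSqlDatabaseLinkedService",
                "type": "LinkedServiceReference"
            },
            "type": "AzureSqlTable",
            "typeProperties": {
                "tableName": "[ADFV2Book].[Purchase]"
            }
        }
    }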

Back in the pipeline, you can switch the source from the dataset's table to a query or a stored procedure:
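In JSON terms, switching to a query simply adds a sqlReaderQuery property to the Copy activity's source (a stored procedure would use sqlReaderStoredProcedureName instead); for example:

    "source": {
        "type": "SqlSource",
        "sqlReaderQuery": "SELECT * FROM ADFV2Book.Purchase"
    }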

In the Sink tab, choose New for the sink dataset and search for blob. Choose Azure Blob Storage:

We modify the name of the new Azure Blob Storage dataset we will be using.

In the Connection section, we create a New Linked Service pointing to the blob storage account that we will export the data to:

Enter the details in the New Linked Service window, starting with the name:

For the Account selection method, you can choose From Azure subscription; the Azure subscription details will be populated automatically. Also select the Storage account name (of the blob storage account you already have):

Click Test connection at the bottom, and if it's successful, click on Finish:
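The resulting linked service definition is, again, a JSON document. A sketch with a placeholder name and placeholder account details (depending on the Data Factory version, the type may appear as AzureStorage or AzureBlobStorage):

    {
        "name": "AzureBlobStorageLinkedService",
        "properties": {
            "type": "AzureStorage",
            "typeProperties": {
                "connectionString": {
                    "type": "SecureString",
                    "value": "DefaultEndpointsProtocol=https;AccountName=<storage account name>;AccountKey=<account key>"
                }
            }
        }
    }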

In the file path, you can either enter a hardcoded name or use Dynamic content (Alt + P) to create a dynamic filename, for example, one with a date extension.
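As an illustration, a date-stamped filename can be built with the Data Factory expression language; the PurchaseData_ prefix here is purely a placeholder of our own:

    "fileName": {
        "value": "@concat('PurchaseData_', formatDateTime(utcnow(), 'yyyyMMdd'), '.csv')",
        "type": "Expression"
    }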

Note that if you do not state a hardcoded filename, the name will be [DBSchemaName].[DBTableName].txt.

The square brackets ([]) may cause issues in some components, so we strongly recommend that you use filenames without them. In our example, the filename will be hardcoded as ADFVV2BookPurchaseData.csv:

Scroll down and choose the file format and properties. A character that appears in the column values themselves cannot be used as the delimiter. In our case, some string columns contain commas (,), so we have chosen Tab (\t) as the delimiter.
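Putting the sink settings together, the blob dataset's JSON ends up looking roughly like the following; the folder path is a placeholder, while the filename and the tab delimiter match the choices we made above:

    {
        "name": "PurchaseBlobDataset",
        "properties": {
            "linkedServiceName": {
                "referenceName": "AzureBlobStorageLinkedService",
                "type": "LinkedServiceReference"
            },
            "type": "AzureBlob",
            "typeProperties": {
                "folderPath": "purchase-data",
                "fileName": "ADFVV2BookPurchaseData.csv",
                "format": {
                    "type": "TextFormat",
                    "columnDelimiter": "\t",
                    "firstRowAsHeader": true
                }
            }
        }
    }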

In the Schema tab, click on the Import Schema button; you can modify the column names here, but you do not need to, as we will be doing this in the pipeline task.

It is extremely important to ensure that the number of columns here matches the number of columns in the SQL Server view or table in the Source (otherwise, the task may fail with an error about the number of columns not matching).

When done, click Publish All to save the changes:

Reopen the pipeline and open the Mapping tab; you should see the column names from the view. If not, you can reimport the schemas or add them manually:
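In the underlying JSON, this mapping is stored as a translator on the Copy activity. A sketch with purely illustrative column names (yours will come from the ADFV2Book.Purchase view):

    "translator": {
        "type": "TabularTranslator",
        "columnMappings": "PurchaseID: PurchaseID, CustomerName: CustomerName, PurchaseAmount: PurchaseAmount"
    }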

If required (that is, if there are unpublished changes), select Publish All again. After publishing all changes, click on Validate to validate the task. This should be successful!

To run the task, click on Trigger and choose Trigger Now. Once it has executed successfully, go back to the blob storage account and click on the container.
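Trigger Now runs the pipeline once, on demand. If you later want the copy to run on a schedule instead, you can attach a schedule trigger to the pipeline. A rough sketch of such a trigger definition, with an example name, start time, and recurrence of our own choosing, referencing the placeholder pipeline name from the earlier sketch:

    {
        "name": "DailyCopyTrigger",
        "properties": {
            "type": "ScheduleTrigger",
            "typeProperties": {
                "recurrence": {
                    "frequency": "Day",
                    "interval": 1,
                    "startTime": "2018-01-01T00:00:00Z"
                }
            },
            "pipelines": [
                {
                    "pipelineReference": {
                        "referenceName": "CopyPurchaseDataPipeline",
                        "type": "PipelineReference"
                    }
                }
            ]
        }
    }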

Purchase-data container

You should see your file in there (above it, you can also see what the filename format would look like if you did not specify a filename):

We double-click on the file and copy the URL somewhere, as we will need to use it later when writing the U-SQL:

We have created a data factory pipeline with a copy task, where the source is a view that was created in an SQL Server database. The destination for the data copy is a file in a blob storage folder.

The next step will be to add a U-SQL task that manipulates the data in this newly created file, summarizing it and copying the results to a new file in the blob storage.
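As a preview of that step, a U-SQL script is attached to the pipeline as a Data Lake Analytics U-SQL activity. A rough sketch of its JSON shape; all names and paths here are placeholders, and the actual script is covered in the next task:

    {
        "name": "SummarizePurchaseData",
        "type": "DataLakeAnalyticsU-SQL",
        "linkedServiceName": {
            "referenceName": "AzureDataLakeAnalyticsLinkedService",
            "type": "LinkedServiceReference"
        },
        "typeProperties": {
            "scriptPath": "scripts/SummarizePurchases.usql",
            "scriptLinkedService": {
                "referenceName": "AzureBlobStorageLinkedService",
                "type": "LinkedServiceReference"
            }
        }
    }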
