How-to-guide: Bulk Insert to Azure SQL Server

Highlighted
Five Stars

How-to-guide: Bulk Insert to Azure SQL Server

If you are trying to import/insert a lot of data to Azure SQL Server (Not on-prem/SQL Server on the Cloud hosted on a VM) you will run into some issues. Especially if you are trying to copy data from one Azure SQL Server to another.

My Architecture is Azure Cloud, Talend Clound Data Integration and Remote Engine that is deployed on Virtual Machine in the same Datacentre/region as all my resources (as much as possible) in order to have good performances.

 

I will save you all the things that I tried, the main issues are:

  1. If you will use the tMsSqlOutput will create row-by-row transactions. Not very efficient when you want to insert millions and millions of records.
  2. Azrue SQL Server is a PaaS – it doesn’t have the concept of Filesystem. So traditional bulk load will not work here – you cannot just point it to the location of the file because there is no such a thing…1.jpg

 

 

The gist of the solution is using 2 jumphosts one of them is the Remote Engine as a Filesystem, the second one is Azure Storage Blob/Container to be used as a location that the Azure SQL Server can access.

 

The pipeline that eventually worked was:

 

2.png

 

And the Talend pipeline that orchestrate it:

 

3.png

 

Detailed instructions:

 

  1. First create SAS – Shared Access Signature to the Azure Storage that will be used as your 2nd jumphost

 

4.png

 

 

 

  1. If you followed the Instructions correctly you will get a SAS connection string in the following form:

 

sv=2015-12-11&ss=b&srt=sco&sp=rwac&se=2017-02-01T00:55:34Z&st=2016-12-29T16:55:34Z&spr=https&sig=copyFromAzurePortal

 

Don’t forget to remove the ‘?’ at the beginning of the string after copying the SAS connection string!

 

  1. The next stage will be to create the credentials in the Target DB. 

 

CREATE DATABASE SCOPED CREDENTIAL MyAzureBlobStorageCredential

WITH IDENTITY = 'SHARED ACCESS SIGNATURE',

SECRET = 'sv=2015-12-11&ss=b&srt=sco&sp=rwac&se=2017-02-01T00:55:34Z&st=2016-12-29T16:55:34Z&spr=https&sig=copyFromAzurePortal';

 

The SECRET is the SAS you created in stage (2)

 

  1. Now we need to create an External Data Source:

 

CREATE EXTERNAL DATA SOURCE MyAzureBlobStorage

WITH ( TYPE = BLOB_STORAGE,

       LOCATION = 'https://myazureblobstorage.blob.core.windows.net',

       CREDENTIAL= MyAzureBlobStorageCredential);

 

Notice that MyAzureBlobStorage is the name of your Storage account

 

  1. The BULK INSERT instruction itself will look something similar to this:

 

BULK INSERT Product

FROM 'data/product.csv'

WITH ( DATA_SOURCE = ‘MyAzureBlobStorage');

 

*DATA_SOURCE is the EXTERNAL DATA SOURCE that was created at stage (4)

** notice the ‘data’ in FROM 'data/product.csv' – it is the Blob in the Azure Storage. You can also add subfolders if needed.

 

More info on here

 

And the result:5.png

 

 

 

Cheers

Maayan

Community Manager

Re: How-to-guide: Bulk Insert to Azure SQL Server

Good posts, thanks for your sharing! Maayan

Regards
Shong
----------------------------------------------------------
Talend | Data Agility for Modern Business
Four Stars

Re: How-to-guide: Bulk Insert to Azure SQL Server

Hey,

thanks for the help. Amazing info you have given.

 

What’s New for Talend Spring ’19

Watch the recorded webinar!

Watch Now

Definitive Guide to Data Quality

Create systems and workflow to manage clean data ingestion and data transformation.

Download

Tutorial

Introduction to Talend Open Studio for Data Integration.

Watch

Downloads and Trials

Test drive Talend's enterprise products.

Downloads