If you are trying to import/insert a lot of data to Azure SQL Server (Not on-prem/SQL Server on the Cloud hosted on a VM) you will run into some issues. Especially if you are trying to copy data from one Azure SQL Server to another.
My Architecture is Azure Cloud, Talend Clound Data Integration and Remote Engine that is deployed on Virtual Machine in the same Datacentre/region as all my resources (as much as possible) in order to have good performances.
I will save you all the things that I tried, the main issues are:
The gist of the solution is using 2 jumphosts one of them is the Remote Engine as a Filesystem, the second one is Azure Storage Blob/Container to be used as a location that the Azure SQL Server can access.
The pipeline that eventually worked was:
And the Talend pipeline that orchestrate it:
Don’t forget to remove the ‘?’ at the beginning of the string after copying the SAS connection string!
CREATE DATABASE SCOPED CREDENTIAL MyAzureBlobStorageCredential
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
SECRET = 'sv=2015-12-11&ss=b&srt=sco&sp=rwac&se=2017-02-01T00:55:34Z&st=2016-12-29T16:55:34Z&spr=https&sig=copyFromAzurePortal';
The SECRET is the SAS you created in stage (2)
CREATE EXTERNAL DATA SOURCE MyAzureBlobStorage
WITH ( TYPE = BLOB_STORAGE,
LOCATION = 'https://myazureblobstorage.blob.core.windows.net',
Notice that MyAzureBlobStorage is the name of your Storage account
BULK INSERT Product
WITH ( DATA_SOURCE = ‘MyAzureBlobStorage');
*DATA_SOURCE is the EXTERNAL DATA SOURCE that was created at stage (4)
** notice the ‘data’ in FROM 'data/product.csv' – it is the Blob in the Azure Storage. You can also add subfolders if needed.
More info on here
And the result:
Talend named a Leader.
Kickstart your first data integration and ETL projects.
Watch the recorded webinar!
Learn how to deploy Talend Jobs as Docker images to Amazon, Azure and Google Cloud registries
Read about some useful Context Variable ideas
Learn how to publish your API Services to Talend Cloud