
How to perform MSSQL query over multiple databases dynamically

Four Stars


I need to extract data from a large number of similar databases and collect this data in a common database. The solution I currently have uses a tMSSqlInput and a tMSSqlOutput for each database and repeats this for each subsequent database; the query is identical in every extraction.

 

So:

tMSSqlInput DB1 -------main-----tMSSqlOutput DB benchmark

tMSSqlInput DB2 -------main-----tMSSqlOutput DB benchmark

etc.

 

This solution relies on repeating the same job at least 50 times, which creates a lot of maintenance work whenever anything changes. Is there a way to read a list of database names and use that list to dynamically extract data from each of those databases? That way we could add or remove databases from the input list instead of duplicating jobs.

Community Manager

Re: How to perform MSSQL query over multiple databases dynamically

Yes, this is pretty easy. The steps are below (it can be done in other ways, but this is how I would do it given the info you have supplied):

 

1) Create your job to move the data from source to target. Use context variables for your database connection parameters.

2) Create a parent job which reads your database connection parameters from any data source you like (a database, a flat file, etc.). This job should iterate over each row of connection details, so use a tFlowToIterate to do this. A tFlowToIterate will also convert your row data to globalMap values, which will be useful (the key is {row_name}.{column_name}).

3) Take your job created in step 1 and drop it into your parent job. In the Context Param section, click the green + symbol to add your child job's context variables and assign them the globalMap values created by the tFlowToIterate.

 

What will happen is that your child job will be run for each database. The database credentials will be supplied to the child job's context variables. The child job will use those values to connect to the source database for each iteration. 
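Outside of Talend, the pattern described above boils down to: iterate over a list of connection parameters, run the identical extraction query against each source, and load the rows into one benchmark database. Here is a minimal, hypothetical Python sketch of that logic using sqlite3 connections as stand-ins for the MSSQL sources; the table and column names are assumptions for illustration only:

```python
import sqlite3

# The identical extraction query run against every source database
# (table and column names are assumed for this illustration).
EXTRACT_SQL = "SELECT id, amount FROM sales"

def load_all(sources, bench):
    """Run EXTRACT_SQL against each named source connection and collect
    the rows into the benchmark connection, tagging each row with its
    origin. `sources` maps a database name to an open connection."""
    bench.execute(
        "CREATE TABLE IF NOT EXISTS sales (source TEXT, id INTEGER, amount REAL)"
    )
    # This loop plays the role of the tFlowToIterate in the parent job:
    # each iteration hands one set of connection details to the "child".
    for name, src in sources.items():
        for row_id, amount in src.execute(EXTRACT_SQL):
            bench.execute(
                "INSERT INTO sales (source, id, amount) VALUES (?, ?, ?)",
                (name, row_id, amount),
            )
    bench.commit()
```

Adding or removing a database then means editing the `sources` list (or the file or table it is read from), not duplicating any job logic, which is exactly the maintenance win the parent/child design gives you in Talend.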
