Unite two datasets Automatically by name

Six Stars

Unite two datasets Automatically by name

Hello all,

In Talend we use a tool called Tunite to union two or more data-sets. The problem with it is the schema of all the data-sets should be same, where the order of the columns and the number of columns should always be same. When handling a larger data-set, ordering the data-sets or creating dummy columns becomes difficult as there are many number of columns and many data-sets. 

 

Is there any way where i can just unite them automatically based on the column names? and if the column is missing in other data-set it should have nulls instead.

 

If it i a custom component , can anyone guide me towards the existing component or help me build one/


Accepted Solutions
Nine Stars

Re: Unite two datasets Automatically by name

1/ create table in DB that's the super-set of all columns in all sources (final table).

2/ create table in DB that's got X columns, where X is a number which is the largest number of columns you'd expect in any incoming data-set, each column will be varchar(8000), just call the columns col1, col2, col3.....colX (work table)

3/ load incoming data into DB work table

4/ parse the header row of the incoming data, to ascertain the position of all columns of interest - this will be critical that the same column data is called the same name in each of the incoming data-sets.

5/ construct SQL to INSERT INTO/SELECT from the work table into the final table.

6/ execute the SQL

7/clear down work table.

Regards David
Dont forget to give Kudos when an answer is helpful or mark the answer as the solution.

All Replies
Highlighted
Forteen Stars

Re: Unite two datasets Automatically by name

@Aishwarya,since you need to do manually,there we do not have automatically arrange. 

Manohar B
Don't forget to give kudos/accept the solution when a replay is helpful.
Nine Stars

Re: Unite two datasets Automatically by name

Hi,

 

A way I've tackled this in the past is load the unknown schema data-set up into a table of a suitably large enough number of string columns. Then using the header line of the incoming data, dynamically construct some SQL to select back out all columns, replacing them with nulls if they weren't in the source file.

 

It's not pretty, but it does work.

 

Thanks

 

David

Regards David
Dont forget to give Kudos when an answer is helpful or mark the answer as the solution.
Six Stars

Re: Unite two datasets Automatically by name

Can you please help me follow these steps? I am pretty new to Talend
Nine Stars

Re: Unite two datasets Automatically by name

1/ create table in DB that's the super-set of all columns in all sources (final table).

2/ create table in DB that's got X columns, where X is a number which is the largest number of columns you'd expect in any incoming data-set, each column will be varchar(8000), just call the columns col1, col2, col3.....colX (work table)

3/ load incoming data into DB work table

4/ parse the header row of the incoming data, to ascertain the position of all columns of interest - this will be critical that the same column data is called the same name in each of the incoming data-sets.

5/ construct SQL to INSERT INTO/SELECT from the work table into the final table.

6/ execute the SQL

7/clear down work table.

Regards David
Dont forget to give Kudos when an answer is helpful or mark the answer as the solution.
Four Stars

Re: Unite two datasets Automatically by name

Thanks for the instruction. I tried do this but I did not succeed. Most likely, I just do not understand how to do it. For me, it's as hard as writing a term paper. I could not write a term paper because I do not know how to do it. I found writing help a term paper on the internet and helped me. Probably I will have to look for a programmer who will help me to do everything according to your instructions, so that the information is displayed correctly through the database.

2019 GARNER MAGIC QUADRANT FOR DATA INTEGRATION TOOL

Talend named a Leader.

Get your copy

OPEN STUDIO FOR DATA INTEGRATION

Kickstart your first data integration and ETL projects.

Download now

What’s New for Talend Summer ’19

Watch the recorded webinar!

Watch Now

Best Practices for Using Context Variables with Talend – Part 4

Pick up some tips and tricks with Context Variables

Blog

How Media Organizations Achieved Success with Data Integration

Learn how media organizations have achieved success with Data Integration

Read

Downloads and Trials

Test drive Talend's enterprise products.

Downloads