import information from multiple tables to output into a single table

Highlighted
Five Stars

import information from multiple tables to output into a single table

Hi,

 

I have a job where I am pulling through the information for multiple instances of a single table through an iterative MySQL tDBInput, this is then being output into a single output table.

 

The challenge I am facing is that sometimes the tables from the different instances can have differing schema. For example Table A from instance 1 may have 15 columns, whereas Table A from instance 2 may have 16 columns (15 matching and one extra), whereas Table A from instance 3 may have 17 columns (15 matching, two extra).

 

Is there a way to set up the schema on the input whereby it will bring through the metadata for all possible columns across every instance (in the above example, 17), and then when we are bringing through the data for each instance, if the data does not exist to make it = null? So then what we have in the final output is (for the above) 17 columns, which are filled in completely for instance 3, 16 filled in (1 null) for instance 2, and 15 filled in (2 null) for instance 1.

 

I have been able to do this when the schema matches and have this output into a single table, with each new instance appended beneath, I am only having difficulty with it when the schema changes.

Thanks in advance for your help.


Accepted Solutions
Highlighted
Employee

Re: import information from multiple tables to output into a single table

Hi,

 

      You will not be able to change the schema dynamically for each instance during tDBInput call. But you can do another work around. Why don't you select the schema with maximum number of columns as the output schema for all different tables?

 

       Since you are using SQLs, you can use null values for the instances where column is not present. Once the data set is in unified format, you can use a tUnite component to merge the data and process it further.

 

      Plan B is slightly circuitous where you need to store the table and column metadata information in a different tables. When you are reading the data from source table, merge all the columns with a separator symbol (like semicolon) and pass the data as single string value. Then, based on the metadata value specified in your reference table, you can slice the input string and append null values. Once this slicing and appending operation is complete, you can reform the data into new unified format and pass them further. This method require less amount of coding when more versions are coming but initial coding effort will be more.

 

Warm Regards,

 

Nikhil Thampi

View solution in original post


All Replies
Highlighted
Employee

Re: import information from multiple tables to output into a single table

Hi,

 

      You will not be able to change the schema dynamically for each instance during tDBInput call. But you can do another work around. Why don't you select the schema with maximum number of columns as the output schema for all different tables?

 

       Since you are using SQLs, you can use null values for the instances where column is not present. Once the data set is in unified format, you can use a tUnite component to merge the data and process it further.

 

      Plan B is slightly circuitous where you need to store the table and column metadata information in a different tables. When you are reading the data from source table, merge all the columns with a separator symbol (like semicolon) and pass the data as single string value. Then, based on the metadata value specified in your reference table, you can slice the input string and append null values. Once this slicing and appending operation is complete, you can reform the data into new unified format and pass them further. This method require less amount of coding when more versions are coming but initial coding effort will be more.

 

Warm Regards,

 

Nikhil Thampi

View solution in original post

Highlighted
Five Stars

Re: import information from multiple tables to output into a single table

Hi Nikhul,

 

Thanks for the reply:

 

So to confirm for plan A -are you suggesting to have multiple inputs, create the shell columns for where those columns do not exist through the MySQL query and then use a tUnite to unite all of this information before outputting into an output table? If so, I think this might be a little more difficult as I can have many instances, where more are being added over time, and this will be on multiple different tables as well so could take quite some time across every table required.

As for plan B I will try this and see what I can come up with - thank you!

Highlighted
Employee

Re: import information from multiple tables to output into a single table

Hi,

 

    You are right. Personally I always prefer metadata approach but it would require more initial development time. So I have given both ways of doing. Depending on your timelines, you can choose either Plan A or B.

 

    If the reply has helped to resolve your query, could you please mark the topic as closed? Kudos are also welcome :-)

 

Warm Regards,

 

Nikhil Thampi

Highlighted
Five Stars

Re: import information from multiple tables to output into a single table

Hi - thanks again! Silly question but how might I close the topic?

Thanks,

2019 GARTNER MAGIC QUADRANT FOR DATA INTEGRATION TOOL

Talend named a Leader.

Get your copy

OPEN STUDIO FOR DATA INTEGRATION

Kickstart your first data integration and ETL projects.

Download now

Best Practices for Using Context Variables with Talend – Part 1

Learn how to do cool things with Context Variables

Blog

Migrate Data from one Database to another with one Job using the Dynamic Schema

Find out how to migrate from one database to another using the Dynamic schema

Blog

Best Practices for Using Context Variables with Talend – Part 4

Pick up some tips and tricks with Context Variables

Blog