How to extract the data from a file that has different schemas?

Four Stars




I have come across a requirement; a sample file is attached.

Headers are highlighted in bold.

How can I extract the data from the file below and write it to three different output files, one per schema?




Community Manager

Re: How to extract the data from a file that has different schemas?

There are several ways to do this. Here is a relatively simple way using straightforward components.


1) Load your data from your XLS, via a tMap, into a tHashOutput. In the tMap, use tMap variables to work out when the header has changed, and use this to assign each row a key. This mini tutorial will show you how to use tMap variables to catch changes between rows. A simple identifier (given the data you have shown us) is the logic "when the first column is NOT a number, it is a header". Hint: you will need to output all column data as Strings at this point, until you know which schema each row is part of.

2) Once each row has its key, you need to identify the schema type of each row. Maybe the schemas always fall in the same order. If so, this will be easy. If not, you will need some logic to identify the schema from the header row. You would do this here.


3) Once the schemas have been identified and the data has been converted from String to its correct data type, the data can be sent to the correct file using a tMap.
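To make the row-keying idea in the steps above concrete, here is a minimal plain-Java sketch. It is not Talend component configuration; in a job, the same comparison would live in tMap variables. The class name `SchemaSplitter`, the method names, and the `schema_<key>.csv` file names are all hypothetical, and the sketch assumes every row has already been read as an array of Strings. It only covers the "first column is NOT a number means header" rule and the split into one file per group; identifying which schema a header belongs to and converting Strings to their real types are left out.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: groups rows under the header that precedes them.
class SchemaSplitter {

    // A cell counts as numeric if it is an optionally signed integer or decimal.
    // Assumption: data rows always start with such a value, header rows never do.
    static boolean isNumeric(String cell) {
        return cell != null && cell.trim().matches("-?\\d+(\\.\\d+)?");
    }

    // Walks the rows in order. Each header row (first column not numeric)
    // increments the key, so a header and the data rows after it share one key.
    // Rows that appear before any header end up under key 0.
    static Map<Integer, List<String[]>> splitBySchema(List<String[]> rows) {
        Map<Integer, List<String[]>> groups = new LinkedHashMap<>();
        int key = 0;
        for (String[] row : rows) {
            if (row.length == 0) {
                continue; // skip empty rows
            }
            if (!isNumeric(row[0])) {
                key++; // header row: a new schema block starts here
            }
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(row);
        }
        return groups;
    }

    // Writes each group to its own file, e.g. schema_1.csv, schema_2.csv.
    // Naive comma join: no quoting or escaping of cell values is done.
    static void writeGroups(Map<Integer, List<String[]>> groups, Path dir) throws IOException {
        for (Map.Entry<Integer, List<String[]>> entry : groups.entrySet()) {
            List<String> lines = new ArrayList<>();
            for (String[] row : entry.getValue()) {
                lines.add(String.join(",", row));
            }
            Files.write(dir.resolve("schema_" + entry.getKey() + ".csv"), lines);
        }
    }
}
```

Calling `splitBySchema` on the rows read from the spreadsheet and passing the result to `writeGroups` would give one file per header block. Keeping everything as Strings until the groups are known mirrors the hint in step 1.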
