
How to extract the data from a file that has different schemas?

Hi,

 

I have come across a requirement; a sample file is attached.

Headers are highlighted in bold.

My requirement is to extract the data from this file and write it to three different output files, one per schema. How can I do this?

 

Thanks,

Vamsi.

1 REPLY

Re: How to extract the data from a file that has different schemas?

There are several ways to do this. Here is a relatively simple one that uses fairly straightforward components.

 

1) Load your data from your XLS, via a tMap, into a tHashOutput. In the tMap, use tMap variables to detect when the header has changed, and use this to assign each row a key. This mini tutorial shows how to use tMap variables to catch changes between rows (https://www.rilhia.com/quicktips/quick-tip-compare-row-value-against-value-previous-row). A simple rule (given the data you have shown us) is "when the first column is NOT a number, the row is a header". Hint: you will need to output all column data as Strings at this point, until you know which schema each row belongs to.

2) Once each row has its key, you need to identify the schema type of each row. If the schemas always appear in the same order, this will be easy. If not, you will need some logic to identify the schema from the header row, and this is the place to apply it.

 

3) Once the schemas have been identified and the data has been converted from String to its correct data type, the data can be sent to the correct file using a tMap.
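
To make steps 1 and 2 more concrete, here is a minimal plain-Java sketch of the logic, outside Talend. It applies the rule "first column is not a number means header" to give each row a key, then looks up the schema from the header row. The class name RowTagger, the header signatures and the schema names (customers, products, orders) are all made up for illustration, since I cannot see the real column names in the attached file. The same expressions could probably be moved into tMap variables or a custom routine.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class RowTagger {
    // ASSUMPTION: hypothetical header signatures and schema names.
    // Replace these with the real header columns from your file.
    private static final Map<String, String> KNOWN_HEADERS = new LinkedHashMap<>();
    static {
        KNOWN_HEADERS.put("id|name", "customers");
        KNOWN_HEADERS.put("id|product|price", "products");
        KNOWN_HEADERS.put("id|customer_id|amount", "orders");
    }

    /** Rule from step 1: a row whose first column is NOT a number is a header. */
    static boolean isHeader(String firstColumn) {
        return firstColumn == null
                || !firstColumn.trim().matches("-?\\d+(\\.\\d+)?");
    }

    /** Step 2: build a normalised signature from the header cells and look it up. */
    static String identifySchema(String[] headerRow) {
        StringBuilder signature = new StringBuilder();
        for (String cell : headerRow) {
            if (cell == null || cell.trim().isEmpty()) {
                continue; // ignore empty trailing cells
            }
            if (signature.length() > 0) {
                signature.append('|');
            }
            signature.append(cell.trim().toLowerCase(Locale.ROOT));
        }
        return KNOWN_HEADERS.getOrDefault(signature.toString(), "unknown");
    }

    /**
     * Tags every row (all cells still Strings, each row with at least one cell)
     * as "key:schema". The key increases by one each time a header row is seen.
     */
    static String[] tagRows(List<String[]> rows) {
        String[] tags = new String[rows.size()];
        int key = 0;
        String schema = "unknown";
        for (int i = 0; i < rows.size(); i++) {
            String[] row = rows.get(i);
            if (isHeader(row[0])) {
                key++;
                schema = identifySchema(row);
            }
            tags[i] = key + ":" + schema;
        }
        return tags;
    }
}
```

Header rows get the same tag as the data rows that follow them, so they can be filtered out later with isHeader or kept as the first line of each output file.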

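For the final step, this is a rough plain-Java sketch of what the tMap does: it groups rows that already carry a schema name so that each group can go to its own output, and it converts one hypothetical schema from Strings to typed values. The class SchemaRouter, the TaggedRow and Product types and the "products" column layout (id, name, price) are assumptions for illustration only. In a Talend job the grouping would more likely be done with one tMap output per schema and a filter expression on the schema column.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SchemaRouter {
    /** A data row (cells still Strings) that has already been given a schema name. */
    static final class TaggedRow {
        final String schema;
        final String[] cells;

        TaggedRow(String schema, String[] cells) {
            this.schema = schema;
            this.cells = cells;
        }
    }

    /** ASSUMPTION: hypothetical typed form of a "products" row: id, name, price. */
    static final class Product {
        final int id;
        final String name;
        final BigDecimal price;

        Product(int id, String name, BigDecimal price) {
            this.id = id;
            this.name = name;
            this.price = price;
        }
    }

    /** Converts the String cells to their real types once the schema is known. */
    static Product toProduct(String[] cells) {
        return new Product(
                Integer.parseInt(cells[0].trim()),
                cells[1].trim(),
                new BigDecimal(cells[2].trim()));
    }

    /** Groups rows by schema name, keeping the original row order inside each group. */
    static Map<String, List<String[]>> groupBySchema(List<TaggedRow> rows) {
        Map<String, List<String[]>> groups = new LinkedHashMap<>();
        for (TaggedRow row : rows) {
            groups.computeIfAbsent(row.schema, k -> new ArrayList<>()).add(row.cells);
        }
        return groups;
    }
}
```

Each entry of the returned map corresponds to one output file, so three schemas give three groups.
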
Rilhia Solutions