split the single record into multiple records dynamically

Four Stars

split the single record into multiple records dynamically

Hi 

 

i have excel file with below columns

 

county | 2014-Q1 | 2014-Q2 |  2014-Q3 | 2014-Q4

India    |     2.3    |     3.5       |  4.6        |   83.73

 expected output:

 county | quarter |  revenue

 India   | 2014-Q1 |  2.3

 India   |2014-Q2 |  3.5

India    |2014-Q3 | 4.6

India    |2014-Q4|  83.73

but every time one additional columns is adding   in source file.

county | 2014-Q1 | 2014-Q2 |  2014-Q3 | 2014-Q4 | 2015-Q1

India    |     2.3    |     3.5       |  4.6        |   83.73       | 76.43

 expected output:

 county | quarter |  revenue

 India   | 2014-Q1 |  2.3

 India   |2014-Q2 |  3.5

India    |2014-Q3 | 4.6

India    |2014-Q4 |  83.73

India    |2015-Q1 |  76.43

 

if additional comes in source i have to handle and need to automate this process. Please help above scenario.

in tsplitrow component we can achieve fixed columns but my case columns are changing every quarter.   

 

Thanks in advance.

lokesh

 

 

 

 

Community Manager

Re: split the single record into multiple records dynamically

Is there a maximum number of "extra columns" you might get? I assume you are using the Open Source Edition?

Four Stars

Re: split the single record into multiple records dynamically

Hi rhall,

 

we have columns as of now 14 columns , every quarter one columns will be adding  , using open studio. 

Community Manager

Re: split the single record into multiple records dynamically

Will old quarters be removed? Will it be like this for example.....

 

q1 2016, q2 2016, q3 2016, q4 2016, q1 2017, q2 2017, q3 2017, q4 2017

 

q2 2016, q3 2016, q4 2016, q1 2017, q2 2017, q3 2017, q4 2017, q1 2018

If not, you will have to rethink this. Although it might take a while, the files will become unmanageable. It's best to consider these things now. Also, if you implement some "Max number of quarters" logic, it makes the Talend solution workable.

 

 

Four Stars

Re: split the single record into multiple records dynamically

Hi 

 

no, old quarter also will be there .every time increasing by one quarter.

 

Community Manager

Re: split the single record into multiple records dynamically

OK, this might work.....but you will have to play around to perfect it.

1) Convert your Excel file to CSV

2) Use a tFileInputFullRow component to bring in one row at a time.

3) Use a tJavaFlex to extract the Country column (and any other non period columns) and to output the collection of period columns as one column. This will require String manipulation. Your data types will be lost, but you can get these back later once your columns are split. You will have to use this component to also extract the Period Names from the header row. This will be a bit more complicated, but can be done quite easily in the tJavaflex if you take a copy of the first row (the header row) and process it alongside each data row.

 

An example of the sort of code you will need to use to mix the header data with the row data is below....

Start Code

String header = null;

Main Code

//If header is null (first row) set the header value
if(header==null){
	header = row1.myColumn;
}else{ //Process value rows
        String[] columns = header.split(",");
        String[] values = row1.myColumn.split(",");
        String outputString = "";
       
        //Merge the header info (Periods) with the values for the periods
        //This is done to enable the tNormalize component to split into
       //Period|Value columns over multiple rows
        for(int i = 0; i< columns.length; i++){
	       outputString = outputString+columns[i]+"|"+values[i]+",";
        }

        outputString = outputString.substring(0, outputString.length()-1);
        row2.output = outputString;
}

The above does not cover the separating out of the Country data, etc, but you can figure that out. It is comparatively easy.

 

4) Use a tNormalize to process your data into rows. You are splitting the data into Country, Period Key/Value rows.

5) Add a tMap to split the Period Key/Value column into two columns. Split using String manipulation with the "|" as the separator.

 

Once you are at this stage, your data is in the format you want. You just need to worry about data types. You can use a tConvertType for that. 

Community Manager

Re: split the single record into multiple records dynamically

Hi @lokeshbabu, did this work for you? If so can you accept the solution please?

Four Stars

Re: split the single record into multiple records dynamically

Hi rhall_2_0,

 

Thanks for the solution.

I have also same requirement, whereas i have to do vice versa.

Can you suggest the code for the vice versa step also.

 

 

Community Manager

Re: split the single record into multiple records dynamically

I'm not sure what you mean. If it requires a different solution than the one proposed here, can you raise a different post and give a detailed explanation of the problem please? This will help people who have a similar question to you, find an answer.

Four Stars

Re: split the single record into multiple records dynamically

ok will do.

2019 GARNER MAGIC QUADRANT FOR DATA INTEGRATION TOOL

Talend named a Leader.

Get your copy

OPEN STUDIO FOR DATA INTEGRATION

Kickstart your first data integration and ETL projects.

Download now

What’s New for Talend Summer ’19

Watch the recorded webinar!

Watch Now

Best Practices for Using Context Variables with Talend – Part 2

Part 2 of a series on Context Variables

Blog

Best Practices for Using Context Variables with Talend – Part 1

Learn how to do cool things with Context Variables

Blog

Migrate Data from one Database to another with one Job using the Dynamic Schema

Find out how to migrate from one database to another using the Dynamic schema

Blog