split the single record into multiple records dynamically

Four Stars

split the single record into multiple records dynamically

Hi 

 

i have excel file with below columns

 

county | 2014-Q1 | 2014-Q2 |  2014-Q3 | 2014-Q4

India    |     2.3    |     3.5       |  4.6        |   83.73

 expected output:

 county | quarter |  revenue

 India   | 2014-Q1 |  2.3

 India   |2014-Q2 |  3.5

India    |2014-Q3 | 4.6

India    |2014-Q4|  83.73

but every time one additional columns is adding   in source file.

county | 2014-Q1 | 2014-Q2 |  2014-Q3 | 2014-Q4 | 2015-Q1

India    |     2.3    |     3.5       |  4.6        |   83.73       | 76.43

 expected output:

 county | quarter |  revenue

 India   | 2014-Q1 |  2.3

 India   |2014-Q2 |  3.5

India    |2014-Q3 | 4.6

India    |2014-Q4 |  83.73

India    |2015-Q1 |  76.43

 

if additional comes in source i have to handle and need to automate this process. Please help above scenario.

in tsplitrow component we can achieve fixed columns but my case columns are changing every quarter.   

 

Thanks in advance.

lokesh

 

 

 

 

Sixteen Stars

Re: split the single record into multiple records dynamically

Is there a maximum number of "extra columns" you might get? I assume you are using the Open Source Edition?

Four Stars

Re: split the single record into multiple records dynamically

Hi rhall,

 

we have columns as of now 14 columns , every quarter one columns will be adding  , using open studio. 

Sixteen Stars

Re: split the single record into multiple records dynamically

Will old quarters be removed? Will it be like this for example.....

 

q1 2016, q2 2016, q3 2016, q4 2016, q1 2017, q2 2017, q3 2017, q4 2017

 

q2 2016, q3 2016, q4 2016, q1 2017, q2 2017, q3 2017, q4 2017, q1 2018

If not, you will have to rethink this. Although it might take a while, the files will become unmanageable. It's best to consider these things now. Also, if you implement some "Max number of quarters" logic, it makes the Talend solution workable.

 

 

Four Stars

Re: split the single record into multiple records dynamically

Hi 

 

no, old quarter also will be there .every time increasing by one quarter.

 

Sixteen Stars

Re: split the single record into multiple records dynamically

OK, this might work.....but you will have to play around to perfect it.

1) Convert your Excel file to CSV

2) Use a tFileInputFullRow component to bring in one row at a time.

3) Use a tJavaFlex to extract the Country column (and any other non period columns) and to output the collection of period columns as one column. This will require String manipulation. Your data types will be lost, but you can get these back later once your columns are split. You will have to use this component to also extract the Period Names from the header row. This will be a bit more complicated, but can be done quite easily in the tJavaflex if you take a copy of the first row (the header row) and process it alongside each data row.

 

An example of the sort of code you will need to use to mix the header data with the row data is below....

Start Code

String header = null;

Main Code

//If header is null (first row) set the header value
if(header==null){
	header = row1.myColumn;
}else{ //Process value rows
        String[] columns = header.split(",");
        String[] values = row1.myColumn.split(",");
        String outputString = "";
       
        //Merge the header info (Periods) with the values for the periods
        //This is done to enable the tNormalize component to split into
       //Period|Value columns over multiple rows
        for(int i = 0; i< columns.length; i++){
	       outputString = outputString+columns[i]+"|"+values[i]+",";
        }

        outputString = outputString.substring(0, outputString.length()-1);
        row2.output = outputString;
}

The above does not cover the separating out of the Country data, etc, but you can figure that out. It is comparatively easy.

 

4) Use a tNormalize to process your data into rows. You are splitting the data into Country, Period Key/Value rows.

5) Add a tMap to split the Period Key/Value column into two columns. Split using String manipulation with the "|" as the separator.

 

Once you are at this stage, your data is in the format you want. You just need to worry about data types. You can use a tConvertType for that. 

Sixteen Stars

Re: split the single record into multiple records dynamically

Hi @lokeshbabu, did this work for you? If so can you accept the solution please?