
Normalize csv file : from rows to columns

Hello,
I have a complicated thing to do with Talend.
My input is a CSV file. Here is its schema:
Id_shop;Id_cust;Id_prod;Year;Month;day1;day2;day3;...;day30;
The fields "day1" to "day30" contain quantities. The quantities of products ("Id_prod") sold by a shop ("Id_Shop") to a customer ("Id_cust") at the date "Month/Year". The field "day1" contains the quantity of the first day of the month and "day30" the quantity of the last day.
I would like to load those data in a oracle table which has this schema :
Id_shop;Id_cust;Id_prod;Date;Quantity
No problem for Id_shop, Id_cust and Id_prod (they form the primary key), but the Date field has to be built from the month, the year and the day columns of the CSV file.
For example :
in the CSV file: month = 05 and year = 2008.
in the Oracle table, I should have, for each key, all the dates from 01/05/2008 to 30/05/2008.
Then, for the date 01/05/2008, the Quantity field contains the value written in the "day1" field of the CSV file.
Is that clear?
In Informatica, there is a module called Normalizer which can do that. Is there something equivalent in Talend?
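To make it concrete, here is the transformation I need, written as a plain-Java sketch (not Talend code; InputRow, OutputRow and the field names are just illustrative):

// for each CSV line, emit one output row per "dayN" column
for (InputRow in : csvRows) {                                 // csvRows: the parsed lines of the CSV file
    for (int day = 1; day <= 30; day++) {                     // day1 .. day30
        int quantity = in.getDay(day);                        // value of the "dayN" column
        String date = String.format("%02d/%02d/%04d", day, in.month, in.year);   // e.g. 01/05/2008
        output.add(new OutputRow(in.idShop, in.idCust, in.idProd, date, quantity));
    }
}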
Thanks!
Nico
Employee

Re: Normalize csv file : from rows to columns

Your question is very clear; have a look at 4500
One Star

Re: Normalize csv file : from rows to columns

I have tried developing a job with the details provided in the link above, but ended up with an error.
Please help me out. I have attached the error image.
vin
One Star

Re: Normalize csv file : from rows to columns

Hi,
I think you should use row2 in tJavaRow, so the code in tJavaRow becomes:
globalMap.put("AMT1",row2.AMT1);
globalMap.put("AMT2",row2.AMT2);
globalMap.put("AMT3",row2.AMT3);
Regards,
Youssef
One Star

Re: Normalize csv file : from rows to columns

Thanks Youssef
vin
One Star

Re: Normalize csv file : from rows to columns

You wrote:
gloablMap.put("AMT3",row2.AMT3);
instead of:
globalMap.put("AMT3",row2.AMT3);

Regards,
Youssef
One Star

Re: Normalize csv file : from rows to columns

I have corrected them; the job now shows no errors but produces no output.
The problem is that the input values are processed only up to tJavaRow, and not passed on to tForeach and the subsequent components.
Did I leave anything out?
vin
One Star

Re: Normalize csv file : from rows to columns

Hi,
did you fill in the three entries in the tForeach component as shown in 17745?
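In case the screenshot is not visible: the "Values" table of tForeach typically just lists the three column names, e.g.:
"AMT1"
"AMT2"
"AMT3"
A tJavaRow placed after the tForeach can then pick the matching amount with something like globalMap.get((String) globalMap.get("tForeach_1_CURRENT_VALUE")) (the component name tForeach_1 is an assumption, use the name shown in your job).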
Regards,
Youssef
One Star

Re: Normalize csv file : from rows to columns

Thanks Youssef,
I have got it. It's working fine.
Vin
Employee

Re: Normalize csv file : from rows to columns

90
I've chosen to have a hard-coded schema as the tUnpivotRow output; it makes things a lot easier to configure for the user. Use tExtractDelimitedFields to extract subfields from the output "row_key" column.
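For example, with the CSV described at the top of the thread, a single input row comes out of tUnpivotRow as one row per "dayN" column, roughly like this (the ";" delimiter inside row_key and the sample values are assumptions):
row_key = "1;42;7;2008;05"   pivot_key = "day1"   pivot_value = "3"
row_key = "1;42;7;2008;05"   pivot_key = "day2"   pivot_value = "0"
...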
Not applicable

Re: Normalize csv file : from rows to columns

Thanks a lot! This component is perfect for my needs.
Just one question: the final schema is composed of 3 columns:
row_key
pivot_key
pivot_value
How can we end up with the same schema as the original one? By exploding the "row_key" column with a Perl regular expression?
Thanks,
Nicolep
Employee

Re: Normalize csv file : from rows to columns

How can we end up with the same schema as the original one? By exploding the "row_key" column with a Perl regular expression?

In the palette, you will find the tExtract*Fields components. Currently (Talend Open Studio 3.0), you have tExtractDelimitedFields, tExtractPositionalFields and tExtractRegexFields. tUnpivotRow was designed to be followed by a tExtractDelimitedFields.
Not applicable

Re: Normalize csv file : from rows to columns

OK, thanks.
This component allows me to split the "row_key", but it doesn't allow me to load the "pivot_key" and the "pivot_value", and it doesn't support 2 outputs.
How can I split the "row_key" and then load all the columns (including "pivot_key" and "pivot_value")? With a tMap?
Employee

Re: Normalize csv file : from rows to columns

This component allows me to split the "row_key", but it doesn't allow me to load the "pivot_key" and the "pivot_value", and it doesn't support 2 outputs.
How can I split the "row_key" and then load all the columns (including "pivot_key" and "pivot_value")? With a tMap?

tExtractDelimitedFields fills several output fields from a single input field and keeps the other input fields unchanged (the tExtractRegexFields entry in the component reference guide shows this way of using the tExtract*Fields components).
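Concretely, the output schema of the tExtractDelimitedFields could look like this (the split of "row_key" back into the original key columns is an assumption based on the first post):
Id_shop;Id_cust;Id_prod;Year;Month   <- extracted from "row_key"
pivot_key;pivot_value                <- kept unchanged from the input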
Not applicable

Re: Normalize csv file : from rows to columns

OK, that's right: when we put the same column names ("pivot_key" and "pivot_value") in the output schema of the tExtractDelimitedFields, the columns are automatically loaded.
It works fine!
Thanks