Split CSV to many files based on key

I have a CSV which looks something like this:
a, col1, col2, col3
a, col1, col2, col3
a, col1, col2, col3
b, col1, col2, col3
b, col1, col2, col3
c, col1, col2, col3
c, col1, col2, col3
Each row starts with a key in the first column (a, b, c), followed by the rest of the columns. What I want to do is read in the CSV (got that covered), split it by key so that I have 3 chunks/groups of data, and then convert each chunk into a separate JSON file, which I think I can manage.
This question is not much different from http://www.talendforge.org/forum/viewtopic.php?pid=101372#p101372.
I don't know how many different keys there will be, so I want to build something that copes with new keys.
Essentially I want to:
Read -> group based on key -> for each group transform to JSON.
The JSON transformation is something I'm happy to play with; my question is really about the grouping.
Following the thread linked above, I've built this:
tFileInputDelimited --- row1 (main) ---> tFlowToIterate --- iterate ---> tFixedFlowInput --- row2 (main) ---> tFileOutputDelimited
This creates lots of keyed filenames, which is good, but the content of each file is the same on every row, when it shouldn't be.
Any ideas?
David

Re: Split CSV to many files based on key

Hi David
tFileInputDelimited --- row1 (main) ---> tFlowToIterate --- iterate ---> tFixedFlowInput --- row2 (main) ---> tFileOutputDelimited
This creates lots of keyed filenames, which is good, but the content of each file is the same on every row, when it shouldn't be.

Set a dynamic file path based on the first column, for example:
"D:/file/"+(String)globalMap.get("row1.column1")+".csv"
and check the 'Append' option on tFileOutputDelimited, so that each record is appended to the existing file for its key instead of overwriting it.
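If it helps to see the idea outside Talend, here is a minimal plain-Java sketch of what a dynamic file path plus append amounts to: for every row, build the output path from the key and append the row to that file. The class name SplitByKey, the method split and the naive comma split are my own assumptions for illustration; they are not part of Talend or of the job above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

// Hypothetical helper, not a Talend class.
class SplitByKey {

    // Appends each line to <outDir>/<key>.csv, where key is the trimmed first column.
    // New keys need no special handling: their file is created on first use.
    static void split(List<String> lines, Path outDir) throws IOException {
        Files.createDirectories(outDir);
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            // Naive split: assumes the key contains no commas, quotes or path separators.
            String key = line.split(",", 2)[0].trim();
            Path target = outDir.resolve(key + ".csv");
            // CREATE + APPEND plays the role of the 'Append' option:
            // the file is created if missing, otherwise the row is added at the end.
            Files.write(target,
                    (line + System.lineSeparator()).getBytes(StandardCharsets.UTF_8),
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        }
    }
}
```

Because the files are opened in append mode, running this twice against the same output folder adds every row a second time. The same applies to the Talend job with 'Append' checked, so clear the output folder before each run.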
Shong