I want to split a giant CSV file into several smaller files according to the first three characters of each row. I have the following:
tFileInputFullRow --(row1)--> tJavaRow --(row2)--> tFileOutputRaw
* tFileInputFullRow reads each line into a "line" column
* tJavaRow reads:
output_row.line = input_row.line;
* tFileOutputRaw has as filename "path/to"+((String)globalMap.get("rowType"))+".csv"
All I get as a result is a null.csv file staring back at me.
However, when I do:
tFileInputFullRow --(row1)--> tMap --(row2)--> tFlowToIterate --(iterate)--> tFixedFlowInput --> tFileOutputDelimited
* tMap adds a new column, type, to row2, defined just as in the tJavaRow above
* tFileOutputDelimited has the same name as tFileOutputRaw.
This time I do get the different files created!!
Why does this happen? I'm asking because the first solution runs much faster than the second (mainly because it doesn't have to iterate over each of the 50 columns for each of the 600,000+ rows).
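For reference, the split I'm after can be sketched in plain Java outside Talend. This is only a minimal sketch under assumptions: the class `PrefixSplitter` and its `split` method are hypothetical names, the first three characters are assumed to be safe to use in a file name, and lines shorter than three characters are simply skipped.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

public class PrefixSplitter {

    // Hypothetical helper: copies each line of `input` to outDir/<first three chars>.csv
    // and returns how many lines went to each prefix.
    public static Map<String, Integer> split(Path input, Path outDir) throws IOException {
        Map<String, BufferedWriter> writers = new HashMap<>();
        Map<String, Integer> counts = new HashMap<>();
        try (BufferedReader reader = Files.newBufferedReader(input, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.length() < 3) {
                    continue; // assumption: rows too short to classify are skipped
                }
                String key = line.substring(0, 3);
                BufferedWriter writer = writers.get(key);
                if (writer == null) {
                    // Open one writer per prefix the first time that prefix is seen
                    writer = Files.newBufferedWriter(outDir.resolve(key + ".csv"), StandardCharsets.UTF_8);
                    writers.put(key, writer);
                }
                writer.write(line);
                writer.newLine();
                counts.merge(key, 1, Integer::sum);
            }
        } finally {
            // Close (and flush) every writer that was opened
            for (BufferedWriter writer : writers.values()) {
                writer.close();
            }
        }
        return counts;
    }
}
```

The point of the sketch is that the output file is chosen per row and each file is opened only once, so the whole input is read in a single pass.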
Even though the global variable is changed? (I have added another tJavaRow just to print the value of the global var and the value effectively changes each row...)
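One way to picture how both observations can be true at once is the plain-Java sketch below. It assumes (this is not confirmed in this thread) that the output component evaluates its filename expression once, when the component initializes and before any row has flowed, while an iterate link re-evaluates it on every iteration. The class `FileNameTiming` and its methods are hypothetical; the filename expression is copied as written above, so the literal results start with "path/to".

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FileNameTiming {

    // The filename expression from the output component, as written in the post
    public static String fileName(Map<String, Object> globalMap) {
        return "path/to" + ((String) globalMap.get("rowType")) + ".csv";
    }

    // Assumed behaviour of the main-flow job: the name is fixed before the first row
    public static String evaluatedOnce(List<String> rows) {
        Map<String, Object> globalMap = new HashMap<>();
        String name = fileName(globalMap); // "rowType" is not set yet, so it concatenates as "null"
        for (String row : rows) {
            // The global variable does change per row, but `name` is never recomputed
            globalMap.put("rowType", row.substring(0, 3));
        }
        return name;
    }

    // Assumed behaviour of the iterate job: the name is recomputed for each row
    public static List<String> evaluatedPerRow(List<String> rows) {
        Map<String, Object> globalMap = new HashMap<>();
        List<String> names = new ArrayList<>();
        for (String row : rows) {
            globalMap.put("rowType", row.substring(0, 3));
            names.add(fileName(globalMap));
        }
        return names;
    }
}
```

Under that assumption, a tJavaRow that prints the global variable would show it changing on every row, while the already-opened output file keeps the name that was computed from the initial null value.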