Split csv into many xml

Five Stars

Hello

I have a CSV file and I need to create a separate XML file for each row. The name of each XML file should match the value of a particular column in the CSV.

Sounds easy, but I cannot figure out how to do it...

 

I've built the following job, and in the tFileOutputXML component I selected "Split output in several files" with "Rows in each output file" set to 1:

 

Capture.JPG

 

I tried to use "row1.columnName" as the file name in tFileOutputXML, but it doesn't work: the files are named null1.xml, null2.xml, etc. As I understand it, this is because the output component is initialized once at the beginning of the job, when no data has been read from the CSV yet.

 

How can I achieve what I need?

Sixteen Stars TRF

Re: Split csv into many xml

Hi, 

That's because output files are opened when the subjob starts.
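The timing problem can be illustrated with a simplified plain-Java model. This is only a sketch and not the code Talend generates: the class name, method names and sample values are made up for illustration. It assumes the file name expression is evaluated once before any row is read, and that a split counter is then appended for each output file.

```java
import java.util.ArrayList;
import java.util.List;

class FileNameTiming {
    // Simplified model (assumption): the name expression is evaluated once,
    // before any row is read, and a split counter is appended per file.
    static List<String> namesEvaluatedOnce(List<String> columnValues) {
        String current = null;           // no row has been read yet
        String baseName = "" + current;  // concatenation turns null into "null"
        List<String> names = new ArrayList<>();
        int fileIndex = 1;
        for (String value : columnValues) {
            current = value;             // too late: baseName is already fixed
            names.add(baseName + fileIndex + ".xml");
            fileIndex++;
        }
        return names;
    }

    // What is actually wanted: the name is re-evaluated for every row.
    static List<String> namesEvaluatedPerRow(List<String> columnValues) {
        List<String> names = new ArrayList<>();
        for (String value : columnValues) {
            names.add(value + ".xml");
        }
        return names;
    }

    public static void main(String[] args) {
        List<String> values = List.of("alpha", "beta"); // hypothetical column values
        System.out.println(namesEvaluatedOnce(values));   // [null1.xml, null2.xml]
        System.out.println(namesEvaluatedPerRow(values)); // [alpha.xml, beta.xml]
    }
}
```

The first method reproduces the null1.xml, null2.xml naming from the question. The second shows the per-row evaluation that an iteration makes possible.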

Use the following design:

 

tFileInputDelimited-->tFlowToIterate--iterate-->tFixedFlowInput-->tFileOutputXML

tFlowToIterate transforms each field of the current row into a global variable.

The "iterate" connection transforms the flow into an iteration (to keep it simple, you may think of it as a separate subjob).

tFixedFlowInput starts a new flow, using the global variables to initialize the fields, like this:

Capture.PNG

This flow is pushed to tFileOutputXML, where you can reuse the global variable that contains the file name, like this:

 

"C:/Users/offic/Desktop/TestTalend/"+(String)globalMap.get("row57.newColumn")+".xml"
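To make the mechanics of that expression concrete, here is a minimal plain-Java sketch that can be run outside Talend. It assumes globalMap behaves like an ordinary Map<String, Object>, and that tFlowToIterate stores each field under a "rowName.columnName" key, as the "row57.newColumn" key above suggests. The class, the helper methods and the sample values ("invoice_001", "42") are hypothetical.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

class GlobalMapFileName {
    // Stand-in for Talend's globalMap (assumption: a plain key/value map).
    static final Map<String, Object> globalMap = new HashMap<>();

    // Mimics tFlowToIterate for one row: each field is stored
    // under the key "<rowName>.<columnName>".
    static void publishRow(String rowName, Map<String, String> row) {
        for (Map.Entry<String, String> field : row.entrySet()) {
            globalMap.put(rowName + "." + field.getKey(), field.getValue());
        }
    }

    // Same shape as the tFileOutputXML file name expression:
    // directory + (String) globalMap.get(key) + ".xml"
    static String outputPath(String directory, String key) {
        return directory + (String) globalMap.get(key) + ".xml";
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("newColumn", "invoice_001"); // hypothetical sample data
        row.put("amount", "42");
        publishRow("row57", row);
        System.out.println(
            outputPath("C:/Users/offic/Desktop/TestTalend/", "row57.newColumn"));
        // C:/Users/offic/Desktop/TestTalend/invoice_001.xml
    }
}
```

Because the map is refilled on every iteration, the path is rebuilt from the current row's value each time instead of being fixed when the subjob starts.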

Hope this helps.

 


TRF
Five Stars

Re: Split csv into many xml

Hi TRF

 

Thanks, that works (I suspected something like this). But this solution has a drawback: I have to maintain not only the schema, which I can store in the repository, but also the tFixedFlowInput mapping, where each column must be mapped to the corresponding global variable. It looks like a tautology: column1 = (String)globalMap.get("row1.column1"), and the same for every column, and there are many.

 

Is there any solution for that?
