Splitting one row into header, detail and trailer records in Talend

Five Stars

Splitting one row into header, detail and trailer records in Talend

Hi I have a requirement where I have one file with several rows.  For each row in the input record I have to create multiple details records in the output file.  Over all the output file should have one header record in the beginning and one trailer record at the end of the file.

 

For example.

If the my input record is like this

Hari|Student|300

Krishnan|Professor|800

 

My output should look like

 

Header|Class Details

Hari|Student

Student|300

Krishnan|Professor

Professor|800

Trailer|1100

 

Please let me know how can i achieve this in Talend


Accepted Solutions
Highlighted
Fifteen Stars TRF
Fifteen Stars

Re: Splitting one row into header, detail and trailer records in Talend

There is no direct solution for this case but you can do it by yourself with the following job:

job.png

 

- tFixedFlowInput is just there to replace your input:

tFixed.png

 

- tSplitRow change the input flow from 3 to 2 fields with convertion of "cost" field to String:

tSplit.png

 

- tMap is the solution's core:

tMap.png

The purpose is to compute the cumulative cost and store it in a global variable as you cannot generate a new line from the tMap. The logic is the following :

  - isCostLine is true as soon as classDetails field contains digits only

  - if isCostLine is true, add classDetails value to the cumulativeCost local variable

  - here is the trick, store current value for cumulativeCost into a global variable to be reused by the next subjob (see the Expression constructor for cumulativeCost variable):

expr.png

 

- tHashOutput is used to store the intermediate result which looks like this:

|=--------+-----------=|
|header   |classDetails|
|=--------+-----------=|
|Hari     |Student     |
|Student  |300         |
|Krishnan |Professor   |
|Professor|800         |
'---------+------------'

After a new subjob is started to merged the intermediate result with the Trailer line to be constructed.

 

- tRowGenerator is used to create the trailer row with:

  - header field defined as a constant value set to "Trailer"

  - classDetails field populated with the value stored into the global variable "cumulativeCost"

tGenerate.png

 

- tUnite is used to merge the intermediate result from 1rst subjob with the trailer row generated.

 

The final result is as expected:

.---------+------------.
|      tLogRow_6       |
|=--------+-----------=|
|header   |classDetails|
|=--------+-----------=|
|Hari     |Student     |
|Student  |300         |
|Krishnan |Professor   |
|Professor|800         |
|Trailer  |1100        |
'---------+------------'

 


TRF
Five Stars

Re: Splitting one row into header, detail and trailer records in Talend

Hi @TRF thanks for a lot for this quick response with solution.  After I posted the question, I was continuing to figuring out a solution and I finally got to the following design.  The thing that really did the trick is using the 'Create Join Table from' option in tMap instead of doing a 'New Output'.  

 

I was able to use this option and generate several output records with single input record.  

 

2019-08-09_16-44-49.jpeg

 

The following is the complete job design.  

 

2019-08-09_16-47-37.jpeg

2019-08-09_17-03-39.jpeg

1. I used the tFixedFlowInput_1 to generate the Header record.  Wrote the record into a file using tFileOutputDelimited_3.  

2. In tMap as I mentioned above, I used the Create Join From Table option and generated different details records but as you know this option will write all of them into one output flow.  The only challenge I had here, not all details records had same number of columns, so I padded the extra columns of the detail records with "$".  

3. I wrote the records into the same file as where I wrote the Header record in 'Append' Mode with pipe delimiter.

4. I removed the extra padded $ along with the pipe delimiter using the tReplace component and wrote them into a new file.  

5. Finally used another tFixedFlowInput_2 to write the Trailer record into the new file.  


All Replies
Highlighted
Fifteen Stars TRF
Fifteen Stars

Re: Splitting one row into header, detail and trailer records in Talend

There is no direct solution for this case but you can do it by yourself with the following job:

job.png

 

- tFixedFlowInput is just there to replace your input:

tFixed.png

 

- tSplitRow change the input flow from 3 to 2 fields with convertion of "cost" field to String:

tSplit.png

 

- tMap is the solution's core:

tMap.png

The purpose is to compute the cumulative cost and store it in a global variable as you cannot generate a new line from the tMap. The logic is the following :

  - isCostLine is true as soon as classDetails field contains digits only

  - if isCostLine is true, add classDetails value to the cumulativeCost local variable

  - here is the trick, store current value for cumulativeCost into a global variable to be reused by the next subjob (see the Expression constructor for cumulativeCost variable):

expr.png

 

- tHashOutput is used to store the intermediate result which looks like this:

|=--------+-----------=|
|header   |classDetails|
|=--------+-----------=|
|Hari     |Student     |
|Student  |300         |
|Krishnan |Professor   |
|Professor|800         |
'---------+------------'

After a new subjob is started to merged the intermediate result with the Trailer line to be constructed.

 

- tRowGenerator is used to create the trailer row with:

  - header field defined as a constant value set to "Trailer"

  - classDetails field populated with the value stored into the global variable "cumulativeCost"

tGenerate.png

 

- tUnite is used to merge the intermediate result from 1rst subjob with the trailer row generated.

 

The final result is as expected:

.---------+------------.
|      tLogRow_6       |
|=--------+-----------=|
|header   |classDetails|
|=--------+-----------=|
|Hari     |Student     |
|Student  |300         |
|Krishnan |Professor   |
|Professor|800         |
|Trailer  |1100        |
'---------+------------'

 


TRF
Five Stars

Re: Splitting one row into header, detail and trailer records in Talend

Hi @TRF thanks for a lot for this quick response with solution.  After I posted the question, I was continuing to figuring out a solution and I finally got to the following design.  The thing that really did the trick is using the 'Create Join Table from' option in tMap instead of doing a 'New Output'.  

 

I was able to use this option and generate several output records with single input record.  

 

2019-08-09_16-44-49.jpeg

 

The following is the complete job design.  

 

2019-08-09_16-47-37.jpeg

2019-08-09_17-03-39.jpeg

1. I used the tFixedFlowInput_1 to generate the Header record.  Wrote the record into a file using tFileOutputDelimited_3.  

2. In tMap as I mentioned above, I used the Create Join From Table option and generated different details records but as you know this option will write all of them into one output flow.  The only challenge I had here, not all details records had same number of columns, so I padded the extra columns of the detail records with "$".  

3. I wrote the records into the same file as where I wrote the Header record in 'Append' Mode with pipe delimiter.

4. I removed the extra padded $ along with the pipe delimiter using the tReplace component and wrote them into a new file.  

5. Finally used another tFixedFlowInput_2 to write the Trailer record into the new file.  

2019 GARNER MAGIC QUADRANT FOR DATA INTEGRATION TOOL

Talend named a Leader.

Get your copy

OPEN STUDIO FOR DATA INTEGRATION

Kickstart your first data integration and ETL projects.

Download now

What’s New for Talend Summer ’19

Watch the recorded webinar!

Watch Now

Put Massive Amounts of Data to Work

Learn how to make your data more available, reduce costs and cut your build time

Watch Now

How OTTO Utilizes Big Data to Deliver Personalized Experiences

Read about OTTO's experiences with Big Data and Personalized Experiences

Blog

Talend Integration with Databricks

Take a look at this video about Talend Integration with Databricks

Watch Now