I am new to Talend and have a DI scenario that I'm looking for help with. My input Excel s/sheet is in this format: Account Description Amount 12345 Transfer 100.00 Tax 50.00 23232 Transfer 200.00 Tax 75.00 And I need my output Excel s/sheet in this format: Account Amount Tax 12345 100.00 50.00 23232 200.00 75.00 Two rows in the input s/sheet will generate one row in the output s/sheet. What is the easiest way for me to this with Talend?
I would use the tFileExcelSheetInput component (from Talend Exchange) to read the excel file and configure the column Account to get always the last not empty value. This way you will get this: Account Description Amount 12345 Transfer 100.00 12345 Tax 50.00 23232 Transfer 200.00 23232 Tax 75.00 send this inputflow to a tMap. In the tMap add an output flow with these columns: Account Amount Tax Now create a new output flow out2 as joined output for out1. For the first flow map the input Amount to output Amount and Account to Account. For the second flow map the input Amount to Tax and Account to Account. The output flow out1 now send to an tAggregateSortedRow componente and take care you get the sum of Amount and Tax and use Account as key field. Because of only one row has a value and the other is null you get the wanted value. This way you will get your desired result. tFileExcelWorkbookOpen --OnSubjobOK--> tFileExcelSheetInput ---> tMap ---> tAggregateSortedRow --> tFileOutputExcel (your target excel file)
Hi jmceachern With the functionality "Use last value for empty" of tFileExcelSheetInput developed and shared by jlolling, it is easy to filling the Account column with last value when it is empty or null, that's great! Here I give you another solution without using a custom component as a reference: store the current Account into memory for used later when the Account is empty or null, please see my screenshots. expression of Account column of out1: