One Star

Processing hangs when reading an Excel file many times

I am using Talend Open Studio for Data Integration, version 5.3.1 (build id r104014-20130618-0337), running on Windows 7. I am trying to read in a .xlsx file and do several things to it.
In order to keep the code clean, I have broken the processing up into many sub-jobs within one job. Each sub-job links to the next via On Subjob OK, and each sub-job starts with a tFileInputExcel on the same spreadsheet.
The spreadsheet has two sheets. Across the whole job I read both sheets, but any given sub-job reads only one of them.
Sheet 1 has 350 rows, and each row has up to 22 columns. Sheet 2 has 441 rows, and each row has up to 14 columns.
I get an intermittent problem, where the first sub-job runs OK, but the next one hangs after reading one line of the spreadsheet. There is no error; it just freezes. The first sub-job never hangs, but the second and later ones do.
The second sub-job splits up one line of the spreadsheet via a map, so that different columns of the line are processed differently (in parallel) and then inserted into the same db table (I'm using an MS SQL Server db). The spreadsheet is denormalised, and the db is normalised, so this is converting between the two worlds.
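To make the shape of that conversion concrete, here is a minimal sketch in plain Java of fanning one wide spreadsheet row out into several narrow records for a single table. It is not the Talend job itself: the class `RowFanOut`, the record type `Fact`, and the column names (`id`, `colour`, `size`, `weight`) are all hypothetical, and it assumes each non-empty column becomes one record keyed by the row's identifier.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RowFanOut {

    /** One normalised record destined for the shared table (hypothetical shape). */
    public static final class Fact {
        public final String key;
        public final String attribute;
        public final String value;

        Fact(String key, String attribute, String value) {
            this.key = key;
            this.attribute = attribute;
            this.value = value;
        }

        @Override
        public String toString() {
            return key + " | " + attribute + " | " + value;
        }
    }

    /** Turns one wide row into one Fact per non-empty column other than the key column. */
    public static List<Fact> split(Map<String, String> wideRow, String keyColumn) {
        String key = wideRow.get(keyColumn);
        List<Fact> facts = new ArrayList<>();
        for (Map.Entry<String, String> cell : wideRow.entrySet()) {
            if (cell.getKey().equals(keyColumn)) {
                continue; // the key is carried on every record, not emitted on its own
            }
            String value = cell.getValue();
            if (value == null || value.trim().isEmpty()) {
                continue; // rows have "up to" N columns, so some cells are empty
            }
            facts.add(new Fact(key, cell.getKey(), value));
        }
        return facts;
    }

    public static void main(String[] args) {
        // Hypothetical row: column names and values are made up for illustration.
        Map<String, String> row = new LinkedHashMap<>();
        row.put("id", "42");
        row.put("colour", "red");
        row.put("size", "");
        row.put("weight", "3kg");
        for (Fact fact : split(row, "id")) {
            System.out.println(fact);
        }
    }
}
```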
I have found that if I deactivate most of the second sub-job, so that only the processing of the first set of columns remains, then it will run OK. Afterwards, I re-activate the rest of the sub-job, and the whole lot will run OK. However, the next (third) sub-job will often then hit the same problem as the second sub-job, which I can work around in the same way.
One Star

Re: Processing hangs when reading an Excel file many times

Sorry for the duplicates - when I originally clicked submit, it hung so I assumed it hadn't succeeded at all and retried.
Seventeen Stars

Re: Processing hangs when reading an Excel file many times

I know this problem and currently have no explanation for it. It was one of my motivations for writing my own Excel-related components.
tFileExcelWorkbookOpen opens the file once. You can then use tFileExcelSheetList to iterate through the sheets, tFileExcelSheetInput to read a sheet, and tFileExcelSheetOutput to write to one, all in the same job and without having to open the file again and again.
With tFileExcelWorkbookSave you can save the file wherever you want.
You will find these components on Talend Exchange: search for "excel".
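The open-once idea can be illustrated outside Talend with nothing but the JDK. This is only a sketch and says nothing about how the components above are implemented. It relies on the fact that an .xlsx file is a ZIP package whose worksheets are stored as XML parts under xl/worksheets/, so a single `ZipFile` handle can serve every sheet. The class name `OpenOnce` and the method `countRowsPerSheet` are hypothetical, and counting `<row>` elements stands in for real cell parsing, which a spreadsheet library would normally do.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Enumeration;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class OpenOnce {

    /** Opens the .xlsx a single time and counts the stored rows in each worksheet part. */
    public static Map<String, Integer> countRowsPerSheet(String xlsxPath)
            throws IOException, XMLStreamException {
        Map<String, Integer> counts = new LinkedHashMap<>();
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);

        // One file handle for the whole run; every sheet is read through it.
        try (ZipFile workbook = new ZipFile(xlsxPath)) {
            Enumeration<? extends ZipEntry> entries = workbook.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                String name = entry.getName();
                if (!name.startsWith("xl/worksheets/") || !name.endsWith(".xml")) {
                    continue; // skip styles, shared strings, relationship parts, etc.
                }
                try (InputStream in = workbook.getInputStream(entry)) {
                    counts.put(name, countRows(factory, in));
                }
            }
        }
        return counts;
    }

    /** Streams through one worksheet part and counts its row elements. */
    private static int countRows(XMLInputFactory factory, InputStream in)
            throws XMLStreamException {
        XMLStreamReader reader = factory.createXMLStreamReader(in);
        try {
            int rows = 0;
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "row".equals(reader.getLocalName())) {
                    rows++;
                }
            }
            return rows;
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws Exception {
        // Usage: java OpenOnce path/to/workbook.xlsx
        for (Map.Entry<String, Integer> e : countRowsPerSheet(args[0]).entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue() + " rows");
        }
    }
}
```

The point of the sketch is the single try-with-resources block: the file is opened and closed exactly once, however many sheets are read in between.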