tFileInputExcel - Event Mode retrieves less rows?


This may already have been addressed somewhere and I just didn't find it.  I'm using Talend Open Studio for Data Integration, version 6.1.1.  I have a process whose system resource usage I'm trying to minimize, and part of that process reads an XLSX spreadsheet.  The difference in memory usage between User Mode and Event Mode is considerable: in Event Mode the Java process tops out at around 700 MB, while in User Mode it reaches about double that.  So I was exploring Event Mode.  However, something has led me to question whether Event Mode is reliable enough to use at all.

I'm reading an XLSX file that has 63319 rows of data.  It has 8 lines of header information above the data, with the header line itself on line 9.  In User Mode, all 63319 rows of data are read.  If I change nothing but the mode to Event Mode, only 63316 rows are read, though I haven't yet worked out which ones are missing.

Is there something in the documentation that I'm misreading that explains this behavior?

[Screenshots: User Mode - Basic settings; User Mode - Advanced settings; Event Mode - Basic settings; Event Mode - Advanced settings]


Accepted Solutions

Re: tFileInputExcel - Event Mode retrieves less rows?

Figured out the problem.  Event Mode skips blank rows entirely; it does not even produce a row of nulls.  The header section at the top of my spreadsheet contained several blank rows.  Because those blank rows were never counted, tFileInputExcel treated the 9 header rows as 7 and skipped the first 2 rows of data as part of the header.  As those header rows occasionally do have data in them (just not always), I have instead set the component to start from row 1, then check each row and filter out everything until I reach my final header row.
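
For anyone hitting the same thing, here is a rough sketch of the idea in plain Java. It is not Talend-generated code and not the exact job I built. The class name `HeaderFilter`, both method names, and the rule "the header row is the one whose first cell equals a known marker" are all assumptions made for illustration; in a real job the same check would sit in whatever filtering step follows tFileInputExcel.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only.
class HeaderFilter {

    /** Simulates what Event Mode hands downstream: blank rows never appear. */
    static List<List<String>> dropBlankRows(List<List<String>> sheet) {
        List<List<String>> out = new ArrayList<>();
        for (List<String> row : sheet) {
            boolean blank = true;
            for (String cell : row) {
                if (cell != null && !cell.trim().isEmpty()) {
                    blank = false;
                    break;
                }
            }
            if (!blank) {
                out.add(row);
            }
        }
        return out;
    }

    /**
     * Discards every row up to and including the header row, then keeps the rest.
     * Assumption: the header row is recognised by its first cell matching headerMarker.
     */
    static List<List<String>> rowsAfterHeader(List<List<String>> rows, String headerMarker) {
        List<List<String>> data = new ArrayList<>();
        boolean headerSeen = false;
        for (List<String> row : rows) {
            if (headerSeen) {
                data.add(row);
            } else if (!row.isEmpty() && headerMarker.equals(row.get(0))) {
                headerSeen = true; // the header row itself is not emitted
            }
        }
        return data;
    }

    public static void main(String[] args) {
        // Made-up sheet: 5 physical header rows (2 of them blank), then 3 data rows.
        List<List<String>> sheet = Arrays.asList(
            Arrays.asList("Report title"),
            Arrays.asList(""),
            Arrays.asList("Some note"),
            Arrays.asList(""),
            Arrays.asList("ID", "Name"),
            Arrays.asList("1", "a"),
            Arrays.asList("2", "b"),
            Arrays.asList("3", "c"));

        List<List<String>> seen = dropBlankRows(sheet);

        // Fixed header count of 5 applied to rows that no longer include the blanks.
        List<List<String>> fixedSkip = seen.subList(Math.min(5, seen.size()), seen.size());
        // Marker-based filtering on the same rows.
        List<List<String>> filtered = rowsAfterHeader(seen, "ID");

        System.out.println(fixedSkip.size()); // prints 1
        System.out.println(filtered.size());  // prints 3
    }
}
```

With a fixed header count, the two blank rows that never show up push two real data rows into the skipped header. Detecting the header row by content does not depend on how many blank rows were dropped.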

