One Star

TIS : tFileOutputExcel memory exceeded

Hi all,
I am trying to save some data to an Excel file using the tFileOutputExcel component.
When the checkbox "Write in format xlsx" is NOT checked, my job works very well and runs quickly.
The problem is that the number of rows can be greater than 65536. When that happens, I get an error.
I would like to use the xlsx format, but then the job is very, very slow and I get a Java heap size error...
By the way, I'm using TIS 4.2.2
Any idea ?
Best regards.
Ahhouais
10 REPLIES
One Star

Re: TIS : tFileOutputExcel memory exceeded

Hi Ahhouais
The xls format can only hold 65536 rows per sheet.
Why not use tFileOutputDelimited and save the data as a CSV file?
Besides, you said that performance was bad when "Write in format xlsx" was checked.
I think this is due to your job design.
Could you show us a screenshot of your job?
Regards,
Pedro
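
For anyone wanting to see the two points above in code (the xls row cap and writing delimited text instead), here is a minimal plain-Java sketch. It is not Talend-generated code: the class `XlsLimitSketch`, its method names, and the `;` separator are all assumptions made for illustration. Rows are written one at a time, so memory use does not grow with the row count.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.List;

// Hypothetical helper, not part of Talend: checks the xls row cap and
// streams rows to a delimited text output one line at a time.
class XlsLimitSketch {
    // Maximum number of rows a single sheet can hold in the old .xls format.
    static final int XLS_MAX_ROWS = 65536;

    // True when rowCount rows fit in one .xls sheet.
    static boolean fitsInXls(long rowCount) {
        return rowCount <= XLS_MAX_ROWS;
    }

    // Writes each row as one line, joining fields with ';' (assumed separator).
    // No quoting or escaping is done, so fields must not contain ';' or newlines.
    // Returns the number of rows written.
    static long writeDelimited(Writer out, Iterable<List<String>> rows) {
        long written = 0;
        try {
            for (List<String> row : rows) {
                out.write(String.join(";", row));
                out.write('\n');
                written++;
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return written;
    }
}
```

In a real run the `Writer` would typically be a `BufferedWriter` wrapped around a file, so nothing but the current row is held in memory.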
One Star

Re: TIS : tFileOutputExcel memory exceeded

Hi, we have the exact same problem and we are quite sure this is not a design problem. We generate a regular xls file with no problem, and then, just by checking the xlsx box, we get a Java heap size error.
I'll try to post a job to illustrate the problem.
One Star

Re: TIS : tFileOutputExcel memory exceeded

Hi
I don't think the Java heap error is the same as the issue in comment #1.
The Java heap size error indicates that you are loading too many rows in this Talend job.
Try increasing the JVM parameters.
Go to the Run tab -> Advanced settings -> Use specific JVM arguments, and increase the -Xms and -Xmx values.
Regards,
Pedro
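
As an illustration of that setting, the arguments might look like `-Xms256M` and `-Xmx1024M` (the values are only examples, pick what your machine allows). To confirm that a larger limit is actually in effect, the JVM can report its own heap ceiling. A minimal sketch, where the `HeapCheck` class is a made-up name for this example and not something Talend provides:

```java
// Hypothetical helper: reports the heap limits the running JVM actually uses.
class HeapCheck {
    private static final long MIB = 1024L * 1024L;

    // Upper bound the heap may grow to (governed by -Xmx), in MiB.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / MIB;
    }

    // Heap currently reserved by the JVM (starts near the -Xms value), in MiB.
    static long currentHeapMiB() {
        return Runtime.getRuntime().totalMemory() / MIB;
    }
}
```

Printing `HeapCheck.maxHeapMiB()` at the start of a run shows whether the new `-Xmx` value was picked up.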
One Star

Re: TIS : tFileOutputExcel memory exceeded

But the job is very very slow and I have a memory heap size error...

Hi Pedro,
I do think we have the same problem as Ahhouais.
I already changed my JVM parameters to use up to 4GB of memory. It doesn't make sense to need more than 4GB of memory to read 50,000 rows of an Excel file (xlsx) when the same job reads it correctly without JVM modifications if it is an old xls file, don't you think?
We are not the only ones:
http://www.talendforge.org/forum/viewtopic.php?id=16825
http://www.talendforge.org/forum/viewtopic.php?id=23742
http://www.talendforge.org/forum/viewtopic.php?id=24532
The 2007 format consumes a very large amount of memory for no apparent reason. I can load the exact same data from Excel (old format), CSV, or MySQL with a small 512MB JVM, and it is much faster.
During the day I'll build an example to try to explain better.
Best regards,
Maxime.
One Star

Re: TIS : tFileOutputExcel memory exceeded

Hi Maxime
Please show us the entire error log.
Regards,
Pedro
One Star

Re: TIS : tFileOutputExcel memory exceeded

Pedro,
Please find attached a screenshot of a simple job that reads the exact same file in 2 formats. The files have 58,000 rows; the 2003 format is a 68MB file and the 2007 format is a 25MB file.
The old format is read fine, while the new format raises a Java heap exception.
If you need the files and the job, please give me a way to send them to you (FTP?).
One Star

Re: TIS : tFileOutputExcel memory exceeded

Embedding the image didn't work :(
this is the link: http://imageshack.us/photo/my-images/822/errorxlsx.jpg/
Seventeen Stars

Re: TIS : tFileOutputExcel memory exceeded

hi all,
Perhaps the reason is that when the xlsx option is checked, the component creates a HashMap via the ExcelTool class with a fixed initial array size. If the number of elements in the HashMap exceeds (capacity * load factor), the underlying array is resized in memory while the old array is still being kept ...
http://stackoverflow.com/questions/235047/why-do-i-get-an-outofmemoryerror-when-inserting-50-000-obj...
regards
laurent
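
To make the (capacity * load factor) rule concrete, here is a small sketch that models how a `java.util.HashMap` with default settings (capacity 16, load factor 0.75) grows as entries are added. It only simulates the doubling rule; the class `HashMapGrowthSketch` is invented for this example and it does not look at what ExcelTool really allocates.

```java
// Hypothetical model of java.util.HashMap growth with its default settings.
// It does not inspect Talend's ExcelTool class.
class HashMapGrowthSketch {
    static final int DEFAULT_CAPACITY = 16;         // java.util.HashMap default
    static final float DEFAULT_LOAD_FACTOR = 0.75f; // java.util.HashMap default

    // Size of the backing array once `entries` keys have been inserted.
    static long finalCapacity(int entries) {
        long capacity = DEFAULT_CAPACITY;
        // A resize is triggered when the entry count exceeds capacity * loadFactor.
        while (entries > capacity * DEFAULT_LOAD_FACTOR) {
            capacity *= 2; // the table doubles on each resize
        }
        return capacity;
    }

    // Number of doublings needed to reach that capacity.
    static int resizeCount(int entries) {
        long capacity = DEFAULT_CAPACITY;
        int resizes = 0;
        while (entries > capacity * DEFAULT_LOAD_FACTOR) {
            capacity *= 2;
            resizes++;
        }
        return resizes;
    }
}
```

With these defaults, 50,000 entries end up in a 131,072-slot table after 13 doublings. Each doubling briefly needs both the old and the new array in memory, which is the effect described above.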
One Star

Re: TIS : tFileOutputExcel memory exceeded

Do you guys think I should file this as a bug?
One Star

Re: TIS : tFileOutputExcel memory exceeded

Hi
Recently some users encountered the same issue.
Please report it on BugTracker.
Regards,
Pedro