Out of memory exceptions

One Star

Out of memory exceptions

Hi,
I am doing a Talend job which performs a lookup on a big table 4,724,567 I am getting the following output when I run the job: Exception in thread "main" java.lang.Error: java.lang.Error: java.lang.OutOfMemoryError: Java heap space..........................
Is there a way doing this job in batches and therefore not run out of memory?
Thanks, Dave.
Community Manager

Re: Out of memory exceptions

Check out this article in the talend help center:
https://community.talend.com/t5/Migration-Configuration-and/OutOfMemory-Exception/ta-p/21669?content...
To be able to answer your question, we need screenshots showing your job design to see if and where optimization could be done.
One Star

Re: Out of memory exceptions

Using disk space rather than RAM might be the way to go. I'll try this in the morning and report back. Thanks for your help esabot Smiley Happy
Seventeen Stars

Re: Out of memory exceptions

There is unfortunately a missing issue! In case of databases like MySQL or PostgreSQL you have to check if the input component use a cursor or in case of MySQL use the streaming mode!
If you miss that options the JDBC driver collects all datasets in the memory until the first data set will be delivered to the application (here your talend job).
One Star

Re: Out of memory exceptions

Writing to the disk seemed to fix that instance but I am still getting warnings:
Warning: to avoid a Memory heap space error the buffer of the flow has been limited to a size of 281710 , try to reduce the advanced parameter "Max buffer size" (~100000 or at least less than 281710), then if needed try to increase the JVM Xmx parameter.
Can someone explain how "to reduce the advanced parameter "Max buffer size"?
Also if I "increase the JVM Xmx parameter" what do I increase it to?
Thanks, Dave.
One Star

Re: Out of memory exceptions

Hi,
First,You should select you job.
Then,You can find JVM Xmx parameter in windows-->perference-->Talend-->run/debug.
In face, you should google it first Smiley Happy .
Best regards
One Star

Re: Out of memory exceptions

I don't know what to change them to. I doubled both of them and the warning has gone. Can I ask if Xms512M & Xmx2048M is recommended?
One Star

Re: Out of memory exceptions

I'm facing same issue.
Any update on this??
One Star

Re: Out of memory exceptions

I'm a novice, but my understanding is that these settings just limit the amount of memory Java can use at runtime. You can set them to whatever you want, even more than the max memory on your machine. That would allow java to max out your system, not allowing anything else to use the memory while the java job is running.
I ran into an issue with reading an excel file. My original settings were Xms256M & Xmx1024M. I decided not to play with incremental changes to find where it would work. I changed them to Xms256M & Xmx10G and my files processed just fine.
I am using a 64 bit Windows 8.1 machine with 16G ram. I used 10G because I figured 5G for other systems in the short period I'm running these jobs was good enough.
Seventeen Stars

Re: Out of memory exceptions

Probably you dealing with a very large Excel file.
You could try the component tFileExcelSheetInput or tFileExcelSheetOutput with tFileExcelWorkbookOpen.
The last one is able to create a streaming excel workbook (called memory saving mode) which holds not all records of the excel file but only a smaller number (default is 100). This way you can process very large excel files with a small memory footprint.
http://www.talendforge.org/exchange/index.php?eid=623&product=tos&action=view&nav=0,1,1
tFileExcelWorkbookOpen -> tFileExcelSheetInput ---> .... process your flow...
I have introduced the memory saving mode because I run in the same issues you currently have ;-)
The disadvantage of the memory saving mode is you cannot read randomly in the workbook, you have to process the rows when you receive them.
Four Stars

Re: Out of memory exceptions

Hi Jan

 

I'm trying to use tFileExcelWorkbookOpen and tFileExcelSheetInput to load a 17Mb xlsx 250k rows xlsx file, but get a GC error.

 

The file is different to some others of similar size which do load, so it looks like the routines aren't processing it quite right. When I do a file <filename> it indicates its a Microsoft OOXML file, whereas the others are Microsoft Excel 2007+

 

Would you expect to be able to read this file, or should I convert it by opening in Excel and re-writing it?

 

 

 

Four Stars

Re: Out of memory exceptions

I sent that too soon, it appears the other format has the same problem, even with mx:4g.

 

Any chance you can modify the code to allow streaming?

 

The tFileInputExcel component doesn't exhibit this problem, but has a problem formatting dates consistently between the OOXML and the Excel format files, which your component resolved for smaller files.

Moderator

Re: Out of memory exceptions

Hello,

Have you tried to set up Excel metadata with 'Read excel2007 file format' when use tFileInputExcel component to read your files?

Best regards

Sabrina

--
Don't forget to give kudos when a reply is helpful and click Accept the solution when you think you're good with it.
Eight Stars

Re: Out of memory exceptions

Hi,

 

I'm guessing downstream of the t{something}Input component., you're doing something with the data, could it be that is the issue?

 

For example, on a tMap, putting something into the "Temp data directory path" and putting the "Max buffer size" to something sensible.

 

Thanks

 

David

Regards

David

Don't forget to give Kudos when an answer is helpful or the solution.
Four Stars

Re: Out of memory exceptions

No the problem can be repeated even with just the tFileExcelWorkbookOpen component in a job which isn't great as there are no options to try to solve it. Even with -Xmx2048M it fails, so something is wrong given the xlsx is less than 20Mb. It grinds away for about a minute then raises the GC overhead limit exceeded exception.

Four Stars

Re: Out of memory exceptions

Yes had that set for both formats yet the OOXML generated dd/mm/yyyy and the Microsoft Excel 2007+ format was m/d/yy. Both were being displayed as Date dd/mm/yyyy when you opened them in Excel or Libre Office. I'm going to have to get the component to throw an exception if the wrong file format is supplied and get it converted.

Eight Stars

Re: Out of memory exceptions

Slightly different solution to consider, use a freely available Java library to extract the data for a given sheet out of the XLS into a CSV and then import the CSV.

 

Thanks

 

David

Regards

David

Don't forget to give Kudos when an answer is helpful or the solution.