java.lang.OutOfMemoryError: GC overhead limit exceeded

One Star

java.lang.OutOfMemoryError: GC overhead limit exceeded

Hello,

I have a simple job that processes a 2.2GB XML file. It runs out of memory and errors out even though it is run with the parameters -Xmx3072M and -Xms3072M. The generation mode is also set to low memory consumption (SAX).
Exception in thread "main" java.lang.Error: java.lang.OutOfMemoryError: GC overhead limit exceeded
at morningstar_01.httrs_performance01_tlog_0_1.Httrs_Performance01_tLog.tFileInputXML_1Process(Httrs_Performance01_tLog.java:1175)
at morningstar_01.httrs_performance01_tlog_0_1.Httrs_Performance01_tLog.runJobInTOS(Httrs_Performance01_tLog.java:1353)
at morningstar_01.httrs_performance01_tlog_0_1.Httrs_Performance01_tLog.main(Httrs_Performance01_tLog.java:1221)
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.Arrays.copyOfRange(Unknown Source)
at java.lang.String.<init>(Unknown Source)
at java.lang.String.substring(Unknown Source)
at org.talend.xml.sax.SAXLoopHandler.startElement(SAXLoopHandler.java:176)
at org.talend.xml.sax.SAXLoopCompositeHandler.startElement(SAXLoopCompositeHandler.java:64)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.startElement(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanStartElement(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl.parse(Unknown Source)
at org.talend.xml.sax.ComplexSAXLooper.parse(ComplexSAXLooper.java:164)
at org.talend.xml.sax.SAXLooper.parse(SAXLooper.java:129)
at morningstar_01.httrs_performance01_tlog_0_1.Httrs_Performance01_tLog.tFileInputXML_1Process(Httrs_Performance01_tLog.java:800)
... 2 more
What do I have to change to get this job to run without memory errors? I have also tried increasing -Xms and -Xmx in C:\Talend\5.1.2\studio\JETL-r90681-V5.1.2\JETL-win-x86_64.ini, but got the same result. The settings are below:
-vmargs
-Xms2048m
-Xmx2048m
-XX:MaxPermSize=512m
-Dfile.encoding=UTF-8
One Star

Re: java.lang.OutOfMemoryError: GC overhead limit exceeded

Hello,

Any ideas on this would be greatly appreciated.
Thanks.
Employee

Re: java.lang.OutOfMemoryError: GC overhead limit exceeded

What is the structure of your document, and how is the tFileInputXML component configured?
Depending on what you want to extract, even SAX might need a lot of memory.
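For context on why SAX is the low-memory mode: a SAX pass touches one event at a time instead of building a tree. A minimal sketch with standard JAXP (not the Talend component itself; the element name `row` is just an example) that counts repeating elements without retaining them:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class SaxCountDemo {
    public static void main(String[] args) throws Exception {
        // Tiny inline document standing in for a multi-GB file.
        String xml = "<root><row id=\"1\"/><row id=\"2\"/><row id=\"3\"/></root>";
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        final int[] rows = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                // Each element is processed and immediately discarded;
                // nothing accumulates per record.
                if ("row".equals(qName)) rows[0]++;
            }
        };
        parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        System.out.println("rows=" + rows[0]);
    }
}
```

Even in this model, memory can still grow if the extraction keeps per-record strings alive (as the `String.substring` frames in the trace suggest), which is why the loop configuration matters.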
Seventeen Stars

Re: java.lang.OutOfMemoryError: GC overhead limit exceeded

Increasing the memory parameters of the Studio has no effect on the job!
Please open the job, navigate to the Run view, and increase the memory parameters in the advanced settings.
You have to use the same parameters, of course. Please keep in mind that a job runs in its own JVM and therefore needs its own configuration.
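Concretely, the heap settings from the original post would go into the job's Run view, under advanced settings, as JVM arguments (one per line), for example:

```
-Xms3072M
-Xmx3072M
```

These apply to the job's own JVM; the `-vmargs` section of the Studio .ini file only configures the Studio process itself.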
One Star

Re: java.lang.OutOfMemoryError: GC overhead limit exceeded

You may want to check out ScaleDOM, which allows you to parse very large XML files: https://github.com/whummer/scaleDOM
ScaleDOM has a small memory footprint due to lazy loading of XML nodes: it keeps only a portion of the XML document in memory and re-loads nodes from the source file when necessary.
Seventeen Stars

Re: java.lang.OutOfMemoryError: GC overhead limit exceeded

ScaleDOM does not solve his problem; he is already using a parser with a minimal memory footprint. The problem is the huge number of temporary objects being created and discarded, and the working space the GC has for managing them is a bit too small.
I would suggest playing a bit with the GC parameters:
http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html
The overall heap space is probably large enough; it looks like some of the GC's working area is too small.
And please check where you change that: modifying the Studio parameters in the ini file has no effect on jobs!
The only way to tune a job is in its Run view, under the advanced settings.
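As an illustration of the kind of flags meant here (all from the linked Oracle HotSpot reference), the job's JVM arguments in the Run view might be extended along these lines:

```
-Xms3072m
-Xmx3072m
-verbose:gc
-XX:+PrintGCDetails
-XX:-UseGCOverheadLimit
```

`-verbose:gc` and `-XX:+PrintGCDetails` show whether collections are frequent and unproductive; `-XX:-UseGCOverheadLimit` disables the 98%-time-in-GC check that raises this particular error, but note that it only masks the symptom (the job will likely die with a plain heap-space error instead) rather than fixing the underlying allocation pressure.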
One Star

Re: java.lang.OutOfMemoryError: GC overhead limit exceeded

Please check my snapshot; I get an out-of-memory error.
So I went to Window > Preferences > Run/Debug and changed the memory size to a higher one.
I still got the error, though it ran for a longer time before throwing it this time.
I'm using Talend Open Studio for Data Integration.

I'm pulling data from different MongoDB and MySQL sources.
Seventeen Stars

Re: java.lang.OutOfMemoryError: GC overhead limit exceeded

hi
As said several times before, increasing memory is not a solution.
You have to find the cause before deciding on an action to correct it (improve/optimize your job).
Your error tells you that your garbage collector is working too hard / too often:
you're working with too much data in memory (Java objects), so you have to rethink how you manage your data.
regards
laurent
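The point above is usually addressed by aggregating records as they stream past instead of retaining them all. A minimal, generic sketch of the contrast (not Talend-specific; the numbers are arbitrary):

```java
public class StreamingSum {
    public static void main(String[] args) {
        // Anti-pattern (shown as a comment only): buffer everything, then aggregate.
        //   List<Long> all = new ArrayList<>();
        //   for (...) all.add(record);   // grows with input size -> GC pressure, then OOM
        //
        // Streaming alternative: fold each record into a running aggregate
        // and let the record become garbage immediately.
        long sum = 0;
        for (long i = 1; i <= 1_000_000; i++) {
            sum += i; // each "record" is consumed at once; nothing per-record is retained
        }
        System.out.println("sum=" + sum);
    }
}
```

In a Talend job the same idea translates to avoiding components that materialize whole datasets in memory (large lookups, in-memory sorts) when the input does not fit.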