One Star

When using tFileInputJson, java.lang.OutOfMemoryError: Java heap space

Hi,
I need to import data from a 10MB JSON file into a MySQL database, but I'm running out of memory:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2367)
at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:130)
at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:114)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:415)
at java.lang.StringBuffer.append(StringBuffer.java:237)
at org.json.simple.JSONArray.toJSONString(Unknown Source)
at org.json.simple.JSONArray.toJSONString(Unknown Source)
at org.json.simple.JSONArray.toString(Unknown Source)
at projet_nfe212_velibs.dynamicdata_0_1.DynamicData.tFileList_1Process(DynamicData.java:687)
at projet_nfe212_velibs.dynamicdata_0_1.DynamicData.runJobInTOS(DynamicData.java:1054)
at projet_nfe212_velibs.dynamicdata_0_1.DynamicData.main(DynamicData.java:919)
I don't know how to raise the memory limit, or whether that is even possible. I have found a couple of other posts, but none of them really solved my problem, e.g. "Defining the maximum memory size threshold".

Help please?
Boris
4 REPLIES
Community Manager

Re: When using tFileInputJson, java.lang.OutOfMemoryError: Java heap space

One Star

Re: When using tFileInputJson, java.lang.OutOfMemoryError: Java heap space

Thank you esabot,
these articles helped me get one step further! I raised the memory for the job by specifying -Xmx2048M, as advised in the first article. I appreciate your quick response!
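To confirm that the -Xmx setting actually reaches the JVM that runs the job, a small check like the one below can help. It is only a sketch: the class name HeapCheck is made up for illustration, and on some JVMs the reported value is slightly below the -Xmx figure.

```java
public class HeapCheck {

    /** Returns the maximum heap the JVM will try to use, in MiB. */
    public static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Run with the same JVM arguments as the job, e.g. java -Xmx2048M HeapCheck
        System.out.println("Max heap (MiB): " + maxHeapMiB());
    }
}
```

If the printed value is far below 2048, the option is probably not being passed to the process that runs the job.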
However, I am still having trouble with this job. It now gets past the point where it used to crash, but then memory allocation goes berserk (it allocates more than 2GB) and the job takes forever. Eventually it crashes after a long while, ruining my last hopes.
I thought my JSONPath query was incorrect, but the same query works fine on smaller subsets of the original JSON file. I don't think the file is corrupted, because it works fine with jq (a command-line tool for manipulating JSON files). So, my two questions:
1- Why does Talend need to allocate more than 2GB for a standard 100MB JSON file (not 10MB as previously mentioned)?
2- Why is it so slow, when jq takes two seconds in my bash terminal?
I can provide the JSON file if that helps. My setup is a 2010 MacBook Pro with 4GB of RAM, running Mavericks and Talend Open Studio 5.4.
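A rough back-of-the-envelope sketch of one possible contributor to the memory use, based on the first stack trace (JSONArray.toString growing a StringBuffer through Arrays.copyOf). It assumes a JVM older than Java 9, where each char takes 2 bytes, and the "double plus 2" growth rule of OpenJDK's AbstractStringBuilder. The class and method names are illustrative only, and the estimate ignores the parsed object tree, which comes on top.

```java
public class SerializedSizeEstimate {

    /** Bytes needed to hold the given number of chars as UTF-16, ignoring object headers. */
    public static long utf16Bytes(long chars) {
        return chars * 2L;
    }

    /**
     * Bytes held while a full buffer is copied into a larger one:
     * the old array and the new array are both alive during the copy.
     */
    public static long peakBytesDuringGrowth(long fullChars) {
        long grownChars = fullChars * 2L + 2L; // assumed growth rule
        return utf16Bytes(fullChars) + utf16Bytes(grownChars);
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        long docChars = 100L * mib; // assumes roughly one char per byte of a 100MB file
        System.out.println("Final string (MiB): " + utf16Bytes(docChars) / mib);
        System.out.println("Peak while growing a full 64 Mi-char buffer (MiB): "
                + peakBytesDuringGrowth(64L * mib) / mib);
    }
}
```

Under these assumptions the serialized text alone would take a few hundred MiB, which does not by itself explain the 2GB but shows how quickly the whole-document approach adds up.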
Many thanks for your help!
Boris
Community Manager

Re: When using tFileInputJson, java.lang.OutOfMemoryError: Java heap space

Hi Boris,
Sorry for the delay, you did well to ping me.
Actually, I'm not sure why this takes so long. I've submitted your case to the dev team. If you don't mind, please send over the file so we can try to reproduce the problem.
Just a quick question: in this job, you don't do anything other than read the file and display it on the console, right? (That's what it looks like in your screenshot.)
One Star

Re: When using tFileInputJson, java.lang.OutOfMemoryError: Java heap space

Hi Esabot,
Here is a link to the JSON file:
https://dl.dropboxusercontent.com/u/81994364/velib_2013-12-01.jsonr.gz
It is structured like this : ,,,...]
The job is the simplest possible test case and it fails every time. I just import one field of the JSON file using JSONPath="$..available_bikes" (the same goes for other fields...). I use a tFileList, a tFileInputJSON and a tLogRow.
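Since only one field is needed, a streaming scan that never builds the whole document in memory is one possible workaround outside the tFileInputJSON component. The sketch below is not Talend's implementation and not a real JSON parser: it assumes the field holds plain integers and that the key contains no double quotes. The class name, method names and sample records (including the "name" field) are invented for illustration.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class StreamingFieldScan {

    /** Collects the integer values of every "key": <int> pair, reading one char at a time. */
    public static List<Integer> scanIntField(Reader in, String key) {
        String pattern = "\"" + key + "\"";
        List<Integer> values = new ArrayList<>();
        int matched = 0; // number of pattern chars matched so far
        try {
            int c;
            while ((c = in.read()) != -1) {
                if (c == pattern.charAt(matched)) {
                    matched++;
                } else {
                    // The only quote inside a partial match is its first char,
                    // so a mismatch can only restart the match on a quote.
                    matched = (c == '"') ? 1 : 0;
                }
                if (matched == pattern.length()) {
                    matched = 0;
                    Integer value = readIntAfterColon(in);
                    if (value != null) {
                        values.add(value);
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return values;
    }

    /** Reads optional whitespace, ':', optional whitespace and an integer; null if absent. */
    private static Integer readIntAfterColon(Reader in) throws IOException {
        int c = skipWhitespace(in);
        if (c != ':') {
            return null; // the matched text was a value, not a key
        }
        c = skipWhitespace(in);
        boolean negative = (c == '-');
        if (negative) {
            c = in.read();
        }
        if (c < '0' || c > '9') {
            return null;
        }
        int value = 0;
        while (c >= '0' && c <= '9') {
            value = value * 10 + (c - '0');
            c = in.read();
        }
        return negative ? -value : value;
    }

    /** Returns the first non-whitespace char, or -1 at end of input. */
    private static int skipWhitespace(Reader in) throws IOException {
        int c;
        do {
            c = in.read();
        } while (c == ' ' || c == '\t' || c == '\n' || c == '\r');
        return c;
    }

    public static void main(String[] args) {
        // Made-up sample; the real file's record layout may differ.
        String sample = "[{\"name\":\"A\",\"available_bikes\":3},"
                + "{\"name\":\"B\",\"available_bikes\": 12}]";
        System.out.println(scanIntField(new StringReader(sample), "available_bikes"));
    }
}
```

For a real file, the StringReader would be replaced by a BufferedReader over the file, and a proper streaming JSON parser would be more robust than this hand-rolled scan, which can be fooled by the key text appearing inside an escaped string.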
Thanks