I am trying to generate a large XML file using tAdvancedFileOutputXML.
When running the job on my local machine, I get an OutOfMemoryError: Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
Please see my configuration below, along with some screenshots of the corresponding job:
Talend version: 6.3.1
Main input file (CSV): 88,151 rows
Lookup file (CSV): 5,994,268 rows
1. To enable optimization, I have enabled "Store temp data" in the tMap lookup.
2. I have changed the Generation mode to "Fast with low memory consumption".
3. As my local machine has 8 GB of RAM, I have also changed the JVM arguments:
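(Heap size is controlled by the JVM arguments in the job's Run > Advanced settings tab; the values below are only example settings for an 8 GB machine, to be adjusted as needed:)

```
-Xms1024M
-Xmx6144M
```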
4. I have also made use of the "Use output stream" option with a tJava component:
The tJava component contains:
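(The usual pattern for this option is to open a buffered stream in tJava, register it in globalMap, and reference it in the output component's "Use output stream" field. A runnable sketch of that pattern, with a stand-in globalMap and an example file name, not the exact code from the screenshot:)

```java
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.OutputStream;
import java.util.HashMap;
import java.util.Map;

public class OutputStreamSetup {
    public static void main(String[] args) throws Exception {
        // Stand-in for Talend's globalMap, so the snippet runs outside a job.
        Map<String, Object> globalMap = new HashMap<>();

        // This is the part that goes in the tJava component: open one buffered
        // stream and register it so the output component can reuse it.
        OutputStream outFile = new BufferedOutputStream(
                new FileOutputStream("big.xml"), 8192);
        globalMap.put("out_file", outFile);

        // tAdvancedFileOutputXML's "Use output stream" field would then hold:
        //   (java.io.OutputStream) globalMap.get("out_file")
        OutputStream sameStream = (OutputStream) globalMap.get("out_file");
        sameStream.write("<root/>".getBytes("UTF-8"));
        sameStream.close();
    }
}
```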
Despite these settings, I am still unable to generate the XML file and remain stuck with the OutOfMemoryError.
Can you advise, please? Thank you.
It is actually never a good idea to create one huge XML file. The problem is not only the creation process but also the next step: reading such a huge file.
What about creating multiple files instead of one?
Thank you for your reply; I completely understand your point.
I have tried splitting the XML into multiple files, and it is much faster.
The problem is that, at the end, we will have to merge them back into a single file, because the application that consumes it can only accept one file at run-time.
The final XML should be about 1.5 GB.
Do you have any idea how the current job can be optimized?
Indeed, the duplicate root tags should be removed so that the merged file keeps only one.
I am new to Talend. Can you tell me how this can be implemented, i.e. keeping only one root tag in the merged file?
Also, apart from the root-tag issue, I have tried merging the files with the tUnite component, but it creates a blank row after each line, which increases the file size:
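One way to do the merge outside of tUnite is a small Java step that writes the XML declaration and root tag exactly once, then copies the part files while skipping blank lines. A minimal sketch, assuming the split files contain only the repeating <row> elements (file names here are hypothetical):

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class XmlMerge {
    public static void main(String[] args) throws IOException {
        // Hypothetical part files as produced by the split jobs: repeating
        // elements only, no XML declaration and no root tag, with the kind of
        // blank lines that tUnite was leaving behind.
        Files.write(Paths.get("part1.xml"), "<row id=\"1\"/>\n\n".getBytes("UTF-8"));
        Files.write(Paths.get("part2.xml"), "<row id=\"2\"/>\n\n".getBytes("UTF-8"));

        try (BufferedWriter out = Files.newBufferedWriter(Paths.get("merged.xml"))) {
            // Declaration and root tag are written once, up front.
            out.write("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<root>\n");
            for (String part : new String[]{"part1.xml", "part2.xml"}) {
                for (String line : Files.readAllLines(Paths.get(part))) {
                    if (!line.trim().isEmpty()) { // drop the blank rows
                        out.write(line);
                        out.write("\n");
                    }
                }
            }
            out.write("</root>\n");
        }
    }
}
```

Streaming the parts line by line this way keeps memory flat even for a 1.5 GB result, since no file is ever loaded whole.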