We need to process MongoDB data (approx. 1.3 million records) in Talend. When we try to process all the records in one go, Talend fails with a "Heap space - out of memory" error. We have tried every way we know to increase the JVM memory size, but it hasn't helped, probably because of the complex logic in our workflow.
So now we are looking to process the data in chunks, but we are not sure how that can be done in Talend. Currently we pull the MongoDB data using the "MongoDbInput" component.
Could anyone please advise on this?
Thanks in advance!
There are several possible solutions.
First, you can try to reduce the number of columns you pull into Talend in the input component. Remove unused or unwanted columns if possible.
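For example, if the documents carry many fields but the job only needs a couple of them, a projection in the query keeps the rest out of memory. A minimal sketch of such a projection (the field names here are hypothetical, and whether you set this via the component's query box or its schema mapping depends on your tMongoDBInput version):

```json
{ "customerId": 1, "amount": 1, "_id": 0 }
```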
Second, increase the buffer size of the tMap. To do that, go to the top left-hand corner inside your tMap and increase the number.
Third, you can store the data on disk (by default the data is kept in memory). This is slower, but it can work. As with the previous point, go to the top left-hand corner inside your tMap and choose a directory.
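As for the chunking itself: tMongoDBInput won't paginate on its own, but a common workaround is to wrap it in a tLoop and re-run the query with a filter of the form { "_id": { "$gt": lastId } }, sorted by _id with a limit. Unlike skip/limit, this "_id-range" pagination does not get slower on later chunks. The sketch below shows only the pagination logic in plain Java, with an in-memory list standing in for the collection; fetchBatch is a hypothetical stand-in for the real MongoDB query.

```java
import java.util.ArrayList;
import java.util.List;

public class ChunkedRead {

    // Hypothetical stand-in for a MongoDB query of the form
    // { "_id": { "$gt": lastId } }, sorted by _id with a limit.
    // Here the "collection" is just an in-memory sorted list of ids.
    static List<Integer> fetchBatch(List<Integer> collection, int lastId, int limit) {
        List<Integer> batch = new ArrayList<>();
        for (int id : collection) {
            if (id > lastId) {
                batch.add(id);
                if (batch.size() == limit) {
                    break;
                }
            }
        }
        return batch;
    }

    public static void main(String[] args) {
        // Fake collection with ids 1..10.
        List<Integer> collection = new ArrayList<>();
        for (int i = 1; i <= 10; i++) {
            collection.add(i);
        }

        int lastId = 0;       // resume point carried between loop iterations
        int processed = 0;
        while (true) {
            List<Integer> batch = fetchBatch(collection, lastId, 3);
            if (batch.isEmpty()) {
                break;        // no more documents
            }
            // ... run your complex workflow on this batch only ...
            processed += batch.size();
            lastId = batch.get(batch.size() - 1);  // remember the last _id
        }
        System.out.println("processed=" + processed);
    }
}
```

In Talend terms, lastId would live in a context variable updated at the end of each iteration, and the query in tMongoDBInput would reference it.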