How to process MongoDB data (approx. 1.3 million records) in Talend in chunks

Four Stars

Hi,

 

We need to process MongoDB data (approx. 1.3 million records) in Talend. When we try to process all the records in one go, Talend fails with a "Heap space - out of memory" error. We have tried every way we know to increase the JVM memory size, but it does not help, probably because the workflow contains complex logic.

 

So now we want to process the data in chunks, but we are not sure how this can be done in Talend. Currently we pull the MongoDB data using the "tMongoDBInput" component.

 

Could anyone please advise on this?

 

Thanks in advance!!

 

Regards,

Pragya

 

 

Six Stars

Re: How to process MongoDB data (approx. 1.3 million records) in Talend in chunks

There are several possible solutions.

 

First, you can try to reduce the number of columns you pull into Talend in the input component. Remove unused or unwanted columns if possible.

 

Second, increase the buffer size of the tMap. To do that, go to the top left-hand corner inside your tMap and increase the buffer size value.

 

Third, you can try to store the data on disk (by default the data is held in memory). This is slower, but it can work. To do that, go to the top left-hand corner inside your tMap, as in the previous point, and choose a directory.
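
For the chunked processing asked about in the original question, here is a minimal sketch in plain Java (the language Talend jobs are generated in) of how the 1.3 million records could be split into skip/limit windows. The class `ChunkPlanner`, the method `plan`, and the chunk size of 100,000 are hypothetical and chosen only for illustration. The sketch assumes that the MongoDB read can be parameterised with a skip and a limit per iteration (for example through context variables in an iterating subjob), which nothing in this thread confirms.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: plans skip/limit windows for chunked reads.
final class ChunkPlanner {

    /** One window: skip `skip` documents, then read at most `limit`. */
    static final class Chunk {
        final long skip;
        final int limit;

        Chunk(long skip, int limit) {
            this.skip = skip;
            this.limit = limit;
        }
    }

    /** Splits totalRecords into consecutive windows of at most chunkSize. */
    static List<Chunk> plan(long totalRecords, int chunkSize) {
        if (totalRecords < 0 || chunkSize <= 0) {
            throw new IllegalArgumentException("totalRecords must be >= 0 and chunkSize > 0");
        }
        List<Chunk> chunks = new ArrayList<>();
        for (long skip = 0; skip < totalRecords; skip += chunkSize) {
            // The last window may be smaller than chunkSize.
            int limit = (int) Math.min((long) chunkSize, totalRecords - skip);
            chunks.add(new Chunk(skip, limit));
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Assumed figures: 1.3 million records, 100,000 per chunk.
        for (Chunk c : plan(1_300_000L, 100_000)) {
            System.out.println("skip=" + c.skip + " limit=" + c.limit);
        }
    }
}
```

Each window would then drive one pass of the job, so only one chunk is held in memory at a time. Note that in MongoDB large skip values become progressively slower, because the skipped documents still have to be walked; filtering on ranges of an indexed field (such as `_id`) is a common alternative to the same idea.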
