How to improve performance of a job with 30 million records in the lookup file.

Five Stars

How to improve performance of a job with 30 million records in the lookup file.

Hi All,

I have a job with 30 million records in the lookup, and I am trying to run it on the Unix server. Because of the huge amount of data in the lookup file it is taking a very long time: I started the job in the morning and it is still running. Please help me resolve this.

Community Manager

Re: How to improve performance of a job with 30 million records in the lookup file.

You will need to tell us a bit more, and preferably give us a screenshot of your job and of your lookup configuration (I assume you are using a tMap).

Five Stars

Re: How to improve performance of a job with 30 million records in the lookup file.

Hi,

Thanks for the reply. I don't have permission to take screenshots of the job because of company policy; I hope you understand. And yes, I am using a tMap there.

 

Fourteen Stars

Re: How to improve performance of a job with 30 million records in the lookup file.

@rohit1804, have you enabled "Store temp data" in the tMap advanced settings? You can also try increasing the max buffer size.
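For reference, the relevant tMap options look roughly like this (setting names are from memory of recent Studio versions, and the path and buffer value are only examples, so adjust them to your environment):

    Store temp data: true
    Temp data directory path: "/tmp/talend_tmap" (any disk path with enough free space)
    Max buffer size (nb of rows): 2000000

Storing temp data on disk keeps the 30 million lookup rows from being held entirely in memory.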

Manohar B
Don't forget to give kudos/accept the solution when a reply is helpful.
Nine Stars

Re: How to improve performance of a job with 30 million records in the lookup file.

Hi @rohit1804 ,

You can try any of the steps below:

1. Check the Multi thread execution option in the Job settings and adjust the number of threads as needed (3 is the usual recommendation).

2. Enable Set Parallelization on the source component of the job, which is another way to improve performance.

3. Increase the JVM arguments, for example from 1024M to 4096M. If you have more RAM available, you can go up to 5G (see the sketch after this list).
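For reference, the JVM memory is usually raised in the Run tab under Advanced settings > JVM arguments. The values below are only illustrative; tune them to your server's available RAM:

    -Xms1024M
    -Xmx4096M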

 

Thanks,

Ankit

Six Stars

Re: How to improve performance of a job with 30 million records in the lookup file.

In the DB input component, go to Advanced settings and set the cursor size based on the number of rows to be processed; it will make a big difference. Also, in the tMap, store temp data on disk, and increase the JVM memory values.
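To illustrate what the cursor size does (this is an assumption about the underlying mechanics, shown as a plain JDBC sketch with placeholder connection details, not actual Talend-generated code): it maps to the JDBC fetch size, which streams rows from the server in batches instead of loading everything at once.

    import java.sql.*;

    public class CursorSizeSketch {
        public static void main(String[] args) throws SQLException {
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://dbhost/mydb", "user", "pass")) {
                // On PostgreSQL, the fetch size only takes effect with autocommit off.
                conn.setAutoCommit(false);
                try (Statement stmt = conn.createStatement()) {
                    // Stream rows in batches of 10,000 instead of materialising
                    // all 30 million rows in memory at once.
                    stmt.setFetchSize(10000);
                    try (ResultSet rs = stmt.executeQuery(
                            "SELECT id, name FROM lookup_table")) {
                        while (rs.next()) {
                            // process each row as it arrives
                        }
                    }
                }
            }
        }
    }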

Community Manager

Re: How to improve performance of a job with 30 million records in the lookup file.

How many rows are loading into the tMap through the Main flow, and how have you got the lookup configured? If you have a limited number of Main rows, you might want to reduce the number of lookup rows by using the "Reload at each row" setting. This allows you to store the value(s) you join on in the globalMap and use them in your lookup data query, so that when the lookup query fires it only brings back data relevant to the current Main row.
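For example (a sketch only; the table and key names are illustrative, not taken from the actual job): with "Reload at each row" enabled, you store the Main row's join key in the globalMap via the tMap's globalMap key expressions, then reference it in the lookup input's query, something like:

    "SELECT id, name FROM lookup_table WHERE id = " + (Integer) globalMap.get("lookupKey")

That way each lookup query returns only the rows matching the current Main row, rather than pulling all 30 million records into memory up front.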
