Five Stars rm

Performance Issue

I have an issue with the design below. It consumes a huge amount of resources. How can the performance be improved? Kindly help me.

>>I could remove tmap3 and tmap4, but I kept them to avoid carrying unwanted columns into the buffer. I also use some filter conditions in those tmaps.
I have 8 columns: A, B, C, D, E, F, G, H. I filter the records on C, D, E, F, G (tmap3 & tmap4) and pass only A, B, H to the reference buffer (tmap1 & tmap2). Is this the right approach? Please correct me if I'm wrong.
>>tFileInputDelimited_2 & tFileInputDelimited_3 read the same file. If I extract it with a single component (one tFileDelimited), I cannot use it as a reference in two places. Is there any approach to handle this? Extracting the file only once would improve performance.
>>The job is very resource consuming. I'm getting 3 million records from the source & 2.5 million from each reference. I allocated Xmx16384. I cannot allocate this much RAM to a single job. I need help with this.
>>Do sorting and removing duplicates take a lot of time? I used the sort on disk option, but it is still very resource consuming. Are there any other ways to do it efficiently?
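To make the first two points above concrete, here is a minimal sketch in plain Java. It is not Talend-generated code and not a confirmed fix for this job; it only illustrates the idea of reading the reference file once, filtering on the full row, keeping only A, B, H, and filling two lookups in the same pass. The class name, the column order, the use of A as the lookup key, and the filter conditions are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class SinglePassLookupSketch {

    // Assumed positions of columns A..H in each delimited row.
    static final int A = 0, B = 1, C = 2, H = 7;

    /** The two reference lookups built from one read of the file. */
    static final class Lookups {
        final Map<String, String[]> first = new HashMap<>();
        final Map<String, String[]> second = new HashMap<>();
    }

    /**
     * Reads every line once. Each filter sees the full row (so it can test
     * C..G), but only the slim {A, B, H} projection is stored in memory.
     * The key column (A) is a hypothetical choice.
     */
    static Lookups build(Iterable<String> lines, String delimiterRegex,
                         Predicate<String[]> filterForFirst,
                         Predicate<String[]> filterForSecond) {
        Lookups out = new Lookups();
        for (String line : lines) {
            String[] row = line.split(delimiterRegex, -1);
            if (row.length <= H) {
                continue; // skip rows that do not have all 8 columns
            }
            String[] slim = { row[A], row[B], row[H] };
            // putIfAbsent keeps the first row per key, so duplicate keys
            // are dropped here without a separate sort step.
            if (filterForFirst.test(row)) {
                out.first.putIfAbsent(row[A], slim);
            }
            if (filterForSecond.test(row)) {
                out.second.putIfAbsent(row[A], slim);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample rows; the "keep"/"drop" values in column C are
        // placeholders for the real filter conditions.
        List<String> lines = List.of(
            "1;x;keep;d;e;f;g;h1",
            "1;y;keep;d;e;f;g;h2",
            "2;z;drop;d;e;f;g;h3");
        Lookups l = build(lines, ";",
            r -> r[C].equals("keep"),
            r -> r[C].equals("drop"));
        System.out.println(l.first.size() + " " + l.second.size());
    }
}
```

In this sketch the memory held per reference row is three strings instead of eight, and the file is scanned once instead of twice. Whether the same effect can be wired up in the actual job depends on the components available, which this example does not cover.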
Someone, please help me out. 
Thanks
1 REPLY
Moderator

Re: Performance Issue

Hi,
We have replied to your other topic: https://www.talendforge.org/forum/viewtopic.php?id=48197.
Could you please take a look at it?
Best regards
Sabrina