Talend is occupying the entire RAM

rm

Talend is occupying the entire RAM

Hi,
I have a scenario where I have tested two cases for performance.
Case 1:
PostgresDB ----> tMap ----(filter; rows that fail the condition are rejected to a text file)----> txt
Case 2:
PostgresDB (filter handled in the SQL query) ----> txt
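(Roughly, Case 2 boils down to the sketch below. This is plain JDBC rather than the actual Talend job, and the table, columns, filter, and connection details are all invented; the point is just that the WHERE clause runs inside Postgres, so only matching rows ever cross the network.)

import java.io.PrintWriter;
import java.sql.*;

public class Case2Sketch {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:postgresql://dbhost:5432/mydb", "user", "pass");
             Statement stmt = conn.createStatement();
             // hypothetical table and filter - the rejection logic from
             // Case 1's tMap is expressed as a WHERE clause instead
             ResultSet rs = stmt.executeQuery(
                 "SELECT id, amount FROM orders WHERE status = 'VALID'");
             PrintWriter out = new PrintWriter("output.txt")) {
            while (rs.next()) {
                out.println(rs.getLong("id") + ";" + rs.getBigDecimal("amount"));
            }
        }
    }
}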
Environment:
The source DB is on a different server. The data is around 2 GB. We are using a 16 GB machine running Red Hat Linux, of which 6 GB was free.
Cases tested:
Case 2: It took just 3-4 minutes to load the data.
Case 1: It takes more than 30 minutes.
I have the following questions; kindly help me.
a) While filtering and rejecting records in Talend (Case 1), the entire RAM was occupied and swap memory was used, which makes the job dead slow.

If the data size is huge, it directly affects performance. Say, for instance, I need to process 512 GB of data; should my RAM be larger than that? How can people afford a 1 TB machine in that case? Is it the same with other ETL tools, or am I missing something? Kindly clarify.
b) The DB filter was much faster than Talend. Do you think the right approach is to push all functionality into the DB?
Thanks
vapukov

Re: Talend is occupying the entire RAM

tMap is fast when it does its calculations in memory, as with any in-memory database (leaving compression aside): if you want to work with 1 TB of data in memory, you must have at least 2 TB of RAM.
The database will work faster because it uses indexes for JOINs (provided you do not write a bad query). PostgreSQL, like many others, is designed to work with data many times bigger than memory.
Note: all of the above applies when you do JOIN lookups or aggregations in tMap, i.e. when you are not working with a single row from the flow at a time.
If you are just filtering, you need to check what you are trying to achieve; it may be possible to do it another way.
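A rough illustration of that difference in plain Java (not Talend's actual generated code): a plain filter holds one row in flight at a time, while a join lookup must first materialize the whole lookup side in memory, and that is the part that fills the heap.

import java.util.*;
import java.util.stream.*;

public class MemoryShapes {

    // Plain filter: rows stream through one at a time,
    // so memory stays flat no matter how many rows pass.
    static Stream<String> keepValid(Stream<String> rows) {
        return rows.filter(r -> r.startsWith("VALID"));
    }

    // Join lookup: the entire lookup table is loaded into a HashMap
    // before the first main row is processed. Memory grows with the
    // lookup's size, and this is what can exhaust RAM and force swapping.
    static Map<String, String> loadLookup(Stream<String> lookupRows) {
        Map<String, String> byKey = new HashMap<>();
        lookupRows.forEach(r -> byKey.put(r.split(";")[0], r));
        return byKey;
    }
}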
-----------
rm

Re: Talend is occupying the entire RAM

Thanks Vapukov.
How do you handle data larger than the RAM in a Talend ETL job? Is there any other way without pushing it down to the DB (ELT)?
vapukov

Re: Talend is occupying the entire RAM

Thanks Vapukov.
How do you handle data larger than the RAM in a Talend ETL job? Is there any other way without pushing it down to the DB (ELT)?

Let's return to your original post.
You did not provide full information, so let me put the same question back to you in "human-readable" form :)
You need to relocate from one house to another, you have a huge amount of old stuff,
and you want your new house to be clean.
You have two options:

Haul all the stuff (say, 1000 items) out to the street, sort it there, and take 10 items with you.
Make a list of 10 items, take just those, get in the car, and drive to the new home.

Which way is faster?
Now the same scenario, but you want to take 50% of the items with you and must compare them first? The situation could be different.
It is the same with your question: the speed of the whole job always depends on what you are really trying to do. How many records (in %) does the filter reject? Are there any aggregations? External lookups?
Short answer: Talend can work with big data volumes; which way is faster and better always depends on how properly you design the job.
Talend, Postgres, the OS - they are all just items in your toolbox!
Why give up the benefits of all your tools and do every household task with only a hammer? :)
-----------
rm

Re: Talend is occupying the entire RAM

We handled the filter in the SQL query. Our servers are built with a small amount of RAM, so we don't want the job to consume more memory.
Thanks
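One related knob, in case it helps others running on small-RAM servers (this is an assumption about the setup, not something verified in this thread): by default the PostgreSQL JDBC driver buffers the entire result set in memory, and it only streams rows with a cursor when autocommit is off and a fetch size is set. A minimal sketch, with invented connection details:

import java.sql.*;

public class StreamingRead {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://dbhost:5432/mydb", "user", "pass")) {
            conn.setAutoCommit(false);          // required for cursor-based fetch
            try (Statement stmt = conn.createStatement()) {
                stmt.setFetchSize(1000);        // rows fetched per round trip
                try (ResultSet rs = stmt.executeQuery(
                        "SELECT id FROM orders WHERE status = 'VALID'")) {
                    while (rs.next()) {
                        // handle one row at a time; the heap stays small
                    }
                }
            }
        }
    }
}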