Hi, I am trying out Talend Open Studio to compare its functionality against MS Access queries, and I am having some performance issues with TOS. I have created a job in Talend with a couple of lookup tables; however, the job takes about 30s to complete, whereas MS Access takes only 3s to complete the same task. I've tried suggestions given in the forum, like limiting columns to only the ones needed on lookups. The problem I see is that TOS processes the lookups first, in sequence, and only then starts the rest of the job. My question is: can these lookups run in parallel to improve performance? For example, my first lookup takes 11s and the second takes another 8s, so together they take around 19s to complete. The rest of the job runs quite quickly (although the tMap takes a fair amount of time as well). Is there any way to run these lookups in parallel? I will eventually need to create a job with around 10 lookups, so I would appreciate any way to improve the performance. Thank you
Parallel loading of tMap lookups was supposedly introduced in TOS v4.2.0 (according to http://www.talendforge.org/bugs/view.php?id=14859) but I can't find it in TOS v4.2.1. Maybe it's only in TIS... Having said that, if you are comparing Talend jobs using data from a database against native queries run directly in that database, the latter will almost always be faster. This is because Talend has to read all the data it uses from the database into memory before the match can be attempted, while the database already has it all to hand. You could just perform the join in the SQL statement of your input component. The power of Talend is in joining data from, or moving data between, disparate sources (different DBs, or DBs and files). Talend also supports creating a job visually in the studio but executing it directly in the database (see the ELT set of components), but this is only suitable if the results are also being written to the same database as the source data.
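To illustrate the difference, here is a rough sketch in Python with an in-memory SQLite database standing in for your real one. It is not Talend code, and the table and column names (`orders`, `customers`, etc.) are made up for the example. The first function does what a lookup in tMap effectively does (read the whole lookup into memory, then match row by row); the second pushes the join into the SQL statement so the database does the matching and only the joined rows come back.

```python
import sqlite3

# Hypothetical schema: 'orders' is the main flow, 'customers' is the lookup.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY,
                         customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO orders VALUES (10, 1, 99.5), (11, 2, 20.0), (12, 3, 5.0);
""")


def join_in_client(conn):
    """Lookup-style join: load the lookup table into memory, then match."""
    # The whole lookup table is read before any main row is processed.
    lookup = dict(conn.execute("SELECT id, name FROM customers"))
    rows = conn.execute(
        "SELECT order_id, customer_id, amount FROM orders ORDER BY order_id"
    )
    # Inner-join semantics: main rows without a lookup match are dropped.
    return [(oid, lookup[cid], amt) for oid, cid, amt in rows if cid in lookup]


def join_in_database(conn):
    """Same result, but the database performs the join itself."""
    query = """
        SELECT o.order_id, c.name, o.amount
        FROM orders AS o
        JOIN customers AS c ON c.id = o.customer_id
        ORDER BY o.order_id
    """
    return conn.execute(query).fetchall()
```

Both functions return the same rows; the second one is the kind of query you would paste into the input component instead of using separate lookups. With real table sizes and a network between the job and the database, the second approach usually transfers far less data.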
Hi alevy, Thank you for your reply. I am new to Talend, so I am still trying out components. My ultimate goal is actually to take raw data coming from a text file and load it into tables in the business system. However, during this process I need to clean up my text file data, so I have to reference some of the other tables in the database to reject some records and load only the valid ones. I will try your suggestion of doing the join in the input component itself. I will also try the ELT components. At the moment, I am performing the clean-up using 'Access', so I am trying to evaluate Talend vs Access for ease of use, power etc.
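To make the clean-up step concrete, this is roughly the logic I need, sketched in Python rather than Talend. The file layout, the `product_code` column and the `products` table are invented for the example; my real data has different names. Each row of the text file is checked against the reference values read from the database, and rows without a match go to a reject list instead of being loaded.

```python
import csv
import io
import sqlite3


def split_valid_rejected(lines, valid_codes, key="product_code"):
    """Split delimited text rows into (valid, rejected) by a reference set."""
    valid, rejected = [], []
    for row in csv.DictReader(lines):
        # A row is valid only if its key exists in the reference table.
        (valid if row[key] in valid_codes else rejected).append(row)
    return valid, rejected


# Hypothetical reference table held in the database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (code TEXT PRIMARY KEY);
    INSERT INTO products VALUES ('A1'), ('B2');
""")
valid_codes = {code for (code,) in conn.execute("SELECT code FROM products")}

# Hypothetical raw text file; io.StringIO stands in for an open file.
raw = io.StringIO("product_code,qty\nA1,5\nZZ,2\nB2,1\n")
valid, rejected = split_valid_rejected(raw, valid_codes)
```

Here `valid` ends up holding the A1 and B2 rows and `rejected` holds the ZZ row. Because the main data comes from a file and not from the database, this part cannot simply be moved into one SQL join unless the file is first loaded into a staging table.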