parallel execution of jobs or components

One Star

I've a job which queries a lot of different slow databases and creates a summary report at the end. From an optimization perspective it would make sense to parallelize some of the queries.
Which options does TOS provide to parallelize jobs or components?
I've seen the "Multi thread execution" option in Job Settings, but to my understanding this option only affects lookup loading. The documentation is a little light on parallel computation. The only reference I found was in "Defining the Start component", and from the comments there it does not look like I can parallelize the waiting time for the databases.
I'm not sure whether it would work if I encapsulate the different queries in unlinked sub jobs. The biggest issue I see with this approach is the synchronisation before the data consolidation, because I have to wait until the last job finishes. In addition, I'm not sure the system really starts the unlinked jobs in parallel. Has anyone already tested something like that?
Employee

Re: parallel execution of jobs or components

Here is what we can currently do in TOS 2.3.x, in Java projects only. The "Multi threading" option must be enabled in the preferences.
In the child job (with 2 unlinked subjobs), the 2 subjobs are executed in parallel. The problem is that we can't execute a third subjob that would wait for the others to have finished (this will be added in TOS 2.4). The workaround is to use a tRunJob to "wait for" the 2 subjobs (in the child job).
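For readers who want to see the idea outside the Studio, the "run in parallel, then wait for all" behaviour described above can be sketched in plain Java. This is only an illustration of the pattern, not the code TOS generates; the class `ParallelSubjobsSketch` and the helpers `slowQuery` and `runAllAndWait` are hypothetical names, and the sleeps stand in for slow database queries.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelSubjobsSketch {

    // Hypothetical stand-in for one slow database query (one "subjob").
    static Callable<String> slowQuery(String name, long millis) {
        return () -> {
            Thread.sleep(millis);
            return name + " done";
        };
    }

    // Starts all subjobs in parallel and returns only when every one has finished.
    // Assumes a non-empty list (a fixed thread pool needs at least one thread).
    static List<String> runAllAndWait(List<Callable<String>> subjobs) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(subjobs.size());
        try {
            List<String> results = new ArrayList<>();
            // invokeAll blocks until all tasks are complete and keeps the input order.
            for (Future<String> f : pool.invokeAll(subjobs)) {
                results.add(f.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Callable<String>> subjobs = new ArrayList<>();
        subjobs.add(slowQuery("queryA", 200));
        subjobs.add(slowQuery("queryB", 100));
        // This line plays the role of the parent job continuing after the tRunJob.
        System.out.println("Consolidating: " + runAllAndWait(subjobs));
    }
}
```

The call to `runAllAndWait` corresponds to the tRunJob in the parent: the consolidation step only starts once the slowest query has returned.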
One Star

Re: parallel execution of jobs or components

Thanks for your response, your answer helps.
I need to try a few things out to see if there are any pitfalls. The different queries have huge differences in execution time, and some queries provide input for others, so I may need more than one level to implement the queries. If I have sub jobs in sub jobs will it still work?
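To make the "more than one level" case concrete, here is a minimal plain-Java sketch of queries where one depends on another's result. It is not Talend code and says nothing about how TOS handles nested subjobs; `DependentQueriesSketch`, `query` and `runReport` are hypothetical names, and the numbers are made up for illustration.

```java
import java.util.concurrent.CompletableFuture;

public class DependentQueriesSketch {

    // Hypothetical stand-in for a slow query; 'name' is only a label for the reader.
    static int query(String name, int value) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return value;
    }

    static int runReport() {
        // Level 1: two independent queries start in parallel.
        CompletableFuture<Integer> a = CompletableFuture.supplyAsync(() -> query("A", 10));
        CompletableFuture<Integer> b = CompletableFuture.supplyAsync(() -> query("B", 20));
        // Level 2: this query needs the result of A, so it starts only once A is done.
        CompletableFuture<Integer> c = a.thenApplyAsync(x -> query("C", x * 2));
        // Consolidation: waits for both B and C before summing.
        return b.thenCombine(c, Integer::sum).join();
    }

    public static void main(String[] args) {
        System.out.println("summary = " + runReport());
    }
}
```

Query B runs alongside the A-then-C chain, so the total waiting time is roughly that of the longer branch instead of the sum of all three queries.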
Employee

Re: parallel execution of jobs or components

If I have sub jobs in sub jobs will it still work?

What do you mean? (I know the "Talend Open Studio" glossary is missing, it is being prepared) Do you mean you have iterate links?
One Star

Re: parallel execution of jobs or components

I'm currently implementing it with iterate links, yes.
One tOracleInput is linked by iterate links to one or more other tOracleInput components.
I'm not sure it's the best way to deal with it. The goal of the job is to validate data consistency between different databases, and the current implementation follows my workflow when I do it by hand.
PS: I'm not sure the picture is really readable, but it may give you a rough overview. There are databases missing, so it's not the full picture anyway.
One Star

Re: parallel execution of jobs or components

Hello plegall,
on 2008-03-07 you wrote "The problem is that we can't execute a third subjob that would wait for the others to have finished (this will be added in TOS 2.4)". Is the option to directly re-synchronize subjobs available in the meantime?
thanks for your answer
Peter
One Star

Re: parallel execution of jobs or components

Hi,
I am currently using Talend 2.4.1 and have the same problem. I have 3 runjobs which should run in parallel, and when they are finished a set of processes should be executed only once. In the uploaded image, I have replaced the whole set of processes with a single stored procedure call for simplicity.
By the way, just out of curiosity: if I have one task that depends on the success of at least one of N tasks, how can I model this in Talend?
Thanks for your help!!
Usman
Employee

Re: parallel execution of jobs or components

on 2008-03-07 you wrote "The problem is that we can't execute a third subjob that would wait for the others to have finished (this will be added in TOS 2.4)". Is the option to directly re-synchronize subjobs available in the meantime?

This feature has been added to Talend Integration Suite (and not to Talend Open Studio as I thought at the very beginning). See the screenshot to have an example of what can be done.
Employee

Re: parallel execution of jobs or components

I am currently using Talend 2.4.1 and have the same problem. I have 3 runjobs which should run in parallel, and when they are finished a set of processes should be executed only once. In the uploaded image, I have replaced the whole set of processes with a single stored procedure call for simplicity.
By the way, just out of curiosity: if I have one task that depends on the success of at least one of N tasks, how can I model this in Talend?

With Talend Integration Suite (see my screenshot in my previous post), you can configure the tParallelize to "wait for end of first subjob" or "wait for end of all subjobs".
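For comparison, the "continue after the first success" case can be sketched in plain Java with `ExecutorService.invokeAny`. This only illustrates the pattern and is not how tParallelize is implemented; `FirstSuccessSketch`, `task` and `waitForFirstSuccess` are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class FirstSuccessSketch {

    // Hypothetical task that sleeps, then either fails or returns its name.
    static Callable<String> task(String name, long millis, boolean fail) {
        return () -> {
            Thread.sleep(millis);
            if (fail) {
                throw new IllegalStateException(name + " failed");
            }
            return name;
        };
    }

    // Returns as soon as one task completes successfully; unfinished tasks are cancelled.
    // If every task fails, invokeAny throws an ExecutionException.
    static String waitForFirstSuccess(List<Callable<String>> tasks) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(tasks.size());
        try {
            return pool.invokeAny(tasks);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Callable<String>> tasks = Arrays.asList(
                task("fastButBroken", 10, true),
                task("slowButOk", 100, false));
        // The failing task is ignored; the dependent step runs after the first success.
        System.out.println("Continuing after: " + waitForFirstSuccess(tasks));
    }
}
```

Note that `invokeAny` skips tasks that throw, which matches "success of at least one of N tasks" more closely than simply taking whichever task ends first.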
One Star

Re: parallel execution of jobs or components

Is the tParallelize component planned for a future Talend Open Studio version?
One Star

Re: parallel execution of jobs or components

Hi, I am using TOS for MDM v5.0.2, community version, but I cannot find the tParallelize component, and there is no information about it in the user manual. Please let me know whether the component is present in the community version, or where I can find a suitable example for it.
thanks, @mit
One Star

Re: parallel execution of jobs or components

It seems to me that Synchronization and Parallelism can be achieved in a Parent/Child relationship where the Parent job calls the child to do parallelism and when all parallel subjobs in the Child are done (meaning Synchronization is achieved) the Parent continues.
Community Manager

Re: parallel execution of jobs or components

Hi, I am using TOS for MDM v5.0.2, community version, but I cannot find the tParallelize component, and there is no information about it in the user manual. Please let me know whether the component is present in the community version, or where I can find a suitable example for it.
thanks, @mit

tParallelize is only available in Talend enterprise subscription products; you can read its user manual in the Talend Help Center.
https://help.talend.com/search/all?query=tParallelize&content-lang=en
Shong
----------------------------------------------------------
Talend | Data Agility for Modern Business
Community Manager

Re: parallel execution of jobs or components

It seems to me that Synchronization and Parallelism can be achieved in a Parent/Child relationship where the Parent job calls the child to do parallelism and when all parallel subjobs in the Child are done (meaning Synchronization is achieved) the Parent continues.

Yes. Put all the subjobs that you want to run in parallel in the child job, open the job settings, click the Extra tab, and check the 'Multi thread execution' option so that all subjobs without any connector between them run in parallel.
Shong
One Star

Re: parallel execution of jobs or components

Talend gives us the facility to execute subjobs in parallel. It is similar to multithreading in Java.
Here are some nice tutorials on parallel execution of jobs in Talend Open Studio