I want to know how to synchronize two subjobs that are executed in a job with multi-threaded execution.
I want to execute two child jobs in parallel, and only after both have finished do I want to execute a third job. Basically, execute Job3 only after Job1 and Job2 have finished.
I know I can use tParallelize to achieve this, but I'm not using Talend Enterprise edition.
There seems to be little information on this topic, so any help would be appreciated.
You can create a file dependency: once Job1 completes, it creates a file, and Job2 does the same. Job3 then executes only once the files from both Job1 and Job2 have been created.
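A minimal sketch of the marker-file idea in plain Java, for example as the body of a tJava component. The class name, the `.done` naming scheme and the temporary directory are assumptions made for illustration; they are not something Talend provides.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class MarkerFiles {
    // Called at the end of Job1 / Job2: create an empty "<job>.done" file
    // (hypothetical naming scheme).
    static Path markDone(Path dir, String jobName) throws IOException {
        Path marker = dir.resolve(jobName + ".done");
        Files.deleteIfExists(marker); // clear a stale marker from an earlier run
        return Files.createFile(marker);
    }

    // Called in Job3: true only when both marker files exist.
    static boolean bothDone(Path dir) {
        return Files.exists(dir.resolve("Job1.done"))
            && Files.exists(dir.resolve("Job2.done"));
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("talend-markers");
        markDone(dir, "Job1");
        System.out.println(bothDone(dir)); // false: Job2 has not finished yet
        markDone(dir, "Job2");
        System.out.println(bothDone(dir)); // true: Job3 may start
    }
}
```

Job3 would also need to delete the marker files once it starts, so that the next run does not see stale markers.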
In the third job, use a loop that checks (for example, every 1 or 2 seconds) whether Job1 and Job2 have finished.
Their status could be kept in a database table with the runtime and error code, or in a simple local CSV file.
If both have finished, run the rest of the job (Run if trigger) and exit the loop.
If not, go to the next iteration after the defined delay.
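The loop described above could look roughly like this in plain Java. The `ready` check is a placeholder for whatever you read from the table or CSV file, and the `maxAttempts` limit is an extra assumption so the loop cannot run forever.

```java
import java.util.function.BooleanSupplier;

class PollUntilReady {
    // Checks `ready` up to maxAttempts times, sleeping delayMillis between checks.
    // Returns true as soon as the check passes, false if it never does.
    static boolean waitFor(BooleanSupplier ready, long delayMillis, int maxAttempts)
            throws InterruptedException {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (ready.getAsBoolean()) {
                return true; // both jobs finished: exit the loop, run the rest
            }
            Thread.sleep(delayMillis); // not both finished: wait, then retry
        }
        return false;
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        // Stand-in condition: becomes true roughly 300 ms after start.
        boolean ok = waitFor(() -> System.currentTimeMillis() - start >= 300, 100, 50);
        System.out.println(ok ? "run rest of Job3" : "timed out");
    }
}
```

In a real job the lambda would be replaced by the lookup against the status table or file.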
You can try the steps below:
- Create a master job (say Job4) and add your three subjobs (say Job1, Job2 and Job3).
- In Job4, go to the Job > Extra tab and check "Multi thread execution".
This way you can run multiple jobs in parallel.
Master job layout:
Sample subjob layout (to be run in parallel):
Each subjob is the same in this case; only the sleep time differs (assuming each job has a different execution time).
Job1 has a tSleep before it, and its sleep time should be max(execution time of Job3, execution time of Job2). Here I set it to 30 seconds, since the sleep/execution times for Job3 and Job2 are 20 and 10 seconds respectively.
Create a context variable in the master job, say run_flag, of type Integer with an initial value of 0.
Connect a tJava after each subjob using OnComponentOk, and increment run_flag by 1 there.
We want two jobs to run in parallel and then run the last job, so 0 + 1 + 1 = 2.
This is the condition for the last job: if run_flag == 2 (meaning both jobs ran), run the last job using a conditional trigger (Run if).
All three threads start simultaneously. Two of them run their jobs and each increments run_flag by 1, which brings run_flag to 2. The third has the sleep followed by the conditional trigger, which requires run_flag to be 2. If run_flag is 2, the last job runs; otherwise it waits for the value to become 2.
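To show the logic outside Talend, here is a rough plain-Java sketch of the same idea. It is not Talend-generated code: `runFlag` stands in for context.run_flag, the threads stand in for the parallel subjobs, and `AtomicInteger` is my own substitution so that two increments happening at the same time are not lost.

```java
import java.util.concurrent.atomic.AtomicInteger;

class RunFlagDemo {
    // Stand-in for context.run_flag.
    static final AtomicInteger runFlag = new AtomicInteger(0);

    // Stand-in for one parallel subjob: "work" (a sleep), then increment the flag,
    // like the tJava connected with OnComponentOk.
    static Thread subjob(long workMillis) {
        return new Thread(() -> {
            try {
                Thread.sleep(workMillis);
                runFlag.incrementAndGet();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
    }

    // Starts both subjobs, sleeps like the tSleep, then evaluates the Run if condition.
    static boolean runAll(long work1, long work2, long sleepBeforeLast)
            throws InterruptedException {
        runFlag.set(0);
        Thread a = subjob(work1);
        Thread b = subjob(work2);
        a.start();
        b.start();
        Thread.sleep(sleepBeforeLast);        // the tSleep before the last job
        boolean runLast = runFlag.get() == 2; // the Run if condition
        a.join();
        b.join();
        return runLast;
    }

    public static void main(String[] args) throws InterruptedException {
        // Scaled-down timings: 100 ms and 200 ms of work, 1000 ms sleep.
        System.out.println(runAll(100, 200, 1000) ? "run last job" : "skip last job");
    }
}
```

In this sketch the condition is checked once, right after the sleep, which is why the sleep time has to cover the longer of the two parallel jobs.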
Let me know if this works for you.