I have multiple Talend jobs chained by "OnSubjobOK" links so that they run sequentially.
The flow is as follows:
tMSSQLInput sets context variables and they are passed to each subjob.
The setting "Transmit Whole Context" is checked for all.
The second job in the sequence is the meat of the program. It pulls data from a database and writes it to CSV files, which are then used by the other jobs.
What I've observed is:
The CSV file in the folder shows 0KB during the entire run but suddenly starts showing data while the last job in the sequence is running.
The last job in the sequence has 3 subjobs within it. The first of these fails with a null value exception. It turns out the subjob is not receiving the context variable that holds the file path of the CSV file written by the second job.
I am using Talend Open Studio 5.5 on a Windows Server 2012.
Is there some weird latency thing going on? If so, how can I resolve it?
Thanks in advance.
I'm afraid we will need more info. The screenshot does not seem to correspond with your description and there is a lot that could go wrong with what you are describing. Could you show us the first and second jobs (described in the original post) and where they exist in the screenshot? Also, can you show how you are setting the context variables?
Thank you for responding and apologies for not being very clear.
Each job is pretty intensive with a lot of data pulls and writes to intermediate text files.
The context variables are set in the Java component with the values read from DB.
These variables are passed to all the sub-jobs.
Job 2 specifically reads from db and writes to a CSV file which is utilized by all the jobs down the path.
The odd thing is, while Job 5 is running, from the system folder, I can see that the CSV that should have been completely written in Job 2 is still being written to.
Job 5 uses this CSV and the context variables, and it fails with a null value exception.
OK, this is likely because your context variables are not all set by the time the tRunJobs run. I'm making some assumptions here, but I'm guessing that your context variable query returns the variables over several rows. In that case you are setting the context variables one at a time, and as soon as the first variable is set, the first tRunJob runs. This is because of the tFlowToIterate and the OnComponentOK link after the tJava.
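To make the timing problem concrete, here is a minimal sketch in plain Java (not Talend-generated code). It assumes the query returns one key/value pair per row; the class name, the `childJobSees` helper, and the variable names `dbHost` and `csvFilePath` are all hypothetical. It contrasts starting the child job inside the per-row iteration with starting it once after every row has been applied.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ContextTimingDemo {

    // Hypothetical stand-in for a child job: returns the file path it can see.
    static String childJobSees(Map<String, String> context) {
        return context.get("csvFilePath");
    }

    // Mimics the iterate-per-row layout: one variable is set, then the
    // child job is started immediately, once for every row.
    static List<String> runChildPerRow(String[][] rows) {
        Map<String, String> context = new LinkedHashMap<>();
        List<String> seen = new ArrayList<>();
        for (String[] row : rows) {
            context.put(row[0], row[1]);
            seen.add(childJobSees(context));
        }
        return seen;
    }

    // Mimics the separated layout: every row is applied first, and the
    // child job is started once afterwards.
    static String runChildAfterLoad(String[][] rows) {
        Map<String, String> context = new LinkedHashMap<>();
        for (String[] row : rows) {
            context.put(row[0], row[1]);
        }
        return childJobSees(context);
    }

    public static void main(String[] args) {
        // Assumed sample rows; the real query result is not shown in the thread.
        String[][] rows = {
            {"dbHost", "localhost"},
            {"csvFilePath", "C:/data/out.csv"}
        };
        System.out.println(runChildPerRow(rows));    // [null, C:/data/out.csv]
        System.out.println(runChildAfterLoad(rows)); // C:/data/out.csv
    }
}
```

In the per-row version the first run of the child sees a null file path, and the child is started once per row, which would also be consistent with a file appearing to be rewritten late in the run.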
What you need to do is separate the loading of context variables from the rest of the flow. I'd recommend using a tContextLoad rather than a tJava (unless there is a reason why the tJava is needed). Take a look here for an explanation (https://help.talend.com/reader/NNO~fmVQU4rlkF9Depfdxw/5zKZ5vBNJr32XWV6D2ZCgQ). Once the contexts are loaded, carry on with the rest of the flow. Do this by connecting the first tRunJob to the context variable query component using an OnSubjobOK link.
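If you do keep some Java in the flow, a guard that runs after the load and before the first tRunJob can turn a late null value exception into an early, readable failure. The following is only a sketch in plain Java under assumptions: it models the context as a `java.util.Properties` object fed by key/value rows (the two-column shape that tContextLoad expects), and the class, method, and variable names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

public class ContextLoadSketch {

    // Applies every key/value row before anything downstream is started.
    static Properties loadContext(String[][] keyValueRows) {
        Properties context = new Properties();
        for (String[] row : keyValueRows) {
            // Properties rejects null values, so rows without a value are skipped.
            if (row[1] != null) {
                context.setProperty(row[0], row[1]);
            }
        }
        return context;
    }

    // Fails fast with a clear message if a required variable is missing or empty.
    static void requireKeys(Properties context, List<String> requiredKeys) {
        for (String key : requiredKeys) {
            String value = context.getProperty(key);
            if (value == null || value.isEmpty()) {
                throw new IllegalStateException("Context variable not set: " + key);
            }
        }
    }

    public static void main(String[] args) {
        // Assumed sample rows; replace with the real query result.
        String[][] rows = {
            {"dbHost", "localhost"},
            {"csvFilePath", "C:/data/out.csv"}
        };
        Properties context = loadContext(rows);
        requireKeys(context, Arrays.asList("dbHost", "csvFilePath"));
        System.out.println("All required context variables are set");
    }
}
```

The point of the design is ordering: the check runs once, after the whole load has finished, so a missing file path is reported before any child job starts rather than several jobs later.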
This isn't so much a latency issue as a timing and orchestration issue. Make the changes I have suggested and it will work as you expected.