[resolved] Best practice to build a job chain and to deploy changes

One Star

Hi,
I've read through the documentation and loads of forum posts now, but I can't find a clear opinion on the following:
What is the best/recommended way to build a job chain?
My idea was to just put all my jobs in one big "masterJob" and run this from the Job Conductor. Is that the preferred method, or should I connect the single jobs in an execution plan? If the latter is the case, how can I save the execution plan outside of TAC? The TAC is under the control of another department, and I'm not so sure about its "stability" :-)
Second question, especially regarding the answer to the first one: if I use the masterJob approach described above and change one of my "subjobs", I assume I would also have to redeploy the "masterJob" to get the changes to production? How do you move your developments to production: via a "pre-compiled" zip file or via different SVN tags/branches? If via zip file, what would you do if there are parallel developments in different subjobs and you want to get only one of them into production? The masterJob export would contain both of them in this case?
Thank you very much for your help and best regards!
Markus
Seventeen Stars

Re: [resolved] Best practice to build a job chain and to deploy changes

I would generally suggest coupling the jobs as loosely as possible, and an Execution Plan fits that approach.
If you put everything in one main job, you use only one JVM, which can affect the memory required, and you get only one big log file. You also have to redeploy all jobs at once even if you change only a small part.
One Star

Re: [resolved] Best practice to build a job chain and to deploy changes

Hi jlolling,
thank you very much for your reply!
Are there any disadvantages to using the execution plan beyond the ones I have already found?
- no possibility to import/export/backup/restore the execution plans
- not possible to insert jobs/tasks in between existing tasks
- what about context variables: are they passed from one task to the next? (I wanted to have one context loader job in front of all other jobs.)
Best regards
Markus
Seventeen Stars

Re: [resolved] Best practice to build a job chain and to deploy changes

I know. The other possible solution is to use the custom component tRunTask. This component works like tRunJob, but instead calls a TAC task referenced by its label or its ID.
This has the advantage that you can update parts of your system independently and rerun a single task if needed.
https://www.talendforge.org/exchange/index.php?eid=1271&product=tos&action=view&nav=1,1,1
Documentation is also available. The component works very well in a couple of production projects.
One Star

Re: [resolved] Best practice to build a job chain and to deploy changes

Hello again,
tRunTask looks good. Only one thing I wonder about: is it possible to "hand over" the context from one job to the next one with this component? My plan was to read the context from a file first and then "push" the context through the whole chain...
Nevertheless, after some experimenting with the Talend components for scheduling, I will probably not use them at all and use a separate (and more sophisticated) scheduling tool for this (UC4/Automic). I still have to test whether that works better :-)
Thank you for your valuable replies!
Best regards
Markus
Seventeen Stars

Re: [resolved] Best practice to build a job chain and to deploy changes

Hi Markus,
you mean passing the context from the called task back to the job containing the tRunTask and handing it over to the next job, because the called task changed the context? No, this is not possible, because Talend does not provide a way to transfer a context back to the parent task within the TAC. It will also not work if UC4 starts the jobs, because transferring context variables currently requires the jobs to run in the same JVM instance.
If you want to transfer information between jobs, build a dedicated table or use a file.
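
A minimal sketch of the file-based handoff, assuming a plain Java properties file as the exchange format. The class `ContextFile`, the file location, and the keys are hypothetical and only illustrate the idea; inside a Talend job the same effect would typically be built from file components plus tContextLoad rather than hand-written code.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical helper, not part of Talend: exchanges context values between
// independently started jobs (separate JVMs) through a shared properties file.
class ContextFile {

    // Called at the end of the upstream job to persist its context values.
    static void write(Path file, Map<String, String> context) {
        Properties props = new Properties();
        props.putAll(context);
        try (Writer out = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            props.store(out, "context handoff");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Called at the start of the downstream job to load the values again.
    static Map<String, String> read(Path file) {
        Properties props = new Properties();
        try (Reader in = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            props.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Map<String, String> context = new TreeMap<>();
        for (String key : props.stringPropertyNames()) {
            context.put(key, props.getProperty(key));
        }
        return context;
    }

    public static void main(String[] args) {
        Path file = Paths.get(System.getProperty("java.io.tmpdir"), "context_handoff.properties");
        write(file, Map.of("load_date", "2015-01-31", "target_schema", "dwh"));
        System.out.println(read(file));
        file.toFile().delete();
    }
}
```

Because each job only reads and writes the file, this works no matter whether the jobs are started by the TAC, by tRunTask, or by an external scheduler.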
One Star

Re: [resolved] Best practice to build a job chain and to deploy changes

Hi,
"Actually this will also not work if UC4 starts the jobs" --> I know; I plan to just include the "read context from file" step in every job :-)
What works really well in UC4 is the creation of workflows (with parallelism), triggers, events, restart of jobs, etc.
It also implies a pretty straightforward deployment technique: build job --> copy zip to server --> include job in UC4 --> done :-)
I will also give the tRunTask component a try and see if I can stick to "Talend only" tools!
Seventeen Stars

Re: [resolved] Best practice to build a job chain and to deploy changes

Using UC4 or any other job controller like it currently has the advantage that you can use the job return code to steer the job chain. Talend does not provide this feature. In the latest release, 5.6.1, you can get the return code via the web service (and thus via tRunTask), but not in execution plans. I have a lot of projects that work exactly the way you described at first, and it works very well.
In addition, we add components to the jobs that register each job in a monitoring table, and for every job run we record:
* counters
* all timestamps
* context variables (at start and at the end)
* logs
* return code
* host
* host pid
* user running the job
* last processed timestamps or values (min/max)
We use this information to steer incremental loads.
https://www.talendforge.org/exchange/index.php?eid=1316&product=tos&action=view&nav=1,1,1
It is also possible to use the AMC database, but mostly it does not fit our needs.
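
To illustrate how such a monitoring table can steer an incremental load, here is a small sketch. It is an assumption-laden model, not the schema of the linked component: the class `JobRunLog`, its fields, and the in-memory list standing in for the database table are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical in-memory stand-in for a job monitoring table.
class JobRunLog {

    // One row per job run; a real table would hold many more columns
    // (timestamps, counters, host, pid, user, context values, ...).
    static final class Run {
        final String jobName;
        final int returnCode;
        final long maxProcessedValue;

        Run(String jobName, int returnCode, long maxProcessedValue) {
            this.jobName = jobName;
            this.returnCode = returnCode;
            this.maxProcessedValue = maxProcessedValue;
        }
    }

    private final List<Run> runs = new ArrayList<>();

    // Called by every job run to register its outcome.
    void register(String jobName, int returnCode, long maxProcessedValue) {
        runs.add(new Run(jobName, returnCode, maxProcessedValue));
    }

    // Lower bound for the next incremental load: the max processed value of
    // the most recent run of this job that ended with return code 0.
    Optional<Long> lastSuccessfulMax(String jobName) {
        for (int i = runs.size() - 1; i >= 0; i--) {
            Run run = runs.get(i);
            if (run.jobName.equals(jobName) && run.returnCode == 0) {
                return Optional.of(run.maxProcessedValue);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        JobRunLog log = new JobRunLog();
        log.register("load_orders", 0, 1000L);
        log.register("load_orders", 1, 1500L); // failed run, ignored below
        System.out.println(log.lastSuccessfulMax("load_orders"));
    }
}
```

Ignoring failed runs means the next load starts again from the last value that was processed successfully, so a failed run is repeated rather than skipped.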
One Star

Re: [resolved] Best practice to build a job chain and to deploy changes

I've now played around with tRunTask and it seems to do pretty much what I need :-) Thanks for the suggestion and the implementation of that valuable component!
One thing I couldn't figure out: is it possible to have a chain of tRunTasks that "splits" into parallel execution at certain points and "unites" later on? Something like the chain in the attached image.
One resolution I could think of is to "encapsulate" the jobs I want to run in parallel into separate subjobs, but is there an easier way?
--> Too fast with posting, found the solution: tParallelize :-)
Seventeen Stars

Re: [resolved] Best practice to build a job chain and to deploy changes

Your question is not specific to the tRunTask component.
The second parallelisation is wrong and will not work!
You can trigger the next step from the first one using the Synchronize output. This trigger fires only once all parallel "routes" have finished. In addition, you can decide what should happen if one route fails.
tParallelize_1 -- Parallel --> tRunTask_1
               -- Parallel --> tRunTask_2
               -- Synchronize --> tJava (or anything else that should happen when both tasks have finished)
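
The fork/join behaviour of the Parallel and Synchronize links can be sketched in plain Java. This is only an analogy using `CompletableFuture`, not how tParallelize is implemented; the class `ParallelThenSync` and the task names are hypothetical.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical analogy for "Parallel" routes followed by a "Synchronize" step.
class ParallelThenSync {

    // Starts one route per task name, waits for all of them, then runs the
    // synchronized follow-up step. Returns the recorded events in order.
    static List<String> run(List<String> taskNames) {
        List<String> events = new CopyOnWriteArrayList<>();

        // "Parallel" links: every route starts independently.
        CompletableFuture<?>[] routes = taskNames.stream()
                .map(name -> CompletableFuture.runAsync(() -> events.add("finished " + name)))
                .toArray(CompletableFuture[]::new);

        // "Synchronize" link: continue only after all routes have finished.
        // join() throws a CompletionException if any route failed.
        CompletableFuture.allOf(routes).join();
        events.add("synchronized");
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("tRunTask_1", "tRunTask_2")));
    }
}
```

The order of the two "finished" events may vary between runs, but "synchronized" is always recorded last.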
One Star

Re: [resolved] Best practice to build a job chain and to deploy changes

Hi again,
you are right of course, my image containing the tParallelize component was not correct. I've updated it with a corrected version now.
Thanks again!
Markus
One Star

Re: [resolved] Best practice to build a job chain and to deploy changes

Execution Plans:
In addition to the problems stated above, I had difficulty with restartability when trying to use Execution Plans. Recovery checkpoints seem to work just fine until I add parallel executions. Couple that with my ever-changing and growing job list and it was just unusable.
Third-Party Tool:
Using third-party tools is not a great solution either, because then I cannot use my TAC to deploy all my jobs. When I only had to deploy a job or two, simply uploading them manually was fine. When I had an update that stretched across several jobs, exporting and uploading all of those jobs was not a great solution.
I simply used a tRunJob and a trigger check process to build my "master jobs".
The MasterJob controls the execution order.  This job calls the same Dynamic job repeatedly, passing in a new job_name with each call.
Here you can see the job name passed in as a string from the master job.
This calls the Dynamic job-->
The Dynamic job that it calls simply executes that job from its dynamic list.

The components surrounding the tRunJob (Dynamic) update a control table. The table is updated with an Error, Running, or Complete status. This allows restart at the point of failure without any user intervention, or at specific restart points by updating the status code to "Null" or "Error".
When the job completes, it simply creates a TEMP_FILE; dependent jobs wait for that file to be created before they begin execution.
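
The control-table logic described above can be sketched roughly as follows. This is an illustrative model only, not the poster's actual job: the class `RestartableChain`, the in-memory map standing in for the control table, and the `runJob` callback standing in for the dynamic tRunJob are all assumptions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical model of a master job driven by a status control table.
class RestartableChain {

    enum Status { RUNNING, COMPLETE, ERROR }

    // Stands in for the control table: job_name -> status.
    // A missing entry corresponds to a NULL status (job never run).
    private final Map<String, Status> controlTable = new LinkedHashMap<>();

    // Runs the jobs in order, skips jobs already COMPLETE, and stops at the
    // first failure. Returns the names of the jobs that were executed.
    List<String> run(List<String> jobNames, Predicate<String> runJob) {
        List<String> executed = new ArrayList<>();
        for (String job : jobNames) {
            if (controlTable.get(job) == Status.COMPLETE) {
                continue; // finished in an earlier attempt, nothing to redo
            }
            controlTable.put(job, Status.RUNNING);
            executed.add(job);
            boolean ok = runJob.test(job);
            controlTable.put(job, ok ? Status.COMPLETE : Status.ERROR);
            if (!ok) {
                break; // the next call restarts at this job
            }
        }
        return executed;
    }

    Status statusOf(String job) {
        return controlTable.get(job);
    }

    public static void main(String[] args) {
        RestartableChain chain = new RestartableChain();
        List<String> jobs = List.of("job_a", "job_b", "job_c");
        System.out.println(chain.run(jobs, job -> !job.equals("job_b"))); // job_b fails
        System.out.println(chain.run(jobs, job -> true));                 // restart
    }
}
```

In this model, forcing a rerun of a specific job corresponds to resetting its row, which mirrors setting the status code to "Null" or "Error" in the table.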
Employee

Re: [resolved] Best practice to build a job chain and to deploy changes

You may want to check out the details of the TAC metaservlet API. They are in Appendix B of the TAC User's Guide, and there is a detailed example of using the TAC API in the Knowledge Base.

Using the TAC API will help with some of the issues you raise. The TAC API will allow you to pass Context Variables to the child Jobs.
Using the TAC API will allow you to run the child Jobs on other Job Servers if you wish.
The TAC API can invoke jobs either synchronously or asynchronously. When run asynchronously, it returns a handle via the execRequestId that you can use to poll the status of your job.
This allows you to deploy your jobs independently of each other, so it decouples the SDLC of the child jobs from each other and from the parent job. A new child job requires only the deployment of that one job, not recompilation of all the others into an uber job.
You can still use the tParallelize approach with synchronous child job invocation if you wish. But if you are invoking the child jobs asynchronously, just invoke them in order; there is no need for the parallel threading.
Because you are invoking your child jobs via the TAC (i.e. just as if you were running them normally) they have access to things like checkpoint recovery.
The execRequestId will give you fine grained control for recovery if you don't like the checkpoint feature.
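
As a rough sketch of what such a call looks like, the snippet below only builds the request URL and does not contact a server. It assumes the metaservlet takes a Base64-encoded JSON request appended to the `metaServlet?` URL and that the `runTask` action accepts the parameter names used here; treat the action name, the parameter names, the host, the credentials, and the task ID as assumptions to verify against Appendix B for your TAC version. The class `MetaServletRequest` is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper that builds a TAC metaservlet request URL.
class MetaServletRequest {

    // Minimal JSON string quoting (backslash and double quote only).
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Builds the JSON body for an assumed "runTask" action, including the
    // context variables to pass to the child job.
    static String runTaskJson(String user, String password, int taskId,
                              String mode, Map<String, String> context) {
        StringJoiner ctx = new StringJoiner(",", "{", "}");
        context.forEach((key, value) -> ctx.add(quote(key) + ":" + quote(value)));
        return "{\"actionName\":\"runTask\""
                + ",\"authUser\":" + quote(user)
                + ",\"authPass\":" + quote(password)
                + ",\"taskId\":" + taskId
                + ",\"mode\":" + quote(mode)
                + ",\"context\":" + ctx + "}";
    }

    // Appends the Base64-encoded request to the metaservlet URL.
    static String url(String tacBaseUrl, String json) {
        String encoded = Base64.getEncoder()
                .encodeToString(json.getBytes(StandardCharsets.UTF_8));
        return tacBaseUrl + "/metaServlet?" + encoded;
    }

    public static void main(String[] args) {
        // Placeholder host, credentials, and task ID.
        String json = runTaskJson("user@example.com", "secret", 42,
                "asynchronous", Map.of("load_date", "2015-01-31"));
        System.out.println(url("http://tac.example:8080/org.talend.administrator", json));
    }
}
```

For an asynchronous call, the response would then carry the execRequestId mentioned above, which the parent job can store and use for polling.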
Seventeen Stars

Re: [resolved] Best practice to build a job chain and to deploy changes

Hi eost,
this is exactly what the tRunTask component does. Unfortunately, Talend does not provide such a component, so I had to create it.