Differences between a Joblet and the tRunJob component

Overview

Both a Joblet and the tRunJob component encourage code reuse and refactoring, help improve development efficiency, and ease maintenance. However, you may wonder what the difference is between them, and when you should use one or the other. This article explains the differences between a Joblet and the tRunJob component from a technical point of view, as well as from a usage angle.

 

Environment

Although tRunJob is a generic component available in the core product Talend Open Studio for Data Integration, the Joblet is an advanced feature that is only available in Talend Enterprise subscription products. Therefore, this article applies mostly to users of Talend Enterprise subscription products.

 

Description

Difference

Talend Studio uses a Java code generator, and each Job is translated to a Java class. From a technical point of view, there are two differences:

  • The tRunJob component executes a child Job, which is a separate Java class. The main Job instantiates the child Job and executes it using the runJob method. A Joblet is just a GUI extraction and refactoring of some components. It creates a reusable transformation, with the generated code of the Joblet remaining a part of the Java class of the main Job.
  • The child Job called by the tRunJob component is a separate unit of execution and has its own context variables. It can't access the context variables of the main Job unless the main Job passes them to it explicitly. A Joblet, however, can access the context variables of the main Job directly, because it is a part of the main Job.
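
The two differences above can be sketched in plain Java. This is a simplified model, not the code that Talend Studio actually generates: the class names MainJobSketch and ChildJobSketch, the Map-based context, and the jobletLogic method are hypothetical. Only the idea of a separate child class that is executed through a runJob method comes from the description above, and the parameter shape used here is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a main Job (not real generated code).
class MainJobSketch {
    private final Map<String, String> context = new HashMap<>();

    MainJobSketch() {
        context.put("dbHost", "localhost");
    }

    // Joblet-style logic: it lives inside the main Job's class,
    // so it reads the main Job's context directly.
    String jobletLogic() {
        return context.get("dbHost");
    }

    // tRunJob-style call that passes nothing: the child sees no parent value.
    String callChildWithoutParams() {
        return new ChildJobSketch().runJob(new HashMap<>());
    }

    // tRunJob-style call that passes one value explicitly.
    String callChildWithParam() {
        Map<String, String> params = new HashMap<>();
        params.put("dbHost", context.get("dbHost"));
        return new ChildJobSketch().runJob(params);
    }

    public static void main(String[] args) {
        MainJobSketch job = new MainJobSketch();
        System.out.println("Joblet sees: " + job.jobletLogic());
        System.out.println("Child without params sees: " + job.callChildWithoutParams());
        System.out.println("Child with param sees: " + job.callChildWithParam());
    }
}

// Hypothetical stand-in for a child Job: a separate class with its own context.
class ChildJobSketch {
    private final Map<String, String> context = new HashMap<>();

    // Assumed shape of a runJob-like entry point; only passed values arrive.
    String runJob(Map<String, String> contextParams) {
        context.putAll(contextParams);
        return context.getOrDefault("dbHost", "<undefined>");
    }
}
```

In this model, the Joblet-style method needs no parameter passing at all, while the child class only ever sees what the caller hands over.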

 

Usage

Because a Joblet and the tRunJob component differ in how they refactor code and in how they function, the choice between them depends on business requirements. The following sections describe the circumstances that could lead you to choose one or the other.

 

Joblet

The Joblet code is automatically included in the main Job code when the code is generated, thus using fewer resources and improving performance. A Joblet is usually used to meet the following needs:

  • Output or print static messages. If, for example, you want to trace the Job execution and print a static message for each step, create a Joblet and use a tJava to print this message at the beginning of the Job execution:

    System.out.println("The job starts to run");
  • Load the values of context variables from a file or a database. If one or more Jobs load the values of context variables from a file or a database, you should usually create a dedicated Joblet to accomplish this task.
  • Manage custom logs with a tLogCatcher component or a tStatCatcher component as the first component in the Jobs.
  • Create a reusable transformation regardless of the type of input and output data source.

     

    If, for example, a Job reads data from both a file and a database and needs to process both inputs in the same way, create a Joblet for the shared processing:

    File Input Component – Row Main – Joblet – Row Main – Target
                |
         OnSubjobOK
                |
    Database Input Component – Row Main – Joblet – Row Main – Target
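
The idea of one transformation reused for two different input sources can be sketched in plain Java. This is only an illustration of the design, not Talend code: the class JobletReuseSketch, the NORMALIZE function, and the sample rows are hypothetical, and in-memory lists stand in for the file and database inputs.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.function.Function;
import java.util.stream.Collectors;

class JobletReuseSketch {
    // Hypothetical shared transformation, playing the role of the Joblet:
    // trim whitespace and lowercase each incoming row.
    static final Function<String, String> NORMALIZE =
            s -> s.trim().toLowerCase(Locale.ROOT);

    // Applies the same transformation to any list of rows,
    // regardless of where the rows came from.
    static List<String> process(List<String> rows) {
        return rows.stream().map(NORMALIZE).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Stand-ins for the file input and the database input.
        List<String> fromFile = Arrays.asList("  Alice ", "BOB");
        List<String> fromDatabase = Arrays.asList("Carol  ");

        System.out.println(process(fromFile));
        System.out.println(process(fromDatabase));
    }
}
```

The transformation is written once and does not know or care which input component feeds it, which mirrors how the same Joblet sits in both subjobs of the flow above.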
    

 

tRunJob

The tRunJob component helps you manage complex Job systems in real projects. It is usually used to meet the following needs:

  • Splitting a complex Job into smaller Jobs. A child Job called by the tRunJob component can also run as a standalone Job, and splitting the work this way clarifies a complex design by avoiding too many subjobs in one Job. You can create different Jobs for different business requirements, and then create a main Job that runs them as child Jobs through tRunJob components. For example, if you are building a data warehouse for retail, you can populate the fact tables (such as users, products, and orders) and the dimension tables in different Jobs, and create a main Job to run the child Jobs one by one.

    tRunJob_1 (populate the product fact table)
         |
    OnSubjobOK
         |
    tRunJob_2 (populate the order fact table)
         |
    OnSubjobOK
         |
    tRunJob_3 (populateSalesByProductByMonth)
         |
        ...
    
  • The tRunJob component is the only solution for the following case, which you often face in real projects: a Job reads data from a data source and then processes each row in a component, but some rows may contain problematic data that makes the Job fail. The Job throws a Java exception and stops running. You need to capture the Java exception with a tLogCatcher component, log it to your database or file, and make the Job continue with the next row. For example, a table stores email addresses as below:

     

    email
    email1@talend.com
    email2@talend.com
    email3@talend.com
    ...

     

    The requirement is to read the email addresses from the table and send an email to each person with a tSendMail component. However, this table may contain invalid email addresses, and if you put all the components in one Job, the Job stops as soon as an invalid address reaches tSendMail. To meet this requirement, design the Jobs as follows:

     

    mainJob:
    tMysqlInput_1: reads emails from the table.
    tFlowToIterate_1: iterates each email.
    tRunJob_1: calls the child Job.
    

    main_job.png


    On the Basic settings tab of tRunJob_1, clear the Die on child error check box so that the main Job does not stop even if an error occurs in the child Job. In the Context Param table, pass the current email from the main Job to the child Job. For more information, see the article Passing a value from a parent Job to a child Job.

    childJob.png

     

    child Job:
    tSendMail_1: sends an email to each person.
    tLogCatcher_1: catches the Java exception and logs it into a table.

    childJob1.png


    On the Basic settings tab of tSendMail_1, in the To field, enter the context variable that stores the current email passed from the main Job.

    tsendMail1.png


    Select the Die on error check box. This option makes the child Job throw a Java exception that will be captured by the tLogCatcher component when an email address is invalid.

     tsendMail2.png
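
The control flow of this design can be sketched in plain Java. This is a hedged model of the behavior described above, not Talend-generated code: the class EmailLoopSketch and its methods are hypothetical, sendMail only records the address instead of sending anything, and the "@" check is a deliberately naive stand-in for whatever makes a real send fail.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class EmailLoopSketch {
    // Stands in for the table that tLogCatcher writes to.
    final List<String> errorLog = new ArrayList<>();
    // Records the addresses that were "sent" successfully.
    final List<String> sent = new ArrayList<>();

    // Stand-in for tSendMail with "Die on error" selected:
    // an invalid address raises an exception instead of being ignored.
    void sendMail(String to) {
        if (to == null || !to.contains("@")) {
            throw new IllegalArgumentException("Invalid email address: " + to);
        }
        sent.add(to);
    }

    // Stand-in for the child Job: the exception is caught and logged
    // (the tLogCatcher role), and the child reports failure to its caller.
    boolean runChildJob(String email) {
        try {
            sendMail(email);
            return true;
        } catch (RuntimeException e) {
            errorLog.add(e.getMessage());
            return false;
        }
    }

    // Stand-in for the main Job: iterate over every email and call the child.
    // With "Die on child error" cleared, a failed child does not stop the loop.
    int runMainJob(List<String> emails) {
        int failures = 0;
        for (String email : emails) {
            if (!runChildJob(email)) {
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        EmailLoopSketch job = new EmailLoopSketch();
        int failures = job.runMainJob(
                Arrays.asList("email1@talend.com", "not-an-email", "email3@talend.com"));
        System.out.println("Sent: " + job.sent);
        System.out.println("Failures: " + failures);
        System.out.println("Error log: " + job.errorLog);
    }
}
```

The point of the split is visible in runMainJob: the failure is contained inside the child call, so the loop moves on to the next address instead of stopping at the first bad one.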

 
Version history
Revision #: 5 of 5
Last update: 06-15-2017 01:42 PM