What is the difference between tJava, tJavaRow and tJavaFlex?

Overview

There are three Java components in the Custom Code family: tJava, tJavaRow and tJavaFlex. They allow you to integrate custom Java code into a Talend Job. This article explains the differences between the three components and shows how to use each of them in a Job.

Environment

This article was written with:

  • Talend Open Studio for Data Integration 5.3.1-r72978
  • JDK version: Sun JDK build 1.6.0_26-b03
  • Operating system: Windows XP SP3

This article applies to all versions of Talend Studio.

Description

Each of the three components is described below, followed by a typical use case.

tJava

You can use a tJava component to integrate your custom Java code into a Talend Job. Its code applies exclusively to the start part of the generated code of the subjob: it is executed first, and only once, in the subjob. Normally, the tJava has no input or output data flow and is used as a separate subjob. The following example shows how you can use a tJava component.

A common Job using tJava:

  • A tFileInputDelimited component reads data from a text file,
  • then passes the data to a tLogRow component, which prints it in the console,
  • then a tJava component retrieves the total number of records processed by the Job from a global variable and prints this number in the console.

The source data file read by the tFileInputDelimited_1 includes the following data:

1;Shong
2;Elisa
3;Sabrina

Insert the following Java code in the tJava_1 component to obtain the number of records processed:

 

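// Read the row count that tFileInputDelimited_1 stored in globalMap.
int nb_line = (Integer) globalMap.get("tFileInputDelimited_1_NB_LINE");
// Print the count in the console.
System.out.println("The total number of records are read from the text file is: " + nb_line);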

Execute the Job by pressing F6. You can see the following results in the console:

Starting job aaa at 15:42 22/09/2013.
 
[statistics] connecting to socket on port 3589
[statistics] connected
1|Shong
2|Elisa
3|Sabrina
The total number of records are read from the text file is: 3
[statistics] disconnected
Job aaa ended at 15:42 22/09/2013. [exit code=0]

The records from the input file are printed in the console by the tLogRow component. In addition, the code in the tJava component prints the number of records in the console.
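
To see what this lookup does outside the Studio, the following standalone sketch imitates it in plain Java. It assumes that globalMap behaves like a java.util.Map<String, Object> and that the input component stores its row count under the key tFileInputDelimited_1_NB_LINE. The class GlobalMapSketch and the helper readRowCount are hypothetical names used only for illustration. Unlike the direct cast shown above, which throws a NullPointerException when the key is absent, the helper falls back to 0.

```java
import java.util.HashMap;
import java.util.Map;

class GlobalMapSketch {

    // Hypothetical helper: returns the row count stored under the key,
    // or 0 if the key is absent or does not hold an Integer.
    static int readRowCount(Map<String, Object> globalMap, String key) {
        Object value = globalMap.get(key);
        return (value instanceof Integer) ? (Integer) value : 0;
    }

    public static void main(String[] args) {
        // Stand-in for the globalMap that is available inside a Job (assumption).
        Map<String, Object> globalMap = new HashMap<>();
        // Simulates the count stored after three rows have been read.
        globalMap.put("tFileInputDelimited_1_NB_LINE", 3);

        int nbLine = readRowCount(globalMap, "tFileInputDelimited_1_NB_LINE");
        System.out.println("Rows read: " + nbLine);
    }
}
```

In a real Job you would probably keep only the two lines of the helper body in the tJava, since globalMap is already available there.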

tJavaRow

The tJavaRow code applies exclusively to the main part of the generated code of the subjob. The Java code inserted through the tJavaRow will be executed for each row. Generally, the tJavaRow component is used as an intermediate component and you are able to access the input flow and transform the data.

The following use case shows a typical Job using a tJavaRow:

  • A tFileInputDelimited component reads data from a text file,
  • then a tJavaRow component applies some transformation to the data being processed,
  • then the transformed data is displayed in the console using a tLogRow component.

The tFileInputDelimited_1 reads the same text file as in the tJava example.

The following Java code needs to be inserted in the tJavaRow_1 component to transform the data. In this use case, it converts the values of the name column to upper case and passes the id column through unchanged.

output_row.id = input_row.id;
output_row.name = (input_row.name).toUpperCase();

Execute the Job by pressing F6. You can see the following results in the console:

Starting job aaa at 16:27 22/09/2013.
 
[statistics] connecting to socket on port 3393
[statistics] connected
1|SHONG
2|ELISA
3|SABRINA
[statistics] disconnected
Job aaa ended at 16:27 22/09/2013. [exit code=0]

This example shows that it is possible to access the input flow using a dedicated variable and following a specific syntax such as: input_row.name. The source data is processed at runtime by the tJavaRow component.
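
The per-row behaviour can be imitated in plain Java as follows. This is a minimal sketch, not the code that the Studio generates: the Row class is a hypothetical stand-in for the row structures built from the schema, and the explicit loop plays the role of the data flow. The null check is an addition that the original two-line example does not have; without it, a row whose name is null would throw a NullPointerException.

```java
import java.util.Arrays;
import java.util.List;

class JavaRowSketch {

    // Hypothetical stand-in for a row with the columns id and name.
    static class Row {
        Integer id;
        String name;

        Row() {}

        Row(Integer id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // Equivalent of the tJavaRow code: called once for each incoming row.
    static void transform(Row input_row, Row output_row) {
        output_row.id = input_row.id;
        // Guard against null so that an empty field does not raise an exception.
        output_row.name = (input_row.name == null) ? null : input_row.name.toUpperCase();
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row(1, "Shong"), new Row(2, "Elisa"), new Row(3, "Sabrina"));
        // The loop stands in for the flow that feeds the tJavaRow row by row.
        for (Row input_row : rows) {
            Row output_row = new Row();
            transform(input_row, output_row);
            System.out.println(output_row.id + "|" + output_row.name);
        }
    }
}
```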

tJavaFlex

The tJavaFlex has three Java code parts (start, main, end) that enable you to enter personalized code for different purposes. The start part will be executed first but only once in the subjob. The main part will be executed for each row. You are able to access the input flow and modify the data. The source data is processed at runtime by the tJavaFlex. The end part will be executed at the end of the subjob, but only once.

A simple use case for tJavaFlex:

  • A tFileInputDelimited component reads data from a text file,
  • then a tJavaFlex component injects specific code at various moments of the Job processing (start, main or end parts),
  • then the data is displayed in the console using a tLogRow component, along with the processing information printed by the start and end parts of the tJavaFlex.

The tFileInputDelimited_1 still reads the same text file as in the tJava and tJavaRow examples.

The following Java code needs to be inserted in the three Java-code parts of tJavaFlex:

Start code:

System.out.println("******The subjob begins to work!******");
int nb_line=0;

Main code:

row2.name=(row1.name).toUpperCase();
nb_line++;

End code:

System.out.println("The total number of processed data is: "+nb_line);
System.out.println("******The subjob finishes!******");

Execute the Job by pressing F6. You can see the following results in the console:

Starting job aaa at 16:47 22/09/2013.
 
[statistics] connecting to socket on port 3641
[statistics] connected
******The subjob begins to work!******
1|SHONG
2|ELISA
3|SABRINA
The total number of processed data is: 3
******The subjob finishes!******
[statistics] disconnected
Job aaa ended at 16:47 22/09/2013. [exit code=0]

 

In short, the tJavaFlex is a combination of tJava and tJavaRow: it runs one-shot code at the start and end of a subjob and also transforms the data of each row.
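
One way to picture the three parts is as code placed before, inside and after a single loop over the rows. The sketch below is a simplification under that assumption, not the code actually generated by the Studio; the class JavaFlexSketch and the method run are hypothetical names, and the console lines are collected in a list so that they can be inspected. It also illustrates why the nb_line variable declared in the start code can be used in the main and end code: all three parts share one scope.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class JavaFlexSketch {

    // Returns the lines that the three code parts would print for the given names.
    static List<String> run(List<String> names) {
        List<String> console = new ArrayList<>();

        // Start code: runs once, before the first row.
        console.add("******The subjob begins to work!******");
        int nb_line = 0;

        for (String name : names) {
            // Main code: runs once per row and sees the variables of the start code.
            console.add(name.toUpperCase());
            nb_line++;
        }

        // End code: runs once, after the last row.
        console.add("The total number of processed data is: " + nb_line);
        console.add("******The subjob finishes!******");
        return console;
    }

    public static void main(String[] args) {
        for (String line : run(Arrays.asList("Shong", "Elisa", "Sabrina"))) {
            System.out.println(line);
        }
    }
}
```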

Conclusion

The examples above show that tJava usually runs as a separate subjob and executes its Java code only once, whereas the code in a tJavaRow is executed for each row. The tJavaRow is normally used as an intermediate component that accesses and transforms the input data flow. If you also need to perform initialization at the beginning of the subjob, or other processing at its end, the tJavaFlex is the best component to use.

Version History
Revision 1 of 1, last updated 04-13-2017 10:31 PM