Doing an inner join using a tMap component

Overview

You often need to perform joins between data sources to get the desired results. tMap is a transformation component that allows you to do joins. Both inner joins and outer joins are supported. In this procedure, you will learn how to do an inner join on tMap and manage the matched and rejected data.

 

Environment

This procedure was written with:

  • Talend Open Studio for Data Integration 5.4.1
  • Sun JDK build 1.6.0_26-b03
  • Windows XP SP3
  • MySQL 5.0.67-community-nt

Talend verified this procedure to be compatible with all versions of Talend Studio.

 

Procedure

Requirement description

In this example, there are two data sources: a text file named in.txt and a table named person in MySQL database. The data of each data source are read as follows:

in.txt

id;name;sex
1;Shong;boy
2;Ross;boy
3;Sabrina;girl
4;Elisa;girl

 

person table

user_id name country email
1 Shong CN Shong@talend.com
4 Elisa FR Elisa@talend.com

 

This example reads the data from the in.txt file as a main flow, and then does an inner join with the data read from the person table on a tMap component based on the id column from the file and the user_id column from the table. The join will get rows, including the external columns from lookup table, if the id already exists in the table, and will reject rows if the id does not exist in the table.

 

Here is the expected output:

Matched rows:

id name sex email
1 Shong boy Shong@talend.com
4 Elisa girl Elisa@talend.com

 

Unmatched rows:

id name sex
2 Ross boy
3 Sabrina girl

 

Create a demo job

Create a demo Job called InnerJoinDemo. The Job design is as follows:

tFileInputDelimited_1: reads data from in.txt file.

tMysqlInputDelimited_1: reads data from person table.

tMap_1: performs an inner join and generates two output tables.

tLogRow_1 and tLogRow_2: print output data to the console.

job_design.png

 

The following images show the detailed component settings:

tFileInputDelimited_1:

tFileInputDelimited_settings.png

 

tMysqlInput_1:

tmysqlInput_settings.png

 

tMap_1: performs an inner join between the two incoming data flows, and generates two output tables: one table for the matched rows, and another for the unmatched rows.

tMap_settings.png

 

tLogRow_1, tLogRow_2: print the output data to the console with table model.

tlogRow_settings.png

 

Execute job

Execute the Job. The console shows the following results:

Starting job InnerJoinDemo at 20:56 19/12/2013.
[statistics] connecting to socket on port 3790
[statistics] connected
.--+-----+----+----------------.
|          tLogRow_1           |
|=-+-----+----+---------------=|
|id|name |sex |email           |
|=-+-----+----+---------------=|
|1 |Shong|boy |Shong@talend.com|
|4 |Elisa|girl|Elisa@talend.com|
'--+-----+----+----------------'
.--+-------+----.
|   tLogRow_2   |
|=-+-------+---=|
|id|name   |sex |
|=-+-------+---=|
|2 |Ross   |boy |
|3 |Sabrina|girl|
'--+-------+----'
[statistics] disconnected
Job InnerJoinDemo ended at 20:56 19/12/2013. [exit code=0]

The results show that all matched rows are output into tLogRow_1 and all unmatched rows are output into tLogRow_2.

Version history
Revision #:
4 of 4
Last update:
‎06-29-2017 09:30 AM
Updated by:
 
Labels (1)