Four Stars

how to filter the updated rows

hello everybody,

i'm trying to setup a talend job, this is my current situation:

I have an Oracle table A, let's say it is something like this:

 

ID NAME SURNAME AGE

P56 John    Smith          36

P67 Micheal Douglas    21

P78 Will     Samuelson  56

 

I also have a file B, which contains the entire table A data, updated to the latest version.

 

ID NAME SURNAME AGE

P56 John    Smithson    35        <-- updated row

P67 Micheal Douglas    21

P78 Will     Samuelson  56

 

I need to produce a new Oracle table C, containing only the rows that need to be updated (i dont need to consider additional rows or deleted rows).

 

So in this example, i need a table C containing only

 

ID NAME SURNAME AGE

P56 John    Smithson    35 

 

How can i do this? I'm currently trying in the following way:

a tMap which takes 2 inputs, the file and the table A, and goes out to the new table C, i joined the 2 inputs via inner join, using the ID as a key.

I tried to add a filter to table A, in order to select only the rows where some data has changed, but i'm still moving my first steps in Talend and i cannot understand how to do that, any hint?

 

thanks in advance

 

 

 

 

  • Data Integration
3 REPLIES
Eleven Stars TRF
Eleven Stars

Re: how to filter the updated rows

In tMap, use a inner join between the table and the file based on all the fields.

Select the "Catch lookup inner join reject" option on the tMap output flow.

You'll get a the rows which are in the table and not in the file (new or modified).

If you want to identify news and deleted rows, you need to start with 2 tMap with join based on the Id field (1st with db as main and 2nd with file as main).


TRF
Eleven Stars TRF
Eleven Stars

Re: how to filter the updated rows

Does this help you?


TRF
Six Stars

Re: how to filter the updated rows

Hi rekotc,

 

here is another suggestion:

In tMap join tableA as main input with fileB as lookup on the ID field as you have done already.

In the Variable Panel, create a boolean variable "name_changed" which compares the column Name from tableA to the column Name of your fileB, returning true when they are different.

Repeat the last step for Surname and Age accordingly.

Create a boolean variable called "update_flag" which is true if any of the previous comparison results are true.

Use this update_flag in the filter section of your output, so only those rows with a change in some column(s) are sent to tableC.

 

Best regards,

 

Thomas