Four Stars

How reproduce Oracle Merge in Talend

Hi all,

I am  new Talend user. I have a quite simple task to do: create a job that has to take data from a table in a database A and update or insert records in a table in a database B.

I tried using the following structure but it is very slow (2 records/s)

Cattura.PNG

Is there a way to perform an Oracle merge (so: check if a record exists based on a specific condition, update if exists and insert otherwise).

 

I find a tSQLTemplateMerge but I don't know how to use it

Cattura.PNG

Tags (3)
5 REPLIES
Moderator

Re: How reproduce Oracle Merge in Talend

Hello,

If I understand your requirement very well, do you want to capture the changed data and load it into your target DB?

You can compare tables by using tMap.
The work flow should be:

Target DB-->tMap(make inner join on your input and set the "Catch lookup                                                                                                                               inner join reject" as true)-->output

Source DB-->

The output will be the changed data.

Best regards

Sabrina

--
Don't forget to give kudos when a reply is helpful and click Accept the solution when you think you're good with it.
Four Stars

Re: How reproduce Oracle Merge in Talend

Hi Sabrina,
unfortunately my tables are very big (2.200.000 records) so, trying to executing a join as you suggested I get
"Exception in thread "main" java.lang.OutOfMemoryError: Java heap space".
I think it isn't the right way to proceed (in my case).

Furthermore, this job should be run every day to import clients from a source so i have 3 possibility:
-unchanged clients (no info has changed)
-new clients
-updated clients (clients already present in the destination with same field to be updated).
I need to distinguish among these options
Six Stars JGM
Six Stars

Re: How reproduce Oracle Merge in Talend

you should consider this approach for very large tables. Instead of processing the data in Talend, first stage the incoming data to a temp table, then use a SQL statement to merge the data using the DB. 

 

ex:

image.png

Eleven Stars

Re: How reproduce Oracle Merge in Talend

You are not giving enough information. Carrying out an insert/update on a table with over 2 million records every day is a bad solution unless there is absolutely no other information to identify which rows are new, have changed and have not changed. You are essentially carrying out a cartesian join in terms of comparisons.

 

What @xdshi suggested is a fair enough suggestion given the original information. 2 million records should not lead to memory issues unless you are working with a very small amount of RAM. Have you tried upping the RAM? How many columns do you import per row? Can you use heuristics to identify which records need to be compared? Are there any records within the data set which will help identify changes? Can you filter your source tables in the DB pre-emptively before carrying out the comparison that @xdshi suggested? Can you limit the number of columns returned by both source and target tables? Can you make use of the tMap memory functionality (lookup Temp Data Directory here https://help.talend.com/reader/NNO~fmVQU4rlkF9Depfdxw/usVFZrXILonTCNdL86safA)? 

 

Once we know more about the problem it will be easier to suggest a better way of approaching this. @JGM has suggested using a Temp table which might be a reasonable thing to give a go. But moving that many records to carry out an in DB process (which is sometimes the best solution, granted), seems a bit of overkill if you are only looking for hundreds of rows (changed/created). 

Rilhia Solutions
Six Stars

Re: How reproduce Oracle Merge in Talend

If you don't have a last_modified_date_timestamp column on your tables you should add one.

Then you can find the max(last_modified_date_timestamp) in database B.

Next select from database A where last_modified_date_timestamp is > than the max(last_modified_date_timestamp) from database B.