Duplicate data

Six Stars

Hi,
I have a question.
I loaded an Excel file into the database last week.
Now I have received the same Excel file again; they just copied the old file and sent it to me. If I load it into the database, it will create duplicates.
I don't want duplicate data to be loaded. My project has no updates; everything I do is an insert, but it must not insert duplicates.
E.g., if I have a file with 100 records, of which 50 are previous records and 50 are new, the previous records are duplicates and I don't want to push them into the database. What approach can I follow?
Please explain clearly.
If you have built a job of this kind, please share it at 12e41a0222@gmail.com

TRF
Fifteen Stars

Re: Duplicate data

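You need to read the existing records from the database, then use a tMap with the Excel file as the main row and the database as the lookup row. Set the join model to inner join on the key columns and, on the output, enable "Catch lookup inner join reject": the rejects are the Excel records that do not yet exist in the database, so only those get inserted.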

If the previous records are still present in the new file, you can also simply replace the contents of the database table with this file.

This is a very common pattern.
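
For a beginner, it may help to see the same logic outside of the Studio. The following is only a minimal sketch in plain Java, assuming each record is identified by a single key column. `DedupSketch`, `Row` and `keepNewRows` are hypothetical names for illustration, not Talend APIs, and the hard-coded keys stand in for what a database input component would read.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DedupSketch {

    /** Hypothetical record type: one Excel row identified by a key column. */
    static final class Row {
        final String key;
        final String payload;

        Row(String key, String payload) {
            this.key = key;
            this.payload = payload;
        }
    }

    /**
     * Keeps only the rows whose key is not already in the database.
     * This mirrors the tMap idea: Excel rows are the main flow, the
     * database keys are the lookup, and only the join rejects pass.
     */
    static List<Row> keepNewRows(List<Row> excelRows, Set<String> existingKeys) {
        // Copy the lookup so the caller's set is left untouched.
        Set<String> seen = new HashSet<>(existingKeys);
        List<Row> toInsert = new ArrayList<>();
        for (Row row : excelRows) {
            // Set.add returns false when the key is already present, so this
            // also skips keys repeated inside the file itself (an extra
            // safeguard beyond the plain lookup).
            if (seen.add(row.key)) {
                toInsert.add(row);
            }
        }
        return toInsert;
    }

    public static void main(String[] args) {
        // Assumption: these keys were loaded from the database last week.
        Set<String> existingKeys = new HashSet<>(Arrays.asList("A1", "A2"));
        List<Row> excelRows = Arrays.asList(
                new Row("A1", "old"),
                new Row("A2", "old"),
                new Row("A3", "new"),
                new Row("A4", "new"));

        List<Row> toInsert = keepNewRows(excelRows, existingKeys);
        System.out.println(toInsert.size() + " new rows to insert"); // prints "2 new rows to insert"
    }
}
```

In a real job the tMap does this for you; the sketch only shows why the key columns of the join must uniquely identify a record.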


TRF
Six Stars

Re: Duplicate data

Screenshots would have helped me, as I am a beginner.
