Duplicate data

Six Stars


Hi,
I have a question.
Last week I loaded an Excel file into the database.
Now I have received the same Excel file again: they just copied the old file and sent it to me. If I load it into the database, it will create duplicates.
I don't want duplicate data to be inserted. My project has no updates; every load is an insert, but it must not insert duplicates.
E.g. if I have a file with 100 records, where 50 are previous records and 50 are new, the previous records are duplicates and I don't want to push them into the database. What approach can I follow?
Please explain clearly.
If you have built a job of this kind, please share it at 12e41a0222@gmail.com

Sixteen Stars

Re: Duplicate data

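You need to read the existing records from the database, then use a tMap with an inner join between the Excel file (main row) and the database (lookup row) on the key columns. Send only the inner join rejects to the output, that is, the Excel rows that have no match in the database. Those are the new records to insert; the rows that already exist in the database are excluded.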

If the previous records are still present in the new file, you can also replace the database contents with this file instead of filtering.

This is a very common pattern.
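
Outside Talend, the same logic can be sketched in a few lines of Python. This is only an illustration of the pattern, not a Talend job: the `customers` table, its `id` and `name` columns, and the `insert_new_rows` helper are hypothetical names, and an in-memory SQLite database stands in for the real target. The keys already in the database play the role of the lookup row, and only incoming rows without a match are inserted.

```python
import sqlite3


def insert_new_rows(conn, rows):
    """Insert only the rows whose id is not already in the (hypothetical) customers table."""
    # Lookup flow: the keys that already exist in the database.
    existing = {r[0] for r in conn.execute("SELECT id FROM customers")}

    new_rows = []
    for row in rows:  # main flow: the rows read from the incoming file
        if row["id"] in existing:
            continue  # already loaded earlier, so skip it
        existing.add(row["id"])  # also guards against repeats inside the same file
        new_rows.append(row)

    conn.executemany(
        "INSERT INTO customers (id, name) VALUES (:id, :name)", new_rows
    )
    conn.commit()
    return len(new_rows)


# Demo with an in-memory database standing in for the real target.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")

last_week = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]
this_week = last_week + [{"id": 3, "name": "Cy"}, {"id": 4, "name": "Dee"}]

first_load = insert_new_rows(conn, last_week)
second_load = insert_new_rows(conn, this_week)  # old rows are filtered out
total = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
print(first_load, second_load, total)
```

The important design choice is the key: the comparison only works if the file and the table share a column (or set of columns) that uniquely identifies a record. In the tMap, that is the column you drag from the main row onto the lookup row.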


TRF
Six Stars

Re: Duplicate data

Screenshots would have helped me as a beginner.
