Only Load Files not in the LoadFiles Table (in case of duplicates)


Hi, I have been able to figure out most of what I needed in Talend so far. However, I'm at a point where I'm not sure how to make Talend do what I need. I have a SQL table that I load file names and counts into as the files are processed. I need to add a step at the beginning that checks that table to see whether a file with the same name has been loaded before, and excludes the file if it has. I've tried to work this out with a join and with a tMap. You can see my setup below. At the very end (circled) you can see where I load the files into the table. I have indicated where I want to put a component, or a series of components, that will check the Files Loaded table before loading.

 

Capture.JPG


Accepted Solutions
Employee

Re: Only Load Files not in the LoadFiles Table (in case of duplicates)

Hi,

 

     Please try the snippet below before the flat file read.

 

image.png

Read the list of files using tFileList, and pass the current file name as a parameter in the WHERE clause of the database query.

Connect it to a tFlowToIterate component with the default key value option turned on (under Basic settings).

 

Use a Run if trigger and add the condition below to it:

 

!Relational.ISNULL(((String)globalMap.get("row1.file_name")))
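
To make the condition above concrete, here is a minimal plain-Java sketch of what it evaluates. It assumes that tFlowToIterate writes the `row1.file_name` key into `globalMap` only when the lookup query returns a row; the `HashMap` below is a stand-in for Talend's `globalMap`, a plain null check stands in for `Relational.ISNULL`, and the file name is made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class RunIfSketch {
    // Stand-in for the Run if expression: true when the lookup
    // stored a file name under the key, false when the key is absent.
    static boolean alreadyLoaded(Map<String, Object> globalMap) {
        return globalMap.get("row1.file_name") != null;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for Talend's globalMap.
        Map<String, Object> globalMap = new HashMap<>();

        // No row came back from the lookup: the key was never set.
        System.out.println(alreadyLoaded(globalMap));

        // The lookup found the name: tFlowToIterate would have stored it.
        globalMap.put("row1.file_name", "sales_2019_06.csv");
        System.out.println(alreadyLoaded(globalMap));
    }
}
```

Under that assumption the expression is true when the file name is already in the table, so whichever branch should handle new files would need the negated form.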

 

image.png

 

Please mark the topic as solution provided if the answer has helped you. Kudos are also welcome :-)

 

Warm Regards,

 

Nikhil Thampi


All Replies
Five Stars

Re: Only Load Files not in the LoadFiles Table (in case of duplicates)

Hi,

Maybe this will help to load each file only once (in case duplicate files arrive).

First pipeline: store the data together with the file name using the tHashInput component, appending the data.

Second pipeline: store that data in a temp table.

Third pipeline: join the target table and the temp table on the file name, so that only unique files are loaded into the target table.
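
The join in the third pipeline amounts to keeping only the staged file names that have no match in the target table. A minimal plain-Java sketch of that idea follows; it is not Talend component code, and the class name, method name and file names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class UniqueFileFilter {
    // Returns the staged names that are not in the loaded set,
    // keeping only the first occurrence of each staged name.
    static List<String> notYetLoaded(List<String> staged, Set<String> loaded) {
        Set<String> seen = new HashSet<>(loaded);
        List<String> result = new ArrayList<>();
        for (String name : staged) {
            // add() returns false when the name is already present.
            if (seen.add(name)) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical contents of the temp table and the target table.
        List<String> staged = Arrays.asList("a.csv", "b.csv", "a.csv", "c.csv");
        Set<String> loaded = new HashSet<>(Arrays.asList("b.csv"));
        System.out.println(notYetLoaded(staged, loaded));
    }
}
```

In a Talend job the same effect is usually obtained with an inner join in tMap and the reject output, but the exact component setup depends on the job.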

 

Regards,

Akash

Five Stars

Re: Only Load Files not in the LoadFiles Table (in case of duplicates)

This looks great, I have it all setup but for some reason it's reading the file name as a column.

I'm getting an "Invalid column name" error, and it lists the file name as the column. It's failing on tDBInput_1 and kicking out the error.
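
One common cause of this error on SQL Server is a file name concatenated into the query without single quotes, so the database parses it as a column name instead of a string literal. The query from this job was not posted, so this is only a guess. The sketch below shows the quoted form; the table name `LoadFiles` and column name `file_name` are assumptions taken from the thread title and the Run if condition, and the class and method names are hypothetical.

```java
public class FileNameQuery {
    // Builds the lookup query with the file name as a quoted string literal.
    // Single quotes inside the name are doubled so the literal stays valid.
    static String buildQuery(String fileName) {
        String escaped = fileName.replace("'", "''");
        return "SELECT file_name FROM LoadFiles WHERE file_name = '" + escaped + "'";
    }

    public static void main(String[] args) {
        // Hypothetical file name; in the job this would be the current
        // file name supplied by tFileList.
        System.out.println(buildQuery("sales_2019_06.csv"));
    }
}
```

In the tDBInput query field the same pattern would put a `'` before and after the concatenated file name expression.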

 
