Please help me with this:
I need to design a job that loads an Excel sheet containing a unique batch_Id.
When an Excel file arrives with new data, the job should first check whether the batch ID is already in the table. If it is not, the job loads the data into the table and moves the file to the Archived folder. If the batch ID is already in the table, the Excel file is not loaded and is placed in the Error folder instead.
Is the Batch ID inside the file, in a field/column of the Excel sheet? Or is the Batch ID in the filename?
Also, if the Batch ID is inside the data of the Excel file, will there be only one Batch ID per file, or possibly more than one?
This may give you some ideas. You could use tFileList to pick up the filename, then feed the file into a tMap along with a lookup of the distinct BatchIDs already in your database. Filter the output to only the records that do not match the lookup (see second picture below), then move the file with tFileCopy: to the archive folder if the inserted record count is greater than zero, or to the error folder if the record count is zero.
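If it helps to see the decision logic outside of Talend, here is a rough sketch in plain Python. It is not the Talend job itself, just the same check-then-load-then-move flow. Everything named here is an assumption for illustration: the `batch_data` table and its `batch_id` and `payload` columns, the `load_batch` function, and the idea that the batch ID and rows have already been read from the Excel file (how you extract them depends on the answers to the questions above). It also assumes one batch ID per file.

```python
import shutil
import sqlite3
from pathlib import Path


def load_batch(conn, file_path, batch_id, rows, archive_dir, error_dir):
    """Load rows only if batch_id is new, then move the file accordingly.

    Hypothetical helper: `batch_data`, `batch_id` and `payload` are
    placeholder names, not part of any real schema.
    Returns the path the file was moved to.
    """
    file_path = Path(file_path)

    # Check whether this batch ID has already been loaded into the table.
    already_loaded = conn.execute(
        "SELECT 1 FROM batch_data WHERE batch_id = ? LIMIT 1", (batch_id,)
    ).fetchone() is not None

    if already_loaded:
        # Duplicate batch: skip the load and route the file to the error folder.
        target_dir = Path(error_dir)
    else:
        # New batch: insert all rows in one transaction, then archive the file.
        with conn:  # commits on success, rolls back if an insert fails
            conn.executemany(
                "INSERT INTO batch_data (batch_id, payload) VALUES (?, ?)",
                [(batch_id, payload) for payload in rows],
            )
        target_dir = Path(archive_dir)

    target_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.move(str(file_path), str(target_dir / file_path.name)))
```

Note that this sketch checks the batch ID up front rather than counting inserted records afterwards, so a new batch with no rows would still be archived. If you want the record-count behaviour described above, you would add a check on the number of rows before choosing the folder.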