Need to execute a job based on a file, if file exists then incremental load or else if it is new then full load

Five Stars


I have a scenario where a hash key is generated for each record in a table. All the hash keys are stored in one delimited file and the data in another. When new data is added to the same table, hash keys are generated for the new data and must be checked against the existing hash key file. Before that, I need to check whether the hash key file from the earlier run exists. I tried the tFileExist component, but it is not working properly. How can I add a condition that checks whether the file exists?

 

 

 

Twelve Stars

Re: Need to execute a job based on a file, if file exists then incremental load or else if it is new then full load

We cannot do the work for you, but we can help.
If you are very new to Talend, start with some training to understand what an ETL is and how Talend works.
You have to know how to use the tool before making a plan.
Take the time to learn; it's the best way.

Francois Denis

Tag as "solved" for others! Kudos to thanks!

Five Stars

Re: Need to execute a job based on a file, if file exists then incremental load or else if it is new then full load

Nobody needs to do my work for me. Many people have been kind enough to reply to the queries posted by other users. I am new to the Talend community and thought I might get a proper solution for my issue, so I explained the complete scenario for better understanding.

Employee

Re: Need to execute a job based on a file, if file exists then incremental load or else if it is new then full load

@harikamoole 

 

You can verify whether a file is present using the tFileExist component. Could you please refer to the sample scenario at the link below?

 

https://help.talend.com/reader/wDRBNUuxk629sNcI0dNYaA/g3tWtZMSyBup4eLk5HdOAg

 

In the example, the job checks whether the file exists and, if it does not, shows an error in a message box. You can use a similar condition to add another branch: if the file is present, route the flow to your existing flow.

 

Now there is another case, where you may have to check multiple hash files in one go. For that, keep the list of hash file names in another file or a DB table, then convert the flow to an iteration using tFlowToIterate. That way you can run the same process for multiple files.
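As a rough illustration of the check described above, here is a minimal plain-Java sketch, not a Talend job. It assumes each hash file is named after its table with a `.csv` extension, as in this thread; the class name `LoadModePlanner`, its methods, and the directory argument are hypothetical names chosen for the example.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: decides, per table, whether a full or incremental load is needed.
class LoadModePlanner {
    enum LoadMode { FULL, INCREMENTAL }

    // Assumption: the hash file carries the table name, e.g. abc -> abc.csv.
    static Path hashFileFor(Path hashDir, String tableName) {
        return hashDir.resolve(tableName + ".csv");
    }

    // An existing hash file means the table was processed before.
    static LoadMode modeFor(Path hashDir, String tableName) {
        return Files.isRegularFile(hashFileFor(hashDir, tableName))
                ? LoadMode.INCREMENTAL
                : LoadMode.FULL;
    }

    // Mirrors the "iterate over a list of names" idea: one decision per table.
    static Map<String, LoadMode> plan(Path hashDir, List<String> tableNames) {
        Map<String, LoadMode> plan = new LinkedHashMap<>();
        for (String tableName : tableNames) {
            plan.put(tableName, modeFor(hashDir, tableName));
        }
        return plan;
    }
}
```

In a Talend job, the same idea is typically expressed by building the tFileExist file name from the iterated table name and branching with Run if triggers on the component's EXISTS global variable, for example `((Boolean)globalMap.get("tFileExist_1_EXISTS"))` for a component labelled tFileExist_1. The exact variable names depend on your own component labels.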

 

Try it out, and if you get stuck anywhere, share screenshots showing the full job flow and component details.

 

Warm Regards,
Nikhil Thampi

Please appreciate our Talend community members by giving Kudos for sharing their time for your query. If your query is answered, please mark the topic as resolved :-)

 

Five Stars

Re: Need to execute a job based on a file, if file exists then incremental load or else if it is new then full load

@nikhilthampi 

 

Thanks for your reply,

 

I have a problem. If a table is new, the hash files are generated fine. But suppose the table has already been executed once and its hash file is stored in the directory. If new rows are then added to the same table and the job is executed again, I first need to check whether a hash file has already been generated for that table in the directory (the hash file has the same name as the table), so that hash keys are generated only for the new rows. The problem is that the file check must be dynamic, and after finding the file I should perform an incremental load. If the file exists, the flow must connect to one tMap; otherwise, to another. I am unable to accomplish this. It would be a great help if you could suggest any other component for this issue.

Employee

Re: Need to execute a job based on a file, if file exists then incremental load or else if it is new then full load

Hi,

 

Could you please try the tFileExist component as mentioned in the previous post? I am not able to see this component anywhere in your job flow.

 

Warm Regards,
Nikhil Thampi


Five Stars

Re: Need to execute a job based on a file, if file exists then incremental load or else if it is new then full load

@nikhilthampi 

 

Suppose I am getting table abc from tDBInput. Using the job, I created two delimited files: abc_data.csv and abc.csv (the file that has the hash key for each row). Now I have added a few more rows to the existing abc table and am executing the job again. Before execution, I need to check whether a hash file already exists for that table (here, it does). If so, that hash file is given as a lookup to tMap, and only the new data is written separately to abc_new.csv (which contains only the updated data). I need to check whether the table name coming from the database already has a delimited file (with the same name as the table) in the local directory. If I use the tFileExist component, I can only check a static file name. The main problem is that I do not understand how to check whether the incoming table has already been executed once. I thought tFileExist might work, but it is not working.

Employee

Re: Need to execute a job based on a file, if file exists then incremental load or else if it is new then full load

Hi,

 

First of all, why are you basing the whole execution check on a file? It is a bit risky, since the file may get corrupted or disappear at some point. If you want to track whether a flow to the DB has been executed before, add the execution details to a control table. Next time, check the control table to see whether the process has already run. I believe that is a far better way than checking a file.
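To make the control-table idea concrete, here is a small Java sketch. It is only an illustration: the in-memory map stands in for a real database table, and the class name `LoadControl` and the assumed columns (table name, last run time) are hypothetical, not something defined in this thread.

```java
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for a control table with columns (table_name, last_run_at).
class LoadControl {
    private final Map<String, Instant> lastRunByTable = new HashMap<>();

    // Looks up the last recorded run, if any, for the given source table.
    Optional<Instant> lastRun(String tableName) {
        return Optional.ofNullable(lastRunByTable.get(tableName));
    }

    // No recorded run means the table has never been loaded: do a full load.
    boolean needsFullLoad(String tableName) {
        return !lastRun(tableName).isPresent();
    }

    // Call this only after the load has finished successfully.
    void recordRun(String tableName, Instant finishedAt) {
        lastRunByTable.put(tableName, finishedAt);
    }
}
```

In a real job the lookup and the insert would be queries against the control table, so the record survives even if a file on disk is lost.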

 

Warm Regards,
Nikhil Thampi


Two Stars

Re: Need to execute a job based on a file, if file exists then incremental load or else if it is new then full load

I have the same scenario. I need to compare the hash keys of the previous table run (hashfile.csv) with the hash keys of the updated table (generated in tMap). Newly added rows have hash keys that are not present in hashfile.csv, so those rejected rows are captured in a separate CSV file (newdata.csv), and the hash keys of these new rows are appended to hashfile.csv.

 

Here is the problem: if the table is being executed for the first time, there is no hash file. The job should therefore check whether the hash file exists: if it does not, run the full load job; if it does, run the incremental load (as I said before).
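One way to look at the scenario above is that a missing hash file can be treated as an empty set of known keys, so the first run and later runs share one code path. The plain-Java sketch below illustrates this under stated assumptions: the thread does not say which hash function is used, so SHA-256 is an arbitrary choice, the hash file is assumed to hold one key per line, and the class name `IncrementalByHash` is hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class IncrementalByHash {
    // Hex-encoded SHA-256 of one delimited row (assumed hash; the thread names none).
    static String hashKey(String row) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(row.getBytes(StandardCharsets.UTF_8))) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 ships with every Java platform
        }
    }

    // A missing hash file yields an empty set, which turns the first run into a full load.
    static Set<String> loadKnownKeys(Path hashFile) throws IOException {
        if (!Files.isRegularFile(hashFile)) {
            return new HashSet<>();
        }
        return new HashSet<>(Files.readAllLines(hashFile, StandardCharsets.UTF_8));
    }

    // Returns the rows whose key is not yet known and appends those keys to the hash file.
    static List<String> newRows(List<String> rows, Path hashFile) throws IOException {
        Set<String> known = loadKnownKeys(hashFile);
        List<String> fresh = new ArrayList<>();
        List<String> freshKeys = new ArrayList<>();
        for (String row : rows) {
            String key = hashKey(row);
            if (known.add(key)) { // false for keys seen before, including duplicates in this batch
                fresh.add(row);
                freshKeys.add(key);
            }
        }
        Files.write(hashFile, freshKeys, StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        return fresh;
    }
}
```

With this shape, the rows returned on the first call are the whole table, and later calls return only rows that were added or changed since the keys were last appended.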

 

I am unable to connect tFileInputDelimited to tMap as a lookup. Please guide me with suggestions; I am new to Talend.

Employee

Re: Need to execute a job based on a file, if file exists then incremental load or else if it is new then full load

Hi @supraja_sdk

 

In your scenario, you are reading multiple files in the lookup section in an iterative fashion. I would suggest consolidating the lookup data into one file (in an earlier subjob) and then matching against the main flow, using that file as the lookup. Please note that the file will grow over time, so make sure you store only the essential columns and allocate enough memory to the job.
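The consolidation step could look roughly like the plain-Java sketch below. It assumes the hash key is the first column of each lookup file and that the consolidated file is written outside the lookup directory; the class name `HashFileConsolidator` and its parameters are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Pattern;

class HashFileConsolidator {
    // Merges every *.csv in hashDir into one file that holds only the distinct hash keys.
    // Returns the number of distinct keys written.
    static int consolidate(Path hashDir, Path target, String delimiter) throws IOException {
        Set<String> keys = new LinkedHashSet<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(hashDir, "*.csv")) {
            for (Path file : files) {
                for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
                    if (line.isEmpty()) {
                        continue; // skip blank lines
                    }
                    // Keep only the first column (assumed to be the hash key) to limit memory use.
                    keys.add(line.split(Pattern.quote(delimiter), 2)[0]);
                }
            }
        }
        // Assumption: target lies outside hashDir, so it is not picked up on the next run.
        Files.write(target, keys, StandardCharsets.UTF_8);
        return keys.size();
    }
}
```

Storing only the key column keeps the consolidated file small, which matters because the lookup is held in memory during matching.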

 

Warm Regards,
Nikhil Thampi


 
