Five Stars

record level recovery

Hi Team,

I am trying to load data from a tFileInputDelimited component to an Oracle database. Imagine there are 5,000 records in my source, and a network problem (or something similar) occurs after 1,500 records have been loaded. When the problem is resolved, the job should resume from the 1,501st record rather than starting from the first. How can I achieve this in Talend Data Fabric 6.2?

 

Thanks

Karthik

3 REPLIES
Employee

Re: record level recovery

You will need to keep a counter or record number somewhere, either in a config file or in a staging table.

You can load the whole file, add a counter to each record, and then process.

 

This also assumes you are using auto commit. If you are using batch commit, then you also need to keep track of the batch number, because a batch can fail for other reasons too; for example, one record with an issue will fail the whole batch.
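The batch-commit variant above can be sketched as plain Java. This is a hypothetical illustration, not Talend component code: the checkpoint file, `insertBatch` stand-in, and all method names are assumptions, but the resume logic (skip already-committed batches, checkpoint only after each commit) is the technique described.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical sketch: keep the last committed batch number in a config
// file so a rerun resumes at the next batch boundary. With batch commit,
// a failed batch is rolled back as a whole, so the rerun must restart at
// the start of the failed batch, not mid-batch.
public class BatchCheckpoint {

    static int lastCommittedBatch(Path file) throws IOException {
        if (!Files.exists(file)) return 0;           // first run: nothing committed yet
        return Integer.parseInt(Files.readString(file).trim());
    }

    // Processes `records` in batches of `batchSize`, skipping batches that
    // were already committed. Returns how many records this run processed.
    static int load(List<String> records, int batchSize, Path file) throws IOException {
        int batch = lastCommittedBatch(file);
        int processed = 0;
        for (int start = batch * batchSize; start < records.size(); start += batchSize) {
            int end = Math.min(start + batchSize, records.size());
            // insertBatch(records.subList(start, end)); // stand-in for the DB write + commit
            processed += end - start;
            batch++;
            Files.writeString(file, Integer.toString(batch)); // checkpoint only after commit
        }
        return processed;
    }
}
```

If the job dies mid-run, the file holds the number of the last batch that actually committed, so the next run picks up from the first uncommitted batch.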

Twelve Stars

Re: record level recovery

There are several ways to approach this problem. Below are some high-level methods you might want to consider.

 

These are based on a model of a source and target table. Consider "table" to refer to any source or target data repository (file, db, etc).

 

1) Before you start your load, query your target to find a key record that tells you where your last run ended. Use that value to filter your source. If reading from a file, you may want to use a tMap to filter out records already processed.

2) At the end of your job, log (using a tPostJob component) how far the load got to a file (assuming that writing to a file is guaranteed even when your database access is not). The tPostJob runs regardless of how the job ends, so you can log a result even on failure.

3) A less efficient way would be to use an "insert if not exist" action, but this requires a primary key or composite key.
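Method 1 can be sketched as follows. This is a minimal illustration under assumptions, not the actual Talend job: `Row`, `lastLoadedKey`, and `remaining` are hypothetical names, the target list stands in for a "SELECT MAX(key)" query, and the stream filter plays the role of the tMap filter.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch of method 1: find the highest key already loaded
// into the target, then filter the source so only newer records remain.
public class ResumeFilter {

    record Row(long key, String payload) {}

    // Stand-in for "SELECT MAX(key) FROM target"; 0 when the target is empty.
    static long lastLoadedKey(List<Row> target) {
        return target.stream().mapToLong(Row::key).max().orElse(0L);
    }

    static List<Row> remaining(List<Row> source, List<Row> target) {
        long last = lastLoadedKey(target);
        return source.stream()
                     .filter(r -> r.key() > last)   // skip records already processed
                     .collect(Collectors.toList());
    }
}
```

Note this relies on the key being monotonically increasing in load order; if it is not, method 3 (insert if not exists) is the safer, if slower, choice.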

 

There are other ways, but the above should steer you towards a suitable method for your data.

Rilhia Solutions
Five Stars

Re: record level recovery

Yes, thanks for the replies. I found one more solution:

 

Step 1: Store the number of lines processed in a context variable and write that variable to a file (I am calling this the error count file).

Step 2: When you run the job for the first time there won't be an error file, so use a tFileExist component to check whether the error file exists.

Step 3: If there is no error file, set the context variable to 0; otherwise assign the file's value to the context variable.

Step 4: Use the context variable in the Header field of the Component tab of the tFileInputDelimited component.

Step 5: If the number of input rows equals the number of output rows, reset the context variable to 0.
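The five steps above can be sketched in plain Java. This is a hypothetical illustration, not Talend-generated code: the error count file stands in for the context variable, `headerFromErrorFile` for the tFileExist check, and `run` for the job body; all names are assumptions.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical sketch of the five-step approach: the error count file's
// value becomes the Header (rows to skip) on the next run, and it is
// reset to 0 once all rows have been processed.
public class HeaderResume {

    // Steps 2-3: the tFileExist check; header is 0 when no error file exists yet.
    static int headerFromErrorFile(Path errorFile) throws IOException {
        if (!Files.exists(errorFile)) return 0;
        return Integer.parseInt(Files.readString(errorFile).trim());
    }

    // Steps 1, 4, 5: skip `header` rows, record progress after each row,
    // and reset the file to 0 once input and output row counts match.
    static int run(List<String> rows, Path errorFile) throws IOException {
        int header = headerFromErrorFile(errorFile);
        int processed = 0;
        for (int i = header; i < rows.size(); i++) {
            // insertIntoTarget(rows.get(i));  // stand-in for the Oracle insert
            processed++;
            Files.writeString(errorFile, Integer.toString(i + 1)); // step 1
        }
        if (header + processed == rows.size()) {                   // step 5
            Files.writeString(errorFile, "0");
        }
        return processed;
    }
}
```

A rerun after a failure at row 1,500 would read 1500 from the error file, skip that many rows via the header, and continue from row 1,501, which is exactly the behaviour asked for in the original question.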