I came across a scenario where source data is being loaded into a target.
For example, the total record count is 100, but when I run the job, an error causes only the first 50 records to load and the remaining records to be discarded. On the next run of the job, I would like to load records 51 through 100 into the target. How should I proceed, and what steps should I follow?
Note: the source and target could be anything, e.g. a DB or a flat file. If it is DB to DB, does the process change? Likewise, does the process change if it is file to file?
Essentially, you need to keep a log somewhere of exactly how far the load got. I use a DB table for this, but a flat file would work as well. During processing, keep track of the last record that was loaded successfully. Add a tPostJob (it runs at the end of the job every time, even after an error) and write that position to the logging table there. When the job starts, read from this table to work out where you need to resume from.
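The checkpoint idea above can be sketched outside of Talend as well. This is a minimal, hedged illustration assuming records carry a monotonically increasing id and the log is a small flat file (`load_checkpoint.txt` is a made-up name); in a real Talend job, the tPostJob subjob would perform the equivalent write:

```python
import os

CHECKPOINT_FILE = "load_checkpoint.txt"  # hypothetical log location

def read_checkpoint():
    """Return the id of the last successfully loaded record (0 if none)."""
    if not os.path.exists(CHECKPOINT_FILE):
        return 0
    with open(CHECKPOINT_FILE) as f:
        return int(f.read().strip() or 0)

def write_checkpoint(last_id):
    """Persist the last successfully loaded record id."""
    with open(CHECKPOINT_FILE, "w") as f:
        f.write(str(last_id))

def load(records, insert):
    """Load (id, payload) records via insert(), skipping anything at or
    before the checkpoint. The finally block mirrors tPostJob: the
    checkpoint is written even when an insert raises an error."""
    last_ok = read_checkpoint()
    try:
        for rec_id, payload in records:
            if rec_id <= last_ok:
                continue  # already loaded in a previous run
            insert(payload)
            last_ok = rec_id
    finally:
        write_checkpoint(last_ok)
```

With this pattern, a run that dies after record 50 leaves `50` in the checkpoint, and the next run silently skips records 1-50 and carries on from 51.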
This is a very basic example, and the different situations you describe will need to be implemented in different ways. For example, with a DB you can add the last loaded row id to your query's WHERE clause; with a flat file you will need to filter out the already-loaded rows using a tMap.
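The two resume strategies differ in where the filtering happens. A hedged sketch of both, assuming rows have an increasing `id` column (the table name, delimiter, and file layout are illustrative, not from the original post):

```python
import sqlite3

def resume_rows_db(conn, last_id):
    """DB source: push the filter into the query's WHERE clause
    (in Talend, edit the tDBInput query), so already-loaded rows
    never leave the source database."""
    return conn.execute(
        "SELECT id, payload FROM src WHERE id > ? ORDER BY id",
        (last_id,),
    ).fetchall()

def resume_rows_file(path, last_id):
    """Flat-file source: every row must still be read, then filtered
    client-side (in Talend, a tMap or tFilterRow after the file input)."""
    out = []
    with open(path) as f:
        for line in f:
            rec_id, payload = line.rstrip("\n").split(";", 1)
            if int(rec_id) > last_id:
                out.append((int(rec_id), payload))
    return out
```

The DB variant is cheaper for large sources, since skipped rows are never transferred; the file variant has to scan the whole file and discard rows in the job itself.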