Is it possible to capture the number of inserted records in a Big Data standard job in Hive for auditing? My job design is a standard job: tHiveInput -> tMap -> tHDFSOutput -> tHiveLoad.
A tFlowMeter component would capture the number of records inserted, but I would need to edit my jobs, and with more than 100 jobs that would be difficult. What is the best approach to capture the job start time, end time, number of records read, and number of records inserted into a Hive table in a Talend Big Data standard job?
I'm using Talend Data Fabric 6.3 Enterprise Edition.
Turn on Stats & Logs at the project level. You need to be careful, though, about whether you use DB tables or files. Depending on whether you are doing pure ETL, M/R, or Spark, you will need to adopt different techniques.
As I'm doing a pure ETL process, I need to capture this information at the job level rather than changing the project-level settings.
In a standard Big Data job, I can't find an option to capture the number of inserted records in the tHiveLoad component.
I'm able to capture the job end time with tLogCatcher and tStatCatcher, but I need to capture the job start time too.
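One possible job-level approach, sketched here in plain Java so it can be run outside the Studio: record a timestamp at the start of the job, read the row counters that components publish after they finish, and write one audit line at the end. Everything named below is an assumption for illustration. The class AuditSketch and its methods are hypothetical, the map stands in for Talend's globalMap, and the keys follow the usual <component>_NB_LINE naming but should be checked in the Outline view of the actual job. Since tHiveLoad does not appear to expose a row counter, the sketch treats the tHDFSOutput count as the number of rows loaded, which only holds if the whole file is loaded.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.HashMap;
import java.util.Map;

public class AuditSketch {

    // Reads a row counter from a globalMap-style map; returns 0 when the key is absent.
    static int rowCount(Map<String, Object> globalMap, String key) {
        Object value = globalMap.get(key);
        return (value instanceof Integer) ? (Integer) value : 0;
    }

    // Builds one pipe-delimited audit line: job|start|end|rows read|rows written.
    static String auditLine(String jobName, long startMillis, long endMillis,
                            int rowsRead, int rowsWritten) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        return String.join("|",
                jobName,
                fmt.format(new Date(startMillis)),
                fmt.format(new Date(endMillis)),
                String.valueOf(rowsRead),
                String.valueOf(rowsWritten));
    }

    public static void main(String[] args) {
        // In a real job this timestamp would be taken at the very start,
        // for example in a tJava triggered from tPrejob (assumption).
        long start = System.currentTimeMillis();

        // Stand-in for Talend's globalMap; the keys and values are illustrative only.
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tHiveInput_1_NB_LINE", 1200);
        globalMap.put("tHDFSOutput_1_NB_LINE", 1200);

        long end = System.currentTimeMillis();

        int rowsRead = rowCount(globalMap, "tHiveInput_1_NB_LINE");
        int rowsWritten = rowCount(globalMap, "tHDFSOutput_1_NB_LINE");

        // The audit line could be appended to a file or inserted into an audit table.
        System.out.println(auditLine("demo_job", start, end, rowsRead, rowsWritten));
    }
}
```

With more than 100 jobs, the same logic could live in a shared routine or a small child job so that each job only needs one extra call at the end, but that still means touching every job once.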