Is it possible to capture the number of records inserted into Hive by a Big Data Standard Job, for audit purposes? My job is a Standard Job designed as tHiveInput -> tMap -> tHDFSOutput -> tHiveLoad.
A tFlowMeter component would capture the inserted record count, but it means editing every job, and with more than 100 jobs that is impractical. What is the best approach to capture the job start time, end time, number of records read, and number of records inserted into a Hive table in a Talend Big Data Standard Job?
I'm using Talend Data Fabric 6.3 Enterprise Edition.
Turn on Stats & Logs at the project level. Be careful, though, about whether you write them to DB tables or to files. Depending on whether you are doing pure ETL, M/R, or Spark, you will need to adopt different techniques.
Since I'm doing a pure ETL process, I need to capture this information at the job level rather than changing the project-level settings.
In a Big Data Standard Job, I can't find an option to capture the number of records inserted by the tHiveLoad component.
I am able to capture the job end time with tLogCatcher and tStatCatcher, but I need to capture the job start time too.
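One possible job-level approach, sketched here in plain Java so it can be run outside the Studio: record the start time in globalMap from a tJava at the beginning of the job, then build an audit line at the end from the components' row counters. This is only a sketch under assumptions. The key name audit_job_start_ms and the class and method names are made up for illustration. The component ids tHiveInput_1 and tHDFSOutput_1 are examples, and it assumes those components publish a <component>_NB_LINE value in globalMap. Since tHiveLoad does not appear to expose a row count, the tHDFSOutput count is used as a stand-in for the rows loaded, which may not match in every design.

```java
import java.util.HashMap;
import java.util.Map;

public class AuditSketch {
    // Hypothetical key chosen for this example; not a built-in Talend variable.
    public static final String START_KEY = "audit_job_start_ms";

    // Would run in a tJava at the very start of the job.
    public static void markStart(Map<String, Object> globalMap, long nowMillis) {
        globalMap.put(START_KEY, nowMillis);
    }

    // Reads "<componentId>_NB_LINE"; returns -1 when the counter is absent.
    public static int linesOf(Map<String, Object> globalMap, String componentId) {
        Object value = globalMap.get(componentId + "_NB_LINE");
        return (value instanceof Integer) ? (Integer) value : -1;
    }

    // Would run in a tJava at the end of the job (for example on an OnSubjobOk trigger).
    // Output format: job|startMillis|endMillis|rowsRead|rowsWritten
    public static String auditLine(Map<String, Object> globalMap, String jobName, long endMillis) {
        long start = (Long) globalMap.get(START_KEY);
        return jobName + "|" + start + "|" + endMillis
                + "|" + linesOf(globalMap, "tHiveInput_1")
                + "|" + linesOf(globalMap, "tHDFSOutput_1");
    }

    public static void main(String[] args) {
        // A HashMap stands in for the globalMap that Talend provides inside a job.
        Map<String, Object> globalMap = new HashMap<>();
        markStart(globalMap, System.currentTimeMillis());

        // Simulated counters; in a real job the components would set these.
        globalMap.put("tHiveInput_1_NB_LINE", 1200);
        globalMap.put("tHDFSOutput_1_NB_LINE", 1200);

        System.out.println(auditLine(globalMap, "demo_job", System.currentTimeMillis()));
    }
}
```

Inside a real job only the bodies of markStart and auditLine would be pasted into tJava components, and the resulting line could be written to an audit table or file. This still means touching each job once, so with 100+ jobs it may be worth putting the two tJava steps into a reusable joblet.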