Hello, what is the typical practice for notifying people of job failures? Is it to include a tLogCatcher linked to a tSendMail in every job? Or to link a tSendMail to every component? Or can it be set up in the execution plan? Or are there any other ways to do it? Thanks.
It depends on what somebody has to do when a job fails. Sending an email is a good idea. It could also be a good idea to start another job after the failed one, to restore the previous state or to prevent further processing. The most useful piece of information passed between jobs is the return code (also called exit code or error code; they all mean the same thing). A return code of 0 means everything is fine; any other value means something went wrong or is not fully OK. I would suggest differentiating between:
* problems which can be solved by a rerun of the job (e.g. a resource the job needs, such as a database or file server, is not ready),
* problems which are caused by wrong input data, and
* problems which are caused by a programming error in the job.
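As a rough sketch of how the three categories could be told apart from the return code alone: the wrapper below runs a job command and reruns it only when the failure looks retryable. The specific codes (10 and 20) are a hypothetical convention that each job would have to follow itself; Talend does not define them. The names `Failure`, `classify` and `run_job` are likewise made up for illustration.

```python
import subprocess
from enum import Enum


class Failure(Enum):
    NONE = "ok"
    RETRYABLE = "resource not ready, a rerun may help"
    BAD_INPUT = "wrong input data"
    JOB_BUG = "programming error in the job"


# Hypothetical convention: these codes are NOT defined by Talend.
# Each job would have to exit with them deliberately.
EXIT_CODE_MAP = {
    0: Failure.NONE,
    10: Failure.RETRYABLE,
    20: Failure.BAD_INPUT,
}


def classify(return_code: int) -> Failure:
    """Map a return code to a category; unknown codes count as job bugs."""
    return EXIT_CODE_MAP.get(return_code, Failure.JOB_BUG)


def run_job(command, max_attempts=3):
    """Run the command, rerunning only on retryable failures.

    Returns a (category, attempts_used) tuple.
    """
    for attempt in range(1, max_attempts + 1):
        category = classify(subprocess.run(command).returncode)
        if category is not Failure.RETRYABLE:
            return category, attempt
    return Failure.RETRYABLE, max_attempts
```

The caller can then decide per category who gets notified: for example nobody for a rerun that succeeded, the data owner for `BAD_INPUT`, and the developers for `JOB_BUG`.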
That's a good suggestion from Jlolling to categorize errors. But I am wondering about the industry standards followed to notify the production support team of job failures. Are the TAC-generated notifications as informative as a job with tLogCatcher and tSendMail components? Are jobs with tLogCatcher more resource intensive?
You could also try the tSNMP component, which is unfortunately outdated. Reporting the reliability of software components via SNMP to monitoring tools like Nagios or Icinga is a common approach. You could also send emails to a trouble ticket system like OTRS. In this case you get a ticket and you can define who will be notified next.
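To illustrate the ticket-system route, here is a minimal sketch that builds a failure email using Python's standard library. The addresses, the subject format and the function name `build_ticket_mail` are all assumptions for the example; a real ticket system such as OTRS would be configured with its own inbound address and filtering rules.

```python
from email.message import EmailMessage


def build_ticket_mail(job_name, return_code, details,
                      sender="etl@example.com",      # hypothetical sender
                      queue="tickets@example.com"):  # hypothetical ticket inbox
    """Build an email that a ticket system could turn into a ticket."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = queue
    # A fixed subject pattern makes it easier to route or filter tickets.
    msg["Subject"] = f"[ETL] Job {job_name} failed with return code {return_code}"
    msg.set_content(details)
    return msg
```

The message could then be handed to `smtplib.SMTP(...).send_message(msg)` against whatever mail server is available in your environment.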