The Job in Execution Plan is stuck in Running state after TAC loses the DB connection

Scenario

  • Create Task1 and Task2 and create an Execution Plan: Task1 > onOk > Task2.

  • Run the Execution Plan: observe that the status of Task1 is Running.

  • Before Task1 ends, stop and restart the TAC DB.

  • TAC briefly loses the DB connection, but recovers the DB connection once the DB is back.

  • Check the execution logs on the JobServer side, and notice that Task1 has already ended.

  • In TAC, however, the Task1 status is still Running, and is stuck in this status.

  • Check the Execution Plan, and notice the status is still Running, and is stuck in this status.

You need TAC to seamlessly sync the Task's status with the one from the JobServer, and any Execution Plan needs to continue as expected.
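
The expected behavior can be illustrated with a minimal sketch. This is not TAC's implementation and does not use any Talend API; the status names, the `reconcile` function, and the dictionaries are hypothetical and serve only to show what "sync the status and continue the plan" means once the DB connection is back: every task the scheduler still believes is Running is checked against what the executor reports, and the onOk successor is triggered when the task actually ended successfully.

```python
from typing import Dict, List

# Hypothetical status values, chosen for illustration only.
RUNNING = "Running"
OK = "Ok"
ERROR = "Error"


def reconcile(
    scheduler_status: Dict[str, str],
    executor_status: Dict[str, str],
    on_ok: Dict[str, str],
) -> List[str]:
    """Copy terminal statuses from the executor into the scheduler's view.

    Returns the tasks that should be triggered next (onOk successors of
    tasks that turned out to have ended successfully).
    """
    to_trigger: List[str] = []
    for task, status in scheduler_status.items():
        actual = executor_status.get(task)
        # Only repair tasks the scheduler still believes are running
        # but that the executor reports as finished.
        if status == RUNNING and actual in (OK, ERROR):
            scheduler_status[task] = actual
            if actual == OK and task in on_ok:
                to_trigger.append(on_ok[task])
    return to_trigger


# The scenario above: Task1 ended on the executor while the DB was down.
scheduler = {"Task1": RUNNING, "Task2": "Ready"}
executor = {"Task1": OK}
plan = {"Task1": "Task2"}  # Task1 > onOk > Task2

next_tasks = reconcile(scheduler, executor, plan)
```

In this sketch, `next_tasks` contains Task2 and the scheduler's view of Task1 is no longer Running, which is the outcome the scenario calls for.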

 

Solution

For Talend 6.4.1, the solution is to apply Patch_20171124_TPS-2253_v1-6.4.1.zip.

  1. Contact Talend Support to request patch Patch_20171124_TPS-2253_v1-6.4.1.zip.

  2. Follow the steps in the Readme file (included in the patch zip file) to apply the patch.

Note: You may need to clear your browser cache for the patch to take effect.

Version history
Revision #: 7 of 7
Last update: 06-13-2019 10:06 PM
Comments
Six Stars

I'm experiencing the same issue. We've had network problems - not Talend related - where we lose connectivity to the TAC database for a brief period. The result is exactly what you described - Plans hanging in a "Running" status, when in fact they have completed; but the next trigger never fires because Talend believes it's still running. We've also seen this on individual jobs (Tasks) that happen to be executing when the connection is lost - and they get stuck as well. 

 

The "cause" of the issue is not Talend related - However, the fact it can't recover is a problem. I've been able to do a work-around I saw posted here suggesting you go into the TAC database and manually reset the status - which works - but I HATE modifying a vendor database. 

 

I'll request this patch and hopefully it resolves the issue - because it's a pain to deal with when we have jobs running every 60 seconds in some cases and they all get hung-up and have to be re-set.

 

Really glad I found this - Thanks! 

Four Stars

Hi Talend Support,

 

I'm having the same issue, but I'm using Talend 6.5.1. Do you have a patch for version 6.5.1? The status in TAC stays at "Running", but the job had completed, as I verified from the data in the database.

 

Thanks

Six Stars

FYI - I've applied the patch - and we still have issues when we lose connectivity to the database. Jobs hang in varying status states: "Running"; "Requesting Run..."; etc. It's very painful as our inventory of jobs increases, and the only solution we have is to open the database and manually reset values or delete fired triggers that appear to be hung up. What we need is a graceful way to "force" a reset of a job so we don't have to make database changes to get a job running again. We have critical jobs running every 60 seconds now...and database drops cause major problems. I'm aware the issue is caused on our end by the network issues and database connectivity - but a graceful way for Talend to allow an Administrator to force a reset of jobs is needed - IMO.

 

 

Moderator

Hello @JBristow 

Could you please clarify which Talend version/edition you are on? Have you already created a case on the Talend Support portal? If so, please give us your support case ID so that we can get more information from the support team about your job's background and address your issue.

Best regards

Sabrina

Four Stars
Hi, I'm having the same problem with version 7.1.1. The execution plan started and was marked "Running", but the job did not start. This happens more than twice a week; sometimes all the jobs are done but the execution plan stays "Running". Another problem is that I can't get the execution plan working after the bug occurs, even if I stop it. When I restart it, it says another instance is running...
Moderator

Hello @Grubshka ,

Is there any error message in your TAC log? Which browser are you using? More information will help us address your issue quickly.

Best regards

Sabrina