Hoping for some insight into my latest issue. I logged on to PROD to deploy a break-fix for one of my client's projects and noticed that one of the Execution tasks was stuck on "Requesting run..." and had last run 5 days ago. This particular task is based on a file trigger and normally gets executed roughly 50-75 times per day. I downloaded the log and found the following error:
2017-09-14 12:08:57 ERROR ErrorLogger - An error occured while scanning for the next trigger to fire.
org.quartz.JobPersistenceException: Couldn't acquire next trigger: Couldn't retrieve trigger: Transaction (Process ID 80) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction. [See nested exception: org.quartz.JobPersistenceException: Couldn't retrieve trigger: Transaction (Process ID 80) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction. [See nested exception: java.sql.SQLException: Transaction (Process ID 80) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.]]
    at org.quartz.impl.jdbcjobstore.JobStoreSupport.acquireNextTrigger(JobStoreSupport.java:2785)
    at org.quartz.impl.jdbcjobstore.JobStoreSupport$36.execute(JobStoreSupport.java:2728)
    at org.quartz.impl.jdbcjobstore.JobStoreSupport.executeInNonManagedTXLock(JobStoreSupport.java:3742)
    at org.quartz.impl.jdbcjobstore.JobStoreSupport.acquireNextTrigger(JobStoreSupport.java:2724)
    at org.quartz.core.QuartzSchedulerThread.run(QuartzSchedulerThread.java:263)
So, obviously something happened on the DB side (SQL Server) to cause a lock or timeout, but there seems to be no way to recover this task. I can't kill it, because trying to do so immediately throws this error: org.talend.exception.BusinessException: executionTask.locked2.
Ultimately, I had to restart the TAC, and everything is back up and running as it should. I would expect occasional issues between the TAC and SQL Server, but I would also expect some sort of built-in recovery mechanism from Talend. If a third-party library throws an error, handle it appropriately. Can anyone offer some insight?
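To make "handle it appropriately" concrete, here is a minimal sketch of the kind of recovery I have in mind. It is not Talend's or Quartz's actual code: DeadlockRetry, runWithRetry and isDeadlockVictim are hypothetical names I made up for illustration. The only facts it relies on are that SQL Server reports "chosen as the deadlock victim" as error 1205 and that the message itself says to rerun the transaction. It also assumes the wrapped java.sql.SQLException is reachable through the exception's cause chain, which I have not verified for the Quartz version the TAC ships.

```java
import java.sql.SQLException;
import java.util.concurrent.Callable;

// Hypothetical helper for illustration only; not part of Talend or Quartz.
final class DeadlockRetry {

    // SQL Server error number for "chosen as the deadlock victim".
    static final int SQLSERVER_DEADLOCK_VICTIM = 1205;

    private DeadlockRetry() {}

    // Walks the cause chain looking for a SQLException carrying error 1205.
    // Assumption: the wrapping exceptions expose the SQLException via getCause().
    static boolean isDeadlockVictim(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof SQLException
                    && ((SQLException) c).getErrorCode() == SQLSERVER_DEADLOCK_VICTIM) {
                return true;
            }
        }
        return false;
    }

    // Runs the work, retrying only when it failed as a deadlock victim.
    // Any other failure, or a deadlock on the last attempt, is rethrown.
    static <T> T runWithRetry(Callable<T> work, int maxAttempts, long baseDelayMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                return work.call();
            } catch (Exception e) {
                if (!isDeadlockVictim(e) || attempt >= maxAttempts) {
                    throw e;
                }
                // Simple linear backoff so the competing transaction can finish.
                Thread.sleep(baseDelayMillis * attempt);
            }
        }
    }
}
```

A caller would wrap the failing database step, for example DeadlockRetry.runWithRetry(() -> loadNextTrigger(), 3, 200L), where loadNextTrigger() is just a placeholder for whatever call hit the deadlock. The point is only that a deadlock victim is a retryable condition, so the task should not end up permanently stuck on it.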
Could you please tell us which build version you got this issue on?
Here is a Jira issue: https://jira.talendforge.org/browse/TMC-1019
We are also having the same problem in our environment. The task shows no progress for about 50 minutes; after that it runs, and sometimes it fails.
Is it a random issue on your side? Are there any more error messages in the TAC log?