After starting JobServer as a service on a Linux machine, Jobs start failing randomly

Talend Version     6.2.1

Summary

After starting JobServer as a service on a Linux machine, Jobs start failing randomly.

Product Talend Data Integration
Component JobServer

Problem Description

When instability is observed, this error appears in the JobServer logs:

2017-07-27 11:05:00,636 WARN CommandServerSocket - Exception when trying to run job 20170726_212420_8i8ur (Talend_FileSystem_Monitor): Cannot run program "/opt/java/jre/bin/java" (in directory "/opt/Talend-6.2.1/jobserver/agent/./TalendJobServersFiles/repository/MONITORING_Talend_FileSystem_Monitor_20170726_212420_8i8ur/Talend_FileSystem_Monitor"): error=11, Resource temporarily unavailable

Caused by: java.io.IOException: error=11, Resource temporarily unavailable
    at java.lang.UNIXProcess.forkAndExec(Native Method)

Jobs fail randomly, with errors such as Unknown State, Connection to Server failed, and Running Error.

Problem Root Cause

The open-files limit configured for the server was too low, and raising it was recommended.
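
To see which limits actually apply to a running process, the kernel exposes them under /proc. The following is a minimal sketch for Linux; show_limits is a hypothetical helper name, not part of Talend or JobServer.

```shell
#!/bin/sh
# Print the open-file and process limits that apply to a running process.
# Argument: $1 = process ID (for example, the PID of the JobServer JVM)
show_limits() {
    grep -E '^Max (open files|processes)' "/proc/$1/limits"
}

# Example: show the limits of the current shell.
show_limits "$$"
```

When JobServer runs as a service, its PID can usually be read with systemctl show -p MainPID followed by the unit name, assuming the unit is installed under the name used in this article.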

The issue also started to appear after JobServer was started as a service using the command:

systemctl start talend-jobserver.service
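
Because error=11 (EAGAIN) from forkAndExec generally means a process or thread limit was reached, it can help to compare the service's current task count with its task limit. The following is a minimal sketch; tasks_headroom is a hypothetical helper name, and it assumes the unit name from the command above and a systemd version that reports the TasksCurrent and TasksMax properties.

```shell
#!/bin/sh
# Read "TasksCurrent=N" and "TasksMax=M" lines on stdin and print how many
# more tasks (processes and threads) the service may create.
# Prints "unlimited" when the limit is reported as the literal "infinity".
tasks_headroom() {
    awk -F= '
        $1 == "TasksCurrent" { cur = $2 }
        $1 == "TasksMax"     { max = $2 }
        END {
            if (max == "infinity") print "unlimited"
            else print max - cur
        }'
}

# Typical use on the JobServer host (unit name taken from the article):
#   systemctl show -p TasksCurrent -p TasksMax talend-jobserver.service | tasks_headroom
```

A small or zero result while Jobs are failing would be consistent with the task limit being the cause.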

Solution

Try one of the following changes to make the service stable:

  • Increase the open-files limit, then start JobServer using the script from the JOBSERVER/bin folder.
  • In /etc/systemd/system/Talend-JobServer.service, the entry TasksMax=512 contributes to the instability. Change it to TasksMax=infinity, then start JobServer as a service.
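
The TasksMax change in the second option can be sketched as follows. This is only an illustration: set_tasks_max is a hypothetical helper name, it assumes GNU sed (for the -i option) and an existing TasksMax= line in the unit file, and the unit name passed to systemctl must match the name actually installed on the machine.

```shell
#!/bin/sh
# Rewrite the TasksMax= entry in a systemd unit file.
# Arguments: $1 = path to the unit file, $2 = new value (for example, infinity)
set_tasks_max() {
    unit_file=$1
    new_value=$2
    # Keep a backup copy, then replace the existing TasksMax line in place.
    cp "$unit_file" "$unit_file.bak" &&
    sed -i "s/^TasksMax=.*/TasksMax=$new_value/" "$unit_file"
}

# Typical use, as root (unit path and service name taken from the article):
#   set_tasks_max /etc/systemd/system/Talend-JobServer.service infinity
#   systemctl daemon-reload    # make systemd re-read the edited unit file
#   systemctl restart talend-jobserver.service
#   systemctl show -p TasksMax talend-jobserver.service
```

The daemon-reload step matters because systemd does not pick up edits to a unit file until it is told to reload its configuration.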

Version history
Revision #: 13 of 13
Last update: 06-05-2018 01:52 AM