Four Stars

Unable to Run a Big Data Job on AWS EMR. Jar not sent on to the Remote Talend Server.

Hi All,

 

I am trying to run a Big Data Batch job using Talend Big Data Platform (enterprise edition).

In our project we are using the Amazon EMR Hadoop platform. I was able to set up the Hadoop cluster configuration in the Talend repository, and I can read all the tables in my Hive database. But when I try to run a job that loads data from one Hive table into another, I get a timeout. It seems that the Talend job server is not even receiving the request. I sat with a member of our AWS team and confirmed that all the ports Talend needs to interact with EMR are open.
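For anyone who wants to repeat the port check from the machine that runs Talend Studio, a minimal sketch is below. The host names are hypothetical placeholders, and the port numbers are assumptions based on common defaults (8020 for the HDFS NameNode, 8032 for the YARN ResourceManager, 10000 for HiveServer2, 8000/8001/8888 for the Talend JobServer); the real values may differ per cluster and installation. It only shows whether a TCP connection can be opened, not whether the service behind the port is healthy.

```python
import socket


def check_ports(host, ports, timeout=3.0):
    """Return {port: True/False} depending on whether a TCP connect succeeds."""
    results = {}
    for port in ports:
        try:
            # create_connection resolves the host and opens a TCP socket.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            # Covers refused connections, timeouts and DNS failures.
            results[port] = False
    return results


# Example usage (hypothetical hosts, assumed default ports):
#   check_ports("emr-master.example.internal", [8020, 8032, 10000])
#   check_ports("jobserver.example.internal", [8000, 8001, 8888])
```

If a port shows as closed from the Studio machine but open from inside the cluster, that would point to a security group or firewall rule rather than to the job itself.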

I even tried to run a simple Talend Big Data job with a row generator and an HDFS configuration, but nothing shows up as an entry on the Hadoop job server.

 

Here are some screenshots of what I have tried so far. Any help is deeply appreciated.

EMR Configuration:

Services are successfully running.

Able to import the Hive tables and set up the Hadoop cluster in Metadata:

Able to create DB Connection:

Sample big data Job:

It tried to send the job to the Remote server. But it never reaches the remote server. I have waited for 30-45 mins several times. But it doesn’t help. There is always a time out error.

 

 


 

 

2 REPLIES
Employee

Re: Unable to Run a Big Data Job on AWS EMR. Jar not sent on to the Remote Talend Server.

Does the service account user for the JobServer have write permission on the folder where it is installed, so that it can copy the binaries into its RemoteJobServerFiles folder?
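One way to test this is a small script run on the remote server as the JobServer service account. This is only a sketch: the function name is illustrative, and the folder path has to be replaced with wherever the JobServer is actually installed, which is not given in this thread.

```python
import tempfile


def can_write(folder):
    """Return True if the current user can create a file inside folder."""
    try:
        # The temporary file is deleted automatically when the block exits.
        with tempfile.TemporaryFile(dir=folder):
            return True
    except OSError:
        # Covers a missing folder as well as a permission error.
        return False


# Example usage (replace the placeholder with the real install path):
#   can_write("<JobServer install folder>/RemoteJobServerFiles")
```

A False result for the account that runs the JobServer would be consistent with the binaries never being copied over.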

Four Stars

Re: Unable to Run a Big Data Job on AWS EMR. Jar not sent on to the Remote Talend Server.

Hi, I am not sure how to locate this folder on the remote server. I can certainly run a Standard job through the remote job server.

 

Could you please point me to a general location, if there is one, where I can verify this? Also, do you think this applies only to Big Data jobs, or to Standard jobs as well? If it applies to Standard jobs too, then since Standard jobs run fine, there could be another issue.

 

Also, please note that I do not see any activity on the Hadoop job server, as Talend is not yet deploying the jobs to the Hadoop server.

 

Thanks very much for your help!!