Connecting To EMR Cluster

Greetings,
I am facing several issues when trying to configure my Talend job to run on Spark and connect to an EMR cluster.
1. The Spark job hangs indefinitely when I execute a Big Data Batch job. Nothing executes and nothing fails.
2. I cannot create a connection in the Hadoop repository. Checking the connection always ends in a timeout: org.talend.designer.hdfsbrowse.exceptions.HadoopServerException
3. When I drop an HDFS connection onto the palette, it appears to connect because the job exits with 0, but when I try to do anything else I get a connection error.
Our Environment:
1. Windows Server 2012 running on an EC2 instance.
2. Talend Big Data 6.2.
3. EMR 4.6 and Hadoop Distribution 2.7.
What I have tried so far:
1. Every single port and server combination possible.
2. Updating the hosts file. In fact, I followed all of the steps listed here.
3. Reinstalling Java.
Other notes:
1. I am able to SSH into the cluster through PuTTY, and I can manage and view cluster details from Firefox after configuring FoxyProxy.
2. All services appear to be normal in the AWS console.
Similar issues:
Moderator

Re: Connecting To EMR Cluster

Hi,
From your description, it seems there is something wrong with your connection. Are your connection parameters correct? Could you please post screenshots of your connection settings?
Have you already checked the document TalendHelpCenter: Supported Hadoop distribution versions to see if you are on a compatible platform?
Are you using JDK 1.8?
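It may also help to rule out a plain network problem before looking at the Talend settings. The snippet below is only a minimal sketch of such a check, written in Python and meant to be run from the Windows machine that hosts the Studio. The host name is a placeholder, and the ports (8020 for the HDFS NameNode, 8032 for the YARN ResourceManager) are assumed defaults that you should confirm against your own cluster configuration. The function names are made up for illustration and are not part of Talend or EMR.

```python
import socket


def port_is_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host name and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


def check_cluster(host, ports, timeout=5.0):
    """Check every named port on `host`; return a dict of name -> reachable (bool)."""
    return {name: port_is_reachable(host, port, timeout) for name, port in ports.items()}


# Assumed default ports; verify them in your cluster's configuration.
ASSUMED_PORTS = {"HDFS NameNode": 8020, "YARN ResourceManager": 8032}

# Example call (the host name is a placeholder for your EMR master node):
# print(check_cluster("your-emr-master-hostname", ASSUMED_PORTS))
```

If a port shows as unreachable here, the timeout is likely caused by a security group, firewall, or name resolution issue rather than by the job design. Note that this only tests direct TCP access; a browser session that works through FoxyProxy goes through your SSH tunnel and does not prove that these ports are open to the Studio machine.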
Best regards
Sabrina