Four Stars

Open a connection to Hadoop Cluster from Talend Open Studio For Big Data

Hi,

I am using Talend Open Studio for Big Data and want to connect to an HDP 2.6 cluster on AWS from it.

Here is a screenshot of the cluster settings:

[Screenshot: confir.PNG]
I am not sure which username to set in the Authentication section. Could you suggest how to find the right username?

 

When I click the Check Services button, it throws the following exception:

org.talend.designer.hdfsbrowse.exceptions.HadoopServerException: org.talend.designer.hdfsbrowse.exceptions.HadoopServerException: java.util.concurrent.ExecutionException: java.lang.reflect.InvocationTargetException
at org.talend.designer.hdfsbrowse.hadoop.service.check.AbstractCheckedServiceProvider.checkService(AbstractCheckedServiceProvider.java:57)
at org.talend.designer.hdfsbrowse.hadoop.service.HadoopServiceBean.check(HadoopServiceBean.java:102)
at org.talend.designer.hdfsbrowse.hadoop.service.check.CheckHadoopServicesDialog$5.run(CheckHadoopServicesDialog.java:373)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Caused by: org.talend.designer.hdfsbrowse.exceptions.HadoopServerException: java.util.concurrent.ExecutionException: java.lang.reflect.InvocationTargetException
at org.talend.designer.hdfsbrowse.hadoop.service.check.CheckedWorkUnit.execute(CheckedWorkUnit.java:47)
at org.talend.designer.hdfsbrowse.hadoop.service.check.AbstractCheckedServiceProvider.checkService(AbstractCheckedServiceProvider.java:54)
... 5 more
Caused by: java.util.concurrent.ExecutionException: java.lang.reflect.InvocationTargetException
at java.util.concurrent.FutureTask.report(Unknown Source)
at java.util.concurrent.FutureTask.get(Unknown Source)
at org.talend.designer.hdfsbrowse.hadoop.service.check.CheckedWorkUnit.execute(CheckedWorkUnit.java:44)
... 6 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.talend.core.utils.ReflectionUtils.invokeStaticMethod(ReflectionUtils.java:229)
at org.talend.designer.hdfsbrowse.hadoop.service.check.provider.CheckedNamenodeProvider.check(CheckedNamenodeProvider.java:70)
at org.talend.designer.hdfsbrowse.hadoop.service.check.AbstractCheckedServiceProvider$1.run(AbstractCheckedServiceProvider.java:49)
at org.talend.designer.hdfsbrowse.hadoop.service.check.CheckedWorkUnit$1.call(CheckedWorkUnit.java:65)
at java.util.concurrent.FutureTask.run(Unknown Source)
... 3 more
Caused by: java.lang.IllegalArgumentException: java.net.UnknownHostException: ip-10-0-xxx-xxx.ec2.internal
at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:411)
at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:311)
at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:176)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:688)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:629)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:159)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2761)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:383)
... 12 more
Caused by: java.net.UnknownHostException: ip-10-0-xxx-xxx.ec2.internal
... 20 more

 

5 REPLIES
Moderator

Re: Open a connection to Hadoop Cluster from Talend Open Studio For Big Data

Hello,

It seems that ip-10-0-xxx-xxx.ec2.internal cannot be reached.

Are your NameNode URI and Resource Manager settings correct? Can you connect to HDP 2.6 on AWS successfully from a client without using Talend?
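
To tell apart a hostname that does not resolve (the UnknownHostException in the trace) from a host that resolves but whose port cannot be reached, a minimal Python sketch along these lines may help. The function name `diagnose_host` and its injectable `resolver` and `connector` parameters are assumptions made for illustration; they are not part of Talend or Hadoop. Names ending in `.ec2.internal` are AWS private DNS names and normally resolve only from inside the VPC.

```python
import socket


def diagnose_host(hostname, port, timeout=5.0,
                  resolver=socket.gethostbyname,
                  connector=socket.create_connection):
    """Classify a connection attempt as 'unresolved', 'timeout',
    'unreachable' or 'ok'.

    resolver and connector are injectable (hypothetical design choice)
    so the logic can be exercised without touching the network.
    """
    # Step 1: DNS lookup. A failure here is the Python counterpart of
    # java.net.UnknownHostException.
    try:
        ip_address = resolver(hostname)
    except OSError:  # socket.gaierror and socket.herror derive from OSError
        return "unresolved"

    # Step 2: TCP connect to the resolved address with a bounded wait.
    try:
        conn = connector((ip_address, port), timeout=timeout)
    except socket.timeout:
        return "timeout"
    except OSError:
        return "unreachable"

    conn.close()
    return "ok"
```

Calling `diagnose_host` with the NameNode hostname and port 8020 from the machine where Talend Studio runs returns `"unresolved"` when that machine cannot resolve the private name, which would point at DNS or a hosts-file entry rather than at Talend itself.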

Best regards

Sabrina

 

--
Don't forget to give kudos when a reply is helpful and click Accept the solution when you think you're good with it.
Four Stars

Re: Open a connection to Hadoop Cluster from Talend Open Studio For Big Data

Hi,

I took the NameNode URI and Resource Manager values from core-site.xml and yarn-site.xml. Both work fine, and I can connect to HDP 2.6 on AWS successfully from a client without using Talend.
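
For reference, the values can be pulled out of the Hadoop configuration files with a few lines of Python. This is only a sketch: the helper name `hadoop_property` and the sample hostname are hypothetical, while `fs.defaultFS` (core-site.xml) and `yarn.resourcemanager.address` (yarn-site.xml) are the standard Hadoop property names for the NameNode URI and the Resource Manager address.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the layout Hadoop uses for core-site.xml;
# the hostname is a placeholder, not a value from this thread.
SAMPLE_CORE_SITE = """
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example.internal:8020</value>
  </property>
</configuration>
"""


def hadoop_property(xml_text, name):
    """Return the <value> of the <property> whose <name> matches,
    or None if the property is not present."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        prop_name = (prop.findtext("name") or "").strip()
        if prop_name == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


namenode_uri = hadoop_property(SAMPLE_CORE_SITE, "fs.defaultFS")
```

For a real file, the text can be read with `open(path).read()` and passed to the same function; the value found there is what goes into the Namenode URI field in Talend.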

Could you suggest which username I should use in the Authentication section?

I am using OpenJDK version "1.8.0_131". Which Java version is compatible with Talend?

Moderator

Re: Open a connection to Hadoop Cluster from Talend Open Studio For Big Data

Hello,

So far, OpenJDK is not officially supported by Talend. Could you please try Oracle JDK 1.8 to see if it works?

If you get compile errors, please check the "Code" tab of your job. The compile error will be highlighted there in red.

Best regards

Sabrina

Four Stars

Re: Open a connection to Hadoop Cluster from Talend Open Studio For Big Data

Hi,

I am able to open a Hive connection successfully, with HiveServer2 listening on port 10000, but I am not able to open a connection to the Hadoop cluster: it throws a NameNode exception. I can fetch the HDFS directory listing using the NameNode URI hdfs://ip-XX-X-XXX-XX.ec2.internal:8020/user/.

These are the connection parameters:
Namenode URI = hdfs://ip-XX-X-XXX-XX.ec2.internal:8020/
Resource Manager = ip-XX-X-XXX-XX.ec2.internal:8050
Resource Manager Scheduler = ip-XX-X-XXX-XX.ec2.internal:8030
Job History = ip-XX-X-XXX-XX.ec2.internal:10020
Staging Directory = /user
User Name = hdfs

But it throws the following NameNode exception:

org.talend.designer.hdfsbrowse.exceptions.HadoopServerException: org.talend.designer.hdfsbrowse.exceptions.HadoopServerException: java.util.concurrent.TimeoutException
	at org.talend.designer.hdfsbrowse.hadoop.service.check.AbstractCheckedServiceProvider.checkService(AbstractCheckedServiceProvider.java:57)
	at org.talend.designer.hdfsbrowse.hadoop.service.HadoopServiceBean.check(HadoopServiceBean.java:102)
	at org.talend.designer.hdfsbrowse.hadoop.service.check.CheckHadoopServicesDialog$5.run(CheckHadoopServicesDialog.java:373)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.lang.Thread.run(Unknown Source)
Caused by: org.talend.designer.hdfsbrowse.exceptions.HadoopServerException: java.util.concurrent.TimeoutException
	at org.talend.designer.hdfsbrowse.hadoop.service.check.CheckedWorkUnit.execute(CheckedWorkUnit.java:47)
	at org.talend.designer.hdfsbrowse.hadoop.service.check.AbstractCheckedServiceProvider.checkService(AbstractCheckedServiceProvider.java:54)
	... 5 more
Caused by: java.util.concurrent.TimeoutException
	at java.util.concurrent.FutureTask.get(Unknown Source)
	at org.talend.designer.hdfsbrowse.hadoop.service.check.CheckedWorkUnit.execute(CheckedWorkUnit.java:44)
	... 6 more

 

Moderator

Re: Open a connection to Hadoop Cluster from Talend Open Studio For Big Data

Hello,

Are you behind a proxy? Have you tried changing the timeout setting in Preferences > Talend > Performance > Connection timeout to see if it works?
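
Before raising the timeout, it can be worth checking whether each service port is reachable at all from the Studio machine, since a blocked port (for example by a firewall or an AWS security group) also ends in a timeout. The sketch below parses endpoints written the way they appear in the connection parameters (`hdfs://host:8020/` or `host:8050`) and tries a TCP connect to each. The helper names `parse_endpoint`, `port_open` and `check_all` are hypothetical, as is the placeholder hostname in the usage note.

```python
import socket
from urllib.parse import urlsplit


def parse_endpoint(value):
    """Return (host, port) for 'hdfs://host:8020/' or plain 'host:8050'."""
    if "://" not in value:
        # Prefix with '//' so urlsplit treats the text as a network location.
        value = "//" + value
    parts = urlsplit(value)
    return parts.hostname, parts.port


def port_open(host, port, timeout=10.0, connector=socket.create_connection):
    """True if a TCP connection succeeds within `timeout` seconds."""
    try:
        conn = connector((host, port), timeout=timeout)
    except OSError:  # covers timeouts, refusals and resolution failures
        return False
    conn.close()
    return True


def check_all(endpoints, timeout=10.0, connector=socket.create_connection):
    """Map each service label to True/False depending on reachability."""
    results = {}
    for label, value in endpoints.items():
        host, port = parse_endpoint(value)
        results[label] = port_open(host, port, timeout, connector)
    return results
```

A call such as `check_all({"Namenode URI": "hdfs://namenode.example.internal:8020/", "Resource Manager": "namenode.example.internal:8050"})` returns one boolean per service. If Hive on port 10000 is reachable but 8020 is not, the difference is more likely a network rule than a Talend setting.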

Best regards

Sabrina
