Talend apache hadoop 2.7.3 connectivity

Hi All,
We have Apache Hadoop 2.7.3 installed on a Linux machine. We want to connect to the Hadoop cluster and read a sample HDFS file. In the metadata, when I configure a Hadoop cluster and choose Apache as the distribution, the only version offered is 1.0; no other versions are available. Please let me know how I can connect to Hadoop 2.7.3 from Talend.
I have also created a new job with a tHDFSConnection component and provided the NameNode URI and username. I added tHdfsread and tLogRow components to read an HDFS file, but the job fails with the message "connection refused". Can anyone tell me the exact process for reading an HDFS file from an Apache Hadoop cluster?
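For reference, the NameNode URI entered in the connection component normally has the form hdfs://host:port and should match the fs.defaultFS value in the cluster's core-site.xml. Below is a minimal Java sketch for pulling the host and port out of such a URI so they can be compared with the cluster settings. The class name NameNodeUri is a hypothetical helper (not part of Talend or Hadoop), and hdfs://namenode.example.com:8020 is a placeholder, not a value taken from this thread.

```java
import java.net.URI;

// Hypothetical helper for illustration only; not a Talend or Hadoop class.
final class NameNodeUri {
    final String host;
    final int port;

    private NameNodeUri(String host, int port) {
        this.host = host;
        this.port = port;
    }

    // Parses an hdfs://host:port URI and rejects anything without an explicit host and port.
    static NameNodeUri parse(String uri) {
        URI parsed = URI.create(uri);
        if (!"hdfs".equalsIgnoreCase(parsed.getScheme())) {
            throw new IllegalArgumentException("Expected an hdfs:// URI, got: " + uri);
        }
        if (parsed.getHost() == null) {
            throw new IllegalArgumentException("No host in URI: " + uri);
        }
        if (parsed.getPort() == -1) {
            throw new IllegalArgumentException("No port in URI: " + uri);
        }
        return new NameNodeUri(parsed.getHost(), parsed.getPort());
    }

    public static void main(String[] args) {
        // Placeholder URI (assumption); replace it with the value used in the job.
        NameNodeUri nn = parse("hdfs://namenode.example.com:8020");
        System.out.println(nn.host + ":" + nn.port); // prints namenode.example.com:8020
    }
}
```

The host and port printed here are the ones the job will try to reach, so they are the values to compare against the NameNode address configured on the cluster.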
Moderator

Re: Talend apache hadoop 2.7.3 connectivity

Hi,
Regarding "but job fails with message 'connection refused'":

Could you please show us the full stack trace printed on the console? A screenshot of the tHDFSConnection component settings would also help us address your issue.
Best regards
Sabrina
--
Don't forget to give kudos when a reply is helpful and click Accept the solution when you think you're good with it.

Re: Talend apache hadoop 2.7.3 connectivity

Hi Sabrina,
Below are details.
1. The New Cluster Connection parameters are shown in the attached image.

2. I am not sure which libraries we have to specify: internal or external? If external, where can we get the JAR files, and are all of them required? Screenshot attached.

3. Error message:

org.talend.designer.hdfsbrowse.exceptions.HadoopServerException: org.talend.designer.hdfsbrowse.exceptions.HadoopServerException: java.util.concurrent.ExecutionException: java.lang.reflect.InvocationTargetException
    at org.talend.designer.hdfsbrowse.hadoop.service.check.AbstractCheckedServiceProvider.checkService(AbstractCheckedServiceProvider.java:53)
    at org.talend.designer.hdfsbrowse.hadoop.service.HadoopServiceBean.check(HadoopServiceBean.java:102)
    at org.talend.designer.hdfsbrowse.hadoop.service.check.CheckHadoopServicesDialog$5.run(CheckHadoopServicesDialog.java:373)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
Caused by: org.talend.designer.hdfsbrowse.exceptions.HadoopServerException: java.util.concurrent.ExecutionException: java.lang.reflect.InvocationTargetException
    at org.talend.designer.hdfsbrowse.hadoop.service.check.CheckedWorkUnit.execute(CheckedWorkUnit.java:47)
    at org.talend.designer.hdfsbrowse.hadoop.service.check.AbstractCheckedServiceProvider.checkService(AbstractCheckedServiceProvider.java:50)
    ... 5 more
Caused by: java.util.concurrent.ExecutionException: java.lang.reflect.InvocationTargetException
    at java.util.concurrent.FutureTask.report(Unknown Source)
    at java.util.concurrent.FutureTask.get(Unknown Source)
    at org.talend.designer.hdfsbrowse.hadoop.service.check.CheckedWorkUnit.execute(CheckedWorkUnit.java:44)
    ... 6 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at org.talend.core.utils.ReflectionUtils.invokeStaticMethod(ReflectionUtils.java:229)
    at org.talend.designer.hdfsbrowse.hadoop.service.check.provider.CheckedNamenodeProvider.check(CheckedNamenodeProvider.java:63)
    at org.talend.designer.hdfsbrowse.hadoop.service.check.AbstractCheckedServiceProvider$1.run(AbstractCheckedServiceProvider.java:45)
    at org.talend.designer.hdfsbrowse.hadoop.service.check.CheckedWorkUnit$1.call(CheckedWorkUnit.java:65)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    ... 3 more
Caused by: java.net.ConnectException: Call to 10.223.66.228/10.223.66.228:54310 failed on connection exception: java.net.ConnectException: Connection refused: no further information
    at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
    at org.apache.hadoop.ipc.Client.call(Client.java:1071)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
    at com.sun.proxy.$Proxy81.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
    at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:251)
    ... 12 more
Caused by: java.net.ConnectException: Connection refused: no further information
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(Unknown Source)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:656)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
    at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
    at org.apache.hadoop.ipc.Client.call(Client.java:1046)
    ... 22 more
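
In a trace like this one, the outer exceptions are only wrappers; the last "Caused by" entry (here java.net.ConnectException: Connection refused) is the one that says what actually went wrong. Below is a minimal Java sketch of walking a cause chain down to its root. RootCause is a hypothetical helper class, and the chain in main is built by hand to imitate the wrappers in the trace; it does not call Talend or Hadoop.

```java
import java.lang.reflect.InvocationTargetException;
import java.net.ConnectException;
import java.util.concurrent.ExecutionException;

// Hypothetical helper for illustration only.
final class RootCause {
    // Follows getCause() until it reaches an exception that has no further cause.
    static Throwable of(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        // Hand-built chain that mimics the nesting seen in the trace (an assumption for the demo).
        Throwable simulated = new RuntimeException("service check failed",
                new ExecutionException(
                        new InvocationTargetException(
                                new ConnectException("Connection refused"))));
        Throwable root = of(simulated);
        System.out.println(root.getClass().getName() + ": " + root.getMessage());
        // prints java.net.ConnectException: Connection refused
    }
}
```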

Re: Talend apache hadoop 2.7.3 connectivity

Hi Sabrina,
Can you please help?
Moderator

Re: Talend apache hadoop 2.7.3 connectivity

Hi,
It looks like a connection issue. Can you connect to your Hadoop cluster and read a sample HDFS file through a client, without using Talend?
Have you already installed the required external JAR files?
Could you please take a look at the Talend Help Center document "Installing external modules"?
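
One way to run the connectivity check without Talend is a plain TCP probe from the machine where the Studio runs. Below is a minimal Java sketch, assuming a hypothetical helper class NameNodeProbe; the default host and port are the ones shown in the stack trace earlier in this thread (10.223.66.228:54310). If it prints false, nothing is accepting connections at that address from this machine, which usually points to the NameNode listening on another port or interface, or to a firewall in between, rather than to Talend itself.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper for illustration only; it tests TCP reachability, not HDFS itself.
final class NameNodeProbe {
    // Returns true if a TCP connection to host:port can be opened within the timeout.
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers "Connection refused", timeouts, and unresolvable host names.
            System.err.println("Connect to " + host + ":" + port + " failed: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults are taken from the stack trace in this thread; pass other values as arguments.
        String host = args.length > 0 ? args[0] : "10.223.66.228";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 54310;
        System.out.println(canConnect(host, port, 3000));
    }
}
```

A result of true only shows that the port is reachable; reading a file still needs a Hadoop client whose version is compatible with the cluster.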
Best Regards
Sabrina