One Star

Problem with HDFS

Hello,
First of all, I apologize for my English, which is not very good.
Let me explain my problem: it concerns HDFS. My objective is to send a file to an HDFS cluster, so I asked the administrator for all the connection information, which he gave to me. But whenever I try to send data to or read data from HDFS, I always get this exasperating error:
Exception in component tHDFSExist_1
java.io.IOException: Call to /XXX.XXX.XXX.XXX:9000 failed on local exception: java.io.EOFException
at org.apache.hadoop.ipc.Client.wrapException(Client.java:1065)
at org.apache.hadoop.ipc.Client.call(Client.java:1033)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:224)
at $Proxy1.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:364)
at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:106)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:208)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:175)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1310)
at org.apache.hadoop.fs.FileSystem.access$100(FileSystem.java:65)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1328)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:226)
at org.apache.hadoop.fs.FileSystem$1.run(FileSystem.java:103)
at org.apache.hadoop.fs.FileSystem$1.run(FileSystem.java:101)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Unknown Source)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:101)
at talenddemosjava.test_0_1.Test.tHDFSExist_1Process(Test.java:410)
at talenddemosjava.test_0_1.Test.tHDFSConnection_1Process(Test.java:351)
at talenddemosjava.test_0_1.Test.runJobInTOS(Test.java:644)
at talenddemosjava.test_0_1.Test.main(Test.java:509)
Caused by: java.io.EOFException
at java.io.DataInputStream.readInt(Unknown Source)
at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:774)
at org.apache.hadoop.ipc.Client$Connection.run(Client.java:712)


Does anyone have any idea where the problem comes from: is it TOS, or did I make an error in the configuration?
Community Manager

Re: Problem with HDFS

Hi
Run the following command from a command prompt to see whether you can reach the host on the specified port:
cmd>telnet yourhost 9000
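If telnet is not available on your machine, here is a minimal Java sketch that attempts the same TCP connection (the host and port are placeholders for your namenode address):

import java.net.InetSocketAddress;
import java.net.Socket;

public class PortCheck {
    public static void main(String[] args) throws Exception {
        String host = "yourhost"; // placeholder: your namenode host
        int port = 9000;          // placeholder: the namenode port you were given
        try (Socket socket = new Socket()) {
            // Fail fast with an explicit timeout instead of the OS default.
            socket.connect(new InetSocketAddress(host, port), 5000);
            System.out.println("TCP connection to " + host + ":" + port + " succeeded.");
        }
    }
}

If this throws a ConnectException or times out, the problem is network/firewall rather than Talend.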
----------------------------------------------------------
Talend | Data Agility for Modern Business
One Star

Re: Problem with HDFS

Hello, I already tried that, and it worked normally.
Community Manager

Re: Problem with HDFS

Hi
Which version are you using? I am working on the latest version, 5.2.0, and it works. Here is a related topic:
http://www.talendforge.org/forum/viewtopic.php?id=26901
Please try v5.2.0 and let me know if it works.
----------------------------------------------------------
Talend | Data Agility for Modern Business
Employee

Re: Problem with HDFS

Hello,
Could you please verify that your cluster is up and running correctly? That is, are you able to browse the page
http://your-namenode-server-ip:50070 and confirm that you have at least one live node?
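As a quick check from code, a minimal sketch along the same lines (the address is a placeholder; 50070 is the default namenode web UI port on this generation of Hadoop):

import java.net.HttpURLConnection;
import java.net.URL;

public class NamenodeWebCheck {
    public static void main(String[] args) throws Exception {
        // Placeholder address: replace with your namenode host.
        URL url = new URL("http://your-namenode-server-ip:50070/");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setConnectTimeout(5000);
        conn.setReadTimeout(5000);
        // HTTP 200 means the web UI is up; the live node count is shown on the page itself.
        System.out.println("HTTP response code: " + conn.getResponseCode());
        conn.disconnect();
    }
}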
Which version/distribution of Hadoop are you using (Cloudera/Hortonworks/MapR...)?
Rémy.
One Star

Re: Problem with HDFS

Hello Shong & Rémy.
@Shong: I am already working on 5.2, although I started the project on the previous version.
@Rémy: there are 3 nodes running, and I am on Cloudera 0.20.2 - CDH3 update 5.
Completely off topic: do you have any idea why I am able to log in on talend.com, but not on talendforge.com?
G. BLAISE
Employee

Re: Problem with HDFS

OK, then your problem comes from the port you are using. The namenode port on Cloudera is not 9000 but 8020.
So, for the namenode URI, please try: hdfs://XXX.XXX.XXX.XXX:8020
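A minimal sketch of testing that URI directly with the Hadoop client API (the address is a placeholder; FileSystem.get and exists are the standard HDFS client calls of that era):

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsUriCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Placeholder address: replace with your namenode host; 8020 is the Cloudera default.
        FileSystem fs = FileSystem.get(URI.create("hdfs://XXX.XXX.XXX.XXX:8020"), conf);
        // With the right port this succeeds; with the wrong one you get the EOFException above.
        System.out.println("Root exists: " + fs.exists(new Path("/")));
        fs.close();
    }
}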
One Star

Re: Problem with HDFS

Thanks Rémy, it was indeed a mistake by the administrator, who thought the namenode port was 9000.
To avoid having to create a new topic, I will continue with this one, with a new problem:
nov. 19, 2012 4:20:06 PM org.apache.hadoop.hdfs.DFSClient$DFSOutputStream createBlockOutputStream
INFO: Exception in createBlockOutputStream java.net.ConnectException: Connection timed out: no further information
nov. 19, 2012 4:20:06 PM org.apache.hadoop.hdfs.DFSClient$DFSOutputStream nextBlockOutputStream
INFO: Abandoning block blk_7242044495197364905_1732
nov. 19, 2012 4:20:06 PM org.apache.hadoop.hdfs.DFSClient$DFSOutputStream nextBlockOutputStream
INFO: Excluding datanode XXX.XXX.XXX.130:50010
nov. 19, 2012 4:20:27 PM org.apache.hadoop.hdfs.DFSClient$DFSOutputStream createBlockOutputStream
INFO: Exception in createBlockOutputStream java.net.ConnectException: Connection timed out: no further information
nov. 19, 2012 4:20:27 PM org.apache.hadoop.hdfs.DFSClient$DFSOutputStream nextBlockOutputStream
INFO: Abandoning block blk_8522655008285870715_1732
nov. 19, 2012 4:20:27 PM org.apache.hadoop.hdfs.DFSClient$DFSOutputStream nextBlockOutputStream
INFO: Excluding datanode XXX.XXX.XXX.131:50010

Can you help me (again), please?
Employee

Re: Problem with HDFS

Your datanodes are not reachable, so the client is excluding them.
Are you using HBase? Are you handling a lot of files? Is Talend installed on Linux or Windows?
One Star

Re: Problem with HDFS

So, I actually only use HDFS, and my Talend is running on Windows.
Thanks in advance!
One Star

Re: Problem with HDFS

Has this issue been fixed? We are facing a similar kind of issue.
I have the Talend tool installed in a Windows environment and tried to build an HDFS connection, but I am not able to connect successfully. I tried different components (tHDFSGet/tHDFSPut/tHDFSInput/tHDFSOutput) with no success.
I tried the Hortonworks distribution and the Amazon EMR distribution but failed every time. Please help.
One Star

Re: Problem with HDFS

I am facing a similar kind of issue. I tried to put a file to HDFS, but I get the following errors:
Exception in component tHDFSPut_1
java.io.IOException: DataStreamer Exception: 
: org.apache.hadoop.hdfs.DFSClient - DataStreamer Exception
java.nio.channels.UnresolvedAddressException
at sun.nio.ch.Net.checkAddress(Net.java:101)
at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:622)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1752)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1530)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1483)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:668)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:796)
Caused by: java.nio.channels.UnresolvedAddressException
at sun.nio.ch.Net.checkAddress(Net.java:101)
at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:622)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1752)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1530)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1483)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:668)
Please help!
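A java.nio.channels.UnresolvedAddressException generally means the client was handed a datanode address (typically a hostname) that it cannot resolve locally. A minimal diagnostic sketch (the hostnames below are placeholders; check each datanode name your namenode reports):

import java.net.InetAddress;
import java.net.UnknownHostException;

public class ResolveCheck {
    public static void main(String[] args) {
        // Placeholder names: replace with the datanode hostnames from your cluster.
        String[] datanodes = { "datanode1.example.com", "datanode2.example.com" };
        for (String host : datanodes) {
            try {
                // The HDFS client's UnresolvedAddressException corresponds to this
                // lookup failing on the client machine.
                System.out.println(host + " -> " + InetAddress.getByName(host).getHostAddress());
            } catch (UnknownHostException e) {
                System.out.println(host + " does not resolve from this machine;"
                        + " add it to the client's hosts file or fix DNS.");
            }
        }
    }
}

If a name fails to resolve, mapping it in the client machine's hosts file is the usual workaround for clusters on a private network.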