Four Stars

tHDFSOutput Error: File ... could only be replicated to 0 nodes

Dear Talend Community,

I am using the following software:

  • Ubuntu 16.04 LTS
  • VirtualBox 5.0.40
  • TOS_BD-20170623_1246-V6.4.1
  • HDP_2.6_virtualbox_05_05_2017_14_46_00_hdp.ova

I can successfully access the sandbox in Firefox using 127.0.0.1:8888, 127.0.0.1:8080 and also 127.0.0.1:50070.

Now I want to send data from Talend to the sandbox. The connection seems to work, as I can browse the files by clicking the button next to the FileName field in the Component tab of tHDFSOutput. However, running the job results in the following error:


Exception in component tHDFSOutput_1 (testHDPConnection)
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/raj_ops/out-2.csv could only be replicated to 0 nodes instead of minReplication (=1).  There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1703)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3336)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3260)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:849)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:503)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)

The file is created, and I can see it using the Files View in the sandbox, but no data is written; it stays an empty file. After reading other posts about the same error, I still don't know how to make this work.

Thank you in advance for helping me resolve the issue!

3 REPLIES
Employee

Re: tHDFSOutput Error: File ... could only be replicated to 0 nodes

Hello! For a quick check, can you open the URI http://sandbox-host-name:50070/ in a browser and verify that your DataNode is healthy?

In the table, you should see Live Nodes: 1
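The same check can also be scripted. The following is a minimal sketch, assuming the NameNode web UI answers on 127.0.0.1:50070 as in the original post and exposes the standard Hadoop 2.x JMX servlet with the FSNamesystemState bean. The function names and the base URL constant are illustrative, not part of Talend or the sandbox.

```python
import json
from urllib.request import urlopen

# Assumption: NameNode web UI address taken from the thread (127.0.0.1:50070).
NAMENODE_HTTP = "http://127.0.0.1:50070"


def live_datanodes(jmx_payload):
    """Return NumLiveDataNodes from a parsed NameNode /jmx response."""
    for bean in jmx_payload.get("beans", []):
        if "NumLiveDataNodes" in bean:
            return int(bean["NumLiveDataNodes"])
    raise KeyError("NumLiveDataNodes not found in JMX payload")


def fetch_live_datanodes(base_url=NAMENODE_HTTP, timeout=5.0):
    """Query the NameNode JMX servlet and return the live DataNode count."""
    url = base_url + "/jmx?qry=Hadoop:service=NameNode,name=FSNamesystemState"
    with urlopen(url, timeout=timeout) as resp:
        return live_datanodes(json.load(resp))
```

Calling fetch_live_datanodes() on the machine that runs the job should return 1 for a healthy single-node sandbox. Note that a live DataNode only shows that the NameNode can see it, not that the client can reach it.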

Four Stars

Re: tHDFSOutput Error: File ... could only be replicated to 0 nodes

Thank you for your reply. I attached a screenshot showing what 127.0.0.1:50070 looks like on my system.

Four Stars

Re: tHDFSOutput Error: File ... could only be replicated to 0 nodes

In the meantime I tried to debug this by following some other posts. I added the HDFS ports to VirtualBox (as described in the link in this post: https://community.hortonworks.com/questions/82072/file-userroottmptesttxt-could-only-be-replicated-t...). As this didn't work, I tried an older sandbox version without Docker (2.4), but I still get the same error.
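To verify that the added ports are actually reachable from the machine running the Talend job, a small TCP probe can help. This is only a sketch: the port numbers are assumptions based on Hadoop 2.x defaults (8020 for NameNode RPC, 50010 for DataNode data transfer, 50070 for the NameNode web UI), and the helper names are made up for illustration. An excluded DataNode in this error is often a sign that the client can talk to the NameNode but cannot open a connection to the DataNode.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_ports(host, ports, probe=port_open):
    """Map each port to whether the probe could reach it on the given host."""
    return {port: probe(host, port) for port in ports}


# Assumption: Hadoop 2.x default ports; adjust to the sandbox configuration.
HDFS_PORTS = [8020, 50010, 50070]
```

For example, check_ports("127.0.0.1", HDFS_PORTS) returns a dictionary such as {8020: True, 50010: False, 50070: True}, which would indicate that the DataNode transfer port is not forwarded.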

This post: https://www.talendforge.org/forum/viewtopic.php?id=50662 mentions a parameter in hdfs-default.xml, but where is hdfs-default.xml? Logging into the sandbox with ssh root@127.0.0.1 -p 2222, I only find hdfs-site.xml, which unfortunately does not contain the parameter mentioned: "dfs.client.use.datanode.hostname". I ran the tHDFSOutput job with "Use Datanode Hostname" both enabled and disabled: still the same error, and an empty file is created.

And although I added the IP (as described here: https://www.youtube.com/watch?v=xG3nQAfkEyM&feature=youtu.be ), I still cannot open sandbox:8080 or sandbox:8088, only 127.0.0.1:8088 or localhost:8088. Can this be related to the error? And what did I do wrong?
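Whether the hosts entry took effect can be checked independently of the browser. The sketch below assumes the hostname is "sandbox", as used above; the helper name is illustrative. This matters when "Use Datanode Hostname" is enabled, because the client then connects to the DataNode by name, so the name has to resolve on the client machine.

```python
import socket


def resolve(hostname):
    """Return the IPv4 address the hostname resolves to, or None if it does not resolve."""
    try:
        return socket.gethostbyname(hostname)
    except OSError:
        return None


# Assumption: "sandbox" is the name added to the hosts file.
SANDBOX_HOST = "sandbox"
```

If resolve(SANDBOX_HOST) returns None, the hosts entry is not being picked up by the operating system, which would explain why sandbox:8080 does not open while 127.0.0.1:8080 does.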

 

Thanks for any hint!