Dear Talend Community,
I am using the following software:
I can successfully access the sandbox in Firefox browser using 127.0.0.1:8888, 127.0.0.1:8080 and also 127.0.0.1:50070.
Now I want to send data from Talend to the sandbox. The connection seems to work, as I can browse the files by clicking the button behind the FileName field in the Components tab of tHDFSOutput. But running the job results in the following error:
Exception in component tHDFSOutput_1 (testHDPConnection)
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/raj_ops/out-2.csv could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
The file is created, and I can see it using the Files View in the sandbox, but no data is written: it stays an empty file. After reading posts about the same error, I still don't know how to make this work.
Thank you in advance for helping me resolve the issue!
Hello! For a quick check, can you open the URI http://sandbox-host-name:50070/ in a browser and verify that your DataNode is healthy?
In the table, you should see Live Nodes: 1
Thank you for your reply. I attached a screenshot where you can see what 127.0.0.1:50070 looks like on my system.
In the meantime I tried to debug and followed some posts. I added the HDFS ports to VirtualBox (as described in the link in this post: https://community.hortonworks.com/questions/82072/file-userroottmptesttxt-could-only-be-replicated-t...). As this didn't work, I tried an older sandbox version without Docker (2.4), but I still get the same error.
In this post: https://www.talendforge.org/forum/viewtopic.php?id=50662 a parameter in hdfs-default.xml is mentioned, but where is hdfs-default.xml? Logging into the sandbox with ssh email@example.com -p 2222, I only find hdfs-site.xml, which unfortunately does not contain the parameter mentioned: "dfs.client.use.datanode.hostname". I ran the tHDFSOutput job with "Use Datanode Hostname" both enabled and disabled: still the same error, and an empty file is created.
And although I added the IP (as described here: https://www.youtube.com/watch?v=xG3nQAfkEyM&feature=youtu.be ), I still cannot open sandbox:8080 or 8088, only 127.0.0.1:8088 or localhost:8088. Can this be related to the error? And what did I do wrong?
Thanks for any hint!
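One way to narrow this down is to check, from the machine running Talend, whether the sandbox host name resolves and whether the forwarded ports accept TCP connections. The following is only a minimal sketch using plain JDK classes, not a Talend or Hadoop API. The class name SandboxProbe is hypothetical, and ports 8020 (NameNode RPC) and 50010 (DataNode data transfer) are assumptions based on the usual Hadoop 2.x defaults; only 50070 and the host name "sandbox" come from this thread, so adjust the rest to your setup.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.UnknownHostException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Talend or Hadoop.
class SandboxProbe {

    /** Returns the resolved IP address as text, or null if the name does not resolve. */
    static String resolve(String host) {
        try {
            return InetAddress.getByName(host).getHostAddress();
        } catch (UnknownHostException e) {
            return null;
        }
    }

    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    static boolean isReachable(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts and unresolved host names.
            return false;
        }
    }

    public static void main(String[] args) {
        // "sandbox" is the host name used in this thread; pass another one as the first argument.
        String host = args.length > 0 ? args[0] : "sandbox";

        String ip = resolve(host);
        if (ip == null) {
            System.out.println(host + " does not resolve (check the hosts file entry)");
            return;
        }
        System.out.println(host + " resolves to " + ip);

        Map<String, Integer> ports = new LinkedHashMap<>();
        ports.put("NameNode web UI", 50070);        // mentioned in this thread
        ports.put("NameNode RPC", 8020);            // assumption: Hadoop 2.x default
        ports.put("DataNode data transfer", 50010); // assumption: Hadoop 2.x default

        for (Map.Entry<String, Integer> entry : ports.entrySet()) {
            boolean open = isReachable(host, entry.getValue(), 2000);
            System.out.println(entry.getKey() + " (" + entry.getValue() + "): "
                    + (open ? "reachable" : "NOT reachable"));
        }
    }
}
```

If the NameNode ports are reachable but the DataNode port is not, that would be consistent with the symptom described above (the file is created, but no data can be written). Treat this as a hint rather than a diagnosis.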