Four Stars

tHDFSInput Could not obtain block

Hi all,

 

I need help.

 

I would like to build a simple job to check my configuration.

I have a sandbox Hortonworks 2.6.1 and TOS Big Data 6.4.1.

My job is a tHDFSInput that sends data to a tLogRow.

My job (screenshot)

In Metadata, I have declared a Hadoop cluster connection and checked it; everything is OK.

I have also declared an HDFS connection and checked it; this connection is successful.

When I start my job, I get 3 warnings and some errors, such as:

- Connection timed out: no further information

Exception in component tHDFSInput_2 (OnBoard_hdfs)
  org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block

 

Do you have any idea?

 

I have put the full log below:

 

Starting job OnBoard_hdfs at 13:51 06/08/2017.
[statistics] connecting to socket on port 3761
[statistics] connected
[WARN ]: org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[WARN ]: org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory - The short-circuit local reads feature cannot be used because UNIX Domain sockets are not available on Windows.
[WARN ]: org.apache.hadoop.hdfs.BlockReaderFactory - I/O error constructing remote block reader.
java.net.ConnectException: Connection timed out: no further information
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(Unknown Source)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
at org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:3539)
at org.apache.hadoop.hdfs.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:775)
at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:692)
at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:355)
at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:652)
at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:888)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:945)
at java.io.DataInputStream.read(Unknown Source)
at java.io.FilterInputStream.read(Unknown Source)
at java.io.PushbackInputStream.read(Unknown Source)
at org.talend.fileprocess.UnicodeReader.<init>(UnicodeReader.java:25)
at org.talend.fileprocess.TOSDelimitedReader.<init>(TOSDelimitedReader.java:77)
at org.talend.fileprocess.FileInputDelimited.<init>(FileInputDelimited.java:93)
at sandbox_hdp.onboard_hdfs_0_1.OnBoard_hdfs.tHDFSInput_2Process(OnBoard_hdfs.java:740)
at sandbox_hdp.onboard_hdfs_0_1.OnBoard_hdfs.runJobInTOS(OnBoard_hdfs.java:1180)
at sandbox_hdp.onboard_hdfs_0_1.OnBoard_hdfs.main(OnBoard_hdfs.java:1029)
[WARN ]: org.apache.hadoop.hdfs.DFSClient - Failed to connect to /172.17.0.2:50010 for block, add to deadNodes and continue. java.net.ConnectException: Connection timed out: no further information
java.net.ConnectException: Connection timed out: no further information
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(Unknown Source)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
at org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:3539)
at org.apache.hadoop.hdfs.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:775)
at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:692)
at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:355)
at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:652)
at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:888)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:945)
at java.io.DataInputStream.read(Unknown Source)
at java.io.FilterInputStream.read(Unknown Source)
at java.io.PushbackInputStream.read(Unknown Source)
at org.talend.fileprocess.UnicodeReader.<init>(UnicodeReader.java:25)
at org.talend.fileprocess.TOSDelimitedReader.<init>(TOSDelimitedReader.java:77)
at org.talend.fileprocess.FileInputDelimited.<init>(FileInputDelimited.java:93)
at sandbox_hdp.onboard_hdfs_0_1.OnBoard_hdfs.tHDFSInput_2Process(OnBoard_hdfs.java:740)
at sandbox_hdp.onboard_hdfs_0_1.OnBoard_hdfs.runJobInTOS(OnBoard_hdfs.java:1180)
at sandbox_hdp.onboard_hdfs_0_1.OnBoard_hdfs.main(OnBoard_hdfs.java:1029)
[WARN ]: org.apache.hadoop.hdfs.DFSClient - DFS chooseDataNode: got # 1 IOException, will wait for 398.97485639321206 msec.
[WARN ]: org.apache.hadoop.hdfs.BlockReaderFactory - I/O error constructing remote block reader.
java.net.ConnectException: Connection timed out: no further information
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(Unknown Source)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
at org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:3539)
at org.apache.hadoop.hdfs.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:775)
at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:692)
at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:355)
at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:652)
at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:888)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:945)
at java.io.DataInputStream.read(Unknown Source)
at java.io.FilterInputStream.read(Unknown Source)
at java.io.PushbackInputStream.read(Unknown Source)
at org.talend.fileprocess.UnicodeReader.<init>(UnicodeReader.java:25)
at org.talend.fileprocess.TOSDelimitedReader.<init>(TOSDelimitedReader.java:77)
at org.talend.fileprocess.FileInputDelimited.<init>(FileInputDelimited.java:93)
at sandbox_hdp.onboard_hdfs_0_1.OnBoard_hdfs.tHDFSInput_2Process(OnBoard_hdfs.java:740)
at sandbox_hdp.onboard_hdfs_0_1.OnBoard_hdfs.runJobInTOS(OnBoard_hdfs.java:1180)
at sandbox_hdp.onboard_hdfs_0_1.OnBoard_hdfs.main(OnBoard_hdfs.java:1029)
[WARN ]: org.apache.hadoop.hdfs.DFSClient - Failed to connect to /172.17.0.2:50010 for block, add to deadNodes and continue. java.net.ConnectException: Connection timed out: no further information
java.net.ConnectException: Connection timed out: no further information
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(Unknown Source)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
at org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:3539)
at org.apache.hadoop.hdfs.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:775)
at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:692)
at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:355)
at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:652)
at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:888)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:945)
at java.io.DataInputStream.read(Unknown Source)
at java.io.FilterInputStream.read(Unknown Source)
at java.io.PushbackInputStream.read(Unknown Source)
at org.talend.fileprocess.UnicodeReader.<init>(UnicodeReader.java:25)
at org.talend.fileprocess.TOSDelimitedReader.<init>(TOSDelimitedReader.java:77)
at org.talend.fileprocess.FileInputDelimited.<init>(FileInputDelimited.java:93)
at sandbox_hdp.onboard_hdfs_0_1.OnBoard_hdfs.tHDFSInput_2Process(OnBoard_hdfs.java:740)
at sandbox_hdp.onboard_hdfs_0_1.OnBoard_hdfs.runJobInTOS(OnBoard_hdfs.java:1180)
at sandbox_hdp.onboard_hdfs_0_1.OnBoard_hdfs.main(OnBoard_hdfs.java:1029)
...
[WARN ]: org.apache.hadoop.hdfs.DFSClient - Could not obtain block: BP-1875268269-172.17.0.2-1493988757398:blk_1073743305_2613 file=/user/raj_ops/books.csv No live nodes contain current block Block locations: DatanodeInfoWithStorage[172.17.0.2:50010,DS-3fd6f5d7-12ac-4a3c-8890-77034935b5e6,DISK] Dead nodes: DatanodeInfoWithStorage[172.17.0.2:50010,DS-3fd6f5d7-12ac-4a3c-8890-77034935b5e6,DISK]. Throwing a BlockMissingException
[WARN ]: org.apache.hadoop.hdfs.DFSClient - Could not obtain block: BP-1875268269-172.17.0.2-1493988757398:blk_1073743305_2613 file=/user/raj_ops/books.csv No live nodes contain current block Block locations: DatanodeInfoWithStorage[172.17.0.2:50010,DS-3fd6f5d7-12ac-4a3c-8890-77034935b5e6,DISK] Dead nodes: DatanodeInfoWithStorage[172.17.0.2:50010,DS-3fd6f5d7-12ac-4a3c-8890-77034935b5e6,DISK]. Throwing a BlockMissingException
Exception in component tHDFSInput_2 (OnBoard_hdfs)
org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-1875268269-172.17.0.2-1493988757398:blk_1073743305_2613 file=/user/raj_ops/books.csv
at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:994)
at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:638)
at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:888)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:945)
at java.io.DataInputStream.read(Unknown Source)
at java.io.FilterInputStream.read(Unknown Source)
at java.io.PushbackInputStream.read(Unknown Source)
at org.talend.fileprocess.UnicodeReader.<init>(UnicodeReader.java:25)
at org.talend.fileprocess.TOSDelimitedReader.<init>(TOSDelimitedReader.java:77)
at org.talend.fileprocess.FileInputDelimited.<init>(FileInputDelimited.java:93)
at sandbox_hdp.onboard_hdfs_0_1.OnBoard_hdfs.tHDFSInput_2Process(OnBoard_hdfs.java:740)
at sandbox_hdp.onboard_hdfs_0_1.OnBoard_hdfs.runJobInTOS(OnBoard_hdfs.java:1180)
[WARN ]: org.apache.hadoop.hdfs.DFSClient - DFS Read
org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-1875268269-172.17.0.2-1493988757398:blk_1073743305_2613 file=/user/raj_ops/books.csv
at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:994)
at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:638)
at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:888)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:945)
at java.io.DataInputStream.read(Unknown Source)
at java.io.FilterInputStream.read(Unknown Source)
at java.io.PushbackInputStream.read(Unknown Source)
at org.talend.fileprocess.UnicodeReader.<init>(UnicodeReader.java:25)
at org.talend.fileprocess.TOSDelimitedReader.<init>(TOSDelimitedReader.java:77)
at org.talend.fileprocess.FileInputDelimited.<init>(FileInputDelimited.java:93)
at sandbox_hdp.onboard_hdfs_0_1.OnBoard_hdfs.tHDFSInput_2Process(OnBoard_hdfs.java:740)
at sandbox_hdp.onboard_hdfs_0_1.OnBoard_hdfs.runJobInTOS(OnBoard_hdfs.java:1180)
at sandbox_hdp.onboard_hdfs_0_1.OnBoard_hdfs.main(OnBoard_hdfs.java:1029)
[statistics] disconnected
Job OnBoard_hdfs finished at 13:52 06/08/2017. [Exit code=1]
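The warnings above all show the same pattern: the client reaches the NameNode (it knows the block locations), but every attempt to read the block from the DataNode at 172.17.0.2:50010 times out, so that node lands in deadNodes and the read ends in a BlockMissingException. 172.17.0.2 is a Docker-internal address, which the Dockerized Hortonworks sandbox advertises but which is typically not reachable from a Studio running on the host machine. Assuming that is the cause here, a commonly suggested workaround is to make the HDFS client connect to DataNodes by hostname instead of the advertised IP:

```xml
<!-- Illustrative Hadoop client property; in Talend Studio it can be added as a
     key/value pair in the "Hadoop Properties" table of the cluster/HDFS
     connection. Assumes the DataNode hostname (e.g. the sandbox host) resolves
     from the Studio machine and its ports are forwarded to it. -->
<property>
  <name>dfs.client.use.datanode.hostname</name>
  <value>true</value>
</property>
```

This only changes how the client picks the address to dial; the DataNode hostname still has to resolve to an address the Studio machine can actually reach.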

 

3 REPLIES
Moderator

Re: tHDFSInput Could not obtain block

Hi,

It looks like a connectivity issue. Is it OK when you change to a different ID?

Do you have a permission issue on the cluster? Please check whether the user you are using in Talend is able to read and write data in HDFS.

Best regards

Sabrina

--
Don't forget to give kudos when a reply is helpful and click Accept the solution when you think you're good with it.
Four Stars

Re: tHDFSInput Could not obtain block

Hi xdshi,

I tried to run my job with the user "admin" and with another user, raj_ops. With both users I get exactly the same problem. In the screenshot, I highlighted in red the file (books.csv) that I use in my job. Look at the permissions for this file.

 

To check the permissions of the two users, I put a file into HDFS with PuTTY, and it works well, without any error or warning message (highlighted in blue).

 

(screenshot: 2017-08-11_09h10_351.png)

Do you have any other suggestions?

 

Regards 

Marc

Moderator

Re: tHDFSInput Could not obtain block

Hello,

Could you please show us the full stack trace?

Have you tried browsing the NameNode to check whether this block is really missing?
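Alongside browsing the NameNode, it can help to check whether the DataNode port from the log is reachable from the Studio machine at all, since "No live nodes contain current block" can mean a network problem rather than a truly missing block. A minimal, Talend-independent probe (the host and port below are just the values taken from the log):

```python
import socket

def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Probe the DataNode transfer port the log complains about:
# can_connect("172.17.0.2", 50010)
```

If this returns False from the Studio machine while the NameNode UI shows the block as healthy, the problem is connectivity to the DataNode, not a missing block.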

Best regards

Sabrina
