Four Stars

Missing winutils.exe (Failed to locate the winutils binary in the hadoop binary path)

Hi,

 

I am using Talend Open Studio for Big Data 6.4.1 on Windows 10 with Hadoop 2.0 in the cloud (provided by the IBM demo cloud). The output of my job (attached as an image) is:

[WARN ]: org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[ERROR]: org.apache.hadoop.util.Shell - Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:378)
at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:393)
at org.apache.hadoop.util.Shell.<clinit>(Shell.java:386)
at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:79)
at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:130)
at org.apache.hadoop.security.Groups.<init>(Groups.java:94)
at org.apache.hadoop.security.Groups.<init>(Groups.java:74)
at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:303)
at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:283)
at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:260)
at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:804)
at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:774)
at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:647)
at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2806)
at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2798)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2661)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:379)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:178)
at org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:72)
at org.apache.pig.backend.hadoop.datastorage.HDataStorage.<init>(HDataStorage.java:58)
at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:215)
at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:122)
at org.apache.pig.impl.PigContext.connect(PigContext.java:301)
at org.apache.pig.PigServer.<init>(PigServer.java:220)
at org.apache.pig.PigServer.<init>(PigServer.java:205)
at local_project.aggregate_movie_director_0_1.aggregate_movie_director.tPigLoad_1Process(aggregate_movie_director.java:1330)
at local_project.aggregate_movie_director_0_1.aggregate_movie_director.runJobInTOS(aggregate_movie_director.java:2474)
at local_project.aggregate_movie_director_0_1.aggregate_movie_director.main(aggregate_movie_director.java:2323)
[WARN ]: org.apache.pig.PigServer - Empty string specified for jar path
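
The `null` in `null\bin\winutils.exe` suggests that no Hadoop home directory was resolved before the path was built. The Java sketch below imitates such a lookup in simplified form. It is not Hadoop's actual code: the class and method names are hypothetical, and the lookup order (the `hadoop.home.dir` system property first, then the `HADOOP_HOME` environment variable) is an assumption.

```java
// Simplified illustration (not Hadoop's actual code) of how an unset
// Hadoop home can end up as the literal text "null" in the reported path.
final class WinutilsPathDemo {

    // Hypothetical helper: joins the home directory and the binary name
    // in the shape shown by the error message.
    static String qualifiedBinPath(String hadoopHome, String executable) {
        // Concatenating a null String reference yields the text "null".
        return hadoopHome + "\\bin\\" + executable;
    }

    // Assumed lookup order: system property first, then environment variable.
    static String resolveHadoopHome() {
        String home = System.getProperty("hadoop.home.dir");
        return home != null ? home : System.getenv("HADOOP_HOME");
    }

    public static void main(String[] args) {
        System.out.println(qualifiedBinPath(resolveHadoopHome(), "winutils.exe"));
    }
}
```

If neither setting is present, the printed path starts with `null`, which matches the shape of the exception message above.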

 

I looked at solutions at:

https://community.talend.com/t5/Sandbox/ERROR-org-apache-hadoop-util-Shell-Failed-to-locate-the-winu...

https://stackoverflow.com/questions/19620642/failed-to-locate-the-winutils-binary-in-the-hadoop-bina...

https://community.talend.com/t5/Sandbox/Missing-winutils-exe-Failed-to-locate-the-winutils-binary-in...

 

but it seems that the missing winutils.exe must be placed in the Hadoop home directory. I didn't install the Hadoop cluster myself, since it is a cloud 'as a service' cluster. Do you have any suggestions?

 

Best regards, Sélim

 

 

5 REPLIES
Four Stars

Re: Missing winutils.exe (Failed to locate the winutils binary in the hadoop binary path)

Hi again,

 

I followed these steps:

"specify the Hadoop home directory that contains the winutils.exe program

  • If you don't have a local Hadoop install on Windows you can download winutils.exe and then:
    • create a Hadoop home directory
    • place winutils.exe in a bin directory under that Hadoop home directory
  • use a system property -Dhadoop.home.dir to point to the Hadoop home directory when you start the Java process. An example:
    java -D"hadoop.home.dir=C:\Users\<username>\Hadoop" -jar my.jar"
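
For a standalone Java program, the quoted steps can also be sketched in code. This is a minimal sketch, not Talend's or Hadoop's own mechanism: the class `HadoopHomeCheck` and its methods are hypothetical. It checks that `bin\winutils.exe` exists under a given directory and then sets the `hadoop.home.dir` system property. Since the stack trace shows the lookup happening in a static initializer (`Shell.<clinit>`), the property would have to be set before any Hadoop class is loaded.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: validates a local Hadoop home and registers it
// through the hadoop.home.dir system property.
final class HadoopHomeCheck {

    // True if <hadoopHome>/bin/winutils.exe exists as a regular file.
    static boolean hasWinutils(Path hadoopHome) {
        return Files.isRegularFile(hadoopHome.resolve("bin").resolve("winutils.exe"));
    }

    // Sets hadoop.home.dir, failing early if winutils.exe is missing.
    // Call this before the first Hadoop class is loaded.
    static void useHadoopHome(Path hadoopHome) {
        if (!hasWinutils(hadoopHome)) {
            throw new IllegalStateException(
                "winutils.exe not found under " + hadoopHome.resolve("bin"));
        }
        System.setProperty("hadoop.home.dir", hadoopHome.toAbsolutePath().toString());
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java HadoopHomeCheck <hadoop-home-dir>");
            return;
        }
        useHadoopHome(Paths.get(args[0]));
        System.out.println("hadoop.home.dir=" + System.getProperty("hadoop.home.dir"));
    }
}
```

Failing early with a clear message makes a wrong directory easier to spot than the `null\bin\winutils.exe` error.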

I downloaded hadoop-common-2.2.0-bin-master, which contains winutils.exe, and placed it in a new directory, "C:\hadoop_home\hadoop-common-2.2.0-bin-master\bin". I then set the VM argument -D"hadoop.home.dir=C:\hadoop_home\hadoop-common-2.2.0-bin-master" in Advanced parameter -> JVM setting of the job (see attachment talendhadooppig2). It seems to work, since the Java exception "java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries." no longer appears. Nevertheless, I still get some warnings:

 

[WARN ]: org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[WARN ]: org.apache.pig.PigServer - Empty string specified for jar path

 

My job should use Pig to import two tables from HDFS (one main and one reference), perform the lookup mapping, and export two tables to HDFS (results and rejects) into a new directory. Instead, it runs without ever finishing, while the data flow in the design panel ends with only 1 row processed, and the output directory is not created (it is even deleted if I create one beforehand).

 

The warnings are quite explicit, but I don't know what to do about them. Are the builtin-java classes not always applicable? How can I set the jar path for the PigServer?

 

Thank you in advance,

Cordially, Sélim

Moderator

Re: Missing winutils.exe (Failed to locate the winutils binary in the hadoop binary path)

Hello,

Have you already checked this online document, TalendHelpCenter: The missing winutils.exe program in the Big Data Jobs?

Best regards

Sabrina

--
Don't forget to give kudos when a reply is helpful and click Accept the solution when you think you're good with it.
Four Stars

Re: Missing winutils.exe (Failed to locate the winutils binary in the hadoop binary path)

Hi Sabrina,

 

Yes, I checked it and followed the steps. The issue no longer appears (neither in the job output nor in the trace/Java debug screen). However, with trace debugging I see only 1 row with null values flowing through my tPigLoad components (please see the attachment). Besides, I still get 2 warnings:

[WARN ]: org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[WARN ]: org.apache.pig.PigServer - Empty string specified for jar path

 

Best regards, Sélim

Four Stars

Re: Missing winutils.exe (Failed to locate the winutils binary in the hadoop binary path)

By the way, to make it work I had to set this in the JVM setting:

-Dhadoop.home.dir="C:\\hadoop_home\\hadoop-common-2.2.0-bin-master"

with doubled \\.
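
The need for doubled backslashes suggests that the value passes through a layer that treats `\` as an escape character, as Java string literals do. That is an assumption, not something confirmed in this thread. The sketch below shows how the doubled form reads in Java source, plus a forward-slash variant, which `java.io.File` accepts on Windows. Whether the Talend JVM setting accepts the forward-slash form has not been tested here, and the class name is hypothetical.

```java
// Hypothetical helper illustrating backslash escaping in Windows paths.
final class WindowsPathArgs {

    // Replaces every backslash with a forward slash.
    static String toForwardSlashes(String windowsPath) {
        return windowsPath.replace('\\', '/');
    }

    public static void main(String[] args) {
        // In Java source, "\\" denotes a single backslash character.
        String home = "C:\\hadoop_home\\hadoop-common-2.2.0-bin-master";
        System.out.println(home);
        System.out.println("-Dhadoop.home.dir=" + toForwardSlashes(home));
    }
}
```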

 

Maybe I should mark this post as resolved and create a new one?

Moderator

Re: Missing winutils.exe (Failed to locate the winutils binary in the hadoop binary path)

Hi,

Has the missing winutils.exe issue been fixed on your end?

For your current issue, could you please show us the full stack trace? Do you want to extract data from HDFS and load it into Pig?

Best regards

Sabrina

 
