One Star

Null Pointer Exception in tHiveConnection using HDP 2.0 sandbox

Hi,
I am working through Hortonworks Sandbox Examples in TOS v5.4.1. The HDFS examples worked perfectly. I then moved on to the HIVE examples, and am encountering an issue. Each time I run the job it fails in the tHiveConnection component with the following error:
Starting job Simple_hive_row_input at 17:03 12/02/2014.
connecting to socket on port 4021
connected
: org.apache.hadoop.util.Shell - Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:278)
at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:300)
at org.apache.hadoop.util.Shell.<clinit>(Shell.java:293)
at org.apache.hadoop.hive.conf.HiveConf$ConfVars.findHadoopBinary(HiveConf.java:917)
at org.apache.hadoop.hive.conf.HiveConf$ConfVars.<clinit>(HiveConf.java:238)
at org.apache.hadoop.hive.conf.HiveConf.<clinit>(HiveConf.java:74)
at org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.<init>(HiveServer.java:122)
at org.apache.hadoop.hive.jdbc.HiveConnection.<init>(HiveConnection.java:95)
at org.apache.hadoop.hive.jdbc.HiveDriver.connect(HiveDriver.java:106)
at java.sql.DriverManager.getConnection(Unknown Source)
at java.sql.DriverManager.getConnection(Unknown Source)
at bigdatademos.simple_hive_row_input_0_1.Simple_hive_row_input.tHiveConnection_1Process(Simple_hive_row_input.java:1458)
at bigdatademos.simple_hive_row_input_0_1.Simple_hive_row_input.tFixedFlowInput_1Process(Simple_hive_row_input.java:1336)
at bigdatademos.simple_hive_row_input_0_1.Simple_hive_row_input.tCreateTemporaryFile_1Process(Simple_hive_row_input.java:658)
at bigdatademos.simple_hive_row_input_0_1.Simple_hive_row_input.runJobInTOS(Simple_hive_row_input.java:3295)
at bigdatademos.simple_hive_row_input_0_1.Simple_hive_row_input.main(Simple_hive_row_input.java:3112)
: org.apache.hadoop.conf.Configuration.deprecation - mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
: org.apache.hadoop.conf.Configuration.deprecation - mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
: org.apache.hadoop.conf.Configuration.deprecation - mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
: org.apache.hadoop.conf.Configuration.deprecation - mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
: org.apache.hadoop.conf.Configuration.deprecation - mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
: org.apache.hadoop.conf.Configuration.deprecation - mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
: org.apache.hadoop.conf.Configuration.deprecation - mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
: org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
: org.apache.hadoop.hive.metastore.HiveMetaStore - 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
: org.apache.hadoop.hive.metastore.ObjectStore - ObjectStore, initialize called
: DataNucleus.Persistence - Property datanucleus.cache.level2 unknown - will be ignored
: DataNucleus.Connection - BoneCP specified but not present in CLASSPATH (or one of dependencies)
: DataNucleus.Connection - BoneCP specified but not present in CLASSPATH (or one of dependencies)
: org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
: org.apache.hadoop.hive.metastore.ObjectStore - Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
: org.apache.hadoop.hive.metastore.ObjectStore - Initialized ObjectStore
: org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Exception in component tHiveConnection_1
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:286)
at org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.<init>(HiveServer.java:137)
at org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.<init>(HiveServer.java:122)
at org.apache.hadoop.hive.jdbc.HiveConnection.<init>(HiveConnection.java:95)
at org.apache.hadoop.hive.jdbc.HiveDriver.connect(HiveDriver.java:106)
at java.sql.DriverManager.getConnection(Unknown Source)
at java.sql.DriverManager.getConnection(Unknown Source)
at bigdatademos.simple_hive_row_input_0_1.Simple_hive_row_input.tHiveConnection_1Process(Simple_hive_row_input.java:1458)
at bigdatademos.simple_hive_row_input_0_1.Simple_hive_row_input.tFixedFlowInput_1Process(Simple_hive_row_input.java:1336)
at bigdatademos.simple_hive_row_input_0_1.Simple_hive_row_input.tCreateTemporaryFile_1Process(Simple_hive_row_input.java:658)
disconnected
at bigdatademos.simple_hive_row_input_0_1.Simple_hive_row_input.runJobInTOS(Simple_hive_row_input.java:3295)
at bigdatademos.simple_hive_row_input_0_1.Simple_hive_row_input.main(Simple_hive_row_input.java:3112)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException
at org.apache.hadoop.hive.ql.metadata.HiveUtils.getAuthenticator(HiveUtils.java:368)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:278)
... 11 more
Caused by: java.lang.NullPointerException
at java.lang.ProcessBuilder.start(Unknown Source)
at org.apache.hadoop.util.Shell.runCommand(Shell.java:404)
at org.apache.hadoop.util.Shell.run(Shell.java:379)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:678)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:661)
at org.apache.hadoop.security.ShellBasedUnixGroupsMapping.getUnixGroups(ShellBasedUnixGroupsMapping.java:83)
at org.apache.hadoop.security.ShellBasedUnixGroupsMapping.getGroups(ShellBasedUnixGroupsMapping.java:52)
at org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback.getGroups(JniBasedUnixGroupsMappingWithFallback.java:50)
at org.apache.hadoop.security.Groups.getGroups(Groups.java:89)
at org.apache.hadoop.security.UserGroupInformation.getGroupNames(UserGroupInformation.java:1352)
at org.apache.hadoop.hive.ql.security.HadoopDefaultAuthenticator.setConf(HadoopDefaultAuthenticator.java:62)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
at org.apache.hadoop.hive.ql.metadata.HiveUtils.getAuthenticator(HiveUtils.java:365)
... 12 more
Job Simple_hive_row_input ended at 17:05 12/02/2014.
My settings in the tHiveConnection component are as follows:
Distribution: Hortonworks
Hive Version: Hortonworks Data Platform V2.0.0(BigWheel)
Connection Mode: Embedded (only choice)
Hive Server: Hive1
Host: context.hive_host (context states this is sandbox)
Port: context.hive_port (context states this is 9083; have also tried 9933 and 10000)
Database: "" (have also tried "default")
Username: "" (have also tried "hue" and "hdp")
Password: context.mysql_passwd (context states this is "hdp")
Set Resource Manager: "sandbox:8032" (have also tried localhost:8032, which was the default)
Set Namenode URI: "hdfs://"+ context.namenode_host +":" + context.namenode_port
Any advice on why I am receiving this error, and if there are other values I need to be using in this component to get it to work, would be most appreciated. Thanks in advance for your help!
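For reference, I dug into the first warning in the log: org.apache.hadoop.util.Shell appears to resolve winutils.exe from HADOOP_HOME (or the hadoop.home.dir system property), which is unset on my Windows machine, hence the "null\bin\winutils.exe" path. The sketch below (the path is my own assumption, not from the sandbox docs) shows how that property can be set before the Hive driver loads; I suspect it would only silence the winutils warning while the NullPointerException remains:

```java
public class WinutilsWorkaround {
    public static void main(String[] args) {
        // org.apache.hadoop.util.Shell reads HADOOP_HOME / hadoop.home.dir
        // to locate bin\winutils.exe; when both are unset, the lookup
        // produces the "null\bin\winutils.exe" path seen in the log.
        // "C:\\hadoop" is a hypothetical path -- point it at a folder
        // that actually contains bin\winutils.exe.
        System.setProperty("hadoop.home.dir", "C:\\hadoop");
        System.out.println(System.getProperty("hadoop.home.dir"));
    }
}
```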
2 REPLIES
Employee

Re: Null Pointer Exception in tHiveConnection using HDP 2.0 sandbox

Hi,
This time, you are facing a known issue in HDP 2.0.
Here is the YARN JIRA: https://issues.apache.org/jira/browse/YARN-1298
It is currently not possible to submit jobs from a Windows client when the cluster is installed on Linux, and the reverse is also true.
You would need to execute your Talend job from a Linux machine.
Additionally, the resource manager port for HDP 2.0 is not 8032 but 8050.
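For reference, the embedded tHiveConnection boils down to a plain HiveServer1 JDBC call like the sketch below (host, port, and database values are assumed from the context variables in your post, not something I can verify against your sandbox); once the job runs from a Linux client, something along these lines should connect:

```java
public class HiveConnectionSketch {

    // Builds a HiveServer1 JDBC URL, the format expected by
    // org.apache.hadoop.hive.jdbc.HiveDriver: jdbc:hive://host:port/db
    static String hiveUrl(String host, int port, String db) {
        return "jdbc:hive://" + host + ":" + port + "/" + db;
    }

    public static void main(String[] args) {
        // Host/port/database assumed from the original post's context.
        String url = hiveUrl("sandbox", 10000, "default");
        System.out.println(url);

        // On a Linux client with the Hive JDBC jars on the classpath:
        // Class.forName("org.apache.hadoop.hive.jdbc.HiveDriver");
        // java.sql.Connection conn =
        //     java.sql.DriverManager.getConnection(url, "", "");
    }
}
```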
Regards,
One Star

Re: Null Pointer Exception in tHiveConnection using HDP 2.0 sandbox

Hi Remy,
Thanks for the response. I couldn't quite tell whether the YARN JIRA describes exactly the same issue, since it doesn't mention a null pointer exception, which is what I received. However, if you are correct that this is the cause, would it then affect all YARN-enabled Hadoop distributions? That is, can I not connect to Hive and submit jobs from Talend running on Windows to a Linux cluster on another distribution such as Cloudera? In other words, is this a YARN issue across all Hadoop distributions rather than a Hortonworks-specific problem?
I am trying to identify what options I have, as the use case I am evaluating would very likely rely on Windows-based clients connecting and submitting jobs to a Linux-based cluster, and I'd like to determine if this is currently possible with Talend regardless of the specific Hadoop distribution.
Thanks!