[resolved] tImpalaInput - java.lang.ClassNotFoundException: org.apache.hadoop.hiv

I'm using CDH 5.2.0 with Impala 2.0.0+cdh5.2.0+0 and Hive 0.13.1+cdh5.2.0+221. I can run the query below on this Impala cluster from Hue, but not from Talend Open Studio for Big Data 5.6.0.20141024_1545, where I run it with the tImpalaInput component. The cluster has Kerberos enabled.
Query:  select code, sum(salary) as salarysum from sample_07 group by code order by code;
Error from Talend:
Starting job TOS_ImpalaTesting at 09:53 18/12/2014.
connecting to socket on port 3993
connected
: org.apache.hadoop.util.Shell - Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:324)
at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:339)
at org.apache.hadoop.util.Shell.<clinit>(Shell.java:332)
at org.apache.hadoop.hive.conf.HiveConf$ConfVars.findHadoopBinary(HiveConf.java:918)
at org.apache.hadoop.hive.conf.HiveConf$ConfVars.<clinit>(HiveConf.java:228)
at org.apache.hive.jdbc.HiveConnection.isHttpTransportMode(HiveConnection.java:304)
at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:181)
at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:164)
at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
at java.sql.DriverManager.getConnection(DriverManager.java:571)
at java.sql.DriverManager.getConnection(DriverManager.java:215)
at hadooptesting.tos_impalatesting_0_1.TOS_ImpalaTesting.tImpalaConnection_1Process(TOS_ImpalaTesting.java:354)
at hadooptesting.tos_impalatesting_0_1.TOS_ImpalaTesting.runJobInTOS(TOS_ImpalaTesting.java:1047)
at hadooptesting.tos_impalatesting_0_1.TOS_ImpalaTesting.main(TOS_ImpalaTesting.java:904)
disconnected
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hive/shims/ShimLoader
at org.apache.hive.service.auth.KerberosSaslHelper.getKerberosTransport(KerberosSaslHelper.java:68)
at org.apache.hive.jdbc.HiveConnection.createBinaryTransport(HiveConnection.java:250)
at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:181)
at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:164)
at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
at java.sql.DriverManager.getConnection(DriverManager.java:571)
at java.sql.DriverManager.getConnection(DriverManager.java:215)
at hadooptesting.tos_impalatesting_0_1.TOS_ImpalaTesting.tImpalaConnection_1Process(TOS_ImpalaTesting.java:354)
at hadooptesting.tos_impalatesting_0_1.TOS_ImpalaTesting.runJobInTOS(TOS_ImpalaTesting.java:1047)
at hadooptesting.tos_impalatesting_0_1.TOS_ImpalaTesting.main(TOS_ImpalaTesting.java:904)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.shims.ShimLoader
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 10 more
Job TOS_ImpalaTesting ended at 09:53 18/12/2014.

I did find a Hive JIRA reporting that this (or a similar issue) is fixed in Hive 0.14, which was recently released.
Any help would be appreciated.  Screenshots of my components and process attached below.  Thank you.

1 ACCEPTED SOLUTION


Re: [resolved] tImpalaInput - java.lang.ClassNotFoundException: org.apache.hadoop.hiv

The first error is not really an error. It shows up all over the place when Hadoop client code runs on Windows, and it is an upstream Hadoop issue.
The second error occurs because you are using CDH 5.2 (Impala 2.0), which the Talend components do not currently support. Hadoop, Cloudera and Hortonworks are all very picky about the libraries and versions in use: the client libraries must match the versions on the cluster. To connect to Impala 2.0 on CDH 5.2 you need hive-jdbc-0.13.0.jar or the Cloudera driver, and neither is included in the Talend 5.6 components. (The component also does not appear to include the hive-exec dependency, which is a bug in the component, but fixing that alone wouldn't save you.) You can either use a supported version of CDH (5.1) or update the components yourself to include the correct libraries (hive-jdbc-0.13.x.jar and hive-exec-0.13.x.jar). Welcome to the Hadoop arms race.
http://www.cloudera.com/content/cloudera/en/documentation/cloudera-impala/latest/topics/impala_jdbc....
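If you do update the components yourself, a quick way to confirm that the jars actually ended up on the job's classpath is to probe for the classes named in the stack trace before opening the connection. The following is only a minimal sketch: the class name ClasspathCheck and its helper methods are made up for illustration, and the two class names it probes are taken from the trace above rather than from any official list of required classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Talend or Hive.
public class ClasspathCheck {

    // True if the class can be located by this class's loader.
    // The class is not initialized, so no static blocks are run.
    public static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Returns the subset of the given class names that cannot be found.
    public static List<String> missing(List<String> classNames) {
        List<String> result = new ArrayList<>();
        for (String name : classNames) {
            if (!isOnClasspath(name)) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Class names copied from the stack trace in the question.
        List<String> required = Arrays.asList(
                "org.apache.hive.jdbc.HiveDriver",
                "org.apache.hadoop.hive.shims.ShimLoader");

        List<String> absent = missing(required);
        if (absent.isEmpty()) {
            System.out.println("All probed classes were found.");
        } else {
            System.out.println("Not found on the classpath: " + absent);
        }
    }
}
```

Run it with the same classpath as the generated job. Any class it reports as not found would fail at connect time with the NoClassDefFoundError shown in the question.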
2 REPLIES

Re: [resolved] tImpalaInput - java.lang.ClassNotFoundException: org.apache.hadoop.hiv

jholman - Thank you for the input.  I thought that might be the case based on the error and the Hive JIRA I found. I also replicated the same functionality with the same setup using tHive components and did not run into any issues.
I appreciate your help!