HDInsight 3.2 - Unable to connect to HIVE

I was able to set up an HDInsight cluster in Metadata and pull in the HCatalog details with no issue (see attached screenshot), but I do not see the context menu options for the cluster's other Hadoop services (Hive, HDFS, etc.).
I also tried to create a Hive connection using the DBConnection wizard. I selected the HDInsight cluster I had created in the repository and typed in the login credentials, but it gave me an error saying that it could not find the Hive driver (see attached screenshot).
One additional note: I spoke with an Azure engineer, and he mentioned that the JDBC connection string Talend sets by default will not work (and I cannot change it; it is grayed out in the DBConnection wizard when I select my HDInsight cluster). He said that on HDInsight, HiveServer2 runs in HTTP mode with SSL enabled, so the correct connection string would be something like:
jdbc:hive2://CLUSTERNAME.azurehdinsight.net:443/default;ssl=true?hive.server2.transport.mode=http;hive.server2.thrift.http.path=/hive2
Please let me know if there is a workaround to get this working. Thanks!
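
For anyone who wants to assemble that connection string programmatically, here is a minimal sketch in Java. The class name HiveUrlSketch and its build method are hypothetical helpers invented for illustration; they are not part of Talend or the Hive JDBC driver. The sketch only concatenates the pieces of the string quoted above (session settings after the first ";", Hive configuration after the "?") and does not open a connection.

```java
// Hypothetical helper for illustration only; not part of Talend or the Hive driver.
final class HiveUrlSketch {

    // Assembles a HiveServer2 URL of the shape quoted in the post:
    // jdbc:hive2://<host>:443/<database>;ssl=true?<hive conf list>
    static String build(String clusterName, String database, String httpPath) {
        String host = clusterName + ".azurehdinsight.net";
        return "jdbc:hive2://" + host + ":443/" + database
                + ";ssl=true"                                   // session setting: use SSL
                + "?hive.server2.transport.mode=http"           // HiveServer2 runs in HTTP mode
                + ";hive.server2.thrift.http.path=" + httpPath; // HTTP path of the Thrift endpoint
    }

    public static void main(String[] args) {
        // CLUSTERNAME is the same placeholder used in the post.
        System.out.println(build("CLUSTERNAME", "default", "/hive2"));
    }
}
```

Swapping in a real cluster name for CLUSTERNAME should give a string that can be pasted into any JDBC client that has the Hive driver on its classpath.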

Re: HDInsight 3.2 - Unable to connect to HIVE

I tried running one of the big data sandbox example jobs and just switched out the Hive connection details. The connection seems to work fine, but the job fails when it hits the tHiveRow component. Here is the log output:
Starting job Simple_hive_row_input at 15:34 07/12/2015.
connecting to socket on port 3537
connected
2015-12-07 15:34:53|FT0e20|FT0e20|FT0e20|24816|LOCAL_PROJECT|Simple_hive_row_input|_c4o6EK6eEeGlE50lAnwEbA|0.1|Default||begin||
The server encountered an unknown failure: 
The server encountered an unknown failure: 
The server encountered an unknown failure: 
The server encountered an unknown failure: 
The server encountered an unknown failure: 
null
Exception in component tHiveRow_5
javax.ws.rs.InternalServerErrorException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:525)
at org.apache.cxf.jaxrs.client.AbstractClient.convertToWebApplicationException(AbstractClient.java:461)
at org.apache.cxf.jaxrs.client.WebClient.doInvoke(WebClient.java:860)
at org.apache.cxf.jaxrs.client.WebClient.doInvoke(WebClient.java:831)
at org.apache.cxf.jaxrs.client.WebClient.invoke(WebClient.java:394)
at org.apache.cxf.jaxrs.client.WebClient.post(WebClient.java:420)
at org.talend.bigdata.launcher.webhcat.QueryJob.callWS(QueryJob.java:56)
at local_project.simple_hive_row_input_0_1.Simple_hive_row_input.tHiveRow_5Process(Simple_hive_row_input.java:1744)
at local_project.simple_hive_row_input_0_1.Simple_hive_row_input.tHiveConnection_1Process(Simple_hive_row_input.java:1624)
at local_project.simple_hive_row_input_0_1.Simple_hive_row_input.tFixedFlowInput_1Process(Simple_hive_row_input.java:1447)
at local_project.simple_hive_row_input_0_1.Simple_hive_row_input.tCreateTemporaryFile_1Process(Simple_hive_row_input.java:798)
at local_project.simple_hive_row_input_0_1.Simple_hive_row_input.runJobInTOS(Simple_hive_row_input.java:7308)
at local_project.simple_hive_row_input_0_1.Simple_hive_row_input.main(Simple_hive_row_input.java:7110)
2015-12-07 15:34:56|FT0e20|FT0e20|FT0e20|LOCAL_PROJECT|Simple_hive_row_input|Default|6|Java Exception|tHiveRow_5|javax.ws.rs.InternalServerErrorException:null|1
3352 milliseconds
2015-12-07 15:34:56|FT0e20|FT0e20|FT0e20|24816|LOCAL_PROJECT|Simple_hive_row_input|_c4o6EK6eEeGlE50lAnwEbA|0.1|Default||end|failure|3352
disconnected
Job Simple_hive_row_input ended at 15:34 07/12/2015.
Any idea what is going on here?

Re: HDInsight 3.2 - Unable to connect to HIVE

Here is the log I'm getting from Talend now. I'm able to connect to Azure Storage, but the problem is with Hive on HDInsight, where it shows Service Unavailable on all 5 tries:
Starting job Azure_Storage_Connection_Test at 17:14 07/12/2015.
connecting to socket on port 3779
connected
2015-12-07 17:14:41|lBXKfc|lBXKfc|lBXKfc|27088|LOCAL_PROJECT|Azure_Storage_Connection_Test|_74A8UJoJEeWnvKwsvbj34A|0.1|Default||begin||
siqp container exists: true
siqp is current container; current blob is: HdiSamples

Service Unavailable
Service Unavailable
Service Unavailable
Service Unavailable
Service Unavailable
null
Exception in component tHiveRow_1
javax.ws.rs.InternalServerErrorException
      at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
      at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
      at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
      at java.lang.reflect.Constructor.newInstance(Constructor.java:525)
      at org.apache.cxf.jaxrs.client.AbstractClient.convertToWebApplicationException(AbstractClient.java:461)
      at org.apache.cxf.jaxrs.client.WebClient.doInvoke(WebClient.java:860)
      at org.apache.cxf.jaxrs.client.WebClient.doInvoke(WebClient.java:831)
      at org.apache.cxf.jaxrs.client.WebClient.invoke(WebClient.java:394)
      at org.apache.cxf.jaxrs.client.WebClient.post(WebClient.java:420)
      at org.talend.bigdata.launcher.webhcat.QueryJob.callWS(QueryJob.java:56)
      at local_project.azure_storage_connection_test_0_1.Azure_Storage_Connection_Test.tHiveRow_1Process(Azure_Storage_Connection_Test.java:1632)
      at local_project.azure_storage_connection_test_0_1.Azure_Storage_Connection_Test.tHiveConnection_2Process(Azure_Storage_Connection_Test.java:1359)
      at local_project.azure_storage_connection_test_0_1.Azure_Storage_Connection_Test.tSetProxy_1Process(Azure_Storage_Connection_Test.java:640)
      at local_project.azure_storage_connection_test_0_1.Azure_Storage_Connection_Test.runJobInTOS(Azure_Storage_Connection_Test.java:5928)
      at local_project.azure_storage_connection_test_0_1.Azure_Storage_Connection_Test.main(Azure_Storage_Connection_Test.java:5778)
2015-12-07 17:19:02|lBXKfc|lBXKfc|lBXKfc|LOCAL_PROJECT|Azure_Storage_Connection_Test|Default|6|Java Exception|tHiveRow_1|javax.ws.rs.InternalServerErrorException:null|1
260880 milliseconds
2015-12-07 17:19:02|lBXKfc|lBXKfc|lBXKfc|27088|LOCAL_PROJECT|Azure_Storage_Connection_Test|_74A8UJoJEeWnvKwsvbj34A|0.1|Default||end|failure|260880
disconnected

Job Azure_Storage_Connection_Test ended at 17:19 07/12/2015.
Attached Job File:
Azure_Storage_Connection_Test.zip.zip

Re: HDInsight 3.2 - Unable to connect to HIVE

Update: I got it working! :) Sort of, lol... but it requires an SSH connection to the Azure HDInsight cluster; see details below.
So a few notes for whoever reads this:
· The default Talend HDInsight 3.2 connector in TOS 6.1 does not work for Hive connections. I have a hunch this is because TOS is calling an incorrect connection string. I found the line below in the auto-generated Java code. I'm guessing it may work with a previous version, or for other HDInsight components (e.g., HCatalog), but it does not work for HiveServer2, which needs the jdbc:hive2 protocol instead of HTTPS:
instance_tHiveConnection_2.setWebhcatEndpoint("https", "CLUSTERNAME.azurehdinsight.net" + ":" + "443");
· To get a connection string that did work, I had to set up a custom connector as a standalone connection. This is the auto-generated Java code for the connection string that worked:
String url_tHiveConnection_2 = "jdbc:hive2://" + "CLUSTERNAME.azurehdinsight.net" + ":" + "443" + "/" + "default";
String additionalJdbcSettings_tHiveConnection_2 = "ssl=true?hive.server2.transport.mode=http;hive.server2.thrift.http.path=/hive2";
· This is the really weird part: I had to SSH into the HDInsight cluster to connect successfully with the above JDBC connection string, and I have no idea why. That is a public HTTP endpoint, and I was able to connect to the same HiveServer2 from other clients (SQuirreL SQL, Visual Studio, etc.) without having to SSH in, so I'm wondering why HDInsight is refusing a direct connection from Talend. I've logged a support ticket with Azure to see what they say about that. Anyway, I set up dynamic port forwarding (I created a local SOCKS proxy on my laptop, similar to these instructions: ) and was able to have Talend connect through my SSH connection via that SOCKS proxy.
· Once I had the local SOCKS proxy running on my laptop, I was able to run the job and get data back from Hive (I just ran a simple query in the tHiveInput component: "SELECT * FROM qpodsadm.cst_ncrf_customer LIMIT 200").
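
For readers asking how the pieces fit together outside of Talend, below is a rough sketch in Java of the same idea: point the JVM at a local SOCKS proxy, then open the JDBC connection with the URL from the generated code. Several things here are assumptions and not taken from the job itself: the class and method names are hypothetical, the proxy is assumed to listen on localhost port 1080 (for example, one opened with ssh -D 1080), the two generated strings are assumed to be joined with ";", and it is assumed that the Hive driver's HTTP transport honors the JVM-wide socksProxyHost and socksProxyPort system properties. The connect method also needs the Hive JDBC driver on the classpath, so it is not called in main.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical sketch; names are illustrative and not generated by Talend.
final class SocksProxySketch {

    // Routes plain JVM sockets through a SOCKS proxy via standard Java networking properties.
    static void useSocksProxy(String host, int port) {
        System.setProperty("socksProxyHost", host);
        System.setProperty("socksProxyPort", Integer.toString(port));
    }

    // Joins the base URL and the additional JDBC settings with ";" (assumed separator).
    static String jdbcUrl(String baseUrl, String additionalSettings) {
        return additionalSettings.isEmpty() ? baseUrl : baseUrl + ";" + additionalSettings;
    }

    // Opens the connection; requires the Hive JDBC driver on the classpath.
    static Connection connect(String url, String user, String password) throws SQLException {
        return DriverManager.getConnection(url, user, password);
    }

    public static void main(String[] args) {
        useSocksProxy("localhost", 1080); // assumed local proxy address and port
        String url = jdbcUrl(
                "jdbc:hive2://" + "CLUSTERNAME.azurehdinsight.net" + ":" + "443" + "/" + "default",
                "ssl=true?hive.server2.transport.mode=http;hive.server2.thrift.http.path=/hive2");
        System.out.println(url); // connect(url, user, password) would be called from here
    }
}
```

Setting the proxy through system properties affects every socket the JVM opens, which is convenient for a quick test but may not be what you want in a job that also talks to other services.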

Re: HDInsight 3.2 - Unable to connect to HIVE

Hi benjaminthatcher,
I have exactly the same problem. Could you please describe your solution in a little more detail? I would really appreciate that.
Best Regards,
Pedro Neves

Re: HDInsight 3.2 - Unable to connect to HIVE

I have exactly the same problem, but I'm struggling to understand the solution that you provided. Can you please give us some more details?
Thanks in advance.