Connecting to Hive in HA Cluster / Unknown Host Error

I'm attempting to set up a very simple Hive query against a Hortonworks HDP 2.2 cluster that is HA-aware.
I followed this KB topic: https://help.talend.com/search/all?query=Enabling%252Bthe%252BHDFS%252BHigh%252BAvailability%252Bfea...
I am still getting an unknown host error, even after configuring HA as described in the article.

: org.apache.hadoop.hive.ql.exec.Utilities - Processing alias <topic>
: org.apache.hadoop.hive.ql.exec.Utilities - Adding input file hdfs://clustername/<some_directory>/<topic>/day=2015-02-09/time=00-00
: org.apache.hadoop.hive.ql.exec.Utilities - Content Summary not cached for hdfs://clustername/<some_directory>/<topic>/day=2015-02-09/time=00-00
java.lang.IllegalArgumentException: java.net.UnknownHostException: clustername
at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:418)
at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:231)

I'm using Talend Big Data 5.6 on Windows 7. Everything seems normal before this: the flow connects to HCatalog (for metadata) and HDFS (for the temp directory) fine. The flow fails when Hive returns the cluster name instead of one of the NameNodes.
3 REPLIES

Re: Connecting to Hive in HA Cluster / Unknown Host Error

Hi,
Which connection mode are you using for Hive? Could you also post screenshots of the Basic settings tab of the Hive connection component?

Best regards
Sabrina

Re: Connecting to Hive in HA Cluster / Unknown Host Error

The connection mode was Embedded. I was able to get the flow to work properly by switching to Standalone mode with Hive 2. This doesn't directly answer the HA question, but it works.
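
For anyone who wants to check the Standalone / Hive 2 route outside of Talend, here is a minimal sketch of a plain JDBC connection to HiveServer2. It assumes the Hive JDBC driver is on the classpath; the host name, port 10000, the "default" database, and the empty password are placeholders I made up, not values from this cluster. Since the query goes to HiveServer2 instead of running a Hive client inside the job, the Windows machine presumably no longer has to resolve the HDFS nameservice itself.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

class HiveServer2Sketch {
    // Builds a HiveServer2 JDBC URL of the form jdbc:hive2://host:port/database.
    static String jdbcUrl(String host, int port, String database) {
        return "jdbc:hive2://" + host + ":" + port + "/" + database;
    }

    // Opens a connection and lists tables. Needs the Hive JDBC driver on the
    // classpath; the empty password is an assumption for an unsecured cluster.
    static void showTables(String url, String user) throws SQLException {
        try (Connection conn = DriverManager.getConnection(url, user, "");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SHOW TABLES")) {
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        }
    }

    public static void main(String[] args) throws SQLException {
        // Hypothetical host; replace with the real HiveServer2 host.
        String url = jdbcUrl("hiveserver.example.com", 10000, "default");
        System.out.println(url);
        // Only try to connect when a user name is passed on the command line.
        if (args.length > 0) {
            showTables(url, args[0]);
        }
    }
}
```

If this connects and lists tables, the Standalone settings in the Hive component should be able to use the same host, port, and database.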

Re: Connecting to Hive in HA Cluster / Unknown Host Error

I'm sure you've already figured this out, but if not, here's a post on Hive HA.
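
On the original error: the createNonHAProxy frame in the stack trace suggests the HDFS client in Embedded mode did not see any HA settings for "clustername" and tried to resolve it as an ordinary host name. The client-side properties that normally tell Hadoop a name is a logical nameservice are sketched below. The property keys and the ConfiguredFailoverProxyProvider class are the standard HDFS ones; the NameNode IDs (nn1, nn2), host names, and port 8020 are placeholders I made up, so take the real values from the cluster's hdfs-site.xml. I'm assuming these need to reach the Hive connection as well as the HDFS one, which I have not verified in Talend 5.6.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class HaClientProps {
    // Builds the client-side HDFS HA properties for one logical nameservice.
    // rpcAddressById maps a NameNode ID (e.g. "nn1") to its host:port.
    static Map<String, String> build(String nameservice, Map<String, String> rpcAddressById) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("dfs.nameservices", nameservice);
        props.put("dfs.ha.namenodes." + nameservice,
                String.join(",", rpcAddressById.keySet()));
        for (Map.Entry<String, String> e : rpcAddressById.entrySet()) {
            props.put("dfs.namenode.rpc-address." + nameservice + "." + e.getKey(),
                    e.getValue());
        }
        props.put("dfs.client.failover.proxy.provider." + nameservice,
                "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical NameNode IDs and addresses; use the values from hdfs-site.xml.
        Map<String, String> namenodes = new LinkedHashMap<>();
        namenodes.put("nn1", "namenode1.example.com:8020");
        namenodes.put("nn2", "namenode2.example.com:8020");
        // Print key=value pairs that can be copied into a Hadoop properties table.
        build("clustername", namenodes)
                .forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

If any of these keys is missing on the client side, I would expect the same UnknownHostException on the nameservice name, since nothing marks it as logical.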