Connecting to Hive in HA Cluster / Unknown Host Error
I'm attempting to set up a very simple Hive query against a Hortonworks HDP 2.2 cluster that is HA-aware. I followed this KB topic: https://help.talend.com/search/all?query=Enabling%252Bthe%252BHDFS%252BHigh%252BAvailability%252Bfeature%252Bin%252Bthe%252BStudio but I am still getting an unknown host error, even with HA configured as described in the article.
: org.apache.hadoop.hive.ql.exec.Utilities - Processing alias <topic>
: org.apache.hadoop.hive.ql.exec.Utilities - Adding input file hdfs://clustername/<some_directory>/<topic>/day=2015-02-09/time=00-00
: org.apache.hadoop.hive.ql.exec.Utilities - Content Summary not cached for hdfs://clustername/<some_directory>/<topic>/day=2015-02-09/time=00-00
java.lang.IllegalArgumentException: java.net.UnknownHostException: clustername
    at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:418)
    at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:231)
I'm using Talend Big Data 5.6 on Windows 7. Everything seems normal before this point: the flow connects to HCatalog (for metadata) and to HDFS (for the temp directory) without problems. The flow fails when Hive hands back the logical cluster name (hdfs://clustername/...) rather than the address of one of the name nodes.
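For reference, the trace goes through createNonHAProxy, which suggests that the client running the Hive step is treating "clustername" as a plain hostname instead of a logical HA nameservice. A minimal sketch of the client-side properties that HDFS HA normally relies on is below, in case it helps to compare against what the job actually passes. The NameNode IDs (nn1, nn2), the hostnames, and port 8020 are assumptions for illustration, not values from my cluster, and the helper function itself is hypothetical.

```python
def ha_client_properties(nameservice, namenodes):
    """Build the HDFS client properties for a logical HA nameservice.

    namenodes maps a NameNode ID (e.g. "nn1") to its "host:port" RPC address.
    """
    props = {
        # Declares the logical name that appears in hdfs://<nameservice>/ URIs.
        "dfs.nameservices": nameservice,
        # Lists the NameNode IDs that belong to this nameservice.
        "dfs.ha.namenodes." + nameservice: ",".join(namenodes),
        # Tells the client how to pick the active NameNode.
        "dfs.client.failover.proxy.provider." + nameservice:
            "org.apache.hadoop.hdfs.server.namenode.ha."
            "ConfiguredFailoverProxyProvider",
    }
    # One RPC address entry per NameNode ID.
    for nn_id, address in namenodes.items():
        key = "dfs.namenode.rpc-address.%s.%s" % (nameservice, nn_id)
        props[key] = address
    return props


# Assumed IDs, hosts, and port; replace with the real cluster values.
props = ha_client_properties(
    "clustername",
    {
        "nn1": "namenode1.example.com:8020",
        "nn2": "namenode2.example.com:8020",
    },
)

for name in sorted(props):
    print(name, "=", props[name])
```

If any of these name/value pairs is missing from the properties seen by the Hive component (as opposed to the HDFS or HCatalog components), the client would have no way to resolve hdfs://clustername/, which would be consistent with the UnknownHostException above.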