Connecting to Impala from a Talend Job when TLS/SSL is Enabled

Question

TLS/SSL network encryption is enabled between client programs and Impala. How do I pass the TLS/SSL parameters so that a Talend Job can establish a connection to the Impala service?

 

Answer

There are two ways to do this, depending on whether the Hadoop cluster is secured with Kerberos:

 

If the Hadoop cluster is secured with Kerberos, use tImpalaConnection

 

You can pass the TLS/SSL parameters for Impala (ssl, sslTrustStore, and trustStorePassword) by appending them, separated by semicolons, to the value of the Impala principal field in the tImpalaConnection component:

  1. Select Use Kerberos authentication (as the cluster is Kerberized).

  2. Set the Impala principal field to "impala/_HOST@CLOUDERA.COM;ssl=true;sslTrustStore=C:/DL/impala/impala.jks;trustStorePassword=talend1".

    ImapalaConnection2.png
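
Before running the Job, it can help to confirm that the sslTrustStore path and trustStorePassword from step 2 actually open the truststore on the machine that runs the Job. The following is a minimal sketch of such a check, not part of Talend; the class name TrustStoreCheck and the method listAliases are hypothetical, and the truststore is assumed to be in JKS format.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.KeyStore;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: opens a truststore and lists the certificate aliases in it.
public class TrustStoreCheck {

    // Loads the truststore with the given password and returns its aliases.
    // KeyStore.load throws an IOException if the file cannot be read
    // or the password does not match.
    public static List<String> listAliases(Path trustStore, char[] password) throws Exception {
        KeyStore ks = KeyStore.getInstance("JKS"); // assumption: JKS format
        try (InputStream in = Files.newInputStream(trustStore)) {
            ks.load(in, password);
        }
        return Collections.list(ks.aliases());
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: TrustStoreCheck <truststore-path> <password>");
            return;
        }
        for (String alias : listAliases(Paths.get(args[0]), args[1].toCharArray())) {
            System.out.println(alias);
        }
    }
}
```

For the example values above, this would be run as: java TrustStoreCheck C:/DL/impala/impala.jks talend1. If it lists at least one alias, the path and password are usable; whether that certificate is the one that signed the Impala server certificate still has to be checked separately.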

 

If the Hadoop cluster is not Kerberized, use tHiveConnection

 

When the Hadoop cluster is not secured with Kerberos, the Use Kerberos authentication option is deselected, so the Impala principal field is not available. As a result, the tImpalaConnection component provides no way to pass the TLS/SSL parameters for connecting to Impala.

impala_connection_001.png

 

Impala listens for HiveServer2 requests on TLS/SSL-secured ports, so a workaround is to use the tHiveConnection component to establish a TLS/SSL-secured connection to Impala. Whether or not the Hadoop cluster is Kerberized, the tHiveConnection component has an Additional JDBC Settings property where you can pass the TLS/SSL parameters to encrypt the connection, for example by setting it to "ssl=true;sslTrustStore=C:/1_Thoan/B_BigData/Kerberos/hdfs.truststore;trustStorePassword=talend".

 hive_connection_2.png
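
To illustrate what the Additional JDBC Settings value amounts to, here is a minimal sketch that builds a HiveServer2-style JDBC URL with the same TLS/SSL parameters. It assumes the settings end up appended to the URL after the database name, which is the standard jdbc:hive2 URL layout. The class name ImpalaSslUrl, the host impala-host.example.com, and the database default are hypothetical; port 21050 is Impala's default HiveServer2 port and may differ on your cluster.

```java
// Hypothetical helper: builds a jdbc:hive2 URL with semicolon-separated settings.
public class ImpalaSslUrl {

    // Appends the extra settings (as typed into Additional JDBC Settings)
    // after the database name, separated by a semicolon.
    public static String build(String host, int port, String database, String additionalSettings) {
        String url = "jdbc:hive2://" + host + ":" + port + "/" + database;
        if (additionalSettings != null && !additionalSettings.isEmpty()) {
            url += ";" + additionalSettings;
        }
        return url;
    }

    public static void main(String[] args) {
        // Host and database are placeholders; the settings string is the example from this article.
        String settings = "ssl=true;sslTrustStore=C:/1_Thoan/B_BigData/Kerberos/hdfs.truststore;"
                + "trustStorePassword=talend";
        System.out.println(build("impala-host.example.com", 21050, "default", settings));
    }
}
```

A URL of this form could also be passed to java.sql.DriverManager.getConnection from a standalone program with the Hive JDBC driver on the classpath, which is one way to test the TLS/SSL parameters outside of Talend.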

 

Note: With this workaround, use the tHive components to interact with Impala, because they reuse the Hive connection that accesses Impala. You cannot use the tImpala components directly with this connection.

 

Version history: revision 12 of 12, last updated 01-02-2018 02:21 PM