Connecting to Impala from a Talend Job when TLS/SSL is Enabled

Question

I can enable TLS/SSL network encryption between a client program and Impala. How can I pass the TLS/SSL parameters to establish an Impala connection between a Talend Job and the Impala service?

 

Answer

There are two ways to do this, depending on whether the Hadoop cluster is secured with Kerberos:

 

If the Hadoop cluster is secured with Kerberos, use tImpalaConnection

 

You can pass the TLS/SSL parameters for Impala (sslTrustStore and trustStorePassword) using the Impala principal parameter in the tImpalaConnection component:

  1. Select Use Kerberos authentication (as the cluster is Kerberized).

  2. Set the Impala principal field to "impala/_HOST@CLOUDERA.COM;ssl=true;sslTrustStore=C:/DL/impala/impala.jks;trustStorePassword=talend1".

    ImapalaConnection2.png
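
To see why appending the TLS/SSL parameters to the Impala principal can work, it may help to look at the JDBC URL that likely results. The following is a minimal sketch, assuming the component concatenates the Impala principal field verbatim after ";principal=" in a HiveServer2-style URL. The class name, host name, and database are hypothetical; 21050 is Impala's default HiveServer2 port.

```java
// Sketch only: assumes the Impala principal field is appended verbatim to the
// JDBC URL. Class name, host, and database are hypothetical placeholders.
class ImpalaKerberosTlsUrl {

    // Build a HiveServer2-style URL with the principal field at the end.
    static String buildUrl(String host, int port, String db, String principalField) {
        return "jdbc:hive2://" + host + ":" + port + "/" + db
                + ";principal=" + principalField;
    }

    public static void main(String[] args) {
        // Value of the Impala principal field from step 2 above.
        String principalField = "impala/_HOST@CLOUDERA.COM;ssl=true;"
                + "sslTrustStore=C:/DL/impala/impala.jks;trustStorePassword=talend1";
        System.out.println(buildUrl("impala-host.example.com", 21050, "default", principalField));
    }
}
```

Because the driver treats each semicolon-separated segment as its own session parameter, the extra ssl, sslTrustStore, and trustStorePassword entries would be read as if they had been supplied in a dedicated field.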

 

If the Hadoop cluster is not Kerberized, use tHiveConnection

 

Because the Hadoop cluster is not secured with Kerberos, the Use Kerberos authentication option is deselected, so the Impala principal field is not available. As a result, there is no way to provide the TLS/SSL parameters for connecting to Impala using the tImpalaConnection component.

impala_connection_001.png

 

Impala listens for HiveServer2 requests on TLS/SSL-secured ports, so a workaround is to use the tHiveConnection component to establish a TLS/SSL-secured connection to Impala. Whether the Hadoop cluster is Kerberized or not, the tHiveConnection component has an Additional JDBC Settings property where you can pass the TLS/SSL parameters to encrypt the connection, for example by setting it to "ssl=true;sslTrustStore=C:/1_Thoan/B_BigData/Kerberos/hdfs.truststore;trustStorePassword=talend".

 hive_connection_2.png
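
A quick way to sanity-check an Additional JDBC Settings value before pasting it into the component is to parse it and confirm the TLS/SSL keys are present. The sketch below does that and then shows the URL that would result, assuming the component appends the settings to a HiveServer2-style URL after a semicolon. The class name, host name, and database are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: class name, host, and database are hypothetical placeholders.
class AdditionalJdbcSettingsSketch {

    // Split "k1=v1;k2=v2" into an ordered map, skipping empty segments.
    static Map<String, String> parse(String settings) {
        Map<String, String> out = new LinkedHashMap<>();
        for (String pair : settings.split(";")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            if (eq < 0) {
                throw new IllegalArgumentException("Expected key=value, got: " + pair);
            }
            out.put(pair.substring(0, eq), pair.substring(eq + 1));
        }
        return out;
    }

    // Assumed behavior: the settings are appended to the URL after a semicolon.
    static String buildUrl(String host, int port, String db, String settings) {
        String base = "jdbc:hive2://" + host + ":" + port + "/" + db;
        return settings.isEmpty() ? base : base + ";" + settings;
    }

    public static void main(String[] args) {
        String settings = "ssl=true;sslTrustStore=C:/1_Thoan/B_BigData/Kerberos/hdfs.truststore;"
                + "trustStorePassword=talend";
        Map<String, String> parsed = parse(settings);
        if (!"true".equals(parsed.get("ssl")) || !parsed.containsKey("sslTrustStore")) {
            throw new IllegalStateException("TLS/SSL settings are incomplete");
        }
        System.out.println(buildUrl("impala-host.example.com", 21050, "default", settings));
    }
}
```

With the Hive JDBC driver on the classpath, a URL of this shape could be passed to java.sql.DriverManager.getConnection(url, user, password) to test the connection outside Talend.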

 

Note: With this workaround, you use the tHive components to interact with Impala, since they use the Hive connection for accessing Impala; you can't use the tImpala components directly.

 

Version history
Revision #: 12 of 12
Last update: 01-02-2018 02:21 PM
Comments
Twelve Stars

Thank you for the good article, but it would be good to know Talend's answer and position: why?

Because, let's create a simple test Job: LDAP authentication, no SSL, no Kerberos,

with default settings:

Imapla_default.PNG

 

There is no password field, and of course no connection:

Imapla_default_err.PNG

 

ok, go into <Talend_Home>\plugins\org.talend.designer.components.bigdata_7.0.1.20180411_1414\components\tImpalaConnection

and edit 3 files:

tImpalaConnection_java.xml ... and we find that the password parameter is there, but hidden:

 

		<PARAMETER NAME="PASS" FIELD="TEXT" NUM_ROW="30" GROUP="CONNECTION"
			SHOW="false">
			<DEFAULT>""</DEFAULT>
		</PARAMETER>

ok, change it:

 

 

		<PARAMETER NAME="PASS" FIELD="TEXT" NUM_ROW="30" GROUP="CONNECTION"
			SHOW="true">
			<DEFAULT>""</DEFAULT>
		</PARAMETER>

 

 

edit tImpalaConnection_begin.javajet

replace:

	String additionalParameters = "\";auth=noSasl\"";

with

	String additionalParameters = "\"\"";

 

 

 

edit tImpalaConnection_messages.properties

add the line PASS.NAME=Password

 

As we can see, all the information is now here:

Imapla_default_good.PNG

 

... and ... it works!
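
One possible reading of why the javajet change has this effect: in the Hive JDBC driver, auth=noSasl tells the driver to skip the SASL handshake, whereas without it (and without a Kerberos principal) the driver defaults to SASL PLAIN with the supplied user name and password, which is presumably what an LDAP-secured Impala expects. The sketch below only illustrates the difference in the resulting URL, assuming the generated code appends additionalParameters to the JDBC URL. The class name, host name, and database are hypothetical.

```java
// Sketch only: assumes additionalParameters is appended to the JDBC URL.
// Class name, host, and database are hypothetical placeholders.
class ImpalaAuthParamSketch {

    static String buildUrl(String host, int port, String db, String additionalParameters) {
        return "jdbc:hive2://" + host + ":" + port + "/" + db + additionalParameters;
    }

    public static void main(String[] args) {
        // Original template value: no SASL handshake is requested.
        String before = buildUrl("impala-host.example.com", 21050, "default", ";auth=noSasl");
        // Edited template value: nothing appended, so the driver default applies.
        String after = buildUrl("impala-host.example.com", 21050, "default", "");
        System.out.println("before: " + before);
        System.out.println("after:  " + after);
    }
}
```

Note that the javajet file is a code-generation template, so "\";auth=noSasl\"" there becomes the string literal ";auth=noSasl" in the generated Job code, and "\"\"" becomes an empty string.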

So the question is: why did Talend disable this feature rather than just fix it? Are there any reasons for this?

 

 

 

Community Manager

 Hi Vladimir,

 

I referred your question to our Support team, and they requested that you file a support ticket for it since the change you're suggesting would require a code change.

 

Thanks!

Alyce

Twelve Stars

Thank you, Alyce.

 

We have already opened support tickets, and as an answer we received a link to this page, which is wrong from my point of view: we requested support for the Impala component, not a description of how to use Hive instead of Impala :)

This is why I am asking: why did the Talend team disable the password field? (It was done by developers, so the code was already changed from working to not working for some reason.)