This KB article shows how to use your Hive components to connect to an Impala service that has SSL enabled.
Select the distribution and version of your Hadoop cluster, then select Retrieve configuration from Ambari or Cloudera.
Cluster information will be retrieved and populated.
Once the cluster information is populated, click Check Services to ensure that Studio can connect successfully to the cluster.
According to Cloudera documentation, when configuring Impala to work with JDBC, you can connect in one of two ways:
Based on this information, you can use your Hive components, which rely on the Hive JDBC driver, to connect to Impala. Start by creating a Job that creates a file in HDFS, loads that file into Impala, and then reads it.
In the Designer, add a tPreJob component, then attach your HDFS connection and Hive connection to it, linking the two with an On Component Ok connection. You will use these connections throughout your Job.
Enter your Impala SSL information in the tHiveConnection component to establish the connection. The cluster provides the beeline utility for connecting to HiveServer2 over JDBC. If you run beeline with the Impala connection details substituted in, it connects through the same Hive JDBC driver, using the following information:
Based on the JDBC URL above, you can see that you connected to Impala using the Hive JDBC driver. Here is how to enter the above information in your tHiveConnection component:
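As a rough illustration of what such a URL looks like, the sketch below assembles a Hive JDBC URL with the driver's SSL session parameters (ssl, sslTrustStore, trustStorePassword). The class name, host name, truststore path, and password are hypothetical placeholders, and port 21050 is only the usual default for Impala's HiveServer2-compatible endpoint. Authentication settings (for example a Kerberos principal) are deliberately left out, so treat this as a starting point rather than the exact URL for your cluster.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch: builds a Hive JDBC URL for an SSL-enabled Impala daemon. */
final class ImpalaJdbcUrl {

    private ImpalaJdbcUrl() {}

    /**
     * @param host           Impala daemon host (placeholder; use your own)
     * @param port           Impala HiveServer2-protocol port, commonly 21050
     * @param database       database to connect to
     * @param trustStorePath truststore containing the Impala certificate
     * @param trustStorePwd  password of that truststore
     */
    static String build(String host, int port, String database,
                        String trustStorePath, String trustStorePwd) {
        // Session parameters understood by the Hive JDBC driver; order is preserved.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("ssl", "true");
        params.put("sslTrustStore", trustStorePath);
        params.put("trustStorePassword", trustStorePwd);

        StringBuilder url = new StringBuilder("jdbc:hive2://")
                .append(host).append(':').append(port)
                .append('/').append(database);
        for (Map.Entry<String, String> e : params.entrySet()) {
            url.append(';').append(e.getKey()).append('=').append(e.getValue());
        }
        return url.toString();
    }
}
```

Calling ImpalaJdbcUrl.build("impalad.example.com", 21050, "default", "/opt/certs/truststore.jks", "changeit") returns a URL whose host, port, and database map to the corresponding tHiveConnection fields, while the parameters after the first semicolon correspond to the additional JDBC settings.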
Your Job should look like this:
Add a tRowGenerator component that generates 10 rows of data with two columns (firstname and lastname), populated by the Talend Data Generator functions:
Connect the tRowGenerator to a tHDFSOutput component with a Main row so that the data is written directly to HDFS. Configure the tHDFSOutput to use the tHDFSConnection you created above:
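To make the shape of the generated file concrete, here is a minimal plain-Java sketch of what the tRowGenerator and tHDFSOutput pair produces: ten delimited lines with a firstname and a lastname field. The name pools are hypothetical stand-ins for the Talend Data Generator functions, and the semicolon separator is an assumption; use whatever field separator your tHDFSOutput is configured with.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: produces delimited firstname/lastname lines like the Job's HDFS file. */
final class SampleRows {

    // Hypothetical name pools standing in for Talend's data generator functions.
    private static final String[] FIRST = {"Ada", "Alan", "Grace", "Linus"};
    private static final String[] LAST = {"Lovelace", "Turing", "Hopper", "Torvalds", "Ritchie"};

    private SampleRows() {}

    /** Returns {@code count} lines of the form firstname + separator + lastname. */
    static List<String> generate(int count, char separator) {
        List<String> lines = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            // Cycle through both pools so the output is deterministic.
            lines.add(FIRST[i % FIRST.length] + separator + LAST[i % LAST.length]);
        }
        return lines;
    }
}
```

The target Impala table needs a matching layout: two string columns and the same field delimiter as the file.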
Use an On Component Ok connection to link the tHDFSOutput to your tHiveRow component. The tHiveRow component loads the file written by the tHDFSOutput into an existing Impala table, using the connection you set up in the tHiveConnection:
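The query used in the tHiveRow is not reproduced here. One statement that fits this step is Impala's LOAD DATA INPATH, which moves an HDFS file into a table's data directory. The sketch below only builds that statement as a string; the class name, HDFS path, and table name are hypothetical, and the input checks are a simple illustration rather than complete SQL validation.

```java
/** Sketch: builds an Impala LOAD DATA statement for a file already in HDFS. */
final class ImpalaLoadStatement {

    private ImpalaLoadStatement() {}

    /**
     * @param hdfsPath HDFS path of the file written earlier in the Job (placeholder)
     * @param table    existing Impala table, optionally qualified as db.table
     */
    static String loadDataInpath(String hdfsPath, String table) {
        // Reject quotes in the path so the string literal cannot be broken.
        if (hdfsPath.contains("'")) {
            throw new IllegalArgumentException("HDFS path must not contain quotes");
        }
        // Accept only plain identifiers, optionally qualified with a database name.
        if (!table.matches("[A-Za-z_][A-Za-z0-9_]*(\\.[A-Za-z_][A-Za-z0-9_]*)?")) {
            throw new IllegalArgumentException("Unexpected table name: " + table);
        }
        return "LOAD DATA INPATH '" + hdfsPath + "' INTO TABLE " + table;
    }
}
```

For example, ImpalaLoadStatement.loadDataInpath("/user/talend/names.csv", "default.names") yields a statement that could be pasted into the tHiveRow query field. Note that LOAD DATA moves the file rather than copying it, so the source path is empty afterwards.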
The last part of the Job design is to use the tPostJob component and connect it to a tHiveClose component with an On Component Ok connection, so that you can close the connection you opened:
The completed Job should look like this:
Run your Job to verify that it connects to the Impala daemon over SSL, loads data into your table, and reads it back: