We are trying to use Talend Batch (Spark) jobs to access Hive in a Kerberos-enabled cluster, but we get the error "Can't get Master Kerberos principal for use as renewer".
Standard (non-Spark) Talend jobs can access Hive without any issue.
Sample Batch Job:
Below are the observations:
I am not sure exactly what is causing the token problem. Could someone help us find the root cause?
One more thing to add: if I read from or write to HDFS instead of Hive using Spark batch jobs, it works. So the problem is only with Hive and Kerberos.
The error says that you are trying to access a kerberized resource with an unsecured client configuration.
In the batch job, did you select the Kerberos configuration in the tHDFSConfiguration?
Also, where does the configuration come from: Repository or Built-In?
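To check the "unsecured client configuration" point, here is a rough sketch that inspects the Hadoop XML configuration the job would load. It rests on an assumption on my side: as far as I know, this message usually appears when the Hadoop client cannot find a ResourceManager principal (yarn.resourcemanager.principal) to use as token renewer, and hadoop.security.authentication should be set to kerberos. The function names and the property list are only illustrative and are not part of Talend.

```python
import xml.etree.ElementTree as ET

# Properties a Kerberos-aware Hadoop client is commonly expected to see.
# ASSUMPTION: the exact set depends on the distribution; adjust as needed.
REQUIRED = {
    "hadoop.security.authentication": "kerberos",  # usually in core-site.xml
    "yarn.resourcemanager.principal": None,        # usually in yarn-site.xml, any non-empty value
}


def read_properties(xml_text):
    """Parse the content of a Hadoop *-site.xml file into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def missing_security_settings(props):
    """Return a list of human-readable problems; an empty list means nothing was flagged."""
    problems = []
    for name, expected in REQUIRED.items():
        actual = props.get(name, "")
        if not actual:
            problems.append(f"{name} is not set")
        elif expected is not None and actual.lower() != expected:
            problems.append(f"{name} is {actual!r}, expected {expected!r}")
    return problems


if __name__ == "__main__":
    # Hypothetical sample content; replace it with the files your job really uses.
    sample = (
        "<configuration>"
        "<property><name>hadoop.security.authentication</name>"
        "<value>simple</value></property>"
        "</configuration>"
    )
    for problem in missing_security_settings(read_properties(sample)):
        print(problem)
```

You could feed it the merged content of the core-site.xml and yarn-site.xml that the Spark batch job picks up. If the principal is reported as not set, that would be consistent with HDFS working while the Hive read fails, but please treat this as a hint and not as a confirmed diagnosis.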
We are encountering the same issue. To answer your questions (in our case): yes, we have selected the Kerberos configuration in the tHDFSConfiguration, and the configuration is Built-In.
Yes, we have already selected Kerberos in the HDFS configuration, and reading from and writing to HDFS with batch jobs works. The problem only occurs when a batch job selects data from Hive using the tHiveInput component; the same read works in a standard job.
Please clarify: does Talend use /etc/spark/conf/ in any way for batch jobs?