One Star

Pig uses the wrong user

Hello,
When I try to execute a simple Pig job, tPigLoad -> tPigStoreResult,
I get the following error:
ERROR security.UserGroupInformation: PriviledgedActionException as:Martijn (auth:SIMPLE) cause:org.apache.hadoop.security.AccessControlException: Permission denied: user=Martijn, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x
The job is being executed as the user Martijn. This is the local user on my MacBook and it does not exist in my Hadoop cluster. How can I change the user that the Pig job uses?
I am running a Cloudera CDH4 distribution and using Talend BD 5.2.0.M4.
Greetings,
Martijn
6 REPLIES
One Star

Re: Pig uses the wrong user

I have a similar problem. Talend is passing my Windows log in ID to Pig rather than allowing me to specify a user as the HDFS component does. I am on the same configuration, CDH4 and Talend BD 5.2.0.M4.
Employee

Re: Pig uses the wrong user

Possibly try using the Oozie deployer available as a panel in the Studio. You can set a username. Oozie is required to be running on Hadoop.
My config within TOS4BD is:
Name node end point: hdfs://talend-hdp-all
Job tracker end point: talend-hdp-all:50300
Oozie end point: http://talend-hdp-all:11000/oozie

Ciaran
One Star

Re: Pig uses the wrong user

Thanks for the response.
When I try to run the job using the Oozie deployer, I get the following:
Deploying job to Hadoop...
Deployment failed!
Can not access Hadoop File System with user hdfs!
Server IPC version 7 cannot communicate with client version 4

I tried searching the internet for this error but I haven't found an answer yet.
Greetings,
Martijn
One Star

Re: Pig uses the wrong user

I am getting the same error using CDH4 and Talend BD 5.2.0.M4
Oozie settings:
namenode end pt: hdfs://hadoopdw4
job tracker end pt: hadoopdw4t:50300
oozie end pt: http://hadoopdw4:11000/oozie
user name: hdfs
Deploying job to Hadoop...
Can not access Hadoop File System with user hdfs!
Server IPC version 7 cannot communicate with client version 4
Thanks,
John
Employee

Re: Pig uses the wrong user

Hello,
Regarding Pig and the username it uses: it is not possible to define it in the configuration. This is a Pig limitation. The login ID used is the owner of the Java process, that is, the user who executes the Talend Job.
This means you have to create a user on the Hadoop side with the same ID as the Java process owner. You can then give this user ownership of the HDFS folder, and you should no longer run into the issue.
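To see which account name that will be, a minimal Java sketch such as the one below prints the owner of the current JVM and the HDFS directory that would conventionally belong to it. The class and method names are hypothetical, and the /user/<name> layout is an assumption based on the usual HDFS default, so adjust it if your cluster uses another location.

```java
// Hypothetical helper for illustration; not part of Talend or Pig.
class ProcessOwnerSketch {

    // The JVM exposes the OS account that started it as "user.name".
    static String processOwner() {
        return System.getProperty("user.name");
    }

    // Assumption: home directories live under /user/<name> on HDFS.
    static String expectedHdfsHome(String user) {
        return "/user/" + user;
    }

    public static void main(String[] args) {
        String owner = processOwner();
        System.out.println("Java process owner: " + owner);
        System.out.println("HDFS directory to create and chown: " + expectedHdfsHome(owner));
    }
}
```

Run it on the machine that executes the Talend Job, since that is where the Java process owner is determined.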
I hope this helps.
Regards,
Rémy.
One Star

Re: Pig uses the wrong user

Hi,
I went through the generated code of the tPigLoad and tPigStore components and found a discrepancy there. The tPigLoad component stores the default value of the user property under "HADOOP_USER_NAME_tPigLoad_1", not the one set in the tPigLoad configuration. Hence the problem of load and store running with different users.
In tPigLoad component -
    globalMap.put("HADOOP_USER_NAME_tPigLoad_1",
            System.getProperty("HADOOP_USER_NAME"));
    String username_tPigLoad_1 = "bedrock";
    if (username_tPigLoad_1 != null
            && !"".equals(username_tPigLoad_1.trim())) {
        System.setProperty("HADOOP_USER_NAME",
                username_tPigLoad_1);
    }
In tPigStore component -
    String originalHadoopUsername_tPigStoreResult_1 = (String) globalMap
            .get("HADOOP_USER_NAME_tPigLoad_1");
    if (originalHadoopUsername_tPigStoreResult_1 != null) {
        System.setProperty("HADOOP_USER_NAME",
                originalHadoopUsername_tPigStoreResult_1);
        globalMap.put("HADOOP_USER_NAME_tPigLoad_1", null);
    } else {
        System.clearProperty("HADOOP_USER_NAME");
    }
We worked around this problem by using tSetEnv component for HADOOP_USER_NAME.
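For anyone who wants to reproduce the behaviour outside the Studio, the property handling in the two snippets above can be condensed into the standalone sketch below. It is not Talend's generated code: the class and the overrideUser/restoreUser helpers are hypothetical names, and only the HADOOP_USER_NAME property, the globalMap key and the "bedrock" value come from the snippets. It shows that once the restore step runs, the configured user is gone again.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical condensation of the save/override/restore pattern.
class HadoopUserOverrideSketch {
    static final String KEY = "HADOOP_USER_NAME";

    // Load side: remember the current property value, then install the configured user.
    static void overrideUser(Map<String, Object> globalMap, String slot, String configuredUser) {
        globalMap.put(slot, System.getProperty(KEY));
        if (configuredUser != null && !configuredUser.trim().isEmpty()) {
            System.setProperty(KEY, configuredUser);
        }
    }

    // Store side: put the remembered value back, or clear the property if there was none.
    static void restoreUser(Map<String, Object> globalMap, String slot) {
        String original = (String) globalMap.get(slot);
        if (original != null) {
            System.setProperty(KEY, original);
        } else {
            System.clearProperty(KEY);
        }
        globalMap.put(slot, null);
    }

    public static void main(String[] args) {
        Map<String, Object> globalMap = new HashMap<>();
        overrideUser(globalMap, "HADOOP_USER_NAME_tPigLoad_1", "bedrock");
        System.out.println("After load setup: " + System.getProperty(KEY));
        restoreUser(globalMap, "HADOOP_USER_NAME_tPigLoad_1");
        System.out.println("After store setup: " + System.getProperty(KEY));
    }
}
```

If HADOOP_USER_NAME is already set before the job starts, the restore step puts that value back instead of clearing it, which may be why setting it up front helped in our case.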