One Star

Problem with Pig Job: UnknownHostException

Hi,
I use Apache Hadoop 1.0 on a Mac in pseudo-distributed mode. The installation works with Talend HDFS components.
I have a simple Pig job (load, distinct, filter, store), as described in the component guide. I think the configuration of the Talend components is okay (namenode: "hdfs://localhost:9000", jobtracker host: "localhost:9001"). The problem is probably a wrong configuration on my Mac OS, or something I have to add to my Hadoop configuration (which is almost entirely default values).
I get the exception (attached at the end) when running the job.
"speedport_w723_v_typ_a_1_00_096" (the unknown host in the exception) is the name I see in the System Preferences of my Mac: "To log in to this computer remotely, type 'ssh kai@speedport_w723_v_typ_a_1_00_096'".
This is probably not really a Talend problem, but I lack the background knowledge to solve it. Maybe you can help?
Thank you.
Best regards,
Kai

Starting job PigAnalyzerJob at 16:06 02/01/2013.

connecting to socket on port 4038
connected
WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.
13/01/02 16:06:35 INFO executionengine.HExecutionEngine: Connecting to hadoop file system at: hdfs://localhost:9000
2013-01-02 16:06:35.609 java Unable to load realm info from SCDynamicStore
13/01/02 16:06:36 INFO executionengine.HExecutionEngine: Connecting to map-reduce job tracker at: localhost:9001
13/01/02 16:06:36 INFO pigstats.ScriptState: Pig features used in the script: DISTINCT,FILTER
13/01/02 16:06:37 INFO mapReduceLayer.MRCompiler: File concatenation threshold: 100 optimistic? false
13/01/02 16:06:37 INFO mapReduceLayer.MultiQueryOptimizer: MR plan size before optimization: 1
13/01/02 16:06:37 INFO mapReduceLayer.MultiQueryOptimizer: MR plan size after optimization: 1
13/01/02 16:06:37 INFO pigstats.ScriptState: Pig script settings are added to the job
13/01/02 16:06:37 INFO mapReduceLayer.JobControlCompiler: mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
13/01/02 16:06:37 INFO mapReduceLayer.JobControlCompiler: creating jar file Job7394100148415978202.jar
13/01/02 16:06:39 INFO mapReduceLayer.JobControlCompiler: jar file Job7394100148415978202.jar created
13/01/02 16:06:39 INFO mapReduceLayer.JobControlCompiler: Setting up single store job
13/01/02 16:06:39 INFO mapReduceLayer.JobControlCompiler: Setting identity combiner class.
13/01/02 16:06:39 INFO mapReduceLayer.JobControlCompiler: BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=0
13/01/02 16:06:39 INFO mapReduceLayer.JobControlCompiler: Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1
13/01/02 16:06:39 INFO mapReduceLayer.MapReduceLauncher: 1 map-reduce job(s) waiting for submission.
13/01/02 16:06:39 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/01/02 16:06:39 INFO mapred.JobClient: Cleaning up the staging area hdfs://localhost:9000/tmp/hadoop-kwahner/mapred/staging/kwahner/.staging/job_201301021559_0001
13/01/02 16:06:39 ERROR security.UserGroupInformation: PriviledgedActionException as:kwahner cause:java.net.UnknownHostException: speedport_w723_v_typ_a_1_00_096: speedport_w723_v_typ_a_1_00_096: nodename nor servname provided, or not known
13/01/02 16:06:39 INFO mapReduceLayer.MapReduceLauncher: 0% complete
13/01/02 16:06:39 INFO mapReduceLayer.MapReduceLauncher: job null has failed! Stop running all dependent jobs
13/01/02 16:06:39 INFO mapReduceLayer.MapReduceLauncher: 100% complete
13/01/02 16:06:39 WARN mapReduceLayer.Launcher: There is no log file to write to.
13/01/02 16:06:39 ERROR mapReduceLayer.Launcher: Backend error message during job submission
java.net.UnknownHostException: speedport_w723_v_typ_a_1_00_096: speedport_w723_v_typ_a_1_00_096: nodename nor servname provided, or not known
at java.net.InetAddress.getLocalHost(InetAddress.java:1438)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:874)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:824)
at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378)
at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
at java.lang.Thread.run(Thread.java:722)
disconnected
Caused by: java.net.UnknownHostException: speedport_w723_v_typ_a_1_00_096: nodename nor servname provided, or not known
at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:866)
at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1258)
at java.net.InetAddress.getLocalHost(InetAddress.java:1434)
... 11 more
13/01/02 16:06:39 ERROR pigstats.SimplePigStats: ERROR 2997: Unable to recreate exception from backend error: java.net.UnknownHostException: speedport_w723_v_typ_a_1_00_096: speedport_w723_v_typ_a_1_00_096: nodename nor servname provided, or not known
13/01/02 16:06:39 ERROR pigstats.PigStatsUtil: 1 map reduce job(s) failed!
13/01/02 16:06:39 INFO pigstats.SimplePigStats: Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
1.0.0 0.9.2 kwahner 2013-01-02 16:06:37 2013-01-02 16:06:39 DISTINCT,FILTER
Failed!
Failed Jobs:
JobId Alias Feature Message Outputs
N/A tPigFilterRow_1_RESULT,tPigLoad_1_RESULT DISTINCT Message: java.net.UnknownHostException: speedport_w723_v_typ_a_1_00_096: speedport_w723_v_typ_a_1_00_096: nodename nor servname provided, or not known
(same stack trace as above)
/Users/kwahner/Desktop/people_filtered.txt,
Input(s):
Failed to read data from "/Users/kwahner/Desktop/people.txt"
Output(s):
Failed to produce result in "/Users/kwahner/Desktop/people_filtered.txt"
Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
null

13/01/02 16:06:39 INFO mapReduceLayer.MapReduceLauncher: Failed!
Job PigAnalyzerJob ended at 16:06 02/01/2013.
4 REPLIES
Employee
One Star

Re: Problem with Pig Job: UnknownHostException

Hi,
Did you find the solution?
thanks =)
Employee

Re: Problem with Pig Job: UnknownHostException

It is a problem with the hosts file. Depending on where your Hadoop server is located (in my case, in a VM), you might need to change some VM settings and/or configure your hosts file.
Please check Ciaran's links; they might help you.
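To confirm this is a hosts-file problem, you can reproduce the failing lookup directly on Mac OS. This is a diagnostic sketch, assuming Mac OS's `dscacheutil` tool is available (it is on 10.5 and later); Java's `InetAddress.getLocalHost()` resolves the machine's hostname through the same OS resolver:

```shell
# Show the hostname that Java's InetAddress.getLocalHost() will try to resolve
hostname

# Ask the Mac OS resolver for that name. Empty output means the lookup fails,
# which matches the "nodename nor servname provided, or not known" error.
dscacheutil -q host -a name "$(hostname)"
```

If the second command prints nothing, the hostname is not resolvable via /etc/hosts or DNS, and adding it to /etc/hosts (as described below in this thread) should fix the job submission.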
Employee

Re: Problem with Pig Job: UnknownHostException

For reference (as I had this problem again today and could not remember), the solution is to add the hostname to your /etc/hosts file like this:
127.0.0.1 localhost speedport_w723_v_typ_a_1_00_098
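As a minimal sketch of applying that fix from a terminal (the hostname below is the example from this thread; substitute whatever `hostname` prints on your machine, and note that editing /etc/hosts requires sudo):

```shell
# Back up /etc/hosts before touching it
sudo cp /etc/hosts /etc/hosts.bak

# Append the local hostname to a loopback entry
# (example hostname from this thread -- use your own `hostname` output)
echo "127.0.0.1 localhost speedport_w723_v_typ_a_1_00_096" | sudo tee -a /etc/hosts

# Verify the lookup now succeeds; it should print a 127.0.0.1 entry
dscacheutil -q host -a name speedport_w723_v_typ_a_1_00_096
```

After this, `InetAddress.getLocalHost()` resolves against the loopback address and the Pig job can be submitted.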