One Star

[resolved] org.apache.hadoop.io.compress.DeflateCodec not found exception

Hi,
I encountered the exception below while running a Pig component in Talend. Any idea what causes it?

 java.lang.IllegalArgumentException: Compression codec org.apache.hadoop.io.compress.DeflateCodec not found.
at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:96)
at org.apache.hadoop.io.compress.CompressionCodecFactory.<init>(CompressionCodecFactory.java:134)
at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.initialize(LineRecordReader.java:62)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.initialize(PigRecordReader.java:176)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:450)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:645)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:322)
at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
at org.apache.hadoop.mapred.Child.main(Child.java:234)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.io.compress.DeflateCodec
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:247)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:943)
at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:89)
... 11 more
Community Manager

Re: [resolved] org.apache.hadoop.io.compress.DeflateCodec not found exception

Hi
The required jar file is missing. Which version of the Talend product are you using? A screenshot of the job would help us investigate the problem.
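One way to test the missing-jar theory is to look for the class inside the jars that the Hadoop tasks actually load. The snippet below is only a rough sketch: the helper names `class_entry` and `jars_containing` are made up for illustration, and the directory `/usr/lib/hadoop` is a placeholder that has to be replaced with the real library directory of the cluster.

```python
import zipfile
from pathlib import Path
from typing import List


def class_entry(class_name: str) -> str:
    """Convert a fully qualified Java class name to its entry path inside a jar."""
    return class_name.replace(".", "/") + ".class"


def jars_containing(class_name: str, lib_dir: str) -> List[str]:
    """Return the jars under lib_dir (searched recursively) that contain class_name."""
    root = Path(lib_dir)
    if not root.is_dir():
        return []
    entry = class_entry(class_name)
    hits = []
    for jar in sorted(root.rglob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    hits.append(str(jar))
        except zipfile.BadZipFile:
            # Skip files that have a .jar extension but are not valid archives.
            continue
    return hits


if __name__ == "__main__":
    # Placeholder directory (assumption): point this at the cluster's lib folder.
    found = jars_containing("org.apache.hadoop.io.compress.DeflateCodec", "/usr/lib/hadoop")
    print(found if found else "class not found in any jar")
```

If no jar on the task classpath contains the class, the `ClassNotFoundException` in the trace is expected, and the options are to add a jar that provides it or to stop the configuration from requesting that codec.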
Shong
One Star

Re: [resolved] org.apache.hadoop.io.compress.DeflateCodec not found exception

Hi Shong,
The version is Talend Big Data 5.2.2.
The screenshot is attached; I cropped out the IP addresses for privacy reasons.
Which jar file is missing?
Thanks,
One Star

Re: [resolved] org.apache.hadoop.io.compress.DeflateCodec not found exception

Additional info:
I'm running the job remotely from a Windows machine against a Linux machine (the Hadoop host).
The full stack trace is below.
 connecting to socket on port 3586
connection refused
0|20130923071946_JXg2S|20130923071946_JXg2S|20130923071946_JXg2S|folder1_Scenario1|start job||20130923072008.559+0000
1|20130923071946_JXg2S|20130923071946_JXg2S|20130923071946_JXg2S|OnComponentOk1|ok|start
Unable to connect to 170.248.97.228 on the port 3586
13/09/23 07:20:09 INFO executionengine.HExecutionEngine: Connecting to hadoop file system at: hdfs://192.168.151.1:8020
1|20130923071946_JXg2S|20130923071946_JXg2S|20130923071946_JXg2S|row2|0|0|start
1|20130923071946_JXg2S|20130923071946_JXg2S|20130923071946_JXg2S|row1|0|0|start
13/09/23 07:20:09 INFO executionengine.HExecutionEngine: Connecting to map-reduce job tracker at: 192.168.151.1:8021
13/09/23 07:20:10 INFO pigstats.ScriptState: Pig features used in the script: UNKNOWN
13/09/23 07:20:11 INFO rules.ColumnPruneVisitor: Columns pruned for tPigLoad_1_RESULT: $1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11
13/09/23 07:20:11 INFO mapReduceLayer.MRCompiler: File concatenation threshold: 100 optimistic? false
13/09/23 07:20:11 INFO mapReduceLayer.MultiQueryOptimizer: MR plan size before optimization: 1
13/09/23 07:20:11 INFO mapReduceLayer.MultiQueryOptimizer: MR plan size after optimization: 1
1|20130923071946_JXg2S|20130923071946_JXg2S|20130923071946_JXg2S|row1|1|2215
1|20130923071946_JXg2S|20130923071946_JXg2S|20130923071946_JXg2S|row2|1|2304
13/09/23 07:20:11 INFO pigstats.ScriptState: Pig script settings are added to the job
13/09/23 07:20:11 INFO mapReduceLayer.JobControlCompiler: mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
13/09/23 07:20:20 INFO mapReduceLayer.JobControlCompiler: Setting up single store job
13/09/23 07:20:20 INFO mapReduceLayer.MapReduceLauncher: 1 map-reduce job(s) waiting for submission.
13/09/23 07:20:20 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/09/23 07:20:20 INFO mapReduceLayer.MapReduceLauncher: 0% complete
13/09/23 07:20:20 INFO input.FileInputFormat: Total input paths to process : 1
13/09/23 07:20:20 INFO util.MapRedUtil: Total input paths to process : 1
13/09/23 07:20:20 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
13/09/23 07:20:20 WARN snappy.LoadSnappy: Snappy native library not loaded
13/09/23 07:20:20 INFO util.MapRedUtil: Total input paths (combined) to process : 1
13/09/23 07:20:22 INFO mapReduceLayer.MapReduceLauncher: HadoopJobId: job_201309230105_0005
13/09/23 07:20:22 INFO mapReduceLayer.MapReduceLauncher: More information at: http://192.168.151.1:50030/jobdetails.jsp?jobid=job_201309230105_0005
1|20130923071946_JXg2S|20130923071946_JXg2S|20130923071946_JXg2S|row1|1|48400|stop
1|20130923071946_JXg2S|20130923071946_JXg2S|20130923071946_JXg2S|row2|1|48400|stop
0|20130923071946_JXg2S|20130923071946_JXg2S|20130923071946_JXg2S|folder1_Scenario1|end job||20130923072056.964+0000
13/09/23 07:20:56 INFO mapReduceLayer.MapReduceLauncher: job job_201309230105_0005 has failed! Stop running all dependent jobs
13/09/23 07:20:56 INFO mapReduceLayer.MapReduceLauncher: 100% complete
13/09/23 07:20:56 WARN mapReduceLayer.Launcher: There is no log file to write to.
13/09/23 07:20:56 ERROR mapReduceLayer.Launcher: Backend error message
java.lang.IllegalArgumentException: Compression codec org.apache.hadoop.io.compress.DeflateCodec not found.
at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:96)
at org.apache.hadoop.io.compress.CompressionCodecFactory.<init>(CompressionCodecFactory.java:134)
at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.initialize(LineRecordReader.java:62)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.initialize(PigRecordReader.java:176)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:450)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:645)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:322)
at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
at org.apache.hadoop.mapred.Child.main(Child.java:234)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.io.compress.DeflateCodec
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:247)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:943)
at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:89)
... 11 more
13/09/23 07:20:56 ERROR pigstats.SimplePigStats: ERROR 2997: Unable to recreate exception from backed error: java.lang.IllegalArgumentException: Compression codec org.apache.hadoop.io.compress.DeflateCodec not found.
13/09/23 07:20:56 ERROR pigstats.PigStatsUtil: 1 map reduce job(s) failed!
13/09/23 07:20:56 INFO pigstats.SimplePigStats: Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
0.20.2-cdh3u1 0.9.1-SNAPSHOT root 2013-09-23 07:20:11 2013-09-23 07:20:56 UNKNOWN
Failed!
Failed Jobs:
JobId Alias Feature Message Outputs
job_201309230105_0005 tPigCode_1_RESULT,tPigLoad_1_RESULT MAP_ONLY Message: Job failed! Error - NA /home/talend/test/folder1/SCENARIO_1/OutpuData,
Input(s):
Failed to read data from "/hdfs_test/InputData.txt"
Output(s):
Failed to produce result in "/home/talend/test/folder1/SCENARIO_1/OutpuData"
Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
One Star

Re: [resolved] org.apache.hadoop.io.compress.DeflateCodec not found exception

Hi Shong,
I've solved the issue.
I added the property below to core-site.xml:

<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec</value>
  <final>true</final>
</property>

Make sure the text between the value tags does not contain any spaces.
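Note that the replacement list no longer names DeflateCodec, which is the class the map task could not load. The no-spaces rule can also be checked mechanically before restarting anything. The following is a minimal sketch using only the Python standard library. The function names `codec_value` and `check_codecs` are hypothetical, and the sample document assumes the usual core-site.xml layout of `<property>` elements inside `<configuration>`.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional

PROPERTY = "io.compression.codecs"

# Sample input (assumed layout); in practice read the real core-site.xml instead.
SAMPLE = """<configuration>
  <property>
    <name>io.compression.codecs</name>
    <value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec</value>
    <final>true</final>
  </property>
</configuration>"""


def codec_value(xml_text: str) -> Optional[str]:
    """Return the raw <value> text of io.compression.codecs, or None if it is absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == PROPERTY:
            return prop.findtext("value", default="")
    return None


def check_codecs(raw: str) -> List[str]:
    """Return a list of problems found in a comma-separated codec list."""
    problems = []
    if any(ch.isspace() for ch in raw):
        problems.append("value contains whitespace")
    for name in raw.split(","):
        if not name.strip():
            problems.append("empty entry in list")
    return problems


if __name__ == "__main__":
    value = codec_value(SAMPLE)
    if value is None:
        print(PROPERTY + " is not set")
    else:
        print(check_codecs(value) or "codec list looks clean")
```

An empty result from `check_codecs` only means the list is well formed; it does not prove that every listed class is available on the cluster.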
Thanks
One Star

Re: [resolved] org.apache.hadoop.io.compress.DeflateCodec not found exception

Hi,
I am running a Pig task from Talend (installed on a Windows machine) against a Hadoop cluster (Cloudera), and I am getting the following error. I have attached a screenshot of the Java code where the error appears:
Starting job Myfirstpigjob at 23:16 15/06/2015.
connecting to socket on port 3848
connected
Exception in thread "main" java.lang.Error: Unresolved compilation problems: 
org.apache.pig cannot be resolved to a variable
org.apache.pig cannot be resolved to a type
org.apache.pig cannot be resolved to a type
org.apache.pig.ExecType cannot be resolved to a variable
org.apache.pig cannot be resolved to a type
org.apache.pig cannot be resolved to a type
org.apache.pig cannot be resolved to a type
org.apache.pig cannot be resolved to a type
org.apache.pig cannot be resolved to a type
at talenddemo.myfirstpigjob_0_1.Myfirstpigjob.tPigLoad_1Process(Myfirstpigjob.java:837)
at talenddemo.myfirstpigjob_0_1.Myfirstpigjob.runJobInTOS(Myfirstpigjob.java:1376)
at talenddemo.myfirstpigjob_0_1.Myfirstpigjob.main(Myfirstpigjob.java:1233)
Job Myfirstpigjob ended at 23:16 15/06/2015.
Moderator

Re: [resolved] org.apache.hadoop.io.compress.DeflateCodec not found exception

Hi Bhor,

Which version of the Talend product are you using? Has your Big Data studio run jobs successfully before? What are your OS and JDK?
Have you already checked the documentation at https://help.talend.com/display/TalendOpenStudioforBigDataInstallationandUpgradeGuide56EN/2.3+Config...?
Talend Studio requires specific third-party Java libraries or database drivers (.jar files) to be installed to connect to sources and targets.
Best regards
Sabrina