tHDFSInput ArrayIndexOutOfBoundsException in MapReduce job

Hi,
I am trying to run the Talend Map Reduce Job below, which reads a file from HDFS and loads it into HDFS again after minor changes.
tHDFS_Input -> tMap -> tHDFS_Output
The schema of the input component is: field1 - string, field2 - string, field3 - string. I set the Row separator to "\n" and the Field separator to ",", and all fields are Nullable. Everything works as expected if the input file contains rows like these (null values in any column except the last one):
text11,text12,text13
,text22,text23
text31,,text33
but fails if the input file contains a row that has a null value for the last field (text41,text42,) with the following error:
Task Id : attempt_1429539242538_61552_m_000001_0, Status : FAILED
Error: java.lang.ArrayIndexOutOfBoundsException: 2
    at sp5.testinput_0_1.testInput$row1StructInputFormat$HDFSRecordReader.next(testInput.java:337)
    at sp5.testinput_0_1.testInput$row1StructInputFormat$HDFSRecordReader.next(testInput.java:1)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:198)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:184)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Is this the expected behavior? I tried the tFileInputDelimited component and it is able to detect that the last field is null.
I use Talend Platform for Data Services with Big Data Version: 5.6.1.
Thanks,
Anca
4 REPLIES
Re: tHDFSInput ArrayIndexOutOfBoundsException in MapReduce job

Any help would be appreciated!
Re: tHDFSInput ArrayIndexOutOfBoundsException in MapReduce job

Can anyone help on the above issue please?
Re: tHDFSInput ArrayIndexOutOfBoundsException in MapReduce job

Is this a bug, or did I miss something?
Any updates?
Re: tHDFSInput ArrayIndexOutOfBoundsException in MapReduce job

I saw that there is a ticket on Jira for this, so this seems to be a bug.
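For anyone hitting the same error: the exception is consistent with how Java's String.split behaves, although I have not seen the generated code at testInput.java:337, so treat this as an assumption about the cause rather than a confirmed diagnosis. With the default call, split(",") drops trailing empty strings, so a row like text41,text42, yields only two elements and reading index 2 throws ArrayIndexOutOfBoundsException: 2. Empty fields at the start or in the middle are kept, which would explain why the other rows work. The class and method names below are hypothetical and only illustrate the behavior:

```java
import java.util.Arrays;

public class TrailingFieldDemo {

    // Default split: trailing empty strings are removed from the result.
    static String[] splitDroppingTrailing(String line) {
        return line.split(",");
    }

    // A negative limit keeps trailing empty strings, so the field count is stable.
    static String[] splitKeepingTrailing(String line) {
        return line.split(",", -1);
    }

    public static void main(String[] args) {
        String[] rows = {
            "text11,text12,text13",
            ",text22,text23",
            "text31,,text33",
            "text41,text42,"   // the row from the failing case: last field is empty
        };
        for (String row : rows) {
            String[] dropped = splitDroppingTrailing(row);
            String[] kept = splitKeepingTrailing(row);
            System.out.println(row + " -> default: " + dropped.length
                    + " fields, limit -1: " + kept.length + " fields "
                    + Arrays.toString(kept));
        }
    }
}
```

Only the last row gives 2 fields with the default split; with a limit of -1 all four rows give 3 fields. Until the bug is fixed, a possible workaround might be to pre-process the file so that no row ends with the field separator, but I have not tested that in a Map Reduce Job.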