tHDFSInput ArrayIndexOutOfBoundsException in MapReduce job


Hi,
I am trying to run the Talend MapReduce job below, which reads a file from HDFS and loads it back into HDFS after minor changes.
tHDFS_Input -> tMap -> tHDFS_Output
The schema of the input component is: field1 (string), field2 (string), field3 (string). I set the Row separator to "\n" and the Field separator to ",", and all fields are nullable. Everything works as expected if the input file contains rows like the following, where null values appear in any column except the last one:
text11,text12,text13
,text22,text23
text31,,text33
but the job fails if the input file contains a row with a null value in the last field (text41,text42,), with the following error:
Task Id : attempt_1429539242538_61552_m_000001_0, Status : FAILED
Error: java.lang.ArrayIndexOutOfBoundsException: 2
    at sp5.testinput_0_1.testInput$row1StructInputFormat$HDFSRecordReader.next(testInput.java:337)
    at sp5.testinput_0_1.testInput$row1StructInputFormat$HDFSRecordReader.next(testInput.java:1)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:198)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:184)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Is this the expected behavior? I tried the tFileInputDelimited component and it is able to detect that the last field is null.
I use Talend Platform for Data Services with Big Data, version 5.6.1.
Thanks,
Anca

Re: tHDFSInput ArrayIndexOutOfBoundsException in MapReduce job

Any help would be appreciated!

Re: tHDFSInput ArrayIndexOutOfBoundsException in MapReduce job

Can anyone help with the above issue, please?

Re: tHDFSInput ArrayIndexOutOfBoundsException in MapReduce job

Is this a bug, or did I miss something?
Any updates?

Re: tHDFSInput ArrayIndexOutOfBoundsException in MapReduce job

I saw that there is a Jira ticket for this, so it seems to be a bug.
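
A note for anyone hitting the same error: the generated reader code (testInput.java:337) is not shown in this thread, so the following is only a guess at the cause. If the record reader splits each line with Java's String.split(","), trailing empty strings are dropped. The row "text41,text42," would then yield two elements and reading index 2 would fail, while rows with an empty first or middle field still yield three. A minimal sketch of that behavior, using hypothetical class and method names (this is not Talend's generated code):

```java
import java.util.Arrays;

// Hypothetical demo class; names are illustrative, not taken from Talend.
class TrailingFieldDemo {

    // Assumed behavior of a naive reader: split(",") drops trailing empty strings.
    static String[] naiveSplit(String row) {
        return row.split(",");
    }

    // A negative limit makes split keep trailing empty strings.
    static String[] keepTrailing(String row) {
        return row.split(",", -1);
    }

    // Map an empty field to null, as a nullable string column would expect.
    static String toNullable(String field) {
        return field.isEmpty() ? null : field;
    }

    public static void main(String[] args) {
        String[] rows = { ",text22,text23", "text31,,text33", "text41,text42," };
        for (String row : rows) {
            System.out.println(row
                    + " naive=" + naiveSplit(row).length
                    + " keepTrailing=" + Arrays.toString(keepTrailing(row)));
        }

        // Reading the third field after a naive split fails for the last row only.
        try {
            String third = naiveSplit("text41,text42,")[2];
            System.out.println(third);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("naive split failed: " + e.getMessage());
        }
    }
}
```

Passing a negative limit keeps the trailing empty field, so all three sample rows produce three columns. Whether tFileInputDelimited handles the row correctly for this reason is not confirmed in this thread.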
