tExtractXMl component in a Spark job

Five Stars

tExtractXMl component in a Spark job

The tExtractXMl component in a Spark job fails with the following error for the section of code below, and I am unable to build the job.

 

Error message: "the code of method call (Tuple2<NullWritable,row9Struct>) is exceeding the 65535 byte limit" 

 

public java.util.Iterator<scala.Tuple2<NullWritable, row3Struct>> call(
        scala.Tuple2<NullWritable, row9Struct> data)
        throws java.lang.Exception {
    java.util.List<scala.Tuple2<NullWritable, row3Struct>> outputs = new java.util.ArrayList<scala.Tuple2<NullWritable, row3Struct>>();
    row3Struct row3 = new row3Struct();
    row9Struct row2 = data._2;

 

Please help me understand this issue.

Thirteen Stars

Re: tExtractXMl component in a Spark job

Hi,

 

65535 bytes is the limit on the compiled code size of a single Java method (I may not be describing it 100% precisely, but it is a known error).

 

The source of the error could be a complicated structure (with long XPath expressions and many columns), since all of the extraction code is generated into one method.

There is no single solution, but it can often be resolved if you:

  • exclude unused tags (if any)
  • split the extraction into several steps (if possible): parse half of the columns, then the other half, then join
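
The "parse half, then next half, then join" idea can be sketched in plain Java. This is only an illustration under stated assumptions, not the code Talend generates: the class name SplitXmlExtract, the two column groups, the sample XPath expressions, and the shared "id" key are all hypothetical. In the Studio the same shape would be two smaller extraction steps followed by a join on a key column, so that no single generated method has to carry every column.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class SplitXmlExtract {

    // Hypothetical column-to-XPath mappings; a real job would have hundreds.
    // Both groups carry the "id" column so the partial rows can be joined.
    static final Map<String, String> FIRST_HALF = new LinkedHashMap<>();
    static final Map<String, String> SECOND_HALF = new LinkedHashMap<>();
    static {
        FIRST_HALF.put("id", "/order/id");
        FIRST_HALF.put("customer", "/order/customer/name");
        SECOND_HALF.put("id", "/order/id");
        SECOND_HALF.put("total", "/order/total");
    }

    // Parses one XML record into a DOM document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot parse XML record", e);
        }
    }

    // Evaluates one group of XPath expressions against a parsed record.
    static Map<String, String> extract(Document doc, Map<String, String> columns) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        Map<String, String> row = new LinkedHashMap<>();
        for (Map.Entry<String, String> column : columns.entrySet()) {
            try {
                row.put(column.getKey(), xpath.evaluate(column.getValue(), doc));
            } catch (XPathExpressionException e) {
                throw new IllegalArgumentException("bad XPath: " + column.getValue(), e);
            }
        }
        return row;
    }

    // Joins two partial rows that share the same "id" value.
    static Map<String, String> join(Map<String, String> left, Map<String, String> right) {
        if (!left.get("id").equals(right.get("id"))) {
            throw new IllegalArgumentException("rows do not share the same id");
        }
        Map<String, String> joined = new LinkedHashMap<>(left);
        joined.putAll(right);
        return joined;
    }

    public static void main(String[] args) {
        String xml = "<order><id>42</id><customer><name>Ada</name></customer>"
                + "<total>9.50</total></order>";
        Document doc = parse(xml);
        Map<String, String> row = join(extract(doc, FIRST_HALF), extract(doc, SECOND_HALF));
        System.out.println(row); // prints {id=42, customer=Ada, total=9.50}
    }
}
```

Each extraction step only knows about its own group of columns, which is what keeps any one step small; the price is that every group must repeat the key column used for the join.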

 

-----------
Five Stars

Re: tExtractXMl component in a Spark job

Thank you very much. Your suggestion worked. I had 498 columns with XPath; I reduced it to 300 columns and that worked.

 

Thanks once again

 

Badri Nair
