How to set the "org.apache.spark.serializer.KryoSerialize

Hi,

I am using the Talend Cloud Big Data platform, version 7.1.1.

The map in my job reads a Parquet file with a field that holds an XML value. The value is quite large (about 12 KB per field).

The job fails with the error below.

How do I set the "Customize Spark serializer" option "org.apache.spark.serializer.KryoSerialize"?

What value do I need to enter in the box to increase the memory?

 

Error message:

#############################################################################################

Caused by: org.apache.spark.SparkException: Kryo serialization failed: Buffer overflow. Available: 0, required: 12264
Serialization trace:
xmldata (t_data.t_data_staging_flight_passenger_0_1.row1Struct). To avoid this, increase spark.kryoserializer.buffer.max value.
at org.apache.spark.serializer.KryoSerializerInstance.serialize(KryoSerializer.scala:318)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:383)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: com.esotericsoftware.kryo.KryoException: Buffer overflow. Available: 0, required: 12264

#############################################################################################

Re: How to set the "org.apache.spark.serializer.KryoSerialize

I found the solution myself.

Edit the Hadoop cluster connection under Metadata (the values need to be unexported).

Click the "Use Spark configuration" button.

There you can enter key-value pairs. Insert a row and enter the value as shown in the screenshot. It worked for me.

hadoop_cluster_settings.JPG
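
For anyone who cannot open the attachment, here is a rough sketch of the kind of key-value pairs that go into that table. The property name spark.kryoserializer.buffer.max comes from the error message itself and spark.serializer is the standard Spark property for choosing the serializer. The value 512m is only an assumption (the exact value from the screenshot is not reproduced here), and the helper to_submit_args is hypothetical, added just to show the same pairs in spark-submit form.

```python
# Hypothetical rows for the "Use Spark configuration" key-value table.
# The keys are standard Spark properties; the 512m value is an assumption,
# not the value from the screenshot. Spark requires it to stay below 2048m.
spark_properties = {
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
    "spark.kryoserializer.buffer.max": "512m",
}


def to_submit_args(properties):
    """Render the same pairs as spark-submit style --conf arguments."""
    args = []
    for key, value in properties.items():
        args.extend(["--conf", f"{key}={value}"])
    return args


if __name__ == "__main__":
    # Print one row per property, as it would be typed into the table.
    for key, value in spark_properties.items():
        print(f"{key} = {value}")
```

Pick a buffer size that comfortably covers your largest serialized record, and increase it further if the overflow error comes back.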

 

 

 
