Hi, any idea how to activate or enable the WAL (write-ahead log) for a Spark Streaming job? I want to use the checkpointing feature that comes with Spark Streaming.
I have tried setting the property below in the HDFS configuration: "spark.streaming.receiver.writeAheadLog.enable", "true". But it didn't work.
Thanks for posting your issue on the forum.
We have already redirected your issue to our BigData experts and will keep you posted.
Don't hesitate to post your issue here.
Thank you for the reply. The objective is to build a fault-tolerant Spark Streaming application. I have enabled checkpointing in the streaming application, but I still wasn't able to achieve fault tolerance, and Spark prompts me to enable the WAL (write-ahead log) to prevent data loss.
I am testing by feeding the application some data (with checkpointing enabled), killing the driver program, and restarting the application. I expect processing to resume where it left off, but that is not the case; instead I get this warning in the console:
"[WARN ]: org.apache.spark.streaming.dstream.PluggableInputDStream - Some blocks could not be recovered as they were not found in memory. To prevent such data loss, enabled Write Ahead Log (see programming guide for more details."
Please take a look and advise on how to achieve fault tolerance in a Spark Streaming application.
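For anyone landing on this thread, here is a rough sketch of the setup the warning seems to be asking for, written in PySpark rather than as a Talend job. Two things are worth noting. First, `spark.streaming.receiver.writeAheadLog.enable` is a Spark property, so it has to reach the Spark configuration (for example via `SparkConf` or `spark-submit --conf`); putting it in the HDFS configuration has no effect on Spark. Second, recovery after a driver restart relies on creating the context through `StreamingContext.getOrCreate`, with the streams defined inside the setup function. The checkpoint path, application name, batch interval, and helper names below are all assumptions for illustration, not taken from the original job.

```python
# Sketch only. CHECKPOINT_DIR, APP_NAME and the helper names are hypothetical.
CHECKPOINT_DIR = "hdfs:///tmp/streaming-checkpoint"  # assumed path on fault-tolerant storage
APP_NAME = "wal-demo"                                # hypothetical application name
WAL_KEY = "spark.streaming.receiver.writeAheadLog.enable"


def wal_settings():
    """Return the Spark (not HDFS) properties that enable the receiver WAL."""
    return {WAL_KEY: "true"}


def spark_submit_conf_args():
    """Render the same settings as `spark-submit --conf key=value` arguments."""
    args = []
    for key, value in wal_settings().items():
        args += ["--conf", f"{key}={value}"]
    return args


def create_context():
    """Build a fresh StreamingContext; used only when no checkpoint exists yet."""
    # Imported lazily so the helpers above work without PySpark installed.
    from pyspark import SparkConf, SparkContext
    from pyspark.streaming import StreamingContext

    conf = SparkConf().setAppName(APP_NAME)
    for key, value in wal_settings().items():
        conf.set(key, value)

    sc = SparkContext(conf=conf)
    ssc = StreamingContext(sc, 10)  # 10-second batches, an arbitrary choice
    ssc.checkpoint(CHECKPOINT_DIR)  # the WAL is written under the checkpoint directory

    # Define input DStreams and transformations here, inside this function,
    # so that they are part of what gets restored from the checkpoint.
    return ssc


def main():
    from pyspark.streaming import StreamingContext

    # Restores the context from CHECKPOINT_DIR if a checkpoint is present,
    # otherwise calls create_context() to build a new one.
    ssc = StreamingContext.getOrCreate(CHECKPOINT_DIR, create_context)
    ssc.start()
    ssc.awaitTermination()
```

Call `main()` from the script you submit to the cluster. If you prefer to leave the code untouched, the output of `spark_submit_conf_args()` shows the equivalent command-line form. In a tool that generates the Spark job for you, the same key and value would need to go wherever that tool accepts additional Spark properties.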